Identification of copy number variation-driven genes for liver cancer via bioinformatics analysis
- Published online on: August 20, 2014 https://doi.org/10.3892/or.2014.3425
- Pages: 1845-1852
Introduction
Primary liver cancer is the fifth most frequently diagnosed cancer globally and the second leading cause of cancer-related mortality. Incidence rates in developing countries are 2- to 3-fold higher than in developed countries (1), and the disease currently accounts for 360,000 cases and 350,000 deaths a year in China. The clinical prognosis is very poor, with a median survival time of approximately 6 months (2). Hepatocellular carcinoma (HCC) is the most common type of liver cancer. Most cases of HCC are induced by either viral hepatitis infection (hepatitis B or C) or cirrhosis. Despite recent advances in screening and early detection, HCC exhibits a rapid clinical course, with an average survival of 6 months and an overall 5-year survival rate of 5% (3). Therefore, there is an urgent demand for biomarkers for early detection and targeted therapy.
Copy number variations (CNVs) are alterations of the DNA in which segments of the genome are deleted or duplicated. They can be identified with various genome analysis platforms, such as array comparative genomic hybridization (aCGH), single nucleotide polymorphism (SNP) genotyping platforms and next-generation sequencing. CNVs are involved in human health and disease (4,5) and are currently being applied in the diagnosis of various diseases (6,7).
CNVs also play important roles in the pathogenesis of various types of cancer. For example, CNVs of the epidermal growth factor receptor (EGFR) gene have been associated with head and neck squamous (8), non-small cell lung (9), colorectal (10) and prostate cancer (11). Previous studies have indicated that a decrease in the copy number of mitochondrial DNA may be a critical event during the early phase of liver carcinogenesis (12,13). Guichard et al conducted an integrated analysis of somatic mutations and focal copy-number changes and subsequently identified several key genes and pathways in HCC (14).
In the present study, we carried out an integrated analysis of liver cancer CNV data from The Cancer Genome Atlas (TCGA) and liver cancer expression profile data from the EBI ArrayExpress database using bioinformatic tools, with the aim of identifying CNV-driven genes. These CNV-related differentially expressed genes (DEGs) may be potential biomarkers for early diagnosis or treatment. In addition, they may aid in identifying the underlying mechanisms of liver cancer.
Materials and methods
Data sources
The CNV data set was obtained from the TCGA database. The Genome-Wide SNP array 6.0 chip was used to detect CNV information in 323 pairs of cases and controls, with hg19 as the reference genome. Level 3 data were used in the following analysis. CNV sites and mean segment information were acquired for each sample. The gene expression data set E-MTAB-950, in original CEL format, was downloaded from EBI ArrayExpress. A total of 30 samples were selected, comprising 10 normal liver tissue samples and 20 liver cancer samples.
Pretreatment of gene expression data
CEL files were converted into an expression matrix using the rma function from the affy package of R. Probes were then mapped to genes using Bioconductor with the annotation files of the Affymetrix Human Genome U133 Plus 2.0 Array. Expression values were averaged when multiple probes mapped to a single gene. Box plots of the gene expression data before and after normalization were plotted using R.
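The probe-to-gene averaging step can be illustrated with a minimal Python sketch. This is not the authors' code (the analysis was run in R); the function name, probe IDs, gene labels and values are hypothetical and serve only to show the averaging logic.

```python
from collections import defaultdict
from statistics import mean

def collapse_probes_to_genes(probe_expr, probe_to_gene):
    """Average expression across all probes that map to the same gene.

    probe_expr:    {probe_id: [expression value per sample]}
    probe_to_gene: {probe_id: gene_symbol}; unmapped probes are skipped.
    """
    grouped = defaultdict(list)
    for probe_id, values in probe_expr.items():
        gene = probe_to_gene.get(probe_id)
        if gene is not None:
            grouped[gene].append(values)
    # zip(*rows) walks the samples column by column, so each gene ends up
    # with one averaged value per sample.
    return {gene: [mean(col) for col in zip(*rows)]
            for gene, rows in grouped.items()}

# Hypothetical probe IDs and values, for illustration only.
probe_expr = {"p1": [8.0, 9.0], "p2": [10.0, 11.0], "p3": [5.0, 6.0]}
probe_to_gene = {"p1": "GENE_A", "p2": "GENE_A", "p3": "GENE_B"}
gene_expr = collapse_probes_to_genes(probe_expr, probe_to_gene)
```

Here the two probes assigned to GENE_A are averaged sample by sample, while GENE_B keeps the values of its single probe.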
Pretreatment of CNV data
The case and control groups were pretreated separately. The distribution of CNVs on the 22 autosomes was analyzed in three length intervals: 1–10, 10–50 and >50 kb. P-values for the difference in CNV distribution between the case and control groups were calculated using a permutation test. A Circos circular diagram was plotted to display the CNV distribution, and DEGs were also marked in the diagram.
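The text does not state which statistic the permutation test used. As a hedged illustration, the sketch below assumes the statistic is the absolute difference in mean CNV count per sample between the two groups; the function name, parameters and example counts are hypothetical.

```python
import random
from statistics import mean

def permutation_p_value(case_counts, control_counts,
                        n_permutations=2000, seed=0):
    """Two-sided permutation p-value for a difference in mean CNV counts."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(case_counts) - mean(control_counts))
    pooled = list(case_counts) + list(control_counts)
    n_case = len(case_counts)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign the group labels
        diff = abs(mean(pooled[:n_case]) - mean(pooled[n_case:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # The add-one correction keeps the estimate away from exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)

# Hypothetical per-sample CNV counts in one length interval.
case = [20, 22, 25, 21, 23, 24]
control = [5, 6, 4, 7, 5, 6]
p = permutation_p_value(case, control)
```

Shuffling the pooled counts simulates the null hypothesis that group membership is unrelated to CNV burden, so the p-value is the fraction of shuffles that produce a difference at least as large as the observed one.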
Screening of DEGs
Differential analysis was performed with package limma to screen out DEGs. |log2FC| >0.585 (i.e. absolute fold-change >1.5) and adjusted p-value <0.05 were set as the cut-offs.
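The two cut-offs can be expressed as a small filter. The Python sketch below is illustrative only: limma itself is an R package and is not reproduced here, and the function name and example values are hypothetical. It also checks that log2(1.5) is approximately 0.585.

```python
import math

LOG2FC_CUTOFF = 0.585  # from the text; log2(1.5) is about 0.585
ADJ_P_CUTOFF = 0.05    # threshold on the adjusted p-value

def classify_gene(log2fc, adj_p):
    """Label a gene as 'up', 'down' or 'not_significant' under both cut-offs."""
    if adj_p >= ADJ_P_CUTOFF or abs(log2fc) <= LOG2FC_CUTOFF:
        return "not_significant"
    return "up" if log2fc > 0 else "down"

# Fold-change that corresponds to the log2 cut-off.
fold_change_cutoff = 2 ** LOG2FC_CUTOFF

# Hypothetical (log2FC, adjusted p-value) pairs.
labels = [classify_gene(1.2, 0.01),   # large increase, significant
          classify_gene(-0.9, 0.03),  # large decrease, significant
          classify_gene(0.5, 0.001),  # significant but change too small
          classify_gene(2.0, 0.2)]    # large change but not significant
```

A gene must pass both thresholds to be called a DEG; the sign of log2FC then determines whether it is counted as upregulated or downregulated.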
Screening of potential liver cancer-related CNVs
Using hg19 annotation information provided by UCSC (15), the genes in CNV regions and the values of the CNVs were obtained. Liver cancer-related CNVs were then screened according to the criterion that a CNV was not observed in controls but was detected in >80% of cases. The gene-CNV matrix was constructed, and missing values were filled with 0 (i.e. log2 (segment_mean) = 0, copy number 1).
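The screening criterion and the zero-filling of the matrix can be sketched as follows. How a CNV region was matched across samples is not described in the text, so the sketch simply assumes that per-region counts of affected samples are already available; all names and numbers other than the 323 cases and the 80% threshold are hypothetical.

```python
def is_cancer_related(case_hits, control_hits, n_cases, min_fraction=0.8):
    """True if the CNV is absent from controls and present in >80% of cases."""
    return control_hits == 0 and case_hits / n_cases > min_fraction

def build_gene_cnv_matrix(segment_means, genes, samples):
    """Build a gene-by-sample matrix, filling missing entries with 0.0.

    segment_means: {(gene, sample): segment mean value}
    """
    return {gene: [segment_means.get((gene, sample), 0.0) for sample in samples]
            for gene in genes}

# 323 cases, as in the text; the hit counts are made up.
kept = is_cancer_related(case_hits=270, control_hits=0, n_cases=323)
too_rare = is_cancer_related(case_hits=258, control_hits=0, n_cases=323)
in_controls = is_cancer_related(case_hits=300, control_hits=1, n_cases=323)

# Hypothetical gene and sample labels.
matrix = build_gene_cnv_matrix({("GENE_A", "s1"): 0.6},
                               genes=["GENE_A", "GENE_B"],
                               samples=["s1", "s2"])
```

With 323 cases, a CNV must be seen in at least 259 of them (259/323 is about 0.802) and in no control to pass the filter.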
Screening of CNV-driven genes
A matrix of CNVs and expression values was constructed, and correlation analysis was performed on genes for which both values were available. Genes showing the same trend in significant differential expression and in CNV were termed CNV-driven genes.
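The 'same trend' rule can be illustrated with a short Python sketch. It assumes, as one plausible reading of the text, that a gene shows the same trend when a copy number gain accompanies upregulation or a copy number loss accompanies downregulation; the function names, gene labels and values are hypothetical, and the correlation analysis itself is not reproduced.

```python
def same_trend(log2fc, mean_cnv):
    """True when expression change and copy number change share a direction."""
    return (log2fc > 0 and mean_cnv > 0) or (log2fc < 0 and mean_cnv < 0)

def select_cnv_driven(genes):
    """Keep genes whose differential expression follows their CNV direction.

    genes: {gene_symbol: (log2 fold-change, mean CNV value in cases)}
    """
    return sorted(g for g, (fc, cnv) in genes.items() if same_trend(fc, cnv))

# Hypothetical genes that already passed the DEG cut-offs.
candidates = {
    "GENE_A": (1.4, 0.5),    # gain and upregulated: same trend
    "GENE_B": (-1.1, -0.4),  # loss and downregulated: same trend
    "GENE_C": (0.9, -0.3),   # loss but upregulated: opposite trend
}
cnv_driven = select_cnv_driven(candidates)
```

Genes such as GENE_C, whose expression moves against the copy number change, are differentially expressed but would not be attributed to the CNV under this rule.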
Functional enrichment analysis
Gene Ontology (GO) enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis were performed on the DEGs and CNV-driven genes using the Database for Annotation, Visualization, and Integrated Discovery (DAVID) (16) online tools. P-value <0.05 was set as the threshold for selecting significant terms. TF-target gene interactions were also predicted with information from UCSC using the DAVID online tools. The transcriptional regulatory network was then visualized with Cytoscape (17).
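The idea behind term enrichment can be illustrated with a plain hypergeometric tail probability. This is a simplified stand-in: DAVID reports its own modified Fisher exact statistic (the EASE score), which is not reproduced here, and the function name and the counts below are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(k, n, K, N):
    """P(X >= k) for a gene list of size n drawn from N background genes,
    of which K are annotated with the term and k fall in the list."""
    upper = min(n, K)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total

# Toy example: all 5 listed genes carry a term held by 5 of 10 background genes.
p_toy = hypergeom_enrichment_p(k=5, n=5, K=5, N=10)

# A term counts as significant under the text's threshold when p < 0.05.
is_significant = p_toy < 0.05
```

The probability answers the question of how often a random gene list of the same size would contain at least as many genes annotated with the term.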
Results
DEGs
A total of 19,944 gene expression values were obtained in normal liver tissue samples and liver cancer samples. Box plots for gene expression values before and after normalization are shown in Fig. 1. A total of 1,675 DEGs were identified in liver cancer, of which 1,090 were upregulated.
CNV data analysis results
CNV data were analyzed and distribution of CNVs in chromosomes is shown in Tables I and II, and Figs. 2 and 3.
Functional enrichment analysis results
Significant GO terms and KEGG pathways of upregulated and downregulated genes are listed in Tables III and IV. Cell cycle and ECM-receptor interaction were enriched in upregulated genes. Several metabolic pathways were significant in downregulated genes, such as cellular amino acid derivative metabolic process, metabolism of xenobiotics by cytochrome P450 and glycolysis/gluconeogenesis.
Screening of potential liver cancer-related CNVs
A total of 735 liver cancer-related CNVs were obtained, and the matrix of genes and CNVs was then constructed. A total of 251 genes with both CNV and gene expression values were selected, of which 46 showed significant differential expression between liver cancer and normal liver tissue. Because CNVs on the X and Y chromosomes of controls were not included, 11 genes located on the X and Y chromosomes were excluded, and 35 genes were retained for subsequent analysis (Fig. 4 and Table V).
Table V. Result of correlation analysis between copy number and differential expression for the 35 genes.
CNV-driven genes and transcriptional regulatory network
A total of 25 CNV-driven genes were identified. Functional enrichment analysis results of these genes are shown in Table VI. In the transcriptional regulatory network (Fig. 5), 16 TFs regulated 21 CNV-driven genes. SP1, AP2, CREB, ELK1, PAX5, PPARA, STAT3 and USF were recorded in TRED as known cancer-related TFs. The other 8 TFs (AHRARNT, MAZR, NRF2, ROAZ, RORA1, SREBP1, TAXCREB and ZIC1) may play roles in the development of liver cancer.
Discussion
In the present study, we carried out an integrated analysis of copy number variation (CNV) data and gene expression data for liver cancer. A total of 1,675 differentially expressed genes (DEGs) were identified in liver cancer, of which 1,090 were upregulated. According to the CNV distribution results, deletions and duplications were common on all 22 chromosomes in liver cancer. CNVs of 1–10 kb in length were significantly more numerous than those >50 kb, suggesting that CNVs in liver cancer are likely to affect the expression of a single gene.
Thirty-five genes with associated copy number and differential expression were obtained, of which 25 showed the same trend in gene expression and CNV and were regarded as liver cancer-related CNV-driven genes. Zinc ion binding was enriched in these genes, indicating that zinc plays a role in liver cancer, which is in accordance with previous studies (18,19). Tripartite motif containing 28 (TRIM28) mediates transcriptional control via interaction with the Kruppel-associated box repression domain found in many transcription factors; it can suppress murine HCC by forming regulatory complexes with TRIM24 and TRIM33 (20). RanBP-type and C3HC4-type zinc finger containing 1 (RBCK1) can promote cancer cell proliferation (21,22). Zinc finger protein 512B (ZNF512B) is a transcription factor that promotes the expression of a downstream gene in the transforming growth factor-β (TGF-β) signal transduction pathway, which is essential for the protection and survival of neurons; however, the influence of the new SNP (rs2275294) in patients with amyotrophic lateral sclerosis (ALS) remains unknown (23). Diacylglycerol kinase theta (DGKQ) has been reported to be associated with the risk of Parkinson’s disease (PD) in Caucasian populations (24). Choline kinase β (CHKB) is both a CNV-driven gene and a candidate for susceptibility to CNS hypersomnias (EHS), as well as narcolepsy with cataplexy. Therefore, the 25 CNV-driven genes may be potential markers for liver cancer.
In the transcriptional regulatory network, 8 TFs have previously been linked to cancer, while the other 8 TFs (AHRARNT, MAZR, NRF2, ROAZ, RORA1, SREBP1, TAXCREB and ZIC1) are implicated in the regulation of the 21 CNV-driven genes and may play roles in the pathogenesis of liver cancer. The CD4 vs. CD8 lineage specification of thymocytes is linked to co-receptor expression. The transcription factor POZ (BTB) and AT hook containing zinc finger 1 (PATZ1, MAZR) has been identified as an important regulator of Cd8 expression (25). Transcription factor nuclear factor erythroid-2-related factor 2 (NRF2) is essential for the antioxidant responsive element (ARE)-mediated induction of phase II detoxifying and oxidative stress enzyme genes (26). Shibata et al reported that mutations in NRF2 impair its recognition by Keap1-Cul3 E3 ligase and promote malignancy (27). Zinc finger protein 423 (ZFP423, ROAZ), a rat C2H2 zinc finger protein, plays a role in the regulation of olfactory neuronal differentiation through its interaction with the Olf-1/EBF transcription factor family (28). Sterol regulatory element-binding protein 1 (SREBP-1), a member of the basic-helix-loop-helix-leucine zipper (bHLH-ZIP) family of transcription factors, is synthesized as a 125-kDa precursor that is attached to the nuclear envelope and endoplasmic reticulum (29). Human T-lymphotropic virus type 1 Tax interacts specifically with the cellular transcription factor CREB and the viral 21-bp repeat element to form a Tax-CREB-DNA ternary complex, which mediates activation of viral mRNA transcription (30). These TFs merit further study to delineate their roles in liver cancer.
Collectively, the present study identified DEGs in liver cancer and disclosed a range of CNV-driven genes. Their biological functions and regulatory network were also discussed. These findings may improve our understanding of liver cancer and advance therapy development.
Acknowledgements
This study was supported by the National High Technology Research (863) Project of China (2012AA020204).
References
1. Bosch FX, Ribes J and Borràs J: Epidemiology of primary liver cancer. Semin Liver Dis. 19:271–285. 1999.
2. Ko YH, Pedersen PL and Geschwind JF: Glucose catabolism in the rabbit VX2 tumor model for liver cancer: characterization and targeting hexokinase. Cancer Lett. 173:83–91. 2001.
3. Kiyosawa K, Umemura T, Ichijo T, et al: Hepatocellular carcinoma: recent trends in Japan. Gastroenterology. 127(Suppl 1): S17–S26. 2004.
4. Redon R, Ishikawa S, Fitch KR, et al: Global variation in copy number in the human genome. Nature. 444:444–454. 2006.
5. Zhang F, Gu W, Hurles ME and Lupski JR: Copy number variation in human health, disease, and evolution. Annu Rev Genomics Hum Genet. 10:451–481. 2009.
6. Parajes S, Quinteiro C, Domínguez F and Loidi L: High frequency of copy number variations and sequence variants at CYP21A2 locus: implication for the genetic diagnosis of 21-hydroxylase deficiency. PLoS One. 3:e2138. 2008.
7. Vissers LE, de Vries BB and Veltman JA: Genomic microarrays in mental retardation: from copy number variation to gene, from research to diagnosis. J Med Genet. 47:289–297. 2010.
8. Temam S, Kawaguchi H, El-Naggar AK, et al: Epidermal growth factor receptor copy number alterations correlate with poor clinical outcome in patients with head and neck squamous cancer. J Clin Oncol. 25:2164–2170. 2007.
9. Hirsch FR, Varella-Garcia M, Cappuzzo F, et al: Combination of EGFR gene copy number and protein expression predicts outcome for advanced non-small-cell lung cancer patients treated with gefitinib. Ann Oncol. 18:752–760. 2007.
10. Sartore-Bianchi A, Moroni M, Veronese S, et al: Epidermal growth factor receptor gene copy number and clinical outcome of metastatic colorectal cancer treated with panitumumab. J Clin Oncol. 25:3238–3245. 2007.
11. Schlomm T, Kirstein P, Iwers L, et al: Clinical significance of epidermal growth factor receptor protein overexpression and gene copy number gains in prostate cancer. Clin Cancer Res. 13:6579–6584. 2007.
12. Lee HC, Li SH, Lin JC, Wu CC, Yeh DC and Wei YH: Somatic mutations in the D-loop and decrease in the copy number of mitochondrial DNA in human hepatocellular carcinoma. Mutat Res. 547:71–78. 2004.
13. Yin PH, Lee HC, Chau GY, et al: Alteration of the copy number and deletion of mitochondrial DNA in human hepatocellular carcinoma. Br J Cancer. 90:2390–2396. 2004.
14. Guichard C, Amaddeo G, Imbeaud S, et al: Integrated analysis of somatic mutations and focal copy-number changes identifies key genes and pathways in hepatocellular carcinoma. Nat Genet. 44:694–698. 2012.
15. Fujita PA, Rhead B, Zweig AS, et al: The UCSC Genome Browser database: update 2011. Nucleic Acids Res. 39:D876–D882. 2011.
16. Dennis G Jr, Sherman BT, Hosack DA, et al: DAVID: Database for Annotation, Visualization, and Integrated Discovery. Genome Biol. 4:P3. 2003.
17. Smoot ME, Ono K, Ruscheinski J, Wang PL and Ideker T: Cytoscape 2.8: new features for data integration and network visualization. Bioinformatics. 27:431–432. 2011.
18. Franklin RB, Levy BA, Zou J, et al: ZIP14 zinc transporter downregulation and zinc depletion in the development and progression of hepatocellular cancer. J Gastrointest Cancer. 43:249–257. 2012.
19. Ebara M, Fukuda H, Hatano R, et al: Relationship between copper, zinc and metallothionein in hepatocellular carcinoma and its surrounding liver parenchyma. J Hepatol. 33:415–422. 2000.
20. Herquel B, Ouararhni K, Khetchoumian K, et al: Transcription cofactors TRIM24, TRIM28, and TRIM33 associate to form regulatory complexes that suppress murine hepatocellular carcinoma. Proc Natl Acad Sci USA. 108:8212–8217. 2011.
21. Gustafsson N, Zhao C, Gustafsson JA and Dahlman-Wright K: RBCK1 drives breast cancer cell proliferation by promoting transcription of estrogen receptor α and cyclin B1. Cancer Res. 70:1265–1274. 2010.
22. Donley C, McClelland K, McKeen HD, et al: Identification of RBCK1 as a novel regulator of FKBPL: implications for tumor growth and response to tamoxifen. Oncogene. Aug 5, 2013 (Epub ahead of print).
23. Tetsuka S, Morita M, Iida A, Uehara R, Ikegawa S and Nakano I: ZNF512B gene is a prognostic factor in patients with amyotrophic lateral sclerosis. J Neurol Sci. 324:163–166. 2013.
24. Chen YP, Song W, Huang R, et al: GAK rs1564282 and DGKQ rs11248060 increase the risk for Parkinson’s disease in a Chinese population. J Clin Neurosci. 20:880–883. 2013.
25. Sakaguchi S, Hombauer M, Bilic I, et al: The zinc-finger protein MAZR is part of the transcription factor network that controls the CD4 versus CD8 lineage fate of double-positive thymocytes. Nat Immunol. 11:442–448. 2010.
26. Itoh K, Wakabayashi N, Katoh Y, et al: Keap1 represses nuclear activation of antioxidant responsive elements by Nrf2 through binding to the amino-terminal Neh2 domain. Genes Dev. 13:76–86. 1999.
27. Shibata T, Ohta T, Tong KI, et al: Cancer related mutations in NRF2 impair its recognition by Keap1-Cul3 E3 ligase and promote malignancy. Proc Natl Acad Sci USA. 105:13568–13573. 2008.
28. Tsai RY and Reed RR: Identification of DNA recognition sequences and protein interaction domains of the multiple-Zn-finger protein Roaz. Mol Cell Biol. 18:6447–6456. 1998.
29. Wang X, Sato R, Brown MS, Hua X and Goldstein JL: SREBP-1, a membrane-bound transcription factor released by sterol-regulated proteolysis. Cell. 77:53–62. 1994.
30. Tie F, Adya N, Greene WC and Giam CZ: Interaction of the human T-lymphotropic virus type 1 Tax dimer with CREB and the viral 21-base-pair repeat. J Virol. 70:8368–8374. 1996.