Key genes associated with pancreatic cancer and their association with outcomes: A bioinformatics analysis
- Authors:
- Published online on: June 3, 2019 https://doi.org/10.3892/mmr.2019.10321
- Pages: 1343-1352
Abstract
Introduction
Pancreatic cancer is a highly malignant neoplasm of the digestive system that accounts for >200,000 deaths/year globally (1). The incidence of pancreatic cancer is low compared with that of lung, breast, colorectal and gastric cancers; however, it is associated with a very high mortality rate. It has been reported that the incidence of pancreatic cancer is very similar to the associated mortality rate; the reported 5-year survival rate of patients with pancreatic cancer is <6% (2). The mortality rate of patients with pancreatic cancer ranks fourth among common cancers, and is predicted to rise to second within a decade (3). A number of factors have been identified as contributing to the etiopathogenesis of pancreatic cancer, including heredity, smoking, high-fat diet, chronic pancreatitis and consumption of N-nitroso compounds (4). Due to the latency of pancreatic cancer, the majority of patients are diagnosed at an advanced stage, when tumor tissue has already infiltrated the surrounding tissues and formed distant metastases, decreasing the usefulness of surgical interventions (5). As a result of drug resistance, the efficacy of postoperative adjuvant therapy has also been unsatisfactory (6). Carbohydrate antigen 19-9 (CA19-9) is the most frequently used marker for the clinical diagnosis of pancreatic cancer; its reported sensitivity and specificity are 69–93 and 46–98%, respectively (7). Therefore, early diagnosis and treatment are important to improve the prognosis and survival of patients with pancreatic cancer.
At present, high-throughput sequencing is employed in a variety of contexts, such as the discovery of gene mutations and chromosomal translocations that are closely associated with the occurrence and development of tumors (8–10). High-throughput sequencing may be useful for the diagnosis of cancer and development of targeted therapies. These analyses may provide novel insights to guide subsequent research.
Materials and methods
Microarray data
The gene expression profile of GSE62165 (11) was downloaded from the GEO database (12). The data were created using the GPL13667 Affymetrix® Human Genome U219 array (Affymetrix; Thermo Fisher Scientific, Inc.). GSE62165 contained data on 118 pancreatic ductal adenocarcinoma (PDAC) samples and 13 control samples. Data were normalized using the robust multi-array average (RMA) algorithm with the limma package (version 3.38.3) (13). In addition, a separate dataset, GSE28735 (14,15), was used to verify the results. This dataset included the expression profiles of 45 matched pairs of pancreatic tumor and adjacent non-tumor tissues from 45 patients with PDAC. The Cancer Genome Atlas (TCGA; http://cancergenome.nih.gov/) contains genomic sequencing data for 33 types of cancer.
Identification of differentially expressed genes (DEGs)
The limma package (version 3.38.3) (13) was used to identify DEGs between pancreatic cancer tissue and normal pancreatic tissue samples in R software (version 3.5; http://www.R-project.org). |log2 Fold Change (FC)|>3.0 and adjusted P-value <0.05 were considered to be the threshold for differential gene identification.
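The thresholding step described above can be illustrated with a minimal Python sketch. It is not the authors' R/limma code: it assumes Benjamini-Hochberg adjustment (the default in limma's topTable; the text does not name the adjustment method), and the function names and toy records are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P-value to the smallest so that the adjusted
    # values never increase as the raw P-value decreases.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(records, lfc_cutoff=3.0, alpha=0.05):
    """Split genes into up- and downregulated DEGs.

    records: list of (gene, log2_fold_change, raw_p_value) tuples.
    A gene is kept when |log2FC| > lfc_cutoff and adjusted P < alpha.
    """
    adjusted = benjamini_hochberg([rec[2] for rec in records])
    up, down = [], []
    for (gene, log2fc, _), adj_p in zip(records, adjusted):
        if adj_p < alpha and abs(log2fc) > lfc_cutoff:
            (up if log2fc > 0 else down).append(gene)
    return up, down


# Toy input with placeholder gene names (not data from the study).
toy = [
    ("GENE_A", 4.2, 1e-6),
    ("GENE_B", -3.5, 2e-4),
    ("GENE_C", 3.4, 0.2),
    ("GENE_D", 1.0, 1e-5),
]
up_genes, down_genes = select_degs(toy)
```

In this toy input, GENE_C fails the adjusted P-value threshold and GENE_D fails the fold-change threshold, so only GENE_A and GENE_B are retained.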
Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis of DEGs
GO (http://www.geneontology.org/) and KEGG (https://www.kegg.jp/) (16–19) were used to analyze the function of DEGs using the clusterProfiler R package (20). P<0.05 was considered to indicate statistically significant functional enrichment.
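Over-representation analysis of this kind is commonly based on a hypergeometric test, which asks how likely it is to observe at least the given number of DEGs in a term by chance. The sketch below shows that calculation in Python under stated assumptions: the function name and the example counts are hypothetical, and it does not reproduce clusterProfiler's gene-universe handling or multiple-testing correction.

```python
from math import comb


def enrichment_p_value(n_background, n_term, n_degs, n_overlap):
    """Upper-tail hypergeometric P-value, P(X >= n_overlap).

    n_background: number of genes in the background universe
    n_term:       background genes annotated to the GO term or pathway
    n_degs:       number of DEGs tested
    n_overlap:    DEGs that are annotated to the term
    """
    upper = min(n_term, n_degs)
    total = comb(n_background, n_degs)
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_degs - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 137 DEGs, of which 10 fall in a 100-gene term
# drawn from a 20,000-gene background.
p = enrichment_p_value(20000, 100, 137, 10)
```

Exact integer binomial coefficients are used so that the tail sum is not affected by floating-point underflow; only the final ratio is converted to a float.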
Core genes screening from the protein-protein interaction (PPI) network
A PPI network for the DEGs was generated using the STRING database (https://string-db.org/). The network was then imported into Cytoscape (version 3.6.1) (21) and analyzed with the cytoHubba plug-in (22). The plug-in provides 12 topological analysis methods [Maximal Clique Centrality, Maximum Neighborhood Component (MNC), Density of MNC, Degree, Edge Percolated Component, Bottleneck, EcCentricity, Closeness, Radiality, Betweenness, Stress and Clustering Coefficient]. Using these 12 analysis methods, we identified the top 18 genes as core genes.
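As an illustration of one of the 12 methods listed above, the following Python sketch ranks nodes by Degree, i.e. the number of distinct interaction partners. It is a simplified stand-in for cytoHubba rather than its implementation; the function name, the alphabetical tie-breaking rule and the toy edge list are assumptions made for the example.

```python
from collections import defaultdict


def rank_by_degree(edges, top_n=3):
    """Rank nodes of an undirected network by their number of neighbors.

    edges: iterable of (node_a, node_b) pairs; self-loops are ignored and
    duplicate edges are counted once. Ties are broken alphabetically.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue
        neighbors[a].add(b)
        neighbors[b].add(a)
    ranked = sorted(neighbors, key=lambda node: (-len(neighbors[node]), node))
    return [(node, len(neighbors[node])) for node in ranked[:top_n]]


# Toy network with placeholder node names (not STRING data).
toy_edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"), ("G2", "G3"), ("G4", "G5")]
hubs = rank_by_degree(toy_edges)
```

The other cytoHubba scores (for example Betweenness or Closeness) depend on shortest paths or clique structure and would require a graph library or additional code.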
Expression levels and survival analysis of core genes in pancreatic cancer
UALCAN (http://ualcan.path.uab.edu/index.html) (23) was employed to perform survival analysis based on data from TCGA. Survival analysis was performed via the Kaplan-Meier method for each of the 18 identified core genes, based on their expression levels in pancreatic adenocarcinoma (PAAD). The P-value was calculated using the log-rank test, and P<0.05 was considered to be statistically significant. The ‘scaled_estimate’ column provided the estimated fraction of transcripts attributable to each gene. The ‘scaled_estimate’ was multiplied by 10^6 to obtain a transcripts per million (TPM) expression value (24). Gene expression levels in tumor tissues exhibited notable inter-individual variability. High expression indicated that the TPM value was above the upper quartile value. Low expression indicated that the TPM value was equal to or below the upper quartile value.
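The TPM conversion and the high/low grouping rule described above can be sketched in Python as follows. This is an illustration only: the function names and sample identifiers are hypothetical, and the quartile interpolation rule used by UALCAN is not stated in the text, so the "inclusive" method of Python's statistics module is an assumption.

```python
import statistics


def to_tpm(scaled_estimate):
    """Convert a 'scaled_estimate' value to transcripts per million."""
    return scaled_estimate * 1e6


def split_by_upper_quartile(tpm_by_sample):
    """Assign samples to high- and low-expression groups.

    High: TPM strictly above the upper quartile (Q3).
    Low:  TPM equal to or below Q3.
    Returns (q3, high_samples, low_samples) with sorted sample lists.
    """
    values = list(tpm_by_sample.values())
    # quantiles(..., n=4) returns [Q1, Q2, Q3]; the interpolation method
    # is an assumption, not a documented UALCAN setting.
    q3 = statistics.quantiles(values, n=4, method="inclusive")[2]
    high = sorted(s for s, v in tpm_by_sample.items() if v > q3)
    low = sorted(s for s, v in tpm_by_sample.items() if v <= q3)
    return q3, high, low


# Toy TPM values with placeholder sample names (not TCGA data).
toy_tpm = {"s1": 1.0, "s2": 2.0, "s3": 3.0, "s4": 4.0, "s5": 5.0}
q3, high_group, low_group = split_by_upper_quartile(toy_tpm)
```

Under this rule roughly one quarter of the samples fall into the high-expression group, so the two arms of the Kaplan-Meier comparison are unequal in size by design.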
Verification of results
The findings from the bioinformatics analyses were validated using the dataset GSE28735 from the GEO database. The expression profiles included 45 matched pairs of pancreatic tumor and adjacent non-tumor tissues from 45 patients with PDAC. The online analysis tool GEO2R (https://www.ncbi.nlm.nih.gov/geo/geo2r/) was used to determine the expression of DEGs. We further verified the expression of COL17A1.
Results
Analysis of DEGs
The selected dataset GSE62165 included 118 PDAC samples and 13 control samples. Differences in gene expression profiles were analyzed using 38 early-stage tumors and 13 normal tissues. A total of 240 DEGs (adjusted P-value <0.05; |log2FC|≥3.0) were identified using R version 3.5 software, including 137 upregulated genes and 103 downregulated genes (Table I). The heat map of genes with upregulated expression is presented in Fig. 1. A volcano plot of all genes is presented in Fig. 2.
Table I. Top 20 differentially expressed genes in early-stage pancreatic cancer tissues based on log2FC.
Enrichment analysis of DEGs
To investigate the distribution of DEGs, GO and KEGG analyses of upregulated and downregulated genes were conducted. GO analysis revealed that the ‘biological processes’ (BPs) of upregulated genes mainly included extracellular matrix organization, extracellular structure organization and collagen catabolic process. ‘Molecular functions’ (MFs) of upregulated DEGs primarily included extracellular matrix structural constituents, glycosaminoglycan binding and cytokine activity. For the ‘cell components’ (CCs) identified by GO analysis, proteinaceous extracellular matrix, extracellular matrix component and endoplasmic reticulum lumen were the most prominent (Table II). For downregulated DEGs, the main enriched BPs were digestion, lipid digestion and sulfur amino acid metabolic process, whereas the primary MFs were exopeptidase activity, serine-type endopeptidase activity and serine-type peptidase activity (Table II). Figs. 3 and 4 present the associations between genes and enrichment results, indicating the genes that were highly changed between the two conditions.
Table III presents KEGG pathway analysis of the DEGs, revealing that the upregulated genes were mainly located in extracellular matrix (ECM)-receptor interaction, protein digestion and absorption, and focal adhesion pathways. Conversely, downregulated genes were primarily located in pancreatic secretion, protein digestion and absorption, and fat digestion and absorption pathways. Figs. 5 and 6 present the distribution of the major KEGG pathways generated using clusterProfiler. It was observed that ECM-receptor interactions (Fig. 5) and pancreatic secretion (Fig. 6) were the pathways most enriched with up- and downregulated DEGs, respectively.
Screening of core genes in the PPI
Based on the information in the STRING database and using the 12 calculation methods in Cytoscape, the following 18 core genes were identified: EGF, ALB, COL17A1, FN1, TIMP1, PLAU, PLA2G1B, IGFBP3, PLAUR, VCAN, COL1A1, PNLIP, CTRL, PRSS3, COMP, CPB1, ITGA2 and CEL. These core genes were associated with each other and may exhibit synergistic effects in the development of pancreatic cancer (Fig. 7). According to the previous enrichment analysis, the core genes were mainly located in pancreatic secretion, protein digestion and absorption, and focal adhesion pathways.
Gene expression level and survival analysis
Following the identification of core genes, survival analysis for PAAD was performed using UALCAN. Of the 18 core genes, only PLAU [which encodes the serine protease urokinase-type plasminogen activator (uPA); Fig. 8] and COL17A1 [which encodes collagen type XVII α1 chain (COL17A1); Fig. 9] were significantly associated with survival (P<0.05). Subsequently, the expression levels of these genes in primary pancreatic cancer were compared; only COL17A1 was significantly differentially expressed, whereas PLAU was not. The expression levels of COL17A1 were analyzed in TCGA database, and the results were consistent with those of the aforementioned differential gene analysis; COL17A1 was significantly upregulated in PAAD tumor tissues compared with normal tissues (P=1.62×10^-12; Fig. 10).
Verification of COL17A1
Differences in gene expression between 45 pancreatic tumor tissues and 45 matched adjacent non-tumor tissues in GSE28735 were analyzed. In particular, the expression level of COL17A1 was investigated. The results of the analysis to verify the importance of COL17A1 are presented in Table IV; COL17A1 was significantly upregulated in pancreatic tumor tissue in both GEO datasets.
Discussion
The incidence of pancreatic cancer and the associated mortality rates have exhibited an increasing trend in recent years (3). Studies have reported that patients with pancreatic cancer survive for only 4 months on average without treatment; even in patients who undergo treatment, survival is not significantly extended (25). Therefore, accurate early diagnosis of pancreatic cancer and the development of effective targeted therapy are of major importance.
A previous study identified core genes in pancreatic cancer that were reported to be of diagnostic relevance (26). In the present study, the dataset GSE62165 from the GEO was analyzed, containing data of 118 PDAC and 13 normal pancreatic tissues (11). Differences in gene expression levels were only compared between normal tissues and early-stage tumor tissue. A total of 240 DEGs (137 upregulated and 103 downregulated) were identified using R, and GO (27) and KEGG pathway analyses of DEGs revealed the locations and functions of DEGs. Upregulated genes were mainly located in the ECM and collagen trimers, and were involved in ECM organization and ECM-receptor interactions, focal adhesion, and protein digestion and absorption. Conversely, downregulated genes were mainly enriched in digestion and exopeptidase activity pathways. A PPI network was built, and 18 core genes were identified; the prognostic value of these genes for patients with pancreatic cancer was analyzed using UALCAN. PLAU and COL17A1 were significantly associated with poorer survival; it was then revealed using data from TCGA that COL17A1 was significantly upregulated in pancreatic cancer tissues compared with control tissues, consistent with the results of the differential gene analysis. It was predicted that these two genes may be associated with the proliferation, invasion and metastasis of pancreatic cancer.
PLAU encodes a serine protease, uPA (28). Following GO and KEGG analyses, the functional enrichment of PLAU was investigated. PLAU is mainly involved in the regulation of cell motility, cellular component movement and locomotion (29). It is primarily expressed in the endoplasmic reticulum lumen and invadopodium (30). PLAU plays a key role in regulating cell migration and adhesion during tissue regeneration and intracellular signaling (31). Increased expression of COL17A1 leads to tumor cell invasion and metastasis of tumor cells to surrounding tissues (32). PLAU is involved in predicting the survival rate of patients with gastric cancer (33). It may serve an important role in the invasion and metastasis of pancreatic cancer cells (34); however, the specific pathways involved are yet to be determined. It is hypothesized that PLAU may serve an important role in the diagnosis and treatment of pancreatic cancer in the future.
COL17A1 is mainly located in the extracellular matrix and collagen trimers (35). Extracellular matrix molecules, including proteoglycan and fibrin, have been reported to affect the growth, migration and differentiation of cells (36). A study showed that COL17A1 can inhibit the migration and invasion of breast cancer cells, acting as a p53 transcriptional target gene (37). A previous study has reported that the extracellular matrix is closely associated with the metastasis of breast cancer (38). High levels of collagen in breast and colorectal cancers have been associated with tumor invasion (39,40). A previous study that employed the minimum-redundancy-maximum-relevance method also identified COL17A1 as a core gene of pancreatic cancer (26); however, in the present study, the upregulated expression of COL17A1 in pancreatic cancer was verified in multiple datasets, and its effects on patient survival were determined. Survival analysis using UALCAN based on data from TCGA revealed that the expression levels of COL17A1 were closely associated with the survival of patients with pancreatic cancer, and that COL17A1 was highly expressed in primary pancreatic tumor tissues. The present findings suggested that the expression of COL17A1 is associated with the occurrence and development of pancreatic cancer. Therefore, this bioinformatics analysis may provide novel insight for future studies investigating the pathogenesis of pancreatic cancer.
However, the present study had certain limitations. In examining the expression level of COL17A1, only four normal samples were investigated, and further studies examining a larger number of control samples are required to confirm the present results.
Acknowledgements
Not applicable.
Funding
The present work was supported by The ‘Six Talents Summit’ Project in Jiangsu Province, miR-203 targets Survivin to upregulate the expression of Caspase-3 and promote the apoptosis of pancreatic cancer cells (grant no. WAW-008).
Availability of data and materials
The datasets used and/or analyzed in the present study are available in the GEO (http://www.ncbi.nlm.nih.gov/geo) and UALCAN (http://ualcan.path.uab.edu) repositories.
Authors' contributions
JZ and LX conceived the study. JW, ZL, KW, KZ and DX analyzed the data and drafted the manuscript. All authors reviewed and approved the final manuscript.
Ethics approval and consent to participate
Not applicable.
Patient consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
References
Kamisawa T, Wood LD, Itoi T and Takaori K: Pancreatic cancer. Lancet. 388:73–85. 2016.
Michaud D: Epidemiology of pancreatic cancer. Minerva Chir. 59:99–111. 2004.
Siegel R, Ma J, Zou Z and Jemal A: Cancer statistics, 2014. CA Cancer J Clin. 64:9–29. 2014.
Risch HA: Etiology of pancreatic cancer, with a hypothesis concerning the role of N-nitroso compounds and excess gastric acidity. J Natl Cancer Inst. 95:948–960. 2003.
Kern SE, Shi C and Hruban RH: The complexity of pancreatic ductal cancers and multidimensional strategies for therapeutic targeting. J Pathol. 223:295–306. 2011.
Grasso C, Jansen G and Giovannetti E: Drug resistance in pancreatic cancer: Impact of altered energy metabolism. Crit Rev Oncol Hematol. 114:139–152. 2017.
Eskelinen M and Haglund U: Developments in serologic detection of human pancreatic adenocarcinoma. Scand J Gastroenterol. 34:833–844. 1999.
Bass AJ, Lawrence MS, Brace LE, Ramos AH, Drier Y, Cibulskis K, Sougnez C, Voet D, Saksena G, Sivachenko A, et al: Genomic sequencing of colorectal adenocarcinomas identifies a recurrent VTI1A-TCF7L2 fusion. Nat Genet. 43:964–968. 2011.
Sjöblom T, Jones S, Wood LD, Parsons DW, Lin J, Barber TD, Mandelker D, Leary RJ, Ptak J, Silliman N, et al: The consensus coding sequences of human breast and colorectal cancers. Science. 314:268–274. 2006.
Wood LD, Parsons DW, Jones S, Lin J, Sjöblom T, Leary RJ, Shen D, Boca SM, Barber T, Ptak J, et al: The genomic landscapes of human breast and colorectal cancers. Science. 318:1108–1113. 2007.
Janky R, Binda MM, Allemeersch J, Van den Broeck A, Govaere O, Swinnen JV, Roskams T, Aerts S and Topal B: Prognostic relevance of molecular subtypes and master regulators in pancreatic ductal adenocarcinoma. BMC Cancer. 16:632. 2016.
Barrett T, Wilhite SE, Ledoux P, Evangelista C, Kim IF, Tomashevsky M, Marshall KA, Phillippy KH, Sherman PM, Holko M, et al: NCBI GEO: Archive for functional genomics data sets-update. Nucleic Acids Res. 41 (Database Issue):D991–D995. 2013.
Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W and Smyth GK: Limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 43:e47. 2015.
Zhang G, Schetter A, He P, Funamizu N, Gaedcke J, Ghadimi BM, Ried T, Hassan R, Yfantis HG, Lee DH, et al: DPEP1 inhibits tumor cell invasiveness, enhances chemosensitivity and predicts clinical outcome in pancreatic ductal adenocarcinoma. PLoS One. 7:e31507. 2012.
Zhang G, He P, Tan H, Budhu A, Gaedcke J, Ghadimi BM, Ried T, Yfantis HG, Lee DH, Maitra A, et al: Integration of metabolomics and transcriptomics revealed a fatty acid network exerting growth inhibitory effects in human pancreatic cancer. Clin Cancer Res. 19:4983–4993. 2013.
Xing Z, Chu C, Chen L and Kong X: The use of Gene Ontology terms and KEGG pathways for analysis and prediction of oncogenes. Biochim Biophys Acta. 1860:2725–2734. 2016.
Kanehisa M, Sato Y, Furumichi M, Morishima K and Tanabe M: New approach for understanding genome variations in KEGG. Nucleic Acids Res. 47:D590–D595. 2019.
Kanehisa M, Furumichi M, Tanabe M, Sato Y and Morishima K: KEGG: New perspectives on genomes, pathways, diseases and drugs. Nucleic Acids Res. 45:D353–D361. 2017.
Kanehisa M and Goto S: KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 28:27–30. 2000.
Yu G, Wang LG, Han Y and He QY: clusterProfiler: An R package for comparing biological themes among gene clusters. OMICS. 16:284–287. 2012.
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B and Ideker T: Cytoscape: A software environment for integrated models of biomolecular interaction networks. Genome Res. 13:2498–2504. 2003.
Chin CH, Chen SH, Wu HH, Ho CW, Ko MT and Lin CY: cytoHubba: Identifying hub objects and sub-networks from complex interactome. BMC Syst Biol. 8 (Suppl 4):S11. 2014.
Chandrashekar DS, Bashel B, Balasubramanya SAH, Creighton CJ, Ponce-Rodriguez I, Chakravarthi BVSK and Varambally S: UALCAN: A portal for facilitating tumor subgroup gene expression and survival analyses. Neoplasia. 19:649–658. 2017.
Li B and Dewey CN: RSEM: Accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinformatics. 12:323. 2011.
Wang X, Wang L, Mo Q, Dong Y, Wang G and Ji A: Changes of Th17/Treg cell and related cytokines in pancreatic cancer patients. Int J Clin Exp Pathol. 8:5702–5708. 2015.
Shen S, Gui T and Ma C: Identification of molecular biomarkers for pancreatic cancer with mRMR shortest path method. Oncotarget. 8:41432–41439. 2017.
Thomas PD: The gene ontology and the meaning of biological function. Methods Mol Biol. 1446:15–24. 2017.
Duffy MJ, Duggan C, Mulcahy HE, McDermott EW and O'Higgins NJ: Urokinase plasminogen activator: A prognostic marker in breast cancer including patients with axillary node-negative disease. Clin Chem. 44:1177–1183. 1998.
Nielsen TO, Andrews HN, Cheang M, Kucab JE, Hsu FD, Ragaz J, Gilks CB, Makretsov N, Bajdik CD, Brookes C, et al: Expression of the insulin-like growth factor I receptor and urokinase plasminogen activator in breast cancer is associated with poor survival: Potential for intervention with 17-allylamino geldanamycin. Cancer Res. 64:286–291. 2004.
Pavón MA, Arroyo-Solera I, Céspedes MV, Casanova I, León X and Mangues R: uPA/uPAR and SERPINE1 in head and neck cancer: Role in tumor resistance, metastasis, prognosis and therapy. Oncotarget. 7:57351–57366. 2016.
Amos S, Redpath GT, Dipierro CG, Carpenter JE and Hussaini IM: Epidermal growth factor receptor-mediated regulation of urokinase plasminogen activator expression and glioblastoma invasion via C-SRC/MAPK/AP-1 signaling pathways. J Neuropathol Exp Neurol. 69:582–592. 2010.
Chaudhary A, Hilton MB, Seaman S, Haines DC, Stevenson S, Lemotte PK, Tschantz WR, Zhang XM, Saha S, Fleming T and St Croix B: TEM8/ANTXR1 blockade inhibits pathological angiogenesis and potentiates tumoricidal responses against multiple cancer types. Cancer Cell. 21:212–226. 2012.
Xu ZY, Chen JS and Shu YQ: Gene expression profile towards the prediction of patient survival of gastric cancer. Biomed Pharmacother. 64:133–139. 2010.
Liu P, Weng Y, Sui Z, Wu Y, Meng X, Wu M, Jin H, Tan X, Zhang L and Zhang Y: Quantitative secretomic analysis of pancreatic cancer cells in serum-containing conditioned medium. Sci Rep. 6:37606. 2016.
Borradori L and Sonnenberg A: Structure and function of hemidesmosomes: More than simple adhesion complexes. J Invest Dermatol. 112:411–418. 1999.
Järveläinen H, Sainio A, Koulu M, Wight TN and Penttinen R: Extracellular matrix molecules: Potential targets in pharmacotherapy. Pharmacol Rev. 61:198–223. 2009.
Yodsurang V, Tanikawa C, Miyamoto T, Lo PHY, Hirata M and Matsuda K: Identification of a novel p53 target, COL17A1, that inhibits breast cancer cell migration and invasion. Oncotarget. 8:55790–55803. 2017.
Chowdhury N and Sapru S: Association of protein translation and extracellular matrix gene sets with breast cancer metastasis: Findings uncovered on analysis of multiple publicly available datasets using individual patient data approach. PLoS One. 10:e0129610. 2015.
Rizwan A, Bulte C, Kalaichelvan A, Cheng M, Krishnamachary B, Bhujwalla ZM, Jiang L and Glunde K: Metastatic breast cancer cells in lymph nodes increase nodal collagen density. Sci Rep. 5:10002. 2015.
Zou X, Feng B, Dong T, Yan G, Tan B, Shen H, Huang A, Zhang X, Zhang M, Yang P, et al: Up-regulation of type I collagen during tumorigenesis of colorectal cancer revealed by quantitative proteomic analysis. J Proteomics. 94:473–485. 2013.