Bioinformatics analysis of gene expression profiles to identify causal genes in luminal B2 breast cancer
- Authors:
- Published online on: October 23, 2017 https://doi.org/10.3892/ol.2017.7256
- Pages: 7880-7888
-
Copyright: © Wang et al. This is an open access article distributed under the terms of Creative Commons Attribution License.
Abstract
Introduction
Breast cancer, a complex and heterogeneous malignancy, is the most frequently diagnosed cancer and the leading cause of cancer mortality among females worldwide; there were estimated to be 1.7 million cases and 521,900 deaths in 2012 (1). The incidence rates of breast cancer in economically developed regions, including North America, Australia, New Zealand, and Northern and Western Europe, are higher than those in Africa and Asia (1).
Breast cancer is classified into 4 subtypes: Luminal A, luminal B, human epidermal growth factor receptor 2 (HER2) -positive and basal-like, according to clinically and biologically relevant molecular features that were previously identified by DNA microarray-based expression profiling (2–4). These subtypes are associated with distinct pathological features, treatment responses and clinical outcomes (5). Basal-like breast cancer is predominantly triple-negative (i.e. no expression of estrogen receptor (ER), progesterone receptor (PR) or HER2). In HER2-positive breast cancer, the HER2 gene is overexpressed. The basal-like and HER2-positive subtypes are associated with a poor prognosis, and the luminal A subtype was identified as having the more favorable clinical outcome (4).
The clinical and biological behavior of the luminal B subtype is more aggressive than the luminal A subtype, with a higher metastatic risk and greater resistance to hormone therapy and conventional chemotherapy; patients with luminal B cancer exhibit increased relapse rates in the first 5 years following diagnosis (6). Luminal B-subtype cancer cases have a higher recurrence score than luminal A, based on OncotypeDX detection: In a cohort of 831 untreated lymph node metastasis-negative patients, the hazard ratio (HR) for early metastasis (<5 years) was 2.86 for luminal B relative to luminal A (6,7).
As a subtype of Luminal B breast cancer, luminal B2 breast cancer (LB2BC) is usually associated with poor prognosis, and the pathophysiological mechanism underlying luminal B2 breast cancer remains unknown. In the present study, bioinformatics methods were used to identify differentially expressed genes (DEGs) between LB2BC and non-tumor tissues, with the aim of providing valuable information for further pathogenesis mechanism elucidation of LB2BC.
Subjects and methods
TGCA data
All 1,098 patients (as included in the database on the 27th of October, 2015) with breast cancer were retrieved from The Cancer Genome Atlas (TGCA) data portal (8). A total of 117 LB2BC expression profiles from this dataset were included in the present study, according to the inclusion and exclusion criteria below. Level 3 raw expression data of mRNA from the Illumina HiSeq_miRNAseq V2 platform for the 117 LB2BC profiles and 18 normal controls were downloaded from TCGA data portal.
Exclusion criteria
The exclusion criteria for patients with LB2BC were as follows: i) History of any other malignancy; ii) history of previous treatment, including chemotherapy, radiotherapy or endocrinotherapy; iii) samples without mRNA sequence data.
Inclusion criteria
The inclusion criteria were as follows: i) HER2 status, as assessed by immunohistochemistry (IHC), was positive; ii) ER expression, PR expression, or both, were positive.
Screening of differentially expressed genes
In R (version 3.2.3), the DESeq (version 1.28.0) package was used for reads count analysis (9). The limma package (version 1.9.6) (10) was used to screen for DEGs by comparing LB2BC expression profiles to control profiles. The raw P-value was adjusted to a false discovery rate (FDR) value using the Benjamin and Hochberg method. Genes within the cut-off criteria of FDR<0.001 were designated as DEGs.
Functional annotation of DEGs
To identify the biological function and associated metabolic pathways for the DEGs, they were submitted to the Kyoto Encyclopedia of Genes and Genomes (KEGG) database for pathway enrichment analysis and GOrilla online software (11) for gene ontology (GO) term enrichment analysis (12,13). P<0.001 and FDR<0.05 were set as the cut-offs for selecting significantly enriched functional GO terms and KEGG pathway, respectively.
Construction of protein-protein interaction (PPI) network
A PPI network can organize genes into a network to aid the understanding of their biological function (13). The BioGRID database (14) includes verified and predicted protein interactions. In the present study, DEGs were mapped into the BioGRID database to screen for interacting protein pairs (15). A PPI network of the top 15 upregulated and downregulated DEGs was constructed and visualized using Cytoscape software (version 3.3.0) (16).
Reverse transcription-quantitative polymerase chain reaction (RT-qPCR)
A total of five tumor tissues and five paired adjacent normal tissues were obtained from five female patients with breast cancer undergoing routine surgical procedures in Linyi People's Hospital (Linyo, China) during February to August 2015, with a mean age of 45.5 years. The tissues were frozen immediately after surgery in liquid nitrogen and were stored at −80°C until RNA extraction. The present study was approved by the Ethics Committee of Linyi People's Hospital. Written informed consent was obtained from all patients prior to enrollment in the present study. Total RNA from tissue samples was extracted using TRIzol reagent (Invitrogen; Thermo Fisher Scientific, Inc., Waltham, MA, USA) according to the manufacturer's protocol. The SuperScript III Reverse Transcription kit (Invitrogen; Thermo Fisher Scientific, Inc.) was used for cDNA synthesis according to the manufacturer's protocol. qPCR was used to quantify differences in the mRNA expression of associated genes using SYBRGreen PCR reagent (Applied Biosystems; Thermo Fisher Scientific, Inc.) according to the manufacturer's protocol. PCR reactions were performed in triplicate and run under the following conditions: 1 cycle of 95°C for 10 min, followed by 45 cycles of 95°C for 15 sec and 60°C for 60 sec. The relative expression level was calculated using the 2−ΔΔCq method (17), normalized to β-actin. Each reaction was performed ≥3 times.
The primer sequences were as follows: Neurotrophic receptor tyrosine kinase 2 (NTRK2) forward, 5′-ATCTCCAACCTCAGACCACCACT-3′ and reverse, 5′-AATCTGTTTCTCATCCTTCCCATAC-3′; fibronectin-1 (FN1) forward, 5′-CAACCTACGGATGACTCGTGCTT-3′ and reverse, 5′-TTTCTCCCTGACGGTCCCACTT-3′; Polo-like kinase 1 (PLK1) forward, 5′-GGCAGCGTGCAGATCAACTT-3′ and reverse, 5′-CAGGAGACTCAGGCGGTATGT-3′; epidermal growth factor receptor (EGFR) forward, 5′-GTGACCGTTTGGGAGTTGATGA-3′ and reverse, 5′-GGCTGAGGGAGGCGTTCTC-3′); and β-actin forward, 5′-CTGAAGTACCCCATCGAGCAC-3′ and reverse, 5′-ATAGCACAGCCTGGATAGCAAC-3′.
Statistical analysis
RT-qPCR experimental data were expressed as the mean ± standard deviation. GraphPad Prism (version 6.0) software (GraphPad Software, Inc., La Jolla, CA, USA) was used to perform statistical analysis. Statistically significant differences between two groups were evaluated using an unpaired Student's t-test. P<0.05 was considered to indicate a statistically significant difference.
Results
DEGs in LB2BC
A total of 2,251 significantly DEGs were identified in LB2BC tissues compared with non-tumor controls. The DEGs included 759 upregulated and 1,492 downregulated genes. Matrix metalloproteinase-11 was the most significantly upregulated gene; NTRK2 was the most significantly downregulated gene. The top 15 significantly upregulated and downregulated genes are listed in Table I. The pattern of expression change for the top 200 DEGs is presented in Fig. 1.
GO annotation of DEGs
To obtain insights into their potential biological roles, GO term enrichment analysis was performed on the DEGs from LB2BC. ‘Mitotic cell cycle process’ (GO, 1903047; P=1.22×10−8), ‘cell cycle G2/M phase transition’ (GO, 0044839; P=7.43×10−7) and ‘G2/M transition of mitotic cell cycle’ (GO, 0000086; P=7.43×10−7) were the most significantly enriched biological process GO terms; ‘vinculin binding’ (GO, 0017166; P=5.02×10−6), ‘structural constituent of muscle’ (GO, 0008307; P=1.53×10−5) and ‘potassium channel regulator activity’ (GO, 0015459; P=2.77×10−5) were the most significantly enriched molecular function GO terms; ‘condensed chromosome outer kinetochore’ (GO, 0000940; P=2.34×10−5), ‘extracellular matrix’ (GO, 0031012; P=4.08×10−5) and ‘microtubule spindle’ (GO, 0005876; P=8.77×10−5) were the most significantly enriched cellular component GO terms (Table II).
Pathway enrichment analysis
KEGG pathway enrichment analysis was performed on the DEGs from LB2BC. FDR<0.05 was used as the cut-off for pathway significance. The most significantly enriched pathways were ‘focal adhesion’ (FDR=1.02×10−20), ‘pathways in cancer’ (FDR=1.50×10−20), ‘ECM-receptor interactions’ (FDR=1.15×10−14) and ‘cytokine-cytokine receptor interaction’ (FDR=1.50×10−11; Table III).
Protein-protein interaction (PPI) network construction
PPI networks for the top 15 upregulated and downregulated DEGs were constructed using Cytoscape software. The network consisted of 1,681 nodes and 1,898 edges. The hub proteins were EGFR, FN1 and PLK1 (Fig. 2).
RT-qPCR verification of DEGs
The expression level of four of the identified DEGs (NTRK2, FN1, PLK1 and EGFR) were validated with RT-qPCR in five LB2BC tumor and adjacent tissue samples. The expression of NTRK2 did not differ significantly between LB2BC tumor and adjacent tissues, but trended towards downregulation in LB2BC (Fig. 3A). The expression of FN1 was significantly upregulated in LB2BC (P<0.05; Fig. 3B). PLK1 expression did not differ significantly between LB2BC and adjacent tumor tissues (Fig. 3C). The expression of EGFR was significantly downregulated in LB2BC compared with adjacent tissue (P<0.001 Fig. 3D).
Discussion
NTRK2 was the most significantly downregulated DEG identified in LB2BC (Table I) and was associated with the KEGG term ‘MAPK signaling pathway’ (Table III). The expression of NTRK2 did not significantly differ between LB2BC and non-cancerous tissue, although it trended towards downregulation in LB2BC (Fig. 3A). NTRK2 is a member of the neurotrophic tyrosine receptor kinase family. The expression of NTRK2 was previously identified through microarray analysis to be downregulated in patients with breast cancer with a relatively poor prognosis (18). NTRK2 expression leads to anoikis resistance in several types of cancer, including ovarian cancer (19,20). NTRK2 is a direct target of microRNA (miR)-200c; miR-200c decreases the endogenous expression of NTRK2 and mediated the suppression of anoikis resistance in breast cancer cells (21).
In the present study, FN1 was identified as a top-10 upregulated DEG in LB2BC (Table I). RT-qPCR analysis identified that FN1 was significantly upregulated in LB2BC (Fig. 3B), in accord with the DEG analysis. FN1 was one of the hub proteins in the PPI network (Fig. 2). FN1 is a glycoprotein present in a soluble dimeric form in plasma, and in a dimeric or multimeric form at the cell surface and the extracellular matrix (22). FN1 is associated with cell adhesion, cell migration, wound healing and cell metastasis (23). FN1 is upregulated in breast cancer; FN1 was also identified as hub protein in a previously constructed PPI network (24). Previous gene regulatory network analysis of breast cancer demonstrated that FN1 was upregulated in aggressive breast cancer cell lines and was associated with the aggressive behavior of breast cancer cells (25). The expression of FN1 is a positive predictor for breast cancer cell line sensitivity to paclitaxel (26). The elevated expression of FN1 promotes vascular endothelial growth factor-C expression and the epithelial-mesenchymal transition, promoting lymph node metastasis in human oral squamous cell carcinoma (27). FN1 was associated with 5 significantly enriched KEGG pathways, including ‘focal adhesion’, ‘pathways in cancer’, ‘ECM-receptor interaction’, ‘regulation of actin cytoskeleton’ and ‘small cell lung cancer’ (Table III).
PLK1 was significantly upregulated in LB2BC (Table I). PLK1 is a key regulator of cell division and is overexpressed in numerous types of human cancer, including prostate, pancreatic and breast cancer (28). PLK1 is significantly enriched in the ‘cell cycle’ and ‘progesterone-mediated oocyte maturation’ pathways (Table III). PLK1 was identified as downregulated in LB2BC as assessed by RT-qPCR, whereas the opposite result was obtained in DEG analysis (Fig. 3C); this may be due to the small experimental sample size (n=5) used in the present study. PLK1 was a hub protein, interacting with 153 genes in the PPI network (Fig. 2). PLK1 encodes a serine/threonine protein kinase that belongs to the cell cycle serine/threonine-protein kinase CDC5/Polo subfamily. PLK1 is significantly overexpressed in triple-negative breast cancer (TNBC), compared with other breast cancer subtypes; the inhibition of PLK1 activity induces an increase in G2/M cell cycle arrest and TNBC cell apoptosis (29). PLK1 is overexpressed in breast cancer cell lines compared with normal breast cell lines; silencing PLK1 enhances the sensitivity of breast cancer cell lines to rapamycin (30).
EGFR expression was significantly downregulated in LB2BC, as observed via DEG analysis and validated using RT-qPCR (Table I; Fig. 3D). EGFR was the hub protein with the highest connectivity degree, connecting to 781 genes in the PPI network. EGFR was significantly enriched in several pathways, including pathways in cancer and the MAPK signaling pathway (Table III). EGFR was previously identified as upregulated in various types of cancer, including lung cancer, gastric cancer and glioblastoma (31–33). EGFR is frequently overexpressed in TNBC; high EGFR gene copy number predicts a poor patient outcome (34). EGFR expression was downregulated in LB2BC in the present study, indicating that the pathogenesis of LB2BC is distinct from TNBC.
In conclusion, 2,251 DEGs were identified between LB2BC and non-tumor tissue. The top 15 upregulated and downregulated genes in LB2BC were used to construct a PPI network. A number of genes, including NTRK2, FN1, PLK1 and EGFR, were identified that potentially serve critical roles in the pathogenesis of LB2BC via the ‘focal adhesion’, ‘pathways in cancer’ and ‘ECM-receptor interaction signaling’ KEGG pathways. The findings of the present study may contribute to the further elucidation of the pathogenesis of LB2BC.
References
Torre LA, Bray F, Siegel RL, Ferlay J, Lortet-Tieulent J and Jemal A: Global cancer statistics, 2012. CA Cancer J Clin. 65:87–108. 2015. View Article : Google Scholar : PubMed/NCBI | |
Perou CM, Sørlie T, Eisen MB, van de Rijn M, Jeffrey SS, Rees CA, Pollack JR, Ross DT, Johnsen H, Akslen LA, et al: Molecular portraits of human breast tumours. Nature. 406:747–752. 2000. View Article : Google Scholar : PubMed/NCBI | |
Sørlie T, Perou CM, Tibshirani R, Aas T, Geisler S, Johnsen H, Hastie T, Eisen MB, van de Rijn M, Jeffrey SS, et al: Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc Natl Acad Sci USA. 98:pp. 10869–10874. 2001; View Article : Google Scholar : PubMed/NCBI | |
Hu Z, Fan C, Oh DS, Marron JS, He X, Qaqish BF, Livasy C, Carey LA, Reynolds E, Dressler L, et al: The molecular portraits of breast tumors are conserved across microarray platforms. BMC Genomics. 7:962006. View Article : Google Scholar : PubMed/NCBI | |
Dai X, Li T, Bai Z, Yang Y, Liu X, Zhan J and Shi B: Breast cancer intrinsic subtype classification, clinical use and future trends. Am J Cancer Res. 5:2929–2943. 2015.PubMed/NCBI | |
Ignatiadis M, Bedard P, Haibe-Kains B, Haibe-Kains B, Singhal S, Loi S, Criscitiello C, Desmedt C, Bontempi G, Piccart M, et al: A meta-analysis of gene expression profiling studies identifies clinically relevant oncogenic pathways in basal-like breast cancer. Cancer Res. 69:1062009. View Article : Google Scholar | |
Tran B and Bedard PL: Luminal-B breast cancer and novel therapeutic targets. Breast Cancer Res. 13:2212011. View Article : Google Scholar : PubMed/NCBI | |
Tomczak K, Czerwińska P and Wiznerowicz M: The Cancer Genome Atlas (TCGA): An immeasurable source of knowledge. Contemp Oncol (Pozn). 19:A68–A77. 2015.PubMed/NCBI | |
Anders S and Huber W: Differential expression analysis for sequence count data. Genome Biol. 11:R1062010. View Article : Google Scholar : PubMed/NCBI | |
Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W and Smyth GK: Limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 43:e472015. View Article : Google Scholar : PubMed/NCBI | |
Eden E, Navon R, Steinfeld I, Lipson D and Yakhini Z: GOrilla: A tool for discovery and visualization of enriched GO terms in ranked gene lists. BMC Bioinformatics. 10:482009. View Article : Google Scholar : PubMed/NCBI | |
Kanehisa M: The KEGG database. Novartis Found Symp. 247:91–103. 119–128. 244–252. 2002. View Article : Google Scholar : PubMed/NCBI | |
Stelzl U, Worm U, Lalowski M, Haenig C, Brembeck FH, Goehler H, Stroedicke M, Zenkner M, Schoenherr A, Koeppen S, et al: A human protein-protein interaction network: A resource for annotating the proteome. Cell. 122:957–968. 2005. View Article : Google Scholar : PubMed/NCBI | |
Chatr-Aryamontri A, Breitkreutz BJ, Oughtred R, Boucher L, Heinicke S, Chen D, Stark C, Breitkreutz A, Kolas N, O'Donnell L, et al: The BioGRID interaction database: 2015 update. Nucleic Acids Res. 39(Database Issue): D470–D478. 2015. View Article : Google Scholar | |
Chatr-Aryamontri A, Breitkreutz BJ, Oughtred R, Boucher L, Heinicke S, Chen D, Stark C, Breitkreutz A, Kolas N, O'Donnell L, et al: The BioGRID interaction database: 2015 update. Nucleic Acids Res. 43(Database Issue): D470–D478. 2015. View Article : Google Scholar : PubMed/NCBI | |
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B and Ideker T: Cytoscape: A software environment for integrated models of biomolecular interaction networks. Genome Res. 13:2498–2504. 2003. View Article : Google Scholar : PubMed/NCBI | |
Schmittgen TD and Livak KJ: Analyzing real-time PCR data by the comparative C(T) method. Nat Protoc. 3:1101–1108. 2008. View Article : Google Scholar : PubMed/NCBI | |
Li Z, Peng L, Han S, Huang Z, Shi F, Cai Z, Li X, Zhang P, Zhu H and Jin W: Screening molecular markers in early breast cancer of the same pathological types but with different prognoses using Agilent gene chip. Nan Fang Yi Ke Da Xue Xue Bao. 33:1483–1488. 2013.(In Chinese). PubMed/NCBI | |
Yu X, Liu L, Cai B, He Y and Wan X: Suppression of anoikis by the neurotrophic receptor TrkB in human ovarian cancer. Cancer Sci. 99:543–552. 2008. View Article : Google Scholar : PubMed/NCBI | |
Kupferman ME, Jiffar T, El-Naggar A, Yilmaz T, Zhou G, Xie T, Feng L, Wang J, Holsinger FC, Yu D and Myers JN: TrkB induces EMT and has a key role in invasion of head and neck squamous cell carcinoma. Oncogene. 29:2047–2059. 2010. View Article : Google Scholar : PubMed/NCBI | |
Howe EN, Cochrane DR and Richer JK: Targets of miR-200c mediate suppression of cell motility and anoikis resistance. Breast Cancer Res. 13:R452011. View Article : Google Scholar : PubMed/NCBI | |
Chen D, Wang X, Liang D, Gordon J, Mittal A, Manley N, Degenhardt K and Astrof S: Fibronectin signals through integrin α5β1 to regulate cardiovascular development in a cell type-specific manner. Dev Biol. 407:195–210. 2015. View Article : Google Scholar : PubMed/NCBI | |
Steffens S, Schrader AJ, Vetter G, Eggers H, Blasig H, Becker J, Kuczyk MA and Serth J: Fibronectin 1 protein expression in clear cell renal cell carcinoma. Oncol Lett. 3:787–790. 2012.PubMed/NCBI | |
Liu X, Ma Y, Yang W, Wu X, Jiang L and Chen X: Identification of therapeutic targets for breast cancer using biological informatics methods. Mol Med Rep. 12:1789–1795. 2015. View Article : Google Scholar : PubMed/NCBI | |
Yang X, Jia M, Li Z, Lu S, Qi X, Zhao B, Wang X, Rong Y, Shi J, Zhang Z, et al: Bioinformatics analysis of aggressive behavior of breast cancer via an integrated gene regulatory network. J Cancer Res Ther. 10:1013–1018. 2014. View Article : Google Scholar : PubMed/NCBI | |
Dorman SN, Baranova K, Knoll JH, Urquhart BL, Mariani G, Carcangiu ML and Rogan PK: Genomic signatures for paclitaxel and gemcitabine resistance in breast cancer derived by machine learning. Mol Oncol. 10:85–100. 2016. View Article : Google Scholar : PubMed/NCBI | |
Morita Y, Hata K, Nakanishi M, Omata T, Morita N, Yura Y, Nishimura R and Yoneda T: Cellular fibronectin 1 promotes VEGF-C expression, lymphangiogenesis and lymph node metastasis associated with human oral squamous cell carcinoma. Clin Exp Metastasis. 32:739–753. 2015. View Article : Google Scholar : PubMed/NCBI | |
Gutteridge RE, Ndiaye MA, Liu X and Ahmad N: Plk1 inhibitors in cancer therapy: From laboratory to clinics. Mol Cancer Ther. 15:1427–1435. 2016. View Article : Google Scholar : PubMed/NCBI | |
Maire V, Némati F, Richardson M, Vincent-Salomon A, Tesson B, Rigaill G, Gravier E, Marty-Prouvost B, De Koning L, Lang G, et al: Polo-like kinase 1: A potential therapeutic option in combination with conventional chemotherapy for the management of patients with triple-negative breast cancer. Cancer Res. 73:813–823. 2013. View Article : Google Scholar : PubMed/NCBI | |
Ou O, Huppi K, Chakka S, Gehlhaus K, Dubois W, Patel J, Chen J, Mackiewicz M, Jones TL, Pitt JJ, et al: Loss-of-function RNAi screens in breast cancer cells identify AURKB, PLK1, PIK3R1, MAPK12, PRKD2, and PTK6 as sensitizing targets of rapamycin activity. Cancer Lett. 354:336–347. 2014. View Article : Google Scholar : PubMed/NCBI | |
Serna E, Lopez-Gines C, Monleon D, Muñoz-Hidalgo L, Callaghan RC, Gil-Benso R, Martinetto H, Gregori-Romero A, Gonzalez-Darder J and Cerda-Nicolas M: Correlation between EGFR amplification and the expression of microRNA-200c in primary glioblastoma multiforme. PLoS One. 9:e1029272014. View Article : Google Scholar : PubMed/NCBI | |
Naruke A, Azuma M, Takeuchi A, Ishido K, Katada C, Sasaki T, Higuchi K, Tanabe S, Saegusa M and Koizumi W: Comparison of site-specific gene expression levels in primary tumors and synchronous lymph node metastases in advanced gastric cancer. Gastric Cancer. 18:262–270. 2015. View Article : Google Scholar : PubMed/NCBI | |
Yang CH, Chou HC, Fu YN, Yeh CL, Cheng HW, Chang IC, Liu KJ, Chang GC, Tsai TF, Tsai SF, et al: EGFR over-expression in non-small cell lung cancers harboring EGFR mutations is associated with marked down-regulation of CD82. Biochim Biophys Acta. 1852:1540–1549. 2015. View Article : Google Scholar : PubMed/NCBI | |
Park HS, Jang MH, Kim EJ, Kim HJ, Lee HJ, Kim YJ, Kim JH, Kang E, Kim SW, Kim IA and Park SY: High EGFR gene copy number predicts poor outcome in triple-negative breast cancer. Mod Pathol. 27:1212–1222. 2014. View Article : Google Scholar : PubMed/NCBI |