Breast cancer in young women is more aggressive, with a poorer prognosis and overall survival, than in older women diagnosed with the disease. Despite recent research, the underlying biology and molecular alterations that drive the aggressive nature of breast tumors in young women have yet to be elucidated. In this study, we performed transcriptomic profiling and network analyses of breast tumors arising in Middle Eastern women to identify age-specific gene signatures. Moreover, we studied molecular alterations associated with cancer progression in young women using a cross-species comparative genomics approach coupled with copy number alterations (CNA) associated with breast cancers from independent studies. We identified 63 genes specific to tumors in young women that showed alterations distinct from two age cohorts of older women. The network analyses revealed potential critical regulatory roles for Myc, PI3K/Akt, NF-κB, and IL-1 in disease characteristics of breast tumors arising in young women. Cross-species comparative genomics analysis of progression from pre-invasive ductal carcinoma in situ (DCIS) to invasive ductal carcinoma (IDC) revealed 16 genes with concomitant genomic alterations (CCNB2, UBE2C, TOP2A, CEP55, TPX2, BIRC5, KIAA0101, SHCBP1, UBE2T, PTTG1, NUSAP1, DEPDC1, HELLS, CCNB1, KIF4A, and RRM2) that may be involved in tumorigenesis and in the processes of invasion and progression of disease. Array findings were validated using qRT-PCR, immunohistochemistry, and extensive in silico analyses of independently performed microarray datasets. To our knowledge, this study provides the first comprehensive genomic analysis of breast cancer in Middle Eastern women in age-specific cohorts and potential markers for cancer progression in young women. Our data demonstrate that cancer appearing in young women contains distinct biological characteristics and deregulated signaling pathways.
Moreover, our integrative genomic and cross-species analysis may provide robust biomarkers for the detection of disease progression in young women, and lead to more effective treatment strategies.
Citation: Colak D, Nofal A, AlBakheet A, Nirmal M, Jeprel H, et al. (2013) Age-Specific Gene Expression Signatures for Breast Tumors and Cross-Species Conserved Potential Cancer Progression Markers in Young Women. PLoS ONE 8(5): e63204. doi:10.1371/journal.pone.0063204
Editor: Aedín C. Culhane, Harvard School of Public Health, United States of America
Received: August 17, 2012; Accepted: April 2, 2013; Published: May 21, 2013
Copyright: © 2013 Colak et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: This study was funded by a King Abdulaziz City for Science and Technology grant (KACST # ARP-2432 to SMA), King Faisal Specialist Hospital and Research Center, and National Plan for Science, Technology and Innovation program/KACST (grant 11-BIO2072-20 to DC). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: DC is a PLOS ONE Board Member. This does not alter the authors’ adherence to all the PLOS ONE policies on sharing data and materials.
Breast cancer is the most common type of cancer among women worldwide, with an estimated 1,300,000 new cases and 465,000 deaths annually. Breast cancer is the major cause of morbidity and mortality among females in Saudi Arabia. Clinical observations indicate that 45% of all female breast cancers in Saudi Arabia developed before the age of 45 years, compared to 9.6% in the United States of America. Breast cancer diagnosed in young women is more aggressive in nature, with a poorer prognosis and disease-free survival, than in older counterparts. Indeed, it has been shown that survival in younger women is significantly worse for all stages of breast cancer in comparison to older women. Although previous studies have described young age as an independent predictor of poor prognosis, the underlying biology driving the aggressive nature of breast cancer arising in young women remains to be elucidated.
Typically, the most common histologic type of breast cancer initiates as a premalignant lesion known as atypical ductal hyperplasia (ADH), then progresses into the preinvasive stage called ductal carcinoma in situ (DCIS), and culminates in invasive ductal carcinoma (IDC). Though it is a multistep process during which genetic alterations accumulate, molecular and pathological evidence suggests that DCIS is a precursor to invasive disease. A genome-wide microarray-based gene expression analysis would be expected to provide an opportunity to discover genes specifically activated or inactivated during the course of breast cancer progression. Despite recent research, the mechanisms underlying tumorigenesis and progression of breast cancer in young women are still not clear. In particular, the identification of “progression markers” is crucial for determining which lesions are likely to become invasive.
A cross-species comparative genomics approach represents a powerful strategy to identify target genes that may play a role in tumor initiation and progression to malignancy and thus has great therapeutic potential. Previous studies have used this approach successfully to understand the molecular pathogenesis of various cancers and disease progression. The rationale is that genomic aberrations and altered pathways involved in oncogenesis are conserved by evolution across different species, and a number of important driver mutations in various cancers have been identified using comparative genomic approaches. For example, cross-species gene-expression analysis of mouse and human data uncovered gene expression signatures that demonstrate K-Ras oncogene activation in human lung cancers. In another example, Scott Lowe and colleagues identified two oncogenes that are co-amplified and cooperate to promote tumorigenesis by comparing gene amplifications in mouse and human hepatocellular carcinomas.
There are areas of genomic instability reported in many cancers, including breast cancer, and some regions commonly exhibit either deletion or increased gene dosage, leading to changes in DNA copy number (CN). Integrating gene expression with CN data is an effective strategy for interpreting DNA and RNA level anomalies in cancer to identify genes involved with tumor initiation and progression. Hence, integrating cross-species comparative analysis of human and animal models of breast cancer progression with genomic DNA copy number alterations may lead to robust biomarkers for breast cancer disease progression.
In this study, we analyzed whole-genome mRNA expression profiles from breast tumors and adjacent normal tissues from Middle Eastern women (n = 113 samples) in age-specific cohorts to characterize the underlying biology of aggressive breast cancers appearing in young women. Moreover, we applied an integrative and cross-species comparative genomics approach to identify evolutionarily conserved marker genes for disease progression in young women and validated their prognostic potential.
Materials and Methods
Patients and Samples
In this study, we focused on breast cancer patients diagnosed with infiltrating ductal carcinoma (IDC) and ductal carcinoma in situ (DCIS). Breast cancer samples were collected from primary tumors of 76 patients who sought treatment and underwent surgery (breast conservation surgery or total mastectomy) at the King Faisal Specialist Hospital and Research Center. Signed informed consent was obtained from all patients. On excision of tissues by a surgeon, an anatomic pathologist obtained a sample of the tumor tissue and adjacent normal breast tissue from the same breast having the tumor. A total of 113 samples were collected from consenting patients according to institutional review board approved protocols (KFSHRC IRB Protocol). The study was approved by the research ethics board at our institution (RAC# 2031091). Fresh surgical samples, including tumors and adjacent disease-free tissues, were placed in RNAlater™ (Ambion, Inc) and stored at −20°C after microdissection had been performed for pathological confirmation. All normal breast tissues were confirmed by the pathologist to have normal morphology before the results were analyzed. Whenever possible, depending on the quantity of the surgical samples, a piece of every sample was also snap frozen in liquid nitrogen and then stored at −80°C for subsequent isolation of DNA and proteins. The majority of samples received no prior chemotherapy; only two had chemotherapy and were excluded from further analysis.
Histological assessment of tumors and axillary lymph nodes was done using formalin-fixed, paraffin-embedded breast cancer samples for HER2, estrogen receptor (ER), and progesterone receptor (PR) status. ER status was determined by immunohistochemistry and measured as a percentage and intensity of positive nuclear staining. The estrogen and progesterone receptors were stained with relevant specific antibodies (Novocastra, Newcastle upon Tyne, UK). For HER2 immunohistochemistry, HercepTest™ (Dako Denmark A/S, Glostrup, Denmark) was used, with scores of 0 and 1+ considered negative, 2+ equivocal, and 3+ positive.
Cancers were categorized as luminal A (ER-positive and/or PR-positive, HER2-, and histologic grade 1 or 2); luminal B (ER-positive and/or PR-positive and HER2+, or ER-positive and/or PR-positive, HER2-, and grade 3); HER2 (ER-negative, PR-negative, and HER2+); and triple negative (ER-, PR-, and HER2-), as defined previously. The clinicopathological characteristics of the patients and the breast cancer subtypes (luminal A, luminal B, HER2, and triple negative) based on the histological evaluations are shown in Table 1.
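The categorization rules above can be written as a small decision function. The following is a minimal sketch, not the authors' code: the function name `classify_subtype` and its arguments are hypothetical, and it assumes that HER2 status has already been resolved to positive or negative (the text does not say how equivocal 2+ cases were handled).

```python
def classify_subtype(er_pos, pr_pos, her2_pos, grade):
    """Assign a subtype label from receptor status and histologic grade.

    Hypothetical helper following the rules stated in the text.
    Assumes her2_pos is already a resolved boolean (no equivocal cases).
    """
    hormone_pos = er_pos or pr_pos  # "ER-positive and/or PR-positive"
    if hormone_pos:
        # Luminal B: hormone-receptor positive with HER2+, or HER2- and grade 3.
        if her2_pos or grade == 3:
            return "luminal B"
        # Luminal A: hormone-receptor positive, HER2-, grade 1 or 2.
        if grade in (1, 2):
            return "luminal A"
        return "unclassified"  # grade missing or outside 1..3
    # Hormone-receptor negative from here on.
    if her2_pos:
        return "HER2"
    return "triple negative"


if __name__ == "__main__":
    print(classify_subtype(er_pos=True, pr_pos=False, her2_pos=False, grade=2))
```

The explicit "unclassified" branch is a design choice of this sketch so that incomplete pathology reports are not silently assigned to a subtype.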
Table 1. Age-specific patients’ characteristics. doi:10.1371/journal.pone.0063204.t001
Total RNA was extracted from tumor and adjacent normal tissue from patients with standard protocols. Sample handling, cDNA synthesis, cRNA labeling and synthesis, hybridization, washing, array (GeneChip® Human Genome U133 Plus 2.0 Array, Affymetrix Inc., Santa Clara, CA, USA) scanning, and all related quality controls were performed according to the manufacturer’s instructions. The Affymetrix GeneChip/GCOS software (Affymetrix Inc.) was used to calculate the raw expression value of each gene from the scanned image. The total RNA quality was assessed by the values of the 3′–5′ ratios for actin and glyceraldehyde-3-phosphate dehydrogenase (GAPDH). The dChip outlier detection algorithm was used to identify outlier arrays. A total of 104 samples/chips passed the above-mentioned quality controls and were used for further analyses. The CEL files were utilized for further analysis using dChip, MEV, and PARTEK Genomics Suite (Partek® software, Partek Inc., St. Louis, MO, USA).
Samples from tumors, IDC (n = 64) and DCIS (n = 7), and adjacent disease-free tissues (n = 33) were profiled using Affymetrix GeneChip® Human Genome U133 Plus 2.0 Arrays, representing over 47,000 transcripts and variants with more than 54,000 probe sets. The open source R/Bioconductor packages (Fred Hutchinson Cancer Research Center, Seattle, WA, USA) were employed to normalize the data by the GC Robust Multi-array Average (GC-RMA) algorithm. GC-RMA takes into account the GC content of the probe sequences when comparing the expression intensities of the different probe sets. To determine significant differences in gene expression levels among different age groups (young women (≤45 years), 45 to 55 years (pre) and ≥55 years (elderly) cohorts), we performed a multi-factor ANOVA including ER, PR, HER2, and grade status as additional factors in a linear additive model, as described previously. We used tumor sample data with complete pathological reports in this model (n = 67). Additionally, we used all tumor and normal samples (n = 104) and performed a two-way ANOVA with age (young, pre, and elderly), type (tumor or normal), and their interaction in the model. In this model, we compared transcriptomes of the tumor tissue and normal tissue for each age group separately. Significantly modulated genes were defined as those with an absolute fold change >2.0 and adjusted p-value <0.05. Multiple hypothesis testing was controlled by applying the Benjamini-Hochberg false discovery rate (FDR) correction. Unsupervised two-dimensional hierarchical clustering using Euclidean distance as well as Pearson’s correlation with average linkage clustering was performed. Biological themes associated with the differentially expressed genes were identified using DAVID Bioinformatics Resources, Expression Analysis Systematic Explorer (EASE), and Ingenuity Pathways Analysis (IPA) 6.3 (Ingenuity Systems, Mountain View, CA).
Using these bioinformatics tools, we were able to gain greater biological insight into activated or repressed functional processes and altered pathways in the disease pathogenesis than a listing of differentially expressed genes alone would provide. Categorical variables and differences in rates between groups were analyzed using the χ2 test. The Fisher exact test was used when expected cell counts were less than 5, using the Monte Carlo method as implemented in SAS. A P-value of <0.05 was considered significant. Statistical analyses were performed using SAS 9.2 (SAS Institute, Cary, NC), MATLAB (The MathWorks), and PARTEK Genomics Suite software. All microarray data reported here are MIAME compliant and have been submitted to the NCBI Gene Expression Omnibus (GEO) database (GSE29044).
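The gene selection criteria described above (absolute fold change >2.0 and Benjamini-Hochberg adjusted p-value <0.05) can be illustrated with a short sketch. This is not the pipeline used in the study, which relied on PARTEK, SAS, and R/Bioconductor; the function names and the toy inputs are hypothetical, and the fold changes are assumed to be signed values (for example, −3.0 for a three-fold decrease).

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_genes(genes, fold_changes, pvalues, fc_cutoff=2.0, alpha=0.05):
    """Keep genes with |fold change| > fc_cutoff and adjusted p < alpha."""
    adjusted = benjamini_hochberg(pvalues)
    return [
        gene
        for gene, fc, q in zip(genes, fold_changes, adjusted)
        if abs(fc) > fc_cutoff and q < alpha
    ]


if __name__ == "__main__":
    # Toy values for illustration only; not data from the study.
    genes = ["A", "B", "C", "D"]
    fold_changes = [2.5, -3.0, 1.5, -1.2]
    pvalues = [0.01, 0.04, 0.03, 0.005]
    print(select_genes(genes, fold_changes, pvalues))
```

Applying the fold-change filter together with the FDR threshold, as in the text, trades some sensitivity for a gene list in which each entry is both statistically supported and large enough in effect to be of likely biological interest.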
For cross-species analysis, the murine markers of disease progression were taken from Kretschmer et al. (Table S4 of that publication; GSE21444). Online analysis tools and databases developed by Gyorffy et al., containing gene expression data and survival information from over 1800 breast cancer patients obtained from the Gene Expression Omnibus (GEO; http://www.ncbi.nlm.nih.gov/geo/), were used to assess the prognostic potential of our gene signature (details of the datasets included in the database are given in the original publication). In addition, The Cancer Genome Atlas (TCGA) data from breast invasive carcinoma (n = 536) and matched normal (n = 63) samples (https://tcga-data.nci.nih.gov/tcga/), and the GSE7390 and GSE12093 datasets obtained through the canEvolve web portal (www.canevolve.org/), were used for independent validation analyses. The Miller et al. dataset (GSE3494) was also reanalyzed for validation of our gene signature. The GeneSigDB database was used to find the overlap/overrepresentation of our gene signatures with previously published gene signatures for various cancers, including breast cancer. Finally, multiple large genomic data sets with DNA copy number alterations associated with breast cancer were retrieved from the Gene Expression Omnibus database through canEvolve (GSE7545, GSE16619, and GSE9154 data sets) and the cBio Cancer Genomics Portal (TCGA, Nature 2012 data) for integrative genomic analysis.
Functional Pathway and Network Analysis
Functional pathway, gene ontology and network analyses were executed using Ingenuity Pathways Analysis (IPA) 6.3 (Ingenuity Systems, Mountain View, CA), a web-delivered application that enables the discovery, visualization, and exploration of molecular interaction networks in gene expression data. The differentially expressed gene lists were mapped to their corresponding gene objects in the Ingenuity pathway knowledge base. These so-called focus genes were then used as a starting point for generating biological networks. A score was assigned to each network in the dataset to estimate the relevance of the network to the uploaded gene list. This score is the negative logarithm of the P value that indicates the likelihood of the focus genes in a network being found together due to random chance. Using a 99% confidence level, scores of ≥2 were considered significant. A right-tailed Fisher’s exact test was used to calculate a P value giving the probability that the biological function (or pathway) assigned to that data set is explained by chance alone.
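To make the relationship between the P value and the score concrete, the sketch below computes a right-tailed Fisher's exact test for a gene list and a functional category, and converts a P value to a score as its negative base-10 logarithm. It is an illustration under stated assumptions, not IPA's implementation: the function names, the choice of base 10, and the toy counts are hypothetical. With base 10, a score of 2 corresponds to P = 0.01, which matches the 99% confidence level quoted above.

```python
from math import comb, log10


def right_tailed_fisher(overlap, list_size, category_size, universe_size):
    """P(X >= overlap) for a 2x2 table with fixed margins.

    X is the number of category genes among list_size genes drawn without
    replacement from universe_size genes, category_size of which belong
    to the category (hypergeometric upper tail).
    """
    upper = min(list_size, category_size)
    total = comb(universe_size, list_size)
    tail = sum(
        comb(category_size, k) * comb(universe_size - category_size, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def score_from_p(p_value):
    """Score defined as the negative base-10 logarithm of the P value."""
    return -log10(p_value)


if __name__ == "__main__":
    # Toy counts for illustration only; not data from the study.
    p = right_tailed_fisher(overlap=2, list_size=3, category_size=4, universe_size=10)
    print(p, score_from_p(p))
```

The test is right-tailed because only over-representation of a category in the gene list is of interest; under-representation would be assessed with the opposite tail.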
Realtime RT-PCR Experiments
Confirmatory realtime RT-PCR experiments were performed using the ABI 7500 Sequence Detection System (Applied Biosystems). Total RNA (50 ng) from the same samples used in the microarray study was transcribed into cDNA using a Sensiscript Kit (QIAGEN Inc., Valencia, CA, USA) under the following conditions: 25°C for 10 min, 42°C for 2 hrs, and 70°C for 15 min in a total volume of 20 µl. Five differentially expressed genes (ESR1, IL1RN, SEPP1, TIAM1, and SCD) were selected and primers were designed using Primer3 software. After primer optimization, realtime PCR experiments were performed with 6 µl cDNA using the QuantiTect SYBR Green Kit (QIAGEN), employing GAPDH as the endogenous control gene. All reactions were conducted in triplicate and the data were analyzed using the delta delta CT method.
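The delta delta CT calculation referred to above can be sketched as follows. This is a generic illustration rather than the authors' analysis script: the function name and the CT values are hypothetical, triplicate CT values are averaged before subtraction, and amplification efficiency is assumed to be close to 100% so that one cycle corresponds to a two-fold change.

```python
from statistics import mean


def fold_change_ddct(target_test, reference_test, target_control, reference_control):
    """Relative expression of a target gene by the delta delta CT method.

    Each argument is a list of replicate CT values. The reference gene
    plays the role of the endogenous control (GAPDH in the text).
    Assumes roughly 100% PCR efficiency.
    """
    # Delta CT: target normalized to the endogenous control in each group.
    delta_test = mean(target_test) - mean(reference_test)
    delta_control = mean(target_control) - mean(reference_control)
    # Delta delta CT: test group relative to the control group.
    ddct = delta_test - delta_control
    return 2 ** (-ddct)


if __name__ == "__main__":
    # Toy CT values for illustration only; not data from the study.
    fc = fold_change_ddct(
        target_test=[24.0, 24.1, 23.9],
        reference_test=[18.0, 18.0, 18.0],
        target_control=[26.0, 26.1, 25.9],
        reference_control=[18.0, 18.0, 18.0],
    )
    print(round(fc, 2))
```

A lower CT means more starting template, so a negative delta delta CT translates into a fold change above 1 (higher expression in the test group).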
Validation of protein expression was done using immunohistochemistry. Immunohistochemical staining was performed using standard techniques. Monoclonal anti-TGF-α antibody (Calbiochem, clone 213-4.4, dilution 1:50), monoclonal anti-PI3 kinase p85 alpha antibody (Abcam, Cambridge, UK, clone ep380y, dilution 1:20) and polyclonal anti-IL1 Receptor I antibody (Abcam, Cambridge, UK, Protein G purified, dilution 1:20) were run manually. Slides were deparaffinized by routine techniques. Antigen retrieval was done in Tris/EDTA buffer, pH 9, heated at 95°C in a microwave for 25 minutes. After blocking endogenous peroxidase activity with a 3% aqueous H2O2 solution for 5 minutes, the sections were incubated with primary antibodies overnight at 4°C. Labeling was detected with the Envision Plus Detection Kit (Dako, cat. No. K4001). The reaction was detected either by DAB (3,3′-diaminobenzidine, Sigma, cat. No. D5905-100TAB) or by AEC (3-amino-9-ethylcarbazole, Sigma, cat. No. A-5754). The sections were counterstained with Harris hematoxylin (Acros Organics). Staining was visualized using the DAKO Envision kit according to the instructions of the manufacturer (DAKO, Carpinteria, CA).
Global Expression Profiling in Different Age Cohorts
Genome-wide gene expression profiling provides a comprehensive view of the transcriptional changes that occur during the carcinogenic process and enables an understanding of biology beyond what may be apparent from studies assessing only clinicopathologic features. Here, we first analyzed the whole-genome mRNA expression profiles from tumors (n = 71) and adjacent disease-free tissues (n = 33) and compared tumor with normal tissue separately in each age cohort: young women (≤45 years), 45 to 55 years (pre) and ≥55 years (elderly). We identified 2632, 2029 and 2842 significantly dysregulated genes (up- or down-regulated) in tumors from the young, pre and elderly cohorts, respectively (adjusted p value <5% and FC >2) (Figure S1A). To obtain deeper insight into tumor pathogenesis in each age cohort, we performed gene ontology (GO) enrichment and interaction network analyses using Expression Analysis Systematic Explorer (EASE) and the Ingenuity knowledge base. The network analysis indicated activation of MYC, NF-κB and TGF-β signaling pathways in the young, pre and elderly cohorts, respectively (Figure S1B).
Genomic Signature Specific to Tumors Arising in Young Women
We next compared the transcriptomes of tumors across three age cohorts using a multi-factor ANOVA, controlling for ER, PR, HER2, and grade of the tumors (n = 67). The ANOVA identified 567 genes that were significantly modulated among three age groups (unadjusted p<0.01). The unsupervised principal component analysis (PCA) using 567 genes separated samples according to their age group, hence supporting the conclusion that there are distinct gene expression changes associated with tumors that are dependent on the age of the patient (Figure 1A). We then analyzed overrepresentation of any clinicopathologic or tumor subtype among the age groups, and found no statistically significant associations.
Figure 1. Identification of genes specific to young women with breast cancer.
(A) The unsupervised principal component analysis (PCA) separated samples according to their age group, supporting the conclusion that there are distinct gene expression changes associated with the tumor in different age groups. The red spheres refer to young patients (≤45; Young), green to 45–55 years (Pre), and blue to ≥55 years (Post). (B) Venn diagram characterizing differential gene expression between and specific to different age groups. The red circle (left) shows the 804 probes that are differentially expressed between Young and Post; 77 probes (corresponding to 63 genes) were found to be specific to tumors in young women only (circled in light pink). (C) Unsupervised two-dimensional hierarchical clustering of all tumor samples based on their gene expression similarity across the 77 young-age-specific probes was performed using Pearson’s correlation with average linkage clustering. The hierarchical clustering revealed a clear pattern of gene deregulation defining two main transcriptome clusters, one composed primarily of younger cases and one composed primarily of elderly women. Samples are denoted in columns and genes are denoted in rows (gene symbols listed on the right). The expression level of each gene across the samples is scaled to the [−4, 4] interval. These mapped expression levels are depicted using the color scale shown at the bottom of the figure, such that highly expressed genes are indicated in red, intermediate in black, and weakly expressed in green. doi:10.1371/journal.pone.0063204.g001
The gene signature specific to tumors in young women (≤45 years) was obtained by overlapping gene lists. When comparing two groups of samples to identify genes differentially expressed in a given group, we used the p-value and the fold change (FC) between the two groups as the cut-off criteria. As shown in Figure 1B, each circle in the Venn diagram represents the differential expression between two age groups. This Venn diagram approach revealed that 79 probes were common to both the ≤45 vs 45–55 and ≤45 vs >55 comparisons, and 77 probes (corresponding to 63 genes) were specific to tumors in the young group of patients (Y) (shown in pink in Figure 1B, listed in Table 2), with significantly higher or lower expression in young women compared to their older counterparts. The unsupervised two-dimensional hierarchical clustering using the 63 genes revealed clear patterns of gene deregulation defining two main transcriptome clusters, one composed primarily of younger women and the other composed primarily of older patients (Figure 1C). A search of the Microarray Literature-based Annotation (MILANO) database indicated that 98% of those 63 genes had a published association with cancer. Moreover, we tested the 63-gene young-age-specific signature against the published gene signatures in the GeneSigDB database, and found overrepresentation of our gene set in over 500 gene signatures for various cancers, including breast cancer (adjusted p-value <0.05). The GO and functional analyses revealed significant enrichment of categories including carcinogenesis, tissue development, cellular development, cellular growth and proliferation, tumor morphology, and cell death (Figure 2A). The network analysis indicated alterations in a number of cancer-related pathways, including p38 MAPK, PI3K/AKT, ERK/MAPK and NF-κB signaling pathways, and a potential role of TGFA, ErbB2, and IL-1/IL-1R in young women with breast cancer (Figure 2B).
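The list-overlap step can be expressed with set operations. The sketch below shows one plausible reading of the Venn diagram approach, namely that young-specific probes are those differentially expressed in both comparisons involving the young group but not in the comparison between the two older groups; the text does not spell out this rule, so it is an assumption. The function names, the probe identifiers, and the probe-to-gene mapping are all hypothetical.

```python
def young_specific_probes(young_vs_pre, young_vs_post, pre_vs_post):
    """Probes altered in both young comparisons but not between older groups.

    Assumed rule (not stated explicitly in the text):
    (young vs pre) AND (young vs post) minus (pre vs post).
    """
    shared = set(young_vs_pre) & set(young_vs_post)
    return shared - set(pre_vs_post)


def probes_to_genes(probes, probe_to_gene):
    """Collapse probes to unique gene symbols (several probes may map to one gene)."""
    return {probe_to_gene[p] for p in probes if p in probe_to_gene}


if __name__ == "__main__":
    # Toy identifiers for illustration only; not probes from the study.
    young_vs_pre = {"p1", "p2", "p3", "p4"}
    young_vs_post = {"p2", "p3", "p4", "p5"}
    pre_vs_post = {"p4", "p6"}
    mapping = {"p2": "GENE_X", "p3": "GENE_X", "p4": "GENE_Y"}

    specific = young_specific_probes(young_vs_pre, young_vs_post, pre_vs_post)
    print(sorted(specific), sorted(probes_to_genes(specific, mapping)))
```

Collapsing probes to gene symbols explains why a probe count can exceed the corresponding gene count, as with the 77 probes and 63 genes reported above.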
Figure 2. Functional and network analyses of genes specific to young women.
(A) The gene ontology and functional analysis of young-age-tumor specific genes (up/down-regulated) was performed using the Ingenuity knowledge base. The X-axis indicates the significance (-log P value) of the functional/pathway association, which depends on the number of genes in a class as well as biologic relevance. The threshold line represents a P value of 0.05. (B–C) Gene interaction network analyses of genes specific to young women and very young women, respectively. The top-scoring gene interaction networks (those with the highest relevance scores) are shown. Green/red indicates decreased/increased mRNA expression in younger patients compared to older counterparts. The color intensity is correlated with fold change. Straight lines indicate direct gene-to-gene interactions; dashed lines indicate indirect ones. (D) qRT-PCR validation. Grey bars represent microarray hybridizations, and dark bars represent values from qRT-PCR. The ratio of expression for each gene in the older group (>45) to the very young group (≤35) is shown as fold change. A significant correlation existed between the microarray and realtime RT-PCR results. doi:10.1371/journal.pone.0063204.g002
Table 2. Differentially expressed genes between young women and two older cohorts. doi:10.1371/journal.pone.0063204.t002
Genomic Signature Specific to Breast Cancers in Very Young Women
In Saudi Arabia, almost 50% of all the breast cancer patients were reported to be less than 45 years old. Accordingly, we performed additional analyses within the young women’s subset comparing transcriptomes of women younger than 35 years (very young) to two other age cohorts: 35 to 45 years and >45 years. We identified genes that were specific to tumors in very young women using the same methodology that was described previously. The heat map clearly shows significantly higher or lower expression of these genes in very young women compared to the two older age cohorts (Figure S2). The enriched biological processes associated with significantly dysregulated genes that are unique to very young patients include, among others, mitotic cell cycle (p-value = 0.02), morphogenesis (p-value = 0.01), cell proliferation (p-value = 0.03), and death (p-value = 0.049). Similar to young women, network analysis indicated alterations in p38 MAPK, PI3K/AKT and NF-κB signaling pathways, and potentially important roles of IL1RN, ESR1, and ErbB2 in very young women (Figure 2C and Figure S2).
Cross-Species Comparative Genomics Analysis Coupled with Genomic Alteration Data to Identify Genes that may Play a Role in Cancer Development and Progression in Young Women
Ductal carcinoma in situ (DCIS) is a heterogeneous group of pre-invasive tumors that may progress rapidly or slowly to invasive cancer. Therefore, the ability to identify which DCIS lesions are likely to progress to the potentially life-threatening stage of invasive ductal carcinoma (IDC) would greatly help in the treatment plan and prognosis of the disease. To identify the putative genes involved in disease progression in young women, we generated genome-wide gene expression profiles characteristic of the sequential disease stages (DCIS and IDC) of breast cancer and compared them to age-matched normal controls in young women (≤45 years). We defined potential progression genes as genes that are significantly altered in both DCIS and IDC, as these likely represent the earliest molecular steps in acquiring the capacity for invasion. We identified 1015 and 4873 genes differentially expressed (up- and down-regulated) in DCIS and IDC compared to normal, respectively, and 697 probes (corresponding to 484 unique genes) that had significantly altered expression in both DCIS and IDC (Figure 3A).
Figure 3. Progression from ductal carcinoma in situ (DCIS) to invasive ductal carcinoma (IDC) in young women.
(A) The Venn diagram illustrates that 1015 genes are differentially expressed (up- or down-regulated) in DCIS compared to normal, whereas 4873 genes are differentially expressed in IDC compared to normal controls. A total of 143 genes are differentially regulated between IDC and DCIS (green circle). (B) The functional analysis of 16 potential progression genes identified through cross-species comparative genomics analysis. The Y-axis indicates the significance (-log P value) of the functional association, which depends on the number of genes in a class as well as biologic relevance. The threshold line represents a P value of 0.05. (C) Gene interaction network and pathway analyses of the 16-gene progression signature. Green/red indicates decreased/increased mRNA expression in IDC compared to normal controls. The color intensity is correlated with fold change. Straight lines indicate direct gene-to-gene interactions; dashed lines indicate indirect ones. (D) Invasive breast tumor cases (from TCGA, Nature 2012) displayed amplification, homozygous deletion, up- or down-regulation (RNA), or mutation in our 16-gene progression signature. Cases are denoted in columns, and genes in rows (gene symbols are listed on the left). doi:10.1371/journal.pone.0063204.g003
We next performed cross-species comparative genomics analysis to identify potential gene markers for DCIS progression to IDC that are conserved in mouse and human. This approach has been shown to lead to robust markers that may play a role in cancer development and progression. Indeed, driver mutations that are important in cancer have been identified using this strategy. We used gene expression data from Kretschmer et al. for murine markers of disease progression. The comparison of our progression gene signature with the murine markers (human orthologs) revealed 16 genes that were conserved between mouse and human (p<0.001) (Table 3). GO analyses using both EASE and IPA tools revealed that these genes are mainly involved in biological processes such as cell cycle, mitosis, embryonic development, DNA replication, growth and apoptosis (Figure 3B). The top five significantly altered canonical pathways include Cell cycle: G2/M DNA Checkpoint Regulation (p value = 1.1×10⁻⁵), Mitotic Roles of Polo-Like Kinase (p value = 3.3×10⁻⁵), ATM Signaling (p value = 1.5×10⁻³), Cyclins and cell cycle regulation (p value = 2.9×10⁻³), and Sonic hedgehog signaling (p value = 0.03). The network analysis illustrated activated pathways as well as interactions of genes that may potentially play a role in disease progression (Figure 3C). A literature-based search of the 16 genes using the MILANO database demonstrated the association of these genes with cancer progression, tumor development and invasion in various cancers, including breast cancer.
Table 3. The 16-gene cross-species conserved signature of potential DCIS-to-IDC progression. doi:10.1371/journal.pone.0063204.t003
The presence of altered DNA CN may contribute to cancer formation and progression and could include transcriptional control mechanisms that locally impact gene expression levels. Integrating the gene expression data with CN alterations may identify novel early breast cancer markers of malignant transformation and progression. Hence, we integrated our cross-species conserved progression gene signature with four independent studies of genome copy number alterations in human breast tumors (as detailed in the “Materials and Methods” section) and found that our gene signature has concomitant DNA alterations (Table 3, Figure 3D).
Comparison of DCIS and IDC Transcriptome in Young Women
Comparison of expression profiles between IDC and DCIS in young women revealed dysregulation of 143 genes, 96% of which had significantly higher expression in DCIS than in IDC (Figure 3A). These genes were enriched within functional categories including immune response, tissue morphology, cellular growth and proliferation, cell death and cellular movement. The network analysis highlighted alterations in the PI3K/Akt, NF-κB, JNK, and ERK pathways (Figure S3).
The Venn diagram approach yielded 27 genes and 94 probes (corresponding to 72 genes) that were unique to IDC and DCIS, respectively (Table S1, Figure S3). Interestingly, 85% of the genes specific to IDC were down-regulated compared to normal controls. Genes in the IDC signature, including DUSP6, PTGDS, IFNGR1, PIK3R1, FCER1A, P2RY14, PVRL2, SELP, and TFPI, were involved in cell death, immune response, cellular movement and tissue development. The interaction network and pathway analyses revealed alterations in G-Protein Coupled Receptor Signaling, PI3K Signaling, and ERK/MAPK Signaling (Figure S3). In contrast to IDC, 97% of DCIS-specific genes were up-regulated in DCIS versus normal, including CD22, IGHM, MS4A1, BCR, RBL2, and MAP3K5 (Table S1 and Figure S3).
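The Venn diagram approach described above amounts to set operations on the lists of genes that are dysregulated in each lesion type relative to normal tissue. A minimal sketch follows, assuming each comparison has already been reduced to a set of gene symbols. The input sets are toy examples: the named genes appear in the text, while `SHARED_A` and `SHARED_B` are hypothetical placeholders for genes altered in both lesion types.

```python
def venn_partition(idc_vs_normal, dcis_vs_normal):
    """Split two differential-expression gene sets into Venn regions."""
    return {
        "idc_only": idc_vs_normal - dcis_vs_normal,
        "dcis_only": dcis_vs_normal - idc_vs_normal,
        "shared": idc_vs_normal & dcis_vs_normal,
    }


# Toy input sets (illustrative only, not the study's full gene lists).
idc_genes = {"DUSP6", "PTGDS", "PIK3R1", "SHARED_A", "SHARED_B"}
dcis_genes = {"CD22", "IGHM", "MS4A1", "SHARED_A", "SHARED_B"}

regions = venn_partition(idc_genes, dcis_genes)
for name, genes in regions.items():
    print(name, sorted(genes))
```

When working with probes rather than genes, as in the DCIS list, the same partition would be applied to probe identifiers first and then collapsed to gene symbols.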
In silico Independent Validations
To validate our results, we used four independently performed microarray datasets as well as data available in the database developed by Gyorffy et al. The first validation dataset was generated by The Cancer Genome Atlas (https://tcga-data.nci.nih.gov/tcga/). This dataset is composed of samples from invasive breast carcinoma patients (n = 536) and matched normal controls (n = 63). Our cross-species conserved 16-gene progression signature was significantly up-regulated in patients compared to normal controls (adjusted P-value <1.19×10⁻³²) and was sufficient to cluster and differentiate samples as tumor versus normal controls (data not shown).
We then assessed the prognostic capability of our genes on independent microarray datasets involving large numbers of breast cancer patients with survival data. We confirmed the prognostic significance of all 16 genes for recurrence-free survival (RFS; n = 2324), overall survival (OS; n = 464), and distant metastasis-free survival (DMFS; n = 673) in datasets from Gyorffy et al. High expression of these genes was significantly associated with poor disease outcome (Table 3). Moreover, the prognostic significance of the 16 genes was tested on two additional datasets of breast cancer patients, GSE7390 and GSE12093. The GSE7390 dataset consisted of 198 lymph node-negative (N-) patients. The purpose of this analysis was to identify patients at high risk of early distant metastases. The data from Zhang et al. (GSE12093) included 136 breast cancers that were treated with tamoxifen; that study aimed to classify high-risk patients that benefit from adjuvant tamoxifen therapy. We found that thirteen of our genes were significantly associated with a high-risk patient group with distant metastases in at least two of the datasets tested (Table 3). Six genes (RRM2, BIRC5, TOP2A, NUSAP1, TPX2, and CCNB2) were of significant clinical relevance in all the datasets tested, especially for identifying a high-risk patient group (Table 3, Figure S4).
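Prognostic analyses of this kind typically split patients into high- and low-expression groups for a gene and compare their survival curves. The sketch below shows a median split followed by a Kaplan-Meier estimate for each group. It is an illustration under stated assumptions rather than the pipeline used here: the median cutoff is an assumed choice, the function names are hypothetical, and the eight-patient dataset is invented toy data.

```python
from statistics import median


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    `events` holds 1 for an observed event and 0 for a censored patient.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


def split_by_median(expression, times, events):
    """Split patients into (high, low) groups of (time, event) pairs."""
    cut = median(expression)
    high = [(t, e) for x, t, e in zip(expression, times, events) if x > cut]
    low = [(t, e) for x, t, e in zip(expression, times, events) if x <= cut]
    return high, low


# Invented toy data: expression of one gene, follow-up in months, event flag.
expr = [9.1, 8.7, 8.9, 9.4, 6.2, 6.8, 7.0, 6.5]
months = [12, 20, 15, 8, 60, 48, 55, 70]
relapse = [1, 1, 0, 1, 0, 1, 0, 0]

high_group, low_group = split_by_median(expr, months, relapse)
print("high:", kaplan_meier(*zip(*high_group)))
print("low:", kaplan_meier(*zip(*low_group)))
```

A formal comparison of the two curves would add a log-rank test, which is omitted here to keep the sketch short.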
As a further validation of our results, we re-analyzed an independently performed microarray dataset from Miller et al. This dataset was composed of 251 human breast cancer samples, of which the 31 derived from young women were used in this re-analysis. We evaluated the performance of the 16-gene progression signature on this dataset. Unsupervised clustering showed that our gene signature was sufficient to separate patients into two clusters that differed significantly by p53 mutation status (Figure S4). The cluster with high expression of these genes comprised nearly all of the p53 mutant tumors. Intriguingly, TP53 mutations in breast cancer are associated with poor survival independent of other risk factors.
The Microarray Literature-based Annotation (MILANO) database search revealed that all 16 genes were associated with tumor progression, development, and invasiveness in various cancers, including breast cancer. Moreover, comparing the 16-gene signature with gene signatures available in the GeneSigDB database revealed statistically significant overlap (P-value <0.05, corrected for multiple testing) with over 400 published cancer gene signatures for various cancers, including 161 gene signatures for breast cancer. Furthermore, these genes were also mapped to human genomic CN alterations associated with invasive breast tumors in independent genomic studies, implicating these genes in malignant transformation and progression.
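Comparing one signature against hundreds of published signatures produces one overlap P-value per comparison, so the P-values must be corrected for multiple testing before applying the 0.05 threshold. The correction method is not named in this passage; the sketch below assumes the Benjamini-Hochberg false discovery rate procedure purely for illustration, and the input P-values are invented.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Invented overlap P-values for four hypothetical published signatures.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adj) if q < 0.05]
print(adj, significant)
```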
Validation of Microarray Data by qRT-PCR and Immunohistochemistry
To confirm the microarray results by an independent method, we selected five significantly dysregulated genes (ESR1, IL1RN, SEPP1, TIAM1, and SCD) in very young (≤35 years) and/or young (≤45 years) women compared to older cohorts and validated the expression levels using qRT-PCR. A significant correlation existed between the microarray and real-time RT-PCR results (Pearson’s r >0.76; Figure 2D and Figure S2). This correlation was stronger when comparing the older group (>45 years) to the very young women cohort (≤35 years) (r = 0.99; Figure 2D) than when comparing the young women group (35–45 years) to the very young women cohort (r = 0.77; Figure S2).
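Agreement between the two platforms can be summarized by computing Pearson's r over the per-gene expression ratios measured by each method. The following is a minimal sketch of that computation. The five pairs of log2 ratios are invented for illustration and are not the measured values for ESR1, IL1RN, SEPP1, TIAM1, and SCD.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Invented log2 expression ratios for five genes on each platform.
microarray_log2 = [1.8, -1.2, 0.9, -0.7, 1.1]
qpcr_log2 = [2.3, -1.6, 0.7, -1.1, 1.5]

r = pearson_r(microarray_log2, qpcr_log2)
print(f"Pearson r = {r:.2f}")
```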
Moreover, we performed immunohistochemical staining in breast cancer patient samples using antibodies directed against TGFA, IL1RN and PI3K. TGFA positivity was significantly associated with young age (Fisher’s exact test, p value = 0.02). In fact, 90% of young patients (n = 10) tested positive, which is in concordance with the microarray result. IL1RN was found to have higher expression in older cohorts compared to young patients in our microarray analysis, which was also validated by qRT-PCR (Figure 2D). Indeed, five of the six samples that tested positive by immunohistochemical staining were from older patients. Testing for protein expression of PI3K revealed that it was not expressed in all of the IDC cases (n = 10), but positive for DCIS, which is also in concordance with the microarray result (Figure S3). Hence, the immunohistochemistry verified the protein expression of the selected candidates. Representative images of positively stained tumors are shown in Figure 4.
Figure 4. Protein expression of selected genes by immunohistochemical staining in breast cancer patients’ samples using antibodies directed against (B) TGFA, (C) IL1RN, and (D) PI3K.
Representative images of positively stained tumors are shown (magnification, ×200). doi:10.1371/journal.pone.0063204.g004
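The association between staining positivity and age group reported above was tested with Fisher's exact test on a 2×2 table. The sketch below implements the two-sided test directly from the hypergeometric distribution. In the usage example, the young-patient row (9 positive, 1 negative) follows from the 90% positivity in 10 young patients stated in the text, whereas the older-patient row is a hypothetical placeholder, so the printed value is not expected to reproduce the reported p value.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Rows: young, older. Columns: TGFA positive, TGFA negative.
# The older row (4, 6) is hypothetical and used only for illustration.
p_value = fisher_exact_two_sided(9, 1, 4, 6)
print(f"two-sided p = {p_value:.3f}")
```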
Numerous studies have shown that younger women with breast cancer have a poorer prognosis and disease-free survival compared to their older counterparts. Indeed, young age has been shown to be an independent predictor of poor prognosis even after controlling for different histopathological features. However, the biology driving this disease process and the molecular pathways that contribute to aggressive tumors in younger women are largely unknown. Clinical observations indicate that 45% of all female breast cancers in Saudi Arabia appear in women younger than 45 years of age. Hence, in this study, we sought to understand the molecular underpinnings of breast cancer in an age-specific manner in order to elucidate genes and pathways giving rise to aggressive tumors in young women using a transcriptomic approach. Furthermore, we explored molecular alterations of breast cancer progression from DCIS to potentially lethal stages of IDC in young women and identified potential progression marker genes using cross-species comparative genomics analysis.
We used two different approaches to identify gene signatures for different age cohorts of women with breast cancer. In the first approach, we compared whole-genome mRNA expression profiles of tumors and disease-free normal tissues in three age cohorts: young women (≤45 years), 45 to 55 years (pre) and ≥55 years (elderly). The network analyses of significantly dysregulated genes revealed the activation of MYC, NF-κB and TGF-β signaling pathways in the young, pre and elderly cohorts, respectively. In the second approach, we compared transcriptomes of tumors arising in young women to those from the two older cohorts, and identified 63 genes that had distinct expression patterns in young women. Through these approaches, we gained important insights into pathways and genes that were specifically altered in young women. The pathway analysis indicated alterations in PI3K/Akt, MYC and NF-κB signaling pathways, and potential critical roles for TGFA, ErbB2, and IL-1/IL-1R, which may promote angiogenesis, tumor growth, and metastasis and hence cause the aggressive phenotype observed in young women. Previous reports have shown in experimental models that interleukin 1 (IL-1) promotes angiogenesis, tumor growth, and metastasis, and its presence in some human cancers is associated with aggressive tumor biology. The activation of IL-1/IL-1R through autocrine or paracrine mechanisms can lead to a cascade of secondary tumorigenic cytokines, which can subsequently contribute to angiogenesis, tumor-cell proliferation and tumor invasion. For example, these inflammatory cytokines can regulate the proliferation of breast cells through estrogen production by the steroid catalyzing enzymes in breast tissues. Interestingly, mutant alleles of IL1RN were associated with shortened disease-free and overall survival among Caucasian women with breast cancer. Similarly, IL-1 expression has been shown to be an adverse prognostic factor.
NF-κB signaling has been shown to be activated in various tumors, including human breast cancers. Most recently, it has been shown in mouse models that epithelial NF-κB is an active contributor to tumor progression, inhibition of which could have a significant therapeutic impact even at later stages of mammary tumor progression. Our data also indicated that the levels of expression of TIAM1 and VANGL2 in very young women are significantly lower than in their older counterparts. The expression of TIAM1 has been shown to be associated with increased invasiveness and progression of breast carcinomas. Recently, it has been reported that VANGL2 promotes migration of cells by a metalloproteinase-dependent invasion of the extracellular matrix and therefore influences invasion and perhaps metastasis.
Previous studies have shown that important driver mutations in various cancers can be identified using comparative genomic approaches. Such studies suggest that changes conserved across species may be mechanistically essential for cancer development and progression, and hence may be critical targets for therapeutic intervention. Therefore, focusing on differentially expressed genes derived from these comparative approaches, along with concomitant DNA copy number changes, may identify novel early breast cancer markers of malignant transformation and progression. One of the major contributions of this study is the identification of 16 potential disease progression marker genes, including CCNB2, UBE2C, TPX2, KIF4A, BIRC5, NUSAP1, and RRM2, using integrative and cross-species comparative genomics analysis. These genes are related to mitosis, cell cycle, embryonic development, DNA replication, cell division and proliferation. Our findings are consistent with previously performed independent studies of breast cancer progression. However, the novelty of our results is that the genes identified in this study were evolutionarily conserved across species and accompanied by genomic alterations, and we provide evidence for the potential role of previously reported genes as well as new genes in the progression of breast cancer in young women.
Testing our genes on independent microarray datasets using samples from over 3000 breast cancer patients demonstrated that high expression of these genes is significantly associated with poor outcome. Intriguingly, our 16-gene signature separated patients in Miller et al.’s study into two clusters that differed significantly in their TP53 mutation status. The cluster with high expression of these genes comprised nearly all of the p53 mutant tumors. Previous studies have reported that TP53 mutations in breast cancer are associated with poor survival independent of other risk factors and have a strong association with hormone receptor negative, HER2+ and basal-like subgroups. Furthermore, a Microarray Literature-based Annotation database search indicated the involvement of our 16 genes in tumor development, progression, and invasiveness in various cancers, including breast cancer. Taken together, these observations suggest that the 16-gene progression signature has the potential to classify tumors that may have invasive capacity and may be crucial for determining which lesions are more likely to become invasive.
Differential expression analysis of DCIS and IDC in young women revealed significant down-regulation of PI3K, DUSP6, CD22, RB, BCR, MS4A1 (also known as CD20), and MAP3K5 as well as alterations in the PI3K/Akt, NF-κB, JNK, and ERK pathways. The PI3K/Akt pathway is involved in regulation of cell proliferation and implicated in carcinogenesis. The network analysis also indicated a central role of the retinoblastoma tumor suppressor (RB), which may be important in tumor progression. This gene has been found to be functionally inactivated in the majority of human cancers, and aberrant in nearly half of breast cancers. Deficiency in RB function compromises cell cycle checkpoints, and contributes to aggressive tumor proliferation. Comparison of IDC and DCIS transcriptomes resulted in 27 signature genes that are unique to IDC and distinguish it from DCIS in young women. The majority of these genes (85%) were down-regulated compared to normal controls, except for a few genes, such as Poliovirus receptor-related 2 (PVRL2, CD112). PVRL2 has been found to have enhanced expression in various tumors, and it has been suggested to have a role in tumor invasion and migration.
In summary, to our knowledge this study provides the first comprehensive transcriptomic analysis of breast tumors that characterizes the underlying biological mechanisms in an age-specific manner in a cohort of Middle Eastern women, and, coupled with an integrative cross-species comparative genomics approach, has identified genes that could be potential biomarkers for tumor progression in young women. Our global expression profiling resulted in 63 genes that are specific to young women’s breast tumors. The network analyses illustrated the interaction of potential critical genes and the altered pathways associated with breast cancer that specifically appear in young women. The implication of these findings is that these genes may be contributing to the aggressive tumor behavior often present in these patients. Our results confirm previous studies and provide additional insights into young age (≤45 years) and very young age (≤35 years) specific oncogenic alterations that may be promoting tumorigenesis. Our cross-species data analyses coupled with genomic copy number alterations may provide robust biomarkers for the detection of disease progression in young women and may lead to improved diagnosis and therapeutic options.
(A) Comparison of each age cohort, young women (≤45 years), 45 to 55 years (pre) and ≥55 years (post), with the age-matched normal controls. We identified 2632, 2029 and 2842 significantly dysregulated genes (up or down) due to tumor in young, pre and old cohorts respectively (adjusted p value <5% and FC >2). (B) Gene interaction networks analysis of differentially expressed genes associated with tumor in each age cohort. Green/red indicates decreased/increased mRNA expression in patients compared to age-matched normal controls. The color intensity is correlated with fold change. Straight lines are for direct gene to gene interactions, dashed lines are for indirect ones (top scoring networks are shown).
(A) Heatmap of very young-specific tumor genes across all tumor samples. Samples are denoted in columns and genes are denoted in rows. The heatmap clearly shows that this set of genes was significantly up- or down-regulated in tumor samples from very young women. The expression level of each gene across the samples is scaled to the [-3, 3] interval. These mapped expression levels are depicted using a color scale as shown at the top of the figure, such that highly expressed genes are indicated in red, intermediate in black, and weakly expressed in green. (B) Validation of microarray data by real-time RT-PCR. Ratio of expression for each gene in young (age 35 to 45) to very young (≤35) women. Red bars represent microarray hybridizations, and blue bars represent values from qRT-PCR. (C) Gene interaction network analysis of genes specific to tumors in very young women. Green/red indicates decreased/increased mRNA expression in younger patients compared to older counterparts. The color intensity is correlated with fold change. Straight lines are for direct gene to gene interactions, dashed lines are for indirect ones.
I. Comparison of the expression profile characteristics of IDC and DCIS. (a) 143 genes have significantly different levels of expression between DCIS and IDC. (b) Functional enrichment analysis of genes whose expression was altered between DCIS and IDC. (c-d) The network analysis of 143 genes. Green/red indicates decreased/increased mRNA expression in IDC compared to normal controls. II. Network analyses of genes specific to DCIS or IDC in young women. (A) Venn diagram illustrating 27 genes and 94 probes (corresponding to 72 genes) that are specific to IDC and DCIS, respectively. (B) Network analyses of genes specific to IDC. Green/red indicates decreased/increased mRNA expression in IDC compared to normal controls. (C) Network analyses of genes specific to DCIS (top two significant networks shown). Green/red indicates decreased/increased mRNA expression in DCIS compared to normal controls. The color intensity is correlated with fold change. Straight lines are for direct gene to gene interactions, dashed lines are for indirect ones. DCIS: ductal carcinoma in situ; IDC: invasive ductal carcinoma.
In Silico Independent Validation Analysis. (A) Re-analysis of the dataset from Miller et al., which was composed of 251 human tumor samples; the 31 samples derived from young women were used in the re-analysis. Our progression signature gene list was sufficient to separate patients in Miller et al.’s study into two clusters that differed significantly in p53 mutation status. The cluster with high expression of these genes comprised nearly all of the p53 mutant tumors. (B) The GSE7390 and GSE12093 datasets were used for independent validation analyses. Genes including RRM2, BIRC5, TOP2A, NUSAP1, TPX2, and CCNB2 were of significant clinical relevance for identifying high-risk patient groups (the result for RRM2 is shown).
Gene signatures specific to malignant stage of invasive ductal carcinoma (IDC) and pre-invasive ductal carcinoma in situ (DCIS) in young women.
We are grateful to the patients for their participation in this study. We would like to thank John Quackenbush and Pinar T. Ozand for helpful suggestions.
Conceived and designed the experiments: SMA DC. Performed the experiments: AN NK AA MN HJ MSI AT. Analyzed the data: DC AE. Contributed reagents/materials/analysis tools: SMA TT DA AE OM NK. Wrote the paper: DC SMA BHP NK.
- 1. Kamangar F, Dores GM, Anderson WF (2006) Patterns of cancer incidence, mortality, and prevalence across five continents: defining priorities to reduce cancer disparities in different geographic regions of the world. J Clin Oncol 24: 2137–2150. doi: 10.1200/jco.2005.05.2308
- 2. Cancer Registry of Saudi Arabia (2009) Cancer Incidence Report Saudi Arabia 2005.
- 3. American Cancer Society (2010) Breast Cancer Facts & Figures 2009–2010. Atlanta: American Cancer Society, Inc.
- 4. Chung M, Chang HR, Bland KI, Wanebo HJ (1996) Younger women with breast carcinoma have a poorer prognosis than older women. Cancer 77: 97–103. doi: 10.1002/(sici)1097-0142(19960101)77:1<97::aid-cncr16>3.0.co;2-3
- 5. Maggard MA, O’Connell JB, Lane KE, Liu JH, Etzioni DA, et al. (2003) Do young breast cancer patients have worse outcomes? J Surg Res 113: 109–113. doi: 10.1016/s0022-4804(03)00179-3
- 6. Adami HO, Malker B, Holmberg L, Persson I, Stone B (1986) The relation between survival and age at diagnosis in breast cancer. N Engl J Med 315: 559–563. doi: 10.1056/nejm198608283150906
- 7. Anders CK, Hsu DS, Broadwater G, Acharya CR, Foekens JA, et al. (2008) Young age at diagnosis correlates with worse prognosis and defines a subset of breast cancers with shared patterns of gene expression. J Clin Oncol 26: 3324–3330. doi: 10.1200/jco.2007.14.2471
- 8. Bleyer A, Barr R, Hayes-Lattin B, Thomas D, Ellis C, et al. (2008) The distinctive biology of cancer in adolescents and young adults. Nat Rev Cancer 8: 288–298. doi: 10.1038/nrc2349
- 9. Nixon AJ, Neuberg D, Hayes DF, Gelman R, Connolly JL, et al. (1994) Relationship of patient age to pathologic features of the tumor and prognosis for patients with stage I or II breast cancer. J Clin Oncol 12: 888–894.
- 10. El Saghir NS, Seoud M, Khalil MK, Charafeddine M, Salem ZK, et al. (2006) Effects of young age at presentation on survival in breast cancer. BMC Cancer 6: 194. doi: 10.1186/1471-2407-6-194
- 11. Holli K, Isola J (1997) Effect of age on the survival of breast cancer patients. Eur J Cancer 33: 425–428. doi: 10.1016/s0959-8049(97)89017-x
- 12. Aebi S, Gelber S, Castiglione-Gertsch M, Gelber RD, Collins J, et al. (2000) Is chemotherapy alone adequate for young women with oestrogen-receptor-positive breast cancer? Lancet 355: 1869–1874. doi: 10.1016/s0140-6736(00)02292-3
- 13. Elkum N, Dermime S, Ajarim D, Al-Zahrani A, Alsayed A, et al. (2007) Being 40 or younger is an independent risk factor for relapse in operable breast cancer patients: the Saudi Arabia experience. BMC Cancer 7: 222. doi: 10.1186/1471-2407-7-222
- 14. Bombonati A, Sgroi DC (2011) The molecular pathology of breast cancer progression. J Pathol 223: 307–317. doi: 10.1002/path.2808
- 15. Ma XJ, Salunga R, Tuggle JT, Gaudet J, Enright E, et al. (2003) Gene expression profiles of human breast cancer progression. Proc Natl Acad Sci U S A 100: 5974–5979. doi: 10.1073/pnas.0931261100
- 16. Chin K, de Solorzano CO, Knowles D, Jones A, Chou W, et al. (2004) In situ analyses of genome instability in breast cancer. Nat Genet 36: 984–988. doi: 10.1038/ng1409
- 17. Burstein HJ, Polyak K, Wong JS, Lester SC, Kaelin CM (2004) Ductal carcinoma in situ of the breast. N Engl J Med 350: 1430–1441. doi: 10.1056/nejmra031301
- 18. Amari M, Moriya T, Ishida T, Harada Y, Ohnuki K, et al. (2003) Loss of heterozygosity analyses of asynchronous lesions of ductal carcinoma in situ and invasive ductal carcinoma of the human breast. Jpn J Clin Oncol 33: 556–562. doi: 10.1093/jjco/hyg109
- 19. Castro NP, Osorio CA, Torres C, Bastos EP, Mourao-Neto M, et al. (2008) Evidence that molecular changes in cells occur before morphological alterations during the progression of breast ductal carcinoma. Breast Cancer Res 10: R87. doi: 10.1186/bcr2157
- 20. Ma XJ, Dahiya S, Richardson E, Erlander M, Sgroi DC (2009) Gene expression profiling of the tumor microenvironment during breast cancer progression. Breast Cancer Res 11: R7. doi: 10.1186/bcr2222
- 21. Peeper D, Berns A (2006) Cross-species oncogenomics in cancer gene identification. Cell 125: 1230–1233. doi: 10.1016/j.cell.2006.06.018
- 22. Gaspar C, Cardoso J, Franken P, Molenaar L, Morreau H, et al. (2008) Cross-species comparison of human and mouse intestinal polyps reveals conserved mechanisms in adenomatous polyposis coli (APC)-driven tumorigenesis. Am J Pathol 172: 1363–1380. doi: 10.2353/ajpath.2008.070851
- 23. Paoloni M, Davis S, Lana S, Withrow S, Sangiorgi L, et al. (2009) Canine tumor cross-species genomics uncovers targets linked to osteosarcoma progression. BMC Genomics 10: 625. doi: 10.1186/1471-2164-10-625
- 24. Sweet-Cordero A, Mukherjee S, Subramanian A, You H, Roix JJ, et al. (2005) An oncogenic KRAS2 expression signature identified by cross-species gene-expression analysis. Nat Genet 37: 48–55. doi: 10.1038/ng1490
- 25. Graeber TG, Sawyers CL (2005) Cross-species comparisons of cancer signaling. Nat Genet 37: 7–8. doi: 10.1038/ng0105-7
- 26. Ellwood-Yen K, Graeber TG, Wongvipat J, Iruela-Arispe ML, Zhang J, et al. (2003) Myc-driven murine prostate cancer shares molecular features with human prostate tumors. Cancer Cell 4: 223–238. doi: 10.1016/s1535-6108(03)00197-1
- 27. Colak D, Chishti MA, Al-Bakheet AB, Al-Qahtani A, Shoukri MM, et al. (2010) Integrative and comparative genomics analysis of early hepatocellular carcinoma differentiated from liver regeneration in young and old. Mol Cancer 9: 146. doi: 10.1186/1476-4598-9-146
- 28. Gonzalez-Angulo AM, Hennessy BT, Mills GB (2010) Future of personalized medicine in oncology: a systems biology approach. J Clin Oncol 28: 2777–2783. doi: 10.1200/jco.2009.27.0777
- 29. Zender L, Spector MS, Xue W, Flemming P, Cordon-Cardo C, et al. (2006) Identification and validation of oncogenes in liver cancer using an integrative oncogenomic approach. Cell 125: 1253–1267. doi: 10.1016/j.cell.2006.05.030
- 30. Cancer Genome Atlas Network (2012) Comprehensive molecular portraits of human breast tumours. Nature 490: 61–70. doi: 10.1038/nature11412
- 31. Kadota M, Sato M, Duncan B, Ooshima A, Yang HH, et al. (2009) Identification of novel gene amplifications in breast cancer and coexistence of gene amplification with an activating mutation of PIK3CA. Cancer Res 69: 7357–7365. doi: 10.1158/0008-5472.can-09-0064
- 32. Haverty PM, Fridlyand J, Li L, Getz G, Beroukhim R, et al. (2008) High-resolution genomic and expression analyses of copy number alterations in breast tumors. Genes Chromosomes Cancer 47: 530–542. doi: 10.1002/gcc.20558
- 33. Pollack JR, Sorlie T, Perou CM, Rees CA, Jeffrey SS, et al. (2002) Microarray analysis reveals a major direct role of DNA copy number alteration in the transcriptional program of human breast tumors. Proc Natl Acad Sci U S A 99: 12963–12968. doi: 10.1073/pnas.162471999
- 34. Patil MA, Chua MS, Pan KH, Lin R, Lih CJ, et al. (2005) An integrated data analysis approach to characterize genes highly expressed in hepatocellular carcinoma. Oncogene 24: 3737–3747. doi: 10.1038/sj.onc.1208479
- 35. Garraway LA, Widlund HR, Rubin MA, Getz G, Berger AJ, et al. (2005) Integrative genomic analyses identify MITF as a lineage survival oncogene amplified in malignant melanoma. Nature 436: 117–122. doi: 10.1038/nature03664
- 36. Ala U, Piro RM, Grassi E, Damasco C, Silengo L, et al. (2008) Prediction of human disease genes by human-mouse conserved coexpression analysis. PLoS Comput Biol 4: e1000043. doi: 10.1371/journal.pcbi.1000043
- 37. Kretschmer C, Sterner-Kock A, Siedentopf F, Schoenegg W, Schlag PM, et al. (2011) Identification of early molecular markers for breast cancer. Mol Cancer 10: 15. doi: 10.1186/1476-4598-10-15
- 38. Bennett CN, Green JE (2008) Unlocking the power of cross-species genomic analyses: identification of evolutionarily conserved breast cancer networks and validation of preclinical models. Breast Cancer Res 10: 213. doi: 10.1186/bcr2125
- 39. Collins LC, Marotti JD, Gelber S, Cole K, Ruddy K, et al. (2012) Pathologic features and molecular phenotype by patient age in a large cohort of young women with breast cancer. Breast Cancer Res Treat 131: 1061–1066. doi: 10.1007/s10549-011-1872-9
- 40. Li C, Hung Wong W (2001) Model-based analysis of oligonucleotide arrays: model validation, design issues and standard error application. Genome Biol 2: RESEARCH0032. doi: 10.1186/gb-2001-2-8-research0032
- 41. Li C, Wong WH (2001) Model-based analysis of oligonucleotide arrays: expression index computation and outlier detection. Proc Natl Acad Sci U S A 98: 31–36. doi: 10.1073/pnas.98.1.31
- 42. Saeed AI, Bhagabati NK, Braisted JC, Liang W, Sharov V, et al. (2006) TM4 microarray software suite. Methods Enzymol 411: 134–193. doi: 10.1016/s0076-6879(06)11009-5
- 43. Saeed AI, Sharov V, White J, Li J, Liang W, et al. (2003) TM4: a free, open-source system for microarray data management and analysis. Biotechniques 34: 374–378.
- 44. Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, et al. (2004) Bioconductor: open software development for computational biology and bioinformatics. Genome Biol 5: R80.
- 45. Wu Z, Irizarry RA (2004) Preprocessing of oligonucleotide array data. Nat Biotechnol 22: 656–658; author reply 658.
- 46. Wu Z, Irizarry RA (2005) Stochastic models inspired by hybridization theory for short oligonucleotide arrays. J Comput Biol 12: 882–893. doi: 10.1089/cmb.2005.12.882
- 47. Pavlidis P (2003) Using ANOVA for gene selection from microarray studies of the nervous system. Methods 31: 282–289. doi: 10.1016/s1046-2023(03)00157-9
- 48. Thomas PD, Kejariwal A, Guo N, Mi H, Campbell MJ, et al. (2006) Applications for protein sequence-function evolution data: mRNA/protein expression analysis and coding SNP scoring tools. Nucleic Acids Res 34: W645–650. doi: 10.1093/nar/gkl229
- 49. Hosack DA, Dennis G Jr, Sherman BT, Lane HC, Lempicki RA (2003) Identifying biological themes within lists of genes with EASE. Genome Biol 4: R70. doi: 10.1186/gb-2003-4-10-r70
- 50. Brazma A, Hingamp P, Quackenbush J, Sherlock G, Spellman P, et al. (2001) Minimum information about a microarray experiment (MIAME)-toward standards for microarray data. Nat Genet 29: 365–371.
- 51. Gyorffy B, Lanczky A, Eklund AC, Denkert C, Budczies J, et al. (2010) An online survival analysis tool to rapidly assess the effect of 22,277 genes on breast cancer prognosis using microarray data of 1,809 patients. Breast Cancer Res Treat 123: 725–731. doi: 10.1007/s10549-009-0674-9
- 52. Desmedt C, Piette F, Loi S, Wang Y, Lallemand F, et al. (2007) Strong time dependence of the 76-gene prognostic signature for node-negative breast cancer patients in the TRANSBIG multicenter independent validation series. Clin Cancer Res 13: 3207–3214. doi: 10.1158/1078-0432.ccr-06-2765
- 53. Zhang Y, Sieuwerts AM, McGreevy M, Casey G, Cufer T, et al. (2009) The 76-gene signature defines high-risk patients that benefit from adjuvant tamoxifen therapy. Breast Cancer Res Treat 116: 303–309. doi: 10.1007/s10549-008-0183-2
- 54. Miller LD, Smeds J, George J, Vega VB, Vergara L, et al. (2005) An expression signature for p53 status in human breast cancer predicts mutation status, transcriptional effects, and patient survival. Proc Natl Acad Sci U S A 102: 13550–13555. doi: 10.1073/pnas.0506230102
- 55. Culhane AC, Schroder MS, Sultana R, Picard SC, Martinelli EN, et al. (2012) GeneSigDB: a manually curated database and resource for analysis of gene expression signatures. Nucleic Acids Res 40: D1060–1066. doi: 10.1093/nar/gkr901
- 56. Cerami E, Gao J, Dogrusoz U, Gross BE, Sumer SO, et al. (2012) The cBio cancer genomics portal: an open platform for exploring multidimensional cancer genomics data. Cancer Discov 2: 401–404. doi: 10.1158/2159-8290.cd-12-0095
- 57. Reiner A, Yekutieli D, Benjamini Y (2003) Identifying differentially expressed genes using false discovery rate controlling procedures. Bioinformatics 19: 368–375. doi: 10.1093/bioinformatics/btf877
- 58. Livak KJ, Schmittgen TD (2001) Analysis of relative gene expression data using real-time quantitative PCR and the 2(-Delta Delta C(T)) Method. Methods 25: 402–408. doi: 10.1006/meth.2001.1262
- 59. Rubinstein R, Simon I (2005) MILANO–custom annotation of microarray results using automatic literature searches. BMC Bioinformatics 6: 12.
- 60. Psyrri A, Kalogeras KT, Kronenwett R, Wirtz RM, Batistatou A, et al. (2012) Prognostic significance of UBE2C mRNA expression in high-risk early breast cancer. A Hellenic Cooperative Oncology Group (HeCOG) Study. Ann Oncol 23: 1422–1427. doi: 10.1093/annonc/mdr527
- 61. Waligorska-Stachura J, Jankowska A, Wasko R, Liebert W, Biczysko M, et al. (2012) Survivin–prognostic tumor biomarker in human neoplasms–review. Ginekol Pol 83: 537–540.
- 62. Stav D, Bar I, Sandbank J (2007) Usefulness of CDK5RAP3, CCNB2, and RAGE genes for the diagnosis of lung adenocarcinoma. Int J Biol Markers 22: 108–113.
- 63. Olson JE, Wang X, Goode EL, Pankratz VS, Fredericksen ZS, et al. (2010) Variation in genes required for normal mitosis and risk of breast cancer. Breast Cancer Res Treat 119: 423–430. doi: 10.1007/s10549-009-0386-1
- 64. Chen Z, Zhang C, Wu D, Chen H, Rorick A, et al. (2011) Phospho-MED1-enhanced UBE2C locus looping drives castration-resistant prostate cancer growth. EMBO J 30: 2405–2419. doi: 10.1038/emboj.2011.154
- 65. Albertson DG, Collins C, McCormick F, Gray JW (2003) Chromosome aberrations in solid tumors. Nat Genet 34: 369–376. doi: 10.1038/ng1215
- 66. Zhao X, Weir BA, LaFramboise T, Lin M, Beroukhim R, et al. (2005) Homozygous deletions and chromosome amplifications in human lung carcinomas revealed by single nucleotide polymorphism array analysis. Cancer Res 65: 5561–5570. doi: 10.1158/0008-5472.can-04-4603
- 67. Pharoah PD, Day NE, Caldas C (1999) Somatic mutations in the p53 gene and prognosis in breast cancer: a meta-analysis. Br J Cancer 80: 1968–1973.
- 68. Anders CK, Acharya CR, Hsu DS, Broadwater G, Garman K, et al. (2008) Age-specific differences in oncogenic pathway deregulation seen in human breast tumors. PLoS One 3: e1373. doi: 10.1371/journal.pone.0001373
- 69. Fredholm H, Eaker S, Frisell J, Holmberg L, Fredriksson I, et al. (2009) Breast cancer in young women: poor survival despite intensive treatment. PLoS One 4: e7695. doi: 10.1371/journal.pone.0007695
- 70. Xu J, Chen Y, Olopade OI (2010) MYC and Breast Cancer. Genes Cancer 1: 629–640. doi: 10.1177/1947601910378691
- 71. Corzo C, Corominas JM, Tusquets I, Salido M, Bellet M, et al. (2006) The MYC oncogene in breast cancer progression: from benign epithelium to invasive carcinoma. Cancer Genet Cytogenet 165: 151–156. doi: 10.1016/j.cancergencyto.2005.08.013
- 72. Park BK, Zhang H, Zeng Q, Dai J, Keller ET, et al. (2007) NF-kappaB in breast cancer cells promotes osteolytic bone metastasis by inducing osteoclastogenesis via GM-CSF. Nat Med 13: 62–69. doi: 10.1038/nm1519
- 73. Buck MB, Knabbe C (2006) TGF-beta signaling in breast cancer. Ann N Y Acad Sci 1089: 119–126. doi: 10.1196/annals.1386.024
- 74. Katsuno Y, Lamouille S, Derynck R (2013) TGF-beta signaling and epithelial-mesenchymal transition in cancer progression. Curr Opin Oncol 25: 76–84. doi: 10.1097/cco.0b013e32835b6371
- 75. Tokunaga E, Kimura Y, Mashino K, Oki E, Kataoka A, et al. (2006) Activation of PI3K/Akt signaling and hormone resistance in breast cancer. Breast Cancer 13: 137–144. doi: 10.2325/jbcs.13.137
- 76. McAuliffe PF, Meric-Bernstam F, Mills GB, Gonzalez-Angulo AM (2010) Deciphering the role of PI3K/Akt/mTOR pathway in breast cancer biology and pathogenesis. Clin Breast Cancer 10 Suppl 3: S59–65. doi: 10.3816/cbc.2010.s.013
- 77. D’Errico A, Barozzi C, Fiorentino M, Carella R, Di Simone M, et al. (2000) Role and new perspectives of transforming growth factor-alpha (TGF-alpha) in adenocarcinoma of the gastro-oesophageal junction. Br J Cancer 82: 865–870.
- 78. Hantschmann P, Jeschke U, Friese K (2005) TGF-alpha, c-erbB-2 expression and neoangiogenesis in vulvar squamous cell carcinoma. Anticancer Res 25: 1731–1737.
- 79. Hartley MC, McKinley BP, Rogers EA, Kalbaugh CA, Messich HS, et al. (2006) Differential expression of prognostic factors and effect on survival in young (< or = 40) breast cancer patients: a case-control study. Am Surg 72: 1189–1194; discussion 1194–1195.
- 80. Agrup M, Stal O, Olsen K, Wingren S (2000) C-erbB-2 overexpression and survival in early onset breast cancer. Breast Cancer Res Treat 63: 23–29. doi: 10.1023/a:1006498721508
- 81. Slamon DJ, Godolphin W, Jones LA, Holt JA, Wong SG, et al. (1989) Studies of the HER-2/neu proto-oncogene in human breast and ovarian cancer. Science 244: 707–712. doi: 10.1126/science.2470152
- 82. Pantschenko AG, Pushkar I, Anderson KH, Wang Y, Miller LJ, et al. (2003) The interleukin-1 family of cytokines and receptors in human breast cancer: implications for tumor progression. Int J Oncol 23: 269–284.
- 83. Grimm C, Kantelhardt E, Heinze G, Polterauer S, Zeillinger R, et al. (2009) The prognostic value of four interleukin-1 gene polymorphisms in Caucasian women with breast cancer: a multicenter study. BMC Cancer 9: 78. doi: 10.1186/1471-2407-9-78
- 84. Graziano F, Ruzzo A (2005) Role of the interleukin-1 receptor antagonist gene polymorphism (IL-1RN*2) in early gastric cancer. J Clin Oncol 23: 5272; author reply 5272–5273.
- 85. Saijo Y, Tanaka M, Miki M, Usui K, Suzuki T, et al. (2002) Proinflammatory cytokine IL-1 beta promotes tumor growth of Lewis lung carcinoma by induction of angiogenic factors: in vivo analysis of tumor-stromal interaction. J Immunol 169: 469–475.
- 86. Elaraj DM, Weinreich DM, Varghese S, Puhlmann M, Hewitt SM, et al. (2006) The role of interleukin 1 in growth and metastasis of human cancer xenografts. Clin Cancer Res 12: 1088–1096. doi: 10.1158/1078-0432.ccr-05-1603
- 87. Honma S, Shimodaira K, Shimizu Y, Tsuchiya N, Saito H, et al. (2002) The influence of inflammatory cytokines on estrogen production and cell proliferation in human breast cancer cells. Endocr J 49: 371–377. doi: 10.1507/endocrj.49.371
- 88. Jin L, Yuan RQ, Fuchs A, Yao Y, Joseph A, et al. (1997) Expression of interleukin-1beta in human breast carcinoma. Cancer 80: 421–434. doi: 10.1002/(sici)1097-0142(19970801)80:3<421::aid-cncr10>3.0.co;2-z
- 89. Connelly L, Barham W, Onishko HM, Sherrill T, Chodosh LA, et al. (2011) Inhibition of NF-kappa B activity in mammary epithelium increases tumor latency and decreases tumor burden. Oncogene 30: 1402–1412. doi: 10.1038/onc.2010.521
- 90. Stebel A, Brachetti C, Kunkel M, Schmidt M, Fritz G (2009) Progression of breast tumors is accompanied by a decrease in expression of the Rho guanine exchange factor Tiam1. Oncol Rep 21: 217–222. doi: 10.3892/or_00000211
- 91. Cantrell VA, Jessen JR (2010) The planar cell polarity protein Van Gogh-Like 2 regulates tumor cell migration and matrix metalloproteinase-dependent invasion. Cancer Lett 287: 54–61. doi: 10.1016/j.canlet.2009.05.041
- 92. Bennett CN, Green JE (2010) Genomic analyses as a guide to target identification and preclinical testing of mouse models of breast cancer. Toxicol Pathol 38: 88–95. doi: 10.1177/0192623309357074
- 93. Langerod A, Zhao H, Borgan O, Nesland JM, Bukholm IR, et al. (2007) TP53 mutation status and gene expression profiles are powerful prognostic markers of breast cancer. Breast Cancer Res 9: R30. doi: 10.1186/bcr1675
- 94. Rossner P Jr, Gammon MD, Zhang YJ, Terry MB, Hibshoosh H, et al. (2009) Mutations in p53, p53 protein overexpression and breast cancer survival. J Cell Mol Med 13: 3847–3857. doi: 10.1111/j.1582-4934.2008.00553.x
- 95. Martelli AM, Cocco L, Capitani S, Miscia S, Papa S, et al. (2007) Nuclear phosphatidylinositol 3,4,5-trisphosphate, phosphatidylinositol 3-kinase, Akt, and PTen: emerging key regulators of anti-apoptotic signaling and carcinogenesis. Eur J Histochem 51 Suppl 1: 125–131.
- 96. Bosco EE, Knudsen ES (2007) RB in breast cancer: at the crossroads of tumorigenesis and treatment. Cell Cycle 6: 667–671. doi: 10.4161/cc.6.6.3988
- 97. Sloan KE, Eustace BK, Stewart JK, Zehetmeier C, Torella C, et al. (2004) CD155/PVR plays a key role in cell motility during tumor cell invasion and migration. BMC Cancer 4: 73.
- 98. Pende D, Spaggiari GM, Marcenaro S, Martini S, Rivera P, et al. (2005) Analysis of the receptor-ligand interactions in the natural killer-mediated lysis of freshly isolated myeloid or lymphoblastic leukemias: evidence for the involvement of the Poliovirus receptor (CD155) and Nectin-2 (CD112). Blood 105: 2066–2073. doi: 10.1182/blood-2004-09-3548