SB is an employee of Exosome Diagnostics GmbH; JQ is a member of the Exosome Diagnostics Scientific Advisory Board and holds minimal stock options in the company. Neither this nor any other affiliation alters the authors’ adherence to all the PLOS ONE policies on sharing data and materials.
Conceived and designed the experiments: ACC MS DH. Performed the experiments: MS ACC. Analyzed the data: MS ACC SB. Contributed reagents/materials/analysis tools: OH SB BHK JQ. Wrote the paper: MS ACC DS JQ.
Although ovarian cancer is often initially chemotherapy-sensitive, the vast majority of tumors eventually relapse and patients die of increasingly aggressive disease. Cancer stem cells are believed to have properties that allow them to survive therapy and may drive recurrent tumor growth. Cancer stem cells or cancer-initiating cells are a rare cell population and difficult to isolate experimentally. Genes that are expressed by stem cells may characterize a subset of less differentiated tumors and aid in prognostic classification of ovarian cancer. The purpose of this study was the genomic identification and characterization of a subtype of ovarian cancer that has stem cell-like gene expression. Using human and mouse gene signatures of embryonic, adult, or cancer stem cells, we performed an unsupervised bipartition class discovery on expression profiles from 145 serous ovarian tumors to identify a stem-like and more differentiated subgroup. Subtypes were reproducible and were further characterized in four independent, heterogeneous ovarian cancer datasets. We identified a stem-like subtype characterized by a 51-gene signature, which is significantly enriched in tumors with properties of Type II ovarian cancer: high-grade serous tumors and poor survival. Conversely, the differentiated tumors share properties with Type I, including lower grade and mixed histological subtypes. The stem cell-like signature was prognostic within high-stage serous ovarian cancer, classifying a small subset of high-stage tumors with better prognosis into the differentiated subtype. In multivariate models that adjusted for common clinical factors (including grade, stage, age), the subtype classification was still a significant predictor of relapse.
The prognostic stem-like gene signature yields new insights into prognostic differences in ovarian cancer, provides a genomic context for defining Type I/II subtypes, and identifies potential gene targets which, following further validation, may be valuable in the clinical management or treatment of ovarian cancer.
Ovarian cancer is the fifth most common cause of cancer deaths among women and is the leading cause of death from gynecological neoplastic disease
In breast cancer, there are widely accepted molecular subtypes. Approximately 15% of breast cancers are estrogen receptor (ER)-negative, high-grade, and often basal-like tumors that are enriched in cells expressing the putative stem cell markers CD44+/CD24−
In contrast, ovarian cancer has no consensus molecular subtype classification. Tothill et al. used k-means clustering of microarray data and described six molecular subtypes of serous and endometrioid ovarian cancer
Here we report the identification of ovarian cancer subtypes based on the expression of genes associated with stem cell signatures. Using a computational approach, we demonstrate the presence of a poor prognosis, stem cell-like subtype in ovarian cancer that aligned closely with the cell of origin classification and provides the first genomic definition of Type I/II ovarian cancer. This gene expression profile does not demonstrate the existence of a subpopulation of cancer stem cells in these tumors. Instead it discovers common molecular pathways expressed by these cancers and stem cells. Tumors displaying expression of stem-like genes may have a less differentiated phenotype. Discovery of this stem cell subtype provides us a more complete understanding of ovarian cancer’s molecular diversity and opens up the potential for new and more directed approaches to treating and managing the disease.
Stem-like cluster discovery was applied to ovarian cancer gene expression data published by Tothill et al.
The four validation ovarian cancer gene expression datasets used in this analysis were from Dressman et al.
RMA-normalization was used in most datasets because of its highly reproducible results and correlation with RT-PCR data
In the Desmedt and Veridex breast cancer datasets, we predicted molecular subtypes as described by Desmedt et al. (2008)
Statistical analyses, unless otherwise described, were performed using all available data and standard functions in R version 2.10.1. The ISIS algorithm
To diminish confounding effects from differential stroma and non-tumor cells in different arrayed sites (e.g. peritoneal or ovary) (unpublished data), only AOCS arrays of ovarian mRNA from malignant, serous, and primary site ovarian tumors from patients that did not receive neoadjuvant therapy were included, reducing the dataset from 285 to 145 patients. Analyses of the AOCS dataset were performed on this subset unless we specify the “entire” AOCS dataset (n = 285) or “remaining” AOCS data (n = 140), which were tumors not used for ISIS class discovery.
Genes used for subtype cluster discovery were limited to 83 mouse and human gene signatures of adult, cancer, or embryonic stem cells obtained from GeneSigDB
The resulting matrix of 2,632 stem-like genes was subjected to ISIS bipartition discovery. The highest scoring bipartition that was significantly (p<0.05) associated with grade and disease-free survival was selected for further investigation. These criteria were based on the finding that the stem cell-like sub-population of breast cancer tumors discovered by Ben-Porath et al. was characterized by high grade and poor prognosis
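The gene restriction and bipartition selection described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the `Bipartition` record, and the toy inputs are all hypothetical, and the ISIS scoring itself is not implemented (candidate scores and p-values are assumed to be precomputed).

```python
from dataclasses import dataclass


def stem_gene_universe(signatures):
    """Return the union of genes across all stem cell signatures."""
    universe = set()
    for genes in signatures.values():
        universe.update(genes)
    return universe


def restrict_to_genes(expression, universe):
    """Keep only the genes (rows) of the expression table that are in the universe."""
    return {gene: values for gene, values in expression.items() if gene in universe}


@dataclass
class Bipartition:
    """Hypothetical record for one candidate split of the tumors."""
    score: float    # bipartition score, higher is better (assumed precomputed)
    p_grade: float  # association with tumor grade
    p_dfs: float    # association with disease-free survival


def select_bipartition(candidates, alpha=0.05):
    """Pick the highest-scoring candidate associated with both grade and survival."""
    eligible = [c for c in candidates if c.p_grade < alpha and c.p_dfs < alpha]
    return max(eligible, key=lambda c: c.score) if eligible else None


# Toy inputs (illustrative only).
signatures = {"sig_a": {"G1", "G2"}, "sig_b": {"G2", "G3"}}
expression = {"G1": [1.0, 2.0], "G3": [0.5, 0.7], "G9": [3.0, 3.1]}
candidates = [Bipartition(9.0, 0.20, 0.01), Bipartition(7.5, 0.01, 0.03)]

universe = stem_gene_universe(signatures)
filtered = restrict_to_genes(expression, universe)
chosen = select_bipartition(candidates)
```

In the toy data the top-scoring candidate is rejected because it is not associated with grade, so the second candidate is chosen.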
Gene-Set-Enrichment Analysis (GSEA) with nonparametric inference for linear models as implemented in
To confirm the presence of the stem-like subtype in ovarian cancer, the stem-like subtype classification was applied to multiple independent microarray datasets. In order to predict the class of new tumors, we first needed to generate a “stemness molecular classifier,” a model of gene weights that discriminated the stem-like and differentiated tumors. Diagonal linear discriminant (DLD) analysis
The ovarian cancer datasets used for validation are phenotypically and clinically heterogeneous, and contain different histological subtypes, grades, prognoses, treatments and follow-up protocols. Unless stated we did not control for phenotype variation; instead we exploited the heterogeneity in datasets, in particular histology and grade, to explore how the bipartition associated with phenotypes beyond those represented in the AOCS dataset and to determine the extent to which the stemness molecular classifier could be generalized.
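To make the idea of a gene-weight classifier concrete, the sketch below implements a generic two-class diagonal linear discriminant in plain Python. It assumes equal class priors and a pooled per-gene variance; the function names and toy data are hypothetical, and the authors' actual implementation and preprocessing may differ.

```python
from statistics import mean


def fit_dld(samples, labels):
    """Fit a two-class diagonal linear discriminant.

    samples: list of per-tumor expression lists (same gene order in each).
    labels:  0 (e.g. differentiated) or 1 (e.g. stem-like) per tumor.
    Returns per-gene weights and per-gene midpoints between class means.
    """
    group0 = [s for s, y in zip(samples, labels) if y == 0]
    group1 = [s for s, y in zip(samples, labels) if y == 1]
    dof = len(samples) - 2  # degrees of freedom for the pooled variance
    weights, midpoints = [], []
    for g in range(len(samples[0])):
        m0 = mean(s[g] for s in group0)
        m1 = mean(s[g] for s in group1)
        ss = sum((s[g] - m0) ** 2 for s in group0) + sum((s[g] - m1) ** 2 for s in group1)
        pooled_var = ss / dof  # assumed non-zero for every gene
        weights.append((m1 - m0) / pooled_var)
        midpoints.append((m0 + m1) / 2)
    return weights, midpoints


def dld_score(sample, weights, midpoints):
    """Positive scores favor class 1, negative scores favor class 0."""
    return sum(w * (x - c) for x, w, c in zip(sample, weights, midpoints))


# Toy training data: gene 0 separates the classes, gene 1 does not.
train = [[1.0, 5.0], [1.2, 5.5], [0.8, 4.5], [3.0, 5.2], [3.2, 4.8], [2.8, 5.0]]
labels = [0, 0, 0, 1, 1, 1]
weights, midpoints = fit_dld(train, labels)
```

Because the discriminant ignores gene-gene covariance, each gene contributes independently to the score, which keeps the model easy to transfer to datasets in which only a subset of the signature genes is measured.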
To explore whether ovarian cancer has a stem-like component, we tested whether genes reported to be expressed by stem cells are also expressed in a subset of ovarian tumors. To do this, we extracted the union of all adult, cancer, or embryonic stem (ES) cell gene expression signatures in GeneSigDB
We then took the AOCS ovarian cancer gene expression data (n = 145 patients) and considered only the 2,632 genes reported to be expressed in stem cells. To this, we applied ISIS
The top-scoring bipartition, hereafter referred to as the ovarian cancer stemness bipartition, differentiated two distinct subgroups of ovarian cancer patients: a set of 121 patients with worse disease-free survival (p = 0.0541), worse overall survival (p = 0.102), and higher grade (p = 0.00326), which we interpreted as more “stem cell-like” because these tumors over-expressed a number of genes known to be associated with stemness, and a smaller group of 24 patients with better survival and lower grade, which we refer to as the “differentiated” subgroup (
(A) A heatmap of gene expression profiles of the 24 differentiated (green) and 121 stem-like (blue) tumors from the AOCS dataset
Leave-one-patient-out cross-validation was performed to extract the most robust gene signature of this bipartition, resulting in a 51-gene stemness signature (Table S2 in
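One way a leave-one-patient-out procedure can yield a robust signature is to keep only the genes that are selected in every fold. The sketch below illustrates that idea. The selection statistic (a Welch-style t statistic), the top-k rule, and all names and data are assumptions made for illustration; the text does not specify the authors' exact procedure.

```python
from statistics import mean, variance


def separation(values, labels):
    """Absolute Welch-style t statistic between the two label groups."""
    g0 = [v for v, y in zip(values, labels) if y == 0]
    g1 = [v for v, y in zip(values, labels) if y == 1]
    se2 = variance(g0) / len(g0) + variance(g1) / len(g1)
    return abs(mean(g1) - mean(g0)) / (se2 ** 0.5 + 1e-12)


def top_genes(expr, labels, k):
    """Return the k genes that best separate the two groups."""
    ranked = sorted(expr, key=lambda g: separation(expr[g], labels), reverse=True)
    return set(ranked[:k])


def loo_signature(expr, labels, k):
    """Genes selected in every leave-one-patient-out fold."""
    n = len(labels)
    signature = None
    for left_out in range(n):
        keep = [i for i in range(n) if i != left_out]
        fold_expr = {g: [v[i] for i in keep] for g, v in expr.items()}
        fold_labels = [labels[i] for i in keep]
        selected = top_genes(fold_expr, fold_labels, k)
        signature = selected if signature is None else signature & selected
    return signature


# Toy data: gene "A" separates the groups, "B" and "C" do not.
labels = [0, 0, 0, 1, 1, 1]
expr = {
    "A": [1.0, 1.1, 0.9, 3.0, 3.1, 2.9],
    "B": [2.0, 2.5, 1.5, 2.1, 2.4, 1.6],
    "C": [5.0, 4.0, 6.0, 5.5, 4.5, 5.0],
}
signature = loo_signature(expr, labels, k=1)
```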
To provide further support for the phenotypes revealed by the bipartition, we tested if gene targets known to be expressed in stem cells were differentially regulated between the stem-like and differentiated subtypes using gene set enrichment analysis (GSEA). Thirteen lists of genes (Table S4 in
Gene Set | Enrichment | P-value | Adjusted P-value |
ES exp1 | Stem-like | 0.00002 | 0.00004 |
ES exp2 | Stem-like | 0.00031 | 0.01671 |
Nanog targets | Stem-like | 0.00115 | 0.01573 |
Oct4 targets | Stem-like | 0.01509 | 0.07951 |
Sox2 targets | Stem-like | 0.02296 | 0.12607 |
NOS targets | Stem-like | 0.00969 | 0.10489 |
NOS TFs | Stem-like | 0.09596 | 0.13822 |
Myc targets1 | Stem-like | 0.01144 | 0.03774 |
Myc targets2 | Stem-like | 0.01349 | 0.10387 |
Suz12 targets | Differentiated | 0.05251 | 0.05508 |
Eed targets | Differentiated | 0.06293 | 0.05605 |
H3K27 bound | Differentiated | 0.05046 | 0.06329 |
PRC2 targets | Differentiated | 0.08553 | 0.09058 |
Analysis was repeated after removing proliferation-related genes from the gene sets, as described by Ben-Porath et al.
To further characterize tumors in the stem-like subtype, we performed GSEA using all of the gene sets in MSigDB
Both high-grade, serous ovarian and basal-like breast cancer are seen in women with mutant BRCA1
Subtype | Desmedt: Basal | Desmedt: Non-basal | Veridex: Basal | Veridex: Non-basal |
Stem-like | 35 | 14 | 90 | 63 |
Differentiated | 11 | 138 | 12 | 179 |
Fisher’s exact test p = 3.50×10^−18 and p = 1.11×10^−27 for the Desmedt and Veridex datasets, respectively.
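The test applied to the contingency tables above can be reproduced in outline with a small pure-Python implementation of the two-sided Fisher's exact test. This is a sketch of the standard hypergeometric formulation, not the authors' code (their analyses were run in R); the function name is hypothetical, and the counts are copied from the table.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k in the top-left cell given fixed margins.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Sum the probabilities of all tables that are no more likely than the observed one.
    total = sum(p for p in map(prob, range(lowest, highest + 1))
                if p <= p_observed * (1 + 1e-7))
    return min(1.0, total)


# Rows: stem-like, differentiated. Columns: basal, non-basal.
p_desmedt = fisher_exact_two_sided(35, 14, 11, 138)
p_veridex = fisher_exact_two_sided(90, 63, 12, 179)
```

Exact integer arithmetic via `math.comb` avoids the overflow and rounding problems that factorial-based formulas run into with counts of this size.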
To validate our ovarian stem-like and differentiated molecular subtypes, we applied the stemness DLD molecular classifier to three independent ovarian cancer microarray datasets and the “remaining AOCS data” (n = 140) not used in the initial bipartition discovery. Two datasets, Crijns et al.
First, we confirmed the association between grade and the stem-like molecular subtype. In these independent data, stem-like tumors had higher grade in the Wu (n = 103, logistic regression p = 1.63×10^−5), remaining AOCS (n = 140, p = 1.16×10^−7), and Dressman (n = 118, p = 0.073) datasets.
Next we explored which histological subtypes of ovarian cancer were classified as stem-like. Serous is the most common histological form of epithelial ovarian tumor, but epithelial ovarian cancer is a heterogeneous disease with mixed malignancy potentials and histological subtypes, including endometrioid, clear cell, and mucinous
To further evaluate the association with histological subtype, we examined the entire AOCS dataset (n = 285). This larger dataset comprised the serous AOCS discovery dataset (n = 145) and the remaining AOCS data (n = 140), which included LMP serous tumors and malignant endometrioid and serous tumors arrayed from sites other than the ovary. We observed that the differentiated subtype was significantly enriched in endometrioid tumors (9/20, Fisher’s test p<0.05 after FWER correction
The stem-like subtype classification was not equivalent to the classification recently proposed by Tothill et al. in which serous and endometrioid tumors are identified as one of six molecular subtypes, C1–C6
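Dataset | Variable | Level | Differentiated | Stem-like | P-value |
Wu (n = 103) | Stage | I | 31 | 4 | |
Wu (n = 103) | Stage | II | 6 | 5 | |
Wu (n = 103) | Stage | III | 13 | 31 | |
Wu (n = 103) | Stage | IV | 4 | 5 | |
Wu (n = 103) | Grade | 1 | 18 | 1 | |
Wu (n = 103) | Grade | 2 | 9 | 8 | |
Wu (n = 103) | Grade | 3 | 14 | 24 | |
Wu (n = 103) | Histology | Clear Cell | 7 | 1 | |
Wu (n = 103) | Histology | Endometrioid | 22 | 15 | |
Wu (n = 103) | Histology | Mucinous | 13 | 0 | |
Wu (n = 103) | Histology | OSE | 4 | 0 | |
Wu (n = 103) | Histology | Serous | 12 | 29 | |
Entire AOCS (n = 285) | Type | LMP | 17 | 1 | |
Entire AOCS (n = 285) | Type | Malignant | 35 | 232 | |
Entire AOCS (n = 285) | Stage | I | 15 | 9 | |
Entire AOCS (n = 285) | Stage | II | 8 | 10 | |
Entire AOCS (n = 285) | Stage | III | 28 | 189 | |
Entire AOCS (n = 285) | Stage | IV | 1 | 21 | |
Entire AOCS (n = 285) | Grade | 1 | 14 | 5 | |
Entire AOCS (n = 285) | Grade | 2 | 19 | 78 | |
Entire AOCS (n = 285) | Grade | 3 | 17 | 147 | |
Entire AOCS (n = 285) | Histology | Adenocarcinoma | 0 | 1 | * |
Entire AOCS (n = 285) | Histology | Endometrioid | 9 | 11 | |
Entire AOCS (n = 285) | Histology | Serous | 43 | 221 | |
Entire AOCS (n = 285) | Primary site | Fallopian tube | 0 | 8 | |
Entire AOCS (n = 285) | Primary site | Ovary | 52 | 191 | |
Entire AOCS (n = 285) | Primary site | Peritoneum | 0 | 34 | |
Entire AOCS (n = 285) | Arrayed site | Other | 0 | 14 | |
Entire AOCS (n = 285) | Arrayed site | Ovary | 50 | 150 | |
Entire AOCS (n = 285) | Arrayed site | Peritoneum | 2 | 69 | |
Entire AOCS (n = 285) | Age | Median age | 56.3 | 59.3 | * |
Entire AOCS (n = 285) | Residual disease | <1 cm | 43 | 118 | |
Entire AOCS (n = 285) | Residual disease | >1 cm | 5 | 76 | |
Entire AOCS (n = 285) | Molecular subtype | C1 | 3 | 80 | |
Entire AOCS (n = 285) | Molecular subtype | C2 | 2 | 48 | |
Entire AOCS (n = 285) | Molecular subtype | C3 | 25 | 3 | |
Entire AOCS (n = 285) | Molecular subtype | C4 | 7 | 39 | |
Entire AOCS (n = 285) | Molecular subtype | C5 | 7 | 29 | |
Entire AOCS (n = 285) | Molecular subtype | C6 | 4 | 4 | |
Entire AOCS (n = 285) | Molecular subtype | NC | 4 | 30 | |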
p-value <0.01,
p-value <0.001. OSE Ovarian surface epithelium, NC not classified. In each dataset, p-values were corrected for family-wise error rate using Hommel’s method
In the three independent validation datasets and the remaining AOCS data, tumors with the stem-like subtype had worse prognosis (
In the remaining AOCS dataset, the stem-like subtype has significantly worse (A) disease-free survival (p<0.001) and (B) overall survival (p = 0.00127). In the (C) Crijns and (D) Dressman datasets, the stem-like subtype also has significantly worse overall survival (p = 0.022 and p = 0.035, respectively).
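Survival differences between two subtypes, such as those summarized in the caption above, are commonly assessed with a log-rank test. The sketch below is a minimal pure-Python version of the standard two-group test, offered as an illustration only: the text does not state which survival test produced these p-values, and the function name and toy follow-up data are hypothetical.

```python
from math import erfc, sqrt


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value).

    times_*:  follow-up times; events_*: 1 if the event occurred, 0 if censored.
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        at_risk = [(tt, e, g) for tt, e, g in data if tt >= t]
        n = len(at_risk)
        n_a = sum(1 for _, _, g in at_risk if g == 0)
        d = sum(1 for tt, e, _ in at_risk if e and tt == t)
        d_a = sum(1 for tt, e, g in at_risk if e and tt == t and g == 0)
        # Observed minus expected events in group A at this event time.
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = observed_minus_expected ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return chi2, erfc(sqrt(chi2 / 2))


# Toy data: group A relapses early, group B late (all events observed).
chi2_toy, p_toy = logrank_test([1, 2, 3, 4, 5], [1] * 5, [10, 11, 12, 13, 14], [1] * 5)
```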
A finding of potential clinical importance is that the stemness molecular classifier may also be prognostic within high-grade, high-stage serous ovarian cancer. The stem-like subtype had worse disease-free (p = 0.0053) and overall survival (p = 0.0299) in high-grade, malignant tumors of the entire AOCS dataset. In independent analysis of each histology, the stem-like subtype was associated with poorer disease-free survival in high-grade serous (p = 0.0447), but was not a significant predictor in high-grade endometrioid tumors (p = 0.278). In the Crijns and Dressman datasets, which were exclusively high-stage serous tumors, the stemness molecular classifier identified a small subset of differentiated subtype tumors with better overall survival (
The stem-like subtype was associated with phenotypes often predictive of poor outcome (
Despite these associations, in multivariate analysis, the stemness bipartition remained a strong predictor of worse disease-free survival. The bipartition remained a significant predictor of disease-free survival when adjusting for one (p<0.005) or two variables among stage, grade, and residual disease (
Variable | Hazard ratio | Lower limit (95% CI) | Upper limit (95% CI) | P-value |
Stem-like subtype | 1.75 | 1.00 | 3.05 | 0.0498 |
Residual disease | 1.43 | 1.02 | 2.00 | 0.0374 |
Stage | 7.30 | 2.57 | 20.8 | 0.000195 |

Stem-like subtype | 2.17 | 1.22 | 3.85 | 0.00818 |
Grade | 1.16 | 0.586 | 2.29 | 0.674 |
Stage | 7.72 | 2.80 | 21.2 | 7.64×10^−5 |

Stem-like subtype | 2.36 | 1.32 | 4.22 | 0.00370 |
Grade | 1.42 | 0.672 | 3.01 | 0.358 |
Residual disease | 1.76 | 1.26 | 2.46 | 0.000872 |
p<0.05,
p<0.01,
p<0.001.
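As a consistency check on the tabulated hazard ratios, the Wald p-value implied by a hazard ratio and its 95% confidence interval can be approximated from the standard relations log(HR) ± 1.96·SE. The helper below is a hypothetical illustration and only recovers the reported p-values up to the rounding of the tabulated limits.

```python
from math import erfc, log, sqrt


def wald_p_from_ci(hazard_ratio, lower, upper, z_crit=1.959964):
    """Approximate two-sided Wald p-value from a hazard ratio and its 95% CI."""
    log_hr = log(hazard_ratio)
    # The CI spans 2 * z_crit standard errors on the log scale.
    se = (log(upper) - log(lower)) / (2 * z_crit)
    z = log_hr / se
    return erfc(abs(z) / sqrt(2))


# Stem-like subtype rows from the first and third models in the table.
p_model_1 = wald_p_from_ci(1.75, 1.00, 3.05)
p_model_3 = wald_p_from_ci(2.36, 1.32, 4.22)
```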
In regression analyses, the ordinal variables stage and grade were broken into multiple components using default functions in R. However, only the linear components (levels treated as a continuous variable) are displayed in the table because the other components were not significant. Grade was also coded as a quadratic component (grade 2 vs. grades 1 and 3) and stage as both quadratic (stages 2 and 3 vs. stages 1 and 4) and cubic (stage 2 > stage 4 > stage 1 > stage 3) components.
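The linear, quadratic, and cubic components described above correspond to orthogonal polynomial contrasts, which are R's default coding for ordered factors (`contr.poly`). The sketch below rebuilds such contrasts in plain Python by Gram-Schmidt orthogonalization, assuming equally spaced levels; the function name is hypothetical.

```python
from math import sqrt


def poly_contrasts(n_levels):
    """Orthonormal polynomial contrasts for equally spaced ordinal levels.

    Returns a list of contrast vectors: linear, quadratic, cubic, ...
    """
    scores = [i - (n_levels - 1) / 2 for i in range(n_levels)]
    basis = [[1.0] * n_levels]  # constant term, used only for orthogonalization
    contrasts = []
    for degree in range(1, n_levels):
        vec = [s ** degree for s in scores]
        for b in basis:
            coef = sum(v * x for v, x in zip(vec, b)) / sum(x * x for x in b)
            vec = [v - coef * x for v, x in zip(vec, b)]
        norm = sqrt(sum(v * v for v in vec))
        vec = [v / norm for v in vec]
        basis.append(vec)
        contrasts.append(vec)
    return contrasts


grade_linear, grade_quadratic = poly_contrasts(3)
stage_linear, stage_quadratic, stage_cubic = poly_contrasts(4)
# Stages ordered by their cubic contrast weight, largest first.
cubic_order = [i + 1 for i in sorted(range(4), key=lambda i: -stage_cubic[i])]
```

For four levels the cubic weights are proportional to (−1, 3, −3, 1), which gives the ordering stage 2 > stage 4 > stage 1 > stage 3, and the quadratic weights contrast the middle levels against the extremes.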
A recent pathogenesis model of ovarian cancer divides tumors into Type I, which is low-grade and histologically diverse, and Type II, which is high-grade and mostly serous
Type I | Type II | Stemness bipartition |
Possibly | No | Differentiated subtype is enriched in genes related to cilia and is more similar to normal fallopian tube. |
Serous, endometrioid, mucinous, clear cell | Mostly serous | Serous ovarian cancer is overrepresented in the stem-like subtype, while the differentiated subtype has mixed histology ( |
KRAS, BRAF, PTEN, CTNNB1, ERBB2, PIK3CA | Mostly p53 | Stem-like subtype is enriched in p53-mutant tumors and p53 mutation-associated genes. Differentiated subtype is enriched in other mutations. |
Sometimes | No | Almost all of the 18 LMP tumors in the entire AOCS dataset were classified as differentiated. |
Low grade | High grade | Stem-like subtype has higher grade across datasets. |
25% | 75% | The stem-like subtype comprises 83% of tumors in the original bipartition and 77% in the entire AOCS dataset. |
10% | 90% | The stem-like subtype accounts for 91% of deaths in the original bipartition and 94% in the entire AOCS dataset. |
Both Type II and the stem-like subtype are associated with poor prognosis, high-grade serous tumors (
Mutations characteristic of Type II ovarian cancer are found in the stem-like subtype. Type II tumors are thought to arise from precursor lesions in fallopian tube epithelium and have “p53 signatures” that have strong p53 immunoreactivity and usually p53 mutations
In contrast, “differentiated” tumors and Type I tumors describe histologically diverse and mostly (although not exclusively) low-grade and LMP tumors (
Type I tumors purportedly arise from either ovarian surface epithelium that undergoes metaplasia or epithelium of fallopian tube, endometrium, or peritoneum that proliferates after being trapped in ovarian cortical inclusion cysts
Consistent with the hypothesis that Type I (but not Type II) tumors are enriched for expression of genes associated with cilia
Therefore, the Type I/II and stemness classifications are similar in terms of grade, histological subtype, prevalence, and lethality, in addition to cell of origin, presence of ciliated cells, and mutation status (
Because gene expression profiling was performed on different technological platforms, only 37 of the 51 genes used for classification were present in all four ovarian cancer datasets used for clinical validation (remaining AOCS, Wu, Dressman, and Crijns). Of these, a subset of 12 were consistently over-expressed (p<0.05 after FDR correction) in either the stem-like or the differentiated subtype across the four datasets and are thus the most robustly expressed. The six stem-like subtype genes were UVRAG, CXCR4, RGS19, RAD51AP1, PSAT1, and CXCL10, and the six differentiated subtype genes were FOXA2, EIF1, MTUS1, DFNB31, TRAF3IP2, and SLC22A5. Despite enrichment in gene expression of the targets of Nanog, Oct4, and Sox2 in the stem-like subtype (
The cancer stem cell theory proposes that a subpopulation of cells inherit or acquire stem-like properties that enable them to survive therapy and drive recurrent tumor growth, but the function and identification of such stem cells is controversial both in normal and malignant ovarian and fallopian tube tissue
Tumors identified as being of the stem-like subtype have higher tumor grade and significantly worse prognosis, properties that were reproducible in independent and heterogeneous ovarian cancer microarray datasets; the associations between stem cell-like gene expression and grade or survival have been observed before but have not been explored in ovarian cancer
The classification is also valuable because of similarity to and support for a subtype classification associated with distinct pathogenesis pathways. Type I ovarian cancer, which includes low-grade and histologically heterogeneous tumors that may arise in the ovary, is similar to the differentiated subtype while Type II, which includes high-grade and mostly serous tumors that arise in the fallopian tube, is similar to the stem-like subtype
Our signature classified a small number of high-grade, serous tumors, which would normally be classified as Type II, as Type I. The identification of good prognosis, high-grade serous carcinomas may reflect novel biological insight or gene expression patterns of tumors that were originally low-grade and became high-grade
The 51 classification genes we identified may provide insight into pathogenesis of Type I and II ovarian cancer. FOXA2, which is consistently up-regulated in the differentiated subtype, is an inhibitor of the epithelial-mesenchymal transition associated with invasion and metastasis
The stem-like subtype was enriched in stem cell-related gene sets, including gene targets of stem cell markers Oct4 and Nanog, which are reported to be associated with grade and stage in serous ovarian cancer
The stem-like subtype was predominant in high-grade serous ovarian and basal-like breast cancer, supporting common biological connections between these cancers that also share BRCA1 dysfunction
We present a prognostic stem-like subtype classification of ovarian cancer which provides a genomic context for the Type I/II classification. Though it does not explain all variation in survival, in combination with other clinical indicators, it may have the potential to explain prognosis with greater accuracy than currently used clinical variables can alone. This classification requires further experimental validation in a large cohort of patients to characterize the properties of each molecular subtype and their association with Type I and II ovarian cancer, and to demonstrate possible clinical application. Our study provides support for the recently described Type I/II model of ovarian cancer and provides a molecular signature for stratification of these subtypes.
This file contains: Figure S1 Distributions of stemness bipartition diagonal linear discriminant (DLD) scores in each dataset. In the ovarian cancer datasets, the DLD scores were all bimodal, and Gaussian mixture modeling
(DOCX)
We thank Professor Ronny Drapkin and Dr. Panos Konstantinopoulos for useful discussions. We are grateful to Professor Win Hide and his group for assistance in curation of stem cell gene signatures and for interpretation of results and Ms. Renee Rubio and Ms. Hui-Ying Piao for technical support.