Conceived and designed the experiments: SSJ AAP AHT HZ SHD MNM SRQ RFP RWD. Performed the experiments: AAP HZ GD. Analyzed the data: MAC AR GB HZ SSJ SHD. Contributed reagents/materials/analysis tools: MLT RHA RWC JAM SS AWK JMF FES SRQ. Wrote the paper: SSJ SHD AAP AHT MNM HZ. Invented the MagSweeper: AHT AAP MNM RFP RWD SSJ. Optimized MagSweeper configuration and performance: AAP AHT.
Current address: Department of Obstetrics and Gynecology, Stanford University School of Medicine, Stanford, California, United States of America
Current address: Department of Diagnostic Research, Illumina, Inc., Hayward, California, United States of America
Current address: Novartis Institutes for Biomedical Research, Cambridge, Massachusetts, United States of America
Current address: Cancer Research Institute, College of Medicine, Xiangfan University, Xiangyang, Hubei, China
Dr. Stefanie Jeffrey, Dr. Ashley Powell, Dr. AmirAli Talasaz, Dr. Michael Mindrinos, Dr. Fabian Pease, and Dr. Ronald Davis are inventors of the MagSweeper technology used in this study. Stanford University has licensed this technology to Illumina, Inc., and receives licensing royalties. Dr. Jeffrey has donated her royalties to support student programs at The Jackson Laboratory, a nonprofit biomedical research institution. Dr. AmirAli Talasaz is currently employed by Illumina. Dr. Stephen Quake is a co-founder of Fluidigm Corporation and member of its board of directors and scientific advisory board. Fluidigm chips were used in this study. This does not alter the authors’ adherence to all the PLoS ONE policies on sharing data and materials.
To improve cancer therapy, it is critical to target metastasizing cells. Circulating tumor cells (CTCs) are rare cells found in the blood of patients with solid tumors and may play a key role in cancer dissemination. Uncovering CTC phenotypes offers a potential avenue to inform treatment. However, CTC transcriptional profiling is limited by leukocyte contamination; an approach to surmount this problem is single cell analysis. Here we demonstrate feasibility of performing high dimensional single CTC profiling, providing early insight into CTC heterogeneity and allowing comparisons to breast cancer cell lines widely used for drug discovery.
We purified CTCs using the MagSweeper, an immunomagnetic enrichment device that isolates live tumor cells from unfractionated blood. CTCs that met stringent criteria for further analysis were obtained from 70% (14/20) of primary and 70% (21/30) of metastatic breast cancer patients; none were captured from patients with non-epithelial cancer (n = 20) or healthy subjects (n = 25). Microfluidic-based single cell transcriptional profiling of 87 cancer-associated and reference genes showed heterogeneity among individual CTCs, separating them into two major subgroups, based on 31 highly expressed genes. In contrast, single cells from seven breast cancer cell lines were tightly clustered together by sample ID and ER status. CTC profiles were distinct from those of cancer cell lines, questioning the suitability of such lines for drug discovery efforts for late stage cancer therapy.
For the first time, we directly measured high dimensional gene expression in individual CTCs without the common practice of pooling such cells. Elevated transcript levels of genes associated with metastasis
To cure epithelial-based cancers, such as cancers of the breast, prostate, lung, colon, and pancreas, therapies need to be directed toward those cells that cause metastases. Lethal epithelial cancers generally originate in a primary tumor and then spread (metastasize) to other organs by shedding cells into the bloodstream and/or lymphatic channels. Disseminating metastatic cells may lodge, remain dormant for varying amounts of time, and ultimately grow as secondary tumors in other body sites. Secondary tumors may re-seed additional metastatic cells into the bloodstream
While considerable progress has been made towards elucidating the basic biology of primary tumors to guide therapy, the molecular characterization of metastatic disease, which generally occurs months or years after primary tumor excision, remains limited. The treatment of patients with metastatic disease continues to be based largely on biomarkers from their primary tumor, despite frequent discordance between primary and metastatic cancer
CTCs are rare epithelial cells present in cancer patient blood amidst approximately 5×10⁹ anuclear red blood cells and 5–10×10⁶ nucleated white blood cells (leukocytes) per ml. Due to the general absence of epithelial cells in normal blood, the standard definition of a CTC is an epithelial cell found in the blood of a patient with cancer, confirmed by 1) visualization of an intact nucleus using DAPI, 4′,6-diamidino-2-phenylindole, a DNA-binding fluorescent stain; 2) expression of cytokeratin; and 3) lack of expression of the white blood cell marker, CD45, the leukocyte-common antigen gene
According to the current standard of care, which includes surgical resection of primary tumors, CTCs identifiable in the blood of patients with metastatic recurrence must, by definition, derive from metastatic foci. The number of CTCs in blood samples has been shown to correlate with clinical outcome in patients with metastatic breast, prostate, colorectal, and lung cancer
Intratumoral heterogeneity of primary breast cancers is well illustrated by the presence of distinct oncogene mutations even within a single microscopic field of tumor tissue
This study was reviewed and approved by Stanford’s Human Subjects Research Compliance Board and adhered to HIPAA regulations. All human subjects signed informed consent prior to blood sample collection.
MCF7, SKBR3, T47D and MDA-MB-231 breast cancer cell lines were purchased from American Type Culture Collection (ATCC) and tested to be free of mycoplasma contamination. Since these cell lines were originally derived from disseminated lesions of the human host (
Study participants with primary and metastatic breast cancer were recruited through the Stanford Breast Oncology Clinic at the discretion of their treating medical oncologists. Blood was collected by venipuncture or from implanted venous access ports or both into 10 mL BD Vacutainer plastic EDTA tubes (Becton Dickinson). The first 9 ml tube of blood from each blood draw was discarded to prevent contamination by skin epithelial cells from the needle puncture site. Then, approximately 9 ml of blood was collected from each human subject and kept at room temperature. All blood samples were processed within three hours of collection.
To isolate CTCs, whole blood was labeled with 4.5 µm magnetic beads (Dynabeads Epithelial Enrich, Invitrogen) coated with the monoclonal BerEP4 antibody against human EpCAM (epithelial cell adhesion molecule, formerly known as TACSTD1). Cells were labeled at room temperature with constant mixing for one hour. The samples were then diluted with PBS and processed for capture by a sweeping magnetic device - the MagSweeper (
A. MagSweeper device showing magnetic rods sheathed in plastic above the capture, wash and release stations. B. A diagrammatic view of the MagSweeper cell isolation protocol. C. A controlled shear force produced by the movement of the magnetic rods in the wash station releases non-specifically bound blood cells. For cells with attached magnetic beads (black circles), the magnetic rod produces a magnetic force in z proportional to the nonuniformity (dB²/dz) of the magnetic field, thus imparting momentum in z proportional to (dB²/dz) and to a dwell time that depends both on the sweep speed and on the velocity distribution across the boundary layer that extends into the fluid from the surface of the sheath, optimizing capture of labeled cells and release of contaminating unlabeled cells. D. Photomicrograph (200X) of a CTC labeled with 4.5 µm immunomagnetic beads isolated from a patient with metastatic breast cancer. Magnetic beads are small dark spheres; the CTC appears as a translucent cell surrounded by clusters of beads.
Single tumor cells contain picogram quantities of RNA, insufficient for reproducible whole genome microarray analysis. Target genes were preamplified using TaqMan gene expression assays (20x) (Applied Biosystems) and the CellsDirect qRT-PCR kit (Invitrogen). The TaqMan gene expression assays (20x) were combined and diluted with TE (Tris and EDTA) buffer to yield a 0.2x assay mixture. The pre-amplification was done in a 10 µl volume including 5.0 µl CellsDirect 2x Reaction Mix, 2.5 µl combined assay mixture, 1 µl of PBS containing the target cell [or human reference RNA (Stratagene)], 0.5 µl TE (pH 8.0), and 1 µl RT-Taq enzyme. The RT step was performed at 50°C for 15 minutes, followed by 18 cycles of amplification (95°C for 15 seconds and then 60°C for 4 minutes). Pre-amplified cDNA was diluted 5-fold in TE buffer and stored at −20°C.
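As a quick consistency check on the reaction setup described above, the listed component volumes can be summed and the assay dilutions worked through. This is only an arithmetic sketch; the variable names are illustrative and not part of the published protocol.

```python
# Component volumes (µl) of the pre-amplification reaction, as listed in the text.
components_ul = {
    "2x Reaction Mix": 5.0,
    "combined 0.2x assay mixture": 2.5,
    "PBS containing the target cell": 1.0,
    "TE (pH 8.0)": 0.5,
    "RT-Taq enzyme": 1.0,
}
total_ul = sum(components_ul.values())  # should equal the stated 10 µl

# Dilution of the 20x assay stocks to the 0.2x pooled mixture.
pool_dilution_factor = 20.0 / 0.2

# Assay concentration once 2.5 µl of the 0.2x pool sits in the full reaction.
final_assay_x = 0.2 * components_ul["combined 0.2x assay mixture"] / total_ul

# The 2x Reaction Mix ends up at 1x in the full reaction.
final_mix_x = 2.0 * components_ul["2x Reaction Mix"] / total_ul

print(f"total volume: {total_ul:.1f} µl")
print(f"pool dilution: {pool_dilution_factor:.0f}-fold")
print(f"final assay concentration: {final_assay_x:.2f}x")
print(f"final reaction mix concentration: {final_mix_x:.0f}x")
```

The volumes add up to the stated 10 µl, and the 2x Reaction Mix is at 1x in the final reaction.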
TaqMan Universal Master Mix (Applied Biosystems) and 96.96 dynamic array chips, together with the NanoFlex™ 4-IFC Controller and the BioMark Real-Time PCR System (Fluidigm Corporation), were used for chip based high throughput qRT-PCR arrays, performed following the standard Fluidigm protocol
CT readings with a BioMark software quality check score <0.65 or with CT ≥35 were treated as missing/immeasurable; for all other readings, we considered the gene expressed. The following ten genes were excluded because: 1)
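The detection rule stated above (quality score <0.65 or CT ≥35 treated as missing) can be sketched as a small filter. This is a minimal illustration, not the authors' code: the function names and the use of `None` for a missing reading are assumptions.

```python
from typing import Optional

QUALITY_MIN = 0.65  # readings with a quality score below this are rejected
CT_MAX = 35.0       # readings with CT at or above this are rejected


def filter_ct(ct: Optional[float], quality: float) -> Optional[float]:
    """Return the CT value if it passes both thresholds, else None (missing)."""
    if ct is None or quality < QUALITY_MIN or ct >= CT_MAX:
        return None
    return ct


def is_expressed(ct: Optional[float], quality: float) -> bool:
    """A gene counts as expressed in a cell when its reading is not missing."""
    return filter_ct(ct, quality) is not None


# Illustrative (made-up) readings as (CT, quality score) pairs.
readings = [(22.4, 0.90), (22.4, 0.64), (35.0, 0.99), (34.9, 0.65)]
calls = [is_expressed(ct, q) for ct, q in readings]
print(calls)
```

Note that both boundaries follow the text literally: a quality score of exactly 0.65 passes, while a CT of exactly 35 does not.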
At this stage, many more CTCs had been isolated from some patients than from others. To balance the analysis, the samples were randomly subsampled to exactly seven cells from each of the seven cell lines and at most five CTCs per patient. The resulting set of cell lines and CTCs comprised the analysis set used in statistical summaries and heatmaps. To normalize the expression, we computed for each sample the mean CT of the reference panel of
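The balancing step (exactly seven cells per cell line, at most five cells per patient) can be illustrated with a short random subsampling sketch. The function name, the cell and source identifiers, the seed, and the choice to raise an error when a cell line has too few cells are all assumptions made for illustration; the text does not describe how the random draw was implemented.

```python
import random
from collections import defaultdict


def subsample(cells, cap, rng, exact=False):
    """Randomly keep at most `cap` cells per source (patient or cell line).

    cells: iterable of (cell_id, source_id) pairs.
    exact: if True, every source must supply exactly `cap` cells.
    """
    by_source = defaultdict(list)
    for cell_id, source_id in cells:
        by_source[source_id].append(cell_id)

    kept = {}
    for source_id, ids in sorted(by_source.items()):
        if exact and len(ids) < cap:
            raise ValueError(f"{source_id}: only {len(ids)} cells, need {cap}")
        kept[source_id] = sorted(rng.sample(ids, min(cap, len(ids))))
    return kept


rng = random.Random(2012)  # arbitrary seed so the draw is repeatable

# Hypothetical inputs: one patient with 8 cells, one with 3, one line with 9.
ctc_cells = [(f"P1_cell{i}", "P1") for i in range(8)]
ctc_cells += [(f"P2_cell{i}", "P2") for i in range(3)]
line_cells = [(f"LINE_A_cell{i}", "LINE_A") for i in range(9)]

ctc_kept = subsample(ctc_cells, cap=5, rng=rng)                # at most five
line_kept = subsample(line_cells, cap=7, rng=rng, exact=True)  # exactly seven
print({k: len(v) for k, v in ctc_kept.items()}, len(line_kept["LINE_A"]))
```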
To produce heatmap images of the data, the expression values were truncated to a range of ±3 standard deviations of the centered expression (across all genes); missing values were drawn in black. To cluster the data, first, missing values were replaced by plugging in the minimum value of −3 standard deviations, reflecting the low levels of expression that they represent. Then standard hierarchical clustering was used with the Euclidean distance metric. All analyses were performed using R software version 2.13.1 (
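The truncation, imputation and clustering steps can be sketched as follows. The original analysis was run in R; this Python version is only an illustration under stated assumptions: the input is taken to be already centered, the standard deviation is pooled over all observed values, and complete linkage is used because the text does not name a linkage method. All function names and the toy data are hypothetical.

```python
import math


def clip_and_impute(matrix, n_sd=3.0):
    """Clip centered values to +/- n_sd SD and replace missing (None) by -n_sd SD.

    matrix: list of rows (cells), each a list of centered expression values.
    """
    observed = [v for row in matrix for v in row if v is not None]
    mean = sum(observed) / len(observed)
    sd = math.sqrt(sum((v - mean) ** 2 for v in observed) / (len(observed) - 1))
    lo, hi = -n_sd * sd, n_sd * sd
    return [[lo if v is None else min(max(v, lo), hi) for v in row]
            for row in matrix]


def euclidean(a, b):
    """Euclidean distance between two equal-length expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hierarchical_clusters(rows, k):
    """Agglomerative clustering (complete linkage, assumed) down to k clusters."""
    clusters = [[i] for i in range(len(rows))]

    def linkage(c1, c2):
        return max(euclidean(rows[i], rows[j]) for i in c1 for j in c2)

    while len(clusters) > k:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        a, b = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return sorted(clusters)


# Toy data: four cells by three genes, None marks an undetectable transcript.
cells = [
    [1.0, 1.2, None],
    [0.9, 1.1, -0.2],
    [-1.0, None, 0.1],
    [-1.1, -0.9, 0.0],
]
filled = clip_and_impute(cells)
groups = hierarchical_clusters(filled, k=2)
print(groups)
```

Replacing missing values with the floor value treats an undetectable transcript as very low expression rather than discarding the cell, which is the rationale the text gives for this choice.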
To test whether sample processing with the MagSweeper itself altered gene expression profiles, we measured the expression of a subset of 15 genes in breast cancer cell lines before and after cell processing. Overall gene expression pattern was not altered during the labeling or dynamic capture processes of our MagSweeper isolation protocol, although we noted that even within clonally-derived cell cultures before processing, some variation exists at the single cell level (
A. Gene expression heat maps of CT measurements of 15 genes by microfluidic qRT-PCR assays performed on single MCF7 cells before and after labeling and capture by the MagSweeper. Each gene is measured in triplicate for each single cell. Some single cell expression variation is inherent among individual cells, but the overall pattern showed no marked effect by our isolation protocol. B. Average plating efficiency (percent of single cells that formed colonies after seven days) of MCF7 cells; either control, labeled with beads, or labeled and captured by the MagSweeper, performed in triplicate. This demonstrates that cell viability was not affected by our purification protocol.
We next demonstrated that high dimensional single cell analysis reliably characterizes tumor cells using 96.96 Dynamic Arrays to measure the expression of 87 cancer-associated and reference genes in individual cells isolated from primary and metastatic breast cancer cell lines. This exploratory panel of genes was selected from the published literature and our previous work in breast cancer gene expression for their role in molecular pathways relevant to breast cancer and to represent breast cancer biomarkers, prognostic markers, and phenotypes associated with cancer signaling pathways, epithelial-mesenchymal transition (EMT), cancer stem cells, and metastasis, as well as phenotypes indicative of contaminating leukocytes (
Initially, we tested assay reproducibility for single cell high dimensional profiling on randomly selected cells from each of three primary (CCdl054, CCdl672, CCdl675) and four metastatic breast cancer cell lines (T47D, MCF7, SKBR3, and MDA-MB-231). Hierarchical clustering was performed with expression data for 87 selected genes normalized by
A. Heatmap of single cell gene expression of 87 genes within seven individual cells isolated from three primary tumor-derived (pink: CCdl054, orange: CCdl672, gold: CCdl675) and four metastatic effusion-derived (red: MDA-231, plum: SKBR3, dark green: MCF7, and bright green: T47D) breast cancer cell lines. Yellow indicates high gene expression; gray is median expression; blue indicates low expression; and black represents undetectable expression. All cells showed expected expression patterns. The breast cancer cell lines used represent a spectrum of cell differentiation, e.g., from less differentiated and more mesenchymal/stem cell-like ER-negative (basal-like) cells (MDA-231 and SKBR3) to more differentiated ER-positive (luminal-like) cells represented by CCdl054, CCdl672, CCdl675, MCF7, and T47D.
We used the MagSweeper to process blood samples from 45 patients without epithelial cancer: 25 healthy volunteers and 20 lymphoma patients. None had detectable cells in the capture buffer.
For cells captured from breast cancer patient blood samples, gene expression was measured in a total of 510 patient cells isolated by the MagSweeper. These represented 65 blood samples from 50 patients: 20 primary breast cancer patients without detectable metastatic disease, and 30 metastatic breast cancer patients (
In the hierarchical clustering analysis of CTCs, to avoid individual patient bias, no more than 5 independent RNA samples derived from EpCAM-captured
Thirty-one of the 87 genes evaluated were consistently detectable in at least 15 percent of the CTCs analyzed. Aside from 3 reference genes (
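The gene selection rule described here (keep genes detectable in at least 15 percent of the analyzed CTCs) can be expressed as a simple detection-rate filter. This is a sketch with made-up gene names and readings; `None` is assumed to mark an undetectable transcript.

```python
def detected_genes(expr, min_fraction=0.15):
    """Return genes detected in at least `min_fraction` of cells.

    expr: dict mapping gene name -> list of CT values across cells,
          with None marking an undetectable reading.
    """
    keep = []
    for gene, values in expr.items():
        fraction = sum(v is not None for v in values) / len(values)
        if fraction >= min_fraction:
            keep.append(gene)
    return keep


# Hypothetical readings for 20 cells: GENE_A detected in 4, GENE_B in 2.
expr = {
    "GENE_A": [24.1, 25.3, 23.8, 26.0] + [None] * 16,
    "GENE_B": [30.2, 31.5] + [None] * 18,
}
selected = detected_genes(expr)
print(selected)
```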
Unsupervised clustering analysis based on the above-mentioned subset of commonly expressed genes stratified CTCs into: (a) Cluster I, a relatively small cluster comprising 21 cells from 13 patients, and (b) Cluster II, a larger cluster comprising 84 cells from 30 patients (
Heatmap of single cell gene expression for 31-gene subset data derived from 105 CTCs isolated from patients with primary and metastatic breast cancer. Yellow indicates high gene expression; gray is median expression; blue indicates low expression; and black represents undetectable expression. The samples reveal two robust clusters for CTCs (lavender: Cluster I; turquoise blue: Cluster II). In addition to epithelial markers, these genes include pathways associated with EMT, metastasis, and AKT/mTOR signaling.
A final important observation was that unlike breast cancer cell lines, CTCs did not cluster by case ID. Eight (23%) cases were represented in both clusters; 5 cases were exclusive to Cluster I, and 22 cases to Cluster II (
CTC Cluster | Total cases | Number of CTCs | Median age at primary Dx (yrs) | Primary (%) | Metastatic (%) | ER- or PR-pos (%) | HER2-pos (%) | Triple Negative (%) |
I | 13 | 21 | 43 | 4/13 (31) | 9/13 (69) | 6/13 (46) | 2/13 (15) | 5/13 (38) |
II | 30 | 84 | 45 | 12/30 (40) | 18/30 (60) | 12/30 (40) | 5/30 (17) | 13/30 (43) |
In an effort to evaluate the similarities between widely used experimental tumor cell models and patient derived tumor cells, we combined single cell expression data from primary and metastatic breast cancer cell lines and from CTC samples in a clustering analysis of 154 individual cells. When all 87 test genes were considered in this comparison, cell lines and CTCs did cluster apart, but CTC subclassification was not robust, likely due to the large number of values resulting from undetectable transcript levels (
Heatmap of single cell gene expression for 31-gene subset data derived from seven breast cancer cell lines and 105 CTCs isolated from patients with primary and metastatic breast cancer. Yellow indicates high gene expression; gray is median expression; blue indicates low expression; and black represents undetectable expression. The samples reveal two robust clusters for CTCs (lavender: Cluster I; turquoise blue: Cluster II) and two clusters representing primary (pink: CCdl054, orange: CCdl672, gold: CCdl675) and metastatic cell lines. Note dendrogram branches that cluster ER-negative cell lines (red: MDA-231; plum: SKBR3) and ER-positive cell lines (dark green: MCF7, and bright green: T47D).
Over the past several years, a major factor enabling the continued characterization of surgically resected tumor tissue has been the highly enriched content of malignant cells in the sample, which facilitates direct assays on primary tumor cell populations. In contrast, studying the biology of cells that successfully disseminate from the primary tumor site requires prior separation from normal components within patient blood. We have developed a cell purification technology, the MagSweeper, which gently isolates rare CTCs with high specificity. Our previous studies have shown that the MagSweeper can be used reliably to extract functional human CTCs from the blood of mice implanted with human tumor xenografts, which retain both their tumor-initiating and metastasizing capacities
Analyzing tumor cells by their genomic and transcriptomic profiles has been an important first step towards understanding cancer biology. For example, gene expression profiling of primary tumors and its application in the molecular subtyping of breast cancer has provided a biological framework for defining the clinical heterogeneity of this disease. Although an aggressive basal breast tumor subtype was evident with select biomarkers long before the advent of genomics
Single cell analysis depicts the true diversity of a heterogeneous population. We found that single CTCs displayed striking quantitative variability within a wide spectrum of genes that would have been obscured by analysis of pooled multiple cells. These analyses enabled us to identify different CTC subpopulations even within a single blood sample.
It is widely accepted that only a small minority of cells in the primary tumor are progenitors or “culprits” leading to deadly metastases. To cure cancer, such culprit cells need to be identified and characterized for targeted therapy. From the perspective of patient care, CTC biology may be more pertinent than primary tumor biology because some CTCs may follow paths to future metastatic seeding or home to specific metastatic sites. Profiling CTCs specifically refines analyses of those cells capable of entering blood vessels and surviving within the vasculature. In our study, the extracted CTCs were almost exclusively Triple Negative (lacking
The demonstration of numerical/quantitative associations between CTCs and clinical outcome in previous studies
Identifying metastatic cell diversity through CTC profiling could more effectively guide drug selection in late stage cancer patients, making it reasonable to speculate that patients whose blood contains CTCs with these diverse phenotypes could greatly benefit from optimized multidrug treatment regimens. Therapy that targets only one CTC population might not ablate other subpopulations, which may continue to spread and grow. High transcript levels of genes most commonly expressed in CTCs suggest valuable targeting opportunities prior to metastatic seeding. The finding of overexpression of a metastasis-associated calcium- and zinc-binding protein encoding gene -
Our expression profiling analyses demonstrated that CTC populations are relatively quiescent. Transcript levels of growth factors and their receptors, such as
We thank Luigi Warren, Tomer Kalisky, and Christina Fan for sharing their knowledge of single cell microfluidic assays, and Ma’ayan Leiberman and Loralee Lobato for assistance with patient sample collection.