The authors would like to declare that P.W. Laird holds the following issued patents on the MethyLight technology (“Process for High Throughput DNA Methylation Analysis”; U.S. Patent # 6,331,393; Date Filed: May 14, 1999; Date of Issue: December 18, 2001. “Process for High Throughput DNA Methylation Analysis”; U.S. Patent # 7,112,404; Date Filed: December 10, 2001; Date of Issue: September 26, 2006. “Process for High Throughput DNA Methylation Analysis”; Date Filed: December 10, 2001; Date of Issue: June 30, 2009), which have been licensed to Epigenomics AG. In addition, P.W. Laird, D.J. Weisenberger, and M. Campan are named as inventors on the following pending patent application for the Digital MethyLight technology (“DNA Methylation Analysis by Digital Bisulfite Genomic Sequencing and Digital MethyLight”; Pub App No: 20080254474; Date Filed: April 14, 2008). D.J. Weisenberger is a consultant for Zymo Research. There are no further patents, products in development or marketed products to declare. This does not alter the authors' adherence to all the PLOS ONE policies on sharing data and materials.
Conceived and designed the experiments: CPEL MC DJW PWL. Performed the experiments: CPEL MC TH HS. Analyzed the data: CPEL MC TH HS PWL. Contributed reagents/materials/analysis tools: CPEL RFS AEMJ HS PJMJK CMD RAEMT PWL. Wrote the paper: CPEL.
There is an increasing demand for accurate biomarkers for early non-invasive colorectal cancer detection. We employed a genome-scale marker discovery method to identify and verify candidate DNA methylation biomarkers for blood-based detection of colorectal cancer.
We used DNA methylation data from 711 colorectal tumors, 53 matched adjacent-normal colonic tissue samples, 286 healthy blood samples and 4,201 tumor samples of 15 different cancer types. DNA methylation data were generated by the Illumina Infinium HumanMethylation27 and the HumanMethylation450 platforms, which determine the methylation status of 27,578 and 482,421 CpG sites respectively. We first performed a multistep marker selection to identify candidate markers with high methylation across all colorectal tumors while harboring low methylation in healthy samples and other cancer types. We then used pre-therapeutic plasma and serum samples from 107 colorectal cancer patients and 98 controls without colorectal cancer, confirmed by colonoscopy, to verify candidate markers. We selected two markers for further evaluation: methylated
Our systematic marker discovery and verification study for blood-based DNA methylation markers resulted in two novel colorectal cancer biomarkers, THBD-M and C9orf50-M. THBD-M in particular showed promising performance in clinical samples, justifying its further optimization and clinical testing.
Colorectal cancer (CRC) is a common disease with an estimated 143,460 new cases in the USA in 2012
An optimal screening test is expected to be highly sensitive and specific, pose no risk to the patients, and have high patient acceptance. It should also be cost effective and easy to perform. As current screening procedures lack sufficient positive predictive value, require unpleasant preparation or cause discomfort, there is a need to develop new non-invasive tests for the detection of CRC at a stage early enough for treatment to be successful. DNA methylation markers are promising tools that could be useful for early cancer detection. In the past decade it has become clear that cancer cells have aberrant patterns of DNA methylation and that these abnormalities can be detected in tumor-derived DNA found in the plasma or serum of cancer patients
A number of studies have already reported the use of DNA methylation markers for blood-based detection of CRC with varying results
The local and regional institutional review boards approved this study. Informed consent was obtained from all participating patients and controls. Pre-therapeutic plasma and serum samples were obtained from CRC patients in the outpatient clinic via phlebotomy of the median cubital vein from April 2008 to December 2011. Plasma and serum were isolated within 30 minutes of venipuncture as previously described
Controls were defined as subjects without CRC or any malignancy in the past five years and were included in this study at the endoscopy department. Individuals undergoing colonoscopy, who showed no sign of a colorectal malignancy, were eligible to participate. Indications for colonoscopy for these patients were surveillance colonoscopies because of inflammatory bowel disease (IBD; Crohn's disease or Ulcerative Colitis), positive family history of CRC, gastro-intestinal complaints or rectal blood loss. An experienced gastroenterologist performed all colonoscopies. Patients with mild, controlled IBD were included as long as it was possible to reliably inspect the colonic mucosa at colonoscopy
CRC tissue was obtained during the surgical resection of the tumor and immediately sent to the pathologist. The pathologist dissected a representative part of the tumor and stored the fresh-frozen sample at −80°C within one hour after surgical resection. In addition, a pathologically normal colon sample was taken at least 10 cm away from the edge of the tumor and stored in the same way.
In the marker discovery phase of this study we used DNA methylation data generated by Illumina Infinium HumanMethylation27 BeadChip® (HM27) and the HumanMethylation450 BeadChip® (HM450) platforms. The Infinium assay quantifies DNA methylation levels at specific cytosine residues adjacent to guanine residues (CpG loci), by calculating the ratio (β-value) of intensities between locus-specific methylated and unmethylated bead-bound probes. The β-value is a continuous variable, ranging from 0 (unmethylated) to 1 (fully methylated)
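As a minimal sketch of the β-value described above, the function below computes the fraction of methylated signal from a pair of probe intensities. The function name and the optional `offset` term (a small constant that some Infinium pipelines add to the denominator to stabilize low-intensity probes) are assumptions for illustration and are not taken from this study.

```python
def beta_value(methylated: float, unmethylated: float, offset: float = 0.0) -> float:
    """Return the methylated fraction M / (M + U + offset).

    `methylated` and `unmethylated` are non-negative probe intensities.
    `offset` is a hypothetical stabilizing constant; 0.0 gives the plain ratio.
    """
    if methylated < 0 or unmethylated < 0 or offset < 0:
        raise ValueError("intensities and offset must be non-negative")
    total = methylated + unmethylated + offset
    if total == 0:
        # No signal at all: the beta-value is undefined.
        return float("nan")
    return methylated / total


if __name__ == "__main__":
    # Equal methylated and unmethylated signal gives a half-methylated locus.
    print(beta_value(500.0, 500.0))
```

With `offset` left at zero, the result runs from 0 for a fully unmethylated locus to 1 for a fully methylated one, matching the range given in the text.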
We used available Infinium HM27 and HM450 data from 711 colorectal tumors, 53 matched adjacent-normal colonic tissue samples and 10 peripheral blood lymphocyte (PBL) samples of healthy individuals to identify and verify candidate DNA methylation tumor markers. In addition, we used Infinium data from publicly available data sets (GEO and TCGA) representing 274 healthy PBL samples and 4,201 malignant tissue specimens from 15 different cancer types to maximize CRC specificity (see
DNA Methylation Analysis Platform | ||
Sample collections (TCGA abbr) | HM27 | HM450 |
GEO Colorectal cancer tumor | 100 | |
GEO PBL from healthy controls | 10 | 2 |
GEO Normal colorectal tissue | 29 | |
GEO PBL from healthy controls | 274 | |
TCGA Normal colorectal tissue (COAD/READ) | 24 | |
TCGA CRC tumor discovery set (COAD/READ) | 236 | 40 |
TCGA CRC tumor verification set (COAD/READ) | 335 | |
TCGA Acute myeloid leukemia (LAML) | 192 | 192 |
TCGA Bladder urothelial carcinoma (BLCA) | 78 | |
TCGA Breast invasive carcinoma (BRCA) | 316 | 498 |
TCGA Gastric adenocarcinoma (STAD) | 82 | 70 |
TCGA Glioblastoma multiforme (GBM) | 296 | |
TCGA Skin Cutaneous Melanoma (SKCM) | 241 | |
TCGA Lung adenocarcinoma (LUAD) | 128 | 222 |
TCGA Lung squamous cell carcinoma (LUSC) | 134 | 150 |
TCGA Ovarian serous adenocarcinoma (OV) | 405 | |
TCGA Pancreas (PAAD) | 30 | |
TCGA Prostate (PRAD) | 154 | |
TCGA Renal clear cell (KIRC) | 219 | 283 |
TCGA Thyroid carcinoma (THCA) | 230 | |
TCGA Head and Neck squamous cell carcinoma (HNSC) | 292 | |
TCGA Uterine corpus endometrioid carcinoma (UCEC) | 117 | 256 |
normal samples were obtained from surgical specimens of CRC patients, at least 10 cm from the tumor margins.
these samples were among the samples run on the HM27 platform.
We employed a multistep filtering process in the discovery phase of this study. DNA methylation data generated by the two different BeadChips (HM27 and HM450) were analyzed separately, but using the same filtering steps (
We used DNA methylation data from the Infinium HumanMethylation27 BeadChip (HM27) and HumanMethylation450 BeadChip (HM450) platforms to screen 27,578 (HM27) and 482,421 (HM450) CpG loci for their methylation status in CRC samples, PBL samples from healthy subjects, paired normal colorectal tissue samples (NC) and 15 other types of cancer (OC). We used a stepwise approach eliminating probes that failed in any of the samples, probes that contained SNPs or repeat sequences, and probes with a highest PBL β-value (β-PBLH) or a mean normal colon tissue β-value (β-NCM) higher than the associated 10th percentile of CRC tumor β-values (β-CRC10) or higher than 0.2 in any of the PBL or NC samples (Infinium panel). The remaining probes were ranked based on the difference between β-CRC10 and β-PBLH and the top 25 were selected from both datasets (HM27 and HM450) for filtering against OC samples. Probes with a mean OC β-value higher than the associated mean CRC β-value (β-CRCM) were eliminated. A total of 15 MethyLight reactions (markers) were designed for 10 probes and tested in a sequence of verification steps (MethyLight panel). Markers were eliminated if their performance was suboptimal in controls such as
A (top figure: HM27, bottom figure: HM450), scatterplots of the highest PBL β-value (β-PBLH) of 10 (HM27) and 2 (HM450) healthy control samples (X-axis) against the associated 10th percentile of CRC tumor β-values (β-CRC10) on the Y-axis. The blue dots represent the eliminated probes (HM27: n = 23,049; HM450: n = 367,833) and the red dots (HM27: n = 695; HM450: n = 30,207) represent the retained probes with a β-CRC10>β-PBLH or a β-PBLH<0.2. B, scatterplots of the mean normal colon tissue β-value (β-NCM) for the retained probes from Panel A (X-axis) against the associated β-CRC10 (Y-axis). The red dots (HM27: n = 512; HM450: n = 28,428) represent the eliminated probes, the green dots represent the retained probes (HM27: n = 183; HM450: n = 1779) with a β-CRC10>β-NCM or a β-NCM<0.2. C, scatterplots of the retained probes from Panel B (green) displayed by the difference between β-CRC10 and β-PBLH (X-axis) against the associated β-CRC10 (Y-axis). The dots within the yellow square are the probes selected for additional filtering against other types of cancer. The white arrows point out the probes of the two candidate markers. D, ROC curves for the probes used in the multiplex reaction based on methylation β-values of 335 independent colorectal cancer samples and 23 independent matched normal colorectal tissue samples (the DNA methylation data of these samples were not used in the marker discovery pipeline). The dark grey color is the area under the curve.
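The stepwise probe filtering summarized in the captions above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' pipeline: the input layout (a dictionary of β-value lists per probe), the function names, the linear-interpolation percentile, and the treatment of failed measurements as `None` are all hypothetical. The SNP and repeat-sequence filter is omitted, and the 0.2 ceiling is applied to every individual PBL and NC sample, which is one reading of the text.

```python
from statistics import mean


def percentile(values, q):
    """q-th percentile (0-100) by linear interpolation between order statistics."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def select_candidates(probes, top_n=25, ceiling=0.2):
    """Return probe IDs that survive the filtering steps.

    `probes` maps a probe ID to a dict with keys "crc", "pbl", "nc" and "oc",
    each a list of beta-values (None marks a failed measurement).
    """
    scored = []
    for probe_id, groups in probes.items():
        values = [v for key in ("crc", "pbl", "nc", "oc") for v in groups[key]]
        if any(v is None for v in values):
            continue  # probe failed in at least one sample
        crc10 = percentile(groups["crc"], 10)  # beta-CRC10
        pbl_high = max(groups["pbl"])          # beta-PBLH
        nc_mean = mean(groups["nc"])           # beta-NCM
        if pbl_high > crc10 or pbl_high > ceiling:
            continue  # too much methylation in healthy blood
        if nc_mean > crc10 or max(groups["nc"]) > ceiling:
            continue  # too much methylation in normal colon
        scored.append((crc10 - pbl_high, probe_id))
    # Rank by the gap between beta-CRC10 and beta-PBLH and keep the top probes.
    scored.sort(reverse=True)
    top = [probe_id for _, probe_id in scored[:top_n]]
    # Drop probes that are, on average, more methylated in other cancers than in CRC.
    return [p for p in top if mean(probes[p]["oc"]) <= mean(probes[p]["crc"])]


if __name__ == "__main__":
    toy = {
        "good": {"crc": [0.8, 0.9, 0.85, 0.7], "pbl": [0.02, 0.05],
                 "nc": [0.1, 0.05], "oc": [0.1, 0.2]},
        "blood_high": {"crc": [0.8, 0.9, 0.85, 0.7], "pbl": [0.5, 0.1],
                       "nc": [0.1, 0.05], "oc": [0.1, 0.2]},
    }
    print(select_candidates(toy))
```

Ranking on the gap between the 10th percentile of tumor values and the highest blood value favors probes that are methylated in nearly all tumors while staying close to zero in every healthy blood sample.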
DNA from two healthy PBL samples and 25 CRC tumor samples was extracted according to the previously described protocol
The MethyLight assay was performed as previously described
Digital MethyLight is a quantitative PCR technique in which bisulfite-converted DNA is analyzed using the MethyLight PCR assay in a distributive fashion over 96 reaction chambers for each sample. This technique is an efficient and effective method of obtaining DNA methylation information for samples with small amounts of DNA and was performed as described earlier
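To illustrate why distributing a sample over many chambers yields a molecule count, the sketch below applies the standard Poisson reasoning used in digital PCR generally. Whether any such correction was applied in this study is not stated in the text, so this is an assumption, and the function name is hypothetical. When few of the 96 chambers are positive, the estimate is close to the number of positive chambers.

```python
import math


def estimate_molecules(positive_wells: int, total_wells: int = 96) -> float:
    """Poisson estimate of the number of template molecules loaded on a plate.

    Assumes molecules are distributed independently over the wells, so the
    fraction of negative wells equals exp(-mean molecules per well).
    """
    if not 0 <= positive_wells <= total_wells:
        raise ValueError("positive_wells must lie between 0 and total_wells")
    if positive_wells == 0:
        return 0.0
    if positive_wells == total_wells:
        # Every well is positive: the plate is saturated, no finite estimate.
        return math.inf
    fraction_negative = 1 - positive_wells / total_wells
    return -total_wells * math.log(fraction_negative)


if __name__ == "__main__":
    for hits in (1, 10, 48):
        print(hits, round(estimate_molecules(hits), 2))
```

The estimate always equals or exceeds the number of positive chambers, because a chamber that received two or more molecules still produces only one signal.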
The two candidate markers that survived this elimination process were labeled with different fluorophores. This produced reaction-specific colored PCR signals that allowed us to distinguish hits from each marker when the two were run together (multiplex). All probes and primers were synthesized by Biosearch Technologies, Inc., Novato, California, USA.
The two-marker multiplex was tested on plasma and serum samples from 75 independent CRC patients and 70 controls with a test volume of 1 ml.
CONTROLS | CRC PATIENTS | ||||||||||
Sample set | Samples | Median Age in Year (Range) | Female/Male | Crohn's disease/Ulcerative colitis | Samples | Median Age in Year (Range) | Female/Male | Rectum n (%) | Colon n (%) | Stage | n (%) |
n | n | ||||||||||
Pooled set | 32 | 59 (40–85) | 16/16 | 7/9 | 31 | 63 (41–84) | 15/16 | 5 (16) | 26 (84) | I | 9 (31) |
II | 6 (19) | ||||||||||
III | 14 (45) | ||||||||||
IV | 2 (1) | ||||||||||
Independent set | 66 | 61 (39–85) | 22/48 | 2/8 | 75 | 72 (51–92) | 34/41 | 19 (25) | 56 (75) | I | 19 (25) |
II | 24 (32) | ||||||||||
III | 31 (41) | ||||||||||
IV | 1 (0) |
The computation of confidence intervals of areas under the curve (AUCs) and the statistical tests were conducted in R (version 2.14.0), with the R package pROC
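As an illustration of what an AUC with a confidence interval measures, the sketch below computes the AUC as the probability that a randomly chosen case scores higher than a randomly chosen control, then derives a percentile bootstrap interval. This is not the pROC implementation used in the study. The text does not state which interval method was used, the bootstrap is swapped in here for simplicity, and all names are hypothetical.

```python
import random


def auc(cases, controls):
    """AUC as the fraction of case/control pairs ranked correctly (ties count half)."""
    wins = 0.0
    for x in cases:
        for y in controls:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def bootstrap_auc_ci(cases, controls, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval, resampling cases and controls separately."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        resampled_cases = rng.choices(cases, k=len(cases))
        resampled_controls = rng.choices(controls, k=len(controls))
        stats.append(auc(resampled_cases, resampled_controls))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


if __name__ == "__main__":
    # Made-up molecule counts per sample, for illustration only.
    crc_counts = [0, 2, 5, 7, 30, 1, 0, 12]
    control_counts = [0, 0, 0, 1, 3, 0, 0, 0]
    print(auc(crc_counts, control_counts))
    print(bootstrap_auc_ci(crc_counts, control_counts))
```

Resampling cases and controls separately keeps the group sizes fixed in every bootstrap replicate, which suits a case-control design such as this one.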
We performed a stepwise marker discovery analysis using available DNA methylation data sets from a large number of CRC tumors, 15 different other cancer types, and control samples from plasma, PBL and matched adjacent-normal colonic tissues (
We designed and tested a total of 15 real time PCR-based MethyLight assays (markers) for the ten remaining probes. MethyLight-based techniques are highly sensitive methods for detection of methylated DNA molecules
We evaluated the performance of
Jitter plots representing Infinium-based DNA methylation β-values of
We developed a multiplex reaction for the two markers using different reporter dyes for each of the reactions. The THBD-M probe was labeled with a QUASAR fluorophore that results in a red fluorescent signal and the C9orf50-M probe was labeled with the blue FAM fluorophore. The primers and probes of the two markers were tested for interference by combining them in one solution at various concentrations using M.
A total of 106 CRC patients and 98 controls without CRC, verified by colonoscopy, were included in this prospective study. Paired serum and plasma samples were available from all controls and 103 CRC patients, while only plasma was obtained from three CRC patients. Although stage IV CRC was an exclusion criterion in this study, nonspecific abnormalities were seen on pre-operative imaging for three patients (e.g. small pulmonary nodules on CT-thorax), which later, but before surgery, turned out to be distant metastases. These patients were subsequently upstaged to stage IV CRC. Thirty-two plasma and serum samples from controls and CRC patients were previously used in the pooled sample analysis as mentioned above.
We tested the multiplexed Digital MethyLight assays for THBD-M and C9orf50-M markers on individual plasma samples from 75 CRC and 66 controls and on individual serum samples from 72 CRC and 66 controls.
Digital MethyLight was performed in 1 ml plasma (A) and serum (D) to detect THBD-M and C9orf50-M in CRC and control samples. The absolute number of molecules detected by the multiplex (sum of the two markers) reaction is recorded on the y-axis. The CRC samples are arranged by stage. Asterisks (*) indicate samples with more than 25 molecules detected (up to 153 molecules in plasma and 157 molecules in serum). ROC curves and AUCs (95% confidence intervals) of the different CRC stages in plasma (B) and serum (E) based on the number of detected molecules. ROC analysis and AUCs (95% confidence intervals) for THBD-M, C9orf50-M as individual reactions and as a multiplex reaction in plasma (C) and serum (F).
We also determined the CEA levels in preoperative serum samples from 107 CRC patients. An elevated serum CEA (≥5.0 ng/ml) was observed in 35/107 (33%) patients. For stage I CRC serum CEA was elevated in 14%, for stage II in 33%, for stage III in 39% and for stage IV in 67%.
One of the important shortcomings in the published CRC biomarker studies is the reliance on a candidate gene approach for marker discovery. This approach is often based on a nonsystematic selection of candidate marker genes, which are tested in healthy and cancerous tissues and then validated in a patient population
The application of Digital PCR to multiplexed MethyLight assays allowed for efficient use of valuable samples by simultaneously analyzing more than one marker without loss of sensitivity. This technology allows for the detection of single methylated DNA molecules against a large background of unmethylated molecules, and provides a quantitative PCR test result
Circulating free cancer DNA (cfDNA) has the potential to be tumor-specific and has a relatively short half-life, making it suitable as a biomarker
One of the technical factors that could influence diagnostic performance of a biomarker is test volume. For example, the
In this study, serum gave a slightly higher test performance than plasma for THBD-M and for the multiplex, but this difference was of borderline significance. Although it has been reported that serum contains more cfDNA than plasma, no large-scale studies have been published comparing serum and plasma as test medium for blood-based detection of malignant diseases. Hence, it remains unclear whether serum or plasma is the optimal test specimen
THBD-M outperformed C9orf50-M, and combining the two markers in a multiplexed assay did not increase test sensitivity. With a detection threshold of zero molecules per 1 ml plasma, THBD-M was able to detect 71% of all CRCs at a specificity of 80%. Interestingly, the detection rate for stage I/II CRC was 74% with this marker. In plasma, THBD-M had a higher sensitivity for the detection of colon cancers (77% for all stages) than of rectal tumors (53% for all stages). This difference was marginally significant (p = 0.07). Early stage colon cancers were also detected by this marker at a relatively high rate: 75% for stage I and 77% for stage II. It is known that a subset of right-sided colon tumors exhibits a high frequency of DNA hypermethylation at multiple promoter CpG islands, which is designated as CIMP
The fact that this diagnostic test detected a considerable fraction of mostly curable CRCs, with 5-year survival rates of 72%–93%
No blood-based markers have yet been approved by the FDA for early detection of CRC. Serum CEA is the only blood-based biomarker that is in use for CRC detection, but it lacks the sensitivity for primary CRC detection. Serum CEA measurement is used mainly as a follow-up tool after initial treatment, and yields a sensitivity of approximately 72% for the detection of liver metastasis and 60% for local recurrence, with specificities of 91% and 86% respectively
In conclusion, we identified two novel blood-based DNA methylation markers for early detection of CRC through a systematic genome-scale marker discovery and verification study. Of these two markers, THBD-M showed promising performance in clinical samples, justifying its further optimization and clinical testing.
We thank Joke van Zoest and all other employees of the clinical chemistry laboratory and of the departments of pathology and surgery of the Groene Hart Hospital, Wendy Plokkaar and Irene Janssens, the Nijbakker-Morra foundation, Bontius Foundation, Ketel-1 foundation, Michaël-van Vloten Foundation and all members of Peter W. Laird's and Ite A. Laird-Offringa's laboratories, the USC Epigenome Center, employees of the LUMC endoscopy department and the LUMC surgical research laboratory.