Conceived and designed the experiments: JO EEH PG. Performed the experiments: PG NM JKL AP AM JK MN. Analyzed the data: PG AT GH. Contributed reagents/materials/analysis tools: JW AAA KP AR TS JR. Wrote the paper: JO PG EEH.
The authors have declared that no competing interests exist.
Prostate cancer (PCa) and colorectal cancer (CRC) are the most commonly diagnosed cancers and cancer-related causes of death in Poland. To date, numerous single nucleotide polymorphisms (SNPs) associated with susceptibility to both cancer types have been identified, but their effect on disease risk may differ among populations.
To identify new SNPs associated with PCa and CRC in the Polish population, a genome-wide association study (GWAS) was performed using DNA sample pools on Affymetrix Genome-Wide Human SNP 6.0 arrays. A total of 135 PCa patients and 270 healthy men (PCa sub-study) and 525 patients with adenoma (AD), 630 patients with CRC and 690 controls (AD/CRC sub-study) were included in the analysis. Allele frequency distributions were compared with t-tests and χ2-tests. Only those significantly associated SNPs with a proxy SNP (
The GWAS selected six and 24 new candidate SNPs associated with PCa and CRC susceptibility, respectively. In the replication study, 17 of these associations were confirmed as significant in an additive model of inheritance. Seven of them remained significant after correction for multiple hypothesis testing. Additionally, 17 previously reported risk variants were identified, five of which remained significant after correction.
Pooled-DNA GWAS enabled the identification of new susceptibility loci for CRC in the Polish population. Previously reported CRC and PCa predisposition variants were also identified, validating the global nature of their associations. Further independent replication studies are required to confirm the significance of the newly uncovered candidate susceptibility loci.
Cancers are highly heterogeneous, polygenic disorders that arise in a multi-step process involving the selection of successive cellular clones and result from both genetic and specific environmental factors. Among the genetic factors, both high-penetrance mutations and low-penetrance polymorphisms may shape a patient's defense and adaptive mechanisms against exposure to carcinogenic factors, thereby determining susceptibility to the disease. However, the effect of any common low-penetrance risk determinant is small in isolation; susceptibility increases only through the cumulative effect of multiple risk variants
The association between allele frequency and susceptibility to disease can be studied by focusing on individually selected variants or, alternatively, by interrogating more than a million DNA variants at once using single nucleotide polymorphism (SNP) microarray technology. Microarray platforms used in genome-wide association studies (GWAS) represent a relatively mature technology that allows the entire genome to be scanned for potential associations with disease without prior knowledge of the variants' position or biological function. In theory, as a consequence of linkage disequilibrium (LD) between SNPs at a given locus, a high proportion of all diversity can be captured by genotyping a relatively small subset of markers (the so-called tagging SNPs)
To date, over 1,000 susceptibility loci, usually of small or modest effect and of low to moderately high accuracy, have been identified by GWAS
Prostate cancer (PCa) and colorectal cancer (CRC) are the most common types of cancer in the Polish population and the leading causes of cancer-related morbidity and mortality
A comprehensive GWAS-based analysis of variants conferring genetic susceptibility to CRC and PCa has not yet been conducted in the Polish population. A major reason for this is the high cost of SNP microarray technology, particularly considering that new loci identified by GWAS have been associated with progressively smaller effect sizes, which demands an increase in the statistical power (namely, the sample size) of GWAS. An alternative approach using pooled DNA samples has been developed
In this study, we describe a pooled DNA sample-based GWAS as a cost-effective alternative to identify genetic variants of moderate effect associated with CRC and PCa in the Polish population. Pooled DNA samples were processed using microarray technology, and GWAS was employed as a genetic variance filtering approach. The technical validation of the GWAS results and the replication studies on individual DNA samples were conducted using considerably cheaper PCR-based genotyping technology.
All enrolled patients and control subjects were Polish Caucasians recruited from two urban populations, Warsaw and Szczecin. The study was approved by the local ethics committee (Medical Center for Postgraduate Education and Cancer Center, Warsaw, Poland), and all participants provided written informed consent. The study protocol conforms to the ethical guidelines of the 1975 Declaration of Helsinki.
GWAS cohorts comprised: (1. AD/CRC sub-study) 525 patients (270 females and 255 males) diagnosed with colorectal adenomas (AD), 630 patients (240 females and 390 males) diagnosed with CRC and 705 healthy individuals (420 females and 285 males), and (2. PCa sub-study) 285 male patients diagnosed with PCa and 285 healthy men.
Larger cohorts of cases and controls were enrolled in a replication study, including: (1. AD/CRC sub-study) 945 (509 females and 436 males) patients with AD, 889 (352 females and 537 males) patients with CRC and 2188 (1542 females and 646 males) healthy individuals, and (2. PCa sub-study) 447 patients with PCa and 800 healthy male controls. The median age at diagnosis for AD, CRC and PCa was 60 years (range: 36–85), 64 years (range: 29–89) and 67 years (range: 42–83), respectively. Sample sizes and the age distribution of each group are shown in
Group | GWAS validation, enrolled: N | Age range | Median age | GWAS validation, after TaqMan® filtration: N | Age range | Median age | Replication study, enrolled: N | Age range | Median age | Replication study, after TaqMan® filtration: N | Age range | Median age |
PCa | 135 | 45–83 | 67 | 118 | 45–83 | 58 | 447 | 42–83 | 67 | 419 | 42–83 | 67 |
AD | 525 | 27–85 | 59 | 476 | 27–85 | 59 | 945 | 32–85 | 60 | 856 | 36–85 | 60 |
AD (F) | 270 | 28–85 | 58 | 242 | 28–85 | 58 | 509 | 32–85 | 60 | 454 | 40–85 | 60 |
AD (M) | 255 | 27–85 | 60 | 234 | 27–85 | 60 | 436 | 36–85 | 61 | 402 | 36–85 | 61 |
CRC | 630 | 29–86 | 65 | 598 | 29–86 | 65 | 889 | 28–89 | 64 | 840 | 29–89 | 64 |
CRC (F) | 240 | 29–86 | 63 | 234 | 29–86 | 63 | 352 | 29–89 | 63 | 341 | 29–89 | 63 |
CRC (M) | 390 | 32–84 | 66 | 364 | 32–84 | 66 | 537 | 28–85 | 65 | 499 | 30–85 | 65 |
Control - PCa | 270 | 27–81 | 55 | 261 | 27–81 | 55 | 800 | 27–86 | 59 | 772 | 27–86 | 59 |
Control - AD/CRC | 690 | 27–81 | 57 | 669 | 27–81 | 57 | 2188 | 21–87 | 58 | 1981 | 21–87 | 58 |
Control - AD/CRC (F) | 420 | 40–77 | 58 | 408 | 40–77 | 58 | 1542 | 21–87 | 58 | 1399 | 21–87 | 58 |
Control - AD/CRC (M) | 270 | 27–81 | 55 | 261 | 27–81 | 55 | 646 | 24–82 | 57.5 | 582 | 24–82 | 57 |
The GWAS validation panel indicates the numbers of patients (N) enrolled in the GWAS after exclusion of microarrays that did not meet quality control criteria based on the principal component analysis (PCA) results. The ‘Range’ and ‘Median’ values refer to the age of cases and controls in the respective groups. Both GWAS validation and replication analyses were performed by TaqMan® genotyping of individual patients. The TaqMan® genotyping data were subjected to quality filtering using a 5% threshold for maximum per-individual genotype missingness (see ‘
Genomic DNA was extracted from EDTA-treated whole blood using the QIAamp DNA Mini Kit (Qiagen, Germany), following the manufacturer's protocol. Before pooling, DNA sample concentrations were measured by fluorescence intensity using the Quant-iT™ PicoGreen dsDNA Kit (Invitrogen, United Kingdom). To assess DNA quality, the 260 nm/280 nm absorbance ratio of each sample was also measured using a NanoDrop 1000 spectrophotometer (Thermo Fisher Scientific Inc., USA), and samples were run on a 1% agarose gel to check DNA integrity visually.
DNA samples that passed quality control tests were combined in equimolar amounts according to patient diagnosis to obtain pools of 15 DNA samples each. Pooled DNA samples were then brought to a final concentration of 50 ng/µl in Tris-EDTA buffer (pH = 8), with concentrations of Tris and EDTA not exceeding 10 mM and 0.1 mM, respectively. In the AD/CRC sub-study, a total of 35, 42 and 47 DNA pools were prepared for AD, CRC and controls, respectively, whereas in the PCa sub-study, 19 DNA pools were prepared for each of the PCa and control groups. To reduce the influence of experimental variation, each DNA pool was divided into three technical replicates that were assayed independently, on separate microarrays, using the Affymetrix Genome-Wide Human SNP Array 6.0. Microarray genotyping experiments and the extraction of probe set signal intensities were performed by ATLAS Biolabs GmbH (Berlin, Germany).
For the technical validation of GWAS findings and for the replication study, individual patients were genotyped using TaqMan SNP Genotyping Assays (Life Technologies, USA), SensiMix™ II Probe Kit (Bioline Ltd, United Kingdom), and a 7900HT Real-Time PCR system (Life Technologies, USA).
The intensity of each SNP was calculated as the relative allele signal (RAS) for each microarray, such that: RAS = A/(A+B), where A and B are the probe set intensity values of alleles A and B, respectively, according to the Affymetrix coding
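The RAS definition above can be illustrated with a minimal Python sketch. The intensity values are hypothetical, and averaging the RAS over the technical replicates of a pool is an assumption made here for illustration, not a step confirmed by the text.

```python
from statistics import mean


def relative_allele_signal(a: float, b: float) -> float:
    """RAS = A / (A + B) for one SNP on one microarray."""
    total = a + b
    if total <= 0:
        raise ValueError("probe set intensities must sum to a positive value")
    return a / total


def pool_ras(replicates):
    """Mean RAS over the technical replicates of one DNA pool.

    `replicates` is a list of (A, B) intensity pairs, one per microarray.
    Averaging replicates is an assumption of this sketch.
    """
    return mean(relative_allele_signal(a, b) for a, b in replicates)


# Hypothetical intensities for one SNP in one pool (three technical replicates)
replicates = [(1200.0, 800.0), (1150.0, 850.0), (1250.0, 750.0)]
print(round(pool_ras(replicates), 3))  # 0.6
```

A RAS close to 1 indicates a pool dominated by allele A, and a RAS close to 0 a pool dominated by allele B.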
To detect significant differences in allele frequency between PCa and the control group, a combination of two statistical approaches was used. Firstly, between-group differences in RAS were tested using Student's t-tests to take into account RAS variation among pools representing each group
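As a rough illustration of the pool-level comparison, the sketch below computes a classical equal-variance Student's t statistic from per-pool RAS values. The RAS values are hypothetical, and the function returns only the statistic and its degrees of freedom; converting these to a p-value requires a t-distribution routine (for example `scipy.stats.ttest_ind`), which is deliberately left out to keep the example dependency-free.

```python
from math import sqrt
from statistics import mean, variance


def student_t(group1, group2):
    """Equal-variance two-sample t statistic and its degrees of freedom."""
    n1, n2 = len(group1), len(group2)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two pools")
    # Pooled variance weights each group's sample variance by its df
    pooled_var = ((n1 - 1) * variance(group1) +
                  (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    t = (mean(group1) - mean(group2)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


# Hypothetical per-pool RAS values for one SNP
case_pools = [0.46, 0.48, 0.44, 0.47, 0.45]
control_pools = [0.36, 0.35, 0.37, 0.34, 0.38]
t_stat, df = student_t(case_pools, control_pools)
print(f"t = {t_stat:.2f}, df = {df}")
```

Because each observation is a pool rather than an individual, the test reflects between-pool variation in RAS and not sampling variation among individual genotypes.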
Candidate SNPs for individual genotyping were selected by combining the results from both the t-test and χ2-test, using the clumping algorithm in the PLINK v1.06 software (
Technical validation of the candidate SNPs selected by the pooled-DNA GWAS was performed by individual genotyping of the same experimental cohorts. TaqMan genotyping data were first subjected to quality control procedures: SNPs with a missing-genotype rate of 0.05 or more, individuals with a missing-genotype rate of 0.05 or more, and SNPs deviating from Hardy-Weinberg equilibrium in the control group (p<0.001) were excluded. GWAS candidate associations were validated using the allelic χ2-test (PLINK v1.07 software). SNPs with
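The allelic χ2-test and the odds ratios reported in the tables can be sketched as follows. This is a minimal stand-alone illustration, not PLINK's implementation: it assumes a 2×2 table of allele counts, no continuity correction, and a Wald-type 95% confidence interval. The allele counts are hypothetical; the allele frequencies in the last line are taken from the technical validation of rs1934636.

```python
from math import erfc, exp, log, sqrt


def allelic_test(case_minor, case_major, ctrl_minor, ctrl_major):
    """Allelic chi-square test (1 df), odds ratio and Wald 95% CI."""
    a, b, c, d = case_minor, case_major, ctrl_minor, ctrl_major
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = erfc(sqrt(chi2 / 2))  # upper tail of chi-square with 1 df
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci = (exp(log(odds_ratio) - 1.96 * se_log_or),
          exp(log(odds_ratio) + 1.96 * se_log_or))
    return chi2, p_value, odds_ratio, ci


def odds_ratio_from_freqs(f1, f2):
    """Odds ratio computed from minor allele frequencies in groups G1 and G2."""
    return (f1 / (1 - f1)) / (f2 / (1 - f2))


# Hypothetical allele counts: 100 cases and 100 controls (200 alleles each)
chi2, p, or_, ci = allelic_test(80, 120, 60, 140)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}, OR = {or_:.2f} ({ci[0]:.2f}-{ci[1]:.2f})")

# Frequencies 0.373 (PCa) and 0.233 (controls) for rs1934636
print(round(odds_ratio_from_freqs(0.373, 0.233), 2))  # 1.96
```

The last line reproduces the odds ratio of 1.96 listed for rs1934636 in the technical validation; the confidence limits in the tables depend on the actual allele counts and are not recomputed here.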
Validated GWAS-derived SNPs and literature-selected SNPs (
The heterogeneity among study populations was assessed with the
The GWAS was carried out using pools of 15 DNA samples and the Affymetrix Genome-Wide Human SNP Array 6.0. The following outliers, identified by the PCA results, were excluded from further analyses: 1) one pool representing 15 male control subjects in the AD/CRC sub-study and 2) 10 pools representing 150 PCa patients and one pool representing 15 controls in the PCa sub-study. The reason why so many PCa patient pools had to be rejected is not clear. We can only speculate that pre-analytical variability, such as subtle changes in DNA quality and/or DNA microarray hybridization, affected the final results of the allelotyping experiments.
The pooled-DNA GWAS revealed 44 candidate SNPs associated with AD, CRC or PCa, of which two were repeated in two unrelated comparisons. Considering SNP population frequencies of 0.2–0.5, our AD/CRC GWAS reached a power ranging from 98.6% to 99.8% and from 43% to 64% to detect effect sizes of OR = 2.0 and 1.5, respectively, at α = 1E-03, as estimated according to Dupont et al.
Next, the GWAS-selected SNPs were validated by genotyping of individual DNA samples using TaqMan SNP Genotyping Assays. Five candidate SNPs (rs2557030, rs2557227, rs2574608, rs2755895, rs7583683) were excluded from further statistical analysis due to significant deviations (
G1 vs. G2 | dbSNP ID | Region | MA | Pooled-DNA GWAS: F1 | F2 | OR (95% CI) | P (t-test) | P (χ2-test) | Technical validation: F1 | F2 | OR (95% CI) | P (χ2-test) |
PCa vs. N | rs1934636 | 1q32.2 | C | 0.464 | 0.356 | 1.57 (1.03–2.38) | 1.31E-04 | 3.15E-03 | 0.373 | 0.233 | 1.96 (1.24–3.10) | 4.50E-05 |
rs12629904 | 3q13.31 | T | 0.196 | 0.13 | 1.63 (0.94–2.84) | 1.25E-04 | 1.39E-02 | 0.079 | 0.03 | 2.77 (1.07–7.22) | 2.58E-03 | |
rs1733329 | 3q13.33 | T | 0.347 | 0.284 | 1.34 (0.86–2.08) | 1.41E-04 | 6.78E-02 | 0.332 | 0.23 | 1.66 (1.04–2.65) | 2.37E-03 | |
rs1430579 | 4q31.21 | C | 0.408 | 0.279 | 1.78 (1.15–2.75) | 2.42E-04 | 2.26E-04 | 0.341 | 0.211 | 1.93 (1.21–3.10) | 9.10E-05 | |
rs667472 | 12p13.32 | A | 0.364 | 0.248 | 1.74 (1.11–2.71) | 3.13E-04 | 5.37E-04 | 0.214 | 0.13 | 1.82 (1.05–3.17) | 2.95E-03 | |
rs11616166 | 12p12.3 | G | 0.289 | 0.174 | 1.93 (1.19–3.14) | 1.42E-05 | 1.53E-04 | 0.142 | 0.061 | 2.55 (1.25–5.17) | 2.01E-04 | |
AD vs. N | rs6762970 | 3p12.3 | A | 0.43 | 0.497 | 0.76 (0.61–0.96) | 1.74E-05 | 9.29E-04 | 0.402 | 0.487 | 0.71 (0.56–0.90) | 5.75E-05 |
AD vs. N (F) | rs7631421 | 3p14.1 | C | 0.361 | 0.452 | 0.68 (0.50–0.94) | 2.26E-05 | 8.72E-04 | 0.259 | 0.344 | 0.67 (0.47–0.95) | 1.49E-03 |
rs2128834 | 3p22.1 | G | 0.183 | 0.242 | 0.7 (0.48–1.03) | 2.77E-05 | 9.83E-03 | 0.116 | 0.206 | 0.51 (0.32–0.80) | 3.31E-05 | |
AD vs. N (M) | rs11876485 | 18q11.2 | T | 0.157 | 0.211 | 0.7 (0.45–1.09) | 8.96E-06 | 2.52E-02 | 0.122 | 0.169 | 0.68 (0.41–1.14) | 3.54E-02 |
rs5975081 | 23q25 | G | 0.365 | 0.246 | 1.76 (1.21–2.57) | 2.56E-04 | 2.89E-05 | 0.171 | 0.096 | 1.94 (1.14–3.31) | 1.35E-02 | |
CRC vs. N | rs6702619 | 1p21.2 | T | 0.446 | 0.517 | 0.75 (0.61–0.93) | 5.77E-07 | 2.39E-04 | 0.425 | 0.502 | 0.73 (0.59–0.92) | 1.13E-04 |
rs7611300 | 3q26.33 | A | 0.28 | 0.223 | 1.36 (1.06–1.74) | 1.63E-06 | 7.11E-04 | 0.018 | 0.013 | 1.39 (0.57–3.42) | 3.32E-01 | |
rs13219695 | 6p21.2 | G | 0.321 | 0.387 | 0.75 (0.60–0.94) | 8.14E-06 | 3.79E-04 | 0.109 | 0.169 | 0.6 (0.43–0.83) | 1.59E-05 | |
rs2799652 | 6q16.1 | A | 0.499 | 0.434 | 1.3 (1.05–1.61) | 2.35E-05 | 9.04E-04 | 0.338 | 0.276 | 1.34 (1.05–1.70) | 6.89E-04 | |
rs879872 | 11p15.5 | T | 0.272 | 0.214 | 1.37 (1.07–1.77) | 1.14E-11 | 4.44E-04 | 0.026 | 0.018 | 1.46 (0.86–3.12) | 1.76E-01 | |
rs7171423 | 15q25.1 | C | 0.338 | 0.272 | 1.37 (1.08–1.73) | 3.60E-05 | 2.27E-04 | 0.192 | 0.135 | 1.52 (1.13–2.06) | 9.99E-05 | |
rs3803820 | 17q24.2 | G | 0.335 | 0.272 | 1.35 (1.07–1.71) | 2.36E-04 | 4.24E-04 | 0.125 | 0.085 | 1.54 (1.07–2.21) | 1.15E-03 | |
rs12689028 | 23p22.31 | C | 0.334 | 0.271 | 1.35 (1.07–1.71) | 1.19E-08 | 4.88E-04 | 0.044 | 0.053 | 0.82 (0.49–1.38) | 3.66E-01 | |
rs912956 | 23p11.1 | C | 0.437 | 0.374 | 1.3 (1.04–1.62) | 1.93E-07 | 9.49E-04 | 0.247 | 0.2 | 1.31 (1.01–1.71) | 1.47E-02 | |
rs5987543 | 23q22.2 | C | 0.411 | 0.348 | 1.31 (1.05–1.63) | 8.02E-05 | 9.11E-04 | 0.161 | 0.127 | 1.32 (0.96–1.81) | 4.05E-02 | |
CRC vs. N (F) | rs9283670 | 4p13 | C | 0.346 | 0.26 | 1.51 (1.07–2.12) | 1.41E-04 | 9.24E-04 | 0.109 | 0.065 | 1.76 (1.00–3.11) | 5.85E-03 |
rs17165506 | 7p21.3 | G | 0.293 | 0.202 | 1.64 (1.14–2.36) | 3.41E-06 | 1.82E-04 | 0.154 | 0.081 | 2.07 (1.25–3.41) | 5.54E-05 | |
rs441261 | 7p14.3 | G | 0.32 | 0.268 | 1.29 (0.91–1.82) | 4.92E-06 | 4.47E-02 | 0.224 | 0.12 | 2.12 (1.38–3.25) | 1.06E-06 | |
CRC vs. N (M) | rs12994941 | 2p21 | C | 0.446 | 0.349 | 1.5 (1.09–2.07) | 2.69E-05 | 4.17E-04 | 0.242 | 0.165 | 1.62 (1.08–2.42) | 1.09E-03 |
rs7611300 | 3q26.33 | A | 0.283 | 0.201 | 1.57 (1.08–2.27) | 1.74E-06 | 7.35E-04 | 0.019 | 0.016 | 1.19 (0.35–4.06) | 6.23E-01 | |
rs40972 | 5q23.3 | T | 0.196 | 0.245 | 0.75 (0.52–1.09) | 2.11E-05 | 3.09E-02 | 0.065 | 0.131 | 0.46 (0.27–0.80) | 8.18E-05 | |
rs13192135 | 6p24.3 | G | 0.253 | 0.302 | 0.78 (0.55–1.11) | 3.82E-06 | 5.08E-02 | 0.019 | 0.06 | 0.3 (0.12–0.75) | 2.02E-04 | |
rs5978435 | 23p22.2 | C | 0.493 | 0.592 | 0.67 (0.49–0.92) | 9.16E-06 | 3.88E-04 | 0.377 | 0.251 | 1.81 (1.27–2.57) | 9.68E-04 | |
CRC vs. AD | rs7533097 | 1p31.3 | C | 0.685 | 0.749 | 0.73 (0.51–1.03) | 2.16E-04 | 6.91E-04 | 0.142 | 0.093 | 1.61 (1.10–2.37) | 5.66E-04 |
rs6702619 | 1p21.2 | T | 0.446 | 0.528 | 0.72 (0.57–0.91) | 3.29E-08 | 8.96E-05 | 0.425 | 0.533 | 0.65 (0.51–0.83) | 6.62E-07 | |
rs9848984 | 3p26.3 | C | 0.676 | 0.754 | 0.68 (0.53–0.88) | 1.19E-04 | 3.52E-05 | 0.077 | 0.04 | 2 (1.16–3.46) | 3.89E-04 | |
rs11742611 | 5q11.2 | G | 0.516 | 0.58 | 0.77 (0.61–0.97) | 1.09E-04 | 2.00E-03 | 0.416 | 0.514 | 0.67 (0.53–0.86) | 7.10E-06 | |
rs10814948 | 9p24.2 | T | 0.702 | 0.634 | 1.36 (1.06–1.74) | 1.87E-08 | 4.84E-04 | 0.252 | 0.346 | 0.64 (0.49–0.83) | 2.47E-06 | |
rs1147451 | 14q23.3 | T | 0.681 | 0.621 | 1.3 (1.02–1.66) | 5.12E-05 | 2.26E-03 | 0.287 | 0.368 | 0.69 (0.53–0.89) | 6.72E-05 | |
rs5990890 | 23p22.12 | G | 0.672 | 0.732 | 0.75 (0.58–0.97) | 1.58E-04 | 1.81E-03 | 0.126 | 0.086 | 1.53 (1.03–2.29) | 1.23E-02 | |
CRC vs. AD (F) | rs16860868 | 3q13.2 | C | 0.67 | 0.749 | 0.68 (0.48–0.96) | 4.06E-05 | 5.67E-03 | 0.282 | 0.17 | 1.92 (1.24–2.98) | 3.62E-05 |
CRC vs. AD (M) | rs6972867 | 7p12.2 | C | 0.543 | 0.623 | 0.72 (0.52–0.99) | 5.19E-08 | 4.35E-03 | 0.239 | 0.153 | 1.74 (1.13–2.67) | 3.55E-04 |
rs7321756 | 13q31.2 | G | 0.271 | 0.185 | 1.64 (1.12–2.39) | 1.59E-05 | 3.57E-04 | 0.865 | 0.929 | 0.49 (0.27–0.88) | 4.73E-04 |
Technical validation was performed by individual typing of DNA samples from the same study cohorts used for pooled-DNA GWAS. The allele frequency distribution and χ2-test
dbSNP ID: SNP identifier based on NCBI SNP database;
rs6702619 and rs7611300: SNPs identified in two independent comparisons.
G1 vs. G2 | dbSNP ID | Region | Gene | MA | F1 | F2 | Allelic: OR (95% CI) | P | Adjusted P | Additive model: OR (95% CI) | P | Adjusted P | Meta-analysis: I² (P) |
PCa vs. N | rs1934636 | 1q32.2 | KCNH1 ( | C | 0.290 | 0.245 | 1.26 (1.04–1.52) | | 5.27E-02 | 1.14 (0.93–1.41) | 2.09E-01 | 3.93E-01 | 81.2 (0.0212) |
rs12629904 | 3q13.31 | | T | 0.059 | 0.045 | 1.33 (0.92–1.94) | 1.32E-01 | 2.83E-01 | 1.37 (0.92–2.05) | 1.21E-01 | 3.03E-01 | 69.5 (0.0703) | |
rs1733329 | 3q13.33 | FSTL1 | T | 0.262 | 0.238 | 1.14 (0.94–1.38) | 1.88E-01 | 3.14E-01 | 1.17 (0.94–1.44) | 1.55E-01 | 3.32E-01 | 73.6 (0.0517) | |
rs1430579 | 4q31.21 | UCP1 | C | 0.264 | 0.242 | 1.12 (0.92–1.36) | 2.45E-01 | 3.67E-01 | 1.07 (0.88–1.32) | 4.92E-01 | 6.70E-01 | 87 (0.0055) | |
rs667472 | 12p13.32 | KCNA5 | A | 0.160 | 0.167 | 0.95 (0.75–1.19) | 6.58E-01 | 7.59E-01 | 0.96 (0.75–1.24) | 7.56E-01 | 8.76E-01 | 87 (0.0056) | |
rs11616166 | 12p12.3 | AEBP2 ( | G | 0.080 | 0.069 | 1.18 (0.85–1.62) | 3.23E-01 | 4.40E-01 | 1.05 (0.73–1.51) | 7.93E-01 | 8.76E-01 | 84.3 (0.0116) | |
AD vs. N | rs6762970 | 3p12.3 | CNTN3 | A | 0.418 | 0.450 | 0.88 (0.78–0.99) | | 2.23E-01 | 0.85 (0.74–0.97) | | 1.35E-01 | 76.8 (0.0379) |
AD vs. N (F) | rs7631421 | 3p14.1 | MITF | C | 0.306 | 0.322 | 0.93 (0.79–1.10) | 3.67E-01 | 8.03E-01 | 0.9 (0.75–1.10) | 3.06E-01 | 8.91E-01 | 78.6 (0.0307) |
rs2128834 | 3p22.1 | ULK4 ( | G | 0.148 | 0.178 | 0.81 (0.65–0.99) | | 3.56E-01 | 0.86 (0.68–1.09) | 2.16E-01 | 8.91E-01 | 82.1 (0.0182) | |
CRC vs. N | rs6702619 | 1p21.2 | PALMD | T | 0.455 | 0.482 | 0.9 (0.80–1.01) | 6.22E-02 | 2.36E-01 | 0.89 (0.78–1.01) | 7.39E-02 | 2.10E-01 | 75.4 (0.0439) |
rs13219695 | 6p21.2 | BTBD9 ( | G | 0.117 | 0.153 | 0.73 (0.62–0.87) | | | 0.71 (0.58–0.86) | | | 42.6 (0.1869) | |
rs2799652 | 6q16.1 | FUT9 | A | 0.337 | 0.300 | 1.19 (1.05–1.34) | | | 1.19 (1.03–1.36) | | 1.01E-01 | 24.1 (0.2509) | |
rs7171423 | 15q25.1 | FAM108C1 ( | C | 0.176 | 0.146 | 1.26 (1.08–1.47) | | | 1.26 (1.06–1.50) | | 9.51E-02 | 52.6 (0.1465) | |
rs3803820 | 17q24.2 | PRKCA ( | G | 0.121 | 0.094 | 1.32 (1.10–1.58) | | | 1.27 (1.03–1.56) | | 1.27E-01 | 0 (0.3525) | |
CRC vs. N (F) | rs9283670 | 4p13 | PHOX2B | C | 0.094 | 0.081 | 1.17 (0.87–1.57) | 2.89E-01 | 5.78E-01 | 1.16 (0.83–1.62) | 3.99E-01 | 7.94E-01 | 60.5 (0.1117) |
rs17165506 | 7p21.3 | TMEM106B | G | 0.132 | 0.107 | 1.28 (0.99–1.64) | 5.88E-02 | 2.35E-01 | 1.33 (1.00–1.76) | | 1.87E-01 | 78.3 (0.0319) | |
rs441261 | 7p14.3 | SLC25A5 | G | 0.202 | 0.150 | 1.44 (1.16–1.78) | | | 1.39 (1.09–1.78) | | 9.88E-02 | 75.9 (0.0419) | |
CRC vs. N (M) | rs12994941 | 2p21 | RPS12 | C | 0.233 | 0.214 | 1.12 (0.91–1.37) | 2.95E-01 | 7.65E-01 | 0.98 (0.78–1.24) | 8.72E-01 | 9.14E-01 | 76.3 (0.0402) |
rs40972 | 5q23.3 | ADAMTS19 ( | T | 0.072 | 0.123 | 0.55 (0.41–0.74) | | | 0.55 (0.39–0.77) | | | 0 (0.4839) | |
rs13192135 | 6p24.3 | BMP6 ( | G | 0.021 | 0.040 | 0.52 (0.31–0.88) | | 9.74E-02 | 0.47 (0.26–0.84) | | 9.03E-02 | 31.4 (0.2272) | |
rs5978435 | 23p22.2 | ARHGAP6 ( | C | 0.362 | 0.280 | 1.46 (1.13–1.89) | | 5.10E-02 | 1.22 (1.05–1.41) | | 9.03E-02 | 0 (0.3379) | |
CRC vs. AD | rs7533097 | 1p31.3 | SGIP1 | C | 0.135 | 0.097 | 1.45 (1.17–1.80) | | | 1.41 (1.10–1.80) | | | 0 (0.5462) |
rs6702619 | 1p21.2 | PALMD | T | 0.455 | 0.507 | 0.81 (0.71–0.93) | | | 0.79 (0.68–0.93) | | | 75.2 (0.0446) | |
rs9848984 | 3p26.3 | CHL1 | C | 0.070 | 0.046 | 1.54 (1.15–2.07) | | | 1.75 (1.24–2.48) | | | 7 (0.2997) | |
rs11742611 | 5q11.2 | PELO | G | 0.433 | 0.484 | 0.81 (0.71–0.93) | | | 0.86 (0.73–1.00) | 5.35E-02 | 2.08E-01 | 64.3 (0.0944) | |
rs10814948 | 9p24.2 | GLIS3 | T | 0.261 | 0.316 | 0.76 (0.66–0.89) | | | 0.75 (0.63–0.90) | | | 54.1 (0.1401) | |
rs1147451 | 14q23.3 | FUT8 | T | 0.294 | 0.346 | 0.79 (0.68–0.91) | | | 0.8 (0.68–0.95) | | 6.77E-02 | 13.6 (0.2819) | |
CRC vs. AD (F) | rs16860868 | 3q13.2 | WDR52 | C | 0.257 | 0.195 | 1.42 (1.12–1.80) | | 1.15E-01 | 1.31 (1.00–1.71) | | 3.24E-01 | 55.4 (0.1345) |
CRC vs. AD (M) | rs6972867 | 7p12.2 | ZPBP | C | 0.243 | 0.174 | 1.53 (1.21–1.93) | | | 1.62 (1.22–2.14) | | | 0 (0.5176) |
rs7321756 | 13q31.2 | SLITRK5 | G | 0.131 | 0.101 | 1.35 (1.00–1.81) | | 2.37E-01 | 1.28 (0.91–1.81) | 1.51E-01 | 4.25E-01 | 63.4 (0.0986) |
Bold denotes significant association (
dbSNP ID: SNP identifier based on NCBI SNP database;
Gene: NCBI ID of genes localized in proximity to the SNPs of interest (source: HapMap).
The statistical evidence for heterogeneity between allele frequencies across validation and replication study groups was assessed by the Q-test
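The Q-test and the accompanying inconsistency statistic (I²) reported in the meta-analysis column can be sketched as below. This is a minimal illustration under stated assumptions: it uses inverse-variance weights on log odds ratios, the input values are hypothetical, and the p-value is computed only for the two-group case (1 degree of freedom), where the χ2 tail reduces to a complementary error function.

```python
from math import erfc, log, sqrt


def heterogeneity(log_ors, std_errs):
    """Cochran's Q, degrees of freedom, I2 (%) and, for df == 1, the p-value."""
    if len(log_ors) < 2:
        raise ValueError("at least two estimates are required")
    weights = [1.0 / se ** 2 for se in std_errs]
    pooled = sum(w * y for w, y in zip(weights, log_ors)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_ors))
    df = len(log_ors) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    # Closed-form chi-square tail is used only for one degree of freedom
    p_value = erfc(sqrt(q / 2.0)) if df == 1 else None
    return q, df, i2, p_value


# Hypothetical validation and replication estimates for one SNP
q, df, i2, p = heterogeneity([log(2.0), log(1.0)], [0.2, 0.2])
print(f"Q = {q:.2f}, df = {df}, I2 = {i2:.1f}%, p = {p:.4f}")
```

With two study groups, I² = (Q − 1)/Q, so a table entry such as 81.2 (0.0212) is consistent with Q ≈ 5.3 on 1 degree of freedom, and an entry with I² = 0 corresponds to Q ≤ 1.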
Six of the significantly associated SNPs were located within intronic gene regions:
Thirty four and nine additional SNPs, previously shown to be associated with CRC
The association of 14 literature-selected variants with AD or CRC and four literature-selected variants with PCa was confirmed (
dbSNP ID | Region | Gene | MA | G1 vs. G2 | Allelic: OR (95% CI) | P | Adjusted P | Additive model: OR (95% CI) | P | Adjusted P |
rs1800894 | 1q32.1 | IL10 ( | T | AD vs. N | 0.67 (0.47–0.96) | 2.77E-02 | 2.23E-01 | 0.58 (0.38–0.89) | 1.24E-02 | 1.24E-01 |
AD vs. N (F) | 0.79 (0.5–1.25) | 3.15E-01 | 8.03E-01 | 0.53 (0.30–0.94) | 3.08E-02 | 3.19E-01 | ||||
CRC vs. N (F) | 1.61 (1.09–2.39) | 1.67E-02 | 9.42E-02 | 1.6 (1.05–2.44) | 2.98E-02 | 1.48E-01 | ||||
CRC vs. AD | 1.78 (1.21–2.64) | 3.28E-03 | | 2.1 (1.30–3.37) | 2.24E-03 | 2.61E-02 | ||||
CRC vs. AD (F) | 2.04 (1.2–3.45) | 7.01E-03 | 1.15E-01 | 3.03 (1.58–5.81) | 8.51E-04 | | ||||
rs373572 | 3p25.3 | RAD18 ( | C | CRC vs. AD (M) | 0.83 (0.67–1.02) | 7.02E-02 | 2.62E-01 | 0.78 (0.61–0.99) | 4.55E-02 | 2.44E-01 |
rs822395 | 3q27.3 | ADIPOQ ( | C | CRC vs. N (F) | 1.2 (1.01–1.43) | 3.67E-02 | 1.65E-01 | 1.28 (1.05–1.55) | 1.28E-02 | 1.03E-01 |
CRC vs. AD (F) | 1.3 (1.06–1.6) | 1.29E-02 | 1.15E-01 | 1.33 (1.05–1.69) | 1.70E-02 | 2.55E-01 | ||||
rs2229992 | 5q21 | APC ( | T | CRC vs. AD (M) | 1.26 (1.04–1.51) | 1.81E-02 | 2.05E-01 | 1.26 (1.00–1.59) | 4.75E-02 | 2.44E-01 |
rs16892766 | 8q23.3 | EIF3H | C | CRC vs. N | 1.63 (1.34–1.97) | 6.27E-07 | | 1.45 (1.16–1.81) | 9.71E-04 | |
CRC vs. N (F) | 1.76 (1.34–2.3) | 4.11E-05 | | 1.53 (1.12–2.09) | 7.13E-03 | 9.88E-02 | ||||
CRC vs. N (M) | 1.5 (1.12–2.01) | 7.08E-03 | 6.37E-02 | 1.43 (1.01–2.01) | 4.26E-02 | 2.73E-01 | ||||
CRC vs. AD | 1.34 (1.07–1.68) | 1.09E-02 | | 1.39 (1.07–1.82) | 1.48E-02 | 7.42E-02 | ||||
rs6983267 | 8q24.21 | | T | AD vs. N | 0.84 (0.75–0.95) | 3.39E-03 | 5.76E-02 | 0.84 (0.74–0.96) | 1.14E-02 | 1.24E-01 |
AD vs. N (F) | 0.81 (0.69–0.94) | 5.11E-03 | 8.94E-02 | 0.8 (0.68–0.95) | 1.28E-02 | 1.98E-01 | ||||
PCa vs. N | 0.77 (0.65–0.91) | 2.07E-03 | | 0.75 (0.62–0.90) | 2.49E-03 | | ||||
rs1447295 | 8q24.21 | | A | PCa vs. N | 1.53 (1.18–1.97) | 1.13E-03 | 8.49E-03 | 1.41 (1.06–1.86) | 1.73E-02 | 6.49E-02 |
rs1057910 | 10q23.33 | CYP2C9 ( | C | CRC vs. N (F) | 1.51 (1.11–2.05) | 8.12E-03 | 5.85E-02 | 1.51 (1.07–2.13) | 1.97E-02 | 1.26E-01 |
CRC vs. AD (F) | 1.54 (1.05–2.25) | 2.63E-02 | 1.49E-01 | 1.57 (1.02–2.41) | 3.96E-02 | 3.24E-01 | ||||
rs7931342 | 11q13.2 | MYEOV | G | PCa vs. N | 1.25 (1.05–1.47) | 1.10E-02 | | 1.27 (1.05–1.53) | 1.30E-02 | 6.48E-02 |
rs3802842 | 11q23.1 | | C | CRC vs. AD | 0.82 (0.7–0.95) | 9.90E-03 | | 0.82 (0.69–0.97) | 2.43E-02 | 1.06E-01 |
CRC vs. AD (M) | 0.79 (0.64–0.97) | 2.56E-02 | 2.05E-01 | 0.77 (0.60–0.98) | 3.30E-02 | 2.44E-01 | ||||
rs7136702 | 12q13.13 | LARP4 | T | CRC vs. AD (M) | 1.17 (0.96–1.44) | 1.22E-01 | 3.57E-01 | 1.31 (1.03–1.67) | 3.04E-02 | 2.44E-01 |
rs696 | 14q13.2 | NFKBIA ( | T | CRC vs. N (F) | 1.17 (0.98–1.38) | 7.59E-02 | 2.73E-01 | 1.22 (1.02–1.47) | 3.24E-02 | 1.48E-01 |
rs4779584 | 15q13.3 | | T | AD vs. N (M) | 1.24 (1.01–1.54) | 4.28E-02 | 3.53E-01 | 1.34 (1.05–1.70) | 1.86E-02 | 5.04E-01 |
CRC vs. N (M) | 1.34 (1.1–1.63) | 3.66E-03 | 5.10E-02 | 1.37 (1.09–1.73) | 7.46E-03 | 9.03E-02 | ||||
rs9929218 | 16q22.1 | CDH1 ( | A | AD vs. N | 0.88 (0.78–1) | 5.05E-02 | 2.23E-01 | 0.86 (0.75–1.00) | 4.39E-02 | 2.63E-01 |
AD vs. N (M) | 0.84 (0.69–1.02) | 8.08E-02 | 5.33E-01 | 0.77 (0.60–0.98) | 3.48E-02 | 5.04E-01 | ||||
rs1859962 | 17q24.3 | | T | PCa vs. N | 0.73 (0.62–0.87) | 4.20E-04 | | 0.73 (0.61–0.89) | 1.57E-03 | |
rs4939827 | 18q21.1 | SMAD7 ( | C | AD vs. N | 0.81 (0.72–0.9) | 1.98E-04 | | 0.82 (0.72–0.94) | 3.86E-03 | 1.16E-01 |
AD vs. N (F) | 0.72 (0.62–0.84) | 1.96E-05 | | 0.76 (0.64–0.90) | 1.54E-03 | | ||||
CRC vs. N | 0.83 (0.74–0.94) | 1.87E-03 | | 0.85 (0.75–0.96) | 1.20E-02 | 1.01E-01 | ||||
CRC vs. N (F) | 0.79 (0.67–0.94) | 7.19E-03 | 5.85E-02 | 0.78 (0.65–0.94) | 9.26E-03 | 9.88E-02 | ||||
rs961253 | 20p12.3 | | A | CRC vs. AD (M) | 1.2 (0.98–1.46) | 7.50E-02 | 2.62E-01 | 1.33 (1.05–1.68) | 1.65E-02 | 2.44E-01 |
Bold denotes significant association (
dbSNP ID: SNP identifier based on NCBI SNP database;
Gene: NCBI ID of genes localized in proximity to the SNPs of interest (source: HapMap).
To validate the global nature of these associations, between-dataset heterogeneity was tested. In the meta-analysis we included three SNPs associated with CRC and four SNPs associated with PCa susceptibility in our replication study for which associations were found with the same phenotype in at least four other studies. A random-effects model was used to calculate the pooled-OR values. As shown in
dbSNP ID | Risk allele | Phenotype | Random effects: OR (95% CI) | P (Z-test) | Heterogeneity: P (Q-test) | Q | No. of studies | References |
rs1447295 | A | PCa vs. N | 1.45 (1.33–1.57) | <0.001 | 0.139 | 9.676 | 7 | |
rs6983267 | G | PCa vs. N | 1.26 (1.19–1.33) | <0.001 | 0.013 | 19.373 | 9 | |
rs7931342 | G | PCa vs. N | 1.19 (1.14–1.24) | <0.001 | 0.676 | 3.157 | 6 | |
rs1859962 | G | PCa vs. N | 1.24 (1.17–1.31) | <0.001 | 0.313 | 4.757 | 5 | |
rs16892766 | C | CRC vs. N | 1.27 (1.23–1.32) | <0.001 | 0.691 | 3.059 | 6 | |
rs4779584 | T | CRC vs. N | 1.20 (1.53–1.25) | <0.001 | 0.092 | 13.61 | 9 | |
rs4939827 | C | CRC vs. N | 0.84 (0.81–0.88) | <0.001 | 0.015 | 18.95 | 9 | |
dbSNP ID: SNP identifier based on NCBI SNP database;
Allele C of rs4939827: meta-analysis was done for the minor allele (MA).
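The random-effects pooling used for the meta-analysis above can be sketched as follows. The text states only that a random-effects model was used; the DerSimonian-Laird estimator of the between-study variance shown here is an assumption, as is the recovery of standard errors from reported 95% confidence intervals. All input values are hypothetical and are not taken from the studies included in the meta-analysis.

```python
from math import exp, log, sqrt


def random_effects_pool(ors, ci_lows, ci_highs):
    """Pooled OR with 95% CI, Cochran's Q and tau^2 (DerSimonian-Laird)."""
    k = len(ors)
    if k < 2:
        raise ValueError("at least two studies are required")
    y = [log(o) for o in ors]
    # Standard error of log(OR) recovered from the width of the 95% CI
    se = [(log(hi) - log(lo)) / (2 * 1.96) for lo, hi in zip(ci_lows, ci_highs)]
    w = [1.0 / s ** 2 for s in se]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)  # between-study variance
    w_star = [1.0 / (s ** 2 + tau2) for s in se]
    pooled = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    se_pooled = sqrt(1.0 / sum(w_star))
    ci = (exp(pooled - 1.96 * se_pooled), exp(pooled + 1.96 * se_pooled))
    return exp(pooled), ci, q, tau2


# Hypothetical per-study odds ratios with 95% confidence limits
pooled_or, ci, q, tau2 = random_effects_pool(
    ors=[1.60, 1.30, 1.50, 1.20],
    ci_lows=[1.20, 1.05, 1.10, 0.95],
    ci_highs=[2.13, 1.61, 2.05, 1.52],
)
print(f"pooled OR = {pooled_or:.2f} ({ci[0]:.2f}-{ci[1]:.2f}), "
      f"Q = {q:.2f}, tau2 = {tau2:.4f}")
```

When Q does not exceed its degrees of freedom, the estimated between-study variance is zero and the random-effects result coincides with the fixed-effect estimate.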
To check whether any of the studied variants was associated with an early age of PCa onset, we performed a logistic regression analysis including cases only, with a binary indicator for age (below or above 65 years of age, coded as 1 and 0, respectively) at PCa diagnosis and the studied SNPs as independent variables. There were 171 patients diagnosed at age 65 or earlier and 247 patients older than 65. Two SNPs were significantly associated with age at PCa diagnosis (
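The variable coding for this case-only analysis can be sketched as below. The age split follows the group sizes given above (diagnosis at age 65 or earlier versus older than 65), and the additive (ADD), dominant (DOM) and recessive (REC) codings are the standard definitions of these models; the function names and the example patients are hypothetical, and the logistic regression fit itself is not shown.

```python
def early_onset(age_at_diagnosis: int, cutoff: int = 65) -> int:
    """1 if PCa was diagnosed at the cutoff age or earlier, otherwise 0."""
    return 1 if age_at_diagnosis <= cutoff else 0


def code_genotype(minor_allele_count: int, model: str) -> int:
    """Encode a genotype (0, 1 or 2 minor alleles) under a genetic model."""
    if minor_allele_count not in (0, 1, 2):
        raise ValueError("minor allele count must be 0, 1 or 2")
    if model == "ADD":   # each minor allele adds one unit
        return minor_allele_count
    if model == "DOM":   # one copy of the minor allele is sufficient
        return 1 if minor_allele_count >= 1 else 0
    if model == "REC":   # two copies of the minor allele are required
        return 1 if minor_allele_count == 2 else 0
    raise ValueError(f"unknown model: {model}")


# Hypothetical patients: (age at diagnosis, minor allele count)
patients = [(58, 2), (65, 1), (71, 0), (66, 1)]
for age, count in patients:
    row = [code_genotype(count, m) for m in ("ADD", "DOM", "REC")]
    print(early_onset(age), row)
```

Each coded genotype column would then serve as the SNP term in a logistic regression on the early-onset indicator.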
It is generally accepted that well-designed GWAS should be conducted with groups of at least 1,000 patients and 1,000 controls, even though appropriate levels of statistical power to test for genetic associations (at
Since the final GWAS results depend on many factors, each associated with a different stage of the experimental procedure, their analysis and interpretation are often challenging. It is essential to realize that the GWAS results reflect, at best, the differences in the genetic material of the cases and controls used for analysis. Although this may seem obvious, it emphasizes one of the most fundamental conditions required for a successful GWAS. Therefore, precise diagnostic criteria must be employed to obtain homogeneous groups, as a nonrandom distribution of individuals with traits governed by strong genetic determinants, such as single-gene mutations, will strongly bias the final GWAS outcome.
Although our pooled DNA-based GWAS represent studies with small sample size, they identified 30 SNPs significantly overrepresented in the studied groups (
Although not all GWAS-selected susceptibility SNPs will have a direct functional association with a cancer phenotype, a careful analysis of the GWAS results showed that those SNPs located in intronic regions or in LD blocks with nearby genes have the potential to influence cancer development (
The rs3803820 located in the
The rs9848984 SNP at 3p26.3, downstream of the close homolog of L1 (
The rs2799652 SNP was found in the promoter region of the alpha-(1,3)-fucosyltransferase (
We replicated previously reported associations for four PCa and 14 AD/CRC risk variants in our Polish cohorts. Four SNPs (rs1859962, rs7931342, rs1447295 and rs6983267) were widely reported as PCa risk variants in Caucasian, African or Asian populations
Interestingly, the stratified analyses revealed that the rs4939827 (18q21.1) variant's association was limited to women only (OR = 0.6, 95% CI 0.42–0.88,
SNPs rs1447295 and rs6983267 are located at the 8q24 region. Several studies have identified 8q24 as an important region associated with risk for various cancers, including prostate, breast, colon, ovarian and bladder cancers
Among the polymorphisms in block 4 (region 3) at 8q24, rs6983267 has been consistently identified in many studies, with an OR ranging from 0.65 to 1.42
Only a few studies have examined the association between rs1447295 and PCa risk and between rs6983267 and both PCa and CRC risk in the Polish population
Still, some previously reported associations with CRC and PCa risk were not replicated in our study. This may have resulted from low statistical power coupled with high genetic heterogeneity and/or cancer complexity
The only factor that decreases cancer-related mortality significantly is early diagnosis. Since cancers at an early stage of development are asymptomatic or associated with nonspecific symptoms, early diagnosis is usually accidental or results from participation in screening programs. Epidemiological studies demonstrate that screening can be effective for a few cancer sites, including the large bowel and prostate. However, screening effectiveness depends not only on the availability of appropriate diagnostic tests, but also on the general acceptance of the proposed screening methods by those who consider themselves healthy. Colonoscopy used for CRC screening also allows simultaneous detection and removal of ADs, but it is a rather expensive procedure with low acceptability, especially among men
One of the early hopes of the GWAS approach was to enable the development of risk prediction models that could accurately select high-risk individuals based on their genetic profiles. However, the proportion of risk explained by known susceptibility variants is still small. For example, according to a recently published meta-analysis of 30 selected SNPs associated with PCa risk, the proportion of the total genetic variance attributed to each SNP ranged from 0.2% to 0.9%, based on both OR and risk allele frequency
The major idea behind genomic studies is not only to enable recognizing genetic variability associated with susceptibility to a disease, but also to recognize the complex nature of genetic variability underlying its pathogenesis
In summary, this study provides evidence that pooled sample-based GWAS is a cost-effective alternative to genome-wide genotyping of individual DNA samples for filtering genetic variance, reaching adequate statistical power particularly for relatively common SNP markers of moderate effect size. The usefulness of pooling-based GWAS was exemplified by the identification of SNPs associated with CRC and PCa susceptibility in the Polish population. However, considering the complex nature of cancer, which involves the interaction of different genetic and environmental factors, detecting all cancer markers present in the human genome is beyond current capabilities. Even combined with previous findings, the risk information provided in the present study is still not sufficient for use in clinical practice.
Literature-selected SNPs used in the replication study.
(DOC)
SNP association with early PCa onset (before 65 years of age) considering additive (ADD), dominant (DOM), or recessive (REC) models of gene action.
(DOC)
Statistical power of the AD/CRC GWAS for alleles found at different frequencies in the general population (p0).
(TIF)