
Integrative Analysis of Somatic Mutations Altering MicroRNA Targeting in Cancer Genomes

  • Jesse D. Ziebarth,

    Affiliations Department of Microbiology, Immunology and Biochemistry, University of Tennessee Health Science Center, Memphis, Tennessee, United States of America, Center for Integrative and Translational Genomics, University of Tennessee Health Science Center, Memphis, Tennessee, United States of America

  • Anindya Bhattacharya,

    Affiliations Department of Microbiology, Immunology and Biochemistry, University of Tennessee Health Science Center, Memphis, Tennessee, United States of America, Center for Integrative and Translational Genomics, University of Tennessee Health Science Center, Memphis, Tennessee, United States of America

  • Yan Cui

    ycui2@uthsc.edu

    Affiliations Department of Microbiology, Immunology and Biochemistry, University of Tennessee Health Science Center, Memphis, Tennessee, United States of America, Center for Integrative and Translational Genomics, University of Tennessee Health Science Center, Memphis, Tennessee, United States of America

Abstract

Determining the functional impact of somatic mutations is crucial to understanding tumorigenesis and metastasis. Recent sequencing of several cancer genomes has provided comprehensive lists of somatic mutations across entire genomes, enabling investigation of the functional impact of somatic mutations in non-coding regions. Here, we study somatic mutations identified in the 3′UTRs of genes in four cancers and computationally predict how they may alter miRNA targeting, potentially resulting in dysregulation of the expression of the genes harboring these mutations. We find that somatic mutations create or disrupt putative miRNA target sites in the 3′UTRs of many genes, including several genes, such as MITF, EPHA3, TAL1, SCG3, and GSDMA, which have been previously associated with cancer. We also integrate the somatic mutations with germline mutations and results of association studies. Specifically, we identify putative miRNA target sites in the 3′UTRs of BMPR1B, KLK3, and SPRY4 that are disrupted by both somatic and germline mutations and are also in linkage disequilibrium blocks with high scoring markers from cancer association studies. The somatic mutation in BMPR1B is located in a target site of miR-125b; germline mutations in this target site have previously been shown to disrupt regulation of BMPR1B by miR-125b and have been linked with cancer.

Introduction

The genomes of most adult human cancers contain thousands of somatic mutations [1], and a critical aspect of cancer research is determining which of these somatic mutations have crucial functional impact on biological processes related to tumorigenesis and metastasis [2], [3], [4]. Until recently, efforts to sequence cancer genomes have focused on the impact of mutations in coding regions and on identifying non-synonymous point mutations, small frameshift deletions, or large genomic rearrangements that may, for example, create fusion genes [5], [6]. With the rapid advances in sequencing technologies, it has become possible to sequence and compare whole genomes of normal and cancer tissues from the same individual to identify somatic mutations [7]. Recently, the entire genomes of normal and cancer tissues in patients with lung cancer [8], melanoma [9], small cell lung cancer (SCLC) [10], and prostate cancer [11] have been sequenced, providing somatic mutations in these cancers in both coding and non-coding regions. However, there has, to this point, been limited investigation of the effect of non-coding somatic mutations on cancer pathogenesis.

One effect of somatic mutations in non-coding regions that has the potential to significantly impact cellular functions associated with cancer is the alteration of microRNA (miRNA) targeting. MicroRNAs are small, non-coding RNAs that function as posttranscriptional regulators of mRNA expression, typically by inhibiting translation or causing the degradation of their mRNA targets. Many miRNAs are up- or down-regulated in cancers, indicating that they act as oncogenes or tumor suppressors, respectively, and miRNA expression profiles have been used to accurately classify cancer subtypes [12]. MicroRNAs have been shown to control many important cellular processes that are altered in cancers, including differentiation, proliferation, and apoptosis [13]. The function of miRNAs is particularly sensitive to genetic variants because complementarity between the seed region of the miRNA and an mRNA sequence is often required for miRNA targeting [14]. Therefore, it is not surprising that germline mutations that disrupt miRNA targeting have been found to play important roles in many diseases [15], [16], [17], [18], including several types of cancer [19], such as melanoma [20], leukemia [21], [22], and breast cancer [23], [24], as well as in oncogenic transformation [25]. Germline mutations that alter miRNA target sites have also been investigated as the functional causative variants that underlie the results of genome-wide association studies (GWAS) [26], [27]. Recently, a somatic mutation in the 3′UTR of TNFAIP2, a known target of the PRAM1 oncogene, was shown to create a new miRNA target site that results in a reduction of TNFAIP2 expression in a patient with acute myeloid leukemia [28]. This example illustrates the potential for somatic mutations to alter miRNA targeting and contribute to pathogenesis, but there has, to this point, been limited investigation of somatic mutations located in miRNA target sites.

Here, we systematically examine how somatic mutations may alter miRNA targeting (Figure 1). First, we collect somatic mutations in 3′UTRs, the genomic regions that are typically considered to be the most common binding sites of miRNAs, obtained from whole genome sequences of four cancers and analyze the patterns of these 3′UTR mutations. Next, we computationally predict how 3′UTR somatic mutations alter miRNA target sites and identify which of these somatic mutations may be particularly relevant to cancer pathogenesis. We determine somatic mutations that are both located within genes that have been linked with cancer and alter putative target sites of cancer-related miRNAs. We also attempt to link alteration of miRNA targeting with cancer through integration of these somatic mutations with the results of association studies. We identify three miRNA target sites that are altered by both somatic and germline mutations in linkage disequilibrium blocks with high scoring markers identified in GWAS of cancers.

Figure 1. Overview of the study.

Somatic mutations within putative miRNA target sites are linked with cancer-related genes and miRNAs as well as the results of cancer association studies.

https://doi.org/10.1371/journal.pone.0047137.g001

Results

Patterns of somatic mutations in 3′UTRs

We collected a total of 610 somatic mutations in 3′UTRs from four cancers (SCLC, melanoma, lung, and prostate). Excepting prostate cancer, somatic mutations were determined from whole genome sequencing of single samples; seven samples were sequenced for prostate cancer. None of the somatic mutations in 3′UTRs were identified in multiple cancer types. Only 1 of the 152 (0.66%) somatic mutations in 3′UTRs identified in prostate cancer was found in multiple samples (a T>C substitution at 30693148 in the 3′UTR of TUBB that was found in two prostate cancer samples). The occurrence of somatic mutations in multiple prostate cancer samples across the entire genome was similarly rare, as only 116 of the 28626 (0.41%) somatic mutations in prostate cancer were found in multiple samples genome-wide. To compare the types of substitutions that occurred in each cancer type, we calculated the frequency of each class of single base substitution (Figure 2). The distributions of substitutions in 3′UTRs varied across types of cancers. For example, the majority of melanoma substitutions were G>A/C>T, while the most prevalent mutations in both lung and SCLC samples were G>T/C>A substitutions. These trends agreed with the rates of the mutations found across all regions of the genome for each type of cancer, and, in general, the percentages of mutations for each type of substitution were similar for 3′UTRs and for the entire genome. Together, these results indicate that mutations in 3′UTRs have similar causes (e.g., ultraviolet exposure for melanoma, smoking for lung cancer) as the mutations in the entire genome.
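The substitution classes compared above pair each change with its reverse-strand equivalent (for example, G>A with C>T). The following is a minimal sketch of how such class percentages could be tallied; the function names and the input list are hypothetical and are not taken from the study, and the class labels are simply sorted alphabetically, so G>A/C>T appears as "C>T/G>A".

```python
from collections import Counter

# Watson-Crick complements, used to map a substitution onto the opposite strand.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def substitution_class(ref, alt):
    """Return one label for a substitution and its reverse-strand equivalent."""
    forward = f"{ref}>{alt}"
    reverse = f"{COMPLEMENT[ref]}>{COMPLEMENT[alt]}"
    # Sorting makes G>A and C>T share the label "C>T/G>A".
    return "/".join(sorted([forward, reverse]))


def class_percentages(substitutions):
    """Percentage of each substitution class among (ref, alt) pairs."""
    counts = Counter(substitution_class(ref, alt) for ref, alt in substitutions)
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


# Hypothetical input: four single base substitutions.
example = [("G", "A"), ("C", "T"), ("G", "T"), ("T", "C")]
percentages = class_percentages(example)
```

Running the same tally once on the 3′UTR substitutions and once on the genome-wide substitutions would give the two sets of percentages that are compared for each cancer type.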

Figure 2. Frequency of single base substitutions.

The percentage of each class of substitution among somatic mutations in 3′UTRs (black bar) or across the entire genome (white bar) is shown for (A) lung cancer, (B) SCLC, (C) melanoma, and (D) prostate cancer.

https://doi.org/10.1371/journal.pone.0047137.g002

We also investigated whether somatic mutations in 3′UTRs were more likely to be located at the 5′ end or 3′ end of the 3′UTR. For each somatic mutation, we compared the distance from the start of the 3′UTR (i.e., the end of the final exon) to the mutation with the total length of the 3′UTR. We then counted the number of somatic mutations in different sections of the 3′UTRs using a rolling window with a width of 5% and found that the number of somatic mutations varied considerably along the 3′UTR (Figure 3). The overall pattern of the distribution of all of the somatic mutations (Figure 3a) most closely matches that obtained from lung cancer (Figure 3b), the study that produced the largest number of mutations. In lung cancer (Figure 3b), there are many mutations immediately downstream of the end of the final coding exon, with the number of mutations sharply decreasing as the distance approached 10% of the 3′UTR length.
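The relative-position calculation and the rolling-window count described above can be sketched as follows. This is an illustration only: the function names are hypothetical, the 1% step between windows is an assumption (the text specifies only the 5% window width), and coordinate conventions are simplified.

```python
def relative_position(utr_start, utr_end, mutation_pos, strand="+"):
    """Distance from the 3'UTR start to the mutation, as a percentage of UTR length.

    On the minus strand the 3'UTR starts at the higher genomic coordinate.
    """
    length = utr_end - utr_start
    if strand == "+":
        offset = mutation_pos - utr_start
    else:
        offset = utr_end - mutation_pos
    return 100.0 * offset / length


def rolling_window_counts(percent_positions, width=5, step=1):
    """Count mutations in windows of `width` percent, moved in `step` percent steps.

    The step size is an assumption made for this sketch.
    """
    counts = []
    for start in range(0, 100 - width + 1, step):
        end = start + width
        n = sum(
            1
            for p in percent_positions
            # Windows are half-open, except that the last one includes 100%.
            if start <= p < end or (end == 100 and p == 100)
        )
        counts.append((start, n))
    return counts


# Hypothetical example: three mutations at 2%, 3.5%, and 50% of the 3'UTR length.
window_counts = rolling_window_counts([2.0, 3.5, 50.0])
```

Plotting the second element of each tuple against the window start would give a profile of the kind summarized in Figure 3.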

Figure 3. Location of somatic mutations in 3′UTRs.

For each somatic mutation, the percentage of the distance from the start of the 3′UTR to the somatic mutation compared to the total length of the 3′UTR was calculated. The figure shows the number of mutations in rolling windows of 5% of the 3′UTR length for somatic mutations in (A) all cancer types, (B) lung cancer, (C) SCLC, (D) melanoma, and (E) prostate cancer.

https://doi.org/10.1371/journal.pone.0047137.g003

Somatic mutations in 3′UTRs alter miRNA targeting

While a complete understanding of how the mRNA targets of a miRNA are selected has yet to be elucidated, sequence complementarity between nucleotides at the 5′ end, or seed region, of the mature miRNA sequence and a mRNA target site, which is typically in the 3′UTR, is common to many miRNA-mRNA pairs. Dozens of computational methods for predicting the targets of miRNAs have been developed, based on complementarity, as well as other criteria including conservation of the target site across species, target site accessibility in the secondary structure of the mRNA, the sequence context of the target site, and the thermodynamics of binding [29], [30]. We used two methods to identify somatic mutations with the potential to impact miRNA targeting (Table S1). First, we calculated context+ scores using the latest version of TargetScan [31], one of the most widely used and highest performing miRNA prediction tools [32], [33], for two sets of 3′UTR sequences, one containing the allele found in the normal tissue and one containing the allele found in cancer tissue. We then identified somatic mutations that were located within target sites predicted by TargetScan and impacted context+ scores. Second, we attempted to create a more inclusive list of 3′UTR somatic mutations that impact miRNA targeting by determining the mutations that alter 6mer, 7mer, or 8mer sites complementary to miRNA seeds. This second approach was motivated by recent analysis of mRNA sequences targeted by miRNAs in CLIP-Seq experiments in human [34] and HITS-CLIP experiments in mouse [35] that found that while longer (e.g., 7 nt and 8 nt) matches between the mRNA sequence and miRNA seed had higher specificities, the majority of functional target sites contained only 6 nt matches [36].

Given the large number of unique miRNA seeds, we expected to find that most somatic mutations either disrupted or created at least a 6mer match to a miRNA seed (Table S1). Of the 610 somatic mutations in 3′UTRs, 608 altered at least a 6mer long potential miRNA binding site and 525 altered context+ scores calculated by TargetScan 6.0 for at least one miRNA. We then attempted to identify the somatic mutations that were high-priority candidates for a role in cancer pathogenesis. First, we selected only miRNA-mRNA pairs for which the somatic mutation changed the context+ score of a miRNA targeting the mRNA by a magnitude greater than 0.2, providing the somatic mutations in target sites that were in the top 15% of those most likely to be functional based on the context+ score. Next, we limited the impacted putative target sites based on the miRNA and removed miRNAs that either had low expression (fewer than 100 total reads) in the RNA-Seq experiments collected in miRBase [37] or had not been previously associated with cancer in the PhenomiR database [38]. Finally, we used the Cancer Gene Census [39] and other literature sources to identify genes that are known tumor suppressors or oncogenes, or that have other functional associations with cancer. Table 1 contains a selection of the somatic mutations that altered miRNA targeting and met these criteria. We also examined tissue- and cancer-specific miRNA expression to identify miRNAs that have been shown to be highly expressed in the particular tissue or cancer in which the somatic mutations were identified (Table S1). Several of the somatic mutations in Table 1, including those in TAL1, BMPR1B, KDM5A, SCG3, and BCAS3, impacted target sites of miRNAs that have been shown to be expressed in the same tissue in which the somatic mutation was identified.
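The prioritization steps above amount to three filters applied to each miRNA-mRNA pair. A minimal sketch is given below, assuming a hypothetical record layout (the field names `ref_score`, `mut_score`, `total_reads`, and `mirna` are invented for illustration) and assuming that a missing target site is represented by a context+ score of 0.0, which the text does not specify.

```python
def passes_filters(record, cancer_mirnas, min_delta=0.2, min_reads=100):
    """Apply the score-change, expression, and cancer-association filters."""
    delta = abs(record["mut_score"] - record["ref_score"])
    return (
        delta > min_delta                        # context+ change greater than 0.2
        and record["total_reads"] >= min_reads   # not a low-expression miRNA
        and record["mirna"] in cancer_mirnas     # previously associated with cancer
    )


# Hypothetical records; the scores and read counts are made up for illustration.
records = [
    {"mirna": "miR-A", "ref_score": -0.35, "mut_score": 0.0, "total_reads": 5000},
    {"mirna": "miR-B", "ref_score": -0.10, "mut_score": 0.0, "total_reads": 5000},
    {"mirna": "miR-C", "ref_score": -0.40, "mut_score": 0.0, "total_reads": 12},
]
cancer_mirnas = {"miR-A", "miR-B", "miR-C"}
selected = [r["mirna"] for r in records if passes_filters(r, cancer_mirnas)]
```

The gene-level step (matching against the Cancer Gene Census and the literature) is a separate lookup and is not shown here.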

Table 1. Selected somatic mutations that alter miRNA target sites in cancer-related genes.

https://doi.org/10.1371/journal.pone.0047137.t001

Of particular interest are oncogenes with somatic mutations that disrupt miRNA targeting and tumor suppressors with somatic mutations that create new miRNA targets, as these mutations could potentially explain the respective up- and down-regulation of these genes in cancers (mutations meeting this criterion are shown in bold in Table 1). For example, increased expression of TAL1 [40], SCG3 [41] and GSDMA [42], [43] has been observed in cancers, and somatic mutations in the 3′UTRs of these genes disrupt putative targets of miRNAs that have been associated with cancer. The disruption of these target sites may prevent regulation of the levels of these genes by miRNAs, leading to higher expression. In contrast, EPHA3 [44] and MITF [45] are under-expressed in cancers or have been shown to act as tumor suppressors; the somatic mutations in these genes may create new target sites that lead to increased inhibition of translation or degradation of the mRNAs. Notably, one of the somatic mutations selected by this method impacted an experimentally validated target site of miR-125b in BMPR1B [46], which will be examined in more detail in the next section.

GWAS- and CGAS-informed functional analysis of somatic mutations that alter miRNA targeting

Genome-wide and candidate gene association studies have identified a large, and growing, number of genomic locations harboring germline mutations associated with increased risk for cancer. In many cases, the specific germline mutations that underlie these associations and their functional impact remain unknown; however, germline mutations that alter miRNA targeting have been identified as promising candidates for explaining the increased risk for several cancers [19]. Therefore, we attempted to integrate the somatic mutations that alter miRNA targeting with germline mutations and the results of association studies. We sought to identify miRNA target sites in linkage disequilibrium with high scoring markers from association studies that are altered by both germline mutations and somatic mutations identified in cancers. Specifically, we identified both experimentally supported and computationally predicted miRNA target sites altered by somatic mutations that were also altered by germline mutations, and then determined whether the target was in the same haplotype block as high scoring markers from cancer association studies. Three genes, BMPR1B, KLK3, and SPRY4, contained miRNA target sites altered by both somatic and germline mutations that were in linkage disequilibrium blocks containing high scoring association study markers (Table 2 and Figure 4).

Figure 4. Disruption of miRNA target sites that are in linkage disequilibrium with high scoring markers (purple) from cancer association studies by both germline (blue) and somatic (red) mutations.

(A) Disruption of a target site of miR-125b in the 3′UTR of BMPR1B. (B) Disruption of a target site of miR-210 in the 3′UTR of KLK3. (C) Disruption of a target site of miR-608 in the 3′UTR of SPRY4.

https://doi.org/10.1371/journal.pone.0047137.g004

Table 2. Somatic mutations that alter miRNA targeting in linkage disequilibrium blocks of association study markers.

https://doi.org/10.1371/journal.pone.0047137.t002

The 3′UTR of BMPR1B contains a binding site for miR-125b that is disrupted by both a somatic mutation that was identified in lung cancer (chr4:g.96075969G>T) and a germline SNP (rs1434536). This target site is also in a haplotype block with rs11097457, one of the top 100 highest scoring markers in the Cancer Genetic Markers of Susceptibility (CGEMS) study, which is associated with breast cancer risk [46] (Figure 4a). The R2 value for correlation between rs11097457 and rs1434536 in the 1000 Genomes Project [47] is 0.82. The targeting of BMPR1B by miR-125b and the possibility that genetic variants disrupt this target site and play a role in cancer have been previously studied [46]. Saetrom et al. found that rs1434536 was in strong linkage disequilibrium with two high scoring markers in a breast cancer association study, confirmed the association in an independent breast cancer cohort, and showed that the SNP disrupted regulation of BMPR1B by miR-125b.

Both a somatic mutation (chr19:g.51363764A>C) and a germline mutation (rs1803136) in the 3′UTR of KLK3, a gene whose expression is commonly used as a diagnostic marker in prostate cancer [48], disrupted predicted target sites for miR-675, miR-138, and miR-210. These target sites were in the same linkage disequilibrium block as, and only ∼850 base pairs away from, rs2735839 (Figure 4b), which was strongly associated with increased risk in a GWAS of prostate cancer [49]. Moreover, the somatic mutation (chr19:g.51363764A>C) was identified in a patient with prostate cancer [11]. There has also been previous evidence that miR-675 [50], miR-210 [51] and miR-138 [52] regulate cancer cell proliferation. We also found a somatic mutation (chr5:g.141691500G>T) and a germline mutation (rs72117814) within a predicted binding site for miR-608 in the 3′UTR of SPRY4, which was located in the same linkage disequilibrium block as rs4624820, a high-ranking marker in a testicular cancer GWAS [53], [54] (Figure 4c). SPRY4 inhibits the mitogen-activated protein kinase (MAPK) pathway, which is activated by the KITLG-KIT pathway, which has been associated with testicular cancer [53]. Because the germline mutations that disrupt target sites in SPRY4 and KLK3 are not included in the 1000 Genomes Project or HapMap data, we were not able to calculate the correlation between the germline SNPs and the high-ranking GWAS markers.

Discussion

Recent sequencing of the entire genomes of normal and cancer tissues from the same individual has provided comprehensive lists of somatic mutations. While there have been several efforts to identify the functional impact of somatic mutations in coding regions [5], [55], non-coding somatic mutations have received relatively little attention, despite the importance of these regions to gene regulation. One report investigated the rates of non-coding somatic mutations in multiple myeloma and observed that many non-coding mutations were near coding regions with known somatic hypermutation and that the mutation frequency in some non-coding regions was greater than that expected by chance [56], but the functional impact of these non-coding mutations was not investigated. Here, we made an initial effort to identify non-coding somatic mutations that have the potential to cause dysregulation of gene expression and contribute to cancer pathogenesis. Specifically, we focused on somatic mutations located in 3′UTRs and investigated how these mutations may alter miRNA targeting. We found that the distributions of the different types of single base substitutions among somatic mutations in 3′UTRs varied for different types of cancers, but agreed with the distributions across the entire genome in each cancer type (Figure 2). We also investigated the distribution of somatic mutations across the 3′UTRs and found that, for lung cancer, there was a large number of somatic mutations located in the 3′UTR very near the final coding exon. The distribution of mutations across genes has been used to determine the selective application of DNA repair, and it has been shown that DNA repair is more common on transcribed strands than on non-transcribed strands and at the 5′ end of genes than at the 3′ end [9]. While the large number of somatic mutations in the 3′UTR near the final coding exon in lung cancer is only an initial result based on a relatively small number of somatic mutations, observation of similar behavior as more somatic mutations are identified may enable increased understanding of DNA repair in the 3′UTR.

One way in which somatic mutations within 3′UTRs may have a functional impact is if they impact miRNA targeting by disrupting or creating miRNA target sites. We specifically identified somatic mutations that are predicted to disrupt miRNA target sites within genes, including TAL1, SCG3, and GSDMA, that are over-expressed in cancer and mutations that are predicted to create new miRNA target sites within genes, including MITF and EPHA3, that are underexpressed in cancer. While it is straightforward to identify how somatic mutations may impact miRNA function through these two modes (oncogenes with disrupted sites and tumor suppressors with created sites), it is likely that dysregulation of miRNA function in cancer occurs through more complex relationships that may not be consistent for all types of cancer. For example, several miRNAs, including the miR-17-19b cluster [12], [57], [58], and genes, including CDH1 [59], have been shown to have oncogenic properties in some cancer types while acting as tumor suppressors in others. Additionally, miRNAs increase the expression of their targets in some cases [60].

Greenberg et al. [61] investigated the global impact of somatic mutations in melanoma, lung cancer, and leukemia. They found that the mutations in melanoma decreased the binding of miRNAs to 3′UTRs, but did not observe as significant a decrease in binding for somatic mutations in the other cancers. They attributed this result to UV-induced mutations found in melanoma being primarily Strong-to-Weak mutations (i.e., those mutations which reduce thermodynamic hybridization stability). While we focused on how the somatic mutations impacted complementarity between miRNA seeds and target sites, and not the impact of the mutations on binding energy, several of our results agreed with the conclusions of Greenberg et al. We found that the frequencies of the single base substitutions varied across cancer types (Figure 2), resulting in more Strong-to-Weak mutations in melanoma than in other cancers. We can also use our results (Table S1) to compare with Greenberg et al. by calculating the ratio of the number of putative miRNA target sites disrupted by somatic mutations to the number of putative miRNA target sites created by the somatic mutations. The disrupted to created target site ratio is 1.18 for melanoma mutations, which is similar to the ratio found in SCLC (1.19) and higher than that found in prostate (1.12) and lung cancer (1.08), suggesting that it is possible that the somatic mutations in melanoma result in an overall decrease in miRNA binding in comparison with normal tissues and other cancers.

We attempted to identify important functional somatic mutations by leveraging the results of association studies. We identified target sites that contain both somatic and germline mutations and are in linkage disequilibrium blocks with high scoring markers from association studies of cancers. This procedure integrates two sources of information indicating the possibility that alteration of the target site plays a role in cancer: the germline mutation in the target site is a potential cause of the increased risk associated with the linked marker in the association study, while the somatic mutation in the target may play a role in tumorigenesis in other individuals. We identified three target sites located in BMPR1B, KLK3, and SPRY4 that contain both somatic and germline mutations and are linked with association studies. Both the genes containing these somatic mutations and the miRNAs that target these sites have been previously associated with cancer. A 3′UTR somatic mutation in BMPR1B identified in a lung cancer patient disrupts the specific target site of miR-125b that has previously been investigated for its role in cancer [46]. The target site contains a SNP, rs1434536, that is in linkage disequilibrium with two high scoring markers in a breast cancer association study and results in disruption of the regulation of BMPR1B by miR-125b. The somatic mutation indicates a second path through which the regulation of the gene by miRNAs could be disrupted, potentially contributing to tumorigenesis. While there has not been such strong experimental support for mutations disrupting the regulation of KLK3 [49] and SPRY4 [53], [54] by miRNAs in cancer, both of these genes have strong associations with cancer. Levels of KLK3 are commonly used for diagnosing prostate cancer [48], and the somatic mutation altering miRNA targeting of KLK3 was identified in prostate cancer. SPRY4 is involved in the KITLG-KIT pathway, which has been associated with cancer [53]. Additionally, two somatic mutations (chr12:g.88889449G>A and chr12:g.88887136G>A), in putative binding sites for miR-203 and miR-183, respectively, were located in the 3′UTR of KITLG. Expression of miR-183 has been shown to be correlated with expression of miR-203 [62], and both miRNAs are involved in suppression of expression of stem cell factors in cancer cells [62] and in proliferation of cancer [62], [63]. The KITLG somatic mutations are in a linkage disequilibrium block with rs995030, a marker SNP that is strongly associated with testicular cancer risk [53]. Therefore, these somatic mutations in the 3′UTRs of SPRY4 and KITLG are promising candidates for contributions to tumorigenesis by the dysregulation of the KITLG-KIT pathway.

While the current study was able to identify somatic mutations that may impact miRNA targeting and play a role in cancer pathogenesis, it is limited by several factors. First, all but one of the somatic mutations studied here were identified in a single patient, and, therefore, the mutations may not commonly be found in other patients or may not be generalizable to other populations and cancer etiologies. Second, due to the relatively small number of experimentally known miRNA binding sites and a lack of understanding of the specifics of miRNA targeting, this study was, in most cases, only able to identify somatic mutations that alter predicted miRNA target sites. Specifically, we focused on how somatic mutations impact sequences within 3′UTRs complementary to miRNA seeds, as these features have been the focus of most miRNA targeting prediction algorithms; however, this approach neglects how somatic mutations within other locations in a target site, such as 3′ compensatory sites, may impact binding. Additionally, while 3′UTRs have traditionally been believed to harbor the majority of miRNA target sites, several recent experiments have shown that 5′UTRs [64] and coding regions [65] also contain functional miRNA targets. In the coming years, we expect that improvements in sequencing technologies may be able to address these limitations, increasing understanding of how alteration of miRNA targeting by germline and somatic mutations plays a role in cancer and other diseases. New experimental techniques, such as CLIP-Seq [34], [35], have the promise to provide both extensive lists of experimentally supported miRNA target sites and the basis for a more complete understanding of miRNA targeting, potentially improving computational target predictions. Also, the number of somatic mutations and cancer-associated markers from GWAS will likely continue to grow rapidly, and methods that integrate these resources will therefore become increasingly fruitful. In particular, increasing the number of known somatic mutations will allow for the identification of mutations that commonly occur in cancer. While we were able to determine only one target site (the target site of miR-125b in BMPR1B) that offered the combination of experimental support, disruption by both germline and somatic mutations, and links with association studies, these developing resources may soon enable the identification of many similar high priority miRNA targets.

Materials and Methods

Sources of somatic mutations in 3′UTRs

Somatic mutations were compiled from the supplementary material of the original papers for lung [8] and prostate [11] cancer and from the non-coding variants of the COSMIC database [66] for SCLC [10] and melanoma [9]. Somatic mutations were determined using SOLiD, for SCLC [10], and Illumina GAII platforms, for melanoma [9] and prostate cancer [11]. The lung cancer mutations [8] were determined using 31- to 35-base mate-paired reads from DNA nanoarrays produced by adsorbing sequence substrate to silicon substrates with grid-patterned arrays. To determine somatic mutations that are located in 3′UTRs, we compared the location of the mutation with the start and end locations of 3′UTRs of RefSeq genes from the UCSC genome browser [67], [68]. When necessary, we used the liftover tool in the Galaxy web-server [69] to convert genomic locations to the GRCh37/hg19 assembly of the human genome. To determine the frequency of each class of substitution, we selected only the somatic mutations that were single base substitutions from the list of somatic mutations in 3′UTRs as well as from the complete list of somatic mutations across the entire genome from the supplementary information of the original papers for each of the cancers. To examine the relative location of somatic mutations within 3′UTRs, we first removed mutations that were located in multiple RefSeq genes that had different 3′UTRs.
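The interval comparison described above, together with the removal of mutations that fall in several genes with different 3′UTRs, could be sketched as below. The tuple layout `(chrom, start, end, gene)` and the function names are hypothetical. The sketch assumes that 3′UTR intervals use a 0-based start and 1-based end, as in UCSC table downloads, and that mutation positions are 1-based.

```python
def overlapping_utrs(chrom, pos, utrs):
    """Return the 3'UTR records that contain a 1-based mutation position."""
    return [
        utr for utr in utrs
        # Assumed convention: 0-based start, 1-based end.
        if utr[0] == chrom and utr[1] < pos <= utr[2]
    ]


def has_unambiguous_utr(chrom, pos, utrs):
    """True if the mutation lies in exactly one distinct 3'UTR interval."""
    hits = overlapping_utrs(chrom, pos, utrs)
    distinct_intervals = {(start, end) for _, start, end, _ in hits}
    return len(distinct_intervals) == 1


# Hypothetical 3'UTR annotations; gene names and coordinates are invented.
utrs = [
    ("chr4", 100, 200, "GENE_A"),
    ("chr4", 150, 300, "GENE_B"),
    ("chr5", 100, 200, "GENE_C"),
]
```

With these toy annotations, position 120 on chr4 falls in a single 3′UTR, whereas position 180 falls in two different 3′UTRs and would be excluded from the relative-location analysis.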

miRNA target sites altered by somatic mutation

We collected the sequences of the 3′UTR of all RefSeq genes using the UCSC Genome Browser. For each somatic mutation within a 3′UTR, we then created two sets of sequences, one containing the reference allele at the location of the somatic mutation and one containing the mutant allele. We then used two methods to identify somatic mutations that impacted putative miRNA target sites. First, we used TargetScan 6.0 [31] to calculate the impact of somatic mutations on the context+ score for the interaction between the 3′UTR sequence and all human miRNAs included in miRBase release 18 [37]. We also determined somatic mutations that impact binding to six miRNA seed classes [36], namely, 8mers (bases 1–8 of the miRNA), 7merA (bases 1–7), 7merB (bases 2–8), 6merA (bases 1–6), 6merB (bases 2–7), and 6merC (bases 3–8). We determined somatic mutations in 3′UTR sequences that disrupted, created, and modified potential target sites with perfect Watson-Crick complementarity to the miRNA seeds. Target sites found in the reference sequence and not the mutant sequence were disrupted by the somatic mutation, while target sites found in the mutant sequence and not the reference sequence were created by the somatic mutation. Target sites with different seed match types in the reference and mutant sequences (e.g., a reference sequence with a 6merA match to a miRNA that becomes a 7merA match in the mutated sequence) were modified by the somatic mutation (Table S1).
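The seed-match comparison described above can be illustrated with a short sketch. It is not the pipeline used in the study: the function names are hypothetical, the miRNA in the example is a made-up toy sequence, only single base substitutions are handled, and a site is identified by the 3′UTR position that pairs with miRNA base 1 (the "anchor"), which is a simplifying choice made here.

```python
# Watson-Crick partner of each miRNA base (RNA alphabet).
WC_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Seed classes from the text as (name, first miRNA base, last miRNA base),
# 1-based and inclusive, ordered from longest to shortest.
SEED_CLASSES = [
    ("8mer", 1, 8), ("7merA", 1, 7), ("7merB", 2, 8),
    ("6merA", 1, 6), ("6merB", 2, 7), ("6merC", 3, 8),
]


def best_match(mirna, utr, anchor):
    """Longest seed class fully paired when miRNA base 1 sits opposite utr[anchor].

    Pairing is antiparallel, so miRNA base i pairs with utr[anchor - (i - 1)].
    """
    for name, first, last in SEED_CLASSES:
        paired = True
        for i in range(first, last + 1):
            pos = anchor - (i - 1)
            if pos < 0 or pos >= len(utr) or WC_PAIR[mirna[i - 1]] != utr[pos]:
                paired = False
                break
        if paired:
            return name
    return None


def classify_changes(mirna, ref_utr, mut_utr, mut_index):
    """List sites near a substitution that are disrupted, created, or modified."""
    ref = ref_utr.upper().replace("T", "U")
    mut = mut_utr.upper().replace("T", "U")
    changes = []
    # Only anchors whose 8-nt span covers the mutated position can be affected.
    for anchor in range(mut_index, mut_index + 8):
        before = best_match(mirna, ref, anchor)
        after = best_match(mirna, mut, anchor)
        if before == after:
            continue
        if before is None:
            effect = "created"
        elif after is None:
            effect = "disrupted"
        else:
            effect = "modified"
        changes.append((anchor, before, after, effect))
    return changes


# Toy miRNA (not a real miRBase entry) and toy 3'UTR fragments.
toy_mirna = "ACGUUGCAAUCG"
disrupted = classify_changes(toy_mirna, "GGGTGCAACGTGGG", "GGGTGCCACGTGGG", 6)
modified = classify_changes(toy_mirna, "AAACAACGTAAA", "AAGCAACGTAAA", 2)
```

In the first toy case the substitution removes an 8mer match, and in the second a 6merA match becomes a 7merA match, mirroring the modified example given in the text. Swapping the reference and mutant sequences in the first call would report the same site as created.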

To help identify somatic mutations that altered functional mRNA-miRNA interactions, we collected miRNA expression data from several sources and added these data to Table S1. First, to identify miRNAs that are expressed in any tissue, we used the total number of RNA-Seq reads for mature miRNAs from all experiments included in miRBase release 18 [37]. Additionally, we collected tissue-specific mature miRNA expression data from miRBase for melanoma and from the miRNA sequencing experiments of Landgraf et al. [70] for lung cancer, SCLC, and prostate cancer. Tissue-specific miRNA expression in melanoma was determined by totaling the number of reads for each miRNA from 11 melanoma experiments included in miRBase. Tissue-specific miRNA expression for lung cancers (both lung and SCLC) and prostate cancer was determined by totaling the number of miRNA reads from 4 lung adenocarcinoma samples and 1 prostate sample, respectively.
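
The read totals described above amount to summing per-miRNA counts over a set of experiments. A minimal sketch, assuming each experiment is available as a mapping from mature miRNA name to read count (the function name and data layout are hypothetical):

```python
from collections import Counter


def total_reads(experiments):
    """Sum read counts per mature miRNA across experiments.

    experiments: iterable of dicts mapping miRNA name -> read count.
    """
    totals = Counter()
    for reads in experiments:
        totals.update(reads)  # adds counts key by key
    return totals
```

Applied to the 11 melanoma experiments, or to the 4 lung adenocarcinoma samples, this would give the tissue-specific totals; applied to all experiments, it would give the any-tissue total.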

Linking somatic mutations with association studies

To link somatic mutations that alter miRNA targeting with the results of association studies, we collected high ranking markers from association studies of cancer from dbGaP [71], the NHGRI GWAS Catalog [72], and the Cancer GAMAdb (http://www.hugenavigator.net/CancerGEMKB/caIntegratorStartPage.do). We first determined if the binding sites that were created or disrupted by these somatic mutations were also altered by germline mutations by identifying germline mutations from dbSNP build 132 [73], [74] that were located within seed matches in the mRNA sequences. We then calculated the distance between the target site containing the mutations and the association study markers and examined the linkage disequilibrium (LD) blocks of all markers that were within 100 Kb of an altered target site using Haploview [75]. For all but one highly ranked marker near a mutated target site, the association study was performed in a European population, and we obtained LD blocks using data from the CEU+TSI population from HapMap Project 2, release 27. The remaining GWAS marker (rs1247860) was associated with a cancer phenotype in a Han Chinese population [76]; we used the CHB population in Haploview and determined that no target sites containing somatic mutations were in LD with the marker. For germline mutations contained in the 1000 Genomes Project [47], we calculated r2, the squared correlation, between the GWAS marker and the germline mutations within the LD block using SNAP [77].
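
The 100 Kb distance filter and the r2 statistic can be illustrated with the sketch below. It is not a substitute for Haploview or SNAP, which work from HapMap and 1000 Genomes data; the function names, the marker tuple layout, and the use of phased haplotypes with alleles coded 0/1 are assumptions made for illustration.

```python
def markers_near_site(site, markers, window=100_000):
    """Return (marker_id, distance) for markers within `window` bp of a site.

    site:    (chrom, pos) of the altered target site
    markers: iterable of (marker_id, chrom, pos)
    """
    chrom, pos = site
    return [
        (marker_id, abs(marker_pos - pos))
        for marker_id, marker_chrom, marker_pos in markers
        if marker_chrom == chrom and abs(marker_pos - pos) <= window
    ]


def r_squared(haplotypes):
    """Squared correlation between two biallelic loci.

    haplotypes: list of (a, b) pairs, one per phased chromosome,
    with alleles coded 0/1 at each locus.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r2 is undefined when a locus is monomorphic")
    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    return d * d / denom
```

Two loci whose alleles always travel together give r2 = 1, and loci whose alleles are independent give r2 = 0.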

Supporting Information

Table S1.

Impact of somatic mutations on miRNA target sites.

https://doi.org/10.1371/journal.pone.0047137.s001

(XLS)

Author Contributions

Conceived and designed the experiments: JZ YC. Performed the experiments: JZ AB. Analyzed the data: JZ AB. Wrote the paper: JZ AB YC.

References

  1. Stratton MR (2011) Exploring the genomes of cancer cells: progress and promise. Science 331: 1553–1558.
  2. Boehm JS, Hahn WC (2011) Towards systematic functional characterization of cancer genomes. Nat Rev Genet 12: 487–498.
  3. Chin L, Gray JW (2008) Translating insights from the cancer genome into clinical practice. Nature 452: 553–563.
  4. Chin L, Andersen JN, Futreal PA (2011) Cancer genomics: from discovery science to personalized medicine. Nat Med 17: 297–303.
  5. Shi Z, Moult J (2011) Structural and functional impact of cancer-related missense somatic mutations. J Mol Biol 413: 495–512.
  6. Reva B, Antipin Y, Sander C (2011) Predicting the functional impact of protein mutations: application to cancer genomics. Nucleic Acids Res 39: e118.
  7. Chin L, Hahn WC, Getz G, Meyerson M (2011) Making sense of cancer genomic data. Genes Dev 25: 534–555.
  8. Lee W, Jiang Z, Liu J, Haverty PM, Guan Y, et al. (2010) The mutation spectrum revealed by paired genome sequences from a lung cancer patient. Nature 465: 473–477.
  9. Pleasance ED, Cheetham RK, Stephens PJ, McBride DJ, Humphray SJ, et al. (2010) A comprehensive catalogue of somatic mutations from a human cancer genome. Nature 463: 191–196.
  10. Pleasance ED, Stephens PJ, O'Meara S, McBride DJ, Meynert A, et al. (2010) A small-cell lung cancer genome with complex signatures of tobacco exposure. Nature 463: 184–190.
  11. Berger MF, Lawrence MS, Demichelis F, Drier Y, Cibulskis K, et al. (2011) The genomic complexity of primary human prostate cancer. Nature 470: 214–220.
  12. Esquela-Kerscher A, Slack FJ (2006) Oncomirs - microRNAs with a role in cancer. Nat Rev Cancer 6: 259–269.
  13. Medina PP, Slack FJ (2008) microRNAs and cancer: an overview. Cell Cycle 7: 2485–2492.
  14. Saunders MA, Liang H, Li WH (2007) Human polymorphism at microRNAs and microRNA target sites. Proc Natl Acad Sci U S A 104: 3300–3305.
  15. Sethupathy P, Collins FS (2008) MicroRNA target site polymorphisms and human disease. Trends Genet 24: 489–497.
  16. Bao L, Zhou M, Wu L, Lu L, Goldowitz D, et al. (2007) PolymiRTS Database: linking polymorphisms in microRNA target sites with complex traits. Nucleic Acids Res 35: D51–54.
  17. Chen K, Song F, Calin GA, Wei Q, Hao X, et al. (2008) Polymorphisms in microRNA targets: a gold mine for molecular epidemiology. Carcinogenesis 29: 1306–1311.
  18. Ziebarth JD, Bhattacharya A, Chen A, Cui Y (2012) PolymiRTS Database 2.0: linking polymorphisms in microRNA target sites with human diseases and complex traits. Nucleic Acids Res 40: D216–221.
  19. Ryan BM, Robles AI, Harris CC (2010) Genetic variation in microRNA networks: the implications for cancer research. Nat Rev Cancer 10: 389–402.
  20. Godshalk SE, Paranjape T, Nallur S, Speed W, Chan E, et al. (2011) A Variant in a MicroRNA complementary site in the 3′ UTR of the KIT oncogene increases risk of acral melanoma. Oncogene 30: 1542–1550.
  21. Calin GA, Dumitru CD, Shimizu M, Bichi R, Zupo S, et al. (2002) Frequent deletions and down-regulation of micro-RNA genes miR15 and miR16 at 13q14 in chronic lymphocytic leukemia. Proc Natl Acad Sci U S A 99: 15524–15529.
  22. Calin GA, Ferracin M, Cimmino A, Di Leva G, Shimizu M, et al. (2005) A MicroRNA signature associated with prognosis and progression in chronic lymphocytic leukemia. N Engl J Med 353: 1793–1801.
  23. Chen AX, Yu KD, Fan L, Li JY, Yang C, et al. (2011) Germline genetic variants disturbing the Let-7/LIN28 double-negative feedback loop alter breast cancer susceptibility. PLoS Genet 7: e1002259.
  24. Zhang L, Liu Y, Song F, Zheng H, Hu L, et al. (2011) Functional SNP in the microRNA-367 binding site in the 3′UTR of the calcium channel ryanodine receptor gene 3 (RYR3) affects breast cancer risk and calcification. Proc Natl Acad Sci U S A 108: 13653–13658.
  25. Mayr C, Hemann MT, Bartel DP (2007) Disrupting the pairing between let-7 and Hmga2 enhances oncogenic transformation. Science 315: 1576–1579.
  26. Thomas LF, Saito T, Saetrom P (2011) Inferring causative variants in microRNA target sites. Nucleic Acids Res 39: e109.
  27. Richardson K, Lai CQ, Parnell LD, Lee YC, Ordovas JM (2011) A genome-wide survey for SNPs altering microRNA seed sites identifies functional candidates in GWAS. BMC Genomics 12: 504.
  28. Ramsingh G, Koboldt DC, Trissal M, Chiappinelli KB, Wylie T, et al. (2010) Complete characterization of the microRNAome in a patient with acute myeloid leukemia. Blood 116: 5316–5326.
  29. Hammell M (2010) Computational methods to identify miRNA targets. Semin Cell Dev Biol 21: 738–744.
  30. Saito T, Saetrom P (2010) MicroRNAs–targeting and target prediction. N Biotechnol 27: 243–249.
  31. Garcia DM, Baek D, Shin C, Bell GW, Grimson A, et al. (2011) Weak seed-pairing stability and high target-site abundance decrease the proficiency of lsy-6 and other microRNAs. Nat Struct Mol Biol 18: 1139–1146.
  32. Sethupathy P, Megraw M, Hatzigeorgiou AG (2006) A guide through present computational approaches for the identification of mammalian microRNA targets. Nat Methods 3: 881–886.
  33. Alexiou P, Maragkakis M, Papadopoulos GL, Reczko M, Hatzigeorgiou AG (2009) Lost in translation: an assessment and perspective for computational microRNA target identification. Bioinformatics 25: 3049–3055.
  34. Hafner M, Landthaler M, Burger L, Khorshid M, Hausser J, et al. (2010) Transcriptome-wide identification of RNA-binding protein and microRNA target sites by PAR-CLIP. Cell 141: 129–141.
  35. Chi SW, Zang JB, Mele A, Darnell RB (2009) Argonaute HITS-CLIP decodes microRNA-mRNA interaction maps. Nature 460: 479–486.
  36. Ellwanger DC, Buttner FA, Mewes HW, Stumpflen V (2011) The sufficient minimal set of miRNA seed types. Bioinformatics 27: 1346–1350.
  37. Kozomara A, Griffiths-Jones S (2011) miRBase: integrating microRNA annotation and deep-sequencing data. Nucleic Acids Res 39: D152–157.
  38. Ruepp A, Kowarsch A, Schmidl D, Buggenthin F, Brauner B, et al. (2010) PhenomiR: a knowledgebase for microRNA expression in diseases and biological processes. Genome Biol 11: R6.
  39. Futreal PA, Coin L, Marshall M, Down T, Hubbard T, et al. (2004) A census of human cancer genes. Nat Rev Cancer 4: 177–183.
  40. Cardoso BA, de Almeida SF, Laranjeira AB, Carmo-Fonseca M, Yunes JA, et al. (2011) TAL1/SCL is downregulated upon histone deacetylase inhibition in T-cell acute lymphoblastic leukemia cells. Leukemia 25: 1578–1586.
  41. Moss AC, Jacobson GM, Walker LE, Blake NW, Marshall E, et al. (2009) SCG3 transcript in peripheral blood is a prognostic biomarker for REST-deficient small cell lung cancer. Clin Cancer Res 15: 274–283.
  42. Saeki N, Usui T, Aoyagi K, Kim DH, Sato M, et al. (2009) Distinctive expression and function of four GSDM family genes (GSDMA-D) in normal and malignant upper gastrointestinal epithelium. Genes Chromosomes Cancer 48: 261–271.
  43. Saeki N, Kim DH, Usui T, Aoyagi K, Tatsuta T, et al. (2007) GASDERMIN, suppressed frequently in gastric cancer, is a target of LMO1 in TGF-beta-dependent apoptotic signalling. Oncogene 26: 6488–6498.
  44. Fox BP, Tabone CJ, Kandpal RP (2006) Potential clinical relevance of Eph receptors and ephrin ligands expressed in prostate carcinoma cell lines. Biochem Biophys Res Commun 342: 1263–1272.
  45. Bemis LT, Chen R, Amato CM, Classen EH, Robinson SE, et al. (2008) MicroRNA-137 targets microphthalmia-associated transcription factor in melanoma cell lines. Cancer Res 68: 1362–1368.
  46. Saetrom P, Biesinger J, Li SM, Smith D, Thomas LF, et al. (2009) A risk variant in an miR-125b binding site in BMPR1B is associated with breast cancer pathogenesis. Cancer Res 69: 7459–7465.
  47. The 1000 Genomes Project Consortium (2010) A map of human genome variation from population-scale sequencing. Nature 467: 1061–1073.
  48. Penney KL, Schumacher FR, Kraft P, Mucci LA, Sesso HD, et al. (2011) Association of KLK3 (PSA) genetic variants with prostate cancer risk and PSA levels. Carcinogenesis 32: 853–859.
  49. Eeles RA, Kote-Jarai Z, Giles GG, Olama AA, Guy M, et al. (2008) Multiple newly identified loci associated with prostate cancer susceptibility. Nat Genet 40: 316–321.
  50. Tsang WP, Ng EK, Ng SS, Jin H, Yu J, et al. (2010) Oncofetal H19-derived miR-675 regulates tumor suppressor RB in human colorectal cancer. Carcinogenesis 31: 350–358.
  51. Tsuchiya S, Fujiwara T, Sato F, Shimada Y, Tanaka E, et al. (2011) MicroRNA-210 regulates cancer cell proliferation through targeting fibroblast growth factor receptor-like 1 (FGFRL1). J Biol Chem 286: 420–428.
  52. Jin Y, Wang C, Liu X, Mu W, Chen Z, et al. (2011) Molecular Characterization of the MicroRNA-138-Fos-like Antigen 1 (FOSL1) Regulatory Module in Squamous Cell Carcinoma. J Biol Chem 286: 40104–40109.
  53. Rapley EA, Turnbull C, Al Olama AA, Dermitzakis ET, Linger R, et al. (2009) A genome-wide association study of testicular germ cell tumor. Nat Genet 41: 807–810.
  54. Turnbull C, Rapley EA, Seal S, Pernet D, Renwick A, et al. (2010) Variants near DMRT1, TERT and ATF7IP are associated with testicular germ cell cancer. Nat Genet 42: 604–607.
  55. Youn A, Simon R (2011) Identifying cancer driver genes in tumor genome sequencing studies. Bioinformatics 27: 175–181.
  56. Chapman MA, Lawrence MS, Keats JJ, Cibulskis K, Sougnez C, et al. (2011) Initial genome sequencing and analysis of multiple myeloma. Nature 471: 467–472.
  57. He L, Thomson JM, Hemann MT, Hernando-Monge E, Mu D, et al. (2005) A microRNA polycistron as a potential human oncogene. Nature 435: 828–833.
  58. O'Donnell KA, Wentzel EA, Zeller KI, Dang CV, Mendell JT (2005) c-Myc-regulated microRNAs modulate E2F1 expression. Nature 435: 839–843.
  59. Lewis-Tuffin LJ, Rodriguez F, Giannini C, Scheithauer B, Necela BM, et al. (2010) Misregulated E-cadherin expression associated with an aggressive brain tumor phenotype. PLoS One 5: e13665.
  60. Place RF, Li LC, Pookot D, Noonan EJ, Dahiya R (2008) MicroRNA-373 induces expression of genes with complementary promoter sequences. Proc Natl Acad Sci U S A 105: 1608–1613.
  61. Greenberg E, Rechavi G, Amariglio N, Solomon O, Schachter J, et al. (2011) Mutagen-specific mutation signature determines global microRNA binding. PLoS One 6: e27400.
  62. Wellner U, Schubert J, Burk UC, Schmalhofer O, Zhu F, et al. (2009) The EMT-activator ZEB1 promotes tumorigenicity by repressing stemness-inhibiting microRNAs. Nat Cell Biol 11: 1487–1495.
  63. Yantiss RK, Goodarzi M, Zhou XK, Rennert H, Pirog EC, et al. (2009) Clinical, pathologic, and molecular features of early-onset colorectal carcinoma. Am J Surg Pathol 33: 572–582.
  64. Lytle JR, Yario TA, Steitz JA (2007) Target mRNAs are repressed as efficiently by microRNA-binding sites in the 5′ UTR as in the 3′ UTR. Proc Natl Acad Sci U S A 104: 9667–9672.
  65. Schnall-Levin M, Zhao Y, Perrimon N, Berger B (2010) Conserved microRNA targeting in Drosophila is as widespread in coding regions as in 3′UTRs. Proc Natl Acad Sci U S A 107: 15751–15756.
  66. Forbes SA, Bindal N, Bamford S, Cole C, Kok CY, et al. (2011) COSMIC: mining complete cancer genomes in the Catalogue of Somatic Mutations in Cancer. Nucleic Acids Res 39: D945–950.
  67. Karolchik D, Hinrichs AS, Furey TS, Roskin KM, Sugnet CW, et al. (2004) The UCSC Table Browser data retrieval tool. Nucleic Acids Res 32: D493–496.
  68. Fujita PA, Rhead B, Zweig AS, Hinrichs AS, Karolchik D, et al. (2011) The UCSC Genome Browser database: update 2011. Nucleic Acids Res 39: D876–882.
  69. Blankenberg D, Von Kuster G, Coraor N, Ananda G, Lazarus R, et al. (2010) Galaxy: a web-based genome analysis tool for experimentalists. Curr Protoc Mol Biol Chapter 19: Unit 19 10 11–21.
  70. Landgraf P, Rusu M, Sheridan R, Sewer A, Iovino N, et al. (2007) A mammalian microRNA expression atlas based on small RNA library sequencing. Cell 129: 1401–1414.
  71. Mailman MD, Feolo M, Jin Y, Kimura M, Tryka K, et al. (2007) The NCBI dbGaP database of genotypes and phenotypes. Nat Genet 39: 1181–1186.
  72. Hindorff LA, Sethupathy P, Junkins HA, Ramos EM, Mehta JP, et al. (2009) Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc Natl Acad Sci U S A 106: 9362–9367.
  73. Sherry ST, Ward MH, Kholodov M, Baker J, Phan L, et al. (2001) dbSNP: the NCBI database of genetic variation. Nucleic Acids Res 29: 308–311.
  74. Sayers EW, Barrett T, Benson DA, Bolton E, Bryant SH, et al. (2011) Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 39: D38–51.
  75. Barrett JC, Fry B, Maller J, Daly MJ (2005) Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 21: 263–265.
  76. Chen ZJ, Zhao H, He L, Shi Y, Qin Y, et al. (2011) Genome-wide association study identifies susceptibility loci for polycystic ovary syndrome on chromosome 2p16.3, 2p21 and 9q33.3. Nat Genet 43: 55–59.
  77. Johnson AD, Handsaker RE, Pulit SL, Nizzari MM, O'Donnell CJ, et al. (2008) SNAP: a web-based tool for identification and annotation of proxy SNPs using HapMap. Bioinformatics 24: 2938–2939.
  78. Zhang L, Huang J, Yang N, Greshock J, Megraw MS, et al. (2006) microRNAs exhibit high frequency genomic alterations in human cancer. Proc Natl Acad Sci U S A 103: 9136–9141.
  79. Jiang J, Lee EJ, Gusev Y, Schmittgen TD (2005) Real-time expression profiling of microRNA precursors in human cancer cell lines. Nucleic Acids Res 33: 5394–5403.
  80. Jukic DM, Rao UN, Kelly L, Skaf JS, Drogowski LM, et al. (2010) Microrna profiling analysis of differences between the melanoma of young adults and older adults. J Transl Med 8: 27.
  81. Guo C, Sah JF, Beard L, Willson JK, Markowitz SD, et al. (2008) The noncoding RNA, miR-126, suppresses the growth of neoplastic cells by targeting phosphatidylinositol 3-kinase signaling and is frequently lost in colon cancers. Genes Chromosomes Cancer 47: 939–946.
  82. Feng N, Xu B, Tao J, Li P, Cheng G, et al. (2012) A miR-125b binding site polymorphism in bone morphogenetic protein membrane receptor type IB gene and prostate cancer risk in China. Mol Biol Rep 39: 369–373.
  83. D'Alessandro V, Muscarella LA, Copetti M, Zelante L, Carella M, et al. (2008) Molecular detection of neuron-specific ELAV-like-positive cells in the peripheral blood of patients with small-cell lung cancer. Cell Oncol 30: 291–297.
  84. Chen X, Ba Y, Ma L, Cai X, Yin Y, et al. (2008) Characterization of microRNAs in serum: a novel class of biomarkers for diagnosis of cancer and other diseases. Cell Res 18: 997–1006.
  85. Zeng J, Ge Z, Wang L, Li Q, Wang N, et al. (2010) The histone demethylase RBP2 Is overexpressed in gastric cancer and its inhibition triggers senescence of cancer cells. Gastroenterology 138: 981–992.
  86. Chakravarti N, Lotan R, Diwan AH, Warneke CL, Johnson MM, et al. (2007) Decreased expression of retinoid receptors in melanoma: entailment in tumorigenesis and prognosis. Clin Cancer Res 13: 4817–4824.
  87. Schotte D, Chau JC, Sylvester G, Liu G, Chen C, et al. (2009) Identification of new microRNA genes and aberrant microRNA profiles in childhood acute lymphoblastic leukemia. Leukemia 23: 313–322.
  88. Leidinger P, Keller A, Borries A, Reichrath J, Rass K, et al. (2010) High-throughput miRNA profiling of human melanoma blood samples. BMC Cancer 10: 262.
  89. Siva K, Venu P, Mahadevan A, Shankar SK, Inamdar MS (2007) Human BCAS3 expression in embryonic stem cells and vascular precursors suggests a role in human embryogenesis and tumor angiogenesis. PLoS One 2: e1202.
  90. Gururaj AE, Singh RR, Rayala SK, Holm C, den Hollander P, et al. (2006) MTA1, a transcriptional activator of breast cancer amplified sequence 3. Proc Natl Acad Sci U S A 103: 6670–6675.
  91. Yamamoto M, Cid E, Bru S, Yamamoto F (2011) Rare and frequent promoter methylation, respectively, of TSHZ2 and 3 genes that are both downregulated in expression in breast and prostate cancers. PLoS One 6: e17149.