Conceived and designed the experiments: MTA MH RGM. Performed the experiments: MH RGM BRK AHK NP MTA. Analyzed the data: MH RGM BRK AHK MTA. Contributed reagents/materials/analysis tools: MH RGM BRK AHK NP MTA. Wrote the paper: MH RGM BRK AHK NP MTA.
The authors have declared that no competing interests exist.
Mammalian hibernation is a complex phenotype involving metabolic rate reduction, bradycardia, profound hypothermia, and a reliance on stored fat that allows the animal to survive for months without food in a state of suspended animation. To determine the genes responsible for this phenotype in the thirteen-lined ground squirrel (
The investigation of complex phenotypes in eukaryotes has been largely confined to a handful of exhaustively studied “model organisms.” The advantage of studying this small number of plants and animals is due to a combination of factors such as ease of maintenance, a large research community, genetic selection of mutants, funded stock centers, and sequenced genomes. As a result major advances in genetics and molecular biology have been made using this select group of species. Of course, the living world is filled with novel phenotypes that are not found in the most commonly studied organisms. Mammalian hibernation is one of these phenotypes.
Hibernation is a seasonal adaptation that results in a major departure from standard mammalian homeostasis. Radical depressions in metabolism, heart rate, body temperature and oxygen consumption allow small mammals to survive up to 6 months with little or no food in a state of suspended animation called torpor (for review see
Over the past decade, microarray hybridization has been commonly used as a means of measuring gene activity by whole tissue transcription profiling. More recently, massively parallel sequencing has played a larger role in quantifying gene expression by deep sequencing the transcriptome. The term “deep sequencing” refers to the millions of RNA sequence reads that are generated by this method. Deep sequencing the transcriptome, also known as RNAseq, provides both the sequence and frequency of RNA molecules that are present at any particular time in a specific cell type, tissue or organ. Counting the number of mRNAs that are encoded by individual genes provides an indicator of protein-coding potential, a major contributor to phenotype.
We have performed RNAseq using the Roche 454 system to identify genes that are expressed in active and hibernating thirteen-lined ground squirrels (
Deep sequencing the transcriptome offers many advantages over other methods for measuring gene activity such as microarrays, which suffer from hybridization bias and higher background, and which require larger sample quantities. The 454 platform generates individual RNA sequence reads up to 600 bases in length. This size is useful for non-model organisms such as the thirteen-lined ground squirrel, where longer read length greatly facilitates reliable assembly and identification of the corresponding transcript. In this study we generated 3.7 million reads, added original cDNA sequence to the thirteen-lined ground squirrel database, and report a novel approach for sequencing the mitochondrial genome that can be applied to other eukaryotes. Moreover, we have uncovered new patterns of differential gene activity throughout the year that contribute to our understanding of the hibernation phenotype.
All animal use in this study was carried out in strict accordance with the approval of the University of Minnesota Institutional Animal Care and Use Committee (protocol #0805A34502). Thirteen-lined ground squirrels were wild-caught in central Minnesota. Animals were housed at the University of Minnesota School of Medicine Duluth, where they were fed Purina Laboratory Rodent Diet 5001 and provided with water
For each time point, 3 males and 3 females with weights within one standard deviation of the mean were selected for study. Animals with obvious health defects were excluded from consideration. A summary of animal characteristics including weight, body temperature, and food availability at the time of sacrifice is provided in
Collection point | State | Mass | Body Temperature | Food Availability |
April | Active | 165±14 g | 37.6±0.4°C | Yes |
August | Active | 231±9 g | 36.7±0.4°C | Yes |
October | Active | 248±10 g | 32.4±1.9°C | Yes |
January | Torpid | 176±13 g | 6.0±0.4°C | No |
January | IBA | 180±10 g | 28.1±3.0°C | No |
March | Arousal | 163±25 g | 35.3±0.4°C | No |
All animal surgical procedures were performed under isoflurane anesthesia and all efforts were made to minimize potential pain and suffering. For each of the 6 collection points throughout the year, all animals were sacrificed during the same 9 a.m. to 4 p.m. time frame. Heart, skeletal muscle, and white adipose tissues were collected on ice immediately after sacrifice. The heart was separated from the pericardium, brown fat and attached vessels. Skeletal muscle was collected from the hind limb vastus lateralis. Visceral white adipose tissue (WAT) was dissected to collect only the retroperitoneal fat pad
Total RNA was isolated from 250 mg each of heart, skeletal muscle, and white adipose tissue. Briefly, 250 mg of frozen tissue was thawed in 1 mL of ice cold TriReagent (Ambion/Applied Biosystems, Foster City, CA) and homogenized on ice using a Tissue Tearor homogenizer (BioSpec Products Inc., Bartlesville, OK). The homogenates were centrifuged at 12,000× g, 4°C for 10 min using a Beckman F2402H rotor and Avanti centrifuge (Beckman Coulter, Inc., Brea, CA) to pellet cell debris. For WAT only, the cell lysate was recovered from beneath the upper lipid layer, combined with 1 volume TriReagent and centrifuged a second time (12,000× g, °C, 10 min). Cell lysates were incubated 5 min at room temperature, and combined with 1-bromo-3-chloropropane (Sigma, St. Louis, MO). Aqueous and organic phases were separated by centrifugation (12,000× g, 4°C, 15 min) and RNA was extracted using the Qiagen RNeasy mini column protocol for RNA isolation from animal tissues (Qiagen, Valencia, CA).
Quality of the isolated total RNA was assessed by non-denaturing agarose gel electrophoresis (1.5% agarose, 1× Tris-acetate-EDTA buffer, 0.05 µg/mL ethidium bromide) and quantity of RNA was determined using a Nanodrop spectrophotometer (Thermo Fisher Scientific Inc., Waltham, MA). Potential contaminating DNA was removed by treatment with DNase I using an Ambion Turbo DNA
The following poly(A)+ RNA isolation, library preparation, emPCR and sequencing were carried out by the BioMedical Genomics Center at the University of Minnesota. Ethanol precipitation of total ground squirrel RNA preparations was performed. The RNA pellet was rehydrated in 100 µL of nuclease-free water. All samples were run on an Agilent 6000 Nano chip on the BioAnalyzer 2100 to verify the integrity of the RNA. All samples had a BioAnalyzer RIN of 8 or greater, signifying high quality RNA. Poly(A)+ RNA was prepared from 10 µg of total RNA in a total of 250 µL of nuclease-free water, to which 250 µL of 2× Binding Solution from the Applied BioSystems MicroPoly(A)Purist Kit was added. We followed the manufacturer's protocol through recovery of the poly(A)+ RNA, followed by a second round of oligo (dT) selection.
Preparing the cDNA library consisted of 9 major steps: 1) fragmentation of RNA, 2) double-stranded cDNA synthesis, 3) fragment end repair, 4) AMPure Bead preparation, 5) rapid library MID adaptor ligation, 6) small fragment removal, 7) library quantification, 8) cDNA library quality assessment, and 9) preparation of working aliquots. Solutions prepared prior to starting the protocol were 10 mM Tris-HCl pH 7.5, 70% ethanol, RNA fragmentation solution, 0.2 M EDTA pH 8.0, and 400 µM Roche Primer “random”. All of the samples were prepared using the individual sample clean-up (ISC) method (Sections 3.4 and 3.6 of the Roche cDNA Rapid Library Preparation Method Manual, October 2009). Each of the individual samples was indexed using one of the twelve Roche MID adaptors. We used the TBS-380 from Turner BioSystems to quantify the DNA library (Section 3.7.2 of the Roche cDNA Rapid Library Preparation Method Manual, October 2009). The Rapid Library quantification calculator at
After normalization of the individual libraries, groups of three indexed samples were pooled together in equal molar ratios and titrated to optimize yield and sequence quality. This was done using emulsion-based clonal amplification (emPCR) following the procedure outlined in the Roche emPCR Method Manual Lib-L SV, October 2009 (version 2). For each pooled cDNA library, four single tube emPCR amplification reactions were prepared using different input amounts (2, 4, 8 and 16 molecules per bead; Roche Technical Bulletin for Updated Titration Range for GS-FLX Titanium emPCR Lib-L Kit, March 2010). The TissueLyser II from Qiagen was used to prepare the emulsions prior to amplification. Amplification was carried out using a Tetrad 2 thermal cycler from BioRad. The emulsions were broken and the DNA capture beads were collected and pooled; each pool was then enriched for beads carrying single stranded cDNA (sstcDNA) according to the manufacturer's instructions (Roche). The enriched bead samples were then counted using a Z1 Coulter Counter (Beckman Coulter) to calculate the percent enrichment (i.e., the percent of initial beads that contained sstcDNA). Based on a linear regression of these percent enrichment values against the initial sstcDNA amounts, we calculated the amount of cDNA needed to produce an expected 8% enrichment.
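The titration calculation described above can be sketched as a least-squares line fit followed by solving for the input that gives the target enrichment. This is only an illustration: the function names are hypothetical, and the enrichment values in the example are made-up placeholders, not measurements from this study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def input_for_target_enrichment(inputs, enrichments, target=8.0):
    """Solve the fitted line for the input amount expected to give `target` % enrichment."""
    slope, intercept = fit_line(inputs, enrichments)
    if slope == 0:
        raise ValueError("enrichment does not depend on input; cannot solve")
    return (target - intercept) / slope


# Input amounts from the titration (molecules per bead).
molecules_per_bead = [2, 4, 8, 16]
# Hypothetical percent-enrichment readings, for illustration only.
percent_enrichment = [3.0, 5.0, 9.0, 17.0]

print(input_for_target_enrichment(molecules_per_bead, percent_enrichment))
```

With real titration data the fit will not be exact, so it is worth checking that the solved input falls inside the titrated range before using it for bulk emPCR.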
Large volume bulk emPCR was performed as described in the emPCR Method Manual – Lib_L LV, October 2009, from Roche Applied Science. We started with two pools consisting of three indexed libraries each. For each pooled library, the copy per bead that achieved closest to 8% enrichment during the titration was used for bulk emPCR. Two large volume emulsification reactions were prepared for each pooled library to be loaded on a 2-region PicoTiterPlate (PTP) from Roche Applied Science. Roche's vacuum-assisted protocol for emulsion breaking and bead recovery for large volume emulsions was followed. Bulk library pools that resulted in 8-15% enrichment were used for sequencing.
Sequencing of each pooled library was carried out on the Genome Sequencer FLX. The PTP was divided using a 2-region gasket. The number of DNA library beads loaded per region was approximately 2 million. The Roche GS FLX Titanium Sequencing and PicoTiterPlate Kits were used to conduct three full sequencing runs. Each of the three runs contained two pools of three different indexed libraries. Image analysis and base calling were performed with standard protocols and default parameters. Each indexed library represented between 11–28% of the total number of reads.
The bioinformatic pipeline for converting raw reads into identified sequence is summarized in
To compare the RNA level of individual genes across all six collection points we used upper quartile normalization as described in the supplementary materials of Bullard et al.
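As a rough illustration of upper-quartile normalization, the sketch below divides each sample's counts by the 75th percentile of its nonzero gene counts and rescales to the mean upper quartile across samples. The exclusion of zeros per sample, the rescaling target, and all names are assumptions made for this example; the exact procedure used in this study follows the cited supplementary materials and may differ in detail.

```python
import statistics


def upper_quartile(values):
    """75th percentile of the nonzero counts in one sample."""
    nonzero = sorted(v for v in values if v > 0)
    # statistics.quantiles(..., n=4) returns [Q1, Q2, Q3]; index 2 is the upper quartile.
    return statistics.quantiles(nonzero, n=4, method="inclusive")[2]


def upper_quartile_normalize(counts):
    """Normalize raw counts.

    counts: dict mapping sample name -> dict mapping gene name -> raw read count.
    Returns the same structure with counts scaled so that every sample has the
    same upper quartile (the mean of the per-sample upper quartiles).
    """
    uq = {sample: upper_quartile(genes.values()) for sample, genes in counts.items()}
    target = statistics.mean(list(uq.values()))
    return {
        sample: {gene: c * target / uq[sample] for gene, c in genes.items()}
        for sample, genes in counts.items()
    }


# Toy data: the second sample was sequenced twice as deeply as the first.
raw = {
    "sample_1": {"g1": 0, "g2": 10, "g3": 20, "g4": 30, "g5": 40},
    "sample_2": {"g1": 0, "g2": 20, "g3": 40, "g4": 60, "g5": 80},
}
normalized = upper_quartile_normalize(raw)
print(normalized["sample_1"]["g2"], normalized["sample_2"]["g2"])
```

After normalization the two toy samples report the same value for each gene, which is the behavior one wants when the only difference between libraries is sequencing depth.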
To identify differentially expressed genes across the six sampled points we first determined the p-value of a six-way Fisher's exact test for each gene. These p-values were corrected for multiple comparisons by calculating false discovery rate (FDR;
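The multiple-comparison correction can be illustrated with the Benjamini-Hochberg procedure. Note that this is an assumption: the text here does not confirm which FDR method was used, and the function names and example p-values below are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the original order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def genes_passing(gene_pvalues, fdr_cutoff):
    """Names of genes whose adjusted p-value is at or below the cutoff."""
    names = list(gene_pvalues)
    adjusted = benjamini_hochberg([gene_pvalues[g] for g in names])
    return [g for g, q in zip(names, adjusted) if q <= fdr_cutoff]


# Hypothetical per-gene p-values, for illustration only.
example = {"geneA": 0.01, "geneB": 0.04, "geneC": 0.03, "geneD": 0.005}
print(benjamini_hochberg(list(example.values())))
print(genes_passing(example, fdr_cutoff=0.02))
```

A stringent cutoff simply means passing a smaller `fdr_cutoff`; the ranking of genes is unaffected.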
This study was designed to provide a yearlong examination of gene activity responsible for the hibernation phenotype.
The assembled contigs and remaining 619,767 unassembled sequences were used to identify proteins that are encoded by the raw sequence reads.
Time point | August Active | October Active | January Torpor | January IBA | March Active | April Active |
Total reads | 424,124 | 259,936 | 328,909 | 217,662 | 176,694 | 270,109 |
RefSeq | 93.1% | 90.9% | 94.4% | 93.8% | 94.7% | 94.7% |
UniProt identified | 90.2% | 87.0% | 91.8% | 90.4% | 92.1% | 91.6% |
Mitochondrial | 19.1% | 24.7% | 22.2% | 17.6% | 23.2% | 23.8% |
"RefSeq" refers to reads identified in the human mRNA RefSeq database. “UniProt identified” refers to reads identified in the human UniProt database
Time point | August Active | October Active | January Torpor | January IBA | March Active | April Active |
Total reads | 173,086 | 110,748 | 150,778 | 86,220 | 291,697 | 195,567 |
RefSeq | 95.4% | 94.6% | 94.5% | 93.8% | 94.5% | 95.3% |
UniProt identified | 92.1% | 92.2% | 92.2% | 91.0% | 90.9% | 91.6% |
Mitochondrial | 11.4% | 6.8% | 6.8% | 7.2% | 10.8% | 18.8% |
"RefSeq" refers to reads identified in the human mRNA RefSeq database. “UniProt identified” refers to reads identified in the human UniProt database
Time point | August Active | October Active | January Torpor | January IBA | March Active | April Active |
Total reads | 163,906 | 123,722 | 89,393 | 200,889 | 245,839 | 110,125 |
RefSeq | 91.5% | 92.1% | 90.8% | 91.1% | 89.2% | 86.5% |
UniProt identified | 89.2% | 89.7% | 88.7% | 88.9% | 86.8% | 83.3% |
Mitochondrial | 3.0% | 1.4% | 1.0% | 1.2% | 1.7% | 3.6% |
"RefSeq" refers to reads identified in the human mRNA RefSeq database. “UniProt identified” refers to reads identified in the human UniProt database
Reads and contigs which did not match human UniProt records included ribosomal RNA and other non-protein coding RNA. The ribosomal RNA was not surprising given its abundance, despite two rounds of oligo-dT selection. A comparison (using BLASTn) of our sequences with the fRNAdb
Tissue | April | August | October | Torpor | IBA | March |
Skeletal muscle | 1883 | 943 | 1724 | 1961 | 1708 | 799 |
Heart | 2849 | 2834 | 5056 | 2345 | 1355 | 3039 |
WAT | 635 | 388 | 1636 | 412 | 448 | 367 |
Upper-quartile normalized counts for reads matching the
The assembly of multiple raw sequence reads for a single mRNA generates high quality sequence due to the redundancy of reads spread throughout the contig. Multiple reads across the same region of the transcript increase sequence accuracy because sequencing errors can easily be identified and corrected. The overall distribution of reads for a single mRNA typically showed a higher density near the middle of the transcript. If fragmentation and random priming are unbiased, this distribution is expected simply from the probability of overlaps along the transcript (a more mathematically detailed description is given in the next section).
To demonstrate sequence read coverage,
The mitochondrial genome is transcribed as a single transcript that is then processed into smaller pieces that are polyadenylated
The relative simplicity of mitochondrial transcripts - no introns or alternative splicing - provided a unique opportunity to examine potential biases in our library preparation. To do this we developed a mathematical model of read density.
To model the distribution of reads along a mitochondrial gene, we assume: (1) the gene produces a single transcript type of length L, and (2) the observed distribution of read lengths results from a uniform (unbiased) fragmentation of the original transcripts. If we have a read of length J≤L drawn from this distribution, it has L-J+1 possible starting locations along the transcript. For the Ith position on the transcript, the number of such read locations that overlap it is the minimum of {I, J, L-I+1, L-J+1}. So the probability p(I,J,L) of a read of length J overlapping position I in a transcript of length L is min({I, J, L-I+1, L-J+1})/(L-J+1). For fixed L and J this is a piecewise-linear function of I. An example graph of this function is shown in
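The overlap probability p(I,J,L) defined above can be checked numerically. The sketch below implements the closed-form expression and compares it with a brute-force count over all start positions. The function names are hypothetical, and summing p over a list of read lengths to obtain an expected coverage profile is an assumption about how the model would be applied to an observed read-length distribution.

```python
def overlap_probability(i, j, length):
    """p(I, J, L) = min(I, J, L-I+1, L-J+1) / (L-J+1), with 1-based position I."""
    if not (1 <= j <= length and 1 <= i <= length):
        raise ValueError("require 1 <= J <= L and 1 <= I <= L")
    starts = length - j + 1  # number of possible read start positions
    return min(i, j, length - i + 1, starts) / starts


def overlap_probability_bruteforce(i, j, length):
    """Same quantity, obtained by enumerating every start position of the read."""
    starts = length - j + 1
    hits = sum(1 for s in range(1, starts + 1) if s <= i <= s + j - 1)
    return hits / starts


def expected_coverage(read_lengths, length):
    """Expected number of reads covering each position, one entry per position 1..L.

    Assumes each read in `read_lengths` is placed uniformly at random on the transcript.
    """
    usable = [j for j in read_lengths if 1 <= j <= length]
    return [
        sum(overlap_probability(i, j, length) for j in usable)
        for i in range(1, length + 1)
    ]


# The closed form and the enumeration agree for every position and read length.
L = 12
assert all(
    abs(overlap_probability(i, j, L) - overlap_probability_bruteforce(i, j, L)) < 1e-12
    for j in range(1, L + 1)
    for i in range(1, L + 1)
)

profile = expected_coverage([3, 5, 5, 8], L)
print([round(p, 3) for p in profile])
```

The resulting profile is symmetric and highest in the interior of the transcript, which matches the piecewise-linear shape described in the text and the higher read density observed near the middle of transcripts.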
The nuclear encoded acyl-CoA desaturase (ACOD) mRNA is the single most abundant RNA transcript in white adipose tissue. ACOD converts a single carbon-carbon bond into a double bond thus turning a saturated fatty acid into an unsaturated fatty acid. This is an important reaction for a hibernating species because fatty acids with double bonds in the cis conformation have lower melting temperatures and thus greater fluidity. Examination of the thousands of ACOD reads reveals that the genome of
Tissue | Gene | April | August | October | Torpor | IBA | March |
Skeletal muscle | ACODA | 4 | 29 | 11 | 3 | 1 | 1 |
Skeletal muscle | ACODB | 2 | 6 | 2 | 11 | 7 | 2 |
Skeletal muscle | Total | 7 | 38 | 14 | 14 | 12 | 4 |
Heart | ACODA | 4 | 40 | 6 | 3 | 2 | 3 |
Heart | ACODB | 22 | 18 | 23 | 18 | 20 | 3 |
Heart | Total | 29 | 62 | 29 | 23 | 25 | 10 |
White adipose | ACODA | 1100 | 6687 | 5083 | 3690 | 3285 | 2097 |
White adipose | ACODB | 28 | 1361 | 1267 | 768 | 796 | 368 |
White adipose | Total | 1137 | 8313 | 7387 | 4561 | 4199 | 2560 |
Upper-quartile normalized counts for reads matching two distinct ground squirrel ACOD sequences, and the total normalized counts that were mapped to ACOD_HUMAN. Because of rounding the upper-quartile counts, and the fact that some contigs from the untranslated parts of the transcripts cannot be definitively assigned, the total counts are not always the sum of the counts of the two separate genes.
The fundamental physiological adaptation present in natural hibernators is a greatly reduced metabolic rate with accompanying reductions in body temperature, heart rate and respiration. In this study, changes in mRNA levels of specific genes throughout the year were analyzed to identify biological processes that are likely to be important for maintaining organ function during these physiological extremes.
To validate our transcriptome approach of measuring differential gene expression we performed qRT-PCR measurements of several genes across the year (
The colored bars indicate the normalized number of reads at each time point. A. Torpor. NAC1, Sodium/calcium exchanger 1; PDK4, Pyruvate dehydrogenase kinase, isoenzyme 4; RYR2, Ryanodine receptor 2. B. Torpor plus IBA. SPTB2, Spectrin beta chain, brain 1; PARM1, Prostate androgen-regulated mucin-like protein 1; AT1A1, Sodium/potassium-transporting ATPase subunit alpha-1. C. March. NU1M, NADH-ubiquinone oxidoreductase chain 1; CYB, Cytochrome b; COX2, Cytochrome c oxidase subunit 2.
The colored bars indicate the normalized number of reads at each time point. A. April. SARCO, Sarcolipin; GLNA, Glutamine synthetase; KCRS, Creatine kinase S-type, mitochondrial; TNNT1, Troponin T type 1 slow skeletal muscle; THIM, 3-ketoacyl-CoA thiolase, mitochondrial. B. October. UBC, Ubiquitin C; DDB1, DNA damage-binding protein 1; CCD69, Coiled-coil domain-containing protein 69; ASB2, Ankyrin repeat and SOCS box protein 2. C. Torpor. NDRG2, N-myc downstream regulated gene 2; PDK4, Pyruvate dehydrogenase kinase, isoenzyme 4; HS90B, Heat shock protein HSP 90-beta; MAP4, Microtubule-associated protein 4; NF2L1, Nuclear factor erythroid 2-related factor 1.
The colored bars indicate the normalized number of reads at each time point. A. August. ACOD, Acyl-CoA desaturase; FABP4, Adipocyte fatty acid binding protein. B. October. LEP, Leptin; PAG16, Group XVI phospholipase A2; ALBU, Albumin. C. IBA. PDK4, Pyruvate dehydrogenase kinase, isoenzyme 4; PCKGC, Phosphoenolpyruvate carboxykinase; ANXA8, Annexin A8; HMCS2, Hydroxymethylglutaryl-CoA synthase.
The 1.0×10⁻¹¹ FDR criterion resulted in a final set of 155 heart-expressed genes with median mRNA expression of 882 normalized counts (1857, 453; upper and lower quartiles, respectively) across the six samples with Fisher's exact test p-value ≤1.4×10⁻¹³. In skeletal muscle, 92 genes met our FDR cutoff and had a median mRNA expression of 1037 (2180, 510) normalized counts and Fisher's exact test p-value ≤9.6×10⁻¹⁴. In WAT, 132 genes met our FDR criterion and had median mRNA expression of 605 (1217, 232) normalized counts across samples and Fisher's exact test p-value ≤6.7×10⁻¹⁴. The complete list of expressed genes that meet our FDR criterion for each tissue can be found in
We highlighted three examples of time point-specific differential gene expression for each of the three tissues. For heart we show highly expressed genes in torpor, torpor and IBA, and March time points (
Three genes that show high expression in heart across both the torpor and IBA time points (
Highly expressed skeletal muscle genes were shown for April, October, and torpor (
We highlight four highly expressed genes in skeletal muscle during October (
Highly expressed genes in WAT were highlighted for the August, October, and IBA time points (
Three genes expressed highly in WAT during October included leptin (LEP, 6.4-fold over March), albumin (ALBU, at least 21.8-fold over all other points), and group XVI phospholipase A2 (PAG16, 2.2-fold over August), (
In this study we used massively parallel sequencing to determine the identity and level of RNAs involved in mammalian hibernation. In an attempt to paint a comprehensive picture of this circannual phenotype we chose six different time points/activity states (April active, August active, October active, January torpor, January IBA, and March arousal), and selected three tissues (heart, WAT, and skeletal muscle) important for animal survival before, during, and after the hibernation season. Despite the lack of motor activity during torpor, skeletal muscle shows minimal effects of disuse atrophy; the heart is a contractile organ that continues to work at near-freezing body temperatures; and WAT serves as the primary supplier of fuel for the entire body in the absence of food.
The Roche 454 sequencing platform was chosen for this study because of relatively long read lengths that can be identified with confidence in the absence of a complete genomic sequence
Our strategy of deep sequencing the transcriptome can be applied to any non-model organism to uncover vast amounts of information on gene activity where none existed before. Researchers studying plants and animals with novel phenotypes, but lacking the genetic tools associated with popular model organisms, can now generate millions of distinct sequence reads with as little as 1 to 10 µg of total RNA. Computational tools for bioinformatic analysis of these massive data sets are now widely available to researchers around the world. To underscore this point we used only free and open-source software to analyze the sequence data generated in this study, primarily MIRA, Biopython and Sage.
While the intent of this study was to determine differential gene expression between various time points and activity states, deep sequencing revealed a variety of more subtle findings such as transcript splicing variants, gene duplications, heterogeneity in untranslated regions, and a novel method for sequencing the mitochondrial genome. The many benefits of deep sequencing RNA can rapidly increase the amount of molecular information from understudied organisms and therefore enhance the accessibility of these systems to the research community.
Assembly of the mitochondrial genome for
We highlighted a handful of highly expressed genes from heart, skeletal muscle, and white adipose tissues that are differentially expressed in a time point specific manner (
Throughout hibernation the heart shows tremendous plasticity during arousal from torpor as heart rate explodes from 5 beats per minute (bpm) to 400 bpm, only to drop back to a state of bradycardia during repeated cycles of torpor and IBA
Expression of mitochondrial genes NU1M and CYB, which encode proteins of the electron transport chain, declined from August to torpor and IBA
Gene expression reflecting a reduction in motor activity was seen in skeletal muscle. The expression of the structural protein TNNT1 is much lower in August than in April, possibly reflecting the difference in activity level between a recently captured free-living (April) and a captive (August) ground squirrel (
Ubiquitination of proteins is increased in atrophying skeletal muscle (reviewed in
In skeletal muscle N-myc downstream regulated gene 2 (NDRG2) and PDK4 are most highly expressed during torpor, IBA, and March (
The annual natural history of WAT in a hibernating mammal can be divided into lipogenic and lipolytic periods that are reflected in the relative abundance of mRNAs encoding lipogenic or lipolytic proteins
In this study we observed a high abundance of three lipogenesis-related genes in the August and October samples (
The mRNA encoding leptin was observed to have greatest abundance during the August and October lipogenic phase (
The large increase in albumin expression during October (
Deep sequencing of the transcriptome as a means to interrogate the molecular basis of the hibernation phenotype has revealed changes in the expression of specific genes at various times throughout the year. This new approach of measuring gene activity opens the door for investigating other novel phenotypes in the vast world of “non-model” organisms. In thirteen-lined ground squirrels we not only found new patterns of gene activity in heart, skeletal muscle and white adipose, we also generated sequence for over 14,000 different mRNAs and used a portion of the transcriptome data to generate the complete sequence of the mitochondrial genome. As expected we found differentially expressed genes involved in the carbohydrate to lipid fuel switch, as well as enhanced ion transport required for heart contraction during torpor. Interestingly, transcripts coding ubiquitin pathway proteins UBC, ASB2 and DDB1 are abundant and peak in October. This intriguing finding suggests that ubiquitin-mediated proteolysis could provide a mechanism by which skeletal muscle can be catabolized and serve as a secondary fuel depot during winter.
The overall abundance of genes with varying expression patterns in the entire dataset presents an unparalleled opportunity to explore the molecular mechanisms controlling hibernation in mammals. The normalized transcript levels for all genes identified at each of the six time points from all three tissues can be found in the supporting information (
Mitochondrial genome sequence for
(TXT)
Correlation between qRT-PCR and normalized RNA sequence read counts for 24 different mRNAs expressed in WAT isolated from individual animals (N = 6) at five time points from August through March. Log transformed qRT-PCR “Take off CT” values (24 transcripts, 30 individual animals, 755 reactions) were correlated with log transformed normalized 454 counts for pooled samples. The analysis reveals a significant correlation between methods (Pearson's coefficient of correlation = −0.5614, 95% upper limit = −0.4558, 95% lower limit = −0.5614, p<0.0001) that supports the use of normalized 454 counts as a measure of relative mRNA expression across groups.
(DOC)
Normalized transcript levels for all genes identified at each of the six time points for heart, skeletal muscle, and WAT.
(XLS)
Complete list of expressed genes that meet the 1.0×10⁻¹¹ FDR criterion for heart, skeletal muscle, and WAT.
(XLS)
The authors thank V. Caskey for technical support during the initial stages of this project and J. Bjork for generating