The F. tularensis type A strain FSC198 from Slovakia and a second strain FSC043, which has attenuated virulence, are both considered to be derivatives of the North American F. tularensis type A strain SCHU S4. These strains have been propagated under different conditions: the FSC198 has undergone natural propagation in the environment, while the strain FSC043 has been cultivated on artificial media in laboratories. Here, we have compared the genome sequences of FSC198, FSC043, and SCHU S4 to explore the possibility that the contrasting propagation conditions may have resulted in different mutational patterns. We found four insertion/deletion events (INDELs) in the strain FSC043, as compared to the SCHU S4, while no single nucleotide polymorphisms (SNPs) or variable number of tandem repeats (VNTRs) were identified. This result contrasts with previously reported findings for the strain FSC198, where eight SNPs and three VNTR differences, but no INDELs, exist as compared to the SCHU S4 strain. The mutations detected in the laboratory and naturally propagated type A strains, respectively, demonstrate distinct patterns, supporting the idea that analysis of mutational spectra might be a useful tool to reveal differences in past growth conditions. Such information may be useful to identify leads in a microbial forensic investigation.
Citation: Sjödin A, Svensson K, Lindgren M, Forsman M, Larsson P (2010) Whole-Genome Sequencing Reveals Distinct Mutational Patterns in Closely Related Laboratory and Naturally Propagated Francisella tularensis Strains. PLoS ONE 5(7): e11556. doi:10.1371/journal.pone.0011556
Editor: Niyaz Ahmed, University of Hyderabad, India
Received: March 5, 2010; Accepted: May 6, 2010; Published: July 19, 2010
Copyright: © 2010 Sjödin et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Funding: This project was funded by the Swedish Ministry of Foreign Affairs, project A4952, the Swedish Civil Contingencies Agency, project B4055, and the National Institute of Allergy and Infectious Diseases, National Institutes of Health, Department of Health and Human Services under grant No. AI60689. Grant support was obtained from the Swedish Medical Research Council and the Medical Faculty, Umeå University, Umeå, Sweden. The work was performed in part at the Umeå Centre for Microbial Research (UCMR). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Following the anthrax attacks of 2001, microbial forensics has emerged as a new scientific discipline dedicated to the investigation of biocrime and bioterrorism to link pathogen, crime, and perpetrator. In the molecular methods being developed to this end, selectively neutral genetic mutations, such as synonymous single nucleotide polymorphisms (SNPs) and extragenic tandem repeats, present advantages over non-neutral mutations for establishing relationships between strains. Non-neutral characters are more prone to homoplasy (i.e. sharing of marker states for reasons other than ancestry) and less likely to accumulate at a constant rate, properties that may distort phylogenetic analyses. However, non-neutral mutations may also provide a different but potentially important aspect for microbial forensics. Since such mutations may reflect the selective forces experienced by a bacterium, they may also provide information on past propagation conditions.
Here, we investigate this possibility by comparing mutational patterns detected in three strains designated SCHU S4 (FSC237), FSC198 (SE-219), and FSC043 of Francisella tularensis subspecies tularensis (type A1). Due to its high virulence, ease of dissemination, low infectious dose, and previous weaponisation, this pathogen has been classified by the Centers for Disease Control and Prevention among the top six ‘Category A’ biological threat agents. Type A strains (in particular subgroup A1) demonstrate high virulence to humans compared to the two other subspecies holarctica (type B) and mediasiatica, and are almost entirely restricted to North America. To date, the only exception to the North American geographical confinement of F. tularensis type A is a handful of isolates recovered in Europe: in western Slovakia in 1986, and in a bordering area in Austria in 1990 (Gurycova unpublished). A recent genomic sequencing effort demonstrated that one of the recovered Slovakian strains, FSC198, is virtually identical to, and has been derived from, SCHU S4. Data from the study also provided plausible evidence that the European isolates indeed represent valid natural isolates and not events of laboratory contamination.
We sequenced the strain FSC043, which is another derivative strain of the SCHU S4. In contrast to the assumed natural propagation of FSC198, the FSC043 has been cultivated repeatedly on artificial media in laboratories, during which it likely lost its pathogenicity for mice. Detection of different mutational patterns between these strains would therefore support the possibility of inferring differences in culture conditions from mutational data.
Correction of Francisella tularensis subsp. tularensis strain SCHU S4 genome sequence
The genome sequence of F. tularensis subsp. tularensis strain SCHU S4 AJ749949.1 available in GenBank contained sequence errors in the form of SNPs and incorrect variable numbers of tandem repeats (VNTR), as identified recently by Chaudhuri et al. We have confirmed these errors and a corrected version of the genome sequence of SCHU S4 has been deposited in GenBank under accession number AJ749949.2.
Identified mutational patterns
Direct mapping of sequence reads from FSC043 on the genome sequences of FSC198 and SCHU S4 showed an average coverage of 107× separated by highly repetitive regions. The phylogenetic positions and relationships of the analyzed strains are shown in Figure 1. Genome-wide sequence comparisons between strain FSC043 and strain SCHU S4 did not identify any SNPs between the two strains and they showed identical VNTR patterns, while three VNTRs differentiated them from FSC198. We found four insertion/deletion events differentiating FSC043 from SCHU S4 and FSC198 (Table 1). Three of them (Ftind51–53) were small deletions (2, 1, and 1 bp, respectively), while the fourth was a 1,480 bp deletion and corresponded to the previously identified RD18. Ftind51 affected a putative metal ion transporter protein (FTT0615), while Ftind52 and Ftind53 were located within the duplicated Francisella Pathogenicity Island (FPI). Additional sequencing confirmed Ftind52 and Ftind53 as single mutations in both copies of the pdpC gene (pdpC1 and pdpC2). The eight previously identified SNPs (S1–S8) in FSC198 were all non-synonymous mutations and affected genes for UDP-N-acetylglucosamine pyrophosphorylase (glmU), an outer membrane protein (FTT0602), an acid phosphatase (FTT0620), a soluble pyridine nucleotide transhydrogenase (sthA), a lipoprotein located between lpnA and lpnB (FTT0903), a cardiolipin synthetase (ybhO), and a D-methionine transport protein (FTT1124). A summary of mutations in the analyzed genomes (listed in Table 2) is shown in Table 1. The proposed strain history and an overview of the mutations are depicted in Figure 2.
Figure 1. Relationships within the species F. tularensis.
The evolutionary tree was inferred using the Neighbor-Joining method. Bootstrap support values (500 replicates) are shown next to branches. Scale bar indicates the number of base substitutions per site. doi:10.1371/journal.pone.0011556.g001
Figure 2. Overview of different paths of evolution.
Strain FSC043 and strain FSC198 have been exposed to different environments since their split from the common ancestor strain SCHU S4. Strain FSC043 has experienced ‘artificial’ life cycles inside a laboratory while strain FSC198 has been exposed to a natural environment in Slovakia, a difference that is reflected in the distinct mutation patterns of their genomes. doi:10.1371/journal.pone.0011556.g002
Table 1. Identified SNP, INDEL and VNTR differences between FSC198, FSC043 and SCHU S4 strains and their corresponding regions in three other genomes of F. tularensis. doi:10.1371/journal.pone.0011556.t001
In this study, we found that different propagation conditions for the two F. tularensis strains FSC198 and FSC043 were supported by genomic data. While propagation in natural conditions has been assumed for the strain FSC198, the strain FSC043 has been extensively passaged in vitro in laboratories. Our results confirm previous findings that FSC198 differs from SCHU S4 at three VNTR loci and by eight intragenic and non-synonymous SNP mutations. In contrast, FSC043 was identical to SCHU S4 at all 25 VNTR loci, and no SNP mutations were found. Instead, all mutations in FSC043 were found to be intragenic deletion events: three microdeletions and the previously identified large deletion RD18.
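The contrast between the two mutational spectra can be made concrete with a small calculation. The sketch below is not an analysis performed in the study; it simply tabulates the counts reported in the text and, under the assumption that every mutation is an independent event (which may not hold for Ftind52 and Ftind53), computes the hypergeometric probability of observing such a complete separation of INDELs from point/repeat mutations by chance. The function names are illustrative.

```python
from math import comb

# Mutation counts relative to SCHU S4, as reported in the text.
COUNTS = {
    "FSC043": {"SNP": 0, "VNTR": 0, "INDEL": 4},
    "FSC198": {"SNP": 8, "VNTR": 3, "INDEL": 0},
}

def spectrum(counts):
    """Return the fraction of mutations falling in each class."""
    total = sum(counts.values())
    return {kind: n / total for kind, n in counts.items()}

def table_probability(a, b, c, d):
    """Hypergeometric probability of the 2x2 table [[a, b], [c, d]]
    given fixed row and column totals (as in Fisher's exact test)."""
    return comb(a + b, a) * comb(c + d, c) / comb(a + b + c + d, a + c)

spectra = {strain: spectrum(c) for strain, c in COUNTS.items()}

# Rows: FSC043, FSC198. Columns: INDEL, SNP-or-VNTR.
# Assumption: all 15 mutations are treated as independent events.
p_observed = table_probability(4, 0, 0, 11)

for strain, fractions in spectra.items():
    print(strain, fractions)
print(f"probability of the observed table: {p_observed:.5f}")
```

With so few mutations and an independence assumption that is only approximate, the resulting probability should be read as an illustration of how distinct the two patterns are, not as a formal significance test.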
All four mutations found in strain FSC043 have caused disruption of gene functions: all of the disrupted genes in the strain FSC043 have been linked to virulence or have orthologs in F. novicida that have been linked to virulence. One of the two genes (FTT0918) which span the large deletion region RD18 is involved in iron uptake and has been shown to be essential for virulence in the parental SCHU S4 strain as well as in the attenuation of the Live Vaccine Strain. Similar repeat-mediated deletions as in the RD18 locus (and another locus denoted RD19) have been identified, and seem to be characteristic for several laboratory propagated F. tularensis strains. In agreement, it has frequently been observed that the genomes of laboratory strains eventually become degraded after passage on artificial media.
The Ftind52 and Ftind53 mutations represent identical deletion events but in different copies of the pdpC gene of the duplicated Francisella Pathogenicity Island, a locus important for F. tularensis virulence. While these mutations could have occurred independently, it is likely that the mutation arose at one pdpC locus and was transferred to the other pdpC locus by gene conversion (nonreciprocal recombination). High sequence homogeneity of insertion sequence elements within F. tularensis strains but divergence between subspecies suggests a strong effect of this process in F. tularensis. The pdpC gene is required for infection of F. tularensis in mammalian cells but not for F. novicida infection of mosquito cells. The mutation Ftind51 affects a putative metal ion transporter protein. A transposon mutant of the corresponding gene in F. novicida was negatively selected in a mouse model of infection. The Ftind51 mutation may therefore also be linked to virulence in the strain FSC043. In the strain FSC198, three (sthA, ybhO and metA) of the seven genes affected by mutation have been linked to virulence. Since approximately 30% of the core genome (1162 genes) in F. tularensis has been experimentally identified as virulence genes to date, it is possible that the seemingly preferential disruption of virulence genes in the FSC043 and the mutation of virulence genes in the FSC198 may be due to chance.
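The chance argument for FSC198 can be illustrated with a binomial calculation. The following is a minimal sketch, assuming that each mutated gene is independently a virulence gene with probability 0.30 (the approximate fraction quoted above); the function name is illustrative and the calculation is not part of the original analysis.

```python
from math import comb

def prob_at_least(k, n, p):
    """Binomial tail: probability of at least k successes in n trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# FSC198: three of seven mutated genes have been linked to virulence.
# Assumption: genes are hit independently and 30% of genes are virulence genes.
p_fsc198 = prob_at_least(3, 7, 0.30)
print(round(p_fsc198, 3))  # 0.353
```

Under these assumptions, finding three or more virulence-linked genes among seven mutated genes is expected roughly one time in three, which is consistent with the statement that the observation for FSC198 may be due to chance.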
Although certain mutations (e.g. rearrangements, large tandem repeat polymorphisms) are not reliably detected by the sequencing methodology used, it is not unlikely that the few deletions detected may completely explain the attenuation of virulence in the strain FSC043. This hypothesis, however, needs further examination by specific phenotypic characterization of the strains studied and by experimental gene deletion and/or complementation.
Two evolutionary scenarios may have resulted in the gene disruptions detected in the strain FSC043. The disrupted genes may represent neutral events (i.e. genetic drift), caused by genetic bottlenecks that reduced the impact of selection, or because the mutated genes became superfluous when the bacterium was cultured on artificial media. It is also possible that the disruptions have been beneficial and therefore become positively selected. In a recent study of experimental populations of Escherichia coli, where the impact of genetic drift was reduced by the use of large inocula, the results indicated positive selection as the predominant cause of the fixation of mutations. Since the strain FSC043 is likely to have experienced recurring and severe genetic bottlenecks by the transfer of single colonies between agar plates, however, the fixation of the disruptive mutations could also be due to genetic drift in this strain. Regardless of whether mutations in the FSC043 are neutral (fixed by genetic drift) or non-neutral (fixed by positive selection), it is interesting that fixed disruptive mutations (INDELs) occurred at a frequency that greatly exceeded that of all other mutations (SNPs, VNTRs) in the strain FSC043. This pattern contrasts sharply with that for the naturally propagated strain FSC198, where all mutations were non-synonymous SNPs and VNTRs, reflecting the adaptation to its propagation environment.
Thus, our data agree with previous indications that the strain FSC198 has been propagated in a natural environment subsequent to its divergence from the progenitor strain SCHU S4. We also find that these results support the potential utility of analysis of mutational patterns to infer past propagation conditions. The generality and validity of these findings will require further confirmation, but they may provide a new type of evidence in microbial forensics.
Materials and Methods
The F. tularensis subsp. tularensis strain FSC043 was obtained from the Francisella Strain Collection (FSC) at the Swedish Defense Research Agency, Umeå. FSC043 was deposited to FSC by the Rocky Mountain Laboratories, Hamilton, MT, US, in 1992. The strain FSC043 represents a standard laboratory strain and it has as such been cultured extensively over the past six decades. It is uncertain precisely when the attenuating genetic mutations occurred. An overview of strains and genome sequences used is presented in Table 2. Major relationships within the species F. tularensis are depicted in Figure 1.
Genome sequencing of FSC043
FSC043 was re-cultured, suspended in phosphate buffered saline and heat-killed. DNA was prepared by a chaotropic salt method. Sequencing was performed by a commercial service provider (Geneservices, Cambridge, UK) using an Illumina GAII instrument with 36 bp single-end reads. Images acquired from the Illumina sequencer were processed through the Illumina pipeline to obtain sequence and quality scores for each base. Sequence reads have been deposited in the NCBI Short Read Archive as SRA009329.1.
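The relationship between read count, read length and average coverage can be sketched as follows. The 36 bp read length and the 107× average coverage are taken from the text; the genome length of about 1.89 Mbp is an approximation assumed here for illustration, and the function names are hypothetical.

```python
def mean_coverage(n_reads, read_len, genome_len):
    """Average depth if n_reads reads of read_len bp map to the genome."""
    return n_reads * read_len / genome_len

def reads_for_coverage(target, read_len, genome_len):
    """Smallest number of mapped reads giving at least the target depth."""
    return -(-target * genome_len // read_len)  # integer ceiling division

READ_LEN = 36            # bp, single-end reads (from the text)
GENOME_LEN = 1_890_000   # bp, ASSUMED approximate genome size

reads_needed = reads_for_coverage(107, READ_LEN, GENOME_LEN)
print(reads_needed, mean_coverage(reads_needed, READ_LEN, GENOME_LEN))
```

Under this assumed genome size, an average depth of 107× corresponds to roughly 5.6 million mapped 36 bp reads. The calculation counts mapped reads only, so the number of raw reads produced by the instrument would be somewhat higher.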
Genome assembly was performed by two alternative, complementary approaches. The first method was based on alignment and assembly using reference genomes from two strains within the subspecies tularensis type A1, FSC198 and SCHU S4. Firstly, sequences from FSC043 were compared against the reference genomes using VAAL. Additional analysis was performed in MOSAIK allowing for non-unique hits in assembly, followed by identification of SNPs and small INDELs using Gigabayes. Non-unique mapping of reads makes it possible to identify potential mutations within duplicated regions. Results from both VAAL and MOSAIK were inspected and confirmed in Consed.
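To illustrate what distinguishing SNPs from small INDELs involves, the sketch below classifies the differences in a gapped pairwise alignment. It is a toy example under simple assumptions (two pre-aligned strings with `-` marking gaps) and does not reproduce the algorithms of VAAL, MOSAIK or Gigabayes, which work on read pileups and quality scores.

```python
def classify_differences(ref_aln, qry_aln):
    """Count SNP columns and list indel events in a gapped pairwise alignment.

    Both strings must have equal length; '-' marks a gap.
    Returns (snp_count, [(kind, length), ...]) with kind 'deletion' when
    bases are missing from the query and 'insertion' when they are extra.
    """
    if len(ref_aln) != len(qry_aln):
        raise ValueError("aligned sequences must have equal length")
    snps = 0
    indels = []
    run_kind, run_len = None, 0
    for r, q in zip(ref_aln, qry_aln):
        if q == "-" and r != "-":
            kind = "deletion"
        elif r == "-" and q != "-":
            kind = "insertion"
        else:
            kind = None
            if r != q:
                snps += 1
        if kind is not None and kind == run_kind:
            run_len += 1          # extend the current gap run
        else:
            if run_kind is not None:
                indels.append((run_kind, run_len))  # close the previous run
            run_kind, run_len = kind, (1 if kind else 0)
    if run_kind is not None:
        indels.append((run_kind, run_len))
    return snps, indels

# Hypothetical aligned fragments: one 2 bp deletion and one substitution.
result = classify_differences("ACGTACGTAC", "ACG--CGTTC")
print(result)  # (1, [('deletion', 2)])
```

Consecutive gap columns are merged into a single event, which matches how a 2 bp deletion such as Ftind51 would be reported as one INDEL instead of two.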
Secondly, de novo assembly of short reads was performed using Velvet, with settings chosen to produce the highest N50 value. Constructed contigs were mapped to the same reference genomes using Exonerate and nucmer in the package MUMmer. Identified mutations among the three analyzed type A1 strains were further compared to the type A2 strain WY96-3418 and the type B strain LVS.
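The N50 statistic used to choose assembly settings is the length of the contig at which the cumulative length, counted from the longest contig downwards, first reaches half of the total assembly length. A minimal sketch follows; the contig lengths and the k values keyed in the dictionary are hypothetical and only illustrate how one setting would be picked over another.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths."""
    lengths = sorted(contig_lengths, reverse=True)
    half = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half:
            return length
    raise ValueError("no contigs supplied")

# Hypothetical contig lengths from assemblies run with different k values.
assemblies = {
    21: [100, 80, 60, 40],
    25: [150, 90, 40],
    29: [120, 60, 50, 50],
}
best_k = max(assemblies, key=lambda k: n50(assemblies[k]))
print(best_k, n50(assemblies[best_k]))  # 25 150
```

N50 rewards contiguity only, so an assembly chosen this way still needs to be checked against the reference, as was done here by mapping the contigs.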
Sequence differences around VNTRs and RD18 in Francisella genomes (Table 1) were analyzed in silico using previously published primers. To confirm VNTRs in FSC043, MLVA was performed using a CEQ 8800 instrument (Beckman Coulter, Fullerton, CA), as previously described.
Verification of mutations within duplicated region
The F. tularensis subsp. tularensis strains SCHU S4 and FSC043 were grown on modified GC-agar base at 5% , suspended in water and used as PCR (Polymerase Chain Reaction) templates together with Expand Long Range polymerase (Roche Applied Science, Mannheim, Germany). Firstly, the regions FTT1353 to FTT1360 and FTT1709 to FTT1715 of the FPI were amplified using the internal FPI primer pairs FPI-1 and FPI-2 (Table 3) in order to differentiate the two copies of the FPI. Each region comprised approximately 17 kb. The resulting PCR products were purified from agarose using the High Pure PCR Purification Kit (Roche Applied Science, Mannheim, Germany) according to the manufacturer's protocol. A second PCR was performed on each of the two purified PCR products, in which the 5.5 kb regions surrounding pdpC1 and pdpC2, respectively, were amplified as eight sequential fragments to facilitate sequencing, using primer pairs pdpC-1 to pdpC-8 (Table 3). The average overlap between fragments was 118 nucleotides. Each fragment was cloned into the pCR4-TOPO TA cloning vector (Invitrogen AB, Stockholm, Sweden). Plasmids corresponding to four different clones from each of the eight combinations were purified using the QIAPrep Spin Miniprep Kit (Qiagen, Hilden, Germany), and all 32 clones were sequenced (Eurofins MWG Operon, Ebersberg, Germany). A one base pair deletion was observed in both copies of pdpC in FSC043. To verify this, the region was amplified in both SCHU S4 and FSC043 using primer pair pdpC-9 (which does not allow separation of the two FPI copies), and subsequent sequencing of the 691 bp PCR product confirmed the mutation. No other differences were observed within the 5.5 kb sequenced region.
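The tiling of the 5.5 kb region into eight overlapping fragments can be sketched as follows. This is an idealised layout assuming equal-length fragments and a uniform 118-nucleotide overlap; the actual fragments were defined by the primer pairs pdpC-1 to pdpC-8 (Table 3) and the text reports 118 nucleotides only as the average overlap. The function name and the constant are illustrative.

```python
from math import ceil

def tile_region(region_len, n_fragments, overlap):
    """Split a region into n overlapping fragments of (near) equal length.

    Returns a list of (start, end) tuples, 0-based and end-exclusive.
    """
    frag_len = ceil((region_len + (n_fragments - 1) * overlap) / n_fragments)
    step = frag_len - overlap
    return [
        (i * step, min(i * step + frag_len, region_len))
        for i in range(n_fragments)
    ]

tiles = tile_region(5500, 8, 118)   # 5.5 kb region, eight fragments
CLONES_PER_FRAGMENT = 4             # four clones per fragment (from the text)
n_clones = len(tiles) * CLONES_PER_FRAGMENT

for start, end in tiles:
    print(start, end, end - start)
print(n_clones)  # 32
```

Under these assumptions each fragment is about 790 bp long, a length that a single sequencing read from each end of a cloned insert can plausibly span, and the overlaps ensure that every position in the region is covered by at least one fragment.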
Conceived and designed the experiments: AS ML PL. Performed the experiments: AS ML. Analyzed the data: AS KS ML PL. Wrote the paper: AS KS MF PL.
- 1. Tucker JB, Koblentz GD (2009) The four faces of microbial forensics. Biosecur Bioterror 7: 389–397.
- 2. Oggioni MR, Claverys JP (1999) Repeated extragenic sequences in prokaryotic genomes: a proposal for the origin and dynamics of the RUP element in Streptococcus pneumoniae. Microbiology 145(Pt 10): 2647–2653.
- 3. Centers for Disease Control and Prevention. Centers for Disease Control and Prevention web site. Available: http://www.cdc.gov/.
- 4. Kugeler KJ, Mead PS, Janusz AM, Staples JE, Kubota KA, et al. (2009) Molecular Epidemiology of Francisella tularensis in the United States. Clin Infect Dis 48: 863–870.
- 5. Johansson A, Farlow J, Larsson P, Dukerich M, Chambers E, et al. (2004) Worldwide genetic relationships among Francisella tularensis isolates determined by multiple-locus variable-number tandem repeat analysis. J Bacteriol 186: 5808–5818.
- 6. Gurycova D (1998) First isolation of Francisella tularensis subsp. tularensis in Europe. Eur J Epidemiol 14: 797–802.
- 7. Chaudhuri RR, Ren CP, Desmond L, Vincent GA, Silman NJ, et al. (2007) Genome sequencing shows that European isolates of Francisella tularensis subspecies tularensis are almost identical to US laboratory strain Schu S4. PLoS ONE 2: e352.
- 8. Twine S, Bystrom M, Chen W, Forsman M, Golovliov I, et al. (2005) A mutant of Francisella tularensis strain SCHU S4 lacking the ability to express a 58-kilodalton protein is attenuated for virulence and is an effective live vaccine. Infect Immun 73: 8345–8352.
- 9. Larsson P, Oyston PC, Chain P, Chu MC, Duffield M, et al. (2005) The complete genome sequence of Francisella tularensis, the causative agent of tularemia. Nat Genet 37: 153–159.
- 10. Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Sayers EW (2009) GenBank. Nucleic Acids Res 37: D26–D31.
- 11. Svensson K, Larsson P, Johansson D, Byström M, Forsman M, et al. (2005) Evolution of subspecies of Francisella tularensis. J Bacteriol 187: 3903–3908.
- 12. Lindgren H, Honn M, Golovlev I, Kadzhaev K, Conlan W, et al. (2009) The 58-kilodalton major virulence factor of Francisella tularensis is required for efficient utilization of iron. Infect Immun 77: 4429–4436.
- 13. Twine SM, Shen H, Kelly JF, Chen W, Sjostedt A, et al. (2006) Virulence comparison in mice of distinct isolates of type A Francisella tularensis. Microb Pathog 40: 133–138.
- 14. Salomonsson E, Forsberg A, Roos N, Holz C, Maier B, et al. (2009) Functional analyses of pilin-like proteins from Francisella tularensis: complementation of type IV pilus phenotypes in Neisseria gonorrhoeae. Microbiology 155: 2546–2559.
- 15. Nilsson AI, Koskiniemi S, Eriksson S, Kugelberg E, Hinton JCD, et al. (2005) Bacterial genome size reduction by experimental evolution. Proc Natl Acad Sci U S A 102: 12112–12116.
- 16. Nano FE, Zhang N, Cowley SC, Klose KE, Cheung KKM, et al. (2004) A Francisella tularensis pathogenicity island required for intramacrophage growth. J Bacteriol 186: 6430–6436.
- 17. Barker JR, Klose KE (2007) Molecular and genetic basis of pathogenesis in Francisella tularensis. Ann N Y Acad Sci 1105: 138–159.
- 18. Read A, Vogl SJ, Hueffer K, Gallagher LA, Happ GM (2008) Francisella genes required for replication in mosquito cells. J Med Entomol 45: 1108–1116.
- 19. Weiss DS, Brotcke A, Henry T, Margolis JJ, Chan K, et al. (2007) In vivo negative selection screen identifies genes required for Francisella virulence. Proc Natl Acad Sci U S A 104: 6037–6042.
- 20. Su J, Yang J, Zhao D, Kawula TH, Banas JA, et al. (2007) Genome-wide identification of Francisella tularensis virulence determinants. Infect Immun 75: 3089–3101.
- 21. Kraemer PS, Mitchell A, Pelletier MR, Gallagher LA, Wasnick M, et al. (2009) Genome-wide screen in Francisella novicida for genes required for pulmonary and systemic infection in mice. Infect Immun 77: 232–244.
- 22. Maier TM, Casey MS, Becker RH, Dorsey CW, Glass EM, et al. (2007) Identification of Francisella tularensis Himar1-based transposon mutants defective for replication in macrophages. Infect Immun 75: 5376–5389.
- 23. Larsson P, Elfsmark D, Svensson K, Wikstrom P, Forsman M, et al. (2009) Molecular evolutionary consequences of niche restriction in Francisella tularensis, a facultative intracellular pathogen. PLoS Pathog 5: e1000472.
- 24. Qin A, Mann BJ (2006) Identification of transposon insertion mutants of Francisella tularensis tularensis strain Schu S4 deficient in intracellular replication in the hepatic cell line HepG2. BMC Microbiol 6: 69.
- 25. Tempel R, Lai XH, Crosa L, Kozlowicz B, Heffron F (2006) Attenuated Francisella novicida transposon mutants protect mice against wild-type challenge. Infect Immun 74: 5095–5105.
- 26. Wehrly TD, Chong A, Virtaneva K, Sturdevant DE, Child R, et al. (2009) Intracellular biology and virulence determinants of Francisella tularensis revealed by transcriptional profiling inside macrophages. Cell Microbiol 11: 1128–1150.
- 27. Kadzhaev K, Zingmark C, Golovliov I, Bolanowski M, Shen H, et al. (2009) Identification of genes contributing to the virulence of Francisella tularensis SCHU S4 in a mouse intradermal infection model. PLoS One 4: e5463.
- 28. Frank DW, Zahrt TC (2007) Genetics and genetic manipulation in Francisella tularensis. Ann N Y Acad Sci 1105: 67–97.
- 29. Barrick JE, Yu DS, Yoon SH, Jeong H, Oh TK, et al. (2009) Genome evolution and adaptation in a long-term experiment with Escherichia coli. Nature 461: 1243–1247.
- 30. Johansson A, Ibrahim A, Göransson I, Eriksson U, Gurycova D, et al. (2000) Evaluation of PCR-based methods for discrimination of Francisella species and subspecies and development of a specific PCR that distinguishes the two major subspecies of Francisella tularensis. J Clin Microbiol 38: 4180–4185.
- 31. Shumway M, Cochrane G, Sugawara H (2010) Archiving next generation sequencing data. Nucleic Acids Res 38: D870–D871.
- 32. Nusbaum C, Ohsumi TK, Gomez J, Aquadro J, Victor TC, et al. (2009) Sensitive, specific polymorphism discovery in bacteria using massively parallel sequencing. Nat Methods 6: 67–69.
- 33. Marth G. The MarthLab project page. Available: http://bioinformatics.bc.edu/marthlab.
- 34. Gordon D, Abajian C, Green P (1998) Consed: a graphical tool for sequence finishing. Genome Res 8: 195–202.
- 35. Zerbino DR, Birney E (2008) Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res 18: 821–829.
- 36. Slater GSC, Birney E (2005) Automated generation of heuristics for biological sequence comparison. BMC Bioinformatics 6: 31.
- 37. Kurtz S, Phillippy A, Delcher AL, Smoot M, Shumway M, et al. (2004) Versatile and open software for comparing large genomes. Genome Biol 5: R12.
- 38. Beckstrom-Sternberg SM, Auerbach RK, Godbole S, Pearson JV, Beckstrom-Sternberg JS, et al. (2007) Complete genomic characterization of a pathogenic A.II strain of Francisella tularensis subspecies tularensis. PLoS ONE 2: e947.
- 39. Larsson P, Svensson K, Karlsson L, Guala D, Granberg M, et al. (2007) Canonical insertion-deletion markers for rapid DNA typing of Francisella tularensis. Emerg Infect Dis 13: 1725–1732.
- 40. Svensson K, Bäck E, Eliasson H, Berglund L, Granberg M, et al. (2009) Landscape epidemiology of tularemia outbreaks in Sweden. Emerg Infect Dis 15: 1937–1947.
- 41. Svensson K, Granberg M, Karlsson L, Neubauerova V, Forsman M, et al. (2009) A real-time PCR array for hierarchical identification of Francisella isolates. PLoS One 4: e8360.
- 42. Eigelsbach HT, Braun W, Herring RD (1951) Studies on the variation of Bacterium tularense. J Bacteriol 61: 557–569.
- 43. Hesselbrock W, Foshay L (1945) The morphology of Bacterium tularense. J Bacteriol 49: 209–231.