Conceived and designed the experiments: GL ROD. Performed the experiments: JX. Analyzed the data: JX CTD HZ PR. Wrote the paper: JX GL ROD MCC CTD HZ PR.
The authors have declared that no competing interests exist.
Influenza neuraminidase (NA) is an important surface glycoprotein that plays a vital role in viral replication and is a key target in drug development. NA is found in influenza A and B viruses, with nine subtypes classified in influenza A. Complete knowledge of the evolutionary history and phylodynamics of influenza NA, although critical for the prevention and control of influenza epidemics and pandemics, remains lacking.
Evolutionary and phylogenetic analyses of influenza NA sequences using Maximum Likelihood and Bayesian MCMC methods demonstrated that the divergence of influenza viruses into types A and B occurred earlier than the divergence of influenza A NA subtypes. Twenty-three lineages were identified within influenza A, two lineages were classified within influenza B, and most lineages were specific to host, subtype or geographical location. Interestingly, evolutionary rates vary not only among lineages but also among branches within lineages. The estimated tMRCAs of influenza lineages suggest that the viruses of different lineages emerge several months or even years before their initial detection. The
The divergence into influenza type A and B from a putative ancestral NA was followed by the divergence of type A into nine NA subtypes, of which 23 lineages subsequently diverged. This study provides a better understanding of influenza NA lineages and their evolutionary dynamics, which may facilitate early detection of newly emerging influenza viruses and thus improve influenza surveillance.
Influenza virus belongs to the viral family Orthomyxoviridae and has a segmented negative-sense RNA genome in an enveloped virion
Genetic mutation is considered one of the most important molecular mechanisms in the evolution of influenza virus
Influenza virus has also shown the propensity to escape immunity because of continuous antigenic drift, i.e., mutation at the epitope positions of HA and NA segments
Each of the influenza viral genes is thought to be important in viral replication and interaction with host cells; therefore, understanding the evolutionary tempo and mode of each viral gene can provide new insight into the epidemiology of influenza viruses
Influenza A viral neuraminidases are classified into nine subtypes (N1–N9) according to their antigenic properties, whereas influenza B neuraminidases are classified into two lineages
The Maximum Likelihood (ML) and MCMC Bayesian analyses demonstrate that the influenza NA gene diverged first into A and B (Group I and Group II), followed by the division of influenza A subtypes (
Influenza NA genes form two groups (Group I and Group II), which correspond to influenza A and B, respectively. Influenza A NA is further classified into two subgroups (Subgroup I and Subgroup II). The viral strains are colored for different hosts: human in green, swine in blue, avian in red and equine in purple. The bootstrap support values are indicated at major nodes. The scale bar at the bottom indicates the numbers of nucleotide substitutions per site.
A total of 23 lineages, two to three lineages for each subtype, were identified within influenza A viruses, while two lineages were classified within influenza B (
Influenza | Subtype | Lineage/Sublineage | Annotation | Isolation period | Representative sequence | Main virus subtypes
A | N1 | 1A.1 | H5N1 | 1996–2010 | A/Goose/Guangdong/1/96(H5N1) | H5N1
A | N1 | 1A.2 | Eurasian avian | 1934–2009 | A/fowl/Rostock/45/1934(H7N1) | H1N1, H3N1, H5N1, H6N1, H7N1, H9N1, H11N1
A | N1 | 1A.3 | Pandemic H1N1 2009 | 2009–2010 | A/Texas/05/2009(H1N1) | H1N1
A | N1 | 1A.4 | Eurasian (avian-like) swine | 1979–2010 | A/swine/Belgium/WVL1/1979 | H1N1
A | N1 | 1A.5 | North American avian | 1969–2008 | A/duck/PA/486/1969(H6N1) | H1N1, H3N1, H4N1, H5N1, H6N1, H10N1, H12N1
A | N1 | 1B | North American swine | 1930–2009 | A/swine/Iowa/15/1930(H1N1) | H1N1, H3N1
A | N1 | 1C | Major human | 1918–2009 | A/Brevig_Mission/1/18(H1N1) | H1N1
A | N2 | 2A.1 | H9N2 | 1994–2009 | A/chicken/Guangdong/SS/94 | H9N2
A | N2 | 2A.2 | Eurasian avian | 1977–2008 | A/duck/Hokkaido/5/1977 | H3N2, H5N2, H6N2, H7N2, H9N2, H11N2
A | N2 | 2A.3 | North American avian | 1966–2008 | A/turkey/Wisconsin/1/1966(H9N2) | H3N2, H4N2, H5N2, H6N2, H7N2, H9N2, H11N2, H10N2, H13N2
A | N2 | 2B | Major human and swine | 1957–2009 | A/Japan/305/1957 | H3N2, H2N2, H1N2
A | N3 | 3A | North American avian | 1971–2010 | A/turkey/Oregon/1971(H7N3) | H7N3, H4N3, H1N3, H10N3, H11N3, H6N3, H5N3, H3N3, H2N3
A | N3 | 3B | Eurasian/Oceanian avian | 1959–2009 | A/shearwater/Australia/751/1975(H5N3) | H1N3, H5N3, H4N3, H3N3, H8N3, H12N3, H7N3, H2N3, H10N3, H11N3, H9N3
A | N3 | 3C | Other avian | 1975–2009 | A/sabines gull/Alaska/296/1975(H5N3) | H7N3, H16N3, H3N3, H13N3, H5N3
A | N4 | 4A | North American avian | 1967–2010 | A/turkey/Ontario/6118/1967(H8N4) | H3N4, H8N4, H12N4, H4N4, H2N4
A | N4 | 4B | Eurasian/Oceanian avian | 1979–2008 | A/gray teal/Australia/2/1979(H4N4) | H4N4, H8N4, H9N4, H10N4
A | N5 | 5A | North American avian | 1976–2009 | A/mallard duck/ALB/60/1976(H12N5) | H12N5, H1N5, H11N5, H3N5, H6N5, H4N5, H5N5, H2N5, H9N5, H10N5, H7N5
A | N5 | 5B | Eurasian/Oceanian avian | 1972–2009 | A/shearwater/Australia/1/1972(H6N5) | H6N5, H1N5, H3N5, H8N5, H10N5, H12N5, H4N5, H14N5
A | N6 | 6A | North American avian | 1976–2010 | A/mallard duck/ALB/20/1976(H4N6) | H3N6, H4N6, H10N6, H6N6, H1N6
A | N6 | 6B | Eurasian/Oceanian avian | 1956–2010 | A/duck/Czech Republic/1/1956(H4N6) | H4N6, H3N6, H5N6, H9N6
A | N7 | 7A | North American avian | 1977–2010 | A/mallard duck/ALB/302/1977(H10N7) | H4N7, H10N7, H3N7, H2N7, H7N7, H5N7, H8N7, H13N7
A | N7 | 7B | Eurasian/Oceanian avian | 1902–2008 | A/chicken/Brescia/1902(H7N7) | H7N7, H10N7, H5N7, H11N7
A | N7 | 7C | Equine | 1956–1977 | A/equine/Prague/1/1956(H7N7) | H7N7
A | N8 | 8A | North American avian | 1963–2010 | A/turkey/Canada/1963(H6N8) | H3N8, H4N8, H6N8, H7N8, H2N8, H10N8
A | N8 | 8B | Equine | 1963–2010 | A/equine/Miami/1/1963(H3N8) | H3N8
A | N8 | 8C | Eurasian/Oceanian avian | 1963–2010 | A/duck/Ukraine/1/1963(H3N8) | H3N8, H10N8, H11N8, H6N8, H7N8, H2N8, H4N8
A | N9 | 9A | North American avian | 1966–2008 | A/turkey/Ontario/7732/1966(H5N9) | H11N9, H13N9, H12N9, H5N9, H10N9, H3N9, H2N9, H1N9, H7N9, H4N9
A | N9 | 9B | Eurasian/Oceanian avian I | 1996–2010 | A/duck/Siberia/700/1996(H11N9) | H11N9, H5N9, H7N9, H6N9, H2N9, H1N9
A | N9 | 9C | Eurasian/Oceanian avian II | 1978–2004 | A/duck/Hong Kong/278/1978(H2N9) | H11N9, H5N9, H15N9, H10N9, H2N9
B |  | Yam88 | B/Yamagata/16/88-like | 1988–2009 | B/Yamagata/16/1988 |
B |  | Vic87 | B/Victoria/2/87-like | 1987–2002 | B/Victoria/2/1987 |
Three lineages, 1A, 1B and 1C, were identified based upon strong bootstrap support values (100%) of the phylogenetic tree, which was generated from 4,146 sequences (
A: N1; B: N2; C: N5; D: N8. The annotation for each lineage was labeled on the trees. Three lineages in N1 (1A, 1B and 1C), two lineages in N2 (2A and 2B), two lineages in N5 (5A and 5B), and two lineages in N8 (8A and 8B) were classified. The bootstrap values supporting the corresponding lineages are shown to the left of the major nodes. Scale bars indicate the numbers of nucleotide substitutions per site.
Sublineage 1A.1 originated from the recent highly pathogenic H5N1 avian influenza epizootic that started in Asia around 1996 and has spread throughout the Eastern Hemisphere. The viruses in 1A.1 are mostly from birds (n = 1,031), but some are from humans (n = 164), swine (n = 8), tigers (n = 2) and mink (n = 1). Sublineage 1A.2 is composed of mostly Eurasian avian influenza viruses (n = 230), whereas some human highly pathogenic H5N1 influenza viruses (n = 24) sampled in 1997 in Hong Kong were also found in 1A.2. Sublineage 1A.4 consists of Eurasian swine influenza viruses which were originally derived from Eurasian avian viruses and first detected in Belgium in 1979. Not surprisingly, 1A.3 (Pandemic H1N1 2009) is grouped together with Eurasian swine, which confirms previous findings that the NA segment of pandemic H1N1 2009 viruses originated from the Eurasian swine influenza viruses. Sublineage 1A.5 is composed of viruses mainly from North American avian species (n = 162), with a few exceptions: 1 viral sequence from human and 3 from environmental samples.
Lineage 1B consists of mainly North American swine influenza viruses, while 1C is a human lineage, consisting mainly of H1N1 human influenza viruses. The viruses in 1B correspond mostly to the classical H1N1 isolates from swine (n = 126), but include 9 isolates from humans and 9 from birds, indicating sporadic interspecies transmissions of influenza viruses from swine to humans or birds. Lineage 1C consists predominantly of human viruses (n = 1204), with a few exceptions, namely, swine (4 isolates) and birds (2 isolates). Within the influenza A N1 subtype, avian influenza viruses include sequences from multiple HA subtypes (e.g., H1N1, H3N1, H5N1, H6N1, H7N1, H9N1, and H11N1), whereas human and swine viruses have limited HA subtypes (human: H1N1; swine: H1N1, H3N1).
The N2 sequences (3,754 in total) were classified into two major lineages, 2A and 2B (
The 2A.1 is a subtype-specific sublineage consisting of mainly H9N2 avian influenza viruses, with the majority from birds (n = 412), but with 24 sequences from swine and 4 from humans, which indicates the occurrence of interspecies transmissions. The 2A.2 and 2A.3 correspond to Eurasian and North American avian viruses, respectively. The viruses of 2A.2 are mainly from birds (n = 342), but a few are from swine (n = 7) and humans (n = 2). A similar result was also found in 2A.3, which includes 291 avian viruses, 1 H7N2 human virus, and 29 viruses isolated from environmental samples.
Within 2B, most of the influenza viruses are from human H2N2 and H3N2 influenza viruses (n = 2,340) and swine H3N2 and H1N2 viruses (n = 214). However, avian influenza H3N2 viruses (n = 11) were also found in this lineage. Interestingly, there were five major clades of swine influenza viruses scattered within lineage 2B, suggesting these viruses originate from human viruses through either genome reassortment or direct transmission events. It is also noted that the branch lengths of the swine clusters are much longer as compared to those of the closely related human viruses, indicating extensive evolution of the N2 gene in swine viruses after transmission from humans to swine.
Three lineages, 3A, 3B, and 3C, were found in N3, with genetic distances between lineages ranging from 0.173 to 0.349 (
The N4, N5 and N6 subtypes were each classified into two lineages, one corresponding to North American avian (4A, 5A and 6A) and the other Eurasian/Oceanian avian (4B, 5B and 6B) (
Three lineages were identified in N7 and N8, which correspond to North American avian (7A, 8A), equine (7C, 8B) and Eurasian/Oceanian avian (7B, 8C), respectively (
The NA genes of influenza B viruses were divided into two distinct lineages, B/Victoria/2/87-like (Vic87) and B/Yamagata/16/88-like (Yam88) (
Two lineages, Yam88 and Vic87, were classified. The bootstrap values supporting the corresponding lineages are shown on the major nodes. The scale bars indicate the numbers of nucleotide substitutions per site.
Outliers were identified and removed before the estimation of substitution rate and tMRCA for each lineage (
Influenza | Subtype | Lineage/Sublineage | Substitution rate, mean (×10⁻³ subs/site/year) | Substitution rate, 95% HPD lower | Substitution rate, 95% HPD upper | tMRCA, mean (calendar year) | tMRCA, 95% HPD lower | tMRCA, 95% HPD upper
A | N1 | 1A.1 | 3.06/3.73 | 2.63/3.16 | 3.48/4.32 | 1988/1992 | 1984/1987 | 1992/1996
A | N1 | 1A.2 | 3.42/4.07 | 3.03/3.43 | 3.79/4.74 | 1927/1931 | 1922/1923 | 1931/1934
A | N1 | 1A.3 | 2.83/3.58 | 1.63/2.52 | 3.96/4.67 | 19-Nov-08/7-Dec-08 | 7-Jun-08/12-Jun-08 | 16-Mar-09/30-Mar-09
A | N1 | 1A.4 | 3.62/3.96 | 3.23/3.40 | 3.99/4.58 | 1978/1977 | 1977/1974 | 1979/1979
A | N1 | 1A.5 | 3.00/4.05 | 2.69/3.04 | 3.36/4.99 | 1921/1950 | 1911/1920 | 1934/1967
A | N1 | 1B | 2.55/2.97 | 2.25/2.58 | 2.83/3.37 | 1929/1927 | 1928/1923 | 1930/1930
A | N1 | 1C | 1.79/2.44 | 1.42/2.02 | 2.14/2.89 | 1898/1910 | 1882/1896 | 1909/1918
A | N2 | 2A.1 | 4.45/4.61 | 4.07/3.98 | 4.89/5.24 | 1990/1989 | 1989/1984 | 1991/1993
A | N2 | 2A.2 | 2.53/2.81 | 2.25/2.38 | 2.81/3.26 | 1974/1972 | 1971/1963 | 1976/1977
A | N2 | 2A.3 | 2.96/3.19 | 2.66/2.73 | 3.26/3.68 | 1951/1954 | 1945/1937 | 1957/1965
A | N2 | 2B | 3.05/3.31 | 2.74/2.91 | 3.89/3.75 | 1956/1956 | 1955/1954 | 1957/1957
A | N3 | 3A | 2.92/3.23 | 2.6/2.61 | 3.27/3.83 | 1954/1959 | 1944/1941 | 1963/1971
A | N3 | 3B | 2.67/2.96 | 2.39/2.43 | 3.04/3.47 | 1955/1950 | 1950/1933 | 1957/1959
A | N3 | 3C | 3.22/3.91 | 2.63/1.78 | 3.85/5.96 | 1955/1956 | 1949/1926 | 1961/1975
A | N4 | 4A | 3.37/4.30 | 2.82/3.39 | 3.93/5.27 | 1964/1966 | 1962/1962 | 1967/1967
A | N4 | 4B | 3.78/4.42 | 3.09/2.75 | 4.5/5.98 | 1970/1970 | 1966/1956 | 1973/1978
A | N5 | 5A | 2.88/3.63 | 2.47/2.92 | 3.27/4.32 | 1971/1972 | 1968/1965 | 1975/1976
A | N5 | 5B | 2.68/3.61 | 2.07/2.21 | 3.34/4.81 | 1953/1964 | 1941/1945 | 1963/1972
A | N6 | 6A | 2.1/2.32 | 1.88/1.86 | 2.3/2.79 | 1960/1955 | 1956/1934 | 1963/1970
A | N6 | 6B | 2.69/3.08 | 2.39/2.55 | 2.97/3.63 | 1943/1940 | 1940/1920 | 1946/1952
A | N7 | 7A | 3.8/4.87 | 3.33/4.00 | 4.33/5.73 | 1975/1975 | 1974/1972 | 1976/1977
A | N7 | 7B | 2.99/3.97 | 2.52/2.94 | 3.46/4.91 | 1892/1899 | 1882/1892 | 1901/1901
A | N7 | 7C | 2.65/3.13 | 1.08/1.90 | 3.88/4.43 | 1952/1955 | 1940/1952 | 1956/1956
A | N8 | 8A | 1.54/2.31 | 1.36/1.93 | 1.73/2.71 | 1930/1956 | 1915/1941 | 1940/1963
A | N8 | 8B | −/1.68 | −/1.37 | −/2.02 | −/1954 | −/1945 | −/1961
A | N8 | 8C | 1.1/2.13 | 0.86/1.52 | 1.35/2.71 | 1921/1946 | 1904/1923 | 1937/1961
A | N9 | 9A | 2.8/3.36 | 2.49/2.77 | 3.13/3.92 | 1960/1961 | 1957/1952 | 1962/1966
A | N9 | 9B | 2.75/3.32 | 2.19/2.41 | 3.39/4.21 | 1994/1995 | 1992/1992 | 1996/1996
A | N9 | 9C | −/2.16 | −/0.24 | −/3.95 | −/1948 | −/1890 | −/1977
B |  | Yam88 | 2.30/2.47 | 1.99/2.08 | 2.62/2.85 | 1986/1986 | 1985/1982 | 1987/1988
B |  | Vic87 | 1.90/2.14 | 1.50/1.65 | 2.3/2.62 | 1985/1985 | 1983/1982 | 1987/1987
Each cell gives the value calculated under the random local clock model, followed after the slash by the value calculated under the uncorrelated exponential relaxed clock model. Dash signs (−) indicate missing data.
The Bayesian consensus tree for each lineage, along with posterior mean branch lengths scaled in real time, is depicted in
A: H5N1 (1A.1), B: North American swine N1 (1B), C: Human H1N1 (1C), D: H9N2 (2A.1), E: Equine N7 (7C), F: Yam88 influenza B NA (Yam88). Branch coloring indicates inferred rates of nucleotide substitution from blue (slow) to red (fast). The scale bar indicates the number of years before the present.
The H9N2 lineage was found to have a mean substitution rate of 4.45×10⁻³ (
The time of most recent common ancestor (tMRCA) varies from lineage to lineage (
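The gap between a lineage's estimated tMRCA and its first detection can be illustrated with a minimal sketch. It assumes that the first year of the isolation period listed in the lineage table is a reasonable proxy for initial detection, and it uses only a handful of lineages transcribed from the two tables above; the function and variable names are illustrative and not part of the original analysis.

```python
# Each entry: (first isolation year from the lineage table,
#              mean tMRCA under the random local clock model,
#              mean tMRCA under the uncorrelated exponential relaxed clock model).
# Values are transcribed from the tables in this paper; the choice of
# lineages is an illustrative subset, not the full data set.
LINEAGES = {
    "1A.1": (1996, 1988, 1992),
    "1A.4": (1979, 1978, 1977),
    "2A.1": (1994, 1990, 1989),
    "2B": (1957, 1956, 1956),
    "Yam88": (1988, 1986, 1986),
}


def detection_lags(lineages):
    """Return {lineage: (lag under random local clock, lag under relaxed clock)} in years.

    Assumption: the first isolation year stands in for the year of
    initial detection.
    """
    return {
        name: (first - rlc, first - uced)
        for name, (first, rlc, uced) in lineages.items()
    }


if __name__ == "__main__":
    for name, (lag_rlc, lag_uced) in detection_lags(LINEAGES).items():
        print(f"{name}: {lag_rlc} / {lag_uced} years before first isolation")
```

For these lineages both clock models place the mean tMRCA one to eight years before the first isolate, which is the pattern summarized in the abstract.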
Different selection pressures were revealed in different lineages as indicated by the ratio of non-synonymous (
Influenza | Subtype | Lineage/Sublineage | No. of sequences | Positively selected sites (SLAC) | Positively selected sites (FEL) | Positively selected sites (IFEL) | dN/dS
A | N1 | 1A.1 | 1241 | 16, 46, 83, 313, 340, 365 | 8, 339, 434 | 8, 16, 46, 76, 339 | 0.274 (0.262–0.286)
A | N1 | 1A.2 | 263 | 460 | 20, 105, 460 | 20, 105, 454 | 0.202 (0.186–0.219)
A | N1 | 1A.3 | 794 | None | 53 | 53, 388, 452 | 0.227 (0.206–0.249)
A | N1 | 1A.4 | 80 | None | None | 210 | 0.180 (0.163–0.197)
A | N1 | 1A.5 | 228 | None | 449 | 95, 449 | 0.148 (0.135–0.162)
A | N1 | 1B | 139 | 46 | 46, 53, 75, 81, 339 | 46, 53, 339, 453 | 0.174 (0.158–0.192)
A | N1 | 1C | 1210 | 84, 222, 248 | 19, 84, 151, 222, 248, 319, 365 | 59, 222, 248, 344, 365 | 0.261 (0.249–0.274)
A | N2 | 2A.1 | 586 | 9, 43, 50, 141, 199, 356 | 20, 43, 141, 199, 356 | 20, 43, 141, 199, 356 | 0.252 (0.240–0.264)
A | N2 | 2A.2 | 210 | 30 | None | 43 | 0.174 (0.162–0.186)
A | N2 | 2A.3 | 328 | 356, 416 | 113, 356, 414, 416 | 356, 414, 416 | 0.218 (0.204–0.233)
A | N2 | 2B | 2169 | 5, 43, 56, 120, 126, 148, 151, 370, 434 | 5, 43, 44, 56, 120, 126, 147, 148, 151, 370, 434 | 43, 56, 127, 147, 267, 332, 358, 370, 392, 455 | 0.313 (0.301–0.326)
A | N3 | 3A | 113 | None | 413, 432, 457 | 413 | 0.130 (0.115–0.146)
A | N3 | 3B | 120 | None | 413 | 52, 413 | 0.161 (0.145–0.178)
A | N3 | 3C | 9 | None | None | None | 0.092 (0.074–0.113)
A | N4 | 4A | 39 | None | 74 | None | 0.081 (0.065–0.100)
A | N4 | 4B | 11 | None | None | 78 | 0.062 (0.047–0.080)
A | N5 | 5A | 68 | None | 30, 282 | 30, 282 | 0.140 (0.122–0.160)
A | N5 | 5B | 17 | None | None | 30 | 0.078 (0.061–0.097)
A | N6 | 6A | 206 | None | None | 172 | 0.111 (0.100–0.123)
A | N6 | 6B | 45 | None | None | None | 0.114 (0.100–0.129)
A | N7 | 7A | 90 | None | None | None | 0.153 (0.132–0.176)
A | N7 | 7B | 42 | None | 42 | None | 0.092 (0.079–0.107)
A | N7 | 7C | 10 | None | None | None | 0.135 (0.091–0.191)
A | N8 | 8A | 253 | 265 | 265 | 265, 376 | 0.128 (0.118–0.138)
A | N8 | 8B | 95 | None | None | None | 0.281 (0.242–0.323)
A | N8 | 8C | 61 | None | 35, 41 | None | 0.129 (0.114–0.145)
A | N9 | 9A | 76 | None | None | None | 0.095 (0.082–0.109)
A | N9 | 9B | 25 | None | None | None | 0.106 (0.081–0.136)
A | N9 | 9C | 9 | None | None | None | 0.068 (0.047–0.095)
B |  | Yam88 | 565 | 42, 65, 248, 373 | 65, 248, 345, 373, 395 | 42, 65, 248, 373, 389, 436 | 0.259 (0.238–0.281)
B |  | Vic87 | 83 | None | 345 | 106, 345 | 0.257 (0.215–0.305)
Position relative to the start codon.
Human lineages were found to have the largest numbers of positively selected sites, with 16 sites for the human N2 lineage (2B), 9 sites for human H1N1 lineage (1C), and 8 sites for Yam88 lineage (
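The site counts quoted for lineages 1C and Yam88 can be reproduced with a short sketch, under the assumption that a site is counted as positively selected when at least one of the three methods (SLAC, FEL, IFEL) reports it. That counting rule is inferred from the table, not stated in the text, and the helper names are hypothetical.

```python
# Positively selected sites per method, transcribed from the selection
# table above for two human lineages.
SITES = {
    "1C": {
        "SLAC": {84, 222, 248},
        "FEL": {19, 84, 151, 222, 248, 319, 365},
        "IFEL": {59, 222, 248, 344, 365},
    },
    "Yam88": {
        "SLAC": {42, 65, 248, 373},
        "FEL": {65, 248, 345, 373, 395},
        "IFEL": {42, 65, 248, 373, 389, 436},
    },
}


def union_sites(by_method):
    """Sites reported by at least one method (assumed counting rule)."""
    return set().union(*by_method.values())


def consensus_sites(by_method):
    """Sites reported by every method, a stricter alternative."""
    return set.intersection(*by_method.values())


if __name__ == "__main__":
    for lineage, by_method in SITES.items():
        union = sorted(union_sites(by_method))
        consensus = sorted(consensus_sites(by_method))
        print(lineage, len(union), union, consensus)
```

Under the union rule the sketch gives 9 sites for 1C and 8 sites for Yam88, matching the counts stated above; the consensus sets are much smaller, which shows how sensitive the reported number is to the rule chosen.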
Protein structure analyses revealed all the positively selected sites were located at the surface of the NA protein and pertained to antibody binding and/or interactions with the sugar molecules of host cells (
A: Influenza A human N1 neuraminidase (1C) (A/Brevig Mission/1/18 H1N1, 1918 “Spanish flu”, PDB ID: 3B7E); B: Influenza A human N2 neuraminidase (2B) (A/Tokyo/3/67 H2N2, 1967, PDB ID: 1IVG); C: Influenza B viral neuraminidase for Yam88 (B/Perth/211/2001, PDB ID: 3K36); D: Influenza B viral neuraminidase for Vic87 (B/Perth/211/2001, PDB ID: 3K36). The positive selection sites are denoted as green balls. Structural regions are denoted in different colors: yellow for alpha-helices, red for beta sheets, and blue for loops.
In the human H1N1 lineage (1C), amino acid positions 151, 222 and 344 were found to be under strong positive selection, and according to the NA structure the residues at these positions appear to interact with the NA inhibitor zanamivir (
With regard to another human lineage (2B), positions 126 and 127 were found to be within the binding pocket of influenza A virus (
For human influenza B, positions 42, 65, 248, 345, 373, 389, 395, and 436 were found to be under positive selection (
The ML and Bayesian MCMC analyses revealed that the divergence of influenza A and B NA genes occurred earlier than the divergence of influenza A NA subtypes. Similar findings were reported for the hemagglutinin (HA) genes
The lineages from different hosts are colored, with the emergence times of the lineages represented by the horizontal positions of squared boxes and the mean substitution rates depicted by the degree of line thickness. Note that within 2B there are five swine clusters.
In this study, 23 NA lineages were determined within influenza A based upon both theoretical (e.g., phylogenetic tree topology) and empirical criteria (e.g., pandemic events). The majority of lineages were found to be specific to hosts or geographical locations, with a genetic distance around 0.2, ranging from 0.117 to 0.349. These results are generally consistent with previous findings
Classification and designation of the lineages and sublineages within the influenza A virus are essential for studies of viral evolution, ecology and epidemiology. However, accurately identifying an evolutionary lineage of influenza A viruses is challenging, and whether a naming system will be accepted and used by influenza researchers is even more so. To trace the evolutionary change of highly pathogenic avian influenza (HPAI) viruses, a hierarchical nomenclature system for HPAI hemagglutinin clades and sub-clades has been implemented by the WHO/OIE/FAO H5N1 Evolution Working Group and widely adopted by the research community
It is notable that substitution rates are not the same across all branches within a phylogenetic tree. The relaxed clock model was developed to cope with this issue. Under the relaxed clock model in BEAST, an average rate across all branches is estimated for each sampled tree, and the 95% HPDs are summarized from these per-tree average rates
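How a 95% HPD can be summarized from a set of sampled average rates is sketched below, using the usual definition of the HPD as the shortest interval that contains the requested fraction of the posterior samples. This is a generic illustration, not BEAST or Tracer code; the function name is hypothetical and the rates in the demo are made-up placeholders, not values from this study.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Shortest interval containing at least `mass` of the samples.

    Generic highest-posterior-density summary for a unimodal sample;
    illustrative only, not the BEAST/Tracer implementation.
    """
    if not 0 < mass <= 1:
        raise ValueError("mass must be in (0, 1]")
    xs = sorted(samples)
    n = len(xs)
    if n == 0:
        raise ValueError("no samples given")
    k = max(1, math.ceil(mass * n))  # number of samples the interval must cover
    best_lo, best_hi = xs[0], xs[k - 1]
    # Slide a window of k consecutive sorted samples and keep the narrowest.
    for i in range(1, n - k + 1):
        lo, hi = xs[i], xs[i + k - 1]
        if hi - lo < best_hi - best_lo:
            best_lo, best_hi = lo, hi
    return best_lo, best_hi


if __name__ == "__main__":
    # Hypothetical per-tree mean rates (x10^-3 subs/site/year), for illustration only.
    rates = [2.9, 3.0, 3.1, 3.05, 2.95, 3.2, 3.0, 2.8, 3.1, 4.5]
    print(hpd_interval(rates, mass=0.5))
```

Unlike an equal-tailed interval, the HPD shifts toward the densest part of the sample, so a single outlying tree has little influence on the reported bounds.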
This study showed that human influenza viruses have little geographical restriction, indicating that human viruses were transmitted globally and probably rapidly as well
Lineage 2B includes human influenza viruses isolated from two different subtypes, H2N2 between 1957 and 1968 and H3N2 after 1968, which share the same N2 gene maintained in human influenza virus after the antigenic shift from H2 to H3 occurred in 1968
In addition to the above discussed human lineages, pandemic H1N1 2009 influenza viruses are believed to have arisen from a reassortment between North American and Eurasian swine lineages, and as expected, the pandemic H1N1 2009 viruses grouped with the Eurasian swine lineage
Influenza viruses circulating in non-human species have evolved in association with their various hosts on different continents for extended periods of time
The two subtype-specific avian sublineages, 1A.1 for H5N1 and 2A.1 for H9N2, are considered to have pandemic potential and were found to evolve relatively faster compared with other avian lineages from multiple subtypes (
Two lineages, H7N7 (7C) and H3N8 (8B), were revealed in equine influenza viruses. The H7N7 equine influenza viruses have not been detected since the late 1970s
Two major swine virus groups, Eurasian (avian-like) swine (1A.4) and North American swine (1B), were found within N1 (
Complicated evolutionary dynamics were observed in lineage 2B. Within this major human lineage, five separate sub-clusters of swine viruses occurred in North America and Eurasia, suggesting that human-origin N2 genes were transmitted to swine in at least five separate instances (
Similar complexity was seen in North America, where outbreaks of influenza were observed in 1998 in swine herds in Minnesota, Iowa, and Texas. The outbreaks were caused by a triple-reassortant H3N2 virus that contained genes from human (HA, NA, and PB1), swine (NS, NP, and M), and avian (PB2 and PA) influenza viruses
A total of 14,328 neuraminidase (NA) nucleotide sequences longer than 1330 nts, excluding laboratory recombinant sequences, were downloaded from the Influenza Virus Resource at NCBI
Influenza | Subtype | Human | Avian | Swine | Equine | Others | Total
A | N1 | 3810 | 1853 | 243 | 0 | 50 | 5956
A | N2 | 3378 | 1215 | 258 | 0 | 81 | 4932
A | N3 | 1 | 412 | 4 | 0 | 26 | 443
A | N4 | 0 | 121 | 0 | 0 | 2 | 123
A | N5 | 0 | 141 | 0 | 0 | 0 | 141
A | N6 | 0 | 583 | 3 | 0 | 24 | 610
A | N7 | 0 | 219 | 1 | 11 | 4 | 235
A | N8 | 0 | 568 | 2 | 118 | 95 | 783
A | N9 | 0 | 192 | 0 | 0 | 2 | 194
B |  | 911 | 0 | 0 | 0 | 0 | 911
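As a quick consistency check on the host-by-subtype table above, the following sketch recomputes the row totals and the grand total from the per-host counts. The counts are transcribed from the table; the variable and function names are illustrative.

```python
HOSTS = ("Human", "Avian", "Swine", "Equine", "Others")

# Per-host sequence counts transcribed from the table above.
COUNTS = {
    "N1": (3810, 1853, 243, 0, 50),
    "N2": (3378, 1215, 258, 0, 81),
    "N3": (1, 412, 4, 0, 26),
    "N4": (0, 121, 0, 0, 2),
    "N5": (0, 141, 0, 0, 0),
    "N6": (0, 583, 3, 0, 24),
    "N7": (0, 219, 1, 11, 4),
    "N8": (0, 568, 2, 118, 95),
    "N9": (0, 192, 0, 0, 2),
    "B": (911, 0, 0, 0, 0),
}

# Totals as printed in the last column of the table.
REPORTED_TOTALS = {
    "N1": 5956, "N2": 4932, "N3": 443, "N4": 123, "N5": 141,
    "N6": 610, "N7": 235, "N8": 783, "N9": 194, "B": 911,
}


def row_totals(counts):
    """Sum the host counts for each subtype or type."""
    return {name: sum(row) for name, row in counts.items()}


def host_totals(counts):
    """Sum each host column across all subtypes and types."""
    return dict(zip(HOSTS, (sum(col) for col in zip(*counts.values()))))


if __name__ == "__main__":
    totals = row_totals(COUNTS)
    print("rows agree with table:", totals == REPORTED_TOTALS)
    print("grand total:", sum(totals.values()))
    print("by host:", host_totals(COUNTS))
```

The recomputed grand total equals the 14,328 sequences stated in the text, and every row total agrees with the printed table.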
Homologous gene recombination was identified using the 3SEQ algorithm under RDP3
The SeqMat program was used to collapse similar sequences from the same location and the same year, yielding ∼1500 representative sequences each for N1 and N2
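The idea of collapsing similar sequences within a location and year can be sketched as follows. This is not the SeqMat algorithm, whose details are not given here; it is a simple greedy stand-in that assumes pre-aligned sequences of equal length, a hypothetical identity threshold, and records supplied as (name, location, year, sequence) tuples.

```python
from collections import defaultdict


def identity(a, b):
    """Fraction of identical positions between two aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned, non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def collapse(records, threshold=0.99):
    """Greedy collapse: keep a record only if it is not near-identical to a
    representative already kept for the same (location, year).

    `threshold` is a hypothetical cut-off chosen for illustration.
    """
    reps_by_group = defaultdict(list)
    kept = []
    for record in records:
        _name, location, year, seq = record
        group = reps_by_group[(location, year)]
        if any(identity(seq, rep[3]) >= threshold for rep in group):
            continue  # represented by an earlier sequence from the same place and year
        group.append(record)
        kept.append(record)
    return kept


if __name__ == "__main__":
    # Toy records with made-up names and sequences.
    toy = [
        ("s1", "Texas", 2009, "ACGTACGTAC"),
        ("s2", "Texas", 2009, "ACGTACGTAC"),
        ("s3", "Texas", 2008, "ACGTACGTAC"),
    ]
    print([r[0] for r in collapse(toy, threshold=0.95)])
```

Because the procedure is greedy, the representatives depend on input order; sorting the records by collection date before collapsing would be one way to make the result reproducible.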
Influenza A and B NA sequences are only remotely related, with around 40% nucleotide sequence similarity. We therefore conducted both protein and nucleotide sequence alignments, using Expresso, a program that aligns proteins on the basis of structural information, and TranslatorX, a program that aligns nucleotide sequences guided by the corresponding protein sequence alignment
Phylogenetic analysis was conducted using the Maximum-likelihood (ML) method in RAxML
Lineages were determined based upon the topology of phylogenetic trees and strong bootstrap support values (100 for influenza A and approximately 90 for influenza B). The genetic distances between lineages were calculated using the Kimura-2-Parameter (K2P) distance metric in MEGA 5.0
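For reference, the K2P distance between two aligned sequences is d = −½ ln(1 − 2P − Q) − ¼ ln(1 − 2Q), where P and Q are the proportions of sites showing transitions and transversions. The sketch below implements this formula for a single pair of sequences; it is an illustration rather than MEGA's implementation, and skipping positions with gaps or ambiguous bases is an assumption made here.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned nucleotide sequences.

    Positions with gaps or ambiguous bases are skipped (an assumption of
    this sketch). U is treated as T.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    s1 = seq1.upper().replace("U", "T")
    s2 = seq2.upper().replace("U", "T")
    transitions = transversions = compared = 0
    for a, b in zip(s1, s2):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1   # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable positions")
    p = transitions / compared
    q = transversions / compared
    term1 = 1 - 2 * p - q
    term2 = 1 - 2 * q
    if term1 <= 0 or term2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term1) - 0.25 * math.log(term2)


if __name__ == "__main__":
    print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```

A between-lineage distance would then be the average of such pairwise values over all sequence pairs drawn from the two lineages.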
The substitution rate and the time of most recent common ancestor (tMRCA) were estimated for each lineage/sublineage using the Bayesian Markov Chain Monte Carlo (MCMC) method available in the BEAST package
Three clock models were compared statistically for each dataset using a Bayes factor test in the Tracer program
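A Bayes factor comparison of this kind reduces to differences between log marginal likelihoods. The sketch below assumes such estimates are already available for each clock model and grades the evidence on the commonly used Kass and Raftery scale for 2 ln BF; the model labels and numbers in the demo are hypothetical placeholders, not results from this study, and the code does not reproduce Tracer's estimator.

```python
def log_bayes_factor(log_ml_a, log_ml_b):
    """ln BF of model A over model B, given their log marginal likelihoods."""
    return log_ml_a - log_ml_b


def evidence_label(ln_bf):
    """Grade 2*ln(BF) on the Kass and Raftery (1995) scale."""
    two_ln_bf = 2 * ln_bf
    if two_ln_bf > 10:
        return "very strong"
    if two_ln_bf > 6:
        return "strong"
    if two_ln_bf > 2:
        return "positive"
    if two_ln_bf >= 0:
        return "not worth more than a bare mention"
    return "favours the other model"


def rank_models(log_mls):
    """Model names ordered from highest to lowest log marginal likelihood."""
    return sorted(log_mls, key=log_mls.get, reverse=True)


if __name__ == "__main__":
    # Hypothetical log marginal likelihoods, for illustration only.
    log_mls = {"model_1": -15230.4, "model_2": -15218.9, "model_3": -15221.7}
    ranking = rank_models(log_mls)
    best = ranking[0]
    for other in ranking[1:]:
        ln_bf = log_bayes_factor(log_mls[best], log_mls[other])
        print(f"{best} vs {other}: ln BF = {ln_bf:.1f} ({evidence_label(ln_bf)})")
```

Working on the log scale avoids overflow, since the marginal likelihoods themselves are far too small to represent directly.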
The ratio of non-synonymous (
We are grateful to the Holland Computing Center (HCC) at the University of Nebraska-Lincoln (UNL) for computing support. We especially thank our UNL colleagues David Swanson, Ashu Guru, and Jun Wang, and our UNO students and colleagues Pavan Attaluri, Santosh Servisetti, Thaine Rowley and Mohammad Shafiullah, for their help. We particularly thank the Academic Editor, Dr Dong-Yan Jin, and four anonymous reviewers for their helpful comments and constructive suggestions.