1
Zhang M, Zhou J, Ni R, Zhao X, Chen Y, Sun Y, Liu Z, Han X, Luo C, Fu X, Shao Y. Genomic Analyses Uncover Evolutionary Features of Influenza A/H3N2 Viruses in Yunnan Province, China, from 2017 to 2022. Viruses 2024; 16:138. [PMID: 38257838 PMCID: PMC10820241 DOI: 10.3390/v16010138] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/24/2023] [Revised: 01/13/2024] [Accepted: 01/16/2024] [Indexed: 01/24/2024]
Abstract
Influenza A viruses evolve at a high rate of nucleotide substitution, thereby requiring continuous monitoring to determine the efficacy of vaccines and antiviral drugs. In the current study, we performed whole-genome sequencing analyses of 253 influenza A/H3N2 strains from Yunnan Province, China, during 2017-2022. The hemagglutinin (HA) segments of Yunnan A/H3N2 strains isolated during 2017-2018 harbored a high genetic diversity due to heterogeneous distribution across branches. The mutation patterns of the predominant antigenic epitopes of HA segments in Yunnan varied from year to year. Some important functional mutations in gene segments associated with viral adaptation and drug tolerance were revealed. The rapid genomic evolution of Yunnan A/H3N2 strains from 2017 to 2022 was concentrated mainly in matrix protein 2 (M2), non-structural protein 1 (NS1), neuraminidase (NA), NS2, and HA, with a high overall non-synonymous/synonymous substitution ratio (dN/dS). Our results highlighted a decline in vaccine efficacy against the A/H3N2 circulating strains, particularly against the Yunnan 2021-2022 A/H3N2 strains. These findings aid our understanding of evolutionary characteristics and epidemiological monitoring of the A/H3N2 viruses and provide in-depth insights into the protective efficacy of influenza vaccines.
Affiliation(s)
- Meiling Zhang
- Department of Acute Infectious Diseases Control and Prevention, Yunnan Center for Disease Control and Prevention, Kunming 650022, China
- Jienan Zhou
- Department of Acute Infectious Diseases Control and Prevention, Yunnan Center for Disease Control and Prevention, Kunming 650022, China
- Ruize Ni
- Department of Acute Infectious Diseases Control and Prevention, Yunnan Center for Disease Control and Prevention, Kunming 650022, China
- Xiaonan Zhao
- Department of Acute Infectious Diseases Control and Prevention, Yunnan Center for Disease Control and Prevention, Kunming 650022, China
- Yaoyao Chen
- Department of Acute Infectious Diseases Control and Prevention, Yunnan Center for Disease Control and Prevention, Kunming 650022, China
- Yanhong Sun
- Department of Acute Infectious Diseases Control and Prevention, Yunnan Center for Disease Control and Prevention, Kunming 650022, China
- Zhaosheng Liu
- Department of Acute Infectious Diseases Control and Prevention, Yunnan Center for Disease Control and Prevention, Kunming 650022, China
- Xiaoyu Han
- Department of Acute Infectious Diseases Control and Prevention, Yunnan Center for Disease Control and Prevention, Kunming 650022, China
- Chunrui Luo
- Department of Acute Infectious Diseases Control and Prevention, Yunnan Center for Disease Control and Prevention, Kunming 650022, China
- Xiaoqing Fu
- Department of Acute Infectious Diseases Control and Prevention, Yunnan Center for Disease Control and Prevention, Kunming 650022, China
- Yong Shao
- State Key Laboratory of Genetic Resources and Evolution, Chinese Academy of Sciences, Kunming Institute of Zoology, Kunming 650201, China
2
Amin MR, Hasan M, Arnab SP, DeGiorgio M. Tensor Decomposition-based Feature Extraction and Classification to Detect Natural Selection from Genomic Data. Mol Biol Evol 2023; 40:msad216. [PMID: 37772983 PMCID: PMC10581699 DOI: 10.1093/molbev/msad216] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2023] [Revised: 08/10/2023] [Accepted: 09/14/2023] [Indexed: 09/30/2023]
Abstract
Inferences of adaptive events are important for learning about traits, such as human digestion of lactose after infancy and the rapid spread of viral variants. Early efforts toward identifying footprints of natural selection from genomic data involved development of summary statistic and likelihood methods. However, such techniques are grounded in simple patterns or theoretical models that limit the complexity of settings they can explore. Due to the renaissance in artificial intelligence, machine learning methods have taken center stage in recent efforts to detect natural selection, with strategies such as convolutional neural networks applied to images of haplotypes. Yet, limitations of such techniques include estimation of large numbers of model parameters under nonconvex settings and feature identification without regard to location within an image. An alternative approach is to use tensor decomposition to extract features from multidimensional data while preserving the latent structure of the data, and to feed these features to machine learning models. Here, we adopt this framework and present a novel approach termed T-REx, which extracts features from images of haplotypes across sampled individuals using tensor decomposition, and then makes predictions from these features using classical machine learning methods. As a proof of concept, we explore the performance of T-REx on simulated neutral and selective sweep scenarios and find that it has high power and accuracy to discriminate sweeps from neutrality, robustness to common technical hurdles, and easy visualization of feature importance. Therefore, T-REx is a powerful addition to the toolkit for detecting adaptive processes from genomic data.
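The general idea in this abstract (decompose a stack of haplotype images, then classify the extracted features with a classical method) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the decomposition is a truncated higher-order SVD computed with NumPy, the classifier is a nearest-centroid rule, and the 16x16 binary "images" are synthetic. It is not the T-REx implementation, whose decomposition and classifiers are not specified in the abstract.

```python
import numpy as np


def unfold(tensor, mode):
    """Matricize a 3-way tensor so that rows index the chosen mode."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def fit_factors(images, rank_rows, rank_cols):
    """Truncated HOSVD bases for the haplotype (row) and site (column) modes."""
    u_rows = np.linalg.svd(unfold(images, 1), full_matrices=False)[0][:, :rank_rows]
    u_cols = np.linalg.svd(unfold(images, 2), full_matrices=False)[0][:, :rank_cols]
    return u_rows, u_cols


def extract_features(images, u_rows, u_cols):
    """Project every image onto both bases and flatten the resulting core slice."""
    core = np.einsum("nij,ia,jb->nab", images, u_rows, u_cols)
    return core.reshape(images.shape[0], -1)


def nearest_centroid(train_x, train_y, test_x):
    """Stand-in classical classifier: assign each sample to the closest class mean."""
    classes = np.unique(train_y)
    centroids = np.stack([train_x[train_y == c].mean(axis=0) for c in classes])
    dists = np.linalg.norm(test_x[:, None, :] - centroids[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]


def toy_images(rng, n, sweep):
    """Hypothetical 16x16 binary haplotype images; 'sweeps' lose diversity mid-window."""
    imgs = rng.integers(0, 2, size=(n, 16, 16)).astype(float)
    if sweep:
        imgs[:, :, 6:10] = 0.0
    return imgs


rng = np.random.default_rng(0)
train = np.concatenate([toy_images(rng, 100, False), toy_images(rng, 100, True)])
train_y = np.array([0] * 100 + [1] * 100)
test = np.concatenate([toy_images(rng, 20, False), toy_images(rng, 20, True)])

# Bases are learned on the training tensor only and reused for the test images.
u_rows, u_cols = fit_factors(train, rank_rows=4, rank_cols=4)
train_x = extract_features(train, u_rows, u_cols)
test_x = extract_features(test, u_rows, u_cols)
pred = nearest_centroid(train_x, train_y, test_x)
```

Each image is reduced from 256 pixels to a 4x4 core (16 features) while the row and column structure of the image is retained in the bases, which is the property the abstract contrasts with location-agnostic convolutional features.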
Affiliation(s)
- Md Ruhul Amin
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
- Mahmudul Hasan
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
- Sandipan Paul Arnab
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
- Michael DeGiorgio
- Department of Electrical Engineering and Computer Science, Florida Atlantic University, Boca Raton, FL 33431, USA
3
Amin MR, Hasan M, Arnab SP, DeGiorgio M. Tensor decomposition based feature extraction and classification to detect natural selection from genomic data. bioRxiv: the preprint server for biology 2023:2023.03.27.527731. [PMID: 37034767 PMCID: PMC10081272 DOI: 10.1101/2023.03.27.527731] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Inferences of adaptive events are important for learning about traits, such as human digestion of lactose after infancy and the rapid spread of viral variants. Early efforts toward identifying footprints of natural selection from genomic data involved development of summary statistic and likelihood methods. However, such techniques are grounded in simple patterns or theoretical models that limit the complexity of settings they can explore. Due to the renaissance in artificial intelligence, machine learning methods have taken center stage in recent efforts to detect natural selection, with strategies such as convolutional neural networks applied to images of haplotypes. Yet, limitations of such techniques include estimation of large numbers of model parameters under non-convex settings and feature identification without regard to location within an image. An alternative approach is to use tensor decomposition to extract features from multidimensional data while preserving the latent structure of the data, and to feed these features to machine learning models. Here, we adopt this framework and present a novel approach termed T-REx, which extracts features from images of haplotypes across sampled individuals using tensor decomposition, and then makes predictions from these features using classical machine learning methods. As a proof of concept, we explore the performance of T-REx on simulated neutral and selective sweep scenarios and find that it has high power and accuracy to discriminate sweeps from neutrality, robustness to common technical hurdles, and easy visualization of feature importance. Therefore, T-REx is a powerful addition to the toolkit for detecting adaptive processes from genomic data.
4
Croze M, Kim Y. Inference of population genetic parameters from an irregular time series of seasonal influenza virus sequences. Genetics 2021; 217:6066165. [PMID: 33724414 DOI: 10.1093/genetics/iyaa039] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/22/2020] [Accepted: 12/17/2020] [Indexed: 11/12/2022]
Abstract
Basic summary statistics that quantify the population genetic structure of influenza virus are important for understanding and inferring the evolutionary and epidemiological processes. However, the sampling dates of global virus sequences in the last several decades are scattered nonuniformly throughout the calendar. Such temporal structure of samples and the small effective size of the viral population hamper the use of conventional methods to calculate summary statistics. Here, we define statistics that overcome this problem by correcting for the sampling-time difference in quantifying a pairwise sequence difference. A simple linear regression method jointly estimates the mutation rate and the level of sequence polymorphism, thus providing an estimate of the effective population size. It also leads to the definition of Wright's FST for arbitrary time-series data. Furthermore, as an alternative to Tajima's D statistic or the site-frequency spectrum, a mismatch distribution corrected for sampling-time differences can be obtained and compared between actual and simulated data. Application of these methods to seasonal influenza A/H3N2 viruses sampled between 1980 and 2017 and sequences simulated under the model of recurrent positive selection with metapopulation dynamics allowed us to estimate the synonymous mutation rate and find parameter values for selection and demographic structure that fit the observation. We found that the mutation rates of HA and PB1 segments before 2007 were particularly high and that including recurrent positive selection in our model was essential for the genealogical structure of the HA segment. Methods developed here can be generally applied to population genetic inferences using serially sampled genetic data.
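The regression step mentioned in this abstract can be sketched as follows. The sketch assumes one plausible reading: the number of differences between two sequences is regressed on the absolute difference of their sampling times, so that the slope reflects how fast differences accumulate and the intercept reflects diversity among contemporaneous sequences. The function names, the toy sequences, and the plain least-squares fit are all illustrative assumptions; the paper's exact estimators are not given in the abstract and are not reproduced here.

```python
import itertools

import numpy as np


def hamming(a, b):
    """Number of mismatching positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def pairwise_table(seqs, times):
    """Return (|t_i - t_j|, d_ij) for every unordered pair of sampled sequences."""
    dt, d = [], []
    for (s1, t1), (s2, t2) in itertools.combinations(zip(seqs, times), 2):
        dt.append(abs(t1 - t2))
        d.append(hamming(s1, s2))
    return np.array(dt, dtype=float), np.array(d, dtype=float)


def fit_line(dt, d):
    """Ordinary least squares for d = intercept + slope * dt."""
    slope, intercept = np.polyfit(dt, d, 1)
    return slope, intercept


# Toy, hypothetical data: three aligned sequences with sampling years.
seqs = ["AAAA", "AAAT", "AATT"]
times = [2000.0, 2001.0, 2003.0]
dt, d = pairwise_table(seqs, times)
slope, intercept = fit_line(dt, d)
```

Because every sequence appears in many pairs, the points are not independent, so the usual regression standard errors should not be read off such a fit without further correction.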
Affiliation(s)
- Myriam Croze
- Division of EcoScience, Ewha Womans University, Seoul 03760, Korea
- Yuseob Kim
- Division of EcoScience, Ewha Womans University, Seoul 03760, Korea; Department of Life Science, Ewha Womans University, Seoul 03760, Korea
5
Eliseev A, Gibson KM, Avdeyev P, Novik D, Bendall ML, Pérez-Losada M, Alexeev N, Crandall KA. Evaluation of haplotype callers for next-generation sequencing of viruses. Infection, Genetics and Evolution: Journal of Molecular Epidemiology and Evolutionary Genetics in Infectious Diseases 2020; 82:104277. [PMID: 32151775 PMCID: PMC7293574 DOI: 10.1016/j.meegid.2020.104277] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 11/04/2019] [Revised: 03/04/2020] [Accepted: 03/06/2020] [Indexed: 01/30/2023]
Abstract
Currently, the standard practice for assembling next-generation sequencing (NGS) reads of viral genomes is to summarize thousands of individual short reads into a single consensus sequence, thus confounding useful intra-host diversity information for molecular phylodynamic inference. It is hypothesized that a few viral strains may dominate the intra-host genetic diversity with a variety of lower frequency strains comprising the rest of the population. Several software tools currently exist to convert NGS sequence variants into haplotypes. Previous benchmarks of viral haplotype reconstruction programs used simulation scenarios that are useful from a mathematical perspective but do not reflect viral evolution and epidemiology. Here, we tested twelve NGS haplotype reconstruction methods using viral populations simulated under realistic evolutionary dynamics. We simulated coalescent-based populations that spanned known levels of viral genetic diversity, including mutation rates, sample size and effective population size, to test the limits of the haplotype reconstruction methods and to ensure coverage of predicted intra-host viral diversity levels (especially HIV-1). All twelve investigated haplotype callers showed variable performance and produced drastically different results that were mainly driven by differences in mutation rate and, to a lesser extent, in effective population size. Most methods were able to accurately reconstruct haplotypes when genetic diversity was low. However, under higher levels of diversity (e.g., those seen in intra-host HIV-1 infections), haplotype reconstruction quality was highly variable and, on average, poor. All haplotype reconstruction tools, except QuasiRecomb and ShoRAH, greatly underestimated intra-host diversity and the true number of haplotypes. PredictHaplo outperformed the other haplotype reconstruction tools, with the highest precision and recall and the lowest UniFrac distance values, followed by CliqueSNV, which, given more computational time, may have outperformed PredictHaplo. Here, we present an extensive comparison of available viral haplotype reconstruction tools and provide insights for future improvements in haplotype reconstruction tools using both short-read and long-read technologies.
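The precision and recall comparison described in this abstract can be made concrete with a small sketch. It assumes that a reconstructed haplotype counts as correct only if it matches a true haplotype exactly; the abstract does not state the matching criterion the authors used, and the sequences below are toy values.

```python
def precision_recall(predicted, truth):
    """Exact-match precision and recall of reconstructed haplotypes (assumed criterion)."""
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical example: four true haplotypes, three reconstructed, two of them correct.
truth = ["ACGT", "ACGA", "TCGA", "TCGT"]
predicted = ["ACGT", "ACGA", "GGGG"]
precision, recall = precision_recall(predicted, truth)
```

In this toy case the caller returns fewer haplotypes than truly exist, the kind of underestimation the abstract reports for most tools, and it shows up as a recall of one half while precision is two thirds.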
Affiliation(s)
- Anton Eliseev
- Computer Technologies Laboratory, ITMO University, Saint-Petersburg, Russia
- Keylie M Gibson
- Computational Biology Institute, Milken Institute School of Public Health, George Washington University, Washington, DC, USA
- Pavel Avdeyev
- Computational Biology Institute, Milken Institute School of Public Health, George Washington University, Washington, DC, USA; Department of Mathematics, George Washington University, Washington, DC, USA
- Dmitry Novik
- Computer Technologies Laboratory, ITMO University, Saint-Petersburg, Russia
- Matthew L Bendall
- Computational Biology Institute, Milken Institute School of Public Health, George Washington University, Washington, DC, USA
- Marcos Pérez-Losada
- Computational Biology Institute, Milken Institute School of Public Health, George Washington University, Washington, DC, USA; Department of Biostatistics and Bioinformatics, Milken Institute School of Public Health, George Washington University, Washington, DC, USA; CIBIO-InBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, Universidade do Porto, Campus Agrário de Vairão, Vairão, Portugal
- Nikita Alexeev
- Computer Technologies Laboratory, ITMO University, Saint-Petersburg, Russia
- Keith A Crandall
- Computational Biology Institute, Milken Institute School of Public Health, George Washington University, Washington, DC, USA; Department of Biostatistics and Bioinformatics, Milken Institute School of Public Health, George Washington University, Washington, DC, USA
6
Raghwani J, Thompson RN, Koelle K. Selection on non-antigenic gene segments of seasonal influenza A virus and its impact on adaptive evolution. Virus Evol 2017; 3:vex034. [PMID: 29250432 PMCID: PMC5724400 DOI: 10.1093/ve/vex034] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Indexed: 01/28/2023]
Abstract
Most studies on seasonal influenza A/H3N2 virus adaptation have focused on the main antigenic gene, hemagglutinin. However, there is increasing evidence that the genome-wide genetic background of novel antigenic variants can influence these variants’ emergence probabilities and impact their patterns of dominance in the population. This suggests that non-antigenic genes may be important in shaping the viral evolutionary dynamics. To better understand the role of selection on non-antigenic genes in the adaptive evolution of seasonal influenza viruses, we have developed a simple population genetic model that considers a virus with one antigenic and one non-antigenic gene segment. By simulating this model under different regimes of selection and reassortment, we find that the empirical patterns of lineage turnover for the antigenic and non-antigenic gene segments are best captured when there is both limited viral coinfection and selection operating on both gene segments. In contrast, under a scenario of only neutral evolution in the non-antigenic gene segment, we see persistence of multiple lineages for long periods of time in that segment, which is not compatible with observed molecular evolutionary patterns. Further, we find that reassortment, occurring in coinfected individuals, can increase the speed of viral adaptive evolution by primarily reducing selective interference and genetic linkage effects. Together, these findings suggest that, for influenza, with six internal or non-antigenic gene segments, the evolutionary dynamics of novel antigenic variants are likely to be influenced by the genome-wide genetic background as a result of linked selection among both beneficial and deleterious mutations.
Affiliation(s)
- Jayna Raghwani
- Department of Zoology, University of Oxford, Oxford, OX1 3SY, UK
- Robin N Thompson
- Department of Zoology, University of Oxford, Oxford, OX1 3SY, UK
- Katia Koelle
- Department of Biology, Duke University, Durham, NC 27708, USA
7
Glatman-Freedman A, Drori Y, Beni SA, Friedman N, Pando R, Sefty H, Tal I, McCauley J, Rahav G, Keller N, Shohat T, Mendelson E, Hindiyeh M, Mandelboim M. Genetic divergence of Influenza A(H3N2) amino acid substitutions mark the beginning of the 2016-2017 winter season in Israel. J Clin Virol 2017; 93:71-75. [PMID: 28672275 PMCID: PMC5711789 DOI: 10.1016/j.jcv.2017.05.020] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 01/23/2017] [Revised: 05/25/2017] [Accepted: 05/29/2017] [Indexed: 12/05/2022]
Abstract
BACKGROUND: Influenza vaccine composition is reevaluated each year due to the frequency and accumulation of genetic changes that influenza viruses undergo. The beginning of the 2016-2017 influenza surveillance period in Israel has been marked by the dominance of influenza A(H3N2). OBJECTIVES: To evaluate the type, subtype, genetic evolution and amino acid substitutions of influenza A(H3N2) viruses detected among community patients with influenza-like illness (ILI) and hospitalized patients with respiratory illness in the first weeks of the 2016-2017 influenza season. STUDY DESIGN: Respiratory samples from community patients with influenza-like illness and from hospitalized patients underwent identification, subtyping and molecular characterization. Hemagglutinin sequences were compared to the vaccine strain, a phylogenetic tree was created, and amino acid substitutions were determined. RESULTS: Influenza A(H3N2) predominated during the early stages of the 2016-2017 influenza season. Notably, approximately 20% of community patients and 36% of hospitalized patients who were positive for influenza A(H3N2) had received the 2016-2017 influenza vaccine. The influenza A(H3N2) viruses demonstrated genetic divergence from the vaccine strain into three separate subgroups within the 3C.2a clade. One resembled the new 3C.2a1 subclade, one resembled the recently proposed 3C.2a2 subclade and the other was not previously described. Diversity was observed within each subgroup, in terms of additional amino acid substitutions. CONCLUSIONS: Characterization of the 2016-2017 A(H3N2) influenza viruses is imperative for determining the future influenza vaccine composition.
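The comparison of hemagglutinin sequences against the vaccine strain described in the study design can be sketched as a position-by-position scan of two aligned protein sequences. The function name, the numbering offset, and the toy sequences are hypothetical and are not taken from the study; real HA numbering schemes (for example, mature H3 numbering) would need an appropriate `start` offset and a proper alignment step beforehand.

```python
def amino_acid_substitutions(reference, query, start=1):
    """List substitutions such as 'N4K' between two aligned protein sequences.

    Positions where either sequence has a gap ('-') or an unknown residue ('X')
    are skipped, because they do not represent an observed substitution.
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    substitutions = []
    for position, (ref_aa, query_aa) in enumerate(zip(reference, query), start=start):
        if ref_aa == query_aa or ref_aa in "-X" or query_aa in "-X":
            continue
        substitutions.append(f"{ref_aa}{position}{query_aa}")
    return substitutions


# Toy, hypothetical sequences (not real HA data).
vaccine_strain = "MKTNQSLA"
circulating_strain = "MKTKQSFA"
subs = amino_acid_substitutions(vaccine_strain, circulating_strain)
```

Grouping isolates by their shared substitution lists is one simple way to see subgroups like those the abstract describes within a clade, although the study itself relied on a phylogenetic tree for that step.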
Affiliation(s)
- Aharona Glatman-Freedman
- The Israel Center for Disease Control, Israel Ministry of Health, Tel-Hashomer, Ramat Gan, Israel; Department of Epidemiology and Preventive Medicine, School of Public Health, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel; Departments of Pediatrics and Family and Community Medicine, New York Medical College, Valhalla, New York, USA
- Yaron Drori
- Department of Epidemiology and Preventive Medicine, School of Public Health, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel; Central Virology Laboratory, Ministry of Health, Chaim Sheba Medical Center, Tel Hashomer, Ramat Gan, Israel
- Sharon Alexandra Beni
- Division of Infectious Diseases, Sheba Medical Center, Tel Hashomer, Ramat Gan, Israel
- Nehemya Friedman
- Department of Epidemiology and Preventive Medicine, School of Public Health, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel; Central Virology Laboratory, Ministry of Health, Chaim Sheba Medical Center, Tel Hashomer, Ramat Gan, Israel
- Rakefet Pando
- The Israel Center for Disease Control, Israel Ministry of Health, Tel-Hashomer, Ramat Gan, Israel; Central Virology Laboratory, Ministry of Health, Chaim Sheba Medical Center, Tel Hashomer, Ramat Gan, Israel
- Hanna Sefty
- The Israel Center for Disease Control, Israel Ministry of Health, Tel-Hashomer, Ramat Gan, Israel
- Ilana Tal
- Division of Infectious Diseases, Sheba Medical Center, Tel Hashomer, Ramat Gan, Israel
- John McCauley
- WHO Collaborating Centre for Reference and Research on Influenza, Crick Worldwide Influenza Centre, the Francis Crick Institute, London, United Kingdom
- Galia Rahav
- Division of Infectious Diseases, Sheba Medical Center, Tel Hashomer, Ramat Gan, Israel; Department of Internal Medicine, Sackler Faculty of Medicine, Tel-Aviv University, Israel
- Nathan Keller
- Microbiology Laboratory, Chaim Sheba Medical Center, Tel Hashomer, Ramat Gan, Israel; Ariel University, Ariel, Israel
- Tamy Shohat
- The Israel Center for Disease Control, Israel Ministry of Health, Tel-Hashomer, Ramat Gan, Israel; Department of Epidemiology and Preventive Medicine, School of Public Health, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Ella Mendelson
- Department of Epidemiology and Preventive Medicine, School of Public Health, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel; Central Virology Laboratory, Ministry of Health, Chaim Sheba Medical Center, Tel Hashomer, Ramat Gan, Israel
- Musa Hindiyeh
- Department of Epidemiology and Preventive Medicine, School of Public Health, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel; Central Virology Laboratory, Ministry of Health, Chaim Sheba Medical Center, Tel Hashomer, Ramat Gan, Israel
- Michal Mandelboim
- Department of Epidemiology and Preventive Medicine, School of Public Health, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel; Central Virology Laboratory, Ministry of Health, Chaim Sheba Medical Center, Tel Hashomer, Ramat Gan, Israel