1
Ye Q, Wang H, Xu F, Zhang S, Zhang S, Yang Z, Zhang L. Co-Mutations and Possible Variation Tendency of the Spike RBD and Membrane Protein in SARS-CoV-2 by Machine Learning. Int J Mol Sci 2024; 25:4662. [PMID: 38731879 PMCID: PMC11083383 DOI: 10.3390/ijms25094662] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2024] [Revised: 04/18/2024] [Accepted: 04/23/2024] [Indexed: 05/13/2024]
Abstract
Since the onset of the coronavirus disease 2019 (COVID-19) pandemic, SARS-CoV-2 variants capable of breakthrough infections have attracted global attention. These variants have significant mutations in the receptor-binding domain (RBD) of the spike protein and the membrane (M) protein, which may imply an enhanced ability to evade immune responses. In this study, an examination of co-mutations within the spike RBD and their potential correlation with mutations in the M protein was conducted. The EVmutation method was utilized to analyze the distribution of the mutations to elucidate the relationship between the mutations in the spike RBD and the alterations in the M protein. Additionally, the Sequence-to-Sequence Transformer Model (S2STM) was employed to establish mapping between the amino acid sequences of the spike RBD and M proteins, offering a novel and efficient approach for streamlined sequence analysis and the exploration of their interrelationship. Certain mutations in the spike RBD, G339D-S373P-S375F and Q493R-Q498R-Y505, are associated with a heightened propensity for inducing mutations at specific sites within the M protein, especially sites 3 and 19/63. These results shed light on the concept of mutational synergy between the spike RBD and M proteins, illuminating a potential mechanism that could be driving the evolution of SARS-CoV-2.
Affiliation(s)
- Qiushi Ye
  - MOE Key Laboratory for Nonequilibrium Synthesis and Modulation of Condensed Matter, School of Physics, Xi’an Jiaotong University, Xi’an 710049, China
- He Wang
  - MOE Key Laboratory for Nonequilibrium Synthesis and Modulation of Condensed Matter, School of Physics, Xi’an Jiaotong University, Xi’an 710049, China
- Fanding Xu
  - School of Life Science and Technology, Xi’an Jiaotong University, Xi’an 710049, China
- Sijia Zhang
  - MOE Key Laboratory for Nonequilibrium Synthesis and Modulation of Condensed Matter, School of Physics, Xi’an Jiaotong University, Xi’an 710049, China
- Shengli Zhang
  - MOE Key Laboratory for Nonequilibrium Synthesis and Modulation of Condensed Matter, School of Physics, Xi’an Jiaotong University, Xi’an 710049, China
- Zhiwei Yang
  - MOE Key Laboratory for Nonequilibrium Synthesis and Modulation of Condensed Matter, School of Physics, Xi’an Jiaotong University, Xi’an 710049, China
  - School of Life Science and Technology, Xi’an Jiaotong University, Xi’an 710049, China
- Lei Zhang
  - MOE Key Laboratory for Nonequilibrium Synthesis and Modulation of Condensed Matter, School of Physics, Xi’an Jiaotong University, Xi’an 710049, China
2
Loguercio S, Calverley BC, Wang C, Shak D, Zhao P, Sun S, Budinger GS, Balch WE. Understanding the host-pathogen evolutionary balance through Gaussian process modeling of SARS-CoV-2. Patterns (New York, N.Y.) 2023; 4:100800. [PMID: 37602209 PMCID: PMC10436005 DOI: 10.1016/j.patter.2023.100800] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2022] [Revised: 02/22/2023] [Accepted: 06/22/2023] [Indexed: 08/22/2023]
Abstract
We have developed a machine learning (ML) approach using Gaussian process (GP)-based spatial covariance (SCV) to track the impact of spatial-temporal mutational events driving host-pathogen balance in biology. We show how SCV can be applied to understanding the response of evolving covariant relationships linking the variant pattern of virus spread to pathology for the entire SARS-CoV-2 genome on a daily basis. We show that GP-based SCV relationships in conjunction with genome-wide co-occurrence analysis provide an early warning anomaly detection (EWAD) system for the emergence of variants of concern (VOCs). EWAD can anticipate changes in the pattern of performance of spread and pathology weeks in advance, identifying signatures destined to become VOCs. GP-based analyses of variation across entire viral genomes can be used to monitor micro and macro features responsible for host-pathogen balance. The versatility of GP-based SCV defines a starting point for understanding nature's evolutionary path to complexity through natural selection.
Affiliation(s)
- Ben C. Calverley
  - Department of Molecular Medicine, Scripps Research, La Jolla, CA, USA
- Chao Wang
  - Department of Molecular Medicine, Scripps Research, La Jolla, CA, USA
- Daniel Shak
  - Department of Molecular Medicine, Scripps Research, La Jolla, CA, USA
- Pei Zhao
  - Department of Molecular Medicine, Scripps Research, La Jolla, CA, USA
- Shuhong Sun
  - Department of Molecular Medicine, Scripps Research, La Jolla, CA, USA
- G.R. Scott Budinger
  - Division of Pulmonary and Critical Care Medicine, Northwestern University, Chicago, IL, USA
- William E. Balch
  - Department of Molecular Medicine, Scripps Research, La Jolla, CA, USA
3
Rahimian K, Arefian E, Mahdavi B, Mahmanzar M, Kuehu D, Deng Y. SARS2Mutant: SARS-CoV-2 amino-acid mutation atlas database. NAR Genom Bioinform 2023; 5:lqad037. [PMID: 37101659 PMCID: PMC10124966 DOI: 10.1093/nargab/lqad037] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/03/2022] [Revised: 02/27/2023] [Accepted: 04/18/2023] [Indexed: 04/28/2023]
Abstract
Coronavirus disease 2019 (COVID-19) is a highly pathogenic viral infection caused by the novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which resulted in the global pandemic of 2020. A lack of therapeutic and preventive strategies has quickly posed significant threats to world health. A comprehensive understanding of SARS-CoV-2 evolution and natural selection, and of how it impacts host interaction and phenotype symptoms, is vital to develop effective strategies against the virus. The SARS2Mutant database (http://sars2mutant.com/) was developed to provide valuable insights based on millions of high-quality, high-coverage SARS-CoV-2 complete protein sequences. Users of this database can search for information on three amino acid substitution mutation strategies based on gene name, geographical zone, or comparative analysis. Each strategy is presented in five distinct formats, which include: (i) mutated sample frequencies, (ii) heat maps of mutated amino acid positions, (iii) mutation survivals, (iv) natural selections and (v) details of substituted amino acids, including their names, positions, and frequencies. GISAID is a primary database of genomic sequences of influenza viruses updated daily. SARS2Mutant is a secondary database developed to discover mutation and conserved regions from the primary data to assist with design for targeted vaccine, primer, and drug discoveries.
Affiliation(s)
- Karim Rahimian
  - Institute of Biochemistry and Biophysics (IBB), University of Tehran, Tehran, Iran
- Ehsan Arefian
  - Department of Microbiology, School of Biology, College of Science, University of Tehran, Tehran, Iran
- Bahar Mahdavi
  - Department of Computer Science, Tarbiat Modares University, Tehran, Iran
- Mohammadamin Mahmanzar
  - Department of Quantitative Health Sciences, John A. Burns School of Medicine, University of Hawaii at Manoa, Honolulu, HI 96813, USA
- Donna Lee Kuehu
  - Department of Quantitative Health Sciences, John A. Burns School of Medicine, University of Hawaii at Manoa, Honolulu, HI 96813, USA
- Youping Deng
  - Department of Quantitative Health Sciences, John A. Burns School of Medicine, University of Hawaii at Manoa, Honolulu, HI 96813, USA
4
Nucleotide-based genetic networks: Methods and applications. J Biosci 2022. [PMID: 36226367 PMCID: PMC9554864 DOI: 10.1007/s12038-022-00290-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]
Abstract
Genomic variations have been acclaimed as among the key players in understanding the biological mechanisms behind migration, evolution, and adaptation to extreme conditions. Due to stochastic evolutionary forces, the frequency of polymorphisms is affected by changes in the frequency of nearby polymorphisms in the same DNA sample, making them connected in terms of evolution. This article presents all the ingredients to understand the cumulative effects and complex behaviors of genetic variations in the human mitochondrial genome by analyzing co-occurrence networks of nucleotides, and shows key results obtained from such analyses. The article emphasizes recent investigations of these co-occurrence networks, describing the role of interactions between nucleotides in fundamental processes of human migration and viral evolution. The corresponding co-mutation-based genetic networks revealed genetic signatures of human adaptation in extreme environments. This article provides the methods of constructing such networks in detail, along with their graph-theoretical properties, and applications of the genomic networks in understanding the role of nucleotide co-evolution in evolution of the whole genome.
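To make the idea of a nucleotide co-occurrence network concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the construction used in the article: the function names (`variant_sites`, `cooccurrence_network`, `degrees`), the `min_count` threshold, and the toy sequences are all hypothetical. Two positions are linked when both differ from a reference in at least `min_count` sequences, and node degree is reported as one simple graph-theoretical property.

```python
from collections import Counter
from itertools import combinations


def variant_sites(seq, reference):
    """Return the set of 0-based positions where seq differs from reference."""
    return {i for i, (a, b) in enumerate(zip(seq, reference)) if a != b}


def cooccurrence_network(sequences, reference, min_count=2):
    """Build an undirected, weighted network of co-occurring variant positions.

    Nodes are positions. An edge (i, j) with i < j is kept when both positions
    are variant in at least `min_count` sequences; its weight is that count.
    The threshold is an assumption made for this sketch.
    """
    pair_counts = Counter()
    for seq in sequences:
        sites = sorted(variant_sites(seq, reference))
        pair_counts.update(combinations(sites, 2))
    return {pair: n for pair, n in pair_counts.items() if n >= min_count}


def degrees(edges):
    """Node degree: number of distinct co-occurring partners per position."""
    deg = Counter()
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    return dict(deg)


# Toy aligned sequences (hypothetical data, for illustration only).
reference = "ACGTACGT"
sequences = [
    "ACGTACGT",  # identical to the reference
    "TCGTACGA",  # variants at positions 0 and 7
    "TCGAACGA",  # variants at positions 0, 3 and 7
    "TCGAACGT",  # variants at positions 0 and 3
]

edges = cooccurrence_network(sequences, reference)
node_degrees = degrees(edges)
print(edges)
print(node_degrees)
```

In this toy input, position 0 co-occurs with both 3 and 7 often enough to pass the threshold, so it ends up as the most connected node. Real analyses would typically use a statistical criterion for co-occurrence rather than a raw count.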
5
Alam ASMRU, Islam OK, Hasan MS, Islam MR, Mahmud S, Al‐Emran HM, Jahid IK, Crandall KA, Hossain MA. Dominant clade-featured SARS-CoV-2 co-occurring mutations reveal plausible epistasis: An in silico based hypothetical model. J Med Virol 2022; 94:1035-1049. [PMID: 34676891 PMCID: PMC8661685 DOI: 10.1002/jmv.27416] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 05/25/2021] [Revised: 10/15/2021] [Accepted: 10/20/2021] [Indexed: 01/18/2023]
Abstract
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has evolved into eight fundamental clades with four of these clades (G, GH, GR, and GV) globally prevalent in 2020. To explain plausible epistatic effects of the signature co-occurring mutations of these circulating clades on viral replication and transmission fitness, we proposed a hypothetical model using an in silico approach. Molecular docking and dynamics analyses showed the higher infectiousness of a spike mutant through more favorable binding of G614 with the elastase-2. RdRp mutation p.P323L significantly increased genome-wide mutations (p < 0.0001), allowing for more flexible RdRp (mutated)-NSP8 interaction that may accelerate replication. Superior RNA stability and structural variation at NSP3:C241T might impact protein, RNA interactions, or both. Another silent 5'-UTR:C241T mutation might affect translational efficiency and viral packaging. These four G-clade-featured co-occurring mutations might increase viral replication. Sentinel GH-clade ORF3a:p.Q57H variants constricted the ion-channel through intertransmembrane-domain interaction of cysteine(C81)-histidine(H57). The GR-clade N:p.RG203-204KR would stabilize RNA interaction by a more flexible and hypo-phosphorylated SR-rich region. GV-clade viruses seemingly gained the evolutionary advantage of the confounding factors; nevertheless, N:p.A220V might modulate RNA binding with no phenotypic effect. Our hypothetical model needs further retrospective and prospective studies to understand detailed molecular events and their relationship to the fitness of SARS-CoV-2.
Affiliation(s)
- Ovinu Kibria Islam
  - Department of Microbiology, Jashore University of Science and Technology, Jashore, Bangladesh
- Md. Shazid Hasan
  - Department of Microbiology, Jashore University of Science and Technology, Jashore, Bangladesh
- Mir Raihanul Islam
  - Division of Poverty, Health, and Nutrition, International Food Policy Research Institute, Bangladesh
- Shafi Mahmud
  - Department of Genetic Engineering and Biotechnology, University of Rajshahi, Rajshahi, Bangladesh
- Hassan M. Al‐Emran
  - Department of Biomedical Engineering, Jashore University of Science and Technology, Jashore, Bangladesh
- Iqbal Kabir Jahid
  - Department of Microbiology, Jashore University of Science and Technology, Jashore, Bangladesh
- Keith A. Crandall
  - Department of Biostatistics and Bioinformatics, Computational Biology Institute, Milken Institute School of Public Health, The George Washington University, Washington, DC, USA
- M. Anwar Hossain
  - Office of the Vice Chancellor, Jashore University of Science and Technology, Jashore, Bangladesh
  - Department of Microbiology, University of Dhaka, Dhaka, Bangladesh
6
Qin L, Ding X, Li Y, Chen Q, Meng J, Jiang T. Co-mutation modules capture the evolution and transmission patterns of SARS-CoV-2. Brief Bioinform 2021; 22:6297966. [PMID: 34121111 DOI: 10.1093/bib/bbab222] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 02/09/2021] [Revised: 05/10/2021] [Accepted: 05/21/2021] [Indexed: 02/06/2023]
Abstract
The rapid spread and huge impact of the COVID-19 pandemic caused by the emerging SARS-CoV-2 have driven large efforts for sequencing and analyzing the viral genomes. Mutation analyses have revealed that the virus keeps mutating and shows a certain degree of genetic diversity, which could result in the alteration of its infectivity and pathogenicity. Therefore, appropriate delineation of SARS-CoV-2 genetic variants enables us to understand its evolution and transmission patterns. By focusing on the nucleotides that co-substituted, we first identified 42 co-mutation modules that consist of at least two co-substituted nucleotides during the SARS-CoV-2 evolution. Then based on these co-mutation modules, we classified the SARS-CoV-2 population into 43 groups and further identified the phylogenetic relationships among groups based on the number of inconsistent co-mutation modules, which were validated with phylogenetic trees. Intuitively, we tracked tempo-spatial patterns of the 43 groups, of which 11 groups were geographic-specific. Different epidemic periods showed specific co-circulating groups, where the dominant groups existed and had multiple sub-groups of parallel evolution. Our work enables us to capture the evolution and transmission patterns of SARS-CoV-2, which can contribute to guiding the prevention and control of the COVID-19 pandemic. An interactive website for grouping SARS-CoV-2 genomes and visualizing the spatio-temporal distribution of groups is available at https://www.jianglab.tech/cmm-grouping/.
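The two steps named in this abstract, finding modules of co-substituted nucleotides and then grouping genomes by the modules they carry, can be sketched in a few lines of Python. This is a simplified illustration, not the authors' algorithm: it assumes that mutations belong to the same module only when they occur in exactly the same set of genomes, and the function names (`comutation_modules`, `group_genomes`), mutation labels, and genome IDs are hypothetical.

```python
from collections import defaultdict


def comutation_modules(genomes):
    """Group mutations that occur in exactly the same set of genomes.

    `genomes` maps a genome ID to the set of mutations it carries.
    Returns a sorted list of modules (sorted tuples of mutation labels),
    keeping only modules with at least two mutations.
    """
    carriers = defaultdict(set)
    for gid, muts in genomes.items():
        for m in muts:
            carriers[m].add(gid)

    by_carrier_set = defaultdict(list)
    for m, gids in carriers.items():
        by_carrier_set[frozenset(gids)].append(m)

    return sorted(
        tuple(sorted(ms)) for ms in by_carrier_set.values() if len(ms) >= 2
    )


def group_genomes(genomes, modules):
    """Group genomes by signature: the tuple of modules each carries in full."""
    groups = defaultdict(list)
    for gid, muts in genomes.items():
        signature = tuple(m for m in modules if set(m) <= muts)
        groups[signature].append(gid)
    return dict(groups)


# Toy input (hypothetical labels, for illustration only).
genomes = {
    "g1": {"mut_a", "mut_b"},
    "g2": {"mut_a", "mut_b", "mut_c", "mut_d"},
    "g3": {"mut_c", "mut_d"},
    "g4": {"mut_e"},
    "g5": set(),
}

modules = comutation_modules(genomes)
groups = group_genomes(genomes, modules)
print(modules)
print(groups)
```

Requiring identical carrier sets is the strictest possible definition of co-substitution. With real data, sequencing errors and missing coverage would likely call for a tolerance, for example a minimum correlation between carrier sets.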
Affiliation(s)
- Luyao Qin
  - Center for Systems Medicine, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100005; Suzhou Institute of Systems Medicine, Suzhou, Jiangsu 215123, China
- Xiao Ding
  - Center for Systems Medicine, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100005; Suzhou Institute of Systems Medicine, Suzhou, Jiangsu 215123, China
- Yongjie Li
  - Guangxi University, Nanning, Guangxi 530004, China
- Jing Meng
  - Center for Systems Medicine, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100005; Suzhou Institute of Systems Medicine, Suzhou, Jiangsu 215123, China
- Taijiao Jiang
  - Center for Systems Medicine, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing 100005; Suzhou Institute of Systems Medicine, Suzhou, Jiangsu 215123, China; Guangdong Laboratory, 510005 Guangzhou, China
7
Sarkar R, Mitra S, Chandra P, Saha P, Banerjee A, Dutta S, Chawla-Sarkar M. Comprehensive analysis of genomic diversity of SARS-CoV-2 in different geographic regions of India: an endeavour to classify Indian SARS-CoV-2 strains on the basis of co-existing mutations. Arch Virol 2021; 166:801-812. [PMID: 33464421 PMCID: PMC7814186 DOI: 10.1007/s00705-020-04911-0] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Received: 07/22/2020] [Accepted: 10/21/2020] [Indexed: 01/24/2023]
Abstract
Accumulation of mutations within the genome is the primary driving force in viral evolution within an endemic setting. This inherent feature often leads to altered virulence, infectivity and transmissibility, and antigenic shifts to escape host immunity, which might compromise the efficacy of vaccines and antiviral drugs. Therefore, we carried out a genome-wide analysis of circulating SARS-CoV-2 strains to detect the emergence of novel co-existing mutations and trace their geographical distribution within India. Comprehensive analysis of whole genome sequences of 837 Indian SARS-CoV-2 strains revealed the occurrence of 33 different mutations, 18 of which were unique to India. Novel mutations were observed in the S glycoprotein (6/33), NSP3 (5/33), RdRp/NSP12 (4/33), NSP2 (2/33), and N (1/33). Non-synonymous mutations were found to be 3.07 times more prevalent than synonymous mutations. We classified the Indian isolates into 22 groups based on their co-existing mutations. Phylogenetic analysis revealed that the representative strains of each group were divided into various sub-clades within their respective clades, based on the presence of unique co-existing mutations. The A2a clade was found to be dominant in India (71.34%), followed by A3 (23.29%) and B (5.36%), but a heterogeneous distribution was observed among various geographical regions. The A2a clade was highly predominant in East India, Western India, and Central India, whereas the A2a and A3 clades were nearly equal in prevalence in South and North India. This study highlights the divergent evolution of SARS-CoV-2 strains and co-circulation of multiple clades in India. Monitoring of the emerging mutations will pave the way for vaccine formulation and the design of antiviral drugs.
Affiliation(s)
- Rakesh Sarkar
  - Division of Virology, National Institute of Cholera and Enteric Diseases, P-33, C.I.T. Road, Scheme-XM, Beliaghata, Kolkata, West Bengal, 700010, India
- Suvrotoa Mitra
  - Division of Virology, National Institute of Cholera and Enteric Diseases, P-33, C.I.T. Road, Scheme-XM, Beliaghata, Kolkata, West Bengal, 700010, India
- Pritam Chandra
  - Division of Virology, National Institute of Cholera and Enteric Diseases, P-33, C.I.T. Road, Scheme-XM, Beliaghata, Kolkata, West Bengal, 700010, India
- Priyanka Saha
  - Division of Virology, National Institute of Cholera and Enteric Diseases, P-33, C.I.T. Road, Scheme-XM, Beliaghata, Kolkata, West Bengal, 700010, India
- Anindita Banerjee
  - Division of Virology, National Institute of Cholera and Enteric Diseases, P-33, C.I.T. Road, Scheme-XM, Beliaghata, Kolkata, West Bengal, 700010, India
- Shanta Dutta
  - Division of Virology, National Institute of Cholera and Enteric Diseases, P-33, C.I.T. Road, Scheme-XM, Beliaghata, Kolkata, West Bengal, 700010, India
- Mamta Chawla-Sarkar
  - Division of Virology, National Institute of Cholera and Enteric Diseases, P-33, C.I.T. Road, Scheme-XM, Beliaghata, Kolkata, West Bengal, 700010, India
8
Shinde P, Whitwell HJ, Verma RK, Ivanchenko M, Zaikin A, Jalan S. Impact of modular mitochondrial epistatic interactions on the evolution of human subpopulations. Mitochondrion 2021; 58:111-122. [PMID: 33618020 DOI: 10.1016/j.mito.2021.02.004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/19/2019] [Revised: 09/22/2020] [Accepted: 02/03/2021] [Indexed: 12/23/2022]
Abstract
Investigation of human mitochondrial (mt) genome variation has been shown to provide insights into human history and natural selection. By analyzing 24,167 human mt-genome samples, collected from five continents, we have developed a co-mutation network model to investigate characteristic human evolutionary patterns. The analysis highlighted richer co-mutating regions of the mt-genome, suggesting the presence of epistasis. Specifically, a large portion of COX genes was found to co-mutate in Asian and American populations, whereas, in African, European, and Oceanic populations, there was greater co-mutation bias in hypervariable regions. Interestingly, this study demonstrated hierarchical modularity as a crucial agent for these co-mutation networks. More profoundly, our ancestry-based co-mutation module analyses showed that mutations cluster preferentially in known mitochondrial haplogroups. Contemporary human mt-genome nucleotides most closely resembled the ancestral state, and very few of them were found to be ancestral-variants. Overall, these results demonstrated that subpopulation-based biases may favor mitochondrial gene specific epistasis.
Affiliation(s)
- Pramod Shinde
  - Department of Biosciences and Biomedical Engineering, Indian Institute of Technology Indore, Khandwa Road, Simrol, Indore 453552, India
- Harry J Whitwell
  - National Phenome Centre and Imperial Clinical Phenotyping Centre, Department of Metabolism, Digestion and Reproduction, Imperial College London, London, UK; Division of Systems Medicine, Department of Metabolism, Digestion and Reproduction, Imperial College London, London, UK; Centre for Analysis of Complex Systems, Sechenov First Moscow State Medical University, Moscow, Russia
- Rahul Kumar Verma
  - Department of Biosciences and Biomedical Engineering, Indian Institute of Technology Indore, Khandwa Road, Simrol, Indore 453552, India
- Mikhail Ivanchenko
  - Department of Applied Mathematics and Centre of Bioinformatics, Lobachevsky State University of Nizhny Novgorod, Nizhny Novgorod, Russia
- Alexey Zaikin
  - Centre for Analysis of Complex Systems, Sechenov First Moscow State Medical University, Moscow, Russia; Department of Applied Mathematics and Centre of Bioinformatics, Lobachevsky State University of Nizhny Novgorod, Nizhny Novgorod, Russia; Department of Mathematics and Institute for Women's Health, University College London, London WC1E 6BT, UK
- Sarika Jalan
  - Department of Biosciences and Biomedical Engineering, Indian Institute of Technology Indore, Khandwa Road, Simrol, Indore 453552, India; Complex Systems Lab, Department of Physics, Indian Institute of Technology Indore, Khandwa Road, Simrol, Indore 453552, India; Center for Theoretical Physics of Complex Systems, Institute for Basic Science (IBS), Daejeon 34126, Korea
9
Verma RK, Kalyakulina A, Giuliani C, Shinde P, Kachhvah AD, Ivanchenko M, Jalan S. Analysis of human mitochondrial genome co-occurrence networks of Asian population at varying altitudes. Sci Rep 2021; 11:133. [PMID: 33420243 PMCID: PMC7794584 DOI: 10.1038/s41598-020-80271-8] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 02/29/2020] [Accepted: 12/16/2020] [Indexed: 12/13/2022]
Abstract
Networks have been established as an extremely powerful framework to understand and predict the behavior of many large-scale complex systems. We studied network motifs, the basic structural elements of networks, to describe the possible role of co-occurrence of genomic variations behind high altitude adaptation in the Asian human population. Mitochondrial DNA (mtDNA) variations have been acclaimed as one of the key players in understanding the biological mechanisms behind adaptation to extreme conditions. To explore the cumulative effects of variations in the mitochondrial genome with the variation in the altitude, we investigated human mt-DNA sequences from the NCBI database at different altitudes under the co-occurrence motifs framework. Analysis of the co-occurrence motifs using similarity clustering revealed a clear distinction between lower and higher altitude regions. In addition, the previously known high altitude markers 3394 and 7697 (which are definitive sites of haplogroup M9a1a1c1b) were found to co-occur within their own gene complexes indicating the impact of intra-genic constraint on co-evolution of nucleotides. Furthermore, an ancestral 'RSRS50' variant 10,398 was found to co-occur only at higher altitudes supporting the fact that a separate route of colonization at these altitudes might have taken place. Overall, our analysis revealed the presence of co-occurrence interactions specific to high altitude at a whole mitochondrial genome level. This study, combined with the classical haplogroups analysis is useful in understanding the role of co-occurrence of mitochondrial variations in high altitude adaptation.
Affiliation(s)
- Rahul K Verma
  - Department of Biosciences and Biomedical Engineering, Indian Institute of Technology Indore, Khandwa Road, Simrol, Indore, 453552, India
- Alena Kalyakulina
  - Department of Applied Mathematics and Centre of Bioinformatics, Lobachevsky State University of Nizhny Novgorod, Nizhny Novgorod, Russia
- Cristina Giuliani
  - Laboratory of Molecular Anthropology & Center for Genome Biology, Department of Biological, Geological and Environmental Sciences, University of Bologna, Bologna, Italy
- Pramod Shinde
  - Division of Vaccine Discovery, La Jolla Institute for Immunology, La Jolla, CA, 92037, USA
- Ajay Deep Kachhvah
  - Complex Systems Lab, Department of Physics, Indian Institute of Technology Indore, Khandwa Road, Simrol, Indore, 453552, India
- Mikhail Ivanchenko
  - Department of Applied Mathematics and Centre of Bioinformatics, Lobachevsky State University of Nizhny Novgorod, Nizhny Novgorod, Russia; Laboratory of Systems Medicine of Healthy Aging and Department of Applied Mathematics, Lobachevsky University, Nizhny Novgorod, Russia
- Sarika Jalan
  - Department of Biosciences and Biomedical Engineering, Indian Institute of Technology Indore, Khandwa Road, Simrol, Indore, 453552, India; Complex Systems Lab, Department of Physics, Indian Institute of Technology Indore, Khandwa Road, Simrol, Indore, 453552, India; Laboratory of Systems Medicine of Healthy Aging and Department of Applied Mathematics, Lobachevsky University, Nizhny Novgorod, Russia; Center for Theoretical Physics of Complex Systems, Institute for Basic Science (IBS), Daejeon, 34126, Republic of Korea
10
Peng Y, Wu A, Meng J, Yang L, Wang D, Shu Y, Jiang T. Automated recommendation of the seasonal influenza vaccine strain with PREDAC. Biosafety and Health 2020. [DOI: 10.1016/j.bsheal.2020.04.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022]
11
Adly AS, Adly AS, Adly MS. Approaches Based on Artificial Intelligence and the Internet of Intelligent Things to Prevent the Spread of COVID-19: Scoping Review. J Med Internet Res 2020; 22:e19104. [PMID: 32584780 PMCID: PMC7423390 DOI: 10.2196/19104] [Citation(s) in RCA: 62] [Impact Index Per Article: 15.5] [Received: 04/03/2020] [Revised: 06/24/2020] [Accepted: 06/25/2020] [Indexed: 12/12/2022]
Abstract
BACKGROUND Artificial intelligence (AI) and the Internet of Intelligent Things (IIoT) are promising technologies to prevent the concerningly rapid spread of coronavirus disease (COVID-19) and to maximize safety during the pandemic. With the exponential increase in the number of COVID-19 patients, it is highly possible that physicians and health care workers will not be able to treat all cases. Thus, computer scientists can contribute to the fight against COVID-19 by introducing more intelligent solutions to achieve rapid control of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the virus that causes the disease. OBJECTIVE The objectives of this review were to analyze the current literature, discuss the applicability of reported ideas for using AI to prevent and control COVID-19, and build a comprehensive view of how current systems may be useful in particular areas. This may be of great help to many health care administrators, computer scientists, and policy makers worldwide. METHODS We conducted an electronic search of articles in the MEDLINE, Google Scholar, Embase, and Web of Knowledge databases to formulate a comprehensive review that summarizes different categories of the most recently reported AI-based approaches to prevent and control the spread of COVID-19. RESULTS Our search identified the 10 most recent AI approaches that were suggested to provide the best solutions for maximizing safety and preventing the spread of COVID-19. These approaches included detection of suspected cases, large-scale screening, monitoring, interactions with experimental therapies, pneumonia screening, use of the IIoT for data and information gathering and integration, resource allocation, predictions, modeling and simulation, and robotics for medical quarantine. 
CONCLUSIONS We found few or almost no studies regarding the use of AI to examine COVID-19 interactions with experimental therapies, the use of AI for resource allocation to COVID-19 patients, or the use of AI and the IIoT for COVID-19 data and information gathering/integration. Moreover, the adoption of other approaches, including use of AI for COVID-19 prediction, use of AI for COVID-19 modeling and simulation, and use of AI robotics for medical quarantine, should be further emphasized by researchers because these important approaches lack sufficient numbers of studies. Therefore, we recommend that computer scientists focus on these approaches, which are still not being adequately addressed.
Affiliation(s)
- Aya Sedky Adly
  - Faculty of Computers and Artificial Intelligence, Helwan University, Cairo, Egypt
- Afnan Sedky Adly
  - Faculty of Physical Therapy, Cardiovascular-Respiratory Disorders and Geriatrics, Laser Applications in Physical Medicine, Cairo University, Cairo, Egypt
  - Faculty of Physical Therapy, Internal Medicine, Beni-Suef University, Beni-Suef, Egypt
- Mahmoud Sedky Adly
  - Faculty of Oral and Dental Medicine, Cairo University, Cairo, Egypt
  - Royal College of Surgeons of Edinburgh, Scotland, United Kingdom
12
Yin R, Luusua E, Dabrowski J, Zhang Y, Kwoh CK. Tempel: time-series mutation prediction of influenza A viruses via attention-based recurrent neural networks. Bioinformatics 2020; 36:2697-2704. [DOI: 10.1093/bioinformatics/btaa050] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 08/15/2019] [Revised: 01/01/2020] [Accepted: 01/22/2020] [Indexed: 02/06/2023]
Abstract
Motivation
Influenza viruses are persistently threatening public health, causing annual epidemics and sporadic pandemics. The evolution of influenza viruses remains the main obstacle to the effectiveness of antiviral treatments due to rapid mutations. The goal of this work is to predict whether mutations are likely to occur in the next flu season using historical glycoprotein hemagglutinin sequence data. One of the major challenges is to model the temporality and dimensionality of sequential influenza strains and to interpret the prediction results.
Results
In this article, we propose an efficient and robust time-series mutation prediction model (Tempel) for the mutation prediction of influenza A viruses. We first construct the sequential training samples with splittings and embeddings. By employing recurrent neural networks with attention mechanisms, Tempel is capable of considering the historical residue information. Attention mechanisms are being increasingly used to improve the performance of mutation prediction by selectively focusing on the parts of the residues. A framework is established based on Tempel that enables us to predict the mutations at any specific residue site. Experimental results on three influenza datasets show that Tempel can significantly enhance the predictive performance compared with widely used approaches and provide novel insights into the dynamics of viral mutation and evolution.
Availability and implementation
The datasets, source code and supplementary documents are available at: https://drive.google.com/drive/folders/15WULR5__6k47iRotRPl3H7ghi3RpeNXH.
Supplementary information
Supplementary data are available at Bioinformatics online.
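To make the attention idea in the Results paragraph concrete, the following is a minimal sketch of dot-product attention over a sequence of per-season hidden states. It is a generic illustration, not Tempel's implementation: the abstract does not specify the scoring function, and the names `attention_pool`, `hidden_states` and `query`, as well as the numbers, are assumptions chosen for the example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Weight each time step by its dot-product similarity to the query
    and return the weights plus the weighted sum (context vector)."""
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(query)
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return weights, context


# Hypothetical 2-dimensional hidden states for one residue site over three seasons.
hidden_states = [[0.1, 0.0], [0.5, 0.2], [0.9, 0.7]]
query = [1.0, 1.0]  # hypothetical query vector
weights, context = attention_pool(hidden_states, query)
```

In this toy input the most recent season receives the largest weight because its hidden state is most similar to the query; in a trained model the hidden states and query would be learned rather than fixed.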
Affiliation(s)
- Rui Yin
- School of Computer Science and Engineering, Nanyang Technological University, Singapore
- Emil Luusua
- Faculty of Science and Engineering, Linköping University, Linköping, Sweden
- Jan Dabrowski
- School of Computer Science, Swansea University, Swansea, UK
- Yu Zhang
- School of Computer Science and Engineering, Nanyang Technological University, Singapore
- Chee Keong Kwoh
- School of Computer Science and Engineering, Nanyang Technological University, Singapore
|
13
|
Shinde P, Sarkar C, Jalan S. Codon based co-occurrence network motifs in human mitochondria. Sci Rep 2018; 8:3060. [PMID: 29449618 PMCID: PMC5814444 DOI: 10.1038/s41598-018-21454-2] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 05/18/2017] [Accepted: 02/05/2018] [Indexed: 11/09/2022] Open
Abstract
The nucleotide polymorphism in the human mitochondrial genome (mtDNA) tolled by codon position bias plays an indispensable role in human population dispersion and expansion. Herein, genome-wide nucleotide co-occurrence networks were constructed using data from five different geographical regions, with around 3000 samples for each region. We developed a powerful network model to describe complex mitochondrial evolutionary patterns among codon and non-codon positions. We found evidence that the evolution of human mitochondrial DNA is dominated by adaptive forces, particularly mutation and selection, a finding supported by many previous studies. The diversity observed in the mtDNA was compared with mutations, co-occurring mutations, and network motifs, treating codon positions as the causal agent. This comparison showed that long-range nucleotide co-occurrences have a large effect on genomic diversity. Most notably, codon motifs apparently underpinned the preferences among codon positions for co-evolution, which was probably highly biased during the origin of the genetic code. Our analysis also showed that variable nucleotide positions of different human sub-populations implemented the independent mtDNA evolution to its geographical dispensation. Ergo, this study has provided both a network framework and a codon glance to investigate co-occurring genomic variations that are critical in underlying complex mitochondrial evolution.
Affiliation(s)
- Pramod Shinde
- Centre for Biosciences and Biomedical Engineering, Indian Institute of Technology Indore, Simrol, Indore, 453552, India
- Camellia Sarkar
- Centre for Biosciences and Biomedical Engineering, Indian Institute of Technology Indore, Simrol, Indore, 453552, India
- Sarika Jalan
- Centre for Biosciences and Biomedical Engineering, Indian Institute of Technology Indore, Simrol, Indore, 453552, India
- Complex Systems Lab, Discipline of Physics, Indian Institute of Technology Indore, Simrol, Indore, 453552, India
|
14
|
Lawrence P, Danet N, Reynard O, Volchkova V, Volchkov V. Human transmission of Ebola virus. Curr Opin Virol 2016; 22:51-58. [PMID: 28012412 DOI: 10.1016/j.coviro.2016.11.013] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 09/08/2016] [Revised: 11/25/2016] [Accepted: 11/29/2016] [Indexed: 12/11/2022]
Abstract
Ever since the first recognised outbreak of Ebolavirus in 1976, retrospective epidemiological analyses and extensive studies with animal models have given us insight into the nature of the pathology and transmission mechanisms of this virus. In this review focusing on Ebolavirus, we present an outline of our current understanding of filovirus human-to-human transmission and of our knowledge concerning the molecular basis of viral transmission and potential for adaptation, with particular focus on what we have learnt from the 2014 outbreak in West Africa. We identify knowledge gaps relating to transmission and pathogenicity mechanisms, molecular adaptation and filovirus ecology.
Affiliation(s)
- Philip Lawrence
- Molecular Basis of Viral Pathogenicity, International Centre for Research in Infectiology (CIRI), INSERM U1111 - CNRS UMR5308, Université Lyon 1, Ecole Normale Supérieure de Lyon, Lyon 69007, France; Université de Lyon, UMRS 449, Laboratoire de Biologie Générale, Université Catholique de Lyon - EPHE, Lyon 69288, France
- Nicolas Danet
- Molecular Basis of Viral Pathogenicity, International Centre for Research in Infectiology (CIRI), INSERM U1111 - CNRS UMR5308, Université Lyon 1, Ecole Normale Supérieure de Lyon, Lyon 69007, France
- Olivier Reynard
- Molecular Basis of Viral Pathogenicity, International Centre for Research in Infectiology (CIRI), INSERM U1111 - CNRS UMR5308, Université Lyon 1, Ecole Normale Supérieure de Lyon, Lyon 69007, France
- Valentina Volchkova
- Molecular Basis of Viral Pathogenicity, International Centre for Research in Infectiology (CIRI), INSERM U1111 - CNRS UMR5308, Université Lyon 1, Ecole Normale Supérieure de Lyon, Lyon 69007, France
- Viktor Volchkov
- Molecular Basis of Viral Pathogenicity, International Centre for Research in Infectiology (CIRI), INSERM U1111 - CNRS UMR5308, Université Lyon 1, Ecole Normale Supérieure de Lyon, Lyon 69007, France
|
15
|
Chen H, Zhou X, Zheng J, Kwoh CK. Rules of co-occurring mutations characterize the antigenic evolution of human influenza A/H3N2, A/H1N1 and B viruses. BMC Med Genomics 2016; 9:69. [PMID: 28117657 PMCID: PMC5260787 DOI: 10.1186/s12920-016-0230-5] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The human influenza viruses undergo rapid evolution (especially in hemagglutinin (HA), a glycoprotein on the surface of the virus), which enables the virus population to constantly evade the human immune system. Therefore, the vaccine has to be updated every year to stay effective. There is a need to characterize the evolution of influenza viruses for better selection of vaccine candidates and the prediction of pandemic strains. Studies have shown that the influenza hemagglutinin evolution is driven by the simultaneous mutations at antigenic sites. Here, we analyze simultaneous or co-occurring mutations in the HA protein of human influenza A/H3N2, A/H1N1 and B viruses to predict potential mutations, characterizing the antigenic evolution. METHODS We obtain the rules of mutation co-occurrence using association rule mining after extracting HA1 sequences and detect co-mutation sites under strong selective pressure. Then we predict the potential drifts with specific mutations of the viruses based on the rules and compare the results with the "observed" mutations in different years. RESULTS The sites under frequent mutations are in antigenic regions (epitopes) or receptor binding sites. CONCLUSIONS Our study demonstrates the co-occurring site mutations obtained by rule mining can capture the evolution of influenza viruses, and confirms that cooperative interactions among sites of HA1 protein drive the influenza antigenic evolution.
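The rule-mining step described in METHODS can be illustrated with a minimal sketch that scores pairwise rules "site A mutated implies site B mutated" by support and confidence. This is a simplified stand-in for association rule mining, not the authors' pipeline: the function name `pair_rules`, the thresholds and the site numbers are hypothetical, and each record is assumed to be the set of mutated HA1 sites observed in one strain.

```python
from itertools import combinations


def pair_rules(records, min_support=0.5, min_confidence=0.8):
    """Return (lhs, rhs, support, confidence) for every ordered site pair
    whose co-occurrence passes both thresholds."""
    n = len(records)
    sites = sorted(set().union(*records))
    rules = []
    for a, b in combinations(sites, 2):
        both = sum(1 for r in records if a in r and b in r)
        support = both / n  # fraction of strains carrying both mutations
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            lhs_count = sum(1 for r in records if lhs in r)
            confidence = both / lhs_count  # P(rhs mutated | lhs mutated)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return rules


# Hypothetical mutated-site sets for five strains (site numbers are illustrative).
records = [
    {145, 156, 189},
    {145, 156},
    {145, 189},
    {156, 145},
    {193},
]
rules = pair_rules(records)
```

With these toy records only one rule survives: a mutation at site 156 implies a mutation at site 145 (support 0.6, confidence 1.0), while the reverse direction fails the confidence threshold.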
Affiliation(s)
- Haifen Chen
- School of Computer Science and Engineering, Nanyang Technological University, 50 Nanyang Avenue, 639798, Singapore, Singapore
- Xinrui Zhou
- School of Computer Science and Engineering, Nanyang Technological University, 50 Nanyang Avenue, 639798, Singapore, Singapore
- Jie Zheng
- School of Computer Science and Engineering, Nanyang Technological University, 50 Nanyang Avenue, 639798, Singapore, Singapore
- Genome Institute of Singapore, A*STAR, Biopolis, 138672, Singapore, Singapore
- Chee-Keong Kwoh
- School of Computer Science and Engineering, Nanyang Technological University, 50 Nanyang Avenue, 639798, Singapore, Singapore
|
16
|
Predicting the Mutating Distribution at Antigenic Sites of the Influenza Virus. Sci Rep 2016; 6:20239. [PMID: 26837263 PMCID: PMC4738307 DOI: 10.1038/srep20239] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 09/24/2015] [Accepted: 12/29/2015] [Indexed: 11/13/2022] Open
Abstract
Mutations of the influenza virus lead to antigenic changes that cause recurrent epidemics and vaccine resistance. Preventive measures would benefit greatly from the ability to predict the potential distribution of new antigenic sites in future strains. By leveraging the extensive historical records of HA sequences for 90 years, we designed a computational model to simulate the dynamic evolution of antigenic sites in A/H1N1. With templates of antigenic sequences, the model can effectively predict the potential distribution of future antigenic mutants. Validation on 10932 HA sequences from the last 16 years shows that the mutated antigenic sites of over 94% of reported strains fell in our predicted profile. Meanwhile, our model can successfully capture 96% of antigenic sites in those dominant epitopes. Similar results are observed on the complete set of H3N2 historical data, supporting the general applicability of our model to multiple sub-types of influenza. Our results suggest that the mutational profile of future antigenic sites can be predicted based on historical evolutionary traces despite the widespread, random mutations in influenza. Coupled with closely monitored sequence data from influenza surveillance networks, our method can help to forecast changes in viral antigenicity for seasonal flu and inform public health interventions.
|
17
|
Network of co-mutations in Ebola virus genome predicts the disease lethality. Cell Res 2015; 25:753-6. [PMID: 25976404 DOI: 10.1038/cr.2015.54] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Indexed: 11/08/2022] Open
|
18
|
Monitoring infectious diseases in the big data era. Sci Bull (Beijing) 2014; 60:144-145. [PMID: 32215224 PMCID: PMC7089167 DOI: 10.1007/s11434-014-0696-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 11/25/2014] [Accepted: 12/01/2014] [Indexed: 10/28/2022]
|
19
|
Computational prediction of vaccine strains for human influenza A (H3N2) viruses. J Virol 2014; 88:12123-32. [PMID: 25122778 DOI: 10.1128/jvi.01861-14] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Indexed: 12/22/2022] Open
Abstract
Human influenza A viruses are rapidly evolving pathogens that cause substantial morbidity and mortality in seasonal epidemics around the globe. To ensure continued protection, the strains used for the production of the seasonal influenza vaccine have to be regularly updated, which involves data collection and analysis by numerous experts worldwide. Computer-guided analysis is becoming increasingly important in this problem due to the vast amounts of generated data. We here describe a computational method for selecting a suitable strain for production of the human influenza A virus vaccine. It interprets available antigenic and genomic sequence data based on measures of antigenic novelty and rate of propagation of the viral strains throughout the population. For viral isolates sampled between 2002 and 2007, we used this method to predict the antigenic evolution of the H3N2 viruses in retrospective testing scenarios. When seasons were scored as true or false predictions, our method returned six true positives, three false negatives, eight true negatives, and one false positive, or 78% accuracy overall. In comparison to the recommendations by the WHO, we identified the correct antigenic variant once at the same time and twice one season ahead. Even though it cannot be ruled out that practical reasons such as lack of a sufficiently well-growing candidate strain may in some cases have prevented recommendation of the best-matching strain by the WHO, our computational decision procedure allows quantitative interpretation of the growing amounts of data and may help to match the vaccine better to predominating strains in seasonal influenza epidemics. Importance: Human influenza A viruses continuously change antigenically to circumvent the immune protection evoked by vaccination or previously circulating viral strains. To maintain vaccine protection and thereby reduce the mortality and morbidity caused by infections, regular updates of the vaccine strains are required. 
We have developed a data-driven framework for vaccine strain prediction which facilitates the computational analysis of genetic and antigenic data and does not rely on explicit evolutionary models. Our computational decision procedure generated good matches of the vaccine strain to the circulating predominant strain for most seasons and could be used to support the expert-guided prediction made by the WHO; it thus may allow an increase in vaccine efficacy.
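The 78% overall accuracy quoted in the abstract follows directly from the four reported counts. The short check below reproduces it; sensitivity and specificity are derived here from the same counts for context and are not figures stated in the abstract.

```python
# Counts reported in the abstract for retrospective season-level predictions.
tp, fn, tn, fp = 6, 3, 8, 1

total = tp + fn + tn + fp          # 18 scored seasons
accuracy = (tp + tn) / total       # fraction of correct calls
sensitivity = tp / (tp + fn)       # derived, not stated in the abstract
specificity = tn / (tn + fp)       # derived, not stated in the abstract

print(f"{accuracy:.1%}")  # prints 77.8%
```

The value 14/18 rounds to the 78% given in the text.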
|
20
|
|
21
|
Steinbrück L, McHardy AC. Inference of genotype-phenotype relationships in the antigenic evolution of human influenza A (H3N2) viruses. PLoS Comput Biol 2012; 8:e1002492. [PMID: 22532796 PMCID: PMC3330098 DOI: 10.1371/journal.pcbi.1002492] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Received: 11/10/2011] [Accepted: 03/09/2012] [Indexed: 01/05/2023] Open
Abstract
Distinguishing mutations that determine an organism's phenotype from (near-) neutral ‘hitchhikers’ is a fundamental challenge in genome research, and is relevant for numerous medical and biotechnological applications. For human influenza viruses, recognizing changes in the antigenic phenotype and a strain's capability to evade pre-existing host immunity is important for the production of efficient vaccines. We have developed a method for inferring ‘antigenic trees’ for the major viral surface protein hemagglutinin. In the antigenic tree, antigenic weights are assigned to all tree branches, which allows us to resolve the antigenic impact of the associated amino acid changes. Our technique predicted antigenic distances with comparable accuracy to antigenic cartography. Additionally, it identified both known and novel sites, and amino acid changes with antigenic impact in the evolution of influenza A (H3N2) viruses from 1968 to 2003. The technique can also be applied for inference of ‘phenotype trees’ and genotype–phenotype relationships from other types of pairwise phenotype distances.
The molecular evolution of any organism is described by changes in the genotype resulting from genetic drift or selection to maintain or establish fitness under the given environmental conditions. Identification of phenotype-defining changes and their distinction from (near-) neutral (‘hitchhiker’) ones is a fundamental challenge in genome research. The standard approach involves time- and cost-intensive mutation experiments, which are typically low throughput, due to their experimental nature. We have developed a computational method for the inference of phenotypic impact of genotypic changes that is applicable to any system, within or across species, where homologous genetic sequences and associated pairwise phenotype distances are available. We demonstrate the accuracy of our method by application to the human influenza A (H3N2) virus. This exemplary system is of particular interest, as recognizing changes in the antigenic phenotype and a viral strain's capability to evade pre-existing host immunity is important for the production of efficient vaccines. We accurately identified known sites and amino acid changes with antigenic impact over 35 years of evolution, and provide further details on individual antigenically relevant changes in the evolution of influenza A (H3N2) viruses.
Affiliation(s)
- Lars Steinbrück
- Department for Algorithmic Bioinformatics, Heinrich Heine University, Düsseldorf, Germany
- Max-Planck Research Group for Computational Genomics and Epidemiology, Max-Planck Institute for Informatics, Saarbrücken, Germany
- Alice Carolyn McHardy
- Department for Algorithmic Bioinformatics, Heinrich Heine University, Düsseldorf, Germany
- Max-Planck Research Group for Computational Genomics and Epidemiology, Max-Planck Institute for Informatics, Saarbrücken, Germany
|
22
|
Xia Z, Huynh T, Kang SG, Zhou R. Free-energy simulations reveal that both hydrophobic and polar interactions are important for influenza hemagglutinin antibody binding. Biophys J 2012; 102:1453-61. [PMID: 22455929 PMCID: PMC3309282 DOI: 10.1016/j.bpj.2012.01.043] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Received: 09/29/2011] [Revised: 01/25/2012] [Accepted: 01/27/2012] [Indexed: 11/24/2022] Open
Abstract
Antibodies binding to conserved epitopes can provide a broad range of neutralization to existing influenza subtypes and may also prevent the propagation of potential pandemic viruses by fighting against emerging strains. Here we propose a computational framework to study structural binding patterns and detailed molecular mechanisms of viral surface glycoprotein hemagglutinin (HA) binding with a broad spectrum of neutralizing monoclonal antibody fragments (Fab). We used rigorous free-energy perturbation (FEP) methods to calculate the antigen-antibody binding affinities, with an aggregate underlying molecular-dynamics simulation time of several microseconds (∼2 μs) using all-atom, explicit-solvent models. We achieved a high accuracy in the validation of our FEP protocol against a series of known binding affinities for this complex system, with <0.5 kcal/mol errors on average. We then introduced what to our knowledge are novel mutations into the interfacial region to further study the binding mechanism. We found that the stacking interaction between Trp-21 in HA2 and Phe-55 in the CDR-H2 of Fab is crucial to the antibody-antigen association. A single mutation of either W21A or F55A can cause a binding affinity decrease of ΔΔG > 4.0 kcal/mol (equivalent to an ∼1000-fold increase in the dissociation constant K_d). Moreover, for group 1 HA subtypes (which include both the H1N1 swine flu and the H5N1 bird flu), the relative binding affinities change only slightly (< ±1 kcal/mol) when nonpolar residues at the αA helix of HA mutate to conservative amino acids of similar size, which explains the broad neutralization capability of antibodies such as F10 and CR6261. Finally, we found that the hydrogen-bonding network between His-38 (in HA1) and Ser-30/Gln-64 (in Fab) is important for preserving the strong binding of Fab against group 1 HAs, whereas the lack of such hydrogen bonds with Asn-38 in most group 2 HAs may be responsible for the escape of antibody neutralization.
These large-scale simulations may provide new insight into the antigen-antibody binding mechanism at the atomic level, which could be essential for designing more-effective vaccines for influenza.
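The link between a binding free-energy change and the dissociation constant quoted in the abstract can be checked with the standard thermodynamic relation fold change = exp(ΔΔG / RT). The sketch below assumes a temperature of 298.15 K, which the abstract does not state; the function name `kd_fold_change` is illustrative.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def kd_fold_change(ddg_kcal_per_mol, temperature_k=298.15):
    """Fold increase in the dissociation constant for a given loss of
    binding free energy, using K_d(mutant) / K_d(wild type) = exp(ddG / RT).
    The default temperature is an assumption, not taken from the abstract."""
    return math.exp(ddg_kcal_per_mol / (R_KCAL * temperature_k))


fold = kd_fold_change(4.0)  # roughly 8.5e2 at 298.15 K
```

At this temperature a ΔΔG of 4.0 kcal/mol corresponds to a fold change of several hundred, so values somewhat above 4.0 kcal/mol are consistent with the order-of-magnitude figure of about 1000 given in the text.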
Affiliation(s)
- Zhen Xia
- Computational Biology Center, IBM Thomas J. Watson Research Center, Yorktown Heights, New York
- Department of Biomedical Engineering, The University of Texas at Austin, Austin, Texas
- Tien Huynh
- Computational Biology Center, IBM Thomas J. Watson Research Center, Yorktown Heights, New York
- Seung-gu Kang
- Computational Biology Center, IBM Thomas J. Watson Research Center, Yorktown Heights, New York
- Ruhong Zhou
- Computational Biology Center, IBM Thomas J. Watson Research Center, Yorktown Heights, New York
- Department of Chemistry, Columbia University, New York, New York
|
23
|
Du X, Dong L, Lan Y, Peng Y, Wu A, Zhang Y, Huang W, Wang D, Wang M, Guo Y, Shu Y, Jiang T. Mapping of H3N2 influenza antigenic evolution in China reveals a strategy for vaccine strain recommendation. Nat Commun 2012; 3:709. [PMID: 22426230 DOI: 10.1038/ncomms1710] [Citation(s) in RCA: 76] [Impact Index Per Article: 6.3] [Received: 09/19/2011] [Accepted: 01/26/2012] [Indexed: 12/23/2022] Open
Abstract
One of the primary efforts in influenza vaccine strain recommendation is to monitor through gene sequencing the viral surface protein haemagglutinin (HA) variants that lead to viral antigenic changes. Here we have developed a computational method, denoted as PREDAC, to predict antigenic clusters of influenza A (H3N2) viruses with high accuracy from viral HA sequences. Application of PREDAC to large-scale HA sequence data of H3N2 viruses isolated from diverse regions of Mainland China identified 17 antigenic clusters that have dominated for at least one season between 1968 and 2010. By tracking the dynamics of the dominant antigenic clusters, we not only find that dominant antigenic clusters change more frequently in China than in the United States/Europe, but also characterize the antigenic patterns of seasonal H3N2 viruses within China. Furthermore, we demonstrate that the coupling of large-scale HA sequencing with PREDAC can significantly improve vaccine strain recommendation for China.
Affiliation(s)
- Xiangjun Du
- Key Laboratory of Protein and Peptide Pharmaceuticals, Institute of Biophysics, Chinese Academy of Sciences, Beijing, China
|
24
|
ElHefnawi M, Alaidi O, Mohamed N, Kamar M, El-Azab I, Zada S, Siam R. Identification of novel conserved functional motifs across most Influenza A viral strains. Virol J 2011; 8:44. [PMID: 21272360 PMCID: PMC3036627 DOI: 10.1186/1743-422x-8-44] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.4] [Received: 11/06/2010] [Accepted: 01/27/2011] [Indexed: 01/15/2023] Open
Abstract
Background Influenza A virus poses a continuous threat to global public health. Design of novel universal drugs and vaccines requires a careful analysis of different strains of Influenza A viral genome from diverse hosts and subtypes. We performed a systematic in silico analysis of Influenza A viral segments of all available Influenza A viral strains and subtypes and grouped them based on host, subtype, and years isolated, and through multiple sequence alignments we extrapolated conserved regions, motifs, and accessible regions for functional mapping and annotation. Results Across all species and strains 87 highly conserved regions (conservation percentage ≥ 90%) and 19 functional motifs (conservation percentage = 100%) were found in PB2, PB1, PA, NP, M, and NS segments. The conservation percentage of these segments ranged between 94-98% in human strains (the most conserved), 85-93% in swine strains (the most variable), and 91-94% in avian strains. The most conserved segment was different in each host (PB1 for human strains, NS for avian strains, and M for swine strains). Target accessibility prediction yielded 324 accessible regions, with a single-stranded probability > 0.5, of which 78 coincided with conserved regions. Some of the interesting annotations in these regions included sites for protein-protein interactions, the RNA binding groove, and the proton ion channel. Conclusions The influenza virus has evolved to adapt to its host through variations in the GC content and conservation percentage of the conserved regions. Nineteen universal conserved functional motifs were discovered, of which some were accessible regions with interesting biological functions. These regions will serve as a foundation for universal drug targets as well as universal vaccine design.
Affiliation(s)
- Mahmoud ElHefnawi
- Informatics and Systems Department and Biomedical Informatics and chemo informatics group, Division of Engineering Research and Centre of Excellence for Advanced Sciences, National Research Centre, Tahrir Street, 12311 Cairo, Egypt.
|
25
|
Steinbrück L, McHardy AC. Allele dynamics plots for the study of evolutionary dynamics in viral populations. Nucleic Acids Res 2010; 39:e4. [PMID: 20959296 PMCID: PMC3017622 DOI: 10.1093/nar/gkq909] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.1] [Indexed: 12/21/2022] Open
Abstract
Phylodynamic techniques combine epidemiological and genetic information to analyze the evolutionary and spatiotemporal dynamics of rapidly evolving pathogens, such as influenza A or human immunodeficiency viruses. We introduce ‘allele dynamics plots’ (AD plots) as a method for visualizing the evolutionary dynamics of a gene in a population. Using AD plots, we propose how to identify the alleles that are likely to be subject to directional selection. We analyze the method’s merits with a detailed study of the evolutionary dynamics of seasonal influenza A viruses. AD plots for the major surface protein of seasonal influenza A (H3N2) and the 2009 swine-origin influenza A (H1N1) viruses show the succession of substitutions that became fixed in the evolution of the two viral populations. They also allow the early identification of those viral strains that later rise to predominance, which is important for the problem of vaccine strain selection. In summary, we describe a technique that reveals the evolutionary dynamics of a rapidly evolving population and allows us to identify alleles and associated genetic changes that might be under directional selection. The method can be applied for the study of influenza A viruses and other rapidly evolving species or viruses.
Affiliation(s)
- Lars Steinbrück
- Max-Planck Research Group for Computational Genomics and Epidemiology, Max-Planck Institute for Informatics, Saarbrücken, Germany
|
26
|
Influenza A gradual and epochal evolution: insights from simple models. PLoS One 2009; 4:e7426. [PMID: 19841740 PMCID: PMC2759541 DOI: 10.1371/journal.pone.0007426] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.2] [Received: 08/18/2009] [Accepted: 09/09/2009] [Indexed: 11/20/2022] Open
Abstract
The recurrence of influenza A epidemics was originally explained by a “continuous antigenic drift” scenario. Recently, it has been shown that even if genetic drift is gradual, the evolution of the main influenza A antigen, the haemagglutinin, is punctuated. As a consequence, it has been suggested that influenza A dynamics at the population level should be approximated by a serial model. Here, simple models are used to test whether a serial model requires gradual antigenic drift within groups of strains with the same antigenic properties (antigenic clusters). We compare the effect of status-based and history-based frameworks and the influence of reduced susceptibility and infectivity assumptions on the transient dynamics of antigenic clusters. Our results reveal that the replacement of a resident antigenic cluster by a mutant cluster, as observed in data, is reproduced only by the status-based model integrating the reduced infectivity assumption. This combination of assumptions is useful to overcome the otherwise extremely high dimensionality of models incorporating many strains, but relies on a biological hypothesis not obviously satisfied. Finally, our findings suggest the dynamical importance of gradual antigenic drift even in the presence of punctuated immune escape. A more regular renewal of the susceptible pool than the one implemented in a serial model should be part of a minimal theory for influenza at the population level.
|
27
|
Xia Z, Jin G, Zhu J, Zhou R. Using a mutual information-based site transition network to map the genetic evolution of influenza A/H3N2 virus. Bioinformatics 2009; 25:2309-17. [PMID: 19706746 DOI: 10.1093/bioinformatics/btp423] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.7] [Indexed: 11/14/2022]
Abstract
MOTIVATION Mapping the antigenic and genetic evolution pathways of influenza A is of critical importance in the vaccine development and drug design of influenza virus. In this article, we have analyzed more than 4000 A/H3N2 hemagglutinin (HA) sequences from 1968 to 2008 to model the evolutionary path of the influenza virus, which allows us to predict its future potential drifts with specific mutations. RESULTS The mutual information (MI) method was used to design a site transition network (STN) for each amino acid site in the A/H3N2 HA sequence. The STN network indicates that most of the dynamic interactions are positioned around the epitopes and the receptor binding domain regions, with strong preferences in both the mutation sites and amino acid types being mutated to. The network also shows that antigenic changes accumulate over time, with occasional large changes due to multiple co-occurring mutations at antigenic sites. Furthermore, the cluster analysis by subdividing the STN into several subnetworks reveals a more detailed view about the features of the antigenic change: the characteristic inner sites and the connecting inter-subnetwork sites are both responsible for the drifts. A novel five-step prediction algorithm based on the STN shows a reasonable accuracy in reproducing historical HA mutations. For example, our method can reproduce the 2003-2004 A/H3N2 mutations with approximately 70% accuracy. The method also predicts seven possible mutations for the next antigenic drift in the coming 2009-2010 season. The STN approach also agrees well with the phylogenetic tree and antigenic maps based on HA inhibition assays. AVAILABILITY All code and data are available at http://ibi.zju.edu.cn/birdflu/.
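The mutual information score underlying the site transition network can be sketched for a pair of alignment columns as follows. This only illustrates the MI statistic itself; how the authors turn MI values into network edges or predictions is not described in the abstract, and the toy columns and the name `mutual_information` are assumptions.

```python
import math
from collections import Counter


def mutual_information(col_x, col_y):
    """Mutual information (in bits) between two equally long alignment
    columns, estimated from observed residue frequencies."""
    n = len(col_x)
    px = Counter(col_x)
    py = Counter(col_y)
    pxy = Counter(zip(col_x, col_y))
    mi = 0.0
    for (x, y), count in pxy.items():
        p_joint = count / n
        mi += p_joint * math.log2(p_joint / ((px[x] / n) * (py[y] / n)))
    return mi


# Hypothetical residues at three sites across eight sequences.
site_a = list("KKKKNNNN")
site_b = list("SSSSPPPP")  # changes together with site_a
site_c = list("TATATATA")  # varies independently of site_a

mi_ab = mutual_information(site_a, site_b)
mi_ac = mutual_information(site_a, site_c)
```

Sites that always change together reach the maximum score for two equally frequent residues (1 bit here), whereas independently varying sites score 0, which is why high-MI pairs are natural candidates for co-occurring mutations.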
Affiliation(s)
- Zhen Xia
- Institute of Bioinformatics, Zhejiang University, Hangzhou, PR China
|
28
|
Huang JW, King CC, Yang JM. Co-evolution positions and rules for antigenic variants of human influenza A/H3N2 viruses. BMC Bioinformatics 2009; 10 Suppl 1:S41. [PMID: 19208143 PMCID: PMC2648776 DOI: 10.1186/1471-2105-10-s1-s41] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.1] [Indexed: 11/10/2022] Open
Abstract
Background: In pandemic and epidemic forms, avian and human influenza viruses often cause significant damage to human society and the economy. Gradually accumulated mutations on hemagglutinin (HA) produce immunologically distinct circulating strains (named antigenic variants), which lead to antigenic drift. Antigenic variants often require a new vaccine to be formulated before each annual epidemic. Mapping the genetic evolution to the antigenic drift of influenza viruses is an emergent issue for public health and vaccine development. Results: We developed a method for identifying antigenic critical amino acid positions, rules, and co-mutated positions for antigenic variants. The information gain (IG) and the entropy are used to score an amino acid position on HA for discriminating between antigenic variants and similar viruses. A position with high IG and entropy is highly correlated with antigenic drift. Nineteen positions with high IG and high genetic diversity are identified as antigenic critical positions on the HA proteins. Most of these antigenic critical positions are located on the five epitopes or on the surface of the HA structure. Based on the IG values and entropies of these 19 positions, a decision tree was applied to create a rule-based model and to identify rules for predicting whether a given pair of HA sequences, typically a vaccine strain and a circulating strain, are antigenic variants. The prediction accuracies of this model on a training set (181 hemagglutination inhibition (HI) assays) and an independent test set (31,878 HI assays) are 91.2% and 96.2%, respectively. Conclusion: Our method is able to identify critical positions, rules, and co-mutated positions on HA for predicting antigenic variants. The information gains and the entropies of HA positions provide insight into the antigenic drift and co-evolution positions for influenza seasons. We believe that our method is robust and is potentially useful for studying influenza virus evolution and vaccine development.
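The information gain (IG) and entropy scores mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the encoding (one boolean per sequence pair saying whether the position differs, one label per pair) and all names and toy values are assumptions made for this example.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def information_gain(feature, labels):
    """Reduction in label entropy after splitting the pairs on `feature`."""
    n = len(labels)
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)
    # Weighted entropy of the labels within each feature value
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical toy data: four vaccine/circulating strain pairs.
labels = ["variant", "variant", "similar", "similar"]
position_x_differs = [True, True, False, False]   # tracks the label exactly
position_y_differs = [True, False, True, False]   # unrelated to the label

ig_x = information_gain(position_x_differs, labels)
ig_y = information_gain(position_y_differs, labels)
```

In this toy case position x fully separates variants from similar pairs and position y carries no information, so position x would rank higher under an IG criterion.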
Affiliation(s)
- Jhang-Wei Huang
- Institute of Bioinformatics, National Chiao Tung University, Hsinchu, 30050, Taiwan.
|
29
|
Han JDJ, Liu Y, Xue H, Xia K, Yu H, Zhu S, Chen Z, Zhang W, Huang Z, Jin C, Xian B, Li J, Hou L, Han Y, Niu C, Alcon TC. Developmental systems biology flourishing on new technologies. J Genet Genomics 2008; 35:577-84. [PMID: 18937914 DOI: 10.1016/s1673-8527(08)60078-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2008] [Revised: 09/04/2008] [Accepted: 09/05/2008] [Indexed: 11/24/2022]
Abstract
Organism development is a systems-level process. It has benefited greatly from the recent technological advances in the field of systems biology. DNA microarray, phenome, interactome and transcriptome mapping, the new generation of deep sequencing technologies, and faster and better computational and modeling approaches have opened new frontiers for both systems biologists and developmental biologists to reexamine old developmental biology questions, such as pattern formation, and to tackle new problems, such as stem cell reprogramming. As showcased in the International Developmental Systems Biology Symposium organized by the Chinese Academy of Sciences, developmental systems biology is flourishing from many perspectives, from the evolution of developmental systems, to the underlying genetic and molecular pathways and networks, to the genomic, epigenomic and noncoding levels, to computational analysis and modeling. We believe that the field will continue to reap rewards into the future with these new approaches.
Affiliation(s)
- Jing-Dong J Han
- Chinese Academy of Sciences Key Laboratory of Molecular Developmental Biology, Center for Molecular Systems Biology, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China.
|
30
|
Correlating novel variable and conserved motifs in the Hemagglutinin protein with significant biological functions. Virol J 2008; 5:91. [PMID: 18681973 PMCID: PMC2553082 DOI: 10.1186/1743-422x-5-91] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 06/29/2008] [Accepted: 08/05/2008] [Indexed: 12/03/2022] Open
Abstract
Background: Variations in the influenza Hemagglutinin protein contribute to antigenic drift, resulting in decreased efficiency of seasonal influenza vaccines and escape from the host immune response. We performed an in silico study to determine characteristics of novel variable and conserved motifs in the Hemagglutinin protein from previously reported H3N2 strains isolated in Hong Kong from 1968 to 1999, in order to predict viral motifs involved in significant biological functions. Results: 14 MEME blocks were generated, and comparative analysis of the MEME blocks identified blocks 1, 2, 3 and 7 as correlating with several biological functions. Analysis of the different Hemagglutinin sequences showed that block 7 alone has the highest frequency of amino acid substitution and the highest number of co-mutating pairs. MEME 2 showed intermediate variability and MEME 1 was the most conserved. Interestingly, MEME blocks 2 and 7 had the highest incidence of potential post-translational modification sites, including phosphorylation sites, ASN glycosylation motifs and N-myristylation sites. Similarly, these 2 blocks overlap with previously identified antigenic sites and receptor binding sites. Conclusion: Our study identifies motifs in the Hemagglutinin protein with different amino acid substitution frequencies over a 31-year period, and derives relevant functional characteristics by correlating these motifs with potential post-translational modification sites and with antigenic and receptor binding sites.
|