1
Zhang S, Li YD, Cai YR, Kang XP, Feng Y, Li YC, Chen YH, Li J, Bao LL, Jiang T. Compositional features analysis by machine learning in genome represents linear adaptation of monkeypox virus. Front Genet 2024; 15:1361952. [PMID: 38495668 PMCID: PMC10940399 DOI: 10.3389/fgene.2024.1361952] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2023] [Accepted: 02/21/2024] [Indexed: 03/19/2024]
Abstract
Introduction: The global headlines have been dominated by the sudden and widespread outbreak of monkeypox, a rare and endemic zoonotic disease caused by the monkeypox virus (MPXV). Genomic composition-based machine learning (ML) methods have recently shown promise in identifying the host adaptability and evolutionary patterns of viruses. Our study aimed to analyze the genomic characteristics and evolutionary patterns of MPXV using ML methods. Methods: The open reading frame (ORF) regions of full-length MPXV genomes were filtered, and 165 ORFs were selected as the clusters with the highest homology. The unsupervised machine learning methods t-distributed stochastic neighbor embedding (t-SNE), principal component analysis (PCA), and hierarchical clustering were applied to examine the dinucleotide composition representation (DCR) characteristics of the selected ORF clusters. Results: Post-2022 MPXV sequences showed an obvious linear adaptive evolution, indicating that the virus has become more adapted to the human host after accumulating mutations. For a more accurate analysis, the ORF regions with larger variations were filtered out based on the ranking of homology difference to narrow down the key ORF clusters, which led to the same conclusion of linear adaptability. The key differential protein structures were then predicted with AlphaFold 2, which suggested that differences in the main domains might be one of the underlying reasons for the linear adaptive evolution. Discussion: Understanding the process of linear adaptation is critical in the constant evolutionary struggle between viruses and their hosts and plays a significant role in crafting effective measures to tackle viral diseases. The present study therefore provides valuable insights into the evolutionary patterns of MPXV in 2022 from the perspective of genomic composition analysis through ML methods.
Affiliation(s)
- Sen Zhang
  - State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Academy of Military Medical Sciences, Beijing, China
- Ya-Dan Li
  - College of Basic Medical Sciences, Anhui Medical University, Hefei, China
- Yu-Rong Cai
  - State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Academy of Military Medical Sciences, Beijing, China
  - College of the First Clinical Medical, Inner Mongolia Medical University, Hohhot, China
- Xiao-Ping Kang
  - State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Academy of Military Medical Sciences, Beijing, China
- Ye Feng
  - State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Academy of Military Medical Sciences, Beijing, China
- Yu-Chang Li
  - State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Academy of Military Medical Sciences, Beijing, China
- Yue-Hong Chen
  - State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Academy of Military Medical Sciences, Beijing, China
- Jing Li
  - State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, Academy of Military Medical Sciences, Beijing, China
  - College of Basic Medical Sciences, Anhui Medical University, Hefei, China
- Li-Li Bao
  - College of Basic Medical Sciences, Inner Mongolia Medical University, Hohhot, China
- Tao Jiang
  - College of Basic Medical Sciences, Anhui Medical University, Hefei, China
2
Jiang S, Zhang S, Kang X, Feng Y, Li Y, Nie M, Li Y, Chen Y, Zhao S, Jiang T, Li J. Risk Assessment of the Possible Intermediate Host Role of Pigs for Coronaviruses with a Deep Learning Predictor. Viruses 2023; 15:1556. [PMID: 37515242 PMCID: PMC10384923 DOI: 10.3390/v15071556] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/28/2023] [Revised: 07/13/2023] [Accepted: 07/13/2023] [Indexed: 07/30/2023]
Abstract
Swine coronaviruses (CoVs) have been found to cause infection in humans, suggesting that Suiformes might be intermediate hosts in CoV transmission from natural hosts to humans. The present study aimed to establish convolutional neural network (CNN) models to predict the host adaptation of swine CoVs. Each ORF1ab and Spike sequence was decomposed using dinucleotide composition representation (DCR) and other traits. The relationship between CoVs from different adaptive hosts was analyzed by unsupervised learning, and CNN models based on the DCR of ORF1ab and Spike were built to predict the host adaptation of swine CoVs. The rationality of the models was verified with phylogenetic analysis. Unsupervised learning showed that different swine CoVs are adapted to multiple hosts. According to the adaptation predictions of the CNN models, swine acute diarrhea syndrome CoV (SADS-CoV) and porcine epidemic diarrhea virus (PEDV) are adapted to Chiroptera, swine transmissible gastroenteritis virus (TGEV) is adapted to Carnivora, porcine hemagglutinating encephalomyelitis virus (PHEV) might be adapted to Primates, Rodentia, and Lagomorpha, and porcine deltacoronavirus (PDCoV) might be adapted to Chiroptera, Artiodactyla, and Carnivora. In summary, the DCR trait was confirmed to be representative of the CoV genome, and the DCR-based deep learning model works well for assessing the adaptation of swine CoVs to other mammals. Suiformes might be intermediate hosts for human CoVs and other mammalian CoVs. The present study provides a novel approach to assessing the risk that swine CoVs adapt and transmit to humans and other mammals.
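To make the DCR idea in this abstract concrete, the following is a minimal Python sketch. It assumes, purely for illustration, that a dinucleotide composition feature vector can be approximated by the normalized frequencies of the 16 overlapping dinucleotides in a sequence; the exact DCR definition used by the authors is not given in the abstract and may differ. The names `DINUCLEOTIDES` and `dinucleotide_frequencies` are hypothetical.

```python
from collections import Counter
from itertools import product

# All 16 dinucleotides in a fixed order, so every sequence maps to the
# same feature layout (AA, AC, AG, AT, CA, ...).
DINUCLEOTIDES = ["".join(pair) for pair in product("ACGT", repeat=2)]
_VALID = set(DINUCLEOTIDES)


def dinucleotide_frequencies(sequence):
    """Return the 16 overlapping-dinucleotide frequencies of a nucleotide sequence.

    Assumption for illustration only: this is a simplified stand-in for a
    dinucleotide composition feature, not the authors' exact DCR.
    """
    seq = sequence.upper()
    # Count overlapping pairs; pairs with ambiguous bases (e.g. N) are skipped.
    counts = Counter(
        seq[i:i + 2] for i in range(len(seq) - 1) if seq[i:i + 2] in _VALID
    )
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(DINUCLEOTIDES)
    return [counts[d] / total for d in DINUCLEOTIDES]


if __name__ == "__main__":
    features = dinucleotide_frequencies("ATGGCGTACGTTAGC")
    for name, value in zip(DINUCLEOTIDES, features):
        print(name, round(value, 3))
```

Vectors of this kind, one per ORF1ab or Spike sequence, could then be passed to an unsupervised method or a classifier; that downstream step is not sketched here.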
Affiliation(s)
- Shuyang Jiang
  - College of Mathematics, Jilin University, Changchun, Jilin 130012, China
- Sen Zhang
  - State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, AMMS, Beijing 100071, China
- Xiaoping Kang
  - State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, AMMS, Beijing 100071, China
- Ye Feng
  - State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, AMMS, Beijing 100071, China
- Yadan Li
  - State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, AMMS, Beijing 100071, China
- Maoshun Nie
  - State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, AMMS, Beijing 100071, China
- Yuchang Li
  - State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, AMMS, Beijing 100071, China
- Yuehong Chen
  - State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, AMMS, Beijing 100071, China
- Shishun Zhao
  - College of Mathematics, Jilin University, Changchun, Jilin 130012, China
- Tao Jiang
  - State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, AMMS, Beijing 100071, China
- Jing Li
  - State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, AMMS, Beijing 100071, China
3
Li J, Tian F, Zhang S, Liu SS, Kang XP, Li YD, Wei JQ, Lin W, Lei Z, Feng Y, Jiang JF, Jiang T, Tong Y. Genomic representation predicts an asymptotic host adaptation of bat coronaviruses using deep learning. Front Microbiol 2023; 14:1157608. [PMID: 37213516 PMCID: PMC10198438 DOI: 10.3389/fmicb.2023.1157608] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2023] [Accepted: 04/03/2023] [Indexed: 05/23/2023]
Abstract
Introduction: Coronaviruses (CoVs) are naturally found in bats and can occasionally cause infection and transmission in humans and other mammals. Our study aimed to build a deep learning (DL) method to predict the adaptation of bat CoVs to other mammals. Methods: The CoV genome was represented by dinucleotide composition representation (DCR) for the two main viral genes, ORF1ab and Spike. DCR features were first analyzed for their distribution among adaptive hosts and then used to train a convolutional neural network (CNN) classifier to predict the adaptation of bat CoVs. Results and discussion: The results demonstrated inter-host separation and intra-host clustering of DCR-represented CoVs for six host types: Artiodactyla, Carnivora, Chiroptera, Primates, Rodentia/Lagomorpha, and Suiformes. The DCR-based CNN with five host labels (without Chiroptera) predicted a dominant adaptation of bat CoVs to Artiodactyla hosts, then to Carnivora and Rodentia/Lagomorpha mammals, and later to Primates. Moreover, a linear asymptotic adaptation of all CoVs (except those of Suiformes) from Artiodactyla to Carnivora and Rodentia/Lagomorpha and then to Primates indicates an asymptotic adaptation from bats to other mammals to humans. Conclusion: Genomic dinucleotides represented as DCR indicate a host-specific separation, and clustering predicts a linear asymptotic adaptation shift of bat CoVs from other mammals to humans via deep learning.
Affiliation(s)
- Jing Li
  - State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, AMMS, Beijing, China
- Fengjuan Tian
  - Beijing Advanced Innovation Center for Soft Matter Science and Engineering (BAIC-SM), College of Life Science and Technology, Beijing University of Chemical Technology, Beijing, China
- Sen Zhang
  - State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, AMMS, Beijing, China
- Shun-Shuai Liu
  - State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, AMMS, Beijing, China
- Xiao-Ping Kang
  - State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, AMMS, Beijing, China
- Ya-Dan Li
  - State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, AMMS, Beijing, China
- Jun-Qing Wei
  - Beijing Advanced Innovation Center for Soft Matter Science and Engineering (BAIC-SM), College of Life Science and Technology, Beijing University of Chemical Technology, Beijing, China
- Wei Lin
  - Beijing Advanced Innovation Center for Soft Matter Science and Engineering (BAIC-SM), College of Life Science and Technology, Beijing University of Chemical Technology, Beijing, China
- Zhongyi Lei
  - Beijing Advanced Innovation Center for Soft Matter Science and Engineering (BAIC-SM), College of Life Science and Technology, Beijing University of Chemical Technology, Beijing, China
- Ye Feng
  - State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, AMMS, Beijing, China
- Jia-Fu Jiang
  - State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, AMMS, Beijing, China
- Tao Jiang
  - State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, AMMS, Beijing, China
- Yigang Tong
  - Beijing Advanced Innovation Center for Soft Matter Science and Engineering (BAIC-SM), College of Life Science and Technology, Beijing University of Chemical Technology, Beijing, China
  - *Correspondence: Yigang Tong
4
Philip AM, Ahmed WS, Biswas KH. Reversal of the unique Q493R mutation increases the affinity of Omicron S1-RBD for ACE2. Comput Struct Biotechnol J 2023; 21:1966-1977. [PMID: 36936816 PMCID: PMC10006685 DOI: 10.1016/j.csbj.2023.02.019] [Citation(s) in RCA: 17] [Impact Index Per Article: 17.0] [Received: 11/25/2022] [Revised: 01/28/2023] [Accepted: 02/09/2023] [Indexed: 02/16/2023]
Abstract
The SARS-CoV-2 Omicron variant, which contains 15 mutations, including the unique Q493R, in the spike protein receptor binding domain (S1-RBD), is highly infectious. While comparison with previously reported mutations provides some insights, the mechanism underlying the increased infections and the impact of the reversal of the unique Q493R mutation seen in the BA.4, BA.5, BA.2.75, BQ.1 and XBB lineages are not yet completely understood. Here, using structural modelling and molecular dynamics (MD) simulations, we show that the Omicron mutations increase the affinity of S1-RBD for ACE2, and that a reversal of the unique Q493R mutation further increases the ACE2-S1-RBD affinity. Specifically, we performed all-atom, explicit-solvent MD simulations using a modelled structure of the Omicron S1-RBD-ACE2 complex and compared the trajectories with those of the WT complex, revealing a substantial reduction in Cα-atom fluctuation in the Omicron S1-RBD and an increase in hydrogen bonds and other interactions. Residue-level analysis revealed an alteration in the interaction between several residues, including a switch in the interaction of ACE2 D38 from S1-RBD Y449 in the WT complex to the mutated R residue (Q493R) in the Omicron complex. Importantly, simulations with the Revertant (Omicron without the Q493R mutation) complex revealed further enhancement of the interaction between S1-RBD and ACE2. Thus, the results presented here not only provide insights into the increased infectious potential of the Omicron variant but also offer a mechanistic basis for the reversal of the Q493R mutation seen in some Omicron lineages, and they will aid in understanding the impact of mutations in SARS-CoV-2 evolution.
Affiliation(s)
- Angelin M. Philip
  - Division of Genomics and Translational Biomedicine, College of Health & Life Sciences, Hamad Bin Khalifa University, Qatar Foundation, Doha 34110, Qatar
- Wesam S. Ahmed
  - Division of Biological and Biomedical Sciences, College of Health & Life Sciences, Hamad Bin Khalifa University, Qatar Foundation, Doha 34110, Qatar
- Kabir H. Biswas
  - Division of Biological and Biomedical Sciences, College of Health & Life Sciences, Hamad Bin Khalifa University, Qatar Foundation, Doha 34110, Qatar
  - Corresponding author.
5
Nan BG, Zhang S, Li YC, Kang XP, Chen YH, Li L, Jiang T, Li J. Convolutional Neural Networks Based on Sequential Spike Predict the High Human Adaptation of SARS-CoV-2 Omicron Variants. Viruses 2022; 14:1072. [PMID: 35632811 PMCID: PMC9147419 DOI: 10.3390/v14051072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2022] [Revised: 05/10/2022] [Accepted: 05/11/2022] [Indexed: 12/04/2022]
Abstract
The COVID-19 pandemic has frequently produced more highly transmissible SARS-CoV-2 variants, such as Omicron, which has itself produced sublineages. Distinguishing high-risk Omicron sublineages from other lineages of SARS-CoV-2 variants is a challenge. We aimed to build a fine-grained deep learning (DL) model to assess SARS-CoV-2 transmissibility, updating our former coarse-grained model, using training/validation data from early-stage SARS-CoV-2 variants and based on sequential Spike samples. Sequential amino acid (AA) frequencies in Spike were decomposed into serial, sliding-window fragments. Unsupervised machine learning approaches were used to observe the distribution of sequential AA frequencies, and a supervised convolutional neural network (CNN) was then built with three adaptation labels to predict the human adaptation of Omicron variants in sublineages. The results indicated clear inter-lineage separation and intra-lineage clustering for SARS-CoV-2 variants in the decomposed sequential AAs. Accurate classification by the predictor was validated for variants with different adaptations. Higher adaptation was predicted for the BA.2 sublineage and middle-level adaptation for the BA.1/BA.1.1 sublineages of Omicron. In summary, the Omicron BA.2 sublineage is more adaptive than BA.1/BA.1.1 and has spread more rapidly, particularly in Europe. The fine-grained adaptation DL model works well for the timely assessment of the transmissibility of SARS-CoV-2 variants, facilitating the control of emerging SARS-CoV-2 variants.
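The windowed decomposition of amino acid frequencies described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the window length, the step size, and the function name `windowed_aa_frequencies` are hypothetical choices, since the abstract does not report the parameters the authors used.

```python
from collections import Counter

# The 20 standard amino acids in a fixed order, so each window maps to
# the same 20-dimensional feature layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def windowed_aa_frequencies(protein, window=100, step=50):
    """Return one 20-dimensional AA frequency vector per sliding window.

    Assumptions for illustration only: window and step are placeholder
    values, and frequencies are normalized by the fragment length, so
    non-standard residues (e.g. X) lower the sum below 1.
    """
    seq = protein.upper()
    if not seq:
        return []
    features = []
    # Slide along the sequence; a sequence shorter than the window
    # yields a single, shorter fragment starting at position 0.
    for start in range(0, max(len(seq) - window, 0) + 1, step):
        fragment = seq[start:start + window]
        counts = Counter(fragment)
        features.append([counts[aa] / len(fragment) for aa in AMINO_ACIDS])
    return features


if __name__ == "__main__":
    toy_fragment = "MFVFLVLLPLVSSQCVNLTTRTQLPPAYTNSFTRGVYYPDKVFRSSVLHS"
    rows = windowed_aa_frequencies(toy_fragment, window=20, step=10)
    print(len(rows), "windows of", len(rows[0]), "features each")
```

Stacking the per-window vectors in order gives a two-dimensional array per Spike sequence, which is the kind of input a convolutional network could consume; the network itself is outside the scope of this sketch.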
Affiliation(s)
- Jing Li
  - Correspondence: (T.J.); (J.L.)