1. Roy T, Sharma K, Dhall A, Patiyal S, Raghava GPS. In silico method for predicting infectious strains of influenza A virus from its genome and protein sequences. J Gen Virol 2022; 103. [PMID: 36318663 DOI: 10.1099/jgv.0.001802] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Influenza A is a contagious viral disease responsible for four pandemics in the past and a major public health concern. Being zoonotic in nature, the virus can cross the species barrier and transmit from wild aquatic bird reservoirs to humans via intermediate hosts. In this study, we developed a computational method for the prediction of human-associated and non-human-associated influenza A virus sequences. The models were trained and validated on protein and genome sequences of influenza A virus. Firstly, we developed prediction models for 15 types of influenza A proteins using composition-based and one-hot-encoding features. We achieved a maximum AUC of 0.98 for the HA protein on a validation dataset using dipeptide composition-based features. Of note, we obtained a maximum AUC of 0.99 using one-hot-encoding features for protein-based models on a validation dataset. Secondly, we built models using whole genome sequences, which achieved an AUC of 0.98 on a validation dataset. In addition, we showed that our method outperforms a similarity-based approach (i.e., BLAST) on the same validation dataset. Finally, we integrated our best models into a user-friendly web server 'FluSPred' (https://webs.iiitd.edu.in/raghava/fluspred/index.html) and a standalone version (https://github.com/raghavagps/FluSPred) for the prediction of human-associated/non-human-associated influenza A virus strains.
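As a rough illustration of the two feature types named in the abstract above (dipeptide composition and one-hot encoding), a minimal Python sketch follows. It is not the FluSPred implementation: the function names, the 20-letter amino acid alphabet, the normalization to fractions, and the handling of non-standard residues are all assumptions made for illustration.

```python
from itertools import product

# Assumption: the 20 standard amino acids, in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, e.g. "AA", "AC", ..., "YY".
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return a 400-element vector of dipeptide fractions (hypothetical helper).

    Overlapping pairs are counted; pairs containing a non-standard
    residue (e.g. 'X') are skipped, which is an assumption of this sketch.
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


def one_hot_encode(sequence):
    """Return one 20-element 0/1 row per residue (hypothetical helper).

    Non-standard residues are encoded as an all-zero row (assumption).
    """
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    rows = []
    for aa in sequence.upper():
        row = [0] * len(AMINO_ACIDS)
        if aa in index:
            row[index[aa]] = 1
        rows.append(row)
    return rows


if __name__ == "__main__":
    features = dipeptide_composition("MKAAL")
    print(len(features))                # number of dipeptide features
    print(len(one_hot_encode("MKAAL")))  # one row per residue
```

The dipeptide vector has a fixed length regardless of sequence length, so it can feed a standard classifier directly, whereas the one-hot matrix grows with the sequence and would need padding or truncation to a common length before use.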
Affiliation(s)
- Trinita Roy
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi-110020, India
- Khushal Sharma
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi-110020, India
- Anjali Dhall
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi-110020, India
- Sumeet Patiyal
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi-110020, India
- Gajendra Pal Singh Raghava
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi-110020, India
2. Deconstruction of Risk Prediction of Ischemic Cardiovascular and Cerebrovascular Diseases Based on Deep Learning. Contrast Media & Molecular Imaging 2022; 2022:8478835. [PMID: 36263000 PMCID: PMC9546720 DOI: 10.1155/2022/8478835] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2022] [Revised: 08/24/2022] [Accepted: 09/07/2022] [Indexed: 01/26/2023]
Abstract
Over the years, with the widespread use of computer technology and the dramatic increase in electronic medical data, data-driven approaches to medical data analysis have emerged. However, the analysis of medical data remains challenging due to the mixed nature of the data, the incompleteness of many records, and the high level of noise. This paper proposes an improved neural network, DBN-LSTM, that combines a deep belief network (DBN) with a long short-term memory (LSTM) network. The subset of feature attributes processed by CFS-EGA is used for training, and the optimal number of hidden layers is selected by testing on the upper DBN while training DBN-LSTM. At the same time, the validation set is used to determine the hyperparameters of the LSTM. DNN, CNN, and LSTM networks are constructed for comparative analysis with DBN-LSTM, and the models are compared on classification performance using the average of the final results of the two experiments. The results show that the prediction accuracy of DBN-LSTM for cardiovascular and cerebrovascular diseases reaches 95.61%, which is higher than that of the three traditional neural networks.
3. Xu Y, Wojtczak D. Dive into machine learning algorithms for influenza virus host prediction with hemagglutinin sequences. Biosystems 2022; 220:104740. [DOI: 10.1016/j.biosystems.2022.104740] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/31/2022] [Revised: 07/02/2022] [Accepted: 07/16/2022] [Indexed: 11/26/2022]
4. Borkenhagen LK, Allen MW, Runstadler JA. Influenza virus genotype to phenotype predictions through machine learning: a systematic review. Emerg Microbes Infect 2021; 10:1896-1907. [PMID: 34498543 PMCID: PMC8462836 DOI: 10.1080/22221751.2021.1978824] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/22/2022]
Abstract
Background: There is great interest in understanding the viral genomic predictors of phenotypic traits that allow influenza A viruses to adapt to or become more virulent in different hosts. Machine learning techniques have demonstrated promise in addressing this critical need for other pathogens because the underlying algorithms are especially well equipped to uncover complex patterns in large datasets and produce generalizable predictions for new data. As the body of research where these techniques are applied for influenza A virus phenotype prediction continues to grow, it is useful to consider the strengths and weaknesses of these approaches to understand what has prevented these models from seeing widespread use by surveillance laboratories and to identify gaps that are underexplored with this technology. Methods and Results: We present a systematic review of English literature published through 15 April 2021 of studies employing machine learning methods to generate predictions of influenza A virus phenotypes from genomic or proteomic input. Forty-nine studies were included in this review, spanning the topics of host discrimination, human adaptability, subtype and clade assignment, pandemic lineage assignment, characteristics of infection, and antiviral drug resistance. Conclusions: Our findings suggest that biases in model design and a dearth of wet laboratory follow-up may explain why these models often go underused. We, therefore, offer guidance to overcome these limitations, aid in improving predictive models of previously studied influenza A virus phenotypes, and extend those models to unexplored phenotypes in the ultimate pursuit of tools to enable the characterization of virus isolates across surveillance laboratories.
Affiliation(s)
- Laura K Borkenhagen
- Department of Infectious Disease and Global Health, Cummings School of Veterinary Medicine, Tufts University, North Grafton, MA, USA
- Martin W Allen
- Department of Computer Science, School of Engineering, Tufts University, Medford, MA, USA
- Jonathan A Runstadler
- Department of Infectious Disease and Global Health, Cummings School of Veterinary Medicine, Tufts University, North Grafton, MA, USA
5. Avian Influenza in Wild Birds and Poultry: Dissemination Pathways, Monitoring Methods, and Virus Ecology. Pathogens 2021; 10:630. [PMID: 34065291 PMCID: PMC8161317 DOI: 10.3390/pathogens10050630] [Citation(s) in RCA: 49] [Impact Index Per Article: 16.3] [Received: 03/29/2021] [Revised: 05/13/2021] [Accepted: 05/14/2021] [Indexed: 12/21/2022]
Abstract
Avian influenza is one of the largest known threats to domestic poultry. Influenza outbreaks on poultry farms typically lead to the complete slaughter of the entire domestic bird population, causing severe economic losses worldwide. Moreover, there are highly pathogenic avian influenza (HPAI) strains that are able to infect the swine or human population in addition to their primary avian host and, as such, have the potential of being a global zoonotic and pandemic threat. Migratory birds, especially waterfowl, are a natural reservoir of the avian influenza virus; they carry and exchange different virus strains along their migration routes, leading to antigenic drift and antigenic shift, which results in the emergence of novel HPAI viruses. This requires monitoring over time and in different locations to allow for the upkeep of relevant knowledge on avian influenza virus evolution and the prevention of novel epizootic and epidemic outbreaks. In this review, we assess the role of migratory birds in the spread and introduction of influenza strains on a global level, based on recent data. Our analysis sheds light on the details of viral dissemination linked to avian migration, the viral exchange between migratory waterfowl and domestic poultry, virus ecology in general, and viral evolution as a process tightly linked to bird migration. We also provide insight into methods used to detect and quantify avian influenza in the wild. This review may be beneficial for the influenza research community and may pave the way to novel strategies of avian influenza and HPAI zoonosis outbreak monitoring and prevention.
6. Ding J, Lin Q, Zhang J, Young GM, Jiang C, Zhong Y, Zhang J. Rapid identification of pathogens by using surface-enhanced Raman spectroscopy and multi-scale convolutional neural network. Anal Bioanal Chem 2021; 413:3801-3811. [PMID: 33961103 DOI: 10.1007/s00216-021-03332-5] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 01/19/2021] [Revised: 03/30/2021] [Accepted: 04/08/2021] [Indexed: 12/17/2022]
Abstract
Salmonella is a prevalent pathogen causing serious morbidity and mortality worldwide. There are over 2600 serovars of Salmonella. Among them, Salmonella Enteritidis, Salmonella Typhimurium, and Salmonella Paratyphi were reported to be the most common foodborne pathogenic serovars in the EU and China. To provide a more efficient approach to detecting and distinguishing these serovars, a new analytical method was developed by combining surface-enhanced Raman spectroscopy (SERS) with a multi-scale convolutional neural network (CNN). We prepared 34-nm gold nanoparticles (AuNPs) as the label-free Raman substrate, measured 1854 SERS spectra of these three Salmonella serovars, and then proposed a multi-scale CNN model with three parallel CNNs to achieve multi-dimensional extraction of SERS spectral features. We observed the impact of the number of iterations and training samples on the recognition accuracy by changing the ratio of the training set size to the testing set size. By comparing the calculated data with the experimental data, we showed that our model could reach a recognition accuracy of more than 97%. These results indicate that combining SERS spectroscopy with a multi-scale CNN is feasible not only for Salmonella serotype identification, but also for the identification of other pathogen species and serovars.
Affiliation(s)
- Jingyu Ding
- College of Food Science and Technology, Shanghai Ocean University, Shanghai, 201306, China
- Qingqing Lin
- Key Laboratory of Ministry of Education of China for Research of Design and Electromagnetic Compatibility of High-Speed Electronic System, Shanghai Jiao Tong University, Shanghai, 200240, China
- Jiameng Zhang
- Key Laboratory of Ministry of Education of China for Research of Design and Electromagnetic Compatibility of High-Speed Electronic System, Shanghai Jiao Tong University, Shanghai, 200240, China
- Glenn M Young
- Department of Food Science and Technology, University of California, Davis, CA, 95616, USA
- Chun Jiang
- Key Laboratory of Ministry of Education of China for Research of Design and Electromagnetic Compatibility of High-Speed Electronic System, Shanghai Jiao Tong University, Shanghai, 200240, China
- Yaoguang Zhong
- College of Food Science and Technology, Shanghai Ocean University, Shanghai, 201306, China
- Jianhua Zhang
- School of Agriculture and Biology, Bor S. Luh Food Safety Research Center, Shanghai Jiao Tong University, Shanghai, 200240, China
- NMPA Key Laboratory for Testing Technology of Pharmaceutical Microbiology, Shanghai Institute for Food and Drug Control, Shanghai, 201203, China
7. Tarasova O, Poroikov V. Machine Learning in Discovery of New Antivirals and Optimization of Viral Infections Therapy. Curr Med Chem 2021; 28:7840-7861. [PMID: 33949929 DOI: 10.2174/0929867328666210504114351] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/10/2020] [Revised: 02/13/2021] [Accepted: 02/24/2021] [Indexed: 11/22/2022]
Abstract
Nowadays, computational approaches play an important role in the design of new drug-like compounds and the optimization of pharmacotherapeutic treatment of diseases. The growth of viral infections, including those caused by the Human Immunodeficiency Virus (HIV), Ebola virus, the recently detected coronavirus, and some others, leads to many newly infected people with a high risk of death or severe complications. A huge amount of chemical, biological, and clinical data is at the disposal of researchers. Therefore, there are many opportunities to find relationships between particular features of chemical data and the antiviral activity of biologically active compounds based on machine learning approaches. Biological and clinical data can also be used for building models to predict relationships between viral genotype and drug resistance, which might help determine the clinical outcome of treatment. In the current study, we consider machine-learning approaches in antiviral research carried out during the past decade. We review in detail the application of machine-learning methods for the design of new potential antiviral agents and vaccines, drug resistance prediction, and analysis of virus-host interactions. Our review also covers the perspectives of using machine-learning approaches for antiviral research, including Dengue, Ebola viruses, Influenza A, Human Immunodeficiency Virus, coronaviruses, and some others.
Affiliation(s)
- Olga Tarasova
- Department of Bioinformatics, Institute of Biomedical Chemistry, Moscow, Russian Federation
- Vladimir Poroikov
- Department of Bioinformatics, Institute of Biomedical Chemistry, Moscow, Russian Federation