1
Yang S, Xu P. HemoDL: Hemolytic peptides prediction by double ensemble engines from Rich sequence-derived and transformer-enhanced information. Anal Biochem 2024; 690:115523. [PMID: 38552762 DOI: 10.1016/j.ab.2024.115523] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2023] [Revised: 03/20/2024] [Accepted: 03/22/2024] [Indexed: 04/02/2024]
Abstract
Hemolytic peptides can trigger hemolysis by rupturing red blood cell membranes and causing cell disruption. Because in-lab identification is labor-intensive and time-consuming, accurate, high-throughput hemolytic peptide prediction is crucial for keeping pace with the growth of peptide sequence data in proteomics and peptidomics. In this study, we present HemoDL, an ensemble learning model that learns the distinct distribution of sequence characteristics to predict the hemolytic activity of peptides using a double LightGBM framework. To determine the most informative encoding features, we compare 17 widely used features across four benchmark datasets. Our investigation reveals that CTD, BPF, Charge, AAC, GDPC, ATC, QSO, and transformer-based features contribute more positively to detecting the hemolytic activity of peptides. Comparison with eight state-of-the-art methods demonstrates that HemoDL outperforms the other models, attaining higher Matthews Correlation Coefficient values on the four test datasets, with improvements ranging from 6.30% to 16.04%, 6.63% to 11.26%, 4.76% to 9.92%, and 7.41% to 15.03%, respectively. Additionally, we provide HemoDL with a user-friendly graphical interface, available at https://github.com/abcair/HemoDL. In summary, the HemoDL model, leveraging CTD, BPF, Charge, AAC, GDPC, ATC, QSO, and transformer-based encoding features within a double LightGBM learning framework, achieves high accuracy in predicting the hemolytic activity of peptides.
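As a rough illustration of two terms the abstract relies on, the sketch below computes an amino acid composition (AAC) vector for a peptide and the Matthews Correlation Coefficient from confusion-matrix counts. It is not taken from the HemoDL code base; the function names `aac` and `mcc` and the example peptide are hypothetical, chosen only for this sketch.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, fixed order

def aac(peptide: str) -> list[float]:
    """Fraction of each standard amino acid in the peptide (hypothetical helper).

    Non-standard residue letters still count toward the length, so the
    vector sums to less than 1 when they are present.
    """
    peptide = peptide.upper()
    if not peptide:
        raise ValueError("peptide must not be empty")
    counts = Counter(peptide)
    return [counts[a] / len(peptide) for a in AMINO_ACIDS]

def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews Correlation Coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when any marginal total is zero
    return (tp * tn - fp * fn) / denom

if __name__ == "__main__":
    vector = aac("AAKK")            # made-up peptide, for illustration only
    print(len(vector), sum(vector))
    print(mcc(tp=40, tn=45, fp=5, fn=10))
```

In a pipeline like the one described, a vector such as this would be one of several encodings concatenated before being passed to a classifier; how HemoDL combines them is not specified in the abstract.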
Affiliation(s)
- Sen Yang
- School of Computer Science and Artificial Intelligence, Aliyun School of Big Data, School of Software, Changzhou University, Changzhou, 213164, China; The Affiliated Changzhou No.2 People's Hospital of Nanjing Medical University, Changzhou, 213164, China
- Piao Xu
- College of Economics and Management, Nanjing Forestry University, China.
2
Salod Z, Mahomed O. Mapping Potential Vaccine Candidates Predicted by VaxiJen for Different Viral Pathogens between 2017-2021: A Scoping Review. Vaccines (Basel) 2022; 10:1785. [PMID: 36366294 PMCID: PMC9695814 DOI: 10.3390/vaccines10111785] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/01/2022] [Revised: 10/16/2022] [Accepted: 10/18/2022] [Indexed: 09/29/2023] Open
Abstract
Reverse vaccinology (RV) is a promising alternative to traditional vaccinology. RV focuses on in silico methods to identify antigens or potential vaccine candidates (PVCs) from a pathogen's proteome. Researchers use VaxiJen, the most well-known RV tool, to predict PVCs for various pathogens. The purpose of this scoping review is to provide an overview of PVCs predicted by VaxiJen for different viruses between 2017 and 2021 using Arksey and O'Malley's framework and the Preferred Reporting Items for Systematic Reviews extension for Scoping Reviews (PRISMA-ScR) guidelines. We used the term 'vaxijen' to search PubMed, Scopus, Web of Science, EBSCOhost, and ProQuest One Academic. The protocol was registered at the Open Science Framework (OSF). We identified articles on this topic, charted them, and discussed the key findings. The database searches yielded 1033 articles, of which 275 were eligible. Most studies focused on severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), published between 2020 and 2021. Only a few articles (8/275; 2.9%) conducted experimental validations to confirm the predictions as vaccine candidates, with 2.2% (6/275) articles mentioning recombinant protein expression. Researchers commonly targeted parts of the SARS-CoV-2 spike (S) protein, with the frequently predicted epitopes as PVCs being major histocompatibility complex (MHC) class I T cell epitopes WTAGAAAYY, RQIAPGQTG, IAIVMVTIM, and B cell epitope IAPGQTGKIADY, among others. The findings of this review are promising for the development of novel vaccines. We recommend that vaccinologists use these findings as a guide to performing experimental validation for various viruses, with SARS-CoV-2 as a priority, because better vaccines are needed, especially to stay ahead of the emergence of new variants. If successful, these vaccines could provide broader protection than traditional vaccines.
Affiliation(s)
- Zakia Salod
- Discipline of Public Health Medicine, University of KwaZulu-Natal, Durban 4051, South Africa
3
Rodriguez SE, Hawman DW, Sorvillo TE, O'Neal TJ, Bird BH, Rodriguez LL, Bergeron É, Nichol ST, Montgomery JM, Spiropoulou CF, Spengler JR. Immunobiology of Crimean-Congo hemorrhagic fever. Antiviral Res 2022; 199:105244. [PMID: 35026307 PMCID: PMC9245446 DOI: 10.1016/j.antiviral.2022.105244] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 09/30/2021] [Revised: 01/05/2022] [Accepted: 01/06/2022] [Indexed: 12/29/2022]
Abstract
Human infection with Crimean-Congo hemorrhagic fever virus (CCHFV), a tick-borne pathogen in the family Nairoviridae, can result in a spectrum of outcomes, ranging from asymptomatic infection through mild clinical signs to severe or fatal disease. Studies of CCHFV immunobiology have investigated the relationship between innate and adaptive immune responses with disease severity, attempting to elucidate factors associated with differential outcomes. In this article, we begin by highlighting unanswered questions, then review current efforts to answer them. We discuss in detail current clinical studies and research in laboratory animals on CCHF, including immune targets of infection and adaptive and innate immune responses. We summarize data about the role of the immune response in natural infections of animals and humans and experimental studies in vitro and in vivo and from evaluating immune-based therapies and vaccines, and present recommendations for future research.
Affiliation(s)
- Sergio E Rodriguez
- Viral Special Pathogens Branch, Division of High-Consequence Pathogens and Pathology, National Center for Emerging and Zoonotic Infectious Diseases, Centers for Disease Control and Prevention, Atlanta, Georgia; Department of Microbiology & Immunology, University of Texas Medical Branch, Galveston, TX, USA; Galveston National Laboratory, University of Texas Medical Branch, Galveston, TX, USA
- David W Hawman
- Laboratory of Virology, Rocky Mountain Laboratories, Division of Intramural Research, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Hamilton, MT, USA
- Teresa E Sorvillo
- Viral Special Pathogens Branch, Division of High-Consequence Pathogens and Pathology, National Center for Emerging and Zoonotic Infectious Diseases, Centers for Disease Control and Prevention, Atlanta, Georgia; Department of Microbiology & Immunology, University of Texas Medical Branch, Galveston, TX, USA; Galveston National Laboratory, University of Texas Medical Branch, Galveston, TX, USA; One Health Institute, School of Veterinary Medicine, University of California Davis, Davis, CA, USA
- T Justin O'Neal
- Viral Special Pathogens Branch, Division of High-Consequence Pathogens and Pathology, National Center for Emerging and Zoonotic Infectious Diseases, Centers for Disease Control and Prevention, Atlanta, Georgia
- Brian H Bird
- Viral Special Pathogens Branch, Division of High-Consequence Pathogens and Pathology, National Center for Emerging and Zoonotic Infectious Diseases, Centers for Disease Control and Prevention, Atlanta, Georgia; One Health Institute, School of Veterinary Medicine, University of California Davis, Davis, CA, USA
- Luis L Rodriguez
- Foreign Animal Disease Research Unit, Plum Island Animal Disease Center, Agricultural Research Service, United States Department of Agriculture, Orient Point, New York, USA
- Éric Bergeron
- Viral Special Pathogens Branch, Division of High-Consequence Pathogens and Pathology, National Center for Emerging and Zoonotic Infectious Diseases, Centers for Disease Control and Prevention, Atlanta, Georgia; Department of Pharmaceutical and Biomedical Sciences, University of Georgia, Athens, Georgia
- Stuart T Nichol
- Viral Special Pathogens Branch, Division of High-Consequence Pathogens and Pathology, National Center for Emerging and Zoonotic Infectious Diseases, Centers for Disease Control and Prevention, Atlanta, Georgia
- Joel M Montgomery
- Viral Special Pathogens Branch, Division of High-Consequence Pathogens and Pathology, National Center for Emerging and Zoonotic Infectious Diseases, Centers for Disease Control and Prevention, Atlanta, Georgia
- Christina F Spiropoulou
- Viral Special Pathogens Branch, Division of High-Consequence Pathogens and Pathology, National Center for Emerging and Zoonotic Infectious Diseases, Centers for Disease Control and Prevention, Atlanta, Georgia
- Jessica R Spengler
- Viral Special Pathogens Branch, Division of High-Consequence Pathogens and Pathology, National Center for Emerging and Zoonotic Infectious Diseases, Centers for Disease Control and Prevention, Atlanta, Georgia.
4
Nosrati M, Amani J. In silico screening of ssDNA aptamer against Escherichia coli O157:H7: A machine learning and the Pseudo K-tuple nucleotide composition based approach. Comput Biol Chem 2021; 95:107568. [PMID: 34543910 DOI: 10.1016/j.compbiolchem.2021.107568] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/09/2020] [Revised: 08/02/2021] [Accepted: 08/24/2021] [Indexed: 02/07/2023]
Abstract
This study aimed to screen ssDNA aptamers against Escherichia coli O157:H7 in silico by combining machine learning with the PseKNC approach. First, a total of 47 validated ssDNA aptamers and 498 random DNA sequences were used as positive and negative training data, respectively. The sequences were then converted to numerical vectors using the PseKNC method through the Pse-in-one 2.0 web server. Next, the numerical vectors were classified by the SVM, ANN, and RF algorithms available in Orange 3.2.0 software. The performance of the tested models was evaluated using cross-validation, random sampling, and ROC curve analyses. The initial results demonstrated that the ANN and RF algorithms performed adequately in classifying the data. To improve the performance of these classifiers, the positive training data were triplicated and the models were retrained. The results confirmed that increasing the data size had a significant effect on classification accuracy, especially for the RF model. Subsequently, the RF algorithm, with an accuracy of 98%, was selected for aptamer screening. The thermodynamic details of the folding process and the secondary structures of the screened aptamers were also considered as final evaluations. The results confirmed that the aptamers selected by the proposed method had appropriate structural properties and that there was no thermodynamic barrier to aptamer folding.
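The encoding and rebalancing steps in this abstract can be sketched roughly as follows. This is a simplified stand-in, not the authors' pipeline: it computes only the plain K-tuple (k-mer) frequency part of a sequence vector and omits the pseudo components that PseKNC adds from physicochemical properties, and it treats "triplicated" as simple replication of positive samples, which is one possible reading. The function names `kmer_composition` and `replicate_positives` and the toy sequences are hypothetical.

```python
from itertools import product

def kmer_composition(seq: str, k: int = 2) -> list[float]:
    """Normalized frequency of every DNA k-mer, in lexicographic ACGT order.

    Windows containing characters other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:
            counts[window] += 1
            total += 1
    if total == 0:
        raise ValueError("sequence has no valid k-mers")
    return [counts[m] / total for m in kmers]

def replicate_positives(X, y, times=3):
    """Return copies of X and y in which each positive (label 1) sample
    appears `times` times in total. Assumed reading of 'triplicated'."""
    X_out, y_out = list(X), list(y)
    for features, label in zip(X, y):
        if label == 1:
            X_out.extend([features] * (times - 1))
            y_out.extend([label] * (times - 1))
    return X_out, y_out

if __name__ == "__main__":
    sequences = ["ACGTACGT", "TTTTGGGG", "ACACACAC"]  # toy data, not from the study
    labels = [1, 0, 0]
    X = [kmer_composition(s, k=2) for s in sequences]
    X_bal, y_bal = replicate_positives(X, labels, times=3)
    print(len(X_bal), y_bal)
```

Replicating samples before cross-validation can place copies of the same sequence in both training and test folds, so accuracy measured this way may be optimistic; replicating inside each training fold avoids that.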
Affiliation(s)
- Mokhtar Nosrati
- Department of Biotechnology, Faculty of Biological Science and Technology, University of Isfahan, Isfahan, Iran.
- Jafar Amani
- Applied Microbiology Research Center, Systems Biology and Poisonings Institute, Baqiyatallah University of Medical Sciences, Tehran, Iran.
5
Some illuminating remarks on molecular genetics and genomics as well as drug development. Mol Genet Genomics 2020; 295:261-274. [PMID: 31894399 DOI: 10.1007/s00438-019-01634-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/21/2019] [Accepted: 12/05/2019] [Indexed: 02/07/2023]
Abstract
Facing the explosive growth of biological sequences discovered in the post-genomic age, one of the most important but also most difficult problems in computational biology is how to express a biological sequence as a discrete model or vector while still retaining considerable sequence-order information or its special pattern. To deal with this challenging problem, the ideas of "pseudo amino acid components" and "pseudo K-tuple nucleotide composition" have been proposed. These ideas and approaches have further stimulated the birth of the "distorted key theory" and the "wenxing diagram", substantially strengthened the power to treat multi-label systems, and led to the establishment of the famous "5-steps rule". All these logical developments are quite natural and are very useful not only for theoretical scientists but also for experimental scientists conducting genetics/genomics analysis and drug development. This review also presents their future perspectives; i.e., their impacts will become even more significant and profound.