1
Kataria A, Srivastava A, Singh DD, Haque S, Han I, Yadav DK. Systematic computational strategies for identifying protein targets and lead discovery. RSC Med Chem 2024; 15:2254-2269. [PMID: 39026640 PMCID: PMC11253860 DOI: 10.1039/d4md00223g] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/02/2024] [Accepted: 05/10/2024] [Indexed: 07/20/2024] Open
Abstract
Computational algorithms and tools have shortened the drug discovery and development timeline. The applicability of computational approaches has gained immense relevance owing to the dramatic surge in structural information on biomacromolecules and their heteromolecular complexes. Computational methods are now extensively used for identifying new protein targets, druggability assessment, pharmacophore mapping, molecular docking, virtual screening of lead molecules, bioactivity prediction, molecular dynamics of protein-ligand complexes, affinity prediction, and designing better ligands. Herein, we provide an overview of the salient components of recently reported computational drug-discovery workflows, including algorithms, tools, and databases for protein target identification and optimized ligand selection.
Affiliation(s)
- Arti Kataria
  - Laboratory of Bacteriology, Rocky Mountain Laboratories, National Institute of Allergy and Infectious Diseases (NIAID), National Institutes of Health (NIH) Hamilton MT 59840 USA
- Ankit Srivastava
  - Laboratory of Neurological Infections and Immunity, Rocky Mountain Laboratories, National Institute of Allergy and Infectious Diseases (NIAID), National Institutes of Health (NIH) Hamilton MT 59840 USA
- Desh Deepak Singh
  - Amity Institute of Biotechnology, Amity University Rajasthan Jaipur India
- Shafiul Haque
  - Research and Scientific Studies Unit, College of Nursing and Health Sciences, Jazan University Jazan-45142 Saudi Arabia
- Ihn Han
  - Plasma Bioscience Research Center, Applied Plasma Medicine Center, Department of Electrical & Biological Physics, Kwangwoon University Seoul 01897 Republic of Korea +82 32 820 4948
- Dharmendra Kumar Yadav
  - Department of Biologics, College of Pharmacy, Gachon University Hambakmoeiro 191, Yeonsu-gu Incheon 21924 Republic of Korea
2
Rahman S, Chiou CC, Ahmad S, Islam ZU, Tanaka T, Alouffi A, Chen CC, Almutairi MM, Ali A. Subtractive Proteomics and Reverse-Vaccinology Approaches for Novel Drug Target Identification and Chimeric Vaccine Development against Bartonella henselae Strain Houston-1. Bioengineering (Basel) 2024; 11:505. [PMID: 38790371 PMCID: PMC11118080 DOI: 10.3390/bioengineering11050505] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2024] [Revised: 05/03/2024] [Accepted: 05/15/2024] [Indexed: 05/26/2024] Open
Abstract
Bartonella henselae is a Gram-negative bacterium causing a variety of clinical symptoms, ranging from cat-scratch disease to severe systemic infections, and it is primarily transmitted by infected fleas. Its status as an emerging zoonotic pathogen and its capacity to persist within host erythrocytes and endothelial cells emphasize its clinical significance. Despite progress in understanding its pathogenesis, limited knowledge exists about the virulence factors and regulatory mechanisms specific to the B. henselae strain Houston-1. Exploring these aspects is crucial for targeted therapeutic strategies against this versatile pathogen. Using reverse-vaccinology-based subtractive proteomics, this research aimed to identify the most antigenic proteins for formulating a multi-epitope vaccine against the B. henselae strain Houston-1. One crucial virulent and antigenic protein, the PAS domain-containing sensor histidine kinase protein, was identified. Subsequently, B-cell and T-cell epitopes were identified for this protein, and the evaluated epitopes were checked for their antigenicity, allergenicity, solubility, MHC binding capability, and toxicity. The filtered epitopes were merged using linkers and an adjuvant to create a multi-epitope vaccine construct. The structure was then refined, with 92.3% of amino acids falling within the allowed regions. Docking of the human receptor (TLR4) with the vaccine construct was performed and demonstrated a binding energy of -1047.2 kcal/mol with more interactions. Molecular dynamics simulations confirmed the stability of this docked complex, emphasizing the conformation and interactions between the molecules. Further experimental validation is necessary to evaluate its effectiveness against B. henselae.
Affiliation(s)
- Sudais Rahman
  - Department of Zoology, Abdul Wali Khan University, Mardan 23200, Khyber Pakhtunkhwa, Pakistan
- Chien-Chun Chiou
  - Department of Dermatology, Ditmanson Medical Foundation Chia-Yi Christian Hospital, Chiayi 600, Taiwan
- Shabir Ahmad
  - Institute of Chemistry and Center for Computing in Engineering and Sciences, University of Campinas (UNICAMP), Campinas 13084-862, Brazil
- Zia Ul Islam
  - Department of Biotechnology, Abdul Wali Khan University, Mardan 23200, Khyber Pakhtunkhwa, Pakistan
- Tetsuya Tanaka
  - Laboratory of Infectious Diseases, Joint Faculty of Veterinary Medicine, Kagoshima University, Kagoshima 890-0065, Japan
- Abdulaziz Alouffi
  - King Abdulaziz City for Science and Technology, Riyadh 12354, Saudi Arabia
- Chien-Chin Chen
  - Department of Pathology, Ditmanson Medical Foundation Chia-Yi Christian Hospital, Chiayi 600, Taiwan
  - Department of Cosmetic Science, Chia Nan University of Pharmacy and Science, Tainan 717, Taiwan
  - Ph.D. Program in Translational Medicine, Rong Hsing Research Center for Translational Medicine, National Chung Hsing University, Taichung 402, Taiwan
  - Department of Biotechnology and Bioindustry Sciences, College of Bioscience and Biotechnology, National Cheng Kung University, Tainan 701, Taiwan
- Mashal M. Almutairi
  - Department of Pharmacology and Toxicology, College of Pharmacy, King Saud University, Riyadh 11451, Saudi Arabia
- Abid Ali
  - Department of Zoology, Abdul Wali Khan University, Mardan 23200, Khyber Pakhtunkhwa, Pakistan
3
Biró B, Zhao B, Kurgan L. Complementarity of the residue-level protein function and structure predictions in human proteins. Comput Struct Biotechnol J 2022; 20:2223-2234. [PMID: 35615015 PMCID: PMC9118482 DOI: 10.1016/j.csbj.2022.05.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2022] [Revised: 05/02/2022] [Accepted: 05/02/2022] [Indexed: 11/24/2022] Open
Abstract
Sequence-based predictors of the residue-level protein function and structure cover a broad spectrum of characteristics including intrinsic disorder, secondary structure, solvent accessibility and binding to nucleic acids. They were catalogued and evaluated in numerous surveys and assessments. However, methods focusing on a given characteristic are studied separately from predictors of other characteristics, while they are typically used on the same proteins. We fill this void by studying complementarity of a representative collection of methods that target different predictions using a large, taxonomically consistent, and low similarity dataset of human proteins. First, we bridge the gap between the communities that develop structure-trained vs. disorder-trained predictors of binding residues. Motivated by a recent study of the protein-binding residue predictions, we empirically find that combining the structure-trained and disorder-trained predictors of the DNA-binding and RNA-binding residues leads to substantial improvements in predictive quality. Second, we investigate whether diverse predictors generate results that accurately reproduce relations between secondary structure, solvent accessibility, interaction sites, and intrinsic disorder that are present in the experimental data. Our empirical analysis concludes that predictions accurately reflect all combinations of these relations. Altogether, this study provides unique insights that support combining results produced by diverse residue-level predictors of protein function and structure.
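The abstract above reports that combining structure-trained and disorder-trained predictors of binding residues improves predictive quality, but it does not say how the two outputs are merged. As a purely illustrative sketch, one simple option is to take the per-residue maximum of the two propensity scores. The function name `combine_scores`, the max rule, and the assumption that both scores lie on a comparable 0-to-1 scale are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def combine_scores(
    structure_trained: Sequence[float],
    disorder_trained: Sequence[float],
) -> List[float]:
    """Merge two per-residue binding propensity profiles.

    Hypothetical rule: keep the larger of the two scores at each residue,
    so a residue flagged by either predictor stays flagged. This assumes
    both predictors output scores on a comparable scale (e.g. 0 to 1).
    """
    if len(structure_trained) != len(disorder_trained):
        raise ValueError("both predictors must score the same residues")
    return [max(s, d) for s, d in zip(structure_trained, disorder_trained)]


# Toy example: three residues scored by two (imaginary) predictors.
combined = combine_scores([0.2, 0.8, 0.1], [0.6, 0.3, 0.1])
print(combined)
```

A max rule favours sensitivity; averaging or a trained meta-predictor would be equally plausible choices, and which one works best is an empirical question that this sketch does not address.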
Affiliation(s)
- Bálint Biró
  - Institute of Genetics and Biotechnology, Hungarian University of Agriculture and Life Sciences, Gödöllő, Hungary
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
- Bi Zhao
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, United States
4
Zhao B, Katuwawala A, Oldfield CJ, Dunker AK, Faraggi E, Gsponer J, Kloczkowski A, Malhis N, Mirdita M, Obradovic Z, Söding J, Steinegger M, Zhou Y, Kurgan L. DescribePROT: database of amino acid-level protein structure and function predictions. Nucleic Acids Res 2021; 49:D298-D308. [PMID: 33119734 PMCID: PMC7778963 DOI: 10.1093/nar/gkaa931] [Citation(s) in RCA: 39] [Impact Index Per Article: 13.0] [Received: 08/14/2020] [Revised: 09/11/2020] [Accepted: 10/05/2020] [Indexed: 12/30/2022] Open
Abstract
We present DescribePROT, the database of predicted amino acid-level descriptors of structure and function of proteins. DescribePROT delivers a comprehensive collection of 13 complementary descriptors predicted using 10 popular and accurate algorithms for 83 complete proteomes that cover key model organisms. The current version includes 7.8 billion predictions for close to 600 million amino acids in 1.4 million proteins. The descriptors encompass sequence conservation, position specific scoring matrix, secondary structure, solvent accessibility, intrinsic disorder, disordered linkers, signal peptides, MoRFs and interactions with proteins, DNA and RNAs. Users can search DescribePROT by the amino acid sequence and the UniProt accession number and entry name. The pre-computed results are made available instantaneously. The predictions can be accessed via an interactive graphical interface that allows simultaneous analysis of multiple descriptors and can also be downloaded in structured formats at the protein, proteome and whole database scale. The putative annotations included in DescribePROT are useful for a broad range of studies, including investigations of protein function, applied projects focusing on therapeutics and diseases, and the development of predictors for other protein sequence descriptors. Future releases will expand the coverage of DescribePROT. DescribePROT can be accessed at http://biomine.cs.vcu.edu/servers/DESCRIBEPROT/.
Affiliation(s)
- Bi Zhao
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
- Akila Katuwawala
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
- A Keith Dunker
  - Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, Indianapolis, IN, USA
- Eshel Faraggi
  - Battelle Center for Mathematical Medicine at the Nationwide Children's Hospital, and Department of Pediatrics, The Ohio State University, Columbus, OH, USA
- Jörg Gsponer
  - Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada
- Andrzej Kloczkowski
  - Battelle Center for Mathematical Medicine at the Nationwide Children's Hospital, and Department of Pediatrics, The Ohio State University, Columbus, OH, USA
- Nawar Malhis
  - Michael Smith Laboratories, University of British Columbia, Vancouver, BC, Canada
- Milot Mirdita
  - Quantitative and Computational Biology, Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
- Zoran Obradovic
  - Department of Computer and Information Sciences, Temple University, Philadelphia, PA, USA
- Johannes Söding
  - Quantitative and Computational Biology, Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
- Martin Steinegger
  - School of Biological Sciences and Institute of Molecular Biology & Genetics, Seoul National University, Seoul, Republic of Korea
- Yaoqi Zhou
  - Institute for Glycomics, Griffith University, Gold Coast, Queensland, Australia
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
5
Guo L, Jiang Q, Jin X, Liu L, Zhou W, Yao S, Wu M, Wang Y. A Deep Convolutional Neural Network to Improve the Prediction of Protein Secondary Structure. Curr Bioinform 2020. [DOI: 10.2174/1574893615666200120103050] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 11/22/2022]
Abstract
Background: Protein secondary structure prediction (PSSP) is a fundamental task in bioinformatics that is helpful for understanding the three-dimensional structure and biological function of proteins. Many neural network-based prediction methods have been developed for protein secondary structures. Deep learning and multiple features are two obvious means to improve prediction accuracy.
Objective: To promote the development of PSSP, a deep convolutional neural network-based method is proposed to predict both the eight-state and three-state secondary structure of proteins.
Methods: In this model, sequence and evolutionary information of proteins are combined as multiple input features after preprocessing. A deep convolutional neural network with neither a pooling layer nor a connection layer is then constructed to predict the secondary structure of proteins. L2 regularization, batch normalization, and dropout techniques are employed to avoid over-fitting and obtain better prediction performance, and an improved cross-entropy is used as the loss function.
Results: Our proposed model obtains Q3 prediction results of 86.2%, 84.5%, 87.8%, and 84.7% on the CullPDB, CB513, CASP10, and CASP11 datasets, respectively, with corresponding Q8 prediction results of 74.1%, 70.5%, 74.9%, and 71.3%.
Conclusion: We have proposed DCNN-SS, a deep convolutional network-based PSSP method, and experimental results show that DCNN-SS performs competitively with other methods.
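The Q3 and Q8 figures reported in this abstract are per-residue accuracies over three and eight secondary-structure classes. A minimal Python sketch of how such scores are typically computed is given here. The 8-to-3 state reduction is one common convention and is an assumption, as are the function names; the abstract does not state which mapping the authors used.

```python
# Assumed reduction of DSSP-style 8-state labels to 3 states (helix, strand, coil).
# Other conventions exist; this table is illustrative, not the authors' choice.
EIGHT_TO_THREE = {
    "H": "H", "G": "H", "I": "H",   # helices
    "E": "E", "B": "E",             # strands and bridges
    "T": "C", "S": "C", "C": "C",   # turns, bends, coil
}


def per_residue_accuracy(true_states: str, predicted_states: str) -> float:
    """Fraction of residues whose predicted label equals the true label."""
    if len(true_states) != len(predicted_states):
        raise ValueError("label strings must have the same length")
    if not true_states:
        raise ValueError("label strings must not be empty")
    matches = sum(t == p for t, p in zip(true_states, predicted_states))
    return matches / len(true_states)


def to_three_state(states: str) -> str:
    """Collapse an 8-state label string to 3 states using the assumed table."""
    return "".join(EIGHT_TO_THREE[s] for s in states)


def q8(true_states: str, predicted_states: str) -> float:
    """Eight-state accuracy: labels are compared as given."""
    return per_residue_accuracy(true_states, predicted_states)


def q3(true_states: str, predicted_states: str) -> float:
    """Three-state accuracy: both strings are reduced before comparison."""
    return per_residue_accuracy(
        to_three_state(true_states), to_three_state(predicted_states)
    )


# Toy ten-residue example (not data from the paper).
true_ss = "HHGGEEBTSC"
pred_ss = "HHHHEEETTC"
print(f"Q8 = {q8(true_ss, pred_ss):.2f}, Q3 = {q3(true_ss, pred_ss):.2f}")
```

Because several eight-state labels collapse to the same three-state label, Q3 is never lower than Q8 for the same prediction, which is consistent with the gap between the two sets of figures in the abstract.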
Affiliation(s)
- Lin Guo
  - School of Software, Yunnan University, Kunming, China; School of Information, Yunnan Normal University, Kunming, China
- Qian Jiang
  - School of Software, Yunnan University, Kunming, China; School of Information, Yunnan Normal University, Kunming, China
- Xin Jin
  - School of Software, Yunnan University, Kunming, China; School of Information, Yunnan Normal University, Kunming, China
- Lin Liu
  - School of Software, Yunnan University, Kunming, China; School of Information, Yunnan Normal University, Kunming, China
- Wei Zhou
  - School of Software, Yunnan University, Kunming, China; School of Information, Yunnan Normal University, Kunming, China
- Shaowen Yao
  - School of Software, Yunnan University, Kunming, China; School of Information, Yunnan Normal University, Kunming, China
- Min Wu
  - School of Software, Yunnan University, Kunming, China; School of Information, Yunnan Normal University, Kunming, China
- Yun Wang
  - School of Software, Yunnan University, Kunming, China; School of Information, Yunnan Normal University, Kunming, China
6
Wang W, Langlois R, Langlois M, Genchev GZ, Wang X, Lu H. Functional Site Discovery From Incomplete Training Data: A Case Study With Nucleic Acid-Binding Proteins. Front Genet 2019; 10:729. [PMID: 31543893 PMCID: PMC6729729 DOI: 10.3389/fgene.2019.00729] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 12/21/2018] [Accepted: 07/11/2019] [Indexed: 12/27/2022] Open
Abstract
Function annotation efforts provide a foundation for our understanding of cellular processes and the functioning of the living cell. This motivates high-throughput computational methods to characterize new protein members of a particular function. Research work has focused on discriminative machine-learning methods, which promise to make efficient, de novo predictions of protein function. Furthermore, available function annotation exists predominantly for individual proteins rather than residues, of which only a subset is necessary for the conveyance of a particular function. This limits discriminative approaches to predicting functions for which there is sufficient residue-level annotation, e.g., identification of DNA-binding proteins, or where an excellent global representation can be divined. Complete understanding of the various functions of proteins requires discovery and functional annotation at the residue level. Herein, we cast this problem into the setting of multiple-instance learning, which only requires knowledge of the protein's function yet identifies functionally relevant residues and need not rely on homology. We developed a new multiple-instance learning algorithm derived from AdaBoost and benchmarked this algorithm against two well-studied protein function prediction tasks: annotating proteins that bind DNA and RNA. This algorithm outperforms certain previous approaches in annotating protein function while identifying functionally relevant residues involved in binding both DNA and RNA, and on one protein-DNA benchmark, it achieves near perfect classification.
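The multiple-instance setting described in this abstract can be illustrated with a short sketch: each protein is a "bag" of residues, only the bag carries a label, and a bag is called positive when at least one of its instances scores highly. The code below shows only this standard bag-level assumption using max pooling. It is not the AdaBoost-derived algorithm from the paper, and the function names, the 0.5 threshold, and the per-residue scores are hypothetical.

```python
from typing import List, Sequence, Tuple


def bag_score(instance_scores: Sequence[float]) -> float:
    """Score a bag (protein) by its highest-scoring instance (residue)."""
    if not instance_scores:
        raise ValueError("a bag must contain at least one instance")
    return max(instance_scores)


def predict_bag(
    instance_scores: Sequence[float], threshold: float = 0.5
) -> Tuple[bool, List[int]]:
    """Return the bag-level call and the indices of the residues driving it.

    The threshold is an arbitrary illustrative value. For a positive bag,
    every residue at or above the threshold is reported as putatively
    relevant; a negative bag reports no residues.
    """
    is_positive = bag_score(instance_scores) >= threshold
    relevant = (
        [i for i, s in enumerate(instance_scores) if s >= threshold]
        if is_positive
        else []
    )
    return is_positive, relevant


# Toy residue scores for two imaginary proteins.
print(predict_bag([0.1, 0.7, 0.2, 0.9]))
print(predict_bag([0.1, 0.2]))
```

The point of the design is that training needs only protein-level labels, while the per-residue scores that explain a positive call come out as a by-product.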
Affiliation(s)
- Wenchuan Wang
  - SJTU-Yale Joint Center for Biostatistics and Data Science, Department of Bioinformatics and Biostatistics, College of Life Science and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
- Robert Langlois
  - Department of Bioengineering and Department of Computer Science, University of Illinois at Chicago, Chicago, IL, United States
- Marina Langlois
  - Department of Bioengineering and Department of Computer Science, University of Illinois at Chicago, Chicago, IL, United States
- Georgi Z Genchev
  - SJTU-Yale Joint Center for Biostatistics and Data Science, Department of Bioinformatics and Biostatistics, College of Life Science and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
  - Department of Bioengineering and Department of Computer Science, University of Illinois at Chicago, Chicago, IL, United States
  - Bulgarian Institute for Genomics and Precision Medicine, Sofia, Bulgaria
- Xiaolei Wang
  - SJTU-Yale Joint Center for Biostatistics and Data Science, Department of Bioinformatics and Biostatistics, College of Life Science and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
  - Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China
- Hui Lu
  - SJTU-Yale Joint Center for Biostatistics and Data Science, Department of Bioinformatics and Biostatistics, College of Life Science and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
  - Department of Bioengineering and Department of Computer Science, University of Illinois at Chicago, Chicago, IL, United States
  - Center for Biomedical Informatics, Shanghai Children's Hospital, Shanghai, China
7
Kashani-Amin E, Sakhteman A, Larijani B, Ebrahim-Habibi A. Introducing a New Model of Sweet Taste Receptor, a Class C G-protein Coupled Receptor (C GPCR). Cell Biochem Biophys 2019; 77:227-243. [PMID: 31069640 DOI: 10.1007/s12013-019-00872-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 11/24/2018] [Accepted: 04/27/2019] [Indexed: 12/31/2022]
Abstract
The structure of the sweet taste receptor (STR), a heterodimer of class C G-protein coupled receptors comprising T1R2 and T1R3 molecules, is still undetermined. In this study, a new enhanced model of the receptor is introduced based on the most recent templates. The improvement, stability, and reliability of the model are discussed in detail. Each domain of the protein, i.e., VFTM, CR, and TMD, was separately constructed by hybrid-model construction methods and then assembled to build whole monomers. Overall, 680 ns of molecular dynamics simulation was performed for the individual domains, the whole monomers and the heterodimer form of the VFTM orthosteric binding site. The latter's structure obtained from a 200 ns simulation was docked with aspartame; among various binding sites suggested by the FTMAP server, the experimentally suggested binding domain in T1R2 was retrieved. Local three-dimensional structures and helix spans were evaluated and showed acceptable accordance with the template structures and secondary structure predictions. Individual domains and whole monomer structures were found to be stable and reliable for use. In conclusion, several validations have shown the reliability of the new and enhanced models for further molecular modeling studies on the structure and function of STR and C GPCRs.
Affiliation(s)
- Elaheh Kashani-Amin
  - Biosensor Research Center, Endocrinology and Metabolism Molecular-Cellular Sciences Institute, Tehran University of Medical Sciences, Tehran, Iran
- Amirhossein Sakhteman
  - Department of Medicinal Chemistry, School of Pharmacy, Shiraz University of Medical Sciences, Shiraz, Iran
  - Medicinal Chemistry and Natural Products Research Center, Shiraz University of Medical Sciences, Shiraz, Iran
- Bagher Larijani
  - Endocrinology and Metabolism Research Center, Endocrinology and Metabolism Clinical Sciences Institute, Tehran University of Medical Sciences, Tehran, Iran
- Azadeh Ebrahim-Habibi
  - Biosensor Research Center, Endocrinology and Metabolism Molecular-Cellular Sciences Institute, Tehran University of Medical Sciences, Tehran, Iran
8
In silico prediction of prolactin molecules as a tool for equine genomics reproduction. Mol Divers 2019; 23:1019-1028. [PMID: 30740642 DOI: 10.1007/s11030-018-09914-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2018] [Accepted: 12/31/2018] [Indexed: 10/27/2022]
Abstract
The prolactin hormone is involved in several biological functions, although its main role lies in reproduction. Because it affects fertility, studies focused on human health have linked this hormone to fertility losses. In animal research, there is still a lack of information about the structure of prolactin. In horse breeding, prolactin has a particular influence, since there is an individualization of these animals and equines are known to present several reproductive disorders. As there is no molecular structure available for the prolactin hormone and receptor, we performed several bioinformatics analyses using prediction and refinement software, as well as manual modifications. Aiming to elucidate the first computational structure of both molecules and analyse structural and functional aspects related to these proteins, here we provide the first known equine model for prolactin and prolactin receptor, which obtained high global quality scores in diverse quality-assessment software. The QMEAN overall score obtained for ePrl was -4.09 and the QMEANbrane score for ePrlr was -8.45, which supports the structures' reliability. This study provides another tool for equine genomics to shed light on the interactions of these molecules and on structural and functional alterations, and therefore to help diagnose fertility problems, contributing to the selection of a herd of high genetic merit.