1
Zhou Y, Jiang Y, Chen SJ. SPRank─A Knowledge-Based Scoring Function for RNA-Ligand Pose Prediction and Virtual Screening. J Chem Theory Comput 2024. [PMID: 39150889 DOI: 10.1021/acs.jctc.4c00681] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/18/2024]
Abstract
The growing interest in RNA-targeted drugs underscores the need for computational modeling of interactions between RNA molecules and small compounds. Having a reliable scoring function for RNA-ligand interactions is essential for effective computational drug screening. An ideal scoring function should not only predict the native pose for ligand binding but also rank the binding affinity of different ligands. However, existing scoring functions are primarily designed to predict the native binding modes for a given RNA-ligand pair and have not been thoroughly assessed for virtual screening purposes. In this paper, we introduce SPRank, a combination of machine-learning and knowledge-based scoring functions developed through a weighted iterative approach, specifically designed to tackle both binding mode prediction and virtual screening challenges. Our approach incorporates third-party docking software, such as rDock and AutoDock Vina, to sample flexible ligands against an ensemble of RNA structures, capturing the conformational flexibility of both the RNA and the ligand. Through rigorous testing, SPRank demonstrates improved performance compared to the tested scoring functions across four test sets comprising 122, 42, 55, and 71 nucleic acid-ligand complexes. Furthermore, SPRank exhibits improved performance in virtual screening tests targeting the HIV-1 TAR ensemble, which highlights its advantage in drug discovery. These results underscore the advantages of SPRank as a potentially promising tool for RNA-targeted drug design. The source code of SPRank and the data sets are freely accessible at https://github.com/Vfold-RNA/SPRank.
Affiliation(s)
- Yuanzhe Zhou
  - Department of Physics and Astronomy, University of Missouri-Columbia, Columbia, Missouri 65211-7010, United States
- Yangwei Jiang
  - Department of Physics and Astronomy, University of Missouri-Columbia, Columbia, Missouri 65211-7010, United States
- Shi-Jie Chen
  - Department of Physics and Astronomy, Department of Biochemistry, Institute of Data Sciences and Informatics, University of Missouri-Columbia, Columbia, Missouri 65211-7010, United States
2
Erdős G, Dosztányi Z. AIUPred: combining energy estimation with deep learning for the enhanced prediction of protein disorder. Nucleic Acids Res 2024; 52:W176-W181. [PMID: 38747347 PMCID: PMC11223784 DOI: 10.1093/nar/gkae385] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/29/2024] [Revised: 04/19/2024] [Accepted: 05/07/2024] [Indexed: 07/06/2024]
Abstract
Intrinsically disordered proteins and protein regions (IDPs/IDRs) carry out important biological functions without relying on a single well-defined conformation. As these proteins are a challenge to study experimentally, computational methods play important roles in their characterization. One of the commonly used tools is the IUPred web server, which provides prediction of disordered regions and their binding sites. IUPred is rooted in a simple biophysical model and uses a limited number of parameters derived largely from globular protein structures only. This enabled an incredibly fast and robust prediction method; however, its limitations have also become apparent in light of recent breakthrough methods using deep learning techniques. Here, we present AIUPred, a novel version of IUPred which incorporates deep learning techniques into the energy estimation framework. It achieves improved performance while keeping the robustness of the original method. Based on the evaluation of recent benchmark datasets, AIUPred scored amongst the top three single-sequence-based methods. With a new web server we offer fast and reliable visual analysis for users as well as options to analyze whole genomes in mere seconds with the downloadable package. AIUPred is available at https://aiupred.elte.hu.
Affiliation(s)
- Gábor Erdős
  - Department of Biochemistry, Eötvös Loránd University, Pázmány Péter stny 1/c, Budapest H-1117, Hungary
- Zsuzsanna Dosztányi
  - Department of Biochemistry, Eötvös Loránd University, Pázmány Péter stny 1/c, Budapest H-1117, Hungary
3
Ghosh D, Biswas A, Radhakrishna M. Advanced computational approaches to understand protein aggregation. Biophysics Reviews 2024; 5:021302. [PMID: 38681860 PMCID: PMC11045254 DOI: 10.1063/5.0180691] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2023] [Accepted: 03/18/2024] [Indexed: 05/01/2024]
Abstract
Protein aggregation is a widespread phenomenon implicated in debilitating diseases like Alzheimer's, Parkinson's, and cataracts, presenting complex hurdles for the field of molecular biology. In this review, we explore the evolving realm of computational methods and bioinformatics tools that have revolutionized our comprehension of protein aggregation. Beginning with a discussion of the multifaceted challenges associated with understanding this process and emphasizing the critical need for precise predictive tools, we highlight how computational techniques have become indispensable for understanding protein aggregation. We focus on molecular simulations, notably molecular dynamics (MD) simulations, spanning from atomistic to coarse-grained levels, which have emerged as pivotal tools in unraveling the complex dynamics governing protein aggregation in diseases such as cataracts, Alzheimer's, and Parkinson's. MD simulations provide microscopic insights into protein interactions and the subtleties of aggregation pathways, with advanced techniques like replica exchange molecular dynamics, metadynamics (MetaD), and umbrella sampling enhancing our understanding by probing intricate energy landscapes and transition states. We delve into specific applications of MD simulations, elucidating the chaperone mechanism underlying cataract formation using Markov state modeling and the intricate pathways and interactions driving toxic aggregate formation in Alzheimer's and Parkinson's disease. We then highlight how computational techniques, including bioinformatics, sequence analysis, structural data, machine learning algorithms, and artificial intelligence, have become indispensable for predicting protein aggregation propensity and locating aggregation-prone regions within protein sequences.
Throughout our exploration, we underscore the symbiotic relationship between computational approaches and empirical data, which has paved the way for potential therapeutic strategies against protein aggregation-related diseases. In conclusion, this review offers a comprehensive overview of advanced computational methodologies and bioinformatics tools that have catalyzed breakthroughs in unraveling the molecular basis of protein aggregation, with significant implications for clinical interventions, standing at the intersection of computational biology and experimental research.
Affiliation(s)
- Deepshikha Ghosh
  - Department of Biological Sciences and Engineering, Indian Institute of Technology (IIT) Gandhinagar, Palaj, Gujarat 382355, India
- Anushka Biswas
  - Department of Chemical Engineering, Indian Institute of Technology (IIT) Gandhinagar, Palaj, Gujarat 382355, India
4
Goulard Coderc de Lacam E, Roux B, Chipot C. Classifying Protein-Protein Binding Affinity with Free-Energy Calculations and Machine Learning Approaches. J Chem Inf Model 2024; 64:1081-1091. [PMID: 38272021 DOI: 10.1021/acs.jcim.3c01586] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/27/2024]
Abstract
Understanding the intricate phenomenon of neuronal wiring in the brain is of great interest in neuroscience. In the fruit fly, Drosophila melanogaster, the Dpr-DIP interactome has been identified as playing an important role in this process. However, experimental data suggest that only a limited subset of complexes, essentially 57 out of a total of 231, exhibit strong binding affinity. In this work, we sought to identify the residue-level molecular basis underlying the difference in binding affinity using a state-of-the-art methodology consisting of standard binding free-energy calculations with a geometrical route and machine learning (ML) techniques. We determined the binding affinity for two complexes using statistical mechanics simulations, achieving an excellent reproduction of the experimental data. Moreover, we predicted the binding free energy for two additional low-affinity complexes, for which no experimental estimate is available, while simultaneously identifying key residues for the binding. Furthermore, through the use of two ML algorithms, linear discriminant analysis and random forest, we achieved remarkable accuracy, as high as 0.99, in discerning between strong (cognate) and weak (noncognate) binders. The presented ML approach encompasses easily transferable input features, enabling its broad application to any interactome while facilitating the identification of pivotal residues critical for binding interactions. The predictive power of the generated model was probed on similar protein families from 13 diverse species. Our ML model exhibited commendable performance on these additional data sets, showcasing its reliability and robustness across the species barrier.
Affiliation(s)
- Emma Goulard Coderc de Lacam
  - Laboratoire International Associé Centre National de la Recherche Scientifique et University of Illinois at Urbana-Champaign, Unité Mixte de Recherche no. 7019, Université de Lorraine, B.P. 70239, 54506 Vandœuvre-lès-Nancy Cedex, France
- Benoît Roux
  - Department of Biochemistry and Molecular Biology, The University of Chicago, 929 E. 57th Street W225, Chicago, Illinois 60637, United States
  - Department of Chemistry, The University of Chicago, 5735 S Ellis Avenue, Chicago, Illinois 60637, United States
- Christophe Chipot
  - Laboratoire International Associé Centre National de la Recherche Scientifique et University of Illinois at Urbana-Champaign, Unité Mixte de Recherche no. 7019, Université de Lorraine, B.P. 70239, 54506 Vandœuvre-lès-Nancy Cedex, France
  - Department of Biochemistry and Molecular Biology, The University of Chicago, 929 E. 57th Street W225, Chicago, Illinois 60637, United States
  - Theoretical and Computational Biophysics Group, Beckman Institute, and Department of Physics, University of Illinois at Urbana-Champaign, Urbana, Illinois 61820, United States
  - Department of Chemistry, The University of Hawai'i at Mānoa, 2545 McCarthy Mall, Honolulu, Hawaii 96822, United States
5
Yang Z, Wang Y, Ni X, Yang S. DeepDRP: Prediction of intrinsically disordered regions based on integrated view deep learning architecture from transformer-enhanced and protein information. Int J Biol Macromol 2023; 253:127390. [PMID: 37827403 DOI: 10.1016/j.ijbiomac.2023.127390] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/19/2023] [Revised: 09/20/2023] [Accepted: 10/09/2023] [Indexed: 10/14/2023]
Abstract
Intrinsic disorder in proteins, a widely distributed phenomenon in nature, is related to many crucial biological processes and various diseases. Traditional determination methods tend to be costly and labor-intensive; therefore, it is desirable to seek an accurate method for identifying intrinsically disordered proteins (IDPs). In this paper, we propose a novel deep learning model for intrinsically disordered regions in proteins, named DeepDRP. DeepDRP employs an innovative TimeDistributed strategy and a Bi-LSTM architecture to predict IDPs and is driven by integrated view features of PSSM, energy-based encoding, AAindex, and transformer-enhanced embeddings including DR-BERT, OntoProtein, Prot-T5, and ESM-2. The comparison of different feature combinations indicates that the transformer-enhanced features contribute far more than traditional features to IDP prediction and that ESM-2 accounts for the larger contribution in the pre-trained fusion vectors. The ablation test verified that the TimeDistributed strategy increased the model performance and is an efficient approach to IDP prediction. Compared with eight state-of-the-art methods on the DISORDER723, S1, and DisProt832 datasets, DeepDRP outperformed competing methods in Matthews correlation coefficient by 4.90% to 36.20%, 11.80% to 26.33%, and 4.82% to 13.55%, respectively. In brief, DeepDRP is a reliable model for IDP prediction and is freely available at https://github.com/ZX-COLA/DeepDRP.
Affiliation(s)
- Zexi Yang
  - School of Computer Science and Artificial Intelligence Aliyun School of Big Data School of Software, Changzhou University, Changzhou 213164, China
- Yan Wang
  - Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China; School of Artificial Intelligence, Jilin University, Changchun 130012, China
- Xinye Ni
  - The Affiliated Changzhou No. 2 People's Hospital of Nanjing Medical University, Changzhou 213164, China
- Sen Yang
  - School of Computer Science and Artificial Intelligence Aliyun School of Big Data School of Software, Changzhou University, Changzhou 213164, China; The Affiliated Changzhou No. 2 People's Hospital of Nanjing Medical University, Changzhou 213164, China
6
Vemula D, Jayasurya P, Sushmitha V, Kumar YN, Bhandari V. CADD, AI and ML in drug discovery: A comprehensive review. Eur J Pharm Sci 2023; 181:106324. [PMID: 36347444 DOI: 10.1016/j.ejps.2022.106324] [Citation(s) in RCA: 31] [Impact Index Per Article: 31.0] [Received: 08/24/2022] [Revised: 10/26/2022] [Accepted: 11/03/2022] [Indexed: 11/06/2022]
Abstract
Computer-aided drug design (CADD) is an emerging field that has drawn a lot of interest because of its potential to expedite the drug development process and lower its cost. Drug discovery research is expensive and time-consuming, and it frequently takes 10-15 years for a drug to become commercially available. CADD has significantly impacted this area of research. Further, the combination of CADD with Artificial Intelligence (AI), Machine Learning (ML), and Deep Learning (DL) technologies to handle enormous amounts of biological data has reduced the time and cost associated with the drug development process. This review will discuss how CADD, AI, ML, and DL approaches help identify drug candidates and support various other steps of the drug discovery process. It will also provide a detailed overview of the different in silico tools used and how these approaches interact.
Affiliation(s)
- Divya Vemula
  - National Institute of Pharmaceutical Education and Research- Hyderabad, India
- Perka Jayasurya
  - National Institute of Pharmaceutical Education and Research- Hyderabad, India
- Varthiya Sushmitha
  - National Institute of Pharmaceutical Education and Research- Hyderabad, India
- Vasundhra Bhandari
  - National Institute of Pharmaceutical Education and Research- Hyderabad, India
7
Dsouza R, Mashayekhi G, Etemadpour R, Schwander P, Ourmazd A. Energy landscapes from cryo-EM snapshots: a benchmarking study. Sci Rep 2023; 13:1372. [PMID: 36697500 PMCID: PMC9876912 DOI: 10.1038/s41598-023-28401-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2022] [Accepted: 01/18/2023] [Indexed: 01/27/2023]
Abstract
Biomolecules undergo continuous conformational motions, a subset of which are functionally relevant. Understanding, and ultimately controlling biomolecular function are predicated on the ability to map continuous conformational motions, and identify the functionally relevant conformational trajectories. For equilibrium and near-equilibrium processes, function proceeds along minimum-energy pathways on one or more energy landscapes, because higher-energy conformations are only weakly occupied. With the growing interest in identifying functional trajectories, the need for reliable mapping of energy landscapes has become paramount. In response, various data-analytical tools for determining structural variability are emerging. A key question concerns the veracity with which each data-analytical tool can extract functionally relevant conformational trajectories from a collection of single-particle cryo-EM snapshots. Using synthetic data as an independently known ground truth, we benchmark the ability of four leading algorithms to determine biomolecular energy landscapes and identify the functionally relevant conformational paths on these landscapes. Such benchmarking is essential for systematic progress toward atomic-level movies of continuous biomolecular function.
Affiliation(s)
- Raison Dsouza
  - University of Wisconsin Milwaukee, 3135 N. Maryland Ave, Milwaukee, WI, 53211, USA
- Ghoncheh Mashayekhi
  - University of Wisconsin Milwaukee, 3135 N. Maryland Ave, Milwaukee, WI, 53211, USA
- Roshanak Etemadpour
  - University of Wisconsin Milwaukee, 3135 N. Maryland Ave, Milwaukee, WI, 53211, USA
- Peter Schwander
  - University of Wisconsin Milwaukee, 3135 N. Maryland Ave, Milwaukee, WI, 53211, USA
- Abbas Ourmazd
  - University of Wisconsin Milwaukee, 3135 N. Maryland Ave, Milwaukee, WI, 53211, USA
8
Chang Y, Hawkins BA, Du JJ, Groundwater PW, Hibbs DE, Lai F. A Guide to In Silico Drug Design. Pharmaceutics 2022; 15:pharmaceutics15010049. [PMID: 36678678 PMCID: PMC9867171 DOI: 10.3390/pharmaceutics15010049] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Received: 11/09/2022] [Revised: 12/16/2022] [Accepted: 12/17/2022] [Indexed: 12/28/2022]
Abstract
The drug discovery process is a rocky path that is full of challenges, with the result that very few candidates progress from hit compound to a commercially available product, often due to factors, such as poor binding affinity, off-target effects, or physicochemical properties, such as solubility or stability. This process is further complicated by high research and development costs and time requirements. It is thus important to optimise every step of the process in order to maximise the chances of success. As a result of the recent advancements in computer power and technology, computer-aided drug design (CADD) has become an integral part of modern drug discovery to guide and accelerate the process. In this review, we present an overview of the important CADD methods and applications, such as in silico structure prediction, refinement, modelling and target validation, that are commonly used in this area.
Affiliation(s)
- Yiqun Chang
  - Sydney Pharmacy School, Faculty of Medicine and Health, The University of Sydney, Camperdown, NSW 2006, Australia
- Bryson A. Hawkins
  - Sydney Pharmacy School, Faculty of Medicine and Health, The University of Sydney, Camperdown, NSW 2006, Australia
- Jonathan J. Du
  - Department of Biochemistry, Emory University School of Medicine, Atlanta, GA 30322, USA
- Paul W. Groundwater
  - Sydney Pharmacy School, Faculty of Medicine and Health, The University of Sydney, Camperdown, NSW 2006, Australia
- David E. Hibbs
  - Sydney Pharmacy School, Faculty of Medicine and Health, The University of Sydney, Camperdown, NSW 2006, Australia
- Felcia Lai
  - Sydney Pharmacy School, Faculty of Medicine and Health, The University of Sydney, Camperdown, NSW 2006, Australia
9
Dayhoff GW, Uversky VN. Rapid prediction and analysis of protein intrinsic disorder. Protein Sci 2022; 31:e4496. [PMID: 36334049 PMCID: PMC9679974 DOI: 10.1002/pro.4496] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Received: 08/15/2022] [Revised: 10/28/2022] [Accepted: 11/02/2022] [Indexed: 11/07/2022]
Abstract
Protein intrinsic disorder is found in all kingdoms of life and is known to underpin numerous physiological and pathological processes. Computational methods play an important role in characterizing and identifying intrinsically disordered proteins and protein regions. Herein, we present a new high-efficiency web-based disorder predictor named Rapid Intrinsic Disorder Analysis Online (RIDAO) that is designed to facilitate the application of protein intrinsic disorder analysis in genome-scale structural bioinformatics and comparative genomics/proteomics. RIDAO integrates six established disorder predictors into a single, unified platform that reproduces the results of individual predictors with near-perfect fidelity. To demonstrate the potential applications, we construct a test set containing more than one million sequences from one hundred organisms comprising over 420 million residues. Using this test set, we compare the efficiency and accessibility (i.e., ease of use) of RIDAO to five well-known and popular disorder predictors, namely: AUCpreD, IUPred3, metapredict V2, flDPnn, and SPOT-Disorder2. We show that RIDAO yields per-residue predictions at a rate two to six orders of magnitude greater than the other predictors and completely processes the test set in under an hour. RIDAO can be accessed free of charge at https://ridao.app.
Affiliation(s)
- Guy W. Dayhoff
  - Department of Chemistry, University of South Florida, Tampa, Florida, USA
- Vladimir N. Uversky
  - Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, University of South Florida, Tampa, Florida, USA
10
Arif M, Kabir M, Ahmed S, Khan A, Ge F, Khelifi A, Yu DJ. DeepCPPred: A Deep Learning Framework for the Discrimination of Cell-Penetrating Peptides and Their Uptake Efficiencies. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:2749-2759. [PMID: 34347603 DOI: 10.1109/tcbb.2021.3102133] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 06/13/2023]
Abstract
Cell-penetrating peptides (CPPs) are special peptides capable of carrying a variety of bioactive molecules, such as genetic materials, short interfering RNAs and nanoparticles, into cells. Recently, research on CPPs has attracted substantial interest, and the biological mechanisms of CPPs have been assessed in the context of safe drug delivery agents and therapeutic applications. Correct identification and synthesis of CPPs using traditional biochemical methods is an extremely slow, expensive and laborious task, particularly due to the large volume of unannotated peptide sequences accumulating in the World Bank repository. Hence, a powerful bioinformatics predictor that rapidly identifies CPPs with a high recognition rate is urgently needed. To date, numerous computational methods have been developed for CPP prediction. However, the available machine-learning (ML) tools are unable to identify both CPPs and their uptake efficiencies. This study aimed to develop a two-layer deep learning framework named DeepCPPred to identify CPPs in the first phase and peptide uptake efficiency in the second phase. The DeepCPPred predictor first uses four types of descriptors that cover evolutionary, energy estimation, reduced sequence and amino-acid contact information. Then, the extracted features are optimized through the elastic net algorithm and fed into a cascade deep forest algorithm to build the final CPP model. The proposed method achieved 99.45 percent overall accuracy with the CPP924 benchmark dataset in the first layer and 95.43 percent accuracy in the second layer with the CPPSite3 dataset using a 5-fold cross-validation test. Thus, our proposed bioinformatics tool surpassed all the existing state-of-the-art sequence-based CPP approaches.
11
Liang Y, Yang S, Zheng L, Wang H, Zhou J, Huang S, Yang L, Zuo Y. Research progress of reduced amino acid alphabets in protein analysis and prediction. Comput Struct Biotechnol J 2022; 20:3503-3510. [PMID: 35860409 PMCID: PMC9284397 DOI: 10.1016/j.csbj.2022.07.001] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 03/14/2022] [Revised: 06/30/2022] [Accepted: 07/01/2022] [Indexed: 11/29/2022]
Abstract
A comprehensive summary of the literature on the reduced amino acid alphabets. A systematic review of the development history of reduced amino acid alphabets. Rich application cases of amino acid reduction alphabets are described in the article. A detailed analysis of the properties and uses of the reduced amino acid alphabets.
Proteins are the executors of cellular physiological activities, and accurate structural and functional elucidation is crucial for the refined mapping of proteins. As a feature engineering method, the reduction of amino acid composition is not only an important method for protein structure and function analysis, but also opens a broad horizon for the complex field of machine learning. Representing sequences with fewer amino acid types greatly reduces the dimensionality and noise of traditional feature engineering, and provides more interpretable predictive models for machine learning to capture key features. In this paper, we systematically review the strategies and methods for constructing reduced amino acid (RAA) alphabets, and summarize the main research on their use in protein sequence alignment, functional classification, and prediction of structural properties. Finally, we give a comprehensive analysis of 672 RAA alphabets from 74 reduction methods.
Affiliation(s)
- Yuchao Liang
  - State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
- Siqi Yang
  - State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
- Lei Zheng
  - State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
- Hao Wang
  - State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
- Jian Zhou
  - State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
- Shenghui Huang
  - State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
- Lei Yang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
  - Corresponding authors.
- Yongchun Zuo
  - State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
  - Corresponding authors.
12
Vujovic F, Hunter N, Farahani RM. Notch ankyrin domain: evolutionary rise of a thermodynamic sensor. Cell Commun Signal 2022; 20:66. [PMID: 35585601 PMCID: PMC9118731 DOI: 10.1186/s12964-022-00886-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/13/2022] [Accepted: 04/21/2022] [Indexed: 12/19/2022]
Abstract
The Notch signalling pathway plays a key role in metazoan biology by contributing to the resolution of binary decisions in the life cycle of cells during development. Outcomes such as the proliferation/differentiation dichotomy are resolved by transcriptional remodelling that follows a switch from the Notch^on to the Notch^off state, characterised by dissociation of the Notch intracellular domain (NICD) from DNA-bound RBPJ. Here we provide evidence that transitioning to the Notch^off state is regulated by heat flux, a phenomenon that aligns resolution of fate dichotomies to mitochondrial activity. A combination of phylogenetic analysis and computational biochemistry was utilised to disclose structural adaptations of the Notch1 ankyrin domain that enabled it to function as a sensor of heat flux. We then employed DNA-based micro-thermography to measure heat flux during brain development, followed by in vitro analysis of the temperature-dependent behaviour of Notch1 in mouse neural progenitor cells. The structural capacity of NICD to operate as a thermodynamic sensor in metazoans stems from characteristic enrichment of charged acidic amino acids in β-hairpins of the ankyrin domain that amplify destabilising inter-residue electrostatic interactions and render the domain thermolabile. The instability emerges upon mitochondrial activity, which raises the perinuclear and nuclear temperatures to 50 °C and 39 °C, respectively, leading to destabilisation of the Notch1 transcriptional complex and transitioning to the Notch^off state. Notch1 functions as a metazoan thermodynamic sensor that is switched on by intercellular contacts, inputs heat flux as a proxy for mitochondrial activity in the Notch^on state via the ankyrin domain and is eventually switched off in a temperature-dependent manner.
Affiliation(s)
- Filip Vujovic
  - IDR/Westmead Institute for Medical Research, Westmead, NSW, 2145, Australia; School of Medical Sciences, Faculty of Medicine and Health, University of Sydney, Sydney, NSW, 2006, Australia
- Neil Hunter
  - IDR/Westmead Institute for Medical Research, Westmead, NSW, 2145, Australia
- Ramin M Farahani
  - IDR/Westmead Institute for Medical Research, Westmead, NSW, 2145, Australia; School of Medical Sciences, Faculty of Medicine and Health, University of Sydney, Sydney, NSW, 2006, Australia
13
Zhou Y, Jiang Y, Chen SJ. RNA-ligand molecular docking: advances and challenges. Wiley Interdisciplinary Reviews. Computational Molecular Science 2022; 12:e1571. [PMID: 37293430 PMCID: PMC10250017 DOI: 10.1002/wcms.1571] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 03/26/2021] [Accepted: 07/20/2021] [Indexed: 12/16/2022]
Abstract
With rapid advances in computer algorithms and hardware, fast and accurate virtual screening has led to a drastic acceleration in selecting potent small molecules as drug candidates. Computational modeling of RNA-small molecule interactions has become an indispensable tool for RNA-targeted drug discovery. The current models for RNA-ligand binding have mainly focused on the docking-and-scoring method. Accurate docking and scoring should tackle four crucial problems: (1) conformational flexibility of ligand, (2) conformational flexibility of RNA, (3) efficient sampling of binding sites and binding poses, and (4) accurate scoring of different binding modes. Moreover, compared with the problem of protein-ligand docking, predicting ligand binding to RNA, a negatively charged polymer, is further complicated by additional effects such as metal ion effects. Thermodynamic models based on physics-based and knowledge-based scoring functions have shown highly encouraging success in predicting ligand binding poses and binding affinities. Recently, kinetic models for ligand binding have further suggested that including dissociation kinetics (residence time) in ligand docking would result in improved performance in estimating in vivo drug efficacy. More recently, the rise of deep-learning approaches has led to new tools for predicting RNA-small molecule binding. In this review, we present an overview of the recently developed computational methods for RNA-ligand docking and their advantages and disadvantages.
Affiliation(s)
- Yuanzhe Zhou
  - Department of Physics and Astronomy, Department of Biochemistry, Institute of Data Sciences and Informatics, University of Missouri, Columbia, MO 65211-7010, USA
- Yangwei Jiang
  - Department of Physics and Astronomy, Department of Biochemistry, Institute of Data Sciences and Informatics, University of Missouri, Columbia, MO 65211-7010, USA
- Shi-Jie Chen
  - Department of Physics and Astronomy, Department of Biochemistry, Institute of Data Sciences and Informatics, University of Missouri, Columbia, MO 65211-7010, USA
|
14
|
Can docking scoring functions guarantee success in virtual screening? Virtual Screening and Drug Docking 2022. [DOI: 10.1016/bs.armc.2022.08.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
|
15
|
Chu WT, Yan Z, Chu X, Zheng X, Liu Z, Xu L, Zhang K, Wang J. Physics of biomolecular recognition and conformational dynamics. Reports on Progress in Physics 2021; 84:126601. [PMID: 34753115 DOI: 10.1088/1361-6633/ac3800] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/26/2021] [Accepted: 11/09/2021] [Indexed: 06/13/2023]
Abstract
Biomolecular recognition usually leads to the formation of binding complexes, often accompanied by large-scale conformational changes. This process is fundamental to biological functions at the molecular and cellular levels. Uncovering the physical mechanisms of biomolecular recognition and quantifying the key biomolecular interactions are vital to understand these functions. The recently developed energy landscape theory has been successful in quantifying recognition processes and revealing the underlying mechanisms. Recent studies have shown that in addition to affinity, specificity is also crucial for biomolecular recognition. The proposed physical concept of intrinsic specificity based on the underlying energy landscape theory provides a practical way to quantify the specificity. Optimization of affinity and specificity can be adopted as a principle to guide the evolution and design of molecular recognition. This approach can also be used in practice for drug discovery using multidimensional screening to identify lead compounds. The energy landscape topography of molecular recognition is important for revealing the underlying flexible binding or binding-folding mechanisms. In this review, we first introduce the energy landscape theory for molecular recognition and then address four critical issues related to biomolecular recognition and conformational dynamics: (1) specificity quantification of molecular recognition; (2) evolution and design in molecular recognition; (3) flexible molecular recognition; (4) chromosome structural dynamics. The results described here and the discussions of the insights gained from the energy landscape topography can provide valuable guidance for further computational and experimental investigations of biomolecular recognition and conformational dynamics.
Affiliation(s)
- Wen-Ting Chu
  - State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun 130022, People's Republic of China
- Zhiqiang Yan
  - State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun 130022, People's Republic of China
- Xiakun Chu
  - Department of Chemistry & Physics, State University of New York at Stony Brook, Stony Brook, NY 11794, United States of America
- Xiliang Zheng
  - State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun 130022, People's Republic of China
- Zuojia Liu
  - State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun 130022, People's Republic of China
- Li Xu
  - State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun 130022, People's Republic of China
- Kun Zhang
  - State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun 130022, People's Republic of China
- Jin Wang
  - Department of Chemistry & Physics, State University of New York at Stony Brook, Stony Brook, NY 11794, United States of America
|
16
|
Timmons PB, Hewage CM. ENNAVIA is a novel method which employs neural networks for antiviral and anti-coronavirus activity prediction for therapeutic peptides. Brief Bioinform 2021; 22:bbab258. [PMID: 34297817 PMCID: PMC8575049 DOI: 10.1093/bib/bbab258] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Received: 04/02/2021] [Revised: 06/09/2021] [Accepted: 06/18/2021] [Indexed: 11/14/2022] Open access
Abstract
Viruses represent one of the greatest threats to human health, necessitating the development of new antiviral drug candidates. Antiviral peptides often possess excellent biological activity and a favourable toxicity profile, and therefore represent a promising field of novel antiviral drugs. As the quantity of sequencing data grows annually, the development of an accurate in silico method for the prediction of peptide antiviral activities is important. This study leverages advances in deep learning and cheminformatics to produce a novel sequence-based deep neural network classifier for the prediction of antiviral peptide activity. The method outperforms the existent best-in-class, with an external test accuracy of 93.9%, Matthews correlation coefficient of 0.87 and an Area Under the Curve of 0.93 on the dataset of experimentally validated peptide activities. This cutting-edge classifier is available as an online web server at https://research.timmons.eu/ennavia, facilitating in silico screening and design of peptide antiviral drugs by the wider research community.
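The accuracy and Matthews correlation coefficient (MCC) quoted above are both derived from a binary confusion matrix. As a minimal sketch of how the two relate, the Python snippet below computes them from four counts. The function name and the counts are hypothetical and are not taken from the ENNAVIA paper.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Return (accuracy, MCC) for a binary confusion matrix.

    tp, tn, fp, fn are the counts of true positives, true negatives,
    false positives and false negatives.
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is reported as 0 when a row or column of the
    # confusion matrix is empty and the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc


# Hypothetical counts for illustration only (not data from the paper).
acc, mcc = classification_metrics(tp=45, tn=45, fp=5, fn=5)
```

MCC uses all four cells of the confusion matrix, so it remains informative on imbalanced data sets where accuracy alone can look deceptively high.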
Affiliation(s)
- Patrick Brendan Timmons
  - UCD School of Biomolecular and Biomedical Science, UCD Centre for Synthesis and Chemical Biology, UCD Conway Institute, University College Dublin, Dublin 4, Ireland
- Chandralal M Hewage
  - UCD School of Biomolecular and Biomedical Science, UCD Centre for Synthesis and Chemical Biology, UCD Conway Institute, University College Dublin, Dublin 4, Ireland
|
17
|
Scalvini B, Sheikhhassani V, Mashaghi A. Topological principles of protein folding. Phys Chem Chem Phys 2021; 23:21316-21328. [PMID: 34545868 DOI: 10.1039/d1cp03390e] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 11/21/2022]
Abstract
What is the topology of a protein and what governs protein folding to a specific topology? This is a fundamental question in biology. The protein folding reaction is a critically important cellular process, which is failing in many prevalent diseases. Understanding protein folding is also key to the design of new proteins for applications. However, our ability to predict the folding of a protein chain is quite limited and much is still unknown about the topological principles of folding. Current predictors of folding kinetics, including the contact order and size, present a limited predictive power, suggesting that these models are fundamentally incomplete. Here, we use a newly developed mathematical framework to define and extract the topology of a native protein conformation beyond knot theory, and investigate the relationship between native topology and folding kinetics in experimentally characterized proteins. We show that not only the folding rate, but also the mechanistic insight into folding mechanisms can be inferred from topological parameters. We identify basic topological features that speed up or slow down the folding process. The approach enabled the decomposition of protein 3D conformation into topologically independent elementary folding units, called circuits. The number of circuits correlates significantly with the folding rate, offering not only an efficient kinetic predictor, but also a tool for a deeper understanding of theoretical folding models. This study contributes to recent work that reveals the critical relevance of topology to protein folding with a new, contact-based, mathematically rigorous perspective. We show that topology can predict folding kinetics when geometry-based predictors like contact order and size fail.
Affiliation(s)
- Barbara Scalvini
  - Medical Systems Biophysics and Bioengineering, Leiden Academic Centre for Drug Research, Faculty of Science, Leiden University, Einsteinweg 55, 2333CC Leiden, The Netherlands
- Vahid Sheikhhassani
  - Medical Systems Biophysics and Bioengineering, Leiden Academic Centre for Drug Research, Faculty of Science, Leiden University, Einsteinweg 55, 2333CC Leiden, The Netherlands
- Alireza Mashaghi
  - Medical Systems Biophysics and Bioengineering, Leiden Academic Centre for Drug Research, Faculty of Science, Leiden University, Einsteinweg 55, 2333CC Leiden, The Netherlands
|
18
|
Combination and tricombination therapy to destabilize the structural integrity of COVID-19 by some bioactive compounds with antiviral drugs: insights from molecular docking study. Struct Chem 2021; 32:1415-1430. [PMID: 33437137 PMCID: PMC7791912 DOI: 10.1007/s11224-020-01723-5] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 06/30/2020] [Accepted: 12/29/2020] [Indexed: 11/14/2022]
Abstract
The SARS-CoV-2 (COVID-19) pandemic virus has recently spread throughout the world. Until now, no certified drugs have been discovered that efficiently inhibit the virus, and scientists are struggling to find new, safe bioactive inhibitors of it. In this study, we aim to find antagonists that may inhibit the activity of three major targets: the SARS-CoV-2 3-chymotrypsin-like protease (6LU7), the SARS-CoV-2 spike protein (6VYB), and a host target, the human angiotensin-converting enzyme 2 (ACE2) receptor (1R42), which is the entry point for the virus. These targets were studied with the prospect of identifying significant drug candidate(s) against COVID-19 infection. The protein stability analysis produced scores of less than 0.6 for all residues of all studied receptors. This confirmed that these receptors are extremely stable proteins, so it is very difficult to disrupt their stability with individual drugs. Hence, we studied combination and tricombination therapy between the bioactive compounds with the best binding affinity and antiviral drugs such as chloroquine, hydroxychloroquine, azithromycin, simeprevir, baloxavir, lopinavir, and favipiravir, to show the effect of such therapy on disrupting the stability of the three targets mentioned above. In addition, an ADMET study suggested that most of the studied bioactive compounds are safe and nontoxic. All results confirmed that caulerpin can be used in combination and tricombination therapy along with the studied antiviral drugs to disrupt the stability of the three major receptors (6LU7, 6VYB, and 1R42).
|
19
|
Timmons PB, Hewage CM. ENNAACT is a novel tool which employs neural networks for anticancer activity classification for therapeutic peptides. Biomed Pharmacother 2020; 133:111051. [PMID: 33254015 DOI: 10.1016/j.biopha.2020.111051] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 08/28/2020] [Revised: 10/08/2020] [Accepted: 11/19/2020] [Indexed: 12/12/2022] Open access
Abstract
The prevalence of cancer as a threat to human life, responsible for 9.6 million deaths worldwide in 2018, motivates the search for new anticancer agents. While many options are currently available for treatment, these are often expensive and impact the human body unfavourably. Anticancer peptides represent a promising emerging field of anticancer therapeutics, characterized by a favourable toxicity profile. The development of accurate in silico methods for anticancer peptide prediction is of paramount importance, as the amount of available sequence data is growing each year. This study leverages advances in machine learning research to produce a novel sequence-based deep neural network classifier for anticancer peptide activity. The classifier achieves performance comparable to the best-in-class, with a cross-validated accuracy of 98.3%, Matthews correlation coefficient of 0.91 and an Area Under the Curve of 0.95. This innovative classifier is available as a web server at https://research.timmons.eu/ennaact, facilitating in silico screening and design of new anticancer peptide chemotherapeutics by the research community.
Affiliation(s)
- Patrick Brendan Timmons
  - UCD School of Biomolecular and Biomedical Science, UCD Centre for Synthesis and Chemical Biology, UCD Conway Institute, University College Dublin, Dublin 4, Ireland
- Chandralal M Hewage
  - UCD School of Biomolecular and Biomedical Science, UCD Centre for Synthesis and Chemical Biology, UCD Conway Institute, University College Dublin, Dublin 4, Ireland
|
20
|
Dashti A, Mashayekhi G, Shekhar M, Ben Hail D, Salah S, Schwander P, des Georges A, Singharoy A, Frank J, Ourmazd A. Retrieving functional pathways of biomolecules from single-particle snapshots. Nat Commun 2020; 11:4734. [PMID: 32948759 PMCID: PMC7501871 DOI: 10.1038/s41467-020-18403-x] [Citation(s) in RCA: 60] [Impact Index Per Article: 15.0] [Received: 04/02/2019] [Accepted: 08/17/2020] [Indexed: 11/18/2022] Open access
Abstract
A primary reason for the intense interest in structural biology is the fact that knowledge of structure can elucidate macromolecular functions in living organisms. Sustained effort has resulted in an impressive arsenal of tools for determining the static structures. But under physiological conditions, macromolecules undergo continuous conformational changes, a subset of which are functionally important. Techniques for capturing the continuous conformational changes underlying function are essential for further progress. Here, we present chemically-detailed conformational movies of biological function, extracted data-analytically from experimental single-particle cryo-electron microscopy (cryo-EM) snapshots of ryanodine receptor type 1 (RyR1), a calcium-activated calcium channel engaged in the binding of ligands. The functional motions differ substantially from those inferred from static structures in the nature of conformationally active structural domains, the sequence and extent of conformational motions, and the way allosteric signals are transduced within and between domains. Our approach highlights the importance of combining experiment, advanced data analysis, and molecular simulations.
Affiliation(s)
- Ali Dashti
  - Department of Physics, University of Wisconsin Milwaukee, 3135 N. Maryland Ave, Milwaukee, WI, 53211, USA
- Ghoncheh Mashayekhi
  - Department of Physics, University of Wisconsin Milwaukee, 3135 N. Maryland Ave, Milwaukee, WI, 53211, USA
- Mrinal Shekhar
  - Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, 405 N. Mathews Ave., Urbana, IL, 61801, USA
  - School of Molecular Sciences, Center for Applied Structural Discovery, Arizona State University, Tempe, AZ, 85287, USA
- Danya Ben Hail
  - Structural Biology Initiative, CUNY Advanced Science Research Center, City University of New York, New York, NY, 10031, USA
- Salah Salah
  - Structural Biology Initiative, CUNY Advanced Science Research Center, City University of New York, New York, NY, 10031, USA
  - Department of Chemistry & Biochemistry, City College of New York, New York, NY, 10031, USA
  - Ph.D. Programs in Physics, Chemistry & Biochemistry, The Graduate Center of the City University of New York, New York, NY, 10016, USA
- Peter Schwander
  - Department of Physics, University of Wisconsin Milwaukee, 3135 N. Maryland Ave, Milwaukee, WI, 53211, USA
- Amedee des Georges
  - Structural Biology Initiative, CUNY Advanced Science Research Center, City University of New York, New York, NY, 10031, USA
  - Department of Chemistry & Biochemistry, City College of New York, New York, NY, 10031, USA
  - Ph.D. Programs in Physics, Chemistry & Biochemistry, The Graduate Center of the City University of New York, New York, NY, 10016, USA
- Abhishek Singharoy
  - School of Molecular Sciences, Center for Applied Structural Discovery, Arizona State University, Tempe, AZ, 85287, USA
- Joachim Frank
  - Department of Biochemistry and Molecular Biophysics, Columbia University, 2-221 Black Building, 650 West 168th Street, New York, NY, 10032, USA
  - Department of Biological Sciences, Columbia University, 600 Fairchild Center, New York, NY, 10027, USA
- Abbas Ourmazd
  - Department of Physics, University of Wisconsin Milwaukee, 3135 N. Maryland Ave, Milwaukee, WI, 53211, USA
|
21
|
Sun Z, Huang S, Zheng L, Liang P, Yang W, Zuo Y. ICTC-RAAC: An improved web predictor for identifying the types of ion channel-targeted conotoxins by using reduced amino acid cluster descriptors. Comput Biol Chem 2020; 89:107371. [PMID: 32950852 DOI: 10.1016/j.compbiolchem.2020.107371] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 07/07/2020] [Revised: 09/01/2020] [Accepted: 09/02/2020] [Indexed: 12/27/2022]
Abstract
Conotoxins are small, disulfide-rich peptide toxins with uniquely diverse sequences. Correctly identifying the types of ion channel-targeted conotoxins is important because they are considered optimal pharmacological candidates in drug design, owing to their ability to bind specifically to ion channels and interfere with neural transmission. Compared with other feature extraction methods, the reduced amino acid cluster (RAAC) is better at simplifying protein complexity and identifying functionally conserved regions. Thus, in our study, 673 RAACs generated from 74 types of reduced amino acid alphabet were comprehensively assessed to establish a state-of-the-art predictor for ion channel-targeted conotoxins. The results showed that Type 20, Cluster 9 (T = 20, C = 9) in the tripeptide composition (N = 3) achieved the best accuracy, 89.3%; this type is based on an amino acid reduction algorithm that uses variance maximization. Further, ANOVA with incremental feature selection (IFS) was used for feature selection to improve prediction performance. Finally, the cross-validation results showed that the best overall accuracy was 96.4%, which is 1.8% higher than the best accuracy of previous studies. Based on the proposed predictor, a user-friendly web server was established and can be accessed at http://bioinfor.imu.edu.cn/ictcraac.
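To illustrate the kind of feature this abstract describes, the sketch below maps a peptide sequence onto a reduced amino acid alphabet and computes its tripeptide (N = 3) composition. The five-cluster grouping, the function name, and the example sequence are assumptions chosen for illustration. They are not one of the 74 reduction types evaluated in the paper.

```python
from collections import Counter
from itertools import product

# Hypothetical 5-cluster reduction of the 20 standard amino acids
# (illustrative only; not a scheme taken from the ICTC-RAAC paper).
CLUSTERS = ["AVLIMC", "FWYH", "STNQ", "KR", "DEGP"]
REDUCE = {aa: str(i) for i, group in enumerate(CLUSTERS) for aa in group}
ALPHABET = "".join(str(i) for i in range(len(CLUSTERS)))


def reduced_composition(seq, n=3):
    """Return the n-peptide composition of seq in the reduced alphabet.

    Every possible reduced n-mer gets an entry, so the feature vector has
    len(CLUSTERS) ** n dimensions. Non-standard residues raise KeyError.
    """
    reduced = "".join(REDUCE[aa] for aa in seq.upper())
    counts = Counter(reduced[i:i + n] for i in range(len(reduced) - n + 1))
    total = sum(counts.values())
    keys = ["".join(p) for p in product(ALPHABET, repeat=n)]
    if total == 0:  # sequence shorter than n residues
        return {k: 0.0 for k in keys}
    return {k: counts[k] / total for k in keys}


# Hypothetical six-residue example; it reduces to the string "004331".
features = reduced_composition("ACDKRW")
```

With five clusters the tripeptide feature space has 125 dimensions instead of the 8000 needed for the full 20-letter alphabet, which is the kind of simplification a reduced alphabet is meant to provide.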
Affiliation(s)
- Zijie Sun
  - State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, 010070, China; School of Mathematical Sciences, Inner Mongolia University, Hohhot, 010021, China
- Shenghui Huang
  - State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, 010070, China
- Lei Zheng
  - State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, 010070, China
- Pengfei Liang
  - State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, 010070, China
- Wuritu Yang
  - State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, 010070, China
- Yongchun Zuo
  - State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, 010070, China
|
22
|
Khanppnavar B, Roy A, Chandra K, Uversky VN, Maiti NC, Datta S. Deciphering the structural intricacy in virulence effectors for proton-motive force mediated unfolding in type-III protein secretion. Int J Biol Macromol 2020; 159:18-33. [DOI: 10.1016/j.ijbiomac.2020.04.266] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/16/2020] [Revised: 04/28/2020] [Accepted: 04/29/2020] [Indexed: 10/24/2022]
|
23
|
Li X, Tang Q, Tang H, Chen W. Identifying Antioxidant Proteins by Combining Multiple Methods. Front Bioeng Biotechnol 2020; 8:858. [PMID: 32793581 PMCID: PMC7391787 DOI: 10.3389/fbioe.2020.00858] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 05/30/2020] [Accepted: 07/03/2020] [Indexed: 11/13/2022] Open access
Abstract
Antioxidant proteins play important roles in preventing free radical oxidation from damaging cells and DNA. They have become ideal candidates for disease prevention and treatment. Therefore, it is urgent to identify antioxidants from natural compounds. Since experimental methods are still not cost-effective, a series of computational methods have been proposed to identify antioxidant proteins. However, the performance of the current methods is still not satisfactory. In this study, a support vector machine-based method, called Vote9, was proposed to identify antioxidants, in which the sequences were encoded using the features generated from 9 optimal individual models. Results from the jackknife test demonstrated that Vote9 is comparable with the best of the existing predictors for this task. We hope that Vote9 will become a useful tool, or at least play a complementary role to the existing methods for identifying antioxidants.
Affiliation(s)
- Xianhai Li
  - School of Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu, China; Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu, China
- Qiang Tang
  - Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu, China
- Hua Tang
  - Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu, China
- Wei Chen
  - School of Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu, China; Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu, China; School of Life Sciences, Center for Genomics and Computational Biology, North China University of Science and Technology, Tangshan, China
|
24
|
Ahmed SA, Abdelrheem DA, El-Mageed HRA, Mohamed HS, Rahman AA, Elsayed KNM, Ahmed SA. Destabilizing the structural integrity of COVID-19 by caulerpin and its derivatives along with some antiviral drugs: An in silico approaches for a combination therapy. Struct Chem 2020; 31:2391-2412. [PMID: 32837118 PMCID: PMC7376526 DOI: 10.1007/s11224-020-01586-w] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 04/21/2020] [Accepted: 07/06/2020] [Indexed: 12/17/2022]
Abstract
Presently, the SARS-CoV-2 (COVID-19) pandemic has been spreading throughout the world. Some drugs such as lopinavir, simeprevir, hydroxychloroquine, chloroquine, and amprenavir have been recommended for COVID-19 treatment by some researchers, but these drugs were not effective enough against this virus. This in silico study aimed to increase the anti-COVID-19 activity of these drugs by using caulerpin and its derivatives as adjunct drugs against SARS-CoV-2 receptor proteins: the SARS-CoV-2 main protease and the SARS-CoV-2 spike protein. Caulerpin has exhibited antiviral activity against chikungunya virus and herpes simplex virus type 1. Caulerpin and some of its derivatives have shown inhibitory activity against Alzheimer’s disease. The web server ANCHOR revealed high protein stability for the two receptors, with disorder scores (< 0.6). Molecular docking analysis showed that the binding energies of most of the caulerpin derivatives were higher than those of all the suggested drugs for the two receptors. We also deduced that inserting NH2, halogen, and vinyl groups can increase the binding affinity of caulerpin toward 6VYB and 6LU7, while inserting an alkyl group decreases it. Thus, the inhibitory effect of caulerpin against 6VYB and 6LU7 can be modified by inserting NH2, halogen, and vinyl groups. Based on the protein disorder results, the SARS-CoV-2 main protease and SARS-CoV-2 spike protein domain are highly stable proteins, so it is quite difficult to destabilize them using individual drugs. Also, molecular dynamics (MD) simulation indicated that binding of the combination of simeprevir and the candidate compounds to the receptors was stable and had no major effect on the flexibility of the protein throughout the simulations, which provided a suitable basis for our study. Overall, this study suggested that caulerpin and its derivatives could be used as a combination therapy along with lopinavir, simeprevir, hydroxychloroquine, chloroquine, and amprenavir to disrupt the stability of SARS-CoV-2 receptor proteins and increase the antiviral activity of these drugs.
Affiliation(s)
- Shimaa A Ahmed
  - Department of Chemistry, Faculty of Science, Beni-Suef University, Beni Suef, 62511 Egypt
- Doaa A Abdelrheem
  - Department of Chemistry, Faculty of Science, Beni-Suef University, Beni Suef, 62511 Egypt
- H R Abd El-Mageed
  - Micro-analysis and Environmental Research and Community Services Center, Faculty of Science, Beni-Suef University, Beni Suef, Egypt
- Hussein S Mohamed
  - Research Institute of Medicinal and Aromatic Plants (RIMAP), Beni-Suef University, Beni Suef, Egypt
- Aziz A Rahman
  - Department of Pharmacy, University of Rajshahi, Rajshahi, 6205 Bangladesh
- Khaled N M Elsayed
  - Department of Botany, Faculty of Science, Beni-Suef University, Beni-Suef, 62511 Egypt
- Sayed A Ahmed
  - Department of Chemistry, Faculty of Science, Beni-Suef University, Beni Suef, 62511 Egypt
|
25
|
Timmons PB, Hewage CM. HAPPENN is a novel tool for hemolytic activity prediction for therapeutic peptides which employs neural networks. Sci Rep 2020; 10:10869. [PMID: 32616760 PMCID: PMC7331684 DOI: 10.1038/s41598-020-67701-3] [Citation(s) in RCA: 42] [Impact Index Per Article: 10.5] [Received: 03/03/2020] [Accepted: 06/09/2020] [Indexed: 12/11/2022] Open access
Abstract
The growing prevalence of resistance to antibiotics motivates the search for new antibacterial agents. Antimicrobial peptides are a diverse class of well-studied membrane-active peptides which function as part of the innate host defence system, and form a promising avenue in antibiotic drug research. Some antimicrobial peptides exhibit toxicity against eukaryotic membranes, typically characterised by hemolytic activity assays, but currently, the understanding of what differentiates hemolytic and non-hemolytic peptides is limited. This study leverages advances in machine learning research to produce a novel artificial neural network classifier for the prediction of hemolytic activity from a peptide's primary sequence. The classifier achieves best-in-class performance, with cross-validated accuracy of [Formula: see text] and Matthews correlation coefficient of 0.71. This innovative classifier is available as a web server at https://research.timmons.eu/happenn, allowing the research community to utilise it for in silico screening of peptide drug candidates for high therapeutic efficacies.
Affiliation(s)
- Patrick Brendan Timmons
  - UCD School of Biomolecular and Biomedical Science, UCD Centre for Synthesis and Chemical Biology, UCD Conway Institute, University College Dublin, Dublin 4, Ireland
- Chandralal M Hewage
  - UCD School of Biomolecular and Biomedical Science, UCD Centre for Synthesis and Chemical Biology, UCD Conway Institute, University College Dublin, Dublin 4, Ireland
|
26
|
Saikia S, Bordoloi M. Molecular Docking: Challenges, Advances and its Use in Drug Discovery Perspective. Curr Drug Targets 2020; 20:501-521. [PMID: 30360733 DOI: 10.2174/1389450119666181022153016] [Citation(s) in RCA: 203] [Impact Index Per Article: 50.8] [Received: 05/25/2018] [Revised: 06/08/2018] [Accepted: 08/28/2018] [Indexed: 01/21/2023]
Abstract
Molecular docking is a process through which small molecules are docked into macromolecular structures to score their complementarity at the binding sites. It is a vibrant research area with broad utility in structure-based drug design, lead optimization, and biochemical pathway studies, and it is among the most attractive tools for drug design. The two pillars of a successful docking experiment are correct pose prediction and affinity prediction. Each program has its own advantages and drawbacks with respect to docking accuracy, ranking accuracy, and time consumption, so a general conclusion cannot be drawn. Moreover, users do not always consider sufficient diversity in their test sets, which can lead certain programs to outperform others. This review focuses on the challenges of docking and troubleshooting in existing programs, the underlying algorithmic background of docking, preferences regarding the use of docking programs for best results illustrated with examples, a comparison of the performance of existing tools and algorithms, the state of the art in docking, recent trends in diseases and the current drug industry, and evidence from clinical trials and post-marketing surveillance. These aspects of the molecular drug design paradigm are quite controversial and challenging, and this review should be an asset to the bioinformatics and drug design communities.
Affiliation(s)
- Surovi Saikia
  - Natural Products Chemistry Group, CSIR North East Institute of Science & Technology, Jorhat-785006, Assam, India
- Manobjyoti Bordoloi
  - Natural Products Chemistry Group, CSIR North East Institute of Science & Technology, Jorhat-785006, Assam, India
|
27
|
Qiu L, Zou X. Scoring Functions for Protein-RNA Complex Structure Prediction: Advances, Applications, and Future Directions. Communications in Information and Systems 2020; 20:1-22. [PMID: 33867869 DOI: 10.4310/cis.2020.v20.n1.a1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/22/2022]
Abstract
Protein-RNA interaction is among the most essential biological events in living cells, being involved in protein synthesis, RNA processing and transport, DNA transcription, regulation of gene expression, and many other critical biomolecular activities. A thorough understanding of this interaction is of paramount importance for the fundamental study of a variety of vital cellular processes and for therapeutic applications against a broad range of diseases. Experimental high-resolution 3D structure determination is the primary source of knowledge for protein-RNA complexes. However, due to technical limitations, the existing techniques for experimental structure determination cannot meet the demand from fast-growing interest in academia and industry. This problem necessitates alternative high-throughput computational methods for protein-RNA complex structure prediction. Similar to the in silico methods used for protein-protein and protein-DNA interactions, a reliable prediction of protein-RNA complex structure requires a scoring function with commensurate discriminatory power. Derived from determined structures and intended to predict the to-be-determined structures, the scoring function is not only a predictive tool but also a gauge of our knowledge of protein-RNA interaction. In this review, we present an overview of the status of existing scoring functions and the scientific principles behind their construction, as well as their strengths and limitations. Finally, we discuss future directions of scoring function development for protein-RNA structure prediction.
Affiliation(s)
- Liming Qiu
  - Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri 65211
- Xiaoqin Zou
  - Dalton Cardiovascular Research Center, University of Missouri, Columbia, Missouri 65211
  - Department of Physics & Astronomy, University of Missouri, Columbia, Missouri 65211
  - Department of Biochemistry, University of Missouri, Columbia, Missouri 65211
  - Informatics Institute, University of Missouri, Columbia, Missouri 65211

28. Laine E, Karami Y, Carbone A. GEMME: a simple and fast global epistatic model predicting mutational effects. Mol Biol Evol 2019; 36:2604-2619. [PMID: 31406981 PMCID: PMC6805226 DOI: 10.1093/molbev/msz179] [Citation(s) in RCA: 57] [Impact Index Per Article: 11.4] [Received: 02/14/2019] [Revised: 06/03/2019] [Accepted: 08/02/2019] [Indexed: 12/15/2022] Open
Abstract
The systematic and accurate description of protein mutational landscapes is a question of utmost importance in biology, bioengineering, and medicine. Recent progress has been achieved by leveraging on the increasing wealth of genomic data and by modeling intersite dependencies within biological sequences. However, state-of-the-art methods remain time consuming. Here, we present Global Epistatic Model for predicting Mutational Effects (GEMME) (www.lcqb.upmc.fr/GEMME), an original and fast method that predicts mutational outcomes by explicitly modeling the evolutionary history of natural sequences. This allows accounting for all positions in a sequence when estimating the effect of a given mutation. GEMME uses only a few biologically meaningful and interpretable parameters. Assessed against 50 high- and low-throughput mutational experiments, it overall performs similarly or better than existing methods. It accurately predicts the mutational landscapes of a wide range of protein families, including viral ones and, more generally, of much conserved families. Given an input alignment, it generates the full mutational landscape of a protein in a matter of minutes. It is freely available as a package and a webserver at www.lcqb.upmc.fr/GEMME/.
Affiliation(s)
- Elodie Laine
  - Sorbonne Université, UPMC University Paris 06, CNRS, IBPS, UMR 7238, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), 75005 Paris, France
- Yasaman Karami
  - Sorbonne Université, UPMC University Paris 06, CNRS, IBPS, UMR 7238, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), 75005 Paris, France
  - Sorbonne Université, UPMC-Univ P6, Institut du Calcul et de la Simulation
- Alessandra Carbone
  - Sorbonne Université, UPMC University Paris 06, CNRS, IBPS, UMR 7238, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), 75005 Paris, France
  - Institut Universitaire de France

29. Mészáros B, Erdos G, Dosztányi Z. IUPred2A: context-dependent prediction of protein disorder as a function of redox state and protein binding. Nucleic Acids Res 2019; 46:W329-W337. [PMID: 29860432 PMCID: PMC6030935 DOI: 10.1093/nar/gky384] [Citation(s) in RCA: 868] [Impact Index Per Article: 173.6] [Received: 02/26/2018] [Accepted: 05/11/2018] [Indexed: 01/31/2023] Open
Abstract
The structural states of proteins include ordered globular domains as well as intrinsically disordered protein regions (IDRs) that exist as highly flexible conformational ensembles in isolation. Various computational tools have been developed to discriminate ordered and disordered segments based on the amino acid sequence. However, properties of IDRs can also depend on various conditions, including binding to globular protein partners or environmental factors, such as redox potential. These cases provide further challenges for the computational characterization of disordered segments. In this work we present IUPred2A, a combined web interface that generates energy-estimation-based predictions for ordered and disordered residues with IUPred2 and for disordered binding regions with ANCHOR2. The updated web server retains the robustness of the original programs but offers several new features. While only minor bug fixes are implemented for IUPred, the next version of ANCHOR is significantly improved through a new architecture and parameters optimized on novel datasets. In addition, redox-sensitive regions can also be highlighted through a novel experimental feature. The web server offers graphical and text outputs, a RESTful interface, access to software download and extensive help, and can be accessed at a new location: http://iupred2a.elte.hu.
Affiliation(s)
- Bálint Mészáros
  - MTA-ELTE Momentum Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest H-1117, Hungary
- Gábor Erdos
  - MTA-ELTE Momentum Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest H-1117, Hungary
- Zsuzsanna Dosztányi
  - MTA-ELTE Momentum Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest H-1117, Hungary

30. Yan Y, Wen Z, Zhang D, Huang SY. Determination of an effective scoring function for RNA-RNA interactions with a physics-based double-iterative method. Nucleic Acids Res 2019; 46:e56. [PMID: 29506237 PMCID: PMC5961370 DOI: 10.1093/nar/gky113] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 10/22/2017] [Accepted: 02/08/2018] [Indexed: 11/15/2022] Open
Abstract
RNA–RNA interactions play fundamental roles in gene and cell regulation. Therefore, accurate prediction of RNA–RNA interactions is critical to determine their complex structures and understand the molecular mechanism of the interactions. Here, we have developed a physics-based double-iterative strategy to determine the effective potentials for RNA–RNA interactions based on a training set of 97 diverse RNA–RNA complexes. The double-iterative strategy circumvented the reference state problem in knowledge-based scoring functions by updating the potentials through iteration and also overcame the decoy-dependent limitation in previous iterative methods by constructing the decoys iteratively. The derived scoring function, which is referred to as DITScoreRR, was evaluated on an RNA–RNA docking benchmark of 60 test cases and compared with three other scoring functions. It was shown that for bound docking, our scoring function DITScoreRR obtained the excellent success rates of 90% and 98.3% in binding mode predictions when the top 1 and 10 predictions were considered, compared to 63.3% and 71.7% for van der Waals interactions, 45.0% and 65.0% for ITScorePP, and 11.7% and 26.7% for ZDOCK 2.1, respectively. For unbound docking, DITScoreRR achieved the good success rates of 53.3% and 71.7% in binding mode predictions when the top 1 and 10 predictions were considered, compared to 13.3% and 28.3% for van der Waals interactions, 11.7% and 26.7% for our ITScorePP, and 3.3% and 6.7% for ZDOCK 2.1, respectively. DITScoreRR also performed significantly better in ranking decoys and obtained significantly higher score-RMSD correlations than the other three scoring functions. DITScoreRR will be of great value for the prediction and design of RNA structures and RNA–RNA complexes.
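The top-1 and top-10 success rates quoted above can be computed from ranked docking output roughly as in the following minimal sketch. It assumes each test case supplies the RMSD values of its decoys already sorted by score (best first) and that a prediction counts as a success when its RMSD falls at or below a cutoff. The 5.0 Å default cutoff, the function name, and the toy data are illustrative assumptions, not values taken from the paper.

```python
from typing import Sequence


def top_n_success_rate(
    ranked_rmsds: Sequence[Sequence[float]],
    n: int,
    rmsd_cutoff: float = 5.0,  # assumed cutoff in angstroms, not from the paper
) -> float:
    """Fraction of cases with at least one near-native pose among the top n."""
    if not ranked_rmsds:
        raise ValueError("need at least one test case")
    hits = sum(
        1
        for rmsds in ranked_rmsds
        if any(r <= rmsd_cutoff for r in rmsds[:n])
    )
    return hits / len(ranked_rmsds)


# Toy example: three cases, RMSDs listed in score order (best score first).
cases = [
    [1.2, 8.0, 9.5],     # near-native pose ranked first
    [7.4, 3.1, 10.2],    # near-native pose ranked second
    [12.0, 11.5, 9.8],   # no near-native pose at all
]
top1 = top_n_success_rate(cases, n=1)
top3 = top_n_success_rate(cases, n=3)
```

With the toy data, only the first case succeeds at top 1, while the first two succeed at top 3, which is why success rates can only grow as more predictions are considered.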
Affiliation(s)
- Yumeng Yan
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P.R. China
- Zeyu Wen
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P.R. China
- Di Zhang
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P.R. China
- Sheng-You Huang
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, P.R. China

31. Li M, Cao H, Lai L, Liu Z. Disordered linkers in multidomain allosteric proteins: Entropic effect to favor the open state or enhanced local concentration to favor the closed state? Protein Sci 2019; 27:1600-1610. [PMID: 30019371 DOI: 10.1002/pro.3475] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 03/28/2018] [Revised: 06/12/2018] [Accepted: 06/24/2018] [Indexed: 12/11/2022]
Abstract
There are many multidomain allosteric proteins where an allosteric signal at the allosteric domain modifies the activity of the functional domain. Intrinsically disordered regions (linkers) are widely involved in this kind of regulation process, but the essential role they play therein is not well understood. Here, we investigated the effect of linkers in stabilizing the open or the closed states of multidomain proteins using combined thermodynamic deduction and coarse-grained molecular dynamics simulations. We revealed that the influence of the linker can be fully characterized by an effective local concentration [B]0. When Kd is smaller than [B]0, the closed state is favored, while the open state is preferred when Kd is larger than [B]0. We used four protein systems with markedly different domain-domain binding affinity and structural order/disorder as model systems to understand the relationship between [B]0 and the linker length as well as its flexibility. The linker length is the main practical determinant of [B]0. [B]0 of a flexible linker with 40-60 residues was determined to be in a narrow range of 0.2-0.6 mM, while a linker that is too short or too long would dramatically decrease [B]0. With the revealed [B]0 range, the introduction of a flexible linker makes the regulation of weakly interacting partners possible.
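The threshold rule stated above is what a simple two-state picture would give. As a hedged illustration, assuming the closed/open equilibrium constant equals [B]0/Kd (a common tethered-binding approximation, not necessarily the exact expression derived by the authors), the closed-state population is [B]0/([B]0 + Kd). The function and variable names below are hypothetical.

```python
def closed_fraction(kd_mM: float, b0_mM: float) -> float:
    """Closed-state population under an assumed two-state model.

    Assumes K(closed/open) = [B]0 / Kd, so the closed fraction is
    [B]0 / ([B]0 + Kd). Both quantities must share the same units.
    """
    if kd_mM <= 0 or b0_mM <= 0:
        raise ValueError("Kd and [B]0 must be positive")
    return b0_mM / (b0_mM + kd_mM)


# [B]0 = 0.4 mM lies inside the 0.2-0.6 mM range quoted for flexible linkers.
tight = closed_fraction(kd_mM=0.1, b0_mM=0.4)    # Kd < [B]0: closed favored
weak = closed_fraction(kd_mM=1.6, b0_mM=0.4)     # Kd > [B]0: open favored
balanced = closed_fraction(kd_mM=0.4, b0_mM=0.4)  # Kd = [B]0: even split
```

Under this assumption the crossover sits exactly at Kd = [B]0, where the two states are equally populated.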
Affiliation(s)
- Maodong Li
  - Center for Quantitative Biology, Peking University, Beijing, 100871, China
- Huaiqing Cao
  - College of Chemistry and Molecular Engineering, Peking University, Beijing, 100871, China
- Luhua Lai
  - Center for Quantitative Biology, Peking University, Beijing, 100871, China
  - College of Chemistry and Molecular Engineering, Peking University, Beijing, 100871, China
  - State Key Laboratory for Structural Chemistry of Unstable and Stable Species, Beijing National Laboratory for Molecular Sciences (BNLMS), Peking University, Beijing, 100871, China
- Zhirong Liu
  - Center for Quantitative Biology, Peking University, Beijing, 100871, China
  - College of Chemistry and Molecular Engineering, Peking University, Beijing, 100871, China
  - State Key Laboratory for Structural Chemistry of Unstable and Stable Species, Beijing National Laboratory for Molecular Sciences (BNLMS), Peking University, Beijing, 100871, China

32. Venev SV, Zeldovich KB. Thermophilic Adaptation in Prokaryotes Is Constrained by Metabolic Costs of Proteostasis. Mol Biol Evol 2019; 35:211-224. [PMID: 29106597 PMCID: PMC5850847 DOI: 10.1093/molbev/msx282] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 11/13/2022] Open
Abstract
Prokaryotes evolved to thrive in an extremely diverse set of habitats, and their proteomes bear signatures of environmental conditions. Although correlations between amino acid usage and environmental temperature are well-documented, understanding of the mechanisms of thermal adaptation remains incomplete. Here, we couple the energetic costs of protein folding and protein homeostasis to build a microscopic model explaining both the overall amino acid composition and its temperature trends. Low biosynthesis costs lead to low diversity of physical interactions between amino acid residues, which in turn makes proteins less stable and drives up chaperone activity to maintain appropriate levels of folded, functional proteins. Assuming that the cost of chaperone activity is proportional to the fraction of unfolded client proteins, we simulated thermal adaptation of model proteins subject to minimization of the total cost of amino acid synthesis and chaperone activity. For the first time, we predicted both the proteome-average amino acid abundances and their temperature trends simultaneously, and found strong correlations between model predictions and 402 genomes of bacteria and archaea. The energetic constraint on protein evolution is more apparent in highly expressed proteins, selected by codon adaptation index. We found that in bacteria, highly expressed proteins are similar in composition to thermophilic ones, whereas in archaea no correlation between predicted expression level and thermostability was observed. At the same time, thermal adaptations of highly expressed proteins in bacteria and archaea are nearly identical, suggesting that universal energetic constraints prevail over the phylogenetic differences between these domains of life.
Affiliation(s)
- Sergey V Venev
  - Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, 368 Plantation St, Worcester, MA
- Konstantin B Zeldovich
  - Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, 368 Plantation St, Worcester, MA

33. Yan Z, Wang J. Superfunneled Energy Landscape of Protein Evolution Unifies the Principles of Protein Evolution, Folding, and Design. Physical Review Letters 2019; 122:018103. [PMID: 31012725 DOI: 10.1103/physrevlett.122.018103] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 12/06/2017] [Revised: 11/08/2018] [Indexed: 06/09/2023]
Abstract
Evolution is essential for shaping the biological functions. Darwin proposed the selection as the driving force for evolution upon mutations. While mutations are clear, the quantification of the selection force is still challenging. In this study, we identified and quantified both thermodynamic stability and kinetic accessibility as the selection forces for protein evolution. The protein evolution can be viewed and quantified as a trajectory moving along a superfunneled energy landscape with a line attractor at the bottom. The resulting evolved sequences and structures show strong protein characteristics including the hydrophobic core, high designability, and fast folding. The evolution principle uncovered here is validated on real proteins and sheds light on the protein design.
Affiliation(s)
- Zhiqiang Yan
  - State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, Jilin 130022, China
- Jin Wang
  - State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, Jilin 130022, China
  - Department of Chemistry & Physics, State University of New York at Stony Brook, Stony Brook, New York 11790, USA

34. Aleksandrov A, Myllykallio H. Advances and challenges in drug design against tuberculosis: application of in silico approaches. Expert Opin Drug Discov 2018; 14:35-46. [PMID: 30477360 DOI: 10.1080/17460441.2019.1550482] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 01/27/2023]
Abstract
Introduction: Tuberculosis (TB) caused by Mycobacterium tuberculosis (Mtb) remains the deadliest infectious disease in the world with one-third of the world's population thought to be infected. Over the years, TB mortality rate has been largely reduced; however, this progress has been threatened by the increasing appearance of multidrug-resistant Mtb. Considerable recent efforts have been undertaken to develop new generation antituberculosis drugs. Many of these attempts have relied on in silico approaches, which have emerged recently as powerful tools complementary to biochemical attempts. Areas covered: The authors review the status of pharmaceutical drug development against TB with a special emphasis on computational work. They focus on those studies that have been validated by in vitro and/or in vivo experiments, and thus, that can be considered as successful. The major goals of this review are to present target protein systems, to highlight how in silico efforts complement experiments, and to aid future drug design endeavors. Expert opinion: Despite having access to all of the gene and protein sequences of Mtb, the search for new optimal treatments against this deadly pathogen is still ongoing. Together with the geometric growth of protein structural and sequence databases, computational methods have become a powerful technique accelerating the successful identification of new ligands.
Affiliation(s)
- Alexey Aleksandrov
  - Laboratoire d'Optique et Biosciences (CNRS UMR7645, INSERM U1182), Ecole Polytechnique, Palaiseau, France
- Hannu Myllykallio
  - Laboratoire d'Optique et Biosciences (CNRS UMR7645, INSERM U1182), Ecole Polytechnique, Palaiseau, France

35. Anishchenko I, Kundrotas PJ, Vakser IA. Contact Potential for Structure Prediction of Proteins and Protein Complexes from Potts Model. Biophys J 2018; 115:809-821. [PMID: 30122295 DOI: 10.1016/j.bpj.2018.07.035] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 03/07/2018] [Revised: 07/16/2018] [Accepted: 07/31/2018] [Indexed: 12/18/2022] Open
Abstract
The energy function is the key component of protein modeling methodology. This work presents a semianalytical approach to the development of contact potentials for protein structure modeling. Residue-residue and atom-atom contact energies were derived by maximizing the probability of observing native sequences in a nonredundant set of protein structures. The optimization task was formulated as an inverse statistical mechanics problem applied to the Potts model. Its solution by pseudolikelihood maximization provides consistent estimates of coupling constants at atomic and residue levels. The best performance was achieved when interacting atoms were grouped according to their physicochemical properties. For individual protein structures, the performance of the contact potentials in distinguishing near-native structures from the decoys is similar to the top-performing scoring functions. The potentials also yielded significant improvement in the protein docking success rates. The potentials recapitulated experimentally determined protein stability changes upon point mutations and protein-protein binding affinities. The approach offers a different perspective on knowledge-based potentials and may serve as the basis for their further development.
Affiliation(s)
- Ivan Anishchenko
  - Computational Biology Program and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas
- Petras J Kundrotas
  - Computational Biology Program and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas
- Ilya A Vakser
  - Computational Biology Program and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas

36. Seligmann H. Protein Sequences Recapitulate Genetic Code Evolution. Comput Struct Biotechnol J 2018; 16:177-189. [PMID: 30002789 PMCID: PMC6040577 DOI: 10.1016/j.csbj.2018.05.001] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 01/17/2018] [Revised: 05/14/2018] [Accepted: 05/17/2018] [Indexed: 12/16/2022] Open
Abstract
Several hypotheses predict ranks of amino acid assignments to the genetic code's codons. Analyses here show that average positions of amino acid species in proteins correspond to assignment ranks, in particular as predicted by Jukes' neutral mutation hypothesis for codon assignments. In all tested protein groups, including co- and post-translationally folding proteins, 'recent' amino acids are on average closer to gene 5' extremities than 'ancient' ones. Analyses of pairwise residue contact energy matrices suggest that early amino acids stereochemically selected late ones that stabilize residue interactions within protein cores, presumably producing 5'-late-to-3'-early amino acid protein sequence gradients. The gradient might reduce protein misfolding, also after mutations, extending principles of neutral mutations to protein folding. Presumably, in self-perpetuating and self-correcting systems like the genetic code, initial conditions produce similarities between evolution of the process (the genetic code) and 'ontogeny' of resulting structures (here proteins), producing apparent teleonomy between process and product.
Affiliation(s)
- Hervé Seligmann
  - Unité de Recherche sur les Maladies Infectieuses et Tropicales Emergentes, UMR MEPHI, Aix-Marseille Université, IRD, Assistance Publique-Hôpitaux de Marseille, Institut Hospitalo-Universitaire Méditerranée-Infection, 19-21 boulevard Jean Moulin, 13005 Marseille, France

37. Zhang D, Chen SJ. IsRNA: An Iterative Simulated Reference State Approach to Modeling Correlated Interactions in RNA Folding. J Chem Theory Comput 2018; 14:2230-2239. [PMID: 29499114 DOI: 10.1021/acs.jctc.7b01228] [Citation(s) in RCA: 39] [Impact Index Per Article: 6.5] [Indexed: 11/30/2022]
Abstract
Coarse-grained RNA folding models hold great promise for RNA structure prediction. A key component in a coarse-grained folding model is the force field. One of the challenges in the coarse-grained force field calculation is how to treat the correlation between the different degrees of freedom. Here, we describe a new approach (IsRNA) to extract the correlated energy functions from the known structures. Through iterative molecular dynamics simulations, we build the correlation effects into the reference states, from which we extract the energy functions. The validity of IsRNA is supported by the close agreement between the simulated Boltzmann-like probability distributions for all the structure parameters and those observed from the experimentally determined structures. The correlated energy functions derived here may provide a new tool for RNA 3D structure prediction.
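As a rough illustration of extracting an energy function from observed and reference-state distributions, the sketch below applies a generic Boltzmann inversion, U(x) = -kT ln[P_obs(x)/P_ref(x)], to binned counts. In IsRNA the reference distribution comes from iterative simulations; here it is simply a supplied histogram, and the function name, the pseudocount, and the kT value are assumptions made for the example rather than details of the published method.

```python
import math
from typing import List, Sequence

KT_KCAL_PER_MOL = 0.593  # approximate kT near 298 K, assumed for illustration


def boltzmann_inversion(
    observed: Sequence[float],
    reference: Sequence[float],
    kt: float = KT_KCAL_PER_MOL,
    pseudocount: float = 1e-6,  # avoids log(0) for empty bins (assumption)
) -> List[float]:
    """Per-bin energy U = -kT * ln(P_obs / P_ref) from two histograms."""
    if len(observed) != len(reference):
        raise ValueError("histograms must have the same number of bins")
    obs_total = sum(observed) + pseudocount * len(observed)
    ref_total = sum(reference) + pseudocount * len(reference)
    energies = []
    for o, r in zip(observed, reference):
        p_obs = (o + pseudocount) / obs_total
        p_ref = (r + pseudocount) / ref_total
        energies.append(-kt * math.log(p_obs / p_ref))
    return energies


# Toy histograms: the middle bin is enriched in the known structures.
energies = boltzmann_inversion(observed=[10, 60, 30], reference=[30, 40, 30])
```

Bins that are over-represented relative to the reference receive negative (favorable) energies, under-represented bins receive positive ones, and bins that match the reference contribute roughly zero.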
Affiliation(s)
- Dong Zhang
  - Department of Physics, Department of Biochemistry, and MU Informatics Institute, University of Missouri, Columbia, Missouri 65211, United States
- Shi-Jie Chen
  - Department of Physics, Department of Biochemistry, and MU Informatics Institute, University of Missouri, Columbia, Missouri 65211, United States

38. Murugan R, Buchauer L, Triller G, Kreschel C, Costa G, Pidelaserra Martí G, Imkeller K, Busse CE, Chakravarty S, Sim BKL, Hoffman SL, Levashina EA, Kremsner PG, Mordmüller B, Höfer T, Wardemann H. Clonal selection drives protective memory B cell responses in controlled human malaria infection. Sci Immunol 2018; 3:3/20/eaap8029. [DOI: 10.1126/sciimmunol.aap8029] [Citation(s) in RCA: 109] [Impact Index Per Article: 18.2] [Received: 08/29/2017] [Accepted: 11/30/2017] [Indexed: 01/20/2023]

39. Xu X, Huang M, Zou X. Docking-based inverse virtual screening: methods, applications, and challenges. Biophysics Reports 2018; 4:1-16. [PMID: 29577065 PMCID: PMC5860130 DOI: 10.1007/s41048-017-0045-8] [Citation(s) in RCA: 80] [Impact Index Per Article: 13.3] [Received: 07/27/2017] [Accepted: 09/08/2017] [Indexed: 01/09/2023] Open
Abstract
Identifying potential protein targets for a small-compound ligand query is crucial to the process of drug development. However, there are tens of thousands of proteins in human alone, and it is almost impossible to scan all the existing proteins for a query ligand using current experimental methods. Recently, a computational technology called docking-based inverse virtual screening (IVS) has attracted much attention. In docking-based IVS, a panel of proteins is screened by a molecular docking program to identify potential targets for a query ligand. Ever since the first paper describing a docking-based IVS program was published about a decade ago, the approach has been gradually improved and utilized for a variety of purposes in the field of drug discovery. In this article, the methods employed in docking-based IVS are reviewed in detail, including target databases, docking engines, and scoring function methodologies. Several web servers developed for non-expert users are also reviewed. Then, a number of applications are presented according to different research purposes, such as target identification, side effects/toxicity, drug repositioning, drug-target network development, and receptor design. The review concludes by discussing the challenges that docking-based IVS needs to overcome to become a robust tool for pharmaceutical engineering.
Affiliation(s)
- Xianjin Xu
  - Dalton Cardiovascular Research Center, University of Missouri, Columbia, MO 65211 USA
  - Department of Physics and Astronomy, University of Missouri, Columbia, MO 65211 USA
  - Informatics Institute, University of Missouri, Columbia, MO 65211 USA
  - Department of Biochemistry, University of Missouri, Columbia, MO 65211 USA
- Marshal Huang
  - Dalton Cardiovascular Research Center, University of Missouri, Columbia, MO 65211 USA
  - Informatics Institute, University of Missouri, Columbia, MO 65211 USA
- Xiaoqin Zou
  - Dalton Cardiovascular Research Center, University of Missouri, Columbia, MO 65211 USA
  - Department of Physics and Astronomy, University of Missouri, Columbia, MO 65211 USA
  - Informatics Institute, University of Missouri, Columbia, MO 65211 USA
  - Department of Biochemistry, University of Missouri, Columbia, MO 65211 USA

40. Computational Methods for Efficient Sampling of Protein Landscapes and Disclosing Allosteric Regions. Computational Molecular Modelling in Structural Biology 2018; 113:33-63. [DOI: 10.1016/bs.apcsb.2018.06.001] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 12/17/2022]

41. Dosztányi Z. Prediction of protein disorder based on IUPred. Protein Sci 2017; 27:331-340. [PMID: 29076577 DOI: 10.1002/pro.3334] [Citation(s) in RCA: 119] [Impact Index Per Article: 17.0] [Received: 08/25/2017] [Revised: 10/25/2017] [Accepted: 10/25/2017] [Indexed: 12/19/2022]
Abstract
Many proteins contain intrinsically disordered regions (IDRs), functional polypeptide segments that in isolation adopt a highly flexible conformational ensemble instead of a single, well-defined structure. Disorder prediction methods, which can discriminate ordered and disordered regions from the amino acid sequence, have contributed significantly to our current understanding of the distinct properties of intrinsically disordered proteins by enabling the characterization of individual examples as well as large-scale analyses of these protein regions. One popular method, IUPred, provides a robust prediction of protein disorder based on an energy estimation approach that captures the fundamental difference between the biophysical properties of ordered and disordered regions. This paper reviews the energy estimation method underlying IUPred and the basic properties of the web server. Through an example, it also illustrates how the prediction output can be interpreted in a more complex case by taking into account the heterogeneous nature of IDRs. Various applications that benefited from IUPred to provide improved disorder predictions, complement domain annotations, and aid the identification of functional short linear motifs are also described here. IUPred is freely available for noncommercial users through the web server (http://iupred.enzim.hu and http://iupred.elte.hu). The program can also be downloaded and installed locally for large-scale analyses.
Affiliation(s)
- Zsuzsanna Dosztányi
  - MTA-ELTE Lendület Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest, H-1117, Hungary

42. Yan Z, Wang J. SPA-LN: a scoring function of ligand-nucleic acid interactions via optimizing both specificity and affinity. Nucleic Acids Res 2017; 45:e110. [PMID: 28431169 PMCID: PMC5499587 DOI: 10.1093/nar/gkx255] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Received: 11/07/2016] [Accepted: 04/05/2017] [Indexed: 01/10/2023] Open
Abstract
Nucleic acids have been widely recognized as potential targets in drug discovery and aptamer selection. Quantifying the interactions between small molecules and nucleic acids is critical to discover lead compounds and design novel aptamers. Scoring function is normally employed to quantify the interactions in structure-based virtual screening. However, the predictive power of nucleic acid–ligand scoring functions is still a challenge compared to other types of biomolecular recognition. With the rapid growth of experimentally determined nucleic acid–ligand complex structures, in this work, we develop a knowledge-based scoring function of nucleic acid–ligand interactions, namely SPA-LN. SPA-LN is optimized by maximizing both the affinity and specificity of native complex structures. The development strategy is different from those of previous nucleic acid–ligand scoring functions which focus on the affinity only in the optimization. The native conformation is stabilized while non-native conformations are destabilized by our optimization, making the funnel-like binding energy landscape more biased toward the native state. The performance of SPA-LN validates the development strategy and provides a relatively more accurate way to score the nucleic acid–ligand interactions.
Affiliation(s)
- Zhiqiang Yan
- State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, Jilin 130022, China
| | - Jin Wang
- State Key Laboratory of Electroanalytical Chemistry, Changchun Institute of Applied Chemistry, Chinese Academy of Sciences, Changchun, Jilin 130022, China.,Department of Chemistry & Physics, State University of New York at Stony Brook, Stony Brook, NY 11794-3400, USA
| |
|
43
|
Young EJ, Burton R, Mahalik JP, Sumpter BG, Fuentes-Cabrera M, Kerfeld CA, Ducat DC. Engineering the Bacterial Microcompartment Domain for Molecular Scaffolding Applications. Front Microbiol 2017; 8:1441. [PMID: 28824573 PMCID: PMC5534457 DOI: 10.3389/fmicb.2017.01441] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Received: 05/31/2017] [Accepted: 07/17/2017] [Indexed: 01/03/2023] Open
Abstract
As synthetic biology advances the intricacy of engineered biological systems, the importance of spatial organization within the cellular environment must not be marginalized. Increasingly, biological engineers are investigating means to control spatial organization within the cell, mimicking strategies used by natural pathways to increase flux and reduce cross-talk. A modular platform for constructing a diverse set of defined, programmable architectures would greatly assist in improving yields from introduced metabolic pathways and increasing insulation of other heterologous systems. Here, we review recent research on the shell proteins of bacterial microcompartments and discuss their potential application as "building blocks" for a range of customized intracellular scaffolds. We summarize the state of knowledge on the self-assembly of BMC shell proteins and discuss future avenues of research that will be important to realize the potential of BMC shell proteins as predictively assembling and programmable biological materials for bioengineering.
Affiliation(s)
- Eric J. Young
- Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, United States
- MSU-DOE Plant Research Laboratory, East Lansing, MI, United States
| | - Rodney Burton
- MSU-DOE Plant Research Laboratory, East Lansing, MI, United States
| | - Jyoti P. Mahalik
- Computational Sciences and Engineering, Oak Ridge National Laboratory, Oak Ridge, TN, United States
- Center for Nanophase Materials Sciences, Oak Ridge National Laboratory, Oak Ridge, TN, United States
| | - Bobby G. Sumpter
- Computational Sciences and Engineering, Oak Ridge National Laboratory, Oak Ridge, TN, United States
- Center for Nanophase Materials Sciences, Oak Ridge National Laboratory, Oak Ridge, TN, United States
| | - Miguel Fuentes-Cabrera
- Computational Sciences and Engineering, Oak Ridge National Laboratory, Oak Ridge, TN, United States
- Center for Nanophase Materials Sciences, Oak Ridge National Laboratory, Oak Ridge, TN, United States
| | - Cheryl A. Kerfeld
- Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, United States
- MSU-DOE Plant Research Laboratory, East Lansing, MI, United States
- Molecular Biophysics and Integrated Bioimaging Division, Berkeley National Laboratory, Berkeley, CA, United States
| | - Daniel C. Ducat
- Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, United States
- MSU-DOE Plant Research Laboratory, East Lansing, MI, United States
| |
|
44
|
Akbar S, Hayat M, Iqbal M, Jan MA. iACP-GAEnsC: Evolutionary genetic algorithm based ensemble classification of anticancer peptides by utilizing hybrid feature space. Artif Intell Med 2017; 79:62-70. [PMID: 28655440 DOI: 10.1016/j.artmed.2017.06.008] [Citation(s) in RCA: 90] [Impact Index Per Article: 12.9] [Received: 04/26/2017] [Revised: 06/12/2017] [Accepted: 06/16/2017] [Indexed: 01/10/2023]
Abstract
Cancer is a fatal disease, responsible for one-quarter of all deaths in developed countries. Traditional anticancer therapies, such as chemotherapy and radiation, are highly expensive, susceptible to errors, and ineffective. These conventional techniques induce severe side effects on human cells. Due to the perilous impact of cancer, the development of an accurate and highly efficient intelligent computational model is desirable for the identification of anticancer peptides. In this paper, an evolutionary intelligent genetic algorithm-based ensemble model, 'iACP-GAEnsC', is proposed for the identification of anticancer peptides. In this model, the protein sequences are formulated using three different discrete feature representation methods, i.e., amphiphilic pseudo amino acid composition, g-gap dipeptide composition, and reduced amino acid alphabet composition. The performance of the extracted feature spaces is investigated separately, and the feature spaces are then merged to exhibit the significance of hybridization. In addition, the predicted results of individual classifiers are combined using an optimized genetic algorithm and a simple majority technique in order to enhance the true classification rate. It is observed that the genetic algorithm-based ensemble classification outperforms the individual classifiers as well as the simple majority voting-based ensemble. The genetic algorithm-based ensemble classification performs best on the hybrid feature space, with an accuracy of 96.45%. In comparison to the existing techniques, the 'iACP-GAEnsC' model has achieved remarkable improvement in terms of various performance metrics. Based on the simulation results, it is observed that the 'iACP-GAEnsC' model might be a leading tool in the field of drug design and proteomics for researchers.
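The difference between the two combination schemes named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the three classifier outputs, and the weights are all hypothetical, and the weights are fixed by hand here, whereas the abstract describes a genetic algorithm that optimizes the combination.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the largest number of classifiers."""
    return Counter(predictions).most_common(1)[0][0]


def weighted_vote(predictions, weights):
    """Return the label whose supporting classifiers carry the largest total weight."""
    totals = {}
    for label, weight in zip(predictions, weights):
        totals[label] = totals.get(label, 0.0) + weight
    return max(totals, key=totals.get)


# Hypothetical outputs of three classifiers for one peptide
# (1 = anticancer, 0 = non-anticancer).
predictions = [1, 0, 0]
# Hypothetical weights; a genetic algorithm would search for such values
# by maximizing accuracy on training data.
weights = [0.7, 0.2, 0.1]

print(majority_vote(predictions))           # 0: two of three classifiers say 0
print(weighted_vote(predictions, weights))  # 1: the heavily weighted classifier wins
```

The example shows why a weighted combination can disagree with a simple majority: one reliable classifier may outweigh several weaker ones.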
Affiliation(s)
- Shahid Akbar
- Department of Computer Science, Abdul Wali Khan University Mardan, KP 23200, Pakistan.
| | - Maqsood Hayat
- Department of Computer Science, Abdul Wali Khan University Mardan, KP 23200, Pakistan.
| | - Muhammad Iqbal
- Department of Computer Science, Abdul Wali Khan University Mardan, KP 23200, Pakistan.
| | - Mian Ahmad Jan
- Department of Computer Science, Abdul Wali Khan University Mardan, KP 23200, Pakistan.
| |
|
45
|
Abstract
In addition to continuous rapid progress in RNA structure determination, probing, and biophysical studies, the past decade has seen remarkable advances in the development of a new generation of RNA folding theories and models. In this article, we review RNA structure prediction models and models for ion-RNA and ligand-RNA interactions. These new models are becoming increasingly important for a mechanistic understanding of RNA function and quantitative design of RNA nanotechnology. We focus on new methods for physics-based, knowledge-based, and experimental data-directed modeling for RNA structures and explore the new theories for the predictions of metal ion and ligand binding sites and metal ion-dependent RNA stabilities. The integration of these new methods with theories about the cellular environment effects in RNA folding, such as molecular crowding and cotranscriptional kinetic effects, may ultimately lead to an all-encompassing RNA folding model.
Affiliation(s)
- Li-Zhen Sun
- Department of Physics, Department of Biochemistry, and MU Informatics Institute, University of Missouri, Columbia, Missouri 65211;
| | - Dong Zhang
- Department of Physics, Department of Biochemistry, and MU Informatics Institute, University of Missouri, Columbia, Missouri 65211;
| | - Shi-Jie Chen
- Department of Physics, Department of Biochemistry, and MU Informatics Institute, University of Missouri, Columbia, Missouri 65211;
| |
|
46
|
Coluzza I. Computational protein design: a review. JOURNAL OF PHYSICS. CONDENSED MATTER : AN INSTITUTE OF PHYSICS JOURNAL 2017; 29:143001. [PMID: 28140371 DOI: 10.1088/1361-648x/aa5c76] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Indexed: 06/06/2023]
Abstract
Proteins are one of the most versatile modular assembling systems in nature. Experimentally, more than 110 000 protein structures have been identified and more are deposited every day in the Protein Data Bank. Such an enormous structural variety is to a first approximation controlled by the sequence of amino acids along the peptide chain of each protein. Understanding how the structural and functional properties of the target can be encoded in this sequence is the main objective of protein design. Unfortunately, rational protein design remains one of the major challenges across the disciplines of biology, physics and chemistry. The implications of solving this problem are enormous and branch into materials science, drug design, evolution and even cryptography. For instance, in the field of drug design an effective computational method to design protein-based ligands for biological targets such as viruses, bacteria or tumour cells, could give a significant boost to the development of new therapies with reduced side effects. In materials science, self-assembly is a highly desired property and soon artificial proteins could represent a new class of designable self-assembling materials. The scope of this review is to describe the state of the art in computational protein design methods and give the reader an outline of what developments could be expected in the near future.
Affiliation(s)
- Ivan Coluzza
- Computational Physics, Faculty of Physics, University of Vienna, Vienna, Austria
| |
|
47
|
Iqbal S, Hoque MT. Estimation of Position Specific Energy as a Feature of Protein Residues from Sequence Alone for Structural Classification. PLoS One 2016; 11:e0161452. [PMID: 27588752 PMCID: PMC5010294 DOI: 10.1371/journal.pone.0161452] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 06/15/2016] [Accepted: 08/06/2016] [Indexed: 11/20/2022] Open
Abstract
A set of features computed from the primary amino acid sequence of proteins is crucial in the process of inducing a machine learning model that is capable of accurately predicting three-dimensional protein structures. Solutions for existing protein structure prediction problems are in need of features that can capture the complexity of molecular-level interactions. With a view to this, we propose a novel approach to estimate the position specific estimated energy (PSEE) of a residue using contact energy and predicted relative solvent accessibility (RSA). Furthermore, we demonstrate that PSEE can be reasonably estimated based on sequence information alone. PSEE is useful in identifying the structured as well as the unstructured, or intrinsically disordered, regions of a protein by computing favorable and unfavorable energy, respectively, characterized by an appropriate threshold. The most intriguing finding, verified empirically, is the indication that the PSEE feature can effectively classify disordered versus ordered residues and can segregate residues of different secondary structure types by computing the constituent energies. PSEE values for each amino acid strongly correlate with the hydrophobicity value of the corresponding amino acid. Further, PSEE can be used to detect the existence of critical binding regions that essentially undergo disorder-to-order transitions to perform crucial biological functions. Towards an application of disorder prediction using the PSEE feature, we have rigorously tested and found that a support vector machine model informed by a set of features including PSEE consistently outperforms a model with an identical set of features with PSEE removed. In addition, the new disorder predictor, DisPredict2, shows competitive performance in predicting protein disorder when compared with six existing disordered protein predictors.
Affiliation(s)
- Sumaiya Iqbal
- Department of Computer Science, University of New Orleans, New Orleans, LA, United States of America
| | - Md Tamjidul Hoque
- Department of Computer Science, University of New Orleans, New Orleans, LA, United States of America
| |
|
48
|
Mahalik JP, Brown KA, Cheng X, Fuentes-Cabrera M. Theoretical Study of the Initial Stages of Self-Assembly of a Carboxysome's Facet. ACS NANO 2016; 10:5751-8. [PMID: 26906087 DOI: 10.1021/acsnano.5b07805] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Indexed: 05/26/2023]
Abstract
Bacterial microcompartments (BMCs) are organelles that exist within a wide variety of bacteria and act as nanofactories. Among the different types of known BMCs, the carboxysome has been studied the most. The carboxysome plays an important role in the light-independent part of the photosynthesis process, where its icosahedral-like proteinaceous shell acts as a membrane that controls the transport of metabolites. Although a structural model exists for the carboxysome shell, it remains largely unknown how the shell proteins self-assemble. Understanding the self-assembly process can provide insights into how the shell affects the carboxysome's function and how it can be modified to create new functionalities, such as artificial nanoreactors and artificial protein membranes. Here, we describe a theoretical framework that employs Monte Carlo simulations with a coarse-grained potential that closely reproduces the atomistic potential of mean force; employing this framework, we are able to capture the initial stages of the 2D self-assembly of CcmK2 hexamers, a major protein-shell component of the carboxysome's facet. The simulations reveal that CcmK2 hexamers self-assemble into clusters that resemble what was seen experimentally in 2D layers. Further analysis of the simulation results suggests that the 2D self-assembly of the carboxysome's facets is driven by a nucleation-growth process, which in turn could play an important role in the hierarchical self-assembly of BMC shells in general.
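For readers unfamiliar with Monte Carlo sampling, the following is a generic Metropolis sketch, not the framework of the paper. The abstract does not say which move set or acceptance rule was used, so the Metropolis criterion, the function names, and the toy 1D harmonic energy that stands in for a coarse-grained potential are all assumptions made for illustration.

```python
import math
import random


def metropolis_accept(delta_e, kT, rng=random):
    """Metropolis criterion: always accept moves that do not raise the energy,
    and accept uphill moves with probability exp(-delta_e / kT)."""
    if delta_e <= 0.0:
        return True
    return rng.random() < math.exp(-delta_e / kT)


def run_chain(energy, propose, state, kT, n_steps, seed=0):
    """Run a Metropolis chain and return the final state and its energy."""
    rng = random.Random(seed)
    current_energy = energy(state)
    for _ in range(n_steps):
        trial = propose(state, rng)
        trial_energy = energy(trial)
        if metropolis_accept(trial_energy - current_energy, kT, rng):
            state, current_energy = trial, trial_energy
    return state, current_energy


# Toy usage: a 1D harmonic energy (hypothetical stand-in for a
# coarse-grained interaction) sampled at low temperature.
final_state, final_energy = run_chain(
    energy=lambda x: x * x,
    propose=lambda x, rng: x + rng.uniform(-0.5, 0.5),
    state=3.0,
    kT=0.1,
    n_steps=2000,
)
print(final_state, final_energy)
```

In a self-assembly simulation the state would hold the positions and orientations of many hexamers and the proposal would translate or rotate one of them, but the acceptance step would keep this form.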
Affiliation(s)
| | - Kirsten A Brown
- Chemistry Department, Mercer University, 1501 Mercer University Drive, Macon, Georgia 31207, United States
| | - Xiaolin Cheng
- Department of Biochemistry and Cellular and Molecular Biology, University of Tennessee, M407 Walters Life Sciences, 1414 Cumberland Avenue, Knoxville, Tennessee 37996, United States
| | | |
|
49
|
Zheng Z, Wang T, Li P, Merz KM. KECSA-Movable Type Implicit Solvation Model (KMTISM). J Chem Theory Comput 2016; 11:667-82. [PMID: 25691832 PMCID: PMC4325602 DOI: 10.1021/ct5007828] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 08/27/2014] [Indexed: 11/30/2022]
Abstract
Computation of the solvation free energy for chemical and biological processes has long been of significant interest. The key challenges to effective solvation modeling center on the choice of potential function and configurational sampling. Herein, an energy sampling approach termed the “Movable Type” (MT) method, and a statistical energy function for solvation modeling, “Knowledge-based and Empirical Combined Scoring Algorithm” (KECSA), are developed and utilized to create an implicit solvation model: KECSA-Movable Type Implicit Solvation Model (KMTISM), suitable for the study of chemical and biological systems. KMTISM is an implicit solvation model, but the MT method performs energy sampling at the atom pairwise level. For a specific molecular system, the MT method collects energies from prebuilt databases for the requisite atom pairs at all relevant distance ranges, which by its very construction encodes all possible molecular configurations simultaneously. Unlike traditional statistical energy functions, KECSA converts structural statistical information into categorized atom pairwise interaction energies as a function of the radial distance instead of a mean force energy function. Within the implicit solvent model approximation, aqueous solvation free energies are then obtained from the NVT ensemble partition function generated by the MT method. Validation is performed against several subsets selected from the Minnesota Solvation Database v2012. Results are compared with several solvation free energy calculation methods, including a one-to-one comparison against two commonly used classical implicit solvation models: MM-GBSA and MM-PBSA. Comparison against a quantum mechanics based polarizable continuum model is also discussed (Cramer and Truhlar’s Solvation Model 12).
Affiliation(s)
- Zheng Zheng
- Institute for Cyber Enabled Research, Department of Chemistry and Department of Biochemistry and Molecular Biology, Michigan State University, 578 South Shaw Lane, East Lansing, Michigan 48824-1322, United States
| | | | | | | |
|
50
|
iDPF-PseRAAAC: A Web-Server for Identifying the Defensin Peptide Family and Subfamily Using Pseudo Reduced Amino Acid Alphabet Composition. PLoS One 2015; 10:e0145541. [PMID: 26713618 PMCID: PMC4694767 DOI: 10.1371/journal.pone.0145541] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.9] [Received: 09/17/2015] [Accepted: 12/04/2015] [Indexed: 11/29/2022] Open
Abstract
Defensins, one of the most abundant classes of antimicrobial peptides, are an essential part of the innate immunity that has evolved in most living organisms, from lower organisms to humans. To identify specific defensins as interesting antifungal leads, in this study we constructed a more rigorous benchmark dataset and developed the iDPF-PseRAAAC server to predict the defensin family and subfamily. When reduced dipeptide compositions were used, the overall accuracy of the proposed method increased to 95.10% for the defensin family and 98.39% for the vertebrate subfamily, which is higher than the accuracy of other methods. The jackknife test shows that an improvement of more than 4% was obtained compared with the previous method. A free online server was further established for the convenience of most experimental scientists at http://wlxy.imu.edu.cn/college/biostation/fuwu/iDPF-PseRAAAC/index.asp. A friendly guide is provided to describe how to use the web server. We anticipate that iDPF-PseRAAAC may become a useful high-throughput tool for both basic research and drug design.
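A reduced dipeptide composition of the kind mentioned in this abstract can be sketched as follows. The six-group clustering of the 20 amino acids, the group labels, and the function name are hypothetical choices made only for illustration; the abstract does not state which reduced alphabet the server uses.

```python
from itertools import product

# Hypothetical reduced alphabet: each label stands for a group of
# amino acids with broadly similar physicochemical character.
CLUSTERS = {
    "a": "AVLIMC",  # aliphatic and sulfur-containing
    "b": "FWYH",    # aromatic
    "c": "STNQ",    # polar, uncharged
    "d": "KR",      # positively charged
    "e": "DE",      # negatively charged
    "f": "GP",      # glycine and proline
}
REDUCED = {aa: label for label, group in CLUSTERS.items() for aa in group}


def reduced_dipeptide_composition(sequence):
    """Map a peptide onto the reduced alphabet and return the frequency of
    every adjacent pair of reduced letters (36 features for 6 groups)."""
    reduced = [REDUCED[aa] for aa in sequence.upper() if aa in REDUCED]
    labels = sorted(CLUSTERS)
    counts = {first + second: 0 for first, second in product(labels, repeat=2)}
    for first, second in zip(reduced, reduced[1:]):
        counts[first + second] += 1
    total = max(len(reduced) - 1, 1)  # avoid division by zero for short input
    return {pair: n / total for pair, n in counts.items()}


features = reduced_dipeptide_composition("KRDE")
print(len(features))   # 36
print(features["de"])  # frequency of a positive residue followed by a negative one
```

Reducing the alphabet shrinks the dipeptide feature vector from 400 entries to 36 in this sketch, which is the usual motivation for such encodings when training data are limited.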
|