1
Chen L, Li Q, Nasif KFA, Xie Y, Deng B, Niu S, Pouriyeh S, Dai Z, Chen J, Xie CY. AI-Driven Deep Learning Techniques in Protein Structure Prediction. Int J Mol Sci 2024; 25:8426. [PMID: 39125995 PMCID: PMC11313475 DOI: 10.3390/ijms25158426] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2024] [Revised: 07/29/2024] [Accepted: 07/29/2024] [Indexed: 08/12/2024] Open
Abstract
Protein structure prediction is important for understanding protein function and behavior. This review surveys the computational models used to predict protein structure, covering the progression from established protein modeling to state-of-the-art artificial intelligence (AI) frameworks. The paper starts with a brief introduction to protein structures, protein modeling, and AI. The section on established protein modeling discusses homology modeling, ab initio modeling, and threading. The next section covers deep learning-based models. It introduces state-of-the-art AI models such as AlphaFold (AlphaFold, AlphaFold2, AlphaFold3), RoseTTAFold, and ProteinBERT, and discusses how AI techniques have been integrated into established frameworks like Swiss-Model, Rosetta, and I-TASSER. Model performance is compared using the rankings of CASP14 (Critical Assessment of Structure Prediction) and CASP15. CASP16 is ongoing, and its results are not included in this review. Continuous Automated Model EvaluatiOn (CAMEO) complements the biennial CASP experiment. The template modeling score (TM-score), global distance test total score (GDT_TS), and Local Distance Difference Test (lDDT) score are also discussed. The paper then acknowledges the ongoing difficulties in predicting protein structure and emphasizes the need for further research in areas such as dynamic protein behavior, conformational changes, and protein-protein interactions. The application section introduces applications in fields such as drug design, industry, education, and novel protein development. In summary, this paper provides a comprehensive overview of the latest advancements in established protein modeling and deep learning-based models for protein structure prediction. It emphasizes the significant advancements achieved by AI and identifies potential areas for further investigation.
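As a rough illustration of two of the scores named in this abstract, the following Python sketch computes GDT_TS and TM-score from per-residue Cα deviations. It assumes the model has already been superposed on the reference structure; the standard evaluation tools search over many superpositions and report the best value, so this is a simplified sketch rather than the CASP evaluation code. The function names and the 0.5 Å floor on d0 are illustrative assumptions.

```python
def gdt_ts(distances):
    """GDT_TS on a 0-100 scale.

    `distances` holds one C-alpha deviation (in angstroms) per residue,
    measured under a single, fixed superposition (an assumption of this sketch).
    """
    if not distances:
        raise ValueError("distances must not be empty")
    n = len(distances)
    # Fraction of residues within each of the four standard cutoffs.
    fractions = [
        sum(1 for d in distances if d <= cutoff) / n
        for cutoff in (1.0, 2.0, 4.0, 8.0)
    ]
    return 100.0 * sum(fractions) / len(fractions)


def tm_d0(target_length):
    """Length-dependent distance scale d0 used by TM-score."""
    if target_length <= 15:
        # Assumed guard for very short chains, where the formula breaks down.
        return 0.5
    return max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)


def tm_score(distances, target_length):
    """TM-score for one fixed superposition, normalised by the target length."""
    d0 = tm_d0(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length
```

Both functions return higher values for better models; a perfect model (all deviations zero, every target residue aligned) gives a GDT_TS of 100 and a TM-score of 1.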
Affiliation(s)
- Lingtao Chen
- College of Computing and Software Engineering, Kennesaw State University, Marietta, GA 30060, USA
- Qiaomu Li
- College of Computing and Software Engineering, Kennesaw State University, Marietta, GA 30060, USA
- Kazi Fahim Ahmad Nasif
- College of Computing and Software Engineering, Kennesaw State University, Marietta, GA 30060, USA
- Ying Xie
- College of Computing and Software Engineering, Kennesaw State University, Marietta, GA 30060, USA
- Bobin Deng
- College of Computing and Software Engineering, Kennesaw State University, Marietta, GA 30060, USA
- Shuteng Niu
- Department of Computer Science, Bowling Green State University, Bowling Green, OH 43403, USA
- Seyedamin Pouriyeh
- College of Computing and Software Engineering, Kennesaw State University, Marietta, GA 30060, USA
- Zhiyu Dai
- Division of Pulmonary and Critical Care Medicine, John T. Milliken Department of Medicine, Washington University School of Medicine in St. Louis, St. Louis, MO 63110, USA
- Jiawei Chen
- College of Computing, Data Science and Society, University of California, Berkeley, CA 94720, USA
- Chloe Yixin Xie
- College of Computing and Software Engineering, Kennesaw State University, Marietta, GA 30060, USA
2
Peng CX, Zhou XG, Zhang GJ. De novo Protein Structure Prediction by Coupling Contact With Distance Profile. IEEE/ACM Trans Comput Biol Bioinform 2022; 19:395-406. [PMID: 32750861 DOI: 10.1109/tcbb.2020.3000758] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 06/11/2023]
Abstract
De novo protein structure prediction is a challenging problem that requires both an accurate energy function and an efficient conformation sampling method. In this study, a de novo structure prediction method named CoDiFold is proposed. In CoDiFold, contacts and distance profiles are combined into the Rosetta low-resolution energy function to improve its accuracy. As a result, the correlation between energy and root mean square deviation (RMSD) is improved. In addition, a population-based multi-mutation strategy is designed to balance exploration and exploitation in conformation space sampling. On a test set of 43 proteins, the average RMSD of the models generated by the proposed protocol is 49.24 and 45.21 percent lower than that of the Rosetta and QUARK de novo protocols, respectively. The results also demonstrate that the structures predicted by CoDiFold are comparable to those of state-of-the-art methods for the 10 FM targets of CASP13. The source code and executable versions are freely available at http://github.com/iobio-zjut/CoDiFold.
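To make the reported metric concrete, here is a minimal Python sketch of the RMSD between a model and a reference structure, together with the relative decrease used to compare protocols. It assumes both coordinate lists are already superposed and matched residue by residue; a full evaluation would first superpose them (for example with the Kabsch algorithm), which is omitted here. The function names are illustrative and are not part of CoDiFold.

```python
import math


def rmsd(model, reference):
    """Root mean square deviation between two equally long lists of (x, y, z) points.

    Assumes the two structures are already superposed (no fitting is done here).
    """
    if len(model) != len(reference) or not model:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (a - b) ** 2
        for p, q in zip(model, reference)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(model))


def percent_decrease(baseline, improved):
    """Relative decrease of `improved` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - improved) / baseline
```

For example, a protocol that lowers an average RMSD from 10.0 Å to 5.0 Å would be reported as a 50 percent decrease under this definition.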
3
Sakalli T, Surmeli NB. Functional characterization of a novel CYP119 variant to explore its biocatalytic potential. Biotechnol Appl Biochem 2021; 69:1741-1756. [PMID: 34431570 DOI: 10.1002/bab.2243] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/13/2021] [Accepted: 08/18/2021] [Indexed: 12/27/2022]
Abstract
Biocatalysts are increasingly applied in the pharmaceutical and chemical industry. Cytochrome P450 enzymes (P450s) are valuable biocatalysts due to their ability to hydroxylate unactivated carbon atoms using molecular oxygen. P450s catalyze reactions using nicotinamide adenine dinucleotide phosphate (NAD(P)H) cofactor and electron transfer proteins. Alternatively, P450s can utilize hydrogen peroxide (H2O2) as an oxidant, but this pathway is inefficient. P450s that show higher efficiency with peroxides are sought after in industrial applications. P450s from thermophilic organisms have more potential applications as they are stable toward high temperature, high and low pH, and organic solvents. CYP119 is an acidothermophilic P450 from Sulfolobus acidocaldarius. In our previous study, a novel T213R/T214I (double mutant [DM]) variant of CYP119 was obtained by screening a mutant library for higher peroxidation activity utilizing H2O2. Here, we characterized the substrate scope; stability toward peroxides; and temperature and organic solvent tolerance of DM CYP119 to identify its potential as an industrial biocatalyst. DM CYP119 displayed higher stability than wild-type (WT) CYP119 toward organic peroxides. It shows higher peroxidation activity for non-natural substrates and higher affinity for progesterone and other bioactive potential substrates compared to WT CYP119. DM CYP119 emerges as a new biocatalyst with a wide range of potential applications in the pharmaceutical and chemical industry.
Affiliation(s)
- Tugce Sakalli
- Department of Bioengineering, Faculty of Engineering, İzmir Institute of Technology, Urla, Izmir, Turkey
- Nur Basak Surmeli
- Department of Bioengineering, Faculty of Engineering, İzmir Institute of Technology, Urla, Izmir, Turkey
4
Mulligan VK. Current directions in combining simulation-based macromolecular modeling approaches with deep learning. Expert Opin Drug Discov 2021; 16:1025-1044. [PMID: 33993816 DOI: 10.1080/17460441.2021.1918097] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/14/2022]
Abstract
Introduction: Structure-guided drug discovery relies on accurate computational methods for modeling macromolecules. Simulations provide means of predicting macromolecular folds, of discovering function from structure, and of designing macromolecules to serve as drugs. Success rates are limited for any of these tasks, however. Recently, deep neural network-based methods have greatly enhanced the accuracy of predictions of protein structure from sequence, generating excitement about the potential impact of deep learning. Areas covered: This review introduces biologists to deep neural network architecture, surveys recent successes of deep learning in structure prediction, and discusses emerging deep learning-based approaches for structure-function analysis and design. Particular focus is given to the interplay between simulation-based and neural network-based approaches. Expert opinion: As deep learning grows integral to macromolecular modeling, simulation- and neural network-based approaches must grow more tightly interconnected. Modular software architecture must emerge allowing both types of tools to be combined with maximal versatility. Open sharing of code under permissive licenses will be essential. Although experiments will remain the gold standard for reliable information to guide drug discovery, we may soon see successful drug development projects based on high-accuracy predictions from algorithms that combine simulation with deep learning - the ultimate validation of this combination's power.
5
Başlar MS, Sakallı T, Güralp G, Kestevur Doğru E, Haklı E, Surmeli NB. Development of an improved Amplex Red peroxidation activity assay for screening cytochrome P450 variants and identification of a novel mutant of the thermophilic CYP119. J Biol Inorg Chem 2020; 25:949-962. [PMID: 32924072 DOI: 10.1007/s00775-020-01816-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/27/2020] [Accepted: 08/30/2020] [Indexed: 10/23/2022]
Abstract
Biocatalysts are increasingly utilized in the synthesis of drugs and agrochemicals as an alternative to chemical catalysis. They are preferred in the synthesis of enantiopure products due to their high regioselectivity and enantioselectivity. Cytochrome P450 (P450) oxygenases are valuable biocatalysts, since they catalyze the oxidation of carbon-hydrogen bonds with high efficiency and selectivity. However, practical use of P450s is limited due to their need for expensive cofactors and electron transport partners. P450s can employ hydrogen peroxide (H2O2) as an oxygen and electron donor, but the reaction with H2O2 is inefficient. The development of P450s that can use H2O2 will expand their applications. Here, an assay that utilizes Amplex Red peroxidation, to rapidly screen H2O2-dependent activity of P450 mutants in cell lysate was developed. This assay was employed to identify mutants of CYP119, a thermophilic P450 from Sulfolobus acidocaldarius, with increased peroxidation activity. A mutant library of CYP119 containing substitutions in the heme active site was constructed via combinatorial active-site saturation test and screened for improved activity. Screening of 158 colonies led to five mutants with higher activity. Among improved variants, T213R/T214I was characterized. T213R/T214I exhibited fivefold higher kcat for Amplex Red peroxidation and twofold higher kcat for styrene epoxidation. T213R/T214I showed higher stability towards heme degradation by H2O2. While the Km for H2O2 and styrene were not altered by the mutation, a fourfold decrease in the affinity for another substrate, lauric acid, was observed. In conclusion, Amplex Red peroxidation screening of CYP119 mutants yielded enzymes with increased peroxide-dependent activity.
Affiliation(s)
- M Semih Başlar
- Department of Bioengineering, İzmir Institute of Technology, Gülbahçe, Urla, Izmir, Turkey
- Tuğçe Sakallı
- Program in Biotechnology and Bioengineering, İzmir Institute of Technology, Gülbahçe, Urla, Izmir, Turkey
- Gülce Güralp
- Department of Bioengineering, İzmir Institute of Technology, Gülbahçe, Urla, Izmir, Turkey
- Ekin Kestevur Doğru
- Department of Bioengineering, İzmir Institute of Technology, Gülbahçe, Urla, Izmir, Turkey
- Emre Haklı
- Program in Biotechnology and Bioengineering, İzmir Institute of Technology, Gülbahçe, Urla, Izmir, Turkey
- Nur Basak Surmeli
- Department of Bioengineering, İzmir Institute of Technology, Gülbahçe, Urla, Izmir, Turkey
6
Ross TM, DiNapoli J, Giel-Moloney M, Bloom CE, Bertran K, Balzli C, Strugnell T, Sá E Silva M, Mebatsion T, Bublot M, Swayne DE, Kleanthous H. A computationally designed H5 antigen shows immunological breadth of coverage and protects against drifting avian strains. Vaccine 2019; 37:2369-2376. [PMID: 30905528 DOI: 10.1016/j.vaccine.2019.03.018] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 10/05/2018] [Revised: 03/05/2019] [Accepted: 03/11/2019] [Indexed: 02/07/2023]
Abstract
Since the first identification of the H5N1 Goose/Guangdong lineage in 1996, this highly pathogenic avian influenza virus has spread worldwide, becoming endemic in domestic poultry. Sporadic transmission to humans has raised concerns of a potential pandemic and underscores the need for a broad cross-protective influenza vaccine. Here, we tested our previously described methodology, termed Computationally Optimized Broadly Reactive Antigen (COBRA), to generate a novel hemagglutinin (HA) gene, termed COBRA-2, that was based on H5 HA sequences from 2005 to 2006. The COBRA-2 HA virus-like particle (VLP) vaccines were used to vaccinate chickens, and the immune responses were compared to responses elicited by VLPs expressing HA from A/whooper swan/Mongolia/244/2005 (WS/05), a representative 2005 vaccine virus from clade 2.2. To support this evaluation, a hemagglutination inhibition (HAI) breadth panel was developed consisting of phylogenetically and antigenically diverse H5 strains in circulation from 2005 to 2006, as well as recent drift variants (2008-2014). We found that the COBRA-2 VLP vaccines elicited robust HAI titers against this entire breadth panel, whereas the VLP vaccine based upon the recommended WS/05 HA only elicited HAI responses against a subset of strains. Furthermore, while all vaccines protected chickens against challenge with the WS/05 virus, only the human COBRA-2 VLP vaccinated birds were protected (80%) against a recent drifted clade 2.3.2.1B, A/duck/Vietnam/NCVD-672/2011 (VN/11) virus. This is the first report to demonstrate seroprotective antibody responses against genetically diverse clades and sub-clades of H5 viruses and protective efficacy against a recent drifted variant using a globular head based design strategy.
Affiliation(s)
- Ted M Ross
- University of Georgia, Center for Vaccines and Immunology, Department of Infectious Diseases, Athens, GA 30602, USA
- Chalise E Bloom
- University of Georgia, Center for Vaccines and Immunology, Department of Infectious Diseases, Athens, GA 30602, USA
- Kateri Bertran
- Exotic and Emerging Avian Viral Diseases Research Unit, Southeast Poultry Research Laboratory, U.S. National Poultry Research Center, Agricultural Research Service, U.S. Department of Agriculture, Athens, GA 30602, USA
- Charles Balzli
- Exotic and Emerging Avian Viral Diseases Research Unit, Southeast Poultry Research Laboratory, U.S. National Poultry Research Center, Agricultural Research Service, U.S. Department of Agriculture, Athens, GA 30602, USA
- Tod Strugnell
- Sanofi-Pasteur, 38 Sidney Street, Cambridge, MA 02139, USA
- Michel Bublot
- Boehringer Ingelheim, S.A.S., R&D, 69007 Lyon, France
- David E Swayne
- Exotic and Emerging Avian Viral Diseases Research Unit, Southeast Poultry Research Laboratory, U.S. National Poultry Research Center, Agricultural Research Service, U.S. Department of Agriculture, Athens, GA 30602, USA
7
Jing X, Dong Q, Lu R, Dong Q. Protein Inter-Residue Contacts Prediction: Methods, Performances and Applications. Curr Bioinform 2019. [DOI: 10.2174/1574893613666181109130430] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 11/22/2022]
Abstract
Background: Protein inter-residue contact prediction plays an important role in protein structure and function research. As a low-dimensional representation of protein tertiary structure, inter-residue contacts can greatly help de novo protein structure prediction methods reduce the conformational search space. Over the past two decades, various methods have been developed for protein inter-residue contact prediction. Objective: We provide a comprehensive and systematic review of protein inter-residue contact prediction methods. Results: Protein inter-residue contact prediction methods are roughly classified into five categories: correlated mutations methods, machine-learning methods, fusion methods, template-based methods and 3D model-based methods. In this paper, we first describe the common definition of protein inter-residue contacts and show their typical applications. Then, we present a comprehensive review of the three main categories of protein inter-residue contact prediction: correlated mutations methods, machine-learning methods and fusion methods. We also analyze the constraints of each category. Furthermore, we compare several representative methods on the CASP11 dataset and discuss their performance in detail. Conclusion: Correlated mutations methods achieve better performance for long-range contacts, while machine-learning methods perform well for short-range contacts. Fusion methods can take advantage of both machine-learning and correlated mutations methods. Employing more effective fusion strategies could help further improve the performance of fusion methods.
Affiliation(s)
- Xiaoyang Jing
- School of Computer Science, Fudan University, Shanghai, China
- Qimin Dong
- Vocational and Technical Education Center of Linxi County, Chifeng, Inner Mongolia, China
- Ruqian Lu
- School of Computer Science, Fudan University, Shanghai, China
- Qiwen Dong
- Faculty of Education, East China Normal University, Shanghai, China
8
Experimental accuracy in protein structure refinement via molecular dynamics simulations. Proc Natl Acad Sci U S A 2018; 115:13276-13281. [PMID: 30530696 DOI: 10.1073/pnas.1811364115] [Citation(s) in RCA: 52] [Impact Index Per Article: 8.7] [Indexed: 12/22/2022] Open
Abstract
Refinement is the last step in protein structure prediction pipelines to convert approximate homology models to experimental accuracy. Protocols based on molecular dynamics (MD) simulations have shown promise, but current methods are limited to moderate levels of consistent refinement. To explore the energy landscape between homology models and native structures and analyze the challenges of MD-based refinement, eight test cases were studied via extensive simulations followed by Markov state modeling. In all cases, native states were found very close to the experimental structures and at the lowest free energies, but refinement was hindered by a rough energy landscape. Transitions from the homology model to the native states require the crossing of significant kinetic barriers on at least microsecond time scales. A significant energetic driving force toward the native state was lacking until its immediate vicinity, and there was significant sampling of off-pathway states competing for productive refinement. The role of recent force field improvements is discussed and transition paths are analyzed in detail to inform which key transitions have to be overcome to achieve successful refinement.
9
Kc DB. Recent advances in sequence-based protein structure prediction. Brief Bioinform 2018; 18:1021-1032. [PMID: 27562963 DOI: 10.1093/bib/bbw070] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 03/24/2016] [Indexed: 11/13/2022] Open
Abstract
The most accurate characterizations of the structure of proteins are provided by structural biology experiments. However, because of the high cost and labor-intensive nature of the structural experiments, the gap between the number of protein sequences and solved structures is widening rapidly. Development of computational methods to accurately model protein structures from sequences is becoming increasingly important to the biological community. In this article, we highlight some important progress in the field of protein structure prediction, especially progress related to free modeling (FM) methods that generate structure models without using homologous templates. We also provide a short synopsis of some of the recent advances in FM approaches as demonstrated in the recent Critical Assessment of Structure Prediction (CASP) competition, as well as recent trends and the outlook for FM approaches in protein structure prediction.
10
Nerli S, McShan AC, Sgourakis NG. Chemical shift-based methods in NMR structure determination. Prog Nucl Magn Reson Spectrosc 2018; 106-107:1-25. [PMID: 31047599 PMCID: PMC6788782 DOI: 10.1016/j.pnmrs.2018.03.002] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Received: 01/25/2018] [Revised: 03/09/2018] [Accepted: 03/09/2018] [Indexed: 05/08/2023]
Abstract
Chemical shifts are highly sensitive probes harnessed by NMR spectroscopists and structural biologists as conformational parameters to characterize a range of biological molecules. Traditionally, assignment of chemical shifts has been a labor-intensive process requiring numerous samples and a suite of multidimensional experiments. Over the past two decades, the development of complementary computational approaches has bolstered the analysis, interpretation and utilization of chemical shifts for elucidation of high resolution protein and nucleic acid structures. Here, we review the development and application of chemical shift-based methods for structure determination with a focus on ab initio fragment assembly, comparative modeling, oligomeric systems, and automated assignment methods. Throughout our discussion, we point out practical uses, as well as advantages and caveats, of using chemical shifts in structure modeling. We additionally highlight (i) hybrid methods that employ chemical shifts with other types of NMR restraints (residual dipolar couplings, paramagnetic relaxation enhancements and pseudocontact shifts) that allow for improved accuracy and resolution of generated 3D structures, (ii) the utilization of chemical shifts to model the structures of sparsely populated excited states, and (iii) modeling of sidechain conformations. Finally, we briefly discuss the advantages of contemporary methods that employ sparse NMR data recorded using site-specific isotope labeling schemes for chemical shift-driven structure determination of larger molecules. With this review, we aim to emphasize the accessibility and versatility of chemical shifts for structure determination of challenging biological systems, and to point out emerging areas of development that lead us towards the next generation of tools.
Affiliation(s)
- Santrupti Nerli
- Department of Chemistry and Biochemistry, University of California Santa Cruz, Santa Cruz, CA 95064, United States; Department of Computer Science, University of California Santa Cruz, Santa Cruz, CA 95064, United States
- Andrew C McShan
- Department of Chemistry and Biochemistry, University of California Santa Cruz, Santa Cruz, CA 95064, United States
- Nikolaos G Sgourakis
- Department of Chemistry and Biochemistry, University of California Santa Cruz, Santa Cruz, CA 95064, United States
11
Schaarschmidt J, Monastyrskyy B, Kryshtafovych A, Bonvin AM. Assessment of contact predictions in CASP12: Co-evolution and deep learning coming of age. Proteins 2018; 86 Suppl 1:51-66. [PMID: 29071738 PMCID: PMC5820169 DOI: 10.1002/prot.25407] [Citation(s) in RCA: 126] [Impact Index Per Article: 21.0] [Received: 07/14/2017] [Revised: 10/06/2017] [Accepted: 10/24/2017] [Indexed: 12/20/2022]
Abstract
Following up on the encouraging results of residue-residue contact prediction in the CASP11 experiment, we present the analysis of predictions submitted for CASP12. The submissions include predictions from 34 groups for 38 domains classified as free modeling targets, which are not accessible to homology-based modeling due to a lack of structural templates. CASP11 saw a rise of coevolution-based methods outperforming other approaches. The improvement of these methods, coupled with machine learning and sequence database growth, is most likely the main driver of a significant improvement in average precision from 27% in CASP11 to 47% in CASP12. In more than half of the targets, especially those with many accessible homologous sequences, precisions above 90% were achieved, with the best predictors reaching a precision of 100% in some cases. We furthermore tested the impact of using these contacts as restraints in ab initio modeling of 14 single-domain free modeling targets using Rosetta. Adding contacts to the Rosetta calculations resulted in improvements of up to 26% in GDT_TS within the top five structures.
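Precision figures like those quoted in this abstract are commonly computed over the top-ranked predicted contacts. The sketch below is a minimal Python illustration under assumed inputs: `predicted` is a list of (i, j, score) tuples and `native_contacts` is a set of residue index pairs taken from the experimental structure (for instance, pairs whose Cβ atoms lie within 8 Å). The function name, data layout, and default sequence-separation cutoff are hypothetical choices for this example, not taken from the CASP assessment software.

```python
def top_k_precision(predicted, native_contacts, k, min_separation=24):
    """Precision of the k highest-scoring predicted contacts.

    predicted        -- iterable of (i, j, score) tuples (assumed layout)
    native_contacts  -- set of (i, j) pairs with i < j from the native structure
    k                -- number of top-ranked predictions to evaluate
    min_separation   -- keep only pairs with |i - j| >= this value; 24 is the
                        usual long-range cutoff, used here as an assumed default
    """
    # Keep only pairs in the requested sequence-separation range.
    candidates = [p for p in predicted if abs(p[0] - p[1]) >= min_separation]
    # Rank by predicted score, highest first, and keep the top k.
    ranked = sorted(candidates, key=lambda p: p[2], reverse=True)[:k]
    if not ranked:
        return 0.0
    # A prediction is a hit when the (ordered) pair is a native contact.
    hits = sum(1 for i, j, _ in ranked if (min(i, j), max(i, j)) in native_contacts)
    return hits / len(ranked)
```

Evaluations of this kind typically set `k` relative to the sequence length L (for example L/5 or L/2), so the same function can be called with different values of `k` per target.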
Affiliation(s)
- Joerg Schaarschmidt
- Faculty of Science - Chemistry, Computational Structural Biology Group, Bijvoet Center for Biomolecular Research, Utrecht University, Utrecht, The Netherlands
- Alexandre M.J.J. Bonvin
- Faculty of Science - Chemistry, Computational Structural Biology Group, Bijvoet Center for Biomolecular Research, Utrecht University, Utrecht, The Netherlands
12
Kordyukova LV, Shtykova EV, Baratova LA, Svergun DI, Batishchev OV. Matrix proteins of enveloped viruses: a case study of Influenza A virus M1 protein. J Biomol Struct Dyn 2018; 37:671-690. [PMID: 29388479 DOI: 10.1080/07391102.2018.1436089] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Indexed: 01/09/2023]
Abstract
Influenza A virus, a member of the Orthomyxoviridae family of enveloped viruses, is one of the top killers of humans and animals, and its structure and components have therefore been extensively studied over the last decades. The most abundant component, the M1 matrix protein, forms a matrix layer (scaffold) under the viral lipid envelope, and the functional roles as well as the structural peculiarities of the M1 protein are still under heavy debate. Despite multiple crystallization attempts, no high-resolution structure is available for the full-length M1 of Influenza A virus. The likely reason for the difficulties lies in the intrinsic disorder of the M1 C-terminal part, which prevents diffraction-quality crystals from being grown. Alternative structural methods, including synchrotron small-angle X-ray scattering (SAXS), atomic force microscopy, and cryo-electron microscopy/tomography, are therefore widely applied to understand the structure of M1, its self-association, and its interactions with the lipid membrane and the viral nucleocapsid. These methods reveal striking similarities in the behavior of M1 and the matrix proteins of other enveloped RNA viruses, with the differences accompanied by the specific features of the viral lifecycles, thus suggesting common interaction principles and, possibly, common evolutionary ancestors. The structural information on the Influenza A virus M1 protein obtained to date strongly suggests that the intrinsic disorder in the C-terminal domain has important functional implications.
Affiliation(s)
- Larisa V Kordyukova
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, Russian Federation
- Eleonora V Shtykova
- Shubnikov Institute of Crystallography of Federal Scientific Research Centre 'Crystallography and Photonics' of Russian Academy of Sciences, Moscow, Russian Federation; Semenov Institute of Chemical Physics, Russian Academy of Sciences, Moscow, Russian Federation
- Lyudmila A Baratova
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, Russian Federation
- Oleg V Batishchev
- Frumkin Institute of Physical Chemistry and Electrochemistry, Russian Academy of Sciences, Moscow, Russian Federation; Moscow Institute of Physics and Technology, Dolgoprudniy, Russian Federation
13
Lemos A, Melo R, Preto AJ, Almeida JG, Moreira IS, Cordeiro MNDS. In Silico Studies Targeting G-protein Coupled Receptors for Drug Research Against Parkinson's Disease. Curr Neuropharmacol 2018; 16:786-848. [PMID: 29521236 PMCID: PMC6080095 DOI: 10.2174/1570159x16666180308161642] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 03/06/2017] [Revised: 02/16/2018] [Accepted: 02/02/2018] [Indexed: 11/22/2022] Open
Abstract
Parkinson's Disease (PD) is a long-term neurodegenerative brain disorder that mainly affects the motor system. Its causes are still unknown, and although there is currently no cure, several therapeutic options are available to manage its symptoms. The development of novel antiparkinsonian agents and an understanding of their proper and optimal use are therefore in high demand. For the last decades, L-3,4-DihydrOxyPhenylAlanine or levodopa (L-DOPA) has been the gold-standard therapy for the symptomatic treatment of motor dysfunctions associated with PD. However, the development of dyskinesias and motor fluctuations (wearing-off and on-off phenomena) associated with long-term L-DOPA replacement therapy has limited its antiparkinsonian efficacy. Non-dopaminergic therapies have been widely explored in an attempt to counteract the motor side effects associated with dopamine replacement therapy. Being one of the largest cell membrane protein families, G-Protein-Coupled Receptors (GPCRs) have become a relevant target for drug discovery focused on a wide range of therapeutic areas, including Central Nervous System (CNS) diseases. The modulation of specific GPCRs potentially implicated in PD, excluding dopamine receptors, may provide promising non-dopaminergic therapeutic alternatives for symptomatic treatment of PD. In this review, we focused on the impact of specific GPCR subclasses, including dopamine receptors, adenosine receptors, muscarinic acetylcholine receptors, metabotropic glutamate receptors, and 5-hydroxytryptamine receptors, on the pathophysiology of PD and the importance of structure- and ligand-based in silico approaches for the development of small molecules to target these receptors.
Affiliation(s)
- Agostinho Lemos
- LAQV/REQUIMTE, Department of Chemistry and Biochemistry, Faculty of Sciences, University of Porto, Rua do Campo Alegre s/n, 4169-007 Porto, Portugal
- GIGA Cyclotron Research Centre In Vivo Imaging, University of Liège, 4000 Liège, Belgium
| | - Rita Melo
- CNC - Center for Neuroscience and Cell Biology, Faculty of Medicine, University of Coimbra, Rua Larga, 3004-517 Coimbra, Portugal
- Centro de Ciências e Tecnologias Nucleares, Instituto Superior Técnico, Universidade de Lisboa, Estrada Nacional 10 (ao km 139,7), 2695-066 Bobadela LRS, Portugal
| | - Antonio Jose Preto
- CNC - Center for Neuroscience and Cell Biology, Faculty of Medicine, University of Coimbra, Rua Larga, 3004-517 Coimbra, Portugal
| | - Jose Guilherme Almeida
- CNC - Center for Neuroscience and Cell Biology, Faculty of Medicine, University of Coimbra, Rua Larga, 3004-517 Coimbra, Portugal
| | - Irina Sousa Moreira
- CNC - Center for Neuroscience and Cell Biology, Faculty of Medicine, University of Coimbra, Rua Larga, 3004-517 Coimbra, Portugal
- Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Utrecht, 3584CH, The Netherlands
| | - Maria Natalia Dias Soeiro Cordeiro
- LAQV/REQUIMTE, Department of Chemistry and Biochemistry, Faculty of Sciences, University of Porto, Rua do Campo Alegre s/n, 4169-007 Porto, Portugal
|
14
|
Jing X, Dong Q, Lu R. RRCRank: a fusion method using rank strategy for residue-residue contact prediction. BMC Bioinformatics 2017; 18:390. [PMID: 28865433 PMCID: PMC5581475 DOI: 10.1186/s12859-017-1811-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 05/04/2017] [Accepted: 08/28/2017] [Indexed: 11/10/2022] Open
Abstract
Background: In structural biology, protein residue-residue contacts play a crucial role in protein structure prediction. Predicted residue-residue contacts can effectively constrain the conformational search space, which is significant for de novo protein structure prediction. Over the last few decades, researchers have developed various methods to predict residue-residue contacts, and fusion methods have achieved notable performance in recent years. In this work, a novel fusion method based on a rank strategy is proposed to predict contacts. Unlike traditional regression or classification strategies, the contact prediction task is regarded as a ranking task. First, two kinds of features are extracted from correlated mutation methods and ensemble machine-learning classifiers, and then the proposed method uses a learning-to-rank algorithm to predict the contact probability of each residue pair.
Results: First, we perform two benchmark tests for the proposed fusion method (RRCRank) on the CASP11 and CASP12 datasets, respectively. The test results show that RRCRank outperforms other well-developed methods, especially for medium- and short-range contacts. Second, to verify the superiority of the ranking strategy, we predict contacts using the traditional regression and classification strategies based on the same features as the ranking strategy. Compared with these two traditional strategies, the proposed ranking strategy shows better performance for all three contact types, in particular for long-range contacts. Third, RRCRank has been compared with several state-of-the-art methods in CASP11 and CASP12. The results show that RRCRank achieves comparable prediction precision and is better than three methods in most assessment metrics.
Conclusions: The learning-to-rank algorithm is introduced to develop a novel rank-based method for the residue-residue contact prediction of proteins, which achieves state-of-the-art performance based on the extensive assessment.
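The precision figures quoted for short-, medium-, and long-range contacts can be illustrated with a minimal sketch. It assumes the sequence-separation ranges commonly used in CASP-style contact assessment (short 6-11, medium 12-23, long 24 or more) and a top-L/k cutoff; the abstract does not state its exact cutoffs, and the function name `top_k_precision` and its parameters are hypothetical, not part of RRCRank.

```python
from typing import Dict, Optional, Set, Tuple

Pair = Tuple[int, int]  # residue indices (i, j) with i < j

# Assumed sequence-separation ranges (commonly used in CASP-style assessment;
# not taken from the abstract). None means "no upper bound".
RANGES: Dict[str, Tuple[int, Optional[int]]] = {
    "short": (6, 11),
    "medium": (12, 23),
    "long": (24, None),
}


def top_k_precision(
    scores: Dict[Pair, float],
    native: Set[Pair],
    seq_len: int,
    contact_range: str = "long",
    divisor: int = 5,
) -> float:
    """Precision of the top L/divisor ranked pairs within one separation range."""
    lo, hi = RANGES[contact_range]
    # Keep only residue pairs whose sequence separation falls in the range.
    in_range = [
        (pair, score)
        for pair, score in scores.items()
        if abs(pair[1] - pair[0]) >= lo
        and (hi is None or abs(pair[1] - pair[0]) <= hi)
    ]
    # Rank by predicted score, highest first: only the ordering matters here,
    # which is why contact prediction can be framed as a ranking task.
    ranked = sorted(in_range, key=lambda item: item[1], reverse=True)
    k = max(1, seq_len // divisor)
    top = ranked[:k]
    if not top:
        return 0.0
    hits = sum(1 for pair, _ in top if pair in native)
    return hits / len(top)


if __name__ == "__main__":
    demo_scores = {(1, 30): 0.9, (2, 40): 0.8, (3, 50): 0.1, (1, 8): 0.95}
    demo_native = {(1, 30), (3, 50)}
    print(top_k_precision(demo_scores, demo_native, seq_len=10))
```

Because the metric depends only on the order of the predicted pairs, a method that optimizes ranking directly targets the quantity being assessed, whereas regression or classification optimize it only indirectly.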
Affiliation(s)
- Xiaoyang Jing
- School of Computer Science, Fudan University, Shanghai, 200433, People's Republic of China
| | - Qiwen Dong
- School of Data Science and Engineering, East China Normal University, Shanghai, 200062, People's Republic of China.
| | - Ruqian Lu
- School of Computer Science, Fudan University, Shanghai, 200433, People's Republic of China
|
15
|
Ma L, Wang DD, Zou B, Yan H. An Eigen-Binding Site Based Method for the Analysis of Anti-EGFR Drug Resistance in Lung Cancer Treatment. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2017; 14:1187-1194. [PMID: 27187970 DOI: 10.1109/tcbb.2016.2568184] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 06/05/2023]
Abstract
We explore the drug resistance mechanism in non-small cell lung cancer treatment by characterizing the drug-binding site of a protein mutant based on local surface and energy features. These features are transformed to an eigen-binding site space and used for drug resistance level prediction and analysis.
|
16
|
L. S, Vasu P. In silico designing of therapeutic protein enriched with branched-chain amino acids for the dietary treatment of chronic liver disease. J Mol Graph Model 2017; 76:192-204. [DOI: 10.1016/j.jmgm.2017.06.015] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 04/03/2017] [Revised: 06/16/2017] [Accepted: 06/19/2017] [Indexed: 02/07/2023]
|
17
|
Xiong D, Zeng J, Gong H. A deep learning framework for improving long-range residue–residue contact prediction using a hierarchical strategy. Bioinformatics 2017; 33:2675-2683. [DOI: 10.1093/bioinformatics/btx296] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Received: 10/18/2016] [Accepted: 05/02/2017] [Indexed: 12/31/2022] Open
Affiliation(s)
- Dapeng Xiong
- MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, China
- Beijing Innovation Center of Structural Biology, Tsinghua University, Beijing, China
| | - Jianyang Zeng
- Beijing Innovation Center of Structural Biology, Tsinghua University, Beijing, China
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
| | - Haipeng Gong
- MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, China
- Beijing Innovation Center of Structural Biology, Tsinghua University, Beijing, China
|
18
|
Goh GB, Hodas NO, Vishnu A. Deep learning for computational chemistry. J Comput Chem 2017; 38:1291-1307. [PMID: 28272810 DOI: 10.1002/jcc.24764] [Citation(s) in RCA: 301] [Impact Index Per Article: 43.0] [Received: 09/14/2016] [Revised: 01/09/2017] [Accepted: 01/18/2017] [Indexed: 02/06/2023]
Abstract
The rise and fall of artificial neural networks is well documented in the scientific literature of both computer science and computational chemistry. Yet almost two decades later, we are now seeing a resurgence of interest in deep learning, a machine learning algorithm based on multilayer neural networks. Within the last few years, we have seen the transformative impact of deep learning in many domains, particularly in speech recognition and computer vision, to the extent that the majority of expert practitioners in those fields are now regularly eschewing prior established models in favor of deep learning models. In this review, we provide an introductory overview of the theory of deep neural networks and their unique properties that distinguish them from traditional machine learning algorithms used in cheminformatics. By providing an overview of the variety of emerging applications of deep neural networks, we highlight their ubiquity and broad applicability to a wide range of challenges in the field, including quantitative structure activity relationship, virtual screening, protein structure prediction, quantum chemistry, materials design, and property prediction. In reviewing the performance of deep neural networks, we observed a consistent outperformance against non-neural networks state-of-the-art models across disparate research topics, and deep neural network-based models often exceeded the "glass ceiling" expectations of their respective tasks. Coupled with the maturity of GPU-accelerated computing for training deep neural networks and the exponential growth of chemical data on which to train these networks, we anticipate that deep learning algorithms will be a valuable tool for computational chemistry. © 2017 Wiley Periodicals, Inc.
Affiliation(s)
- Garrett B Goh
- Advanced Computing, Mathematics, and Data Division, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, Washington, 99354
| | - Nathan O Hodas
- Advanced Computing, Mathematics, and Data Division, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, Washington, 99354
| | - Abhinav Vishnu
- Advanced Computing, Mathematics, and Data Division, Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, Washington, 99354
|
20
|
Webb B, Sali A. Comparative Protein Structure Modeling Using MODELLER. Current Protocols in Bioinformatics 2016; 54:5.6.1-5.6.37. [PMID: 27322406 PMCID: PMC5031415 DOI: 10.1002/cpbi.3] [Citation(s) in RCA: 1865] [Impact Index Per Article: 233.1] [Indexed: 12/22/2022]
Abstract
Comparative protein structure modeling predicts the three-dimensional structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates). The prediction process consists of fold assignment, target-template alignment, model building, and model evaluation. This unit describes how to calculate comparative models using the program MODELLER and how to use the ModBase database of such models, and discusses all four steps of comparative modeling, frequently observed errors, and some applications. Modeling lactate dehydrogenase from Trichomonas vaginalis (TvLDH) is described as an example. The download and installation of the MODELLER software is also described. © 2016 by John Wiley & Sons, Inc.
Affiliation(s)
- Benjamin Webb
- University of California at San Francisco, San Francisco, California
| | - Andrej Sali
- University of California at San Francisco, San Francisco, California
|
21
|
May JC, McLean JA. Advanced Multidimensional Separations in Mass Spectrometry: Navigating the Big Data Deluge. Annual Review of Analytical Chemistry (Palo Alto, Calif.) 2016; 9:387-409. [PMID: 27306312 PMCID: PMC5763907 DOI: 10.1146/annurev-anchem-071015-041734] [Citation(s) in RCA: 60] [Impact Index Per Article: 7.5] [Indexed: 05/10/2023]
Abstract
Hybrid analytical instrumentation constructed around mass spectrometry (MS) is becoming the preferred technique for addressing many grand challenges in science and medicine. From the omics sciences to drug discovery and synthetic biology, multidimensional separations based on MS provide the high peak capacity and high measurement throughput necessary to obtain large-scale measurements used to infer systems-level information. In this article, we describe multidimensional MS configurations as technologies that are big data drivers and review some new and emerging strategies for mining information from large-scale datasets. We discuss the information content that can be obtained from individual dimensions, as well as the unique information that can be derived by comparing different levels of data. Finally, we summarize some emerging data visualization strategies that seek to make highly dimensional datasets both accessible and comprehensible.
Affiliation(s)
- Jody C May
- Department of Chemistry, Center for Innovative Technology, Vanderbilt Institute for Chemical Biology, Vanderbilt Institute for Integrative Biosystems Research and Education, Vanderbilt University, Nashville, Tennessee 37235;
| | - John A McLean
- Department of Chemistry, Center for Innovative Technology, Vanderbilt Institute for Chemical Biology, Vanderbilt Institute for Integrative Biosystems Research and Education, Vanderbilt University, Nashville, Tennessee 37235;
|
22
|
Khan FI, Wei DQ, Gu KR, Hassan MI, Tabrez S. Current updates on computer aided protein modeling and designing. Int J Biol Macromol 2016; 85:48-62. [DOI: 10.1016/j.ijbiomac.2015.12.072] [Citation(s) in RCA: 72] [Impact Index Per Article: 9.0] [Received: 08/21/2015] [Revised: 12/17/2015] [Accepted: 12/21/2015] [Indexed: 12/15/2022]
|
23
|
Shim JY, Khurana L, Kendall DA. Computational analysis of the CB1 carboxyl-terminus in the receptor-G protein complex. Proteins 2016; 84:532-43. [PMID: 26994549 DOI: 10.1002/prot.24999] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 10/09/2015] [Revised: 01/07/2016] [Accepted: 01/19/2016] [Indexed: 01/03/2023]
Abstract
Despite the important role of the carboxyl-terminus (Ct) of the activated brain cannabinoid receptor one (CB1) in the regulation of G protein signaling, a structural understanding of interactions with G proteins is lacking. This is largely due to the highly flexible nature of the CB1 Ct that dynamically adapts its conformation to the presence of G proteins. In the present study, we explored how the CB1 Ct can interact with the G protein by building on our prior modeling of the CB1-Gi complex (Shim, Ahn, and Kendall, The Journal of Biological Chemistry 2013;288:32449-32465) to incorporate a complete CB1 Ct (Glu416(Ct)-Leu472(Ct)). Based on the structural constraints from NMR studies, we employed ROSETTA to predict tertiary folds, ZDOCK to predict docking orientation, and molecular dynamics (MD) simulations to obtain two distinct plausible models of CB1 Ct in the CB1-Gi complex. The resulting models were consistent with the NMR-determined helical structure (H9) in the middle region of the CB1 Ct. The CB1 Ct directly interacted with both Gα and Gβ and stabilized the receptor at the Gi interface. The results of site-directed mutagenesis studies of Glu416(Ct), Asp423(Ct), Asp428(Ct), and Arg444(Ct) of CB1 Ct suggested that the CB1 Ct can influence receptor-G protein coupling by stabilizing the receptor at the Gi interface. This research provided, for the first time, models of the CB1 Ct in contact with the G protein.
Affiliation(s)
- Joong-Youn Shim
- Department of Chemistry, University of North Carolina, Chapel Hill, North Carolina, 27514
| | - Leepakshi Khurana
- Department of Pharmaceutical Sciences, University of Connecticut, Storrs, Connecticut, 06269-3092
| | - Debra A Kendall
- Department of Pharmaceutical Sciences, University of Connecticut, Storrs, Connecticut, 06269-3092
|
24
|
Abstract
In the field of computational structural proteomics, contact predictions have shown new prospects of solving the longstanding problem of ab initio protein structure prediction. In the last few years, application of deep learning algorithms and availability of large protein sequence databases, combined with improvement in methods that derive contacts from multiple sequence alignments, have shown a huge increase in the precision of contact prediction. In addition, these predicted contacts have also been used to build three-dimensional models from scratch. In this chapter, we briefly discuss many elements of protein residue-residue contacts and the methods available for prediction, focusing on a state-of-the-art contact prediction tool, DNcon. Illustrating with a case study, we describe how DNcon can be used to make ab initio contact predictions for a given protein sequence and discuss how the predicted contacts may be analyzed and evaluated.
Affiliation(s)
- Badri Adhikari
- Department of Computer Science, University of Missouri, 201 Engineering Building West, Columbia, MO, 65211, USA
| | - Jianlin Cheng
- Department of Computer Science, University of Missouri, 201 Engineering Building West, Columbia, MO, 65211, USA.
|
25
|
Lee MS, Olson MA. Assessment of Detection and Refinement Strategies for de novo Protein Structures Using Force Field and Statistical Potentials. J Chem Theory Comput 2015; 3:312-24. [PMID: 26627174 DOI: 10.1021/ct600195f] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Indexed: 11/28/2022]
Abstract
De novo predictions of protein structures at high resolution are plagued by the problem of detecting the native conformation from false energy minima. In this work, we provide an assessment of various detection and refinement protocols on a small subset of the second-generation all-atom Rosetta decoy set (Tsai et al. Proteins 2003, 53, 76-87) using two potentials: the all-atom CHARMM PARAM22 force field combined with generalized Born/surface-area (GB-SA) implicit solvation and the DFIRE-AA statistical potential. Detection schemes included DFIRE-AA conformational scoring and energy minimization followed by scoring with both GB-SA and DFIRE-AA potentials. Refinement methods included short-time (1-ps) molecular dynamics simulations, temperature-based replica exchange molecular dynamics, and a new computational unfold/refold procedure. Our results indicate that simple detection with only minimization is the best protocol for finding the most nativelike structures in the decoy set. The refinement techniques that we tested are generally unsuccessful in improving detection; however, they provide marginal improvements to some of the decoy structures. Future directions in the development of refinement techniques are discussed in the context of the limitations of the protocols evaluated in this study.
Affiliation(s)
- Michael S Lee
- Computational and Information Sciences Directorate, U.S. Army Research Laboratory, Aberdeen Proving Ground, Maryland 21005, and Department of Cell Biology and Biochemistry, U.S. Army Medical Research Institute of Infectious Diseases, Frederick, Maryland 21702
| | - Mark A Olson
- Computational and Information Sciences Directorate, U.S. Army Research Laboratory, Aberdeen Proving Ground, Maryland 21005, and Department of Cell Biology and Biochemistry, U.S. Army Medical Research Institute of Infectious Diseases, Frederick, Maryland 21702
|
26
|
Rastogi S, Borgo B, Pazdernik N, Fox P, Mardis ER, Kohara Y, Havranek J, Schedl T. Caenorhabditis elegans glp-4 Encodes a Valyl Aminoacyl tRNA Synthetase. G3 (Bethesda, Md.) 2015; 5:2719-28. [PMID: 26464357 PMCID: PMC4683644 DOI: 10.1534/g3.115.021899] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 08/31/2015] [Accepted: 10/05/2015] [Indexed: 02/07/2023]
Abstract
Germline stem cell proliferation is necessary to populate the germline with sufficient numbers of cells for gametogenesis and for signaling the soma to control organismal properties such as aging. The Caenorhabditis elegans gene glp-4 was identified by the temperature-sensitive allele bn2 where mutants raised at the restrictive temperature produce adults that are essentially germ cell deficient, containing only a small number of stem cells arrested in the mitotic cycle but otherwise have a morphologically normal soma. We determined that glp-4 encodes a valyl aminoacyl transfer RNA synthetase (VARS-2) and that the probable null phenotype is early larval lethality. Phenotypic analysis indicates glp-4(bn2ts) is partial loss of function in the soma. Structural modeling suggests that bn2 Gly296Asp results in partial loss of function by a novel mechanism: aspartate 296 in the editing pocket induces inappropriate deacylation of correctly charged Val-tRNA(val). Intragenic suppressor mutations are predicted to displace aspartate 296 so that it is less able to catalyze inappropriate deacylation. Thus glp-4(bn2ts) likely causes reduced protein translation due to decreased levels of Val-tRNA(val). The germline, as a reproductive preservation mechanism during unfavorable conditions, signals the soma for organismal aging, stress and pathogen resistance. glp-4(bn2ts) mutants are widely used to generate germline deficient mutants for organismal studies, under the assumption that the soma is unaffected. As reduced translation has also been demonstrated to alter organismal properties, it is unclear whether changes in aging, stress resistance, etc. observed in glp-4(bn2ts) mutants are the result of germline deficiency or reduced translation.
Affiliation(s)
- Suchita Rastogi
- Department of Genetics, Washington University School of Medicine, St Louis, Missouri 63110
| | - Ben Borgo
- Department of Genetics, Washington University School of Medicine, St Louis, Missouri 63110
| | - Nanette Pazdernik
- Department of Genetics, Washington University School of Medicine, St Louis, Missouri 63110
| | - Paul Fox
- Department of Genetics, Washington University School of Medicine, St Louis, Missouri 63110
| | - Elaine R Mardis
- Department of Genetics, Washington University School of Medicine, St Louis, Missouri 63110
| | - Yuji Kohara
- National Institute of Genetics, Mishima, 411-8540 Japan
| | - Jim Havranek
- Department of Genetics, Washington University School of Medicine, St Louis, Missouri 63110
| | - Tim Schedl
- Department of Genetics, Washington University School of Medicine, St Louis, Missouri 63110
|
27
|
Structure-based function analysis of putative conserved proteins with isomerase activity from Haemophilus influenzae. 3 Biotech 2015; 5:741-763. [PMID: 28324524 PMCID: PMC4569619 DOI: 10.1007/s13205-014-0274-1] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 06/26/2014] [Accepted: 12/18/2014] [Indexed: 01/09/2023] Open
Abstract
Haemophilus influenzae, a Gram-negative bacterium and a member of the family Pasteurellaceae, causes chronic bronchitis, bacteremia, meningitis, etc. H. influenzae is the first organism whose genome was completely sequenced and annotated. Here, we have extensively analyzed the genome of H. influenzae using available protein structure and function analysis tools. The objective of this analysis is to assign a precise function to hypothetical proteins (HPs) whose functions have not been determined so far. Function prediction of these proteins is helpful for a precise understanding of the mechanisms of pathogenesis and the biochemical pathways important for selecting novel therapeutic targets. After an extensive analysis of the H. influenzae genome, we found 13 HPs showing a high level of sequence and structural similarity to isomerase enzymes. Consequently, the structures of these HPs have been modeled and analyzed to determine their precise functions. We found that these HPs are alanine racemase, lysine 2,3-aminomutase, topoisomerase DNA-binding C4 zinc finger, pseudouridine synthase B, C and E (Rlu B, C and E), hydroxypyruvate isomerase, nucleoside-diphosphate-sugar epimerase, amidophosphoribosyltransferase, aldose-1-epimerase, tautomerase/MIF, and xylose isomerase-like proteins, or have a TIM barrel domain or sedoheptulose-7-phosphate isomerase-like activity, signifying their corresponding functions in H. influenzae. This work provides a better understanding of the role of HPs with isomerase activities in the survival and pathogenesis of H. influenzae.
|
28
|
Yang J, Zhang W, He B, Walker SE, Zhang H, Govindarajoo B, Virtanen J, Xue Z, Shen HB, Zhang Y. Template-based protein structure prediction in CASP11 and retrospect of I-TASSER in the last decade. Proteins 2015; 84 Suppl 1:233-46. [PMID: 26343917 DOI: 10.1002/prot.24918] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.7] [Received: 05/27/2015] [Revised: 08/13/2015] [Accepted: 08/31/2015] [Indexed: 01/26/2023]
Abstract
We report the structure prediction results of a new composite pipeline for template-based modeling (TBM) in the 11th CASP experiment. Starting from multiple structure templates identified by LOMETS-based meta-threading programs, the QUARK ab initio folding program is extended to generate initial full-length models under strong constraints from template alignments. The final atomic models are then constructed by I-TASSER-based fragment reassembly simulations, followed by fragment-guided molecular dynamics simulation and MQAP-based model selection. It was found that the inclusion of QUARK-TBM simulations as an intermediate modeling step could help improve the quality of the I-TASSER models for both Easy and Hard TBM targets. Overall, the average TM-score of the first I-TASSER model is 12% higher than that of the best LOMETS templates, with the RMSD in the same threading-aligned regions reduced from 5.8 to 4.7 Å. Nevertheless, in nearly 18% of TBM domains the templates were deteriorated by the structure assembly pipeline, which may be attributed to errors in secondary structure and domain orientation predictions that propagate through and degrade the procedures of template identification and final model selection. To examine the record of progress, we made a retrospective report of the I-TASSER pipeline in the last five CASP experiments (CASP7-11). The data show no clear progress of the LOMETS threading programs over PSI-BLAST, but obvious progress in structural improvement relative to threading templates was witnessed in recent CASP experiments, which is probably attributable to the integration of the extended ab initio folding simulation with the threading assembly pipeline and the introduction of atomic-level structure refinements following the reduced modeling simulations. Proteins 2016; 84(Suppl 1):233-246. © 2015 Wiley Periodicals, Inc.
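The two metrics quoted in this abstract, TM-score and RMSD, can be sketched as follows. This is a minimal illustration, assuming the per-residue distances come from one fixed superposition of model onto native structure; the actual TM-score program additionally searches for the superposition that maximizes the score, which this sketch does not do. The function names are hypothetical, and flooring d0 at 0.5 Å for short chains is an assumption modeled on common implementations.

```python
import math
from typing import Sequence


def tm_d0(target_len: int) -> float:
    """Length-dependent distance scale d0 = 1.24 * (L - 15)^(1/3) - 1.8 (in Å)."""
    if target_len <= 15:
        return 0.5  # assumed floor for very short chains
    return max(0.5, 1.24 * (target_len - 15) ** (1.0 / 3.0) - 1.8)


def tm_score(distances: Sequence[float], target_len: int) -> float:
    """TM-score for one given superposition.

    distances: distance (Å) between each aligned residue pair.
    target_len: length of the target (native) protein used for normalization.
    """
    if target_len <= 0:
        raise ValueError("target_len must be positive")
    d0 = tm_d0(target_len)
    # Unaligned residues contribute nothing, so partial alignments are penalized.
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_len


def rmsd(distances: Sequence[float]) -> float:
    """Root-mean-square deviation over the aligned residue pairs."""
    if not distances:
        raise ValueError("need at least one aligned pair")
    return math.sqrt(sum(d * d for d in distances) / len(distances))


if __name__ == "__main__":
    example = [0.8, 1.5, 2.1, 6.0]  # made-up distances, for illustration only
    print(tm_score(example, target_len=120), rmsd(example))
```

Unlike RMSD, TM-score down-weights large deviations and is normalized by the target length, so a few badly placed residues affect it less. For reference, the RMSD change reported above (5.8 to 4.7 Å) is a relative reduction of roughly 19%.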
Affiliation(s)
- Jianyi Yang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, 48109
- Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, 48109
| | - Wenxuan Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, 48109
- Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, 48109
| | - Baoji He
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, 48109
- Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, 48109
| | - Sara Elizabeth Walker
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, 48109
- Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, 48109
| | - Hongjiu Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, 48109
- Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, 48109
| | - Brandon Govindarajoo
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, 48109
- Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, 48109
| | - Jouko Virtanen
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, 48109
- Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, 48109
| | - Zhidong Xue
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, 48109
- Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, 48109
| | - Hong-Bin Shen
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, 48109
- Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, 48109
| | - Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, 48109.
- Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, 48109.
|
29
|
Yang J, He BJ, Jang R, Zhang Y, Shen HB. Accurate disulfide-bonding network predictions improve ab initio structure prediction of cysteine-rich proteins. Bioinformatics 2015; 31:3773-81. [PMID: 26254435 DOI: 10.1093/bioinformatics/btv459] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 02/11/2015] [Accepted: 08/02/2015] [Indexed: 01/19/2023] Open
Abstract
Motivation: Cysteine-rich proteins cover many important families in nature but there are currently no methods specifically designed for modeling the structure of these proteins. The accuracy of disulfide connectivity pattern prediction, particularly for the proteins of higher-order connections, e.g., >3 bonds, is too low to effectively assist structure assembly simulations.
Results: We propose a new hierarchical order reduction protocol called Cyscon for disulfide-bonding prediction. The most confident disulfide bonds are first identified and bonding prediction is then focused on the remaining cysteine residues based on SVR training. Compared with purely machine learning-based approaches, Cyscon improved the average accuracy of connectivity pattern prediction by 21.9%. For proteins with more than 5 disulfide bonds, Cyscon improved the accuracy by 585% on the benchmark set of PDBCYS. When applied to 158 non-redundant cysteine-rich proteins, Cyscon predictions helped increase (or decrease) the TM-score (or RMSD) of the ab initio QUARK modeling by 12.1% (or 14.4%). This result demonstrates a new avenue to improve the ab initio structure modeling for cysteine-rich proteins.
Availability and implementation: http://www.csbio.sjtu.edu.cn/bioinf/Cyscon/
Contact: zhng@umich.edu or hbshen@sjtu.edu.cn.
Supplementary information: Supplementary data are available at Bioinformatics online.
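The idea of order reduction can be illustrated with a small sketch. It is not Cyscon's implementation (which relies on SVR training that the abstract does not detail): here the pairwise bonding scores are simply given as input, the confidence threshold `confident` is a hypothetical parameter, and the remaining cysteines are paired by exhaustive search over perfect matchings.

```python
from typing import Dict, List, Tuple

Pair = Tuple[int, int]  # cysteine positions (i, j) with i < j


def best_matching(cys: List[int], score: Dict[Pair, float]) -> Tuple[float, List[Pair]]:
    """Exhaustively find the pairing of `cys` (sorted) with the highest total score."""
    if len(cys) < 2:
        return 0.0, []
    first, rest = cys[0], cys[1:]
    best_total, best_pairs = float("-inf"), []
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        sub_total, sub_pairs = best_matching(remaining, score)
        total = score.get((first, partner), 0.0) + sub_total
        if total > best_total:
            best_total, best_pairs = total, [(first, partner)] + sub_pairs
    return best_total, best_pairs


def predict_bonds(cys: List[int], score: Dict[Pair, float], confident: float = 0.9) -> List[Pair]:
    """Fix high-confidence bonds first, then pair only the remaining cysteines."""
    fixed: List[Pair] = []
    used = set()
    # Step 1: accept non-conflicting pairs whose score reaches the threshold.
    for (i, j), s in sorted(score.items(), key=lambda kv: kv[1], reverse=True):
        if s < confident:
            break
        if i not in used and j not in used:
            fixed.append((i, j))
            used.update((i, j))
    # Step 2: search connectivity patterns only over what is left.
    remaining = [c for c in sorted(cys) if c not in used]
    _, rest = best_matching(remaining, score)
    return sorted(fixed + rest)


if __name__ == "__main__":
    demo = {(1, 4): 0.95, (2, 3): 0.6, (5, 6): 0.5, (2, 5): 0.3, (3, 6): 0.3}
    print(predict_bonds([1, 2, 3, 4, 5, 6], demo))
```

The benefit of fixing bonds early is combinatorial: 2n fully bonded cysteines admit (2n-1)!! connectivity patterns (945 for 5 bonds, 10395 for 6), so each bond fixed in advance shrinks the search space by a large factor.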
Affiliation(s)
- Jing Yang
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China
- Bao-Ji He
- State Key Laboratory of Theoretical Physics, Institute of Theoretical Physics, Chinese Academy of Sciences, Beijing 100190, China, Department of Computational Medicine and Bioinformatics and
- Richard Jang
- Department of Computational Medicine and Bioinformatics and
- Yang Zhang
- Department of Computational Medicine and Bioinformatics and Department of Biological Chemistry, University of Michigan, Ann Arbor, MI 48109, USA
- Hong-Bin Shen
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China, Department of Computational Medicine and Bioinformatics and
|
30
|
Shen Y, Bax A. Homology modeling of larger proteins guided by chemical shifts. Nat Methods 2015; 12:747-50. [PMID: 26053889 PMCID: PMC4521993 DOI: 10.1038/nmeth.3437] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/27/2014] [Accepted: 03/25/2015] [Indexed: 12/22/2022]
Abstract
We describe an approach to the structure determination of large proteins that relies on experimental NMR chemical shifts, plus sparse nuclear Overhauser effect (NOE) data if available. Our alignment method, POMONA (protein alignments obtained by matching of NMR assignments), directly exploits pre-existing bioinformatics algorithms to match experimental chemical shifts to values predicted for the crystallographic database. Protein templates generated by POMONA are subsequently used as input for chemical shift-based Rosetta comparative modeling (CS-RosettaCM) to generate reliable full-atom models.
Affiliation(s)
- Yang Shen
- Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases, US National Institutes of Health, Bethesda, Maryland, USA
- Ad Bax
- Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases, US National Institutes of Health, Bethesda, Maryland, USA
|
31
|
Shahbaaz M, Ahmad F, Imtaiyaz Hassan M. Structure-based functional annotation of putative conserved proteins having lyase activity from Haemophilus influenzae. 3 Biotech 2015; 5:317-336. [PMID: 28324295 PMCID: PMC4434415 DOI: 10.1007/s13205-014-0231-z] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/13/2014] [Accepted: 05/28/2014] [Indexed: 12/20/2022] Open
Abstract
Haemophilus influenzae is a small pleomorphic Gram-negative bacterium that causes several chronic diseases, including bacteremia, meningitis, cellulitis, epiglottitis, septic arthritis, pneumonia, and empyema. Here we extensively analyzed the sequenced genome of H. influenzae strain Rd KW20 using protein family databases, protein structure prediction, pathways and genome context methods to assign a precise function to proteins whose functions are unknown. These proteins are termed hypothetical proteins (HPs), for which no experimental information is available. Function prediction of these proteins would help to precisely understand the biochemical pathways and mechanism of pathogenesis of Haemophilus influenzae. During the extensive analysis of the H. influenzae genome, we found eight HPs showing lyase activity. Subsequently, we modeled and analyzed the three-dimensional structures of all these HPs to determine their functions more precisely. We found that these HPs possess cystathionine-β-synthase, cyclase, carboxymuconolactone decarboxylase, pseudouridine synthase A and C, D-tagatose-1,6-bisphosphate aldolase and aminodeoxychorismate lyase-like features, indicating their corresponding functions in H. influenzae. Lyases are actively involved in the regulation of biosynthesis of various hormones, metabolic pathways, signal transduction, and DNA repair. Lyases are also considered key players in various biological processes. These enzymes are essential for the survival and pathogenesis of H. influenzae and may therefore be considered potential targets for structure-based rational drug design. Our structure–function relationship analysis will be useful to search and design potential lead molecules based on the structure of these lyases, for drug design and discovery.
Affiliation(s)
- Mohd Shahbaaz
- Department of Computer Science, Jamia Millia Islamia, New Delhi, 110025, India
- Faizan Ahmad
- Center for Interdisciplinary Research in Basic Sciences, Jamia Millia Islamia, Jamia Nagar, New Delhi, 110025, India
- Md Imtaiyaz Hassan
- Center for Interdisciplinary Research in Basic Sciences, Jamia Millia Islamia, Jamia Nagar, New Delhi, 110025, India
|
32
|
Xun S, Jiang F, Wu YD. Significant Refinement of Protein Structure Models Using a Residue-Specific Force Field. J Chem Theory Comput 2015; 11:1949-56. [PMID: 26574396 DOI: 10.1021/acs.jctc.5b00029] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/12/2023]
Abstract
An important application of all-atom explicit-solvent molecular dynamics (MD) simulations is the refinement of protein structures from low-resolution experiments or template-based modeling. A critical requirement is that the native structure is stable with the force field. We have applied a recently developed residue-specific force field, RSFF1, to a set of 30 refinement targets from recent CASP experiments. Starting from their experimental structures, 1.0 μs unrestrained simulations at 298 K retain most of the native structures quite well except for a few flexible terminals and long internal loops. Starting from each homology model, a 150 ns MD simulation at 380 K generates the best RMSD improvement of 0.85 Å on average. The structural improvements roughly correlate with the RMSD of the initial homology models, indicating possible consistent structure refinement. Finally, targets TR614 and TR624 have been subjected to long-time replica-exchange MD simulations. Significant structural improvements are generated, with RMSD of 1.91 and 1.36 Å with respect to their crystal structures. Thus, it is possible to achieve realistic refinement of protein structure models to near-experimental accuracy, using accurate force field with sufficient conformational sampling.
Affiliation(s)
- Sangni Xun
- Laboratory of Computational Chemistry and Drug Design, Laboratory of Chemical Genomics, Peking University Shenzhen Graduate School, Shenzhen, 518055, China
- Fan Jiang
- Laboratory of Computational Chemistry and Drug Design, Laboratory of Chemical Genomics, Peking University Shenzhen Graduate School, Shenzhen, 518055, China
- Yun-Dong Wu
- Laboratory of Computational Chemistry and Drug Design, Laboratory of Chemical Genomics, Peking University Shenzhen Graduate School, Shenzhen, 518055, China; College of Chemistry and Molecular Engineering, Peking University, Beijing, 100871, China
|
33
|
Helmer D, Rink I, Dalton JAR, Brahm K, Jöst M, Nargang TM, Blum W, Wadhwani P, Brenner-Weiss G, Rapp BE, Giraldo J, Schmitz K. Rational design of a peptide capture agent for CXCL8 based on a model of the CXCL8:CXCR1 complex. RSC Adv 2015. [DOI: 10.1039/c4ra13749c] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022] Open
Abstract
A CXCL8-binding peptide designed from the interaction sites of CXCR1 with CXCL8 serves as a capture agent and inhibits neutrophil migration.
|
34
|
DeLuca SH, DeLuca SL, Leaver-Fay A, Meiler J. RosettaTMH: a method for membrane protein structure elucidation combining EPR distance restraints with assembly of transmembrane helices. AIMS BIOPHYSICS 2015. [DOI: 10.3934/biophy.2016.1.1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
|
35
|
Abstract
Functional characterization of a protein sequence is one of the most frequent problems in biology. This task is usually facilitated by accurate three-dimensional (3-D) structure of the studied protein. In the absence of an experimentally determined structure, comparative or homology modeling can sometimes provide a useful 3-D model for a protein that is related to at least one known protein structure. Comparative modeling predicts the 3-D structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates). The prediction process consists of fold assignment, target-template alignment, model building, and model evaluation. This unit describes how to calculate comparative models using the program MODELLER and discusses all four steps of comparative modeling, frequently observed errors, and some applications. Modeling lactate dehydrogenase from Trichomonas vaginalis (TvLDH) is described as an example. The download and installation of the MODELLER software is also described.
Affiliation(s)
- Benjamin Webb
- University of California at San Francisco, San Francisco, California
|
36
|
Shishkov AV, Bogacheva EN. Tritium planigraphy and nanosized biological particles. RUSSIAN JOURNAL OF PHYSICAL CHEMISTRY B 2014. [DOI: 10.1134/s1990793114040162] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]
|
37
|
Dalton JAR, Gómez-Santacana X, Llebaria A, Giraldo J. Computational analysis of negative and positive allosteric modulator binding and function in metabotropic glutamate receptor 5 (in)activation. J Chem Inf Model 2014; 54:1476-87. [PMID: 24793143 DOI: 10.1021/ci500127c] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/07/2023]
Abstract
Metabotropic glutamate receptors (mGluRs) are high-profile G-protein-coupled receptor drug targets because of their involvement in several neurological disease states, and mGluR5 in particular is a subtype whose controlled allosteric modulation, both positive and negative, can potentially be useful for the treatment of schizophrenia and relief of chronic pain, respectively. Here we model mGluR5 with a collection of positive and negative allosteric modulators (PAMs and NAMs) in both active and inactive receptor states, in a manner that is consistent with experimental information, using a specialized protocol that includes homology to increase docking accuracy, and receptor relaxation to generate an individual induced fit with each allosteric modulator. Results implicate two residues in particular for NAM and PAM function: NAM interaction with W785 for receptor inactivation, and NAM/PAM H-bonding with S809 for receptor (in)activation. Models suggest the orientation of the H-bond between allosteric modulator and S809, controlled by PAM/NAM chemistry, influences the position of TM7, which in turn influences the shape of the allosteric site, and potentially the receptor state. NAM-bound and PAM-bound mGluR5 models also reveal that although PAMs and NAMs bind in the same pocket and share similar binding modes, they have distinct effects on the conformation of the receptor. Our models, together with the identification of a possible activation mechanism, may be useful in the rational design of new allosteric modulators for mGluR5.
Affiliation(s)
- James A R Dalton
- Laboratory of Molecular Neuropharmacology and Bioinformatics, Institut de Neurociències and Unitat de Bioestadística, Universitat Autònoma de Barcelona, 08193 Bellaterra, Barcelona, Spain
|
38
|
Larsen A, Wagner JR, Jain A, Vaidehi N. Protein structure refinement of CASP target proteins using GNEIMO torsional dynamics method. J Chem Inf Model 2014; 54:508-17. [PMID: 24397429 PMCID: PMC3985798 DOI: 10.1021/ci400484c] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/16/2013] [Indexed: 11/30/2022]
Abstract
A longstanding challenge in using computational methods for protein structure prediction is the refinement of low-resolution structural models derived from comparative modeling methods into highly accurate atomistic models useful for detailed structural studies. Previously, we have developed and demonstrated the utility of the internal coordinate molecular dynamics (MD) technique, generalized Newton-Euler inverse mass operator (GNEIMO), for refinement of small proteins. Using GNEIMO, the high-frequency degrees of freedom are frozen and the protein is modeled as a collection of rigid clusters connected by torsional hinges. This physical model allows larger integration time steps and focuses the conformational search in the low frequency torsional degrees of freedom. Here, we have applied GNEIMO with temperature replica exchange to refine low-resolution protein models of 30 proteins taken from the Critical Assessment of Structure Prediction (CASP) competition. We have shown that the GNEIMO torsional MD method leads to refinement of up to 1.3 Å in the root-mean-square deviation in coordinates for 30 CASP target proteins without using any experimental data as restraints in performing the GNEIMO simulations. This is in contrast with the unconstrained all-atom Cartesian MD method performed under the same conditions, where refinement requires the use of restraints during the simulations.
Affiliation(s)
- Adrien B. Larsen
- Division of Immunology, Beckman Research Institute of the City of Hope, 1500, E. Duarte Road, Duarte, California 91010, United States
- Jeffrey R. Wagner
- Division of Immunology, Beckman Research Institute of the City of Hope, 1500, E. Duarte Road, Duarte, California 91010, United States
- Abhinandan Jain
- Jet Propulsion Laboratory, California Institute of Technology, Pasadena, California 91109, United States
- Nagarajan Vaidehi
- Division of Immunology, Beckman Research Institute of the City of Hope, 1500, E. Duarte Road, Duarte, California 91010, United States
|
39
|
Custódio FL, Barbosa HJ, Dardenne LE. A multiple minima genetic algorithm for protein structure prediction. Appl Soft Comput 2014. [DOI: 10.1016/j.asoc.2013.10.029] [Citation(s) in RCA: 52] [Impact Index Per Article: 5.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/17/2022]
|
40
|
Mao B, Tejero R, Baker D, Montelione GT. Protein NMR structures refined with Rosetta have higher accuracy relative to corresponding X-ray crystal structures. J Am Chem Soc 2014; 136:1893-906. [PMID: 24392845 PMCID: PMC4129517 DOI: 10.1021/ja409845w] [Citation(s) in RCA: 51] [Impact Index Per Article: 5.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Abstract
We have found that refinement of protein NMR structures using Rosetta with experimental NMR restraints yields more accurate protein NMR structures than those that have been deposited in the PDB using standard refinement protocols. Using 40 pairs of NMR and X-ray crystal structures determined by the Northeast Structural Genomics Consortium, for proteins ranging in size from 5-22 kDa, restrained Rosetta refined structures fit better to the raw experimental data, are in better agreement with their X-ray counterparts, and have better phasing power compared to conventionally determined NMR structures. For 37 proteins for which NMR ensembles were available and which had similar structures in solution and in the crystal, all of the restrained Rosetta refined NMR structures were sufficiently accurate to be used for solving the corresponding X-ray crystal structures by molecular replacement. The protocol for restrained refinement of protein NMR structures was also compared with restrained CS-Rosetta calculations. For proteins smaller than 10 kDa, restrained CS-Rosetta, starting from extended conformations, provides slightly more accurate structures, while for proteins in the size range of 10-25 kDa the less CPU intensive restrained Rosetta refinement protocols provided equally or more accurate structures. The restrained Rosetta protocols described here can improve the accuracy of protein NMR structures and should find broad and general use in studies of protein structure and function.
Affiliation(s)
- Binchen Mao
- Center for Advanced Biotechnology and Medicine, and Department of Molecular Biology and Biochemistry, and Department of Biochemistry and Molecular Biology of Robert Wood Johnson Medical School, and Northeast Structural Genomics Consortium, Rutgers, The State University of New Jersey, Piscataway, New Jersey 08854, United States
|
41
|
Abstract
Computer-aided drug discovery/design methods have played a major role in the development of therapeutically important small molecules for over three decades. These methods are broadly classified as either structure-based or ligand-based methods. Structure-based methods are in principle analogous to high-throughput screening in that both target and ligand structure information is imperative. Structure-based approaches include ligand docking, pharmacophore, and ligand design methods. The article discusses theory behind the most important methods and recent successful applications. Ligand-based methods use only ligand information for predicting activity depending on its similarity/dissimilarity to previously known active ligands. We review widely used ligand-based methods such as ligand-based pharmacophores, molecular descriptors, and quantitative structure-activity relationships. In addition, important tools such as target/ligand data bases, homology modeling, ligand fingerprint methods, etc., necessary for successful implementation of various computer-aided drug discovery/design methods in a drug discovery campaign are discussed. Finally, computational methods for toxicity prediction and optimization for favorable physiologic properties are discussed with successful examples from literature.
Affiliation(s)
- Gregory Sliwoski
- Jr., Center for Structural Biology, 465 21st Ave South, BIOSCI/MRBIII, Room 5144A, Nashville, TN 37232-8725.
|
42
|
Abstract
The Nipah virus phosphoprotein (P) is multimeric and tethers the viral polymerase to the nucleocapsid. We present the crystal structure of the multimerization domain of Nipah virus P: a long, parallel, tetrameric, coiled coil with a small, α-helical cap structure. Across the paramyxoviruses, these domains share little sequence identity yet are similar in length and structural organization, suggesting a common requirement for scaffolding or spatial organization of the functions of P in the virus life cycle.
|
43
|
Monastyrskyy B, D'Andrea D, Fidelis K, Tramontano A, Kryshtafovych A. Evaluation of residue-residue contact prediction in CASP10. Proteins 2013; 82 Suppl 2:138-53. [PMID: 23760879 DOI: 10.1002/prot.24340] [Citation(s) in RCA: 68] [Impact Index Per Article: 6.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/18/2013] [Revised: 05/14/2013] [Accepted: 05/21/2013] [Indexed: 12/13/2022]
Abstract
We present the results of the assessment of the intramolecular residue-residue contact predictions from 26 prediction groups participating in the 10th round of the CASP experiment. The most recently developed direct coupling analysis methods did not take part in the experiment likely because they require a very deep sequence alignment not available for any of the 114 CASP10 targets. The performance of contact prediction methods was evaluated with the measures used in previous CASPs (i.e., prediction accuracy and the difference between the distribution of the predicted contacts and that of all pairs of residues in the target protein), as well as new measures, such as the Matthews correlation coefficient, the area under the precision-recall curve and the ranks of the first correctly and incorrectly predicted contact. We also evaluated the ability to detect interdomain contacts and tested whether the difficulty of predicting contacts depends upon the protein length and the depth of the family sequence alignment. The analyses were carried out on the target domains for which structural homologs did not exist or were difficult to identify. The evaluation was performed for all types of contacts (short, medium, and long-range), with emphasis placed on long-range contacts, i.e. those involving residues separated by at least 24 residues along the sequence. The assessment suggests that the best CASP10 contact prediction methods perform at approximately the same level, and comparably to those participating in CASP9.
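Two of the measures named in this abstract, prediction accuracy (precision) and the Matthews correlation coefficient restricted to long-range contacts, can be sketched as follows. This is an illustrative sketch only, not the assessors' code: the function name, the zero-based residue indexing and the representation of contacts as `(i, j)` pairs with `i < j` are assumptions, and the default separation of 24 residues follows the long-range definition quoted above.

```python
import math


def long_range_contact_scores(predicted, native, length, min_sep=24):
    """Precision and Matthews correlation coefficient for long-range contacts.

    predicted, native: sets of (i, j) residue-index pairs with i < j.
    length:            number of residues in the target.
    min_sep:           minimum sequence separation counted as long-range.
    """
    # Every residue pair that qualifies as long-range in this target.
    pairs = {(i, j) for i in range(length) for j in range(i + min_sep, length)}
    # Ignore contacts that are not long-range.
    pred = {p for p in predicted if p in pairs}
    nat = {p for p in native if p in pairs}

    tp = len(pred & nat)            # predicted and observed
    fp = len(pred - nat)            # predicted but not observed
    fn = len(nat - pred)            # observed but not predicted
    tn = len(pairs) - tp - fp - fn  # neither predicted nor observed

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return precision, mcc
```

Because non-contacts vastly outnumber contacts in a real protein, precision alone ignores the true-negative count, whereas the Matthews coefficient uses all four cells of the confusion matrix.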
|
44
|
Lessons from application of the UNRES force field to predictions of structures of CASP10 targets. Proc Natl Acad Sci U S A 2013; 110:14936-41. [PMID: 23980156 DOI: 10.1073/pnas.1313316110] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Abstract
The performance of the physics-based protocol, whose main component is the United Residue (UNRES) physics-based coarse-grained force field, developed in our laboratory for the prediction of protein structure from amino acid sequence, is illustrated. Candidate models are selected, based on probabilities of the conformational families determined by multiplexed replica-exchange simulations, from the 10th Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP10). For target T0663, classified as a new fold, which consists of two domains homologous to those of known proteins, UNRES predicted the correct symmetry of packing, in which the domains are rotated with respect to each other by 180° in the experimental structure. By contrast, models obtained by knowledge-based methods, in which each domain is modeled very accurately but not rotated, resulted in incorrect packing. Two UNRES models of this target were featured by the assessors. Correct domain packing was also predicted by UNRES for the homologous target T0644, which has a similar structure to that of T0663, except that the two domains are not rotated. Predictions for two other targets, T0668 and T0684_D2, are among the best ones by global distance test score. These results suggest that our physics-based method has substantial predictive power. In particular, it has the ability to predict domain-domain orientations, which is a significant advance in the state of the art.
|
45
|
Sathler PC, Santana M, Lourenço AL, Rodrigues CR, Abreu P, Cabral LM, Castro HC. Human thromboxane synthase: comparative modeling and docking evaluation with the competitive inhibitors Dazoxiben and Ozagrel. J Enzyme Inhib Med Chem 2013; 29:527-31. [PMID: 23914925 DOI: 10.3109/14756366.2013.817403] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open
Abstract
Thromboxane synthase (TXAS) is a P450 epoxygenase that synthesizes thromboxane A2 (TXA2), a potent mediator of platelet aggregation, vasoconstriction and bronchoconstriction. This enzyme plays an important role in several human diseases, including myocardial infarction, stroke, septic shock, asthma and cancer. Despite the increasing interest in developing TXAS inhibitors, the structure and activity of TXAS are still not fully elucidated. In this study, we used a comparative molecular modeling approach to construct a reliable model of TXAS and analyze its interactions with Dazoxiben and Ozagrel, two competitive inhibitors. Our results were compatible with published experimental data, showing a feasible cation-π interaction between the iron atom of the heme group of TXAS and the basic nitrogen atom of the imidazolyl group of those inhibitors. In the absence of the experimental structure of thromboxane synthase, this freely available model may be useful for designing new antiplatelet drugs for diseases related to TXA2.
Affiliation(s)
- Plínio Cunha Sathler
- School of Pharmacy, Federal University of Rio de Janeiro, Niterói, Rio de Janeiro, Brazil
|
46
|
Combs SA, Deluca SL, Deluca SH, Lemmon GH, Nannemann DP, Nguyen ED, Willis JR, Sheehan JH, Meiler J. Small-molecule ligand docking into comparative models with Rosetta. Nat Protoc 2013; 8:1277-98. [PMID: 23744289 DOI: 10.1038/nprot.2013.074] [Citation(s) in RCA: 120] [Impact Index Per Article: 10.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/24/2022]
Abstract
Structure-based drug design is frequently used to accelerate the development of small-molecule therapeutics. Although substantial progress has been made in X-ray crystallography and nuclear magnetic resonance (NMR) spectroscopy, the availability of high-resolution structures is limited owing to the frequent inability to crystallize or obtain sufficient NMR restraints for large or flexible proteins. Computational methods can be used to both predict unknown protein structures and model ligand interactions when experimental data are unavailable. This paper describes a comprehensive and detailed protocol using the Rosetta modeling suite to dock small-molecule ligands into comparative models. In the protocol presented here, we review the comparative modeling process, including sequence alignment, threading and loop building. Next, we cover docking a small-molecule ligand into the protein comparative model. In addition, we discuss criteria that can improve ligand docking into comparative models. Finally, and importantly, we present a strategy for assessing model quality. The entire protocol is presented on a single example selected solely for didactic purposes. The results are therefore not representative and do not replace benchmarks published elsewhere. We also provide an additional tutorial so that the user can gain hands-on experience in using Rosetta. The protocol should take 5-7 h, with additional time allocated for computer generation of models.
Affiliation(s)
- Steven A Combs
- Center for Structural Biology, Vanderbilt University, Nashville, Tennessee, USA
|
47
|
Bhattacharya D, Cheng J. 3Drefine: consistent protein structure refinement by optimizing hydrogen bonding network and atomic-level energy minimization. Proteins 2013; 81:119-31. [PMID: 22927229 PMCID: PMC3634918 DOI: 10.1002/prot.24167] [Citation(s) in RCA: 130] [Impact Index Per Article: 11.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/02/2012] [Revised: 07/26/2012] [Accepted: 08/17/2012] [Indexed: 12/27/2022]
Abstract
One of the major limitations of computational protein structure prediction is the deviation of predicted models from their experimentally derived true, native structures. The limitations often hinder the possibility of applying computational protein structure prediction methods in biochemical assignment and drug design that are very sensitive to structural details. Refinement of these low-resolution predicted models to high-resolution structures close to the native state, however, has proven to be extremely challenging. Thus, protein structure refinement remains a largely unsolved problem. Critical assessment of techniques for protein structure prediction (CASP) specifically indicated that most predictors participating in the refinement category still did not consistently improve model quality. Here, we propose a two-step refinement protocol, called 3Drefine, to consistently bring the initial model closer to the native structure. The first step is based on optimization of the hydrogen bonding (HB) network and the second step applies atomic-level energy minimization on the optimized model using a composite physics and knowledge-based force field. The approach has been evaluated on the CASP benchmark data and it exhibits consistent improvement over the initial structure in both global and local structural quality measures. The 3Drefine method is also computationally inexpensive, consuming only a few minutes of CPU time to refine a protein of typical length (300 residues). The 3Drefine web server is freely available at http://sysbio.rnet.missouri.edu/3Drefine/.
Affiliation(s)
- Jianlin Cheng
- Department of Computer Science, University of Missouri, Columbia, MO 65211, USA
- Informatics Institute, University of Missouri, Columbia, MO 65211, USA
- Bond Life Science Center, University of Missouri, Columbia, MO 65211, USA
|
48
|
Kaufmann KW, Meiler J. Using RosettaLigand for small molecule docking into comparative models. PLoS One 2012; 7:e50769. [PMID: 23239984 PMCID: PMC3519832 DOI: 10.1371/journal.pone.0050769] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/30/2012] [Accepted: 10/24/2012] [Indexed: 11/18/2022] Open
Abstract
Computational small molecule docking into comparative models of proteins is widely used to query protein function and in the development of small molecule therapeutics. We benchmark RosettaLigand docking into comparative models for nine proteins built during CASP8 that contain ligands. We supplement the study with 21 additional protein/ligand complexes to cover a wider space of chemotypes. During a full docking run in 21 of the 30 cases, RosettaLigand successfully found a native-like binding mode among the top ten scoring binding modes. From the benchmark cases we find that careful template selection based on ligand occupancy provides the best chance of success while overall sequence identity between template and target does not appear to improve results. We also find that binding energy normalized by atom number is often less than -0.4 in native-like binding modes.
Affiliation(s)
- Kristian W. Kaufmann
- Department of Chemistry, Vanderbilt University, Nashville, Tennessee, United States of America
- Department of Pharmacology, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
| | - Jens Meiler
- Department of Chemistry, Vanderbilt University, Nashville, Tennessee, United States of America
- Department of Pharmacology, Vanderbilt University Medical Center, Nashville, Tennessee, United States of America
- Center for Structural Biology, Vanderbilt University, Nashville, Tennessee, United States of America
- Institute of Chemical Biology, Vanderbilt University, Nashville, Tennessee, United States of America
|
49
|
Xu D, Zhang Y. Toward optimal fragment generations for ab initio protein structure assembly. Proteins 2012; 81:229-39. [PMID: 22972754 DOI: 10.1002/prot.24179] [Citation(s) in RCA: 170] [Impact Index Per Article: 14.2] [Received: 04/16/2012] [Revised: 08/06/2012] [Accepted: 09/03/2012] [Indexed: 01/03/2023]
Abstract
Fragment assembly using structural motifs excised from other solved proteins has been shown to be an efficient method for ab initio protein-structure prediction. However, how to construct accurate fragments, how to derive optimal restraints from fragments, and what the best fragment length is are basic issues yet to be systematically examined. In this work, we developed a gapless-threading method to generate position-specific structure fragments. Distance profiles and torsion angle pairs are then derived from the fragments by statistical consistency analysis, which achieved accuracy comparable to that of machine-learning-based methods although the fragments were taken from unrelated proteins. When measured by the accuracies of both the derived distance profiles and the torsion angle pairs, we come to a consistent conclusion that the optimal fragment length for structural assembly is around 10, and at least 100 fragments at each location are needed to achieve optimal structure assembly. The distance profiles and torsion angle pairs derived from the fragments have been successfully used in QUARK for ab initio protein structure assembly and are provided by the QUARK online server at http://zhanglab.ccmb.med.umich.edu/QUARK/.
Affiliation(s)
- Dong Xu
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, USA
|
50
|
Abstract
MOTIVATION: Residue-residue contact prediction is important for protein structure prediction and other applications. However, the accuracy of current contact predictors often barely exceeds 20% on long-range contacts, falling short of the level required for ab initio structure prediction.
RESULTS: Here, we develop a novel machine learning approach for contact map prediction using three steps of increasing resolution. First, we use 2D recursive neural networks to predict coarse contacts and orientations between secondary structure elements. Second, we use an energy-based method to align secondary structure elements and predict contact probabilities between residues in contacting alpha-helices or strands. Third, we use a deep neural network architecture to organize and progressively refine the prediction of contacts, integrating information over both space and time. We train the architecture on a large set of non-redundant proteins and test it on a large set of non-homologous domains, as well as on the set of protein domains used for contact prediction in the two most recent CASP8 and CASP9 experiments. For long-range contacts, the accuracy of the new CMAPpro predictor is close to 30%, a significant increase over existing approaches.
AVAILABILITY: CMAPpro is available as part of the SCRATCH suite at http://scratch.proteomics.ics.uci.edu/.
CONTACT: pfbaldi@uci.edu
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Pietro Di Lena
- Department of Computer Science, University of California, Irvine, CA 92697, USA
|