1
Rosignoli S, Lustrino E, Di Silverio I, Paiardini A. Making Use of Averaging Methods in MODELLER for Protein Structure Prediction. Int J Mol Sci 2024; 25:1731. [PMID: 38339009; PMCID: PMC10855553; DOI: 10.3390/ijms25031731] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2023] [Revised: 01/23/2024] [Accepted: 01/29/2024] [Indexed: 02/12/2024]
Abstract
Recent advances in protein structure prediction, driven by AlphaFold 2 and machine learning, demonstrate proficiency in static structures but encounter challenges in capturing essential dynamic features crucial for understanding biological function. In this context, homology-based modeling emerges as a cost-effective and computationally efficient alternative. The MODELLER (version 10.5, accessed on 30 November 2023) algorithm can be harnessed for this purpose since it computes intermediate models during simulated annealing, enabling the exploration of attainable configurational states and energies while minimizing its objective function. There have been a few attempts to date to improve the models generated by its algorithm, and in particular, there is no literature regarding the implementation of an averaging procedure involving the intermediate models in the MODELLER algorithm. In this study, we examined MODELLER's output using 225 target-template pairs, extracting the best representatives of intermediate models. Applying an averaging procedure to the selected intermediate structures based on statistical potentials, we aimed to determine: (1) whether averaging improves the quality of structural models during the building phase; (2) if ranking by statistical potentials reliably selects the best models, leading to improved final model quality; (3) whether using a single template versus multiple templates affects the averaging approach; (4) whether the "ensemble" nature of the MODELLER building phase can be harnessed to capture low-energy conformations in holo structures modeling. Our findings indicate that while improvements typically fall short of a few decimal points in the model evaluation metric, a notable fraction of configurations exhibit slightly higher similarity to the native structure than MODELLER's proposed final model. 
The averaging-building procedure proves particularly beneficial in (1) regions of low sequence identity between the target and template(s), the most challenging aspect of homology modeling; and (2) the generation of holo protein conformations, an area in which MODELLER and related tools usually fall short of the expected performance.
Affiliation(s)
- Alessandro Paiardini
- Department of Biochemical Sciences, Sapienza University of Rome, 00185 Rome, Italy; (S.R.); (E.L.); (I.D.S.)
2
Abstract
Protein structure modeling is one of the most advanced and complex processes in computational biology. One of the major problems for the protein structure prediction field has been how to estimate the accuracy of the predicted 3D models, on both a local and global level, in the absence of known structures. We must be able to accurately measure the confidence that we have in the quality of predicted 3D models of proteins for them to become widely adopted by the general bioscience community. To address this major issue, it was necessary to develop new model quality assessment (MQA) methods and integrate them into our pipelines for building 3D protein models. Our MQA method, called ModFOLD, has been ranked as one of the most accurate MQA tools in independent blind evaluations. This chapter discusses model quality assessment in the protein modeling field, demonstrating both its strengths and limitations. We also present some of the best methods according to independent benchmarking data gathered in recent years.
Affiliation(s)
- Ali H A Maghrabi
- College of Applied Sciences, Umm Al Qura University, Mecca, Saudi Arabia
- Liam J McGuffin
- School of Biological Sciences, University of Reading, Reading, UK.
3
Kuang D, Issakova D, Kim J. Learning Proteome Domain Folding Using LSTMs in an Empirical Kernel Space. J Mol Biol 2022; 434:167686. [PMID: 35716781; DOI: 10.1016/j.jmb.2022.167686] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/19/2022] [Revised: 06/08/2022] [Accepted: 06/10/2022] [Indexed: 11/30/2022]
Abstract
The recognition of protein structural folds is the starting point for protein function inference and for many structural prediction tools. We previously introduced the idea of using empirical comparisons to create a data-augmented feature space called PESS (Protein Empirical Structure Space) as a novel approach for protein structure prediction. Here, we extend the previous approach by generating the PESS feature space over fixed-length subsequences of query peptides, and applying a sequential neural network model, with one long short-term memory cell layer followed by a fully connected layer. Using this approach, we show that only a small group of domains as a training set is needed to achieve near state-of-the-art accuracy on fold recognition. Our method improves on the previous approach by reducing the training set required and improving the model's ability to generalize across species, which will help fold prediction for newly discovered proteins.
Affiliation(s)
- Da Kuang
- Department of Computer and Information Science, University of Pennsylvania, Philadelphia, USA.
- Dina Issakova
- Department of Biology, University of Pennsylvania, Philadelphia, USA.
- Junhyong Kim
- Department of Computer and Information Science, University of Pennsylvania, Philadelphia, USA; Department of Biology, University of Pennsylvania, Philadelphia, USA.
4
Alamri MA, Mirza MU, Adeel MM, Ashfaq UA, Tahir ul Qamar M, Shahid F, Ahmad S, Alatawi EA, Albalawi GM, Allemailem KS, Almatroudi A. Structural Elucidation of Rift Valley Fever Virus L Protein towards the Discovery of Its Potential Inhibitors. Pharmaceuticals (Basel) 2022; 15:659. [PMID: 35745579; PMCID: PMC9228520; DOI: 10.3390/ph15060659] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 04/23/2022] [Revised: 05/16/2022] [Accepted: 05/20/2022] [Indexed: 12/17/2022]
Abstract
Rift Valley fever virus (RVFV) is the causative agent of a viral zoonosis that causes a significant clinical burden in domestic and wild ruminants. Major outbreaks of the virus occur in livestock, and contaminated animal products or arthropod vectors can transmit the virus to humans. The viral RNA-dependent RNA polymerase (RdRp; L protein) of RVFV is responsible for viral replication and is thus an appealing drug target because no effective and specific vaccine against this virus is available. The current study reports the structural elucidation of the RVFV L protein by in-depth homology modeling, since no crystal structure is available yet. The inhibitory binding modes of known potent L protein inhibitors were analyzed. Based on the results, further molecular docking-based virtual screening of the Selleckchem Nucleoside Analogue Library (156 compounds) was performed to find potential new inhibitors against the RVFV L protein. ADME (Absorption, Distribution, Metabolism, and Excretion) and toxicity analysis of these compounds was also performed. In addition, the binding mechanism and stability of the identified compounds were confirmed by a 50 ns molecular dynamics (MD) simulation followed by MM/PBSA binding free energy calculations. Homology modeling determined a stable multi-domain structure of the L protein. An analysis of known L protein inhibitors, including Monensin, Mycophenolic acid, and Ribavirin, provided insights into the binding mechanism and revealed key residues of the L protein binding pocket. The screening results revealed that the top three compounds, A-317491, Khasianine, and VER155008, exhibited a high affinity at the L protein binding pocket. ADME analysis revealed good pharmacodynamic and pharmacokinetic profiles for these compounds. Furthermore, MD simulation and binding free energy analysis endorsed the binding stability of the potential compounds with the L protein. 
In a nutshell, the present study determined potential compounds that may aid in the rational design of novel inhibitors of the RVFV L protein as anti-RVFV drugs.
Affiliation(s)
- Mubarak A. Alamri
- Department of Pharmaceutical Chemistry, College of Pharmacy, Prince Sattam Bin Abdulaziz University, Al-Kharj 16273, Saudi Arabia;
- Muhammad Usman Mirza
- Department of Chemistry and Biochemistry, University of Windsor, Windsor, ON N9B 3P4, Canada;
- Muhammad Muzammal Adeel
- 3D Genomics Research Center, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China;
- Usman Ali Ashfaq
- Department of Bioinformatics and Biotechnology, Government College University Faisalabad, Faisalabad 38000, Pakistan; (U.A.A.); (F.S.)
- Muhammad Tahir ul Qamar
- Department of Bioinformatics and Biotechnology, Government College University Faisalabad, Faisalabad 38000, Pakistan; (U.A.A.); (F.S.)
- Farah Shahid
- Department of Bioinformatics and Biotechnology, Government College University Faisalabad, Faisalabad 38000, Pakistan; (U.A.A.); (F.S.)
- Sajjad Ahmad
- Department of Health and Biological Sciences, Abasyn University, Peshawar 25000, Pakistan;
- Eid A. Alatawi
- Department of Medical Laboratory Technology, Faculty of Applied Medical Sciences, University of Tabuk, Tabuk 71491, Saudi Arabia;
- Ghadah M. Albalawi
- Department of Medical Laboratories, College of Applied Medical Sciences, Qassim University, Buraydah 51452, Saudi Arabia; (G.M.A.); (A.A.)
- Department of Laboratory and Blood Bank, King Fahd Specialist Hospital, Tabuk 47717, Saudi Arabia
- Khaled S. Allemailem
- Department of Medical Laboratories, College of Applied Medical Sciences, Qassim University, Buraydah 51452, Saudi Arabia; (G.M.A.); (A.A.)
- Ahmad Almatroudi
- Department of Medical Laboratories, College of Applied Medical Sciences, Qassim University, Buraydah 51452, Saudi Arabia; (G.M.A.); (A.A.)
5
Structure-Aware Mycobacterium tuberculosis Functional Annotation Uncloaks Resistance, Metabolic, and Virulence Genes. mSystems 2021; 6:e0067321. [PMID: 34726489; PMCID: PMC8562490; DOI: 10.1128/msystems.00673-21] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/30/2022]
Abstract
Accurate and timely functional genome annotation is essential for translating basic pathogen research into clinically impactful advances. Here, through literature curation and structure-function inference, we systematically update the functional genome annotation of Mycobacterium tuberculosis virulent type strain H37Rv. First, we systematically curated annotations for 589 genes from 662 publications, including 282 gene products absent from leading databases. Second, we modeled 1,711 underannotated proteins and developed a semiautomated pipeline that captured shared function between 400 protein models and structural matches of known function in the Protein Data Bank, including drug efflux proteins, metabolic enzymes, and virulence factors. In aggregate, these structure- and literature-derived annotations update 940/1,725 underannotated H37Rv genes and generate hundreds of functional hypotheses. Retrospectively applying the annotation to a recent whole-genome transposon mutant screen provided missing function for 48% (13/27) of underannotated genes altering antibiotic efficacy and 33% (23/69) required for persistence during mouse tuberculosis (TB) infection. Prospective application of the protein models enabled us to functionally interpret novel laboratory-generated pyrazinamide (PZA)-resistant mutants of unknown function, which implicated the emerging coenzyme A depletion model of PZA action in the mutants' PZA resistance. Our findings demonstrate the functional insight gained by integrating structural modeling and systematic literature curation, even for widely studied microorganisms. Functional annotations and protein structure models are available at https://tuberculosis.sdsu.edu/H37Rv in human- and machine-readable formats. IMPORTANCE: Mycobacterium tuberculosis, the primary causative agent of tuberculosis, kills more humans than any other infectious bacterium. 
Yet 40% of its genome is functionally uncharacterized, leaving much about the genetic basis of its resistance to antibiotics, capacity to withstand host immunity, and basic metabolism yet undiscovered. Irregular literature curation for functional annotation contributes to this gap. We systematically curated functions from literature and structural similarity for over half of poorly characterized genes, expanding the functionally annotated Mycobacterium tuberculosis proteome. Applying this updated annotation to recent in vivo functional screens added functional information to dozens of clinically pertinent proteins described as having unknown function. Integrating the annotations with a prospective functional screen identified new mutants resistant to a first-line TB drug, supporting an emerging hypothesis for its mode of action. These improvements in functional interpretation of clinically informative studies underscore the translational value of this functional knowledge. Structure-derived annotations identify hundreds of high-confidence candidates for mechanisms of antibiotic resistance, virulence factors, and basic metabolism and other functions key in clinical and basic tuberculosis research. More broadly, they provide a systematic framework for improving prokaryotic reference annotations.
6
Kryshtafovych A, Moult J, Billings WM, Della Corte D, Fidelis K, Kwon S, Olechnovič K, Seok C, Venclovas Č, Won J. Modeling SARS-CoV-2 proteins in the CASP-commons experiment. Proteins 2021; 89:1987-1996. [PMID: 34462960; PMCID: PMC8616790; DOI: 10.1002/prot.26231] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 05/05/2021] [Revised: 08/23/2021] [Accepted: 08/26/2021] [Indexed: 01/21/2023]
Abstract
Critical Assessment of Structure Prediction (CASP) is an organization aimed at advancing the state of the art in computing protein structure from sequence. In the spring of 2020, CASP launched a community project to compute the structures of the most structurally challenging proteins coded for in the SARS-CoV-2 genome. Forty-seven research groups submitted over 3000 three-dimensional models and 700 sets of accuracy estimates on 10 proteins. The resulting models were released to the public. CASP community members also worked together to provide estimates of local and global accuracy and identify structure-based domain boundaries for some proteins. Subsequently, two of these structures (ORF3a and ORF8) have been solved experimentally, allowing assessment of both model quality and the accuracy estimates. Models from the AlphaFold2 group were found to have good agreement with the experimental structures, with main chain GDT_TS accuracy scores ranging from 63 (a correct topology) to 87 (competitive with experiment).
Affiliation(s)
- John Moult
- Department of Cell Biology and Molecular Genetics, Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, Maryland, USA
- Wendy M Billings
- Department of Physics & Astronomy, Brigham Young University, Provo, Utah, USA
- Dennis Della Corte
- Department of Physics & Astronomy, Brigham Young University, Provo, Utah, USA
- Krzysztof Fidelis
- Genome Center, University of California, Davis, Davis, California, USA
- Sohee Kwon
- Department of Chemistry, Seoul National University, Seoul, South Korea
- Kliment Olechnovič
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Chaok Seok
- Department of Chemistry, Seoul National University, Seoul, South Korea
- Česlovas Venclovas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Jonghun Won
- Department of Chemistry, Seoul National University, Seoul, South Korea
7
Mahmud S, Rafi MO, Paul GK, Promi MM, Shimu MSS, Biswas S, Emran TB, Dhama K, Alyami SA, Moni MA, Saleh MA. Designing a multi-epitope vaccine candidate to combat MERS-CoV by employing an immunoinformatics approach. Sci Rep 2021; 11:15431. [PMID: 34326355; PMCID: PMC8322212; DOI: 10.1038/s41598-021-92176-1] [Citation(s) in RCA: 47] [Impact Index Per Article: 15.7] [Received: 02/09/2021] [Accepted: 05/21/2021] [Indexed: 01/26/2023]
Abstract
Currently, no approved vaccine is available against the Middle East respiratory syndrome coronavirus (MERS-CoV), which causes severe respiratory disease. The spike glycoprotein is typically considered a suitable target for MERS-CoV vaccine candidates. A computational strategy can be used to design an antigenic vaccine against a pathogen. Therefore, we used immunoinformatics and computational approaches to design a multi-epitope vaccine that targets the spike glycoprotein of MERS-CoV. After using numerous immunoinformatics tools and applying several immune filters, a poly-epitope vaccine was constructed comprising cytotoxic T-cell lymphocyte (CTL)-, helper T-cell lymphocyte (HTL)-, and interferon-gamma (IFN-γ)-inducing epitopes. In addition, various physicochemical, allergenic, and antigenic profiles were evaluated to confirm the immunogenicity and safety of the vaccine. Molecular interactions, binding affinities, and the thermodynamic stability of the vaccine were examined through molecular docking and dynamic simulation approaches, during which we identified a stable and strong interaction with Toll-like receptors (TLRs). In silico immune simulations were performed to assess the immune-response triggering capabilities of the vaccine. This computational analysis suggested that the proposed vaccine candidate would be structurally stable and capable of generating an effective immune response to combat viral infections; however, experimental evaluations remain necessary to verify the exact safety and immunogenicity profile of this vaccine.
Affiliation(s)
- Shafi Mahmud
- Microbiology Laboratory, Department of Genetic Engineering and Biotechnology, University of Rajshahi, Rajshahi, 6505, Bangladesh
- Md Oliullah Rafi
- Department of Genetic Engineering and Biotechnology, Jashore University of Science and Technology, Jashore, 7408, Bangladesh
- Gobindo Kumar Paul
- Microbiology Laboratory, Department of Genetic Engineering and Biotechnology, University of Rajshahi, Rajshahi, 6505, Bangladesh
- Maria Meha Promi
- Department of Genetic Engineering and Biotechnology, University of Rajshahi, Rajshahi, 6505, Bangladesh
- Mst Sharmin Sultana Shimu
- Department of Genetic Engineering and Biotechnology, University of Rajshahi, Rajshahi, 6505, Bangladesh
- Suvro Biswas
- Department of Genetic Engineering and Biotechnology, University of Rajshahi, Rajshahi, 6505, Bangladesh
- Talha Bin Emran
- Department of Pharmacy, BGC Trust University Bangladesh, Chittagong, 4381, Bangladesh
- Kuldeep Dhama
- Division of Pathology, ICAR-Indian Veterinary Research Institute, Izatnagar, Bareilly, 243122, Uttar Pradesh, India
- Salem A Alyami
- Department of Mathematics and Statistics, Imam Mohammad Ibn Saud Islamic University, Riyadh, 11432, Saudi Arabia
- Mohammad Ali Moni
- Faculty of Medicine, WHO Collaborating Centre on eHealth, UNSW Digital Health, School of Public Health and Community Medicine, UNSW Sydney, Sydney, NSW, 2052, Australia.
- Md Abu Saleh
- Microbiology Laboratory, Department of Genetic Engineering and Biotechnology, University of Rajshahi, Rajshahi, 6505, Bangladesh.
8
Shey RA, Ghogomu SM, Shintouo CM, Nkemngo FN, Nebangwa DN, Esoh K, Yaah NE, Manka’aFri M, Nguve JE, Ngwese RA, Njume FN, Bertha FA, Ayong L, Njemini R, Vanhamme L, Souopgui J. Computational Design and Preliminary Serological Analysis of a Novel Multi-Epitope Vaccine Candidate against Onchocerciasis and Related Filarial Diseases. Pathogens 2021; 10:99. [PMID: 33494344; PMCID: PMC7912539; DOI: 10.3390/pathogens10020099] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 11/02/2020] [Revised: 01/14/2021] [Accepted: 01/18/2021] [Indexed: 11/16/2022]
Abstract
Onchocerciasis is a skin and eye disease that exerts a heavy socio-economic burden, particularly in sub-Saharan Africa, a region which harbours greater than 96% of either infected or at-risk populations. The elimination plan for the disease is currently challenged by many factors including, amongst others, the potential emergence of resistance to the main chemotherapeutic agent, ivermectin (IVM). Novel tools, including preventative and therapeutic vaccines, could provide additional impetus to the disease elimination tool portfolio. Several observations in both humans and animals have provided evidence for the development of both natural and artificial acquired immunity. In this study, immuno-informatics tools were applied to design a filarial-conserved multi-epitope subunit vaccine candidate (designated Ov-DKR-2), consisting of B- and T-lymphocyte epitopes of eight immunogenic antigens previously assessed in pre-clinical studies. The high-percentage conservation of the selected proteins and epitopes predicted in related nematode parasitic species hints that the generated chimera may be instrumental for cross-protection. Bioinformatics analyses were employed for the prediction, refinement, and validation of the 3D structure of the Ov-DKR-2 chimera. In-silico immune simulation projected significantly high levels of IgG1, T-helper, T-cytotoxic cells, IFN-γ, and IL-2 responses. Preliminary immunological analyses revealed that the multi-epitope vaccine candidate reacted with antibodies in sera from onchocerciasis-infected individuals, endemic normals, and loiasis-infected persons, but not with the control sera from European individuals. These results support the premise for further characterisation of the engineered protein as a vaccine candidate for onchocerciasis.
Affiliation(s)
- Robert Adamu Shey
- Department of Biochemistry and Molecular Biology, Faculty of Science, University of Buea, Buea 99999, Cameroon; (R.A.S.); (S.M.G.); (C.M.S.); (D.N.N.); (N.E.Y.); (M.M.); (J.E.N.); (R.A.N.); (F.N.N.)
- Department of Molecular Biology, Institute of Biology and Molecular Medicine, IBMM, Université Libre de Bruxelles, Gosselies Campus, 6040 Gosselies, Belgium;
- Stephen Mbigha Ghogomu
- Department of Biochemistry and Molecular Biology, Faculty of Science, University of Buea, Buea 99999, Cameroon; (R.A.S.); (S.M.G.); (C.M.S.); (D.N.N.); (N.E.Y.); (M.M.); (J.E.N.); (R.A.N.); (F.N.N.)
- Cabirou Mounchili Shintouo
- Department of Biochemistry and Molecular Biology, Faculty of Science, University of Buea, Buea 99999, Cameroon; (R.A.S.); (S.M.G.); (C.M.S.); (D.N.N.); (N.E.Y.); (M.M.); (J.E.N.); (R.A.N.); (F.N.N.)
- Frailty in Ageing Research Group, Vrije Universiteit Brussel, Laarbeeklaan 103, B-1090 Brussels, Belgium;
- Department of Gerontology, Faculty of Medicine and Pharmacy, Vrije Universiteit Brussel, Laarbeeklaan 103, B-1090 Brussels, Belgium
- Francis Nongley Nkemngo
- Department of Microbiology and Parasitology, Faculty of Science, University of Buea, Buea 99999, Cameroon;
- Centre for Research in Infectious Diseases (CRID), Department of Parasitology and Medical Entomology, Yaounde BP 13591, Cameroon
- Derrick Neba Nebangwa
- Department of Biochemistry and Molecular Biology, Faculty of Science, University of Buea, Buea 99999, Cameroon; (R.A.S.); (S.M.G.); (C.M.S.); (D.N.N.); (N.E.Y.); (M.M.); (J.E.N.); (R.A.N.); (F.N.N.)
- Kevin Esoh
- Division of Human Genetics, Health Sciences Campus, Department of Pathology, University of Cape Town, Anzio Rd, Observatory, Cape Town 7925, South Africa;
- Ntang Emmaculate Yaah
- Department of Biochemistry and Molecular Biology, Faculty of Science, University of Buea, Buea 99999, Cameroon; (R.A.S.); (S.M.G.); (C.M.S.); (D.N.N.); (N.E.Y.); (M.M.); (J.E.N.); (R.A.N.); (F.N.N.)
- Muyanui Manka’aFri
- Department of Biochemistry and Molecular Biology, Faculty of Science, University of Buea, Buea 99999, Cameroon; (R.A.S.); (S.M.G.); (C.M.S.); (D.N.N.); (N.E.Y.); (M.M.); (J.E.N.); (R.A.N.); (F.N.N.)
- Joel Ebai Nguve
- Department of Biochemistry and Molecular Biology, Faculty of Science, University of Buea, Buea 99999, Cameroon; (R.A.S.); (S.M.G.); (C.M.S.); (D.N.N.); (N.E.Y.); (M.M.); (J.E.N.); (R.A.N.); (F.N.N.)
- Roland Akwelle Ngwese
- Department of Biochemistry and Molecular Biology, Faculty of Science, University of Buea, Buea 99999, Cameroon; (R.A.S.); (S.M.G.); (C.M.S.); (D.N.N.); (N.E.Y.); (M.M.); (J.E.N.); (R.A.N.); (F.N.N.)
- Ferdinand Ngale Njume
- Department of Biochemistry and Molecular Biology, Faculty of Science, University of Buea, Buea 99999, Cameroon; (R.A.S.); (S.M.G.); (C.M.S.); (D.N.N.); (N.E.Y.); (M.M.); (J.E.N.); (R.A.N.); (F.N.N.)
- Department of Molecular Biology, Institute of Biology and Molecular Medicine, IBMM, Université Libre de Bruxelles, Gosselies Campus, 6040 Gosselies, Belgium;
- Fru Asa Bertha
- Department of Public Health and Hygiene, Faculty of Health Science, University of Buea, Buea 99999, Cameroon;
- Lawrence Ayong
- Malaria Research Unit, Centre Pasteur Cameroon, Yaoundé Rue 2005, Cameroon;
- Rose Njemini
- Frailty in Ageing Research Group, Vrije Universiteit Brussel, Laarbeeklaan 103, B-1090 Brussels, Belgium;
- Department of Gerontology, Faculty of Medicine and Pharmacy, Vrije Universiteit Brussel, Laarbeeklaan 103, B-1090 Brussels, Belgium
- Luc Vanhamme
- Department of Molecular Biology, Institute of Biology and Molecular Medicine, IBMM, Université Libre de Bruxelles, Gosselies Campus, 6040 Gosselies, Belgium;
- Jacob Souopgui
- Department of Molecular Biology, Institute of Biology and Molecular Medicine, IBMM, Université Libre de Bruxelles, Gosselies Campus, 6040 Gosselies, Belgium;
9
Role of Bioinformatics in Biological Sciences. Adv Bioinformatics 2021. [DOI: 10.1007/978-981-33-6191-1_3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
10
Wang Y, Ping L, Luan X, Chen Y, Fan X, Li L, Liu Y, Wang P, Zhang S, Zhang B, Chen X. A Mutation in VWA1, Encoding von Willebrand Factor A Domain-Containing Protein 1, Is Associated With Hemifacial Microsomia. Front Cell Dev Biol 2020; 8:571004. [PMID: 33015062; PMCID: PMC7509151; DOI: 10.3389/fcell.2020.571004] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 06/09/2020] [Accepted: 08/19/2020] [Indexed: 12/31/2022]
Abstract
Background: Hemifacial microsomia (HFM) is a rare congenital syndrome caused by developmental disorders of the first and second pharyngeal arches that occurs in one out of 5,600 live births. There are significant gaps in our knowledge of the pathogenic genes underlying this syndrome. Methods: Whole exome sequencing (WES) was performed on five patients, one asymptomatic carrier, and two marry-in members of a five-generation pedigree. The structure of WARP (the product of VWA1) was predicted using the Phyre2 web portal. In situ hybridization and vwa1-knockdown/knockout studies in zebrafish using morpholino and CRISPR/Cas9 techniques were performed. Cartilage staining and immunofluorescence were carried out. Results: Through WES and a series of filtering steps, we identified a c.G905A:p.R302Q point mutation in a novel candidate pathogenic gene, VWA1. The Phyre2 web portal predicted alterations in the secondary and tertiary structures of WARP, indicating changes in its function as well. Predictions of protein-protein interactions in five pathways related to craniofacial development revealed possible interactions with four proteins in the FGF pathway. Knockdown/knockout studies in zebrafish revealed deformities of pharyngeal cartilage. A decrease in the proliferation of cranial neural crest cells (CNCCs) and alteration of the structure of pharyngeal chondrocytes were observed in the morphants as well. Conclusion: Our data suggest that a mutation in VWA1 is functionally linked to HFM through suppression of CNCC proliferation and disruption of the organization of pharyngeal chondrocytes.
Affiliation(s)
- Yibei Wang
- Department of Otolaryngology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China; Department of Otolaryngology, China-Japan Friendship Hospital, Beijing, China
- Lu Ping
- Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Xiaodong Luan
- School of Medicine, Tsinghua University, Beijing, China; Department of Cardiology, Peking Union Medical College Hospital, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China; Tsinghua-Peking Center for Life Sciences, Tsinghua University, Beijing, China
- Yushan Chen
- Department of Otolaryngology, The Ohio State University, Columbus, OH, United States
- Xinmiao Fan
- Department of Otolaryngology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Lianyan Li
- Key Laboratory of Cell Proliferation and Differentiation of the Ministry of Education, College of Life Sciences, Peking University, Beijing, China
- Yaping Liu
- Department of Medical Genetics and National Laboratory of Medical Molecular Biology, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Pu Wang
- Department of Otolaryngology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China; Department of Otolaryngology Head and Neck Surgery, Beijing Tiantan Hospital, Capital Medical University, Beijing, China
- Shuyang Zhang
- School of Medicine, Tsinghua University, Beijing, China; Department of Cardiology, Peking Union Medical College Hospital, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, China; Tsinghua-Peking Center for Life Sciences, Tsinghua University, Beijing, China
- Bo Zhang
- Key Laboratory of Cell Proliferation and Differentiation of the Ministry of Education, College of Life Sciences, Peking University, Beijing, China
- Xiaowei Chen
- Department of Otolaryngology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
11
Mubassir MHM, Naser MA, Abdul-Wahab MF, Jawad T, Alvy RI, Hamdan S. Comprehensive in silico modeling of the rice plant PRR Xa21 and its interaction with RaxX21-sY and OsSERK2. RSC Adv 2020; 10:15800-15814. [PMID: 35493652; PMCID: PMC9052883; DOI: 10.1039/d0ra01396j] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 02/13/2020] [Accepted: 04/15/2020] [Indexed: 12/19/2022]
Abstract
The first layer of defense that plants deploy to ward off a microbial invasion comes in the form of pattern-triggered immunity (PTI), which is initiated when the pattern-recognition receptors (PRRs) bind with the pathogen-associated molecular patterns (PAMPs) and co-receptor proteins, and transmit a defense signal. Although several plant PRRs have been discovered, very few of them have been fully characterized and had their functional parameters assessed. In this study, the 3D-model prediction of an entire plant PRR protein, Xa21, was done by implementing multiple in silico modeling techniques. Subsequently, the PAMP RaxX21-sY (sulphated RaxX21) and the leucine-rich repeat (LRR) domain of the co-receptor OsSERK2 were docked with the LRR domain of Xa21. The docked complex of these three proteins formed a heterodimer that closely resembles the other crystallographic PTI complexes available. Molecular dynamics simulations and MM/PBSA calculations were applied for an in-depth analysis of the interactions between Xa21 LRR, RaxX21-sY, and OsSERK2 LRR. Arg230 and Arg185 from Xa21 LRR, Val2 and Lys15 from RaxX21-sY, and Lys164 from OsSERK2 LRR were found to be the prominent residues that might contribute significantly to the formation of a heterodimer during the PTI process mediated by Xa21. Additionally, RaxX21-sY interacted much more favorably with Xa21 LRR in the presence of OsSERK2 LRR in the complex, which substantiates the necessity of the co-receptor in Xa21-mediated PTI to recognize the PAMP RaxX21-sY. However, the binding free energy calculation reveals the favorability of heterodimer formation between PRR Xa21 and co-receptor OsSERK2 without the presence of PAMP RaxX21-sY, which validates the previous lab result.
Affiliation(s)
- M H M Mubassir
- Department of Mathematics and Natural Sciences, BRAC University 66 Mohakhali Dhaka-1212 Bangladesh
- M Abu Naser
- Faculty Bioscience and Medical Engineering, Universiti Teknologi Malaysia 81310 Johor Bahru Johor Malaysia
- Mohd Firdaus Abdul-Wahab
- Faculty Bioscience and Medical Engineering, Universiti Teknologi Malaysia 81310 Johor Bahru Johor Malaysia
- Tanvir Jawad
- Department of Mathematics and Natural Sciences, BRAC University 66 Mohakhali Dhaka-1212 Bangladesh
- Raghib Ishraq Alvy
- Department of Mathematics and Natural Sciences, BRAC University 66 Mohakhali Dhaka-1212 Bangladesh
- Salehhuddin Hamdan
- Faculty Bioscience and Medical Engineering, Universiti Teknologi Malaysia 81310 Johor Bahru Johor Malaysia
12
Tisza MJ, Pastrana DV, Welch NL, Stewart B, Peretti A, Starrett GJ, Pang YYS, Krishnamurthy SR, Pesavento PA, McDermott DH, Murphy PM, Whited JL, Miller B, Brenchley J, Rosshart SP, Rehermann B, Doorbar J, Ta'ala BA, Pletnikova O, Troncoso JC, Resnick SM, Bolduc B, Sullivan MB, Varsani A, Segall AM, Buck CB. Discovery of several thousand highly diverse circular DNA viruses. eLife 2020; 9:51971. [PMID: 32014111 PMCID: PMC7000223 DOI: 10.7554/elife.51971] [Citation(s) in RCA: 116] [Impact Index Per Article: 29.0] [Received: 09/19/2019] [Accepted: 01/06/2020] [Indexed: 12/18/2022] Open
Abstract
Although millions of distinct virus species likely exist, only approximately 9000 are catalogued in GenBank's RefSeq database. We selectively enriched for the genomes of circular DNA viruses in over 70 animal samples, ranging from nematodes to human tissue specimens. A bioinformatics pipeline, Cenote-Taker, was developed to automatically annotate over 2500 complete genomes in a GenBank-compliant format. The new genomes belong to dozens of established and emerging viral families. Some appear to be the result of previously undescribed recombination events between ssDNA and ssRNA viruses. In addition, hundreds of circular DNA elements that do not encode any discernable similarities to previously characterized sequences were identified. To characterize these ‘dark matter’ sequences, we used an artificial neural network to identify candidate viral capsid proteins, several of which formed virus-like particles when expressed in culture. These data further the understanding of viral sequence diversity and allow for high throughput documentation of the virosphere. When scientists hunt for new DNA sequences, sometimes they get a lot more than they bargained for. Such is the case in metagenomic surveys, which analyze not just DNA of a particular organism, but all the DNA in an environment at large. A vexing problem with these surveys is the overwhelming number of DNA sequences detected that are so different from any known microbe that they cannot be classified using traditional approaches. However, some of these “known unknowns” are undoubtedly viral sequences, because only a fraction of the enormous diversity of viruses has been characterized. This “viral dark matter” is a major obstacle for those studying viruses. This led Tisza et al. to attempt to classify some of the unknown viral sequences in their metagenomic surveys. The search, which specifically focused on viruses with circular DNA genomes, detected over 2,500 circular viral genomes. 
Intensive analysis revealed that many of these genomes had similar makeup to previously discovered viruses, but hundreds of them were totally different from any known virus, based on typical methods of comparison. Computational analysis of genes that were conserved among some of these brand-new circular sequences often revealed virus-like features. Experiments on a few of these genes showed that they encoded proteins capable of forming particles reminiscent of characteristic viral shells, implying that these new sequences are indeed viruses. Tisza et al. have added the 2,500 newly characterized viral sequences to the publicly accessible GenBank database, and the sequences are being considered for the more authoritative RefSeq database, which currently contains around 9,000 complete viral genomes. The expanded databases will hopefully now better equip scientists to explore the enormous diversity of viruses and help medics and veterinarians to detect disease-causing viruses in humans and other animals.
Affiliation(s)
- Michael J Tisza
- Lab of Cellular Oncology, National Cancer Institute, National Institutes of Health, Bethesda, United States
- Diana V Pastrana
- Lab of Cellular Oncology, National Cancer Institute, National Institutes of Health, Bethesda, United States
- Nicole L Welch
- Lab of Cellular Oncology, National Cancer Institute, National Institutes of Health, Bethesda, United States
- Brittany Stewart
- Lab of Cellular Oncology, National Cancer Institute, National Institutes of Health, Bethesda, United States
- Alberto Peretti
- Lab of Cellular Oncology, National Cancer Institute, National Institutes of Health, Bethesda, United States
- Gabriel J Starrett
- Lab of Cellular Oncology, National Cancer Institute, National Institutes of Health, Bethesda, United States
- Yuk-Ying S Pang
- Lab of Cellular Oncology, National Cancer Institute, National Institutes of Health, Bethesda, United States
- Siddharth R Krishnamurthy
- Metaorganism Immunity Section, Laboratory of Immune System Biology, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, United States
- Patricia A Pesavento
- Department of Pathology, Microbiology, and Immunology, University of California, Davis, Davis, United States
- David H McDermott
- Molecular Signaling Section, Laboratory of Molecular Immunology, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, United States
- Philip M Murphy
- Molecular Signaling Section, Laboratory of Molecular Immunology, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, United States
- Jessica L Whited
- Department of Orthopedic Surgery, Harvard Medical School, The Harvard Stem Cell Institute, Brigham and Women's Hospital, Boston, United States.,Broad Institute of MIT and Harvard, Cambridge, United States.,Department of Stem Cell and Regenerative Biology, Harvard University, Cambridge, United States
- Bess Miller
- Department of Orthopedic Surgery, Harvard Medical School, The Harvard Stem Cell Institute, Brigham and Women's Hospital, Boston, United States.,Broad Institute of MIT and Harvard, Cambridge, United States
- Jason Brenchley
- Barrier Immunity Section, Lab of Viral Diseases, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Cambridge, United States
- Stephan P Rosshart
- Immunology Section, Liver Diseases Branch, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, United States
- Barbara Rehermann
- Immunology Section, Liver Diseases Branch, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, United States
- John Doorbar
- Department of Pathology, University of Cambridge, Cambridge, United Kingdom
- Olga Pletnikova
- Department of Pathology (Neuropathology), Johns Hopkins University School of Medicine, Baltimore, United States
- Juan C Troncoso
- Department of Pathology (Neuropathology), Johns Hopkins University School of Medicine, Baltimore, United States
- Susan M Resnick
- Laboratory of Behavioral Neuroscience, National Institute on Aging, National Institutes of Health, Baltimore, United States
- Ben Bolduc
- Department of Microbiology, Ohio State University, Columbus, United States
- Matthew B Sullivan
- Department of Microbiology, Ohio State University, Columbus, United States.,Civil Environmental and Geodetic Engineering, Ohio State University, Columbus, United States
- Arvind Varsani
- The Biodesign Center of Fundamental and Applied Microbiomics, School of Life Sciences, Center for Evolution and Medicine, Arizona State University, Tempe, United States.,Structural Biology Research Unit, Department of Clinical Laboratory Sciences, University of Cape Town, Rondebosch, South Africa
- Anca M Segall
- Viral Information Institute and Department of Biology, San Diego State University, San Diego, United States
- Christopher B Buck
- Lab of Cellular Oncology, National Cancer Institute, National Institutes of Health, Bethesda, United States
13
Olechnovič K, Monastyrskyy B, Kryshtafovych A, Venclovas Č. Comparative analysis of methods for evaluation of protein models against native structures. Bioinformatics 2019; 35:937-944. [PMID: 30169622 DOI: 10.1093/bioinformatics/bty760] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 03/30/2018] [Revised: 08/04/2018] [Accepted: 08/28/2018] [Indexed: 12/17/2022] Open
Abstract
MOTIVATION Measuring discrepancies between protein models and native structures is at the heart of development of protein structure prediction methods and comparison of their performance. A number of different evaluation methods have been developed; however, their comprehensive and unbiased comparison has not been performed. RESULTS We carried out a comparative analysis of several popular model assessment methods (RMSD, TM-score, GDT, QCS, CAD-score, LDDT, SphereGrinder and RPF) to reveal their relative strengths and weaknesses. The analysis, performed on a large and diverse model set derived in the course of three latest community-wide CASP experiments (CASP10-12), had two major directions. First, we looked at general differences between the scores by analyzing distribution, correspondence and correlation of their values as well as differences in selecting best models. Second, we examined the score differences taking into account various structural properties of models (stereochemistry, hydrogen bonds, packing of domains and chain fragments, missing residues, protein length and secondary structure). Our results provide a solid basis for an informed selection of the most appropriate score or combination of scores depending on the task at hand. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Kliment Olechnovič
- Institute of Biotechnology Life Sciences Center Vilnius University, Saulėtekio 7, Vilnius, Lithuania
- Česlovas Venclovas
- Institute of Biotechnology Life Sciences Center Vilnius University, Saulėtekio 7, Vilnius, Lithuania
14
Cai Y, Li X, Sun Z, Lu Y, Zhao H, Hanson J, Paliwal K, Litfin T, Zhou Y, Yang Y. SPOT-Fold: Fragment-Free Protein Structure Prediction Guided by Predicted Backbone Structure and Contact Map. J Comput Chem 2019; 41:745-750. [PMID: 31845383 DOI: 10.1002/jcc.26132] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 03/22/2019] [Revised: 10/07/2019] [Accepted: 12/01/2019] [Indexed: 02/01/2023]
Abstract
Protein structure determination has been one of the most challenging problems in molecular biology for the past 60 years. Here we present an ab initio protein tertiary-structure prediction method assisted by predicted contact maps from SPOT-Contact and predicted dihedral angles from SPIDER 3. These predicted properties are fed to the Crystallography and NMR System (CNS) for restrained structure modeling. The resulting structures are first evaluated by the potential energy calculated by CNS, followed by the dDFIRE energy function for model selection. The method, called SPOT-Fold, has been tested on 241 CASP targets of between 67 and 670 amino acid residues and on 60 randomly selected globular proteins under 100 amino acids. The method has accuracy comparable to that of other contact-map-based modeling techniques. © 2019 Wiley Periodicals, Inc.
Affiliation(s)
- Yufeng Cai
- School of Data and Computer Science, Sun Yat-Sen University, 132 East Circle at University City, Guangzhou, 510006, China
- Xiongjun Li
- School of Data and Computer Science, Sun Yat-Sen University, 132 East Circle at University City, Guangzhou, 510006, China
- Zhe Sun
- School of Data and Computer Science, Sun Yat-Sen University, 132 East Circle at University City, Guangzhou, 510006, China
- Yutong Lu
- School of Data and Computer Science, Sun Yat-Sen University, 132 East Circle at University City, Guangzhou, 510006, China
- Huiying Zhao
- Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, 510000, China
- Jack Hanson
- Signal Processing Laboratory, Griffith University, Brisbane, Queensland, 4122, Australia
- Kuldip Paliwal
- Signal Processing Laboratory, Griffith University, Brisbane, Queensland, 4122, Australia
- Thomas Litfin
- Institute for Glycomics and School of Information and Communication Technology, Griffith University, Southport, Queensland, 4222, Australia
- Yaoqi Zhou
- Institute for Glycomics and School of Information and Communication Technology, Griffith University, Southport, Queensland, 4222, Australia
- Yuedong Yang
- School of Data and Computer Science, Sun Yat-Sen University, 132 East Circle at University City, Guangzhou, 510006, China
15
Wei X, Li ZC, Li SJ, Peng XB, Zhao Q. Protein structure determination using a Riemannian approach. FEBS Lett 2019; 594:1036-1051. [PMID: 31769509 DOI: 10.1002/1873-3468.13688] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 05/23/2019] [Revised: 10/31/2019] [Accepted: 11/14/2019] [Indexed: 11/05/2022]
Abstract
Protein NMR structure determination is one of the most extensively studied problems. Here, we adopt a novel method based on a matrix completion technique - the Riemannian approach - to rebuild the protein structure from the nuclear Overhauser effect distance restraints and the dihedral angle restraints. In comparison with the CYANA method, the results generated via the Riemannian approach are more similar to the standard X-ray crystallographic structures as a result of the simple but powerful internal calculation processing function. In addition, our results demonstrate that the Riemannian approach has comparable or even better performance than the CYANA method on other structural assessment metrics, including stereochemical quality and restraint violations. The Riemannian approach software is available at: https://github.com/xubiaopeng/Protein_Recon_MCRiemman.
Affiliation(s)
- Xian Wei
- Center for Quantum Technology Research, School of Physics, Beijing Institute of Technology, China.,Department of Science, Taiyuan Institute of Technology, China
- Zhi-Cheng Li
- Department of Physics, Taiyuan Normal University, China
- Shi-Jian Li
- Center for Quantum Technology Research, School of Physics, Beijing Institute of Technology, China
- Xu-Biao Peng
- Center for Quantum Technology Research, School of Physics, Beijing Institute of Technology, China
- Qing Zhao
- Center for Quantum Technology Research, School of Physics, Beijing Institute of Technology, China
16
Hura GL, Hodge CD, Rosenberg D, Guzenko D, Duarte JM, Monastyrskyy B, Grudinin S, Kryshtafovych A, Tainer JA, Fidelis K, Tsutakawa SE. Small angle X-ray scattering-assisted protein structure prediction in CASP13 and emergence of solution structure differences. Proteins 2019; 87:1298-1314. [PMID: 31589784 DOI: 10.1002/prot.25827] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 09/06/2019] [Revised: 09/27/2019] [Accepted: 09/27/2019] [Indexed: 12/14/2022]
Abstract
Small angle X-ray scattering (SAXS) measures comprehensive distance information on a protein's structure, which can constrain and guide computational structure prediction algorithms. Here, we evaluate structure predictions of 11 monomeric and oligomeric proteins for which SAXS data were collected and provided to predictors in the 13th round of the Critical Assessment of protein Structure Prediction (CASP13). The category for SAXS-assisted predictions made gains in certain areas for CASP13 compared to CASP12. Improvements included higher quality data with size exclusion chromatography-SAXS (SEC-SAXS) and better selection of targets and communication of results by CASP organizers. In several cases, we can track improvements in model accuracy with use of SAXS data. For hard multimeric targets where regular folding algorithms were unsuccessful, SAXS data helped predictors to build models better resembling the global shape of the target. For most models, however, no significant improvement in model accuracy at the domain level was registered from use of SAXS data, when rigorously comparing SAXS-assisted models to the best regular server predictions. To promote future progress in this category, we identify successes, challenges, and opportunities for improved strategies in prediction, assessment, and communication of SAXS data to predictors. An important observation is that, for many targets, SAXS data were inconsistent with crystal structures, suggesting that these proteins adopt different conformation(s) in solution. This CASP13 result, if representative of PDB structures and future CASP targets, may have substantive implications for the structure training databases used for machine learning, CASP, and use of prediction models for biology.
Affiliation(s)
- Greg L Hura
- Molecular Biophysics and Integrated Bioimaging, Lawrence Berkeley National Laboratory, Berkeley, California.,Department of Chemistry and Biochemistry, University of California Santa Cruz, Santa Cruz, California
- Curtis D Hodge
- Molecular Biophysics and Integrated Bioimaging, Lawrence Berkeley National Laboratory, Berkeley, California
- Daniel Rosenberg
- Molecular Biophysics and Integrated Bioimaging, Lawrence Berkeley National Laboratory, Berkeley, California
- Dmytro Guzenko
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, La Jolla, California
- Jose M Duarte
- Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, San Diego, La Jolla, California
- Bohdan Monastyrskyy
- Protein Structure Prediction Center, Genome and Biomedical Sciences Facilities, University of California, Davis, California
- Sergei Grudinin
- Univ. Grenoble Alpes, CNRS, Inria, Grenoble INP, LJK, 38000, Grenoble, France
- Andriy Kryshtafovych
- Protein Structure Prediction Center, Genome and Biomedical Sciences Facilities, University of California, Davis, California
- John A Tainer
- Molecular Biophysics and Integrated Bioimaging, Lawrence Berkeley National Laboratory, Berkeley, California.,Department of Molecular and Cellular Oncology, The University of Texas M. D. Anderson Cancer Center, Houston, Texas
- Krzysztof Fidelis
- Protein Structure Prediction Center, Genome and Biomedical Sciences Facilities, University of California, Davis, California
- Susan E Tsutakawa
- Molecular Biophysics and Integrated Bioimaging, Lawrence Berkeley National Laboratory, Berkeley, California
17
Sala D, Huang YJ, Cole CA, Snyder DA, Liu G, Ishida Y, Swapna GVT, Brock KP, Sander C, Fidelis K, Kryshtafovych A, Inouye M, Tejero R, Valafar H, Rosato A, Montelione GT. Protein structure prediction assisted with sparse NMR data in CASP13. Proteins 2019; 87:1315-1332. [PMID: 31603581 DOI: 10.1002/prot.25837] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 09/09/2019] [Revised: 09/26/2019] [Accepted: 09/27/2019] [Indexed: 01/05/2023]
Abstract
CASP13 has investigated the impact of sparse NMR data on the accuracy of protein structure prediction. NOESY and 15N-1H residual dipolar coupling data, typical of that obtained for 15N,13C-enriched, perdeuterated proteins up to about 40 kDa, were simulated for 11 CASP13 targets ranging in size from 80 to 326 residues. For several targets, two prediction groups generated models that are more accurate than those produced using baseline methods. Real NMR data collected for a de novo designed protein were also provided to predictors, including one data set in which only backbone resonance assignments were available. Some NMR-assisted prediction groups also did very well with these data. CASP13 also assessed whether incorporation of sparse NMR data improves the accuracy of protein structure prediction relative to nonassisted regular methods. In most cases, incorporation of sparse, noisy NMR data results in models with higher accuracy. The best NMR-assisted models were also compared with the best regular predictions of any CASP13 group for the same target. For six of 13 targets, the most accurate model provided by any NMR-assisted prediction group was more accurate than the most accurate model provided by any regular prediction group; however, for the remaining seven targets, one or more regular prediction method provided a more accurate model than even the best NMR-assisted model. These results suggest a novel approach for protein structure determination, in which advanced prediction methods are first used to generate structural models, and sparse NMR data is then used to validate and/or refine these models.
Affiliation(s)
- Davide Sala
- Magnetic Resonance Center, University of Florence, Sesto Fiorentino, Italy.,Department of Chemistry, University of Florence, Sesto Fiorentino, Italy
- Yuanpeng Janet Huang
- Center for Advanced Biotechnology and Medicine, and Department of Molecular Biology and Biochemistry, Rutgers, The State University of New Jersey, Piscataway, New Jersey.,Department of Chemistry and Center for Biotechnology and Interdisciplinary Studies, Rensselaer Polytechnic Institute, Troy, New York
- Casey A Cole
- Department of Computer Science & Engineering, University of South Carolina, Columbia, South Carolina
- David A Snyder
- Department of Chemistry, College of Science and Health, William Paterson University, Wayne, New Jersey
- Gaohua Liu
- Center for Advanced Biotechnology and Medicine, and Department of Molecular Biology and Biochemistry, Rutgers, The State University of New Jersey, Piscataway, New Jersey.,Nexomics Biosciences, Bordentown, New Jersey
- Yojiro Ishida
- Center for Advanced Biotechnology and Medicine, and Department of Molecular Biology and Biochemistry, Rutgers, The State University of New Jersey, Piscataway, New Jersey.,Department of Biochemistry and Molecular Biology, The Robert Wood Johnson Medical School, Rutgers, The State University of New Jersey, Piscataway, New Jersey
- G V T Swapna
- Center for Advanced Biotechnology and Medicine, and Department of Molecular Biology and Biochemistry, Rutgers, The State University of New Jersey, Piscataway, New Jersey
- Kelly P Brock
- Department of Systems Biology, Harvard Medical School, Boston, Massachusetts
- Chris Sander
- Department of Cell Biology, Harvard Medical School, Boston, Massachusetts.,cBio Center, Dana-Farber Cancer Institute, Boston, Massachusetts
- Masayori Inouye
- Department of Biochemistry and Molecular Biology, The Robert Wood Johnson Medical School, Rutgers, The State University of New Jersey, Piscataway, New Jersey
- Roberto Tejero
- Departamento de Quimica Fisica, Universidad de Valencia, Valencia, Spain
- Homayoun Valafar
- Department of Computer Science & Engineering, University of South Carolina, Columbia, South Carolina
- Antonio Rosato
- Magnetic Resonance Center, University of Florence, Sesto Fiorentino, Italy.,Department of Chemistry, University of Florence, Sesto Fiorentino, Italy
- Gaetano T Montelione
- Center for Advanced Biotechnology and Medicine, and Department of Molecular Biology and Biochemistry, Rutgers, The State University of New Jersey, Piscataway, New Jersey.,Department of Chemistry and Center for Biotechnology and Interdisciplinary Studies, Rensselaer Polytechnic Institute, Troy, New York.,Department of Biochemistry and Molecular Biology, The Robert Wood Johnson Medical School, Rutgers, The State University of New Jersey, Piscataway, New Jersey
18
Mirza MU, Vanmeert M, Ali A, Iman K, Froeyen M, Idrees M. Perspectives towards antiviral drug discovery against Ebola virus. J Med Virol 2019; 91:2029-2048. [PMID: 30431654 PMCID: PMC7166701 DOI: 10.1002/jmv.25357] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 09/26/2018] [Accepted: 11/04/2018] [Indexed: 12/18/2022]
Abstract
Ebola virus disease (EVD), caused by Ebola viruses, resulted in more than 11 500 deaths according to a recent 2018 WHO report. With mortality rates up to 90%, it is nowadays one of the most deadly infectious diseases. However, no Food and Drug Administration‐approved Ebola drugs or vaccines are available yet with the mainstay of therapy being supportive care. The high fatality rate and absence of effective treatment or vaccination make Ebola virus a category‐A biothreat pathogen. Fortunately, a series of investigational countermeasures have been developed to control and prevent this global threat. This review summarizes the recent therapeutic advances and ongoing research progress from research and development to clinical trials in the development of small‐molecule antiviral drugs, small‐interference RNA molecules, phosphorodiamidate morpholino oligomers, full‐length monoclonal antibodies, and vaccines. Moreover, difficulties are highlighted in the search for effective countermeasures against EVD with additional focus on the interplay between available in silico prediction methods and their evidenced potential in antiviral drug discovery.
Affiliation(s)
- Muhammad Usman Mirza
- Department of Pharmaceutical Sciences, REGA Institute for Medical Research, Medicinal Chemistry, KU Leuven, Leuven, Belgium
- Michiel Vanmeert
- Department of Pharmaceutical Sciences, REGA Institute for Medical Research, Medicinal Chemistry, KU Leuven, Leuven, Belgium
- Amjad Ali
- Department of Genetics, Hazara University, Mansehra, Pakistan.,Molecular Virology Laboratory, Centre for Applied Molecular Biology (CAMB), University of the Punjab, Lahore, Pakistan
- Kanzal Iman
- Biomedical Informatics Research Laboratory (BIRL), Department of Biology, Lahore University of Management Sciences (LUMS), Lahore, Pakistan
- Matheus Froeyen
- Department of Pharmaceutical Sciences, REGA Institute for Medical Research, Medicinal Chemistry, KU Leuven, Leuven, Belgium
- Muhammad Idrees
- Molecular Virology Laboratory, Centre for Applied Molecular Biology (CAMB), University of the Punjab, Lahore, Pakistan.,Hazara University Mansehra, Khyber Pakhtunkhwa Pakistan
19
Buttinelli M, Panetta G, Bucci A, Frascaria D, Morea V, Miele AE. Protein Engineering of Multi-Modular Transcription Factor Alcohol Dehydrogenase Repressor 1 (Adr1p), a Tool for Dissecting In Vitro Transcription Activation. Biomolecules 2019; 9:biom9090497. [PMID: 31533362 PMCID: PMC6769490 DOI: 10.3390/biom9090497] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2019] [Revised: 09/03/2019] [Accepted: 09/11/2019] [Indexed: 11/17/2022] Open
Abstract
Studying transcription machinery assembly in vitro is challenging because of long intrinsically disordered regions present within the multi-modular transcription factors. One example is alcohol dehydrogenase repressor 1 (Adr1p) from fermenting yeast, responsible for the metabolic switch from glucose to ethanol. The role of each individual transcription activation domain (TAD) has been previously studied, but their interplay and their roles in enhancing the stability of the protein are not known. In this work, we designed five unique miniAdr1 constructs containing either TADs I-II-III or TAD I and III, connected by linkers of different sizes and compositions. We demonstrated that miniAdr1-BL, containing only PAR-TAD I+III with a basic linker (BL), binds the cognate DNA sequence, located in the promoter of the ADH2 (alcohol dehydrogenase 2) gene, and is necessary to stabilize the heterologous expression. In fact, we found that the sequence of the linker between TAD I and III affected the solubility of free miniAdr1 proteins, as well as the stability of their complexes with DNA. miniAdr1-BL is the stable unit able to recognize ADH2 in vitro, and hence it is a promising tool for future studies on nucleosomal DNA binding and transcription machinery assembly in vitro.
Affiliation(s)
- Memmo Buttinelli
- Department of Biology and Biotechnology “Charles Darwin”, Sapienza University of Rome, P.le Aldo Moro 5, 00185 Rome, Italy
- Gianna Panetta
- Department of Biochemical Sciences, Sapienza University of Rome, P.le Aldo Moro 5, 00185 Rome, Italy
- Ambra Bucci
- Department of Biology and Biotechnology “Charles Darwin”, Sapienza University of Rome, P.le Aldo Moro 5, 00185 Rome, Italy
- Department of Biochemical Sciences, Sapienza University of Rome, P.le Aldo Moro 5, 00185 Rome, Italy
- Daniele Frascaria
- Department of Biology and Biotechnology “Charles Darwin”, Sapienza University of Rome, P.le Aldo Moro 5, 00185 Rome, Italy
- Veronica Morea
- National Research Council of Italy (CNR), Institute of Molecular Biology and Pathology, P.le Aldo Moro 5, 00185 Rome, Italy
- Adriana Erica Miele
- Department of Biochemical Sciences, Sapienza University of Rome, P.le Aldo Moro 5, 00185 Rome, Italy
- Institut de Chimie et Biochimie Moléculaires et Supramoléculaires (ICBMS), UMR 5246 CNRS–UCBL-Université de Lyon, 43 boulevard du 11 Novembre 1918, 69622 Villeurbanne, France
- Correspondence: Tel.: +39-06-4991-0556
| |
Collapse
|
20
|
In silico structural elucidation of RNA-dependent RNA polymerase towards the identification of potential Crimean-Congo Hemorrhagic Fever Virus inhibitors. Sci Rep 2019; 9:6809. [PMID: 31048746 PMCID: PMC6497722 DOI: 10.1038/s41598-019-43129-2] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 06/21/2018] [Accepted: 04/17/2019] [Indexed: 01/05/2023] Open
Abstract
The Crimean-Congo Hemorrhagic Fever virus (CCHFV) is a segmented negative single-stranded RNA virus (-ssRNA) which causes severe hemorrhagic fever in humans with a mortality rate of ~50%. To date, no vaccine has been approved. Treatment is limited to supportive care with few investigational drugs in practice. Previous studies have identified the viral RNA-dependent RNA polymerase (RdRp) as a potential drug target due to its significant role in viral replication and transcription. Since no crystal structure is available yet, we report the structural elucidation of CCHFV-RdRp by in-depth homology modeling. Even with low sequence identity, the generated model suggests an overall structure similar to that of previously reported RdRps. More specifically, the model suggests the presence of structurally/functionally conserved RdRp motifs for polymerase function, a uniform spatial arrangement of core RdRp sub-domains, and predicted positively charged entry/exit tunnels, as seen in sNSV polymerases. Extensive pharmacophore modeling based on per-residue energy contribution with investigational drugs allowed the concise mapping of pharmacophoric features and identified potential hits. The combination of pharmacophoric features with interaction energy analysis revealed functionally important residues in the conserved motifs together with in silico predicted common inhibitory binding modes with highly potent reference compounds.
|
21
|
Ma T, Zang T, Wang Q, Ma J. Refining protein structures using enhanced sampling techniques with restraints derived from an ensemble-based model. Protein Sci 2018; 27:1842-1849. [PMID: 30098055 DOI: 10.1002/pro.3486] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 05/30/2018] [Revised: 07/05/2018] [Accepted: 07/18/2018] [Indexed: 12/12/2022]
Abstract
This paper reports a method for high-accuracy protein structural refinement, which is a direct extension of the method in our recent publication (Zang, J Chem Phys 2018; 149:072319). It combines a parallel continuous simulated tempering (PCST) method with a temperature-dependent restraint and a blind model selection scheme. In this work, the single-reference-based restraint of the previous work was changed to an ensemble-based model (EBM), in which the non-bonded Lennard-Jones term for each contacting atomic pair in the previous restraining potential was replaced by a multi-Gaussian function whose parameters are derived from an ensemble of structures such as the ones from various CASP participating groups. The purpose of EBM is to take advantage of partial "correctness" distributed among members of the structural ensemble. In total, 18 targets were refined from the refinement category of CASP10, CASP11 and CASP12. In the Top-1 group, 11 out of 18 targets had better models (greater GDT_TS scores) than the CASPR participants. In the Top-5 group, nine out of 18 were better. Our results show that the PCST-EBM method can considerably improve low-accuracy structures.
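The multi-Gaussian restraint described in this abstract can be pictured with a small sketch. The abstract does not give the functional form or how its parameters are fitted, so everything below is an assumption made for illustration: one attractive Gaussian well per ensemble member, a fixed width, and equal weights. The function names are hypothetical.

```python
import math

def wells_from_ensemble(distances, width=0.5):
    """Build one equal-weight Gaussian well per ensemble distance.

    Assumption: the centre of each well is the pair distance observed in
    one ensemble member; width and weights are illustrative choices.
    """
    n = len(distances)
    return list(distances), [width] * n, [1.0 / n] * n

def multi_gaussian_restraint(r, centers, widths, depths):
    """Restraint energy at pair distance r: a sum of attractive Gaussians."""
    return -sum(
        d * math.exp(-((r - c) ** 2) / (2.0 * w ** 2))
        for c, w, d in zip(centers, widths, depths)
    )

# Pair distances (in angstroms) for one contact across three ensemble models.
centers, widths, depths = wells_from_ensemble([4.0, 4.2, 6.0])
energy_near = multi_gaussian_restraint(4.1, centers, widths, depths)
energy_far = multi_gaussian_restraint(10.0, centers, widths, depths)
```

Under these assumptions, distances supported by several ensemble members receive a deeper well than distances supported by only one, which is one way to reward the partial "correctness" spread across the ensemble.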
Affiliation(s)
- Tianqi Ma
- Applied Physics Program and Department of Bioengineering, Rice University, Houston, Texas, 77005
- Tianwu Zang
- Applied Physics Program and Department of Bioengineering, Rice University, Houston, Texas, 77005
- Qinghua Wang
- Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas, 77030
- Jianpeng Ma
- Applied Physics Program and Department of Bioengineering, Rice University, Houston, Texas, 77005; Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas, 77030
|
22
|
Pfeiffenberger E, Bates PA. Predicting improved protein conformations with a temporal deep recurrent neural network. PLoS One 2018; 13:e0202652. [PMID: 30180164 PMCID: PMC6122789 DOI: 10.1371/journal.pone.0202652] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 04/10/2018] [Accepted: 08/07/2018] [Indexed: 02/03/2023] Open
Abstract
Accurate protein structure prediction from amino acid sequence is still an unsolved problem. The most reliable methods centre on template based modelling. However, the accuracy of these models entirely depends on the availability of experimentally resolved homologous template structures. In order to generate more accurate models, extensive physics based molecular dynamics (MD) refinement simulations are performed to sample many different conformations to find improved conformational states. In this study, we propose a deep recurrent network model, called DeepTrajectory, that is able to identify these improved conformational states, with high precision, from a variety of different MD based sampling protocols. The proposed model learns the temporal patterns of features computed from MD trajectory data in order to classify whether each recorded simulation snapshot is an improved quality conformational state, decreased quality conformational state or whether there is no perceivable change in state with respect to the starting conformation. The model was trained and tested on 904 trajectories from 42 different protein systems with a cumulative number of more than 1.7 million snapshots. We show that our model outperforms other state of the art machine-learning algorithms that do not consider temporal dependencies. To our knowledge, DeepTrajectory is the first implementation of a time-dependent deep-learning protocol that is re-trainable and able to adapt to any new MD based sampling procedure, thereby demonstrating how a neural network can be used to learn the latter part of the protein folding funnel.
Affiliation(s)
- Erik Pfeiffenberger
- Biomolecular Modelling Laboratory, The Francis Crick Institute, 1 Midland Road, London NW1 1AT, United Kingdom
- Paul A. Bates
- Biomolecular Modelling Laboratory, The Francis Crick Institute, 1 Midland Road, London NW1 1AT, United Kingdom
|
23
|
Zang T, Ma T, Wang Q, Ma J. Improving low-accuracy protein structures using enhanced sampling techniques. J Chem Phys 2018; 149:072319. [PMID: 30134714 PMCID: PMC5995690 DOI: 10.1063/1.5027243] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 02/28/2018] [Accepted: 05/23/2018] [Indexed: 11/14/2022] Open
Abstract
In this paper, we report the results of using enhanced sampling and blind selection techniques for high-accuracy protein structural refinement. By combining a parallel continuous simulated tempering (PCST) method, previously developed by Zang et al. [J. Chem. Phys. 141, 044113 (2014)], and the structure based model (SBM) as restraints, we refined 23 targets (18 from the refinement category of CASP10 and 5 from that of CASP12). We also designed a novel model selection method to blindly select high-quality models from very long simulation trajectories. The combined use of PCST-SBM with the blind selection method yielded final models that are better than the initial models. For the Top-1 group, 7 out of 23 targets had better models (greater global distance test total scores) than the critical assessment of structure prediction participants. For the Top-5 group, 10 out of 23 were better. Our results justify the crucial position of enhanced sampling in protein structure prediction and refinement and demonstrate that a considerable improvement of low-accuracy structures is achievable with current force fields.
Affiliation(s)
- Tianwu Zang
- Applied Physics Program and Department of Bioengineering, Rice University, Houston, Texas 77005, USA
- Tianqi Ma
- Applied Physics Program and Department of Bioengineering, Rice University, Houston, Texas 77005, USA
- Qinghua Wang
- Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, One Baylor Plaza, BCM-125, Houston, Texas 77030, USA
- Jianpeng Ma
- Author to whom correspondence should be addressed. Telephone: 713-798-8187. Fax: 713-796-9438
|
24
|
Deng H, Jia Y, Zhang Y. Protein structure prediction. Int J Mod Phys B 2018; 32:1840009. [PMID: 30853739 PMCID: PMC6407873 DOI: 10.1142/s021797921840009x] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Indexed: 05/30/2023]
Abstract
Predicting the 3D structure of a protein from its amino acid sequence is one of the most important unsolved problems in biophysics and computational biology. This paper attempts to give a comprehensive introduction to the most recent efforts and progress in protein structure prediction. Following the general flowchart of structure prediction, related concepts and methods are presented and discussed. Moreover, brief introductions are given to several widely used prediction methods and the community-wide critical assessment of protein structure prediction (CASP) experiments.
Affiliation(s)
- Haiyou Deng
- College of Science, Huazhong Agricultural University, Wuhan 430070, P. R. China
- Ya Jia
- College of Physical Science and Technology, Central China Normal University, Wuhan 430079, P. R. China
- Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
|
25
|
Virtanen JJ, Zhang Y. MR-REX: molecular replacement by cooperative conformational search and occupancy optimization on low-accuracy protein models. Acta Crystallogr D Struct Biol 2018; 74:606-620. [PMID: 29968671 PMCID: PMC6038387 DOI: 10.1107/s2059798318005612] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 03/17/2017] [Accepted: 04/10/2018] [Indexed: 11/10/2022] Open
Abstract
Molecular replacement (MR) has commonly been employed to derive the phase information in protein crystal X-ray diffraction, but its success rate decreases rapidly when the search model is dissimilar to the target. MR-REX has been developed to perform an MR search by replica-exchange Monte Carlo simulations, which enables cooperative rotation and translation searches and simultaneous clash and occupancy optimization. MR-REX was tested on a set of 1303 protein structures of different accuracies and successfully placed 699 structures at positions that have an r.m.s.d. of below 2 Å to the target position, which is 10% higher than was obtained by Phaser. However, case studies show that many of the models for which Phaser failed and MR-REX succeeded can be solved by Phaser by pruning them and using nondefault parameters. The factors affecting success and the parts of the methodology which lead to success are studied. The results demonstrate a new avenue for molecular replacement which outperforms (and has results that are complementary to) the state-of-the-art MR methods, in particular for distantly homologous proteins.
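The success criterion quoted in this abstract (a placement within 2 Å r.m.s.d. of the target position) can be illustrated with a short sketch. It assumes matched atom lists and deliberately applies no superposition, since the quantity of interest is the placement itself. The function names and the toy coordinates are hypothetical and are not taken from MR-REX.

```python
import math

def positional_rmsd(placed, target):
    """Root-mean-square deviation between matched atoms, without fitting."""
    if len(placed) != len(target) or not placed:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xp - xt) ** 2 + (yp - yt) ** 2 + (zp - zt) ** 2
        for (xp, yp, zp), (xt, yt, zt) in zip(placed, target)
    )
    return math.sqrt(squared / len(placed))

def is_successful_placement(placed, target, cutoff=2.0):
    """Apply the 'below 2 angstroms' criterion from the abstract."""
    return positional_rmsd(placed, target) < cutoff

# Toy example: the placed model is shifted by 1 angstrom along x.
target_xyz = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
placed_xyz = [(x + 1.0, y, z) for x, y, z in target_xyz]
shift_rmsd = positional_rmsd(placed_xyz, target_xyz)
```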
Affiliation(s)
- Jouko J. Virtanen
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
|
26
|
Correa L, Borguesan B, Farfan C, Inostroza-Ponta M, Dorn M. A Memetic Algorithm for 3-D Protein Structure Prediction Problem. IEEE/ACM Trans Comput Biol Bioinform 2018; 15:690-704. [PMID: 27925594 DOI: 10.1109/tcbb.2016.2635143] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 06/06/2023]
Abstract
Memetic Algorithms are population-based metaheuristics intrinsically concerned with exploiting all available knowledge about the problem under study. The incorporation of problem domain knowledge is not an optional mechanism, but a fundamental feature of Memetic Algorithms. In this paper, we present a Memetic Algorithm to tackle the three-dimensional protein structure prediction problem. The method uses a structured population and incorporates a Simulated Annealing algorithm as a local search strategy, as well as ad-hoc crossover and mutation operators to deal with the problem. It takes advantage of structural knowledge stored in the Protein Data Bank, by using an Angle Probability List that helps to reduce the search space and to guide the search strategy. The proposed algorithm was tested on nineteen protein sequences of amino acid residues, and the results show the ability of the algorithm to find native-like protein structures. Experimental results have revealed that the proposed algorithm can find good solutions regarding root-mean-square deviation and global distance test total score in comparison with the experimental protein structures. We also show that our results are comparable in terms of folding organization with state-of-the-art prediction methods, corroborating the effectiveness of our proposal.
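The role of Simulated Annealing as a local search step can be sketched as follows. The abstract does not state the acceptance rule, the cooling schedule, or the move set, so the Metropolis criterion, the geometric cooling, and the one-dimensional toy energy below are textbook assumptions used only for illustration. All names are hypothetical.

```python
import math
import random

def metropolis_accept(delta_energy, temperature, rng):
    """Always accept downhill moves; accept uphill moves with Boltzmann probability."""
    if delta_energy <= 0:
        return True
    return rng.random() < math.exp(-delta_energy / temperature)

def anneal(energy, neighbor, state, t_start=1.0, t_end=1e-3, cooling=0.95, rng=None):
    """Local search by simulated annealing with geometric cooling (assumed schedule)."""
    rng = rng or random.Random(0)
    current, current_e = state, energy(state)
    best, best_e = current, current_e
    temperature = t_start
    while temperature > t_end:
        candidate = neighbor(current, rng)
        candidate_e = energy(candidate)
        if metropolis_accept(candidate_e - current_e, temperature, rng):
            current, current_e = candidate, candidate_e
            if current_e < best_e:
                best, best_e = current, current_e
        temperature *= cooling
    return best, best_e

# Toy stand-in for a conformational energy: a single torsion-like variable.
def toy_energy(x):
    return (x - 3.0) ** 2

def toy_neighbor(x, rng):
    return x + rng.uniform(-0.5, 0.5)

best_state, best_energy = anneal(toy_energy, toy_neighbor, 0.0)
```

In a memetic setting, a routine of this kind would be applied to individuals of the population between crossover and mutation steps; how the paper interleaves them is not specified in the abstract.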
|
27
|
Kadian K, Vijay S, Gupta Y, Rawal R, Singh J, Anvikar A, Pande V, Sharma A. Structural modeling identifies Plasmodium vivax 4-diphosphocytidyl-2C-methyl-d-erythritol kinase (IspE) as a plausible new antimalarial drug target. Parasitol Int 2018; 67:375-385. [PMID: 29550587 DOI: 10.1016/j.parint.2018.03.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 01/08/2018] [Revised: 03/12/2018] [Accepted: 03/12/2018] [Indexed: 12/23/2022]
Abstract
Malaria parasites utilize the methylerythritol phosphate (MEP) pathway for synthesis of isoprenoid precursors, which are essential for maturation and survival of parasites during erythrocytic and gametocytic stages. The absence of the MEP pathway in the human host establishes MEP pathway enzymes as a repertoire of essential drug targets. The fourth enzyme, 4-diphosphocytidyl-2C-methyl-d-erythritol kinase (IspE), has been proven essential in pathogenic bacteria; however, it has not yet been studied in any Plasmodium species. This study was undertaken to investigate genetic polymorphism and concomitant structural implications of the Plasmodium vivax IspE (PvIspE) by employing sequencing, modeling and bioinformatics approaches. We report that the PvIspE gene displayed six non-synonymous mutations, which were restricted to non-conserved regions within the gene, from seven topographically distinct malaria-endemic regions of India. Phylogenetic studies indicated that PvIspE occupies a unique status within the Plasmodium genus and that the Plasmodium vivax IspE gene has a distant and non-conserved relation with the human ortholog Mevalonate Kinase (MAVK). Structural modeling analysis revealed that all PvIspE Indian isolates have a critically conserved canonical galacto-homoserine-mevalonate-phosphomevalonate kinase (GHMP) domain within the active site lying in a deep cleft sandwiched between ATP and CDPME-binding domains. The active core region was highly conserved among all clinical isolates, possibly owing to its >60% β-pleated rigid architecture. The mapped structural analysis revealed that the active site of PvIspE is critically conserved, both in sequence and spatially, among all Indian isolates, showing no significant changes in the active site. Our study strengthens the candidature of the Plasmodium vivax IspE enzyme as a future target for novel antimalarials.
Affiliation(s)
- Kavita Kadian
- Protein Biochemistry and Structural Biology Laboratory, National Institute of Malaria Research (ICMR), Sector-8, Dwarka, New Delhi, India
- Sonam Vijay
- Protein Biochemistry and Structural Biology Laboratory, National Institute of Malaria Research (ICMR), Sector-8, Dwarka, New Delhi, India
- Yash Gupta
- Protein Biochemistry and Structural Biology Laboratory, National Institute of Malaria Research (ICMR), Sector-8, Dwarka, New Delhi, India
- Ritu Rawal
- Protein Biochemistry and Structural Biology Laboratory, National Institute of Malaria Research (ICMR), Sector-8, Dwarka, New Delhi, India
- Jagbir Singh
- Protein Biochemistry and Structural Biology Laboratory, National Institute of Malaria Research (ICMR), Sector-8, Dwarka, New Delhi, India
- Anup Anvikar
- Epidemiology and Clinical Research, National Institute of Malaria Research (ICMR), Sector-8, Dwarka, New Delhi, India
- Veena Pande
- Department of Biotechnology, Kumaun University, Nainital, Uttarakhand, India
- Arun Sharma
- Protein Biochemistry and Structural Biology Laboratory, National Institute of Malaria Research (ICMR), Sector-8, Dwarka, New Delhi, India
|
28
|
Al Nasr K, Yousef F, Jebril R, Jones C. Analytical Approaches to Improve Accuracy in Solving the Protein Topology Problem. Molecules 2018; 23:E28. [PMID: 29360779 PMCID: PMC6017786 DOI: 10.3390/molecules23020028] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 12/05/2017] [Revised: 01/19/2018] [Accepted: 01/19/2018] [Indexed: 11/17/2022] Open
Abstract
To take advantage of recent advances in genomics and proteomics, it is critical that the three-dimensional physical structure of biological macromolecules be determined. Cryo-Electron Microscopy (cryo-EM) is a promising and improving method for obtaining these data; however, resolution is often not sufficient to directly determine the atomic-scale structure. Despite this, information on secondary structure locations is detectable. De novo modeling is a computational approach to modeling these macromolecular structures based on cryo-EM derived data. During de novo modeling, a mapping between detected secondary structures and the underlying amino acid sequence must be identified. DP-TOSS (Dynamic Programming for determining the Topology Of Secondary Structures) is one tool that attempts to automate the creation of this mapping. By treating the correspondence between the detected structures and the structures predicted from sequence data as a constraint graph problem, DP-TOSS achieved good accuracy in its original iteration. In this paper, we propose modifications to the scoring methodology of DP-TOSS to improve its accuracy. Three scoring schemes were applied to DP-TOSS and tested: (i) a skeleton-based scoring function; (ii) a geometry-based analytical function; and (iii) a multi-well potential energy-based function. A test of 25 proteins shows that a combination of these schemes can improve the performance of DP-TOSS to solve the topology determination problem for macromolecule proteins.
Affiliation(s)
- Kamal Al Nasr
- Department of Computer Science, Tennessee State University, Nashville, TN 37209, USA.
- Feras Yousef
- Department of Mathematics, The University of Jordan, Amman 11942, Jordan.
- Ruba Jebril
- Department of Computer Science, Tennessee State University, Nashville, TN 37209, USA.
- Christopher Jones
- Department of Computer Science, Tennessee State University, Nashville, TN 37209, USA.
|
29
|
Kryshtafovych A, Monastyrskyy B, Fidelis K, Moult J, Schwede T, Tramontano A. Evaluation of the template-based modeling in CASP12. Proteins 2017; 86 Suppl 1:321-334. [PMID: 29159950 DOI: 10.1002/prot.25425] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.9] [Received: 07/28/2017] [Revised: 10/22/2017] [Accepted: 11/16/2017] [Indexed: 01/29/2023]
Abstract
The article describes results of numerical evaluation of CASP12 models submitted on targets for which structural templates could be identified and for which servers produced models of relatively high accuracy. The emphasis is on analysis of details of models, and how well the models compete with experimental structures. Performance of contributing research groups is measured in terms of backbone accuracy, all-atom local geometry, and the ability to estimate local errors in models. Separate analyses for all participating groups and automatic servers were carried out. Compared with the last CASP, two years ago, there have been significant improvements in a number of areas, particularly the accuracy of protein backbone atoms, accuracy of sequence alignment between models and available structures, increased accuracy over that which can be obtained from simple copying of a closest template, and accuracy of modeling of sub-structures not present in the closest template. These advancements are likely associated with more effective strategies to build non-template regions of the targets ab initio, better algorithms to combine information from multiple templates, enhanced refinement methods, and better methods for estimating model accuracy.
Affiliation(s)
- Andriy Kryshtafovych
- Protein Structure Prediction Center, Genome Center, University of California, Davis, California
- Bohdan Monastyrskyy
- Protein Structure Prediction Center, Genome Center, University of California, Davis, California
- Krzysztof Fidelis
- Protein Structure Prediction Center, Genome Center, University of California, Davis, California
- John Moult
- Institute for Bioscience and Biotechnology Research and Department of Cell Biology and Molecular Genetics, University of Maryland, Maryland
- Torsten Schwede
- Biozentrum, University of Basel, Basel, Switzerland; SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Anna Tramontano
- Department of Biochemical Sciences, Sapienza - University of Rome, P. le A. Moro, 5, Rome, 00185
|
30
|
Zhang C, Mortuza SM, He B, Wang Y, Zhang Y. Template-based and free modeling of I-TASSER and QUARK pipelines using predicted contact maps in CASP12. Proteins 2017; 86 Suppl 1:136-151. [PMID: 29082551 DOI: 10.1002/prot.25414] [Citation(s) in RCA: 64] [Impact Index Per Article: 9.1] [Received: 07/10/2017] [Revised: 10/09/2017] [Accepted: 10/27/2017] [Indexed: 12/26/2022]
Abstract
We develop two complementary pipelines, "Zhang-Server" and "QUARK", based on I-TASSER and QUARK pipelines for template-based modeling (TBM) and free modeling (FM), and test them in the CASP12 experiment. The combination of I-TASSER and QUARK successfully folds three medium-size FM targets that have more than 150 residues, even though the interplay between the two pipelines still awaits further optimization. Newly developed sequence-based contact prediction by NeBcon plays a critical role in enhancing the quality of models produced by the new pipelines, particularly for FM targets. The inclusion of NeBcon predicted contacts as restraints in the QUARK simulations results in an average TM-score of 0.41 for the best in top five predicted models, which is 37% higher than that by the QUARK simulations without contacts. In particular, there are seven targets that are converted from non-foldable to foldable (TM-score >0.5) due to the use of contact restraints in the simulations. An additional feature in the current pipelines is the local structure quality prediction by ResQ, which provides a robust residue-level modeling error estimation. Despite the success, significant challenges still remain in ab initio modeling of multi-domain proteins and folding of β-proteins with complicated topologies bound by long-range strand-strand interactions. Improvements on domain boundary and long-range contact prediction, as well as optimal use of the predicted contacts and multiple threading alignments, are critical to address these issues seen in the CASP12 experiment.
Affiliation(s)
- Chengxin Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan
- S M Mortuza
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan
- Baoji He
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan; Institute of Theoretical Physics, Chinese Academy of Sciences, Beijing, China
- Yanting Wang
- Institute of Theoretical Physics, Chinese Academy of Sciences, Beijing, China
- Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan; Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan
|
31
|
Jończyk J, Malawska B, Bajda M. Hybrid approach to structure modeling of the histamine H3 receptor: Multi-level assessment as a tool for model verification. PLoS One 2017; 12:e0186108. [PMID: 28982153 PMCID: PMC5629032 DOI: 10.1371/journal.pone.0186108] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 06/05/2017] [Accepted: 09/25/2017] [Indexed: 12/18/2022] Open
Abstract
The crucial role of G-protein coupled receptors and the significant achievements associated with a better understanding of the spatial structure of known receptors in this family encouraged us to undertake a study on the histamine H3 receptor, whose crystal structure is still unresolved. The latest literature data and availability of different software enabled us to build homology models of higher accuracy than previously published ones. The new models are expected to be closer to crystal structures and are therefore much more helpful in the design of potential ligands. In this article, we describe the generation of homology models with the use of diverse tools and a hybrid assessment. Our study incorporates a hybrid assessment connecting knowledge-based scoring algorithms with a two-step ligand-based docking procedure. Knowledge-based scoring employs probability theory for global energy minimum determination based on information about native amino acid conformation from a dataset of experimentally determined protein structures. For the two-step docking procedure, two programs were applied: GOLD was used in the first step and Glide in the second. Hybrid approaches offer advantages by combining various theoretical methods in one modeling algorithm. The biggest advantage of hybrid methods is their intrinsic ability to self-update and self-refine when additional structural data are acquired. Moreover, the diversity of computational methods and structural data used in hybrid approaches for structure prediction limits inaccuracies resulting from theoretical approximations or fuzziness of experimental data. The results of docking to the new H3 receptor model allowed us to analyze ligand-receptor interactions for reference compounds.
Affiliation(s)
- Jakub Jończyk
- Department of Physicochemical Drug Analysis, Faculty of Pharmacy, Jagiellonian University Medical College, Krakow, Poland
- Barbara Malawska
- Department of Physicochemical Drug Analysis, Faculty of Pharmacy, Jagiellonian University Medical College, Krakow, Poland
- Marek Bajda
- Department of Physicochemical Drug Analysis, Faculty of Pharmacy, Jagiellonian University Medical College, Krakow, Poland
|
32
|
Lam SD, Das S, Sillitoe I, Orengo C. An overview of comparative modelling and resources dedicated to large-scale modelling of genome sequences. Acta Crystallogr D Struct Biol 2017; 73:628-640. [PMID: 28777078 PMCID: PMC5571743 DOI: 10.1107/s2059798317008920] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Received: 11/17/2016] [Accepted: 06/14/2017] [Indexed: 12/02/2022] Open
Abstract
Computational modelling of proteins has been a major catalyst in structural biology. Bioinformatics groups have exploited the repositories of known structures to predict high-quality structural models with high efficiency at low cost. This article provides an overview of comparative modelling, reviews recent developments and describes resources dedicated to large-scale comparative modelling of genome sequences. The value of subclustering protein domain superfamilies to guide the template-selection process is investigated. Some recent cases in which structural modelling has aided experimental work to determine very large macromolecular complexes are also cited.
Affiliation(s)
- Su Datt Lam
- Institute of Structural and Molecular Biology, UCL, Darwin Building, Gower Street, London WC1E 6BT, England
- School of Biosciences and Biotechnology, Faculty of Science and Technology, University Kebangsaan Malaysia, 43600 Bangi, Selangor, Malaysia
- Sayoni Das
- Institute of Structural and Molecular Biology, UCL, Darwin Building, Gower Street, London WC1E 6BT, England
- Ian Sillitoe
- Institute of Structural and Molecular Biology, UCL, Darwin Building, Gower Street, London WC1E 6BT, England
- Christine Orengo
- Institute of Structural and Molecular Biology, UCL, Darwin Building, Gower Street, London WC1E 6BT, England
|
33
|
Feig M. Computational protein structure refinement: Almost there, yet still so far to go. Wiley Interdiscip Rev Comput Mol Sci 2017; 7:e1307. [PMID: 30613211 PMCID: PMC6319934 DOI: 10.1002/wcms.1307] [Citation(s) in RCA: 46] [Impact Index Per Article: 6.6] [Indexed: 12/21/2022]
Abstract
Protein structures are essential in modern biology, yet experimental methods are far from being able to catch up with the rapid increase in available genomic data. Computational protein structure prediction methods aim to fill the gap, while the role of protein structure refinement is to take approximate initial template-based models and bring them closer to the true native structure. Current methods for computational structure refinement rely on molecular dynamics simulations, related sampling methods, or iterative structure optimization protocols. The best methods are able to achieve moderate degrees of refinement, but consistent refinement that can reach near-experimental accuracy remains elusive. Key issues revolve around the accuracy of the energy function, the inability to reliably rank multiple models, and the use of restraints that keep sampling close to the native state but also limit the degree of possible refinement. A different aspect is the question of what exactly the target of high-resolution refinement should be, as experimental structures are affected by experimental conditions and different biological questions require varying levels of accuracy. While improvement of the global protein structure is a difficult problem, high-resolution refinement methods that improve local structural quality, such as favorable stereochemistry and the avoidance of atomic clashes, are much more successful.
Collapse
Affiliation(s)
- Michael Feig
- Department of Biochemistry and Molecular Biology, Michigan State University, 603 Wilson Rd., Room 218 BCH, East Lansing, MI, USA; 517-432-7439
| |
|
34
|
Middleton SA, Illuminati J, Kim J. Complete fold annotation of the human proteome using a novel structural feature space. Sci Rep 2017; 7:46321. [PMID: 28406174 PMCID: PMC5390313 DOI: 10.1038/srep46321] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 01/04/2017] [Accepted: 03/14/2017] [Indexed: 11/11/2022] Open
Abstract
Recognition of protein structural fold is the starting point for many structure prediction tools and protein function inference. Fold prediction is computationally demanding and recognizing novel folds is difficult such that the majority of proteins have not been annotated for fold classification. Here we describe a new machine learning approach using a novel feature space that can be used for accurate recognition of all 1,221 currently known folds and inference of unknown novel folds. We show that our method achieves better than 94% accuracy even when many folds have only one training example. We demonstrate the utility of this method by predicting the folds of 34,330 human protein domains and showing that these predictions can yield useful insights into potential biological function, such as prediction of RNA-binding ability. Our method can be applied to de novo fold prediction of entire proteomes and identify candidate novel fold families.
Affiliation(s)
- Sarah A Middleton
- Genomics and Computational Biology Program, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Joseph Illuminati
- Department of Computer Science, University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Junhyong Kim
- Genomics and Computational Biology Program, University of Pennsylvania, Philadelphia, PA 19104, USA.,Department of Biology, University of Pennsylvania, Philadelphia, PA 19104, USA
| |
|
35
|
Khoury GA, Smadbeck J, Kieslich CA, Koskosidis AJ, Guzman YA, Tamamis P, Floudas CA. Princeton_TIGRESS 2.0: High refinement consistency and net gains through support vector machines and molecular dynamics in double-blind predictions during the CASP11 experiment. Proteins 2017; 85:1078-1098. [PMID: 28241391 DOI: 10.1002/prot.25274] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 07/27/2016] [Revised: 02/01/2017] [Accepted: 02/14/2017] [Indexed: 12/28/2022]
Abstract
Protein structure refinement is the challenging problem of operating on any protein structure prediction to improve its accuracy with respect to the native structure in a blind fashion. Although many approaches have been developed and tested during the last four CASP experiments, a majority of the methods continue to degrade models rather than improve them. Princeton_TIGRESS (Khoury et al., Proteins 2014;82:794-814) was developed previously and utilizes separate sampling and selection stages involving Monte Carlo and molecular dynamics simulations and classification using an SVM predictor. The initial implementation was shown to consistently refine protein structures 76% of the time in our own internal benchmarking on CASP 7-10 targets. In this work, we improved the sampling and selection stages and tested the method in blind predictions during CASP11. We added a decomposition of physics-based and hybrid energy functions, as well as a coordinate-free representation of the protein structure through distance-binning Cα-Cα distances to capture fine-grained movements. We performed parameter estimation to optimize the adjustable SVM parameters to maximize precision while balancing sensitivity and specificity across all cross-validated data sets, finding enrichment in our ability to select models from the populations of similar decoys generated for targets in CASPs 7-10. The MD stage was enhanced such that larger structures could be further refined. Among refinement methods that are currently implemented as web-servers, Princeton_TIGRESS 2.0 demonstrated the most consistent and most substantial net refinement in blind predictions during CASP11. The enhanced refinement protocol Princeton_TIGRESS 2.0 is freely available as a web server at http://atlas.engr.tamu.edu/refinement/. Proteins 2017; 85:1078-1098. © 2017 Wiley Periodicals, Inc.
Affiliation(s)
- George A Khoury
- Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey
| | - James Smadbeck
- Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey
| | - Chris A Kieslich
- Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, Texas.,Texas A&M Energy Institute, Texas A&M University, College Station, Texas
| | - Alexandra J Koskosidis
- Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, Texas.,Texas A&M Energy Institute, Texas A&M University, College Station, Texas
| | - Yannis A Guzman
- Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey.,Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, Texas.,Texas A&M Energy Institute, Texas A&M University, College Station, Texas
| | - Phanourios Tamamis
- Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, Texas.,Texas A&M Energy Institute, Texas A&M University, College Station, Texas
| | - Christodoulos A Floudas
- Artie McFerrin Department of Chemical Engineering, Texas A&M University, College Station, Texas.,Texas A&M Energy Institute, Texas A&M University, College Station, Texas
| |
|
36
|
Pang YP. FF12MC: A revised AMBER forcefield and new protein simulation protocol. Proteins 2016; 84:1490-516. [PMID: 27348292 PMCID: PMC5129589 DOI: 10.1002/prot.25094] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Received: 04/11/2016] [Revised: 06/16/2016] [Accepted: 06/18/2016] [Indexed: 12/25/2022]
Abstract
Specialized to simulate proteins in molecular dynamics (MD) simulations with explicit solvation, FF12MC is a combination of a new protein simulation protocol employing uniformly reduced atomic masses by tenfold and a revised AMBER forcefield FF99 with (i) shortened CH bonds, (ii) removal of torsions involving a nonperipheral sp(3) atom, and (iii) reduced 1-4 interaction scaling factors of torsions ϕ and ψ. This article reports that in multiple, distinct, independent, unrestricted, unbiased, isobaric-isothermal, and classical MD simulations FF12MC can (i) simulate the experimentally observed flipping between left- and right-handed configurations for C14-C38 of BPTI in solution, (ii) autonomously fold chignolin, CLN025, and Trp-cage with folding times that agree with the experimental values, (iii) simulate subsequent unfolding and refolding of these miniproteins, and (iv) achieve a robust Z score of 1.33 for refining protein models TMR01, TMR04, and TMR07. By comparison, the latest general-purpose AMBER forcefield FF14SB locks the C14-C38 bond to the right-handed configuration in solution under the same protein simulation conditions. Statistical survival analysis shows that FF12MC folds chignolin and CLN025 in isobaric-isothermal MD simulations 2-4 times faster than FF14SB under the same protein simulation conditions. These results suggest that FF12MC may be used for protein simulations to study kinetics and thermodynamics of miniprotein folding as well as protein structure and dynamics. Proteins 2016; 84:1490-1516. © 2016 The Authors Proteins: Structure, Function, and Bioinformatics Published by Wiley Periodicals, Inc.
Affiliation(s)
- Yuan-Ping Pang
- Computer-Aided Molecular Design Laboratory, Mayo Clinic, Rochester, MN, 55905, USA.
| |
|
37
|
Zer Aviv P, Shubely M, Moskovits Y, Viskind O, Albeck A, Vertommen D, Ruthstein S, Shokhen M, Gruzman A. A New Oxopiperazin-Based Peptidomimetic Molecule Inhibits Prostatic Acid Phosphatase Secretion and Induces Prostate Cancer Cell Apoptosis. ChemistrySelect 2016. [DOI: 10.1002/slct.201600987] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 12/30/2022]
Affiliation(s)
- Pinchas Zer Aviv
- Department of Chemistry; Bar-Ilan University; Ramat-Gan 5290002 Israel
| | - Moran Shubely
- Department of Chemistry; Bar-Ilan University; Ramat-Gan 5290002 Israel
| | - Yoni Moskovits
- Department of Chemistry; Bar-Ilan University; Ramat-Gan 5290002 Israel
| | - Olga Viskind
- Department of Chemistry; Bar-Ilan University; Ramat-Gan 5290002 Israel
| | - Amnon Albeck
- Department of Chemistry; Bar-Ilan University; Ramat-Gan 5290002 Israel
| | - Didier Vertommen
- de Duve Institute; Université catholique de Louvain; Brussels 1200 Belgium
| | - Sharon Ruthstein
- Department of Chemistry; Bar-Ilan University; Ramat-Gan 5290002 Israel
| | - Michael Shokhen
- Department of Chemistry; Bar-Ilan University; Ramat-Gan 5290002 Israel
| | - Arie Gruzman
- Department of Chemistry; Bar-Ilan University; Ramat-Gan 5290002 Israel
| |
|
38
|
Kryshtafovych A, Monastyrskyy B, Fidelis K. CASP11 statistics and the prediction center evaluation system. Proteins 2016; 84 Suppl 1:15-9. [PMID: 26857434 PMCID: PMC5479680 DOI: 10.1002/prot.25005] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Received: 11/19/2015] [Revised: 01/18/2016] [Accepted: 02/04/2016] [Indexed: 01/10/2023]
Abstract
We outline the role of the Protein Structure Prediction Center (predictioncenter.org) in conducting the CASP11 and CASP ROLL experiments, discuss the experiment statistics, and provide an overview of the present CASP infrastructure. The biggest changes compared to the previous CASPs are the implementation of the evaluation system incorporating practically all evaluation measures, statistical tests, and visualization tools historically used by the CASP assessors, the expansion of the infrastructure to incorporate new categories of contact-assisted and multimeric predictions, and the redesign of the assessors' web-workspace enabling assessments based on multiple measures for different group categories and target sets. Proteins 2016; 84(Suppl 1):15-19. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- Andriy Kryshtafovych
- Protein Structure Prediction Center, Genome and Biomedical Sciences Facilities, University of California, Davis, California, 95616
| | - Bohdan Monastyrskyy
- Protein Structure Prediction Center, Genome and Biomedical Sciences Facilities, University of California, Davis, California, 95616
| | - Krzysztof Fidelis
- Protein Structure Prediction Center, Genome and Biomedical Sciences Facilities, University of California, Davis, California, 95616.
| |
|
39
|
Modi V, Xu Q, Adhikari S, Dunbrack RL. Assessment of template-based modeling of protein structure in CASP11. Proteins 2016; 84 Suppl 1:200-20. [PMID: 27081927 DOI: 10.1002/prot.25049] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 01/24/2016] [Revised: 04/04/2016] [Accepted: 04/11/2016] [Indexed: 12/27/2022]
Abstract
We present the assessment of predictions submitted in the template-based modeling (TBM) category of CASP11 (Critical Assessment of Protein Structure Prediction). Model quality was judged on the basis of global and local measures of accuracy on all atoms including side chains. The top groups on 39 human-server targets based on model 1 predictions were LEER, Zhang, LEE, MULTICOM, and Zhang-Server. The top groups on 81 targets by server groups based on model 1 predictions were Zhang-Server, nns, BAKER-ROSETTASERVER, QUARK, and myprotein-me. In CASP11, the best models for most targets were equal to or better than the best template available in the Protein Data Bank, even for targets with poor templates. The overall performance in CASP11 is similar to the performance of predictors in CASP10 with slightly better performance on the hardest targets. For most targets, assessment measures exhibited bimodal probability density distributions. Multi-dimensional scaling of an RMSD matrix for each target typically revealed a single cluster with models similar to the target structure, with a mode in the GDT-TS density between 40 and 90, and a wide distribution of models highly divergent from each other and from the experimental structure, with density mode at a GDT-TS value of ∼20. The models in this peak in the density were either compact models with entirely the wrong fold, or highly non-compact models. The results argue for a density-driven approach in future CASP TBM assessments that accounts for the bimodal nature of these distributions instead of Z scores, which assume a unimodal, Gaussian distribution. Proteins 2016; 84(Suppl 1):200-220. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- Vivek Modi
- Fox Chase Cancer Center, Institute for Cancer Research, Philadelphia, Pennsylvania, 19111
| | - Qifang Xu
- Fox Chase Cancer Center, Institute for Cancer Research, Philadelphia, Pennsylvania, 19111
| | - Sam Adhikari
- Fox Chase Cancer Center, Institute for Cancer Research, Philadelphia, Pennsylvania, 19111
| | - Roland L Dunbrack
- Fox Chase Cancer Center, Institute for Cancer Research, Philadelphia, Pennsylvania, 19111.
| |
|
40
|
Figueroa M, Sleutel M, Vandevenne M, Parvizi G, Attout S, Jacquin O, Vandenameele J, Fischer AW, Damblon C, Goormaghtigh E, Valerio-Lepiniec M, Urvoas A, Durand D, Pardon E, Steyaert J, Minard P, Maes D, Meiler J, Matagne A, Martial JA, Van de Weerdt C. The unexpected structure of the designed protein Octarellin V.1 forms a challenge for protein structure prediction tools. J Struct Biol 2016; 195:19-30. [PMID: 27181418 DOI: 10.1016/j.jsb.2016.05.004] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 02/15/2016] [Revised: 04/19/2016] [Accepted: 05/12/2016] [Indexed: 12/26/2022]
Abstract
Despite impressive successes in protein design, designing a well-folded protein of more than 100 amino acids de novo remains a formidable challenge. Exploiting the promising biophysical features of the artificial protein Octarellin V, we improved this protein by directed evolution, thus creating a more stable and soluble protein: Octarellin V.1. Next, we obtained crystals of Octarellin V.1 in complex with crystallization chaperones and determined the tertiary structure. The experimental structure of Octarellin V.1 differs from its in silico design: the (αβα) sandwich architecture bears some resemblance to a Rossmann-like fold instead of the intended TIM-barrel fold. This surprising result gave us a unique and attractive opportunity to test the state of the art in protein structure prediction, using this artificial protein free of any natural selection. We tested 13 automated webservers for protein structure prediction and found none of them to predict the actual structure. More than 50% of them predicted a TIM-barrel fold, i.e. the structure we set out to design more than 10 years ago. In addition, local software runs that are human operated can sample a structure similar to the experimental one but fail in selecting it, suggesting that the scoring and ranking functions should be improved. We propose that artificial proteins could be used as tools to test the accuracy of protein structure prediction algorithms, because of their lack of evolutionary pressure and their unique sequence features.
Affiliation(s)
- Maximiliano Figueroa
- GIGA-Research, Molecular Biomimetics and Protein Engineering, University of Liège, Liège, Belgium.
| | - Mike Sleutel
- Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussels, Belgium
| | - Marylene Vandevenne
- GIGA-Research, Molecular Biomimetics and Protein Engineering, University of Liège, Liège, Belgium
| | - Gregory Parvizi
- GIGA-Research, Molecular Biomimetics and Protein Engineering, University of Liège, Liège, Belgium
| | - Sophie Attout
- GIGA-Research, Molecular Biomimetics and Protein Engineering, University of Liège, Liège, Belgium
| | - Olivier Jacquin
- GIGA-Research, Molecular Biomimetics and Protein Engineering, University of Liège, Liège, Belgium
| | - Julie Vandenameele
- Laboratoire d'Enzymologie et Repliement des Protéines, Centre for Protein Engineering, University of Liège, Liège, Belgium
| | - Axel W Fischer
- Department of Chemistry, Center for Structural Biology, Vanderbilt University, Nashville, TN, United States
| | | | - Erik Goormaghtigh
- Laboratory for the Structure and Function of Biological Membranes, Center for Structural Biology and Bioinformatics, Université Libre de Bruxelles, Brussels, Belgium
| | - Marie Valerio-Lepiniec
- Institute for Integrative Biology of the Cell (I2BC), UMT 9198, CEA, CNRS, Université Paris-Sud, Orsay, France
| | - Agathe Urvoas
- Institute for Integrative Biology of the Cell (I2BC), UMT 9198, CEA, CNRS, Université Paris-Sud, Orsay, France
| | - Dominique Durand
- Institute for Integrative Biology of the Cell (I2BC), UMT 9198, CEA, CNRS, Université Paris-Sud, Orsay, France
| | - Els Pardon
- Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussels, Belgium; Structural Biology Research Center, VIB, Pleinlaan 2, 1050 Brussels, Belgium
| | - Jan Steyaert
- Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussels, Belgium; Structural Biology Research Center, VIB, Pleinlaan 2, 1050 Brussels, Belgium
| | - Philippe Minard
- Institute for Integrative Biology of the Cell (I2BC), UMT 9198, CEA, CNRS, Université Paris-Sud, Orsay, France
| | - Dominique Maes
- Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussels, Belgium
| | - Jens Meiler
- Department of Chemistry, Center for Structural Biology, Vanderbilt University, Nashville, TN, United States
| | - André Matagne
- Laboratoire d'Enzymologie et Repliement des Protéines, Centre for Protein Engineering, University of Liège, Liège, Belgium
| | - Joseph A Martial
- GIGA-Research, Molecular Biomimetics and Protein Engineering, University of Liège, Liège, Belgium
| | - Cécile Van de Weerdt
- GIGA-Research, Molecular Biomimetics and Protein Engineering, University of Liège, Liège, Belgium.
| |
|
41
|
Li J, Cheng J. A Stochastic Point Cloud Sampling Method for Multi-Template Protein Comparative Modeling. Sci Rep 2016; 6:25687. [PMID: 27161489 PMCID: PMC4861977 DOI: 10.1038/srep25687] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 11/05/2015] [Accepted: 04/21/2016] [Indexed: 12/04/2022] Open
Abstract
Generating tertiary structural models for a target protein from the known structure of its homologous template proteins and their pairwise sequence alignment is a key step in protein comparative modeling. Here, we developed a new stochastic point cloud sampling method, called MTMG, for multi-template protein model generation. The method first superposes the backbones of template structures, and the Cα atoms of the superposed templates form a point cloud for each position of a target protein, which are represented by a three-dimensional multivariate normal distribution. MTMG stochastically resamples the positions for Cα atoms of the residues whose positions are uncertain from the distribution, and accepts or rejects new position according to a simulated annealing protocol, which effectively removes atomic clashes commonly encountered in multi-template comparative modeling. We benchmarked MTMG on 1,033 sequence alignments generated for CASP9, CASP10 and CASP11 targets, respectively. Using multiple templates with MTMG improves the GDT-TS score and TM-score of structural models by 2.96–6.37% and 2.42–5.19% on the three datasets over using single templates. MTMG’s performance was comparable to Modeller in terms of GDT-TS score, TM-score, and GDT-HA score, while the average RMSD was improved by a new sampling approach. The MTMG software is freely available at: http://sysbio.rnet.missouri.edu/multicom_toolbox/mtmg.html.
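As a rough illustration of the sampling idea in this abstract, the sketch below fits a three-dimensional normal distribution to a cloud of Cα positions, draws a candidate position from it, and applies a Metropolis-style acceptance test of the kind used in simulated annealing. The function names, the small jitter term, and the acceptance rule are assumptions made for illustration; the abstract does not specify MTMG's energy function or cooling schedule, so both are left to the caller.

```python
import math
import random


def fit_gaussian(points):
    """Return the mean vector and 3x3 sample covariance of (x, y, z) points."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points to estimate a covariance")
    mean = [sum(p[i] for p in points) / n for i in range(3)]
    cov = [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in points) / (n - 1)
            for j in range(3)] for i in range(3)]
    return mean, cov


def cholesky3(cov, jitter=1e-9):
    """Lower-triangular factor L with L L^T ~ cov.

    The jitter on the diagonal (an assumption of this sketch) keeps the
    factorization defined when few templates give a rank-deficient cloud.
    """
    L = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(max(cov[i][i] + jitter - s, 0.0))
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j] if L[j][j] > 0 else 0.0
    return L


def sample_position(mean, L, rng):
    """Draw one candidate position from N(mean, L L^T)."""
    z = [rng.gauss(0.0, 1.0) for _ in range(3)]
    return tuple(mean[i] + sum(L[i][k] * z[k] for k in range(i + 1))
                 for i in range(3))


def metropolis_accept(delta_energy, temperature, rng):
    """Accept improvements always; accept worse moves with Boltzmann probability."""
    if delta_energy <= 0:
        return True
    return rng.random() < math.exp(-delta_energy / temperature)
```

In use, a caller would compute the energy change of a proposed Cα move (for example, a clash penalty of its own choosing), call `metropolis_accept` with a temperature that is lowered over the run, and keep the new position only when the call returns `True`.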
Affiliation(s)
- Jilong Li
- Department of Computer Science, University of Missouri, Columbia, MO 65211, USA
| | - Jianlin Cheng
- Department of Computer Science, University of Missouri, Columbia, MO 65211, USA.,Informatics Institute, University of Missouri, Columbia, MO 65211, USA
| |
|
42
|
Li W, Schaeffer RD, Otwinowski Z, Grishin NV. Estimation of Uncertainties in the Global Distance Test (GDT_TS) for CASP Models. PLoS One 2016; 11:e0154786. [PMID: 27149620 PMCID: PMC4858170 DOI: 10.1371/journal.pone.0154786] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 01/04/2016] [Accepted: 04/19/2016] [Indexed: 11/19/2022] Open
Abstract
The Critical Assessment of techniques for protein Structure Prediction (or CASP) is a community-wide blind test experiment to reveal the best accomplishments of structure modeling. Assessors have been using the Global Distance Test (GDT_TS) measure to quantify prediction performance since CASP3 in 1998. However, identifying significant score differences between close models is difficult because of the lack of uncertainty estimations for this measure. Here, we utilized the atomic fluctuations caused by structure flexibility to estimate the uncertainty of GDT_TS scores. Structures determined by nuclear magnetic resonance are deposited as ensembles of alternative conformers that reflect the structural flexibility, whereas standard X-ray refinement produces the static structure averaged over time and space for the dynamic ensembles. To recapitulate the structural heterogeneous ensemble in the crystal lattice, we performed time-averaged refinement for X-ray datasets to generate structural ensembles for our GDT_TS uncertainty analysis. Using those generated ensembles, our study demonstrates that the time-averaged refinements produced structure ensembles with better agreement with the experimental datasets than the averaged X-ray structures with B-factors. The uncertainty of the GDT_TS scores, quantified by their standard deviations (SDs), increases for scores lower than 50 and 70, with maximum SDs of 0.3 and 1.23 for X-ray and NMR structures, respectively. We also applied our procedure to the high accuracy version of GDT-based score and produced similar results with slightly higher SDs. To facilitate score comparisons by the community, we developed a user-friendly web server that produces structure ensembles for NMR and X-ray structures and is accessible at http://prodata.swmed.edu/SEnCS. Our work helps to identify the significance of GDT_TS score differences, as well as to provide structure ensembles for estimating SDs of any scores.
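To make the quantity discussed in this abstract concrete, the following sketch computes a simplified GDT_TS for two sets of Cα coordinates and then the standard deviation of that score across an ensemble of reference conformers. It assumes the structures are already superposed and aligned residue by residue; the full measure searches over many superpositions to maximize the fraction of residues under each cutoff, which this sketch does not attempt. The function names are hypothetical and are not taken from the authors' web server.

```python
import math
import statistics

# Standard GDT_TS distance cutoffs, in angstroms.
CUTOFFS = (1.0, 2.0, 4.0, 8.0)


def gdt_ts(model, reference):
    """Simplified GDT_TS (0-100) for two equal-length lists of (x, y, z) tuples.

    Assumes a fixed, pre-computed superposition (an assumption of this
    sketch), so it is a lower bound on the score a full search would give.
    """
    if len(model) != len(reference) or not model:
        raise ValueError("need two non-empty coordinate lists of equal length")
    dists = [math.dist(a, b) for a, b in zip(model, reference)]
    fractions = [sum(d <= c for d in dists) / len(dists) for c in CUTOFFS]
    return 100.0 * sum(fractions) / len(CUTOFFS)


def gdt_ts_spread(model, ensemble):
    """Mean and population standard deviation of GDT_TS over an ensemble."""
    scores = [gdt_ts(model, ref) for ref in ensemble]
    return statistics.mean(scores), statistics.pstdev(scores)
```

Two models whose scores differ by less than the spread returned by `gdt_ts_spread` would, under this simplified reading, not be distinguishable with confidence.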
Affiliation(s)
- Wenlin Li
- Department of Biochemistry and Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, 75390–9050, United States of America
| | - R. Dustin Schaeffer
- Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, Texas, 75390–9050, United States of America
| | - Zbyszek Otwinowski
- Department of Biochemistry and Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, 75390–9050, United States of America
| | - Nick V. Grishin
- Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, Dallas, Texas, 75390–9050, United States of America
- Department of Biochemistry and Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, Texas, 75390–9050, United States of America
| |
|
43
|
Gupta RS, Khadka B. Evidence for the presence of key chlorophyll-biosynthesis-related proteins in the genus Rubrobacter (Phylum Actinobacteria) and its implications for the evolution and origin of photosynthesis. Photosynthesis Research 2016; 127:201-18. [PMID: 26174026 DOI: 10.1007/s11120-015-0177-y] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 06/05/2015] [Accepted: 07/06/2015] [Indexed: 05/18/2023]
Abstract
Homologs showing high degree of sequence similarity to the three subunits of the protochlorophyllide oxidoreductase enzyme complex (viz. BchL, BchN, and BchB), which carries out a central role in chlorophyll-bacteriochlorophyll (Bchl) biosynthesis, are uniquely found in photosynthetic organisms. The results of BLAST searches and homology modeling presented here show that proteins exhibiting a high degree of sequence and structural similarity to the BchB and BchN proteins are also present in organisms from the high G+C Gram-positive phylum of Actinobacteria, specifically in members of the genus Rubrobacter (R. xylanophilus and R. radiotolerans). The results presented exclude the possibility that the observed BLAST hits are for subunits of the nitrogenase complex or the chlorin reductase complex. The branching in phylogenetic trees and the sequence characteristics of the Rubrobacter BchB/BchN homologs indicate that these homologs are distinct from those found in other photosynthetic bacteria and that they may represent ancestral forms of the BchB/BchN proteins. Although a homolog showing high degree of sequence similarity to the BchL protein was not detected in Rubrobacter, another protein, belonging to the ParA/Soj/MinD family, present in these bacteria, exhibits high degree of structural similarity to the BchL. In addition to the BchB/BchN homologs, Rubrobacter species also contain homologs showing high degree of sequence similarity to different subunits of magnesium chelatase (BchD, BchH, and BchI) as well as proteins showing significant similarity to the BchP and BchG proteins. Interestingly, no homologs corresponding to the BchX, BchY, and BchZ proteins were detected in the Rubrobacter species. These results provide the first suggestive evidence that some form of photosynthesis either exists or was anciently present within the phylum Actinobacteria (high G+C Gram-positive) in members of the genus Rubrobacter. The significance of these results concerning the origin of the Bchl-based photosynthesis is also discussed.
Affiliation(s)
- Radhey S Gupta
- Department of Biochemistry, McMaster University, Hamilton, ON, L8N 3Z5, Canada.
| | - Bijendra Khadka
- Department of Biochemistry, McMaster University, Hamilton, ON, L8N 3Z5, Canada
| |
|
44
|
Punta M, Mistry J. Homology-Based Annotation of Large Protein Datasets. Methods Mol Biol 2016; 1415:153-176. [PMID: 27115632 DOI: 10.1007/978-1-4939-3572-7_8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/05/2023]
Abstract
Advances in DNA sequencing technologies have led to an increasing amount of protein sequence data being generated. Only a small fraction of this protein sequence data will have experimental annotation associated with them. Here, we describe a protocol for in silico homology-based annotation of large protein datasets that makes extensive use of manually curated collections of protein families. We focus on annotations provided by the Pfam database and suggest ways to identify family outliers and family variations. This protocol may be useful to people who are new to protein data analysis, or who are unfamiliar with the current computational tools that are available.
Affiliation(s)
- Marco Punta
- Sorbonne Universités, UPMC-Univ P6, CNRS, Laboratoire de Biologie Computationnelle et Quantitative - UMR 7238, 15 rue de l'Ecole de Médecine, Paris, France.
| | - Jaina Mistry
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| |
|
45
|
Yang J, Zhang Y. Protein Structure and Function Prediction Using I-TASSER. Current Protocols in Bioinformatics 2015; 52:5.8.1-5.8.15. [PMID: 26678386 PMCID: PMC4871818 DOI: 10.1002/0471250953.bi0508s52] [Citation(s) in RCA: 304] [Impact Index Per Article: 33.8] [Indexed: 11/06/2022]
Abstract
I-TASSER is a hierarchical protocol for automated protein structure prediction and structure-based function annotation. Starting from the amino acid sequence of target proteins, I-TASSER first generates full-length atomic structural models from multiple threading alignments and iterative structural assembly simulations followed by atomic-level structure refinement. The biological functions of the protein, including ligand-binding sites, enzyme commission number, and gene ontology terms, are then inferred from known protein function databases based on sequence and structure profile comparisons. I-TASSER is freely available as both an on-line server and a stand-alone package. This unit describes how to use the I-TASSER protocol to generate structure and function prediction and how to interpret the prediction results, as well as alternative approaches for further improving the I-TASSER modeling quality for distant-homologous and multi-domain protein targets.
Affiliation(s)
- Jianyi Yang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan
- School of Mathematical Sciences, Nankai University, Tianjin, People's Republic of China
| | - Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan
- Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan
| |
|
46
|
Agrawal P, Bhalla S, Usmani SS, Singh S, Chaudhary K, Raghava GPS, Gautam A. CPPsite 2.0: a repository of experimentally validated cell-penetrating peptides. Nucleic Acids Res 2015; 44:D1098-103. [PMID: 26586798 PMCID: PMC4702894 DOI: 10.1093/nar/gkv1266] [Citation(s) in RCA: 218] [Impact Index Per Article: 24.2] [Received: 09/07/2015] [Accepted: 11/03/2015] [Indexed: 11/14/2022] Open
Abstract
CPPsite 2.0 (http://crdd.osdd.net/raghava/cppsite/) is an updated version of manually curated database (CPPsite) of cell-penetrating peptides (CPPs). The current version holds around 1850 peptide entries, which is nearly two times than the entries in the previous version. The updated data were curated from research papers and patents published in last three years. It was observed that most of the CPPs discovered/ tested, in last three years, have diverse chemical modifications (e.g. non-natural residues, linkers, lipid moieties, etc.). We have compiled this information on chemical modifications systematically in the updated version of the database. In order to understand the structure-function relationship of these peptides, we predicted tertiary structure of CPPs, possessing both modified and natural residues, using state-of-the-art techniques. CPPsite 2.0 also maintains information about model systems (in vitro/in vivo) used for CPP evaluation and different type of cargoes (e.g. nucleic acid, protein, nanoparticles, etc.) delivered by these peptides. In order to assist a wide range of users, we developed a user-friendly responsive website, with various tools, suitable for smartphone, tablet and desktop users. In conclusion, CPPsite 2.0 provides significant improvements over the previous version in terms of data content.
Affiliation(s)
- Piyush Agrawal
  - Bioinformatics Centre, CSIR-Institute of Microbial Technology, Chandigarh 160036, India
- Sherry Bhalla
  - Bioinformatics Centre, CSIR-Institute of Microbial Technology, Chandigarh 160036, India
- Sandeep Singh
  - Bioinformatics Centre, CSIR-Institute of Microbial Technology, Chandigarh 160036, India
- Kumardeep Chaudhary
  - Bioinformatics Centre, CSIR-Institute of Microbial Technology, Chandigarh 160036, India
- Gajendra P S Raghava
  - Bioinformatics Centre, CSIR-Institute of Microbial Technology, Chandigarh 160036, India
- Ankur Gautam
  - Bioinformatics Centre, CSIR-Institute of Microbial Technology, Chandigarh 160036, India
47
Joung I, Lee SY, Cheng Q, Kim JY, Joo K, Lee SJ, Lee J. Template-free modeling by LEE and LEER in CASP11. Proteins 2015; 84 Suppl 1:118-30. [DOI: 10.1002/prot.24944] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 05/15/2015] [Revised: 08/26/2015] [Accepted: 10/11/2015] [Indexed: 12/25/2022]
Affiliation(s)
- InSuk Joung
  - Center for In Silico Protein Science, Korea Institute for Advanced Study, Seoul 130-722, Korea
  - School of Computational Sciences, Korea Institute for Advanced Study, Seoul 130-722, Korea
- Sun Young Lee
  - Center for In Silico Protein Science, Korea Institute for Advanced Study, Seoul 130-722, Korea
- Qianyi Cheng
  - Center for In Silico Protein Science, Korea Institute for Advanced Study, Seoul 130-722, Korea
  - School of Computational Sciences, Korea Institute for Advanced Study, Seoul 130-722, Korea
- Jong Yun Kim
  - Center for In Silico Protein Science, Korea Institute for Advanced Study, Seoul 130-722, Korea
- Keehyoung Joo
  - Center for In Silico Protein Science, Korea Institute for Advanced Study, Seoul 130-722, Korea
  - Center for Advanced Computation, Korea Institute for Advanced Study, Seoul 130-722, Korea
- Sung Jong Lee
  - Center for In Silico Protein Science, Korea Institute for Advanced Study, Seoul 130-722, Korea
  - Department of Physics, University of Suwon, Hwaseong-Si, Gyeonggi-Do 445-743, Korea
- Jooyoung Lee
  - Center for In Silico Protein Science, Korea Institute for Advanced Study, Seoul 130-722, Korea
  - School of Computational Sciences, Korea Institute for Advanced Study, Seoul 130-722, Korea
  - Center for Advanced Computation, Korea Institute for Advanced Study, Seoul 130-722, Korea
48
Somarowthu S. Progress and Current Challenges in Modeling Large RNAs. J Mol Biol 2015; 428:736-747. [PMID: 26585404 DOI: 10.1016/j.jmb.2015.11.011] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 08/06/2015] [Revised: 11/03/2015] [Accepted: 11/08/2015] [Indexed: 12/21/2022]
Abstract
Recent breakthroughs in next-generation sequencing technologies have led to the discovery of several classes of non-coding RNAs (ncRNAs). It is now apparent that RNA molecules are not just carriers of genetic information but also key players in many cellular processes. While there has been a rapid increase in the number of ncRNA sequences deposited in various databases over the past decade, the biological functions of these ncRNAs are largely not well understood. Similar to proteins, RNA molecules carry out a function by forming specific three-dimensional structures. Understanding the function of a particular RNA therefore requires a detailed knowledge of its structure. However, determining experimental structures of RNA is extremely challenging. In fact, RNA-only structures represent just 1% of the total structures deposited in the PDB. Thus, computational methods that predict three-dimensional RNA structures are in high demand. Computational models can provide valuable insights into structure-function relationships in ncRNAs and can aid in the development of functional hypotheses and experimental designs. In recent years, a diverse set of RNA structure prediction tools has become available; these tools differ in computational time, input data and accuracy. This review discusses the recent progress and challenges in RNA structure prediction methods.
Affiliation(s)
- Srinivas Somarowthu
  - Department of Molecular, Cellular and Developmental Biology, Yale University, 219 Prospect Street, Kline Biology Tower, New Haven, CT 06511, USA
49
Poleksic A. A polynomial time algorithm for computing the area under a GDT curve. Algorithms Mol Biol 2015; 10:27. [PMID: 26504491 PMCID: PMC4620747 DOI: 10.1186/s13015-015-0058-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2015] [Accepted: 10/09/2015] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Progress in the field of protein three-dimensional structure prediction depends on the development of new and improved algorithms for measuring the quality of protein models. Perhaps the best descriptor of the quality of a protein model is the GDT function that maps each distance cutoff θ to the number of atoms in the protein model that can be fit under the distance θ from the corresponding atoms in the experimentally determined structure. It has long been known that the area under the graph of this function (GDT_A) can serve as a reliable, single numerical measure of the model quality. Unfortunately, while the well-known GDT_TS metric provides a crude approximation of GDT_A, no algorithm currently exists that is capable of computing accurate estimates of GDT_A.
METHODS: We prove that GDT_A is well defined and that it can be approximated by Riemann sums, using available methods for computing accurate (near-optimal) GDT function values.
RESULTS: In contrast to the GDT_TS metric, GDT_A is neither insensitive to large nor oversensitive to small changes in the model's coordinates. Moreover, the problem of computing GDT_A is tractable. More specifically, GDT_A can be computed in cubic asymptotic time in the size of the protein model.
CONCLUSIONS: This paper presents the first algorithm capable of computing near-optimal estimates of the area under the GDT function for a protein model. We believe that the techniques implemented in our algorithm will pave the way for the development of more practical and reliable procedures for estimating 3D model quality.
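As a rough illustration of the Riemann-sum idea in this abstract, the sketch below approximates the area under a GDT curve in Python. It is not the paper's algorithm: it assumes the model and native structures are already superimposed, so it skips the search for the optimal superposition at each cutoff, which is the hard part the paper addresses. The function names, the 10 Å upper cutoff, and the toy coordinates are all assumptions made for the example. The GDT_TS-style score uses the conventional 1, 2, 4 and 8 Å cutoffs.

```python
import math

def gdt_fraction(model, native, cutoff):
    """Fraction of model atoms within `cutoff` of their native counterparts.

    Assumption: both coordinate lists are already superimposed; no
    superposition search is performed here.
    """
    within = sum(1 for a, b in zip(model, native) if math.dist(a, b) <= cutoff)
    return within / len(model)

def gdt_area(model, native, max_cutoff=10.0, steps=1000):
    """Midpoint Riemann sum of the GDT curve on [0, max_cutoff], scaled to [0, 1]."""
    width = max_cutoff / steps
    total = sum(gdt_fraction(model, native, (i + 0.5) * width) for i in range(steps))
    return total * width / max_cutoff

def gdt_ts_like(model, native):
    """Average of the GDT curve at the four conventional GDT_TS cutoffs (in Å)."""
    return sum(gdt_fraction(model, native, c) for c in (1.0, 2.0, 4.0, 8.0)) / 4.0

# Hypothetical toy data: four atoms displaced by 0.5, 1.5, 3.0 and 9.0 Å.
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
model = [(0.0, 0.5, 0.0), (1.0, 1.5, 0.0), (2.0, 3.0, 0.0), (3.0, 9.0, 0.0)]

print(gdt_ts_like(model, native))
print(gdt_area(model, native))
```

In this toy case the four-point score cannot distinguish a 3 Å displacement from one just under 4 Å, whereas the area changes continuously with every displacement below the upper cutoff.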
50
Meier A, Söding J. Automatic Prediction of Protein 3D Structures by Probabilistic Multi-template Homology Modeling. PLoS Comput Biol 2015; 11:e1004343. [PMID: 26496371 PMCID: PMC4619893 DOI: 10.1371/journal.pcbi.1004343] [Citation(s) in RCA: 91] [Impact Index Per Article: 10.1] [Received: 02/23/2015] [Accepted: 05/19/2015] [Indexed: 11/22/2022] Open
Abstract
Homology modeling predicts the 3D structure of a query protein based on the sequence alignment with one or more template proteins of known structure. Its great importance for biological research is owed to its speed, simplicity, reliability and wide applicability, covering more than half of the residues in protein sequence space. Although multiple templates have been shown to generally increase model quality over single templates, the information from multiple templates has so far been combined using empirically motivated, heuristic approaches. We present here a rigorous statistical framework for multi-template homology modeling. First, we find that the query proteins’ atomic distance restraints can be accurately described by two-component Gaussian mixtures. This insight allowed us to apply the standard laws of probability theory to combine restraints from multiple templates. Second, we derive theoretically optimal weights to correct for the redundancy among related templates. Third, a heuristic template selection strategy is proposed. We improve the average GDT-ha model quality score by 11% over single template modeling and by 6.5% over a conventional multi-template approach on a set of 1000 query proteins. Robustness with respect to wrong constraints is likewise improved. We have integrated our multi-template modeling approach with the popular MODELLER homology modeling software in our free HHpred server http://toolkit.tuebingen.mpg.de/hhpred and also offer open source software for running MODELLER with the new restraints at https://bitbucket.org/soedinglab/hh-suite.
Since a protein’s function is largely determined by its structure, predicting a protein’s structure from its amino acid sequence can be very useful to understand its molecular functions and its role in biological pathways. By far the most widely used computational approach for protein structure prediction relies on detecting a homologous relationship with a protein of known structure and using this protein as a template to model the structure of the query protein on it. The basic concepts of this homology modelling approach have not changed during the last 20 years. In this study we extend the probabilistic formulation of homology modelling to the consistent treatment of multiple templates. Our new theoretical approach allowed us to improve the quality of homology models by 11% over a baseline single-template approach and by 6.5% over a multi-template approach.
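To make the idea of a two-component Gaussian mixture restraint concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not the authors' derivation: the abstract does not give the mixture parameters, the combination rule, or the form of the redundancy weights, so the narrow-plus-wide component layout, the weighted sum of log-likelihoods, and every name and number below are hypothetical.

```python
import math

def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def mixture_pdf(d, components):
    """Density of a distance d under a Gaussian mixture.

    `components` is a list of (weight, mean, sd) tuples whose weights sum to 1.
    """
    return sum(w * gaussian_pdf(d, mean, sd) for w, mean, sd in components)

def combined_neg_log_likelihood(d, restraints):
    """Weighted sum of per-template negative log-likelihoods for a distance d.

    `restraints` is a list of (template_weight, components) pairs. The weighted
    sum is an assumed, simplified combination rule chosen for illustration.
    """
    return -sum(tw * math.log(mixture_pdf(d, comps)) for tw, comps in restraints)

# Hypothetical restraints from two templates that suggest 5.0 Å and 5.4 Å.
# Each mixture pairs a narrow component with a wide one at the same mean.
restraints = [
    (1.0, [(0.8, 5.0, 0.5), (0.2, 5.0, 3.0)]),
    (0.6, [(0.8, 5.4, 0.5), (0.2, 5.4, 3.0)]),
]

for d in (5.2, 8.0):
    print(d, combined_neg_log_likelihood(d, restraints))
```

The wide component keeps the penalty for a distance far from the template value finite, which is one way a mixture restraint can tolerate an incorrectly aligned template.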
Affiliation(s)
- Armin Meier
  - Quantitative and Computational Biology, Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
  - Gene Center, Ludwig-Maximilians-Universität München, Munich, Germany
- Johannes Söding
  - Quantitative and Computational Biology, Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
  - Gene Center, Ludwig-Maximilians-Universität München, Munich, Germany
|