1
|
Zhang H, Lan J, Wang H, Lu R, Zhang N, He X, Yang J, Chen L. AlphaFold2 in biomedical research: facilitating the development of diagnostic strategies for disease. Front Mol Biosci 2024; 11:1414916. [PMID: 39139810 PMCID: PMC11319189 DOI: 10.3389/fmolb.2024.1414916] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/09/2024] [Accepted: 07/15/2024] [Indexed: 08/15/2024] Open
Abstract
Proteins, as the primary executors of physiological activity, serve as a key factor in disease diagnosis and treatment. Research into their structures, functions, and interactions is essential to better understand disease mechanisms and potential therapies. DeepMind's AlphaFold2, a deep-learning protein structure prediction model, has proven to be remarkably accurate, and it is widely employed in various aspects of diagnostic research, such as the study of disease biomarkers, microorganism pathogenicity, antigen-antibody structures, and missense mutations. Thus, AlphaFold2 serves as an exceptional tool to bridge fundamental protein research with breakthroughs in disease diagnosis, developments in diagnostic strategies, and the design of novel therapeutic approaches and enhancements in precision medicine. This review outlines the architecture, highlights, and limitations of AlphaFold2, placing particular emphasis on its applications within diagnostic research grounded in disciplines such as immunology, biochemistry, molecular biology, and microbiology.
Collapse
Affiliation(s)
- Hong Zhang
- School of Laboratory Medicine, Hangzhou Medical College, Hangzhou, China
| | - Jiajing Lan
- School of Laboratory Medicine, Hangzhou Medical College, Hangzhou, China
| | - Huijie Wang
- School of Laboratory Medicine, Hangzhou Medical College, Hangzhou, China
| | - Ruijie Lu
- School of Laboratory Medicine, Hangzhou Medical College, Hangzhou, China
| | - Nanqi Zhang
- School of Laboratory Medicine, Hangzhou Medical College, Hangzhou, China
| | - Xiaobai He
- School of Laboratory Medicine, Hangzhou Medical College, Hangzhou, China
- Key Laboratory of Biomarkers and In Vitro Diagnosis Translation of Zhejiang Province, Hangzhou, China
| | - Jun Yang
- School of Laboratory Medicine, Hangzhou Medical College, Hangzhou, China
| | - Linjie Chen
- School of Laboratory Medicine, Hangzhou Medical College, Hangzhou, China
- Zhejiang Engineering Research Centre for Key Technology of Diagnostic Testing, Hangzhou, China
| |
Collapse
|
2
|
Huh E, Agosto MA, Wensel TG, Lichtarge O. Coevolutionary signals in metabotropic glutamate receptors capture residue contacts and long-range functional interactions. J Biol Chem 2023; 299:103030. [PMID: 36806686 PMCID: PMC10060750 DOI: 10.1016/j.jbc.2023.103030] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/30/2022] [Revised: 02/09/2023] [Accepted: 02/10/2023] [Indexed: 02/18/2023] Open
Abstract
Upon ligand binding to a G protein-coupled receptor, extracellular signals are transmitted into a cell through sets of residue interactions that translate ligand binding into structural rearrangements. These interactions needed for functions impose evolutionary constraints so that, on occasion, mutations in one position may be compensated by other mutations at functionally coupled positions. To quantify the impact of amino acid substitutions in the context of major evolutionary divergence in the G protein-coupled receptor subfamily of metabotropic glutamate receptors (mGluRs), we combined two phylogenetic-based algorithms, Evolutionary Trace and covariation Evolutionary Trace, to infer potential structure-function couplings and roles in mGluRs. We found a subset of evolutionarily important residues at known functional sites and evidence of coupling among distinct structural clusters in mGluR. In addition, experimental mutagenesis and functional assays confirmed that some highly covariant residues are coupled, revealing their synergy. Collectively, these findings inform a critical step toward understanding the molecular and structural basis of amino acid variation patterns within mGluRs and provide insight for drug development, protein engineering, and analysis of naturally occurring variants.
Collapse
Affiliation(s)
- Eunna Huh
- Department of Pharmacology and Chemical Biology, Baylor College of Medicine, Houston, Texas, USA
| | - Melina A Agosto
- Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas, USA; Retina and Optic Nerve Research Laboratory, Department of Physiology and Biophysics, Dalhousie University, Halifax, Canada
| | - Theodore G Wensel
- Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas, USA
| | - Olivier Lichtarge
- Department of Pharmacology and Chemical Biology, Baylor College of Medicine, Houston, Texas, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, USA.
| |
Collapse
|
3
|
Wu M, Lv K, Li J, Wu B, He B. Coevolutionary analysis reveals a distal amino acid residue pair affecting the catalytic activity of GH5 processive endoglucanase from Bacillus subtilis BS-5. Biotechnol Bioeng 2022; 119:2105-2114. [PMID: 35438195 DOI: 10.1002/bit.28113] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/11/2022] [Revised: 04/05/2022] [Accepted: 04/08/2022] [Indexed: 11/06/2022]
Abstract
EG5C-1, processive endoglucanase from Bacillus subtilis, is a typical bifunctional cellulase with endoglucanase and exoglucanase activities. The engineering of processive endoglucanase focuses on the catalytic pocket or carbohydrate-binding module tailoring based on sequence/structure information. Herein, a computational strategy was applied to identify the desired mutants in the enzyme molecule by evolutionary coupling analysis; subsequently, four residue pairs were selected as evolutionary mutational hotspots. Based on iterative-saturation mutagenesis and subsequent enzymatic activity analysis, a superior mutant K51T/L93T was identified away from the active center. This variant had increased specific activity from 4170 U/µmol of wild-type (WT) to 5678 U/µmol towards CMC-Na and an increase towards the substrate Avicel from 320 U/µmol in WT to 521 U/µmol. In addition, kinetic measurements suggested that superior mutant K51T/L93T had a high substrate affinity (Km ) and a remarkable improvement in catalytic efficiency (kcat /Km ). Furthermore, molecular dynamics simulations revealed that the K51T/L93T mutation altered the spatial conformation at the active site cleft, enhancing the interaction frequency between active site residues and substrate, improving catalytic efficiency and substrate affinity. The current studies provided some perspectives on the effects of distal residue substitution, which might assist in the engineering of processive endoglucanase or other glycoside hydrolases. This article is protected by copyright. All rights reserved.
Collapse
Affiliation(s)
- Mujunqi Wu
- College of Biotechnology and Pharmaceutical Engineering, Nanjing Tech University, 30 Puzhunan road, Nanjing, 211816, Jiangsu, China
| | - Kemin Lv
- College of Biotechnology and Pharmaceutical Engineering, Nanjing Tech University, 30 Puzhunan road, Nanjing, 211816, Jiangsu, China
| | - Jiahuang Li
- School of Biopharmacy, China Pharmaceutical University, Nanjing, 211198, Jiangsu, China
| | - Bin Wu
- College of Biotechnology and Pharmaceutical Engineering, Nanjing Tech University, 30 Puzhunan road, Nanjing, 211816, Jiangsu, China
| | - Bingfang He
- School of Pharmaceutical Sciences, Nanjing Tech University, 30 Puzhunan road, Nanjing, 211816, Jiangsu, China
| |
Collapse
|
4
|
Gu J, Sim BR, Li J, Yu Y, Qin L, Wu L, Shen Y, Nie Y, Zhao YL, Xu Y. Evolutionary coupling-inspired engineering of alcohol dehydrogenase reveals the influence of distant sites on its catalytic efficiency for stereospecific synthesis of chiral alcohols. Comput Struct Biotechnol J 2021; 19:5864-5873. [PMID: 34815831 PMCID: PMC8572861 DOI: 10.1016/j.csbj.2021.10.031] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/27/2021] [Revised: 10/19/2021] [Accepted: 10/20/2021] [Indexed: 01/02/2023] Open
Abstract
Alcohol dehydrogenase (ADH) has attracted much attention due to its ability to catalyze the synthesis of important chiral alcohol pharmaceutical intermediates with high stereoselectivity. ADH protein engineering efforts have generally focused on reshaping the substrate-binding pocket. However, distant sites outside the pocket may also affect its activity, although the underlying molecular mechanism remains unclear. The current study aimed to apply evolutionary coupling-inspired engineering to the ADH CpRCR and to identify potential mutation sites. Through conservative analysis, phylogenic analysis and residues distribution analysis, the co-evolution hotspots Leu34 and Leu137 were confirmed to be highly evolved under the pressure of natural selection and to be possibly related to the catalytic function of the protein. Hence, Leu34 and Leu137, far away from the active center, were selected for mutation. The generated CpRCR-L34A and CpRCR-L137V variants showed high stereoselectivity and 1.24-7.81 fold increase in k cat /K m value compared with that of the wild type, when reacted with 8 aromatic ketones or β-ketoesters. Corresponding computational study implied that L34 and L137 may extend allosteric fluctuation in the protein structure from the distal mutational site to the active site. Moreover, the L34 and L137 mutations modified the pre-reaction state in multiple ways, in terms of position of the hydride with respect to the target carbonyl. These findings provide insights into the catalytic mechanism of the enzyme and facilitate its regulation from the perspective of the site interaction network.
Collapse
Affiliation(s)
- Jie Gu
- Lab of Brewing Microbiology and Applied Enzymology, School of Biotechnology and Key Laboratory of Industrial Biotechnology of Ministry of Education, Jiangnan University, Wuxi 214122, China
| | - Byu Ri Sim
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic and Developmental Sciences, MOE-LSB & MOE-LSC, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Jiarui Li
- Lab of Brewing Microbiology and Applied Enzymology, School of Biotechnology and Key Laboratory of Industrial Biotechnology of Ministry of Education, Jiangnan University, Wuxi 214122, China
| | - Yangqing Yu
- Lab of Brewing Microbiology and Applied Enzymology, School of Biotechnology and Key Laboratory of Industrial Biotechnology of Ministry of Education, Jiangnan University, Wuxi 214122, China
| | - Lei Qin
- Lab of Brewing Microbiology and Applied Enzymology, School of Biotechnology and Key Laboratory of Industrial Biotechnology of Ministry of Education, Jiangnan University, Wuxi 214122, China
| | - Lunjie Wu
- Lab of Brewing Microbiology and Applied Enzymology, School of Biotechnology and Key Laboratory of Industrial Biotechnology of Ministry of Education, Jiangnan University, Wuxi 214122, China
| | - Yu Shen
- Lab of Brewing Microbiology and Applied Enzymology, School of Biotechnology and Key Laboratory of Industrial Biotechnology of Ministry of Education, Jiangnan University, Wuxi 214122, China
| | - Yao Nie
- Lab of Brewing Microbiology and Applied Enzymology, School of Biotechnology and Key Laboratory of Industrial Biotechnology of Ministry of Education, Jiangnan University, Wuxi 214122, China
- Suqian Industrial Technology Research Institute of Jiangnan University, Suqian 223814, China
| | - Yi-Lei Zhao
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic and Developmental Sciences, MOE-LSB & MOE-LSC, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Yan Xu
- Lab of Brewing Microbiology and Applied Enzymology, School of Biotechnology and Key Laboratory of Industrial Biotechnology of Ministry of Education, Jiangnan University, Wuxi 214122, China
- State Key Laboratory of Food Science and Technology, Jiangnan University, Wuxi 214122, China
| |
Collapse
|
5
|
Quignot C, Granger P, Chacón P, Guerois R, Andreani J. Atomic-level evolutionary information improves protein-protein interface scoring. Bioinformatics 2021; 37:3175-3181. [PMID: 33901284 DOI: 10.1093/bioinformatics/btab254] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/26/2020] [Revised: 03/20/2021] [Accepted: 04/19/2021] [Indexed: 12/11/2022] Open
Abstract
MOTIVATION The crucial role of protein interactions and the difficulty in characterising them experimentally strongly motivates the development of computational approaches for structural prediction. Even when protein-protein docking samples correct models, current scoring functions struggle to discriminate them from incorrect decoys. The previous incorporation of conservation and coevolution information has shown promise for improving protein-protein scoring. Here, we present a novel strategy to integrate atomic-level evolutionary information into different types of scoring functions to improve their docking discrimination. RESULTS : We applied this general strategy to our residue-level statistical potential from InterEvScore and to two atomic-level scores, SOAP-PP and Rosetta interface score (ISC). Including evolutionary information from as few as ten homologous sequences improves the top 10 success rates of individual atomic-level scores SOAP-PP and Rosetta ISC by respectively 6 and 13.5 percentage points, on a large benchmark of 752 docking cases. The best individual homology-enriched score reaches a top 10 success rate of 34.4%. A consensus approach based on the complementarity between different homology-enriched scores further increases the top 10 success rate to 40%. AVAILABILITY All data used for benchmarking and scoring results, as well as a Singularity container of the pipeline, are available at http://biodev.cea.fr/interevol/interevdata/. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Collapse
Affiliation(s)
- Chloé Quignot
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
| | - Pierre Granger
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
| | - Pablo Chacón
- Department of Biological Chemical Physics, Rocasolano Institute of Physical Chemistry C.S.I.C, Madrid, Spain
| | - Raphael Guerois
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
| | - Jessica Andreani
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
| |
Collapse
|
6
|
Mesdaghi S, Murphy DL, Sánchez Rodríguez F, Burgos-Mármol JJ, Rigden DJ. In silico prediction of structure and function for a large family of transmembrane proteins that includes human Tmem41b. F1000Res 2021; 9:1395. [PMID: 33520197 PMCID: PMC7818093 DOI: 10.12688/f1000research.27676.2] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Accepted: 03/11/2021] [Indexed: 01/07/2023] Open
Abstract
Background: Recent strides in computational structural biology have opened up an opportunity to understand previously uncharacterised proteins. The under-representation of transmembrane proteins in the Protein Data Bank highlights the need to apply new and advanced bioinformatics methods to shed light on their structure and function. This study focuses on a family of transmembrane proteins containing the Pfam domain PF09335 ('SNARE_ASSOC'/ 'VTT '/'Tvp38'/'DedA'). One prominent member, Tmem41b, has been shown to be involved in early stages of autophagosome formation and is vital in mouse embryonic development as well as being identified as a viral host factor of SARS-CoV-2. Methods: We used evolutionary covariance-derived information to construct and validate ab initio models, make domain boundary predictions and infer local structural features. Results: The results from the structural bioinformatics analysis of Tmem41b and its homologues showed that they contain a tandem repeat that is clearly visible in evolutionary covariance data but much less so by sequence analysis. Furthermore, cross-referencing of other prediction data with covariance analysis showed that the internal repeat features two-fold rotational symmetry. Ab initio modelling of Tmem41b and homologues reinforces these structural predictions. Local structural features predicted to be present in Tmem41b were also present in Cl -/H + antiporters. Conclusions: The results of this study strongly point to Tmem41b and its homologues being transporters for an as-yet uncharacterised substrate and possibly using H + antiporter activity as its mechanism for transport.
Collapse
Affiliation(s)
- Shahram Mesdaghi
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, L69 7ZB, UK
| | - David L. Murphy
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, L69 7ZB, UK
| | - Filomeno Sánchez Rodríguez
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, L69 7ZB, UK
| | - J. Javier Burgos-Mármol
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, L69 7ZB, UK
| | - Daniel J. Rigden
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, L69 7ZB, UK,
| |
Collapse
|
7
|
Mesdaghi S, Murphy DL, Sánchez Rodríguez F, Burgos-Mármol JJ, Rigden DJ. In silico prediction of structure and function for a large family of transmembrane proteins that includes human Tmem41b. F1000Res 2021; 9:1395. [PMID: 33520197 PMCID: PMC7818093 DOI: 10.12688/f1000research.27676.1] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Accepted: 11/23/2020] [Indexed: 01/07/2023] Open
Abstract
Background: Recent strides in computational structural biology have opened up an opportunity to understand previously uncharacterised proteins. The under-representation of transmembrane proteins in the Protein Data Bank highlights the need to apply new and advanced bioinformatics methods to shed light on their structure and function. This study focuses on a family of transmembrane proteins containing the Pfam domain PF09335 ('SNARE_ASSOC'/ 'VTT '/'Tvp38'). One prominent member, Tmem41b, has been shown to be involved in early stages of autophagosome formation and is vital in mouse embryonic development as well as being identified as a viral host factor of SARS-CoV-2. Methods: We used evolutionary covariance-derived information to construct and validate ab initio models, make domain boundary predictions and infer local structural features. Results: The results from the structural bioinformatics analysis of Tmem41b and its homologues showed that they contain a tandem repeat that is clearly visible in evolutionary covariance data but much less so by sequence analysis. Furthermore, cross-referencing of other prediction data with covariance analysis showed that the internal repeat features two-fold rotational symmetry. Ab initio modelling of Tmem41b and homologues reinforces these structural predictions. Local structural features predicted to be present in Tmem41b were also present in Cl -/H + antiporters. Conclusions: The results of this study strongly point to Tmem41b and its homologues being transporters for an as-yet uncharacterised substrate and possibly using H + antiporter activity as its mechanism for transport.
Collapse
Affiliation(s)
- Shahram Mesdaghi
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, L69 7ZB, UK
| | - David L. Murphy
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, L69 7ZB, UK
| | - Filomeno Sánchez Rodríguez
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, L69 7ZB, UK
| | - J. Javier Burgos-Mármol
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, L69 7ZB, UK
| | - Daniel J. Rigden
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool, L69 7ZB, UK,
| |
Collapse
|
8
|
Rodríguez FS, Mesdaghi S, Simpkin AJ, Burgos-Mármol JJ, Murphy DL, Uski V, Keegan RM, Rigden DJ. ConPlot: Web-based application for the visualisation of protein contact maps integrated with other data. Bioinformatics 2021; 37:2763-2765. [PMID: 34499718 PMCID: PMC8428603 DOI: 10.1093/bioinformatics/btab049] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/15/2020] [Revised: 12/18/2020] [Accepted: 01/21/2021] [Indexed: 12/15/2022] Open
Abstract
Summary Covariance-based predictions of residue contacts and inter-residue distances are an increasingly popular data type in protein bioinformatics. Here we present ConPlot, a web-based application for convenient display and analysis of contact maps and distograms. Integration of predicted contact data with other predictions is often required to facilitate inference of structural features. ConPlot can therefore use the empty space near the contact map diagonal to display multiple coloured tracks representing other sequence-based predictions. Popular file formats are natively read and bespoke data can also be flexibly displayed. This novel visualization will enable easier interpretation of predicted contact maps. Availability and implementation available online at www.conplot.org, along with documentation and examples. Alternatively, ConPlot can be installed and used locally using the docker image from the project’s Docker Hub repository. ConPlot is licensed under the BSD 3-Clause. Supplementary information Supplementary data are available at Bioinformatics online.
Collapse
Affiliation(s)
- Filomeno Sánchez Rodríguez
- Institute of Structural, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, England.,Life Science, Diamond Light Source, Harwell Science and Innovation Campus, Oxfordshire OX11 0DE, Didcot, England
| | - Shahram Mesdaghi
- Institute of Structural, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, England
| | - Adam J Simpkin
- Institute of Structural, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, England
| | - J Javier Burgos-Mármol
- Institute of Structural, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, England
| | - David L Murphy
- Institute of Structural, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, England
| | - Ville Uski
- UKRI-STFC, Rutherford Appleton Laboratory, Research Complex at Harwell, Didcot OX11 0FA, England
| | - Ronan M Keegan
- UKRI-STFC, Rutherford Appleton Laboratory, Research Complex at Harwell, Didcot OX11 0FA, England
| | - Daniel J Rigden
- Institute of Structural, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, England
| |
Collapse
|
9
|
Shao D, Mao W, Xing Y, Gong H. RDb2C2: an improved method to identify the residue-residue pairing in β strands. BMC Bioinformatics 2020; 21:133. [PMID: 32245403 PMCID: PMC7126467 DOI: 10.1186/s12859-020-3476-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/22/2019] [Accepted: 03/31/2020] [Indexed: 11/17/2022] Open
Abstract
Background Despite the great advance of protein structure prediction, accurate prediction of the structures of mainly β proteins is still highly challenging, but could be assisted by the knowledge of residue-residue pairing in β strands. Previously, we proposed a ridge-detection-based algorithm RDb2C that adopted a multi-stage random forest framework to predict the β-β pairing given the amino acid sequence of a protein. Results In this work, we developed a second version of this algorithm, RDb2C2, by employing the residual neural network to further enhance the prediction accuracy. In the benchmark test, this new algorithm improves the F1-score by > 10 percentage points, reaching impressively high values of ~ 72% and ~ 73% in the BetaSheet916 and BetaSheet1452 sets, respectively. Conclusion Our new method promotes the prediction accuracy of β-β pairing to a new level and the prediction results could better assist the structure modeling of mainly β proteins. We prepared an online server of RDb2C2 at http://structpred.life.tsinghua.edu.cn/rdb2c2.html.
Collapse
|
10
|
Abriata LA, Dal Peraro M. State-of-the-art web services for de novo protein structure prediction. Brief Bioinform 2020; 22:5870389. [PMID: 34020540 DOI: 10.1093/bib/bbaa139] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/25/2020] [Revised: 06/04/2020] [Accepted: 06/05/2020] [Indexed: 02/06/2023] Open
Abstract
Residue coevolution estimations coupled to machine learning methods are revolutionizing the ability of protein structure prediction approaches to model proteins that lack clear homologous templates in the Protein Data Bank (PDB). This has been patent in the last round of the Critical Assessment of Structure Prediction (CASP), which presented several very good models for the hardest targets. Unfortunately, literature reporting on these advances often lacks digests tailored to lay end users; moreover, some of the top-ranking predictors do not provide webservers that can be used by nonexperts. How can then end users benefit from these advances and correctly interpret the predicted models? Here we review the web resources that biologists can use today to take advantage of these state-of-the-art methods in their research, including not only the best de novo modeling servers but also datasets of models precomputed by experts for structurally uncharacterized protein families. We highlight their features, advantages and pitfalls for predicting structures of proteins without clear templates. We present a broad number of applications that span from driving forward biochemical investigations that lack experimental structures to actually assisting experimental structure determination in X-ray diffraction, cryo-EM and other forms of integrative modeling. We also discuss issues that must be considered by users yet still require further developments, such as global and residue-wise model quality estimates and sources of residue coevolution other than monomeric tertiary structure.
Collapse
Affiliation(s)
- Luciano A Abriata
- Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne, CH-1015 Lausanne, Switzerland
| | - Matteo Dal Peraro
- Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne, CH-1015 Lausanne, Switzerland
| |
Collapse
|
11
|
Hong SH, Joo K, Lee J. ConDo: protein domain boundary prediction using coevolutionary information. Bioinformatics 2020; 35:2411-2417. [PMID: 30500873 DOI: 10.1093/bioinformatics/bty973] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/16/2018] [Revised: 11/15/2018] [Accepted: 11/29/2018] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION Domain boundary prediction is one of the most important problems in the study of protein structure and function. Many sequence-based domain boundary prediction methods are either template-based or machine learning (ML) based. ML-based methods often perform poorly due to their use of only local (i.e. short-range) features. These conventional features such as sequence profiles, secondary structures and solvent accessibilities are typically restricted to be within 20 residues of the domain boundary candidate. RESULTS To address the performance of ML-based methods, we developed a new protein domain boundary prediction method (ConDo) that utilizes novel long-range features such as coevolutionary information in addition to the aforementioned local window features as inputs for ML. Toward this purpose, two types of coevolutionary information were extracted from multiple sequence alignment using direct coupling analysis: (i) partially aligned sequences, and (ii) correlated mutation information. Both the partially aligned sequence information and the modularity of residue-residue couplings possess long-range correlation information. AVAILABILITY AND IMPLEMENTATION https://github.com/gicsaw/ConDo.git. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Collapse
Affiliation(s)
| | - Keehyoung Joo
- Center for Advanced Computation, Korea Institute for Advanced Study, Korea
| | - Jooyoung Lee
- School of Computational Sciences.,Center for Advanced Computation, Korea Institute for Advanced Study, Korea
| |
Collapse
|
12
|
Correa Marrero M, Immink RGH, de Ridder D, van Dijk ADJ. Improved inference of intermolecular contacts through protein-protein interaction prediction using coevolutionary analysis. Bioinformatics 2020; 35:2036-2042. [PMID: 30398547 DOI: 10.1093/bioinformatics/bty924] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/04/2018] [Revised: 10/11/2018] [Accepted: 11/05/2018] [Indexed: 01/09/2023] Open
Abstract
MOTIVATION Predicting residue-residue contacts between interacting proteins is an important problem in bioinformatics. The growing wealth of sequence data can be used to infer these contacts through correlated mutation analysis on multiple sequence alignments of interacting homologs of the proteins of interest. This requires correct identification of pairs of interacting proteins for many species, in order to avoid introducing noise (i.e. non-interacting sequences) in the analysis that will decrease predictive performance. RESULTS We have designed Ouroboros, a novel algorithm to reduce such noise in intermolecular contact prediction. Our method iterates between weighting proteins according to how likely they are to interact based on the correlated mutations signal, and predicting correlated mutations based on the weighted sequence alignment. We show that this approach accurately discriminates between protein interaction versus non-interaction and simultaneously improves the prediction of intermolecular contact residues compared to a naive application of correlated mutation analysis. This requires no training labels concerning interactions or contacts. Furthermore, the method relaxes the assumption of one-to-one interaction of previous approaches, allowing for the study of many-to-many interactions. AVAILABILITY AND IMPLEMENTATION Source code and test data are available at www.bif.wur.nl/. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Collapse
Affiliation(s)
| | - Richard G H Immink
- Laboratory of Molecular Biology, Department of Plant Sciences.,Bioscience, Wageningen Plant Research
| | | | - Aalt D J van Dijk
- Bioinformatics Group, Department of Plant Sciences.,Bioscience, Wageningen Plant Research.,Biometris, Department of Plant Sciences, Wageningen University & Research, Wageningen PB, The Netherlands
| |
Collapse
|
13
|
Kim DN, Gront D, Sanbonmatsu KY. Practical Considerations for Atomistic Structure Modeling with Cryo-EM Maps. J Chem Inf Model 2020; 60:2436-2442. [PMID: 32422044 PMCID: PMC7891309 DOI: 10.1021/acs.jcim.0c00090] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
Abstract
We describe common approaches to atomistic structure modeling with single particle analysis derived cryo-EM maps. Several strategies for atomistic model building and atomistic model fitting methods are discussed, including selection criteria and implementation procedures. In covering basic concepts and caveats, this short perspective aims to help facilitate active discussion between scientists at different levels with diverse backgrounds.
Collapse
Affiliation(s)
- Doo Nam Kim
- Computational Biology Team, Biological Science Division, Pacific Northwest National Laboratory, Richland, Washington, 99354, United States
| | - Dominik Gront
- Faculty of Chemistry, Biological and Chemical Research Center, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
| | - Karissa Y. Sanbonmatsu
- Theoretical Biology and Biophysics Group, Los Alamos National Laboratory, Los Alamos, New Mexico, 87545, United States
- New Mexico Consortium, Los Alamos, New Mexico, 87544, United States
| |
Collapse
|
14
|
Andreani J, Quignot C, Guerois R. Structural prediction of protein interactions and docking using conservation and coevolution. WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL MOLECULAR SCIENCE 2020. [DOI: 10.1002/wcms.1470] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]
Affiliation(s)
- Jessica Andreani
- Université Paris‐Saclay CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC) Gif‐sur‐Yvette France
| | - Chloé Quignot
- Université Paris‐Saclay CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC) Gif‐sur‐Yvette France
| | - Raphael Guerois
- Université Paris‐Saclay CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC) Gif‐sur‐Yvette France
| |
Collapse
|
15
|
Feng J, Shukla D. FingerprintContacts: Predicting Alternative Conformations of Proteins from Coevolution. J Phys Chem B 2020; 124:3605-3615. [PMID: 32283936 DOI: 10.1021/acs.jpcb.9b11869] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
Abstract
Proteins are dynamic molecules which perform diverse molecular functions by adopting different three-dimensional structures. Recent progress in residue-residue contacts prediction opens up new avenues for the de novo protein structure prediction from sequence information. However, it is still difficult to predict more than one conformation from residue-residue contacts alone. This is due to the inability to deconvolve the complex signals of residue-residue contacts, i.e., spatial contacts relevant for protein folding, conformational diversity, and ligand binding. Here, we introduce a machine learning based method, called FingerprintContacts, for extending the capabilities of residue-residue contacts. This algorithm leverages the features of residue-residue contacts, that is, (1) a single conformation outperforms the others in the structural prediction using all the top ranking residue-residue contacts as structural constraints and (2) conformation specific contacts rank lower and constitute a small fraction of residue-residue contacts. We demonstrate the capabilities of FingerprintContacts on eight ligand binding proteins with varying conformational motions. Furthermore, FingerprintContacts identifies small clusters of residue-residue contacts which are preferentially located in the dynamically fluctuating regions. With the rapid growth in protein sequence information, we expect FingerprintContacts to be a powerful first step in structural understanding of protein functional mechanisms.
Collapse
|
16
|
Abriata LA. Building blocks for commodity augmented reality-based molecular visualization and modeling in web browsers. PeerJ Comput Sci 2020; 6:e260. [PMID: 33816912 PMCID: PMC7924717 DOI: 10.7717/peerj-cs.260] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/11/2019] [Accepted: 01/22/2020] [Indexed: 06/12/2023]
Abstract
For years, immersive interfaces using virtual and augmented reality (AR) for molecular visualization and modeling have promised a revolution in the way how we teach, learn, communicate and work in chemistry, structural biology and related areas. However, most tools available today for immersive modeling require specialized hardware and software, and are costly and cumbersome to set up. These limitations prevent wide use of immersive technologies in education and research centers in a standardized form, which in turn prevents large-scale testing of the actual effects of such technologies on learning and thinking processes. Here, I discuss building blocks for creating marker-based AR applications that run as web pages on regular computers, and explore how they can be exploited to develop web content for handling virtual molecular systems in commodity AR with no more than a webcam- and internet-enabled computer. Examples span from displaying molecules, electron microscopy maps and molecular orbitals with minimal amounts of HTML code, to incorporation of molecular mechanics, real-time estimation of experimental observables and other interactive resources using JavaScript. These web apps provide virtual alternatives to physical, plastic-made molecular modeling kits, where the computer augments the experience with information about spatial interactions, reactivity, energetics, etc. The ideas and prototypes introduced here should serve as starting points for building active content that everybody can utilize online at minimal cost, providing novel interactive pedagogic material in such an open way that it could enable mass-testing of the effect of immersive technologies on chemistry education.
Collapse
Affiliation(s)
- Luciano A. Abriata
- École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
| |
Collapse
|
17
|
Bittrich S, Schroeder M, Labudde D. StructureDistiller: Structural relevance scoring identifies the most informative entries of a contact map. Sci Rep 2019; 9:18517. [PMID: 31811259 PMCID: PMC6898053 DOI: 10.1038/s41598-019-55047-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/19/2019] [Accepted: 11/21/2019] [Indexed: 12/17/2022] Open
Abstract
Protein folding and structure prediction are two sides of the same coin. Contact maps and the related techniques of constraint-based structure reconstruction can be considered as unifying aspects of both processes. We present the Structural Relevance (SR) score which quantifies the information content of individual contacts and residues in the context of the whole native structure. The physical process of protein folding is commonly characterized with spatial and temporal resolution: some residues are Early Folding while others are Highly Stable with respect to unfolding events. We employ the proposed SR score to demonstrate that folding initiation and structure stabilization are subprocesses realized by distinct sets of residues. The example of cytochrome c is used to demonstrate how StructureDistiller identifies the most important contacts needed for correct protein folding. This shows that entries of a contact map are not equally relevant for structural integrity. The proposed StructureDistiller algorithm identifies contacts with the highest information content; these entries convey unique constraints not captured by other contacts. Identification of the most informative contacts effectively doubles resilience toward contacts which are not observed in the native contact map. Furthermore, this knowledge increases reconstruction fidelity on sparse contact maps significantly by 0.4 Å.
Collapse
Affiliation(s)
- Sebastian Bittrich
- University of Applied Sciences Mittweida, Mittweida, 09648, Germany. .,Biotechnology Center (BIOTEC), TU Dresden, Dresden, 01307, Germany. .,Research Collaboratory for Structural Bioinformatics Protein Data Bank, University of California, San Diego, La Jolla, CA, 92093, USA.
| | | | - Dirk Labudde
- University of Applied Sciences Mittweida, Mittweida, 09648, Germany
| |
Collapse
|
18
|
Simpkin AJ, Thomas JMH, Simkovic F, Keegan RM, Rigden DJ. Molecular replacement using structure predictions from databases. Acta Crystallogr D Struct Biol 2019; 75:1051-1062. [PMID: 31793899 PMCID: PMC6889911 DOI: 10.1107/s2059798319013962] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/22/2019] [Accepted: 10/12/2019] [Indexed: 01/19/2023] Open
Abstract
Molecular replacement (MR) is the predominant route to solution of the phase problem in macromolecular crystallography. Where the lack of a suitable homologue precludes conventional MR, one option is to predict the target structure using bioinformatics. Such modelling, in the absence of homologous templates, is called ab initio or de novo modelling. Recently, the accuracy of such models has improved significantly as a result of the availability, in many cases, of residue-contact predictions derived from evolutionary covariance analysis. Covariance-assisted ab initio models representing structurally uncharacterized Pfam families are now available on a large scale in databases, potentially representing a valuable and easily accessible supplement to the PDB as a source of search models. Here, the unconventional MR pipeline AMPLE is employed to explore the value of structure predictions in the GREMLIN and PconsFam databases. It was tested whether these deposited predictions, processed in various ways, could solve the structures of PDB entries that were subsequently deposited. The results were encouraging: nine of 27 GREMLIN cases were solved, covering target lengths of 109-355 residues and a resolution range of 1.4-2.9 Å, and with target-model shared sequence identity as low as 20%. The cluster-and-truncate approach in AMPLE proved to be essential for most successes. For the overall lower quality structure predictions in the PconsFam database, remodelling with Rosetta within the AMPLE pipeline proved to be the best approach, generating ensemble search models from single-structure deposits. Finally, it is shown that the AMPLE-obtained search models deriving from GREMLIN deposits are of sufficiently high quality to be selected by the sequence-independent MR pipeline SIMBAD. Overall, the results help to point the way towards the optimal use of the expanding databases of ab initio structure predictions.
Collapse
Affiliation(s)
- Adam J. Simpkin
- Institute of Integrative Biology, University of Liverpool, Liverpool L69 7ZB, England
| | - Jens M. H. Thomas
- Institute of Integrative Biology, University of Liverpool, Liverpool L69 7ZB, England
| | - Felix Simkovic
- Institute of Integrative Biology, University of Liverpool, Liverpool L69 7ZB, England
| | - Ronan M. Keegan
- STFC, Rutherford Appleton Laboratory, Research Complex at Harwell, Didcot OX11 0FA, England
| | - Daniel J. Rigden
- Institute of Integrative Biology, University of Liverpool, Liverpool L69 7ZB, England
| |
Collapse
|
19
|
Wang X, Jing X, Deng Y, Nie Y, Xu F, Xu Y, Zhao YL, Hunt JF, Montelione GT, Szyperski T. Evolutionary coupling saturation mutagenesis: Coevolution-guided identification of distant sites influencing Bacillus naganoensis pullulanase activity. FEBS Lett 2019; 594:799-812. [PMID: 31665817 DOI: 10.1002/1873-3468.13652] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/15/2019] [Revised: 10/15/2019] [Accepted: 10/25/2019] [Indexed: 01/20/2023]
Abstract
Pullulanases are well-known debranching enzymes hydrolyzing α-1,6-glycosidic linkages. To date, engineering of pullulanase is mainly focused on catalytic pocket or domain tailoring based on structure/sequence information. Saturation mutagenesis-involved directed evolution is, however, limited by the low number of mutational sites compatible with combinatorial libraries of feasible size. Using Bacillus naganoensis pullulanase as a target protein, here we introduce the 'evolutionary coupling saturation mutagenesis' (ECSM) approach: residue pair covariances are calculated to identify residues for saturation mutagenesis, focusing directed evolution on residue pairs playing important roles in natural evolution. Evolutionary coupling (EC) analysis identified seven residue pairs as evolutionary mutational hotspots. Subsequent saturation mutagenesis yielded variants with enhanced catalytic activity. The functional pairs apparently represent distant sites affecting enzyme activity.
Collapse
Affiliation(s)
- Xinye Wang
- School of Biotechnology and Key Laboratory of Industrial Biotechnology, Ministry of Education, Jiangnan University, Wuxi, China
| | - Xiaoran Jing
- School of Biotechnology and Key Laboratory of Industrial Biotechnology, Ministry of Education, Jiangnan University, Wuxi, China
| | - Yi Deng
- School of Biotechnology and Key Laboratory of Industrial Biotechnology, Ministry of Education, Jiangnan University, Wuxi, China
| | - Yao Nie
- School of Biotechnology and Key Laboratory of Industrial Biotechnology, Ministry of Education, Jiangnan University, Wuxi, China
| | - Fei Xu
- School of Biotechnology and Key Laboratory of Industrial Biotechnology, Ministry of Education, Jiangnan University, Wuxi, China
| | - Yan Xu
- School of Biotechnology and Key Laboratory of Industrial Biotechnology, Ministry of Education, Jiangnan University, Wuxi, China.,State Key Laboratory of Food Science and Technology, Jiangnan University, Wuxi, China
| | - Yi-Lei Zhao
- State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic and Developmental Sciences, MOE-LSB & MOE-LSC, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, China
| | - John F Hunt
- Department of Biological Sciences, Columbia University, New York, NY, USA
| | - Gaetano T Montelione
- Center for Advanced Biotechnology and Medicine, Department of Molecular Biology and Biochemistry, Rutgers, The State University of New Jersey, Piscataway, NJ, USA.,Department of Biochemistry and Molecular Biology, Robert Wood Johnson Medical School, Rutgers, The State University of New Jersey, Piscataway, NJ, USA.,Department of Chemistry and Chemical Biology, and Center for Biotechnology and Integrative Studies, Rensselaer Polytechnic Institute, Troy, NY, USA
| | - Thomas Szyperski
- Department of Chemistry, The State University of New York at Buffalo, NY, USA
| |
Collapse
|
20
|
Lubecka EA, Liwo A. Introduction of a bounded penalty function in contact-assisted simulations of protein structures to omit false restraints. J Comput Chem 2019; 40:2164-2178. [PMID: 31037754 DOI: 10.1002/jcc.25847] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/14/2019] [Revised: 03/29/2019] [Accepted: 04/14/2019] [Indexed: 12/26/2022]
Abstract
Contact-assisted simulations, the contacts being predicted or determined experimentally, have become very important in the determination of the structures of proteins and other biological macromolecules. In this work, the effect of contact-distance restraints on the simulated structures was investigated with the use of multiplexed replica exchange simulations with the coarse-grained UNRES force field. A modified bounded flat-bottom restraint function that does not generate a gradient when a restraint cannot be satisfied was implemented. Calculations were run with (i) a set of four small proteins, with contact restraints derived from experimental structures, and (ii) selected CASP11 and CASP12 targets, with restraints as used at prediction time. The bounded penalty function largely omitted false contacts, which were usually inconsistent. It was found that at least 20% of correct contacts must be present in the restraint set to improve model quality with respect to unrestrained simulations. © 2019 Wiley Periodicals, Inc.
Collapse
Affiliation(s)
- Emilia A Lubecka
- Institute of Informatics, Faculty of Mathematics, Physics and Informatics, University of Gdańsk, Wita Stwosza 57, 80-308 Gdańsk, Poland.,Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, 80-308 Gdańsk, Poland
| | - Adam Liwo
- Faculty of Chemistry, University of Gdańsk, Wita Stwosza 63, 80-308 Gdańsk, Poland
| |
Collapse
|
21
|
Luttrell J, Liu T, Zhang C, Wang Z. Predicting protein residue-residue contacts using random forests and deep networks. BMC Bioinformatics 2019; 20:100. [PMID: 30871477 PMCID: PMC6419322 DOI: 10.1186/s12859-019-2627-6] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The ability to predict which pairs of amino acid residues in a protein are in contact with each other offers many advantages for various areas of research that focus on proteins. For example, contact prediction can be used to reduce the computational complexity of predicting the structure of proteins and even to help identify functionally important regions of proteins. These predictions are becoming especially important given the relatively low number of experimentally determined protein structures compared to the amount of available protein sequence data. RESULTS Here we have developed and benchmarked a set of machine learning methods for performing residue-residue contact prediction, including random forests, direct-coupling analysis, support vector machines, and deep networks (stacked denoising autoencoders). These methods are able to predict contacting residue pairs given only the amino acid sequence of a protein. According to our own evaluations performed at a resolution of +/- two residues, the predictors we trained with the random forest algorithm were our top performing methods with average top 10 prediction accuracy scores of 85.13% (short range), 74.49% (medium range), and 54.49% (long range). Our ensemble models (stacked denoising autoencoders combined with support vector machines) were our best performing deep network predictors and achieved top 10 prediction accuracy scores of 75.51% (short range), 60.26% (medium range), and 43.85% (long range) using the same evaluation. These tests were blindly performed on targets from the CASP11 dataset; and the results suggested that our models achieved comparable performance to contact predictors developed by groups that participated in CASP11. CONCLUSIONS Due to the challenging nature of contact prediction, it is beneficial to develop and benchmark a variety of different prediction methods. Our work has produced useful tools with a simple interface that can provide contact predictions to users without requiring a lengthy installation process. In addition to this, we have released our C++ implementation of the direct-coupling analysis method as a standalone software package. Both this tool and our RFcon web server are freely available to the public at http://dna.cs.miami.edu/RFcon /.
Collapse
Affiliation(s)
- Joseph Luttrell
- School of Computing Sciences and Computer Engineering, University of Southern Mississippi, 118 College Drive, Hattiesburg, MS, 39406, USA
| | - Tong Liu
- Department of Computer Science, University of Miami, 1365 Memorial Drive, Coral Gables, FL, 33124, USA
| | - Chaoyang Zhang
- School of Computing Sciences and Computer Engineering, University of Southern Mississippi, 118 College Drive, Hattiesburg, MS, 39406, USA
| | - Zheng Wang
- Department of Computer Science, University of Miami, 1365 Memorial Drive, Coral Gables, FL, 33124, USA.
| |
Collapse
|
22
|
Huang YJ, Brock KP, Ishida Y, Swapna GVT, Inouye M, Marks DS, Sander C, Montelione GT. Combining Evolutionary Covariance and NMR Data for Protein Structure Determination. Methods Enzymol 2018; 614:363-392. [PMID: 30611430 PMCID: PMC6640129 DOI: 10.1016/bs.mie.2018.11.004] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/20/2022]
Abstract
Accurate protein structure determination by solution-state NMR is challenging for proteins greater than about 20kDa, for which extensive perdeuteration is generally required, providing experimental data that are incomplete (sparse) and ambiguous. However, the massive increase in evolutionary sequence information coupled with advances in methods for sequence covariance analysis can provide reliable residue-residue contact information for a protein from sequence data alone. These "evolutionary couplings (ECs)" can be combined with sparse NMR data to determine accurate 3D protein structures. This hybrid "EC-NMR" method has been developed using NMR data for several soluble proteins and validated by comparison with corresponding reference structures determined by X-ray crystallography and/or conventional NMR methods. For small proteins, only backbone resonance assignments are utilized, while for larger proteins both backbone and some sidechain methyl resonance assignments are generally required. ECs can be combined with sparse NMR data obtained on deuterated, selectively protonated protein samples to provide structures that are more accurate and complete than those obtained using such sparse NMR data alone. EC-NMR also has significant potential for analysis of protein structures from solid-state NMR data and for studies of integral membrane proteins. The requirement that ECs are consistent with NMR data recorded on a specific member of a protein family, under specific conditions, also allows identification of ECs that reflect alternative allosteric or excited states of the protein structure.
Collapse
Affiliation(s)
- Yuanpeng Janet Huang
- Center for Advanced Biotechnology and Medicine, Rutgers, The State University of New Jersey, Piscataway, NJ, United States; Department of Molecular Biology and Biochemistry, Rutgers, The State University of New Jersey, Piscataway, NJ, United States
| | - Kelly P Brock
- Department of Systems Biology, Harvard Medical School, Boston, MA, United States
| | - Yojiro Ishida
- Center for Advanced Biotechnology and Medicine, Rutgers, The State University of New Jersey, Piscataway, NJ, United States; Department of Biochemistry and Molecular Biology, Robert Wood Johnson Medical School, Rutgers, The State University of New Jersey, Piscataway, NJ, United States
| | - Gurla V T Swapna
- Center for Advanced Biotechnology and Medicine, Rutgers, The State University of New Jersey, Piscataway, NJ, United States; Department of Molecular Biology and Biochemistry, Rutgers, The State University of New Jersey, Piscataway, NJ, United States
| | - Masayori Inouye
- Center for Advanced Biotechnology and Medicine, Rutgers, The State University of New Jersey, Piscataway, NJ, United States; Department of Biochemistry and Molecular Biology, Robert Wood Johnson Medical School, Rutgers, The State University of New Jersey, Piscataway, NJ, United States
| | - Debora S Marks
- Department of Systems Biology, Harvard Medical School, Boston, MA, United States
| | - Chris Sander
- Department of Cell Biology, Harvard Medical School and cBio Center, Dana-Farber Cancer Institute, Boston, MA, United States
| | - Gaetano T Montelione
- Center for Advanced Biotechnology and Medicine, Rutgers, The State University of New Jersey, Piscataway, NJ, United States; Department of Molecular Biology and Biochemistry, Rutgers, The State University of New Jersey, Piscataway, NJ, United States; Department of Biochemistry and Molecular Biology, Robert Wood Johnson Medical School, Rutgers, The State University of New Jersey, Piscataway, NJ, United States.
| |
Collapse
|
23
|
Biogenesis and structure of a type VI secretion baseplate. Nat Microbiol 2018; 3:1404-1416. [DOI: 10.1038/s41564-018-0260-1] [Citation(s) in RCA: 47] [Impact Index Per Article: 7.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/12/2018] [Accepted: 08/31/2018] [Indexed: 12/20/2022]
|
24
|
Mao W, Wang T, Zhang W, Gong H. Identification of residue pairing in interacting β-strands from a predicted residue contact map. BMC Bioinformatics 2018; 19:146. [PMID: 29673311 PMCID: PMC5907701 DOI: 10.1186/s12859-018-2150-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/04/2017] [Accepted: 04/09/2018] [Indexed: 12/04/2022] Open
Abstract
Background Despite the rapid progress of protein residue contact prediction, predicted residue contact maps frequently contain many errors. However, information of residue pairing in β strands could be extracted from a noisy contact map, due to the presence of characteristic contact patterns in β-β interactions. This information may benefit the tertiary structure prediction of mainly β proteins. In this work, we propose a novel ridge-detection-based β-β contact predictor to identify residue pairing in β strands from any predicted residue contact map. Results Our algorithm RDb2C adopts ridge detection, a well-developed technique in computer image processing, to capture consecutive residue contacts, and then utilizes a novel multi-stage random forest framework to integrate the ridge information and additional features for prediction. Starting from the predicted contact map of CCMpred, RDb2C remarkably outperforms all state-of-the-art methods on two conventional test sets of β proteins (BetaSheet916 and BetaSheet1452), and achieves F1-scores of ~ 62% and ~ 76% at the residue level and strand level, respectively. Taking the prediction of the more advanced RaptorX-Contact as input, RDb2C achieves impressively higher performance, with F1-scores reaching ~ 76% and ~ 86% at the residue level and strand level, respectively. In a test of structural modeling using the top 1 L predicted contacts as constraints, for 61 mainly β proteins, the average TM-score achieves 0.442 when using the raw RaptorX-Contact prediction, but increases to 0.506 when using the improved prediction by RDb2C. Conclusion Our method can significantly improve the prediction of β-β contacts from any predicted residue contact maps. Prediction results of our algorithm could be directly applied to effectively facilitate the practical structure prediction of mainly β proteins. Availability All source data and codes are available at http://166.111.152.91/Downloads.html or the GitHub address of https://github.com/wzmao/RDb2C. Electronic supplementary material The online version of this article (10.1186/s12859-018-2150-1) contains supplementary material, which is available to authorized users.
Collapse
Affiliation(s)
- Wenzhi Mao
- MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, China.,Beijing Advanced Innovation Center for Structural Biology, Tsinghua University, Beijing, China
| | - Tong Wang
- MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, China.,Beijing Advanced Innovation Center for Structural Biology, Tsinghua University, Beijing, China
| | - Wenxuan Zhang
- MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, China.,Beijing Advanced Innovation Center for Structural Biology, Tsinghua University, Beijing, China
| | - Haipeng Gong
- MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing, China. .,Beijing Advanced Innovation Center for Structural Biology, Tsinghua University, Beijing, China.
| |
Collapse
|
25
|
Stutzer C, Richards SA, Ferreira M, Baron S, Maritz-Olivier C. Metazoan Parasite Vaccines: Present Status and Future Prospects. Front Cell Infect Microbiol 2018; 8:67. [PMID: 29594064 PMCID: PMC5859119 DOI: 10.3389/fcimb.2018.00067] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2017] [Accepted: 02/26/2018] [Indexed: 12/21/2022] Open
Abstract
Eukaryotic parasites and pathogens continue to cause some of the most detrimental and difficult to treat diseases (or disease states) in both humans and animals, while also continuously expanding into non-endemic countries. Combined with the ever growing number of reports on drug-resistance and the lack of effective treatment programs for many metazoan diseases, the impact that these organisms will have on quality of life remain a global challenge. Vaccination as an effective prophylactic treatment has been demonstrated for well over 200 years for bacterial and viral diseases. From the earliest variolation procedures to the cutting edge technologies employed today, many protective preparations have been successfully developed for use in both medical and veterinary applications. In spite of the successes of these applications in the discovery of subunit vaccines against prokaryotic pathogens, not many targets have been successfully developed into vaccines directed against metazoan parasites. With the current increase in -omics technologies and metadata for eukaryotic parasites, target discovery for vaccine development can be expedited. However, a good understanding of the host/vector/pathogen interface is needed to understand the underlying biological, biochemical and immunological components that will confer a protective response in the host animal. Therefore, systems biology is rapidly coming of age in the pursuit of effective parasite vaccines. Despite the difficulties, a number of approaches have been developed and applied to parasitic helminths and arthropods. This review will focus on key aspects of vaccine development that require attention in the battle against these metazoan parasites, as well as successes in the field of vaccine development for helminthiases and ectoparasites. Lastly, we propose future direction of applying successes in pursuit of next generation vaccines.
Collapse
Affiliation(s)
- Christian Stutzer
- Tick Vaccine Group, Department of Genetics, University of Pretoria, Pretoria, South Africa
| | | | | | | | | |
Collapse
|
26
|
Huang YJ, Brock KP, Sander C, Marks DS, Montelione GT. A Hybrid Approach for Protein Structure Determination Combining Sparse NMR with Evolutionary Coupling Sequence Data. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2018; 1105:153-169. [PMID: 30617828 DOI: 10.1007/978-981-13-2200-6_10] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/04/2022]
Abstract
While 3D structure determination of small (<15 kDa) proteins by solution NMR is largely automated and routine, structural analysis of larger proteins is more challenging. An emerging hybrid strategy for modeling protein structures combines sparse NMR data that can be obtained for larger proteins with sequence co-variation data, called evolutionary couplings (ECs), obtained from multiple sequence alignments of protein families. This hybrid "EC-NMR" method can be used to accurately model larger (15-60 kDa) proteins, and more rapidly determine structures of smaller (5-15 kDa) proteins using only backbone NMR data. The resulting structures have accuracies relative to reference structures comparable to those obtained with full backbone and sidechain NMR resonance assignments. The requirement that evolutionary couplings (ECs) are consistent with NMR data recorded on a specific member of a protein family, under specific conditions, potentially also allows identification of ECs that reflect alternative allosteric or excited states of the protein structure.
Collapse
Affiliation(s)
- Yuanpeng Janet Huang
- Center for Advanced Biotechnology and Medicine, Department of Molecular Biology and Biochemistry, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
| | - Kelly P Brock
- cBio Center, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Chris Sander
- Department of Cell Biology, Harvard Medical School, Boston, MA, USA
- cBio Center, Dana-Farber Cancer Institute, Boston, MA, USA
| | - Debora S Marks
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
| | - Gaetano T Montelione
- Center for Advanced Biotechnology and Medicine, Department of Molecular Biology and Biochemistry, Rutgers, The State University of New Jersey, Piscataway, NJ, USA.
| |
Collapse
|
27
|
Thomas JMH, Simkovic F, Keegan R, Mayans O, Zhang C, Zhang Y, Rigden DJ. Approaches to ab initio molecular replacement of α-helical transmembrane proteins. Acta Crystallogr D Struct Biol 2017; 73:985-996. [PMID: 29199978 PMCID: PMC5713875 DOI: 10.1107/s2059798317016436] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/14/2017] [Accepted: 11/15/2017] [Indexed: 02/06/2023] Open
Abstract
α-Helical transmembrane proteins are a ubiquitous and important class of proteins, but present difficulties for crystallographic structure solution. Here, the effectiveness of the AMPLE molecular replacement pipeline in solving α-helical transmembrane-protein structures is assessed using a small library of eight ideal helices, as well as search models derived from ab initio models generated both with and without evolutionary contact information. The ideal helices prove to be surprisingly effective at solving higher resolution structures, but ab initio-derived search models are able to solve structures that could not be solved with the ideal helices. The addition of evolutionary contact information results in a marked improvement in the modelling and makes additional solutions possible.
Collapse
Affiliation(s)
- Jens M. H. Thomas
- Institute of Integrative Biology, University of Liverpool, Liverpool L69 7ZB, England
| | - Felix Simkovic
- Institute of Integrative Biology, University of Liverpool, Liverpool L69 7ZB, England
| | - Ronan Keegan
- Research Complex at Harwell, STFC Rutherford Appleton Laboratory, Didcot OX11 0FA, England
| | - Olga Mayans
- Fachbereich Biologie, Universität Konstanz, D-78457 Konstanz, Germany
| | - Chengxin Zhang
- Department of Computational Medicine and Bioinformatics, Department of Biological Chemistry, Medical School, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109-2218, USA
| | - Yang Zhang
- Department of Computational Medicine and Bioinformatics, Department of Biological Chemistry, Medical School, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109-2218, USA
| | - Daniel J. Rigden
- Institute of Integrative Biology, University of Liverpool, Liverpool L69 7ZB, England
| |
Collapse
|
28
|
Abriata LA, Tamò GE, Monastyrskyy B, Kryshtafovych A, Dal Peraro M. Assessment of hard target modeling in CASP12 reveals an emerging role of alignment-based contact prediction methods. Proteins 2017; 86 Suppl 1:97-112. [DOI: 10.1002/prot.25423] [Citation(s) in RCA: 61] [Impact Index Per Article: 8.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/11/2017] [Revised: 11/09/2017] [Accepted: 11/13/2017] [Indexed: 12/25/2022]
Affiliation(s)
- Luciano A. Abriata
- Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Swiss Institute of Bioinformatics (SIB); Lausanne Switzerland
| | - Giorgio E. Tamò
- Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Swiss Institute of Bioinformatics (SIB); Lausanne Switzerland
| | | | | | - Matteo Dal Peraro
- Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Swiss Institute of Bioinformatics (SIB); Lausanne Switzerland
| |
Collapse
|
29
|
Cirauqui N, Abriata LA, van der Goot FG, Dal Peraro M. Structural, physicochemical and dynamic features conserved within the aerolysin pore-forming toxin family. Sci Rep 2017; 7:13932. [PMID: 29066778 PMCID: PMC5654971 DOI: 10.1038/s41598-017-13714-4] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/06/2017] [Accepted: 09/26/2017] [Indexed: 11/10/2022] Open
Abstract
Aerolysin is the founding member of a major class of β-pore-forming toxins (β-PFTs) found throughout all kingdoms of life. PFTs are cytotoxic proteins produced as soluble monomers, which oligomerize at the membrane of target host cells forming pores that may lead to osmotic lysis and cell death. Besides their role in microbial infection, they have become interesting for their potential as biotechnological sensors and delivery systems. Using an approach that integrates bioinformatics with molecular modeling and simulation, we looked for conserved features across this large toxin family. The cell surface-binding domains present high variability within the family to provide membrane receptor specificity. On the contrary, the novel concentric double β-barrel structure found in aerolysin is highly conserved in terms of sequence, structure and conformational dynamics, which likely contribute to preserve a common transition mechanism from the prepore to the mature pore within the family.Our results point to the key role of several amino acids in the conformational changes needed for oligomerization and further pore formation, such as Y221, W227, P248, Q263 and L277, which we propose are involved in the release of the stem loop and the two adjacent β-strands to form the transmembrane β-barrel.
Collapse
Affiliation(s)
- Nuria Cirauqui
- Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), 1015, Lausanne, Switzerland
- Department of Pharmaceutical Biotechnology, Universidade Federal do Rio de Janeiro, 21941-902, Rio de Janeiro, Brazil
| | - Luciano A Abriata
- Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), 1015, Lausanne, Switzerland
| | - F Gisou van der Goot
- Global Health Institute, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), 1015, Lausanne, Switzerland
| | - Matteo Dal Peraro
- Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), 1015, Lausanne, Switzerland.
| |
Collapse
|
30
|
Sønderby P, Rinnan Å, Madsen JJ, Harris P, Bukrinski JT, Peters GHJ. Small-Angle X-ray Scattering Data in Combination with RosettaDock Improves the Docking Energy Landscape. J Chem Inf Model 2017; 57:2463-2475. [DOI: 10.1021/acs.jcim.6b00789] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/14/2023]
Affiliation(s)
- Pernille Sønderby
- Department
of Chemistry, Technical University of Denmark, DK-2800 Kongens
Lyngby, Denmark
| | - Åsmund Rinnan
- Department
of Food Science, Faculty of Science, University of Copenhagen, DK-1958 Frederiksberg C, Denmark
| | - Jesper J. Madsen
- Department
of Chemistry, The University of Chicago, Chicago, Illinois 60637, United States
| | - Pernille Harris
- Department
of Chemistry, Technical University of Denmark, DK-2800 Kongens
Lyngby, Denmark
| | | | - Günther H. J. Peters
- Department
of Chemistry, Technical University of Denmark, DK-2800 Kongens
Lyngby, Denmark
| |
Collapse
|
31
|
Abstract
Co-evolution techniques were originally conceived to assist in protein structure prediction by inferring pairs of residues that share spatial proximity. However, the functional relationships that can be extrapolated from co-evolution have also proven to be useful in a wide array of structural bioinformatics applications. These techniques are a powerful way to extract structural and functional information in a sequence-rich world.
Collapse
|