1
Cheng Y, Wang H, Xu H, Liu Y, Ma B, Chen X, Zeng X, Wang X, Wang B, Shiau C, Ovchinnikov S, Su XD, Wang C. Co-evolution-based prediction of metal-binding sites in proteomes by machine learning. Nat Chem Biol 2023; 19:548-555. [PMID: 36593274 DOI: 10.1038/s41589-022-01223-z] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 01/11/2022] [Accepted: 11/08/2022] [Indexed: 01/03/2023]
Abstract
Metal ions have various important biological roles in proteins, including structural maintenance, molecular recognition and catalysis. Previous methods of predicting metal-binding sites in proteomes were based on either sequence or structural motifs. Here we developed a co-evolution-based pipeline named 'MetalNet' to systematically predict metal-binding sites in proteomes. We applied MetalNet to proteomes of four representative prokaryotic species and predicted 4,849 potential metalloproteins, which substantially expands the currently annotated metalloproteomes. We biochemically and structurally validated previously unannotated metal-binding sites in several proteins, including apo-citrate lyase phosphoribosyl-dephospho-CoA transferase citX, an Escherichia coli enzyme lacking structural or sequence homology to any known metalloprotein (Protein Data Bank (PDB) codes: 7DCM and 7DCN). MetalNet also successfully recapitulated all known zinc-binding sites from the human spliceosome complex. The pipeline of MetalNet provides a unique and enabling tool for interrogating the hidden metalloproteome and studying metal biology.
Affiliation(s)
- Yao Cheng
  - Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, Peking University, Beijing, China
  - Department of Chemical Biology, College of Chemistry and Molecular Engineering, Peking University, Beijing, China
- Haobo Wang
  - Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, Peking University, Beijing, China
  - Department of Chemical Biology, College of Chemistry and Molecular Engineering, Peking University, Beijing, China
- Hua Xu
  - State Key Laboratory of Protein and Plant Gene Research, and Biomedical Pioneering Innovation Center (BIOPIC), Peking University, Beijing, China
- Yuan Liu
  - Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, Peking University, Beijing, China
  - Department of Chemical Biology, College of Chemistry and Molecular Engineering, Peking University, Beijing, China
- Bin Ma
  - Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, Peking University, Beijing, China
  - Department of Chemical Biology, College of Chemistry and Molecular Engineering, Peking University, Beijing, China
- Xuemin Chen
  - Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, Peking University, Beijing, China
  - Department of Chemical Biology, College of Chemistry and Molecular Engineering, Peking University, Beijing, China
- Xin Zeng
  - Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, China
- Xianghe Wang
  - Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, Peking University, Beijing, China
  - Department of Chemical Biology, College of Chemistry and Molecular Engineering, Peking University, Beijing, China
- Bo Wang
  - State Key Laboratory of Protein and Plant Gene Research, and Biomedical Pioneering Innovation Center (BIOPIC), Peking University, Beijing, China
- Sergey Ovchinnikov
  - John Harvard Distinguished Science Fellow, Harvard University, Cambridge, MA, USA
- Xiao-Dong Su
  - State Key Laboratory of Protein and Plant Gene Research, and Biomedical Pioneering Innovation Center (BIOPIC), Peking University, Beijing, China
- Chu Wang
  - Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, Peking University, Beijing, China
  - Department of Chemical Biology, College of Chemistry and Molecular Engineering, Peking University, Beijing, China
  - Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, China
2
Zha J, Li M, Kong R, Lu S, Zhang J. Explaining and Predicting Allostery with Allosteric Database and Modern Analytical Techniques. J Mol Biol 2022; 434:167481. [DOI: 10.1016/j.jmb.2022.167481] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/23/2021] [Revised: 01/25/2022] [Accepted: 01/31/2022] [Indexed: 12/17/2022]
3
Abstract
Metalloproteins play diverse and critical functions in all living systems, and their dysfunctional forms are closely related to many human diseases. The development of methods that enable comprehensive mapping of metalloproteome is of great interest to help elucidate crucial roles of metalloproteins in both physiology and pathology, as well as to discover new metalloproteins. We herein briefly review recent progress in the field of metalloproteomics and provide future outlooks.
Affiliation(s)
- Xin Zeng
  - Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, Peking University, Beijing 100871, China
  - Peking-Tsinghua Center for Life Sciences, Peking University, Beijing 100871, China
- Yao Cheng
  - Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, Peking University, Beijing 100871, China
  - College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Chu Wang
  - Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, Peking University, Beijing 100871, China
  - College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
  - Peking-Tsinghua Center for Life Sciences, Peking University, Beijing 100871, China
4
Rigden DJ, Thomas JMH, Simkovic F, Simpkin A, Winn MD, Mayans O, Keegan RM. Ensembles generated from crystal structures of single distant homologues solve challenging molecular-replacement cases in AMPLE. Acta Crystallogr D Struct Biol 2018; 74:183-193. [PMID: 29533226 PMCID: PMC5947759 DOI: 10.1107/s2059798318002310] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 10/11/2017] [Accepted: 02/07/2018] [Indexed: 01/17/2023]
Abstract
Molecular replacement (MR) is the predominant route to solution of the phase problem in macromolecular crystallography. Although routine in many cases, it becomes more effortful and often impossible when the available experimental structures typically used as search models are only distantly homologous to the target. Nevertheless, with current powerful MR software, relatively small core structures shared between the target and known structure, of 20-40% of the overall structure for example, can succeed as search models where they can be isolated. Manual sculpting of such small structural cores is rarely attempted and is dependent on the crystallographer's expertise and understanding of the protein family in question. Automated search-model editing has previously been performed on the basis of sequence alignment, in order to eliminate, for example, side chains or loops that are not present in the target, or on the basis of structural features (e.g. solvent accessibility) or crystallographic parameters (e.g. B factors). Here, based on recent work demonstrating a correlation between evolutionary conservation and protein rigidity/packing, novel automated ways to derive edited search models from a given distant homologue over a range of sizes are presented. A variety of structure-based metrics, many readily obtained from online webservers, can be fed to the MR pipeline AMPLE to produce search models that succeed with a set of test cases where expertly manually edited comparators, further processed in diverse ways with MrBUMP, fail. Further significant performance gains result when the structure-based distance geometry method CONCOORD is used to generate ensembles from the distant homologue. To our knowledge, this is the first such approach whereby a single structure is meaningfully transformed into an ensemble for the purposes of MR. Additional cases further demonstrate the advantages of the approach. CONCOORD is freely available and computationally inexpensive, so these novel methods offer readily available new routes to solve difficult MR cases.
Affiliation(s)
- Daniel J. Rigden
  - Institute of Integrative Biology, University of Liverpool, Crown Street, Liverpool L69 7ZB, England
- Jens M. H. Thomas
  - Institute of Integrative Biology, University of Liverpool, Crown Street, Liverpool L69 7ZB, England
- Felix Simkovic
  - Institute of Integrative Biology, University of Liverpool, Crown Street, Liverpool L69 7ZB, England
- Adam Simpkin
  - Institute of Integrative Biology, University of Liverpool, Crown Street, Liverpool L69 7ZB, England
- Martyn D. Winn
  - Science and Technology Facilities Council, Daresbury Laboratory, Warrington WA4 4AD, England
- Olga Mayans
  - Fachbereich Biologie, Universität Konstanz, 78457 Konstanz, Germany
- Ronan M. Keegan
  - Research Complex at Harwell, STFC Rutherford Appleton Laboratory, Didcot OX11 0FA, England
5
Simkovic F, Ovchinnikov S, Baker D, Rigden DJ. Applications of contact predictions to structural biology. IUCrJ 2017; 4:291-300. [PMID: 28512576 PMCID: PMC5414403 DOI: 10.1107/s2052252517005115] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Received: 12/12/2016] [Accepted: 04/03/2017] [Indexed: 06/07/2023]
Abstract
Evolutionary pressure on residue interactions, intramolecular or intermolecular, that are important for protein structure or function can lead to covariance between the two positions. Recent methodological advances allow much more accurate contact predictions to be derived from this evolutionary covariance signal. The practical application of contact predictions has largely been confined to structural bioinformatics, yet, as this work seeks to demonstrate, the data can be of enormous value to the structural biologist working in X-ray crystallography, cryo-EM or NMR. Integrative structural bioinformatics packages such as Rosetta can already exploit contact predictions in a variety of ways. The contribution of contact predictions begins at construct design, where structural domains may need to be expressed separately and contact predictions can help to predict domain limits. Structure solution by molecular replacement (MR) benefits from contact predictions in diverse ways: in difficult cases, more accurate search models can be constructed using ab initio modelling when predictions are available, while intermolecular contact predictions can allow the construction of larger, oligomeric search models. Furthermore, MR using supersecondary motifs or large-scale screens against the PDB can exploit information, such as the parallel or antiparallel nature of any β-strand pairing in the target, that can be inferred from contact predictions. Contact information will be particularly valuable in the determination of lower resolution structures by helping to assign sequence register. In large complexes, contact information may allow the identity of a protein responsible for a certain region of density to be determined and then assist in the orientation of an available model within that density. In NMR, predicted contacts can provide long-range information to extend the upper size limit of the technique in a manner analogous but complementary to experimental methods. Finally, predicted contacts can distinguish between biologically relevant interfaces and mere lattice contacts in a final crystal structure, and have potential in the identification of functionally important regions and in foreseeing the consequences of mutations.
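As a rough illustration of the covariance signal this abstract refers to, the sketch below scores a pair of alignment columns by their mutual information. This is a deliberately simple stand-in and not the method of the cited work: the function name `column_mutual_information` and the toy alignment are assumptions made for this example, and modern contact predictors rely on more elaborate statistical models than a pairwise score of this kind.

```python
from collections import Counter
from math import log2


def column_mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    `msa` is a list of equal-length aligned sequences. No sequence
    weighting or background correction is applied here; this is a toy score.
    """
    n = len(msa)
    counts_i = Counter(seq[i] for seq in msa)   # residue counts in column i
    counts_j = Counter(seq[j] for seq in msa)   # residue counts in column j
    pair_counts = Counter((seq[i], seq[j]) for seq in msa)

    mi = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        p_a = counts_i[a] / n
        p_b = counts_j[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Hypothetical four-sequence alignment: columns 0 and 2 change together,
# while columns 0 and 3 vary independently of each other.
toy_msa = ["ACDE", "ACDF", "GCHE", "GCHF"]
covarying = column_mutual_information(toy_msa, 0, 2)
independent = column_mutual_information(toy_msa, 0, 3)
```

In this toy alignment the covarying pair scores higher than the independent pair, which is the kind of contrast a contact predictor looks for, although real alignments need far more sequences and corrections for phylogenetic bias.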
Affiliation(s)
- Felix Simkovic
  - Institute of Integrative Biology, University of Liverpool, Liverpool L69 7ZB, England
- Sergey Ovchinnikov
  - Department of Biochemistry, University of Washington, Seattle, WA 98195, USA
  - Institute for Protein Design, University of Washington, Seattle, WA 98195, USA
  - Howard Hughes Medical Institute, University of Washington, Box 357370, Seattle, WA 98195, USA
- David Baker
  - Department of Biochemistry, University of Washington, Seattle, WA 98195, USA
  - Institute for Protein Design, University of Washington, Seattle, WA 98195, USA
  - Howard Hughes Medical Institute, University of Washington, Box 357370, Seattle, WA 98195, USA
- Daniel J. Rigden
  - Institute of Integrative Biology, University of Liverpool, Liverpool L69 7ZB, England
6
Levy RM, Haldane A, Flynn WF. Potts Hamiltonian models of protein co-variation, free energy landscapes, and evolutionary fitness. Curr Opin Struct Biol 2016; 43:55-62. [PMID: 27870991 DOI: 10.1016/j.sbi.2016.11.004] [Citation(s) in RCA: 66] [Impact Index Per Article: 7.3] [Received: 08/19/2016] [Accepted: 11/03/2016] [Indexed: 11/17/2022]
Abstract
Potts Hamiltonian models of protein sequence co-variation are statistical models constructed from the pair correlations observed in a multiple sequence alignment (MSA) of a protein family. These models are powerful because they capture higher order correlations induced by mutations evolving under constraints and help quantify the connections between protein sequence, structure, and function maintained through evolution. We review recent work with Potts models to predict protein structure and sequence-dependent conformational free energy landscapes, to survey protein fitness landscapes and to explore the effects of epistasis on fitness. We also comment on the numerical methods used to infer these models for each application.
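To make the form of such a model concrete, the following sketch evaluates a Potts-style energy for one sequence from per-site fields and pairwise couplings. It assumes one common sign convention, E(s) = -Σ_i h_i(s_i) - Σ_{i<j} J_ij(s_i, s_j), and a 21-letter alphabet (20 amino acids plus gap). The names `potts_energy`, `fields` and `couplings` are hypothetical, the parameter values are made up for illustration, and inferring real parameters from an MSA (the hard part discussed in the review) is not shown.

```python
import numpy as np

ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"  # 20 amino acids plus gap (assumed encoding)
INDEX = {aa: k for k, aa in enumerate(ALPHABET)}


def potts_energy(seq, fields, couplings):
    """Potts-style energy of an aligned sequence.

    fields:    array of shape (L, q), per-site terms h_i(a)
    couplings: array of shape (L, L, q, q), pair terms J_ij(a, b), read for i < j
    """
    idx = [INDEX[aa] for aa in seq]
    length = len(idx)
    # Single-site contribution.
    energy = -sum(fields[i, idx[i]] for i in range(length))
    # Pairwise contribution over each unordered pair of positions once.
    for i in range(length):
        for j in range(i + 1, length):
            energy -= couplings[i, j, idx[i], idx[j]]
    return float(energy)


# Hypothetical parameters for a length-3 alignment: a single favourable
# coupling between 'A' at position 0 and 'C' at position 2.
L, q = 3, len(ALPHABET)
fields = np.zeros((L, q))
couplings = np.zeros((L, L, q, q))
couplings[0, 2, INDEX["A"], INDEX["C"]] = 1.5

favoured = potts_energy("AGC", fields, couplings)
neutral = potts_energy("AGA", fields, couplings)
```

Under this convention a lower energy corresponds to a sequence the model considers more probable, so the sequence carrying the coupled pair scores below the one without it.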
Affiliation(s)
- Ronald M Levy
  - Center for Biophysics and Computational Biology, Department of Chemistry, and Institute for Computational Molecular Science, Temple University, Philadelphia, PA 19122, United States
- Allan Haldane
  - Center for Biophysics and Computational Biology, Department of Chemistry, and Institute for Computational Molecular Science, Temple University, Philadelphia, PA 19122, United States
- William F Flynn
  - Center for Biophysics and Computational Biology, Department of Chemistry, and Institute for Computational Molecular Science, Temple University, Philadelphia, PA 19122, United States
  - Department of Physics and Astronomy, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, United States