1
Li Y, Wang Y, Tan YQ, Yue Q, Guo Y, Yan R, Meng L, Zhai H, Tong L, Yuan Z, Li W, Wang C, Han S, Ren S, Yan Y, Wang W, Gao L, Tan C, Hu T, Zhang H, Liu L, Yang P, Jiang W, Ye Y, Tan H, Wang Y, Lu C, Li X, Xie J, Yuan G, Cui Y, Shen B, Wang C, Guan Y, Li W, Shi Q, Lin G, Ni T, Sun Z, Ye L, Vourekas A, Guo X, Lin M, Zheng K. The landscape of RNA-binding proteins in mammalian spermatogenesis. Science 2024:eadj8172. [PMID: 39208083 DOI: 10.1126/science.adj8172] [Received: 07/19/2023] [Revised: 04/08/2024] [Accepted: 08/20/2024] [Indexed: 09/04/2024]
Abstract
Despite continuous expansion of the RNA-binding protein (RBP) world, there is a lack of systematic understanding of RBPs in mammalian testis, which harbors one of the most complex tissue transcriptomes. We adapted RNA interactome capture to mouse male germ cells, building an RBP atlas characterized by multiple layers of dynamics along spermatogenesis. Trapping of RNA-crosslinked peptides showed that the glutamic acid-arginine (ER) patch, a residue-coevolved polyampholytic element present in coiled-coils, enhances RNA binding of its host RBPs. Deletion of this element in NONO (non-POU domain-containing octamer-binding protein) led to a defective mitosis-to-meiosis transition due to compromised NONO-RNA interactions. Whole-exome sequencing of over 1000 infertile men revealed a prominent role of RBPs in the human genetic architecture of male infertility and identified risk ER patch variants.
Affiliation(s)
- Yang Li
- State Key Laboratory of Reproductive Medicine and Offspring Health, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, China
- Yuanyuan Wang
- State Key Laboratory of Reproductive Medicine and Offspring Health, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, China
- Department of Neurobiology, School of Basic Medical Science, Nanjing Medical University, Nanjing 211166, China
- Yue-Qiu Tan
- Institute of Reproductive and Stem Cell Engineering, NHC Key Laboratory of Human Stem Cell and Reproductive Engineering, School of Basic Medical Science, Central South University, Changsha 410083, China
- Clinical Research Center for Reproduction and Genetics in Hunan Province, Reproductive and Genetic Hospital of CITIC-Xiangya, Changsha 410008, China
- Qiuling Yue
- State Key Laboratory of Reproductive Medicine and Offspring Health, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, China
- Department of Andrology, Nanjing Drum Tower Hospital, the Affiliated Hospital of Nanjing University, Nanjing 210008, China
- Yueshuai Guo
- State Key Laboratory of Reproductive Medicine and Offspring Health, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, China
- Ruoyu Yan
- State Key Laboratory of Reproductive Medicine and Offspring Health, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, China
- College of Life Sciences, Northwest A&F University, Yangling 712100, China
- Lanlan Meng
- Institute of Reproductive and Stem Cell Engineering, NHC Key Laboratory of Human Stem Cell and Reproductive Engineering, School of Basic Medical Science, Central South University, Changsha 410083, China
- Clinical Research Center for Reproduction and Genetics in Hunan Province, Reproductive and Genetic Hospital of CITIC-Xiangya, Changsha 410008, China
- Huicong Zhai
- State Key Laboratory of Reproductive Medicine and Offspring Health, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, China
- Lingxiu Tong
- State Key Laboratory of Reproductive Medicine and Offspring Health, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, China
- Zihan Yuan
- State Key Laboratory of Reproductive Medicine and Offspring Health, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, China
- Wu Li
- State Key Laboratory of Reproductive Medicine and Offspring Health, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, China
- Cuicui Wang
- State Key Laboratory of Reproductive Medicine and Offspring Health, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, China
- Shenglin Han
- State Key Laboratory of Reproductive Medicine and Offspring Health, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, China
- Sen Ren
- State Key Laboratory of Reproductive Medicine and Offspring Health, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, China
- Yitong Yan
- Department of Neurobiology, School of Basic Medical Science, Nanjing Medical University, Nanjing 211166, China
- Weixu Wang
- Institute of Computational Biology, Helmholtz Center Munich, Munich 85764, Germany
- Lei Gao
- State Key Laboratory of Reproductive Medicine and Offspring Health, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, China
- Chen Tan
- Institute of Reproductive and Stem Cell Engineering, NHC Key Laboratory of Human Stem Cell and Reproductive Engineering, School of Basic Medical Science, Central South University, Changsha 410083, China
- Tongyao Hu
- Institute of Reproductive and Stem Cell Engineering, NHC Key Laboratory of Human Stem Cell and Reproductive Engineering, School of Basic Medical Science, Central South University, Changsha 410083, China
- Hao Zhang
- State Key Laboratory of Reproductive Medicine and Offspring Health, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, China
- Liya Liu
- Department of Neurobiology, School of Basic Medical Science, Nanjing Medical University, Nanjing 211166, China
- Pinglan Yang
- State Key Laboratory of Reproductive Medicine and Offspring Health, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, China
- Wanyin Jiang
- State Key Laboratory of Reproductive Medicine and Offspring Health, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, China
- Yiting Ye
- State Key Laboratory of Reproductive Medicine and Offspring Health, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, China
- Huanhuan Tan
- State Key Laboratory of Reproductive Medicine and Offspring Health, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, China
- Yanfeng Wang
- State Key Laboratory of Reproductive Medicine and Offspring Health, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, China
- Chenyu Lu
- State Key Laboratory of Reproductive Medicine and Offspring Health, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, China
- Xin Li
- State Key Laboratory of Reproductive Medicine and Offspring Health, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, China
- Jie Xie
- State Key Laboratory of Reproductive Medicine and Offspring Health, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, China
- Gege Yuan
- State Key Laboratory of Reproductive Medicine and Offspring Health, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, China
- Yiqiang Cui
- State Key Laboratory of Reproductive Medicine and Offspring Health, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, China
- Bin Shen
- State Key Laboratory of Reproductive Medicine and Offspring Health, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, China
- Cheng Wang
- State Key Laboratory of Reproductive Medicine and Offspring Health, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, China
- Department of Bioinformatics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Yichun Guan
- Center for Reproductive Medicine, the Third Affiliated Hospital of Zhengzhou University, Zhengzhou 450052, China
- Wei Li
- Guangzhou Women and Children's Medical Center, Guangzhou Medical University, Guangzhou 510623, China
- Qinghua Shi
- Division of Reproduction and Genetics, First Affiliated Hospital of USTC, Hefei National Laboratory for Physical Sciences at Microscale, School of Basic Medical Sciences, Division of Life Sciences and Medicine, Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei 230027, Anhui, China
- Ge Lin
- Institute of Reproductive and Stem Cell Engineering, NHC Key Laboratory of Human Stem Cell and Reproductive Engineering, School of Basic Medical Science, Central South University, Changsha 410083, China
- Clinical Research Center for Reproduction and Genetics in Hunan Province, Reproductive and Genetic Hospital of CITIC-Xiangya, Changsha 410008, China
- Ting Ni
- State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Development, Human Phenome Institute, Shanghai Engineering Research Center of Industrial Microorganisms, School of Life Sciences and Huashan Hospital, Fudan University, Shanghai 200438, China
- Zheng Sun
- Department of Medicine, Baylor College of Medicine, Houston, TX 77030, USA
- Lan Ye
- State Key Laboratory of Reproductive Medicine and Offspring Health, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, China
- Anastasios Vourekas
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
- Xuejiang Guo
- State Key Laboratory of Reproductive Medicine and Offspring Health, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, China
- Mingyan Lin
- Department of Neurobiology, School of Basic Medical Science, Nanjing Medical University, Nanjing 211166, China
- Changzhou Medical Center, The Affiliated Changzhou Second People's Hospital of Nanjing Medical University, Changzhou 213000, China
- Division of Birth Cohort Study, Fujian Maternity and Child Health Hospital, Fuzhou 350014, China
- Ke Zheng
- State Key Laboratory of Reproductive Medicine and Offspring Health, Center for Global Health, School of Public Health, Nanjing Medical University, Nanjing 211166, China
2
Bryant P, Noé F. Structure prediction of alternative protein conformations. Nat Commun 2024; 15:7328. [PMID: 39187507 PMCID: PMC11347660 DOI: 10.1038/s41467-024-51507-2] [Received: 10/11/2023] [Accepted: 08/07/2024] [Indexed: 08/28/2024]
Abstract
Proteins are dynamic molecules whose movements result in different conformations with different functions. Neural networks such as AlphaFold2 can predict the structure of single-chain proteins with conformations most likely to exist in the PDB. However, almost all protein structures with multiple conformations represented in the PDB have been used while training these models. Therefore, it is unclear whether alternative protein conformations can be genuinely predicted using these networks, or if they are simply reproduced from memory. Here, we train a structure prediction network, Cfold, on a conformational split of the PDB to generate alternative conformations. Cfold enables efficient exploration of the conformational landscape of monomeric protein structures. Over 50% of experimentally known nonredundant alternative protein conformations evaluated here are predicted with high accuracy (TM-score > 0.8).
Affiliation(s)
- Patrick Bryant
- Department of Mathematics and Informatics, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany.
- The Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Svante Arrhenius väg 20C, 114 18, Stockholm, Sweden.
- Science for Life Laboratory, 172 21, Solna, Sweden.
- Frank Noé
- Department of Mathematics and Informatics, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany
- Microsoft Research AI4Science, Karl-Liebknecht Str. 32, 10178, Berlin, Germany
3
Jang YJ, Qin QQ, Huang SY, Peter ATJ, Ding XM, Kornmann B. Accurate prediction of protein function using statistics-informed graph networks. Nat Commun 2024; 15:6601. [PMID: 39097570 PMCID: PMC11297950 DOI: 10.1038/s41467-024-50955-0] [Received: 05/17/2023] [Accepted: 07/15/2024] [Indexed: 08/05/2024]
Abstract
Understanding protein function is pivotal in comprehending the intricate mechanisms that underlie many crucial biological activities, with far-reaching implications in the fields of medicine, biotechnology, and drug development. However, more than 200 million proteins remain uncharacterized, and computational efforts heavily rely on protein structural information to predict annotations of varying quality. Here, we present a method, PhiGnet, that utilizes statistics-informed graph networks to predict protein function solely from sequence. Our method inherently characterizes evolutionary signatures, allowing for a quantitative assessment of the significance of residues that carry out specific functions. PhiGnet not only demonstrates superior performance compared to alternative approaches but also narrows the sequence-function gap, even in the absence of structural information. Our findings indicate that applying deep learning to evolutionary data can highlight functional sites at the residue level, providing valuable support for interpreting both existing properties and new functionalities of proteins in research and biomedicine.
Affiliation(s)
- Yaan J Jang
- Department of Biochemistry, University of Oxford, Oxford, UK.
- AmoAi Technologies, Oxford, UK.
- Qi-Qi Qin
- AmoAi Technologies, Oxford, UK
- School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai, China
- Si-Yu Huang
- AmoAi Technologies, Oxford, UK
- Oxford Martin School, University of Oxford, Oxford, UK
- School of Systems Science, Beijing Normal University, Beijing, China
- Xue-Ming Ding
- School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai, China
- Benoît Kornmann
- Department of Biochemistry, University of Oxford, Oxford, UK.
4
Correa Marrero M, Jänes J, Baptista D, Beltrao P. Integrating Large-Scale Protein Structure Prediction into Human Genetics Research. Annu Rev Genomics Hum Genet 2024; 25:123-140. [PMID: 38621234 DOI: 10.1146/annurev-genom-120622-020615] [Indexed: 04/17/2024]
Abstract
The last five years have seen impressive progress in deep learning models applied to protein research. Most notably, sequence-based structure predictions have seen transformative gains in the form of AlphaFold2 and related approaches. Millions of missense protein variants in the human population lack annotations, and these computational methods are a valuable means to prioritize variants for further analysis. Here, we review the recent progress in deep learning models applied to the prediction of protein structure and protein variants, with particular emphasis on their implications for human genetics and health. Improved prediction of protein structures facilitates annotations of the impact of variants on protein stability, protein-protein interaction interfaces, and small-molecule binding pockets. Moreover, it contributes to the study of host-pathogen interactions and the characterization of protein function. As genome sequencing in large cohorts becomes increasingly prevalent, we believe that better integration of state-of-the-art protein informatics technologies into human genetics research is of paramount importance.
Affiliation(s)
- Miguel Correa Marrero
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Institute of Molecular Systems Biology, Department of Biology, ETH Zurich, Zurich, Switzerland
- Jürgen Jänes
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Institute of Molecular Systems Biology, Department of Biology, ETH Zurich, Zurich, Switzerland
- Pedro Beltrao
- Instituto Gulbenkian de Ciência, Oeiras, Portugal
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Institute of Molecular Systems Biology, Department of Biology, ETH Zurich, Zurich, Switzerland
5
Hirata E, Sakata KT, Dearden GI, Noor F, Menon I, Chiduza GN, Menon AK. Molecular characterization of Rft1, an ER membrane protein associated with congenital disorder of glycosylation RFT1-CDG. J Biol Chem 2024; 300:107584. [PMID: 39025454 PMCID: PMC11365447 DOI: 10.1016/j.jbc.2024.107584] [Received: 04/04/2024] [Revised: 07/10/2024] [Accepted: 07/11/2024] [Indexed: 07/20/2024]
Abstract
The oligosaccharide needed for protein N-glycosylation is assembled on a lipid carrier via a multistep pathway. Synthesis is initiated on the cytoplasmic face of the endoplasmic reticulum (ER) and completed on the luminal side after transbilayer translocation of a heptasaccharide lipid intermediate. More than 30 congenital disorders of glycosylation (CDGs) are associated with this pathway, including RFT1-CDG which results from defects in the membrane protein Rft1. Rft1 is essential for the viability of yeast and mammalian cells and was proposed as the transporter needed to flip the heptasaccharide lipid intermediate across the ER membrane. However, other studies indicated that Rft1 is not required for heptasaccharide lipid flipping in microsomes or unilamellar vesicles reconstituted with ER membrane proteins, nor is it required for the viability of at least one eukaryote. It is therefore not known what essential role Rft1 plays in N-glycosylation. Here, we present a molecular characterization of human Rft1, using yeast cells as a reporter system. We show that it is a multispanning membrane protein located in the ER, with its N and C termini facing the cytoplasm. It is not N-glycosylated. The majority of RFT1-CDG mutations map to highly conserved regions of the protein. We identify key residues that are important for Rft1's ability to support N-glycosylation and cell viability. Our results provide a necessary platform for future work on this enigmatic protein.
Affiliation(s)
- Eri Hirata
- Department of Biochemistry, Weill Cornell Medical College, New York, New York, USA
- Ken-Taro Sakata
- Department of Biochemistry, Weill Cornell Medical College, New York, New York, USA
- Grace I Dearden
- Department of Biochemistry, Weill Cornell Medical College, New York, New York, USA
- Faria Noor
- Department of Biochemistry, Weill Cornell Medical College, New York, New York, USA
- Indu Menon
- Department of Biochemistry, Weill Cornell Medical College, New York, New York, USA
- George N Chiduza
- Structure and Function of Biological Membranes - Chemistry Department, Université Libre de Bruxelles - Campus Plaine, Brussels, Belgium
- Anant K Menon
- Department of Biochemistry, Weill Cornell Medical College, New York, New York, USA.
6
Hirata E, Sakata KT, Dearden GI, Noor F, Menon I, Chiduza GN, Menon AK. Molecular characterization of Rft1, an ER membrane protein associated with congenital disorder of glycosylation RFT1-CDG. bioRxiv 2024:2024.04.03.587922. [PMID: 38617304 PMCID: PMC11014557 DOI: 10.1101/2024.04.03.587922] [Indexed: 04/16/2024]
Abstract
The oligosaccharide needed for protein N-glycosylation is assembled on a lipid carrier via a multi-step pathway. Synthesis is initiated on the cytoplasmic face of the endoplasmic reticulum (ER) and completed on the luminal side after transbilayer translocation of a heptasaccharide lipid intermediate. More than 30 Congenital Disorders of Glycosylation (CDGs) are associated with this pathway, including RFT1-CDG which results from defects in the membrane protein Rft1. Rft1 is essential for the viability of yeast and mammalian cells and was proposed as the transporter needed to flip the heptasaccharide lipid intermediate across the ER membrane. However, other studies indicated that Rft1 is not required for heptasaccharide lipid flipping in microsomes or unilamellar vesicles reconstituted with ER membrane proteins, nor is it required for the viability of at least one eukaryote. It is therefore not known what essential role Rft1 plays in N-glycosylation. Here, we present a molecular characterization of human Rft1, using yeast cells as a reporter system. We show that it is a multi-spanning membrane protein located in the ER, with its N and C-termini facing the cytoplasm. It is not N-glycosylated. The majority of RFT1-CDG mutations map to highly conserved regions of the protein. We identify key residues that are important for Rft1's ability to support N-glycosylation and cell viability. Our results provide a necessary platform for future work on this enigmatic protein.
Affiliation(s)
- Eri Hirata
- Department of Biochemistry, Weill Cornell Medical College, New York, NY 10065, USA
- Ken-taro Sakata
- Department of Biochemistry, Weill Cornell Medical College, New York, NY 10065, USA
- Grace I. Dearden
- Department of Biochemistry, Weill Cornell Medical College, New York, NY 10065, USA
- Faria Noor
- Department of Biochemistry, Weill Cornell Medical College, New York, NY 10065, USA
- Indu Menon
- Department of Biochemistry, Weill Cornell Medical College, New York, NY 10065, USA
- George N. Chiduza
- Structure and Function of Biological Membranes - Chemistry Department, Université Libre de Bruxelles - Campus Plaine, 1050 Brussels, Belgium
- Anant K. Menon
- Department of Biochemistry, Weill Cornell Medical College, New York, NY 10065, USA
7
Porter LL, Artsimovitch I, Ramírez-Sarmiento CA. Metamorphic proteins and how to find them. Curr Opin Struct Biol 2024; 86:102807. [PMID: 38537533 PMCID: PMC11102287 DOI: 10.1016/j.sbi.2024.102807] [Received: 02/01/2024] [Revised: 03/05/2024] [Accepted: 03/06/2024] [Indexed: 04/04/2024]
Abstract
In the last two decades, our existing notion that most foldable proteins have a unique native state has been challenged by the discovery of metamorphic proteins, which reversibly interconvert between multiple, sometimes highly dissimilar, native states. As the number of known metamorphic proteins increases, several computational and experimental strategies have emerged for gaining insights about their refolding processes and identifying unknown metamorphic proteins amongst the known proteome. In this review, we describe the current advances in biophysically and functionally ascertaining the structural interconversions of metamorphic proteins and how coevolution can be harnessed to identify novel metamorphic proteins from sequence information. We also discuss the challenges and ongoing efforts in using artificial intelligence-based protein structure prediction methods to discover metamorphic proteins and predict their corresponding three-dimensional structures.
Affiliation(s)
- Lauren L Porter
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA; Biochemistry and Biophysics Center, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD 20892, USA.
- Irina Artsimovitch
- Department of Microbiology and Center for RNA Biology, The Ohio State University, Columbus, OH 43210, USA.
- César A Ramírez-Sarmiento
- Institute for Biological and Medical Engineering, Schools of Engineering, Medicine and Biological Sciences, Pontificia Universidad Católica de Chile, Santiago 7820436, Chile; ANID, Millennium Science Initiative Program, Millennium Institute for Integrative Biology (iBio), Santiago 833150, Chile.
8
Fang T, Szklarczyk D, Hachilif R, von Mering C. Enhancing coevolutionary signals in protein-protein interaction prediction through clade-wise alignment integration. Sci Rep 2024; 14:6009. [PMID: 38472223 PMCID: PMC10933411 DOI: 10.1038/s41598-024-55655-9] [Received: 12/06/2023] [Accepted: 02/26/2024] [Indexed: 03/14/2024]
Abstract
Protein-protein interactions (PPIs) play essential roles in most biological processes. The binding interfaces between interacting proteins impose evolutionary constraints that have successfully been employed to predict PPIs from multiple sequence alignments (MSAs). To construct MSAs, critical choices have to be made: how to ensure the reliable identification of orthologs, and how to optimally balance the need for large alignments versus sufficient alignment quality. Here, we propose a divide-and-conquer strategy for MSA generation: instead of building a single, large alignment for each protein, multiple distinct alignments are constructed under distinct clades in the tree of life. Coevolutionary signals are searched separately within these clades, and are only subsequently integrated using machine learning techniques. We find that this strategy markedly improves overall prediction performance, concomitant with better alignment quality. Using the popular DCA algorithm to systematically search pairs of such alignments, a genome-wide all-against-all interaction scan in a bacterial genome is demonstrated. Given the recent successes of AlphaFold in predicting direct PPIs at atomic detail, a discover-and-refine approach is proposed: our method could provide a fast and accurate strategy for pre-screening the entire genome, submitting to AlphaFold only promising interaction candidates, thus reducing false positives as well as computation time.
Affiliation(s)
- Tao Fang
- Department of Molecular Life Sciences, University of Zurich, 8057, Zurich, Switzerland
- SIB Swiss Institute of Bioinformatics, 1015, Lausanne, Switzerland
- Damian Szklarczyk
- Department of Molecular Life Sciences, University of Zurich, 8057, Zurich, Switzerland
- SIB Swiss Institute of Bioinformatics, 1015, Lausanne, Switzerland
- Radja Hachilif
- Department of Molecular Life Sciences, University of Zurich, 8057, Zurich, Switzerland
- SIB Swiss Institute of Bioinformatics, 1015, Lausanne, Switzerland
- Christian von Mering
- Department of Molecular Life Sciences, University of Zurich, 8057, Zurich, Switzerland.
- SIB Swiss Institute of Bioinformatics, 1015, Lausanne, Switzerland.
9
Yehorova D, Crean RM, Kasson PM, Kamerlin SCL. Key interaction networks: Identifying evolutionarily conserved non-covalent interaction networks across protein families. Protein Sci 2024; 33:e4911. [PMID: 38358258 PMCID: PMC10868456 DOI: 10.1002/pro.4911] [Received: 11/03/2023] [Revised: 01/08/2024] [Accepted: 01/10/2024] [Indexed: 02/16/2024]
Abstract
Protein structure (and thus function) is dictated by non-covalent interaction networks. These can be highly evolutionarily conserved across protein families, the members of which can diverge in sequence and evolutionary history. Here we present KIN, a tool to identify and analyze conserved non-covalent interaction networks across evolutionarily related groups of proteins. KIN is available for download under a GNU General Public License, version 2, from https://www.github.com/kamerlinlab/KIN. KIN can operate on experimentally determined structures, predicted structures, or molecular dynamics trajectories, providing insight into both conserved and missing interactions across evolutionarily related proteins. This provides useful insight into protein evolution, as well as a tool that can be exploited for protein engineering efforts. As a showcase system, we demonstrate applications of this tool to understanding the evolutionarily relevant conserved interaction networks across the class A β-lactamases.
Affiliation(s)
- Dariia Yehorova
- School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia, USA
- Rory M. Crean
- Department of Chemistry - BMC, Uppsala University, Uppsala, Sweden
- Peter M. Kasson
- Department of Molecular Physiology, University of Virginia, Charlottesville, Virginia, USA
- Department of Biomedical Engineering, University of Virginia, Charlottesville, Virginia, USA
- Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
- Shina C. L. Kamerlin
- School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia, USA
- Department of Chemistry - BMC, Uppsala University, Uppsala, Sweden
10
Teng Z, Pan X, Liu Y, You J, Zhang H, Zhao Z, Qiao Z, Rao Z. Engineering serine hydroxymethyltransferases for efficient synthesis of L-serine in Escherichia coli. Bioresour Technol 2024; 393:130153. [PMID: 38052329 DOI: 10.1016/j.biortech.2023.130153] [Received: 11/13/2023] [Revised: 12/01/2023] [Accepted: 12/02/2023] [Indexed: 12/07/2023]
Abstract
L-serine is a high-value amino acid widely used in the food, medicine, and cosmetic industries. However, the low yield of L-serine has limited its industrial production. In this study, a cellular factory for efficient synthesis of L-serine was obtained by engineering the serine hydroxymethyltransferases (SHMT). Firstly, after screening the SHMT from Alcanivorax dieselolei by genome mining, a mutant AdSHMTE266M with high thermal stability was identified through rational design. Subsequently, an iterative saturating mutant library was constructed by using coevolutionary analysis, and a mutant AdSHMTE160L/E193Q with enzyme activity 1.35 times higher than AdSHMT was identified. Additionally, the target protein AdSHMTE160L/E193Q/E266M was efficiently overexpressed by improving its mRNA stability. Finally, combining the substrate addition strategy and system optimization, the optimized strain BL21/pET28a-AdSHMTE160L/E193Q/E266M-5'UTR-REP3S16 produced 106.06 g/L L-serine, which is the highest production to date. This study provides new ideas and insights for the engineering design of SHMT and the industrial production of L-serine.
Affiliation(s)
- Zixin Teng
- Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi 214122, Jiangsu, China; Yixing Institute of Food and Biotechnology Co., Ltd, Yixing 214200, China
| | - Xuewei Pan
- Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi 214122, Jiangsu, China; Yixing Institute of Food and Biotechnology Co., Ltd, Yixing 214200, China
| | - Yunran Liu
- Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi 214122, Jiangsu, China; Yixing Institute of Food and Biotechnology Co., Ltd, Yixing 214200, China
| | - Jiajia You
- Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi 214122, Jiangsu, China; Yixing Institute of Food and Biotechnology Co., Ltd, Yixing 214200, China
| | - Hengwei Zhang
- Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi 214122, Jiangsu, China; Yixing Institute of Food and Biotechnology Co., Ltd, Yixing 214200, China
| | - Zhenqiang Zhao
- Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi 214122, Jiangsu, China; Yixing Institute of Food and Biotechnology Co., Ltd, Yixing 214200, China
| | - Zhina Qiao
- Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi 214122, Jiangsu, China; Yixing Institute of Food and Biotechnology Co., Ltd, Yixing 214200, China
| | - Zhiming Rao
- Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi 214122, Jiangsu, China; Yixing Institute of Food and Biotechnology Co., Ltd, Yixing 214200, China.
| |
|
11
|
Guilvout I, Samsudin F, Huber RG, Bond PJ, Bardiaux B, Francetic O. Membrane platform protein PulF of the Klebsiella type II secretion system forms a trimeric ion channel essential for endopilus assembly and protein secretion. mBio 2024; 15:e0142323. [PMID: 38063437 PMCID: PMC10790770 DOI: 10.1128/mbio.01423-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2023] [Accepted: 10/24/2023] [Indexed: 01/17/2024] Open
Abstract
IMPORTANCE Type IV pili and type II secretion systems are members of the widespread type IV filament (T4F) superfamily of nanomachines that assemble dynamic and versatile surface fibers in archaea and bacteria. The assembly and retraction of T4 filaments with diverse surface properties and functions require the plasma membrane platform proteins of the GspF/PilC superfamily. Generally considered dimeric, platform proteins are thought to function as passive transmitters of the mechanical energy generated by the ATPase motor, to somehow promote insertion of pilin subunits into the nascent pilus fibers. Here, we generate and experimentally validate structural predictions that support the trimeric state of a platform protein PulF from a type II secretion system. The PulF trimers form selective proton or sodium channels which might energize pilus assembly using the membrane potential. The conservation of the channel sequence and structural features implies a common mechanism for all T4F assembly systems. We propose a model of the oligomeric PulF-PulE ATPase complex that provides an essential framework to investigate and understand the pilus assembly mechanism.
Affiliation(s)
- Ingrid Guilvout
- Institut Pasteur, Université Paris Cité, CNRS UMR 3528, Biochemistry of Macromolecular Interactions Unit, Paris, France
| | | | | | - Peter J. Bond
- Bioinformatics Institute (A-STAR), Singapore, Singapore
- Department of Biological Sciences, National University of Singapore, Singapore, Singapore
| | - Benjamin Bardiaux
- Institut Pasteur, Université Paris Cité, CNRS UMR 3528, Structural Bioinformatics Unit, Paris, France
- Institut Pasteur, Université Paris Cité, CNRS UMR 3528, Bacterial Transmembrane Systems Unit, Paris, France
| | - Olivera Francetic
- Institut Pasteur, Université Paris Cité, CNRS UMR 3528, Biochemistry of Macromolecular Interactions Unit, Paris, France
| |
|
12
|
Rai GP, Shanker A. Coevolution-based computational approach to detect resistance mechanism of epidermal growth factor receptor. Biochimica et Biophysica Acta. Molecular Cell Research 2024; 1871:119592. [PMID: 37730130 DOI: 10.1016/j.bbamcr.2023.119592] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2023] [Revised: 08/24/2023] [Accepted: 09/10/2023] [Indexed: 09/22/2023]
Abstract
The tyrosine kinase epidermal growth factor receptor (EGFR) is associated with neoplastic cell metastasis, angiogenesis, neoplastic invasion, and apoptosis. Because of its involvement in these biological processes, EGFR has become a potent target for treating non-small cell lung cancer (NSCLC). Tyrosine kinase inhibitors (TKIs) have shown high efficacy and raised expectations for patients, but unfortunately, within a year of treatment, drug targets develop resistance due to mutations. The present study detected compensatory mutations in EGFR to understand the evolutionary mechanism of drug resistance. The results demonstrate that compensatory mutations enlarge the drug-binding pocket, which may alter the orientation of the ligand (gefitinib and erlotinib) and cause drug resistance. This indicates that coevolutionary forces play a significant role in fine-tuning the structure of the EGFR protein against the drugs. The analysis provides insight into the evolution-induced structural changes underlying drug resistance in EGFR, which in turn may be useful in designing drugs with better efficacy.
Affiliation(s)
- Gyan Prakash Rai
- Department of Bioinformatics, Central University of South Bihar, Gaya, Bihar 824236, India
| | - Asheesh Shanker
- Department of Bioinformatics, Central University of South Bihar, Gaya, Bihar 824236, India.
| |
|
13
|
Wayment-Steele HK, Ojoawo A, Otten R, Apitz JM, Pitsawong W, Hömberger M, Ovchinnikov S, Colwell L, Kern D. Predicting multiple conformations via sequence clustering and AlphaFold2. Nature 2024; 625:832-839. [PMID: 37956700 PMCID: PMC10808063 DOI: 10.1038/s41586-023-06832-9] [Citation(s) in RCA: 57] [Impact Index Per Article: 57.0] [Received: 07/07/2023] [Accepted: 11/03/2023] [Indexed: 11/15/2023]
Abstract
AlphaFold2 (ref. 1) has revolutionized structural biology by accurately predicting single structures of proteins. However, a protein's biological function often depends on multiple conformational substates (ref. 2), and disease-causing point mutations often cause population changes within these substates (refs. 3,4). We demonstrate that clustering a multiple-sequence alignment by sequence similarity enables AlphaFold2 to sample alternative states of known metamorphic proteins with high confidence. Using this method, named AF-Cluster, we investigated the evolutionary distribution of predicted structures for the metamorphic protein KaiB (ref. 5) and found that predictions of both conformations were distributed in clusters across the KaiB family. We used nuclear magnetic resonance spectroscopy to confirm an AF-Cluster prediction: a cyanobacteria KaiB variant is stabilized in the opposite state compared with the more widely studied variant. To test AF-Cluster's sensitivity to point mutations, we designed and experimentally verified a set of three mutations predicted to flip KaiB from Rhodobacter sphaeroides from the ground to the fold-switched state. Finally, screening for alternative states in protein families without known fold switching identified a putative alternative state for the oxidoreductase Mpt53 in Mycobacterium tuberculosis. Further development of such bioinformatic methods in tandem with experiments will probably have a considerable impact on predicting protein energy landscapes, essential for illuminating biological function.
Affiliation(s)
- Hannah K Wayment-Steele
- Department of Biochemistry, Brandeis University and Howard Hughes Medical Institute, Waltham, MA, USA
| | - Adedolapo Ojoawo
- Department of Biochemistry, Brandeis University and Howard Hughes Medical Institute, Waltham, MA, USA
| | - Renee Otten
- Department of Biochemistry, Brandeis University and Howard Hughes Medical Institute, Waltham, MA, USA
- Treeline Biosciences, Watertown, MA, USA
| | - Julia M Apitz
- Department of Biochemistry, Brandeis University and Howard Hughes Medical Institute, Waltham, MA, USA
| | - Warintra Pitsawong
- Department of Biochemistry, Brandeis University and Howard Hughes Medical Institute, Waltham, MA, USA
- Biomolecular Discovery, Relay Therapeutics, Cambridge, MA, USA
| | - Marc Hömberger
- Department of Biochemistry, Brandeis University and Howard Hughes Medical Institute, Waltham, MA, USA
- Treeline Biosciences, Watertown, MA, USA
| | | | - Lucy Colwell
- Google Research, Cambridge, MA, USA
- Cambridge University, Cambridge, UK
| | - Dorothee Kern
- Department of Biochemistry, Brandeis University and Howard Hughes Medical Institute, Waltham, MA, USA.
| |
|
14
|
Li X, Chen B, Chen W, Pu Z, Qi X, Yang L, Wu J, Yu H. Customized multiple sequence alignment as an effective strategy to improve performance of Taq DNA polymerase. Appl Microbiol Biotechnol 2023; 107:6507-6525. [PMID: 37658164 DOI: 10.1007/s00253-023-12744-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2023] [Revised: 08/06/2023] [Accepted: 08/24/2023] [Indexed: 09/03/2023]
Abstract
Engineering Taq DNA polymerase (TaqPol) for improved activity, stability and sensitivity is critical for its wide applications. Multiple sequence alignment (MSA) has been widely used in engineering enzymes for improved properties. Here, we first designed TaqPol mutations based on an MSA of 2756 sequences from both thermophilic and non-thermophilic organisms. Two double mutants were generated: variant H676F/R677G, which showed a decrease in both activity and stability, and variant Y686R/E687K, which showed improved activity but decreased stability. Mutations targeting residues that coevolve with Arg677 and Tyr686 were then applied to rescue the stability or activity loss of the double mutants, with partial success. Sequence analysis revealed that the two mutations are abundant in non-thermophilic sequences but not in thermophilic homologues. Then, a small-scale MSA containing sequences from only thermophilic organisms was applied to predict 13 single variants, and two of them, E507Q and E734N, showed a simultaneous increase in stability, activity, and even sensitivity. A customized MSA was hence more effective in engineering a thermophilic enzyme and could be used in engineering other enzymes. Molecular dynamics simulations revealed the impact of mutations on the protein dynamics and interactions between TaqPol and substrates. KEY POINTS: • The pool of sequences for alignment is critical to engineering Taq DNA polymerase. • Variants with poor properties can be rescued by mutations in the coevolving network. • Improving binding with DNA can improve DNA polymerase stability and activity.
Affiliation(s)
- Xinjia Li
- Institute of Bioengineering, College of Chemical and Biological Engineering, Zhejiang University, Hangzhou, 310027, Zhejiang, China
- ZJU-Hangzhou Global Scientific and Technological Innovation Centre, Hangzhou, 311200, Zhejiang, China
| | - Binbin Chen
- Institute of Bioengineering, College of Chemical and Biological Engineering, Zhejiang University, Hangzhou, 310027, Zhejiang, China
- ZJU-Hangzhou Global Scientific and Technological Innovation Centre, Hangzhou, 311200, Zhejiang, China
| | - Wanyi Chen
- Institute of Bioengineering, College of Chemical and Biological Engineering, Zhejiang University, Hangzhou, 310027, Zhejiang, China
- ZJU-Hangzhou Global Scientific and Technological Innovation Centre, Hangzhou, 311200, Zhejiang, China
| | - Zhongji Pu
- Institute of Bioengineering, College of Chemical and Biological Engineering, Zhejiang University, Hangzhou, 310027, Zhejiang, China
- ZJU-Hangzhou Global Scientific and Technological Innovation Centre, Hangzhou, 311200, Zhejiang, China
| | - Xin Qi
- Building No.4, Zhongguancun Dongsheng International Science Park, No. 1 North Yongtaizhuang Road, Haidian District, Beijing, 100192, China
| | - Lirong Yang
- Institute of Bioengineering, College of Chemical and Biological Engineering, Zhejiang University, Hangzhou, 310027, Zhejiang, China
- ZJU-Hangzhou Global Scientific and Technological Innovation Centre, Hangzhou, 311200, Zhejiang, China
| | - Jianping Wu
- Institute of Bioengineering, College of Chemical and Biological Engineering, Zhejiang University, Hangzhou, 310027, Zhejiang, China
- ZJU-Hangzhou Global Scientific and Technological Innovation Centre, Hangzhou, 311200, Zhejiang, China
| | - Haoran Yu
- Institute of Bioengineering, College of Chemical and Biological Engineering, Zhejiang University, Hangzhou, 310027, Zhejiang, China.
- ZJU-Hangzhou Global Scientific and Technological Innovation Centre, Hangzhou, 311200, Zhejiang, China.
| |
|
15
|
Kilian M, Bischofs IB. Co-evolution at protein-protein interfaces guides inference of stoichiometry of oligomeric protein complexes by de novo structure prediction. Mol Microbiol 2023; 120:763-782. [PMID: 37777474 DOI: 10.1111/mmi.15169] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/31/2023] [Revised: 09/10/2023] [Accepted: 09/11/2023] [Indexed: 10/02/2023]
Abstract
The quaternary structure with specific stoichiometry is pivotal to the specific function of protein complexes. However, determining the structure of many protein complexes experimentally remains a major bottleneck. Structural bioinformatics approaches, such as the deep learning algorithm Alphafold2-multimer (AF2-multimer), leverage the co-evolution of amino acids and sequence-structure relationships for accurate de novo structure and contact prediction. Pseudo-likelihood maximization direct coupling analysis (plmDCA) has been used to detect co-evolving residue pairs by statistical modeling. Here, we provide evidence that combining both methods can be used for de novo prediction of the quaternary structure and stoichiometry of a protein complex. We achieve this by augmenting the existing AF2-multimer confidence metrics with an interpretable score to identify the complex with an optimal fraction of native contacts of co-evolving residue pairs at intermolecular interfaces. We use this strategy to predict the quaternary structure and non-trivial stoichiometries of Bacillus subtilis spore germination protein complexes with unknown structures. Co-evolution at intermolecular interfaces may therefore synergize with AI-based de novo quaternary structure prediction of structurally uncharacterized bacterial protein complexes.
Affiliation(s)
- Max Kilian
- Max-Planck-Institute for Terrestrial Microbiology, Marburg, Germany
- BioQuant Center for Quantitative Analysis of Molecular and Cellular Biosystems, Heidelberg University, Heidelberg, Germany
- Center for Molecular Biology of Heidelberg University (ZMBH), Heidelberg, Germany
| | - Ilka B Bischofs
- Max-Planck-Institute for Terrestrial Microbiology, Marburg, Germany
- BioQuant Center for Quantitative Analysis of Molecular and Cellular Biosystems, Heidelberg University, Heidelberg, Germany
- Center for Molecular Biology of Heidelberg University (ZMBH), Heidelberg, Germany
| |
|
16
|
Alderson TR, Pritišanac I, Kolarić Đ, Moses AM, Forman-Kay JD. Systematic identification of conditionally folded intrinsically disordered regions by AlphaFold2. Proc Natl Acad Sci U S A 2023; 120:e2304302120. [PMID: 37878721 PMCID: PMC10622901 DOI: 10.1073/pnas.2304302120] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 03/22/2023] [Accepted: 08/30/2023] [Indexed: 10/27/2023] Open
Abstract
The AlphaFold Protein Structure Database contains predicted structures for millions of proteins. For the majority of human proteins that contain intrinsically disordered regions (IDRs), which do not adopt a stable structure, it is generally assumed that these regions have low AlphaFold2 confidence scores that reflect low-confidence structural predictions. Here, we show that AlphaFold2 assigns confident structures to nearly 15% of human IDRs. By comparison to experimental NMR data for a subset of IDRs that are known to conditionally fold (i.e., upon binding or under other specific conditions), we find that AlphaFold2 often predicts the structure of the conditionally folded state. Based on databases of IDRs that are known to conditionally fold, we estimate that AlphaFold2 can identify conditionally folding IDRs at a precision as high as 88% at a 10% false positive rate, which is remarkable considering that conditionally folded IDR structures were minimally represented in its training data. We find that human disease mutations are nearly fivefold enriched in conditionally folded IDRs over IDRs in general and that up to 80% of IDRs in prokaryotes are predicted to conditionally fold, compared to less than 20% of eukaryotic IDRs. These results indicate that a large majority of IDRs in the proteomes of human and other eukaryotes function in the absence of conditional folding, but the regions that do acquire folds are more sensitive to mutations. We emphasize that the AlphaFold2 predictions do not reveal functionally relevant structural plasticity within IDRs and cannot offer realistic ensemble representations of conditionally folded IDRs.
Affiliation(s)
- T. Reid Alderson
- Department of Biochemistry, University of Toronto, Toronto, ON M5S 1A8, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada
| | - Iva Pritišanac
- Department of Cell and Systems Biology, University of Toronto, Toronto, ON M5S 35G, Canada
- Molecular Medicine Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
- Department of Molecular Biology and Biochemistry, Gottfried Schatz Research Center for Cell Signaling, Metabolism and Aging, Medical University of Graz, Graz 8010, Austria
| | - Đesika Kolarić
- Department of Molecular Biology and Biochemistry, Gottfried Schatz Research Center for Cell Signaling, Metabolism and Aging, Medical University of Graz, Graz 8010, Austria
| | - Alan M. Moses
- Department of Cell and Systems Biology, University of Toronto, Toronto, ON M5S 35G, Canada
| | - Julie D. Forman-Kay
- Department of Biochemistry, University of Toronto, Toronto, ON M5S 1A8, Canada
- Molecular Medicine Program, The Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
| |
|
17
|
Jia K, Kilinc M, Jernigan RL. New alignment method for remote protein sequences by the direct use of pairwise sequence correlations and substitutions. Frontiers in Bioinformatics 2023; 3:1227193. [PMID: 37900964 PMCID: PMC10602800 DOI: 10.3389/fbinf.2023.1227193] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2023] [Accepted: 08/14/2023] [Indexed: 10/31/2023] Open
Abstract
Understanding protein sequences and how they relate to the functions of proteins is extremely important. One of the most basic operations in bioinformatics is sequence alignment, and usually the first thing learned from an alignment is which positions are the most conserved; these are often critical parts of the structure, such as enzyme active site residues. In addition, the contact pairs in a protein usually correspond closely to the correlations between residue positions in the multiple sequence alignment, and these usually change in a systematic and coordinated way: if one position changes, then the other member of the pair also changes to compensate. In the present work, these correlated pairs are taken as anchor points for a new type of sequence alignment. The main advantage of the method is that it combines the remote homolog detection from our method PROST with pairwise sequence substitutions in the rigorous method from Kleinjung et al. We show a few examples of the resulting sequence alignments and how they can lead to improvements in alignments for function, even for a disordered protein.
Affiliation(s)
- Kejue Jia
- Roy J. Carver Department of Biochemistry, Biophysics, and Molecular Biology, Iowa State University, Ames, IA, United States
| | - Mesih Kilinc
- Roy J. Carver Department of Biochemistry, Biophysics, and Molecular Biology, Iowa State University, Ames, IA, United States
- Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA, United States
| | - Robert L. Jernigan
- Roy J. Carver Department of Biochemistry, Biophysics, and Molecular Biology, Iowa State University, Ames, IA, United States
- Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA, United States
| |
|
18
|
Schafer JW, Porter LL. Evolutionary selection of proteins with two folds. Nat Commun 2023; 14:5478. [PMID: 37673981 PMCID: PMC10482954 DOI: 10.1038/s41467-023-41237-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 02/14/2023] [Accepted: 08/24/2023] [Indexed: 09/08/2023] Open
Abstract
Although most globular proteins fold into a single stable structure, an increasing number have been shown to remodel their secondary and tertiary structures in response to cellular stimuli. State-of-the-art algorithms predict that these fold-switching proteins adopt only one stable structure, missing their functionally critical alternative folds. Why these algorithms predict a single fold is unclear, but all of them infer protein structure from coevolved amino acid pairs. Here, we hypothesize that coevolutionary signatures are being missed. Suspecting that single-fold variants could be masking these signatures, we developed an approach, called Alternative Contact Enhancement (ACE), to search both highly diverse protein superfamilies (composed of single-fold and fold-switching variants) and protein subfamilies with more fold-switching variants. ACE successfully revealed coevolution of amino acid pairs uniquely corresponding to both conformations of 56/56 fold-switching proteins from distinct families. Then, we used ACE-derived contacts to (1) predict two experimentally consistent conformations of a candidate protein with unsolved structure and (2) develop a blind prediction pipeline for fold-switching proteins. The discovery of widespread dual-fold coevolution indicates that fold-switching sequences have been preserved by natural selection, implying that their functionalities provide evolutionary advantage and paving the way for predictions of diverse protein structures from single sequences.
Affiliation(s)
- Joseph W Schafer
- National Library of Medicine, National Center for Biotechnology Information, National Institutes of Health, Bethesda, MD, 20894, USA
| | - Lauren L Porter
- National Library of Medicine, National Center for Biotechnology Information, National Institutes of Health, Bethesda, MD, 20894, USA.
- National Heart, Lung, and Blood Institute, Biochemistry and Biophysics Center, National Institutes of Health, Bethesda, MD, 20892, USA.
| |
|
19
|
Porter LL. Fluid protein fold space and its implications. Bioessays 2023; 45:e2300057. [PMID: 37431685 PMCID: PMC10529699 DOI: 10.1002/bies.202300057] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2023] [Revised: 06/21/2023] [Accepted: 06/23/2023] [Indexed: 07/12/2023]
Abstract
Fold-switching proteins, which remodel their secondary and tertiary structures in response to cellular stimuli, suggest a new view of protein fold space. For decades, experimental evidence has indicated that protein fold space is discrete: dissimilar folds are encoded by dissimilar amino acid sequences. Challenging this assumption, fold-switching proteins interconnect discrete groups of dissimilar protein folds, making protein fold space fluid. Three recent observations support the concept of fluid fold space: (1) some amino acid sequences interconvert between folds with distinct secondary structures, (2) some naturally occurring sequences have switched folds by stepwise mutation, and (3) fold switching is evolutionarily selected and likely confers advantage. These observations indicate that minor amino acid sequence modifications can transform protein structure and function. Consequently, proteomic structural and functional diversity may be expanded by alternative splicing, small nucleotide polymorphisms, post-translational modifications, and modified translation rates.
Affiliation(s)
- Lauren L. Porter
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD
- National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD
| |
|
20
|
Eme L, Tamarit D, Caceres EF, Stairs CW, De Anda V, Schön ME, Seitz KW, Dombrowski N, Lewis WH, Homa F, Saw JH, Lombard J, Nunoura T, Li WJ, Hua ZS, Chen LX, Banfield JF, John ES, Reysenbach AL, Stott MB, Schramm A, Kjeldsen KU, Teske AP, Baker BJ, Ettema TJG. Inference and reconstruction of the heimdallarchaeial ancestry of eukaryotes. Nature 2023; 618:992-999. [PMID: 37316666 PMCID: PMC10307638 DOI: 10.1038/s41586-023-06186-2] [Citation(s) in RCA: 50] [Impact Index Per Article: 50.0] [Received: 04/23/2021] [Accepted: 05/10/2023] [Indexed: 06/16/2023]
Abstract
In the ongoing debates about eukaryogenesis (the series of evolutionary events leading to the emergence of the eukaryotic cell from prokaryotic ancestors), members of the Asgard archaea play a key part as the closest archaeal relatives of eukaryotes (ref. 1). However, the nature and phylogenetic identity of the last common ancestor of Asgard archaea and eukaryotes remain unresolved (refs. 2-4). Here we analyse distinct phylogenetic marker datasets of an expanded genomic sampling of Asgard archaea and evaluate competing evolutionary scenarios using state-of-the-art phylogenomic approaches. We find that eukaryotes are placed, with high confidence, as a well-nested clade within Asgard archaea and as a sister lineage to Hodarchaeales, a newly proposed order within Heimdallarchaeia. Using sophisticated gene tree and species tree reconciliation approaches, we show that analogous to the evolution of eukaryotic genomes, genome evolution in Asgard archaea involved significantly more gene duplication and fewer gene loss events compared with other archaea. Finally, we infer that the last common ancestor of Asgard archaea was probably a thermophilic chemolithotroph and that the lineage from which eukaryotes evolved adapted to mesophilic conditions and acquired the genetic potential to support a heterotrophic lifestyle. Our work provides key insights into the prokaryote-to-eukaryote transition and a platform for better understanding the emergence of cellular complexity in eukaryotic cells.
Affiliation(s)
- Laura Eme
- Department of Cell and Molecular Biology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Laboratoire Écologie, Systématique, Évolution, CNRS, Université Paris-Saclay, AgroParisTech, Gif-sur-Yvette, France
| | - Daniel Tamarit
- Department of Cell and Molecular Biology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Laboratory of Microbiology, Wageningen University and Research, Wageningen, The Netherlands
- Department of Aquatic Sciences and Assessment, Swedish University of Agricultural Sciences, Uppsala, Sweden
- Theoretical Biology and Bioinformatics, Department of Biology, Faculty of Science, Utrecht University, Utrecht, The Netherlands
| | - Eva F Caceres
- Department of Cell and Molecular Biology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Laboratory of Microbiology, Wageningen University and Research, Wageningen, The Netherlands
| | - Courtney W Stairs
- Department of Cell and Molecular Biology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Department of Biology, Lund University, Lund, Sweden
| | - Valerie De Anda
- Department of Marine Science, Marine Science Institute, University of Texas Austin, Port Aransas, TX, USA
- Department of Integrative Biology, University of Texas Austin, Austin, TX, USA
| | - Max E Schön
- Department of Cell and Molecular Biology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
| | - Kiley W Seitz
- Department of Marine Science, Marine Science Institute, University of Texas Austin, Port Aransas, TX, USA
- Structural and Computational Biology, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Nina Dombrowski
- Department of Marine Science, Marine Science Institute, University of Texas Austin, Port Aransas, TX, USA
- Department of Marine Microbiology and Biogeochemistry, NIOZ, Royal Netherlands Institute for Sea Research, AB Den Burg, The Netherlands
| | - William H Lewis
- Department of Cell and Molecular Biology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Laboratory of Microbiology, Wageningen University and Research, Wageningen, The Netherlands
- Department of Biochemistry, University of Cambridge, Cambridge, UK
| | - Felix Homa
- Laboratory of Microbiology, Wageningen University and Research, Wageningen, The Netherlands
| | - Jimmy H Saw
- Department of Cell and Molecular Biology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Department of Biological Sciences, The George Washington University, Washington, DC, USA
| | - Jonathan Lombard
- Department of Cell and Molecular Biology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
| | - Takuro Nunoura
- Research Center for Bioscience and Nanoscience (CeBN), Japan Agency for Marine-Earth Science and Technology (JAMSTEC), Yokosuka, Japan
| | - Wen-Jun Li
- State Key Laboratory of Biocontrol, Guangdong Provincial Key Laboratory of Plant Resources and Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), School of Life Sciences, Sun Yat-Sen University, Guangzhou, PR China
| | - Zheng-Shuang Hua
- Chinese Academy of Sciences Key Laboratory of Urban Pollutant Conversion, Department of Environmental Science and Engineering, University of Science and Technology of China, Hefei, PR China
| | - Lin-Xing Chen
- Department of Earth and Planetary Sciences, University of California, Berkeley, CA, USA
| | - Jillian F Banfield
- Department of Earth and Planetary Sciences, University of California, Berkeley, CA, USA
- Department of Environmental Science, Policy, and Management, University of California, Berkeley, CA, USA
| | - Emily St John
- Department of Biology, Portland State University, Portland, OR, USA
| | | | - Matthew B Stott
- School of Biological Sciences, University of Canterbury, Christchurch, New Zealand
| | - Andreas Schramm
- Section for Microbiology, Department of Biology, Aarhus University, Aarhus, Denmark
| | - Kasper U Kjeldsen
- Section for Microbiology, Department of Biology, Aarhus University, Aarhus, Denmark
| | - Andreas P Teske
- Department of Earth, Marine and Environmental Sciences, University of North Carolina, Chapel Hill, NC, USA
| | - Brett J Baker
- Department of Marine Science, Marine Science Institute, University of Texas Austin, Port Aransas, TX, USA
- Department of Integrative Biology, University of Texas Austin, Austin, TX, USA
| | - Thijs J G Ettema
- Department of Cell and Molecular Biology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden.
- Laboratory of Microbiology, Wageningen University and Research, Wageningen, The Netherlands.
|
21
|
Pomarici ND, Cacciato R, Kokot J, Fernández-Quintero ML, Liedl KR. Evolution of the Immunoglobulin Isotypes-Variations of Biophysical Properties among Animal Classes. Biomolecules 2023; 13:801. [PMID: 37238671 PMCID: PMC10216798 DOI: 10.3390/biom13050801] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/13/2023] [Revised: 05/03/2023] [Accepted: 05/05/2023] [Indexed: 05/28/2023] Open
Abstract
The adaptive immune system arose around 500 million years ago in jawed fish, and, since then, it has mediated the immune defense against pathogens in all vertebrates. Antibodies play a central role in the immune reaction, recognizing and attacking external invaders. During the evolutionary process, several immunoglobulin isotypes emerged, each having a characteristic structural organization and dedicated function. In this work, we investigate the evolution of the immunoglobulin isotypes, in order to highlight the relevant features that were preserved over time and the parts that, instead, mutated. The residues that are coupled in the evolution process are often involved in intra- or interdomain interactions, meaning that they are fundamental to maintaining the immunoglobulin fold and to ensuring interactions with other domains. The explosive growth of available sequences allows us to point out the evolutionarily conserved residues and compare the biophysical properties among different animal classes and isotypes. Our study offers a general overview of the evolution of immunoglobulin isotypes and advances the knowledge of their characteristic biophysical properties, as a first step in guiding protein design from evolution.
Affiliation(s)
| | | | | | - Monica L. Fernández-Quintero
- Department of General, Inorganic and Theoretical Chemistry, Center for Molecular Biosciences Innsbruck (CMBI), University of Innsbruck, Innrain 80-82, A-6020 Innsbruck, Austria
| | - Klaus R. Liedl
- Department of General, Inorganic and Theoretical Chemistry, Center for Molecular Biosciences Innsbruck (CMBI), University of Innsbruck, Innrain 80-82, A-6020 Innsbruck, Austria
|
22
|
Cheng Y, Wang H, Xu H, Liu Y, Ma B, Chen X, Zeng X, Wang X, Wang B, Shiau C, Ovchinnikov S, Su XD, Wang C. Co-evolution-based prediction of metal-binding sites in proteomes by machine learning. Nat Chem Biol 2023; 19:548-555. [PMID: 36593274 DOI: 10.1038/s41589-022-01223-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 01/11/2022] [Accepted: 11/08/2022] [Indexed: 01/03/2023]
Abstract
Metal ions have various important biological roles in proteins, including structural maintenance, molecular recognition and catalysis. Previous methods of predicting metal-binding sites in proteomes were based on either sequence or structural motifs. Here we developed a co-evolution-based pipeline named 'MetalNet' to systematically predict metal-binding sites in proteomes. We applied MetalNet to proteomes of four representative prokaryotic species and predicted 4,849 potential metalloproteins, which substantially expands the currently annotated metalloproteomes. We biochemically and structurally validated previously unannotated metal-binding sites in several proteins, including apo-citrate lyase phosphoribosyl-dephospho-CoA transferase citX, an Escherichia coli enzyme lacking structural or sequence homology to any known metalloprotein (Protein Data Bank (PDB) codes: 7DCM and 7DCN). MetalNet also successfully recapitulated all known zinc-binding sites from the human spliceosome complex. The pipeline of MetalNet provides a unique and enabling tool for interrogating the hidden metalloproteome and studying metal biology.
Affiliation(s)
- Yao Cheng
- Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, Peking University, Beijing, China
- Department of Chemical Biology, College of Chemistry and Molecular Engineering, Peking University, Beijing, China
| | - Haobo Wang
- Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, Peking University, Beijing, China
- Department of Chemical Biology, College of Chemistry and Molecular Engineering, Peking University, Beijing, China
| | - Hua Xu
- State Key Laboratory of Protein and Plant Gene Research, and Biomedical Pioneering Innovation Center (BIOPIC), Peking University, Beijing, China
| | - Yuan Liu
- Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, Peking University, Beijing, China.
- Department of Chemical Biology, College of Chemistry and Molecular Engineering, Peking University, Beijing, China.
| | - Bin Ma
- Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, Peking University, Beijing, China
- Department of Chemical Biology, College of Chemistry and Molecular Engineering, Peking University, Beijing, China
| | - Xuemin Chen
- Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, Peking University, Beijing, China
- Department of Chemical Biology, College of Chemistry and Molecular Engineering, Peking University, Beijing, China
| | - Xin Zeng
- Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, China
| | - Xianghe Wang
- Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, Peking University, Beijing, China
- Department of Chemical Biology, College of Chemistry and Molecular Engineering, Peking University, Beijing, China
| | - Bo Wang
- State Key Laboratory of Protein and Plant Gene Research, and Biomedical Pioneering Innovation Center (BIOPIC), Peking University, Beijing, China
| | | | - Sergey Ovchinnikov
- John Harvard Distinguished Science Fellow, Harvard University, Cambridge, MA, USA
| | - Xiao-Dong Su
- State Key Laboratory of Protein and Plant Gene Research, and Biomedical Pioneering Innovation Center (BIOPIC), Peking University, Beijing, China.
| | - Chu Wang
- Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, Peking University, Beijing, China.
- Department of Chemical Biology, College of Chemistry and Molecular Engineering, Peking University, Beijing, China.
- Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, China.
|
23
|
Bordin N, Dallago C, Heinzinger M, Kim S, Littmann M, Rauer C, Steinegger M, Rost B, Orengo C. Novel machine learning approaches revolutionize protein knowledge. Trends Biochem Sci 2023; 48:345-359. [PMID: 36504138 PMCID: PMC10570143 DOI: 10.1016/j.tibs.2022.11.001] [Citation(s) in RCA: 20] [Impact Index Per Article: 20.0] [Received: 07/14/2022] [Revised: 10/24/2022] [Accepted: 11/17/2022] [Indexed: 12/10/2022]
Abstract
Breakthrough methods in machine learning (ML), protein structure prediction, and novel ultrafast structural aligners are revolutionizing structural biology. Obtaining accurate models of proteins and annotating their functions on a large scale is no longer limited by time and resources. The most recent method to be top ranked by the Critical Assessment of Structure Prediction (CASP) assessment, AlphaFold 2 (AF2), is capable of building structural models with an accuracy comparable to that of experimental structures. Annotations of 3D models are keeping pace with the deposition of the structures due to advancements in protein language models (pLMs) and structural aligners that help validate these transferred annotations. In this review we describe how recent developments in ML for protein science are making large-scale structural bioinformatics available to the general scientific community.
Affiliation(s)
- Nicola Bordin
- Institute of Structural and Molecular Biology, University College London, Gower St, WC1E 6BT London, UK
| | - Christian Dallago
- Technical University of Munich (TUM) Department of Informatics, Bioinformatics and Computational Biology - i12, Boltzmannstr. 3, 85748 Garching/Munich, Germany; VantAI, 151 W 42nd Street, New York, NY 10036, USA
| | - Michael Heinzinger
- Technical University of Munich (TUM) Department of Informatics, Bioinformatics and Computational Biology - i12, Boltzmannstr. 3, 85748 Garching/Munich, Germany; TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Boltzmannstr. 11, 85748 Garching, Germany
| | - Stephanie Kim
- School of Biological Sciences, Seoul National University, Seoul, South Korea; Artificial Intelligence Institute, Seoul National University, Seoul, South Korea
| | - Maria Littmann
- Technical University of Munich (TUM) Department of Informatics, Bioinformatics and Computational Biology - i12, Boltzmannstr. 3, 85748 Garching/Munich, Germany
| | - Clemens Rauer
- Institute of Structural and Molecular Biology, University College London, Gower St, WC1E 6BT London, UK
| | - Martin Steinegger
- School of Biological Sciences, Seoul National University, Seoul, South Korea; Artificial Intelligence Institute, Seoul National University, Seoul, South Korea
| | - Burkhard Rost
- Technical University of Munich (TUM) Department of Informatics, Bioinformatics and Computational Biology - i12, Boltzmannstr. 3, 85748 Garching/Munich, Germany; Institute for Advanced Study (TUM-IAS), Lichtenbergstr. 2a, 85748 Garching/Munich, Germany; TUM School of Life Sciences Weihenstephan (TUM-WZW), Alte Akademie 8, Freising, Germany
| | - Christine Orengo
- Institute of Structural and Molecular Biology, University College London, Gower St, WC1E 6BT London, UK.
|
24
|
Wang S, Lei H, Ji Z. Exploring Oxidoreductases from Extremophiles for Biosynthesis in a Non-Aqueous System. Int J Mol Sci 2023; 24:6396. [PMID: 37047370 PMCID: PMC10094897 DOI: 10.3390/ijms24076396] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2022] [Revised: 03/19/2023] [Accepted: 03/27/2023] [Indexed: 04/14/2023] Open
Abstract
Organic solvent-tolerant oxidoreductases are significant for both scientific research and biomanufacturing. However, such oxidoreductases are challenging to obtain because natural sources are scarce and they are difficult to obtain via protein modification. This review summarizes the recent advances in gene mining and structure-function studies of oxidoreductases from extremophiles for non-aqueous reaction systems. First, new strategies combining genome mining with bioinformatics provide new insights into the discovery and identification of novel extreme oxidoreductases. Second, analysis from the perspective of amino acid interaction networks explains the organic solvent tolerance mechanism, which regulates the discrete structure-functional properties of extreme oxidoreductases. Third, further study by conservation and co-evolution analysis of extreme oxidoreductases provides new perspectives and strategies for designing robust enzymes for organic media reaction systems. Furthermore, the challenges and opportunities in designing non-aqueous biocatalysis systems are highlighted.
Affiliation(s)
- Shizhen Wang
- Department of Chemical and Biochemical Engineering, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
- Xiamen Key Laboratory of Synthetic Biotechnology, Xiamen University, Xiamen 361005, China
| | - Hangbin Lei
- Department of Chemical and Biochemical Engineering, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
| | - Zhehui Ji
- Department of Chemical and Biochemical Engineering, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
|
25
|
Luan Y, Tang Z, He Y, Xie Z. Intra-Domain Residue Coevolution in Transcription Factors Contributes to DNA Binding Specificity. Microbiol Spectr 2023; 11:e0365122. [PMID: 36943132 PMCID: PMC10100741 DOI: 10.1128/spectrum.03651-22] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/09/2022] [Accepted: 02/22/2023] [Indexed: 03/23/2023] Open
Abstract
Understanding the basis of the DNA-binding specificity of transcription factors (TFs) has been of long-standing interest. Despite extensive efforts to map millions of putative TF binding sequences, identifying the critical determinants for DNA binding specificity remains a major challenge. The coevolution of residues in proteins occurs due to a shared evolutionary history. However, it is unclear how coevolving residues in TFs contribute to DNA binding specificity. Here, we systematically collected publicly available data sets from multiple large-scale high-throughput TF-DNA interaction screening experiments for the major TF families with large numbers of TF members. These families included the Homeobox, HLH, bZIP_1, Ets, HMG_box, ZF-C4, and Zn_clus TFs. We detected TF subclass-determining sites (TSDSs) and showed that the TSDSs were more likely to coevolve with other TSDSs than with non-TSDSs, particularly for the Homeobox, HLH, Ets, bZIP_1, and HMG_box TF families. By in silico modeling, we showed that mutation of the highly coevolving residues could significantly reduce the stability of the TF-DNA complex. The distant residues from the DNA interface also contributed to TF-DNA binding activity. Overall, our study gave evidence that coevolved residues relate to transcriptional regulation and provided insights into the potential application of engineered DNA-binding domains and proteins. IMPORTANCE While unraveling DNA-binding specificity of TFs is the key to understanding the basis and molecular mechanism of gene expression regulation, identifying the critical determinants that contribute to DNA binding specificity remains a major challenge. In this study, we provided evidence showing that coevolving residues in TF domains contributed to DNA binding specificity. We demonstrated that the TSDSs were more likely to coevolve with other TSDSs than with non-TSDSs. 
Mutation of the coevolving residue pairs (CRPs) could significantly reduce the stability of the TF-DNA complex, and even residues distant from the DNA interface contribute to TF-DNA binding activity. Collectively, our study expands our knowledge of the interactions among coevolved residues in TFs, tertiary contacts, and their functional importance in refined transcriptional regulation. Understanding the impact of coevolving residues in TFs will help clarify the details of transcriptional gene regulation and advance the application of engineered DNA-binding domains and proteins.
Affiliation(s)
- Yizhao Luan
- State Key Laboratory of Ophthalmology, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
| | - Zehua Tang
- State Key Laboratory of Ophthalmology, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
| | - Yao He
- State Key Laboratory of Ophthalmology, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
| | - Zhi Xie
- State Key Laboratory of Ophthalmology, Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou, China
|
26
|
Kleeorin Y, Russ WP, Rivoire O, Ranganathan R. Undersampling and the inference of coevolution in proteins. Cell Syst 2023; 14:210-219.e7. [PMID: 36693377 PMCID: PMC10911952 DOI: 10.1016/j.cels.2022.12.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2021] [Revised: 01/02/2022] [Accepted: 12/23/2022] [Indexed: 01/24/2023]
Abstract
Protein structure, function, and evolution depend on local and collective epistatic interactions between amino acids. A powerful approach to defining these interactions is to construct models of couplings between amino acids that reproduce the empirical statistics (frequencies and correlations) observed in sequences comprising a protein family. The top couplings are then interpreted. Here, we show that as currently implemented, this inference unequally represents epistatic interactions, a problem that fundamentally arises from limited sampling of sequences in the context of distinct scales at which epistasis occurs in proteins. We show that these issues explain the ability of current approaches to predict tertiary contacts between amino acids and the inability to obviously expose larger networks of functionally relevant, collectively evolving residues called sectors. This work provides a necessary foundation for more deeply understanding and improving evolution-based models of proteins.
Affiliation(s)
- Yaakov Kleeorin
- Center for Physics of Evolving Systems, Department of Biochemistry & Molecular Biology, University of Chicago, Chicago, IL 60637, USA
| | - William P Russ
- Green Center for Systems Biology, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
| | - Olivier Rivoire
- Center for Interdisciplinary Research in Biology (CIRB), Collège de France, CNRS, INSERM, PSL Research University, 75005 Paris, France.
| | - Rama Ranganathan
- Center for Physics of Evolving Systems, Department of Biochemistry & Molecular Biology, University of Chicago, Chicago, IL 60637, USA; The Pritzker School for Molecular Engineering, University of Chicago, Chicago, IL 60637, USA.
|
27
|
Jia K, Kilinc M, Jernigan RL. Functional Protein Dynamics Directly from Sequences. J Phys Chem B 2023; 127:1914-1921. [PMID: 36848294 PMCID: PMC10009744 DOI: 10.1021/acs.jpcb.2c05766] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2022] [Revised: 02/15/2023] [Indexed: 03/01/2023]
Abstract
The sequence correlations within a protein multiple sequence alignment are routinely being used to predict contacts within its structure, but here we point out that these data can also be used to predict a protein's dynamics directly. The elastic network protein dynamics models rely directly upon the contacts, and the normal modes of motion are obtained from the decomposition of the inverse of the contact map. To make the direct connection between sequence and dynamics, it is necessary to apply coarse-graining to the structure at the level of one point per amino acid, which has often been done, and protein coarse-grained dynamics from elastic network models has been highly successful, particularly in representing the large-scale motions of proteins that usually relate closely to their functions. The interesting implication of this is that it is not necessary to know the structure itself to obtain its dynamics; instead, the sequence information can be used directly to obtain the dynamics.
Affiliation(s)
- Kejue Jia
- Bioinformatics and Computational Biology Program and Roy J. Carver Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, Ames, Iowa 50011, United States
| | - Mesih Kilinc
- Bioinformatics and Computational Biology Program and Roy J. Carver Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, Ames, Iowa 50011, United States
| | - Robert L. Jernigan
- Bioinformatics and Computational Biology Program and Roy J. Carver Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, Ames, Iowa 50011, United States
|
28
|
Karamanos TK. Chasing long-range evolutionary couplings in the AlphaFold era. Biopolymers 2023; 114:e23530. [PMID: 36752285 PMCID: PMC10909459 DOI: 10.1002/bip.23530] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/24/2022] [Revised: 01/26/2023] [Accepted: 01/27/2023] [Indexed: 02/09/2023]
Abstract
Coevolution between protein residues is normally interpreted as direct contact. However, the evolutionary record of a protein sequence contains rich information that may include long-range functional couplings, couplings that report on homo-oligomeric states or even conformational changes. Due to the complexity of the sequence space and the lack of structural information on various members of a protein family, it has been difficult to effectively mine the additional information encoded in a multiple sequence alignment (MSA). Here, taking advantage of the recent release of the AlphaFold (AF) database we attempt to identify coevolutionary couplings that cannot be explained simply by spatial proximity. We propose a simple computational method that performs direct coupling analysis on a MSA and searches for couplings that are not satisfied in any of the AF models of members of the identified protein family. Application of this method on 2012 protein families suggests that ~12% of the total identified coevolving residue pairs are spatially distant and more likely to be disordered than their contacting counterparts. We expect that this analysis will help improve the quality of coevolutionary distance restraints used for structure determination and will be useful in identifying potentially functional/allosteric cross-talk between distant residues.
|
29
|
Chakravarty D, Schafer JW, Porter LL. Distinguishing features of fold-switching proteins. Protein Sci 2023; 32:e4596. [PMID: 36782353 PMCID: PMC9951197 DOI: 10.1002/pro.4596] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 12/23/2022] [Revised: 01/30/2023] [Accepted: 02/09/2023] [Indexed: 02/15/2023]
Abstract
Though many folded proteins assume one stable structure that performs one function, a small-but-increasing number remodel their secondary and tertiary structures and change their functions in response to cellular stimuli. These fold-switching proteins regulate biological processes and are associated with autoimmune dysfunction, severe acute respiratory syndrome coronavirus-2 infection, and more. Despite their biological importance, it is difficult to computationally predict fold switching. With the aim of advancing computational prediction and experimental characterization of fold switchers, this review discusses several features that distinguish fold-switching proteins from their single-fold and intrinsically disordered counterparts. First, the isolated structures of fold switchers are less stable and more heterogeneous than single folders but more stable and less heterogeneous than intrinsically disordered proteins (IDPs). Second, the sequences of single fold, fold switching, and intrinsically disordered proteins can evolve at distinct rates. Third, proteins from these three classes are best predicted using different computational techniques. Finally, late-breaking results suggest that single folders, fold switchers, and IDPs have distinct patterns of residue-residue coevolution. The review closes by discussing high-throughput and medium-throughput experimental approaches that might be used to identify new fold-switching proteins.
Affiliation(s)
- Devlina Chakravarty
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA
| | - Joseph W. Schafer
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA
| | - Lauren L. Porter
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA
- Biochemistry and Biophysics Center, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland, USA
|
30
|
Gu J, Xu Y, Nie Y. Role of distal sites in enzyme engineering. Biotechnol Adv 2023; 63:108094. [PMID: 36621725 DOI: 10.1016/j.biotechadv.2023.108094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2022] [Revised: 11/15/2022] [Accepted: 01/01/2023] [Indexed: 01/06/2023]
Abstract
The limitations associated with natural enzyme catalysis have triggered the rise of the field of protein engineering. Traditional rational design was based on the analysis of protein structural information and catalytic mechanisms to identify key active sites or ligand binding sites to reshape the substrate pocket. The role and significance of functional sites in the active center have been studied extensively. With a deeper understanding of the structure-catalysis relationship map, the entire protein molecule can be filled with residues that play a substantial role in its structure and function. However, the catalytic mechanism underlying distal mutations remains unclear. The aim of this review was to highlight the criticality of the distal site in enzyme engineering based on the following three aspects: What effects can distal mutations exert on function, as seen from the mutability landscape? How do distal sites influence enzyme function? How can distal mutations be predicted and designed? This review provides insights into the catalytic mechanism of enzymes from the global interaction network, knowledge from sequence-structure-dynamics-function relationships, and strategies for distal mutation-based protein engineering.
Affiliation(s)
- Jie Gu
- Lab of Brewing Microbiology and Applied Enzymology, School of Biotechnology and Key laboratory of Industrial Biotechnology of Ministry of Education, Jiangnan University, Wuxi 214122, China
| | - Yan Xu
- Lab of Brewing Microbiology and Applied Enzymology, School of Biotechnology and Key laboratory of Industrial Biotechnology of Ministry of Education, Jiangnan University, Wuxi 214122, China; State Key Laboratory of Food Science and Technology, Jiangnan University, Wuxi 214122, China
| | - Yao Nie
- Lab of Brewing Microbiology and Applied Enzymology, School of Biotechnology and Key laboratory of Industrial Biotechnology of Ministry of Education, Jiangnan University, Wuxi 214122, China; Suqian Industrial Technology Research Institute of Jiangnan University, Suqian 223814, China.
|
31
|
Samanta R, Sanghvi N, Beckett D, Matysiak S. Emergence of allostery through reorganization of protein residue network architecture. J Chem Phys 2023; 158:085104. [PMID: 36859102 PMCID: PMC9974213 DOI: 10.1063/5.0136010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2022] [Accepted: 02/03/2023] [Indexed: 02/09/2023] Open
Abstract
Despite more than a century of study, consensus on the molecular basis of allostery remains elusive. A comparison of allosteric and non-allosteric members of a protein family can shed light on this important regulatory mechanism, and the bacterial biotin protein ligases, which catalyze post-translational biotin addition, provide an ideal system for such comparison. While the Class I bacterial ligases only function as enzymes, the bifunctional Class II ligases use the same structural architecture for an additional transcription repression function. This additional function depends on allosterically activated homodimerization followed by DNA binding. In this work, we used experimental, computational network, and bioinformatics analyses to uncover distinguishing features that enable allostery in the Class II biotin protein ligases. Experimental studies of the Class II Escherichia coli protein indicate that catalytic site residues are critical for both catalysis and allostery. However, allostery also depends on amino acids that are more broadly distributed throughout the protein structure. Energy-based community network analysis of representative Class I and Class II proteins reveals distinct residue community architectures, interactions among the communities, and responses of the network to allosteric effector binding. Bioinformatics mutual information analyses of multiple sequence alignments indicate distinct networks of coevolving residues in the two protein families. The results support the role of divergent local residue community network structures both inside and outside of the conserved enzyme active site combined with distinct inter-community interactions as keys to the emergence of allostery in the Class II biotin protein ligases.
Affiliation(s)
- Riya Samanta
- Fischell Department of Bioengineering, University of Maryland, College Park, Maryland 20742, USA
| | - Neel Sanghvi
- Fischell Department of Bioengineering, University of Maryland, College Park, Maryland 20742, USA
| | - Dorothy Beckett
- Department of Chemistry and Biochemistry, University of Maryland, College Park, Maryland 20742, USA
| | - Silvina Matysiak
- Fischell Department of Bioengineering, University of Maryland, College Park, Maryland 20742, USA
|
32
|
Gianni S, Jemth P. Allostery Frustrates the Experimentalist. J Mol Biol 2023; 435:167934. [PMID: 36586463 DOI: 10.1016/j.jmb.2022.167934] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 10/21/2022] [Revised: 12/20/2022] [Accepted: 12/22/2022] [Indexed: 12/29/2022]
Abstract
Proteins interact with other proteins, with nucleic acids, lipids, carbohydrates and various small molecules in the living cell. These interactions have been quantified and structurally characterized in numerous studies such that we today have a comprehensive picture of protein structure and function. However, proteins are dynamic and even folded proteins are likely more heterogeneous than they appear in most descriptions. One property of proteins that relies on dynamics and heterogeneity is allostery, the ability of a protein to change structure and function upon ligand binding to an allosteric site. Over the last decades the concept of allostery was broadened to embrace all types of long-range interactions across a protein including purely entropic changes without a conformational change in single protein domains. But with this re-definition came a problem: How do we measure allostery? In this opinion, we discuss some caveats arising from the quantitative description of single-domain allostery from an experimental perspective and how the limitations cannot be separated from the definition of allostery per se. Furthermore, we attempt to tie together allostery with the concept of frustration in an effort to investigate the links between these two complex, and yet general, properties of proteins. We arrive at the conclusion that the sensitivity to perturbation of allosteric networks in single protein domains is too large for the networks to be of significant biological relevance.
Affiliation(s)
- Stefano Gianni
- Istituto Pasteur-Fondazione Cenci Bolognetti and Istituto di Biologia e Patologia Molecolari del CNR, Dipartimento di Scienze Biochimiche "A. Rossi Fanelli," Sapienza Università di Roma, 00185 Rome, Italy.
| | - Per Jemth
- Department of Medical Biochemistry and Microbiology, Uppsala University, BMC Box 582, SE-75123 Uppsala, Sweden.
| |
|
33
|
Zhang Y, Jiang Y, Gao K, Sui D, Yu P, Su M, Wei GW, Hu J. Structural insights into the elevator-type transport mechanism of a bacterial ZIP metal transporter. Nat Commun 2023; 14:385. [PMID: 36693843 PMCID: PMC9873690 DOI: 10.1038/s41467-023-36048-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2022] [Accepted: 01/13/2023] [Indexed: 01/26/2023] Open
Abstract
The Zrt-/Irt-like protein (ZIP) family consists of ubiquitously expressed divalent metal transporters critically involved in maintaining systemic and cellular homeostasis of zinc, iron, and manganese. Here, we present a study on a prokaryotic ZIP from Bordetella bronchiseptica (BbZIP) by combining structural biology, evolutionary covariance, computational modeling, and a variety of biochemical assays to tackle the issue of the transport mechanism which has not been established for the ZIP family. The apo state structure in an inward-facing conformation revealed a disassembled transport site, altered inter-helical interactions, and importantly, a rigid body movement of a 4-transmembrane helix (TM) bundle relative to the other TMs. The computationally generated and biochemically validated outward-facing conformation model revealed a slide of the 4-TM bundle, which carries the transport site(s), by approximately 8 Å toward the extracellular side against the static TMs which mediate dimerization. These findings allow us to conclude that BbZIP is an elevator-type transporter.
Affiliation(s)
- Yao Zhang
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, USA
| | - Yuhan Jiang
- Department of Chemistry, Michigan State University, East Lansing, MI, USA
| | - Kaifu Gao
- Department of Mathematics, Michigan State University, East Lansing, MI, USA
| | - Dexin Sui
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, USA
| | - Peixuan Yu
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, USA
| | - Min Su
- Life Sciences Institute, University of Michigan, Ann Arbor, MI, USA
| | - Guo-Wei Wei
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, USA
- Department of Mathematics, Michigan State University, East Lansing, MI, USA
| | - Jian Hu
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, USA.
- Department of Chemistry, Michigan State University, East Lansing, MI, USA.
| |
|
34
|
Tresnak DT, Hackel BJ. Deep Antimicrobial Activity and Stability Analysis Inform Lysin Sequence-Function Mapping. ACS Synth Biol 2023; 12:249-264. [PMID: 36599162 PMCID: PMC10822705 DOI: 10.1021/acssynbio.2c00509] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 01/05/2023]
Abstract
Antibiotic-resistant infectious disease is a critical challenge to human health. Antimicrobial proteins offer a compelling solution if engineered for potency, selectivity, and physiological stability. Lysins, which lyse cells via degradation of cell wall peptidoglycans, have significant potential to fill this role. Yet, the functional complexity of antimicrobial activity has hindered high-throughput characterization for discovery and design. To dramatically expand knowledge of the sequence-function landscape of lysins, we developed a depletion-based assay for library-scale measurement of lysin inhibitory activity. We coupled this platform with a high-throughput proteolytic stability assay to assess the activity and stability of ∼5 × 10^4 lysin catalytic domain variants, resulting in the discovery of a variant with increased activity (70 ± 20%) and stability (7.2 ± 0.4 °C increased midpoint of thermal denaturation). Ridge regression of the resulting data set demonstrated that libraries with a higher average Hamming distance better informed pairwise models and that coupling activity and stability assays enabled better prediction of catalytically active lysins. The best models achieved Pearson's correlation coefficients of 0.87 ± 0.01 and 0.61 ± 0.04 for predicting catalytic domain stability and activity, respectively. Our work provides an efficient strategy for constructing protein sequence-function landscapes, drastically increases screening throughput for engineering lysins, and yields promising lysins for further development.
Affiliation(s)
- Daniel T. Tresnak
- Department of Chemical Engineering and Materials Science, University of Minnesota – Twin Cities, 421 Washington Avenue SE, Minneapolis, MN 55455
| | - Benjamin J. Hackel
- Department of Chemical Engineering and Materials Science, University of Minnesota – Twin Cities, 421 Washington Avenue SE, Minneapolis, MN 55455
| |
|
35
|
Schafer JW, Porter LL. Evolutionary selection of proteins with two folds. bioRxiv 2023:2023.01.18.524637. [PMID: 36789442 PMCID: PMC9928049 DOI: 10.1101/2023.01.18.524637] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 01/22/2023]
Abstract
Although most globular proteins fold into a single stable structure [1], an increasing number have been shown to remodel their secondary and tertiary structures in response to cellular stimuli [2]. State-of-the-art algorithms [3-5] predict that these fold-switching proteins assume only one stable structure [6,7], missing their functionally critical alternative folds. Why these algorithms predict a single fold is unclear, but all of them infer protein structure from coevolved amino acid pairs. Here, we hypothesize that coevolutionary signatures are being missed. Suspecting that over-represented single-fold sequences may be masking these signatures, we developed an approach to search both highly diverse protein superfamilies (composed of single-fold and fold-switching variants) and protein subfamilies with more fold-switching variants. This approach successfully revealed coevolution of amino acid pairs uniquely corresponding to both conformations of 56/58 fold-switching proteins from distinct families. Then, using a set of coevolved amino acid pairs predicted by our approach, we successfully biased AlphaFold2 [5] to predict two experimentally consistent conformations of a candidate protein with unsolved structure. The discovery of widespread dual-fold coevolution indicates that fold-switching sequences have been preserved by natural selection, implying that their functionalities provide evolutionary advantage and paving the way for predictions of diverse protein structures from single sequences.
Affiliation(s)
- Joseph W. Schafer
- National Library of Medicine, National Center for Biotechnology Information, National Institutes of Health, Bethesda, MD 20894, USA
| | - Lauren L. Porter
- National Library of Medicine, National Center for Biotechnology Information, National Institutes of Health, Bethesda, MD 20894, USA
- National Heart, Lung, and Blood Institute, Biochemistry and Biophysics Center, National Institutes of Health, Bethesda, MD 20892, USA
| |
|
36
|
Newman KE, Tindall SN, Mader SL, Khalid S, Thomas GH, Van Der Woude MW. A novel fold for acyltransferase-3 (AT3) proteins provides a framework for transmembrane acyl-group transfer. eLife 2023; 12:e81547. [PMID: 36630168 PMCID: PMC9833829 DOI: 10.7554/elife.81547] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2022] [Accepted: 12/04/2022] [Indexed: 01/12/2023] Open
Abstract
Acylation of diverse carbohydrates occurs across all domains of life and can be catalysed by proteins with a membrane bound acyltransferase-3 (AT3) domain (PF01757). In bacteria, these proteins are essential in processes including symbiosis, resistance to viruses and antimicrobials, and biosynthesis of antibiotics, yet their structure and mechanism are largely unknown. In this study, evolutionary co-variance analysis was used to build a computational model of the structure of a bacterial O-antigen modifying acetyltransferase, OafB. The resulting structure exhibited a novel fold for the AT3 domain, which molecular dynamics simulations demonstrated is stable in the membrane. The AT3 domain contains 10 transmembrane helices arranged to form a large cytoplasmic cavity lined by residues known to be essential for function. Further molecular dynamics simulations support a model where the acyl-CoA donor spans the membrane through accessing a pore created by movement of an important loop capping the inner cavity, enabling OafB to present the acetyl group close to the likely catalytic residues on the extracytoplasmic surface. Limited but important interactions with the fused SGNH domain in OafB are identified, and modelling suggests this domain is mobile and can both accept acyl-groups from the AT3 and then extend beyond the membrane to reach acceptor substrates. Together this new general model of AT3 function provides a framework for the development of inhibitors that could abrogate critical functions of bacterial pathogens.
Affiliation(s)
- Kahlan E Newman
- School of Chemistry, University of Southampton, Southampton, United Kingdom
| | - Sarah N Tindall
- Department of Biology and the York Biomedical Research Institute, University of York, York, United Kingdom
| | - Sophie L Mader
- Department of Biochemistry, University of Oxford, Oxford, United Kingdom
| | - Syma Khalid
- Department of Biochemistry, University of Oxford, Oxford, United Kingdom
| | - Gavin H Thomas
- Department of Biology and the York Biomedical Research Institute, University of York, York, United Kingdom
| | - Marjan W Van Der Woude
- Hull York Medical School and the York Biomedical Research Institute, University of York, York, United Kingdom
| |
|
37
|
Hoehe MR, Herwig R. Analysis of 1276 Haplotype-Resolved Genomes Allows Characterization of Cis- and Trans-Abundant Genes. Methods Mol Biol 2023; 2590:237-272. [PMID: 36335503 DOI: 10.1007/978-1-0716-2819-5_15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Many methods for haplotyping have materialized, but their application on a significant scale has been rare to date. Here we summarize analyses that were carried out in 1092 genomes from the 1000 Genomes Consortium and validated in an unprecedented number of 184 PGP genomes that have been experimentally haplotype-resolved by application of the Long-Fragment Read (LFR) technology. These analyses provided first insights into the diplotypic nature of human genomes and its potential functional implications. Thus, protein-changing variants were not randomly distributed between the two homologues of 18,121 autosomal protein-coding genes but occurred significantly more frequently in cis than in trans configurations in virtually each of the 1276 phased genomes. This resulted in global cis/trans ratios of ~60:40, establishing "cis abundance" as a universal characteristic of diploid human genomes. This phenomenon was based on two different classes of genes, a larger one exhibiting cis configurations of protein-changing variants in excess, so-called "cis-abundant" genes, and a smaller one of "trans-abundant" genes. These two gene classes, which together constitute a common diplotypic exome, were further functionally distinguished by means of gene ontology (GO) and pathway enrichment analysis. Moreover, they were distinguishable in terms of their effects on the human interactome, where they constitute distinct cis and trans modules, as shown with network propagation on a large integrated protein-protein interaction network. These analyses, recently performed with updated database and analysis tools, further consolidated the characterization of cis- and trans-abundant genes while expanding previous results. In this chapter, we present the key results along with the materials and methods to motivate readers to investigate these findings independently and gain further insights into the diplotypic nature of genes and genomes.
Affiliation(s)
- Margret R Hoehe
- Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany.
| | - Ralf Herwig
- Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany
| |
|
38
|
Weaver RJ, Rabinowitz S, Thueson K, Havird JC. Genomic Signatures of Mitonuclear Coevolution in Mammals. Mol Biol Evol 2022; 39:6775223. [PMID: 36288802 PMCID: PMC9641969 DOI: 10.1093/molbev/msac233] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 11/14/2022] Open
Abstract
Mitochondrial (mt) and nuclear-encoded proteins are integrated in aerobic respiration, requiring co-functionality among gene products from fundamentally different genomes. Different evolutionary rates, inheritance mechanisms, and selection pressures set the stage for incompatibilities between interacting products of the two genomes. The mitonuclear coevolution hypothesis posits that incompatibilities may be avoided if evolution in one genome selects for complementary changes in interacting genes encoded by the other genome. Nuclear compensation, in which deleterious mtDNA changes are offset by compensatory nuclear changes, is often invoked as the primary mechanism for mitonuclear coevolution. Yet, direct evidence supporting nuclear compensation is rare. Here, we used data from 58 mammalian species representing eight orders to show strong correlations between evolutionary rates of mt and nuclear-encoded mt-targeted (N-mt) proteins, but not between mt and non-mt-targeted nuclear proteins, providing strong support for mitonuclear coevolution across mammals. N-mt genes with direct mt interactions also showed the strongest correlations. Although most N-mt genes had elevated dN/dS ratios compared to mt genes (as predicted under nuclear compensation), N-mt sites in close contact with mt proteins were not overrepresented for signs of positive selection compared to noncontact N-mt sites (contrary to predictions of nuclear compensation). Furthermore, temporal patterns of N-mt and mt amino acid substitutions did not support predictions of nuclear compensation, even in positively selected, functionally important residues with direct mitonuclear contacts. Overall, our results strongly support mitonuclear coevolution across ∼170 million years of mammalian evolution but fail to support nuclear compensation as the major mode of mitonuclear coevolution.
Affiliation(s)
- Ryan J Weaver
- Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA
- Department of Natural Resource Ecology and Management, Iowa State University, Ames, IA
| | | | - Kiley Thueson
- Department of Integrative Biology, University of Texas, Austin, TX
| | - Justin C Havird
- Department of Integrative Biology, University of Texas, Austin, TX
| |
|
39
|
Zhang H, Xu MS, Fan X, Chung WK, Shen Y. Predicting functional effect of missense variants using graph attention neural networks. Nat Mach Intell 2022; 4:1017-1028. [PMID: 37484202 PMCID: PMC10361701 DOI: 10.1038/s42256-022-00561-w] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 07/09/2021] [Accepted: 10/07/2022] [Indexed: 11/16/2022]
Abstract
Accurate prediction of damaging missense variants is critically important for interpreting a genome sequence. Although many methods have been developed, their performance has been limited. Recent advances in machine learning and the availability of large-scale population genomic sequencing data provide new opportunities to considerably improve computational predictions. Here we describe the graphical missense variant pathogenicity predictor (gMVP), a new method based on graph attention neural networks. Its main component is a graph with nodes that capture predictive features of amino acids and edges weighted by co-evolution strength, enabling effective pooling of information from the local protein context and functionally correlated distal positions. Evaluation of deep mutational scan data shows that gMVP outperforms other published methods in identifying damaging variants in TP53, PTEN, BRCA1 and MSH2. Furthermore, it achieves the best separation of de novo missense variants in neurodevelopmental disorder cases from those in controls. Finally, the model supports transfer learning to optimize gain- and loss-of-function predictions in sodium and calcium channels. In summary, we demonstrate that gMVP can improve interpretation of missense variants in clinical testing and genetic studies.
Affiliation(s)
- Haicang Zhang
- Department of Systems Biology, Columbia University, New York, NY, USA
| | | | - Xiao Fan
- Department of Systems Biology, Columbia University, New York, NY, USA
- Department of Pediatrics, Columbia University, New York, NY, USA
| | - Wendy K. Chung
- Department of Pediatrics, Columbia University, New York, NY, USA
- Department of Medicine, Columbia University, New York, NY, USA
| | - Yufeng Shen
- Department of Systems Biology, Columbia University, New York, NY, USA
- Department of Biomedical Informatics, Columbia University, New York, NY, USA
- JP Sulzberger Columbia Genome Center, Columbia University, New York, NY, USA
| |
|
40
|
Ahmed S, Chattopadhyay G, Manjunath K, Bhasin M, Singh N, Rasool M, Das S, Rana V, Khan N, Mitra D, Asok A, Singh R, Varadarajan R. Combining cysteine scanning with chemical labeling to map protein-protein interactions and infer bound structure in an intrinsically disordered region. Front Mol Biosci 2022; 9:997653. [PMID: 36275627 PMCID: PMC9585320 DOI: 10.3389/fmolb.2022.997653] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2022] [Accepted: 09/12/2022] [Indexed: 11/13/2022] Open
Abstract
The Mycobacterium tuberculosis genome harbours nine toxin-antitoxin (TA) systems of the mazEF family. These consist of two proteins, a toxin and an antitoxin, encoded in an operon. While the toxin has a conserved fold, the antitoxins are structurally diverse and the toxin binding region is typically intrinsically disordered before binding. We describe high throughput methodology for accurate mapping of interfacial residues and apply it to three MazEF complexes. The method involves screening one partner protein against a panel of chemically masked single cysteine mutants of its interacting partner, displayed on the surface of yeast cells. Such libraries have much lower diversity than those generated by saturation mutagenesis, simplifying library generation and data analysis. Further, because of the steric bulk of the masking reagent, labeling of virtually all exposed epitope residues should result in loss of binding, and buried residues are inaccessible to the labeling reagent. The binding residues are deciphered by probing the loss of binding to the labeled cognate partner by flow cytometry. Using this methodology, we have identified the interfacial residues for MazEF3, MazEF6 and MazEF9 TA systems of M. tuberculosis. In the case of MazEF9, where a crystal structure was available, there was excellent agreement between our predictions and the crystal structure, superior to those with AlphaFold2. We also report detailed biophysical characterization of the MazEF3 and MazEF9 TA systems and measured the relative affinities between cognate and non-cognate toxin–antitoxin partners in order to probe possible cross-talk between these systems.
Affiliation(s)
- Shahbaz Ahmed
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India
| | | | | | - Munmun Bhasin
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India
| | - Neelam Singh
- Tuberculosis Research Laboratory, Translational Health Science and Technology Institute, Faridabad, India
| | - Mubashir Rasool
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India
| | - Sayan Das
- Tuberculosis Research Laboratory, Translational Health Science and Technology Institute, Faridabad, India
| | - Varsha Rana
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India
| | - Neha Khan
- Tuberculosis Research Laboratory, Translational Health Science and Technology Institute, Faridabad, India
| | - Debarghya Mitra
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India
| | - Aparna Asok
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India
| | - Ramandeep Singh
- Tuberculosis Research Laboratory, Translational Health Science and Technology Institute, Faridabad, India
| | - Raghavan Varadarajan
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India
- *Correspondence: Raghavan Varadarajan,
| |
|
41
|
Sen N, Anishchenko I, Bordin N, Sillitoe I, Velankar S, Baker D, Orengo C. Characterizing and explaining the impact of disease-associated mutations in proteins without known structures or structural homologs. Brief Bioinform 2022; 23:bbac187. [PMID: 35641150 PMCID: PMC9294430 DOI: 10.1093/bib/bbac187] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 12/05/2021] [Revised: 04/23/2022] [Accepted: 04/27/2022] [Indexed: 12/12/2022] Open
Abstract
Mutations in human proteins lead to diseases. The structure of these proteins can help understand the mechanism of such diseases and develop therapeutics against them. With improved deep learning techniques, such as RoseTTAFold and AlphaFold, we can predict the structure of proteins even in the absence of structural homologs. We modeled and extracted the domains from 553 disease-associated human proteins without known protein structures or close homologs in the Protein Data Bank. We noticed that the model quality was higher and the root mean square deviation (RMSD) lower between AlphaFold and RoseTTAFold models for domains that could be assigned to CATH families as compared to those which could only be assigned to Pfam families of unknown structure or could not be assigned to either. We predicted ligand-binding sites, protein-protein interfaces and conserved residues in these predicted structures. We then explored whether the disease-associated missense mutations were in the proximity of these predicted functional sites, whether they destabilized the protein structure based on ddG calculations or whether they were predicted to be pathogenic. We could explain 80% of these disease-associated mutations based on proximity to functional sites, structural destabilization or pathogenicity. When compared to polymorphisms, a larger percentage of disease-associated missense mutations were buried, closer to predicted functional sites, predicted as destabilizing and pathogenic. Usage of models from the two state-of-the-art techniques provides better confidence in our predictions, and we explain 93 additional mutations based on RoseTTAFold models which could not be explained based solely on AlphaFold models.
Affiliation(s)
- Neeladri Sen
- Institute of Structural and Molecular Biology, University College London, London, WC1E 6BT, UK
| | - Ivan Anishchenko
- Department of Biochemistry, University of Washington, Seattle, WA 98195, USA
- Institute for Protein Design, University of Washington, Seattle, WA 98195, USA
| | - Nicola Bordin
- Institute of Structural and Molecular Biology, University College London, London, WC1E 6BT, UK
| | - Ian Sillitoe
- Institute of Structural and Molecular Biology, University College London, London, WC1E 6BT, UK
| | - Sameer Velankar
- Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - David Baker
- Department of Biochemistry, University of Washington, Seattle, WA 98195, USA
- Institute for Protein Design, University of Washington, Seattle, WA 98195, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
| | - Christine Orengo
- Institute of Structural and Molecular Biology, University College London, London, WC1E 6BT, UK
| |
|
42
|
Abstract
Bacterial conjugation is the fundamental process of unidirectional transfer of DNAs, often plasmid DNAs, from a donor cell to a recipient cell [1]. It is the primary means by which antibiotic resistance genes spread among bacterial populations [2,3]. In Gram-negative bacteria, conjugation is mediated by a large transport apparatus, the conjugative type IV secretion system (T4SS), produced by the donor cell and embedded in both its outer and inner membranes. The T4SS also elaborates a long extracellular filament, the conjugative pilus, that is essential for DNA transfer [4,5]. Here we present a high-resolution cryo-electron microscopy (cryo-EM) structure of a 2.8 megadalton T4SS complex composed of 92 polypeptides representing 8 of the 10 essential T4SS components involved in pilus biogenesis. We added the two remaining components to the structural model using co-evolution analysis of protein interfaces, to enable the reconstitution of the entire system including the pilus. This structure describes the exceptionally large protein–protein interaction network required to assemble the many components that constitute a T4SS and provides insights on the unique mechanism by which they elaborate pili. Cryo-electron microscopy structures of a 2.8 megadalton bacterial type IV secretion system encoded by the plasmid R388 and comprising 92 polypeptides provide insights into the stepwise mechanism of pilus assembly.
|
43
|
Colman DR, Labesse G, Swapna G, Stefanakis J, Montelione GT, Boyd ES, Royer CA. Structural evolution of the ancient enzyme, dissimilatory sulfite reductase. Proteins 2022; 90:1331-1345. [PMID: 35122336 PMCID: PMC9018543 DOI: 10.1002/prot.26315] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/24/2022] [Accepted: 01/29/2022] [Indexed: 07/21/2023]
Abstract
Dissimilatory sulfite reductase is an ancient enzyme that has linked the global sulfur and carbon biogeochemical cycles since at least 3.47 Gya. While much has been learned about the phylogenetic distribution and diversity of DsrAB across environmental gradients, far less is known about the structural changes that occurred to maintain DsrAB function as the enzyme accompanied diversification of sulfate/sulfite reducing organisms (SRO) into new environments. Analyses of available crystal structures of DsrAB from Archaeoglobus fulgidus and Desulfovibrio vulgaris, representing early and late evolving lineages, respectively, show that certain features of DsrAB are structurally conserved, including active siro-heme binding motifs. Whether such structural features are conserved among DsrAB recovered from varied environments, including hot spring environments that host representatives of the earliest evolving SRO lineage (e.g., MV2-Eury), is not known. To begin to overcome these gaps in our understanding of the evolution of DsrAB, structural models from MV2-Eury were generated and evolutionary sequence co-variance analyses were conducted on a curated DsrAB database. Phylogenetically diverse DsrAB harbor many conserved functional residues including those that ligate active siro-heme(s). However, evolutionary co-variance analysis of monomeric DsrAB subunits revealed several False Positive Evolutionary Couplings (FPEC) that correspond to residues that have co-evolved despite being too spatially distant in the monomeric structure to allow for direct contact. One set of FPECs corresponds to residues that form a structural path between the two active siro-heme moieties across the interface between heterodimers, suggesting the potential for allostery or electron transfer within the enzyme complex. Other FPECs correspond to structural loops and gaps that may have been selected to stabilize enzyme function in different environments. These structural bioinformatics results suggest that DsrAB has maintained allosteric communication pathways between subunits as SRO diversified into new environments. The observations outlined here provide a framework for future biochemical and structural analyses of DsrAB to examine potential allosteric control of this enzyme.
Affiliation(s)
- Daniel R. Colman
- Department of Microbiology and Cell Biology, Montana State University, Bozeman, Montana 59717
| | - Gilles Labesse
- Centre de Biochimie Structurale, CNRS UMR 5048, Montpellier, France 34090
| | - G.V.T. Swapna
- Dept of Biochemistry and Molecular Biology, Robert Wood Johnson Medical School, Rutgers The State University of New Jersey, Piscataway, NJ, 08854 USA
| | | | - Gaetano T. Montelione
- Department of Chemistry and Chemical Biology, and Center for Biotechnology and Interdisciplinary Studies, Rensselaer Polytechnic Institute, Troy, NY 12180
| | - Eric S. Boyd
- Department of Microbiology and Cell Biology, Montana State University, Bozeman, Montana 59717
| | - Catherine A. Royer
- Department of Biological Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180
| |
Collapse
|
44
|
Weissenow K, Heinzinger M, Rost B. Protein language-model embeddings for fast, accurate, and alignment-free protein structure prediction. Structure 2022; 30:1169-1177.e4. [DOI: 10.1016/j.str.2022.05.001] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/09/2021] [Revised: 02/25/2022] [Accepted: 04/29/2022] [Indexed: 01/27/2023]
|
45
|
Wu M, Lv K, Li J, Wu B, He B. Coevolutionary analysis reveals a distal amino acid residue pair affecting the catalytic activity of GH5 processive endoglucanase from Bacillus subtilis BS-5. Biotechnol Bioeng 2022; 119:2105-2114. [PMID: 35438195 DOI: 10.1002/bit.28113] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/11/2022] [Revised: 04/05/2022] [Accepted: 04/08/2022] [Indexed: 11/06/2022]
Abstract
EG5C-1, processive endoglucanase from Bacillus subtilis, is a typical bifunctional cellulase with endoglucanase and exoglucanase activities. The engineering of processive endoglucanase focuses on the catalytic pocket or carbohydrate-binding module tailoring based on sequence/structure information. Herein, a computational strategy was applied to identify the desired mutants in the enzyme molecule by evolutionary coupling analysis; subsequently, four residue pairs were selected as evolutionary mutational hotspots. Based on iterative-saturation mutagenesis and subsequent enzymatic activity analysis, a superior mutant K51T/L93T was identified away from the active center. This variant had increased specific activity from 4170 U/µmol of wild-type (WT) to 5678 U/µmol towards CMC-Na and an increase towards the substrate Avicel from 320 U/µmol in WT to 521 U/µmol. In addition, kinetic measurements suggested that superior mutant K51T/L93T had a high substrate affinity (Km) and a remarkable improvement in catalytic efficiency (kcat/Km). Furthermore, molecular dynamics simulations revealed that the K51T/L93T mutation altered the spatial conformation at the active site cleft, enhancing the interaction frequency between active site residues and substrate, improving catalytic efficiency and substrate affinity. The current studies provided some perspectives on the effects of distal residue substitution, which might assist in the engineering of processive endoglucanase or other glycoside hydrolases.
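To put the specific activities quoted in this abstract on a common scale, the implied fold changes can be computed directly from the four reported values. This is a minimal sketch; the helper name `fold_change` is illustrative and does not come from the paper.

```python
def fold_change(wild_type: float, mutant: float) -> float:
    """Return mutant activity relative to wild type (1.0 means no change)."""
    if wild_type <= 0:
        raise ValueError("wild-type activity must be positive")
    return mutant / wild_type

# Specific activities in U/µmol (wild type, K51T/L93T) as quoted in the abstract.
activities = {
    "CMC-Na": (4170.0, 5678.0),
    "Avicel": (320.0, 521.0),
}

fold_changes = {
    substrate: fold_change(wt, mutant)
    for substrate, (wt, mutant) in activities.items()
}

for substrate, fc in fold_changes.items():
    print(f"{substrate}: {fc:.2f}-fold ({(fc - 1) * 100:.0f}% increase)")
```

The reported numbers correspond to roughly a 1.36-fold gain on CMC-Na and a 1.63-fold gain on Avicel, so the relative improvement is larger on the crystalline substrate.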
Affiliation(s)
- Mujunqi Wu
  - College of Biotechnology and Pharmaceutical Engineering, Nanjing Tech University, 30 Puzhunan road, Nanjing, 211816, Jiangsu, China
- Kemin Lv
  - College of Biotechnology and Pharmaceutical Engineering, Nanjing Tech University, 30 Puzhunan road, Nanjing, 211816, Jiangsu, China
- Jiahuang Li
  - School of Biopharmacy, China Pharmaceutical University, Nanjing, 211198, Jiangsu, China
- Bin Wu
  - College of Biotechnology and Pharmaceutical Engineering, Nanjing Tech University, 30 Puzhunan road, Nanjing, 211816, Jiangsu, China
- Bingfang He
  - School of Pharmaceutical Sciences, Nanjing Tech University, 30 Puzhunan road, Nanjing, 211816, Jiangsu, China
|
46
|
Do HN, Haldane A, Levy RM, Miao Y. Unique features of different classes of G-protein-coupled receptors revealed from sequence coevolutionary and structural analysis. Proteins 2022; 90:601-614. [PMID: 34599827 PMCID: PMC8738117 DOI: 10.1002/prot.26256] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 07/28/2021] [Revised: 09/21/2021] [Accepted: 09/27/2021] [Indexed: 02/03/2023]
Abstract
G-protein-coupled receptors (GPCRs) are the largest family of human membrane proteins and represent the primary targets of about one third of currently marketed drugs. Despite the critical importance, experimental structures have been determined for only a limited portion of GPCRs and functional mechanisms of GPCRs remain poorly understood. Here, we have constructed novel sequence coevolutionary models of the A and B classes of GPCRs and compared them with residue contact frequency maps generated with available experimental structures. Significant portions of structural residue contacts were successfully detected in the sequence-based covariational models. "Exception" residue contacts predicted from sequence coevolutionary models but not available structures added missing links that were important for GPCR activation and allosteric modulation. Moreover, we identified distinct residue contacts involving different sets of functional motifs for GPCR activation, such as the Na+ pocket, CWxP, DRY, PIF, and NPxxY motifs in the class A and the HETx and PxxG motifs in the class B. Finally, we systematically uncovered critical residue contacts tuned by allosteric modulation in the two classes of GPCRs, including those from the activation motifs and particularly the extracellular and intracellular loops in class A GPCRs. These findings provide a promising framework for rational design of ligands to regulate GPCR activation and allosteric modulation.
Affiliation(s)
- Hung N Do
  - The Center for Computational Biology and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas 66047
- Allan Haldane (corresponding author)
  - Department of Chemistry, Center for Biophysics and Computational Biology, Institute for Computational Molecular Science, Temple University, Philadelphia, Pennsylvania 19122
- Ronald M Levy
  - Department of Chemistry, Center for Biophysics and Computational Biology, Institute for Computational Molecular Science, Temple University, Philadelphia, Pennsylvania 19122
- Yinglong Miao (corresponding author)
  - The Center for Computational Biology and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas 66047
|
47
|
Belcher Dufrisne M, Swope N, Kieber M, Yang JY, Han J, Li J, Moremen KW, Prestegard JH, Columbus L. Human CEACAM1 N-domain dimerization is independent from glycan modifications. Structure 2022; 30:658-670.e5. [PMID: 35219398 PMCID: PMC9081242 DOI: 10.1016/j.str.2022.02.003] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/25/2021] [Revised: 11/15/2021] [Accepted: 02/01/2022] [Indexed: 12/31/2022]
Abstract
Carcinoembryonic cellular adhesion molecules (CEACAMs) serve diverse roles in cell signaling, proliferation, and survival and are made up of one or several immunoglobulin (Ig)-like ectodomains glycosylated in vivo. The physiological oligomeric state and how it contributes to protein function are central to understanding CEACAMs. Two putative dimer conformations involving different CEACAM1 N-terminal Ig-like domain (CCM1) protein faces (ABED and GFCC'C″) were identified from crystal structures. GFCC'C″ was identified as the dominant CCM1 solution dimer, but ambiguity regarding the effect of glycosylation on dimer formation calls its physiological relevance into question. We present the first crystal structure of minimally glycosylated CCM1 in the GFCC'C″ dimer conformation and characterization in solution by continuous-wave and double electron-electron resonance electron paramagnetic resonance spectroscopy. Our results suggest the GFCC'C″ dimer is dominant in solution with different levels of glycosylation, and structural conservation and co-evolved residues support that the GFCC'C″ dimer is conserved across CEACAMs.
|
48
|
Hot spots-making directed evolution easier. Biotechnol Adv 2022; 56:107926. [DOI: 10.1016/j.biotechadv.2022.107926] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/24/2021] [Revised: 01/04/2022] [Accepted: 02/07/2022] [Indexed: 01/20/2023]
|
49
|
Si Y, Zhang Y, Yan C. A reproducibility analysis-based statistical framework for residue-residue evolutionary coupling detection. Brief Bioinform 2022; 23:6509046. [PMID: 35037015 DOI: 10.1093/bib/bbab576] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/12/2021] [Revised: 11/26/2021] [Accepted: 12/15/2021] [Indexed: 11/14/2022] Open
Abstract
Direct coupling analysis (DCA) has been widely used to infer evolutionary coupled residue pairs from the multiple sequence alignment (MSA) of homologous sequences. However, effectively selecting residue pairs with significant evolutionary couplings according to the result of DCA is a non-trivial task. In this study, we developed a general statistical framework for significant evolutionary coupling detection, referred to as irreproducible discovery rate (IDR)-DCA, which is based on reproducibility analysis of the coupling scores obtained from DCA on manually created MSA replicates. IDR-DCA was applied to select residue pairs for contact prediction for monomeric proteins, protein-protein interactions and monomeric RNAs, in which three different versions of DCA were applied. We demonstrated that with the application of IDR-DCA, the residue pairs selected using a universal threshold always yielded stable performance for contact prediction. Comparing with the application of carefully tuned coupling score cutoffs, IDR-DCA always showed better performance. The robustness of IDR-DCA was also supported through the MSA downsampling analysis. We further demonstrated the effectiveness of applying constraints obtained from residue pairs selected by IDR-DCA to assist RNA secondary structure prediction.
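The core intuition in this abstract, keeping only residue pairs whose coupling scores are reproducible across alignment replicates, can be shown with a much simpler stand-in than the IDR procedure itself. The sketch below is not IDR-DCA: it selects pairs that rank in the top k of two replicates, and the names `top_k` and `reproducible_pairs`, the toy scores, and the choice of k are all assumptions for illustration.

```python
from typing import Dict, List, Tuple

Pair = Tuple[int, int]

def top_k(scores: Dict[Pair, float], k: int) -> List[Pair]:
    """Return the k residue pairs with the highest coupling scores."""
    return sorted(scores, key=lambda pair: scores[pair], reverse=True)[:k]

def reproducible_pairs(rep_a: Dict[Pair, float],
                       rep_b: Dict[Pair, float],
                       k: int) -> List[Pair]:
    """Keep pairs ranked in the top k of BOTH replicates, best mean score first."""
    shared = set(top_k(rep_a, k)) & set(top_k(rep_b, k))
    return sorted(shared, key=lambda pair: (rep_a[pair] + rep_b[pair]) / 2,
                  reverse=True)

# Hypothetical coupling scores from two replicates of the same alignment.
rep_a = {(1, 10): 0.90, (2, 20): 0.80, (3, 30): 0.20, (4, 40): 0.70}
rep_b = {(1, 10): 0.85, (2, 20): 0.10, (3, 30): 0.30, (4, 40): 0.75}

selected = reproducible_pairs(rep_a, rep_b, k=2)
print(selected)
```

Here the pair (2, 20) scores highly in one replicate only and is dropped, while (1, 10) is retained because it ranks highly in both. A full IDR analysis replaces the hard top-k cutoff with a statistical model of rank consistency.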
Affiliation(s)
- Yunda Si
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
- Yi Zhang
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
- Chengfei Yan
  - School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
|
50
|
Extracting phylogenetic dimensions of coevolution reveals hidden functional signals. Sci Rep 2022; 12:820. [PMID: 35039514 PMCID: PMC8764114 DOI: 10.1038/s41598-021-04260-1] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 09/02/2021] [Accepted: 12/17/2021] [Indexed: 11/08/2022] Open
Abstract
Despite the structural and functional information contained in the statistical coupling between pairs of residues in a protein, coevolution associated with function is often obscured by artifactual signals such as genetic drift, which shapes a protein's phylogenetic history and gives rise to concurrent variation between protein sequences that is not driven by selection for function. Here, we introduce a background model for phylogenetic contributions of statistical coupling that separates the coevolution signal due to inter-clade and intra-clade sequence comparisons and demonstrate that coevolution can be measured on multiple phylogenetic timescales within a single protein. Our method, nested coevolution (NC), can be applied as an extension to any coevolution metric. We use NC to demonstrate that poorly conserved residues can nonetheless have important roles in protein function. Moreover, NC improved the structural-contact predictions of several coevolution-based methods, particularly in subsampled alignments with fewer sequences. NC also lowered the noise in detecting functional sectors of collectively coevolving residues. Sectors of coevolving residues identified after application of NC were more spatially compact and phylogenetically distinct from the rest of the protein, and strongly enriched for mutations that disrupt protein activity. Thus, our conceptualization of the phylogenetic separation of coevolution provides the potential to further elucidate relationships among protein evolution, function, and genetic diseases.
|