1
|
Konecki DM, Hamrick S, Wang C, Agosto MA, Wensel TG, Lichtarge O. CovET: A covariation-evolutionary trace method that identifies protein structure-function modules. J Biol Chem 2023; 299:104896. [PMID: 37290531 PMCID: PMC10338321 DOI: 10.1016/j.jbc.2023.104896] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/25/2023] [Revised: 06/01/2023] [Accepted: 06/02/2023] [Indexed: 06/10/2023] Open
Abstract
Measuring the relative effect that any two sequence positions have on each other may improve protein design or help better interpret coding variants. Current approaches use statistics and machine learning but rarely consider phylogenetic divergences which, as shown by Evolutionary Trace studies, provide insight into the functional impact of sequence perturbations. Here, we reframe covariation analyses in the Evolutionary Trace framework to measure the relative tolerance to perturbation of each residue pair during evolution. This approach (CovET) systematically accounts for phylogenetic divergences: at each divergence event, we penalize covariation patterns that belie evolutionary coupling. We find that while CovET approximates the performance of existing methods to predict individual structural contacts, it performs significantly better at finding structural clusters of coupled residues and ligand binding sites. For example, CovET found more functionally critical residues when we examined the RNA recognition motif and WW domains. It correlates better with large-scale epistasis screen data. In the dopamine D2 receptor, top CovET residue pairs recovered accurately the allosteric activation pathway characterized for Class A G protein-coupled receptors. These data suggest that CovET ranks highest the sequence position pairs that play critical functional roles through epistatic and allosteric interactions in evolutionarily relevant structure-function motifs. CovET complements current methods and may shed light on fundamental molecular mechanisms of protein structure and function.
Collapse
Affiliation(s)
- Daniel M Konecki
- Quantitative and Computational Biosciences Graduate Program, Baylor College of Medicine, Houston, Texas, USA
| | - Spencer Hamrick
- Chemical, Physical, and Structural Biology Graduate Program, Baylor College of Medicine, Houston, Texas, USA
| | - Chen Wang
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, USA
| | - Melina A Agosto
- Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas, USA
| | - Theodore G Wensel
- Quantitative and Computational Biosciences Graduate Program, Baylor College of Medicine, Houston, Texas, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, USA; Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas, USA; Cancer and Cell Biology Graduate Program, Baylor College of Medicine, Houston, Texas, USA
| | - Olivier Lichtarge
- Quantitative and Computational Biosciences Graduate Program, Baylor College of Medicine, Houston, Texas, USA; Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, USA; Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas, USA; Cancer and Cell Biology Graduate Program, Baylor College of Medicine, Houston, Texas, USA; Computational and Integrative Biomedical Research Center, Baylor College of Medicine, Houston, Texas, USA.
| |
Collapse
|
2
|
Hsu TK, Asmussen J, Koire A, Choi BK, Gadhikar MA, Huh E, Lin CH, Konecki DM, Kim YW, Pickering CR, Kimmel M, Donehower LA, Frederick MJ, Myers JN, Katsonis P, Lichtarge O. A general calculus of fitness landscapes finds genes under selection in cancers. Genome Res 2022; 32:916-929. [PMID: 35301263 PMCID: PMC9104707 DOI: 10.1101/gr.275811.121] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/25/2021] [Accepted: 03/14/2022] [Indexed: 11/24/2022]
Abstract
Genetic variants drive the evolution of traits and diseases. We previously modeled these variants as small displacements in fitness landscapes and estimated their functional impact by differentiating the evolutionary relationship between genotype and phenotype. Conversely, here we integrate these derivatives to identify genes steering specific traits. Over cancer cohorts, integration identified 460 likely tumor-driving genes. Many have literature and experimental support but had eluded prior genomic searches for positive selection in tumors. Beyond providing cancer insights, these results introduce a general calculus of evolution to quantify the genotype-phenotype relationship and discover genes associated with complex traits and diseases.
Collapse
Affiliation(s)
- Teng-Kuei Hsu
- Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Jennifer Asmussen
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Amanda Koire
- Program in Quantitative and Computational Biosciences, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Byung-Kwon Choi
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Mayur A Gadhikar
- Department of Head and Neck Surgery, The University of Texas M.D. Anderson Cancer Center, Houston, Texas 77030, USA
| | - Eunna Huh
- Department of Pharmacology and Chemical Biology, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Chih-Hsu Lin
- Program in Quantitative and Computational Biosciences, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Daniel M Konecki
- Program in Quantitative and Computational Biosciences, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Young Won Kim
- Program in Integrative Molecular and Biomedical Sciences, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Curtis R Pickering
- Department of Head and Neck Surgery, The University of Texas M.D. Anderson Cancer Center, Houston, Texas 77030, USA
| | - Marek Kimmel
- Departments of Statistics and Bioengineering, Rice University, Houston, Texas 77005, USA
- Department of Systems Engineering and Biology, Silesian University of Technology, 44-100 Gliwice, Poland
| | - Lawrence A Donehower
- Department of Molecular Virology and Microbiology, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Mitchell J Frederick
- Department of Otolaryngology-Head and Neck Surgery, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Jeffrey N Myers
- Department of Head and Neck Surgery, The University of Texas M.D. Anderson Cancer Center, Houston, Texas 77030, USA
| | - Panagiotis Katsonis
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Olivier Lichtarge
- Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, Texas 77030, USA
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, USA
- Program in Quantitative and Computational Biosciences, Baylor College of Medicine, Houston, Texas 77030, USA
- Department of Pharmacology and Chemical Biology, Baylor College of Medicine, Houston, Texas 77030, USA
- Program in Integrative Molecular and Biomedical Sciences, Baylor College of Medicine, Houston, Texas 77030, USA
- Computational and Integrative Biomedical Research Center, Baylor College of Medicine, Houston, Texas 77030, USA
| |
Collapse
|
3
|
Wang C, Konecki DM, Marciano DC, Govindarajan H, Williams AM, Wastuwidyaningtyas B, Bourquard T, Katsonis P, Lichtarge O. Identification of evolutionarily stable functional and immunogenic sites across the SARS-CoV-2 proteome and greater coronavirus family. Bioinformatics 2021; 37:4033-4040. [PMID: 34043002 PMCID: PMC8243408 DOI: 10.1093/bioinformatics/btab406] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/26/2021] [Revised: 05/10/2021] [Accepted: 05/26/2021] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION Since the first recognized case of COVID-19, more than 100 million people have been infected worldwide. Global efforts in drug and vaccine development to fight the disease have yielded vaccines and drug candidates to cure COVID-19. However, the spread of SARS-CoV-2 variants threatens the continued efficacy of these treatments. In order to address this, we interrogate the evolutionary history of the entire SARS-CoV-2 proteome to identify evolutionarily conserved functional sites that can inform the search for treatments with broader coverage across the coronavirus family. RESULTS Combining coronavirus family sequence information with the mutations observed in the current COVID-19 outbreak, we systematically and comprehensively define evolutionarily stable sites that may provide useful drug and vaccine targets and which are less likely to be compromised by the emergence of new virus strains. Several experimentally validated effective drugs interact with these proposed target sites. In addition, the same evolutionary information can prioritize cross reactive antigens that are useful in directing multi-epitope vaccine strategies to illicit broadly neutralizing immune responses to the betacoronavirus family. Although the results are focused on SARS-CoV-2, these approaches stem from evolutionary principles that are agnostic to the organism or infective agent. AVAILABILITY AND IMPLEMENTATION The results of this work are made interactively available at http://cov.lichtargelab.org. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Collapse
Affiliation(s)
- Chen Wang
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Daniel M Konecki
- Quantitative and Computational Biosciences Graduate Program, Baylor College of Medicine, Houston, TX 77030, USA
| | - David C Marciano
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Harikumar Govindarajan
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Amanda M Williams
- Cancer and Cell Biology Graduate Program, Baylor College of Medicine, Houston, TX 77030, USA
| | | | - Thomas Bourquard
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Panagiotis Katsonis
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Olivier Lichtarge
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Quantitative and Computational Biosciences Graduate Program, Baylor College of Medicine, Houston, TX 77030, USA
- Cancer and Cell Biology Graduate Program, Baylor College of Medicine, Houston, TX 77030, USA
- Computational and Integrative Biomedical Research Center, Baylor College of Medicine, Houston, TX 77030, USA
| |
Collapse
|
4
|
Wang C, Konecki DM, Marciano DC, Govindarajan H, Williams AM, Wastuwidyaningtyas B, Bourquard T, Katsonis P, Lichtarge O. Identification of evolutionarily stable functional and immunogenic sites across the SARS-CoV-2 proteome and the greater coronavirus family. RESEARCH SQUARE 2021:rs.3.rs-95030. [PMID: 33106800 PMCID: PMC7587783 DOI: 10.21203/rs.3.rs-95030/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Subscribe] [Scholar Register] [Indexed: 01/18/2023]
Abstract
Since the first recognized case of COVID-19, more than 100 million people have been infected worldwide. Global efforts in drug and vaccine development to fight the disease have yielded vaccines and drug candidates to cure COVID-19. However, the spread of SARS-CoV-2 variants threatens the continued efficacy of these treatments. In order to address this, we interrogate the evolutionary history of the entire SARS-CoV-2 proteome to identify evolutionarily conserved functional sites that can inform the search for treatments with broader coverage across the coronavirus family. Combining this information with the mutations observed in the current COVID-19 outbreak, we systematically and comprehensively define evolutionarily stable sites that may provide useful drug and vaccine targets and which are less likely to be compromised by the emergence of new virus strains. Several experimentally-validated effective drugs interact with these proposed target sites. In addition, the same evolutionary information can prioritize cross reactive antigens that are useful in directing multi-epitope vaccine strategies to illicit broadly neutralizing immune responses to the betacoronavirus family. Although the results are focused on SARS-CoV-2, these approaches stem from evolutionary principles that are agnostic to the organism or infective agent.
Collapse
Affiliation(s)
- Chen Wang
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Daniel M. Konecki
- Quantitative and Computational Biosciences Graduate Program, Baylor College of Medicine, Houston, TX 77030, USA
| | - David C. Marciano
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Harikumar Govindarajan
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Amanda M. Williams
- Cancer and Cell Biology Graduate Program, Baylor College of Medicine, Houston, TX 77030, USA
| | | | - Thomas Bourquard
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- MAbSilico, Nouzilly, Centre, France, EU
| | - Panagiotis Katsonis
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Olivier Lichtarge
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
- Quantitative and Computational Biosciences Graduate Program, Baylor College of Medicine, Houston, TX 77030, USA
- Cancer and Cell Biology Graduate Program, Baylor College of Medicine, Houston, TX 77030, USA
- Computational and Integrative Biomedical Research Center, Baylor College of Medicine, Houston, TX 77030, USA
| |
Collapse
|
5
|
Wang C, Konecki DM, Marciano DC, Govindarajan H, Williams AM, Wastuwidyaningtyas B, Bourquard T, Katsonis P, Lichtarge O. Identification of evolutionarily stable functional and immunogenic sites across the SARS-CoV-2 proteome and the greater coronavirus family. RESEARCH SQUARE 2021:rs.3.rs-95030. [PMID: 36575762 PMCID: PMC9793837 DOI: 10.21203/rs.3.rs-95030/v3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Indexed: 01/18/2023]
Abstract
Since the first recognized case of COVID-19, more than 100 million people have been infected worldwide. Global efforts in drug and vaccine development to fight the disease have yielded vaccines and drug candidates to cure COVID-19. However, the spread of SARS-CoV-2 variants threatens the continued efficacy of these treatments. In order to address this, we interrogate the evolutionary history of the entire SARS-CoV-2 proteome to identify evolutionarily conserved functional sites that can inform the search for treatments with broader coverage across the coronavirus family. Combining this information with the mutations observed in the current COVID-19 outbreak, we systematically and comprehensively define evolutionarily stable sites that may provide useful drug and vaccine targets and which are less likely to be compromised by the emergence of new virus strains. Several experimentally-validated effective drugs interact with these proposed target sites. In addition, the same evolutionary information can prioritize cross reactive antigens that are useful in directing multi-epitope vaccine strategies to illicit broadly neutralizing immune responses to the betacoronavirus family. Although the results are focused on SARS-CoV-2, these approaches stem from evolutionary principles that are agnostic to the organism or infective agent.
Collapse
Affiliation(s)
- Chen Wang
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Daniel M. Konecki
- Quantitative and Computational Biosciences Graduate Program, Baylor College of Medicine, Houston, TX 77030, USA
| | - David C. Marciano
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA,Correspondence: (D.C.M), (O.L.)
| | - Harikumar Govindarajan
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Amanda M. Williams
- Cancer and Cell Biology Graduate Program, Baylor College of Medicine, Houston, TX 77030, USA
| | | | - Thomas Bourquard
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA,MAbSilico, Nouzilly, Centre, France, EU
| | - Panagiotis Katsonis
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Olivier Lichtarge
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA,Quantitative and Computational Biosciences Graduate Program, Baylor College of Medicine, Houston, TX 77030, USA,Cancer and Cell Biology Graduate Program, Baylor College of Medicine, Houston, TX 77030, USA,Computational and Integrative Biomedical Research Center, Baylor College of Medicine, Houston, TX 77030, USA,Correspondence: (D.C.M), (O.L.)
| |
Collapse
|
6
|
Serçinoğlu O, Ozbek P. Sequence-structure-function relationships in class I MHC: A local frustration perspective. PLoS One 2020; 15:e0232849. [PMID: 32421728 PMCID: PMC7233585 DOI: 10.1371/journal.pone.0232849] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/18/2019] [Accepted: 04/22/2020] [Indexed: 12/22/2022] Open
Abstract
Class I Major Histocompatibility Complex (MHC) binds short antigenic peptides with the help of Peptide Loading Complex (PLC), and presents them to T-cell Receptors (TCRs) of cytotoxic T-cells and Killer-cell Immunglobulin-like Receptors (KIRs) of Natural Killer (NK) cells. With more than 10000 alleles, human MHC (Human Leukocyte Antigen, HLA) is the most polymorphic protein in humans. This allelic diversity provides a wide coverage of peptide sequence space, yet does not affect the three-dimensional structure of the complex. Moreover, TCRs mostly interact with HLA in a common diagonal binding mode, and KIR-HLA interaction is allele-dependent. With the aim of establishing a framework for understanding the relationships between polymorphism (sequence), structure (conserved fold) and function (protein interactions) of the human MHC, we performed here a local frustration analysis on pMHC homology models covering 1436 HLA I alleles. An analysis of local frustration profiles indicated that (1) variations in MHC fold are unlikely due to minimally-frustrated and relatively conserved residues within the HLA peptide-binding groove, (2) high frustration patches on HLA helices are either involved in or near interaction sites of MHC with the TCR, KIR, or tapasin of the PLC, and (3) peptide ligands mainly stabilize the F-pocket of HLA binding groove.
Collapse
Affiliation(s)
- Onur Serçinoğlu
- Department of Bioengineering, Recep Tayyip Erdogan University, Faculty of Engineering, Fener, Rize, Turkey
| | - Pemra Ozbek
- Department of Bioengineering, Marmara University, Faculty of Engineering, Goztepe, Istanbul, Turkey
- * E-mail:
| |
Collapse
|
7
|
Wu J, Liu S, Wang H. Invasive fungi-derived defensins kill drug-resistant bacterial pathogens. Peptides 2018; 99:82-91. [PMID: 29174563 DOI: 10.1016/j.peptides.2017.11.009] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 08/17/2017] [Revised: 11/17/2017] [Accepted: 11/20/2017] [Indexed: 01/25/2023]
Abstract
Fungi-derived defensins are a class of antimicrobial peptides with therapeutic potential due to their high antibacterial efficacy and low toxicity. Based on the genomic strategy, we have identified 68 fungal defensin-like peptides (fDLPs) in five new genera, including Trichosporon, Apophysomyces, Lichtheimia, Beauveria and Scedosporium and characterized a new synthetic defensin (scedosporisin) from an invasive fungus. It was active against Gram-positive bacteria but not active against negative bacteria. Importantly, it killed several clinical resistant isolates such as methicillin-resistant Staphylococcus aureus and vancomycin-resistant Enterococci at low molecular concentrations. Scedosporisin showed low hemolysis and cytotoxicity and high serum stability. The killing kinetics of scedosporisin-2 against a clinical isolate of MRSA showed that it killed the bacteria more rapidly than that of vancomycin. Homology modeling analysis show that scedosporisin adopted a typical cysteine stabilized α-helical and β-sheet fold with a local hydrophobic patch. Scedosporisin significantly improved the survival rate of mice in the peritonitis model. This work has greatly expanded the library of fDLPs, and successfully selected leading molecules for antimicrobial drug reserves.
Collapse
Affiliation(s)
- Jiajia Wu
- Laboratory for Biological Effects of Nanomaterials and Nanosafety, National Center for Nanoscience and Technology (NCNST), No. 11 Beiyitiao, Zhongguancun, Beijing 100190, China
| | - Shijie Liu
- Laboratory for Biological Effects of Nanomaterials and Nanosafety, National Center for Nanoscience and Technology (NCNST), No. 11 Beiyitiao, Zhongguancun, Beijing 100190, China
| | - Hao Wang
- Laboratory for Biological Effects of Nanomaterials and Nanosafety, National Center for Nanoscience and Technology (NCNST), No. 11 Beiyitiao, Zhongguancun, Beijing 100190, China.
| |
Collapse
|
8
|
Lua RC, Wilson SJ, Konecki DM, Wilkins AD, Venner E, Morgan DH, Lichtarge O. UET: a database of evolutionarily-predicted functional determinants of protein sequences that cluster as functional sites in protein structures. Nucleic Acids Res 2015; 44:D308-12. [PMID: 26590254 PMCID: PMC4702906 DOI: 10.1093/nar/gkv1279] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/21/2015] [Accepted: 11/02/2015] [Indexed: 02/07/2023] Open
Abstract
The structure and function of proteins underlie most aspects of biology and their mutational perturbations often cause disease. To identify the molecular determinants of function as well as targets for drugs, it is central to characterize the important residues and how they cluster to form functional sites. The Evolutionary Trace (ET) achieves this by ranking the functional and structural importance of the protein sequence positions. ET uses evolutionary distances to estimate functional distances and correlates genotype variations with those in the fitness phenotype. Thus, ET ranks are worse for sequence positions that vary among evolutionarily closer homologs but better for positions that vary mostly among distant homologs. This approach identifies functional determinants, predicts function, guides the mutational redesign of functional and allosteric specificity, and interprets the action of coding sequence variations in proteins, people and populations. Now, the UET database offers pre-computed ET analyses for the protein structure databank, and on-the-fly analysis of any protein sequence. A web interface retrieves ET rankings of sequence positions and maps results to a structure to identify functionally important regions. This UET database integrates several ways of viewing the results on the protein sequence or structure and can be found at http://mammoth.bcm.tmc.edu/uet/.
Collapse
Affiliation(s)
- Rhonald C Lua
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Stephen J Wilson
- Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, TX 77030, USA
| | - Daniel M Konecki
- Department of Structural and Computational Biology and Molecular Biophysics, Houston, TX 77030, USA
| | - Angela D Wilkins
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA Computational and Integrative Biomedical Research Center, Baylor College of Medicine, Houston, TX 77030, USA
| | - Eric Venner
- Department of Structural and Computational Biology and Molecular Biophysics, Houston, TX 77030, USA
| | - Daniel H Morgan
- Department of Structural and Computational Biology and Molecular Biophysics, Houston, TX 77030, USA
| | - Olivier Lichtarge
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, TX 77030, USA Department of Structural and Computational Biology and Molecular Biophysics, Houston, TX 77030, USA Computational and Integrative Biomedical Research Center, Baylor College of Medicine, Houston, TX 77030, USA Department of Pharmacology, Baylor College of Medicine, Houston, TX 77030, USA
| |
Collapse
|
9
|
Lee TW, Yang ASP, Brittain T, Birch NP. An analysis approach to identify specific functional sites in orthologous proteins using sequence and structural information: application to neuroserpin reveals regions that differentially regulate inhibitory activity. Proteins 2015; 83:135-52. [PMID: 25363759 DOI: 10.1002/prot.24711] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/12/2014] [Revised: 10/22/2014] [Accepted: 10/27/2014] [Indexed: 01/12/2023]
Abstract
The analysis of sequence conservation is commonly used to predict functionally important sites in proteins. We have developed an approach that first identifies highly conserved sites in a set of orthologous sequences using a weighted substitution-matrix-based conservation score and then filters these conserved sites based on the pattern of conservation present in a wider alignment of sequences from the same family and structural information to identify surface-exposed sites. This allows us to detect specific functional sites in the target protein and exclude regions that are likely to be generally important for the structure or function of the wider protein family. We applied our method to two members of the serpin family of serine protease inhibitors. We first confirmed that our method successfully detected the known heparin binding site in antithrombin while excluding residues known to be generally important in the serpin family. We next applied our sequence analysis approach to neuroserpin and used our results to guide site-directed polyalanine mutagenesis experiments. The majority of the mutant neuroserpin proteins were found to fold correctly and could still form inhibitory complexes with tissue plasminogen activator (tPA). Kinetic analysis of tPA inhibition, however, revealed altered inhibitory kinetics in several of the mutant proteins, with some mutants showing decreased association with tPA and others showing more rapid dissociation of the covalent complex. Altogether, these results confirm that our sequence analysis approach is a useful tool that can be used to guide mutagenesis experiments for the detection of specific functional sites in proteins.
Collapse
Affiliation(s)
- Tet Woo Lee
- School of Biological Sciences and Centre for Brain Research, University of Auckland, Auckland, New Zealand
| | | | | | | |
Collapse
|
10
|
Lua RC, Marciano DC, Katsonis P, Adikesavan AK, Wilkins AD, Lichtarge O. Prediction and redesign of protein-protein interactions. PROGRESS IN BIOPHYSICS AND MOLECULAR BIOLOGY 2014; 116:194-202. [PMID: 24878423 DOI: 10.1016/j.pbiomolbio.2014.05.004] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/25/2014] [Revised: 05/02/2014] [Accepted: 05/17/2014] [Indexed: 12/14/2022]
Abstract
Understanding the molecular basis of protein function remains a central goal of biology, with the hope to elucidate the role of human genes in health and in disease, and to rationally design therapies through targeted molecular perturbations. We review here some of the computational techniques and resources available for characterizing a critical aspect of protein function - those mediated by protein-protein interactions (PPI). We describe several applications and recent successes of the Evolutionary Trace (ET) in identifying molecular events and shapes that underlie protein function and specificity in both eukaryotes and prokaryotes. ET is a part of analytical approaches based on the successes and failures of evolution that enable the rational control of PPI.
Collapse
Affiliation(s)
- Rhonald C Lua
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - David C Marciano
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Panagiotis Katsonis
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Anbu K Adikesavan
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | - Angela D Wilkins
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA; Computational and Integrative Biomedical Research Center, Baylor College of Medicine, Houston, TX 77030, USA
| | - Olivier Lichtarge
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA; Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, TX 77030, USA; Computational and Integrative Biomedical Research Center, Baylor College of Medicine, Houston, TX 77030, USA.
| |
Collapse
|
11
|
Wilkins AD, Venner E, Marciano DC, Erdin S, Atri B, Lua RC, Lichtarge O. Accounting for epistatic interactions improves the functional analysis of protein structures. Bioinformatics 2013; 29:2714-21. [PMID: 24021383 PMCID: PMC3799481 DOI: 10.1093/bioinformatics/btt489] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022] Open
Abstract
Motivation: The constraints under which sequence, structure and function coevolve are not fully understood. Bringing this mutual relationship to light can reveal the molecular basis of binding, catalysis and allostery, thereby identifying function and rationally guiding protein redesign. Underlying these relationships are the epistatic interactions that occur when the consequences of a mutation to a protein are determined by the genetic background in which it occurs. Based on prior data, we hypothesize that epistatic forces operate most strongly between residues nearby in the structure, resulting in smooth evolutionary importance across the structure. Methods and Results: We find that when residue scores of evolutionary importance are distributed smoothly between nearby residues, functional site prediction accuracy improves. Accordingly, we designed a novel measure of evolutionary importance that focuses on the interaction between pairs of structurally neighboring residues. This measure that we term pair-interaction Evolutionary Trace yields greater functional site overlap and better structure-based proteome-wide functional predictions. Conclusions: Our data show that the structural smoothness of evolutionary importance is a fundamental feature of the coevolution of sequence, structure and function. Mutations operate on individual residues, but selective pressure depends in part on the extent to which a mutation perturbs interactions with neighboring residues. In practice, this principle led us to redefine the importance of a residue in terms of the importance of its epistatic interactions with neighbors, yielding better annotation of functional residues, motivating experimental validation of a novel functional site in LexA and refining protein function prediction. Contact:lichtarge@bcm.edu Supplementary information:Supplementary data are available at Bioinformatics online.
Collapse
Affiliation(s)
- Angela D Wilkins
- Department of Molecular and Human Genetics, CIBR Center for Computational and Integrative Biomedical Research and Program in Structural and Computational Biology & Molecular Biophysics, Baylor College of Medicine, Houston, TX 77030 and Center for Human Genetic Research, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA
| | | | | | | | | | | | | |
Collapse
|
12
|
Erdin S, Venner E, Lisewski AM, Lichtarge O. Function prediction from networks of local evolutionary similarity in protein structure. BMC Bioinformatics 2013; 14 Suppl 3:S6. [PMID: 23514548 PMCID: PMC3584919 DOI: 10.1186/1471-2105-14-s3-s6] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022] Open
Abstract
BACKGROUND Annotating protein function with both high accuracy and sensitivity remains a major challenge in structural genomics. One proven computational strategy has been to group a few key functional amino acids into templates and search for these templates in other protein structures, so as to transfer function when a match is found. To this end, we previously developed Evolutionary Trace Annotation (ETA) and showed that diffusing known annotations over a network of template matches on a structural genomic scale improved predictions of function. In order to further increase sensitivity, we now let each protein contribute multiple templates rather than just one, and also let the template size vary. RESULTS Retrospective benchmarks in 605 Structural Genomics enzymes showed that multiple templates increased sensitivity by up to 14% when combined with single template predictions even as they maintained the accuracy over 91%. Diffusing function globally on networks of single and multiple template matches marginally increased the area under the ROC curve over 0.97, but in a subset of proteins that could not be annotated by ETA, the network approach recovered annotations for the most confident 20-23 of 91 cases with 100% accuracy. CONCLUSIONS We improve the accuracy and sensitivity of predictions by using multiple templates per protein structure when constructing networks of ETA matches and diffusing annotations.
Collapse
Affiliation(s)
- Serkan Erdin
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, Texas 77030, USA
- Computational and Integrative Biomedical Research Center, Baylor College of Medicine, One Baylor Plaza, Houston, Texas 77030, USA
| | - Eric Venner
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, Texas 77030, USA
| | - Andreas Martin Lisewski
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, Texas 77030, USA
- Computational and Integrative Biomedical Research Center, Baylor College of Medicine, One Baylor Plaza, Houston, Texas 77030, USA
| | - Olivier Lichtarge
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, Texas 77030, USA
- Computational and Integrative Biomedical Research Center, Baylor College of Medicine, One Baylor Plaza, Houston, Texas 77030, USA
| |
Collapse
|
13
|
Wilkins AD, Bachman BJ, Erdin S, Lichtarge O. The use of evolutionary patterns in protein annotation. Curr Opin Struct Biol 2012; 22:316-25. [PMID: 22633559 DOI: 10.1016/j.sbi.2012.05.001] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/03/2012] [Accepted: 05/01/2012] [Indexed: 01/13/2023]
Abstract
With genomic data skyrocketing, their biological interpretation remains a serious challenge. Diverse computational methods address this problem by pointing to the existence of recurrent patterns among sequence, structure, and function. These patterns emerge naturally from evolutionary variation, natural selection, and divergence--the defining features of biological systems--and they identify molecular events and shapes that underlie specificity of function and allosteric communication. Here we review these methods, and the patterns they identify in case studies and in proteome-wide applications, to infer and rationally redesign function.
Collapse
Affiliation(s)
- Angela D Wilkins
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | | | | | | |
Collapse
|
14
|
Sinha R, Kundrotas PJ, Vakser IA. Protein docking by the interface structure similarity: how much structure is needed? PLoS One 2012; 7:e31349. [PMID: 22348074 PMCID: PMC3278447 DOI: 10.1371/journal.pone.0031349] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/15/2011] [Accepted: 01/08/2012] [Indexed: 11/19/2022] Open
Abstract
The increasing availability of co-crystallized protein-protein complexes provides an opportunity to use template-based modeling for protein-protein docking. Structure alignment techniques are useful in detection of remote target-template similarities. The size of the structure involved in the alignment is important for the success in modeling. This paper describes a systematic large-scale study to find the optimal definition/size of the interfaces for the structure alignment-based docking applications. The results showed that structural areas corresponding to the cutoff values <12 Å across the interface inadequately represent structural details of the interfaces. With the increase of the cutoff beyond 12 Å, the success rate for the benchmark set of 99 protein complexes, did not increase significantly for higher accuracy models, and decreased for lower-accuracy models. The 12 Å cutoff was optimal in our interface alignment-based docking, and a likely best choice for the large-scale (e.g., on the scale of the entire genome) applications to protein interaction networks. The results provide guidelines for the docking approaches, including high-throughput applications to modeled structures.
Collapse
Affiliation(s)
- Rohita Sinha
- Center for Bioinformatics, The University of Kansas, Lawrence, Kansas, United States of America
| | - Petras J. Kundrotas
- Center for Bioinformatics, The University of Kansas, Lawrence, Kansas, United States of America
- * E-mail: (PJK); (IAV)
| | - Ilya A. Vakser
- Center for Bioinformatics, The University of Kansas, Lawrence, Kansas, United States of America
- Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas, United States of America
- * E-mail: (PJK); (IAV)
| |
Collapse
|
15
|
Abstract
The evolutionary trace (ET) is the single most validated approach to identify protein functional determinants and to target mutational analysis, protein engineering and drug design to the most relevant sites of a protein. It applies to the entire proteome; its predictions come with a reliability score; and its results typically reach significance in most protein families with 20 or more sequence homologs. In order to identify functional hot spots, ET scans a multiple sequence alignment for residue variations that correlate with major evolutionary divergences. In case studies this enables the selective separation, recoding, or mimicry of functional sites and, on a large scale, this enables specific function predictions based on motifs built from select ET-identified residues. ET is therefore an accurate, scalable and efficient method to identify the molecular determinants of protein function and to direct their rational perturbation for therapeutic purposes. Public ET servers are located at: http://mammoth.bcm.tmc.edu/.
Collapse
|
16
|
Adikesavan AK, Katsonis P, Marciano DC, Lua R, Herman C, Lichtarge O. Separation of recombination and SOS response in Escherichia coli RecA suggests LexA interaction sites. PLoS Genet 2011; 7:e1002244. [PMID: 21912525 PMCID: PMC3164682 DOI: 10.1371/journal.pgen.1002244] [Citation(s) in RCA: 61] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/13/2011] [Accepted: 06/29/2011] [Indexed: 12/29/2022] Open
Abstract
RecA plays a key role in homologous recombination, the induction of the DNA damage response through LexA cleavage and the activity of error-prone polymerase in Escherichia coli. RecA interacts with multiple partners to achieve this pleiotropic role, but the structural location and sequence determinants involved in these multiple interactions remain mostly unknown. Here, in a first application to prokaryotes, Evolutionary Trace (ET) analysis identifies clusters of evolutionarily important surface amino acids involved in RecA functions. Some of these clusters match the known ATP binding, DNA binding, and RecA-RecA homo-dimerization sites, but others are novel. Mutation analysis at these sites disrupted either recombination or LexA cleavage. This highlights distinct functional sites specific for recombination and DNA damage response induction. Finally, our analysis reveals a composite site for LexA binding and cleavage, which is formed only on the active RecA filament. These new sites can provide new drug targets to modulate one or more RecA functions, with the potential to address the problem of evolution of antibiotic resistance at its root. In eubacteria, genome integrity is in large part orchestrated by RecA, which directly participates in recombination, induction of DNA damage response through LexA repressor cleavage and error-prone DNA synthesis. Yet, most of the interaction sites necessary for these vital processes are largely unknown. By comparing divergences among RecA sequences and computing putative functional regions, we discovered four functional sites of RecA. Targeted point-mutations were then tested for both recombination and DNA damage induction and reveal distinct RecA functions at each one of these sites. In particular, one new set of mutants is deficient in promoting LexA cleavage and yet maintains the ability to induce the DNA damage response. These results reveal specific amino acid determinants of the RecA–LexA interaction and suggest that LexA binds RecAi and RecAi+6 at a composite site on the RecA filament, which could explain the role of the active filament during LexA cleavage.
Collapse
Affiliation(s)
- Anbu K Adikesavan
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
| | | | | | | | | | | |
Collapse
|
17
|
Erdin S, Lisewski AM, Lichtarge O. Protein function prediction: towards integration of similarity metrics. Curr Opin Struct Biol 2011; 21:180-8. [PMID: 21353529 DOI: 10.1016/j.sbi.2011.02.001] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/03/2011] [Accepted: 02/03/2011] [Indexed: 11/16/2022]
Abstract
Genomic centers discover increasingly many protein sequences and structures, but not necessarily their full biological functions. Thus, currently, less than one percent of proteins have experimentally verified biochemical activities. To fill this gap, function prediction algorithms apply metrics of similarity between proteins on the premise that those sufficiently alike in sequence, or structure, will perform identical functions. Although high sensitivity is elusive, network analyses that integrate these metrics together hold the promise of rapid gains in function prediction specificity.
Collapse
Affiliation(s)
- Serkan Erdin
- Department of Molecular and Human Genetics, 1 Baylor Plaza, Baylor College of Medicine, Houston, TX 77030, USA
| | | | | |
Collapse
|
18
|
Venner E, Lisewski AM, Erdin S, Ward RM, Amin SR, Lichtarge O. Accurate protein structure annotation through competitive diffusion of enzymatic functions over a network of local evolutionary similarities. PLoS One 2010; 5:e14286. [PMID: 21179190 PMCID: PMC3001439 DOI: 10.1371/journal.pone.0014286] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/29/2010] [Accepted: 11/10/2010] [Indexed: 12/24/2022] Open
Abstract
High-throughput Structural Genomics yields many new protein structures without known molecular function. This study aims to uncover these missing annotations by globally comparing select functional residues across the structural proteome. First, Evolutionary Trace Annotation, or ETA, identifies which proteins have local evolutionary and structural features in common; next, these proteins are linked together into a proteomic network of ETA similarities; then, starting from proteins with known functions, competing functional labels diffuse link-by-link over the entire network. Every node is thus assigned a likelihood z-score for every function, and the most significant one at each node wins and defines its annotation. In high-throughput controls, this competitive diffusion process recovered enzyme activity annotations with 99% and 97% accuracy at half-coverage for the third and fourth Enzyme Commission (EC) levels, respectively. This corresponds to false positive rates 4-fold lower than nearest-neighbor and 5-fold lower than sequence-based annotations. In practice, experimental validation of the predicted carboxylesterase activity in a protein from Staphylococcus aureus illustrated the effectiveness of this approach in the context of an increasingly drug-resistant microbe. This study further links molecular function to a small number of evolutionarily important residues recognizable by Evolutionary Tracing and it points to the specificity and sensitivity of functional annotation by competitive global network diffusion. A web server is at http://mammoth.bcm.tmc.edu/networks.
Collapse
Affiliation(s)
- Eric Venner
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
- Graduate Program in Structural and Computational Biology and Molecular Biophysics, Baylor College of Medicine, Houston, Texas, United States of America
- W. M. Keck Center for Interdisciplinary Bioscience Training, Houston, Texas, United States of America
| | - Andreas Martin Lisewski
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
| | - Serkan Erdin
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
- W. M. Keck Center for Interdisciplinary Bioscience Training, Houston, Texas, United States of America
| | - R. Matthew Ward
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
- Graduate Program in Structural and Computational Biology and Molecular Biophysics, Baylor College of Medicine, Houston, Texas, United States of America
- W. M. Keck Center for Interdisciplinary Bioscience Training, Houston, Texas, United States of America
| | - Shivas R. Amin
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
| | - Olivier Lichtarge
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, United States of America
- Graduate Program in Structural and Computational Biology and Molecular Biophysics, Baylor College of Medicine, Houston, Texas, United States of America
- W. M. Keck Center for Interdisciplinary Bioscience Training, Houston, Texas, United States of America
- * E-mail:
| |
Collapse
|
19
|
Lua RC, Lichtarge O. PyETV: a PyMOL evolutionary trace viewer to analyze functional site predictions in protein complexes. Bioinformatics 2010; 26:2981-2. [PMID: 20929911 DOI: 10.1093/bioinformatics/btq566] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
Abstract
SUMMARY PyETV is a PyMOL plugin for viewing, analyzing and manipulating predictions of evolutionarily important residues and sites in protein structures and their complexes. It seamlessly captures the output of the Evolutionary Trace server, namely ranked importance of residues, for multiple chains of a complex. It then yields a high resolution graphical interface showing their distribution and clustering throughout a quaternary structure, including at interfaces. Together with other tools in the popular PyMOL viewer, PyETV thus provides a novel tool to integrate evolutionary forces into the design of experiments targeting the most functionally relevant sites of a protein. AVAILABILITY The PyETV module is written in Python. Installation instructions and video demonstrations may be found at the URL http://mammoth.bcm.tmc.edu/traceview/HelpDocs/PyETVHelp/pyInstructions.html. CONTACT lichtarge@bcm.tmc.edu.
Collapse
Affiliation(s)
- Rhonald C Lua
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA
| | | |
Collapse
|