1
Ward KM, Pickett BD, Ebbert MTW, Kauwe JSK, Miller JB. Web-Based Protein Interactions Calculator Identifies Likely Proteome Coevolution with Alzheimer’s Disease-Associated Proteins. Genes (Basel) 2022; 13:genes13081346. [PMID: 36011253 PMCID: PMC9407263 DOI: 10.3390/genes13081346] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2022] [Revised: 07/22/2022] [Accepted: 07/23/2022] [Indexed: 11/19/2022] Open
Abstract
Protein–protein functional interactions arise from either transitory or permanent biomolecular associations and often lead to the coevolution of the interacting residues. Although mutual information has traditionally been used to identify coevolving residues within the same protein, its application between coevolving proteins remains largely uncharacterized. Therefore, we developed the Protein Interactions Calculator (PIC) to efficiently identify coevolving residues between two protein sequences using mutual information. We verified the algorithm using 2102 known human protein interactions and 233 known bacterial protein interactions, with a respective 1975 and 252 non-interacting protein controls. The average PIC score for known human protein interactions was 4.5 times higher than that for non-interacting proteins (p = 1.03 × 10⁻¹⁰⁸) and 1.94 times higher in bacteria (p = 1.22 × 10⁻³⁵). We then used the PIC scores to determine the probability that two proteins interact. Using those probabilities, we paired 37 Alzheimer’s disease-associated proteins with 8608 other proteins and determined the likelihood that each pair interacts, which we report through a web interface. The PIC had significantly higher sensitivity and residue-specific resolution not available in other algorithms. Therefore, we propose that the PIC can be used to prioritize potential protein interactions, which can lead to a better understanding of biological processes and additional therapeutic targets belonging to protein interaction groups.
Affiliation(s)
- Katrisa M. Ward
  - Department of Biology, Brigham Young University, Provo, UT 84602, USA
- Brandon D. Pickett
  - Department of Biology, Brigham Young University, Provo, UT 84602, USA
- Mark T. W. Ebbert
  - Sanders-Brown Center on Aging, University of Kentucky, Lexington, KY 40536, USA
  - Division of Biomedical Informatics, Department of Internal Medicine, University of Kentucky, Lexington, KY 40506, USA
  - Department of Neuroscience, University of Kentucky, Lexington, KY 40506, USA
- John S. K. Kauwe
  - Department of Biology, Brigham Young University, Provo, UT 84602, USA
- Justin B. Miller
  - Sanders-Brown Center on Aging, University of Kentucky, Lexington, KY 40536, USA
  - Division of Biomedical Informatics, Department of Internal Medicine, University of Kentucky, Lexington, KY 40506, USA
  - Department of Pathology and Laboratory Medicine, University of Kentucky, Lexington, KY 40506, USA
  - Correspondence: Tel.: +1-859-562-0333
2
Structural Insights into Carboxylic Polyester-Degrading Enzymes and Their Functional Depolymerizing Neighbors. Int J Mol Sci 2021; 22:ijms22052332. [PMID: 33652738 PMCID: PMC7956259 DOI: 10.3390/ijms22052332] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 02/09/2021] [Revised: 02/22/2021] [Accepted: 02/23/2021] [Indexed: 11/28/2022] Open
Abstract
Esters are organic compounds widely represented in cellular structures and metabolism, formed by the condensation of organic acids and alcohols. Esterification reactions are also used by chemical industries for the production of synthetic plastic polymers. Polyester plastics are an increasing source of environmental pollution due to their intrinsic stability and limited recycling efforts. Bioremediation of polyesters based on the use of specific microbial enzymes is an interesting alternative to the current methods for the valorization of used plastics. Microbial esterases are promising catalysts for the biodegradation of polyesters that can be engineered to improve their biochemical properties. In this work, we analyzed the structure-activity relationships in microbial esterases, with special focus on the recently described plastic-degrading enzymes isolated from marine microorganisms and their structural homologs. Our analysis, based on structure alignment, molecular docking, coevolution of amino acids, and surface electrostatics, determined the specific characteristics of some polyester hydrolases that could be related to their efficiency in the degradation of aromatic polyesters, such as phthalates.
3
Camenares D. ACES: A co-evolution simulator generates co-varying protein and nucleic acid sequences. J Bioinform Comput Biol 2020; 18:2050039. [PMID: 33215964 DOI: 10.1142/s0219720020500390] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Sequence-specific and consequential interactions within or between proteins and/or RNAs can be predicted by identifying co-evolution of residues in these molecules. Different algorithms have been used to detect co-evolution, often using biological data to benchmark a method's ability to discriminate against indirect co-evolution. Such a benchmark is problematic, because not all the interactions and evolutionary constraints underlying real data can be known a priori. Instead, sequences generated in silico to simulate co-evolution would be preferable, and can be obtained using ACES, the software tool presented here. Conservation and co-evolution constraints can be specified for any residue across a number of molecules, allowing the user to capture a complex, realistic set of interactions. Resulting alignments were used to benchmark several co-evolution detection tools for their ability to separate signal from background and to discriminate direct from indirect signals. This approach can aid in refinement of these algorithms. In addition, systematic tuning of these constraints sheds new light on how they drive co-evolution between residues. A better understanding of how to detect co-evolution and the residue interactions it predicts can lead to a wide range of insights important for synthetic biologists interested in engineering new, orthogonal interactions between two macromolecules.
Affiliation(s)
- Devin Camenares
  - Department of Biochemistry, Alma College, 614 West Superior St, Alma, Michigan 48801, USA
4
Oteri F, Nadalin F, Champeimont R, Carbone A. BIS2Analyzer: a server for co-evolution analysis of conserved protein families. Nucleic Acids Res 2017; 45:W307-W314. [PMID: 28472458 PMCID: PMC5570204 DOI: 10.1093/nar/gkx336] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Received: 02/11/2017] [Accepted: 04/18/2017] [Indexed: 12/13/2022] Open
Abstract
Along protein sequences, co-evolution analysis identifies residue pairs demonstrating either a specific co-adaptation, where changes in one of the residues are compensated by changes in the other during evolution or a less specific external force that affects the evolutionary rates of both residues in a similar magnitude. In both cases, independently of the underlying cause, co-evolutionary signatures within or between proteins serve as markers of physical interactions and/or functional relationships. Depending on the type of protein under study, the set of available homologous sequences may greatly differ in size and amino acid variability. BIS2Analyzer, openly accessible at http://www.lcqb.upmc.fr/BIS2Analyzer/, is a web server providing the online analysis of co-evolving amino-acid pairs in protein alignments, especially designed for vertebrate and viral protein families, which typically display a small number of highly similar sequences. It is based on BIS2, a re-implemented fast version of the co-evolution analysis tool Blocks in Sequences (BIS). BIS2Analyzer provides a rich and interactive graphical interface to ease biological interpretation of the results.
Affiliation(s)
- Francesco Oteri
  - Sorbonne Universités, UPMC Univ Paris 06, CNRS, IBPS, UMR 7238, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), 75005 Paris, France
- Francesca Nadalin
  - Sorbonne Universités, UPMC Univ Paris 06, CNRS, IBPS, UMR 7238, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), 75005 Paris, France
- Raphaël Champeimont
  - Sorbonne Universités, UPMC Univ Paris 06, CNRS, IBPS, UMR 7238, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), 75005 Paris, France
- Alessandra Carbone
  - Sorbonne Universités, UPMC Univ Paris 06, CNRS, IBPS, UMR 7238, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), 75005 Paris, France
  - Institut Universitaire de France, 75005 Paris, France
5
Zhang YH, Gong YJ, Gu TL, Yuan HQ, Zhang W, Kwong S, Zhang J. DECAL: Decomposition-Based Coevolutionary Algorithm for Many-Objective Optimization. IEEE TRANSACTIONS ON CYBERNETICS 2019; 49:27-41. [PMID: 29990116 DOI: 10.1109/tcyb.2017.2762701] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 06/08/2023]
Abstract
This paper develops a decomposition-based coevolutionary algorithm for many-objective optimization, which evolves a number of subpopulations in parallel for approaching the set of Pareto optimal solutions. The many-objective problem is decomposed into a number of subproblems using a set of well-distributed weight vectors. Accordingly, each subpopulation of the algorithm is associated with a weight vector and is responsible for solving the corresponding subproblem. The exploration ability of the algorithm is improved by using a mating pool that collects elite individuals from the cooperative subpopulations for breeding the offspring. In the subsequent environmental selection, the top-ranked individuals in each subpopulation, which are appraised by aggregation functions, survive for the next iteration. Two new aggregation functions with distinct characteristics are designed in this paper to enhance the population diversity and accelerate the convergence speed. The proposed algorithm is compared with several state-of-the-art many-objective evolutionary algorithms on a large number of benchmark instances, as well as on a real-world design problem. Experimental results show that the proposed algorithm is very competitive.
6
Liu MD, Warner EA, Morrissey CE, Fick CW, Wu TS, Ornelas MY, Ochoa GV, Zhang B, Rathbun CM, Porterfield WB, Prescher JA, Leconte AM. Statistical Coupling Analysis-Guided Library Design for the Discovery of Mutant Luciferases. Biochemistry 2018; 57:663-671. [PMID: 29224332 PMCID: PMC6192264 DOI: 10.1021/acs.biochem.7b01014] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Indexed: 11/30/2022]
Abstract
Directed evolution has proven to be an invaluable tool for protein engineering; however, there is still a need for developing new approaches to continue to improve the efficiency and efficacy of these methods. Here, we demonstrate a new method for library design that applies a previously developed bioinformatic method, Statistical Coupling Analysis (SCA). SCA uses homologous enzymes to identify amino acid positions that are mutable and functionally important and engage in synergistic interactions between amino acids. We use SCA to guide a library of the protein luciferase and demonstrate that, in a single round of selection, we can identify luciferase mutants with several valuable properties. Specifically, we identify luciferase mutants that possess both red-shifted emission spectra and improved stability relative to those of the wild-type enzyme. We also identify luciferase mutants that possess a >50-fold change in specificity for modified luciferins. To understand the mutational origin of these improved mutants, we demonstrate the role of mutations at N229, S239, and G246 in altered function. These studies show that SCA can be used to guide library design and rapidly identify synergistic amino acid mutations from a small library.
Affiliation(s)
- Mira D. Liu
  - W.M. Keck Science Department of Claremont McKenna, Pitzer, and Scripps Colleges, Claremont, California, 91711, United States of America
- Elliot A. Warner
  - W.M. Keck Science Department of Claremont McKenna, Pitzer, and Scripps Colleges, Claremont, California, 91711, United States of America
- Charlotte E. Morrissey
  - W.M. Keck Science Department of Claremont McKenna, Pitzer, and Scripps Colleges, Claremont, California, 91711, United States of America
- Caitlyn W. Fick
  - W.M. Keck Science Department of Claremont McKenna, Pitzer, and Scripps Colleges, Claremont, California, 91711, United States of America
- Taia S. Wu
  - W.M. Keck Science Department of Claremont McKenna, Pitzer, and Scripps Colleges, Claremont, California, 91711, United States of America
- Marya Y. Ornelas
  - W.M. Keck Science Department of Claremont McKenna, Pitzer, and Scripps Colleges, Claremont, California, 91711, United States of America
- Gabriela V. Ochoa
  - W.M. Keck Science Department of Claremont McKenna, Pitzer, and Scripps Colleges, Claremont, California, 91711, United States of America
- Brendan Zhang
  - Department of Chemistry, University of California – Irvine, Irvine, California, 92697, United States of America
- Colin M. Rathbun
  - Department of Chemistry, University of California – Irvine, Irvine, California, 92697, United States of America
- William B. Porterfield
  - Department of Chemistry, University of California – Irvine, Irvine, California, 92697, United States of America
- Jennifer A. Prescher
  - Department of Chemistry, University of California – Irvine, Irvine, California, 92697, United States of America
  - Department of Molecular Biology and Biochemistry, University of California – Irvine, Irvine, California, 92697, United States of America
  - Department of Pharmaceutical Sciences, University of California – Irvine, Irvine, California, 92697, United States of America
- Aaron M. Leconte
  - W.M. Keck Science Department of Claremont McKenna, Pitzer, and Scripps Colleges, Claremont, California, 91711, United States of America
7
Membrane proteins structures: A review on computational modeling tools. BIOCHIMICA ET BIOPHYSICA ACTA-BIOMEMBRANES 2017; 1859:2021-2039. [DOI: 10.1016/j.bbamem.2017.07.008] [Citation(s) in RCA: 62] [Impact Index Per Article: 7.8] [Received: 03/23/2017] [Revised: 07/04/2017] [Accepted: 07/13/2017] [Indexed: 01/02/2023]
8
Mandloi S, Chakrabarti S. Protein sites with more coevolutionary connections tend to evolve slower, while more variable protein families acquire higher coevolutionary connections. F1000Res 2017; 6:453. [PMID: 28751967 PMCID: PMC5506539 DOI: 10.12688/f1000research.11251.2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Accepted: 07/05/2017] [Indexed: 11/20/2022] Open
Abstract
Background: Amino acid exchanges within proteins sometimes compensate for one another and could therefore be co-evolved. It is essential to investigate the intricate relationship between the extent of coevolution and the evolutionary variability exerted at individual protein sites, as well as the whole protein. Methods: In this study, we have used a reliable set of coevolutionary connections (sites within 10 Å spatial distance) and investigated their correlation with the evolutionary diversity within the respective protein sites. Results: Based on our observations, we propose an interesting hypothesis that higher numbers of coevolutionary connections are associated with less evolutionarily variable protein sites, while higher numbers of coevolutionary connections can be observed for a protein family that has higher evolutionary variability. Our findings also indicate that highly coevolved sites located in a solvent-accessible state tend to be less evolutionarily variable. This relationship reverses at the whole protein level, where cytoplasmic and extracellular proteins show moderately higher anti-correlation between the number of coevolutionary connections and the average evolutionary conservation of the whole protein. Conclusions: The observations and hypothesis presented in this study provide intriguing insights towards understanding the critical relationship between coevolutionary and evolutionary changes observed within proteins. Our observations encourage further investigation to find out the reasons behind subtle variations in the relationship between coevolutionary connectivity and evolutionary diversity for proteins located at various cellular localizations and/or involved in different molecular-biological functions.
Affiliation(s)
- Sapan Mandloi
  - Department of Structural Biology and Bioinformatics Division, Council of Scientific and Industrial Research, Indian Institute of Chemical Biology, Kolkata, West Bengal, 700032, India
- Saikat Chakrabarti
  - Department of Structural Biology and Bioinformatics Division, Council of Scientific and Industrial Research, Indian Institute of Chemical Biology, Kolkata, West Bengal, 700032, India
9
Dhers S, Holub J, Lehn JM. Coevolution and ratiometric behaviour in metal cation-driven dynamic covalent systems. Chem Sci 2016; 8:2125-2130. [PMID: 28507664 PMCID: PMC5407266 DOI: 10.1039/c6sc04662b] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 10/18/2016] [Accepted: 11/24/2016] [Indexed: 12/15/2022] Open
Abstract
Coevolution can be defined as the correlated changes of structurally and/or functionally connected entities. Dynamic Covalent Libraries (DCLs) have been used to demonstrate coevolution and ratiometric behaviour on a molecular level using dynamic covalent molecules such as imines and hydrazones.
Two systems are presented: the first system is based on a dialdehyde and two diamines in combination with Zn(II) and Hg(II) to form a 2 × 2 Constitutional Dynamic Network (CDN) of four complexes of macrocyclic bis-imines. Whereas the two metal ions, when reacted separately, form a complex with each macrocycle with low selectivity, when applied together, each cation selectively yields a complex with one of the two macrocycles. Thus, the simultaneous application of both cations, where one might have expected the formation of four different complexes, results in the synergistic evolution (co-evolution) towards a simpler, more selective outcome under agonist amplification. The second system of 4 components, 2 amines and 2 aldehydes, displays metalloselection together with a correlated evolution in distribution on complexation of Zn(II) and Cu(I) with the dynamic ligand constituents and exhibits a dynamic ratiometry process related to the antagonistic behaviour of a pair of ligand constituents.
Affiliation(s)
- Sébastien Dhers
  - Laboratoire de Chimie Supramoléculaire, ISIS, Université de Strasbourg, 8 Allée Gaspard Monge, 67000 Strasbourg, France
- Jan Holub
  - Laboratoire de Chimie Supramoléculaire, ISIS, Université de Strasbourg, 8 Allée Gaspard Monge, 67000 Strasbourg, France
- Jean-Marie Lehn
  - Laboratoire de Chimie Supramoléculaire, ISIS, Université de Strasbourg, 8 Allée Gaspard Monge, 67000 Strasbourg, France
10
Wagner JR, Lee CT, Durrant JD, Malmstrom RD, Feher VA, Amaro RE. Emerging Computational Methods for the Rational Discovery of Allosteric Drugs. Chem Rev 2016; 116:6370-90. [PMID: 27074285 PMCID: PMC4901368 DOI: 10.1021/acs.chemrev.5b00631] [Citation(s) in RCA: 170] [Impact Index Per Article: 18.9] [Indexed: 12/17/2022]
Abstract
Allosteric drug development holds promise for delivering medicines that are more selective and less toxic than those that target orthosteric sites. To date, the discovery of allosteric binding sites and lead compounds has been mostly serendipitous, achieved through high-throughput screening. Over the past decade, structural data has become more readily available for larger protein systems and more membrane protein classes (e.g., GPCRs and ion channels), which are common allosteric drug targets. In parallel, improved simulation methods now provide better atomistic understanding of the protein dynamics and cooperative motions that are critical to allosteric mechanisms. As a result of these advances, the field of predictive allosteric drug development is now on the cusp of a new era of rational structure-based computational methods. Here, we review algorithms that predict allosteric sites based on sequence data and molecular dynamics simulations, describe tools that assess the druggability of these pockets, and discuss how Markov state models and topology analyses provide insight into the relationship between protein dynamics and allosteric drug binding. In each section, we first provide an overview of the various method classes before describing relevant algorithms and software packages.
Affiliation(s)
- Jeffrey R Wagner
  - Department of Chemistry & Biochemistry and National Biomedical Computation Resource, University of California, San Diego, La Jolla, California 92093, United States
- Christopher T Lee
  - Department of Chemistry & Biochemistry and National Biomedical Computation Resource, University of California, San Diego, La Jolla, California 92093, United States
- Jacob D Durrant
  - Department of Chemistry & Biochemistry and National Biomedical Computation Resource, University of California, San Diego, La Jolla, California 92093, United States
- Robert D Malmstrom
  - Department of Chemistry & Biochemistry and National Biomedical Computation Resource, University of California, San Diego, La Jolla, California 92093, United States
- Victoria A Feher
  - Department of Chemistry & Biochemistry and National Biomedical Computation Resource, University of California, San Diego, La Jolla, California 92093, United States
- Rommie E Amaro
  - Department of Chemistry & Biochemistry and National Biomedical Computation Resource, University of California, San Diego, La Jolla, California 92093, United States
11
Baker FN, Porollo A. CoeViz: a web-based tool for coevolution analysis of protein residues. BMC Bioinformatics 2016; 17:119. [PMID: 26956673 PMCID: PMC4782369 DOI: 10.1186/s12859-016-0975-z] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.9] [Received: 10/23/2015] [Accepted: 03/01/2016] [Indexed: 11/30/2022] Open
Abstract
Background: Proteins generally perform their function in a folded state. Residues forming an active site, whether it is a catalytic center or interaction interface, are frequently distant in a protein sequence. Hence, traditional sequence-based prediction methods focusing on a single residue (or a short window of residues) at a time may have difficulties in identifying and clustering the residues constituting a functional site, especially when a protein has multiple functions. Evolutionary information encoded in multiple sequence alignments is known to greatly improve sequence-based predictions. Identification of coevolving residues further advances the protein structure and function annotation by revealing cooperative pairs and higher order groupings of residues. Results: We present a new web-based tool (CoeViz) that provides a versatile analysis and visualization of pairwise coevolution of amino acid residues. The tool computes three covariance metrics (mutual information, chi-square statistic, and Pearson correlation) and one conservation metric (joint Shannon entropy). Implemented adjustments of covariance scores include phylogeny correction, corrections for sequence dissimilarity and alignment gaps, and the average product correction. Visualization of residue relationships is enhanced by hierarchical cluster trees, heat maps, circular diagrams, and the residue highlighting in protein sequence and 3D structure. Unlike other existing tools, CoeViz is not limited to analyzing conserved domains or protein families and can process long, unstructured and multi-domain proteins thousands of residues long. Three examples are provided to illustrate the use of the tool for identification of residues (1) involved in enzymatic function, (2) forming short linear functional motifs, and (3) constituting a structural domain.
Conclusions: CoeViz represents a practical resource for a quick sequence-based protein annotation for molecular biologists, e.g., for identifying putative functional clusters of residues and structural domains. CoeViz also can serve computational biologists as a resource of coevolution matrices, e.g., for developing machine learning-based prediction models. The presented tool is integrated in the POLYVIEW-2D server (http://polyview.cchmc.org/) and available from resulting pages of POLYVIEW-2D.
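Two of the metrics named in this abstract, mutual information and joint Shannon entropy, can be illustrated for a pair of alignment columns. The following is a minimal sketch and not CoeViz's implementation: the function names and the toy alignment are hypothetical, the estimates are plain plug-in frequencies, and none of the adjustments listed in the abstract (phylogeny, gap, sequence-dissimilarity, or average product corrections) are applied.

```python
from collections import Counter
from math import log2


def shannon_entropy(symbols):
    """Plug-in Shannon entropy (in bits) of a sequence of hashable symbols."""
    n = len(symbols)
    return -sum((c / n) * log2(c / n) for c in Counter(symbols).values())


def joint_entropy(col_a, col_b):
    """Joint Shannon entropy of two equally long alignment columns."""
    return shannon_entropy(list(zip(col_a, col_b)))


def mutual_information(col_a, col_b):
    """MI(A; B) = H(A) + H(B) - H(A, B), uncorrected."""
    return shannon_entropy(col_a) + shannon_entropy(col_b) - joint_entropy(col_a, col_b)


# Hypothetical toy alignment: four sequences, three positions.
alignment = ["AKL", "AKV", "GEL", "GEV"]
columns = list(zip(*alignment))  # one tuple of residues per position

mi_01 = mutual_information(columns[0], columns[1])  # positions vary together
mi_02 = mutual_information(columns[0], columns[2])  # positions vary independently
print(f"MI(0,1) = {mi_01:.2f} bits, MI(0,2) = {mi_02:.2f} bits")
```

In this toy alignment, positions 0 and 1 always change together, so their mutual information equals the full one bit of entropy in either column, while positions 0 and 2 share no information. Real alignments need the corrections the abstract describes before such scores are comparable across residue pairs.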
Affiliation(s)
- Frazier N Baker
  - Department of Electrical Engineering and Computing Systems, University of Cincinnati, 2901 Woodside Drive, Cincinnati, OH, 45221, USA
  - Center for Autoimmune Genomics and Etiology, Cincinnati Children's Hospital Medical Center, 3333 Burnet Avenue, Cincinnati, OH, 45229, USA
- Aleksey Porollo
  - Center for Autoimmune Genomics and Etiology, Cincinnati Children's Hospital Medical Center, 3333 Burnet Avenue, Cincinnati, OH, 45229, USA
  - Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, 3333 Burnet Avenue, Cincinnati, OH, 45229, USA
12
Avila-Herrera A, Pollard KS. Coevolutionary analyses require phylogenetically deep alignments and better null models to accurately detect inter-protein contacts within and between species. BMC Bioinformatics 2015; 16:268. [PMID: 26303588 PMCID: PMC4549020 DOI: 10.1186/s12859-015-0677-y] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 02/20/2015] [Accepted: 07/17/2015] [Indexed: 01/09/2023] Open
Abstract
Background: When biomolecules physically interact, natural selection operates on them jointly. Contacting positions in protein and RNA structures exhibit correlated patterns of sequence evolution due to constraints imposed by the interaction, and molecular arms races can develop between interacting proteins in pathogens and their hosts. To evaluate how well methods developed to detect coevolving residues within proteins can be adapted for cross-species, inter-protein analysis, we used statistical criteria to quantify the performance of these methods in detecting inter-protein residues within 8 angstroms of each other in the co-crystal structures of 33 bacterial protein interactions. We also evaluated their performance for detecting known residues at the interface of a host-virus protein complex with a partially solved structure. Results: Our quantitative benchmarking showed that all coevolutionary methods clearly benefit from alignments with many sequences. Methods that aim to detect direct correlations generally outperform other approaches. However, faster mutual information based methods are occasionally competitive in small alignments and with relaxed false positive rates. Two commonly used null distributions are anti-conservative and have high false positive rates in some scenarios, although the empirical distribution of scores performs reasonably well with deep alignments. Conclusions: We conclude that coevolutionary analysis of cross-species protein interactions holds great promise but requires sequencing many more species pairs.
Affiliation(s)
- Aram Avila-Herrera
  - Bioinformatics Graduate Program, University of California, San Francisco, USA
  - Gladstone Institute of Cardiovascular Disease, University of California, San Francisco, USA
- Katherine S Pollard
  - Bioinformatics Graduate Program, University of California, San Francisco, USA
  - Gladstone Institute of Cardiovascular Disease, University of California, San Francisco, USA
  - Department of Epidemiology and Biostatistics, University of California, San Francisco, USA
  - Institute for Human Genetics, University of California, San Francisco, 94158, CA, USA
13
Identification of residues in ABCG2 affecting protein trafficking and drug transport, using co-evolutionary analysis of ABCG sequences. Biosci Rep 2015; 35:BSR20150150. [PMID: 26294421 PMCID: PMC4613716 DOI: 10.1042/bsr20150150] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Received: 06/08/2015] [Accepted: 07/17/2015] [Indexed: 12/31/2022] Open
Abstract
ABCG2 is an ABC (ATP-binding cassette) transporter with a physiological role in urate transport in the kidney and is also implicated in multi-drug efflux from a number of organs in the body. The trafficking of the protein and the mechanism by which it recognizes and transports diverse drugs are important areas of research. In the current study, we have made a series of single amino acid mutations in ABCG2 on the basis of sequence analysis. Mutant isoforms were characterized for cell surface expression and function. One mutant (I573A) showed disrupted glycosylation and reduced trafficking kinetics. In contrast with many ABC transporter folding mutations which appear to be 'rescued' by chemical chaperones or low temperature incubation, the I573A mutation was not enriched at the cell surface by either treatment, with the majority of the protein being retained in the endoplasmic reticulum (ER). Two other mutations (P485A and M549A) showed distinct effects on transport of ABCG2 substrates reinforcing the role of TM helix 3 in drug recognition and transport and indicating the presence of intracellular coupling regions in ABCG2.
14
Hahn C, Weiss SJ, Stojanovski S, Bachmann L. Co-Speciation of the Ectoparasite Gyrodactylus teuchis (Monogenea, Platyhelminthes) and Its Salmonid Hosts. PLoS One 2015; 10:e0127340. [PMID: 26080029 PMCID: PMC4469311 DOI: 10.1371/journal.pone.0127340] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 12/09/2014] [Accepted: 04/13/2015] [Indexed: 11/18/2022] Open
Abstract
Co-speciation is a fundamental concept of evolutionary biology and intuitively appealing, yet in practice hard to demonstrate as it is often blurred by other evolutionary processes. We investigate the phylogeographic history of the monogenean ectoparasites Gyrodactylus teuchis and G. truttae on European salmonids of the genus Salmo. Mitochondrial cytochrome oxidase subunit 1 and the nuclear ribosomal internal transcribed spacer 2 were sequenced for 189 Gyrodactylus individuals collected from 50 localities, distributed across most major European river systems, from the Iberian- to the Balkan Peninsula. Despite both anthropogenic and naturally caused admixture of the principal host lineages among major river basins, co-phylogenetic analyses revealed significant global congruence for host and parasite phylogenies, providing firm support for co-speciation of G. teuchis and its salmonid hosts brown trout (S. trutta) and Atlantic salmon (S. salar). The major split within G. teuchis, coinciding with the initial divergence of the hosts was dated to ~1.5 My BP, using a Bayesian framework based on an indirect calibration point obtained from the host phylogeny. The presence of G. teuchis in Europe thus predates some of the major Pleistocene glaciations. In contrast, G. truttae exhibited remarkably low intraspecific genetic diversity. Given the direct life cycle and potentially high transmission potential of gyrodactylids, this finding is interpreted as indication for a recent emergence (<60 ky BP) of G. truttae via a host-switch. Our study thus suggests that instances of two fundamentally different mechanisms of speciation (co-speciation vs. host-switching) may have occurred on the same hosts in Europe within a time span of less than 1.5 My in two gyrodactylid ectoparasite species.
Affiliation(s)
- Christoph Hahn
  - Natural History Museum, University of Oslo, 0318, Oslo, Norway
  - School for Biological, Biomedical and Environmental Science, University of Hull, Hull, HU6 7RX, United Kingdom
- Steven J. Weiss
  - Institute of Zoology, Karl-Franzens University of Graz, 8010, Graz, Austria
- Stojmir Stojanovski
  - Department of Fish Parasitology, Hydrobiological Institute, 6000, Ohrid, R. Macedonia
- Lutz Bachmann
  - Natural History Museum, University of Oslo, 0318, Oslo, Norway
|
15
|
Hollis KL, Harrsch FA, Nowbahari E. Ants vs. antlions: An insect model for studying the role of learned and hard-wired behavior in coevolution. Learning and Motivation 2015. [DOI: 10.1016/j.lmot.2014.11.003] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Indexed: 10/24/2022]
|
16
|
Minguez P, Letunic I, Parca L, Garcia-Alonso L, Dopazo J, Huerta-Cepas J, Bork P. PTMcode v2: a resource for functional associations of post-translational modifications within and between proteins. Nucleic Acids Res 2014; 43:D494-502. [PMID: 25361965 PMCID: PMC4383916 DOI: 10.1093/nar/gku1081] [Citation(s) in RCA: 71] [Impact Index Per Article: 6.5] [Indexed: 12/27/2022] Open
Abstract
The post-translational regulation of proteins is mainly driven by two molecular events: their modification by several types of moieties and their interaction with other proteins. These two processes are interdependent and together are responsible for the function of the protein in a particular cell state. Several databases focus on the prediction and compilation of protein–protein interactions (PPIs) and no less on the collection and analysis of protein post-translational modifications (PTMs); however, there are no resources that concentrate on describing the regulatory role of PTMs in PPIs. We developed several methods based on residue co-evolution and proximity to predict the functional associations of pairs of PTMs, which we apply to modifications in the same protein and between two interacting proteins. In order to make data available for understudied organisms, PTMcode v2 (http://ptmcode.embl.de) includes a new strategy to propagate PTMs from validated modified sites through orthologous proteins. The second release of PTMcode covers 19 eukaryotic species from which we collected more than 300 000 experimentally verified PTMs (>1 300 000 propagated) of 69 types extracting the post-translational regulation of >100 000 proteins and >100 000 interactions. In total, we report 8 million associations of PTMs regulating single proteins and over 9.4 million interplays tuning PPIs.
Affiliation(s)
- Pablo Minguez
  - European Molecular Biology Laboratory (EMBL), Meyerhofstrasse 1, 69117 Heidelberg, Germany
- Ivica Letunic
  - Biobyte solutions GmbH, Bothestr 142, 69117 Heidelberg, Germany
- Luca Parca
  - European Molecular Biology Laboratory (EMBL), Meyerhofstrasse 1, 69117 Heidelberg, Germany
- Luz Garcia-Alonso
  - Computational Genomics Department, Centro de Investigación Príncipe Felipe (CIPF), Valencia, Spain
- Joaquin Dopazo
  - Computational Genomics Department, Centro de Investigación Príncipe Felipe (CIPF), Valencia, Spain
- Jaime Huerta-Cepas
  - European Molecular Biology Laboratory (EMBL), Meyerhofstrasse 1, 69117 Heidelberg, Germany
- Peer Bork
  - European Molecular Biology Laboratory (EMBL), Meyerhofstrasse 1, 69117 Heidelberg, Germany
  - Max-Delbruck-Centre for Molecular Medicine, Berlin-Buch, Germany
|
17
|
Futai K. Attenuated colicin-based screening to discover and create novel resistance genes. J Microbiol Methods 2014; 100:128-36. [DOI: 10.1016/j.mimet.2014.03.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 12/25/2013] [Revised: 03/06/2014] [Accepted: 03/10/2014] [Indexed: 10/25/2022]
|
18
|
Dib L, Silvestro D, Salamin N. Evolutionary footprint of coevolving positions in genes. Bioinformatics 2014; 30:1241-9. [DOI: 10.1093/bioinformatics/btu012] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Indexed: 01/18/2023] Open
|
19
|
|
20
|
Nemoto W, Saito A, Oikawa H. Recent advances in functional region prediction by using structural and evolutionary information - Remaining problems and future extensions. Comput Struct Biotechnol J 2013; 8:e201308007. [PMID: 24688747 PMCID: PMC3962155 DOI: 10.5936/csbj.201308007] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 08/27/2013] [Revised: 11/12/2013] [Accepted: 11/13/2013] [Indexed: 11/22/2022] Open
Abstract
Structural genomics projects have solved many new structures with unknown functions. One strategy to investigate the function of a structure is to computationally find the functionally important residues or regions on it. Therefore, the development of functional region prediction methods has become an important research subject. An effective approach is to use a method employing structural and evolutionary information, such as the evolutionary trace (ET) method. ET ranks the residues of a protein structure by calculating the scores for relative evolutionary importance, and locates functionally important sites by identifying spatial clusters of highly ranked residues. After ET was developed, numerous ET-like methods were subsequently reported, and many of them are in practical use, although they require certain conditions. In this mini review, we first introduce the remaining problems and the recent improvements in the methods using structural and evolutionary information. We then summarize the recent developments of the methods. Finally, we conclude by describing possible extensions of the evolution- and structure-based methods.
Affiliation(s)
- Wataru Nemoto
  - Division of Life Science and Engineering, School of Science and Engineering, Tokyo Denki University (TDU), Ishizaka, Hatoyama-cho, Hiki-gun, Saitama, 350-0394, Japan
- Akira Saito
  - Division of Life Science and Engineering, School of Science and Engineering, Tokyo Denki University (TDU), Ishizaka, Hatoyama-cho, Hiki-gun, Saitama, 350-0394, Japan
- Hayato Oikawa
  - Division of Life Science and Engineering, School of Science and Engineering, Tokyo Denki University (TDU), Ishizaka, Hatoyama-cho, Hiki-gun, Saitama, 350-0394, Japan
|
21
|
Simonetti FL, Teppa E, Chernomoretz A, Nielsen M, Marino Buslje C. MISTIC: Mutual information server to infer coevolution. Nucleic Acids Res 2013; 41:W8-14. [PMID: 23716641 PMCID: PMC3692073 DOI: 10.1093/nar/gkt427] [Citation(s) in RCA: 122] [Impact Index Per Article: 10.2] [Indexed: 11/13/2022] Open
Abstract
MISTIC (mutual information server to infer coevolution) is a web server for graphical representation of the information contained within a MSA (multiple sequence alignment) and a complete analysis tool for Mutual Information networks in protein families. The server outputs a graphical visualization of several information-related quantities using a circos representation. This provides an integrated view of the MSA in terms of (i) the mutual information (MI) between residue pairs, (ii) sequence conservation and (iii) the residue cumulative and proximity MI scores. Further, an interactive interface to explore and characterize the MI network is provided. Several tools are offered for selecting subsets of nodes from the network for visualization. Node coloring can be set to match different attributes, such as conservation, cumulative MI, proximity MI and secondary structure. Finally, a zip file containing all results can be downloaded. The server is available at http://mistic.leloir.org.ar. In summary, MISTIC allows for a comprehensive, compact, visually rich view of the information contained within an MSA in a manner unique to any other publicly available web server. In particular, the use of circos representation of MI networks and the visualization of the cumulative MI and proximity MI concepts is novel.
Affiliation(s)
- Franco L Simonetti
  - Bioinformatics Unit, Fundación Instituto Leloir, Av. Patricias Argentinas 435, C1405BWE, Buenos Aires, Argentina
|
22
|
Ashenberg O, Laub MT. Using analyses of amino acid coevolution to understand protein structure and function. Methods Enzymol 2013; 523:191-212. [PMID: 23422431 DOI: 10.1016/b978-0-12-394292-0.00009-6] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Indexed: 12/24/2022]
Abstract
Determining which residues of a protein contribute to a specific function is a difficult problem. Analyses of amino acid covariation within a protein family can serve as a useful guide by identifying residues that are functionally coupled. Covariation analyses have been successfully used on several different protein families to identify residues that work together to promote folding, enable protein-protein interactions, or contribute to an enzymatic activity. Covariation is a statistical signal that can be measured in a multiple sequence alignment of homologous proteins. As sequence databases have expanded dramatically, covariation analyses have become easier and more powerful. In this chapter, we describe how functional covariation arises during the evolution of proteins and how this signal can be distinguished from various background signals. We discuss the basic methodology for performing amino acid covariation analysis, using bacterial two-component signal transduction proteins as an example. We provide practical suggestions for each step of the process including assembly of protein sequences, construction of a multiple sequence alignment, measurement of covariation, and analysis of results.
Affiliation(s)
- Orr Ashenberg
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
|
23
|
Dib L, Carbone A. Protein fragments: functional and structural roles of their coevolution networks. PLoS One 2012; 7:e48124. [PMID: 23139761 PMCID: PMC3489791 DOI: 10.1371/journal.pone.0048124] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Received: 02/22/2012] [Accepted: 09/27/2012] [Indexed: 11/19/2022] Open
Abstract
Small protein fragments, and not just residues, can be used as basic building blocks to reconstruct networks of coevolved amino acids in proteins. Fragments often come into physical contact with one another and play a major biological role in the protein. The nature of these interactions may be manifold and extends beyond binding specificity, allosteric regulation and folding constraints. Indeed, coevolving fragments are indicators of important information explaining folding intermediates, peptide assembly, key mutations with known roles in genetic diseases, distinguished subfamily-dependent motifs and differentiated evolutionary pressures on protein regions. Coevolution analysis detects networks of fragment interactions and highlights a high-order organization of fragments, demonstrating the importance of studying this structure at a deeper level. We demonstrate that it can be applied to protein families that are highly conserved or represented by few sequences, thereby enlarging the class of proteins where coevolution analysis can be performed and making large-scale coevolution studies a feasible goal.
Affiliation(s)
- Linda Dib
  - Université Pierre et Marie Curie, UMR 7238, Équipe de Génomique Analytique, Paris, France
  - CNRS, UMR 7238, Laboratoire de Génomique des Microorganismes, Paris, France
- Alessandra Carbone
  - Université Pierre et Marie Curie, UMR 7238, Équipe de Génomique Analytique, Paris, France
  - CNRS, UMR 7238, Laboratoire de Génomique des Microorganismes, Paris, France
|
24
|
Kolaczkowski M, Sroda-Pomianek K, Kolaczkowska A, Michalak K. A conserved interdomain communication pathway of pseudosymmetrically distributed residues affects substrate specificity of the fungal multidrug transporter Cdr1p. Biochimica et Biophysica Acta - Biomembranes 2012; 1828:479-90. [PMID: 23122779 DOI: 10.1016/j.bbamem.2012.10.024] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Received: 05/07/2012] [Revised: 09/19/2012] [Accepted: 10/21/2012] [Indexed: 11/19/2022]
Abstract
Understanding the communication pathways between remote sites in proteins is of key importance for understanding their function and mechanism of action. These remain largely unexplored among the pleiotropic drug resistance (PDR) representatives of the ubiquitous superfamily of ATP-binding cassette (ABC) transporters. To identify functionally coupled residues important for the polyspecific transport by the fungal ABC multidrug transporter Cdr1p, a new selection strategy, towards increased resistance to a preferred substrate of the homologous Snq2p, was applied to a library of randomly generated mutants. The single amino acid substitutions were located pseudosymmetrically in each domain of the internally duplicated protein: in the H-loop of the N-terminal nucleotide binding domain (NBD1) (C363R) and in the C-terminal NBD2 region preceding Walker A (V885G). The central regions of the first transmembrane helices 1 and 7 of both transmembrane domains were also affected by the G521S/D and A1208V substitutions respectively. Although the mutants were expressed at a similar level and located correctly to the plasma membrane, they selectively affected transport of multiple drugs, including azole antifungals. The synergistic effects of combined mutations on drug resistance, drug dependent ATPase activity and transport support the view, inferred from the statistical coupling analysis (SCA) of amino acid coevolution and mutational analysis of other ABC transporter families, that these residues are an important part of the conserved, allosterically coupled interdomain communication network. Our results shed new light on the communication between the pseudosymmetrically arranged domains in a fungal PDR ABC transporter and reveal its profound influence on substrate specificity.
Affiliation(s)
- Marcin Kolaczkowski
  - Department of Biophysics, Wroclaw Medical University, PL-50-368 Wroclaw, Poland.
|
25
|
Li X, Zhang Z, Song J. Computational enzyme design approaches with significant biological outcomes: progress and challenges. Comput Struct Biotechnol J 2012; 2:e201209007. [PMID: 24688648 PMCID: PMC3962085 DOI: 10.5936/csbj.201209007] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Received: 07/24/2012] [Revised: 09/27/2012] [Accepted: 10/04/2012] [Indexed: 11/29/2022] Open
Abstract
Enzymes are powerful biocatalysts; however, there is still a large gap between the number of enzyme-based practical applications and that of naturally occurring enzymes. Multiple experimental approaches have been applied to generate nearly all possible mutations of target enzymes, allowing the identification of desirable variants with improved properties to meet practical needs. Meanwhile, an increasing number of computational methods have been developed during the past few decades to assist in the modification of enzymes. With the development of bioinformatic algorithms, computational approaches are now able to provide more precise guidance for enzyme engineering and make it more efficient and less laborious. In this review, we summarize the recent advances of method development with significant biological outcomes to provide important insights into successful computational protein designs. We also discuss the limitations and challenges of existing methods and the future directions that should improve them.
Affiliation(s)
- Xiaoman Li
  - National Engineering Laboratory for Industrial Enzymes and Key Laboratory of Systems Microbial Biotechnology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, Tianjin 300308, China
- Ziding Zhang
  - State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Jiangning Song
  - National Engineering Laboratory for Industrial Enzymes and Key Laboratory of Systems Microbial Biotechnology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, Tianjin 300308, China
  - Department of Biochemistry and Molecular Biology and ARC Centre of Excellence in Structural and Functional Microbial Genomics, Monash University, Melbourne, VIC 3800, Australia
|
26
|
Dib L, Carbone A. CLAG: an unsupervised non hierarchical clustering algorithm handling biological data. BMC Bioinformatics 2012; 13:194. [PMID: 23216858 PMCID: PMC3519615 DOI: 10.1186/1471-2105-13-194] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Received: 01/18/2012] [Accepted: 07/23/2012] [Indexed: 11/17/2022] Open
Abstract
Background Searching for similarities in a set of biological data is intrinsically difficult because some data points should not be clustered, or should group within several clusters. Under these hypotheses, hierarchical agglomerative clustering is not appropriate. Moreover, if the dataset is not sufficiently well known, as is often the case, supervised classification is not appropriate either. Results CLAG (for CLusters AGgregation) is an unsupervised non hierarchical clustering algorithm designed to cluster a large variety of biological data and to provide a clustered matrix and numerical values indicating cluster strength. CLAG clusters correlation matrices for residues in protein families, gene-expression and miRNA data related to various cancer types, sets of species described by multidimensional vectors of characters, and binary matrices. It does not require all data points to cluster, and it converges, yielding the same result at each run. Its simplicity and speed allow it to run on reasonably large datasets. Conclusions CLAG can be used to investigate the cluster structure present in biological datasets and to identify its underlying graph. It proved to be more informative and accurate than several known clustering methods, such as hierarchical agglomerative clustering, k-means, fuzzy c-means, model-based clustering and affinity propagation clustering, and it does not suffer from the convergence problem of the latter.
Affiliation(s)
- Linda Dib
  - UPMC, UMR7238, Génomique Analytique, 15 rue de l'Ecole de Médecine, F-75006 Paris, France
|
27
|
Patterns of [FeFe] hydrogenase diversity in the gut microbial communities of lignocellulose-feeding higher termites. Appl Environ Microbiol 2012; 78:5368-74. [PMID: 22636002 DOI: 10.1128/aem.08008-11] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 11/20/2022] Open
Abstract
Hydrogen is the central free intermediate in the degradation of wood by termite gut microbes and can reach concentrations exceeding those measured for any other biological system. Degenerate primers targeting the largest family of [FeFe] hydrogenases observed in a termite gut metagenome have been used to explore the evolution and representation of these enzymes in termites. Sequences were cloned from the guts of the higher termites Amitermes sp. strain Cost010, Amitermes sp. strain JT2, Gnathamitermes sp. strain JT5, Microcerotermes sp. strain Cost008, Nasutitermes sp. strain Cost003, and Rhyncotermes sp. strain Cost004. Each gut sample harbored a more rich and evenly distributed population of hydrogenase sequences than observed previously in the guts of lower termites and Cryptocercus punctulatus. This accentuates the physiological importance of hydrogen for higher termite gut ecosystems and may reflect an increased metabolic burden, or metabolic opportunity, created by a lack of gut protozoa. The sequences were phylogenetically distinct from previously sequenced [FeFe] hydrogenases. Phylogenetic and UniFrac comparisons revealed congruence between host phylogeny and hydrogenase sequence library clustering patterns. This may reflect the combined influences of the stable intimate relationship of gut microbes with their host and environmental alterations in the gut that have occurred over the course of termite evolution. These results accentuate the physiological importance of hydrogen to termite gut ecosystems.
|
28
|
Gregorič M, Agnarsson I, Blackledge TA, Kuntner M. How did the spider cross the river? Behavioral adaptations for river-bridging webs in Caerostris darwini (Araneae: Araneidae). PLoS One 2011; 6:e26847. [PMID: 22046378 PMCID: PMC3202572 DOI: 10.1371/journal.pone.0026847] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Received: 08/11/2011] [Accepted: 10/04/2011] [Indexed: 11/21/2022] Open
Abstract
BACKGROUND Interspecific coevolution is well described, but we know significantly less about how multiple traits coevolve within a species, particularly between behavioral traits and biomechanical properties of animals' "extended phenotypes". In orb weaving spiders, coevolution of spider behavior with ecological and physical traits of their webs is expected. Darwin's bark spider (Caerostris darwini) bridges large water bodies, building the largest known orb webs utilizing the toughest known silk. Here, we examine C. darwini web building behaviors to establish how bridge lines are formed over water. We also test the prediction that this spider's unique web ecology and architecture coevolved with new web building behaviors. METHODOLOGY We observed C. darwini in its natural habitat and filmed web building. We observed 90 web building events, and compared web building behaviors to other species of orb web spiders. CONCLUSIONS Caerostris darwini uses a unique set of behaviors, some unknown in other spiders, to construct its enormous webs. First, the spiders release unusually large amounts of bridging silk into the air, which is then carried downwind, across the water body, establishing bridge lines. Second, the spiders perform almost no web site exploration. Third, they construct the orb capture area below the initial bridge line. In contrast to all known orb-weavers, the web hub is therefore not part of the initial bridge line but is instead built de novo. Fourth, the orb contains two types of radial threads, with those in the upper half of the web doubled. These unique behaviors result in a giant, yet rather simplified web. Our results continue to build evidence for the coevolution of behavioral (web building), ecological (web microhabitat) and biomaterial (silk biomechanics) traits that combined allow C. darwini to occupy a unique niche among spiders.
Affiliation(s)
- Matjaž Gregorič
  - Scientific Research Centre, Institute of Biology, Slovenian Academy of Sciences and Arts, Ljubljana, Slovenia.
|
29
|
Novel nucleotide and amino acid covariation between the 5'UTR and the NS2/NS3 proteins of hepatitis C virus: bioinformatic and functional analyses. PLoS One 2011; 6:e25530. [PMID: 21980483 PMCID: PMC3182228 DOI: 10.1371/journal.pone.0025530] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 04/14/2011] [Accepted: 09/06/2011] [Indexed: 01/02/2023] Open
Abstract
Molecular covariation of highly polymorphic viruses is thought to have crucial effects on viral replication and fitness. This study employs association rule data mining of hepatitis C virus (HCV) sequences to search for specific evolutionary covariation and then tests functional relevance on HCV replication. Data mining is performed between nucleotides in the untranslated regions 5′ and 3′UTR, and the amino acid residues in the non-structural proteins NS2, NS3 and NS5B. Results indicate covariance of the 243rd nucleotide of the 5′UTR with the 14th, 41st, 76th, 110th, 211th and 212th residues of NS2 and with the 71st, 175th and 621st residues of NS3. Real-time experiments using an HCV subgenomic system to quantify viral replication confirm replication regulation for each covariant pair between 5′UTR243 and NS2-41, -76, -110, -211, and NS3-71, -175. The HCV subgenomic system with/without the NS2 region shows that regulatory effects vanish without NS2, so replicative modulation mediated by HCV 5′UTR243 depends on NS2. Strong binding of the NS2 variants to HCV RNA correlates with reduced HCV replication whereas weak binding correlates with restoration of HCV replication efficiency, as determined by RNA-protein immunoprecipitation assay band intensity. The dominant haplotype 5′UTR243-NS2-41-76-110-211-NS3-71-175 differs according to the HCV genotype: G-Ile-Ile-Ile-Gly-Ile-Met for genotype 1b and A-Leu-Val-Leu-Ser-Val-Leu for genotypes 1a, 2a and 2b. In conclusion, 5′UTR243 co-varies with specific NS2/3 protein amino acid residues, which may have significant structural and functional consequences for HCV replication. This unreported mechanism involving HCV replication possibly can be exploited in the development of advanced anti-HCV medication.
|
30
|
Yip KY, Utz L, Sitwell S, Hu X, Sidhu SS, Turk BE, Gerstein M, Kim PM. Identification of specificity determining residues in peptide recognition domains using an information theoretic approach applied to large-scale binding maps. BMC Biol 2011; 9:53. [PMID: 21835011 PMCID: PMC3224579 DOI: 10.1186/1741-7007-9-53] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Received: 07/26/2011] [Accepted: 08/11/2011] [Indexed: 01/06/2023] Open
Abstract
Background Peptide Recognition Domains (PRDs) are commonly found in signaling proteins. They mediate protein-protein interactions by recognizing and binding short motifs in their ligands. Although a great deal is known about PRDs and their interactions, prediction of PRD specificities remains largely an unsolved problem. Results We present a novel approach to identifying these Specificity Determining Residues (SDRs). Our algorithm generalizes earlier information theoretic approaches to coevolution analysis, to become applicable to this problem. It leverages the growing wealth of binding data between PRDs and large numbers of random peptides, and searches for PRD residues that exhibit strong evolutionary covariation with some positions of the statistical profiles of bound peptides. The calculations involve only information from sequences, and thus can be applied to PRDs without crystal structures. We applied the approach to PDZ, SH3 and kinase domains, and evaluated the results using both residue proximity in co-crystal structures and verified binding specificity maps from mutagenesis studies. Discussion Our predictions were found to be strongly correlated with the physical proximity of residues, demonstrating the ability of our approach to detect physical interactions of the binding partners. Some high-scoring pairs were further confirmed to affect binding specificity using previous experimental results. Combining the covariation results also allowed us to predict binding profiles with higher reliability than two other methods that do not explicitly take residue covariation into account. Conclusions The general applicability of our approach to the three different domain families demonstrated in this paper suggests its potential in predicting binding targets and assisting the exploration of binding mechanisms.
Affiliation(s)
- Kevin Y Yip
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, USA
|
31
|
Gao H, Dou Y, Yang J, Wang J. New methods to measure residues coevolution in proteins. BMC Bioinformatics 2011; 12:206. [PMID: 21612664 PMCID: PMC3123609 DOI: 10.1186/1471-2105-12-206] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Received: 11/19/2010] [Accepted: 05/26/2011] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The covariation of two sites in a protein is often used as a measure of their coevolution. Many methods have been developed to quantify covariation, and most of them are based on position-specific residue frequencies and use the mutual information (MI) model. RESULTS In this paper, we propose several new measures that incorporate new biological constraints in quantifying covariation. The first measure is the mutual information with the amino acid background distribution (MIB), which incorporates the amino acid background distribution into the marginal distribution of the MI model. The modification is made to remove the effect of amino acid evolutionary pressure in measuring covariation. The second measure is the mutual information of residue physicochemical properties (MIP), which is used to measure the covariation of physicochemical properties of two sites. The third measure, called MIBP, is obtained by applying residue physicochemical properties to the MIB model. Moreover, scores from our new measures are applied to a robust indicator, conn(k), to find the covariation signal of each site. CONCLUSIONS We find that incorporating the amino acid background distribution is effective in removing the effect of evolutionary pressure on amino acids. Thus the MIB measure describes more biological background information for the coevolution of residues. In addition, our analysis reveals that the covariation of physicochemical properties is a new aspect of coevolution information.
Affiliation(s)
- Hongyun Gao
- School of Mathematical Sciences, Dalian University of Technology, Dalian, People’s Republic of China
32
Tungtur S, Parente DJ, Swint-Kruse L. Functionally important positions can comprise the majority of a protein's architecture. Proteins 2011; 79:1589-608. [PMID: 21374721 DOI: 10.1002/prot.22985] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.9] [Received: 06/14/2010] [Revised: 12/08/2010] [Accepted: 12/15/2010] [Indexed: 01/13/2023]
Abstract
Concomitant with the genomic era, many bioinformatics programs have been developed to identify functionally important positions from sequence alignments of protein families. To evaluate these analyses, many have used the LacI/GalR family and determined whether positions predicted to be "important" are validated by published experiments. However, we previously noted that predictions do not identify all of the experimentally important positions present in the linker regions of these homologs. In an attempt to reconcile these differences, we corrected and expanded the LacI/GalR sequence set commonly used in sequence/function analyses. Next, a variety of analyses were carried out (1) for the entire LacI/GalR sequence set and (2) for a subset of homologs with functionally important "YxPxxxAxxL" motifs in their linkers. This strategy was devised to determine whether predictions could be improved by knowledge-based sequence sorting and, for some analyses, did increase the number of linker positions identified. However, two functionally important linker positions were not reliably identified by any analysis. Finally, we compared the new predictions to all known experimental data for E. coli LacI and three homologous linkers. From these, we estimate that >50% of positions are important to the functions of the LacI/GalR homologs. In corollary, neutral positions might occur less frequently and might be easier to detect in sequence analyses. Although analyses have successfully guided mutations that partially exchange protein functions, a better experimental understanding of the sequence/function relationships in protein families would be helpful for uncovering the remaining rules used by nature to evolve new protein functions.
Affiliation(s)
- Sudheer Tungtur
- Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, MSN 3030, Kansas City, Kansas 66160, USA
33
Shou C, Bhardwaj N, Lam HYK, Yan KK, Kim PM, Snyder M, Gerstein MB. Measuring the evolutionary rewiring of biological networks. PLoS Comput Biol 2011; 7:e1001050. [PMID: 21253555 PMCID: PMC3017101 DOI: 10.1371/journal.pcbi.1001050] [Citation(s) in RCA: 87] [Impact Index Per Article: 6.2] [Received: 06/08/2010] [Accepted: 12/03/2010] [Indexed: 11/18/2022] Open
Abstract
We have accumulated a large amount of biological network data and expect even more to come. Soon, we anticipate being able to compare many different biological networks as we commonly do for molecular sequences. It has long been believed that many of these networks change, or "rewire", at different rates. It is therefore important to develop a framework to quantify the differences between networks in a unified fashion. We developed such a formalism based on analogy to simple models of sequence evolution, and used it to conduct a systematic study of network rewiring on all the currently available biological networks. We found that, similar to sequences, biological networks show a decreased rate of change at large time divergences, because of saturation in potential substitutions. However, different types of biological networks consistently rewire at different rates. Using comparative genomics and proteomics data, we found a consistent ordering of the rewiring rates: transcription regulatory, phosphorylation regulatory, genetic interaction, miRNA regulatory, protein interaction, and metabolic pathway network, from fast to slow. This ordering was found in all comparisons we did of matched networks between organisms. To gain further intuition on network rewiring, we compared our observed rewirings with those obtained from simulation. We also investigated how readily our formalism could be mapped to other network contexts; in particular, we showed how it could be applied to analyze changes in a range of "commonplace" networks such as family trees, co-authorships and linux-kernel function dependencies.
Affiliation(s)
- Chong Shou
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, United States of America
- Nitin Bhardwaj
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut, United States of America
- Hugo Y. K. Lam
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, United States of America
- Koon-Kiu Yan
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut, United States of America
- Philip M. Kim
- Terrence Donnelly Center for Cellular and Biomolecular Research, Banting and Best Department of Medical Research, University of Toronto, Toronto, Ontario, Canada
- Michael Snyder
- Department of Genetics, Stanford University, Stanford, California, United States of America
- Mark B. Gerstein
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut, United States of America
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut, United States of America
- Department of Computer Science, Yale University, New Haven, Connecticut, United States of America
34
Dickson RJ, Wahl LM, Fernandes AD, Gloor GB. Identifying and seeing beyond multiple sequence alignment errors using intra-molecular protein covariation. PLoS One 2010; 5:e11082. [PMID: 20596526 PMCID: PMC2893159 DOI: 10.1371/journal.pone.0011082] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.5] [Received: 03/12/2010] [Accepted: 05/17/2010] [Indexed: 11/23/2022] Open
Abstract
Background: There is currently no way to verify the quality of a multiple sequence alignment that is independent of the assumptions used to build it. Sequence alignments are typically evaluated by a number of established criteria: sequence conservation, the number of aligned residues, the frequency of gaps, and the probable correct gap placement. Covariation analysis is used to find putatively important residue pairs in a sequence alignment. Different alignments of the same protein family give different results demonstrating that covariation depends on the quality of the sequence alignment. We thus hypothesized that current criteria are insufficient to build alignments for use with covariation analyses.
Methodology/Principal Findings: We show that current criteria are insufficient to build alignments for use with covariation analyses as systematic sequence alignment errors are present even in hand-curated structure-based alignment datasets like those from the Conserved Domain Database. We show that current non-parametric covariation statistics are sensitive to sequence misalignments and that this sensitivity can be used to identify systematic alignment errors. We demonstrate that removing alignment errors due to 1) improper structure alignment, 2) the presence of paralogous sequences, and 3) partial or otherwise erroneous sequences, improves contact prediction by covariation analysis. Finally we describe two non-parametric covariation statistics that are less sensitive to sequence alignment errors than those described previously in the literature.
Conclusions/Significance: Protein alignments with errors lead to false positive and false negative conclusions (incorrect assignment of covariation and conservation, respectively). Covariation analysis can provide a verification step, independent of traditional criteria, to identify systematic misalignments in protein alignments. Two non-parametric statistics are shown to be somewhat insensitive to misalignment errors, providing increased confidence in contact prediction when analyzing alignments with erroneous regions because they emphasize pairwise covariation over group covariation.
Affiliation(s)
- Russell J. Dickson
- Department of Biochemistry, The University of Western Ontario, London, Canada
- Lindi M. Wahl
- Department of Applied Mathematics, The University of Western Ontario, London, Canada
- Andrew D. Fernandes
- Department of Biochemistry, The University of Western Ontario, London, Canada
- Department of Applied Mathematics, The University of Western Ontario, London, Canada
- Gregory B. Gloor
- Department of Biochemistry, The University of Western Ontario, London, Canada
35
Dunin-Horkawicz S, Lupas AN. Comprehensive analysis of HAMP domains: implications for transmembrane signal transduction. J Mol Biol 2010; 397:1156-74. [PMID: 20184894 DOI: 10.1016/j.jmb.2010.02.031] [Citation(s) in RCA: 73] [Impact Index Per Article: 4.9] [Received: 10/29/2009] [Revised: 02/12/2010] [Accepted: 02/16/2010] [Indexed: 11/26/2022]
Abstract
Homodimeric receptors with one or two transmembrane (TM) segments per monomer are universal to life and represent the largest and most diverse group of cellular TM receptors. They frequently share domain types across phyla and, in some cases, have been recombined experimentally into functional chimeras (e.g., the bacterial aspartate chemoreceptor with the human insulin receptor), suggesting that they have a common mechanism. The nature of this mechanism, however, is still being debated. We have proposed a new model for transduction mechanism by axial helix rotation, based on the structure of a widespread domain, HAMP, that frequently occurs in direct continuation of the last TM segment, primarily in histidine kinases and chemoreceptors. Here we show by statistical analysis that HAMP domain sequences have biophysical properties compatible with the two conformations proposed by the model. The analysis also identifies three networks of coevolving residues, which allow the mechanism to subdivide into individual steps. The most extended of these networks is specific for membrane-bound HAMP domains and most likely accepts the signal from the TM helices. In a classification based on sequence clustering, these HAMPs form a central supercluster, surrounded by smaller clusters of divergent HAMPs, which typically combine into arrays of up to 31 consecutive copies and accept conformational input from other HAMP domains. Unexpectedly, the classification shows a division between domains of histidine kinases and those of chemoreceptors; thus, except for a few versatile lineages, HAMP domains are largely specific for one particular output domain. Within proteins using a given output domain, HAMP domains also show extensive coevolution with histidine kinases, but not with chemoreceptors. 
We attribute the greater capability for recombination among chemoreceptors to their acquisition of a reversible modification system, which acts as a capacitor for the initially deleterious effects of combining domains optimized in different contexts.
Affiliation(s)
- Stanislaw Dunin-Horkawicz
- Department of Protein Evolution, Max-Planck-Institute for Developmental Biology, Spemannstr. 35, D-72076 Tuebingen, Germany
36
Gloor GB, Tyagi G, Abrassart DM, Kingston AJ, Fernandes AD, Dunn SD, Brandl CJ. Functionally compensating coevolving positions are neither homoplasic nor conserved in clades. Mol Biol Evol 2010; 27:1181-91. [PMID: 20065119 DOI: 10.1093/molbev/msq004] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Indexed: 11/13/2022] Open
Abstract
We demonstrated that a pair of positions in phosphoglycerate kinase that score highly by three nonparametric covariation measures are important for function even though the positions can be occupied by aliphatic, aromatic, or charged residues. Examination of these pairs suggested that the majority of the covariation scores could be explained by within-clade conservation. However, an analysis of diversity showed that the conservation within clades of covarying pairs was indistinguishable from pairs of positions that do not covary, thus ruling out both clade conservation and extensive homoplasy as means to identify covarying positions. Mutagenesis showed that the residues in the covarying pair were epistatic, with the type of epistasis being dependent on the initial pair. The results show that nonconserved covarying positions that affect protein function can be identified with high precision.
Affiliation(s)
- Gregory B Gloor
- Department of Biochemistry, University of Western Ontario, London, Ontario, Canada.
37
Kerr ID, Jones PM, George AM. Multidrug efflux pumps: the structures of prokaryotic ATP-binding cassette transporter efflux pumps and implications for our understanding of eukaryotic P-glycoproteins and homologues. FEBS J 2009; 277:550-63. [PMID: 19961540 DOI: 10.1111/j.1742-4658.2009.07486.x] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.0] [Indexed: 01/04/2023]
Abstract
One of the Holy Grails of ATP-binding cassette transporter research is a structural understanding of drug binding and transport in a eukaryotic multidrug resistance pump. These transporters are front-line mediators of drug resistance in cancers and represent an important therapeutic target in future chemotherapy. Although there has been intensive biochemical research into the human multidrug pumps, their 3D structure at atomic resolution remains unknown. The recent determination of the structure of a mouse P-glycoprotein at subatomic resolution is complemented by structures for a number of prokaryotic homologues. These structures have provided advances into our knowledge of the ATP-binding cassette exporter structure and mechanism, and have provided the template data for a number of homology modelling studies designed to reconcile biochemical data on these clinically important proteins.
Affiliation(s)
- Ian D Kerr
- School of Biomedical Sciences, University of Nottingham, Nottingham, UK.
38
Comparing the functional roles of nonconserved sequence positions in homologous transcription repressors: implications for sequence/function analyses. J Mol Biol 2009; 395:785-802. [PMID: 19818797 DOI: 10.1016/j.jmb.2009.10.001] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.6] [Received: 08/03/2009] [Revised: 10/01/2009] [Accepted: 10/02/2009] [Indexed: 11/21/2022]
Abstract
The explosion of protein sequences deduced from genetic code has led to both a problem and a potential resource: Efficient data use requires interpreting the functional impact of sequence change without experimentally characterizing each protein variant. Several groups have hypothesized that interpretation could be aided by analyzing the sequences of naturally occurring homologues. To that end, myriad sequence/function analyses have been developed to predict which conserved, semi-conserved, and nonconserved positions are functionally important. These positions must be discriminated from the nonconserved positions that are functionally silent. However, the assumptions that underlie sequence analyses are based on experimental results that are sparse and usually designed to address different questions. Here, we use three homologues from a test family common to bioinformatics (the LacI/GalR transcription repressors) to test a common assumption: If a position is functionally important for one family member, it has similar importance in all homologues. We generated experimental sequence/function information for each nonconserved position in the 18 amino acids that link the DNA-binding and regulatory domains of three LacI/GalR homologues. We find that the functional importance of each position is preserved among the three linkers, albeit to different degrees. We also find that every linker position contributes to function, which has twofold implications. (1) Since the linker positions range from highly conserved to semi-conserved to nonconserved and contribute to affinity, selectivity, and allosteric response, we assert that sequence/function analyses must identify positions in the LacI/GalR linkers to be qualified as "successful". Many analyses overlook this region since most of the residues do not directly contact ligand. (2) No position in the LacI/GalR linker is functionally silent.
This finding is inconsistent with another underlying principle of many analyses: Using sequence sets to discriminate important from non-contributing positions obligates silent positions, which denotes that most homologues tolerate a variety of amino acid substitutions at the position without functional change. Instead, additional combinatorial mutants in the LacI/GalR linkers show that particular substitutions can be silent in a context-dependent manner. Thus, specific permutations of sequence change (rather than change at silent positions) would facilitate neutral drift during evolution. Finally, the combinatorial mutants also reveal functional synergy between semi- and nonconserved positions. Such functional relationships would be missed by analyses that rely primarily upon co-evolution.
39
Jeon J, Yang JS, Kim S. Integration of evolutionary features for the identification of functionally important residues in major facilitator superfamily transporters. PLoS Comput Biol 2009; 5:e1000522. [PMID: 19798434 PMCID: PMC2739438 DOI: 10.1371/journal.pcbi.1000522] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Received: 03/10/2009] [Accepted: 08/27/2009] [Indexed: 11/18/2022] Open
Abstract
The identification of functionally important residues is an important challenge for understanding the molecular mechanisms of proteins. Membrane protein transporters operate two-state allosteric conformational changes using functionally important cooperative residues that mediate long-range communication from the substrate binding site to the translocation pathway. In this study, we identified functionally important cooperative residues of membrane protein transporters by integrating sequence conservation and co-evolutionary information. A newly derived evolutionary feature, the co-evolutionary coupling number, was introduced to measure the connectivity of co-evolving residue pairs and was integrated with the sequence conservation score. We tested this method on three Major Facilitator Superfamily (MFS) transporters, LacY, GlpT, and EmrD. MFS transporters are an important family of membrane protein transporters, which utilize diverse substrates, catalyze different modes of transport using unique combinations of functional residues, and have enough characterized functional residues to validate the performance of our method. We found that the conserved cores of evolutionarily coupled residues are involved in specific substrate recognition and translocation of MFS transporters. Furthermore, a subset of the residues forms an interaction network connecting functional sites in the protein structure. We also confirmed that our method is effective on other membrane protein transporters. Our results provide insight into the location of functional residues important for the molecular mechanisms of membrane protein transporters. Major Facilitator Superfamily (MFS) transporters are one of the largest families of membrane protein transporters and are ubiquitous to all three kingdoms of life. 
Structural studies of MFS transporters have revealed that the members of this superfamily share structural homology; however, due to weak sequence similarity, their structural similarity has only been found after structural determination. Even after the structures were solved, painstaking efforts were needed to detect functionally important residues. The identification of functionally important cooperative residues from sequences may provide an alternative way to understanding the function of this important class of proteins. Here, we show that it is possible to identify functionally important residues of MFS transporters by integrating two different evolutionary features, sequence conservation and co-evolutionary information. Our results suggest that the conserved cores of evolutionarily coupled residues are involved in specific substrate recognition and translocation of membrane protein transporters. Also, a subset of the identified residues comprises an interaction network connecting functional sites in the protein structure. The ability to identify functional residues from protein sequences may be helpful for locating potential mutagenesis targets in mechanistic studies of membrane protein transporters.
Affiliation(s)
- Jouhyun Jeon
- Division of Molecular and Life Science, Pohang University of Science and Technology, Pohang, Korea
40
Mapping the sequence mutations of the 2009 H1N1 influenza A virus neuraminidase relative to drug and antibody binding sites. Biol Direct 2009; 4:18; discussion 18. [PMID: 19457254 PMCID: PMC2691737 DOI: 10.1186/1745-6150-4-18] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.9] [Received: 05/13/2009] [Accepted: 05/20/2009] [Indexed: 11/30/2022] Open
Abstract
In this work, we study the consequences of sequence variations of the "2009 H1N1" (swine or Mexican flu) influenza A virus strain neuraminidase for drug treatment and vaccination. We find that it is phylogenetically more closely related to European H1N1 swine flu and H5N1 avian flu rather than to the H1N1 counterparts in the Americas. Homology-based 3D structure modeling reveals that the novel mutations are preferentially located at the protein surface and do not interfere with the active site. The latter is the binding cavity for 3 currently used neuraminidase inhibitors: oseltamivir (Tamiflu®), zanamivir (Relenza®) and peramivir; thus, the drugs should remain effective for treatment. However, the antigenic regions of the neuraminidase relevant for vaccine development, serological typing and passive antibody treatment can differ from those of previous strains and already vary among patients. This article was reviewed by Sandor Pongor and L. Aravind.
41
Little DY, Chen L. Identification of coevolving residues and coevolution potentials emphasizing structure, bond formation and catalytic coordination in protein evolution. PLoS One 2009; 4:e4762. [PMID: 19274093 PMCID: PMC2651771 DOI: 10.1371/journal.pone.0004762] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.1] [Received: 08/06/2008] [Accepted: 02/19/2009] [Indexed: 11/30/2022] Open
Abstract
The structure and function of a protein is dependent on coordinated interactions between its residues. The selective pressures associated with a mutation at one site should therefore depend on the amino acid identity of interacting sites. Mutual information has previously been applied to multiple sequence alignments as a means of detecting coevolutionary interactions. Here, we introduce a refinement of the mutual information method that: 1) removes a significant, non-coevolutionary bias and 2) accounts for heteroscedasticity. Using a large, non-overlapping database of protein alignments, we demonstrate that predicted coevolving residue-pairs tend to lie in close physical proximity. We introduce coevolution potentials as a novel measure of the propensity for the 20 amino acids to pair amongst predicted coevolutionary interactions. Ionic, hydrogen, and disulfide bond-forming pairs exhibited the highest potentials. Finally, we demonstrate that pairs of catalytic residues have a significantly increased likelihood to be identified as coevolving. These correlations to distinct protein features verify the accuracy of our algorithm and are consistent with a model of coevolution in which selective pressures towards preserving residue interactions act to shape the mutational landscape of a protein by restricting the set of admissible neutral mutations.
Affiliation(s)
- Daniel Y. Little
- Department of Molecular and Cell Biology, University of California, Berkeley, California, United States of America
- Lu Chen
- Department of Molecular and Cell Biology, University of California, Berkeley, California, United States of America
- Helen Wills Neuroscience Institute, University of California, Berkeley, California, United States of America
42
Fatakia SN, Costanzi S, Chow CC. Computing highly correlated positions using mutual information and graph theory for G protein-coupled receptors. PLoS One 2009; 4:e4681. [PMID: 19262747 PMCID: PMC2650788 DOI: 10.1371/journal.pone.0004681] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.4] [Received: 10/08/2008] [Accepted: 01/07/2009] [Indexed: 01/06/2023] Open
Abstract
G protein-coupled receptors (GPCRs) are a superfamily of seven transmembrane-spanning proteins involved in a wide array of physiological functions and are the most common targets of pharmaceuticals. This study aims to identify a cohort or clique of positions that share high mutual information. Using a multiple sequence alignment of the transmembrane (TM) domains, we calculated the mutual information between all inter-TM pairs of aligned positions and ranked the pairs by mutual information. A mutual information graph was constructed with vertices that corresponded to TM positions and edges between vertices were drawn if the mutual information exceeded a threshold of statistical significance. Positions with high degree (i.e. had significant mutual information with a large number of other positions) were found to line a well defined inter-TM ligand binding cavity for class A as well as class C GPCRs. Although the natural ligands of class C receptors bind to their extracellular N-terminal domains, the possibility of modulating their activity through ligands that bind to their helical bundle has been reported. Such positions were not found for class B GPCRs, in agreement with the observation that there are not known ligands that bind within their TM helical bundle. All identified key positions formed a clique within the MI graph of interest. For a subset of class A receptors we also considered the alignment of a portion of the second extracellular loop, and found that the two positions adjacent to the conserved Cys that bridges the loop with the TM3 qualified as key positions. Our algorithm may be useful for localizing topologically conserved regions in other protein families.
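The graph construction described in this abstract (positions as vertices, an edge wherever the mutual information of a pair exceeds a threshold, and high-degree vertices as key positions) can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: the function names, the toy scores, and the use of a fixed numeric cutoff in place of a statistical-significance threshold are all assumptions.

```python
from collections import defaultdict


def mi_graph_degrees(pair_scores, threshold):
    """Return {position: degree} for the thresholded MI graph.

    pair_scores maps (position_i, position_j) to an MI score.
    threshold is a plain numeric cutoff (an assumption; the abstract
    refers to a threshold of statistical significance).
    """
    neighbors = defaultdict(set)
    for (i, j), score in pair_scores.items():
        if i != j and score > threshold:
            neighbors[i].add(j)   # undirected edge i - j
            neighbors[j].add(i)
    return {pos: len(adj) for pos, adj in neighbors.items()}


def top_positions(degrees, k):
    """Return the k positions with the highest degree (ties by position)."""
    return sorted(degrees, key=lambda p: (-degrees[p], p))[:k]


# Toy scores (assumed data): position 1 shares high MI with 2, 3 and 4.
scores = {(1, 2): 0.9, (1, 3): 0.8, (2, 3): 0.1, (1, 4): 0.7, (3, 4): 0.05}
degrees = mi_graph_degrees(scores, threshold=0.5)
print(degrees)                    # {1: 3, 2: 1, 3: 1, 4: 1}
print(top_positions(degrees, 1))  # [1]
```

Positions with no pair above the cutoff do not appear in the returned dictionary, which is equivalent to treating them as isolated vertices of degree zero.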
Affiliation(s)
- Sarosh N. Fatakia
- Laboratory of Biological Modeling, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland, United States of America
- Stefano Costanzi
- Laboratory of Biological Modeling, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland, United States of America
- Carson C. Chow
- Laboratory of Biological Modeling, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland, United States of America