1
Little J, Chikina M, Clark NL. Evolutionary rate covariation is a reliable predictor of co-functional interactions but not necessarily physical interactions. eLife 2024; 12:RP93333. [PMID: 38415754; PMCID: PMC10942632; DOI: 10.7554/elife.93333] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/29/2024]
Abstract
Co-functional proteins tend to have rates of evolution that covary over time. This correlation between evolutionary rates can be measured over the branches of a phylogenetic tree through methods such as evolutionary rate covariation (ERC), and then used to construct gene networks by the identification of proteins with functional interactions. The cause of this correlation has been hypothesized to result from both compensatory coevolution at physical interfaces and nonphysical forces such as shared changes in selective pressure. This study explores whether coevolution due to compensatory mutations has a measurable effect on the ERC signal. We examined the difference in ERC signal between physically interacting protein domains within complexes compared to domains of the same proteins that do not physically interact. We found no generalizable relationship between physical interaction and high ERC, although a few complexes ranked physical interactions higher than nonphysical interactions. Therefore, we conclude that coevolution due to physical interaction is weak, but present in the signal captured by ERC, and we hypothesize that the stronger signal instead comes from selective pressures on the protein as a whole and maintenance of the general function.
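As a rough illustration of the idea in this abstract, ERC can be sketched as the correlation between two proteins' branch-specific evolutionary rates. The snippet below is a minimal sketch and not the authors' pipeline: the function names and rate values are hypothetical, and it assumes per-branch relative rates have already been estimated and normalized against a genome-wide background.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def erc(rates_a, rates_b):
    """Illustrative ERC score: correlation of relative rates over shared branches.

    rates_a[i] and rates_b[i] must refer to the same branch of the species tree.
    """
    if len(rates_a) != len(rates_b):
        raise ValueError("rate vectors must cover the same branches")
    return pearson(rates_a, rates_b)


# Hypothetical relative rates on five shared branches (made-up values).
protein_a = [0.8, 1.3, 0.5, 1.9, 1.0]
protein_b = [0.7, 1.4, 0.6, 1.7, 1.1]
score = erc(protein_a, protein_b)
```

A score near 1 would indicate that the two proteins speed up and slow down on the same branches; on its own it says nothing about whether the cause is a physical interface or shared selective pressure, which is the distinction the study examines.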
Affiliation(s)
- Jordan Little
- Department of Human Genetics, University of Utah, Salt Lake City, United States
- Maria Chikina
- Department of Computational Biology, University of Pittsburgh, Pittsburgh, United States
- Nathan L Clark
- Department of Human Genetics, University of Utah, Salt Lake City, United States
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, United States
2
Bioinformatic Analysis of Na+, K+-ATPase Regulation through Phosphorylation of the Alpha-Subunit N-Terminus. Int J Mol Sci 2022; 24:ijms24010067. [PMID: 36613508; PMCID: PMC9820343; DOI: 10.3390/ijms24010067] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 11/13/2022] [Revised: 12/01/2022] [Accepted: 12/17/2022] [Indexed: 12/24/2022]
Abstract
The Na+, K+-ATPase is an integral membrane protein which uses the energy of ATP hydrolysis to pump Na+ and K+ ions across the plasma membrane of all animal cells. It plays crucial roles in numerous physiological processes, such as cell volume regulation, nutrient reabsorption in the kidneys, nerve impulse transmission, and muscle contraction. Recent data suggest that it is regulated via an electrostatic switch mechanism involving the interaction of its lysine-rich N-terminus with the cytoplasmic surface of its surrounding lipid membrane, which can be modulated through the regulatory phosphorylation of the conserved serine and tyrosine residues on the protein's N-terminal tail. Prior data indicate that the kinases responsible for phosphorylation belong to the protein kinase C (PKC) and Src kinase families. To provide indications of which particular enzyme of these families might be responsible, we analysed them for evidence of coevolution via the mirror tree method, utilising coevolution as a marker for a functional interaction. The results obtained showed that the most likely kinase isoforms to interact with the Na+, K+-ATPase were the θ and η isoforms of PKC and the Src kinase itself. These theoretical results will guide the direction of future experimental studies.
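The mirror tree approach mentioned in this abstract is commonly summarized as correlating the pairwise evolutionary distance matrices of two protein families across the same set of species. The following is a minimal sketch under that assumption; the species names, distances, and function names are hypothetical and are not taken from the study.

```python
from itertools import combinations
from math import sqrt


def upper_triangle(dist, species):
    """Flatten pairwise distances in a fixed species-pair order."""
    return [dist[a][b] for a, b in combinations(species, 2)]


def mirror_tree_score(dist_a, dist_b, species):
    """Pearson correlation between two families' pairwise distance vectors."""
    x = upper_triangle(dist_a, species)
    y = upper_triangle(dist_b, species)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y))
    var_x = sum((p - mean_x) ** 2 for p in x)
    var_y = sum((q - mean_y) ** 2 for q in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical distances between orthologs of two protein families.
species = ["human", "mouse", "fly"]
family_a = {"human": {"mouse": 0.1, "fly": 0.9}, "mouse": {"fly": 0.8}}
family_b = {"human": {"mouse": 0.2, "fly": 1.8}, "mouse": {"fly": 1.6}}
score = mirror_tree_score(family_a, family_b, species)
```

Because both families share the underlying species phylogeny, raw scores of this kind tend to be inflated, so published variants usually correct for the background species tree before ranking candidate partners.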
3
Gerardos A, Dietler N, Bitbol AF. Correlations from structure and phylogeny combine constructively in the inference of protein partners from sequences. PLoS Comput Biol 2022; 18:e1010147. [PMID: 35576238; PMCID: PMC9135348; DOI: 10.1371/journal.pcbi.1010147] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/22/2021] [Revised: 05/26/2022] [Accepted: 04/27/2022] [Indexed: 11/19/2022]
Abstract
Inferring protein-protein interactions from sequences is an important task in computational biology. Recent methods based on Direct Coupling Analysis (DCA) or Mutual Information (MI) make it possible to find interaction partners among paralogs of two protein families. Does successful inference mainly rely on correlations from structural contacts or from phylogeny, or both? Do these two types of signal combine constructively or hinder each other? To address these questions, we generate and analyze synthetic data produced using a minimal model that allows us to control the amounts of structural constraints and phylogeny. We show that correlations from these two sources combine constructively to increase the performance of partner inference by DCA or MI. Furthermore, signal from phylogeny can rescue partner inference when signal from contacts becomes less informative, including in the realistic case where inter-protein contacts are restricted to a small subset of sites. We also demonstrate that DCA-inferred couplings between non-contact pairs of sites improve partner inference in the presence of strong phylogeny, while deteriorating it otherwise. Moreover, restricting to non-contact pairs of sites preserves inference performance in the presence of strong phylogeny. In a natural data set, as well as in realistic synthetic data based on it, we find that non-contact pairs of sites contribute positively to partner inference performance, and that restricting to them preserves performance, evidencing an important role of phylogeny.
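For readers unfamiliar with the MI score named in this abstract, the sketch below computes the mutual information between two alignment columns from empirical frequencies. It is only the basic column-level quantity, not the partner-inference procedure of the paper; the function name and toy columns are hypothetical, and no pseudocounts or sequence reweighting are applied.

```python
from collections import Counter
from math import log2


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two aligned columns.

    col_a[i] and col_b[i] are the residues observed in the same paired sequence i.
    """
    if len(col_a) != len(col_b):
        raise ValueError("columns must have the same number of sequences")
    n = len(col_a)
    freq_a = Counter(col_a)
    freq_b = Counter(col_b)
    freq_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), count in freq_ab.items():
        p_ab = count / n
        mi += p_ab * log2(p_ab / ((freq_a[a] / n) * (freq_b[b] / n)))
    return mi


# Toy columns: perfectly coupled residues versus independent residues.
coupled = mutual_information("AALL", "KKEE")
independent = mutual_information("AALL", "KEKE")
```

As the abstract points out, a high score of this kind can arise from shared phylogeny as well as from structural contacts, so it should not be read as evidence of a contact by itself.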
Affiliation(s)
- Andonis Gerardos
- Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Nicola Dietler
- Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Anne-Florence Bitbol
- Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
4
Palenchar PM. The Influence of Codon Usage, Protein Abundance, and Protein Stability on Protein Evolution Vary by Evolutionary Distance and the Type of Protein. Protein J 2022; 41:216-229. [PMID: 35147896; DOI: 10.1007/s10930-022-10045-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Accepted: 02/01/2022] [Indexed: 12/01/2022]
Abstract
In general, the evolutionary rate of proteins is not primarily related to protein and amino acid functions, and factors such as protein abundance, codon usage, and the protein's TM are more important. To better understand the factors that affect protein evolution, E. coli MG1655 orthologs were compared to those in closely related bacteria and to more distantly related prokaryotes, eukaryotes, and archaea. Also, the evolution of different types of proteins was studied. The analyses indicate that the amino acid conservation of enzymes that do not use macromolecules (e.g. DNA, RNA, and proteins) as substrates and that carry out metabolic processes involving small molecules (i.e. small molecule enzymes) is different than other enzymes. For example, the small molecule enzymes have a lower percent identity than other enzymes when sequences from closely related bacteria are compared. Analyses indicate the lower percent identity is not a result of the amino acid or codon usage of the small molecule enzymes. The small molecule enzymes also don't have a significantly lower protein abundance indicating that is also not likely an important factor driving differences in amino acid conservation. Analyses indicate different methods to measure the TM of proteins have different relationships between amino acid conservation over different evolutionary distances. In totality, the results demonstrate that the relationship between the factors thought to affect protein evolution (protein abundance, codon usage, and proteins TMs) and protein evolution are complex and depend on the factor, the organisms, and the type of proteins being analyzed.
Affiliation(s)
- Peter M Palenchar
- Department of Chemistry, Villanova University, 800 E. Lancaster Ave, Villanova, PA, 19805, USA.
5
Phylogenetic correlations can suffice to infer protein partners from sequences. PLoS Comput Biol 2019; 15:e1007179. [PMID: 31609984; PMCID: PMC6812855; DOI: 10.1371/journal.pcbi.1007179] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 06/10/2019] [Revised: 10/24/2019] [Accepted: 09/25/2019] [Indexed: 12/30/2022]
Abstract
Determining which proteins interact together is crucial to a systems-level understanding of the cell. Recently, algorithms based on Direct Coupling Analysis (DCA) pairwise maximum-entropy models have made it possible to identify interaction partners among paralogous proteins from sequence data. This success of DCA at predicting protein-protein interactions could be mainly based on its known ability to identify pairs of residues that are in contact in the three-dimensional structure of protein complexes and that coevolve to remain physicochemically complementary. However, interacting proteins possess similar evolutionary histories. What is the role of purely phylogenetic correlations in the performance of DCA-based methods to infer interaction partners? To address this question, we employ controlled synthetic data that only involve phylogeny and no interactions or contacts. We find that DCA accurately identifies the pairs of synthetic sequences that share evolutionary history. While phylogenetic correlations confound the identification of contacting residues by DCA, they are thus useful to predict interacting partners among paralogs. We find that DCA performs as well as phylogenetic methods to this end, and slightly better than them with large and accurate training sets. Employing DCA or phylogenetic methods within an Iterative Pairing Algorithm (IPA) makes it possible to predict pairs of evolutionary partners without a training set. We further demonstrate the ability of these various methods to correctly predict pairings among real paralogous proteins with genome proximity but no known direct physical interaction, illustrating the importance of phylogenetic correlations in natural data. However, for physically interacting and strongly coevolving proteins, DCA and mutual information outperform phylogenetic methods. We finally discuss how to distinguish physically interacting proteins from proteins that only share a common evolutionary history.
Many biologically important protein-protein interactions are conserved over evolutionary time scales. This leads to two different signals that can be used to computationally predict interactions between protein families and to identify specific interaction partners. First, the shared evolutionary history leads to highly similar phylogenetic relationships between interacting proteins of the two families. Second, the need to keep the interaction surfaces of partner proteins biophysically compatible causes a correlated amino-acid usage of interface residues. Employing simulated data, we show that the shared history alone can be used to detect partner proteins. Similar accuracies are achieved by algorithms comparing phylogenetic relationships and by methods based on Direct Coupling Analysis (DCA), which are primarily known for their ability to detect the second type of signal. Using natural sequence data, we show that in cases with shared evolutionary history but without known physical interactions, both methods work with similar accuracy, while for some physically interacting systems, DCA and mutual information outperform phylogenetic methods. We propose methods allowing both to predict interactions between protein families and to find interacting partners among paralogs.
6
Ding Z, Kihara D. Computational Methods for Predicting Protein-Protein Interactions Using Various Protein Features. Curr Protoc Protein Sci 2018; 93:e62. [PMID: 29927082; PMCID: PMC6097941; DOI: 10.1002/cpps.62] [Citation(s) in RCA: 41] [Impact Index Per Article: 6.8] [Indexed: 12/24/2022]
Abstract
Understanding protein-protein interactions (PPIs) in a cell is essential for learning protein functions, pathways, and mechanism of diseases. PPIs are also important targets for developing drugs. Experimental methods, both small-scale and large-scale, have identified PPIs in several model organisms. However, results cover only a part of PPIs of organisms; moreover, there are many organisms whose PPIs have not yet been investigated. To complement experimental methods, many computational methods have been developed that predict PPIs from various characteristics of proteins. Here we provide an overview of literature reports to classify computational PPI prediction methods that consider different features of proteins, including protein sequence, genomes, protein structure, function, PPI network topology, and those which integrate multiple methods. © 2018 by John Wiley & Sons, Inc.
Affiliation(s)
- Ziyun Ding
- Department of Biological Science, Purdue University, West Lafayette, IN, 47907 USA
- Daisuke Kihara
- Department of Biological Science, Purdue University, West Lafayette, IN, 47907 USA
- Department of Computer Science, Purdue University, West Lafayette, IN, 47907 USA
- Corresponding author: DK; Phone: 1-765-496-2284 (DK)
7
Teppa E, Zea DJ, Marino-Buslje C. Protein-protein interactions leave evolutionary footprints: High molecular coevolution at the core of interfaces. Protein Sci 2017; 26:2438-2444. [PMID: 28980349; DOI: 10.1002/pro.3318] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 05/04/2017] [Revised: 09/19/2017] [Accepted: 10/02/2017] [Indexed: 01/28/2023]
Abstract
Protein-protein interactions are essential to all aspects of life. Specific interactions result from evolutionary pressure at the interacting interfaces of partner proteins. However, evolutionary pressure is not homogeneous within the interface: for instance, each residue does not contribute equally to the binding energy of the complex. To understand functional differences between residues within the interface, we analyzed their properties in the core and rim regions. Here, we characterized protein interfaces with two evolutionary measures, conservation and coevolution, using a comprehensive dataset of 896 protein complexes. These scores can detect different selection pressures at a given position in a multiple sequence alignment. We also analyzed how the number of interactions in which a residue is involved influences those evolutionary signals. We found that the coevolutionary signal is higher in the interface core than in the interface rim region. Additionally, the difference in coevolution between core and rim regions is comparable to the known difference in conservation between those regions. Considering proteins with multiple interactions, we found that conservation and coevolution increase with the number of different interfaces in which a residue is involved, suggesting that more constraints (i.e., a residue that must satisfy a greater number of interactions) allow fewer sequence changes at those positions, resulting in higher conservation and coevolution values. These findings shed light on the evolution of protein interfaces and provide information useful for identifying protein interfaces and predicting protein-protein interactions.
Affiliation(s)
- Elin Teppa
- Bioinformatics Unit, Fundación Instituto Leloir/IIBBA CONICET, Avda. Patricias Argentinas 435, CABA, Argentina
- Diego Javier Zea
- Bioinformatics Unit, Fundación Instituto Leloir/IIBBA CONICET, Avda. Patricias Argentinas 435, CABA, Argentina
- Cristina Marino-Buslje
- Bioinformatics Unit, Fundación Instituto Leloir/IIBBA CONICET, Avda. Patricias Argentinas 435, CABA, Argentina
8
Swint-Kruse L. Using Evolution to Guide Protein Engineering: The Devil IS in the Details. Biophys J 2017; 111:10-8. [PMID: 27410729; DOI: 10.1016/j.bpj.2016.05.030] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 11/10/2015] [Revised: 04/18/2016] [Accepted: 05/20/2016] [Indexed: 10/21/2022]
Abstract
For decades, protein engineers have endeavored to reengineer existing proteins for novel applications. Overall, protein folds and gross functions can be readily transferred from one protein to another by transplanting large blocks of sequence (i.e., domain recombination). However, predictably fine-tuning function (e.g., by adjusting ligand affinity, specificity, catalysis, and/or allosteric regulation) remains a challenge. One approach has been to use the sequences of protein families to identify amino acid positions that change during the evolution of functional variation. The rationale is that these nonconserved positions could be mutated to predictably fine-tune function. Evolutionary approaches to protein design have had some success, but the engineered proteins seldom replicate the functional performances of natural proteins. This Biophysical Perspective reviews several complexities that have been revealed by evolutionary and experimental studies of protein function. These include 1) challenges in defining computational and biological thresholds that define important amino acids; 2) the co-occurrence of many different patterns of amino acid changes in evolutionary data; 3) difficulties in mapping the patterns of amino acid changes to discrete functional parameters; 4) the nonconventional mutational outcomes that occur for a particular group of functionally important, nonconserved positions; 5) epistasis (nonadditivity) among multiple mutations; and 6) the fact that a large fraction of a protein's amino acids contribute to its overall function. To overcome these challenges, new goals are identified for future studies.
Affiliation(s)
- Liskin Swint-Kruse
- Department of Biochemistry and Molecular Biology, University of Kansas Medical Center, Kansas City, Kansas.
9
Mandloi S, Chakrabarti S. Protein sites with more coevolutionary connections tend to evolve slower, while more variable protein families acquire higher coevolutionary connections. F1000Res 2017; 6:453. [PMID: 28751967; PMCID: PMC5506539; DOI: 10.12688/f1000research.11251.2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Accepted: 07/05/2017] [Indexed: 11/20/2022]
Abstract
Background: Amino acid exchanges within proteins sometimes compensate for one another and could therefore be co-evolved. It is essential to investigate the intricate relationship between the extent of coevolution and the evolutionary variability exerted at individual protein sites, as well as the whole protein. Methods: In this study, we have used a reliable set of coevolutionary connections (sites within 10Å spatial distance) and investigated their correlation with the evolutionary diversity within the respective protein sites. Results: Based on our observations, we propose an interesting hypothesis that higher numbers of coevolutionary connections are associated with lesser evolutionary variable protein sites, while higher numbers of the coevolutionary connections can be observed for a protein family that has higher evolutionary variability. Our findings also indicate that highly coevolved sites located in a solvent accessible state tend to be less evolutionary variable. This relationship reverts at the whole protein level where cytoplasmic and extracellular proteins show moderately higher anti-correlation between the number of coevolutionary connections and the average evolutionary conservation of the whole protein. Conclusions: Observations and hypothesis presented in this study provide intriguing insights towards understanding the critical relationship between coevolutionary and evolutionary changes observed within proteins. Our observations encourage further investigation to find out the reasons behind subtle variations in the relationship between coevolutionary connectivity and evolutionary diversity for proteins located at various cellular localizations and/or involved in different molecular-biological functions.
Affiliation(s)
- Sapan Mandloi
- Department of Structural Biology and Bioinformatics Division, Council of Scientific and Industrial Research, Indian Institute of Chemical Biology, Kolkata, West Bengal, 700032, India
- Saikat Chakrabarti
- Department of Structural Biology and Bioinformatics Division, Council of Scientific and Industrial Research, Indian Institute of Chemical Biology, Kolkata, West Bengal, 700032, India
10
Chapa TJ, Du Y, Sun R, Yu D, French AR. Proteomic and phylogenetic coevolution analyses of pM79 and pM92 identify interactions with RNA polymerase II and delineate the murine cytomegalovirus late transcription complex. J Gen Virol 2017; 98:242-250. [PMID: 27926822; DOI: 10.1099/jgv.0.000676] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 11/18/2022]
Abstract
The regulation of the late viral gene expression in betaherpesviruses is largely undefined. We have previously shown that the murine cytomegalovirus proteins pM79 and pM92 are required for late gene transcription. Here, we provide insight into the mechanism of pM79 and pM92 activity by determining their interaction partners during infection. Co-immunoprecipitation-coupled MS studies demonstrate that pM79 and pM92 interact with an array of cellular and viral proteins involved in transcription. Specifically, we identify RNA polymerase II as a cellular target for both pM79 and pM92. We use inter-protein coevolution analysis to show how pM79 and pM92 likely assemble into a late transcription complex composed of late transcription regulators pM49, pM87 and pM95. Combining proteomic methods with coevolution computational analysis provides novel insights into the relationship between pM79, pM92 and RNA polymerase II and allows the generation of a model of the multi-component viral complex that regulates late gene transcription.
Affiliation(s)
- Travis J Chapa
- Department of Molecular and Medical Pharmacology, University of California, Los Angeles, Los Angeles, CA 90095, USA; Division of Pediatric Rheumatology, Department of Pediatrics, Washington University School of Medicine, Saint Louis, MO 63110, USA; Department of Molecular Microbiology, Washington University School of Medicine, Saint Louis, MO 63110, USA
- Yushen Du
- Department of Molecular and Medical Pharmacology, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Ren Sun
- Department of Molecular and Medical Pharmacology, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Dong Yu
- Department of Molecular Microbiology, Washington University School of Medicine, Saint Louis, MO 63110, USA
- Anthony R French
- Division of Pediatric Rheumatology, Department of Pediatrics, Washington University School of Medicine, Saint Louis, MO 63110, USA
11
Zhou H, Kann MG, Mallory EK, Yang YH, Bugshan A, Binmadi NO, Basile JR. Recruitment of Tiam1 to Semaphorin 4D Activates Rac and Enhances Proliferation, Invasion, and Metastasis in Oral Squamous Cell Carcinoma. Neoplasia 2016; 19:65-74. [PMID: 28038319; PMCID: PMC5198113; DOI: 10.1016/j.neo.2016.12.004] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 08/12/2016] [Revised: 12/01/2016] [Accepted: 12/05/2016] [Indexed: 12/25/2022]
Abstract
The semaphorins and the plexins are a family of large, cysteine-rich proteins originally identified as regulators of axon growth and lymphocyte activation that are now known to provide motility and positional information for a number of cell and tissue types. For example, our group and others have shown that some malignancies overexpress Semaphorin 4D (S4D), which acts through its receptor Plexin-B1 (PB1) on endothelial cells to attract blood vessels from the surrounding stroma for the purpose of supporting tumor growth. While plexins are the known functional receptors for the semaphorins, there is evidence that transmembrane semaphorins may transmit a signal themselves through their short cytoplasmic tail, a phenomenon known as ‘reverse signaling.’ We used computational methods based upon correlated evolution of sequences of interacting proteins, mutational analysis, and in vitro and in vivo measurements of tumor aggressiveness to show that when bound to PB1, transmembrane S4D associates with the Rac GTPase exchange factor T lymphoma invasion and metastasis (Tiam) 1, which activates Rac and promotes proliferation, invasion and metastasis in oral squamous cell carcinoma (OSCC) cells. These results suggest that not only can S4D production by tumor cells affect the microenvironment, but engagement of this semaphorin at the cell surface activates a reverse signaling mechanism that influences tumor aggressiveness in OSCC.
Affiliation(s)
- Hua Zhou
- Department of Oncology and Diagnostic Sciences, University of Maryland Dental School, 650 W. Baltimore Street, 7-North, Baltimore, MD 21201, USA
- Maricel G Kann
- Dept of Biological Sciences, University of Maryland Baltimore County, 1000 Hilltop Circle, Baltimore, MD 21250, USA
- Emily K Mallory
- Biomedical Informatics Training Program, Stanford University, 1265 Welch Road, Stanford, CA 94305, USA
- Ying-Hua Yang
- Department of Oncology and Diagnostic Sciences, University of Maryland Dental School, 650 W. Baltimore Street, 7-North, Baltimore, MD 21201, USA
- Amr Bugshan
- Department of Oncology and Diagnostic Sciences, University of Maryland Dental School, 650 W. Baltimore Street, 7-North, Baltimore, MD 21201, USA
- Nada O Binmadi
- Department of Oncology and Diagnostic Sciences, University of Maryland Dental School, 650 W. Baltimore Street, 7-North, Baltimore, MD 21201, USA; Department of Oral Basic & Clinical Sciences, King Abdulaziz University, Jeddah 21589, Saudi Arabia
- John R Basile
- Department of Oncology and Diagnostic Sciences, University of Maryland Dental School, 650 W. Baltimore Street, 7-North, Baltimore, MD 21201, USA; Greenebaum Cancer Center, 22 S. Greene Street, Baltimore, MD 21201, USA.
12
Ney B, Ahmed FH, Carere CR, Biswas A, Warden AC, Morales SE, Pandey G, Watt SJ, Oakeshott JG, Taylor MC, Stott MB, Jackson CJ, Greening C. The methanogenic redox cofactor F420 is widely synthesized by aerobic soil bacteria. ISME J 2016; 11:125-137. [PMID: 27505347; DOI: 10.1038/ismej.2016.100] [Citation(s) in RCA: 51] [Impact Index Per Article: 6.4] [Received: 03/11/2016] [Revised: 06/07/2016] [Accepted: 06/13/2016] [Indexed: 02/07/2023]
Abstract
F420 is a low-potential redox cofactor that mediates the transformations of a wide range of complex organic compounds. Considered one of the rarest cofactors in biology, F420 is best known for its role in methanogenesis and has only been chemically identified in two phyla to date, the Euryarchaeota and Actinobacteria. In this work, we show that this cofactor is more widely distributed than previously reported. We detected the genes encoding all five known F420 biosynthesis enzymes (cofC, cofD, cofE, cofG and cofH) in at least 653 bacterial and 173 archaeal species, including members of the dominant soil phyla Proteobacteria, Chloroflexi and Firmicutes. Metagenome datamining validated that these genes were disproportionately abundant in aerated soils compared with other ecosystems. We confirmed through high-performance liquid chromatography analysis that aerobically grown stationary-phase cultures of three bacterial species, Paracoccus denitrificans, Oligotropha carboxidovorans and Thermomicrobium roseum, synthesized F420, with oligoglutamate sidechains of different lengths. To understand the evolution of F420 biosynthesis, we also analyzed the distribution, phylogeny and genetic organization of the cof genes. Our data suggest that although the Fo precursor to F420 originated in methanogens, F420 itself was first synthesized in an ancestral actinobacterium. F420 biosynthesis genes were then disseminated horizontally to archaea and other bacteria. Together, our findings suggest that the cofactor is more significant in aerobic bacterial metabolism and soil ecosystem composition than previously thought. The cofactor may confer several competitive advantages for aerobic soil bacteria by mediating their central metabolic processes and broadening the range of organic compounds they can synthesize, detoxify and mineralize.
Affiliation(s)
- Blair Ney
- Research School of Chemistry, Australian National University, Acton, Australian Capital Territory, Australia; The Commonwealth Scientific and Industrial Research Organisation, Land and Water, Acton, Australian Capital Territory, Australia
- F Hafna Ahmed
- Research School of Chemistry, Australian National University, Acton, Australian Capital Territory, Australia
- Carlo R Carere
- GNS Science, Wairakei Research Centre, Taupō, New Zealand
- Ambarish Biswas
- Department of Microbiology and Immunology, University of Otago, Dunedin, New Zealand
- Andrew C Warden
- The Commonwealth Scientific and Industrial Research Organisation, Land and Water, Acton, Australian Capital Territory, Australia
- Sergio E Morales
- Department of Microbiology and Immunology, University of Otago, Dunedin, New Zealand
- Gunjan Pandey
- The Commonwealth Scientific and Industrial Research Organisation, Land and Water, Acton, Australian Capital Territory, Australia
- Stephen J Watt
- Research School of Chemistry, Australian National University, Acton, Australian Capital Territory, Australia
- John G Oakeshott
- The Commonwealth Scientific and Industrial Research Organisation, Land and Water, Acton, Australian Capital Territory, Australia
- Matthew C Taylor
- The Commonwealth Scientific and Industrial Research Organisation, Land and Water, Acton, Australian Capital Territory, Australia
- Colin J Jackson
- Research School of Chemistry, Australian National University, Acton, Australian Capital Territory, Australia
- Chris Greening
- The Commonwealth Scientific and Industrial Research Organisation, Land and Water, Acton, Australian Capital Territory, Australia
13
|
Hetti Arachchilage M, Piontkivska H. Coevolutionary Analysis Identifies Protein-Protein Interaction Sites between HIV-1 Reverse Transcriptase and Integrase. Virus Evol 2016; 2:vew002. [PMID: 27152230 PMCID: PMC4854294 DOI: 10.1093/ve/vew002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022] Open
Abstract
The replication of human immunodeficiency virus-1 (HIV-1) requires reverse transcription of the viral RNA genome and integration of newly synthesized pro-viral DNA into the host genome. This is mediated by the viral proteins reverse transcriptase (RT) and integrase (IN). The formation and stabilization of the pre-integration complex (PIC), which is an essential step for reverse transcription, nuclear import, chromatin targeting, and subsequent integration, involves direct and indirect modes of interaction between RT and IN proteins. While epitope-based treatments targeting IN-viral DNA and IN-RT complexes appear to be a promising combination for an anti-HIV treatment, the mechanisms of IN-RT interactions within the PIC are not well understood due to the transient nature of the protein complex and the intrinsic flexibility of its components. Here, we identify potentially interacting regions between the IN and RT proteins within the PIC through the coevolutionary analysis of amino acid sequences of the two proteins. Our results show that specific regions in the two proteins have strong coevolutionary signatures, suggesting that these regions either experience direct and prolonged interactions between them that require high affinity and/or specificity or that the regions are involved in interactions mediated by dynamic conformational changes and, hence, may involve both direct and indirect interactions. Other regions were found to exhibit weak, but positive correlations, implying interactions that are likely transient and/or have low affinity. We identified a series of specific regions of potential interactions between the IN and RT proteins (e.g., specific peptide regions within the C-terminal domain of IN were identified as potentially interacting with the Connection domain of RT). Coevolutionary analysis can serve as an important step in predicting potential interactions, thus informing experimental studies. 
These studies can be integrated with structural data to gain a better understanding of the mechanisms of HIV protein interactions.
Affiliation(s)
- Helen Piontkivska
- Department of Biological Sciences, Kent State University, Kent, OH 44242, USA
- School of Biomedical Sciences, Kent State University, Kent, OH 44242, USA
14
|
Parente DJ, Ray JCJ, Swint-Kruse L. Amino acid positions subject to multiple coevolutionary constraints can be robustly identified by their eigenvector network centrality scores. Proteins 2015; 83:2293-306. [PMID: 26503808 DOI: 10.1002/prot.24948] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 03/02/2015] [Revised: 09/21/2015] [Accepted: 10/14/2015] [Indexed: 12/21/2022]
Abstract
As proteins evolve, amino acid positions key to protein structure or function are subject to mutational constraints. These positions can be detected by analyzing sequence families for amino acid conservation or for coevolution between pairs of positions. Coevolutionary scores are usually rank-ordered and thresholded to reveal the top pairwise scores, but they also can be treated as weighted networks. Here, we used network analyses to bypass a major complication of coevolution studies: for a given sequence alignment, alternative algorithms usually identify different top pairwise scores. We reconciled results from five commonly used, mathematically divergent algorithms (ELSC, McBASC, OMES, SCA, and ZNMI), using the LacI/GalR and 1,6-bisphosphate aldolase protein families as models. Calculations used unthresholded coevolution scores from which column-specific properties such as sequence entropy and random noise were subtracted; "central" positions were identified by calculating various network centrality scores. When compared among algorithms, network centrality methods, particularly eigenvector centrality, showed markedly better agreement than comparisons of the top pairwise scores. Positions with large centrality scores occurred at key structural locations and/or were functionally sensitive to mutations. Further, the top central positions often differed from those with top pairwise coevolution scores: instead of a few strong scores, central positions often had multiple, moderate scores. We conclude that eigenvector centrality calculations reveal a robust evolutionary pattern of constraints, detectable by divergent algorithms, that occur at key protein locations. Finally, we discuss the fact that multiple patterns coexist in evolutionary data that, together, give rise to emergent protein functions.
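To make the network view in this abstract concrete, the following is a minimal sketch of eigenvector centrality computed by power iteration on a symmetric matrix of pairwise coevolution scores. It is not the authors' pipeline: the function name `eigenvector_centrality`, the toy matrix `scores`, and the convergence settings are assumptions made for illustration, and the entropy and noise corrections described in the abstract are not reproduced.

```python
def eigenvector_centrality(weights, iterations=200, tol=1e-10):
    """Power iteration on a symmetric, non-negative weight matrix.

    weights[i][j] is a hypothetical coevolution score between
    alignment positions i and j (0.0 where no score is used).
    Returns one centrality value per position, scaled to unit length.
    """
    n = len(weights)
    current = [1.0 / n] * n
    for _ in range(iterations):
        # Each position's new score is the weighted sum of its neighbours' scores.
        updated = [sum(weights[i][j] * current[j] for j in range(n)) for i in range(n)]
        norm = sum(v * v for v in updated) ** 0.5
        if norm == 0.0:
            return updated  # no edges at all: every centrality is zero
        updated = [v / norm for v in updated]
        if max(abs(a - b) for a, b in zip(updated, current)) < tol:
            return updated
        current = updated
    return current


# Toy network (illustrative values only): position 0 has moderate scores
# with every other position, while positions 3 and 4 share the single
# strongest pairwise score.
scores = [
    [0.0, 0.4, 0.4, 0.4, 0.4],
    [0.4, 0.0, 0.0, 0.0, 0.0],
    [0.4, 0.0, 0.0, 0.0, 0.0],
    [0.4, 0.0, 0.0, 0.0, 0.6],
    [0.4, 0.0, 0.0, 0.6, 0.0],
]
centrality = eigenvector_centrality(scores)
most_central = max(range(len(centrality)), key=centrality.__getitem__)
print(most_central, [round(c, 3) for c in centrality])
```

In this toy matrix the strongest single pair is (3, 4), yet position 0, with several moderate scores, comes out as the most central, which mirrors the distinction the abstract draws between top pairwise scores and central positions.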
Affiliation(s)
- Daniel J Parente
- Department of Biochemistry and Molecular Biology, University of Kansas Medical Center, Kansas City, Kansas, 66160
- J Christian J Ray
- Center for Computational Biology and Department of Molecular Biosciences, University of Kansas, Lawrence, Kansas, 66047
- Liskin Swint-Kruse
- Department of Biochemistry and Molecular Biology, University of Kansas Medical Center, Kansas City, Kansas, 66160
15
|
Cao C, Zhao J, Doughty EK, Migliorini M, Strickland DK, Kann MG, Zhang L. Mac-1 Regulates IL-13 Activity in Macrophages by Directly Interacting with IL-13Rα1. J Biol Chem 2015; 290:21642-51. [PMID: 26160172 DOI: 10.1074/jbc.m115.645796] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 02/16/2015] [Indexed: 11/06/2022] Open
Abstract
Mac-1 exhibits a unique inhibitory activity toward IL-13-induced JAK/STAT activation and thereby regulates macrophage to foam cell transformation. However, the underlying molecular mechanism is unknown. In this study, we report the identification of IL-13Rα1, a component of the IL-13 receptor (IL-13R), as a novel ligand of integrin Mac-1, using a co-evolution-based algorithm. Biochemical analyses demonstrated that recombinant IL-13Rα1 binds Mac-1 in a purified system and supports Mac-1-mediated cell adhesion. Co-immunoprecipitation experiments revealed that endogenous Mac-1 forms a complex with IL-13Rα1 in solution, and confocal fluorescence microscopy demonstrated that these two receptors co-localize with each other on the surface of macrophages. Moreover, we found that genetic inactivation of Mac-1 promotes IL-13-induced JAK/STAT activation in macrophages, resulting in enhanced polarization along the alternative activation pathway. Importantly, we observed that Mac-1(-/-) macrophages exhibit increased expression of foam cell differentiation markers including 15-lipoxygenase and lectin-type oxidized LDL receptor-1 both in vitro and in vivo. Indeed, we found that Mac-1(-/-)LDLR(-/-) mice develop significantly more foam cells than control LDLR(-/-) mice, using an in vivo model of foam cell formation. Together, our data establish for the first time a molecular mechanism by which Mac-1 regulates the signaling activity of IL-13 in macrophages. This newly identified IL-13Rα1/Mac-1-dependent pathway may offer novel targets for therapeutic intervention in the future.
Affiliation(s)
- Emily K Doughty
- the Department of Biological Sciences, University of Maryland, Baltimore County, Baltimore, Maryland 21250
- Mary Migliorini
- Surgery, Center for Vascular and Inflammatory Diseases, the University of Maryland, School of Medicine, Baltimore, Maryland 21201 and
- Dudley K Strickland
- Surgery, Center for Vascular and Inflammatory Diseases, the University of Maryland, School of Medicine, Baltimore, Maryland 21201 and
- Maricel G Kann
- the Department of Biological Sciences, University of Maryland, Baltimore County, Baltimore, Maryland 21250
- Li Zhang
- From the Departments of Physiology and
16
|
Li Z, He Y, Wong L, Li J. Burial Level Change Defines a High Energetic Relevance for Protein Binding Interfaces. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2015; 12:410-421. [PMID: 26357227 DOI: 10.1109/tcbb.2014.2361355] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/05/2023]
Abstract
Protein-protein interfaces defined through atomic contact or solvent accessibility change are widely adopted in structural biology studies. But, these definitions cannot precisely capture energetically important regions at protein interfaces. The burial depth of an atom in a protein is related to the atom's energy. This work investigates how closely the change in burial level of an atom/residue upon complexation is related to the binding. Burial level change is different from burial level itself. An atom deeply buried in a monomer with a high burial level may not change its burial level after an interaction and it may have little burial level change. We hypothesize that an interface is a region of residues all undergoing burial level changes after interaction. By this definition, an interface can be decomposed into an onion-like structure according to the burial level change extent. We found that our defined interfaces cover energetically important residues more precisely, and that the binding free energy of an interface is distributed progressively from the outermost layer to the core. These observations are used to predict binding hot spots. Our approach's F-measure performance on a benchmark dataset of alanine mutagenesis residues is much superior or similar to those by complicated energy modeling or machine learning approaches.
17
|
Wang S, Wei W, Luo X, Cai X. Genome-wide comparisons of phylogenetic similarities between partial genomic regions and the full-length genome in Hepatitis E virus genotyping. PLoS One 2014; 9:e115785. [PMID: 25542033 PMCID: PMC4277416 DOI: 10.1371/journal.pone.0115785] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 08/28/2014] [Accepted: 11/26/2014] [Indexed: 01/23/2023] Open
Abstract
Besides the complete genome, different partial genomic sequences of Hepatitis E virus (HEV) have been used in genotyping studies, making it difficult to compare the results based on them. No commonly agreed partial region for HEV genotyping has been determined. In this study, we used a statistical method to evaluate the phylogenetic performance of each partial genomic sequence on a genome-wide scale, comparing evolutionary distances between genomic regions and the full-length genomes of 101 HEV isolates to identify short genomic regions that can reproduce HEV genotype assignments based on full-length genomes. Several genomic regions, especially one at the 3'-terminus of the papain-like cysteine protease domain, were found to have relatively high phylogenetic correlations with the full-length genome. Phylogenetic analyses confirmed that these regions performed identically to the full-length genome in genotyping, dividing the HEV isolates involved into reasonable genotypes. This analysis may be of value in developing a partial sequence-based consensus classification of HEV species.
Affiliation(s)
- Shuai Wang
- State Key Laboratory of Veterinary Etiological Biology, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, China
- Wei Wei
- State Key Laboratory of Veterinary Etiological Biology, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, China
- Xuenong Luo
- State Key Laboratory of Veterinary Etiological Biology, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, China
- Xuepeng Cai
- State Key Laboratory of Veterinary Etiological Biology, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, China
- * E-mail:
18
|
Yu Q, Li XT, Zhao X, Liu XL, Ikeo K, Gojobori T, Liu QX. Coevolution of axon guidance molecule Slit and its receptor Robo. PLoS One 2014; 9:e94970. [PMID: 24801615 PMCID: PMC4011710 DOI: 10.1371/journal.pone.0094970] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 12/20/2013] [Accepted: 03/21/2014] [Indexed: 11/18/2022] Open
Abstract
Coevolution is important for maintaining the interaction between a ligand and its receptor during evolution. The interaction between the axon guidance molecule Slit and its receptor Robo is critical for axon repulsion in neural tissues and is evolutionarily conserved from planarians to humans. However, the mechanism of coevolution between Slit and Robo remains unclear. In this study, by comparing the amino acids at the interacting sites of Slit and Robo among different organisms, we found that coordinated amino acid changes took place at these sites. In addition, a high correlation between the evolutionary rates of Slit and Robo was identified in vertebrates. Furthermore, sites under positive selection in slit and robo were detected in the same lineages, such as mosquito and teleost. Overall, our results provide evidence for coevolution between Slit and Robo.
Affiliation(s)
- Qi Yu
- Laboratory of Developmental Genetics, Shandong Agricultural University, Tai'an, Shandong, China
- Xiao-Tong Li
- Laboratory of Developmental Genetics, Shandong Agricultural University, Tai'an, Shandong, China
- Xiao Zhao
- Laboratory of Developmental Genetics, Shandong Agricultural University, Tai'an, Shandong, China
- Xun-Li Liu
- Laboratory of Developmental Genetics, Shandong Agricultural University, Tai'an, Shandong, China
- Kazuho Ikeo
- Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Mishima, Shizuoka, Japan
- Takashi Gojobori
- Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Mishima, Shizuoka, Japan
- Qing-Xin Liu
- Laboratory of Developmental Genetics, Shandong Agricultural University, Tai'an, Shandong, China
19
|
Ochoa D, Pazos F. Practical aspects of protein co-evolution. Front Cell Dev Biol 2014; 2:14. [PMID: 25364721 PMCID: PMC4207036 DOI: 10.3389/fcell.2014.00014] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Received: 02/21/2014] [Accepted: 04/02/2014] [Indexed: 11/15/2022] Open
Abstract
Co-evolution is a fundamental aspect of evolutionary theory. At the molecular level, co-evolutionary linkages between protein families have long been used as indicators of protein interactions and functional relationships. Because of the complexity of the problem and the amount of genomic data these approaches require to perform well, it took a relatively long time to move from the first ideas and concepts to the everyday application of these approaches and their incorporation into the standard toolboxes of bioinformaticians and molecular biologists. Today, these methodologies are mature (both in terms of performance and usability/implementation), and the genomic information that feeds them is large enough to allow their general application. This review summarizes the current landscape of co-evolution-based methodologies, with a strong emphasis on cases where their application to important biological systems, alone or in combination with other computational and experimental approaches, yielded new insight into those systems.
Affiliation(s)
- David Ochoa
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI) Hinxton, UK
- Florencio Pazos
- Computational Systems Biology Group, National Centre for Biotechnology (CNB-CSIC) Madrid, Spain
20
|
Evolutionary rate covariation identifies new members of a protein network required for Drosophila melanogaster female post-mating responses. PLoS Genet 2014; 10:e1004108. [PMID: 24453993 PMCID: PMC3894160 DOI: 10.1371/journal.pgen.1004108] [Citation(s) in RCA: 118] [Impact Index Per Article: 11.8] [Received: 04/26/2013] [Accepted: 11/27/2013] [Indexed: 11/19/2022] Open
Abstract
Seminal fluid proteins transferred from males to females during copulation are required for full fertility and can exert dramatic effects on female physiology and behavior. In Drosophila melanogaster, the seminal protein sex peptide (SP) affects mated females by increasing egg production and decreasing receptivity to courtship. These behavioral changes persist for several days because SP binds to sperm that are stored in the female. SP is then gradually released, allowing it to interact with its female-expressed receptor. The binding of SP to sperm requires five additional seminal proteins, which act together in a network. Hundreds of uncharacterized male and female proteins have been identified in this species, but individually screening each protein for network function would present a logistical challenge. To prioritize the screening of these proteins for involvement in the SP network, we used a comparative genomic method to identify candidate proteins whose evolutionary rates across the Drosophila phylogeny co-vary with those of the SP network proteins. Subsequent functional testing of 18 co-varying candidates by RNA interference identified three male seminal proteins and three female reproductive tract proteins that are each required for the long-term persistence of SP responses in females. Molecular genetic analysis showed the three new male proteins are required for the transfer of other network proteins to females and for SP to become bound to sperm that are stored in mated females. The three female proteins, in contrast, act downstream of SP binding and sperm storage. These findings expand the number of seminal proteins required for SP's actions in the female and show that multiple female proteins are necessary for the SP response. Furthermore, our functional analyses demonstrate that evolutionary rate covariation is a valuable predictive tool for identifying candidate members of interacting protein networks. 
Reproduction requires more than a sperm and an egg. In animals with internal fertilization, other proteins in the seminal fluid and the female are essential for full fertility. Although hundreds of such reproductive proteins are known, our ability to understand how they interact remains limited. In this study, we investigated whether shared patterns of protein sequence evolution were predictive of functional interactions by focusing on a small network of proteins that control fertility and female post-mating behavior in the fruit fly, Drosophila melanogaster. We first showed that the six proteins already known to act in this network display correlated patterns of evolution across the Drosophila phylogeny. We then screened hundreds of otherwise uncharacterized male and female reproductive proteins and identified those with patterns of evolution most similar to those of the known network proteins. We tested each of these candidate genes and found six new network members that are each required for long-term fertility. Using molecular genetics, we also observed that the steps in the network at which these new proteins act are consistent with their strongest evolutionary correlations. Our results suggest that patterns of coevolution may be broadly useful for predicting protein interactions in a variety of biological processes.
21
|
Zhou H, Jakobsson E. Predicting protein-protein interaction by the mirrortree method: possibilities and limitations. PLoS One 2013; 8:e81100. [PMID: 24349035 PMCID: PMC3862474 DOI: 10.1371/journal.pone.0081100] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 03/11/2013] [Accepted: 10/11/2013] [Indexed: 12/02/2022] Open
Abstract
Molecular co-evolution analysis as a sequence-only based method has been used to predict protein-protein interactions. In co-evolution analysis, Pearson's correlation within the mirrortree method is a well-known way of quantifying the correlation between protein pairs. Here we studied the mirrortree method on both known interacting protein pairs and sets of presumed non-interacting protein pairs, to evaluate the utility of this correlation analysis method for predicting protein-protein interactions within eukaryotes. We varied metrics for computing evolutionary distance and evolutionary span of the species analyzed. We found the differences between co-evolutionary correlation scores of the interacting and non-interacting proteins, normalized for evolutionary span, to be significantly predictive for proteins conserved over a wide range of eukaryotic clades (from mammals to fungi). On the other hand, for narrower ranges of evolutionary span, the predictive power was much weaker.
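The core of the mirrortree calculation mentioned in this abstract, Pearson's correlation between the inter-species distance matrices of two protein families, can be sketched as follows. This is an illustrative sketch only: the helper names (`upper_triangle`, `pearson`, `mirrortree_score`) and the toy matrices are assumptions, both matrices are assumed to list the same species in the same order, and the normalization for evolutionary span that the study describes is not included.

```python
from math import sqrt


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length vectors with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / sqrt(var_x * var_y)


def mirrortree_score(dist_a, dist_b):
    """Correlate the pairwise species distances of two protein families."""
    if len(dist_a) != len(dist_b):
        raise ValueError("matrices must cover the same species")
    return pearson(upper_triangle(dist_a), upper_triangle(dist_b))


# Toy distance matrices for four species (illustrative values only).
# Off-diagonal entries of family B are a linear function of family A.
family_a = [
    [0.0, 0.1, 0.4, 0.5],
    [0.1, 0.0, 0.3, 0.6],
    [0.4, 0.3, 0.0, 0.2],
    [0.5, 0.6, 0.2, 0.0],
]
family_b = [
    [0.0, 0.3, 0.9, 1.1],
    [0.3, 0.0, 0.7, 1.3],
    [0.9, 0.7, 0.0, 0.5],
    [1.1, 1.3, 0.5, 0.0],
]
print(round(mirrortree_score(family_a, family_b), 6))
```

Because every pairwise distance reflects the shared species tree, a high raw score does not by itself indicate interaction, which is the reason the study compares interacting pairs against presumed non-interacting pairs.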
Affiliation(s)
- Hua Zhou
- Department of Biochemistry, University of Illinois, Urbana-Champaign, Illinois, United States of America
- Eric Jakobsson
- Department of Biochemistry, University of Illinois, Urbana-Champaign, Illinois, United States of America
- Beckman Institute, National Center for Supercomputing Applications, Program in Biophysics and Computational Biology, Department of Molecular and Integrative Physiology, University of Illinois, Urbana-Champaign, Illinois, United States of America
- * E-mail:
22
|
Wang S, Luo X, Wei W, Zheng Y, Dou Y, Cai X. Calculation of evolutionary correlation between individual genes and full-length genome: a method useful for choosing phylogenetic markers for molecular epidemiology. PLoS One 2013; 8:e81106. [PMID: 24312527 PMCID: PMC3849185 DOI: 10.1371/journal.pone.0081106] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 06/15/2013] [Accepted: 10/18/2013] [Indexed: 11/21/2022] Open
Abstract
Individual genes or regions are still commonly used to estimate the phylogenetic relationships among viral isolates. The genomic regions that can faithfully provide assessments consistent with those predicted with full-length genome sequences would be preferable to serve as good candidates of the phylogenetic markers for molecular epidemiological studies of many viruses. Here we employed a statistical method to evaluate the evolutionary relationships between individual viral genes and full-length genomes without tree construction as a way to determine which gene can match the genome well in phylogenetic analyses. This method was performed by calculation of linear correlations between the genetic distance matrices of aligned individual gene sequences and aligned genome sequences. We applied this method to the phylogenetic analyses of porcine circovirus 2 (PCV2), measles virus (MV), hepatitis E virus (HEV) and Japanese encephalitis virus (JEV). Phylogenetic trees were constructed for comparisons and the possible factors affecting the method accuracy were also discussed in the calculations. The results revealed that this method could produce results consistent with those of previous studies about the proper consensus sequences that could be successfully used as phylogenetic markers. And our results also suggested that these evolutionary correlations could provide useful information for identifying genes that could be used effectively to infer the genetic relationships.
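The tree-free comparison described in this abstract, a linear correlation between the genetic distance matrix of a gene and that of the full-length genome, might be sketched as below. The abstract does not state which distance measure was used, so the uncorrected p-distance here is an assumption, as are the function names, the toy alignment, and the region coordinates; none of it is taken from the paper.

```python
from itertools import combinations
from math import sqrt


def p_distance(seq_a, seq_b):
    """Fraction of differing sites between two aligned sequences, ignoring gaps."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # skip alignment columns with a gap in either sequence
        compared += 1
        differing += a != b
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def distance_vector(alignment):
    """Pairwise distances for every pair of isolates, in a fixed order."""
    return [p_distance(a, b) for a, b in combinations(alignment, 2)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / sqrt(var_x * var_y)


def region_correlation(alignment, start, end):
    """Correlate distances from columns [start, end) with whole-alignment distances."""
    region = [seq[start:end] for seq in alignment]
    return pearson(distance_vector(region), distance_vector(alignment))


# Toy aligned "genomes" of four isolates (made-up sequences).
genomes = [
    "ACGTACGTACGT",
    "ACGTACGAACGT",
    "ACCTACGAACTT",
    "TCCTACGAAGTT",
]
# Hypothetical candidate marker regions, given as column ranges.
for name, (start, end) in {"region_1": (0, 6), "region_2": (6, 12)}.items():
    print(name, round(region_correlation(genomes, start, end), 3))
```

A region whose distances track the whole-genome distances closely would be a candidate phylogenetic marker under this criterion; with real data the alignment, distance model, and region boundaries would all need to be chosen deliberately.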
Affiliation(s)
- Shuai Wang
- State Key Laboratory of Veterinary Etiological Biology, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, China
- Xuenong Luo
- State Key Laboratory of Veterinary Etiological Biology, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, China
- Wei Wei
- State Key Laboratory of Veterinary Etiological Biology, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, China
- Yadong Zheng
- State Key Laboratory of Veterinary Etiological Biology, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, China
- Yongxi Dou
- State Key Laboratory of Veterinary Etiological Biology, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, China
- * E-mail: (YD); (XC)
- Xuepeng Cai
- State Key Laboratory of Veterinary Etiological Biology, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, China
- * E-mail: (YD); (XC)
23
|
Abstract
Co-evolution is a fundamental component of the theory of evolution and is essential for understanding the relationships between species in complex ecological networks. A wide range of co-evolution-inspired computational methods has been designed to predict molecular interactions, but it is only recently that important advances have been made. Breakthroughs in the handling of phylogenetic information and in disentangling indirect relationships have resulted in an improved capacity to predict interactions between proteins and contacts between different protein residues. Here, we review the main co-evolution-based computational approaches, their theoretical basis, potential applications and foreseeable developments.
Affiliation(s)
- David de Juan
- Structural Biology and Biocomputing Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
24
|
Pérez-Bercoff Å, Hudson CM, Conant GC. A conserved mammalian protein interaction network. PLoS One 2013; 8:e52581. [PMID: 23320073 PMCID: PMC3539715 DOI: 10.1371/journal.pone.0052581] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 06/13/2012] [Accepted: 11/20/2012] [Indexed: 11/19/2022] Open
Abstract
Physical interactions between proteins mediate a variety of biological functions, including signal transduction, physical structuring of the cell and regulation. While extensive catalogs of such interactions are known from model organisms, their evolutionary histories are difficult to study given the lack of interaction data from phylogenetic outgroups. Using phylogenomic approaches, we infer an upper bound on the time of origin for a large set of human protein-protein interactions, showing that most such interactions appear relatively ancient, dating no later than the radiation of placental mammals. By analyzing paired alignments of orthologous and putatively interacting protein-coding genes from eight mammals, we find evidence for weak but significant co-evolution, as measured by relative selective constraint, between pairs of genes with interacting proteins. However, we find no strong evidence for shared instances of directional selection within an interacting pair. Finally, we use a network approach to show that the distribution of selective constraint across the protein interaction network is non-random, with a clear tendency for interacting proteins to share similar selective constraints. Collectively, the results suggest that, on the whole, protein interactions in mammals are under selective constraint, presumably due to their functional roles.
Affiliation(s)
- Åsa Pérez-Bercoff
- Smurfit Institute of Genetics, University of Dublin, Trinity College, Dublin, Ireland
- Corey M. Hudson
- Informatics Institute, University of Missouri, Columbia, Missouri, United States of America
- Gavin C. Conant
- Informatics Institute, University of Missouri, Columbia, Missouri, United States of America
- Division of Animal Sciences, University of Missouri, Columbia, Missouri, United States of America
- * E-mail:
25
|
Wang S, Wei W, Zheng Y, Hou J, Dou Y, Zhang S, Luo X, Cai X. The role of insulin C-peptide in the coevolution analyses of the insulin signaling pathway: a hint for its functions. PLoS One 2012; 7:e52847. [PMID: 23300796 PMCID: PMC3531361 DOI: 10.1371/journal.pone.0052847] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 07/07/2012] [Accepted: 11/21/2012] [Indexed: 12/16/2022] Open
Abstract
As the linker between the A chain and B chain of proinsulin, C-peptide displays high variability in length and amino acid composition, and has been considered as an inert byproduct of insulin synthesis and processing for many years. Recent studies have suggested that C-peptide can act as a bioactive hormone, exerting various biological effects on the pathophysiology and treatment of diabetes. In this study, we analyzed the coevolution of insulin molecules among vertebrates, aiming at exploring the evolutionary characteristics of insulin molecule, especially the C-peptide. We also calculated the correlations of evolutionary rates between the insulin and the insulin receptor (IR) sequences as well as the domain-domain pairs of the ligand and receptor by the mirrortree method. The results revealed distinctive features of C-peptide in insulin intramolecular coevolution and correlated residue substitutions, which partly supported the idea that C-peptide can act as a bioactive hormone, with significant sequence features, as well as a linker assisting the formation of mature insulin during synthesis. Interestingly, the evolution of C-peptide exerted the highest correlation with that of the insulin receptor and its ligand binding domain (LBD), implying a potential relationship with the insulin signaling pathway.
Affiliation(s)
- Shuai Wang
- State Key Laboratory of Veterinary Etiological Biology, Key Laboratory of Zoonoses of CAAS, Key Laboratory of Veterinary Parasitology of Gansu Province, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, China
- Wei Wei
- State Key Laboratory of Veterinary Etiological Biology, Key Laboratory of Zoonoses of CAAS, Key Laboratory of Veterinary Parasitology of Gansu Province, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, China
- Yadong Zheng
- State Key Laboratory of Veterinary Etiological Biology, Key Laboratory of Zoonoses of CAAS, Key Laboratory of Veterinary Parasitology of Gansu Province, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, China
- Junling Hou
- State Key Laboratory of Veterinary Etiological Biology, Key Laboratory of Zoonoses of CAAS, Key Laboratory of Veterinary Parasitology of Gansu Province, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, China
- Yongxi Dou
- State Key Laboratory of Veterinary Etiological Biology, Key Laboratory of Zoonoses of CAAS, Key Laboratory of Veterinary Parasitology of Gansu Province, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, China
- Shaohua Zhang
- State Key Laboratory of Veterinary Etiological Biology, Key Laboratory of Zoonoses of CAAS, Key Laboratory of Veterinary Parasitology of Gansu Province, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, China
- Xuenong Luo
- State Key Laboratory of Veterinary Etiological Biology, Key Laboratory of Zoonoses of CAAS, Key Laboratory of Veterinary Parasitology of Gansu Province, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, China
- * E-mail: (XL); (XC)
- Xuepeng Cai
- State Key Laboratory of Veterinary Etiological Biology, Key Laboratory of Zoonoses of CAAS, Key Laboratory of Veterinary Parasitology of Gansu Province, Lanzhou Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Lanzhou, Gansu, China
- * E-mail: (XL); (XC)
26
|
Swapna LS, Srinivasan N, Robertson DL, Lovell SC. The origins of the evolutionary signal used to predict protein-protein interactions. BMC Evol Biol 2012; 12:238. [PMID: 23217198 PMCID: PMC3537733 DOI: 10.1186/1471-2148-12-238] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 11/08/2011] [Accepted: 11/17/2012] [Indexed: 12/02/2022] Open
Abstract
Background The correlation of genetic distances between pairs of protein sequence alignments has been used to infer protein-protein interactions. It has been suggested that these correlations are based on the signal of co-evolution between interacting proteins. However, although mutations in different proteins associated with maintaining an interaction clearly occur (particularly in binding interfaces and neighbourhoods), many other factors contribute to correlated rates of sequence evolution. Proteins in the same genome are usually linked by shared evolutionary history, and so it would be expected that there would be topological similarities in their phylogenetic trees, whether they are interacting or not. For this reason the underlying species tree is often corrected for. Moreover, factors such as expression level are known to affect evolutionary rates. However, it has been argued that the correlated rates of evolution used to predict protein interaction explicitly include shared evolutionary history; here we test this hypothesis. Results In order to identify the evolutionary mechanisms giving rise to the correlations between interacting proteins, we use phylogenetic methods to distinguish similarities in tree topologies from similarities in genetic distances. We use a range of datasets of interacting and non-interacting proteins from Saccharomyces cerevisiae. We find that the signal of correlated evolution between interacting proteins is predominantly a result of shared evolutionary rates, rather than similarities in tree topology, independent of evolutionary divergence. Conclusions Since interacting proteins do not have tree topologies that are more similar than the control group of non-interacting proteins, it is likely that coevolution contributes little, if anything, to the observed correlations.
|
27
|
Evolutionary rate covariation in meiotic proteins results from fluctuating evolutionary pressure in yeasts and mammals. Genetics 2012. [PMID: 23183665 DOI: 10.1534/genetics.112.145979] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Indexed: 12/22/2022] Open
Abstract
Evolutionary rates of functionally related proteins tend to change in parallel over evolutionary time. Such evolutionary rate covariation (ERC) is a sequence-based signature of coevolution and a potentially useful signature to infer functional relationships between proteins. One major hypothesis to explain ERC is that fluctuations in evolutionary pressure acting on entire pathways cause parallel rate changes for functionally related proteins. To explore this hypothesis we analyzed ERC within DNA mismatch repair (MMR) and meiosis proteins over phylogenies of 18 yeast species and 22 mammalian species. We identified a strong signature of ERC between eight yeast proteins involved in meiotic crossing over, which seems to have resulted from relaxation of constraint specifically in Candida glabrata. These and other meiotic proteins in C. glabrata showed marked rate acceleration, likely due to its apparently clonal reproductive strategy and the resulting infrequent use of meiotic proteins. This correlation between change of reproductive mode and change in constraint supports an evolutionary pressure origin for ERC. Moreover, we present evidence for similar relaxations of constraint in additional pathogenic yeast species. Mammalian MMR and meiosis proteins also showed statistically significant ERC; however, there was not strong ERC between crossover proteins, as observed in yeasts. Rather, mammals exhibited ERC in different pathways, such as piRNA-mediated defense against transposable elements. Overall, if fluctuation in evolutionary pressure is responsible for ERC, it could reveal functional relationships within entire protein pathways, regardless of whether they physically interact or not, so long as there was variation in constraint on that pathway.
|
28
|
Ochoa D, García-Gutiérrez P, Juan D, Valencia A, Pazos F. Incorporating information on predicted solvent accessibility to the co-evolution-based study of protein interactions. Molecular BioSystems 2012; 9:70-6. [PMID: 23104128 DOI: 10.1039/c2mb25325a] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 12/15/2022]
Abstract
A widespread family of methods for studying and predicting protein interactions using sequence information is based on co-evolution, quantified as similarity of phylogenetic trees. Part of the co-evolution observed between interacting proteins could be due to co-adaptation caused by inter-protein contacts. In this case, the co-evolution is expected to be more evident when evaluated on the surface of the proteins or the internal layers close to it. In this work we study the effect of incorporating information on predicted solvent accessibility into three methods for predicting protein interactions based on similarity of phylogenetic trees. We evaluate the performance of these methods in predicting different types of protein associations when trees based on positions with different characteristics of predicted accessibility are used as input. We found that predicted accessibility improves the results of two recent versions of the mirrortree methodology in predicting direct binary physical interactions, while it improves neither these methods nor the original mirrortree method in predicting other types of interactions. That improvement comes at no cost in terms of applicability, since accessibility can be predicted for any sequence. We also found that predictions of protein-protein interactions are improved when multiple sequence alignments with a richer representation of sequences (including paralogs) are incorporated in the accessibility prediction.
Affiliation(s)
- David Ochoa
- Computational Systems Biology Group, National Centre for Biotechnology (CNB-CSIC), C/Darwin, 3, Cantoblanco, 28049 Madrid, Spain
|
29
|
A computational framework for boosting confidence in high-throughput protein-protein interaction datasets. Genome Biol 2012; 13:R76. [PMID: 22937800 PMCID: PMC4053744 DOI: 10.1186/gb-2012-13-8-r76] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.5] [Received: 05/03/2012] [Accepted: 08/31/2012] [Indexed: 12/28/2022] Open
Abstract
Improving the quality and coverage of the protein interactome is of paramount importance for biomedical research, particularly given the various sources of uncertainty in high-throughput techniques. We introduce a structure-based framework, Coev2Net, for computing a single confidence score that addresses both false-positive and false-negative rates. Coev2Net is easily applied to thousands of binary protein interactions and has superior predictive performance over existing methods. We experimentally validate selected high-confidence predictions in the human MAPK network and show that predicted interfaces are enriched for cancer-related or damaging SNPs. Coev2Net can be downloaded at http://struct2net.csail.mit.edu.
|
30
|
Kensche PR, Duarte I, Huynen MA. A three-dimensional topology of complex I inferred from evolutionary correlations. BMC Structural Biology 2012; 12:19. [PMID: 22857522 PMCID: PMC3436739 DOI: 10.1186/1472-6807-12-19] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 02/13/2012] [Accepted: 06/28/2012] [Indexed: 11/22/2022]
Abstract
Background The quaternary structure of eukaryotic NADH:ubiquinone oxidoreductase (complex I), the largest complex of oxidative phosphorylation, is still mostly unresolved. Furthermore, it is unknown where transiently bound assembly factors interact with complex I. We therefore asked whether the evolution of complex I contains information about its 3D topology and the binding positions of its assembly factors. We approached these questions by correlating the evolutionary rates of eukaryotic complex I subunits using the mirror-tree method and mapping the results into a 3D representation by multidimensional scaling. Results More than 60% of the evolutionary correlation among the conserved seven subunits of the complex I matrix arm can be explained by the physical distance between the subunits. The three-dimensional evolutionary model of the eukaryotic conserved matrix arm has a striking similarity to the matrix arm quaternary structure in the bacterium Thermus thermophilus (rmsd = 19 Å) and supports the previous finding that in eukaryotes the N-module is turned relative to the Q-module when compared to bacteria. By contrast, the evolutionary rates contained little information about the structure of the membrane arm. A large evolutionary model of 45 subunits and assembly factors allows prediction of subunit positions and interactions (rmsd = 52.6 Å). The model supports an interaction of NDUFAF3, C8orf38 and C2orf56 during the assembly of the proximal matrix arm and the membrane arm. The model further suggests a tight relationship between the assembly factor NUBPL and NDUFA2, both of which have been linked to iron-sulfur cluster assembly, as well as between NDUFA12 and its paralog, the assembly factor NDUFAF2. Conclusions The physical distance between subunits of complex I is a major correlate of the rate of protein evolution in the complex I matrix arm and is sufficient to infer parts of the complex’s structure with high accuracy. The resulting evolutionary model predicts the positions of a number of subunits and assembly factors.
Affiliation(s)
- Philip R Kensche
- Center for Molecular and Biomolecular Informatics/Nijmegen Center for Molecular Life Sciences, Radboud University Medical Center, PO Box 9101, Nijmegen, HB, 6500, The Netherlands.
|
31
|
Lashin SA, Suslov VV, Matushkin YG. Theories of biological evolution from the viewpoint of the modern systemic biology. Russ J Genet 2012. [DOI: 10.1134/s1022795412030064] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/23/2022]
|
32
|
Clark NL, Alani E, Aquadro CF. Evolutionary rate covariation reveals shared functionality and coexpression of genes. Genome Res 2012; 22:714-20. [PMID: 22287101 DOI: 10.1101/gr.132647.111] [Citation(s) in RCA: 77] [Impact Index Per Article: 6.4] [Indexed: 01/21/2023]
Abstract
Evolutionary rate covariation (ERC) is a phylogenetic signature that reflects the covariation of a pair of proteins over evolutionary time. ERC is typically elevated between interacting proteins and so is a promising signature to characterize molecular and functional interactions across the genome. ERC is often assumed to result from compensatory changes at interaction interfaces (i.e., intermolecular coevolution); however, its origin is still unclear and is likely to be complex. Here, we determine the biological factors responsible for ERC in a proteome-wide data set of 4459 proteins in 18 budding yeast species. We show that direct physical interaction is not required to produce ERC, because we observe strong correlations between noninteracting but cofunctional enzymes. We also demonstrate that ERC is uniformly distributed along the protein primary sequence, suggesting that intermolecular coevolution is not generally responsible for ERC between physically interacting proteins. Using multivariate analysis, we show that a pair of proteins is likely to exhibit ERC if they share a biological function or if their expression levels coevolve between species. Thus, ERC indicates shared function and coexpression of protein pairs and not necessarily coevolution between sites, as has been assumed in previous studies. This full interpretation of ERC now provides us with a powerful tool to assign uncharacterized proteins to functional groups and to determine the interconnectedness between entire genetic pathways.
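As a rough illustration of the ERC idea described above, the sketch below converts each protein's branch lengths into relative rates and then correlates those rates across branches. The normalization used here (residuals from a least-squares fit through the origin against species-tree branch lengths) is a simplifying assumption, not the procedure used in the paper, and all names and numbers are hypothetical.

```python
from math import sqrt


def relative_rates(gene_branches, species_branches):
    """Residuals of gene branch lengths after a through-origin fit
    against the matching species-tree branch lengths."""
    slope = (sum(g * s for g, s in zip(gene_branches, species_branches))
             / sum(s * s for s in species_branches))
    return [g - slope * s for g, s in zip(gene_branches, species_branches)]


def erc(gene_a, gene_b, species_branches):
    """Pearson correlation between the relative rates of two genes,
    with branches listed in the same order for every input."""
    ra = relative_rates(gene_a, species_branches)
    rb = relative_rates(gene_b, species_branches)
    n = len(ra)
    ma, mb = sum(ra) / n, sum(rb) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    var_a = sum((x - ma) ** 2 for x in ra)
    var_b = sum((y - mb) ** 2 for y in rb)
    if var_a == 0 or var_b == 0:
        raise ValueError("ERC is undefined when relative rates are constant")
    return cov / sqrt(var_a * var_b)


# Hypothetical branch lengths on four branches of a shared topology.
species_tree = [1.0, 2.0, 3.0, 4.0]
gene_a = [1.2, 1.8, 3.3, 3.9]
# gene_b speeds up and slows down on the same branches as gene_a.
gene_b = [2.4, 3.6, 6.6, 7.8]

erc_ab = erc(gene_a, gene_b, species_tree)
print(f"ERC-style correlation: {erc_ab:.3f}")
```

Working on relative rates rather than raw branch lengths matters because every gene tree inherits the species tree's long and short branches; without normalization, almost any pair of proteins would appear correlated.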
Affiliation(s)
- Nathan L Clark
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, New York 14853, USA.
|
33
|
Lees JG, Heriche JK, Morilla I, Ranea JA, Orengo CA. Systematic computational prediction of protein interaction networks. Phys Biol 2011; 8:035008. [PMID: 21572181 DOI: 10.1088/1478-3975/8/3/035008] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.4] [Indexed: 01/29/2023]
Abstract
Determining the network of physical protein associations is an important first step in developing mechanistic evidence for elucidating biological pathways. Despite rapid advances in high-throughput experiments to determine protein interactions, the majority of associations remain unknown. Here we describe computational methods for significantly expanding protein association networks. We describe methods for integrating multiple independent sources of evidence to obtain higher quality predictions, and we compare the major publicly available resources for experimentalists.
Affiliation(s)
- J G Lees
- Research Department of Structural & Molecular Biology, University College London, London, UK.
|
34
|
Wang GZ, Lercher MJ. The effects of network neighbours on protein evolution. PLoS One 2011; 6:e18288. [PMID: 21532755 PMCID: PMC3075247 DOI: 10.1371/journal.pone.0018288] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 09/07/2010] [Accepted: 03/02/2011] [Indexed: 11/19/2022] Open
Abstract
Interacting proteins may often experience similar selection pressures. Thus, we may expect that neighbouring proteins in biological interaction networks evolve at similar rates. This has been previously shown for protein-protein interaction networks. Similarly, we find correlated rates of evolution of neighbours in networks based on co-expression, metabolism, and synthetic lethal genetic interactions. While the correlations are statistically significant, their magnitude is small, with network effects explaining only between 2% and 7% of the variation. The strongest known predictor of the rate of protein evolution remains expression level. We confirmed the previous observation that similar expression levels of neighbours indeed explain their similar evolution rates in protein-protein networks, and showed that the same is true for metabolic networks. In co-expression and synthetic lethal genetic interaction networks, however, neighbouring genes still show somewhat similar evolutionary rates even after simultaneously controlling for expression level, gene essentiality and gene length. Thus, similar expression levels and related functions (as inferred from co-expression and synthetic lethal interactions) seem to explain correlated evolutionary rates of network neighbours across all currently available types of biological networks.
Affiliation(s)
- Martin J. Lercher
- Institute for Computer Science, Heinrich-Heine-University, Düsseldorf, Germany
|
35
|
Abstract
Bioinformatic methods to predict protein-protein interactions (PPI) via coevolutionary analysis have positioned themselves to compete alongside established in vitro methods, despite a lack of understanding for the underlying molecular mechanisms of the coevolutionary process. Investigating the alignment of coevolutionary predictions of PPI with experimental data can focus the effective scope of prediction and lead to better accuracies. A new rate-based coevolutionary method, MMM, preferentially finds obligate interacting proteins that form complexes, conforming to results from studies based on coimmunoprecipitation coupled with mass spectrometry. Using gold-standard databases as a benchmark for accuracy, MMM surpasses methods based on abundance ratios, suggesting that correlated evolutionary rates may yet be better than coexpression at predicting interacting proteins. At the level of protein domains, coevolution is difficult to detect, even with MMM, except when considering small-scale experimental data involving proteins with multiple domains. Overall, these findings confirm that coevolutionary methods can be confidently used in predicting PPI, either independently or as drivers of coimmunoprecipitation experiments.
|
36
|
Fromer M, Linial M. Exposing the co-adaptive potential of protein-protein interfaces through computational sequence design. Bioinformatics 2010; 26:2266-72. [PMID: 20679332 DOI: 10.1093/bioinformatics/btq412] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 12/27/2022]
Abstract
MOTIVATION In nature, protein-protein interactions are constantly evolving under various selective pressures. Nonetheless, it is expected that crucial interactions are maintained through compensatory mutations between interacting proteins. Thus, many studies have used evolutionary sequence data to extract such occurrences of correlated mutation. However, this research is confounded by other evolutionary pressures that contribute to sequence covariance, such as common ancestry. RESULTS Here, we focus exclusively on the compensatory mutations deriving from physical protein interactions, by performing large-scale computational mutagenesis experiments for >260 protein-protein interfaces. We investigate the potential for co-adaptability present in protein pairs that are always found together in nature (obligate) and those that are occasionally in complex (transient). By modeling each complex both in bound and unbound forms, we find that naturally transient complexes possess greater relative capacity for correlated mutation than obligate complexes, even when differences in interface size are taken into account.
Affiliation(s)
- Menachem Fromer
- School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel
|
37
|
Lovell SC, Robertson DL. An integrated view of molecular coevolution in protein-protein interactions. Mol Biol Evol 2010; 27:2567-75. [PMID: 20551042 DOI: 10.1093/molbev/msq144] [Citation(s) in RCA: 106] [Impact Index Per Article: 7.6] [Indexed: 01/21/2023] Open
Abstract
Protein-protein interactions effectively mediate molecular function. They are the result of specific interactions between protein interfaces and are maintained by the action of evolutionary pressure on the regions of the interacting proteins that contribute to binding. For the most part, selection restricts amino acid replacements, accounting for the conservation of binding interfaces. However, in some cases, change in one protein will be mitigated by compensatory change in its binding partner, maintaining function in the face of evolutionary change. There have been several attempts to use correlations in sequence evolution to predict interactions of proteins. Most commonly, these approaches use the entire sequence to identify correlations and so infer probable binding. However, other factors such as shared evolutionary history and similarities in the rates of evolution confound these whole-sequence-based approaches. Here, we discuss recent work on this topic and argue that both site-specific coevolutionary change and whole-sequence evolution contribute to evolutionary signals in sets of interacting proteins. We discuss the relative effects of both types of selection and how they might be identified. This permits an integrated view of protein-protein interactions, their evolution, and coevolution.
Affiliation(s)
- Simon C Lovell
- Faculty of Life Sciences, University of Manchester, Oxford Road, Manchester, United Kingdom.
|
38
|
Chakrabarti S, Panchenko AR. Structural and functional roles of coevolved sites in proteins. PLoS One 2010; 5:e8591. [PMID: 20066038 PMCID: PMC2797611 DOI: 10.1371/journal.pone.0008591] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.9] [Received: 08/12/2009] [Accepted: 10/19/2009] [Indexed: 01/09/2023] Open
Abstract
BACKGROUND Understanding the residue covariations between multiple positions in protein families is crucial and can be helpful for designing protein engineering experiments. These simultaneous changes, or residue coevolution, allow a protein to maintain its overall structural-functional integrity while enabling it to acquire specific functional modifications. Despite significant efforts in the field, there is still controversy regarding the preferred locations of coevolved residues on different regions of protein molecules, the strength of the coevolutionary signal, and the role of coevolution in functional diversification. METHODOLOGY In this paper we study the scale and nature of residue coevolution in maintaining the overall functionality and structural integrity of proteins. We employed a large-scale study to investigate the structural and functional aspects of coevolved residues. We found that the networks representing the coevolutionary residue connections within our dataset are in general of 'small-world' type, as they have clustering coefficient values higher than random networks and also show mean shortest path lengths similar to or lower than those of random and regular networks. We also found that altogether 11% of functionally important sites are coevolved with any other sites. Active sites are found more frequently to coevolve with any other sites (15%) compared to protein (11%) and ligand (9%) binding sites. Metal binding and active sites are also found to be more frequently coevolved with other metal binding and active sites, respectively. Analysis of the coupling between coevolutionary processes and the spatial distribution of coevolved sites reveals that a high fraction of coevolved sites are located close to each other. Moreover, approximately 80% of charge compensatory substitutions within coevolved sites are found at very close spatial proximity (≤5 Å), pointing to the possible preservation of salt bridges in evolution. CONCLUSION Our findings show that a noticeable fraction of functionally important sites undergo coevolution and also point towards compensatory substitutions as a probable coevolutionary mechanism within spatially proximal coevolved functional sites.
Affiliation(s)
- Saikat Chakrabarti, Anna R. Panchenko
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
- Corresponding authors: Saikat Chakrabarti (SC); Anna R. Panchenko (ARP)
|
39
|
Development of a Novel Bioinformatics Tool for In Silico Validation of Protein Interactions. J Biomed Biotechnol 2010; 2010:670125. [PMID: 20625507 PMCID: PMC2896714 DOI: 10.1155/2010/670125] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/19/2009] [Revised: 03/10/2010] [Accepted: 03/30/2010] [Indexed: 11/17/2022] Open
Abstract
Protein interactions are crucial in most biological processes. Several in silico methods have recently been developed to predict them. This paper describes a bioinformatics method that combines sequence similarity and structural information to support experimental studies on protein interactions. Given a target protein, the approach selects the most likely interactors among the candidates revealed by experimental techniques but not yet validated in vivo. The sequence and structural information of the proteins and complexes confirmed in vivo is exploited to evaluate the candidate interactors. Finally, a score is calculated to suggest the most likely interactors of the target protein. As an example, we searched for GRB2 interactors. We ranked a set of 46 candidate interactors by the presented method. These candidates were then reduced to 21 through a score threshold chosen by means of a cross-validation strategy. Among them, isoform 1 of MAPK14 was confirmed in silico as a GRB2 interactor. Finally, given a set of already confirmed interactors of GRB2, the accuracy and the precision of the approach were 75% and 86%, respectively. In conclusion, the proposed method can be conveniently exploited to select the proteins to be experimentally investigated within a set of potential interactors.
|
40
|
Covariation of branch lengths in phylogenies of functionally related genes. PLoS One 2009; 4:e8487. [PMID: 20041191 PMCID: PMC2793527 DOI: 10.1371/journal.pone.0008487] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 06/23/2009] [Accepted: 11/25/2009] [Indexed: 12/05/2022] Open
Abstract
Recent studies have shown evidence for the coevolution of functionally related genes. This coevolution is a result of constraints to maintain functional relationships between interacting proteins. The studies have focused on the correlation in gene tree branch lengths of proteins that are directly interacting with each other. We here hypothesize that the correlation in branch lengths is not limited to proteins that directly interact, but extends to proteins that operate within the same pathway. Using generalized linear models as a basis for identifying correlation, we attempted to predict the gene ontology (GO) terms of a gene based on its gene tree branch lengths. We applied our method to a dataset consisting of proteins from ten prokaryotic species. We found that the accuracy with which we could predict the function of the proteins from their gene tree varied substantially with different GO terms. In particular, our model could accurately predict genes involved in translation and certain ribosomal activities, with an area under the receiver operating characteristic curve of up to 92%. Further analysis showed that the similarity between the trees of genes labeled with similar GO terms was not limited to genes that physically interacted, but also extended to genes functioning within the same pathway. We discuss the relevance of our findings as they relate to the use of phylogenetic methods in comparative genomics.
|
41
|
Choi K, Gomez SM. Comparison of phylogenetic trees through alignment of embedded evolutionary distances. BMC Bioinformatics 2009; 10:423. [PMID: 20003527 PMCID: PMC3087345 DOI: 10.1186/1471-2105-10-423] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Received: 04/02/2009] [Accepted: 12/15/2009] [Indexed: 11/12/2022] Open
Abstract
Background The understanding of evolutionary relationships is a fundamental aspect of modern biology, with the phylogenetic tree being a primary tool for describing these associations. However, comparison of trees for the purpose of assessing similarity and the quantification of various biological processes remains a significant challenge. Results We describe a novel approach for the comparison of phylogenetic distance information based on the alignment of representative high-dimensional embeddings (xCEED: Comparison of Embedded Evolutionary Distances). The xCEED methodology, which utilizes multidimensional scaling and Procrustes-related superimposition approaches, provides the ability to measure the global similarity between trees as well as incongruities between them. We demonstrate the application of this approach to the prediction of coevolving protein interactions and demonstrate its improved performance over the mirrortree, tol-mirrortree, phylogenetic vector projection, and partial correlation approaches. Furthermore, we show its applicability to both the detection of horizontal gene transfer events as well as its potential use in the prediction of interaction specificity between a pair of multigene families. Conclusions These approaches provide additional tools for the study of phylogenetic trees and associated evolutionary processes. Source code is available at http://gomezlab.bme.unc.edu/tools.
Affiliation(s)
- Kwangbom Choi
- Curriculum in Bioinformatics and Computational Biology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA.
|
42
|
Lewis ACF, Saeed R, Deane CM. Predicting protein-protein interactions in the context of protein evolution. Molecular BioSystems 2009; 6:55-64. [PMID: 20024067 DOI: 10.1039/b916371a] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.1] [Indexed: 12/27/2022]
Abstract
Here we review the methods for the prediction of protein interactions and the ideas in protein evolution that relate to them. The evolutionary assumptions implicit in many of the protein interaction prediction methods are elucidated. We draw attention to the caution needed in deploying certain evolutionary assumptions, in particular cross-organism transfer of interactions by sequence homology, and discuss the known issues in deriving interaction predictions from evidence of co-evolution. We also conjecture that there is evolutionary knowledge yet to be exploited in the prediction of interactions, in particular the heterogeneity of interactions, the increasing availability of interaction data from multiple species, and the models of protein interaction network growth.
Affiliation(s)
- Anna C F Lewis
- Department of Statistics and Systems Biology DTC, University of Oxford, UK
|
43
|
Abstract
Coevolution maintains interactions between phenotypic traits through the process of reciprocal natural selection. Detecting molecular coevolution can expose functional interactions between molecules in the cell, generating insights into biological processes, pathways, and the networks of interactions important for cellular function. Prediction of interaction partners from different protein families exploits the property that interacting proteins can follow similar patterns and relative rates of evolution. Current methods for detecting coevolution based on the similarity of phylogenetic trees or evolutionary distance matrices have, however, been limited by requiring coevolution over the entire evolutionary history considered and are inaccurate in the presence of paralogous copies. We present a novel method for determining coevolving protein partners by finding the largest common submatrix in a given pair of distance matrices, with the size of the largest common submatrix measuring the strength of coevolution. This approach permits us to consider matrices of different size and scale, to find lineage-specific coevolution, and to predict multiple interaction partners. We used MatrixMatchMaker to predict protein-protein interactions in the human genome. We show that proteins that are known to interact physically are more strongly coevolving than proteins that simply belong to the same biochemical pathway. The human coevolution network is highly connected, suggesting many more protein-protein interactions than are currently known from high-throughput and other experimental evidence. These most strongly coevolving proteins suggest interactions that have been maintained over long periods of evolutionary time, and that are thus likely to be of fundamental importance to cellular function.
Affiliation(s)
- Elisabeth R M Tillier
- Department of Medical Biophysics, University of Toronto, Ontario Cancer Institute, University Health Network, Canada.
|