1. Ali L, Abdel Aziz MH. Crosstalk involving two-component systems in Staphylococcus aureus signaling networks. J Bacteriol 2024; 206:e0041823. [PMID: 38456702 PMCID: PMC11025333 DOI: 10.1128/jb.00418-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/09/2024] Open
Abstract
Staphylococcus aureus poses a serious global threat to human health due to its pathogenic nature, adaptation to environmental stress, high virulence, and the prevalence of antimicrobial resistance. The signaling network in S. aureus coordinates and integrates various internal and external inputs and stimuli to adapt and formulate a response to the environment. Two-component systems (TCSs) of S. aureus play a central role in this network where surface-expressed histidine kinases (HKs) receive and relay external signals to their cognate response regulators (RRs). Despite the purported high fidelity of signaling, crosstalk within TCSs, between HK and non-cognate RR, and between TCSs and other systems has been detected widely in bacteria. The examples of crosstalk in S. aureus are very limited, and there needs to be more understanding of its molecular recognition mechanisms, although some crosstalk can be inferred from similar bacterial systems that share structural similarities. Understanding the cellular processes mediated by this crosstalk and how it alters signaling, especially under stress conditions, may help decipher the emergence of antibiotic resistance. This review highlights examples of signaling crosstalk in bacteria in general and S. aureus in particular, as well as the effect of TCS mutations on signaling and crosstalk.
Affiliation(s)
- Liaqat Ali
  - Fisch College of Pharmacy, The University of Texas at Tyler, Tyler, Texas, USA
- May H. Abdel Aziz
  - Fisch College of Pharmacy, The University of Texas at Tyler, Tyler, Texas, USA
2. Rouzine IM, Rozhnova G. Evolutionary implications of SARS-CoV-2 vaccination for the future design of vaccination strategies. Communications Medicine 2023; 3:86. [PMID: 37336956 PMCID: PMC10279745 DOI: 10.1038/s43856-023-00320-x] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 05/04/2022] [Accepted: 06/07/2023] [Indexed: 06/21/2023] Open
Abstract
Once the first SARS-CoV-2 vaccine became available, mass vaccination was the main pillar of the public health response to the COVID-19 pandemic. It was very effective in reducing hospitalizations and deaths. Here, we discuss the possibility that mass vaccination might accelerate SARS-CoV-2 evolution in antibody-binding regions compared to natural infection at the population level. Using the evidence of strong genetic variation in antibody-binding regions and taking advantage of the similarity between the envelope proteins of SARS-CoV-2 and influenza, we assume that immune selection pressure acting on these regions of the two viruses is similar. We discuss the consequences of this assumption for SARS-CoV-2 evolution in light of mathematical models developed previously for influenza. We further outline the implications of this phenomenon, if our assumptions are confirmed, for the future design of SARS-CoV-2 vaccination strategies.
Affiliation(s)
- Igor M Rouzine
  - Immunogenetics, Sechenov Institute of Evolutionary Physiology and Biochemistry of Russian Academy of Sciences, Saint-Petersburg, Russia.
- Ganna Rozhnova
  - Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands.
  - BioISI - Biosystems & Integrative Sciences Institute, Faculdade de Ciências, Universidade de Lisboa, Lisboa, Portugal.
  - Center for Complex Systems Studies (CCSS), Utrecht University, Utrecht, The Netherlands.
3. Budzynski L, Pagnani A. Small-coupling expansion for multiple sequence alignment. Phys Rev E 2023; 107:044125. [PMID: 37198812 DOI: 10.1103/physreve.107.044125] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2022] [Accepted: 03/27/2023] [Indexed: 05/19/2023]
Abstract
The alignment of biological sequences such as DNA, RNA, and proteins, is one of the basic tools that allow to detect evolutionary patterns, as well as functional or structural characterizations between homologous sequences in different organisms. Typically, state-of-the-art bioinformatics tools are based on profile models that assume the statistical independence of the different sites of the sequences. Over the last years, it has become increasingly clear that homologous sequences show complex patterns of long-range correlations over the primary sequence as a consequence of the natural evolution process that selects genetic variants under the constraint of preserving the functional or structural determinants of the sequence. Here, we present an alignment algorithm based on message passing techniques that overcomes the limitations of profile models. Our method is based on a perturbative small-coupling expansion of the free energy of the model that assumes a linear chain approximation as the zeroth-order of the expansion. We test the potentiality of the algorithm against standard competing strategies on several biological sequences.
Affiliation(s)
- Louise Budzynski
  - DISAT, Politecnico di Torino, Corso Duca degli Abruzzi, 24, I-10129, Torino, Italy
  - Italian Institute for Genomic Medicine, IRCCS Candiolo, SP-142, I-10060, Candiolo, Italy
- Andrea Pagnani
  - DISAT, Politecnico di Torino, Corso Duca degli Abruzzi, 24, I-10129, Torino, Italy
  - Italian Institute for Genomic Medicine, IRCCS Candiolo, SP-142, I-10060, Candiolo, Italy
  - INFN, Sezione di Torino, Torino, Via Pietro Giuria, 1 10125 Torino Italy
4. Malinverni D, Babu MM. Data-driven design of orthogonal protein-protein interactions. Sci Signal 2023; 16:eabm4484. [PMID: 36853962 DOI: 10.1126/scisignal.abm4484] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/02/2023]
Abstract
Engineering protein-protein interactions to generate new functions presents a challenge with great potential for many applications, ranging from therapeutics to synthetic biology. To avoid unwanted cross-talk with preexisting protein interaction networks in a cell, the specificity and selectivity of newly engineered proteins must be controlled. Here, we developed a computational strategy that mimics gene duplication and the divergence of preexisting interacting protein pairs to design new interactions. We used the bacterial PhoQ-PhoP two-component system as a model system to demonstrate the feasibility of this strategy and validated the approach with known experimental results. The designed protein pairs are predicted to exclusively interact with each other and to be insulated from potential cross-talk with their native partners. Thus, our approach enables exploration of uncharted regions of the protein sequence space and the design of new interacting protein pairs.
Affiliation(s)
- Duccio Malinverni
  - MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge CB2 0QH, UK
  - Department of Structural Biology and Center of Excellence for Data Driven Discovery, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
- M Madan Babu
  - MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge CB2 0QH, UK
  - Department of Structural Biology and Center of Excellence for Data Driven Discovery, St. Jude Children's Research Hospital, Memphis, TN 38105, USA
5. Kennedy EN, Foster CA, Barr SA, Bourret RB. General strategies for using amino acid sequence data to guide biochemical investigation of protein function. Biochem Soc Trans 2022; 50:1847-1858. [PMID: 36416676 PMCID: PMC10257402 DOI: 10.1042/bst20220849] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2022] [Revised: 11/04/2022] [Accepted: 11/09/2022] [Indexed: 11/24/2022]
Abstract
The rapid increase of '-omics' data warrants the reconsideration of experimental strategies to investigate general protein function. Studying individual members of a protein family is likely insufficient to provide a complete mechanistic understanding of family functions, especially for diverse families with thousands of known members. Strategies that exploit large amounts of available amino acid sequence data can inspire and guide biochemical experiments, generating broadly applicable insights into a given family. Here we review several methods that utilize abundant sequence data to focus experimental efforts and identify features truly representative of a protein family or domain. First, coevolutionary relationships between residues within primary sequences can be successfully exploited to identify structurally and/or functionally important positions for experimental investigation. Second, functionally important variable residue positions typically occupy a limited sequence space, a property useful for guiding biochemical characterization of the effects of the most physiologically and evolutionarily relevant amino acids. Third, amino acid sequence variation within domains shared between different protein families can be used to sort a particular domain into multiple subtypes, inspiring further experimental designs. Although generally applicable to any kind of protein domain because they depend solely on amino acid sequences, the second and third approaches are reviewed in detail because they appear to have been used infrequently and offer immediate opportunities for new advances. Finally, we speculate that future technologies capable of analyzing and manipulating conserved and variable aspects of the three-dimensional structures of a protein family could lead to broad insights not attainable by current methods.
Affiliation(s)
- Emily N. Kennedy
  - Department of Microbiology & Immunology, University of North Carolina, Chapel Hill, NC, United States of America
- Clay A. Foster
  - Department of Pediatrics, Section Hematology/Oncology, University of Oklahoma Health Sciences Center, Oklahoma City, Oklahoma, United States of America
- Sarah A. Barr
  - Department of Microbiology & Immunology, University of North Carolina, Chapel Hill, NC, United States of America
- Robert B. Bourret
  - Department of Microbiology & Immunology, University of North Carolina, Chapel Hill, NC, United States of America
6. Insights into the Virulence of Campylobacter jejuni Associated with Two-Component Signal Transduction Systems and Single Regulators. Microbiology Research 2022. [DOI: 10.3390/microbiolres13020016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022] Open
Abstract
Campylobacter jejuni is one of the major aetiologies of diarrhoea. Understanding the processes and virulence factors contributing to C. jejuni fitness is a cornerstone for developing mitigation strategies. Two-component signal transduction systems, known as two-component systems (TCSs), along with single regulators with no obvious cognate histidine kinase, help pathogens in interacting with their environments, but the available literature on C. jejuni is limited. A typical TCS possesses histidine kinase and response regulator proteins. The objective of this review was to provide insights into the virulence of C. jejuni associated with TCSs and single regulators. Despite limited research, TCSs are important contributors to the pathogenicity of C. jejuni by influencing motility (FlgSR), colonisation (DccRS), nutrient acquisition (PhosSR and BumSR), and stress response (RacRS). Of the single regulators, CbrR and CosR are involved in bile resistance and oxidative stress response, respectively. Cross-talks among TCSs complicate the full elucidation of their molecular mechanisms. Although progress has been made in characterising C. jejuni TCSs, shortfalls such as triggering signals, inability to induce mutations in some genes, or developing suitable in vivo models are still being encountered. Further research is expected to shed light on the unexplored sides of the C. jejuni TCSs, which may allow new drug discoveries and better control strategies.
7. Gerardos A, Dietler N, Bitbol AF. Correlations from structure and phylogeny combine constructively in the inference of protein partners from sequences. PLoS Comput Biol 2022; 18:e1010147. [PMID: 35576238 PMCID: PMC9135348 DOI: 10.1371/journal.pcbi.1010147] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/22/2021] [Revised: 05/26/2022] [Accepted: 04/27/2022] [Indexed: 11/19/2022] Open
Abstract
Inferring protein-protein interactions from sequences is an important task in computational biology. Recent methods based on Direct Coupling Analysis (DCA) or Mutual Information (MI) allow to find interaction partners among paralogs of two protein families. Does successful inference mainly rely on correlations from structural contacts or from phylogeny, or both? Do these two types of signal combine constructively or hinder each other? To address these questions, we generate and analyze synthetic data produced using a minimal model that allows us to control the amounts of structural constraints and phylogeny. We show that correlations from these two sources combine constructively to increase the performance of partner inference by DCA or MI. Furthermore, signal from phylogeny can rescue partner inference when signal from contacts becomes less informative, including in the realistic case where inter-protein contacts are restricted to a small subset of sites. We also demonstrate that DCA-inferred couplings between non-contact pairs of sites improve partner inference in the presence of strong phylogeny, while deteriorating it otherwise. Moreover, restricting to non-contact pairs of sites preserves inference performance in the presence of strong phylogeny. In a natural data set, as well as in realistic synthetic data based on it, we find that non-contact pairs of sites contribute positively to partner inference performance, and that restricting to them preserves performance, evidencing an important role of phylogeny.
Affiliation(s)
- Andonis Gerardos
  - Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Nicola Dietler
  - Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Anne-Florence Bitbol
  - Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
8. Jones RD, Qian Y, Ilia K, Wang B, Laub MT, Del Vecchio D, Weiss R. Robust and tunable signal processing in mammalian cells via engineered covalent modification cycles. Nat Commun 2022; 13:1720. [PMID: 35361767 PMCID: PMC8971529 DOI: 10.1038/s41467-022-29338-w] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 07/08/2021] [Accepted: 02/16/2022] [Indexed: 02/06/2023] Open
Abstract
Engineered signaling networks can impart cells with new functionalities useful for directing differentiation and actuating cellular therapies. For such applications, the engineered networks must be tunable, precisely regulate target gene expression, and be robust to perturbations within the complex context of mammalian cells. Here, we use bacterial two-component signaling proteins to develop synthetic phosphoregulation devices that exhibit these properties in mammalian cells. First, we engineer a synthetic covalent modification cycle based on kinase and phosphatase proteins derived from the bifunctional histidine kinase EnvZ, enabling analog tuning of gene expression via its response regulator OmpR. By regulating phosphatase expression with endogenous miRNAs, we demonstrate cell-type specific signaling responses and a new strategy for accurate cell type classification. Finally, we implement a tunable negative feedback controller via a small molecule-stabilized phosphatase, reducing output expression variance and mitigating the context-dependent effects of off-target regulation and resource competition. Our work lays the foundation for establishing tunable, precise, and robust control over cell behavior with synthetic signaling networks.
Affiliation(s)
- Ross D Jones
  - Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
  - Synthetic Biology Center, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
- Yili Qian
  - Synthetic Biology Center, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
  - Department of Mechanical Engineering, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
- Katherine Ilia
  - Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
  - Synthetic Biology Center, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
- Benjamin Wang
  - Synthetic Biology Center, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
- Michael T Laub
  - Synthetic Biology Center, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
  - Howard Hughes Medical Institute, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
- Domitilla Del Vecchio
  - Synthetic Biology Center, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA.
  - Department of Mechanical Engineering, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA.
- Ron Weiss
  - Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA.
  - Synthetic Biology Center, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA.
  - Electrical Engineering and Computer Science Department, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA.
9. Improved prediction of protein-protein interactions using AlphaFold2. Nat Commun 2022; 13:1265. [PMID: 35273146 PMCID: PMC8913741 DOI: 10.1038/s41467-022-28865-w] [Citation(s) in RCA: 337] [Impact Index Per Article: 168.5] [Received: 09/30/2021] [Accepted: 02/11/2022] [Indexed: 01/02/2023] Open
Abstract
Predicting the structure of interacting protein chains is a fundamental step towards understanding protein function. Unfortunately, no computational method can produce accurate structures of protein complexes. AlphaFold2 has shown unprecedented levels of accuracy in modelling single chain protein structures. Here, we apply AlphaFold2 for the prediction of heterodimeric protein complexes. We find that the AlphaFold2 protocol, together with optimised multiple sequence alignments, generates models with acceptable quality (DockQ ≥ 0.23) for 63% of the dimers. From the predicted interfaces we create a simple function to predict the DockQ score which distinguishes acceptable from incorrect models as well as interacting from non-interacting proteins with state-of-art accuracy. We find that, using the predicted DockQ scores, we can identify 51% of all interacting pairs at 1% FPR. Predicting the structure of protein complexes is extremely difficult. Here, authors apply AlphaFold2 with optimized multiple sequence alignments to model complexes of interacting proteins, enabling prediction of both if and how proteins interact with state-of-art accuracy.
10. Mehrabiani KM, Cheng RR, Onuchic JN. Expanding Direct Coupling Analysis to Identify Heterodimeric Interfaces from Limited Protein Sequence Data. J Phys Chem B 2021; 125:11408-11417. [PMID: 34618469 DOI: 10.1021/acs.jpcb.1c07145] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022]
Abstract
Direct coupling analysis (DCA) is a global statistical approach that uses information encoded in protein sequence data to predict spatial contacts in a three-dimensional structure of a folded protein. DCA has been widely used to predict the monomeric fold at amino acid resolution and to identify biologically relevant interaction sites within a folded protein. Going beyond single proteins, DCA has also been used to identify spatial contacts that stabilize the interaction in protein complex formation. However, extracting this higher order information necessary to predict dimer contacts presents a significant challenge. A DCA evolutionary signal is much stronger at the single protein level (intraprotein contacts) than at the protein-protein interface (interprotein contacts). Therefore, if DCA-derived information is to be used to predict the structure of these complexes, there is a need to identify statistically significant DCA predictions. We propose a simple Z-score measure that can filter good predictions despite noisy, limited data. This new methodology not only improves our prediction ability but also provides a quantitative measure for the validity of the prediction.
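As a rough illustration of the filtering idea in this abstract, the sketch below standardizes a set of interprotein DCA scores and keeps the pairs that stand out from the background. The abstract does not give the authors' exact Z-score definition, so the plain population z-score, the function names `z_scores` and `significant_pairs`, the threshold of 2.0, and the toy scores are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def z_scores(scores):
    """Standardize DCA scores: z = (score - mean) / population std.

    `scores` maps a residue pair (i, j) to its interprotein DCA score.
    This is a generic z-score, assumed here for illustration only.
    """
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All scores identical: nothing stands out from the background.
        return {pair: 0.0 for pair in scores}
    return {pair: (value - mu) / sigma for pair, value in scores.items()}


def significant_pairs(scores, threshold=2.0):
    """Return pairs whose z-score is at least `threshold`, strongest first.

    The default threshold is a hypothetical choice, not a published value.
    """
    z = z_scores(scores)
    kept = [pair for pair, value in z.items() if value >= threshold]
    return sorted(kept, key=z.get, reverse=True)


# Toy data: nine background pairs and one pair with a clearly higher score.
toy_scores = {(i, 0): 0.1 for i in range(9)}
toy_scores[(3, 7)] = 1.0
candidate_contacts = significant_pairs(toy_scores)
```

With noisy, limited alignments the background distribution is itself poorly estimated, so any fixed threshold of this kind should be read as a tunable parameter rather than a rule.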
Affiliation(s)
- Kareem M Mehrabiani
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
  - Systems, Synthetic, and Physical Biology, Rice University, Houston, Texas 77005, United States
- Ryan R Cheng
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
- José N Onuchic
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
  - Systems, Synthetic, and Physical Biology, Rice University, Houston, Texas 77005, United States
  - Department of Physics & Astronomy, Rice University, Houston, Texas 77005, United States
  - Department of Chemistry, Rice University, Houston, Texas 77005, United States
  - Department of Biosciences, Rice University, Houston, Texas 77005, United States
11. Faßhauer P, Busche T, Kalinowski J, Mäder U, Poehlein A, Daniel R, Stülke J. Functional Redundancy and Specialization of the Conserved Cold Shock Proteins in Bacillus subtilis. Microorganisms 2021; 9:1434. [PMID: 34361870 PMCID: PMC8307031 DOI: 10.3390/microorganisms9071434] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 06/03/2021] [Revised: 06/22/2021] [Accepted: 06/30/2021] [Indexed: 12/26/2022] Open
Abstract
Many bacteria encode so-called cold shock proteins. These proteins are characterized by a conserved protein domain. Often, the bacteria have multiple cold shock proteins that are expressed either constitutively or at low temperatures. In the Gram-positive model bacterium Bacillus subtilis, two of three cold shock proteins, CspB and CspD, belong to the most abundant proteins, suggesting a very important function. To get insights into the role of these highly abundant proteins, we analyzed the phenotypes of single and double mutants, tested the expression of the csp genes and the impact of CspB and CspD on global gene expression in B. subtilis. We demonstrate that the simultaneous loss of both CspB and CspD results in a severe growth defect, in the loss of genetic competence, and the appearance of suppressor mutations. Overexpression of the third cold shock protein CspC could compensate for the loss of CspB and CspD. The transcriptome analysis revealed that the lack of CspB and CspD affects the expression of about 20% of all genes. In several cases, the lack of the cold shock proteins results in an increased read-through at transcription terminators, suggesting that CspB and CspD might be involved in the control of transcription termination.
Affiliation(s)
- Patrick Faßhauer
  - Department of General Microbiology, GZMB, Georg-August-University Göttingen, 37077 Göttingen, Germany
- Tobias Busche
  - Center for Biotechnology (CeBiTec), Bielefeld University, 33615 Bielefeld, Germany
- Jörn Kalinowski
  - Center for Biotechnology (CeBiTec), Bielefeld University, 33615 Bielefeld, Germany
- Ulrike Mäder
  - Interfaculty Institute for Genetics and Functional Genomics, University Medicine Greifswald, 17487 Greifswald, Germany
- Anja Poehlein
  - Department of Genomic and Applied Microbiology, GZMB, Georg-August-University Göttingen, 37077 Göttingen, Germany
- Rolf Daniel
  - Department of Genomic and Applied Microbiology, GZMB, Georg-August-University Göttingen, 37077 Göttingen, Germany
- Jörg Stülke
  - Department of General Microbiology, GZMB, Georg-August-University Göttingen, 37077 Göttingen, Germany
12. Schmidt M, Hamacher K. Identification of biophysical interaction patterns in direct coupling analysis. Phys Rev E 2021; 103:042418. [PMID: 34005861 DOI: 10.1103/physreve.103.042418] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2020] [Accepted: 03/27/2021] [Indexed: 11/07/2022]
Abstract
Direct-coupling analysis is a statistical learning method for protein contact prediction based on sequence information alone. The maximum entropy principle leads to an effective inverse Potts model. Predictions on contacts are based on fitted local fields and couplings from an empirical multiple sequence alignment. Typically, the ℓ₂ norm of the resulting two-body couplings is used for contact prediction. However, this procedure discards important information. In this paper we show that the usage of the full fields and coupling information improves prediction accuracy.
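The conventional scoring step mentioned in this abstract, collapsing each two-body coupling block to a single norm, can be sketched as below. This is a minimal illustration that assumes each site pair (i, j) has a q x q coupling matrix stored as nested lists; the function names, the two-state alphabet, and the toy coupling values are hypothetical. The authors' improved scoring based on the full fields and couplings is not reproduced here.

```python
import math


def frobenius_scores(couplings):
    """Collapse each q x q coupling block J_ij to its Frobenius (l2) norm.

    `couplings` maps a site pair (i, j), i < j, to a q x q nested list.
    Only the overall magnitude survives; the sign and pattern of the
    individual entries are discarded at this step.
    """
    return {
        pair: math.sqrt(sum(value * value for row in block for value in row))
        for pair, block in couplings.items()
    }


def rank_pairs(scores):
    """Order site pairs from strongest to weakest predicted contact."""
    return sorted(scores, key=scores.get, reverse=True)


# Toy couplings for three site pairs over a two-letter alphabet (q = 2).
toy_couplings = {
    (0, 1): [[0.0, 0.1], [0.1, 0.0]],
    (0, 2): [[1.0, -1.0], [-1.0, 1.0]],
    (1, 2): [[0.2, 0.0], [0.0, 0.2]],
}
toy_ranking = rank_pairs(frobenius_scores(toy_couplings))
```

Two blocks with very different entry patterns can share the same norm, which is one way to see what the norm-based score throws away.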
Affiliation(s)
- Michael Schmidt
  - Department of Physics, TU Darmstadt, Karolinenpl. 5, 64289 Darmstadt, Germany
- Kay Hamacher
  - Department of Physics, TU Darmstadt, Karolinenpl. 5, 64289 Darmstadt, Germany
  - Department of Biology, TU Darmstadt, Schnittspahnstr. 10, 64287 Darmstadt, Germany
  - Department of Computer Science, TU Darmstadt, Karolinenpl. 5, 64289 Darmstadt, Germany
13. Salmanian S, Pezeshk H, Sadeghi M. Inter-protein residue covariation information unravels physically interacting protein dimers. BMC Bioinformatics 2020; 21:584. [PMID: 33334319 PMCID: PMC7745481 DOI: 10.1186/s12859-020-03930-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/12/2020] [Accepted: 12/09/2020] [Indexed: 01/04/2023] Open
Abstract
BACKGROUND: Predicting physical interaction between proteins is one of the greatest challenges in computational biology. There are considerable various protein interactions and a huge number of protein sequences and synthetic peptides with unknown interacting counterparts. Most of co-evolutionary methods discover a combination of physical interplays and functional associations. However, there are only a handful of approaches which specifically infer physical interactions. Hybrid co-evolutionary methods exploit inter-protein residue coevolution to unravel specific physical interacting proteins. In this study, we introduce a hybrid co-evolutionary-based approach to predict physical interplays between pairs of protein families, starting from protein sequences only.
RESULTS: In the present analysis, pairs of multiple sequence alignments are constructed for each dimer and the covariation between residues in those pairs are calculated by CCMpred (Contacts from Correlated Mutations predicted) and three mutual information based approaches for ten accessible surface area threshold groups. Then, whole residue couplings between proteins of each dimer are unified into a single Frobenius norm value. Norms of residue contact matrices of all dimers in different accessible surface area thresholds are fed into support vector machine as single or multiple feature models. The results of training the classifiers by single features show no apparent different accuracies in distinct methods for different accessible surface area thresholds. Nevertheless, mutual information product and context likelihood of relatedness procedures may roughly have an overall higher and lower performances than other two methods for different accessible surface area cut-offs, respectively. The results also demonstrate that training support vector machine with multiple norm features for several accessible surface area thresholds leads to a considerable improvement of prediction performance. In this context, CCMpred roughly achieves an overall better performance than mutual information based approaches. The best accuracy, sensitivity, specificity, precision and negative predictive value for that method are 0.98, 1, 0.962, 0.96, and 0.962, respectively.
CONCLUSIONS: In this paper, by feeding norm values of protein dimers into support vector machines in different accessible surface area thresholds, we demonstrate that even small number of proteins in pairs of multiple alignments could allow one to accurately discriminate between positive and negative dimers.
Affiliation(s)
- Sara Salmanian
  - Department of Bioinformatics, Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
- Hamid Pezeshk
  - School of Mathematics, Statistics and Computer Science, College of Science, University of Tehran, Tehran, Iran
  - Present Address: Department of Mathematics and Statistics, Concordia University, Montreal, Canada
  - School of Biological Sciences, Institute for Research in Fundamental Sciences, Tehran, Iran
- Mehdi Sadeghi
  - National Institute of Genetic Engineering and Biotechnology, Tehran, Iran
14. Muntoni AP, Pagnani A, Weigt M, Zamponi F. Aligning biological sequences by exploiting residue conservation and coevolution. Phys Rev E 2020; 102:062409. [PMID: 33465950 DOI: 10.1103/physreve.102.062409] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 06/02/2020] [Accepted: 11/12/2020] [Indexed: 11/07/2022]
Abstract
Sequences of nucleotides (for DNA and RNA) or amino acids (for proteins) are central objects in biology. Among the most important computational problems is that of sequence alignment, i.e., arranging sequences from different organisms in such a way to identify similar regions, to detect evolutionary relationships between sequences, and to predict biomolecular structure and function. This is typically addressed through profile models, which capture position specificities like conservation in sequences but assume an independent evolution of different positions. Over recent years, it has been well established that coevolution of different amino-acid positions is essential for maintaining three-dimensional structure and function. Modeling approaches based on inverse statistical physics can catch the coevolution signal in sequence ensembles, and they are now widely used in predicting protein structure, protein-protein interactions, and mutational landscapes. Here, we present DCAlign, an efficient alignment algorithm based on an approximate message-passing strategy, which is able to overcome the limitations of profile models, to include coevolution among positions in a general way, and to be therefore universally applicable to protein- and RNA-sequence alignment without the need of using complementary structural information. The potential of DCAlign is carefully explored using well-controlled simulated data, as well as real protein and RNA sequences.
Affiliation(s)
- Anna Paola Muntoni
- Department of Applied Science and Technology (DISAT), Politecnico di Torino, Corso Duca degli Abruzzi 24, I-10129 Torino, Italy
- Laboratoire de Physique de l'Ecole Normale Supérieure, ENS, Université PSL, CNRS, Sorbonne Université, Université de Paris, F-75005 Paris, France
- Sorbonne Université, CNRS, Institut de Biologie Paris Seine, Biologie Computationnelle et Quantitative LCQB, F-75005 Paris, France
- Andrea Pagnani
- Department of Applied Science and Technology (DISAT), Politecnico di Torino, Corso Duca degli Abruzzi 24, I-10129 Torino, Italy
- Italian Institute for Genomic Medicine, IRCCS Candiolo, SP-142, I-10060 Candiolo (TO), Italy
- INFN, Sezione di Torino, Via Giuria 1, I-10125 Torino, Italy
- Martin Weigt
- Sorbonne Université, CNRS, Institut de Biologie Paris Seine, Biologie Computationnelle et Quantitative LCQB, F-75005 Paris, France
- Francesco Zamponi
- Laboratoire de Physique de l'Ecole Normale Supérieure, ENS, Université PSL, CNRS, Sorbonne Université, Université de Paris, F-75005 Paris, France
|
15
|
Application of Firefly Luciferase (Luc) as a Reporter Gene for the Chemoautotrophic and Acidophilic Acidithiobacillus spp. Curr Microbiol 2020; 77:3724-3730. [PMID: 32945904 DOI: 10.1007/s00284-020-02195-w] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 02/17/2020] [Accepted: 09/03/2020] [Indexed: 10/23/2022]
Abstract
Acidithiobacillus spp. are the most active bacteria in bioleaching and bioremediation because of their remarkable adaptability to extreme environments and their unique metabolic characteristics. Research on the regulatory mechanisms of energy metabolism and stress resistance is critical for understanding and applying Acidithiobacillus spp. However, the lack of an ideal reporter gene has been an obstacle to studying gene expression and regulatory mechanisms in these chemoautotrophic bacteria. In this study, we report firefly luciferase as a reporter gene for Acidithiobacillus caldus (A. caldus) and describe a firefly luciferase (Luc) reporter system. The Luc system was applied to the quantitative analysis of the transcription strength of the promoters of the tetH gene and the feoA gene in A. caldus. Moreover, the regulatory effect of the ferric uptake regulator (Fur) on the feoP gene in A. caldus was determined using the Luc system. The Luc reporter system is useful not only for studying regulatory mechanisms in A. caldus but also for research on other Acidithiobacillus species. This study therefore provides a useful new tool for studying the molecular biology and synthetic biological modification of these chemoautotrophic bacteria, which should promote the industrial application of Acidithiobacillus spp.
|
16
|
Correa Marrero M, Immink RGH, de Ridder D, van Dijk ADJ. Improved inference of intermolecular contacts through protein-protein interaction prediction using coevolutionary analysis. Bioinformatics 2020; 35:2036-2042. [PMID: 30398547 DOI: 10.1093/bioinformatics/bty924] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 07/04/2018] [Revised: 10/11/2018] [Accepted: 11/05/2018] [Indexed: 01/09/2023] Open
Abstract
MOTIVATION: Predicting residue-residue contacts between interacting proteins is an important problem in bioinformatics. The growing wealth of sequence data can be used to infer these contacts through correlated mutation analysis on multiple sequence alignments of interacting homologs of the proteins of interest. This requires correct identification of pairs of interacting proteins for many species, in order to avoid introducing noise (i.e. non-interacting sequences) into the analysis that will decrease predictive performance.
RESULTS: We have designed Ouroboros, a novel algorithm to reduce such noise in intermolecular contact prediction. Our method iterates between weighting proteins according to how likely they are to interact based on the correlated mutations signal, and predicting correlated mutations based on the weighted sequence alignment. We show that this approach accurately discriminates between protein interaction and non-interaction and simultaneously improves the prediction of intermolecular contact residues compared to a naive application of correlated mutation analysis. This requires no training labels concerning interactions or contacts. Furthermore, the method relaxes the assumption of one-to-one interaction made by previous approaches, allowing for the study of many-to-many interactions.
AVAILABILITY AND IMPLEMENTATION: Source code and test data are available at www.bif.wur.nl/.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Richard G H Immink
- Laboratory of Molecular Biology, Department of Plant Sciences
- Bioscience, Wageningen Plant Research
- Aalt D J van Dijk
- Bioinformatics Group, Department of Plant Sciences
- Bioscience, Wageningen Plant Research
- Biometris, Department of Plant Sciences, Wageningen University & Research, Wageningen PB, The Netherlands
|
17
|
McLean TC, Lo R, Tschowri N, Hoskisson PA, Al Bassam MM, Hutchings MI, Som NF. Sensing and responding to diverse extracellular signals: an updated analysis of the sensor kinases and response regulators of Streptomyces species. MICROBIOLOGY-SGM 2020; 165:929-952. [PMID: 31334697 DOI: 10.1099/mic.0.000817] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Indexed: 12/14/2022]
Abstract
Streptomyces venezuelae is a Gram-positive, filamentous actinomycete with a complex developmental life cycle. Genomic analysis revealed that S. venezuelae encodes a large number of two-component systems (TCSs): these consist of a membrane-bound sensor kinase (SK) and a cognate response regulator (RR). These proteins act together to detect and respond to diverse extracellular signals. Some of these systems have been shown to regulate antimicrobial biosynthesis in Streptomyces species, making them very attractive to researchers. The ability of S. venezuelae to sporulate in both liquid and solid cultures has made it an increasingly popular model organism in which to study these industrially and medically important bacteria. Bioinformatic analysis identified 58 TCS operons in S. venezuelae with an additional 27 orphan SK and 18 orphan RR genes. A broader approach identified 15 of the 58 encoded TCSs to be highly conserved in 93 Streptomyces species for which high-quality and complete genome sequences are available. This review attempts to unify the current work on TCS in the streptomycetes, with an emphasis on S. venezuelae.
Affiliation(s)
- Thomas C McLean
- School of Biological Sciences, University of East Anglia, Norwich Research Park, Norwich, Norfolk NR4 7TJ, UK
- Rebecca Lo
- School of Biological Sciences, University of East Anglia, Norwich Research Park, Norwich, Norfolk NR4 7TJ, UK
- Natalia Tschowri
- Institut für Biologie/Mikrobiologie, Humboldt-Universität zu Berlin, Berlin, Germany
- Paul A Hoskisson
- Strathclyde Institute of Pharmacy and Biomedical Sciences, University of Strathclyde, 161 Cathedral Street, Glasgow G4 0RE, UK
- Mahmoud M Al Bassam
- Department of Paediatrics, Division of Host-Microbe Systems and Therapeutics, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA
- Matthew I Hutchings
- School of Biological Sciences, University of East Anglia, Norwich Research Park, Norwich, Norfolk NR4 7TJ, UK
- Nicolle F Som
- School of Biological Sciences, University of East Anglia, Norwich Research Park, Norwich, Norfolk NR4 7TJ, UK
|
18
|
Gandarilla-Pérez CA, Mergny P, Weigt M, Bitbol AF. Statistical physics of interacting proteins: Impact of dataset size and quality assessed in synthetic sequences. Phys Rev E 2020; 101:032413. [PMID: 32290011 DOI: 10.1103/physreve.101.032413] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 12/23/2019] [Accepted: 03/04/2020] [Indexed: 11/07/2022]
Abstract
Identifying protein-protein interactions is crucial for a systems-level understanding of the cell. Recently, algorithms based on inverse statistical physics, e.g., direct coupling analysis (DCA), have made it possible to use evolutionarily related sequences to address two conceptually related inference tasks: finding pairs of interacting proteins and identifying pairs of residues which form contacts between interacting proteins. Here we address two underlying questions: How are the performances of both inference tasks related? How does performance depend on dataset size and quality? To this end, we formalize both tasks using Ising models defined over stochastic block models, with individual blocks representing single proteins and interblock couplings representing protein-protein interactions; controlled synthetic sequence data are generated by Monte Carlo simulations. We show that DCA is able to address both inference tasks accurately when sufficiently large training sets of known interaction partners are available and that an iterative pairing algorithm makes predictions possible even without a training set. Noise in the training data degrades performance. In both tasks we find a quadratic scaling relating dataset quality and size that is consistent with noise adding in square-root fashion and signal adding linearly when increasing the dataset. This implies that it is generally good to incorporate more data even if their quality is imperfect, thereby shedding light on the empirically observed performance of DCA applied to natural protein sequences.
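One hedged way to read the scaling statement in this abstract is sketched below. The notation is assumed for illustration and is not taken from the paper: $M$ is the number of sequence pairs in the dataset and $q$ is an effective per-pair signal strength that shrinks as the data become noisier.

```latex
\begin{align}
  \text{signal} &\propto q\,M, &
  \text{noise} &\propto \sqrt{M}, \\
  \frac{\text{signal}}{\text{noise}} &\propto q\,\sqrt{M}
  \quad\Longrightarrow\quad
  M \propto \frac{1}{q^{2}} \quad \text{at fixed performance.}
\end{align}
```

Under these assumptions, halving the effective quality $q$ would be compensated by roughly four times as many sequences, which is one way to see why adding imperfect data can still improve inference.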
Affiliation(s)
- Carlos A Gandarilla-Pérez
- Sorbonne Université, CNRS, Institut de Biologie Paris-Seine, Laboratoire de Biologie Computationnelle et Quantitative (LCQB, UMR 7238), F-75005 Paris, France
- Facultad de Física, Universidad de la Habana, San Lázaro y L, Vedado, Habana 4, CP-10400, Cuba
- Pierre Mergny
- Sorbonne Université, CNRS, Institut de Biologie Paris-Seine, Laboratoire de Biologie Computationnelle et Quantitative (LCQB, UMR 7238), F-75005 Paris, France
- Sorbonne Université, CNRS, Institut de Biologie Paris-Seine, Laboratoire Jean Perrin (LJP, UMR 8237), F-75005 Paris, France
- Martin Weigt
- Sorbonne Université, CNRS, Institut de Biologie Paris-Seine, Laboratoire de Biologie Computationnelle et Quantitative (LCQB, UMR 7238), F-75005 Paris, France
- Anne-Florence Bitbol
- Sorbonne Université, CNRS, Institut de Biologie Paris-Seine, Laboratoire Jean Perrin (LJP, UMR 8237), F-75005 Paris, France
- Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne, Switzerland
|
19
|
Reimer JM, Eivaskhani M, Harb I, Guarné A, Weigt M, Schmeing TM. Structures of a dimodular nonribosomal peptide synthetase reveal conformational flexibility. Science 2020; 366:eaaw4388. [PMID: 31699907 DOI: 10.1126/science.aaw4388] [Citation(s) in RCA: 80] [Impact Index Per Article: 20.0] [Received: 12/30/2018] [Revised: 06/04/2019] [Accepted: 10/10/2019] [Indexed: 01/01/2023]
Abstract
Nonribosomal peptide synthetases (NRPSs) are biosynthetic enzymes that synthesize natural product therapeutics using a modular synthetic logic, whereby each module adds one aminoacyl substrate to the nascent peptide. We have determined five x-ray crystal structures of large constructs of the NRPS linear gramicidin synthetase, including a structure of a full core dimodule in conformations organized for the condensation reaction and intermodular peptidyl substrate delivery. The structures reveal differences in the relative positions of adjacent modules, which are not strictly coupled to the catalytic cycle and are consistent with small-angle x-ray scattering data. The structures and covariation analysis of homologs allowed us to create mutants that improve the yield of a peptide from a module-swapped dimodular NRPS.
Affiliation(s)
- Janice M Reimer
- Department of Biochemistry and Center de Recherche en Biologie Structurale, McGill University, Montréal, QC H3G 0B1, Canada
- Maximilian Eivaskhani
- Department of Biochemistry and Center de Recherche en Biologie Structurale, McGill University, Montréal, QC H3G 0B1, Canada
- Ingrid Harb
- Department of Biochemistry and Center de Recherche en Biologie Structurale, McGill University, Montréal, QC H3G 0B1, Canada
- Alba Guarné
- Department of Biochemistry and Center de Recherche en Biologie Structurale, McGill University, Montréal, QC H3G 0B1, Canada
- Martin Weigt
- Sorbonne Université, CNRS, Institut de Biologie Paris-Seine, Laboratory of Computational and Quantitative Biology, F-75005 Paris, France
- T Martin Schmeing
- Department of Biochemistry and Center de Recherche en Biologie Structurale, McGill University, Montréal, QC H3G 0B1, Canada
|
20
|
Nerattini F, Figliuzzi M, Cardelli C, Tubiana L, Bianco V, Dellago C, Coluzza I. Identification of Protein Functional Regions. Chemphyschem 2020; 21:335-347. [PMID: 31944517 DOI: 10.1002/cphc.201900898] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/15/2019] [Revised: 11/01/2019] [Indexed: 11/12/2022]
Abstract
A protein sequence stores information on both functionality and stability, which makes it difficult to disentangle the two contributions. However, the identification of critical residues for function and stability has important implications for mapping proteome interactions, as well as for many pharmaceutical applications, e.g. the identification of ligand binding regions for targeted pharmaceutical protein design. In this work, we propose a computational method to identify critical residues for protein functionality and stability and to further categorise them as strictly functional, structural or intermediate. We evaluate single site conservation and use Direct Coupling Analysis (DCA) to identify co-evolved residues both in natural and artificial evolution processes. We reproduce artificial evolution using protein design and base our approach on the hypothesis that artificial evolution in the absence of any functional constraint would exclusively lead to site conservation and co-evolution events of the structural type. Conversely, natural evolution intrinsically embeds both functional and structural information. By comparing the lists of conserved and co-evolved residues obtained from the analysis of natural and artificial evolution, we identify the functional residues without the need for any a priori knowledge of the biological role of the analysed protein.
Affiliation(s)
- Francesca Nerattini
- Faculty of Physics, University of Vienna, Boltzmanngasse 5, 1090, Vienna, Austria
- Matteo Figliuzzi
- Sorbonne Universites, UPMC, Institut de Biologie Paris-Seine, CNRS, Laboratoire de Biologie Computationnelle et Quantitative UMR, 7238, Paris, France
- Chiara Cardelli
- Faculty of Physics, University of Vienna, Boltzmanngasse 5, 1090, Vienna, Austria
- Luca Tubiana
- Physics Department, Università degli studi di Trento, via Sommarive 14, 38123, Trento, IT
- Valentino Bianco
- Faculty of Physics, University of Vienna, Boltzmanngasse 5, 1090, Vienna, Austria
- Faculty of Chemistry, Chemical Physics Department, Universidad Complutense de Madrid, Plaza de las Ciencias, Ciudad Universitaria, Madrid, 28040, Spain
- Christoph Dellago
- Faculty of Physics, University of Vienna, Boltzmanngasse 5, 1090, Vienna, Austria
- Ivan Coluzza
- CIC biomaGUNE, Paseo Miramon 182, 20014 San Sebastian, Spain, and IKERBASQUE, Basque Foundation for Science, 48013, Bilbao, Spain
|
21
|
Sala D, Cerofolini L, Fragai M, Giachetti A, Luchinat C, Rosato A. A protocol to automatically calculate homo-oligomeric protein structures through the integration of evolutionary constraints and NMR ambiguous contacts. Comput Struct Biotechnol J 2019; 18:114-124. [PMID: 31969972 PMCID: PMC6961069 DOI: 10.1016/j.csbj.2019.12.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 08/22/2019] [Revised: 11/20/2019] [Accepted: 12/06/2019] [Indexed: 12/15/2022] Open
Abstract
Protein assemblies are involved in many important biological processes. Solid-state NMR (SSNMR) spectroscopy is a technique suitable for the structural characterization of samples with high molecular weight and thus can be applied to such assemblies. A significant bottleneck in terms of both effort and time required is the manual identification of unambiguous intermolecular contacts. This is particularly challenging for homo-oligomeric complexes, where simple uniform labeling may not be effective. We tackled this challenge by exploiting coevolution analysis to extract information on homo-oligomeric interfaces from NMR-derived ambiguous contacts. After removing the evolutionary couplings (ECs) that are already satisfied by the 3D structure of the monomer, the predicted ECs are matched with the automatically generated list of experimental contacts. This approach provides a selection of potential interface residues that is used directly in monomer-monomer docking calculations. We validated the protocol on tetrameric L-asparaginase II and dimeric Sod1.
Affiliation(s)
- Davide Sala
- Magnetic Resonance Center (CERM), University of Florence, Via Luigi Sacconi 6, 50019 Sesto Fiorentino, Italy
- Linda Cerofolini
- Consorzio Interuniversitario di Risonanze Magnetiche di Metallo Proteine, Via Luigi Sacconi 6, 50019 Sesto Fiorentino, Italy
- Marco Fragai
- Magnetic Resonance Center (CERM), University of Florence, Via Luigi Sacconi 6, 50019 Sesto Fiorentino, Italy
- Department of Chemistry, University of Florence, Via della Lastruccia 3, 50019 Sesto Fiorentino, Italy
- Andrea Giachetti
- Consorzio Interuniversitario di Risonanze Magnetiche di Metallo Proteine, Via Luigi Sacconi 6, 50019 Sesto Fiorentino, Italy
- Claudio Luchinat
- Magnetic Resonance Center (CERM), University of Florence, Via Luigi Sacconi 6, 50019 Sesto Fiorentino, Italy
- Department of Chemistry, University of Florence, Via della Lastruccia 3, 50019 Sesto Fiorentino, Italy
- Antonio Rosato
- Magnetic Resonance Center (CERM), University of Florence, Via Luigi Sacconi 6, 50019 Sesto Fiorentino, Italy
- Department of Chemistry, University of Florence, Via della Lastruccia 3, 50019 Sesto Fiorentino, Italy
|
22
|
Evaluation of specificity determinants in Mycobacterium tuberculosis σ/anti-σ factor interactions. Biochem Biophys Res Commun 2019; 521:900-906. [PMID: 31711645 DOI: 10.1016/j.bbrc.2019.10.198] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 10/18/2019] [Accepted: 10/31/2019] [Indexed: 01/11/2023]
Abstract
Extra Cytoplasmic Function (ECF) σ factor/regulatory protein (anti-σ factor) pairs govern environment-mediated changes in gene expression in bacteria. The release of the ECF σ factor from an inactive σ/anti-σ factor complex is triggered by specific environmental stimuli. The free σ factor then associates with the RNA polymerase and drives the expression of genes in its target regulon. Multiple ECF σ/anti-σ pairs ensure calibrated changes in the expression profile by correlating diverse environmental stimuli with changes in the intracellular levels of different ECF σ factors. Specificity in σ/anti-σ factor interaction is thus essential for accurate signal transduction. Here we describe experiments to evaluate interactions between different M. tuberculosis σ and anti-σ proteins in vitro. The interaction parameters suggest that cross-talk between non-cognate σ/anti-σ pairs is likely. The sequence and conformational determinants that govern interaction specificity in a σ/anti-σ complex are not immediately evident due to substantial structural conservation. Sequence-structure analysis of all σ/anti-σ pairs suggests that conserved residues are not the primary determinants of σ/anti-σ interactions, a finding that suggests a potential route to set tolerance limits in interaction specificity. Non-specific σ/anti-σ interactions are likely to be biologically significant as they can contribute to heterogeneity in cellular responses in a bacterial population under less stringent requirements. This finding is relevant for synthetic biology approaches to engineer bacteria using σ/anti-σ transcription initiation modules for diverse applications in biotechnology.
|
23
|
Phylogenetic correlations can suffice to infer protein partners from sequences. PLoS Comput Biol 2019; 15:e1007179. [PMID: 31609984 PMCID: PMC6812855 DOI: 10.1371/journal.pcbi.1007179] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 06/10/2019] [Revised: 10/24/2019] [Accepted: 09/25/2019] [Indexed: 12/30/2022] Open
Abstract
Determining which proteins interact together is crucial to a systems-level understanding of the cell. Recently, algorithms based on Direct Coupling Analysis (DCA) pairwise maximum-entropy models have made it possible to identify interaction partners among paralogous proteins from sequence data. This success of DCA at predicting protein-protein interactions could be mainly based on its known ability to identify pairs of residues that are in contact in the three-dimensional structure of protein complexes and that coevolve to remain physicochemically complementary. However, interacting proteins possess similar evolutionary histories. What is the role of purely phylogenetic correlations in the performance of DCA-based methods to infer interaction partners? To address this question, we employ controlled synthetic data that only involve phylogeny and no interactions or contacts. We find that DCA accurately identifies the pairs of synthetic sequences that share evolutionary history. While phylogenetic correlations confound the identification of contacting residues by DCA, they are thus useful to predict interacting partners among paralogs. We find that DCA performs as well as phylogenetic methods to this end, and slightly better than them with large and accurate training sets. Employing DCA or phylogenetic methods within an Iterative Pairing Algorithm (IPA) makes it possible to predict pairs of evolutionary partners without a training set. We further demonstrate the ability of these various methods to correctly predict pairings among real paralogous proteins with genome proximity but no known direct physical interaction, illustrating the importance of phylogenetic correlations in natural data. However, for physically interacting and strongly coevolving proteins, DCA and mutual information outperform phylogenetic methods. We finally discuss how to distinguish physically interacting proteins from proteins that only share a common evolutionary history.
Many biologically important protein-protein interactions are conserved over evolutionary time scales. This leads to two different signals that can be used to computationally predict interactions between protein families and to identify specific interaction partners. First, the shared evolutionary history leads to highly similar phylogenetic relationships between interacting proteins of the two families. Second, the need to keep the interaction surfaces of partner proteins biophysically compatible causes a correlated amino-acid usage of interface residues. Employing simulated data, we show that the shared history alone can be used to detect partner proteins. Similar accuracies are achieved by algorithms comparing phylogenetic relationships and by methods based on Direct Coupling Analysis (DCA), which are primarily known for their ability to detect the second type of signal. Using natural sequence data, we show that in cases with shared evolutionary history but without known physical interactions, both methods work with similar accuracy, while for some physically interacting systems, DCA and mutual information outperform phylogenetic methods. We propose methods allowing both to predict interactions between protein families and to find interacting partners among paralogs.
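Mutual information, named in this abstract as one of the covariation scores, can be illustrated with a minimal sketch. The function name and the toy alignment columns below are hypothetical and are not taken from the paper; the code only computes the plain mutual information between two alignment columns, with none of the sequence reweighting or corrections that published pipelines usually add.

```python
from collections import Counter
from math import log


def mutual_information(col_i, col_j):
    """Mutual information (in nats) between two equally long alignment columns."""
    if len(col_i) != len(col_j) or not col_i:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(col_i)
    freq_i = Counter(col_i)               # single-column residue counts
    freq_j = Counter(col_j)
    freq_ij = Counter(zip(col_i, col_j))  # joint residue-pair counts
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        p_a = freq_i[a] / n
        p_b = freq_j[b] / n
        mi += p_ab * log(p_ab / (p_a * p_b))
    return mi


# Hypothetical toy columns: one residue per paired sequence.
covarying = mutual_information("AAKK", "DDEE")    # residues change together
independent = mutual_information("AAKK", "DEDE")  # no statistical dependence
```

In this toy case the covarying pair scores log 2 and the independent pair scores zero; on real alignments, finite sampling and shared phylogeny both inflate such scores, which is the confounding effect the abstract discusses.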
|
24
|
Croce G, Gueudré T, Ruiz Cuevas MV, Keidel V, Figliuzzi M, Szurmant H, Weigt M. A multi-scale coevolutionary approach to predict interactions between protein domains. PLoS Comput Biol 2019; 15:e1006891. [PMID: 31634362 PMCID: PMC6822775 DOI: 10.1371/journal.pcbi.1006891] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 02/19/2019] [Revised: 10/31/2019] [Accepted: 09/27/2019] [Indexed: 11/18/2022] Open
Abstract
Interacting proteins and protein domains coevolve on multiple scales, from their correlated presence across species, to correlations in amino-acid usage. Genomic databases provide rapidly growing data for variability in genomic protein content and in protein sequences, calling for computational predictions of unknown interactions. We first introduce the concept of direct phyletic couplings, based on global statistical models of phylogenetic profiles. They strongly increase the accuracy of predicting pairs of related protein domains beyond simpler correlation-based approaches like phylogenetic profiling (80% vs. 30-50% positives out of the 1000 highest-scoring pairs). Combined with the direct coupling analysis of inter-protein residue-residue coevolution, we provide multi-scale evidence for direct but unknown interaction between protein families. An in-depth discussion shows these to be biologically sensible and directly experimentally testable. Negative phyletic couplings highlight alternative solutions for the same functionality, including documented cases of convergent evolution. Thereby our work proves the strong potential of global statistical modeling approaches to genome-wide coevolutionary analysis, far beyond the established use for individual protein complexes and domain-domain interactions.
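The "simpler correlation-based approaches like phylogenetic profiling" mentioned in this abstract can be sketched as follows. This is only the baseline idea, not the direct phyletic coupling model the paper introduces; the function name, the domain labels and the toy presence/absence profiles are hypothetical.

```python
from math import sqrt


def profile_correlation(a, b):
    """Pearson correlation between two presence/absence profiles (lists of 0/1)."""
    if len(a) != len(b) or not a:
        raise ValueError("profiles must be non-empty and of equal length")
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a domain present in all or no genomes carries no signal
    return cov / sqrt(var_a * var_b)


# Hypothetical toy data: presence (1) or absence (0) of a domain in six genomes.
profiles = {
    "domA": [1, 1, 0, 0, 1, 0],
    "domB": [1, 1, 0, 0, 1, 0],  # always co-occurs with domA
    "domC": [0, 0, 1, 1, 0, 1],  # never co-occurs with domA
}

same = profile_correlation(profiles["domA"], profiles["domB"])
opposite = profile_correlation(profiles["domA"], profiles["domC"])
```

A pairwise correlation like this cannot separate direct from indirect associations, which is the limitation a global statistical model is meant to address.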
Affiliation(s)
- Giancarlo Croce
- Sorbonne Université, CNRS, Institut de Biologie Paris Seine, Biologie computationnelle et quantitative–LCQB, Paris, France
- Maria Virginia Ruiz Cuevas
- Sorbonne Université, CNRS, Institut de Biologie Paris Seine, Biologie computationnelle et quantitative–LCQB, Paris, France
- Victoria Keidel
- Department of Basic Medical Sciences, College of Osteopathic Medicine of the Pacific, Western University of Health Sciences, Pomona CA, United States of America
- Matteo Figliuzzi
- Sorbonne Université, CNRS, Institut de Biologie Paris Seine, Biologie computationnelle et quantitative–LCQB, Paris, France
- Hendrik Szurmant
- Department of Basic Medical Sciences, College of Osteopathic Medicine of the Pacific, Western University of Health Sciences, Pomona CA, United States of America
- Martin Weigt
- Sorbonne Université, CNRS, Institut de Biologie Paris Seine, Biologie computationnelle et quantitative–LCQB, Paris, France
|
25
|
Szurmant H. Evolutionary couplings of amino acid residues reveal structure and function of bacterial signaling proteins. Mol Microbiol 2019; 112:432-437. [PMID: 31102561 DOI: 10.1111/mmi.14282] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Accepted: 05/15/2019] [Indexed: 12/12/2022]
Abstract
The genomic era, along with major advances in high-throughput sequencing technology, has led to a rapid expansion of the genomic and consequently the protein sequence space. Bacterial extracytoplasmic function sigma factors have emerged as an important group of signaling proteins in bacteria involved in many regulatory decisions, most notably the adaptation to cell envelope stress. Their wide prevalence and amplification among bacterial genomes have led to sub-group classification and the realization of diverse signaling mechanisms. Mathematical frameworks have been developed to utilize extensive protein sequence alignments to extract co-evolutionary signals of interaction. This has proven useful in a number of different biological fields, including de novo structure prediction, protein-protein partner identification and the elucidation of alternative protein conformations for signal proteins, to name a few. The mathematical tools, commonly referred to under the name 'Direct Coupling Analysis', have now been applied to deduce molecular mechanisms of activation for sub-groups of extracytoplasmic sigma factors, adding to previous successes on bacterial two-component signaling proteins. The amplification of signal transduction protein genes in bacterial genomes made them the first to be amenable to this approach, but the sequences are now available to aid the molecular microbiologist, no matter their protein pathway of interest.
Affiliation(s)
- Hendrik Szurmant
- Basic Medical Science, College of Osteopathic Medicine of the Pacific, Western University of Health Sciences, Pomona, CA, USA
|
26
|
The role of coevolutionary signatures in protein interaction dynamics, complex inference, molecular recognition, and mutational landscapes. Curr Opin Struct Biol 2019; 56:179-186. [PMID: 31029927 DOI: 10.1016/j.sbi.2019.03.024] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 03/10/2019] [Revised: 03/18/2019] [Accepted: 03/19/2019] [Indexed: 11/22/2022]
Abstract
Evolution imposes constraints at the interface of interacting biomolecules in order to preserve function or maintain fitness. This pressure may have a direct effect on the sequence composition of interacting biomolecules. As a result, statistical patterns of amino acid or nucleotide covariance that encode physical and functional interactions are observed in sequences of extant organisms. In recent years, global pairwise models of amino acid and nucleotide coevolution from multiple sequence alignments have been developed and utilized to study molecular interactions in structural biology. In proteins, for which the energy landscape is funneled and minimally frustrated, a direct connection between the physical and sequence space landscapes can be established. Estimating coevolutionary information from sequences of interacting molecules has a broad impact in molecular biology. Applications include the accurate determination of 3D structures of molecular complexes, inference of protein interaction partners, models of protein-protein interaction specificity, the elucidation and design of protein-nucleic acid recognition, as well as the discovery of genome-wide epistatic effects. The current state of the art of coevolutionary analysis includes biomedical applications ranging from mutational landscapes and drug design to vaccine development.
|
27
|
Spangler JR, Dean SN, Leary DH, Walper SA. Response of Lactobacillus plantarum WCFS1 to the Gram-Negative Pathogen-Associated Quorum Sensing Molecule N-3-Oxododecanoyl Homoserine Lactone. Front Microbiol 2019; 10:715. [PMID: 31024494 PMCID: PMC6459948 DOI: 10.3389/fmicb.2019.00715] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 01/16/2019] [Accepted: 03/21/2019] [Indexed: 12/18/2022] Open
Abstract
The bacterial quorum sensing phenomenon has been well studied since its discovery and has traditionally been considered to include signaling pathways recognized exclusively within either Gram-positive or Gram-negative bacteria. These groups of bacteria synthesize structurally distinct signaling molecules to mediate quorum sensing, where Gram-positive bacteria traditionally utilize small autoinducing peptides (AIPs) and Gram-negatives use small molecules such as acyl-homoserine lactones (AHLs). The structural differences between the types of signaling molecules have historically implied a lack of cross-talk among Gram-positive and Gram-negative quorum sensing systems. Recent investigations, however, have demonstrated the ability of AIPs and AHLs to be produced by non-canonical organisms, implying quorum sensing systems may be more universally recognized than previously hypothesized. With that in mind, our interests were piqued by the organisms Lactobacillus plantarum, a Gram-positive commensal probiotic known to participate in AIP-mediated quorum sensing, and Pseudomonas aeruginosa, a characterized Gram-negative pathogen whose virulence is in part controlled by AHL-mediated quorum sensing. Both health-related organisms are known to inhabit the human gut in various instances, both are characterized to elicit distinct effects on host immunity, and some studies hint at the putative ability of L. plantarum to degrade AHLs produced by P. aeruginosa. We therefore wanted to determine if L. plantarum cultures would respond to the addition of N-(3-oxododecanoyl)-L-homoserine lactone (3OC12) from P. aeruginosa by analyzing changes in both the transcriptome and proteome over time. Based on the observed upregulation of various two-component systems, response regulators, and native quorum sensing related genes, the resulting data provide evidence of an AHL recognition and response by L. plantarum.
Affiliation(s)
- Joseph R. Spangler
- National Research Council Postdoctoral Fellowships, NRC Research Associateship Programs, Washington, DC, United States
| | - Scott N. Dean
- National Research Council Postdoctoral Fellowships, NRC Research Associateship Programs, Washington, DC, United States
| | - Dagmar H. Leary
- United States Naval Research Laboratory, Center for Biomolecular Science and Engineering, Washington, DC, United States
| | - Scott A. Walper
- United States Naval Research Laboratory, Center for Biomolecular Science and Engineering, Washington, DC, United States
| |
|
28
|
Understanding molecular mechanisms in cell signaling through natural and artificial sequence variation. Nat Struct Mol Biol 2018; 26:25-34. [PMID: 30598552 DOI: 10.1038/s41594-018-0175-9] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 09/08/2018] [Accepted: 11/16/2018] [Indexed: 02/08/2023]
Abstract
The functionally tolerated sequence space of proteins can now be explored in an unprecedented way, owing to the expansion of genomic databases and the development of high-throughput methods to interrogate protein function. For signaling proteins, several recent studies have shown how the analysis of sequence variation leverages the available protein-structure information to provide new insights into specificity and allosteric regulation. In this Review, we discuss recent work that illustrates how this emerging approach is providing a deeper understanding of signaling proteins.
|
29
|
Bitbol AF. Inferring interaction partners from protein sequences using mutual information. PLoS Comput Biol 2018; 14:e1006401. [PMID: 30422978 PMCID: PMC6258550 DOI: 10.1371/journal.pcbi.1006401] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.8] [Received: 07/23/2018] [Revised: 11/27/2018] [Accepted: 10/27/2018] [Indexed: 11/30/2022] Open
Abstract
Functional protein-protein interactions are crucial in most cellular processes. They enable multi-protein complexes to assemble and to remain stable, and they allow signal transduction in various pathways. Functional interactions between proteins result in coevolution between the interacting partners, and thus in correlations between their sequences. Pairwise maximum-entropy based models have enabled successful inference of pairs of amino-acid residues that are in contact in the three-dimensional structure of multi-protein complexes, starting from the correlations in the sequence data of known interaction partners. Recently, algorithms inspired by these methods have been developed to identify which proteins are functional interaction partners among the paralogous proteins of two families, starting from sequence data alone. Here, we demonstrate that a slightly higher performance for partner identification can be reached by an approximate maximization of the mutual information between the sequence alignments of the two protein families. Our mutual information-based method also provides signatures of the existence of interactions between protein families. These results stand in contrast with structure prediction of proteins and of multi-protein complexes from sequence data, where pairwise maximum-entropy based global statistical models substantially improve performance compared to mutual information. Our findings entail that the statistical dependences allowing interaction partner prediction from sequence data are not restricted to the residue pairs that are in direct contact at the interface between the partner proteins.
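The quantity behind this abstract can be illustrated with a minimal sketch, assuming toy data: the mutual information between one alignment column of a kinase family and one column of a regulator family. The function name `column_mutual_information` and the example columns are hypothetical and are not taken from the paper, whose method approximately maximizes mutual information over whole alignments rather than a single column pair.

```python
from collections import Counter
from math import log2


def column_mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns.

    col_a and col_b hold one residue per sequence, in the same
    sequence order (row k of both columns belongs to the same pairing).
    """
    if len(col_a) != len(col_b):
        raise ValueError("columns must contain the same number of sequences")
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))

    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        p_ab = n_ab / n
        p_a = count_a[a] / n
        p_b = count_b[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Toy columns (hypothetical data): the first pairing is perfectly
# correlated, the second pairing is statistically independent.
kinase_col = "AAGGAAGG"
regulator_col = "LLVVLLVV"
mispaired_col = "LVLVLVLV"

mi_paired = column_mutual_information(kinase_col, regulator_col)
mi_mispaired = column_mutual_information(kinase_col, mispaired_col)
print(mi_paired, mi_mispaired)
```

In this toy case the correctly ordered pairing carries more mutual information than the scrambled one, which is the kind of signal a pairing search can try to maximize.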
Affiliation(s)
- Anne-Florence Bitbol
- Sorbonne Université, CNRS, Laboratoire Jean Perrin (UMR 8237), F-75005 Paris, France
| |
|
30
|
Davidson P, Eutsey R, Redler B, Hiller NL, Laub MT, Durand D. Flexibility and constraint: Evolutionary remodeling of the sporulation initiation pathway in Firmicutes. PLoS Genet 2018; 14:e1007470. [PMID: 30212463 PMCID: PMC6136694 DOI: 10.1371/journal.pgen.1007470] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 10/01/2017] [Accepted: 06/04/2018] [Indexed: 12/16/2022] Open
Abstract
The evolution of signal transduction pathways is constrained by the requirements of signal fidelity, yet flexibility is necessary to allow pathway remodeling in response to environmental challenges. A detailed understanding of how flexibility and constraint shape bacterial two component signaling systems is emerging, but how new signal transduction architectures arise remains unclear. Here, we investigate pathway remodeling using the Firmicute sporulation initiation (Spo0) pathway as a model. The present-day Spo0 pathways in Bacilli and Clostridia share common ancestry, but possess different architectures. In Clostridium acetobutylicum, sensor kinases directly phosphorylate Spo0A, the master regulator of sporulation. In Bacillus subtilis, Spo0A is activated via a four-protein phosphorelay. The current view favors an ancestral direct phosphorylation architecture, with the phosphorelay emerging in the Bacillar lineage. Our results reject this hypothesis. Our analysis of 84 broadly distributed Firmicute genomes predicts phosphorelays in numerous Clostridia, contrary to the expectation that the Spo0 phosphorelay is unique to Bacilli. Our experimental verification of a functional Spo0 phosphorelay encoded by Desulfotomaculum acetoxidans (Class Clostridia) further supports functional phosphorelays in Clostridia, which strongly suggests that the ancestral Spo0 pathway was a phosphorelay. Cross complementation assays between Bacillar and Clostridial phosphorelays demonstrate conservation of interaction specificity since their divergence over 2.7 BYA. Further, the distribution of direct phosphorylation Spo0 pathways is patchy, suggesting multiple, independent instances of remodeling from phosphorelay to direct phosphorylation. We provide evidence that these transitions are likely the result of changes in sporulation kinase specificity or acquisition of a sensor kinase with specificity for Spo0A, which is remarkably conserved in both architectures. 
We conclude that flexible encoding of interaction specificity, a phenotype that is only intermittently essential, and the recruitment of kinases to recognize novel environmental signals resulted in a consistent and repeated pattern of remodeling of the Spo0 pathway. Survival in a changing world requires signal transduction circuitry that can evolve to sense and respond to new environmental challenges. The Firmicute sporulation initiation (Spo0) pathway is a compelling example of a pathway with a circuit diagram that has changed over the course of evolution. In Clostridium acetobutylicum, a sensor kinase directly activates the master regulator of sporulation, Spo0A. In Bacillus subtilis, Spo0A is activated indirectly via a four-protein phosphorelay. These early observations suggested that the ancestral Spo0A was directly phosphorylated by a kinase in the earliest spore-former and that the Spo0 phosphorelay arose later in Bacilli via gain of additional proteins and interactions. Our analysis, based on a much larger set of genomes, surprisingly reveals phosphorelays, not only in Bacilli, but in many Clostridia. These findings support a model wherein sporulation was initiated by a Spo0 phosphorelay in the ancestral spore-former and the direct phosphorylation Spo0 pathways, which are observed in distinct sets of Clostridial taxa, are the result of convergent, reductive evolution. Further, our evidence suggests that these remodeling events were mediated by changes in kinase specificity, implicating flexible pathway remodeling, potentially combined with the recruitment of kinases, in Spo0 pathway evolution.
Affiliation(s)
- Philip Davidson
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
| | - Rory Eutsey
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
| | - Brendan Redler
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
| | - N. Luisa Hiller
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
- Center of Excellence in Biofilm Research, Allegheny Health Network, Pittsburgh, Pennsylvania, United States of America
| | - Michael T. Laub
- Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Howard Hughes Medical Institute, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
| | - Dannie Durand
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
- Department of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
| |
|
31
|
Cheng RR, Haglund E, Tiee NS, Morcos F, Levine H, Adams JA, Jennings PA, Onuchic JN. Designing bacterial signaling interactions with coevolutionary landscapes. PLoS One 2018; 13:e0201734. [PMID: 30125296 PMCID: PMC6101370 DOI: 10.1371/journal.pone.0201734] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 05/16/2018] [Accepted: 07/21/2018] [Indexed: 11/19/2022] Open
Abstract
Selecting amino acids to design novel protein-protein interactions that facilitate catalysis is a daunting challenge. We propose that a computational coevolutionary landscape based on sequence analysis alone offers a major advantage over expensive, time-consuming brute-force approaches currently employed. Our coevolutionary landscape allows prediction of single amino acid substitutions that produce functional interactions between non-cognate, interspecies signaling partners. In addition, it can also predict mutations that maintain segregation of signaling pathways across species. Specifically, predictions of phosphotransfer activity from the Escherichia coli histidine kinase EnvZ to the non-cognate receiver Spo0F from Bacillus subtilis were compiled. Twelve mutations designed to enhance, suppress, or have a neutral effect on kinase phosphotransfer activity to a non-cognate partner were selected. We experimentally tested the ability of the kinase to relay phosphate to the respective designed Spo0F receiver proteins against the theoretical predictions. Our key finding is that the coevolutionary landscape theory, with limited structural data, can significantly reduce the search-space for successful prediction of single amino acid substitutions that modulate phosphotransfer between the two-component His-Asp relay partners in a predicted fashion. This combined approach offers significant improvements over large-scale mutation studies currently used for protein engineering and design.
Affiliation(s)
- Ryan R. Cheng
- Center for Theoretical Biological Physics, Rice University, Houston, Texas, United States of America
- * E-mail: (RRC); (JNO)
| | - Ellinor Haglund
- Center for Theoretical Biological Physics, Rice University, Houston, Texas, United States of America
| | - Nicholas S. Tiee
- Department of Chemistry & Biochemistry, The University of California, San Diego, California, United States of America
| | - Faruck Morcos
- Department of Biological Sciences, University of Texas at Dallas, Dallas, Texas, United States of America
- Department of Bioengineering, University of Texas at Dallas, Dallas, Texas, United States of America
| | - Herbert Levine
- Center for Theoretical Biological Physics, Rice University, Houston, Texas, United States of America
- Department of Bioengineering, Rice University, Houston, Texas, United States of America
- Department of Biosciences, Rice University, Houston, Texas, United States of America
- Department of Physics & Astronomy, Rice University, Houston, Texas, United States of America
| | - Joseph A. Adams
- Department of Pharmacology, The University of California, San Diego, California, United States of America
| | - Patricia A. Jennings
- Department of Chemistry & Biochemistry, The University of California, San Diego, California, United States of America
| | - José N. Onuchic
- Center for Theoretical Biological Physics, Rice University, Houston, Texas, United States of America
- Department of Biosciences, Rice University, Houston, Texas, United States of America
- Department of Physics & Astronomy, Rice University, Houston, Texas, United States of America
- Department of Chemistry, Rice University, Houston, Texas, United States of America
- * E-mail: (RRC); (JNO)
| |
|
32
|
Szurmant H, Weigt M. Inter-residue, inter-protein and inter-family coevolution: bridging the scales. Curr Opin Struct Biol 2018; 50:26-32. [PMID: 29101847 PMCID: PMC5940578 DOI: 10.1016/j.sbi.2017.10.014] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Received: 09/05/2017] [Revised: 10/12/2017] [Accepted: 10/13/2017] [Indexed: 10/18/2022]
Abstract
Interacting proteins coevolve at multiple but interconnected scales, from the residue-residue over the protein-protein up to the family-family level. The recent accumulation of enormous amounts of sequence data allows for the development of novel, data-driven computational approaches. Notably, these approaches can bridge scales within a single statistical framework. Although currently applied mostly to isolated problems on single scales, their immense potential for an evolutionarily informed, structural systems biology is steadily emerging.
Affiliation(s)
- Hendrik Szurmant
- Department of Basic Medical Sciences, College of Osteopathic Medicine of the Pacific, Western University of Health Sciences, Pomona, CA 91766, USA.
| | - Martin Weigt
- Sorbonne Universités, UPMC Université Paris 06, CNRS, Biologie Computationnelle et Quantitative - Institut de Biologie Paris Seine, 75005 Paris, France.
| |
|
33
|
Nicoludis JM, Gaudet R. Applications of sequence coevolution in membrane protein biochemistry. Biochim Biophys Acta Biomembr 2018; 1860:895-908. [PMID: 28993150 PMCID: PMC5807202 DOI: 10.1016/j.bbamem.2017.10.004] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 07/19/2017] [Revised: 09/28/2017] [Accepted: 10/02/2017] [Indexed: 12/22/2022]
Abstract
Recently, protein sequence coevolution analysis has matured into a predictive powerhouse for protein structure and function. Direct methods, which use global statistical models of sequence coevolution, have enabled the prediction of membrane and disordered protein structures, protein complex architectures, and the functional effects of mutations in proteins. The field of membrane protein biochemistry and structural biology has embraced these computational techniques, which provide functional and structural information in an otherwise experimentally-challenging field. Here we review recent applications of protein sequence coevolution analysis to membrane protein structure and function and highlight the promising directions and future obstacles in these fields. We provide insights and guidelines for membrane protein biochemists who wish to apply sequence coevolution analysis to a given experimental system.
Affiliation(s)
- John M Nicoludis
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA 02138, United States
| | - Rachelle Gaudet
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, 02138, United States.
| |
|
34
|
Abstract
Bacteria use two-component systems (TCSs) to sense and respond to environmental changes. The core genome of the major human pathogen Staphylococcus aureus encodes 16 TCSs, one of which (WalRK) is essential. Here we show that S. aureus can be deprived of its complete sensorial TCS network and still survive under growth arrest conditions similarly to wild-type bacteria. Under replicating conditions, however, the WalRK system is necessary and sufficient to maintain bacterial growth, indicating that sensing through TCSs is mostly dispensable for living under constant environmental conditions. Characterization of S. aureus derivatives containing individual TCSs reveals that each TCS appears to be autonomous and self-sufficient to sense and respond to specific environmental cues, although some level of cross-regulation between non-cognate sensor-response regulator pairs occurs in vivo. This organization, if confirmed in other bacterial species, may provide a general evolutionary mechanism for flexible bacterial adaptation to life in new niches.
|
35
|
Biomolecular coevolution and its applications: Going from structure prediction toward signaling, epistasis, and function. Biochem Soc Trans 2017; 45:1253-1261. [DOI: 10.1042/bst20170063] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 06/23/2017] [Revised: 08/30/2017] [Accepted: 09/04/2017] [Indexed: 01/01/2023]
Abstract
Evolution leads to considerable changes in the sequence of biomolecules, while their overall structure and function remain quite conserved. The wealth of genomic sequences that modern sequencing techniques provide, the ‘Biological Big Data’, allows us to investigate biomolecular evolution in unprecedented detail. Sophisticated statistical models can infer residue pair mutations resulting from spatial proximity. The introduction of predicted spatial adjacencies as constraints in biomolecular structure prediction workflows has transformed the field of protein and RNA structure prediction toward accuracies approaching the experimental resolution limit. Going beyond structure prediction, the same mathematical framework allows mimicking evolutionary fitness landscapes to infer signaling interactions, epistasis, or mutational landscapes.
|
36
|
Fantini M, Malinverni D, De Los Rios P, Pastore A. New Techniques for Ancient Proteins: Direct Coupling Analysis Applied on Proteins Involved in Iron Sulfur Cluster Biogenesis. Front Mol Biosci 2017; 4:40. [PMID: 28664160 PMCID: PMC5471300 DOI: 10.3389/fmolb.2017.00040] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 03/19/2017] [Accepted: 05/24/2017] [Indexed: 12/01/2022] Open
Abstract
Direct coupling analysis (DCA) is a powerful statistical inference tool used to study protein evolution. It was introduced to predict protein folds and protein-protein interactions, and has also been applied to the prediction of entire interactomes. Here, we have used it to analyze three proteins of the iron-sulfur biogenesis machine, an essential metabolic pathway conserved in all organisms. We show that DCA can correctly reproduce structural features of the CyaY/frataxin family (a protein involved in the human disease Friedreich's ataxia) despite being based on the relatively small number of sequences allowed by its genomic distribution. This result gives us confidence in the method. Its application to the iron-sulfur cluster scaffold protein IscU, which has been suggested to function both as an ordered and a disordered form, allows us to distinguish evolutionary traces of the structured species, suggesting that, if present in the cell, the disordered form has not left evolutionary imprinting. We observe instead, for the first time, direct indications of how the protein can dimerize head-to-head and bind 4Fe4S clusters. Analysis of the alternative scaffold protein IscA provides strong support to a coordination of the cluster by a dimeric form rather than a tetramer, as previously suggested. Our analysis also suggests the presence in solution of a mixture of monomeric and dimeric species, and guides us to the prevalent one. Finally, we used DCA to analyze interactions between some of these proteins, and discuss the potentials and limitations of the method.
Affiliation(s)
- Marco Fantini
- BioSNS, Faculty of Mathematical and Natural Sciences, Scuola Normale Superiore, Pisa, Italy
| | - Duccio Malinverni
- Institute of Physics, School of Basic Sciences, and Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
| | - Paolo De Los Rios
- Institute of Physics, School of Basic Sciences, and Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
| | - Annalisa Pastore
- Maurice Wohl Institute, King's College London, United Kingdom
- Molecular Medicine Department, University of Pavia, Pavia, Italy
| |
|
37
|
Park DM, Overton KW, Liou MJ, Jiao Y. Identification of a U/Zn/Cu responsive global regulatory two-component system in Caulobacter crescentus. Mol Microbiol 2017; 104:46-64. [PMID: 28035693 DOI: 10.1111/mmi.13615] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Accepted: 12/23/2016] [Indexed: 01/18/2023]
Abstract
Despite the well-known toxicity of uranium (U) to bacteria, little is known about how cells sense and respond to U. The recent finding of a U-specific stress response in Caulobacter crescentus has provided a foundation for studying the mechanisms of U perception in bacteria. To gain insight into this process, we used a forward genetic screen to identify the regulatory components governing expression of the urcA promoter (PurcA) that is strongly induced by U. This approach unearthed a previously uncharacterized two-component system, named UzcRS, which is responsible for U-dependent activation of PurcA. UzcRS is also highly responsive to zinc and copper, revealing a broader specificity than previously thought. Using ChIP-seq, we found that UzcR binds extensively throughout the genome in a metal-dependent manner and recognizes a noncanonical DNA-binding site. Coupling the genome-wide occupancy data with RNA-seq analysis revealed that UzcR is a global regulator of transcription, predominantly activating genes encoding proteins that are localized to the cell envelope; these include metallopeptidases, multidrug-resistant efflux (MDR) pumps, TonB-dependent receptors and many proteins of unknown function. Collectively, our data suggest that UzcRS couples the perception of U, Zn and Cu with a novel extracytoplasmic stress response.
Affiliation(s)
- Dan M Park
- Biosciences and Biotechnology Division, Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, CA, USA
| | - K Wesley Overton
- Biosciences and Biotechnology Division, Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, CA, USA
| | - Megan J Liou
- Biosciences and Biotechnology Division, Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, CA, USA
| | - Yongqin Jiao
- Biosciences and Biotechnology Division, Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, CA, USA
| |
|
38
|
Coucke A, Uguzzoni G, Oteri F, Cocco S, Monasson R, Weigt M. Direct coevolutionary couplings reflect biophysical residue interactions in proteins. J Chem Phys 2016; 145:174102. [DOI: 10.1063/1.4966156] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Indexed: 01/30/2023] Open
Affiliation(s)
- Alice Coucke
- Laboratoire de Physique Théorique, Ecole Normale Supérieure and CNRS-UMR8549, PSL Research University, Sorbonne Universités UPMC, 24 Rue Lhomond, 75005 Paris, France
- Sorbonne Universités, UPMC, Institut de Biologie Paris-Seine, CNRS, Laboratoire de Biologie Computationnelle et Quantitative UMR 7238, 75005 Paris, France
| | - Guido Uguzzoni
- Sorbonne Universités, UPMC, Institut de Biologie Paris-Seine, CNRS, Laboratoire de Biologie Computationnelle et Quantitative UMR 7238, 75005 Paris, France
| | - Francesco Oteri
- Sorbonne Universités, UPMC, Institut de Biologie Paris-Seine, CNRS, Laboratoire de Biologie Computationnelle et Quantitative UMR 7238, 75005 Paris, France
| | - Simona Cocco
- Laboratoire de Physique Statistique, Ecole Normale Supérieure and CNRS-UMR8550, PSL Research University, Sorbonne Universités UPMC, 24 Rue Lhomond, 75005 Paris, France
| | - Remi Monasson
- Laboratoire de Physique Théorique, Ecole Normale Supérieure and CNRS-UMR8549, PSL Research University, Sorbonne Universités UPMC, 24 Rue Lhomond, 75005 Paris, France
| | - Martin Weigt
- Sorbonne Universités, UPMC, Institut de Biologie Paris-Seine, CNRS, Laboratoire de Biologie Computationnelle et Quantitative UMR 7238, 75005 Paris, France
| |
|
39
|
Wang ZB, Li YQ, Lin JQ, Pang X, Liu XM, Liu BQ, Wang R, Zhang CJ, Wu Y, Lin JQ, Chen LX. The Two-Component System RsrS-RsrR Regulates the Tetrathionate Intermediate Pathway for Thiosulfate Oxidation in Acidithiobacillus caldus. Front Microbiol 2016; 7:1755. [PMID: 27857710 PMCID: PMC5093147 DOI: 10.3389/fmicb.2016.01755] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 06/17/2016] [Accepted: 10/19/2016] [Indexed: 01/10/2023] Open
Abstract
Acidithiobacillus caldus (A. caldus) is a common bioleaching bacterium that possesses a sophisticated and highly efficient inorganic sulfur compound metabolism network. Thiosulfate, a central intermediate in the sulfur metabolism network of A. caldus and other sulfur-oxidizing microorganisms, can be metabolized via the tetrathionate intermediate (S4I) pathway catalyzed by thiosulfate:quinol oxidoreductase (Tqo or DoxDA) and tetrathionate hydrolase (TetH). In A. caldus, there is an additional two-component system called RsrS-RsrR. Since rsrS and rsrR are arranged as an operon with doxDA and tetH in the genome, we suggest that the regulation of the S4I pathway may occur via the RsrS-RsrR system. To examine the regulatory role of the two-component system RsrS-RsrR on the S4I pathway, ΔrsrR and ΔrsrS strains were constructed in A. caldus using a newly developed markerless gene knockout method. Transcriptional analysis of the tetH cluster in the wild type and mutant strains revealed positive regulation of the S4I pathway by the RsrS-RsrR system. A 19 bp inverted repeat sequence (IRS, AACACCTGTTACACCTGTT) located upstream of the tetH promoter was identified as the binding site for RsrR by using electrophoretic mobility shift assays (EMSAs) in vitro and promoter-probe vectors in vivo. In addition, ΔrsrR and ΔrsrS strains cultivated in K2S4O6-medium exhibited significant growth differences when compared with the wild type. Transcriptional analysis indicated that the absence of rsrS or rsrR had different effects on the expression of genes involved in sulfur metabolism and signaling systems. Finally, a model of tetrathionate sensing by RsrS, signal transduction via RsrR, and transcriptional activation of tetH-doxDA was proposed to provide insights toward the understanding of sulfur metabolism in A. caldus. This study also provided a powerful genetic tool for studies in A. caldus.
Affiliation(s)
- Zhao-Bao Wang
- State Key Laboratory of Microbial Technology, Shandong University, Jinan, China
| | - Ya-Qing Li
- State Key Laboratory of Microbial Technology, Shandong University, Jinan, China
| | - Jian-Qun Lin
- State Key Laboratory of Microbial Technology, Shandong University, Jinan, China
| | - Xin Pang
- State Key Laboratory of Microbial Technology, Shandong University, Jinan, China
| | - Xiang-Mei Liu
- State Key Laboratory of Microbial Technology, Shandong University, Jinan, China
| | | | - Rui Wang
- State Key Laboratory of Microbial Technology, Shandong University, Jinan, China
| | - Cheng-Jia Zhang
- State Key Laboratory of Microbial Technology, Shandong University, Jinan, China
| | - Yan Wu
- State Key Laboratory of Microbial Technology, Shandong University, Jinan, China
| | - Jian-Qiang Lin
- State Key Laboratory of Microbial Technology, Shandong University, Jinan, China
| | - Lin-Xu Chen
- State Key Laboratory of Microbial Technology, Shandong University, Jinan, China
| |
|
40
|
Simultaneous identification of specifically interacting paralogs and interprotein contacts by direct coupling analysis. Proc Natl Acad Sci U S A 2016; 113:12186-12191. [PMID: 27729520 DOI: 10.1073/pnas.1607570113] [Citation(s) in RCA: 69] [Impact Index Per Article: 8.6] [Indexed: 11/18/2022] Open
Abstract
Understanding protein-protein interactions is central to our understanding of almost all complex biological processes. Computational tools exploiting rapidly growing genomic databases to characterize protein-protein interactions are urgently needed. Such methods should connect multiple scales from evolutionarily conserved interactions between families of homologous proteins, over the identification of specifically interacting proteins in the case of multiple paralogs inside a species, down to the prediction of residues being in physical contact across interaction interfaces. Statistical inference methods detecting residue-residue coevolution have recently triggered considerable progress in using sequence data for quaternary protein structure prediction; they require, however, large joint alignments of homologous protein pairs known to interact. The generation of such alignments is a complex computational task on its own; application of coevolutionary modeling has, in turn, been restricted to proteins without paralogs, or to bacterial systems with the corresponding coding genes being colocalized in operons. Here we show that the direct coupling analysis of residue coevolution can be extended to connect the different scales, and simultaneously to match interacting paralogs, to identify interprotein residue-residue contacts and to discriminate interacting from noninteracting families in a multiprotein system. Our results extend the potential applications of coevolutionary analysis far beyond cases treatable so far.
|
41
|
Abstract
Specific protein-protein interactions are crucial in the cell, both to ensure the formation and stability of multiprotein complexes and to enable signal transduction in various pathways. Functional interactions between proteins result in coevolution between the interaction partners, causing their sequences to be correlated. Here we exploit these correlations to accurately identify, from sequence data alone, which proteins are specific interaction partners. Our general approach, which employs a pairwise maximum entropy model to infer couplings between residues, has been successfully used to predict the 3D structures of proteins from sequences. Thus inspired, we introduce an iterative algorithm to predict specific interaction partners from two protein families whose members are known to interact. We first assess the algorithm's performance on histidine kinases and response regulators from bacterial two-component signaling systems. We obtain a striking 0.93 true positive fraction on our complete dataset without any a priori knowledge of interaction partners, and we uncover the origin of this success. We then apply the algorithm to proteins from ATP-binding cassette (ABC) transporter complexes, and obtain accurate predictions in these systems as well. Finally, we present two metrics that accurately distinguish interacting protein families from noninteracting ones, using only sequence data.
|
42
|
Cheng RR, Nordesjö O, Hayes RL, Levine H, Flores SC, Onuchic JN, Morcos F. Connecting the Sequence-Space of Bacterial Signaling Proteins to Phenotypes Using Coevolutionary Landscapes. Mol Biol Evol 2016; 33:3054-3064. [PMID: 27604223 PMCID: PMC5100047 DOI: 10.1093/molbev/msw188] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.0] [Indexed: 01/08/2023] Open
Abstract
Two-component signaling (TCS) is the primary means by which bacteria sense and respond to the environment. TCS involves two partner proteins working in tandem, which interact to perform cellular functions while limiting interactions with non-partners (i.e., cross-talk). We construct a Potts model for TCS that can quantitatively predict how mutating amino acid identities affect the interaction between TCS partners and non-partners. The parameters of this model are inferred directly from protein sequence data. This approach drastically reduces the computational complexity of exploring the sequence-space of TCS proteins. As a stringent test, we compare its predictions to a recent comprehensive mutational study, which characterized the functionality of 204 mutational variants of the PhoQ kinase in Escherichia coli. We find that our best predictions accurately reproduce the amino acid combinations found in experiment, which enable functional signaling with its partner PhoP. These predictions demonstrate the evolutionary pressure to preserve the interaction between TCS partners as well as prevent unwanted cross-talk. Further, we calculate the mutational change in the binding affinity between PhoQ and PhoP, providing an estimate of the amount of destabilization needed to disrupt TCS.
Affiliation(s)
- R R Cheng
- Center for Theoretical Biological Physics, Rice University, Houston, TX
| | - O Nordesjö
- Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
| | - R L Hayes
- Department of Biophysics, University of Michigan, Ann Arbor, MI
| | - H Levine
- Center for Theoretical Biological Physics, Rice University, Houston, TX; Department of Bioengineering, Rice University, Houston, TX
| | - S C Flores
- Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
| | - J N Onuchic
- Center for Theoretical Biological Physics, Rice University, Houston, TX; Department of Physics and Astronomy, Rice University, Houston, TX; Department of Chemistry, and Biosciences, Rice University, Houston, TX
| | - F Morcos
- Department of Biological Sciences and Center for Systems Biology, University of Texas at Dallas, Dallas, TX
| |
|
43
|
A Combined Computational and Genetic Approach Uncovers Network Interactions of the Cyanobacterial Circadian Clock. J Bacteriol 2016; 198:2439-47. [PMID: 27381914 DOI: 10.1128/jb.00235-16] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 03/18/2016] [Accepted: 06/27/2016] [Indexed: 01/19/2023] Open
Abstract
Two-component systems (TCS) that employ histidine kinases (HK) and response regulators (RR) are critical mediators of cellular signaling in bacteria. In the model cyanobacterium Synechococcus elongatus PCC 7942, TCSs control global rhythms of transcription that reflect an integration of time information from the circadian clock with a variety of cellular and environmental inputs. The HK CikA and the SasA/RpaA TCS transduce time information from the circadian oscillator to modulate downstream cellular processes. Despite immense progress in understanding of the circadian clock itself, many of the connections between the clock and other cellular signaling systems have remained enigmatic. To narrow the search for additional TCS components that connect to the clock, we utilized direct-coupling analysis (DCA), a statistical analysis of covariant residues among related amino acid sequences, to infer coevolution of new and known clock TCS components. DCA revealed a high degree of interaction specificity between SasA and CikA with RpaA, as expected, but also with the phosphate-responsive response regulator SphR. Coevolutionary analysis also predicted strong specificity between RpaA and a previously undescribed kinase, HK0480 (herein CikB). A knockout of the gene for CikB (cikB) in a sasA cikA null background eliminated the RpaA phosphorylation and RpaA-controlled transcription that is otherwise present in that background and suppressed cell elongation, supporting the notion that CikB is an interactor with RpaA and the clock network. This study demonstrates the power of DCA to identify subnetworks and key interactions in signaling pathways and of combinatorial mutagenesis to explore the phenotypic consequences. Such a combined strategy is broadly applicable to other prokaryotic systems.
IMPORTANCE: Signaling networks are complex and extensive, comprising multiple integrated pathways that respond to cellular and environmental cues. A TCS interaction model, based on DCA, independently confirmed known interactions and revealed a core set of subnetworks within the larger HK-RR set. We validated high-scoring candidate proteins via combinatorial genetics, demonstrating that DCA can be utilized to reduce the search space of complex protein networks and to infer undiscovered specific interactions for signaling proteins in vivo. Significantly, new interactions that link circadian response to cell division and fitness in a light/dark cycle were uncovered. The combined analysis also uncovered a more basic core clock, illustrating the synergy and applicability of a combined computational and genetic approach for investigating prokaryotic signaling networks.
|
44
|
Zschiedrich CP, Keidel V, Szurmant H. Molecular Mechanisms of Two-Component Signal Transduction. J Mol Biol 2016; 428:3752-75. [PMID: 27519796 DOI: 10.1016/j.jmb.2016.08.003] [Citation(s) in RCA: 356] [Impact Index Per Article: 44.5] [Received: 05/14/2016] [Revised: 07/30/2016] [Accepted: 08/01/2016] [Indexed: 02/03/2023]
Abstract
Two-component systems (TCS) comprising sensor histidine kinases and response regulator proteins are among the most important players in bacterial and archaeal signal transduction and also occur in reduced numbers in some eukaryotic organisms. Given their importance to cellular survival, virulence, and cellular development, these systems are among the most scrutinized bacterial proteins. In recent years, a flurry of bioinformatics, genetic, biochemical, and structural studies has provided detailed insights into many molecular mechanisms that underlie the detection of signals and the generation of the appropriate response by TCS. Importantly, it has become clear that there is significant diversity in the mechanisms employed by individual systems. This review discusses the current knowledge on common themes and divergences from the paradigm of TCS signaling. The emphasis is on the information gained from recent structural and bioinformatics studies.
Affiliation(s)
- Christopher P Zschiedrich
- Department of Basic Medical Sciences, College of Osteopathic Medicine of the Pacific, Western University of Health Sciences, 309 E Second Street, Pomona, CA 91766, USA; Department of Molecular and Experimental Medicine, The Scripps Research Institute, 10550 N Torrey Pines Road, La Jolla, CA 92037, USA
| | - Victoria Keidel
- Department of Basic Medical Sciences, College of Osteopathic Medicine of the Pacific, Western University of Health Sciences, 309 E Second Street, Pomona, CA 91766, USA; Department of Molecular and Experimental Medicine, The Scripps Research Institute, 10550 N Torrey Pines Road, La Jolla, CA 92037, USA
| | - Hendrik Szurmant
- Department of Basic Medical Sciences, College of Osteopathic Medicine of the Pacific, Western University of Health Sciences, 309 E Second Street, Pomona, CA 91766, USA; Department of Molecular and Experimental Medicine, The Scripps Research Institute, 10550 N Torrey Pines Road, La Jolla, CA 92037, USA.
| |
|
45
|
The two-component signal transduction system YvcPQ regulates the bacterial resistance to bacitracin in Bacillus thuringiensis. Arch Microbiol 2016; 198:773-84. [DOI: 10.1007/s00203-016-1239-z] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 08/18/2015] [Revised: 05/01/2016] [Accepted: 05/05/2016] [Indexed: 02/01/2023]
|
46
|
Wagner JR, Lee CT, Durrant JD, Malmstrom RD, Feher VA, Amaro RE. Emerging Computational Methods for the Rational Discovery of Allosteric Drugs. Chem Rev 2016; 116:6370-90. [PMID: 27074285 PMCID: PMC4901368 DOI: 10.1021/acs.chemrev.5b00631] [Citation(s) in RCA: 158] [Impact Index Per Article: 19.8] [Indexed: 12/17/2022]
Abstract
Allosteric drug development holds promise for delivering medicines that are more selective and less toxic than those that target orthosteric sites. To date, the discovery of allosteric binding sites and lead compounds has been mostly serendipitous, achieved through high-throughput screening. Over the past decade, structural data has become more readily available for larger protein systems and more membrane protein classes (e.g., GPCRs and ion channels), which are common allosteric drug targets. In parallel, improved simulation methods now provide better atomistic understanding of the protein dynamics and cooperative motions that are critical to allosteric mechanisms. As a result of these advances, the field of predictive allosteric drug development is now on the cusp of a new era of rational structure-based computational methods. Here, we review algorithms that predict allosteric sites based on sequence data and molecular dynamics simulations, describe tools that assess the druggability of these pockets, and discuss how Markov state models and topology analyses provide insight into the relationship between protein dynamics and allosteric drug binding. In each section, we first provide an overview of the various method classes before describing relevant algorithms and software packages.
Affiliation(s)
- Jeffrey R Wagner
- Department of Chemistry & Biochemistry and National Biomedical Computation Resource, University of California, San Diego, La Jolla, California 92093, United States
| | - Christopher T Lee
- Department of Chemistry & Biochemistry and National Biomedical Computation Resource, University of California, San Diego, La Jolla, California 92093, United States
| | - Jacob D Durrant
- Department of Chemistry & Biochemistry and National Biomedical Computation Resource, University of California, San Diego, La Jolla, California 92093, United States
| | - Robert D Malmstrom
- Department of Chemistry & Biochemistry and National Biomedical Computation Resource, University of California, San Diego, La Jolla, California 92093, United States
| | - Victoria A Feher
- Department of Chemistry & Biochemistry and National Biomedical Computation Resource, University of California, San Diego, La Jolla, California 92093, United States
| | - Rommie E Amaro
- Department of Chemistry & Biochemistry and National Biomedical Computation Resource, University of California, San Diego, La Jolla, California 92093, United States
| |
|
47
|
Feinauer C, Szurmant H, Weigt M, Pagnani A. Inter-Protein Sequence Co-Evolution Predicts Known Physical Interactions in Bacterial Ribosomes and the Trp Operon. PLoS One 2016; 11:e0149166. [PMID: 26882169 PMCID: PMC4755613 DOI: 10.1371/journal.pone.0149166] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Received: 03/11/2015] [Accepted: 01/28/2016] [Indexed: 11/29/2022] Open
Abstract
Interaction between proteins is a fundamental mechanism that underlies virtually all biological processes. Many important interactions are conserved across a large variety of species. The need to maintain interaction leads to a high degree of co-evolution between residues in the interface between partner proteins. The inference of protein-protein interaction networks from the rapidly growing sequence databases is one of the most formidable tasks in systems biology today. We propose here a novel approach based on the Direct-Coupling Analysis of the co-evolution between inter-protein residue pairs. We use ribosomal and trp operon proteins as test cases: for the small and large ribosomal subunits, our approach predicts protein-interaction partners at true-positive rates of 70% and 90%, respectively, within the first 10 predictions, with areas under the ROC curves of 0.69 and 0.81, respectively, for all predictions. In the trp operon, it assigns the two largest interaction scores to the only two interactions experimentally known. On the level of residue interactions, we show that for the small and the large ribosomal subunit our approach predicts interacting residues in the system with true-positive rates of 60% and 85%, respectively, in the first 20 predictions. We use artificial data to show that the performance of our approach depends crucially on the size of the joint multiple sequence alignments and analyze how many sequences would be necessary for a perfect prediction if the sequences were sampled from the same model that we use for prediction. Given the performance of our approach on the test data, we speculate that it can be used to detect new interactions, especially in the light of the rapid growth of available sequence data.
Affiliation(s)
- Christoph Feinauer
- Department of Applied Science and Technology, and Center for Computational Sciences, Politecnico di Torino, Torino, Italy
| | - Hendrik Szurmant
- Department of Molecular and Experimental Medicine, The Scripps Research Institute, La Jolla, CA, United States of America
| | - Martin Weigt
- Sorbonne Universités, UPMC, UMR 7238, Computational and Quantitative Biology, Paris, France
- CNRS, UMR 7238, Computational and Quantitative Biology, Paris, France
- * E-mail: (MW); (AP)
| | - Andrea Pagnani
- Department of Applied Science and Technology, and Center for Computational Sciences, Politecnico di Torino, Torino, Italy
- Human Genetics Foundation, Molecular Biotechnology Center (MBC), Torino, Italy
- * E-mail: (MW); (AP)
| |
|
48
|
Cheng RR, Raghunathan M, Noel JK, Onuchic JN. Constructing sequence-dependent protein models using coevolutionary information. Protein Sci 2016; 25:111-22. [PMID: 26223372 PMCID: PMC4815312 DOI: 10.1002/pro.2758] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 05/20/2015] [Accepted: 07/27/2015] [Indexed: 11/08/2022]
Abstract
Recent developments in global statistical methodologies have advanced the analysis of large collections of protein sequences for coevolutionary information. Coevolution between amino acids in a protein arises from compensatory mutations that are needed to maintain the stability or function of a protein over the course of evolution. This gives rise to quantifiable correlations between amino acid sites within the multiple sequence alignment of a protein family. Here, we use the maximum entropy-based approach called mean field Direct Coupling Analysis (mfDCA) to infer a Potts model Hamiltonian governing the correlated mutations in a protein family. We use the inferred pairwise statistical couplings to generate the sequence-dependent heterogeneous interaction energies of a structure-based model (SBM) where only native contacts are considered. Considering the ribosomal S6 protein and its circular permutants as well as the SH3 protein, we demonstrate that these models quantitatively agree with experimental data on folding mechanisms. This work serves as a new framework for generating coevolutionary data-enriched models that can potentially be used to engineer key functional motions and novel interactions in protein systems.
Affiliation(s)
- Ryan R Cheng
- Center for Theoretical Biological Physics, Rice University, Houston, Texas, 77005
| | - Mohit Raghunathan
- Center for Theoretical Biological Physics, Rice University, Houston, Texas, 77005
- Department of Physics & Astronomy, Rice University, Houston, Texas, 77005
| | - Jeffrey K Noel
- Center for Theoretical Biological Physics, Rice University, Houston, Texas, 77005
- Department of Physics & Astronomy, Rice University, Houston, Texas, 77005
| | - José N Onuchic
- Center for Theoretical Biological Physics, Rice University, Houston, Texas, 77005
- Department of Physics & Astronomy, Rice University, Houston, Texas, 77005
| |
|
49
|
Black WP, Wang L, Davis MY, Yang Z. The orphan response regulator EpsW is a substrate of the DifE kinase and it regulates exopolysaccharide in Myxococcus xanthus. Sci Rep 2015; 5:17831. [PMID: 26639551 PMCID: PMC4671073 DOI: 10.1038/srep17831] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 06/22/2015] [Accepted: 11/06/2015] [Indexed: 11/17/2022] Open
Abstract
Here we attempted to identify the downstream target of the DifE histidine kinase in the regulation of exopolysaccharide (EPS) production in the Gram-negative bacterium Myxococcus xanthus. This bacterium is an important model system for the study of the Type IV pilus (T4P) because it is motile by social (S) motility, which is powered by T4P retraction. EPS is critical for S motility because it is the preferred anchor for T4P retraction in this bacterium. Previous studies identified the Dif chemosensory pathway as crucial for the regulation of EPS production. However, the downstream target of the DifE kinase in this pathway was unknown. In this study, EpsW, an orphan and single-domain response regulator (RR), was first identified as a potential DifE target by bioinformatics. Subsequent experiments demonstrated that epsW is essential for EPS biosynthesis in vivo and that EpsW is directly phosphorylated by DifE in vitro. Targeted mutagenesis of epsW suggests that EpsW is unlikely to be the terminal RR of the Dif pathway. We propose instead that EpsW is an intermediary in a multistep phosphorelay that regulates EPS in M. xanthus.
Affiliation(s)
- Wesley P Black
- Department of Biological Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA
| | - Lingling Wang
- Department of Biological Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA; College of Life Sciences, South China Agricultural University, Guangzhou 510642, China
| | - Manli Y Davis
- Department of Biological Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA
| | - Zhaomin Yang
- Department of Biological Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA
| |
|
50
|
Emamjomeh A, Goliaei B, Torkamani A, Ebrahimpour R, Mohammadi N, Parsian A. Protein-protein interaction prediction by combined analysis of genomic and conservation information. Genes Genet Syst 2015; 89:259-72. [PMID: 25948120 DOI: 10.1266/ggs.89.259] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 11/23/2022] Open
Abstract
Protein-protein interactions (PPIs) are highly important because of their central role in cellular processes and biochemical pathways; therefore, knowledge of PPIs can be very useful in the prediction of protein functions. Experimental techniques of PPI detection have certain drawbacks; hence computational methods can be used to complement wet lab techniques. Such methods can be applied to PPI prediction as well as validation of experimental results. Computational algorithms can lead to many false PPI predictions, which in turn result in inadequate performance. We have developed a novel method based on combined analysis, entitled PPIccc. The three descriptors used by PPIccc were gene co-expression values, codon usage similarity, and conservation of surface residues between the protein products of a gene pair, which were combined to predict PPIs. Validation of results against the Human Protein Reference Database (HPRD) indicated improved performance for our proposed method. The results also revealed that conservation of surface residues between proteins, in combination with codon usage similarity of their related genes, increases the performance of PPI prediction. This means that codon usage similarity and conservation of surface residues between proteins (sequence-based features only) can predict PPIs as well as PPIccc.
|