1
|
Consistent Quantification of Complex Dynamics via a Novel Statistical Complexity Measure. Entropy 2022; 24:e24040505. [PMID: 35455168 PMCID: PMC9032123 DOI: 10.3390/e24040505] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2022] [Revised: 03/29/2022] [Accepted: 04/02/2022] [Indexed: 02/06/2023]
Abstract
Natural systems often show complex dynamics. Quantifying such dynamics is an important step in, e.g., characterizing and classifying different systems or investigating the effect of an external perturbation on the dynamics. Promising routes were followed in the past using concepts based on (Shannon’s) entropy. Here, we propose a new, conceptually sound measure that can be pragmatically computed, in contrast to purely theoretical concepts based on, e.g., Kolmogorov complexity. We illustrate its applicability using a toy example with a control parameter and then turn to the molecular evolution of the HIV1 protease, for which drug treatment can be regarded as an external perturbation that changes the complexity of its molecular evolutionary dynamics. In fact, our method identifies, by their noticeable signal, exactly those residues that are known to bind the drug molecules. We furthermore apply our method in a completely different domain, namely foreign exchange rates, and find convincing results as well.
|
2
|
Young BP, Loparo KA, Dick TE, Jacono FJ. Ventilatory pattern variability as a biometric for severity of acute lung injury in rats. Respir Physiol Neurobiol 2019; 265:161-171. [PMID: 30928542 PMCID: PMC9994622 DOI: 10.1016/j.resp.2019.03.009] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 04/20/2018] [Revised: 03/05/2019] [Accepted: 03/26/2019] [Indexed: 01/27/2023]
Abstract
We hypothesize that ventilatory pattern variability (VPV) varies with the magnitude of acute lung injury (ALI). In adult male rats, we instilled a low or high dose of bleomycin or saline (PBS) intratracheally. While representative samples of pulmonary tissue indicated graded lung injury, the coefficient of variation (CV) of TTOT did not differ among the 3 groups. Broncho-alveolar lavage fluid (BALF), respiratory rate (fR), and mutual information were greater in ALI than in sham rats but did not differ between bleomycin doses. However, the nonlinear complexity index (NLCI), which is the difference in sample entropy between original and surrogate data sets, was greater for the high than for the low dose but did not differ between low-dose and sham groups. Further, NLCI correlated with an injury index based on protein concentration of BALF and failure to gain weight. Finally, receiver operating characteristic (ROC) curves indicated that both mutual information and NLCI had greater sensitivity and specificity than fR and CVTTOT in identifying ALI. Thus, nonlinear analyses of VPV can distinguish ALI and outperform fR as a biometric.
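To make the NLCI definition above concrete, here is a minimal sketch of sample entropy and of a surrogate-based difference. It is not the authors' implementation: the function names, the defaults `m=2` and `r=0.2`, the use of simple random shuffles as surrogates, and the sign convention (surrogate minus original) are all assumptions made for illustration.

```python
import math
import random


def sample_entropy(x, m=2, r=0.2):
    """SampEn(m, r) = -ln(A / B), with tolerance r * std(x) (Chebyshev distance)."""
    n = len(x)
    mean = sum(x) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    tol = r * sd

    def count_matches(length):
        # The same n - m template start points are used for both lengths.
        templates = [x[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist <= tol:
                    count += 1
        return count

    b = count_matches(m)      # matching template pairs of length m
    a = count_matches(m + 1)  # matching template pairs of length m + 1
    if a == 0 or b == 0:
        return float("inf")
    return -math.log(a / b)


def nlci(x, n_surrogates=10, seed=0, m=2, r=0.2):
    """Assumed convention: mean SampEn of shuffled surrogates minus SampEn of x."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_surrogates):
        surrogate = list(x)
        rng.shuffle(surrogate)  # shuffling is a stand-in for the study's surrogates
        total += sample_entropy(surrogate, m, r)
    return total / n_surrogates - sample_entropy(x, m, r)
```

With this convention a regular signal such as a sampled sine wave yields a positive value, because shuffling destroys the temporal structure and raises the sample entropy.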
Affiliation(s)
- Benjamin P Young
- Division of Pulmonary, Critical Care, & Sleep Medicine, Department of Medicine, School of Medicine, Case Western Reserve University, Cleveland, OH 44106, USA
- Kenneth A Loparo
- Department of Electrical Engineering and Computer Science, School of Engineering, Case Western Reserve University, Cleveland, OH 44106, USA
- Thomas E Dick
- Division of Pulmonary, Critical Care, & Sleep Medicine, Department of Medicine, School of Medicine, Case Western Reserve University, Cleveland, OH 44106, USA; Department of Neurosciences, School of Medicine, Case Western Reserve University, Cleveland, OH 44106, USA
- Frank J Jacono
- Division of Pulmonary, Critical Care, & Sleep Medicine, Department of Medicine, School of Medicine, Case Western Reserve University, Cleveland, OH 44106, USA; Division of Pulmonary, Critical Care and Sleep Medicine, Department of Medicine, Louis Stokes VA Medical Center, Cleveland, OH 44106, USA
|
3
|
Groß C, Hamacher K, Schmitz K, Jager S. Cleavage Product Accumulation Decreases the Activity of Cutinase during PET Hydrolysis. J Chem Inf Model 2017; 57:243-255. [PMID: 28128951 DOI: 10.1021/acs.jcim.6b00556] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Indexed: 11/29/2022]
Abstract
The Fusarium solani cutinase (FsC) is a promising candidate for the enzymatic degradation of the synthetic polyester polyethylene terephthalate (PET) but still suffers from a lack of activity. Using atomic MD simulations with different concentrations of the cleavage product ethylene glycol (EG), we show influences of EG on the dynamics of FsC. We observed accumulation of EG in the active site region, reducing the local flexibility of FsC. Furthermore, we used a coarse-grained mechanical model to investigate whether substrate binding in the active site causes an induced fit. We observed this supposed induced fit or "breath-like" movement during substrate binding, indicating that the active site has to be flexible for substrate conversion. This guides rational design: mutants with an increased flexibility near the active site should be considered to compensate for the solvent-mediated reduction in activity.
Affiliation(s)
- Christine Groß
- Department of Biology, Computational Biology & Simulation Group, Technische Universität Darmstadt, Schnittspahnstraße 2, 64287 Darmstadt, Germany
- Kay Hamacher
- Department of Biology, Computational Biology & Simulation Group, Technische Universität Darmstadt, Schnittspahnstraße 2, 64287 Darmstadt, Germany
- Katja Schmitz
- Department of Chemistry, Biological Chemistry Group, Technische Universität Darmstadt, Alarich-Weiss-Straße 8, 64287 Darmstadt, Germany
- Sven Jager
- Department of Biology, Computational Biology & Simulation Group, Technische Universität Darmstadt, Schnittspahnstraße 2, 64287 Darmstadt, Germany
|
4
|
Abstract
The structural organization of a protein family is investigated by devising a method based on random matrix theory (RMT), which uses the physicochemical properties of the amino acids together with a multiple sequence alignment. A graphical method to represent protein sequences using physicochemical properties is devised that gives a fast, easy, and informative way of comparing the evolutionary distances between protein sequences. A correlation matrix associated with each property is calculated, where the noise reduction and information filtering are done using RMT involving an ensemble of Wishart matrices. The analysis of the eigenvalue statistics of the correlation matrix for the β-lactamase family shows the universal features observed in the Gaussian orthogonal ensemble (GOE). The property-based approach captures the short- as well as the long-range correlations (approximately following GOE) between the eigenvalues, whereas the previous approach (treating amino acids as characters) gives the usual short-range correlations, while the long-range correlations are the same as those of an uncorrelated series. The distribution of the eigenvector components for the eigenvalues outside the bulk (RMT bound) deviates significantly from RMT observations and contains important information about the system. The information content of each eigenvector of the correlation matrix is quantified by introducing an entropic estimate, which shows that for the β-lactamase family the smallest eigenvectors (low eigenmodes) are highly localized as well as informative. These small eigenvectors, when processed, give clusters involving positions that have well-defined biological and structural importance, matching experiments. The approach is crucial for the recognition of structural motifs, as shown in β-lactamase (and other families), and selectively identifies the important positions for targets to deactivate (activate) the enzymatic actions.
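The "RMT bound" mentioned above is commonly taken from the Marchenko-Pastur law for Wishart-type correlation matrices. The sketch below assumes that reading; the abstract itself does not state which bound was used, and the function names are hypothetical. For N variables observed over L samples (L ≥ N) with unit variance, the bulk of a purely random correlation spectrum lies between (1 − √(N/L))² and (1 + √(N/L))².

```python
import math


def marchenko_pastur_bounds(n_variables, n_samples, sigma2=1.0):
    """Bulk edges of the Marchenko-Pastur spectrum (assumes n_samples >= n_variables)."""
    ratio = n_variables / n_samples
    lower = sigma2 * (1.0 - math.sqrt(ratio)) ** 2
    upper = sigma2 * (1.0 + math.sqrt(ratio)) ** 2
    return lower, upper


def eigenvalues_outside_bulk(eigenvalues, n_variables, n_samples):
    """Keep eigenvalues that fall outside the random-matrix bulk."""
    lower, upper = marchenko_pastur_bounds(n_variables, n_samples)
    return [ev for ev in eigenvalues if ev < lower or ev > upper]
```

Eigenvalues returned by `eigenvalues_outside_bulk` would be the candidates whose eigenvectors carry non-random information in this kind of analysis.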
Affiliation(s)
- Pradeep Bhadola
- Department of Physics and Astrophysics, University of Delhi, Delhi 110007, India
- Nivedita Deo
- Department of Physics and Astrophysics, University of Delhi, Delhi 110007, India
|
5
|
|
6
|
Lichtenstein F, Antoneli F, Briones MRS. MIA: Mutual Information Analyzer, a graphic user interface program that calculates entropy, vertical and horizontal mutual information of molecular sequence sets. BMC Bioinformatics 2015; 16:409. [PMID: 26652707 PMCID: PMC4676106 DOI: 10.1186/s12859-015-0837-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 09/03/2015] [Accepted: 12/02/2015] [Indexed: 11/10/2022] [Open Access]
Abstract
BACKGROUND: Short and long range correlations in biological sequences are central in genomic studies of covariation. These correlations can be studied using mutual information because it measures the amount of information one random variable contains about the other. Here we present MIA (Mutual Information Analyzer), a user-friendly graphic interface pipeline that calculates spectra of vertical entropy (VH), vertical mutual information (VMI) and horizontal mutual information (HMI), since there is currently no user-friendly integrated platform that performs all these calculations in a single package. MIA also calculates the Jensen-Shannon Divergence (JSD) between pairs of spectra from different species, herein called informational distances. The resulting distance matrices can be presented as distance histograms and informational dendrograms, supporting the discrimination of closely related species. RESULTS: In order to test MIA we analyzed sequences from the Drosophila Adh locus, because the taxonomy and evolutionary patterns of different Drosophila species are well established and the gene Adh is extensively studied. The search retrieved 959 sequences of 291 species. From the total, 450 sequences of 17 species were selected. With this dataset MIA performed all tasks in less than three hours: gathering, storing and aligning fasta files; calculating VH, VMI and HMI spectra; and calculating JSD between pairs of spectra from different species. For each task MIA saved tables and graphics on the local disk, easily accessible for future analysis. CONCLUSIONS: Our tests revealed that the "informational model free" spectra may represent species signatures. Since JSD applied to Horizontal Mutual Information spectra resulted in statistically significant distances between species, we could calculate the respective hierarchical clusters, herein called Informational Dendrograms (ID). When compared to phylogenetic trees, all Informational Dendrograms presented similar taxonomy and species clusterization.
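The three quantities named in this abstract (entropy, mutual information, Jensen-Shannon divergence) can be sketched in a few lines. This is an illustration of the textbook definitions in bits, not MIA's code; the function names are hypothetical and the estimates are plain plug-in frequencies with no bias correction.

```python
import math
from collections import Counter


def entropy(symbols):
    """Shannon entropy (bits) of the empirical distribution of a sequence of symbols."""
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in Counter(symbols).values())


def mutual_information(col_i, col_j):
    """I(i; j) = H(i) + H(j) - H(i, j) for two equally long alignment columns."""
    return entropy(col_i) + entropy(col_j) - entropy(list(zip(col_i, col_j)))


def jensen_shannon(p, q):
    """Jensen-Shannon divergence (bits) between two distributions on the same support."""
    mix = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(u, v):
        return sum(a * math.log2(a / b) for a, b in zip(u, v) if a > 0)

    return 0.5 * kl(p, mix) + 0.5 * kl(q, mix)


# Toy alignment: columns 0 and 1 vary together, column 2 varies independently.
alignment = ["ACC", "ACT", "GTC", "GTT"]
columns = [list(col) for col in zip(*alignment)]
coupled = mutual_information(columns[0], columns[1])
independent = mutual_information(columns[0], columns[2])
```

In the toy alignment, the coupled pair carries one bit of mutual information and the independent pair carries none.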
Affiliation(s)
- Flavio Lichtenstein
- Departamento de Informática em Saúde, Escola Paulista de Medicina, Universidade Federal de São Paulo, Rua Botucatu, 862, Ed. José Leal Prado, andar térreo, Vila Clementino, CEP 04023-062, São Paulo, SP, Brazil; Laboratory of Evolutionary Genomics and Biocomplexity, Escola Paulista de Medicina, Universidade Federal de São Paulo, Rua Pedro de Toledo, 669, 4 andar L4E, CEP 04039-032, São Paulo, SP, Brazil
- Fernando Antoneli
- Departamento de Informática em Saúde, Escola Paulista de Medicina, Universidade Federal de São Paulo, Rua Botucatu, 862, Ed. José Leal Prado, andar térreo, Vila Clementino, CEP 04023-062, São Paulo, SP, Brazil; Laboratory of Evolutionary Genomics and Biocomplexity, Escola Paulista de Medicina, Universidade Federal de São Paulo, Rua Pedro de Toledo, 669, 4 andar L4E, CEP 04039-032, São Paulo, SP, Brazil
- Marcelo R S Briones
- Departamento de Microbiologia, Immunologia and Parasitologia, Escola Paulista de Medicina, Universidade Federal de São Paulo, Rua Botucatu, 862, Ed. Ciências Biomédicas, 3 andar, Vila Clementino, CEP 04023-062, São Paulo, SP, Brazil; Laboratory of Evolutionary Genomics and Biocomplexity, Escola Paulista de Medicina, Universidade Federal de São Paulo, Rua Pedro de Toledo, 669, 4 andar L4E, CEP 04039-032, São Paulo, SP, Brazil
|
7
|
Gardner SG, Miller JB, Dean T, Robinson T, Erickson M, Ridge PG, McCleary WR. Genetic analysis, structural modeling, and direct coupling analysis suggest a mechanism for phosphate signaling in Escherichia coli. BMC Genet 2015; 16 Suppl 2:S2. [PMID: 25953406 PMCID: PMC4423584 DOI: 10.1186/1471-2156-16-s2-s2] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Indexed: 01/21/2023] [Open Access]
Abstract
Background: Proper phosphate signaling is essential for robust growth of Escherichia coli and many other bacteria. The phosphate signal is mediated by a classic two component signal system composed of PhoR and PhoB. The PhoR histidine kinase is responsible for phosphorylating/dephosphorylating the response regulator, PhoB, which controls the expression of genes that aid growth in low phosphate conditions. The mechanism by which PhoR receives a signal of environmental phosphate levels has remained elusive. A transporter complex composed of the PstS, PstC, PstA, and PstB proteins as well as a negative regulator, PhoU, have been implicated in signaling environmental phosphate to PhoR. Results: This work confirms that PhoU and the PstSCAB complex are necessary for proper signaling of high environmental phosphate. Also, we identify residues important in PhoU/PhoR interaction with genetic analysis. Using protein modeling and docking methods, we show an interaction model that points to a potential mechanism for PhoU mediated signaling to PhoR to modify its activity. This model is tested with direct coupling analysis. Conclusions: These bioinformatics tools, in combination with genetic and biochemical analysis, help to identify and test a model for phosphate signaling and may be applicable to several other systems.
|
8
|
Thompson JJ, Tabatabaei Ghomi H, Lill MA. Application of information theory to a three-body coarse-grained representation of proteins in the PDB: insights into the structural and evolutionary roles of residues in protein structure. Proteins 2014; 82:3450-65. [PMID: 25269778 DOI: 10.1002/prot.24698] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 06/23/2014] [Revised: 09/09/2014] [Accepted: 09/19/2014] [Indexed: 01/03/2023]
Abstract
Knowledge-based methods for analyzing protein structures, such as statistical potentials, primarily consider the distances between pairs of bodies (atoms or groups of atoms). Considerations of several bodies simultaneously are generally used to characterize bonded structural elements or those in close contact with each other, but historically do not consider atoms that are not in direct contact with each other. In this report, we introduce an information-theoretic method for detecting and quantifying distance-dependent through-space multibody relationships between the sidechains of three residues. The technique introduced is capable of producing convergent and consistent results when applied to a sufficiently large database of randomly chosen, experimentally solved protein structures. The results of our study can be shown to reproduce established physico-chemical properties of residues as well as more recently discovered properties and interactions. These results offer insight into the numerous roles that residues play in protein structure, as well as relationships between residue function, protein structure, and evolution. The techniques and insights presented in this work should be useful in the future development of novel knowledge-based tools for the evaluation of protein structure.
Affiliation(s)
- Jared J Thompson
- Department of Medicinal Chemistry and Molecular Pharmacology, College of Pharmacy, Purdue University, West Lafayette, Indiana
|
9
|
Qi Y, Im W. Quantification of Drive-Response Relationships Between Residues During Protein Folding. J Chem Theory Comput 2013; 9. [PMID: 24223527 DOI: 10.1021/ct4002784] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Indexed: 12/27/2022]
Abstract
Mutual correlation and cooperativity are commonly used to describe residue-residue interactions in protein folding/function. However, these metrics do not provide any information on the causality relationships between residues. Such drive-response relationships are poorly studied in protein folding/function and difficult to measure experimentally due to technical limitations. In this study, using the information-theoretic transfer entropy (TE), which provides a direct measurement of causality between two time series, we have quantified the drive-response relationships between residues in the folding/unfolding processes of four small proteins generated by molecular dynamics simulations. Instead of using a time-averaged single TE value, the time-dependent TE is measured with the Q-scores based on residue-residue contacts and with the statistical significance analysis along the folding/unfolding processes. The TE analysis is able to identify the driving and responding residues that are different from the highly correlated residues revealed by the mutual information analysis. In general, the driving residues have more regular secondary structures, are more buried, and show greater effects on the protein stability as well as folding and unfolding rates. In addition, the dominant driving and responding residues from the TE analysis on the whole trajectory agree with those on a single folding event, demonstrating that the drive-response relationships are preserved in the non-equilibrium process. Our study provides detailed insights into the protein folding process and has potential applications in protein engineering and interpretation of time-dependent residue-based experimental observables for protein function.
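Transfer entropy is asymmetric, which is what lets it separate driving from responding variables where mutual information cannot. The following is a minimal sketch for two discrete time series with a history length of one step and plug-in probability estimates. It is a simplification for illustration, not the time-dependent, Q-score-based procedure described in the abstract, and the function name is hypothetical.

```python
import math
import random
from collections import Counter


def transfer_entropy(source, target):
    """TE(source -> target) in bits, history length 1, plug-in estimate.

    TE = sum p(x', x, y) * log2[ p(x' | x, y) / p(x' | x) ],
    where x' is the next target value, x the current target, y the current source.
    """
    triples = list(zip(target[1:], target[:-1], source[:-1]))
    n = len(triples)
    c_full = Counter(triples)                                   # (x', x, y)
    c_cond = Counter((x, y) for _, x, y in triples)             # (x, y)
    c_pair = Counter((x_next, x) for x_next, x, _ in triples)   # (x', x)
    c_self = Counter(x for _, x, _ in triples)                  # x
    te = 0.0
    for (x_next, x, y), c in c_full.items():
        p_given_both = c / c_cond[(x, y)]
        p_given_self = c_pair[(x_next, x)] / c_self[x]
        te += (c / n) * math.log2(p_given_both / p_given_self)
    return te


# Toy example: the target copies the source with a one-step delay.
rng = random.Random(1)
driver = [rng.randint(0, 1) for _ in range(2000)]
responder = [0] + driver[:-1]
forward = transfer_entropy(driver, responder)   # close to 1 bit
backward = transfer_entropy(responder, driver)  # close to 0 bits
```

In the toy example the driver fully determines the responder's next value, so the forward estimate is near one bit while the backward estimate stays near zero apart from a small finite-sample bias.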
Affiliation(s)
- Yifei Qi
- Department of Molecular Biosciences and Center for Bioinformatics, The University of Kansas, 2030 Becker Drive Lawrence, Kansas 66047, United States
|
10
|
Durani V, Magliery TJ. Protein engineering and stabilization from sequence statistics: variation and covariation analysis. Methods Enzymol 2013; 523:237-56. [PMID: 23422433 DOI: 10.1016/b978-0-12-394292-0.00011-4] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Indexed: 01/20/2023]
Abstract
The concepts of consensus and correlation in multiple sequence alignments (MSAs) have been used in the past to understand and engineer proteins. However, there are multiple ways of acquiring MSA databases and also numerous mathematical metrics that can be applied to calculate each of the parameters. This chapter describes an overall methodology that we have chosen to employ for acquiring and statistically analyzing MSAs. We have provided a step-by-step protocol for calculating relative entropy and mutual information metrics and describe how they can be used to predict mutations that have a high probability of stabilizing a protein. This protocol allows for flexibility for modification of formulae and parameters without using anything more complicated than Microsoft Excel. We have also demonstrated various aspects of data analysis by carrying out a sample analysis on the BPTI-Kunitz family of proteins and identified mutations that would be predicted to stabilize this protein based on consensus and correlation values.
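The chapter works in Microsoft Excel; the sketch below shows the same kind of per-column relative entropy calculation in Python for illustration only. The uniform background over the 20 standard amino acids, the natural logarithm, the absence of gap handling, and the function names are assumptions, not the chapter's protocol.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def relative_entropy(column, background=None):
    """D(p || q) = sum_a p_a * ln(p_a / q_a) for one alignment column (no gaps)."""
    if background is None:
        # Assumed background: uniform over the 20 standard amino acids.
        background = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}
    n = len(column)
    total = 0.0
    for aa, count in Counter(column).items():
        p = count / n
        total += p * math.log(p / background[aa])
    return total


def consensus_residue(column):
    """Most frequent residue in the column."""
    return Counter(column).most_common(1)[0][0]
```

A fully conserved column gives the maximum value of ln 20 under this background, and a column that matches the background exactly gives zero. Positions with high relative entropy where the protein of interest differs from `consensus_residue` are the kind of candidates a consensus analysis would flag.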
Affiliation(s)
- Venuka Durani
- Department of Chemistry, The Ohio State University, Columbus, Ohio, USA
|
11
|
Aguilar D, Oliva B, Marino Buslje C. Mapping the mutual information network of enzymatic families in the protein structure to unveil functional features. PLoS One 2012; 7:e41430. [PMID: 22848494 PMCID: PMC3405127 DOI: 10.1371/journal.pone.0041430] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Received: 03/05/2012] [Accepted: 06/26/2012] [Indexed: 11/24/2022] [Open Access]
Abstract
Amino acids committed to a particular function correlate tightly along evolution and tend to form clusters in the 3D structure of the protein. Consequently, a protein can be seen as a network of co-evolving clusters of residues. The goal of this work is two-fold: first, we have combined mutual information and structural data to describe the amino acid networks within a protein and their interactions. Second, we have investigated how this information can be used to improve methods of prediction of functional residues by reducing the search space. As a main result, we found that clusters of co-evolving residues related to the catalytic site of an enzyme have distinguishable topological properties in the network. We also observed that these clusters usually evolve independently, which could be related to a fail-safe mechanism. Finally, we discovered a significant enrichment of functional residues (e.g. metal binding, susceptibility to detrimental mutations) in the clusters, which could be the foundation of new prediction tools.
Affiliation(s)
- Daniel Aguilar
- Structural Bioinformatics Group, Departament de Ciencies Experimentals i de la Salut, Universitat Pompeu Fabra, Barcelona Biomedical Research Park, Barcelona, Spain.
|
12
|
Livesay DR, Kreth KE, Fodor AA. A critical evaluation of correlated mutation algorithms and coevolution within allosteric mechanisms. Methods Mol Biol 2012; 796:385-398. [PMID: 22052502 DOI: 10.1007/978-1-61779-334-9_21] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Indexed: 05/31/2023]
Abstract
The notion of using the evolutionary history encoded within multiple sequence alignments to predict allosteric mechanisms is appealing. In this approach, correlated mutations are expected to reflect coordinated changes that maintain intramolecular coupling between residue pairs. Despite much early fanfare, the general suitability of correlated mutations to predict allosteric couplings has not yet been established. Progress along these lines has been hindered by several algorithmic limitations, including phylogenetic artifacts within alignments masking true covariance and the computational intractability of considering more than two correlated residues at a time. Recent progress in algorithm development, however, has been substantial, with a new generation of correlated mutation algorithms that have made fundamental progress toward solving these difficult problems. Despite these encouraging results, there remains little evidence to suggest that the evolutionary constraints acting on allosteric couplings are sufficient to be recovered from multiple sequence alignments. In this review, we argue that due to the exquisite sensitivity of protein dynamics, and hence that of allosteric mechanisms, the latter vary widely within protein families. If it turns out to be generally true that even very similar homologs display a wide divergence of allosteric mechanisms, then even a perfect correlated mutation algorithm could not be reliably used as a general mechanism for discovery of allosteric pathways.
Affiliation(s)
- Dennis R Livesay
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC, USA
|
13
|
Weißgraeber S, Hoffgaard F, Hamacher K. Structure-based, biophysical annotation of molecular coevolution of acetylcholinesterase. Proteins 2011; 79:3144-54. [DOI: 10.1002/prot.23144] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 01/10/2011] [Revised: 06/20/2011] [Accepted: 07/20/2011] [Indexed: 01/09/2023]
|
14
|
Yang S, Yalamanchili HK, Li X, Yao KM, Sham PC, Zhang MQ, Wang J. Correlated evolution of transcription factors and their binding sites. Bioinformatics 2011; 27:2972-8. [PMID: 21896508 DOI: 10.1093/bioinformatics/btr503] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Indexed: 11/13/2022]
Abstract
MOTIVATION: The interaction between transcription factor (TF) and transcription factor binding site (TFBS) is essential for gene regulation. Mutation in either the TF or the TFBS may weaken their interaction and thus result in abnormalities. To maintain such vital interaction, a mutation in one of the interacting partners might be compensated by a corresponding mutation in its binding partner during the course of evolution. Confirming this co-evolutionary relationship will guide us in designing protein sequences to target a specific DNA sequence or in predicting TFBS for poorly studied proteins, or even correcting and rescuing disease mutations in clinical applications. RESULTS: Based on six publicly available, experimentally validated TF-TFBS binding datasets for the basic Helix-Loop-Helix (bHLH) family, Homeo family, High-Mobility Group (HMG) family and Transient Receptor Potential channels (TRP) family, we showed that the evolutions of the TFs and their TFBSs are significantly correlated across eukaryotes. We further developed a mutual information-based method to identify co-evolved protein residues and DNA bases. This research sheds light on the dynamic relationship between TF and TFBS during their evolution. The same principle and strategy can be applied to co-evolutionary studies on protein-DNA interactions in other protein families. AVAILABILITY: All the datasets, scripts and other related files have been made freely available at: http://jjwanglab.org/co-evo. CONTACT: junwen@uw.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Shu Yang
- Department of Biochemistry, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
|
15
|
Ackerman SH, Gatti DL. The contribution of coevolving residues to the stability of KDO8P synthase. PLoS One 2011; 6:e17459. [PMID: 21408011 PMCID: PMC3052366 DOI: 10.1371/journal.pone.0017459] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Received: 10/06/2010] [Accepted: 02/03/2011] [Indexed: 12/03/2022] [Open Access]
Abstract
Background: The evolutionary tree of 3-deoxy-D-manno-octulosonate 8-phosphate (KDO8P) synthase (KDO8PS), a bacterial enzyme that catalyzes a key step in the biosynthesis of bacterial endotoxin, is evenly divided between metal and non-metal forms, both having similar structures, but diverging in various degrees in amino acid sequence. Mutagenesis, crystallographic and computational studies have established that only a few residues determine whether or not KDO8PS requires a metal for function. The remaining divergence in the amino acid sequence of KDO8PSs is apparently unrelated to the underlying catalytic mechanism. Methodology/Principal Findings: The multiple alignment of all known KDO8PS sequences reveals that several residue pairs coevolved, an indication of their possible linkage to a structural constraint. In this study we investigated by computational means the contribution of coevolving residues to the stability of KDO8PS. We found that about 1/4 of all strongly coevolving pairs probably originated from cycles of mutation (decreasing stability) and suppression (restoring it), while the remaining pairs are best explained by a succession of neutral or nearly neutral covarions. Conclusions/Significance: Both sequence conservation and coevolution are involved in the preservation of the core structure of KDO8PS, but the contribution of coevolving residues is, in proportion, smaller. This is because small stability gains or losses associated with selection of certain residues in some regions of the stability landscape of KDO8PS are easily offset by a large number of possible changes in other regions. While this effect increases the tolerance of KDO8PS to deleterious mutations, it also decreases the probability that specific pairs of residues could have a strong contribution to the thermodynamic stability of the protein.
Affiliation(s)
- Sharon H. Ackerman
- Department of Biochemistry and Molecular Biology, Wayne State University School of Medicine, Detroit, Michigan, United States of America
- Domenico L. Gatti
- Department of Biochemistry and Molecular Biology, Wayne State University School of Medicine, Detroit, Michigan, United States of America
- Cardiovascular Research Institute, Wayne State University School of Medicine, Detroit, Michigan, United States of America
|
16
|
Computation of mutual information from Hidden Markov Models. Comput Biol Chem 2010; 34:328-33. [DOI: 10.1016/j.compbiolchem.2010.08.005] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 06/29/2010] [Revised: 08/30/2010] [Accepted: 08/30/2010] [Indexed: 11/22/2022]
|
17
|
Slama P, Geman D. Identification of family-determining residues in PHD fingers. Nucleic Acids Res 2010; 39:1666-79. [PMID: 21059680 PMCID: PMC3061080 DOI: 10.1093/nar/gkq947] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 12/13/2022] [Open Access]
Abstract
Histone modifications are fundamental to chromatin structure and transcriptional regulation, and are recognized by a limited number of protein folds. Among these folds are PHD fingers, which are present in most chromatin modification complexes. To date, about 15 PHD finger domains have been structurally characterized, whereas hundreds of different sequences have been identified. Consequently, an important open problem is to predict structural features of a PHD finger knowing only its sequence. Here, we classify PHD fingers into different groups based on the analysis of residue–residue co-evolution in their sequences. We measure the degree to which fixing the amino acid type at one position modifies the frequencies of amino acids at other positions. We then detect those position/amino acid combinations, or ‘conditions’, which have the strongest impact on other sequence positions. Clustering these strong conditions yields four families, providing informative labels for PHD finger sequences. Existing experimental results, as well as docking calculations performed here, reveal that these families indeed show discrepancies at the functional level. Our method should facilitate the functional characterization of new PHD fingers, as well as other protein families, solely based on sequence information.
Affiliation(s)
- Patrick Slama
- Institute for Computational Medicine and Department of Applied Mathematics and Statistics, Johns Hopkins University, Baltimore, MD, USA.
|
18
|
Bremm S, Schreck T, Boba P, Held S, Hamacher K. Computing and visually analyzing mutual information in molecular co-evolution. BMC Bioinformatics 2010; 11:330. [PMID: 20565748 PMCID: PMC2906490 DOI: 10.1186/1471-2105-11-330] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 02/02/2010] [Accepted: 06/17/2010] [Indexed: 11/17/2022] [Open Access]
Abstract
BACKGROUND: Selective pressure in molecular evolution leads to uneven distributions of amino acids and nucleotides. In fact, one observes correlations among such constituents due to a large number of biophysical mechanisms (folding properties, electrostatics, ...). To quantify these correlations, the mutual information, after proper normalization, has proven most effective. The challenge is to navigate the large amount of data, which in a study of a typical protein cannot simply be plotted. RESULTS: To visually analyze mutual information, we developed a matrix visualization tool that offers different views on the mutual information matrix, including filtering, sorting, and weighting. The user can interactively navigate a huge matrix in real time and search, e.g., for patterns and unusually high or low values. The mutual information matrix can be computed for a sequence alignment in FASTA format. In addition, the respective stand-alone program computes proper normalizations for a null model of neutral evolution and maps the mutual information to Z-scores with respect to the null model. CONCLUSIONS: The new tool allows users to compute and visually analyze sequence data for possible co-evolutionary signals. The tool has already been successfully employed in evolutionary studies on HIV1 protease and acetylcholinesterase. The functionality of the tool was defined by users working with it in real-world research. The software can also be used for visual analysis of other matrix-like data, such as information obtained by DNA microarray experiments. The package is implemented platform-independently in Java and is free for academic use under a GPL license.
Affiliation(s)
- Sebastian Bremm
- Interactive Graphics Systems, Dept. of Computer Science, Technische Universität Darmstadt, Germany
- Tobias Schreck
- Interactive Graphics Systems, Dept. of Computer Science, Technische Universität Darmstadt, Germany
- Patrick Boba
- Bioinformatics & Theo. Biology, Dept. of Biology, Technische Universität Darmstadt, Germany
- Stephanie Held
- Bioinformatics & Theo. Biology, Dept. of Biology, Technische Universität Darmstadt, Germany
- Kay Hamacher
- Bioinformatics & Theo. Biology, Dept. of Biology, Technische Universität Darmstadt, Germany
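The Bremm et al. abstract above mentions computing a mutual information matrix for an alignment and mapping the values to Z-scores with respect to a null model. The following is a rough sketch of that pipeline for a single pair of columns, not the tool's actual Java implementation. The null model here is an assumption: it simply shuffles one column to destroy any coupling, which may differ from the neutral-evolution null model the authors use. All function names and the toy alignment are hypothetical.

```python
import math
import random
from collections import Counter


def column(alignment, i):
    """Extract column i of an alignment given as equal-length strings."""
    return [seq[i] for seq in alignment]


def mutual_information(xs, ys):
    """Mutual information in bits between two equally long symbol lists."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (a, b), c in pxy.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[a] * py[b]))
    return mi


def mi_z_score(xs, ys, n_shuffles=200, seed=0):
    """Observed MI and its Z-score against a shuffle-based null (an assumption)."""
    rng = random.Random(seed)
    observed = mutual_information(xs, ys)
    shuffled = list(ys)
    null = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # breaks the pairing, keeps column composition
        null.append(mutual_information(xs, shuffled))
    mean = sum(null) / len(null)
    var = sum((v - mean) ** 2 for v in null) / (len(null) - 1)
    std = math.sqrt(var)
    if std == 0.0:
        return observed, 0.0  # degenerate null, e.g. a fully conserved column
    return observed, (observed - mean) / std


# Hypothetical toy alignment: columns 0 and 1 are perfectly coupled,
# column 2 is fully conserved, column 3 is independent of column 0.
toy_alignment = ["ACDA", "ACDC", "GTDA", "GTDC", "ACDA", "GTDC", "ACDC", "GTDA"]

if __name__ == "__main__":
    col0 = column(toy_alignment, 0)
    for j in range(1, 4):
        mi, z = mi_z_score(col0, column(toy_alignment, j))
        print(f"columns 0 and {j}: MI = {mi:.3f} bits, Z = {z:.2f}")
```

Repeating this for every column pair would give the full matrix that a visualization tool could then filter or sort. For realistic alignments, a vectorized implementation would be preferable to this pure-Python loop.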
|
19
|
Hoffgaard F, Weil P, Hamacher K. BioPhysConnectoR: Connecting sequence information and biophysical models. BMC Bioinformatics 2010; 11:199. [PMID: 20412558 PMCID: PMC2868838 DOI: 10.1186/1471-2105-11-199] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.2] [Received: 12/02/2009] [Accepted: 04/22/2010] [Indexed: 11/10/2022] Open
Abstract
Background: One of the most challenging aspects of biomolecular systems is understanding the coevolution in and among the molecule(s). A complete theoretical picture of the selective advantage, and thus a functional annotation, of (co-)mutations is still lacking. Using sequence-based and information-theoretically inspired methods, we can identify coevolving residues in proteins without understanding the underlying biophysical properties giving rise to such coevolutionary dynamics. Detailed (atomistic) simulations are prohibitively expensive. At the same time, reduced molecular models are an efficient way to determine the reduced dynamics around the native state. The combination of sequence-based approaches with such reduced models is therefore a promising way to annotate evolutionary sequence changes. Results: With the R package BioPhysConnectoR, we provide a framework to connect the information-theoretical domain of biomolecular sequences to biophysical properties of the encoded molecules, derived from reduced molecular models. To this end, we have integrated several fragmented ideas into one single package ready to be used in connection with additional statistical routines in R. Additionally, the package leverages the power of modern multi-core architectures to reduce turn-around times in evolutionary and biomolecular design studies. Our package is a first step towards the above-mentioned annotation of coevolution by reduced dynamics around the native state of proteins. Conclusions: BioPhysConnectoR is implemented as an R package and distributed under the GPL 2 license. It allows for efficient and perfectly parallelized functional annotation of coevolution found at the sequence level.
Affiliation(s)
- Franziska Hoffgaard
- Theoretical Biology and Bioinformatics, Institute of Microbiology and Genetics, Department of Biology, TU Darmstadt, Schnittspahnstrasse 10, 64289 Darmstadt, Germany.
|