1
Recent Advances in the Prediction of Protein Structural Classes: Feature Descriptors and Machine Learning Algorithms. CRYSTALS 2021. [DOI: 10.3390/cryst11040324] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 01/12/2023]
Abstract
In the postgenomic age, rapid growth in the number of sequence-known proteins has been accompanied by much slower growth in the number of structure-known proteins (as a result of experimental limitations), and a widening gap between the two is evident. Because protein function is linked to protein structure, successful prediction of protein structure is of significant importance in protein function identification. Foreknowledge of protein structural class can help improve protein structure prediction with significant medical and pharmaceutical implications. Thus, a fast, suitable, reliable, and reasonable computational method for protein structural class prediction has become pivotal in bioinformatics. Here, we review recent efforts in protein structural class prediction from protein sequence, with particular attention paid to new feature descriptors, which extract information from protein sequence, and the use of machine learning algorithms in both feature selection and the construction of new classification models. These new feature descriptors include amino acid composition, sequence order, physicochemical properties, multiprofile Bayes, and secondary structure-based features. Machine learning methods, such as artificial neural networks (ANNs), support vector machine (SVM), K-nearest neighbor (KNN), random forest, deep learning, and examples of their application are discussed in detail. We also present our view on possible future directions, challenges, and opportunities for the applications of machine learning algorithms for prediction of protein structural classes.
2
Mehrotra R, Loake G, Mehrotra S. Promoter choice: Selection vs. rejection. GENE REPORTS 2018. [DOI: 10.1016/j.genrep.2018.08.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
3
Poos AM, Maicher A, Dieckmann AK, Oswald M, Eils R, Kupiec M, Luke B, König R. Mixed Integer Linear Programming based machine learning approach identifies regulators of telomerase in yeast. Nucleic Acids Res 2016; 44:e93. [PMID: 26908654 PMCID: PMC4889924 DOI: 10.1093/nar/gkw111] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 02/19/2015] [Accepted: 01/25/2016] [Indexed: 11/24/2022]
Abstract
Understanding telomere length maintenance mechanisms is central in cancer biology as their dysregulation is one of the hallmarks for immortalization of cancer cells. Important for this well-balanced control is the transcriptional regulation of the telomerase genes. We integrated Mixed Integer Linear Programming models into a comparative machine learning based approach to identify regulatory interactions that best explain the discrepancy of telomerase transcript levels in yeast mutants with deleted regulators showing aberrant telomere length, when compared to mutants with normal telomere length. We uncover novel regulators of telomerase expression, several of which affect histone levels or modifications. In particular, our results point to the transcription factors Sum1, Hst1 and Srb2 as being important for the regulation of EST1 transcription, and we validated the effect of Sum1 experimentally. We compiled our machine learning method leading to a user friendly package for R which can straightforwardly be applied to similar problems integrating gene regulator binding information and expression profiles of samples of e.g. different phenotypes, diseases or treatments.
Affiliation(s)
- Alexandra M Poos
  - Integrated Research and Treatment Center, Center for Sepsis Control and Care (CSCC), Jena University Hospital, D-07747 Jena, Erlanger Allee 101, Germany; Network Modeling, Leibniz Institute for Natural Product Research and Infection Biology-Hans Knöll Institute (HKI) Jena, Beutenbergstrasse 11a, 07745 Jena, Germany; Theoretical Bioinformatics, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 580, 69120 Heidelberg, Germany
- André Maicher
  - Center for Molecular Biology at Heidelberg University (ZMBH), German Cancer Research Center (DKFZ)-ZMBH-Alliance, Im Neuenheimer Feld 282, 69120 Heidelberg, Germany; Department of Molecular Microbiology and Biotechnology, Tel Aviv University, Ramat Aviv 69978, Israel
- Anna K Dieckmann
  - Network Modeling, Leibniz Institute for Natural Product Research and Infection Biology-Hans Knöll Institute (HKI) Jena, Beutenbergstrasse 11a, 07745 Jena, Germany; Theoretical Bioinformatics, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 580, 69120 Heidelberg, Germany
- Marcus Oswald
  - Integrated Research and Treatment Center, Center for Sepsis Control and Care (CSCC), Jena University Hospital, D-07747 Jena, Erlanger Allee 101, Germany; Network Modeling, Leibniz Institute for Natural Product Research and Infection Biology-Hans Knöll Institute (HKI) Jena, Beutenbergstrasse 11a, 07745 Jena, Germany
- Roland Eils
  - Theoretical Bioinformatics, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 580, 69120 Heidelberg, Germany; Department of Bioinformatics and Functional Genomics, Institute of Pharmacy and Molecular Biotechnology, and Bioquant, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany
- Martin Kupiec
  - Department of Molecular Microbiology and Biotechnology, Tel Aviv University, Ramat Aviv 69978, Israel
- Brian Luke
  - Center for Molecular Biology at Heidelberg University (ZMBH), German Cancer Research Center (DKFZ)-ZMBH-Alliance, Im Neuenheimer Feld 282, 69120 Heidelberg, Germany; Telomere Biology Group, Institute of Molecular Biology (IMB), Ackermannweg 4, 55128 Mainz, Germany
- Rainer König
  - Integrated Research and Treatment Center, Center for Sepsis Control and Care (CSCC), Jena University Hospital, D-07747 Jena, Erlanger Allee 101, Germany; Network Modeling, Leibniz Institute for Natural Product Research and Infection Biology-Hans Knöll Institute (HKI) Jena, Beutenbergstrasse 11a, 07745 Jena, Germany; Theoretical Bioinformatics, German Cancer Research Center (DKFZ), Im Neuenheimer Feld 580, 69120 Heidelberg, Germany
4
Belcaid M, Toonen RJ. Demystifying computer science for molecular ecologists. Mol Ecol 2015; 24:2619-40. [PMID: 25824671 DOI: 10.1111/mec.13175] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 12/24/2014] [Revised: 03/23/2015] [Accepted: 03/25/2015] [Indexed: 11/30/2022]
Abstract
In this age of data-driven science and high-throughput biology, computational thinking is becoming an increasingly important skill for tackling both new and long-standing biological questions. However, despite its obvious importance and conspicuous integration into many areas of biology, computer science is still viewed as an obscure field that has, thus far, permeated into only a few of the biology curricula across the nation. A national survey has shown that lack of computational literacy in environmental sciences is the norm rather than the exception [Valle & Berdanier (2012) Bulletin of the Ecological Society of America, 93, 373-389]. In this article, we seek to introduce a few important concepts in computer science with the aim of providing a context-specific introduction aimed at research biologists. Our goal was to help biologists understand some of the most important mainstream computational concepts to better appreciate bioinformatics methods and trade-offs that are not obvious to the uninitiated.
Affiliation(s)
- Mahdi Belcaid
  - The Hawai'i Institute of Marine Biology, P.O. Box 1346, Kane'ohe, HI, 96744, USA
- Robert J Toonen
  - The Hawai'i Institute of Marine Biology, P.O. Box 1346, Kane'ohe, HI, 96744, USA
5
Suratanee A, Schaefer MH, Betts MJ, Soons Z, Mannsperger H, Harder N, Oswald M, Gipp M, Ramminger E, Marcus G, Männer R, Rohr K, Wanker E, Russell RB, Andrade-Navarro MA, Eils R, König R. Characterizing protein interactions employing a genome-wide siRNA cellular phenotyping screen. PLoS Comput Biol 2014; 10:e1003814. [PMID: 25255318 PMCID: PMC4178005 DOI: 10.1371/journal.pcbi.1003814] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 12/21/2013] [Accepted: 07/18/2014] [Indexed: 12/19/2022]
Abstract
Characterizing the activating and inhibiting effect of protein-protein interactions (PPI) is fundamental to gain insight into the complex signaling system of a human cell. A plethora of methods has been suggested to infer PPI from data on a large scale, but none of them is able to characterize the effect of these interactions. Here, we present a novel computational development that employs mitotic phenotypes of a genome-wide RNAi knockdown screen and enables identifying the activating and inhibiting effects of PPIs. As an example, we applied our technique to a knockdown screen of HeLa cells cultivated at standard conditions. Using a machine learning approach, we obtained high accuracy (82% AUC of the receiver operating characteristics) by cross-validation using 6,870 known activating and inhibiting PPIs as gold standard. We predicted de novo unknown activating and inhibiting effects for 1,954 PPIs in HeLa cells covering the ten major signaling pathways of the Kyoto Encyclopedia of Genes and Genomes, and made these predictions publicly available in a database. We finally demonstrate that the predicted effects can be used to cluster knockdown genes of similar biological processes in coherent subgroups. The characterization of the activating or inhibiting effect of individual PPIs opens up new perspectives for the interpretation of large datasets of PPIs and thus considerably increases the value of PPIs as an integrated resource for studying the detailed function of signaling pathways of the cellular system of interest.

Mathematical models which aim to describe cellular signaling start from constructing an interaction network of effectors, mediators and their effected target proteins. Several developments have made it easier to put these links together. Besides tediously assembling knowledge from textbooks and research articles, experimental high-throughput methods were established, such as Yeast-2-Hybrid assays or Fluorescence Resonance Energy Transfer. However, these methods do not elucidate the effect of such interactions. We aimed to infer whether an interaction in a specific cellular context is activating or inhibiting. We used cellular phenotypes of a genome-wide RNAi knockdown screen of live cells to identify such activating and inhibiting effects of protein interactions. The rationale behind it is that activating protein interactions should lead to similar phenotypes when their respective genes are knocked down, whereas an inhibiting protein interaction should lead to dissimilar phenotypes. As an example, we applied our method to a phenotype screen of perturbed HeLa cells. Our predictions effectively reproduced textbook relationships between proteins or domains when comparing the predicted effects with pairs of effectors, receptors, kinases, phosphatases and of general signaling modules. The presented computational approach is generic and may also enable elucidating the effects of studied interactions in other cellular systems under more specific conditions.
Affiliation(s)
- Apichat Suratanee
  - Department of Mathematics, Faculty of Applied Science, King Mongkut's University of Technology North Bangkok, Bangsue, Bangkok, Thailand
- Martin H. Schaefer
  - EMBL/CRG Systems Biology Research Unit, Center for Genomic Regulation, Barcelona, Spain
- Matthew J. Betts
  - Robert B. Russell, Cell Networks Protein Evolution, BioQuant, University of Heidelberg, Heidelberg, Germany
- Zita Soons
  - Network Modeling, Leibniz Institute for Natural Product Research and Infection Biology - Hans Knöll Institute Jena, Jena, Germany
  - Department of Knowledge Engineering, Maastricht University, Maastricht, The Netherlands
  - Theoretical Bioinformatics, German Cancer Research Center, Heidelberg, Germany
- Heiko Mannsperger
  - Theoretical Bioinformatics, German Cancer Research Center, Heidelberg, Germany
- Nathalie Harder
  - Theoretical Bioinformatics, German Cancer Research Center, Heidelberg, Germany
  - Department of Bioinformatics and Functional Genomics, Institute of Pharmacy and Molecular Biotechnology, BioQuant, University of Heidelberg, Heidelberg, Germany
- Marcus Oswald
  - Network Modeling, Leibniz Institute for Natural Product Research and Infection Biology - Hans Knöll Institute Jena, Jena, Germany
  - Theoretical Bioinformatics, German Cancer Research Center, Heidelberg, Germany
  - Integrated Research and Treatment Center, Center for Sepsis Control and Care (CSCC), Jena University Hospital, Jena, Germany
- Markus Gipp
  - Department of Computer Science V, Institute of Computer Engineering, University of Mannheim, Mannheim, Germany
- Ellen Ramminger
  - AG Neuroproteomics, Max Delbrueck Center for Molecular Medicine, Berlin, Germany
- Guillermo Marcus
  - Department of Computer Science V, Institute of Computer Engineering, University of Mannheim, Mannheim, Germany
- Reinhard Männer
  - Department of Computer Science V, Institute of Computer Engineering, University of Mannheim, Mannheim, Germany
- Karl Rohr
  - Theoretical Bioinformatics, German Cancer Research Center, Heidelberg, Germany
  - Department of Bioinformatics and Functional Genomics, Institute of Pharmacy and Molecular Biotechnology, BioQuant, University of Heidelberg, Heidelberg, Germany
- Erich Wanker
  - AG Neuroproteomics, Max Delbrueck Center for Molecular Medicine, Berlin, Germany
- Robert B. Russell
  - Robert B. Russell, Cell Networks Protein Evolution, BioQuant, University of Heidelberg, Heidelberg, Germany
- Miguel A. Andrade-Navarro
  - Computational Biology and Data Mining Group, Max Delbrueck Center for Molecular Medicine, Berlin, Germany
- Roland Eils
  - Theoretical Bioinformatics, German Cancer Research Center, Heidelberg, Germany
  - Department of Bioinformatics and Functional Genomics, Institute of Pharmacy and Molecular Biotechnology, BioQuant, University of Heidelberg, Heidelberg, Germany
- Rainer König
  - Network Modeling, Leibniz Institute for Natural Product Research and Infection Biology - Hans Knöll Institute Jena, Jena, Germany
  - Theoretical Bioinformatics, German Cancer Research Center, Heidelberg, Germany
  - Integrated Research and Treatment Center, Center for Sepsis Control and Care (CSCC), Jena University Hospital, Jena, Germany
6
Li X, Zhao Y, Tian B, Jamaluddin M, Mitra A, Yang J, Rowicka M, Brasier AR, Kudlicki A. Modulation of gene expression regulated by the transcription factor NF-κB/RelA. J Biol Chem 2014; 289:11927-11944. [PMID: 24523406 DOI: 10.1074/jbc.m113.539965] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Indexed: 01/21/2023]
Abstract
Modulators (Ms) are proteins that modify the activity of transcription factors (TFs) and influence expression of their target genes (TGs). To discover modulators of NF-κB/RelA, we first identified 365 NF-κB/RelA-binding proteins using liquid chromatography-tandem mass spectrometry (LC-MS/MS). We used a probabilistic model to infer 8349 (M, NF-κB/RelA, TG) triplets and their modes of modulatory action from our combined LC-MS/MS and ChIP-Seq (ChIP followed by next generation sequencing) data, published RelA modulators and TGs, and a compendium of gene expression profiles. Hierarchical clustering of the derived modulatory network revealed functional subnetworks and suggested new pathways modulating RelA transcriptional activity. The modulators with the highest number of TGs and most non-random distribution of action modes (measured by Shannon entropy) are consistent with published reports. Our results provide a repertoire of testable hypotheses for experimental validation. One of the NF-κB/RelA modulators we identified is STAT1. The inferred (STAT1, NF-κB/RelA, TG) triplets were validated by LC-selected reaction monitoring-MS and the results of STAT1 deletion in human fibrosarcoma cells. Overall, we have identified 562 NF-κB/RelA modulators, which are potential drug targets, and clarified mechanisms of achieving NF-κB/RelA multiple functions through modulators. Our approach can be readily applied to other TFs.
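The abstract above ranks modulators by how non-random their distribution of action modes is, measured by Shannon entropy. A minimal Python sketch of that measure follows. The action-mode labels and the two example modulators are hypothetical and not taken from the paper, and the paper may normalize or compare entropies differently.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy, in bits, of the empirical distribution of labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical action modes assigned to the target genes of two modulators.
# The label names are illustrative assumptions, not the paper's categories.
uniform_modes = ["enhance", "attenuate", "invert", "none"]
skewed_modes = ["enhance"] * 9 + ["attenuate"]

# Lower entropy means the modulator acts in a less random, more consistent way.
print(shannon_entropy(uniform_modes))
print(shannon_entropy(skewed_modes))
```

Under this reading, a modulator whose targets mostly share one action mode scores a low entropy, while one whose action modes are spread evenly scores the maximum for that number of modes.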
Affiliation(s)
- Xueling Li
  - Institute for Translational Sciences, University of Texas Medical Branch, Galveston, Texas 77555; Sealy Center for Molecular Medicine, University of Texas Medical Branch, Galveston, Texas 77555; Departments of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, Texas 77555; Hefei Institute of Intelligent Machines, Chinese Academy of Sciences, Hefei 230031, China
- Yingxin Zhao
  - Institute for Translational Sciences, University of Texas Medical Branch, Galveston, Texas 77555; Sealy Center for Molecular Medicine, University of Texas Medical Branch, Galveston, Texas 77555; Center for Clinical Proteomics, University of Texas Medical Branch, Galveston, Texas 77555
- Bing Tian
  - Institute for Translational Sciences, University of Texas Medical Branch, Galveston, Texas 77555; Sealy Center for Molecular Medicine, University of Texas Medical Branch, Galveston, Texas 77555; Departments of Internal Medicine, University of Texas Medical Branch, Galveston, Texas 77555
- Mohammad Jamaluddin
  - Institute for Translational Sciences, University of Texas Medical Branch, Galveston, Texas 77555; Sealy Center for Molecular Medicine, University of Texas Medical Branch, Galveston, Texas 77555
- Abhishek Mitra
  - Institute for Translational Sciences, University of Texas Medical Branch, Galveston, Texas 77555; Sealy Center for Molecular Medicine, University of Texas Medical Branch, Galveston, Texas 77555
- Jun Yang
  - Sealy Center for Molecular Medicine, University of Texas Medical Branch, Galveston, Texas 77555; Departments of Internal Medicine, University of Texas Medical Branch, Galveston, Texas 77555
- Maga Rowicka
  - Institute for Translational Sciences, University of Texas Medical Branch, Galveston, Texas 77555; Sealy Center for Molecular Medicine, University of Texas Medical Branch, Galveston, Texas 77555; Departments of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, Texas 77555
- Allan R Brasier
  - Institute for Translational Sciences, University of Texas Medical Branch, Galveston, Texas 77555; Sealy Center for Molecular Medicine, University of Texas Medical Branch, Galveston, Texas 77555; Center for Clinical Proteomics, University of Texas Medical Branch, Galveston, Texas 77555; Departments of Internal Medicine, University of Texas Medical Branch, Galveston, Texas 77555
- Andrzej Kudlicki
  - Institute for Translational Sciences, University of Texas Medical Branch, Galveston, Texas 77555; Sealy Center for Molecular Medicine, University of Texas Medical Branch, Galveston, Texas 77555; Departments of Biochemistry and Molecular Biology, University of Texas Medical Branch, Galveston, Texas 77555.
7
Schrynemackers M, Küffner R, Geurts P. On protocols and measures for the validation of supervised methods for the inference of biological networks. Front Genet 2013; 4:262. [PMID: 24348517 PMCID: PMC3848415 DOI: 10.3389/fgene.2013.00262] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.7] [Received: 09/15/2013] [Accepted: 11/13/2013] [Indexed: 11/30/2022]
Abstract
Networks provide a natural representation of molecular biology knowledge, in particular to model relationships between biological entities such as genes, proteins, drugs, or diseases. Because of the effort, the cost, or the lack of the experiments necessary for the elucidation of these networks, computational approaches for network inference have been frequently investigated in the literature. In this paper, we examine the assessment of supervised network inference. Supervised inference is based on machine learning techniques that infer the network from a training sample of known interacting and possibly non-interacting entities and additional measurement data. While these methods are very effective, their reliable validation in silico poses a challenge, since both prediction and validation need to be performed on the basis of the same partially known network. Cross-validation techniques need to be specifically adapted to classification problems on pairs of objects. We perform a critical review and assessment of protocols and measures proposed in the literature and derive specific guidelines on how best to exploit and evaluate machine learning techniques for network inference. Through theoretical considerations and in silico experiments, we analyze in depth how important factors influence the outcome of performance estimation. These factors include the amount of information available for the interacting entities, the sparsity and topology of biological networks, and the lack of experimentally verified non-interacting pairs.
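The abstract notes that cross-validation must be adapted when the objects being classified are pairs. One common adaptation, sketched here as an assumption rather than as the protocol the paper recommends, is to split at the level of nodes and then group candidate pairs by how many of their endpoints were seen during training. The function names and the gene identifiers are hypothetical.

```python
import random
from itertools import combinations


def node_level_split(nodes, test_fraction=0.25, seed=0):
    """Split nodes (not pairs) into a training set and a held-out test set."""
    rng = random.Random(seed)
    shuffled = list(nodes)
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return set(shuffled[n_test:]), set(shuffled[:n_test])


def categorize_pairs(nodes, train_nodes):
    """Group every unordered pair by how many endpoints are training nodes."""
    groups = {"train-train": [], "train-test": [], "test-test": []}
    for a, b in combinations(sorted(nodes), 2):
        n_seen = (a in train_nodes) + (b in train_nodes)
        key = ("test-test", "train-test", "train-train")[n_seen]
        groups[key].append((a, b))
    return groups


# Hypothetical gene identifiers, used only to exercise the functions.
nodes = [f"g{i}" for i in range(8)]
train_nodes, test_nodes = node_level_split(nodes)
groups = categorize_pairs(nodes, train_nodes)
for name, pairs in groups.items():
    print(name, len(pairs))
```

Reporting performance separately for each group avoids mixing easy predictions (both endpoints seen in training) with hard ones (neither endpoint seen), which a naive random split over pairs would blend into a single score.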
Affiliation(s)
- Marie Schrynemackers
  - Systems and Modeling, Department of Electrical Engineering and Computer Science and GIGA-R, University of Liège, Liège, Belgium
- Robert Küffner
  - Institute for Practical Informatics and Bioinformatics, Ludwig-Maximilians-University Munich, Germany
- Pierre Geurts
  - Systems and Modeling, Department of Electrical Engineering and Computer Science and GIGA-R, University of Liège, Liège, Belgium
8
Stewart AJ, Plotkin JB. The evolution of complex gene regulation by low-specificity binding sites. Proc Biol Sci 2013; 280:20131313. [PMID: 23945682 DOI: 10.1098/rspb.2013.1313] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Indexed: 01/10/2023]
Abstract
Requirements for gene regulation vary widely both within and among species. Some genes are constitutively expressed, whereas other genes require complex regulatory control. Transcriptional regulation is often controlled by a module of multiple transcription factor binding sites that, in combination, mediate the expression of a target gene. Here, we study how such regulatory modules evolve in response to natural selection. Using a population-genetic model, we show that complex regulatory modules which contain a larger number of binding sites must employ binding motifs that are less specific, on average, compared with smaller regulatory modules. This effect is extremely general, and it holds regardless of the selected binding logic that a module experiences. We attribute this phenomenon to the inability of stabilizing selection to maintain highly specific sites in large regulatory modules. Our analysis helps to explain broad empirical trends in the Saccharomyces cerevisiae regulatory network: those genes with a greater number of distinct transcriptional regulators feature less-specific binding motifs, compared with genes with fewer regulators. Our results also help to explain empirical trends in module size and motif specificity across species, ranging from prokaryotes to single-cellular and multi-cellular eukaryotes.
9
Glass K, Huttenhower C, Quackenbush J, Yuan GC. Passing messages between biological networks to refine predicted interactions. PLoS One 2013; 8:e64832. [PMID: 23741402 PMCID: PMC3669401 DOI: 10.1371/journal.pone.0064832] [Citation(s) in RCA: 125] [Impact Index Per Article: 11.4] [Received: 11/26/2012] [Accepted: 04/17/2013] [Indexed: 01/10/2023]
Abstract
Regulatory network reconstruction is a fundamental problem in computational biology. There are significant limitations to such reconstruction using individual datasets, and increasingly people attempt to construct networks using multiple, independent datasets obtained from complementary sources, but methods for this integration are lacking. We developed PANDA (Passing Attributes between Networks for Data Assimilation), a message-passing model using multiple sources of information to predict regulatory relationships, and used it to integrate protein-protein interaction, gene expression, and sequence motif data to reconstruct genome-wide, condition-specific regulatory networks in yeast as a model. The resulting networks were not only more accurate than those produced using individual data sets and other existing methods, but they also captured information regarding specific biological mechanisms and pathways that were missed using other methodologies. PANDA is scalable to higher eukaryotes, applicable to specific tissue or cell type data and conceptually generalizable to include a variety of regulatory, interaction, expression, and other genome-scale data. An implementation of the PANDA algorithm is available at www.sourceforge.net/projects/panda-net.
Affiliation(s)
- Kimberly Glass
  - Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, Massachusetts, United States of America
  - Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, United States of America
- Curtis Huttenhower
  - Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, United States of America
- John Quackenbush
  - Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, Massachusetts, United States of America
  - Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, United States of America
- Guo-Cheng Yuan
  - Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, Massachusetts, United States of America
  - Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, United States of America
10
Stewart AJ, Seymour RM, Pomiankowski A, Reuter M. Under-dominance constrains the evolution of negative autoregulation in diploids. PLoS Comput Biol 2013; 9:e1002992. [PMID: 23555226 PMCID: PMC3605092 DOI: 10.1371/journal.pcbi.1002992] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 05/25/2012] [Accepted: 02/04/2013] [Indexed: 11/19/2022]
Abstract
Regulatory networks have evolved to allow gene expression to rapidly track changes in the environment as well as to buffer perturbations and maintain cellular homeostasis in the absence of change. Theoretical work and empirical investigation in Escherichia coli have shown that negative autoregulation confers both rapid response times and reduced intrinsic noise, which is reflected in the fact that almost half of Escherichia coli transcription factors are negatively autoregulated. However, negative autoregulation is rare amongst the transcription factors of Saccharomyces cerevisiae. This difference is surprising because E. coli and S. cerevisiae otherwise have similar profiles of network motifs. In this study we investigate regulatory interactions amongst the transcription factors of Drosophila melanogaster and humans, and show that they have a similar dearth of negative autoregulation to that seen in S. cerevisiae. We then present a model demonstrating that this striking difference in the noise reduction strategies used amongst species can be explained by constraints on the evolution of negative autoregulation in diploids. We show that regulatory interactions between pairs of homologous genes within the same cell can lead to under-dominance: mutations which result in stronger autoregulation, and decrease noise in homozygotes, paradoxically can cause increased noise in heterozygotes. This severely limits a diploid's ability to evolve negative autoregulation as a noise reduction mechanism. Our work offers a simple and general explanation for a previously unexplained difference between the regulatory architectures of E. coli and yeast, Drosophila and humans. It also demonstrates that the effects of diploidy in gene networks can have counter-intuitive consequences that may profoundly influence the course of evolution.

All genes have to deal with intrinsic noise, and a variety of mechanisms have evolved to reduce it. One important mechanism of noise reduction for transcription factors is negative autoregulation, in which a gene product represses its own rate of transcription. Negative autoregulation occurs frequently in E. coli but, we find, occurs much more rarely in S. cerevisiae, D. melanogaster and humans. Whilst there are a great many important differences in the genetic architectures of these organisms, they tend to share, with the exception of negative autoregulation, similar profiles of network motifs. This makes the discrepancy in the degree of negative autoregulation all the more striking, as it lacks any obvious explanation. Our study presents a potential explanation, by comparing the evolvability of negative autoregulation as a noise reduction mechanism in haploids and diploids. We show that, in diploids, mutations that increase the strength of negative autoregulation at one gene copy often increase overall noise in gene expression. This results in under-dominance, in which heterozygotes are less fit than homozygotes. The result is that the evolution of negative autoregulation in diploids is significantly constrained. We verify our results using a combination of detailed molecular simulations and evolutionary simulations.
Affiliation(s)
- Alexander J Stewart
  - Department of Biology, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.
11
Zacher B, Abnaof K, Gade S, Younesi E, Tresch A, Fröhlich H. Joint Bayesian inference of condition-specific miRNA and transcription factor activities from combined gene and microRNA expression data. Bioinformatics 2012; 28:1714-20. [PMID: 22563068 DOI: 10.1093/bioinformatics/bts257] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.0] [Indexed: 12/26/2022]
Abstract
MOTIVATION: There have been many successful experimental and bioinformatics efforts to elucidate transcription factor (TF)-target networks in several organisms. For many organisms, these annotations are complemented by miRNA-target networks of good quality. Attempts that use these networks in combination with gene expression data to draw conclusions on TF or miRNA activity are, however, still relatively sparse. RESULTS: In this study, we propose Bayesian inference of regulation of transcriptional activity (BIRTA) as a novel approach to infer both TF and miRNA activities from combined miRNA and mRNA expression data in a condition-specific way. That is, our model explains mRNA and miRNA expression for a specific experimental condition by the activities of certain miRNAs and TFs, hence allowing for differentiating between switches from active to inactive (negative switch) and inactive to active (positive switch) forms. Extensive simulations of our model reveal its good prediction performance in comparison to other approaches. Furthermore, the utility of BIRTA is demonstrated using the example of Escherichia coli data comparing aerobic and anaerobic growth conditions, and by human expression data from pancreas and ovarian cancer. AVAILABILITY AND IMPLEMENTATION: The method is implemented in the R package birta, which is freely available for Bioconductor (>=2.10) at http://www.bioconductor.org/packages/release/bioc/html/birta.html.
Affiliation(s)
- Benedikt Zacher
  - Ludwig-Maximilians-Universität München, Gene Center Munich and Center for integrated Protein Science CiPSM, Department of Chemistry and Biochemistry, Feodor-Lynen-Street 25, 81377 Munich, Germany
12
Zhang G, Zhou B, Wang W, Zhang M, Zhao Y, Wang Z, Yang L, Zhai J, Feng CG, Wang J, Chen X. A functional single-nucleotide polymorphism in the promoter of the gene encoding interleukin 6 is associated with susceptibility to tuberculosis. J Infect Dis 2012; 205:1697-704. [PMID: 22457277 DOI: 10.1093/infdis/jis266] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.5] [Indexed: 12/14/2022]
Abstract
BACKGROUND: Genetic variation influences susceptibility or resistance to tuberculosis. Interleukin 6 (IL-6) contributes to protection against tuberculosis in mice. However, its role in regulating susceptibility or resistance to tuberculosis in humans is unclear. METHODS: Genotyping of polymorphisms in IL-6 and IL-6R (CD126) genes was performed in 2 independent cohorts, an experimental population (495 cases and 358 controls) and a validation population (1383 cases and 1149 controls). The associations of the variants with tuberculosis were tested using 2 case-control association studies. In addition, the regulatory effects of single-nucleotide polymorphism rs1800796 (-572C > G) on IL-6 production in plasma and CD14(+) monocyte cultures stimulated with a Mycobacterium tuberculosis (M. tuberculosis) product were assessed. RESULTS: The rs1800796 polymorphism is associated with increased resistance to tuberculosis (odds ratio [OR], 0.771; 95% confidence interval [CI], .684-.870). The rs1800796GG genotype is strongly associated with reduced risk of tuberculosis (OR, 0.621; 95% CI, .460-.838). Interestingly, CD14(+) monocytes isolated from individuals with rs1800796GG genotype produced significantly less IL-6 in response to M. tuberculosis 19-kDa lipoprotein than those with CC or CG genotype. CONCLUSIONS: We identified a genetic polymorphism in the IL-6 promoter that regulates cytokine production and host resistance to pulmonary tuberculosis in Chinese populations.
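The results above are reported as odds ratios with 95% intervals. As a reading aid, the following Python sketch shows how an odds ratio and a Wald-type interval are computed from a 2x2 case-control table. The counts are invented for illustration and are not the study's genotype counts, and the abstract does not say which interval method the authors used, so the Wald interval here is an assumption.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald interval for a 2x2 table.

    a: carrier cases, b: non-carrier cases,
    c: carrier controls, d: non-carrier controls.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts chosen only to demonstrate the calculation.
or_, lower, upper = odds_ratio_ci(a=10, b=20, c=20, d=10)
print(f"OR = {or_:.3f}, 95% CI {lower:.3f}-{upper:.3f}")
```

An odds ratio below 1 whose interval also stays below 1, as in the abstract's reported values, indicates lower odds of disease among carriers at the conventional significance level.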
Affiliation(s)
- Guoliang Zhang
  - Shenzhen Third People's Hospital, Guangdong Medical College, China