1
Aman Beshir J, Kebede M. In silico analysis of promoter regions and regulatory elements (motifs and CpG islands) of the genes encoding for alcohol production in Saccharomyces cerevisiaea S288C and Schizosaccharomyces pombe 972h. J Genet Eng Biotechnol 2021; 19:8. [PMID: 33428031 PMCID: PMC7801573 DOI: 10.1186/s43141-020-00097-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/19/2020] [Accepted: 11/17/2020] [Indexed: 11/10/2022]
Abstract
BACKGROUND The crucial factor in the production of bio-fuels is the choice of potent microorganisms used in fermentation processes. Despite the evolving trend of using bacteria, yeast is still the primary choice for fermentation. Molecular characterization of many genes from baker's yeast (Saccharomyces cerevisiaea), and fission yeast (Schizosaccharomyces pombe), have improved our understanding in gene structure and the regulation of its expression. This in silico study was done with the aim of analyzing the promoter regions, transcription start site (TSS), and CpG islands of genes encoding for alcohol production in S. cerevisiaea S288C and S. pombe 972h-. RESULTS The analysis revealed the highest promoter prediction scores (1.0) were obtained in five sequences (AAD4, SFA1, GRE3, YKL071W, and YPR127W) for S. cerevisiaea S288C TSS while the lowest (0.8) were found in three sequences (AAD6, ADH5, and BDH2). Similarly, in S. pombe 972h-, the highest (0.99) and lowest (0.88) prediction scores were obtained in five (Adh1, SPBC8E4.04, SPBC215.11c, SPAP32A8.02, and SPAC19G12.09) and one (erg27) sequences, respectively. Determination of common motifs revealed that S. cerevisiaea S288C had 100% coverage at MSc1 with an E value of 3.7e-007 while S. pombe 972h- had 95.23% at MSp1 with an E value of 2.6e+002. Furthermore, comparison of identified transcription factor proteins indicated that 88.88% of MSp1 were exactly similar to MSc1. It also revealed that only 21.73% in S. cerevisiaea S288C and 28% in S. pombe 972h- of the gene body regions had CpG islands. A combined phylogenetic analysis indicated that all sequences from both S. cerevisiaea S288C and S. pombe 972h- were divided into four subgroups (I, II, III, and IV). The four clades are respectively colored in blue, red, green, and violet. 
CONCLUSION This in silico analysis of gene promoter regions and transcription factors through the actions of regulatory structure such as motifs and CpG islands of genes encoding alcohol production could be used to predict gene expression profiles in yeast species.
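The abstract does not say which criteria were used to call CpG islands. As a minimal sketch, assuming the commonly cited thresholds (length of at least 200 bp, GC content above 50%, observed/expected CpG ratio above 0.6), the check could look like the following. The function names are illustrative and are not taken from the paper or its tools.

```python
def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n if n else 0.0
    # Expected CpG count if C and G were placed independently is (C * G) / N
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply the assumed thresholds to one candidate region."""
    gc_fraction, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_fraction > min_gc and obs_exp > min_obs_exp
```

In practice the test is applied to sliding windows along the gene body rather than to a whole sequence at once; the thresholds here are parameters precisely because tools differ in their defaults.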
Affiliation(s)
- Jemal Aman Beshir
- Department of Applied Biology, School of Applied Natural Science, Adama Science and Technology University, P.O. Box 1888, Adama, Ethiopia
- Ethiopian Sugar Corporation, Sugar Academy, Wonji, Ethiopia
- Mulugeta Kebede
- Department of Applied Biology, School of Applied Natural Science, Adama Science and Technology University, P.O. Box 1888, Adama, Ethiopia
2
Identification of Structural Elements of the Lysine Specific Demethylase 2B CxxC Domain Associated with Replicative Senescence Bypass in Primary Mouse Cells. Protein J 2020; 39:232-239. [DOI: 10.1007/s10930-020-09895-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 02/08/2023]
3
Song J, Bjarnason J, Surette MG. The identification of functional motifs in temporal gene expression analysis. Evol Bioinform Online 2017. [DOI: 10.1177/117693430500100008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022] Open
Abstract
The identification of transcription factor binding sites is essential to the understanding of the regulation of gene expression and the reconstruction of genetic regulatory networks. The in silico identification of cis-regulatory motifs is challenging due to sequence variability and lack of sufficient data to generate consensus motifs that are of quantitative or even qualitative predictive value. To determine functional motifs in gene expression, we propose a strategy to adopt false discovery rate (FDR) and estimate motif effects to evaluate combinatorial analysis of motif candidates and temporal gene expression data. The method decreases the number of predicted motifs, which can then be confirmed by genetic analysis. To assess the method we used simulated motif/expression data to evaluate parameters. We applied this approach to experimental data for a group of iron responsive genes in Salmonella typhimurium 14028S. The method identified known and potentially new ferric-uptake regulator (Fur) binding sites. In addition, we identified uncharacterized functional motif candidates that correlated with specific patterns of expression. A SAS code for the simulation and analysis gene expression data is available from the first author upon request.
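The abstract mentions adopting a false discovery rate but does not name the procedure. The sketch below assumes the standard Benjamini-Hochberg step-up rule applied to one p-value per motif candidate. It is not the authors' SAS code, and the function name is illustrative.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the candidate passes FDR level alpha."""
    m = len(p_values)
    # Indices of the p-values in ascending order
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k whose p-value satisfies p_(k) <= (k / m) * alpha
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Everything ranked at or below max_k is rejected, even if it missed its own threshold
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected
```

Controlling the FDR rather than the family-wise error rate keeps more candidates for follow-up, which matches the stated goal of shortlisting motifs for later genetic confirmation.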
Affiliation(s)
- Jiuzhou Song
- Department of Animal and Avian Sciences, University of Maryland, Maryland 20742, USA
- Jaime Bjarnason
- Department of Microbiology and Infectious Diseases, and Department of Biochemistry and Molecular Biology, Health Sciences Centre, University of Calgary, Calgary, AB, Canada, T2N 4N1
- Michael G. Surette
- Department of Microbiology and Infectious Diseases, and Department of Biochemistry and Molecular Biology, Health Sciences Centre, University of Calgary, Calgary, AB, Canada, T2N 4N1
4
Obayashi T, Okamura Y, Ito S, Tadaka S, Motoike IN, Kinoshita K. COXPRESdb: a database of comparative gene coexpression networks of eleven species for mammals. Nucleic Acids Res 2012. [PMID: 23203868 PMCID: PMC3531062 DOI: 10.1093/nar/gks1014] [Citation(s) in RCA: 66] [Impact Index Per Article: 5.5] [Indexed: 12/31/2022] Open
Abstract
Coexpressed gene databases are valuable resources for identifying new gene functions or functional modules in metabolic pathways and signaling pathways. Although coexpressed gene databases are a fundamental platform in the field of plant biology, their use in animal studies is relatively limited. The COXPRESdb (http://coxpresdb.jp) provides coexpression relationships for multiple animal species, as comparisons of coexpressed gene lists can enhance the reliability of gene coexpression determinations. Here, we report the updates of the database, mainly focusing on the following two points. First, we updated our coexpression data by including recent microarray data for the previous seven species (human, mouse, rat, chicken, fly, zebrafish and nematode) and adding four new species (monkey, dog, budding yeast and fission yeast), along with a new human microarray platform. A reliability scoring function was also implemented, based on coexpression conservation to filter out coexpression with low reliability. Second, the network drawing function was updated, to implement automatic cluster analyses with enrichment analyses in Gene Ontology and in cis elements, along with interactive network analyses with Cytoscape Web. With these updates, COXPRESdb will become a more powerful tool for analyses of functional and regulatory networks of genes in a variety of animal species.
Affiliation(s)
- Takeshi Obayashi
- Graduate School of Information Sciences, Tohoku University, Sendai 980-8679, Japan
5
Zambelli F, Pesole G, Pavesi G. Motif discovery and transcription factor binding sites before and after the next-generation sequencing era. Brief Bioinform 2012; 14:225-37. [PMID: 22517426 PMCID: PMC3603212 DOI: 10.1093/bib/bbs016] [Citation(s) in RCA: 93] [Impact Index Per Article: 7.8] [Indexed: 02/07/2023] Open
Abstract
Motif discovery has been one of the most widely studied problems in bioinformatics ever since genomic and protein sequences have been available. In particular, its application to the de novo prediction of putative over-represented transcription factor binding sites in nucleotide sequences has been, and still is, one of the most challenging flavors of the problem. Recently, novel experimental techniques like chromatin immunoprecipitation (ChIP) have been introduced, permitting the genome-wide identification of protein-DNA interactions. ChIP, applied to transcription factors and coupled with genome tiling arrays (ChIP on Chip) or next-generation sequencing technologies (ChIP-Seq) has opened new avenues in research, as well as posed new challenges to bioinformaticians developing algorithms and methods for motif discovery.
6
Finding Transcription Factor Binding Motifs for Coregulated Genes by Combining Sequence Overrepresentation with Cross-Species Conservation. JOURNAL OF PROBABILITY AND STATISTICS 2012. [DOI: 10.1155/2012/830575] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/17/2022] Open
Abstract
Novel computational methods for finding transcription factor binding motifs have long been sought due to tedious work of experimentally identifying them. However, the current prevailing methods yield a large number of false positive predictions due to the short, variable nature of transcriptional factor binding sites (TFBSs). We proposed here a method that combines sequence overrepresentation and cross-species sequence conservation to detect TFBSs in upstream regions of a given set of coregulated genes. We applied the method to 35 S. cerevisiae transcriptional factors with known DNA binding motifs (with the support of orthologous sequences from genomes of S. mikatae, S. bayanus, and S. paradoxus), and the proposed method outperformed the single-genome-based motif finding methods MEME and AlignACE as well as the multiple-genome-based methods PHYME and Footprinter for the majority of these transcriptional factors. Compared with the prevailing motif finding software, our method has some advantages in finding transcriptional factor binding motifs for potential coregulated genes if the gene upstream sequences of multiple closely related species are available. Although we used yeast genomes to assess our method in this study, it might also be applied to other organisms if suitable related species are available and the upstream sequences of coregulated genes can be obtained for the multiple closely related species.
7
Zheng X, Liu T, Yang Z, Wang J. Large cliques in Arabidopsis gene coexpression network and motif discovery. JOURNAL OF PLANT PHYSIOLOGY 2011; 168:611-618. [PMID: 21044807 DOI: 10.1016/j.jplph.2010.09.010] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 05/29/2010] [Revised: 08/31/2010] [Accepted: 09/06/2010] [Indexed: 05/30/2023]
Abstract
Identification of cis-regulatory elements in Arabidopsis is a key step to understanding its transcriptional regulation scheme. In this study, the Arabidopsis gene coexpression network was constructed using the ATTED-II data, and thereafter a subgraph-induced approach and clique-finding algorithm were used to extract gene coexpression groups from the gene coexpression network. A total of 23 large coexpression gene groups were obtained, with each consisting of more than 100 highly correlated genes. Four classical tools were used to predict motifs in the promoter regions of coexpressed genes. Consequently, we detected a large number of candidate biologically relevant regulatory elements, and many of them are consistent with known cis-regulatory elements from AGRIS and AthaMap. Experiments on coexpressed groups, including E2Fa target genes, showed that our method had a high probability of returning the real binding motif. Our study provides the basis for future cis-regulatory module analysis and creates a starting point to unravel regulatory networks of Arabidopsis thaliana.
Affiliation(s)
- Xiaoqi Zheng
- Department of Mathematics, Shanghai Normal University, Shanghai 200234, China
8
Bernard V, Lecharny A, Brunaud V. Improved detection of motifs with preferential location in promoters. Genome 2011; 53:739-52. [PMID: 20924423 DOI: 10.1139/g10-042] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 01/27/2023]
Abstract
Many transcription factor binding sites (TFBSs) involved in gene expression regulation are preferentially located relative to the transcription start site. This property is exploited in in silico prediction approaches, one of which involves studying the local overrepresentation of motifs using a sliding window to scan promoters with considerable accuracy. Nevertheless, the consequences of the choice of the sliding window size have never before been analysed. We propose an automatic adaptation of this size to each motif distribution profile. This approach allows a better characterization of the topological constraints of the motifs and the lists of genes containing them. Moreover, our approach allowed us to highlight a nonconstant frequency of occurrence of spurious motifs that could be counter-selected close to their functional area. Therefore, to improve the accuracy of in silico prediction of TFBSs and the sensitivity of the promoter cartography, we propose, in addition to automatic adaptation of window size, consideration of the nonconstant frequency of motifs in promoters.
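The sliding-window idea in this abstract can be illustrated with a small sketch. It assumes promoters are equal-length strings aligned so that the same index corresponds to the same offset from the transcription start site, and it uses a fixed window size. The automatic window-size adaptation the authors propose is not reproduced here, and all names are illustrative.

```python
def motif_positions(promoter, motif):
    """Start indices of every (possibly overlapping) occurrence of motif."""
    positions = []
    start = promoter.find(motif)
    while start != -1:
        positions.append(start)
        start = promoter.find(motif, start + 1)
    return positions


def window_counts(promoters, motif, window, step):
    """Count motif start positions falling in each window, summed over promoters."""
    length = len(promoters[0])  # assumes all promoters share this length
    starts = list(range(0, length - window + 1, step))
    counts = [0] * len(starts)
    for promoter in promoters:
        for pos in motif_positions(promoter, motif):
            for w, w_start in enumerate(starts):
                if w_start <= pos < w_start + window:
                    counts[w] += 1
    return starts, counts
```

A window whose count stands well above the others suggests a preferred location for the motif; how wide that window should be is exactly the parameter the paper argues should be tuned per motif.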
Affiliation(s)
- Virginie Bernard
- Unité de Recherche en Génomique Végétale (URGV), UMR INRA 1165 - CNRS 8114 - UEVE, 91057 Evry CEDEX, France
9
Sinatra R, Condorelli D, Latora V. Networks of motifs from sequences of symbols. PHYSICAL REVIEW LETTERS 2010; 105:178702. [PMID: 21231087 DOI: 10.1103/physrevlett.105.178702] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 02/04/2010] [Revised: 08/16/2010] [Indexed: 05/26/2023]
Abstract
We introduce a method to convert an ensemble of sequences of symbols into a weighted directed network whose nodes are motifs, while the directed links and their weights are defined from statistically significant co-occurrences of two motifs in the same sequence. The analysis of communities of networks of motifs is shown to be able to correlate sequences with functions in the human proteome database, to detect hot topics from online social dialogs, to characterize trajectories of dynamical systems, and it might find other useful applications to process large amounts of data in various fields.
Affiliation(s)
- Roberta Sinatra
- Dipartimento di Fisica ed Astronomia, Università di Catania, INFN, Italy.
10
Jiao Q, Yang Z, Huang J. Construction of a gene regulatory network for Arabidopsis based on metabolic pathway data. CHINESE SCIENCE BULLETIN-CHINESE 2010. [DOI: 10.1007/s11434-009-0728-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
11
VIP1 response elements mediate mitogen-activated protein kinase 3-induced stress gene expression. Proc Natl Acad Sci U S A 2009; 106:18414-9. [PMID: 19820165 DOI: 10.1073/pnas.0905599106] [Citation(s) in RCA: 105] [Impact Index Per Article: 7.0] [Indexed: 01/05/2023] Open
Abstract
The plant pathogen Agrobacterium tumefaciens transforms plant cells by delivering its T-DNA into the plant cell nucleus where it integrates into the plant genome and causes tumor formation. A key role of VirE2-interacting protein 1 (VIP1) in the nuclear import of T-DNA during Agrobacterium-mediated plant transformation has been unravelled and VIP1 was shown to undergo nuclear localization upon phosphorylation by the mitogen-activated protein kinase MPK3. Here, we provide evidence that VIP1 encodes a functional bZIP transcription factor that stimulates stress-dependent gene expression by binding to VIP1 response elements (VREs), a DNA hexamer motif. VREs are overrepresented in promoters responding to activation of the MPK3 pathway such as Trxh8 and MYB44. Accordingly, plants overexpressing VIP1 accumulate high levels of Trxh8 and MYB44 transcripts, whereas stress-induced expression of these genes is impaired in mpk3 mutants. Trxh8 and MYB44 promoters are activated by VIP1 in a VRE-dependent manner. VIP1 strongly enhances expression from a synthetic promoter harboring multiple VRE copies and directly interacts with VREs in vitro and in vivo. Chromatin immunoprecipitation assays of the MYB44 promoter confirm that VIP1 binding to VREs is enhanced under conditions of MPK3 pathway stimulation. These results provide molecular insight into the cellular mechanism of target gene regulation by the MPK3 pathway.
12
Iengar P, Joshi NV. Identification of putative regulatory motifs in the upstream regions of co-expressed functional groups of genes in Plasmodium falciparum. BMC Genomics 2009; 10:18. [PMID: 19144114 PMCID: PMC2662883 DOI: 10.1186/1471-2164-10-18] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Received: 06/13/2008] [Accepted: 01/13/2009] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Regulation of gene expression in Plasmodium falciparum (Pf) remains poorly understood. While over half the genes are estimated to be regulated at the transcriptional level, few regulatory motifs and transcription regulators have been found. RESULTS The study seeks to identify putative regulatory motifs in the upstream regions of 13 functional groups of genes expressed in the intraerythrocytic developmental cycle of Pf. Three motif-discovery programs were used for the purpose, and motifs were searched for only on the gene coding strand. Four motifs -- the 'G-rich', the 'C-rich', the 'TGTG' and the 'CACA' motifs -- were identified, and zero to all four of these occur in the 13 sets of upstream regions. The 'CACA motif' was absent in functional groups expressed during the ring to early trophozoite transition. For functional groups expressed in each transition, the motifs tended to be similar. Upstream motifs in some functional groups showed 'positional conservation' by occurring at similar positions relative to the translational start site (TLS); this increases their significance as regulatory motifs. In the ribonucleotide synthesis, mitochondrial, proteasome and organellar translation machinery genes, G-rich, C-rich, CACA and TGTG motifs, respectively, occur with striking positional conservation. In the organellar translation machinery group, G-rich motifs occur close to the TLS. The same motifs were sometimes identified for multiple functional groups; differences in location and abundance of the motifs appear to ensure different modes of action. CONCLUSION The identification of positionally conserved over-represented upstream motifs throws light on putative regulatory elements for transcription in Pf.
Affiliation(s)
- Prathima Iengar
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India.
13
Huang J, Yang J, Wang G, Yu Q, Yang Z. Prediction of anther-expressed gene regulation in Arabidopsis. Sci Bull (Beijing) 2008. [DOI: 10.1007/s11434-008-0381-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
14
Thum KE, Shin MJ, Gutiérrez RA, Mukherjee I, Katari MS, Nero D, Shasha D, Coruzzi GM. An integrated genetic, genomic and systems approach defines gene networks regulated by the interaction of light and carbon signaling pathways in Arabidopsis. BMC SYSTEMS BIOLOGY 2008; 2:31. [PMID: 18387196 PMCID: PMC2335094 DOI: 10.1186/1752-0509-2-31] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.0] [Received: 05/16/2007] [Accepted: 04/04/2008] [Indexed: 11/10/2022]
Abstract
BACKGROUND Light and carbon are two important interacting signals affecting plant growth and development. The mechanism(s) and/or genes involved in sensing and/or mediating the signaling pathways involving these interactions are unknown. This study integrates genetic, genomic and systems approaches to identify a genetically perturbed gene network that is regulated by the interaction of carbon and light signaling in Arabidopsis. RESULTS Carbon and light insensitive (cli) mutants were isolated. Microarray data from cli186 is analyzed to identify the genes, biological processes and gene networks affected by the integration of light and carbon pathways. Analysis of this data reveals 966 genes regulated by light and/or carbon signaling in wild-type. In cli186, 216 of these light/carbon regulated genes are misregulated in response to light and/or carbon treatments where 78% are misregulated in response to light and carbon interactions. Analysis of the gene lists show that genes in the biological processes "energy" and "metabolism" are over-represented among the 966 genes regulated by carbon and/or light in wild-type, and the 216 misregulated genes in cli186. To understand connections among carbon and/or light regulated genes in wild-type and the misregulated genes in cli186, the microarray data is interpreted in the context of metabolic and regulatory networks. The network created from the 966 light/carbon regulated genes in wild-type, reveals that cli186 is affected in the light and/or carbon regulation of a network of 60 connected genes, including six transcription factors. One transcription factor, HAT22 appears to be a regulatory "hub" in the cli186 network as it shows regulatory connections linking a metabolic network of genes involved in "amino acid metabolism", "C-compound/carbohydrate metabolism" and "glycolysis/gluconeogenesis". 
CONCLUSION The global misregulation of gene networks controlled by light and carbon signaling in cli186 indicates that it represents one of the first Arabidopsis mutants isolated that is specifically disrupted in the integration of both carbon and light signals to control the regulation of metabolic, developmental and regulatory genes. The network analysis of misregulated genes suggests that CLI186 acts to integrate light and carbon signaling interactions and is a master regulator connecting the regulation of a host of downstream metabolic and regulatory processes.
Affiliation(s)
- Karen E Thum
- Department of Biology, New York University, New York, NY, 10003, USA
- Michael J Shin
- Department of Biology, New York University, New York, NY, 10003, USA
- Department of Biology, Messiah College, Grantham, PA, 17027, USA
- Rodrigo A Gutiérrez
- Department of Biology, New York University, New York, NY, 10003, USA
- Departamento de Genética Molecular y Microbiología, Pontificia Universidad Católica de Chile. Alameda 340. 8331010. Santiago, Chile
- Indrani Mukherjee
- Department of Biology, New York University, New York, NY, 10003, USA
- Manpreet S Katari
- Department of Biology, New York University, New York, NY, 10003, USA
- Damion Nero
- Department of Biology, New York University, New York, NY, 10003, USA
- Dennis Shasha
- Courant Institute of Mathematical Sciences, New York University, New York, NY, 10003, USA
- Gloria M Coruzzi
- Department of Biology, New York University, New York, NY, 10003, USA
15
Moroni E, Caselle M, Fogolari F. Identification of DNA-binding protein target sequences by physical effective energy functions: free energy analysis of lambda repressor-DNA complexes. BMC STRUCTURAL BIOLOGY 2007; 7:61. [PMID: 17900341 PMCID: PMC2194778 DOI: 10.1186/1472-6807-7-61] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Received: 02/20/2007] [Accepted: 09/27/2007] [Indexed: 11/26/2022]
Abstract
Background Specific binding of proteins to DNA is one of the most common ways gene expression is controlled. Although general rules for the DNA-protein recognition can be derived, the ambiguous and complex nature of this mechanism precludes a simple recognition code, therefore the prediction of DNA target sequences is not straightforward. DNA-protein interactions can be studied using computational methods which can complement the current experimental methods and offer some advantages. In the present work we use physical effective potentials to evaluate the DNA-protein binding affinities for the λ repressor-DNA complex for which structural and thermodynamic experimental data are available. Results The binding free energy of two molecules can be expressed as the sum of an intermolecular energy (evaluated using a molecular mechanics forcefield), a solvation free energy term and an entropic term. Different solvation models are used including distance dependent dielectric constants, solvent accessible surface tension models and the Generalized Born model. The effect of conformational sampling by Molecular Dynamics simulations on the computed binding energy is assessed; results show that this effect is in general negative and the reproducibility of the experimental values decreases with the increase of simulation time considered. The free energy of binding for non-specific complexes, estimated using the best energetic model, agrees with earlier theoretical suggestions. As a result of these analyses, we propose a protocol for the prediction of DNA-binding target sequences. The possibility of searching regulatory elements within the bacteriophage λ genome using this protocol is explored. Our analysis shows good prediction capabilities, even in absence of any thermodynamic data and information on the naturally recognized sequence.
Conclusion This study supports the conclusion that physics-based methods can offer a completely complementary methodology to sequence-based methods for the identification of DNA-binding protein target sequences.
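The three-term decomposition described in the Results can be written compactly. The symbols below are assumptions chosen for illustration, since the abstract names the terms only in words: $E_{\mathrm{inter}}$ is the force-field intermolecular energy, $\Delta G_{\mathrm{solv}}$ the solvation free energy change, and $-T\,\Delta S$ the entropic term.

```latex
\Delta G_{\mathrm{bind}} \approx E_{\mathrm{inter}} + \Delta G_{\mathrm{solv}} - T\,\Delta S
```

The solvation models the abstract lists (distance-dependent dielectric, surface tension models, Generalized Born) would correspond to different ways of estimating the middle term.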
Affiliation(s)
- Elisabetta Moroni
- Dipartimento di Fisica Teorica, Università di Torino and INFN, Via P. Giuria 1, 10125 Torino, Italy
- Dipartimento di Fisica G. Occhialini, Università di Milano-Bicocca and INFN, Piazza delle Scienze 3, 20156 Milano, Italy
- Michele Caselle
- Dipartimento di Fisica Teorica, Università di Torino and INFN, Via P. Giuria 1, 10125 Torino, Italy
- Federico Fogolari
- Dipartimento di Scienze e Tecnologie Biomediche, Università di Udine, P.le Kolbe 4, 33100 Udine, Italy
16
Abed N, Bickle M, Mari B, Schapira M, Sanjuan-España R, Robbe Sermesant K, Moncorgé O, Mouradian-Garcia S, Barbry P, Rudkin BB, Fauvarque MO, Michaud-Soret I, Colas P. A comparative analysis of perturbations caused by a gene knock-out, a dominant negative allele, and a set of peptide aptamers. Mol Cell Proteomics 2007; 6:2110-21. [PMID: 17785351 DOI: 10.1074/mcp.m700105-mcp200] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Indexed: 11/06/2022] Open
Abstract
The study of protein function mostly relies on perturbing regulatory networks by acting upon protein expression levels or using transdominant negative agents. Here we used the Escherichia coli global transcription regulator Fur (ferric uptake regulator) as a case study to compare the perturbations exerted by a gene knock-out, the expression of a dominant negative allele of a gene, and the expression of peptide aptamers that bind a gene product. These three perturbations caused phenotypes that differed quantitatively and qualitatively from one another. The Fur peptide aptamers inhibited the activity of their target to various extents and reduced the virulence of a pathogenic E. coli strain in Drosophila. A genome-wide transcriptome analysis revealed that the "penetrance" of a peptide aptamer was comparable to that of a dominant negative allele but lower than the penetrance of the gene knock-out. Our work shows that comparative analysis of phenotypic and transcriptome responses to different types of perturbation can help decipher complex regulatory networks that control various biological processes.
Affiliation(s)
- Nadia Abed
- Differentiation and Cell Cycle Group, Laboratoire de Biologie Moléculaire de la Cellule, UMR 5239 CNRS/ENS Lyon, Université Lyon 1, Ecole Normale Supérieure de Lyon, IFR 128 BioSciences Lyon-Gerland, 46 allée d'Italie, 69364 Lyon cedex 07, France
17
Chawade A, Bräutigam M, Lindlöf A, Olsson O, Olsson B. Putative cold acclimation pathways in Arabidopsis thaliana identified by a combined analysis of mRNA co-expression patterns, promoter motifs and transcription factors. BMC Genomics 2007; 8:304. [PMID: 17764576 PMCID: PMC2001198 DOI: 10.1186/1471-2164-8-304] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.4] [Received: 03/08/2007] [Accepted: 09/02/2007] [Indexed: 01/08/2023] Open
Abstract
Background With the advent of microarray technology, it has become feasible to identify virtually all genes in an organism that are induced by developmental or environmental changes. However, relying solely on gene expression data may be of limited value if the aim is to infer the underlying genetic networks. Development of computational methods to combine microarray data with other information sources is therefore necessary. Here we describe one such method. Results By means of our method, previously published Arabidopsis microarray data from cold-acclimated plants at six different time points, promoter motif sequence data extracted from ~24,000 Arabidopsis promoters and known transcription factor binding sites were combined to construct a putative genetic regulatory interaction network. The inferred network includes both previously characterised and hitherto undescribed regulatory interactions between transcription factor (TF) genes and genes that encode other TFs or other proteins. Part of the obtained transcription factor regulatory network is presented here. More detailed information is available in the additional files. Conclusion The rule-based method described here can be used to infer genetic networks by combining data from microarrays, promoter sequences and known promoter binding sites. This method should in principle be applicable to any biological system. We tested the method on the cold acclimation process in Arabidopsis and could identify a more complex putative genetic regulatory network than previously described. However, it should be noted that information on specific binding sites for individual TFs was in most cases not available. Thus, gene targets were predicted for entire TF gene families. In addition, the networks were built solely by a bioinformatics approach and experimental verification will be necessary for their final validation. On the other hand, since our method highlights putative novel interactions, more directed experiments could now be performed.
Affiliation(s)
- Aakash Chawade
- Department of Cell and Molecular Biology, Göteborg University, Box 462, 403 20 Göteborg, Sweden
- School of Humanities and Informatics, University of Skövde, Box 408, 541 28 Skövde, Sweden
- Marcus Bräutigam
- Department of Cell and Molecular Biology, Göteborg University, Box 462, 403 20 Göteborg, Sweden
- Angelica Lindlöf
- School of Humanities and Informatics, University of Skövde, Box 408, 541 28 Skövde, Sweden
- Olof Olsson
- Department of Cell and Molecular Biology, Göteborg University, Box 462, 403 20 Göteborg, Sweden
- Björn Olsson
- School of Humanities and Informatics, University of Skövde, Box 408, 541 28 Skövde, Sweden
|
18
|
Identification of candidate regulatory sequences in mammalian 3' UTRs by statistical analysis of oligonucleotide distributions. BMC Bioinformatics 2007; 8:174. [PMID: 17524134 PMCID: PMC1904458 DOI: 10.1186/1471-2105-8-174] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Received: 10/19/2006] [Accepted: 05/24/2007] [Indexed: 12/11/2022] Open
Abstract
Background 3' untranslated regions (3' UTRs) contain binding sites for many regulatory elements, and in particular for microRNAs (miRNAs). The importance of miRNA-mediated post-transcriptional regulation has become increasingly clear in the last few years. Results We propose two complementary approaches to the statistical analysis of oligonucleotide frequencies in mammalian 3' UTRs aimed at the identification of candidate binding sites for regulatory elements. The first method is based on the identification of sets of genes characterized by evolutionarily conserved overrepresentation of an oligonucleotide. The second method is based on the identification of oligonucleotides showing statistically significant strand asymmetry in their distribution in 3' UTRs. Conclusion Both methods are able to identify many previously known binding sites located in 3' UTRs, and in particular seed regions of known miRNAs. Many new candidates are proposed for experimental verification.
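The strand-asymmetry idea in this abstract can be illustrated with a minimal Python sketch. The abstract does not state which statistic the authors used, so the exact two-sided binomial test below is an assumption, as are the function names and the DNA-alphabet representation of the UTR sequences.

```python
from collections import Counter
from math import comb


def revcomp(seq):
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def kmer_counts(seqs, k):
    """Count every overlapping k-mer across a collection of sequences."""
    counts = Counter()
    for s in seqs:
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    return counts


def strand_asymmetry_pvalue(n_fwd, n_rev):
    """Exact two-sided binomial test of n_fwd vs n_rev under p = 0.5.

    Assumption: occurrences on either strand are equally likely under
    the null; this is an illustrative choice, not the paper's test.
    """
    n = n_fwd + n_rev
    if n == 0:
        return 1.0
    lo = min(n_fwd, n_rev)
    tail = sum(comb(n, i) for i in range(lo + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def asymmetry(seqs, kmer):
    """Return (forward count, reverse-complement count, p-value) for one k-mer."""
    counts = kmer_counts(seqs, len(kmer))
    n_fwd, n_rev = counts[kmer], counts[revcomp(kmer)]
    return n_fwd, n_rev, strand_asymmetry_pvalue(n_fwd, n_rev)
```

Oligonucleotides that equal their own reverse complement carry no strand information and would need to be skipped, and testing many k-mers at once would call for a multiple-testing correction.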
|
19
|
The identification of functional motifs in temporal gene expression analysis. Evol Bioinform Online 2007; 1:84-96. [PMID: 19325856 PMCID: PMC2658870] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022] Open
Abstract
The identification of transcription factor binding sites is essential to the understanding of the regulation of gene expression and the reconstruction of genetic regulatory networks. The in silico identification of cis-regulatory motifs is challenging due to sequence variability and the lack of sufficient data to generate consensus motifs that are of quantitative or even qualitative predictive value. To determine functional motifs in gene expression, we propose a strategy that uses the false discovery rate (FDR) and estimated motif effects to evaluate a combinatorial analysis of motif candidates and temporal gene expression data. The method decreases the number of predicted motifs, which can then be confirmed by genetic analysis. To assess the method, we used simulated motif/expression data to evaluate parameters. We applied this approach to experimental data for a group of iron-responsive genes in Salmonella typhimurium 14028S. The method identified known and potentially new ferric-uptake regulator (Fur) binding sites. In addition, we identified uncharacterized functional motif candidates that correlated with specific patterns of expression. SAS code for the simulation and analysis of gene expression data is available from the first author upon request.
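As an illustration of FDR control over a set of motif candidates, here is a minimal Python sketch of the Benjamini-Hochberg step-up procedure. The abstract does not say which FDR procedure was used (the authors' code is in SAS), so the choice of Benjamini-Hochberg and the function name are assumptions.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return a list of booleans, True where the hypothesis is rejected.

    Benjamini-Hochberg step-up: find the largest rank k (1-based, in
    ascending p-value order) with p_(k) <= (k / m) * alpha, then reject
    the k hypotheses with the smallest p-values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            cutoff = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff:
            rejected[idx] = True
    return rejected


# Hypothetical p-values for four motif candidates.
motif_pvalues = [0.01, 0.04, 0.03, 0.5]
keep = benjamini_hochberg(motif_pvalues, alpha=0.05)
```

Only the candidates flagged `True` would be carried forward for experimental confirmation, which is how such a filter reduces the number of predicted motifs.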
|
20
|
Wu G, Nie L, Zhang W. Relation between mRNA expression and sequence information in Desulfovibrio vulgaris: combinatorial contributions of upstream regulatory motifs and coding sequence features to variations in mRNA abundance. Biochem Biophys Res Commun 2006; 344:114-21. [PMID: 16603130 DOI: 10.1016/j.bbrc.2006.03.124] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 03/14/2006] [Accepted: 03/21/2006] [Indexed: 11/29/2022]
Abstract
The context-dependent expression of genes is at the core of biological activity, and significant attention has been given to identifying the various factors that contribute to gene expression at the genomic scale. However, so far this type of analysis has focused either on the relation between mRNA expression and non-coding sequence features such as upstream regulatory motifs, or on the correlation between mRNA abundance and non-random features in coding sequences (e.g., codon usage and amino acid usage). In this study, multiple regression analyses of the mRNA abundance and all sequence information in Desulfovibrio vulgaris were performed, with the goal of investigating how much coding and non-coding sequence features contribute to the variations in mRNA expression, and in what manner they act together. Using the AlignACE program, 442 over-represented motifs were identified from the upstream 100 bp region of 293 genes located in the known regulons. Regression of mRNA expression data against the measures of coding and non-coding sequence features indicated that 54.1% of the variations in mRNA abundance can be explained by the presence of upstream motifs, while coding sequences alone contribute to 29.7% of the variations in mRNA abundance. Interestingly, most of the contribution from coding sequences overlaps with that from upstream motifs; thereby a total of 60.3% of the variations in mRNA abundance can be explained when coding and non-coding information were both included. This result demonstrates that upstream regulatory motifs and coding sequence information contribute to the overall mRNA expression in a combinatorial rather than an additive manner.
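The "combinatorial rather than additive" conclusion can be checked with simple arithmetic on the three percentages reported above. The Python sketch below splits the combined explained variance into unique and shared parts; this decomposition is a standard way to read such numbers and is an assumption here, not a calculation reported in the abstract.

```python
def partition_explained_variance(r2_a, r2_b, r2_both):
    """Split explained variance (in percent) into unique and shared parts.

    r2_a, r2_b: variance explained by each predictor set alone.
    r2_both:    variance explained when both sets are in the model.
    """
    return {
        "unique_a": r2_both - r2_b,        # gained by adding set A to set B
        "unique_b": r2_both - r2_a,        # gained by adding set B to set A
        "shared": r2_a + r2_b - r2_both,   # explained by either set
    }


# Values from the abstract: upstream motifs 54.1%, coding features 29.7%,
# both together 60.3%.
parts = partition_explained_variance(54.1, 29.7, 60.3)
rounded = {name: round(value, 1) for name, value in parts.items()}
```

Under this reading, about 23.5 of the 29.7 percentage points attributed to coding sequences overlap with the motif contribution, which is why the combined figure (60.3%) falls well short of the naive sum (83.8%).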
Affiliation(s)
- Gang Wu
- Department of Biological Sciences, University of Maryland at Baltimore County, 1000 Hilltop Circle, Baltimore, MD 21250, USA
|
21
|
Sandve GK, Drabløs F. A survey of motif discovery methods in an integrated framework. Biol Direct 2006; 1:11. [PMID: 16600018 PMCID: PMC1479319 DOI: 10.1186/1745-6150-1-11] [Citation(s) in RCA: 109] [Impact Index Per Article: 6.1] [Received: 03/29/2006] [Accepted: 04/06/2006] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND There has been a growing interest in computational discovery of regulatory elements, and a multitude of motif discovery methods have been proposed. Computational motif discovery has been used with some success in simple organisms like yeast. However, as we move to higher organisms with more complex genomes, more sensitive methods are needed. Several recent methods try to integrate additional sources of information, including microarray experiments (gene expression and ChIP-chip). There is also a growing awareness that regulatory elements work in combination, and that this combinatorial behavior must be modeled for successful motif discovery. However, the multitude of methods and approaches makes it difficult to get a good understanding of the current status of the field. RESULTS This paper presents a survey of methods for motif discovery in DNA, based on a structured and well-defined framework that integrates all relevant elements. Existing methods are discussed according to this framework. CONCLUSION The survey shows that although no single method takes all relevant elements into consideration, a very large number of different models treating the various elements separately have been tried. Very often the choices that have been made are not explicitly stated, making it difficult to compare different implementations. Also, the tests that have been used are often not comparable. Therefore, a stringent framework and improved test methods are needed to evaluate the different approaches in order to conclude which ones are most promising. REVIEWERS This article was reviewed by Eugene V. Koonin, Philipp Bucher (nominated by Mikhail Gelfand) and Frank Eisenhaber.
Affiliation(s)
- Geir Kjetil Sandve
- Department of Computer and Information Science, NTNU – Norwegian University of Science and Technology, N-7052, Trondheim, Norway
- Finn Drabløs
- Department of Cancer Research and Molecular Medicine, NTNU – Norwegian University of Science and Technology, N-7006, Trondheim, Norway
|
22
|
Abstract
Among more than 120 genes that are now known to regulate mammalian pigmentation, one of the key genes is MC1R, which encodes the melanocortin 1 receptor, a seven transmembrane G protein-coupled receptor expressed on the surface of melanocytes. Since the monoexonic sequence of the gene was cloned and characterized more than a decade ago, tremendous efforts have been dedicated to the extensive genotyping of mostly red-haired populations all around the world, thus providing allelic variants that may or may not account for melanoma susceptibility in the presence or absence of ultraviolet (UV) exposure. Soluble factors, such as proopiomelanocortin (POMC) derivatives, agouti signal protein (ASP) and others, regulate MC1R expression, leading to improved photoprotection via increased eumelanin synthesis or in contrast, inducing the switch to pheomelanin. However, there is an obvious lack of knowledge regarding the numerous and complex regulatory mechanisms that govern the expression of MC1R at the intra-cellular level, from gene transcription in response to an external stimulus to the expression of the mature receptor on the melanocyte surface.
Affiliation(s)
- Francois Rouzaud
- Laboratory of Cell Biology, National Cancer Institute, National Institutes of Health, Building 37, Room 2132, Bethesda, MD 20892, USA
|
23
|
Mohanty B, Krishnan SPT, Swarup S, Bajic VB. Detection and preliminary analysis of motifs in promoters of anaerobically induced genes of different plant species. Annals of Botany 2005; 96:669-81. [PMID: 16027132 PMCID: PMC4247034 DOI: 10.1093/aob/mci219] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.6] [Received: 10/29/2004] [Revised: 12/16/2004] [Accepted: 01/31/2005] [Indexed: 05/03/2023]
Abstract
BACKGROUND AND AIMS Plants can suffer from oxygen limitation during flooding or more complete submergence and may therefore switch from Krebs cycle respiration to fermentation in association with the expression of anaerobically inducible genes coding for enzymes involved in glycolysis and fermentation. The aim of this study was to clarify mechanisms of transcriptional regulation of these anaerobic genes by identifying motifs shared by their promoter regions. METHODS Statistically significant motifs were detected by an in silico method from 13 promoters of anaerobic genes. The selected motifs were common to the majority of analysed promoters. Their significance was evaluated by searching for their presence in transcription factor-binding site databases (TRANSFAC, PlantCARE and PLACE). Using several negative control data sets, it was tested whether the motifs found were specific to the anaerobic group. KEY RESULTS Previously, anaerobic response elements have been identified in maize (Zea mays) and arabidopsis (Arabidopsis thaliana) genes. Known functional motifs were detected, such as GT and GC motifs, but also other motifs shared by most of the genes examined. Five motifs detected have not been found in plants hitherto but are present in the promoters of animal genes with various functions. The consensus sequences of these novel motifs are 5'-AAACAAA-3', 5'-AGCAGC-3', 5'-TCATCAC-3', 5'-GTTT(A/C/T)GCAA-3' and 5'-TTCCCTGTT-3'. CONCLUSIONS It is believed that the promoter motifs identified could be functional by conferring anaerobic sensitivity to the genes that possess them. This proposal now requires experimental verification.
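The five consensus sequences listed in this abstract can be searched for in a promoter sequence with ordinary regular expressions. The Python sketch below is only an illustration: the helper names are hypothetical, the degenerate position `(A/C/T)` is translated to the character class `[ACT]`, and only the forward strand is scanned.

```python
import re

# Consensus sequences as written in the abstract (5' to 3').
CONSENSUS = ["AAACAAA", "AGCAGC", "TCATCAC", "GTTT(A/C/T)GCAA", "TTCCCTGTT"]


def consensus_to_regex(consensus):
    """Turn a degenerate position such as "(A/C/T)" into the class "[ACT]"."""
    pattern = re.sub(
        r"\(([ACGT/]+)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        consensus,
    )
    return re.compile(pattern)


def find_motif(consensus, sequence):
    """Return 0-based start positions of all matches, overlapping ones included."""
    rx = consensus_to_regex(consensus)
    # A zero-width lookahead lets finditer report overlapping occurrences.
    return [m.start() for m in re.finditer(f"(?=({rx.pattern}))", sequence)]


def scan_promoter(sequence):
    """Map each consensus motif to its match positions in one promoter."""
    return {c: find_motif(c, sequence) for c in CONSENSUS}
```

A match found this way says nothing about function; as the abstract notes, that requires experimental verification.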
Affiliation(s)
- Bijayalaxmi Mohanty
- Knowledge Extraction Laboratory, Institute for Infocomm Research, 21 Heng Mui Keng Terrace, Singapore 119613.
|
24
|
Bortoluzzi S, Coppe A, Bisognin A, Pizzi C, Danieli GA. A multistep bioinformatic approach detects putative regulatory elements in gene promoters. BMC Bioinformatics 2005; 6:121. [PMID: 15904489 PMCID: PMC1173081 DOI: 10.1186/1471-2105-6-121] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.8] [Received: 11/12/2004] [Accepted: 05/18/2005] [Indexed: 12/03/2022] Open
Abstract
BACKGROUND Searching for approximate patterns in large promoter sequences frequently produces an exceedingly high number of results. Our aim was to exploit biological knowledge for definition of a sheltered search space and of appropriate search parameters, in order to develop a method for identification of a tractable number of sequence motifs. RESULTS Novel software (COOP) was developed for extraction of sequence motifs, based on clustering of exact or approximate patterns according to the frequency of their overlapping occurrences. Genomic sequences of 1 kb upstream of 91 genes differentially expressed and/or encoding proteins with relevant function in adult human retina were analyzed. Methodology and results were tested by analysing 1,000 groups of putatively unrelated sequences, randomly selected among 17,156 human gene promoters. When applied to a sample of human promoters, the method identified 279 putative motifs frequently occurring in retina promoter sequences. Most of them are localized in the proximal portion of promoters, are less variable in the central region than in the lateral regions, and are similar to known regulatory sequences. COOP software and reference manual are freely available upon request from the authors. CONCLUSION The approach described in this paper seems effective for identifying a tractable number of sequence motifs with a putative regulatory role.
Affiliation(s)
- Stefania Bortoluzzi
- Department of Biology, University of Padova – Via Bassi 58/B, 35131, Padova, Italy
- Alessandro Coppe
- Department of Biology, University of Padova – Via Bassi 58/B, 35131, Padova, Italy
- Andrea Bisognin
- Department of Biology, University of Padova – Via Bassi 58/B, 35131, Padova, Italy
- Cinzia Pizzi
- Department of Information Engineering, University of Padova – Via Gradenigo 6/B, 35131, Padova, Italy
- Gian Antonio Danieli
- Department of Biology, University of Padova – Via Bassi 58/B, 35131, Padova, Italy
|
25
|
Corà D, Herrmann C, Dieterich C, Di Cunto F, Provero P, Caselle M. Ab initio identification of putative human transcription factor binding sites by comparative genomics. BMC Bioinformatics 2005; 6:110. [PMID: 15865625 PMCID: PMC1097714 DOI: 10.1186/1471-2105-6-110] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.3] [Received: 12/02/2004] [Accepted: 05/02/2005] [Indexed: 11/18/2022] Open
Abstract
Background Understanding transcriptional regulation of gene expression is one of the greatest challenges of modern molecular biology. A central role in this mechanism is played by transcription factors, which typically bind to specific, short DNA sequence motifs usually located in the upstream region of the regulated genes. We discuss here a simple and powerful approach for the ab initio identification of these cis-regulatory motifs. The method we present integrates several elements: human-mouse comparison, statistical analysis of genomic sequences and the concept of coregulation. We apply it to a complete scan of the human genome. Results By using the catalogue of conserved upstream sequences collected in the CORG database we construct sets of genes sharing the same overrepresented motif (short DNA sequence) in their upstream regions both in human and in mouse. We perform this construction for all possible motifs from 5 to 8 nucleotides in length and then filter the resulting sets looking for two types of evidence of coregulation: first, we analyze the Gene Ontology annotation of the genes in the set, searching for statistically significant common annotations; second, we analyze the expression profiles of the genes in the set as measured by microarray experiments, searching for evidence of coexpression. The sets which pass one or both filters are conjectured to contain a significant fraction of coregulated genes, and the upstream motifs characterizing the sets are thus good candidates to be the binding sites of the TFs involved in such regulation. In this way we find various known motifs and also some new candidate binding sites. Conclusion We have discussed a new integrated algorithm for the "ab initio" identification of transcription factor binding sites in the human genome. The method is based on three ingredients: comparative genomics, overrepresentation, and different types of coregulation. The method is applied to a full scan of the human genome, giving satisfactory results.
Affiliation(s)
- D Corà
- Dipartimento di Fisica Teorica dell'Università degli Studi di Torino and INFN, Via P. Giuria 1 – I 10125 Torino, Italy
- C Herrmann
- LGPD-IBDM, Université de la Méditerranée / CNRS, Campus de Luminy Case 907 – F-13288 Marseille Cedex 9, France
- C Dieterich
- Max-Planck-Institute for Molecular Genetics, Ihnestrasse 73 – D-14195 Berlin, Germany
- F Di Cunto
- Dipartimento di Genetica, Biologia e Biochimica dell'Università di Torino, Via Santena 5 bis – I-10126 Torino, Italy
- P Provero
- Dipartimento di Genetica, Biologia e Biochimica dell'Università di Torino, Via Santena 5 bis – I-10126 Torino, Italy
- M Caselle
- Dipartimento di Fisica Teorica dell'Università degli Studi di Torino and INFN, Via P. Giuria 1 – I 10125 Torino, Italy
|
26
|
Corà D, Di Cunto F, Provero P, Silengo L, Caselle M. Computational identification of transcription factor binding sites by functional analysis of sets of genes sharing overrepresented upstream motifs. BMC Bioinformatics 2004; 5:57. [PMID: 15137914 PMCID: PMC449910 DOI: 10.1186/1471-2105-5-57] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.3] [Received: 11/11/2003] [Accepted: 05/11/2004] [Indexed: 11/30/2022] Open
Abstract
Background Transcriptional regulation is a key mechanism in the functioning of the cell, and is mostly effected through transcription factors binding to specific recognition motifs located upstream of the coding region of the regulated gene. The computational identification of such motifs is made easier by the fact that they often appear several times in the upstream region of the regulated genes, so that the number of occurrences of relevant motifs is often significantly larger than expected by pure chance. Results To exploit this fact, we construct sets of genes characterized by the statistical overrepresentation of a certain motif in their upstream regions. Then we study the functional characterization of these sets by analyzing their annotation to Gene Ontology terms. For the sets showing a statistically significant specific functional characterization, we conjecture that the upstream motif characterizing the set is a binding site for a transcription factor involved in the regulation of the genes in the set. Conclusions The method we propose is able to identify many known binding sites in S. cerevisiae and new candidate targets of regulation by known transcription factors. Its application to less well studied organisms is likely to be valuable in the exploration of their regulatory interaction network.
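The notion of a motif occurring "significantly more often than expected by pure chance" can be sketched in Python as a binomial upper-tail probability. The abstract does not give the authors' exact statistic, so the binomial model, the treatment of overlapping positions as independent trials, and the function names are all assumptions.

```python
from math import comb


def count_occurrences(motif, seq):
    """Count overlapping occurrences of motif in seq."""
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == motif)


def binomial_upper_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


def overrepresentation_pvalue(motif, upstream, background_freq):
    """Probability of seeing at least the observed count by chance.

    background_freq is the assumed per-position probability of the motif,
    e.g. its frequency across all upstream regions of the genome.
    """
    n = len(upstream) - len(motif) + 1  # number of candidate start positions
    if n <= 0:
        return 1.0
    k = count_occurrences(motif, upstream)
    return binomial_upper_tail(n, k, background_freq)
```

Genes whose upstream region yields a small p-value for a given motif would form the set for that motif; the Gene Ontology analysis described in the abstract is a separate, later step.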
Affiliation(s)
- Davide Corà
- Dipartimento di Fisica Teorica, Università di Torino, and INFN, sezione di Torino, Italy
- Ferdinando Di Cunto
- Dipartimento di Genetica, Biologia e Biochimica, Università di Torino, Torino, Italy
- Lorenzo Silengo
- Dipartimento di Genetica, Biologia e Biochimica, Università di Torino, Torino, Italy
- Michele Caselle
- Dipartimento di Fisica Teorica, Università di Torino, and INFN, sezione di Torino, Italy
|
27
|
Goutsias J, Kim S. A nonlinear discrete dynamical model for transcriptional regulation: construction and properties. Biophys J 2004; 86:1922-45. [PMID: 15041638 PMCID: PMC1304049 DOI: 10.1016/s0006-3495(04)74257-5] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.1] [Received: 06/10/2003] [Accepted: 11/17/2003] [Indexed: 10/21/2022] Open
Abstract
Transcriptional regulation is a fundamental mechanism of living cells, which allows them to determine their actions and properties, by selectively choosing which proteins to express and by dynamically controlling the amounts of those proteins. In this article, we revisit the problem of mathematically modeling transcriptional regulation. First, we adopt a biologically motivated continuous model for gene transcription and mRNA translation, based on first-order rate equations, coupled with a set of nonlinear equations that model cis-regulation. Then, we view the processes of transcription and translation as being discrete, which, together with the need to use computational techniques for large-scale analysis and simulation, motivates us to model transcriptional regulation by means of a nonlinear discrete dynamical system. Classical arguments from chemical kinetics allow us to specify the nonlinearities underlying cis-regulation and to include both activators and repressors as well as the notion of regulatory modules in our formulation. We show that the steady-state behavior of the proposed discrete dynamical system is identical to that of the continuous model. We discuss several aspects of our model, related to homeostatic and epigenetic regulation as well as to Boolean networks, and elaborate on their significance. Simulations of transcriptional regulation of a hypothetical metabolic pathway illustrate several properties of our model, and demonstrate that a nonlinear discrete dynamical system may be effectively used to model transcriptional regulation in a biologically relevant way.
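The claim that a discrete-time system can share its steady state with a continuous first-order model can be illustrated with a generic Python sketch. This is not the authors' construction: it is a plain forward-Euler discretization of first-order rate equations for one gene, with the nonlinear cis-regulation term replaced by a constant `activity` between 0 and 1, and all parameter names are hypothetical.

```python
def simulate(k_tx, d_m, k_tl, d_p, activity, dt=0.1, steps=2000):
    """Iterate a discrete-time model of mRNA (m) and protein (p) levels.

    Continuous model assumed here:
        dm/dt = k_tx * activity - d_m * m
        dp/dt = k_tl * m        - d_p * p
    Forward-Euler step of size dt gives the discrete update below.
    """
    m = p = 0.0
    for _ in range(steps):
        m, p = (
            m + dt * (k_tx * activity - d_m * m),
            p + dt * (k_tl * m - d_p * p),
        )
    return m, p


def steady_state(k_tx, d_m, k_tl, d_p, activity):
    """Fixed point of the continuous model, obtained by setting derivatives to zero."""
    m_star = k_tx * activity / d_m
    return m_star, k_tl * m_star / d_p


m_end, p_end = simulate(2.0, 0.5, 1.0, 0.25, 1.0)
m_star, p_star = steady_state(2.0, 0.5, 1.0, 0.25, 1.0)
```

For this simple update the fixed point does not depend on `dt`, so the discrete iteration settles at the same levels as the continuous equations, provided `dt` is small enough for the iteration to be stable.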
Affiliation(s)
- John Goutsias
- The Whitaker Biomedical Engineering Institute, The Johns Hopkins University, Baltimore, Maryland 21218, USA.
|