1
MacKenzie TMG, Cisneros R, Maynard RD, Snyder MP. Reverse-ChIP Techniques for Identifying Locus-Specific Proteomes: A Key Tool in Unlocking the Cancer Regulome. Cells 2023; 12:1860. [PMID: 37508524 PMCID: PMC10377898 DOI: 10.3390/cells12141860] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2023] [Revised: 06/30/2023] [Accepted: 07/11/2023] [Indexed: 07/30/2023] Open
Abstract
A phenotypic hallmark of cancer is aberrant transcriptional regulation. Transcriptional regulation is controlled by a complicated array of molecular factors, including the presence of transcription factors, the deposition of histone post-translational modifications, and long-range DNA interactions. Determining the molecular identity and function of these various factors is necessary to understand specific aspects of cancer biology and reveal potential therapeutic targets. Regulation of the genome by specific factors is typically studied using chromatin immunoprecipitation followed by sequencing (ChIP-Seq) that identifies genome-wide binding interactions through the use of factor-specific antibodies. A long-standing goal in many laboratories has been the development of a 'reverse-ChIP' approach to identify unknown binding partners at loci of interest. A variety of strategies have been employed to enable the selective biochemical purification of sequence-defined chromatin regions, including single-copy loci, and the subsequent analytical detection of associated proteins. This review covers mass spectrometry techniques that enable quantitative proteomics before providing a survey of approaches toward the development of strategies for the purification of sequence-specific chromatin as a 'reverse-ChIP' technique. A fully realized reverse-ChIP technique holds great potential for identifying cancer-specific targets and the development of personalized therapeutic regimens.
Affiliation(s)
- Rocío Cisneros
  - Sarafan ChEM-H/IMA Postbaccalaureate Fellow in Target Discovery, Stanford University, Stanford, CA 94305, USA
- Rajan D Maynard
  - Genetics Department, Stanford University, Stanford, CA 94305, USA
- Michael P Snyder
  - Genetics Department, Stanford University, Stanford, CA 94305, USA
2
Bogomolov A, Filonov S, Chadaeva I, Rasskazov D, Khandaev B, Zolotareva K, Kazachek A, Oshchepkov D, Ivanisenko VA, Demenkov P, Podkolodnyy N, Kondratyuk E, Ponomarenko P, Podkolodnaya O, Mustafin Z, Savinkova L, Kolchanov N, Tverdokhleb N, Ponomarenko M. Candidate SNP Markers Significantly Altering the Affinity of TATA-Binding Protein for the Promoters of Human Hub Genes for Atherogenesis, Atherosclerosis and Atheroprotection. Int J Mol Sci 2023; 24:ijms24109010. [PMID: 37240358 DOI: 10.3390/ijms24109010] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/12/2023] [Revised: 05/13/2023] [Accepted: 05/17/2023] [Indexed: 05/28/2023] Open
Abstract
Atherosclerosis is a systemic disease in which focal lesions in arteries promote the build-up of lipoproteins and cholesterol they are transporting. The development of atheroma (atherogenesis) narrows blood vessels, reduces the blood supply and leads to cardiovascular diseases. According to the World Health Organization (WHO), cardiovascular diseases are the leading cause of death, which has been especially boosted since the COVID-19 pandemic. There is a variety of contributors to atherosclerosis, including lifestyle factors and genetic predisposition. Antioxidant diets and recreational exercises act as atheroprotectors and can retard atherogenesis. The search for molecular markers of atherogenesis and atheroprotection for predictive, preventive and personalized medicine appears to be the most promising direction for the study of atherosclerosis. In this work, we have analyzed 1068 human genes associated with atherogenesis, atherosclerosis and atheroprotection. The hub genes regulating these processes have been found to be the most ancient. In silico analysis of all 5112 SNPs in their promoters has revealed 330 candidate SNP markers, which statistically significantly change the affinity of the TATA-binding protein (TBP) for these promoters. These molecular markers have made us confident that natural selection acts against underexpression of the hub genes for atherogenesis, atherosclerosis and atheroprotection. At the same time, upregulation of the one for atheroprotection promotes human health.
Affiliation(s)
- Anton Bogomolov
  - Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences (SB RAS), Novosibirsk 630090, Russia
- Sergey Filonov
  - Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences (SB RAS), Novosibirsk 630090, Russia
  - The Natural Sciences Department, Novosibirsk State University, Novosibirsk 630090, Russia
- Irina Chadaeva
  - Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences (SB RAS), Novosibirsk 630090, Russia
- Dmitry Rasskazov
  - Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences (SB RAS), Novosibirsk 630090, Russia
- Bato Khandaev
  - Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences (SB RAS), Novosibirsk 630090, Russia
  - The Natural Sciences Department, Novosibirsk State University, Novosibirsk 630090, Russia
- Karina Zolotareva
  - Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences (SB RAS), Novosibirsk 630090, Russia
  - The Natural Sciences Department, Novosibirsk State University, Novosibirsk 630090, Russia
- Anna Kazachek
  - Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences (SB RAS), Novosibirsk 630090, Russia
  - The Natural Sciences Department, Novosibirsk State University, Novosibirsk 630090, Russia
- Dmitry Oshchepkov
  - Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences (SB RAS), Novosibirsk 630090, Russia
- Vladimir A Ivanisenko
  - Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences (SB RAS), Novosibirsk 630090, Russia
- Pavel Demenkov
  - Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences (SB RAS), Novosibirsk 630090, Russia
- Nikolay Podkolodnyy
  - Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences (SB RAS), Novosibirsk 630090, Russia
  - Institute of Computational Mathematics and Mathematical Geophysics, Novosibirsk 630090, Russia
- Ekaterina Kondratyuk
  - Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences (SB RAS), Novosibirsk 630090, Russia
- Petr Ponomarenko
  - Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences (SB RAS), Novosibirsk 630090, Russia
- Olga Podkolodnaya
  - Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences (SB RAS), Novosibirsk 630090, Russia
- Zakhar Mustafin
  - Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences (SB RAS), Novosibirsk 630090, Russia
- Ludmila Savinkova
  - Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences (SB RAS), Novosibirsk 630090, Russia
- Nikolay Kolchanov
  - Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences (SB RAS), Novosibirsk 630090, Russia
- Natalya Tverdokhleb
  - Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences (SB RAS), Novosibirsk 630090, Russia
- Mikhail Ponomarenko
  - Institute of Cytology and Genetics, Siberian Branch of Russian Academy of Sciences (SB RAS), Novosibirsk 630090, Russia
3
Zhang MQ. A personal journey on cracking the genomic codes. QUANTITATIVE BIOLOGY 2021. [DOI: 10.15302/j-qb-021-0245] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
4
Evolution of Brain Active Gene Promoters in Human Lineage Towards the Increased Plasticity of Gene Regulation. Mol Neurobiol 2017; 55:1871-1904. [PMID: 28233272 DOI: 10.1007/s12035-017-0427-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 10/10/2016] [Accepted: 01/26/2017] [Indexed: 01/31/2023]
Abstract
Adaptability to a variety of environmental conditions is a prominent feature of Homo sapiens. We hypothesize that this feature can be explained by evolutionary changes in gene promoters active in the brain prefrontal cortex leading to a more flexible gene regulation network. The genotype-dependent range of gene expression can be broader in humans than in other higher primates. Thus, we searched for specific signatures of evolutionary changes in promoter architectures of multiple hominid genes, including the genes active in human cortical neurons that may indicate an increase of variability of gene expression rather than just changes in the level of expression, such as downregulation or upregulation of the genes. We performed a whole-genome search for genetic-based alterations that may impact gene regulation "flexibility" in a process of hominids evolution, such as (i) CpG dinucleotide content, (ii) predicted nucleosome-DNA dissociation constant, and (iii) predicted affinities for TATA-binding protein (TBP) in gene promoters. We tested all putative promoter regions across the human genome and especially gene promoters in active chromatin state in neurons of prefrontal cortex, the brain region critical for abstract thinking and social and behavioral adaptation. Our data imply that the origin of modern man has been associated with an increase of flexibility of promoter-driven gene regulation in brain. In contrast, after splitting from the ancestral lineages of H. sapiens, the evolution of ape species is characterized by reduced flexibility of gene promoter functioning, underlying reduced variability of the gene expression.
5
Candidate SNP Markers of Chronopathologies Are Predicted by a Significant Change in the Affinity of TATA-Binding Protein for Human Gene Promoters. BIOMED RESEARCH INTERNATIONAL 2016; 2016:8642703. [PMID: 27635400 PMCID: PMC5011241 DOI: 10.1155/2016/8642703] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 03/04/2016] [Revised: 06/25/2016] [Accepted: 06/28/2016] [Indexed: 01/14/2023]
Abstract
Variations in human genome (e.g., single nucleotide polymorphisms, SNPs) may be associated with hereditary diseases, their complications, comorbidities, and drug responses. Using Web service SNP_TATA_Comparator presented in our previous paper, here we analyzed immediate surroundings of known SNP markers of diseases and identified several candidate SNP markers that can significantly change the affinity of TATA-binding protein for human gene promoters, with circadian consequences. For example, rs572527200 may be related to asthma, where symptoms are circadian (worse at night), and rs367732974 may be associated with heart attacks that are characterized by a circadian preference (early morning). By the same method, we analyzed the 90 bp proximal promoter region of each protein-coding transcript of each human gene of the circadian clock core. This analysis yielded 53 candidate SNP markers, such as rs181985043 (susceptibility to acute Q fever in male patients), rs192518038 (higher risk of a heart attack in patients with diabetes), and rs374778785 (emphysema and lung cancer in smokers). If they are properly validated according to clinical standards, these candidate SNP markers may turn out to be useful for physicians (to select optimal treatment for each patient) and for the general population (to choose a lifestyle preventing possible circadian complications of diseases).
6
Wight A, Yang D, Ioshikhes I, Makrigiannis AP. Nucleosome Presence at AML-1 Binding Sites Inversely Correlates with Ly49 Expression: Revelations from an Informatics Analysis of Nucleosomes and Immune Cell Transcription Factors. PLoS Comput Biol 2016; 12:e1004894. [PMID: 27124577 PMCID: PMC4849748 DOI: 10.1371/journal.pcbi.1004894] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 12/23/2015] [Accepted: 03/31/2016] [Indexed: 12/28/2022] Open
Abstract
Beyond its role in genomic organization and compaction, the nucleosome is believed to participate in the regulation of gene transcription. Here, we report a computational method to evaluate the nucleosome sensitivity for a transcription factor over a given stretch of the genome. Sensitive factors are predicted to be those with binding sites preferentially contained within nucleosome boundaries and lacking 10 bp periodicity. Based on these criteria, the Acute Myeloid Leukemia-1a (AML-1a) transcription factor, a regulator of immune gene expression, was identified as potentially sensitive to nucleosomal regulation within the mouse Ly49 gene family. This result was confirmed in RMA, a cell line with natural expression of Ly49, using MNase-Seq to generate a nucleosome map of chromosome 6, where the Ly49 gene family is located. Analysis of this map revealed a specific depletion of nucleosomes at AML-1a binding sites in the expressed Ly49A when compared to the other, silent Ly49 genes. Our data suggest that nucleosome-based regulation contributes to the expression of Ly49 genes, and we propose that this method of predicting nucleosome sensitivity could aid in dissecting the regulatory role of nucleosomes in general. The nucleosome—a large protein complex with DNA wound around it—is the fundamental unit of genomic organization in the eukaryotic cell. More than just a DNA organizer, however, nucleosomes may control gene expression by interfering with the cell’s ability to access the wound-up DNA, as shown by recent research. In this report, we demonstrate a computational method for predicting which elements of the genome are sensitive to regulation by nucleosomes. As a proof-of-concept, we identify AML-1a binding sites—important sequences in DNA regulation—as being specifically nucleosome sensitive. We then show that AML-1a sites are specifically depleted of nucleosomes when a gene is expressed, indicating the ability for nucleosomes to suppress the expression of that gene. 
This finding confirms that nucleosomes are likely involved in genome regulation, and provides a method for predicting which areas of the genome are probably affected most by nucleosomes. This paper also highlights the usefulness of the Ly49 gene family in testing computer-derived genomic predictions, and is of interest to anyone studying how gene expression is regulated from cell to cell.
Affiliation(s)
- Andrew Wight
  - Department of Biochemistry, Microbiology and Immunology, University of Ottawa, Ottawa, Ontario, Canada
- Doo Yang
  - Department of Biochemistry, Microbiology and Immunology, University of Ottawa, Ottawa, Ontario, Canada
  - Institute of Systems Biology, University of Ottawa, Ottawa, Ontario, Canada
- Ilya Ioshikhes
  - Department of Biochemistry, Microbiology and Immunology, University of Ottawa, Ottawa, Ontario, Canada
  - Institute of Systems Biology, University of Ottawa, Ottawa, Ontario, Canada
  - * E-mail: (II); (APM)
- Andrew P. Makrigiannis
  - Department of Biochemistry, Microbiology and Immunology, University of Ottawa, Ottawa, Ontario, Canada
  - Department of Microbiology and Immunology, Dalhousie University, Halifax, Nova Scotia, Canada
  - * E-mail: (II); (APM)
7
Turnaev II, Rasskazov DA, Arkova OV, Ponomarenko MP, Ponomarenko PM, Savinkova LK, Kolchanov NA. Hypothetical SNP markers that significantly affect the affinity of the TATA-binding protein to VEGFA, ERBB2, IGF1R, FLT1, KDR, and MET oncogene promoters as chemotherapy targets. Mol Biol 2016. [DOI: 10.1134/s0026893316010209] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Indexed: 11/23/2022]
8
Periodic distribution of a putative nucleosome positioning motif in human, nonhuman primates, and archaea: mutual information analysis. Int J Genomics 2013; 2013:963956. [PMID: 23841049 PMCID: PMC3691935 DOI: 10.1155/2013/963956] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 02/13/2013] [Accepted: 04/29/2013] [Indexed: 12/12/2022] Open
Abstract
Recently, Trifonov's group proposed a 10-mer DNA motif YYYYYRRRRR as a solution of the long-standing problem of sequence-based nucleosome positioning. To test whether this generic decamer represents a biological meaningful signal, we compare the distribution of this motif in primates and Archaea, which are known to contain nucleosomes, and in Eubacteria, which do not possess nucleosomes. The distribution of the motif is analyzed by the mutual information function (MIF) with a shifted version of itself (MIF profile). We found common features in the patterns of this generic decamer on MIF profiles among primate species, and interestingly we found conspicuous but dissimilar MIF profiles for each Archaea tested. The overall MIF profiles for each chromosome in each primate species also follow a similar pattern. Trifonov's generic decamer may be a highly conserved motif for the nucleosome positioning, but we argue that this is not the only motif. The distribution of this generic decamer exhibits previously unidentified periodicities, which are associated to highly repetitive sequences in the genome. Alu repetitive elements contribute to the most fundamental structure of nucleosome positioning in higher Eukaryotes. In some regions of primate chromosomes, the distribution of the decamer shows symmetrical patterns including inverted repeats.
9
Zürcher E, Tavor-Deslex D, Lituiev D, Enkerli K, Tarr PT, Müller B. A robust and sensitive synthetic sensor to monitor the transcriptional output of the cytokinin signaling network in planta. PLANT PHYSIOLOGY 2013; 161:1066-75. [PMID: 23355633 PMCID: PMC3585579 DOI: 10.1104/pp.112.211763] [Citation(s) in RCA: 233] [Impact Index Per Article: 21.2] [Received: 11/27/2012] [Accepted: 01/24/2013] [Indexed: 05/17/2023]
Abstract
Cytokinins are classic plant hormones that orchestrate plant growth, development, and physiology. They affect gene expression in target cells by activating a multistep phosphorelay network. Type-B response regulators, acting as transcriptional activators, mediate the final step in the signaling cascade. Previously, we have introduced a synthetic reporter, Two Component signaling Sensor (TCS)::green fluorescent protein (GFP), which reflects the transcriptional activity of type-B response regulators. TCS::GFP was instrumental in uncovering roles of cytokinin and deepening our understanding of existing functions. However, TCS-mediated expression of reporters is weak in some developmental contexts where cytokinin signaling has a documented role, such as in the shoot apical meristem or in the vasculature of Arabidopsis (Arabidopsis thaliana). We also observed that GFP expression becomes rapidly silenced in TCS::GFP transgenic plants. Here, we present an improved version of the reporter, TCS new (TCSn), which, compared with TCS, is more sensitive to phosphorelay signaling in Arabidopsis and maize (Zea mays) cellular assays while retaining its specificity. Transgenic Arabidopsis TCSn::GFP plants exhibit strong and dynamic GFP expression patterns consistent with known cytokinin functions. In addition, GFP expression has been stable over generations, allowing for crosses with different genetic backgrounds. Thus, TCSn represents a significant improvement to report the transcriptional output profile of phosphorelay signaling networks in Arabidopsis, maize, and likely other plants that display common response regulator DNA-binding specificities.
10
Savinkova L, Drachkova I, Arshinova T, Ponomarenko P, Ponomarenko M, Kolchanov N. An experimental verification of the predicted effects of promoter TATA-box polymorphisms associated with human diseases on interactions between the TATA boxes and TATA-binding protein. PLoS One 2013; 8:e54626. [PMID: 23424617 PMCID: PMC3570547 DOI: 10.1371/journal.pone.0054626] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.7] [Received: 08/30/2012] [Accepted: 12/13/2012] [Indexed: 11/18/2022] Open
Abstract
Human genome sequencing has resulted in a great body of data, including a stunningly large number of single nucleotide polymorphisms (SNPs) with unknown phenotypic manifestations. Identification and comprehensive analysis of regulatory SNPs in human gene promoters will help quantify the effects of these SNPs on human health. Based on our experimental and computer-aided study of SNPs in TATA boxes and the use of literature data, we have derived an equation for TBP/TATA equilibrium binding in three successive steps: TATA-binding protein (TBP) sliding along DNA due to their nonspecific affinity for each other ↔ recognition of the TATA box ↔ stabilization of the TBP/TATA complex. Using this equation, we have analyzed TATA boxes containing SNPs associated with human diseases and made in silico predictions of changes in TBP/TATA affinity. An electrophoretic mobility shift assay (EMSA)-based experimental study performed under the most standardized conditions demonstrates that the experimentally measured values are highly correlated with the predicted values: the coefficient of linear correlation, r, was 0.822 at a significance level of α<10⁻⁷ for equilibrium K(D) values, (-ln K(D)), and 0.785 at a significance level of α<10⁻³ for changes in equilibrium K(D) (δ) due to SNPs in the TATA boxes (δ= -ln[K(D,TATAMut)]-(-ln[K(D,TATA)])). It has been demonstrated that the SNPs associated with increased risk of human diseases such as α-, β- and δ-thalassemia, myocardial infarction and thrombophlebitis, changes in immune response, amyotrophic lateral sclerosis, lung cancer and hemophilia B Leyden cause 2-4-fold changes in TBP/TATA affinity in most cases. The results obtained strongly suggest that the TBP/TATA equilibrium binding equation derived can be used for analysis of TATA-box sequences and identification of SNPs with a potential of being functionally important.
Affiliation(s)
- Ludmila Savinkova
  - Institute of Cytology and Genetics, Siberian Division, Russian Academy of Sciences, Novosibirsk, Russia.
11
Coexistence of different base periodicities in prokaryotic genomes as related to DNA curvature, supercoiling, and transcription. Genomics 2011; 98:223-31. [PMID: 21722724 DOI: 10.1016/j.ygeno.2011.06.006] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 03/14/2011] [Revised: 05/30/2011] [Accepted: 06/13/2011] [Indexed: 01/15/2023]
Abstract
We analyzed the periodic patterns in E. coli promoters and compared the distributions of the corresponding patterns in promoters and in the complete genome to elucidate their function. Except for the three-base periodicity, which coincides with that in the coding regions and grows stronger in the region downstream from the transcription start (TS), all other salient periodicities peak upstream of TS. We found that helical periodicities with lengths of about the B-helix pitch (~10.2-10.5 bp) and the A-helix pitch (~10.8-11.1 bp) coexist in the genomic sequences. We mapped the distributions of stretches with A-, B-, and Z-like DNA periodicities onto the E. coli genome. All three periodicities tend to concentrate within non-coding regions when their intensity becomes stronger and prevail in the promoter sequences. The comparison with available experimental data indicates that promoters with the most pronounced periodicities may be related to the supercoiling-sensitive genes.
12
Mishra H, Singh N, Misra K, Lahiri T. An ANN-GA model based promoter prediction in Arabidopsis thaliana using tilling microarray data. Bioinformation 2011; 6:240-3. [PMID: 21887014 PMCID: PMC3159145 DOI: 10.6026/97320630006240] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 02/10/2011] [Accepted: 05/09/2011] [Indexed: 11/23/2022] Open
Abstract
Identification of the promoter region is an important part of gene annotation. Identification of promoters in eukaryotes is important because promoters modulate various metabolic functions and cellular stress responses. In this work, a novel approach utilizing intensity values of tiling microarray data for a model eukaryotic plant, Arabidopsis thaliana, was used to distinguish promoter regions from non-promoter regions. A feed-forward back-propagation neural network model supported by a genetic algorithm was employed to predict the class of data with a window size of 41. A dataset comprising 2992 data vectors representing both promoter and non-promoter regions, chosen randomly from probe intensity vectors for the whole genome of Arabidopsis thaliana generated through the tiling microarray technique, was used. The classifier model shows prediction accuracy of 69.73% and 65.36% on training and validation sets, respectively. Further, a concept of distance-based class membership was used to validate the reliability of the classifier, which showed promising results. The study shows the usability of microarray probe intensities to predict promoter regions in eukaryotic genomes.
Affiliation(s)
- Hrishikesh Mishra
  - Division of Applied Sciences and Indo-Russian Centre for Biotechnology, Indian Institute of Information Technology, Allahabad, India
13
Drachkova IA, Ponomarenko PM, Arshinova TV, Ponomarenko MP, Suslov VV, Savinkova LK, Kolchanov NA. In vitro examining the existing prognoses how TBP binds to TATA with SNP associated with human diseases. Health (London) 2011. [DOI: 10.4236/health.2011.39099] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/20/2022]
14
Cui JY, Gunewardena SS, Rockwell CE, Klaassen CD. ChIPing the cistrome of PXR in mouse liver. Nucleic Acids Res 2010; 38:7943-63. [PMID: 20693526 PMCID: PMC3001051 DOI: 10.1093/nar/gkq654] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.9] [Received: 11/25/2009] [Revised: 07/03/2010] [Accepted: 07/08/2010] [Indexed: 01/14/2023] Open
Abstract
The pregnane X receptor (PXR) is a key regulator of xenobiotic metabolism and disposition in liver. However, little is known about the PXR DNA-binding signatures in vivo, or how PXR regulates novel direct targets on a genome-wide scale. Therefore, we generated a roadmap of hepatic PXR bindings in the entire mouse genome [chromatin immunoprecipitation (ChIP)-Seq]. The most frequent PXR DNA-binding motif is the AGTTCA-like direct repeat with a 4 bp spacer [direct repeat (DR)-4]. Surprisingly, there are also high motif occurrences with spacers of a periodicity of 5 bp, forming a novel DR-(5n+4) pattern for PXR binding. PXR-binding overlaps with the epigenetic mark for gene activation (histone-H3K4-di-methylation), but not with epigenetic marks for gene suppression (DNA methylation or histone-H3K27-tri-methylation) (ChIP-on-chip). After administering a PXR agonist, changes in mRNA of most PXR-direct target genes correlate with increased PXR binding. Specifically, increased PXR binding triggers the trans-activation of critical drug-metabolizing enzymes and transporters. The mRNA induction of these genes is absent in PXR-null mice. The current work provides the first in vivo evidence of PXR DNA-binding signatures in the mouse genome, paving the path for predicting and further understanding the multifaceted roles of PXR in liver.
Affiliation(s)
- Julia Yue Cui
  - Department of Pharmacology, Toxicology, and Therapeutics and Department of Molecular and Integrative Physiology, University of Kansas Medical Center, Kansas City, KS 66160, USA
- Sumedha S. Gunewardena
  - Department of Pharmacology, Toxicology, and Therapeutics and Department of Molecular and Integrative Physiology, University of Kansas Medical Center, Kansas City, KS 66160, USA
- Cheryl E. Rockwell
  - Department of Pharmacology, Toxicology, and Therapeutics and Department of Molecular and Integrative Physiology, University of Kansas Medical Center, Kansas City, KS 66160, USA
- Curtis D. Klaassen
  - Department of Pharmacology, Toxicology, and Therapeutics and Department of Molecular and Integrative Physiology, University of Kansas Medical Center, Kansas City, KS 66160, USA
15
Babbitt GA, Tolstorukov MY, Kim Y. The molecular evolution of nucleosome positioning through sequence-dependent deformation of the DNA polymer. J Biomol Struct Dyn 2010; 27:765-80. [PMID: 20232932 DOI: 10.1080/07391102.2010.10508584] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.2] [Indexed: 10/28/2022]
Abstract
The computational prediction of nucleosome positioning from DNA sequence now allows for in silico investigation of the molecular evolution of biophysical properties of the DNA molecule responsible for primary chromatin organization in the genome. To discern what signal components driving nucleosome positioning in the yeast genome are potentially targeted by natural selection, we compare the performance of various models predictive of nucleosome positioning within the context of a simple statistical test, the repositioned mutation test. We demonstrate that while nucleosome occupancy is driven largely by translational exclusion in response to AT content, there is also a strong signature of evolutionary conservation of regular patterns within nucleosomal DNA sequence related to the structural organization of the nucleosome core (e.g., 10-bp dinucleotide periodicity). We also use computer simulations to investigate hypothetical coding and regulatory constraints on the ability of sequence properties affecting nucleosome formation to adaptively evolve. Our results demonstrate that natural selection may act independently on different DNA sequence properties responsible for local chromatin organization. Furthermore, at least with respect to the deformation energy of the DNA molecule in the nucleosome, the presence of the genetic code has greatly restricted the ability of sequences to evolve the dynamic nucleosome organization typically observed in promoter regions.
Affiliation(s)
- G A Babbitt
  - School of Biological and Medical Sciences, Rochester Institute of Technology, Rochester, NY, USA.
16
|
Babbitt GA. Relaxed selection against accidental binding of transcription factors with conserved chromatin contexts. Gene 2010; 466:43-8. [PMID: 20637845 DOI: 10.1016/j.gene.2010.07.002] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/28/2010] [Revised: 06/30/2010] [Accepted: 07/07/2010] [Indexed: 02/03/2023]
Abstract
The spurious (or nonfunctional) binding of transcription factors (TF) to the wrong locations on DNA presents a formidable challenge to genomes given the relatively low ceiling for sequence complexity within the short lengths of most binding motifs. The high potential for the occurrence of random motifs and subsequent nonfunctional binding of many transcription factors should theoretically lead to natural selection against the occurrence of spurious motifs throughout the genome. However, because of the active role that chromatin can play in eukaryotic gene regulation, it may also be expected that many supposed spurious binding sites could escape purifying selection if (A) they simply occur in regions of high nucleosome occupancy or (B) their surrounding chromatin was dynamically involved in their identity and function. We compared nucleosome occupancy and the presence/absence of functionally conserved chromatin context to the strength of selection against spurious binding of various TF binding motifs in Saccharomyces yeast. While we find no direct relationship with nucleosome occupancy, we find strong evidence that transcription factors spatially associated with evolutionarily conserved chromatin states are under relaxed selection against accidental binding. Transcription factors with and without a conserved chromatin context were found to occur, on average, at 87.7% and 49.3% of their expected frequencies, respectively. Functional binding motifs with conserved chromatin contexts were also significantly shorter in length and more often clustered. These results indicate a role of chromatin context dependency in relaxing selection against spurious binding in nearly half of all TF binding motifs throughout the yeast genome.
Affiliation(s)
- G A Babbitt
- School of Biological and Medical Sciences, Rochester Institute of Technology, USA.
17
Papatsenko D, Goltsev Y, Levine M. Organization of developmental enhancers in the Drosophila embryo. Nucleic Acids Res 2009; 37:5665-77. [PMID: 19651877 PMCID: PMC2761283 DOI: 10.1093/nar/gkp619] [Citation(s) in RCA: 47] [Impact Index Per Article: 3.1] [Indexed: 01/06/2023]
Abstract
Most cell-specific enhancers are thought to lack an inherent organization, with critical binding sites distributed in a more or less random fashion. However, there are examples of fixed arrangements of binding sites, such as helical phasing, that promote the formation of higher-order protein complexes on the enhancer DNA template. Here, we investigate the regulatory ‘grammar’ of nearly 100 characterized enhancers for developmental control genes active in the early Drosophila embryo. The conservation of grammar is examined in seven divergent Drosophila genomes. Linked binding sites are observed for particular combinations of binding motifs, including Bicoid–Bicoid, Hunchback–Hunchback, Bicoid–Dorsal, Bicoid–Caudal and Dorsal–Twist. Direct evidence is presented for the importance of Bicoid–Dorsal linkage in the integration of the anterior–posterior and dorsal–ventral patterning systems. Hunchback–Hunchback interactions help explain unresolved aspects of segmentation, including the differential regulation of the eve stripe 3 + 7 and stripe 4 + 6 enhancers. We also present evidence that there is an under-representation of nucleosome positioning sequences in many enhancers, raising the possibility for a subtle higher-order structure extending across certain enhancers. We conclude that grammar of gene control regions is pervasively used in the patterning of the Drosophila embryo.
Affiliation(s)
- Dmitri Papatsenko
- Department of Molecular Cell Biology, Division of Genetics, Genomics & Development, Center for Integrative Genomics, University of California, Berkeley, CA 94720-200, USA.
18
Shelenkov A, Korotkov E. Search of regular sequences in promoters from eukaryotic genomes. Comput Biol Chem 2009; 33:196-204. [DOI: 10.1016/j.compbiolchem.2009.03.001] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 06/26/2008] [Revised: 02/08/2009] [Accepted: 03/18/2009] [Indexed: 12/14/2022]
19
Yokoyama KD, Ohler U, Wray GA. Measuring spatial preferences at fine-scale resolution identifies known and novel cis-regulatory element candidates and functional motif-pair relationships. Nucleic Acids Res 2009; 37:e92. [PMID: 19483094 PMCID: PMC2715254 DOI: 10.1093/nar/gkp423] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.5] [Indexed: 02/03/2023]
Abstract
Transcriptional regulation is mediated by the collective binding of proteins called transcription factors to cis-regulatory elements. A handful of factors are known to function at particular distances from the transcription start site, although the extent to which this occurs is not well understood. Spatial dependencies can also exist between pairs of binding motifs, facilitating factor-pair interactions. We sought to determine to what extent spatial preferences measured at high-scale resolution could be utilized to predict cis-regulatory elements as well as motif-pairs binding interacting proteins. We introduce the ‘motif positional function’ model which predicts spatial biases using regression analysis, differentiating noise from true position-specific overrepresentation at single-nucleotide resolution. Our method predicts 48 consensus motifs exhibiting positional enrichment within human promoters, including fourteen motifs without known binding partners. We then extend the model to analyze distance preferences between pairs of motifs. We find that motif-pairs binding interacting factors often co-occur preferentially at multiple distances, with intervals between preferred distances often corresponding to the turn of the DNA double-helix. This offers a novel means by which to predict sequence elements with a collective role in gene regulation.
Affiliation(s)
- Ken Daigoro Yokoyama
- Biology Department, Institute for Genome Sciences and Policy, Duke University, Durham, NC 27708, USA
20
Salih F, Salih B, Trifonov EN. Sequence Structure of Hidden 10.4-base Repeat in the Nucleosomes of C. elegans. J Biomol Struct Dyn 2008; 26:273-82. [DOI: 10.1080/07391102.2008.10531241] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.2] [Indexed: 10/28/2022]
21
Tolstorukov MY, Choudhary V, Olson WK, Zhurkin VB, Park PJ. nuScore: a web-interface for nucleosome positioning predictions. Bioinformatics 2008; 24:1456-8. [PMID: 18445607 DOI: 10.1093/bioinformatics/btn212] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.8] [Indexed: 11/13/2022]
Abstract
Sequence-directed mapping of nucleosome positions is of major biological interest. Here, we present a web-interface for estimation of the affinity of the histone core to DNA and prediction of nucleosome arrangement on a given sequence. Our approach is based on assessment of the energy cost of imposing the deformations required to wrap DNA around the histone surface. The interface allows the user to specify a number of options such as selecting from several structural templates for threading calculations and adding random sequences to the analysis. AVAILABILITY The nuScore interface is freely available for use at http://compbio.med.harvard.edu/nuScore. CONTACT peter_park@harvard.edu; tolstorukov@gmail.com SUPPLEMENTARY INFORMATION The site contains user manual, description of the methodology and examples.
Affiliation(s)
- Michael Y Tolstorukov
- Harvard-Partners Center for Genetics and Genomics, Brigham and Women's Hospital, Boston, MA 02115, USA.
22
Ramaswamy A, Ioshikhes I. Global dynamics of newly constructed oligonucleosomes of conventional and variant H2A.Z histone. BMC Struct Biol 2007; 7:76. [PMID: 17996059 PMCID: PMC2216022 DOI: 10.1186/1472-6807-7-76] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 04/23/2007] [Accepted: 11/08/2007] [Indexed: 11/16/2022]
Abstract
Background Complexes of nucleosomes, which often occur in the gene promoter areas, are one of the fundamental levels of chromatin organization and thus are important for transcription regulation. Investigating the dynamic structure of a single nucleosome as well as nucleosome complexes is important for understanding transcription within chromatin. In a previous work, we highlighted the influence of histone variants on the functional dynamics of a single nucleosome using normal mode analysis developed by Bahar et al. The present work further analyzes the dynamics of nucleosome complexes (nucleosome oligomers or oligonucleosomes) such as dimer, trimer and tetramer (beads on a string model) with conventional core histones as well as with the H2A.Z histone variant using normal mode analysis. Results The global dynamics of oligonucleosomes reveal larger amplitude of motion within the nucleosomes that contain the H2A.Z variant with in-planar and out-of-planar fluctuations as the common mode of relaxation. The docking region of H2A.Z and the L1:L1 interactions between H2A.Z monomers of nucleosome (that are responsible for the highly stable nucleosome containing variant H2A.Z-histone) are highly dynamic throughout the first two dynamic modes. Conclusion Dissection of the dynamics of oligonucleosomes discloses in-plane as well as out-of-plane fluctuations as the common mode of relaxation throughout the global motions. The dynamics of individual nucleosomes and the combination of the relaxation mechanisms expressed by the individual nucleosome are quite interesting and highly dependent on the number of nucleosome fragments present in the complexes. Distortions generated by the non-planar dynamics influence the DNA conformation, and hence the histone-DNA interactions significantly alter the dynamics of the DNA. The variant H2A.Z histone is a major source of weaker intra- and inter-molecular correlations resulting in more disordered motions.
Affiliation(s)
- Amutha Ramaswamy
- Department of Biomedical Informatics, The Ohio State University, Columbus, Ohio, USA.
23
Larsson E, Lindahl P, Mostad P. HeliCis: a DNA motif discovery tool for colocalized motif pairs with periodic spacing. BMC Bioinformatics 2007; 8:418. [PMID: 17963524 PMCID: PMC2200674 DOI: 10.1186/1471-2105-8-418] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 05/23/2007] [Accepted: 10/28/2007] [Indexed: 12/22/2022]
Abstract
Background Correct temporal and spatial gene expression during metazoan development relies on combinatorial interactions between different transcription factors. As a consequence, cis-regulatory elements often colocalize in clusters termed cis-regulatory modules. These may have requirements on organizational features such as spacing, order and helical phasing (periodic spacing) between binding sites. Due to the turning of the DNA helix, a small modification of the distance between a pair of sites may sometimes drastically disrupt function, while insertion of a full helical turn of DNA (10–11 bp) between cis elements may cause functionality to be restored. Recently, de novo motif discovery methods which incorporate organizational properties such as colocalization and order preferences have been developed, but there are no tools which incorporate periodic spacing into the model. Results We have developed a web based motif discovery tool, HeliCis, which features a flexible model which allows de novo detection of motifs with periodic spacing. Depending on the parameter settings it may also be used for discovering colocalized motifs without periodicity or motifs separated by a fixed gap of known or unknown length. We show on simulated data that it can efficiently capture the synergistic effects of colocalization and periodic spacing to improve detection of weak DNA motifs. It provides a simple to use web interface which interactively visualizes the current settings and thereby makes it easy to understand the parameters and the model structure. Conclusion HeliCis provides simple and efficient de novo discovery of colocalized DNA motif pairs, with or without periodic spacing. Our evaluations show that it can detect weak periodic patterns which are not easily discovered using a sequential approach, i.e. first finding the binding sites and second analyzing the properties of their pairwise distances.
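The "sequential approach" that this abstract contrasts with HeliCis (first find the binding sites, then analyze their pairwise distances) can be illustrated with a minimal Python sketch. This is not the HeliCis model. The function names, the 10.5-bp helical period, and the 60-bp maximum gap are assumptions chosen for illustration only.

```python
import math

def pair_distances(sites_a, sites_b, max_gap=60):
    """Distances from each site of motif A to each downstream site of motif B.

    `max_gap` (hypothetical default) limits the pairs to nearby sites.
    """
    return [b - a for a in sites_a for b in sites_b if 0 < b - a <= max_gap]

def phasing_strength(distances, period=10.5):
    """Mean resultant length of distances wrapped onto a circle of `period` bp.

    Values near 1 mean the distances share one helical phase;
    values near 0 mean no preferred phase. The 10.5-bp period is an assumption.
    """
    if not distances:
        return 0.0
    c = sum(math.cos(2.0 * math.pi * d / period) for d in distances)
    s = sum(math.sin(2.0 * math.pi * d / period) for d in distances)
    return math.hypot(c, s) / len(distances)

if __name__ == "__main__":
    # Hypothetical site coordinates, not data from the paper.
    d = pair_distances([100, 300], [121, 342, 500])
    print(d, phasing_strength(d))
```

Because the score depends only on distances that have already been called, weak sites missed in the first step never contribute, which is the limitation the abstract attributes to the sequential approach.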
Affiliation(s)
- Erik Larsson
- Wallenberg Laboratory for Cardiovascular Research, Bruna Stråket 16, Sahlgrenska University Hospital, SE-413 45 Göteborg, SWEDEN.
24
Abnizova I, Subhankulova T, Gilks WR. Recent computational approaches to understand gene regulation: mining gene regulation in silico. Curr Genomics 2007; 8:79-91. [PMID: 18660846 PMCID: PMC2435357 DOI: 10.2174/138920207780368150] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Received: 11/30/2006] [Revised: 12/13/2006] [Accepted: 12/15/2006] [Indexed: 01/03/2023]
Abstract
This paper reviews recent computational approaches to the understanding of gene regulation in eukaryotes. Cis-regulation of gene expression by the binding of transcription factors is a critical component of cellular physiology. In eukaryotes, a number of transcription factors often work together in a combinatorial fashion to enable cells to respond to a wide spectrum of environmental and developmental signals. Integration of genome sequences and/or Chromatin Immunoprecipitation on chip data with gene-expression data has facilitated in silico discovery of how the combinatorics and positioning of transcription factors binding sites underlie gene activation in a variety of cellular processes. The process of gene regulation is extremely complex and intriguing, therefore all possible points of view and related links should be carefully considered. Here we attempt to collect an inventory, not claiming it to be comprehensive and complete, of related computational biological topics covering gene regulation, which may enlighten the process, and briefly review what is currently occurring in these areas. We will consider the following computational areas:
- gene regulatory network construction;
- evolution of regulatory DNA;
- studies of its structural and statistical informational properties;
- and finally, regulatory RNA.
Affiliation(s)
- T Subhankulova
- Wellcome Trust/Cancer Research UK Gurdon Institute of Cancer and Developmental Biology, Cambridge, UK
25
Abstract
Promoter Classifier is a package of seven stand-alone Windows-based C++ programs allowing the following basic manipulations with a set of promoter sequences: (i) calculation of positional distributions of nucleotides averaged over all promoters of the dataset; (ii) calculation of the averaged occurrence frequencies of the transcription factor binding sites and their combinations; (iii) division of the dataset into subsets of sequences containing or lacking certain promoter elements or combinations; (iv) extraction of the promoter subsets containing or lacking CpG islands around the transcription start site; and (v) calculation of spatial distributions of the promoter DNA stacking energy and bending stiffness. All programs have a user-friendly interface and provide the results in a convenient graphical form. The Promoter Classifier package is an effective tool for various basic manipulations with eukaryotic promoter sequences that usually are necessary for analysis of large promoter datasets. The program Promoter Divider is described in more detail as a representative component of the package.
Affiliation(s)
- Naum I Gershenzon
- Department of Biomedical Informatics, The Ohio State University, Columbus, Ohio 43210, USA
26
Ioshikhes IP, Albert I, Zanton SJ, Pugh BF. Nucleosome positions predicted through comparative genomics. Nat Genet 2006; 38:1210-5. [PMID: 16964265 DOI: 10.1038/ng1878] [Citation(s) in RCA: 260] [Impact Index Per Article: 14.4] [Received: 05/10/2006] [Accepted: 08/08/2006] [Indexed: 11/09/2022]
Abstract
DNA sequence has long been recognized as an important contributor to nucleosome positioning, which has the potential to regulate access to genes. The extent to which the nucleosomal architecture at promoters is delineated by the underlying sequence is now being worked out. Here we use comparative genomics to report a genome-wide map of nucleosome positioning sequences (NPSs) located in the vicinity of all Saccharomyces cerevisiae genes. We find that the underlying DNA sequence provides a very good predictor of nucleosome locations that have been experimentally mapped to a small fraction of the genome. Notably, distinct classes of genes possess characteristic arrangements of NPSs that may be important for their regulation. In particular, genes that have a relatively compact NPS arrangement over the promoter region tend to have a TATA box buried in an NPS and tend to be highly regulated by chromatin modifying and remodeling factors.
Affiliation(s)
- Ilya P Ioshikhes
- Department of Biomedical Informatics, The Ohio State University, Columbus, Ohio 43210, USA
27
Gershenzon NI, Trifonov EN, Ioshikhes IP. The features of Drosophila core promoters revealed by statistical analysis. BMC Genomics 2006; 7:161. [PMID: 16790048 PMCID: PMC1538597 DOI: 10.1186/1471-2164-7-161] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.1] [Received: 04/04/2006] [Accepted: 06/21/2006] [Indexed: 11/19/2022]
Abstract
BACKGROUND Experimental investigation of transcription is still a very labor- and time-consuming process. Only a few transcription initiation scenarios have been studied in detail. The mechanism of interaction between basal machinery and promoter, in particular core promoter elements, is not known for the majority of identified promoters. In this study, we reveal various transcription initiation mechanisms by statistical analysis of 3393 nonredundant Drosophila promoters. RESULTS Using Drosophila-specific position-weight matrices, we identified promoters containing TATA box, Initiator, Downstream Promoter Element (DPE), and Motif Ten Element (MTE), as well as core elements discovered in Human (TFIIB Recognition Element (BRE) and Downstream Core Element (DCE)). Promoters utilizing known synergetic combinations of two core elements (TATA_Inr, Inr_MTE, Inr_DPE, and DPE_MTE) were identified. We also establish the existence of promoters with potentially novel synergetic combinations: TATA_DPE and TATA_MTE. Our analysis revealed several motifs with the features of promoter elements, including possible novel core promoter element(s). Comparison of Human and Drosophila showed consistent percentages of promoters with TATA, Inr, DPE, and synergetic combinations thereof, as well as most of the same functional and mutual positions of the core elements. No statistical evidence of MTE utilization in Human was found. Distinct nucleosome positioning in particular promoter classes was revealed. CONCLUSION We present lists of promoters that potentially utilize the aforementioned elements/combinations. The number of these promoters is two orders of magnitude larger than the number of promoters in which transcription initiation was experimentally studied. The sequences are ready to be experimentally tested or used for further statistical analysis. The developed approach may be utilized for other species.
Affiliation(s)
- Naum I Gershenzon
- Department of Biomedical Informatics, The Ohio State University, 333 West 10th Avenue, Columbus OH 43210, USA
- Department of Physics, Wright State University, Dayton OH 45435, USA
- Edward N Trifonov
- Genome Diversity Center, Institute of Evolution, University of Haifa, Haifa 31905, Israel
- Ilya P Ioshikhes
- Department of Biomedical Informatics, The Ohio State University, 333 West 10th Avenue, Columbus OH 43210, USA
28
Chiang DY, Nix DA, Shultzaberger RK, Gasch AP, Eisen MB. Flexible promoter architecture requirements for coactivator recruitment. BMC Mol Biol 2006; 7:16. [PMID: 16646957 PMCID: PMC1488866 DOI: 10.1186/1471-2199-7-16] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Received: 07/22/2005] [Accepted: 04/28/2006] [Indexed: 11/16/2022]
Abstract
Background The spatial organization of transcription factor binding sites in regulatory DNA, and the composition of intersite sequences, influences the assembly of the multiprotein complexes that regulate RNA polymerase recruitment and thereby affects transcription. We have developed a genetic approach to investigate how reporter gene transcription is affected by varying the spacing between transcription factor binding sites. We characterized the components of promoter architecture that govern the yeast transcription factors Cbf1 and Met31/32, which bind independently, but collaboratively recruit the coactivator Met4. Results A Cbf1 binding site was required upstream of a Met31/32 binding site for full reporter gene expression. Distance constraints on coactivator recruitment were more flexible than those for cooperatively binding transcription factors. Distances from 18 to 50 bp between binding sites support efficient recruitment of Met4, with only slight modulation by helical phasing. Intriguingly, we found that certain sequences located between the binding sites abolished gene expression. Conclusion These results yield insight to the influence of both binding site architecture and local DNA flexibility on gene expression, and can be used to refine computational predictions of gene expression from promoter sequences. In addition, our approach can be applied to survey promoter architecture requirements for arbitrary combinations of transcription factor binding sites.
Affiliation(s)
- Derek Y Chiang
- Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720, USA
- Broad Institute of MIT and Harvard, Cambridge, MA 02141, USA
- David A Nix
- Department of Genome Sciences, Life Sciences Division, Ernest Orlando Lawrence Berkeley National Lab, Berkeley, CA 94720, USA
- Affymetrix, Santa Clara, CA 95051, USA
- Ryan K Shultzaberger
- Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720, USA
- Audrey P Gasch
- Department of Genetics, University of Wisconsin, Madison, WI 53706, USA
- Michael B Eisen
- Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720, USA
- Department of Genome Sciences, Life Sciences Division, Ernest Orlando Lawrence Berkeley National Lab, Berkeley, CA 94720, USA
29
Higasa K, Hayashi K. Periodicity of SNP distribution around transcription start sites. BMC Genomics 2006; 7:66. [PMID: 16579865 PMCID: PMC1448210 DOI: 10.1186/1471-2164-7-66] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.0] [Received: 10/13/2005] [Accepted: 04/03/2006] [Indexed: 11/23/2022]
Abstract
Background Several million single nucleotide polymorphisms (SNPs) have already been collected and deposited in public databases, and these are important resources not only for use as markers to identify disease-associated genes, but also to understand the mechanisms that underlie genome diversification. Results A spectrum analysis of SNP density distribution in the genomic regions around transcription start sites (TSSs) revealed a remarkable periodicity of 146 nucleotides. This periodicity was observed in the regions that were associated with CpG islands (CGIs), but not in the regions without CpG islands (nonCGIs). An analysis of the sequence divergence of the same genomic regions between humans and chimpanzees also revealed a similar periodical pattern in CGI. The occurrences of any mono- or di-nucleotide sequences in these regions did not reveal such a periodicity, thus indicating that an interpretation of this periodicity solely based on the sequence-dependent susceptibility to mutation is highly unlikely. Conclusion The periodical patterns of nucleotide variability suggest the location of nucleosomes that are phased at TSS, and can be viewed as the genetic footprint of the chromatin state that has been maintained throughout mammalian evolutionary history. The results suggest the possible involvement of the nucleosome structure in the promoter function, and also a fundamental functional/structural difference between the two promoter classes, i.e., those with and without CGIs.
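The abstract does not say which spectral method was used, so the following is only a minimal sketch of one way to look for a periodicity in a per-position density profile: a plain periodogram evaluated at a set of trial periods. The function names and the synthetic profile are assumptions for illustration and are not taken from the paper.

```python
import math

def power_at_period(density, period):
    """Periodogram power of a mean-centred density profile at one trial period (nt)."""
    n = len(density)
    mean = sum(density) / n
    re = sum((x - mean) * math.cos(2.0 * math.pi * i / period)
             for i, x in enumerate(density))
    im = sum((x - mean) * math.sin(2.0 * math.pi * i / period)
             for i, x in enumerate(density))
    return (re * re + im * im) / n

def dominant_period(density, trial_periods):
    """Return the trial period with the highest periodogram power."""
    return max(trial_periods, key=lambda p: power_at_period(density, p))

if __name__ == "__main__":
    # Synthetic profile (hypothetical data): a 146-nt oscillation on a flat background.
    profile = [10.0 + 3.0 * math.cos(2.0 * math.pi * i / 146.0)
               for i in range(146 * 40)]
    print(dominant_period(profile, range(120, 181)))
```

Neighbouring trial periods differ very little in frequency, so a profile must span many cycles before they can be told apart. Real SNP densities are also noisy, so a peak would need to be judged against a null distribution rather than read off directly.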
Affiliation(s)
- Koichiro Higasa
- Division of Genome Analysis, Research Center for Genetic Information, Medical Institute of Bioregulation, Kyushu University, Maidashi 3-1-1, Higashi-ku, Fukuoka 812-8582 Fukuoka, Japan
- Kenshi Hayashi
- Division of Genome Analysis, Research Center for Genetic Information, Medical Institute of Bioregulation, Kyushu University, Maidashi 3-1-1, Higashi-ku, Fukuoka 812-8582 Fukuoka, Japan
30
Boeva V, Regnier M, Papatsenko D, Makeev V. Short fuzzy tandem repeats in genomic sequences, identification, and possible role in regulation of gene expression. Bioinformatics 2006; 22:676-84. [PMID: 16403795 DOI: 10.1093/bioinformatics/btk032] [Citation(s) in RCA: 63] [Impact Index Per Article: 3.5] [Indexed: 01/21/2023]
Abstract
MOTIVATION Genomic sequences are highly redundant and contain many types of repetitive DNA. Fuzzy tandem repeats (FTRs) are of particular interest. They are found in regulatory regions of eukaryotic genes and are reported to interact with transcription factors. However, accurate assessment of FTR occurrences in different genome segments requires a specific algorithm for efficient FTR identification and classification. RESULTS We have obtained formulas for P-values of FTR occurrence and developed an FTR identification algorithm implemented in TandemSWAN software. Using TandemSWAN we compared the structure and the occurrence of FTRs with short period length (up to 24 bp) in coding and non-coding regions including UTRs, heterochromatic, intergenic and enhancer sequences of Drosophila melanogaster and Drosophila pseudoobscura. Tandems with period three and its multiples were found in coding segments, whereas FTRs with periods that are multiples of six are overrepresented in all non-coding segments. Periods equal to 5-7 and 11-14 were characteristic of the enhancer regions and other non-coding regions close to genes. AVAILABILITY TandemSWAN web page, stand-alone version and documentation can be found at http://bioinform.genetika.ru/projects/swan/www/ SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Valentina Boeva
- Department of Bioengineering and Bioinformatics, Moscow State University Moscow, Russia.
31
Wang J, Hannenhalli S. Generalizations of Markov model to characterize biological sequences. BMC Bioinformatics 2005; 6:219. [PMID: 16144548 PMCID: PMC1236913 DOI: 10.1186/1471-2105-6-219] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Received: 04/29/2005] [Accepted: 09/06/2005] [Indexed: 11/10/2022]
Abstract
BACKGROUND The currently used kth order Markov models estimate the probability of generating a single nucleotide conditional upon the immediately preceding (gap = 0) k units. However, this neither takes into account the joint dependency of multiple neighboring nucleotides, nor does it consider the long range dependency with gap > 0. RESULT We describe a configurable tool to explore generalizations of the standard Markov model. We evaluated whether the sequence classification accuracy can be improved by using an alternative set of model parameters. The evaluation was done on four classes of biological sequences--CpG-poor promoters, all promoters, exons and nucleosome positioning sequences. Using di- and tri-nucleotide as the model unit significantly improved the sequence classification accuracy relative to the standard single nucleotide model. In the case of nucleosome positioning sequences, optimal accuracy was achieved at a gap length of 4. Furthermore in the plot of classification accuracy versus the gap, a periodicity of 10-11 bps was observed which might indicate structural preferences in the nucleosome positioning sequence. The tool is implemented in Java and is available for download at ftp://ftp.pcbi.upenn.edu/GMM/. CONCLUSION Markov modeling is an important component of many sequence analysis tools. We have extended the standard Markov model to incorporate joint and long range dependencies between the sequence elements. The proposed generalizations of the Markov model are likely to improve the overall accuracy of sequence analysis tools.
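The gap generalization described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' Java tool: it conditions a single nucleotide on the k nucleotides that end `gap` positions before it, and classifies a sequence by comparing log-likelihoods under two trained models. The function names, the add-one pseudocount, and the toy training sequences are assumptions. The di- and tri-nucleotide model units mentioned in the abstract are not implemented here.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"

def train_gapped_markov(seqs, k=1, gap=0, pseudo=1.0):
    """Estimate P(x_i | the k bases ending `gap` positions before i).

    gap=0 gives the standard k-th order Markov model. Returns a function
    log_prob(context, base) using add-`pseudo` smoothing (an assumption).
    """
    counts = defaultdict(lambda: defaultdict(float))
    for s in seqs:
        for i in range(k + gap, len(s)):
            counts[s[i - gap - k : i - gap]][s[i]] += 1.0

    def log_prob(context, base):
        row = counts.get(context, {})
        total = sum(row.values()) + pseudo * len(ALPHABET)
        return math.log((row.get(base, 0.0) + pseudo) / total)

    return log_prob

def log_likelihood(seq, log_prob, k=1, gap=0):
    """Sum of conditional log-probabilities over all scorable positions."""
    return sum(log_prob(seq[i - gap - k : i - gap], seq[i])
               for i in range(k + gap, len(seq)))

def classify(seq, model_a, model_b, k=1, gap=0):
    """Label `seq` with the class whose model gives the higher log-likelihood."""
    la = log_likelihood(seq, model_a, k, gap)
    lb = log_likelihood(seq, model_b, k, gap)
    return "A" if la >= lb else "B"
```

Scanning `gap` over a range and plotting classification accuracy against it is the kind of analysis in which the abstract reports a 10-11 bp periodicity for nucleosome positioning sequences.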
Affiliation(s)
- Junwen Wang
- Penn Center for Bioinformatics, Department of Genetics, University of Pennsylvania Philadelphia, PA 19104-6021, USA
- Sridhar Hannenhalli
- Penn Center for Bioinformatics, Department of Genetics, University of Pennsylvania Philadelphia, PA 19104-6021, USA
32
Burden S, Lin YX, Zhang R. Improving promoter prediction for the NNPP2.2 algorithm: a case study using Escherichia coli DNA sequences. Bioinformatics 2004; 21:601-7. [PMID: 15454410 DOI: 10.1093/bioinformatics/bti047] [Citation(s) in RCA: 60] [Impact Index Per Article: 3.0] [Indexed: 11/14/2022]
Abstract
MOTIVATION Although a great deal of research has been undertaken in the area of promoter prediction, prediction techniques are still not fully developed. Many algorithms tend to exhibit poor specificity, generating many false positives, or poor sensitivity. The neural network prediction program NNPP2.2 is one such example. RESULTS To improve the NNPP2.2 prediction technique, the distance between the transcription start site (TSS) associated with the promoter and the translation start site (TLS) of the subsequent gene coding region has been studied for Escherichia coli K12 bacteria. An empirical probability distribution that is consistent for all E.coli promoters has been established. This information is combined with the results from NNPP2.2 to create a new technique called TLS-NNPP, which improves the specificity of promoter prediction. The technique is shown to be effective using E.coli DNA sequences, however, it is applicable to any organism for which a set of promoters has been experimentally defined. AVAILABILITY The data used in this project and the prediction results for the tested sequences can be obtained from http://www.uow.edu.au/~yanxia/E_Coli_paper/SBurden_Results.xls CONTACT alh98@uow.edu.au.
Affiliation(s)
- S Burden
- Department of Mathematics and Applied Statistics, University of Wollongong Wollongong, NSW 2522, Australia.
|
33
|
Audit B, Vaillant C, Arnéodo A, d'Aubenton-Carafa Y, Thermes C. Wavelet Analysis of DNA Bending Profiles reveals Structural Constraints on the Evolution of Genomic Sequences. J Biol Phys 2004; 30:33-81. [PMID: 23345861 PMCID: PMC3456503 DOI: 10.1023/b:jobp.0000016438.86794.8e] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open
Abstract
Analyses of genomic DNA sequences have shown in previous works that base pairs are correlated at large distances with scale-invariant statistical properties. We show in the present study that these correlations between nucleotides (letters) result in fact from long-range correlations (LRC) between sequence-dependent DNA structural elements (words) involved in the packaging of DNA in chromatin. Using the wavelet transform technique, we perform a comparative analysis of the DNA text and of the corresponding bending profiles generated with curvature tables based on nucleosome positioning data. This exploration through the optics of the so-called 'wavelet transform microscope' reveals a characteristic scale of 100-200 bp that separates two regimes of different LRC. We focus here on the existence of LRC in the small-scale regime (≲ 200 bp). Analysis of genomes in the three kingdoms reveals that this regime is specifically associated with the presence of nucleosomes. Indeed, small-scale LRC are observed in eukaryotic genomes and to a lesser extent in archaeal genomes, in contrast with their absence in eubacterial genomes. Similarly, this regime is observed in eukaryotic but not in bacterial viral DNA genomes. There is one exception for genomes of Poxviruses, the only animal DNA viruses that do not replicate in the cell nucleus and do not present small-scale LRC. Furthermore, no small-scale LRC are detected in the genomes of all examined RNA viruses, with one exception in the case of retroviruses. Altogether, these results strongly suggest that small-scale LRC are a signature of the nucleosomal structure. Finally, we discuss possible interpretations of these small-scale LRC in terms of the mechanisms that govern the positioning, the stability and the dynamics of the nucleosomes along the DNA chain.
This paper is mainly devoted to a pedagogical presentation of the theoretical concepts and physical methods which are well suited to perform a statistical analysis of genomic sequences. We review the results obtained with the so-called wavelet-based multifractal analysis when investigating the DNA sequences of various organisms in the three kingdoms. Some of these results have been announced in B. Audit et al. [1, 2].
Affiliation(s)
- Benjamin Audit
- Centre de Recherche Paul Pascal, avenue Schweitzer, 33600 Pessac, France
|
34
|
Makeev VJ, Lifanov AP, Nazina AG, Papatsenko DA. Distance preferences in the arrangement of binding motifs and hierarchical levels in organization of transcription regulatory information. Nucleic Acids Res 2004; 31:6016-26. [PMID: 14530449 PMCID: PMC219477 DOI: 10.1093/nar/gkg799] [Citation(s) in RCA: 68] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
Abstract
We explored distance preferences in the arrangement of binding motifs for five transcription factors (Bicoid, Krüppel, Hunchback, Knirps and Caudal) in a large set of Drosophila cis-regulatory modules (CRMs). Analysis of non-overlapping binding motifs revealed the presence of periodic signals specific to particular combinations of binding motifs. The most striking periodic signals (10 bp for Bicoid and 11 bp for Hunchback) suggest preferential positioning of some binding site combinations on the same side of the DNA helix. We also analyzed distance preferences in arrangements of highly correlated overlapping binding motifs, such as Bicoid and Krüppel. Based on the distance analysis, we extracted preferential binding site arrangements and proposed models for potential composite elements (CEs) and antagonistic motif pairs involved in the function of developmental CRMs. Our results suggest that there are distinct hierarchical levels in the organization of transcription regulatory information. We discuss the role of the hierarchy in understanding transcriptional regulation and in detection of transcription regulatory regions in genomes.
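A distance-preference analysis of this kind can be approximated by histogramming the distances between motif occurrences and measuring how strongly the histogram follows a given period. The sketch below is an illustrative assumption, not the authors' procedure: the names `distance_histogram` and `period_strength` are hypothetical, and the cosine-weighted score is just one simple way to test a (possibly non-integer) period such as 10 to 11 bp.

```python
from collections import Counter
from math import cos, pi


def distance_histogram(sites_a, sites_b, min_dist=1, max_dist=100):
    """Count distances between every pair of site positions from two motif lists."""
    hist = Counter()
    for a in sites_a:
        for b in sites_b:
            d = abs(b - a)
            if min_dist <= d <= max_dist:
                hist[d] += 1
    return hist


def period_strength(hist, period):
    """Cosine-weighted mean of the histogram: close to 1 when distances cluster
    at multiples of `period`, near 0 when there is no such preference."""
    total = sum(hist.values())
    if total == 0:
        return 0.0
    return sum(n * cos(2 * pi * d / period) for d, n in hist.items()) / total
```

Evaluating `period_strength` over a range of candidate periods and looking for a peak near the helical repeat would be one way to check for same-side-of-the-helix positioning; passing a `min_dist` at least as large as the motif length restricts the histogram to non-overlapping sites.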
|
35
|
Nazina AG, Papatsenko DA. Statistical extraction of Drosophila cis-regulatory modules using exhaustive assessment of local word frequency. BMC Bioinformatics 2003; 4:65. [PMID: 14690551 PMCID: PMC341902 DOI: 10.1186/1471-2105-4-65] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/19/2003] [Accepted: 12/22/2003] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND Transcription regulatory regions in higher eukaryotes are often represented by cis-regulatory modules (CRMs) and are responsible for the formation of specific spatial and temporal gene expression patterns. These extended, ~1 kb, regions are found far from coding sequences and cannot be extracted from a genome on the basis of their position relative to the coding regions. RESULTS To explore the feasibility of CRM extraction from a genome, we generated an original training set, containing annotated sequence data for most of the known developmental CRMs from Drosophila. Based on this set of experimental data, we developed a strategy for statistical extraction of cis-regulatory modules from the genome, using exhaustive analysis of local word frequency (LWF). To assess the performance of our analysis, we measured the correlation between predictions generated by the LWF algorithm and the distribution of conserved non-coding regions in a number of Drosophila developmental genes. CONCLUSIONS In most of the cases tested, we observed high correlation (up to 0.6–0.8, measured on the entire gene locus) between the two independent techniques. We discuss computational strategies available for extraction of Drosophila CRMs and possible extensions of these methods.
Affiliation(s)
- Anna G Nazina
- Department of Biology, New York University, New York, USA
|
36
|
Rombauts S, Florquin K, Lescot M, Marchal K, Rouzé P, van de Peer Y. Computational approaches to identify promoters and cis-regulatory elements in plant genomes. Plant Physiol 2003; 132:1162-76. [PMID: 12857799 PMCID: PMC167057 DOI: 10.1104/pp.102.017715] [Citation(s) in RCA: 77] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/14/2002] [Revised: 01/10/2003] [Accepted: 03/17/2003] [Indexed: 05/19/2023]
Abstract
The identification of promoters and their regulatory elements is one of the major challenges in bioinformatics and integrates comparative, structural, and functional genomics. Many different approaches have been developed to detect conserved motifs in a set of genes that are either coregulated or orthologous. However, although recent approaches seem promising, in general, unambiguous identification of regulatory elements is not straightforward. The delineation of promoters is even harder, due to its complex nature, and in silico promoter prediction is still in its infancy. Here, we review the different approaches that have been developed for identifying promoters and their regulatory elements. We discuss the detection of cis-acting regulatory elements using word-counting or probabilistic methods (so-called "search by signal" methods) and the delineation of promoters by considering both sequence content and structural features ("search by content" methods). As an example of search by content, we explored in greater detail the association of promoters with CpG islands. However, due to differences in sequence content, the parameters used to detect CpG islands in humans and other vertebrates cannot be used for plants. Therefore, a preliminary attempt was made to define parameters that could possibly define CpG and CpNpG islands in Arabidopsis, by exploring the compositional landscape around the transcriptional start site. To this end, a data set of more than 5,000 gene sequences was built, including the promoter region, the 5'-untranslated region, and the first introns and coding exons. Preliminary analysis shows that promoter location based on the detection of potential CpG/CpNpG islands in the Arabidopsis genome is not straightforward. 
Nevertheless, because the landscape of CpG/CpNpG islands differs considerably between promoters and introns on the one side and exons (whether coding or not) on the other, more sophisticated approaches can probably be developed for the successful detection of "putative" CpG and CpNpG islands in plants.
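The "search by content" idea for CpG and CpNpG islands can be made concrete with the usual observed/expected dinucleotide ratio computed in a sliding window. The sketch below is a minimal illustration, not the procedure used in the review: the function names are hypothetical, and the default thresholds (200 bp window, GC content above 0.5, observed/expected above 0.6) are the commonly cited vertebrate-style values, which the abstract itself says do not transfer to plants and would need to be re-tuned.

```python
def gc_content(window):
    """Fraction of G and C letters in the window."""
    w = window.upper()
    return (w.count("G") + w.count("C")) / len(w) if w else 0.0


def obs_exp(window, spacing=1):
    """Observed/expected ratio for C followed by G `spacing` positions later:
    spacing=1 gives CpG, spacing=2 gives CpNpG."""
    w = window.upper()
    n = len(w)
    c, g = w.count("C"), w.count("G")
    observed = sum(1 for i in range(n - spacing)
                   if w[i] == "C" and w[i + spacing] == "G")
    expected = c * g / n if n else 0.0
    return observed / expected if expected else 0.0


def candidate_windows(seq, window=200, step=1, gc_min=0.5, oe_min=0.6, spacing=1):
    """Start positions of windows passing both thresholds (assumed defaults)."""
    hits = []
    for start in range(0, len(seq) - window + 1, step):
        w = seq[start:start + window]
        if gc_content(w) > gc_min and obs_exp(w, spacing) > oe_min:
            hits.append(start)
    return hits
```

Running `candidate_windows` with `spacing=2` and different `gc_min` and `oe_min` values around annotated transcription start sites is one way to explore the compositional landscape that the abstract refers to.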
Affiliation(s)
- Stephane Rombauts
- Department of Plant Systems Biology, Flanders Interuniversity Institute for Biotechnology, Ghent University, B-9000 Gent, Belgium
|
37
|
Abstract
Here we propose a new determinant for localization of nucleosomes along genomic DNA, in addition to sequence-dependent features. The new specific class of chromatin scaling signals involves curved DNA. According to the observed positional distribution of DNA curvature, the new synchronizing signal occurs once per four nucleosomes on average. This new factor in nucleosome positioning should substantially influence the efficiency of biological reactions through regulatory factors microscopically and the entire chromatin structure through the 30 nm fiber structure macroscopically. Allocation of the new type of signals is found to be fixed evolutionarily although they could be shifted in accordance with the hierarchy of functional genomic structures.
Affiliation(s)
- Ryoiti Kiyama
- Research Center for Glycoscience, National Institute of Advanced Industrial Science and Technology (AIST), Tsukuba, 305-8566, Ibaraki, Japan.
|
38
|
Abstract
We describe an original approach to determining sequence-structure relationships for DNA. This approach, termed ADAPT, combines all-atom molecular mechanics with a multicopy algorithm to build nucleotides that contain all four standard bases in variable proportions. These nucleotides enable us to search very rapidly for base sequences that energetically favor chosen types of DNA deformation or chosen DNA-protein or DNA-ligand interactions. Sequences satisfying the chosen criteria can be found by energy minimization, combinatorial sequence searching, or genome scanning, in a manner similar to the threading approaches developed for protein structure prediction. In the latter case, we are able to analyze roughly 2000 base pairs per second. Applications of the method to DNA allomorphic transitions, DNA deformation, and specific DNA interactions are presented.
Affiliation(s)
- I Lafontaine
- Laboratoire de Biochimie Théorique, CNRS UPR 9080, Institut de Biologie Physico-Chimique, 13 rue Pierre et Marie Curie, Paris 75005, France
|
39
|
Kundu S, Bhattacharya D, Thakur AR, Majumdar R. Nucleosomal positioning and genetic divergence study based on DNA flexibility map. J Biomol Struct Dyn 2001; 18:527-33. [PMID: 11245248 DOI: 10.1080/07391102.2001.10506685] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/28/2022]
Abstract
Based on the worm-like chain model, DNA structural parameters (tilt, roll and rise) derived from a crystallographic database have been used to determine the flexibility of DNA that regulates nucleosomal translational positioning. Theoretically derived data have been compared to the experimental values available in Ioshikhes and Trifonov's database. The methodology has been extended to determine the flexibility of the 18S rRNA genome in eukarya, where yeast shows a distinct difference when compared with mammals such as human, mouse and rabbit.
Affiliation(s)
- S Kundu
- Department of Biophysics, Molecular Biology & Genetics, University College of Science, Calcutta, India
|
40
|
Farkas G, Leibovitch BA, Elgin SC. Chromatin organization and transcriptional control of gene expression in Drosophila. Gene 2000; 253:117-36. [PMID: 10940549 DOI: 10.1016/s0378-1119(00)00240-7] [Citation(s) in RCA: 70] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/18/2022]
Abstract
It is increasingly clear that the packaging of DNA in nucleosome arrays serves not only to constrain the genome within the nucleus, but also to encode information concerning the activity state of the gene. Packaging limits the accessibility of many regulatory DNA sequence elements and is functionally significant in the control of transcription, replication, repair and recombination. Here, we review studies of the heat-shock genes, illustrating the formation of a specific nucleosome array at an activatable promoter, and describe present information on the roles of DNA-binding factors and energy-dependent chromatin remodeling machines in facilitating assembly of an appropriate structure. Epigenetic maintenance of the activity state within large domains appears to be a key mechanism in regulating homeotic genes during development; recent advances indicate that chromatin structural organization is a critical parameter. The ability to utilize genetic, biochemical and cytological approaches makes Drosophila an ideal organism for studies of the role of chromatin structure in the regulation of gene expression.
Affiliation(s)
- G Farkas
- Department of Biology, Washington University, St. Louis, MO 63130, USA
|
41
|
Abstract
This paper presents a survey of currently available mathematical models and algorithmic methods for identifying promoter sequences. The methods concern both searching in a genome for a previously defined consensus and extracting a consensus from a set of sequences. Such methods were often tailored for either eukaryotes or prokaryotes, although this does not preclude use of the same method for both types of organisms. The survey therefore covers all methods; however, emphasis is placed on prokaryotic promoter sequence identification. Illustrative applications of the main extracting algorithms are given for three bacteria.
Affiliation(s)
- A Vanet
- Institut de biologie physico-chimique, Paris, France
|