1
|
Uemura K, Ohyama T. Physical Peculiarity of Two Sites in Human Promoters: Universality and Diverse Usage in Gene Function. Int J Mol Sci 2024; 25:1487. [PMID: 38338773 PMCID: PMC10855393 DOI: 10.3390/ijms25031487] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/08/2023] [Revised: 01/15/2024] [Accepted: 01/18/2024] [Indexed: 02/12/2024] Open
Abstract
Since the discovery of physical peculiarities around transcription start sites (TSSs) and a site corresponding to the TATA box, research has revealed only the average features of these sites. Unsettled enigmas include the individual genes with these features and whether they relate to gene function. Herein, using 10 physical properties of DNA, including duplex DNA free energy, base stacking energy, protein-induced deformability, and stabilizing energy of Z-DNA, we clarified for the first time that approximately 97% of the promoters of 21,056 human protein-coding genes have distinctive physical properties around the TSS and/or position -27; of these, nearly 65% exhibited such properties at both sites. Furthermore, about 55% of the 21,056 genes had a minimum value of regional duplex DNA free energy within TSS-centered ±300 bp regions. Notably, distinctive physical properties within the promoters and free energies of the surrounding regions separated human protein-coding genes into five groups; each contained specific gene ontology (GO) terms. The group represented by immune response genes differed distinctly from the other four regarding the parameter of the free energies of the surrounding regions. A vital suggestion from this study is that physical-feature-based analyses of genomes may reveal new aspects of the organization and regulation of genes.
Collapse
Affiliation(s)
- Kohei Uemura
- Major in Integrative Bioscience and Biomedical Engineering, Graduate School of Science and Engineering, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo 162-8480, Japan;
| | - Takashi Ohyama
- Major in Integrative Bioscience and Biomedical Engineering, Graduate School of Science and Engineering, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo 162-8480, Japan;
- Department of Biology, Faculty of Education and Integrated Arts and Sciences, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo 162-8480, Japan
| |
Collapse
|
2
|
Wang Y, Tai S, Zhang S, Sheng N, Xie X. PromGER: Promoter Prediction Based on Graph Embedding and Ensemble Learning for Eukaryotic Sequence. Genes (Basel) 2023; 14:1441. [PMID: 37510345 PMCID: PMC10379012 DOI: 10.3390/genes14071441] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/19/2023] [Revised: 07/04/2023] [Accepted: 07/10/2023] [Indexed: 07/30/2023] Open
Abstract
Promoters are DNA non-coding regions around the transcription start site and are responsible for regulating the gene transcription process. Due to their key role in gene function and transcriptional activity, the prediction of promoter sequences and their core elements accurately is a crucial research area in bioinformatics. At present, models based on machine learning and deep learning have been developed for promoter prediction. However, these models cannot mine the deeper biological information of promoter sequences and consider the complex relationship among promoter sequences. In this work, we propose a novel prediction model called PromGER to predict eukaryotic promoter sequences. For a promoter sequence, firstly, PromGER utilizes four types of feature-encoding methods to extract local information within promoter sequences. Secondly, according to the potential relationships among promoter sequences, the whole promoter sequences are constructed as a graph. Furthermore, three different scales of graph-embedding methods are applied for obtaining the global feature information more comprehensively in the graph. Finally, combining local features with global features of sequences, PromGER analyzes and predicts promoter sequences through a tree-based ensemble-learning framework. Compared with seven existing methods, PromGER improved the average specificity of 13%, accuracy of 10%, Matthew's correlation coefficient of 16%, precision of 4%, F1 score of 6%, and AUC of 9%. Specifically, this study interpreted the PromGER by the t-distributed stochastic neighbor embedding (t-SNE) method and SHAPley Additive exPlanations (SHAP) value analysis, which demonstrates the interpretability of the model.
Collapse
Affiliation(s)
- Yan Wang
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China
- School of Artificial Intelligence, Jilin University, Changchun 130012, China
| | - Shiwen Tai
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China
| | - Shuangquan Zhang
- School of Cyber Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
| | - Nan Sheng
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China
| | - Xuping Xie
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China
| |
Collapse
|
3
|
Zhang M, Jia C, Li F, Li C, Zhu Y, Akutsu T, Webb GI, Zou Q, Coin LJM, Song J. Critical assessment of computational tools for prokaryotic and eukaryotic promoter prediction. Brief Bioinform 2022; 23:6502561. [PMID: 35021193 PMCID: PMC8921625 DOI: 10.1093/bib/bbab551] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/09/2021] [Revised: 11/12/2021] [Accepted: 11/30/2021] [Indexed: 01/13/2023] Open
Abstract
Promoters are crucial regulatory DNA regions for gene transcriptional activation. Rapid advances in next-generation sequencing technologies have accelerated the accumulation of genome sequences, providing increased training data to inform computational approaches for both prokaryotic and eukaryotic promoter prediction. However, it remains a significant challenge to accurately identify species-specific promoter sequences using computational approaches. To advance computational support for promoter prediction, in this study, we curated 58 comprehensive, up-to-date, benchmark datasets for 7 different species (i.e. Escherichia coli, Bacillus subtilis, Homo sapiens, Mus musculus, Arabidopsis thaliana, Zea mays and Drosophila melanogaster) to assist the research community to assess the relative functionality of alternative approaches and support future research on both prokaryotic and eukaryotic promoters. We revisited 106 predictors published since 2000 for promoter identification (40 for prokaryotic promoter, 61 for eukaryotic promoter, and 5 for both). We systematically evaluated their training datasets, computational methodologies, calculated features, performance and software usability. On the basis of these benchmark datasets, we benchmarked 19 predictors with functioning webservers/local tools and assessed their prediction performance. We found that deep learning and traditional machine learning-based approaches generally outperformed scoring function-based approaches. Taken together, the curated benchmark dataset repository and the benchmarking analysis in this study serve to inform the design and implementation of computational approaches for promoter prediction and facilitate more rigorous comparison of new techniques in the future.
Collapse
Affiliation(s)
| | - Cangzhi Jia
- Corresponding authors: Jiangning Song, Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia. E-mail: ; Lachlan J.M. Coin, Department of Microbiology and Immunology, The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, 792 Elizabeth Street, Melbourne, Victoria 3000, Australia. E-mail: ; Quan Zou, Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China. E-mail: ; Cangzhi Jia, School of Science, Dalian Maritime University, Dalian 116026, China. E-mail:
| | | | | | | | | | - Geoffrey I Webb
- Department of Data Science and Artificial Intelligence, Monash University, Melbourne, VIC 3800, Australia,Monash Data Futures Institute, Monash University, Melbourne, VIC 3800, Australia
| | - Quan Zou
- Corresponding authors: Jiangning Song, Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia. E-mail: ; Lachlan J.M. Coin, Department of Microbiology and Immunology, The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, 792 Elizabeth Street, Melbourne, Victoria 3000, Australia. E-mail: ; Quan Zou, Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China. E-mail: ; Cangzhi Jia, School of Science, Dalian Maritime University, Dalian 116026, China. E-mail:
| | - Lachlan J M Coin
- Corresponding authors: Jiangning Song, Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia. E-mail: ; Lachlan J.M. Coin, Department of Microbiology and Immunology, The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, 792 Elizabeth Street, Melbourne, Victoria 3000, Australia. E-mail: ; Quan Zou, Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China. E-mail: ; Cangzhi Jia, School of Science, Dalian Maritime University, Dalian 116026, China. E-mail:
| | - Jiangning Song
- Corresponding authors: Jiangning Song, Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia. E-mail: ; Lachlan J.M. Coin, Department of Microbiology and Immunology, The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, 792 Elizabeth Street, Melbourne, Victoria 3000, Australia. E-mail: ; Quan Zou, Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China. E-mail: ; Cangzhi Jia, School of Science, Dalian Maritime University, Dalian 116026, China. E-mail:
| |
Collapse
|
4
|
Lee K, Moon S, Park MJ, Koh IU, Choi NH, Yu HY, Kim YJ, Kong J, Kang HG, Kim SC, Kim BJ. Integrated Analysis of Tissue-Specific Promoter Methylation and Gene Expression Profile in Complex Diseases. Int J Mol Sci 2020; 21:E5056. [PMID: 32709145 PMCID: PMC7404266 DOI: 10.3390/ijms21145056] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/08/2020] [Revised: 07/14/2020] [Accepted: 07/16/2020] [Indexed: 02/06/2023] Open
Abstract
This study investigated whether the promoter region of DNA methylation positively or negatively regulates tissue-specific genes (TSGs) and if it correlates with disease pathophysiology. We assessed tissue specificity metrics in five human tissues, using sequencing-based approaches, including 52 whole genome bisulfite sequencing (WGBS), 52 RNA-seq, and 144 chromatin immunoprecipitation sequencing (ChIP-seq) data. A correlation analysis was performed between the gene expression and DNA methylation levels of the TSG promoter region. The TSG enrichment analyses were conducted in the gene-disease association network (DisGeNET). The epigenomic association analyses of CpGs in enriched TSG promoters were performed using 1986 Infinium MethylationEPIC array data. A correlation analysis showed significant associations between the promoter methylation and 449 TSGs' expression. A disease enrichment analysis showed that diabetes- and obesity-related diseases were high-ranked. In an epigenomic association analysis based on obesity, 62 CpGs showed statistical significance. Among them, three obesity-related CpGs were newly identified and replicated with statistical significance in independent data. In particular, a CpG (cg17075888 of PDK4), considered as potential therapeutic targets, were associated with complex diseases, including obesity and type 2 diabetes. The methylation changes in a substantial number of the TSG promoters showed a significant association with metabolic diseases. Collectively, our findings provided strong evidence of the relationship between tissue-specific patterns of epigenetic changes and metabolic diseases.
Collapse
Affiliation(s)
- Kibaick Lee
- Division of Genome Research, Center for Genome Science, Korea National Institute of Health, Chungcheongbuk-do 28519, Korea; (K.L.); (S.M.); (M.-J.P.); (I.-U.K.); (N.-H.C.); (H.-Y.Y.); (Y.J.K.); (J.K.)
| | - Sanghoon Moon
- Division of Genome Research, Center for Genome Science, Korea National Institute of Health, Chungcheongbuk-do 28519, Korea; (K.L.); (S.M.); (M.-J.P.); (I.-U.K.); (N.-H.C.); (H.-Y.Y.); (Y.J.K.); (J.K.)
| | - Mi-Jin Park
- Division of Genome Research, Center for Genome Science, Korea National Institute of Health, Chungcheongbuk-do 28519, Korea; (K.L.); (S.M.); (M.-J.P.); (I.-U.K.); (N.-H.C.); (H.-Y.Y.); (Y.J.K.); (J.K.)
| | - In-Uk Koh
- Division of Genome Research, Center for Genome Science, Korea National Institute of Health, Chungcheongbuk-do 28519, Korea; (K.L.); (S.M.); (M.-J.P.); (I.-U.K.); (N.-H.C.); (H.-Y.Y.); (Y.J.K.); (J.K.)
| | - Nak-Hyeon Choi
- Division of Genome Research, Center for Genome Science, Korea National Institute of Health, Chungcheongbuk-do 28519, Korea; (K.L.); (S.M.); (M.-J.P.); (I.-U.K.); (N.-H.C.); (H.-Y.Y.); (Y.J.K.); (J.K.)
| | - Ho-Yeong Yu
- Division of Genome Research, Center for Genome Science, Korea National Institute of Health, Chungcheongbuk-do 28519, Korea; (K.L.); (S.M.); (M.-J.P.); (I.-U.K.); (N.-H.C.); (H.-Y.Y.); (Y.J.K.); (J.K.)
| | - Young Jin Kim
- Division of Genome Research, Center for Genome Science, Korea National Institute of Health, Chungcheongbuk-do 28519, Korea; (K.L.); (S.M.); (M.-J.P.); (I.-U.K.); (N.-H.C.); (H.-Y.Y.); (Y.J.K.); (J.K.)
| | - Jinhwa Kong
- Division of Genome Research, Center for Genome Science, Korea National Institute of Health, Chungcheongbuk-do 28519, Korea; (K.L.); (S.M.); (M.-J.P.); (I.-U.K.); (N.-H.C.); (H.-Y.Y.); (Y.J.K.); (J.K.)
| | - Hee Gyung Kang
- Department of Pediatrics, Seoul National University College of Medicine, Seoul 03080, Korea;
| | - Song Cheol Kim
- Department of Surgery, Asan Medical Center, AMIST, University of Ulsan College of Medicine, Seoul 05505, Korea
| | - Bong-Jo Kim
- Division of Genome Research, Center for Genome Science, Korea National Institute of Health, Chungcheongbuk-do 28519, Korea; (K.L.); (S.M.); (M.-J.P.); (I.-U.K.); (N.-H.C.); (H.-Y.Y.); (Y.J.K.); (J.K.)
| |
Collapse
|
5
|
Morton T, Wong WK, Megraw M. TIPR: transcription initiation pattern recognition on a genome scale. Bioinformatics 2015; 31:3725-32. [PMID: 26254489 DOI: 10.1093/bioinformatics/btv464] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/25/2014] [Accepted: 08/03/2015] [Indexed: 12/15/2022] Open
Abstract
MOTIVATION The computational identification of gene transcription start sites (TSSs) can provide insights into the regulation and function of genes without performing expensive experiments, particularly in organisms with incomplete annotations. High-resolution general-purpose TSS prediction remains a challenging problem, with little recent progress on the identification and differentiation of TSSs which are arranged in different spatial patterns along the chromosome. RESULTS In this work, we present the Transcription Initiation Pattern Recognizer (TIPR), a sequence-based machine learning model that identifies TSSs with high accuracy and resolution for multiple spatial distribution patterns along the genome, including broadly distributed TSS patterns that have previously been difficult to characterize. TIPR predicts not only the locations of TSSs but also the expected spatial initiation pattern each TSS will form along the chromosome-a novel capability for TSS prediction algorithms. As spatial initiation patterns are associated with spatiotemporal expression patterns and gene function, this capability has the potential to improve gene annotations and our understanding of the regulation of transcription initiation. The high nucleotide resolution of this model locates TSSs within 10 nucleotides or less on average. AVAILABILITY AND IMPLEMENTATION Model source code is made available online at http://megraw.cgrb.oregonstate.edu/software/TIPR/. CONTACT megrawm@science.oregonstate.edu. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Collapse
Affiliation(s)
- Taj Morton
- Department of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR 97331, USA
| | - Weng-Keen Wong
- Department of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR 97331, USA
| | - Molly Megraw
- Department of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR 97331, USA, Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA and Center for Genome Research and Biocomputing, Oregon State University, Corvallis, OR 97331, USA
| |
Collapse
|
6
|
Yella VR, Bansal M. In silico Identification of Eukaryotic Promoters. SYSTEMS AND SYNTHETIC BIOLOGY 2015. [DOI: 10.1007/978-94-017-9514-2_4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
|
7
|
Xiong D, Liu R, Xiao F, Gao X. ProMT: effective human promoter prediction using Markov chain model based on DNA structural properties. IEEE Trans Nanobioscience 2014; 13:374-83. [PMID: 24919203 DOI: 10.1109/tnb.2014.2327586] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
The core promoters play significant and extensive roles for the initiation and regulation of DNA transcription. The identification of core promoters is one of the most challenging problems yet. Due to the diverse nature of core promoters, the results obtained through existing computational approaches are not satisfactory. None of them considered the potential influence on performance of predictive approach resulted by the interference between neighboring TSSs in TSS clusters. In this paper, we sufficiently considered this main factor and proposed an approach to locate potential TSS clusters according to the correlation of regional profiles of DNA and TSS clusters. On this basis, we further presented a novel computational approach (ProMT) for promoter prediction using Markov chain model and predictive TSS clusters based on structural properties of DNA. Extensive experiments demonstrated that ProMT can significantly improve the predictive performance. Therefore, considering interference between neighboring TSSs is essential for a wider range of promoter prediction.
Collapse
|
8
|
Huang WL, Tung CW, Liaw C, Huang HL, Ho SY. Rule-based knowledge acquisition method for promoter prediction in human and Drosophila species. ScientificWorldJournal 2014; 2014:327306. [PMID: 24955394 PMCID: PMC3927563 DOI: 10.1155/2014/327306] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/31/2013] [Accepted: 10/10/2013] [Indexed: 01/08/2023] Open
Abstract
The rapid and reliable identification of promoter regions is important when the number of genomes to be sequenced is increasing very speedily. Various methods have been developed but few methods investigate the effectiveness of sequence-based features in promoter prediction. This study proposes a knowledge acquisition method (named PromHD) based on if-then rules for promoter prediction in human and Drosophila species. PromHD utilizes an effective feature-mining algorithm and a reference feature set of 167 DNA sequence descriptors (DNASDs), comprising three descriptors of physicochemical properties (absorption maxima, molecular weight, and molar absorption coefficient), 128 top-ranked descriptors of 4-mer motifs, and 36 global sequence descriptors. PromHD identifies two feature subsets with 99 and 74 DNASDs and yields test accuracies of 96.4% and 97.5% in human and Drosophila species, respectively. Based on the 99- and 74-dimensional feature vectors, PromHD generates several if-then rules by using the decision tree mechanism for promoter prediction. The top-ranked informative rules with high certainty grades reveal that the global sequence descriptor, the length of nucleotide A at the first position of the sequence, and two physicochemical properties, absorption maxima and molecular weight, are effective in distinguishing promoters from non-promoters in human and Drosophila species, respectively.
Collapse
Affiliation(s)
- Wen-Lin Huang
- Department of Management Information System, Asia Pacific Institute of Creativity, Miaoli 351, Taiwan
| | - Chun-Wei Tung
- School of Pharmacy, College of Pharmacy, Kaohsiung Medical University, Kaohsiung 807, Taiwan
| | - Chyn Liaw
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu 300, Taiwan
| | - Hui-Ling Huang
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu 300, Taiwan
- Department of Biological Science and Technology, National Chiao Tung University, Hsinchu 300, Taiwan
| | - Shinn-Ying Ho
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu 300, Taiwan
- Department of Biological Science and Technology, National Chiao Tung University, Hsinchu 300, Taiwan
| |
Collapse
|
9
|
Bedo J, Kowalczyk A. Genome annotation test with validation on transcription start site and ChIP-Seq for Pol-II binding data. Bioinformatics 2011; 27:1610-7. [DOI: 10.1093/bioinformatics/btr263] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open
|
10
|
Zhang Z, Zhang MQ. Histone modification profiles are predictive for tissue/cell-type specific expression of both protein-coding and microRNA genes. BMC Bioinformatics 2011; 12:155. [PMID: 21569556 PMCID: PMC3120700 DOI: 10.1186/1471-2105-12-155] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/09/2010] [Accepted: 05/14/2011] [Indexed: 02/04/2023] Open
Abstract
Background Gene expression is regulated at both the DNA sequence level and through modification of chromatin. However, the effect of chromatin on tissue/cell-type specific gene regulation (TCSR) is largely unknown. In this paper, we present a method to elucidate the relationship between histone modification/variation (HMV) and TCSR. Results A classifier for differentiating CD4+ T cell-specific genes from housekeeping genes using HMV data was built. We found HMV in both promoter and gene body regions to be predictive of genes which are targets of TCSR. For example, the histone modification types H3K4me3 and H3K27ac were identified as the most predictive for CpG-related promoters, whereas H3K4me3 and H3K79me3 were the most predictive for nonCpG-related promoters. However, genes targeted by TCSR can be predicted using other type of HMVs as well. Such redundancy implies that multiple type of underlying regulatory elements, such as enhancers or intragenic alternative promoters, which can regulate gene expression in a tissue/cell-type specific fashion, may be marked by the HMVs. Finally, we show that the predictive power of HMV for TCSR is not limited to protein-coding genes in CD4+ T cells, as we successfully predicted TCSR targeted genes in muscle cells, as well as microRNA genes with expression specific to CD4+ T cells, by the same classifier which was trained on HMV data of protein-coding genes in CD4+ T cells. Conclusion We have begun to understand the HMV patterns that guide gene expression in both tissue/cell-type specific and ubiquitous manner.
Collapse
Affiliation(s)
- Zhihua Zhang
- Department of Molecular Cell Biology, Center for Systems Biology, University of Texas at Dallas, Richardson, TX 75080, USA
| | | |
Collapse
|
11
|
Identification of TATA and TATA-less promoters in plant genomes by integrating diversity measure, GC-Skew and DNA geometric flexibility. Genomics 2010; 97:112-20. [PMID: 21112384 DOI: 10.1016/j.ygeno.2010.11.002] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/09/2010] [Revised: 11/05/2010] [Accepted: 11/12/2010] [Indexed: 11/20/2022]
Abstract
Accurate identification of core promoters is important for gaining more insight about the understanding of the eukaryotic transcription regulation. In this study, the authors focused on the biologically realistic promoter prediction of plant genomes. By analyzing the correlative conservation, GC-compositional bias and specific structural patterns of TATA and TATA-less promoters in PlantPromDB, a hybrid multi-feature approach based on support vector machine (SVM) for predicting the two types of promoters were developed by integrating local word content, GC-Skew and DNA geometric flexibility. Compared with the TSSP-TCM program on the same test dataset, better prediction results were obtained. Especially for the TATA-less promoter, the accuracy is 10% higher than the result of TSSP-TCM program. The good performance of the hybrid promoters and the experimental data also indicate that our method has the ability to locate the promoter region of the plant genome.
Collapse
|
12
|
Liu Y, Han D, Han Y, Yan Z, Xie B, Li J, Qiao N, Hu H, Khaitovich P, Gao Y, Han JDJ. Ab initio identification of transcription start sites in the Rhesus macaque genome by histone modification and RNA-Seq. Nucleic Acids Res 2010; 39:1408-18. [PMID: 20952408 PMCID: PMC3045608 DOI: 10.1093/nar/gkq956] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/02/2022] Open
Abstract
Rhesus macaque is a widely used primate model organism. Its genome annotations are however still largely comparative computational predictions derived mainly from human genes, which precludes studies on the macaque-specific genes, gene isoforms or their regulations. Here we took advantage of histone H3 lysine 4 trimethylation (H3K4me3)’s ability to mark transcription start sites (TSSs) and the recently developed ChIP-Seq and RNA-Seq technology to survey the transcript structures. We generated 14 013 757 sequence tags by H3K4me3 ChIP-Seq and obtained 17 322 358 paired end reads for mRNA, and 10 698 419 short reads for sRNA from the macaque brain. By integrating these data with genomic sequence features and extending and improving a state-of-the-art TSS prediction algorithm, we ab initio predicted and verified 17 933 of previously electronically annotated TSSs at 500-bp resolution. We also predicted approximately 10 000 novel TSSs. These provide an important rich resource for close examination of the species-specific transcript structures and transcription regulations in the Rhesus macaque genome. Our approach exemplifies a relatively inexpensive way to generate a reasonably reliable TSS map for a large genome. It may serve as a guiding example for similar genome annotation efforts targeted at other model organisms.
Collapse
Affiliation(s)
- Yi Liu
- Chinese Academy of Sciences Key Laboratory of Molecular Developmental Biology, Center for Molecular Systems Biology, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Lincui East Road, Beijing, 100101, Chinese
| | | | | | | | | | | | | | | | | | | | | |
Collapse
|
13
|
Zeng J, Zhao XY, Cao XQ, Yan H. SCS: signal, context, and structure features for genome-wide human promoter recognition. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2010; 7:550-562. [PMID: 20671324 DOI: 10.1109/tcbb.2008.95] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/29/2023]
Abstract
This paper integrates the signal, context, and structure features for genome-wide human promoter recognition, which is important in improving genome annotation and analyzing transcriptional regulation without experimental supports of ESTs, cDNAs, or mRNAs. First, CpG islands are salient biological signals associated with approximately 50 percent of mammalian promoters. Second, the genomic context of promoters may have biological significance, which is based on n-mers (sequences of n bases long) and their statistics estimated from training samples. Third, sequence-dependent DNA flexibility originates from DNA 3D structures and plays an important role in guiding transcription factors to the target site in promoters. Employing decision trees, we combine above signal, context, and structure features to build a hierarchical promoter recognition system called SCS. Experimental results on controlled data sets and the entire human genome demonstrate that SCS is significantly superior in terms of sensitivity and specificity as compared to other state-of-the-art methods. The SCS promoter recognition system is available online as supplemental materials for academic use and can be found on the Computer Society Digital Library at http://doi.ieeecomputersociety.org/10.1109/TCBB.2008.95.
Collapse
Affiliation(s)
- Jia Zeng
- School of Computer Science and Technology, Soochow University, Suzhou, China.
| | | | | | | |
Collapse
|
14
|
A domesticated transposon mediates the effects of a single-nucleotide polymorphism responsible for enhanced muscle growth. EMBO Rep 2010; 11:305-11. [PMID: 20134481 PMCID: PMC2854606 DOI: 10.1038/embor.2010.6] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/03/2009] [Revised: 12/11/2009] [Accepted: 01/05/2010] [Indexed: 11/09/2022] Open
Abstract
Single-nucleotide polymorphisms (SNPs) in the regulatory regions of the genome can have a profound impact on phenotype. The G3072A polymorphism in intron 3 of insulin-like growth factor 2 (IGF2) is implicated in higher muscle content and reduced fat in European pigs and is bound by a putative repressor. Here, we identify this repressor--which we call muscle growth regulator (MGR)--by using a DNA protein interaction screen based on quantitative mass spectrometry. MGR has a bipartite nuclear localization signal, two BED-type zinc fingers and is highly conserved between placental mammals. Surprisingly, the gene is located in an intron and belongs to the hobo-Ac-Tam3 transposase superfamily, suggesting regulatory use of a formerly parasitic element. In transactivation assays, MGR differentially represses the expression of the two SNP variants. Knockdown of MGR in C2C12 myoblast cells upregulates Igf2 expression and mild overexpression retards growth. Thus, MGR is the repressor responsible for enhanced muscle growth in the IGF2 G3072A polymorphism in commercially bred pigs.
Collapse
|
15
|
Zeng J, Zhu S, Yan H. Towards accurate human promoter recognition: a review of currently used sequence features and classification methods. Brief Bioinform 2009; 10:498-508. [PMID: 19531545 DOI: 10.1093/bib/bbp027] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
Abstract
This review describes important advances that have been made during the past decade for genome-wide human promoter recognition. Interest in promoter recognition algorithms on a genome-wide scale is worldwide and touches on a number of practical systems that are important in analysis of gene regulation and in genome annotation without experimental support of ESTs, cDNAs or mRNAs. The main focus of this review is on feature extraction and model selection for accurate human promoter recognition, with descriptions of what they are, what has been accomplished, and what remains to be done.
Collapse
Affiliation(s)
- Jia Zeng
- Department of Computer Science, Hong Kong Baptist University, Kowloon, Hong Kong.
| | | | | |
Collapse
|
16
|
Gan Y, Guan J, Zhou S. A pattern-based nearest neighbor search approach for promoter prediction using DNA structural profiles. ACTA ACUST UNITED AC 2009; 25:2006-12. [PMID: 19515962 DOI: 10.1093/bioinformatics/btp359] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022]
Abstract
MOTIVATION Identification of core promoters is a key clue in understanding gene regulations. However, due to the diverse nature of promoter sequences, the accuracy of existing prediction approaches for non-CpG island (simply CGI)-related promoters is not as high as that for CGI-related promoters. This consequently leads to a low genome-wide promoter prediction accuracy. RESULTS In this article, we first systematically analyze the similarities and differences between the two types of promoters (CGI- and non-CGI-related) from a novel structural perspective, and then devise a unified framework, called PNNP (Pattern-based Nearest Neighbor search for Promoter), to predict both CGI- and non-CGI-related promoters based on their structural features. Our comparative analysis on the structural characteristics of promoters reveals two interesting facts: (i) the structural values of CGI- and non-CGI-related promoters are quite different, but they exhibit nearly similar structural patterns; (ii) the structural patterns of promoters are obviously different from that of non-promoter sequences though the sequences have almost similar structural values. Extensive experiments demonstrate that the proposed PNNP approach is effective in capturing the structural patterns of promoters, and can significantly improve genome-wide performance of promoters prediction, especially non-CGI-related promoters prediction. AVAILABILITY The implementation of the program PNNP is available at http://admis.tongji.edu.cn/Projects/pnnp.aspx.
Collapse
Affiliation(s)
- Yanglan Gan
- Department of Computer Science and Technology, Tongji University, Shanghai 201804, China
| | | | | |
Collapse
|
17
|
Corcoran DL, Pandit KV, Gordon B, Bhattacharjee A, Kaminski N, Benos PV. Features of mammalian microRNA promoters emerge from polymerase II chromatin immunoprecipitation data. PLoS One 2009; 4:e5279. [PMID: 19390574 PMCID: PMC2668758 DOI: 10.1371/journal.pone.0005279] [Citation(s) in RCA: 225] [Impact Index Per Article: 15.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/05/2009] [Accepted: 03/23/2009] [Indexed: 02/05/2023] Open
Abstract
Background MicroRNAs (miRNAs) are short, non-coding RNA regulators of protein coding genes. miRNAs play a very important role in diverse biological processes and various diseases. Many algorithms are able to predict miRNA genes and their targets, but their transcription regulation is still under investigation. It is generally believed that intragenic miRNAs (located in introns or exons of protein coding genes) are co-transcribed with their host genes and most intergenic miRNAs transcribed from their own RNA polymerase II (Pol II) promoter. However, the length of the primary transcripts and promoter organization is currently unknown. Methodology We performed Pol II chromatin immunoprecipitation (ChIP)-chip using a custom array surrounding regions of known miRNA genes. To identify the true core transcription start sites of the miRNA genes we developed a new tool (CPPP). We showed that miRNA genes can be transcribed from promoters located several kilobases away and that their promoters share the same general features as those of protein coding genes. Finally, we found evidence that as many as 26% of the intragenic miRNAs may be transcribed from their own unique promoters. Conclusion miRNA promoters have similar features to those of protein coding genes, but miRNA transcript organization is more complex.
Collapse
Affiliation(s)
- David L. Corcoran
- Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
| | - Kusum V. Pandit
- Department of Human Genetics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
- Dorothy P. and Richard P. Simmons Center for Interstitial Lung Disease, Division of Pulmonary, Allergy and Critical care Medicine, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, United States of America
| | - Ben Gordon
- Genomics, Agilent Technologies, Inc., Santa Clara, California, United States of America
| | - Arindam Bhattacharjee
- Genomics, Agilent Technologies, Inc., Santa Clara, California, United States of America
| | - Naftali Kaminski
- Dorothy P. and Richard P. Simmons Center for Interstitial Lung Disease, Division of Pulmonary, Allergy and Critical care Medicine, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania, United States of America
- * E-mail: (NK); (PVB)
| | - Panayiotis V. Benos
- Department of Computational Biology, University of Pittsburgh School of Medicine, Pittsburg, Pennsylvania, United States of America
- Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburg, Pennsylvania, United States of America
- * E-mail: (NK); (PVB)
| |
Collapse
|
18
|
Megraw M, Pereira F, Jensen ST, Ohler U, Hatzigeorgiou AG. A transcription factor affinity-based code for mammalian transcription initiation. Genome Res 2009; 19:644-56. [PMID: 19141595 DOI: 10.1101/gr.085449.108] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/19/2023]
Abstract
The recent arrival of large-scale cap analysis of gene expression (CAGE) data sets in mammals provides a wealth of quantitative information on coding and noncoding RNA polymerase II transcription start sites (TSS). Genome-wide CAGE studies reveal that a large fraction of TSS exhibit peaks where the vast majority of associated tags map to a particular location ( approximately 45%), whereas other active regions contain a broader distribution of initiation events. The presence of a strong single peak suggests that transcription at these locations may be mediated by position-specific sequence features. We therefore propose a new model for single-peaked TSS based solely on known transcription factors (TFs) and their respective regions of positional enrichment. This probabilistic model leads to near-perfect classification results in cross-validation (auROC = 0.98), and performance in genomic scans demonstrates that TSS prediction with both high accuracy and spatial resolution is achievable for a specific but large subgroup of mammalian promoters. The interpretable model structure suggests a DNA code in which canonical sequence features such as TATA-box, Initiator, and GC content do play a significant role, but many additional TFs show distinct spatial biases with respect to TSS location and are important contributors to the accurate prediction of single-peak transcription initiation sites. The model structure also reveals that CAGE tag clusters distal from annotated gene starts have distinct characteristics compared to those close to gene 5'-ends. Using this high-resolution single-peak model, we predict TSS for approximately 70% of mammalian microRNAs based on currently available data.
Collapse
Affiliation(s)
- Molly Megraw
- Institute for Genome Sciences and Policy, Duke University, Durham, North Carolina 27708, USA
| | | | | | | | | |
Collapse
|
19
|
Wang X, Xuan Z, Zhao X, Li Y, Zhang MQ. High-resolution human core-promoter prediction with CoreBoost_HM. Genome Res 2008; 19:266-75. [PMID: 18997002 DOI: 10.1101/gr.081638.108] [Citation(s) in RCA: 88] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/19/2022]
Abstract
Correctly locating the gene transcription start site and the core-promoter is important for understanding transcriptional regulation mechanism. Here we have integrated specific genome-wide histone modification and DNA sequence features together to predict RNA polymerase II core-promoters in the human genome. Our new predictor CoreBoost_HM outperforms existing promoter prediction algorithms by providing significantly higher sensitivity and specificity at high resolution. We demonstrated that even though the histone modification data used in this study are from a specific cell type (CD4+ T-cell), our method can be used to identify both active and repressed promoters. We have applied it to search the upstream regions of microRNA genes, and show that CoreBoost_HM can accurately identify the known promoters of the intergenic microRNAs. We also identified a few intronic microRNAs that may have their own promoters. This result suggests that our new method can help to identify and characterize the core-promoters of both coding and noncoding genes.
Collapse
Affiliation(s)
- Xiaowo Wang
- MOE Key Laboratory of Bioinformatics and Bioinformatics Division, TNLIST/Department of Automation, Tsinghua University, Beijing 100084, China
| | | | | | | | | |
Collapse
|
20
|
Yang JY, Zhou Y, Yu ZG, Anh V, Zhou LQ. Human Pol II promoter recognition based on primary sequences and free energy of dinucleotides. BMC Bioinformatics 2008; 9:113. [PMID: 18294399 PMCID: PMC2292139 DOI: 10.1186/1471-2105-9-113] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/13/2007] [Accepted: 02/24/2008] [Indexed: 01/29/2023] Open
Abstract
Background Promoter region plays an important role in determining where the transcription of a particular gene should be initiated. Computational prediction of eukaryotic Pol II promoter sequences is one of the most significant problems in sequence analysis. Existing promoter prediction methods are still far from being satisfactory. Results We attempt to recognize the human Pol II promoter sequences from the non-promoter sequences which are made up of exon and intron sequences. Four methods are used: two kinds of multifractal analysis performed on the numeric sequences obtained from the dinucleotide free energy, Z curve analysis and global descriptor of the promoter/non-promoter primary sequences. A total of 141 parameters are extracted from these methods and categorized into seven groups (methods). They are used to generate certain spaces and then each promoter/non-promoter sequence is represented by a point in the corresponding space. All the 120 possible combinations of the seven methods are tested. Based on Fisher's linear discriminant algorithm, with a relatively smaller number of parameters (96 and 117), we get satisfactory discriminant accuracies. Particularly, in the case of 117 parameters, the accuracies for the training and test sets reach 90.43% and 89.79%, respectively. A comparison with five other existing methods indicates that our methods have a better performance. Using the global descriptor method (36 parameters), 17 of the 18 experimentally verified promoter sequences of human chromosome 22 are correctly identified. Conclusion The high accuracies achieved suggest that the methods of this paper are useful for understanding the difficult problem of promoter prediction.
Collapse
Affiliation(s)
- Jian-Yi Yang
- School of Mathematics and Computational Science, Xiangtan University, Hunan 411105, China.
| | | | | | | | | |
Collapse
|
21
|
Frith MC, Valen E, Krogh A, Hayashizaki Y, Carninci P, Sandelin A. A code for transcription initiation in mammalian genomes. Genes Dev 2008; 18:1-12. [PMID: 18032727 PMCID: PMC2134772 DOI: 10.1101/gr.6831208] [Citation(s) in RCA: 189] [Impact Index Per Article: 11.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/21/2007] [Accepted: 10/14/2007] [Indexed: 11/24/2022]
Abstract
Genome-wide detection of transcription start sites (TSSs) has revealed that RNA Polymerase II transcription initiates at millions of positions in mammalian genomes. Most core promoters do not have a single TSS, but an array of closely located TSSs with different rates of initiation. As a rule, genes have more than one such core promoter; however, defining the boundaries between core promoters is not trivial. These discoveries prompt a re-evaluation of our models for transcription initiation. We describe a new framework for understanding the organization of transcription initiation. We show that initiation events are clustered on the chromosomes at multiple scales-clusters within clusters-indicating multiple regulatory processes. Within the smallest of such clusters, which can be interpreted as core promoters, the local DNA sequence predicts the relative transcription start usage of each nucleotide with a remarkable 91% accuracy, implying the existence of a DNA code that determines TSS selection. Conversely, the total expression strength of such clusters is only partially determined by the local DNA sequence. Thus, the overall control of transcription can be understood as a combination of large- and small-scale effects; the selection of transcription start sites is largely governed by the local DNA sequence, whereas the transcriptional activity of a locus is regulated at a different level; it is affected by distal features or events such as enhancers and chromatin remodeling.
Collapse
Affiliation(s)
- Martin C. Frith
- Genome Exploration Research Group (Genome Network Project Core Group), RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa, 230-0045, Japan
- ARC Centre in Bioinformatics, Institute for Molecular Bioscience, University of Queensland, Brisbane, Qld 4072, Australia
| | - Eivind Valen
- The Bioinformatics Centre, Department of Molecular Biology & Biotech Research and Innovation Centre, University of Copenhagen, Ole Maaløes Vej 5, DK-2200 København N, Denmark
| | - Anders Krogh
- The Bioinformatics Centre, Department of Molecular Biology & Biotech Research and Innovation Centre, University of Copenhagen, Ole Maaløes Vej 5, DK-2200 København N, Denmark
| | - Yoshihide Hayashizaki
- Genome Exploration Research Group (Genome Network Project Core Group), RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa, 230-0045, Japan
- Genome Science Laboratory, Discovery Research Institute, RIKEN Wako Institute, 2-1 Hirosawa, Wako, Saitama, 351-0198, Japan
| | - Piero Carninci
- Genome Exploration Research Group (Genome Network Project Core Group), RIKEN Genomic Sciences Center (GSC), RIKEN Yokohama Institute, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa, 230-0045, Japan
- Genome Science Laboratory, Discovery Research Institute, RIKEN Wako Institute, 2-1 Hirosawa, Wako, Saitama, 351-0198, Japan
| | - Albin Sandelin
- The Bioinformatics Centre, Department of Molecular Biology & Biotech Research and Innovation Centre, University of Copenhagen, Ole Maaløes Vej 5, DK-2200 København N, Denmark
| |
Collapse
|
22
|
Wang J, Ungar LH, Tseng H, Hannenhalli S. MetaProm: a neural network based meta-predictor for alternative human promoter prediction. BMC Genomics 2007; 8:374. [PMID: 17941982 PMCID: PMC2194789 DOI: 10.1186/1471-2164-8-374] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/23/2007] [Accepted: 10/17/2007] [Indexed: 01/21/2023] Open
Abstract
BACKGROUND De novo eukaryotic promoter prediction is important for discovering novel genes and understanding gene regulation. In spite of the great advances made in the past decade, recent studies revealed that the overall performances of the current promoter prediction programs (PPPs) are still poor, and predictions made by individual PPPs do not overlap each other. Furthermore, most PPPs are trained and tested on the most-upstream promoters; their performances on alternative promoters have not been assessed. RESULTS In this paper, we evaluate the performances of current major promoter prediction programs (i.e., PSPA, FirstEF, McPromoter, DragonGSF, DragonPF, and FProm) using 42,536 distinct human gene promoters on a genome-wide scale, and with emphasis on alternative promoters. We describe an artificial neural network (ANN) based meta-predictor program that integrates predictions from the current PPPs and the predicted promoters' relation to CpG islands. Our specific analysis of recently discovered alternative promoters reveals that although only 41% of the 3' most promoters overlap a CpG island, 74% of 5' most promoters overlap a CpG island. CONCLUSION Our assessment of six PPPs on 1.06 x 109 bps of human genome sequence reveals the specific strengths and weaknesses of individual PPPs. Our meta-predictor outperforms any individual PPP in sensitivity and specificity. Furthermore, we discovered that the 5' alternative promoters are more likely to be associated with a CpG island.
Collapse
Affiliation(s)
- Junwen Wang
- Center for Bioinformatics, University of Pennsylvania, Philadelphia, PA 19104, USA.
| | | | | | | |
Collapse
|
23
|
Abstract
Computational analysis of eukaryotic promoters is one of the most difficult problems in computational genomics and is essential for understanding gene expression profiles and reverse-engineering gene regulation network circuits. Here I give a basic introduction of the problem and recent update on both experimental and computational approaches. More details may be found in the extended references. This review is based on a summer lecture given at Max Planck Institute at Berlin in 2005.
Collapse
Affiliation(s)
- Michael Q Zhang
- Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY 11724, USA
| |
Collapse
|