1
Wang Y, Tai S, Zhang S, Sheng N, Xie X. PromGER: Promoter Prediction Based on Graph Embedding and Ensemble Learning for Eukaryotic Sequence. Genes (Basel) 2023; 14:1441. [PMID: 37510345 PMCID: PMC10379012 DOI: 10.3390/genes14071441] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2023] [Revised: 07/04/2023] [Accepted: 07/10/2023] [Indexed: 07/30/2023]
Abstract
Promoters are non-coding DNA regions around the transcription start site and are responsible for regulating the gene transcription process. Due to their key role in gene function and transcriptional activity, the accurate prediction of promoter sequences and their core elements is a crucial research area in bioinformatics. At present, models based on machine learning and deep learning have been developed for promoter prediction. However, these models cannot mine the deeper biological information of promoter sequences or account for the complex relationships among promoter sequences. In this work, we propose a novel prediction model called PromGER to predict eukaryotic promoter sequences. For a promoter sequence, firstly, PromGER utilizes four types of feature-encoding methods to extract local information within promoter sequences. Secondly, according to the potential relationships among promoter sequences, the full set of promoter sequences is constructed as a graph. Furthermore, three different scales of graph-embedding methods are applied to obtain the global feature information in the graph more comprehensively. Finally, combining local features with global features of sequences, PromGER analyzes and predicts promoter sequences through a tree-based ensemble-learning framework. Compared with seven existing methods, PromGER improved average specificity by 13%, accuracy by 10%, Matthews correlation coefficient by 16%, precision by 4%, F1 score by 6%, and AUC by 9%. In addition, this study interpreted PromGER using the t-distributed stochastic neighbor embedding (t-SNE) method and SHapley Additive exPlanations (SHAP) value analysis, which demonstrates the interpretability of the model.
Affiliation(s)
- Yan Wang
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China
- School of Artificial Intelligence, Jilin University, Changchun 130012, China
- Shiwen Tai
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China
- Shuangquan Zhang
- School of Cyber Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
- Nan Sheng
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China
- Xuping Xie
- Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China
2
PromoterLCNN: A Light CNN-Based Promoter Prediction and Classification Model. Genes (Basel) 2022; 13:1126. [PMID: 35885909 PMCID: PMC9325283 DOI: 10.3390/genes13071126] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/23/2022] [Revised: 06/15/2022] [Accepted: 06/20/2022] [Indexed: 01/01/2023]
Abstract
Promoter identification is a fundamental step in understanding bacterial gene regulation mechanisms. However, accurate and fast classification of bacterial promoters continues to be challenging. New methods based on deep convolutional networks have been applied to identify and classify bacterial promoters recognized by sigma (σ) factors and RNA polymerase subunits which increase affinity to specific DNA sequences to modulate transcription and respond to nutritional or environmental changes. This work presents a new multiclass promoter prediction model by using convolutional neural networks (CNNs), denoted as PromoterLCNN, which classifies Escherichia coli promoters into subclasses σ70, σ24, σ32, σ38, σ28, and σ54. We present a light, fast, and simple two-stage multiclass CNN architecture for promoter identification and classification. Training and testing were performed on a benchmark dataset, part of RegulonDB. Comparative performance of PromoterLCNN against other CNN-based classifiers using four parameters (Acc, Sn, Sp, MCC) resulted in similar or better performance than those that commonly use cascade architecture, reducing time by approximately 30–90% for training, prediction, and hyperparameter optimization without compromising classification quality.
3
Zhang M, Jia C, Li F, Li C, Zhu Y, Akutsu T, Webb GI, Zou Q, Coin LJM, Song J. Critical assessment of computational tools for prokaryotic and eukaryotic promoter prediction. Brief Bioinform 2022; 23:6502561. [PMID: 35021193 PMCID: PMC8921625 DOI: 10.1093/bib/bbab551] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 08/09/2021] [Revised: 11/12/2021] [Accepted: 11/30/2021] [Indexed: 01/13/2023]
Abstract
Promoters are crucial regulatory DNA regions for gene transcriptional activation. Rapid advances in next-generation sequencing technologies have accelerated the accumulation of genome sequences, providing increased training data to inform computational approaches for both prokaryotic and eukaryotic promoter prediction. However, it remains a significant challenge to accurately identify species-specific promoter sequences using computational approaches. To advance computational support for promoter prediction, in this study, we curated 58 comprehensive, up-to-date, benchmark datasets for 7 different species (i.e. Escherichia coli, Bacillus subtilis, Homo sapiens, Mus musculus, Arabidopsis thaliana, Zea mays and Drosophila melanogaster) to assist the research community to assess the relative functionality of alternative approaches and support future research on both prokaryotic and eukaryotic promoters. We revisited 106 predictors published since 2000 for promoter identification (40 for prokaryotic promoter, 61 for eukaryotic promoter, and 5 for both). We systematically evaluated their training datasets, computational methodologies, calculated features, performance and software usability. On the basis of these benchmark datasets, we benchmarked 19 predictors with functioning webservers/local tools and assessed their prediction performance. We found that deep learning and traditional machine learning-based approaches generally outperformed scoring function-based approaches. Taken together, the curated benchmark dataset repository and the benchmarking analysis in this study serve to inform the design and implementation of computational approaches for promoter prediction and facilitate more rigorous comparison of new techniques in the future.
Affiliation(s)
- Cangzhi Jia (corresponding author)
- School of Science, Dalian Maritime University, Dalian 116026, China
- Geoffrey I Webb
- Department of Data Science and Artificial Intelligence, Monash University, Melbourne, VIC 3800, Australia
- Monash Data Futures Institute, Monash University, Melbourne, VIC 3800, Australia
- Quan Zou (corresponding author)
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Lachlan J M Coin (corresponding author)
- Department of Microbiology and Immunology, The Peter Doherty Institute for Infection and Immunity, The University of Melbourne, 792 Elizabeth Street, Melbourne, Victoria 3000, Australia
- Jiangning Song (corresponding author)
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
4
Idris AB, Idris EB, Ataelmanan AE, Mohamed AEA, Osman Arbab BM, Ibrahim EAM, Hassan MA. First insights into the molecular basis association between promoter polymorphisms of the IL1B gene and Helicobacter pylori infection in the Sudanese population: computational approach. BMC Microbiol 2021; 21:16. [PMID: 33413117 PMCID: PMC7792167 DOI: 10.1186/s12866-020-02072-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/19/2020] [Accepted: 12/15/2020] [Indexed: 12/14/2022]
Abstract
BACKGROUND Helicobacter pylori (H. pylori) infects nearly half of the world's population, with incidence varying among geographic regions. Genetic variants in the promoter regions of the IL1B gene can affect cytokine expression and create a condition of hypoacidity which favors the survival and colonization of H. pylori. Therefore, the aim of this study was to characterize the polymorphic sites in the 5'-region [-687_+297] of IL1B in H. pylori infection using in silico tools. RESULTS A total of five nucleotide variations were detected in the 5'-regulatory region [-687_+297] of IL1B which led to the addition or alteration of transcription factor binding sites (TFBSs) or composite regulatory elements (CEs). Genotyping of IL1B -31 C>T revealed a significant association between -31 T and susceptibility to H. pylori infection in the studied population (P = 0.0363). Comparative analysis showed conservation rates of the IL1B upstream [-368_+10] region above 70% in chimpanzee, rhesus monkey, domesticated dog, cow and rat. CONCLUSIONS In H. pylori-infected patients, three detected SNPs (-338, -155 and -31) located in the IL1B promoter were predicted to alter TFBSs and CEs, which might affect gene expression. These in silico predictions provide insight for further experimental in vitro and in vivo studies of the regulation of IL1B expression and its relationship to H. pylori infection. However, the recognition of regulatory motifs by computer algorithms is fundamental for understanding gene expression patterns.
Affiliation(s)
- Abeer Babiker Idris
- Department of Medical Microbiology, Faculty of Medical Laboratory Sciences, University of Khartoum, Khartoum, Sudan
- Einas Babiker Idris
- Medical Laboratory Specialist, Department of Medical Microbiology, Rashid Medical Complex, Riyadh, Saudi Arabia
- Amany Eltayib Ataelmanan
- Department of Medical Microbiology, Faculty of Medical Laboratory Sciences, University of Al-Gazirah, Wad Madani, Sudan
- El-Amin Mohamed Ibrahim
- Department of Medical Microbiology, Faculty of Medical Laboratory Sciences, University of Khartoum, Khartoum, Sudan
- Mohamed A Hassan
- Department of Bioinformatics, Africa city of technology, Khartoum, Sudan
- Department of Bioinformatics, DETAGEN Genetic Diagnostics Center, Kayseri, Turkey
- Department of Translation Bioinformatics, Detavax Biotech, Kayseri, Turkey
5
Xin S, Wang X, Dai G, Zhang J, An T, Zou W, Zhang G, Xie K, Wang J. Bioinformatics Analysis of SNPs in IL-6 Gene Promoter of Jinghai Yellow Chickens. Genes (Basel) 2018; 9:446. [PMID: 30200658 PMCID: PMC6162446 DOI: 10.3390/genes9090446] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 07/18/2018] [Revised: 08/24/2018] [Accepted: 08/31/2018] [Indexed: 11/16/2022]
Abstract
The proinflammatory cytokine, interleukin-6 (IL-6), plays a critical role in many chronic inflammatory diseases, particularly inflammatory bowel disease. To investigate the regulation of IL-6 gene expression at the molecular level, genomic DNA sequencing of Jinghai yellow chickens (Gallus gallus) was performed to detect single-nucleotide polymorphisms (SNPs) in the region -2200 base pairs (bp) upstream to 500 bp downstream of IL-6. Transcription factor binding sites and CpG islands in the IL-6 promoter region were predicted using bioinformatics software. Twenty-eight SNP sites were identified in IL-6. Four of these 28 SNPs, three [-357 (G > A), -447 (C > G), and -663 (A > G)] in the 5' regulatory region and one in the 3' non-coding region [3177 (C > T)] are not labelled in GenBank. Bioinformatics analysis revealed 11 SNPs within the promoter region that altered putative transcription factor binding sites. Furthermore, the C-939G mutation in the promoter region may change the number of CpG islands, and SNPs in the 5' regulatory region may influence IL-6 gene expression by altering transcription factor binding or CpG methylation status. Genetic diversity analysis revealed that the newly discovered A-663G site significantly deviated from Hardy-Weinberg equilibrium. These results provide a basis for further exploration of the promoter function of the IL-6 gene and the relationships of these SNPs to intestinal inflammation resistance in chickens.
Affiliation(s)
- Shijie Xin
- College of Animal Science and Technology, Yangzhou University, Yangzhou 225009, China.
- Key Lab for Animal Genetics, Breeding, Reproduction and Molecular Design of Jiangsu Province, Yangzhou 225009, China.
- Xiaohui Wang
- College of Animal Science and Technology, Yangzhou University, Yangzhou 225009, China.
- Key Lab for Animal Genetics, Breeding, Reproduction and Molecular Design of Jiangsu Province, Yangzhou 225009, China.
- Guojun Dai
- College of Animal Science and Technology, Yangzhou University, Yangzhou 225009, China.
- Key Lab for Animal Genetics, Breeding, Reproduction and Molecular Design of Jiangsu Province, Yangzhou 225009, China.
- Institutes of Agricultural Science and Technology Development, Yangzhou University, Yangzhou 225009, China.
- Jingjing Zhang
- College of Animal Science and Technology, Yangzhou University, Yangzhou 225009, China.
- Key Lab for Animal Genetics, Breeding, Reproduction and Molecular Design of Jiangsu Province, Yangzhou 225009, China.
- Tingting An
- College of Animal Science and Technology, Yangzhou University, Yangzhou 225009, China.
- Key Lab for Animal Genetics, Breeding, Reproduction and Molecular Design of Jiangsu Province, Yangzhou 225009, China.
- Wenbin Zou
- College of Animal Science and Technology, Yangzhou University, Yangzhou 225009, China.
- Key Lab for Animal Genetics, Breeding, Reproduction and Molecular Design of Jiangsu Province, Yangzhou 225009, China.
- Genxi Zhang
- College of Animal Science and Technology, Yangzhou University, Yangzhou 225009, China.
- Key Lab for Animal Genetics, Breeding, Reproduction and Molecular Design of Jiangsu Province, Yangzhou 225009, China.
- Institutes of Agricultural Science and Technology Development, Yangzhou University, Yangzhou 225009, China.
- Kaizhou Xie
- College of Animal Science and Technology, Yangzhou University, Yangzhou 225009, China.
- Key Lab for Animal Genetics, Breeding, Reproduction and Molecular Design of Jiangsu Province, Yangzhou 225009, China.
- Institutes of Agricultural Science and Technology Development, Yangzhou University, Yangzhou 225009, China.
- Jinyu Wang
- College of Animal Science and Technology, Yangzhou University, Yangzhou 225009, China.
- Key Lab for Animal Genetics, Breeding, Reproduction and Molecular Design of Jiangsu Province, Yangzhou 225009, China.
- Institutes of Agricultural Science and Technology Development, Yangzhou University, Yangzhou 225009, China.
6
Huang WL, Tung CW, Liaw C, Huang HL, Ho SY. Rule-based knowledge acquisition method for promoter prediction in human and Drosophila species. ScientificWorldJournal 2014; 2014:327306. [PMID: 24955394 PMCID: PMC3927563 DOI: 10.1155/2014/327306] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 08/31/2013] [Accepted: 10/10/2013] [Indexed: 01/08/2023]
Abstract
The rapid and reliable identification of promoter regions is important as the number of genomes to be sequenced is increasing rapidly. Various methods have been developed, but few investigate the effectiveness of sequence-based features in promoter prediction. This study proposes a knowledge acquisition method (named PromHD) based on if-then rules for promoter prediction in human and Drosophila species. PromHD utilizes an effective feature-mining algorithm and a reference feature set of 167 DNA sequence descriptors (DNASDs), comprising three descriptors of physicochemical properties (absorption maxima, molecular weight, and molar absorption coefficient), 128 top-ranked descriptors of 4-mer motifs, and 36 global sequence descriptors. PromHD identifies two feature subsets with 99 and 74 DNASDs and yields test accuracies of 96.4% and 97.5% in human and Drosophila species, respectively. Based on the 99- and 74-dimensional feature vectors, PromHD generates several if-then rules by using the decision tree mechanism for promoter prediction. The top-ranked informative rules with high certainty grades reveal that the global sequence descriptor, the length of nucleotide A at the first position of the sequence, and two physicochemical properties, absorption maxima and molecular weight, are effective in distinguishing promoters from non-promoters in human and Drosophila species, respectively.
Affiliation(s)
- Wen-Lin Huang
- Department of Management Information System, Asia Pacific Institute of Creativity, Miaoli 351, Taiwan
- Chun-Wei Tung
- School of Pharmacy, College of Pharmacy, Kaohsiung Medical University, Kaohsiung 807, Taiwan
- Chyn Liaw
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu 300, Taiwan
- Hui-Ling Huang
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu 300, Taiwan
- Department of Biological Science and Technology, National Chiao Tung University, Hsinchu 300, Taiwan
- Shinn-Ying Ho
- Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu 300, Taiwan
- Department of Biological Science and Technology, National Chiao Tung University, Hsinchu 300, Taiwan
7
Pan X, Papasani M, Hao Y, Calamito M, Wei F, Quinn WJ III, Basu A, Wang J, Hodawadekar S, Zaprazna K, Liu H, Shi Y, Allman D, Cancro M, Atchison ML. YY1 controls Igκ repertoire and B-cell development, and localizes with condensin on the Igκ locus. EMBO J 2013; 32:1168-82. [PMID: 23531880 DOI: 10.1038/emboj.2013.66] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.5] [Received: 08/09/2012] [Accepted: 02/11/2013] [Indexed: 12/25/2022]
Abstract
Conditional knock-out (KO) of Polycomb Group (PcG) protein YY1 results in pro-B cell arrest and reduced immunoglobulin locus contraction needed for distal variable gene rearrangement. The mechanisms that control these crucial functions are unknown. We deleted the 25 amino-acid YY1 REPO domain necessary for YY1 PcG function, and used this mutant (YY1ΔREPO), to transduce bone marrow from YY1 conditional KO mice. While wild-type YY1 rescued B-cell development, YY1ΔREPO failed to rescue the B-cell lineage yielding reduced numbers of B lineage cells. Although the IgH rearrangement pattern was normal, there was a selective impact at the Igκ locus that showed a dramatic skewing of the expressed Igκ repertoire. We found that the REPO domain interacts with proteins from the condensin and cohesin complexes, and that YY1, EZH2 and condensin proteins co-localize at numerous sites across the Ig kappa locus. Knock-down of a condensin subunit protein or YY1 reduced rearrangement of Igκ Vκ genes suggesting a direct role for YY1-condensin complexes in Igκ locus structure and rearrangement.
Affiliation(s)
- Xuan Pan
- Department of Animal Biology, School of Veterinary Medicine, University of Pennsylvania, Philadelphia, PA, USA
8
Zhou X, Li Z, Dai Z, Zou X. Predicting promoters by pseudo-trinucleotide compositions based on discrete wavelets transform. J Theor Biol 2013; 319:1-7. [DOI: 10.1016/j.jtbi.2012.11.024] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 07/18/2012] [Revised: 11/20/2012] [Accepted: 11/21/2012] [Indexed: 10/27/2022]
9
Li MJ, Wang P, Liu X, Lim EL, Wang Z, Yeager M, Wong MP, Sham PC, Chanock SJ, Wang J. GWASdb: a database for human genetic variants identified by genome-wide association studies. Nucleic Acids Res 2011; 40:D1047-54. [PMID: 22139925 PMCID: PMC3245026 DOI: 10.1093/nar/gkr1182] [Citation(s) in RCA: 152] [Impact Index Per Article: 11.7] [Indexed: 12/26/2022]
Abstract
Recent advances in genome-wide association studies (GWAS) have enabled us to identify thousands of genetic variants (GVs) that are associated with human diseases. As next-generation sequencing technologies become less expensive, more GVs will be discovered in the near future. Existing databases, such as NHGRI GWAS Catalog, collect GVs with only genome-wide level significance. However, many true disease susceptibility loci have relatively moderate P values and are not included in these databases. We have developed GWASdb that contains 20 times more data than the GWAS Catalog and includes less significant GVs (P < 1.0 × 10⁻³) manually curated from the literature. In addition, GWASdb provides comprehensive functional annotations for each GV, including genomic mapping information, regulatory effects (transcription factor binding sites, microRNA target sites and splicing sites), amino acid substitutions, evolution, gene expression and disease associations. Furthermore, GWASdb classifies these GVs according to diseases using Disease-Ontology Lite and Human Phenotype Ontology. It can conduct pathway enrichment and PPI network association analysis for these diseases. GWASdb provides an intuitive, multifunctional database for biologists and clinicians to explore GVs and their functional inferences. It is freely available at http://jjwanglab.org/gwasdb and will be updated frequently.
Affiliation(s)
- Mulin Jun Li
- Department of Biochemistry, The University of Hong Kong, Hong Kong SAR, China
10
Yang S, Yalamanchili HK, Li X, Yao KM, Sham PC, Zhang MQ, Wang J. Correlated evolution of transcription factors and their binding sites. Bioinformatics 2011; 27:2972-8. [PMID: 21896508 DOI: 10.1093/bioinformatics/btr503] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Indexed: 11/13/2022]
Abstract
MOTIVATION The interaction between transcription factor (TF) and transcription factor binding site (TFBS) is essential for gene regulation. Mutation in either the TF or the TFBS may weaken their interaction and thus result in abnormalities. To maintain such vital interaction, a mutation in one of the interacting partners might be compensated by a corresponding mutation in its binding partner during the course of evolution. Confirming this co-evolutionary relationship will guide us in designing protein sequences to target a specific DNA sequence or in predicting TFBS for poorly studied proteins, or even correcting and rescuing disease mutations in clinical applications. RESULTS Based on six, publicly available, experimentally validated TF-TFBS binding datasets for the basic Helix-Loop-Helix (bHLH) family, Homeo family, High-Mobility Group (HMG) family and Transient Receptor Potential channels (TRP) family, we showed that the evolutions of the TFs and their TFBSs are significantly correlated across eukaryotes. We further developed a mutual information-based method to identify co-evolved protein residues and DNA bases. This research sheds light on the dynamic relationship between TF and TFBS during their evolution. The same principle and strategy can be applied to co-evolutionary studies on protein-DNA interactions in other protein families. AVAILABILITY All the datasets, scripts and other related files have been made freely available at: http://jjwanglab.org/co-evo. CONTACT junwen@uw.edu. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Shu Yang
- Department of Biochemistry, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
11
Nozaki T, Yachie N, Ogawa R, Kratz A, Saito R, Tomita M. Tight associations between transcription promoter type and epigenetic variation in histone positioning and modification. BMC Genomics 2011; 12:416. [PMID: 21846408 PMCID: PMC3170308 DOI: 10.1186/1471-2164-12-416] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 12/21/2010] [Accepted: 08/17/2011] [Indexed: 11/19/2022]
Abstract
Background Transcription promoters are fundamental genomic cis-elements controlling gene expression. They can be classified into two types by the degree of imprecision of their transcription start sites: peak promoters, which initiate transcription from a narrow genomic region; and broad promoters, which initiate transcription from a wide-ranging region. Eukaryotic transcription initiation is suggested to be associated with the genomic positions and modifications of nucleosomes. For instance, it has been recently shown that histone with H3K9 acetylation (H3K9ac) is more likely to be distributed around broad promoters rather than peak promoters; it can thus be inferred that there is an association between histone H3K9 and promoter architecture. Results Here, we performed a systematic analysis of transcription promoters and gene expression, as well as of epigenetic histone behaviors, including genomic position, stability within the chromatin, and several modifications. We found that, in humans, broad promoters, but not peak promoters, generally had significant associations with nucleosome positioning and modification. Specifically, around broad promoters histones were highly distributed and aligned in an orderly fashion. This feature was more evident with histones that were methylated or acetylated; moreover, the nucleosome positions around the broad promoters were more stable than those around the peak ones. More strikingly, the overall expression levels of genes associated with broad promoters (but not peak promoters) with modified histones were significantly higher than the levels of genes associated with broad promoters with unmodified histones. Conclusion These results shed light on how epigenetic regulatory networks of histone modifications are associated with promoter architecture.
Affiliation(s)
- Tadasu Nozaki
- Institute for Advanced Biosciences, Keio University, Tsuruoka, 997-0017, Japan
12
Qin J, Li MJ, Wang P, Zhang MQ, Wang J. ChIP-Array: combinatory analysis of ChIP-seq/chip and microarray gene expression data to discover direct/indirect targets of a transcription factor. Nucleic Acids Res 2011; 39:W430-6. [PMID: 21586587 PMCID: PMC3125757 DOI: 10.1093/nar/gkr332] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.7] [Indexed: 12/14/2022]
Abstract
Chromatin immunoprecipitation (ChIP) coupled with high-throughput techniques (ChIP-X), such as next generation sequencing (ChIP-Seq) and microarray (ChIP–chip), has been successfully used to map active transcription factor binding sites (TFBS) of a transcription factor (TF). The targeted genes can be activated or suppressed by the TF, or are unresponsive to the TF. Microarray technology has been used to measure the actual expression changes of thousands of genes under the perturbation of a TF, but is unable to determine if the affected genes are direct or indirect targets of the TF. Furthermore, both ChIP-X and microarray methods produce a large number of false positives. Combining microarray expression profiling and ChIP-X data allows more effective TFBS analysis for studying the function of a TF. However, current web servers only provide tools to analyze either ChIP-X or expression data, but not both. Here, we present ChIP-Array, a web server that integrates ChIP-X and expression data from human, mouse, yeast, fruit fly and Arabidopsis. This server will assist biologists to detect direct and indirect target genes regulated by a TF of interest and to aid in the functional characterization of the TF. ChIP-Array is available at http://jjwanglab.hku.hk/ChIP-Array, with free access to academic users.
Affiliation(s)
- Jing Qin
- Department of Biochemistry, LKS Faculty of Medicine, The University of Hong Kong, 21 Sassoon Road, Hong Kong SAR, China
13
Abstract
Motivation: Promoter prediction is an important task in genome annotation projects, and during the past years many new promoter prediction programs (PPPs) have emerged. However, many of these programs are compared inadequately to other programs. In most cases, only a small portion of the genome is used to evaluate the program, which is not a realistic setting for whole genome annotation projects. In addition, a common evaluation design to properly compare PPPs is still lacking. Results: We present a large-scale benchmarking study of 17 state-of-the-art PPPs. A multi-faceted evaluation strategy is proposed that can be used as a gold standard for promoter prediction evaluation, allowing authors of promoter prediction software to compare their method to existing methods in a proper way. This evaluation strategy is subsequently used to compare the chosen promoter predictors, and an in-depth analysis on predictive performance, promoter class specificity, overlap between predictors and positional bias of the predictions is conducted. Availability: We provide the implementations of the four protocols, as well as the datasets required to perform the benchmarks to the academic community free of charge on request. Contact: yves.vandepeer@psb.ugent.be Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Thomas Abeel
- Department of Plant Systems Biology, VIB, Ghent University, Gent, Belgium
|
14
|
PU.1 can recruit BCL6 to DNA to repress gene expression in germinal center B cells. Mol Cell Biol 2009; 29:4612-22. [PMID: 19564417 DOI: 10.1128/mcb.00234-09] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.8] [Indexed: 11/20/2022] Open
Abstract
BCL6 is a transcriptional repressor crucial for germinal center formation. BCL6 represses transcription by a variety of mechanisms by binding to specific DNA sequences or by recruitment to DNA by protein interactions. We found that BCL6 can inhibit activities of the immunoglobulin kappa (Igkappa) intron and 3' enhancers. At the Igkappa 3' enhancer, BCL6 repressed enhancer activity through the PU.1 binding site. We found that BCL6 physically interacted with PU.1 in vivo and in vitro, and the results of sequential chromatin immunoprecipitation assays and transient-expression assays suggested that BCL6 recruitment to the Igkappa and Iglambda 3' enhancers occurred via PU.1 interaction. By computational studies, we identified genes that are repressed in germinal center cells and whose promoters contain conserved PU.1 binding sites in mouse and human. We found that many of these promoters bound to both PU.1 and BCL6 in vivo. In addition, BCL6 knockdown resulted in increased expression of a subset of these genes, demonstrating that BCL6 is involved in their repression. The recruitment of BCL6 to promoter regions by PU.1 represents a new regulatory mechanism that expands the number of genes regulated by this important transcriptional repressor.
|
15
|
Zeng J, Zhu S, Yan H. Towards accurate human promoter recognition: a review of currently used sequence features and classification methods. Brief Bioinform 2009; 10:498-508. [PMID: 19531545 DOI: 10.1093/bib/bbp027] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.7] [Indexed: 11/14/2022] Open
Abstract
This review describes important advances that have been made during the past decade for genome-wide human promoter recognition. Interest in promoter recognition algorithms on a genome-wide scale is worldwide and touches on a number of practical systems that are important in analysis of gene regulation and in genome annotation without experimental support of ESTs, cDNAs or mRNAs. The main focus of this review is on feature extraction and model selection for accurate human promoter recognition, with descriptions of what they are, what has been accomplished, and what remains to be done.
Affiliation(s)
- Jia Zeng
- Department of Computer Science, Hong Kong Baptist University, Kowloon, Hong Kong.
|
16
|
Yokoyama KD, Ohler U, Wray GA. Measuring spatial preferences at fine-scale resolution identifies known and novel cis-regulatory element candidates and functional motif-pair relationships. Nucleic Acids Res 2009; 37:e92. [PMID: 19483094 PMCID: PMC2715254 DOI: 10.1093/nar/gkp423] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.5] [Indexed: 02/03/2023] Open
Abstract
Transcriptional regulation is mediated by the collective binding of proteins called transcription factors to cis-regulatory elements. A handful of factors are known to function at particular distances from the transcription start site, although the extent to which this occurs is not well understood. Spatial dependencies can also exist between pairs of binding motifs, facilitating factor-pair interactions. We sought to determine to what extent spatial preferences measured at high-scale resolution could be utilized to predict cis-regulatory elements as well as motif-pairs binding interacting proteins. We introduce the ‘motif positional function’ model which predicts spatial biases using regression analysis, differentiating noise from true position-specific overrepresentation at single-nucleotide resolution. Our method predicts 48 consensus motifs exhibiting positional enrichment within human promoters, including fourteen motifs without known binding partners. We then extend the model to analyze distance preferences between pairs of motifs. We find that motif-pairs binding interacting factors often co-occur preferentially at multiple distances, with intervals between preferred distances often corresponding to the turn of the DNA double-helix. This offers a novel means by which to predict sequence elements with a collective role in gene regulation.
Affiliation(s)
- Ken Daigoro Yokoyama
- Biology Department, Institute for Genome Sciences and Policy, Duke University, Durham, NC 27708, USA
|
17
|
Danko CG, Pertsov AM. Identification of gene co-regulatory modules and associated cis-elements involved in degenerative heart disease. BMC Med Genomics 2009; 2:31. [PMID: 19476647 PMCID: PMC2700136 DOI: 10.1186/1755-8794-2-31] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 11/10/2008] [Accepted: 05/28/2009] [Indexed: 11/17/2022] Open
Abstract
BACKGROUND Cardiomyopathies, degenerative diseases of cardiac muscle, are among the leading causes of death in the developed world. Microarray studies of cardiomyopathies have identified up to several hundred genes that significantly alter their expression patterns as the disease progresses. However, the regulatory mechanisms driving these changes, in particular the networks of transcription factors involved, remain poorly understood. Our goals are (A) to identify modules of co-regulated genes that undergo similar changes in expression in various types of cardiomyopathies, and (B) to reveal the specific pattern of transcription factor binding sites, cis-elements, in the proximal promoter region of genes comprising such modules. METHODS We analyzed 149 microarray samples from human hypertrophic and dilated cardiomyopathies of various etiologies. Hierarchical clustering and Gene Ontology annotations were applied to identify modules enriched in genes with highly correlated expression and a similar physiological function. To discover motifs that may underlie changes in expression, we used the promoter regions for genes in three of the most interesting modules as input to motif discovery algorithms. The resulting motifs were used to construct a probabilistic model predictive of changes in expression across different cardiomyopathies. RESULTS We found that three modules with the highest degree of functional enrichment contain genes involved in myocardial contraction (n = 9), energy generation (n = 20), or protein translation (n = 20). Using motif discovery tools revealed that genes in the contractile module were found to contain a TATA-box followed by a CACC-box, and are depleted in other GC-rich motifs; whereas genes in the translation module contain a pyrimidine-rich initiator, Elk-1, SP-1, and a novel motif with a GCGC core. Using a naïve Bayes classifier revealed that patterns of motifs are statistically predictive of expression patterns, with odds ratios of 2.7 (contractile), 1.9 (energy generation), and 5.5 (protein translation). CONCLUSION We identified patterns comprised of putative cis-regulatory motifs enriched in the upstream promoter sequence of genes that undergo similar changes in expression secondary to cardiomyopathies of various etiologies. Our analysis is a first step towards understanding transcription factor networks that are active in regulating gene expression during degenerative heart disease.
Affiliation(s)
- Charles G Danko
- Department of Pharmacology, SUNY Upstate Medical University, Syracuse, NY, USA
- Arkady M Pertsov
- Department of Pharmacology, SUNY Upstate Medical University, Syracuse, NY, USA
|
18
|
Megraw M, Pereira F, Jensen ST, Ohler U, Hatzigeorgiou AG. A transcription factor affinity-based code for mammalian transcription initiation. Genome Res 2009; 19:644-56. [PMID: 19141595 DOI: 10.1101/gr.085449.108] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.4] [Indexed: 01/19/2023]
Abstract
The recent arrival of large-scale cap analysis of gene expression (CAGE) data sets in mammals provides a wealth of quantitative information on coding and noncoding RNA polymerase II transcription start sites (TSS). Genome-wide CAGE studies reveal that a large fraction of TSS exhibit peaks where the vast majority of associated tags map to a particular location (approximately 45%), whereas other active regions contain a broader distribution of initiation events. The presence of a strong single peak suggests that transcription at these locations may be mediated by position-specific sequence features. We therefore propose a new model for single-peaked TSS based solely on known transcription factors (TFs) and their respective regions of positional enrichment. This probabilistic model leads to near-perfect classification results in cross-validation (auROC = 0.98), and performance in genomic scans demonstrates that TSS prediction with both high accuracy and spatial resolution is achievable for a specific but large subgroup of mammalian promoters. The interpretable model structure suggests a DNA code in which canonical sequence features such as TATA-box, Initiator, and GC content do play a significant role, but many additional TFs show distinct spatial biases with respect to TSS location and are important contributors to the accurate prediction of single-peak transcription initiation sites. The model structure also reveals that CAGE tag clusters distal from annotated gene starts have distinct characteristics compared to those close to gene 5'-ends. Using this high-resolution single-peak model, we predict TSS for approximately 70% of mammalian microRNAs based on currently available data.
Affiliation(s)
- Molly Megraw
- Institute for Genome Sciences and Policy, Duke University, Durham, North Carolina 27708, USA
|
19
|
Abeel T, Saeys Y, Rouzé P, Van de Peer Y. ProSOM: core promoter prediction based on unsupervised clustering of DNA physical profiles. Bioinformatics 2008; 24:i24-31. [PMID: 18586720 PMCID: PMC2718650 DOI: 10.1093/bioinformatics/btn172] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.3] [Indexed: 01/19/2023] Open
Abstract
MOTIVATION More and more genomes are being sequenced, and to keep up with the pace of sequencing projects, automated annotation techniques are required. One of the most challenging problems in genome annotation is the identification of the core promoter. Because the identification of the transcription initiation region is such a challenging problem, it is not yet a common practice to integrate transcription start site prediction in genome annotation projects. Nevertheless, better core promoter prediction can improve genome annotation and can be used to guide experimental work. RESULTS Comparing the average structural profile based on base stacking energy of transcribed, promoter and intergenic sequences demonstrates that the core promoter has unique features that cannot be found in other sequences. We show that unsupervised clustering by using self-organizing maps can clearly distinguish between the structural profiles of promoter sequences and other genomic sequences. An implementation of this promoter prediction program, called ProSOM, is available and has been compared with the state-of-the-art. We propose an objective, accurate and biologically sound validation scheme for core promoter predictors. ProSOM performs at least as well as the software currently available, but our technique is more balanced in terms of the number of predicted sites and the number of false predictions, resulting in a better all-round performance. Additional tests on the ENCODE regions of the human genome show that 98% of all predictions made by ProSOM can be associated with transcriptionally active regions, which demonstrates the high precision. AVAILABILITY Predictions for the human genome, the validation datasets and the program (ProSOM) are available upon request.
Affiliation(s)
- Thomas Abeel
- Department of Plant Systems Biology, VIB, 9052 Gent, Belgium
|
20
|
Wang J, Ungar LH, Tseng H, Hannenhalli S. MetaProm: a neural network based meta-predictor for alternative human promoter prediction. BMC Genomics 2007; 8:374. [PMID: 17941982 PMCID: PMC2194789 DOI: 10.1186/1471-2164-8-374] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Received: 05/23/2007] [Accepted: 10/17/2007] [Indexed: 01/21/2023] Open
Abstract
BACKGROUND De novo eukaryotic promoter prediction is important for discovering novel genes and understanding gene regulation. In spite of the great advances made in the past decade, recent studies revealed that the overall performances of the current promoter prediction programs (PPPs) are still poor, and predictions made by individual PPPs do not overlap each other. Furthermore, most PPPs are trained and tested on the most-upstream promoters; their performances on alternative promoters have not been assessed. RESULTS In this paper, we evaluate the performances of current major promoter prediction programs (i.e., PSPA, FirstEF, McPromoter, DragonGSF, DragonPF, and FProm) using 42,536 distinct human gene promoters on a genome-wide scale, and with emphasis on alternative promoters. We describe an artificial neural network (ANN) based meta-predictor program that integrates predictions from the current PPPs and the predicted promoters' relation to CpG islands. Our specific analysis of recently discovered alternative promoters reveals that although only 41% of the 3' most promoters overlap a CpG island, 74% of 5' most promoters overlap a CpG island. CONCLUSION Our assessment of six PPPs on 1.06 × 10^9 bps of human genome sequence reveals the specific strengths and weaknesses of individual PPPs. Our meta-predictor outperforms any individual PPP in sensitivity and specificity. Furthermore, we discovered that the 5' alternative promoters are more likely to be associated with a CpG island.
Affiliation(s)
- Junwen Wang
- Center for Bioinformatics, University of Pennsylvania, Philadelphia, PA 19104, USA.
|
21
|
Vardhanabhuti S, Wang J, Hannenhalli S. Position and distance specificity are important determinants of cis-regulatory motifs in addition to evolutionary conservation. Nucleic Acids Res 2007; 35:3203-13. [PMID: 17452354 PMCID: PMC1904283 DOI: 10.1093/nar/gkm201] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.8] [Indexed: 01/12/2023] Open
Abstract
Computational discovery of cis-regulatory elements remains challenging. To cope with the high false positives, evolutionary conservation is routinely used. However, conservation is only one of the attributes of cis-regulatory elements and is neither necessary nor sufficient. Here, we assess two additional attributes—positional and inter-motif distance specificity—that are critical for interactions between transcription factors. We first show that for a greater than expected fraction of known motifs, the genes that contain the motifs in their promoters in a position-specific or distance-specific manner are related, both in function and/or in expression pattern. We then use the position and distance specificity to discover novel motifs. Our work highlights the importance of distance and position specificity, in addition to the evolutionary conservation, in discovering cis-regulatory motifs.
Affiliation(s)
- Sridhar Hannenhalli
- *To whom correspondence should be addressed. Tel: +215 746 8683; Fax: +215 573 3111
|
22
|
Ohler U. Identification of core promoter modules in Drosophila and their application in accurate transcription start site prediction. Nucleic Acids Res 2006; 34:5943-50. [PMID: 17068082 PMCID: PMC1635271 DOI: 10.1093/nar/gkl608] [Citation(s) in RCA: 83] [Impact Index Per Article: 4.6] [Indexed: 11/12/2022] Open
Abstract
The reliable recognition of eukaryotic RNA polymerase II core promoters, and the associated transcription start sites (TSSs) of genes, has been an ongoing challenge for computational biology. High throughput experimental methods such as tiling arrays or 5' SAGE/EST sequencing have recently led to much larger datasets of core promoters, and to the assessment that the well-known core promoter sequence elements such as the TATA box appear to be much less frequent than thought. Here, we address the co-occurrence of several previously identified core promoter sequence motifs in Drosophila melanogaster to determine frequently occurring core promoter modules. We then use this in a new strategy to model core promoters as a set of alternative submodels for different core promoter architectures reflecting these different motif modules. We show that this system improves greatly on computational promoter recognition and leads to highly accurate in silico TSS prediction. Our results indicate that at least for the case of the fruit fly, we are getting closer to an understanding of how the beginning of a gene is defined in a eukaryotic genome.
Affiliation(s)
- Uwe Ohler
- Institute for Genome Sciences and Policy, Durham, NC 27708, USA.
|
23
|
Wang J, Zhang S, Schultz RM, Tseng H. Search for basonuclin target genes. Biochem Biophys Res Commun 2006; 348:1261-71. [PMID: 16919236 PMCID: PMC1630671 DOI: 10.1016/j.bbrc.2006.07.198] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.4] [Received: 07/07/2006] [Accepted: 07/25/2006] [Indexed: 11/20/2022]
Abstract
Basonuclin (Bnc 1) is a transcription factor that has an unusual ability to interact with promoters of both RNA polymerases I and II. The action of basonuclin is mediated through three pairs of evolutionarily conserved zinc fingers, which produce three DNase I footprints on the promoters of rDNA and the basonuclin gene. Using these DNase footprints, we built a computational model for the basonuclin DNA-binding module, which was used to identify in silico potential RNA polymerase II target genes in the human and mouse promoter databases. The target genes of basonuclin show that it regulates the expression of proteins involved in chromatin structure, transcription/DNA-binding, ion-channels, adhesion/cell-cell junction, signal transduction, and intracellular transport. Our results suggest that basonuclin, like MYC, may coordinate transcriptional activities among the three RNA polymerases. But basonuclin regulates a distinctive set of pathways, which differ from that regulated by MYC.
Affiliation(s)
- Junwen Wang
- Center for Bioinformatics, University of Pennsylvania
- Department of Computer and Information Science, University of Pennsylvania
- Richard M. Schultz
- Department of Biology, University of Pennsylvania
- Center for Research on Reproduction and Women’s Health, University of Pennsylvania
- Hung Tseng
- Department of Dermatology, University of Pennsylvania
- Cell and Developmental Biology, University of Pennsylvania
- Center for Research on Reproduction and Women’s Health, University of Pennsylvania
|