1
Li Y, Liao Z, Luo H, Benyoucef A, Kang Y, Lai Q, Dovat S, Miller B, Chepelev I, Li Y, Zhao K, Brand M, Huang S. Alteration of CTCF-associated chromatin neighborhood inhibits TAL1-driven oncogenic transcription program and leukemogenesis. Nucleic Acids Res 2020; 48:3119-3133. [PMID: 32086528 PMCID: PMC7102946 DOI: 10.1093/nar/gkaa098] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 08/28/2019] [Revised: 02/03/2020] [Accepted: 02/06/2020] [Indexed: 12/23/2022]
Abstract
Aberrant activation of the TAL1 is associated with up to 60% of T-ALL cases and is involved in CTCF-mediated genome organization within the TAL1 locus, suggesting that CTCF boundary plays a pathogenic role in T-ALL. Here, we show that -31-Kb CTCF binding site (-31CBS) serves as chromatin boundary that defines topologically associating domain (TAD) and enhancer/promoter interaction required for TAL1 activation. Deleted or inverted -31CBS impairs TAL1 expression in a context-dependent manner. Deletion of -31CBS reduces chromatin accessibility and blocks long-range interaction between the +51 erythroid enhancer and TAL1 promoter-1 leading to inhibition of TAL1 expression in erythroid cells, but not T-ALL cells. However, in TAL1-expressing T-ALL cells, the leukemia-prone TAL1 promoter-IV specifically interacts with the +19 stem cell enhancer located 19 Kb downstream of TAL1 and this interaction is disrupted by the -31CBS inversion in T-ALL cells. Inversion of -31CBS in Jurkat cells alters chromatin accessibility, histone modifications and CTCF-mediated TAD leading to inhibition of TAL1 expression and TAL1-driven leukemogenesis. Thus, our data reveal that -31CBS acts as critical regulator to define +19-enhancer and the leukemic prone promoter IV interaction for TAL1 activation in T-ALL. Manipulation of CTCF boundary can alter TAL1 TAD and oncogenic transcription networks in leukemogenesis.
Affiliation(s)
- Ying Li
  - Department of Pediatrics and Pharmacology, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
- Ziwei Liao
  - Department of Biochemistry & Molecular Biology, University of Florida College of Medicine, Gainesville, FL 32610, USA
  - Institute of Hematology, Jinan University Medical College, ShiPai, Guangzhou, 510632, China
- Huacheng Luo
  - Department of Pediatrics and Pharmacology, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
- Aissa Benyoucef
  - The Sprott Center for Stem Cell Research, Regenerative Medicine Program, Ottawa Hospital Research Institute, Ottawa, Ontario K1H 8L6, Canada
- Yuanyuan Kang
  - Department of Biochemistry & Molecular Biology, University of Florida College of Medicine, Gainesville, FL 32610, USA
- Qian Lai
  - Department of Pediatrics and Pharmacology, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
- Sinisa Dovat
  - Department of Pediatrics and Pharmacology, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
- Barbara Miller
  - Department of Pediatrics and Pharmacology, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
- Iouri Chepelev
  - Laboratory of Molecular Immunology, National Heart, Lung and Blood Institute, NIH, Bethesda, MD 20814, USA
  - Center for Autoimmune Genomics and Etiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA
- Yangqiu Li
  - Institute of Hematology, Jinan University Medical College, ShiPai, Guangzhou, 510632, China
- Keji Zhao
  - Laboratory of Molecular Immunology, National Heart, Lung and Blood Institute, NIH, Bethesda, MD 20814, USA
- Marjorie Brand
  - The Sprott Center for Stem Cell Research, Regenerative Medicine Program, Ottawa Hospital Research Institute, Ottawa, Ontario K1H 8L6, Canada
- Suming Huang
  - Department of Pediatrics and Pharmacology, Pennsylvania State University College of Medicine, Hershey, PA 17033, USA
  - Department of Biochemistry & Molecular Biology, University of Florida College of Medicine, Gainesville, FL 32610, USA
2
Agrawal S, Ganley ARD. The conservation landscape of the human ribosomal RNA gene repeats. PLoS One 2018; 13:e0207531. [PMID: 30517151 PMCID: PMC6281188 DOI: 10.1371/journal.pone.0207531] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Received: 05/12/2018] [Accepted: 11/01/2018] [Indexed: 01/27/2023]
Abstract
Ribosomal RNA gene repeats (rDNA) encode ribosomal RNA, a major component of ribosomes. Ribosome biogenesis is central to cellular metabolic regulation, and several diseases are associated with rDNA dysfunction, notably cancer. However, its highly repetitive nature has severely limited characterization of the elements responsible for rDNA function. Here we make use of phylogenetic footprinting to provide a comprehensive list of novel, potentially functional elements in the human rDNA. Complete rDNA sequences for six non-human primate species were constructed using de novo whole genome assemblies. These new sequences were used to determine the conservation profile of the human rDNA, revealing 49 conserved regions in the rDNA intergenic spacer (IGS). To provide insights into the potential roles of these conserved regions, the conservation profile was integrated with functional genomics datasets. We find two major zones that contain conserved elements characterised by enrichment of transcription-associated chromatin factors, and transcription. Conservation of some IGS transcripts in the apes underpins the potential functional significance of these transcripts and the elements controlling their expression. Our results characterize the conservation landscape of the human IGS and suggest that noncoding transcription and chromatin elements are conserved and important features of this unique genomic region.
Affiliation(s)
- Saumya Agrawal
  - Institute of Natural and Mathematical Sciences, Massey University, Auckland, New Zealand
- Austen R. D. Ganley
  - Institute of Natural and Mathematical Sciences, Massey University, Auckland, New Zealand
  - School of Biological Sciences, University of Auckland, Auckland, New Zealand
3
Kharkwal H, Batool F, Koentgen F, Bell DR, Kendall DA, Ebling FJP, Duce IR. Generation and phenotypic characterisation of a cytochrome P450 4x1 knockout mouse. PLoS One 2017; 12:e0187959. [PMID: 29227996 PMCID: PMC5724839 DOI: 10.1371/journal.pone.0187959] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 06/13/2017] [Accepted: 10/14/2017] [Indexed: 11/18/2022]
Abstract
Cytochrome P450 4x1 (Cyp4x1) is expressed at very high levels in the brain but the function of this protein is unknown. It has been hypothesised to regulate metabolism of fatty acids and to affect the activity of endocannabinoid signalling systems, which are known to influence appetite and energy metabolism. The objective of the present investigation was to determine the impact of Cyp4x1 on body weight and energy metabolism by developing a line of transgenic Cyp4x1-knock out mice. Mice were developed with a global knock-out of the gene; the full-length RNA was undetectable, and mice were viable and fertile. Both male and female Cyp4x1-knock out mice gained significantly more body weight on normal lab chow diet compared to control flox mice on the same genetic background. At necropsy, Cyp4x1-knock out male mice had significantly greater intra-abdominal fat deposits (P<0.01), and enlarged adipocytes. Metabolic rate and locomotor activity as inferred from VO2 measures and crossing of infrared beams in metabolic cages were not significantly affected by the mutation in either gender. The respiratory exchange ratio was significantly decreased in male knock out mice (P<0.05), suggesting a greater degree of fat oxidation, consistent with their higher adiposity. When mice were maintained on a high fat diet, VO2 was significantly decreased in both male and female Cyp4x1-knock out mice. We conclude that the Cyp4x1-knock out mouse strain demonstrates a mildly obese phenotype, consistent with the view that cytochrome P450 4x1 plays a role in regulating fat metabolism.
Affiliation(s)
- Himanshu Kharkwal
  - School of Life Sciences, University of Nottingham, Nottingham, United Kingdom
- Farhat Batool
  - School of Life Sciences, University of Nottingham, Nottingham, United Kingdom
  - Department of Biochemistry, University of Karachi, Karachi, Pakistan
- Frank Koentgen
  - Ozgene Pty Ltd., Bentley DC, Western Australia, Australia
- David R. Bell
  - School of Life Sciences, University of Nottingham, Nottingham, United Kingdom
  - European Chemicals Agency, Helsinki, Finland
- David A. Kendall
  - School of Life Sciences, University of Nottingham, Nottingham, United Kingdom
- Ian R. Duce
  - School of Life Sciences, University of Nottingham, Nottingham, United Kingdom
4
Genomic Alterations of Non-Coding Regions Underlie Human Cancer: Lessons from T-ALL. Trends Mol Med 2016; 22:1035-1046. [PMID: 28240214 DOI: 10.1016/j.molmed.2016.10.004] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 08/26/2016] [Revised: 10/06/2016] [Accepted: 10/10/2016] [Indexed: 12/31/2022]
Abstract
It has been appreciated for decades that somatic genomic alterations that change coding sequences of proto-oncogenes, translocate enhancers/promoters near proto-oncogenes, or create fusion oncogenes can drive cancer by inducing oncogenic activities. An explosion of genome-wide technologies over the past decade has fueled discoveries of the roles of three-dimensional chromosome structure and powerful cis-acting elements (super-enhancers) in regulating gene transcription. In recent years, studies of human T cell acute lymphoblastic leukemia (T-ALL) using genome-wide technologies have provided paradigms for how non-coding genomic region alterations can disrupt 3D chromosome architecture or establish super-enhancers to activate oncogenic transcription of proto-oncogenes. These studies raise important issues to consider with the objective of leveraging basic knowledge into new diagnostic and therapeutic opportunities for cancer patients.
5
Anderson D, Cordell HJ, Fakiola M, Francis RW, Syn G, Scaman ESH, Davis E, Miles SJ, McLeay T, Jamieson SE, Blackwell JM. First genome-wide association study in an Australian aboriginal population provides insights into genetic risk factors for body mass index and type 2 diabetes. PLoS One 2015; 10:e0119333. [PMID: 25760438 PMCID: PMC4356593 DOI: 10.1371/journal.pone.0119333] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Received: 08/02/2014] [Accepted: 01/28/2015] [Indexed: 12/15/2022]
Abstract
A body mass index (BMI) >22 kg/m² is a risk factor for type 2 diabetes (T2D) in Aboriginal Australians. To identify loci associated with BMI and T2D we undertook a genome-wide association study using 1,075,436 quality-controlled single nucleotide polymorphisms (SNPs) genotyped (Illumina 2.5M Duo Beadchip) in 402 individuals in extended pedigrees from a Western Australian Aboriginal community. Imputation using the thousand genomes (1000G) reference panel extended the analysis to 6,724,284 post quality-control autosomal SNPs. No associations achieved genome-wide significance, commonly accepted as P < 5×10⁻⁸. Nevertheless, genes/pathways in common with other ethnicities were identified despite the arrival of Aboriginal people in Australia >45,000 years ago. The top hit (rs10868204 P_genotyped = 1.50×10⁻⁶; rs11140653 P_imputed_1000G = 2.90×10⁻⁷) for BMI lies 5’ of NTRK2, the type 2 neurotrophic tyrosine kinase receptor for brain-derived neurotrophic factor (BDNF) that regulates energy balance downstream of melanocortin-4 receptor (MC4R). PIK3C2G (rs12816270 P_genotyped = 8.06×10⁻⁶; rs10841048 P_imputed_1000G = 6.28×10⁻⁷) was associated with BMI, but not with T2D as reported elsewhere. BMI also associated with CNTNAP2 (rs6960319 P_genotyped = 4.65×10⁻⁵; rs13225016 P_imputed_1000G = 6.57×10⁻⁵), previously identified as the strongest gene-by-environment interaction for BMI in African-Americans. The top hit (rs11240074 P_genotyped = 5.59×10⁻⁶, P_imputed_1000G = 5.73×10⁻⁶) for T2D lies 5’ of BCL9 that, along with TCF7L2, promotes beta-catenin’s transcriptional activity in the WNT signaling pathway. Additional hits occurred in genes affecting pancreatic (KCNJ6, KCNA1) and/or GABA (GABRR1, KCNA1) functions. Notable associations observed for genes previously identified at genome-wide significance in other populations included MC4R (P_genotyped = 4.49×10⁻⁴) for BMI and IGF2BP2 (P_imputed_1000G = 2.55×10⁻⁶) for T2D. Our results may provide novel functional leads in understanding disease pathogenesis in this Australian Aboriginal population.
Affiliation(s)
- Denise Anderson
  - Telethon Kids Institute, The University of Western Australia, Subiaco, Western Australia, 6008, Australia
- Heather J. Cordell
  - Institute of Genetic Medicine, Newcastle University, Newcastle upon Tyne, NE1 3BZ, United Kingdom
- Michaela Fakiola
  - Telethon Kids Institute, The University of Western Australia, Subiaco, Western Australia, 6008, Australia
  - Cambridge Institute for Medical Research, Department of Medicine, and Department of Pathology, University of Cambridge, Cambridge, United Kingdom
- Richard W. Francis
  - Telethon Kids Institute, The University of Western Australia, Subiaco, Western Australia, 6008, Australia
- Genevieve Syn
  - Telethon Kids Institute, The University of Western Australia, Subiaco, Western Australia, 6008, Australia
- Elizabeth S. H. Scaman
  - Telethon Kids Institute, The University of Western Australia, Subiaco, Western Australia, 6008, Australia
- Elizabeth Davis
  - Telethon Kids Institute, The University of Western Australia, Subiaco, Western Australia, 6008, Australia
  - Department of Endocrinology and Diabetes, Princess Margaret Hospital for Children, Subiaco, Western Australia, 6008, Australia
- Simon J. Miles
  - Ngangganawili Aboriginal Health Service, Wiluna, Western Australia, 6646, Australia
- Toby McLeay
  - Ngangganawili Aboriginal Health Service, Wiluna, Western Australia, 6646, Australia
- Sarra E. Jamieson
  - Telethon Kids Institute, The University of Western Australia, Subiaco, Western Australia, 6008, Australia
- Jenefer M. Blackwell
  - Telethon Kids Institute, The University of Western Australia, Subiaco, Western Australia, 6008, Australia
  - Cambridge Institute for Medical Research, Department of Medicine, and Department of Pathology, University of Cambridge, Cambridge, United Kingdom
6
Salih MAM, Fakiola M, Abdelraheem MH, Younis BM, Musa AM, ElHassan AM, Blackwell JM, Ibrahim ME, Mohamed HS. Insights into the possible role of IFNG and IFNGR1 in Kala-azar and Post Kala-azar Dermal Leishmaniasis in Sudanese patients. BMC Infect Dis 2014; 14:662. [PMID: 25466928 PMCID: PMC4265480 DOI: 10.1186/s12879-014-0662-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 07/02/2014] [Accepted: 11/24/2014] [Indexed: 11/23/2022]
Abstract
BACKGROUND Little is known about the parasite/host factors that lead to Post Kala-azar Dermal Leishmaniasis (PKDL) in some visceral leishmaniasis (VL) patients after drug-cure. Studies in Sudan provide evidence for association between polymorphisms in the gene (IFNGR1) encoding the alpha chain of interferon-γ receptor type I and risk of PKDL. This study aimed to identify putative functional polymorphisms in the IFNGR1 gene, and to determine whether differences in expression of interferon-γ (IFNG) and IFNGR1 at the RNA level are associated with pathogenesis of VL and/or PKDL in Sudan. METHODS Sanger sequencing was used to re-sequence 841 bp of upstream, exon1 and intron1 of the IFNGR1 gene in DNA from 30 PKDL patients. LAGAN and SYNPLOT bioinformatics tools were used to compare human, chimpanzee and dog sequences to identify conserved noncoding sequences carrying putative regulatory elements. The relative expression of IFNG and IFNGR1 in paired pre- and post-treatment RNA samples from the lymph nodes of 24 VL patients, and in RNA samples from skin biopsies of 19 PKDL patients, was measured using real time PCR. Pre- versus post-treatment expression was evaluated statistically using the nonparametric Wilcoxon matched pairs signed-rank test. RESULTS Ten variants were identified in the 841 bp of sequence, four of which are novel polymorphisms at -77A/G, +10C/T, +18C/T and +91G/T relative to the IFNGR1 initiation site. A cluster of conserved non-coding sequences with putative regulatory variants was identified in the distal promoter of IFNGR1. Variable expression of IFNG was detected in lymph node aspirates of VL patients before treatment, with a marked reduction (P = 0.006) in expression following treatment. IFNGR1 expression was also variable in lymph node aspirates from VL patients, with no significant reduction in expression with treatment. IFNG expression was undetectable in the skin biopsies of PKDL cases, while IFNGR1 expression was also uniformly low. CONCLUSIONS Uniformly low expression of IFN and IFNGR1 in PKDL skin biopsies could explain parasite persistence and is consistent with prior demonstration of genetic association with IFNGR1 polymorphisms. Identification of novel potentially functional rare variants at IFNGR1 makes an important general contribution to knowledge of rare variants of potential relevance in this Sudanese population.
Affiliation(s)
- Mohamed A M Salih
  - Institute of Endemic Disease, University of Khartoum, P. O. Box 102, Khartoum, Sudan.
  - Central laboratory, Ministry of Science and Technology, Khartoum, Sudan.
- Michaela Fakiola
  - Department of Medicine and Department of Pathology, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, UK.
- Mohamed H Abdelraheem
  - Institute of Endemic Disease, University of Khartoum, P. O. Box 102, Khartoum, Sudan.
- Brima M Younis
  - Institute of Endemic Disease, University of Khartoum, P. O. Box 102, Khartoum, Sudan.
- Ahmed M Musa
  - Institute of Endemic Disease, University of Khartoum, P. O. Box 102, Khartoum, Sudan.
- Ahmed M ElHassan
  - Institute of Endemic Disease, University of Khartoum, P. O. Box 102, Khartoum, Sudan.
- Jenefer M Blackwell
  - Department of Medicine and Department of Pathology, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, UK.
  - Telethon Kids Institute, The University of Western Australia, Crawley, Australia.
- Muntaser E Ibrahim
  - Institute of Endemic Disease, University of Khartoum, P. O. Box 102, Khartoum, Sudan.
- Hiba S Mohamed
  - Institute of Endemic Disease, University of Khartoum, P. O. Box 102, Khartoum, Sudan.
7
Rye MS, Scaman ESH, Thornton RB, Vijayasekaran S, Coates HL, Francis RW, Pennell CE, Blackwell JM, Jamieson SE. Genetic and functional evidence for a locus controlling otitis media at chromosome 10q26.3. BMC Med Genet 2014; 15:18. [PMID: 24499112 PMCID: PMC3926687 DOI: 10.1186/1471-2350-15-18] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 10/18/2012] [Accepted: 01/21/2014] [Indexed: 01/28/2023]
Abstract
BACKGROUND Otitis media (OM) is a common childhood disease characterised by middle ear effusion and inflammation. Susceptibility to recurrent acute OM and chronic OM with effusion is 40-70% heritable. Linkage studies provide evidence for multiple putative OM susceptibility loci. This study attempts to replicate these linkages in a Western Australian (WA) population, and to identify the etiological gene(s) in a replicated region. METHODS Microsatellites were genotyped in 468 individuals from 101 multicase families (208 OM cases) from the WA Family Study of OM (WAFSOM) and non-parametric linkage analysis carried out in ALLEGRO. Association mapping utilized dense single nucleotide polymorphism (SNP) data extracted from Illumina 660W-Quad analysis of 256 OM cases and 575 controls from the WA Pregnancy Cohort (Raine) Study. Logistic regression analysis was undertaken in ProbABEL. RT-PCR was used to compare gene expression in paired adenoid and tonsil samples, and in epithelial and macrophage cell lines. Comparative genomics methods were used to identify putative regulatory elements and transcription factor binding sites potentially affected by associated SNPs. RESULTS Evidence for linkage was observed at 10q26.3 (Zlr = 2.69; P = 0.0036; D10S1770) with borderline evidence for linkage at 10q22.3 (Zlr = 1.64; P = 0.05; D10S206). No evidence for linkage was seen at 3p25.3, 17q12, or 19q13.43. Peak association at 10q26.3 was in the intergenic region between TCERG1L and PPP2R2D (rs7922424; P = 9.47 × 10⁻⁶), immediately under the peak of linkage. Independent associations were observed at DOCK1 (rs9418832; P = 7.48 × 10⁻⁵) and ADAM12 (rs7902734; P = 8.04 × 10⁻⁴). RT-PCR analysis confirmed expression of all 4 genes in adenoid samples. ADAM12, DOCK1 and PPP2R2D, but not TCERG1L, were expressed in respiratory epithelial and macrophage cell lines. A significantly associated polymorphism (rs7087384) in strong LD with the top SNP (rs7922424; r² = 0.97) alters a transcription factor binding site (CREB/CREBP) in the intergenic region between TCERG1L and PPP2R2D. CONCLUSIONS OM linkage was replicated at 10q26.3. Whilst multiple genes could contribute to this linkage, the weight of evidence supports PPP2R2D, a TGF-β/Activin/Nodal pathway modulator, as the more likely functional candidate lying immediately under the linkage peak for OM susceptibility at chromosome 10q26.3.
Affiliation(s)
- Marie S Rye
  - Telethon Institute for Child Health Research, The University of Western Australia, Perth, Western Australia, Australia.
8
Patel B, Kang Y, Cui K, Litt M, Riberio MSJ, Deng C, Salz T, Casada S, Fu X, Qiu Y, Zhao K, Huang S. Aberrant TAL1 activation is mediated by an interchromosomal interaction in human T-cell acute lymphoblastic leukemia. Leukemia 2014; 28:349-61. [PMID: 23698277 PMCID: PMC10921969 DOI: 10.1038/leu.2013.158] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.8] [Received: 03/19/2013] [Revised: 05/09/2013] [Accepted: 05/16/2013] [Indexed: 01/21/2023]
Abstract
Long-range chromatin interactions control metazoan gene transcription. However, the involvement of intra- and interchromosomal interactions in development and oncogenesis remains unclear. TAL1/SCL is a critical transcription factor required for the development of all hematopoietic lineages; yet, aberrant TAL1 transcription often occurs in T-cell acute lymphoblastic leukemia (T-ALL). Here, we report that oncogenic TAL1 expression is regulated by different intra- and interchromosomal loops in normal hematopoietic and leukemic cells, respectively. These intra- and interchromosomal loops alter the cell-type-specific enhancers that interact with the TAL1 promoter. We show that human SET1 (hSET1)-mediated H3K4 methylations promote a long-range chromatin loop, which brings the +51 enhancer in close proximity to TAL1 promoter 1 in erythroid cells. The CCCTC-binding factor (CTCF) facilitates this long-range enhancer/promoter interaction of the TAL1 locus in erythroid cells while blocking the same enhancer/promoter interaction of the TAL1 locus in human T-cell leukemia. In human T-ALL, a T-cell-specific transcription factor c-Maf-mediated interchromosomal interaction brings the TAL1 promoter into close proximity with a T-cell-specific regulatory element located on chromosome 16, activating aberrant TAL1 oncogene expression. Thus, our study reveals a novel molecular mechanism involving changes in three-dimensional chromatin interactions that activate the TAL1 oncogene in human T-cell leukemia.
Affiliation(s)
- B Patel
  - Department of Biochemistry and Molecular Biology, College of Medicine, University of Florida, Gainesville, FL, USA
  - These authors contributed equally to this work
- Y Kang
  - Department of Biochemistry and Molecular Biology, College of Medicine, University of Florida, Gainesville, FL, USA
  - College of Life Science, Jilin University, Changchun, China
  - These authors contributed equally to this work
- K Cui
  - Center for System Biology, NHLBI, National Institute of Health, Bethesda, MD, USA
- M Litt
  - Medical Education Center, Ball State University, Muncie, IN, USA
- MSJ Riberio
  - Department of Biochemistry and Molecular Biology, College of Medicine, University of Florida, Gainesville, FL, USA
- C Deng
  - Department of Biochemistry and Molecular Biology, College of Medicine, University of Florida, Gainesville, FL, USA
- T Salz
  - Department of Biochemistry and Molecular Biology, College of Medicine, University of Florida, Gainesville, FL, USA
- S Casada
  - Medical Education Center, Ball State University, Muncie, IN, USA
- X Fu
  - College of Life Science, Jilin University, Changchun, China
- Y Qiu
  - Department of Anatomy and Cell Biology, College of Medicine, University of Florida, Gainesville, FL, USA
  - Shands Cancer Center, College of Medicine, University of Florida, Gainesville, FL, USA
- K Zhao
  - Center for System Biology, NHLBI, National Institute of Health, Bethesda, MD, USA
- S Huang
  - Department of Biochemistry and Molecular Biology, College of Medicine, University of Florida, Gainesville, FL, USA
  - Shands Cancer Center, College of Medicine, University of Florida, Gainesville, FL, USA
9
Identifying and mapping cell-type-specific chromatin programming of gene expression. Proc Natl Acad Sci U S A 2014; 111:E645-54. [PMID: 24469817 DOI: 10.1073/pnas.1312523111] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Indexed: 11/18/2022]
Abstract
A problem of substantial interest is to systematically map variation in chromatin structure to gene-expression regulation across conditions, environments, or differentiated cell types. We developed and applied a quantitative framework for determining the existence, strength, and type of relationship between high-resolution chromatin structure in terms of DNaseI hypersensitivity and genome-wide gene-expression levels in 20 diverse human cell types. We show that ∼25% of genes show cell-type-specific expression explained by alterations in chromatin structure. We find that distal regions of chromatin structure (e.g., ±200 kb) capture more genes with this relationship than local regions (e.g., ±2.5 kb), yet the local regions show a more pronounced effect. By exploiting variation across cell types, we were capable of pinpointing the most likely hypersensitive sites related to cell-type-specific expression, which we show have a range of contextual uses. This quantitative framework is likely applicable to other settings aimed at relating continuous genomic measurements to gene-expression variation.
10
Abstract
As more and more systems biology approaches are used to investigate the different types of biological macromolecules, increasing numbers of whole genomic studies are now available for a large array of organisms. Whether it is genomics, transcriptomics, proteomics, interactomics or metabolomics, the full complement of genomic information on all different levels can be juxtaposed between different organisms to reveal similarities or differences, and even to provide consensus models. At the intersection of comparative genomics and systems biology lies great possibility for discovery, analysis and prediction. This paper explores this nexus and the relationship from four general levels: DNA, RNA, protein and extragenomic. For each level, we provide an overview of the methods, discuss the potential challenges and survey the current research. Finally, we suggest some organizing principles and make proposals for new areas that will be important for future research.
Affiliation(s)
- Jimmy Lin
  - Wilmer Institute, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
11
Abstract
DIALIGN is a software tool for multiple sequence alignment that combines global and local alignment features. It composes multiple alignments from local pairwise sequence similarities. This approach is particularly useful for discovering conserved functional regions in sequences that share only local homologies but are otherwise unrelated. An anchoring option allows the use of external information and expert knowledge in addition to primary-sequence similarity alone. The latest version of DIALIGN optionally uses matches to the PFAM database to detect weak homologies. Various versions of the program are available through Göttingen Bioinformatics Compute Server (GOBICS) at http://www.gobics.de/department/software.
12
Retrotransposon insertion in the T-cell acute lymphocytic leukemia 1 (Tal1) gene is associated with severe renal disease and patchy alopecia in Hairpatches (Hpt) mice. PLoS One 2013; 8:e53426. [PMID: 23301070 PMCID: PMC3534690 DOI: 10.1371/journal.pone.0053426] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 07/07/2012] [Accepted: 11/29/2012] [Indexed: 11/30/2022]
Abstract
“Hairpatches” (Hpt) is a naturally occurring, autosomal semi-dominant mouse mutation. Hpt/Hpt homozygotes die in utero, while Hpt/+ heterozygotes exhibit progressive renal failure accompanied by patchy alopecia. This mutation is a model for the rare human disorder “glomerulonephritis with sparse hair and telangiectases” (OMIM 137940). Fine mapping localized the Hpt locus to a 6.7 Mb region of Chromosome 4 containing 62 known genes. Quantitative real time PCR revealed differential expression for only one gene in the interval, T-cell acute lymphocytic leukemia 1 (Tal1), which was highly upregulated in the kidney and skin of Hpt/+ mice. Southern blot analysis of Hpt mutant DNA indicated a new EcoRI site in the Tal1 gene. High throughput sequencing identified an endogenous retroviral class II intracisternal A particle insertion in Tal1 intron 4. Our data suggests that the IAP insertion in Tal1 underlies the histopathological changes in the kidney by three weeks of age, and that glomerulosclerosis is a consequence of an initial developmental defect, progressing in severity over time. The Hairpatches mouse model allows an investigation into the effects of Tal1, a transcription factor characterized by complex regulation patterns, and its effects on renal disease.
13
Follows GA, Ferreira R, Janes ME, Spensberger D, Cambuli F, Chaney AF, Kinston SJ, Landry JR, Green AR, Göttgens B. Mapping and functional characterisation of a CTCF-dependent insulator element at the 3' border of the murine Scl transcriptional domain. PLoS One 2012; 7:e31484. [PMID: 22396734 PMCID: PMC3291548 DOI: 10.1371/journal.pone.0031484] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 07/06/2011] [Accepted: 01/09/2012] [Indexed: 11/18/2022]
Abstract
The Scl gene encodes a transcription factor essential for haematopoietic development. Scl transcription is regulated by a panel of cis-elements spread over 55 kb with the most distal 3′ element being located downstream of the neighbouring gene Map17, which is co-regulated with Scl in haematopoietic cells. The Scl/Map17 domain is flanked upstream by the ubiquitously expressed Sil gene and downstream by a cluster of Cyp genes active in liver, but the mechanisms responsible for delineating the domain boundaries remain unclear. Here we report identification of a DNaseI hypersensitive site at the 3′ end of the Scl/Map17 domain and 45 kb downstream of the Scl transcription start site. This element is located at the boundary of active and inactive chromatin, does not function as a classical tissue-specific enhancer, binds CTCF and is both necessary and sufficient for insulator function in haematopoietic cells in vitro. Moreover, in a transgenic reporter assay, tissue-specific expression of the Scl promoter in brain was increased by incorporation of 350 bp flanking fragments from the +45 element. Our data suggest that the +45 region functions as a boundary element that separates the Scl/Map17 and Cyp transcriptional domains, and raise the possibility that this element may be useful for improving tissue-specific expression of transgenic constructs.
Affiliation(s)
- George A Follows
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, United Kingdom.
|
14
|
Fakiola M, Miller EN, Fadl M, Mohamed HS, Jamieson SE, Francis RW, Cordell HJ, Peacock CS, Raju M, Khalil EA, Elhassan A, Musa AM, Silveira F, Shaw JJ, Sundar S, Jeronimo SMB, Ibrahim ME, Blackwell JM. Genetic and functional evidence implicating DLL1 as the gene that influences susceptibility to visceral leishmaniasis at chromosome 6q27. J Infect Dis 2011; 204:467-77. [PMID: 21742847 DOI: 10.1093/infdis/jir284] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Indexed: 11/13/2022]
Abstract
BACKGROUND Visceral leishmaniasis (VL) is caused by Leishmania donovani and Leishmania infantum chagasi. Genome-wide linkage studies from Sudan and Brazil identified a putative susceptibility locus on chromosome 6q27. METHODS Twenty-two single-nucleotide polymorphisms (SNPs) at genes PHF10, C6orf70, DLL1, FAM120B, PSMB1, and TBP were genotyped in 193 VL cases from 85 Sudanese families, and 8 SNPs at genes PHF10, C6orf70, DLL1, PSMB1, and TBP were genotyped in 194 VL cases from 80 Brazilian families. Family-based association, haplotype, and linkage disequilibrium analyses were performed. Multispecies comparative sequence analysis was used to identify conserved noncoding sequences carrying putative regulatory elements. Quantitative reverse-transcription polymerase chain reaction measured expression of candidate genes in splenic aspirates from Indian patients with VL compared with that in the control spleen sample. RESULTS Positive associations were observed at PHF10, C6orf70, DLL1, PSMB1, and TBP in Sudan, but only at DLL1 in Brazil (combined P = 3 × 10⁻⁴ at DLL1 across Sudan and Brazil). No functional coding region variants were observed in resequencing of 22 Sudanese VL cases. DLL1 expression was significantly (P = 2 × 10⁻⁷) reduced (mean fold change, 3.5 [SEM, 0.7]) in splenic aspirates from patients with VL, whereas other 6q27 genes showed higher levels (1.27 × 10⁻⁶ < P < .01) than did the control spleen sample. A cluster of conserved noncoding sequences with putative regulatory variants was identified in the distal promoter of DLL1. CONCLUSIONS DLL1, which encodes Delta-like 1, the ligand for Notch3, is strongly implicated as the chromosome 6q27 VL susceptibility gene.
Affiliation(s)
- Michaela Fakiola
- Cambridge Institute for Medical Research and Department of Medicine, University of Cambridge School of Clinical Medicine, UK
|
15
|
Binding site turnover produces pervasive quantitative changes in transcription factor binding between closely related Drosophila species. PLoS Biol 2010; 8:e1000343. [PMID: 20351773 PMCID: PMC2843597 DOI: 10.1371/journal.pbio.1000343] [Citation(s) in RCA: 151] [Impact Index Per Article: 10.8] [Received: 10/26/2009] [Accepted: 02/17/2010] [Indexed: 01/06/2023]
Abstract
Genome-wide comparison of transcription factor binding between related Drosophila species highlights how sequence changes affect the biochemical events that underlie animal development. Changes in gene expression play an important role in evolution, yet the molecular mechanisms underlying regulatory evolution are poorly understood. Here we compare genome-wide binding of the six transcription factors that initiate segmentation along the anterior-posterior axis in embryos of two closely related species: Drosophila melanogaster and Drosophila yakuba. Where we observe binding by a factor in one species, we almost always observe binding by that factor to the orthologous sequence in the other species. Levels of binding, however, vary considerably. The magnitude and direction of the interspecies differences in binding levels of all six factors are strongly correlated, suggesting a role for chromatin or other factor-independent forces in mediating the divergence of transcription factor binding. Nonetheless, factor-specific quantitative variation in binding is common, and we show that it is driven to a large extent by the gain and loss of cognate recognition sequences for the given factor. We find only a weak correlation between binding variation and regulatory function. These data provide the first genome-wide picture of how modest levels of sequence divergence between highly morphologically similar species affect a system of coordinately acting transcription factors during animal development, and highlight the dominant role of quantitative variation in transcription factor binding over short evolutionary distances. The differentiation of cells, tissues, and organs during animal development is established by a process in which genes that control cell identity and behavior are turned on and off at specific times and places. 
This process is choreographed, to a large extent, by a collection of proteins known as transcription factors that bind to specific sequences in DNA and thereby modulate the expression of neighboring genes. Because of the central role that transcription factors play in shaping organismal form and function, they have long been suggested to be major players in phenotypic evolution. However, we have a poor understanding of how changes to DNA affect transcription factor binding in living systems. Here, we use a combination of biochemical and genomic techniques to compare, between two closely related species of fruit flies in the genus Drosophila, the binding of six transcription factors that help establish the characteristic segments that form along the anterior-posterior (head to tail) axis in developing flies. We show that the patterns of transcription factor binding between these closely related species are broadly conserved, consistent with the nearly identical development and appearance of these species. However, we also show that, whereas the DNA changes that have accumulated between these species in the five million years since their divergence—roughly one difference per 10 basepairs—have not altered the locations where these factors bind, they have had a considerable effect on the amount of factor bound at each site across a population of embryos. We can trace these quantitative differences in binding to the gain and loss of the short sequences known to be preferentially recognized by these factors, giving us key insights into the effect that sequence changes have on the biochemical events that underlie animal development.
|
16
|
Abstract
As our ability to generate sequencing data continues to increase, data analysis is replacing data generation as the rate-limiting step in genomics studies. Here we provide a guide to genomic data visualization tools that facilitate analysis tasks by enabling researchers to explore, interpret and manipulate their data, and in some cases perform on-the-fly computations. We will discuss graphical methods designed for the analysis of de novo sequencing assemblies and read alignments, genome browsing, and comparative genomics, highlighting the strengths and limitations of these approaches and the challenges ahead.
|
17
|
Evolution of Transcription Factor Binding Sites in Mammalian Gene Regulatory Regions: Handling Counterintuitive Results. J Mol Evol 2009; 68:654-64. [DOI: 10.1007/s00239-009-9238-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 10/04/2007] [Revised: 03/30/2009] [Accepted: 04/15/2009] [Indexed: 01/26/2023]
|
18
|
Smith AM, Sanchez MJ, Follows GA, Kinston S, Donaldson IJ, Green AR, Göttgens B. A novel mode of enhancer evolution: the Tal1 stem cell enhancer recruited a MIR element to specifically boost its activity. Genome Res 2008; 18:1422-32. [PMID: 18687876 PMCID: PMC2527711 DOI: 10.1101/gr.077008.108] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.9] [Indexed: 12/25/2022]
Abstract
Altered cis-regulation is thought to underpin much of metazoan evolution, yet the underlying mechanisms remain largely obscure. The stem cell leukemia TAL1 (also known as SCL) transcription factor is essential for the normal development of blood stem cells and we have previously shown that the Tal1 +19 enhancer directs expression to hematopoietic stem cells, hematopoietic progenitors, and to endothelium. Here we demonstrate that an adjacent region 1 kb upstream (+18 element) is in an open chromatin configuration and carries active histone marks but does not function as an enhancer in transgenic mice. Instead, it boosts activity of the +19 enhancer both in stable transfection assays and during differentiation of embryonic stem (ES) cells carrying single-copy reporter constructs targeted to the Hprt locus. The +18 element contains a mammalian interspersed repeat (MIR) which is essential for the +18 function and which was transposed to the Tal1 locus approximately 160 million years ago at the time of the mammalian/marsupial branchpoint. Our data demonstrate a previously unrecognized mechanism whereby enhancer activity is modulated by a transposon exerting a "booster" function which would go undetected by conventional transgenic approaches.
Affiliation(s)
- Aileen M Smith
- University of Cambridge Department of Haematology, Cambridge Institute for Medical Research, Cambridge CB2 2XY, United Kingdom
|
19
|
Jeronimo SMB, Holst AKB, Jamieson SE, Francis R, Martins DRA, Bezerra FL, Ettinger NA, Nascimento ET, Monteiro GR, Lacerda HG, Miller EN, Cordell HJ, Duggal P, Beaty TH, Blackwell JM, Wilson ME. Genes at human chromosome 5q31.1 regulate delayed-type hypersensitivity responses associated with Leishmania chagasi infection. Genes Immun 2007; 8:539-51. [PMID: 17713557 PMCID: PMC2435172 DOI: 10.1038/sj.gene.6364422] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.4] [Indexed: 11/09/2022]
Abstract
Visceral leishmaniasis (VL) caused by Leishmania chagasi is endemic to northeast Brazil. A positive delayed-type hypersensitivity skin test response (DTH+) is a marker for acquired resistance to disease, clusters in families and may be genetically controlled. Twenty-three single nucleotide polymorphisms (SNPs) were genotyped in the cytokine 5q23.3-q31.1 region IRF1-IL5-IL13-IL4-IL9-LECT2-TGFBI in 102 families (323 DTH+; 190 DTH-; 123 VL individuals) from a VL endemic region in northeast Brazil. Data from 20 SNPs were analyzed for association with DTH+/- status and VL using family-based, stepwise conditional logistic regression analysis. Independent associations were observed between the DTH+ phenotype and markers in separate linkage disequilibrium blocks in LECT2 (OR 2.25; P=0.005; 95% CI=1.28-3.97) and TGFBI (OR 1.94; P=0.003; 95% CI=1.24-3.03). VL child/parent trios gave no evidence of association, but the DTH- phenotype was associated with SNP rs2070874 at IL4 (OR 3.14; P=0.006; 95% CI=1.38-7.14), and SNP rs30740 between LECT2 and TGFBI (OR 3.00; P=0.042; 95% CI=1.04-8.65). These results indicate several genes in the immune response gene cluster at 5q23.3-q31.1 influence outcomes of L. chagasi infection in this region of Brazil.
Affiliation(s)
- S M B Jeronimo
- Department of Biochemistry, Federal University of Rio Grande do Norte, Natal, RN, Brazil
|
20
|
Richardson MK, Crooijmans RPMA, Groenen MAM. Sequencing and genomic annotation of the chicken (Gallus gallus) Hox clusters, and mapping of evolutionarily conserved regions. Cytogenet Genome Res 2007; 117:110-9. [PMID: 17675851 DOI: 10.1159/000103171] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Received: 07/19/2006] [Accepted: 09/29/2006] [Indexed: 11/19/2022]
Abstract
Hox genes encode transcription factors that are involved in the regulation of normal development and are mutated in some diseases and malformations. Chicken HOX genes have been extensively studied in the chick limb and other developmental models. To date, while the chicken HOXA cluster has been completely sequenced, many other chicken HOX genes are known only from partial mRNAs or unfinished genome assemblies. Furthermore, although a finished sequence of the HOXA cluster is available, the sequence has not yet been annotated. We have therefore manually annotated the available HOX sequences and improved the sequences by sequencing PCR fragments that bridge existing gaps in the genome sequences. These sequences complement the published sequences, including the currently incomplete WashUC Gallus_gallus-2.1 build, to give an improved coverage of the cluster. We used phylogenetic footprinting to map the genomic location of 398 Ultra Conserved Regions in the HOX complex, 248 of which do not overlap with any known annotated coding exon. These included the hox-related microRNAs miR-10 and miR-196. The chicken HOX clusters appear to be broadly comparable to their human counterparts. A few human orthologues were not recovered from the chicken, presumably because of incomplete sequence.
Affiliation(s)
- M K Richardson
- Department of Integrative Zoology, Institute of Biology, Leiden University, Leiden, The Netherlands
|
21
|
Freeling M, Rapaka L, Lyons E, Pedersen B, Thomas BC. G-boxes, bigfoot genes, and environmental response: characterization of intragenomic conserved noncoding sequences in Arabidopsis. Plant Cell 2007; 19:1441-57. [PMID: 17496117 PMCID: PMC1913728 DOI: 10.1105/tpc.107.050419] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.6] [Received: 01/18/2007] [Revised: 03/10/2007] [Accepted: 04/19/2007] [Indexed: 05/15/2023]
Abstract
A tetraploidy left Arabidopsis thaliana with 6358 pairs of homoeologs that, when aligned, generated 14,944 intragenomic conserved noncoding sequences (CNSs). Our previous work assembled these phylogenetic footprints into a database. We show that known transcription factor (TF) binding motifs, including the G-box, are overrepresented in these CNSs. A total of 254 genes spanning long lengths of CNS-rich chromosomes (Bigfoot) dominate this database. Therefore, we made subdatabases: one containing Bigfoot genes and the other containing genes with three to five CNSs (Smallfoot). Bigfoot genes are generally TFs that respond to signals, with their modal CNS positioned 3.1 kb 5' from the ATG. Smallfoot genes encode components of signal transduction machinery, the cytoskeleton, or involve transcription. We queried each subdatabase with each possible 7-nucleotide sequence. Among hundreds of hits, most were purified from CNSs, and almost all of those significantly enriched in CNSs had no experimental history. The 7-mers in CNSs are not 5'- to 3'-oriented in Bigfoot genes but are often oriented in Smallfoot genes. CNSs with one G-box tend to have two G-boxes. CNSs were shared with the homoeolog only and with no other gene, suggesting that binding site turnover impedes detection. Bigfoot genes may function in adaptation to environmental change.
Affiliation(s)
- Michael Freeling
- Department of Plant and Microbial Biology, University of California, Berkeley, California 94720, USA.
|
22
|
Ambrose HE, Papadopoulou V, Beswick RW, Wagner SD. Poly-(ADP-ribose) polymerase-1 (Parp-1) binds in a sequence-specific manner at the Bcl-6 locus and contributes to the regulation of Bcl-6 transcription. Oncogene 2007; 26:6244-52. [PMID: 17404575 DOI: 10.1038/sj.onc.1210434] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.1] [Indexed: 12/31/2022]
Abstract
Bcl-6 is a transcription factor that is normally expressed in germinal centre B cells. It is essential for the formation of germinal centres and the production of high-affinity antibodies. Transcriptional downregulation of Bcl-6 occurs on terminal differentiation to plasma cells. Bcl-6 is highly expressed in B-cell non-Hodgkin's lymphoma and, in a subset of cases of diffuse large cell lymphoma, the mechanism of Bcl-6 overexpression involves interruption of normal transcriptional controls. Transcriptional control of Bcl-6 is, therefore, important for normal antibody responses and lymphomagenesis, but little is known of the cis-acting control elements. This report focuses on a region of mouse/human sequence homology in the first intron of Bcl-6, which is a candidate site for such a control element. We demonstrate that poly-(ADP-ribose) polymerase-1 (Parp-1) binds in vitro and in vivo to specific sequences in this region. We further show that PARP inhibitors, and Parp-1 knockdown by siRNA induce Bcl-6 mRNA expression in Bcl-6 expressing cell lines. We speculate that Parp-1 activation plays a role in switching off Bcl-6 transcription and subsequent B-cell exit from the germinal centre.
Affiliation(s)
- H E Ambrose
- Division of Investigative Sciences, Department of Haematology, Imperial College London, Hammersmith Hospital, London, UK
|
23
|
Thomas BC, Rapaka L, Lyons E, Pedersen B, Freeling M. Arabidopsis intragenomic conserved noncoding sequence. Proc Natl Acad Sci U S A 2007; 104:3348-53. [PMID: 17301222 PMCID: PMC1805546 DOI: 10.1073/pnas.0611574104] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.1] [Received: 10/11/2006] [Indexed: 11/18/2022]
Abstract
After the most recent tetraploidy in the Arabidopsis lineage, most gene pairs lost one, but not both, of their duplicates. We manually inspected the 3,179 retained gene pairs and their surrounding gene space still present in the genome using a custom-made viewer application. The display of these pairs allowed us to define intragenic conserved noncoding sequences (CNSs), identify exon annotation errors, and discover potentially new genes. Using a strict algorithm to sort high-scoring pair sequences from the bl2seq data, we created a database of 14,944 intragenomic Arabidopsis CNSs. The mean CNS length is 31 bp, ranging from 15 to 285 bp. There are approximately 1.7 CNSs associated with a typical gene, and Arabidopsis CNSs are found in all areas around exons, most frequently in the 5' upstream region. Gene ontology classifications related to transcription, regulation, or "response to ..." external or endogenous stimuli, especially hormones, tend to be significantly overrepresented among genes containing a large number of CNSs, whereas protein localization, transport, and metabolism are common among genes with no CNSs. There is a 1.5% overlap between these CNSs and the 218,982 putative RNAs in the Arabidopsis Small RNA Project database, allowing for two mismatches. These CNSs provide a unique set of noncoding sequences enriched for function. CNS function is implied by evolutionary conservation and independently supported because CNS-richness predicts regulatory gene ontology categories.
Affiliation(s)
- Lakshmi Rapaka
- Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720
- Eric Lyons
- Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720
- Michael Freeling
- Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720
|
24
|
Abstract
DIALIGN is a software program for multiple alignment of DNA or protein sequences that combines global and local alignment features. In recent years, the program has been used extensively to compare syntenic regions in genomic sequences. An anchoring option speeds up the alignment procedure and makes it possible to use user-defined constraints to improve the quality of the program output. This chapter explains features of DIALIGN that are useful when aligning genomic sequences. The program is available online through the Göttingen Bioinformatics Compute Server at http://dialign.gobics.de/.
|
25
|
Uchiyama I, Higuchi T, Kobayashi I. CGAT: a comparative genome analysis tool for visualizing alignments in the analysis of complex evolutionary changes between closely related genomes. BMC Bioinformatics 2006; 7:472. [PMID: 17062155 PMCID: PMC1643837 DOI: 10.1186/1471-2105-7-472] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Received: 07/21/2006] [Accepted: 10/24/2006] [Indexed: 11/10/2022]
Abstract
BACKGROUND The recent accumulation of closely related genomic sequences provides a valuable resource for the elucidation of the evolutionary histories of various organisms. However, although numerous alignment calculation and visualization tools have been developed to date, the analysis of complex genomic changes, such as large insertions, deletions, inversions, translocations and duplications, still presents certain difficulties. RESULTS We have developed a comparative genome analysis tool, named CGAT, which allows detailed comparisons of closely related bacteria-sized genomes mainly through visualizing middle-to-large-scale changes to infer underlying mechanisms. CGAT displays precomputed pairwise genome alignments on both dotplot and alignment viewers with scrolling and zooming functions, and allows users to move along the pre-identified orthologous alignments. Users can place several types of information on this alignment, such as the presence of tandem repeats or interspersed repetitive sequences and changes in G+C contents or codon usage bias, thereby facilitating the interpretation of the observed genomic changes. In addition to displaying precomputed alignments, the viewer can dynamically calculate the alignments between specified regions; this feature is especially useful for examining the alignment boundaries, as these boundaries are often obscure and can vary between programs. Besides the alignment browser functionalities, CGAT also contains an alignment data construction module, which contains various procedures that are commonly used for pre- and post-processing for large-scale alignment calculation, such as the split-and-merge protocol for calculating long alignments, chaining adjacent alignments, and ortholog identification. Indeed, CGAT provides a general framework for the calculation of genome-scale alignments using various existing programs as alignment engines, which allows users to compare the outputs of different alignment programs. 
Earlier versions of this program have been used successfully in our research to infer the evolutionary history of apparently complex genome changes between closely related eubacteria and archaea. CONCLUSION CGAT is a practical tool for analyzing complex genomic changes between closely related genomes using existing alignment programs and other sequence analysis tools combined with extensive manual inspection.
Affiliation(s)
- Ikuo Uchiyama
- National Institute for Basic Biology, National Institutes of Natural Sciences, Nishigonaka 38, Myodaiji, Okazaki, Aichi 444-8585, Japan
- Toshio Higuchi
- INTEC Web and Genome Informatics Corporation, 1-3-3 Shinsuna, Koto-ku, Tokyo 136-0075, Japan
- Ichizo Kobayashi
- Department of Medical Genome Sciences, Graduate School of Frontier Science & Institute of Medical Science, University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo 108-8639, Japan
- Graduate Program of Biophysics and Biochemistry, Graduate School of Science, University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo 108-8639, Japan
|
26
|
Follows GA, Dhami P, Göttgens B, Bruce AW, Campbell PJ, Dillon SC, Smith AM, Koch C, Donaldson IJ, Scott MA, Dunham I, Janes ME, Vetrie D, Green AR. Identifying gene regulatory elements by genomic microarray mapping of DNaseI hypersensitive sites. Genome Res 2006; 16:1310-9. [PMID: 16963707 PMCID: PMC1581440 DOI: 10.1101/gr.5373606] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.9] [Indexed: 12/11/2022]
Abstract
The identification of cis-regulatory elements is central to understanding gene transcription. Hypersensitivity of cis-regulatory elements to digestion with DNaseI remains the gold-standard approach to locating such elements. Traditional methods used to identify DNaseI hypersensitive sites are cumbersome and can only be applied to short stretches of DNA at defined locations. Here we report the development of a novel genomic array-based approach to DNaseI hypersensitive site mapping (ADHM) that permits precise, large-scale identification of such sites from as few as 5 million cells. Using ADHM we identified all previously recognized hematopoietic regulatory elements across 200 kb of the mouse T-cell acute lymphocytic leukemia-1 (Tal1) locus, and, in addition, identified two novel elements within the locus, which show transcriptional regulatory activity. We further validated the ADHM protocol by mapping the DNaseI hypersensitive sites across 250 kb of the human TAL1 locus in CD34+ primary stem/progenitor cells and K562 cells and by mapping the previously known DNaseI hypersensitive sites across 240 kb of the human alpha-globin locus in K562 cells. ADHM provides a powerful approach to identifying DNaseI hypersensitive sites across large genomic regions.
Affiliation(s)
- George A Follows
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, CB2 2XY, United Kingdom.
|
27
|
Bockamp E, Antunes C, Maringer M, Heck R, Presser K, Beilke S, Ohngemach S, Alt R, Cross M, Sprengel R, Hartwig U, Kaina B, Schmitt S, Eshkind L. Tetracycline-controlled transgenic targeting from the SCL locus directs conditional expression to erythrocytes, megakaryocytes, granulocytes, and c-kit-expressing lineage-negative hematopoietic cells. Blood 2006; 108:1533-41. [PMID: 16675709 DOI: 10.1182/blood-2005-12-012104] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Indexed: 02/07/2023]
Abstract
The stem cell leukemia gene SCL, also known as TAL-1, encodes a basic helix-loop-helix transcription factor expressed in erythroid, myeloid, megakaryocytic, and hematopoietic stem cells. To be able to make use of the unique tissue-restricted and spatio-temporal expression pattern of the SCL gene, we have generated a knock-in mouse line containing the tTA-2S tetracycline transactivator under the control of SCL regulatory elements. Analysis of this mouse using different tetracycline-dependent reporter strains demonstrated that switchable transgene expression was restricted to erythrocytes, megakaryocytes, granulocytes, and, importantly, to the c-kit-expressing and lineage-negative cell fraction of the bone marrow. In addition, conditional transgene activation also was detected in a very minor population of endothelial cells and in the kidney. However, no activation of the reporter transgene was found in the brain of adult mice. These findings suggested that the expression of tetracycline-responsive reporter genes recapitulated the known endogenous expression pattern of SCL. Our data therefore demonstrate that exogenously inducible and reversible expression of selected transgenes in myeloid, megakaryocytic, erythroid, and c-kit-expressing lineage-negative bone marrow cells can be directed through SCL regulatory elements. The SCL knock-in mouse presented here represents a powerful tool for studying normal and malignant hematopoiesis in vivo.
Affiliation(s)
- Ernesto Bockamp
- Institute of Toxicology/Mouse Genetics, Johannes Gutenberg-Universität Mainz, Obere Zahlbacher Str 67, 55131 Mainz, Germany.
|
28
|
Pimanda JE, Chan WYI, Donaldson IJ, Bowen M, Green AR, Göttgens B. Endoglin expression in the endothelium is regulated by Fli-1, Erg, and Elf-1 acting on the promoter and a -8-kb enhancer. Blood 2006; 107:4737-45. [PMID: 16484587 DOI: 10.1182/blood-2005-12-4929] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.0] [Indexed: 02/06/2023]
Abstract
Angiogenesis is critical to the growth and regeneration of tissue but is also a key component of tumor growth and chronic inflammatory disorders. Endoglin plays a key role in angiogenesis by modulating cellular responses to transforming growth factor-beta (TGF-beta) signaling and is upregulated in proliferating endothelial cells. To gain insights into the transcriptional hierarchies that govern endoglin expression, we used a combination of comparative genomic, biochemical, and transgenic approaches. Both the promoter and a region 8 kb upstream of exon 1 were active in transfection assays in endothelial cells. In transgenic mice, the promoter directed low-level expression to a subset of endothelial cells. By contrast, inclusion of the -8 enhancer resulted in robust endothelial activity with additional staining in developing ear mesenchyme. Subsequent molecular analysis demonstrated that both the -8 enhancer and the promoter depend on conserved Ets sites, which were bound in endothelial cells in vivo by Fli-1, Erg, and Elf-1. This study therefore establishes the transcriptional framework within which endoglin functions during angiogenesis.
Affiliation(s)
- John E Pimanda
- Department of Hematology, Cambridge Institute of Medical Research, University of Cambridge, Cambridge CB2 2XY, UK
|
29
|
Stone EA, Cooper GM, Sidow A. Trade-offs in detecting evolutionarily constrained sequence by comparative genomics. Annu Rev Genomics Hum Genet 2005; 6:143-64. [PMID: 16124857 DOI: 10.1146/annurev.genom.6.080604.162146] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.1] [Indexed: 11/09/2022]
Abstract
As whole-genome sequencing efforts extend beyond more traditional model organisms to include a deep diversity of species, comparative genomic analyses will be further empowered to reveal insights into the human genome and its evolution. The discovery and annotation of functional genomic elements is a necessary step toward a detailed understanding of our biology, and sequence comparisons have proven to be an integral tool for that task. This review is structured to broadly reflect the statistical challenges in discriminating these functional elements from the bulk of the genome that has evolved neutrally. Specifically, we review the comparative genomics literature in terms of specificity, sensitivity, and phylogenetic scope, as well as the trade-offs that relate these factors in standard analyses. We consider the impact of an expanding diversity of orthologous sequences on our ability to resolve functional elements. This impact is assessed through both recent comparative analyses of deep alignments and mathematical modeling.
Affiliation(s)
- Eric A Stone
- Department of Statistics, Stanford University, Stanford, California 94305, USA
|
30
|
Cameron RA, Chow SH, Berney K, Chiu TY, Yuan QA, Krämer A, Helguero A, Ransick A, Yun M, Davidson EH. An evolutionary constraint: strongly disfavored class of change in DNA sequence during divergence of cis-regulatory modules. Proc Natl Acad Sci U S A 2005; 102:11769-74. [PMID: 16087870 PMCID: PMC1188003 DOI: 10.1073/pnas.0505291102] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.8] [Indexed: 12/20/2022] Open
Abstract
The DNA of functional cis-regulatory modules displays extensive sequence conservation in comparisons of genomes from modestly distant species. Patches of sequence that are several hundred base pairs in length within these modules are often seen to be 80-95% identical, although the flanking sequence cannot even be aligned. However, it is unlikely that base pairs located between the transcription factor target sites of cis-regulatory modules have sequence-dependent function, and the mechanism that constrains evolutionary change within cis-regulatory modules is incompletely understood. We chose five functionally characterized cis-regulatory modules from the Strongylocentrotus purpuratus (sea urchin) genome and obtained orthologous regulatory and flanking sequences from a bacterial artificial chromosome genome library of a congener, Strongylocentrotus franciscanus. As expected, single-nucleotide substitutions and small indels occur freely at many positions within the regulatory modules of these two species, as they do outside the regulatory modules. However, large indels (>20 bp) are statistically almost absent within the regulatory modules, although they are common in flanking intergenic or intronic sequence. The result helps to explain the patterns of evolutionary sequence divergence characteristic of cis-regulatory DNA.
Affiliation(s)
- R Andrew Cameron
- Division of Biology and Center for Computational Regulatory Genomics of the Beckman Institute, California Institute of Technology, Pasadena, CA 91125, USA
|
31
|
Delabesse E, Ogilvy S, Chapman MA, Piltz SG, Gottgens B, Green AR. Transcriptional regulation of the SCL locus: identification of an enhancer that targets the primitive erythroid lineage in vivo. Mol Cell Biol 2005; 25:5215-25. [PMID: 15923636 PMCID: PMC1140604 DOI: 10.1128/mcb.25.12.5215-5225.2005] [Citation(s) in RCA: 52] [Impact Index Per Article: 2.7] [Received: 12/06/2004] [Revised: 01/16/2005] [Accepted: 03/02/2005] [Indexed: 12/29/2022] Open
Abstract
The stem cell leukemia (SCL) gene, also known as TAL-1, encodes a basic helix-loop-helix protein that is essential for the formation of all hematopoietic lineages, including primitive erythropoiesis. Appropriate transcriptional regulation is essential for the biological functions of SCL, and we have previously identified five distinct enhancers which target different subdomains of the normal SCL expression pattern. However, it is not known whether these SCL enhancers also regulate neighboring genes within the SCL locus, and the erythroid expression of SCL remains unexplained. Here, we have quantitated transcripts from SCL and neighboring genes in multiple hematopoietic cell types. Our results show striking coexpression of SCL and its immediate downstream neighbor, MAP17, suggesting that they share regulatory elements. A systematic survey of histone H3 and H4 acetylation throughout the SCL locus in different hematopoietic cell types identified several peaks of histone acetylation between SIL and MAP17, all of which corresponded to previously characterized SCL enhancers or to the MAP17 promoter. Downstream of MAP17 (and 40 kb downstream of SCL exon 1a), an additional peak of acetylation was identified in hematopoietic cells and was found to correlate with expression of SCL but not other neighboring genes. This +40 region is conserved in human-dog-mouse-rat sequence comparisons, functions as an erythroid cell-restricted enhancer in vitro, and directs beta-galactosidase expression to primitive, but not definitive, erythroblasts in transgenic mice. The SCL +40 enhancer provides a powerful tool for studying the molecular and cellular biology of the primitive erythroid lineage.
Affiliation(s)
- E Delabesse
- University of Cambridge, Department of Hematology, Cambridge Institute for Medical Research, Hills Road, Cambridge CB2 2XY, United Kingdom
|
32
|
Martin N, Patel S, Segre JA. Long-range comparison of human and mouse Sprr loci to identify conserved noncoding sequences involved in coordinate regulation. Genome Res 2005; 14:2430-8. [PMID: 15574822 PMCID: PMC534667 DOI: 10.1101/gr.2709404] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.4] [Indexed: 11/24/2022]
Abstract
Mammalian epidermis provides a permeability barrier between an organism and its environment. Under homeostatic conditions, epidermal cells produce structural proteins, which are cross-linked in an orderly fashion to form a cornified envelope (CE). However, under genetic or environmental stress, specific genes are induced to rapidly build a temporary barrier. Small proline-rich (SPRR) proteins are the primary constituents of the CE. Under stress the entire family of 14 Sprr genes is upregulated. The Sprr genes are clustered within the larger epidermal differentiation complex on mouse chromosome 3, human chromosome 1q21. The clustering of the Sprr genes and their upregulation under stress suggest that these genes may be coordinately regulated. To identify enhancer elements that regulate this stress response activation of the Sprr locus, we utilized bioinformatic tools and classical biochemical dissection. Long-range comparative sequence analysis identified conserved noncoding sequences (CNSs). Clusters of epidermal-specific DNaseI-hypersensitive sites (HSs) mapped to specific CNSs. Increased prevalence of these HSs in barrier-deficient epidermis provides in vivo evidence of the regulation of the Sprr locus by these conserved sequences. Individual components of these HSs were cloned, and one was shown to have strong enhancer activity specific to conditions when the Sprr genes are coordinately upregulated.
Affiliation(s)
- Natalia Martin
- National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
|
33
|
Abstract
The genomes from three mammals (human, mouse, and rat), two worms, and several yeasts have been sequenced, and more genomes will be completed in the near future for comparison with those of the major model organisms. Scientists have used various methods to align and compare the sequenced genomes to address critical issues in genome function and evolution. This review covers some of the major new insights about gene content, gene regulation, and the fraction of mammalian genomes that are under purifying selection and presumed functional. We review the evolutionary processes that shape genomes, with particular attention to variation in rates within genomes and along different lineages. Internet resources for accessing and analyzing the treasure trove of sequence alignments and annotations are reviewed, and we discuss critical problems to address in new bioinformatic developments in comparative genomics.
Affiliation(s)
- Webb Miller
- The Center for Comparative Genomics and Bioinformatics, The Huck Institutes of Life Sciences, Department of Biology, Pennsylvania State University, University Park, Pennsylvania, USA.
|
34
|
Schmollinger M, Nieselt K, Kaufmann M, Morgenstern B. DIALIGN P: fast pair-wise and multiple sequence alignment using parallel processors. BMC Bioinformatics 2004; 5:128. [PMID: 15357879 PMCID: PMC520757 DOI: 10.1186/1471-2105-5-128] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.5] [Received: 03/23/2004] [Accepted: 09/09/2004] [Indexed: 11/30/2022] Open
Abstract
Background: Parallel computing is frequently used to speed up computationally expensive tasks in bioinformatics. Results: Herein, a parallel version of the multi-alignment program DIALIGN is introduced. We propose two ways of dividing the program into independent sub-routines that can be run on different processors: (a) Pair-wise sequence alignments that are used as a first step to multiple alignment account for most of the CPU time in DIALIGN. Since alignments of different sequence pairs are completely independent of each other, they can be distributed to multiple processors without any effect on the resulting output alignments. (b) For alignments of large genomic sequences, we use a heuristic that splits sequences into sub-sequences based on a previously introduced anchored alignment procedure. For our test sequences, this combined approach reduces the program running time of DIALIGN by up to 97%. Conclusions: By distributing sub-routines to multiple processors, the running time of DIALIGN can be greatly improved. With these improvements, it is possible to apply the program in large-scale genomics and proteomics projects that were previously beyond its scope.
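Point (a) of the abstract above rests on the observation that pair-wise alignments are independent jobs. The following is a minimal Python sketch of that idea only, not DIALIGN P itself: the Needleman-Wunsch score is a stand-in for DIALIGN's segment-based pair-wise step, and the names `nw_score`, `score_pair`, and `all_pair_scores` as well as the scoring parameters are hypothetical choices made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score (Needleman-Wunsch); a stand-in, NOT DIALIGN's method."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def score_pair(job):
    """One independent job: score a single pair of indexed sequences."""
    (i, a), (j, b) = job
    return (i, j), nw_score(a, b)


def all_pair_scores(seqs, map_fn=map):
    """Score every sequence pair; map_fn decides how the jobs are distributed."""
    jobs = list(combinations(enumerate(seqs), 2))
    return dict(map_fn(score_pair, jobs))


seqs = ["GATTACA", "GATCACA", "GTTACA"]
serial = all_pair_scores(seqs)
with ThreadPoolExecutor(max_workers=3) as pool:
    distributed = all_pair_scores(seqs, map_fn=pool.map)
# Distribution changes who computes each pair, never the result.
assert serial == distributed
```

Because `all_pair_scores` only needs a `map`-like callable, a process pool or a cluster scheduler could presumably be swapped in for CPU-bound work without touching the scoring code; a thread pool is used here only to keep the sketch portable.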
Affiliation(s)
- Martin Schmollinger
- Wilhelm-Schickard-Institut für Informatik, Sand 14, 72076 Tübingen, Germany.
|
35
|
Frazer KA, Pachter L, Poliakov A, Rubin EM, Dubchak I. VISTA: computational tools for comparative genomics. Nucleic Acids Res 2004; 32:W273-9. [PMID: 15215394 PMCID: PMC441596 DOI: 10.1093/nar/gkh458] [Citation(s) in RCA: 1697] [Impact Index Per Article: 84.9] [Indexed: 12/12/2022] Open
Abstract
Comparison of DNA sequences from different species is a fundamental method for identifying functional elements in genomes. Here, we describe the VISTA family of tools created to assist biologists in carrying out this task. Our first VISTA server at http://www-gsd.lbl.gov/vista/ was launched in the summer of 2000 and was designed to align long genomic sequences and visualize these alignments with associated functional annotations. Currently the VISTA site includes multiple comparative genomics tools and provides users with rich capabilities to browse pre-computed whole-genome alignments of large vertebrate genomes and other groups of organisms with VISTA Browser, to submit their own sequences of interest to several VISTA servers for various types of comparative analysis and to obtain detailed comparative analysis results for a set of cardiovascular genes. We illustrate capabilities of the VISTA site by the analysis of a 180 kb interval on human chromosome 5 that encodes for the kinesin family member 3A (KIF3A) protein.
Affiliation(s)
- Kelly A Frazer
- Perlegen Sciences, Inc., 2021 Stierlin Court, Mountain View, CA 94043, USA
|
36
|
Montgomery SB, Astakhova T, Bilenky M, Birney E, Fu T, Hassel M, Melsopp C, Rak M, Robertson AG, Sleumer M, Siddiqui AS, Jones SJM. Sockeye: a 3D environment for comparative genomics. Genome Res 2004; 14:956-62. [PMID: 15123592 PMCID: PMC479126 DOI: 10.1101/gr.1890304] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.1] [Indexed: 11/25/2022]
Abstract
Comparative genomics techniques are used in bioinformatics analyses to identify the structural and functional properties of DNA sequences. As the amount of available sequence data steadily increases, the ability to perform large-scale comparative analyses has become increasingly relevant. In addition, the growing complexity of genomic feature annotation means that new approaches to genomic visualization need to be explored. We have developed a Java-based application called Sockeye that uses three-dimensional (3D) graphics technology to facilitate the visualization of annotation and conservation across multiple sequences. This software uses the Ensembl database project to import sequence and annotation information from several eukaryotic species. A user can additionally import their own custom sequence and annotation data. Individual annotation objects are displayed in Sockeye by using custom 3D models. Ensembl-derived and imported sequences can be analyzed by using a suite of multiple and pair-wise alignment algorithms. The results of these comparative analyses are also displayed in the 3D environment of Sockeye. By using the Java3D API to visualize genomic data in a 3D environment, we are able to compactly display cross-sequence comparisons. This provides the user with a novel platform for visualizing and comparing genomic feature organization.
Affiliation(s)
- Stephen B Montgomery
- Canada's Michael Smith Genome Sciences Centre, Vancouver, British Columbia V5Z 4E6, Canada
|
37
|
Nardone J, Lee DU, Ansel KM, Rao A. Bioinformatics for the 'bench biologist': how to find regulatory regions in genomic DNA. Nat Immunol 2004; 5:768-74. [PMID: 15282556 DOI: 10.1038/ni0804-768] [Citation(s) in RCA: 59] [Impact Index Per Article: 3.0] [Indexed: 11/09/2022]
Abstract
The combination of bioinformatic and biological approaches constitutes a powerful method for identifying gene regulatory elements. High-quality genome sequences are available in public databases for several vertebrate species. Comparative cross-species sequence analysis of these genomes shows considerable conservation of noncoding sequences in DNA. Biological analyses show that an unexpectedly high number of the conserved sequences correspond to functional cis-regulatory regions that influence gene transcription. Because research biologists are often unfamiliar with the bioinformatic resources at their disposal, this commentary discusses how to integrate biological and bioinformatic methods in the discovery of gene regulatory regions and includes a tutorial on widely available comparative genomics programs.
Affiliation(s)
- Julie Nardone
- Department of Pathology, Harvard Medical School and the CBR Institute for Biomedical Research, Boston, Massachusetts 02115, USA
|
38
|
Valverde-Garduno V, Guyot B, Anguita E, Hamlett I, Porcher C, Vyas P. Differences in the chromatin structure and cis-element organization of the human and mouse GATA1 loci: implications for cis-element identification. Blood 2004; 104:3106-16. [PMID: 15265794 DOI: 10.1182/blood-2004-04-1333] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.3] [Indexed: 11/20/2022] Open
Abstract
Cis-element identification is a prerequisite for understanding transcriptional regulation of gene loci. From analysis of a limited number of conserved gene loci, sequence comparison has proved a robust and efficient way to locate cis-elements. Human and mouse GATA1 genes encode a critical hematopoietic transcription factor conserved in expression and function. Proper control of GATA1 transcription is critical in regulating myeloid lineage specification and maturation. Here, we compared sequence and systematically mapped the position of DNase I hypersensitive sites, the acetylation status of histones H3/H4, and in vivo binding of transcription factors over approximately 120 kilobases flanking the human GATA1 gene and the corresponding region in mice. Despite lying in an approximately 10-megabase (Mb) conserved syntenic segment, the chromatin structures of the 2 homologous loci are strikingly different. The 2 previously unidentified hematopoietic cis-elements, one in each species, are not conserved in position and sequence and have enhancer activity in erythroid cells. In vivo, they both bind the transcription factors GATA1, SCL, LMO2, and Ldb1. More broadly, there are both species- and regulatory element-specific patterns of transcription factor binding. These findings suggest that some cis-elements regulating human and mouse GATA1 genes differ. More generally, mouse-human sequence comparison may fail to identify all cis-elements.
Affiliation(s)
- Veronica Valverde-Garduno
- Department of Haematology, Medical Research Council Molecular Unit, Weatherall Institute of Molecular Medicine, University of Oxford, Oxford OX3 9DU, United Kingdom
|
39
|
van der Burg M, Poulsen TS, Hunger SP, Beverloo HB, Smit EME, Vang-Nielsen K, Langerak AW, van Dongen JJM. Split-signal FISH for detection of chromosome aberrations in acute lymphoblastic leukemia. Leukemia 2004; 18:895-908. [PMID: 15042105 DOI: 10.1038/sj.leu.2403340] [Citation(s) in RCA: 59] [Impact Index Per Article: 3.0] [Received: 11/03/2003] [Accepted: 02/03/2004] [Indexed: 11/08/2022]
Abstract
Chromosome aberrations are frequently observed in precursor-B-acute lymphoblastic leukemias (ALL) and T-cell acute lymphoblastic leukemias (T-ALL). These translocations can form leukemia-specific chimeric fusion proteins or they can deregulate expression of an (onco)gene, resulting in aberrant expression or overexpression. Detection of chromosome aberrations is an important tool for risk classification. We developed rapid and sensitive split-signal fluorescent in situ hybridization (FISH) assays for six of the most frequent chromosome aberrations in precursor-B-ALL and T-ALL. The split-signal FISH approach uses two differentially labeled probes, located in one gene at opposite sites of the breakpoint region. Probe sets were developed for the genes TCF3 (E2A) at 19p13, MLL at 11q23, ETV6 at 12p13, BCR at 22q11, SIL-TAL1 at 1q32 and TLX3 (HOX11L2) at 5q35. In normal karyotypes, two colocalized green/red signals are visible, but a translocation results in a split of one of the colocalized signals. Split-signal FISH has three main advantages over the classical fusion-signal FISH approach, which uses two labeled probes located in two genes. First, the detection of a chromosome aberration is independent of the involved partner gene. Second, split-signal FISH allows the identification of the partner gene or chromosome region if metaphase spreads are present, and finally it reduces false-positivity.
Affiliation(s)
- M van der Burg
- Department of Immunology, Erasmus MC, University Medical Center Rotterdam, Rotterdam, The Netherlands
|
40
|
Brake RL, Chatterjee PK, Kees UR, Watt PM. The functional mapping of long-range transcription control elements of the HOX11 proto-oncogene. Biochem Biophys Res Commun 2004; 313:327-35. [PMID: 14684164 DOI: 10.1016/j.bbrc.2003.11.117] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 10/26/2022]
Abstract
Mapping of transcriptional control elements normally depends on the generation of a series of deletion mutants. The consequences of particular deletions are then functionally assessed by their ability to alter gene expression. The information derived from such investigations provides a general regulatory profile of the gene of interest, as well as generating a focus for future experiments. Due to the limitations of conventional DNA cloning methods, it has previously not been possible to use such an approach to rapidly assess the role of long-range regulatory elements that frequently lie further than 20 kb away from the coding region. In order to identify regulatory elements of the proto-oncogene HOX11 that may be mutated in a subset of childhood T-cell acute lymphoblastic leukaemia specimens, we generated nested deletions from a P1 artificial chromosome (PAC). This clone contained 95 kilobases (kb) of the HOX11 locus at 10q24; including 63 kb of 5' regulatory DNA. The deletion series was produced by the use of a recombination based cloning system and clones were subsequently transfected into mammalian cells. We have identified several long-range regulatory elements that mediate transcriptional control of HOX11. This approach is simple, rapid, and inexpensive. Furthermore, it generates multiple deletion clones in a single experiment. This novel approach opens up a new avenue for investigating long-range transcription control. Additionally, by allowing analysis of these elements in the natural context of large integrants the approach does not require the use of artificial extrachromosomal elements. This methodology can be applied to any gene cloned into a PAC or BAC vector and could also be useful in identifying appropriately sized deletion mutants for functional testing in transgenic models.
Affiliation(s)
- Rachael L Brake
- Division of Children's Leukaemia and Cancer Research, Telethon Institute for Child Health Research and Centre for Child Health Research, The University of Western Australia, West Perth, WA 6872, Australia.
|
41
|
Chapman MA, Donaldson IJ, Gilbert J, Grafham D, Rogers J, Green AR, Göttgens B. Analysis of multiple genomic sequence alignments: a web resource, online tools, and lessons learned from analysis of mammalian SCL loci. Genome Res 2004; 14:313-8. [PMID: 14718377 PMCID: PMC327107 DOI: 10.1101/gr.1759004] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.0] [Received: 07/17/2003] [Accepted: 11/24/2003] [Indexed: 11/24/2022]
Abstract
Comparative analysis of genomic sequences is becoming a standard technique for studying gene regulation. However, only a limited number of tools are currently available for the analysis of multiple genomic sequences. An extensive data set for the testing and training of such tools is provided by the SCL gene locus. Here we have expanded the data set to eight vertebrate species by sequencing the dog SCL locus and by annotating the dog and rat SCL loci. To provide a resource for the bioinformatics community, all SCL sequences and functional annotations, comprising a collation of the extensive experimental evidence pertaining to SCL regulation, have been made available via a Web server. A Web interface to new tools specifically designed for the display and analysis of multiple sequence alignments was also implemented. The unique SCL data set and new sequence comparison tools allowed us to perform a rigorous examination of the true benefits of multiple sequence comparisons. We demonstrate that multiple sequence alignments are, overall, superior to pairwise alignments for identification of mammalian regulatory regions. In the search for individual transcription factor binding sites, multiple alignments markedly increase the signal-to-noise ratio compared to pairwise alignments.
|
42
|
Abstract
Identifying genomic locations of transcription-factor binding sites, particularly in higher eukaryotic genomes, has been an enormous challenge. Various experimental and computational approaches have been used to detect these sites; methods involving computational comparisons of related genomes have been particularly successful.
Affiliation(s)
- Martha L Bulyk
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, New Research Building, 77 Avenue Louis Pasteur, Boston, MA 02115, USA.
|
43
|
Brudno M, Chapman M, Göttgens B, Batzoglou S, Morgenstern B. Fast and sensitive multiple alignment of large genomic sequences. BMC Bioinformatics 2003; 4:66. [PMID: 14693042 PMCID: PMC521198 DOI: 10.1186/1471-2105-4-66] [Citation(s) in RCA: 120] [Impact Index Per Article: 5.7] [Received: 09/01/2003] [Accepted: 12/23/2003] [Indexed: 11/10/2022] Open
Abstract
Background: Genomic sequence alignment is a powerful method for genome analysis and annotation, as alignments are routinely used to identify functional sites such as genes or regulatory elements. With a growing number of partially or completely sequenced genomes, multiple alignment is playing an increasingly important role in these studies. In recent years, various tools for pair-wise and multiple genomic alignment have been proposed. Some of them are extremely fast, but often efficiency is achieved at the expense of sensitivity. One way of combining speed and sensitivity is to use an anchored-alignment approach. In a first step, a fast search program identifies a chain of strong local sequence similarities. In a second step, regions between these anchor points are aligned using a slower but more accurate method. Results: Herein, we present CHAOS, a novel algorithm for rapid identification of chains of local pair-wise sequence similarities. Local alignments calculated by CHAOS are used as anchor points to improve the running time of DIALIGN, a slow but sensitive multiple-alignment tool. We show that this way, the running time of DIALIGN can be reduced by more than 95% for BAC-sized and longer sequences, without affecting the quality of the resulting alignments. We apply our approach to a set of five genomic sequences around the stem-cell-leukemia (SCL) gene and demonstrate that exons and small regulatory elements can be identified by our multiple-alignment procedure. Conclusion: We conclude that the novel CHAOS local alignment tool is an effective way to significantly speed up global alignment tools such as DIALIGN without reducing the alignment quality. We likewise demonstrate that the DIALIGN/CHAOS combination is able to accurately align short regulatory sequences in distant orthologues.
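The two-step anchored-alignment idea described in this abstract can be illustrated with a short Python sketch. It is a deliberately simplified stand-in, not the CHAOS algorithm: anchors here are exact k-mer matches, the chain is found by a quadratic dynamic program, and the function names `kmer_seeds`, `best_chain`, and `gaps_between` are hypothetical.

```python
def kmer_seeds(a, b, k):
    """Step 1a: exact k-mer matches between a and b as (pos_in_a, pos_in_b)."""
    index = {}
    for i in range(len(a) - k + 1):
        index.setdefault(a[i:i + k], []).append(i)
    seeds = []
    for j in range(len(b) - k + 1):
        for i in index.get(b[j:j + k], []):
            seeds.append((i, j))
    return sorted(seeds)


def best_chain(seeds, k):
    """Step 1b: longest chain of non-overlapping seeds increasing in both sequences."""
    n = len(seeds)
    if n == 0:
        return []
    best = [1] * n   # best[t]: length of the longest chain ending at seed t
    back = [-1] * n  # back[t]: predecessor of seed t in that chain
    for t in range(n):
        for s in range(t):
            fits = seeds[s][0] + k <= seeds[t][0] and seeds[s][1] + k <= seeds[t][1]
            if fits and best[s] + 1 > best[t]:
                best[t], back[t] = best[s] + 1, s
    end = max(range(n), key=best.__getitem__)
    chain = []
    while end != -1:
        chain.append(seeds[end])
        end = back[end]
    return chain[::-1]


def gaps_between(chain, k, len_a, len_b):
    """Step 2 input: the regions between anchors, left for a slower, sensitive aligner."""
    regions, pa, pb = [], 0, 0
    for i, j in chain:
        regions.append(((pa, i), (pb, j)))
        pa, pb = i + k, j + k
    regions.append(((pa, len_a), (pb, len_b)))
    return regions


a, b = "ACGTTTGGCA", "ACGTCCGGCA"
chain = best_chain(kmer_seeds(a, b, 4), 4)
regions = gaps_between(chain, 4, len(a), len(b))
```

In this toy case the two shared 4-mers become anchors and only the short stretch between them would be handed to the sensitive aligner, which is where the running-time saving of an anchored approach comes from.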
Affiliation(s)
- Michael Brudno
- Department of Computer Science, Stanford University, Stanford, CA 94305, USA
- Michael Chapman
- Department of Haematology, University of Cambridge, Cambridge Institute for Medical Research, Hills Road, Cambridge CB2 2XY, United Kingdom
- Berthold Göttgens
- Department of Haematology, University of Cambridge, Cambridge Institute for Medical Research, Hills Road, Cambridge CB2 2XY, United Kingdom
- Serafim Batzoglou
- Department of Computer Science, Stanford University, Stanford, CA 94305, USA
- Burkhard Morgenstern
- International Graduate School in Bioinformatics and Genome Research, Universität Bielefeld, Postfach 100131, 33501 Bielefeld, Germany
- University of Göttingen, Institute of Microbiology and Genetics, Goldschmidtstr. 1, 37077 Göttingen, Germany
|
44
|
Abstract
Multi-species comparisons of DNA sequences are more powerful for discovering functional sequences than pairwise DNA sequence comparisons. Most current computational tools have been designed for pairwise comparisons, and efficient extension of these tools to multiple species will require knowledge of the ideal evolutionary distance to choose and the development of new algorithms for alignment, analysis of conservation, and visualization of results.
Affiliation(s)
- Inna Dubchak
- Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA.
|
45
|
Herzog C, Zhuang L, Gorgan L, Segal Y, Zhou J. Tissue- and developmental stage-specific activation of α5 and α6(IV) collagen expression in the upper gastrointestinal tract of transgenic mice. Biochem Biophys Res Commun 2003; 311:553-60. [PMID: 14592452 DOI: 10.1016/j.bbrc.2003.09.233] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 10/26/2022]
Abstract
Little is known about mechanisms regulating gene expression for the alpha chains of basement membrane type IV collagen, arranged head-to-head in transcription units COL4A1-COL4A2, COL4A3-COL4A4, and COL4A5-COL4A6, and implicated broadly in genetic diseases. To investigate these mechanisms, we generated transgenic mouse lines bearing 5'-flanking sequences of COL4A5 and COL4A6, cloned upstream of a lacZ reporter gene. A 3.8-kb fragment upstream of COL4A6 directs reporter gene expression in the esophagus, stomach, and duodenum, whereas a 13.8-kb fragment directs expression in the esophagus only. A 10.6-kb fragment upstream of COL4A5 directs expression in the esophagus. Coupled with evidence of long-range conservation between human and mouse non-coding sequences, described herein, our findings provide the first indication that highly specialized patterns characteristic of COL4A5-COL4A6 expression in vivo arise from effects of distributed cis-acting regulatory elements on a bidirectional proximal promoter, itself transcriptionally competent.
Affiliation(s)
- Christine Herzog
- Renal Division, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA
|
46
|
Inada DC, Bashir A, Lee C, Thomas BC, Ko C, Goff SA, Freeling M. Conserved noncoding sequences in the grasses. Genome Res 2003; 13:2030-41. [PMID: 12952874 PMCID: PMC403677 DOI: 10.1101/gr.1280703] [Citation(s) in RCA: 105] [Impact Index Per Article: 5.0] [Indexed: 11/24/2022]
Abstract
As orthologous genes from related species diverge over time, some sequences are conserved in noncoding regions. In mammals, large phylogenetic footprints, or conserved noncoding sequences (CNSs), are known to be common features of genes. Here we present the first large-scale analysis of plant genes for CNSs. We used maize and rice, maximally diverged members of the grass family of monocots. Using a local sequence alignment set to deliver only significant alignments, we found one or more CNSs in the noncoding regions of the majority of genes studied. Grass genes have dramatically fewer and much smaller CNSs than mammalian genes. Twenty-seven percent of grass gene comparisons revealed no CNSs. Genes functioning in upstream regulatory roles, such as transcription factors, are greatly enriched for CNSs relative to genes encoding enzymes or structural proteins. Further, we show that a CNS cluster in an intron of the knotted1 homeobox gene serves as a site of negative regulation. We show that CNSs in the adh1 gene do not correlate with known cis-acting sites. We discuss the potential meanings of CNSs and their value as analytical tools and evolutionary characters. We advance the idea that many CNSs function to lock in gene regulatory decisions.
Affiliation(s)
- Dan Choffnes Inada
- Department of Plant and Microbial Biology, University of California, Berkeley, Berkeley, California 94720, USA
|
47
|
Edwards YJK, Carver TJ, Vavouri T, Frith M, Bishop MJ, Elgar G. Theatre: A software tool for detailed comparative analysis and visualization of genomic sequence. Nucleic Acids Res 2003; 31:3510-7. [PMID: 12824356 PMCID: PMC168908 DOI: 10.1093/nar/gkg501] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.0] [Received: 11/21/2002] [Revised: 01/17/2003] [Accepted: 01/27/2003] [Indexed: 01/12/2023] Open
Abstract
Theatre is a web-based computing system designed for the comparative analysis of genomic sequences, especially with respect to motifs likely to be involved in the regulation of gene expression. Theatre is an interface to commonly used sequence analysis tools and biological sequence databases to determine or predict the positions of coding regions, repetitive sequences and transcription factor binding sites in families of DNA sequences. The information is displayed in a manner that can be easily understood and can reveal patterns that might not otherwise have been noticed. In addition to web-based output, Theatre can produce publication-quality colour hardcopies showing predicted features in aligned genomic sequences. A case study using the p53 promoter region of four mammalian species and two fish species is described. Unlike the mammalian sequences, the promoter regions in fish have not been previously predicted or characterized, and we report the differences between the p53 promoter region of four mammals and that predicted for two fish species. Theatre can be accessed at http://www.hgmp.mrc.ac.uk/Registered/Webapp/theatre/.
Affiliation(s)
- Yvonne J K Edwards
- Comparative Genomics Group, Research Division, MRC UK Human Genome Mapping Project Resource Centre, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SB, UK.
|
48
|
Pennacchio LA. Insights from human/mouse genome comparisons. Mamm Genome 2003; 14:429-36. [PMID: 12925891 DOI: 10.1007/s00335-002-4001-1] [Citation(s) in RCA: 55] [Impact Index Per Article: 2.6] [Received: 01/23/2003] [Accepted: 02/20/2003] [Indexed: 10/27/2022]
Abstract
Large-scale public genomic sequencing efforts have provided a wealth of vertebrate sequence data poised to provide insights into mammalian biology. These include deep genomic sequence coverage of human, mouse, rat, zebrafish, and two pufferfish (Fugu rubripes and Tetraodon nigroviridis) (Aparicio et al. 2002; Lander et al. 2001; Venter et al. 2001; Waterston et al. 2002). In addition, a high priority has been placed on determining the genomic sequence of chimpanzee, dog, cow, frog, and chicken (Boguski 2002). While only recently available, whole genome sequence data have provided the unique opportunity to globally compare complete genome contents. Furthermore, the shared evolutionary ancestry of vertebrate species has allowed the development of comparative genomic approaches to identify ancient conserved sequences with functionality. Accordingly, this review focuses on the initial comparison of available mammalian genomes and describes various insights derived from such analysis.
Affiliation(s)
- Len A Pennacchio
- Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, California, USA.
|
49
|
Pennacchio LA, Rubin EM. Comparative genomic tools and databases: providing insights into the human genome. J Clin Invest 2003. [DOI: 10.1172/jci200317842] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.1] [Indexed: 11/17/2022] Open
|
50
|
Pennacchio LA, Rubin EM. Comparative genomic tools and databases: providing insights into the human genome. J Clin Invest 2003; 111:1099-106. [PMID: 12697725 PMCID: PMC152942 DOI: 10.1172/jci17842] [Citation(s) in RCA: 19] [Impact Index Per Article: 0.9] [Indexed: 11/17/2022] Open
Affiliation(s)
- Len A Pennacchio
- Genome Sciences Department, MS 84-171, Lawrence Berkeley National Laboratory, One Cyclotron Road, Berkeley, CA 94720, USA.
|