1
Xiang G, He X, Giardine BM, Isaac KJ, Taylor DJ, McCoy RC, Jansen C, Keller CA, Wixom AQ, Cockburn A, Miller A, Qi Q, He Y, Li Y, Lichtenberg J, Heuston EF, Anderson SM, Luan J, Vermunt MW, Yue F, Sauria MEG, Schatz MC, Taylor J, Göttgens B, Hughes JR, Higgs DR, Weiss MJ, Cheng Y, Blobel GA, Bodine DM, Zhang Y, Li Q, Mahony S, Hardison RC. Interspecies regulatory landscapes and elements revealed by novel joint systematic integration of human and mouse blood cell epigenomes. Genome Res 2024; 34:1089-1105. [PMID: 38951027 PMCID: PMC11368181 DOI: 10.1101/gr.277950.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2023] [Accepted: 06/24/2024] [Indexed: 07/03/2024]
Abstract
Knowledge of locations and activities of cis-regulatory elements (CREs) is needed to decipher basic mechanisms of gene regulation and to understand the impact of genetic variants on complex traits. Previous studies identified candidate CREs (cCREs) using epigenetic features in one species, making comparisons difficult between species. In contrast, we conducted an interspecies study defining epigenetic states and identifying cCREs in blood cell types to generate regulatory maps that are comparable between species, using integrative modeling of eight epigenetic features jointly in human and mouse in our Validated Systematic Integration (VISION) Project. The resulting catalogs of cCREs are useful resources for further studies of gene regulation in blood cells, indicated by high overlap with known functional elements and strong enrichment for human genetic variants associated with blood cell phenotypes. The contribution of each epigenetic state in cCREs to gene regulation, inferred from a multivariate regression, was used to estimate epigenetic state regulatory potential (esRP) scores for each cCRE in each cell type, which were used to categorize dynamic changes in cCREs. Groups of cCREs displaying similar patterns of regulatory activity in human and mouse cell types, obtained by joint clustering on esRP scores, harbor distinctive transcription factor binding motifs that are similar between species. An interspecies comparison of cCREs revealed both conserved and species-specific patterns of epigenetic evolution. Finally, we show that comparisons of the epigenetic landscape between species can reveal elements with similar roles in regulation, even in the absence of genomic sequence alignment.
Affiliation(s)
- Guanjue Xiang
- Bioinformatics and Genomics Graduate Program, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Department of Data Science, Dana-Farber Cancer Institute, Boston, Massachusetts 02215, USA
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts 02215, USA
| | - Xi He
- Bioinformatics and Genomics Graduate Program, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Belinda M Giardine
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Kathryn J Isaac
- Department of Biology, Johns Hopkins University, Baltimore, Maryland 21218, USA
| | - Dylan J Taylor
- Department of Biology, Johns Hopkins University, Baltimore, Maryland 21218, USA
| | - Rajiv C McCoy
- Department of Biology, Johns Hopkins University, Baltimore, Maryland 21218, USA
| | - Camden Jansen
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Cheryl A Keller
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Alexander Q Wixom
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - April Cockburn
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Amber Miller
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Qian Qi
- Department of Hematology, St. Jude Children's Research Hospital, Memphis, Tennessee 38105, USA
| | - Yanghua He
- Department of Hematology, St. Jude Children's Research Hospital, Memphis, Tennessee 38105, USA
- Department of Human Nutrition, Food and Animal Sciences, University of Hawaiʻi at Mānoa, Honolulu, Hawaii 96822, USA
| | - Yichao Li
- Department of Hematology, St. Jude Children's Research Hospital, Memphis, Tennessee 38105, USA
| | - Jens Lichtenberg
- Genetics and Molecular Biology Branch, National Human Genome Research Institute, Bethesda, Maryland 20892, USA
| | - Elisabeth F Heuston
- Genetics and Molecular Biology Branch, National Human Genome Research Institute, Bethesda, Maryland 20892, USA
| | - Stacie M Anderson
- Flow Cytometry Core, National Human Genome Research Institute, Bethesda, Maryland 20892, USA
| | - Jing Luan
- Department of Pediatrics, Children's Hospital of Philadelphia, and Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
| | - Marit W Vermunt
- Department of Pediatrics, Children's Hospital of Philadelphia, and Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
| | - Feng Yue
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Evanston, Illinois 60611, USA
| | - Michael E G Sauria
- Department of Computer Science, Johns Hopkins University, Baltimore, Maryland 21218, USA
| | - Michael C Schatz
- Department of Computer Science, Johns Hopkins University, Baltimore, Maryland 21218, USA
| | - James Taylor
- Department of Biology, Johns Hopkins University, Baltimore, Maryland 21218, USA
- Department of Computer Science, Johns Hopkins University, Baltimore, Maryland 21218, USA
| | - Berthold Göttgens
- Wellcome and MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge CB2 0AW, United Kingdom
| | - Jim R Hughes
- MRC Weatherall Institute of Molecular Medicine, Oxford University, Oxford OX3 9DS, United Kingdom
| | - Douglas R Higgs
- MRC Weatherall Institute of Molecular Medicine, Oxford University, Oxford OX3 9DS, United Kingdom
| | - Mitchell J Weiss
- Department of Hematology, St. Jude Children's Research Hospital, Memphis, Tennessee 38105, USA
| | - Yong Cheng
- Department of Hematology, St. Jude Children's Research Hospital, Memphis, Tennessee 38105, USA
| | - Gerd A Blobel
- Department of Pediatrics, Children's Hospital of Philadelphia, and Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
| | - David M Bodine
- Genetics and Molecular Biology Branch, National Human Genome Research Institute, Bethesda, Maryland 20892, USA
| | - Yu Zhang
- Department of Statistics, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Qunhua Li
- Department of Statistics, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Center for Computational Biology and Bioinformatics, Genome Sciences Institute, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Shaun Mahony
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Center for Computational Biology and Bioinformatics, Genome Sciences Institute, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
| | - Ross C Hardison
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Center for Computational Biology and Bioinformatics, Genome Sciences Institute, Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
2
Makhmudova U, Schulze PC, Lorkowski S, März W, Geiling JA, Weingärtner O. Monogenic hypertriglyceridemia and recurrent pancreatitis in a homozygous carrier of a rare APOA5 mutation: a case report. J Med Case Rep 2024; 18:278. [PMID: 38872171 PMCID: PMC11177521 DOI: 10.1186/s13256-024-04532-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2024] [Accepted: 03/25/2024] [Indexed: 06/15/2024] Open
Abstract
BACKGROUND Homozygous mutations in the APOA5 gene constitute a rare cause of monogenic hypertriglyceridemia, or familial chylomicronemia syndrome (FCS). We searched PubMed and identified 16 cases of homozygous mutations in the APOA5 gene. Severe hypertriglyceridemia related to monogenic mutations in triglyceride-regulating genes can cause recurrent acute pancreatitis. Standard therapeutic approaches for managing this condition typically include dietary interventions, fibrates, and omega-3 fatty acids. A novel therapeutic approach, the antisense oligonucleotide volanesorsen, is approved for use in patients with FCS. CASE PRESENTATION We report the case of a 25-year-old Afghan male presenting with acute pancreatitis due to severe hypertriglyceridemia of up to 29.8 mmol/L caused by homozygosity in APOA5 (c.427delC, p.Arg143Alafs*57). A low-fat diet enriched with medium-chain triglyceride (MCT) oil and fibrate therapy did not prevent recurrent relapses, and volanesorsen was initiated. Volanesorsen resulted in almost normalized triglyceride levels. No further relapses of acute pancreatitis occurred. The patient reported improved quality of life due to alleviated chronic abdominal pain and headaches. CONCLUSIONS Our case reports a rare yet potentially life-threatening condition: monogenic hypertriglyceridemia-induced acute pancreatitis. The implementation of the antisense drug volanesorsen resulted in improved triglyceride levels, alleviated symptoms, and enhanced quality of life.
Affiliation(s)
- Umidakhon Makhmudova
- Deutsches Herzzentrum der Charité, Hindenburgdamm 30, 12203, Berlin, Germany.
- Friede Springer Cardiovascular Prevention Center @Charité, Hindenburgdamm 30, 12203, Berlin, Germany.
- Charité-Universitätsmedizin Berlin, Freie Universität Berlin and Humboldt-Universität zu Berlin, Klinik/Centrum, Charitéplatz 1, 10117, Berlin, Germany.
- Klinik für Innere Medizin I, Universitätsklinikum Jena, Am Klinikum 1, 07743, Jena, Germany.
| | - P Christian Schulze
- Klinik für Innere Medizin I, Universitätsklinikum Jena, Am Klinikum 1, 07743, Jena, Germany
| | - Stefan Lorkowski
- Institute of Nutritional Sciences, Friedrich Schiller University Jena, Jena, Germany
- Competence Cluster for Nutrition and Cardiovascular Health (nutriCARD), Halle-Jena-Leipzig, Germany
| | - Winfried März
- SYNLAB Academy, SYNLAB Holding Deutschland GmbH Mannheim and Augsburg GmbH, Mannheim, Germany
| | - J-A Geiling
- Klinik für Innere Medizin I, Universitätsklinikum Jena, Am Klinikum 1, 07743, Jena, Germany
| | - Oliver Weingärtner
- Klinik für Innere Medizin I, Universitätsklinikum Jena, Am Klinikum 1, 07743, Jena, Germany
3
Smith SK, Frazel PW, Khodadadi-Jamayran A, Zappile P, Marier C, Okhovat M, Brown S, Long MA, Heguy A, Phelps SM. De novo assembly and annotation of the singing mouse genome. BMC Genomics 2023; 24:569. [PMID: 37749493 PMCID: PMC10521431 DOI: 10.1186/s12864-023-09678-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2023] [Accepted: 09/14/2023] [Indexed: 09/27/2023] Open
Abstract
BACKGROUND Developing genomic resources for a diverse range of species is an important step towards understanding the mechanisms underlying complex traits. Specifically, organisms that exhibit unique and accessible phenotypes of interest allow researchers to address questions that may be ill-suited to traditional model organisms. We sequenced the genome and transcriptome of Alston's singing mouse (Scotinomys teguina), an emerging model for social cognition and vocal communication. In addition to producing advertisement songs used for mate attraction and male-male competition, these rodents are diurnal, live at high altitudes, and are obligate insectivores, providing opportunities to explore diverse physiological, ecological, and evolutionary questions. RESULTS Using PromethION, Illumina, and PacBio sequencing, we produced an annotated genome and transcriptome, which were validated using gene expression and functional enrichment analyses. To assess the usefulness of our assemblies, we performed single-nucleus sequencing on cells of the orofacial motor cortex, a brain region implicated in song coordination, identifying 12 cell types. CONCLUSIONS These resources will provide the opportunity to identify the molecular basis of complex traits in singing mice as well as to contribute data that can be used for large-scale comparative analyses.
Affiliation(s)
- Samantha K Smith
- Department of Integrative Biology, University of Texas at Austin, Austin, TX, 78712, USA.
| | - Paul W Frazel
- Department of Neuroscience and Physiology, New York University Grossman School of Medicine, New York, NY, 10016, USA
| | - Alireza Khodadadi-Jamayran
- Applied Bioinformatics Laboratory, New York University Grossman School of Medicine, New York, NY, 10016, USA
| | - Paul Zappile
- Genome Technology Center, New York University Grossman School of Medicine, New York, NY, 10016, USA
| | - Christian Marier
- Genome Technology Center, New York University Grossman School of Medicine, New York, NY, 10016, USA
| | - Mariam Okhovat
- Department of Integrative Biology, University of Texas at Austin, Austin, TX, 78712, USA
- Present Address: Oregon Health & Science University, Portland, OR, USA
| | - Stuart Brown
- NYU Center for Health Informatics and Bioinformatics, New York University Grossman School of Medicine, New York, NY, 10016, USA
- Present Address: Exxon Mobil Corporate, Houston, TX, USA
| | - Michael A Long
- Department of Neuroscience and Physiology, New York University Grossman School of Medicine, New York, NY, 10016, USA
| | - Adriana Heguy
- Genome Technology Center, New York University Grossman School of Medicine, New York, NY, 10016, USA
| | - Steven M Phelps
- Department of Integrative Biology, University of Texas at Austin, Austin, TX, 78712, USA
4
Yıldırım B, Vogl C. Purifying selection against spurious splicing signals contributes to the base composition evolution of the polypyrimidine tract. J Evol Biol 2023; 36:1295-1312. [PMID: 37564008 PMCID: PMC10946897 DOI: 10.1111/jeb.14205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2023] [Revised: 05/31/2023] [Accepted: 06/15/2023] [Indexed: 08/12/2023]
Abstract
Among eukaryotes, the major spliceosomal pathway is highly conserved. While long introns may contain additional regulatory sequences, the ones in short introns seem to be nearly exclusively related to splicing. Although these regulatory sequences involved in splicing are well-characterized, little is known about their evolution. At the 3' end of introns, the splice signal nearly universally contains the dimer AG, which consists of purines, and the polypyrimidine tract upstream of this 3' splice signal is characterized by over-representation of pyrimidines. If the over-representation of pyrimidines in the polypyrimidine tract is also due to avoidance of a premature splicing signal, we hypothesize that AG should be the most under-represented dimer. Through the use of DNA-strand asymmetry patterns, we confirm this prediction in fruit flies of the genus Drosophila and by comparing the asymmetry patterns to a presumably neutrally evolving region, we quantify the selection strength acting on each motif. Moreover, our inference and simulation method revealed that the best explanation for the base composition evolution of the polypyrimidine tract is the joint action of purifying selection against a spurious 3' splice signal and the selection for pyrimidines. Patterns of asymmetry in other eukaryotes indicate that avoidance of premature splicing similarly affects the nucleotide composition in their polypyrimidine tracts.
Affiliation(s)
- Burçin Yıldırım
- Department of Biomedical Sciences, Vetmeduni Vienna, Vienna, Austria
- Vienna Graduate School of Population Genetics, Vienna, Austria
| | - Claus Vogl
- Department of Biomedical Sciences, Vetmeduni Vienna, Vienna, Austria
- Vienna Graduate School of Population Genetics, Vienna, Austria
5
Sackerson C, Garcia V, Medina N, Maldonado J, Daly J, Cartwright R. Comparative analysis of the myoglobin gene in whales and humans reveals evolutionary changes in regulatory elements and expression levels. PLoS One 2023; 18:e0284834. [PMID: 37643191 PMCID: PMC10464968 DOI: 10.1371/journal.pone.0284834] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2023] [Accepted: 08/15/2023] [Indexed: 08/31/2023] Open
Abstract
Cetacea and other diving mammals have undergone numerous adaptations to their aquatic environment, among them high levels of the oxygen-carrying intracellular hemoprotein myoglobin in skeletal muscles. Hypotheses regarding the mechanisms leading to these high myoglobin levels often invoke the induction of gene expression by exercise, hypoxia, and other physiological gene regulatory pathways. Here we explore an alternative hypothesis: that cetacean myoglobin genes have evolved high levels of transcription driven by the intrinsic developmental mechanisms that drive muscle cell differentiation. We have used luciferase assays in differentiated C2C12 cells to test this hypothesis. Contrary to our hypothesis, we find that the myoglobin gene from the minke whale, Balaenoptera acutorostrata, shows a low level of expression, only about 8% that of humans. This low expression level is broadly shared among cetaceans and artiodactylans. Previous work on regulation of the human gene has identified a core muscle-specific enhancer comprised of two regions, the "AT element" and a C-rich sequence 5' of the AT element termed the "CCAC-box". Analysis of the minke whale gene supports the importance of the AT element, but the minke whale CCAC-box ortholog has little effect. Instead, critical positive input has been identified in a G-rich region 3' of the AT element. Also, a conserved E-box in exon 1 positively affects expression, despite having been assigned a repressive role in the human gene. Last, a novel region 5' of the core enhancer has been identified, which we hypothesize may function as a boundary element. These results illustrate regulatory flexibility during evolution. We discuss the possibility that low transcription levels are actually beneficial, and that evolution of the myoglobin protein toward enhanced stability is a critical factor in the accumulation of high myoglobin levels in adult cetacean muscle tissue.
Affiliation(s)
- Charles Sackerson
- Biology Department, California State University Channel Islands, Camarillo, California, United States of America
| | - Vivian Garcia
- Biology Department, California State University Channel Islands, Camarillo, California, United States of America
| | - Nicole Medina
- Biology Department, California State University Channel Islands, Camarillo, California, United States of America
| | - Jessica Maldonado
- Biology Department, California State University Channel Islands, Camarillo, California, United States of America
| | - John Daly
- Biology Department, California State University Channel Islands, Camarillo, California, United States of America
| | - Rachel Cartwright
- Biology Department, California State University Channel Islands, Camarillo, California, United States of America
- The Keiki Kohola Project, Lahaina, Hawaii, United States of America
6
The tissue-specificity associated region and motif of an emx2 downstream enhancer CNE2.04 in zebrafish. Gene Expr Patterns 2022; 45:119269. [PMID: 35970322 DOI: 10.1016/j.gep.2022.119269] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2022] [Revised: 07/04/2022] [Accepted: 07/29/2022] [Indexed: 11/22/2022]
Abstract
BACKGROUND The expression level of EMX2 plays an important role in the development of the nervous system and of cancers. CNE2.04, a conserved enhancer downstream of emx2, drives fluorescent protein expression in a pattern similar to that of emx2. METHODS Truncated or motif-mutated CNE2.04 transgenic reporter plasmids were constructed and co-injected with Tol2 mRNA into fertilized zebrafish eggs at the one-cell stage. Green fluorescence expression patterns were observed at 24, 48, and 72 hpf, and the fluorescence rates of different tissues were counted at 48 hpf. RESULTS Compared to CNE2.04, CNE2.04-R400 had comparable enhancer activity, while the tissue specificity of CNE2.04-L400 was obviously changed. Mutation of motif CCCCTC obviously changed the enhancer activity, and mutations of motif CCGCTC also changed it. CONCLUSION CNE2.04-R400 is associated with the tissue specificity of CNE2.04, and motif CCCCTC plays an important role in the enhancer activity of CNE2.04.
7
Takeuchi Y, Yahagi N, Aita Y, Mehrazad-Saber Z, Ho MH, Huyan Y, Murayama Y, Shikama A, Masuda Y, Izumida Y, Miyamoto T, Matsuzaka T, Kawakami Y, Shimano H. FoxO-KLF15 pathway switches the flow of macronutrients under the control of insulin. iScience 2021; 24:103446. [PMID: 34988390 PMCID: PMC8710527 DOI: 10.1016/j.isci.2021.103446] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2021] [Revised: 09/18/2021] [Accepted: 11/11/2021] [Indexed: 11/15/2022] Open
Abstract
KLF15 is a transcription factor that plays an important role in the activation of gluconeogenesis from amino acids as well as the suppression of lipogenesis from glucose. Here we identified the transcription start site of liver-specific KLF15 transcript and showed that FoxO1/3 transcriptionally regulates Klf15 gene expression by directly binding to the liver-specific Klf15 promoter. To achieve this, we performed a precise in vivo promoter analysis combined with the genome-wide transcription-factor-screening method "TFEL scan", using our original Transcription Factor Expression Library (TFEL), which covers nearly all the transcription factors in the mouse genome. Hepatic Klf15 expression is significantly increased via FoxOs by attenuating insulin signaling. Furthermore, FoxOs elevate the expression levels of amino acid catabolic enzymes and suppress SREBP-1c via KLF15, resulting in accelerated amino acid breakdown and suppressed lipogenesis during fasting. Thus, the FoxO-KLF15 pathway contributes to switching the macronutrient flow in the liver under the control of insulin.
Affiliation(s)
- Yoshinori Takeuchi
- Nutrigenomics Research Group, Faculty of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki 305-8575, Japan
- Department of Internal Medicine (Endocrinology and Metabolism), Faculty of Medicine, University of Tsukuba, Tsukuba, Ibaraki 305-8575, Japan
| | - Naoya Yahagi
- Nutrigenomics Research Group, Faculty of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki 305-8575, Japan
- Department of Internal Medicine (Endocrinology and Metabolism), Faculty of Medicine, University of Tsukuba, Tsukuba, Ibaraki 305-8575, Japan
| | - Yuichi Aita
- Nutrigenomics Research Group, Faculty of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki 305-8575, Japan
- Department of Internal Medicine (Endocrinology and Metabolism), Faculty of Medicine, University of Tsukuba, Tsukuba, Ibaraki 305-8575, Japan
| | - Zahra Mehrazad-Saber
- Nutrigenomics Research Group, Faculty of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki 305-8575, Japan
- Department of Internal Medicine (Endocrinology and Metabolism), Faculty of Medicine, University of Tsukuba, Tsukuba, Ibaraki 305-8575, Japan
| | - Man Hei Ho
- Nutrigenomics Research Group, Faculty of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki 305-8575, Japan
- Department of Internal Medicine (Endocrinology and Metabolism), Faculty of Medicine, University of Tsukuba, Tsukuba, Ibaraki 305-8575, Japan
| | - Yiren Huyan
- Nutrigenomics Research Group, Faculty of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki 305-8575, Japan
| | - Yuki Murayama
- Nutrigenomics Research Group, Faculty of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki 305-8575, Japan
- Department of Internal Medicine (Endocrinology and Metabolism), Faculty of Medicine, University of Tsukuba, Tsukuba, Ibaraki 305-8575, Japan
| | - Akito Shikama
- Nutrigenomics Research Group, Faculty of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki 305-8575, Japan
- Department of Internal Medicine (Endocrinology and Metabolism), Faculty of Medicine, University of Tsukuba, Tsukuba, Ibaraki 305-8575, Japan
| | - Yukari Masuda
- Nutrigenomics Research Group, Faculty of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki 305-8575, Japan
- Department of Internal Medicine (Endocrinology and Metabolism), Faculty of Medicine, University of Tsukuba, Tsukuba, Ibaraki 305-8575, Japan
| | - Yoshihiko Izumida
- Nutrigenomics Research Group, Faculty of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki 305-8575, Japan
| | - Takafumi Miyamoto
- Department of Internal Medicine (Endocrinology and Metabolism), Faculty of Medicine, University of Tsukuba, Tsukuba, Ibaraki 305-8575, Japan
| | - Takashi Matsuzaka
- Department of Internal Medicine (Endocrinology and Metabolism), Faculty of Medicine, University of Tsukuba, Tsukuba, Ibaraki 305-8575, Japan
| | - Yasushi Kawakami
- Nutrigenomics Research Group, Faculty of Medicine, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki 305-8575, Japan
| | - Hitoshi Shimano
- Department of Internal Medicine (Endocrinology and Metabolism), Faculty of Medicine, University of Tsukuba, Tsukuba, Ibaraki 305-8575, Japan
8
Gasperini M, Tome JM, Shendure J. Towards a comprehensive catalogue of validated and target-linked human enhancers. Nat Rev Genet 2020; 21:292-310. [PMID: 31988385 PMCID: PMC7845138 DOI: 10.1038/s41576-019-0209-0] [Citation(s) in RCA: 159] [Impact Index Per Article: 39.8] [Accepted: 12/13/2019] [Indexed: 12/14/2022]
Abstract
The human gene catalogue is essentially complete, but we lack an equivalently vetted inventory of bona fide human enhancers. Hundreds of thousands of candidate enhancers have been nominated via biochemical annotations; however, only a handful of these have been validated and confidently linked to their target genes. Here we review emerging technologies for discovering, characterizing and validating human enhancers at scale. We furthermore propose a new framework for operationally defining enhancers that accommodates the heterogeneous and complementary results that are emerging from reporter assays, biochemical measurements and CRISPR screens.
Affiliation(s)
- Molly Gasperini
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Jacob M Tome
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Jay Shendure
- Department of Genome Sciences, University of Washington, Seattle, WA, USA.
- Brotman Baty Institute for Precision Medicine, Seattle, WA, USA.
- Allen Discovery Center for Cell Lineage, Seattle, WA, USA.
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA.
9
Morris VE, Hashmi SS, Zhu L, Maili L, Urbina C, Blackwell S, Greives MR, Buchanan EP, Mulliken JB, Blanton SH, Zheng WJ, Hecht JT, Letra A. Evidence for craniofacial enhancer variation underlying nonsyndromic cleft lip and palate. Hum Genet 2020; 139:1261-1272. [PMID: 32318854 DOI: 10.1007/s00439-020-02169-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 01/10/2020] [Accepted: 04/13/2020] [Indexed: 12/14/2022]
Abstract
Nonsyndromic cleft lip with or without cleft palate (NSCLP) is a common birth defect for which only ~20% of the underlying genetic variation has been identified. Variants in noncoding regions have been increasingly suggested to contribute to the missing heritability. In this study, we investigated whether variation in craniofacial enhancers contributes to NSCLP. Candidate enhancers were identified using VISTA Enhancer Browser and previous publications. Prioritization was based on patterning defects in knockout mice, deletion/duplication of craniofacial genes in animal models and results of whole exome/whole genome sequencing studies. This resulted in 20 craniofacial enhancers to be investigated. Custom amplicon-based sequencing probes were designed and used for sequencing 380 NSCLP probands (from multiplex and simplex families of non-Hispanic white (NHW) and Hispanic ethnicities) using Illumina MiSeq. The frequencies of identified variants were compared to ethnically matched European (CEU) and Los Angeles Mexican (MXL) control genomes and used for association analyses. Variants in mm427/MSX1 and hs1582/SPRY1 showed genome-wide significant association with NSCLP (p ≤ 6.4 × 10⁻¹¹). In silico analysis showed that these enhancer variants may disrupt important transcription factor binding sites. Haplotypes involving these enhancers and also mm435/ABCA4 were significantly associated with NSCLP, especially in NHW (p ≤ 6.3 × 10⁻⁷). Importantly, groupwise burden analysis showed several enhancer combinations significantly over-represented in NSCLP individuals, revealing novel NSCLP pathways and supporting a polygenic inheritance model. Our findings support the role of craniofacial enhancer sequence variation in the etiology of NSCLP.
Affiliation(s)
- Vershanna E Morris
- Department of Pediatrics, UTHealth McGovern Medical School, Houston, TX, 77030, USA
- Pediatric Research Center, UTHealth McGovern Medical School, Houston, TX, 77030, USA
| | - S Shahrukh Hashmi
- Department of Pediatrics, UTHealth McGovern Medical School, Houston, TX, 77030, USA
- Pediatric Research Center, UTHealth McGovern Medical School, Houston, TX, 77030, USA
| | - Lisha Zhu
- UTHealth School of Biomedical Informatics, Houston, TX, 77054, USA
| | - Lorena Maili
- Department of Pediatrics, UTHealth McGovern Medical School, Houston, TX, 77030, USA
- Pediatric Research Center, UTHealth McGovern Medical School, Houston, TX, 77030, USA
| | - Christian Urbina
- Department of Pediatrics, UTHealth McGovern Medical School, Houston, TX, 77030, USA
- Pediatric Research Center, UTHealth McGovern Medical School, Houston, TX, 77030, USA
| | | | - Matthew R Greives
- Department of Pediatric Surgery, University of Texas Health Science Center McGovern Medical School, Houston, TX, 77030, USA
| | - Edward P Buchanan
- Department of Plastic Surgery, Texas Children's Hospital, Houston, TX, 77030, USA
| | - John B Mulliken
- Department of Plastic Surgery, Boston Children's Hospital, Boston, MA, 02115, USA
| | - Susan H Blanton
- Dr. John T. Macdonald Foundation Department of Human Genetics, John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, FL, 33136, USA
| | - W Jim Zheng
- UTHealth School of Biomedical Informatics, Houston, TX, 77054, USA
| | - Jacqueline T Hecht
- Department of Pediatrics, UTHealth McGovern Medical School, Houston, TX, 77030, USA.,Pediatric Research Center, UTHealth McGovern Medical School, Houston, TX, 77030, USA.,Shriners' Hospital for Children, Houston, TX, 77030, USA.,Center for Craniofacial Research, UTHealth School of Dentistry, Houston, TX, 77054, USA
| | - Ariadne Letra
- School of Dentistry, Department of Diagnostic and Biomedical Sciences, University of Texas Health Science Center At Houston, 1941 East Road, BBSB 4210, Houston, TX, 77054, USA. .,Center for Craniofacial Research, UTHealth School of Dentistry, Houston, TX, 77054, USA.
|
10
|
Blankvoort S, Descamps LAL, Kentros C. Enhancer-Driven Gene Expression (EDGE) enables the generation of cell type specific tools for the analysis of neural circuits. Neurosci Res 2020; 152:78-86. [PMID: 31958494 DOI: 10.1016/j.neures.2020.01.009] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 12/20/2019] [Revised: 01/10/2020] [Accepted: 01/10/2020] [Indexed: 12/20/2022]
Abstract
As in all circuits, fully understanding how neural circuits operate requires the ability to specifically manipulate individual circuit elements, i.e. particular neuronal cell types. While recent years saw the development of molecular genetic tools allowing one to control and monitor neuronal activity, progress is limited by the ability to express such transgenes specifically enough. This goal is complicated by the fact that we are only beginning to understand how many cell types exist in the mammalian brain. Obtaining neuronal cell type-specific expression requires co-opting the genetic machinery which specifies their striking diversity, typically done by making transgenic animals using promoters expressing in neurons. However, while the vast majority of genes express in the brain, they almost always express in multiple cell types, meaning native promoters are not specific enough. We have recently taken a new approach to increase the specificity of transgene expression based upon identifying the distal cis-regulatory genomic elements (i.e. enhancers) uniquely active in a brain region and combining them with a heterologous minimal promoter. Termed Enhancer-Driven Gene Expression (EDGE), it allows for the generation of transgenic animals targeting the cell types of any brain region with far greater specificity than can be obtained with native promoters. Moreover, their small size allows for the generation of cell-specific viral vectors, conceivably enabling circuit-specific manipulations to any species.
Affiliation(s)
- Stefan Blankvoort
- Kavli Institute for Systems Neuroscience and Centre for Neural Computation, NTNU, Trondheim, Norway.
- Lucie A L Descamps
- Kavli Institute for Systems Neuroscience and Centre for Neural Computation, NTNU, Trondheim, Norway
- Cliff Kentros
- Kavli Institute for Systems Neuroscience and Centre for Neural Computation, NTNU, Trondheim, Norway.
|
11
|
Identification and Characterization of Cis-Regulatory Elements for Photoreceptor-Type-Specific Transcription in ZebraFish. Methods Mol Biol 2020; 2092:123-145. [PMID: 31786786 DOI: 10.1007/978-1-0716-0175-4_10] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/06/2022]
Abstract
Tissue-specific or cell-type-specific transcription of protein-coding genes is controlled by both trans-regulatory elements (TREs) and cis-regulatory elements (CREs). However, it is challenging to identify TREs and CREs, which are unknown for most genes. Here, we describe a protocol for identifying two types of transcription-activating CREs (core promoters and enhancers) of zebrafish photoreceptor type-specific genes. This protocol is composed of three phases: bioinformatic prediction, experimental validation, and characterization of the CREs. To better illustrate the principles and logic of this protocol, we exemplify it with the discovery of the core promoter and enhancer of the mpp5b apical polarity gene (also known as ponli), whose red, green, and blue (RGB) cone-specific transcription requires its enhancer, a member of the rainbow enhancer family. While exemplified with an RGB-cone-specific gene, this protocol is general and can be used to identify the core promoters and enhancers of other protein-coding genes.
|
12
|
Doan RN, Shin T, Walsh CA. Evolutionary Changes in Transcriptional Regulation: Insights into Human Behavior and Neurological Conditions. Annu Rev Neurosci 2019; 41:185-206. [PMID: 29986162 DOI: 10.1146/annurev-neuro-080317-062104] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 11/09/2022]
Abstract
Understanding the biological basis for human-specific cognitive traits presents both immense challenges and unique opportunities. Although the question of what makes us human has been investigated with several different methods, the rise of comparative genomics, epigenomics, and medical genetics has provided tools to help narrow down and functionally assess the regions of the genome that seem evolutionarily relevant along the human lineage. In this review, we focus on how medical genetic cases have provided compelling functional evidence for genes and loci that appear to have interesting evolutionary signatures in humans. Furthermore, we examine a special class of noncoding regions, human accelerated regions (HARs), that have been suggested to show human-lineage-specific divergence, and how the use of clinical and population data has started to provide functional information to examine these regions. Finally, we outline methods that provide new insights into functional noncoding sequences in evolution.
Affiliation(s)
- Ryan N Doan
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, Massachusetts 02115, USA; Allen Discovery Center for Human Brain Evolution, Boston Children's Hospital, Boston, Massachusetts 02115, USA; Howard Hughes Medical Institute, Boston Children's Hospital, Boston, Massachusetts 02115, USA
- Taehwan Shin
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, Massachusetts 02115, USA; Allen Discovery Center for Human Brain Evolution, Boston Children's Hospital, Boston, Massachusetts 02115, USA
- Christopher A Walsh
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, Massachusetts 02115, USA; Allen Discovery Center for Human Brain Evolution, Boston Children's Hospital, Boston, Massachusetts 02115, USA; Howard Hughes Medical Institute, Boston Children's Hospital, Boston, Massachusetts 02115, USA; Departments of Pediatrics and Neurology, Harvard Medical School, Boston, Massachusetts 02138, USA
|
13
|
Oak N, Ghosh R, Huang KL, Wheeler DA, Ding L, Plon SE. Framework for microRNA variant annotation and prioritization using human population and disease datasets. Hum Mutat 2018; 40:73-89. [PMID: 30302893 DOI: 10.1002/humu.23668] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 07/31/2018] [Revised: 10/05/2018] [Accepted: 10/08/2018] [Indexed: 11/10/2022]
Abstract
MicroRNA (miRNA) expression is frequently deregulated in human disease; in contrast, disease-associated miRNA mutations are understudied. We developed Annotative Database of miRNA Elements, ADmiRE, which combines multiple existing and new biological annotations to aid prioritization of causal miRNA variation. We annotated 10,206 mature (3,257 within seed region) miRNA variants from multiple large sequencing datasets including gnomAD (15,496 genomes; 123,136 exomes). The pattern of miRNA variation closely resembles protein-coding exonic regions, with no difference between intragenic and intergenic miRNAs (P = 0.56), and high confidence miRNAs demonstrate higher sequence constraint (P < 0.001). Conservation analysis across 100 vertebrates identified 765 highly conserved miRNAs that also have limited genetic variation in gnomAD. We applied ADmiRE to the TCGA PanCancerAtlas WES dataset containing over 10,000 individuals across 33 adult cancers and annotated 1,267 germline (rare in gnomAD) and 1,492 somatic miRNA variants. Several miRNA families with deregulated gene expression in cancer have low levels of both somatic and germline variants, e.g., let-7 and miR-10. In addition to known somatic miR-142 mutations in hematologic cancers, we describe novel somatic miR-21 mutations in esophageal cancers impacting downstream miRNA targets. Through the development of ADmiRE, we present a framework for annotation and prioritization of miRNA variation in disease datasets.
Affiliation(s)
- Ninad Oak
- Departments of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030; Texas Children's Cancer Center, Texas Children's Hospital, Houston, TX
- Rajarshi Ghosh
- Departments of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030; Department of Pediatrics, Baylor College of Medicine, Houston, TX 77030
- Kuan-Lin Huang
- Department of Medicine, Washington University in St. Louis, MO 63108; McDonnell Genome Institute, Washington University in St. Louis, MO 63108
- David A Wheeler
- Departments of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030; Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030
- Li Ding
- Department of Medicine, Washington University in St. Louis, MO 63108; McDonnell Genome Institute, Washington University in St. Louis, MO 63108; Department of Genetics, Washington University in St. Louis, MO 63108; Siteman Cancer Center, Washington University in St. Louis, MO 63108
- Sharon E Plon
- Departments of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030; Texas Children's Cancer Center, Texas Children's Hospital, Houston, TX; Department of Pediatrics, Baylor College of Medicine, Houston, TX 77030; Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030
|
14
|
Li L, Barth NKH, Hirth E, Taher L. Pairs of Adjacent Conserved Noncoding Elements Separated by Conserved Genomic Distances Act as Cis-Regulatory Units. Genome Biol Evol 2018; 10:2535-2550. [PMID: 30184074 PMCID: PMC6161761 DOI: 10.1093/gbe/evy196] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Accepted: 09/01/2018] [Indexed: 01/02/2023]
Abstract
Comparative genomic studies have identified thousands of conserved noncoding elements (CNEs) in the mammalian genome, many of which have been reported to exert cis-regulatory activity. We analyzed ∼5,500 pairs of adjacent CNEs in the human genome and found that despite divergence at the nucleotide sequence level, the inter-CNE distances of the pairs are under strong evolutionary constraint, with inter-CNE sequences featuring significantly lower transposon densities than expected. Further, we show that different degrees of conservation of the inter-CNE distance are associated with distinct cis-regulatory functions at the CNEs. Specifically, the CNEs in pairs with conserved and mildly contracted inter-CNE sequences are the most likely to represent active or poised enhancers. In contrast, CNEs in pairs with extremely contracted or expanded inter-CNE sequences are associated with no cis-regulatory activity. Furthermore, we observed that functional CNEs in a pair have very similar epigenetic profiles, hinting at a functional relationship between them. Taken together, our results support the existence of epistatic interactions between adjacent CNEs that are distance-sensitive and disrupted by transposon insertions and deletions, and contribute to our understanding of the selective forces acting on cis-regulatory elements, which are crucial for elucidating the molecular mechanisms underlying adaptive evolution and human genetic diseases.
Affiliation(s)
- Lifei Li
- Division of Bioinformatics, Department of Biology, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Nicolai K H Barth
- Division of Bioinformatics, Department of Biology, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Eva Hirth
- Division of Bioinformatics, Department of Biology, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- Leila Taher
- Division of Bioinformatics, Department of Biology, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
|
15
|
Vlahovic I, Gluncic M, Rosandic M, Ugarkovic Ð, Paar V. Regular Higher Order Repeat Structures in Beetle Tribolium castaneum Genome. Genome Biol Evol 2018; 9:2668-2680. [PMID: 27492235 PMCID: PMC5737470 DOI: 10.1093/gbe/evw174] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Accepted: 07/21/2016] [Indexed: 02/07/2023]
Abstract
Higher order repeats (HORs) containing tandems of primary and secondary repeat units (head-to-tail “tandem within tandem pattern”), referred to as regular HORs, are typical for primate alpha satellite DNAs and most pronounced in the human genome. Regular HORs are known to be a result of recent evolutionary processes. In non-primate genomes, mostly so-called complex HORs have been found, without head-to-tail tandems of primary repeat units. In the beetle Tribolium castaneum, considered a model case for genome studies, large tandem repeats have been identified, but no HORs have been reported. Here, using our novel robust repeat-finding algorithm Global Repeat Map, we discover two regular and six complex HORs in T. castaneum. In organizational pattern, the integrity and homogeneity of regular HORs in T. castaneum resemble human regular HORs (with T. castaneum monomers different from human alpha satellite monomers), involving a wider range of monomer lengths than in human HORs. Similar regular higher order repeat structures have not previously been found in insects. Some of these novel HORs in T. castaneum appear to be the most regular among known HORs in non-primate genomes, although with substantial riddling. This is intriguing, in particular from the point of view of the role of non-coding repeats in modulation of gene expression.
Affiliation(s)
- Ines Vlahovic
- Faculty of Science, University of Zagreb, Zagreb, Croatia
- Matko Gluncic
- Faculty of Science, University of Zagreb, Zagreb, Croatia
- Vladimir Paar
- Faculty of Science, University of Zagreb, Zagreb, Croatia; Croatian Academy of Sciences and Arts, Zagreb, Croatia
|
16
|
Martín-Gálvez D, Dunoyer de Segonzac D, Ma MCJ, Kwitek AE, Thybert D, Flicek P. Genome variation and conserved regulation identify genomic regions responsible for strain specific phenotypes in rat. BMC Genomics 2017; 18:986. [PMID: 29272997 PMCID: PMC5741965 DOI: 10.1186/s12864-017-4351-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 08/07/2017] [Accepted: 11/27/2017] [Indexed: 11/10/2022]
Abstract
Background: The genomes of laboratory rat strains are characterised by a mosaic haplotype structure caused by their unique breeding history. These mosaic haplotypes have been recently mapped by extensive sequencing of key strains. Comparison of genomic variation between two closely related rat strains with different phenotypes has been proposed as an effective strategy for the discovery of candidate strain-specific regions involved in phenotypic differences. We developed a method to prioritise strain-specific haplotypes by integrating genomic variation and genomic regulatory data predicted to be involved in specific phenotypes. Specifically, we aimed to identify genomic regions associated with Metabolic Syndrome (MetS), a disorder of energy utilization and storage affecting several organ systems.
Results: We compared two Lyon rat strains, Lyon Hypertensive (LH), which is susceptible to MetS, and Lyon Low pressure (LL), which is susceptible to obesity as an intermediate MetS phenotype, with a third strain (Lyon Normotensive, LN) that is resistant to both MetS and obesity. Applying a novel metric, we ranked the identified strain-specific haplotypes using evolutionary conservation of the occupancy of three liver-specific transcription factors (HNF4A, CEBPA, and FOXA1) in five rodents including rat. Consideration of regulatory information effectively identified regions with liver-associated genes and rat orthologues of human GWAS variants related to obesity and metabolic traits. We attempted to find possible causative variants and compared them with the candidate genes proposed by previous studies. In strain-specific regions with conserved regulation, we found a significant enrichment for published evidence related to obesity (one of the metabolic symptoms shown by the Lyon strains) amongst the genes assigned to promoters with strain-specific variation.
Conclusions: Our results show that the use of functional regulatory conservation is a potentially effective approach to select strain-specific genomic regions associated with phenotypic differences among Lyon rats and could be extended to other systems.
Affiliation(s)
- David Martín-Gálvez
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Denis Dunoyer de Segonzac
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Man Chun John Ma
- Department of Pharmacology, University of Iowa, Iowa City, IA, USA; Iowa Institute of Human Genetics, University of Iowa, Iowa City, IA, USA; Present address: MD Anderson Cancer Center, University of Texas, Houston, TX, USA
- Anne E Kwitek
- Department of Pharmacology, University of Iowa, Iowa City, IA, USA; Iowa Institute of Human Genetics, University of Iowa, Iowa City, IA, USA
- David Thybert
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK; Present address: Earlham Institute, Norwich Research Park, Norwich, NR4 7UH, UK.
- Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.
|
17
|
Smith AF, Posakony JW, Rebeiz M. Automated tools for comparative sequence analysis of genic regions using the GenePalette application. Dev Biol 2017; 429:158-164. [PMID: 28673819 PMCID: PMC5623810 DOI: 10.1016/j.ydbio.2017.06.033] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 05/01/2017] [Revised: 06/28/2017] [Accepted: 06/28/2017] [Indexed: 10/19/2022]
Abstract
Comparative sequence analysis methods, such as phylogenetic footprinting, represent one of the most effective ways to decode regulatory sequence functions based upon DNA sequence information alone. The laborious task of assembling orthologous sequences to perform these comparisons is a hurdle to these analyses, which is further aggravated by the relative paucity of tools for visualization of sequence comparisons in large genic regions. Here, we describe a second-generation implementation of the GenePalette DNA sequence analysis software to facilitate comparative studies of gene function and regulation. We have developed an automated module called OrthologGrabber (OG) that performs BLAT searches against the UC Santa Cruz genome database to identify and retrieve segments homologous to a region of interest. Upon acquisition, sequences are compared to identify high-confidence anchor-points, which are graphically displayed. The visualization of anchor-points alongside other DNA features, such as transcription factor binding sites, allows users to precisely examine whether a binding site of interest is conserved, even if the surrounding region exhibits poor sequence identity. This approach also aids in identifying orthologous segments of regulatory DNA, facilitating studies of regulatory sequence evolution. As with previous versions of the software, GenePalette 2.1 takes the form of a platform-independent, single-windowed interface that is simple to use.
Affiliation(s)
- Andrew F Smith
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260, USA
- James W Posakony
- Division of Biological Sciences/CDB, University of California San Diego, La Jolla, CA 92093, USA
- Mark Rebeiz
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260, USA.
|
18
|
Fu H, Zhang X. Noncoding Variants Functional Prioritization Methods Based on Predicted Regulatory Factor Binding Sites. Curr Genomics 2017; 18:322-331. [PMID: 29081688 PMCID: PMC5635616 DOI: 10.2174/1389202918666170228143619] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/01/2016] [Revised: 10/16/2016] [Accepted: 11/02/2016] [Indexed: 12/31/2022]
Abstract
Background: In the post-genomic era, research into the genetic mechanisms of disease increasingly depends on studies of genes, gene networks, and gene-protein interaction networks. To explore gene expression and regulation, researchers have carried out many studies on transcription factors and their binding sites (TFBSs). Building on the large number of TFBS prediction values produced by deep learning models, further computation and analysis have been done to reveal the relationship between gene mutations and the occurrence of disease. Deep learning methods have been shown to outperform conventional methods in predicting the functions of noncoding variants. Research on predicting the functions of single nucleotide polymorphisms (SNPs) is expected to uncover how gene mutations affect human traits and diseases.
Results: We reviewed the conventional TFBS identification methods from different perspectives. For the deep learning methods that predict TFBSs, we discussed related problems such as raw data preprocessing, the structural design of the deep convolutional neural network (CNN), and the measurement of model performance. We then summarized the techniques usually used to find functional noncoding variants from de novo sequence.
Conclusion: With the rapid development of high-throughput assays, more sample data and chromatin features should help improve the prediction accuracy of deep convolutional neural networks for TFBS identification. Meanwhile, gaining more insight into the deep CNN framework itself has proved useful both for improving model performance and for developing designs better suited to the sample data. Based on the feature values predicted by the deep CNN model, the prioritization model for functional noncoding variants would help reveal the effect of gene mutations on diseases.
Affiliation(s)
- Haoyue Fu
- College of Sciences, Northeastern University, Shenyang, China
- Lianping Yang
- College of Sciences, Northeastern University, Shenyang, China
- University of Southern California, Dept. Biol. Sci., Program Mol & Computat Biol, USA
- Xiangde Zhang
- College of Sciences, Northeastern University, Shenyang, China
|
19
|
Abstract
SHOX deficiency is the most frequent genetic growth disorder associated with isolated and syndromic forms of short stature. Caused by mutations in the homeobox gene SHOX, its varied clinical manifestations include isolated short stature, Léri-Weill dyschondrosteosis, and Langer mesomelic dysplasia. In addition, SHOX deficiency contributes to the skeletal features in Turner syndrome. Causative SHOX mutations have allowed downstream pathology to be linked to defined molecular lesions. Expression levels of SHOX are tightly regulated, and almost half of the pathogenic mutations have affected enhancers. Clinical severity of SHOX deficiency varies between genders and ranges from normal stature to profound mesomelic skeletal dysplasia. Treatment options for children with SHOX deficiency are available. Two decades of research support the concept of SHOX as a transcription factor that integrates diverse aspects of bone development, growth plate biology, and apoptosis. Due to its absence in mouse, the animal models of choice have become chicken and zebrafish. These models, therefore, together with micromass cultures and primary cell lines, have been used to address SHOX function. Pathway and network analyses have identified interactors, target genes, and regulators. Here, we summarize recent data and give insight into the critical molecular and cellular functions of SHOX in the etiopathogenesis of short stature and limb development.
Affiliation(s)
- Antonio Marchini
- Tumour Virology Division F010 (A.M.), German Cancer Research Center, 69120 Heidelberg, Germany; Department of Oncology (A.M.), Luxembourg Institute of Health 84, rue Val Fleuri L-1526, Luxembourg; Department of Pediatrics (T.O.), Hamamatsu University School of Medicine, Higashi-ku, Hamamatsu 431-3192, Japan; and Department of Human Molecular Genetics (G.A.R.), Institute of Human Genetics, Heidelberg University Hospital, 69120 Heidelberg, Germany
- Tsutomu Ogata
- Tumour Virology Division F010 (A.M.), German Cancer Research Center, 69120 Heidelberg, Germany; Department of Oncology (A.M.), Luxembourg Institute of Health 84, rue Val Fleuri L-1526, Luxembourg; Department of Pediatrics (T.O.), Hamamatsu University School of Medicine, Higashi-ku, Hamamatsu 431-3192, Japan; and Department of Human Molecular Genetics (G.A.R.), Institute of Human Genetics, Heidelberg University Hospital, 69120 Heidelberg, Germany
- Gudrun A Rappold
- Tumour Virology Division F010 (A.M.), German Cancer Research Center, 69120 Heidelberg, Germany; Department of Oncology (A.M.), Luxembourg Institute of Health 84, rue Val Fleuri L-1526, Luxembourg; Department of Pediatrics (T.O.), Hamamatsu University School of Medicine, Higashi-ku, Hamamatsu 431-3192, Japan; and Department of Human Molecular Genetics (G.A.R.), Institute of Human Genetics, Heidelberg University Hospital, 69120 Heidelberg, Germany
|
20
|
Khurana E, Fu Y, Chakravarty D, Demichelis F, Rubin MA, Gerstein M. Role of non-coding sequence variants in cancer. Nat Rev Genet 2016; 17:93-108. [PMID: 26781813 DOI: 10.1038/nrg.2015.17] [Citation(s) in RCA: 319] [Impact Index Per Article: 39.9] [Indexed: 02/07/2023]
Abstract
Patients with cancer carry somatic sequence variants in their tumour in addition to the germline variants in their inherited genome. Although variants in protein-coding regions have received the most attention, numerous studies have noted the importance of non-coding variants in cancer. Moreover, the overwhelming majority of variants, both somatic and germline, occur in non-coding portions of the genome. We review the current understanding of non-coding variants in cancer, including the great diversity of the mutation types (from single nucleotide variants to large genomic rearrangements) and the wide range of mechanisms by which they affect gene expression to promote tumorigenesis, such as disrupting transcription factor-binding sites or functions of non-coding RNAs. We highlight specific case studies of somatic and germline variants, and discuss how non-coding variants can be interpreted on a large-scale through computational and experimental methods.
Affiliation(s)
- Ekta Khurana
- Meyer Cancer Center, Weill Cornell Medical College, New York, New York 10065, USA; Institute for Precision Medicine, Weill Cornell Medical College, New York, New York 10065, USA; Institute for Computational Biomedicine, Weill Cornell Medical College, New York, New York 10021, USA; Department of Physiology and Biophysics, Weill Cornell Medical College, New York, New York 10065, USA
- Yao Fu
- Bina Technologies, Roche Sequencing, Redwood City, California 94065, USA
- Dimple Chakravarty
- Institute for Precision Medicine, Weill Cornell Medical College, New York, New York 10065, USA; Department of Pathology and Laboratory Medicine, Weill Cornell Medical College, New York, New York 10065, USA
- Francesca Demichelis
- Institute for Precision Medicine, Weill Cornell Medical College, New York, New York 10065, USA; Institute for Computational Biomedicine, Weill Cornell Medical College, New York, New York 10021, USA; Centre for Integrative Biology, University of Trento, 38123 Trento, Italy
- Mark A Rubin
- Meyer Cancer Center, Weill Cornell Medical College, New York, New York 10065, USA; Institute for Precision Medicine, Weill Cornell Medical College, New York, New York 10065, USA; Department of Pathology and Laboratory Medicine, Weill Cornell Medical College, New York, New York 10065, USA
- Mark Gerstein
- Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut 06520, USA; Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut 06520, USA; Department of Computer Science, Yale University, New Haven, Connecticut 06520, USA
|
21
|
Rastegar S, Strähle U. The Zebrafish as Model for Deciphering the Regulatory Architecture of Vertebrate Genomes. Genetics, Genomics and Fish Phenomics 2016; 95:195-216. [DOI: 10.1016/bs.adgen.2016.04.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/12/2022]
|
22
|
Handling Permutation in Sequence Comparison: Genome-Wide Enhancer Prediction in Vertebrates by a Novel Non-Linear Alignment Scoring Principle. PLoS One 2015; 10:e0141487. [PMID: 26505748 PMCID: PMC4624239 DOI: 10.1371/journal.pone.0141487] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 07/13/2015] [Accepted: 10/08/2015] [Indexed: 01/01/2023]
Abstract
Enhancers have been described to evolve by permutation without changing function. This has posed the problem of how to predict enhancer elements that are hidden from alignment-based approaches due to the loss of co-linearity. Alignment-free algorithms have been proposed as one possible solution. However, this approach is hampered by several problems inherent to its underlying working principle. Here we present a new approach, which combines the power of alignment and alignment-free techniques into one algorithm. It allows the prediction of enhancers based on the query and target sequence only, no matter whether the regulatory logic is co-linear or reshuffled. To test our novel approach, we employ it for the prediction of enhancers across the evolutionary distance of ~450 Myr between human and medaka. We demonstrate its efficacy by subsequent in vivo validation, resulting in 82% (9/11) of the predicted medaka regions showing reporter activity. These include five candidates with partially co-linear and four with reshuffled motif patterns. Orthology in flanking genes and conservation of the detected co-linear motifs indicate that those candidates are likely functionally equivalent enhancers. In sum, our results demonstrate that the proposed principle successfully predicts mutated as well as permuted enhancer regions at an encouragingly high rate.
|
23
|
The hierarchical organization of natural protein interaction networks confers self-organization properties on pseudocells. BMC SYSTEMS BIOLOGY 2015; 9 Suppl 3:S3. [PMID: 26050708 PMCID: PMC4464023 DOI: 10.1186/1752-0509-9-s3-s3] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 11/10/2022]
Abstract
Background: Cell organization is governed and maintained via specific interactions among its constituent macromolecules. Comparison of the experimentally determined protein interaction networks in different model organisms has revealed little conservation of the specific edges linking ortholog proteins. Nevertheless, some topological characteristics of the graphs representing the networks, namely non-random degree distribution and high clustering coefficient, are shared by networks of distantly related organisms. Here we investigate the role of the topological features of the protein interaction network in promoting cell organization. Methods: We used a stochastic model, dubbed ProtNet, representing a computer-stylized cell, to answer questions about the dynamic consequences of the topological properties of the static graphs representing protein interaction networks. Results: Using a novel metric of cell organization, we show that natural networks, unlike random networks, can promote cell self-organization. Furthermore, the ensemble of protein complexes that forms in pseudocells, which self-organize according to the interaction rules of natural networks, is more robust to perturbations. Conclusions: The analysis of the dynamic properties of networks with a variety of topological characteristics leads us to conclude that self-organization is a consequence of the high clustering coefficient, whereas the scale-free degree distribution has little influence on this property.
|
24
|
The short-chain fatty acid receptor GPR43 is transcriptionally regulated by XBP1 in human monocytes. Sci Rep 2015; 5:8134. [PMID: 25633224 PMCID: PMC4311239 DOI: 10.1038/srep08134] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.0] [Received: 10/23/2014] [Accepted: 01/08/2015] [Indexed: 02/06/2023] Open
Abstract
G-protein coupled receptor 43 (GPR43) recognizes short chain fatty acids and is implicated in obesity, colitis, asthma and arthritis. Here, we present the first full characterization of the GPR43 promoter and 5′-UTR. 5′-RACE of the GPR43 transcript identified the transcription start site (TSS) and a 124 bp 5′-UTR followed by a 1335 bp intron upstream of the ATG start codon. The sequence spanning −4560 to +68 bp relative to the GPR43 TSS was found to contain strong promoter activity, increasing luciferase reporter expression by >100-fold in U937 monocytes. Stepwise deletions further narrowed the putative GPR43 promoter (−451 to +68). Site-directed mutagenesis identified XBP1 as a core cis element, the mutation of which abrogated transcriptional activity. Mutations of predicted CREB, CHOP, NFAT and STAT5 binding sites partially reduced promoter activity. ChIP assays confirmed the binding of XBP1 to the endogenous GPR43 promoter. Consistently, GPR43 expression is reduced in monocytes upon siRNA-knockdown of XBP1, while A549 cells overexpressing XBP1 displayed elevated GPR43 levels. Based on its ability to activate XBP1, we predicted and confirmed that TNFα induces GPR43 expression in human monocytes. Altogether, our findings form the basis for strategic modulation of GPR43 expression, with a view to regulating GPR43-associated diseases.
|
25
|
Cheng Y, Ma Z, Kim BH, Wu W, Cayting P, Boyle AP, Sundaram V, Xing X, Dogan N, Li J, Euskirchen G, Lin S, Lin Y, Visel A, Kawli T, Yang X, Patacsil D, Keller CA, Giardine B, Kundaje A, Wang T, Pennacchio LA, Weng Z, Hardison RC, Snyder MP. Principles of regulatory information conservation between mouse and human. Nature 2015; 515:371-375. [PMID: 25409826 PMCID: PMC4343047 DOI: 10.1038/nature13985] [Citation(s) in RCA: 189] [Impact Index Per Article: 21.0] [Received: 02/05/2014] [Accepted: 10/21/2014] [Indexed: 11/09/2022]
Abstract
To broaden our understanding of the evolution of gene regulation mechanisms, we generated occupancy profiles for 34 orthologous transcription factors (TFs) in human-mouse erythroid progenitor, lymphoblast and embryonic stem-cell lines. By combining the genome-wide transcription factor occupancy repertoires, associated epigenetic signals, and co-association patterns, here we deduce several evolutionary principles of gene regulatory features operating since the mouse and human lineages diverged. The genomic distribution profiles, primary binding motifs, chromatin states, and DNA methylation preferences are well conserved for TF-occupied sequences. However, the extent to which orthologous DNA segments are bound by orthologous TFs varies both among TFs and with genomic location: binding at promoters is more highly conserved than binding at distal elements. Notably, occupancy-conserved TF-occupied sequences tend to be pleiotropic; they function in several tissues and also co-associate with many TFs. Single nucleotide variants at sites with potential regulatory functions are enriched in occupancy-conserved TF-occupied sequences.
Affiliation(s)
- Yong Cheng
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
| | - Zhihai Ma
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
| | - Bong-Hyun Kim
- Program in Bioinformatics and Integrative Biology, Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, MA 01605, USA
| | - Weisheng Wu
- Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
| | - Philip Cayting
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
| | - Alan P Boyle
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
| | - Vasavi Sundaram
- Department of Genetics, Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO 63108, USA
| | - Xiaoyun Xing
- Department of Genetics, Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO 63108, USA
| | - Nergiz Dogan
- Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
| | - Jingjing Li
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
| | - Ghia Euskirchen
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
| | - Shin Lin
- Department of Genetics, Stanford University, Stanford, CA 94305, USA; Division of Cardiovascular Medicine, Stanford University, Stanford, CA 94304, USA
| | - Yiing Lin
- Department of Genetics, Stanford University, Stanford, CA 94305, USA; Department of Surgery, Washington University School of Medicine, St. Louis, MO 63110, USA
| | - Axel Visel
- Lawrence Berkeley National Laboratory, Genomics Division, Berkeley, CA 94701, USA; Department of Energy Joint Genome Institute, Walnut Creek, CA 94598, USA; School of Natural Sciences, University of California, Merced, CA 95343, USA
| | - Trupti Kawli
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
| | - Xinqiong Yang
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
| | - Dorrelyn Patacsil
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
| | - Cheryl A Keller
- Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
| | - Belinda Giardine
- Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
| | - Anshul Kundaje
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
| | - Ting Wang
- Department of Genetics, Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO 63108, USA
| | - Len A Pennacchio
- Lawrence Berkeley National Laboratory, Genomics Division, Berkeley, CA 94701, USA; Department of Energy Joint Genome Institute, Walnut Creek, CA 94598, USA
| | - Zhiping Weng
- Program in Bioinformatics and Integrative Biology, Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, MA 01605, USA
| | - Ross C Hardison
- Center for Comparative Genomics and Bioinformatics, Huck Institutes of the Life Sciences, Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
| | - Michael P Snyder
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
|
26
|
A novel pairwise comparison method for in silico discovery of statistically significant cis-regulatory elements in eukaryotic promoter regions: application to Arabidopsis. J Theor Biol 2014; 364:364-76. [PMID: 25303887 DOI: 10.1016/j.jtbi.2014.09.038] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 05/18/2014] [Revised: 09/27/2014] [Accepted: 09/29/2014] [Indexed: 11/22/2022]
Abstract
Cis regulatory elements (CREs), located within promoter regions, play a significant role in the blueprint for transcriptional regulation of genes. There is growing interest in studying the combinatorial nature of CREs, including the presence or absence of CREs, the number of occurrences of each CRE, and their order and location relative to their target genes. Comparative promoter analysis has been shown to be a reliable strategy to test the significance of each component of promoter architecture. However, it remains unclear what level of difference in the number of occurrences of each CRE is statistically significant for explaining the different expression patterns of two genes. In this study, we present a novel statistical approach for pairwise comparison of promoters of Arabidopsis genes in terms of the number of occurrences of each CRE within the promoters. First, using a sample of 1000 Arabidopsis promoters, a goodness-of-fit test and non-parametric analysis revealed that the number of occurrences of CREs in a promoter sequence is Poisson distributed. As a promoter sequence contains both functional and non-functional CREs, we addressed the statistical distribution of functional CREs by analyzing ChIP-seq datasets. The results showed that the number of occurrences of functional CREs over the genomic regions is also Poisson distributed. In accordance with the obtained distribution of CRE occurrences, we suggest the Audic and Claverie (AC) test to compare two promoters based on the number of occurrences of the CREs. Superiority of the AC test over the Chi-square (2×2) and Fisher's exact tests was also shown, as the AC test was able to detect a higher number of significant CREs. Two case studies on Arabidopsis genes were performed in order to biologically verify the pairwise test for promoter comparison. Consequently, a number of CREs with significantly different occurrences were identified between the promoters. The results of the pairwise comparative analysis, together with the expression data for the studied genes, revealed the biological significance of the identified CREs.
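As a rough illustration of the comparison this abstract describes, the sketch below implements the Audic and Claverie conditional probability for two Poisson-distributed counts, p(y | x) = (N2/N1)^y (x+y)! / (x! y! (1 + N2/N1)^(x+y+1)). It is not the authors' code. The function names, the default of equal promoter lengths (N1 = N2), and the doubled-tail convention used for the two-sided p-value are all assumptions made here for illustration.

```python
import math


def ac_prob(y, x, n1=1.0, n2=1.0):
    """Audic-Claverie probability of seeing y occurrences in promoter 2,
    given x occurrences in promoter 1. n1 and n2 are the sizes of the two
    samples (assumed here to be promoter lengths; hypothetical parameters)."""
    r = n2 / n1
    # Work in log space so the factorials do not overflow for large counts.
    log_p = (y * math.log(r)
             + math.lgamma(x + y + 1) - math.lgamma(x + 1) - math.lgamma(y + 1)
             - (x + y + 1) * math.log(1.0 + r))
    return math.exp(log_p)


def ac_pvalue(x, y, n1=1.0, n2=1.0):
    """Two-sided p-value for counts x and y, taken as twice the smaller
    tail of p(. | x) and capped at 1 (an assumed convention)."""
    lower = sum(ac_prob(k, x, n1, n2) for k in range(y + 1))    # P(Y <= y)
    upper = 1.0 - sum(ac_prob(k, x, n1, n2) for k in range(y))  # P(Y >= y)
    return min(1.0, 2.0 * min(lower, upper))


if __name__ == "__main__":
    # Hypothetical counts: a CRE found 2 times in one promoter, 20 in the other.
    print(ac_pvalue(2, 20))
```

Under these assumptions, equal counts in equally sized promoters give a p-value near 1, and strongly unequal counts give a small one.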
|
27
|
Kurihara M, Shiraishi A, Satake H, Kimura AP. A conserved noncoding sequence can function as a spermatocyte-specific enhancer and a bidirectional promoter for a ubiquitously expressed gene and a testis-specific long noncoding RNA. J Mol Biol 2014; 426:3069-93. [PMID: 25020229 DOI: 10.1016/j.jmb.2014.06.018] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.1] [Received: 01/21/2014] [Revised: 06/26/2014] [Accepted: 06/27/2014] [Indexed: 12/13/2022]
Abstract
Tissue-specific gene expression is tightly regulated by various elements such as promoters, enhancers, and long noncoding RNAs (lncRNAs). In the present study, we identified a conserved noncoding sequence (CNS1) as a novel enhancer for the spermatocyte-specific mouse testicular cell adhesion molecule 1 (Tcam1) gene. CNS1 was located 3.4kb upstream of the Tcam1 gene and associated with histone H3K4 mono-methylation in testicular germ cells. By the in vitro reporter gene assay, CNS1 could enhance Tcam1 promoter activity only in GC-2spd(ts) cells, which were derived from mouse spermatocytes. When we integrated the 6.9-kb 5'-flanking sequence of Tcam1 with or without a deletion of CNS1 linked to the enhanced green fluorescent protein gene into the chromatin of GC-2spd(ts) cells, CNS1 significantly enhanced Tcam1 promoter activity. These results indicate that CNS1 could function as a spermatocyte-specific enhancer. Interestingly, CNS1 also showed high bidirectional promoter activity in the reporter assay, and consistent with this, the Smarcd2 gene and lncRNA, designated lncRNA-Tcam1, were transcribed from adjacent regions of CNS1. While Smarcd2 was ubiquitously expressed, lncRNA-Tcam1 expression was restricted to testicular germ cells, although this lncRNA did not participate in Tcam1 activation. Ubiquitous Smarcd2 expression was correlated to CpG hypo-methylation of CNS1 and partially controlled by Sp1. However, for lncRNA-Tcam1 transcription, the strong association with histone acetylation and histone H3K4 tri-methylation also appeared to be required. The present data suggest that CNS1 is a spermatocyte-specific enhancer for the Tcam1 gene and a bidirectional promoter of Smarcd2 and lncRNA-Tcam1.
Affiliation(s)
- Misuzu Kurihara
- Graduate School of Life Science, Hokkaido University, Sapporo 060-0810, Japan
| | - Akira Shiraishi
- Suntory Foundation for Life Sciences, Bioorganic Research Institute, Osaka 618-8503, Japan
| | - Honoo Satake
- Suntory Foundation for Life Sciences, Bioorganic Research Institute, Osaka 618-8503, Japan
| | - Atsushi P Kimura
- Graduate School of Life Science, Hokkaido University, Sapporo 060-0810, Japan; Department of Biological Sciences, Faculty of Science, Hokkaido University, Sapporo 060-0810, Japan.
|
28
|
Downs GS, Liseron-Monfils C, Lukens LN. Regulatory motifs identified from a maize developmental coexpression network. Genome 2014; 57:181-4. [DOI: 10.1139/gen-2013-0177] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 02/05/2023]
Abstract
Transcriptional control is an important determinant of plant development, and distinct modules of coordinated genes characterize the maize developmental transcriptome. Upstream regulatory sequences are often the primary factors that control gene expression pattern and abundance. Here, we identify 244 regulatory motifs that are significantly enriched within 24 gene expression modules previously constructed from transcript abundances of 34 876 Zea mays (maize) gene models from embryogenesis to senescence. Within modules, we identify motifs that have not been characterized. In addition, we identify motifs similar to experimentally verified motifs, and the functions of these motifs overlap with predicted module functions. This work demonstrates the power of transcript-level coexpression modules to identify both variants of known regulatory motifs and novel motifs that control a species’ developmental transcriptome.
Affiliation(s)
- Gregory S. Downs
- Department of Plant Agriculture, University of Guelph, Guelph, ON N1G 2W1, Canada
| | - Christophe Liseron-Monfils
- Department of Plant Agriculture, University of Guelph, Guelph, ON N1G 2W1, Canada
- Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY 11724, USA
| | - Lewis N. Lukens
- Department of Plant Agriculture, University of Guelph, Guelph, ON N1G 2W1, Canada
|
29
|
Nord AS, Blow MJ, Attanasio C, Akiyama JA, Holt A, Hosseini R, Phouanenavong S, Plajzer-Frick I, Shoukry M, Afzal V, Rubenstein JLR, Rubin EM, Pennacchio LA, Visel A. Rapid and pervasive changes in genome-wide enhancer usage during mammalian development. Cell 2014; 155:1521-31. [PMID: 24360275 DOI: 10.1016/j.cell.2013.11.033] [Citation(s) in RCA: 260] [Impact Index Per Article: 26.0] [Received: 10/01/2013] [Revised: 10/28/2013] [Accepted: 11/22/2013] [Indexed: 12/26/2022]
Abstract
Enhancers are distal regulatory elements that can activate tissue-specific gene expression and are abundant throughout mammalian genomes. Although substantial progress has been made toward genome-wide annotation of mammalian enhancers, their temporal activity patterns and global contributions in the context of developmental in vivo processes remain poorly explored. Here we used epigenomic profiling for H3K27ac, a mark of active enhancers, coupled to transgenic mouse assays to examine the genome-wide utilization of enhancers in three different mouse tissues across seven developmental stages. The majority of the ∼90,000 enhancers identified exhibited tightly temporally restricted predicted activity windows and were associated with stage-specific biological functions and regulatory pathways in individual tissues. Comparative genomic analysis revealed that evolutionary conservation of enhancers decreases following midgestation across all tissues examined. The dynamic enhancer activities uncovered in this study illuminate rapid and pervasive temporal in vivo changes in enhancer usage that underlie processes central to development and disease.
Affiliation(s)
- Alex S Nord
- Genomics Division, MS 84-171, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Matthew J Blow
- Genomics Division, MS 84-171, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA; U.S. Department of Energy Joint Genome Institute, Walnut Creek, CA 94598, USA
| | - Catia Attanasio
- Genomics Division, MS 84-171, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Jennifer A Akiyama
- Genomics Division, MS 84-171, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Amy Holt
- Genomics Division, MS 84-171, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Roya Hosseini
- Genomics Division, MS 84-171, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Sengthavy Phouanenavong
- Genomics Division, MS 84-171, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Ingrid Plajzer-Frick
- Genomics Division, MS 84-171, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Malak Shoukry
- Genomics Division, MS 84-171, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Veena Afzal
- Genomics Division, MS 84-171, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - John L R Rubenstein
- Department of Psychiatry, Rock Hall, University of California, San Francisco, CA 94158-2324, USA
| | - Edward M Rubin
- Genomics Division, MS 84-171, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA; U.S. Department of Energy Joint Genome Institute, Walnut Creek, CA 94598, USA
| | - Len A Pennacchio
- Genomics Division, MS 84-171, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA; U.S. Department of Energy Joint Genome Institute, Walnut Creek, CA 94598, USA.
| | - Axel Visel
- Genomics Division, MS 84-171, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA; U.S. Department of Energy Joint Genome Institute, Walnut Creek, CA 94598, USA; School of Natural Sciences, University of California, Merced, CA 95343, USA.
|
30
|
Matsubara S, Kurihara M, Kimura AP. A long non-coding RNA transcribed from conserved non-coding sequences contributes to the mouse prolyl oligopeptidase gene activation. J Biochem 2013; 155:243-56. [PMID: 24369296 DOI: 10.1093/jb/mvt113] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Indexed: 01/22/2023] Open
Abstract
Prolyl oligopeptidase (POP) is a multifunctional protease which is involved in many physiological events, but its gene regulatory mechanism is poorly understood. To identify novel regulatory elements of the POP gene, we compared the genomic sequences at the mouse and human POP loci and found six conserved non-coding sequences (CNSs) at adjacent intergenic regions. From these CNSs, four long non-coding RNAs (lncRNAs) were transcribed and the expression pattern of one (lncPrep+96kb) was correlated with that of POP. lncPrep+96kb was transcribed as two forms due to the different transcriptional start sites and was localized at the nucleus and cytoplasm, although more was present at the nucleus. When we knocked down lncPrep+96kb in the primary ovarian granulosa cell and a hepatic cell line, the POP expression was decreased in both cells. In contrast, overexpression of lncPrep+96kb increased the POP expression only in the granulosa cell. Because lncPrep+96kb was upregulated with the same timing as POP in the hormone-treated ovary, this lncRNA could play a role in the POP gene activation in the granulosa cell. Moreover, a downstream region of the human POP gene was also transcribed. We propose a novel mechanism for the POP gene activation.
Affiliation(s)
- Shin Matsubara
- Graduate School of Life Science and Department of Biological Sciences, Faculty of Science, Hokkaido University, Sapporo 060-0810, Japan
|
31
|
Ferg M, Armant O, Yang L, Dickmeis T, Rastegar S, Strähle U. Gene transcription in the zebrafish embryo: regulators and networks. Brief Funct Genomics 2013; 13:131-43. [PMID: 24152666 DOI: 10.1093/bfgp/elt044] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Indexed: 01/08/2023] Open
Abstract
The precise spatial and temporal control of gene expression is a key process in the development, maintenance and regeneration of the vertebrate body. A substantial proportion of vertebrate genomes encode genes that control the transcription of the genetic information into mRNA. The zebrafish is particularly well suited to investigate gene regulatory networks underlying the control of gene expression during development due to the external development of its transparent embryos and the increasingly sophisticated tools for genetic manipulation available for this model system. We review here recent data on the analysis of cis-regulatory modules, transcriptional regulators and their integration into gene regulatory networks in the zebrafish, using the developing spinal cord as example.
Affiliation(s)
- Marco Ferg
- Institute of Toxicology and Genetics, Karlsruhe Institute of Technology (KIT), Postfach 3640, 76021 Karlsruhe, Germany.
|
32
|
Kumar A, Kamaraj B, Sethumadhavan R, Purohit R. Evolution driven structural changes in CENP-E motor domain. Interdiscip Sci 2013; 5:102-11. [DOI: 10.1007/s12539-013-0137-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 08/14/2012] [Revised: 10/19/2012] [Accepted: 10/29/2012] [Indexed: 12/13/2022]
|
33
|
Hamilton NA, Tammen I, Raadsma HW. Multi-species comparative analysis of the equine ACE gene identifies a highly conserved potential transcription factor binding site in intron 16. PLoS One 2013; 8:e55434. [PMID: 23408978 PMCID: PMC3568152 DOI: 10.1371/journal.pone.0055434] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 07/06/2012] [Accepted: 12/23/2012] [Indexed: 11/18/2022] Open
Abstract
Angiotensin converting enzyme (ACE) is essential for control of blood pressure. The human ACE gene contains an intronic Alu indel (I/D) polymorphism that has been associated with variation in serum enzyme levels, although the functional mechanism has not been identified. The polymorphism has also been associated with cardiovascular disease, type II diabetes, renal disease and elite athleticism. We have characterized the ACE gene in horses of breeds selected for differing physical abilities. The equine gene has a similar structure to that of all known mammalian ACE genes. Nine common single nucleotide polymorphisms (SNPs) discovered in pooled DNA were found to be inherited in nine haplotypes. Three of these SNPs were located in intron 16, homologous to that containing the Alu polymorphism in the human. A highly conserved 18 bp sequence, also within that intron, was identified as being a potential binding site for the transcription factors Oct-1, HFH-1 and HNF-3β, and lies within a larger area of higher than normal homology. This putative regulatory element may contribute to regulation of the documented inter-individual variation in human circulating enzyme levels, for which a functional mechanism is yet to be defined. Two equine SNPs occurred within the conserved area in intron 16, although neither of them disrupted the putative binding site. We propose a possible regulatory mechanism of the ACE gene in mammalian species which was previously unknown. This advance will allow further analysis leading to a better understanding of the mechanisms underpinning the associations seen between the human Alu polymorphism and enzyme levels, cardiovascular disease states and elite athleticism.
Affiliation(s)
- Natasha A Hamilton
- ReproGen-Animal Bioscience Group, Faculty of Veterinary Science, University of Sydney, Camperdown, New South Wales, Australia.
|
34
|
Su MY, Steiner LA, Bogardus H, Mishra T, Schulz VP, Hardison RC, Gallagher PG. Identification of biologically relevant enhancers in human erythroid cells. J Biol Chem 2013; 288:8433-8444. [PMID: 23341446 DOI: 10.1074/jbc.m112.413260] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.9] [Indexed: 01/15/2023] Open
Abstract
Identification of cell type-specific enhancers is important for understanding the regulation of programs controlling cellular development and differentiation. Enhancers are typically marked by the co-transcriptional activator protein p300 or by groups of cell-expressed transcription factors. We hypothesized that a unique set of enhancers regulates gene expression in human erythroid cells, a highly specialized cell type evolved to provide adequate amounts of oxygen throughout the body. Using chromatin immunoprecipitation followed by massively parallel sequencing, genome-wide maps of candidate enhancers were constructed for p300 and four transcription factors, GATA1, NF-E2, KLF1, and SCL, using primary human erythroid cells. These data were combined with gene expression analyses, and candidate enhancers were identified. Consistent with their predicted function as candidate enhancers, there was statistically significant enrichment of p300 and combinations of co-localizing erythroid transcription factors within 1-50 kb of the transcriptional start site (TSS) of genes highly expressed in erythroid cells. Candidate enhancers were also enriched near genes with known erythroid cell function or phenotype. Candidate enhancers exhibited moderate conservation with mouse and minimal conservation with nonplacental vertebrates. Candidate enhancers were mapped to a set of erythroid-associated, biologically relevant, SNPs from the genome-wide association studies (GWAS) catalogue of NHGRI, National Institutes of Health. Fourteen candidate enhancers, representing 10 genetic loci, mapped to sites associated with biologically relevant erythroid traits. Fragments from these loci directed statistically significant expression in reporter gene assays. Identification of enhancers in human erythroid cells will allow a better understanding of erythroid cell development, differentiation, structure, and function and provide insights into inherited and acquired hematologic disease.
Affiliation(s)
- Mack Y Su
- Department of Pediatrics, Yale University School of Medicine, New Haven, Connecticut 06520
| | - Laurie A Steiner
- Department of Pediatrics, University of Rochester, Rochester, New York 14642
| | - Hannah Bogardus
- Department of Pediatrics, Yale University School of Medicine, New Haven, Connecticut 06520
| | - Tejaswini Mishra
- Department of Biochemistry and Molecular Biology, Center for Comparative Genomics and Bioinformatics, Pennsylvania State University, University Park, Pennsylvania 16802
| | - Vincent P Schulz
- Department of Pediatrics, Yale University School of Medicine, New Haven, Connecticut 06520
| | - Ross C Hardison
- Department of Biochemistry and Molecular Biology, Center for Comparative Genomics and Bioinformatics, Pennsylvania State University, University Park, Pennsylvania 16802
| | - Patrick G Gallagher
- Department of Pediatrics, Yale University School of Medicine, New Haven, Connecticut 06520; Departments of Pathology and Genetics, Yale University School of Medicine, New Haven, Connecticut 06520.
|
35
|
Li Z, Li Y, Zhang W. Ghrelin receptor in energy homeostasis and obesity pathogenesis. PROGRESS IN MOLECULAR BIOLOGY AND TRANSLATIONAL SCIENCE 2013; 114:45-87. [PMID: 23317782 DOI: 10.1016/b978-0-12-386933-3.00002-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 12/11/2022]
Abstract
The ghrelin receptor, also known as growth hormone secretagogue receptor (GHS-R), was identified in porcine and rat anterior pituitary membranes, where the synthetic secretagogue MK-0677 causes amplified pulsatile growth hormone (GH) release. In addition to its function in the stimulation of GH secretion, ghrelin, the natural ligand of the ghrelin receptor, is now recognized as a peptide hormone with a fundamental influence on energy homeostasis. Despite the potential existence of multiple subtypes of ghrelin receptor, the effects of ghrelin on energy metabolism, obesity, and diabetes are mediated by its classical receptor GHS-R1a, whose activation requires the n-octanoylation of ghrelin. Here we review the current understanding of the role of the ghrelin receptor in the regulation of energy homeostasis. An overview of the ghrelin receptor is presented first, followed by a discussion of its effects on food intake, glucose homeostasis, and lipid metabolism. Finally, potential strategies for treating obesity and diabetes via manipulation of the ghrelin/ghrelin receptor axis are explored.
Affiliation(s)
- Ziru Li: Department of Physiology and Pathophysiology, School of Basic Medical Sciences, Peking University, Beijing, China
|
36
|
Functional analysis of HapMap SNPs. Gene 2012; 511:358-63. [PMID: 23041558 DOI: 10.1016/j.gene.2012.09.075] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 05/09/2012] [Revised: 08/07/2012] [Accepted: 09/13/2012] [Indexed: 11/20/2022]
Abstract
Genome-wide association studies (GWAS) have successfully identified many genetic variants associated with complex diseases and traits. However, the functional consequences of the genetic variants studied in GWAS have not yet been fully investigated, which hinders the application of GWAS results. We therefore performed a systematic functional analysis of HapMap SNPs, which have been most commonly used as the reference panel for GWAS. Our study highlights several characteristics of HapMap SNPs and identifies subsets of genetic variants with interesting functional implications. The results show that HapMap SNPs have good coverage within RefSeq genes, especially within known disease-related genes. On the other hand, only a small percentage of SNPs are non-synonymous, while many SNPs are located in gene deserts. Moreover, many functionally important variants have not yet been interrogated. A redesigned SNP reference panel with additional functionally important variants would be useful for identifying disease-causal variants in future genome-wide studies.
|
37
|
Rosandić M, Glunčić M, Paar V. Start/stop codon like trinucleotides extensions in primate alpha satellites. J Theor Biol 2012; 317:301-9. [PMID: 23026763 DOI: 10.1016/j.jtbi.2012.09.022] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 01/03/2012] [Revised: 09/07/2012] [Accepted: 09/19/2012] [Indexed: 11/28/2022]
Abstract
The centromeres remain "the final frontier" among unexplored segments of the genome landscape in primate genomes, characterized by 2-5 Mb arrays of rapidly evolving alpha satellite (AS) higher order repeats (HORs). Alpha satellites, as specific noncoding sequences, may also be significant in light of the regulatory role of noncoding sequences. Using the Global Repeat Map (GRM) algorithm, we identify in NCBI assemblies of chromosome 5 the species-specific alpha satellite HORs: 13mer in human, 5mer in chimpanzee, 14mer in orangutan and 3mers in macaque. The suprachromosomal family (SF) classification of alpha satellite HORs and surrounding monomeric alpha satellites is performed, and a specific segmental structure is found for major alpha satellite arrays in chromosome 5 of primates. In the framework of our novel concept of start/stop Codon Like Trinucleotides (CLTs) as a "new DNA language in noncoding sequences", we find characteristics and differences of these species in CLT extensions, in particular the extensions of stop-TGA CLT. We hypothesize that these are regulators in noncoding sequences, acting at a distance, and that they can amplify or weaken the activity of start/stop codons in coding sequences in protein genesis, increasing the richness of regulatory phenomena.
Affiliation(s)
- Marija Rosandić: Faculty of Science, University of Zagreb, 10000 Zagreb, Croatia
|
38
|
Glunčić M, Paar V. Direct mapping of symbolic DNA sequence into frequency domain in global repeat map algorithm. Nucleic Acids Res 2012; 41:e17. [PMID: 22977183 PMCID: PMC3592446 DOI: 10.1093/nar/gks721] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Indexed: 01/22/2023]
Abstract
The main feature of the global repeat map (GRM) algorithm (www.hazu.hr/grm/software/win/grm2012.exe) is its ability to identify a broad variety of repeats of unbounded length that can be arbitrarily distant in sequences as large as human chromosomes. The efficacy is due to the use of a complete K-string ensemble, which enables a new method of direct mapping of a symbolic DNA sequence into the frequency domain, with straightforward identification of repeats as peaks in the GRM diagram. In this way, we obtain a very fast, efficient and highly automated repeat-finding tool. The method is robust to substitutions and insertions/deletions, as well as to various complexities of the sequence pattern. We present several case studies of GRM use to illustrate its capabilities: identification of α-satellite tandem repeats and higher order repeats (HORs), identification of Alu dispersed repeats and of Alu tandems, identification of the Period 3 pattern in exons, implementation of the ‘magnifying glass’ effect, identification of complex HOR patterns, identification of inter-tandem transitional dispersed repeat sequences and identification of long segmental duplications. The GRM algorithm is convenient for use, in particular, in cases of large repeat units, of highly mutated and/or complex repeats, and of global repeat maps for large genomic sequences (chromosomes and genomes).
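The idea of reading repeat periods off a K-string-based spectrum can be illustrated with a much simpler stand-in. The sketch below is not the GRM algorithm: it assumes a single fixed k and only histograms the distance between consecutive occurrences of each k-mer, so a tandem repeat shows up as a peak at its period. The function names `kmer_distance_spectrum` and `dominant_period` are hypothetical.

```python
from collections import Counter


def kmer_distance_spectrum(seq, k):
    """Histogram of distances between consecutive occurrences of each k-mer.

    Simplified illustration only (assumption): the published GRM method uses
    a complete K-string ensemble, which this sketch does not reproduce.
    """
    last_seen = {}          # k-mer -> start position of its latest occurrence
    spectrum = Counter()    # distance -> number of times it was observed
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in last_seen:
            spectrum[i - last_seen[kmer]] += 1
        last_seen[kmer] = i
    return spectrum


def dominant_period(seq, k=3):
    """Return the most frequent repeat distance, or None if no k-mer recurs."""
    spectrum = kmer_distance_spectrum(seq, k)
    if not spectrum:
        return None
    return spectrum.most_common(1)[0][0]


if __name__ == "__main__":
    tandem = "ACGTTGA" * 10   # toy tandem repeat with a 7 bp unit
    print(dominant_period(tandem, k=3))
```

On the toy input, every k-mer recurs exactly one repeat unit later, so the spectrum has a single peak at the unit length. Real sequences with substitutions and indels would give broader, noisier peaks.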
Affiliation(s)
- Matko Glunčić: Faculty of Science, University of Zagreb, Bijenička 32 and Croatian Academy of Sciences and Arts, Zrinski trg 11, 10000 Zagreb, Croatia
|
39
|
SECOM: a novel hash seed and community detection based-approach for genome-scale protein domain identification. PLoS One 2012; 7:e39475. [PMID: 22761802 PMCID: PMC3386278 DOI: 10.1371/journal.pone.0039475] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 03/22/2012] [Accepted: 05/23/2012] [Indexed: 12/22/2022]
Abstract
With rapid advances in the development of DNA sequencing technologies, a plethora of high-throughput genome and proteome data from a diverse spectrum of organisms have been generated. The functional annotation and evolutionary history of proteins are usually inferred from domains predicted from the genome sequences. Traditional database-based domain prediction methods cannot identify novel domains, however, and alignment-based methods, which look for recurring segments in the proteome, are computationally demanding. Here, we propose a novel genome-wide domain prediction method, SECOM. Instead of conducting all-against-all sequence alignment, SECOM first indexes all the proteins in the genome by using a hash seed function. Local similarity can thus be detected and encoded into a graph structure, in which each node represents a protein sequence and each edge weight represents the shared hash seeds between the two nodes. SECOM then formulates the domain prediction problem as an overlapping community-finding problem in this graph. A backward graph percolation algorithm that efficiently identifies the domains is proposed. We tested SECOM on five recently sequenced genomes of aquatic animals. Our tests demonstrated that SECOM was able to identify most of the known domains identified by InterProScan. When compared with the alignment-based method, SECOM showed higher sensitivity in detecting putative novel domains, while it was also three orders of magnitude faster. For example, SECOM was able to predict a novel sponge-specific domain in nucleoside-triphosphatase (NTPases). Furthermore, SECOM discovered two novel domains, likely of bacterial origin, that are taxonomically restricted to sea anemone and hydra. SECOM is an open-source program and available at http://sfb.kaust.edu.sa/Pages/Software.aspx.
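The indexing and graph-construction step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not SECOM's implementation: contiguous k-mers stand in for the hash seed function, the function name `seed_graph` is hypothetical, and the overlapping community-finding (backward graph percolation) step is not shown.

```python
from collections import defaultdict
from itertools import combinations


def seed_graph(proteins, k=3):
    """Build a weighted protein graph from shared seeds.

    proteins: dict mapping protein name -> amino-acid sequence.
    Returns a dict mapping (name_a, name_b) -> number of distinct shared seeds.
    Assumption: plain contiguous k-mers are used as seeds for illustration.
    """
    # Index: seed -> set of proteins that contain it (avoids all-vs-all alignment).
    index = defaultdict(set)
    for name, seq in proteins.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)

    # Each seed shared by two proteins adds 1 to the weight of their edge.
    edges = defaultdict(int)
    for names in index.values():
        for a, b in combinations(sorted(names), 2):
            edges[(a, b)] += 1
    return dict(edges)


if __name__ == "__main__":
    toy = {"p1": "MKVLAA", "p2": "GGKVLA", "p3": "TTTTTT"}
    print(seed_graph(toy))
```

In this toy input, `p1` and `p2` share the seeds `KVL` and `VLA`, so they are joined by an edge of weight 2, while `p3` shares no seed and stays isolated. Domain prediction would then operate on communities in this graph.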
|
40
|
Abstract
Differential gene expression is the fundamental mechanism underlying animal development and cell differentiation. However, it is a challenge to identify comprehensively and accurately the DNA sequences that are required to regulate gene expression: namely, cis-regulatory modules (CRMs). Three major features, either singly or in combination, are used to predict CRMs: clusters of transcription factor binding site motifs, non-coding DNA that is under evolutionary constraint and biochemical marks associated with CRMs, such as histone modifications and protein occupancy. The validation rates for predictions indicate that identifying diagnostic biochemical marks is the most reliable method, and understanding is enhanced by the analysis of motifs and conservation patterns within those predicted CRMs.
|
41
|
Regulation of pancreatic microRNA-7 expression. Experimental Diabetes Research 2012; 2012:695214. [PMID: 22675342 PMCID: PMC3362837 DOI: 10.1155/2012/695214] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.8] [Received: 01/12/2012] [Accepted: 03/08/2012] [Indexed: 11/18/2022]
Abstract
Genome-encoded microRNAs (miRNAs) provide a posttranscriptional regulatory layer, which is important for pancreas development. Differentiation of endocrine cells is controlled by a network of pancreatic transcription factors including Ngn3 and NeuroD/Beta2. However, how specific miRNAs are intertwined into this transcriptional network is not well understood. Here, we characterize the regulation of microRNA-7 (miR-7) by endocrine-specific transcription factors. Our data reveal that three independent miR-7 genes are coexpressed in the pancreas. We have identified conserved blocks upstream of pre-miR-7a-2 and pre-miR-7b and demonstrated by functional assays that they possess promoter activity, which is increased by the expression of NeuroD/Beta2. These data suggest that the endocrine specificity of miR-7 expression is governed by transcriptional mechanisms and involves members of the pancreatic endocrine network of transcription factors.
|
42
|
Emami Riedmaier A, Nies AT, Schaeffeler E, Schwab M. Organic Anion Transporters and Their Implications in Pharmacotherapy. Pharmacol Rev 2012; 64:421-49. [DOI: 10.1124/pr.111.004614] [Citation(s) in RCA: 91] [Impact Index Per Article: 7.6] [Indexed: 12/11/2022]
|
43
|
An interspecies analysis reveals a key role for unmethylated CpG dinucleotides in vertebrate Polycomb complex recruitment. EMBO J 2011; 31:317-29. [PMID: 22056776 DOI: 10.1038/emboj.2011.399] [Citation(s) in RCA: 161] [Impact Index Per Article: 12.4] [Received: 03/16/2011] [Accepted: 10/13/2011] [Indexed: 01/19/2023]
Abstract
The role of DNA sequence in determining chromatin state is incompletely understood. We have previously demonstrated that large chromosomal segments from human cells recapitulate their native chromatin state in mouse cells, but the relative contribution of local sequences versus their genomic context remains unknown. In this study, we compare orthologous chromosomal regions for which the human locus establishes prominent sites of Polycomb complex recruitment in pluripotent stem cells, whereas the corresponding mouse locus does not. Using recombination-mediated cassette exchange at the mouse locus, we establish the primacy of local sequences in the encoding of chromatin state. We show that the signal for chromatin bivalency is redundantly encoded across a bivalent domain and that this reflects competition between Polycomb complex recruitment and transcriptional activation. Furthermore, our results suggest that a high density of unmethylated CpG dinucleotides is sufficient for vertebrate Polycomb recruitment. This model is supported by analysis of DNA methyltransferase-deficient embryonic stem cells.
|
44
|
Ludwig MZ, Manu, Kittler R, White KP, Kreitman M. Consequences of eukaryotic enhancer architecture for gene expression dynamics, development, and fitness. PLoS Genet 2011; 7:e1002364. [PMID: 22102826 PMCID: PMC3213169 DOI: 10.1371/journal.pgen.1002364] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.1] [Received: 07/11/2011] [Accepted: 09/14/2011] [Indexed: 12/13/2022]
Abstract
The regulatory logic of time- and tissue-specific gene expression has mostly been dissected in the context of the smallest DNA fragments that, when isolated, recapitulate native expression in reporter assays. It is not known if the genomic sequences surrounding such fragments, often evolutionarily conserved, have any biological function or not. Using an enhancer of the even-skipped gene of Drosophila as a model, we investigate the functional significance of the genomic sequences surrounding empirically identified enhancers. A 480 bp long "minimal stripe element" is able to drive even-skipped expression in the second of seven stripes but is embedded in a larger region of 800 bp containing evolutionarily conserved binding sites for required transcription factors. To assess the overall fitness contribution made by these binding sites in the native genomic context, we employed a gene-replacement strategy in which whole-locus transgenes, capable of rescuing even-skipped(-) lethality to adulthood, were substituted for the native gene. The molecular phenotypes were characterized by tagging Even-skipped with a fluorescent protein and monitoring gene expression dynamics in living embryos. We used recombineering to excise the sequences surrounding the minimal enhancer and site-specific transgenesis to create co-isogenic strains differing only in their stripe 2 sequences. Remarkably, the flanking sequences were dispensable for viability, proving the sufficiency of the minimal element for biological function under normal conditions. These sequences are required for robustness to genetic and environmental perturbation instead. The mutant enhancers had measurable sex- and dose-dependent effects on viability. At the molecular level, the mutants showed a destabilization of stripe placement and improper activation of downstream genes. Finally, we demonstrate through live measurements that the peripheral sequences are required for temperature compensation. 
These results imply that seemingly redundant regulatory sequences beyond the minimal enhancer are necessary for robust gene expression and that "robustness" itself must be an evolved characteristic of the wild-type enhancer.
Affiliation(s)
- Michael Z. Ludwig: Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, United States of America; Institute for Genomics and Systems Biology, University of Chicago, Chicago, Illinois, United States of America
- Manu: Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, United States of America
- Ralf Kittler: Institute for Genomics and Systems Biology, University of Chicago, Chicago, Illinois, United States of America; Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
- Kevin P. White: Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, United States of America; Institute for Genomics and Systems Biology, University of Chicago, Chicago, Illinois, United States of America; Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
- Martin Kreitman: Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, United States of America; Institute for Genomics and Systems Biology, University of Chicago, Chicago, Illinois, United States of America
|
45
|
Lyons MR, West AE. Mechanisms of specificity in neuronal activity-regulated gene transcription. Prog Neurobiol 2011; 94:259-95. [PMID: 21620929 PMCID: PMC3134613 DOI: 10.1016/j.pneurobio.2011.05.003] [Citation(s) in RCA: 144] [Impact Index Per Article: 11.1] [Received: 01/19/2011] [Revised: 05/05/2011] [Accepted: 05/05/2011] [Indexed: 02/06/2023]
Abstract
The brain is a highly adaptable organ that is capable of converting sensory information into changes in neuronal function. This plasticity allows behavior to be accommodated to the environment, providing an important evolutionary advantage. Neurons convert environmental stimuli into long-lasting changes in their physiology in part through the synaptic activity-regulated transcription of new gene products. Since the neurotransmitter-dependent regulation of Fos transcription was first discovered nearly 25 years ago, a wealth of studies have enriched our understanding of the molecular pathways that mediate activity-regulated changes in gene transcription. These findings show that a broad range of signaling pathways and transcriptional regulators can be engaged by neuronal activity to sculpt complex programs of stimulus-regulated gene transcription. However, the sheer scope of the transcriptional pathways engaged by neuronal activity raises the question of how specificity in the nature of the transcriptional response is achieved in order to encode physiologically relevant responses to divergent stimuli. Here we summarize the general paradigms by which neuronal activity regulates transcription while focusing on the molecular mechanisms that confer differential stimulus-, cell-type-, and developmental-specificity upon activity-regulated programs of neuronal gene transcription. In addition, we preview some of the new technologies that will advance our future understanding of the mechanisms and consequences of activity-regulated gene transcription in the brain.
Affiliation(s)
- Michelle R Lyons: Department of Neurobiology, Duke University Medical Center, Durham, NC 27710, USA
|
46
|
Barrière A, Gordon KL, Ruvinsky I. Distinct functional constraints partition sequence conservation in a cis-regulatory element. PLoS Genet 2011; 7:e1002095. [PMID: 21655084 PMCID: PMC3107193 DOI: 10.1371/journal.pgen.1002095] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Received: 10/25/2010] [Accepted: 04/07/2011] [Indexed: 11/25/2022]
Abstract
Different functional constraints contribute to different evolutionary rates across genomes. To understand why some sequences evolve faster than others in a single cis-regulatory locus, we investigated function and evolutionary dynamics of the promoter of the Caenorhabditis elegans unc-47 gene. We found that this promoter consists of two distinct domains. The proximal promoter is conserved and is largely sufficient to direct appropriate spatial expression. The distal promoter displays little if any conservation between several closely related nematodes. Despite this divergence, sequences from all species confer robustness of expression, arguing that this function does not require substantial sequence conservation. We showed that even unrelated sequences have the ability to promote robust expression. A prominent feature shared by all of these robustness-promoting sequences is an AT-enriched nucleotide composition consistent with nucleosome depletion. Because general sequence composition can be maintained despite sequence turnover, our results explain how different functional constraints can lead to vastly disparate rates of sequence divergence within a promoter. Comparison between genome sequences of different species is a powerful tool in modern biology because important features are maintained by natural selection and are therefore conserved. However, some important sequences within genomes evolve considerably faster than others. One possible explanation is that they encode little or no function. Alternatively, they may evolve under different constraints that permit sequence turnover while maintaining function. Here we report that the promoter of the unc-47 gene of C. elegans contains two discrete elements. One has a highly conserved sequence that determines the spatial expression pattern. Another shows no sequence conservation, but it makes expression of the gene robust, that is, consistent between individuals and resilient to environmental challenges. 
Remarkably, multiple unrelated sequences are capable of promoting robust expression. Nucleotide composition of these sequences suggests that open chromatin may play a role in conferring robustness of gene expression. Because general sequence composition and therefore expression robustness can be maintained despite sequence turnover, our results offer an explanation of how rapidly diverging promoter elements can nevertheless remain functionally conserved.
Affiliation(s)
- Antoine Barrière: Department of Ecology and Evolution and Institute for Genomics and Systems Biology, Chicago, Illinois, United States of America
- Kacy L. Gordon: Department of Organismal Biology and Anatomy, The University of Chicago, Chicago, Illinois, United States of America
- Ilya Ruvinsky: Department of Ecology and Evolution and Institute for Genomics and Systems Biology, Chicago, Illinois, United States of America; Department of Organismal Biology and Anatomy, The University of Chicago, Chicago, Illinois, United States of America
|
47
|
Franchini P, van der Merwe M, Roodt-Wilding R. Transcriptome characterization of the South African abalone Haliotis midae using sequencing-by-synthesis. BMC Res Notes 2011; 4:59. [PMID: 21396099 PMCID: PMC3063225 DOI: 10.1186/1756-0500-4-59] [Citation(s) in RCA: 62] [Impact Index Per Article: 4.8] [Received: 10/21/2010] [Accepted: 03/11/2011] [Indexed: 12/13/2022]
Abstract
BACKGROUND: Worldwide, the genus Haliotis is represented by 56 extant species, and several of these are commercially cultured. Among the six abalone species found in South Africa, Haliotis midae is the only aquacultured species. Despite its economic importance, genomic sequence resources for H. midae, and for abalone in general, are still scarce. Next generation sequencing technologies provide a fast and efficient tool to generate large sequence collections that can be used to characterize the transcriptome and identify expressed genes associated with economically important traits like growth and disease resistance. RESULTS: More than 25 million short reads generated by the Illumina Genome Analyzer were de novo assembled into 22,761 contigs with an average size of 260 bp. With a stringent E-value threshold of 10^-10, 3,841 contigs (16.8%) had a BLAST homologous match against the GenBank non-redundant (NR) protein database. Most of these sequences were annotated using the gene ontology (GO) and eukaryotic orthologous groups of proteins (KOG) databases and assigned to various functional categories. According to annotation results, many gene families involved in immune response were identified. Thousands of simple sequence repeats (SSR) and single nucleotide polymorphisms (SNP) were detected. Setting stringent parameters to ensure a high probability of amplification, 420 primer pairs in 181 contigs containing SSR loci were designed. CONCLUSION: These data represent the most comprehensive genomic resource for the South African abalone H. midae to date. The amount of assembled sequence demonstrated the utility of the Illumina sequencing technology in the transcriptome characterization of a non-model species. It allowed the development of several markers and the identification of promising candidate genes for future studies on population and functional genomics in H. midae and in other abalone species.
Affiliation(s)
- Paolo Franchini: Molecular Aquatic Research Group, Department of Genetics, Stellenbosch University, Private Bag X1, Matieland, 7602, South Africa
|
48
|
Paar V, Gluncic M, Rosandic M, Basar I, Vlahovic I. Intragene Higher Order Repeats in Neuroblastoma BreakPoint Family Genes Distinguish Humans from Chimpanzees. Mol Biol Evol 2011; 28:1877-92. [DOI: 10.1093/molbev/msr009] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Indexed: 12/27/2022]
|
49
|
Dib S, Denarier E, Dionne N, Beaudoin M, Friedman HH, Peterson AC. Regulatory modules function in a non-autonomous manner to control transcription of the mbp gene. Nucleic Acids Res 2010; 39:2548-58. [PMID: 21131280 PMCID: PMC3074125 DOI: 10.1093/nar/gkq1160] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 01/15/2023]
Abstract
Multiple regulatory modules contribute to the complex expression programs realized by many loci. Although long thought of as isolated components, recent studies demonstrate that such regulatory sequences can physically associate with promoters and with each other and may localize to specific sub-nuclear transcription factories. These associations provide a substrate for putative interactions and have led to the suggested existence of a transcriptional interactome. Here, using a controlled strategy of transgenesis, we analyzed the functional consequences of regulatory sequence interaction within the myelin basic protein (mbp) locus. Interactions were revealed through comparisons of the qualitative and quantitative expression programs conferred by an allelic series of 11 different enhancer/inter-enhancer combinations ligated to a common promoter/reporter gene. In a developmentally contextual manner, the regulatory output of all modules changed markedly in the presence of other sequences. Predicted by transgene expression programs, deletion of one such module from the endogenous locus reduced oligodendrocyte expression levels but unexpectedly, also attenuated expression of the overlapping golli transcriptional unit. These observations support a regulatory architecture that extends beyond a combinatorial model to include frequent interactions capable of significantly modulating the functions conferred through regulatory modules in isolation.
Affiliation(s)
- Samar Dib: Department of Human Genetics, Laboratory of Developmental Biology, Royal Victoria Hospital, H-5, McGill University Health Centre, Montreal, Quebec, Canada
|
50
|
Paar V, Glunčić M, Basar I, Rosandić M, Paar P, Cvitković M. Large Tandem, Higher Order Repeats and Regularly Dispersed Repeat Units Contribute Substantially to Divergence Between Human and Chimpanzee Y Chromosomes. J Mol Evol 2010; 72:34-55. [DOI: 10.1007/s00239-010-9401-8] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 04/05/2010] [Accepted: 10/25/2010] [Indexed: 10/18/2022]
|