1
Djebali S, Lagarde J, Kapranov P, Lacroix V, Borel C, Mudge JM, Howald C, Foissac S, Ucla C, Chrast J, Ribeca P, Martin D, Murray RR, Yang X, Ghamsari L, Lin C, Bell I, Dumais E, Drenkow J, Tress ML, Gelpí JL, Orozco M, Valencia A, van Berkum NL, Lajoie BR, Vidal M, Stamatoyannopoulos J, Batut P, Dobin A, Harrow J, Hubbard T, Dekker J, Frankish A, Salehi-Ashtiani K, Reymond A, Antonarakis SE, Guigó R, Gingeras TR. Evidence for transcript networks composed of chimeric RNAs in human cells. PLoS One 2012; 7:e28213. [PMID: 22238572 PMCID: PMC3251577 DOI: 10.1371/journal.pone.0028213] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.4] [Received: 07/06/2011] [Accepted: 11/03/2011] [Indexed: 12/03/2022] [Open access]
Abstract
The classic organization of a gene structure has followed the Jacob and Monod bacterial gene model proposed more than 50 years ago. Since then, empirical determinations of the complexity of the transcriptomes found in yeast to human have blurred the definition and physical boundaries of genes. Using multiple analysis approaches we have characterized individual gene boundaries mapping on human chromosomes 21 and 22. Analyses of the locations of the 5′ and 3′ transcriptional termini of 492 protein coding genes revealed that for 85% of these genes the boundaries extend beyond the current annotated termini, most often connecting with exons of transcripts from other well annotated genes. The biological and evolutionary importance of these chimeric transcripts is underscored by (1) the non-random interconnections of genes involved, (2) the greater phylogenetic depth of the genes involved in many chimeric interactions, (3) the coordination of the expression of connected genes and (4) the close in vivo and three-dimensional proximity of the genomic regions being transcribed and contributing to parts of the chimeric RNAs. The non-random nature of the connection of the genes involved suggests that chimeric transcripts should not be studied in isolation, but together, as an RNA network.
Affiliation(s)
- Sarah Djebali
- Bioinformatics and Genomics, Centre for Genomic Regulation and Universitat Pompeu Fabra, Barcelona, Catalonia, Spain
2
Nikolaev SI, Deutsch S, Genolet R, Borel C, Parand L, Ucla C, Schütz F, Duriaux Sail G, Dupré Y, Jaquier-Gubler P, Araud T, Conne B, Descombes P, Vassalli JD, Curran J, Antonarakis SE. Transcriptional and post-transcriptional profile of human chromosome 21. Genome Res 2009; 19:1471-9. [PMID: 19581486 DOI: 10.1101/gr.089425.108] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/24/2022]
Abstract
Recent studies have demonstrated extensive transcriptional activity across the human genome, a substantial fraction of which is not associated with any functional annotation. However, very little is known regarding the post-transcriptional processes that operate within the different classes of RNA molecules. To characterize the post-transcriptional properties of expressed sequences from human chromosome 21 (HSA21), we separated RNA molecules from three cell lines (GM06990, HeLa S3, and SK-N-AS) according to their ribosome content by sucrose gradient fractionation. Polyribosomal-associated RNA and total RNA were subsequently hybridized to genomic tiling arrays. We found that approximately 50% of the transcriptional signals were located outside of annotated exons and were considered as TARs (transcriptionally active regions). Although TARs were observed among polysome-associated RNAs, RT-PCR and RACE experiments revealed that approximately 40% were likely to represent nonspecific cross-hybridization artifacts. Bioinformatics discrimination of TARs according to conservation and sequence complexity allowed us to identify a set of high-confidence TARs. This set of TARs was significantly depleted in the polysomes, suggesting that it was not likely to be involved in translation. Analysis of polysome representation of RefSeq exons showed that at least 15% of RefSeq transcripts undergo significant post-transcriptional regulation in at least two of the three cell lines tested. Among the regulated transcripts, enrichment analysis revealed an over-representation of genes involved in Alzheimer's disease (AD), including APP and the BACE1 protease that cleaves APP to produce the pathogenic beta 42 peptide. We demonstrate that the combination of RNA fractionation and tiling arrays is a powerful method to assess the transcriptional and post-transcriptional properties of genomic regions.
Affiliation(s)
- Sergey I Nikolaev
- Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, Switzerland
3
D'haene B, Attanasio C, Beysen D, Dostie J, Lemire E, Bouchard P, Field M, Jones K, Lorenz B, Menten B, Buysse K, Pattyn F, Friedli M, Ucla C, Rossier C, Wyss C, Speleman F, De Paepe A, Dekker J, Antonarakis SE, De Baere E. Disease-causing 7.4 kb cis-regulatory deletion disrupting conserved non-coding sequences and their interaction with the FOXL2 promotor: implications for mutation screening. PLoS Genet 2009; 5:e1000522. [PMID: 19543368 PMCID: PMC2689649 DOI: 10.1371/journal.pgen.1000522] [Citation(s) in RCA: 77] [Impact Index Per Article: 5.1] [Received: 03/09/2009] [Accepted: 05/18/2009] [Indexed: 11/23/2022] [Open access]
Abstract
To date, the contribution of disrupted potentially cis-regulatory conserved non-coding sequences (CNCs) to human disease is most likely underestimated, as no systematic screens for putative deleterious variations in CNCs have been conducted. As a model for monogenic disease we studied the involvement of genetic changes of CNCs in the cis-regulatory domain of FOXL2 in blepharophimosis syndrome (BPES). Fifty-seven molecularly unsolved BPES patients underwent high-resolution copy number screening and targeted sequencing of CNCs. Apart from three larger distant deletions, a de novo deletion as small as 7.4 kb was found at 283 kb 5′ to FOXL2. The deletion appeared to be triggered by an H-DNA-induced double-stranded break (DSB). In addition, it disrupts a novel long non-coding RNA (ncRNA) PISRT1 and 8 CNCs. The regulatory potential of the deleted CNCs was substantiated by in vitro luciferase assays. Interestingly, Chromosome Conformation Capture (3C) of a 625 kb region surrounding FOXL2 in expressing cellular systems revealed physical interactions of three upstream fragments and the FOXL2 core promoter. Importantly, one of these contains the 7.4 kb deleted fragment. Overall, this study revealed the smallest distant deletion causing monogenic disease and impacts upon the concept of mutation screening in human disease and developmental disorders in particular.

Long-range genetic control is an inherent feature of genes harbouring a highly complex spatiotemporal expression pattern, requiring a combined action of multiple cis-regulatory elements such as promoters, enhancers, and silencers. Consequently, disruption of the long-range genetic control of a target gene by genomic rearrangements of regulatory elements may lead to aberrant gene transcription and disease. To date, the contribution of mutated regulatory elements to human disease has not been studied frequently. Here, we explored the contribution of genetic changes in potentially cis-regulatory elements of the FOXL2 gene in blepharophimosis syndrome (BPES), a developmental monogenic condition of the eyelids and ovaries. We identified a de novo very subtle deletion of 7.4 kb causing BPES. Moreover, we studied the functional capacities and chromosome conformation of the deleted region in FOXL2 expressing cellular systems. Interestingly, the chromosome conformation analysis demonstrated the close proximity of the 7.4 kb deleted fragment and two other conserved regions with the FOXL2 core promoter, and the necessity of their integrity for correct FOXL2 expression. Finally, our study revealed the smallest distant deletion causing monogenic disease and emphasized the importance of mutation screening of cis-regulatory elements in human genetic disease.
Affiliation(s)
- Barbara D'haene
- Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
- Catia Attanasio
- Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, Switzerland
- Diane Beysen
- Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
- Josée Dostie
- Program in Gene Function and Expression and Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, Massachusetts, United States of America
- Edmond Lemire
- Division of Medical Genetics, Royal University Hospital, Saskatoon, Saskatchewan, Canada
- Kristie Jones
- Department of Clinical Genetics, The Children's Hospital at Westmead, Westmead, Australia
- Birgit Lorenz
- Department of Ophthalmology, Justus-Liebig-University Giessen, Universitaetsklinikum Giessen und Marburg GmbH Giessen Campus, Giessen, Germany
- Björn Menten
- Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
- Karen Buysse
- Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
- Filip Pattyn
- Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
- Marc Friedli
- Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, Switzerland
- Catherine Ucla
- Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, Switzerland
- Colette Rossier
- Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, Switzerland
- Carine Wyss
- Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, Switzerland
- Frank Speleman
- Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
- Anne De Paepe
- Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
- Job Dekker
- Program in Gene Function and Expression and Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, Massachusetts, United States of America
- Stylianos E. Antonarakis
- Department of Genetic Medicine and Development, University of Geneva Medical School, Geneva, Switzerland
- Elfride De Baere
- Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
- * E-mail:
4
Elsik CG, Tellam RL, Worley KC, Gibbs RA, Muzny DM, Weinstock GM, Adelson DL, Eichler EE, Elnitski L, Guigó R, Hamernik DL, Kappes SM, Lewin HA, Lynn DJ, Nicholas FW, Reymond A, Rijnkels M, Skow LC, Zdobnov EM, Schook L, Womack J, Alioto T, Antonarakis SE, Astashyn A, Chapple CE, Chen HC, Chrast J, Câmara F, Ermolaeva O, Henrichsen CN, Hlavina W, Kapustin Y, Kiryutin B, Kitts P, Kokocinski F, Landrum M, Maglott D, Pruitt K, Sapojnikov V, Searle SM, Solovyev V, Souvorov A, Ucla C, Wyss C, Anzola JM, Gerlach D, Elhaik E, Graur D, Reese JT, Edgar RC, McEwan JC, Payne GM, Raison JM, Junier T, Kriventseva EV, Eyras E, Plass M, Donthu R, Larkin DM, Reecy J, Yang MQ, Chen L, Cheng Z, Chitko-McKown CG, Liu GE, Matukumalli LK, Song J, Zhu B, Bradley DG, Brinkman FSL, Lau LPL, Whiteside MD, Walker A, Wheeler TT, Casey T, German JB, Lemay DG, Maqbool NJ, Molenaar AJ, Seo S, Stothard P, Baldwin CL, Baxter R, Brinkmeyer-Langford CL, Brown WC, Childers CP, Connelley T, Ellis SA, Fritz K, Glass EJ, Herzig CTA, Iivanainen A, Lahmers KK, Bennett AK, Dickens CM, Gilbert JGR, Hagen DE, Salih H, Aerts J, Caetano AR, Dalrymple B, Garcia JF, Gill CA, Hiendleder SG, Memili E, Spurlock D, Williams JL, Alexander L, Brownstein MJ, Guan L, Holt RA, Jones SJM, Marra MA, Moore R, Moore SS, Roberts A, Taniguchi M, Waterman RC, Chacko J, Chandrabose MM, Cree A, Dao MD, Dinh HH, Gabisi RA, Hines S, Hume J, Jhangiani SN, Joshi V, Kovar CL, Lewis LR, Liu YS, Lopez J, Morgan MB, Nguyen NB, Okwuonu GO, Ruiz SJ, Santibanez J, Wright RA, Buhay C, Ding Y, Dugan-Rocha S, Herdandez J, Holder M, Sabo A, Egan A, Goodell J, Wilczek-Boney K, Fowler GR, Hitchens ME, Lozado RJ, Moen C, Steffen D, Warren JT, Zhang J, Chiu R, Schein JE, Durbin KJ, Havlak P, Jiang H, Liu Y, Qin X, Ren Y, Shen Y, Song H, Bell SN, Davis C, Johnson AJ, Lee S, Nazareth LV, Patel BM, Pu LL, Vattathil S, Williams RL, Curry S, Hamilton C, Sodergren E, Wheeler DA, Barris W, Bennett GL, Eggen A, Green RD, Harhay GP, Hobbs M, Jann O, Keele 
JW, Kent MP, Lien S, McKay SD, McWilliam S, Ratnakumar A, Schnabel RD, Smith T, Snelling WM, Sonstegard TS, Stone RT, Sugimoto Y, Takasuga A, Taylor JF, Van Tassell CP, Macneil MD, Abatepaulo ARR, Abbey CA, Ahola V, Almeida IG, Amadio AF, Anatriello E, Bahadue SM, Biase FH, Boldt CR, Carroll JA, Carvalho WA, Cervelatti EP, Chacko E, Chapin JE, Cheng Y, Choi J, Colley AJ, de Campos TA, De Donato M, Santos IKFDM, de Oliveira CJF, Deobald H, Devinoy E, Donohue KE, Dovc P, Eberlein A, Fitzsimmons CJ, Franzin AM, Garcia GR, Genini S, Gladney CJ, Grant JR, Greaser ML, Green JA, Hadsell DL, Hakimov HA, Halgren R, Harrow JL, Hart EA, Hastings N, Hernandez M, Hu ZL, Ingham A, Iso-Touru T, Jamis C, Jensen K, Kapetis D, Kerr T, Khalil SS, Khatib H, Kolbehdari D, Kumar CG, Kumar D, Leach R, Lee JCM, Li C, Logan KM, Malinverni R, Marques E, Martin WF, Martins NF, Maruyama SR, Mazza R, McLean KL, Medrano JF, Moreno BT, Moré DD, Muntean CT, Nandakumar HP, Nogueira MFG, Olsaker I, Pant SD, Panzitta F, Pastor RCP, Poli MA, Poslusny N, Rachagani S, Ranganathan S, Razpet A, Riggs PK, Rincon G, Rodriguez-Osorio N, Rodriguez-Zas SL, Romero NE, Rosenwald A, Sando L, Schmutz SM, Shen L, Sherman L, Southey BR, Lutzow YS, Sweedler JV, Tammen I, Telugu BPVL, Urbanski JM, Utsunomiya YT, Verschoor CP, Waardenberg AJ, Wang Z, Ward R, Weikard R, Welsh TH, White SN, Wilming LG, Wunderlich KR, Yang J, Zhao FQ. The genome sequence of taurine cattle: a window to ruminant biology and evolution. Science 2009; 324:522-8. [PMID: 19390049 DOI: 10.1126/science.1169588] [Citation(s) in RCA: 806] [Impact Index Per Article: 53.7] [Indexed: 01/12/2023]
Abstract
To understand the biology and evolution of ruminants, the cattle genome was sequenced to about sevenfold coverage. The cattle genome contains a minimum of 22,000 genes, with a core set of 14,345 orthologs shared among seven mammalian species of which 1217 are absent or undetected in noneutherian (marsupial or monotreme) genomes. Cattle-specific evolutionary breakpoint regions in chromosomes have a higher density of segmental duplications, enrichment of repetitive elements, and species-specific variations in genes associated with lactation and immune responsiveness. Genes involved in metabolism are generally highly conserved, although five metabolic genes are deleted or extensively diverged from their human orthologs. The cattle genome sequence thus provides a resource for understanding mammalian evolution and accelerating livestock genetic improvement for milk and meat production.
5
Djebali S, Kapranov P, Foissac S, Lagarde J, Reymond A, Ucla C, Wyss C, Drenkow J, Dumais E, Murray RR, Lin C, Szeto D, Denoeud F, Calvo M, Frankish A, Harrow J, Makrythanasis P, Vidal M, Salehi-Ashtiani K, Antonarakis SE, Gingeras TR, Guigó R. Efficient targeted transcript discovery via array-based normalization of RACE libraries. Nat Methods 2008; 5:629-35. [PMID: 18500348 PMCID: PMC2713501 DOI: 10.1038/nmeth.1216] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.3] [Received: 03/12/2008] [Accepted: 04/24/2008] [Indexed: 11/09/2022]
Abstract
RACE (Rapid Amplification of cDNA Ends) is a widely used approach for transcript identification. Random clone selection from the RACE mixture, however, is an ineffective sampling strategy if the dynamic range of transcript abundances is large. Here, we describe a strategy that uses array hybridization to improve sampling efficiency of human transcripts. The products of the RACE reaction are hybridized onto tiling arrays, and the exons detected are used to delineate a series of RT-PCR reactions, through which the original RACE mixture is segregated into simpler RT-PCR reactions. These are independently cloned, and randomly selected clones are sequenced. This approach is superior to direct cloning and sequencing of RACE products: it specifically targets novel transcripts, and often results in overall normalization of transcript abundances. We show theoretically and experimentally that this strategy leads indeed to efficient sampling of novel transcripts, and we investigate multiplexing it by pooling RACE reactions from multiple interrogated loci prior to hybridization.
Affiliation(s)
- Sarah Djebali
- Grup de Recerca en Informàtica Biomèdica, Institut Municipal d'Investigació Mèdica/Universitat Pompeu Fabra, Dr. Aiguader 88, 08003 Barcelona, Spain
6
Lyle R, Prandini P, Osoegawa K, ten Hallers B, Humphray S, Zhu B, Eyras E, Castelo R, Bird CP, Gagos S, Scott C, Cox A, Deutsch S, Ucla C, Cruts M, Dahoun S, She X, Bena F, Wang SY, Van Broeckhoven C, Eichler EE, Guigo R, Rogers J, de Jong PJ, Reymond A, Antonarakis SE. Islands of euchromatin-like sequence and expressed polymorphic sequences within the short arm of human chromosome 21. Genome Res 2007; 17:1690-6. [PMID: 17895424 PMCID: PMC2045151 DOI: 10.1101/gr.6675307] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Indexed: 11/24/2022]
Abstract
The goals of the human genome project did not include sequencing of the heterochromatic regions. We describe here an initial sequence of 1.1 Mb of the short arm of human chromosome 21 (HSA21p), estimated to be 10% of 21p. This region contains extensive euchromatic-like sequence and includes on average one transcript every 100 kb. These transcripts show multiple inter- and intrachromosomal copies, and extensive copy number and sequence variability. The sequencing of the "heterochromatic" regions of the human genome is likely to reveal many additional functional elements and provide important evolutionary information.
Affiliation(s)
- Robert Lyle
- Department of Genetic Medicine and Development, University of Geneva Medical School, and University Hospitals, 1211 Geneva, Switzerland
- Corresponding authors. Fax 47-22-11-98-99; fax 41-22-379-5706.
- Paola Prandini
- Department of Genetic Medicine and Development, University of Geneva Medical School, and University Hospitals, 1211 Geneva, Switzerland
- Kazutoyo Osoegawa
- Children's Hospital Oakland Research Institute, Oakland, California 94609, USA
- Sean Humphray
- Wellcome Trust Sanger Institute, Cambridge CB10 1SA, United Kingdom
- Baoli Zhu
- Children's Hospital Oakland Research Institute, Oakland, California 94609, USA
- Eduardo Eyras
- Research Group on Biomedical Informatics, Pompeu Fabra University and Municipal Institute of Medical Research, E-8003 Barcelona, Catalonia, Spain
- Robert Castelo
- Research Group on Biomedical Informatics, Pompeu Fabra University and Municipal Institute of Medical Research, E-8003 Barcelona, Catalonia, Spain
- Sarantos Gagos
- Department of Genetic Medicine and Development, University of Geneva Medical School, and University Hospitals, 1211 Geneva, Switzerland
- Carol Scott
- Wellcome Trust Sanger Institute, Cambridge CB10 1SA, United Kingdom
- Antony Cox
- Wellcome Trust Sanger Institute, Cambridge CB10 1SA, United Kingdom
- Samuel Deutsch
- Department of Genetic Medicine and Development, University of Geneva Medical School, and University Hospitals, 1211 Geneva, Switzerland
- Catherine Ucla
- Department of Genetic Medicine and Development, University of Geneva Medical School, and University Hospitals, 1211 Geneva, Switzerland
- Marc Cruts
- Neurodegenerative Brain Diseases Group, Department of Molecular Genetics, VIB, University of Antwerp, BE-2610 Antwerpen, Belgium
- Sophie Dahoun
- Department of Genetic Medicine and Development, University of Geneva Medical School, and University Hospitals, 1211 Geneva, Switzerland
- Xinwei She
- Department of Genome Sciences, University of Washington and Howard Hughes Medical Institute, Seattle, Washington 98195-5065, USA
- Frederique Bena
- Department of Genetic Medicine and Development, University of Geneva Medical School, and University Hospitals, 1211 Geneva, Switzerland
- Sheng-Yue Wang
- Chinese National Human Genome Center at Shanghai, Shanghai 201203, China
- Christine Van Broeckhoven
- Neurodegenerative Brain Diseases Group, Department of Molecular Genetics, VIB, University of Antwerp, BE-2610 Antwerpen, Belgium
- Evan E. Eichler
- Department of Genome Sciences, University of Washington and Howard Hughes Medical Institute, Seattle, Washington 98195-5065, USA
- Roderic Guigo
- Centre for Genomic Regulation E-8003 Barcelona, Catalonia, Spain
- Jane Rogers
- Wellcome Trust Sanger Institute, Cambridge CB10 1SA, United Kingdom
- Pieter J. de Jong
- Children's Hospital Oakland Research Institute, Oakland, California 94609, USA
- Alexandre Reymond
- Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland
- Stylianos E. Antonarakis
- Department of Genetic Medicine and Development, University of Geneva Medical School, and University Hospitals, 1211 Geneva, Switzerland
- Corresponding authors. Fax 47-22-11-98-99; fax 41-22-379-5706.
7
Birney E, Stamatoyannopoulos JA, Dutta A, Guigó R, Gingeras TR, Margulies EH, Weng Z, Snyder M, Dermitzakis ET, Thurman RE, Kuehn MS, Taylor CM, Neph S, Koch CM, Asthana S, Malhotra A, Adzhubei I, Greenbaum JA, Andrews RM, Flicek P, Boyle PJ, Cao H, Carter NP, Clelland GK, Davis S, Day N, Dhami P, Dillon SC, Dorschner MO, Fiegler H, Giresi PG, Goldy J, Hawrylycz M, Haydock A, Humbert R, James KD, Johnson BE, Johnson EM, Frum TT, Rosenzweig ER, Karnani N, Lee K, Lefebvre GC, Navas PA, Neri F, Parker SCJ, Sabo PJ, Sandstrom R, Shafer A, Vetrie D, Weaver M, Wilcox S, Yu M, Collins FS, Dekker J, Lieb JD, Tullius TD, Crawford GE, Sunyaev S, Noble WS, Dunham I, Denoeud F, Reymond A, Kapranov P, Rozowsky J, Zheng D, Castelo R, Frankish A, Harrow J, Ghosh S, Sandelin A, Hofacker IL, Baertsch R, Keefe D, Dike S, Cheng J, Hirsch HA, Sekinger EA, Lagarde J, Abril JF, Shahab A, Flamm C, Fried C, Hackermüller J, Hertel J, Lindemeyer M, Missal K, Tanzer A, Washietl S, Korbel J, Emanuelsson O, Pedersen JS, Holroyd N, Taylor R, Swarbreck D, Matthews N, Dickson MC, Thomas DJ, Weirauch MT, Gilbert J, Drenkow J, Bell I, Zhao X, Srinivasan KG, Sung WK, Ooi HS, Chiu KP, Foissac S, Alioto T, Brent M, Pachter L, Tress ML, Valencia A, Choo SW, Choo CY, Ucla C, Manzano C, Wyss C, Cheung E, Clark TG, Brown JB, Ganesh M, Patel S, Tammana H, Chrast J, Henrichsen CN, Kai C, Kawai J, Nagalakshmi U, Wu J, Lian Z, Lian J, Newburger P, Zhang X, Bickel P, Mattick JS, Carninci P, Hayashizaki Y, Weissman S, Hubbard T, Myers RM, Rogers J, Stadler PF, Lowe TM, Wei CL, Ruan Y, Struhl K, Gerstein M, Antonarakis SE, Fu Y, Green ED, Karaöz U, Siepel A, Taylor J, Liefer LA, Wetterstrand KA, Good PJ, Feingold EA, Guyer MS, Cooper GM, Asimenos G, Dewey CN, Hou M, Nikolaev S, Montoya-Burgos JI, Löytynoja A, Whelan S, Pardi F, Massingham T, Huang H, Zhang NR, Holmes I, Mullikin JC, Ureta-Vidal A, Paten B, Seringhaus M, Church D, Rosenbloom K, Kent WJ, Stone EA, Batzoglou S, Goldman N, Hardison RC, Haussler D, 
Miller W, Sidow A, Trinklein ND, Zhang ZD, Barrera L, Stuart R, King DC, Ameur A, Enroth S, Bieda MC, Kim J, Bhinge AA, Jiang N, Liu J, Yao F, Vega VB, Lee CWH, Ng P, Shahab A, Yang A, Moqtaderi Z, Zhu Z, Xu X, Squazzo S, Oberley MJ, Inman D, Singer MA, Richmond TA, Munn KJ, Rada-Iglesias A, Wallerman O, Komorowski J, Fowler JC, Couttet P, Bruce AW, Dovey OM, Ellis PD, Langford CF, Nix DA, Euskirchen G, Hartman S, Urban AE, Kraus P, Van Calcar S, Heintzman N, Kim TH, Wang K, Qu C, Hon G, Luna R, Glass CK, Rosenfeld MG, Aldred SF, Cooper SJ, Halees A, Lin JM, Shulha HP, Zhang X, Xu M, Haidar JNS, Yu Y, Ruan Y, Iyer VR, Green RD, Wadelius C, Farnham PJ, Ren B, Harte RA, Hinrichs AS, Trumbower H, Clawson H, Hillman-Jackson J, Zweig AS, Smith K, Thakkapallayil A, Barber G, Kuhn RM, Karolchik D, Armengol L, Bird CP, de Bakker PIW, Kern AD, Lopez-Bigas N, Martin JD, Stranger BE, Woodroffe A, Davydov E, Dimas A, Eyras E, Hallgrímsdóttir IB, Huppert J, Zody MC, Abecasis GR, Estivill X, Bouffard GG, Guan X, Hansen NF, Idol JR, Maduro VVB, Maskeri B, McDowell JC, Park M, Thomas PJ, Young AC, Blakesley RW, Muzny DM, Sodergren E, Wheeler DA, Worley KC, Jiang H, Weinstock GM, Gibbs RA, Graves T, Fulton R, Mardis ER, Wilson RK, Clamp M, Cuff J, Gnerre S, Jaffe DB, Chang JL, Lindblad-Toh K, Lander ES, Koriabine M, Nefedov M, Osoegawa K, Yoshinaga Y, Zhu B, de Jong PJ. Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature 2007; 447:799-816. [PMID: 17571346 PMCID: PMC2212820 DOI: 10.1038/nature05874] [Citation(s) in RCA: 3782] [Impact Index Per Article: 222.5] [Indexed: 01/17/2023]
Abstract
We report the generation and analysis of functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project. These data have been further integrated and augmented by a number of evolutionary and computational analyses. Together, our results advance the collective knowledge about human genome function in several major areas. First, our studies provide convincing evidence that the genome is pervasively transcribed, such that the majority of its bases can be found in primary transcripts, including non-protein-coding transcripts, and those that extensively overlap one another. Second, systematic examination of transcriptional regulation has yielded new understanding about transcription start sites, including their relationship to specific regulatory sequences and features of chromatin accessibility and histone modification. Third, a more sophisticated view of chromatin structure has emerged, including its inter-relationship with DNA replication and transcriptional regulation. Finally, integration of these new sources of information, in particular with respect to mammalian evolution based on inter- and intra-species sequence comparisons, has yielded new mechanistic and evolutionary insights concerning the functional landscape of the human genome. Together, these studies are defining a path for pursuit of a more comprehensive characterization of human genome function.
8
Washietl S, Pedersen JS, Korbel JO, Stocsits C, Gruber AR, Hackermüller J, Hertel J, Lindemeyer M, Reiche K, Tanzer A, Ucla C, Wyss C, Antonarakis SE, Denoeud F, Lagarde J, Drenkow J, Kapranov P, Gingeras TR, Guigó R, Snyder M, Gerstein MB, Reymond A, Hofacker IL, Stadler PF. Structured RNAs in the ENCODE selected regions of the human genome. Genome Res 2007; 17:852-64. [PMID: 17568003 PMCID: PMC1891344 DOI: 10.1101/gr.5650707] [Citation(s) in RCA: 136] [Impact Index Per Article: 8.0] [Received: 06/16/2006] [Accepted: 12/12/2006] [Indexed: 12/16/2022]
Abstract
Functional RNA structures play an important role both in the context of noncoding RNA transcripts as well as regulatory elements in mRNAs. Here we present a computational study to detect functional RNA structures within the ENCODE regions of the human genome. Since structural RNAs in general lack characteristic signals in primary sequence, comparative approaches evaluating evolutionary conservation of structures are most promising. We have used three recently introduced programs based on either phylogenetic-stochastic context-free grammar (EvoFold) or energy directed folding (RNAz and AlifoldZ), yielding several thousand candidate structures (corresponding to approximately 2.7% of the ENCODE regions). EvoFold has its highest sensitivity in highly conserved and relatively AU-rich regions, while RNAz favors slightly GC-rich regions, resulting in a relatively small overlap between methods. Comparison with the GENCODE annotation points to functional RNAs in all genomic contexts, with a slightly increased density in 3'-UTRs. While we estimate a significant false discovery rate of approximately 50%-70% many of the predictions can be further substantiated by additional criteria: 248 loci are predicted by both RNAz and EvoFold, and an additional 239 RNAz or EvoFold predictions are supported by the (more stringent) AlifoldZ algorithm. Five hundred seventy RNAz structure predictions fall into regions that show signs of selection pressure also on the sequence level (i.e., conserved elements). More than 700 predictions overlap with noncoding transcripts detected by oligonucleotide tiling arrays. One hundred seventy-five selected candidates were tested by RT-PCR in six tissues, and expression could be verified in 43 cases (24.6%).
Affiliation(s)
- Stefan Washietl
- Institute for Theoretical Chemistry, University of Vienna, A-1090 Wien, Austria.
9
Denoeud F, Kapranov P, Ucla C, Frankish A, Castelo R, Drenkow J, Lagarde J, Alioto T, Manzano C, Chrast J, Dike S, Wyss C, Henrichsen CN, Holroyd N, Dickson MC, Taylor R, Hance Z, Foissac S, Myers RM, Rogers J, Hubbard T, Harrow J, Guigó R, Gingeras TR, Antonarakis SE, Reymond A. Prominent use of distal 5' transcription start sites and discovery of a large number of additional exons in ENCODE regions. Genome Res 2007; 17:746-59. [PMID: 17567994 PMCID: PMC1891335 DOI: 10.1101/gr.5660607] [Citation(s) in RCA: 162] [Impact Index Per Article: 9.5] [Received: 06/19/2006] [Accepted: 01/22/2007] [Indexed: 11/24/2022]
Abstract
This report presents systematic empirical annotation of transcript products from 399 annotated protein-coding loci across the 1% of the human genome targeted by the Encyclopedia of DNA elements (ENCODE) pilot project using a combination of 5' rapid amplification of cDNA ends (RACE) and high-density resolution tiling arrays. We identified previously unannotated and often tissue- or cell-line-specific transcribed fragments (RACEfrags), both 5' distal to the annotated 5' terminus and internal to the annotated gene bounds for the vast majority (81.5%) of the tested genes. Half of the distal RACEfrags span large segments of genomic sequences away from the main portion of the coding transcript and often overlap with the upstream-annotated gene(s). Notably, at least 20% of the resultant novel transcripts have changes in their open reading frames (ORFs), most of them fusing ORFs of adjacent transcripts. A significant fraction of distal RACEfrags show expression levels comparable to those of known exons of the same locus, suggesting that they are not part of very minority splice forms. These results have significant implications concerning (1) our current understanding of the architecture of protein-coding genes; (2) our views on locations of regulatory regions in the genome; and (3) the interpretation of sequence polymorphisms mapping to regions hitherto considered to be "noncoding," ultimately relating to the identification of disease-related sequence alterations.
Affiliation(s)
- France Denoeud
- Grup de Recerca en Informática Biomèdica, Institut Municipal d’Investigació Mèdica/Universitat Pompeu Fabra, 08003 Barcelona, Catalonia, Spain
- Catherine Ucla
- Department of Genetic Medicine and Development, University of Geneva Medical School, 1211 Geneva, Switzerland
- Adam Frankish
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1HH, United Kingdom
- Robert Castelo
- Grup de Recerca en Informática Biomèdica, Institut Municipal d’Investigació Mèdica/Universitat Pompeu Fabra, 08003 Barcelona, Catalonia, Spain
- Jorg Drenkow
- Affymetrix, Inc., Santa Clara, California 95051, USA
- Julien Lagarde
- Grup de Recerca en Informática Biomèdica, Institut Municipal d’Investigació Mèdica/Universitat Pompeu Fabra, 08003 Barcelona, Catalonia, Spain
- Tyler Alioto
- Center for Genomic Regulation, 08003 Barcelona, Catalonia, Spain
- Caroline Manzano
- Department of Genetic Medicine and Development, University of Geneva Medical School, 1211 Geneva, Switzerland
- Jacqueline Chrast
- Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland
- Sujit Dike
- Affymetrix, Inc., Santa Clara, California 95051, USA
- Carine Wyss
- Department of Genetic Medicine and Development, University of Geneva Medical School, 1211 Geneva, Switzerland
- Nancy Holroyd
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1HH, United Kingdom
- Mark C. Dickson
- Department of Genetics, Stanford Human Genome Center, Stanford University School of Medicine, Stanford, California 94305-5120, USA
- Ruth Taylor
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1HH, United Kingdom
- Zahra Hance
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1HH, United Kingdom
- Sylvain Foissac
- Center for Genomic Regulation, 08003 Barcelona, Catalonia, Spain
- Richard M. Myers
- Department of Genetics, Stanford Human Genome Center, Stanford University School of Medicine, Stanford, California 94305-5120, USA
- Jane Rogers
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1HH, United Kingdom
- Tim Hubbard
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1HH, United Kingdom
- Jennifer Harrow
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1HH, United Kingdom
- Roderic Guigó
- Grup de Recerca en Informática Biomèdica, Institut Municipal d’Investigació Mèdica/Universitat Pompeu Fabra, 08003 Barcelona, Catalonia, Spain
- Center for Genomic Regulation, 08003 Barcelona, Catalonia, Spain
- Stylianos E. Antonarakis
- Department of Genetic Medicine and Development, University of Geneva Medical School, 1211 Geneva, Switzerland
- Alexandre Reymond
- Department of Genetic Medicine and Development, University of Geneva Medical School, 1211 Geneva, Switzerland
- Center for Integrative Genomics, University of Lausanne, 1015 Lausanne, Switzerland
|
10
|
Harrow J, Denoeud F, Frankish A, Reymond A, Chen CK, Chrast J, Lagarde J, Gilbert JGR, Storey R, Swarbreck D, Rossier C, Ucla C, Hubbard T, Antonarakis SE, Guigo R. GENCODE: producing a reference annotation for ENCODE. Genome Biol 2006; 7 Suppl 1:S4.1-9. [PMID: 16925838 PMCID: PMC1810553 DOI: 10.1186/gb-2006-7-s1-s4] [Citation(s) in RCA: 440] [Impact Index Per Article: 24.4] [Indexed: 12/29/2022] Open
Abstract
BACKGROUND The GENCODE consortium was formed to identify and map all protein-coding genes within the ENCODE regions. This was achieved by a combination of initial manual annotation by the HAVANA team, experimental validation by the GENCODE consortium and a refinement of the annotation based on these experimental results. RESULTS The GENCODE gene features are divided into eight different categories of which only the first two (known and novel coding sequence) are confidently predicted to be protein-coding genes. 5' rapid amplification of cDNA ends (RACE) and RT-PCR were used to experimentally verify the initial annotation. Of the 420 coding loci tested, 229 RACE products have been sequenced. They supported 5' extensions of 30 loci and new splice variants in 50 loci. In addition, 46 loci without evidence for a coding sequence were validated, consisting of 31 novel and 15 putative transcripts. We assessed the comprehensiveness of the GENCODE annotation by attempting to validate all the predicted exon boundaries outside the GENCODE annotation. Out of 1,215 tested in a subset of the ENCODE regions, 14 novel exon pairs were validated, only two of them in intergenic regions. CONCLUSION In total, 487 loci, of which 434 are coding, have been annotated as part of the GENCODE reference set available from the UCSC browser. Comparison of GENCODE annotation with RefSeq and ENSEMBL show only 40% of GENCODE exons are contained within the two sets, which is a reflection of the high number of alternative splice forms with unique exons annotated. Over 50% of coding loci have been experimentally verified by 5' RACE for EGASP and the GENCODE collaboration is continuing to refine its annotation of 1% human genome with the aid of experimental validation.
Affiliation(s)
- Jennifer Harrow
- Wellcome Trust Sanger Institute, Wellcome Trust Campus, Hinxton, Cambridge CB10 1SA, UK.
|
11
|
Guigó R, Flicek P, Abril JF, Reymond A, Lagarde J, Denoeud F, Antonarakis S, Ashburner M, Bajic VB, Birney E, Castelo R, Eyras E, Ucla C, Gingeras TR, Harrow J, Hubbard T, Lewis SE, Reese MG. EGASP: the human ENCODE Genome Annotation Assessment Project. Genome Biol 2006; 7 Suppl 1:S2.1-31. [PMID: 16925836 PMCID: PMC1810551 DOI: 10.1186/gb-2006-7-s1-s2] [Citation(s) in RCA: 198] [Impact Index Per Article: 11.0] [Indexed: 01/28/2023] Open
Abstract
BACKGROUND We present the results of EGASP, a community experiment to assess the state-of-the-art in genome annotation within the ENCODE regions, which span 1% of the human genome sequence. The experiment had two major goals: the assessment of the accuracy of computational methods to predict protein coding genes; and the overall assessment of the completeness of the current human genome annotations as represented in the ENCODE regions. For the computational prediction assessment, eighteen groups contributed gene predictions. We evaluated these submissions against each other based on a 'reference set' of annotations generated as part of the GENCODE project. These annotations were not available to the prediction groups prior to the submission deadline, so that their predictions were blind and an external advisory committee could perform a fair assessment. RESULTS The best methods had at least one gene transcript correctly predicted for close to 70% of the annotated genes. Nevertheless, the multiple transcript accuracy, taking into account alternative splicing, reached only approximately 40% to 50% accuracy. At the coding nucleotide level, the best programs reached an accuracy of 90% in both sensitivity and specificity. Programs relying on mRNA and protein sequences were the most accurate in reproducing the manually curated annotations. Experimental validation shows that only a very small percentage (3.2%) of the selected 221 computationally predicted exons outside of the existing annotation could be verified. CONCLUSION This is the first such experiment in human DNA, and we have followed the standards established in a similar experiment, GASP1, in Drosophila melanogaster. We believe the results presented here contribute to the value of ongoing large-scale annotation projects and should guide further experimental methods when being scaled up to the entire human genome sequence.
Affiliation(s)
- Roderic Guigó
- Centre de Regulació Genòmica, Institut Municipal d'Investigació Mèdica-Universitat Pompeu Fabra, E08003 Barcelona, Catalonia, Spain
- Member of the EGASP Organizing Committee
- Paul Flicek
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Josep F Abril
- Centre de Regulació Genòmica, Institut Municipal d'Investigació Mèdica-Universitat Pompeu Fabra, E08003 Barcelona, Catalonia, Spain
- Alexandre Reymond
- Center for Integrative Genomics, University of Lausanne, Switzerland
- Julien Lagarde
- Centre de Regulació Genòmica, Institut Municipal d'Investigació Mèdica-Universitat Pompeu Fabra, E08003 Barcelona, Catalonia, Spain
- France Denoeud
- Centre de Regulació Genòmica, Institut Municipal d'Investigació Mèdica-Universitat Pompeu Fabra, E08003 Barcelona, Catalonia, Spain
- Stylianos Antonarakis
- University of Geneva Medical School and University Hospitals of Geneva, 1211 Geneva, Switzerland
- Michael Ashburner
- Department of Genetics, University of Cambridge, Cambridge CB3 2EH, UK
- Member of the EGASP Advisory Board
- Vladimir B Bajic
- South African National Bioinformatics Institute (SANBI), University of Western Cape, Bellville 7535, South Africa
- Member of the EGASP Advisory Board
- Ewan Birney
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Member of the EGASP Organizing Committee
- Robert Castelo
- Centre de Regulació Genòmica, Institut Municipal d'Investigació Mèdica-Universitat Pompeu Fabra, E08003 Barcelona, Catalonia, Spain
- Eduardo Eyras
- Centre de Regulació Genòmica, Institut Municipal d'Investigació Mèdica-Universitat Pompeu Fabra, E08003 Barcelona, Catalonia, Spain
- Catherine Ucla
- University of Geneva Medical School and University Hospitals of Geneva, 1211 Geneva, Switzerland
- Thomas R Gingeras
- Affymetrix Inc., Santa Clara, California 95051, USA
- Member of the EGASP Advisory Board
- Jennifer Harrow
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
- Member of the EGASP Organizing Committee
- Tim Hubbard
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
- Member of the EGASP Organizing Committee
- Suzanna E Lewis
- Department of Molecular and Cellular Biology, University of California, Berkeley, California 94792, USA
- Member of the EGASP Advisory Board
- Martin G Reese
- Omicia Inc., Christie Ave., Emeryville, California 94608, USA
- Member of the EGASP Advisory Board
|
12
|
Bonafé L, Dermitzakis ET, Unger S, Greenberg CR, Campos-Xavier BA, Zankl A, Ucla C, Antonarakis SE, Superti-Furga A, Reymond A. Evolutionary comparison provides evidence for pathogenicity of RMRP mutations. PLoS Genet 2005; 1:e47. [PMID: 16244706 PMCID: PMC1262189 DOI: 10.1371/journal.pgen.0010047] [Citation(s) in RCA: 51] [Impact Index Per Article: 2.8] [Received: 03/15/2005] [Accepted: 09/07/2005] [Indexed: 11/19/2022] Open
Abstract
Cartilage-hair hypoplasia (CHH) is a pleiotropic disease caused by recessive mutations in the RMRP gene that result in a wide spectrum of manifestations including short stature, sparse hair, metaphyseal dysplasia, anemia, immune deficiency, and increased incidence of cancer. Molecular diagnosis of CHH has implications for management, prognosis, follow-up, and genetic counseling of affected patients and their families. We report 20 novel mutations in 36 patients with CHH and describe the associated phenotypic spectrum. Given the high mutational heterogeneity (62 mutations reported to date), the high frequency of variations in the region (eight single nucleotide polymorphisms in and around RMRP), and the fact that RMRP is not translated into protein, prediction of mutation pathogenicity is difficult. We addressed this issue by a comparative genomic approach and aligned the genomic sequences of RMRP gene in the entire class of mammals. We found that putative pathogenic mutations are located in highly conserved nucleotides, whereas polymorphisms are located in non-conserved positions. We conclude that the abundance of variations in this small gene is remarkable and at odds with its high conservation through species; it is unclear whether these variations are caused by a high local mutation rate, a failure of repair mechanisms, or a relaxed selective pressure. The marked diversity of mutations in RMRP and the low homozygosity rate in our patient population indicate that CHH is more common than previously estimated, but may go unrecognized because of its variable clinical presentation. Thus, RMRP molecular testing may be indicated in individuals with isolated metaphyseal dysplasia, anemia, or immune dysregulation.
Affiliation(s)
- Luisa Bonafé
- Division of Molecular Pediatrics, Centre Hospitalier Universitaire Vaudois, Lausanne, Switzerland.
|
13
|
Bonnafe E, Touka M, AitLounis A, Baas D, Barras E, Ucla C, Moreau A, Flamant F, Dubruille R, Couble P, Collignon J, Durand B, Reith W. The transcription factor RFX3 directs nodal cilium development and left-right asymmetry specification. Mol Cell Biol 2004; 24:4417-27. [PMID: 15121860 PMCID: PMC400456 DOI: 10.1128/mcb.24.10.4417-4427.2004] [Citation(s) in RCA: 159] [Impact Index Per Article: 8.0] [Indexed: 12/25/2022] Open
Abstract
There are five members of the RFX family of transcription factors in mammals. While RFX5 plays a well-defined role in the immune system, the functions of RFX1 to RFX4 remain largely unknown. We have generated mice with a deletion of the Rfx3 gene. RFX3-deficient mice exhibit frequent left-right (LR) asymmetry defects leading to a high rate of embryonic lethality and situs inversus in surviving adults. In vertebrates, specification of the LR body axis is controlled by monocilia in the embryonic node, and defects in nodal cilia consequently result in abnormal LR patterning. Consistent with this, Rfx3 is expressed in ciliated cells of the node and RFX3-deficient mice exhibit a pronounced defect in nodal cilia. In contrast to the case for wild-type embryos, for which we document for the first time a twofold increase in the length of nodal cilia during development, the cilia are present but remain markedly stunted in mutant embryos. Finally, we show that RFX3 regulates the expression of D2lic, the mouse orthologue of a Caenorhabditis elegans gene that is implicated in intraflagellar transport, a process required for the assembly and maintenance of cilia. In conclusion, RFX3 is essential for the differentiation of nodal monocilia and hence for LR body axis determination.
Affiliation(s)
- E Bonnafe
- Centre de Génétique Moléculaire et Cellulaire, CNRS UMR 5534, Université Claude Bernard Lyon-1, F-69622 Villeurbanne, France
|
14
|
Dermitzakis ET, Reymond A, Scamuffa N, Ucla C, Kirkness E, Rossier C, Antonarakis SE. Evolutionary Discrimination of Mammalian Conserved Non-Genic Sequences (CNGs). Science 2003; 302:1033-5. [PMID: 14526086 DOI: 10.1126/science.1087047] [Citation(s) in RCA: 156] [Impact Index Per Article: 7.4] [Indexed: 11/02/2022]
Abstract
Analysis of the human and mouse genomes identified an abundance of conserved non-genic sequences (CNGs). The significance and evolutionary depth of their conservation remain unanswered. We have quantified levels and patterns of conservation of 191 CNGs of human chromosome 21 in 14 mammalian species. We found that CNGs are significantly more conserved than protein-coding genes and noncoding RNAs (ncRNAs) within the mammalian class from primates to monotremes to marsupials. The pattern of substitutions in CNGs differed from that seen in protein-coding and ncRNA genes and resembled that of protein-binding regions. About 0.3% to 1% of the human genome corresponds to a previously unknown class of extremely constrained CNGs shared among mammals.
Affiliation(s)
- Emmanouil T Dermitzakis
- Division of Medical Genetics and National Center of Competence in Research (NCCR) Frontiers in Genetics, University of Geneva Medical School and University Hospitals, 1211 Geneva, Switzerland.
|
15
|
Guigo R, Dermitzakis ET, Agarwal P, Ponting CP, Parra G, Reymond A, Abril JF, Keibler E, Lyle R, Ucla C, Antonarakis SE, Brent MR. Comparison of mouse and human genomes followed by experimental verification yields an estimated 1,019 additional genes. Proc Natl Acad Sci U S A 2003; 100:1140-5. [PMID: 12552088 PMCID: PMC298740 DOI: 10.1073/pnas.0337561100] [Citation(s) in RCA: 88] [Impact Index Per Article: 4.2] [Received: 10/21/2002] [Accepted: 12/11/2002] [Indexed: 11/18/2022] Open
Abstract
A primary motivation for sequencing the mouse genome was to accelerate the discovery of mammalian genes by using sequence conservation between mouse and human to identify coding exons. Achieving this goal proved challenging because of the large proportion of the mouse and human genomes that is apparently conserved but apparently does not code for protein. We developed a two-stage procedure that exploits the mouse and human genome sequences to produce a set of genes with a much higher rate of experimental verification than previously reported prediction methods. RT-PCR amplification and direct sequencing applied to an initial sample of mouse predictions that do not overlap previously known genes verified the regions flanking one intron in 139 predictions, with verification rates reaching 76%. On average, the confirmed predictions show more restricted expression patterns than the mouse orthologs of known human genes, and two-thirds lack homologs in fish genomes, demonstrating the sensitivity of this dual-genome approach to hard-to-find genes. We verified 112 previously unknown homologs of known proteins, including two homeobox proteins relevant to developmental biology, an aquaporin, and a homolog of dystrophin. We estimate that transcription and splicing can be verified for >1,000 gene predictions identified by this method that do not overlap known genes. This is likely to constitute a significant fraction of the previously unknown, multiexon mammalian genes.
Affiliation(s)
- Roderic Guigo
- Research Group in Biomedical Informatics, Institut Municipal d'Investigació Mèdica/Universitat Pompeu Fabra/Centre de Regulació Genòmica, E08003 Barcelona, Catalonia, Spain
|
16
|
Waterston RH, Lindblad-Toh K, Birney E, Rogers J, Abril JF, Agarwal P, Agarwala R, Ainscough R, Alexandersson M, An P, Antonarakis SE, Attwood J, Baertsch R, Bailey J, Barlow K, Beck S, Berry E, Birren B, Bloom T, Bork P, Botcherby M, Bray N, Brent MR, Brown DG, Brown SD, Bult C, Burton J, Butler J, Campbell RD, Carninci P, Cawley S, Chiaromonte F, Chinwalla AT, Church DM, Clamp M, Clee C, Collins FS, Cook LL, Copley RR, Coulson A, Couronne O, Cuff J, Curwen V, Cutts T, Daly M, David R, Davies J, Delehaunty KD, Deri J, Dermitzakis ET, Dewey C, Dickens NJ, Diekhans M, Dodge S, Dubchak I, Dunn DM, Eddy SR, Elnitski L, Emes RD, Eswara P, Eyras E, Felsenfeld A, Fewell GA, Flicek P, Foley K, Frankel WN, Fulton LA, Fulton RS, Furey TS, Gage D, Gibbs RA, Glusman G, Gnerre S, Goldman N, Goodstadt L, Grafham D, Graves TA, Green ED, Gregory S, Guigó R, Guyer M, Hardison RC, Haussler D, Hayashizaki Y, Hillier LW, Hinrichs A, Hlavina W, Holzer T, Hsu F, Hua A, Hubbard T, Hunt A, Jackson I, Jaffe DB, Johnson LS, Jones M, Jones TA, Joy A, Kamal M, Karlsson EK, Karolchik D, Kasprzyk A, Kawai J, Keibler E, Kells C, Kent WJ, Kirby A, Kolbe DL, Korf I, Kucherlapati RS, Kulbokas EJ, Kulp D, Landers T, Leger JP, Leonard S, Letunic I, Levine R, Li J, Li M, Lloyd C, Lucas S, Ma B, Maglott DR, Mardis ER, Matthews L, Mauceli E, Mayer JH, McCarthy M, McCombie WR, McLaren S, McLay K, McPherson JD, Meldrim J, Meredith B, Mesirov JP, Miller W, Miner TL, Mongin E, Montgomery KT, Morgan M, Mott R, Mullikin JC, Muzny DM, Nash WE, Nelson JO, Nhan MN, Nicol R, Ning Z, Nusbaum C, O'Connor MJ, Okazaki Y, Oliver K, Overton-Larty E, Pachter L, Parra G, Pepin KH, Peterson J, Pevzner P, Plumb R, Pohl CS, Poliakov A, Ponce TC, Ponting CP, Potter S, Quail M, Reymond A, Roe BA, Roskin KM, Rubin EM, Rust AG, Santos R, Sapojnikov V, Schultz B, Schultz J, Schwartz MS, Schwartz S, Scott C, Seaman S, Searle S, Sharpe T, Sheridan A, Shownkeen R, Sims S, Singer JB, Slater G, Smit A, Smith DR, Spencer B, Stabenau A, Stange-Thomann N, Sugnet C, Suyama M, Tesler G, Thompson J, Torrents D, Trevaskis E, Tromp J, Ucla C, Ureta-Vidal A, Vinson JP, Von Niederhausern AC, Wade CM, Wall M, Weber RJ, Weiss RB, Wendl MC, West AP, Wetterstrand K, Wheeler R, Whelan S, Wierzbowski J, Willey D, Williams S, Wilson RK, Winter E, Worley KC, Wyman D, Yang S, Yang SP, Zdobnov EM, Zody MC, Lander ES. Initial sequencing and comparative analysis of the mouse genome. Nature 2002; 420:520-62. [PMID: 12466850 DOI: 10.1038/nature01262] [Citation(s) in RCA: 4791] [Impact Index Per Article: 217.8] [Received: 09/18/2002] [Accepted: 10/31/2002] [Indexed: 12/18/2022]
Abstract
The sequence of the mouse genome is a key informational tool for understanding the contents of the human genome and a key experimental tool for biomedical research. Here, we report the results of an international collaboration to produce a high-quality draft sequence of the mouse genome. We also present an initial comparative analysis of the mouse and human genomes, describing some of the insights that can be gleaned from the two sequences. We discuss topics including the analysis of the evolutionary forces shaping the size, structure and sequence of the genomes; the conservation of large-scale synteny across most of the genomes; the much lower extent of sequence orthology covering less than half of the genomes; the proportions of the genomes under selection; the number of protein-coding genes; the expansion of gene families related to reproduction and immunity; the evolution of proteins; and the identification of intraspecies polymorphism.
MESH Headings
- Animals
- Base Composition
- Chromosomes, Mammalian/genetics
- Conserved Sequence/genetics
- CpG Islands/genetics
- Evolution, Molecular
- Gene Expression Regulation
- Genes/genetics
- Genetic Variation/genetics
- Genome
- Genome, Human
- Genomics
- Humans
- Mice/classification
- Mice/genetics
- Mice, Knockout
- Mice, Transgenic
- Models, Animal
- Multigene Family/genetics
- Mutagenesis
- Neoplasms/genetics
- Physical Chromosome Mapping
- Proteome/genetics
- Pseudogenes/genetics
- Quantitative Trait Loci/genetics
- RNA, Untranslated/genetics
- Repetitive Sequences, Nucleic Acid/genetics
- Selection, Genetic
- Sequence Analysis, DNA
- Sex Chromosomes/genetics
- Species Specificity
- Synteny
|
17
|
Reymond A, Marigo V, Yaylaoglu MB, Leoni A, Ucla C, Scamuffa N, Caccioppoli C, Dermitzakis ET, Lyle R, Banfi S, Eichele G, Antonarakis SE, Ballabio A. Human chromosome 21 gene expression atlas in the mouse. Nature 2002; 420:582-6. [PMID: 12466854 DOI: 10.1038/nature01178] [Citation(s) in RCA: 159] [Impact Index Per Article: 7.2] [Received: 05/11/2002] [Accepted: 09/19/2002] [Indexed: 11/09/2022]
Abstract
Genome-wide expression analyses have a crucial role in functional genomics. High resolution methods, such as RNA in situ hybridization provide an accurate description of the spatiotemporal distribution of transcripts as well as a three-dimensional 'in vivo' gene expression overview. We set out to analyse systematically the expression patterns of genes from an entire chromosome. We chose human chromosome 21 because of the medical relevance of trisomy 21 (Down's syndrome). Here we show the expression analysis of all identifiable murine orthologues of human chromosome 21 genes (161 out of 178 confirmed human genes) by RNA in situ hybridization on whole mounts and tissue sections, and by polymerase chain reaction with reverse transcription on adult tissues. We observed patterned expression in several tissues including those affected in trisomy 21 phenotypes (that is, central nervous system, heart, gastrointestinal tract, and limbs). Furthermore, statistical analysis suggests the presence of some regions of the chromosome with genes showing either lack of expression or, to a lesser extent, co-expression in specific tissues. This high resolution expression 'atlas' of an entire human chromosome is an important step towards the understanding of gene function and of the pathogenetic mechanisms in Down's syndrome.
Affiliation(s)
- Alexandre Reymond
- Division of Medical Genetics, University of Geneva Medical School and University Hospital of Geneva, CMU, 1, rue Michel Servet, 1211 Geneva, Switzerland
|
18
|
Dermitzakis ET, Reymond A, Lyle R, Scamuffa N, Ucla C, Deutsch S, Stevenson BJ, Flegel V, Bucher P, Jongeneel CV, Antonarakis SE. Numerous potentially functional but non-genic conserved sequences on human chromosome 21. Nature 2002; 420:578-82. [PMID: 12466853 DOI: 10.1038/nature01251] [Citation(s) in RCA: 194] [Impact Index Per Article: 8.8] [Received: 09/16/2002] [Accepted: 10/30/2002] [Indexed: 11/08/2022]
Abstract
The use of comparative genomics to infer genome function relies on the understanding of how different components of the genome change over evolutionary time. The aim of such comparative analysis is to identify conserved, functionally transcribed sequences such as protein-coding genes and non-coding RNA genes, and other functional sequences such as regulatory regions, as well as other genomic features. Here, we have compared the entire human chromosome 21 with syntenic regions of the mouse genome, and have identified a large number of conserved blocks of unknown function. Although previous studies have made similar observations, it is unknown whether these conserved sequences are genes or not. Here we present an extensive experimental and computational analysis of human chromosome 21 in an effort to assign function to sequences conserved between human chromosome 21 (ref. 8) and the syntenic mouse regions. Our data support the presence of a large number of potentially functional non-genic sequences, probably regulatory and structural. The integration of the properties of the conserved components of human chromosome 21 to the rapidly accumulating functional data for this chromosome will improve considerably our understanding of the role of sequence conservation in mammalian genomes.
Affiliation(s)
- Emmanouil T Dermitzakis
- Division of Medical Genetics, 1 Rue Michel-Servet, University of Geneva Medical School and University Hospitals of Geneva, CH-1211 Geneva, Switzerland
|
19
|
Reymond A, Camargo AA, Deutsch S, Stevenson BJ, Parmigiani RB, Ucla C, Bettoni F, Rossier C, Lyle R, Guipponi M, de Souza S, Iseli C, Jongeneel CV, Bucher P, Simpson AJG, Antonarakis SE. Nineteen additional unpredicted transcripts from human chromosome 21. Genomics 2002; 79:824-32. [PMID: 12036297 DOI: 10.1006/geno.2002.6781] [Citation(s) in RCA: 41] [Impact Index Per Article: 1.9] [Indexed: 01/01/2023]
Abstract
The identification of all human chromosome 21 (HC21) genes is a necessary step in understanding the molecular pathogenesis of trisomy 21 (Down syndrome). The first analysis of the sequence of 21q included 127 previously characterized genes and predicted an additional 98 novel anonymous genes. Recently we evaluated the quality of this annotation by characterizing a set of HC21 open reading frames (C21orfs) identified by mapping spliced expressed sequence tags (ESTs) and predicted genes (PREDs), identified only in silico. This study underscored the limitations of in silico-only gene prediction, as many PREDs were incorrectly predicted. To refine the HC21 annotation, we have developed a reliable algorithm to extract and stringently map sequences that contain bona fide 3' transcript ends to the genome. We then created a specific 21q graphical display allowing an integrated view of the data that incorporates new ESTs as well as features such as CpG islands, repeats, and gene predictions. Using these tools we identified 27 new putative genes. To validate these, we sequenced previously cloned cDNAs and carried out RT-PCR, 5'- and 3'-RACE procedures, and comparative mapping. These approaches substantiated 19 new transcripts, thus increasing the HC21 gene count by 9.5%. These transcripts were likely not previously identified because they are small and encode small proteins. We also identified four transcriptional units that are spliced but contain no obvious open reading frame. The HC21 data presented here further emphasize that current gene prediction algorithms miss a substantial number of transcripts that nevertheless can be identified using a combination of experimental approaches and multiple refined algorithms.
Affiliation(s)
- Alexandre Reymond
- Division of Medical Genetics, University of Geneva Medical School, 1211 Geneva, Switzerland
|
20
|
Merla G, Ucla C, Guipponi M, Reymond A. Identification of additional transcripts in the Williams-Beuren syndrome critical region. Hum Genet 2002; 110:429-38. [PMID: 12073013 DOI: 10.1007/s00439-002-0710-x] [Citation(s) in RCA: 93] [Impact Index Per Article: 4.2] [Received: 09/26/2001] [Accepted: 02/07/2002] [Indexed: 11/24/2022]
Abstract
Williams-Beuren syndrome (WBS) is a developmental disorder associated with haploinsufficiency of multiple genes at 7q11.23. Here, we report the characterization of WBSCR16, WBSCR17, WBSCR18, WBSCR20A, WBSCR20B, WBSCR20C, WBSCR21, WBSCR22, and WBSCR23, nine novel genes contained in the WBS commonly deleted region or its flanking sequences. They encode an RCC1-like G-exchanging factor, an N-acetylgalactosaminyltransferase, a DNAJ-like chaperone, NOL1/NOP2/sun domain-containing proteins, a methyltransferase, or proteins with no known homologies. Haploinsufficiency of these newly identified WBSCR genes may contribute to certain of the WBS phenotypical features.
Affiliation(s)
- Giuseppe Merla
- Division of Medical Genetics, University of Geneva Medical School, CMU, 1 Rue Michel Servet, 1211 Geneva 4, Switzerland
|
21
|
Reymond A, Friedli M, Henrichsen CN, Chapot F, Deutsch S, Ucla C, Rossier C, Lyle R, Guipponi M, Antonarakis SE. From PREDs and open reading frames to cDNA isolation: revisiting the human chromosome 21 transcription map. Genomics 2001; 78:46-54. [PMID: 11707072 DOI: 10.1006/geno.2001.6640] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.3] [Indexed: 11/22/2022]
Abstract
A supernumerary copy of human chromosome 21 (HC21) causes Down syndrome. To understand the molecular pathogenesis of Down syndrome, it is necessary to identify all HC21 genes. The first annotation of the sequence of 21q confirmed 127 genes, and predicted an additional 98 previously unknown "anonymous" genes (predictions (PREDs) and open reading frames (C21orfs)), which were foreseen by exon prediction programs and/or spliced expressed sequence tags. These putative gene models still need to be confirmed as bona fide transcripts. Here we report the characterization and expression pattern of the putative transcripts C21orf7, C21orf11, C21orf15, C21orf18, C21orf19, C21orf22, C21orf42, C21orf50, C21orf51, C21orf57, and C21orf58, the GC-rich sequence DNA-binding factor candidate GCFC (also known as C21orf66), PRED12, PRED31, PRED34, PRED44, PRED54, and PRED56. Our analysis showed that most of the C21orfs originally defined by matching spliced expressed sequence tags were correctly predicted, whereas many of the PREDs, defined solely by computer prediction, do not correspond to genuine genes. Four of the six PREDs were incorrectly predicted: PRED44 and C21orf11 are portions of the same transcript, PRED31 is a pseudogene, and PRED54 and PRED56 were wrongly predicted. In contrast, PRED12 (now called C21orf68) and PRED34 (C21orf63) are now confirmed transcripts. We identified three new genes, C21orf67, C21orf69, and C21orf70, not previously predicted by any programs. This revision of the HC21 transcriptome has consequences for the entire genome regarding the quality of previous annotations and the total number of transcripts. It also provides new candidates for genes involved in Down syndrome and other genetic disorders that map to HC21.
MESH Headings
- Animals
- COS Cells
- Chromosomes, Human, Pair 21/genetics
- Cloning, Molecular
- DNA, Complementary/chemistry
- DNA, Complementary/genetics
- DNA, Complementary/isolation & purification
- Down Syndrome/genetics
- Expressed Sequence Tags
- Genes/genetics
- Green Fluorescent Proteins
- Humans
- Internet
- Luminescent Proteins/genetics
- Luminescent Proteins/metabolism
- Mice
- Microscopy, Fluorescence
- Molecular Sequence Data
- Open Reading Frames/genetics
- Recombinant Fusion Proteins/genetics
- Recombinant Fusion Proteins/metabolism
- Sequence Analysis, DNA
- Transcription, Genetic
- Tumor Cells, Cultured
Affiliation(s)
- A Reymond
- Division of Medical Genetics, University of Geneva Medical School, Geneva, 1211, Switzerland
|
22
|
Stauffer Y, Marguerat S, Meylan F, Ucla C, Sutkowski N, Huber B, Pelet T, Conrad B. Interferon-alpha-induced endogenous superantigen: a model linking environment and autoimmunity. Immunity 2001; 15:591-601. [PMID: 11672541 DOI: 10.1016/s1074-7613(01)00212-6] [Citation(s) in RCA: 98] [Impact Index Per Article: 4.3] [Indexed: 10/26/2022]
Abstract
We earlier proposed that a human endogenous retroviral (HERV) superantigen (SAg) IDDMK(1,2)22 may cause type I diabetes by activating autoreactive T cells. Viral infections and induction of interferon-alpha (IFN-alpha) are tightly associated with the onset of autoimmunity. Here we establish a link between viral infections and IFN-alpha-regulated SAg expression of the polymorphic and defective HERV-K18 provirus. HERV-K18 has three alleles, IDDMK(1,2)22 and two full-length envelope genes, that all encode SAgs. Expression of HERV-K18 SAgs is inducible by IFN-alpha and this is sufficient to stimulate V beta 7 T cells to levels comparable to transfectants constitutively expressing HERV-K18 SAgs. Induction of endogenous SAgs via IFN-alpha by viral infections is a novel mechanism through which environmental factors may cause disease in genetically susceptible individuals.
Affiliation(s)
- Y Stauffer
- Department of Genetics and Microbiology, University of Geneva Medical School, 1211 Geneva 4, Switzerland
|
23
|
Bontron S, Ucla C, Mach B, Steimle V. Efficient repression of endogenous major histocompatibility complex class II expression through dominant negative CIITA mutants isolated by a functional selection strategy. Mol Cell Biol 1997; 17:4249-58. [PMID: 9234682 PMCID: PMC232278 DOI: 10.1128/mcb.17.8.4249] [Citation(s) in RCA: 57] [Impact Index Per Article: 2.1] [Indexed: 02/04/2023] Open
Abstract
Major histocompatibility complex class II (MHC-II) molecules present peptide antigens to CD4-positive T cells and are of critical importance for the immune response. The MHC-II transactivator CIITA is essential for all aspects of MHC-II gene expression examined so far and thus constitutes a master regulator of MHC-II expression. In this study, we generated and analyzed mutant CIITA molecules which are able to suppress endogenous MHC-II expression in a dominant negative manner for both constitutive and inducible MHC-II expression. Dominant negative CIITA mutants were generated via specific restriction sites and by functional selection from a library of random N-terminal CIITA deletions. This functional selection strategy was very effective, leading to strong dominant negative CIITA mutants in which the N-terminal acidic and proline/serine/threonine-rich regions were completely deleted. Dominant negative activity is dependent on an intact C terminus. Efficient repression of endogenous MHC-II mRNA levels was quantified by RNase protection analysis. The quantitative effects of various dominant negative CIITA mutants on mRNA expression levels of the different MHC-II isotypes are very similar. The optimized dominant negative CIITA mutants isolated by functional selection should be useful for in vivo repression of MHC-II expression.
Affiliation(s)
- S Bontron
- Department of Genetics and Microbiology, University of Geneva Medical School, Switzerland
|
24
|
Bontron S, Steimle V, Ucla C, Eibl MM, Mach B. Two novel mutations in the MHC class II transactivator CIITA in a second patient from MHC class II deficiency complementation group A. Hum Genet 1997; 99:541-6. [PMID: 9099848 DOI: 10.1007/s004390050403] [Citation(s) in RCA: 38] [Impact Index Per Article: 1.4] [Indexed: 02/04/2023]
Abstract
Congenital MHC class II deficiency or bare lymphocyte syndrome (BLS; McKusick 209920) is caused by defects in trans-acting regulatory factors that control MHC class II expression and is therefore a disease of gene regulation. There are at least four complementation groups and the genetic and molecular dissection of this rare disease has contributed considerably to our current understanding of the molecular mechanisms governing MHC class II expression. Identification of the gene that is defective in BLS complementation group A, CIITA (MHC class II transactivator), has led to the discovery that CIITA acts as a master control factor of MHC class II expression. We have identified the CIITA mutations in a second patient from BLS group A. Two novel mutations abolish CIITA function, as shown by transfection experiments. Molecular analysis of these two novel mutations, together with the one described earlier in the first patient, is informative in terms of CIITA structure-function relationships.
Affiliation(s)
- S Bontron
- Department of Genetics and Microbiology, University of Geneva Medical School, Switzerland
|
25
|
Abstract
RFX transcription factors constitute a highly conserved family of site-specific DNA binding proteins involved in the expression of a variety of cellular and viral genes, including major histocompatibility complex class II genes and genes in human hepatitis B virus. Five members of the RFX gene family have been isolated from human and mouse, and all share a highly characteristic DNA binding domain that is distinct from other known DNA binding motifs. The human RFX1 and RFX2 genes have been assigned by in situ hybridization to chromosome 19p13.1 and 19p13.3, respectively. In this paper, we present data that localize RFX1 and RFX2 precisely within the detailed physical map of human chromosome 19 and genetic data that assign Rfx1 and Rfx2 to homologous regions of mouse chromosomes 8 and 17, respectively. These data define the established relationships between these homologous mouse and human regions in further detail and provide new tools for linking cloned genes to phenotypes in both species.
Affiliation(s)
- J Doyle
- Biology Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, 37831-8077, USA
|
26
|
Reith W, Ucla C, Barras E, Gaud A, Durand B, Herrero-Sanchez C, Kobr M, Mach B. RFX1, a transactivator of hepatitis B virus enhancer I, belongs to a novel family of homodimeric and heterodimeric DNA-binding proteins. Mol Cell Biol 1994; 14:1230-44. [PMID: 8289803 PMCID: PMC358479 DOI: 10.1128/mcb.14.2.1230-1244.1994] [Citation(s) in RCA: 57] [Impact Index Per Article: 1.9] [Indexed: 01/29/2023] Open
Abstract
RFX1 is a transactivator of human hepatitis B virus enhancer I. We show here that RFX1 belongs to a previously unidentified family of DNA-binding proteins of which we have cloned three members, RFX1, RFX2, and RFX3, from humans and mice. Members of the RFX family constitute the nuclear complexes that have been referred to previously as enhancer factor C, EP, methylation-dependent DNA-binding protein, or rpL30 alpha. RFX proteins share five strongly conserved regions which include the two domains required for DNA binding and dimerization. They have very similar DNA-binding specificities and heterodimerize both in vitro and in vivo. mRNA levels for all three genes, particularly RFX2, are elevated in testis. In other cell lines and tissues, RFX mRNA levels are variable, particularly for RFX2 and RFX3. RFX proteins share several novel features, including new DNA-binding and dimerization motifs and a peculiar dependence on methylated CpG dinucleotides at certain sites.
Affiliation(s)
- W Reith
- Jeantet Laboratory of Molecular Genetics, Department of Genetics and Microbiology, University of Geneva Medical School, Centre Médical Universitaire, Switzerland
|
27
|
Affiliation(s)
- I Kern
- Jeantet Laboratory of Molecular Genetics, Department of Genetics and Microbiology, University of Geneva Medical School, Switzerland
|
28
|
Scherly D, Nouspikel T, Corlet J, Ucla C, Bairoch A, Clarkson SG. Complementation of the DNA repair defect in xeroderma pigmentosum group G cells by a human cDNA related to yeast RAD2. Nature 1993; 363:182-5. [PMID: 8483504 DOI: 10.1038/363182a0] [Citation(s) in RCA: 149] [Impact Index Per Article: 4.8] [Indexed: 01/31/2023]
Abstract
Defects in human DNA repair proteins can give rise to the autosomal recessive disorders xeroderma pigmentosum (XP) and Cockayne's syndrome (CS), sometimes even together. Seven XP and three CS complementation groups have been identified that are thought to be due to mutations in genes from the nucleotide excision repair pathway. Here we isolate frog and human complementary DNAs that encode proteins resembling RAD2, a protein involved in this pathway in yeast. Alignment of these three polypeptides, together with two other RAD2 related proteins, reveals that their conserved sequences are largely confined to two regions. Expression of the human cDNA in vivo restores to normal the sensitivity to ultraviolet light and unscheduled DNA synthesis of lymphoblastoid cells from XP group G, but not CS group A. The XP-G correcting protein XPGC is generated from a messenger RNA of approximately 4 kilobases that is present in normal amounts in the XP-G cell line.
Affiliation(s)
- D Scherly
- Department of Genetics and Microbiology, University Medical Centre (CMU), Geneva, Switzerland
|
29
|
Pugliatti L, Derré J, Berger R, Ucla C, Reith W, Mach B. The genes for MHC class II regulatory factors RFX1 and RFX2 are located on the short arm of chromosome 19. Genomics 1992; 13:1307-10. [PMID: 1505960 DOI: 10.1016/0888-7543(92)90052-t] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.3] [Indexed: 12/27/2022]
Abstract
RFX1 is a transacting DNA-binding regulatory factor involved in the control of MHC class II gene expression. RFX2 is a structurally very similar protein with identical DNA binding features. A member of the family of RFX factors is affected in an autosomal recessive disease, MHC class II deficient combined immunodeficiency (CID), caused by a defect in a trans-acting regulatory factor controlling MHC class II gene expression. In situ hybridization with 3H-labeled RFX1 cDNA has allowed us to identify two distinct targets on the short arm of chromosome 19 (19p13.1 and 19p13.2-p13.3). With the use of biotinylated genomic cosmid clones specific for RFX1 and RFX2, respectively, it was then possible to localize RFX1 at 19p13.1 and RFX2 at 19p13.2-p13.3. These two regulatory genes are thus assigned to a region of high gene density and RFX1 is close to another DNA-binding factor, LYL1.
Affiliation(s)
- L Pugliatti
- Jeantet Laboratory of Molecular Genetics, Department of Genetics and Microbiology, University of Geneva Medical School, Switzerland
|
30
|
Ucla C, Roux-Lombard P, Fey S, Dayer JM, Mach B. Interferon gamma drastically modifies the regulation of interleukin 1 genes by endotoxin in U937 cells. J Clin Invest 1990; 85:185-91. [PMID: 2104878 PMCID: PMC296404 DOI: 10.1172/jci114411] [Citation(s) in RCA: 45] [Impact Index Per Article: 1.3] [Indexed: 12/30/2022] Open
Abstract
IL-1 alpha, IL-1 beta, and tumor necrosis factor alpha (TNF-alpha) gene expression is induced by LPS (endotoxin) in monocytes/macrophages and in some monocytic cell lines. IFN gamma and 1 alpha,25-dihydroxyvitamin D3 (1,25[OH]2D3) are important macrophage-activating factors. They induce changes in the human monocyte cell line U937 that reflect cellular differentiation. We have studied the effect of IFN-gamma and of 1,25(OH)2D3 on the expression of IL-1 and TNF-alpha messenger RNA in response to LPS. The induction of these genes by LPS is immediate and transient, with a maximum in 3 h. Preincubation of the cells with IFN-gamma or with 1,25(OH)2D3 increases these mRNA responses to LPS about fourfold. More importantly, cells exposed to IFN-gamma for 72 h exhibit a drastically different and unexpected pattern of IL-1 alpha and IL-1 beta gene response to LPS. Instead of the normal transient response, one then observes a sustained increase in IL-1 alpha and IL-1 beta gene expression over at least 16 h after LPS stimulation. This was measured both at the level of mRNA and by direct transcription assays (run-off). This striking effect of IFN-gamma on the kinetics of IL-1 gene response does not apply to the TNF-alpha gene. Interestingly, 1,25(OH)2D3, which shares with IFN-gamma a number of important effects on monocytes/macrophages, does not affect the kinetics of IL-1 gene response to LPS. In view of the biological relevance of endotoxin as a macrophage activator, the potential clinical implication of this prolonged induction of IL-1 gene expression is discussed.
Affiliation(s)
- C Ucla
- Department of Microbiology, University of Geneva Medical School, Switzerland
|
31
|
Iynedjian PB, Pilot PR, Nouspikel T, Milburn JL, Quaade C, Hughes S, Ucla C, Newgard CB. Differential expression and regulation of the glucokinase gene in liver and islets of Langerhans. Proc Natl Acad Sci U S A 1989; 86:7838-42. [PMID: 2682629 PMCID: PMC298166 DOI: 10.1073/pnas.86.20.7838] [Citation(s) in RCA: 138] [Impact Index Per Article: 3.9] [Indexed: 01/02/2023] Open
Abstract
Glucokinase, a key regulatory enzyme of glucose metabolism in mammals, provides an interesting model of tissue-specific gene expression. The single-copy gene is expressed principally in liver, where it gives rise to a 2.4-kilobase mRNA. The islets of Langerhans of the pancreas also contain glucokinase. Using a cDNA complementary to rat liver glucokinase mRNA, we show that normal pancreatic islets and tumoral islet cells contain a glucokinase mRNA species approximately 400 nucleotides longer than hepatic mRNA. Hybridization with synthetic oligonucleotides and primer-extension analysis show that the liver and islet glucokinase mRNAs differ in the 5' region. Glucokinase mRNA is absent from the livers of fasted rats and is strongly induced within hours by an oral glucose load. In contrast, islet glucokinase mRNA is expressed at a constant level during the fasting-refeeding cycle. The level of glucokinase protein in islets measured by immunoblotting is unaffected by fasting and refeeding, whereas a 3-fold increase in the amount of enzyme occurs in liver during the transition from fasting to refeeding. From these data, we conclude (i) that alternative splicing and/or the use of distinct tissue-specific promoters generate structurally distinct mRNA species in liver and islets of Langerhans and (ii) that tissue-specific transcription mechanisms result in inducible expression of the glucokinase gene in liver but not in islets during the fasting-refeeding transition.
Affiliation(s)
- P B Iynedjian
- Institut de Biochimie Clinique, University of Geneva School of Medicine, Switzerland
|
32
|
Gorski J, Irle C, Mickelson EM, Sheehy MJ, Termijtelen A, Ucla C, Mach B. Correlation of structure with T cell responses of the three members of the HLA-DRw52 allelic series. J Exp Med 1989; 170:1027-32. [PMID: 2788702 PMCID: PMC2189422 DOI: 10.1084/jem.170.3.1027] [Citation(s) in RCA: 25] [Impact Index Per Article: 0.7] [Indexed: 01/02/2023] Open
Abstract
A third allele at the DRB3 locus, DRw52c, represents an intermediate sequence between DRw52a and DRw52b and may have arisen by a gene conversion-like event. The recognition of cells bearing these molecules by a number of alloreactive and antigen-specific DR-restricted T cell clones was analyzed. On the basis of a theoretical model of HLA class II structure, distinct amino acid clusters have been identified as motifs controlling TCR recognition. These are located both in the cleft and in the alpha-helical edge of the MHC class II recognition platform. Motifs shared between two alleles may restrict public T cell clones.
Affiliation(s)
- J Gorski
- Department of Microbiology, University of Geneva Medical School, Switzerland
|
33
|
Abstract
Recent progress in the molecular genetics of HLA class II antigens has revealed the existence of multiple loci and of a large degree of polymorphism, with more individual alleles than was expected. An accurate detection and analysis of this extensive polymorphism is essential for optimal HLA typing for transplantation and for a reevaluation of HLA-disease association. Because of the limitations of the current typing methods, including restriction fragment length polymorphisms, we have proposed a DNA typing procedure based on hybridization with loci- and allele-specific oligonucleotides. Here we present a much simpler way of analyzing class II micropolymorphism down to the level of single nucleotide differences. RNA oligonucleotide typing (ROT) relies on RNA dot blots and requires 10-20 ml of blood. It is shown that with appropriate oligonucleotide probes, ROT can reliably and unambiguously identify any polymorphism at any of the HLA loci, including new alleles, not identified with previous methods. This illustrates the importance of oligonucleotide typing to optimize HLA matching, in particular for transplantation involving unrelated donors.
Affiliation(s)
- C Ucla
- Department of Microbiology, University of Geneva School of Medicine, Switzerland
|
34
|
Iynedjian PB, Ucla C, Mach B. Molecular cloning of glucokinase cDNA. Developmental and dietary regulation of glucokinase mRNA in rat liver. J Biol Chem 1987; 262:6032-8. [PMID: 3553185] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/06/2023] Open
Abstract
A rat liver cDNA library enriched for glucokinase sequences was constructed using the phage expression vector lambda gt11 and screened with an antiserum to glucokinase. A positive phage clone termed lambda-GK223 was isolated by several rounds of plaque purification. When introduced in the high frequency lysogenization strain Y1089, the phage was shown to encode a fusion protein containing epitopes specific to rat liver glucokinase. The 1800-base pair cDNA insert of lambda-GK223 was subcloned in a pUC plasmid, and a resulting recombinant termed pUC-GK1 was used for hybrid selection of mRNA. The selected mRNA directed the synthesis in a cell-free translation system of a protein identified as glucokinase by electrophoresis and immunoprecipitation. The cloned cDNA was then used as a probe to measure the amount of glucokinase mRNA in rat liver during postnatal development. Glucokinase mRNA, 2.4 kilobases in length, was first detectable at day 14 after birth and increased 40-fold in amount from this age to day 31, in parallel with the emergence of glucokinase enzyme activity. In the adult rat, glucokinase mRNA was low during fasting and increased more than 50-fold above the fasting level within 6 h of an oral glucose load. However, maximal accumulation of glucokinase mRNA was short-lived and the mRNA level returned toward basal values by 18 h of refeeding. These data point to rapid and massive effects on the expression of the glucokinase gene at the transcriptional or post-transcriptional levels during ontogenic development and dietary changes in the adult animal.
|
35
|
Iynedjian P, Ucla C, Mach B. Molecular cloning of glucokinase cDNA. Developmental and dietary regulation of glucokinase mRNA in rat liver. J Biol Chem 1987. [DOI: 10.1016/s0021-9258(18)45533-0] [Citation(s) in RCA: 58] [Impact Index Per Article: 1.6] [Indexed: 10/22/2022] Open
|
36
|
Mach B, Gorski J, Rollini P, Berte C, Amaldi I, Berdoz J, Ucla C. Polymorphism and regulation of HLA class II genes of the major histocompatibility complex. Cold Spring Harb Symp Quant Biol 1986; 51 Pt 1:67-74. [PMID: 3472746 DOI: 10.1101/sqb.1986.051.01.009] [Citation(s) in RCA: 34] [Impact Index Per Article: 0.9] [Indexed: 01/05/2023]
|
37
|
Dayer JM, Zavadil-Grob C, Ucla C, Mach B. Induction of human interleukin 1 mRNA measured by collagenase- and prostaglandin E2-stimulating activity in rheumatoid synovial cells. Eur J Immunol 1984; 14:898-901. [PMID: 6092094 DOI: 10.1002/eji.1830141007] [Citation(s) in RCA: 53] [Impact Index Per Article: 1.3] [Indexed: 01/18/2023]
Abstract
Human blood peripheral monocyte/macrophages release in culture a mononuclear cell factor (MCF) which stimulates the production of collagenase and prostaglandin E2 by human rheumatoid synovial cells and dermal fibroblasts. These two products play a role in connective tissue destruction. MCF has an apparent molecular weight of approximately 15 000 and is biologically and biochemically indistinguishable from interleukin 1. MCF therefore belongs to the well-documented nonimmune biological activities attributed to interleukin 1. Studies on the mechanisms of production and action of such monokine(s) have been difficult in view of the minute quantities produced by freshly isolated cells or from human monocytic lines. Starting from lectin-stimulated human blood mononuclear cells, we have isolated poly(A)+ RNA and studied its translation following microinjection into Xenopus laevis oocytes. The mRNA translation products stimulated collagenase and prostaglandin E2 production in human rheumatoid synovial cells and dermal fibroblasts. The size of MCF-mRNA was estimated to be 10 S. The mRNA of a member of the interleukin 1 family can now be studied in a system based on a specific and direct relevant biological assay and eventually compared with those of other monokines.
|