1
|
Carninci P, Kasukawa T, Katayama S, Gough J, Frith MC, Maeda N, Oyama R, Ravasi T, Lenhard B, Wells C, Kodzius R, Shimokawa K, Bajic VB, Brenner SE, Batalov S, Forrest ARR, Zavolan M, Davis MJ, Wilming LG, Aidinis V, Allen JE, Ambesi-Impiombato A, Apweiler R, Aturaliya RN, Bailey TL, Bansal M, Baxter L, Beisel KW, Bersano T, Bono H, Chalk AM, Chiu KP, Choudhary V, Christoffels A, Clutterbuck DR, Crowe ML, Dalla E, Dalrymple BP, de Bono B, Della Gatta G, di Bernardo D, Down T, Engstrom P, Fagiolini M, Faulkner G, Fletcher CF, Fukushima T, Furuno M, Futaki S, Gariboldi M, Georgii-Hemming P, Gingeras TR, Gojobori T, Green RE, Gustincich S, Harbers M, Hayashi Y, Hensch TK, Hirokawa N, Hill D, Huminiecki L, Iacono M, Ikeo K, Iwama A, Ishikawa T, Jakt M, Kanapin A, Katoh M, Kawasawa Y, Kelso J, Kitamura H, Kitano H, Kollias G, Krishnan SPT, Kruger A, Kummerfeld SK, Kurochkin IV, Lareau LF, Lazarevic D, Lipovich L, Liu J, Liuni S, McWilliam S, Madan Babu M, Madera M, Marchionni L, Matsuda H, Matsuzawa S, Miki H, Mignone F, Miyake S, Morris K, Mottagui-Tabar S, Mulder N, Nakano N, Nakauchi H, Ng P, Nilsson R, Nishiguchi S, Nishikawa S, Nori F, Ohara O, Okazaki Y, Orlando V, Pang KC, Pavan WJ, Pavesi G, Pesole G, Petrovsky N, Piazza S, Reed J, Reid JF, Ring BZ, Ringwald M, Rost B, Ruan Y, Salzberg SL, Sandelin A, Schneider C, Schönbach C, Sekiguchi K, Semple CAM, Seno S, Sessa L, Sheng Y, Shibata Y, Shimada H, Shimada K, Silva D, Sinclair B, Sperling S, Stupka E, Sugiura K, Sultana R, Takenaka Y, Taki K, Tammoja K, Tan SL, Tang S, Taylor MS, Tegner J, Teichmann SA, Ueda HR, van Nimwegen E, Verardo R, Wei CL, Yagi K, Yamanishi H, Zabarovsky E, Zhu S, Zimmer A, Hide W, Bult C, Grimmond SM, Teasdale RD, Liu ET, Brusic V, Quackenbush J, Wahlestedt C, Mattick JS, Hume DA, Kai C, Sasaki D, Tomaru Y, Fukuda S, Kanamori-Katayama M, Suzuki M, Aoki J, Arakawa T, Iida J, Imamura K, Itoh M, Kato T, Kawaji H, Kawagashira N, Kawashima T, Kojima M, Kondo S, Konno H, Nakano K, Ninomiya N, Nishio T, Okada M, Plessy C, Shibata K, Shiraki T, Suzuki S, Tagami M, Waki K, Watahiki A, Okamura-Oho Y, Suzuki H, Kawai J, Hayashizaki Y. The transcriptional landscape of the mammalian genome. Science 2005; 309:1559-63. [PMID: 16141072 DOI: 10.1126/science.1112014] [Citation(s) in RCA: 2666] [Impact Index Per Article: 133.3] [Indexed: 12/29/2022]
Abstract
This study describes comprehensive polling of transcription start and termination sites and analysis of previously unidentified full-length complementary DNAs derived from the mouse genome. We identify the 5' and 3' boundaries of 181,047 transcripts with extensive variation in transcripts arising from alternative promoter usage, splicing, and polyadenylation. There are 16,247 new mouse protein-coding transcripts, including 5154 encoding previously unidentified proteins. Genomic mapping of the transcriptome reveals transcriptional forests, with overlapping transcription on both strands, separated by deserts in which few transcripts are observed. The data provide a comprehensive platform for the comparative analysis of mammalian transcriptional regulation in differentiation and development.
|
Research Support, Non-U.S. Gov't |
20 |
2666 |
2
|
Futreal PA, Coin L, Marshall M, Down T, Hubbard T, Wooster R, Rahman N, Stratton MR. A census of human cancer genes. Nat Rev Cancer 2004; 4:177-83. [PMID: 14993899 PMCID: PMC2665285 DOI: 10.1038/nrc1299] [Citation(s) in RCA: 2284] [Impact Index Per Article: 108.8] [Indexed: 02/06/2023]
Abstract
A central aim of cancer research has been to identify the mutated genes that are causally implicated in oncogenesis (‘cancer genes’). After two decades of searching, how many have been identified and how do they compare to the complete gene set that has been revealed by the human genome sequence? We have conducted a ‘census’ of cancer genes that indicates that mutations in more than 1% of genes contribute to human cancer. The census illustrates striking features in the types of sequence alteration, cancer classes in which oncogenic mutations have been identified and protein domains that are encoded by cancer genes.
|
Review |
21 |
2284 |
3
|
Hubbard T, Barker D, Birney E, Cameron G, Chen Y, Clark L, Cox T, Cuff J, Curwen V, Down T, Durbin R, Eyras E, Gilbert J, Hammond M, Huminiecki L, Kasprzyk A, Lehvaslaiho H, Lijnzaad P, Melsopp C, Mongin E, Pettett R, Pocock M, Potter S, Rust A, Schmidt E, Searle S, Slater G, Smith J, Spooner W, Stabenau A, Stalker J, Stupka E, Ureta-Vidal A, Vastrik I, Clamp M. The Ensembl genome database project. Nucleic Acids Res 2002; 30:38-41. [PMID: 11752248 PMCID: PMC99161 DOI: 10.1093/nar/30.1.38] [Citation(s) in RCA: 1096] [Impact Index Per Article: 47.7] [Indexed: 02/07/2023]
Abstract
The Ensembl (http://www.ensembl.org/) database project provides a bioinformatics framework to organise biology around the sequences of large genomes. It is a comprehensive source of stable automatic annotation of the human genome sequence, with confirmed gene predictions that have been integrated with external data sources, and is available as either an interactive web site or as flat files. It is also an open source software engineering project to develop a portable system able to handle very large genomes and associated requirements from sequence analysis to data storage and visualisation. The Ensembl site is one of the leading sources of human genome sequence annotation and provided much of the analysis for publication by the international human genome project of the draft genome. The Ensembl system is being installed around the world in both companies and academic sites on machines ranging from supercomputers to laptops.
|
research-article |
23 |
1096 |
4
|
Eckhardt F, Lewin J, Cortese R, Rakyan VK, Attwood J, Burger M, Burton J, Cox TV, Davies R, Down TA, Haefliger C, Horton R, Howe K, Jackson DK, Kunde J, Koenig C, Liddle J, Niblett D, Otto T, Pettett R, Seemann S, Thompson C, West T, Rogers J, Olek A, Berlin K, Beck S. DNA methylation profiling of human chromosomes 6, 20 and 22. Nat Genet 2006; 38:1378-85. [PMID: 17072317 PMCID: PMC3082778 DOI: 10.1038/ng1909] [Citation(s) in RCA: 945] [Impact Index Per Article: 49.7] [Received: 04/26/2006] [Accepted: 09/18/2006] [Indexed: 12/17/2022]
Abstract
DNA methylation is the most stable type of epigenetic modification modulating the transcriptional plasticity of mammalian genomes. Using bisulfite DNA sequencing, we report high-resolution methylation profiles of human chromosomes 6, 20 and 22, providing a resource of about 1.9 million CpG methylation values derived from 12 different tissues. Analysis of six annotation categories showed that evolutionarily conserved regions are the predominant sites for differential DNA methylation and that a core region surrounding the transcriptional start site is an informative surrogate for promoter methylation. We find that 17% of the 873 analyzed genes are differentially methylated in their 5' UTRs and that about one-third of the differentially methylated 5' UTRs are inversely correlated with transcription. Despite the fact that our study controlled for factors reported to affect DNA methylation such as sex and age, we did not find any significant attributable effects. Our data suggest DNA methylation to be ontogenetically more stable than previously thought.
MESH Headings
- 5' Untranslated Regions
- Adult
- Age Factors
- Aged
- Animals
- Chromosomes, Human, Pair 20/genetics
- Chromosomes, Human, Pair 20/metabolism
- Chromosomes, Human, Pair 22/genetics
- Chromosomes, Human, Pair 22/metabolism
- Chromosomes, Human, Pair 6/genetics
- Chromosomes, Human, Pair 6/metabolism
- CpG Islands
- DNA Methylation
- Epigenesis, Genetic
- Evolution, Molecular
- Female
- Humans
- Male
- Mice
- Middle Aged
- Organ Specificity
- Promoter Regions, Genetic
- Sex Characteristics
- Species Specificity
- Transcription, Genetic
|
Comparative Study |
19 |
945 |
5
|
Rakyan VK, Down TA, Balding DJ, Beck S. Epigenome-wide association studies for common human diseases. Nat Rev Genet 2011; 12:529-41. [PMID: 21747404 PMCID: PMC3508712 DOI: 10.1038/nrg3000] [Citation(s) in RCA: 899] [Impact Index Per Article: 64.2] [Indexed: 02/06/2023]
Abstract
Despite the success of genome-wide association studies (GWASs) in identifying loci associated with common diseases, a substantial proportion of the causality remains unexplained. Recent advances in genomic technologies have placed us in a position to initiate large-scale studies of human disease-associated epigenetic variation, specifically variation in DNA methylation. Such epigenome-wide association studies (EWASs) present novel opportunities but also create new challenges that are not encountered in GWASs. We discuss EWAS design, cohort and sample selections, statistical significance and power, confounding factors and follow-up studies. We also discuss how integration of EWASs with GWASs can help to dissect complex GWAS haplotypes for functional analysis.
|
Review |
14 |
899 |
6
|
Hubbard TJP, Aken BL, Beal K, Ballester B, Caccamo M, Chen Y, Clarke L, Coates G, Cunningham F, Cutts T, Down T, Dyer SC, Fitzgerald S, Fernandez-Banet J, Graf S, Haider S, Hammond M, Herrero J, Holland R, Howe K, Howe K, Johnson N, Kahari A, Keefe D, Kokocinski F, Kulesha E, Lawson D, Longden I, Melsopp C, Megy K, Meidl P, Ouverdin B, Parker A, Prlic A, Rice S, Rios D, Schuster M, Sealy I, Severin J, Slater G, Smedley D, Spudich G, Trevanion S, Vilella A, Vogel J, White S, Wood M, Cox T, Curwen V, Durbin R, Fernandez-Suarez XM, Flicek P, Kasprzyk A, Proctor G, Searle S, Smith J, Ureta-Vidal A, Birney E. Ensembl 2007. Nucleic Acids Res 2006; 35:D610-7. [PMID: 17148474 PMCID: PMC1761443 DOI: 10.1093/nar/gkl996] [Citation(s) in RCA: 657] [Impact Index Per Article: 34.6] [Indexed: 11/13/2022]
Abstract
The Ensembl (http://www.ensembl.org/) project provides a comprehensive and integrated source of annotation of chordate genome sequences. Over the past year the number of genomes available from Ensembl has increased from 15 to 33, with the addition of sites for the mammalian genomes of elephant, rabbit, armadillo, tenrec, platypus, pig, cat, bush baby, common shrew, microbat and european hedgehog; the fish genomes of stickleback and medaka and the second example of the genomes of the sea squirt (Ciona savignyi) and the mosquito (Aedes aegypti). Some of the major features added during the year include the first complete gene sets for genomes with low-sequence coverage, the introduction of new strain variation data and the introduction of new orthology/paralog annotations based on gene trees.
|
Research Support, Non-U.S. Gov't |
19 |
657 |
7
|
Rakyan VK, Down TA, Maslau S, Andrew T, Yang TP, Beyan H, Whittaker P, McCann OT, Finer S, Valdes AM, Leslie RD, Deloukas P, Spector TD. Human aging-associated DNA hypermethylation occurs preferentially at bivalent chromatin domains. Genome Res 2010; 20:434-9. [PMID: 20219945 DOI: 10.1101/gr.103101.109] [Citation(s) in RCA: 562] [Impact Index Per Article: 37.5] [Indexed: 01/08/2023]
Abstract
There is a growing realization that some aging-associated phenotypes/diseases have an epigenetic basis. Here, we report the first genome-scale study of epigenomic dynamics during normal human aging. We identify aging-associated differentially methylated regions (aDMRs) in whole blood in a discovery cohort, and then replicate these aDMRs in sorted CD4(+) T-cells and CD14(+) monocytes in an independent cohort, suggesting that aDMRs occur in precursor haematopoietic cells. Further replication of the aDMRs in buccal cells, representing a tissue that originates from a different germ layer compared with blood, demonstrates that the aDMR signature is a multitissue phenomenon. Moreover, we demonstrate that aging-associated DNA hypermethylation occurs predominantly at bivalent chromatin domain promoters. This same category of promoters, associated with key developmental genes, is frequently hypermethylated in cancers and in vitro cell culture, pointing to a novel mechanistic link between aberrant hypermethylation in cancer, aging, and cell culture.
|
Research Support, Non-U.S. Gov't |
15 |
562 |
8
|
Hackett JA, Sengupta R, Zylicz JJ, Murakami K, Lee C, Down TA, Surani MA. Germline DNA demethylation dynamics and imprint erasure through 5-hydroxymethylcytosine. Science 2013; 339:448-52. [PMID: 23223451 PMCID: PMC3847602 DOI: 10.1126/science.1229277] [Citation(s) in RCA: 545] [Impact Index Per Article: 45.4] [Indexed: 12/31/2022]
Abstract
Mouse primordial germ cells (PGCs) undergo sequential epigenetic changes and genome-wide DNA demethylation to reset the epigenome for totipotency. Here, we demonstrate that erasure of CpG methylation (5mC) in PGCs occurs via conversion to 5-hydroxymethylcytosine (5hmC), driven by high levels of TET1 and TET2. Global conversion to 5hmC initiates asynchronously among PGCs at embryonic day (E) 9.5 to E10.5 and accounts for the unique process of imprint erasure. Mechanistically, 5hmC enrichment is followed by its protracted decline thereafter at a rate consistent with replication-coupled dilution. The conversion to 5hmC is an important component of parallel redundant systems that drive comprehensive reprogramming in PGCs. Nonetheless, we identify rare regulatory elements that escape systematic DNA demethylation in PGCs, providing a potential mechanistic basis for transgenerational epigenetic inheritance.
|
research-article |
12 |
545 |
9
|
Kolasinska-Zwierz P, Down T, Latorre I, Liu T, Liu XS, Ahringer J. Differential chromatin marking of introns and expressed exons by H3K36me3. Nat Genet 2009; 41:376-81. [PMID: 19182803 PMCID: PMC2648722 DOI: 10.1038/ng.322] [Citation(s) in RCA: 503] [Impact Index Per Article: 31.4] [Received: 10/24/2008] [Accepted: 01/09/2009] [Indexed: 12/11/2022]
Abstract
Variation in patterns of methylations of histone tails reflects and modulates chromatin structure and function. To provide a framework for the analysis of chromatin function in Caenorhabditis elegans, we generated a genome-wide map of histone H3 tail methylations. We find that C. elegans genes show distributions of histone modifications that are similar to those of other organisms, with H3K4me3 near transcription start sites, H3K36me3 in the body of genes and H3K9me3 enriched on silent genes. We also observe a novel pattern: exons are preferentially marked with H3K36me3 relative to introns. H3K36me3 exon marking is dependent on transcription and is found at lower levels in alternatively spliced exons, supporting a splicing-related marking mechanism. We further show that the difference in H3K36me3 marking between exons and introns is evolutionarily conserved in human and mouse. We propose that H3K36me3 exon marking in chromatin provides a dynamic link between transcription and splicing.
|
Research Support, Non-U.S. Gov't |
16 |
503 |
10
|
Down TA, Rakyan VK, Turner DJ, Flicek P, Li H, Kulesha E, Gräf S, Johnson N, Herrero J, Tomazou EM, Thorne NP, Bäckdahl L, Herberth M, Howe KL, Jackson DK, Miretti MM, Marioni JC, Birney E, Hubbard TJP, Durbin R, Tavaré S, Beck S. A Bayesian deconvolution strategy for immunoprecipitation-based DNA methylome analysis. Nat Biotechnol 2008; 26:779-85. [PMID: 18612301 DOI: 10.1038/nbt1414] [Citation(s) in RCA: 463] [Impact Index Per Article: 27.2] [Received: 04/04/2008] [Accepted: 05/15/2008] [Indexed: 12/31/2022]
Abstract
DNA methylation is an indispensable epigenetic modification required for regulating the expression of mammalian genomes. Immunoprecipitation-based methods for DNA methylome analysis are rapidly shifting the bottleneck in this field from data generation to data analysis, necessitating the development of better analytical tools. In particular, an inability to estimate absolute methylation levels remains a major analytical difficulty associated with immunoprecipitation-based DNA methylation profiling. To address this issue, we developed a cross-platform algorithm, Bayesian tool for methylation analysis (Batman), for analyzing methylated DNA immunoprecipitation (MeDIP) profiles generated using oligonucleotide arrays (MeDIP-chip) or next-generation sequencing (MeDIP-seq). We developed the latter approach to provide a high-resolution whole-genome DNA methylation profile (DNA methylome) of a mammalian genome. Strong correlation of our data, obtained using mature human spermatozoa, with those obtained using bisulfite sequencing suggests that combining MeDIP-seq or MeDIP-chip with Batman provides a robust, quantitative and cost-effective functional genomic strategy for elucidating the function of DNA methylation.
|
Research Support, Non-U.S. Gov't |
17 |
463 |
11
|
Flicek P, Aken BL, Beal K, Ballester B, Caccamo M, Chen Y, Clarke L, Coates G, Cunningham F, Cutts T, Down T, Dyer SC, Eyre T, Fitzgerald S, Fernandez-Banet J, Gräf S, Haider S, Hammond M, Holland R, Howe KL, Howe K, Johnson N, Jenkinson A, Kähäri A, Keefe D, Kokocinski F, Kulesha E, Lawson D, Longden I, Megy K, Meidl P, Overduin B, Parker A, Pritchard B, Prlic A, Rice S, Rios D, Schuster M, Sealy I, Slater G, Smedley D, Spudich G, Trevanion S, Vilella AJ, Vogel J, White S, Wood M, Birney E, Cox T, Curwen V, Durbin R, Fernandez-Suarez XM, Herrero J, Hubbard TJP, Kasprzyk A, Proctor G, Smith J, Ureta-Vidal A, Searle S. Ensembl 2008. Nucleic Acids Res 2007; 36:D707-14. [PMID: 18000006 PMCID: PMC2238821 DOI: 10.1093/nar/gkm988] [Citation(s) in RCA: 371] [Impact Index Per Article: 20.6] [Indexed: 12/21/2022]
Abstract
The Ensembl project (http://www.ensembl.org) is a comprehensive genome information system featuring an integrated set of genome annotation, databases and other information for chordate and selected model organism and disease vector genomes. As of release 47 (October 2007), Ensembl fully supports 35 species, with preliminary support for six additional species. New species in the past year include platypus and horse. Major additions and improvements to Ensembl since our previous report include extensive support for functional genomics data in the form of a specialized functional genomics database, genome-wide maps of protein–DNA interactions and the Ensembl regulatory build; support for customization of the Ensembl web interface through the addition of user accounts and user groups; and increased support for genome resequencing. We have also introduced new comparative genomics-based data mining options and report on the continued development of our software infrastructure.
|
Research Support, Non-U.S. Gov't |
18 |
371 |
12
|
Hubbard T, Andrews D, Caccamo M, Cameron G, Chen Y, Clamp M, Clarke L, Coates G, Cox T, Cunningham F, Curwen V, Cutts T, Down T, Durbin R, Fernandez-Suarez XM, Gilbert J, Hammond M, Herrero J, Hotz H, Howe K, Iyer V, Jekosch K, Kahari A, Kasprzyk A, Keefe D, Keenan S, Kokocinsci F, London D, Longden I, McVicker G, Melsopp C, Meidl P, Potter S, Proctor G, Rae M, Rios D, Schuster M, Searle S, Severin J, Slater G, Smedley D, Smith J, Spooner W, Stabenau A, Stalker J, Storey R, Trevanion S, Ureta-Vidal A, Vogel J, White S, Woodwark C, Birney E. Ensembl 2005. Nucleic Acids Res 2005; 33:D447-53. [PMID: 15608235 PMCID: PMC540092 DOI: 10.1093/nar/gki138] [Citation(s) in RCA: 341] [Impact Index Per Article: 17.1] [Received: 10/06/2004] [Revised: 11/01/2004] [Accepted: 11/01/2004] [Indexed: 11/17/2022]
Abstract
The Ensembl (http://www.ensembl.org/) project provides a comprehensive and integrated source of annotation of large genome sequences. Over the last year the number of genomes available from the Ensembl site has increased by 7 to 16, with the addition of the six vertebrate genomes of chimpanzee, dog, cow, chicken, tetraodon and frog and the insect genome of honeybee. The majority have been annotated automatically using the Ensembl gene build system, showing its flexibility to reliably annotate a wide variety of genomes. With the increased number of vertebrate genomes, the comparative analysis provided to users has been greatly improved, with new website interfaces allowing annotation of different genomes to be directly compared. The Ensembl software system is being increasingly widely reused in different projects showing the benefits of a completely open approach to software development and distribution.
|
research-article |
20 |
341 |
13
|
Birney E, Andrews D, Caccamo M, Chen Y, Clarke L, Coates G, Cox T, Cunningham F, Curwen V, Cutts T, Down T, Durbin R, Fernandez-Suarez XM, Flicek P, Gräf S, Hammond M, Herrero J, Howe K, Iyer V, Jekosch K, Kähäri A, Kasprzyk A, Keefe D, Kokocinski F, Kulesha E, London D, Longden I, Melsopp C, Meidl P, Overduin B, Parker A, Proctor G, Prlic A, Rae M, Rios D, Redmond S, Schuster M, Sealy I, Searle S, Severin J, Slater G, Smedley D, Smith J, Stabenau A, Stalker J, Trevanion S, Ureta-Vidal A, Vogel J, White S, Woodwark C, Hubbard TJP. Ensembl 2006. Nucleic Acids Res 2006; 34:D556-61. [PMID: 16381931 PMCID: PMC1347495 DOI: 10.1093/nar/gkj133] [Citation(s) in RCA: 323] [Impact Index Per Article: 17.0] [Indexed: 11/24/2022]
Abstract
The Ensembl project provides a comprehensive and integrated source of annotation of large genome sequences. Over the last year the number of genomes available from the Ensembl site has increased by 4 to 19, with the addition of the mammalian genomes of Rhesus macaque and Opossum, the chordate genome of Ciona intestinalis and the import and integration of the yeast genome. The year has also seen extensive improvements to both data analysis and presentation, with the introduction of a redesigned website, the addition of RNA gene and regulatory annotation and substantial improvements to the integration of human genome variation data.
|
Research Support, Non-U.S. Gov't |
19 |
323 |
14
|
Birney E, Andrews TD, Bevan P, Caccamo M, Chen Y, Clarke L, Coates G, Cuff J, Curwen V, Cutts T, Down T, Eyras E, Fernandez-Suarez XM, Gane P, Gibbins B, Gilbert J, Hammond M, Hotz HR, Iyer V, Jekosch K, Kahari A, Kasprzyk A, Keefe D, Keenan S, Lehvaslaiho H, McVicker G, Melsopp C, Meidl P, Mongin E, Pettett R, Potter S, Proctor G, Rae M, Searle S, Slater G, Smedley D, Smith J, Spooner W, Stabenau A, Stalker J, Storey R, Ureta-Vidal A, Woodwark KC, Cameron G, Durbin R, Cox A, Hubbard T, Clamp M. An overview of Ensembl. Genome Res 2004; 14:925-8. [PMID: 15078858 PMCID: PMC479121 DOI: 10.1101/gr.1860604] [Citation(s) in RCA: 314] [Impact Index Per Article: 15.0] [Indexed: 01/21/2023]
Abstract
Ensembl (http://www.ensembl.org/) is a bioinformatics project to organize biological information around the sequences of large genomes. It is a comprehensive source of stable automatic annotation of individual genomes, and of the synteny and orthology relationships between them. It is also a framework for integration of any biological data that can be mapped onto features derived from the genomic sequence. Ensembl is available as an interactive Web site, a set of flat files, and as a complete, portable open source software system for handling genomes. All data are provided without restriction, and code is freely available. Ensembl's aims are to continue to "widen" this biological integration to include other model organisms relevant to understanding human biology as they become available; to "deepen" this integration to provide an ever more seamless linkage between equivalent components in different species; and to provide further classification of functional elements in the genome that have been previously elusive.
|
Review |
21 |
314 |
15
|
Howe KL, Bolt BJ, Cain S, Chan J, Chen WJ, Davis P, Done J, Down T, Gao S, Grove C, Harris TW, Kishore R, Lee R, Lomax J, Li Y, Muller HM, Nakamura C, Nuin P, Paulini M, Raciti D, Schindelman G, Stanley E, Tuli MA, Van Auken K, Wang D, Wang X, Williams G, Wright A, Yook K, Berriman M, Kersey P, Schedl T, Stein L, Sternberg PW. WormBase 2016: expanding to enable helminth genomic research. Nucleic Acids Res 2015; 44:D774-80. [PMID: 26578572 PMCID: PMC4702863 DOI: 10.1093/nar/gkv1217] [Citation(s) in RCA: 289] [Impact Index Per Article: 28.9] [Received: 09/30/2015] [Accepted: 10/28/2015] [Indexed: 11/24/2022]
Abstract
WormBase (www.wormbase.org) is a central repository for research data on the biology, genetics and genomics of Caenorhabditis elegans and other nematodes. The project has evolved from its original remit to collect and integrate all data for a single species, and now extends to numerous nematodes, ranging from evolutionary comparators of C. elegans to parasitic species that threaten plant, animal and human health. Research activity using C. elegans as a model system is as vibrant as ever, and we have created new tools for community curation in response to the ever-increasing volume and complexity of data. To better allow users to navigate their way through these data, we have made a number of improvements to our main website, including new tools for browsing genomic features and ontology annotations. Finally, we have developed a new portal for parasitic worm genomes. WormBase ParaSite (parasite.wormbase.org) contains all publicly available nematode and platyhelminth annotated genome sequences, and is designed specifically to support helminth genomic research.
|
Research Support, Non-U.S. Gov't |
10 |
289 |
16
|
Rakyan VK, Beyan H, Down TA, Hawa MI, Maslau S, Aden D, Daunay A, Busato F, Mein CA, Manfras B, Dias KRM, Bell CG, Tost J, Boehm BO, Beck S, Leslie RD. Identification of type 1 diabetes-associated DNA methylation variable positions that precede disease diagnosis. PLoS Genet 2011; 7:e1002300. [PMID: 21980303 PMCID: PMC3183089 DOI: 10.1371/journal.pgen.1002300] [Citation(s) in RCA: 236] [Impact Index Per Article: 16.9] [Received: 07/28/2010] [Accepted: 08/03/2011] [Indexed: 12/24/2022]
Abstract
Monozygotic (MZ) twin pair discordance for childhood-onset Type 1 Diabetes (T1D) is ∼50%, implicating roles for genetic and non-genetic factors in the aetiology of this complex autoimmune disease. Although significant progress has been made in elucidating the genetics of T1D in recent years, the non-genetic component has remained poorly defined. We hypothesized that epigenetic variation could underlie some of the non-genetic component of T1D aetiology and, thus, performed an epigenome-wide association study (EWAS) for this disease. We generated genome-wide DNA methylation profiles of purified CD14+ monocytes (an immune effector cell type relevant to T1D pathogenesis) from 15 T1D–discordant MZ twin pairs. This identified 132 different CpG sites at which the direction of the intra-MZ pair DNA methylation difference significantly correlated with the diabetic state, i.e. T1D–associated methylation variable positions (T1D–MVPs). We confirmed these T1D–MVPs display statistically significant intra-MZ pair DNA methylation differences in the expected direction in an independent set of T1D–discordant MZ pairs (P = 0.035). Then, to establish the temporal origins of the T1D–MVPs, we generated two further genome-wide datasets and established that, when compared with controls, T1D–MVPs are enriched in singletons both before (P = 0.001) and at (P = 0.015) disease diagnosis, and also in singletons positive for diabetes-associated autoantibodies but disease-free even after 12 years follow-up (P = 0.0023). Combined, these results suggest that T1D–MVPs arise very early in the etiological process that leads to overt T1D. Our EWAS of T1D represents an important contribution toward understanding the etiological role of epigenetic variation in type 1 diabetes, and it is also the first systematic analysis of the temporal origins of disease-associated epigenetic variation for any human complex disease. 
Type 1 diabetes (T1D) is a complex autoimmune disease affecting >30 million people worldwide. It is caused by a combination of genetic and non-genetic factors, leading to destruction of insulin-secreting cells. Although significant progress has recently been made in elucidating the genetics of T1D, the non-genetic component has remained poorly defined. Epigenetic modifications, such as methylation of DNA, are indispensable for genomic processes such as transcriptional regulation and are frequently perturbed in human disease. We therefore hypothesized that epigenetic variation could underlie some of the non-genetic component of T1D aetiology, and we performed a genome-wide DNA methylation analysis of a specific subset of immune cells (monocytes) from monozygotic twins discordant for T1D. This revealed the presence of T1D–specific methylation variable positions (T1D–MVPs) in the T1D–affected co-twins. Since these T1D–MVPs were found in MZ twins, they cannot be due to genetic differences. Additional experiments revealed that some of these T1D–MVPs are found in individuals before T1D diagnosis, suggesting they arise very early in the process that leads to overt T1D and are not simply due to post-disease associated factors (e.g. medication or long-term metabolic changes). T1D–MVPs may thus potentially represent a previously unappreciated, and important, component of type 1 diabetes risk.
|
Twin Study |
14 |
236 |
17
|
Down TA, Hubbard TJP. Computational detection and location of transcription start sites in mammalian genomic DNA. Genome Res 2002; 12:458-61. [PMID: 11875034 PMCID: PMC155284 DOI: 10.1101/gr.216102] [Citation(s) in RCA: 217] [Impact Index Per Article: 9.4] [Indexed: 11/24/2022]
Abstract
Transcription, the process whereby RNA copies are made from sections of the DNA genome, is directed by promoter regions. These define the transcription start site, and also the set of cellular conditions under which the promoter is active. At least in more complex species, it appears to be common for genes to have several different transcription start sites, which may be active under different conditions. Eukaryotic promoters are complex and fairly diffuse structures, which have proven hard to detect in silico. We show that a novel hybrid machine-learning method is able to build useful models of promoters for >50% of human transcription start sites. We estimate specificity to be >70%, and demonstrate good positional accuracy. Based on the structure of our learned models, we conclude that a signal resembling the well known TATA box, together with flanking regions of C-G enrichment, are the most important sequence-based signals marking sites of transcriptional initiation at a large class of typical promoters.
|
research-article |
23 |
217 |
18
|
Movassagh M, Choy MK, Knowles DA, Cordeddu L, Haider S, Down T, Siggens L, Vujic A, Simeoni I, Penkett C, Goddard M, Lio P, Bennett MR, Foo RSY. Distinct epigenomic features in end-stage failing human hearts. Circulation 2011; 124:2411-22. [PMID: 22025602 DOI: 10.1161/circulationaha.111.040071] [Citation(s) in RCA: 203] [Impact Index Per Article: 14.5] [Indexed: 01/10/2023]
Abstract
BACKGROUND The epigenome refers to marks on the genome, including DNA methylation and histone modifications, that regulate the expression of underlying genes. A consistent profile of gene expression changes in end-stage cardiomyopathy led us to hypothesize that distinct global patterns of the epigenome may also exist. METHODS AND RESULTS We constructed genome-wide maps of DNA methylation and histone-3 lysine-36 trimethylation (H3K36me3) enrichment for cardiomyopathic and normal human hearts. More than 506 Mb sequences per library were generated by high-throughput sequencing, allowing us to assign methylation scores to ≈28 million CG dinucleotides in the human genome. DNA methylation was significantly different in promoter CpG islands, intragenic CpG islands, gene bodies, and H3K36me3-enriched regions of the genome. DNA methylation differences were present in promoters of upregulated genes but not downregulated genes. H3K36me3 enrichment itself was also significantly different in coding regions of the genome. Specifically, abundance of RNA transcripts encoded by the DUX4 locus correlated to differential DNA methylation and H3K36me3 enrichment. In vitro, Dux gene expression was responsive to a specific inhibitor of DNA methyltransferase, and Dux siRNA knockdown led to reduced cell viability. CONCLUSIONS Distinct epigenomic patterns exist in important DNA elements of the cardiac genome in human end-stage cardiomyopathy. The epigenome may control the expression of local or distal genes with critical functions in myocardial stress response. If epigenomic patterns track with disease progression, assays for the epigenome may be useful for assessing prognosis in heart failure. Further studies are needed to determine whether and how the epigenome contributes to the development of cardiomyopathy.
|
Research Support, Non-U.S. Gov't |
14 |
203 |
19
|
Novakovic B, Sibson M, Ng HK, Manuelpillai U, Rakyan V, Down T, Beck S, Fournier T, Evain-Brion D, Dimitriadis E, Craig JM, Morley R, Saffery R. Placenta-specific methylation of the vitamin D 24-hydroxylase gene: implications for feedback autoregulation of active vitamin D levels at the fetomaternal interface. J Biol Chem 2009; 284:14838-48. [PMID: 19237542 PMCID: PMC2685665 DOI: 10.1074/jbc.m809542200] [Citation(s) in RCA: 183] [Impact Index Per Article: 11.4] [Received: 12/19/2008] [Revised: 02/20/2009] [Indexed: 11/21/2022] Open
Abstract
Plasma concentrations of biologically active vitamin D (1,25-(OH)₂D) are tightly controlled via feedback regulation of renal 1α-hydroxylase (CYP27B1; positive) and 24-hydroxylase (CYP24A1; catabolic) enzymes. In pregnancy, this regulation is uncoupled, and 1,25-(OH)₂D levels are significantly elevated, suggesting a role in pregnancy progression. Epigenetic regulation of CYP27B1 and CYP24A1 has previously been described in cell and animal models, and despite emerging evidence for a critical role of epigenetics in placentation generally, little is known about the regulation of enzymes modulating vitamin D homeostasis at the fetomaternal interface. In this study, we investigated the methylation status of genes regulating vitamin D bioavailability and activity in the placenta. No methylation of the VDR (vitamin D receptor) and CYP27B1 genes was found in any placental tissues. In contrast, the CYP24A1 gene is methylated in human placenta, purified cytotrophoblasts, and primary and cultured chorionic villus sampling tissue. No methylation was detected in any somatic human tissue tested. Methylation was also evident in marmoset and mouse placental tissue. All three genes were hypermethylated in choriocarcinoma cell lines, highlighting the role of vitamin D deregulation in this cancer. Gene expression analysis confirmed a reduced capacity for CYP24A1 induction with promoter methylation in primary cells and in vitro reporter analysis demonstrated that promoter methylation directly down-regulates basal promoter activity and abolishes vitamin D-mediated feedback activation. This study strongly suggests that epigenetic decoupling of vitamin D feedback catabolism plays an important role in maximizing active vitamin D bioavailability at the fetomaternal interface.
|
research-article |
16 |
183 |
20
|
Clamp M, Andrews D, Barker D, Bevan P, Cameron G, Chen Y, Clark L, Cox T, Cuff J, Curwen V, Down T, Durbin R, Eyras E, Gilbert J, Hammond M, Hubbard T, Kasprzyk A, Keefe D, Lehvaslaiho H, Iyer V, Melsopp C, Mongin E, Pettett R, Potter S, Rust A, Schmidt E, Searle S, Slater G, Smith J, Spooner W, Stabenau A, Stalker J, Stupka E, Ureta-Vidal A, Vastrik I, Birney E. Ensembl 2002: accommodating comparative genomics. Nucleic Acids Res 2003; 31:38-42. [PMID: 12519943 PMCID: PMC165530 DOI: 10.1093/nar/gkg083] [Citation(s) in RCA: 180] [Impact Index Per Article: 8.2] [Indexed: 11/13/2022] Open
Abstract
The Ensembl (http://www.ensembl.org/) database project provides a bioinformatics framework to organise biology around the sequences of large genomes. It is a comprehensive source of stable automatic annotation of human, mouse and other genome sequences, available as either an interactive web site or as flat files. Ensembl also integrates manually annotated gene structures from external sources where available. As well as being one of the leading sources of genome annotation, Ensembl is an open source software engineering project to develop a portable system able to handle very large genomes and associated requirements. These range from sequence analysis to data storage and visualisation, and installations exist around the world both in companies and at academic sites. With both human and mouse genome sequences available and more vertebrate sequences to follow, many of the recent developments in Ensembl have focused on developing automatic comparative genome analysis and visualisation.
|
research-article |
22 |
180 |
21
|
Bell CG, Finer S, Lindgren CM, Wilson GA, Rakyan VK, Teschendorff AE, Akan P, Stupka E, Down TA, Prokopenko I, Morison IM, Mill J, Pidsley R, Deloukas P, Frayling TM, Hattersley AT, McCarthy MI, Beck S, Hitman GA. Integrated genetic and epigenetic analysis identifies haplotype-specific methylation in the FTO type 2 diabetes and obesity susceptibility locus. PLoS One 2010; 5:e14040. [PMID: 21124985 PMCID: PMC2987816 DOI: 10.1371/journal.pone.0014040] [Citation(s) in RCA: 177] [Impact Index Per Article: 11.8] [Received: 05/18/2010] [Accepted: 10/27/2010] [Indexed: 01/04/2023] Open
Abstract
Recent multi-dimensional approaches to the study of complex disease have revealed powerful insights into how genetic and epigenetic factors may underlie their aetiopathogenesis. We examined genotype-epigenotype interactions in the context of Type 2 Diabetes (T2D), focussing on known regions of genomic susceptibility. We assayed DNA methylation in 60 females, stratified according to disease susceptibility haplotype using previously identified association loci. CpG methylation was assessed using methylated DNA immunoprecipitation on a targeted array (MeDIP-chip) and absolute methylation values were estimated using a Bayesian algorithm (BATMAN). Absolute methylation levels were quantified across LD blocks, and we identified increased DNA methylation on the FTO obesity susceptibility haplotype, tagged by the rs8050136 risk allele A (p = 9.40×10⁻⁴, permutation p = 1.0×10⁻³). Further analysis across the 46 kb LD block using sliding windows localised the most significant difference to be within a 7.7 kb region (p = 1.13×10⁻⁷). Sequence level analysis, followed by pyrosequencing validation, revealed that the methylation difference was driven by the co-ordinated phase of CpG-creating SNPs across the risk haplotype. This 7.7 kb region of haplotype-specific methylation (HSM) encapsulates a Highly Conserved Non-Coding Element (HCNE) that has previously been validated as a long-range enhancer, supported by the histone H3K4me1 enhancer signature. This study demonstrates that integration of Genome-Wide Association (GWA) SNP and epigenomic DNA methylation data can identify potential novel genotype-epigenotype interactions within disease-associated loci, thus providing a novel route to aid unravelling common complex diseases.
|
Research Support, Non-U.S. Gov't |
15 |
177 |
22
|
Holland RCG, Down TA, Pocock M, Prlić A, Huen D, James K, Foisy S, Dräger A, Yates A, Heuer M, Schreiber MJ. BioJava: an open-source framework for bioinformatics. Bioinformatics 2008; 24:2096-7. [PMID: 18689808 PMCID: PMC2530884 DOI: 10.1093/bioinformatics/btn397] [Citation(s) in RCA: 168] [Impact Index Per Article: 9.9] [Indexed: 11/18/2022] Open
Abstract
Summary: BioJava is a mature open-source project that provides a framework for processing biological data. BioJava contains powerful analysis and statistical routines, tools for parsing common file formats and packages for manipulating sequences and 3D structures. It enables rapid bioinformatics application development in the Java programming language. Availability: BioJava is an open-source project distributed under the Lesser GPL (LGPL). BioJava can be downloaded from the BioJava website (http://www.biojava.org). BioJava requires Java 1.5 or higher. Contact: andreas.prlic@gmail.com. All queries should be directed to the BioJava mailing lists. Details are available at http://biojava.org/wiki/BioJava:MailingLists.
|
Research Support, Non-U.S. Gov't |
17 |
168 |
23
|
Birney E, Andrews D, Bevan P, Caccamo M, Cameron G, Chen Y, Clarke L, Coates G, Cox T, Cuff J, Curwen V, Cutts T, Down T, Durbin R, Eyras E, Fernandez-Suarez XM, Gane P, Gibbins B, Gilbert J, Hammond M, Hotz H, Iyer V, Kahari A, Jekosch K, Kasprzyk A, Keefe D, Keenan S, Lehvaslaiho H, McVicker G, Melsopp C, Meidl P, Mongin E, Pettett R, Potter S, Proctor G, Rae M, Searle S, Slater G, Smedley D, Smith J, Spooner W, Stabenau A, Stalker J, Storey R, Ureta-Vidal A, Woodwark C, Clamp M, Hubbard T. Ensembl 2004. Nucleic Acids Res 2004; 32:D468-70. [PMID: 14681459 PMCID: PMC308772 DOI: 10.1093/nar/gkh038] [Citation(s) in RCA: 143] [Impact Index Per Article: 6.8] [Indexed: 11/13/2022] Open
Abstract
The Ensembl (http://www.ensembl.org/) database project provides a bioinformatics framework to organize biology around the sequences of large genomes. It is a comprehensive and integrated source of annotation of large genome sequences, available via interactive website, web services or flat files. As well as being one of the leading sources of genome annotation, Ensembl is an open source software engineering project to develop a portable system able to handle very large genomes and associated requirements. The facilities of the system range from sequence analysis to data storage and visualization and installations exist around the world both in companies and at academic sites. With a total of nine genome sequences available from Ensembl and more genomes to follow, recent developments have focused mainly on closer integration between genomes and external data.
|
Research Support, U.S. Gov't, P.H.S. |
21 |
143 |
24
|
Cheung MS, Down TA, Latorre I, Ahringer J. Systematic bias in high-throughput sequencing data and its correction by BEADS. Nucleic Acids Res 2011; 39:e103. [PMID: 21646344 PMCID: PMC3159482 DOI: 10.1093/nar/gkr425] [Citation(s) in RCA: 108] [Impact Index Per Article: 7.7] [Indexed: 02/01/2023] Open
Abstract
Genomic sequences obtained through high-throughput sequencing are not uniformly distributed across the genome. For example, sequencing data of total genomic DNA show significant, yet unexpected enrichments on promoters and exons. This systematic bias is a particular problem for techniques such as chromatin immunoprecipitation, where the signal for a target factor is plotted across genomic features. We have focused on data obtained from Illumina’s Genome Analyser platform, where at least three factors contribute to sequence bias: GC content, mappability of sequencing reads, and regional biases that might be generated by local structure. We show that relying on input control as a normalizer is not generally appropriate due to sample to sample variation in bias. To correct sequence bias, we present BEADS (bias elimination algorithm for deep sequencing), a simple three-step normalization scheme that successfully unmasks real binding patterns in ChIP-seq data. We suggest that this procedure be done routinely prior to data interpretation and downstream analyses.
|
Research Support, Non-U.S. Gov't |
14 |
108 |
25
|
Häsler R, Feng Z, Bäckdahl L, Spehlmann ME, Franke A, Teschendorff A, Rakyan VK, Down TA, Wilson GA, Feber A, Beck S, Schreiber S, Rosenstiel P. A functional methylome map of ulcerative colitis. Genome Res 2012; 22:2130-7. [PMID: 22826509 PMCID: PMC3483542 DOI: 10.1101/gr.138347.112] [Citation(s) in RCA: 101] [Impact Index Per Article: 7.8] [Indexed: 12/26/2022]
Abstract
The etiology of inflammatory bowel diseases is only partially explained by the current genetic risk map. It is hypothesized that environmental factors modulate the epigenetic landscape and thus contribute to disease susceptibility, manifestation, and progression. To test this, we analyzed DNA methylation (DNAm), a fundamental mechanism of epigenetic long-term modulation of gene expression. We report a three-layer epigenome-wide association study (EWAS) using intestinal biopsies from 10 monozygotic twin pairs (n = 20 individuals) discordant for manifestation of ulcerative colitis (UC). Genome-wide expression scans were generated using Affymetrix UG 133 Plus 2.0 arrays (layer 1). Genome-wide DNAm scans were carried out using Illumina 27k Infinium Bead Arrays to identify methylation variable positions (MVPs, layer 2), and MeDIP-chip on Nimblegen custom 385k Tiling Arrays to identify differentially methylated regions (DMRs, layer 3). Identified MVPs and DMRs were validated in two independent patient populations by quantitative real-time PCR and bisulfite-pyrosequencing (n = 185). The EWAS identified 61 disease-associated loci harboring differential DNAm in cis of a differentially expressed transcript. All constitute novel candidate risk loci for UC not previously identified by GWAS. Among them are several that have been functionally implicated in inflammatory processes, e.g., complement factor CFI, the serine protease inhibitor SPINK4, and the adhesion molecule THY1 (also known as CD90). Our study design excludes nondisease inflammation as a cause of the identified changes in DNAm. This study represents the first replicated EWAS of UC integrated with transcriptional signatures in the affected tissue and demonstrates the power of EWAS to uncover unexplained disease risk and molecular events of disease manifestation.
|
Twin Study |
13 |
101 |