Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

Journal Articles

Rank	Citation Analysis	Article Type	Number of Years	Citation(s) in RCA
1	Carninci P, Kasukawa T, Katayama S, Gough J, Frith MC, Maeda N, Oyama R, Ravasi T, Lenhard B, Wells C, Kodzius R, Shimokawa K, Bajic VB, Brenner SE, Batalov S, Forrest ARR, Zavolan M, Davis MJ, Wilming LG, Aidinis V, Allen JE, Ambesi-Impiombato A, Apweiler R, Aturaliya RN, Bailey TL, Bansal M, Baxter L, Beisel KW, Bersano T, Bono H, Chalk AM, Chiu KP, Choudhary V, Christoffels A, Clutterbuck DR, Crowe ML, Dalla E, Dalrymple BP, de Bono B, Della Gatta G, di Bernardo D, Down T, Engstrom P, Fagiolini M, Faulkner G, Fletcher CF, Fukushima T, Furuno M, Futaki S, Gariboldi M, Georgii-Hemming P, Gingeras TR, Gojobori T, Green RE, Gustincich S, Harbers M, Hayashi Y, Hensch TK, Hirokawa N, Hill D, Huminiecki L, Iacono M, Ikeo K, Iwama A, Ishikawa T, Jakt M, Kanapin A, Katoh M, Kawasawa Y, Kelso J, Kitamura H, Kitano H, Kollias G, Krishnan SPT,
Kruger A, Kummerfeld SK, Kurochkin IV, Lareau LF, Lazarevic D, Lipovich L, Liu J, Liuni S, McWilliam S, Madan Babu M, Madera M, Marchionni L, Matsuda H, Matsuzawa S, Miki H, Mignone F, Miyake S, Morris K, Mottagui-Tabar S, Mulder N, Nakano N, Nakauchi H, Ng P, Nilsson R, Nishiguchi S, Nishikawa S, Nori F, Ohara O, Okazaki Y, Orlando V, Pang KC, Pavan WJ, Pavesi G, Pesole G, Petrovsky N, Piazza S, Reed J, Reid JF, Ring BZ, Ringwald M, Rost B, Ruan Y, Salzberg SL, Sandelin A, Schneider C, Schönbach C, Sekiguchi K, Semple CAM, Seno S, Sessa L, Sheng Y, Shibata Y, Shimada H, Shimada K, Silva D, Sinclair B, Sperling S, Stupka E, Sugiura K, Sultana R, Takenaka Y, Taki K, Tammoja K, Tan SL, Tang S, Taylor MS, Tegner J, Teichmann SA, Ueda HR, van Nimwegen E, Verardo R, Wei CL, Yagi K, Yamanishi H, Zabarovsky E, Zhu S, Zimmer A, Hide W, Bult C, Grimmond SM, Teasdale RD, Liu ET, Brusic V, Quackenbush J, Wahlestedt C, Mattick JS, Hume DA, Kai C, Sasaki D, Tomaru Y, Fukuda S, Kanamori-Katayama M, Suzuki M, Aoki J, Arakawa T, Iida J, Imamura K, Itoh M, Kato T, Kawaji H, Kawagashira N, Kawashima T, Kojima M, Kondo S, Konno H, Nakano K, Ninomiya N, Nishio T, Okada M, Plessy C, Shibata K, Shiraki T, Suzuki S, Tagami M, Waki K, Watahiki A, Okamura-Oho Y, Suzuki H, Kawai J, Hayashizaki Y. The transcriptional landscape of the mammalian genome. Science 2005;309:1559-63. [PMID: 16141072 DOI: 10.1126/science.1112014] [Citation(s) in RCA: 2666] [Impact Index Per Article: 133.3] [Indexed: 12/29/2022] Abstract: This study describes comprehensive polling of transcription start and termination sites and analysis of previously unidentified full-length complementary DNAs derived from the mouse genome.
We identify the 5' and 3' boundaries of 181,047 transcripts with extensive variation in transcripts arising from alternative promoter usage, splicing, and polyadenylation. There are 16,247 new mouse protein-coding transcripts, including 5154 encoding previously unidentified proteins. Genomic mapping of the transcriptome reveals transcriptional forests, with overlapping transcription on both strands, separated by deserts in which few transcripts are observed. The data provide a comprehensive platform for the comparative analysis of mammalian transcriptional regulation in differentiation and development. MeSH Headings: 3' Untranslated Regions; Animals; Base Sequence; Conserved Sequence; DNA, Complementary/chemistry; Genome; Genome, Human; Genomics; Humans; Mice/genetics; Promoter Regions, Genetic; Proteins/genetics; RNA/chemistry; RNA/classification; RNA Splicing; RNA, Untranslated/chemistry; Regulatory Sequences, Ribonucleic Acid; Terminator Regions, Genetic; Transcription Initiation Site; Transcription, Genetic. Grants: TGM03P17 (Telethon); TGM06S01 (Telethon).	Research Support, Non-U.S. Gov't	20	2666
2	Futreal PA, Coin L, Marshall M, Down T, Hubbard T, Wooster R, Rahman N, Stratton MR. A census of human cancer genes. Nat Rev Cancer 2004;4:177-83. [PMID: 14993899 PMCID: PMC2665285 DOI: 10.1038/nrc1299] [Citation(s) in RCA: 2284] [Impact Index Per Article: 108.8] [Indexed: 02/06/2023] Abstract: A central aim of cancer research has been to identify the mutated genes that are causally implicated in oncogenesis (‘cancer genes’). After two decades of searching, how many have been identified and how do they compare to the complete gene set that has been revealed by the human genome sequence? We have conducted a ‘census’ of cancer genes that indicates that mutations in more than 1% of genes contribute to human cancer. The census illustrates striking features in the types of sequence alteration, cancer classes in which oncogenic mutations have been identified and protein domains that are encoded by cancer genes. MeSH Headings: Genes/genetics; Genome, Human; Humans; Mutation; Neoplasms/genetics; Oncogenes/genetics.	Review	21	2284
3	Hubbard T, Barker D, Birney E, Cameron G, Chen Y, Clark L, Cox T, Cuff J, Curwen V, Down T, Durbin R, Eyras E, Gilbert J, Hammond M, Huminiecki L, Kasprzyk A, Lehvaslaiho H, Lijnzaad P, Melsopp C, Mongin E, Pettett R, Pocock M, Potter S, Rust A, Schmidt E, Searle S, Slater G, Smith J, Spooner W, Stabenau A, Stalker J, Stupka E, Ureta-Vidal A, Vastrik I, Clamp M. The Ensembl genome database project. Nucleic Acids Res 2002;30:38-41. [PMID: 11752248 PMCID: PMC99161 DOI: 10.1093/nar/30.1.38] [Citation(s) in RCA: 1096] [Impact Index Per Article: 47.7] [Indexed: 02/07/2023] Abstract: The Ensembl (http://www.ensembl.org/) database project provides a bioinformatics framework to organise biology around the sequences of large genomes. It is a comprehensive source of stable automatic annotation of the human genome sequence, with confirmed gene predictions that have been integrated with external data sources, and is available as either an interactive web site or as flat files. It is also an open source software engineering project to develop a portable system able to handle very large genomes and associated requirements from sequence analysis to data storage and visualisation. The Ensembl site is one of the leading sources of human genome sequence annotation and provided much of the analysis for publication by the international human genome project of the draft genome. The Ensembl system is being installed around the world in both companies and academic sites on machines ranging from supercomputers to laptops. MeSH Headings: Computational Biology; Database Management Systems; Databases, Genetic; Genome, Human; Humans; Information Storage and Retrieval; Internet; Sequence Analysis, DNA; Systems Integration. Grants: Wellcome Trust.	research-article	23	1096
4	Eckhardt F, Lewin J, Cortese R, Rakyan VK, Attwood J, Burger M, Burton J, Cox TV, Davies R, Down TA, Haefliger C, Horton R, Howe K, Jackson DK, Kunde J, Koenig C, Liddle J, Niblett D, Otto T, Pettett R, Seemann S, Thompson C, West T, Rogers J, Olek A, Berlin K, Beck S. DNA methylation profiling of human chromosomes 6, 20 and 22. Nat Genet 2006;38:1378-85. [PMID: 17072317 PMCID: PMC3082778 DOI: 10.1038/ng1909] [Citation(s) in RCA: 945] [Impact Index Per Article: 49.7] [Received: 04/26/2006] [Accepted: 09/18/2006] [Indexed: 12/17/2022] Abstract: DNA methylation is the most stable type of epigenetic modification modulating the transcriptional plasticity of mammalian genomes. Using bisulfite DNA sequencing, we report high-resolution methylation profiles of human chromosomes 6, 20 and 22, providing a resource of about 1.9 million CpG methylation values derived from 12 different tissues. Analysis of six annotation categories showed that evolutionarily conserved regions are the predominant sites for differential DNA methylation and that a core region surrounding the transcriptional start site is an informative surrogate for promoter methylation. We find that 17% of the 873 analyzed genes are differentially methylated in their 5' UTRs and that about one-third of the differentially methylated 5' UTRs are inversely correlated with transcription. Despite the fact that our study controlled for factors reported to affect DNA methylation such as sex and age, we did not find any significant attributable effects. Our data suggest DNA methylation to be ontogenetically more stable than previously thought.
MeSH Headings: 5' Untranslated Regions; Adult; Age Factors; Aged; Animals; Chromosomes, Human, Pair 20/genetics; Chromosomes, Human, Pair 20/metabolism; Chromosomes, Human, Pair 22/genetics; Chromosomes, Human, Pair 22/metabolism; Chromosomes, Human, Pair 6/genetics; Chromosomes, Human, Pair 6/metabolism; CpG Islands; DNA Methylation; Epigenesis, Genetic; Evolution, Molecular; Female; Humans; Male; Mice; Middle Aged; Organ Specificity; Promoter Regions, Genetic; Sex Characteristics; Species Specificity; Transcription, Genetic. Grants: Wellcome Trust; 084071 (Wellcome Trust).	Comparative Study	19	945
5	Rakyan VK, Down TA, Balding DJ, Beck S. Epigenome-wide association studies for common human diseases. Nat Rev Genet 2011;12:529-41. [PMID: 21747404 PMCID: PMC3508712 DOI: 10.1038/nrg3000] [Citation(s) in RCA: 899] [Impact Index Per Article: 64.2] [Indexed: 02/06/2023] Abstract: Despite the success of genome-wide association studies (GWASs) in identifying loci associated with common diseases, a substantial proportion of the causality remains unexplained. Recent advances in genomic technologies have placed us in a position to initiate large-scale studies of human disease-associated epigenetic variation, specifically variation in DNA methylation. Such epigenome-wide association studies (EWASs) present novel opportunities but also create new challenges that are not encountered in GWASs. We discuss EWAS design, cohort and sample selections, statistical significance and power, confounding factors and follow-up studies. We also discuss how integration of EWASs with GWASs can help to dissect complex GWAS haplotypes for functional analysis. Key Words: epigenomics; disease genetics; DNA methylation; epigenetics; quantitative trait. MeSH Headings: Biomarkers; DNA Methylation; Epigenomics/methods; Gene Expression Profiling; Genetic Predisposition to Disease; Genetic Variation; Genome, Human; Genome-Wide Association Study/methods; Haplotypes; Humans; Oligonucleotide Array Sequence Analysis; Sequence Analysis, DNA. Grants: Wellcome Trust; 084071 (Wellcome Trust).	Review	14	899
6	Hubbard TJP, Aken BL, Beal K, Ballester B, Caccamo M, Chen Y, Clarke L, Coates G, Cunningham F, Cutts T, Down T, Dyer SC, Fitzgerald S, Fernandez-Banet J, Graf S, Haider S, Hammond M, Herrero J, Holland R, Howe K, Howe K, Johnson N, Kahari A, Keefe D, Kokocinski F, Kulesha E, Lawson D, Longden I, Melsopp C, Megy K, Meidl P, Ouverdin B, Parker A, Prlic A, Rice S, Rios D, Schuster M, Sealy I, Severin J, Slater G, Smedley D, Spudich G, Trevanion S, Vilella A, Vogel J, White S, Wood M, Cox T, Curwen V, Durbin R, Fernandez-Suarez XM, Flicek P, Kasprzyk A, Proctor G, Searle S, Smith J, Ureta-Vidal A, Birney E. Ensembl 2007. Nucleic Acids Res 2006;35:D610-7. [PMID: 17148474 PMCID: PMC1761443 DOI: 10.1093/nar/gkl996] [Citation(s) in RCA: 657] [Impact Index Per Article: 34.6] [Indexed: 11/13/2022] Abstract: The Ensembl (http://www.ensembl.org/) project provides a comprehensive and integrated source of annotation of chordate genome sequences. Over the past year the number of genomes available from Ensembl has increased from 15 to 33, with the addition of sites for the mammalian genomes of elephant, rabbit, armadillo, tenrec, platypus, pig, cat, bush baby, common shrew, microbat and European hedgehog; the fish genomes of stickleback and medaka and the second example of the genomes of the sea squirt (Ciona savignyi) and the mosquito (Aedes aegypti). Some of the major features added during the year include the first complete gene sets for genomes with low-sequence coverage, the introduction of new strain variation data and the introduction of new orthology/paralog annotations based on gene trees.
MeSH Headings: Animals; Base Sequence; Databases, Nucleic Acid/standards; Genetic Variation; Genome, Human; Genomics; Humans; Internet; Mice; Proteins/genetics; Reference Standards; Sequence Alignment; Systems Integration; User-Computer Interface. Grants: BBS/B/13462 (Biotechnology and Biological Sciences Research Council); BBS/B/13446 (Biotechnology and Biological Sciences Research Council); BB/E010768/1 (Biotechnology and Biological Sciences Research Council); BBS/B/13470 (Biotechnology and Biological Sciences Research Council); BBS/B/13438 (Biotechnology and Biological Sciences Research Council); Wellcome Trust; 062023 (Wellcome Trust); BB/E011640/1 (Biotechnology and Biological Sciences Research Council).	Research Support, Non-U.S. Gov't	19	657
7	Rakyan VK, Down TA, Maslau S, Andrew T, Yang TP, Beyan H, Whittaker P, McCann OT, Finer S, Valdes AM, Leslie RD, Deloukas P, Spector TD. Human aging-associated DNA hypermethylation occurs preferentially at bivalent chromatin domains. Genome Res 2010;20:434-9. [PMID: 20219945 DOI: 10.1101/gr.103101.109] [Citation(s) in RCA: 562] [Impact Index Per Article: 37.5] [Indexed: 01/08/2023] Abstract: There is a growing realization that some aging-associated phenotypes/diseases have an epigenetic basis. Here, we report the first genome-scale study of epigenomic dynamics during normal human aging. We identify aging-associated differentially methylated regions (aDMRs) in whole blood in a discovery cohort, and then replicate these aDMRs in sorted CD4(+) T-cells and CD14(+) monocytes in an independent cohort, suggesting that aDMRs occur in precursor haematopoietic cells. Further replication of the aDMRs in buccal cells, representing a tissue that originates from a different germ layer compared with blood, demonstrates that the aDMR signature is a multitissue phenomenon. Moreover, we demonstrate that aging-associated DNA hypermethylation occurs predominantly at bivalent chromatin domain promoters. This same category of promoters, associated with key developmental genes, is frequently hypermethylated in cancers and in vitro cell culture, pointing to a novel mechanistic link between aberrant hypermethylation in cancer, aging, and cell culture.	Research Support, Non-U.S. Gov't	15	562
8	Hackett JA, Sengupta R, Zylicz JJ, Murakami K, Lee C, Down TA, Surani MA. Germline DNA demethylation dynamics and imprint erasure through 5-hydroxymethylcytosine. Science 2013;339:448-52. [PMID: 23223451 PMCID: PMC3847602 DOI: 10.1126/science.1229277] [Citation(s) in RCA: 545] [Impact Index Per Article: 45.4] [Indexed: 12/31/2022] Abstract: Mouse primordial germ cells (PGCs) undergo sequential epigenetic changes and genome-wide DNA demethylation to reset the epigenome for totipotency. Here, we demonstrate that erasure of CpG methylation (5mC) in PGCs occurs via conversion to 5-hydroxymethylcytosine (5hmC), driven by high levels of TET1 and TET2. Global conversion to 5hmC initiates asynchronously among PGCs at embryonic day (E) 9.5 to E10.5 and accounts for the unique process of imprint erasure. Mechanistically, 5hmC enrichment is followed by its protracted decline thereafter at a rate consistent with replication-coupled dilution. The conversion to 5hmC is an important component of parallel redundant systems that drive comprehensive reprogramming in PGCs. Nonetheless, we identify rare regulatory elements that escape systematic DNA demethylation in PGCs, providing a potential mechanistic basis for transgenerational epigenetic inheritance.
MeSH Headings: 5-Methylcytosine/metabolism; Animals; CpG Islands; Cytosine/analogs & derivatives; Cytosine/metabolism; DNA Methylation; DNA-Binding Proteins/genetics; DNA-Binding Proteins/metabolism; Dioxygenases; Embryo, Mammalian/metabolism; Embryonic Development; Epigenesis, Genetic; Female; Genomic Imprinting; Germ Cells/metabolism; Germ Layers/cytology; Male; Mice; Promoter Regions, Genetic; Proto-Oncogene Proteins/genetics; Proto-Oncogene Proteins/metabolism; RNA-Binding Proteins/genetics. Grants: 079249 (Wellcome Trust); 083563 (Wellcome Trust); 092096 (Wellcome Trust); 083089 (Wellcome Trust); RG49135 (Wellcome Trust); RG44593 (Wellcome Trust); Wellcome Trust.	research-article	12	545
9	Kolasinska-Zwierz P, Down T, Latorre I, Liu T, Liu XS, Ahringer J. Differential chromatin marking of introns and expressed exons by H3K36me3. Nat Genet 2009;41:376-81. [PMID: 19182803 PMCID: PMC2648722 DOI: 10.1038/ng.322] [Citation(s) in RCA: 503] [Impact Index Per Article: 31.4] [Received: 10/24/2008] [Accepted: 01/09/2009] [Indexed: 12/11/2022] Abstract: Variation in patterns of methylations of histone tails reflects and modulates chromatin structure and function. To provide a framework for the analysis of chromatin function in Caenorhabditis elegans, we generated a genome-wide map of histone H3 tail methylations. We find that C. elegans genes show distributions of histone modifications that are similar to those of other organisms, with H3K4me3 near transcription start sites, H3K36me3 in the body of genes and H3K9me3 enriched on silent genes. We also observe a novel pattern: exons are preferentially marked with H3K36me3 relative to introns. H3K36me3 exon marking is dependent on transcription and is found at lower levels in alternatively spliced exons, supporting a splicing-related marking mechanism. We further show that the difference in H3K36me3 marking between exons and introns is evolutionarily conserved in human and mouse. We propose that H3K36me3 exon marking in chromatin provides a dynamic link between transcription and splicing.	Research Support, Non-U.S. Gov't	16	503
10	Down TA, Rakyan VK, Turner DJ, Flicek P, Li H, Kulesha E, Gräf S, Johnson N, Herrero J, Tomazou EM, Thorne NP, Bäckdahl L, Herberth M, Howe KL, Jackson DK, Miretti MM, Marioni JC, Birney E, Hubbard TJP, Durbin R, Tavaré S, Beck S. A Bayesian deconvolution strategy for immunoprecipitation-based DNA methylome analysis. Nat Biotechnol 2008;26:779-85. [PMID: 18612301 DOI: 10.1038/nbt1414] [Citation(s) in RCA: 463] [Impact Index Per Article: 27.2] [Received: 04/04/2008] [Accepted: 05/15/2008] [Indexed: 12/31/2022] Abstract: DNA methylation is an indispensable epigenetic modification required for regulating the expression of mammalian genomes. Immunoprecipitation-based methods for DNA methylome analysis are rapidly shifting the bottleneck in this field from data generation to data analysis, necessitating the development of better analytical tools. In particular, an inability to estimate absolute methylation levels remains a major analytical difficulty associated with immunoprecipitation-based DNA methylation profiling. To address this issue, we developed a cross-platform algorithm, Bayesian tool for methylation analysis (Batman), for analyzing methylated DNA immunoprecipitation (MeDIP) profiles generated using oligonucleotide arrays (MeDIP-chip) or next-generation sequencing (MeDIP-seq). We developed the latter approach to provide a high-resolution whole-genome DNA methylation profile (DNA methylome) of a mammalian genome. Strong correlation of our data, obtained using mature human spermatozoa, with those obtained using bisulfite sequencing suggest that combining MeDIP-seq or MeDIP-chip with Batman provides a robust, quantitative and cost-effective functional genomic strategy for elucidating the function of DNA methylation.	Research Support, Non-U.S. Gov't	17	463
11	Flicek P, Aken BL, Beal K, Ballester B, Caccamo M, Chen Y, Clarke L, Coates G, Cunningham F, Cutts T, Down T, Dyer SC, Eyre T, Fitzgerald S, Fernandez-Banet J, Gräf S, Haider S, Hammond M, Holland R, Howe KL, Howe K, Johnson N, Jenkinson A, Kähäri A, Keefe D, Kokocinski F, Kulesha E, Lawson D, Longden I, Megy K, Meidl P, Overduin B, Parker A, Pritchard B, Prlic A, Rice S, Rios D, Schuster M, Sealy I, Slater G, Smedley D, Spudich G, Trevanion S, Vilella AJ, Vogel J, White S, Wood M, Birney E, Cox T, Curwen V, Durbin R, Fernandez-Suarez XM, Herrero J, Hubbard TJP, Kasprzyk A, Proctor G, Smith J, Ureta-Vidal A, Searle S. Ensembl 2008. Nucleic Acids Res 2007;36:D707-14. [PMID: 18000006 PMCID: PMC2238821 DOI: 10.1093/nar/gkm988] [Citation(s) in RCA: 371] [Impact Index Per Article: 20.6] [Indexed: 12/21/2022] Abstract: The Ensembl project (http://www.ensembl.org) is a comprehensive genome information system featuring an integrated set of genome annotation, databases and other information for chordate and selected model organism and disease vector genomes. As of release 47 (October 2007), Ensembl fully supports 35 species, with preliminary support for six additional species. New species in the past year include platypus and horse. Major additions and improvements to Ensembl since our previous report include extensive support for functional genomics data in the form of a specialized functional genomics database, genome-wide maps of protein–DNA interactions and the Ensembl regulatory build; support for customization of the Ensembl web interface through the addition of user accounts and user groups; and increased support for genome resequencing. We have also introduced new comparative genomics-based data mining options and report on the continued development of our software infrastructure.
	Research Support, Non-U.S. Gov't	18	371
12	Hubbard T, Andrews D, Caccamo M, Cameron G, Chen Y, Clamp M, Clarke L, Coates G, Cox T, Cunningham F, Curwen V, Cutts T, Down T, Durbin R, Fernandez-Suarez XM, Gilbert J, Hammond M, Herrero J, Hotz H, Howe K, Iyer V, Jekosch K, Kahari A, Kasprzyk A, Keefe D, Keenan S, Kokocinsci F, London D, Longden I, McVicker G, Melsopp C, Meidl P, Potter S, Proctor G, Rae M, Rios D, Schuster M, Searle S, Severin J, Slater G, Smedley D, Smith J, Spooner W, Stabenau A, Stalker J, Storey R, Trevanion S, Ureta-Vidal A, Vogel J, White S, Woodwark C, Birney E. Ensembl 2005. Nucleic Acids Res 2005;33:D447-53. [PMID: 15608235 PMCID: PMC540092 DOI: 10.1093/nar/gki138] [Citation(s) in RCA: 341] [Impact Index Per Article: 17.1] [Received: 10/06/2004] [Revised: 11/01/2004] [Accepted: 11/01/2004] [Indexed: 11/17/2022] Abstract: The Ensembl (http://www.ensembl.org/) project provides a comprehensive and integrated source of annotation of large genome sequences. Over the last year the number of genomes available from the Ensembl site has increased by 7 to 16, with the addition of the six vertebrate genomes of chimpanzee, dog, cow, chicken, tetraodon and frog and the insect genome of honeybee. The majority have been annotated automatically using the Ensembl gene build system, showing its flexibility to reliably annotate a wide variety of genomes. With the increased number of vertebrate genomes, the comparative analysis provided to users has been greatly improved, with new website interfaces allowing annotation of different genomes to be directly compared. The Ensembl software system is being increasingly widely reused in different projects showing the benefits of a completely open approach to software development and distribution.
MeSH Headings: Animals; Base Sequence; Cattle; Databases, Nucleic Acid; Dogs; Genomics; Humans; Internet; Mice; Rats; Sequence Alignment; Software; User-Computer Interface. Grants: Wellcome Trust.	research-article	20	341
13	Birney E, Andrews D, Caccamo M, Chen Y, Clarke L, Coates G, Cox T, Cunningham F, Curwen V, Cutts T, Down T, Durbin R, Fernandez-Suarez XM, Flicek P, Gräf S, Hammond M, Herrero J, Howe K, Iyer V, Jekosch K, Kähäri A, Kasprzyk A, Keefe D, Kokocinski F, Kulesha E, London D, Longden I, Melsopp C, Meidl P, Overduin B, Parker A, Proctor G, Prlic A, Rae M, Rios D, Redmond S, Schuster M, Sealy I, Searle S, Severin J, Slater G, Smedley D, Smith J, Stabenau A, Stalker J, Trevanion S, Ureta-Vidal A, Vogel J, White S, Woodwark C, Hubbard TJP. Ensembl 2006. Nucleic Acids Res 2006;34:D556-61. [PMID: 16381931 PMCID: PMC1347495 DOI: 10.1093/nar/gkj133] [Citation(s) in RCA: 323] [Impact Index Per Article: 17.0] [Indexed: 11/24/2022] Abstract: The Ensembl project provides a comprehensive and integrated source of annotation of large genome sequences. Over the last year the number of genomes available from the Ensembl site has increased by 4 to 19, with the addition of the mammalian genomes of Rhesus macaque and Opossum, the chordate genome of Ciona intestinalis and the import and integration of the yeast genome. The year has also seen extensive improvements to both data analysis and presentation, with the introduction of a redesigned website, the addition of RNA gene and regulatory annotation and substantial improvements to the integration of human genome variation data.
MeSH Headings: Animals; Base Sequence; Databases, Nucleic Acid; Genetic Variation; Genome, Human; Genomics; Humans; Internet; Mice; Proteins/genetics; RNA/genetics; Rats; Regulatory Sequences, Nucleic Acid; Sequence Alignment; User-Computer Interface. Grants: BBS/B/13446 (Biotechnology and Biological Sciences Research Council); BBS/B/13462 (Biotechnology and Biological Sciences Research Council); BBS/B/13470 (Biotechnology and Biological Sciences Research Council); Wellcome Trust.	Research Support, Non-U.S. Gov't	19	323
14	Birney E, Andrews TD, Bevan P, Caccamo M, Chen Y, Clarke L, Coates G, Cuff J, Curwen V, Cutts T, Down T, Eyras E, Fernandez-Suarez XM, Gane P, Gibbins B, Gilbert J, Hammond M, Hotz HR, Iyer V, Jekosch K, Kahari A, Kasprzyk A, Keefe D, Keenan S, Lehvaslaiho H, McVicker G, Melsopp C, Meidl P, Mongin E, Pettett R, Potter S, Proctor G, Rae M, Searle S, Slater G, Smedley D, Smith J, Spooner W, Stabenau A, Stalker J, Storey R, Ureta-Vidal A, Woodwark KC, Cameron G, Durbin R, Cox A, Hubbard T, Clamp M. An overview of Ensembl. Genome Res 2004;14:925-8. [PMID: 15078858 PMCID: PMC479121 DOI: 10.1101/gr.1860604] [Citation(s) in RCA: 314] [Impact Index Per Article: 15.0] [Indexed: 01/21/2023] Abstract: Ensembl (http://www.ensembl.org/) is a bioinformatics project to organize biological information around the sequences of large genomes. It is a comprehensive source of stable automatic annotation of individual genomes, and of the synteny and orthology relationships between them. It is also a framework for integration of any biological data that can be mapped onto features derived from the genomic sequence. Ensembl is available as an interactive Web site, a set of flat files, and as a complete, portable open source software system for handling genomes. All data are provided without restriction, and code is freely available. Ensembl's aims are to continue to "widen" this biological integration to include other model organisms relevant to understanding human biology as they become available; to "deepen" this integration to provide an ever more seamless linkage between equivalent components in different species; and to provide further classification of functional elements in the genome that have been previously elusive. MeSH Headings: Computational Biology/trends. Grants: 062023 (Wellcome Trust).	Review	21	314
15	Howe KL, Bolt BJ, Cain S, Chan J, Chen WJ, Davis P, Done J, Down T, Gao S, Grove C, Harris TW, Kishore R, Lee R, Lomax J, Li Y, Muller HM, Nakamura C, Nuin P, Paulini M, Raciti D, Schindelman G, Stanley E, Tuli MA, Van Auken K, Wang D, Wang X, Williams G, Wright A, Yook K, Berriman M, Kersey P, Schedl T, Stein L, Sternberg PW. WormBase 2016: expanding to enable helminth genomic research. Nucleic Acids Res 2015;44:D774-80. [PMID: 26578572 PMCID: PMC4702863 DOI: 10.1093/nar/gkv1217] [Citation(s) in RCA: 289] [Impact Index Per Article: 28.9] [Received: 09/30/2015] [Accepted: 10/28/2015] [Indexed: 11/24/2022] Abstract: WormBase (www.wormbase.org) is a central repository for research data on the biology, genetics and genomics of Caenorhabditis elegans and other nematodes. The project has evolved from its original remit to collect and integrate all data for a single species, and now extends to numerous nematodes, ranging from evolutionary comparators of C. elegans to parasitic species that threaten plant, animal and human health. Research activity using C. elegans as a model system is as vibrant as ever, and we have created new tools for community curation in response to the ever-increasing volume and complexity of data. To better allow users to navigate their way through these data, we have made a number of improvements to our main website, including new tools for browsing genomic features and ontology annotations. Finally, we have developed a new portal for parasitic worm genomes. WormBase ParaSite (parasite.wormbase.org) contains all publicly available nematode and platyhelminth annotated genome sequences, and is designed specifically to support helminth genomic research.	Research Support, Non-U.S. Gov't	10	289
16	Rakyan VK, Beyan H, Down TA, Hawa MI, Maslau S, Aden D, Daunay A, Busato F, Mein CA, Manfras B, Dias KRM, Bell CG, Tost J, Boehm BO, Beck S, Leslie RD. Identification of type 1 diabetes-associated DNA methylation variable positions that precede disease diagnosis. PLoS Genet 2011;7:e1002300. [PMID: 21980303 PMCID: PMC3183089 DOI: 10.1371/journal.pgen.1002300] [Citation(s) in RCA: 236] [Impact Index Per Article: 16.9] [Received: 07/28/2010] [Accepted: 08/03/2011] [Indexed: 12/24/2022] Open Abstract Monozygotic (MZ) twin pair discordance for childhood-onset Type 1 Diabetes (T1D) is ∼50%, implicating roles for genetic and non-genetic factors in the aetiology of this complex autoimmune disease. Although significant progress has been made in elucidating the genetics of T1D in recent years, the non-genetic component has remained poorly defined. We hypothesized that epigenetic variation could underlie some of the non-genetic component of T1D aetiology and, thus, performed an epigenome-wide association study (EWAS) for this disease. We generated genome-wide DNA methylation profiles of purified CD14⁺ monocytes (an immune effector cell type relevant to T1D pathogenesis) from 15 T1D–discordant MZ twin pairs. This identified 132 different CpG sites at which the direction of the intra-MZ pair DNA methylation difference significantly correlated with the diabetic state, i.e. T1D–associated methylation variable positions (T1D–MVPs). We confirmed these T1D–MVPs display statistically significant intra-MZ pair DNA methylation differences in the expected direction in an independent set of T1D–discordant MZ pairs (P = 0.035). 
Then, to establish the temporal origins of the T1D–MVPs, we generated two further genome-wide datasets and established that, when compared with controls, T1D–MVPs are enriched in singletons both before (P = 0.001) and at (P = 0.015) disease diagnosis, and also in singletons positive for diabetes-associated autoantibodies but disease-free even after 12 years follow-up (P = 0.0023). Combined, these results suggest that T1D–MVPs arise very early in the etiological process that leads to overt T1D. Our EWAS of T1D represents an important contribution toward understanding the etiological role of epigenetic variation in type 1 diabetes, and it is also the first systematic analysis of the temporal origins of disease-associated epigenetic variation for any human complex disease. Type 1 diabetes (T1D) is a complex autoimmune disease affecting >30 million people worldwide. It is caused by a combination of genetic and non-genetic factors, leading to destruction of insulin-secreting cells. Although significant progress has recently been made in elucidating the genetics of T1D, the non-genetic component has remained poorly defined. Epigenetic modifications, such as methylation of DNA, are indispensable for genomic processes such as transcriptional regulation and are frequently perturbed in human disease. We therefore hypothesized that epigenetic variation could underlie some of the non-genetic component of T1D aetiology, and we performed a genome-wide DNA methylation analysis of a specific subset of immune cells (monocytes) from monozygotic twins discordant for T1D. This revealed the presence of T1D–specific methylation variable positions (T1D–MVPs) in the T1D–affected co-twins. Since these T1D–MVPs were found in MZ twins, they cannot be due to genetic differences. 
Additional experiments revealed that some of these T1D–MVPs are found in individuals before T1D diagnosis, suggesting they arise very early in the process that leads to overt T1D and are not simply due to post-disease associated factors (e.g. medication or long-term metabolic changes). T1D–MVPs may thus potentially represent a previously unappreciated, and important, component of type 1 diabetes risk.	Twin Study	14	236
17	Down TA, Hubbard TJP. Computational detection and location of transcription start sites in mammalian genomic DNA. Genome Res 2002;12:458-61. [PMID: 11875034 PMCID: PMC155284 DOI: 10.1101/gr.216102] [Citation(s) in RCA: 217] [Impact Index Per Article: 9.4] [Indexed: 11/24/2022] Abstract Transcription, the process whereby RNA copies are made from sections of the DNA genome, is directed by promoter regions. These define the transcription start site, and also the set of cellular conditions under which the promoter is active. At least in more complex species, it appears to be common for genes to have several different transcription start sites, which may be active under different conditions. Eukaryotic promoters are complex and fairly diffuse structures, which have proven hard to detect in silico. We show that a novel hybrid machine-learning method is able to build useful models of promoters for >50% of human transcription start sites. We estimate specificity to be >70%, and demonstrate good positional accuracy. Based on the structure of our learned models, we conclude that a signal resembling the well known TATA box, together with flanking regions of C-G enrichment, are the most important sequence-based signals marking sites of transcriptional initiation at a large class of typical promoters. MESH Headings: Animals; Bayes Theorem; Computational Biology/methods; DNA/chemistry; DNA/genetics; DNA, Complementary/chemistry; Genome; Genome, Human; Humans; Mice; Models, Genetic; Promoter Regions, Genetic/genetics; Transcription Initiation Site.	research-article	23	217
18	Movassagh M, Choy MK, Knowles DA, Cordeddu L, Haider S, Down T, Siggens L, Vujic A, Simeoni I, Penkett C, Goddard M, Lio P, Bennett MR, Foo RSY. Distinct epigenomic features in end-stage failing human hearts. Circulation 2011;124:2411-22. [PMID: 22025602 DOI: 10.1161/circulationaha.111.040071] [Citation(s) in RCA: 203] [Impact Index Per Article: 14.5] [Indexed: 01/10/2023] Abstract BACKGROUND The epigenome refers to marks on the genome, including DNA methylation and histone modifications, that regulate the expression of underlying genes. A consistent profile of gene expression changes in end-stage cardiomyopathy led us to hypothesize that distinct global patterns of the epigenome may also exist. METHODS AND RESULTS We constructed genome-wide maps of DNA methylation and histone-3 lysine-36 trimethylation (H3K36me3) enrichment for cardiomyopathic and normal human hearts. More than 506 Mb sequences per library were generated by high-throughput sequencing, allowing us to assign methylation scores to ≈28 million CG dinucleotides in the human genome. DNA methylation was significantly different in promoter CpG islands, intragenic CpG islands, gene bodies, and H3K36me3-enriched regions of the genome. DNA methylation differences were present in promoters of upregulated genes but not downregulated genes. H3K36me3 enrichment itself was also significantly different in coding regions of the genome. Specifically, abundance of RNA transcripts encoded by the DUX4 locus correlated to differential DNA methylation and H3K36me3 enrichment. In vitro, Dux gene expression was responsive to a specific inhibitor of DNA methyltransferase, and Dux siRNA knockdown led to reduced cell viability. CONCLUSIONS Distinct epigenomic patterns exist in important DNA elements of the cardiac genome in human end-stage cardiomyopathy. 
The epigenome may control the expression of local or distal genes with critical functions in myocardial stress response. If epigenomic patterns track with disease progression, assays for the epigenome may be useful for assessing prognosis in heart failure. Further studies are needed to determine whether and how the epigenome contributes to the development of cardiomyopathy.	Research Support, Non-U.S. Gov't	14	203
19	Novakovic B, Sibson M, Ng HK, Manuelpillai U, Rakyan V, Down T, Beck S, Fournier T, Evain-Brion D, Dimitriadis E, Craig JM, Morley R, Saffery R. Placenta-specific methylation of the vitamin D 24-hydroxylase gene: implications for feedback autoregulation of active vitamin D levels at the fetomaternal interface. J Biol Chem 2009;284:14838-48. [PMID: 19237542 PMCID: PMC2685665 DOI: 10.1074/jbc.m809542200] [Citation(s) in RCA: 183] [Impact Index Per Article: 11.4] [Received: 12/19/2008] [Revised: 02/20/2009] [Indexed: 11/21/2022] Open Abstract Plasma concentrations of biologically active vitamin D (1,25-(OH)₂D) are tightly controlled via feedback regulation of renal 1α-hydroxylase (CYP27B1; positive) and 24-hydroxylase (CYP24A1; catabolic) enzymes. In pregnancy, this regulation is uncoupled, and 1,25-(OH)₂D levels are significantly elevated, suggesting a role in pregnancy progression. Epigenetic regulation of CYP27B1 and CYP24A1 has previously been described in cell and animal models, and despite emerging evidence for a critical role of epigenetics in placentation generally, little is known about the regulation of enzymes modulating vitamin D homeostasis at the fetomaternal interface. In this study, we investigated the methylation status of genes regulating vitamin D bioavailability and activity in the placenta. No methylation of the VDR (vitamin D receptor) and CYP27B1 genes was found in any placental tissues. In contrast, the CYP24A1 gene is methylated in human placenta, purified cytotrophoblasts, and primary and cultured chorionic villus sampling tissue. No methylation was detected in any somatic human tissue tested. Methylation was also evident in marmoset and mouse placental tissue. 
All three genes were hypermethylated in choriocarcinoma cell lines, highlighting the role of vitamin D deregulation in this cancer. Gene expression analysis confirmed a reduced capacity for CYP24A1 induction with promoter methylation in primary cells and in vitro reporter analysis demonstrated that promoter methylation directly down-regulates basal promoter activity and abolishes vitamin D-mediated feedback activation. This study strongly suggests that epigenetic decoupling of vitamin D feedback catabolism plays an important role in maximizing active vitamin D bioavailability at the fetomaternal interface. MESH Headings: 25-Hydroxyvitamin D3 1-alpha-Hydroxylase/genetics; Animals; Calcitriol/pharmacology; Cell Line, Tumor; Choriocarcinoma/genetics; CpG Islands/genetics; DNA Methylation/drug effects; Feedback, Physiological/drug effects; Female; Homeostasis/drug effects; Humans; Mammals/metabolism; Maternal-Fetal Exchange/drug effects; Organ Specificity/drug effects; Placenta/cytology; Placenta/drug effects; Placenta/enzymology; Pre-Eclampsia/enzymology; Pre-Eclampsia/genetics; Pregnancy; Pregnancy Trimester, First/drug effects; Pregnancy Trimester, First/genetics; Promoter Regions, Genetic/genetics; Receptors, Calcitriol/genetics; Steroid Hydroxylases/genetics; Steroid Hydroxylases/metabolism; Term Birth/drug effects; Term Birth/genetics; Transcription, Genetic/drug effects; Trophoblasts/cytology; Trophoblasts/drug effects; Trophoblasts/enzymology; Up-Regulation/drug effects; Vitamin D/metabolism; Vitamin D3 24-Hydroxylase. Grants: 084071 Wellcome Trust.	research-article	16	183
20	Clamp M, Andrews D, Barker D, Bevan P, Cameron G, Chen Y, Clark L, Cox T, Cuff J, Curwen V, Down T, Durbin R, Eyras E, Gilbert J, Hammond M, Hubbard T, Kasprzyk A, Keefe D, Lehvaslaiho H, Iyer V, Melsopp C, Mongin E, Pettett R, Potter S, Rust A, Schmidt E, Searle S, Slater G, Smith J, Spooner W, Stabenau A, Stalker J, Stupka E, Ureta-Vidal A, Vastrik I, Birney E. Ensembl 2002: accommodating comparative genomics. Nucleic Acids Res 2003;31:38-42. [PMID: 12519943 PMCID: PMC165530 DOI: 10.1093/nar/gkg083] [Citation(s) in RCA: 180] [Impact Index Per Article: 8.2] [Indexed: 11/13/2022] Open Abstract The Ensembl (http://www.ensembl.org/) database project provides a bioinformatics framework to organise biology around the sequences of large genomes. It is a comprehensive source of stable automatic annotation of human, mouse and other genome sequences, available as either an interactive web site or as flat files. Ensembl also integrates manually annotated gene structures from external sources where available. As well as being one of the leading sources of genome annotation, Ensembl is an open source software engineering project to develop a portable system able to handle very large genomes and associated requirements. These range from sequence analysis to data storage and visualisation, and installations exist around the world both in companies and at academic sites. With both human and mouse genome sequences available and more vertebrate sequences to follow, many of the recent developments in Ensembl have focused on developing automatic comparative genome analysis and visualisation. MESH Headings: Animals; Computational Biology; Databases, Genetic; Genome, Human; Genomics; Humans; Internet; Mice; Software; Synteny. Grants: Wellcome Trust.	research-article	22	180
21	Bell CG, Finer S, Lindgren CM, Wilson GA, Rakyan VK, Teschendorff AE, Akan P, Stupka E, Down TA, Prokopenko I, Morison IM, Mill J, Pidsley R, Deloukas P, Frayling TM, Hattersley AT, McCarthy MI, Beck S, Hitman GA. Integrated genetic and epigenetic analysis identifies haplotype-specific methylation in the FTO type 2 diabetes and obesity susceptibility locus. PLoS One 2010;5:e14040. [PMID: 21124985 PMCID: PMC2987816 DOI: 10.1371/journal.pone.0014040] [Citation(s) in RCA: 177] [Impact Index Per Article: 11.8] [Received: 05/18/2010] [Accepted: 10/27/2010] [Indexed: 01/04/2023] Open Abstract Recent multi-dimensional approaches to the study of complex disease have revealed powerful insights into how genetic and epigenetic factors may underlie their aetiopathogenesis. We examined genotype-epigenotype interactions in the context of Type 2 Diabetes (T2D), focussing on known regions of genomic susceptibility. We assayed DNA methylation in 60 females, stratified according to disease susceptibility haplotype using previously identified association loci. CpG methylation was assessed using methylated DNA immunoprecipitation on a targeted array (MeDIP-chip) and absolute methylation values were estimated using a Bayesian algorithm (BATMAN). Absolute methylation levels were quantified across LD blocks, and we identified increased DNA methylation on the FTO obesity susceptibility haplotype, tagged by the rs8050136 risk allele A (p = 9.40×10⁻⁴, permutation p = 1.0×10⁻³). Further analysis across the 46 kb LD block using sliding windows localised the most significant difference to be within a 7.7 kb region (p = 1.13×10⁻⁷). Sequence level analysis, followed by pyrosequencing validation, revealed that the methylation difference was driven by the co-ordinated phase of CpG-creating SNPs across the risk haplotype. 
This 7.7 kb region of haplotype-specific methylation (HSM) encapsulates a Highly Conserved Non-Coding Element (HCNE) that has previously been validated as a long-range enhancer, supported by the histone H3K4me1 enhancer signature. This study demonstrates that integration of Genome-Wide Association (GWA) SNP and epigenomic DNA methylation data can identify potential novel genotype-epigenotype interactions within disease-associated loci, thus providing a novel route to aid unravelling common complex diseases. MESH Headings: Adult; Algorithms; Animals; Base Sequence; Bayes Theorem; CpG Islands/genetics; DNA Methylation; Diabetes Mellitus, Type 2/genetics; Diabetes Mellitus, Type 2/metabolism; Epigenomics; Evolution, Molecular; Female; Gene Expression Profiling; Gene Frequency; Genetic Predisposition to Disease/genetics; Genotype; Haplotypes/genetics; Histones/metabolism; Humans; Methylation; Obesity/genetics; Oligonucleotide Array Sequence Analysis; Polymorphism, Single Nucleotide; Sequence Analysis, DNA. Grants: R01 DK073490 NIDDK NIH HHS; G0800441 Medical Research Council; DK-073490 NIDDK NIH HHS; 090532 Wellcome Trust; WT086596/Z/08/Z Wellcome Trust; 084071 Wellcome Trust; Wellcome Trust.	Research Support, Non-U.S. Gov't	15	177
22	Holland RCG, Down TA, Pocock M, Prlić A, Huen D, James K, Foisy S, Dräger A, Yates A, Heuer M, Schreiber MJ. BioJava: an open-source framework for bioinformatics. Bioinformatics 2008;24:2096-7. [PMID: 18689808 PMCID: PMC2530884 DOI: 10.1093/bioinformatics/btn397] [Citation(s) in RCA: 168] [Impact Index Per Article: 9.9] [Indexed: 11/18/2022] Open Abstract Summary: BioJava is a mature open-source project that provides a framework for processing of biological data. BioJava contains powerful analysis and statistical routines, tools for parsing common file formats and packages for manipulating sequences and 3D structures. It enables rapid bioinformatics application development in the Java programming language. Availability: BioJava is an open-source project distributed under the Lesser GPL (LGPL). BioJava can be downloaded from the BioJava website (http://www.biojava.org). BioJava requires Java 1.5 or higher. Contact: andreas.prlic@gmail.com. All queries should be directed to the BioJava mailing lists. Details are available at http://biojava.org/wiki/BioJava:MailingLists.	Research Support, Non-U.S. Gov't	17	168
23	Birney E, Andrews D, Bevan P, Caccamo M, Cameron G, Chen Y, Clarke L, Coates G, Cox T, Cuff J, Curwen V, Cutts T, Down T, Durbin R, Eyras E, Fernandez-Suarez XM, Gane P, Gibbins B, Gilbert J, Hammond M, Hotz H, Iyer V, Kahari A, Jekosch K, Kasprzyk A, Keefe D, Keenan S, Lehvaslaiho H, McVicker G, Melsopp C, Meidl P, Mongin E, Pettett R, Potter S, Proctor G, Rae M, Searle S, Slater G, Smedley D, Smith J, Spooner W, Stabenau A, Stalker J, Storey R, Ureta-Vidal A, Woodwark C, Clamp M, Hubbard T. Ensembl 2004. Nucleic Acids Res 2004;32:D468-70. [PMID: 14681459 PMCID: PMC308772 DOI: 10.1093/nar/gkh038] [Citation(s) in RCA: 143] [Impact Index Per Article: 6.8] [Indexed: 11/13/2022] Open Abstract The Ensembl (http://www.ensembl.org/) database project provides a bioinformatics framework to organize biology around the sequences of large genomes. It is a comprehensive and integrated source of annotation of large genome sequences, available via interactive website, web services or flat files. As well as being one of the leading sources of genome annotation, Ensembl is an open source software engineering project to develop a portable system able to handle very large genomes and associated requirements. The facilities of the system range from sequence analysis to data storage and visualization, and installations exist around the world both in companies and at academic sites. With a total of nine genome sequences available from Ensembl and more genomes to follow, recent developments have focused mainly on closer integration between genomes and external data. MESH Headings: Animals; Computational Biology; Databases, Genetic; Genome; Genomics; Humans; Information Storage and Retrieval; Internet; Software.	Research Support, U.S. Gov't, P.H.S.	21	143
24	Cheung MS, Down TA, Latorre I, Ahringer J. Systematic bias in high-throughput sequencing data and its correction by BEADS. Nucleic Acids Res 2011;39:e103. [PMID: 21646344 PMCID: PMC3159482 DOI: 10.1093/nar/gkr425] [Citation(s) in RCA: 108] [Impact Index Per Article: 7.7] [Indexed: 02/01/2023] Open Abstract Genomic sequences obtained through high-throughput sequencing are not uniformly distributed across the genome. For example, sequencing data of total genomic DNA show significant, yet unexpected enrichments on promoters and exons. This systematic bias is a particular problem for techniques such as chromatin immunoprecipitation, where the signal for a target factor is plotted across genomic features. We have focused on data obtained from Illumina’s Genome Analyser platform, where at least three factors contribute to sequence bias: GC content, mappability of sequencing reads, and regional biases that might be generated by local structure. We show that relying on input control as a normalizer is not generally appropriate due to sample to sample variation in bias. To correct sequence bias, we present BEADS (bias elimination algorithm for deep sequencing), a simple three-step normalization scheme that successfully unmasks real binding patterns in ChIP-seq data. We suggest that this procedure be done routinely prior to data interpretation and downstream analyses.	Research Support, Non-U.S. Gov't	14	108
25	Häsler R, Feng Z, Bäckdahl L, Spehlmann ME, Franke A, Teschendorff A, Rakyan VK, Down TA, Wilson GA, Feber A, Beck S, Schreiber S, Rosenstiel P. A functional methylome map of ulcerative colitis. Genome Res 2012;22:2130-7. [PMID: 22826509 PMCID: PMC3483542 DOI: 10.1101/gr.138347.112] [Citation(s) in RCA: 101] [Impact Index Per Article: 7.8] [Indexed: 12/26/2022] Abstract The etiology of inflammatory bowel diseases is only partially explained by the current genetic risk map. It is hypothesized that environmental factors modulate the epigenetic landscape and thus contribute to disease susceptibility, manifestation, and progression. To test this, we analyzed DNA methylation (DNAm), a fundamental mechanism of epigenetic long-term modulation of gene expression. We report a three-layer epigenome-wide association study (EWAS) using intestinal biopsies from 10 monozygotic twin pairs (n = 20 individuals) discordant for manifestation of ulcerative colitis (UC). Genome-wide expression scans were generated using Affymetrix UG 133 Plus 2.0 arrays (layer 1). Genome-wide DNAm scans were carried out using Illumina 27k Infinium Bead Arrays to identify methylation variable positions (MVPs, layer 2), and MeDIP-chip on Nimblegen custom 385k Tiling Arrays to identify differentially methylated regions (DMRs, layer 3). Identified MVPs and DMRs were validated in two independent patient populations by quantitative real-time PCR and bisulfite-pyrosequencing (n = 185). The EWAS identified 61 disease-associated loci harboring differential DNAm in cis of a differentially expressed transcript. All constitute novel candidate risk loci for UC not previously identified by GWAS. 
Among them are several that have been functionally implicated in inflammatory processes, e.g., complement factor CFI, the serine protease inhibitor SPINK4, and the adhesion molecule THY1 (also known as CD90). Our study design excludes nondisease inflammation as a cause of the identified changes in DNAm. This study represents the first replicated EWAS of UC integrated with transcriptional signatures in the affected tissue and demonstrates the power of EWAS to uncover unexplained disease risk and molecular events of disease manifestation.	Twin Study	13	101
