251
Chou CH, Chang NW, Shrestha S, Hsu SD, Lin YL, Lee WH, Yang CD, Hong HC, Wei TY, Tu SJ, Tsai TR, Ho SY, Jian TY, Wu HY, Chen PR, Lin NC, Huang HT, Yang TL, Pai CY, Tai CS, Chen WL, Huang CY, Liu CC, Weng SL, Liao KW, Hsu WL, Huang HD. miRTarBase 2016: updates to the experimentally validated miRNA-target interactions database. Nucleic Acids Res 2015; 44:D239-47. [PMID: 26590260 PMCID: PMC4702890 DOI: 10.1093/nar/gkv1258] [Citation(s) in RCA: 798] [Impact Index Per Article: 88.7] [Received: 09/15/2015] [Accepted: 10/30/2015] [Indexed: 02/07/2023] Open
Abstract
MicroRNAs (miRNAs) are small non-coding RNAs of approximately 22 nucleotides, which negatively regulate the gene expression at the post-transcriptional level. This study describes an update of the miRTarBase (http://miRTarBase.mbc.nctu.edu.tw/) that provides information about experimentally validated miRNA-target interactions (MTIs). The latest update of the miRTarBase expanded it to identify systematically Argonaute-miRNA-RNA interactions from 138 crosslinking and immunoprecipitation sequencing (CLIP-seq) data sets that were generated by 21 independent studies. The database contains 4966 articles, 7439 strongly validated MTIs (using reporter assays or western blots) and 348 007 MTIs from CLIP-seq. The number of MTIs in the miRTarBase has increased around 7-fold since the 2014 miRTarBase update. The miRNA and gene expression profiles from The Cancer Genome Atlas (TCGA) are integrated to provide an effective overview of this exponential growth in the miRNA experimental data. These improvements make the miRTarBase one of the more comprehensively annotated, experimentally validated miRNA-target interactions databases and motivate additional miRNA research efforts.
Affiliation(s)
- Chih-Hung Chou
  - Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, 300, Taiwan
- Nai-Wen Chang
  - Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, Taipei, 106, Taiwan
- Sirjana Shrestha
  - Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, 300, Taiwan
- Sheng-Da Hsu
  - Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, 300, Taiwan
- Yu-Ling Lin
  - Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, 300, Taiwan; Center for Bioinformatics Research, National Chiao Tung University, Hsinchu, 300, Taiwan
- Wei-Hsiang Lee
  - Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, 300, Taiwan; Clinical Research Center, Chung Shan Medical University Hospital, Taichung, 402, Taiwan
- Chi-Dung Yang
  - Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, 300, Taiwan; Institute of Population Health Sciences, National Health Research Institutes, Miaoli, 350, Taiwan
- Hsiao-Chin Hong
  - Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, 300, Taiwan
- Ting-Yen Wei
  - Interdisciplinary Program of Life Science, National Tsing Hua University, Hsinchu, 300, Taiwan
- Siang-Jyun Tu
  - Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, 300, Taiwan
- Tzi-Ren Tsai
  - Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, 300, Taiwan
- Shu-Yi Ho
  - Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, 300, Taiwan
- Ting-Yan Jian
  - Institute of Molecular Medicine and Bioengineering, National Chiao Tung University, Hsinchu, 300, Taiwan
- Hsin-Yi Wu
  - Institute of Molecular Medicine and Bioengineering, National Chiao Tung University, Hsinchu, 300, Taiwan
- Pin-Rong Chen
  - Institute of Molecular Medicine and Bioengineering, National Chiao Tung University, Hsinchu, 300, Taiwan
- Nai-Chieh Lin
  - Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, 300, Taiwan
- Hsin-Tzu Huang
  - Degree Program of Applied Science and Technology, National Chiao Tung University, Hsinchu, 300, Taiwan
- Tzu-Ling Yang
  - Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, 300, Taiwan
- Chung-Yuan Pai
  - Institute of Molecular Medicine and Bioengineering, National Chiao Tung University, Hsinchu, 300, Taiwan
- Chun-San Tai
  - Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, 300, Taiwan; Institute of Molecular Medicine and Bioengineering, National Chiao Tung University, Hsinchu, 300, Taiwan
- Wen-Liang Chen
  - Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, 300, Taiwan; Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, 300, Taiwan
- Chia-Yen Huang
  - Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, 300, Taiwan; Gynecologic Cancer Center, Department of Obstetrics and Gynecology, Cathay General Hospital, Taipei, 106, Taiwan
- Chun-Chi Liu
  - Institute of Genomics and Bioinformatics, National Chung Hsing University, Taichung, 402, Taiwan
- Shun-Long Weng
  - Department of Obstetrics and Gynecology, Hsinchu Mackay Memorial Hospital, Hsinchu, 300, Taiwan; Mackay Medicine, Nursing and Management College, Taipei, 112, Taiwan; Department of Medicine, Mackay Medical College, New Taipei City, 252, Taiwan
- Kuang-Wen Liao
  - Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, 300, Taiwan; Institute of Molecular Medicine and Bioengineering, National Chiao Tung University, Hsinchu, 300, Taiwan
- Wen-Lian Hsu
  - Institute of Information Science, Academia Sinica, Taipei, 115, Taiwan
- Hsien-Da Huang
  - Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, 300, Taiwan; Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, 300, Taiwan; Center for Bioinformatics Research, National Chiao Tung University, Hsinchu, 300, Taiwan; Department of Biomedical Science and Environmental Biology, Kaohsiung Medical University, Kaohsiung, 807, Taiwan
252
Howe KL, Bolt BJ, Cain S, Chan J, Chen WJ, Davis P, Done J, Down T, Gao S, Grove C, Harris TW, Kishore R, Lee R, Lomax J, Li Y, Muller HM, Nakamura C, Nuin P, Paulini M, Raciti D, Schindelman G, Stanley E, Tuli MA, Van Auken K, Wang D, Wang X, Williams G, Wright A, Yook K, Berriman M, Kersey P, Schedl T, Stein L, Sternberg PW. WormBase 2016: expanding to enable helminth genomic research. Nucleic Acids Res 2015; 44:D774-80. [PMID: 26578572 PMCID: PMC4702863 DOI: 10.1093/nar/gkv1217] [Citation(s) in RCA: 278] [Impact Index Per Article: 30.9] [Received: 09/30/2015] [Accepted: 10/28/2015] [Indexed: 11/24/2022] Open
Abstract
WormBase (www.wormbase.org) is a central repository for research data on the biology, genetics and genomics of Caenorhabditis elegans and other nematodes. The project has evolved from its original remit to collect and integrate all data for a single species, and now extends to numerous nematodes, ranging from evolutionary comparators of C. elegans to parasitic species that threaten plant, animal and human health. Research activity using C. elegans as a model system is as vibrant as ever, and we have created new tools for community curation in response to the ever-increasing volume and complexity of data. To better allow users to navigate their way through these data, we have made a number of improvements to our main website, including new tools for browsing genomic features and ontology annotations. Finally, we have developed a new portal for parasitic worm genomes. WormBase ParaSite (parasite.wormbase.org) contains all publicly available nematode and platyhelminth annotated genome sequences, and is designed specifically to support helminth genomic research.
Affiliation(s)
- Kevin L Howe
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Bruce J Bolt
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Scott Cain
  - Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Juancarlos Chan
  - Division of Biology and Biological Engineering 156-29, California Institute of Technology, Pasadena, CA 91125, USA
- Wen J Chen
  - Division of Biology and Biological Engineering 156-29, California Institute of Technology, Pasadena, CA 91125, USA
- Paul Davis
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- James Done
  - Division of Biology and Biological Engineering 156-29, California Institute of Technology, Pasadena, CA 91125, USA
- Thomas Down
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Sibyl Gao
  - Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Christian Grove
  - Division of Biology and Biological Engineering 156-29, California Institute of Technology, Pasadena, CA 91125, USA
- Todd W Harris
  - Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Ranjana Kishore
  - Division of Biology and Biological Engineering 156-29, California Institute of Technology, Pasadena, CA 91125, USA
- Raymond Lee
  - Division of Biology and Biological Engineering 156-29, California Institute of Technology, Pasadena, CA 91125, USA
- Jane Lomax
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
- Yuling Li
  - Division of Biology and Biological Engineering 156-29, California Institute of Technology, Pasadena, CA 91125, USA
- Hans-Michael Muller
  - Division of Biology and Biological Engineering 156-29, California Institute of Technology, Pasadena, CA 91125, USA
- Cecilia Nakamura
  - Division of Biology and Biological Engineering 156-29, California Institute of Technology, Pasadena, CA 91125, USA
- Paulo Nuin
  - Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Michael Paulini
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Daniela Raciti
  - Division of Biology and Biological Engineering 156-29, California Institute of Technology, Pasadena, CA 91125, USA
- Gary Schindelman
  - Division of Biology and Biological Engineering 156-29, California Institute of Technology, Pasadena, CA 91125, USA
- Eleanor Stanley
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
- Mary Ann Tuli
  - Division of Biology and Biological Engineering 156-29, California Institute of Technology, Pasadena, CA 91125, USA
- Kimberly Van Auken
  - Division of Biology and Biological Engineering 156-29, California Institute of Technology, Pasadena, CA 91125, USA
- Daniel Wang
  - Division of Biology and Biological Engineering 156-29, California Institute of Technology, Pasadena, CA 91125, USA
- Xiaodong Wang
  - Division of Biology and Biological Engineering 156-29, California Institute of Technology, Pasadena, CA 91125, USA
- Gary Williams
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Adam Wright
  - Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Karen Yook
  - Division of Biology and Biological Engineering 156-29, California Institute of Technology, Pasadena, CA 91125, USA
- Matthew Berriman
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
- Paul Kersey
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Tim Schedl
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA
- Lincoln Stein
  - Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
- Paul W Sternberg
  - Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada; Howard Hughes Medical Institute, California Institute of Technology, Pasadena, CA 91125, USA
253
Wu C, Jin X, Tsueng G, Afrasiabi C, Su AI. BioGPS: building your own mash-up of gene annotations and expression profiles. Nucleic Acids Res 2015; 44:D313-6. [PMID: 26578587 PMCID: PMC4702805 DOI: 10.1093/nar/gkv1104] [Citation(s) in RCA: 299] [Impact Index Per Article: 33.2] [Received: 09/14/2015] [Accepted: 10/11/2015] [Indexed: 12/24/2022] Open
Abstract
BioGPS (http://biogps.org) is a centralized gene-annotation portal that enables researchers to access distributed gene annotation resources. This article focuses on the updates to BioGPS since our last paper (2013 database issue). The unique features of BioGPS, compared to those of other gene portals, are its community extensibility and user customizability. Users contribute the gene-specific resources accessible from BioGPS (‘plugins’), which helps ensure that the resource collection is always up-to-date and that it will continue expanding over time (since the 2013 paper, 162 resources have been added, for a 34% increase in the number of resources available). BioGPS users can create their own collections of relevant plugins and save them as customized gene-report pages or ‘layouts’ (since the 2013 paper, 488 user-created layouts have been added, for a 22% increase in the number of layouts). In addition, we recently updated the most popular plugin, the ‘Gene expression/activity chart’, to include ∼6000 datasets (from ∼2000 datasets) and we enhanced user interactivity. We also added a new ‘gene list’ feature that allows users to save query results for future reference.
Affiliation(s)
- Chunlei Wu
  - Department of Molecular and Experimental Medicine, The Scripps Research Institute, La Jolla, CA 92037, USA
- Xuefeng Jin
  - Nanjing Xiexu MediaTechnology Co., Ltd., 180 Ruanjiandadao Rd #7-403, Nanjing, Jiangsu 210000, China
- Ginger Tsueng
  - Department of Molecular and Experimental Medicine, The Scripps Research Institute, La Jolla, CA 92037, USA
- Cyrus Afrasiabi
  - Department of Molecular and Experimental Medicine, The Scripps Research Institute, La Jolla, CA 92037, USA
- Andrew I Su
  - Department of Molecular and Experimental Medicine, The Scripps Research Institute, La Jolla, CA 92037, USA
254
Luzón-Toro B, Gui H, Ruiz-Ferrer M, Sze-Man Tang C, Fernández RM, Sham PC, Torroglosa A, Kwong-Hang Tam P, Espino-Paisán L, Cherny SS, Bleda M, Enguix-Riego MDV, Dopazo J, Antiñolo G, García-Barceló MM, Borrego S. Exome sequencing reveals a high genetic heterogeneity on familial Hirschsprung disease. Sci Rep 2015; 5:16473. [PMID: 26559152 PMCID: PMC4642299 DOI: 10.1038/srep16473] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Received: 07/24/2015] [Accepted: 10/14/2015] [Indexed: 11/24/2022] Open
Abstract
Hirschsprung disease (HSCR; OMIM 142623) is a developmental disorder characterized by aganglionosis along variable lengths of the distal gastrointestinal tract, which results in intestinal obstruction. Interactions among known HSCR genes and/or unknown disease susceptibility loci lead to variable severity of phenotype. Neither linkage nor genome-wide association studies have efficiently contributed to completely dissect the genetic pathways underlying this complex genetic disorder. We have performed whole exome sequencing of 16 HSCR patients from 8 unrelated families with SOLID platform. Variants shared by affected relatives were validated by Sanger sequencing. We searched for genes recurrently mutated across families. Only variations in the FAT3 gene were significantly enriched in five families. Within-family analysis identified compound heterozygotes for AHNAK and several genes (N = 23) with heterozygous variants that co-segregated with the phenotype. Network and pathway analyses facilitated the discovery of polygenic inheritance involving FAT3, HSCR known genes and their gene partners. Altogether, our approach has facilitated the detection of more than one damaging variant in biologically plausible genes that could jointly contribute to the phenotype. Our data may contribute to the understanding of the complex interactions that occur during enteric nervous system development and the etiopathology of familial HSCR.
Affiliation(s)
- Berta Luzón-Toro
  - Department of Genetics, Reproduction and Fetal Medicine, Institute of Biomedicine of Seville (IBIS), University Hospital Virgen del Rocío/CSIC/University of Seville, Seville, Spain; Centre for Biomedical Network Research on Rare Diseases (CIBERER), Spain
- Hongsheng Gui
  - Centre for Genomic Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China; Department of Psychiatry, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China
- Macarena Ruiz-Ferrer
  - Department of Genetics, Reproduction and Fetal Medicine, Institute of Biomedicine of Seville (IBIS), University Hospital Virgen del Rocío/CSIC/University of Seville, Seville, Spain; Centre for Biomedical Network Research on Rare Diseases (CIBERER), Spain
- Clara Sze-Man Tang
  - Department of Psychiatry, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China; Department of Surgery, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China
- Raquel M Fernández
  - Department of Genetics, Reproduction and Fetal Medicine, Institute of Biomedicine of Seville (IBIS), University Hospital Virgen del Rocío/CSIC/University of Seville, Seville, Spain; Centre for Biomedical Network Research on Rare Diseases (CIBERER), Spain
- Pak-Chung Sham
  - Centre for Genomic Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China; Department of Psychiatry, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China; State Key Laboratory of Brain and Cognitive Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China; Centre for Reproduction, Development, and Growth, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China
- Ana Torroglosa
  - Department of Genetics, Reproduction and Fetal Medicine, Institute of Biomedicine of Seville (IBIS), University Hospital Virgen del Rocío/CSIC/University of Seville, Seville, Spain; Centre for Biomedical Network Research on Rare Diseases (CIBERER), Spain
- Paul Kwong-Hang Tam
  - Department of Surgery, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China; Centre for Reproduction, Development, and Growth, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China
- Laura Espino-Paisán
  - Department of Genetics, Reproduction and Fetal Medicine, Institute of Biomedicine of Seville (IBIS), University Hospital Virgen del Rocío/CSIC/University of Seville, Seville, Spain
- Stacey S Cherny
  - Centre for Genomic Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China; Department of Psychiatry, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China; State Key Laboratory of Brain and Cognitive Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China
- Marta Bleda
  - Centre for Biomedical Network Research on Rare Diseases (CIBERER), Spain; Computational Genomics Department, Centro de Investigación Príncipe Felipe (CIPF), Valencia, Spain
- María Del Valle Enguix-Riego
  - Department of Genetics, Reproduction and Fetal Medicine, Institute of Biomedicine of Seville (IBIS), University Hospital Virgen del Rocío/CSIC/University of Seville, Seville, Spain; Centre for Biomedical Network Research on Rare Diseases (CIBERER), Spain
- Joaquín Dopazo
  - Centre for Biomedical Network Research on Rare Diseases (CIBERER), Spain; Computational Genomics Department, Centro de Investigación Príncipe Felipe (CIPF), Valencia, Spain; Functional Genomics Node, (INB) at CIPF, Valencia, Spain
- Guillermo Antiñolo
  - Department of Genetics, Reproduction and Fetal Medicine, Institute of Biomedicine of Seville (IBIS), University Hospital Virgen del Rocío/CSIC/University of Seville, Seville, Spain; Centre for Biomedical Network Research on Rare Diseases (CIBERER), Spain
- María-Mercé García-Barceló
  - Department of Surgery, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China; Centre for Reproduction, Development, and Growth, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China
- Salud Borrego
  - Department of Genetics, Reproduction and Fetal Medicine, Institute of Biomedicine of Seville (IBIS), University Hospital Virgen del Rocío/CSIC/University of Seville, Seville, Spain; Centre for Biomedical Network Research on Rare Diseases (CIBERER), Spain
255
Meta-analysis identifies seven susceptibility loci involved in the atopic march. Nat Commun 2015; 6:8804. [PMID: 26542096 PMCID: PMC4667629 DOI: 10.1038/ncomms9804] [Citation(s) in RCA: 117] [Impact Index Per Article: 13.0] [Received: 07/20/2015] [Accepted: 10/06/2015] [Indexed: 12/20/2022] Open
Abstract
Eczema often precedes the development of asthma in a disease course called the ‘atopic march’. To unravel the genes underlying this characteristic pattern of allergic disease, we conduct a multi-stage genome-wide association study on infantile eczema followed by childhood asthma in 12 populations including 2,428 cases and 17,034 controls. Here we report two novel loci specific for the combined eczema plus asthma phenotype, which are associated with allergic disease for the first time; rs9357733 located in EFHC1 on chromosome 6p12.3 (OR 1.27; P=2.1 × 10⁻⁸) and rs993226 between TMTC2 and SLC6A15 on chromosome 12q21.3 (OR 1.58; P=5.3 × 10⁻⁹). Additional susceptibility loci identified at genome-wide significance are FLG (1q21.3), IL4/KIF3A (5q31.1), AP5B1/OVOL1 (11q13.1), C11orf30/LRRC32 (11q13.5) and IKZF3 (17q21). We show that predominantly eczema loci increase the risk for the atopic march. Our findings suggest that eczema may play an important role in the development of asthma after eczema.
The development of asthma following eczema is known as the atopic march. Here the authors conduct a GWAS on affected children and identify two novel loci associated with the disease phenotype.
256
Chen Y, Cai X, Xu R. Combining Human Disease Genetics and Mouse Model Phenotypes towards Drug Repositioning for Parkinson's disease. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2015; 2015:1851-60. [PMID: 26958284 PMCID: PMC4765695] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/05/2023]
Abstract
Parkinson's disease (PD) is a severe neurodegenerative disorder without effective treatments. Here, we present a novel drug repositioning approach to predict new drugs for PD leveraging both disease genetics and large amounts of mouse model phenotypes. First, we identified PD-specific mouse phenotypes using well-studied human disease genes. Then we searched all FDA-approved drugs for candidates that share similar mouse phenotype profiles with PD. We demonstrated the validity of our approach using drugs that have been approved for PD: 10 approved PD drugs were ranked within top 10% among 1197 candidates. In predicting novel PD drugs, our approach achieved a mean average precision of 0.24, which is significantly higher (p
Affiliation(s)
- Yang Chen
  - Department of Electrical Engineering and Computer Science, School of Engineering, Case Western Reserve University, Cleveland, Ohio, USA
- Xiaoshu Cai
  - Department of Electrical Engineering and Computer Science, School of Engineering, Case Western Reserve University, Cleveland, Ohio, USA
- Rong Xu
  - Department of Epidemiology and Biostatistics, School of Medicine, Case Western Reserve University, Cleveland, Ohio, USA
257
Pembroke WG, Babbs A, Davies KE, Ponting CP, Oliver PL. Temporal transcriptomics suggest that twin-peaking genes reset the clock. eLife 2015; 4. [PMID: 26523393 PMCID: PMC4718813 DOI: 10.7554/elife.10518] [Citation(s) in RCA: 58] [Impact Index Per Article: 6.4] [Received: 07/31/2015] [Accepted: 11/01/2015] [Indexed: 01/08/2023] Open
Abstract
The mammalian suprachiasmatic nucleus (SCN) drives daily rhythmic behavior and physiology, yet a detailed understanding of its coordinated transcriptional programmes is lacking. To reveal the finer details of circadian variation in the mammalian SCN transcriptome we combined laser-capture microdissection (LCM) and RNA-seq over a 24 hr light / dark cycle. We show that 7-times more genes exhibited a classic sinusoidal expression signature than previously observed in the SCN. Another group of 766 genes unexpectedly peaked twice, near both the start and end of the dark phase; this twin-peaking group is significantly enriched for synaptic transmission genes that are crucial for light-induced phase shifting of the circadian clock. 341 intergenic non-coding RNAs, together with novel exons of annotated protein-coding genes, including Cry1, also show specific circadian expression variation. Overall, our data provide an important chronobiological resource (www.wgpembroke.com/shiny/SCNseq/) and allow us to propose that transcriptional timing in the SCN is gating clock resetting mechanisms.
DOI: http://dx.doi.org/10.7554/eLife.10518.001
The daily cycles of life in mammals are driven by a small region of the brain called the suprachiasmatic nucleus (or SCN). The SCN receives signals from sunlight and other environmental factors to help coordinate most aspects of daily biological activity and behaviour. To work correctly, it is essential that the SCN switches certain genes on and off at exactly the right time. However, many questions remain over the identity of these genes and how their levels of activity change during a 24-hour period. When a gene is active (or “being expressed”), it is used as a template to build the molecules of RNA that are needed to make proteins and to help to control how cells work. Pembroke et al. have now sequenced the RNA molecules made in the SCN of mice (which plays the same role as the equivalent human brain region) over a 24-hour period. The mice spent half of each day in the light, and half in the dark. This revealed that the expression levels of over a quarter of all the genes that are found in the SCN fluctuate over a 24-hour period. One particular group of genes peak in activity twice a day; Pembroke et al. suggest that these genes are important for controlling how an animal can adjust its body clock to light. Further research is now needed to find out which of the newly discovered fluctuating genes play the most important roles in daily activity rhythms, and which might play a part in disease.
DOI: http://dx.doi.org/10.7554/eLife.10518.002
Affiliation(s)
- William G Pembroke
  - MRC Functional Genomics Unit, Department of Physiology Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
- Arran Babbs
  - MRC Functional Genomics Unit, Department of Physiology Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
- Kay E Davies
  - MRC Functional Genomics Unit, Department of Physiology Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
- Chris P Ponting
  - MRC Functional Genomics Unit, Department of Physiology Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
- Peter L Oliver
  - MRC Functional Genomics Unit, Department of Physiology Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
258
Wilson R, McGuire C, Mohun T. Deciphering the mechanisms of developmental disorders: phenotype analysis of embryos from mutant mouse lines. Nucleic Acids Res 2015; 44:D855-61. [PMID: 26519470 PMCID: PMC4702824 DOI: 10.1093/nar/gkv1138] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Received: 08/14/2015] [Accepted: 10/18/2015] [Indexed: 01/09/2023] Open
Abstract
The Deciphering the Mechanisms of Developmental Disorders (DMDD) consortium is a research programme set up to identify genes in the mouse, which if mutated (or knocked-out) result in embryonic lethality when homozygous, and initiate the study of why disruption of their function has such profound effects on embryo development and survival. The project uses a combination of comprehensive high resolution 3D imaging and tissue histology to identify abnormalities in embryo and placental structures of embryonic lethal lines. The image data we have collected and the phenotypes scored are freely available through the project website (http://dmdd.org.uk). In this article we describe the web interface to the images that allows the embryo data to be viewed at full resolution in different planes, discuss how to search the database for a phenotype, and our approach to organising the data for an embryo and a mutant line so it is easy to comprehend and intuitive to navigate.
Affiliation(s)
- Robert Wilson
  - The Francis Crick Institute Mill Hill Laboratory, The Ridgeway, Mill Hill, London NW7 1AA, UK
- Christina McGuire
  - The Francis Crick Institute Mill Hill Laboratory, The Ridgeway, Mill Hill, London NW7 1AA, UK
- Timothy Mohun
  - The Francis Crick Institute Mill Hill Laboratory, The Ridgeway, Mill Hill, London NW7 1AA, UK
259
Szlachcic WJ, Switonski PM, Kurkowiak M, Wiatr K, Figiel M. Mouse polyQ database: a new online resource for research using mouse models of neurodegenerative diseases. Mol Brain 2015; 8:69. [PMID: 26515641 PMCID: PMC4625465 DOI: 10.1186/s13041-015-0160-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 08/26/2015] [Accepted: 10/19/2015] [Indexed: 01/15/2023] Open
Abstract
Background: The polyglutamine (polyQ) family of disorders comprises 9 genetic diseases, including several types of ataxia and Huntington disease. Approximately two decades of investigation and the creation of more than 130 mouse models of polyQ disorders have revealed many similarities between these diseases. The disorders share common mutation types, neurological characteristics and certain aspects of pathogenesis, including morphological and physiological neuronal alterations. All of the diseases still remain incurable.
Description: The large volume of information collected as a result of the investigation of polyQ models currently represents a great potential for searching, comparing and translating pathogenesis and therapeutic information between diseases. Therefore, we generated a public database comprising the polyQ mouse models, phenotypes and therapeutic interventions tested in vivo. The database is available at http://conyza.man.poznan.pl/.
Conclusion: The use of the database in the field of polyQ diseases may accelerate research on these and other neurodegenerative diseases and provide new perspectives for future investigation.
Affiliation(s)
- Wojciech J Szlachcic
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, 61-704, Poznań, Poland.
- Pawel M Switonski
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, 61-704, Poznań, Poland.
- Małgorzata Kurkowiak
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, 61-704, Poznań, Poland.
- Kalina Wiatr
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, 61-704, Poznań, Poland.
- Maciej Figiel
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, 61-704, Poznań, Poland.
260
Cunha MLR, Meijers JCM, Middeldorp S. Introduction to the analysis of next generation sequencing data and its application to venous thromboembolism. Thromb Haemost 2015; 114:920-32. [PMID: 26446408 DOI: 10.1160/th15-05-0411] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 05/15/2015] [Accepted: 08/26/2015] [Indexed: 12/13/2022]
Abstract
Despite knowledge of various inherited risk factors associated with venous thromboembolism (VTE), no definite cause can be found in about 50% of patients. The application of data-driven searches such as GWAS has not been able to identify genetic variants with implications for clinical care, and unexplained heritability remains. In the past years, the development of several so-called next generation sequencing (NGS) platforms is offering the possibility of generating fast, inexpensive and accurate genomic information. However, so far their application to VTE has been very limited. Here we review basic concepts of NGS data analysis and explore the application of NGS technology to VTE. We provide both computational and biological viewpoints to discuss potentials and challenges of NGS-based studies.
Affiliation(s)
- Marisa L R Cunha
- Department of Experimental Vascular Medicine, Academic Medical Center, Meibergdreef 9, 1105 AZ Amsterdam, The Netherlands, Tel.: +31 20 5662824, Fax: +31 20 6968833
261
Manda P, Balhoff JP, Lapp H, Mabee P, Vision TJ. Using the phenoscape knowledgebase to relate genetic perturbations to phenotypic evolution. Genesis 2015. [PMID: 26220875 DOI: 10.1002/dvg.22878] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Indexed: 01/03/2023]
Abstract
The abundance of phenotypic diversity among species can enrich our knowledge of development and genetics beyond the limits of variation that can be observed in model organisms. The Phenoscape Knowledgebase (KB) is designed to enable exploration and discovery of phenotypic variation among species. Because phenotypes in the KB are annotated using standard ontologies, evolutionary phenotypes can be compared with phenotypes from genetic perturbations in model organisms. To illustrate the power of this approach, we review the use of the KB to find taxa showing evolutionary variation similar to that of a query gene. Matches are made between the full set of phenotypes described for a gene and an evolutionary profile, the latter of which is defined as the set of phenotypes that are variable among the daughters of any node on the taxonomic tree. Phenoscape's semantic similarity interface allows the user to assess the statistical significance of each match and flags matches that may only result from differences in annotation coverage between genetic and evolutionary studies. Tools such as this will help meet the challenge of relating the growing volume of genetic knowledge in model organisms to the diversity of phenotypes in nature. The Phenoscape KB is available at http://kb.phenoscape.org.
Affiliation(s)
- Prashanti Manda
- Department of Biology, University of North Carolina, Chapel Hill, North Carolina; US National Evolutionary Synthesis Center, Durham, North Carolina
- James P Balhoff
- Department of Biology, University of North Carolina, Chapel Hill, North Carolina; US National Evolutionary Synthesis Center, Durham, North Carolina
- Hilmar Lapp
- US National Evolutionary Synthesis Center, Durham, North Carolina; Center for Genomic and Computational Biology, Duke University, Durham, North Carolina
- Paula Mabee
- Department of Biology, University of South Dakota, Vermillion, South Dakota
- Todd J Vision
- Department of Biology, University of North Carolina, Chapel Hill, North Carolina; US National Evolutionary Synthesis Center, Durham, North Carolina
262
Dumont-Lagacé M, St-Pierre C, Perreault C. Sex hormones have pervasive effects on thymic epithelial cells. Sci Rep 2015; 5:12895. [PMID: 26250469 PMCID: PMC4528223 DOI: 10.1038/srep12895] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.9] [Received: 05/01/2015] [Accepted: 07/15/2015] [Indexed: 12/15/2022] Open
Abstract
The goal of our study was to evaluate, at the systems level, the effect of sex hormones on thymic epithelial cells (TECs). To this end, we sequenced the transcriptome of cortical and medullary TECs (cTECs and mTECs) from three groups of 6-month-old mice: males, females and males castrated at four weeks of age. In parallel, we analyzed variations in the size of TEC subsets in those three groups between 1 and 12 months of age. We report that sex hormones have pervasive effects on the transcriptome of TECs. These effects were exquisitely TEC-subset specific. Sexual dimorphism was particularly conspicuous in cTECs. Male cTECs displayed low proliferation rates that correlated with low expression of Foxn1 and its main targets. Furthermore, male cTECs expressed relatively low levels of genes instrumental in thymocyte expansion (e.g., Dll4) and positive selection (Psmb11 and Ctsl). Nevertheless, cTECs were more abundant in males than females. Accumulation of cTECs in males correlated with differential expression of genes regulating cell survival in cTECs and cell differentiation in mTECs. The sexual dimorphism of TECs highlighted here may be mechanistically linked to the well-recognized sex differences in susceptibility to infections and autoimmune diseases.
Affiliation(s)
- Maude Dumont-Lagacé
- [1] Institute for Research in Immunology and Cancer, Université de Montréal, Montreal, QC, Canada H3C 3J7 [2] Department of Medicine, Université de Montréal, Montreal, QC, Canada H3C 3J7
- Charles St-Pierre
- [1] Institute for Research in Immunology and Cancer, Université de Montréal, Montreal, QC, Canada H3C 3J7 [2] Department of Medicine, Université de Montréal, Montreal, QC, Canada H3C 3J7
- Claude Perreault
- [1] Institute for Research in Immunology and Cancer, Université de Montréal, Montreal, QC, Canada H3C 3J7 [2] Department of Medicine, Université de Montréal, Montreal, QC, Canada H3C 3J7
263
James-Zorn C, Ponferrada VG, Burns KA, Fortriede JD, Lotay VS, Liu Y, Karpinka JB, Karimi K, Zorn AM, Vize PD. Xenbase: Core features, data acquisition, and data processing. Genesis 2015; 53:486-97. [PMID: 26150211 PMCID: PMC4545734 DOI: 10.1002/dvg.22873] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Received: 05/08/2015] [Revised: 06/15/2015] [Accepted: 06/22/2015] [Indexed: 01/18/2023]
Abstract
Xenbase, the Xenopus model organism database (www.xenbase.org), is a cloud-based, web-accessible resource that integrates the diverse genomic and biological data from Xenopus research. Xenopus frogs are one of the major vertebrate animal models used for biomedical research, and Xenbase is the central repository for the enormous amount of data generated using this model tetrapod. The goal of Xenbase is to accelerate discovery by enabling investigators to make novel connections between molecular pathways in Xenopus and human disease. Our relational database and user-friendly interface make these data easy to query and allow investigators to quickly interrogate and link different data types in ways that would otherwise be difficult, time consuming, or impossible. Xenbase also enhances the value of these data through high-quality gene expression curation and data integration, by providing bioinformatics tools optimized for Xenopus experiments, and by linking Xenopus data to other model organisms and to human data. Xenbase draws in data via pipelines that download data, parse the content, and save them into appropriate files and database tables. Furthermore, Xenbase makes these data accessible to the broader biomedical community by continually providing annotated data updates to organizations such as NCBI, UniProtKB, and Ensembl. Here, we describe our bioinformatics, genome-browsing tools, data acquisition and sharing, our community submitted and literature curation pipelines, text-mining support, gene page features, and the curation of gene nomenclature and gene models.
Affiliation(s)
- Christina James-Zorn
- Division of Developmental Biology, Cincinnati Children’s Hospital, 3333 Burnet Ave, Cincinnati, OH 45229, USA
- Virgillio G. Ponferrada
- Division of Developmental Biology, Cincinnati Children’s Hospital, 3333 Burnet Ave, Cincinnati, OH 45229, USA
- Kevin A. Burns
- Division of Developmental Biology, Cincinnati Children’s Hospital, 3333 Burnet Ave, Cincinnati, OH 45229, USA
- Joshua D. Fortriede
- Division of Developmental Biology, Cincinnati Children’s Hospital, 3333 Burnet Ave, Cincinnati, OH 45229, USA
- Vaneet S. Lotay
- Departments of Biological Science and Computer Science, University of Calgary, 2500 University Drive NW, Calgary, AB T2N1N4, Canada
- Yu Liu
- Departments of Biological Science and Computer Science, University of Calgary, 2500 University Drive NW, Calgary, AB T2N1N4, Canada
- J. Brad Karpinka
- Departments of Biological Science and Computer Science, University of Calgary, 2500 University Drive NW, Calgary, AB T2N1N4, Canada
- Kamran Karimi
- Departments of Biological Science and Computer Science, University of Calgary, 2500 University Drive NW, Calgary, AB T2N1N4, Canada
- Aaron M. Zorn
- Division of Developmental Biology, Cincinnati Children’s Hospital, 3333 Burnet Ave, Cincinnati, OH 45229, USA
- Peter D. Vize
- Departments of Biological Science and Computer Science, University of Calgary, 2500 University Drive NW, Calgary, AB T2N1N4, Canada
264
Eppig JT, Richardson JE, Kadin JA, Ringwald M, Blake JA, Bult CJ. Mouse Genome Informatics (MGI): reflecting on 25 years. Mamm Genome 2015; 26:272-84. [PMID: 26238262 PMCID: PMC4534491 DOI: 10.1007/s00335-015-9589-4] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.3] [Received: 07/17/2015] [Accepted: 07/20/2015] [Indexed: 12/02/2022]
Abstract
From its inception in 1989, the mission of the Mouse Genome Informatics (MGI) resource remains to integrate genetic, genomic, and biological data about the laboratory mouse to facilitate the study of human health and disease. This mission is ever more feasible as the revolution in genetics knowledge, the ability to sequence genomes, and the ability to specifically manipulate mammalian genomes are now at our fingertips. Through major paradigm shifts in biological research and computer technologies, MGI has adapted and evolved to become an integral part of the larger global bioinformatics infrastructure and honed its ability to provide authoritative reference datasets used and incorporated by many other established bioinformatics resources. Here, we review some of the major changes in research approaches over the last quarter century, how these changes are reflected in the MGI resource you use today, and what may be around the next corner.
Affiliation(s)
- Janan T. Eppig
- Mouse Genome Informatics, The Jackson Laboratory, Bar Harbor, ME 04609 USA
- Joel E. Richardson
- Mouse Genome Informatics, The Jackson Laboratory, Bar Harbor, ME 04609 USA
- James A. Kadin
- Mouse Genome Informatics, The Jackson Laboratory, Bar Harbor, ME 04609 USA
- Martin Ringwald
- Mouse Genome Informatics, The Jackson Laboratory, Bar Harbor, ME 04609 USA
- Judith A. Blake
- Mouse Genome Informatics, The Jackson Laboratory, Bar Harbor, ME 04609 USA
- Carol J. Bult
- Mouse Genome Informatics, The Jackson Laboratory, Bar Harbor, ME 04609 USA
265
Lyne R, Sullivan J, Butano D, Contrino S, Heimbach J, Hu F, Kalderimis A, Lyne M, Smith RN, Štěpán R, Balakrishnan R, Binkley G, Harris T, Karra K, Moxon SAT, Motenko H, Neuhauser S, Ruzicka L, Cherry M, Richardson J, Stein L, Westerfield M, Worthey E, Micklem G. Cross-organism analysis using InterMine. Genesis 2015; 53:547-60. [PMID: 26097192 PMCID: PMC4545681 DOI: 10.1002/dvg.22869] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 04/01/2015] [Revised: 06/17/2015] [Accepted: 06/17/2015] [Indexed: 01/01/2023]
Abstract
InterMine is a data integration warehouse and analysis software system developed for large and complex biological data sets. Designed for integrative analysis, it can be accessed through a user-friendly web interface. For bioinformaticians, extensive web services as well as programming interfaces for most common scripting languages support access to all features. The web interface includes a useful identifier look-up system, and both simple and sophisticated search options. Interactive results tables enable exploration, and data can be filtered, summarized, and browsed. A set of graphical analysis tools provide a rich environment for data exploration including statistical enrichment of sets of genes or other entities. InterMine databases have been developed for the major model organisms, budding yeast, nematode worm, fruit fly, zebrafish, mouse, and rat together with a newly developed human database. Here, we describe how this has facilitated interoperation and development of cross-organism analysis tools and reports. InterMine as a data exploration and analysis tool is also described. All the InterMine-based systems described in this article are resources freely available to the scientific community.
Affiliation(s)
- Rachel Lyne
- Cambridge Systems Biology Centre, University of Cambridge, Cambridge CB2 1QR, United Kingdom
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
- Julie Sullivan
- Cambridge Systems Biology Centre, University of Cambridge, Cambridge CB2 1QR, United Kingdom
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
- Daniela Butano
- Cambridge Systems Biology Centre, University of Cambridge, Cambridge CB2 1QR, United Kingdom
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
- Sergio Contrino
- Cambridge Systems Biology Centre, University of Cambridge, Cambridge CB2 1QR, United Kingdom
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
- Josh Heimbach
- Cambridge Systems Biology Centre, University of Cambridge, Cambridge CB2 1QR, United Kingdom
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
- Fengyuan Hu
- Cambridge Systems Biology Centre, University of Cambridge, Cambridge CB2 1QR, United Kingdom
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
- Alex Kalderimis
- Cambridge Systems Biology Centre, University of Cambridge, Cambridge CB2 1QR, United Kingdom
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
- Mike Lyne
- Cambridge Systems Biology Centre, University of Cambridge, Cambridge CB2 1QR, United Kingdom
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
- Richard N. Smith
- Cambridge Systems Biology Centre, University of Cambridge, Cambridge CB2 1QR, United Kingdom
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
- Radek Štěpán
- Cambridge Systems Biology Centre, University of Cambridge, Cambridge CB2 1QR, United Kingdom
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
- Rama Balakrishnan
- Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
- Gail Binkley
- Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
- Todd Harris
- Ontario Institute for Cancer Research, Toronto, ON, M5G0A3, Canada
- Kalpana Karra
- Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
- Howie Motenko
- The Jackson Laboratory, Bar Harbor, Maine, 04609, USA
- Mike Cherry
- Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
- Lincoln Stein
- Ontario Institute for Cancer Research, Toronto, ON, M5G0A3, Canada
- Monte Westerfield
- ZFIN, University of Oregon, Eugene, OR, 97403, USA
- Institute of Neuroscience, University of Oregon, Eugene, OR, 97403, USA
- Elizabeth Worthey
- Human and Molecular Genetics Center, Medical College of Wisconsin, Milwaukee, WI, 53226, USA
- Gos Micklem
- Cambridge Systems Biology Centre, University of Cambridge, Cambridge CB2 1QR, United Kingdom
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
266
Dolan ME, Baldarelli RM, Bello SM, Ni L, McAndrews MS, Bult CJ, Kadin JA, Richardson JE, Ringwald M, Eppig JT, Blake JA. Orthology for comparative genomics in the mouse genome database. Mamm Genome 2015. [PMID: 26223881 PMCID: PMC4534493 DOI: 10.1007/s00335-015-9588-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Indexed: 10/27/2022]
Abstract
The mouse genome database (MGD) is the model organism database component of the mouse genome informatics system at The Jackson Laboratory. MGD is the international data resource for the laboratory mouse and facilitates the use of mice in the study of human health and disease. Since its beginnings, MGD has included comparative genomics data with a particular focus on human-mouse orthology, an essential component of the use of mouse as a model organism. Over the past 25 years, novel algorithms and addition of orthologs from other model organisms have enriched comparative genomics in MGD data, extending the use of orthology data to support the laboratory mouse as a model of human biology. Here, we describe current comparative data in MGD and review the history and refinement of orthology representation in this resource.
Affiliation(s)
- Mary E Dolan
- The Jackson Laboratory, Bar Harbor, ME, 04609, USA
267
Smedley D, Robinson PN. Phenotype-driven strategies for exome prioritization of human Mendelian disease genes. Genome Med 2015; 7:81. [PMID: 26229552 PMCID: PMC4520011 DOI: 10.1186/s13073-015-0199-2] [Citation(s) in RCA: 77] [Impact Index Per Article: 8.6] [Indexed: 01/16/2023] Open
Abstract
Whole exome sequencing has altered the way in which rare diseases are diagnosed and disease genes identified. Hundreds of novel disease-associated genes have been characterized by whole exome sequencing in the past five years, yet the identification of disease-causing mutations is often challenging because of the large number of rare variants that are being revealed. Gene prioritization aims to rank the most probable candidate genes towards the top of a list of potentially pathogenic variants. A promising new approach involves the computational comparison of the phenotypic abnormalities of the individual being investigated with those previously associated with human diseases or genetically modified model organisms. In this review, we compare and contrast the strengths and weaknesses of current phenotype-driven computational algorithms, including Phevor, Phen-Gen, eXtasy and two algorithms developed by our groups called PhenIX and Exomiser. Computational phenotype analysis can substantially improve the performance of exome analysis pipelines.
Affiliation(s)
- Damian Smedley
- Skarnes Faculty Group, Wellcome Trust Sanger Institute, Hinxton, UK
- Peter N. Robinson
- Institute for Medical Genetics and Human Genetics, Charité-Universitätsmedizin Berlin, Berlin, Germany
- Max Planck Institute for Molecular Genetics, Ihnestrasse, 14195 Berlin, Germany
- Berlin Brandenburg Center for Regenerative Therapies (BCRT), Charité-Universitätsmedizin Berlin, Augustenburger Platz, 13353 Berlin, Germany
- Institute for Bioinformatics, Department of Mathematics and Computer Science, Freie Universität Berlin, Takustrasse, 14195 Berlin, Germany
268
Eppig JT, Richardson JE, Kadin JA, Smith CL, Blake JA, Bult CJ. Mouse Genome Database: From sequence to phenotypes and disease models. Genesis 2015; 53:458-73. [PMID: 26150326 PMCID: PMC4545690 DOI: 10.1002/dvg.22874] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 05/06/2015] [Revised: 06/30/2015] [Accepted: 07/02/2015] [Indexed: 12/19/2022]
Abstract
The Mouse Genome Database (MGD, www.informatics.jax.org) is the international scientific database for genetic, genomic, and biological data on the laboratory mouse to support the research requirements of the biomedical community. To accomplish this goal, MGD provides broad data coverage, serves as the authoritative standard for mouse nomenclature for genes, mutants, and strains, and curates and integrates many types of data from literature and electronic sources. Among the key data sets MGD supports are: the complete catalog of mouse genes and genome features, comparative homology data for mouse and vertebrate genes, the authoritative set of Gene Ontology (GO) annotations for mouse gene functions, a comprehensive catalog of mouse mutations and their phenotypes, and a curated compendium of mouse models of human diseases. Here, we describe the data acquisition process, specifics about MGD's key data areas, methods to access and query MGD data, and outreach and user help facilities.
269
Abstract
Mouse anatomy ontologies provide standard nomenclature for describing normal and mutant mouse anatomy, and are essential for the description and integration of data directly related to anatomy such as gene expression patterns. Building on our previous work on anatomical ontologies for the embryonic and adult mouse, we have recently developed a new and substantially revised anatomical ontology covering all life stages of the mouse. Anatomical terms are organized in complex hierarchies enabling multiple relationships between terms. Tissue classification as well as partonomic, developmental, and other types of relationships can be represented. Hierarchies for specific developmental stages can also be derived. The ontology forms the core of the eMouse Atlas Project (EMAP) and is used extensively for annotating and integrating gene expression patterns and other data by the Gene Expression Database (GXD), the eMouse Atlas of Gene Expression (EMAGE) and other database resources. Here we illustrate the evolution of the developmental and adult mouse anatomical ontologies toward one combined system. We report on recent ontology enhancements, describe the current status, and discuss future plans for mouse anatomy ontology development and application in integrating data resources.
270
Rajput B, Murphy TD, Pruitt KD. RefSeq curation and annotation of antizyme and antizyme inhibitor genes in vertebrates. Nucleic Acids Res 2015; 43:7270-9. [PMID: 26170238 PMCID: PMC4551939 DOI: 10.1093/nar/gkv713] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 05/22/2015] [Accepted: 07/01/2015] [Indexed: 12/29/2022] Open
Abstract
Polyamines are ubiquitous cations that are involved in regulating fundamental cellular processes such as cell growth and proliferation; hence, their intracellular concentration is tightly regulated. Antizyme and antizyme inhibitor have a central role in maintaining cellular polyamine levels. Antizyme is unique in that it is expressed via a novel programmed ribosomal frameshifting mechanism. Conventional computational tools are unable to predict a programmed frameshift, resulting in misannotation of antizyme transcripts and proteins on transcript and genomic sequences. Correct annotation of a programmed frameshifting event requires manual evaluation. Our goal was to provide an accurately curated and annotated Reference Sequence (RefSeq) data set of antizyme transcript and protein records across a broad taxonomic scope that would serve as standards for accurate representation of these gene products. As antizyme and antizyme inhibitor proteins are functionally connected, we also curated antizyme inhibitor genes to more fully represent the elegant biology of polyamine regulation. Manual review of genes for three members of the antizyme family and two members of the antizyme inhibitor family in 91 vertebrate organisms resulted in a total of 461 curated RefSeq records.
Affiliation(s)
- Bhanu Rajput
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA
- Terence D Murphy
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA
- Kim D Pruitt
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA
271
Bello SM, Smith CL, Eppig JT. Allele, phenotype and disease data at Mouse Genome Informatics: improving access and analysis. Mamm Genome 2015; 26:285-94. [PMID: 26162703 PMCID: PMC4534497 DOI: 10.1007/s00335-015-9582-y] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 05/15/2015] [Accepted: 06/23/2015] [Indexed: 11/16/2022]
Abstract
A core part of the Mouse Genome Informatics (MGI) resource is the collection of mouse mutations and the annotation of phenotypes and diseases displayed by mice carrying these mutations. These data are integrated with the rest of the data in MGI and exported to numerous other resources. The use of mouse phenotype data to drive translational research into human disease has expanded rapidly with the improvements in sequencing technology. MGI has implemented many improvements in allele and phenotype data annotation, search, and display to facilitate access to these data through multiple avenues. For example, the description of alleles has been modified to include more detailed categories of allele attributes. This allows improved discrimination between mutation types. Further, connections have been created between mutations involving multiple genes and each of the genes overlapping the mutation. This allows users to readily find all mutations affecting a gene and see all genes affected by a mutation. In a similar manner, the genes expressed by transgenic or knock-in alleles are now connected to these alleles. The advanced search forms and public reports have been updated to take advantage of these improvements. These search forms and reports are used by an expanding number of researchers to identify novel human disease genes and mouse models of human disease.
Affiliation(s)
- Susan M Bello
- Mouse Genome Informatics, The Jackson Laboratory, Bar Harbor, ME, 04609, USA
272
Drabkin HJ, Christie KR, Dolan ME, Hill DP, Ni L, Sitnikov D, Blake JA. Application of comparative biology in GO functional annotation: the mouse model. Mamm Genome 2015; 26:574-83. [PMID: 26141960 PMCID: PMC4602061 DOI: 10.1007/s00335-015-9580-0] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 05/15/2015] [Accepted: 06/23/2015] [Indexed: 01/22/2023]
Abstract
The Gene Ontology (GO) is an important component of modern biological knowledge representation with great utility for computational analysis of genomic and genetic data. The Gene Ontology Consortium (GOC) consists of a large team of contributors including curation teams from most model organism database groups as well as curation teams focused on representation of data relevant to specific human diseases. Key to the generation of consistent and comprehensive annotations is the development and use of shared standards and measures of curation quality. The GOC engages all contributors to work to a defined standard of curation that is presented here in the context of annotation of genes in the laboratory mouse. Comprehensive understanding of the origin, epistemology, and coverage of GO annotations is essential for most effective use of GO resources. Here the application of comparative approaches to capturing functional data in the mouse system is described.
Affiliation(s)
- Mary E Dolan
- The Jackson Laboratory, Bar Harbor, ME, 04609, USA
- David P Hill
- The Jackson Laboratory, Bar Harbor, ME, 04609, USA
- Li Ni
- The Jackson Laboratory, Bar Harbor, ME, 04609, USA
273
Motenko H, Neuhauser SB, O'Keefe M, Richardson JE. MouseMine: a new data warehouse for MGI. Mamm Genome 2015; 26:325-30. [PMID: 26092688 PMCID: PMC4534495 DOI: 10.1007/s00335-015-9573-z] [Citation(s) in RCA: 95] [Impact Index Per Article: 10.6] [Received: 04/24/2015] [Accepted: 06/01/2015] [Indexed: 11/25/2022]
Abstract
MouseMine (www.mousemine.org) is a new data warehouse for accessing mouse data from Mouse Genome Informatics (MGI). Based on the InterMine software framework, MouseMine supports powerful query, reporting, and analysis capabilities, the ability to save and combine results from different queries, easy integration into larger workflows, and a comprehensive Web Services layer. Through MouseMine, users can access a significant portion of MGI data in new and useful ways. Importantly, MouseMine is also a member of a growing community of online data resources based on InterMine, including those established by other model organism databases. Adopting common interfaces and collaborating on data representation standards are critical to fostering cross-species data analysis. This paper presents a general introduction to MouseMine, presents examples of its use, and discusses the potential for further integration into the MGI interface.
Affiliation(s)
- H Motenko
- The Jackson Laboratory, Bar Harbor, ME, 04609, USA
|
274
|
Haendel MA, Vasilevsky N, Brush M, Hochheiser HS, Jacobsen J, Oellrich A, Mungall CJ, Washington N, Köhler S, Lewis SE, Robinson PN, Smedley D. Disease insights through cross-species phenotype comparisons. Mamm Genome 2015; 26:548-55. [PMID: 26092691 PMCID: PMC4602072 DOI: 10.1007/s00335-015-9577-8] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 03/27/2015] [Accepted: 05/20/2015] [Indexed: 11/30/2022]
Abstract
New sequencing technologies have ushered in a new era for diagnosis and discovery of new causative mutations for rare diseases. However, the sheer number of candidate variants that require interpretation in an exome or genomic analysis remains a challenging prospect. A powerful approach is the comparison of the patient’s set of phenotypes (phenotypic profile) to known phenotypic profiles caused by mutations in orthologous genes associated with these variants. The most abundant source of relevant data for this task is available through the efforts of the Mouse Genome Informatics group and the International Mouse Phenotyping Consortium. In this review, we highlight the challenges in comparing human clinical phenotypes with mouse phenotypes and some of the solutions that have been developed by members of the Monarch Initiative. These tools allow the identification of mouse models for known disease-gene associations that may otherwise have been overlooked, as well as the prioritization of candidate genes for novel associations. The culmination of these efforts is the Exomiser software package, which allows clinical researchers to analyse patient exomes in the context of variant frequency and predicted pathogenicity as well as the phenotypic similarity of the patient to any given candidate orthologous gene.
Affiliation(s)
- Melissa A Haendel
- University Library and Department of Medical Informatics and Epidemiology, Oregon Health & Science University, Portland, OR, USA
- Nicole Vasilevsky
- University Library and Department of Medical Informatics and Epidemiology, Oregon Health & Science University, Portland, OR, USA
- Matthew Brush
- University Library and Department of Medical Informatics and Epidemiology, Oregon Health & Science University, Portland, OR, USA
- Harry S Hochheiser
- Department of Biomedical Informatics and Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA, 15206, USA
- Julius Jacobsen
- Skarnes Faculty Group, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
- Anika Oellrich
- Skarnes Faculty Group, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
- Christopher J Mungall
- Genomics Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, 94720, USA
- Nicole Washington
- Genomics Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, 94720, USA
- Sebastian Köhler
- Computational Biology Group, Institute for Medical Genetics and Human Genetics, Universitatsklinikum Charité, Augustenburger Platz 1, 13353, Berlin, Germany
- Suzanna E Lewis
- Genomics Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA, 94720, USA
- Peter N Robinson
- Computational Biology Group, Institute for Medical Genetics and Human Genetics, Universitatsklinikum Charité, Augustenburger Platz 1, 13353, Berlin, Germany
- Damian Smedley
- Skarnes Faculty Group, Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
|
275
|
A unified gene catalog for the laboratory mouse reference genome. Mamm Genome 2015; 26:295-304. [PMID: 26084703 PMCID: PMC4534496 DOI: 10.1007/s00335-015-9571-1] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 03/06/2015] [Accepted: 06/03/2015] [Indexed: 01/01/2023]
Abstract
We report here a semi-automated process by which mouse genome feature predictions and curated annotations (i.e., genes, pseudogenes, functional RNAs, etc.) from Ensembl, NCBI and the Vertebrate Genome Annotation database (Vega) are reconciled with the genome features in the Mouse Genome Informatics (MGI) database (http://www.informatics.jax.org) into a comprehensive and non-redundant catalog. Our gene unification method employs an algorithm (fjoin: feature join) for efficient detection of genome coordinate overlaps among features represented in two annotation data sets. Following the analysis with fjoin, genome features are binned into six possible categories (1:1, 1:0, 0:1, 1:n, n:1, n:m) based on coordinate overlaps. These categories are subsequently prioritized for assessment of annotation equivalencies and differences. The version of the unified catalog reported here contains more than 59,000 entries, including 22,599 protein-coding genes, 12,455 pseudogenes, and 24,007 other feature types (e.g., microRNAs, lincRNAs, etc.). More than 23,000 of the entries in the MGI gene catalog have equivalent gene models in the annotation files obtained from NCBI, Vega, and Ensembl. Of the remaining features, 12,719 are unique to NCBI relative to Ensembl/Vega, 11,957 are unique to Ensembl/Vega relative to NCBI, and 3,095 are unique to MGI. More than 4,000 genome features fall into categories that require manual inspection to resolve structural differences in the gene models from different annotation sources. Using the MGI unified gene catalog, researchers can easily generate a comprehensive report of mouse genome features from a single source and compare the details of gene and transcript structure using MGI’s mouse genome browser.
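The six overlap categories named in this abstract can be illustrated with a minimal Python sketch. It is not the fjoin algorithm: it uses a naive all-pairs comparison, assumes closed intervals given as hypothetical `(chrom, start, end)` tuples, and assumes that the left side of each label counts features from the first set and the right side counts features from the second. The function names `overlaps`, `label`, and `bin_features` are invented for illustration.

```python
from collections import defaultdict


def overlaps(a, b):
    """True if two (chrom, start, end) features share at least one base (closed intervals)."""
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


def label(na, nb):
    """Name a cluster holding na features from set A and nb features from set B."""
    if na == 1 and nb == 1:
        return "1:1"
    if na == 1 and nb == 0:
        return "1:0"
    if na == 0 and nb == 1:
        return "0:1"
    if na == 1:
        return "1:n"
    if nb == 1:
        return "n:1"
    return "n:m"


def bin_features(set_a, set_b):
    """Cluster overlapping features from two annotation sets and bin the clusters.

    set_a, set_b: dict mapping feature id -> (chrom, start, end).
    Returns dict: category label -> list of (ids_from_a, ids_from_b) clusters.
    """
    # Bipartite overlap graph, built by naive all-pairs comparison (an assumption;
    # a real tool would use a sort-and-sweep or interval index instead).
    adj = defaultdict(set)
    for ia, fa in set_a.items():
        for ib, fb in set_b.items():
            if overlaps(fa, fb):
                adj[("A", ia)].add(("B", ib))
                adj[("B", ib)].add(("A", ia))

    nodes = [("A", i) for i in set_a] + [("B", i) for i in set_b]
    seen = set()
    bins = defaultdict(list)
    for node in nodes:
        if node in seen:
            continue
        # Collect the connected component containing this feature.
        stack, component = [node], []
        seen.add(node)
        while stack:
            cur = stack.pop()
            component.append(cur)
            for nxt in adj.get(cur, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        ids_a = sorted(i for side, i in component if side == "A")
        ids_b = sorted(i for side, i in component if side == "B")
        bins[label(len(ids_a), len(ids_b))].append((ids_a, ids_b))
    return dict(bins)
```

Clusters labelled 1:1 are candidates for automatic equivalence, while the 1:n, n:1, and n:m clusters correspond to the cases the abstract describes as needing closer assessment.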
|
276
|
Finger JH, Smith CM, Hayamizu TF, McCright IJ, Xu J, Eppig JT, Kadin JA, Richardson JE, Ringwald M. The mouse gene expression database: New features and how to use them effectively. Genesis 2015; 53:510-22. [PMID: 26045019 DOI: 10.1002/dvg.22864] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 04/08/2015] [Revised: 05/29/2015] [Accepted: 06/01/2015] [Indexed: 12/15/2022]
Abstract
The Gene Expression Database (GXD) is an extensive and freely available community resource of mouse developmental expression data. GXD curates and integrates expression data from the literature, via electronic data submissions, and by collaborations with large-scale projects. As an integral component of the Mouse Genome Informatics Resource, GXD combines expression data with genetic, functional, phenotypic, and disease-related data, and provides tools for the research community to search for and analyze expression data in this larger context. Recent enhancements include: an interactive browser to navigate the mouse developmental anatomy and find expression data for specific anatomical structures; the capability to search for expression data of genes located in specific genomic regions, supporting the identification of disease candidate genes; a summary displaying all the expression images that meet specified search criteria; interactive matrix views that provide overviews of spatio-temporal expression patterns (Tissue × Stage Matrix) and enable the comparison of expression patterns between genes (Tissue × Gene Matrix); data zoom and filter utilities to iteratively refine summary displays and data sets; and gene-based links to expression data from other model organisms, such as chicken, Xenopus, and zebrafish, fostering comparative expression analysis for species that are highly relevant for developmental research.
Affiliation(s)
- Jingxia Xu
- The Jackson Laboratory, Bar Harbor, Maine
|
277
|
Analysis of the human diseasome using phenotype similarity between common, genetic, and infectious diseases. Sci Rep 2015; 5:10888. [PMID: 26051359 PMCID: PMC4458913 DOI: 10.1038/srep10888] [Citation(s) in RCA: 72] [Impact Index Per Article: 8.0] [Received: 10/27/2014] [Accepted: 04/22/2015] [Indexed: 01/29/2023] Open
Abstract
Phenotypes are the observable characteristics of an organism arising from its response to the environment. Phenotypes associated with engineered and natural genetic variation are widely recorded using phenotype ontologies in model organisms, as are signs and symptoms of human Mendelian diseases in databases such as OMIM and Orphanet. Exploiting these resources, several computational methods have been developed for integration and analysis of phenotype data to identify the genetic etiology of diseases or suggest plausible interventions. A similar resource would be highly useful not only for rare and Mendelian diseases, but also for common, complex and infectious diseases. We apply a semantic text-mining approach to identify the phenotypes (signs and symptoms) associated with over 6,000 diseases. We evaluate our text-mined phenotypes by demonstrating that they can correctly identify known disease-associated genes in mice and humans with high accuracy. Using a phenotypic similarity measure, we generate a human disease network in which diseases that have similar signs and symptoms cluster together, and we use this network to identify closely related diseases based on common etiological, anatomical as well as physiological underpinnings.
|
278
|
Richardson JE, Bult CJ. Visual annotation display (VLAD): a tool for finding functional themes in lists of genes. Mamm Genome 2015; 26:567-73. [PMID: 26047590 PMCID: PMC4602057 DOI: 10.1007/s00335-015-9570-2] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.6] [Received: 03/06/2015] [Accepted: 05/19/2015] [Indexed: 12/27/2022]
Abstract
Experiments that employ genome-scale technology platforms frequently result in lists of tens to thousands of genes with potential significance to a specific biological process or disease. Searching for biologically relevant connections among the genes or gene products in these lists is a common data analysis task. We have implemented a software application for uncovering functional themes in sets of genes based on their annotations to bio-ontologies, such as the gene ontology and the mammalian phenotype ontology. The application, called VisuaL Annotation Display (VLAD), performs a statistical analysis to test for the enrichment of ontology terms in a set of genes submitted by a researcher. The results of each analysis using VLAD include a table of ontology terms, sorted in decreasing order of significance. Each row contains the term, statistics such as the number of annotated terms, the p value, etc., and the symbols of annotated genes. An accompanying graphical display shows portions of the ontology hierarchy, where node sizes are scaled based on p values. Although numerous ontology term enrichment programs already exist, VLAD is unique in that it allows users to upload their own annotation files and ontologies for customized term enrichment analyses, supports the analysis of multiple gene sets at once, provides interfaces to customize graphical output, and is tightly integrated with functional and biological details about mouse genes in the Mouse Genome Informatics (MGI) database. VLAD is available as a web-based application from the MGI web site (http://proto.informatics.jax.org/prototypes/vlad/).
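The kind of term enrichment table described in this abstract can be sketched in a few lines of Python. The abstract does not say which statistical test VLAD uses, so the one-sided hypergeometric test below is an assumption, and the names `hypergeom_pvalue` and `enrich` and the input layout (a dict from term to annotated gene set) are hypothetical.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when n genes are drawn from N, of which K are annotated to the term."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def enrich(gene_set, annotations, universe):
    """Return rows (term, annotated query genes, p value), most significant first.

    gene_set: iterable of query gene symbols.
    annotations: dict mapping ontology term -> set of annotated gene symbols.
    universe: set of all genes considered (the background).
    """
    query = set(gene_set) & universe
    rows = []
    for term, genes in annotations.items():
        annotated = genes & universe
        hits = sorted(query & annotated)
        if not hits:
            continue  # terms with no query genes are left out of the table
        p = hypergeom_pvalue(len(hits), len(query), len(annotated), len(universe))
        rows.append((term, hits, p))
    # Decreasing significance means increasing p value.
    rows.sort(key=lambda row: row[2])
    return rows
```

A real analysis would also correct for testing many terms at once and would propagate annotations up the ontology hierarchy; both steps are omitted here for brevity.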
Affiliation(s)
- Joel E Richardson
- Mouse Genome Informatics (MGI) Database Consortium, The Jackson Laboratory, 600 Main Street, Bar Harbor, ME, 04609, USA.
- Carol J Bult
- Mouse Genome Informatics (MGI) Database Consortium, The Jackson Laboratory, 600 Main Street, Bar Harbor, ME, 04609, USA.
|
279
|
Smith CM, Finger JH, Hayamizu TF, McCright IJ, Xu J, Eppig JT, Kadin JA, Richardson JE, Ringwald M. GXD: a community resource of mouse Gene Expression Data. Mamm Genome 2015; 26:314-24. [PMID: 25939429 PMCID: PMC4534488 DOI: 10.1007/s00335-015-9563-1] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 03/20/2015] [Accepted: 04/13/2015] [Indexed: 12/23/2022]
Abstract
The Gene Expression Database (GXD) is an extensive, easily searchable, and freely available database of mouse gene expression information (www.informatics.jax.org/expression.shtml). GXD was developed to foster progress toward understanding the molecular basis of human development and disease. GXD contains information about when and where genes are expressed in different tissues in the mouse, especially during the embryonic period. GXD collects different types of expression data from wild-type and mutant mice, including RNA in situ hybridization, immunohistochemistry, RT-PCR, and northern and western blot results. The GXD curators read the scientific literature and enter the expression data from those papers into the database. GXD also acquires expression data directly from researchers, including groups doing large-scale expression studies. GXD currently contains nearly 1.5 million expression results for over 13,900 genes. In addition, it has over 265,000 images of expression data, allowing users to retrieve the primary data and interpret it themselves. By being an integral part of the larger Mouse Genome Informatics (MGI) resource, GXD’s expression data are combined with other genetic, functional, phenotypic, and disease-oriented data. This allows GXD to provide tools for researchers to evaluate expression data in the larger context, search by a wide variety of biologically and biomedically relevant parameters, and discover new data connections to help in the design of new experiments. Thus, GXD can provide researchers with critical insights into the functions of genes and the molecular mechanisms of development, differentiation, and disease.
Affiliation(s)
- Jingxia Xu
- The Jackson Laboratory, Bar Harbor, ME 04609 USA
|
280
|
Goya J, Wong AK, Yao V, Krishnan A, Homilius M, Troyanskaya OG. FNTM: a server for predicting functional networks of tissues in mouse. Nucleic Acids Res 2015; 43:W182-7. [PMID: 25940632 PMCID: PMC4489275 DOI: 10.1093/nar/gkv443] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 03/02/2015] [Accepted: 04/24/2015] [Indexed: 12/11/2022] Open
Abstract
Functional Networks of Tissues in Mouse (FNTM) provides biomedical researchers with tissue-specific predictions of functional relationships between proteins in the most widely used model organism for human disease, the laboratory mouse. Users can explore FNTM-predicted functional relationships for their tissues and genes of interest or examine gene function and interaction predictions across multiple tissues, all through an interactive, multi-tissue network browser. FNTM makes predictions based on integration of a variety of functional genomic data, including over 13 000 gene expression experiments, and prior knowledge of gene function. FNTM is an ideal starting point for clinical and translational researchers considering a mouse model for their disease of interest, researchers already working with mouse models who are interested in discovering new genes related to their pathways or phenotypes of interest, and biologists working with other organisms to explore the functional relationships of their genes of interest in specific mouse tissue contexts. FNTM predicts tissue-specific functional relationships in 200 tissues, does not require any registration or installation and is freely available for use at http://fntm.princeton.edu.
Affiliation(s)
- Jonathan Goya
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08540, USA
- Aaron K Wong
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08540, USA; Simons Center for Data Analysis, Simons Foundation, NY 10010, USA; Department of Computer Science, Princeton University, Princeton, NJ 08540, USA
- Victoria Yao
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08540, USA; Department of Computer Science, Princeton University, Princeton, NJ 08540, USA
- Arjun Krishnan
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08540, USA
- Max Homilius
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08540, USA; Department of Computer Science, Princeton University, Princeton, NJ 08540, USA
- Olga G Troyanskaya
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08540, USA; Simons Center for Data Analysis, Simons Foundation, NY 10010, USA; Department of Computer Science, Princeton University, Princeton, NJ 08540, USA
|
281
|
Notwell JH, Chung T, Heavner W, Bejerano G. A family of transposable elements co-opted into developmental enhancers in the mouse neocortex. Nat Commun 2015; 6:6644. [PMID: 25806706 PMCID: PMC4438107 DOI: 10.1038/ncomms7644] [Citation(s) in RCA: 71] [Impact Index Per Article: 7.9] [Received: 07/16/2014] [Accepted: 02/13/2015] [Indexed: 12/27/2022] Open
Abstract
The neocortex is a mammalian-specific structure that is responsible for higher functions such as cognition, emotion and perception. To gain insight into its evolution and the gene regulatory codes that pattern it, we studied the overlap of its active developmental enhancers with transposable element (TE) families and compared this overlap to uniformly shuffled enhancers. Here we show a striking enrichment of the MER130 repeat family among active enhancers in the mouse dorsal cerebral wall, which gives rise to the neocortex, at embryonic day 14.5. We show that MER130 instances preserve a common code of transcriptional regulatory logic, function as enhancers and are adjacent to critical neocortical genes. MER130, a nonautonomous interspersed TE, originates in the tetrapod or possibly Sarcopterygii ancestor, which far predates the appearance of the neocortex. Our results show that MER130 elements were recruited, likely through their common regulatory logic, as neocortical enhancers.
Affiliation(s)
- James H Notwell
- Department of Computer Science, Stanford University, 279 Campus Drive West (MC 5329), Beckman Center B-300, Stanford, California 94305-5329, USA
- Tisha Chung
- Department of Developmental Biology, Stanford University, 279 Campus Drive West (MC 5329), Beckman Center B-300, Stanford, California 94305-5329, USA
- Whitney Heavner
- [1] Department of Developmental Biology, Stanford University, 279 Campus Drive West (MC 5329), Beckman Center B-300, Stanford, California 94305-5329, USA [2] Department of Biology, Stanford University, 279 Campus Drive West (MC 5329), Beckman Center B-300, Stanford, California 94305-5329, USA
- Gill Bejerano
- [1] Department of Computer Science, Stanford University, 279 Campus Drive West (MC 5329), Beckman Center B-300, Stanford, California 94305-5329, USA [2] Department of Developmental Biology, Stanford University, 279 Campus Drive West (MC 5329), Beckman Center B-300, Stanford, California 94305-5329, USA [3] Department of Pediatrics, Division of Medical Genetics, Stanford University, 279 Campus Drive West (MC 5329), Beckman Center B-300, Stanford, California 94305-5329, USA
|
282
|
Abstract
The human genome contains multiple stretches of CGG trinucleotide repeats, which act as transcription- and translation-regulatory elements but at the same time form secondary structures that impede replication and give rise to sites of chromosome fragility. Proteins binding to such DNA elements may be involved in divergent cellular processes such as transcription, DNA damage, and epigenetic state of the chromatin. We review here the work done on CGG repeats and associated proteins, with special focus on a factor called CGGBP1. CGGBP1 is an interesting example of a factor that does not have any single dedicated function but participates indispensably in multiple processes. Both experimental results and data from cancer genome sequencing have revealed that any alteration in CGGBP1 that compromises its function is not tolerated by normal or cancer cells alike. Based upon a large amount of published data, information from databases, and unpublished results, we decipher in this review how CGGBP1 is a classic example of the 'one factor, divergent functions' paradigm of cytoprotection. By taking cues from the studies on CGGBP1, more such factors can be discovered for a better understanding of the evolution of mechanisms of cellular survival.
Affiliation(s)
- Umashankar Singh
- Biological Sciences and Engineering, Indian Institute of Technology, Gandhinagar, Gujarat, India
- Correspondence: Umashankar Singh, Biological Sciences and Engineering, Indian Institute of Technology, Gandhinagar, Gujarat, India.
- Bengt Westermark
- Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Rudbeck Laboratory, Uppsala University, Sweden
|