1
|
Duyzend MH, Cacheiro P, Jacobsen JO, Giordano J, Brand H, Wapner RJ, Talkowski ME, Robinson PN, Smedley D. Improving prenatal diagnosis through standards and aggregation. Prenat Diagn 2024; 44:454-464. [PMID: 38242839 PMCID: PMC11006584 DOI: 10.1002/pd.6522] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2023] [Revised: 12/17/2023] [Accepted: 12/22/2023] [Indexed: 01/21/2024]
Abstract
Advances in sequencing and imaging technologies enable enhanced assessment in the prenatal space, with a goal to diagnose and predict the natural history of disease, to direct targeted therapies, and to implement clinical management, including transfer of care, election of supportive care, and selection of surgical interventions. The current lack of standardization and aggregation stymies variant interpretation and gene discovery, which hinders the provision of prenatal precision medicine, leaving clinicians and patients without an accurate diagnosis. With large amounts of data generated, it is imperative to establish standards for data collection, processing, and aggregation. Aggregated and homogeneously processed genetic and phenotypic data permits dissection of the genomic architecture of prenatal presentations of disease and provides a dataset on which data analysis algorithms can be tuned to the prenatal space. Here we discuss the importance of generating aggregate data sets and how the prenatal space is driving the development of interoperable standards and phenotype-driven tools.
Affiliation(s)
- Michael H. Duyzend
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Division of Genetics and Genomics, Department of Pediatrics, Boston Children’s Hospital and Harvard Medical School, Boston, MA, USA
| | - Pilar Cacheiro
- William Harvey Research Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London EC1M 6BQ, UK
| | - Julius O.B. Jacobsen
- William Harvey Research Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London EC1M 6BQ, UK
| | - Jessica Giordano
- Department of Obstetrics & Gynecology, Columbia University Medical Center, New York, NY, USA
| | - Harrison Brand
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Neurology, Harvard Medical School, Boston, MA, USA
| | - Ronald J. Wapner
- Department of Obstetrics & Gynecology, Columbia University Medical Center, New York, NY, USA
| | - Michael E. Talkowski
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA
- Program in Medical and Population Genetics, The Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Neurology, Harvard Medical School, Boston, MA, USA
- Program in Biological and Biomedical Sciences, Division of Medical Sciences, Harvard Medical School, Boston, MA, USA
- Program in Bioinformatics and Integrative Genomics, Division of Medical Sciences, Harvard Medical School, Boston, MA, USA
| | - Peter N. Robinson
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Institute for Systems Genomics, University of Connecticut, Farmington, CT 06032, USA
| | - Damian Smedley
- William Harvey Research Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London EC1M 6BQ, UK
| |
|
2
|
Li N, Yang Z, Yang Y, Wang J, Lin H. Hyperbolic hierarchical knowledge graph embeddings for biological entities. J Biomed Inform 2023; 147:104503. [PMID: 37778673 DOI: 10.1016/j.jbi.2023.104503] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2023] [Revised: 08/25/2023] [Accepted: 09/19/2023] [Indexed: 10/03/2023]
Abstract
Predicting relationships between biological entities can greatly benefit important biomedical problems. Previous studies have attempted to represent biological entities and relationships in Euclidean space using embedding methods, which evaluate their semantic similarity by representing entities as numerical vectors. However, the limitation of these methods is that they cannot prevent the loss of latent hierarchical information when embedding large graph-structured data into Euclidean space, and therefore cannot capture the semantics of entities and relationships accurately. Hyperbolic spaces, such as the Poincaré ball, are better suited for hierarchical modeling than Euclidean spaces. This is because hyperbolic spaces exhibit negative curvature, causing distances to grow exponentially as they approach the boundary. In this paper, we propose HEM, a hyperbolic hierarchical knowledge graph embedding model to generate vector representations of bio-entities. By encoding the entities and relations in the hyperbolic space, HEM can capture latent hierarchical information and improve the accuracy of biological entity representation. Notably, HEM can preserve rich information with a low dimension compared with the methods that encode entities in Euclidean space. Furthermore, we explore the performance of HEM in protein-protein interaction prediction and gene-disease association prediction tasks. Experimental results demonstrate the superior performance of HEM over state-of-the-art baselines. The data and code are available at: https://github.com/Nan-ll/HEM.
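The statement that hyperbolic distances grow rapidly near the boundary can be illustrated with the standard Poincaré ball distance, d(u, v) = arcosh(1 + 2‖u − v‖² / ((1 − ‖u‖²)(1 − ‖v‖²))). The following is a minimal sketch of that textbook formula only. It is not taken from the HEM repository, and the function name and the sample points are assumptions chosen for illustration.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit ball.

    Hypothetical helper for illustration; not part of HEM.
    """
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v)))


# Two pairs of points with the same Euclidean gap of 0.1 (assumed sample data):
# one pair sits at the origin, the other close to the boundary of the ball.
near_origin = poincare_distance((0.0, 0.0), (0.1, 0.0))
near_boundary = poincare_distance((0.85, 0.0), (0.95, 0.0))

print(f"near origin:   {near_origin:.3f}")
print(f"near boundary: {near_boundary:.3f}")
```

Both pairs are 0.1 apart in Euclidean terms, yet the pair near the boundary is several times farther apart under the hyperbolic metric. This is the property that leaves room for deep hierarchies in a low-dimensional embedding.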
Affiliation(s)
- Nan Li
- College of Computer Science and Technology, Dalian University of Technology, Dalian, China
| | - Zhihao Yang
- College of Computer Science and Technology, Dalian University of Technology, Dalian, China
| | - Yumeng Yang
- College of Computer Science and Technology, Dalian University of Technology, Dalian, China
| | - Jian Wang
- College of Computer Science and Technology, Dalian University of Technology, Dalian, China
| | - Hongfei Lin
- College of Computer Science and Technology, Dalian University of Technology, Dalian, China
| |
|
3
|
Kaldunski ML, Smith JR, Brodie KC, De Pons JL, Demos WM, Gibson AC, Hayman GT, Lamers L, Laulederkind SJF, Thorat K, Thota J, Tutaj MA, Tutaj M, Vedi M, Wang SJ, Zacher S, Dwinell MR, Kwitek AE. Rare disease research resources at the Rat Genome Database. Genetics 2023; 224:iyad078. [PMID: 37119810 PMCID: PMC10411567 DOI: 10.1093/genetics/iyad078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2023] [Revised: 04/05/2023] [Accepted: 04/19/2023] [Indexed: 05/01/2023]
Abstract
Rare diseases individually affect relatively few people, but as a group they impact considerable numbers of people. The Rat Genome Database (https://rgd.mcw.edu) is a knowledgebase that offers resources for rare disease research. This includes disease definitions, genes, quantitative trait loci (QTLs), genetic variants, annotations to published literature, links to external resources, and more. One important resource is identifying relevant cell lines and rat strains that serve as models for disease research. Diseases, genes, and strains have report pages with consolidated data, and links to analysis tools. Utilizing these globally accessible resources for rare disease research, potentiating discovery of mechanisms and new treatments, can point researchers toward solutions to alleviate the suffering of those afflicted with these diseases.
Affiliation(s)
- Mary L Kaldunski
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Jennifer R Smith
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Kent C Brodie
- Clinical and Translational Science Institute, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Jeffrey L De Pons
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Wendy M Demos
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Adam C Gibson
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - G Thomas Hayman
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Logan Lamers
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Stanley J F Laulederkind
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Ketaki Thorat
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Jyothi Thota
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Marek A Tutaj
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Monika Tutaj
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Mahima Vedi
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Shur-Jen Wang
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Stacy Zacher
- Finance and Administration, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Melinda R Dwinell
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Anne E Kwitek
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Joint Department of Biomedical Engineering, Marquette University & Medical College of Wisconsin, Milwaukee, WI 53226, USA
| |
|
4
|
Khaveh N, Schachler K, Berghöfer J, Jung K, Metzger J. Altered hair root gene expression profiles highlight calcium signaling and lipid metabolism pathways to be associated with curly hair initiation and maintenance in Mangalitza pigs. Front Genet 2023; 14:1184015. [PMID: 37351343 PMCID: PMC10282778 DOI: 10.3389/fgene.2023.1184015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2023] [Accepted: 05/30/2023] [Indexed: 06/24/2023]
Abstract
Hair types have been under strong targeted selection in domestic animals for their impact on skin protection, thermoregulation and exterior morphology, and subsequent economic importance. In pigs, a very special hair phenotype was observed in Mangalitza, which expresses a thick coat of curly bristles and downy hair. Two breed-specific missense variants in TRPM2 and CYP4F3 were suggested to be associated with the Mangalitza pig's hair shape due to their role in hair follicle morphogenesis reported for humans and mice. However, the mechanism behind this expression of a curly hair type is still unclear and needs to be explored. In our study, hair shafts were measured and investigated for the curvature of the hair in Mangalitza and crossbreeds in comparison to straight-coated pigs. For molecular studies, hair roots underwent RNA sequencing for a differential gene expression analysis using DESeq2. The output matrix of normalized counts was then used to construct weighted gene co-expression networks. The resulting hair root gene expression profiles highlighted 454 genes to be significantly differentially expressed for initiation of curly hair phenotype in newborn Mangalitza piglets versus post-initiation in later development. Furthermore, 2,554 genes showed a significant differential gene expression in curly hair in comparison to straight hair. Neither TRPM2 nor CYP4F3 was identified as differentially expressed. Incidence of the genes in weighted co-expression networks associated with TRPM2 and CYP4F3, and prominent interactions of subsequent proteins with lipids and calcium-related pathways suggested calcium signaling and/or lipid metabolism as essential players in the induction of the curly hair as well as an ionic calcium-dependency to be a prominent factor for the maintenance of this phenotype. Subsequently, our study highlights the complex interrelations and dependencies of mutant genes TRPM2 and CYP4F3 and associated gene expression patterns, allowing the initiation of curly hair type during the development of a piglet as well as the maintenance in adult individuals.
Affiliation(s)
- Nadia Khaveh
- Research Group Veterinary Functional Genomics, Max Planck Institute for Molecular Genetics, Berlin, Germany
- Institute of Animal Breeding and Genetics, University of Veterinary Medicine Hannover, Hannover, Germany
| | - Kathrin Schachler
- Institute of Animal Breeding and Genetics, University of Veterinary Medicine Hannover, Hannover, Germany
| | - Jan Berghöfer
- Research Group Veterinary Functional Genomics, Max Planck Institute for Molecular Genetics, Berlin, Germany
- Department of Biology, Chemistry and Pharmacy, Institute of Chemistry and Biochemistry, Freie Universität Berlin, Berlin, Germany
| | - Klaus Jung
- Institute of Animal Breeding and Genetics, University of Veterinary Medicine Hannover, Hannover, Germany
| | - Julia Metzger
- Research Group Veterinary Functional Genomics, Max Planck Institute for Molecular Genetics, Berlin, Germany
- Institute of Animal Breeding and Genetics, University of Veterinary Medicine Hannover, Hannover, Germany
| |
|
5
|
Fu Y, Liu H, Dou J, Wang Y, Liao Y, Huang X, Tang Z, Xu J, Yin D, Zhu S, Liu Y, Shen X, Liu H, Liu J, Yang X, Zhang Y, Xiang Y, Li J, Zheng Z, Zhao Y, Ma Y, Wang H, Du X, Xie S, Xu X, Zhang H, Yin L, Zhu M, Yu M, Li X, Liu X, Zhao S. IAnimal: a cross-species omics knowledgebase for animals. Nucleic Acids Res 2022; 51:D1312-D1324. [PMID: 36300629 PMCID: PMC9825575 DOI: 10.1093/nar/gkac936] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/15/2022] [Revised: 09/23/2022] [Accepted: 10/11/2022] [Indexed: 01/30/2023]
Abstract
With the exponential growth of multi-omics data, its integration and utilization have brought unprecedented opportunities for the interpretation of gene regulation mechanisms and the comprehensive analyses of biological systems. IAnimal (https://ianimal.pro/), a cross-species, multi-omics knowledgebase, was developed to improve the utilization of massive public data and simplify the integration of multi-omics information to mine the genetic mechanisms of objective traits. Currently, IAnimal provides 61 191 individual omics data of genome (WGS), transcriptome (RNA-Seq), epigenome (ChIP-Seq, ATAC-Seq) and genome annotation information for 21 species, such as mice, pigs, cattle, chickens, and macaques. The scale of its total clean data has reached 846.46 TB. To better understand the biological significance of omics information, a deep learning model for IAnimal was built based on BioBERT and AutoNER to mine 'gene' and 'trait' entities from 2 794 237 abstracts, which has practical significance for comprehending how each omics layer regulates genes to affect traits. By means of user-friendly web interfaces, flexible data application programming interfaces, and abundant functional modules, IAnimal enables users to easily query, mine, and visualize characteristics in various omics, and to infer how genes play biological roles under the influence of various omics layers.
Affiliation(s)
- Yuhua Fu
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, Hubei 430070, PR China; Frontiers Science Center for Animal Breeding and Sustainable Production, Wuhan, Hubei 430070, PR China
| | - Hong Liu
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, Hubei 430070, PR China
| | - Jingwen Dou
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, Hubei 430070, PR China
| | - Yue Wang
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, Hubei 430070, PR China
| | - Yong Liao
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, Hubei 430070, PR China
| | - Xin Huang
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, Hubei 430070, PR China
| | - Zhenshuang Tang
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, Hubei 430070, PR China
| | - JingYa Xu
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, Hubei 430070, PR China
| | - Dong Yin
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, Hubei 430070, PR China
| | - Shilin Zhu
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, Hubei 430070, PR China
| | - Yangfan Liu
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, Hubei 430070, PR China
| | - Xiong Shen
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, Hubei 430070, PR China
| | - Hengyi Liu
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, Hubei 430070, PR China
| | - Jiaqi Liu
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, Hubei 430070, PR China
| | - Xin Yang
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, Hubei 430070, PR China
| | - Yi Zhang
- School of Computer Science and Technology, Wuhan University of Technology, Wuhan, Hubei 430070, PR China
| | - Yue Xiang
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, Hubei 430070, PR China
| | - Jingjin Li
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, Hubei 430070, PR China
| | - Zhuqing Zheng
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, Hubei 430070, PR China
| | - Yunxia Zhao
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, Hubei 430070, PR China; Frontiers Science Center for Animal Breeding and Sustainable Production, Wuhan, Hubei 430070, PR China
| | - Yunlong Ma
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, Hubei 430070, PR China; Frontiers Science Center for Animal Breeding and Sustainable Production, Wuhan, Hubei 430070, PR China
| | - Haiyan Wang
- Frontiers Science Center for Animal Breeding and Sustainable Production, Wuhan, Hubei 430070, PR China
| | - Xiaoyong Du
- Frontiers Science Center for Animal Breeding and Sustainable Production, Wuhan, Hubei 430070, PR China
| | - Shengsong Xie
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, Hubei 430070, PR China; Frontiers Science Center for Animal Breeding and Sustainable Production, Wuhan, Hubei 430070, PR China
| | - Xuewen Xu
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, Hubei 430070, PR China; Frontiers Science Center for Animal Breeding and Sustainable Production, Wuhan, Hubei 430070, PR China
| | - Haohao Zhang
- School of Computer Science and Technology, Wuhan University of Technology, Wuhan, Hubei 430070, PR China
| | - Lilin Yin
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, Hubei 430070, PR China; Frontiers Science Center for Animal Breeding and Sustainable Production, Wuhan, Hubei 430070, PR China
| | - Mengjin Zhu
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, Hubei 430070, PR China; Frontiers Science Center for Animal Breeding and Sustainable Production, Wuhan, Hubei 430070, PR China
| | - Mei Yu
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, Hubei 430070, PR China; Frontiers Science Center for Animal Breeding and Sustainable Production, Wuhan, Hubei 430070, PR China
| | - Xinyun Li
- Key Laboratory of Agricultural Animal Genetics, Breeding and Reproduction, Ministry of Education, Key Laboratory of Swine Genetics and Breeding, Ministry of Agriculture, College of Animal Science and Technology, Huazhong Agricultural University, Wuhan, Hubei 430070, PR China; Frontiers Science Center for Animal Breeding and Sustainable Production, Wuhan, Hubei 430070, PR China
| | - Xiaolei Liu
- Correspondence may also be addressed to Xiaolei Liu.
| | - Shuhong Zhao
- To whom correspondence should be addressed. Tel: +86 27 87387480;
| |
|
6
|
WhichTF is functionally important in your open chromatin data? PLoS Comput Biol 2022; 18:e1010378. [PMID: 36040971 PMCID: PMC9426921 DOI: 10.1371/journal.pcbi.1010378] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Received: 03/11/2022] [Accepted: 07/11/2022] [Indexed: 11/19/2022]
Abstract
We present WhichTF, a computational method to identify functionally important transcription factors (TFs) from chromatin accessibility measurements. To rank TFs, WhichTF applies an ontology-guided functional approach to compute novel enrichment by integrating accessibility measurements, high-confidence pre-computed conservation-aware TF binding sites, and putative gene-regulatory models. Comparison with prior sheer abundance-based methods reveals the unique ability of WhichTF to identify context-specific TFs with functional relevance, including NF-κB family members in lymphocytes and GATA factors in cardiac cells. To distinguish the transcriptional regulatory landscape in closely related samples, we apply differential analysis and demonstrate its utility in lymphocyte, mesoderm developmental, and disease cells. We find suggestive, under-characterized TFs, such as RUNX3 in mesoderm development and GLI1 in systemic lupus erythematosus. We also find TFs known for stress response, suggesting routine experimental caveats that warrant careful consideration. WhichTF yields biological insight into known and novel molecular mechanisms of TF-mediated transcriptional regulation in diverse contexts, including human and mouse cell types, cell fate trajectories, and disease-associated cells.
Transcription factors (TFs), a class of DNA binding proteins, regulate tissue- and cell-type-specific expression of genes. Identifying the critical TFs in a given cellular context leads to investigating molecular regulatory mechanisms in development, differentiation, and disease. Because there are more than 1,500 human TFs, experimental measurements of genome-wide occupancy across all TFs have been challenging. While computational approaches play pivotal roles, most existing methods rely on statistical enrichment, focusing either on sequence motif similarity recognized by TFs or the similarity of the genomic region of interest with the previously characterized TF occupancy profile. Here we propose WhichTF as an alternative, incorporating curated biomedical knowledge from ontology and integrating it with the high-confidence prediction of conserved TF binding sites in user-provided genomic regions of interest. We develop a new WhichTF score to rank TFs and demonstrate its applicability across human and mouse cell types, cellular differentiation trajectories, and disease-associated cells.
|
7
|
Fisher ME, Segerdell E, Matentzoglu N, Nenni MJ, Fortriede JD, Chu S, Pells TJ, Osumi-Sutherland D, Chaturvedi P, James-Zorn C, Sundararaj N, Lotay VS, Ponferrada V, Wang DZ, Kim E, Agalakov S, Arshinoff BI, Karimi K, Vize PD, Zorn AM. The Xenopus phenotype ontology: bridging model organism phenotype data to human health and development. BMC Bioinformatics 2022; 23:99. [PMID: 35317743 PMCID: PMC8939077 DOI: 10.1186/s12859-022-04636-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 11/12/2021] [Accepted: 03/08/2022] [Indexed: 11/10/2022]
Abstract
Background: Ontologies of precisely defined, controlled vocabularies are essential to curate the results of biological experiments such that the data are machine searchable, can be computationally analyzed, and are interoperable across the biomedical research continuum. There is also an increasing need for methods to interrelate phenotypic data easily and accurately from experiments in animal models with human development and disease.
Results: Here we present the Xenopus phenotype ontology (XPO) to annotate phenotypic data from experiments in Xenopus, one of the major vertebrate model organisms used to study gene function in development and disease. The XPO implements design patterns from the Unified Phenotype Ontology (uPheno), and the principles outlined by the Open Biological and Biomedical Ontologies (OBO Foundry) to maximize interoperability with other species and facilitate ongoing ontology management. Constructed in Web Ontology Language (OWL), the XPO combines the existing uPheno library of ontology design patterns with additional terms from the Xenopus Anatomy Ontology (XAO), the Phenotype and Trait Ontology (PATO) and the Gene Ontology (GO). The integration of these different ontologies into the XPO enables rich phenotypic curation, whilst the uPheno bridging axioms allow phenotypic data from Xenopus experiments to be related to phenotype data from other model organisms and human disease. Moreover, the simple post-composed uPheno design patterns facilitate ongoing XPO development as the generation of new terms and classes of terms can be substantially automated.
Conclusions: The XPO serves as an example of current best practices to help overcome many of the inherent challenges in harmonizing phenotype data between different species. The XPO currently consists of approximately 22,000 terms and is being used to curate phenotypes by Xenbase, the Xenopus Model Organism Knowledgebase, forming a standardized corpus of genotype–phenotype data that can be directly related to other uPheno compliant resources.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-022-04636-8.
Affiliation(s)
- Malcolm E Fisher
- Division of Developmental Biology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
| | - Erik Segerdell
- Division of Developmental Biology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
| | - Nicolas Matentzoglu
- Monarch Initiative, London, UK; Semanticly Ltd, London, UK; European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
| | - Mardi J Nenni
- Division of Developmental Biology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
| | - Joshua D Fortriede
- Division of Developmental Biology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
| | - Stanley Chu
- Department of Biological Science, University of Calgary, Calgary, AB, Canada
| | - Troy J Pells
- Department of Biological Science, University of Calgary, Calgary, AB, Canada
| | | | - Praneet Chaturvedi
- Division of Developmental Biology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
| | - Christina James-Zorn
- Division of Developmental Biology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
| | - Nivitha Sundararaj
- Division of Developmental Biology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
| | - Vaneet S Lotay
- Department of Biological Science, University of Calgary, Calgary, AB, Canada
| | - Virgilio Ponferrada
- Division of Developmental Biology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
| | - Dong Zhuo Wang
- Department of Biological Science, University of Calgary, Calgary, AB, Canada
| | - Eugene Kim
- Department of Biological Science, University of Calgary, Calgary, AB, Canada
| | - Sergei Agalakov
- Department of Biological Science, University of Calgary, Calgary, AB, Canada
| | - Bradley I Arshinoff
- Department of Biological Science, University of Calgary, Calgary, AB, Canada
| | - Kamran Karimi
- Department of Biological Science, University of Calgary, Calgary, AB, Canada
| | - Peter D Vize
- Department of Biological Science, University of Calgary, Calgary, AB, Canada
| | - Aaron M Zorn
- Division of Developmental Biology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA.
8
CoMent: relationships between biomedical concepts inferred from the scientific literature. J Mol Biol 2022; 434:167568. [DOI: 10.1016/j.jmb.2022.167568] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/29/2021] [Revised: 03/18/2022] [Accepted: 03/22/2022] [Indexed: 01/22/2023]
9
Schriml LM, Munro JB, Schor M, Olley D, McCracken C, Felix V, Baron JA, Jackson R, Bello SM, Bearer C, Lichenstein R, Bisordi K, Dialo NC, Giglio M, Greene C. The Human Disease Ontology 2022 update. Nucleic Acids Res 2021; 50:D1255-D1261. [PMID: 34755882 PMCID: PMC8728220 DOI: 10.1093/nar/gkab1063] [Citation(s) in RCA: 69] [Impact Index Per Article: 23.0] [Received: 09/22/2021] [Revised: 10/13/2021] [Accepted: 10/18/2021] [Indexed: 01/31/2023]
Abstract
The Human Disease Ontology (DO) (www.disease-ontology.org) database has significantly expanded the disease content and enhanced our userbase and website since the DO’s 2018 Nucleic Acids Research DATABASE issue paper. Conservatively, based on available resource statistics, terms from the DO have been annotated to over 1.5 million biomedical data elements and citations, a 10× increase in the past 5 years. The DO, funded as an NHGRI Genomic Resource, plays a key role in disease knowledge organization, representation, and standardization, serving as a reference framework for multiscale biomedical data integration and analysis across thousands of clinical, biomedical and computational research projects and genomic resources around the world. This update reports on the addition of 1,793 new disease terms, a 14% increase in textual definitions and the integration of 22,137 new SubClassOf axioms defining disease to disease connections representing the DO’s complex disease classification. The DO’s updated website provides multifaceted etiology searching, enhanced documentation and educational resources.
Affiliation(s)
- Lynn M Schriml
- University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
- James B Munro
- University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
- Mike Schor
- University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
- Dustin Olley
- University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
- Carrie McCracken
- University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
- Victor Felix
- University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
- J Allen Baron
- University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
- Susan M Bello
- Mouse Genome Informatics, The Jackson Laboratory, Bar Harbor, ME, USA
- Michelle Giglio
- University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
- Carol Greene
- University of Maryland School of Medicine, Baltimore, MD, USA
10
Konopka T, Vestito L, Smedley D. Dimensional reduction of phenotypes from 53 000 mouse models reveals a diverse landscape of gene function. Bioinformatics Advances 2021; 1:vbab026. [PMID: 34870209 PMCID: PMC8633315 DOI: 10.1093/bioadv/vbab026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2021] [Revised: 09/09/2021] [Accepted: 10/07/2021] [Indexed: 01/27/2023]
Abstract
Animal models have long been used to study gene function and the impact of genetic mutations on phenotype. Through the research efforts of thousands of research groups, systematic curation of published literature and high-throughput phenotyping screens, the collective body of knowledge for the mouse now covers the majority of protein-coding genes. We here collected data for over 53 000 mouse models with mutations in over 15 000 genomic markers and characterized by more than 254 000 annotations using more than 9000 distinct ontology terms. We investigated dimensional reduction and embedding techniques as means to facilitate access to this diverse and high-dimensional information. Our analyses provide the first visual maps of the landscape of mouse phenotypic diversity. We also summarize some of the difficulties in producing and interpreting embeddings of sparse phenotypic data. In particular, we show that data preprocessing, filtering and encoding have as much impact on the final embeddings as the process of dimensional reduction. Nonetheless, techniques developed in the context of dimensional reduction create opportunities for explorative analysis of this large pool of public data, including for searching for mouse models suited to study human diseases.
AVAILABILITY AND IMPLEMENTATION: Source code for analysis scripts is available on GitHub at https://github.com/tkonopka/mouse-embeddings. The data underlying this article are available in Zenodo at https://doi.org/10.5281/zenodo.4916171.
CONTACT: t.konopka@qmul.ac.uk.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics Advances online.
Affiliation(s)
- Tomasz Konopka
- William Harvey Research Institute, Queen Mary University of London, EC1M 6BQ London, UK. To whom correspondence should be addressed.
- Letizia Vestito
- William Harvey Research Institute, Queen Mary University of London, EC1M 6BQ London, UK; Ear Institute, University College London, WC1X 8EE London, UK; Great Ormond Street Institute of Child Health, University College London, WC1N 1EH London, UK
- Damian Smedley
- William Harvey Research Institute, Queen Mary University of London, EC1M 6BQ London, UK
11
Pleiotropy data resource as a primer for investigating co-morbidities/multi-morbidities and their role in disease. Mamm Genome 2021; 33:135-142. [PMID: 34524473 PMCID: PMC8913486 DOI: 10.1007/s00335-021-09917-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2021] [Accepted: 09/06/2021] [Indexed: 11/06/2022]
Abstract
Most current biomedical and protein research focuses only on a small proportion of genes, which results in a lost opportunity to identify new gene-disease associations and explore new opportunities for therapeutic intervention. The International Mouse Phenotyping Consortium (IMPC) focuses on elucidating gene function at scale for poorly characterized and/or under-studied genes. A key component of the IMPC initiative is the implementation of a broad phenotyping pipeline, which is facilitating the discovery of pleiotropy. Characterizing pleiotropy is essential to identify gene-disease associations, and it is of particular importance when elucidating the genetic causes of syndromic disorders. Here we show how the IMPC is effectively uncovering pleiotropy and how the new mouse models and gene function hypotheses generated by the IMPC are increasing our understanding of the mammalian genome, forming the basis of new research and identifying new gene-disease associations.
12
Lou P, Dong Y, Jimeno Yepes A, Li C. A representation model for biological entities by fusing structured axioms with unstructured texts. Bioinformatics 2021; 37:1156-1163. [PMID: 33107905 DOI: 10.1093/bioinformatics/btaa913] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2020] [Revised: 09/04/2020] [Accepted: 10/13/2020] [Indexed: 11/14/2022]
Abstract
MOTIVATION: Structured semantic resources, for example, biological knowledge bases and ontologies, formally define biological concepts, entities and their semantic relationships, manifested as structured axioms and unstructured texts (e.g. textual definitions). The resources contain accurate expressions of biological reality and have been used by machine-learning models to assist intelligent applications like knowledge discovery. The current methods use both the axioms and definitions as plain texts in representation learning (RL). However, since the axioms are machine-readable while the natural language is human-understandable, differences in the meaning of tokens and in structure prevent the representations from encoding the desired biological knowledge.
RESULTS: We propose ERBK, an RL model of bio-entities. Instead of using the axioms and definitions as a textual corpus, our method uses a knowledge graph embedding method and deep convolutional neural models to encode the axioms and definitions respectively. The representations could not only encode more underlying biological knowledge but also be further applied to zero-shot circumstances where existing approaches fall short. Experimental evaluations show that ERBK outperforms the existing methods for predicting protein-protein interactions and gene-disease associations. Moreover, they show that ERBK still maintains promising performance under zero-shot circumstances. We believe the representations and the method have certain generality and could extend to other types of bio-relation.
AVAILABILITY AND IMPLEMENTATION: The source code is available at the gitlab repository https://gitlab.com/BioAI/erbk.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Peiliang Lou
- School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China; Key Laboratory of Intelligent Networks and Network Security (Xi'an Jiaotong University), Ministry of Education, Xi'an, Shaanxi 710049, China
- YuXin Dong
- School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China
- Chen Li
- School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China; National Engineering Lab for Big Data Analytics, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China
13
Blake JA, Baldarelli R, Kadin JA, Richardson JE, Smith C, Bult CJ. Mouse Genome Database (MGD): Knowledgebase for mouse-human comparative biology. Nucleic Acids Res 2021; 49:D981-D987. [PMID: 33231642 PMCID: PMC7779030 DOI: 10.1093/nar/gkaa1083] [Citation(s) in RCA: 165] [Impact Index Per Article: 55.0] [Received: 09/15/2020] [Revised: 10/18/2020] [Accepted: 11/22/2020] [Indexed: 11/17/2022]
Abstract
The Mouse Genome Database (MGD; http://www.informatics.jax.org) is the community model organism knowledgebase for the laboratory mouse, a widely used animal model for comparative studies of the genetic and genomic basis for human health and disease. MGD is the authoritative source for biological reference data related to mouse genes, gene functions, phenotypes and mouse models of human disease. MGD is the primary source for official gene, allele, and mouse strain nomenclature based on the guidelines set by the International Committee on Standardized Nomenclature for Mice. MGD's biocuration scientists curate information from the biomedical literature and from large and small datasets contributed directly by investigators. In this report we describe significant enhancements to the content and interfaces at MGD, including (i) improvements in the Multi Genome Viewer for exploring the genomes of multiple mouse strains, (ii) inclusion of many more mouse strains and new mouse strain pages with extended query options and (iii) integration of extensive data about mouse strain variants. We also describe improvements to the efficiency of literature curation processes and the implementation of an information portal focused on mouse models and genes for the study of COVID-19.
14
DeepPheno: Predicting single gene loss-of-function phenotypes using an ontology-aware hierarchical classifier. PLoS Comput Biol 2020; 16:e1008453. [PMID: 33206638 PMCID: PMC7710064 DOI: 10.1371/journal.pcbi.1008453] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 07/16/2020] [Revised: 12/02/2020] [Accepted: 10/20/2020] [Indexed: 12/21/2022]
Abstract
Predicting the phenotypes resulting from molecular perturbations is one of the key challenges in genetics. Both forward and reverse genetic screens are employed to identify the molecular mechanisms underlying phenotypes and disease, and these have resulted in a large number of genotype–phenotype associations being available for humans and model organisms. Combined with recent advances in machine learning, it may now be possible to predict human phenotypes resulting from particular molecular aberrations. We developed DeepPheno, a neural network based hierarchical multi-class multi-label classification method for predicting the phenotypes resulting from loss-of-function in single genes. DeepPheno uses the functional annotations of gene products to predict the phenotypes resulting from a loss-of-function; additionally, we employ a two-step procedure in which we predict these functions first and then predict phenotypes. Prediction of phenotypes is ontology-based and we propose a novel ontology-based classifier suitable for very large hierarchical classification tasks. These methods allow us to predict phenotypes associated with any known protein-coding gene. We evaluate our approach using evaluation metrics established by the CAFA challenge and compare with top performing CAFA2 methods as well as several state of the art phenotype prediction approaches, demonstrating the improvement of DeepPheno over established methods. Furthermore, we show that predictions generated by DeepPheno are applicable to predicting gene–disease associations based on comparing phenotypes, and that a large number of new predictions made by DeepPheno have recently been added to phenotype databases.
Gene–phenotype associations can help to understand the underlying mechanisms of many genetic diseases. However, experimental identification, often involving animal models, is time consuming and expensive. Computational methods that predict gene–phenotype associations can be used instead. We developed DeepPheno, a novel approach for predicting the phenotypes resulting from a loss of function of a single gene. We use gene functions and gene expression as information to predict phenotypes. Our method uses a neural network classifier that is able to account for hierarchical dependencies between phenotypes. We extensively evaluate our method and compare it with related approaches, and we show that DeepPheno results in better performance in several evaluations. Furthermore, we found that many of the new predictions made by our method have been added to phenotype association databases released one year later. Overall, DeepPheno simulates some aspects of human physiology and how molecular and physiological alterations lead to abnormal phenotypes.
15
Smaili FZ, Gao X, Hoehndorf R. Formal axioms in biomedical ontologies improve analysis and interpretation of associated data. Bioinformatics 2020; 36:2229-2236. [PMID: 31821406 PMCID: PMC7141863 DOI: 10.1093/bioinformatics/btz920] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 04/15/2019] [Revised: 10/16/2019] [Accepted: 12/06/2019] [Indexed: 12/30/2022]
Abstract
Motivation: Over the past years, significant resources have been invested into formalizing biomedical ontologies. Formal axioms in ontologies have been developed and used to detect and ensure ontology consistency, find unsatisfiable classes, improve interoperability, guide ontology extension through the application of axiom-based design patterns and encode domain background knowledge. The domain knowledge of biomedical ontologies may also have the potential to provide background knowledge for machine learning and predictive modelling.
Results: We use ontology-based machine learning methods to evaluate the contribution of formal axioms and ontology meta-data to the prediction of protein–protein interactions and gene–disease associations. We find that the background knowledge provided by the Gene Ontology and other ontologies significantly improves the performance of ontology-based prediction models through provision of domain-specific background knowledge. Furthermore, we find that the labels, synonyms and definitions in ontologies can also provide background knowledge that may be exploited for prediction. The axioms and meta-data of different ontologies contribute to improving data analysis in a context-specific manner. Our results have implications for the further development of formal knowledge bases and ontologies in the life sciences, in particular as machine learning methods are more frequently being applied. Our findings motivate the need for further development, and the systematic, application-driven evaluation and improvement, of formal axioms in ontologies.
Availability and implementation: https://github.com/bio-ontology-research-group/tsoe.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Fatima Zohra Smaili
- Computer, Electrical & Mathematical Sciences and Engineering (CEMSE) Division, Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Xin Gao
- Computer, Electrical & Mathematical Sciences and Engineering (CEMSE) Division, Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Robert Hoehndorf
- Computer, Electrical & Mathematical Sciences and Engineering (CEMSE) Division, Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
16
Abdelhakim M, McMurray E, Syed AR, Kafkas S, Kamau AA, Schofield PN, Hoehndorf R. DDIEM: drug database for inborn errors of metabolism. Orphanet J Rare Dis 2020; 15:146. [PMID: 32527280 PMCID: PMC7291537 DOI: 10.1186/s13023-020-01428-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 01/19/2020] [Accepted: 05/28/2020] [Indexed: 12/13/2022]
Abstract
BACKGROUND: Inborn errors of metabolism (IEM) represent a subclass of rare inherited diseases caused by a wide range of defects in metabolic enzymes or their regulation. Of over a thousand characterized IEMs, only about half are understood at the molecular level, and overall the development of treatment and management strategies has proved challenging. An overview of the changing landscape of therapeutic approaches is helpful in assessing strategic patterns in the approach to therapy, but the information is scattered throughout the literature and public data resources.
RESULTS: We gathered data on therapeutic strategies for 300 diseases into the Drug Database for Inborn Errors of Metabolism (DDIEM). Therapeutic approaches, including both successful and ineffective treatments, were manually classified by their mechanisms of action using a new ontology.
CONCLUSIONS: We present a manually curated, ontologically formalized knowledgebase of drugs, therapeutic procedures, and mitigated phenotypes. DDIEM is freely available through a web interface and for download at http://ddiem.phenomebrowser.net.
Affiliation(s)
- Marwa Abdelhakim
- Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology, 4700 KAUST, Thuwal, 23955 Kingdom of Saudi Arabia
- Computer, Electrical and Mathematical Sciences & Engineering Division (CEMSE), King Abdullah University of Science and Technology, 4700 KAUST, Thuwal, PO 23955 Saudi Arabia
- Eunice McMurray
- Department of Physiology, Development & Neuroscience, University of Cambridge, Downing Street, Cambridge, CB2 3EG United Kingdom
- Ali Raza Syed
- Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology, 4700 KAUST, Thuwal, 23955 Kingdom of Saudi Arabia
- Computer, Electrical and Mathematical Sciences & Engineering Division (CEMSE), King Abdullah University of Science and Technology, 4700 KAUST, Thuwal, PO 23955 Saudi Arabia
- Senay Kafkas
- Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology, 4700 KAUST, Thuwal, 23955 Kingdom of Saudi Arabia
- Computer, Electrical and Mathematical Sciences & Engineering Division (CEMSE), King Abdullah University of Science and Technology, 4700 KAUST, Thuwal, PO 23955 Saudi Arabia
- Allan Anthony Kamau
- Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology, 4700 KAUST, Thuwal, 23955 Kingdom of Saudi Arabia
- Paul N Schofield
- Department of Physiology, Development & Neuroscience, University of Cambridge, Downing Street, Cambridge, CB2 3EG United Kingdom
- Robert Hoehndorf
- Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology, 4700 KAUST, Thuwal, 23955 Kingdom of Saudi Arabia
- Computer, Electrical and Mathematical Sciences & Engineering Division (CEMSE), King Abdullah University of Science and Technology, 4700 KAUST, Thuwal, PO 23955 Saudi Arabia
17
Components of genetic associations across 2,138 phenotypes in the UK Biobank highlight adipocyte biology. Nat Commun 2019; 10:4064. [PMID: 31492854 PMCID: PMC6731283 DOI: 10.1038/s41467-019-11953-9] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Received: 03/18/2019] [Accepted: 08/14/2019] [Indexed: 01/25/2023]
Abstract
Population-based biobanks with genomic and dense phenotype data provide opportunities for generating effective therapeutic hypotheses and understanding the genomic role in disease predisposition. To characterize latent components of genetic associations, we apply truncated singular value decomposition (DeGAs) to matrices of summary statistics derived from genome-wide association analyses across 2,138 phenotypes measured in 337,199 White British individuals in the UK Biobank study. We systematically identify key components of genetic associations and the contributions of variants, genes, and phenotypes to each component. As an illustration of the utility of the approach to inform downstream experiments, we report putative loss of function variants, rs114285050 (GPR151) and rs150090666 (PDE3B), that substantially contribute to obesity-related traits and experimentally demonstrate the role of these genes in adipocyte biology. Our approach to dissect components of genetic associations across the human phenome will accelerate biomedical hypothesis generation by providing insights on previously unexplored latent structures.
18
New models for human disease from the International Mouse Phenotyping Consortium. Mamm Genome 2019; 30:143-150. [PMID: 31127358 PMCID: PMC6606664 DOI: 10.1007/s00335-019-09804-5] [Citation(s) in RCA: 43] [Impact Index Per Article: 8.6] [Received: 02/11/2019] [Accepted: 05/15/2019] [Indexed: 12/21/2022]
Abstract
The International Mouse Phenotyping Consortium (IMPC) continues to expand the catalogue of mammalian gene function by conducting genome- and phenome-wide phenotyping on knockout mouse lines. The extensive and standardized phenotype screens allow the identification of new potential models for human disease through cross-species comparison by computing the similarity between the phenotypes observed in the mutant mice and the human phenotypes associated with their orthologous loci in Mendelian disease. Here, we present an update on the novel disease models available from the most recent data release (DR10.0), with 5861 mouse genes fully or partially phenotyped and a total number of 69,982 phenotype calls reported. With approximately one-third of human Mendelian genes with orthologous null mouse phenotypes described, the range of available models relevant for human diseases keeps increasing. Among the breadth of new data, we identify previously uncharacterized disease genes in the mouse and additional phenotypes for genes with existing mutant lines mimicking the associated disorder. The automated and unbiased discovery of relevant models for all types of rare diseases implemented by the IMPC constitutes a powerful tool for human genetics and precision medicine.
19
Nenni MJ, Fisher ME, James-Zorn C, Pells TJ, Ponferrada V, Chu S, Fortriede JD, Burns KA, Wang Y, Lotay VS, Wang DZ, Segerdell E, Chaturvedi P, Karimi K, Vize PD, Zorn AM. Xenbase: Facilitating the Use of Xenopus to Model Human Disease. Front Physiol 2019; 10:154. [PMID: 30863320 PMCID: PMC6399412 DOI: 10.3389/fphys.2019.00154] [Citation(s) in RCA: 44] [Impact Index Per Article: 8.8] [Received: 11/06/2018] [Accepted: 02/08/2019] [Indexed: 01/02/2023]
Abstract
At a fundamental level, most genes, signaling pathways, biological functions and organ systems are highly conserved between humans and other vertebrate species. Leveraging this conservation, researchers are increasingly using the experimental advantages of the amphibian Xenopus to model human disease. The online Xenopus resource, Xenbase, enables human disease modeling by curating the Xenopus literature published in PubMed and integrating these Xenopus data with orthologous human genes, anatomy, and more recently with links to the Online Mendelian Inheritance in Man resource (OMIM) and the Human Disease Ontology (DO). Here we review how Xenbase supports disease modeling and report on a meta-analysis of the published Xenopus research providing an overview of the different types of diseases being modeled in Xenopus and the variety of experimental approaches being used. Text mining of over 50,000 Xenopus research articles imported into Xenbase from PubMed identified approximately 1,000 putative disease-modeling articles. These articles were manually assessed and annotated with disease ontologies, which were then used to classify papers based on disease type. We found that Xenopus is being used to study a diverse array of diseases with three main experimental approaches: cell-free egg extracts to study fundamental aspects of cellular and molecular biology, oocytes to study ion transport and channel physiology and embryo experiments focused on congenital diseases. We integrated these data into Xenbase Disease Pages to allow easy navigation to disease information on external databases. Results of this analysis will equip Xenopus researchers with a suite of experimental approaches available to model or dissect a pathological process. Ideally, clinicians and basic researchers will use this information to foster collaborations necessary to interrogate the development and treatment of human diseases.
Affiliation(s)
- Mardi J Nenni
- Division of Developmental Biology, Cincinnati Children's Hospital, Cincinnati, OH, United States
- Malcolm E Fisher
- Division of Developmental Biology, Cincinnati Children's Hospital, Cincinnati, OH, United States
- Christina James-Zorn
- Division of Developmental Biology, Cincinnati Children's Hospital, Cincinnati, OH, United States
- Troy J Pells
- Department of Biological Sciences, University of Calgary, Calgary, AB, Canada
- Virgilio Ponferrada
- Division of Developmental Biology, Cincinnati Children's Hospital, Cincinnati, OH, United States
- Stanley Chu
- Department of Biological Sciences, University of Calgary, Calgary, AB, Canada
- Joshua D Fortriede
- Division of Developmental Biology, Cincinnati Children's Hospital, Cincinnati, OH, United States
- Kevin A Burns
- Division of Developmental Biology, Cincinnati Children's Hospital, Cincinnati, OH, United States
- Ying Wang
- Department of Biological Sciences, University of Calgary, Calgary, AB, Canada
- Vaneet S Lotay
- Department of Biological Sciences, University of Calgary, Calgary, AB, Canada
- Dong Zhou Wang
- Department of Biological Sciences, University of Calgary, Calgary, AB, Canada
- Erik Segerdell
- Institute of Ecology and Evolution, University of Oregon, Eugene, OR, United States
- Praneet Chaturvedi
- Division of Developmental Biology, Cincinnati Children's Hospital, Cincinnati, OH, United States
- Kamran Karimi
- Department of Biological Sciences, University of Calgary, Calgary, AB, Canada
- Peter D Vize
- Department of Biological Sciences, University of Calgary, Calgary, AB, Canada
- Aaron M Zorn
- Division of Developmental Biology, Cincinnati Children's Hospital, Cincinnati, OH, United States
20
Smaili FZ, Gao X, Hoehndorf R. OPA2Vec: combining formal and informal content of biomedical ontologies to improve similarity-based prediction. Bioinformatics 2018; 35:2133-2140. [DOI: 10.1093/bioinformatics/bty933] [Citation(s) in RCA: 65] [Impact Index Per Article: 10.8] [Received: 08/05/2018] [Revised: 11/02/2018] [Accepted: 11/07/2018] [Indexed: 12/11/2022]
Affiliation(s)
- Fatima Zohra Smaili
- Computer, Electrical & Mathematical Sciences and Engineering (CEMSE) Division, Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Xin Gao
- Computer, Electrical & Mathematical Sciences and Engineering (CEMSE) Division, Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Robert Hoehndorf
- Computer, Electrical & Mathematical Sciences and Engineering (CEMSE) Division, Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
|
21
|
Gkoutos GV, Schofield PN, Hoehndorf R. The anatomy of phenotype ontologies: principles, properties and applications. Brief Bioinform 2018; 19:1008-1021. [PMID: 28387809 PMCID: PMC6169674 DOI: 10.1093/bib/bbx035] [Citation(s) in RCA: 46] [Impact Index Per Article: 7.7] [Received: 01/01/2017] [Revised: 02/05/2017] [Indexed: 12/14/2022] Open
Abstract
The past decade has seen an explosion in the collection of genotype data in domains as diverse as medicine, ecology, livestock and plant breeding. Along with this comes the challenge of dealing with the related phenotype data, which is not only large but also highly multidimensional. Computational analysis of phenotypes has therefore become critical for our ability to understand the biological meaning of genomic data in the biological sciences. At the heart of computational phenotype analysis are the phenotype ontologies. A large number of these ontologies have been developed across many domains, and we are now at a point where the knowledge captured in the structure of these ontologies can be used for the integration and analysis of large interrelated data sets. The Phenotype And Trait Ontology framework provides a method for formal definitions of phenotypes and associated data sets and has proved to be key to our ability to develop methods for the integration and analysis of phenotype data. Here, we describe the development and products of the ontological approach to phenotype capture, the formal content of phenotype ontologies and how their content can be used computationally.
Affiliation(s)
| | | | - Robert Hoehndorf
- Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal
|
22
|
Kulmanov M, Schofield PN, Gkoutos GV, Hoehndorf R. Ontology-based validation and identification of regulatory phenotypes. Bioinformatics 2018; 34:i857-i865. [PMID: 30423068 PMCID: PMC6129279 DOI: 10.1093/bioinformatics/bty605] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 01/17/2023] Open
Abstract
Motivation: Function annotations of gene products, and phenotype annotations of genotypes, provide valuable information about molecular mechanisms that can be utilized by computational methods to identify functional and phenotypic relatedness, improve our understanding of disease and pathobiology, and lead to discovery of drug targets. Identifying functions and phenotypes commonly requires experiments which are time-consuming and expensive to carry out; creating the annotations additionally requires a curator to make an assertion based on reported evidence. Support to validate the mutual consistency of functional and phenotype annotations as well as a computational method to predict phenotypes from function annotations, would greatly improve the utility of function annotations. Results: We developed a novel ontology-based method to validate the mutual consistency of function and phenotype annotations. We apply our method to mouse and human annotations, and identify several inconsistencies that can be resolved to improve overall annotation quality. We also apply our method to the rule-based prediction of regulatory phenotypes from functions and demonstrate that we can predict these phenotypes with Fmax of up to 0.647. Availability and implementation: https://github.com/bio-ontology-research-group/phenogocon.
Affiliation(s)
- Maxat Kulmanov
- Computer, Electrical and Mathematical Sciences and Engineering Division, Computational Bioscience Research Centre, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
| | - Paul N Schofield
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, UK
| | - Georgios V Gkoutos
- College of Medical and Dental Sciences, Institute of Cancer and Genomic Sciences, Centre for Computational Biology, University of Birmingham, Birmingham, UK
- Institute of Translational Medicine, University Hospitals Birmingham, NHS Foundation Trust, Birmingham, UK
- NIHR Experimental Cancer Medicine Centre, Birmingham, UK
- NIHR Surgical Reconstruction and Microbiology Research Centre, Birmingham, UK
- NIHR Biomedical Research Centre, Birmingham, UK
| | - Robert Hoehndorf
- Computer, Electrical and Mathematical Sciences and Engineering Division, Computational Bioscience Research Centre, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
|
23
|
The International Mouse Phenotyping Consortium (IMPC): a functional catalogue of the mammalian genome that informs conservation. Conserv Genet 2018; 19:995-1005. [PMID: 30100824 PMCID: PMC6061128 DOI: 10.1007/s10592-018-1072-9] [Citation(s) in RCA: 65] [Impact Index Per Article: 10.8] [Received: 12/05/2017] [Accepted: 05/03/2018] [Indexed: 01/08/2023]
Abstract
The International Mouse Phenotyping Consortium (IMPC) is building a catalogue of mammalian gene function by producing and phenotyping a knockout mouse line for every protein-coding gene. To date, the IMPC has generated and characterised 5186 mutant lines. One-third of the lines have been found to be non-viable and over 300 new mouse models of human disease have been identified thus far. While current bioinformatics efforts are focused on translating results to better understand human disease processes, IMPC data also aids understanding genetic function and processes in other species. Here we show, using gorilla genomic data, how genes essential to development in mice can be used to help assess the potentially deleterious impact of gene variants in other species. This type of analyses could be used to select optimal breeders in endangered species to maintain or increase fitness and avoid variants associated to impaired-health phenotypes or loss-of-function mutations in genes of critical importance. We also show, using selected examples from various mammal species, how IMPC data can aid in the identification of candidate genes for studying a condition of interest, deliver information about the mechanisms involved, or support predictions for the function of genes that may play a role in adaptation. With genotyping costs decreasing and the continued improvements of bioinformatics tools, the analyses we demonstrate can be routinely applied.
|
24
|
Mouse Genome Informatics (MGI) Is the International Resource for Information on the Laboratory Mouse. Methods Mol Biol 2018; 1757:141-161. [PMID: 29761459 DOI: 10.1007/978-1-4939-7737-6_7] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Indexed: 12/12/2022]
Abstract
Mouse Genome Informatics (MGI, http://www.informatics.jax.org/) web resources provide free access to meticulously curated information about the laboratory mouse. MGI's primary goal is to help researchers investigate the genetic foundations of human diseases by translating information from mouse phenotypes and disease models studies to human systems. MGI provides comprehensive phenotypes for over 50,000 mutant alleles in mice and provides experimental model descriptions for over 1500 human diseases. Curated data from scientific publications are integrated with those from high-throughput phenotyping and gene expression centers. Data are standardized using defined, hierarchical vocabularies such as the Mammalian Phenotype (MP) Ontology, Mouse Developmental Anatomy and the Gene Ontologies (GO). This chapter introduces you to Gene and Allele Detail pages and provides step-by-step instructions for simple searches and those that take advantage of the breadth of MGI data integration.
|
25
|
Shimoyama M, Laulederkind SJF, De Pons J, Nigam R, Smith JR, Tutaj M, Petri V, Hayman GT, Wang SJ, Ghiasvand O, Thota J, Dwinell MR. Exploring human disease using the Rat Genome Database. Dis Model Mech 2017; 9:1089-1095. [PMID: 27736745 PMCID: PMC5087824 DOI: 10.1242/dmm.026021] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Indexed: 01/08/2023] Open
Abstract
Rattus norvegicus, the laboratory rat, has been a crucial model for studies of the environmental and genetic factors associated with human diseases for over 150 years. It is the primary model organism for toxicology and pharmacology studies, and has features that make it the model of choice in many complex-disease studies. Since 1999, the Rat Genome Database (RGD; http://rgd.mcw.edu) has been the premier resource for genomic, genetic, phenotype and strain data for the laboratory rat. The primary role of RGD is to curate rat data and validate orthologous relationships with human and mouse genes, and make these data available for incorporation into other major databases such as NCBI, Ensembl and UniProt. RGD also provides official nomenclature for rat genes, quantitative trait loci, strains and genetic markers, as well as unique identifiers. The RGD team adds enormous value to these basic data elements through functional and disease annotations, the analysis and visual presentation of pathways, and the integration of phenotype measurement data for strains used as disease models. Because much of the rat research community focuses on understanding human diseases, RGD provides a number of datasets and software tools that allow users to easily explore and make disease-related connections among these datasets. RGD also provides comprehensive human and mouse data for comparative purposes, illustrating the value of the rat in translational research. This article introduces RGD and its suite of tools and datasets to researchers - within and beyond the rat community - who are particularly interested in leveraging rat-based insights to understand human diseases.
Affiliation(s)
- Mary Shimoyama
- Medical College of Wisconsin, Department of Surgery, Milwaukee, WI 53226, USA
| | | | - Jeff De Pons
- Medical College of Wisconsin, Department of Surgery, Milwaukee, WI 53226, USA
| | - Rajni Nigam
- Medical College of Wisconsin, Department of Surgery, Milwaukee, WI 53226, USA
| | - Jennifer R Smith
- Medical College of Wisconsin, Department of Surgery, Milwaukee, WI 53226, USA
| | - Marek Tutaj
- Medical College of Wisconsin, Department of Surgery, Milwaukee, WI 53226, USA
| | - Victoria Petri
- Medical College of Wisconsin, Department of Surgery, Milwaukee, WI 53226, USA
| | - G Thomas Hayman
- Medical College of Wisconsin, Department of Surgery, Milwaukee, WI 53226, USA
| | - Shur-Jen Wang
- Medical College of Wisconsin, Department of Surgery, Milwaukee, WI 53226, USA
| | - Omid Ghiasvand
- Medical College of Wisconsin, Department of Surgery, Milwaukee, WI 53226, USA
| | - Jyothi Thota
- Medical College of Wisconsin, Department of Surgery, Milwaukee, WI 53226, USA
| | - Melinda R Dwinell
- Medical College of Wisconsin, Department of Physiology, Milwaukee, WI 53226, USA
|
26
|
Meehan TF, Conte N, West DB, Jacobsen JO, Mason J, Warren J, Chen CK, Tudose I, Relac M, Matthews P, Karp N, Santos L, Fiegel T, Ring N, Westerberg H, Greenaway S, Sneddon D, Morgan H, Codner GF, Stewart ME, Brown J, Horner N, Haendel M, Washington N, Mungall CJ, Reynolds CL, Gallegos J, Gailus-Durner V, Sorg T, Pavlovic G, Bower LR, Moore M, Morse I, Gao X, Tocchini-Valentini GP, Obata Y, Cho SY, Seong JK, Seavitt J, Beaudet AL, Dickinson ME, Herault Y, Wurst W, de Angelis MH, Lloyd KK, Flenniken AM, Nutter LMJ, Newbigging S, McKerlie C, Justice MJ, Murray SA, Svenson KL, Braun RE, White JK, Bradley A, Flicek P, Wells S, Skarnes WC, Adams DJ, Parkinson H, Mallon AM, Brown SD, Smedley D. Disease model discovery from 3,328 gene knockouts by The International Mouse Phenotyping Consortium. Nat Genet 2017; 49:1231-1238. [PMID: 28650483 PMCID: PMC5546242 DOI: 10.1038/ng.3901] [Citation(s) in RCA: 161] [Impact Index Per Article: 23.0] [Received: 10/12/2016] [Accepted: 05/25/2017] [Indexed: 12/12/2022]
Abstract
Although next-generation sequencing has revolutionized the ability to associate variants with human diseases, diagnostic rates and development of new therapies are still limited by a lack of knowledge of the functions and pathobiological mechanisms of most genes. To address this challenge, the International Mouse Phenotyping Consortium is creating a genome- and phenome-wide catalog of gene function by characterizing new knockout-mouse strains across diverse biological systems through a broad set of standardized phenotyping tests. All mice will be readily available to the biomedical community. Analyzing the first 3,328 genes identified models for 360 diseases, including the first models, to our knowledge, for type C Bernard-Soulier, Bardet-Biedl-5 and Gordon Holmes syndromes. 90% of our phenotype annotations were novel, providing functional evidence for 1,092 genes and candidates in genetically uncharacterized diseases including arrhythmogenic right ventricular dysplasia 3. Finally, we describe our role in variant functional validation with The 100,000 Genomes Project and others.
Affiliation(s)
- Terrence F. Meehan
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Nathalie Conte
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - David B. West
- Children’s Hospital Oakland Research Institute, Oakland, California 94609, USA
| | - Julius O. Jacobsen
- William Harvey Research Institute, Queen Mary University of London, London, E1 4NS, UK
| | - Jeremy Mason
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Jonathan Warren
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Chao-Kung Chen
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Ilinca Tudose
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Mike Relac
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Peter Matthews
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Natasha Karp
- The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
| | - Luis Santos
- Medical Research Council Harwell (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire OX11 0RD, UK
| | - Tanja Fiegel
- Medical Research Council Harwell (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire OX11 0RD, UK
| | - Natalie Ring
- Medical Research Council Harwell (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire OX11 0RD, UK
| | - Henrik Westerberg
- Medical Research Council Harwell (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire OX11 0RD, UK
| | - Simon Greenaway
- Medical Research Council Harwell (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire OX11 0RD, UK
| | - Duncan Sneddon
- Medical Research Council Harwell (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire OX11 0RD, UK
| | - Hugh Morgan
- Medical Research Council Harwell (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire OX11 0RD, UK
| | - Gemma F Codner
- Medical Research Council Harwell (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire OX11 0RD, UK
| | - Michelle E Stewart
- Medical Research Council Harwell (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire OX11 0RD, UK
| | - James Brown
- Medical Research Council Harwell (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire OX11 0RD, UK
| | - Neil Horner
- Medical Research Council Harwell (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire OX11 0RD, UK
| | | | - Melissa Haendel
- Department of Medical Informatics and Clinical Epidemiology and OHSU Library, Oregon Health & Science University, Portland, OR, 97239, USA
| | - Nicole Washington
- Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Christopher J. Mungall
- Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Corey L Reynolds
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Juan Gallegos
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Valerie Gailus-Durner
- Helmholtz Zentrum München, German Research Center for Environmental Health, Institute of Experimental Genetics, Neuherberg 85764, Germany
| | - Tania Sorg
- CELPHEDIA, PHENOMIN, Institut Clinique de la Souris (ICS), 1 rue Laurent Fries, F-67404 Illkirch-Graffenstaden, France
- Institut de Génétique et de Biologie Moléculaire et Cellulaire (IGBMC), Université de Strasbourg, Illkirch, France
- Centre National de la Recherche Scientifique, UMR7104, Illkirch, France
- Institut National de la Santé et de la Recherche Médicale, U964, Illkirch, France
| | - Guillaume Pavlovic
- CELPHEDIA, PHENOMIN, Institut Clinique de la Souris (ICS), 1 rue Laurent Fries, F-67404 Illkirch-Graffenstaden, France
- Institut de Génétique et de Biologie Moléculaire et Cellulaire (IGBMC), Université de Strasbourg, Illkirch, France
- Centre National de la Recherche Scientifique, UMR7104, Illkirch, France
- Institut National de la Santé et de la Recherche Médicale, U964, Illkirch, France
| | - Lynette R Bower
- Mouse Biology Program, University of California, Davis, California 95618, USA
| | - Mark Moore
- IMPC, San Anselmo, California 94960, USA
| | - Iva Morse
- Charles River Laboratories, Wilmington, Massachusetts 01887, USA
| | - Xiang Gao
- SKL of Pharmaceutical Biotechnology and Model Animal Research Center, Collaborative Innovation Center for Genetics and Development, Nanjing Biomedical Research Institute, Nanjing University, Nanjing 210061, China
| | - Glauco P Tocchini-Valentini
- Monterotondo Mouse Clinic, Italian National Research Council (CNR), Institute of Cell Biology and Neurobiology, Monterotondo Scalo I-00015, Italy
| | - Yuichi Obata
- RIKEN BioResource Center, Tsukuba, Ibaraki 305-0074, Japan
| | - Soo Young Cho
- Korea Mouse Phenotyping Center, 08826, Republic of Korea
- National Cancer Center, Goyang, Gyeonggi, 10408, Republic of Korea
| | - Je Kyung Seong
- Korea Mouse Phenotyping Center, 08826, Republic of Korea
- Research Institute for Veterinary Science, Seoul National University, Republic of Korea
| | - John Seavitt
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Arthur L. Beaudet
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Mary E. Dickinson
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Yann Herault
- CELPHEDIA, PHENOMIN, Institut Clinique de la Souris (ICS), 1 rue Laurent Fries, F-67404 Illkirch-Graffenstaden, France
- Institut de Génétique et de Biologie Moléculaire et Cellulaire (IGBMC), Université de Strasbourg, Illkirch, France
- Centre National de la Recherche Scientifique, UMR7104, Illkirch, France
- Institut National de la Santé et de la Recherche Médicale, U964, Illkirch, France
| | - Wolfgang Wurst
- Helmholtz Zentrum München, German Research Center for Environmental Health, Institute of Experimental Genetics, Neuherberg 85764, Germany
| | - Martin Hrabe de Angelis
- Helmholtz Zentrum München, German Research Center for Environmental Health, Institute of Experimental Genetics, Neuherberg 85764, Germany
| | - K.C. Kent Lloyd
- Mouse Biology Program, University of California, Davis, California 95618, USA
| | - Ann M Flenniken
- The Centre for Phenogenomics, Toronto, Ontario M5T 3H7, Canada
| | | | | | - Colin McKerlie
- The Centre for Phenogenomics, Toronto, Ontario M5T 3H7, Canada
| | - Monica J. Justice
- Mouse Imaging Centre, The Hospital for Sick Children, Toronto, Ontario M5T 3H7, Canada
| | | | | | | | - Jacqueline K. White
- The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
| | - Allan Bradley
- The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
| | - Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Sara Wells
- Medical Research Council Harwell (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire OX11 0RD, UK
| | - William C. Skarnes
- The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
| | - David J. Adams
- The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
| | - Helen Parkinson
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Ann-Marie Mallon
- Medical Research Council Harwell (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire OX11 0RD, UK
| | - Steve D.M. Brown
- Medical Research Council Harwell (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire OX11 0RD, UK
| | - Damian Smedley
- William Harvey Research Institute, Queen Mary University of London, London, E1 4NS, UK
|
27
|
Eppig JT. Mouse Genome Informatics (MGI) Resource: Genetic, Genomic, and Biological Knowledgebase for the Laboratory Mouse. ILAR J 2017; 58:17-41. [PMID: 28838066 PMCID: PMC5886341 DOI: 10.1093/ilar/ilx013] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.9] [Received: 07/25/2016] [Revised: 03/14/2017] [Accepted: 03/28/2017] [Indexed: 12/13/2022] Open
Abstract
The Mouse Genome Informatics (MGI) Resource supports basic, translational, and computational research by providing high-quality, integrated data on the genetics, genomics, and biology of the laboratory mouse. MGI serves a strategic role for the scientific community in facilitating biomedical, experimental, and computational studies investigating the genetics and processes of diseases and enabling the development and testing of new disease models and therapeutic interventions. This review describes the nexus of the body of growing genetic and biological data and the advances in computer technology in the late 1980s, including the World Wide Web, that together launched the beginnings of MGI. MGI develops and maintains a gold-standard resource that reflects the current state of knowledge, provides semantic and contextual data integration that fosters hypothesis testing, continually develops new and improved tools for searching and analysis, and partners with the scientific community to assure research data needs are met. Here we describe one slice of MGI relating to the development of community-wide large-scale mutagenesis and phenotyping projects and introduce ways to access and use these MGI data. References and links to additional MGI aspects are provided.
Affiliation(s)
- Janan T. Eppig
- Janan T. Eppig, PhD, is Professor Emeritus at The Jackson Laboratory in Bar Harbor, Maine
|
28
|
Eppig JT, Smith CL, Blake JA, Ringwald M, Kadin JA, Richardson JE, Bult CJ. Mouse Genome Informatics (MGI): Resources for Mining Mouse Genetic, Genomic, and Biological Data in Support of Primary and Translational Research. Methods Mol Biol 2017; 1488:47-73. [PMID: 27933520 DOI: 10.1007/978-1-4939-6427-7_3] [Citation(s) in RCA: 62] [Impact Index Per Article: 8.9] [Indexed: 12/18/2022]
Abstract
The Mouse Genome Informatics (MGI) resource (www.informatics.jax.org) has existed for over 25 years, and over this time its data content, informatics infrastructure, and user interfaces and tools have undergone dramatic changes (Eppig et al., Mamm Genome 26:272-284, 2015). Change has been driven by scientific methodological advances, rapid improvements in computational software, growth in computer hardware capacity, and the ongoing collaborative nature of the mouse genomics community in building resources and sharing data. Here we present an overview of the current data content of MGI, describe its general organization, and provide examples using simple and complex searches, and tools for mining and retrieving sets of data.
Affiliation(s)
- Janan T Eppig
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME, 04609, USA
| | - Cynthia L Smith
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME, 04609, USA
| | - Judith A Blake
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME, 04609, USA
| | - Martin Ringwald
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME, 04609, USA
| | - James A Kadin
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME, 04609, USA
| | | | - Carol J Bult
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME, 04609, USA
|
29
|
Abstract
The Gene Ontology (GO) is a framework designed to represent biological knowledge about gene products' biological roles and the cellular location in which they act. Biocuration is a complex process: the body of scientific literature is large and selection of appropriate GO terms can be challenging. Both these issues are compounded by the fact that our understanding of biology is still incomplete; hence it is important to appreciate that GO is inherently an evolving model. In this chapter, we describe how biocurators create GO annotations from experimental findings from research articles. We describe the current best practices for high-quality literature curation and how GO curators succeed in modeling biology using a relatively simple framework. We also highlight a number of difficulties when translating experimental assays into GO annotations.
Affiliation(s)
- Sylvain Poux
- Swiss-Prot group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, 1 rue Michel Servet, 1211, Geneva 4, Switzerland
| | - Pascale Gaudet
- CALIPHO group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, 1 rue Michel Servet, 1211, Geneva 4, Switzerland
- Department of Human Protein Sciences, Faculty of Medicine, University of Geneva, Geneva, Switzerland
|
30
|
Mungall CJ, McMurry JA, Köhler S, Balhoff JP, Borromeo C, Brush M, Carbon S, Conlin T, Dunn N, Engelstad M, Foster E, Gourdine JP, Jacobsen JOB, Keith D, Laraway B, Lewis SE, NguyenXuan J, Shefchek K, Vasilevsky N, Yuan Z, Washington N, Hochheiser H, Groza T, Smedley D, Robinson PN, Haendel MA. The Monarch Initiative: an integrative data and analytic platform connecting phenotypes to genotypes across species. Nucleic Acids Res 2016; 45:D712-D722. [PMID: 27899636 PMCID: PMC5210586 DOI: 10.1093/nar/gkw1128] [Citation(s) in RCA: 189] [Impact Index Per Article: 23.6] [Received: 08/22/2016] [Revised: 10/26/2016] [Accepted: 11/02/2016] [Indexed: 02/04/2023] Open
Abstract
The correlation of phenotypic outcomes with genetic variation and environmental factors is a core pursuit in biology and biomedicine. Numerous challenges impede our progress: patient phenotypes may not match known diseases, candidate variants may be in genes that have not been characterized, model organisms may not recapitulate human or veterinary diseases, filling evolutionary gaps is difficult, and many resources must be queried to find potentially significant genotype–phenotype associations. Non-human organisms have proven instrumental in revealing biological mechanisms. Advanced informatics tools can identify phenotypically relevant disease models in research and diagnostic contexts. Large-scale integration of model organism and clinical research data can provide a breadth of knowledge not available from individual sources and can provide contextualization of data back to these sources. The Monarch Initiative (monarchinitiative.org) is a collaborative, open science effort that aims to semantically integrate genotype–phenotype data from many species and sources in order to support precision medicine, disease modeling, and mechanistic exploration. Our integrated knowledge graph, analytic tools, and web services enable diverse users to explore relationships between phenotypes and genotypes across species.
Affiliation(s)
- Christopher J Mungall
  - Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Julie A McMurry
  - Department of Medical Informatics and Clinical Epidemiology and OHSU Library, Oregon Health & Science University, Portland, OR, 97239, USA
- Sebastian Köhler
  - Institute for Medical Genetics and Human Genetics, Charité-Universitätsmedizin Berlin, Augustenburger Platz 1, 13353 Berlin, Germany
- Charles Borromeo
  - Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, 15260, USA
- Matthew Brush
  - Department of Medical Informatics and Clinical Epidemiology and OHSU Library, Oregon Health & Science University, Portland, OR, 97239, USA
- Seth Carbon
  - Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Tom Conlin
  - Department of Medical Informatics and Clinical Epidemiology and OHSU Library, Oregon Health & Science University, Portland, OR, 97239, USA
- Nathan Dunn
  - Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Mark Engelstad
  - Department of Medical Informatics and Clinical Epidemiology and OHSU Library, Oregon Health & Science University, Portland, OR, 97239, USA
- Erin Foster
  - Department of Medical Informatics and Clinical Epidemiology and OHSU Library, Oregon Health & Science University, Portland, OR, 97239, USA
- J P Gourdine
  - Department of Medical Informatics and Clinical Epidemiology and OHSU Library, Oregon Health & Science University, Portland, OR, 97239, USA
- Julius O B Jacobsen
  - William Harvey Research Institute, Barts & The London School of Medicine & Dentistry, Queen Mary University of London, Charterhouse Square, London EC1M 6BQ, UK
- Dan Keith
  - Department of Medical Informatics and Clinical Epidemiology and OHSU Library, Oregon Health & Science University, Portland, OR, 97239, USA
- Bryan Laraway
  - Department of Medical Informatics and Clinical Epidemiology and OHSU Library, Oregon Health & Science University, Portland, OR, 97239, USA
- Suzanna E Lewis
  - Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Jeremy NguyenXuan
  - Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Kent Shefchek
  - Department of Medical Informatics and Clinical Epidemiology and OHSU Library, Oregon Health & Science University, Portland, OR, 97239, USA
- Nicole Vasilevsky
  - Department of Medical Informatics and Clinical Epidemiology and OHSU Library, Oregon Health & Science University, Portland, OR, 97239, USA
- Zhou Yuan
  - Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, 15260, USA
- Nicole Washington
  - Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Harry Hochheiser
  - Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, 15260, USA
- Tudor Groza
  - Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Darlinghurst, NSW 2010, Australia
- Damian Smedley
  - William Harvey Research Institute, Barts & The London School of Medicine & Dentistry, Queen Mary University of London, Charterhouse Square, London EC1M 6BQ, UK
- Peter N Robinson
  - Institute for Medical Genetics and Human Genetics, Charité-Universitätsmedizin Berlin, Augustenburger Platz 1, 13353 Berlin, Germany
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA
- Melissa A Haendel
  - Department of Medical Informatics and Clinical Epidemiology and OHSU Library, Oregon Health & Science University, Portland, OR, 97239, USA
|
31
|
Dickinson ME, Flenniken AM, Ji X, Teboul L, Wong MD, White JK, Meehan TF, Weninger WJ, Westerberg H, Adissu H, Baker CN, Bower L, Brown JM, Caddle LB, Chiani F, Clary D, Cleak J, Daly MJ, Denegre JM, Doe B, Dolan ME, Edie SM, Fuchs H, Gailus-Durner V, Galli A, Gambadoro A, Gallegos J, Guo S, Horner NR, Hsu CW, Johnson SJ, Kalaga S, Keith LC, Lanoue L, Lawson TN, Lek M, Mark M, Marschall S, Mason J, McElwee ML, Newbigging S, Nutter LM, Peterson KA, Ramirez-Solis R, Rowland DJ, Ryder E, Samocha KE, Seavitt JR, Selloum M, Szoke-Kovacs Z, Tamura M, Trainor AG, Tudose I, Wakana S, Warren J, Wendling O, West DB, Wong L, Yoshiki A, MacArthur DG, Tocchini-Valentini GP, Gao X, Flicek P, Bradley A, Skarnes WC, Justice MJ, Parkinson HE, Moore M, Wells S, Braun RE, Svenson KL, de Angelis MH, Herault Y, Mohun T, Mallon AM, Henkelman RM, Brown SD, Adams DJ, Lloyd KK, McKerlie C, Beaudet AL, Bucan M, Murray SA. High-throughput discovery of novel developmental phenotypes. Nature 2016; 537:508-514. [PMID: 27626380 PMCID: PMC5295821 DOI: 10.1038/nature19356] [Citation(s) in RCA: 787] [Impact Index Per Article: 98.4] [Received: 11/12/2015] [Accepted: 08/10/2016] [Indexed: 12/29/2022]
Abstract
Approximately one-third of all mammalian genes are essential for life. Phenotypes resulting from knockouts of these genes in mice have provided tremendous insight into gene function and congenital disorders. As part of the International Mouse Phenotyping Consortium effort to generate and phenotypically characterize 5,000 knockout mouse lines, here we identify 410 lethal genes during the production of the first 1,751 unique gene knockouts. Using a standardized phenotyping platform that incorporates high-resolution 3D imaging, we identify phenotypes at multiple time points for previously uncharacterized genes and additional phenotypes for genes with previously reported mutant phenotypes. Unexpectedly, our analysis reveals that incomplete penetrance and variable expressivity are common even on a defined genetic background. In addition, we show that human disease genes are enriched for essential genes, thus providing a dataset that facilitates the prioritization and validation of mutations identified in clinical sequencing efforts.
Affiliation(s)
- Mary E. Dickinson
  - Department of Molecular Physiology and Biophysics, Houston, Texas, USA
- Ann M. Flenniken
  - The Centre for Phenogenomics, Toronto, Ontario, Canada
  - Mount Sinai Hospital, Toronto, Ontario, Canada
- Xiao Ji
  - Genomics and Computational Biology Program, Perelman School of Medicine, University of Pennsylvania, Philadelphia PA 19104
- Lydia Teboul
  - Medical Research Council Harwell (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire, UK
- Michael D. Wong
  - The Centre for Phenogenomics, Toronto, Ontario, Canada
  - Mouse Imaging Centre, The Hospital for Sick Children, Toronto, Ontario, Canada
- Jacqueline K. White
  - The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
- Terrence F. Meehan
  - European Molecular Biology Laboratory-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
- Wolfgang J. Weninger
  - Centre for Anatomy and Cell Biology, Medical University of Vienna, Vienna, Austria
- Henrik Westerberg
  - Medical Research Council Harwell (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire, UK
- Hibret Adissu
  - The Centre for Phenogenomics, Toronto, Ontario, Canada
  - The Hospital for Sick Children, Toronto, Ontario, Canada
- Lynette Bower
  - Mouse Biology Program, University of California, Davis
- James M. Brown
  - Medical Research Council Harwell (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire, UK
- Francesco Chiani
  - Monterotondo Mouse Clinic, Italian National Research Council (CNR), Institute of Cell Biology and Neurobiology, Monterotondo Scalo, Italy
- Dave Clary
  - Mouse Biology Program, University of California, Davis
- James Cleak
  - Medical Research Council Harwell (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire, UK
- Mark J. Daly
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston MA, USA
  - Program in Medical and Population Genetics, Broad Institute MIT and Harvard, Cambridge, MA, USA
- Brendan Doe
  - The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
- Helmut Fuchs
  - Helmholtz Zentrum München, German Research Center for Environmental Health, Institute of Experimental Genetics and German Mouse Clinic, Neuherberg, Germany
- Valerie Gailus-Durner
  - Helmholtz Zentrum München, German Research Center for Environmental Health, Institute of Experimental Genetics and German Mouse Clinic, Neuherberg, Germany
- Antonella Galli
  - The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
- Alessia Gambadoro
  - Monterotondo Mouse Clinic, Italian National Research Council (CNR), Institute of Cell Biology and Neurobiology, Monterotondo Scalo, Italy
- Juan Gallegos
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX USA
- Shiying Guo
  - SKL of Pharmaceutical Biotechnology and Model Animal Research Center, Collaborative Innovation Center for Genetics and Development, Nanjing Biomedical Research Institute, Nanjing University, China
- Neil R. Horner
  - Medical Research Council Harwell (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire, UK
- Chih-wei Hsu
  - Department of Molecular Physiology and Biophysics, Houston, Texas, USA
- Sara J. Johnson
  - Medical Research Council Harwell (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire, UK
- Sowmya Kalaga
  - Department of Molecular Physiology and Biophysics, Houston, Texas, USA
- Lance C. Keith
  - Department of Molecular Physiology and Biophysics, Houston, Texas, USA
- Louise Lanoue
  - Mouse Biology Program, University of California, Davis
- Thomas N. Lawson
  - Medical Research Council Harwell (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire, UK
- Monkol Lek
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston MA, USA
  - Program in Medical and Population Genetics, Broad Institute MIT and Harvard, Cambridge, MA, USA
- Manuel Mark
  - Infrastructure Nationale PHENOMIN, Institut Clinique de la Souris (ICS), et Institut de Génétique Biologie Moléculaire et Cellulaire (IGBMC) CNRS, INSERM, University of Strasbourg, Illkirch-Graffenstaden, France
- Susan Marschall
  - Helmholtz Zentrum München, German Research Center for Environmental Health, Institute of Experimental Genetics and German Mouse Clinic, Neuherberg, Germany
- Jeremy Mason
  - European Molecular Biology Laboratory-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
- Susan Newbigging
  - The Centre for Phenogenomics, Toronto, Ontario, Canada
  - The Hospital for Sick Children, Toronto, Ontario, Canada
- Lauryl M.J. Nutter
  - The Centre for Phenogenomics, Toronto, Ontario, Canada
  - The Hospital for Sick Children, Toronto, Ontario, Canada
- Ramiro Ramirez-Solis
  - The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
- Edward Ryder
  - The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
- Kaitlin E. Samocha
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston MA, USA
  - Program in Medical and Population Genetics, Broad Institute MIT and Harvard, Cambridge, MA, USA
- John R. Seavitt
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX USA
- Mohammed Selloum
  - Infrastructure Nationale PHENOMIN, Institut Clinique de la Souris (ICS), et Institut de Génétique Biologie Moléculaire et Cellulaire (IGBMC) CNRS, INSERM, University of Strasbourg, Illkirch-Graffenstaden, France
- Zsombor Szoke-Kovacs
  - Medical Research Council Harwell (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire, UK
- Ilinca Tudose
  - European Molecular Biology Laboratory-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
- Jonathan Warren
  - European Molecular Biology Laboratory-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
- Olivia Wendling
  - Infrastructure Nationale PHENOMIN, Institut Clinique de la Souris (ICS), et Institut de Génétique Biologie Moléculaire et Cellulaire (IGBMC) CNRS, INSERM, University of Strasbourg, Illkirch-Graffenstaden, France
- David B. West
  - Children’s Hospital Oakland Research Institute, Oakland, CA 94609
- Leeyean Wong
  - Department of Molecular Physiology and Biophysics, Houston, Texas, USA
- Daniel G. MacArthur
  - Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston MA, USA
  - Program in Medical and Population Genetics, Broad Institute MIT and Harvard, Cambridge, MA, USA
- Glauco P. Tocchini-Valentini
  - Monterotondo Mouse Clinic, Italian National Research Council (CNR), Institute of Cell Biology and Neurobiology, Monterotondo Scalo, Italy
- Xiang Gao
  - SKL of Pharmaceutical Biotechnology and Model Animal Research Center, Collaborative Innovation Center for Genetics and Development, Nanjing Biomedical Research Institute, Nanjing University, China
- Paul Flicek
  - European Molecular Biology Laboratory-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
- Allan Bradley
  - The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
- William C. Skarnes
  - The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
- Monica J. Justice
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX USA
  - The Hospital for Sick Children, Toronto, Ontario, Canada
- Helen E. Parkinson
  - European Molecular Biology Laboratory-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
- Sara Wells
  - Medical Research Council Harwell (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire, UK
- Martin Hrabe de Angelis
  - Helmholtz Zentrum München, German Research Center for Environmental Health, Institute of Experimental Genetics and German Mouse Clinic, Neuherberg, Germany
  - Chair of Experimental Genetics, School of Life Science Weihenstephan, Technische Universität München, Freising
  - German Center for Diabetes Research (DZD), Neuherberg, Germany
- Yann Herault
  - Infrastructure Nationale PHENOMIN, Institut Clinique de la Souris (ICS), et Institut de Génétique Biologie Moléculaire et Cellulaire (IGBMC) CNRS, INSERM, University of Strasbourg, Illkirch-Graffenstaden, France
- Tim Mohun
  - The Francis Crick Institute Mill Hill Laboratory, The Ridgeway, Mill Hill, London, UK
- Ann-Marie Mallon
  - Medical Research Council Harwell (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire, UK
- R. Mark Henkelman
  - The Centre for Phenogenomics, Toronto, Ontario, Canada
  - Mouse Imaging Centre, The Hospital for Sick Children, Toronto, Ontario, Canada
- Steve D.M. Brown
  - Medical Research Council Harwell (Mammalian Genetics Unit and Mary Lyon Centre), Harwell, Oxfordshire, UK
- David J. Adams
  - The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
- Colin McKerlie
  - The Centre for Phenogenomics, Toronto, Ontario, Canada
  - The Hospital for Sick Children, Toronto, Ontario, Canada
- Arthur L. Beaudet
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX USA
- Maja Bucan
  - Departments of Genetics and Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia PA 19104
|
32
|
Soldatova LN, Collier N, Oellrich A, Groza T, Verspoor K, Rocca-Serra P, Dumontier M, Shah NH. Special issue on bio-ontologies and phenotypes. J Biomed Semantics 2015; 6:40. [PMID: 26682035 PMCID: PMC4682270 DOI: 10.1186/s13326-015-0040-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 04/19/2015] [Accepted: 11/15/2015] [Indexed: 11/10/2022]
Abstract
The bio-ontologies and phenotypes special issue includes eight papers selected from the 11 papers presented at the Bio-Ontologies SIG (Special Interest Group) and the Phenotype Day at ISMB (Intelligent Systems for Molecular Biology) conference in Boston in 2014. The selected papers span a wide range of topics including the automated re-use and update of ontologies, quality assessment of ontological resources, and the systematic description of phenotype variation, driven by manual, semi- and fully automatic means.
Affiliation(s)
- Tudor Groza
  - The Garvan Institute of Medical Research, Sydney, Australia
|
33
|
Mungall CJ, Washington NL, Nguyen-Xuan J, Condit C, Smedley D, Köhler S, Groza T, Shefchek K, Hochheiser H, Robinson PN, Lewis SE, Haendel MA. Use of model organism and disease databases to support matchmaking for human disease gene discovery. Hum Mutat 2015; 36:979-84. [PMID: 26269093 PMCID: PMC5473253 DOI: 10.1002/humu.22857] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Received: 05/04/2015] [Accepted: 07/22/2015] [Indexed: 11/10/2022]
Abstract
The Matchmaker Exchange application programming interface (API) allows searching a patient's genotypic or phenotypic profiles across clinical sites, for the purposes of cohort discovery and variant disease causal validation. This API can be used not only to search for matching patients, but also to match against public disease and model organism data. This public disease data enable matching known diseases and variant-phenotype associations using phenotype semantic similarity algorithms developed by the Monarch Initiative. The model data can provide additional evidence to aid diagnosis, suggest relevant models for disease mechanism and treatment exploration, and identify collaborators across the translational divide. The Monarch Initiative provides an implementation of this API for searching multiple integrated sources of data that contextualize the knowledge about any given patient or patient family into the greater biomedical knowledge landscape. While this corpus of data can aid diagnosis, it is also the beginning of research to improve understanding of rare human diseases.
Affiliation(s)
- Nicole L. Washington
  - Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, California, USA
- Jeremy Nguyen-Xuan
  - Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, California, USA
- Christopher Condit
  - San Diego Supercomputing Center, UC San Diego, La Jolla, California, USA
- Damian Smedley
  - Wellcome Trust Sanger Institute, Mouse Informatics group, Hinxton, UK
- Sebastian Köhler
  - Charité - Universitätsmedizin Berlin, Institute for Medical and Human Genetics, Berlin, Germany
- Tudor Groza
  - Garvan Institute, Kinghorn Centre for Clinical Genomics, Sydney, Australia
- Kent Shefchek
  - Department of Biomedical Informatics and Clinical Epidemiology, Oregon Health and Science University
- Harry Hochheiser
  - Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
- Peter N. Robinson
  - Charité - Universitätsmedizin Berlin, Institute for Medical and Human Genetics, Berlin, Germany
- Suzanna E. Lewis
  - Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, California, USA
- Melissa A. Haendel
  - Department of Biomedical Informatics and Clinical Epidemiology, Oregon Health and Science University
|
34
|
|
35
|
Bello SM, Smith CL, Eppig JT. Allele, phenotype and disease data at Mouse Genome Informatics: improving access and analysis. Mamm Genome 2015; 26:285-94. [PMID: 26162703 PMCID: PMC4534497 DOI: 10.1007/s00335-015-9582-y] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 05/15/2015] [Accepted: 06/23/2015] [Indexed: 11/16/2022]
Abstract
A core part of the Mouse Genome Informatics (MGI) resource is the collection of mouse mutations and the annotation of phenotypes and diseases displayed by mice carrying these mutations. These data are integrated with the rest of data in MGI and exported to numerous other resources. The use of mouse phenotype data to drive translational research into human disease has expanded rapidly with the improvements in sequencing technology. MGI has implemented many improvements in allele and phenotype data annotation, search, and display to facilitate access to these data through multiple avenues. For example, the description of alleles has been modified to include more detailed categories of allele attributes. This allows improved discrimination between mutation types. Further, connections have been created between mutations involving multiple genes and each of the genes overlapping the mutation. This allows users to readily find all mutations affecting a gene and see all genes affected by a mutation. In a similar manner, the genes expressed by transgenic or knock-in alleles are now connected to these alleles. The advanced search forms and public reports have been updated to take advantage of these improvements. These search forms and reports are used by an expanding number of researchers to identify novel human disease genes and mouse models of human disease.
Affiliation(s)
- Susan M Bello
  - Mouse Genome Informatics, The Jackson Laboratory, Bar Harbor, ME, 04609, USA
|