1
Faria D, Eugénio P, Contreiras Silva M, Balbi L, Bedran G, Kallor AA, Nunes S, Palkowski A, Waleron M, Alfaro JA, Pesquita C. The Immunopeptidomics Ontology (ImPO). Database (Oxford) 2024; 2024:baae014. [PMID: 38857186 PMCID: PMC11164101 DOI: 10.1093/database/baae014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2023] [Revised: 11/30/2023] [Accepted: 02/22/2024] [Indexed: 06/12/2024]
Abstract
The adaptive immune response plays a vital role in eliminating infected and aberrant cells from the body. This process hinges on the presentation of short peptides by major histocompatibility complex Class I molecules on the cell surface. Immunopeptidomics, the study of peptides displayed on cells, delves into the wide variety of these peptides. Understanding the mechanisms behind antigen processing and presentation is crucial for effectively evaluating cancer immunotherapies. As an emerging domain, immunopeptidomics currently lacks standardization: there is neither an established terminology nor formally defined semantics, a critical concern considering the complexity, heterogeneity, and growing volume of data involved in immunopeptidomics studies. Additionally, there is a disconnect between how the proteomics community delivers information about antigen presentation and its uptake by the clinical genomics community. Considering the significant relevance of immunopeptidomics in cancer, this shortcoming must be addressed to bridge the gap between research and clinical practice. In this work, we detail the development of the ImmunoPeptidomics Ontology, ImPO, the first effort at standardizing the terminology and semantics in the domain. ImPO aims to encapsulate and systematize data generated by immunopeptidomics experimental processes and bioinformatics analysis. ImPO establishes cross-references to 24 relevant ontologies, including the National Cancer Institute Thesaurus, Mondo Disease Ontology, Logical Observation Identifier Names and Codes and Experimental Factor Ontology. Although ImPO was developed using expert knowledge to characterize a large and representative data collection, it may be readily used to encode other datasets within the domain. Ultimately, ImPO facilitates data integration and analysis, enabling querying, inference and knowledge generation and, importantly, bridging the gap between the clinical proteomics and genomics communities.
As the field of immunogenomics uses protein-level immunopeptidomics data, we expect ImPO to play a key role in supporting a rich and standardized description of the large-scale data that emerging high-throughput technologies are expected to bring in the near future.
Ontology URL: https://zenodo.org/record/10237571
Project GitHub: https://github.com/liseda-lab/ImPO/blob/main/ImPO.owl
Affiliation(s)
- Daniel Faria
  - INESC-ID, Instituto Superior Técnico, Universidade de Lisboa, Rua Alves Redol, 9, Lisboa 1000-029, Portugal
- Patrícia Eugénio
  - LASIGE, Faculdade de Ciências da Universidade de Lisboa, Campo Grande, Lisboa 1749-016, Portugal
- Marta Contreiras Silva
  - LASIGE, Faculdade de Ciências da Universidade de Lisboa, Campo Grande, Lisboa 1749-016, Portugal
- Laura Balbi
  - LASIGE, Faculdade de Ciências da Universidade de Lisboa, Campo Grande, Lisboa 1749-016, Portugal
- Georges Bedran
  - International Centre for Cancer Vaccine Science, University of Gdansk, ul. Kładki 24, Gdańsk 80-822, Poland
- Ashwin Adrian Kallor
  - International Centre for Cancer Vaccine Science, University of Gdansk, ul. Kładki 24, Gdańsk 80-822, Poland
- Susana Nunes
  - LASIGE, Faculdade de Ciências da Universidade de Lisboa, Campo Grande, Lisboa 1749-016, Portugal
- Aleksander Palkowski
  - International Centre for Cancer Vaccine Science, University of Gdansk, ul. Kładki 24, Gdańsk 80-822, Poland
- Michal Waleron
  - International Centre for Cancer Vaccine Science, University of Gdansk, ul. Kładki 24, Gdańsk 80-822, Poland
- Javier A Alfaro
  - International Centre for Cancer Vaccine Science, University of Gdansk, ul. Kładki 24, Gdańsk 80-822, Poland
  - Department of Biochemistry and Microbiology, University of Victoria, 3800 Finnerty Rd, Victoria, British Columbia, BC V8P 5C2, Canada
  - Institute for Adaptive and Neural Computation, School of Informatics, University of Edinburgh, Old College, South Bridge, Edinburgh, EH8 9YL, UK
  - The Canadian Association for Responsible AI in Medicine, Victoria, Canada
- Catia Pesquita
  - LASIGE, Faculdade de Ciências da Universidade de Lisboa, Campo Grande, Lisboa 1749-016, Portugal
2
Stefancsik R, Balhoff JP, Balk MA, Ball RL, Bello SM, Caron AR, Chesler EJ, de Souza V, Gehrke S, Haendel M, Harris LW, Harris NL, Ibrahim A, Koehler S, Matentzoglu N, McMurry JA, Mungall CJ, Munoz-Torres MC, Putman T, Robinson P, Smedley D, Sollis E, Thessen AE, Vasilevsky N, Walton DO, Osumi-Sutherland D. The Ontology of Biological Attributes (OBA): computational traits for the life sciences. Mamm Genome 2023; 34:364-378. [PMID: 37076585 PMCID: PMC10382347 DOI: 10.1007/s00335-023-09992-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 01/27/2023] [Accepted: 04/06/2023] [Indexed: 04/21/2023]
Abstract
Existing phenotype ontologies were originally developed to represent phenotypes that manifest as a character state in relation to a wild-type or other reference. However, these do not include the phenotypic trait or attribute categories required for the annotation of genome-wide association studies (GWAS), Quantitative Trait Loci (QTL) mappings or any population-focussed measurable trait data. The integration of trait and biological attribute information with an ever increasing body of chemical, environmental and biological data greatly facilitates computational analyses and it is also highly relevant to biomedical and clinical applications. The Ontology of Biological Attributes (OBA) is a formalised, species-independent collection of interoperable phenotypic trait categories that is intended to fulfil a data integration role. OBA is a standardised representational framework for observable attributes that are characteristics of biological entities, organisms, or parts of organisms. OBA has a modular design which provides several benefits for users and data integrators, including an automated and meaningful classification of trait terms computed on the basis of logical inferences drawn from domain-specific ontologies for cells, anatomical and other relevant entities. The logical axioms in OBA also provide a previously missing bridge that can computationally link Mendelian phenotypes with GWAS and quantitative traits. The term components in OBA provide semantic links and enable knowledge and data integration across specialised research community boundaries, thereby breaking silos.
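The compositional design described in this abstract can be illustrated with a toy sketch. It is not OBA's implementation: the small `IS_A` hierarchy, the `trait` tuple and the `is_subtrait` helper are hypothetical names introduced only for illustration, and a real workflow would run an OWL reasoner over the source ontologies. The sketch only shows how a trait built from an attribute and an entity could inherit its classification from the entity's own hierarchy.

```python
# Toy is_a hierarchy standing in for a cell ontology (hypothetical content).
IS_A = {
    "cardiac muscle cell": "muscle cell",
    "muscle cell": "cell",
}


def ancestors(entity):
    """Return all ancestors of an entity by walking the toy is_a links."""
    found = []
    while entity in IS_A:
        entity = IS_A[entity]
        found.append(entity)
    return found


def trait(attribute, entity):
    """Compose a trait from an attribute (e.g. 'size') and the entity bearing it."""
    return (attribute, entity)


def is_subtrait(child, parent):
    """A trait is classified under another if the attributes match and the
    child's entity equals, or is a descendant of, the parent's entity."""
    child_attr, child_entity = child
    parent_attr, parent_entity = parent
    if child_attr != parent_attr:
        return False
    return child_entity == parent_entity or parent_entity in ancestors(child_entity)


# 'cardiac muscle cell size' falls under 'cell size' without anyone asserting it.
inferred = is_subtrait(trait("size", "cardiac muscle cell"), trait("size", "cell"))
```

In this sketch no trait-to-trait link is stored; the classification is derived entirely from the entity hierarchy, which is the kind of automated classification the abstract attributes to OBA's modular design.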
Affiliation(s)
- Ray Stefancsik
  - European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK
- James P Balhoff
  - Renaissance Computing Institute, University of North Carolina, Chapel Hill, NC, 27517, USA
- Meghan A Balk
  - Natural History Museum, University of Oslo, Oslo, Norway
- Robyn L Ball
  - The Jackson Laboratory, Bar Harbor, ME, 04609, USA
- Anita R Caron
  - European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK
- Vinicius de Souza
  - European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK
- Sarah Gehrke
  - Anschutz Medical Campus, University of Colorado, Aurora, CO, 80045, USA
- Melissa Haendel
  - Anschutz Medical Campus, University of Colorado, Aurora, CO, 80045, USA
- Laura W Harris
  - European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK
- Nomi L Harris
  - Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Arwa Ibrahim
  - European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK
- Julie A McMurry
  - Anschutz Medical Campus, University of Colorado, Aurora, CO, 80045, USA
- Christopher J Mungall
  - Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Tim Putman
  - Anschutz Medical Campus, University of Colorado, Aurora, CO, 80045, USA
- Damian Smedley
  - William Harvey Research Institute, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London, EC1M 6BQ, UK
- Elliot Sollis
  - European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, CB10 1SD, UK
- Anne E Thessen
  - Anschutz Medical Campus, University of Colorado, Aurora, CO, 80045, USA
- Nicole Vasilevsky
  - Data Collaboration Center, Critical Path Institute, Tucson, AZ, 85718, USA
3
Laulederkind SJF, Hayman GT, Wang SJ, Kaldunski ML, Vedi M, Demos WM, Tutaj M, Smith JR, Lamers L, Gibson AC, Thorat K, Thota J, Tutaj MA, De Pons JL, Dwinell MR, Kwitek AE. The Rat Genome Database: Genetic, Genomic, and Phenotypic Data Across Multiple Species. Curr Protoc 2023; 3:e804. [PMID: 37347557 PMCID: PMC10335880 DOI: 10.1002/cpz1.804] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/24/2023]
Abstract
The laboratory rat, Rattus norvegicus, is an important model of human health and disease, and experimental findings in the rat have relevance to human physiology and disease. The Rat Genome Database (RGD, https://rgd.mcw.edu) is a model organism database that provides access to a wide variety of curated rat data including disease associations, phenotypes, pathways, molecular functions, biological processes, cellular components, and chemical interactions for genes, quantitative trait loci, and strains. We present an overview of the database followed by specific examples that can be used to gain experience in employing RGD to explore the wealth of functional data available for the rat and other species. © 2023 Wiley Periodicals LLC.
- Basic Protocol 1: Navigating the Rat Genome Database (RGD) home page
- Basic Protocol 2: Using the RGD search functions
- Basic Protocol 3: Searching for quantitative trait loci
- Basic Protocol 4: Using the RGD genome browser (JBrowse) to find phenotypic annotations
- Basic Protocol 5: Using OntoMate to find gene-disease data
- Basic Protocol 6: Using MOET to find gene-ontology enrichment
- Basic Protocol 7: Using OLGA to generate gene lists for analysis
- Basic Protocol 8: Using the GA tool to analyze ontology annotations for genes
- Basic Protocol 9: Using the RGD InterViewer tool to find protein interaction data
- Basic Protocol 10: Using the RGD Variant Visualizer tool to find genetic variant data
- Basic Protocol 11: Using the RGD Disease Portals to find disease, phenotype, and other information
- Basic Protocol 12: Using the RGD Phenotypes & Models Portal to find qualitative and quantitative phenotype data and other rat strain-related information
- Basic Protocol 13: Using the RGD Pathway Portal to find disease and phenotype data via molecular pathways
Affiliation(s)
- G. Thomas Hayman
  - Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
- Shur-Jen Wang
  - Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
- Mary L. Kaldunski
  - Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
- Mahima Vedi
  - Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
- Wendy M. Demos
  - Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
- Monika Tutaj
  - Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
- Jennifer R. Smith
  - Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
- Logan Lamers
  - Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
- Adam C. Gibson
  - Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
- Ketaki Thorat
  - Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
- Jyothi Thota
  - Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
- Marek A. Tutaj
  - Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
- Jeffrey L. De Pons
  - Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
- Melinda R. Dwinell
  - Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
- Anne E. Kwitek
  - Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
4
Hu ZL, Park CA, Reecy JM. A combinatorial approach implementing new database structures to facilitate practical data curation management of QTL, association, correlation and heritability data on trait variants. Database (Oxford) 2023; 2023:7135870. [PMID: 37084387 PMCID: PMC10121204 DOI: 10.1093/database/baad024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2022] [Revised: 02/28/2023] [Accepted: 03/27/2023] [Indexed: 04/23/2023]
Abstract
A precise description of traits is essential in genetics and genomics studies to facilitate comparative genetics and meta-analyses. It is an ongoing challenge in research and production environments to unambiguously and consistently compare traits of interest from data collected under various conditions. Despite previous efforts to standardize trait nomenclature, it remains a challenge to fully and accurately capture trait nomenclature granularity in a way that ensures long-term data sustainability in terms of the data curation processes, data management logistics and the ability to make meaningful comparisons across studies. In the Animal Quantitative Trait Loci Database and the Animal Trait Correlation Database, we have recently introduced a new method to extend livestock trait ontologies by using trait modifiers and qualifiers to define traits that differ slightly in how they are measured, examined or combined with other traits or factors. Here, we describe the implementation of a system in which the extended trait data, with modifiers, are managed at the experiment level as 'trait variants'. This has helped us to streamline the management and curation of such trait information in our database environment. Database URL https://www.animalgenome.org/PGNET/.
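The idea of managing modifier-extended traits at the experiment level can be sketched as a small data model. This is a minimal illustration under stated assumptions, not the schema used by the Animal QTL Database: the class names `TraitVariant` and `Experiment`, the field names, and the example trait and modifier labels are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TraitVariant:
    """A base ontology trait plus modifiers describing how it was measured."""
    base_trait: str
    modifiers: tuple = ()  # hypothetical modifier/qualifier labels

    def label(self):
        """Readable name: the base trait followed by its modifiers, if any."""
        if not self.modifiers:
            return self.base_trait
        return f"{self.base_trait} ({', '.join(self.modifiers)})"


@dataclass
class Experiment:
    """An experiment record that owns the trait variants it reports."""
    experiment_id: str
    trait_variants: list = field(default_factory=list)


def group_by_base_trait(experiments):
    """Collect (experiment id, variant label) pairs under each base trait,
    so variants measured differently stay comparable at the trait level."""
    grouped = {}
    for exp in experiments:
        for variant in exp.trait_variants:
            grouped.setdefault(variant.base_trait, []).append(
                (exp.experiment_id, variant.label())
            )
    return grouped


# Hypothetical example data.
exp1 = Experiment("exp1", [TraitVariant("body weight", ("at birth",))])
exp2 = Experiment("exp2", [TraitVariant("body weight", ("at weaning", "adjusted"))])
grouped = group_by_base_trait([exp1, exp2])
```

Keeping the base trait separate from its modifiers means the ontology term itself is never duplicated, while each experiment still records exactly how its version of the trait was measured.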
Affiliation(s)
- Zhi-Liang Hu
  - Department of Animal Science, Iowa State University, 2255 Kildee Hall, 806 Stange Road, Ames, IA 50011-3150, USA
- Carissa A Park
  - Department of Animal Science, Iowa State University, 2255 Kildee Hall, 806 Stange Road, Ames, IA 50011-3150, USA
- James M Reecy
  - Department of Animal Science, Iowa State University, 2255 Kildee Hall, 806 Stange Road, Ames, IA 50011-3150, USA
5
Saunders DC, Messmer J, Kusmartseva I, Beery ML, Yang M, Atkinson MA, Powers AC, Cartailler JP, Brissova M. Pancreatlas: Applying an Adaptable Framework to Map the Human Pancreas in Health and Disease. Patterns 2020; 1:100120. [PMID: 33294866 PMCID: PMC7691395 DOI: 10.1016/j.patter.2020.100120] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 03/19/2020] [Revised: 08/31/2020] [Accepted: 09/14/2020] [Indexed: 12/14/2022]
Abstract
Human tissue phenotyping generates complex spatial information from numerous imaging modalities, yet images typically become static figures for publication, and original data and metadata are rarely available. While comprehensive image maps exist for some organs, most resources have limited support for multiplexed imaging or have non-intuitive user interfaces. Therefore, we built a Pancreatlas resource that integrates several technologies into a unique interface, allowing users to access richly annotated web pages, drill down to individual images, and deeply explore data online. The current version of Pancreatlas contains over 800 unique images acquired by whole-slide scanning, confocal microscopy, and imaging mass cytometry, and is available at https://www.pancreatlas.org. To create this human pancreas-specific biological imaging resource, we developed a React-based web application and Python-based application programming interface, collectively called Flexible Framework for Integrating and Navigating Data (FFIND), which can be adapted beyond Pancreatlas to meet countless imaging or other structured data-management needs.
- Human organ phenotyping databases benefit from intuitive user interfaces
- Pancreatlas resource enables exploration of bioimaging data from human pancreas
- The front-end framework of Pancreatlas, FFIND, is modular and easily adaptable
- FFIND provides structured data-exploration capabilities across countless domains
Scientists need cost-effective yet fully featured database solutions that facilitate large dataset sharing in a structured and easily digestible manner. Flexible Framework for Integrating and Navigating Data (FFIND) is a data-agnostic web application that is designed to easily connect existing databases with data-browsing clients. We used FFIND to build Pancreatlas, an online imaging resource containing datasets linking imaging data with clinical data to facilitate advances in the understanding of diabetes, pancreatitis, and pancreatic cancer. FFIND architecture, which is available as open-source software, can be easily adapted to meet other field- or project-specific needs; we hope it will help data scientists reach a broader audience by reducing the development life cycle and providing familiar interactivity in communicating data and underlying stories.
Affiliation(s)
- Diane C Saunders
  - Division of Diabetes, Endocrinology, and Metabolism, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- James Messmer
  - Division of Diabetes, Endocrinology, and Metabolism, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Irina Kusmartseva
  - Department of Pathology, Immunology, and Laboratory Medicine, College of Medicine, Diabetes Institute, University of Florida, Gainesville, FL, USA
- Maria L Beery
  - Department of Pathology, Immunology, and Laboratory Medicine, College of Medicine, Diabetes Institute, University of Florida, Gainesville, FL, USA
- Mingder Yang
  - Department of Pathology, Immunology, and Laboratory Medicine, College of Medicine, Diabetes Institute, University of Florida, Gainesville, FL, USA
- Mark A Atkinson
  - Department of Pathology, Immunology, and Laboratory Medicine, College of Medicine, Diabetes Institute, University of Florida, Gainesville, FL, USA
  - Department of Pediatrics, College of Medicine, Diabetes Institute, University of Florida, Gainesville, FL, USA
- Alvin C Powers
  - Division of Diabetes, Endocrinology, and Metabolism, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
  - VA Tennessee Valley Healthcare System, Nashville, TN, USA
  - Department of Molecular Physiology and Biophysics, Vanderbilt University, Nashville, TN, USA
- Jean-Philippe Cartailler
  - Creative Data Solutions Shared Resource, Center for Stem Cell Biology, Vanderbilt University, Nashville, TN, USA
- Marcela Brissova
  - Division of Diabetes, Endocrinology, and Metabolism, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
6
Smith JR, Hayman GT, Wang SJ, Laulederkind SJF, Hoffman MJ, Kaldunski ML, Tutaj M, Thota J, Nalabolu HS, Ellanki SLR, Tutaj MA, De Pons JL, Kwitek AE, Dwinell MR, Shimoyama ME. The Year of the Rat: The Rat Genome Database at 20: a multi-species knowledgebase and analysis platform. Nucleic Acids Res 2020; 48:D731-D742. [PMID: 31713623 PMCID: PMC7145519 DOI: 10.1093/nar/gkz1041] [Citation(s) in RCA: 59] [Impact Index Per Article: 14.8] [Received: 09/15/2019] [Revised: 10/21/2019] [Accepted: 10/24/2019] [Indexed: 12/13/2022]
Abstract
Formed in late 1999, the Rat Genome Database (RGD, https://rgd.mcw.edu) will be 20 in 2020, the Year of the Rat. Because the laboratory rat, Rattus norvegicus, has been used as a model for complex human diseases such as cardiovascular disease, diabetes, cancer, neurological disorders and arthritis, among others, for >150 years, RGD has always been disease-focused and committed to providing data and tools for researchers doing comparative genomics and translational studies. At its inception, before the sequencing of the rat genome, RGD started with only a few data types localized on genetic and radiation hybrid (RH) maps and offered only a few tools for querying and consolidating that data. Since that time, RGD has expanded to include a wealth of structured and standardized genetic, genomic, phenotypic, and disease-related data for eight species, and a suite of innovative tools for querying, analyzing and visualizing this data. This article provides an overview of recent substantial additions and improvements to RGD's data and tools that can assist researchers in finding and utilizing the data they need, whether their goal is to develop new precision models of disease or to more fully explore emerging details within a system or across multiple systems.
Affiliation(s)
- Jennifer R Smith
  - Rat Genome Database, Department of Biomedical Engineering, Medical College of Wisconsin, Milwaukee, WI 53226, USA
  - To whom correspondence should be addressed. Tel: +1 414 955 8871; Fax: +1 414 955 6595
- G Thomas Hayman
  - Rat Genome Database, Department of Biomedical Engineering, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Shur-Jen Wang
  - Rat Genome Database, Department of Biomedical Engineering, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Stanley J F Laulederkind
  - Rat Genome Database, Department of Biomedical Engineering, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Matthew J Hoffman
  - Rat Genome Database, Department of Biomedical Engineering, Medical College of Wisconsin, Milwaukee, WI 53226, USA
  - Genomic Sciences and Precision Medicine Center and Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Mary L Kaldunski
  - Rat Genome Database, Department of Biomedical Engineering, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Monika Tutaj
  - Rat Genome Database, Department of Biomedical Engineering, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Jyothi Thota
  - Rat Genome Database, Department of Biomedical Engineering, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Harika S Nalabolu
  - Rat Genome Database, Department of Biomedical Engineering, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Santoshi L R Ellanki
  - Rat Genome Database, Department of Biomedical Engineering, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Marek A Tutaj
  - Rat Genome Database, Department of Biomedical Engineering, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Jeffrey L De Pons
  - Rat Genome Database, Department of Biomedical Engineering, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Anne E Kwitek
  - Genomic Sciences and Precision Medicine Center and Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Melinda R Dwinell
  - Genomic Sciences and Precision Medicine Center and Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Mary E Shimoyama
  - Rat Genome Database, Department of Biomedical Engineering, Medical College of Wisconsin, Milwaukee, WI 53226, USA
7
Bartley BA, Beal J, Karr JR, Strychalski EA. Organizing genome engineering for the gigabase scale. Nat Commun 2020; 11:689. [PMID: 32019919 PMCID: PMC7000699 DOI: 10.1038/s41467-020-14314-z] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 08/24/2019] [Accepted: 12/18/2019] [Indexed: 12/11/2022]
Abstract
Genome-scale engineering holds great potential to impact science, industry, medicine, and society, and recent improvements in DNA synthesis have enabled the manipulation of megabase genomes. However, coordinating and integrating the workflows and large teams necessary for gigabase genome engineering remains a considerable challenge. We examine this issue and recommend a path forward by: 1) adopting and extending existing representations for designs, assembly plans, samples, data, and workflows; 2) developing new technologies for data curation and quality control; 3) conducting fundamental research on genome-scale modeling and design; and 4) developing new legal and contractual infrastructure to facilitate collaboration.
Affiliation(s)
- Jacob Beal
  - Raytheon BBN Technologies, Cambridge, MA, 02138, USA
- Jonathan R Karr
  - Icahn Institute and Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10128, USA
8
Abstract
My goal in searching for the big pictures is to discover novel ways of organizing information in psychology that will have both theoretical and practical significance. The first section lists my reasons for writing each of five articles. The second section discusses an additional five articles that integrate advancements in artificial intelligence and cognitive psychology. The following two sections elaborate on my collaboration with ontologists to use formal ontologies to organize psychological knowledge, including the National Institute of Mental Health Research Domain Criteria, for formulating a biological basis for mental illness. I next discuss strategies for writing integrative articles. The following section describes the helpfulness of the integrations for making psychology relevant to a general audience. I conclude with recommendations for creating breadth in doctoral training.
Affiliation(s)
- Stephen K Reed
  - Department of Psychology, San Diego State University; Center for Research in Mathematics and Science Education, San Diego State University; and Department of Psychology, University of California, San Diego
9
Bogue MA, Grubb SC, Walton DO, Philip VM, Kolishovski G, Stearns T, Dunn MH, Skelly DA, Kadakkuzha B, TeHennepe G, Kunde-Ramamoorthy G, Chesler EJ. Mouse Phenome Database: an integrative database and analysis suite for curated empirical phenotype data from laboratory mice. Nucleic Acids Res 2018; 46:D843-D850. [PMID: 29136208 PMCID: PMC5753241 DOI: 10.1093/nar/gkx1082] [Citation(s) in RCA: 61] [Impact Index Per Article: 12.2] [Received: 09/29/2017] [Accepted: 10/19/2017] [Indexed: 12/25/2022]
Abstract
The Mouse Phenome Database (MPD; https://phenome.jax.org) is a widely used resource that provides access to primary experimental trait data, genotypic variation, protocols and analysis tools for mouse genetic studies. Data are contributed by investigators worldwide and represent a broad scope of phenotyping endpoints and disease-related traits in naïve mice and those exposed to drugs, environmental agents or other treatments. MPD houses individual animal data with detailed, searchable protocols, and makes these data available to other resources via API. MPD provides rigorous curation of experimental data and supporting documentation using relevant ontologies and controlled vocabularies. Most data in MPD are from inbreds and other reproducible strains such that the data are cumulative over time and across laboratories. The resource has been expanded to include the QTL Archive and other primary phenotype data from mapping crosses as well as advanced high-diversity mouse populations including the Collaborative Cross and Diversity Outbred mice. Furthermore, MPD provides a means of assessing replicability and reproducibility across experimental conditions and protocols, benchmarking assays in users’ own laboratories, identifying sensitized backgrounds for making new mouse models with genome editing technologies, analyzing trait co-inheritance, finding the common genetic basis for multiple traits and assessing sex differences and sex-by-genotype interactions.
Affiliation(s)
- Molly A Bogue
  - The Jackson Laboratory, Bar Harbor, Maine 04609, USA
- Tim Stearns
  - The Jackson Laboratory, Bar Harbor, Maine 04609, USA
10
Zerbino DR, Achuthan P, Akanni W, Amode MR, Barrell D, Bhai J, Billis K, Cummins C, Gall A, Girón CG, Gil L, Gordon L, Haggerty L, Haskell E, Hourlier T, Izuogu OG, Janacek SH, Juettemann T, To JK, Laird MR, Lavidas I, Liu Z, Loveland JE, Maurel T, McLaren W, Moore B, Mudge J, Murphy DN, Newman V, Nuhn M, Ogeh D, Ong CK, Parker A, Patricio M, Riat HS, Schuilenburg H, Sheppard D, Sparrow H, Taylor K, Thormann A, Vullo A, Walts B, Zadissa A, Frankish A, Hunt SE, Kostadima M, Langridge N, Martin FJ, Muffato M, Perry E, Ruffier M, Staines DM, Trevanion SJ, Aken BL, Cunningham F, Yates A, Flicek P. Ensembl 2018. Nucleic Acids Res 2018; 46:D754-D761. [PMID: 29155950 PMCID: PMC5753206 DOI: 10.1093/nar/gkx1098] [Citation(s) in RCA: 1914] [Impact Index Per Article: 382.8] [Received: 09/22/2017] [Accepted: 10/21/2017] [Indexed: 01/29/2023]
Abstract
The Ensembl project has been aggregating, processing, integrating and redistributing genomic datasets since the initial releases of the draft human genome, with the aim of accelerating genomics research through rapid open distribution of public data. Large amounts of raw data are thus transformed into knowledge, which is made available via a multitude of channels, in particular our browser (http://www.ensembl.org). Over time, we have expanded in multiple directions. First, our resources describe multiple fields of genomics, in particular gene annotation, comparative genomics, genetics and epigenomics. Second, we cover a growing number of genome assemblies; Ensembl Release 90 contains exactly 100. Third, our databases feed simultaneously into an array of services designed around different use cases, ranging from quick browsing to genome-wide bioinformatic analysis. We present here the latest developments of the Ensembl project, with a focus on managing an increasing number of assemblies, supporting efforts in genome interpretation and improving our browser.
Affiliation(s)
- Daniel R Zerbino
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Premanand Achuthan
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Wasiu Akanni
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- M Ridwan Amode
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Daniel Barrell
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
  - Eagle Genomics Ltd., Wellcome Genome Campus, Hinxton, Cambridge CB10 1DR, UK
- Jyothish Bhai
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Konstantinos Billis
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Carla Cummins
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Astrid Gall
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Carlos García Girón
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Laurent Gil
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Leo Gordon
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Leanne Haggerty
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Erin Haskell
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Thibaut Hourlier
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Osagie G Izuogu
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Sophie H Janacek
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Thomas Juettemann
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Jimmy Kiang To
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Matthew R Laird
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Ilias Lavidas
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Zhicheng Liu
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Jane E Loveland
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Thomas Maurel
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - William McLaren
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Benjamin Moore
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Jonathan Mudge
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Daniel N Murphy
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Victoria Newman
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Michael Nuhn
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Denye Ogeh
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Chuang Kee Ong
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Anne Parker
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Mateus Patricio
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Harpreet Singh Riat
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Helen Schuilenburg
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Dan Sheppard
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Helen Sparrow
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Kieron Taylor
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Anja Thormann
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Alessandro Vullo
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Brandon Walts
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Amonida Zadissa
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Adam Frankish
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Sarah E Hunt
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Myrto Kostadima
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Nicholas Langridge
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Fergal J Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Matthieu Muffato
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Emily Perry
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Magali Ruffier
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Dan M Staines
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Stephen J Trevanion
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Bronwen L Aken
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Fiona Cunningham
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Andrew Yates
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.,Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
| |
|
11
|
Zhao Y, Smith JR, Wang SJ, Dwinell MR, Shimoyama M. Quantitative phenotype analysis to identify, validate and compare rat disease models. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2019; 2019:5424140. [PMID: 30938777 PMCID: PMC6444380 DOI: 10.1093/database/baz037] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 10/25/2018] [Revised: 02/08/2019] [Accepted: 02/27/2019] [Indexed: 12/18/2022]
Abstract
The laboratory rat has been widely used as an animal model in biomedical research. There are many strains exhibiting a wide variety of phenotypes. Capturing these phenotypes in a centralized database provides researchers with an easy method for choosing the appropriate strains for their studies. Existing resources have provided some preliminary work in rat phenotype databases. However, existing resources suffer from problems such as small numbers of animals, infrequent updates, limited web interface queries and lack of standardized metadata. The Rat Genome Database (RGD) PhenoMiner tool has provided the first step in this effort by standardizing and integrating data from individual studies. Our work, mainly utilizing data curated in RGD, involves the following key steps: (i) we developed a meta-analysis pipeline to automatically integrate data from heterogeneous sources and to produce expected ranges (standardized phenotype ranges) for different strains and phenotypes under different experimental conditions; (ii) we created tools to visualize expected ranges for individual strains and strain groups. We developed a meta-analysis pipeline and an interactive web interface that summarizes and visualizes expected ranges produced from the meta-analysis pipeline. Automation of the pipeline allows for updates as additional data become available. The interactive web interface provides curators and researchers with a platform for identifying and validating expected ranges for a variety of quantitative phenotypes. The data analysis results and visualization tools will promote an understanding of rat disease models, guide researchers to choose optimal strains for their research needs and encourage data sharing from different research hubs. Such resources also help to promote research reproducibility. The interactive platforms created in this project will continue to provide a valuable resource for translational research efforts.
Affiliation(s)
- Yiqing Zhao
- Department of Biomedical Engineering, Marquette University and Medical College of Wisconsin, Milwaukee, WI, USA.,Division of Biomedical Statistics and Informatics, Mayo Clinic, Rochester, MN, USA
| | - Jennifer R Smith
- Department of Biomedical Engineering, Marquette University and Medical College of Wisconsin, Milwaukee, WI, USA
| | - Shur-Jen Wang
- Department of Biomedical Engineering, Marquette University and Medical College of Wisconsin, Milwaukee, WI, USA
| | - Melinda R Dwinell
- Department of Physiology, Medical College of Wisconsin, Milwaukee, WI, USA.,Genomic Sciences and Precision Medicine Center, Medical College of Wisconsin, Milwaukee, WI, USA
| | - Mary Shimoyama
- Department of Biomedical Engineering, Marquette University and Medical College of Wisconsin, Milwaukee, WI, USA
| |
|
12
|
Lin FP, Groza T, Kocbek S, Antezana E, Epstein RJ. Cancer Care Treatment Outcome Ontology: A Novel Computable Ontology for Profiling Treatment Outcomes in Patients With Solid Tumors. JCO Clin Cancer Inform 2019; 2:1-14. [PMID: 30652600 DOI: 10.1200/cci.18.00026] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 11/20/2022]
Abstract
PURPOSE There is as yet no computer-processable resource to describe treatment end points in cancer, hindering our ability to systematically capture and share outcomes data to inform better patient care. To address these unmet needs, we have built an ontology, the Cancer Care Treatment Outcome Ontology (CCTOO), to organize high-level concepts of treatment end points with structured knowledge representation to facilitate standardized sharing of real-world data. METHODS End points from oncology trials in ClinicalTrials.gov were extracted, queried using the keyword cancer, and followed by an expert appraisal. Synonyms and relevant terms were imported from the National Cancer Institute Thesaurus and Common Terminology Criteria for Adverse Events. Logical relationships among concepts were manually represented by production rules. The applicability of 1,847 rules was tested in an index case. RESULTS After removing duplicated terms from 54,705 trial entries, an ontology holding 1,133 terms was built. CCTOO organized concepts into four domains (cancer treatment, health services, physical, and psychosocial health-related concepts), 13 subgroups (including efficacy, safety, and quality of life), and two (taxonomic and evaluative) concept hierarchies. This ontology has a comprehensive term coverage in the cancer trial literature: at least one term was mentioned in 98% of MEDLINE abstracts of phase I to III trials, whereas concepts about efficacy were mentioned in 7,208 (79%) phase I, 15,051 (92%) phase II, and 3,884 (86%) phase III trials. The event sequence of the index case was readily convertible to a comprehensive profile incorporating response, treatment toxicity, and survival by applying the set of production rules curated in the CCTOO. CONCLUSION CCTOO categorizes high-level treatment end points used in oncology and provides a mechanism for profiling individual patient data by outcomes to facilitate translational analysis.
Affiliation(s)
- Frank P Lin
- Frank P. Lin and Richard J. Epstein, St Vincent's Hospital and The Kinghorn Cancer Centre; Frank P. Lin, Tudor Groza, Simon Kocbek, and Richard J. Epstein, Garvan Institute of Medical Research, Sydney, Australia; Frank P. Lin, Waikato Hospital, Hamilton, New Zealand; and Erick Antezana, Norwegian University of Science and Technology, Trondheim, Norway
| | - Tudor Groza
- Frank P. Lin and Richard J. Epstein, St Vincent's Hospital and The Kinghorn Cancer Centre; Frank P. Lin, Tudor Groza, Simon Kocbek, and Richard J. Epstein, Garvan Institute of Medical Research, Sydney, Australia; Frank P. Lin, Waikato Hospital, Hamilton, New Zealand; and Erick Antezana, Norwegian University of Science and Technology, Trondheim, Norway
| | - Simon Kocbek
- Frank P. Lin and Richard J. Epstein, St Vincent's Hospital and The Kinghorn Cancer Centre; Frank P. Lin, Tudor Groza, Simon Kocbek, and Richard J. Epstein, Garvan Institute of Medical Research, Sydney, Australia; Frank P. Lin, Waikato Hospital, Hamilton, New Zealand; and Erick Antezana, Norwegian University of Science and Technology, Trondheim, Norway
| | - Erick Antezana
- Frank P. Lin and Richard J. Epstein, St Vincent's Hospital and The Kinghorn Cancer Centre; Frank P. Lin, Tudor Groza, Simon Kocbek, and Richard J. Epstein, Garvan Institute of Medical Research, Sydney, Australia; Frank P. Lin, Waikato Hospital, Hamilton, New Zealand; and Erick Antezana, Norwegian University of Science and Technology, Trondheim, Norway
| | - Richard J Epstein
- Frank P. Lin and Richard J. Epstein, St Vincent's Hospital and The Kinghorn Cancer Centre; Frank P. Lin, Tudor Groza, Simon Kocbek, and Richard J. Epstein, Garvan Institute of Medical Research, Sydney, Australia; Frank P. Lin, Waikato Hospital, Hamilton, New Zealand; and Erick Antezana, Norwegian University of Science and Technology, Trondheim, Norway
| |
|
13
|
Abstract
The laboratory rat, Rattus norvegicus, has been used in biomedical research for more than 150 years, and in many cases remains the model of choice for studies of physiology, behavior, and complex human disease. This book provides detailed information on a number of methodologies that can be used in rat. This chapter gives an introduction to rat as a species and as a biomedical model, providing historical information, a brief introduction to the current state of rat research, and a perspective on the future of rat as a model for human disease.
Affiliation(s)
- Jennifer R Smith
- Department of Biomedical Engineering, Rat Genome Database, Medical College of Wisconsin, Milwaukee, WI, USA.
| | - Elizabeth R Bolton
- Department of Biomedical Engineering, Rat Genome Database, Medical College of Wisconsin, Milwaukee, WI, USA
| | - Melinda R Dwinell
- Genomic Sciences and Precision Medicine Center, Medical College of Wisconsin, Milwaukee, WI, USA
- Department of Physiology, Rat Genome Database, Medical College of Wisconsin, Milwaukee, WI, USA
| |
|
14
|
Endara L, Thessen AE, Cole HA, Walls R, Gkoutos G, Cao Y, Chong SS, Cui H. Modifier Ontologies for frequency, certainty, degree, and coverage phenotype modifier. Biodivers Data J 2018; 6:e29232. [PMID: 30532623 PMCID: PMC6281706 DOI: 10.3897/bdj.6.e29232] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2018] [Accepted: 11/20/2018] [Indexed: 11/21/2022]
Abstract
Background: When phenotypic characters are described in the literature, they may be constrained or clarified with additional information such as the location or degree of expression; these terms are called "modifiers". With effort underway to convert narrative character descriptions to computable data, ontologies for such modifiers are needed. Such ontologies can also be used to guide term usage in future publications. Spatial and method modifiers are the subjects of ontologies that already have been developed or are under development. In this work, frequency (e.g., rarely, usually), certainty (e.g., probably, definitely), degree (e.g., slightly, extremely), and coverage modifiers (e.g., sparsely, entirely) are collected, reviewed, and used to create two modifier ontologies with different design considerations. The basic goal is to express the sequential relationships within a type of modifier, for example, usually is more frequent than rarely, in order to allow data annotated with ontology terms to be classified accordingly. Method: Two designs are proposed for the ontology, both using the list pattern: a closed ordered list (i.e., five-bin design) and an open ordered list design. The five-bin design puts the modifier terms into a set of 5 fixed bins with interval object properties, for example, one_level_more/less_frequently_than, where new terms can only be added as synonyms to existing classes. The open list approach starts with 5 bins, but supports the extensibility of the list via ordinal properties, for example, more/less_frequently_than, allowing new terms to be inserted as a new class anywhere in the list. The consequences of the different design decisions are discussed in the paper. CharaParser was used to extract modifiers from plant, ant, and other taxonomic descriptions. After a manual screening, 130 modifier words were selected as the candidate terms for the modifier ontologies. Four curators/experts (three biologists and one information scientist specialized in biosemantics) reviewed and categorized the terms into 20 bins using the Ontology Term Organizer (OTO) (http://biosemantics.arizona.edu/OTO). Inter-curator variations were reviewed and expressed in the final ontologies. Results: Frequency, certainty, degree, and coverage terms with complete agreement among all curators were used as class labels or exact synonyms. Terms with different interpretations were either excluded or included using "broader synonym" or "not recommended" annotation properties. These annotations explicitly make the user aware of the semantic ambiguity associated with the terms and of whether they should be used with caution or avoided. Expert categorization results showed that 16 out of 20 bins contained terms with full agreement, suggesting that differentiating the modifiers into 5 levels/bins balances the need to differentiate modifiers and the need for the ontology to reflect user consensus. Two ontologies, developed using the Protege ontology editor, are made available as OWL files and can be downloaded from https://github.com/biosemantics/ontologies. Contribution: We built the first two modifier ontologies following a consensus-based approach with terms commonly used in taxonomic literature. The five-bin ontology has been used in the Explorer of Taxon Concepts web toolkit to compute the similarity between characters extracted from literature to facilitate taxon concepts alignments. The two ontologies will also be used in an ontology-informed authoring tool for taxonomists to facilitate consistency in modifier term usage.
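The two list designs described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' OWL implementation: the class names `FiveBinList` and `OpenOrderedList`, the method names, and the five frequency labels are all hypothetical, chosen only to contrast a closed list (new terms join as synonyms, order expressed as a number of levels) with an open list (new terms become new positions, order expressed as a plain comparison).

```python
from dataclasses import dataclass, field

# Hypothetical bin labels for frequency modifiers; the class labels in the
# published ontologies may differ.
FREQUENCY_BINS = ["never", "rarely", "sometimes", "usually", "always"]


@dataclass
class FiveBinList:
    """Closed ordered list: the bins are fixed, new terms are only synonyms."""
    bins: list
    synonyms: dict = field(default_factory=dict)  # synonym -> bin label

    def add_synonym(self, term, bin_label):
        # A new term never creates a bin; it must attach to an existing one.
        if bin_label not in self.bins:
            raise ValueError(f"unknown bin: {bin_label}")
        self.synonyms[term] = bin_label

    def level(self, term):
        # Resolve a synonym to its bin, then return the bin's position.
        return self.bins.index(self.synonyms.get(term, term))

    def levels_more_frequent(self, a, b):
        # Interval-style comparison: how many levels a sits above b.
        return self.level(a) - self.level(b)


@dataclass
class OpenOrderedList:
    """Open ordered list: new terms may be inserted anywhere as new classes."""
    terms: list

    def insert_after(self, new_term, existing):
        self.terms.insert(self.terms.index(existing) + 1, new_term)

    def more_frequently_than(self, a, b):
        # Ordinal comparison only; distances are not meaningful once the
        # list can grow between any two terms.
        return self.terms.index(a) > self.terms.index(b)


closed = FiveBinList(list(FREQUENCY_BINS))
closed.add_synonym("seldom", "rarely")

open_list = OpenOrderedList(list(FREQUENCY_BINS))
open_list.insert_after("often", "sometimes")
```

In this sketch the closed list can answer "how many levels apart" because its length never changes, whereas the open list only answers "more or less than", which mirrors the interval versus ordinal properties mentioned in the abstract.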
Affiliation(s)
- Lorena Endara
- University of Florida, Gainesville, United States of America
| | - Anne E Thessen
- The Ronin Institute for Independent Scholarship, Montclair, NJ, United States of America
| | - Heather A Cole
- Science and Technology Branch, Agriculture and Agri-Food Canada, Government of Canada, Ottawa, Canada
| | - Ramona Walls
- CyVerse, Tucson, United States of America
| | - Georgios Gkoutos
- College of Medical and Dental Sciences, Institute of Cancer and Genomic Sciences, Centre for Computational Biology, University of Birmingham, Birmingham, United Kingdom
- Institute of Translational Medicine, University Hospitals Birmingham NHS Foundation Trust, B15 2TT, Birmingham, United Kingdom
| | - Yujie Cao
- Center for Studies of Information Resources, Wuhan University, Wuhan, China
| | - Steven S. Chong
- National Center for Ecological Analysis and Synthesis, University of California, Santa Barbara, Santa Barbara, United States of America
- University of Arizona, Tucson, United States of America
| | - Hong Cui
- University of Arizona, Tucson, United States of America
| |
|
15
|
Schuler JC, Ceusters WM. The Problems of Realism-Based Ontology Design: a Case Study in Creating Definitions for an Application Ontology for Diabetes Camps. AMIA ... ANNUAL SYMPOSIUM PROCEEDINGS. AMIA SYMPOSIUM 2018; 2017:1517-1526. [PMID: 29854221 PMCID: PMC5977642] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
A requirement of realism-based ontology design is that classes denote exclusively entities that exist objectively in reality and that their definitions adhere to strict criteria to ensure that the classes are re-usable in other ontologies while preserving their ontological commitment. Building realism-based ontologies is therefore quite challenging and time-consuming, demanding considerable training. Although the top level, in the form of the Basic Formal Ontology (BFO), is worked out very well, as are the upper levels of certain domains, there is still a disconnect with the bottom-up or middle-out approach which is typical, and more practical, for application ontologies. Using the development of an application ontology for diabetes management in diabetes camps as an example, we present an overview of problems that trainees in realism-based ontology design can be confronted with and offer some guidelines on how to deal with them in case no ideal solution is available.
Affiliation(s)
- James C Schuler
- Department of Biomedical Informatics, University at Buffalo, Buffalo, NY, USA
| | - Werner M Ceusters
- Department of Biomedical Informatics, University at Buffalo, Buffalo, NY, USA
| |
|
16
|
Shimoyama M, Smith JR, Bryda E, Kuramoto T, Saba L, Dwinell M. Rat Genome and Model Resources. ILAR J 2017; 58:42-58. [PMID: 28838068 PMCID: PMC6057551 DOI: 10.1093/ilar/ilw041] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Received: 07/28/2016] [Indexed: 11/25/2022]
Abstract
Rats remain a major model for studying disease mechanisms and discovery, validation, and testing of new compounds to improve human health. The rat’s value continues to grow as indicated by the more than 1.4 million publications (second to human) at PubMed documenting important discoveries using this model. Advanced sequencing technologies, genome modification techniques, and the development of embryonic stem cell protocols ensure the rat remains an important mammalian model for disease studies. The 2004 release of the reference genome has been followed by the production of complete genomes for more than two dozen individual strains utilizing NextGen sequencing technologies; their analyses have identified over 80 million variants. This explosion in genomic data has been accompanied by the ability to selectively edit the rat genome, leading to hundreds of new strains through multiple technologies. A number of resources have been developed to provide investigators with access to precision rat models, comprehensive datasets, and sophisticated software tools necessary for their research. Those profiled here include the Rat Genome Database, PhenoGen, Gene Editing Rat Resource Center, Rat Resource and Research Center, and the National BioResource Project for the Rat in Japan.
Affiliation(s)
- Mary Shimoyama
- Department of Biomedical Engineering, Marquette University and the Medical College of Wisconsin, Milwaukee, Wisconsin. Rat Genome Database, Department of Biomedical Engineering at Marquette University and the Medical College of Wisconsin, Milwaukee, Wisconsin. Department of Veterinary Pathobiology, College of Veterinary Medicine, University of Missouri, Columbia, Missouri. Institute of Laboratory Animals, Graduate School of Medicine, Kyoto University, Kyoto, Japan. Department of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, Aurora, Colorado. Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
| | - Jennifer R Smith
- Department of Biomedical Engineering, Marquette University and the Medical College of Wisconsin, Milwaukee, Wisconsin. Rat Genome Database, Department of Biomedical Engineering at Marquette University and the Medical College of Wisconsin, Milwaukee, Wisconsin. Department of Veterinary Pathobiology, College of Veterinary Medicine, University of Missouri, Columbia, Missouri. Institute of Laboratory Animals, Graduate School of Medicine, Kyoto University, Kyoto, Japan. Department of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, Aurora, Colorado. Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
| | - Elizabeth Bryda
- Department of Biomedical Engineering, Marquette University and the Medical College of Wisconsin, Milwaukee, Wisconsin. Rat Genome Database, Department of Biomedical Engineering at Marquette University and the Medical College of Wisconsin, Milwaukee, Wisconsin. Department of Veterinary Pathobiology, College of Veterinary Medicine, University of Missouri, Columbia, Missouri. Institute of Laboratory Animals, Graduate School of Medicine, Kyoto University, Kyoto, Japan. Department of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, Aurora, Colorado. Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
| | - Takashi Kuramoto
- Department of Biomedical Engineering, Marquette University and the Medical College of Wisconsin, Milwaukee, Wisconsin. Rat Genome Database, Department of Biomedical Engineering at Marquette University and the Medical College of Wisconsin, Milwaukee, Wisconsin. Department of Veterinary Pathobiology, College of Veterinary Medicine, University of Missouri, Columbia, Missouri. Institute of Laboratory Animals, Graduate School of Medicine, Kyoto University, Kyoto, Japan. Department of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, Aurora, Colorado. Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
| | - Laura Saba
- Department of Biomedical Engineering, Marquette University and the Medical College of Wisconsin, Milwaukee, Wisconsin. Rat Genome Database, Department of Biomedical Engineering at Marquette University and the Medical College of Wisconsin, Milwaukee, Wisconsin. Department of Veterinary Pathobiology, College of Veterinary Medicine, University of Missouri, Columbia, Missouri. Institute of Laboratory Animals, Graduate School of Medicine, Kyoto University, Kyoto, Japan. Department of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, Aurora, Colorado. Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
| | - Melinda Dwinell
- Department of Biomedical Engineering, Marquette University and the Medical College of Wisconsin, Milwaukee, Wisconsin. Rat Genome Database, Department of Biomedical Engineering at Marquette University and the Medical College of Wisconsin, Milwaukee, Wisconsin. Department of Veterinary Pathobiology, College of Veterinary Medicine, University of Missouri, Columbia, Missouri. Institute of Laboratory Animals, Graduate School of Medicine, Kyoto University, Kyoto, Japan. Department of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, Aurora, Colorado. Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
| |
|
17
|
Blanch A, García R, Planes J, Gil R, Balada F, Blanco E, Aluja A. Ontologies About Human Behavior. EUROPEAN PSYCHOLOGIST 2017. [DOI: 10.1027/1016-9040/a000295] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 11/23/2022]
Abstract
The development of information and communication technologies has stimulated a variety of data and informational resources about human behavior. This is contributing toward collaborative efforts in the formalization and systematization of an overwhelming volume of scientific information. Several tools are helpful for this endeavor, among which the ontology is growing in popularity. Most of the available informational resources adopt the ontology to organize a shared conceptualization of a given body of knowledge. In the present study, we reviewed ontology resources (n = 17) that can be of interest to researchers and scholars involved in human behavior and psychological research. The selected ontologies were contrasted on the three main components of ontologies (classes, individuals, and properties) and on scheme and knowledge metrics. Moreover, we recorded the associations of the terms within a given ontology with terms of other ontologies (mappings), the number of projects using a particular ontology, and whether an ontology was available within the Bioportal, an extensive repository of biomedical ontologies. A few working examples were also provided to clarify how these resources might contribute to improving the analysis, understanding, and research cooperation about human behavior and psychological research.
Affiliation(s)
- Angel Blanch
- Department of Psychology, Faculty of Education, Psychology and Social Work, University of Lleida, Spain
- Institute of Biomedical Research (IRB Lleida), Spain
- Roberto García
- Department of Computing Science and Industrial Engineering, University of Lleida, Spain
- Jordi Planes
- Department of Computing Science and Industrial Engineering, University of Lleida, Spain
- Rosa Gil
- Department of Computing Science and Industrial Engineering, University of Lleida, Spain
- Ferran Balada
- Department of Psychobiology, Institute of Neurosciences, Universitat Autònoma de Barcelona, Spain
- Eduardo Blanco
- Department of Psychology, Faculty of Education, Psychology and Social Work, University of Lleida, Spain
- Institute of Biomedical Research (IRB Lleida), Spain
- Anton Aluja
- Department of Psychology, Faculty of Education, Psychology and Social Work, University of Lleida, Spain
- Institute of Biomedical Research (IRB Lleida), Spain
|
18
|
Zwierzyna M, Overington JP. Classification and analysis of a large collection of in vivo bioassay descriptions. PLoS Comput Biol 2017; 13:e1005641. [PMID: 28678787 PMCID: PMC5517062 DOI: 10.1371/journal.pcbi.1005641] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 02/27/2017] [Revised: 07/19/2017] [Accepted: 06/21/2017] [Indexed: 12/17/2022] Open
Abstract
Testing potential drug treatments in animal disease models is a decisive step of all preclinical drug discovery programs. Yet, despite the importance of such experiments for translational medicine, there have been relatively few efforts to comprehensively and consistently analyze the data produced by in vivo bioassays. This is partly due to their complexity and lack of accepted reporting standards-publicly available animal screening data are only accessible in unstructured free-text format, which hinders computational analysis. In this study, we use text mining to extract information from the descriptions of over 100,000 drug screening-related assays in rats and mice. We retrieve our dataset from ChEMBL-an open-source literature-based database focused on preclinical drug discovery. We show that in vivo assay descriptions can be effectively mined for relevant information, including experimental factors that might influence the outcome and reproducibility of animal research: genetic strains, experimental treatments, and phenotypic readouts used in the experiments. We further systematize extracted information using unsupervised language model (Word2Vec), which learns semantic similarities between terms and phrases, allowing identification of related animal models and classification of entire assay descriptions. In addition, we show that random forest models trained on features generated by Word2Vec can predict the class of drugs tested in different in vivo assays with high accuracy. Finally, we combine information mined from text with curated annotations stored in ChEMBL to investigate the patterns of usage of different animal models across a range of experiments, drug classes, and disease areas.
Affiliation(s)
- Magdalena Zwierzyna
- BenevolentAI, London, United Kingdom
- Institute of Cardiovascular Science, University College London, London, United Kingdom
- John P. Overington
- BenevolentAI, London, United Kingdom
- Institute of Cardiovascular Science, University College London, London, United Kingdom
|
19
|
Shriberg LD, Strand EA, Fourakis M, Jakielski KJ, Hall SD, Karlsson HB, Mabie HL, McSweeny JL, Tilkens CM, Wilson DL. A Diagnostic Marker to Discriminate Childhood Apraxia of Speech From Speech Delay: I. Development and Description of the Pause Marker. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2017; 60:S1096-S1117. [PMID: 28384779 PMCID: PMC5548086 DOI: 10.1044/2016_jslhr-s-15-0296] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 08/24/2015] [Revised: 04/12/2016] [Accepted: 08/21/2016] [Indexed: 05/10/2023]
Abstract
Purpose The goal of this article (PM I) is to describe the rationale for and development of the Pause Marker (PM), a single-sign diagnostic marker proposed to discriminate early or persistent childhood apraxia of speech from speech delay. Method The authors describe and prioritize 7 criteria with which to evaluate the research and clinical utility of a diagnostic marker for childhood apraxia of speech, including evaluation of the present proposal. An overview is given of the Speech Disorders Classification System, including extensions completed in the same approximately 3-year period in which the PM was developed. Results The finalized Speech Disorders Classification System includes a nosology and cross-classification procedures for childhood and persistent speech disorders and motor speech disorders (Shriberg, Strand, & Mabie, 2017). A PM is developed that provides procedural and scoring information, and citations to papers and technical reports that include audio exemplars of the PM and reference data used to standardize PM scores are provided. Conclusions The PM described here is an acoustic-aided perceptual sign that quantifies one aspect of speech precision in the linguistic domain of phrasing. This diagnostic marker can be used to discriminate early or persistent childhood apraxia of speech from speech delay.
Affiliation(s)
- Kathy J. Jakielski
- Department of Communication Sciences and Disorders, Augustana College, Rock Island, IL
|
20
|
Oberkampf H, Zillner S, Overton JA, Bauer B, Cavallaro A, Uder M, Hammon M. Semantic representation of reported measurements in radiology. BMC Med Inform Decis Mak 2016; 16:5. [PMID: 26801764 PMCID: PMC4722630 DOI: 10.1186/s12911-016-0248-9] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 01/08/2016] [Accepted: 01/20/2016] [Indexed: 12/23/2022] Open
Abstract
Background In radiology, a vast amount of diverse data is generated, and unstructured reporting is standard. Hence, much useful information is trapped in free-text form, and often lost in translation and transmission. One relevant source of free-text data consists of reports covering the assessment of changes in tumor burden, which are needed for the evaluation of cancer treatment success. Any change of lesion size is a critical factor in follow-up examinations. It is difficult to retrieve specific information from unstructured reports and to compare them over time. Therefore, a prototype was implemented that demonstrates the structured representation of findings, allowing selective review in consecutive examinations and thus more efficient comparison over time. Methods We developed a semantic Model for Clinical Information (MCI) based on existing ontologies from the Open Biological and Biomedical Ontologies (OBO) library. MCI is used for the integrated representation of measured image findings and medical knowledge about the normal size of anatomical entities. An integrated view of the radiology findings is realized by a prototype implementation of a ReportViewer. Further, RECIST (Response Evaluation Criteria In Solid Tumors) guidelines are implemented by SPARQL queries on MCI. The evaluation is based on two data sets of German radiology reports: An oncologic data set consisting of 2584 reports on 377 lymphoma patients and a mixed data set consisting of 6007 reports on diverse medical and surgical patients. All measurement findings were automatically classified as abnormal/normal using formalized medical background knowledge, i.e., knowledge that has been encoded into an ontology. A radiologist evaluated 813 classifications as correct or incorrect. All unclassified findings were evaluated as incorrect. Results The proposed approach allows the automatic classification of findings with an accuracy of 96.4 % for oncologic reports and 92.9 % for mixed reports. The ReportViewer permits efficient comparison of measured findings from consecutive examinations. The implementation of RECIST guidelines with SPARQL enhances the quality of the selection and comparison of target lesions as well as the corresponding treatment response evaluation. Conclusions The developed MCI enables an accurate integrated representation of reported measurements and medical knowledge. Thus, measurements can be automatically classified and integrated in different decision processes. The structured representation is suitable for improved integration of clinical findings during decision-making. The proposed ReportViewer provides a longitudinal overview of the measurements.
Affiliation(s)
- Heiner Oberkampf
- Department of Computer Science, Software Methodologies for Distributed Systems, University of Augsburg, Universitätsstraße 6a, 86159, Augsburg, Germany.
- Corporate Technology, Siemens AG, Otto-Hahn-Ring 6, 81739, München, Germany.
- Sonja Zillner
- Corporate Technology, Siemens AG, Otto-Hahn-Ring 6, 81739, München, Germany.
- School of International Business and Entrepreneurship, Steinbeis University, Kalkofenstraße 53, 71083, Herrenberg, Germany.
- Bernhard Bauer
- Department of Computer Science, Software Methodologies for Distributed Systems, University of Augsburg, Universitätsstraße 6a, 86159, Augsburg, Germany.
- Alexander Cavallaro
- Department of Radiology, University Hospital Erlangen, Maximiliansplatz 1, 91054, Erlangen, Germany.
- Michael Uder
- Department of Radiology, University Hospital Erlangen, Maximiliansplatz 1, 91054, Erlangen, Germany.
- Matthias Hammon
- Department of Radiology, University Hospital Erlangen, Maximiliansplatz 1, 91054, Erlangen, Germany.
|
21
|
Hu ZL, Park CA, Reecy JM. Developmental progress and current status of the Animal QTLdb. Nucleic Acids Res 2015; 44:D827-33. [PMID: 26602686 PMCID: PMC4702873 DOI: 10.1093/nar/gkv1233] [Citation(s) in RCA: 188] [Impact Index Per Article: 20.9] [Received: 09/16/2015] [Accepted: 10/30/2015] [Indexed: 11/14/2022] Open
Abstract
The Animal QTL Database (QTLdb; http://www.animalgenome.org/QTLdb) has undergone dramatic growth in recent years in terms of new data curated, data downloads and new functions and tools. We have focused our development efforts to cope with challenges arising from rapid growth of newly published data and end users' data demands, and to optimize data retrieval and analysis to facilitate users' research. Evidenced by the 27 releases in the past 11 years, the growth of the QTLdb has been phenomenal. Here we report our recent progress which is highlighted by addition of one new species, four new data types, four new user tools, a new API tool set, numerous new functions and capabilities added to the curator tool set, expansion of our data alliance partners and more than 20 other improvements. In this paper we present a summary of our progress to date and an outlook regarding future directions.
Affiliation(s)
- Zhi-Liang Hu
- Department of Animal Science, Iowa State University, 2255 Kildee Hall, Ames, IA 50011, USA
- Carissa A Park
- Department of Animal Science, Iowa State University, 2255 Kildee Hall, Ames, IA 50011, USA
- James M Reecy
- Department of Animal Science, Iowa State University, 2255 Kildee Hall, Ames, IA 50011, USA
|
22
|
Oellrich A, Collier N, Groza T, Rebholz-Schuhmann D, Shah N, Bodenreider O, Boland MR, Georgiev I, Liu H, Livingston K, Luna A, Mallon AM, Manda P, Robinson PN, Rustici G, Simon M, Wang L, Winnenburg R, Dumontier M. The digital revolution in phenotyping. Brief Bioinform 2015; 17:819-30. [PMID: 26420780 PMCID: PMC5036847 DOI: 10.1093/bib/bbv083] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.3] [Received: 06/03/2015] [Indexed: 12/22/2022] Open
Abstract
Phenotypes have gained increased notoriety in the clinical and biological domain owing to their application in numerous areas such as the discovery of disease genes and drug targets, phylogenetics and pharmacogenomics. Phenotypes, defined as observable characteristics of organisms, can be seen as one of the bridges that lead to a translation of experimental findings into clinical applications and thereby support 'bench to bedside' efforts. However, to build this translational bridge, a common and universal understanding of phenotypes is required that goes beyond domain-specific definitions. To achieve this ambitious goal, a digital revolution is ongoing that enables the encoding of data in computer-readable formats and the data storage in specialized repositories, ready for integration, enabling translational research. While phenome research is an ongoing endeavor, the true potential hidden in the currently available data still needs to be unlocked, offering exciting opportunities for the forthcoming years. Here, we provide insights into the state-of-the-art in digital phenotyping, by means of representing, acquiring and analyzing phenotype data. In addition, we provide visions of this field for future research work that could enable better applications of phenotype data.
|
23
|
Wang SJ, Laulederkind SJF, Hayman GT, Petri V, Liu W, Smith JR, Nigam R, Dwinell MR, Shimoyama M. PhenoMiner: a quantitative phenotype database for the laboratory rat, Rattus norvegicus. Application in hypertension and renal disease. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2015; 2015:bau128. [PMID: 25632109 PMCID: PMC4309021 DOI: 10.1093/database/bau128] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Indexed: 11/15/2022]
Abstract
Rats have been used extensively as animal models to study physiological and pathological processes involved in human diseases. Numerous rat strains have been selectively bred for certain biological traits related to specific medical interests. Recently, the Rat Genome Database (http://rgd.mcw.edu) has initiated the PhenoMiner project to integrate quantitative phenotype data from the PhysGen Program for Genomic Applications and the National BioResource Project in Japan as well as manual annotations from biomedical literature. PhenoMiner, the search engine for these integrated phenotype data, facilitates mining of data sets across studies by searching the database with a combination of terms from four different ontologies/vocabularies (Rat Strain Ontology, Clinical Measurement Ontology, Measurement Method Ontology and Experimental Condition Ontology). In this study, salt-induced hypertension was used as a model to retrieve blood pressure records of Brown Norway, Fawn-Hooded Hypertensive (FHH) and Dahl salt-sensitive (SS) rat strains. The records from these three strains served as a basis for comparing records from consomic/congenic/mutant offspring derived from them. We examined the cardiovascular and renal phenotypes of consomics derived from FHH and SS, and of SS congenics and mutants. The availability of quantitative records across laboratories in one database, such as these provided by PhenoMiner, can empower researchers to make the best use of publicly available data. Database URL:http://rgd.mcw.edu
Affiliation(s)
- Shur-Jen Wang
- Human and Molecular Genetics Center, Medical College of Wisconsin, Department of Physiology, Medical College of Wisconsin and Department of Surgery, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI 53226, USA
- Stanley J F Laulederkind
- Human and Molecular Genetics Center, Medical College of Wisconsin, Department of Physiology, Medical College of Wisconsin and Department of Surgery, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI 53226, USA
- G Thomas Hayman
- Human and Molecular Genetics Center, Medical College of Wisconsin, Department of Physiology, Medical College of Wisconsin and Department of Surgery, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI 53226, USA
- Victoria Petri
- Human and Molecular Genetics Center, Medical College of Wisconsin, Department of Physiology, Medical College of Wisconsin and Department of Surgery, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI 53226, USA
- Weisong Liu
- Human and Molecular Genetics Center, Medical College of Wisconsin, Department of Physiology, Medical College of Wisconsin and Department of Surgery, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI 53226, USA
- Jennifer R Smith
- Human and Molecular Genetics Center, Medical College of Wisconsin, Department of Physiology, Medical College of Wisconsin and Department of Surgery, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI 53226, USA
- Rajni Nigam
- Human and Molecular Genetics Center, Medical College of Wisconsin, Department of Physiology, Medical College of Wisconsin and Department of Surgery, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI 53226, USA
- Melinda R Dwinell
- Human and Molecular Genetics Center, Medical College of Wisconsin, Department of Physiology, Medical College of Wisconsin and Department of Surgery, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI 53226, USA
- Mary Shimoyama
- Human and Molecular Genetics Center, Medical College of Wisconsin, Department of Physiology, Medical College of Wisconsin and Department of Surgery, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI 53226, USA
|
24
|
Liu W, Laulederkind SJF, Hayman GT, Wang SJ, Nigam R, Smith JR, De Pons J, Dwinell MR, Shimoyama M. OntoMate: a text-mining tool aiding curation at the Rat Genome Database. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2015; 2015:bau129. [PMID: 25619558 PMCID: PMC4305386 DOI: 10.1093/database/bau129] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Indexed: 01/24/2023]
Abstract
The Rat Genome Database (RGD) is the premier repository of rat genomic, genetic and physiologic data. Converting data from free text in the scientific literature to a structured format is one of the main tasks of all model organism databases. RGD spends considerable effort manually curating gene, Quantitative Trait Locus (QTL) and strain information. The rapidly growing volume of biomedical literature and the active research in the biological natural language processing (bioNLP) community have given RGD the impetus to adopt text-mining tools to improve curation efficiency. Recently, RGD has initiated a project to use OntoMate, an ontology-driven, concept-based literature search engine developed at RGD, as a replacement for the PubMed (http://www.ncbi.nlm.nih.gov/pubmed) search engine in the gene curation workflow. OntoMate tags abstracts with gene names, gene mutations, organism name and most of the 16 ontologies/vocabularies used at RGD. All terms/ entities tagged to an abstract are listed with the abstract in the search results. All listed terms are linked both to data entry boxes and a term browser in the curation tool. OntoMate also provides user-activated filters for species, date and other parameters relevant to the literature search. Using the system for literature search and import has streamlined the process compared to using PubMed. The system was built with a scalable and open architecture, including features specifically designed to accelerate the RGD gene curation process. With the use of bioNLP tools, RGD has added more automation to its curation workflow. Database URL:http://rgd.mcw.edu
Affiliation(s)
- Weisong Liu
- Human and Molecular Genetics Center, Medical College of Wisconsin, Department of Quantitative Health Sciences, University of Massachusetts Medical School, Department of Physiology, Medical College of Wisconsin and Department of Surgery, Medical College of Wisconsin, 8701 Watertown Plank Rd, Milwaukee, WI 53226-3548, USA
- Stanley J F Laulederkind
- Human and Molecular Genetics Center, Medical College of Wisconsin, Department of Quantitative Health Sciences, University of Massachusetts Medical School, Department of Physiology, Medical College of Wisconsin and Department of Surgery, Medical College of Wisconsin, 8701 Watertown Plank Rd, Milwaukee, WI 53226-3548, USA
- G Thomas Hayman
- Human and Molecular Genetics Center, Medical College of Wisconsin, Department of Quantitative Health Sciences, University of Massachusetts Medical School, Department of Physiology, Medical College of Wisconsin and Department of Surgery, Medical College of Wisconsin, 8701 Watertown Plank Rd, Milwaukee, WI 53226-3548, USA
- Shur-Jen Wang
- Human and Molecular Genetics Center, Medical College of Wisconsin, Department of Quantitative Health Sciences, University of Massachusetts Medical School, Department of Physiology, Medical College of Wisconsin and Department of Surgery, Medical College of Wisconsin, 8701 Watertown Plank Rd, Milwaukee, WI 53226-3548, USA
- Rajni Nigam
- Human and Molecular Genetics Center, Medical College of Wisconsin, Department of Quantitative Health Sciences, University of Massachusetts Medical School, Department of Physiology, Medical College of Wisconsin and Department of Surgery, Medical College of Wisconsin, 8701 Watertown Plank Rd, Milwaukee, WI 53226-3548, USA
- Jennifer R Smith
- Human and Molecular Genetics Center, Medical College of Wisconsin, Department of Quantitative Health Sciences, University of Massachusetts Medical School, Department of Physiology, Medical College of Wisconsin and Department of Surgery, Medical College of Wisconsin, 8701 Watertown Plank Rd, Milwaukee, WI 53226-3548, USA
- Jeff De Pons
- Human and Molecular Genetics Center, Medical College of Wisconsin, Department of Quantitative Health Sciences, University of Massachusetts Medical School, Department of Physiology, Medical College of Wisconsin and Department of Surgery, Medical College of Wisconsin, 8701 Watertown Plank Rd, Milwaukee, WI 53226-3548, USA
- Melinda R Dwinell
- Human and Molecular Genetics Center, Medical College of Wisconsin, Department of Quantitative Health Sciences, University of Massachusetts Medical School, Department of Physiology, Medical College of Wisconsin and Department of Surgery, Medical College of Wisconsin, 8701 Watertown Plank Rd, Milwaukee, WI 53226-3548, USA
- Mary Shimoyama
- Human and Molecular Genetics Center, Medical College of Wisconsin, Department of Quantitative Health Sciences, University of Massachusetts Medical School, Department of Physiology, Medical College of Wisconsin and Department of Surgery, Medical College of Wisconsin, 8701 Watertown Plank Rd, Milwaukee, WI 53226-3548, USA
|
25
|
Shimoyama M, De Pons J, Hayman GT, Laulederkind SJF, Liu W, Nigam R, Petri V, Smith JR, Tutaj M, Wang SJ, Worthey E, Dwinell M, Jacob H. The Rat Genome Database 2015: genomic, phenotypic and environmental variations and disease. Nucleic Acids Res 2014; 43:D743-50. [PMID: 25355511 PMCID: PMC4383884 DOI: 10.1093/nar/gku1026] [Citation(s) in RCA: 167] [Impact Index Per Article: 16.7] [Indexed: 01/12/2023] Open
Abstract
The Rat Genome Database (RGD, http://rgd.mcw.edu) provides the most comprehensive data repository and informatics platform related to the laboratory rat, one of the most important model organisms for disease studies. RGD maintains and updates datasets for genomic elements such as genes, transcripts and increasingly in recent years, sequence variations, as well as map positions for multiple assemblies and sequence information. Functional annotations for genomic elements are curated from published literature, submitted by researchers and integrated from other public resources. Complementing the genomic data catalogs are those associated with phenotypes and disease, including strains, QTL and experimental phenotype measurements across hundreds of strains. Data are submitted by researchers, acquired through bulk data pipelines or curated from published literature. Innovative software tools provide users with an integrated platform to query, mine, display and analyze valuable genomic and phenomic datasets for discovery and enhancement of their own research. This update highlights recent developments that reflect an increasing focus on: (i) genomic variation, (ii) phenotypes and diseases, (iii) data related to the environment and experimental conditions and (iv) datasets and software tools that allow the user to explore and analyze the interactions among these and their impact on disease.
Affiliation(s)
- Mary Shimoyama
- Human and Molecular Genetics Center, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Department of Surgery, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Jeff De Pons
- Human and Molecular Genetics Center, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- G Thomas Hayman
- Human and Molecular Genetics Center, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Weisong Liu
- Human and Molecular Genetics Center, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Rajni Nigam
- Human and Molecular Genetics Center, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Victoria Petri
- Human and Molecular Genetics Center, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Jennifer R Smith
- Human and Molecular Genetics Center, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Marek Tutaj
- Human and Molecular Genetics Center, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Shur-Jen Wang
- Human and Molecular Genetics Center, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Elizabeth Worthey
- Human and Molecular Genetics Center, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Department of Pediatrics, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Melinda Dwinell
- Human and Molecular Genetics Center, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Howard Jacob
- Human and Molecular Genetics Center, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
|
26
|
Hancock JM. Commentary on Shimoyama et al. (2012): three ontologies to define phenotype measurement data. Front Genet 2014; 5:93. [PMID: 24795755 PMCID: PMC4006037 DOI: 10.3389/fgene.2014.00093] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 03/28/2014] [Accepted: 04/03/2014] [Indexed: 01/17/2023] Open
Affiliation(s)
- John M Hancock
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, UK
|
27
|
Hancock JM. Editorial: biological ontologies and semantic biology. Front Genet 2014; 5:18. [PMID: 24550936 PMCID: PMC3912459 DOI: 10.3389/fgene.2014.00018] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 01/09/2014] [Accepted: 01/21/2014] [Indexed: 01/22/2023] Open
Affiliation(s)
- John M Hancock
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, UK
|
28
|
Viti F, Scaglione S, Orro A, Milanesi L. Guidelines for managing data and processes in bone and cartilage tissue engineering. BMC Bioinformatics 2014; 15 Suppl 1:S14. [PMID: 24564199 PMCID: PMC4015954 DOI: 10.1186/1471-2105-15-s1-s14] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 01/13/2023] Open
Abstract
Background In the last decades, a wide number of researchers/clinicians involved in tissue engineering field published several works about the possibility to induce a tissue regeneration guided by the use of biomaterials. To this aim, different scaffolds have been proposed, and their effectiveness tested through in vitro and/or in vivo experiments. In this context, integration and meta-analysis approaches are gaining importance for analyses and reuse of data as, for example, those concerning the bone and cartilage biomarkers, the biomolecular factors intervening in cell differentiation and growth, the morphology and the biomechanical performance of a neo-formed tissue, and, in general, the scaffolds' ability to promote tissue regeneration. Therefore standards and ontologies are becoming crucial, to provide a unifying knowledge framework for annotating data and supporting the semantic integration and the unambiguous interpretation of novel experimental results. Results In this paper a conceptual framework has been designed for bone/cartilage tissue engineering domain, by now completely lacking standardized methods. A set of guidelines has been provided, defining the minimum information set necessary for describing an experimental study involved in bone and cartilage regenerative medicine field. In addition, a Bone/Cartilage Tissue Engineering Ontology (BCTEO) has been developed to provide a representation of the domain's concepts, specifically oriented to cells, and chemical composition, morphology, physical characterization of biomaterials involved in bone/cartilage tissue engineering research. Conclusions Considering that tissue engineering is a discipline that traverses different semantic fields and employs many data types, the proposed instruments represent a first attempt to standardize the domain knowledge and can provide a suitable means to integrate data across the field.
|
29
|
Nigam R, Munzenmaier DH, Worthey EA, Dwinell MR, Shimoyama M, Jacob HJ. Rat Strain Ontology: structured controlled vocabulary designed to facilitate access to strain data at RGD. J Biomed Semantics 2013; 4:36. [PMID: 24267899 PMCID: PMC4177145 DOI: 10.1186/2041-1480-4-36] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Received: 05/31/2013] [Accepted: 10/02/2013] [Indexed: 11/10/2022] Open
Abstract
Background The Rat Genome Database (RGD) (http://rgd.mcw.edu/) is the premier site for comprehensive data on the different strains of the laboratory rat (Rattus norvegicus). The strain data are collected from various publications, direct submissions from individual researchers, and rat providers worldwide. Rat strain and substrain designation and nomenclature follow the Guidelines for Nomenclature of Mouse and Rat Strains, instituted by the International Committee on Standardized Genetic Nomenclature for Mice. While symbols and names aid in identifying strains correctly, the flat nature of this information prohibits easy search and retrieval, as well as other data mining functions. In order to improve these functionalities, particularly in ontology-based tools, the Rat Strain Ontology (RS) was developed. Results The Rat Strain Ontology (RS) reflects the breeding history, parental background, and genetic manipulation of rat strains. This controlled vocabulary organizes strains by type: inbred, outbred, chromosome altered, congenic, mutant and so on. In addition, under the chromosome altered category, strains are organized by chromosome, and further by type of manipulation, such as mutant or congenic. This allows users to easily retrieve strains of interest with modifications in specific genomic regions. The ontology was developed using the Open Biological and Biomedical Ontology (OBO) file format and is organized as a directed acyclic graph (DAG). Rat Strain Ontology IDs are included as part of the strain report (RS: ######). Conclusions As rat researchers are often unaware of the number of substrains or altered strains within a breeding line, this vocabulary now provides an easy way to retrieve all substrains and accompanying information. Its usefulness is particularly evident in tools such as the PhenoMiner at RGD, where users can now easily retrieve phenotype measurement data for related strains, strains with similar backgrounds or those with similar introgressed regions. This controlled vocabulary also allows better retrieval and filtering for QTLs and in genomic tools such as the GViewer. The Rat Strain Ontology has been incorporated into the RGD Ontology Browser (http://rgd.mcw.edu/rgdweb/ontology/view.html?acc_id=RS:0000457#s) and is available through the National Center for Biomedical Ontology (http://bioportal.bioontology.org/ontologies/1150) or the RGD ftp site (ftp://rgd.mcw.edu/pub/ontology/rat_strain/).
Affiliation(s)
- Rajni Nigam
- Human and Molecular Genetics Center, Medical College of Wisconsin, 8701 Watertown Plank Rd, Milwaukee 53226-3548, WI, USA.
|
30
|
Smith JR, Park CA, Nigam R, Laulederkind SJF, Hayman GT, Wang SJ, Lowry TF, Petri V, Pons JD, Tutaj M, Liu W, Worthey EA, Shimoyama M, Dwinell MR. The clinical measurement, measurement method and experimental condition ontologies: expansion, improvements and new applications. J Biomed Semantics 2013; 4:26. [PMID: 24103152 PMCID: PMC3882879 DOI: 10.1186/2041-1480-4-26] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Received: 05/23/2013] [Accepted: 10/01/2013] [Indexed: 11/17/2022]
Abstract
BACKGROUND The Clinical Measurement Ontology (CMO), Measurement Method Ontology (MMO), and Experimental Condition Ontology (XCO) were originally developed at the Rat Genome Database (RGD) to standardize quantitative rat phenotype data in order to integrate results from multiple studies into the PhenoMiner database and data mining tool. These ontologies provide the framework for presenting what was measured, how it was measured, and under what conditions it was measured. RESULTS There has been a continuing expansion of subdomains in each ontology with a parallel 2-3 fold increase in the total number of terms, substantially increasing the size and improving the scope of the ontologies. The proportion of terms with textual definitions has increased from ~60% to over 80% with greater synchronization of format and content throughout the three ontologies. Representation of definition source Uniform Resource Identifiers (URI) has been standardized, including the removal of all non-URI characters, and systematic versioning of all ontology files has been implemented. The continued expansion and success of these ontologies has facilitated the integration of more than 60,000 records into the RGD PhenoMiner database. In addition, new applications of these ontologies, such as annotation of Quantitative Trait Loci (QTL), have been added at the sites actively using them, including RGD and the Animal QTL Database. CONCLUSIONS The improvements to these three ontologies have been substantial, and development is ongoing. New terms and expansions to the ontologies continue to be added as a result of active curation efforts at RGD and the Animal QTL database. Use of these vocabularies to standardize data representation for quantitative phenotypes and quantitative trait loci across databases for multiple species has demonstrated their utility for integrating diverse data types from multiple sources. 
These ontologies are freely available for download and use from the NCBO BioPortal website at http://bioportal.bioontology.org/ontologies/1583 (CMO), http://bioportal.bioontology.org/ontologies/1584 (MMO), and http://bioportal.bioontology.org/ontologies/1585 (XCO), or from the RGD ftp site at ftp://rgd.mcw.edu/pub/ontology/.
Affiliation(s)
- Jennifer R Smith
- Human and Molecular Genetics Center, Medical College of Wisconsin, Milwaukee, WI, USA
- Carissa A Park
- Department of Animal Science, Iowa State University, Ames, IA, USA
- Rajni Nigam
- Human and Molecular Genetics Center, Medical College of Wisconsin, Milwaukee, WI, USA
- G Thomas Hayman
- Human and Molecular Genetics Center, Medical College of Wisconsin, Milwaukee, WI, USA
- Shur-Jen Wang
- Human and Molecular Genetics Center, Medical College of Wisconsin, Milwaukee, WI, USA
- Timothy F Lowry
- Human and Molecular Genetics Center, Medical College of Wisconsin, Milwaukee, WI, USA
- Victoria Petri
- Human and Molecular Genetics Center, Medical College of Wisconsin, Milwaukee, WI, USA
- Jeff De Pons
- Human and Molecular Genetics Center, Medical College of Wisconsin, Milwaukee, WI, USA
- Marek Tutaj
- Human and Molecular Genetics Center, Medical College of Wisconsin, Milwaukee, WI, USA
- Weisong Liu
- Human and Molecular Genetics Center, Medical College of Wisconsin, Milwaukee, WI, USA
- Elizabeth A Worthey
- Human and Molecular Genetics Center, Medical College of Wisconsin, Milwaukee, WI, USA
- Department of Pediatrics, Medical College of Wisconsin, Milwaukee, WI, USA
- Mary Shimoyama
- Human and Molecular Genetics Center, Medical College of Wisconsin, Milwaukee, WI, USA
- Department of Surgery, Medical College of Wisconsin, Milwaukee, WI, USA
- Melinda R Dwinell
- Human and Molecular Genetics Center, Medical College of Wisconsin, Milwaukee, WI, USA
- Department of Physiology, Medical College of Wisconsin, Milwaukee, WI, USA
|
31
|
Knüpfer C, Beckstein C. Function of dynamic models in systems biology: linking structure to behaviour. J Biomed Semantics 2013; 4:24. [PMID: 24103739 PMCID: PMC3853929 DOI: 10.1186/2041-1480-4-24] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 01/17/2013] [Accepted: 05/23/2013] [Indexed: 11/10/2022]
Abstract
BACKGROUND Dynamic models in Systems Biology are used in computational simulation experiments for addressing biological questions. The complexity of the modelled biological systems and the growing number and size of the models call for computer support for modelling and simulation in Systems Biology. This computer support has to be based on formal representations of relevant knowledge fragments. RESULTS In this paper we describe different functional aspects of dynamic models. This description is conceptually embedded in our "meaning facets" framework which systematises the interpretation of dynamic models in structural, functional and behavioural facets. Here we focus on how function links the structure and the behaviour of a model. Models play a specific role (teleological function) in the scientific process of finding explanations for dynamic phenomena. In order to fulfil this role a model has to be used in simulation experiments (pragmatical function). A simulation experiment always refers to a specific situation and a state of the model and the modelled system (conditional function). We claim that the function of dynamic models refers to both the simulation experiment executed by software (intrinsic function) and the biological experiment which produces the phenomena under investigation (extrinsic function). We use the presented conceptual framework for the function of dynamic models to review formal accounts for functional aspects of models in Systems Biology, such as checklists, ontologies, and formal languages. Furthermore, we identify missing formal accounts for some of the functional aspects. In order to fill one of these gaps we propose an ontology for the teleological function of models. CONCLUSION We have thoroughly analysed the role and use of models in Systems Biology. 
The resulting conceptual framework for the function of models is an important first step towards a comprehensive formal representation of the functional knowledge involved in the modelling and simulation process. Any progress in this area will in turn improve computer-supported modelling and simulation in Systems Biology.
|
32
|
Park CA, Bello SM, Smith CL, Hu ZL, Munzenmaier DH, Nigam R, Smith JR, Shimoyama M, Eppig JT, Reecy JM. The Vertebrate Trait Ontology: a controlled vocabulary for the annotation of trait data across species. J Biomed Semantics 2013; 4:13. [PMID: 23937709 PMCID: PMC3851175 DOI: 10.1186/2041-1480-4-13] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.9] [Received: 04/30/2013] [Accepted: 07/05/2013] [Indexed: 11/10/2022]
Abstract
BACKGROUND The use of ontologies to standardize biological data and facilitate comparisons among datasets has steadily grown as the complexity and amount of available data have increased. Despite the numerous ontologies available, one area currently lacking a robust ontology is the description of vertebrate traits. A trait is defined as any measurable or observable characteristic pertaining to an organism or any of its substructures. While there are several ontologies to describe entities and processes in phenotypes, diseases, and clinical measurements, one has not been developed for vertebrate traits; the Vertebrate Trait Ontology (VT) was created to fill this void. DESCRIPTION Significant inconsistencies in trait nomenclature exist in the literature, and additional difficulties arise when trait data are compared across species. The VT is a unified trait vocabulary created to aid in the transfer of data within and between species and to facilitate investigation of the genetic basis of traits. Trait information provides a valuable link between the measurements that are used to assess the trait, the phenotypes related to the traits, and the diseases associated with one or more phenotypes. Because multiple clinical and morphological measurements are often used to assess a single trait, and a single measurement can be used to assess multiple physiological processes, providing investigators with standardized annotations for trait data will allow them to investigate connections among these data types. CONCLUSIONS The annotation of genomic data with ontology terms provides unique opportunities for data mining and analysis. Links between data in disparate databases can be identified and explored, a strategy that is particularly useful for cross-species comparisons or in situations involving inconsistent terminology. The VT provides a common basis for the description of traits in multiple vertebrate species. 
It is being used in the Rat Genome Database and Animal QTL Database for annotation of QTL data for rat, cattle, chicken, swine, sheep, and rainbow trout, and in the Mouse Phenome Database to annotate strain characterization data. In these databases, data are also cross-referenced to applicable terms from other ontologies, providing additional avenues for data mining and analysis. The ontology is available at http://bioportal.bioontology.org/ontologies/50138.
Affiliation(s)
- Carissa A Park
- Department of Animal Science, Iowa State University, Ames, IA, USA.
|
33
|
TRAK ontology: Defining standard care for the rehabilitation of knee conditions. J Biomed Inform 2013; 46:615-25. [DOI: 10.1016/j.jbi.2013.04.009] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Received: 03/22/2013] [Revised: 04/23/2013] [Accepted: 04/25/2013] [Indexed: 11/24/2022]
|
34
|
Nigam R, Laulederkind SJF, Hayman GT, Smith JR, Wang SJ, Lowry TF, Petri V, De Pons J, Tutaj M, Liu W, Jayaraman P, Munzenmaier DH, Worthey EA, Dwinell MR, Shimoyama M, Jacob HJ. Rat Genome Database: a unique resource for rat, human, and mouse quantitative trait locus data. Physiol Genomics 2013; 45:809-16. [PMID: 23881287 DOI: 10.1152/physiolgenomics.00065.2013] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Indexed: 11/22/2022]
Abstract
The rat has been widely used as a disease model in a laboratory setting, resulting in an abundance of genetic and phenotype data from a wide variety of studies. These data can be found at the Rat Genome Database (RGD, http://rgd.mcw.edu/), which provides a platform for researchers interested in linking genomic variations to phenotypes. Quantitative trait loci (QTLs) form one of the earliest and core datasets, allowing researchers to identify loci harboring genes associated with disease. These QTLs are not only important for those using the rat to identify genes and regions associated with disease, but also for cross-organism analyses of syntenic regions on the mouse and the human genomes to identify potential regions for study in these organisms. Currently, RGD has data on >1,900 rat QTLs that include details about the methods and animals used to determine the respective QTL along with the genomic positions and markers that define the region. RGD also curates human QTLs (>1,900) and houses >4,000 mouse QTLs (imported from Mouse Genome Informatics). Multiple ontologies are used to standardize traits, phenotypes, diseases, and experimental methods to facilitate queries, analyses, and cross-organism comparisons. QTLs are visualized in tools such as GBrowse and GViewer, with additional tools for analysis of gene sets within QTL regions. The QTL data at RGD provide valuable information for the study of mapped phenotypes and identification of candidate genes for disease associations.
Affiliation(s)
- Rajni Nigam
- Human and Molecular Genetics Center, Medical College of Wisconsin, Milwaukee, Wisconsin
|
35
|
Wang SJ, Laulederkind SJF, Hayman GT, Smith JR, Petri V, Lowry TF, Nigam R, Dwinell MR, Worthey EA, Munzenmaier DH, Shimoyama M, Jacob HJ. Analysis of disease-associated objects at the Rat Genome Database. Database (Oxford) 2013; 2013:bat046. [PMID: 23794737 PMCID: PMC3689439 DOI: 10.1093/database/bat046] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 01/12/2023]
Abstract
The Rat Genome Database (RGD) is the premier resource for genetic, genomic and phenotype data for the laboratory rat, Rattus norvegicus. In addition to organizing biological data from rats, the RGD team focuses on manual curation of gene–disease associations for rat, human and mouse. In this work, we have analyzed disease-associated strains, quantitative trait loci (QTL) and genes from rats. These disease objects form the basis for seven disease portals. Among disease portals, the cardiovascular disease and obesity/metabolic syndrome portals have the highest number of rat strains and QTL. These two portals share 398 rat QTL, and these shared QTL are highly concentrated on rat chromosomes 1 and 2. For disease-associated genes, we performed gene ontology (GO) enrichment analysis across portals using RatMine enrichment widgets. Fifteen GO terms, five from each GO aspect, were selected to profile enrichment patterns of each portal. Of the selected biological process (BP) terms, ‘regulation of programmed cell death’ was the top enriched term across all disease portals except in the obesity/metabolic syndrome portal where ‘lipid metabolic process’ was the most enriched term. ‘Cytosol’ and ‘nucleus’ were common cellular component (CC) annotations for disease genes, but only the cancer portal genes were highly enriched with ‘nucleus’ annotations. Similar enrichment patterns were observed in a parallel analysis using the DAVID functional annotation tool. The relationship between the preselected 15 GO terms and disease terms was examined reciprocally by retrieving rat genes annotated with these preselected terms. The individual GO term–annotated gene list showed enrichment in physiologically related diseases. For example, the ‘regulation of blood pressure’ genes were enriched with cardiovascular disease annotations, and the ‘lipid metabolic process’ genes with obesity annotations. 
Furthermore, we were able to enhance enrichment of neurological diseases by combining ‘G-protein coupled receptor binding’ annotated genes with ‘protein kinase binding’ annotated genes. Database URL: http://rgd.mcw.edu
Affiliation(s)
- Shur-Jen Wang
- Rat Genome Database, Human and Molecular Genetics Center, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI 53226, USA.
|
36
|
Laulederkind SJF, Liu W, Smith JR, Hayman GT, Wang SJ, Nigam R, Petri V, Lowry TF, de Pons J, Dwinell MR, Shimoyama M. PhenoMiner: quantitative phenotype curation at the rat genome database. Database (Oxford) 2013; 2013:bat015. [PMID: 23603846 PMCID: PMC3630803 DOI: 10.1093/database/bat015] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Indexed: 01/10/2023]
Abstract
The Rat Genome Database (RGD) is the premier repository of rat genomic and genetic data and currently houses >40 000 rat gene records as well as human and mouse orthologs, >2000 rat and 1900 human quantitative trait loci (QTLs) records and >2900 rat strain records. Biological information curated for these data objects includes disease associations, phenotypes, pathways, molecular functions, biological processes and cellular components. Recently, a project was initiated at RGD to incorporate quantitative phenotype data for rat strains, in addition to the currently existing qualitative phenotype data for rat strains, QTLs and genes. A specialized curation tool was designed to generate manual annotations with up to six different ontologies/vocabularies used simultaneously to describe a single experimental value from the literature. Concurrently, three of those ontologies needed extensive addition of new terms to move the curation forward. The curation interface development, as well as ontology development, was an ongoing process during the early stages of the PhenoMiner curation project. Database URL: http://rgd.mcw.edu
Affiliation(s)
- Stanley J F Laulederkind
- Human and Molecular Genetics Center, Medical College of Wisconsin, 8701 Watertown Plank Rd, Milwaukee, WI 53226-3548, USA.
|
37
|
Laulederkind SJF, Hayman GT, Wang SJ, Lowry TF, Nigam R, Petri V, Smith JR, Dwinell MR, Jacob HJ, Shimoyama M. Exploring genetic, genomic, and phenotypic data at the rat genome database. Curr Protoc Bioinformatics 2013; Chapter 1:1.14.1-1.14.27. [PMID: 23255149 DOI: 10.1002/0471250953.bi0114s40] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Indexed: 11/11/2022]
Abstract
The laboratory rat, Rattus norvegicus, is an important model of human health and disease, and experimental findings in the rat have relevance to human physiology and disease. The Rat Genome Database (RGD, http://rgd.mcw.edu) is a model organism database that provides access to a wide variety of curated rat data including disease associations, phenotypes, pathways, molecular functions, biological processes, and cellular components for genes, quantitative trait loci, and strains. We present an overview of the database followed by specific examples that can be used to gain experience in employing RGD to explore the wealth of functional data available for the rat.
|
38
|
Laulederkind SJF, Hayman GT, Wang SJ, Smith JR, Lowry TF, Nigam R, Petri V, de Pons J, Dwinell MR, Shimoyama M, Munzenmaier DH, Worthey EA, Jacob HJ. The Rat Genome Database 2013--data, tools and users. Brief Bioinform 2013; 14:520-6. [PMID: 23434633 PMCID: PMC3713714 DOI: 10.1093/bib/bbt007] [Citation(s) in RCA: 60] [Impact Index Per Article: 5.5] [Indexed: 01/29/2023]
Abstract
The Rat Genome Database (RGD) was started >10 years ago to provide a core genomic resource for rat researchers. Currently, RGD combines genetic, genomic, pathway, phenotype and strain information with a focus on disease. RGD users are provided with access to structured and curated data from the molecular level through the organismal level. Those users access RGD from all over the world. End users are not only rat researchers but also researchers working with mouse and human data. Translational research is supported by RGD’s comparative genetics/genomics data in disease portals, in GBrowse, in VCMap and on gene report pages. The impact of RGD also goes beyond the traditional biomedical researcher, as the influence of RGD reaches bioinformaticians, tool developers and curators. Import of RGD data into other publicly available databases expands the influence of RGD to a larger set of end users than those who avail themselves of the RGD website. The value of RGD continues to grow as more types of data and more tools are added, while reaching more types of end users.
Affiliation(s)
- Stanley J F Laulederkind
- Human and Molecular Genetics Center, Medical College of Wisconsin, 8701 Watertown Plank Rd, Milwaukee, WI 53226-3548, USA.
|
39
|
Hu ZL, Park CA, Wu XL, Reecy JM. Animal QTLdb: an improved database tool for livestock animal QTL/association data dissemination in the post-genome era. Nucleic Acids Res 2012. [PMID: 23180796 PMCID: PMC3531174 DOI: 10.1093/nar/gks1150] [Citation(s) in RCA: 285] [Impact Index Per Article: 23.8] [Indexed: 11/14/2022]
Abstract
The Animal QTL database (QTLdb; http://www.animalgenome.org/QTLdb) is designed to house all publicly available QTL and single-nucleotide polymorphism/gene association data on livestock animal species. An earlier version was published in the Nucleic Acids Research Database issue in 2007. Since then, we have continued our efforts to develop new and improved database tools to allow more data types, parameters and functions. Our efforts have transformed the Animal QTLdb into a tool that actively serves the research community as a quality data repository and more importantly, a provider of easily accessible tools and functions to disseminate QTL and gene association information. The QTLdb has been heavily used by the livestock genomics community since its first public release in 2004. To date, there are 5920 cattle, 3442 chicken, 7451 pigs, 753 sheep and 88 rainbow trout data points in the database, and at least 290 publications that cite use of the database. The rapid advancement in genomic studies of cattle, chicken, pigs, sheep and other livestock animals has presented us with challenges, as well as opportunities for the QTLdb to meet the evolving needs of the research community. Here, we report our progress over the recent years and highlight new functions and services available to the general public.
Affiliation(s)
- Zhi-Liang Hu
- Department of Animal Science and Center for Integrated Animal Genomics, Iowa State University, 2255 Kildee Hall, Ames, IA 50011, USA.
|
40
|
Golik W, Dameron O, Bugeon J, Fatet A, Hue I, Hurtaud C, Reichstadt M, Salaün MC, Vernet J, Joret L, Papazian F, Nédellec C, Le Bail PY. ATOL: The Multi-species Livestock Trait Ontology. Communications in Computer and Information Science 2012. [DOI: 10.1007/978-3-642-35233-1_28] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 01/01/2023]
|