1
Bregonzio M, Bernasconi A, Pinoli P. Advancing healthcare through data: the BETTER project's vision for distributed analytics. Front Med (Lausanne) 2024; 11:1473874. [PMID: 39416867 PMCID: PMC11480012 DOI: 10.3389/fmed.2024.1473874] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2024] [Accepted: 09/12/2024] [Indexed: 10/19/2024] [Open Access]
Abstract
Introduction: Data-driven medicine is essential for enhancing the accessibility and quality of the healthcare system. The availability of data plays a crucial role in achieving this goal. Methods: We propose implementing a robust data infrastructure of FAIRification and data fusion for clinical, genomic, and imaging data. This will be embedded within the framework of a distributed analytics platform for healthcare data analysis, utilizing the Personal Health Train paradigm. Results: This infrastructure will ensure the findability, accessibility, interoperability, and reusability of data, metadata, and results among multiple medical centers participating in the BETTER Horizon Europe project. The project focuses on studying rare diseases, such as intellectual disability and inherited retinal dystrophies. Conclusion: The anticipated impacts will benefit a wide range of healthcare practitioners and potentially influence health policymakers.
Affiliation(s)
- Anna Bernasconi
- Department of Information, Electronics, and Bioengineering, Politecnico di Milano, Milan, Italy
- Pietro Pinoli
- Department of Information, Electronics, and Bioengineering, Politecnico di Milano, Milan, Italy
2
Bialy N, Alber F, Andrews B, Angelo M, Beliveau B, Bintu L, Boettiger A, Boehm U, Brown CM, Maina MB, Chambers JJ, Cimini BA, Eliceiri K, Errington R, Faklaris O, Gaudreault N, Germain RN, Goscinski W, Grunwald D, Halter M, Hanein D, Hickey JW, Lacoste J, Laude A, Lundberg E, Ma J, Malacrida L, Moore J, Nelson G, Neumann EK, Nitschke R, Onami S, Pimentel JA, Plant AL, Radtke AJ, Sabata B, Schapiro D, Schöneberg J, Spraggins JM, Sudar D, Vierdag WMAM, Volkmann N, Wählby C, Wang SS, Yaniv Z, Strambio-De-Castillia C. Harmonizing the Generation and Pre-publication Stewardship of FAIR bioimage data. ARXIV 2024:arXiv:2401.13022v5. [PMID: 38351940 PMCID: PMC10862930] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/19/2024]
Abstract
Together with the molecular knowledge of genes and proteins, biological images promise to significantly enhance the scientific understanding of complex cellular systems and to advance predictive and personalized therapeutic products for human health. For this potential to be realized, quality-assured bioimage data must be shared among labs at a global scale to be compared, pooled, and reanalyzed, thus unleashing untold potential beyond the original purpose for which the data was generated. There are two broad sets of requirements to enable bioimage data sharing in the life sciences. One set of requirements is articulated in the companion White Paper entitled "Enabling Global Image Data Sharing in the Life Sciences," which is published in parallel and addresses the need to build the cyberinfrastructure for sharing bioimage data (arXiv:2401.13023 [q-bio.OT], https://doi.org/10.48550/arXiv.2401.13023). Here, we detail a broad set of requirements, which involves collecting, managing, presenting, and propagating contextual information essential to assess the quality, understand the content, interpret the scientific implications, and reuse bioimage data in the context of the experimental details. We start by providing an overview of the main lessons learned to date through international community activities, which have recently made progress toward generating community standard practices for imaging Quality Control (QC) and metadata (Faklaris et al., 2022; Hammer et al., 2021; Huisman et al., 2021; Microscopy Australia, 2016; Montero Llopis et al., 2021; Rigano et al., 2021; Sarkans et al., 2021). We then provide a clear set of recommendations for amplifying this work. The driving goal is to address remaining challenges and democratize access to common practices and tools for a spectrum of biomedical researchers, regardless of their expertise, access to resources, and geographical location.
Affiliation(s)
- Nikki Bialy
- Morgridge Institute for Research, Madison, USA
- Beth A Cimini
- Broad Institute of MIT and Harvard, Imaging Platform, Cambridge, USA
- Kevin Eliceiri
- Morgridge Institute for Research, Madison, USA
- University of Wisconsin-Madison, Madison, USA
- Ronald N Germain
- National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, USA
- Michael Halter
- National Institute of Standards and Technology, Gaithersburg, USA
- Alex Laude
- Newcastle University, Newcastle upon Tyne, UK
- Emma Lundberg
- Stanford University, Palo Alto, USA
- SciLifeLab, KTH Royal Institute of Technology, Stockholm, Sweden
- Jian Ma
- Carnegie Mellon University, Pittsburgh, USA
- Leonel Malacrida
- Institut Pasteur de Montevideo, & Universidad de la República, Montevideo, Uruguay
- Josh Moore
- German BioImaging-Gesellschaft für Mikroskopie und Bildanalyse e.V., Constance, Germany
- Glyn Nelson
- Newcastle University, Newcastle upon Tyne, UK
- Shuichi Onami
- RIKEN Center for Biosystems Dynamics Research, Kobe, Japan
- Anne L Plant
- National Institute of Standards and Technology, Gaithersburg, USA
- Andrea J Radtke
- National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, USA
- Damir Sudar
- Quantitative Imaging Systems LLC, Portland, USA
- Ziv Yaniv
- National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, USA
3
Galgonek J, Vondrášek J. The IDSM mass spectrometry extension: searching mass spectra using SPARQL. Bioinformatics 2024; 40:btae174. [PMID: 38561173 PMCID: PMC11034985 DOI: 10.1093/bioinformatics/btae174] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Revised: 02/24/2024] [Accepted: 03/28/2024] [Indexed: 04/04/2024] [Open Access]
Abstract
SUMMARY: The Integrated Database of Small Molecules (IDSM) integrates data from small-molecule datasets, making them accessible through the SPARQL query language. Its unique feature is the ability to search for compounds through SPARQL based on their molecular structure. We extended IDSM to enable mass spectra databases to be integrated and searched based on mass spectrum similarity. As sources of mass spectra, we employed the MassBank of North America database and the In Silico Spectral Database of natural products. AVAILABILITY AND IMPLEMENTATION: The extension is an integral part of IDSM, which is available at https://idsm.elixir-czech.cz. The manual and usage examples are available at https://idsm.elixir-czech.cz/docs/ms. The source code of all IDSM parts is available under open-source licences at https://github.com/idsm-src.
Affiliation(s)
- Jakub Galgonek
- Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, Prague 160 00, Czech Republic
- Jiří Vondrášek
- Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 2, Prague 160 00, Czech Republic
4
Azzi R, Bordea G, Griffier R, Nikiema JN, Mougin F. Enriching the FIDEO ontology with food-drug interactions from online knowledge sources. J Biomed Semantics 2024; 15:1. [PMID: 38438913 PMCID: PMC10913206 DOI: 10.1186/s13326-024-00302-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2023] [Accepted: 02/05/2024] [Indexed: 03/06/2024] [Open Access]
Abstract
The increasing number of articles on adverse interactions that may occur when specific foods are consumed with certain drugs makes it difficult to keep up with the latest findings. Conflicting information is available in the scientific literature and specialized knowledge bases because interactions are described in an unstructured or semi-structured format. The FIDEO ontology aims to integrate and represent information about food-drug interactions in a structured way. This article reports on the new version of this ontology in which more than 1700 interactions are integrated from two online resources: DrugBank and Hedrine. These food-drug interactions have been represented in FIDEO in the form of precompiled concepts, each of which specifies both the food and the drug involved. Additionally, competency questions that can be answered are reviewed, and avenues for further enrichment are discussed.
Affiliation(s)
- Rabia Azzi
- Univ. Bordeaux, Inserm, BPH, U1219, F-33000, Bordeaux, France
- CHU de Bordeaux, Service d'information médicale, F-33000, Bordeaux, France
- Georgeta Bordea
- Univ. Bordeaux, Inserm, BPH, U1219, F-33000, Bordeaux, France
- Univ. La Rochelle, L3i, F-17000, La Rochelle, France
- Romain Griffier
- Univ. Bordeaux, Inserm, BPH, U1219, F-33000, Bordeaux, France
- CHU de Bordeaux, Service d'information médicale, F-33000, Bordeaux, France
- Jean Noël Nikiema
- Department of Management, Evaluation and Health Policy, School of Public Health, Université de Montréal, Québec, Canada
- Fleur Mougin
- Univ. Bordeaux, Inserm, BPH, U1219, F-33000, Bordeaux, France
5
Gargano MA, Matentzoglu N, Coleman B, Addo-Lartey EB, Anagnostopoulos A, Anderton J, Avillach P, Bagley AM, Bakštein E, Balhoff JP, Baynam G, Bello SM, Berk M, Bertram H, Bishop S, Blau H, Bodenstein DF, Botas P, Boztug K, Čady J, Callahan TJ, Cameron R, Carbon S, Castellanos F, Caufield JH, Chan LE, Chute C, Cruz-Rojo J, Dahan-Oliel N, Davids JR, de Dieuleveult M, de Souza V, de Vries BBA, de Vries E, DePaulo JR, Derfalvi B, Dhombres F, Diaz-Byrd C, Dingemans AJM, Donadille B, Duyzend M, Elfeky R, Essaid S, Fabrizzi C, Fico G, Firth HV, Freudenberg-Hua Y, Fullerton JM, Gabriel DL, Gilmour K, Giordano J, Goes FS, Moses RG, Green I, Griese M, Groza T, Gu W, Guthrie J, Gyori B, Hamosh A, Hanauer M, Hanušová K, He Y, Hegde H, Helbig I, Holasová K, Hoyt CT, Huang S, Hurwitz E, Jacobsen JOB, Jiang X, Joseph L, Keramatian K, King B, Knoflach K, Koolen DA, Kraus M, Kroll C, Kusters M, Ladewig MS, Lagorce D, Lai MC, Lapunzina P, Laraway B, Lewis-Smith D, Li X, Lucano C, Majd M, Marazita ML, Martinez-Glez V, McHenry TH, McInnis MG, McMurry JA, Mihulová M, Millett CE, Mitchell PB, Moslerová V, Narutomi K, Nematollahi S, Nevado J, Nierenberg AA, Čajbiková NN, Nurnberger JI, Ogishima S, Olson D, Ortiz A, Pachajoa H, Perez de Nanclares G, Peters A, Putman T, Rapp CK, Rath A, Reese J, Rekerle L, Roberts A, Roy S, Sanders SJ, Schuetz C, Schulte EC, Schulze TG, Schwarz M, Scott K, Seelow D, Seitz B, Shen Y, Similuk MN, Simon ES, Singh B, Smedley D, Smith CL, Smolinsky JT, Sperry S, Stafford E, Stefancsik R, Steinhaus R, Strawbridge R, Sundaramurthi JC, Talapova P, Tenorio Castano JA, Tesner P, Thomas RH, Thurm A, Turnovec M, van Gijn ME, Vasilevsky NA, Vlčková M, Walden A, Wang K, Wapner R, Ware JS, Wiafe AA, Wiafe SA, Wiggins LD, Williams AE, Wu C, Wyrwoll MJ, Xiong H, Yalin N, Yamamoto Y, Yatham LN, Yocum AK, Young AH, Yüksel Z, Zandi PP, Zankl A, Zarante I, Zvolský M, Toro S, Carmody LC, Harris NL, Munoz-Torres MC, Danis D, Mungall CJ, Köhler S, Haendel MA, Robinson PN. The Human Phenotype Ontology in 2024: phenotypes around the world. Nucleic Acids Res 2024; 52:D1333-D1346. [PMID: 37953324 PMCID: PMC10767975 DOI: 10.1093/nar/gkad1005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2023] [Revised: 10/12/2023] [Accepted: 10/19/2023] [Indexed: 11/14/2023] [Open Access]
Abstract
The Human Phenotype Ontology (HPO) is a widely used resource that comprehensively organizes and defines the phenotypic features of human disease, enabling computational inference and supporting genomic and phenotypic analyses through semantic similarity and machine learning algorithms. The HPO has widespread applications in clinical diagnostics and translational research, including genomic diagnostics, gene-disease discovery, and cohort analytics. In recent years, groups around the world have developed translations of the HPO from English to other languages, and the HPO browser has been internationalized, allowing users to view HPO term labels and in many cases synonyms and definitions in ten languages in addition to English. Since our last report, a total of 2239 new HPO terms and 49235 new HPO annotations were developed, many in collaboration with external groups in the fields of psychiatry, arthrogryposis, immunology and cardiology. The Medical Action Ontology (MAxO) is a new effort to model treatments and other measures taken for clinical management. Finally, the HPO consortium is contributing to efforts to integrate the HPO and the GA4GH Phenopacket Schema into electronic health records (EHRs) with the goal of more standardized and computable integration of rare disease data in EHRs.
Affiliation(s)
- Ben Coleman
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Joel Anderton
- Center for Craniofacial and Dental Genetics, Department of Oral and Craniofacial Sciences, School of Dental Medicine, University of Pittsburgh, Pittsburgh, PA, USA
- Anita M Bagley
- Shriners Children's Northern California, Sacramento, CA, USA
- Eduard Bakštein
- National Institute of Mental Health, Klecany, Czech Republic
- James P Balhoff
- Renaissance Computing Institute, University of North Carolina, Chapel Hill, NC 27517, USA
- Gareth Baynam
- Rare Care Centre, Perth Children's Hospital, Perth, Australia
- Michael Berk
- Deakin University, IMPACT - the Institute for Mental and Physical Health and Clinical Translation, School of Medicine, Barwon Health, Geelong, Australia
- Holli Bertram
- Department of Psychiatry, University of Michigan, Ann Arbor, MI, USA
- Somer Bishop
- Department of Psychiatry and Behavioral Sciences, UCSF Weil Institute for Neuroscience, San Francisco, CA, USA
- Hannah Blau
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- David F Bodenstein
- Department of Pharmacology and Toxicology, University of Toronto, Toronto, ON, Canada
- Kaan Boztug
- St. Anna Children's Cancer Research Institute (CCRI), Vienna, Austria
- Jolana Čady
- Institute of Health Information and Statistics of the Czech Republic, Prague, Czech Republic
- Tiffany J Callahan
- Department of Biomedical Informatics, Columbia University Irving Medical Center, NY, NY, USA
- Seth J Carbon
- Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- J Harry Caufield
- Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Lauren E Chan
- College of Public Health and Human Sciences, Oregon State University, Corvallis, OR 97331, USA
- Christopher G Chute
- Schools of Medicine, Public Health, and Nursing, Johns Hopkins University, Baltimore, MD 21287, USA
- Jaime Cruz-Rojo
- UDISGEN (Dysmorphology and Genetics Unit), 12 de Octubre Hospital, Madrid, Spain
- Noémi Dahan-Oliel
- Department of Clinical Research, Shriners Hospitals for Children, Montreal, Quebec, Canada
- Jon R Davids
- Shriners Children's Northern California, Sacramento, CA, USA
- Maud de Dieuleveult
- Département I&D, AP-HP, Banque Nationale de Données Maladies Rares, Paris, France
- Vinicius de Souza
- European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Bert B A de Vries
- Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, Netherlands
- J Raymond DePaulo
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Beata Derfalvi
- Department of Pediatrics, Dalhousie University, Halifax, NS, Canada
- Ferdinand Dhombres
- Fetal Medicine Department, Armand Trousseau Hospital, Sorbonne University, GRC26, INSERM, Limics, Paris, France
- Claudia Diaz-Byrd
- Department of Psychiatry, University of Michigan, Ann Arbor, MI, USA
- Alexander J M Dingemans
- Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, Netherlands
- Bruno Donadille
- St Antoine Hospital, Reference Center for Rare Growth Endocrine Disorders, Sorbonne University, AP-HP, INSERM, US14 - Orphanet, Plateforme Maladies Rares, Paris, France
- Reem Elfeky
- Department of Immunology, GOS Hospital for Children NHS Foundation Trust, University College London, London, UK
- Shahim Essaid
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Giovanna Fico
- Bipolar and Depressive Disorders Unit, Institute of Neuroscience, Hospital Clinic, University of Barcelona, IDIBAPS, CIBERSAM, Barcelona, Catalonia, Spain
- Helen V Firth
- Addenbrooke's Hospital, Cambridge University Hospitals, Cambridge, UK
- Yun Freudenberg-Hua
- Department of Psychiatry, Feinstein Institutes for Medical Research, Northwell Health, Manhasset, NY, USA
- Davera L Gabriel
- School of Medicine, Johns Hopkins University, Baltimore, MD 21287, USA
- Jessica Giordano
- Department of Obstetrics and Gynecology, Columbia University Irving Medical Center, New York, NY, USA
- Fernando S Goes
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Rachel Gore Moses
- National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD 20892, USA
- Ian Green
- SNOMED International, London W2 6BD, UK
- Matthias Griese
- Department of Pediatrics, Dr. von Hauner Children's Hospital, University Hospital, LMU Munich, German center for Lung research (DZL), Munich, Germany
- Tudor Groza
- Rare Care Centre, Perth Children's Hospital, Perth, Australia
- Julia Guthrie
- Department of Structural and Computational Biology, University of Vienna; Max Perutz Labs, Vienna, Austria
- Benjamin Gyori
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
- Ada Hamosh
- Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Marc Hanauer
- INSERM, US14 - Orphanet, Plateforme Maladies Rares, Paris, France
- Kateřina Hanušová
- Institute of Health Information and Statistics of the Czech Republic, Prague, Czech Republic
- Harshad Hegde
- Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Ingo Helbig
- Neurology, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
- Kateřina Holasová
- Institute of Health Information and Statistics of the Czech Republic, Prague, Czech Republic
- Charles Tapley Hoyt
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
- Eric Hurwitz
- University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Julius O B Jacobsen
- William Harvey Research Institute, Queen Mary University of London, London, UK
- Lisa Joseph
- Neurodevelopmental and Behavioral Phenotyping Service, National Institute of Mental Health, Bethesda, MD, USA
- Kamyar Keramatian
- Department of Psychiatry, University of British Columbia, Vancouver, BC, Canada
- Bryan King
- Department of Psychiatry and Behavioral Sciences, UCSF Weil Institute for Neuroscience, San Francisco, CA, USA
- Katrin Knoflach
- Department of Pediatrics, Dr. von Hauner Children's Hospital, University Hospital, LMU Munich, German center for Lung research (DZL), Munich, Germany
- David A Koolen
- Department of Human Genetics, Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Nijmegen, Netherlands
- Megan L Kraus
- University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Carlo Kroll
- William Harvey Research Institute, Queen Mary University of London, London, UK
- Maaike Kusters
- Immunology, NIHR Great Ormond Street Hospital BRC, London, UK
- Markus S Ladewig
- Department of Ophthalmology, University Clinic Marburg - Campus Fulda, Fulda, Germany
- David Lagorce
- INSERM, US14 - Orphanet, Plateforme Maladies Rares, Paris, France
- Meng-Chuan Lai
- Campbell Family Mental Health Research Institute, Centre for Addiction and Mental Health, Toronto, ON, Canada
- Pablo Lapunzina
- Institute of Medical and Molecular Genetics, Hospital Univ. La Paz, Madrid, Spain
- Bryan Laraway
- University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- David Lewis-Smith
- Translational and Clinical Research Institute, Henry Wellcome Building, Framlington Place, Newcastle University, Newcastle-Upon-Tyne NE14LP, UK
- Caterina Lucano
- INSERM, US14 - Orphanet, Plateforme Maladies Rares, Paris, France
- Marzieh Majd
- Department of Psychiatry, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Mary L Marazita
- Center for Craniofacial and Dental Genetics, Department of Oral and Craniofacial Sciences, School of Dental Medicine, University of Pittsburgh, Pittsburgh, PA, USA
- Victor Martinez-Glez
- Center for Genomic Medicine, Parc Taulí Hospital Universitari, Institut d’Investigació i Innovació Parc Taulí (I3PT-CERCA), Sabadell, Spain
- Toby H McHenry
- Center for Craniofacial and Dental Genetics, Department of Oral and Craniofacial Sciences, School of Dental Medicine, University of Pittsburgh, Pittsburgh, PA, USA
- Melvin G McInnis
- Department of Psychiatry, University of Michigan, Ann Arbor, MI, USA
- Julie A McMurry
- University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Michaela Mihulová
- Department of Biology and Medical Genetics, 2nd Medical Faculty of Charles University and University Hospital Motol, Prague, Czech Republic
- Caitlin E Millett
- Feinstein Institutes for Medical Research, Northwell Health, Manhasset, NY, USA
- Philip B Mitchell
- Discipline of Psychiatry & Mental Health, School of Clinical Medicine, Faculty of Medicine & Health, University of New South Wales, Sydney, NSW, Australia
- Veronika Moslerová
- Department of Biology and Medical Genetics, 2nd Medical Faculty of Charles University and University Hospital Motol, Prague, Czech Republic
- Kenji Narutomi
- Okinawa Prefectural Nanbu Medical Center & Children's Medical Center
- Shahrzad Nematollahi
- School of Physical and Occupational Therapy, McGill University, Montreal, Quebec, Canada
- Julian Nevado
- Institute of Medical and Molecular Genetics, Hospital Univ. La Paz, Madrid, Spain
- Andrew A Nierenberg
- Dauten Family Center for Bipolar Treatment Innovation, Massachusetts General Hospital, Boston, MA, USA
- Nikola Novák Čajbiková
- Department of Biology and Medical Genetics, 2nd Medical Faculty of Charles University and University Hospital Motol, Prague, Czech Republic
- John I Nurnberger
- Stark Neurosciences Research Institute, Departments of Psychiatry and Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN, USA
- Daniel Olson
- Data Collaboration Center, Data Science, Critical Path Institute, Tucson, AZ, USA
- Abigail Ortiz
- Department of Psychiatry, University of Toronto, Toronto, ON, Canada
- Harry Pachajoa
- Centro de Investigaciones en Anomalías Congénitas y Enfermedades Raras (CIACER), Universidad Icesi, Cali, Colombia
- Guiomar Perez de Nanclares
- Molecular (epi) genetics lab, Bioaraba Health Research Institute, Araba University Hospital, Vitoria-Gasteiz, Spain
- Amy Peters
- Department of Psychiatry, Massachusetts General Hospital, Boston, MA, USA
- Tim Putman
- University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Christina K Rapp
- Department of Pediatrics, Dr. von Hauner Children's Hospital, University Hospital, LMU Munich, German center for Lung research (DZL), Munich, Germany
- Ana Rath
- INSERM, US14 - Orphanet, Plateforme Maladies Rares, Paris, France
- Justin Reese
- Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Lauren Rekerle
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Angharad M Roberts
- National Heart & Lung Institute & MRC London Institute of Medical Sciences, Imperial College London, London W12 0HS, UK
- Suzy Roy
- SNOMED International, London W2 6BD, UK
- Stephan J Sanders
- Department of Paediatrics, Institute of Developmental and Regenerative Medicine, University of Oxford, Oxford, UK
- Catharina Schuetz
- Universitätsklinikum Carl Gustav Carus, Medizinische Fakultät, TU, Dresden, Germany
- Eva C Schulte
- Institute of Psychiatric Phenomics and Genomics (IPPG), LMU University Hospital, LMU Munich, Munich, Germany
- Thomas G Schulze
- Department of Psychiatry and Behavioral Sciences, SUNY Upstate Medical University, Syracuse, NY, USA
- Martin Schwarz
- Department of Biology and Medical Genetics, 2nd Medical Faculty of Charles University and University Hospital Motol, Prague, Czech Republic
- Katie Scott
- Department of Psychiatry, Dalhousie University, Halifax, NS, Canada
- Dominik Seelow
- Exploratory Diagnostic Sciences, Berliner Institut für Gesundheitsforschung - Charité, Berlin, Germany
- Berthold Seitz
- Department of Ophthalmology, Saarland University Medical Center UKS, Homburg/Saar, Germany
- Morgan N Similuk
- National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD 20892, USA
- Eric S Simon
- Eisenberg Family Depression Center, University of Michigan, Ann Arbor, MI, USA
- Balwinder Singh
- Department of Psychiatry and Psychology, Mayo Clinic, Rochester, MN, USA
- Damian Smedley
- William Harvey Research Institute, Queen Mary University of London, London, UK
- Jake T Smolinsky
- Human Genetics Institute of New Jersey, Rutgers University, Piscataway, NJ, USA
- Sarah Sperry
- Department of Psychiatry, University of Michigan, Ann Arbor, MI, USA
- Ray Stefancsik
- European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- Robin Steinhaus
- Exploratory Diagnostic Sciences, Berliner Institut für Gesundheitsforschung - Charité, Berlin, Germany
- Rebecca Strawbridge
- Department of Psychological Medicine, Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, UK
- Polina Talapova
- Institute for Research and Health Policy Studies, Tufts Medicine, Boston, MA 2111, USA
- Pavel Tesner
- Department of Biology and Medical Genetics, 2nd Medical Faculty of Charles University and University Hospital Motol, Prague, Czech Republic
- Rhys H Thomas
- Translational and Clinical Research Institute, Henry Wellcome Building, Framlington Place, Newcastle University, Newcastle-Upon-Tyne NE14LP, UK
- Audrey Thurm
- Neurodevelopmental and Behavioral Phenotyping Service, National Institute of Mental Health, Bethesda, MD, USA
- Marek Turnovec
- Department of Biology and Medical Genetics, 2nd Medical Faculty of Charles University and University Hospital Motol, Prague, Czech Republic
- Marielle E van Gijn
- Department of Genetics, University Medical Center Groningen, Groningen, Netherlands
- Markéta Vlčková
- Department of Biology and Medical Genetics, 2nd Medical Faculty of Charles University and University Hospital Motol, Prague, Czech Republic
- Anita Walden
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Kai Wang
- Chinese HPO Consortium, Beijing, China
- Ron Wapner
- Department of Obstetrics and Gynecology, Columbia University Irving Medical Center, New York, NY, USA
- James S Ware
- National Heart & Lung Institute & MRC London Institute of Medical Sciences, Imperial College London, London W12 0HS, UK
- Lisa D Wiggins
- National Center on Birth Defects and Developmental Disabilities, Centers for Disease Control and Prevention, Atlanta, GA, USA
- Andrew E Williams
- Institute for Research and Health Policy Studies, Tufts Medicine, Boston, MA 2111, USA
- Chen Wu
- Chinese HPO Consortium, Beijing, China
- Margot J Wyrwoll
- Centre for Regenerative Medicine, Institute for Regeneration and Repair, Institute for Stem Cell Research, University of Edinburgh, Edinburgh, UK
- Hui Xiong
- Chinese HPO Consortium, Beijing, China
- Nefize Yalin
- Department of Psychological Medicine, Institute of Psychiatry, Psychology & Neuroscience, King's College London, London, UK
- Yasunori Yamamoto
- Database Center for Life Science, Joint Support-Center for Data Science Research, Research Organization of Information and Systems, Japan
- Lakshmi N Yatham
- Department of Psychiatry, University of British Columbia, Vancouver, BC, Canada
- Anastasia K Yocum
- Department of Psychiatry, University of Michigan, Ann Arbor, MI, USA
- Allan H Young
- Psychological Medicine, Institute of Psychiatry, Psychology and Neuroscience, King's College London & South London and Maudsley NHS Foundation Trust, Bethlem Royal Hospital, Monks Orchard Road, Beckenham, Kent, London SE5 8AF, UK
- Zafer Yüksel
- Department of Human Genetics, Bioscientia Healthcare GmbH, Ingelheim, Germany
- Peter P Zandi
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Andreas Zankl
- Faculty of Medicine and Health, The University of Sydney, Camperdown, Australia
- Ignacio Zarante
- Institute of Human Genetics, Pontificia Universidad Javeriana, Bogotá, Colombia
- Miroslav Zvolský
- Institute of Health Information and Statistics of the Czech Republic, Prague, Czech Republic
- Sabrina Toro
- University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Leigh C Carmody
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Nomi L Harris
- Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Monica C Munoz-Torres
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Daniel Danis
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Christopher J Mungall
- Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Melissa A Haendel
- University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
- Peter N Robinson
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
6
Gao J, Mo S, Wang J, Zhang M, Shi Y, Zhu C, Shang Y, Tang X, Zhang S, Wu X, Xu X, Wang Y, Li Z, Zheng G, Chen Z, Wang Q, Tang K, Cao Z. MACC: a visual interactive knowledgebase of metabolite-associated cell communications. Nucleic Acids Res 2024; 52:D633-D639. [PMID: 37897362 PMCID: PMC10767829 DOI: 10.1093/nar/gkad914] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Revised: 09/21/2023] [Accepted: 10/10/2023] [Indexed: 10/30/2023] [Open Access]
Abstract
Metabolite-associated cell communications (MACCs) play critical roles in maintaining normal human biological function by coordinating cells, organs and physiological systems. Although substantial information on MACCs has been reported, no dedicated database has been available so far. To address this gap, we developed the first knowledgebase (MACC) to comprehensively describe human metabolite-associated cell communications through curation of the experimental literature. MACC currently contains: (a) 4206 carefully curated metabolite-associated cell communication pairs involving 244 human endogenous metabolites and reported biological effects in vivo and in vitro; (b) 226 comprehensive cell subtypes and 296 disease states, such as cancers, autoimmune diseases, and pathogenic infections; (c) 4508 metabolite-related enzymes and transporters, involving 542 pathways; (d) an interactive tool with a user-friendly interface to visualize networks of multiple metabolite-cell interactions; and (e) an overall expression landscape of metabolite-associated gene sets derived from over 1500 single-cell expression profiles to infer metabolite variations across different cells in a sample. MACC also provides cross-links to well-known databases such as HMDB, DrugBank, TTD and PubMed. As a complement to ligand-receptor databases, MACC may give new perspectives on alternative communication between cells via metabolite secretion and adsorption, together with the resulting biological functions. MACC is publicly accessible at: http://macc.badd-cao.net/.
Affiliation(s)
- Jian Gao
- School of Life Sciences, Fudan University, Shanghai, China
- International Human Phenome Institutes (Shanghai), Shanghai, China
- Department of Thoracic Surgery and State Key Laboratory of Genetic Engineering, Fudan University Shanghai Cancer Center, Shanghai, China
| | - Saifeng Mo
- Dept. of Gastroenterology, Shanghai Tenth People's Hospital, School of Life Sciences and Technology, Tongji University, Shanghai, China
| | - Jun Wang
- School of Life Sciences, Fudan University, Shanghai, China
| | - Mou Zhang
- School of Life Sciences, Fudan University, Shanghai, China
| | - Yao Shi
- School of Life Sciences, Fudan University, Shanghai, China
| | - Chuhan Zhu
- Dept. of Gastroenterology, Shanghai Tenth People's Hospital, School of Life Sciences and Technology, Tongji University, Shanghai, China
| | - Yuxuan Shang
- Biological Sciences, University of California Santa Barbara, CA, USA
| | - Xinyue Tang
- Dept. of Gastroenterology, Shanghai Tenth People's Hospital, School of Life Sciences and Technology, Tongji University, Shanghai, China
| | - Shiyue Zhang
- School of Life Sciences, Fudan University, Shanghai, China
| | - Xinwen Wu
- Dept. of Gastroenterology, Shanghai Tenth People's Hospital, School of Life Sciences and Technology, Tongji University, Shanghai, China
| | - Xinyan Xu
- Dept. of Gastroenterology, Shanghai Tenth People's Hospital, School of Life Sciences and Technology, Tongji University, Shanghai, China
| | - Yiheng Wang
- School of Life Sciences, Fudan University, Shanghai, China
| | - Zihao Li
- Dept. of Gastroenterology, Shanghai Tenth People's Hospital, School of Life Sciences and Technology, Tongji University, Shanghai, China
| | - Genhui Zheng
- Dept. of Gastroenterology, Shanghai Tenth People's Hospital, School of Life Sciences and Technology, Tongji University, Shanghai, China
| | - Zikun Chen
- Dept. of Gastroenterology, Shanghai Tenth People's Hospital, School of Life Sciences and Technology, Tongji University, Shanghai, China
| | - Qiming Wang
- School of Life Sciences, Fudan University, Shanghai, China
| | - Kailin Tang
- Dept. of Gastroenterology, Shanghai Tenth People's Hospital, School of Life Sciences and Technology, Tongji University, Shanghai, China
| | - Zhiwei Cao
- School of Life Sciences, Fudan University, Shanghai, China
- International Human Phenome Institutes (Shanghai), Shanghai, China
7
Carmody LC, Gargano MA, Toro S, Vasilevsky NA, Adam MP, Blau H, Chan LE, Gomez-Andres D, Horvath R, Kraus ML, Ladewig MS, Lewis-Smith D, Lochmüller H, Matentzoglu NA, Munoz-Torres MC, Schuetz C, Seitz B, Similuk MN, Sparks TN, Strauss T, Swietlik EM, Thompson R, Zhang XA, Mungall CJ, Haendel MA, Robinson PN. The Medical Action Ontology: A tool for annotating and analyzing treatments and clinical management of human disease. MED 2023; 4:913-927.e3. [PMID: 37963467 PMCID: PMC10842845 DOI: 10.1016/j.medj.2023.10.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2023] [Revised: 08/31/2023] [Accepted: 10/14/2023] [Indexed: 11/16/2023]
Abstract
BACKGROUND Navigating the clinical literature to determine the optimal clinical management for rare diseases presents significant challenges. We introduce the Medical Action Ontology (MAxO), an ontology specifically designed to organize medical procedures, therapies, and interventions. METHODS MAxO incorporates logical structures that link MAxO terms to numerous other ontologies within the OBO Foundry. Term development involves a blend of manual and semi-automated processes. Additionally, we have generated annotations detailing diagnostic modalities for specific phenotypic abnormalities defined by the Human Phenotype Ontology (HPO). We introduce a web application, POET, that facilitates MAxO annotations for specific medical actions for diseases using the Mondo Disease Ontology. FINDINGS MAxO encompasses 1,757 terms spanning a wide range of biomedical domains, from human anatomy and investigations to the chemical and protein entities involved in biological processes. These terms annotate phenotypic features associated with specific diseases (using HPO and Mondo). Presently, there are over 16,000 MAxO diagnostic annotations that target HPO terms. Through POET, we have created 413 MAxO annotations specifying treatments for 189 rare diseases. CONCLUSIONS MAxO offers a computational representation of treatments and other actions taken for the clinical management of patients. Its development is closely coupled to Mondo and HPO, broadening the scope of our computational modeling of diseases and phenotypic features. We invite the community to contribute disease annotations using POET (https://poet.jax.org/). MAxO is available under the open-source CC-BY 4.0 license (https://github.com/monarch-initiative/MAxO). FUNDING NHGRI 1U24HG011449-01A1 and NHGRI 5RM1HG010860-04.
Affiliation(s)
- Leigh C Carmody
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
| | | | - Sabrina Toro
- University of Colorado Anschutz Medical Campus, Aurora, CO, USA
| | | | - Margaret P Adam
- University of Washington School of Medicine, Seattle, WA, USA
| | - Hannah Blau
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
| | | | - David Gomez-Andres
- Pediatric Neurology, Vall d'Hebron Institut de Recerca (VHIR), Hospital Universitari Vall d'Hebron, Vall d'Hebron Barcelona Hospital Campus, Passeig Vall d'Hebron 119-129, 08035 Barcelona, Spain
| | - Rita Horvath
- Department of Clinical Neurosciences, University of Cambridge, Robinson Way, Cambridge CB2 0PY, UK
| | - Megan L Kraus
- University of Colorado Anschutz Medical Campus, Aurora, CO, USA
| | - Markus S Ladewig
- Department of Ophthalmology, Klinikum Saarbrücken, Saarbrücken, Germany
| | - David Lewis-Smith
- Translational and Clinical Research Institute, Newcastle University, Newcastle upon Tyne NE2 4HH, UK
| | - Hanns Lochmüller
- Children's Hospital of Eastern Ontario Research Institute, Ottawa, Canada; Division of Neurology, Department of Medicine, The Ottawa Hospital, Ottawa, Canada; Brain and Mind Research Institute, University of Ottawa, Ottawa, Canada; Department of Neuropediatrics and Muscle Disorders, Medical Center - University of Freiburg, Faculty of Medicine, Freiburg, Germany; Centro Nacional de Análisis Genómico, Barcelona, Spain
| | | | | | - Catharina Schuetz
- Department of Pediatrics, Medizinische Fakultät Carl Gustav Carus, Technische Universität Dresden, 01307 Dresden, Germany
| | - Berthold Seitz
- Department of Ophthalmology, Saarland University Medical Center UKS, Homburg, Saar, Germany
| | - Morgan N Similuk
- National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, USA
| | - Teresa N Sparks
- Department of Obstetrics, Gynecology, & Reproductive Sciences, University of California, San Francisco, San Francisco, CA 94143, USA
| | - Timmy Strauss
- Department of Pediatrics, Medizinische Fakultät Carl Gustav Carus, Technische Universität Dresden, 01307 Dresden, Germany
| | - Emilia M Swietlik
- Department of Medicine, University of Cambridge, Heart and Lung Research Institute, Cambridge CB2 0BB, UK
| | - Rachel Thompson
- Children's Hospital of Eastern Ontario Research Institute, Ottawa, Canada
| | | | | | | | - Peter N Robinson
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA.
8
Deepa Maheshvare M, Raha S, König M, Pal D. A pathway model of glucose-stimulated insulin secretion in the pancreatic β-cell. Front Endocrinol (Lausanne) 2023; 14:1185656. [PMID: 37600713 PMCID: PMC10433753 DOI: 10.3389/fendo.2023.1185656] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 03/13/2023] [Accepted: 06/08/2023] [Indexed: 08/22/2023] Open
Abstract
The pancreas plays a critical role in maintaining glucose homeostasis through the secretion of hormones from the islets of Langerhans. Glucose-stimulated insulin secretion (GSIS) by the pancreatic β-cell is the main mechanism for reducing elevated plasma glucose. Here we present a systematic modeling workflow for the development of kinetic pathway models using the Systems Biology Markup Language (SBML). Steps include retrieval of information from databases, curation of experimental and clinical data for model calibration and validation, integration of heterogeneous data including absolute and relative measurements, unit normalization, data normalization, and model annotation. An important factor was the reproducibility and exchangeability of the model, which allowed the use of various existing tools. The workflow was applied to construct a novel data-driven kinetic model of GSIS in the pancreatic β-cell based on experimental and clinical data from 39 studies spanning 50 years of pancreatic, islet, and β-cell research in humans, rats, mice, and cell lines. The model consists of detailed glycolysis and phenomenological equations for insulin secretion coupled to cellular energy state, ATP dynamics and (ATP/ADP ratio). Key findings of our work are that in GSIS there is a glucose-dependent increase in almost all intermediates of glycolysis. This increase in glycolytic metabolites is accompanied by an increase in energy metabolites, especially ATP and NADH. One of the few decreasing metabolites is ADP, which, in combination with the increase in ATP, results in a large increase in ATP/ADP ratios in the β-cell with increasing glucose. Insulin secretion is dependent on ATP/ADP, resulting in glucose-stimulated insulin secretion. The observed glucose-dependent increase in glycolytic intermediates and the resulting change in ATP/ADP ratios and insulin secretion is a robust phenomenon observed across data sets, experimental systems and species. 
Model predictions of the glucose-dependent response of glycolytic intermediates and biphasic insulin secretion are in good agreement with experimental measurements. Our model predicts that factors affecting ATP consumption, ATP formation, hexokinase, phosphofructokinase, and ATP/ADP-dependent insulin secretion have a major effect on GSIS. In conclusion, we have developed and applied a systematic modeling workflow for pathway models that allowed us to gain insight into key mechanisms in GSIS in the pancreatic β-cell.
Affiliation(s)
- M. Deepa Maheshvare
- Department of Computational and Data Sciences, Indian Institute of Science, Bangalore, India
| | - Soumyendu Raha
- Department of Computational and Data Sciences, Indian Institute of Science, Bangalore, India
| | - Matthias König
- Institute for Biology, Institute for Theoretical Biology, Humboldt-University Berlin, Berlin, Germany
| | - Debnath Pal
- Department of Computational and Data Sciences, Indian Institute of Science, Bangalore, India
9
Gawron P, Hoksza D, Piñero J, Peña-Chilet M, Esteban-Medina M, Fernandez-Rueda JL, Colonna V, Smula E, Heirendt L, Ancien F, Groues V, Satagopam VP, Schneider R, Dopazo J, Furlong LI, Ostaszewski M. Visualization of automatically combined disease maps and pathway diagrams for rare diseases. FRONTIERS IN BIOINFORMATICS 2023; 3:1101505. [PMID: 37502697 PMCID: PMC10369067 DOI: 10.3389/fbinf.2023.1101505] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2022] [Accepted: 05/05/2023] [Indexed: 07/29/2023] Open
Abstract
Introduction: Investigation of the molecular mechanisms of human disorders, especially rare diseases, requires exploration of various knowledge repositories for building precise hypotheses and interpreting complex data. Increasingly, resources offer diagrammatic representations of such mechanisms, including disease-dedicated schematics in pathway databases and disease maps. However, collecting knowledge across them is challenging, especially for research projects with limited manpower. Methods: In this article we present an automated workflow for constructing maps of molecular mechanisms for rare diseases. The workflow requires a standardized definition of a disease using Orphanet or HPO identifiers to collect relevant genes and variants, and to assemble a functional, visual repository of related mechanisms, including data overlays. The diagrams composing the final map are unified to a common systems biology format from CellDesigner SBML, GPML and SBML+layout+render. The constructed resource contains disease-relevant genes and variants as data overlays for immediate visual exploration, including an embedded genetic variant browser and protein structure viewer. Results: We demonstrate the functionality of our workflow on two examples of rare diseases: Kawasaki disease and retinitis pigmentosa. Two maps are constructed based on their corresponding identifiers. Moreover, for the retinitis pigmentosa use case, we include a list of differentially expressed genes to demonstrate how to tailor the workflow using omics datasets. Discussion: In summary, our work allows ad hoc construction of molecular diagrams combined from different sources, preserving their layout and graphical style while integrating them into a single resource. This reduces the time-consuming task of prototyping a molecular disease map, enabling visual exploration, hypothesis building, data visualization and further refinement. 
The code of the workflow is open and accessible at https://gitlab.lcsb.uni.lu/minerva/automap/.
Affiliation(s)
- Piotr Gawron
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Luxembourg, Luxembourg
| | - David Hoksza
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Luxembourg, Luxembourg
- Faculty of Mathematics and Physics, Charles University, Prague, Czechia
| | - Janet Piñero
- Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), Barcelona, Spain
- Department of Experimental and Health Sciences, Pompeu Fabra University (UPF), Barcelona, Spain
- MedBioinformatics Solutions SL, Barcelona, Spain
| | - Maria Peña-Chilet
- Computational Medicine Platform, Fundacion Progreso y Salud, Sevilla, Spain
- Spanish Network of Research in Rare Diseases (CIBERER), Sevilla, Spain
| | | | | | - Vincenza Colonna
- Institute of Genetics and Biophysics, National Research Council of Italy, Naples, Rome
- Department of Genetics, Genomics and Informatics, College of Medicine, University of Tennessee Health Science Center, Memphis, TN, United States
| | - Ewa Smula
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Luxembourg, Luxembourg
| | - Laurent Heirendt
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Luxembourg, Luxembourg
| | - François Ancien
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Luxembourg, Luxembourg
| | - Valentin Groues
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Luxembourg, Luxembourg
| | - Venkata P. Satagopam
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Luxembourg, Luxembourg
| | - Reinhard Schneider
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Luxembourg, Luxembourg
| | - Joaquin Dopazo
- Computational Medicine Platform, Fundacion Progreso y Salud, Sevilla, Spain
- Spanish Network of Research in Rare Diseases (CIBERER), Sevilla, Spain
| | - Laura I. Furlong
- Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), Barcelona, Spain
- Department of Experimental and Health Sciences, Pompeu Fabra University (UPF), Barcelona, Spain
- MedBioinformatics Solutions SL, Barcelona, Spain
| | - Marek Ostaszewski
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, Luxembourg, Luxembourg
10
Danis D, Jacobsen JOB, Wagner AH, Groza T, Beckwith MA, Rekerle L, Carmody LC, Reese J, Hegde H, Ladewig MS, Seitz B, Munoz-Torres M, Harris NL, Rambla J, Baudis M, Mungall CJ, Haendel MA, Robinson PN. Phenopacket-tools: Building and validating GA4GH Phenopackets. PLoS One 2023; 18:e0285433. [PMID: 37196000 PMCID: PMC10191354 DOI: 10.1371/journal.pone.0285433] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2023] [Accepted: 04/21/2023] [Indexed: 05/19/2023] Open
Abstract
The Global Alliance for Genomics and Health (GA4GH) is a standards-setting organization that is developing a suite of coordinated standards for genomics. The GA4GH Phenopacket Schema is a standard for sharing disease and phenotype information that characterizes an individual person or biosample. The Phenopacket Schema is flexible and can represent clinical data for any kind of human disease including rare disease, complex disease, and cancer. It also allows consortia or databases to apply additional constraints to ensure uniform data collection for specific goals. We present phenopacket-tools, an open-source Java library and command-line application for construction, conversion, and validation of phenopackets. Phenopacket-tools simplifies construction of phenopackets by providing concise builders, programmatic shortcuts, and predefined building blocks (ontology classes) for concepts such as anatomical organs, age of onset, biospecimen type, and clinical modifiers. Phenopacket-tools can be used to validate the syntax and semantics of phenopackets as well as to assess adherence to additional user-defined requirements. The documentation includes examples showing how to use the Java library and the command-line tool to create and validate phenopackets. We demonstrate how to create, convert, and validate phenopackets using the library or the command-line application. Source code, API documentation, comprehensive user guide and a tutorial can be found at https://github.com/phenopackets/phenopacket-tools. The library can be installed from the public Maven Central artifact repository and the application is available as a standalone archive. The phenopacket-tools library helps developers implement and standardize the collection and exchange of phenotypic and other clinical data for use in phenotype-driven genomic diagnostics, translational research, and precision medicine applications.
Affiliation(s)
- Daniel Danis
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, United States of America
| | - Julius O. B. Jacobsen
- William Harvey Research Institute, Queen Mary University of London, London, United Kingdom
| | - Alex H. Wagner
- Departments of Pediatrics and Biomedical Informatics, The Ohio State University College of Medicine, Columbus, OH, United States of America
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children’s Hospital, Columbus, OH, United States of America
| | | | - Martha A. Beckwith
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, United States of America
| | - Lauren Rekerle
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, United States of America
| | - Leigh C. Carmody
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, United States of America
| | - Justin Reese
- Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, United States of America
| | - Harshad Hegde
- Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, United States of America
| | - Markus S. Ladewig
- Department of Ophthalmology, Klinikum Saarbrücken, Saarbrücken, Germany
| | - Berthold Seitz
- Department of Ophthalmology, Saarland University Medical Center, Homburg/Saar, Germany
| | - Monica Munoz-Torres
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, United States of America
| | - Nomi L. Harris
- Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, United States of America
| | - Jordi Rambla
- European Genome-Phenome Archive (EGA) in the Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
| | - Michael Baudis
- University of Zurich and Swiss Institute of Bioinformatics, Zurich, Switzerland
| | - Christopher J. Mungall
- Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, United States of America
| | - Melissa A. Haendel
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, United States of America
| | - Peter N. Robinson
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, United States of America
- Institute for Systems Genomics, University of Connecticut, Farmington, CT, United States of America
11
Kumar N, Mishra BK, Liu J, Mohan B, Thingujam D, Pajerowska-Mukhtar KM, Mukhtar MS. Network Biology Analyses and Dynamic Modeling of Gene Regulatory Networks under Drought Stress Reveal Major Transcriptional Regulators in Arabidopsis. Int J Mol Sci 2023; 24:ijms24087349. [PMID: 37108512 PMCID: PMC10139068 DOI: 10.3390/ijms24087349] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2022] [Revised: 04/02/2023] [Accepted: 04/11/2023] [Indexed: 04/29/2023] Open
Abstract
Drought is one of the most serious abiotic stressors in the environment, restricting agricultural production by reducing plant growth, development, and productivity. To investigate such a complex and multifaceted stressor and its effects on plants, a systems biology-based approach is necessitated, entailing the generation of co-expression networks, identification of high-priority transcription factors (TFs), dynamic mathematical modeling, and computational simulations. Here, we studied a high-resolution drought transcriptome of Arabidopsis. We identified distinct temporal transcriptional signatures and demonstrated the involvement of specific biological pathways. Generation of a large-scale co-expression network followed by network centrality analyses identified 117 TFs that possess critical properties of hubs, bottlenecks, and high clustering coefficient nodes. Dynamic transcriptional regulatory modeling of integrated TF targets and transcriptome datasets uncovered major transcriptional events during the course of drought stress. Mathematical transcriptional simulations allowed us to ascertain the activation status of major TFs, as well as the transcriptional intensity and amplitude of their target genes. Finally, we validated our predictions by providing experimental evidence of gene expression under drought stress for a set of four TFs and their major target genes using qRT-PCR. Taken together, we provided a systems-level perspective on the dynamic transcriptional regulation during drought stress in Arabidopsis and uncovered numerous novel TFs that could potentially be used in future genetic crop engineering programs.
Affiliation(s)
- Nilesh Kumar
- Department of Biology, 464 Campbell Hall, University of Alabama at Birmingham, 1300 University Boulevard, Birmingham, AL 35294, USA
| | - Bharat K Mishra
- Department of Biology, 464 Campbell Hall, University of Alabama at Birmingham, 1300 University Boulevard, Birmingham, AL 35294, USA
| | - Jinbao Liu
- Department of Biology, 464 Campbell Hall, University of Alabama at Birmingham, 1300 University Boulevard, Birmingham, AL 35294, USA
| | - Binoop Mohan
- Department of Biology, 464 Campbell Hall, University of Alabama at Birmingham, 1300 University Boulevard, Birmingham, AL 35294, USA
| | - Doni Thingujam
- Department of Biology, 464 Campbell Hall, University of Alabama at Birmingham, 1300 University Boulevard, Birmingham, AL 35294, USA
| | - Karolina M Pajerowska-Mukhtar
- Department of Biology, 464 Campbell Hall, University of Alabama at Birmingham, 1300 University Boulevard, Birmingham, AL 35294, USA
| | - M Shahid Mukhtar
- Department of Biology, 464 Campbell Hall, University of Alabama at Birmingham, 1300 University Boulevard, Birmingham, AL 35294, USA
- Nutrition Obesity Research Center, University of Alabama at Birmingham, 1675 University Boulevard, Birmingham, AL 35294, USA
- Department of Surgery, University of Alabama at Birmingham, 1808 7th Ave S, Birmingham, AL 35294, USA
12
Deepa Maheshvare M, Raha S, König M, Pal D. A Consensus Model of Glucose-Stimulated Insulin Secretion in the Pancreatic β-Cell. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.03.10.532028. [PMID: 36945414 PMCID: PMC10028967 DOI: 10.1101/2023.03.10.532028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/13/2023]
Abstract
The pancreas plays a critical role in maintaining glucose homeostasis through the secretion of hormones from the islets of Langerhans. Glucose-stimulated insulin secretion (GSIS) by the pancreatic β-cell is the main mechanism for reducing elevated plasma glucose. Here we present a systematic modeling workflow for the development of kinetic pathway models using the Systems Biology Markup Language (SBML). Steps include retrieval of information from databases, curation of experimental and clinical data for model calibration and validation, integration of heterogeneous data including absolute and relative measurements, unit normalization, data normalization, and model annotation. An important factor was the reproducibility and exchangeability of the model, which allowed the use of various existing tools. The workflow was applied to construct the first consensus model of GSIS in the pancreatic β-cell based on experimental and clinical data from 39 studies spanning 50 years of pancreatic, islet, and β-cell research in humans, rats, mice, and cell lines. The model consists of detailed glycolysis and equations for insulin secretion coupled to cellular energy state (ATP/ADP ratio). Key findings of our work are that in GSIS there is a glucose-dependent increase in almost all intermediates of glycolysis. This increase in glycolytic metabolites is accompanied by an increase in energy metabolites, especially ATP and NADH. One of the few decreasing metabolites is ADP, which, in combination with the increase in ATP, results in a large increase in ATP/ADP ratios in the β-cell with increasing glucose. Insulin secretion is dependent on ATP/ADP, resulting in glucose-stimulated insulin secretion. The observed glucose-dependent increase in glycolytic intermediates and the resulting change in ATP/ADP ratios and insulin secretion is a robust phenomenon observed across data sets, experimental systems and species. 
Model predictions of the glucose-dependent response of glycolytic intermediates and insulin secretion are in good agreement with experimental measurements. Our model predicts that factors affecting ATP consumption, ATP formation, hexokinase, phosphofructokinase, and ATP/ADP-dependent insulin secretion have a major effect on GSIS. In conclusion, we have developed and applied a systematic modeling workflow for pathway models that allowed us to gain insight into key mechanisms in GSIS in the pancreatic β-cell.
13
Perova Z, Martinez M, Mandloi T, Gomez F, Halmagyi C, Follette A, Mason J, Newhauser S, Begley D, Krupke D, Bult C, Parkinson H, Groza T. PDCM Finder: an open global research platform for patient-derived cancer models. Nucleic Acids Res 2022; 51:D1360-D1366. [PMID: 36399494 PMCID: PMC9825610 DOI: 10.1093/nar/gkac1021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2022] [Revised: 10/13/2022] [Accepted: 10/25/2022] [Indexed: 11/19/2022] Open
Abstract
PDCM Finder (www.cancermodels.org) is a cancer research platform that aggregates clinical, genomic and functional data from patient-derived xenografts, organoids and cell lines. It was launched in April 2022 as a successor of the PDX Finder portal, which focused solely on patient-derived xenograft models. Currently the portal has over 6200 models across 13 cancer types, including rare paediatric models (17%) and models from minority ethnic backgrounds (33%), making it the largest free to consumer and open access resource of this kind. The PDCM Finder standardises, harmonises and integrates the complex and diverse data associated with PDCMs for the cancer community and displays over 90 million data points across a variety of data types (clinical metadata, molecular and treatment-based). PDCM data is FAIR and underpins the generation and testing of new hypotheses in cancer mechanisms and personalised medicine development.
Affiliation(s)
- Zinaida Perova
- To whom correspondence should be addressed. Tel: +44 1223 494 121; Fax: +44 1223 494 468;
| | - Mauricio Martinez
- European Molecular Biology Laboratory - European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Tushar Mandloi
- European Molecular Biology Laboratory - European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Federico Lopez Gomez
- European Molecular Biology Laboratory - European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Csaba Halmagyi
- European Molecular Biology Laboratory - European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Alex Follette
- European Molecular Biology Laboratory - European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Jeremy Mason
- European Molecular Biology Laboratory - European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Steven Newhauser
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
| | - Dale A Begley
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
| | - Debra M Krupke
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
| | - Carol Bult
- The Jackson Laboratory, 600 Main Street, Bar Harbor, ME 04609, USA
| | - Helen Parkinson
- European Molecular Biology Laboratory - European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Tudor Groza
- European Molecular Biology Laboratory - European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| |
|
14
|
Palmblad M, Asein E, Bergman NP, Ivanova A, Ramasauskas L, Reyes HM, Ruchti S, Soto-Jácome L, Bergquist J. Semantic Annotation of Experimental Methods in Analytical Chemistry. Anal Chem 2022; 94:15464-15471. [DOI: 10.1021/acs.analchem.2c03565] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Affiliation(s)
- Magnus Palmblad
- Center for Proteomics and Metabolomics, Leiden University Medical Center, 2300 RC Leiden, The Netherlands
| | - Enahoro Asein
- Institute of Chemistry, University of Tartu, Ravila 14a, 50411 Tartu, Estonia
| | - Nina P. Bergman
- Analytical Pharmaceutical Chemistry, Department of Medicinal Chemistry - BMC, Uppsala University, SE-75123 Uppsala, Sweden
| | - Arina Ivanova
- Analytical Chemistry and Neurochemistry, Department of Chemistry - BMC, Uppsala University, SE-75124 Uppsala, Sweden
| | - Lukas Ramasauskas
- Analytical Chemistry and Neurochemistry, Department of Chemistry - BMC, Uppsala University, SE-75124 Uppsala, Sweden
| | | | - Stefan Ruchti
- Institute of Chemistry, University of Tartu, Ravila 14a, 50411 Tartu, Estonia
- Analytical Chemistry and Neurochemistry, Department of Chemistry - BMC, Uppsala University, SE-75124 Uppsala, Sweden
| | | | - Jonas Bergquist
- Analytical Chemistry and Neurochemistry, Department of Chemistry - BMC, Uppsala University, SE-75124 Uppsala, Sweden
| |
|
15
|
Bernabé-Díaz JA, Franco M, Vivo JM, Quesada-Martínez M, Fernández-Breis JT. An automated process for supporting decisions in clustering-based data analysis. Comput Methods Programs Biomed 2022; 219:106765. [PMID: 35367914 DOI: 10.1016/j.cmpb.2022.106765] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2021] [Revised: 03/14/2022] [Accepted: 03/18/2022] [Indexed: 06/14/2023]
Abstract
BACKGROUND AND OBJECTIVE Metrics are commonly used by biomedical researchers and practitioners to measure and evaluate properties of individuals, instruments, models, methods, or datasets. Because there is no standardized validation procedure for metrics, it is often assumed that a metric appropriate for analyzing one dataset in a given domain will also be appropriate for other datasets in the same domain. However, such generalizability cannot be taken for granted, since the behavior of a metric can vary across scenarios. The objective of this paper is to study that behavior, so that the reliability of a metric can be assessed before any conclusion is drawn about biomedical datasets. METHODS We present a method to support the evaluation of the behavior of quantitative metrics on datasets. Our approach assesses a metric using clustering-based data analysis and supports the decision on the optimal classification. It applies two important criteria from unsupervised classification validation, namely the stability and the goodness of the clusters, which are calculated on the clusterings generated by the metric. Our evaluomeR tool makes the method accessible to biomedical researchers. RESULTS We show the analytical power of the method by applying it to analyze (1) the behavior of the impact factor metric for a series of journal categories; (2) which structural metrics provide a better partitioning of the content of a repository of biomedical ontologies; and (3) the sources of heterogeneity in effect size metrics of biomedical primary studies. CONCLUSIONS Statistical properties such as the stability and goodness of classifications allow for a useful analysis of the behavior of quantitative metrics, which can support decisions about which metrics to apply to a given dataset.
Affiliation(s)
| | - Manuel Franco
- Dept. Statistics and Operations Research, University of Murcia, IMIB-Arrixaca, Spain
| | - Juana-María Vivo
- Dept. Statistics and Operations Research, University of Murcia, IMIB-Arrixaca, Spain
| | | | | |
|
16
|
Mazandu GK, Hotchkiss J, Nembaware V, Wonkam A, Mulder N. The Sickle Cell Disease Ontology: recent development and expansion of the universal sickle cell knowledge representation. Database (Oxford) 2022; 2022:6562127. [PMID: 35363306 PMCID: PMC9216550 DOI: 10.1093/database/baac014] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/12/2021] [Revised: 02/15/2022] [Accepted: 03/16/2022] [Indexed: 12/17/2022]
Abstract
The Sickle Cell Disease (SCD) Ontology (SCDO, https://scdontology.h3abionet.org/) provides a comprehensive knowledge base of SCD management, systems and standardized human and machine-readable resources that unambiguously describe terminology and concepts about SCD for researchers, patients and clinicians. The SCDO was launched in 2016 and is continuously updated in quantity, as well as in quality, to effectively support the curation of SCD research, patient databasing and clinical informatics applications. SCD knowledge from the scientific literature is used to update existing SCDO terms and create new terms where necessary. Here, we report major updates to the SCDO, from December 2019 until April 2021, for promoting interoperability and facilitating SCD data harmonization, sharing and integration across different studies and for retrospective multi-site research collaborations. SCDO developers continue to collaborate with the SCD community, clinicians and researchers to improve specific ontology areas and expand standardized descriptions to conditions influencing SCD phenotypic expressions and clinical manifestations of the sickling process, e.g. thalassemias. Database URL: https://scdontology.h3abionet.org/
Affiliation(s)
- Gaston K Mazandu
- Department of Pathology, Division of Human Genetics, University of Cape Town, Health Sciences Campus, Anzio Rd, Observatory 7925, South Africa
| | - Jade Hotchkiss
- Department of Pathology, Division of Human Genetics, University of Cape Town, Health Sciences Campus, Anzio Rd, Observatory 7925, South Africa
| | - Victoria Nembaware
- Department of Pathology, Division of Human Genetics, University of Cape Town, Health Sciences Campus, Anzio Rd, Observatory 7925, South Africa
| | - Ambroise Wonkam
- Department of Pathology, Division of Human Genetics, University of Cape Town, Health Sciences Campus, Anzio Rd, Observatory 7925, South Africa
| | - Nicola Mulder
- Department of Integrative Biomedical Sciences, Computational Biology Division, IDM, CIDRI-Africa WT Centre, University of Cape Town, Health Sciences Campus, Anzio Rd, Observatory 7925, South Africa
| |
|
17
|
Hammal F, de Langen P, Bergon A, Lopez F, Ballester B. ReMap 2022: a database of Human, Mouse, Drosophila and Arabidopsis regulatory regions from an integrative analysis of DNA-binding sequencing experiments. Nucleic Acids Res 2021; 50:D316-D325. [PMID: 34751401 PMCID: PMC8728178 DOI: 10.1093/nar/gkab996] [Citation(s) in RCA: 170] [Impact Index Per Article: 56.7] [Received: 09/14/2021] [Revised: 10/07/2021] [Accepted: 10/13/2021] [Indexed: 11/23/2022] Open
Abstract
ReMap (https://remap.univ-amu.fr) aims to provide manually curated, high-quality catalogs of regulatory regions resulting from a large-scale integrative analysis of DNA-binding experiments in Human, Mouse, Fly and Arabidopsis thaliana for hundreds of transcription factors and regulators. In this 2022 update, we have uniformly processed >11 000 DNA-binding sequencing datasets from public sources across four species. The updated Human regulatory atlas includes 8103 datasets covering a total of 1210 transcriptional regulators (TRs) with a catalog of 182 million (M) peaks, while the updated Arabidopsis atlas reaches 4.8M peaks, 423 TRs across 694 datasets. Also, this ReMap release is enriched by two new regulatory catalogs for Mus musculus and Drosophila melanogaster. First, the Mouse regulatory catalog consists of 123M peaks across 648 TRs as a result of the integration and validation of 5503 ChIP-seq datasets. Second, the Drosophila melanogaster catalog contains 16.6M peaks across 550 TRs from the integration of 1205 datasets. The four regulatory catalogs are browsable through track hubs at UCSC, Ensembl and NCBI genome browsers. Finally, ReMap 2022 comes with a new Cis Regulatory Module identification method, improved quality controls, faster search results, and better user experience with an interactive tour and video tutorials on browsing and filtering ReMap catalogs.
|
18
|
Doğan T, Atas H, Joshi V, Atakan A, Rifaioglu A, Nalbat E, Nightingale A, Saidi R, Volynkin V, Zellner H, Cetin-Atalay R, Martin M, Atalay V. CROssBAR: comprehensive resource of biomedical relations with knowledge graph representations. Nucleic Acids Res 2021; 49:e96. [PMID: 34181736 PMCID: PMC8450100 DOI: 10.1093/nar/gkab543] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 10/24/2020] [Revised: 04/11/2021] [Accepted: 06/10/2021] [Indexed: 12/11/2022] Open
Abstract
Systemic analysis of available large-scale biological/biomedical data is critical for studying biological mechanisms, and developing novel and effective treatment approaches against diseases. However, different layers of the available data are produced using different technologies and scattered across individual computational resources without any explicit connections to each other, which hinders extensive and integrative multi-omics-based analysis. We aimed to address this issue by developing a new data integration/representation methodology and its application by constructing a biological data resource. CROssBAR is a comprehensive system that integrates large-scale biological/biomedical data from various resources and stores them in a NoSQL database. CROssBAR is enriched with the deep-learning-based prediction of relationships between numerous data entries, which is followed by the rigorous analysis of the enriched data to obtain biologically meaningful modules. These complex sets of entities and relationships are displayed to users via easy-to-interpret, interactive knowledge graphs within an open-access service. CROssBAR knowledge graphs incorporate relevant genes-proteins, molecular interactions, pathways, phenotypes, diseases, as well as known/predicted drugs and bioactive compounds, and they are constructed on-the-fly based on simple non-programmatic user queries. These intensely processed heterogeneous networks are expected to aid systems-level research, especially to infer biological mechanisms in relation to genes, proteins, their ligands, and diseases.
Affiliation(s)
- Tunca Doğan
- Department of Computer Engineering, Hacettepe University, Ankara 06800, Turkey
- Institute of Informatics, Hacettepe University, Ankara 06800, Turkey
- Cancer Systems Biology Laboratory, Graduate School of Informatics, METU, Ankara 06800, Turkey
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL–EBI), Hinxton, Cambridgeshire CB10 1SD, UK
| | - Heval Atas
- Cancer Systems Biology Laboratory, Graduate School of Informatics, METU, Ankara 06800, Turkey
| | - Vishal Joshi
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL–EBI), Hinxton, Cambridgeshire CB10 1SD, UK
| | - Ahmet Atakan
- Department of Computer Engineering, METU, Ankara 06800, Turkey
- Department of Computer Engineering, EBYU, Erzincan 24002, Turkey
| | - Ahmet Sureyya Rifaioglu
- Department of Computer Engineering, METU, Ankara 06800, Turkey
- Department of Computer Engineering, İskenderun Technical University, Hatay 31200, Turkey
| | - Esra Nalbat
- Cancer Systems Biology Laboratory, Graduate School of Informatics, METU, Ankara 06800, Turkey
| | - Andrew Nightingale
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL–EBI), Hinxton, Cambridgeshire CB10 1SD, UK
| | - Rabie Saidi
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL–EBI), Hinxton, Cambridgeshire CB10 1SD, UK
| | - Vladimir Volynkin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL–EBI), Hinxton, Cambridgeshire CB10 1SD, UK
| | - Hermann Zellner
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL–EBI), Hinxton, Cambridgeshire CB10 1SD, UK
| | - Rengul Cetin-Atalay
- Cancer Systems Biology Laboratory, Graduate School of Informatics, METU, Ankara 06800, Turkey
- Section of Pulmonary and Critical Care Medicine, University of Chicago, Chicago, IL 60637, USA
| | - Maria Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL–EBI), Hinxton, Cambridgeshire CB10 1SD, UK
| | - Volkan Atalay
- Department of Computer Engineering, METU, Ankara 06800, Turkey
| |
|
19
|
Zhou X, Nurkowski D, Mosbach S, Akroyd J, Kraft M. Question Answering System for Chemistry. J Chem Inf Model 2021; 61:3868-3880. [PMID: 34338504 DOI: 10.1021/acs.jcim.1c00275] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/28/2022]
Abstract
This paper describes the implementation and evaluation of a proof-of-concept Question Answering (QA) system for accessing chemical data from knowledge graphs (KGs), which offer data ranging from chemical kinetics to the chemical and physical properties of species. We trained question classification and named entity recognition models that specialize in interpreting chemistry questions. The system has a novel design that applies a topic model to identify the question-to-ontology affiliation, which lets it handle ontologies with different structures. The topic model also helps the system provide higher-quality answers. Moreover, a new method that automatically generates training questions from ontologies is implemented. The question set generated for training contains 432,989 questions of 11 types. This training set proved effective for training both the question classification model and the named entity recognition model. We evaluated the system using other KGQA systems as baselines. The system outperforms the chosen KGQA system in answering chemistry-related questions. The QA system is also compared to the Google search engine and the WolframAlpha engine; the comparison shows that the QA system can answer certain types of questions better than these engines.
Affiliation(s)
- Xiaochi Zhou
- Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, U.K
| | - Daniel Nurkowski
- CMCL Innovations, Sheraton House, Castle Park, Castle Street, Cambridge CB3 0AX, U.K
| | - Sebastian Mosbach
- Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, U.K
| | - Jethro Akroyd
- Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, U.K
| | - Markus Kraft
- Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, U.K.
- School of Chemical and Biomedical Engineering, Nanyang Technological University, 62 Nanyang Drive, Singapore 637459
- CARES, Cambridge Centre for Advanced Research and Education in Singapore, 1 Create Way, CREATE Tower, #05-05, Singapore 138602
| |
|
20
|
Pancsa R, Vranken W, Mészáros B. Computational resources for identifying and describing proteins driving liquid-liquid phase separation. Brief Bioinform 2021; 22:6124912. [PMID: 33517364 PMCID: PMC8425267 DOI: 10.1093/bib/bbaa408] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Received: 08/26/2020] [Revised: 11/23/2020] [Accepted: 12/12/2020] [Indexed: 01/06/2023] Open
Abstract
One of the most intriguing fields emerging in current molecular biology is the study of membraneless organelles formed via liquid–liquid phase separation (LLPS). These organelles perform crucial functions in cell regulation and signalling, and recent years have also brought about the understanding of the molecular mechanism of their formation. The LLPS field is continuously developing and optimizing dedicated in vitro and in vivo methods to identify and characterize these non-stoichiometric molecular condensates and the proteins able to drive or contribute to LLPS. Building on these observations, several computational tools and resources have emerged in parallel to serve as platforms for the collection, annotation and prediction of membraneless organelle-linked proteins. In this survey, we showcase recent advancements in LLPS bioinformatics, focusing on (i) available databases and ontologies that are necessary to describe the studied phenomena and the experimental results in an unambiguous way and (ii) prediction methods to assess the potential LLPS involvement of proteins. Through hands-on application of these resources on example proteins and representative datasets, we give a practical guide to show how they can be used in conjunction to provide in silico information on LLPS.
Affiliation(s)
- Rita Pancsa
- Enzymology Institute of the Research Centre for Natural Sciences, Budapest, Hungary
| | - Wim Vranken
- Computer Science, chemistry and biomedical sciences at the Vrije Universiteit Brussel
| | - Bálint Mészáros
- Structural and Computational Biology Unit at the European Molecular Biology Laboratory, Heidelberg 69117, Germany
| |
|
21
|
Stevens I, Mukarram AK, Hörtenhuber M, Meehan TF, Rung J, Daub CO. Ten simple rules for annotating sequencing experiments. PLoS Comput Biol 2020; 16:e1008260. [PMID: 33017400 PMCID: PMC7535046 DOI: 10.1371/journal.pcbi.1008260] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 12/14/2022] Open
Affiliation(s)
- Irene Stevens
- Department of Biosciences and Nutrition, Karolinska Institutet, Huddinge, Sweden
- Science for Life Laboratory, Karolinska Institutet, Stockholm, Sweden
| | - Abdul Kadir Mukarram
- Department of Biosciences and Nutrition, Karolinska Institutet, Huddinge, Sweden
| | - Matthias Hörtenhuber
- Department of Biosciences and Nutrition, Karolinska Institutet, Huddinge, Sweden
| | - Terrence F. Meehan
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
| | - Johan Rung
- Science for Life Laboratory, Karolinska Institutet, Stockholm, Sweden
- Department of Immunology, Genetics and Pathology, Uppsala University, Uppsala, Sweden
| | - Carsten O. Daub
- Department of Biosciences and Nutrition, Karolinska Institutet, Huddinge, Sweden
- Science for Life Laboratory, Karolinska Institutet, Stockholm, Sweden
| |
|
22
|
Palermo A, Huan T, Rinehart D, Rinschen MM, Li S, O'Donnell VB, Fahy E, Xue J, Subramaniam S, Benton HP, Siuzdak G. Cloud-based archived metabolomics data: A resource for in-source fragmentation/annotation, meta-analysis and systems biology. Anal Sci Adv 2020; 1:70-80. [PMID: 35190800 PMCID: PMC8858440 DOI: 10.1002/ansa.202000042] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 06/14/2023]
Abstract
Archived metabolomics data represent a broad resource for the scientific community. However, the absence of tools for the meta-analysis of heterogeneous data types makes it challenging to perform direct comparisons in a single, cohesive workflow. Here we present a framework for the meta-analysis of metabolic pathways and their interpretation with proteomic and transcriptomic data. This framework facilitates the comparison of heterogeneous types of metabolomics data from online repositories (e.g., XCMS Online, Metabolomics Workbench, GNPS, and MetaboLights) representing tens of thousands of studies, as well as locally acquired data. As a proof of concept, we apply the workflow to the meta-analysis of (i) independent colon cancer studies, further interpreted with proteomics and transcriptomics data, and (ii) multimodal data from Alzheimer's disease and mild cognitive impairment studies, demonstrating its high-throughput capability for the systems-level interpretation of metabolic pathways. Moreover, the platform has been modified for improved knowledge dissemination through a collaboration with Metabolomics Workbench and LIPID MAPS. We envision that this meta-analysis tool will help overcome the primary bottleneck in analyzing diverse datasets and facilitate the full exploitation of archival metabolomics data for addressing a broad array of questions in metabolism research and systems biology.
Affiliation(s)
- Amelia Palermo
- Scripps Center for Metabolomics, The Scripps Research Institute, La Jolla, California, USA
| | - Tao Huan
- Scripps Center for Metabolomics, The Scripps Research Institute, La Jolla, California, USA
- Department of Chemistry, University of British Columbia, Vancouver, British Columbia, Canada
| | - Duane Rinehart
- Scripps Center for Metabolomics, The Scripps Research Institute, La Jolla, California, USA
| | - Markus M. Rinschen
- Scripps Center for Metabolomics, The Scripps Research Institute, La Jolla, California, USA
| | - Shuzhao Li
- The Jackson Laboratory for Genomic Medicine, Farmington, Connecticut, USA
| | | | - Eoin Fahy
- Department of Bioengineering, University of California San Diego, La Jolla, California, USA
| | - Jingchuan Xue
- Scripps Center for Metabolomics, The Scripps Research Institute, La Jolla, California, USA
| | - Shankar Subramaniam
- Department of Bioengineering, University of California San Diego, La Jolla, California, USA
| | - H. Paul Benton
- Scripps Center for Metabolomics, The Scripps Research Institute, La Jolla, California, USA
| | - Gary Siuzdak
- Scripps Center for Metabolomics, The Scripps Research Institute, La Jolla, California, USA
- Department of Chemistry, Molecular and Computational Biology, The Scripps Research Institute, La Jolla, California, USA
| |
|
23
|
Schriml LM, Mitraka E, Munro J, Tauber B, Schor M, Nickle L, Felix V, Jeng L, Bearer C, Lichenstein R, Bisordi K, Campion N, Hyman B, Kurland D, Oates CP, Kibbey S, Sreekumar P, Le C, Giglio M, Greene C. Human Disease Ontology 2018 update: classification, content and workflow expansion. Nucleic Acids Res 2019; 47:D955-D962. [PMID: 30407550 PMCID: PMC6323977 DOI: 10.1093/nar/gky1032] [Citation(s) in RCA: 249] [Impact Index Per Article: 62.3] [Received: 09/13/2018] [Accepted: 10/22/2018] [Indexed: 12/22/2022] Open
Abstract
The Human Disease Ontology (DO) (http://www.disease-ontology.org) database has undergone significant expansion in the past three years. The DO disease classification includes specific formal semantic rules to express meaningful disease models and has expanded from a single asserted classification to include multiple inferred mechanistic disease classifications, thus providing novel perspectives on related diseases. Expansion of disease terms, alternative anatomy, cell type and genetic disease classifications and workflow automation highlight the updates for the DO since 2015. The enhanced breadth and depth of the DO’s knowledgebase have expanded the DO’s utility for exploring the multi-etiology of human disease, thus improving the capture and communication of health-related data across biomedical databases, bioinformatics tools, and genomic and cancer resources, as demonstrated by a 6.6× growth in the DO’s user community since 2015. The DO’s continual integration of human disease knowledge, evidenced by the more than 200 SVN/GitHub releases/revisions since our DO 2015 NAR paper, includes the addition of 2650 new disease terms, a 30% increase in textual definitions, and an expanding suite of disease classification hierarchies constructed through defined logical axioms.
Affiliation(s)
- Lynn M Schriml
- University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
| | | | - James Munro
- University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
| | - Becky Tauber
- University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
| | - Mike Schor
- University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
| | - Lance Nickle
- University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
| | - Victor Felix
- University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
| | - Linda Jeng
- University of Maryland School of Medicine, Baltimore, MD, USA
| | - Cynthia Bearer
- University of Maryland School of Medicine, Baltimore, MD, USA
| | | | | | - Nicole Campion
- University of Maryland School of Medicine, Baltimore, MD, USA
| | - Brooke Hyman
- University of Maryland School of Medicine, Baltimore, MD, USA
| | - David Kurland
- New York University Langone Medical Center, Department of Neurosurgery, New York, NY, USA
| | - Connor Patrick Oates
- Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Siobhan Kibbey
- University of Maryland School of Medicine, Baltimore, MD, USA
| | | | - Chris Le
- University of Maryland School of Medicine, Baltimore, MD, USA
| | - Michelle Giglio
- University of Maryland School of Medicine, Institute for Genome Sciences, Baltimore, MD, USA
| | - Carol Greene
- University of Maryland School of Medicine, Baltimore, MD, USA
| |
|
24
|
Franco M, Vivo JM, Quesada-Martínez M, Duque-Ramos A, Fernández-Breis JT. Evaluation of ontology structural metrics based on public repository data. Brief Bioinform 2020; 21:473-485. [PMID: 30715146 DOI: 10.1093/bib/bbz009] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 10/22/2018] [Revised: 12/20/2018] [Accepted: 01/05/2019] [Indexed: 11/14/2022] Open
Abstract
The development and application of biological ontologies have increased significantly in recent years. These ontologies can be retrieved from different repositories, which do not provide much information about the quality of the ontologies. In recent years, some ontology structural metrics have been proposed, but their validity as measurement instruments has not been sufficiently studied to date. In this work, we evaluate a set of reproducible and objective ontology structural metrics. Given the lack of standard methods for this purpose, we have applied an evaluation method based on the stability and goodness of the classifications of ontologies produced by each metric on an ontology corpus. The evaluation has been done using ontology repositories as corpora. More concretely, we have used 119 ontologies from the OBO Foundry repository and 78 ontologies from AgroPortal. First, we study the correlations between the metrics. Second, we study whether the clusters for a given metric are stable and have a good structure. The results show that the existing correlations do not bias the evaluation, that no metric generates unstable clusterings, and that all the metrics evaluated provide at least a reasonable clustering structure. Furthermore, our work makes it possible to review and suggest the most reliable ontology structural metrics in terms of the stability and goodness of their classifications. Availability: http://sele.inf.um.es/ontology-metrics.
Affiliation(s)
- Manuel Franco
- Departamento de Estadística e Investigación Operativa, Universidad de Murcia, Murcia, Spain
| | - Juana María Vivo
- Departamento de Estadística e Investigación Operativa, Universidad de Murcia, Murcia, Spain
| | | | - Astrid Duque-Ramos
- Departamento de Sistemas, Facultad de Ingenierías, Universidad de Antioquia, Medellín, Colombia
| | | |
|
25
|
Humayun F, Domingo-Fernández D, Paul George AA, Hopp MT, Syllwasschy BF, Detzel MS, Hoyt CT, Hofmann-Apitius M, Imhof D. A Computational Approach for Mapping Heme Biology in the Context of Hemolytic Disorders. Front Bioeng Biotechnol 2020; 8:74. [PMID: 32211383 PMCID: PMC7069124 DOI: 10.3389/fbioe.2020.00074] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 10/14/2019] [Accepted: 01/28/2020] [Indexed: 01/07/2023] Open
Abstract
Heme is an iron ion-containing molecule found within hemoproteins such as hemoglobin and cytochromes that participates in diverse biological processes. Although excessive heme has been implicated in several diseases, including malaria, sepsis, ischemia-reperfusion, and disseminated intravascular coagulation, little is known about its regulatory and signaling functions. This limited understanding is due in part to the lack of curated pathway resources for heme cell biology. Here, we present two resources that aim to exploit this unexplored information to model heme biology. The first resource is a terminology covering heme-specific terms not yet included in standard controlled vocabularies. Using this terminology, we curated and modeled the second resource, a mechanistic knowledge graph representing heme's interactome based on a corpus of 46 scientific articles. Finally, we demonstrated the utility of these resources by investigating the role of heme in the Toll-like receptor signaling pathway. Our analysis proposed a series of crosstalk events that could explain the role of heme in activating the TLR4 signaling pathway. In summary, the presented work opens the door for the scientific community to explore the published knowledge on heme biology.
Affiliation(s)
- Farah Humayun
- Pharmaceutical Biochemistry and Bioanalytics, Pharmaceutical Institute, University of Bonn, Bonn, Germany
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin, Germany
| | - Daniel Domingo-Fernández
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin, Germany
| | - Ajay Abisheck Paul George
- Pharmaceutical Biochemistry and Bioanalytics, Pharmaceutical Institute, University of Bonn, Bonn, Germany
| | - Marie-Thérèse Hopp
- Pharmaceutical Biochemistry and Bioanalytics, Pharmaceutical Institute, University of Bonn, Bonn, Germany
| | - Benjamin F. Syllwasschy
- Pharmaceutical Biochemistry and Bioanalytics, Pharmaceutical Institute, University of Bonn, Bonn, Germany
| | - Milena S. Detzel
- Pharmaceutical Biochemistry and Bioanalytics, Pharmaceutical Institute, University of Bonn, Bonn, Germany
| | - Charles Tapley Hoyt
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin, Germany
| | - Martin Hofmann-Apitius
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin, Germany
| | - Diana Imhof
- Pharmaceutical Biochemistry and Bioanalytics, Pharmaceutical Institute, University of Bonn, Bonn, Germany
| |
|
26
|
Rahman RU, Liebhoff AM, Bansal V, Fiosins M, Rajput A, Sattar A, Magruder DS, Madan S, Sun T, Gautam A, Heins S, Liwinski T, Bethune J, Trenkwalder C, Fluck J, Mollenhauer B, Bonn S. SEAweb: the small RNA Expression Atlas web application. Nucleic Acids Res 2020; 48:D204-D219. [PMID: 31598718 PMCID: PMC6943056 DOI: 10.1093/nar/gkz869] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 07/19/2019] [Revised: 09/14/2019] [Accepted: 10/01/2019] [Indexed: 12/12/2022] Open
Abstract
We present the Small RNA Expression Atlas (SEAweb), a web application that allows for the interactive querying, visualization and analysis of known and novel small RNAs across 10 organisms. It contains sRNA and pathogen expression information for over 4200 published samples with standardized search terms and ontologies. In addition, SEAweb allows for the interactive visualization and re-analysis of 879 differential expression and 514 classification comparisons. SEAweb's user model enables sRNA researchers to compare and re-analyze user-specific and published datasets, highlighting common and distinct sRNA expression patterns. We provide evidence for SEAweb's fidelity by (i) generating a set of 591 tissue specific miRNAs across 29 tissues, (ii) finding known and novel bacterial and viral infections across diseases and (iii) determining a Parkinson's disease-specific blood biomarker signature using novel data. We believe that SEAweb's simple semantic search interface, the flexible interactive reports and the user model with rich analysis capabilities will enable researchers to better understand the potential function and diagnostic value of sRNAs or pathogens across tissues, diseases and organisms.
Affiliation(s)
- Raza-Ur Rahman
- Institute of Medical Systems Biology, Center for Molecular Neurobiology, University Medical Center Hamburg-Eppendorf, 20251 Hamburg, Germany
| | - Anna-Maria Liebhoff
- Institute of Medical Systems Biology, Center for Molecular Neurobiology, University Medical Center Hamburg-Eppendorf, 20251 Hamburg, Germany
| | - Vikas Bansal
- Institute of Medical Systems Biology, Center for Molecular Neurobiology, University Medical Center Hamburg-Eppendorf, 20251 Hamburg, Germany
- German Center for Neurodegenerative Diseases, 72076 Tübingen, Germany
| | - Maksims Fiosins
- Institute of Medical Systems Biology, Center for Molecular Neurobiology, University Medical Center Hamburg-Eppendorf, 20251 Hamburg, Germany
- German Center for Neurodegenerative Diseases, 72076 Tübingen, Germany
- Genevention GmbH, 37079 Göttingen, Germany
| | - Ashish Rajput
- Institute of Medical Systems Biology, Center for Molecular Neurobiology, University Medical Center Hamburg-Eppendorf, 20251 Hamburg, Germany
| | - Abdul Sattar
- Institute of Medical Systems Biology, Center for Molecular Neurobiology, University Medical Center Hamburg-Eppendorf, 20251 Hamburg, Germany
| | - Daniel S Magruder
- Institute of Medical Systems Biology, Center for Molecular Neurobiology, University Medical Center Hamburg-Eppendorf, 20251 Hamburg, Germany
- Genevention GmbH, 37079 Göttingen, Germany
| | - Sumit Madan
- Fraunhofer Institute for Algorithms and Scientific Computing, Schloss Birlinghoven, 53757 Sankt Augustin, Germany
- Rheinische Friedrich-Wilhelms-Universität Bonn, 53113 Bonn, Germany
| | - Ting Sun
- Institute of Medical Systems Biology, Center for Molecular Neurobiology, University Medical Center Hamburg-Eppendorf, 20251 Hamburg, Germany
- Department of Neurogenetics, Max Planck Institute of Experimental Medicine, 37075 Göttingen, Germany
| | - Abhivyakti Gautam
- Institute of Medical Systems Biology, Center for Molecular Neurobiology, University Medical Center Hamburg-Eppendorf, 20251 Hamburg, Germany
| | - Sven Heins
- Institute of Medical Systems Biology, Center for Molecular Neurobiology, University Medical Center Hamburg-Eppendorf, 20251 Hamburg, Germany
| | - Timur Liwinski
- Department of Medicine, University Medical Center Hamburg-Eppendorf, 20251 Hamburg, Germany
| | - Jörn Bethune
- Institute of Medical Systems Biology, Center for Molecular Neurobiology, University Medical Center Hamburg-Eppendorf, 20251 Hamburg, Germany
| | - Claudia Trenkwalder
- Paracelsus-Elena-Klinik, 34128 Kassel, Germany
- Department of Neurosurgery, University Medical Center Göttingen, 37075 Göttingen, Germany
| | - Juliane Fluck
- Fraunhofer Institute for Algorithms and Scientific Computing, Schloss Birlinghoven, 53757 Sankt Augustin, Germany
- Institute of Geodesy and Geoinformation, University of Bonn, 53115 Bonn, Germany
- German National Library of Medicine (ZB MED) - Information Centre for Life Sciences, 53115 Bonn, Germany
| | - Brit Mollenhauer
- Paracelsus-Elena-Klinik, 34128 Kassel, Germany
- Institute of Neurology, University Medical Center Göttingen, 37075 Göttingen, Germany
| | - Stefan Bonn
- Institute of Medical Systems Biology, Center for Molecular Neurobiology, University Medical Center Hamburg-Eppendorf, 20251 Hamburg, Germany
- German Center for Neurodegenerative Diseases, 72076 Tübingen, Germany
| |
|
27
|
Biomedical ontologies and their development, management, and applications in and beyond China. JOURNAL OF BIO-X RESEARCH 2019. [DOI: 10.1097/jbr.0000000000000051] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 11/25/2022] Open
|
28
|
Jackson RC, Balhoff JP, Douglass E, Harris NL, Mungall CJ, Overton JA. ROBOT: A Tool for Automating Ontology Workflows. BMC Bioinformatics 2019; 20:407. [PMID: 31357927 PMCID: PMC6664714 DOI: 10.1186/s12859-019-3002-3] [Citation(s) in RCA: 75] [Impact Index Per Article: 15.0] [Received: 10/09/2018] [Accepted: 07/19/2019] [Indexed: 11/21/2022] Open
Abstract
BACKGROUND Ontologies are invaluable in the life sciences, but building and maintaining ontologies often requires a challenging number of distinct tasks such as running automated reasoners and quality control checks, extracting dependencies and application-specific subsets, generating standard reports, and generating release files in multiple formats. Similar to more general software development, automation is the key to executing and managing these tasks effectively and to releasing more robust products in standard forms. For ontologies using the Web Ontology Language (OWL), the OWL API Java library is the foundation for a range of software tools, including the Protégé ontology editor. In the Open Biological and Biomedical Ontologies (OBO) community, we recognized the need to package a wide range of low-level OWL API functionality into a library of common higher-level operations and to make those operations available as a command-line tool. RESULTS ROBOT (a recursive acronym for "ROBOT is an OBO Tool") is an open source library and command-line tool for automating ontology development tasks. The library can be called from any programming language that runs on the Java Virtual Machine (JVM). Most usage is through the command-line tool, which runs on macOS, Linux, and Windows. ROBOT provides ontology processing commands for a variety of tasks, including commands for converting formats, running a reasoner, creating import modules, running reports, and various other tasks. These commands can be combined into larger workflows using a separate task execution system such as GNU Make, and workflows can be automatically executed within continuous integration systems. CONCLUSIONS ROBOT supports automation of a wide range of ontology development tasks, focusing on OBO conventions. It packages common high-level ontology development functionality into a convenient library, and makes it easy to configure, combine, and execute individual tasks in comprehensive, automated workflows. This helps ontology developers to efficiently create, maintain, and release high-quality ontologies, so that they can spend more time focusing on development tasks. It also helps guarantee that released ontologies are free of certain types of logical errors and conform to standard quality control checks, increasing the overall robustness and efficiency of the ontology development lifecycle.
Affiliation(s)
| | - James P Balhoff
- Renaissance Computing Institute, University of North Carolina, Chapel Hill, North Carolina, USA
| | - Eric Douglass
- Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Nomi L Harris
- Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | | | | |
|
29
|
Erkimbaev AO, Zitserman VY, Kobzev GA, Kosinov AV. Ontological Concepts and Taxonomies for Nano World. JOURNAL OF INFORMATION & KNOWLEDGE MANAGEMENT 2019. [DOI: 10.1142/s021964921950014x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/18/2022]
Abstract
The purpose of the paper is to provide a detailed overview of the methods of indexing and categorizing data generated to solve problems in a complex and multifaceted field of knowledge related to the application of nanotechnology. The capabilities and restrictions of various categorization methods, from simple classification schemes up to high-level ontologies, are analyzed as applied to the issues of the subject field. The content of integrating methods and approaches developed in many natural sciences is considered: life science, chemistry, material science, etc. The main restriction of the currently applicable ontologies and vocabularies has been identified: a primary focus on the tasks of bio- and medical informatics. It is shown that the way to overcome it includes the adoption of a new system for describing nanomaterials proposed in the CODATA-VAMAS international project. The overview shows how the extreme broadness and continuous evolution of the subject field are reflected in the means of data categorization. It is shown that the most developed of them can serve as a basis for building a knowledge base. The prospective tasks that nanoinformatics must solve to cover fundamentally unlimited classes of materials, their properties and fields of application are stated.
Affiliation(s)
- Adilbek O. Erkimbaev
- Joint Institute for High Temperatures of the Russian Academy of Sciences, Izhorskaya St., 13, Bd. 2, Moscow 125412, Russia
| | - Vladimir Yu. Zitserman
- Joint Institute for High Temperatures of the Russian Academy of Sciences, Izhorskaya St., 13, Bd. 2, Moscow 125412, Russia
| | - Georgii A. Kobzev
- Joint Institute for High Temperatures of the Russian Academy of Sciences, Izhorskaya St., 13, Bd. 2, Moscow 125412, Russia
| | - Andrey V. Kosinov
- Joint Institute for High Temperatures of the Russian Academy of Sciences, Izhorskaya St., 13, Bd. 2, Moscow 125412, Russia
| |
|
30
|
Tang YA, Pichler K, Füllgrabe A, Lomax J, Malone J, Munoz-Torres MC, Vasant DV, Williams E, Haendel M. Ten quick tips for biocuration. PLoS Comput Biol 2019; 15:e1006906. [PMID: 31048830 PMCID: PMC6497217 DOI: 10.1371/journal.pcbi.1006906] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 11/29/2022] Open
Affiliation(s)
- Y. Amy Tang
- Genestack Limited, Cambridge, Cambridgeshire, United Kingdom
| | - Klemens Pichler
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, United Kingdom
| | - Anja Füllgrabe
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, United Kingdom
| | - Jane Lomax
- SciBite Limited, BioData Innovation Centre, Hinxton, Cambridgeshire, United Kingdom
| | - James Malone
- SciBite Limited, BioData Innovation Centre, Hinxton, Cambridgeshire, United Kingdom
| | | | - Drashtti V. Vasant
- Bayer Business Services GmbH, BP Research and Development, Translational Sciences, Berlin, Germany
| | - Eleanor Williams
- Centre for Gene Regulation and Expression, School of Life Sciences, University of Dundee, Dundee, United Kingdom
- Genomics England, Queen Mary University of London, London, United Kingdom
| | - Melissa Haendel
- Linus Pauling Institute, Oregon State University, Corvallis, Oregon, United States of America
| |
|
31
|
Gonçalves RS, Musen MA. The variable quality of metadata about biological samples used in biomedical experiments. Sci Data 2019; 6:190021. [PMID: 30778255 PMCID: PMC6380228 DOI: 10.1038/sdata.2019.21] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Received: 08/02/2018] [Accepted: 01/18/2019] [Indexed: 11/08/2022] Open
Abstract
We present an analytical study of the quality of metadata about samples used in biomedical experiments. The metadata under analysis are stored in two well-known databases: BioSample, a repository managed by the National Center for Biotechnology Information (NCBI), and BioSamples, a repository managed by the European Bioinformatics Institute (EBI). We tested whether 11.4 M sample metadata records in the two repositories are populated with values that fulfill the stated requirements for such values. Our study revealed multiple anomalies in the metadata. Most metadata field names and their values are not standardized or controlled. Even simple binary or numeric fields are often populated with inadequate values of different data types. By clustering metadata field names, we discovered there are often many distinct ways to represent the same aspect of a sample. Overall, the metadata we analyzed reveal that there is a lack of principled mechanisms to enforce and validate metadata requirements. The significant aberrancies that we found in the metadata are likely to impede search and secondary use of the associated datasets.
Affiliation(s)
- Rafael S. Gonçalves
- Stanford Center for Biomedical Informatics Research, Stanford University, Stanford CA, USA
| | - Mark A. Musen
- Stanford Center for Biomedical Informatics Research, Stanford University, Stanford CA, USA
| |
|
32
|
Luecken MD, Page MJT, Crosby AJ, Mason S, Reinert G, Deane CM. CommWalker: correctly evaluating modules in molecular networks in light of annotation bias. Bioinformatics 2019; 34:994-1000. [PMID: 29112702 PMCID: PMC5860269 DOI: 10.1093/bioinformatics/btx706] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 09/14/2016] [Accepted: 11/02/2017] [Indexed: 11/24/2022] Open
Abstract
Motivation Detecting novel functional modules in molecular networks is an important step in biological research. In the absence of gold standard functional modules, functional annotations are often used to verify whether detected modules/communities have biological meaning. However, as we show, the uneven distribution of functional annotations means that such evaluation methods favor communities of well-studied proteins. Results We propose a novel framework for the evaluation of communities as functional modules. Our proposed framework, CommWalker, takes communities as inputs and evaluates them in their local network environment by performing short random walks. We test CommWalker’s ability to overcome annotation bias using input communities from four community detection methods on two protein interaction networks. We find that modules accepted by CommWalker are similarly co-expressed as those accepted by current methods. Crucially, CommWalker performs well not only in well-annotated regions, but also in regions otherwise obscured by poor annotation. CommWalker community prioritization both faithfully captures well-validated communities and identifies functional modules that may correspond to more novel biology. Availability and implementation The CommWalker algorithm is freely available at opig.stats.ox.ac.uk/resources or as a docker image on the Docker Hub at hub.docker.com/r/lueckenmd/commwalker/. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- M D Luecken
- Department of Statistics, University of Oxford, Oxford, UK
- Doctoral Training Centre, University of Oxford, Oxford, UK
| | - M J T Page
- Department of Informatics, UCB Pharma, Slough, UK
| | - A J Crosby
- Immunology Therapeutic Area, UCB Pharma, Slough, UK
| | - S Mason
- Immunology Therapeutic Area, UCB Pharma, Slough, UK
| | - G Reinert
- Department of Statistics, University of Oxford, Oxford, UK
| | - C M Deane
- Department of Statistics, University of Oxford, Oxford, UK
- Doctoral Training Centre, University of Oxford, Oxford, UK
- To whom correspondence should be addressed.
| |
|
33
|
GNOMICS: A one-stop shop for biomedical and genomic data. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE PROCEEDINGS. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE 2018; 2017:118-123. [PMID: 29888054 PMCID: PMC5961829] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/03/2022]
Abstract
The World Wide Web is an indispensable tool for biomedical researchers who are striving to understand the molecular basis of phenotype. However, it presents challenges in the form of proliferation of data resources, with heterogeneity ranging from their content to functionality to interfaces. This often frustrates researchers who must visit multiple sites, become familiar with their interfaces, and learn how to use them to extract knowledge. Even then, one may never feel sure that they have tracked down all needed information. We envision addressing this challenge with GNOMICS (Genomic Nomenclature Omnibus and Multifaceted Informatics and Computational Suite), a suite with both a programmatic interface and a GUI. GNOMICS allows for extensible biomedical functionality, including identifier conversion, pathway enrichment, sequence alignment, and reference gathering, among others. It combines usage of other biological and chemical database application programming interfaces (APIs) to deliver uniform data which can be further manipulated and parsed.
|
34
|
Wimalaratne SM, Juty N, Kunze J, Janée G, McMurry JA, Beard N, Jimenez R, Grethe JS, Hermjakob H, Martone ME, Clark T. Uniform resolution of compact identifiers for biomedical data. Sci Data 2018; 5:180029. [PMID: 29737976 PMCID: PMC5944906 DOI: 10.1038/sdata.2018.29] [Citation(s) in RCA: 42] [Impact Index Per Article: 7.0] [Received: 09/11/2017] [Accepted: 01/26/2018] [Indexed: 11/09/2022] Open
Abstract
Most biomedical data repositories issue locally unique accession numbers, but do not provide globally unique, machine-resolvable, persistent identifiers for their datasets, as required by publishers wishing to implement data citation in accordance with widely accepted principles. Local accessions may, however, be prefixed with a namespace identifier, providing global uniqueness. Such "compact identifiers" have been widely used in biomedical informatics to support global resource identification with local identifier assignment. We report here on our project to provide robust support for machine-resolvable, persistent compact identifiers in biomedical data citation, by harmonizing the Identifiers.org and N2T.net (Name-To-Thing) meta-resolvers and extending their capabilities. Identifiers.org services hosted at the European Molecular Biology Laboratory - European Bioinformatics Institute (EMBL-EBI), and N2T.net services hosted at the California Digital Library (CDL), can now resolve any given identifier from over 600 source databases to its original source on the Web, using a common registry of prefix-based redirection rules. We believe these services will be of significant help to publishers and others implementing persistent, machine-resolvable citation of research data.
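As an illustration of the prefix-based redirection idea described in this abstract, the following minimal Python sketch splits a compact identifier of the form prefix:accession and looks the prefix up in a registry of URL templates. It is only a sketch under stated assumptions: the registry contents, the example.org URLs, and the function names `split_compact_id` and `resolve` are hypothetical and are not taken from Identifiers.org or N2T.net.

```python
import re

# Hypothetical registry mapping a namespace prefix to a URL template.
# The real meta-resolvers maintain their own shared registry; these
# entries and the example.org hosts are illustrative assumptions only.
EXAMPLE_REGISTRY = {
    "taxonomy": "https://example.org/taxonomy/{accession}",
    "pdb": "https://example.org/pdb/{accession}",
}

# A compact identifier is a namespace prefix, a colon, and a local accession.
_COMPACT_ID = re.compile(r"^(?P<prefix>[A-Za-z0-9._]+):(?P<accession>\S+)$")


def split_compact_id(compact_id):
    """Return (prefix, accession); the prefix is lower-cased for lookup."""
    match = _COMPACT_ID.match(compact_id.strip())
    if match is None:
        raise ValueError(f"not a compact identifier: {compact_id!r}")
    return match.group("prefix").lower(), match.group("accession")


def resolve(compact_id, registry=EXAMPLE_REGISTRY):
    """Build the target URL for a compact identifier from the registry."""
    prefix, accession = split_compact_id(compact_id)
    if prefix not in registry:
        raise KeyError(f"unknown prefix: {prefix!r}")
    return registry[prefix].format(accession=accession)


if __name__ == "__main__":
    print(resolve("pdb:2gc4"))
```

The local accession stays exactly as the source repository issued it; only the prefix lookup is shared, which is what lets one registry serve many databases.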
Affiliation(s)
- Sarala M Wimalaratne
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Nick Juty
- University of Manchester, Oxford Road, Manchester M13 9PL, UK
| | - John Kunze
- California Digital Library, University of California, Oakland, CA 94612, USA
| | - Greg Janée
- California Digital Library, University of California, Oakland, CA 94612, USA
| | - Julie A McMurry
- Oregon Health and Science University, Portland, OR 97239, USA
| | - Niall Beard
- University of Manchester, Oxford Road, Manchester M13 9PL, UK
| | - Rafael Jimenez
- ELIXIR, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | | | - Henning Hermjakob
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | | | - Tim Clark
- Massachusetts General Hospital, Boston, MA 02114, USA
- Harvard Medical School, Boston, MA 02115, USA
| |
|
35
|
He Y, Xiang Z, Zheng J, Lin Y, Overton JA, Ong E. The eXtensible ontology development (XOD) principles and tool implementation to support ontology interoperability. J Biomed Semantics 2018; 9:3. [PMID: 29329592 PMCID: PMC5765662 DOI: 10.1186/s13326-017-0169-2] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Received: 05/01/2017] [Accepted: 12/07/2017] [Indexed: 11/13/2022] Open
Abstract
Ontologies are critical to data/metadata and knowledge standardization, sharing, and analysis. With hundreds of biological and biomedical ontologies developed, it has become critical to ensure ontology interoperability and the usage of interoperable ontologies for standardized data representation and integration. The suite of web-based Ontoanimal tools (e.g., Ontofox, Ontorat, and Ontobee) supports different aspects of extensible ontology development. By summarizing the common features of Ontoanimal and other similar tools, we identified and proposed an “eXtensible Ontology Development” (XOD) strategy and its associated four principles. These XOD principles reuse existing terms and semantic relations from reliable ontologies, develop and apply well-established ontology design patterns (ODPs), and involve community efforts to support new ontology development, promoting standardized and interoperable data and knowledge representation and integration. The adoption of the XOD strategy, together with robust XOD tool development, will greatly support ontology interoperability and robust ontology applications to support data to be Findable, Accessible, Interoperable and Reusable (i.e., FAIR).
Affiliation(s)
- Yongqun He
- Unit for Laboratory Animal Medicine, Department of Microbiology and Immunology, Center for Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, USA
| | - Zuoshuang Xiang
- Unit for Laboratory Animal Medicine, Department of Microbiology and Immunology, Center for Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, USA
| | - Jie Zheng
- Department of Genetics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, 19104, USA
| | - Yu Lin
- Center for Computational Science, University of Miami, Coral Gables, FL, USA
| | | | - Edison Ong
- Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI, USA
| |
|
36
|
Harper L, Campbell J, Cannon EKS, Jung S, Poelchau M, Walls R, Andorf C, Arnaud E, Berardini TZ, Birkett C, Cannon S, Carson J, Condon B, Cooper L, Dunn N, Elsik CG, Farmer A, Ficklin SP, Grant D, Grau E, Herndon N, Hu ZL, Humann J, Jaiswal P, Jonquet C, Laporte MA, Larmande P, Lazo G, McCarthy F, Menda N, Mungall CJ, Munoz-Torres MC, Naithani S, Nelson R, Nesdill D, Park C, Reecy J, Reiser L, Sanderson LA, Sen TZ, Staton M, Subramaniam S, Tello-Ruiz MK, Unda V, Unni D, Wang L, Ware D, Wegrzyn J, Williams J, Woodhouse M, Yu J, Main D. AgBioData consortium recommendations for sustainable genomics and genetics databases for agriculture. Database (Oxford) 2018; 2018:5096675. [PMID: 30239679 PMCID: PMC6146126 DOI: 10.1093/database/bay088] [Citation(s) in RCA: 38] [Impact Index Per Article: 6.3] [Received: 03/29/2018] [Revised: 07/19/2018] [Accepted: 07/30/2018] [Indexed: 01/07/2023]
Abstract
The future of agricultural research depends on data. The sheer volume of agricultural biological data being produced today makes excellent data management essential. Governmental agencies, publishers and science funders require data management plans for publicly funded research. Furthermore, the value of data increases exponentially when they are properly stored, described, integrated and shared, so that they can be easily utilized in future analyses. AgBioData (https://www.agbiodata.org) is a consortium of people working at agricultural biological databases, data archives and knowledgebases who strive to identify common issues in database development, curation and management, with the goal of creating database products that are more Findable, Accessible, Interoperable and Reusable. We strive to promote authentic, detailed, accurate and explicit communication between all parties involved in scientific data. As a step toward this goal, we present the current state of biocuration, ontologies, metadata and persistence, database platforms, programmatic (machine) access to data, communication and sustainability with regard to data curation. Each section describes challenges and opportunities for these topics, along with recommendations and best practices.
Affiliation(s)
- Lisa Harper
- Corn Insects and Crop Genetics Research Unit, USDA-ARS, Ames, IA, USA
| | | | - Ethalinda K S Cannon
- Corn Insects and Crop Genetics Research Unit, USDA-ARS, Ames, IA, USA
- Computer Science, Iowa State University, Ames, IA, USA
| | - Sook Jung
- Horticulture, Washington State University, Pullman, WA, USA
| | - Monica Poelchau
- National Agricultural Library, USDA Agricultural Research Service, Beltsville, MD, USA
| | | | - Carson Andorf
- Corn Insects and Crop Genetics Research Unit, USDA-ARS, Ames, IA, USA
- Computer Science, Iowa State University, Ames, IA, USA
| | - Elizabeth Arnaud
- Bioversity International, Informatics Unit, Conservation and Availability Programme, Parc Scientifique Agropolis II, Montpellier, France
| | - Tanya Z Berardini
- The Arabidopsis Information Resource, Phoenix Bioinformatics, Fremont, CA, USA
| | | | - Steve Cannon
- Corn Insects and Crop Genetics Research Unit, USDA-ARS, Ames, IA, USA
| | - James Carson
- Texas Advanced Computing Center, The University of Texas at Austin, Austin, TX, USA
| | - Bradford Condon
- Entomology and Plant Pathology, University of Tennessee Knoxville, Knoxville, TN, USA
| | - Laurel Cooper
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR, USA
| | - Nathan Dunn
- Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Christine G Elsik
- Division of Animal Sciences and Division of Plant Sciences, University of Missouri, Columbia, MO, USA
| | - Andrew Farmer
- National Center for Genome Resources, Santa Fe, NM, USA
| | | | - David Grant
- Corn Insects and Crop Genetics Research Unit, USDA-ARS, Ames, IA, USA
| | - Emily Grau
- National Center for Genome Resources, Santa Fe, NM, USA
| | - Nic Herndon
- Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
| | - Zhi-Liang Hu
- Animal Science, Iowa State University, Ames, USA
| | - Jodi Humann
- Horticulture, Washington State University, Pullman, WA, USA
| | - Pankaj Jaiswal
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR, USA
| | - Clement Jonquet
- Laboratory of Informatics, Robotics, Microelectronics of Montpellier, University of Montpellier & CNRS, Montpellier, France
| | - Marie-Angélique Laporte
- Bioversity International, Informatics Unit, Conservation and Availability Programme, Parc Scientifique Agropolis II, Montpellier, France
| | | | - Gerard Lazo
- Crop Improvement and Genetics Research Unit, USDA-ARS, Albany, CA, USA
| | - Fiona McCarthy
- School of Animal and Comparative Biomedical Sciences, University of Arizona, Tucson, AZ, USA
| | | | | | | | - Sushma Naithani
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR, USA
| | - Rex Nelson
- Corn Insects and Crop Genetics Research Unit, USDA-ARS, Ames, IA, USA
| | - Daureen Nesdill
- Marriott Library, University of Utah, Salt Lake City, UT, USA
| | - Carissa Park
- Animal Science, Iowa State University, Ames, USA
| | - James Reecy
- Animal Science, Iowa State University, Ames, USA
| | - Leonore Reiser
- The Arabidopsis Information Resource, Phoenix Bioinformatics, Fremont, CA, USA
| | | | - Taner Z Sen
- Crop Improvement and Genetics Research Unit, USDA-ARS, Albany, CA, USA
| | - Margaret Staton
- Entomology and Plant Pathology, University of Tennessee Knoxville, Knoxville, TN, USA
| | | | | | - Victor Unda
- Horticulture, Washington State University, Pullman, WA, USA
| | - Deepak Unni
- Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
| | - Liya Wang
- Plant Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
| | - Doreen Ware
- USDA, Plant, Soil and Nutrition Research, Ithaca, NY, USA
- Plant Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
| | - Jill Wegrzyn
- Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
| | - Jason Williams
- Cold Spring Harbor Laboratory, DNA Learning Center, Cold Spring Harbor, NY, USA
| | - Margaret Woodhouse
- Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA, USA
| | - Jing Yu
- Horticulture, Washington State University, Pullman, WA, USA
| | - Doreen Main
- Horticulture, Washington State University, Pullman, WA, USA
| |
|
37
|
Deutsch EW, Orchard S, Binz PA, Bittremieux W, Eisenacher M, Hermjakob H, Kawano S, Lam H, Mayer G, Menschaert G, Perez-Riverol Y, Salek RM, Tabb DL, Tenzer S, Vizcaíno JA, Walzer M, Jones AR. Proteomics Standards Initiative: Fifteen Years of Progress and Future Work. J Proteome Res 2017; 16:4288-4298. [PMID: 28849660 PMCID: PMC5715286 DOI: 10.1021/acs.jproteome.7b00370] [Citation(s) in RCA: 68] [Impact Index Per Article: 9.7] [Received: 06/02/2017] [Indexed: 12/21/2022]
Abstract
The Proteomics Standards Initiative (PSI) of the Human Proteome Organization (HUPO) has now been developing and promoting open community standards and software tools in the field of proteomics for 15 years. Under the guidance of the chair, cochairs, and other leadership positions, the PSI working groups are tasked with the development and maintenance of community standards via special workshops and ongoing work. Among the existing ratified standards, the PSI working groups continue to update PSI-MI XML, MITAB, mzML, mzIdentML, mzQuantML, mzTab, and the MIAPE (Minimum Information About a Proteomics Experiment) guidelines with the advance of new technologies and techniques. Furthermore, new standards are currently either in the final stages of completion (proBed and proBAM for proteogenomics results as well as PEFF) or in early stages of design (a spectral library standard format, a universal spectrum identifier, the qcML quality control format, and the Protein Expression Interface (PROXI) web services Application Programming Interface). In this work we review the current status of all of these aspects of the PSI, describe synergies with other efforts such as the ProteomeXchange Consortium, the Human Proteome Project, and the metabolomics community, and provide a look at future directions of the PSI.
Affiliation(s)
- Eric W. Deutsch
- Institute for Systems Biology, Seattle, Washington 98109, United States
| | - Sandra Orchard
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Pierre-Alain Binz
- CHUV Centre Hospitalier Universitaire Vaudois, 1011 Lausanne, Switzerland
| | - Wout Bittremieux
- Department of Mathematics and Computer Science, University of Antwerp, Middelheimlaan 1, 2020 Antwerp, Belgium
| | - Martin Eisenacher
- Medizinisches Proteom Center (MPC), Ruhr-Universität Bochum, D-44801 Bochum, Germany
| | - Henning Hermjakob
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- State Key Laboratory of Proteomics, Beijing Proteome Research Center, Beijing Institute of Radiation Medicine, National Center for Protein Sciences, Beijing, Beijing 102206, China
| | - Shin Kawano
- Database Center for Life Science, Joint Support Center for Data Science Research, Research Organization of Information and Systems, Kashiwa, Chiba 277-0871, Japan
| | - Henry Lam
- Division of Biomedical Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong, P. R. China
- Department of Chemical and Biomolecular Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong, P. R. China
| | - Gerhard Mayer
- Medizinisches Proteom Center (MPC), Ruhr-Universität Bochum, D-44801 Bochum, Germany
| | - Gerben Menschaert
- Lab of Bioinformatics and Computational Genomics (BioBix), Faculty of Bioscience Engineering, Ghent University, 9000 Ghent, Belgium
| | - Yasset Perez-Riverol
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Reza M. Salek
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - David L. Tabb
- SA MRC Centre for TB Research, DST/NRF Centre of Excellence for Biomedical TB Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
| | - Stefan Tenzer
- Institute for Immunology, University Medical Center of the Johannes-Gutenberg University Mainz, 55131 Mainz, Germany
| | - Juan Antonio Vizcaíno
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Mathias Walzer
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
| | - Andrew R. Jones
- Institute of Integrative Biology, University of Liverpool, South Wirral L64 4AY, United Kingdom
| |
|
38
|
Jarnuczak AF, Vizcaíno JA. Using the PRIDE Database and ProteomeXchange for Submitting and Accessing Public Proteomics Datasets. Curr Protoc Bioinformatics 2017; 59:13.31.1-13.31.12. [PMID: 28902400 DOI: 10.1002/cpbi.30] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.9] [Indexed: 12/15/2022]
Abstract
The ProteomeXchange (PX) Consortium is the unifying framework for world-leading mass spectrometry (MS)-based proteomics repositories. Current members include the PRIDE database (U.K.), PeptideAtlas/PASSEL and MassIVE (U.S.A.), and jPOST (Japan). The Consortium standardizes submission and dissemination of public proteomics data worldwide. This is achieved by implementing common data submission guidelines and having each member enforce metadata requirements. Furthermore, the members use a common identifier space. Each dataset receives a unique (PXD) accession number and is publicly accessible as soon as the associated scientific publications are released. The two basic protocols provide a step-by-step guide on how to submit data to the PRIDE database, and describe how to access the PX portal (called ProteomeCentral), which can be used to search datasets available in any of the PX members. © 2017 by John Wiley & Sons, Inc.
Affiliation(s)
- Andrew F Jarnuczak
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
| | - Juan Antonio Vizcaíno
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
| |
|
39
|
Abstract
Ontologies are powerful and popular tools to encode data in a structured format and manage knowledge. A large variety of existing ontologies offer users access to biomedical knowledge. This chapter contains a short theoretical background of ontologies and introduces two notable examples: The Gene Ontology and the ontology for Biological Pathways Exchange. For both ontologies a short overview and working bioinformatic applications, i.e., Gene Ontology enrichment analyses and pathway data visualization, are provided.
Affiliation(s)
- Frank Kramer
- Department of Medical Statistics, University Medical Center Göttingen, Humboldtallee 32, 37073, Göttingen, Germany
| | - Tim Beißbarth
- Department of Medical Statistics, University Medical Center Göttingen, Humboldtallee 32, 37073, Göttingen, Germany
| |
|
40
|
Abstract
Many publicly available data repositories and resources have been developed to support protein-related information management, data-driven hypothesis generation, and biological knowledge discovery. To help researchers quickly find the appropriate protein-related informatics resources, we present a comprehensive review (with categorization and description) of major protein bioinformatics databases in this chapter. We also discuss the challenges and opportunities for developing next-generation protein bioinformatics databases and resources to support data integration and data analytics in the Big Data era.
Affiliation(s)
- Chuming Chen
- Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE, 19711, USA
| | - Hongzhan Huang
- Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE, 19711, USA
| | - Cathy H Wu
- Center for Bioinformatics and Computational Biology, Department of Computer and Information Sciences, University of Delaware, Newark, DE, 19711, USA
- Protein Information Resource, Department of Biochemistry and Molecular and Cellular Biology, Georgetown University Medical Center, Washington, DC, 20007, USA
| |
|
41
|
Okuda S, Watanabe Y, Moriya Y, Kawano S, Yamamoto T, Matsumoto M, Takami T, Kobayashi D, Araki N, Yoshizawa AC, Tabata T, Sugiyama N, Goto S, Ishihama Y. jPOSTrepo: an international standard data repository for proteomes. Nucleic Acids Res 2016; 45:D1107-D1111. [PMID: 27899654 PMCID: PMC5210561 DOI: 10.1093/nar/gkw1080] [Citation(s) in RCA: 404] [Impact Index Per Article: 50.5] [Received: 08/14/2016] [Revised: 10/21/2016] [Accepted: 10/27/2016] [Indexed: 12/17/2022] Open
Abstract
Major advancements have recently been made in mass spectrometry-based proteomics, yielding an increasing number of datasets from various proteomics projects worldwide. In order to facilitate the sharing and reuse of promising datasets, it is important to construct appropriate, high-quality public data repositories. jPOSTrepo (https://repository.jpostdb.org/) has successfully implemented several unique features, including high-speed file uploading, flexible file management and easy-to-use interfaces. This repository has been launched as a public repository containing various proteomic datasets and is available for researchers worldwide. In addition, our repository has joined the ProteomeXchange consortium, which includes the most popular public repositories such as PRIDE in Europe for MS/MS datasets and PASSEL for SRM datasets in the USA. Later MassIVE was introduced in the USA and accepted into the ProteomeXchange, as was our repository in July 2016, providing important datasets from Asia/Oceania. Accordingly, this repository contributes to a global alliance to share and store all datasets from a wide variety of proteomics experiments. The repository is therefore expected to become a major repository, particularly for data collected in the Asia/Oceania region.
Affiliation(s)
- Shujiro Okuda
- Niigata University Graduate School of Medical and Dental Sciences, Niigata 951-8510, Japan
| | - Yu Watanabe
- Niigata University Graduate School of Medical and Dental Sciences, Niigata 951-8510, Japan
| | - Yuki Moriya
- Database Center for Life Science, Joint Support-Center for Data Science Research, Research Organization of Information and Systems, Kashiwa 277-0871, Japan
| | - Shin Kawano
- Database Center for Life Science, Joint Support-Center for Data Science Research, Research Organization of Information and Systems, Kashiwa 277-0871, Japan
| | - Tadashi Yamamoto
- Biofluid Biomarker Center, Institute for Social Innovation and Cooperation, Niigata University, Niigata 950-2181, Japan
| | - Masaki Matsumoto
- Medical Institute of Bioregulation, Kyushu University, Fukuoka 812-8582, Japan
| | - Tomoyo Takami
- Medical Institute of Bioregulation, Kyushu University, Fukuoka 812-8582, Japan
| | - Daiki Kobayashi
- Graduate School of Medical Sciences, Faculty of Life Sciences, Kumamoto University, Kumamoto 860-8556, Japan
| | - Norie Araki
- Graduate School of Medical Sciences, Faculty of Life Sciences, Kumamoto University, Kumamoto 860-8556, Japan
| | - Akiyasu C Yoshizawa
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto 611-0011, Japan
| | - Tsuyoshi Tabata
- Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto 606-8501, Japan
| | - Naoyuki Sugiyama
- Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto 606-8501, Japan
| | - Susumu Goto
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto 611-0011, Japan
| | - Yasushi Ishihama
- Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto 606-8501, Japan
| |
|
42
|
Ravagli C, Pognan F, Marc P. OntoBrowser: a collaborative tool for curation of ontologies by subject matter experts. Bioinformatics 2016; 33:148-149. [PMID: 27605099 PMCID: PMC5408772 DOI: 10.1093/bioinformatics/btw579] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 11/05/2015] [Revised: 07/08/2016] [Accepted: 09/02/2016] [Indexed: 11/16/2022] Open
Abstract
Summary: The lack of controlled terminology and ontology usage leads to incomplete search results and poor interoperability between databases. One of the major underlying challenges of data integration is curating data to adhere to controlled terminologies and/or ontologies. Finding subject matter experts with the time and skills required to perform data curation is often problematic. In addition, existing tools are not designed for continuous data integration and collaborative curation. This results in time-consuming curation workflows that often become unsustainable. The primary objective of OntoBrowser is to provide an easy-to-use online collaborative solution for subject matter experts to map reported terms to preferred ontology (or code list) terms and facilitate ontology evolution. Additional features include web service access to data, visualization of ontologies in hierarchical/graph format and a peer review/approval workflow with alerting. Availability and implementation: The source code is freely available under the Apache v2.0 license. Source code and installation instructions are available at http://opensource.nibr.com. This software is designed to run on a Java EE application server and store data in a relational database.
Affiliation(s)
- Carlo Ravagli
- PreClinical Safety, Translational Sciences, Novartis Institute for Biomedical Research, Basel, CH-4002, Switzerland
| | - Francois Pognan
- PreClinical Safety, Translational Sciences, Novartis Institute for Biomedical Research, Basel, CH-4002, Switzerland
| | - Philippe Marc
- PreClinical Safety, Translational Sciences, Novartis Institute for Biomedical Research, Basel, CH-4002, Switzerland
| |
|
43
|
Name-calling in the hippocampus (and beyond): coming to terms with neuron types and properties. Brain Inform 2016; 4:1-12. [PMID: 27747821 PMCID: PMC5319951 DOI: 10.1007/s40708-016-0053-3] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 03/11/2016] [Accepted: 05/24/2016] [Indexed: 01/25/2023] Open
Abstract
Widely spread naming inconsistencies in neuroscience pose a vexing obstacle to effective communication within and across areas of expertise. This problem is particularly acute when identifying neuron types and their properties. Hippocampome.org is a web-accessible neuroinformatics resource that organizes existing data about essential properties of all known neuron types in the rodent hippocampal formation. Hippocampome.org links evidence supporting the assignment of a property to a type with direct pointers to quotes and figures. Mining this knowledge from peer-reviewed reports reveals the troubling extent of terminological ambiguity and undefined terms. Examples span simple cases of using multiple synonyms and acronyms for the same molecular biomarkers (or other property) to more complex cases of neuronal naming. New publications often use different terms without mapping them to previous terms. As a result, neurons of the same type are assigned disparate names, while neurons of different types are bestowed the same name. Furthermore, non-unique properties are frequently used as names, and several neuron types are not named at all. In order to alleviate this nomenclature confusion regarding hippocampal neuron types and properties, we introduce a new functionality of Hippocampome.org: a fully searchable, curated catalog of human and machine-readable definitions, each linked to the corresponding neuron and property terms. Furthermore, we extend our robust approach to providing each neuron type with an informative name and unique identifier by mapping all encountered synonyms and homonyms.
|
44
|
Boldt K, van Reeuwijk J, Lu Q, Koutroumpas K, Nguyen TMT, Texier Y, van Beersum SEC, Horn N, Willer JR, Mans DA, Dougherty G, Lamers IJC, Coene KLM, Arts HH, Betts MJ, Beyer T, Bolat E, Gloeckner CJ, Haidari K, Hetterschijt L, Iaconis D, Jenkins D, Klose F, Knapp B, Latour B, Letteboer SJF, Marcelis CL, Mitic D, Morleo M, Oud MM, Riemersma M, Rix S, Terhal PA, Toedt G, van Dam TJP, de Vrieze E, Wissinger Y, Wu KM, Apic G, Beales PL, Blacque OE, Gibson TJ, Huynen MA, Katsanis N, Kremer H, Omran H, van Wijk E, Wolfrum U, Kepes F, Davis EE, Franco B, Giles RH, Ueffing M, Russell RB, Roepman R. An organelle-specific protein landscape identifies novel diseases and molecular mechanisms. Nat Commun 2016; 7:11491. [PMID: 27173435 PMCID: PMC4869170 DOI: 10.1038/ncomms11491] [Citation(s) in RCA: 183] [Impact Index Per Article: 22.9] [Received: 02/15/2016] [Accepted: 04/01/2016] [Indexed: 01/12/2023] Open
Abstract
Cellular organelles provide opportunities to relate biological mechanisms to disease. Here we use affinity proteomics, genetics and cell biology to interrogate cilia: poorly understood organelles, where defects cause genetic diseases. Two hundred and seventeen tagged human ciliary proteins create a final landscape of 1,319 proteins, 4,905 interactions and 52 complexes. Reverse tagging, repetition of purifications and statistical analyses, produce a high-resolution network that reveals organelle-specific interactions and complexes not apparent in larger studies, and links vesicle transport, the cytoskeleton, signalling and ubiquitination to ciliary signalling and proteostasis. We observe sub-complexes in exocyst and intraflagellar transport complexes, which we validate biochemically, and by probing structurally predicted, disruptive, genetic variants from ciliary disease patients. The landscape suggests other genetic diseases could be ciliary including 3M syndrome. We show that 3M genes are involved in ciliogenesis, and that patient fibroblasts lack cilia. Overall, this organelle-specific targeting strategy shows considerable promise for Systems Medicine. Mutations in proteins that localize to primary cilia cause devastating diseases, yet the primary cilium is a poorly understood organelle. Here the authors use interaction proteomics to identify a network of human ciliary proteins that provides new insights into several biological processes and diseases.
Affiliation(s)
- Karsten Boldt
- Medical Proteome Center, Institute for Ophthalmic Research, University of Tuebingen, 72074 Tuebingen, Germany
| | - Jeroen van Reeuwijk
- Department of Human Genetics and Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Geert Grooteplein Zuid 10, 6525 GA Nijmegen, The Netherlands
| | - Qianhao Lu
- Biochemie Zentrum Heidelberg (BZH), University of Heidelberg, Im Neuenheimer Feld 328, 69120 Heidelberg, Germany; Cell Networks, Bioquant, Ruprecht-Karl University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany
| | - Konstantinos Koutroumpas
- Institute of Systems and Synthetic Biology, Genopole, CNRS, Université d'Evry, 91030 Evry, France
| | - Thanh-Minh T Nguyen
- Department of Human Genetics and Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Geert Grooteplein Zuid 10, 6525 GA Nijmegen, The Netherlands
| | - Yves Texier
- Medical Proteome Center, Institute for Ophthalmic Research, University of Tuebingen, 72074 Tuebingen, Germany; Department of Molecular Epigenetics, Helmholtz Center Munich, Center for Integrated Protein Science, 81377 Munich, Germany
| | - Sylvia E C van Beersum
- Department of Human Genetics and Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Geert Grooteplein Zuid 10, 6525 GA Nijmegen, The Netherlands
| | - Nicola Horn
- Medical Proteome Center, Institute for Ophthalmic Research, University of Tuebingen, 72074 Tuebingen, Germany
| | - Jason R Willer
- Center for Human Disease Modeling, Duke University, Durham, North Carolina 27701, USA
| | - Dorus A Mans
- Department of Human Genetics and Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Geert Grooteplein Zuid 10, 6525 GA Nijmegen, The Netherlands
| | - Gerard Dougherty
- Department of General Pediatrics, University Children's Hospital Muenster, 48149 Muenster, Germany
| | - Ideke J C Lamers
- Department of Human Genetics and Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Geert Grooteplein Zuid 10, 6525 GA Nijmegen, The Netherlands
| | - Karlien L M Coene
- Department of Human Genetics and Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Geert Grooteplein Zuid 10, 6525 GA Nijmegen, The Netherlands
| | - Heleen H Arts
- Department of Human Genetics and Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Geert Grooteplein Zuid 10, 6525 GA Nijmegen, The Netherlands
| | - Matthew J Betts
- Biochemie Zentrum Heidelberg (BZH), University of Heidelberg, Im Neuenheimer Feld 328, 69120 Heidelberg, Germany; Cell Networks, Bioquant, Ruprecht-Karl University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany
| | - Tina Beyer
- Medical Proteome Center, Institute for Ophthalmic Research, University of Tuebingen, 72074 Tuebingen, Germany
| | - Emine Bolat
- Department of Human Genetics and Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Geert Grooteplein Zuid 10, 6525 GA Nijmegen, The Netherlands
| | - Christian Johannes Gloeckner
- German Center for Neurodegenerative Diseases (DZNE) within the Helmholtz Association, Otfried-Müller Strasse 23, 72076 Tuebingen, Germany
| | - Khatera Haidari
- Department of Nephrology and Hypertension, Regenerative Medicine Center, University Medical Center Utrecht, 3584 CT Utrecht, The Netherlands
| | - Lisette Hetterschijt
- Department of Otorhinolaryngology and Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Geert Grooteplein Zuid 10, 6525 GA Nijmegen, The Netherlands
| | - Daniela Iaconis
- Telethon Institute of Genetics and Medicine, TIGEM 80078, Italy
| | - Dagan Jenkins
- Molecular Medicine Unit and Birth Defects Research Centre, UCL Institute of Child Health, London, WC1N 1EH, UK
| | - Franziska Klose
- Medical Proteome Center, Institute for Ophthalmic Research, University of Tuebingen, 72074 Tuebingen, Germany
| | - Barbara Knapp
- Cell and Matrix Biology, Inst. of Zoology, Johannes Gutenberg University of Mainz, 55122 Mainz, Germany
| | - Brooke Latour
- Department of Human Genetics and Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Geert Grooteplein Zuid 10, 6525 GA Nijmegen, The Netherlands
| | - Stef J F Letteboer
- Department of Human Genetics and Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Geert Grooteplein Zuid 10, 6525 GA Nijmegen, The Netherlands
| | - Carlo L Marcelis
- Department of Human Genetics and Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Geert Grooteplein Zuid 10, 6525 GA Nijmegen, The Netherlands
| | - Dragana Mitic
- Cambridge Cell Networks Ltd, St John's Innovation Centre, Cowley Road, Cambridge, CB4 0WS, UK
| | - Manuela Morleo
- Telethon Institute of Genetics and Medicine, TIGEM 80078, Italy; Department of Translational Medicine Federico II University, 80131 Naples, Italy
| | - Machteld M Oud
- Department of Human Genetics and Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Geert Grooteplein Zuid 10, 6525 GA Nijmegen, The Netherlands
| | - Moniek Riemersma
- Department of Human Genetics and Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Geert Grooteplein Zuid 10, 6525 GA Nijmegen, The Netherlands
| | - Susan Rix
- Molecular Medicine Unit and Birth Defects Research Centre, UCL Institute of Child Health, London, WC1N 1EH, UK
| | - Paulien A Terhal
- Department of Genetics, University Medical Center Utrecht, 3584 CX Utrecht, The Netherlands
| | - Grischa Toedt
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Meyerhofstrasse 1, 69117 Heidelberg, Germany
| | - Teunis J P van Dam
- Centre for Molecular and Biomolecular Informatics and Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Geert Grooteplein Zuid 26-28, 6525 GA Nijmegen, The Netherlands
| | - Erik de Vrieze
- Department of Otorhinolaryngology and Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Geert Grooteplein Zuid 10, 6525 GA Nijmegen, The Netherlands
| | - Yasmin Wissinger
- Medical Proteome Center, Institute for Ophthalmic Research, University of Tuebingen, 72074 Tuebingen, Germany
| | - Ka Man Wu
- Department of Human Genetics and Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Geert Grooteplein Zuid 10, 6525 GA Nijmegen, The Netherlands
| | - Gordana Apic
- Cambridge Cell Networks Ltd, St John's Innovation Centre, Cowley Road, Cambridge, CB4 0WS, UK
| | - Philip L Beales
- Molecular Medicine Unit and Birth Defects Research Centre, UCL Institute of Child Health, London, WC1N 1EH, UK
| | - Oliver E Blacque
- School of Biomolecular & Biomed Science, Conway Institute, University College Dublin, Dublin 4, Ireland
| | - Toby J Gibson
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Meyerhofstrasse 1, 69117 Heidelberg, Germany
| | - Martijn A Huynen
- Centre for Molecular and Biomolecular Informatics and Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Geert Grooteplein Zuid 26-28, 6525 GA Nijmegen, The Netherlands
| | - Nicholas Katsanis
- Center for Human Disease Modeling, Duke University, Durham, North Carolina 27701, USA
| | - Hannie Kremer
- Department of Otorhinolaryngology and Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Geert Grooteplein Zuid 10, 6525 GA Nijmegen, The Netherlands
| | - Heymut Omran
- Department of General Pediatrics, University Children's Hospital Muenster, 48149 Muenster, Germany
| | - Erwin van Wijk
- Department of Otorhinolaryngology and Donders Institute for Brain, Cognition and Behaviour, Radboud University Medical Center, Geert Grooteplein Zuid 10, 6525 GA Nijmegen, The Netherlands
| | - Uwe Wolfrum
- Cell and Matrix Biology, Inst. of Zoology, Johannes Gutenberg University of Mainz, 55122 Mainz, Germany
| | - François Kepes
- Institute of Systems and Synthetic Biology, Genopole, CNRS, Université d'Evry, 91030 Evry, France
| | - Erica E Davis
- Center for Human Disease Modeling, Duke University, Durham, North Carolina 27701, USA
| | - Brunella Franco
- Telethon Institute of Genetics and Medicine, TIGEM 80078, Italy; Department of Translational Medicine Federico II University, 80131 Naples, Italy
| | - Rachel H Giles
- Department of Nephrology and Hypertension, Regenerative Medicine Center, University Medical Center Utrecht, 3584 CT Utrecht, The Netherlands
| | - Marius Ueffing
- Medical Proteome Center, Institute for Ophthalmic Research, University of Tuebingen, 72074 Tuebingen, Germany
| | - Robert B Russell
- Biochemie Zentrum Heidelberg (BZH), University of Heidelberg, Im Neuenheimer Feld 328, 69120 Heidelberg, Germany; Cell Networks, Bioquant, Ruprecht-Karl University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany
| | - Ronald Roepman
- Department of Human Genetics and Radboud Institute for Molecular Life Sciences, Radboud University Medical Center, Geert Grooteplein Zuid 10, 6525 GA Nijmegen, The Netherlands
| | | |
|
45
|
Rodríguez-Iglesias A, Rodríguez-González A, Irvine AG, Sesma A, Urban M, Hammond-Kosack KE, Wilkinson MD. Publishing FAIR Data: An Exemplar Methodology Utilizing PHI-Base. Front Plant Sci 2016; 7:641. [PMID: 27433158 PMCID: PMC4922217 DOI: 10.3389/fpls.2016.00641] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 01/20/2016] [Accepted: 04/26/2016] [Indexed: 06/06/2023]
Abstract
Pathogen-Host interaction data is core to our understanding of disease processes and their molecular/genetic bases. Facile access to such core data is particularly important for the plant sciences, where individual genetic and phenotypic observations have the added complexity of being dispersed over a wide diversity of plant species vs. the relatively fewer host species of interest to biomedical researchers. Recently, an international initiative interested in scholarly data publishing proposed that all scientific data should be "FAIR": Findable, Accessible, Interoperable, and Reusable. In this work, we describe the process of migrating a database of notable relevance to the plant sciences, the Pathogen-Host Interaction Database (PHI-base), to a form that conforms to each of the FAIR Principles. We discuss the technical and architectural decisions, and the migration pathway, including observations of the difficulty and/or fidelity of each step. We examine how multiple FAIR principles can be addressed simultaneously through careful design decisions, including making data FAIR for both humans and machines with minimal duplication of effort. We note how FAIR data publishing involves more than data reformatting, requiring features beyond those exhibited by most life science Semantic Web or Linked Data resources. We explore the value added by completing this FAIR data transformation, and then test the result through integrative questions that could not easily be asked over traditional Web-based data resources. Finally, we demonstrate the utility of providing explicit and reliable access to provenance information, which we argue enhances citation rates by encouraging and facilitating transparent scholarly reuse of these valuable data holdings.
Affiliation(s)
| | | | - Alistair G. Irvine
- Department of Computational and Systems Biology, Rothamsted Research, Harpenden, UK
| | - Ane Sesma
- Center for Plant Biotechnology and Genomics, Universidad Politécnica de Madrid, Madrid, Spain
| | - Martin Urban
- Department of Plant Biology and Crop Science, Rothamsted Research, Harpenden, UK
| | | | - Mark D. Wilkinson
- Center for Plant Biotechnology and Genomics, Universidad Politécnica de Madrid, Madrid, Spain
| |
|
46
|
Mulder N, Nembaware V, Adekile A, Anie KA, Inusa B, Brown B, Campbell A, Chinenere F, Chunda-Liyoka C, Derebail VK, Geard A, Ghedira K, Hamilton CM, Hanchard NA, Haendel M, Huggins W, Ibrahim M, Jupp S, Kamga KK, Knight-Madden J, Lopez-Sall P, Mbiyavanga M, Munube D, Nirenberg D, Nnodu O, Ofori-Acquah SF, Ohene-Frempong K, Opap KB, Panji S, Park M, Pule G, Royal C, Sangeda R, Tayo B, Treadwell M, Tshilolo L, Wonkam A. Proceedings of a Sickle Cell Disease Ontology workshop - Towards the first comprehensive ontology for Sickle Cell Disease. Appl Transl Genom 2016; 9:23-9. [PMID: 27354937 PMCID: PMC4911424 DOI: 10.1016/j.atg.2016.03.005] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 03/01/2016] [Revised: 03/11/2016] [Accepted: 03/11/2016] [Indexed: 11/20/2022]
Abstract
Sickle cell disease (SCD) is a debilitating single gene disorder caused by a single point mutation that results in physical deformation (i.e. sickling) of erythrocytes at reduced oxygen tensions. Up to 75% of SCD in newborns world-wide occurs in sub-Saharan Africa, where neonatal and childhood mortality from sickle cell related complications is high. While SCD research across the globe is tackling the disease on multiple fronts, advances have yet to significantly impact the health and quality of life of SCD patients, due to lack of coordination of these disparate efforts. Ensuring that data across studies are directly comparable through standardization is a necessary step towards realizing this goal. Such standardization requires the development and implementation of a disease-specific ontology for SCD that is applicable globally. Ontology development is best achieved by bringing together experts in the domain to contribute their knowledge. The SCD community and H3ABioNet members joined forces at a recent SCD Ontology workshop to develop an ontology covering aspects of SCD under the classes: phenotype, diagnostics, therapeutics, quality of life, disease modifiers and disease stage. The aim of the workshop was for participants to contribute their expertise to the development of the structure and contents of the SCD ontology. Here we describe the proceedings of the Sickle Cell Disease Ontology Workshop held in Cape Town, South Africa, in February 2016 and its outcomes. The objective of the workshop was to bring together experts in SCD from around the world to contribute their expertise to the development of various aspects of the SCD ontology.
Affiliation(s)
- Nicola Mulder
  - H3ABioNet Consortium, Computational Biology Group, Department of Integrative Biomedical Sciences, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, South Africa
- Victoria Nembaware
  - H3ABioNet Consortium, Computational Biology Group, Department of Integrative Biomedical Sciences, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, South Africa
- Adekunle Adekile
  - Department of Pediatrics, Faculty of Medicine, Kuwait University, Kuwait City, Kuwait
- Kofi A Anie
  - London North West Healthcare NHS Trust & Imperial College London, London, United Kingdom
- Baba Inusa
  - Evelina Children's Hospital, Guy's and St Thomas NHS Trust, London, United Kingdom
- Biobele Brown
  - Department of Paediatrics, College of Medicine, University of Ibadan, Ibadan, Nigeria
- Andrew Campbell
  - Pediatric Hematology/Oncology and Center for Human Growth and Development, University of Michigan, Ann Arbor, MI, United States
- Catherine Chunda-Liyoka
  - University Teaching Hospital (UTH), Lusaka, Zambia; University of Zambia (UNZA) School of Medicine, Lusaka, Zambia
- Vimal K Derebail
  - Division of Nephrology and Hypertension, Department of Medicine, UNC Kidney Center, University of North Carolina at Chapel Hill, NC, United States
- Amy Geard
  - Division of Human Genetics, Department of Clinical Laboratory Sciences, National Health Laboratory Service and University of Cape Town, 7925, South Africa
- Kais Ghedira
  - Université de Tunis El Manar, Institut Pasteur de Tunis, LR11IPT06 Laboratory of medical parasitology, biotechnologies and biomolecules, Group of Bioinformatics and mathematical modeling, Tunis, Tunisia
- Carol M Hamilton
  - Research Computing Division, RTI International, Research Triangle Park, NC, United States
- Neil A Hanchard
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, United States
- Melissa Haendel
  - Oregon Health and Science University, Portland, OR, United States
- Wayne Huggins
  - Research Computing Division, RTI International, Research Triangle Park, NC, United States
- Simon Jupp
  - European Bioinformatics Institute, London, United Kingdom
- Philomène Lopez-Sall
  - Department of Pharmacy, Biochemistry Unit, Cheikh Anta Diop University, Dakar, Senegal
- Mamana Mbiyavanga
  - H3ABioNet Consortium, Computational Biology Group, Department of Integrative Biomedical Sciences, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, South Africa
- Deogratias Munube
  - Department of Paediatrics and Child Health, College of Health Sciences, Makerere University/Mulago Hospital, Kampala, Uganda
- Damian Nirenberg
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, United States
- Obiageli Nnodu
  - Centre of Excellence for Sickle Cell Disease Research and Training, University of Abuja, Abuja, Nigeria
- Solomon Fiifi Ofori-Acquah
  - Center for Translational and International Hematology, Vascular Medicine Institute, University of Pittsburgh, Pittsburgh, PA, United States
- Kenneth Babu Opap
  - H3ABioNet Consortium, Computational Biology Group, Department of Integrative Biomedical Sciences, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, South Africa
- Sumir Panji
  - H3ABioNet Consortium, Computational Biology Group, Department of Integrative Biomedical Sciences, Institute of Infectious Disease and Molecular Medicine, Faculty of Health Sciences, University of Cape Town, South Africa
- Miriam Park
  - Instituto da Criança, Hospital das Clínicas, São Paulo Medical School, University of São Paulo, Brazil
- Gift Pule
  - Division of Human Genetics, Department of Clinical Laboratory Sciences, National Health Laboratory Service and University of Cape Town, 7925, South Africa
- Bamidele Tayo
  - Loyola University Chicago, Chicago, IL, United States
- Marsha Treadwell
  - UCSF Benioff Children's Hospital Oakland, Oakland, CA, United States
- Ambroise Wonkam
  - Division of Human Genetics, Department of Clinical Laboratory Sciences, National Health Laboratory Service and University of Cape Town, 7925, South Africa
|
47
|
Li S, Besson S, Blackburn C, Carroll M, Ferguson RK, Flynn H, Gillen K, Leigh R, Lindner D, Linkert M, Moore WJ, Ramalingam B, Rozbicki E, Rustici G, Tarkowska A, Walczysko P, Williams E, Allan C, Burel JM, Moore J, Swedlow JR. Metadata management for high content screening in OMERO. Methods 2016; 96:27-32. [PMID: 26476368 PMCID: PMC4773399 DOI: 10.1016/j.ymeth.2015.10.006] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Received: 08/26/2015] [Accepted: 10/13/2015] [Indexed: 01/18/2023] Open
Abstract
High content screening (HCS) experiments create a classic data management challenge: multiple, large sets of heterogeneous structured and unstructured data that must be integrated and linked to produce a set of "final" results. These different data include images, reagents, protocols, analytic output, and phenotypes, all of which must be stored, linked and made accessible for users, scientists, collaborators and, where appropriate, the wider community. The OME Consortium has built several open source tools for managing, linking and sharing these different types of data. The OME Data Model is a metadata specification that supports the image data and metadata recorded in HCS experiments. Bio-Formats is a Java library that reads recorded image data and metadata and includes support for several HCS screening systems. OMERO is an enterprise data management application that integrates image data, experimental and analytic metadata and makes them accessible for visualization, mining, sharing and downstream analysis. We discuss how Bio-Formats and OMERO handle these different data types, and how they can be used to integrate, link and share HCS experiments in facilities and public data repositories. OME specifications and software are open source and are available at https://www.openmicroscopy.org.
Affiliation(s)
- Simon Li
  - Centre for Gene Regulation & Expression, University of Dundee, Dundee, Scotland, UK
- Sébastien Besson
  - Centre for Gene Regulation & Expression, University of Dundee, Dundee, Scotland, UK
- Colin Blackburn
  - Centre for Gene Regulation & Expression, University of Dundee, Dundee, Scotland, UK
- Mark Carroll
  - Centre for Gene Regulation & Expression, University of Dundee, Dundee, Scotland, UK
- Richard K Ferguson
  - Centre for Gene Regulation & Expression, University of Dundee, Dundee, Scotland, UK
- Helen Flynn
  - Centre for Gene Regulation & Expression, University of Dundee, Dundee, Scotland, UK
- Kenneth Gillen
  - Centre for Gene Regulation & Expression, University of Dundee, Dundee, Scotland, UK
- Roger Leigh
  - Centre for Gene Regulation & Expression, University of Dundee, Dundee, Scotland, UK
- Dominik Lindner
  - Centre for Gene Regulation & Expression, University of Dundee, Dundee, Scotland, UK
- William J Moore
  - Centre for Gene Regulation & Expression, University of Dundee, Dundee, Scotland, UK
- Balaji Ramalingam
  - Centre for Gene Regulation & Expression, University of Dundee, Dundee, Scotland, UK
- Gabriella Rustici
  - Centre for Gene Regulation & Expression, University of Dundee, Dundee, Scotland, UK
- Aleksandra Tarkowska
  - Centre for Gene Regulation & Expression, University of Dundee, Dundee, Scotland, UK
- Petr Walczysko
  - Centre for Gene Regulation & Expression, University of Dundee, Dundee, Scotland, UK
- Eleanor Williams
  - Centre for Gene Regulation & Expression, University of Dundee, Dundee, Scotland, UK
- Jean-Marie Burel
  - Centre for Gene Regulation & Expression, University of Dundee, Dundee, Scotland, UK
- Josh Moore
  - Centre for Gene Regulation & Expression, University of Dundee, Dundee, Scotland, UK; Glencoe Software, Inc., Seattle, WA, USA
- Jason R Swedlow
  - Centre for Gene Regulation & Expression, University of Dundee, Dundee, Scotland, UK; Glencoe Software, Inc., Seattle, WA, USA
|
48
|
|
49
|
ten Hoopen P, Amid C, Buttigieg PL, Pafilis E, Bravakos P, Cerdeño-Tárraga AM, Gibson R, Kahlke T, Legaki A, Narayana Murthy K, Papastefanou G, Pereira E, Rossello M, Luisa Toribio A, Cochrane G. Value, but high costs in post-deposition data curation. Database: The Journal of Biological Databases and Curation 2016; 2016:bav126. [PMID: 26861660 PMCID: PMC4747322 DOI: 10.1093/database/bav126] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 10/16/2015] [Accepted: 12/14/2015] [Indexed: 12/26/2022]
Abstract
Discoverability of sequence data in primary data archives is proportional to the richness of contextual information associated with the data. Here, we describe an exercise in the improvement of contextual information surrounding sample records associated with metagenomics sequence reads available in the European Nucleotide Archive. We outline the annotation process and summarize findings of this effort aimed at increasing usability of publicly available environmental data. Furthermore, we emphasize the benefits of such an exercise and detail its costs. We conclude that such a third party annotation approach is expensive and has value as an element of curation, but should form only part of a more sustainable submitter-driven approach. Database URL: http://www.ebi.ac.uk/ena
Affiliation(s)
- Petra ten Hoopen
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Clara Amid
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Pier Luigi Buttigieg
  - Alfred-Wegener-Institut Helmholtz-Zentrum für Polar- und Meeresforschung, Am Handelshafen 12, Bremerhaven 27570, Germany
- Evangelos Pafilis
  - Institute of Marine Biology, Biotechnology and Aquaculture, Hellenic Centre for Marine Research, Heraklion, Crete, P.O. Box 2214 71003, Greece
- Panos Bravakos
  - Institute of Marine Biology, Biotechnology and Aquaculture, Hellenic Centre for Marine Research, Heraklion, Crete, P.O. Box 2214 71003, Greece
- Ana M Cerdeño-Tárraga
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Richard Gibson
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Tim Kahlke
  - CSIRO Marine and Atmospheric Research, Castray Esplanade, Hobart TAS 7001, Australia
- Aglaia Legaki
  - Institute of Marine Biology, Biotechnology and Aquaculture, Hellenic Centre for Marine Research, Heraklion, Crete, P.O. Box 2214 71003, Greece
- Kada Narayana Murthy
  - Pondicherry University, Brookshabad Campus, Andaman and Nicobar Islands, Port Blair 744112, India
- Gabriella Papastefanou
  - Institute of Marine Biology, Biotechnology and Aquaculture, Hellenic Centre for Marine Research, Heraklion, Crete, P.O. Box 2214 71003, Greece
- Emiliano Pereira
  - Max Planck Institute for Marine Microbial Ecology, Microbial Genomics and Bioinformatics Group, Celsiusstr. 1, Bremen 28359, Germany
- Marc Rossello
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Ana Luisa Toribio
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Guy Cochrane
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
|
50
|
Semantic interrogation of a multi knowledge domain ontological model of tendinopathy identifies four strong candidate risk genes. Sci Rep 2016; 6:19820. [PMID: 26804977 PMCID: PMC4726433 DOI: 10.1038/srep19820] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 08/18/2015] [Accepted: 11/11/2015] [Indexed: 01/10/2023] Open
Abstract
Tendinopathy is a multifactorial syndrome characterised by tendon pain and thickening, and impaired performance during activity. Candidate gene association studies have identified genetic factors that contribute to intrinsic risk of developing tendinopathy upon exposure to extrinsic factors. Bioinformatics approaches that data-mine existing knowledge for biological relationships may assist with the identification of candidate genes. The aim of this study was to data-mine functional annotation of human genes and identify candidate genes by ontology-seeded queries capturing the features of tendinopathy. Our BioOntological Relationship Graph database (BORG) integrates multiple sources of genomic and biomedical knowledge into an on-disk semantic network where human genes and their orthologs in mouse and rat are central concepts mapped to ontology terms. The BORG was used to screen all human genes for potential links to tendinopathy. Following further prioritisation, four strong candidate genes (COL11A2, ELN, ITGB3, LOX) were identified. These genes are differentially expressed in tendinopathy, functionally linked to features of tendinopathy and previously implicated in other connective tissue diseases. In conclusion, cross-domain semantic integration of multiple sources of biomedical knowledge, and interrogation of phenotypes and gene functions associated with disease, may significantly increase the probability of identifying strong and unobvious candidate genes in genetic association studies.
|