Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Dumontier M, Baker CJ, Baran J, Callahan A, Chepelev L, Cruz-Toledo J, Del Rio NR, Duck G, Furlong LI, Keath N, Klassen D, McCusker JP, Queralt-Rosinach N, Samwald M, Villanueva-Rosales N, Wilkinson MD, Hoehndorf R. The Semanticscience Integrated Ontology (SIO) for biomedical research and knowledge discovery. J Biomed Semantics 2014;5:14. [PMID: 24602174 PMCID: PMC4015691 DOI: 10.1186/2041-1480-5-14] [Citation(s) in RCA: 77] [Impact Index Per Article: 7.7] [Received: 07/02/2013] [Accepted: 02/02/2014] [Indexed: 11/10/2022] Open


Cited by Other Article(s)

Lapins M, Arvidsson S, Lampa S, Berg A, Schaal W, Alvarsson J, Spjuth O. A confidence predictor for logD using conformal regression and a support-vector machine. J Cheminform 2018;10:17. [PMID: 29616425 PMCID: PMC5882484 DOI: 10.1186/s13321-018-0271-1] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 12/21/2017] [Accepted: 03/25/2018] [Indexed: 02/03/2023] Open

Abstract

Lipophilicity is a major determinant of ADMET properties and overall suitability of drug candidates. We have developed large-scale models to predict water–octanol distribution coefficient (logD) for chemical compounds, aiding drug discovery projects. Using ACD/logD data for 1.6 million compounds from the ChEMBL database, models are created and evaluated by a support-vector machine with a linear kernel using conformal prediction methodology, outputting prediction intervals at a specified confidence level. The resulting model shows a predictive ability of Q² = 0.973 and with the best performing nonconformity measure having median prediction interval of ± 0.39 log units at 80% confidence and ± 0.60 log units at 90% confidence. The model is available as an online service via an OpenAPI interface, a web page with a molecular editor, and we also publish predictive values at 90% confidence level for 91 M PubChem structures in RDF format for download and as an URI resolver service.
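
As an illustration of how a conformal regressor can turn point predictions into intervals at a chosen confidence level, the following is a minimal sketch of split (inductive) conformal prediction. It is not the authors' implementation: it assumes the simplest nonconformity measure (the absolute residual on a held-out calibration set), whereas the abstract refers to several nonconformity measures, and the function names and the toy residuals are hypothetical.

```python
import math


def conformal_half_width(calibration_residuals, confidence):
    """Return the interval half-width for the requested confidence level.

    Nonconformity score (an assumption of this sketch): |y_true - y_pred|
    on a calibration set that was not used to fit the underlying model.
    """
    scores = sorted(abs(r) for r in calibration_residuals)
    n = len(scores)
    # Rank of the calibration score that bounds the interval; the small
    # epsilon guards against floating-point noise in (n + 1) * confidence.
    rank = math.ceil((n + 1) * confidence - 1e-12)
    if rank > n:
        # Too few calibration examples to support this confidence level.
        return math.inf
    return scores[rank - 1]


def prediction_interval(point_prediction, half_width):
    """Symmetric interval around a point prediction."""
    return (point_prediction - half_width, point_prediction + half_width)


# Toy calibration residuals (hypothetical values, in log units).
residuals = [0.1, -0.2, 0.3, -0.4, 0.5, 0.6, -0.7, 0.8, 0.9]
width_80 = conformal_half_width(residuals, 0.80)
width_90 = conformal_half_width(residuals, 0.90)
interval_80 = prediction_interval(2.0, width_80)
```

A higher confidence level selects a larger calibration score, which is why the reported median interval widens from 80% to 90% confidence.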

Xin J, Afrasiabi C, Lelong S, Adesara J, Tsueng G, Su AI, Wu C. Cross-linking BioThings APIs through JSON-LD to facilitate knowledge exploration. BMC Bioinformatics 2018;19:30. [PMID: 29390967 PMCID: PMC5796402 DOI: 10.1186/s12859-018-2041-5] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 07/21/2017] [Accepted: 01/24/2018] [Indexed: 01/25/2023] Open

Garcia A, Lopez F, Garcia L, Giraldo O, Bucheli V, Dumontier M. Biotea: semantics for Pubmed Central. PeerJ 2018;6:e4201. [PMID: 29312824 PMCID: PMC5755483 DOI: 10.7717/peerj.4201] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 08/09/2017] [Accepted: 12/07/2017] [Indexed: 01/26/2023] Open

Kawashima S, Katayama T, Hatanaka H, Kushida T, Takagi T. NBDC RDF portal: a comprehensive repository for semantic data in life sciences. Database: The Journal of Biological Databases and Curation 2018;2018:5255118. [PMID: 30576482 PMCID: PMC6301334 DOI: 10.1093/database/bay123] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 07/05/2018] [Accepted: 10/15/2018] [Indexed: 11/28/2022]

Linked Data for Life Sciences. Algorithms 2017. [DOI: 10.3390/a10040126] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 01/08/2023]

Esteban-Gil A, Fernández-Breis JT, Boeker M. Analysis and visualization of disease courses in a semantically-enabled cancer registry. J Biomed Semantics 2017;8:46. [PMID: 28962670 PMCID: PMC5622544 DOI: 10.1186/s13326-017-0154-9] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 06/07/2016] [Accepted: 09/19/2017] [Indexed: 12/20/2022] Open

Abstract

Background

Regional and epidemiological cancer registries are important for cancer research and the quality management of cancer treatment. Many technological solutions are available to collect and analyse data for cancer registries nowadays. However, the lack of a well-defined common semantic model is a problem when user-defined analyses and data linking to external resources are required. The objectives of this study are: (1) design of a semantic model for local cancer registries; (2) development of a semantically-enabled cancer registry based on this model; and (3) semantic exploitation of the cancer registry for analysing and visualising disease courses.

Results

Our proposal is based on our previous results and experience working with semantic technologies. Data stored in a cancer registry database were transformed into RDF employing a process driven by OWL ontologies. The semantic representation of the data was then processed to extract semantic patient profiles, which were exploited by means of SPARQL queries to identify groups of similar patients and to analyse the disease timelines of patients.

Based on the requirements analysis, we have produced a draft of an ontology that models the semantics of a local cancer registry in a pragmatic extensible way. We have implemented a Semantic Web platform that allows transforming and storing data from cancer registries in RDF. This platform also permits users to formulate incremental user-defined queries through a graphical user interface. The query results can be displayed in several customisable ways. The complex disease timelines of individual patients can be clearly represented. Different events, e.g. different therapies and disease courses, are presented according to their temporal and causal relations.

Conclusion

The presented platform is an example of the parallel development of ontologies and applications that take advantage of semantic web technologies in the medical field. The semantic structure of the representation renders it easy to analyse key figures of the patients and their evolution at different granularity levels.

Timón S, Rincón M, Martínez-Tomás R. Extending XNAT Platform with an Incremental Semantic Framework. Front Neuroinform 2017;11:57. [PMID: 28912709 PMCID: PMC5583223 DOI: 10.3389/fninf.2017.00057] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 05/02/2017] [Accepted: 08/14/2017] [Indexed: 11/13/2022] Open

Measuring vocabulary use in the Linked Data Cloud. Online Information Review 2017. [DOI: 10.1108/oir-06-2015-0183] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/17/2022]

Abstract

Purpose

This paper reports on a quantitative study of data gathered from the Linked Open Vocabularies (LOV) catalogue, including the use of network analysis and metrics. The purpose of this paper is to gain insights into the structure of LOV and the use of vocabularies in the Web of Data. It is important to note that not all the vocabularies in it are registered in LOV. Given the de-centralised and collaborative nature of the use and adoption of these vocabularies, the results of the study can be used to identify emergent important vocabularies that are shaping the Web of Data.

Design/methodology/approach

The methodology is based on an analytical approach to a data set that captures a complete snapshot of the LOV catalogue dated April 2014. An initial analysis of the data is presented in order to obtain insights into the characteristics of the vocabularies found in LOV. This is followed by an analysis of the use of Vocabulary of a Friend properties that describe relations among vocabularies. Finally, the study is complemented with an analysis of the usage of the different vocabularies, and concludes by proposing a number of metrics.

Findings

The most relevant insight is that unsurprisingly the vocabularies with more presence are those used to model Semantic Web data, such as Resource Description Framework, RDF Schema and OWL, as well as broadly used standards as Simple Knowledge Organization System, DCTERMS and DCE. It was also discovered that the most used language is English and the vocabularies are not considered to be highly specialised in a field. Also, there is not a dominant scope of the vocabularies. Regarding the structural analysis, it is concluded that LOV is a heterogeneous network.

Originality/value

The paper provides an empirical analysis of the structure of LOV and the relations between its vocabularies, together with some metrics that may be of help to determine the important vocabularies from a practical perspective. The results are of interest for a better understanding of the evolution and dynamics of the Web of Data, and for applications that attempt to retrieve data in the Linked Data Cloud. These applications can benefit from the insights into the important vocabularies to be supported and the value added when mapping between and using the vocabularies.

McCusker JP, Dumontier M, Yan R, He S, Dordick JS, McGuinness DL. Finding melanoma drugs through a probabilistic knowledge graph. PeerJ Comput Sci 2017;3:e106. [PMID: 37133296 PMCID: PMC10151034 DOI: 10.7717/peerj-cs.106] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 04/27/2016] [Accepted: 12/27/2016] [Indexed: 05/04/2023]

Mashima J, Kodama Y, Fujisawa T, Katayama T, Okuda Y, Kaminuma E, Ogasawara O, Okubo K, Nakamura Y, Takagi T. DNA Data Bank of Japan. Nucleic Acids Res 2016;45:D25-D31. [PMID: 27924010 PMCID: PMC5210514 DOI: 10.1093/nar/gkw1001] [Citation(s) in RCA: 44] [Impact Index Per Article: 5.5] [Received: 09/16/2016] [Revised: 10/13/2016] [Accepted: 10/15/2016] [Indexed: 12/27/2022] Open

Piñero J, Bravo À, Queralt-Rosinach N, Gutiérrez-Sacristán A, Deu-Pons J, Centeno E, García-García J, Sanz F, Furlong LI. DisGeNET: a comprehensive platform integrating information on human disease-associated genes and variants. Nucleic Acids Res 2016;45:D833-D839. [PMID: 27924018 PMCID: PMC5210640 DOI: 10.1093/nar/gkw943] [Citation(s) in RCA: 1522] [Impact Index Per Article: 190.3] [Received: 08/11/2016] [Revised: 09/29/2016] [Accepted: 10/18/2016] [Indexed: 12/12/2022] Open

Affiliation(s)

Janet Piñero Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), Department of Experimental and Health Sciences (DCEXS), Universitat Pompeu Fabra (UPF), C/Dr Aiguader 88, E-08003 Barcelona, Spain
Àlex Bravo Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), Department of Experimental and Health Sciences (DCEXS), Universitat Pompeu Fabra (UPF), C/Dr Aiguader 88, E-08003 Barcelona, Spain
Núria Queralt-Rosinach Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), Department of Experimental and Health Sciences (DCEXS), Universitat Pompeu Fabra (UPF), C/Dr Aiguader 88, E-08003 Barcelona, Spain
Alba Gutiérrez-Sacristán Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), Department of Experimental and Health Sciences (DCEXS), Universitat Pompeu Fabra (UPF), C/Dr Aiguader 88, E-08003 Barcelona, Spain
Jordi Deu-Pons Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), Department of Experimental and Health Sciences (DCEXS), Universitat Pompeu Fabra (UPF), C/Dr Aiguader 88, E-08003 Barcelona, Spain
Emilio Centeno Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), Department of Experimental and Health Sciences (DCEXS), Universitat Pompeu Fabra (UPF), C/Dr Aiguader 88, E-08003 Barcelona, Spain
Javier García-García Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), Department of Experimental and Health Sciences (DCEXS), Universitat Pompeu Fabra (UPF), C/Dr Aiguader 88, E-08003 Barcelona, Spain
Ferran Sanz Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), Department of Experimental and Health Sciences (DCEXS), Universitat Pompeu Fabra (UPF), C/Dr Aiguader 88, E-08003 Barcelona, Spain
Laura I Furlong Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), Department of Experimental and Health Sciences (DCEXS), Universitat Pompeu Fabra (UPF), C/Dr Aiguader 88, E-08003 Barcelona, Spain

Shaban-Nejad A, Lavigne M, Okhmatovskaia A, Buckeridge DL. PopHR: a knowledge-based platform to support integration, analysis, and visualization of population health data. Ann N Y Acad Sci 2016;1387:44-53. [PMID: 27750378 DOI: 10.1111/nyas.13271] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.8] [Received: 05/02/2016] [Revised: 08/30/2016] [Accepted: 09/09/2016] [Indexed: 11/29/2022]

Shen F, Lee Y. Knowledge Discovery from Biomedical Ontologies in Cross Domains. PLoS One 2016;11:e0160005. [PMID: 27548262 PMCID: PMC4993478 DOI: 10.1371/journal.pone.0160005] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 03/17/2016] [Accepted: 07/12/2016] [Indexed: 01/19/2023] Open

Dumontier M, Gray AJG, Marshall MS, Alexiev V, Ansell P, Bader G, Baran J, Bolleman JT, Callahan A, Cruz-Toledo J, Gaudet P, Gombocz EA, Gonzalez-Beltran AN, Groth P, Haendel M, Ito M, Jupp S, Juty N, Katayama T, Kobayashi N, Krishnaswami K, Laibe C, Le Novère N, Lin S, Malone J, Miller M, Mungall CJ, Rietveld L, Wimalaratne SM, Yamaguchi A. The health care and life sciences community profile for dataset descriptions. PeerJ 2016;4:e2331. [PMID: 27602295 PMCID: PMC4991880 DOI: 10.7717/peerj.2331] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 04/15/2016] [Accepted: 07/14/2016] [Indexed: 11/20/2022] Open

Affiliation(s)

Michel Dumontier Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA, United States of America
Alasdair J G Gray Department of Computer Science, Heriot-Watt University, Edinburgh, United Kingdom
M Scott Marshall Department of Radiation Oncology (MAASTRO), GROW- School for Oncology and Developmental Biology, MAASTRO Clinic, Maastricht, Netherlands
Vladimir Alexiev Ontotext Corporation, Sofia, Bulgaria
Peter Ansell CSIRO, Australia
Gary Bader The Donnelly Centre, University of Toronto, Toronto, Canada
Joachim Baran Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA, United States of America
Jerven T Bolleman Swiss-Prot group, SIB Swiss Institute of Bioinformatics, Geneve, Switzerland
Alison Callahan Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA, United States of America
José Cruz-Toledo Carleton University, Canada
Pascale Gaudet CALIPHO group, SIB Swiss Institute of Bioinformatics, Geneve, Switzerland
Erich A Gombocz IO Informatics, Berkeley, CA, United States of America
Alejandra N Gonzalez-Beltran Oxford e-Research Centre, University of Oxford, Oxford, Oxfordshire, United Kingdom
Paul Groth Elsevier Labs, Netherlands
Melissa Haendel Department of Medical Informatics and Epidemiology, Oregon Health Sciences University, Portland, OR, United States of America
Maori Ito Office of Medical Informatics and Epidemiology, Pharmaceuticals and Medical Devices Agency, Chiyoda-ku, Japan
Simon Jupp EMBL, European Bioinformatics Institute, Saffron Walden, United Kingdom
Nick Juty EMBL, European Bioinformatics Institute, Saffron Walden, United Kingdom
Toshiaki Katayama Database Center for Life Science, Kashiwa, Japan
Norio Kobayashi Advanced Center for Computing and Communication, RIKEN, Wako-shi, Saitama, Japan
Kalpana Krishnaswami Cerenode Inc., United States of America
Camille Laibe EMBL, European Bioinformatics Institute, Saffron Walden, United Kingdom
Nicolas Le Novère The Babraham Institute, Cambridge, United Kingdom
Simon Lin Nationwide Children's Hospital, Columbus, OH, United States of America
James Malone EMBL, European Bioinformatics Institute, Saffron Walden, United Kingdom
Michael Miller Institute for Systems Biology, Seattle, WA, United States of America
Christopher J Mungall Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, United States of America
Laurens Rietveld Department of Exact Sciences, VU University Amsterdam, Amsterdam, Netherlands
Sarala M Wimalaratne EMBL, European Bioinformatics Institute, Saffron Walden, United Kingdom
Atsuko Yamaguchi Database Center for Life Science, Kashiwa, Japan

Fernández-Breis JT, Chiba H, Legaz-García MDC, Uchiyama I. The Orthology Ontology: development and applications. J Biomed Semantics 2016;7:34. [PMID: 27259657 PMCID: PMC4893294 DOI: 10.1186/s13326-016-0077-x] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 12/02/2015] [Accepted: 05/17/2016] [Indexed: 11/16/2022] Open

Rodríguez-Iglesias A, Rodríguez-González A, Irvine AG, Sesma A, Urban M, Hammond-Kosack KE, Wilkinson MD. Publishing FAIR Data: An Exemplar Methodology Utilizing PHI-Base. Frontiers in Plant Science 2016;7:641. [PMID: 27433158 PMCID: PMC4922217 DOI: 10.3389/fpls.2016.00641] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 01/20/2016] [Accepted: 04/26/2016] [Indexed: 06/06/2023]

Brochhausen M, Zheng J, Birtwell D, Williams H, Masci AM, Ellis HJ, Stoeckert CJ. OBIB-a novel ontology for biobanking. J Biomed Semantics 2016;7:23. [PMID: 27148435 PMCID: PMC4855778 DOI: 10.1186/s13326-016-0068-y] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 12/01/2015] [Accepted: 04/21/2016] [Indexed: 11/10/2022] Open

Shen F, Liu H, Sohn S, Larson DW, Lee Y. Predicate Oriented Pattern Analysis for Biomedical Knowledge Discovery. Intelligent Information Management 2016;8:66-85. [PMID: 28983419 PMCID: PMC5626454 DOI: 10.4236/iim.2016.83006] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Indexed: 12/24/2022]

Queralt-Rosinach N, Piñero J, Bravo À, Sanz F, Furlong LI. DisGeNET-RDF: harnessing the innovative power of the Semantic Web to explore the genetic basis of diseases. Bioinformatics 2016;32:2236-8. [PMID: 27153650 PMCID: PMC4937199 DOI: 10.1093/bioinformatics/btw214] [Citation(s) in RCA: 35] [Impact Index Per Article: 4.4] [Received: 11/13/2015] [Accepted: 04/14/2016] [Indexed: 11/13/2022] Open

Callahan A, Abeyruwan SW, Al-Ali H, Sakurai K, Ferguson AR, Popovich PG, Shah NH, Visser U, Bixby JL, Lemmon VP. RegenBase: a knowledge base of spinal cord injury biology for translational research. Database: The Journal of Biological Databases and Curation 2016;2016:baw040. [PMID: 27055827 PMCID: PMC4823819 DOI: 10.1093/database/baw040] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 09/03/2015] [Accepted: 03/03/2016] [Indexed: 12/20/2022]

Burgstaller-Muehlbacher S, Waagmeester A, Mitraka E, Turner J, Putman T, Leong J, Naik C, Pavlidis P, Schriml L, Good BM, Su AI. Wikidata as a semantic framework for the Gene Wiki initiative. Database: The Journal of Biological Databases and Curation 2016;2016:baw015. [PMID: 26989148 PMCID: PMC4795929 DOI: 10.1093/database/baw015] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.0] [Received: 11/07/2015] [Accepted: 02/01/2016] [Indexed: 11/14/2022]

Shaban-Nejad A, Mamiya H, Riazanov A, Forster AJ, Baker CJO, Tamblyn R, Buckeridge DL. From Cues to Nudge: A Knowledge-Based Framework for Surveillance of Healthcare-Associated Infections. J Med Syst 2015;40:23. [PMID: 26537131 DOI: 10.1007/s10916-015-0364-6] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 06/03/2015] [Accepted: 09/30/2015] [Indexed: 10/22/2022]

González-Beltrán A, Li P, Zhao J, Avila-Garcia MS, Roos M, Thompson M, van der Horst E, Kaliyaperumal R, Luo R, Lee TL, Lam TW, Edmunds SC, Sansone SA, Rocca-Serra P. From Peer-Reviewed to Peer-Reproduced in Scholarly Publishing: The Complementary Roles of Data Models and Workflows in Bioinformatics. PLoS One 2015;10:e0127612. [PMID: 26154165 PMCID: PMC4495984 DOI: 10.1371/journal.pone.0127612] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 12/04/2014] [Accepted: 04/16/2015] [Indexed: 12/20/2022] Open

Abstract

MOTIVATION

Reproducing the results from a scientific paper can be challenging due to the absence of data and the computational tools required for their analysis. In addition, details relating to the procedures used to obtain the published results can be difficult to discern due to the use of natural language when reporting how experiments have been performed. The Investigation/Study/Assay (ISA), Nanopublications (NP), and Research Objects (RO) models are conceptual data modelling frameworks that can structure such information from scientific papers. Computational workflow platforms can also be used to reproduce analyses of data in a principled manner. We assessed the extent by which ISA, NP, and RO models, together with the Galaxy workflow system, can capture the experimental processes and reproduce the findings of a previously published paper reporting on the development of SOAPdenovo2, a de novo genome assembler.

RESULTS

Executable workflows were developed using Galaxy, which reproduced results that were consistent with the published findings. A structured representation of the information in the SOAPdenovo2 paper was produced by combining the use of ISA, NP, and RO models. By structuring the information in the published paper using these data and scientific workflow modelling frameworks, it was possible to explicitly declare elements of experimental design, variables, and findings. The models served as guides in the curation of scientific information and this led to the identification of inconsistencies in the original published paper, thereby allowing its authors to publish corrections in the form of an errata.

AVAILABILITY

SOAPdenovo2 scripts, data, and results are available through the GigaScience Database: http://dx.doi.org/10.5524/100044; the workflows are available from GigaGalaxy: http://galaxy.cbiit.cuhk.edu.hk; and the representations using the ISA, NP, and RO models are available through the SOAPdenovo2 case study website http://isa-tools.github.io/soapdenovo2/.

CONTACT

philippe.rocca-serra@oerc.ox.ac.uk and susanna-assunta.sansone@oerc.ox.ac.uk.

Affiliation(s)

Alejandra González-Beltrán Oxford e-Research Centre, University of Oxford, 7 Keble Road, OX1 3QG, United Kingdom
Peter Li GigaScience, BGI HK Research Institute, 16 Dai Fu Street, Tai Po Industrial Estate, Hong Kong, People’s Republic of China
Jun Zhao InfoLab21, Lancaster University, Bailrigg, Lancaster, LA1 4WA, United Kingdom
Maria Susana Avila-Garcia Nuffield Department of Medicine, Experimental Medicine Division, John Radcliffe Hospital, Headley Way, Headington, Oxford, OX3 9DU, United Kingdom
Marco Roos Department of Human Genetics, Leiden University Medical Center, P.O. Box 9600, 2300 RC Leiden, The Netherlands
Mark Thompson Department of Human Genetics, Leiden University Medical Center, P.O. Box 9600, 2300 RC Leiden, The Netherlands
Eelke van der Horst Department of Human Genetics, Leiden University Medical Center, P.O. Box 9600, 2300 RC Leiden, The Netherlands
Rajaram Kaliyaperumal Department of Human Genetics, Leiden University Medical Center, P.O. Box 9600, 2300 RC Leiden, The Netherlands
Ruibang Luo HKU-BGI Bioinformatics Algorithms and Core Technology Research Laboratory & Department of Computer Science, University of Hong Kong, Pokfulam, Hong Kong, People’s Republic of China
Tin-Lap Lee School of Biomedical Sciences and CUHK-BGI Innovation Institute of Trans-omics, The Chinese University of Hong Kong, Shatin, Hong Kong, People’s Republic of China
Tak-wah Lam HKU-BGI Bioinformatics Algorithms and Core Technology Research Laboratory & Department of Computer Science, University of Hong Kong, Pokfulam, Hong Kong, People’s Republic of China
Scott C. Edmunds GigaScience, BGI HK Research Institute, 16 Dai Fu Street, Tai Po Industrial Estate, Hong Kong, People’s Republic of China
Susanna-Assunta Sansone Oxford e-Research Centre, University of Oxford, 7 Keble Road, OX1 3QG, United Kingdom
Philippe Rocca-Serra Oxford e-Research Centre, University of Oxford, 7 Keble Road, OX1 3QG, United Kingdom

Baran J, Durgahee BSB, Eilbeck K, Antezana E, Hoehndorf R, Dumontier M. GFVO: the Genomic Feature and Variation Ontology. PeerJ 2015;3:e933. [PMID: 26019997 PMCID: PMC4435477 DOI: 10.7717/peerj.933] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 11/04/2014] [Accepted: 04/14/2015] [Indexed: 01/06/2023] Open

Piñero J, Queralt-Rosinach N, Bravo À, Deu-Pons J, Bauer-Mehren A, Baron M, Sanz F, Furlong LI. DisGeNET: a discovery platform for the dynamical exploration of human diseases and their genes. Database: The Journal of Biological Databases and Curation 2015;2015:bav028. [PMID: 25877637 PMCID: PMC4397996 DOI: 10.1093/database/bav028] [Citation(s) in RCA: 630] [Impact Index Per Article: 70.0] [Received: 10/17/2014] [Accepted: 03/09/2015] [Indexed: 11/25/2022]

Affiliation(s)

Janet Piñero Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), Department of Experimental and Health Sciences, Universitat Pompeu Fabra, C/Dr Aiguader 88, E-08003 Barcelona, Spain, Roche Pharma Research and Early Development, pRED Informatics, Roche Innovation Center Penzberg, Roche Diagnostics GmbH, Nonnenwald 2, 82377 Penzberg, Germany and Scientific & Business Information Services, Roche Diagnostics GmbH, Nonnenwald 2, 82377 Penzberg, Germany
Núria Queralt-Rosinach Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), Department of Experimental and Health Sciences, Universitat Pompeu Fabra, C/Dr Aiguader 88, E-08003 Barcelona, Spain, Roche Pharma Research and Early Development, pRED Informatics, Roche Innovation Center Penzberg, Roche Diagnostics GmbH, Nonnenwald 2, 82377 Penzberg, Germany and Scientific & Business Information Services, Roche Diagnostics GmbH, Nonnenwald 2, 82377 Penzberg, Germany
Àlex Bravo Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), Department of Experimental and Health Sciences, Universitat Pompeu Fabra, C/Dr Aiguader 88, E-08003 Barcelona, Spain, Roche Pharma Research and Early Development, pRED Informatics, Roche Innovation Center Penzberg, Roche Diagnostics GmbH, Nonnenwald 2, 82377 Penzberg, Germany and Scientific & Business Information Services, Roche Diagnostics GmbH, Nonnenwald 2, 82377 Penzberg, Germany
Jordi Deu-Pons Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), Department of Experimental and Health Sciences, Universitat Pompeu Fabra, C/Dr Aiguader 88, E-08003 Barcelona, Spain, Roche Pharma Research and Early Development, pRED Informatics, Roche Innovation Center Penzberg, Roche Diagnostics GmbH, Nonnenwald 2, 82377 Penzberg, Germany and Scientific & Business Information Services, Roche Diagnostics GmbH, Nonnenwald 2, 82377 Penzberg, Germany
Anna Bauer-Mehren Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), Department of Experimental and Health Sciences, Universitat Pompeu Fabra, C/Dr Aiguader 88, E-08003 Barcelona, Spain, Roche Pharma Research and Early Development, pRED Informatics, Roche Innovation Center Penzberg, Roche Diagnostics GmbH, Nonnenwald 2, 82377 Penzberg, Germany and Scientific & Business Information Services, Roche Diagnostics GmbH, Nonnenwald 2, 82377 Penzberg, Germany
Martin Baron Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), Department of Experimental and Health Sciences, Universitat Pompeu Fabra, C/Dr Aiguader 88, E-08003 Barcelona, Spain, Roche Pharma Research and Early Development, pRED Informatics, Roche Innovation Center Penzberg, Roche Diagnostics GmbH, Nonnenwald 2, 82377 Penzberg, Germany and Scientific & Business Information Services, Roche Diagnostics GmbH, Nonnenwald 2, 82377 Penzberg, Germany
Ferran Sanz Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), Department of Experimental and Health Sciences, Universitat Pompeu Fabra, C/Dr Aiguader 88, E-08003 Barcelona, Spain, Roche Pharma Research and Early Development, pRED Informatics, Roche Innovation Center Penzberg, Roche Diagnostics GmbH, Nonnenwald 2, 82377 Penzberg, Germany and Scientific & Business Information Services, Roche Diagnostics GmbH, Nonnenwald 2, 82377 Penzberg, Germany
Laura I Furlong Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), Department of Experimental and Health Sciences, Universitat Pompeu Fabra, C/Dr Aiguader 88, E-08003 Barcelona, Spain, Roche Pharma Research and Early Development, pRED Informatics, Roche Innovation Center Penzberg, Roche Diagnostics GmbH, Nonnenwald 2, 82377 Penzberg, Germany and Scientific & Business Information Services, Roche Diagnostics GmbH, Nonnenwald 2, 82377 Penzberg, Germany

Chiba H, Nishide H, Uchiyama I. Construction of an ortholog database using the semantic web technology for integrative analysis of genomic data. PLoS One 2015;10:e0122802. [PMID: 25875762 PMCID: PMC4395280 DOI: 10.1371/journal.pone.0122802] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 10/22/2014] [Accepted: 02/13/2015] [Indexed: 12/30/2022] Open

González-Beltrán A, Maguire E, Sansone SA, Rocca-Serra P. linkedISA: semantic representation of ISA-Tab experimental metadata. BMC Bioinformatics 2014;15 Suppl 14:S4. [PMID: 25472428 PMCID: PMC4255742 DOI: 10.1186/1471-2105-15-s14-s4] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.1] [Indexed: 12/16/2022] Open

Abstract

BACKGROUND

Reporting and sharing experimental metadata, such as the experimental design, characteristics of the samples, and procedures applied, along with the analysis results, in a standardised manner ensures that datasets are comprehensible and, in principle, reproducible, comparable and reusable. Furthermore, sharing datasets in formats designed for consumption by humans and machines will also maximize their use. The Investigation/Study/Assay (ISA) open source metadata tracking framework facilitates standards-compliant collection, curation, visualization, storage and sharing of datasets, leveraging other platforms to enable analysis and publication. The ISA software suite includes several components used in an increasingly diverse set of life science and biomedical domains; it is underpinned by a general-purpose format, ISA-Tab, and conversions exist into formats required by public repositories. While ISA-Tab works well mainly as a human-readable format, we have also implemented a linked data approach to semantically define the ISA-Tab syntax.

RESULTS

We present a semantic web representation of the ISA-Tab syntax that complements ISA-Tab's syntactic interoperability with semantic interoperability. We introduce the linkedISA conversion tool from ISA-Tab to the Resource Description Framework (RDF), supporting mappings from the ISA syntax to multiple community-defined, open ontologies and capitalising on user-provided ontology annotations in the experimental metadata. We describe insights into the implementation and how annotations can be expanded driven by the metadata. We applied the conversion tool as part of Bio-GraphIIn, a web-based application supporting integration of the semantically-rich experimental descriptions. Designed in a user-friendly manner, the Bio-GraphIIn interface hides most of the complexities from the users, exposing a familiar tabular view of the experimental description to allow seamless interaction with the RDF representation, and visualising descriptors to drive the query over the semantic representation of the experimental design. In addition, we defined queries over the linkedISA RDF representation and demonstrated its use over the linkedISA conversion of datasets from Nature's Scientific Data online publication.

CONCLUSIONS

Our linked data approach has allowed us to: 1) make the ISA-Tab semantics explicit and machine-processable, 2) exploit the existing ontology-based annotations in the ISA-Tab experimental descriptions, 3) augment the ISA-Tab syntax with new descriptive elements, 4) visualise and query elements related to the experimental design. Reasoning over ISA-Tab metadata and associated data will facilitate data integration and knowledge discovery.

Chibucos MC, Mungall CJ, Balakrishnan R, Christie KR, Huntley RP, White O, Blake JA, Lewis SE, Giglio M. Standardized description of scientific evidence using the Evidence Ontology (ECO). DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2014;2014:bau075. [PMID: 25052702 PMCID: PMC4105709 DOI: 10.1093/database/bau075] [Citation(s) in RCA: 82] [Impact Index Per Article: 8.2] [Indexed: 11/13/2022]

Affiliation(s)

Marcus C Chibucos Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA, Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD 21201, USA, Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA, Saccharomyces Genome Database, Department of Genetics, Stanford University, Stanford, CA 94305, USA, Computational Biology and Bioinformatics, The Jackson Laboratory, Bar Harbor, ME 04609, USA, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD UK, Department of Epidemiology, University of Maryland School of Medicine, Baltimore, MD 21201, USA and Department of Medicine, University of Maryland School of Medicine, Baltimore, MD 21201, USA
Christopher J Mungall Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA, Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD 21201, USA, Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA, Saccharomyces Genome Database, Department of Genetics, Stanford University, Stanford, CA 94305, USA, Computational Biology and Bioinformatics, The Jackson Laboratory, Bar Harbor, ME 04609, USA, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD UK, Department of Epidemiology, University of Maryland School of Medicine, Baltimore, MD 21201, USA and Department of Medicine, University of Maryland School of Medicine, Baltimore, MD 21201, USA
Rama Balakrishnan Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA, Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD 21201, USA, Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA, Saccharomyces Genome Database, Department of Genetics, Stanford University, Stanford, CA 94305, USA, Computational Biology and Bioinformatics, The Jackson Laboratory, Bar Harbor, ME 04609, USA, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD UK, Department of Epidemiology, University of Maryland School of Medicine, Baltimore, MD 21201, USA and Department of Medicine, University of Maryland School of Medicine, Baltimore, MD 21201, USA
Karen R Christie Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA, Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD 21201, USA, Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA, Saccharomyces Genome Database, Department of Genetics, Stanford University, Stanford, CA 94305, USA, Computational Biology and Bioinformatics, The Jackson Laboratory, Bar Harbor, ME 04609, USA, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD UK, Department of Epidemiology, University of Maryland School of Medicine, Baltimore, MD 21201, USA and Department of Medicine, University of Maryland School of Medicine, Baltimore, MD 21201, USA
Rachael P Huntley Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA, Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD 21201, USA, Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA, Saccharomyces Genome Database, Department of Genetics, Stanford University, Stanford, CA 94305, USA, Computational Biology and Bioinformatics, The Jackson Laboratory, Bar Harbor, ME 04609, USA, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD UK, Department of Epidemiology, University of Maryland School of Medicine, Baltimore, MD 21201, USA and Department of Medicine, University of Maryland School of Medicine, Baltimore, MD 21201, USA
Owen White Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA, Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD 21201, USA, Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA, Saccharomyces Genome Database, Department of Genetics, Stanford University, Stanford, CA 94305, USA, Computational Biology and Bioinformatics, The Jackson Laboratory, Bar Harbor, ME 04609, USA, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD UK, Department of Epidemiology, University of Maryland School of Medicine, Baltimore, MD 21201, USA and Department of Medicine, University of Maryland School of Medicine, Baltimore, MD 21201, USA
Judith A Blake Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA, Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD 21201, USA, Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA, Saccharomyces Genome Database, Department of Genetics, Stanford University, Stanford, CA 94305, USA, Computational Biology and Bioinformatics, The Jackson Laboratory, Bar Harbor, ME 04609, USA, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD UK, Department of Epidemiology, University of Maryland School of Medicine, Baltimore, MD 21201, USA and Department of Medicine, University of Maryland School of Medicine, Baltimore, MD 21201, USA
Suzanna E Lewis Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA, Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD 21201, USA, Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA, Saccharomyces Genome Database, Department of Genetics, Stanford University, Stanford, CA 94305, USA, Computational Biology and Bioinformatics, The Jackson Laboratory, Bar Harbor, ME 04609, USA, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD UK, Department of Epidemiology, University of Maryland School of Medicine, Baltimore, MD 21201, USA and Department of Medicine, University of Maryland School of Medicine, Baltimore, MD 21201, USA
Michelle Giglio Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201, USA, Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD 21201, USA, Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA, Saccharomyces Genome Database, Department of Genetics, Stanford University, Stanford, CA 94305, USA, Computational Biology and Bioinformatics, The Jackson Laboratory, Bar Harbor, ME 04609, USA, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD UK, Department of Epidemiology, University of Maryland School of Medicine, Baltimore, MD 21201, USA and Department of Medicine, University of Maryland School of Medicine, Baltimore, MD 21201, USA

Bölling C, Weidlich M, Holzhütter HG. SEE: structured representation of scientific evidence in the biomedical domain using Semantic Web techniques. J Biomed Semantics 2014;5:S1. [PMID: 25093070 PMCID: PMC4108886 DOI: 10.1186/2041-1480-5-s1-s1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Accounts of evidence are vital to evaluate and reproduce scientific findings and integrate data on an informed basis. Currently, such accounts are often inadequate, unstandardized and inaccessible for computational knowledge engineering even though computational technologies, among them those of the semantic web, are ever more employed to represent, disseminate and integrate biomedical data and knowledge.

RESULTS

We present SEE (Semantic EvidencE), an RDF/OWL based approach for detailed representation of evidence in terms of the argumentative structure of the supporting background for claims even in complex settings. We derive design principles and identify minimal components for the representation of evidence. We specify the Reasoning and Discourse Ontology (RDO), an OWL representation of the model of scientific claims, their subjects, their provenance and their argumentative relations underlying the SEE approach. We demonstrate the application of SEE and illustrate its design patterns in a case study by providing an expressive account of the evidence for certain claims regarding the isolation of the enzyme glutamine synthetase.

CONCLUSIONS

SEE is suited to provide coherent and computationally accessible representations of evidence-related information such as the materials, methods, assumptions, reasoning and information sources used to establish a scientific finding by adopting a consistently claim-based perspective on scientific results and their evidence. SEE allows for extensible evidence representations, in which the level of detail can be adjusted and which can be extended as needed. It supports representation of arbitrarily many consecutive layers of interpretation and attribution and different evaluations of the same data. SEE and its underlying model could be a valuable component in a variety of use cases that require careful representation or examination of evidence for data presented on the semantic web or in other formats.

Chaudhri VK, Elenius D, Goldenkranz A, Gong A, Martone ME, Webb W, Yorke-Smith N. Comparative analysis of knowledge representation and reasoning requirements across a range of life sciences textbooks. J Biomed Semantics 2014;5:51. [PMID: 25785183 PMCID: PMC4362633 DOI: 10.1186/2041-1480-5-51] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 05/23/2014] [Accepted: 11/26/2014] [Indexed: 11/29/2022] Open

Abstract

Background

Using knowledge representation for biomedical projects is now commonplace. In previous work, we represented the knowledge found in a college-level biology textbook in a fashion useful for answering questions. We showed that embedding the knowledge representation and question-answering abilities in an electronic textbook helped to engage student interest and improve learning. A natural question that arises from this success, and this paper’s primary focus, is whether a similar approach is applicable across a range of life science textbooks. To answer that question, we considered four different textbooks, ranging from a below-introductory college biology text to an advanced, graduate-level neuroscience textbook. For these textbooks, we investigated the following questions: (1) To what extent is knowledge shared between the different textbooks? (2) To what extent can the same upper ontology be used to represent the knowledge found in different textbooks? (3) To what extent can the questions of interest for a range of textbooks be answered by using the same reasoning mechanisms?

Results

Our existing modeling and reasoning methods apply especially well both to a textbook that is comparable in level to the text studied in our previous work (i.e., an introductory-level text) and to a textbook at a lower level, suggesting potential for a high degree of portability. Even for the overlapping knowledge found across the textbooks, the level of detail covered in each textbook was different, which requires that the representations be customized for each textbook. We also found that for advanced textbooks, representing models and scientific reasoning processes was particularly important.

Conclusions

With some additional work, our representation methodology would be applicable to a range of textbooks. The requirements for knowledge representation are common across textbooks, suggesting that a shared semantic infrastructure for the life sciences is feasible. Because our representation overlaps heavily with those already being used for biomedical ontologies, this work suggests a natural pathway to include such representations as part of the life sciences curriculum at different grade levels.

Samadian S, McManus B, Wilkinson MD. Extending and encoding existing biological terminologies and datasets for use in the reasoned semantic web. J Biomed Semantics 2012;3:6. [PMID: 22818710 PMCID: PMC3639885 DOI: 10.1186/2041-1480-3-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 10/21/2011] [Accepted: 05/13/2012] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Clinical phenotypes and disease-risk stratification are most often determined through the direct observations of clinicians in conjunction with published standards and guidelines, where the clinical expert is the final arbiter of the patient's classification. While this "human" approach is highly desirable in the context of personalized and optimal patient care, it is problematic in a healthcare research setting because the basis for the patient's classification is not transparent, and likely not reproducible from one clinical expert to another. This sits in opposition to the rigor required to execute, for example, genome-wide association analyses and other high-throughput studies where a large number of variables are being compared to a complex disease phenotype. Most clinical classification systems are not structured for automated classification, and similarly, clinical data is generally not represented in a form that lends itself to automated integration and interpretation. Here we apply Semantic Web technologies to the problem of automated, transparent interpretation of clinical data for use in high-throughput research environments, and explore migration paths for existing data and legacy semantic standards.

RESULTS

Using a dataset from a cardiovascular cohort collected two decades ago, we present a migration path - both for the terminologies/classification systems and the data - that enables rich automated clinical classification using well-established standards. This is achieved by establishing a simple and flexible core data model, which is combined with a layered ontological framework utilizing both logical reasoning and analytical algorithms to iteratively "lift" clinical data through increasingly complex layers of interpretation and classification. We compare our automated analysis to that of the clinical expert, and discrepancies are used to refine the ontological models, finally arriving at ontologies that mirror the expert opinion of the individual clinical researcher. Other discrepancies, however, could not be as easily modeled, and we evaluate what information we are lacking that would allow these discrepancies to be resolved in an automated manner.

CONCLUSIONS

We demonstrate that the combination of semantically-explicit data, logically rigorous models of clinical guidelines, and publicly-accessible Semantic Web Services, can be used to execute automated, rigorous and reproducible clinical classifications with an accuracy approaching that of an expert. Discrepancies between the manual and automatic approaches reveal, as expected, that clinicians do not always rigorously follow established guidelines for classification; however, we demonstrate that "personalized" ontologies may represent a re-usable and transparent approach to modeling individual clinical expertise, leading to more reproducible science.
