1
Munarko Y, Rampadarath A, Nickerson D. Building a search tool for compositely annotated entities using Transformer-based approach: Case study in Biosimulation Model Search Engine (BMSE). F1000Res 2023; 12:162. [PMID: 37842339 PMCID: PMC10570691 DOI: 10.12688/f1000research.128982.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/25/2023] [Indexed: 10/17/2023]
Abstract
The Transformer-based approaches to solving natural language processing (NLP) tasks such as BERT and GPT are gaining popularity due to their ability to achieve high performance. These approaches benefit from using enormous data sizes to create pre-trained models and the ability to understand the context of words in a sentence. Their use in the information retrieval domain is thought to increase effectiveness and efficiency. This paper demonstrates a BERT-based method (CASBERT) implementation to build a search tool over data annotated compositely using ontologies. The data was a collection of biosimulation models written using the CellML standard in the Physiome Model Repository (PMR). A biosimulation model structurally consists of basic entities of constants and variables that construct higher-level entities such as components, reactions, and the model. Finding these entities specific to their level is beneficial for various purposes regarding variable reuse, experiment setup, and model audit. Initially, we created embeddings representing compositely-annotated entities for constant and variable search (lowest level entity). Then, these low-level entity embeddings were vertically and efficiently combined to create higher-level entity embeddings to search components, models, images, and simulation setups. Our approach was general, so it can be used to create search tools with other data semantically annotated with ontologies - biosimulation models encoded in the SBML format, for example. Our tool is named Biosimulation Model Search Engine (BMSE).
Affiliation(s)
- Yuda Munarko
  - Auckland Bioengineering Institute, University of Auckland, Auckland, 1010, New Zealand
- Anand Rampadarath
  - Auckland Bioengineering Institute, University of Auckland, Auckland, 1010, New Zealand
  - The New Zealand Institute for Plant and Food Research Limited, Auckland, New Zealand
- David Nickerson
  - Auckland Bioengineering Institute, University of Auckland, Auckland, 1010, New Zealand
2
Niarakis A, Waltemath D, Glazier J, Schreiber F, Keating SM, Nickerson D, Chaouiya C, Siegel A, Noël V, Hermjakob H, Helikar T, Soliman S, Calzone L. Addressing barriers in comprehensiveness, accessibility, reusability, interoperability and reproducibility of computational models in systems biology. Brief Bioinform 2022; 23:bbac212. [PMID: 35671510 PMCID: PMC9294410 DOI: 10.1093/bib/bbac212] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2022] [Revised: 04/20/2022] [Accepted: 05/06/2022] [Indexed: 11/14/2022]
Abstract
Computational models are often employed in systems biology to study the dynamic behaviours of complex systems. With the rise in the number of computational models, finding ways to improve the reusability of these models and their ability to reproduce virtual experiments becomes critical. Correct and effective model annotation in community-supported and standardised formats is necessary for this improvement. Here, we present recent efforts toward a common framework for annotated, accessible, reproducible and interoperable computational models in biology, and discuss key challenges of the field.
Affiliation(s)
- Anna Niarakis
  - Université Paris-Saclay, Laboratoire Européen de Recherche pour la Polyarthrite rhumatoïde - Genhotel, Univ Evry, Evry, France
  - Lifeware Group, Inria, Saclay-île de France, 91120 Palaiseau, France
- Dagmar Waltemath
  - Department of Medical Informatics, University Medicine Greifswald, Greifswald, Germany
- James Glazier
  - Biocomplexity Institute and Department of Intelligent Systems Engineering, Indiana University, Bloomington, IN, USA
- Falk Schreiber
  - Department of Computer and Information Science, University of Konstanz, Konstanz, Germany
  - Faculty of Information Technology, Monash University, Clayton, Australia
- David Nickerson
  - Auckland Bioengineering Institute, University of Auckland, Auckland, New Zealand
- Anne Siegel
  - Univ Rennes, CNRS, Inria - IRISA lab. Rennes
- Vincent Noël
  - Institut Curie, PSL Research University, Paris, France
  - INSERM, U900, Paris, France
  - MINES ParisTech, PSL Research University, CBIO-Centre for Computational Biology, Paris, France
- Henning Hermjakob
  - EMBL-European Bioinformatics Institute, Wellcome Genome Campus, Cambridge, UK
- Tomáš Helikar
  - Department of Biochemistry, University of Nebraska-Lincoln, Lincoln, NE, 68588, USA
- Sylvain Soliman
  - Lifeware Group, Inria, Saclay-île de France, 91120 Palaiseau, France
- Laurence Calzone
  - Institut Curie, PSL Research University, Paris, France
  - INSERM, U900, Paris, France
  - MINES ParisTech, PSL Research University, CBIO-Centre for Computational Biology, Paris, France
3
Erdem C, Mutsuddy A, Bensman EM, Dodd WB, Saint-Antoine MM, Bouhaddou M, Blake RC, Gross SM, Heiser LM, Feltus FA, Birtwistle MR. A scalable, open-source implementation of a large-scale mechanistic model for single cell proliferation and death signaling. Nat Commun 2022; 13:3555. [PMID: 35729113 PMCID: PMC9213456 DOI: 10.1038/s41467-022-31138-1] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 07/16/2021] [Accepted: 06/07/2022] [Indexed: 02/01/2023]
Abstract
Mechanistic models of how single cells respond to different perturbations can help integrate disparate big data sets or predict response to varied drug combinations. However, the construction and simulation of such models have proved challenging. Here, we developed a python-based model creation and simulation pipeline that converts a few structured text files into an SBML standard and is high-performance- and cloud-computing ready. We applied this pipeline to our large-scale, mechanistic pan-cancer signaling model (named SPARCED) and demonstrate it by adding an IFNγ pathway submodel. We then investigated whether a putative crosstalk mechanism could be consistent with experimental observations from the LINCS MCF10A Data Cube that IFNγ acts as an anti-proliferative factor. The analyses suggested this observation can be explained by IFNγ-induced SOCS1 sequestering activated EGF receptors. This work forms a foundational recipe for increased mechanistic model-based data integration on a single-cell level, an important building block for clinically-predictive mechanistic models.
Affiliation(s)
- Cemal Erdem
  - Department of Chemical & Biomolecular Engineering, Clemson University, Clemson, SC, USA
- Arnab Mutsuddy
  - Department of Chemical & Biomolecular Engineering, Clemson University, Clemson, SC, USA
- Ethan M Bensman
  - Computer Science, School of Computing, Clemson University, Clemson, SC, USA
- William B Dodd
  - Department of Chemical & Biomolecular Engineering, Clemson University, Clemson, SC, USA
- Michael M Saint-Antoine
  - Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE, USA
- Mehdi Bouhaddou
  - Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco, CA, USA
- Robert C Blake
  - Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, Livermore, CA, USA
- Sean M Gross
  - Department of Biomedical Engineering, Oregon Health & Science University, Portland, OR, USA
- Laura M Heiser
  - Department of Biomedical Engineering, Oregon Health & Science University, Portland, OR, USA
- F Alex Feltus
  - Department of Genetics and Biochemistry, Clemson University, Clemson, SC, USA
  - Biomedical Data Science and Informatics Program, Clemson University, Clemson, SC, USA
  - Center for Human Genetics, Clemson University, Clemson, SC, USA
- Marc R Birtwistle
  - Department of Chemical & Biomolecular Engineering, Clemson University, Clemson, SC, USA
  - Department of Bioengineering, Clemson University, Clemson, SC, USA
4
Shahidi N, Pan M, Tran K, Crampin EJ, Nickerson DP. A semantics, energy-based approach to automate biomodel composition. PLoS One 2022; 17:e0269497. [PMID: 35657966 PMCID: PMC9165793 DOI: 10.1371/journal.pone.0269497] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/04/2022] [Accepted: 05/20/2022] [Indexed: 11/19/2022]
Abstract
Hierarchical modelling is essential to achieving complex, large-scale models. However, not all modelling schemes support hierarchical composition, and correctly mapping points of connection between models requires comprehensive knowledge of each model's components and assumptions. To address these challenges in integrating biosimulation models, we propose an approach to automatically and confidently compose biosimulation models. The approach uses bond graphs to combine aspects of physical and thermodynamics-based modelling with biological semantics. We improved on existing approaches by using semantic annotations to automate the recognition of common components. The approach is illustrated by coupling a model of the Ras-MAPK cascade to a model of the upstream activation of EGFR. Through this methodology, we aim to assist researchers and modellers in readily having access to more comprehensive biological systems models.
Affiliation(s)
- Niloofar Shahidi
  - Auckland Bioengineering Institute, The University of Auckland, Auckland, New Zealand
- Michael Pan
  - Systems Biology Laboratory, School of Mathematics and Statistics, and Department of Biomedical Engineering, University of Melbourne, Melbourne, Victoria, Australia
  - ARC Centre of Excellence in Convergent Bio-Nano Science and Technology, Faculty of Engineering and Information Technology, University of Melbourne, Melbourne, Victoria, Australia
  - School of Mathematics and Statistics, Faculty of Science, University of Melbourne, Victoria, Australia
- Kenneth Tran
  - Auckland Bioengineering Institute, The University of Auckland, Auckland, New Zealand
- Edmund J. Crampin
  - Systems Biology Laboratory, School of Mathematics and Statistics, and Department of Biomedical Engineering, University of Melbourne, Melbourne, Victoria, Australia
  - ARC Centre of Excellence in Convergent Bio-Nano Science and Technology, Faculty of Engineering and Information Technology, University of Melbourne, Melbourne, Victoria, Australia
  - School of Mathematics and Statistics, Faculty of Science, University of Melbourne, Victoria, Australia
  - School of Medicine, University of Melbourne, Melbourne, Victoria, Australia
- David P. Nickerson
  - Auckland Bioengineering Institute, The University of Auckland, Auckland, New Zealand
5
Karr J, Malik-Sheriff RS, Osborne J, Gonzalez-Parra G, Forgoston E, Bowness R, Liu Y, Thompson R, Garira W, Barhak J, Rice J, Torres M, Dobrovolny HM, Tang T, Waites W, Glazier JA, Faeder JR, Kulesza A. Model Integration in Computational Biology: The Role of Reproducibility, Credibility and Utility. Frontiers in Systems Biology 2022; 2:822606. [PMID: 36909847 PMCID: PMC10002468 DOI: 10.3389/fsysb.2022.822606] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 12/22/2022]
Abstract
During the COVID-19 pandemic, mathematical modeling of disease transmission has become a cornerstone of key state decisions. To advance the state-of-the-art host viral modeling to handle future pandemics, many scientists working on related issues assembled to discuss the topics. These discussions exposed the reproducibility crisis that leads to inability to reuse and integrate models. This document summarizes these discussions, presents difficulties, and mentions existing efforts towards future solutions that will allow future model utility and integration. We argue that without addressing these challenges, scientists will have diminished ability to build, disseminate, and implement high-impact multi-scale modeling that is needed to understand the health crises we face.
Affiliation(s)
- Jonathan Karr
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, United States
- Rahuman S. Malik-Sheriff
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridgeshire, United Kingdom
- James Osborne
  - School of Mathematics and Statistics, University of Melbourne, Parkville, VIC, Australia
- Eric Forgoston
  - Department of Applied Mathematics and Statistics, Montclair State University, Montclair, NJ, United States
- Ruth Bowness
  - Department of Mathematical Sciences, University of Bath, Bath, United Kingdom
- Yaling Liu
  - Department of Mechanical Engineering and Mechanics, Department of Bioengineering, Lehigh University, Bethlehem, PA, United States
- Robin Thompson
  - Mathematics Institute and the Zeeman Institute for Systems Biology and Infectious Disease Epidemiology Research, University of Warwick, Coventry, United Kingdom
- Winston Garira
  - Department of Mathematics and Applied Mathematics, Modelling Health and Environmental Linkages Research Group, University of Venda, Limpopo, South Africa
- Jacob Barhak
  - Jacob Barhak Analytics, Austin, TX, United States
- John Rice
  - Independent Retired Working Group Volunteer, Virginia Beach, VA, United States
- Marcella Torres
  - Department of Mathematics and Computer Science, University of Richmond, Richmond, VA, United States
- Hana M. Dobrovolny
  - Department of Physics and Astronomy, Texas Christian University, Fort Worth, TX, United States
- Tingting Tang
  - Department of Mathematics and Statistics in San Diego State University (SDSU) and SDSU Imperial Valley, Calexico, CA, United States
- William Waites
  - Centre for Mathematical Modelling of Infectious Diseases, London School of Hygiene and Tropical Medicine, London, United Kingdom
  - Department of Computer and Information Sciences, University of Strathclyde, Glasgow, Scotland
- James A. Glazier
  - Biocomplexity Institute, Indiana University, Bloomington, IN, United States
- James R. Faeder
  - Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, PA, United States
6
Munarko Y, Sarwar DM, Rampadarath A, Atalag K, Gennari JH, Neal ML, Nickerson DP. NLIMED: Natural Language Interface for Model Entity Discovery in Biosimulation Model Repositories. Front Physiol 2022; 13:820683. [PMID: 35283794 PMCID: PMC8908213 DOI: 10.3389/fphys.2022.820683] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2021] [Accepted: 01/31/2022] [Indexed: 12/04/2022]
Abstract
Semantic annotation is a crucial step to assure reusability and reproducibility of biosimulation models in biology and physiology. For this purpose, the COmputational Modeling in BIology NEtwork (COMBINE) community recommends the use of the Resource Description Framework (RDF). This grounding in RDF provides the flexibility to enable searching for entities within models (e.g., variables, equations, or entire models) by utilizing the RDF query language SPARQL. However, the rigidity and complexity of the SPARQL syntax and the nature of the tree-like structure of semantic annotations are challenging for users. Therefore, we propose NLIMED, an interface that converts natural language queries into SPARQL. We use this interface to query and discover model entities from repositories of biosimulation models. NLIMED works with the Physiome Model Repository (PMR) and the BioModels database and potentially other repositories annotated using RDF. Natural language queries are first “chunked” into phrases and annotated against ontology classes and predicates utilizing different natural language processing tools. Then, the ontology classes and predicates are composed as SPARQL and finally ranked using our SPARQL Composer and our indexing system. We demonstrate that NLIMED's approach for chunking and annotating queries is more effective than the NCBO Annotator for identifying relevant ontology classes in natural language queries. Comparison of NLIMED's behavior against historical query records in the PMR shows that it can adapt appropriately to queries associated with well-annotated models.
Affiliation(s)
- Yuda Munarko
  - Auckland Bioengineering Institute, University of Auckland, Auckland, New Zealand
  - *Correspondence: Yuda Munarko
- Dewan M. Sarwar
  - Auckland Bioengineering Institute, University of Auckland, Auckland, New Zealand
- Anand Rampadarath
  - Auckland Bioengineering Institute, University of Auckland, Auckland, New Zealand
- Koray Atalag
  - Auckland Bioengineering Institute, University of Auckland, Auckland, New Zealand
- John H. Gennari
  - Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, WA, United States
- Maxwell L. Neal
  - Center for Global Infectious Disease Research, Seattle Children's Research Institute, Seattle, WA, United States
- David P. Nickerson
  - Auckland Bioengineering Institute, University of Auckland, Auckland, New Zealand
7
Gawthrop PJ, Pan M, Crampin EJ. Modular dynamic biomolecular modelling with bond graphs: the unification of stoichiometry, thermodynamics, kinetics and data. J R Soc Interface 2021; 18:20210478. [PMID: 34428949 PMCID: PMC8385351 DOI: 10.1098/rsif.2021.0478] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 06/09/2021] [Accepted: 08/02/2021] [Indexed: 12/14/2022]
Abstract
Renewed interest in dynamic simulation models of biomolecular systems has arisen from advances in genome-wide measurement and applications of such models in biotechnology and synthetic biology. In particular, genome-scale models of cellular metabolism beyond the steady state are required in order to represent transient and dynamic regulatory properties of the system. Development of such whole-cell models requires new modelling approaches. Here, we propose the energy-based bond graph methodology, which integrates stoichiometric models with thermodynamic principles and kinetic modelling. We demonstrate how the bond graph approach intrinsically enforces thermodynamic constraints, provides a modular approach to modelling, and gives a basis for estimation of model parameters leading to dynamic models of biomolecular systems. The approach is illustrated using a well-established stoichiometric model of Escherichia coli and published experimental data.
Affiliation(s)
- Peter J. Gawthrop
  - Systems Biology Laboratory, School of Mathematics and Statistics, and Department of Biomedical Engineering, University of Melbourne, Victoria 3010, Australia
- Michael Pan
  - Systems Biology Laboratory, School of Mathematics and Statistics, and Department of Biomedical Engineering, University of Melbourne, Victoria 3010, Australia
  - ARC Centre of Excellence in Convergent Bio-Nano Science and Technology, School of Chemical and Biomedical Engineering, University of Melbourne, Victoria 3010, Australia
- Edmund J. Crampin
  - Systems Biology Laboratory, School of Mathematics and Statistics, and Department of Biomedical Engineering, University of Melbourne, Victoria 3010, Australia
  - ARC Centre of Excellence in Convergent Bio-Nano Science and Technology, School of Chemical and Biomedical Engineering, University of Melbourne, Victoria 3010, Australia
8
Porubsky V, Smith L, Sauro HM. Publishing reproducible dynamic kinetic models. Brief Bioinform 2021; 22:bbaa152. [PMID: 32793969 PMCID: PMC8138891 DOI: 10.1093/bib/bbaa152] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/07/2020] [Revised: 05/19/2020] [Accepted: 06/17/2020] [Indexed: 11/14/2022]
Abstract
Publishing repeatable and reproducible computational models is a crucial aspect of the scientific method in computational biology and one that is often forgotten in the rush to publish. The pressures of academic life and the lack of any reward system at institutions, granting agencies and journals means that publishing reproducible science is often either non-existent or, at best, presented in the form of an incomplete description. In the article, we will focus on repeatability and reproducibility in the systems biology field where a great many published models cannot be reproduced and in many cases even repeated. This review describes the current landscape of software tooling, model repositories, model standards and best practices for publishing repeatable and reproducible kinetic models. The review also discusses possible future remedies including working more closely with journals to help reviewers and editors ensure that published kinetic models are at minimum, repeatable. Contact: hsauro@uw.edu.
Affiliation(s)
- Veronica Porubsky
  - Department of Bioengineering, University of Washington, Seattle, 98105, USA
- Lucian Smith
  - Department of Bioengineering, University of Washington, Seattle, 98105, USA
- Herbert M Sauro
  - Department of Bioengineering, University of Washington, Seattle, 98105, USA
9
Shahidi N, Pan M, Safaei S, Tran K, Crampin EJ, Nickerson DP. Hierarchical semantic composition of biosimulation models using bond graphs. PLoS Comput Biol 2021; 17:e1008859. [PMID: 33983945 PMCID: PMC8148364 DOI: 10.1371/journal.pcbi.1008859] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 03/11/2021] [Revised: 05/25/2021] [Accepted: 04/27/2021] [Indexed: 11/19/2022]
Abstract
Simulating complex biological and physiological systems and predicting their behaviours under different conditions remains challenging. Breaking systems into smaller and more manageable modules can address this challenge, assisting both model development and simulation. Nevertheless, existing computational models in biology and physiology are often not modular and therefore difficult to assemble into larger models. Even when this is possible, the resulting model may not be useful due to inconsistencies either with the laws of physics or the physiological behaviour of the system. Here, we propose a general methodology for composing models, combining the energy-based bond graph approach with semantics-based annotations. This approach improves model composition and ensures that a composite model is physically plausible. As an example, we demonstrate this approach to automated model composition using a model of human arterial circulation. The major benefit is that modellers can spend more time on understanding the behaviour of complex biological and physiological systems and less time wrangling with model composition.
Affiliation(s)
- Niloofar Shahidi
  - Auckland Bioengineering Institute, The University of Auckland, Auckland, New Zealand
- Michael Pan
  - Systems Biology Laboratory, School of Mathematics and Statistics, and Department of Biomedical Engineering, University of Melbourne, Melbourne, Victoria, Australia
  - ARC Centre of Excellence in Convergent Bio-Nano Science and Technology, Faculty of Engineering and Information Technology, University of Melbourne, Melbourne, Victoria, Australia
- Soroush Safaei
  - Auckland Bioengineering Institute, The University of Auckland, Auckland, New Zealand
- Kenneth Tran
  - Auckland Bioengineering Institute, The University of Auckland, Auckland, New Zealand
- Edmund J. Crampin
  - Systems Biology Laboratory, School of Mathematics and Statistics, and Department of Biomedical Engineering, University of Melbourne, Melbourne, Victoria, Australia
  - ARC Centre of Excellence in Convergent Bio-Nano Science and Technology, Faculty of Engineering and Information Technology, University of Melbourne, Melbourne, Victoria, Australia
- David P. Nickerson
  - Auckland Bioengineering Institute, The University of Auckland, Auckland, New Zealand
10
Waltemath D, Golebiewski M, Blinov ML, Gleeson P, Hermjakob H, Hucka M, Inau ET, Keating SM, König M, Krebs O, Malik-Sheriff RS, Nickerson D, Oberortner E, Sauro HM, Schreiber F, Smith L, Stefan MI, Wittig U, Myers CJ. The first 10 years of the international coordination network for standards in systems and synthetic biology (COMBINE). J Integr Bioinform 2020; 17:jib-2020-0005. [PMID: 32598315 PMCID: PMC7756615 DOI: 10.1515/jib-2020-0005] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 02/18/2020] [Accepted: 05/14/2020] [Indexed: 01/23/2023]
Abstract
This paper presents a report on outcomes of the 10th Computational Modeling in Biology Network (COMBINE) meeting that was held in Heidelberg, Germany, in July of 2019. The annual event brings together researchers, biocurators and software engineers to present recent results and discuss future work in the area of standards for systems and synthetic biology. The COMBINE initiative coordinates the development of various community standards and formats for computational models in the life sciences. Over the past 10 years, COMBINE has brought together standard communities that have further developed and harmonized their standards for better interoperability of models and data. COMBINE 2019 was co-located with a stakeholder workshop of the European EU-STANDS4PM initiative that aims at harmonized data and model standardization for in silico models in the field of personalized medicine, as well as with the FAIRDOM PALs meeting to discuss findable, accessible, interoperable and reusable (FAIR) data sharing. This report briefly describes the work discussed in invited and contributed talks as well as during breakout sessions. It also highlights recent advancements in data, model, and annotation standardization efforts. Finally, this report concludes with some challenges and opportunities that this community will face during the next 10 years.
Affiliation(s)
- Dagmar Waltemath
  - Medical Informatics, University Medicine Greifswald, Greifswald, Germany
- Martin Golebiewski
  - Heidelberg Institute for Theoretical Studies (HITS), Heidelberg, Germany
- Padraig Gleeson
  - Department of Neuroscience, Physiology and Pharmacology, University College London, London, UK
- Michael Hucka
  - Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA, USA
- Esther Thea Inau
  - Medical Informatics, University Medicine Greifswald, Greifswald, Germany
- Matthias König
  - Institute for Theoretical Biology, Humboldt-University Berlin, Berlin, Germany
- Olga Krebs
  - Heidelberg Institute for Theoretical Studies (HITS), Heidelberg, Germany
- David Nickerson
  - Auckland Bioengineering Institute, University of Auckland, Auckland, New Zealand
- Ernst Oberortner
  - U.S. Department of Energy (DOE) Joint Genome Institute (JGI), Lawrence Berkeley National Labs, Berkeley, CA, USA
- Herbert M Sauro
  - Department of Bioengineering, University of Washington, Seattle, WA, USA
- Falk Schreiber
  - Department of Computer and Information Science, University of Konstanz, Germany
  - Faculty of IT, Monash University, Melbourne, VIC, Australia
- Lucian Smith
  - Department of Bioengineering, University of Washington, Seattle, WA, USA
- Melanie I Stefan
  - Centre for Discovery Brain Sciences, The University of Edinburgh, Edinburgh, UK
  - ZJU-UoE Institute, Zhejiang University, Haining, China
  - University of Utah, Salt Lake City, UT, USA
- Ulrike Wittig
  - Heidelberg Institute for Theoretical Studies (HITS), Heidelberg, Germany
- Chris J Myers
  - Centre for Discovery Brain Sciences, The University of Edinburgh, Edinburgh, UK
11
Malik-Sheriff RS, Glont M, Nguyen TVN, Tiwari K, Roberts MG, Xavier A, Vu MT, Men J, Maire M, Kananathan S, Fairbanks EL, Meyer JP, Arankalle C, Varusai TM, Knight-Schrijver V, Li L, Dueñas-Roca C, Dass G, Keating SM, Park YM, Buso N, Rodriguez N, Hucka M, Hermjakob H. BioModels-15 years of sharing computational models in life science. Nucleic Acids Res 2020; 48:D407-D415. [PMID: 31701150 PMCID: PMC7145643 DOI: 10.1093/nar/gkz1055] [Citation(s) in RCA: 122] [Impact Index Per Article: 30.5] [Received: 09/22/2019] [Revised: 10/22/2019] [Accepted: 11/06/2019] [Indexed: 01/05/2023]
Abstract
Computational modelling has become increasingly common in life science research. To provide a platform to support universal sharing, easy accessibility and model reproducibility, BioModels (https://www.ebi.ac.uk/biomodels/), a repository for mathematical models, was established in 2005. The current BioModels platform allows submission of models encoded in diverse modelling formats, including SBML, CellML, PharmML, COMBINE archive, MATLAB, Mathematica, R, Python or C++. The models submitted to BioModels are curated to verify the computational representation of the biological process and the reproducibility of the simulation results in the reference publication. The curation also involves encoding models in standard formats and annotation with controlled vocabularies following MIRIAM (minimal information required in the annotation of biochemical models) guidelines. BioModels now accepts large-scale submission of auto-generated computational models. With gradual growth in content over 15 years, BioModels currently hosts about 2000 models from the published literature. With about 800 curated models, BioModels has become the world’s largest repository of curated models and emerged as the third most used data resource after PubMed and Google Scholar among the scientists who use modelling in their research. Thus, BioModels benefits modellers by providing access to reliable and semantically enriched curated models in standard formats that are easy to share, reproduce and reuse.
Affiliation(s)
- Rahuman S Malik-Sheriff
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Mihai Glont
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Tung V N Nguyen
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Krishna Tiwari
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK; Babraham Institute, Babraham Research Campus, Cambridge CB22 3AT, UK
- Matthew G Roberts
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Ashley Xavier
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Manh T Vu
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Jinghao Men
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Matthieu Maire
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Sarubini Kananathan
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Emma L Fairbanks
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Johannes P Meyer
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Chinmay Arankalle
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Thawfeek M Varusai
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Lu Li
- Babraham Institute, Babraham Research Campus, Cambridge CB22 3AT, UK
- Corina Dueñas-Roca
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Gaurhari Dass
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Sarah M Keating
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Young M Park
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Nicola Buso
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Nicolas Rodriguez
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK; Babraham Institute, Babraham Research Campus, Cambridge CB22 3AT, UK
- Michael Hucka
- California Institute of Technology, Pasadena, 91125, CA, USA
- Henning Hermjakob
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK; State Key Laboratory of Proteomics, Beijing Proteome Research Center, Beijing Institute of Lifeomics, National Center for Protein Sciences (The PHOENIX Center, Beijing), Beijing 102206, China
12
Lang PF, Chebaro Y, Zheng X, P Sekar JA, Shaikh B, Natale DA, Karr JR. BpForms and BcForms: a toolkit for concretely describing non-canonical polymers and complexes to facilitate global biochemical networks. Genome Biol 2020; 21:117. [PMID: 32423472 PMCID: PMC7236495 DOI: 10.1186/s13059-020-02025-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 10/15/2019] [Accepted: 04/16/2020] [Indexed: 12/12/2022]
Abstract
Non-canonical residues, caps, crosslinks, and nicks are important to many functions of DNAs, RNAs, proteins, and complexes. However, we do not fully understand how networks of such non-canonical macromolecules generate behavior. One barrier is our limited formats for describing macromolecules. To overcome this barrier, we develop BpForms and BcForms, a toolkit for representing the primary structure of macromolecules as combinations of residues, caps, crosslinks, and nicks. The toolkit can help omics researchers perform quality control and exchange information about macromolecules, help systems biologists assemble global models of cells that encompass processes such as post-translational modification, and help bioengineers design cells.
Affiliation(s)
- Paul F Lang
- Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, 10029, NY, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, 10029, NY, USA
- Department of Biochemistry, University of Oxford, South Parks Road, Oxford, OX1 3QU, UK
- Yassmine Chebaro
- Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, 10029, NY, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, 10029, NY, USA
- Institut de Génétique et de Biologie Moléculaire et Cellulaire, Institut National de la Santé et de la Recherche Médicale, Centre National de la Recherche Scientifique, Université de Strasbourg, Illkirch, 67404, France
- Xiaoyue Zheng
- Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, 10029, NY, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, 10029, NY, USA
- John A P Sekar
- Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, 10029, NY, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, 10029, NY, USA
- Bilal Shaikh
- Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, 10029, NY, USA
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, 10029, NY, USA
- Darren A Natale
- Protein Information Resource, Georgetown University Medical Center, Washington, DC, 20007, USA
- Jonathan R Karr
- Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, 10029, NY, USA.
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, 10029, NY, USA.
13
Sarwar DM, Kalbasi R, Gennari JH, Carlson BE, Neal ML, Bono BD, Atalag K, Hunter PJ, Nickerson DP. Model annotation and discovery with the Physiome Model Repository. BMC Bioinformatics 2019; 20:457. [PMID: 31492098 PMCID: PMC6731580 DOI: 10.1186/s12859-019-2987-y] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 03/24/2019] [Accepted: 07/09/2019] [Indexed: 12/12/2022]
Abstract
BACKGROUND Mathematics- and physics-based simulation models have the potential to help interpret and encapsulate biological phenomena in a computable and reproducible form. Similarly, comprehensive descriptions of such models help to ensure that they are accessible, discoverable, and reusable. To this end, researchers have developed tools and standards to encode mathematical models of biological systems enabling reproducibility and reuse, tools and guidelines to facilitate semantic description of mathematical models, and repositories in which to archive, share, and discover models. Scientists can leverage these resources to investigate specific questions and hypotheses in a more efficient manner. RESULTS We have comprehensively annotated a cohort of models with biological semantics. These annotated models are freely available in the Physiome Model Repository (PMR). To demonstrate the benefits of this approach, we have developed a web-based tool which enables users to discover models relevant to their work, with a particular focus on epithelial transport. Based on a semantic query, this tool will help users discover relevant models, suggesting similar or alternative models that the user may wish to explore or use. CONCLUSION The semantic annotation and the web tool we have developed are a new contribution enabling scientists to discover relevant models in the PMR as candidates for reuse in their own scientific endeavours. This approach demonstrates how semantic web technologies and methodologies can contribute to biomedical and clinical research. The source code and links to the web tool are available at https://github.com/dewancse/model-discovery-tool.
Affiliation(s)
- Dewan M Sarwar
- Auckland Bioengineering Institute, University of Auckland, Auckland, New Zealand
- Reza Kalbasi
- Auckland Bioengineering Institute, University of Auckland, Auckland, New Zealand
- John H Gennari
- Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, Washington, USA
- Brian E Carlson
- Molecular & Integrative Physiology, University of Michigan, Ann Arbor, Michigan, USA
- Maxwell L Neal
- Center for Global Infectious Disease Research, Seattle Children's Research Institute, Seattle, Washington, USA
- Bernard de Bono
- Auckland Bioengineering Institute, University of Auckland, Auckland, New Zealand
- Koray Atalag
- Auckland Bioengineering Institute, University of Auckland, Auckland, New Zealand
- Peter J Hunter
- Auckland Bioengineering Institute, University of Auckland, Auckland, New Zealand
- David P Nickerson
- Auckland Bioengineering Institute, University of Auckland, Auckland, New Zealand.
14
Mısırlı G, Taylor R, Goñi-Moreno A, McLaughlin JA, Myers C, Gennari JH, Lord P, Wipat A. SBOL-OWL: An Ontological Approach for Formal and Semantic Representation of Synthetic Biology Information. ACS Synth Biol 2019; 8:1498-1514. [PMID: 31059645 DOI: 10.1021/acssynbio.8b00532] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 12/30/2022]
Abstract
Standard representation of data is key for the reproducibility of designs in synthetic biology. The Synthetic Biology Open Language (SBOL) has already emerged as a data standard to represent information about genetic circuits, and it is based on capturing data using graphs. The language provides the syntax using a free text document that is accessible to humans only. This paper describes SBOL-OWL, an ontology for a machine understandable definition of SBOL. This ontology acts as a semantic layer for genetic circuit designs. As a result, computational tools can understand the meaning of design entities in addition to parsing structured SBOL data. SBOL-OWL not only describes how genetic circuits can be constructed computationally, it also facilitates the use of several existing Semantic Web tools for synthetic biology. This paper demonstrates some of these features, for example, to validate designs and check for inconsistencies. Through the use of SBOL-OWL, queries can be simplified and become more intuitive. Moreover, existing reasoners can be used to infer information about genetic circuit designs that cannot be directly retrieved using existing querying mechanisms. This ontological representation of the SBOL standard provides a new perspective to the verification, representation, and querying of information about genetic circuits and is important to incorporate complex design information via the integration of biological ontologies.
Affiliation(s)
- Göksel Mısırlı
- School of Computing and Mathematics, Keele University, Keele, Staffordshire ST5 5BG, UK
- Renee Taylor
- School of Computing and Mathematics, Keele University, Keele, Staffordshire ST5 5BG, UK
- Angel Goñi-Moreno
- School of Computing, Newcastle University, Newcastle upon Tyne NE4 5TG, UK
- Chris Myers
- Department of Electrical and Computer Engineering, University of Utah, Salt Lake City, Utah 84112, United States
- John H. Gennari
- Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, Washington 98195, United States
- Phillip Lord
- School of Computing, Newcastle University, Newcastle upon Tyne NE4 5TG, UK
- Anil Wipat
- School of Computing, Newcastle University, Newcastle upon Tyne NE4 5TG, UK