1
Yang PC, Purawat S, Ieong PU, Jeng MT, DeMarco KR, Vorobyov I, McCulloch AD, Altintas I, Amaro RE, Clancy CE. A demonstration of modularity, reuse, reproducibility, portability and scalability for modeling and simulation of cardiac electrophysiology using Kepler Workflows. PLoS Comput Biol 2019; 15:e1006856. [PMID: 30849072 PMCID: PMC6426265 DOI: 10.1371/journal.pcbi.1006856] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 08/21/2018] [Revised: 03/20/2019] [Accepted: 02/08/2019] [Indexed: 01/18/2023] Open
Abstract
Multi-scale computational modeling is a major branch of computational biology, as evidenced by the US federal interagency Multi-Scale Modeling Consortium and major international projects. It invariably involves specific and detailed sequences of data analysis and simulation, often with multiple tools and datasets, and the community recognizes improved modularity, reuse, reproducibility, portability and scalability as critical unmet needs in this area. Scientific workflows are a well-recognized strategy for addressing these needs in scientific computing. While there are good examples of the use of scientific workflows in bioinformatics, medical informatics, biomedical imaging and data analysis, there are fewer examples in multi-scale computational modeling in general and cardiac electrophysiology in particular. Cardiac electrophysiology simulation is a mature area of multi-scale computational biology that serves as an excellent use case for developing and testing new scientific workflows. In this article, we develop, describe and test a computational workflow that serves as a proof of concept of a platform for the robust integration and implementation of a reusable and reproducible multi-scale cardiac cell and tissue model that is expandable, modular and portable. The workflow described leverages Python and the Kepler-Python actor for plotting and pre/post-processing. During all stages of the workflow design, we rely on freely available open-source tools to make our workflow freely usable by scientists. We present a computational workflow as a proof of concept for integration and implementation of a reusable and reproducible cardiac multi-scale electrophysiology model that is expandable, modular and portable. This framework enables scientists to create intuitive, user-friendly and flexible end-to-end automated scientific workflows using a graphical user interface. Kepler is an advanced open-source platform that supports multiple models of computation.
The underlying workflow engine handles scalability, provenance and reproducibility aspects of the code, performs orchestration of data flow, and automates execution on heterogeneous computing resources. One of the main advantages of workflow utilization is the integration of code written in multiple languages. Standardization occurs at the interfaces of the workflow elements and allows for general applications and easy comparison and integration of code from different research groups, or even from multiple programmers in the same group coding in different languages for various purposes. A workflow-driven problem-solving approach enables domain scientists to focus on resolving the core science questions and delegates the computational and process management burden to the underlying workflow. The workflow-driven approach allows scaling the computational experiment with distributed data-parallel execution on multiple computing platforms, such as HPC resources, GPU clusters and the cloud. The workflow framework tracks software version information along with hardware information, giving users an opportunity to trace any variation in workflow outcome to the system configuration.
Affiliation(s)
- Pei-Chi Yang
  - Department of Physiology and Membrane Biology, Department of Pharmacology, School of Medicine, University of California Davis, Davis, California, United States of America
- Shweta Purawat
  - San Diego Supercomputer Center (SDSC), University of California, San Diego, La Jolla, California, United States of America
- Pek U. Ieong
  - Department of Chemistry and Biochemistry, National Biomedical Computation Resource, Drug Design Data Resource (D3R), University of California San Diego, La Jolla, California, United States of America
- Mao-Tsuen Jeng
  - Department of Physiology and Membrane Biology, Department of Pharmacology, School of Medicine, University of California Davis, Davis, California, United States of America
- Kevin R. DeMarco
  - Department of Physiology and Membrane Biology, Department of Pharmacology, School of Medicine, University of California Davis, Davis, California, United States of America
- Igor Vorobyov
  - Department of Physiology and Membrane Biology, Department of Pharmacology, School of Medicine, University of California Davis, Davis, California, United States of America
- Andrew D. McCulloch
  - Departments of Bioengineering and Medicine, University of California, San Diego, La Jolla, California, United States of America
- Ilkay Altintas
  - San Diego Supercomputer Center (SDSC), University of California, San Diego, La Jolla, California, United States of America
- Rommie E. Amaro
  - Department of Chemistry and Biochemistry, National Biomedical Computation Resource, Drug Design Data Resource (D3R), University of California San Diego, La Jolla, California, United States of America
- Colleen E. Clancy
  - Department of Physiology and Membrane Biology, Department of Pharmacology, School of Medicine, University of California Davis, Davis, California, United States of America
2
Dräger A, Palsson BØ. Improving collaboration by standardization efforts in systems biology. Front Bioeng Biotechnol 2014; 2:61. [PMID: 25538939 PMCID: PMC4259112 DOI: 10.3389/fbioe.2014.00061] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.4] [Received: 09/01/2014] [Accepted: 11/14/2014] [Indexed: 11/17/2022] Open
Abstract
Collaborative genome-scale reconstruction endeavors of metabolic networks would not be possible without a common, standardized formal representation of these systems. The ability to precisely define biological building blocks together with their dynamic behavior has even been considered a prerequisite for upcoming synthetic biology approaches. Driven by the requirements of such ambitious research goals, standardization itself has become an active field of research on nearly all levels of granularity in biology. In addition to the originally envisaged exchange of computational models and tool interoperability, new standards have been suggested for an unambiguous graphical display of biological phenomena, to annotate, archive, as well as to rank models, and to describe execution and the outcomes of simulation experiments. The spectrum now even covers the interaction of entire neurons in the brain, three-dimensional motions, and the description of pharmacometric studies. Thereby, the mathematical description of systems and approaches for their (repeated) simulation are clearly separated from each other and also from their graphical representation. Minimum information definitions constitute guidelines and common operation protocols in order to ensure reproducibility of findings and a unified knowledge representation. Central database infrastructures have been established that provide the scientific community with persistent links from model annotations to online resources. A rich variety of open-source software tools thrives for all data formats, often supporting a multitude of programming languages. Regular meetings and workshops of developers and users lead to continuous improvement and ongoing development of these standardization efforts. This article gives a brief overview of the current state of the growing number of operation protocols, mark-up languages, graphical descriptions, and fundamental software support with relevance to systems biology.
Affiliation(s)
- Andreas Dräger
  - Systems Biology Research Group, Department of Bioengineering, University of California, San Diego, La Jolla, CA, USA
  - Cognitive Systems, Center for Bioinformatics Tübingen (ZBIT), Department of Computer Science, University of Tübingen, Tübingen, Germany
- Bernhard Ø. Palsson
  - Systems Biology Research Group, Department of Bioengineering, University of California, San Diego, La Jolla, CA, USA