1
Ziemann M, Poulain P, Bora A. The five pillars of computational reproducibility: bioinformatics and beyond. Brief Bioinform 2023; 24:bbad375. [PMID: 37870287 PMCID: PMC10591307 DOI: 10.1093/bib/bbad375] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2023] [Revised: 09/26/2023] [Accepted: 09/30/2023] [Indexed: 10/24/2023] Open
Abstract
Computational reproducibility is a simple premise in theory, but is difficult to achieve in practice. Building upon past efforts and proposals to maximize reproducibility and rigor in bioinformatics, we present a framework called the five pillars of reproducible computational research. These include (1) literate programming, (2) code version control and sharing, (3) compute environment control, (4) persistent data sharing and (5) documentation. These practices will ensure that computational research work can be reproduced quickly and easily, long into the future. This guide is designed for bioinformatics data analysts and bioinformaticians in training, but should be relevant to other domains of study.
Affiliation(s)
- Mark Ziemann
  - Deakin University, School of Life and Environmental Sciences, Geelong, Australia
  - Burnet Institute, Melbourne, Australia
- Pierre Poulain
  - Université Paris Cité, CNRS, Institut Jacques Monod, Paris, France
- Anusuiya Bora
  - Deakin University, School of Life and Environmental Sciences, Geelong, Australia
2
Deng ZL, Zhou DZ, Cao SJ, Li Q, Zhang JF, Xie H. Development and Validation of an Inflammatory Response-Related Gene Signature for Predicting the Prognosis of Pancreatic Adenocarcinoma. Inflammation 2022; 45:1732-1751. [PMID: 35322324 DOI: 10.1007/s10753-022-01657-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2021] [Revised: 02/27/2022] [Accepted: 03/01/2022] [Indexed: 11/05/2022]
Abstract
Pancreatic adenocarcinoma (PAAD) is a highly malignant tumor of the digestive tract that is difficult to diagnose and treat, and whose prognosis is difficult to predict. Tumors and inflammation affect each other, so the inflammatory response in the microenvironment can influence the prognosis. So far, the prognostic value of inflammatory response-related genes in PAAD is still unclear. Therefore, this study aimed to explore inflammatory response-related genes for predicting the prognosis of PAAD. In this study, the mRNA expression profiles of PAAD patients and the corresponding clinical data were downloaded from public databases. The least absolute shrinkage and selection operator (LASSO) Cox analysis model was used to identify and construct the prognostic gene signature in The Cancer Genome Atlas (TCGA) cohort. Patients from the International Cancer Genome Consortium (ICGC) cohort were used for validation. The Kaplan-Meier method was used to compare the overall survival (OS) between the high- and low-risk groups. Univariate and multivariate Cox analyses were performed to identify the independent predictors of OS. Gene set enrichment analysis (GSEA) was performed to obtain gene ontology (GO) terms and the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, and the correlation between gene expression and immune infiltrates was investigated via single sample gene set enrichment analysis (ssGSEA). The GEPIA database was used to examine prognostic genes in PAAD. LASSO Cox regression analysis was used to construct a model of the inflammatory response-related gene signature. Compared with the low-risk group, patients in the high-risk group had significantly lower OS. The receiver operating characteristic (ROC) curve analysis confirmed the signature's predictive capacity. Multivariate Cox analysis showed that the risk score is an independent predictor of OS.
Functional analysis showed that the immune status of the two risk groups differed significantly and that cancer-related pathways were abundant in the high-risk group. Moreover, the risk score was significantly related to tumor grade, stage, and immune infiltration types. The expression level of the prognostic genes was also significantly correlated with the sensitivity of cancer cells to anti-tumor drugs. In addition, expression differed significantly between PAAD tissues and adjacent non-tumor tissues. The novel signature constructed from five inflammatory response-related genes can be used to predict prognosis and affect the immune status of PAAD. In addition, suppressing these genes may be a treatment option.
Affiliation(s)
- Zu-Liang Deng
  - Department of Radiation Oncology, Affiliated Hospital (Clinical College) of Xiangnan University, Chenzhou, 423000, People's Republic of China
- Ding-Zhong Zhou
  - Department of Interventional Vascular Surgery, Affiliated Hospital (Clinical College) of Xiangnan University, Chenzhou, 423000, People's Republic of China
- Su-Juan Cao
  - Department of Oncology, Affiliated Hospital (Clinical College) of Xiangnan University, Chenzhou, 423000, People's Republic of China
- Qing Li
  - Department of Interventional Vascular Surgery, Affiliated Hospital (Clinical College) of Xiangnan University, Chenzhou, 423000, People's Republic of China
- Jian-Fang Zhang
  - Department of Physical Examination, Beihu Centers for Disease Control and Prevention, Chenzhou, 423000, People's Republic of China
- Hui Xie
  - Department of Radiation Oncology, Affiliated Hospital (Clinical College) of Xiangnan University, Chenzhou, 423000, People's Republic of China
3
Cudmore P, Pan M, Gawthrop PJ, Crampin EJ. Analysing and simulating energy-based models in biology using BondGraphTools. Eur Phys J E Soft Matter 2021; 44:148. [PMID: 34904197 DOI: 10.1140/epje/s10189-021-00152-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 03/25/2021] [Accepted: 11/18/2021] [Indexed: 06/14/2023]
Abstract
Like all physical systems, biological systems are constrained by the laws of physics. However, mathematical models of biochemistry frequently neglect the conservation of energy, leading to unrealistic behaviour. Energy-based models that are consistent with conservation of mass, charge and energy have the potential to aid the understanding of complex interactions between biological components, and are becoming easier to develop with recent advances in experimental measurements and databases. In this paper, we motivate the use of bond graphs (a modelling tool from engineering) for energy-based modelling and introduce BondGraphTools, a Python library for constructing and analysing bond graph models. We use examples from biochemistry to illustrate how BondGraphTools can be used to automate model construction in systems biology while maintaining consistency with the laws of physics.
Affiliation(s)
- Peter Cudmore
  - Systems Biology Laboratory, School of Mathematics and Statistics, Department of Biomedical Engineering, University of Melbourne, Parkville, VIC, 3010, Australia
  - ARC Centre of Excellence in Convergent Bio-Nano Science and Technology, Faculty of Engineering and Information Technology, University of Melbourne, Parkville, VIC, 3010, Australia
- Michael Pan
  - Systems Biology Laboratory, School of Mathematics and Statistics, Department of Biomedical Engineering, University of Melbourne, Parkville, VIC, 3010, Australia
  - ARC Centre of Excellence in Convergent Bio-Nano Science and Technology, Faculty of Engineering and Information Technology, University of Melbourne, Parkville, VIC, 3010, Australia
- Peter J Gawthrop
  - Systems Biology Laboratory, School of Mathematics and Statistics, Department of Biomedical Engineering, University of Melbourne, Parkville, VIC, 3010, Australia
- Edmund J Crampin
  - Systems Biology Laboratory, School of Mathematics and Statistics, Department of Biomedical Engineering, University of Melbourne, Parkville, VIC, 3010, Australia
  - ARC Centre of Excellence in Convergent Bio-Nano Science and Technology, Faculty of Engineering and Information Technology, University of Melbourne, Parkville, VIC, 3010, Australia
  - School of Medicine, Faculty of Medicine, Dentistry and Health Sciences, University of Melbourne, Parkville, VIC, 3010, Australia
4
Shahidi N, Pan M, Safaei S, Tran K, Crampin EJ, Nickerson DP. Hierarchical semantic composition of biosimulation models using bond graphs. PLoS Comput Biol 2021; 17:e1008859. [PMID: 33983945 PMCID: PMC8148364 DOI: 10.1371/journal.pcbi.1008859] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 03/11/2021] [Revised: 05/25/2021] [Accepted: 04/27/2021] [Indexed: 11/19/2022] Open
Abstract
Simulating complex biological and physiological systems and predicting their behaviours under different conditions remains challenging. Breaking systems into smaller and more manageable modules can address this challenge, assisting both model development and simulation. Nevertheless, existing computational models in biology and physiology are often not modular and therefore difficult to assemble into larger models. Even when this is possible, the resulting model may not be useful due to inconsistencies either with the laws of physics or the physiological behaviour of the system. Here, we propose a general methodology for composing models, combining the energy-based bond graph approach with semantics-based annotations. This approach improves model composition and ensures that a composite model is physically plausible. As an example, we demonstrate this approach to automated model composition using a model of human arterial circulation. The major benefit is that modellers can spend more time on understanding the behaviour of complex biological and physiological systems and less time wrangling with model composition.
Affiliation(s)
- Niloofar Shahidi
  - Auckland Bioengineering Institute, The University of Auckland, Auckland, New Zealand
- Michael Pan
  - Systems Biology Laboratory, School of Mathematics and Statistics, and Department of Biomedical Engineering, University of Melbourne, Melbourne, Victoria, Australia
  - ARC Centre of Excellence in Convergent Bio-Nano Science and Technology, Faculty of Engineering and Information Technology, University of Melbourne, Melbourne, Victoria, Australia
- Soroush Safaei
  - Auckland Bioengineering Institute, The University of Auckland, Auckland, New Zealand
- Kenneth Tran
  - Auckland Bioengineering Institute, The University of Auckland, Auckland, New Zealand
- Edmund J. Crampin
  - Systems Biology Laboratory, School of Mathematics and Statistics, and Department of Biomedical Engineering, University of Melbourne, Melbourne, Victoria, Australia
  - ARC Centre of Excellence in Convergent Bio-Nano Science and Technology, Faculty of Engineering and Information Technology, University of Melbourne, Melbourne, Victoria, Australia
- David P. Nickerson
  - Auckland Bioengineering Institute, The University of Auckland, Auckland, New Zealand
5
Schaduangrat N, Lampa S, Simeon S, Gleeson MP, Spjuth O, Nantasenamat C. Towards reproducible computational drug discovery. J Cheminform 2020; 12:9. [PMID: 33430992 PMCID: PMC6988305 DOI: 10.1186/s13321-020-0408-x] [Citation(s) in RCA: 85] [Impact Index Per Article: 21.3] [Received: 07/17/2019] [Accepted: 01/02/2020] [Indexed: 12/11/2022] Open
Abstract
The reproducibility of experiments has been a long-standing impediment to further scientific progress. Computational methods have been instrumental in drug discovery efforts owing to their multifaceted utilization for data collection, pre-processing, analysis and inference. This article provides in-depth coverage of the reproducibility of computational drug discovery. This review explores the following topics: (1) the current state-of-the-art on reproducible research, (2) research documentation (e.g. electronic laboratory notebook, Jupyter notebook, etc.), (3) science of reproducible research (i.e. comparison and contrast with related concepts such as replicability, reusability and reliability), (4) model development in computational drug discovery, (5) computational issues on model development and deployment, (6) use case scenarios for streamlining the computational drug discovery protocol. In computational disciplines, it has become common practice to share data and programming codes used for numerical calculations, not only to facilitate reproducibility but also to foster collaborations (i.e. to drive the project further by introducing new ideas, growing the data, augmenting the code, etc.). It is therefore inevitable that the field of computational drug design would adopt an open approach towards the collection, curation and sharing of data/code.
Affiliation(s)
- Nalini Schaduangrat
  - Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, 10700, Bangkok, Thailand
- Samuel Lampa
  - Department of Pharmaceutical Biosciences, Uppsala University, 751 24, Uppsala, Sweden
- Saw Simeon
  - Interdisciplinary Graduate Program in Bioscience, Faculty of Science, Kasetsart University, 10900, Bangkok, Thailand
- Matthew Paul Gleeson
  - Department of Biomedical Engineering, Faculty of Engineering, King Mongkut's Institute of Technology Ladkrabang, 10520, Bangkok, Thailand
- Ola Spjuth
  - Department of Pharmaceutical Biosciences, Uppsala University, 751 24, Uppsala, Sweden
- Chanin Nantasenamat
  - Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, 10700, Bangkok, Thailand
6
Pan M, Gawthrop PJ, Tran K, Cursons J, Crampin EJ. A thermodynamic framework for modelling membrane transporters. J Theor Biol 2018; 481:10-23. [PMID: 30273576 DOI: 10.1016/j.jtbi.2018.09.034] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 06/12/2018] [Revised: 09/24/2018] [Accepted: 09/27/2018] [Indexed: 12/18/2022]
Abstract
Membrane transporters contribute to the regulation of the internal environment of cells by translocating substrates across cell membranes. Like all physical systems, the behaviour of membrane transporters is constrained by the laws of thermodynamics. However, many mathematical models of transporters, especially those incorporated into whole-cell models, are not thermodynamically consistent, leading to unrealistic behaviour. In this paper we use a physics-based modelling framework, in which the transfer of energy is explicitly accounted for, to develop thermodynamically consistent models of transporters. We then apply this methodology to model two specific transporters: the cardiac sarcoplasmic/endoplasmic Ca²⁺ ATPase (SERCA) and the cardiac Na⁺/K⁺ ATPase.
Affiliation(s)
- Michael Pan
  - Systems Biology Laboratory, School of Mathematics and Statistics, and Department of Biomedical Engineering, Melbourne School of Engineering, University of Melbourne, Parkville, Victoria 3010, Australia
- Peter J Gawthrop
  - Systems Biology Laboratory, School of Mathematics and Statistics, and Department of Biomedical Engineering, Melbourne School of Engineering, University of Melbourne, Parkville, Victoria 3010, Australia
- Kenneth Tran
  - Auckland Bioengineering Institute, University of Auckland, New Zealand
- Joseph Cursons
  - Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria 3052, Australia
  - Department of Medical Biology, School of Medicine, University of Melbourne, Parkville, Victoria 3010, Australia
- Edmund J Crampin
  - Systems Biology Laboratory, School of Mathematics and Statistics, and Department of Biomedical Engineering, Melbourne School of Engineering, University of Melbourne, Parkville, Victoria 3010, Australia
  - School of Medicine, Faculty of Medicine, Dentistry and Health Sciences, University of Melbourne, Parkville, Victoria 3010, Australia
  - ARC Centre of Excellence in Convergent Bio-Nano Science and Technology, Melbourne School of Engineering, University of Melbourne, Parkville, Victoria 3010, Australia
7
Gawthrop PJ, Siekmann I, Kameneva T, Saha S, Ibbotson MR, Crampin EJ. Bond graph modelling of chemoelectrical energy transduction. IET Syst Biol 2017; 11:127-138. [PMCID: PMC8687425 DOI: 10.1049/iet-syb.2017.0006] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 01/17/2017] [Revised: 04/25/2017] [Accepted: 05/23/2017] [Indexed: 07/20/2023] Open
Abstract
Energy‐based bond graph modelling of biomolecular systems is extended to include chemoelectrical transduction, thus enabling integrated thermodynamically compliant modelling of chemoelectrical systems in general and excitable membranes in particular. Our general approach is illustrated by recreating a well‐known model of an excitable membrane. This model is used to investigate the energy consumed during a membrane action potential, thus contributing to the current debate on the trade‐off between the speed of an action potential event and energy consumption. The influx of Na⁺ is often taken as a proxy for energy consumption; in contrast, this study presents an energy‐based model of action potentials. As the energy‐based approach avoids the assumptions underlying the proxy approach, it can be directly used to compute energy consumption in both healthy and diseased neurons. These results are illustrated by comparing the energy consumption of healthy and degenerative retinal ganglion cells using both simulated and in vitro data.
Affiliation(s)
- Peter J. Gawthrop
  - Department of Biomedical Engineering, University of Melbourne, Parkville, VIC, Australia
- Ivo Siekmann
  - Institute for Mathematical Stochastics, University of Göttingen, Göttingen, Germany
- Tatiana Kameneva
  - Department of Biomedical Engineering, University of Melbourne, Parkville, VIC, Australia
- Susmita Saha
  - National Vision Research Institute, Australian College of Optometry, Carlton, VIC, Australia
- Michael R. Ibbotson
  - National Vision Research Institute, Australian College of Optometry, Carlton, VIC, Australia
  - Centre of Excellence for Integrative Brain Function, Dept. Optometry and Vision Sciences, University of Melbourne, Parkville, VIC, Australia
- Edmund J. Crampin
  - Department of Biomedical Engineering, University of Melbourne, Parkville, VIC, Australia
  - School of Mathematics and Statistics, University of Melbourne, Parkville, VIC 3010, Australia
  - School of Medicine, University of Melbourne, Parkville, VIC 3010, Australia
8
Gawthrop PJ, Crampin EJ. Energy-based analysis of biomolecular pathways. Proc Math Phys Eng Sci 2017; 473:20160825. [PMID: 28690404 DOI: 10.1098/rspa.2016.0825] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 11/08/2016] [Accepted: 05/26/2017] [Indexed: 01/03/2023] Open
Abstract
Decomposition of biomolecular reaction networks into pathways is a powerful approach to the analysis of metabolic and signalling networks. Current approaches based on analysis of the stoichiometric matrix reveal information about steady-state mass flows (reaction rates) through the network. In this work, we show how pathway analysis of biomolecular networks can be extended using an energy-based approach to provide information about energy flows through the network. This energy-based approach is developed using the engineering-inspired bond graph methodology to represent biomolecular reaction networks. The approach is introduced using glycolysis as an exemplar; and is then applied to analyse the efficiency of free energy transduction in a biomolecular cycle model of a transporter protein [sodium-glucose transport protein 1 (SGLT1)]. The overall aim of our work is to present a framework for modelling and analysis of biomolecular reactions and processes which considers energy flows and losses as well as mass transport.
Affiliation(s)
- Peter J Gawthrop
  - Systems Biology Laboratory, Melbourne School of Engineering, University of Melbourne, Victoria 3010, Australia
- Edmund J Crampin
  - Systems Biology Laboratory, Melbourne School of Engineering, University of Melbourne, Victoria 3010, Australia
  - School of Mathematics and Statistics, Melbourne School of Engineering, University of Melbourne, Victoria 3010, Australia
  - School of Medicine, Melbourne School of Engineering, University of Melbourne, Victoria 3010, Australia
  - ARC Centre of Excellence in Convergent Bio-Nano Science, Melbourne School of Engineering, University of Melbourne, Victoria 3010, Australia
9
Gawthrop PJ. Bond Graph Modeling of Chemiosmotic Biomolecular Energy Transduction. IEEE Trans Nanobioscience 2017; 16:177-188. [PMID: 28252411 DOI: 10.1109/tnb.2017.2674683] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Indexed: 11/08/2022]
Abstract
Engineering systems modeling and analysis based on the bond graph approach has been applied to biomolecular systems. In this context, the notion of a Faraday-equivalent chemical potential is introduced, which allows chemical potential to be expressed in an analogous manner to electrical volts, thus allowing engineering intuition to be applied to biomolecular systems. Redox reactions, and their representation by half-reactions, are key components of biological systems which involve both electrical and chemical domains. A bond graph interpretation of redox reactions is given which combines bond graphs with the Faraday-equivalent chemical potential. This approach is particularly relevant when the biomolecular system implements chemoelectrical transduction, for example chemiosmosis within the key metabolic pathway of mitochondria: oxidative phosphorylation. An alternative way of implementing computational modularity using bond graphs is introduced and used to give a physically based model of the mitochondrial electron transport chain. To illustrate the overall approach, this model is analyzed using the Faraday-equivalent chemical potential approach, and engineering intuition is used to guide affinity equalisation: an energy-based analysis of the mitochondrial electron transport chain.
10
Denaxas S, Direk K, Gonzalez-Izquierdo A, Pikoula M, Cakiroglu A, Moore J, Hemingway H, Smeeth L. Methods for enhancing the reproducibility of biomedical research findings using electronic health records. BioData Min 2017; 10:31. [PMID: 28912836 PMCID: PMC5594436 DOI: 10.1186/s13040-017-0151-7] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 04/10/2017] [Accepted: 08/28/2017] [Indexed: 01/07/2023] Open
Abstract
BACKGROUND: The ability of external investigators to reproduce published scientific findings is critical for the evaluation and validation of biomedical research by the wider community. However, a substantial proportion of health research using electronic health records (EHR), data collected and generated during clinical care, is potentially not reproducible, mainly because the implementation details of most data preprocessing, cleaning, phenotyping and analysis approaches are not systematically made available or shared. With the complexity, volume and variety of electronic health record data sources made available for research steadily increasing, it is critical to ensure that scientific findings from EHR data are reproducible and replicable by researchers. Reporting guidelines, such as RECORD and STROBE, have set a solid foundation by recommending a series of items for researchers to include in their research outputs. Researchers, however, often lack the technical tools and methodological approaches to actuate such recommendations in an efficient and sustainable manner.
RESULTS: In this paper, we review and propose a series of methods and tools utilized in adjunct scientific disciplines that can be used to enhance the reproducibility of research using electronic health records and enable researchers to report analytical approaches in a transparent manner. Specifically, we discuss the adoption of scientific software engineering principles and best practices such as test-driven development, source code revision control systems, literate programming and the standardization and re-use of common data management and analytical approaches.
CONCLUSION: The adoption of such approaches will enable scientists to systematically document and share EHR analytical workflows and increase the reproducibility of biomedical research using such complex data sources.
Affiliation(s)
- Spiros Denaxas
  - Institute of Health Informatics, University College London, 222 Euston Road, London, NW1 2DA, UK
  - Farr Institute of Health Informatics Research, 222 Euston Road, London, UK
- Kenan Direk
  - Institute of Health Informatics, University College London, 222 Euston Road, London, NW1 2DA, UK
  - Farr Institute of Health Informatics Research, 222 Euston Road, London, UK
- Arturo Gonzalez-Izquierdo
  - Institute of Health Informatics, University College London, 222 Euston Road, London, NW1 2DA, UK
  - Farr Institute of Health Informatics Research, 222 Euston Road, London, UK
- Maria Pikoula
  - Institute of Health Informatics, University College London, 222 Euston Road, London, NW1 2DA, UK
  - Farr Institute of Health Informatics Research, 222 Euston Road, London, UK
- Aylin Cakiroglu
  - The Francis Crick Institute, 1 Midland Road, London, NW1 1AT, UK
- Jason Moore
  - Institute of Biomedical Informatics, University of Pennsylvania, Richards Medical Research Laboratories, 3700 Hamilton Walk, Philadelphia, 19104, USA
- Harry Hemingway
  - Institute of Health Informatics, University College London, 222 Euston Road, London, NW1 2DA, UK
  - Farr Institute of Health Informatics Research, 222 Euston Road, London, UK
- Liam Smeeth
  - EHR Research Group, Department of Non-communicable Disease Epidemiology, London School of Hygiene and Tropical Medicine, Keppel Street, London, WC1E 7HT, UK
11
Cui J, Faria M, Björnmalm M, Ju Y, Suma T, Gunawan ST, Richardson JJ, Heidari H, Bals S, Crampin EJ, Caruso F. A Framework to Account for Sedimentation and Diffusion in Particle-Cell Interactions. Langmuir 2016; 32:12394-12402. [PMID: 27384770 DOI: 10.1021/acs.langmuir.6b01634] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Indexed: 05/18/2023]
Abstract
In vitro experiments provide a solid basis for understanding the interactions between particles and biological systems. An important confounding variable for these studies is the difference between the amount of particles administered and that which reaches the surface of cells. Here, we engineer a hydrogel-based nanoparticle system and combine in situ characterization techniques, 3D-printed cell cultures, and computational modeling to evaluate and study particle-cell interactions of advanced particle systems. The framework presented demonstrates how sedimentation and diffusion can explain differences in particle-cell association, and provides a means to account for these effects. Finally, using in silico modeling, we predict the proportion of particles that reaches the cell surface using common experimental conditions for a wide range of inorganic and organic micro- and nanoparticles. This work can assist in the understanding and control of sedimentation and diffusion when investigating cellular interactions of engineered particles.
Affiliation(s)
- Jiwei Cui
  - ARC Centre of Excellence in Convergent Bio-Nano Science and Technology, and the Department of Chemical and Biomolecular Engineering, The University of Melbourne, Parkville, Victoria 3010, Australia
- Matthew Faria
  - ARC Centre of Excellence in Convergent Bio-Nano Science and Technology, and the Department of Chemical and Biomolecular Engineering, The University of Melbourne, Parkville, Victoria 3010, Australia
  - ARC Centre of Excellence in Convergent Bio-Nano Science and Technology, and the Systems Biology Laboratory, Melbourne School of Engineering, The University of Melbourne, Parkville, Victoria 3010, Australia
- Mattias Björnmalm
  - ARC Centre of Excellence in Convergent Bio-Nano Science and Technology, and the Department of Chemical and Biomolecular Engineering, The University of Melbourne, Parkville, Victoria 3010, Australia
- Yi Ju
  - ARC Centre of Excellence in Convergent Bio-Nano Science and Technology, and the Department of Chemical and Biomolecular Engineering, The University of Melbourne, Parkville, Victoria 3010, Australia
- Tomoya Suma
  - ARC Centre of Excellence in Convergent Bio-Nano Science and Technology, and the Department of Chemical and Biomolecular Engineering, The University of Melbourne, Parkville, Victoria 3010, Australia
- Sylvia T Gunawan
  - ARC Centre of Excellence in Convergent Bio-Nano Science and Technology, and the Department of Chemical and Biomolecular Engineering, The University of Melbourne, Parkville, Victoria 3010, Australia
- Joseph J Richardson
  - ARC Centre of Excellence in Convergent Bio-Nano Science and Technology, and the Department of Chemical and Biomolecular Engineering, The University of Melbourne, Parkville, Victoria 3010, Australia
- Hamed Heidari
  - Electron Microscopy for Materials Research (EMAT), University of Antwerp, Groenenborgerlaan 171, 2020 Antwerp, Belgium
- Sara Bals
  - Electron Microscopy for Materials Research (EMAT), University of Antwerp, Groenenborgerlaan 171, 2020 Antwerp, Belgium
- Edmund J Crampin
  - ARC Centre of Excellence in Convergent Bio-Nano Science and Technology, and the Systems Biology Laboratory, Melbourne School of Engineering, The University of Melbourne, Parkville, Victoria 3010, Australia
- Frank Caruso
  - ARC Centre of Excellence in Convergent Bio-Nano Science and Technology, and the Department of Chemical and Biomolecular Engineering, The University of Melbourne, Parkville, Victoria 3010, Australia
12
Andrews MC, Cursons J, Hurley DG, Anaka M, Cebon JS, Behren A, Crampin EJ. Systems analysis identifies miR-29b regulation of invasiveness in melanoma. Mol Cancer 2016; 15:72. [PMID: 27852308 PMCID: PMC5112703 DOI: 10.1186/s12943-016-0554-y] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 06/02/2016] [Accepted: 10/31/2016] [Indexed: 02/08/2023] Open
Abstract
Background: In many cancers, microRNAs (miRs) contribute to metastatic progression by modulating phenotypic reprogramming processes such as epithelial-mesenchymal plasticity. This can be driven by miRs targeting multiple mRNA transcripts, inducing regulated changes across large sets of genes. The miR-target databases TargetScan and DIANA-microT predict putative relationships by examining sequence complementarity between miRs and mRNAs. However, it remains a challenge to identify which miR-mRNA interactions are active at endogenous expression levels, and of biological consequence.
Methods: We developed a workflow to integrate TargetScan and DIANA-microT predictions into the analysis of data-driven associations calculated from transcript abundance (RNASeq) data, specifically the mutual information and Pearson’s correlation metrics. We use this workflow to identify putative relationships of miR-mediated mRNA repression with strong support from both lines of evidence. Applying this approach systematically to a large, published collection of unique melanoma cell lines, the Ludwig Melbourne melanoma (LM-MEL) cell line panel, we identified putative miR-mRNA interactions that may contribute to invasiveness. This guided the selection of interactions of interest for further in vitro validation studies.
Results: Several miR-mRNA regulatory relationships supported by TargetScan and DIANA-microT demonstrated differential activity across cell lines of varying matrigel invasiveness. Strong negative statistical associations for these putative regulatory relationships were consistent with target mRNA inhibition by the miR, and suggest that differential activity of such miR-mRNA relationships contributes to differences in melanoma invasiveness. Many of these relationships were reflected across the skin cutaneous melanoma TCGA dataset, indicating that these observations also show graded activity across clinical samples. Several of these miRs are implicated in cancer progression (miR-211, -340, -125b, -221, and -29b). The specific role for miR-29b-3p in melanoma has not been well studied. We experimentally validated the predicted miR-29b-3p regulation of LAMC1, PPIC and LASP1, and show that dysregulation of miR-29b-3p or these mRNA targets can influence cellular invasiveness in vitro.
Conclusions: This analytic strategy provides a comprehensive, systems-level approach to identify miR-mRNA regulation in high-throughput cancer data, identifies novel putative interactions with functional phenotypic relevance, and can be used to direct experimental resources for subsequent experimental validation. Computational scripts are available at http://github.com/uomsystemsbiology/LMMEL-miR-miner
Electronic supplementary material: The online version of this article (doi:10.1186/s12943-016-0554-y) contains supplementary material, which is available to authorized users.
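The filtering idea described in the Methods of this abstract (keep only miR-mRNA pairs that are predicted by both target databases and that also show a strong negative statistical association in expression data) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code, which is at the repository linked in the abstract. The function names, the histogram-based mutual information estimate, the thresholds `r_max` and `mi_min`, and the synthetic expression data are all assumptions made for this example.

```python
import math
import random
from collections import Counter


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def _bin_index(values, bins):
    """Assign each value to one of `bins` equal-width bins over its range."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # guard against constant input
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y, bins=4):
    """Simple histogram-based estimate of mutual information, in bits."""
    bx, by = _bin_index(x, bins), _bin_index(y, bins)
    n = len(x)
    joint, px, py = Counter(zip(bx, by)), Counter(bx), Counter(by)
    return sum(
        (c / n) * math.log2(c * n / (px[i] * py[j]))
        for (i, j), c in joint.items()
    )


def supported_pairs(mir_expr, mrna_expr, predicted_a, predicted_b,
                    r_max=-0.5, mi_min=0.2):
    """Keep pairs predicted by both databases with a strong negative association.

    The thresholds are hypothetical placeholders, not values from the paper.
    """
    hits = []
    for mir, mrna in sorted(predicted_a & predicted_b):
        r = pearson_r(mir_expr[mir], mrna_expr[mrna])
        mi = mutual_information(mir_expr[mir], mrna_expr[mrna])
        if r <= r_max and mi >= mi_min:
            hits.append((mir, mrna, r, mi))
    return hits


# Synthetic demonstration data (hypothetical names, not real measurements).
rng = random.Random(0)
mir = [rng.gauss(0, 1) for _ in range(200)]
mir_expr = {"miR-x": mir}
mrna_expr = {
    "GENE1": [-v + 0.1 * rng.gauss(0, 1) for v in mir],  # repressed by miR-x
    "GENE2": [rng.gauss(0, 1) for _ in mir],             # unrelated noise
}
database_a = {("miR-x", "GENE1"), ("miR-x", "GENE2")}
database_b = {("miR-x", "GENE1"), ("miR-x", "GENE2")}

hits = supported_pairs(mir_expr, mrna_expr, database_a, database_b)
print(hits)
```

In this toy setup GENE1 is constructed to be anti-correlated with the miR while GENE2 is independent noise, so the two lines of evidence disagree for GENE2 even though both hypothetical databases predict it. A real analysis would need a more careful mutual information estimator and a significance test rather than fixed cut-offs.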
Affiliation(s)
- Miles C Andrews
  - Olivia Newton-John Cancer Research Institute, Heidelberg, VIC, 3084, Australia; Ludwig Institute for Cancer Research, Melbourne-Austin Branch, Cancer Immunobiology Laboratory, Heidelberg, VIC, 3084, Australia; School of Cancer Medicine, La Trobe University, Heidelberg, VIC, 3084, Australia; Department of Medicine, University of Melbourne, Parkville, VIC, 3010, Australia
- Joseph Cursons
  - Systems Biology Laboratory, University of Melbourne, Parkville, VIC, 3010, Australia; ARC Centre of Excellence in Convergent Bio-Nano Science, University of Melbourne, Parkville, VIC, 3010, Australia; School of Mathematics and Statistics, University of Melbourne, Parkville, VIC, 3010, Australia; Centre for Systems Genomics, University of Melbourne, Parkville, VIC, 3010, Australia
- Daniel G Hurley
  - Systems Biology Laboratory, University of Melbourne, Parkville, VIC, 3010, Australia; School of Mathematics and Statistics, University of Melbourne, Parkville, VIC, 3010, Australia; Centre for Systems Genomics, University of Melbourne, Parkville, VIC, 3010, Australia
- Matthew Anaka
  - Ludwig Institute for Cancer Research, Melbourne-Austin Branch, Cancer Immunobiology Laboratory, Heidelberg, VIC, 3084, Australia; Department of Medicine, University of Toronto, Toronto, ON, Canada
- Jonathan S Cebon
  - Olivia Newton-John Cancer Research Institute, Heidelberg, VIC, 3084, Australia; Ludwig Institute for Cancer Research, Melbourne-Austin Branch, Cancer Immunobiology Laboratory, Heidelberg, VIC, 3084, Australia; School of Cancer Medicine, La Trobe University, Heidelberg, VIC, 3084, Australia; Department of Medicine, University of Melbourne, Parkville, VIC, 3010, Australia
- Andreas Behren
  - Olivia Newton-John Cancer Research Institute, Heidelberg, VIC, 3084, Australia; Ludwig Institute for Cancer Research, Melbourne-Austin Branch, Cancer Immunobiology Laboratory, Heidelberg, VIC, 3084, Australia; School of Cancer Medicine, La Trobe University, Heidelberg, VIC, 3084, Australia
- Edmund J Crampin
  - Department of Medicine, University of Melbourne, Parkville, VIC, 3010, Australia; Systems Biology Laboratory, University of Melbourne, Parkville, VIC, 3010, Australia; ARC Centre of Excellence in Convergent Bio-Nano Science, University of Melbourne, Parkville, VIC, 3010, Australia; School of Mathematics and Statistics, University of Melbourne, Parkville, VIC, 3010, Australia; Centre for Systems Genomics, University of Melbourne, Parkville, VIC, 3010, Australia
|
13
|
Abstract
When reporting research findings, scientists document the steps they followed so that others can verify and build upon the research. When those steps have been described in sufficient detail that others can retrace the steps and obtain similar results, the research is said to be reproducible. Computers play a vital role in many research disciplines and present both opportunities and challenges for reproducibility. Computers can be programmed to execute analysis tasks, and those programs can be repeated and shared with others. The deterministic nature of most computer programs means that the same analysis tasks, applied to the same data, will often produce the same outputs. However, in practice, computational findings often cannot be reproduced because of complexities in how software is packaged, installed, and executed, and because of limitations associated with how scientists document analysis steps. Many tools and techniques are available to help overcome these challenges; here we describe seven such strategies. With a broad scientific audience in mind, we describe the strengths and limitations of each approach, as well as the circumstances under which each might be applied. No single strategy is sufficient for every scenario; thus we emphasize that it is often useful to combine approaches.
Affiliation(s)
- Stephen R Piccolo
  - Department of Biology, Brigham Young University, Provo, UT, 84602, USA
- Michael B Frampton
  - Department of Computer Science, Brigham Young University, Provo, UT, USA
|
14
|
Abstract
High-throughput bioinformatic analyses increasingly rely on pipeline frameworks to process sequence and metadata. Modern implementations of these frameworks differ on three key dimensions: using an implicit or explicit syntax, using a configuration, convention or class-based design paradigm and offering a command line or workbench interface. Here I survey and compare the design philosophies of several current pipeline frameworks. I provide practical recommendations based on analysis requirements and the user base.
Affiliation(s)
- Jeremy Leipzig
  - Department of Biomedical and Health Informatics, The Children’s Hospital of Philadelphia, 3535 Market Street, Room 1063, Philadelphia, PA, USA
  - Corresponding author: Jeremy Leipzig, Department of Biomedical and Health Informatics, The Children’s Hospital of Philadelphia, 3535 Market Street, Room 1063, Philadelphia, PA 19104, USA. Tel.: +12154261375; Fax: +12155905245
|
15
|
Cursons J, Angel CE, Hurley DG, Print CG, Dunbar PR, Jacobs MD, Crampin EJ. Spatially transformed fluorescence image data for ERK-MAPK and selected proteins within human epidermis. Gigascience 2015; 4:63. [PMID: 26675891 PMCID: PMC4678632 DOI: 10.1186/s13742-015-0102-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 09/01/2015] [Accepted: 12/03/2015] [Indexed: 12/20/2022] Open
Abstract
Background: Phosphoprotein signalling pathways have been intensively studied in vitro, yet their role in regulating tissue homeostasis is not fully understood. In the skin, interfollicular keratinocytes differentiate over approximately 2 weeks as they traverse the epidermis. The extracellular signal-regulated kinase (ERK) branch of the mitogen-activated protein kinase (MAPK) pathway has been implicated in this process. Therefore, we examined ERK-MAPK activity within human epidermal keratinocytes in situ.
Findings: We used confocal microscopy and immunofluorescence labelling to measure the relative abundances of Raf-1, MEK1/2 and ERK1/2, and their phosphorylated (active) forms within three human skin samples. Additionally, we measured the abundance of selected proteins thought to modulate ERK-MAPK activity, including calmodulin, β1 integrin and stratifin (14-3-3σ); and of transcription factors known to act as effectors of ERK1/2, including the AP-1 components Jun-B, Fra2 and c-Fos. Imaging was performed with sufficient resolution to identify the plasma membrane, cytoplasm and nucleus as distinct domains within cells across the epidermis. The image field of view was also sufficiently large to capture the entire epidermis in cross-section, and thus the full range of keratinocyte differentiation in a single observation. Image processing methods were developed to quantify image data for mathematical and statistical analysis. Here, we provide raw image data and processed outputs.
Conclusions: These data indicate coordinated changes in ERK-MAPK signalling activity throughout the depth of the epidermis, with changes in relative phosphorylation-mediated signalling activity occurring along the gradient of cellular differentiation. We believe these data provide unique information about intracellular signalling as they are obtained from a homeostatic human tissue, and they might be useful for investigating intercellular heterogeneity.
Electronic supplementary material: The online version of this article (doi:10.1186/s13742-015-0102-5) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Joseph Cursons
  - Systems Biology Laboratory, Melbourne School of Engineering, University of Melbourne, Parkville, VIC Australia, 3010; ARC Centre of Excellence in Convergent Bio-Nano Science and Technology, University of Melbourne, Parkville, Australia, 3010
- Catherine E Angel
  - Maurice Wilkins Centre, University of Auckland, Auckland, New Zealand; School of Biological Sciences, University of Auckland, Auckland, New Zealand
- Daniel G Hurley
  - Systems Biology Laboratory, Melbourne School of Engineering, University of Melbourne, Parkville, VIC Australia, 3010
- Cristin G Print
  - Maurice Wilkins Centre, University of Auckland, Auckland, New Zealand; School of Biological Sciences, University of Auckland, Auckland, New Zealand; Bioinformatics Institute, University of Auckland, Auckland, New Zealand; Faculty of Medical and Health Sciences, University of Auckland, Auckland, New Zealand
- P Rod Dunbar
  - Maurice Wilkins Centre, University of Auckland, Auckland, New Zealand; School of Biological Sciences, University of Auckland, Auckland, New Zealand
- Marc D Jacobs
  - Department of Biology, New Zealand International College, ACG New Zealand, Auckland, New Zealand
- Edmund J Crampin
  - Systems Biology Laboratory, Melbourne School of Engineering, University of Melbourne, Parkville, VIC Australia, 3010; ARC Centre of Excellence in Convergent Bio-Nano Science and Technology, University of Melbourne, Parkville, Australia, 3010; School of Mathematics and Statistics, University of Melbourne, Parkville, Australia, 3010; School of Medicine, University of Melbourne, Parkville, Australia, 3010
|
16
|
Gawthrop PJ, Cursons J, Crampin EJ. Hierarchical bond graph modelling of biochemical networks. Proc Math Phys Eng Sci 2015. [DOI: 10.1098/rspa.2015.0642] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.9] [Indexed: 12/11/2022] Open
Abstract
The bond graph approach to modelling biochemical networks is extended to allow hierarchical construction of complex models from simpler components. This is made possible by representing the simpler components as thermodynamically open systems exchanging mass and energy via ports. A key feature of this approach is that the resultant models are robustly thermodynamically compliant: the thermodynamic compliance is not dependent on precise numerical values of parameters. Moreover, the models are reusable owing to the well-defined interface provided by the energy ports. To extract bond graph model parameters from parameters found in the literature, general and compact formulae are developed to relate free-energy constants and equilibrium constants. The existence and uniqueness of solutions is considered in terms of fundamental properties of stoichiometric matrices. The approach is illustrated by building a hierarchical bond graph model of glycogenolysis in skeletal muscle.
Affiliation(s)
- Peter J. Gawthrop
  - Systems Biology Laboratory, Melbourne School of Engineering, Parkville, Victoria 3010, Australia
- Joseph Cursons
  - Systems Biology Laboratory, Melbourne School of Engineering, Parkville, Victoria 3010, Australia
  - ARC Centre of Excellence in Convergent Bio-Nano Science and Technology, Melbourne School of Engineering, Parkville, Victoria 3010, Australia
- Edmund J. Crampin
  - Systems Biology Laboratory, Melbourne School of Engineering, Parkville, Victoria 3010, Australia
  - ARC Centre of Excellence in Convergent Bio-Nano Science and Technology, Melbourne School of Engineering, Parkville, Victoria 3010, Australia
  - School of Mathematics and Statistics, Parkville, Victoria 3010, Australia
  - School of Medicine, University of Melbourne, Parkville, Victoria 3010, Australia
|
17
|
FlexDM: Simple, parallel and fault-tolerant data mining using WEKA. Source Code for Biology and Medicine 2015; 10:13. [PMID: 26579209 PMCID: PMC4647584 DOI: 10.1186/s13029-015-0045-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 01/20/2015] [Accepted: 11/09/2015] [Indexed: 12/03/2022]
Abstract
Background: With the continued exponential growth in data volume, large-scale data mining and machine learning experiments have become a necessity for many researchers without programming or statistics backgrounds. WEKA (Waikato Environment for Knowledge Analysis) is a gold standard framework that facilitates and simplifies this task by allowing specification of algorithms, hyper-parameters and test strategies from a streamlined Experimenter GUI. Despite its popularity, the WEKA Experimenter exhibits several limitations that we address in our new FlexDM software.
Results: FlexDM addresses four fundamental limitations with the WEKA Experimenter: reliance on a verbose and difficult-to-modify XML schema; inability to meta-optimise experiments over a large number of algorithm hyper-parameters; inability to recover from software or hardware failure during a large experiment; and failing to leverage modern multicore processor architectures. Direct comparisons between the FlexDM and default WEKA XML schemas demonstrate a 10-fold improvement in brevity for a specification that allows finer control of experimental procedures. The stability of FlexDM has been tested on a large biological dataset (approximately 450 k attributes by 150 samples), and automatic parallelisation of tasks yields a quasi-linear reduction in execution time when distributed across multiple processor cores.
Conclusion: FlexDM is a powerful and easy-to-use extension to the WEKA package, which better handles the increased volume and complexity of data that has emerged during the 20 years since WEKA’s original development. FlexDM has been tested on Windows, OSX and Linux operating systems and is provided as a pre-configured virtual reference environment for trivial usage and extensibility. This software can substantially improve the productivity of any research group conducting large-scale data mining or machine learning tasks, in addition to providing non-programmers with improved control over specific aspects of their data analysis pipeline via a succinct and simplified XML schema.
Electronic supplementary material: The online version of this article (doi:10.1186/s13029-015-0045-3) contains supplementary material, which is available to authorized users.
|
18
|
Budden DM, Hurley DG, Crampin EJ. Modelling the conditional regulatory activity of methylated and bivalent promoters. Epigenetics Chromatin 2015; 8:21. [PMID: 26097508 PMCID: PMC4474576 DOI: 10.1186/s13072-015-0013-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 04/10/2015] [Accepted: 06/10/2015] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Predictive modelling of gene expression is a powerful framework for the in silico exploration of transcriptional regulatory interactions through the integration of high-throughput -omics data. A major limitation of previous approaches is their inability to handle conditional interactions that emerge when genes are subject to different regulatory mechanisms. Although chromatin immunoprecipitation-based histone modification data are often used as proxies for chromatin accessibility, the association between these variables and expression often depends upon the presence of other epigenetic markers (e.g. DNA methylation or histone variants). These conditional interactions are poorly handled by previous predictive models and reduce the reliability of downstream biological inference.
RESULTS: We have previously demonstrated that integrating both transcription factor and histone modification data within a single predictive model is rendered ineffective by their statistical redundancy. In this study, we evaluate four proposed methods for quantifying gene-level DNA methylation levels and demonstrate that inclusion of these data in predictive modelling frameworks is also subject to this critical limitation in data integration. Based on the hypothesis that statistical redundancy in epigenetic data is caused by conditional regulatory interactions within a dynamic chromatin context, we construct a new gene expression model which is the first to improve prediction accuracy by unsupervised identification of latent regulatory classes. We show that DNA methylation and H2A.Z histone variant data can be interpreted in this way to identify and explore the signatures of silenced and bivalent promoters, substantially improving genome-wide predictions of mRNA transcript abundance and downstream biological inference across multiple cell lines.
CONCLUSIONS: Previous models of gene expression have been applied successfully to several important problems in molecular biology, including the discovery of transcription factor roles, identification of regulatory elements responsible for differential expression patterns and comparative analysis of the transcriptome across distant species. Our analysis supports our hypothesis that statistical redundancy in epigenetic data is partially due to conditional relationships between these regulators and gene expression levels. This analysis provides insight into the heterogeneous roles of H3K4me3 and H3K27me3 in the presence of the H2A.Z histone variant (implicated in cancer progression) and how these signatures change during lineage commitment and carcinogenesis.
Affiliation(s)
- David M Budden
  - Systems Biology Laboratory, Melbourne School of Engineering, The University of Melbourne, 3010 Parkville, Australia; NICTA Victoria Research Laboratory, The University of Melbourne, 3010 Parkville, Australia
- Daniel G Hurley
  - Systems Biology Laboratory, Melbourne School of Engineering, The University of Melbourne, 3010 Parkville, Australia
- Edmund J Crampin
  - Systems Biology Laboratory, Melbourne School of Engineering, The University of Melbourne, 3010 Parkville, Australia; NICTA Victoria Research Laboratory, The University of Melbourne, 3010 Parkville, Australia; ARC Centre of Excellence in Convergent Bio-Nano Science and Technology, 3010 Parkville, Australia; Department of Mathematics and Statistics, The University of Melbourne, 3010 Parkville, Australia; School of Medicine, The University of Melbourne, 3010 Parkville, Australia
|