1
Sinha S. A pedagogical walkthrough of computational modeling and simulation of Wnt signaling pathway using static causal models in MATLAB. EURASIP Journal on Bioinformatics & Systems Biology 2016; 2017:1. [PMID: 27547217 PMCID: PMC4977324 DOI: 10.1186/s13637-016-0044-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 06/12/2015] [Accepted: 07/22/2016] [Indexed: 12/26/2022]
Abstract
Simulation studies in systems biology involving computational experiments on Wnt signaling pathways abound in the literature but often lack a pedagogical perspective that might ease the understanding of beginner students and researchers in transition who intend to work on the modeling of the pathway. This paucity might be due to restrictive business policies which enforce an unwanted embargo on the sharing of important scientific knowledge. A tutorial introduction to computational modeling of the Wnt signaling pathway in a human colorectal cancer dataset using static Bayesian network models is provided. The walkthrough might aid biologists/informaticians in understanding the design of computational experiments, which is interleaved with exposition of the Matlab code and causal models from the Bayesian network toolbox. The manuscript elucidates the coding contents of the advance article by Sinha (Integr. Biol. 6:1034-1048, 2014) and takes the reader through a step-by-step process of how (a) the collection and the transformation of the available biological information from literature is done, (b) the integration of the heterogeneous data and prior biological knowledge in the network is achieved, (c) the simulation study is designed, (d) the hypothesis regarding a biological phenomenon is transformed into a computational framework, and (e) results and inferences drawn using d-connectivity/separability are reported. The manuscript ends with a programming assignment to help the readers get hands-on experience of a perturbation project. Description of Matlab files is made available under GNU GPL v3 license at the Google code project on https://code.google.com/p/static-bn-for-wnt-signaling-pathway and https://sites.google.com/site/shriprakashsinha/shriprakashsinha/projects/static-bn-for-wnt-signaling-pathway. Latest updates can be found on the latter website.
2
Sinha S. Integration of prior biological knowledge and epigenetic information enhances the prediction accuracy of the Bayesian Wnt pathway. Integr Biol (Camb) 2015; 6:1034-48. [PMID: 25167061 DOI: 10.1039/c4ib00124a] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Indexed: 12/21/2022]
Abstract
Computational modeling of the Wnt signaling pathway has gained prominence for its use as a diagnostic tool to develop therapeutic cancer target drugs and predict test samples as tumorous/normal. Diagnostic tools entail modeling of the biological phenomena behind the pathway while prediction requires inclusion of factors for discriminative classification. This manuscript develops simple static Bayesian network predictive models of varying complexity by encompassing prior partially available biological knowledge about intra/extracellular factors and incorporating information regarding epigenetic modification into a few genes that are known to have an inhibitory effect on the pathway. Incorporation of epigenetic information enhances the prediction accuracy of test samples in human colorectal cancer. In comparison to the Naive Bayes model where β-catenin transcription complex activation predictions are assumed to correspond to sample predictions, the new biologically inspired models shed light on differences in behavior of the transcription complex and the state of samples. Receiver operator curves and their respective area under the curve measurements obtained from predictions of the state of the test sample and the corresponding predictions of the state of activation of the β-catenin transcription complex of the pathway for the test sample indicate a significant difference between the transcription complex being on (off) and its association with the sample being tumorous (normal). The two-sample Kolmogorov-Smirnov test confirms the statistical deviation between the distributions of these predictions. Hitherto unknown relationship between factors like DKK2, DKK3-1 and SFRP-2/3/5 w.r.t. the β-catenin transcription complex has been inferred using these causal models.
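As a hedged illustration of the two-sample Kolmogorov-Smirnov comparison mentioned in this abstract, the sketch below computes the KS statistic (the largest gap between two empirical distribution functions) in plain Python. The function name `ks_statistic` and the two lists of prediction scores are hypothetical and are not taken from the paper; the paper's own analysis was done in Matlab, and no p-value is computed here.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: max |F_a(x) - F_b(x)| over all observed x."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    for x in a + b:
        # Empirical CDFs: fraction of each sample that is <= x.
        f_a = bisect_right(a, x) / len(a)
        f_b = bisect_right(b, x) / len(b)
        d = max(d, abs(f_a - f_b))
    return d


# Hypothetical scores (assumption, for illustration only):
# predicted probability that a test sample is tumorous ...
sample_state = [0.91, 0.85, 0.78, 0.66, 0.95]
# ... and predicted probability that the transcription complex is "on".
complex_state = [0.40, 0.55, 0.35, 0.62, 0.48]

d_stat = ks_statistic(sample_state, complex_state)
print(f"KS statistic D = {d_stat:.2f}")
```

A value of D near 1 means the two sets of predictions barely overlap, while a value near 0 means their distributions are almost identical. Turning D into a significance level requires the KS distribution, which is omitted from this sketch.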
Affiliation(s)
- Shriprakash Sinha
- Netherlands Bioinformatics Centre, 6500 HB, Nijmegen, The Netherlands.
3
Sands B, Jenkins P, Peria WJ, Naivar M, Houston JP, Brent R. Measuring and sorting cell populations expressing isospectral fluorescent proteins with different fluorescence lifetimes. PLoS One 2014; 9:e109940. [PMID: 25302964 PMCID: PMC4193854 DOI: 10.1371/journal.pone.0109940] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 07/03/2014] [Accepted: 09/04/2014] [Indexed: 01/03/2023] [Open Access]
Abstract
Study of signal transduction in live cells benefits from the ability to visualize and quantify light emitted by fluorescent proteins (XFPs) fused to different signaling proteins. However, because cell signaling proteins are often present in small numbers, and because the XFPs themselves are poor fluorophores, the amount of emitted light, and the observable signal in these studies, is often small. An XFP's fluorescence lifetime contains additional information about the immediate environment of the fluorophore that can augment the information from its weak light signal. Here, we constructed and expressed in Saccharomyces cerevisiae variants of Teal Fluorescent Protein (TFP) and Citrine that were isospectral but had shorter fluorescence lifetimes, ∼1.5 ns vs ∼3 ns. We modified microscopic and flow cytometric instruments to measure fluorescence lifetimes in live cells. We developed digital hardware and a measure of lifetime called a “pseudophasor” that we could compute quickly enough to permit sorting by lifetime in flow. We used these abilities to sort mixtures of cells expressing TFP and the short-lifetime TFP variant into subpopulations that were respectively 97% and 94% pure. This work demonstrates the feasibility of using information about fluorescence lifetime to help quantify cell signaling in living cells at the high throughput provided by flow cytometry. Moreover, it demonstrates the feasibility of isolating and recovering subpopulations of cells with different XFP lifetimes for subsequent experimentation.
Affiliation(s)
- Bryan Sands
- Division of Basic Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
- Patrick Jenkins
- Department of Chemical Engineering, New Mexico State University, Las Cruces, New Mexico, United States of America
- William J. Peria
- Division of Basic Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
- Mark Naivar
- Darkling X, LLC, Los Alamos, New Mexico, United States of America
- Jessica P. Houston
- Department of Chemical Engineering, New Mexico State University, Las Cruces, New Mexico, United States of America
- Roger Brent
- Division of Basic Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
4
Piedra D, Ferrer A, Gea J. Text mining and medicine: usefulness in respiratory diseases. Arch Bronconeumol 2014; 50:113-9. [PMID: 24507559 DOI: 10.1016/j.arbres.2013.04.009] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 03/28/2013] [Revised: 04/12/2013] [Accepted: 04/18/2013] [Indexed: 12/24/2022]
Abstract
It is increasingly common to have medical information in electronic format. This includes scientific articles as well as clinical management reviews, and even records from health institutions with patient data. However, traditional instruments, both individual and institutional, are of little use for selecting the most appropriate information in each case, either in the clinical or research field. So-called text or data «mining» enables this huge amount of information to be managed, extracting it from various sources using processing systems (filtration and curation), integrating it and permitting the generation of new knowledge. This review aims to provide an overview of text and data mining, and of the potential usefulness of this bioinformatic technique in the exercise of care in respiratory medicine and in research in the same field.
Affiliation(s)
- David Piedra
- Instituto de Investigación del Hospital del Mar (IMIM), Barcelona, Spain.
- Antoni Ferrer
- Instituto de Investigación del Hospital del Mar (IMIM), Barcelona, Spain; Servicio de Neumología, Hospital del Mar, Barcelona, Spain; Facultat de Ciències de la Salut i de la Vida, Universitat Pompeu Fabra, Barcelona, Spain; CIBERES, ISC III, Bunyola, Mallorca, Spain
- Joaquim Gea
- Instituto de Investigación del Hospital del Mar (IMIM), Barcelona, Spain; Servicio de Neumología, Hospital del Mar, Barcelona, Spain; Facultat de Ciències de la Salut i de la Vida, Universitat Pompeu Fabra, Barcelona, Spain; CIBERES, ISC III, Bunyola, Mallorca, Spain
5
McGuire MF, Sriram Iyengar M, Mercer DW. Data driven linear algebraic methods for analysis of molecular pathways: application to disease progression in shock/trauma. J Biomed Inform 2012; 45:372-87. [PMID: 22200681 PMCID: PMC3346262 DOI: 10.1016/j.jbi.2011.12.002] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 01/26/2011] [Revised: 12/09/2011] [Accepted: 12/10/2011] [Indexed: 12/24/2022]
Abstract
MOTIVATION: Although trauma is the leading cause of death for those below 45 years of age, there is a dearth of information about the temporal behavior of the underlying biological mechanisms in those who survive the initial trauma only to later suffer from syndromes such as multiple organ failure. Levels of serum cytokines potentially affect the clinical outcomes of trauma; understanding how cytokine levels modulate intra-cellular signaling pathways can yield insights into molecular mechanisms of disease progression and help to identify targeted therapies. However, developing such analyses is challenging since it necessitates the integration and interpretation of large amounts of heterogeneous, quantitative and qualitative data. Here we present the Pathway Semantics Algorithm (PSA), an algebraic process of node and edge analyses of evoked biological pathways over time for in silico discovery of biomedical hypotheses, using data from a prospective controlled clinical study of the role of cytokines in multiple organ failure (MOF) at a major US trauma center. A matrix algebra approach was used in both the PSA node and PSA edge analyses with different matrix configurations and computations based on the biomedical questions to be examined. In the edge analysis, a percentage measure of crosstalk called XTALK was also developed to assess cross-pathway interference. RESULTS: In the node/molecular analysis of the first 24 h from trauma, PSA uncovered seven molecules evoked computationally that differentiated outcomes of MOF or non-MOF (NMOF), of which three molecules had not been previously associated with any shock/trauma syndrome. In the edge/molecular interaction analysis, PSA examined four categories of functional molecular interaction relationships (activation, expression, inhibition, and transcription) and found that the interaction patterns and crosstalk changed over time and outcome.
The PSA edge analysis suggests that a diagnosis, prognosis or therapy based on molecular interaction mechanisms may be most effective within a certain time period and for a specific functional relationship.
Affiliation(s)
- Mary F McGuire
- Department of Pathology and Laboratory Medicine, Medical School, University of Texas Health Science Center at Houston, Houston, TX, USA.
6
7
Abstract
This paper summarizes recent advances in causal inference and underscores the paradigmatic shifts that must be undertaken in moving from traditional statistical analysis to causal analysis of multivariate data. Special emphasis is placed on the assumptions that underlie all causal inferences, the languages used in formulating those assumptions, the conditional nature of all causal and counterfactual claims, and the methods that have been developed for the assessment of such claims. These advances are illustrated using a general theory of causation based on the Structural Causal Model (SCM) described in Pearl (2000a), which subsumes and unifies other approaches to causation, and provides a coherent mathematical foundation for the analysis of causes and counterfactuals. In particular, the paper surveys the development of mathematical tools for inferring (from a combination of data and assumptions) answers to three types of causal queries: those about (1) the effects of potential interventions, (2) probabilities of counterfactuals, and (3) direct and indirect effects (also known as "mediation"). Finally, the paper defines the formal and conceptual relationships between the structural and potential-outcome frameworks and presents tools for a symbiotic analysis that uses the strong features of both. The tools are demonstrated in the analyses of mediation, causes of effects, and probabilities of causation.
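The difference between conditioning on a variable and intervening on it, which underlies the first type of causal query listed in this abstract, can be made concrete with a small worked example. The sketch below is a hypothetical three-variable binary model (confounder Z, treatment X, outcome Y) whose probabilities are invented for illustration; none of the numbers or names come from the paper. It computes P(Y=1 | X=1) by ordinary conditioning and P(Y=1 | do(X=1)) by the back-door adjustment formula, using exact enumeration rather than sampling.

```python
# Toy structural causal model with edges Z -> X, Z -> Y, X -> Y.
# All variables are binary; every number below is an assumption.
P_Z1 = 0.5                        # P(Z = 1)
P_X1_GIVEN_Z = {0: 0.2, 1: 0.8}   # P(X = 1 | Z = z)


def p_y1(x, z):
    """P(Y = 1 | X = x, Z = z) for the toy model."""
    return 0.1 + 0.3 * x + 0.5 * z


def p_z(z):
    return P_Z1 if z == 1 else 1.0 - P_Z1


def p_x_given_z(x, z):
    p = P_X1_GIVEN_Z[z]
    return p if x == 1 else 1.0 - p


def observational(x):
    """P(Y = 1 | X = x): weights Z by P(z | x), so confounding leaks in."""
    num = sum(p_y1(x, z) * p_x_given_z(x, z) * p_z(z) for z in (0, 1))
    den = sum(p_x_given_z(x, z) * p_z(z) for z in (0, 1))
    return num / den


def interventional(x):
    """P(Y = 1 | do(X = x)): back-door adjustment, weights Z by P(z)."""
    return sum(p_y1(x, z) * p_z(z) for z in (0, 1))


print(f"P(Y=1 | X=1)     = {observational(1):.2f}")
print(f"P(Y=1 | do(X=1)) = {interventional(1):.2f}")
```

In this toy model the observational quantity is larger than the interventional one because units with X=1 are disproportionately those with Z=1, and Z itself raises Y. Setting X by intervention removes that dependence.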
8
Hull D, Pettifer SR, Kell DB. Defrosting the digital library: bibliographic tools for the next generation web. PLoS Comput Biol 2008; 4:e1000204. [PMID: 18974831 PMCID: PMC2568856 DOI: 10.1371/journal.pcbi.1000204] [Citation(s) in RCA: 92] [Impact Index Per Article: 5.8] [Indexed: 12/02/2022] [Open Access]
Abstract
Many scientists now manage the bulk of their bibliographic information electronically, thereby organizing their publications and citation material from digital libraries. However, a library has been described as "thought in cold storage," and unfortunately many digital libraries can be cold, impersonal, isolated, and inaccessible places. In this Review, we discuss the current chilly state of digital libraries for the computational biologist, including PubMed, IEEE Xplore, the ACM digital library, ISI Web of Knowledge, Scopus, Citeseer, arXiv, DBLP, and Google Scholar. We illustrate the current process of using these libraries with a typical workflow, and highlight problems with managing data and metadata using URIs. We then examine a range of new applications such as Zotero, Mendeley, Mekentosj Papers, MyNCBI, CiteULike, Connotea, and HubMed that exploit the Web to make these digital libraries more personal, sociable, integrated, and accessible places. We conclude with how these applications may begin to help achieve a digital defrost, and discuss some of the issues that will help or hinder this in terms of making libraries on the Web warmer places in the future, becoming resources that are considerably more useful to both humans and machines.
Affiliation(s)
- Duncan Hull
- School of Chemistry, The University of Manchester, Manchester, UK.
9
10
Ananiadou S, Kell DB, Tsujii JI. Text mining and its potential applications in systems biology. Trends Biotechnol 2006; 24:571-9. [PMID: 17045684 DOI: 10.1016/j.tibtech.2006.10.002] [Citation(s) in RCA: 163] [Impact Index Per Article: 9.1] [Received: 07/05/2006] [Revised: 08/21/2006] [Accepted: 10/02/2006] [Indexed: 11/15/2022]
Abstract
With biomedical literature increasing at a rate of several thousand papers per week, it is impossible to keep abreast of all developments; therefore, automated means to manage the information overload are required. Text mining techniques, which involve the processes of information retrieval, information extraction and data mining, provide a means of solving this. By adding meaning to text, these techniques produce a more structured analysis of textual knowledge than simple word searches, and can provide powerful tools for the production and analysis of systems biology models.
Affiliation(s)
- Sophia Ananiadou
- School of Computer Science, National Centre for Text Mining, The Manchester Interdisciplinary Biocentre, The University of Manchester, 131 Princess Street, Manchester M1 7ND, UK.
11
Kell DB. Theodor Bücher Lecture. Metabolomics, modelling and machine learning in systems biology - towards an understanding of the languages of cells. Delivered on 3 July 2005 at the 30th FEBS Congress and the 9th IUBMB conference in Budapest. FEBS J 2006; 273:873-94. [PMID: 16478464 DOI: 10.1111/j.1742-4658.2006.05136.x] [Citation(s) in RCA: 130] [Impact Index Per Article: 7.2] [Indexed: 11/28/2022]
Abstract
The newly emerging field of systems biology involves a judicious interplay between high-throughput 'wet' experimentation, computational modelling and technology development, coupled to the world of ideas and theory. This interplay involves iterative cycles, such that systems biology is not at all confined to hypothesis-dependent studies, with intelligent, principled, hypothesis-generating studies being of high importance and consequently very far from aimless fishing expeditions. I seek to illustrate each of these facets. Novel technology development in metabolomics can increase substantially the dynamic range and number of metabolites that one can detect, and these can be exploited as disease markers and in the consequent and principled generation of hypotheses that are consistent with the data and achieve this in a value-free manner. Much of classical biochemistry and signalling pathway analysis has concentrated on the analyses of changes in the concentrations of intermediates, with 'local' equations - such as that of Michaelis and Menten $v = (V_{\max} \cdot S)/(S + K_m)$ - that describe individual steps being based solely on the instantaneous values of these concentrations. Recent work using single cells (that are not subject to the intellectually unsupportable averaging of the variable displayed by heterogeneous cells possessing nonlinear kinetics) has led to the recognition that some protein signalling pathways may encode their signals not (just) as concentrations (AM or amplitude-modulated in a radio analogy) but via changes in the dynamics of those concentrations (the signals are FM or frequency-modulated). This contributes in principle to a straightforward solution of the crosstalk problem, leads to a profound reassessment of how to understand the downstream effects of dynamic changes in the concentrations of elements in these pathways, and stresses the role of signal processing (and not merely the intermediates) in biological signalling.
It is this signal processing that lies at the heart of understanding the languages of cells. The resolution of many of the modern and postgenomic problems of biochemistry requires the development of a myriad of new technologies (and maybe a new culture), and thus regular input from the physical sciences, engineering, mathematics and computer science. One solution, that we are adopting in the Manchester Interdisciplinary Biocentre (http://www.mib.ac.uk/) and the Manchester Centre for Integrative Systems Biology (http://www.mcisb.org/), is thus to colocate individuals with the necessary combinations of skills. Novel disciplines that require such an integrative approach continue to emerge. These include fields such as chemical genomics, synthetic biology, distributed computational environments for biological data and modelling, single cell diagnostics/bionanotechnology, and computational linguistics/text mining.
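For readers unfamiliar with the Michaelis-Menten relation quoted in this abstract, it can be written out as below, where $v$ is the reaction rate, $S$ the substrate concentration, $V_{\max}$ the limiting rate, and $K_m$ the Michaelis constant. The two limiting cases are standard textbook consequences added here for orientation; they are not part of the abstract.

```latex
\[
  v = \frac{V_{\max}\, S}{S + K_m},
  \qquad
  v\big|_{S = K_m} = \frac{V_{\max}}{2},
  \qquad
  \lim_{S \to \infty} v = V_{\max}.
\]
```

The rate depends only on the instantaneous value of $S$, which is the sense in which the abstract calls such equations 'local'.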
Affiliation(s)
- Douglas B Kell
- School of Chemistry, Faraday Building, The University of Manchester, UK.