1
|
A novel method for predicting the progression rate of ALS disease based on automatic generation of probabilistic causal chains. Artif Intell Med 2020; 107:101879. [PMID: 32828438 DOI: 10.1016/j.artmed.2020.101879] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 09/30/2019] [Revised: 04/17/2020] [Accepted: 05/12/2020] [Indexed: 01/22/2023]
Abstract
Causal discovery is a major concept in biomedical informatics, contributing to the diagnosis, therapy, and prognosis of diseases. In epidemiology and medicine, probabilistic causality approaches are commonly used to find relationships between pathogen and disease, environment and disease, and adverse events and drugs. The Bayesian network (BN) is one of the common approaches to probabilistic causality and is widely used in health care and biomedical science. Because many biomedical applications deal with temporal datasets, the temporal extension of BNs, the dynamic Bayesian network (DBN), is used for such applications. DBNs define probabilistic relationships between parameters at consecutive time points in the form of a graph and have been used successfully in many biomedical applications. This paper introduces a novel method for finding probabilistic causal chains in a temporal dataset with the help of entropy and causal tendency measures. First, a Causal Features Dependency (CFD) matrix is created on the basis of parameter changes in consecutive events of a phenomenon, and a probabilistic causal graph is then constructed from this matrix based on entropy criteria. Next, a set of probabilistic causal chains of the corresponding causal graph is constructed by a novel polynomial-time heuristic. Finally, the causal chains are used to predict the future trend of the phenomenon. The proposed model was applied to the Pooled Resource Open-Access Clinical Trials (PRO-ACT) dataset for amyotrophic lateral sclerosis (ALS) in order to predict the progression rate of the disease. Comparison with Bayesian tree, random forest, support vector regression, linear regression, and multivariate regression shows that the proposed algorithm can compete with these methods and in some cases outperforms them. This study revealed that probabilistic causality is an appropriate approach for predicting the future states of chronic diseases with unknown cause.
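The idea of relating changes at consecutive time points and scoring them with entropy can be illustrated with a minimal Python sketch. This is not the paper's CFD matrix or its chain-building heuristic, neither of which is specified in the abstract; it only estimates P(next | previous) for one discretised parameter by counting and then computes the conditional entropy of the next change. The `visits` sequence and all function names are hypothetical.

```python
from collections import Counter, defaultdict
from math import log2


def transition_counts(states):
    """Count how often each state is followed by each next state."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(states, states[1:]):
        counts[prev][nxt] += 1
    return counts


def transition_probs(counts):
    """Normalise each row of counts into P(next | prev)."""
    return {
        prev: {nxt: c / sum(row.values()) for nxt, c in row.items()}
        for prev, row in counts.items()
    }


def conditional_entropy(counts):
    """H(next | prev) in bits, weighting each row by how often prev occurs."""
    total = sum(sum(row.values()) for row in counts.values())
    entropy = 0.0
    for row in counts.values():
        row_total = sum(row.values())
        row_entropy = -sum(
            (c / row_total) * log2(c / row_total) for c in row.values()
        )
        entropy += (row_total / total) * row_entropy
    return entropy


# Hypothetical trend of one clinical score across six visits (assumed data).
visits = ["up", "up", "down", "up", "up", "down"]
counts = transition_counts(visits)
probs = transition_probs(counts)
h = conditional_entropy(counts)
```

A low conditional entropy means the previous change is a strong predictor of the next one, which is the kind of signal an entropy criterion could use when deciding whether to keep an edge.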
|
2
|
Muir P, Hans EC, Racette M, Volstad N, Sample SJ, Heaton C, Holzman G, Schaefer SL, Bloom DD, Bleedorn JA, Hao Z, Amene E, Suresh M, Hematti P. Autologous Bone Marrow-Derived Mesenchymal Stem Cells Modulate Molecular Markers of Inflammation in Dogs with Cruciate Ligament Rupture. PLoS One 2016; 11:e0159095. [PMID: 27575050 PMCID: PMC5005014 DOI: 10.1371/journal.pone.0159095] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Received: 04/17/2016] [Accepted: 06/27/2016] [Indexed: 01/22/2023] Open
Abstract
Mid-substance rupture of the canine cranial cruciate ligament (CR) and associated stifle osteoarthritis (OA) is an important veterinary health problem. CR causes stifle joint instability, and contralateral CR often develops. The dog is an important model for human anterior cruciate ligament (ACL) rupture, where rupture of graft repair or the contralateral ACL is also common. This suggests that both genetic and environmental factors may increase ligament rupture risk. We investigated use of bone marrow-derived mesenchymal stem cells (BM-MSCs) to reduce systemic and stifle joint inflammatory responses in dogs with CR. Twelve dogs with unilateral CR and contralateral stable partial CR were enrolled prospectively. BM-MSCs were collected during surgical treatment of the unstable CR stifle and culture-expanded. BM-MSCs were subsequently injected at a dose of 2 × 10^6 BM-MSCs/kg intravenously and 5 × 10^6 BM-MSCs by intra-articular injection of the partial CR stifle. Blood (entry, 4 and 8 weeks) and stifle synovial fluid (entry and 8 weeks) were obtained after BM-MSC injection. No adverse events after BM-MSC treatment were detected. Circulating CD8+ T lymphocytes were lower after BM-MSC injection. Serum C-reactive protein (CRP) was decreased at 4 weeks and serum CXCL8 was increased at 8 weeks. Synovial CRP in the complete CR stifle was decreased at 8 weeks. Synovial IFNγ was also lower in both stifles after BM-MSC injection. Synovial/serum CRP ratio at diagnosis in the partial CR stifle was significantly correlated with development of a second CR. Systemic and intra-articular injection of autologous BM-MSCs in dogs with partial CR suppresses systemic and stifle joint inflammation, including CRP concentrations. Intra-articular injection of autologous BM-MSCs had profound effects on the correlation and conditional dependencies of cytokines using causal networks. Such treatment effects could ameliorate risk of a second CR by modifying the stifle joint inflammatory response associated with cranial cruciate ligament matrix degeneration or damage.
Affiliation(s)
- Peter Muir
  - Comparative Orthopaedic Research Laboratory, School of Veterinary Medicine, University of Wisconsin-Madison, Madison, Wisconsin, 53706, United States of America
- Eric C. Hans
  - Comparative Orthopaedic Research Laboratory, School of Veterinary Medicine, University of Wisconsin-Madison, Madison, Wisconsin, 53706, United States of America
- Molly Racette
  - Comparative Orthopaedic Research Laboratory, School of Veterinary Medicine, University of Wisconsin-Madison, Madison, Wisconsin, 53706, United States of America
- Nicola Volstad
  - Comparative Orthopaedic Research Laboratory, School of Veterinary Medicine, University of Wisconsin-Madison, Madison, Wisconsin, 53706, United States of America
- Susannah J. Sample
  - Comparative Orthopaedic Research Laboratory, School of Veterinary Medicine, University of Wisconsin-Madison, Madison, Wisconsin, 53706, United States of America
  - Department of Comparative Biosciences, School of Veterinary Medicine, University of Wisconsin-Madison, Madison, Wisconsin, 53706, United States of America
- Caitlin Heaton
  - Comparative Orthopaedic Research Laboratory, School of Veterinary Medicine, University of Wisconsin-Madison, Madison, Wisconsin, 53706, United States of America
- Gerianne Holzman
  - UW Veterinary Care Hospital, School of Veterinary Medicine, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Susan L. Schaefer
  - Comparative Orthopaedic Research Laboratory, School of Veterinary Medicine, University of Wisconsin-Madison, Madison, Wisconsin, 53706, United States of America
- Debra D. Bloom
  - Department of Medicine, School of Medicine & Public Health, University of Wisconsin-Madison, Madison, Wisconsin, 53705, United States of America
- Jason A. Bleedorn
  - Comparative Orthopaedic Research Laboratory, School of Veterinary Medicine, University of Wisconsin-Madison, Madison, Wisconsin, 53706, United States of America
- Zhengling Hao
  - Comparative Orthopaedic Research Laboratory, School of Veterinary Medicine, University of Wisconsin-Madison, Madison, Wisconsin, 53706, United States of America
- Ermias Amene
  - Department of Medical Sciences, School of Veterinary Medicine, University of Wisconsin-Madison, Madison, Wisconsin, 53706, United States of America
- M. Suresh
  - Department of Pathobiological Sciences, School of Veterinary Medicine, University of Wisconsin-Madison, Madison, Wisconsin, 53706, United States of America
- Peiman Hematti
  - Department of Medicine, School of Medicine & Public Health, University of Wisconsin-Madison, Madison, Wisconsin, 53705, United States of America
  - University of Wisconsin Carbone Cancer Center, Madison, Wisconsin, 53705, United States of America
|
3
|
Spirtes P, Zhang K. Causal discovery and inference: concepts and recent methodological advances. Appl Inform (Berl) 2016; 3:3. [PMID: 27195202 PMCID: PMC4841209 DOI: 10.1186/s40535-016-0018-x] [Citation(s) in RCA: 101] [Impact Index Per Article: 12.6] [Received: 12/30/2015] [Accepted: 01/31/2016] [Indexed: 11/16/2022]
Abstract
This paper aims to give broad coverage of the central concepts and principles involved in automated causal inference and of emerging approaches to causal discovery from i.i.d. data and from time series. After reviewing concepts including manipulations, causal models, sample predictive modeling, causal predictive modeling, and structural equation models, we present the constraint-based approach to causal discovery, which relies on the conditional independence relationships in the data, and discuss the assumptions underlying its validity. We then focus on causal discovery based on structural equation models, in which a key issue is the identifiability of the causal structure implied by appropriately defined structural equation models: in the two-variable case, under what conditions (and why) is the causal direction between the two variables identifiable? We show that the independence between the error term and causes, together with appropriate structural constraints on the structural equation, makes it possible. Next, we report some recent advances in causal discovery from time series. Assuming that the causal relations are linear with non-Gaussian noise, we mention two problems that are traditionally difficult to solve, namely causal discovery from subsampled data and causal discovery in the presence of confounding time series. Finally, we list a number of open questions in the field of causal discovery and inference.
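The two-variable identifiability argument can be made concrete with a small Python sketch. It assumes a linear model with uniform (hence non-Gaussian) noise and uses a deliberately crude dependence score, the absolute correlation between the squared regressor and the squared least-squares residual; published methods use proper independence tests, so this is an illustration of the principle rather than any algorithm from the paper. All names and the simulated data are hypothetical.

```python
import random


def pearson(a, b):
    """Plain Pearson correlation of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / (var_a * var_b) ** 0.5


def residual_dependence(cause, effect):
    """Regress effect on cause by least squares, then score how strongly
    the squared residual still depends on the squared regressor."""
    n = len(cause)
    mean_c, mean_e = sum(cause) / n, sum(effect) / n
    c = [v - mean_c for v in cause]
    e = [v - mean_e for v in effect]
    slope = sum(p * q for p, q in zip(c, e)) / sum(p * p for p in c)
    resid = [q - slope * p for p, q in zip(c, e)]
    return abs(pearson([p * p for p in c], [r * r for r in resid]))


def infer_direction(x, y):
    """Prefer the direction whose residual looks more independent of the cause."""
    forward = residual_dependence(x, y)
    backward = residual_dependence(y, x)
    return "x -> y" if forward < backward else "y -> x"


# Simulated data (assumption): x causes y, both noise terms are uniform.
random.seed(0)
x = [random.uniform(-1, 1) for _ in range(5000)]
y = [xi + random.uniform(-1, 1) for xi in x]
direction = infer_direction(x, y)
```

In the causal direction the residual is just the independent noise term, so the score is close to zero; in the reverse direction the residual is uncorrelated with the regressor but still dependent on it, which the squared-value correlation picks up. With Gaussian noise both directions would score alike, which is why the non-Gaussianity assumption matters.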
Affiliation(s)
- Peter Spirtes
  - Department of Philosophy, Carnegie Mellon University, Pittsburgh, USA
- Kun Zhang
  - Department of Philosophy, Carnegie Mellon University, Pittsburgh, USA; Max-Planck Institute for Intelligent Systems, 72076 Tübingen, Germany
|
4
|
Big data analysis using modern statistical and machine learning methods in medicine. Int Neurourol J 2014; 18:50-7. [PMID: 24987556 PMCID: PMC4076480 DOI: 10.5213/inj.2014.18.2.50] [Citation(s) in RCA: 66] [Impact Index Per Article: 6.6] [Received: 06/17/2014] [Accepted: 06/20/2014] [Indexed: 11/08/2022] Open
Abstract
In this article we introduce modern statistical machine learning and bioinformatics approaches that have been used to learn statistical relationships from big data in medicine and behavioral science, which typically include clinical, genomic (and proteomic), and environmental variables. Every year, the data collected in biomedical and behavioral science become larger and more complicated. In medicine, we therefore need to be aware of this trend and understand the statistical tools that are available to analyze these datasets. Many statistical analyses aimed at such big datasets have been introduced recently. However, given the many different types of clinical, genomic, and environmental data, it is rather uncommon to see statistical methods that combine knowledge resulting from those different data types. To this end, we introduce big data in terms of clinical data, single nucleotide polymorphism and gene expression studies, and their interactions with the environment. We introduce well-known regression analyses, such as linear and logistic regression, that have been widely used in clinical data analyses, and modern statistical models, such as Bayesian networks, that have been introduced to analyze more complicated data. We also discuss how to represent the interactions among clinical, genomic, and environmental data using modern statistical models. We conclude with a promising modern statistical method, Bayesian networks, that is suitable for analyzing big datasets consisting of different types of large clinical, genomic, and environmental data. Such statistical models built from big data will provide us with a more comprehensive understanding of human physiology and disease.
|
5
|
Yoo C. Bayesian Method for Causal Discovery of Latent-Variable Models from a Mixture of Experimental and Observational Data. Comput Stat Data Anal 2012; 56:2183-2205. [PMID: 32831439 DOI: 10.1016/j.csda.2012.01.010] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/25/2022]
Abstract
This paper describes a Bayesian method for learning causal Bayesian networks through networks that contain latent variables from an arbitrary mixture of observational and experimental data. The paper presents Bayesian methods (including a new method) for learning the causal structure and parameters of the underlying causal process that is generating the data, given that the data contain a mixture of observational and experimental cases. These learning methods were applied using as input various mixtures of experimental and observational data that were generated from the ALARM causal Bayesian network. The paper reports how these structure predictions and parameter estimates compare with the true causal structures and parameters as given by the ALARM network. The paper shows that (1) the new method that this paper introduces for learning Bayesian network structure from a mixture of data, the Gibbs Volume method, best estimates the probability of the data given the latent variable model and (2) with large datasets (>10,000 cases), another method, the implicit latent variable method, is asymptotically correct and efficient.
Affiliation(s)
- Changwon Yoo
  - Department of Biostatistics, Florida International University, 11200 SW 8 St., AHC2 580, Miami, FL 33199; Tel: 305-348-4906
|
6
|
Felty Q, Yoo C, Kennedy A. Gene expression profile of endothelial cells exposed to estrogenic environmental compounds: implications to pulmonary vascular lesions. Life Sci 2010; 86:919-27. [PMID: 20416326 DOI: 10.1016/j.lfs.2010.04.007] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 02/27/2010] [Revised: 03/31/2010] [Accepted: 04/11/2010] [Indexed: 01/26/2023]
Abstract
AIMS: The cardiovascular system is an important target of estrogenic compounds. Considering the recent studies that question previously reported cardio-protective effects of estrogen, there is a growing concern that estrogenic environmental compounds may contribute to the pathology of vascular lesion formation. MAIN METHODS: Real-time quantitative PCR was used to monitor the expression of genes involved in vascularization. Using Bayesian network modeling, we determined a gene network that estrogenic chemicals modulate in human vascular endothelial cells. KEY FINDINGS: We showed that planar and coplanar polychlorinated biphenyls (PCBs) induce the expression of different genes compared to estradiol. Non-planar PCB congener 153 induced NOTCH3, which is a new finding, as well as CCL2 and IL8, similar to what has been reported for other non-planar PCBs in endothelial cells. Our gene network indicated that experimental treatments signal a network containing TGF-beta receptor and NOTCH3, molecules biologically relevant to signaling pulmonary vascular lesions. SIGNIFICANCE: We report in the present study that exposure of vascular endothelial cells to environmentally relevant concentrations of estrogenic PCBs induces gene networks implicated in the process of inflammation and adhesion. Our data suggest that PCBs can promote vascular lesion formation by activating gene networks involved in endothelial cell adhesion, cell growth, and pro-inflammatory molecules, which were different from those activated by natural estrogen. Since inflammation and adhesion are hallmarks of the pathology of endothelial cell dysfunction, reconstructing gene networks provides insight into the potential mechanisms that may contribute to the vascular risks associated with estrogenic environmental chemicals.
Affiliation(s)
- Quentin Felty
  - Department of Environmental and Occupational Health, Florida International University, Miami, FL 33199, USA.
|
7
|
|
8
|
Yoo C, Brilz EM. The five-gene-network data analysis with local causal discovery algorithm using causal Bayesian networks. Ann N Y Acad Sci 2009; 1158:93-101. [PMID: 19348635 PMCID: PMC4623325 DOI: 10.1111/j.1749-6632.2008.03749.x] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 12/01/2022]
Abstract
Using microarray experiments, we can model causal relationships of genes measured through mRNA expression levels. To this end, it is desirable to compare experiments on the system under complete interventions on some genes, such as gene knock-outs, with experiments on the system under no interventions. However, it is expensive and difficult to conduct wet lab experiments with complete interventions on genes in a biological system. It is therefore helpful to discover promising causal relationships among genes with no interventions or with incomplete interventions, such as a treatment that has unknown effects on the modeled genes, in order to identify promising genes to perturb that can later be verified in wet laboratories. In this paper we use causal Bayesian networks to implement a causal discovery algorithm, the equivalence local implicit latent variable scoring method (EquLIM), that identifies promising causal relationships even with a small dataset generated from no or incomplete interventions. We then apply EquLIM to analyze the five-gene-network data and compare EquLIM's predictions with the true pairwise causal relationships between the genes.
Affiliation(s)
- Changwon Yoo
  - Computer Science, University of Montana, Missoula, Montana, USA.
|
9
|
Abstract
In this review we give an overview of computational and statistical methods to reconstruct cellular networks. Although this area of research is vast and fast developing, we show that most currently used methods can be organized by a few key concepts. The first part of the review deals with conditional independence models including Gaussian graphical models and Bayesian networks. The second part discusses probabilistic and graph-based methods for data from experimental interventions and perturbations.
Affiliation(s)
- Florian Markowetz
  - Max Planck Institute for Molecular Genetics, Ihnestrasse 63-73, 14195 Berlin, Germany
  - Princeton University, Lewis-Sigler Institute for Integrative Genomics and Dept. of Computer Science, Princeton, NJ 08544, USA
- Rainer Spang
  - Max Planck Institute for Molecular Genetics, Ihnestrasse 63-73, 14195 Berlin, Germany
  - Present affiliation: University Regensburg, Institute of Functional Genomics, Josef-Engert-Str. 9, 93053 Regensburg, Germany
|
10
|
Yoo C, Cooper GF, Schmidt M. A control study to evaluate a computer-based microarray experiment design recommendation system for gene-regulation pathways discovery. J Biomed Inform 2005; 39:126-46. [PMID: 16203178 DOI: 10.1016/j.jbi.2005.05.011] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 01/12/2005] [Revised: 04/22/2005] [Accepted: 05/27/2005] [Indexed: 11/22/2022]
Abstract
The main topic of this paper is evaluating a system that uses the expected value of experimentation for discovering causal pathways in gene expression data. By experimentation we mean both interventions (e.g., a gene knock-out experiment) and observations (e.g., passively observing the expression level of a "wild-type" gene). We introduce a system called GEEVE (causal discovery in Gene Expression data using Expected Value of Experimentation), which implements expected value of experimentation in discovering causal pathways using gene expression data. GEEVE provides the following assistance, which is intended to help biologists in their quest to discover gene-regulation pathways: (1) recommending which experiments to perform (with a focus on "knock-out" experiments) using an expected value of experimentation (EVE) method; (2) recommending the number of measurements (observational and experimental) to include in the experimental design, again using an EVE method; and (3) providing a Bayesian analysis that combines prior knowledge with the results of recent microarray experiments to derive posterior probabilities of gene regulation relationships. In recommending which experiments to perform (and how many times to repeat them), the EVE approach considers the biologist's preferences for which genes to focus the discovery process on. Also, since exact EVE calculations are exponential in time, GEEVE incorporates approximation methods. GEEVE is able to combine data from knock-out experiments with data from wild-type experiments to suggest additional experiments to perform and then to analyze the results of those microarray experiments. It models the possibility that unmeasured (latent) variables may be responsible for some of the statistical associations among the expression levels of the genes under study. To evaluate the GEEVE system, we used a gene expression simulator to generate data from specified models of gene regulation. Using the simulator, we evaluated the GEEVE system in a randomized control study that involved 10 biologists, some of whom used GEEVE and some of whom did not. The results show that biologists who used GEEVE reached correct causal assessments about gene regulation more often than did those biologists who did not use GEEVE. The GEEVE users also reached their assessments in a more cost-effective manner.
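To give a feel for ranking candidate experiments before running them, here is a minimal Python sketch. GEEVE's actual utility model, preference handling, and approximation methods are not given in the abstract, so the sketch substitutes expected information gain (expected reduction in entropy over hypotheses) as a stand-in for the expected value of experimentation. The two hypotheses, the outcome probabilities, and every name below are assumptions made for illustration.

```python
from math import log2


def entropy(dist):
    """Shannon entropy in bits of a {label: probability} distribution."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)


def expected_information_gain(prior, likelihood):
    """prior: {hypothesis: P(h)}
    likelihood: {hypothesis: {outcome: P(outcome | h)}}
    Returns prior entropy minus expected posterior entropy."""
    outcomes = {o for table in likelihood.values() for o in table}
    gain = entropy(prior)
    for o in outcomes:
        p_o = sum(prior[h] * likelihood[h].get(o, 0.0) for h in prior)
        if p_o == 0:
            continue
        # Bayes' rule: posterior over hypotheses after seeing outcome o.
        posterior = {h: prior[h] * likelihood[h].get(o, 0.0) / p_o for h in prior}
        gain -= p_o * entropy(posterior)
    return gain


# Hypothetical prior and outcome models for two candidate experiments.
prior = {"A regulates B": 0.5, "no regulation": 0.5}
knockout = {
    "A regulates B": {"B changes": 0.9, "B unchanged": 0.1},
    "no regulation": {"B changes": 0.1, "B unchanged": 0.9},
}
observation = {
    "A regulates B": {"B changes": 0.6, "B unchanged": 0.4},
    "no regulation": {"B changes": 0.4, "B unchanged": 0.6},
}
gain_knockout = expected_information_gain(prior, knockout)
gain_observation = expected_information_gain(prior, observation)
```

Under these assumed numbers the knock-out is expected to be far more informative than passive observation, which is the kind of comparison a recommendation system could weigh against the cost of each experiment.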
Affiliation(s)
- Changwon Yoo
  - Department of Computer Science, University of Montana, 420 Social Sciences, University of Montana, Missoula, MT 59803, USA.
|
11
|
Markowetz F, Bloch J, Spang R. Non-transcriptional pathway features reconstructed from secondary effects of RNA interference. Bioinformatics 2005; 21:4026-32. [PMID: 16159925 DOI: 10.1093/bioinformatics/bti662] [Citation(s) in RCA: 64] [Impact Index Per Article: 3.4] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION: Cellular signaling pathways that are not modulated at the transcriptional level cannot be directly deduced from expression profiling experiments. The situation changes when external interventions such as RNA interference or gene knock-outs come into play. Even if the expression of the signaling genes is not changed, secondary effects in downstream genes shed light on the pathway and allow partial reconstruction of its topology. RESULTS: We introduce an algorithm to infer non-transcriptional pathway features based on differential gene expression in silencing assays. We demonstrate the power of our algorithm in the controlled setting of simulation studies and explain its practical use in the context of an RNA interference dataset investigating the response to microbial challenge in Drosophila melanogaster.
Affiliation(s)
- Florian Markowetz
  - Department of Computational Molecular Biology, Computational Diagnostics Group, Max Planck Institute for Molecular Genetics, Ihnestrasse 63-73, 14195 Berlin, Germany.
|
12
|
Laghaee A, Malcolm C, Hallam J, Ghazal P. Artificial intelligence and robotics in high throughput post-genomics. Drug Discov Today 2005; 10:1253-9. [PMID: 16213418 DOI: 10.1016/s1359-6446(05)03581-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Indexed: 11/19/2022]
Abstract
The shift of post-genomics towards a systems approach has offered an ever-increasing role for artificial intelligence (AI) and robotics. Many disciplines (e.g. engineering, robotics, computer science) bear on the problem of automating the different stages involved in post-genomic research with a view to developing quality-assured high-dimensional data. We review some of the latest contributions of AI and robotics to this end and note the limitations arising from the current independent, exploratory way in which specific solutions are being presented for specific problems without regard to how these could eventually be integrated into one comprehensible intelligent system.
Affiliation(s)
- Aroosha Laghaee
  - Institute for Perception, Action and Behaviour (IPAB), School of Informatics, James Clerk Maxwell Building, University of Edinburgh, Mayfield Road, Edinburgh EH9 3JZ, UK.
|
13
|
Current Awareness on Comparative and Functional Genomics. Comp Funct Genomics 2005. [PMCID: PMC2447482 DOI: 10.1002/cfg.421] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022] Open
|