1
Abante J, Kambhampati S, Feinberg AP, Goutsias J. Estimating DNA methylation potential energy landscapes from nanopore sequencing data. Sci Rep 2021; 11:21619. [PMID: 34732768 PMCID: PMC8566571 DOI: 10.1038/s41598-021-00781-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2021] [Accepted: 10/18/2021] [Indexed: 11/23/2022]
Abstract
High-throughput third-generation nanopore sequencing devices have enormous potential for simultaneously observing epigenetic modifications in human cells over large regions of the genome. However, signals generated by these devices are subject to considerable noise that can lead to unsatisfactory detection performance and hamper downstream analysis. Here we develop a statistical method, CpelNano, for the quantification and analysis of 5mC methylation landscapes using nanopore data. CpelNano takes into account nanopore noise by means of a hidden Markov model (HMM) in which the true but unknown (“hidden”) methylation state is modeled through an Ising probability distribution that is consistent with methylation means and pairwise correlations, whereas nanopore current signals constitute the observed state. It then estimates the associated methylation potential energy function by employing the expectation-maximization (EM) algorithm and performs differential methylation analysis via permutation-based hypothesis testing. Using simulations and analysis of published data obtained from three human cell lines (GM12878, MCF-10A, and MDA-MB-231), we show that CpelNano can faithfully estimate DNA methylation potential energy landscapes, substantially improving current methods and leading to a powerful tool for the modeling and analysis of epigenetic landscapes using nanopore sequencing data.
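As a rough illustration of the Ising distribution and potential energy function mentioned in the abstract above, the following minimal sketch enumerates all methylation patterns over a few CpG sites. It assumes a standard one-dimensional Ising parameterization with states in {-1, +1}; the names `ising_energy`, `ising_distribution`, `a`, and `c` are hypothetical and are not taken from CpelNano, and the HMM observation model and EM estimation are not shown.

```python
import itertools
import math


def ising_energy(x, a, c):
    """Potential energy U(x) = -sum_n a[n]*x[n] - sum_n c[n]*x[n]*x[n+1].

    x : tuple of states, -1 (unmethylated) or +1 (methylated); an assumed encoding.
    a : per-site parameters (hypothetical name), one per CpG site.
    c : coupling parameters between neighboring sites, len(a) - 1 values.
    """
    field = sum(a_n * x_n for a_n, x_n in zip(a, x))
    coupling = sum(c_n * x[n] * x[n + 1] for n, c_n in enumerate(c))
    return -(field + coupling)


def ising_distribution(a, c):
    """Return {pattern: probability} with P(x) = exp(-U(x)) / Z.

    Brute-force enumeration of all 2**N patterns, so only usable for small N.
    """
    patterns = list(itertools.product((-1, 1), repeat=len(a)))
    weights = [math.exp(-ising_energy(x, a, c)) for x in patterns]
    z = sum(weights)  # partition function
    return {x: w / z for x, w in zip(patterns, weights)}


# Three CpG sites with a mild bias toward methylation and positive coupling.
a = [0.5, 0.5, 0.5]
c = [1.0, 1.0]
p = ising_distribution(a, c)
```

With positive coupling, the fully methylated and fully unmethylated patterns have the lowest energies, so they receive more probability than mixed patterns.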
Affiliation(s)
- Jordi Abante
  - Whitaker Biomedical Engineering Institute, Johns Hopkins University, Baltimore, MD, 21218, USA
  - Department of Electrical & Computer Engineering, Johns Hopkins University, Baltimore, MD, 21218, USA
  - Department of Biomedical Data Science, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Sandeep Kambhampati
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, 21205, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, 02115, USA
- Andrew P Feinberg
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, 21205, USA
  - Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, MD, 21205, USA
  - Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, 21205, USA
- John Goutsias
  - Whitaker Biomedical Engineering Institute, Johns Hopkins University, Baltimore, MD, 21218, USA
  - Department of Electrical & Computer Engineering, Johns Hopkins University, Baltimore, MD, 21218, USA
2
Koldobskiy MA, Jenkinson G, Abante J, Rodriguez DiBlasi VA, Zhou W, Pujadas E, Idrizi A, Tryggvadottir R, Callahan C, Bonifant CL, Rabin KR, Brown PA, Ji H, Goutsias J, Feinberg AP. Converging genetic and epigenetic drivers of paediatric acute lymphoblastic leukaemia identified by an information-theoretic analysis. Nat Biomed Eng 2021; 5:360-376. [PMID: 33859388 PMCID: PMC8370714 DOI: 10.1038/s41551-021-00703-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 01/31/2019] [Accepted: 02/18/2021] [Indexed: 02/02/2023]
Abstract
In cancer, linking epigenetic alterations to drivers of transformation has been difficult, in part because DNA methylation analyses must capture epigenetic variability, which is central to tumour heterogeneity and tumour plasticity. Here, by conducting a comprehensive analysis, based on information theory, of differences in methylation stochasticity in samples from patients with paediatric acute lymphoblastic leukaemia (ALL), we show that ALL epigenomes are stochastic and marked by increased methylation entropy at specific regulatory regions and genes. By integrating DNA methylation and single-cell gene-expression data, we arrived at a relationship between methylation entropy and gene-expression variability, and found that epigenetic changes in ALL converge on a shared set of genes that overlap with genetic drivers involved in chromosomal translocations across the disease spectrum. Our findings suggest that an epigenetically driven gene-regulation network, with UHRF1 (ubiquitin-like with PHD and RING finger domains 1) as a central node, links genetic drivers and epigenetic mediators in ALL.
Affiliation(s)
- Michael A Koldobskiy
  - Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, MD, USA
  - Pediatric Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Garrett Jenkinson
  - Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, MD, USA
  - Whitaker Biomedical Engineering Institute, Johns Hopkins University, Baltimore, MD, USA
  - Department of Health Science Research, Mayo Clinic, Rochester, MN, USA
- Jordi Abante
  - Whitaker Biomedical Engineering Institute, Johns Hopkins University, Baltimore, MD, USA
- Varenka A Rodriguez DiBlasi
  - Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, MD, USA
  - Department of Cancer Immunology and Immune Modulation, Boehringer Ingelheim, Ridgefield, CT, USA
- Weiqiang Zhou
  - Department of Biostatistics, Johns Hopkins University Bloomberg School of Public Health, Baltimore, MD, USA
- Elisabet Pujadas
  - Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, MD, USA
  - Department of Pathology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Adrian Idrizi
  - Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Rakel Tryggvadottir
  - Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Colin Callahan
  - Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Challice L Bonifant
  - Pediatric Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Karen R Rabin
  - Department of Pediatrics, Section of Hematology-Oncology, Baylor College of Medicine, Houston, TX, USA
- Patrick A Brown
  - Pediatric Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Hongkai Ji
  - Department of Biostatistics, Johns Hopkins University Bloomberg School of Public Health, Baltimore, MD, USA
- John Goutsias
  - Whitaker Biomedical Engineering Institute, Johns Hopkins University, Baltimore, MD, USA
- Andrew P Feinberg
  - Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, MD, USA
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
  - Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
3
Koldobskiy M, Tetens A, Martin A, Eberhart C, Raabe E, Goutsias J, Feinberg A. DIPG-70. DISORDERED DNA METHYLATION IN DIPG UNDERLIES PHENOTYPIC PLASTICITY. Neuro Oncol 2020. [PMCID: PMC7715591 DOI: 10.1093/neuonc/noaa222.112] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]
Abstract
Diffuse intrinsic pontine glioma (DIPG) is a childhood brainstem tumor with a dismal prognosis and no effective treatment. Recent studies point to a critical role for epigenetic dysregulation in this disease. Nearly 80% of DIPGs harbor mutations in histone H3 encoding replacement of lysine 27 with methionine (K27M), leading to global loss of the repressive histone H3K27 trimethylation mark, global DNA hypomethylation, and a distinct gene expression profile. However, a static view of the epigenome fails to capture the plasticity of cancer cells and their gene expression states. Recent studies across diverse cancers have highlighted the role of epigenetic variability as a driving force in tumor evolution. Epigenetic variability may underlie the heterogeneity and phenotypic plasticity of DIPG cells and allow for the selection of cellular traits that promote survival and resistance to therapy. We have recently formalized a novel framework for analyzing variability of DNA methylation directly from whole-genome bisulfite sequencing data, allowing computation of DNA methylation entropy at precise genomic locations. Using these methods, we have shown that DIPG exhibits a markedly disordered epigenome, with increased stochasticity of DNA methylation localizing to specific regulatory elements and genes. We evaluate the responsiveness of the DIPG epigenetic landscape to pharmacologic modulation in order to modify proliferation, differentiation state, and immune signaling in DIPG cells.
Affiliation(s)
- Eric Raabe
  - Johns Hopkins University, Baltimore, MD, USA
4
Abstract
In heterozygous genomes, allele-specific measurements can reveal biologically significant differences in DNA methylation between homologous alleles associated with local changes in genetic sequence. Current approaches for detecting such events from whole-genome bisulfite sequencing (WGBS) data either perform statistically independent marginal analysis at individual cytosine-phosphate-guanine (CpG) sites, thus ignoring correlations in the methylation state, or carry out a joint statistical analysis of methylation patterns at four CpG sites, producing unreliable statistical evidence. Here, we employ the one-dimensional Ising model of statistical physics and develop a method for detecting allele-specific methylation (ASM) events within segments of DNA containing clusters of linked single-nucleotide polymorphisms (SNPs), called haplotypes. Comparisons with existing approaches using simulated and real WGBS data show that our method provides an improved fit to data, especially when considering large haplotypes. Importantly, the method employs robust hypothesis testing for detecting statistically significant imbalances in mean methylation level and methylation entropy, as well as for identifying haplotypes for which the genetic variant carries significant information about the methylation state. As such, our ASM analysis approach can potentially lead to biological discoveries with important implications for the genetics of complex human diseases.
Affiliation(s)
- J Abante
  - Whitaker Biomedical Engineering Institute, Johns Hopkins University, Baltimore, MD, 21218, USA
  - Department of Electrical & Computer Engineering, Johns Hopkins University, Baltimore, MD, 21218, USA
- Y Fang
  - Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, MD, 21205, USA
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, 21205, USA
- A P Feinberg
  - Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, MD, 21205, USA
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, 21205, USA
  - Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, 21205, USA
- J Goutsias
  - Whitaker Biomedical Engineering Institute, Johns Hopkins University, Baltimore, MD, 21218, USA
  - Department of Electrical & Computer Engineering, Johns Hopkins University, Baltimore, MD, 21218, USA
5
Koldobskiy MA, Abante J, Jenkinson G, Pujadas E, Tetens A, Zhao F, Tryggvadottir R, Idrizi A, Reinisch A, Majeti R, Goutsias J, Feinberg AP. A Dysregulated DNA Methylation Landscape Linked to Gene Expression in MLL-Rearranged AML. Epigenetics 2020; 15:841-858. [PMID: 32114880 DOI: 10.1080/15592294.2020.1734149] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 10/24/2022]
Abstract
Translocations of the KMT2A (MLL) gene define a biologically distinct and clinically aggressive subtype of acute myeloid leukaemia (AML), marked by a characteristic gene expression profile and few cooperating mutations. Although dysregulation of the epigenetic landscape in this leukaemia is particularly interesting given the low mutation frequency, its comprehensive analysis using whole genome bisulphite sequencing (WGBS) has not been previously performed. Here we investigated epigenetic dysregulation in nine MLL-rearranged (MLL-r) AML samples by comparing them to six normal myeloid controls, using a computational method that encapsulates mean DNA methylation measurements along with analyses of methylation stochasticity. We discovered a dramatically altered epigenetic profile in MLL-r AML, associated with genome-wide hypomethylation and a markedly increased DNA methylation entropy reflecting an increasingly disordered epigenome. Methylation discordance mapped to key genes and regulatory elements that included bivalent promoters and active enhancers. Genes associated with significant changes in methylation stochasticity recapitulated known MLL-r AML expression signatures, suggesting a role for the altered epigenetic landscape in the transcriptional programme initiated by MLL translocations. Accordingly, we established statistically significant associations between discordances in methylation stochasticity and gene expression in MLL-r AML, thus providing a link between the altered epigenetic landscape and the phenotype.
Affiliation(s)
- Michael A Koldobskiy
  - Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, MD, USA
  - Pediatric Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Jordi Abante
  - Whitaker Biomedical Engineering Institute, Johns Hopkins University, Baltimore, MD, USA
- Garrett Jenkinson
  - Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, MD, USA
  - Whitaker Biomedical Engineering Institute, Johns Hopkins University, Baltimore, MD, USA
  - Department of Health Science Research, Mayo Clinic, Rochester, MN, USA
- Elisabet Pujadas
  - Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, MD, USA
  - Department of Pathology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Ashley Tetens
  - Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Feifei Zhao
  - Department of Medicine, Division of Hematology, Cancer Institute and Institute for Stem Cell Biology and Regenerative Medicine, Stanford University School of Medicine, Stanford, CA, USA
- Rakel Tryggvadottir
  - Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Adrian Idrizi
  - Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Andreas Reinisch
  - Department of Medicine, Division of Hematology, Cancer Institute and Institute for Stem Cell Biology and Regenerative Medicine, Stanford University School of Medicine, Stanford, CA, USA
  - Division of Hematology, Medical University of Graz, Graz, Austria
- Ravindra Majeti
  - Department of Medicine, Division of Hematology, Cancer Institute and Institute for Stem Cell Biology and Regenerative Medicine, Stanford University School of Medicine, Stanford, CA, USA
- John Goutsias
  - Whitaker Biomedical Engineering Institute, Johns Hopkins University, Baltimore, MD, USA
- Andrew P Feinberg
  - Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, MD, USA
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
  - Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
6
Garrett-Bakelman FE, Darshi M, Green SJ, Gur RC, Lin L, Macias BR, McKenna MJ, Meydan C, Mishra T, Nasrini J, Piening BD, Rizzardi LF, Sharma K, Siamwala JH, Taylor L, Vitaterna MH, Afkarian M, Afshinnekoo E, Ahadi S, Ambati A, Arya M, Bezdan D, Callahan CM, Chen S, Choi AMK, Chlipala GE, Contrepois K, Covington M, Crucian BE, De Vivo I, Dinges DF, Ebert DJ, Feinberg JI, Gandara JA, George KA, Goutsias J, Grills GS, Hargens AR, Heer M, Hillary RP, Hoofnagle AN, Hook VYH, Jenkinson G, Jiang P, Keshavarzian A, Laurie SS, Lee-McMullen B, Lumpkins SB, MacKay M, Maienschein-Cline MG, Melnick AM, Moore TM, Nakahira K, Patel HH, Pietrzyk R, Rao V, Saito R, Salins DN, Schilling JM, Sears DD, Sheridan CK, Stenger MB, Tryggvadottir R, Urban AE, Vaisar T, Van Espen B, Zhang J, Ziegler MG, Zwart SR, Charles JB, Kundrot CE, Scott GBI, Bailey SM, Basner M, Feinberg AP, Lee SMC, Mason CE, Mignot E, Rana BK, Smith SM, Snyder MP, Turek FW. The NASA Twins Study: A multidimensional analysis of a year-long human spaceflight. Science 2019; 364:eaau8650. [PMID: 30975860 DOI: 10.1126/science.aau8650] [Citation(s) in RCA: 399] [Impact Index Per Article: 79.8] [Received: 07/24/2018] [Accepted: 02/28/2019] [Indexed: 12/11/2022]
Abstract
To understand the health impact of long-duration spaceflight, one identical twin astronaut was monitored before, during, and after a 1-year mission onboard the International Space Station; his twin served as a genetically matched ground control. Longitudinal assessments identified spaceflight-specific changes, including decreased body mass, telomere elongation, genome instability, carotid artery distension and increased intima-media thickness, altered ocular structure, transcriptional and metabolic changes, DNA methylation changes in immune and oxidative stress-related pathways, gastrointestinal microbiota alterations, and some cognitive decline postflight. Although average telomere length, global gene expression, and microbiome changes returned to near preflight levels within 6 months after return to Earth, increased numbers of short telomeres were observed and expression of some genes was still disrupted. These multiomic, molecular, physiological, and behavioral datasets provide a valuable roadmap of the putative health risks for future human spaceflight.
Affiliation(s)
- Francine E Garrett-Bakelman
  - Weill Cornell Medicine, New York, NY, USA
  - University of Virginia School of Medicine, Charlottesville, VA, USA
- Manjula Darshi
  - Center for Renal Precision Medicine, University of Texas Health, San Antonio, TX, USA
- Ruben C Gur
  - University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Ling Lin
  - Stanford University School of Medicine, Palo Alto, CA, USA
- Cem Meydan
  - Weill Cornell Medicine, New York, NY, USA
  - The Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, New York, NY, USA
- Jad Nasrini
  - University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Kumar Sharma
  - Center for Renal Precision Medicine, University of Texas Health, San Antonio, TX, USA
- Lynn Taylor
  - Colorado State University, Fort Collins, CO, USA
- Ebrahim Afshinnekoo
  - Weill Cornell Medicine, New York, NY, USA
  - The Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, New York, NY, USA
- Sara Ahadi
  - Stanford University School of Medicine, Palo Alto, CA, USA
- Aditya Ambati
  - Stanford University School of Medicine, Palo Alto, CA, USA
- Daniela Bezdan
  - Weill Cornell Medicine, New York, NY, USA
  - The Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, New York, NY, USA
- Songjie Chen
  - Stanford University School of Medicine, Palo Alto, CA, USA
- Marisa Covington
  - National Aeronautics and Space Administration (NASA), Houston, TX, USA
- Brian E Crucian
  - National Aeronautics and Space Administration (NASA), Houston, TX, USA
- David F Dinges
  - University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Ryan P Hillary
  - Stanford University School of Medicine, Palo Alto, CA, USA
- Peng Jiang
  - Northwestern University, Evanston, IL, USA
- Tyler M Moore
  - University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Hemal H Patel
  - University of California, San Diego, La Jolla, CA, USA
- Varsha Rao
  - Stanford University School of Medicine, Palo Alto, CA, USA
- Rintaro Saito
  - University of California, San Diego, La Jolla, CA, USA
- Denis N Salins
  - Stanford University School of Medicine, Palo Alto, CA, USA
- Michael B Stenger
  - National Aeronautics and Space Administration (NASA), Houston, TX, USA
- Jing Zhang
  - Stanford University School of Medicine, Palo Alto, CA, USA
- Sara R Zwart
  - University of Texas Medical Branch, Galveston, TX, USA
- John B Charles
  - National Aeronautics and Space Administration (NASA), Houston, TX, USA
- Craig E Kundrot
  - Space Life and Physical Sciences Division, NASA Headquarters, Washington, DC, USA
- Graham B I Scott
  - National Space Biomedical Research Institute, Baylor College of Medicine, Houston, TX, USA
- Mathias Basner
  - University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA
- Christopher E Mason
  - Weill Cornell Medicine, New York, NY, USA
  - The Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, New York, NY, USA
  - The Feil Family Brain and Mind Research Institute, New York, NY, USA
  - The WorldQuant Initiative for Quantitative Prediction, New York, NY, USA
- Brinda K Rana
  - University of California, San Diego, La Jolla, CA, USA
- Scott M Smith
  - National Aeronautics and Space Administration (NASA), Houston, TX, USA
7
Jenkinson G, Abante J, Koldobskiy MA, Feinberg AP, Goutsias J. Ranking genomic features using an information-theoretic measure of epigenetic discordance. BMC Bioinformatics 2019; 20:175. [PMID: 30961526 PMCID: PMC6454630 DOI: 10.1186/s12859-019-2777-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 09/14/2018] [Accepted: 03/25/2019] [Indexed: 02/07/2023]
Abstract
Background: Establishment and maintenance of DNA methylation throughout the genome is an important epigenetic mechanism that regulates gene expression whose disruption has been implicated in human diseases like cancer. It is therefore crucial to know which genes, or other genomic features of interest, exhibit significant discordance in DNA methylation between two phenotypes. We have previously proposed an approach for ranking genes based on methylation discordance within their promoter regions, determined by centering a window of fixed size at their transcription start sites. However, we cannot use this method to identify statistically significant genomic features and handle features of variable length and with missing data.
Results: We present a new approach for computing the statistical significance of methylation discordance within genomic features of interest in single and multiple test/reference studies. We base the proposed method on a well-articulated hypothesis testing problem that produces p- and q-values for each genomic feature, which we then use to identify and rank features based on the statistical significance of their epigenetic dysregulation. We employ the information-theoretic concept of mutual information to derive a novel test statistic, which we can evaluate by computing Jensen-Shannon distances between the probability distributions of methylation in a test and a reference sample. We design the proposed methodology to simultaneously handle biological, statistical, and technical variability in the data, as well as variable feature lengths and missing data, thus enabling its widespread use on any list of genomic features. This is accomplished by estimating, from reference data, the null distribution of the test statistic as a function of feature length using generalized additive regression models. Differential assessment, using normal/cancer data from healthy fetal tissue and pediatric high-grade glioma patients, illustrates the potential of our approach to greatly facilitate the exploratory phases of clinically and biologically relevant methylation studies.
Conclusions: The proposed approach provides the first computational tool for statistically testing and ranking genomic features of interest based on observed DNA methylation discordance in comparative studies that accounts, in a rigorous manner, for biological, statistical, and technical variability in methylation data, as well as for variability in feature length and for missing data.
Electronic supplementary material: The online version of this article (10.1186/s12859-019-2777-6) contains supplementary material, which is available to authorized users.
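To make the Jensen-Shannon distance named in the abstract above concrete, here is a minimal sketch that computes it for two discrete probability distributions, such as the distributions of a methylation level in a test and a reference sample. The function names are hypothetical, base-2 logarithms are an assumption chosen so that the distance lies in [0, 1], and the test statistic, null-distribution estimation, and p-/q-value computation described in the abstract are not reproduced.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_distance(p, q):
    """Square root of the Jensen-Shannon divergence between p and q.

    p, q : sequences of probabilities over the same outcomes, each summing to 1.
    """
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]  # mixture distribution
    jsd = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    return math.sqrt(max(jsd, 0.0))  # guard against tiny negative rounding error


# Hypothetical distributions of a methylation level over three bins.
reference = [0.7, 0.2, 0.1]
test = [0.1, 0.2, 0.7]
d = jensen_shannon_distance(reference, test)
```

The distance is symmetric in its arguments, equals 0 for identical distributions, and reaches 1 when the two distributions have no outcomes in common.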
Affiliation(s)
- Garrett Jenkinson
  - Whitaker Biomedical Engineering Institute, Johns Hopkins University, Baltimore, MD, USA
  - Center for Epigenetics, Johns Hopkins School of Medicine, Baltimore, MD, USA
  - Currently with Department of Health Sciences Research, Mayo Clinic, Rochester, MN, USA
- Jordi Abante
  - Whitaker Biomedical Engineering Institute, Johns Hopkins University, Baltimore, MD, USA
- Michael A Koldobskiy
  - Center for Epigenetics, Johns Hopkins School of Medicine, Baltimore, MD, USA
  - Pediatric Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Andrew P Feinberg
  - Center for Epigenetics, Johns Hopkins School of Medicine, Baltimore, MD, USA
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
  - Department of Medicine, Johns Hopkins School of Medicine, Baltimore, MD, USA
- John Goutsias
  - Whitaker Biomedical Engineering Institute, Johns Hopkins University, Baltimore, MD, USA
8
Koldobskiy M, Jenkinson G, Pujadas E, Martin A, Eberhart C, Goutsias J, Raabe E, Feinberg A. DIPG-74. DNA METHYLATION STOCHASTICITY IN DIPG. Neuro Oncol 2018. [DOI: 10.1093/neuonc/noy059.166] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]
Affiliation(s)
- Eric Raabe
  - Johns Hopkins University, Baltimore, MD, USA
9
Abstract
BACKGROUND: DNA methylation is a stable form of epigenetic memory used by cells to control gene expression. Whole genome bisulfite sequencing (WGBS) has emerged as a gold-standard experimental technique for studying DNA methylation by producing high resolution genome-wide methylation profiles. Statistical modeling and analysis is employed to computationally extract and quantify information from these profiles in an effort to identify regions of the genome that demonstrate crucial or aberrant epigenetic behavior. However, the performance of most currently available methods for methylation analysis is hampered by their inability to directly account for statistical dependencies between neighboring methylation sites, thus ignoring significant information available in WGBS reads.
RESULTS: We present a powerful information-theoretic approach for genome-wide modeling and analysis of WGBS data based on the 1D Ising model of statistical physics. This approach takes into account correlations in methylation by utilizing a joint probability model that encapsulates all information available in WGBS methylation reads and produces accurate results even when applied to single WGBS samples with low coverage. Using the Shannon entropy, our approach provides a rigorous quantification of methylation stochasticity in individual WGBS samples genome-wide. Furthermore, it utilizes the Jensen-Shannon distance to evaluate differences in methylation distributions between a test and a reference sample. Differential performance assessment using simulated and real human lung normal/cancer data demonstrates a clear superiority of our approach over DSS, a recently proposed method for WGBS data analysis. Critically, these results demonstrate that marginal methods become statistically invalid when correlations are present in the data.
CONCLUSIONS: This contribution demonstrates clear benefits and the necessity of modeling joint probability distributions of methylation using the 1D Ising model of statistical physics and of quantifying methylation stochasticity using concepts from information theory. By employing this methodology, substantial improvement of DNA methylation analysis can be achieved by effectively taking into account the massive amount of statistical information available in WGBS data, which is largely ignored by existing methods.
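The use of Shannon entropy to quantify methylation stochasticity, as described in the abstract above, can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the input is any discrete probability distribution of a methylation quantity, and normalizing by the maximum attainable entropy is a convenience chosen here so that values fall in [0, 1], not necessarily the normalization used by the authors.

```python
import math


def shannon_entropy(p):
    """Shannon entropy in bits of a discrete distribution p (zero terms are skipped)."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)


def normalized_entropy(p):
    """Entropy divided by its maximum, log2(len(p)), giving a value in [0, 1]."""
    if len(p) < 2:
        return 0.0
    return shannon_entropy(p) / math.log2(len(p))


# Hypothetical distributions of a methylation level over four bins.
ordered = [0.97, 0.01, 0.01, 0.01]    # nearly deterministic methylation
disordered = [0.25, 0.25, 0.25, 0.25]  # maximally stochastic methylation
h_ordered = normalized_entropy(ordered)
h_disordered = normalized_entropy(disordered)
```

A nearly deterministic distribution yields a value close to 0, while a uniform distribution yields 1, which is the sense in which higher entropy indicates a more disordered methylation state.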
Affiliation(s)
- Garrett Jenkinson
  - Whitaker Biomedical Engineering Institute, Johns Hopkins University, Baltimore, MD, USA
  - Center for Epigenetics, Johns Hopkins School of Medicine, Baltimore, MD, USA
- Jordi Abante
  - Whitaker Biomedical Engineering Institute, Johns Hopkins University, Baltimore, MD, USA
- Andrew P Feinberg
  - Center for Epigenetics, Johns Hopkins School of Medicine, Baltimore, MD, USA
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
  - Department of Medicine, Johns Hopkins School of Medicine, Baltimore, MD, USA
- John Goutsias
  - Whitaker Biomedical Engineering Institute, Johns Hopkins University, Baltimore, MD, USA
10
Nakamura H, Lee AA, Afshar AS, Watanabe S, Rho E, Razavi S, Suarez A, Lin YC, Tanigawa M, Huang B, DeRose R, Bobb D, Hong W, Gabelli SB, Goutsias J, Inoue T. Intracellular production of hydrogels and synthetic RNA granules by multivalent molecular interactions. Nat Mater 2018; 17:79-89. [PMID: 29115293 PMCID: PMC5916848 DOI: 10.1038/nmat5006] [Citation(s) in RCA: 74] [Impact Index Per Article: 12.3] [Received: 12/31/2015] [Accepted: 09/08/2017] [Indexed: 05/06/2023]
Abstract
Some protein components of intracellular non-membrane-bound entities, such as RNA granules, are known to form hydrogels in vitro. The physico-chemical properties and functional role of these intracellular hydrogels are difficult to study, primarily due to technical challenges in probing these materials in situ. Here, we present iPOLYMER, a strategy for a rapid induction of protein-based hydrogels inside living cells that explores the chemically inducible dimerization paradigm. Biochemical and biophysical characterizations aided by computational modelling show that the polymer network formed in the cytosol resembles a physiological hydrogel-like entity that acts as a size-dependent molecular sieve. We functionalize these polymers with RNA-binding motifs that sequester polyadenine-containing nucleotides to synthetically mimic RNA granules. These results show that iPOLYMER can be used to synthetically reconstitute the nucleation of biologically functional entities, including RNA granules in intact cells.
Affiliation(s)
- Hideki Nakamura
- Department of Cell Biology, School of Medicine, The Johns Hopkins University, Baltimore, MD, 21205
- Center for Cell Dynamics, Institute for Basic Biomedical Sciences, The Johns Hopkins University, Baltimore, MD, 21205
- Albert A. Lee
- Department of Cell Biology, School of Medicine, The Johns Hopkins University, Baltimore, MD, 21205
- Center for Cell Dynamics, Institute for Basic Biomedical Sciences, The Johns Hopkins University, Baltimore, MD, 21205
- Ali Sobhi Afshar
- Center for Imaging Science, Whitaker Biomedical Engineering Institute, The Johns Hopkins University, Baltimore, MD, 21218
- To whom correspondence regarding the computational analysis should be addressed: (A.S.A)
- Shigeki Watanabe
- Department of Cell Biology, School of Medicine, The Johns Hopkins University, Baltimore, MD, 21205
- Elmer Rho
- Center for Cell Dynamics, Institute for Basic Biomedical Sciences, The Johns Hopkins University, Baltimore, MD, 21205
- Shiva Razavi
- Department of Cell Biology, School of Medicine, The Johns Hopkins University, Baltimore, MD, 21205
- Department of Biomedical Engineering, Whitaker Biomedical Engineering Institute, The Johns Hopkins University, Baltimore, MD 21218
- Allison Suarez
- Department of Cell Biology, School of Medicine, The Johns Hopkins University, Baltimore, MD, 21205
- Center for Cell Dynamics, Institute for Basic Biomedical Sciences, The Johns Hopkins University, Baltimore, MD, 21205
- Yu-Chun Lin
- Department of Cell Biology, School of Medicine, The Johns Hopkins University, Baltimore, MD, 21205
- Center for Cell Dynamics, Institute for Basic Biomedical Sciences, The Johns Hopkins University, Baltimore, MD, 21205
- Makoto Tanigawa
- Department of Cell Biology, School of Medicine, The Johns Hopkins University, Baltimore, MD, 21205
- Department of Biomedical Engineering, Whitaker Biomedical Engineering Institute, The Johns Hopkins University, Baltimore, MD 21218
- Brian Huang
- Center for Cell Dynamics, Institute for Basic Biomedical Sciences, The Johns Hopkins University, Baltimore, MD, 21205
- Robert DeRose
- Department of Cell Biology, School of Medicine, The Johns Hopkins University, Baltimore, MD, 21205
- Center for Cell Dynamics, Institute for Basic Biomedical Sciences, The Johns Hopkins University, Baltimore, MD, 21205
- Diana Bobb
- Department of Cell Biology, School of Medicine, The Johns Hopkins University, Baltimore, MD, 21205
- Center for Cell Dynamics, Institute for Basic Biomedical Sciences, The Johns Hopkins University, Baltimore, MD, 21205
- William Hong
- Department of Biophysics and Biophysical Chemistry, School of Medicine, The Johns Hopkins University, Baltimore, MD, 21205
- Sandra B. Gabelli
- Department of Biophysics and Biophysical Chemistry, School of Medicine, The Johns Hopkins University, Baltimore, MD, 21205
- Department of Medicine, School of Medicine, Johns Hopkins University, Baltimore, MD, 21205
- Department of Oncology, School of Medicine, Johns Hopkins University, Baltimore, MD, 21205
- John Goutsias
- Center for Imaging Science, Whitaker Biomedical Engineering Institute, The Johns Hopkins University, Baltimore, MD, 21218
- Takanari Inoue
- Department of Cell Biology, School of Medicine, The Johns Hopkins University, Baltimore, MD, 21205
- Center for Cell Dynamics, Institute for Basic Biomedical Sciences, The Johns Hopkins University, Baltimore, MD, 21205
- Department of Biomedical Engineering, Whitaker Biomedical Engineering Institute, The Johns Hopkins University, Baltimore, MD 21218
- To whom general correspondence should be addressed: (T.I.)
|
11
|
Afshar AS, Xu J, Goutsias J. Integrative identification of deregulated miRNA/TF-mediated gene regulatory loops and networks in prostate cancer. PLoS One 2014; 9:e100806. [PMID: 24968068 PMCID: PMC4072696 DOI: 10.1371/journal.pone.0100806] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Received: 01/20/2014] [Accepted: 05/28/2014] [Indexed: 01/07/2023]
Abstract
MicroRNAs (miRNAs) have attracted a great deal of attention in biology and medicine. It has been hypothesized that miRNAs interact with transcription factors (TFs) in a coordinated fashion to play key roles in regulating signaling and transcriptional pathways and in achieving robust gene regulation. Here, we propose a novel integrative computational method to infer certain types of deregulated miRNA-mediated regulatory circuits at the transcriptional, post-transcriptional and signaling levels. To reliably predict miRNA-target interactions from mRNA/miRNA expression data, our method collectively utilizes sequence-based miRNA-target predictions obtained from several algorithms, known information about mRNA and miRNA targets of TFs available in existing databases, certain molecular structures identified to be statistically over-represented in gene regulatory networks, available molecular subtyping information, and state-of-the-art statistical techniques to appropriately constrain the underlying analysis. In this way, the method exploits almost every aspect of extractable information in the expression data. We apply our procedure on mRNA/miRNA expression data from prostate tumor and normal samples and detect numerous known and novel miRNA-mediated deregulated loops and networks in prostate cancer. We also demonstrate instances of the results in a number of distinct biological settings, which are known to play crucial roles in prostate and other types of cancer. Our findings show that the proposed computational method can be used to effectively achieve notable insights into the poorly understood molecular mechanisms of miRNA-mediated interactions and dissect their functional roles in cancer in an effort to pave the way for miRNA-based therapeutics in clinical settings.
Affiliation(s)
- Ali Sobhi Afshar
- Whitaker Biomedical Engineering Institute, The Johns Hopkins University, Baltimore, Maryland, United States of America
- Joseph Xu
- Whitaker Biomedical Engineering Institute, The Johns Hopkins University, Baltimore, Maryland, United States of America
- John Goutsias
- Whitaker Biomedical Engineering Institute, The Johns Hopkins University, Baltimore, Maryland, United States of America
|
12
|
Jenkinson G, Goutsias J. Statistically testing the validity of analytical and computational approximations to the chemical master equation. J Chem Phys 2013; 138:204108. [PMID: 23742455 DOI: 10.1063/1.4807390] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/14/2022]
Abstract
The master equation is used extensively to model chemical reaction systems with stochastic dynamics. However, and despite its phenomenological simplicity, it is not in general possible to compute the solution of this equation. Drawing exact samples from the master equation is possible, but can be computationally demanding, especially when estimating high-order statistical summaries or joint probability distributions. As a consequence, one often relies on analytical approximations to the solution of the master equation or on computational techniques that draw approximative samples from this equation. Unfortunately, it is not in general possible to check whether a particular approximation scheme is valid. The main objective of this paper is to develop an effective methodology to address this problem based on statistical hypothesis testing. By drawing a moderate number of samples from the master equation, the proposed techniques use the well-known Kolmogorov-Smirnov statistic to reject the validity of a given approximation method or accept it with a certain level of confidence. Our approach is general enough to deal with any master equation and can be used to test the validity of any analytical approximation method or any approximative sampling technique of interest. A number of examples, based on the Schlögl model of chemistry and the SIR model of epidemiology, clearly illustrate the effectiveness and potential of the proposed statistical framework.
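As a rough illustration of the testing idea described in this abstract, the Python sketch below computes a two-sample Kolmogorov-Smirnov statistic between samples from a reference model and samples from an approximation, then compares it with the usual large-sample critical value. The function names and the choice of the two-sample form of the test are assumptions made for illustration; the procedure in the paper may differ.

```python
import math
from bisect import bisect_right

def ks_statistic(xs, ys):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    xs, ys = sorted(xs), sorted(ys)
    n, m = len(xs), len(ys)
    d = 0.0
    # The supremum is attained at one of the observed sample points.
    for v in xs + ys:
        fx = bisect_right(xs, v) / n
        fy = bisect_right(ys, v) / m
        d = max(d, abs(fx - fy))
    return d

def ks_reject(xs, ys, alpha=0.05):
    """Reject 'same distribution' when D exceeds the asymptotic critical value."""
    n, m = len(xs), len(ys)
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return ks_statistic(xs, ys) > c_alpha * math.sqrt((n + m) / (n * m))
```

Here `xs` would hold exact samples and `ys` samples from the approximation under test. The asymptotic threshold is derived for continuous distributions, so it is only a conservative guide for discrete population counts.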
Affiliation(s)
- Garrett Jenkinson
- Whitaker Biomedical Engineering Institute, The Johns Hopkins University, Baltimore, Maryland 21218, USA
|
13
|
Abstract
The processes by which disease spreads in a population of individuals are inherently stochastic. The master equation has proven to be a useful tool for modeling such processes. Unfortunately, solving the master equation analytically is possible only in limited cases (e.g., when the model is linear), and thus numerical procedures or approximation methods must be employed. Available approximation methods, such as the system size expansion method of van Kampen, may fail to provide reliable solutions, whereas current numerical approaches can induce appreciable computational cost. In this paper, we propose a new numerical technique for solving the master equation. Our method is based on a more informative stochastic process than the population process commonly used in the literature. By exploiting the structure of the master equation governing this process, we develop a novel technique for calculating the exact solution of the master equation – up to a desired precision – in certain models of stochastic epidemiology. We demonstrate the potential of our method by solving the master equation associated with the stochastic SIR epidemic model. MATLAB software that implements the methods discussed in this paper is freely available as Supporting Information S1.
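For context, exact sample paths of the stochastic SIR model can be drawn with Gillespie's stochastic simulation algorithm. The Python sketch below is that standard sampler, not the numerical master-equation solver proposed in the paper. The rate parameterization (infection propensity `beta*S*I/N`, recovery propensity `gamma*I`) and all names are assumptions.

```python
import random

def gillespie_sir(s, i, r, beta, gamma, rng, t_max=float("inf")):
    """Simulate one SIR sample path; returns a list of (t, S, I, R) states."""
    n = s + i + r
    t = 0.0
    path = [(t, s, i, r)]
    while i > 0:
        a_inf = beta * s * i / n      # propensity of S + I -> 2I
        a_rec = gamma * i             # propensity of I -> R
        a_tot = a_inf + a_rec
        t += rng.expovariate(a_tot)   # exponential waiting time to the next event
        if t > t_max:
            break
        if rng.random() * a_tot < a_inf:
            s, i = s - 1, i + 1       # infection event
        else:
            i, r = i - 1, r + 1       # recovery event
        path.append((t, s, i, r))
    return path

path = gillespie_sir(s=99, i=1, r=0, beta=0.3, gamma=0.1, rng=random.Random(0))
```

Averaging many such paths gives Monte Carlo estimates of the quantities that a direct solution of the master equation would provide without sampling error.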
Affiliation(s)
- John Goutsias
- Whitaker Biomedical Engineering Institute, The Johns Hopkins University, Baltimore, Maryland, United States of America
|
14
|
Zhang HX, Goutsias J. Reducing experimental variability in variance-based sensitivity analysis of biochemical reaction systems. J Chem Phys 2011; 134:114105. [PMID: 21428605 DOI: 10.1063/1.3563539] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/14/2022]
Abstract
Sensitivity analysis is a valuable task for assessing the effects of biological variability on cellular behavior. Available techniques require knowledge of nominal parameter values, which cannot be determined accurately due to the experimental uncertainty typical of problems in systems biology. As a consequence, the practical use of existing sensitivity analysis techniques may be seriously hampered by the effects of unpredictable experimental variability. To address this problem, we propose here a probabilistic approach to sensitivity analysis of biochemical reaction systems that explicitly models experimental variability and effectively reduces the impact of this type of uncertainty on the results. The proposed approach employs a recently introduced variance-based method for sensitivity analysis of biochemical reaction systems [Zhang et al., J. Chem. Phys. 131, 094101 (2009)] and leads to a technique that can be effectively used to accommodate appreciable levels of experimental variability. We discuss three numerical techniques for evaluating the sensitivity indices associated with the new method, which include Monte Carlo estimation, derivative approximation, and dimensionality reduction based on orthonormal Hermite approximation. By employing a computational model of the epidermal growth factor receptor signaling pathway, we demonstrate that the proposed technique can greatly reduce the effect of experimental variability on variance-based sensitivity analysis results. We expect that, in cases of appreciable experimental variability, the new method can lead to substantial improvements over existing sensitivity analysis techniques.
Affiliation(s)
- Hong-Xuan Zhang
- Procter & Gamble Co., Miami Valley Innovation Center, Cincinnati, Ohio 45253, USA
|
15
|
Jenkinson G, Goutsias J. Thermodynamically consistent model calibration in chemical kinetics. BMC Syst Biol 2011; 5:64. [PMID: 21548948 PMCID: PMC3117730 DOI: 10.1186/1752-0509-5-64] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 02/07/2011] [Accepted: 05/06/2011] [Indexed: 11/10/2022]
Abstract
Background: The dynamics of biochemical reaction systems are constrained by the fundamental laws of thermodynamics, which impose well-defined relationships among the reaction rate constants characterizing these systems. Constructing biochemical reaction systems from experimental observations often leads to parameter values that do not satisfy the necessary thermodynamic constraints. This can result in models that are not physically realizable and may lead to inaccurate, or even erroneous, descriptions of cellular function.
Results: We introduce a thermodynamically consistent model calibration (TCMC) method that can be effectively used to provide thermodynamically feasible values for the parameters of an open biochemical reaction system. The proposed method formulates the model calibration problem as a constrained optimization problem that takes thermodynamic constraints (and, if desired, additional non-thermodynamic constraints) into account. By calculating thermodynamically feasible values for the kinetic parameters of a well-known model of the EGF/ERK signaling cascade, we demonstrate the qualitative and quantitative significance of imposing thermodynamic constraints on these parameters and the effectiveness of our method for accomplishing this important task. MATLAB software, using the Systems Biology Toolbox 2.1, can be accessed from http://www.cis.jhu.edu/~goutsias/CSS%20lab/software.html. An SBML file containing the thermodynamically feasible EGF/ERK signaling cascade model can be found in the BioModels database.
Conclusions: TCMC is a simple and flexible method for obtaining physically plausible values for the kinetic parameters of open biochemical reaction systems. It can be effectively used to recalculate a thermodynamically consistent set of parameter values for existing thermodynamically infeasible biochemical reaction models of cellular function as well as to estimate thermodynamically feasible values for the parameters of new models. Furthermore, TCMC can provide dimensionality reduction, better estimation performance, and lower computational complexity, and can help to alleviate the problem of data overfitting.
Affiliation(s)
- Garrett Jenkinson
- Whitaker Biomedical Engineering Institute, The Johns Hopkins University, Baltimore, MD 21218, USA.
|
16
|
Jenkinson G, Zhong X, Goutsias J. Thermodynamically consistent Bayesian analysis of closed biochemical reaction systems. BMC Bioinformatics 2010; 11:547. [PMID: 21054868 PMCID: PMC3248051 DOI: 10.1186/1471-2105-11-547] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 04/21/2010] [Accepted: 11/05/2010] [Indexed: 12/04/2022]
Abstract
Background: Estimating the rate constants of a biochemical reaction system with known stoichiometry from noisy time series measurements of molecular concentrations is an important step for building predictive models of cellular function. Inference techniques currently available in the literature may produce rate constant values that defy necessary constraints imposed by the fundamental laws of thermodynamics. As a result, these techniques may lead to biochemical reaction systems whose concentration dynamics could not possibly occur in nature. Therefore, development of a thermodynamically consistent approach for estimating the rate constants of a biochemical reaction system is highly desirable.
Results: We introduce a Bayesian analysis approach for computing thermodynamically consistent estimates of the rate constants of a closed biochemical reaction system with known stoichiometry given experimental data. Our method employs an appropriately designed prior probability density function that effectively integrates fundamental biophysical and thermodynamic knowledge into the inference problem. Moreover, it takes into account experimental strategies for collecting informative observations of molecular concentrations through perturbations. The proposed method employs a maximization-expectation-maximization algorithm that provides thermodynamically feasible estimates of the rate constant values and computes appropriate measures of estimation accuracy. We demonstrate various aspects of the proposed method on synthetic data obtained by simulating a subset of a well-known model of the EGF/ERK signaling pathway, and examine its robustness under conditions that violate key assumptions. Software, coded in MATLAB®, which implements all Bayesian analysis techniques discussed in this paper, is available free of charge at http://www.cis.jhu.edu/~goutsias/CSS%20lab/software.html.
Conclusions: Our approach provides an attractive statistical methodology for estimating thermodynamically feasible values for the rate constants of a biochemical reaction system from noisy time series observations of molecular concentrations obtained through perturbations. The proposed technique is theoretically sound and computationally feasible, but restricted to quantitative data obtained from closed biochemical reaction systems. This necessitates development of similar techniques for estimating the rate constants of open biochemical reaction systems, which are more realistic models of cellular function.
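One familiar example of a thermodynamic constraint on rate constants is the Wegscheider condition for closed mass-action systems: around any cycle of reversible reactions, the product of the forward rate constants must equal the product of the reverse rate constants. The Python sketch below checks that condition for a single cycle. It only illustrates the kind of constraint involved and is not the Bayesian estimation method of the paper; the function name and tolerance are assumptions.

```python
import math

def satisfies_wegscheider(forward, reverse, tol=1e-9):
    """Check prod(forward) == prod(reverse) around one reaction cycle.

    forward[j] and reverse[j] are the positive forward and reverse rate
    constants of the j-th reaction in the cycle, taken in cycle order.
    """
    # Work with logarithms so long cycles do not overflow or underflow.
    log_ratio = sum(math.log(kf) - math.log(kr) for kf, kr in zip(forward, reverse))
    return math.isclose(log_ratio, 0.0, abs_tol=tol)

# Hypothetical cycle A <-> B <-> C <-> A
consistent = satisfies_wegscheider([2.0, 3.0, 1.0], [1.0, 2.0, 3.0])    # products 6 and 6
inconsistent = satisfies_wegscheider([2.0, 3.0, 1.0], [1.0, 2.0, 1.0])  # products 6 and 2
```

An estimator that ignores this condition can return a rate-constant set like the second one, which describes a closed system that would sustain a net flux around the cycle at equilibrium.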
Affiliation(s)
- Garrett Jenkinson
- Whitaker Biomedical Engineering Institute, The Johns Hopkins University, Baltimore, MD 21218, USA
|
17
|
Zhang HX, Goutsias J. A comparison of approximation techniques for variance-based sensitivity analysis of biochemical reaction systems. BMC Bioinformatics 2010; 11:246. [PMID: 20462443 PMCID: PMC2894038 DOI: 10.1186/1471-2105-11-246] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 10/20/2009] [Accepted: 05/12/2010] [Indexed: 11/10/2022]
Abstract
Background: Sensitivity analysis is an indispensable tool for the analysis of complex systems. In a recent paper, we have introduced a thermodynamically consistent variance-based sensitivity analysis approach for studying the robustness and fragility properties of biochemical reaction systems under uncertainty in the standard chemical potentials of the activated complexes of the reactions and the standard chemical potentials of the molecular species. In that approach, key sensitivity indices were estimated by Monte Carlo sampling, which is computationally very demanding and impractical for large biochemical reaction systems. Computationally efficient algorithms are needed to make variance-based sensitivity analysis applicable to realistic cellular networks, modeled by biochemical reaction systems that consist of a large number of reactions and molecular species.
Results: We present four techniques, derivative approximation (DA), polynomial approximation (PA), Gauss-Hermite integration (GHI), and orthonormal Hermite approximation (OHA), for analytically approximating the variance-based sensitivity indices associated with a biochemical reaction system. By using a well-known model of the mitogen-activated protein kinase signaling cascade as a case study, we numerically compare the approximation quality of these techniques against traditional Monte Carlo sampling. Our results indicate that, although DA is computationally the most attractive technique, special care should be exercised when using it for sensitivity analysis, since it may only be accurate at low levels of uncertainty. On the other hand, PA, GHI, and OHA are computationally more demanding than DA but can work well at high levels of uncertainty. GHI results in a slightly better accuracy than PA, but it is more difficult to implement. OHA produces the most accurate approximation results and can be implemented in a straightforward manner. It turns out that the computational cost of the four approximation techniques considered in this paper is orders of magnitude smaller than traditional Monte Carlo estimation. Software, coded in MATLAB®, which implements all sensitivity analysis techniques discussed in this paper, is available free of charge.
Conclusions: Estimating variance-based sensitivity indices of a large biochemical reaction system is a computationally challenging task that can only be addressed via approximations. Among the methods presented in this paper, a technique based on orthonormal Hermite polynomials seems to be an acceptable candidate for the job, producing very good approximation results for a wide range of uncertainty levels in a fraction of the time required by traditional Monte Carlo sampling.
Affiliation(s)
- Hong-Xuan Zhang
- Whitaker Biomedical Engineering Institute, The Johns Hopkins University, Baltimore, MD 21218, USA
|
18
|
Abstract
Sensitivity analysis is an indispensable tool for studying the robustness and fragility properties of biochemical reaction systems as well as for designing optimal approaches for selective perturbation and intervention. Deterministic sensitivity analysis techniques, using derivatives of the system response, have been extensively used in the literature. However, these techniques suffer from several drawbacks, which must be carefully considered before using them in problems of systems biology. We develop here a probabilistic approach to sensitivity analysis of biochemical reaction systems. The proposed technique employs a biophysically derived model for parameter fluctuations and, by using a recently suggested variance-based approach to sensitivity analysis [Saltelli et al., Chem. Rev. (Washington, D.C.) 105, 2811 (2005)], it leads to a powerful sensitivity analysis methodology for biochemical reaction systems. The approach presented in this paper addresses many problems associated with derivative-based sensitivity analysis techniques. Most importantly, it produces thermodynamically consistent sensitivity analysis results, can easily accommodate appreciable parameter variations, and allows for systematic investigation of high-order interaction effects. By employing a computational model of the mitogen-activated protein kinase signaling cascade, we demonstrate that our approach is well suited for sensitivity analysis of biochemical reaction systems and can produce a wealth of information about the sensitivity properties of such systems. The price to be paid, however, is a substantial increase in computational complexity over derivative-based techniques, which must be effectively addressed in order to make the proposed approach to sensitivity analysis more practical.
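To make the variance-based idea concrete, the Python sketch below estimates first-order sensitivity indices (the fraction of output variance explained by each input alone) with a standard Monte Carlo pick-and-freeze estimator. It is a generic illustration, not the thermodynamically consistent formulation of the paper: the uniform input distribution, the toy response function, and all names are assumptions.

```python
import random

def first_order_indices(f, dim, n, rng):
    """Monte Carlo estimate of first-order variance-based sensitivity indices.

    Inputs are assumed independent and uniform on [0, 1).
    """
    a = [[rng.random() for _ in range(dim)] for _ in range(n)]
    b = [[rng.random() for _ in range(dim)] for _ in range(n)]
    fa = [f(x) for x in a]
    fb = [f(x) for x in b]
    mean = sum(fa) / n
    var = sum((y - mean) ** 2 for y in fa) / (n - 1)
    indices = []
    for i in range(dim):
        # Rows of `a` with only the i-th coordinate replaced by that of `b`.
        f_abi = [f(ra[:i] + [rb[i]] + ra[i + 1:]) for ra, rb in zip(a, b)]
        v_i = sum(yb * (yab - ya) for yb, yab, ya in zip(fb, f_abi, fa)) / n
        indices.append(v_i / var)
    return indices

# Toy response y = x1 + 2*x2; the exact indices are 1/5 and 4/5.
s1, s2 = first_order_indices(lambda x: x[0] + 2.0 * x[1], 2, 20000, random.Random(1))
```

Each index needs its own batch of `n` model evaluations, so the cost grows with the number of parameters. This is the kind of computational burden the abstract mentions for sampling-based estimation.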
Affiliation(s)
- Hong-Xuan Zhang
- The Whitaker Biomedical Engineering Institute, Johns Hopkins University, Baltimore, Maryland 21218, USA
|
19
|
Abstract
To understand most cellular processes, one must understand how genetic information is processed. A formidable challenge is the dissection of gene regulatory networks to delineate how eukaryotic cells coordinate and govern patterns of gene expression that ultimately lead to a phenotype. In this paper, we review several approaches for modeling eukaryotic gene regulatory networks and for reverse engineering such networks from experimental observations. Since we are interested in elucidating the transcriptional regulatory mechanisms of colon cancer progression, we use this important biological problem to illustrate various aspects of modeling gene regulation. We discuss four important models: gene networks, transcriptional regulatory systems, Boolean networks, and dynamical Bayesian networks. We review state-of-the-art functional genomics techniques, such as gene expression profiling, cis-regulatory element identification, TF target gene identification, and gene silencing by RNA interference, which can be used to extract information about gene regulation. We can employ this information, in conjunction with appropriately designed reverse engineering algorithms, to construct a computational model of gene regulation that sufficiently predicts experimental observations. In the last part of this review, we focus on the problem of reverse engineering transcriptional regulatory networks by gene perturbations. We mathematically formulate this problem and discuss the role of experimental resolution in our ability to reconstruct accurate models of gene regulation. We conclude by discussing a promising approach for inferring a transcriptional regulatory system from microarray data obtained by gene perturbations.
Affiliation(s)
- J Goutsias
- Whitaker Biomedical Engineering Institute, The Johns Hopkins University, Baltimore, MD 21218, USA.
|
20
|
Abstract
We study fundamental relationships between classical and stochastic chemical kinetics for general biochemical systems with elementary reactions. Analytical and numerical investigations show that intrinsic fluctuations may qualitatively and quantitatively affect both transient and stationary system behavior. Thus, we provide a theoretical understanding of the role that intrinsic fluctuations may play in inducing biochemical function. The mean concentration dynamics are governed by differential equations that are similar to the ones of classical chemical kinetics, expressed in terms of the stoichiometry matrix and time-dependent fluxes. However, each flux is decomposed into a macroscopic term, which accounts for the effect of mean reactant concentrations on the rate of product synthesis, and a mesoscopic term, which accounts for the effect of statistical correlations among interacting reactions. We demonstrate that the ability of a model to account for phenomena induced by intrinsic fluctuations may be seriously compromised if we do not include the mesoscopic fluxes. Unfortunately, computation of fluxes and mean concentration dynamics requires intensive Monte Carlo simulation. To circumvent the computational expense, we employ a moment closure scheme, which leads to differential equations that can be solved by standard numerical techniques to obtain more accurate approximations of fluxes and mean concentration dynamics than the ones obtained with the classical approach.
Affiliation(s)
- John Goutsias
- Whitaker Biomedical Engineering Institute, The Johns Hopkins University, Baltimore, Maryland, USA.
|
21
|
Abstract
We address the problem of eliminating fast reaction kinetics in stochastic biochemical systems by employing a quasiequilibrium approximation. We build on two previous methodologies developed by [Haseltine and Rawlings, J. Chem. Phys. 117, 6959 (2002)] and by [Rao and Arkin, J. Chem. Phys. 118, 4999 (2003)]. By following Haseltine and Rawlings, we use the numbers of occurrences of the underlying reactions to characterize the state of a biochemical system. We consider systems that can be effectively partitioned into two distinct subsystems, one that comprises "slow" reactions and one that comprises "fast" reactions. We show that when the probabilities of occurrence of the slow reactions depend at most linearly on the states of the fast reactions, we can effectively eliminate the fast reactions by modifying the probabilities of occurrence of the slow reactions. This modification requires computation of the mean states of the fast reactions, conditioned on the states of the slow reactions. By assuming that within consecutive occurrences of slow reactions, the fast reactions rapidly reach equilibrium, we show that the conditional state means of the fast reactions satisfy a system of at most quadratic equations, subject to linear inequality constraints. We present three examples which allow analytical calculations that clearly illustrate the mathematical steps underlying the proposed approximation and demonstrate the accuracy and effectiveness of our method.
Affiliation(s)
- John Goutsias
- The Whitaker Biomedical Engineering Institute, Johns Hopkins University, Baltimore, Maryland 21218, USA.
|
22
|
Abstract
Modeling transcriptional regulation with time delays is an important problem of computational cell biology. In this paper, we propose a computational tool for studying transcriptional regulation in single cells based on a mean-field approximation method. The main idea is to replace the occurrence probabilities of the underlying transcriptional events by their mean values and use appropriately chosen additive noise terms to model statistical variations not accounted for by this approximation. The proposed methodology allows us to characterize the transient and steady-state behavior of transcriptional regulation. Moreover, it provides a rather simple and computationally attractive tool for rapid statistical characterization of the dynamic behavior of a nonlinear transcriptional regulatory system with time delays.
Affiliation(s)
- John Goutsias
- Whitaker Biomedical Engineering Institute, The Johns Hopkins University, Baltimore, Maryland 21218, USA.
|
23
|
Abstract
We discuss several issues pertaining to the use of stochastic biochemical systems for modeling transcriptional regulation in single cells. By appropriately choosing the system state, we can model transcriptional regulation by a hidden Markov model (HMM). This opens the possibility of using well-known techniques for the statistical analysis and stochastic control of HMMs to mathematically and computationally study transcriptional regulation in single cells. Unfortunately, in all but a few simple cases, analytical characterization of the statistical behavior of the proposed HMM is not possible. Moreover, analysis by Monte Carlo simulation is computationally cumbersome. We discuss several techniques for approximating the HMM by one that is more tractable. We employ simulations, based on a biologically relevant transcriptional regulatory system, to show the relative merits and limitations of various approximation techniques and provide general guidelines for their use.
Affiliation(s)
- John Goutsias
- Whitaker Biomedical Engineering Institute, Clark Hall 308A, The Johns Hopkins University, Baltimore, MD 21218, USA.
|
24
|
Abstract
This paper introduces a novel approach for image analysis based on the notion of multiscale connectivity. We use the proposed approach to design several novel tools for object-based image representation and analysis which exploit the connectivity structure of images in a multiscale fashion. More specifically, we propose a nonlinear pyramidal image representation scheme, which decomposes an image at different scales by means of multiscale grain filters. These filters gradually remove connected components from an image that fail to satisfy a given criterion. We also use the concept of multiscale connectivity to design a hierarchical data partitioning tool. We employ this tool to construct another image representation scheme, based on the concept of component trees, which organizes partitions of an image in a hierarchical multiscale fashion. In addition, we propose a geometrically-oriented hierarchical clustering algorithm which generalizes the classical single-linkage algorithm. Finally, we propose two object-based multiscale image summaries, reminiscent of the well-known (morphological) pattern spectrum, which can be useful in image analysis and image understanding applications.
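As a minimal illustration of a filter that removes connected components failing a criterion, the Python sketch below applies an area criterion to a binary image: every 4-connected foreground component with fewer than `min_area` pixels is deleted. The area criterion, the 4-connectivity, the binary setting, and all names are assumptions; the multiscale grain filters of the paper are defined more generally.

```python
from collections import deque

def area_grain_filter(image, min_area):
    """Remove 4-connected foreground components with fewer than min_area pixels."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    for r0 in range(rows):
        for c0 in range(cols):
            if image[r0][c0] == 0 or seen[r0][c0]:
                continue
            # Breadth-first search collects one connected component.
            component, queue = [], deque([(r0, c0)])
            seen[r0][c0] = True
            while queue:
                r, c = queue.popleft()
                component.append((r, c))
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols and image[rr][cc] and not seen[rr][cc]:
                        seen[rr][cc] = True
                        queue.append((rr, cc))
            # Keep the component only if it satisfies the area criterion.
            if len(component) >= min_area:
                for r, c in component:
                    out[r][c] = 1
    return out

image = [[1, 1, 0, 0],
         [1, 0, 0, 1],
         [0, 0, 0, 0]]
filtered = area_grain_filter(image, min_area=2)
```

Applying the filter with an increasing sequence of `min_area` values removes progressively larger components, which gives a simple sense of how a decomposition across scales can be built.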
Affiliation(s)
- Ulisses Braga-Neto
- Virology and Experimental Therapy Laboratory of the Aggeu Magalhães Research Center--CPqAM/FIOCRUZ, Recife, PE Brazil.
|
25
|
Abstract
A novel notion of connectivity for grayscale images is introduced, defined by means of a binary connectivity assigned at image-level sets. In this framework, a grayscale image is connected if all level sets below a prespecified threshold are connected. The proposed notion is referred to as grayscale level connectivity and includes, as special cases, other well-known notions of grayscale connectivity, such as fuzzy grayscale connectivity and grayscale blobs. In contrast to those approaches, the present framework does not require all image-level sets to be connected. Moreover, a connected grayscale object may contain more than one regional maximum. Grayscale level connectivity is studied in the rigorous framework of connectivity classes. The use of grayscale level connectivity in image analysis applications, such as object extraction, image segmentation, object-based filtering, and hierarchical image representation, is discussed and illustrated.
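One possible reading of the definition in this abstract can be checked with a short script. The sketch below rests on several assumptions that are not stated in the abstract: gray values are integers, the level set at level t is the threshold set {f >= t}, the binary connectivity is 4-connectivity, and an empty level set counts as connected. The function names `is_connected` and `is_level_connected` are hypothetical.

```python
def is_connected(pixels):
    """Check 4-connectivity of a set of (row, col) pixels.

    Assumption: the empty set is treated as connected.
    """
    if not pixels:
        return True
    start = next(iter(pixels))
    seen = {start}
    stack = [start]
    while stack:
        y, x = stack.pop()
        for neighbour in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if neighbour in pixels and neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return len(seen) == len(pixels)


def is_level_connected(image, k):
    """Return True if every threshold set {f >= t}, for integer t <= k, is connected."""
    lowest = min(min(row) for row in image)
    for t in range(lowest, k + 1):
        level_set = {
            (r, c)
            for r, row in enumerate(image)
            for c, value in enumerate(row)
            if value >= t
        }
        if not is_connected(level_set):
            return False
    return True


# Two regional maxima (value 3) sitting on a common plateau (value 1).
image = [
    [1, 1, 1, 1, 1],
    [1, 3, 1, 3, 1],
    [1, 1, 1, 1, 1],
]
print(is_level_connected(image, 1))
print(is_level_connected(image, 2))
```

Under these assumptions the example image is connected at level 1 even though it has two regional maxima, and it stops being connected once the threshold is raised to 2, since the level set then splits into two isolated pixels.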
Affiliation(s)
- Ulisses Braga-Neto
- Section of Clinical Cancer Genetics, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
|
26
|
Goutsias J, Kim S. A nonlinear discrete dynamical model for transcriptional regulation: construction and properties. Biophys J 2004; 86:1922-45. [PMID: 15041638 PMCID: PMC1304049 DOI: 10.1016/s0006-3495(04)74257-5] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.1] [Received: 06/10/2003] [Accepted: 11/17/2003] [Indexed: 10/21/2022]
Abstract
Transcriptional regulation is a fundamental mechanism of living cells, which allows them to determine their actions and properties, by selectively choosing which proteins to express and by dynamically controlling the amounts of those proteins. In this article, we revisit the problem of mathematically modeling transcriptional regulation. First, we adopt a biologically motivated continuous model for gene transcription and mRNA translation, based on first-order rate equations, coupled with a set of nonlinear equations that model cis-regulation. Then, we view the processes of transcription and translation as being discrete, which, together with the need to use computational techniques for large-scale analysis and simulation, motivates us to model transcriptional regulation by means of a nonlinear discrete dynamical system. Classical arguments from chemical kinetics allow us to specify the nonlinearities underlying cis-regulation and to include both activators and repressors as well as the notion of regulatory modules in our formulation. We show that the steady-state behavior of the proposed discrete dynamical system is identical to that of the continuous model. We discuss several aspects of our model, related to homeostatic and epigenetic regulation as well as to Boolean networks, and elaborate on their significance. Simulations of transcriptional regulation of a hypothetical metabolic pathway illustrate several properties of our model, and demonstrate that a nonlinear discrete dynamical system may be effectively used to model transcriptional regulation in a biologically relevant way.
Affiliation(s)
- John Goutsias
- The Whitaker Biomedical Engineering Institute, The Johns Hopkins University, Baltimore, Maryland 21218, USA.
|
27
|
Abstract
An unsupervised iterative scheme is proposed for land mine detection in heavily cluttered scenes. This scheme is based on iterating hybrid multispectral filters that consist of a decorrelating linear transform coupled with a nonlinear morphological detector. Detections extracted from the first pass are used to improve results in subsequent iterations. The procedure stops after a predetermined number of iterations. The proposed scheme addresses several weaknesses associated with previous adaptations of morphological approaches to land mine detection. Improvement in detection performance, robustness with respect to clutter inhomogeneities, a completely unsupervised operation, and computational efficiency are the main highlights of the method. Experimental results reveal excellent performance.
Affiliation(s)
- Sinan Batman
- Eastman Kodak Health Imaging, Allendale, NJ 07401, USA.
|
28
|
Goutsias J, Heijmans HM. Nonlinear multiresolution signal decomposition schemes--part I: morphological pyramids. IEEE Trans Image Process 2000; 9:1862-1876. [PMID: 18262923 DOI: 10.1109/83.877209] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.6] [Indexed: 05/25/2023]
Abstract
Interest in multiresolution techniques for signal processing and analysis is increasing steadily. An important instance of such a technique is the so-called pyramid decomposition scheme. This paper presents a general theory for constructing linear as well as nonlinear pyramid decomposition schemes for signal analysis and synthesis. The proposed theory is based on the following ingredients: 1) the pyramid consists of a (finite or infinite) number of levels such that the information content decreases toward higher levels and 2) each step toward a higher level is implemented by an (information-reducing) analysis operator, whereas each step toward a lower level is implemented by an (information-preserving) synthesis operator. One basic assumption is necessary: synthesis followed by analysis yields the identity operator, meaning that no information is lost by these two consecutive steps. Several examples of pyramid decomposition schemes are shown to be instances of the proposed theory: a particular class of linear pyramids, morphological skeleton decompositions, the morphological Haar pyramid, median pyramids, etc. Furthermore, the paper makes a distinction between single-scale and multiscale decomposition schemes, i.e., schemes without or with sample reduction. Finally, the proposed theory provides the foundation of a general approach to constructing nonlinear wavelet decomposition schemes and filter banks.
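The pyramid condition stated in this abstract (synthesis followed by analysis yields the identity) can be illustrated with a one-dimensional morphological Haar pyramid. The sketch below is a minimal illustration under assumptions: analysis takes the minimum of each non-overlapping pair of samples, synthesis duplicates each coarse sample, and the signal has even length. The function names `analysis` and `synthesis` are hypothetical.

```python
def analysis(signal):
    """Information-reducing step: minimum over non-overlapping sample pairs."""
    if len(signal) % 2:
        raise ValueError("this sketch assumes an even-length signal")
    return [min(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]


def synthesis(coarse):
    """Step back to the finer level: repeat every coarse sample twice."""
    fine = []
    for value in coarse:
        fine.extend([value, value])
    return fine


x = [4, 7, 2, 2, 9, 5, 1, 8]
coarse = analysis(x)        # next pyramid level
approx = synthesis(coarse)  # approximation of x at the original resolution

# Pyramid condition: analysis applied after synthesis returns the coarse signal.
assert analysis(synthesis(coarse)) == coarse

# The detail signal records what the approximation misses, so that x is recoverable.
detail = [a - b for a, b in zip(x, approx)]
recovered = [b + d for b, d in zip(approx, detail)]
print(coarse, detail, recovered == x)
```

Note that the reverse composition, analysis followed by synthesis, is generally not the identity, which is why the detail signal has to be stored at each level.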
Affiliation(s)
- J Goutsias
- Dept. of Electr. and Comput. Eng., Johns Hopkins Univ., Baltimore, MD 21218, USA.
|
29
|
Heijmans HM, Goutsias J. Nonlinear multiresolution signal decomposition schemes--part II: morphological wavelets. IEEE Trans Image Process 2000; 9:1897-1913. [PMID: 18262925 DOI: 10.1109/83.877211] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.7] [Indexed: 05/25/2023]
Abstract
In its original form, the wavelet transform is a linear tool. However, it has been increasingly recognized that nonlinear extensions are possible. A major impulse to the development of nonlinear wavelet transforms has been given by the introduction of the lifting scheme by Sweldens (1995, 1996, 1998). The aim of this paper, which is a sequel to a previous paper devoted exclusively to the pyramid transform, is to present an axiomatic framework encompassing most existing linear and nonlinear wavelet decompositions. Furthermore, it introduces some, thus far unknown, wavelets based on mathematical morphology, such as the morphological Haar wavelet, both in one and two dimensions. A general and flexible approach for the construction of nonlinear (morphological) wavelets is provided by the lifting scheme. This paper briefly discusses one example, the max-lifting scheme, which has the intriguing property that it preserves local maxima in a signal over a range of scales, depending on how local or global these maxima are.
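A one-dimensional morphological Haar wavelet of the kind mentioned in this abstract can be sketched as follows. This is an illustration in a commonly given form, assumed here rather than taken from the abstract: the approximation is the minimum of each sample pair, the detail is the difference of the pair, and the signal has even length. The function names `haar_analysis` and `haar_synthesis` are hypothetical.

```python
def haar_analysis(signal):
    """Split a signal into a min-based approximation and a difference detail."""
    if len(signal) % 2:
        raise ValueError("this sketch assumes an even-length signal")
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [min(a, b) for a, b in pairs]
    detail = [a - b for a, b in pairs]
    return approx, detail


def haar_synthesis(approx, detail):
    """Rebuild the original samples from approximation and detail."""
    signal = []
    for m, d in zip(approx, detail):
        signal.append(m + max(d, 0))  # even sample: the minimum plus any positive excess
        signal.append(m - min(d, 0))  # odd sample: the minimum plus any negative excess
    return signal


x = [4, 7, 2, 2, 9, 5, 1, 8]
approx, detail = haar_analysis(x)
print(approx, detail)
print(haar_synthesis(approx, detail) == x)
```

Unlike the linear Haar wavelet, the approximation here is a minimum rather than an average, so the decomposition is nonlinear while still allowing perfect reconstruction.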
Affiliation(s)
- H M Heijmans
- Centre for Mathematics and Computer Science (CWI), Amsterdam.
|
30
|
Abstract
We theoretically formulate the problem of processing continuous-space binary random fields by means of mathematical morphology. This may allow us to employ mathematical morphology to develop new statistical techniques for the analysis of binary random images. Since morphological transformations of continuous-space binary random fields are not measurable in general, we are naturally forced to employ intermediate steps that require generation of an equivalent random closed set. The relationship between continuous-space binary random fields and random closed sets is thoroughly investigated. As a byproduct of this investigation, a number of useful new results, regarding separability of random closed sets, are presented. Our plan, however, suffers from a few technical problems that are prominent in the continuous case. As an alternative, we suggest morphological discretization of binary random fields, random closed sets, and morphological operators, thereby effectively implementing our problem in the discrete domain.
Affiliation(s)
- K Sivakumar
- Dept. of Electr. and Comput. Eng., Johns Hopkins Univ., Baltimore, MD