1
Rice RC, Gil DV, Baratta AM, Frawley RR, Hill SY, Farris SP, Homanics GE. Inter- and transgenerational heritability of preconception chronic stress or alcohol exposure: Translational outcomes in brain and behavior. Neurobiol Stress 2024; 29:100603. [PMID: 38234394 PMCID: PMC10792982 DOI: 10.1016/j.ynstr.2023.100603] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2023] [Revised: 12/18/2023] [Accepted: 12/19/2023] [Indexed: 01/19/2024]
Abstract
Chronic stress and alcohol (ethanol) use are highly interrelated and can change an individual's behavior through molecular adaptations that do not change the DNA sequence, but instead change gene expression. A recent wealth of research has found that these nongenomic changes can be transmitted across generations, which could partially account for the "missing heritability" observed in genome-wide association studies of alcohol use disorder and other stress-related neuropsychiatric disorders. In this review, we summarize the molecular and behavioral outcomes of nongenomic inheritance of chronic stress and ethanol exposure and the germline mechanisms that could give rise to this heritability. In doing so, we outline the need for further research to: (1) Investigate individual germline mechanisms of paternal, maternal, and biparental nongenomic chronic stress- and ethanol-related inheritance; (2) Synthesize and dissect cross-generational chronic stress and ethanol exposure; (3) Determine cross-generational molecular outcomes of preconception ethanol exposure that contribute to alcohol-related disease risk, using cancer as an example. A detailed understanding of the cross-generational nongenomic effects of stress and/or ethanol will yield novel insight into the impact of ancestral perturbations on disease risk across generations and uncover actionable targets to improve human health.
Affiliation(s)
- Rachel C. Rice
  - Center for Neuroscience at the University of Pittsburgh, Pittsburgh, PA, USA
- Daniela V. Gil
  - Center for Neuroscience at the University of Pittsburgh, Pittsburgh, PA, USA
- Annalisa M. Baratta
  - Center for Neuroscience at the University of Pittsburgh, Pittsburgh, PA, USA
- Remy R. Frawley
  - Department of Anesthesiology and Perioperative Medicine, University of Pittsburgh, Pittsburgh, PA, USA
- Shirley Y. Hill
  - Department of Psychiatry, University of Pittsburgh, Pittsburgh, PA, USA
  - Department of Psychology, University of Pittsburgh, Pittsburgh, PA, USA
  - Department of Human Genetics, School of Public Health, University of Pittsburgh, Pittsburgh, PA, USA
- Sean P. Farris
  - Center for Neuroscience at the University of Pittsburgh, Pittsburgh, PA, USA
  - Department of Anesthesiology and Perioperative Medicine, University of Pittsburgh, Pittsburgh, PA, USA
  - Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA
- Gregg E. Homanics
  - Center for Neuroscience at the University of Pittsburgh, Pittsburgh, PA, USA
  - Department of Anesthesiology and Perioperative Medicine, University of Pittsburgh, Pittsburgh, PA, USA
  - Department of Pharmacology and Chemical Biology, University of Pittsburgh, Pittsburgh, PA, USA
2
Nauwelaerts SJD, De Cremer K, Bustos Sierra N, Gand M, Van Geel D, Delvoye M, Vandermassen E, Vercauteren J, Stroobants C, Bernard A, Saenen ND, Nawrot TS, Roosens NHC, De Keersmaecker SCJ. Assessment of the Feasibility of a Future Integrated Larger-Scale Epidemiological Study to Evaluate Health Risks of Air Pollution Episodes in Children. Int J Environ Res Public Health 2022; 19:8531. [PMID: 35886381 PMCID: PMC9323067 DOI: 10.3390/ijerph19148531] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/20/2022] [Revised: 07/08/2022] [Accepted: 07/10/2022] [Indexed: 02/07/2023]
Abstract
Air pollution exposure can lead to exacerbation of respiratory disorders in children. Using sensitive biomarkers helps to assess the impact of air pollution on children’s respiratory health and combining protein, genetic and epigenetic biomarkers gives insights on their interrelatedness. Most studies do not contain such an integrated approach and investigate these biomarkers individually in blood, although its collection in children is challenging. Our study aimed at assessing the feasibility of conducting future integrated larger-scale studies evaluating respiratory health risks of air pollution episodes in children, based on a qualitative analysis of the technical and logistic aspects of a small-scale field study involving 42 children. This included the preparation, collection and storage of non-invasive samples (urine, saliva), the measurement of general and respiratory health parameters and the measurement of specific biomarkers (genetic, protein, epigenetic) of respiratory health and air pollution exposure. Bottlenecks were identified and modifications were proposed to expand this integrated study to a higher number of children, time points and locations. This would allow for non-invasive assessment of the impact of air pollution exposure on the respiratory health of children in future larger-scale studies, which is critical for the development of policies or measures at the population level.
Affiliation(s)
- Sarah J. D. Nauwelaerts
  - Transversal Activities in Applied Genomics, Sciensano, 1050 Brussels, Belgium
  - Centre for Toxicology and Applied Pharmacology, University Catholique de Louvain, 1200 Brussels, Belgium
- Koen De Cremer
  - Platform Chromatography and Mass Spectrometry, Sciensano, 1050 Brussels, Belgium
- Mathieu Gand
  - Transversal Activities in Applied Genomics, Sciensano, 1050 Brussels, Belgium
- Dirk Van Geel
  - Transversal Activities in Applied Genomics, Sciensano, 1050 Brussels, Belgium
- Maud Delvoye
  - Transversal Activities in Applied Genomics, Sciensano, 1050 Brussels, Belgium
- Els Vandermassen
  - Transversal Activities in Applied Genomics, Sciensano, 1050 Brussels, Belgium
- Jordy Vercauteren
  - Unit Air, Vlaamse Milieumaatschappij, 2000 Antwerpen, Belgium
- Alfred Bernard
  - Centre for Toxicology and Applied Pharmacology, University Catholique de Louvain, 1200 Brussels, Belgium
- Nelly D. Saenen
  - Centre for Environmental Sciences, Hasselt University, 3590 Diepenbeek, Belgium
- Tim S. Nawrot
  - Centre for Environmental Sciences, Hasselt University, 3590 Diepenbeek, Belgium
  - Department of Public Health and Primary Care, KU Leuven, 3000 Leuven, Belgium
- Nancy H. C. Roosens
  - Transversal Activities in Applied Genomics, Sciensano, 1050 Brussels, Belgium
- Sigrid C. J. De Keersmaecker
  - Transversal Activities in Applied Genomics, Sciensano, 1050 Brussels, Belgium
3
Liu Q, Cheng B, Jin Y, Hu P. Bayesian tensor factorization-drive breast cancer subtyping by integrating multi-omics data. J Biomed Inform 2021; 125:103958. [PMID: 34839017 DOI: 10.1016/j.jbi.2021.103958] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/20/2021] [Revised: 10/13/2021] [Accepted: 11/19/2021] [Indexed: 12/12/2022]
Abstract
Breast cancer is a highly heterogeneous disease. Subtyping the disease and identifying the genomic features driving these subtypes are critical for precision oncology for breast cancer. This study focuses on developing a new computational approach for breast cancer subtyping. We proposed to use Bayesian tensor factorization (BTF) to integrate multi-omics data of breast cancer, which include expression profiles of RNA-sequencing, copy number variation, and DNA methylation measured on 762 breast cancer patients from The Cancer Genome Atlas. We applied a consensus clustering approach to identify breast cancer subtypes using the factorized latent features by BTF. Subtype-specific survival patterns of the breast cancer patients were evaluated using Kaplan-Meier (KM) estimators. The proposed approach was compared with other state-of-the-art approaches for cancer subtyping. The BTF-subtyping analysis identified 17 optimized latent components, which were used to reveal six major breast cancer subtypes. Out of all different approaches, only the proposed approach showed distinct survival patterns (p < 0.05). Statistical tests also showed that the identified clusters have statistically significant distributions. Our results showed that the proposed approach is a promising strategy to efficiently use publicly available multi-omics data to identify breast cancer subtypes.
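The Kaplan-Meier estimator named in this abstract can be sketched in a few lines. This is a minimal, generic illustration only: the function name `kaplan_meier` and the toy durations are assumptions made for the example, not the authors' code or TCGA data. At each distinct event time the running survival probability is multiplied by (1 - deaths / number at risk), and censored subjects leave the risk set without counting as events.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time for each subject (hypothetical toy input)
    events:    1 if the event (e.g. death) was observed, 0 if censored
    """
    deaths = Counter(t for t, e in zip(durations, events) if e)
    leaving = Counter(durations)  # every subject leaves the risk set at its own time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: scale by the fraction surviving this time point.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


# Toy cohort: five subjects, two of them censored (event flag 0).
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

In a subtyping workflow, one such curve would be computed per cluster and the curves compared with a log-rank test; that comparison step is not shown here.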
Affiliation(s)
- Qian Liu
  - Department of Biochemistry and Medical Genetics, University of Manitoba, Winnipeg, Canada; Department of Computer Science, University of Manitoba, Winnipeg, Manitoba, Canada
- Bowen Cheng
  - Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada
- Yongwon Jin
  - Department of Biochemistry and Medical Genetics, University of Manitoba, Winnipeg, Canada
- Pingzhao Hu
  - Department of Biochemistry and Medical Genetics, University of Manitoba, Winnipeg, Canada; Department of Computer Science, University of Manitoba, Winnipeg, Manitoba, Canada; Dalla Lana School of Public Health, University of Toronto, Toronto, Ontario, Canada; CancerCare Manitoba Research Institute, Winnipeg, Manitoba, Canada
4
Liu Y, Baggerly KA, Orouji E, Manyam G, Chen H, Lam M, Davis JS, Lee MS, Broom BM, Menter DG, Rai K, Kopetz S, Morris JS. Methylation-eQTL Analysis in Cancer Research. Bioinformatics 2021; 37:4014-4022. [PMID: 34117863 PMCID: PMC9188481 DOI: 10.1093/bioinformatics/btab443] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/29/2020] [Revised: 03/15/2021] [Accepted: 06/11/2021] [Indexed: 11/13/2022]
Abstract
MOTIVATION: DNA methylation is a key epigenetic factor regulating gene expression. While promoter methylation has been well studied, recent publications have revealed that functionally important methylation also occurs in intergenic and distal regions, and varies across genes and tissue types. Given the growing importance of inter-platform integrative genomic analyses, there is an urgent need to develop methods to discover and characterize gene-level relationships between methylation and expression.
RESULTS: We introduce a novel sequential penalized regression approach to identify methylation-expression quantitative trait loci (methyl-eQTLs), a term that we have coined to represent, for each gene and tissue type, a sparse set of CpG loci best explaining gene expression and accompanying weights indicating direction and strength of association. Using TCGA and MD Anderson colorectal cohorts to build and validate our models, we demonstrate our strategy better explains expression variability than current commonly used gene-level methylation summaries. The methyl-eQTLs identified by our approach can be used to construct gene-level methylation summaries that are maximally correlated with gene expression for use in integrative models, and produce a tissue-specific summary of which genes appear to be strongly regulated by methylation. Our results introduce an important resource to the biomedical community for integrative genomics analyses involving DNA methylation.
AVAILABILITY AND IMPLEMENTATION: We produce an R Shiny app (https://rstudio-prd-c1.pmacs.upenn.edu/methyl-eQTL/) that interactively presents methyl-eQTL results for colorectal, breast, and pancreatic cancer. The source R code for this work is provided in the supplement.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yusha Liu
  - Department of Human Genetics, The University of Chicago, Chicago, IL 60637, USA
- Keith A Baggerly
  - Department of Bioinformatics and Computational Biology, M.D. Anderson Cancer Center, Houston, TX 77030, USA
- Elias Orouji
  - Department of Genomic Medicine, M.D. Anderson Cancer Center, Houston, TX 77030, USA
- Ganiraju Manyam
  - Department of Bioinformatics and Computational Biology, M.D. Anderson Cancer Center, Houston, TX 77030, USA
- Huiqin Chen
  - Department of Bioinformatics and Computational Biology, M.D. Anderson Cancer Center, Houston, TX 77030, USA
- Michael Lam
  - Department of Gastrointestinal Medical Oncology, M.D. Anderson Cancer Center, Houston, TX 77030, USA
- Jennifer S Davis
  - Department of Epidemiology, M.D. Anderson Cancer Center, Houston, TX 77030, USA
- Michael S Lee
  - Department of Medicine, The University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Bradley M Broom
  - Department of Bioinformatics and Computational Biology, M.D. Anderson Cancer Center, Houston, TX 77030, USA
- David G Menter
  - Department of Gastrointestinal Medical Oncology, M.D. Anderson Cancer Center, Houston, TX 77030, USA
- Kunal Rai
  - Department of Genomic Medicine, M.D. Anderson Cancer Center, Houston, TX 77030, USA
- Scott Kopetz
  - Department of Gastrointestinal Medical Oncology, M.D. Anderson Cancer Center, Houston, TX 77030, USA
- Jeffrey S Morris
  - Department of Biostatistics, Epidemiology and Informatics, The University of Pennsylvania, Philadelphia, PA 19104-6021, USA
5
Rabadán R, Mohamedi Y, Rubin U, Chu T, Alghalith AN, Elliott O, Arnés L, Cal S, Obaya ÁJ, Levine AJ, Cámara PG. Identification of relevant genetic alterations in cancer using topological data analysis. Nat Commun 2020; 11:3808. [PMID: 32732999 PMCID: PMC7393176 DOI: 10.1038/s41467-020-17659-7] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 08/30/2018] [Accepted: 07/09/2020] [Indexed: 01/05/2023]
Abstract
Large-scale cancer genomic studies enable the systematic identification of mutations that lead to the genesis and progression of tumors, uncovering the underlying molecular mechanisms and potential therapies. While some such mutations are recurrently found in many tumors, many others exist solely within a few samples, precluding detection by conventional recurrence-based statistical approaches. Integrated analysis of somatic mutations and RNA expression data across 12 tumor types reveals that mutations of cancer genes are usually accompanied by substantial changes in expression. We use topological data analysis to leverage this observation and uncover 38 elusive candidate cancer-associated genes, including inactivating mutations of the metalloproteinase ADAMTS12 in lung adenocarcinoma. We show that ADAMTS12-/- mice have a five-fold increase in the susceptibility to develop lung tumors, confirming the role of ADAMTS12 as a tumor suppressor gene. Our results demonstrate that data integration through topological techniques can increase our ability to identify previously unreported cancer-related alterations.
Affiliation(s)
- Raúl Rabadán
  - Departments of Systems Biology and Biomedical Informatics, Columbia University, 1130 St. Nicholas Ave., New York, NY, 10032, USA
- Yamina Mohamedi
  - Departamento de Bioquimica y Biologia Molecular, Universidad de Oviedo, Oviedo, Asturias, Spain
  - IUOPA, Instituto Universitario de Oncologia, Oviedo, Asturias, Spain
- Udi Rubin
  - Departments of Systems Biology and Biomedical Informatics, Columbia University, 1130 St. Nicholas Ave., New York, NY, 10032, USA
  - Memorial Sloan Kettering Cancer Center, 1275 York Ave, New York, NY, 10065, USA
- Tim Chu
  - Departments of Systems Biology and Biomedical Informatics, Columbia University, 1130 St. Nicholas Ave., New York, NY, 10032, USA
- Adam N Alghalith
  - Department of Genetics and Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, 3400 Civic Center Blvd., Philadelphia, PA, 19104, USA
- Oliver Elliott
  - Departments of Systems Biology and Biomedical Informatics, Columbia University, 1130 St. Nicholas Ave., New York, NY, 10032, USA
- Luis Arnés
  - Departments of Systems Biology and Biomedical Informatics, Columbia University, 1130 St. Nicholas Ave., New York, NY, 10032, USA
- Santiago Cal
  - Departamento de Bioquimica y Biologia Molecular, Universidad de Oviedo, Oviedo, Asturias, Spain
  - IUOPA, Instituto Universitario de Oncologia, Oviedo, Asturias, Spain
- Álvaro J Obaya
  - IUOPA, Instituto Universitario de Oncologia, Oviedo, Asturias, Spain
  - Departamento de Biologia Funcional, Universidad de Oviedo, Oviedo, Asturias, Spain
- Arnold J Levine
  - The Simons Center for Systems Biology, Institute for Advanced Study, Princeton, NJ, 08540, USA
- Pablo G Cámara
  - Department of Genetics and Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, 3400 Civic Center Blvd., Philadelphia, PA, 19104, USA
6
Tarazona S, Balzano-Nogueira L, Gómez-Cabrero D, Schmidt A, Imhof A, Hankemeier T, Tegnér J, Westerhuis JA, Conesa A. Harmonization of quality metrics and power calculation in multi-omic studies. Nat Commun 2020; 11:3092. [PMID: 32555183 PMCID: PMC7303201 DOI: 10.1038/s41467-020-16937-8] [Citation(s) in RCA: 40] [Impact Index Per Article: 10.0] [Received: 07/27/2018] [Accepted: 05/29/2020] [Indexed: 12/20/2022]
Abstract
Multi-omic studies combine measurements at different molecular levels to build comprehensive models of cellular systems. The success of a multi-omic data analysis strategy depends largely on the adoption of adequate experimental designs, and on the quality of the measurements provided by the different omic platforms. However, the field lacks a comparative description of performance parameters across omic technologies and a formulation for experimental design in multi-omic data scenarios. Here, we propose a set of harmonized Figures of Merit (FoM) as quality descriptors applicable to different omic data types. Employing this information, we formulate the MultiPower method to estimate and assess the optimal sample size in a multi-omics experiment. MultiPower supports different experimental settings, data types and sample sizes, and includes graphical tools for experimental design decision-making. MultiPower is complemented with MultiML, an algorithm to estimate sample size for machine learning classification problems based on multi-omic data.
Affiliation(s)
- Sonia Tarazona
  - Department of Applied Statistics, Operations Research and Quality, Universitat Politècnica de València, Valencia, Spain
- Leandro Balzano-Nogueira
  - Microbiology and Cell Science Department, Institute for Food and Agricultural Research, University of Florida, Gainesville, FL, USA
- David Gómez-Cabrero
  - Unit of Computational Medicine, Department of Medicine, Solna, Center for Molecular Medicine, Karolinska Institutet, Stockholm, Sweden
  - Science for Life Laboratory, Solna, Sweden
  - Mucosal & Salivary Biology Division, King's College London Dental Institute, London, UK
  - Navarrabiomed, Complejo Hospitalario de Navarra (CHN), Universidad Pública de Navarra (UPNA), IdiSNA, Pamplona, Spain
- Andreas Schmidt
  - Protein Analysis Unit, Biomedical Center, Faculty of Medicine, LMU Munich, Planegg-Martinsried, Germany
- Axel Imhof
  - Protein Analysis Unit, Biomedical Center, Faculty of Medicine, LMU Munich, Planegg-Martinsried, Germany
  - Munich Center of Integrated Protein Science LMU Munich, Planegg-Martinsried, Germany
- Thomas Hankemeier
  - Division Analytical Biosciences, Leiden/Amsterdam Center for Drug Research, Leiden, The Netherlands
- Jesper Tegnér
  - Unit of Computational Medicine, Department of Medicine, Solna, Center for Molecular Medicine, Karolinska Institutet, Stockholm, Sweden
  - Science for Life Laboratory, Solna, Sweden
  - Biological and Environmental Sciences and Engineering Division, Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Johan A Westerhuis
  - Swammerdam Institute for Life Sciences, University of Amsterdam, Amsterdam, The Netherlands
  - Department of Statistics, Faculty of Natural Sciences, North-West University (Potchefstroom Campus), Potchefstroom, South Africa
- Ana Conesa
  - Microbiology and Cell Science Department, Institute for Food and Agricultural Research, University of Florida, Gainesville, FL, USA
  - Genetics Institute, University of Florida, Gainesville, FL, USA
7
Fang J. Tightly integrated genomic and epigenomic data mining using tensor decomposition. Bioinformatics 2019; 35:112-118. [PMID: 29939222 DOI: 10.1093/bioinformatics/bty513] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 01/17/2018] [Accepted: 06/21/2018] [Indexed: 12/12/2022]
Abstract
Motivation: Complex diseases such as cancers often involve multiple types of genomic and/or epigenomic abnormalities. Rapid accumulation of multiple types of omics data demands methods for integrating the multidimensional data in order to elucidate complex relationships among different types of genomic and epigenomic abnormalities.
Results: In the present study, we propose a tightly integrated approach based on tensor decomposition. Multiple types of data, including mRNA, methylation, copy number variations and somatic mutations, are merged into a high-order tensor which is used to develop predictive models for overall survival. The weight tensors of the models are constrained using CANDECOMP/PARAFAC (CP) tensor decomposition and learned using support tensor machine regression (STR) and ridge tensor regression (RTR). The results demonstrate that the tensor decomposition-based approaches can achieve better performance than models based on individual data types and than the concatenation approach.
Supplementary information: Supplementary data are available at Bioinformatics online.
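As a hedged illustration of what a CP constraint on a weight tensor means, the generic form can be written as below. The notation is an assumption made for this sketch (a third-order sample tensor $\mathcal{X}_i \in \mathbb{R}^{I \times J \times K}$, rank $R$, factor vectors $\mathbf{a}_r, \mathbf{b}_r, \mathbf{c}_r$, intercept $b$, penalty weight $\lambda$); it is not taken from the paper, whose exact loss and tensor order may differ.

```latex
% CP (CANDECOMP/PARAFAC) constraint: the weight tensor is a sum of R rank-one terms
\mathcal{W} \;=\; \sum_{r=1}^{R} \mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r,
\qquad \mathbf{a}_r \in \mathbb{R}^{I},\; \mathbf{b}_r \in \mathbb{R}^{J},\; \mathbf{c}_r \in \mathbb{R}^{K}

% Linear prediction for sample i via the tensor inner product
\hat{y}_i \;=\; \langle \mathcal{W}, \mathcal{X}_i \rangle + b
\;=\; \sum_{r=1}^{R} \sum_{j,k,l} (\mathbf{a}_r)_j \, (\mathbf{b}_r)_k \, (\mathbf{c}_r)_l \, (\mathcal{X}_i)_{jkl} \;+\; b

% A generic ridge-type objective over the factors (illustrative only)
\min_{\{\mathbf{a}_r, \mathbf{b}_r, \mathbf{c}_r\},\, b}
\;\sum_{i=1}^{n} \bigl( y_i - \hat{y}_i \bigr)^2 \;+\; \lambda \, \lVert \mathcal{W} \rVert_F^2
```

Under this constraint the model has $R(I + J + K)$ free weights instead of $IJK$, which is the usual motivation for a low-rank factorization when samples are few relative to features.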
Affiliation(s)
- Jianwen Fang
  - Computational & Systems Biology Branch, Biometric Research Program, Division of Cancer Treatment and Diagnosis, National Cancer Institute, 9609 Medical Center Dr., Rockville, MD, USA
8
Sharma A, Jiang C, De S. Dissecting the sources of gene expression variation in a pan-cancer analysis identifies novel regulatory mutations. Nucleic Acids Res 2019; 46:4370-4381. [PMID: 29672706 PMCID: PMC5961375 DOI: 10.1093/nar/gky271] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 10/01/2017] [Accepted: 03/29/2018] [Indexed: 02/07/2023]
Abstract
Although the catalog of cancer-associated mutations in protein-coding regions is nearly complete for all major cancer types, the assessment of regulatory changes in cancer genomes and their clinical significance remains largely preliminary. Adopting a bottom-up approach, we quantify the effects of different sources of gene expression variation in a cohort of 3899 samples from 10 cancer types. We find that copy number alterations, epigenetic changes, transcription factors and microRNAs collectively explain, on average, only 31–38% and 18–26% of expression variation for cancer-associated and other genes, respectively, and that among these factors copy number alteration has the largest effect. We show that the genes with systematic, large expression variation that could not be attributed to these factors are enriched for pathways related to cancer hallmarks. Integrating whole genome sequencing data and focusing on genes with systematic expression variation, we identify novel, recurrent regulatory mutations affecting known cancer genes such as NKX2-1 and GRIN2D in multiple cancer types. Nonetheless, at a genome-wide scale, the proportions of gene expression variation attributed to recurrent point mutations appear to be modest so far, especially when compared to that attributed to copy number changes – a pattern different from that observed for other complex diseases and traits. We suspect that, owing to plasticity and redundancy in biological pathways, regulatory alterations show complex combinatorial patterns, modulating gene expression in cancer genomes at a finer scale.
Affiliation(s)
- Anchal Sharma
  - Center for Systems and Computational Biology, Rutgers Cancer Institute of New Jersey, Rutgers the State University of New Jersey, New Brunswick, NJ 08901, USA
- Chuan Jiang
  - Center for Systems and Computational Biology, Rutgers Cancer Institute of New Jersey, Rutgers the State University of New Jersey, New Brunswick, NJ 08901, USA
- Subhajyoti De
  - Center for Systems and Computational Biology, Rutgers Cancer Institute of New Jersey, Rutgers the State University of New Jersey, New Brunswick, NJ 08901, USA
9
Guala D, Ogris C, Müller N, Sonnhammer ELL. Genome-wide functional association networks: background, data & state-of-the-art resources. Brief Bioinform 2019; 21:1224-1237. [PMID: 31281921 PMCID: PMC7373183 DOI: 10.1093/bib/bbz064] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 02/05/2019] [Revised: 04/29/2019] [Accepted: 05/04/2019] [Indexed: 02/06/2023]
Abstract
The vast amount of experimental data from recent advances in the field of high-throughput biology begs for integration into more complex data structures such as genome-wide functional association networks. Such networks have been used for elucidation of the interplay of intra-cellular molecules to make advances ranging from the basic science understanding of evolutionary processes to the more translational field of precision medicine. The allure of the field has resulted in rapid growth of the number of available network resources, each with unique attributes exploitable to answer different biological questions. Unfortunately, the high volume of network resources makes it impossible for the intended user to select an appropriate tool for their particular research question. The aim of this paper is to provide an overview of the underlying data and representative network resources as well as to mention methods of integration, allowing a customized approach to resource selection. Additionally, this report will provide a primer for researchers venturing into the field of network integration.
Affiliation(s)
- Dimitri Guala
  - Science for Life Laboratory, Stockholm Bioinformatics Center, Department of Biochemistry and Biophysics, Stockholm University, Box 1031, 17121 Solna, Sweden
- Christoph Ogris
  - Computational Cell Maps, Institute of Computational Biology, Helmholtz Center Munich, Ingolstädter Landstr. 1, 85764 Neuherberg, Germany
- Nikola Müller
  - Computational Cell Maps, Institute of Computational Biology, Helmholtz Center Munich, Ingolstädter Landstr. 1, 85764 Neuherberg, Germany
- Erik L L Sonnhammer
  - Science for Life Laboratory, Stockholm Bioinformatics Center, Department of Biochemistry and Biophysics, Stockholm University, Box 1031, 17121 Solna, Sweden
10
Marsh JW, Hayward RJ, Shetty AC, Mahurkar A, Humphrys MS, Myers GSA. Bioinformatic analysis of bacteria and host cell dual RNA-sequencing experiments. Brief Bioinform 2019; 19:1115-1129. [PMID: 28535295 DOI: 10.1093/bib/bbx043] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 02/02/2017] [Indexed: 12/18/2022]
Abstract
Bacterial pathogens subvert host cells by manipulating cellular pathways for survival and replication; in turn, host cells respond to the invading pathogen through cascading changes in gene expression. Deciphering these complex temporal and spatial dynamics to identify novel bacterial virulence factors or host response pathways is crucial for improved diagnostics and therapeutics. Dual RNA sequencing (dRNA-Seq) has recently been developed to simultaneously capture host and bacterial transcriptomes from an infected cell. This approach builds on the high sensitivity and resolution of RNA sequencing technology and is applicable to any bacteria that interact with eukaryotic cells, encompassing parasitic, commensal or mutualistic lifestyles. Several laboratory protocols have been presented that outline the collection, extraction and sequencing of total RNA for dRNA-Seq experiments, but there is relatively little guidance available for the detailed bioinformatic analyses required. This protocol outlines a typical dRNA-Seq experiment, based on a Chlamydia trachomatis-infected host cell, with a detailed description of the necessary bioinformatic analyses with currently available software tools.
Affiliation(s)
- James W Marsh
  - The ithree institute, University of Technology Sydney
- Amol C Shetty
  - Institute for Genome Sciences at the University of Maryland, Baltimore
- Anup Mahurkar
  - Institute for Genome Sciences at the University of Maryland, Baltimore
11
Krishnan NM, I M, Hariharan J, Panda B. CAFE MOCHA: An Integrated Platform for Discovering Clinically Relevant Molecular Changes in Cancer-An Example of Distant Metastasis- and Recurrence-Linked Classifiers in Head and Neck Squamous Cell Carcinoma. JCO Clin Cancer Inform 2019; 2:1-11. [PMID: 30652568 DOI: 10.1200/cci.17.00045] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022]
Abstract
PURPOSE With large amounts of multidimensional molecular data on cancers generated and deposited into public repositories such as The Cancer Genome Atlas and International Cancer Genome Consortium, a cancer type agnostic and integrative platform will help to identify signatures with clinical relevance. We devised such a platform and showcase it by identifying a molecular signature for patients with metastatic and recurrent (MR) head and neck squamous cell carcinoma (HNSCC). METHODS We devised a statistical framework accompanied by a graphical user interface-driven application, Clinical Association of Functionally Established MOlecular CHAnges ( CAFE MOCHA; https://github.com/binaypanda/CAFEMOCHA), to discover molecular signatures linked to a specific clinical attribute in a cancer type. The platform integrates mutations and indels, gene expression, DNA methylation, and copy number variations to discover a classifier first and then to predict an incoming tumor for the same by pulling defined class variables into a single framework that incorporates a coordinate geometry-based algorithm called complete specificity margin-based clustering, which ensures maximum specificity. CAFE MOCHA classifies an incoming tumor sample using either its matched normal or a built-in database of normal tissues. The application is packed and deployed using the install4j multiplatform installer. We tested CAFE MOCHA in HNSCC tumors (n = 513) followed by validation in tumors from an independent cohort (n = 18) for discovering a signature linked to distant MR. RESULTS CAFE MOCHA identified an integrated signature, MR44, associated with distant MR HNSCC, with 80% sensitivity and 100% specificity in the discovery stage and 100% sensitivity and 100% specificity in the validation stage. CONCLUSION CAFE MOCHA is a cancer type and clinical attribute agnostic statistical framework to discover integrated molecular signatures.
Affiliation(s)
- All authors (Neeraja M Krishnan, Mohanraj I, Janani Hariharan, Binay Panda): Ganit Labs; and Binay Panda, Strand Life Sciences, Bangalore, Karnataka, India
12
Cologne J, Loo L, Shvetsov YB, Misumi M, Lin P, Haiman CA, Wilkens LR, Le Marchand L. Stepwise approach to SNP-set analysis illustrated with the Metabochip and colorectal cancer in Japanese Americans of the Multiethnic Cohort. BMC Genomics 2018; 19:524. [PMID: 29986644 PMCID: PMC6038257 DOI: 10.1186/s12864-018-4910-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 11/16/2017] [Accepted: 06/29/2018] [Indexed: 12/19/2022]
Abstract
BACKGROUND Common variants have explained less than the amount of heritability expected for complex diseases, which has led to interest in less-common variants and more powerful approaches to the analysis of whole-genome scans. Because of low frequency (low statistical power), less-common variants are best analyzed using SNP-set methods such as gene-set or pathway-based analyses. However, there is as yet no clear consensus regarding how to focus in on potential risk variants following set-based analyses. We used a stepwise, telescoping approach to analyze common- and rare-variant data from the Illumina Metabochip array to assess genomic association with colorectal cancer (CRC) in the Japanese sub-population of the Multiethnic Cohort (676 cases, 7180 controls). We started with pathway analysis of SNPs that are in genes and pathways having known mechanistic roles in colorectal cancer, then focused on genes within the pathways that evidenced association with CRC, and finally assessed individual SNPs within the genes that evidenced association. Pathway SNPs downloaded from the dbSNP database were cross-matched with Metabochip SNPs and analyzed using the logistic kernel machine regression approach (logistic SNP-set kernel-machine association test, or sequence kernel association test; SKAT) and related methods. RESULTS The TGF-β and WNT pathways were associated with all CRC, and the WNT pathway was associated with colon cancer. Individual genes demonstrating the strongest associations were TGFBR2 in the TGF-β pathway and SMAD7 (which is involved in both the TGF-β and WNT pathways). As partial validation of our approach, a known CRC risk variant in SMAD7 (in both the TGF-β and WNT pathways: rs11874392) was associated with CRC risk in our data. We also detected two novel candidate CRC risk variants (rs13075948 and rs17025857) in TGFBR2, a gene known to be associated with CRC risk. 
CONCLUSIONS A stepwise, telescoping approach identified some potentially novel risk variants associated with colorectal cancer, so it may be a useful method for following up on results of set-based SNP analyses. Further work is required to assess the statistical characteristics of the approach, and additional applications should aid in better clarifying its utility.
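The stepwise, telescoping idea in the abstract above (pathway, then gene, then individual SNP) can be sketched roughly as follows. This is a minimal illustration and not the authors' pipeline: the study used SKAT on genotype data for the set-level tests, whereas this sketch substitutes a Bonferroni-adjusted minimum p-value computed from per-SNP p-values. The function names, the `alpha` threshold, and all pathway, gene, and SNP labels and p-values are hypothetical.

```python
def bonferroni_min_p(p_values):
    """Stand-in set-level test: Bonferroni-adjusted minimum p-value.

    Assumption: this replaces SKAT purely for illustration.
    """
    return min(1.0, min(p_values) * len(p_values))


def telescope(pathways, set_test=bonferroni_min_p, alpha=0.05):
    """Narrow from pathways to genes to SNPs.

    pathways: {pathway: {gene: {snp: p_value}}}
    Returns a list of (pathway, gene, snp, p_value) tuples that pass
    the pathway-level, gene-level, and SNP-level thresholds in turn.
    """
    findings = []
    for pathway, genes in pathways.items():
        # Stage 1: test all SNPs in the pathway as one set.
        pathway_p = [p for snps in genes.values() for p in snps.values()]
        if set_test(pathway_p) > alpha:
            continue
        for gene, snps in genes.items():
            # Stage 2: test each gene within an associated pathway.
            if set_test(list(snps.values())) > alpha:
                continue
            # Stage 3: inspect individual SNPs within an associated gene.
            for snp, p in snps.items():
                if p <= alpha:
                    findings.append((pathway, gene, snp, p))
    return findings


# Hypothetical toy input; none of these values come from the study.
toy = {
    "PATHWAY_1": {
        "GENE_A": {"snp_1": 0.0005, "snp_2": 0.40},
        "GENE_B": {"snp_3": 0.30, "snp_4": 0.75},
    },
    "PATHWAY_2": {
        "GENE_C": {"snp_5": 0.20, "snp_6": 0.60},
    },
}
findings = telescope(toy)
```

Passing the set-level test in as a parameter keeps the narrowing logic separate from the choice of statistic, so a real kernel-machine test could be swapped in. The sketch applies no multiple-testing correction at the SNP stage, which a real analysis would need to address.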
Affiliation(s)
- John Cologne: Department of Statistics, Radiation Effects Research Foundation, Hiroshima, 732-0815, Japan
- Lenora Loo: Epidemiology Program, University of Hawaii Cancer Center, Honolulu, HI, 96813, USA
- Yurii B Shvetsov: Epidemiology Program, University of Hawaii Cancer Center, Honolulu, HI, 96813, USA
- Munechika Misumi: Department of Statistics, Radiation Effects Research Foundation, Hiroshima, 732-0815, Japan
- Philip Lin: Epidemiology Program, University of Hawaii Cancer Center, Honolulu, HI, 96813, USA
- Christopher A Haiman: Department of Preventive Medicine and Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, CA, 90033, USA
- Lynne R Wilkens: Biostatistics and Informatics Shared Resource, University of Hawaii Cancer Center, Honolulu, HI, 96813, USA
- Loïc Le Marchand: Epidemiology Program, University of Hawaii Cancer Center, Honolulu, HI, 96813, USA
13
Karpinski P, Patai AV, Hap W, Kielan W, Laczmanska I, Sasiadek MM. Multilevel omic data clustering reveals variable contribution of methylator phenotype to integrative cancer subtypes. Epigenomics 2018; 10:1289-1299. [PMID: 29896967 DOI: 10.2217/epi-2018-0057] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 12/13/2022]
Abstract
AIM We aimed to assess to what extent CpG island methylator phenotype (CIMP) contributes to cancer subtypes obtained by multilevel omic data analysis. MATERIALS & METHODS Sixteen datasets from The Cancer Genome Atlas, encompassing three data layers in 4688 tumor samples, were analyzed. We identified cancer integrative subtypes (ISs) using similarity network fusion and consensus clustering. CIMP-high (CIMP-H)-associated ISs were profiled by enrichment analysis of gene sets and transcriptional regulators. RESULTS & CONCLUSION In nine of the 16 cancer datasets, CIMP-H clusters significantly overlapped with unique ISs. The contribution of CIMP-H to integrative molecular profiling is variable; only in a subset of cancer types does CIMP-H contribute to a homogeneous integrative subtype. CIMP-H-associated ISs are heterogeneous groups with regard to deregulated pathways and transcriptional regulators.
Affiliation(s)
- Pawel Karpinski: Department of Genetics, Wroclaw Medical University, Wroclaw, Poland
- Arpad V Patai: 2nd Department of Internal Medicine, Semmelweis University, Budapest, Hungary
- Wojciech Hap: 2nd Department of General & Oncological Surgery, Wroclaw Medical University, Wroclaw, Poland
- Wojciech Kielan: 2nd Department of General & Oncological Surgery, Wroclaw Medical University, Wroclaw, Poland
14
Hill SY, Rompala G, Homanics GE, Zezza N. Cross-generational effects of alcohol dependence in humans on HRAS and TP53 methylation in offspring. Epigenomics 2017; 9:1189-1203. [PMID: 28799801 DOI: 10.2217/epi-2017-0052] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Indexed: 12/17/2022]
Abstract
AIM We hypothesized that cross-generational effects of alcohol exposure could alter DNA methylation and expression of the HRAS oncogene and TP53 tumor suppressor gene that drive cancer development. METHODS DNA methylation of the HRAS and TP53 genes was tested in samples from young participants (mean age of 13.4 years). RESULTS Controlling for both personal use and maternal use of substances during pregnancy, familial alcohol dependence was associated with hypomethylation of CpG sites in the HRAS promoter region and hypermethylation of the TP53 gene. CONCLUSION The results suggest that ancestral exposure to alcohol can have enduring effects on epigenetic processes such as DNA methylation, which controls expression of genes that drive cancer development, such as HRAS and TP53.
Affiliation(s)
- Shirley Y Hill: Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, PA 15213, USA
- Gregory Rompala: Center for Neuroscience, University of Pittsburgh School of Medicine, Pittsburgh, PA 15213, USA
- Gregg E Homanics: Departments of Anesthesiology & Pharmacology & Chemical Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15213, USA
- Nicholas Zezza: Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, PA 15213, USA
15
Lai YP, Wang LB, Wang WA, Lai LC, Tsai MH, Lu TP, Chuang EY. iGC-an integrated analysis package of gene expression and copy number alteration. BMC Bioinformatics 2017; 18:35. [PMID: 28088185 PMCID: PMC5237550 DOI: 10.1186/s12859-016-1438-2] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 11/18/2016] [Accepted: 12/17/2016] [Indexed: 12/15/2022]
Abstract
BACKGROUND With the advancement in high-throughput technologies, researchers can simultaneously investigate gene expression and copy number alteration (CNA) data from individual patients at a lower cost. Traditional analysis methods analyze each type of data individually and integrate their results using Venn diagrams. Challenges arise, however, when the results are irreproducible and inconsistent across multiple platforms. To address these issues, one possible approach is to concurrently analyze both gene expression profiling and CNAs in the same individual. RESULTS We have developed an open-source R/Bioconductor package (iGC). Multiple input formats are supported and users can define their own criteria for identifying differentially expressed genes driven by CNAs. The analysis of two real microarray datasets demonstrated that the CNA-driven genes identified by the iGC package showed significantly higher Pearson correlation coefficients with their gene expression levels and copy numbers than those genes located in a genomic region with CNA. Compared with the Venn diagram approach, the iGC package showed better performance. CONCLUSION The iGC package is effective and useful for identifying CNA-driven genes. By simultaneously considering both comparative genomic and transcriptomic data, it can provide better understanding of biological and medical questions. The iGC package's source code and manual are freely available at https://www.bioconductor.org/packages/release/bioc/html/iGC.html.
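The abstract above evaluates CNA-driven genes by the Pearson correlation between copy number and expression. A minimal sketch of that kind of per-gene check is shown here in plain Python. It is not the iGC package's algorithm or API; the function names, the `min_r` cutoff, and the toy gene labels and values are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # Correlation is undefined for constant input; treat as no association.
        return 0.0
    return cov / sqrt(var_x * var_y)


def concordant_genes(copy_number, expression, min_r=0.6):
    """Return {gene: r} for genes whose expression tracks copy number.

    copy_number and expression map gene -> per-sample values, with samples
    in the same order. min_r is an assumed, illustrative cutoff.
    """
    hits = {}
    for gene, cn_values in copy_number.items():
        if gene not in expression:
            continue
        r = pearson(cn_values, expression[gene])
        if r >= min_r:
            hits[gene] = r
    return hits


# Hypothetical toy data: five samples, two genes.
copy_number = {"GENE_A": [2, 2, 3, 4, 5], "GENE_B": [2, 3, 2, 4, 2]}
expression = {
    "GENE_A": [1.0, 1.1, 1.6, 2.1, 2.4],
    "GENE_B": [3.0, 1.0, 2.9, 1.2, 3.1],
}
hits = concordant_genes(copy_number, expression)
```

In this toy input only `GENE_A` shows expression rising with copy number. A high correlation of this kind is an association across samples and does not by itself show that amplification causes the expression change.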
Affiliation(s)
- Yi-Pin Lai: Bioinformatics and Biostatistics Core, Center of Genomic Medicine, National Taiwan University, Taipei, Taiwan
- Liang-Bo Wang: Bioinformatics and Biostatistics Core, Center of Genomic Medicine, National Taiwan University, Taipei, Taiwan; Graduate Institute of Biomedical Electronics and Bioinformatics, Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan
- Wei-An Wang: Bioinformatics and Biostatistics Core, Center of Genomic Medicine, National Taiwan University, Taipei, Taiwan
- Liang-Chuan Lai: Bioinformatics and Biostatistics Core, Center of Genomic Medicine, National Taiwan University, Taipei, Taiwan; Graduate Institute of Physiology, National Taiwan University, Taipei, Taiwan
- Mong-Hsun Tsai: Bioinformatics and Biostatistics Core, Center of Genomic Medicine, National Taiwan University, Taipei, Taiwan; Institute of Biotechnology, National Taiwan University, Taipei, Taiwan
- Tzu-Pin Lu: Department of Public Health, Institute of Epidemiology and Preventive Medicine, National Taiwan University, Taipei, Taiwan
- Eric Y Chuang: Bioinformatics and Biostatistics Core, Center of Genomic Medicine, National Taiwan University, Taipei, Taiwan; Graduate Institute of Biomedical Electronics and Bioinformatics, Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan
16
Jia Y, Chen L, Jia Q, Dou X, Xu N, Liao DJ. The well-accepted notion that gene amplification contributes to increased expression still remains, after all these years, a reasonable but unproven assumption. J Carcinog 2016; 15:3. [PMID: 27298590 PMCID: PMC4895059 DOI: 10.4103/1477-3163.182809] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 08/28/2015] [Accepted: 04/25/2016] [Indexed: 02/06/2023]
Abstract
“Gene amplification causes overexpression” is a longstanding and well-accepted concept in cancer genetics. However, on surveying the literature, we find only statistical analyses showing a positive correlation between gene copy number and expression level, and no convincing experimental corroboration of this notion for most of the amplified oncogenes in cancers. Since an association need not be a causal relation, in our opinion this widespread notion remains a reasonable but unproven assumption awaiting experimental verification.
Affiliation(s)
- Yuping Jia: Animal Facilities, Shandong Academy of Pharmaceutical Sciences, Ji'nan, Shandong 250101, China
- Lichan Chen: Hormel Institute, University of Minnesota, Austin, MN 55912, USA
- Qingwen Jia: Animal Facilities, Shandong Academy of Pharmaceutical Sciences, Ji'nan, Shandong 250101, China
- Xixi Dou: Animal Facilities, Shandong Academy of Pharmaceutical Sciences, Ji'nan, Shandong 250101, China
- Ningzhi Xu: Laboratory of Cell and Molecular Biology, Cancer Institute, Chinese Academy of Medical Science, Beijing 100021, China
- Dezhong Joshua Liao: Department of Pathology, Guizhou Medical University Hospital, Guizhou, Guiyang 550004, P.R. China