1
Laber S, Strobel S, Mercader JM, Dashti H, dos Santos FR, Kubitz P, Jackson M, Ainbinder A, Honecker J, Agrawal S, Garborcauskas G, Stirling DR, Leong A, Figueroa K, Sinnott-Armstrong N, Kost-Alimova M, Deodato G, Harney A, Way GP, Saadat A, Harken S, Reibe-Pal S, Ebert H, Zhang Y, Calabuig-Navarro V, McGonagle E, Stefek A, Dupuis J, Cimini BA, Hauner H, Udler MS, Carpenter AE, Florez JC, Lindgren C, Jacobs SB, Claussnitzer M. Discovering cellular programs of intrinsic and extrinsic drivers of metabolic traits using LipocyteProfiler. Cell Genomics 2023; 3:100346. [PMID: 37492099 PMCID: PMC10363917 DOI: 10.1016/j.xgen.2023.100346] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 10/21/2021] [Revised: 08/22/2022] [Accepted: 05/26/2023] [Indexed: 07/27/2023]
Abstract
A primary obstacle in translating genetic associations with disease into therapeutic strategies is elucidating the cellular programs affected by genetic risk variants and effector genes. Here, we introduce LipocyteProfiler, a cardiometabolic-disease-oriented high-content image-based profiling tool that enables evaluation of thousands of morphological and cellular profiles that can be systematically linked to genes and genetic variants relevant to cardiometabolic disease. We show that LipocyteProfiler allows surveillance of diverse cellular programs by generating rich context- and process-specific cellular profiles across hepatocyte and adipocyte cell-state transitions. We use LipocyteProfiler to identify known and novel cellular mechanisms altered by polygenic risk of metabolic disease, including insulin resistance, fat distribution, and the polygenic contribution to lipodystrophy. LipocyteProfiler paves the way for large-scale forward and reverse deep phenotypic profiling in lipocytes and provides a framework for the unbiased identification of causal relationships between genetic variants and cellular programs relevant to human disease.
Affiliation(s)
- Samantha Laber
- Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford OX3 7FZ, UK
- Wellcome Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK
- Sophie Strobel
- Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Institute of Nutritional Medicine, School of Medicine, Technical University of Munich, 85354 Freising-Weihenstephan, Germany
- Josep M. Mercader
- Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Diabetes Unit and Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Department of Medicine, Harvard Medical School, Boston, MA 02114, USA
- Hesam Dashti
- Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Department of Medicine, Harvard Medical School, Boston, MA 02114, USA
- The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Felipe R.C. dos Santos
- Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Phil Kubitz
- Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Else Kröner-Fresenius-Centre for Nutritional Medicine, School of Life Sciences, Technical University of Munich, 85354 Freising-Weihenstephan, Germany
- The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Maya Jackson
- Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Alina Ainbinder
- Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Julius Honecker
- Else Kröner-Fresenius-Centre for Nutritional Medicine, School of Life Sciences, Technical University of Munich, 85354 Freising-Weihenstephan, Germany
- Saaket Agrawal
- Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Garrett Garborcauskas
- Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- David R. Stirling
- Imaging Platform, Center for the Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Aaron Leong
- Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Diabetes Unit and Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Department of Medicine, Harvard Medical School, Boston, MA 02114, USA
- Katherine Figueroa
- Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Diabetes Unit and Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Nasa Sinnott-Armstrong
- Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Department of Genetics, Stanford University, San Francisco, CA, USA
- Maria Kost-Alimova
- Imaging Platform, Center for the Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Giacomo Deodato
- Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Alycen Harney
- Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Gregory P. Way
- Imaging Platform, Center for the Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Alham Saadat
- Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Sierra Harken
- Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Saskia Reibe-Pal
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford OX3 7FZ, UK
- Hannah Ebert
- Institute of Nutritional Science, University Hohenheim, 70599 Stuttgart, Germany
- Yixin Zhang
- Department of Biostatistics, Boston University School of Public Health, Boston, MA 02118, USA
- Virtu Calabuig-Navarro
- Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Institute of Nutritional Science, University Hohenheim, 70599 Stuttgart, Germany
- Elizabeth McGonagle
- Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Adam Stefek
- Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Josée Dupuis
- Department of Biostatistics, Boston University School of Public Health, Boston, MA 02118, USA
- Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, QC H3A 1G1, Canada
- Beth A. Cimini
- Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Hans Hauner
- Institute of Nutritional Medicine, School of Medicine, Technical University of Munich, 85354 Freising-Weihenstephan, Germany
- Else Kröner-Fresenius-Centre for Nutritional Medicine, School of Life Sciences, Technical University of Munich, 85354 Freising-Weihenstephan, Germany
- German Center for Diabetes Research (DZD), 85764 Neuherberg, Germany
- Miriam S. Udler
- Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Diabetes Unit and Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Department of Medicine, Harvard Medical School, Boston, MA 02114, USA
- Anne E. Carpenter
- Imaging Platform, Center for the Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Jose C. Florez
- Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Diabetes Unit and Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Department of Medicine, Harvard Medical School, Boston, MA 02114, USA
- Cecilia Lindgren
- Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford OX3 7FZ, UK
- Wellcome Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK
- Suzanne B.R. Jacobs
- Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Diabetes Unit and Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Melina Claussnitzer
- Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA
- Diabetes Unit and Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
- Department of Medicine, Harvard Medical School, Boston, MA 02114, USA
- The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
2
Jeong JC, Hands I, Kolesar JM, Rao M, Davis B, Dobyns Y, Hurt-Mueller J, Levens J, Gregory J, Williams J, Witt L, Kim EM, Burton C, Elbiheary AA, Chang M, Durbin EB. Local data commons: the sleeping beauty in the community of data commons. BMC Bioinformatics 2022; 23:386. [PMID: 36151511 PMCID: PMC9502580 DOI: 10.1186/s12859-022-04922-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/04/2022] [Accepted: 09/12/2022] [Indexed: 12/03/2022]
Abstract
Background: Public data commons (PDCs) have been highlighted in the scientific literature for their capacity to collect and harmonize big data. In contrast, local data commons (LDCs), located within an institution or organization, have been underrepresented in the scientific literature, even though they are a critical part of research infrastructure. Being closest to the sources of data, LDCs can collect and maintain the most up-to-date, high-quality data within an organization. As data providers, LDCs face many challenges in both collecting and standardizing data; moreover, as consumers of PDCs, they face problems of data harmonization stemming from the monolithic harmonization pipeline designs commonly adopted by many PDCs. Unfortunately, existing guidelines and resources for building and maintaining data commons focus exclusively on PDCs and provide very little information on LDCs.
Results: This article focuses on four important observations. First, there are three different types of LDC service models, defined by their roles and requirements. These can be used as guidelines for building new LDCs or enhancing the services of existing LDCs. Second, the seven core services of LDCs are discussed, including cohort identification and facilitation of genomic sequencing, the management of molecular reports and associated infrastructure, quality control, data harmonization, data integration, data sharing, and data access control. Third, instead of commonly developed monolithic systems, we propose a new data sharing method for data harmonization that combines both divide-and-conquer and bottom-up approaches. Finally, an end-to-end LDC implementation is introduced with real-world examples.
Conclusions: Although LDCs are an optimal place to identify and address data quality issues, they have traditionally been relegated to the role of passive data provider for much larger PDCs. Indeed, many LDCs limit their functions to routine data storage and transmission tasks because they lack information on how to design, develop, and improve their services using limited resources. We hope that this work will be a first small step in raising awareness among LDCs of their expanded utility and in publicizing the importance of LDCs to a wider audience.
Affiliation(s)
- Jong Cheol Jeong
- Division of Biomedical Informatics, College of Medicine, University of Kentucky, Lexington, KY, USA
- Cancer Research Informatics Shared Resource Facility, Markey Cancer Center, Lexington, KY, USA
- Isaac Hands
- Cancer Research Informatics Shared Resource Facility, Markey Cancer Center, Lexington, KY, USA
- Kentucky Cancer Registry, Lexington, KY, USA
- Jill M Kolesar
- Department of Pharmacy Practice and Science, College of Pharmacy, University of Kentucky, Lexington, KY, USA
- Mahadev Rao
- Department of Pharmacy Practice, Center for Translational Research, Manipal College of Pharmaceutical Sciences, Manipal Academy of Higher Education, Manipal, Karnataka, India
- Bront Davis
- Cancer Research Informatics Shared Resource Facility, Markey Cancer Center, Lexington, KY, USA
- Kentucky Cancer Registry, Lexington, KY, USA
- York Dobyns
- Cancer Research Informatics Shared Resource Facility, Markey Cancer Center, Lexington, KY, USA
- Kentucky Cancer Registry, Lexington, KY, USA
- Joseph Hurt-Mueller
- Cancer Research Informatics Shared Resource Facility, Markey Cancer Center, Lexington, KY, USA
- Kentucky Cancer Registry, Lexington, KY, USA
- Justin Levens
- Cancer Research Informatics Shared Resource Facility, Markey Cancer Center, Lexington, KY, USA
- Kentucky Cancer Registry, Lexington, KY, USA
- Jenny Gregory
- Cancer Research Informatics Shared Resource Facility, Markey Cancer Center, Lexington, KY, USA
- Kentucky Cancer Registry, Lexington, KY, USA
- John Williams
- Cancer Research Informatics Shared Resource Facility, Markey Cancer Center, Lexington, KY, USA
- Kentucky Cancer Registry, Lexington, KY, USA
- Lisa Witt
- Cancer Research Informatics Shared Resource Facility, Markey Cancer Center, Lexington, KY, USA
- Kentucky Cancer Registry, Lexington, KY, USA
- Eun Mi Kim
- Department of Computer Science, Eastern Kentucky University, Richmond, KY, USA
- Carlee Burton
- Cancer Research Informatics Shared Resource Facility, Markey Cancer Center, Lexington, KY, USA
- Amir A Elbiheary
- Cancer Research Informatics Shared Resource Facility, Markey Cancer Center, Lexington, KY, USA
- Mingguang Chang
- Cancer Research Informatics Shared Resource Facility, Markey Cancer Center, Lexington, KY, USA
- Eric B Durbin
- Division of Biomedical Informatics, College of Medicine, University of Kentucky, Lexington, KY, USA
- Cancer Research Informatics Shared Resource Facility, Markey Cancer Center, Lexington, KY, USA
- Kentucky Cancer Registry, Lexington, KY, USA
3
Wei S, Tao J, Xu J, Chen X, Wang Z, Zhang N, Zuo L, Jia Z, Chen H, Sun H, Yan Y, Zhang M, Lv H, Kong F, Duan L, Ma Y, Liao M, Xu L, Feng R, Liu G, Project TEWAS, Jiang Y. Ten Years of EWAS. Advanced Science (Weinheim, Baden-Wurttemberg, Germany) 2021; 8:e2100727. [PMID: 34382344 PMCID: PMC8529436 DOI: 10.1002/advs.202100727] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 02/22/2021] [Revised: 05/11/2021] [Indexed: 06/13/2023]
Abstract
Epigenome-wide association studies (EWAS) have been applied to analyze DNA methylation variation in complex diseases for a decade, and the epigenome has gradually become a prominent research target. DNA methylation microarrays, together with next-generation and third-generation sequencing technologies, have provided a high-quality platform for EWAS. Here, we review the progress of EWAS research and its contributions to clinical applications, focusing mainly on the achievements in four typical diseases. Finally, we present the challenges encountered by EWAS and make bold predictions for its future development.
Affiliation(s)
- Siyu Wei
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- The EWAS Project, Harbin, China
- Junxian Tao
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- The EWAS Project, Harbin, China
- Jing Xu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- The EWAS Project, Harbin, China
- Xingyu Chen
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Zhaoyang Wang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Nan Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Lijiao Zuo
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Zhe Jia
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Haiyan Chen
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Hongmei Sun
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Yubo Yan
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Mingming Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Hongchao Lv
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Fanwu Kong
- The EWAS Project, Harbin, China
- Department of Nephrology, The Second Affiliated Hospital, Harbin Medical University, Harbin 150001, China
- Lian Duan
- The EWAS Project, Harbin, China
- The First Affiliated Hospital of Wenzhou Medical University, Wenzhou 325000, China
- Ye Ma
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- The EWAS Project, Harbin, China
- Mingzhi Liao
- The EWAS Project, Harbin, China
- College of Life Sciences, Northwest A&F University, Yangling, Shanxi 712100, China
- Liangde Xu
- The EWAS Project, Harbin, China
- School of Biomedical Engineering, Wenzhou Medical University, Wenzhou 325035, China
- Rennan Feng
- The EWAS Project, Harbin, China
- Department of Nutrition and Food Hygiene, Public Health College, Harbin Medical University, Harbin 150081, China
- Guiyou Liu
- The EWAS Project, Harbin, China
- Beijing Institute for Brain Disorders, Capital Medical University, Beijing 100069, China
- Yongshuai Jiang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- The EWAS Project, Harbin, China
4
Machine learning workflows to estimate class probabilities for precision cancer diagnostics on DNA methylation microarray data. Nat Protoc 2020; 15:479-512. [PMID: 31932775 DOI: 10.1038/s41596-019-0251-6] [Citation(s) in RCA: 65] [Impact Index Per Article: 16.3] [Received: 02/05/2019] [Accepted: 10/04/2019] [Indexed: 01/01/2023]
Abstract
DNA methylation data-based precision cancer diagnostics is emerging as the state of the art for molecular tumor classification. Standards for choosing statistical methods with regard to well-calibrated probability estimates for these typically highly multiclass classification tasks are still lacking. To support this choice, we evaluated well-established machine learning (ML) classifiers including random forests (RFs), elastic net (ELNET), support vector machines (SVMs) and boosted trees in combination with post-processing algorithms and developed ML workflows that allow for unbiased class probability (CP) estimation. Calibrators included ridge-penalized multinomial logistic regression (MR) and Platt scaling by fitting logistic regression (LR) and Firth's penalized LR. We compared these workflows on a recently published brain tumor 450k DNA methylation cohort of 2,801 samples with 91 diagnostic categories using a 5 × 5-fold nested cross-validation scheme and demonstrated their generalizability on external data from The Cancer Genome Atlas. ELNET was the top stand-alone classifier with the best calibration profiles. The best overall two-stage workflow was MR-calibrated SVM with linear kernels closely followed by ridge-calibrated tuned RF. For calibration, MR was the most effective regardless of the primary classifier. The protocols developed as a result of these comparisons provide valuable guidance on choosing ML workflows and their tuning to generate well-calibrated CP estimates for precision diagnostics using DNA methylation data. Computation times vary depending on the ML algorithm from <15 min to 5 d using multi-core desktop PCs. Detailed scripts in the open-source R language are freely available on GitHub, targeting users with intermediate experience in bioinformatics and statistics and using R with Bioconductor extensions.
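The two-stage idea in this abstract (a primary classifier followed by a calibrator) can be illustrated with a minimal sketch of Platt scaling. It is not the protocol's code, which the abstract says is written in R. The sketch assumes a binary problem, raw classifier scores from held-out samples, and plain gradient descent; the names `fit_platt`, `heldout_scores`, and `heldout_labels` and the toy data are hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def log_loss(a, b, scores, labels):
    """Mean negative log-likelihood of p = sigmoid(a * score + b)."""
    eps = 1e-12
    total = 0.0
    for s, y in zip(scores, labels):
        p = min(max(sigmoid(a * s + b), eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(scores)


def fit_platt(scores, labels, lr=0.1, epochs=5000):
    """Fit the two Platt parameters (slope a, intercept b) by gradient descent."""
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y  # derivative of the log loss w.r.t. the logit
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


# Hypothetical held-out scores from a primary classifier (e.g. SVM margins)
# and the matching true labels; the classes overlap, so the fit stays finite.
heldout_scores = [-2.0, -1.5, -1.0, -0.5, -0.2, 0.3, 0.5, 1.0, 1.5, 2.0]
heldout_labels = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

a, b = fit_platt(heldout_scores, heldout_labels)
calibrated = [sigmoid(a * s + b) for s in heldout_scores]
print(f"slope={a:.3f} intercept={b:.3f}")
```

Fitting the calibrator on scores the primary classifier did not train on is what a nested cross-validation scheme provides. A multiclass setting such as the 91 categories mentioned above would need a one-vs-rest or multinomial extension, which this sketch does not cover.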
5
Schmidt F, List M, Cukuroglu E, Köhler S, Göke J, Schulz MH. An ontology-based method for assessing batch effect adjustment approaches in heterogeneous datasets. Bioinformatics 2019; 34:i908-i916. [PMID: 30423059 PMCID: PMC6129283 DOI: 10.1093/bioinformatics/bty553] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 12/04/2022]
Abstract
Motivation: International consortia such as the Genotype-Tissue Expression (GTEx) project, The Cancer Genome Atlas (TCGA) or the International Human Epigenetics Consortium (IHEC) have produced a wealth of genomic datasets with the goal of advancing our understanding of cell differentiation and disease mechanisms. However, utilizing all of these data effectively through integrative analysis is hampered by batch effects, large cell type heterogeneity and low replicate numbers. To study if batch effects across datasets can be observed and adjusted for, we analyze RNA-seq data of 215 samples from ENCODE, Roadmap, BLUEPRINT and DEEP as well as 1336 samples from GTEx and TCGA. While batch effects are a considerable issue, it is non-trivial to determine if batch adjustment leads to an improvement in data quality, especially in cases of low replicate numbers.
Results: We present a novel method for assessing the performance of batch effect adjustment methods on heterogeneous data. Our method borrows information from the Cell Ontology to establish if batch adjustment leads to a better agreement between observed pairwise similarity and similarity of cell types inferred from the ontology. A comparison of state-of-the-art batch effect adjustment methods suggests that batch effects in heterogeneous datasets with low replicate numbers cannot be adequately adjusted. Better methods need to be developed, which can be assessed objectively in the framework presented here.
Availability and implementation: Our method is available online at https://github.com/SchulzLab/OntologyEval.
Supplementary information: Supplementary data are available at Bioinformatics online.
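The evaluation principle described in this abstract, comparing observed pairwise sample similarity with the similarity of cell types implied by an ontology, can be sketched on toy data. Everything below is an illustrative assumption and not the OntologyEval implementation: the expression signatures, the hand-written `ontology_similarity` values, the use of Pearson correlation as the agreement score, and per-batch mean centering as the stand-in adjustment method.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def ontology_similarity(type_a, type_b):
    """Toy stand-in for a similarity derived from the Cell Ontology."""
    if type_a == type_b:
        return 1.0
    if {type_a, type_b} == {"T cell", "B cell"}:
        return 0.8  # both lymphocytes, so close in the hierarchy
    return 0.1


def agreement(samples):
    """Correlate observed pairwise similarity with ontology similarity."""
    observed, expected = [], []
    for s, t in combinations(samples, 2):
        observed.append(pearson(s["expr"], t["expr"]))
        expected.append(ontology_similarity(s["type"], t["type"]))
    return pearson(observed, expected)


def center_per_batch(samples):
    """Hypothetical adjustment: subtract each batch's per-gene mean."""
    adjusted = []
    for batch in sorted({s["batch"] for s in samples}):
        members = [s for s in samples if s["batch"] == batch]
        n_genes = len(members[0]["expr"])
        means = [sum(m["expr"][g] for m in members) / len(members)
                 for g in range(n_genes)]
        for m in members:
            centered = [v - mu for v, mu in zip(m["expr"], means)]
            adjusted.append({**m, "expr": centered})
    return adjusted


# Toy data: three cell types measured once in each of two batches,
# with an additive per-gene shift applied to batch "B".
signatures = {"T cell": [5, 4, 1, 0], "B cell": [4, 5, 1, 0], "hepatocyte": [0, 1, 5, 5]}
batch_shift = {"A": [0, 0, 0, 0], "B": [3, -3, 3, -3]}
samples = [
    {"type": t, "batch": b, "expr": [v + d for v, d in zip(sig, batch_shift[b])]}
    for b in ("A", "B")
    for t, sig in signatures.items()
]

before = agreement(samples)
after = agreement(center_per_batch(samples))
print(f"agreement before: {before:.2f}, after: {after:.2f}")
```

On this toy data the agreement score rises after centering, but only because the design is perfectly balanced (every cell type appears in every batch). Real heterogeneous datasets with few replicates lack that balance, which is the situation in which the abstract reports that adjustment falls short.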
Affiliation(s)
- Florian Schmidt
- Max Planck Institute for Informatics, Saarland Informatics Campus, Saarbrücken, Germany
- Cluster of Excellence MMCI, Saarland University, Saarland Informatics Campus, Saarbrücken, Germany
- Graduate School of Computer Science, Saarland Informatics Campus, Saarbrücken, Germany
- Genome Institute of Singapore, Computational Genomics and Transcriptomics, Singapore
- Markus List
- Max Planck Institute for Informatics, Saarland Informatics Campus, Saarbrücken, Germany
- Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Engin Cukuroglu
- Genome Institute of Singapore, Computational Genomics and Transcriptomics, Singapore
- Jonathan Göke
- Genome Institute of Singapore, Computational Genomics and Transcriptomics, Singapore
- Marcel H Schulz
- Max Planck Institute for Informatics, Saarland Informatics Campus, Saarbrücken, Germany
- Cluster of Excellence MMCI, Saarland University, Saarland Informatics Campus, Saarbrücken, Germany
- Institute for Cardiovascular Regeneration, Goethe University, Frankfurt am Main, Germany
- German Center for Cardiovascular Research, Partner Site Rhein-Main, Frankfurt am Main, Germany
6
Han LKM, Verhoeven JE, Tyrka AR, Penninx BWJH, Wolkowitz OM, Månsson KNT, Lindqvist D, Boks MP, Révész D, Mellon SH, Picard M. Accelerating research on biological aging and mental health: Current challenges and future directions. Psychoneuroendocrinology 2019; 106:293-311. [PMID: 31154264 PMCID: PMC6589133 DOI: 10.1016/j.psyneuen.2019.04.004] [Citation(s) in RCA: 47] [Impact Index Per Article: 9.4] [Received: 09/06/2018] [Revised: 01/22/2019] [Accepted: 04/02/2019] [Indexed: 12/13/2022]
Abstract
Aging is associated with complex biological changes that can be accelerated, slowed, or even temporarily reversed by biological and non-biological factors. This article focuses on the link between biological aging, psychological stressors, and mental illness. Rather than comprehensively reviewing this rapidly expanding field, we highlight challenges in this area of research and propose potential strategies to accelerate progress in this field. This effort requires the interaction of scientists across disciplines - including biology, psychiatry, psychology, and epidemiology; and across levels of analysis that emphasize different outcome measures - functional capacity, physiological, cellular, and molecular. Dialogues across disciplines and levels of analysis naturally lead to new opportunities for discovery but also to stimulating challenges. Some important challenges consist of 1) establishing the best objective and predictive biological age indicators or combinations of indicators, 2) identifying the basis for inter-individual differences in the rate of biological aging, and 3) examining to what extent interventions can delay, halt or temporarily reverse aging trajectories. Discovering how psychological states influence biological aging, and vice versa, has the potential to create novel and exciting opportunities for healthcare and possibly yield insights into the fundamental mechanisms that drive human aging.
Affiliation(s)
- Laura K M Han
- Amsterdam UMC, Vrije Universiteit Amsterdam, Department of Psychiatry, Amsterdam Public Health Research Institute, Oldenaller 1, the Netherlands; Amsterdam UMC, Vrije Universiteit Amsterdam, Department of Psychiatry, Amsterdam Neuroscience, De Boelelaan 1117, Amsterdam, the Netherlands
- Josine E Verhoeven
- Amsterdam UMC, Vrije Universiteit Amsterdam, Department of Psychiatry, Amsterdam Public Health Research Institute, Oldenaller 1, the Netherlands
- Audrey R Tyrka
- Butler Hospital and the Department of Psychiatry and Human Behavior, Warren Alpert Medical School of Brown University, Providence, RI, USA
- Brenda W J H Penninx
- Amsterdam UMC, Vrije Universiteit Amsterdam, Department of Psychiatry, Amsterdam Public Health Research Institute, Oldenaller 1, the Netherlands; Amsterdam UMC, Vrije Universiteit Amsterdam, Department of Psychiatry, Amsterdam Neuroscience, De Boelelaan 1117, Amsterdam, the Netherlands
- Owen M Wolkowitz
- Department of Psychiatry and Weill Institute for Neurosciences, University of California, San Francisco, School of Medicine, San Francisco, CA, USA
- Kristoffer N T Månsson
- Department of Clinical Neuroscience, Karolinska Institutet, Stockholm, Sweden; Department of Psychology, Stockholm University, Stockholm, Sweden; Department of Psychology, Uppsala University, Uppsala, Sweden
- Daniel Lindqvist
- Faculty of Medicine, Department of Clinical Sciences, Psychiatry, Lund University, Lund, Sweden; Department of Psychiatry, University of California San Francisco (UCSF) School of Medicine, San Francisco, CA, USA; Psychiatric Clinic, Lund, Division of Psychiatry, Lund, Sweden
- Marco P Boks
- Brain Center Rudolf Magnus, Department of Psychiatry, University Medical Center Utrecht, the Netherlands
- Dóra Révész
- Center of Research on Psychology in Somatic diseases (CoRPS), Department of Medical and Clinical Psychology, Tilburg University, Tilburg, the Netherlands
- Synthia H Mellon
- Department of Psychiatry and Weill Institute for Neurosciences, University of California, San Francisco, School of Medicine, San Francisco, CA, USA
- Martin Picard
- Department of Psychiatry, Division of Behavioral Medicine, Columbia University Medical Center, New York, NY, USA; Department of Neurology, H. Houston Merritt Center, Columbia Translational Neuroscience Initiative, Columbia University Medical Center, New York, NY, USA; Columbia Aging Center, Columbia University, New York, NY, USA.
7
Reimann B, Janssen BG, Alfano R, Ghantous A, Espín-Pérez A, de Kok TM, Saenen ND, Cox B, Robinson O, Chadeau-Hyam M, Penders J, Herceg Z, Vineis P, Nawrot TS, Plusquin M. The Cord Blood Insulin and Mitochondrial DNA Content Related Methylome. Front Genet 2019; 10:325. [PMID: 31031804 PMCID: PMC6474284 DOI: 10.3389/fgene.2019.00325] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 11/30/2018] [Accepted: 03/25/2019] [Indexed: 12/16/2022]
Abstract
Mitochondrial dysfunction seems to play a key role in the etiology of insulin resistance. At birth, a link has already been established between mitochondrial DNA (mtDNA) content and insulin levels in cord blood. In this study, we explore shared epigenetic mechanisms of the association between mtDNA content and insulin levels, supporting the developmental origins of this link. First, the association between cord blood insulin and mtDNA content in 882 newborns of the ENVIRONAGE birth cohort was assessed. Cord blood mtDNA content was established via qPCR, while cord blood levels of insulin were determined using electrochemiluminescence immunoassays. Then the cord blood DNA methylome and transcriptome were determined in 179 newborns, using the human 450K methylation Illumina and Agilent Whole Human Genome 8 × 60 K microarrays, respectively. Subsequently, we performed an epigenome-wide association study (EWAS) adjusted for different maternal and neonatal variables. Afterward, we focused on the 20 strongest associations based on p-values to assign transcriptomic correlates and allocate corresponding pathways employing the R packages ReactomePA and RDAVIDWebService. On the regional level, we examined differential methylation using the DMRcate and Bumphunter packages in R. Cord blood mtDNA content and insulin were significantly correlated (r = 0.074, p = 0.028), still showing a trend after additional adjustment for maternal and neonatal variables (p = 0.062). We found an overlap of 33 pathways which were in common between the association with cord blood mtDNA content and insulin levels, including pathways of neurodevelopment, histone modification, cytochromes P450 (CYP)-metabolism, and biological aging. We further identified a DMR annotated to Repulsive Guidance Molecule BMP Co-Receptor A (RGMA) linked to cord blood insulin as well as mtDNA content. 
Metabolic variation in early life represented by neonatal insulin levels and mtDNA content might reflect or accommodate alterations in neurodevelopment, histone modification, CYP-metabolism, and aging, indicating etiological origins in epigenetic programming. Variation in metabolic hormones at birth, reflected by molecular changes, might via these alterations predispose children to metabolic diseases later in life. The results of this study may provide important markers for following targeted studies.
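The reported correlation can be sanity-checked with a short calculation. The sketch below converts a Pearson r into an approximate two-sided p-value using the usual t statistic and a normal approximation. It assumes that all 882 newborns contributed a measurement pair, which the abstract does not state explicitly; the function name is illustrative only.

```python
import math


def pearson_two_sided_p(r: float, n: int) -> float:
    """Approximate two-sided p-value for a Pearson correlation r from n pairs."""
    # Test statistic for H0: rho = 0; it follows a t distribution with n - 2 df.
    t = r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)
    # For large n the t distribution is close to the standard normal,
    # so the two-sided tail area is approximately erfc(|t| / sqrt(2)).
    return math.erfc(abs(t) / math.sqrt(2.0))


# Assumption: n = 882 pairs (the cohort size quoted in the abstract).
p_value = pearson_two_sided_p(r=0.074, n=882)
print(f"approximate p = {p_value:.3f}")
```

Under that assumption the result lands close to the reported p = 0.028, which illustrates how a weak correlation can still reach nominal significance in a cohort of this size.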
Affiliation(s)
- Brigitte Reimann
- Centre for Environmental Sciences, University of Hasselt, Hasselt, Belgium
- Bram G. Janssen
- Centre for Environmental Sciences, University of Hasselt, Hasselt, Belgium
- Rossella Alfano
- Centre for Environmental Sciences, University of Hasselt, Hasselt, Belgium
- Akram Ghantous
- Epigenetics Group, International Agency for Research on Cancer (IARC), Lyon, France
- Almudena Espín-Pérez
- Department of Biomedical Informatics Research, Stanford University, California, CA, United States
- Theo M. de Kok
- Department of Toxicogenomics, GROW School for Oncology and Developmental Biology, Maastricht University, Maastricht, Netherlands
- Nelly D. Saenen
- Centre for Environmental Sciences, University of Hasselt, Hasselt, Belgium
- Bianca Cox
- Centre for Environmental Sciences, University of Hasselt, Hasselt, Belgium
- Oliver Robinson
- Department of Epidemiology and Biostatistics, The School of Public Health, Imperial College London, London, United Kingdom
- Medical Research Council-Health Protection Agency Centre for Environment and Health, Imperial College London, London, United Kingdom
- Marc Chadeau-Hyam
- Department of Epidemiology and Biostatistics, The School of Public Health, Imperial College London, London, United Kingdom
- Medical Research Council-Health Protection Agency Centre for Environment and Health, Imperial College London, London, United Kingdom
- Institute for Risk Assessment Sciences (IRAS), Division of Environmental Epidemiology, Utrecht University, Utrecht, Netherlands
- Joris Penders
- Laboratory of Clinical Biology, East-Limburg Hospital, Genk, Belgium
- Zdenko Herceg
- Epigenetics Group, International Agency for Research on Cancer (IARC), Lyon, France
- Paolo Vineis
- Department of Epidemiology and Biostatistics, The School of Public Health, Imperial College London, London, United Kingdom
- Medical Research Council-Health Protection Agency Centre for Environment and Health, Imperial College London, London, United Kingdom
- Italian Institute for Genomic Medicine (IIGM), Turin, Italy
- Tim S. Nawrot
- Centre for Environmental Sciences, University of Hasselt, Hasselt, Belgium
- School of Public Health, Occupational and Environmental Medicine, KU Leuven, Leuven, Belgium
- Michelle Plusquin
- Centre for Environmental Sciences, University of Hasselt, Hasselt, Belgium
- Department of Epidemiology and Biostatistics, The School of Public Health, Imperial College London, London, United Kingdom
- Medical Research Council-Health Protection Agency Centre for Environment and Health, Imperial College London, London, United Kingdom
|
8
|
Min JL, Hemani G, Davey Smith G, Relton C, Suderman M. Meffil: efficient normalization and analysis of very large DNA methylation datasets. Bioinformatics 2018; 34:3983-3989. [PMID: 29931280 PMCID: PMC6247925 DOI: 10.1093/bioinformatics/bty476] [Citation(s) in RCA: 121] [Impact Index Per Article: 20.2] [Received: 01/09/2018] [Accepted: 06/18/2018] [Indexed: 12/11/2022]
Abstract
Motivation: DNA methylation datasets are growing ever larger both in sample size and genome coverage. Novel computational solutions are required to efficiently handle these data.
Results: We have developed meffil, an R package designed for efficient quality control, normalization and epigenome-wide association studies of large samples of Illumina Methylation BeadChip microarrays. A complete re-implementation of functional normalization minimizes computational memory without increasing running time. Incorporating fixed and random effects within functional normalization, and automated estimation of functional normalization parameters reduces technical variation in DNA methylation levels, thus reducing false positive rates and improving power. Support for normalization of datasets distributed across physically different locations without needing to share biologically-based individual-level data means that meffil can be used to reduce heterogeneity in meta-analyses of epigenome-wide association studies.
Availability and implementation: https://github.com/perishky/meffil/.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- J L Min
- MRC Integrative Epidemiology Unit, University of Bristol, Bristol, UK; Bristol Medical School, University of Bristol, Bristol, UK
- G Hemani
- MRC Integrative Epidemiology Unit, University of Bristol, Bristol, UK; Bristol Medical School, University of Bristol, Bristol, UK
- G Davey Smith
- MRC Integrative Epidemiology Unit, University of Bristol, Bristol, UK; Bristol Medical School, University of Bristol, Bristol, UK
- C Relton
- MRC Integrative Epidemiology Unit, University of Bristol, Bristol, UK; Bristol Medical School, University of Bristol, Bristol, UK
- M Suderman
- MRC Integrative Epidemiology Unit, University of Bristol, Bristol, UK; Bristol Medical School, University of Bristol, Bristol, UK
|
9
|
Jiao C, Zhang C, Dai R, Xia Y, Wang K, Giase G, Chen C, Liu C. Positional effects revealed in Illumina methylation array and the impact on analysis. Epigenomics 2018; 10:643-659. [PMID: 29469594 PMCID: PMC6021926 DOI: 10.2217/epi-2017-0105] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 08/27/2017] [Accepted: 01/17/2018] [Indexed: 12/18/2022]
Abstract
AIM: We aimed to prove the existence of positional effects in the Illumina methylation beadchip data and to find an optimal correction method.
MATERIALS & METHODS: Three HumanMethylation450, three HumanMethylation27 datasets and two EPIC datasets were analyzed. ComBat, linear regression, functional normalization and single-sample Noob were used for minimizing positional effects. The corrected results were evaluated by four methods.
RESULTS: We detected 52,988 CpG loci significantly associated with sample positions; 112 remained after ComBat correction in the primary dataset. The pre- and postcorrection comparisons indicate the positional effects could alter the measured methylation values and downstream analysis results.
CONCLUSION: Positional effects exist in the Illumina methylation array and may bias the analyses. Using ComBat to correct positional effects is recommended.
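As an illustration of what screening for positional effects can look like, the sketch below computes a one-way ANOVA F statistic per CpG across array positions. The abstract does not say which test the authors used, so the choice of ANOVA, the function name, and the synthetic data are all assumptions made for this example. The sketch covers detection only and does not implement ComBat.

```python
import numpy as np


def positional_f_statistic(beta, position):
    """One-way ANOVA F statistic per CpG across array positions.

    beta:     (n_samples, n_cpgs) methylation values
    position: (n_samples,) label of each sample's slot on the chip
    """
    beta = np.asarray(beta, dtype=float)
    position = np.asarray(position)
    labels = np.unique(position)
    n_samples, n_groups = beta.shape[0], labels.size
    grand_mean = beta.mean(axis=0)
    ss_between = np.zeros(beta.shape[1])
    ss_within = np.zeros(beta.shape[1])
    for label in labels:
        group = beta[position == label]
        group_mean = group.mean(axis=0)
        # Variation of the group mean around the overall mean.
        ss_between += group.shape[0] * (group_mean - grand_mean) ** 2
        # Variation of samples around their own group mean.
        ss_within += ((group - group_mean) ** 2).sum(axis=0)
    return (ss_between / (n_groups - 1)) / (ss_within / (n_samples - n_groups))


# Synthetic toy data (not from the study): 6 positions, 8 samples each, 3 CpGs.
rng = np.random.default_rng(0)
position = np.repeat(np.arange(6), 8)
beta = rng.normal(0.5, 0.05, size=(48, 3))
beta[position == 0, 0] += 0.3  # inject a positional shift into CpG 0 only

f_stats = positional_f_statistic(beta, position)
print(f_stats)
```

In this toy setup the CpG with the injected shift should show a much larger F statistic than the two unaffected CpGs. A real analysis would convert the statistics to p-values and correct for multiple testing.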
Affiliation(s)
- Chuan Jiao
- Center for Medical Genetics, Central South University, Changsha, Hunan 410012, PR China
- Chunling Zhang
- Department of Neurology and Physiology, SUNY Upstate Medical University, Syracuse, NY 13201, USA
- Rujia Dai
- Center for Medical Genetics, Central South University, Changsha, Hunan 410012, PR China
- Yan Xia
- Center for Medical Genetics, Central South University, Changsha, Hunan 410012, PR China
- Kangli Wang
- Center for Medical Genetics, Central South University, Changsha, Hunan 410012, PR China
- Gina Giase
- Department of Psychiatry, University of Illinois at Chicago, Chicago, IL 60607, USA
- Chao Chen
- Center for Medical Genetics, Central South University, Changsha, Hunan 410012, PR China
- National Clinical Research Center for Geriatric Disorders, Central South University, Changsha, Hunan 410012, PR China
- Chunyu Liu
- Center for Medical Genetics, Central South University, Changsha, Hunan 410012, PR China
- Department of Psychiatry, SUNY Upstate Medical University, Syracuse, NY 13201, USA
|
10
|
Abstract
Studies have shown that gene expression is highly regulated, which results in a cascade of distinct coexpression patterns that form a network. Identifying and understanding such patterns is crucial for deciphering the molecular mechanisms that underlie the pathophysiology of diseases. With advances in high-throughput assays of messenger RNA (mRNA) and in high-performance computing, reconstructing such networks from molecular data such as gene expression is now possible. This chapter presents an overview of methods for constructing such networks, practical considerations, and an example.
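One of the simplest ways to build a coexpression network is to link genes whose expression profiles are strongly correlated across samples. The sketch below thresholds the absolute Pearson correlation to obtain an adjacency matrix. The abstract does not detail the chapter's own methods, so the hard-threshold approach, the cutoff of 0.8, the function name, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def coexpression_adjacency(expr, threshold=0.8):
    """Boolean adjacency matrix linking genes whose |Pearson r| >= threshold.

    expr: (n_samples, n_genes) expression matrix.
    """
    corr = np.corrcoef(np.asarray(expr, dtype=float), rowvar=False)
    adjacency = np.abs(corr) >= threshold
    np.fill_diagonal(adjacency, False)  # drop self-loops
    return adjacency


# Synthetic toy data (not from the chapter): gene B tracks gene A, gene C is independent.
rng = np.random.default_rng(0)
gene_a = rng.normal(size=50)
gene_b = gene_a + 0.1 * rng.normal(size=50)
gene_c = rng.normal(size=50)
expr = np.column_stack([gene_a, gene_b, gene_c])

adjacency = coexpression_adjacency(expr)
# List each undirected edge once by reading only the upper triangle.
edges = [(int(i), int(j)) for i, j in zip(*np.nonzero(np.triu(adjacency)))]
print(edges)
```

A hard threshold is easy to reason about but sensitive to the chosen cutoff. Weighted approaches that keep the correlation strength as an edge weight are a common alternative.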
Affiliation(s)
- Roby Joehanes
- Hebrew SeniorLife, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA.
|