1
Bonfiglio F, Legati A, Lasorsa VA, Palombo F, De Riso G, Isidori F, Russo S, Furini S, Merla G, Coppedè F, Tartaglia M, Bruselles A, Pippucci T, Ciolfi A, Pinelli M, Capasso M. Best practices for germline variant and DNA methylation analysis of second- and third-generation sequencing data. Hum Genomics 2024; 18:120. [PMID: 39501379; PMCID: PMC11536923; DOI: 10.1186/s40246-024-00684-8] [Received: 08/02/2024] [Accepted: 10/11/2024] [Indexed: 11/09/2024] Open
Abstract
This comprehensive review provides insights and suggested strategies for the analysis of germline variants using second- and third-generation sequencing technologies (SGS and TGS). It addresses the critical stages of data processing, starting from alignment and preprocessing to quality control, variant calling, and the removal of artifacts. The document emphasizes the importance of meticulous data handling, highlighting advanced methodologies for annotating variants and identifying structural variations and methylated DNA sites. Special attention is given to the inspection of problematic variants, a step that is crucial for ensuring the accuracy of the analysis, particularly in clinical settings where genetic diagnostics can inform patient care. Additionally, the document covers the use of various bioinformatics tools and software that enhance the precision and reliability of these analyses. It outlines best practices for the annotation of variants, including considerations for problematic genetic alterations such as those in the human leukocyte antigen region, runs of homozygosity, and mitochondrial DNA alterations. The document also explores the complexities associated with identifying structural variants and copy number variations, underscoring the challenges posed by these large-scale genomic alterations. The objective is to offer a comprehensive framework for researchers and clinicians, ensuring that genetic analyses conducted with SGS and TGS are both accurate and reproducible. By following these best practices, the document aims to increase the diagnostic accuracy for hereditary diseases, facilitating early diagnosis, prevention, and personalized treatment strategies. This review serves as a valuable resource for both novices and experts in the field, providing insights into the latest advancements and methodologies in genetic analysis. It also aims to encourage the adoption of these practices in diverse research and clinical contexts, promoting consistency and reliability across studies.
Affiliation(s)
- Ferdinando Bonfiglio
  - Department of Molecular Medicine and Medical Biotechnology, University of Naples Federico II, Naples, Italy
  - CEINGE Advanced Biotechnology Franco Salvatore, Naples, Italy
- Andrea Legati
  - Fondazione IRCCS Istituto Neurologico Carlo Besta, Milan, Italy
- Flavia Palombo
  - Programma Di Neurogenetica, IRCCS Istituto Delle Scienze Neurologiche Di Bologna, Bologna, Italy
- Giulia De Riso
  - Department of Molecular Medicine and Medical Biotechnology, University of Naples Federico II, Naples, Italy
  - CEINGE Advanced Biotechnology Franco Salvatore, Naples, Italy
- Federica Isidori
  - IRCCS Azienda Ospedaliero-Universitaria Di Bologna, Bologna, Italy
- Silvia Russo
  - Research Laboratory of Medical Cytogenetics and Molecular Genetics, IRCCS Istituto Auxologico Italiano, Milan, Italy
  - Laboratorio di Ricerca di Citogenetica Medica e Genetica Molecolare, Istituto Auxologico Italiano, IRCCS, 20145, Milano, Italy
- Simone Furini
  - Department of Electrical, Electronic and Information Engineering "Guglielmo Marconi", University of Bologna, Bologna, Italy
- Giuseppe Merla
  - Department of Molecular Medicine and Medical Biotechnology, University of Naples Federico II, Naples, Italy
- Fabio Coppedè
  - Department of Translational Research and of New Surgical and Medical Technologies, University of Pisa, Pisa, Italy
- Marco Tartaglia
  - Molecular Genetics and Functional Genomics, Bambino Gesù Children's Hospital, IRCCS, Rome, Italy
- Alessandro Bruselles
  - Department of Oncology and Molecular Medicine, Istituto Superiore Di Sanità, Rome, Italy
- Tommaso Pippucci
  - IRCCS Azienda Ospedaliero-Universitaria Di Bologna, Bologna, Italy
- Andrea Ciolfi
  - Molecular Genetics and Functional Genomics, Bambino Gesù Children's Hospital, IRCCS, Rome, Italy
- Michele Pinelli
  - Department of Molecular Medicine and Medical Biotechnology, University of Naples Federico II, Naples, Italy
  - CEINGE Advanced Biotechnology Franco Salvatore, Naples, Italy
- Mario Capasso
  - Department of Molecular Medicine and Medical Biotechnology, University of Naples Federico II, Naples, Italy
  - CEINGE Advanced Biotechnology Franco Salvatore, Naples, Italy
2
Shastri GG, Sudre G, Ahn K, Jung B, Kolachana B, Auluck PK, Elnitski L, Marenco S, Shaw P. Cortico-striatal differences in the epigenome in attention-deficit/hyperactivity disorder. Transl Psychiatry 2024; 14:189. [PMID: 38605038; PMCID: PMC11009227; DOI: 10.1038/s41398-024-02896-x] [Received: 03/04/2024] [Revised: 03/20/2024] [Accepted: 04/02/2024] [Indexed: 04/13/2024] Open
Abstract
While epigenetic modifications have been implicated in ADHD through studies of peripheral tissue, to date there has been no examination of the epigenome of the brain in the disorder. To address this gap, we mapped the methylome of the caudate nucleus and anterior cingulate cortex in post-mortem tissue from fifty-eight individuals with or without ADHD. While no single probe showed adjusted significance in differential methylation, several differentially methylated regions emerged. These regions implicated genes involved in developmental processes including neurogenesis and the differentiation of oligodendrocytes and glial cells. We demonstrate a significant association between differentially methylated genes in the caudate and genes implicated by GWAS not only in ADHD but also in autistic spectrum, obsessive-compulsive and bipolar affective disorders. Using transcriptomic data available on the same subjects, we found modest correlations between the methylation and expression of genes. In conclusion, this study of the cortico-striatal methylome points to genes and gene pathways involved in neurodevelopment, consistent with studies of common and rare genetic variation, as well as the post-mortem transcriptome in ADHD.
Affiliation(s)
- Gauri G Shastri
  - Social and Behavioral Research Branch, National Human Genome Research Institute, NIH, Bethesda, MD, 20892, USA
- Gustavo Sudre
  - Social and Behavioral Research Branch, National Human Genome Research Institute, NIH, Bethesda, MD, 20892, USA
- Kwangmi Ahn
  - Social and Behavioral Research Branch, National Human Genome Research Institute, NIH, Bethesda, MD, 20892, USA
- Benjamin Jung
  - Social and Behavioral Research Branch, National Human Genome Research Institute, NIH, Bethesda, MD, 20892, USA
- Bhaskar Kolachana
  - Human Brain Collection Core, National Institute of Mental Health, NIH, Bethesda, MD, 20892, USA
- Pavan K Auluck
  - Human Brain Collection Core, National Institute of Mental Health, NIH, Bethesda, MD, 20892, USA
- Laura Elnitski
  - Translational and Functional Genomics Branch, National Human Genome Research Institute, NIH, Bethesda, MD, 20892, USA
- Stefano Marenco
  - Human Brain Collection Core, National Institute of Mental Health, NIH, Bethesda, MD, 20892, USA
- Philip Shaw
  - Social and Behavioral Research Branch, National Human Genome Research Institute, NIH, Bethesda, MD, 20892, USA
3
Carvalho Silva R, Martini P, Hohoff C, Mattevi S, Bortolomasi M, Menesello V, Gennarelli M, Baune BT, Minelli A. DNA methylation changes in association with trauma-focused psychotherapy efficacy in treatment-resistant depression patients: a prospective longitudinal study. Eur J Psychotraumatol 2024; 15:2314913. [PMID: 38362742; PMCID: PMC10878335; DOI: 10.1080/20008066.2024.2314913] [Received: 08/29/2023] [Accepted: 01/30/2024] [Indexed: 02/17/2024] Open
Abstract
Background: Stressful events increase the risk for treatment-resistant depression (TRD), and trauma-focused psychotherapy can be useful for TRD patients exposed to early life stress (ELS). Epigenetic processes are known to be related to depression and ELS, but there is no evidence of the effects of trauma-focused psychotherapy on methylation alterations.
Objective: We performed the first epigenome-wide association study to investigate methylation changes related to the effects of trauma-focused psychotherapies in TRD patients.
Method: Thirty TRD patients assessed for ELS underwent trauma-focused psychotherapy; of those, 12 received trauma-focused cognitive behavioural therapy and 18 received Eye Movement Desensitization and Reprocessing (EMDR). DNA methylation was profiled with the Illumina Infinium EPIC array at T0 (baseline), after 8 weeks (T8, end of psychotherapy) and after 12 weeks (T12, follow-up). We examined differentially methylated CpG sites and regions, and performed pathway analysis, in association with the treatment.
Results: The main results showed 110 differentially methylated regions (DMRs) with a significant adjusted p-value area associated with the effects of trauma-focused psychotherapies in the entire cohort. Several annotated genes are related to inflammatory processes and psychiatric disorders, such as LTA, GFI1, ARID5B, TNFSF13, and LST1. Gene enrichment analyses revealed statistically significant processes related to tumour necrosis factor (TNF) receptor and TNF signalling pathway. Analyses stratified by type of trauma-focused psychotherapy showed a statistically significant adjusted p-value area in 141 DMRs only for the group of patients receiving EMDR, with annotated genes related to inflammation and psychiatric disorders, including LTA, GFI1, and S100A8. Gene set enrichment analyses in the EMDR group indicated biological processes related to inflammatory response, particularly the TNF signalling pathway.
Conclusion: We provide valuable preliminary insights into global DNA methylation changes associated with the effects of trauma-focused psychotherapies, in particular EMDR treatment.
Affiliation(s)
- Rosana Carvalho Silva
  - Department of Molecular and Translational Medicine, University of Brescia, Brescia, Italy
- Paolo Martini
  - Department of Molecular and Translational Medicine, University of Brescia, Brescia, Italy
- Christa Hohoff
  - Department of Psychiatry and Psychotherapy, University of Münster, Münster, Germany
- Stefania Mattevi
  - Department of Molecular and Translational Medicine, University of Brescia, Brescia, Italy
- Valentina Menesello
  - Genetics Unit, IRCCS Istituto Centro San Giovanni di Dio Fatebenefratelli, Brescia, Italy
- Massimo Gennarelli
  - Department of Molecular and Translational Medicine, University of Brescia, Brescia, Italy
  - Genetics Unit, IRCCS Istituto Centro San Giovanni di Dio Fatebenefratelli, Brescia, Italy
- Bernhard T. Baune
  - Department of Psychiatry and Psychotherapy, University of Münster, Münster, Germany
  - Department of Psychiatry, Melbourne Medical School, University of Melbourne, Melbourne, Australia
  - The Florey Institute of Neuroscience and Mental Health, The University of Melbourne, Parkville, Australia
- Alessandra Minelli
  - Department of Molecular and Translational Medicine, University of Brescia, Brescia, Italy
  - Genetics Unit, IRCCS Istituto Centro San Giovanni di Dio Fatebenefratelli, Brescia, Italy
4
Giuili E, Grolaux R, Macedo CZNM, Desmyter L, Pichon B, Neuens S, Vilain C, Olsen C, Van Dooren S, Smits G, Defrance M. Comprehensive evaluation of the implementation of episignatures for diagnosis of neurodevelopmental disorders (NDDs). Hum Genet 2023; 142:1721-1735. [PMID: 37889307; PMCID: PMC10676303; DOI: 10.1007/s00439-023-02609-2] [Received: 07/28/2023] [Accepted: 10/10/2023] [Indexed: 10/28/2023]
Abstract
Episignatures are popular tools for the diagnosis of rare neurodevelopmental disorders. They are commonly based on a set of differentially methylated CpGs used in combination with a support vector machine model. DNA methylation (DNAm) data often include missing values due to changes in data generation technology and batch effects. While many normalization methods exist for DNAm data, their impact on episignature performance has never been assessed. In addition, technologies to quantify DNAm evolve quickly and this may lead to poor transposition of existing episignatures generated on deprecated array versions to new ones. Indeed, probe removal between array versions, technologies or during preprocessing leads to missing values. Thus, the effect of missing data on episignature performance must also be carefully evaluated and addressed through imputation or an innovative approach to episignature design. In this paper, we used data from patients suffering from Kabuki and Sotos syndromes to evaluate the influence of normalization methods, classification models and missing data on the prediction performance of two existing episignatures. We compare how six popular normalization methods for methylarray data affect episignature classification performance in Kabuki and Sotos syndromes and provide best practice suggestions when building new episignatures. In this setting, we show that Illumina, Noob or Funnorm normalization methods achieved higher classification performance on the testing sets compared to Quantile, Raw and Swan normalization methods. We further show that penalized logistic regression and support vector machines perform best in the classification of Kabuki and Sotos syndrome patients. Then, we describe a new paradigm to build episignatures based on the detection of differentially methylated regions (DMRs) and evaluate their performance compared to classical differentially methylated cytosines (DMCs)-based episignatures in the presence of missing data. 
We show that the performance of classical DMC-based episignatures suffers from the presence of missing data more than the DMR-based approach. We present a comprehensive evaluation of how the normalization of DNA methylation data affects episignature performance, using three popular classification models. We further evaluate how missing data affect those models' predictions. Finally, we propose a novel methodology to develop episignatures based on differentially methylated regions identification and show how this method slightly outperforms classical episignatures in the presence of missing data.
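The abstract does not say how DMR-level features are computed. Purely as an illustration, assuming a region is summarized by the mean beta value of whichever probes are present, the following minimal sketch shows why a region-level feature can remain usable when a probe is missing, whereas a single-CpG feature would simply be absent. The function name `region_feature` and the toy values are hypothetical and not taken from the paper.

```python
from statistics import mean


def region_feature(probe_betas):
    """Mean beta over the probes that are present; None marks a missing probe.

    Assumption for illustration only: the region is summarized by a plain mean.
    """
    present = [b for b in probe_betas if b is not None]
    return mean(present) if present else None


# Toy region with three probes (hypothetical values).
full = region_feature([0.80, 0.84, 0.82])

# The same region after one probe was dropped, e.g. between array versions.
partial = region_feature([0.80, None, 0.82])

# A region with no surviving probes yields no feature at all.
empty = region_feature([None, None])
```

In this toy case the region-level value shifts only slightly when one probe is lost, which is the intuition behind comparing DMR-based and DMC-based signatures under missing data.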
Affiliation(s)
- Edoardo Giuili
  - Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussel, Brussels, Belgium
- Robin Grolaux
  - Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussel, Brussels, Belgium
- Catarina Z N M Macedo
  - Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussel, Brussels, Belgium
- Laurence Desmyter
  - Center for Human Genetics, Hôpital Erasme, Hôpital Universitaire de Bruxelles, Université Libre de Bruxelles, Brussels, Belgium
- Bruno Pichon
  - Center for Human Genetics, Hôpital Erasme, Hôpital Universitaire de Bruxelles, Université Libre de Bruxelles, Brussels, Belgium
- Sebastian Neuens
  - Center for Human Genetics, Hôpital Erasme, Hôpital Universitaire de Bruxelles, Université Libre de Bruxelles, Brussels, Belgium
  - Department of Genetics, Hôpital Universitaire Des Enfants Reine Fabiola, Hôpital Universitaire de Bruxelles, Université Libre de Bruxelles, Brussels, Belgium
- Catheline Vilain
  - Center for Human Genetics, Hôpital Erasme, Hôpital Universitaire de Bruxelles, Université Libre de Bruxelles, Brussels, Belgium
  - Department of Genetics, Hôpital Universitaire Des Enfants Reine Fabiola, Hôpital Universitaire de Bruxelles, Université Libre de Bruxelles, Brussels, Belgium
- Catharina Olsen
  - Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussel, Brussels, Belgium
  - Clinical Sciences, Research Group Reproduction and Genetics, Brussels Interuniversity Genomics High Throughput Core (BRIGHTcore), Vrije Universiteit Brussel (VUB), Universitair Ziekenhuis Brussel (UZ Brussel), Brussels, Belgium
  - Clinical Sciences, Research Group Reproduction and Genetics, Centre for Medical Genetics, Vrije Universiteit Brussel (VUB), Universitair Ziekenhuis Brussel (UZ Brussel), Brussels, Belgium
- Sonia Van Dooren
  - Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussel, Brussels, Belgium
  - Clinical Sciences, Research Group Reproduction and Genetics, Brussels Interuniversity Genomics High Throughput Core (BRIGHTcore), Vrije Universiteit Brussel (VUB), Universitair Ziekenhuis Brussel (UZ Brussel), Brussels, Belgium
  - Clinical Sciences, Research Group Reproduction and Genetics, Centre for Medical Genetics, Vrije Universiteit Brussel (VUB), Universitair Ziekenhuis Brussel (UZ Brussel), Brussels, Belgium
- Guillaume Smits
  - Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussel, Brussels, Belgium
  - Center for Human Genetics, Hôpital Erasme, Hôpital Universitaire de Bruxelles, Université Libre de Bruxelles, Brussels, Belgium
  - Department of Genetics, Hôpital Universitaire Des Enfants Reine Fabiola, Hôpital Universitaire de Bruxelles, Université Libre de Bruxelles, Brussels, Belgium
- Matthieu Defrance
  - Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussel, Brussels, Belgium
5
Zheng Y, Lunetta KL, Liu C, Smith AK, Sherva R, Miller MW, Logue MW. A novel principal component based method for identifying differentially methylated regions in Illumina Infinium MethylationEPIC BeadChip data. Epigenetics 2023; 18:2207959. [PMID: 37196182; PMCID: PMC10193914; DOI: 10.1080/15592294.2023.2207959] [Received: 09/18/2022] [Revised: 03/22/2023] [Accepted: 04/19/2023] [Indexed: 05/19/2023] Open
Abstract
Differentially methylated regions (DMRs) are genomic regions with methylation patterns across multiple CpG sites that are associated with a phenotype. In this study, we proposed a Principal Component (PC) based DMR analysis method for use with data generated using the Illumina Infinium MethylationEPIC BeadChip (EPIC) array. We obtained methylation residuals by regressing the M-values of CpGs within a region on covariates, extracted PCs of the residuals, and then combined association information across PCs to obtain regional significance. Simulation-based genome-wide false positive (GFP) rates and true positive rates were estimated under a variety of conditions before determining the final version of our method, which we have named DMRPC. Then, DMRPC and another DMR method, coMethDMR, were used to perform epigenome-wide analyses of several phenotypes known to have multiple associated methylation loci (age, sex, and smoking) in a discovery and a replication cohort. Among regions that were analysed by both methods, DMRPC identified 50% more genome-wide significant age-associated DMRs than coMethDMR. The replication rate for the loci that were identified by only DMRPC was higher than the rate for those that were identified by only coMethDMR (90% for DMRPC vs. 76% for coMethDMR). Furthermore, DMRPC identified replicable associations in regions of moderate between-CpG correlation which are typically not analysed by coMethDMR. For the analyses of sex and smoking, the advantage of DMRPC was less clear. In conclusion, DMRPC is a new powerful DMR discovery tool that retains power in genomic regions with moderate correlation across CpGs.
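The first steps described in the abstract (M-values regressed on covariates, principal components of the residuals, association with a phenotype) can be sketched in a few lines. This is a simplified illustration and not the DMRPC implementation: it assumes a single covariate, uses only the first principal component (found by power iteration), measures association with a plain Pearson correlation, and omits the step that combines evidence across several PCs, which the abstract does not specify. All function names and the toy data are hypothetical.

```python
import math


def beta_to_m(beta):
    """Convert a beta value (strictly between 0 and 1) to an M-value."""
    return math.log2(beta / (1.0 - beta))


def residualize(y, x):
    """Residuals from a least-squares fit of y on one covariate x plus an intercept."""
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return [(yi - mean_y) - slope * (xi - mean_x) for xi, yi in zip(x, y)]


def first_pc_scores(columns, iterations=200):
    """Per-sample scores on the first principal component of mean-zero columns.

    Each column holds one CpG's residuals; power iteration is run on the
    CpG-by-CpG covariance matrix to approximate its leading eigenvector.
    """
    p, n = len(columns), len(columns[0])
    cov = [[sum(a * b for a, b in zip(columns[i], columns[j])) / (n - 1)
            for j in range(p)] for i in range(p)]
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        v = [wi / norm for wi in w]
    return [sum(v[i] * columns[i][s] for i in range(p)) for s in range(n)]


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) *
                    sum((y - mean_b) ** 2 for y in b))
    return num / den


def region_pc1_association(betas_by_cpg, covariate, phenotype):
    """Correlation between a region's first residual PC and the phenotype."""
    residuals = [residualize([beta_to_m(b) for b in cpg], covariate)
                 for cpg in betas_by_cpg]
    return pearson(first_pc_scores(residuals), phenotype)


# Hypothetical toy region: 3 CpGs measured in 6 samples.
TOY_BETAS = [
    [0.20, 0.25, 0.30, 0.55, 0.60, 0.65],
    [0.22, 0.24, 0.33, 0.50, 0.58, 0.70],
    [0.30, 0.28, 0.35, 0.52, 0.61, 0.66],
]
TOY_AGE = [30, 45, 50, 35, 40, 55]   # covariate to regress out
TOY_STATUS = [0, 0, 0, 1, 1, 1]      # phenotype of interest

r = region_pc1_association(TOY_BETAS, TOY_AGE, TOY_STATUS)
```

A real analysis would use several covariates, retain more than one PC, and convert the per-PC associations into a regional p-value; those choices are left out here because the abstract does not describe them in enough detail to reproduce.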
Affiliation(s)
- Yuanchao Zheng
  - National Center for PTSD, VA Boston Healthcare System, Boston, MA, USA
  - Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
- Kathryn L. Lunetta
  - Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
- Chunyu Liu
  - Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
- Alicia K. Smith
  - Department of Gynecology and Obstetrics, Emory University, Atlanta, GA, USA
  - Department of Psychiatry and Behavioral Sciences, Emory University School of Medicine, Atlanta, GA, USA
- Richard Sherva
  - National Center for PTSD, VA Boston Healthcare System, Boston, MA, USA
  - Department of Psychiatry, Boston University School of Medicine, Boston, MA, USA
- Mark W. Miller
  - National Center for PTSD, VA Boston Healthcare System, Boston, MA, USA
  - Biomedical Genetics, Boston University School of Medicine, Boston, MA, USA
- Mark W. Logue
  - National Center for PTSD, VA Boston Healthcare System, Boston, MA, USA
  - Department of Biostatistics, Boston University School of Public Health, Boston, MA, USA
  - Department of Psychiatry, Boston University School of Medicine, Boston, MA, USA
  - Biomedical Genetics, Boston University School of Medicine, Boston, MA, USA