1
Hu M, Chikina M. Heterogeneous pseudobulk simulation enables realistic benchmarking of cell-type deconvolution methods. Genome Biol 2024; 25:169. [PMID: 38956606 PMCID: PMC11218230 DOI: 10.1186/s13059-024-03292-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2023] [Accepted: 05/29/2024] [Indexed: 07/04/2024] Open
Abstract
BACKGROUND Computational cell type deconvolution enables the estimation of cell type abundance from bulk tissues and is important for understanding tissue microenvironment, especially in tumor tissues. With rapid development of deconvolution methods, many benchmarking studies have been published aiming for a comprehensive evaluation of these methods. Benchmarking studies rely on cell-type resolved single-cell RNA-seq data to create simulated pseudobulk datasets by adding individual cell types in controlled proportions. RESULTS In our work, we show that the standard application of this approach, which uses randomly selected single cells, regardless of the intrinsic difference between them, generates synthetic bulk expression values that lack appropriate biological variance. We demonstrate why and how the current bulk simulation pipeline with random cells is unrealistic and propose a heterogeneous simulation strategy as a solution. The heterogeneously simulated bulk samples match up with the variance observed in real bulk datasets and therefore provide concrete benefits for benchmarking in several ways. We demonstrate that conceptual classes of deconvolution methods differ dramatically in their robustness to heterogeneity, with reference-free methods performing particularly poorly. For regression-based methods, the heterogeneous simulation provides an explicit framework to disentangle the contributions of reference construction and regression methods to performance. Finally, we perform an extensive benchmark of diverse methods across eight different datasets and find BayesPrism and a hybrid MuSiC/CIBERSORTx approach to be the top performers. CONCLUSIONS Our heterogeneous bulk simulation method and the entire benchmarking framework are implemented in a user-friendly package available at https://github.com/humengying0907/deconvBenchmarking and https://doi.org/10.5281/zenodo.8206516 , enabling further developments in deconvolution methods.
Affiliation(s)
- Mengying Hu
- Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, USA
- Joint Carnegie Mellon - University of Pittsburgh Computational Biology PhD Program, University of Pittsburgh, Pittsburgh, USA
- Maria Chikina
- Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, USA.
- Joint Carnegie Mellon - University of Pittsburgh Computational Biology PhD Program, University of Pittsburgh, Pittsburgh, USA.
2
Marchi E, Hinks TSC, Richardson M, Khalfaoui L, Symon FA, Rajasekar P, Clifford R, Hargadon B, Austin CD, MacIsaac JL, Kobor MS, Siddiqui S, Mar JS, Arron JR, Choy DF, Bradding P. The effects of inhaled corticosteroids on healthy airways. Allergy 2024; 79:1831-1843. [PMID: 38686450 PMCID: PMC7616167 DOI: 10.1111/all.16146] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Revised: 02/27/2024] [Accepted: 03/19/2024] [Indexed: 05/02/2024]
Abstract
BACKGROUND The effects of inhaled corticosteroids (ICS) on healthy airways are poorly defined. OBJECTIVES To delineate the effects of ICS on gene expression in healthy airways, without confounding caused by changes in disease-related genes and disease-related alterations in ICS responsiveness. METHODS Randomized open-label bronchoscopy study of high-dose ICS therapy in 30 healthy adult volunteers randomized 2:1 to (i) fluticasone propionate 500 mcg bd daily or (ii) no treatment, for 4 weeks. Laboratory staff were blinded to allocation. Biopsies and brushings were analysed by immunohistochemistry, bulk RNA sequencing, DNA methylation array and metagenomics. RESULTS ICS induced small between-group differences in blood and lamina propria eosinophil numbers, but not in other immunopathological features, blood neutrophils, FeNO, FEV1, microbiome or DNA methylation. ICS treatment upregulated 72 genes in brushings and 53 genes in biopsies, and downregulated 82 genes in brushings and 416 genes in biopsies. The most downregulated genes in both tissues were canonical markers of type-2 inflammation (FCER1A, CPA3, IL33, CLEC10A, SERPINB10 and CCR5), T cell-mediated adaptive immunity (TARP, TRBC1, TRBC2, PTPN22, TRAC, CD2, CD8A, HLA-DQB2, CD96, PTPN7), B-cell immunity (CD20, immunoglobulin heavy and light chains) and innate immunity, including CD48, Hobit, RANTES, Langerin and GFI1. An IL-17-dependent gene signature was not upregulated by ICS. CONCLUSIONS In healthy airways, 4-week ICS exposure reduces gene expression related to both innate and adaptive immunity, and reduces markers of type-2 inflammation. This implies that homeostasis in health involves tonic type-2 signalling in the airway mucosa, which is exquisitely sensitive to ICS.
Affiliation(s)
- Emanuele Marchi
- NIHR Oxford Respiratory BRC and Respiratory Medicine Unit, Experimental Medicine, Nuffield Department of Medicine, John Radcliffe Hospital, Oxford, UK
- Timothy S C Hinks
- NIHR Oxford Respiratory BRC and Respiratory Medicine Unit, Experimental Medicine, Nuffield Department of Medicine, John Radcliffe Hospital, Oxford, UK
- Matthew Richardson
- Department of Respiratory Sciences, University of Leicester, Leicester Respiratory NIHR BRC, Glenfield Hospital, Leicester, UK
- Latifa Khalfaoui
- Department of Respiratory Sciences, University of Leicester, Leicester Respiratory NIHR BRC, Glenfield Hospital, Leicester, UK
- Fiona A Symon
- Department of Respiratory Sciences, University of Leicester, Leicester Respiratory NIHR BRC, Glenfield Hospital, Leicester, UK
- Poojitha Rajasekar
- Centre for Respiratory Research, Translational Medical Sciences, School of Medicine, Nottingham NIHR Biomedical Research Centre, Biodiscovery Institute, University Park, University of Nottingham, Nottingham, UK
- Rachel Clifford
- Centre for Respiratory Research, Translational Medical Sciences, School of Medicine, Nottingham NIHR Biomedical Research Centre, Biodiscovery Institute, University Park, University of Nottingham, Nottingham, UK
- Beverley Hargadon
- Department of Respiratory Sciences, University of Leicester, Leicester Respiratory NIHR BRC, Glenfield Hospital, Leicester, UK
- Cary D Austin
- Genentech, Inc., South San Francisco, California, USA
- Julia L MacIsaac
- Edwin S.H. Leong Centre for Healthy Aging, Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
- Michael S Kobor
- Edwin S.H. Leong Centre for Healthy Aging, Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
- Salman Siddiqui
- Department of Respiratory Sciences, University of Leicester, Leicester Respiratory NIHR BRC, Glenfield Hospital, Leicester, UK
- Jordan S Mar
- Genentech, Inc., South San Francisco, California, USA
- David F Choy
- Genentech, Inc., South San Francisco, California, USA
- Peter Bradding
- Department of Respiratory Sciences, University of Leicester, Leicester Respiratory NIHR BRC, Glenfield Hospital, Leicester, UK
3
Yap CX, Vo DD, Heffel MG, Bhattacharya A, Wen C, Yang Y, Kemper KE, Zeng J, Zheng Z, Zhu Z, Hannon E, Vellame DS, Franklin A, Caggiano C, Wamsley B, Geschwind DH, Zaitlen N, Gusev A, Pasaniuc B, Mill J, Luo C, Gandal MJ. Brain cell-type shifts in Alzheimer's disease, autism, and schizophrenia interrogated using methylomics and genetics. Science Advances 2024; 10:eadn7655. [PMID: 38781333 PMCID: PMC11114225 DOI: 10.1126/sciadv.adn7655] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2023] [Accepted: 03/14/2024] [Indexed: 05/25/2024]
Abstract
Few neuropsychiatric disorders have replicable biomarkers, prompting high-resolution and large-scale molecular studies. However, we still lack consensus on a more foundational question: whether quantitative shifts in cell types, the functional unit of life, contribute to neuropsychiatric disorders. Leveraging advances in human brain single-cell methylomics, we deconvolve seven major cell types using bulk DNA methylation profiling across 1270 postmortem brains, including from individuals diagnosed with Alzheimer's disease, schizophrenia, and autism. We observe and replicate cell-type compositional shifts for Alzheimer's disease (endothelial cell loss), autism (increased microglia), and schizophrenia (decreased oligodendrocytes), and find age- and sex-related changes. Multiple layers of evidence indicate that endothelial cell loss contributes to Alzheimer's disease, with comparable effect size to APOE genotype among older people. Genome-wide association identified five genetic loci related to cell-type composition, involving plausible genes for the neurovascular unit (P2RX5 and TRPV3) and excitatory neurons (DPY30 and MEMO1). These results implicate specific cell-type shifts in the pathophysiology of neuropsychiatric disorders.
Affiliation(s)
- Chloe X. Yap
- Mater Research Institute, University of Queensland, Brisbane, Queensland, Australia
- Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland, Australia
- Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Program in Neurobehavioral Genetics, Semel Institute, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Daniel D. Vo
- Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Program in Neurobehavioral Genetics, Semel Institute, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Lifespan Brain Institute at Penn Medicine and The Children’s Hospital of Philadelphia, Department of Psychiatry, University of Pennsylvania, Philadelphia, PA, USA
- Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Matthew G. Heffel
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, CA, USA
- Arjun Bhattacharya
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, CA, USA
- Institute for Quantitative and Computational Biosciences, David Geffen School of Medicine, University of California, Los Angeles, CA, USA
- Department of Epidemiology, University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Institute for Data Science in Oncology, University of Texas MD Anderson Cancer Center, Houston, TX, USA
- Cindy Wen
- Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Program in Neurobehavioral Genetics, Semel Institute, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Yuanhao Yang
- Mater Research Institute, University of Queensland, Brisbane, Queensland, Australia
- Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland, Australia
- Kathryn E. Kemper
- Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland, Australia
- Jian Zeng
- Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland, Australia
- Zhili Zheng
- Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland, Australia
- Zhihong Zhu
- Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland, Australia
- The National Centre for Register-based Research, Aarhus University, Denmark
- Eilis Hannon
- Department of Clinical and Biomedical Sciences, University of Exeter Medical School, University of Exeter, Exeter, UK
- Dorothea Seiler Vellame
- Department of Clinical and Biomedical Sciences, University of Exeter Medical School, University of Exeter, Exeter, UK
- Alice Franklin
- Department of Clinical and Biomedical Sciences, University of Exeter Medical School, University of Exeter, Exeter, UK
- Christa Caggiano
- Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, CA, USA
- Department of Neurology, University of California Los Angeles, Los Angeles, CA, USA
- Brie Wamsley
- Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Program in Neurobehavioral Genetics, Semel Institute, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Department of Neurology, University of California Los Angeles, Los Angeles, CA, USA
- Center for Autism Research and Treatment, Semel Institute, University of California, Los Angeles, Los Angeles, CA, USA
- Daniel H. Geschwind
- Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Program in Neurobehavioral Genetics, Semel Institute, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Department of Neurology, University of California Los Angeles, Los Angeles, CA, USA
- Center for Autism Research and Treatment, Semel Institute, University of California, Los Angeles, Los Angeles, CA, USA
- Noah Zaitlen
- Department of Neurology, University of California Los Angeles, Los Angeles, CA, USA
- Department of Computational Medicine, University of California Los Angeles, Los Angeles, CA, USA
- Alexander Gusev
- Department of Medical Oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston, MA, USA
- Division of Genetics, Brigham & Women’s Hospital, Boston, MA, USA
- Medical and Population Genetics, Broad Institute, Cambridge, MA, USA
- Bogdan Pasaniuc
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, CA, USA
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, CA, USA
- Department of Computational Medicine, University of California Los Angeles, Los Angeles, CA, USA
- Institute for Precision Health, University of California, Los Angeles, Los Angeles, CA, USA
- Jonathan Mill
- Department of Clinical and Biomedical Sciences, University of Exeter Medical School, University of Exeter, Exeter, UK
- Chongyuan Luo
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Michael J. Gandal
- Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Program in Neurobehavioral Genetics, Semel Institute, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Lifespan Brain Institute at Penn Medicine and The Children’s Hospital of Philadelphia, Department of Psychiatry, University of Pennsylvania, Philadelphia, PA, USA
- Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
4
Nguyen H, Nguyen H, Tran D, Draghici S, Nguyen T. Fourteen years of cellular deconvolution: methodology, applications, technical evaluation and outstanding challenges. Nucleic Acids Res 2024; 52:4761-4783. [PMID: 38619038 PMCID: PMC11109966 DOI: 10.1093/nar/gkae267] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2023] [Revised: 03/01/2024] [Accepted: 04/02/2024] [Indexed: 04/16/2024] Open
Abstract
Single-cell RNA sequencing (scRNA-Seq) is a recent technology that allows for the measurement of the expression of all genes in each individual cell contained in a sample. Information at the single-cell level has been shown to be extremely useful in many areas. However, performing single-cell experiments is expensive. Although cellular deconvolution cannot provide the same comprehensive information as single-cell experiments, it can extract cell-type information from bulk RNA data, and therefore it allows researchers to conduct studies at cell-type resolution from existing bulk datasets. For these reasons, a great effort has been made to develop such methods for cellular deconvolution. The large number of methods available, the requirement of coding skills, inadequate documentation, and lack of performance assessment all make it extremely difficult for life scientists to choose a suitable method for their experiment. This paper aims to fill this gap by providing a comprehensive review of 53 deconvolution methods regarding their methodology, applications, performance, and outstanding challenges. More importantly, the article presents a benchmarking of all these 53 methods using 283 cell types from 30 tissues of 63 individuals. We also provide an R package named DeconBenchmark that allows readers to execute and benchmark the reviewed methods (https://github.com/tinnlab/DeconBenchmark).
Affiliation(s)
- Hung Nguyen
- Department of Computer Science and Software Engineering, Auburn University, Auburn, AL, USA
- Ha Nguyen
- Department of Computer Science and Software Engineering, Auburn University, Auburn, AL, USA
- Duc Tran
- Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA
- Sorin Draghici
- Department of Computer Science, Wayne State University, Detroit, MI, USA
- Advaita Bioinformatics, Ann Arbor, MI, USA
- Tin Nguyen
- Department of Computer Science and Software Engineering, Auburn University, Auburn, AL, USA
5
De Ridder K, Che H, Leroy K, Thienpont B. Benchmarking of methods for DNA methylome deconvolution. Nat Commun 2024; 15:4134. [PMID: 38755121 PMCID: PMC11099101 DOI: 10.1038/s41467-024-48466-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2023] [Accepted: 04/30/2024] [Indexed: 05/18/2024] Open
Abstract
Defining the number and abundance of different cell types in tissues is important for understanding disease mechanisms as well as for diagnostic and prognostic purposes. Typically, this is achieved by immunohistological analyses, cell sorting, or single-cell RNA-sequencing. Alternatively, cell-specific DNA methylome information can be leveraged to deconvolve cell fractions from a bulk DNA mixture. However, comprehensive benchmarking of deconvolution methods and modalities has not yet been performed. Here we evaluate 16 deconvolution algorithms, developed either specifically for DNA methylome data or more generically. We assess the performance of these algorithms, and the effect of normalization methods, while modeling variables that impact deconvolution performance, including cell abundance, cell type similarity, reference panel size, method for methylome profiling (array or sequencing), and technical variation. We observe differences in algorithm performance depending on each of these variables, emphasizing the need for tailoring deconvolution analyses. The complexity of the reference, marker selection method, number of marker loci and, for sequencing-based assays, sequencing depth have a marked influence on performance. By developing handles to select the optimal analysis configuration, we provide a valuable source of information for studies aiming to deconvolve array- or sequencing-based methylation data.
Affiliation(s)
- Kobe De Ridder
- Laboratory for Functional Epigenetics, Department of Human Genetics, KU Leuven, 3000, Leuven, Belgium
- Huiwen Che
- Laboratory for Functional Epigenetics, Department of Human Genetics, KU Leuven, 3000, Leuven, Belgium
- Kaat Leroy
- Laboratory for Functional Epigenetics, Department of Human Genetics, KU Leuven, 3000, Leuven, Belgium
- Bernard Thienpont
- Laboratory for Functional Epigenetics, Department of Human Genetics, KU Leuven, 3000, Leuven, Belgium.
- KU Leuven Institute for Single Cell Omics (LISCO), KU Leuven, 3000, Leuven, Belgium.
- KU Leuven Cancer Institute (LKI), KU Leuven, 3000, Leuven, Belgium.
6
Pan Y, Wang X, Sun J, Liu C, Peng J, Li Q. Multimodal joint deconvolution and integrative signature selection in proteomics. Commun Biol 2024; 7:493. [PMID: 38658803 PMCID: PMC11043077 DOI: 10.1038/s42003-024-06155-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2023] [Accepted: 04/08/2024] [Indexed: 04/26/2024] Open
Abstract
Deconvolution is an efficient approach for detecting cell-type-specific (cs) transcriptomic signals without cellular segmentation. However, this type of method may require a reference profile from the same molecular source and tissue type. Here, we present a method to dissect bulk proteome by leveraging tissue-matched transcriptome and proteome without using a proteomics reference panel. Our method also selects the proteins contributing to the cellular heterogeneity shared between bulk transcriptome and proteome. The deconvoluted result enables downstream analyses such as cs-protein Quantitative Trait Loci (cspQTL) mapping. We benchmarked the performance of this multimodal deconvolution approach through CITE-seq pseudo bulk data, a simulation study, and the bulk multi-omics data from human brain normal tissues and breast cancer tumors, individually, showing robust and accurate cell abundance quantification across different datasets. This algorithm is implemented in a tool, MICSQTL, that also provides cspQTL and multi-omics integrative visualization, available at https://bioconductor.org/packages/MICSQTL .
Affiliation(s)
- Yue Pan
- Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, TN, 38105, USA
- Xusheng Wang
- Center for Proteomics and Metabolomics, St. Jude Children's Research Hospital, Memphis, TN, 38105, USA
- Department of Genetics, Genomics & Informatics, University of Tennessee Health Science Center, Memphis, TN, 38105, USA
- Jiao Sun
- Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, TN, 38105, USA
- Chunyu Liu
- Department of Psychiatry, SUNY Upstate Medical University, Syracuse, NY, 13210, USA
- Junmin Peng
- Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, 38105, USA
- Department of Developmental Neurobiology, St. Jude Children's Research Hospital, Memphis, TN, 38105, USA
- Qian Li
- Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, TN, 38105, USA.
7
Qiu J, Xu X, Guo J, Wang Z, Wu J, Ding H, Xu Y, Wu Y, Ying Q, Qiu J, Wu S, Shi S. Comparison of extraction processes, characterization and intestinal protection activity of Bletilla striata polysaccharides. Int J Biol Macromol 2024; 263:130267. [PMID: 38378109 DOI: 10.1016/j.ijbiomac.2024.130267] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Revised: 02/12/2024] [Accepted: 02/15/2024] [Indexed: 02/22/2024]
Abstract
We optimized the extraction process of Bletilla striata polysaccharides using orthogonal design, Box-Behnken design (BBD), and genetic algorithm-back propagation (GA-BP), then compared and evaluated them to confirm that the combination of BBD and GA-BP neural networks was capable of increasing polysaccharide yields and antioxidant activity. The optimal extraction parameters were as follows: liquid-to-solid ratio of 15 mL/g, extraction power of 450 W, and extraction time of 34 min. Under these conditions, the polysaccharide yield and antioxidant activity were 8.29 ± 0.50 % and 26.20 ± 0.28 (mM FE/mg), respectively. Subsequently, the polysaccharide was purified to obtain purified Bletilla striata polysaccharides 1 (pBSP1) with a Mw of 255.172 kDa. Scanning electron microscope (SEM), ultraviolet-visible detector (UV), Fourier transform infrared spectrometer (FTIR), high performance liquid chromatography (HPLC), X-ray diffraction (XRD), nuclear magnetic resonance (NMR) and periodate oxidation were used to analyze the structure of pBSP1. The results showed pBSP1 had a smooth surface and a rough interior, with a composition of α-D conformation glucose (18.23 %) and β-D conformation mannose (53.77 %), and an amorphous crystal structure. According to the results of thermogravimetric and rheological tests, pBSP1 exhibits good thermal stability and viscoelastic behavior. Furthermore, pBSP1 protected lipopolysaccharide (LPS)-induced GES-1 and Caco2 cells; the results showed that pBSP1 (400 μg/mL) lowered TEER synthesis in Caco2 cells as well as apoptosis and reactive oxygen species (ROS) production in both cells, indicating that pBSP1 may have an intestine protective effect.
Affiliation(s)
- Junjie Qiu
- School of Pharmaceutical Sciences, Zhejiang Chinese Medical University, Hangzhou 310053, China
- Xiao Xu
- Asset Management Co., Ltd, Zhejiang Chinese Medical University, Hangzhou 310053, China
- Jingyan Guo
- School of Pharmaceutical Sciences, Zhejiang Chinese Medical University, Hangzhou 310053, China
- Zhenyu Wang
- School of Pharmaceutical Sciences, Zhejiang Chinese Medical University, Hangzhou 310053, China
- Jinjin Wu
- The Third School of Clinical Medicine, Zhejiang Chinese Medical University, Hangzhou, China
- Huiqin Ding
- The Second School of Clinical Medicine, Zhejiang Chinese Medical University, Hangzhou, China
- Yuchen Xu
- School of Pharmaceutical Sciences, Zhejiang Chinese Medical University, Hangzhou 310053, China
- Yili Wu
- School of Pharmaceutical Sciences, Zhejiang Chinese Medical University, Hangzhou 310053, China
- Qianyi Ying
- School of Pharmaceutical Sciences, Zhejiang Chinese Medical University, Hangzhou 310053, China
- Jiawei Qiu
- School of Pharmaceutical Sciences, Zhejiang Chinese Medical University, Hangzhou 310053, China
- Suxiang Wu
- School of Pharmaceutical Sciences, Zhejiang Chinese Medical University, Hangzhou 310053, China.
- Senlin Shi
- School of Pharmaceutical Sciences, Zhejiang Chinese Medical University, Hangzhou 310053, China.
8
Ferro dos Santos MR, Giuili E, De Koker A, Everaert C, De Preter K. Computational deconvolution of DNA methylation data from mixed DNA samples. Brief Bioinform 2024; 25:bbae234. [PMID: 38762790 PMCID: PMC11102637 DOI: 10.1093/bib/bbae234] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2024] [Revised: 03/30/2024] [Accepted: 04/30/2024] [Indexed: 05/20/2024] Open
Abstract
In this review, we provide a comprehensive overview of the different computational tools that have been published for the deconvolution of bulk DNA methylation (DNAm) data. Here, deconvolution refers to the estimation of cell-type proportions that constitute a mixed sample. The paper reviews and compares 25 deconvolution methods (supervised, unsupervised or hybrid) developed between 2012 and 2023 and compares the strengths and limitations of each approach. Moreover, in this study, we describe the impact of the platform used for the generation of methylation data (including microarrays and sequencing), the applied data pre-processing steps and the used reference dataset on the deconvolution performance. Next to reference-based methods, we also examine methods that require only partial reference datasets or require no reference set at all. In this review, we provide guidelines for the use of specific methods dependent on the DNA methylation data type and data availability.
Affiliation(s)
- Maísa R Ferro dos Santos
- VIB-UGent Center for Medical Biotechnology (CMB), Technologiepark-Zwijnaarde 75, 9052 Zwijnaarde, Belgium
- Cancer Research Institute Ghent (CRIG), 9000 Ghent, Belgium
- Edoardo Giuili
- VIB-UGent Center for Medical Biotechnology (CMB), Technologiepark-Zwijnaarde 75, 9052 Zwijnaarde, Belgium
- Cancer Research Institute Ghent (CRIG), 9000 Ghent, Belgium
- Andries De Koker
- VIB-UGent Center for Medical Biotechnology (CMB), Technologiepark-Zwijnaarde 75, 9052 Zwijnaarde, Belgium
- Cancer Research Institute Ghent (CRIG), 9000 Ghent, Belgium
- Celine Everaert
- VIB-UGent Center for Medical Biotechnology (CMB), Technologiepark-Zwijnaarde 75, 9052 Zwijnaarde, Belgium
- Cancer Research Institute Ghent (CRIG), 9000 Ghent, Belgium
- Katleen De Preter
- VIB-UGent Center for Medical Biotechnology (CMB), Technologiepark-Zwijnaarde 75, 9052 Zwijnaarde, Belgium
- Cancer Research Institute Ghent (CRIG), 9000 Ghent, Belgium
9
Wu CT, Du D, Chen L, Dai R, Liu C, Yu G, Bhardwaj S, Parker SJ, Zhang Z, Clarke R, Herrington DM, Wang Y. CAM3.0: determining cell type composition and expression from bulk tissues with fully unsupervised deconvolution. Bioinformatics 2024; 40:btae107. [PMID: 38407991 PMCID: PMC10924278 DOI: 10.1093/bioinformatics/btae107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2023] [Revised: 01/13/2024] [Accepted: 02/25/2024] [Indexed: 02/28/2024] Open
Abstract
MOTIVATION Complex tissues are dynamic ecosystems consisting of molecularly distinct yet interacting cell types. Computational deconvolution aims to dissect bulk tissue data into cell type compositions and cell-specific expressions. With few exceptions, most existing deconvolution tools exploit supervised approaches requiring various types of references that may be unreliable or even unavailable for specific tissue microenvironments. RESULTS We previously developed a fully unsupervised deconvolution method, Convex Analysis of Mixtures (CAM), that enables estimation of cell type composition and expression from bulk tissues. We now introduce the CAM3.0 tool, which improves this framework with three new and highly efficient algorithms, namely, radius-fixed clustering to identify reliable markers, linear programming to detect an initial scatter simplex, and a smart floating search for the optimum latent variable model. The comparative experimental results obtained from both realistic simulations and case studies show that the CAM3.0 tool can help biologists more accurately identify known or novel cell markers, determine cell proportions, and estimate cell-specific expressions, complementing the existing tools particularly when study- or datatype-specific references are unreliable or unavailable. AVAILABILITY AND IMPLEMENTATION The open-source R scripts of CAM3.0 are freely available at https://github.com/ChiungTingWu/CAM3/ (https://github.com/Bioconductor/Contributions/issues/3205). A user's guide and a vignette are provided.
Affiliation(s)
- Chiung-Ting Wu
- Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA 22203, United States
| | - Dongping Du
- Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA 22203, United States
| | - Lulu Chen
- Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA 22203, United States
| | - Rujia Dai
- Department of Psychiatry, SUNY Upstate Medical University, Syracuse, NY 13210, United States
| | - Chunyu Liu
- Department of Psychiatry, SUNY Upstate Medical University, Syracuse, NY 13210, United States
| | - Guoqiang Yu
- Department of Automation, Tsinghua University, Beijing 100084, P. R. China
| | - Saurabh Bhardwaj
- Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA 22203, United States
- Department of Electrical and Instrumentation Engineering, Thapar Institute of Engineering & Technology, Punjab 147004, India
| | - Sarah J Parker
- Advanced Clinical Biosystems Research Institute, Cedars Sinai Medical Center, Los Angeles, CA 90048, United States
| | - Zhen Zhang
- Department of Pathology, Johns Hopkins University, Baltimore, MD 21231, United States
| | - Robert Clarke
- The Hormel Institute, University of Minnesota, Austin, MN 55912, United States
| | - David M Herrington
- Department of Internal Medicine, Wake Forest University, Winston-Salem, NC 27157, United States
| | - Yue Wang
- Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA 22203, United States
|
10
|
Garmire LX, Li Y, Huang Q, Xu C, Teichmann SA, Kaminski N, Pellegrini M, Nguyen Q, Teschendorff AE. Challenges and perspectives in computational deconvolution of genomics data. Nat Methods 2024; 21:391-400. [PMID: 38374264 DOI: 10.1038/s41592-023-02166-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2022] [Accepted: 12/26/2023] [Indexed: 02/21/2024]
Abstract
Deciphering cell-type heterogeneity is crucial for systematically understanding tissue homeostasis and its dysregulation in diseases. Computational deconvolution is an efficient approach for estimating cell-type abundances from a variety of omics data. Despite substantial methodological progress in computational deconvolution in recent years, challenges are still outstanding. Here we list four important challenges related to computational deconvolution: the quality of the reference data, generation of ground truth data, limitations of computational methodologies, and benchmarking design and implementation. Finally, we make recommendations on reference data generation, new directions of computational methodologies, and strategies to promote rigorous benchmarking.
Affiliation(s)
- Lana X Garmire
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA.
| | - Yijun Li
- Department of Biostatistics, University of Michigan, Ann Arbor, MI, USA
| | - Qianhui Huang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
| | - Chuan Xu
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
| | | | - Naftali Kaminski
- Pulmonary, Critical Care & Sleep Medicine, Yale University School of Medicine, New Haven, CT, USA
| | - Matteo Pellegrini
- Molecular, Cell and Developmental Biology, University of California, Los Angeles, Los Angeles, CA, USA
| | - Quan Nguyen
- Institute for Molecular Bioscience, The University of Queensland and QIMR Berghofer Medical Research Institute, Brisbane, Queensland, Australia
| | - Andrew E Teschendorff
- CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- UCL Cancer Institute, University College London, London, UK
|
11
|
Morita K, Mizuno T, Azuma I, Suzuki Y, Kusuhara H. Rat Deconvolution as Knowledge Miner for Immune Cell Trafficking from Toxicogenomics Databases. Toxicol Sci 2023; 197:kfad117. [PMID: 37941435 PMCID: PMC10823770 DOI: 10.1093/toxsci/kfad117] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2023] Open
Abstract
Toxicogenomics databases are useful for understanding biological responses in individuals because they include a diverse spectrum of biological responses. Although these databases contain no information regarding immune cells in the liver, which are important in the progression of liver injury, deconvolution that estimates cell-type proportions from bulk transcriptome could extend immune information. However, deconvolution has been mainly applied to humans and mice and less often to rats, which are the main target of toxicogenomics databases. Here, we developed a deconvolution method for rats to retrieve information regarding immune cells from toxicogenomics databases. The rat-specific deconvolution showed high correlations for several types of immune cells between spleen and blood, and between liver treated with toxicants compared with those based on human and mouse data. Additionally, we found 4 clusters of compounds in the Open TG-GATEs database based on estimated immune cell trafficking, which are different from those based on transcriptome data itself. The contributions of this work are three-fold. First, we obtained the gene expression profiles of 6 rat immune cells necessary for deconvolution. Second, we clarified the importance of species differences on deconvolution. Third, we retrieved immune cell trafficking from toxicogenomics databases. Accumulated and comparable immune cell profiles of massive data of immune cell trafficking in rats could deepen our understanding and enable us to clarify the relationship between the order and the contribution rate of immune cells, chemokines and cytokines, and pathologies. Ultimately, these findings will lead to the evaluation of organ responses in Adverse Outcome Pathway.
Affiliation(s)
- Katsuhisa Morita
- Department of Pharmaceutical Sciences, The University of Tokyo, Bunkyo, Tokyo, Japan
| | - Tadahaya Mizuno
- Department of Pharmaceutical Sciences, The University of Tokyo, Bunkyo, Tokyo, Japan
| | - Iori Azuma
- Department of Pharmaceutical Sciences, The University of Tokyo, Bunkyo, Tokyo, Japan
| | - Yutaka Suzuki
- Graduate School of Frontier Sciences, The University of Tokyo, Chiba, Japan
| | - Hiroyuki Kusuhara
- Department of Pharmaceutical Sciences, The University of Tokyo, Bunkyo, Tokyo, Japan
|
12
|
Feng H, Meng G, Lin T, Parikh H, Pan Y, Li Z, Krischer J, Li Q. ISLET: individual-specific reference panel recovery improves cell-type-specific inference. Genome Biol 2023; 24:174. [PMID: 37496087 PMCID: PMC10373385 DOI: 10.1186/s13059-023-03014-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2022] [Accepted: 07/12/2023] [Indexed: 07/28/2023] Open
Abstract
We propose a statistical framework ISLET to infer individual-specific and cell-type-specific transcriptome reference panels. ISLET models the repeatedly measured bulk gene expression data, to optimize the usage of shared information within each subject. ISLET is the first available method to achieve individual-specific reference estimation in repeated samples. Using simulation studies, we show outstanding performance of ISLET in the reference estimation and downstream cell-type-specific differentially expressed genes testing. We apply ISLET to longitudinal transcriptomes profiled from blood samples in a large observational study of young children and confirm the cell-type-specific gene signatures for pancreatic islet autoantibody. ISLET is available at https://bioconductor.org/packages/ISLET .
Affiliation(s)
- Hao Feng
- Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, OH, USA.
| | - Guanqun Meng
- Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, OH, USA
| | - Tong Lin
- Department of Biostatistics, St. Jude Children's Research Hospital, 262 Danny Thomas Place, Memphis, TN, 38105, USA
| | - Hemang Parikh
- Health Informatics Institute, University of South Florida, Tampa, FL, 33620, USA
| | - Yue Pan
- Department of Biostatistics, St. Jude Children's Research Hospital, 262 Danny Thomas Place, Memphis, TN, 38105, USA
| | - Ziyi Li
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, 77030, USA
| | - Jeffrey Krischer
- Health Informatics Institute, University of South Florida, Tampa, FL, 33620, USA
| | - Qian Li
- Department of Biostatistics, St. Jude Children's Research Hospital, 262 Danny Thomas Place, Memphis, TN, 38105, USA.
|
13
|
Heiling HM, Wilson DR, Rashid NU, Sun W, Ibrahim JG. Estimating cell type composition using isoform expression one gene at a time. Biometrics 2023; 79:854-865. [PMID: 34921386 PMCID: PMC11245124 DOI: 10.1111/biom.13614] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2020] [Accepted: 12/08/2021] [Indexed: 11/29/2022]
Abstract
Human tissue samples are often mixtures of heterogeneous cell types, which can confound the analyses of gene expression data derived from such tissues. The cell type composition of a tissue sample may itself be of interest and is needed for proper analysis of differential gene expression. A variety of computational methods have been developed to estimate cell type proportions using gene-level expression data. However, RNA isoforms can also be differentially expressed across cell types, and isoform-level expression could be equally or more informative for determining cell type origin than gene-level expression. We propose a new computational method, IsoDeconvMM, which estimates cell type fractions using isoform-level gene expression data. A novel and useful feature of IsoDeconvMM is that it can estimate cell type proportions using only a single gene, though in practice we recommend aggregating estimates of a few dozen genes to obtain more accurate results. We demonstrate the performance of IsoDeconvMM using a unique data set with cell type-specific RNA-seq data across more than 135 individuals. This data set allows us to evaluate different methods given the biological variation of cell type-specific gene expression data across individuals. We further complement this analysis with additional simulations.
Affiliation(s)
- Hillary M Heiling
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
| | - Douglas R Wilson
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
| | - Naim U Rashid
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
| | - Wei Sun
- Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington
| | - Joseph G Ibrahim
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
|
14
|
Chen L, Li Z, Wu H. CeDAR: incorporating cell type hierarchy improves cell type-specific differential analyses in bulk omics data. Genome Biol 2023; 24:37. [PMID: 36855165 PMCID: PMC9972684 DOI: 10.1186/s13059-023-02857-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2022] [Accepted: 01/17/2023] [Indexed: 03/02/2023] Open
Abstract
Bulk high-throughput omics data contain signals from a mixture of cell types. Recent developments of deconvolution methods facilitate cell type-specific inferences from bulk data. Our real data exploration suggests that differential expression or methylation status is often correlated among cell types. Based on this observation, we develop a novel statistical method named CeDAR to incorporate the cell type hierarchy in cell type-specific differential analyses of bulk data. Extensive simulation and real data analyses demonstrate that this approach significantly improves the accuracy and power in detecting cell type-specific differential signals compared with existing methods, especially in low-abundance cell types.
Affiliation(s)
- Luxiao Chen
- Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA 30322, USA
| | - Ziyi Li
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Hao Wu
- Faculty of Computer Science and Control Engineering, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, 1068 Xueyuan Avenue, Shenzhen University Town, Shenzhen, 518055 P.R. China
|
15
|
Verma A, Kommaddi RP, Gnanabharathi B, Hirsch EC, Ravindranath V. Genes critical for development and differentiation of dopaminergic neurons are downregulated in Parkinson's disease. J Neural Transm (Vienna) 2023; 130:495-512. [PMID: 36820885 DOI: 10.1007/s00702-023-02604-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 12/15/2022] [Accepted: 02/13/2023] [Indexed: 02/24/2023]
Abstract
We performed transcriptome analysis using RNA sequencing on substantia nigra pars compacta (SNpc) from mice after acute and chronic 1-methyl-4-phenyl-1,2,3,6-tetrahydropyridine (MPTP) treatment and from Parkinson's disease (PD) patients. Acute and chronic exposure to MPTP resulted in decreased expression of genes involved in sodium channel regulation. However, upregulation of pro-inflammatory pathways was seen after a single dose but not after chronic MPTP treatment. Dopamine biosynthesis and synaptic vesicle recycling pathways were downregulated in PD patients and after chronic MPTP treatment in mice. Genes essential for midbrain development and determination of dopaminergic phenotype, such as LMX1B, FOXA1, RSPO2, KLHL1, EBF3, PITX3, RGS4, ALDH1A1, RET, FOXA2, EN1, DLK1, GFRA1, LMX1A, NR4A2, GAP43, SNCA, PBX1, and GRB10, were downregulated in human PD, and overexpression of GFP-tagged LMX1B rescued MPP+-induced death in SH-SY5Y neurons. Downregulation of the gene ensemble involved in development and differentiation of dopaminergic neurons indicates its potential involvement in pathogenesis and progression of human PD.
Affiliation(s)
- Aditi Verma
- Centre for Neuroscience, Indian Institute of Science, C.V. Raman Avenue, Bangalore, 560012, India
| | - Reddy Peera Kommaddi
- Centre for Brain Research, Indian Institute of Science, Bangalore, 560012, India
| | | | - Etienne C Hirsch
- Sorbonne Université, Institut du Cerveau - ICM, Inserm U 1127, CNRS UMR 7225, 75013, Paris, France
| | - Vijayalakshmi Ravindranath
- Centre for Neuroscience, Indian Institute of Science, C.V. Raman Avenue, Bangalore, 560012, India
- Centre for Brain Research, Indian Institute of Science, Bangalore, 560012, India
|
16
|
Song J, Kuan PF. A systematic assessment of cell type deconvolution algorithms for DNA methylation data. Brief Bioinform 2022; 23:bbac449. [PMID: 36242584 PMCID: PMC9947552 DOI: 10.1093/bib/bbac449] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2022] [Revised: 08/11/2022] [Accepted: 09/20/2022] [Indexed: 12/14/2022] Open
Abstract
We performed a systematic assessment of computational deconvolution methods that play an important role in the estimation of cell type proportions from bulk methylation data. The proposed framework methylDeConv (available as an R package) integrates several deconvolution methods for methylation profiles (Illumina HumanMethylation450 and MethylationEPIC arrays) and offers different cell-type-specific CpG selection to construct the extended reference library which incorporates the main immune cell subsets, epithelial cells and cell-free DNAs. We compared the performance of different deconvolution algorithms via simulations and benchmark datasets and further investigated the associations of the estimated cell type proportions with cancer therapy in breast cancer and subtypes in melanoma methylation case studies. Our results indicated that the deconvolution based on the extended reference library is critical to obtain accurate estimates of cell proportions in non-blood tissues.
Affiliation(s)
- Junyan Song
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY
| | - Pei-Fen Kuan
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY
|
17
|
Fan J, Lyu Y, Zhang Q, Wang X, Li M, Xiao R. MuSiC2: cell-type deconvolution for multi-condition bulk RNA-seq data. Brief Bioinform 2022; 23:bbac430. [PMID: 36208175 PMCID: PMC9677503 DOI: 10.1093/bib/bbac430] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 05/11/2022] [Revised: 08/19/2022] [Accepted: 09/03/2022] [Indexed: 12/14/2022] Open
Abstract
Cell-type composition of intact bulk tissues can vary across samples. Deciphering cell-type composition and its changes during disease progression is an important step toward understanding disease pathogenesis. To infer cell-type composition, existing cell-type deconvolution methods for bulk RNA sequencing (RNA-seq) data often require matched single-cell RNA-seq (scRNA-seq) data, generated from samples with similar clinical conditions, as reference. However, due to the difficulty of obtaining scRNA-seq data in diseased samples, only limited scRNA-seq data in matched disease conditions are available. Using scRNA-seq reference to deconvolve bulk RNA-seq data from samples with different disease conditions may lead to a biased estimation of cell-type proportions. To overcome this limitation, we propose an iterative estimation procedure, MuSiC2, which is an extension of MuSiC, to perform deconvolution analysis of bulk RNA-seq data generated from samples with multiple clinical conditions where at least one condition is different from that of the scRNA-seq reference. Extensive benchmark evaluations indicated that MuSiC2 improved the accuracy of cell-type proportion estimates of bulk RNA-seq samples under different conditions as compared with the traditional MuSiC deconvolution. MuSiC2 was applied to two bulk RNA-seq datasets for deconvolution analysis, including one from human pancreatic islets and the other from human retina. We show that MuSiC2 improves current deconvolution methods and provides more accurate cell-type proportion estimates when the bulk and single-cell reference differ in clinical conditions. We believe the condition-specific cell-type composition estimates from MuSiC2 will facilitate the downstream analysis and help identify cellular targets of human diseases.
Affiliation(s)
- Jiaxin Fan
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
| | - Yafei Lyu
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
| | - Qihuang Zhang
- Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, QC, H3A 1G1, Canada
| | - Xuran Wang
- Department of Statistics and Data Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Mingyao Li
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
| | - Rui Xiao
- Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
|
18
|
Guo Z, Shafik AM, Jin P, Wu H. Differential RNA methylation analysis for MeRIP-seq data under general experimental design. Bioinformatics 2022; 38:4705-4712. [PMID: 36063045 PMCID: PMC9563684 DOI: 10.1093/bioinformatics/btac601] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/15/2021] [Revised: 08/03/2022] [Accepted: 09/02/2022] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION RNA epigenetics is an emerging field that studies post-transcriptional gene regulation. The dynamics of RNA epigenetic modification have been reported to associate with many human diseases. A recently developed high-throughput technology, Methylated RNA Immunoprecipitation Sequencing (MeRIP-seq), enables the transcriptome-wide profiling of N6-methyladenosine (m6A) modification and comparison of RNA epigenetic modifications. There are a few computational methods for the comparison of mRNA modifications under different conditions but they all suffer from serious limitations. RESULTS In this work, we develop a novel statistical method to detect differentially methylated mRNA regions from MeRIP-seq data. We model the sequence count data by a hierarchical negative binomial model that accounts for various sources of variation and derive parameter estimation and statistical testing procedures for flexible statistical inferences under general experimental designs. Extensive benchmark evaluations in simulation and real data analyses demonstrate that our method is more accurate, robust and flexible compared to existing methods. AVAILABILITY AND IMPLEMENTATION Our method TRESS is implemented as an R/Bioconductor package and is available at https://bioconductor.org/packages/devel/TRESS. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Zhenxing Guo
- Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA 30322, USA
| | - Andrew M Shafik
- Department of Human Genetics, Emory University, Atlanta, GA 30322, USA
| | - Peng Jin
- Department of Human Genetics, Emory University, Atlanta, GA 30322, USA
| | - Hao Wu
- Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA 30322, USA
|
19
|
McCullough KM, Katrinli S, Hartmann J, Lori A, Klengel C, Missig G, Klengel T, Langford NA, Newman EL, Anderson KJ, Smith AK, Carroll FI, Ressler KJ, Carlezon WA. Blood levels of T-Cell Receptor Excision Circles (TRECs) provide an index of exposure to traumatic stress in mice and humans. Transl Psychiatry 2022; 12:423. [PMID: 36192377 PMCID: PMC9530209 DOI: 10.1038/s41398-022-02159-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2022] [Revised: 09/05/2022] [Accepted: 09/07/2022] [Indexed: 12/03/2022] Open
Abstract
Exposure to stress triggers biological changes throughout the body. Accumulating evidence indicates that alterations in immune system function are associated with the development of stress-associated illnesses such as major depressive disorder and post-traumatic stress disorder, increasing interest in identifying immune markers that provide insight into mental health. Recombination events during T-cell receptor rearrangement and T-cell maturation in the thymus produce circular DNA fragments called T-cell receptor excision circles (TRECs) that can be utilized as indicators of thymic function and numbers of newly emigrating T-cells. Given data suggesting that stress affects thymus function, we examined whether blood levels of TRECs might serve as a quantitative peripheral index of cumulative stress exposure and its physiological correlates. We hypothesized that chronic stress exposure would compromise thymus function and produce corresponding decreases in levels of TRECs. In male mice, exposure to chronic social defeat stress (CSDS) produced thymic involution, adrenal hypertrophy, and decreased levels of TRECs in blood. Extending these studies to humans revealed robust inverse correlations between levels of circulating TRECs and childhood emotional and physical abuse. Cell-type specific analyses also revealed associations between TREC levels and blood cell composition, as well as cell-type specific methylation changes in CD4T+ and CD8T+ cells. Additionally, TREC levels correlated with epigenetic age acceleration, a common biomarker of stress exposure. Our findings demonstrate alignment between findings in mice and humans and suggest that blood-borne TRECs are a translationally-relevant biomarker that correlates with, and provides insight into, the cumulative physiological and immune-related impacts of stress exposure in mammals.
Affiliation(s)
- Kenneth M McCullough
- Basic Neuroscience Division, Department of Psychiatry, Harvard Medical School, McLean Hospital, Belmont, MA, USA
| | - Seyma Katrinli
- Department of Gynecology and Obstetrics, Emory University, Atlanta, GA, USA
| | - Jakob Hartmann
- Basic Neuroscience Division, Department of Psychiatry, Harvard Medical School, McLean Hospital, Belmont, MA, USA
| | - Adriana Lori
- Department of Psychiatry & Behavioral Sciences, Emory University, Atlanta, GA, USA
| | - Claudia Klengel
- Basic Neuroscience Division, Department of Psychiatry, Harvard Medical School, McLean Hospital, Belmont, MA, USA
| | - Galen Missig
- Basic Neuroscience Division, Department of Psychiatry, Harvard Medical School, McLean Hospital, Belmont, MA, USA
| | - Torsten Klengel
- Basic Neuroscience Division, Department of Psychiatry, Harvard Medical School, McLean Hospital, Belmont, MA, USA
| | - Nicole A Langford
- Department of Psychiatry & Behavioral Sciences, Emory University, Atlanta, GA, USA
| | - Emily L Newman
- Basic Neuroscience Division, Department of Psychiatry, Harvard Medical School, McLean Hospital, Belmont, MA, USA
| | - Kasey J Anderson
- Basic Neuroscience Division, Department of Psychiatry, Harvard Medical School, McLean Hospital, Belmont, MA, USA
| | - Alicia K Smith
- Department of Gynecology and Obstetrics, Emory University, Atlanta, GA, USA
- Department of Psychiatry & Behavioral Sciences, Emory University, Atlanta, GA, USA
| | - F Ivy Carroll
- Center for Organic and Medicinal Chemistry, Research Triangle Institute, Research Triangle Park, NC, USA
| | - Kerry J Ressler
- Basic Neuroscience Division, Department of Psychiatry, Harvard Medical School, McLean Hospital, Belmont, MA, USA
- Department of Psychiatry & Behavioral Sciences, Emory University, Atlanta, GA, USA
| | - William A Carlezon
- Basic Neuroscience Division, Department of Psychiatry, Harvard Medical School, McLean Hospital, Belmont, MA, USA.
|
20
|
Song L, Yang YT, Guo Q, Zhao XM. Cellular transcriptional alterations of peripheral blood in Alzheimer's disease. BMC Med 2022; 20:266. [PMID: 36031604 PMCID: PMC9422129 DOI: 10.1186/s12916-022-02472-4] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 02/21/2022] [Accepted: 07/11/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Alzheimer's disease (AD), a progressive neurodegenerative disease, is the most common cause of dementia worldwide. Accumulating data support the contributions of the peripheral immune system in AD pathogenesis. However, there is a lack of comprehensive understanding about the molecular characteristics of peripheral immune cells in AD. METHODS To explore the alterations of cellular composition and the alterations of intrinsic expression of individual cell types in peripheral blood, we performed cellular deconvolution in a large-scale bulk blood expression cohort and identified cell-intrinsic differentially expressed genes in individual cell types while adjusting for cellular proportion. RESULTS We detected a significant increase and decrease in the proportion of neutrophils and B lymphocytes in AD blood, respectively, which replicated robustly across three other AD cohorts and with alternative algorithms. The differentially expressed genes in AD neutrophils were enriched for some AD-associated pathways, such as ATP metabolic process and mitochondrion organization. We also found a significant enrichment of protein-protein interaction network modules of leukocyte cell-cell activation, mitochondrion organization, and cytokine-mediated signaling pathway in neutrophils for AD risk genes including CD33 and IL1B. Both changes in cellular composition and expression levels of specific genes were significantly associated with the clinical and pathological alterations. A similar pattern of perturbations on the cellular proportion and gene expression levels of neutrophils could also be observed in mild cognitive impairment (MCI). Moreover, we noticed an elevation of neutrophil abundance in the AD brains. CONCLUSIONS We revealed the landscape of molecular perturbations at the cellular level for AD. These alterations highlight the putative roles of neutrophils in AD pathobiology.
Affiliation(s)
- Liting Song
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, 200433, China
| | - Yucheng T Yang
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, 200433, China
- MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, 200433, China
- Zhangjiang Fudan International Innovation Center, Shanghai, 200433, China
| | - Qihao Guo
- Department of Gerontology, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, Shanghai, 200233, China
| | | | - Xing-Ming Zhao
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, 200433, China
- MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, 200433, China
- Zhangjiang Fudan International Innovation Center, Shanghai, 200433, China
- International Human Phenome Institutes (Shanghai), Shanghai, 200433, China
|
21
|
Lyu C, Huang M, Liu N, Chen Z, Lupo PJ, Tycko B, Witte JS, Hobbs CA, Li M. Random field modeling of multi-trait multi-locus association for detecting methylation quantitative trait loci. Bioinformatics 2022; 38:3853-3862. [PMID: 35781319 PMCID: PMC9364381 DOI: 10.1093/bioinformatics/btac443] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2021] [Revised: 06/28/2022] [Accepted: 06/30/2022] [Indexed: 12/24/2022] Open
Abstract
MOTIVATION CpG sites within the same genomic region often share similar methylation patterns and tend to be co-regulated by multiple genetic variants that may interact with one another. RESULTS We propose a multi-trait methylation random field (multi-MRF) method to evaluate the joint association between a set of CpG sites and a set of genetic variants. The proposed method has several advantages. First, it is a multi-trait method that allows flexible correlation structures between neighboring CpG sites (e.g. distance-based correlation). Second, it is also a multi-locus method that integrates the effect of multiple common and rare genetic variants. Third, it models the methylation traits with a beta distribution to characterize their bimodal and interval properties. Through simulations, we demonstrated that the proposed method had improved power over some existing methods under various disease scenarios. We further illustrated the proposed method via an application to a study of congenital heart defects (CHDs) with 83 cardiac tissue samples. Our results suggested that gene BACE2, a methylation quantitative trait locus (QTL) candidate, colocalized with expression QTLs in artery tibial and harbored genetic variants with nominal significant associations in two genome-wide association studies of CHD. AVAILABILITY AND IMPLEMENTATION https://github.com/chenlyu2656/Multi-MRF. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
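The third point above, modelling methylation levels with a beta distribution, can be illustrated with a short sketch. The shape parameters below are illustrative assumptions, not values from the Multi-MRF paper; when both parameters are below 1 the density is U-shaped on (0, 1), which is the bounded, bimodal behaviour the abstract refers to.

```python
import math


def beta_pdf(x, a, b):
    """Density of Beta(a, b) at x in (0, 1), computed via log-gamma for stability."""
    if not 0.0 < x < 1.0:
        raise ValueError("x must lie strictly between 0 and 1")
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1.0) * math.log(x) + (b - 1.0) * math.log(1.0 - x))


# Hypothetical shape parameters, chosen only to show a U-shaped density.
A, B = 0.5, 0.5
for level in (0.05, 0.5, 0.95):
    print(f"methylation level {level:.2f}: density {beta_pdf(level, A, B):.3f}")
```

The density is higher near 0 and 1 than at 0.5, matching the intuition that most CpG sites are either largely unmethylated or largely methylated.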
Affiliation(s)
- Chen Lyu
- Department of Epidemiology and Biostatistics, Indiana University Bloomington, Bloomington, IN 47405, USA; Department of Population Health, New York University Grossman School of Medicine, New York, NY 10016, USA
- Manyan Huang
- Department of Epidemiology and Biostatistics, Indiana University Bloomington, Bloomington, IN 47405, USA
- Nianjun Liu
- Department of Epidemiology and Biostatistics, Indiana University Bloomington, Bloomington, IN 47405, USA
- Zhongxue Chen
- Department of Epidemiology and Biostatistics, Indiana University Bloomington, Bloomington, IN 47405, USA
- Philip J Lupo
- Department of Pediatrics, Baylor College of Medicine, Houston, TX 77030, USA
- Benjamin Tycko
- Center for Discovery and Innovation, Nutley, NJ 07110, USA
- John S Witte
- Department of Epidemiology and Population Health, Stanford University, Stanford, CA 94305, USA; Department of Biomedical Data Sciences, Stanford University, Stanford, CA 94305, USA
- Charlotte A Hobbs
- Rady Children’s Institute for Genomic Medicine, San Diego, CA 92123, USA
- Ming Li
- To whom correspondence should be addressed.
|
22
|
Cai M, Yue M, Chen T, Liu J, Forno E, Lu X, Billiar T, Celedón J, McKennan C, Chen W, Wang J. Robust and accurate estimation of cellular fraction from tissue omics data via ensemble deconvolution. Bioinformatics 2022; 38:3004-3010. [PMID: 35438146 PMCID: PMC9991889 DOI: 10.1093/bioinformatics/btac279] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 01/10/2022] [Revised: 03/22/2022] [Accepted: 04/13/2022] [Indexed: 01/04/2023] Open
Abstract
MOTIVATION Tissue-level omics data such as transcriptomics and epigenomics are an average across diverse cell types. To extract cell-type-specific (CTS) signals, dozens of cellular deconvolution methods have been proposed to infer cell-type fractions from tissue-level data. However, these methods produce vastly different results under various real data settings. Simulation-based benchmarking studies showed no universally best deconvolution approach. There have been attempts at ensemble methods, but they only aggregate multiple single-cell references or reference-free deconvolution methods. RESULTS To achieve a robust estimation of cellular fractions, we proposed EnsDeconv (Ensemble Deconvolution), which adopts CTS robust regression to synthesize the results from 11 single deconvolution methods, 10 reference datasets, 5 marker gene selection procedures, 5 data normalizations and 2 transformations. Unlike most benchmarking studies based on simulations, we compiled four large real datasets of 4937 tissue samples in total with measured cellular fractions and bulk gene expression from different tissues. Comprehensive evaluations demonstrated that EnsDeconv yields more stable, robust and accurate fractions than existing methods. We illustrated that EnsDeconv-estimated cellular fractions enable various CTS downstream analyses such as differential fractions associated with clinical variables. We further extended EnsDeconv to analyze bulk DNA methylation data. AVAILABILITY AND IMPLEMENTATION EnsDeconv is freely available as an R-package from https://github.com/randel/EnsDeconv. The RNA microarray data from the TRAUMA study are available and can be accessed in GEO (GSE36809). The demographic and clinical phenotypes can be shared on reasonable request to the corresponding authors.
The RNA-seq data from the EVAPR study cannot be shared publicly due to the privacy of individuals that participated in the clinical research in compliance with the IRB approval at the University of Pittsburgh. The RNA microarray data from the FHS study are available from dbGaP (phs000007.v32.p13). The RNA-seq data from the ROS study were downloaded from the AD Knowledge Portal. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
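The general idea of combining cell-fraction estimates from several deconvolution runs can be sketched in a few lines. This is not EnsDeconv's CTS robust regression: the sketch swaps in a much simpler per-cell-type median followed by renormalisation, and the method names and fractions are made up for illustration.

```python
from statistics import median


def median_ensemble(estimates):
    """Combine cell-type fractions from several methods.

    estimates: dict mapping method name -> dict of cell type -> fraction.
    Returns a dict of cell type -> fraction that sums to 1.
    """
    cell_types = sorted(next(iter(estimates.values())))
    # Per-cell-type median across methods limits the influence of an outlying run.
    combined = {ct: median(est[ct] for est in estimates.values()) for ct in cell_types}
    total = sum(combined.values())
    if total <= 0:
        raise ValueError("combined fractions must have a positive sum")
    return {ct: value / total for ct, value in combined.items()}


# Hypothetical outputs of three deconvolution runs on one sample.
runs = {
    "method_a": {"neutrophil": 0.60, "B_cell": 0.10, "T_cell": 0.30},
    "method_b": {"neutrophil": 0.55, "B_cell": 0.15, "T_cell": 0.30},
    "method_c": {"neutrophil": 0.20, "B_cell": 0.50, "T_cell": 0.30},  # outlying run
}
ensemble = median_ensemble(runs)
print(ensemble)
```

A median is used here only because it is easy to reason about; a regression-based combination, as described in the abstract, can additionally learn how much to trust each run.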
Affiliation(s)
- Manqi Cai
- Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA 15261, USA
- Molin Yue
- Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA 15261, USA
- Tianmeng Chen
- Department of Surgery, University of Pittsburgh, Pittsburgh, PA 15213, USA
- Department of Pathology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15213, USA
- Jinling Liu
- Department of Engineering Management and Systems Engineering, Missouri University of Science and Technology, Rolla, MO 65409, USA
- Department of Biological Sciences, Missouri University of Science and Technology, Rolla, MO 65409, USA
- Erick Forno
- Department of Pediatrics, University of Pittsburgh Medical Center Children’s Hospital of Pittsburgh, Pittsburgh, PA 15224, USA
- Xinghua Lu
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA 15206, USA
- Timothy Billiar
- Department of Surgery, University of Pittsburgh, Pittsburgh, PA 15213, USA
- Juan Celedón
- Department of Pediatrics, University of Pittsburgh Medical Center Children’s Hospital of Pittsburgh, Pittsburgh, PA 15224, USA
- Chris McKennan
- Department of Statistics, University of Pittsburgh, Pittsburgh, PA 15213, USA
- Wei Chen
- Department of Pediatrics, University of Pittsburgh Medical Center Children’s Hospital of Pittsburgh, Pittsburgh, PA 15224, USA
- Jiebiao Wang
- Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA 15261, USA
|
23
|
Ma W, Sharma S, Jin P, Gourley SL, Qin ZS. LRcell: detecting the source of differential expression at the sub-cell-type level from bulk RNA-seq data. Brief Bioinform 2022; 23:bbac063. [PMID: 35272348 PMCID: PMC9116223 DOI: 10.1093/bib/bbac063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2021] [Revised: 01/23/2022] [Accepted: 02/08/2022] [Indexed: 11/13/2022] Open
Abstract
Given that most tissues consist of abundant and diverse (sub-)cell types, an important yet unaddressed problem in bulk RNA-seq analysis is to identify in which (sub-)cell type(s) the differential expression occurs. Single-cell RNA-sequencing (scRNA-seq) technologies can answer the question, but they are often labor-intensive and cost-prohibitive. Here, we present LRcell, a computational method aiming to identify the specific (sub-)cell type(s) that drive the changes observed in a bulk RNA-seq experiment. In addition, LRcell provides pre-embedded marker genes computed from putative scRNA-seq experiments as options to execute the analyses. We conduct a simulation study to demonstrate the effectiveness and reliability of LRcell. Using three different real datasets, we show that LRcell successfully identifies known cell types involved in psychiatric disorders. Applying LRcell to bulk RNA-seq results can produce a hypothesis on which (sub-)cell type(s) contribute to the differential expression. LRcell is complementary to cell type deconvolution methods.
Affiliation(s)
- Wenjing Ma
- Department of Computer Science, Emory University, 400 Dowman Drive, Atlanta, GA 30322, USA
- Sumeet Sharma
- Graduate Program in Neuroscience, Emory University, 1462 Clifton Road NE, Atlanta, GA 30322, USA
- Peng Jin
- Department of Human Genetics, Emory University, 1365 Clifton Road, Atlanta, GA 30322, USA
- Shannon L Gourley
- Department of Pediatrics, School of Medicine, Emory University, 100 Woodruff Circle, Atlanta, GA 30322, USA; Yerkes National Primate Research Center, Atlanta, GA 30322, USA
- Zhaohui S Qin
- Department of Computer Science, Emory University, 400 Dowman Drive, Atlanta, GA 30322, USA
- Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, 1518 Clifton Road NE, Atlanta, GA 30322, USA
|
24
|
Comprehensive evaluation of deconvolution methods for human brain gene expression. Nat Commun 2022; 13:1358. [PMID: 35292647 PMCID: PMC8924248 DOI: 10.1038/s41467-022-28655-4] [Citation(s) in RCA: 36] [Impact Index Per Article: 18.0] [Received: 09/05/2019] [Accepted: 01/28/2022] [Indexed: 11/08/2022] Open
Abstract
Transcriptome deconvolution aims to estimate the cellular composition of an RNA sample from its gene expression data, which in turn can be used to correct for composition differences across samples. The human brain is unique in its transcriptomic diversity, and comprises a complex mixture of cell-types, including transcriptionally similar subtypes of neurons. Here, we carry out a comprehensive evaluation of deconvolution methods for human brain transcriptome data, and assess the tissue-specificity of our key observations by comparison with human pancreas and heart. We evaluate eight transcriptome deconvolution approaches and nine cell-type signatures, testing the accuracy of deconvolution using in silico mixtures of single-cell RNA-seq data, RNA mixtures, as well as nearly 2000 human brain samples. Our results identify the main factors that drive deconvolution accuracy for brain data, and highlight the importance of biological factors influencing cell-type signatures, such as brain region and in vitro cell culturing. Transcriptome deconvolution aims to estimate cellular composition based on gene expression data. Here the authors evaluate deconvolution methods for human brain transcriptome and conclude that partial deconvolution algorithms work best, but that appropriate cell-type signatures are also important.
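The in silico mixture test described above can be sketched as follows: build a bulk profile from known proportions, deconvolve it, and compare the estimate with the truth. This is a minimal sketch, not one of the eight evaluated approaches; the signature values and cell-type labels are hypothetical, and the solver is a plain non-negative least-squares fit using multiplicative updates, which assumes all values are positive.

```python
def mix(signature, proportions):
    """Build an in silico bulk profile: gene-wise weighted sum of cell-type signatures."""
    return [sum(s * p for s, p in zip(row, proportions)) for row in signature]


def deconvolve_nnls(signature, bulk, n_iter=500):
    """Estimate non-negative cell-type weights with multiplicative updates.

    Minimises ||bulk - signature @ x||^2 subject to x >= 0, then rescales x to sum to 1.
    Assumes all signature and bulk values are positive.
    """
    n_types = len(signature[0])
    # Precompute S^T b and S^T S once; they do not change across iterations.
    stb = [sum(row[k] * b for row, b in zip(signature, bulk)) for k in range(n_types)]
    sts = [[sum(row[j] * row[k] for row in signature) for k in range(n_types)]
           for j in range(n_types)]
    x = [1.0 / n_types] * n_types
    for _ in range(n_iter):
        denom = [sum(sts[k][j] * x[j] for j in range(n_types)) for k in range(n_types)]
        x = [x[k] * stb[k] / denom[k] for k in range(n_types)]
    total = sum(x)
    return [v / total for v in x]


# Hypothetical signature: rows are genes; columns are neuron, astrocyte, microglia.
SIGNATURE = [
    [10.0, 1.0, 1.0],
    [1.0, 8.0, 1.0],
    [1.0, 1.0, 12.0],
    [5.0, 5.0, 5.0],
]
TRUE_PROPORTIONS = [0.6, 0.3, 0.1]

bulk = mix(SIGNATURE, TRUE_PROPORTIONS)
estimate = deconvolve_nnls(SIGNATURE, bulk)
print([round(v, 3) for v in estimate])
```

Because this mixture is noise-free and built from the same signature used for deconvolution, recovery is near exact; the benchmark's point is that accuracy degrades when the signature does not match the tissue, for example across brain regions or cultured versus primary cells.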
|
25
|
Rowland B, Huh R, Hou Z, Crowley C, Wen J, Shen Y, Hu M, Giusti-Rodríguez P, Sullivan PF, Li Y. THUNDER: A reference-free deconvolution method to infer cell type proportions from bulk Hi-C data. PLoS Genet 2022; 18:e1010102. [PMID: 35259165 PMCID: PMC8932604 DOI: 10.1371/journal.pgen.1010102] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/01/2021] [Revised: 03/18/2022] [Accepted: 02/14/2022] [Indexed: 11/30/2022] Open
Abstract
Hi-C data provide population averaged estimates of three-dimensional chromatin contacts across cell types and states in bulk samples. Effective analysis of Hi-C data entails controlling for the potential confounding factor of differential cell type proportions across heterogeneous bulk samples. We propose a novel unsupervised deconvolution method for inferring cell type composition from bulk Hi-C data, the Two-step Hi-c UNsupervised DEconvolution appRoach (THUNDER). We conducted extensive simulations to test THUNDER based on combining two published single-cell Hi-C (scHi-C) datasets. THUNDER more accurately estimates the underlying cell type proportions compared to reference-free methods (e.g., TOAST, and NMF) and is more robust than reference-dependent methods (e.g. MuSiC). We further demonstrate the practical utility of THUNDER to estimate cell type proportions and identify cell-type-specific interactions in Hi-C data from adult human cortex tissue samples. THUNDER will be a useful tool in adjusting for varying cell type composition in population samples, facilitating valid and more powerful downstream analysis such as differential chromatin organization studies. Additionally, THUNDER estimated contact profiles provide a useful exploratory framework to investigate cell-type-specificity of the chromatin interactome while experimental data is still rare.
Affiliation(s)
- Bryce Rowland
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Ruth Huh
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Zoey Hou
- Department of Engineering Sciences and Applied Mathematics, Northwestern University, Evanston, Illinois, United States of America
- Cheynna Crowley
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Jia Wen
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Yin Shen
- Institute for Human Genetics, University of California San Francisco, San Francisco, California, United States of America
- Department of Neurology, University of California San Francisco, San Francisco, California, United States of America
- Ming Hu
- Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic Foundation, Cleveland, Ohio, United States of America
- Paola Giusti-Rodríguez
- Department of Psychiatry, University of Florida College of Medicine, Gainesville, Florida, United States of America
- Patrick F. Sullivan
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Department of Psychiatry, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden
- Yun Li
- Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
|
26
|
Ferraro F, Fevga C, Bonifati V, Mandemakers W, Mahfouz A, Reinders M. Correcting Differential Gene Expression Analysis for Cyto-Architectural Alterations in Substantia Nigra of Parkinson's Disease Patients Reveals Known and Potential Novel Disease-Associated Genes and Pathways. Cells 2022; 11:cells11020198. [PMID: 35053314 PMCID: PMC8774027 DOI: 10.3390/cells11020198] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2021] [Revised: 12/31/2021] [Accepted: 01/04/2022] [Indexed: 11/16/2022] Open
Abstract
Several studies have analyzed gene expression profiles in the substantia nigra to better understand the pathological mechanisms causing Parkinson’s disease (PD). However, the concordance between the identified gene signatures in these individual studies was generally low. This might have been caused by a change in cell type composition as loss of dopaminergic neurons in the substantia nigra pars compacta is a hallmark of PD. Through an extensive meta-analysis of nine previously published microarray studies, we demonstrated that a big proportion of the detected differentially expressed genes was indeed caused by cyto-architectural alterations due to the heterogeneity in the neurodegenerative stage and/or technical artefacts. After correcting for cell composition, we identified a common signature that deregulated the previously unreported ammonium transport, as well as known biological processes such as bioenergetic pathways, response to proteotoxic stress, and immune response. By integrating with protein interaction data, we shortlisted a set of key genes, such as LRRK2, PINK1, PRKN, and FBXO7, known to be related to PD, others with compelling evidence for their role in neurodegeneration, such as GSK3β, WWOX, and VPC, and novel potential players in the PD pathogenesis. Together, these data show the importance of accounting for cyto-architecture in these analyses and highlight the contribution of multiple cell types and novel processes to PD pathology, providing potential new targets for drug development.
Affiliation(s)
- Federico Ferraro
- Erasmus MC, Department of Clinical Genetics, University Medical Center Rotterdam, 3015 GD Rotterdam, The Netherlands
- Christina Fevga
- Erasmus MC, Department of Clinical Genetics, University Medical Center Rotterdam, 3015 GD Rotterdam, The Netherlands
- Vincenzo Bonifati
- Erasmus MC, Department of Clinical Genetics, University Medical Center Rotterdam, 3015 GD Rotterdam, The Netherlands
- Wim Mandemakers
- Erasmus MC, Department of Clinical Genetics, University Medical Center Rotterdam, 3015 GD Rotterdam, The Netherlands
- Ahmed Mahfouz
- Delft Bioinformatics Laboratory, Delft University of Technology, 2628 XE Delft, The Netherlands
- Leiden Computational Biology Center, Leiden University Medical Center, 2333 ZA Leiden, The Netherlands
- Department of Human Genetics, Leiden University Medical Center, 2333 ZA Leiden, The Netherlands
- Marcel Reinders
- Delft Bioinformatics Laboratory, Delft University of Technology, 2628 XE Delft, The Netherlands
- Leiden Computational Biology Center, Leiden University Medical Center, 2333 ZA Leiden, The Netherlands
- Department of Human Genetics, Leiden University Medical Center, 2333 ZA Leiden, The Netherlands
- Section Molecular Epidemiology, Department of Biomedical Data Sciences, Leiden University Medical Center, 2333 ZA Leiden, The Netherlands
- Correspondence:
|
27
|
Computational challenges in detection of cancer using cell-free DNA methylation. Comput Struct Biotechnol J 2022; 20:26-39. [PMID: 34976309 PMCID: PMC8669313 DOI: 10.1016/j.csbj.2021.12.001] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 09/02/2021] [Revised: 12/02/2021] [Accepted: 12/02/2021] [Indexed: 12/18/2022] Open
Abstract
Cell-free DNA (cfDNA) methylation profiling is considered promising and potentially reliable for liquid biopsy, both to study disease progression and to develop reliable and consistent diagnostic and prognostic biomarkers. Several different mechanisms are responsible for the release of cfDNA into blood plasma, and hence it can provide information regarding dynamic changes in the human body. Due to the fragmented nature and low concentration of cfDNA, and high background noise, there are several challenges in its analysis for regular use in the diagnosis of cancer. Such challenges in the analysis of the methylation profile of cfDNA are further aggravated by heterogeneity, biomarker sensitivity, platform biases, and batch effects. This review delineates the origin of cfDNA methylation, its profiling, and the associated computational problems in analysis for diagnosis. Here we also consider the multi-marker approach to handle cancer heterogeneity and explore the utility of markers for 5hmC-based cfDNA methylation patterns. Further, we provide a critical overview of deconvolution and machine learning methods for cfDNA methylation analysis. Our review of current methods reveals the potential for further improvement in analysis strategies for detecting early cancer using cfDNA methylation.
Key Words
- Cancer heterogeneity
- Cell free DNA
- Computation
- DMP, Differentially methylated base position
- DMR, Differentially methylated regions
- Diagnosis
- HELP-seq, HpaII-tiny fragment Enrichment by Ligation-mediated PCR sequencing
- MBD-seq, Methyl-CpG Binding Domain Protein Capture Sequencing
- MCTA-seq, Methylated CpG tandems amplification and sequencing
- MSCC, Methylation Sensitive Cut Counting
- MSRE, methylation sensitive restriction enzymes
- MeDIP-seq, Methylated DNA Immunoprecipitation Sequencing
- RRBS, Reduced-Representation Bisulfite Sequencing
- WGBS, Whole Genome Bisulfite Sequencing
- cfDNA, cell free DNA
- ctDNA, circulating tumor DNA
- dPCR, digital polymerase chain reaction
- ddMCP, droplet digital methylation-specific PCR
- ddPCR, droplet digital polymerase chain reaction
- scCGI, methylated CGIs at single cell level
|
28
|
Jaakkola MK, Elo LL. Estimating cell type-specific differential expression using deconvolution. Brief Bioinform 2021; 23:6396788. [PMID: 34651640 PMCID: PMC8769698 DOI: 10.1093/bib/bbab433] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2021] [Revised: 09/17/2021] [Accepted: 09/23/2021] [Indexed: 12/02/2022] Open
Affiliation(s)
- Maria K Jaakkola
- Department of Mathematics and Statistics, University of Turku, Yliopistonmäki, 20014, Turku, Finland
- Laura L Elo
- Turku Bioscience Centre, University of Turku and Åbo Akademi University, Tykistökatu 6, FI-20520, Turku, Finland; Institute of Biomedicine, University of Turku, Kiinamyllynkatu 10, FI-20520, Turku, Finland
|
29
|
Guo Z, Shafik AM, Jin P, Wu Z, Wu H. Detecting m6A methylation regions from Methylated RNA Immunoprecipitation Sequencing. Bioinformatics 2021; 37:2818-2824. [PMID: 33724304 PMCID: PMC9991887 DOI: 10.1093/bioinformatics/btab181] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 10/08/2020] [Revised: 02/16/2021] [Accepted: 03/12/2021] [Indexed: 02/02/2023] Open
Abstract
MOTIVATION The post-transcriptional epigenetic modification on mRNA is an emerging field to study the gene regulatory mechanism and their association with diseases. Recently developed high-throughput sequencing technology named Methylated RNA Immunoprecipitation Sequencing (MeRIP-seq) enables one to profile mRNA epigenetic modification transcriptome wide. A few computational methods are available to identify transcriptome-wide mRNA modification, but they are either limited by over-simplified model ignoring the biological variance across replicates or suffer from low accuracy and efficiency. RESULTS In this work, we develop a novel statistical method, based on an empirical Bayesian hierarchical model, to identify mRNA epigenetic modification regions from MeRIP-seq data. Our method accounts for various sources of variations in the data through rigorous modeling and applies shrinkage estimation by borrowing information from transcriptome-wide data to stabilize the parameter estimation. Simulation and real data analyses demonstrate that our method is more accurate, robust and efficient than the existing peak calling methods. AVAILABILITY AND IMPLEMENTATION Our method TRES is implemented as an R package and is freely available on Github at https://github.com/ZhenxingGuo0015/TRES. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Zhenxing Guo
- Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA 30322, USA
- Andrew M Shafik
- Department of Human Genetics, Emory University, Atlanta, GA 30322, USA
- Peng Jin
- Department of Human Genetics, Emory University, Atlanta, GA 30322, USA
- Zhijin Wu
- Department of Biostatistics, Brown University, Providence, RI 02806, USA
- Hao Wu
- Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA 30322, USA
|
30
|
Zhang W, Wu H, Li Z. Complete deconvolution of DNA methylation signals from complex tissues: a geometric approach. Bioinformatics 2021; 37:1052-1059. [PMID: 33135072 PMCID: PMC8150138 DOI: 10.1093/bioinformatics/btaa930] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/19/2020] [Revised: 10/16/2020] [Accepted: 10/21/2020] [Indexed: 12/16/2022] Open
Abstract
MOTIVATION It is common practice in epigenetics research to profile DNA methylation on tissue samples, which are usually mixtures of different cell types. To properly account for the mixture, estimating cell compositions has been recognized as an important first step. Many methods were developed for quantifying cell compositions from DNA methylation data, but they mostly have limited applications due to lack of reference or prior information. RESULTS We develop Tsisal, a novel complete deconvolution method which accurately estimates cell compositions from DNA methylation data without any prior knowledge of cell types or their proportions. Tsisal is a full pipeline to estimate the number of cell types, estimate cell compositions and identify cell-type-specific CpG sites. It can also assign cell type labels when a full or partial reference panel is available. Extensive simulation studies and analyses of seven real datasets demonstrate the favorable performance of our proposed method compared with existing deconvolution methods serving a similar purpose. AVAILABILITY AND IMPLEMENTATION The proposed method Tsisal is implemented as part of the R/Bioconductor package TOAST at https://bioconductor.org/packages/TOAST. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Weiwei Zhang
- School of Science, East China University of Technology, Nanchang, Jiangxi 330013, China
- Hao Wu
- Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA 30322, USA
- Ziyi Li
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
|
31
|
Bhattacharya A, Hamilton AM, Troester MA, Love MI. DeCompress: tissue compartment deconvolution of targeted mRNA expression panels using compressed sensing. Nucleic Acids Res 2021; 49:e48. [PMID: 33524140 PMCID: PMC8096278 DOI: 10.1093/nar/gkab031] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/14/2020] [Revised: 12/21/2020] [Accepted: 01/12/2021] [Indexed: 12/13/2022] Open
Abstract
Targeted mRNA expression panels, measuring up to 800 genes, are used in academic and clinical settings due to low cost and high sensitivity for archived samples. Most samples assayed on targeted panels originate from bulk tissue comprised of many cell types, and cell-type heterogeneity confounds biological signals. Reference-free methods are used when cell-type-specific expression references are unavailable, but limited feature spaces render implementation challenging in targeted panels. Here, we present DeCompress, a semi-reference-free deconvolution method for targeted panels. DeCompress leverages a reference RNA-seq or microarray dataset from similar tissue to expand the feature space of targeted panels using compressed sensing. Ensemble reference-free deconvolution is performed on this artificially expanded dataset to estimate cell-type proportions and gene signatures. In simulated mixtures, four public cell line mixtures, and a targeted panel (1199 samples; 406 genes) from the Carolina Breast Cancer Study, DeCompress recapitulates cell-type proportions with less error than reference-free methods and finds biologically relevant compartments. We integrate compartment estimates into cis-eQTL mapping in breast cancer, identifying a tumor-specific cis-eQTL for CCR3 (C-C Motif Chemokine Receptor 3) at a risk locus. DeCompress improves upon reference-free methods without requiring expression profiles from pure cell populations, with applications in genomic analyses and clinical settings.
Affiliation(s)
- Arjun Bhattacharya
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California-Los Angeles, Los Angeles, CA 90095, USA
- Alina M Hamilton
- Department of Pathology and Laboratory Medicine, University of North Carolina-Chapel Hill, Chapel Hill, NC 27516, USA
- Melissa A Troester
- Department of Pathology and Laboratory Medicine, University of North Carolina-Chapel Hill, Chapel Hill, NC 27516, USA
- Department of Epidemiology, University of North Carolina-Chapel Hill, Chapel Hill, NC 27516, USA
- Michael I Love
- Department of Biostatistics, University of North Carolina-Chapel Hill, Chapel Hill, NC 27516, USA
- Department of Genetics, University of North Carolina-Chapel Hill, Chapel Hill, NC 27516, USA
|
32
|
Bayesian Joint Modeling of Single-Cell Expression Data and Bulk Spatial Transcriptomic Data. STATISTICS IN BIOSCIENCES 2021. [DOI: 10.1007/s12561-021-09308-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
|
33
|
He L, Liu L, Li T, Zhuang D, Dai J, Wang B, Bi L. Exploring the Imbalance of Periodontitis Immune System From the Cellular to Molecular Level. Front Genet 2021; 12:653209. [PMID: 33841510 PMCID: PMC8033214 DOI: 10.3389/fgene.2021.653209] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 01/14/2021] [Accepted: 03/08/2021] [Indexed: 01/22/2023] Open
Abstract
Periodontitis is a common chronic inflammatory disease of periodontal tissue, occurring mostly in people over 30 years old. Statistics show that compared with foreign countries, the prevalence of periodontitis in China is as high as 40%, and the prevalence of periodontal disease is more than 90%, which warrants close attention. Diagnosis and treatment of periodontitis currently rely mainly on clinical criteria, and etiologic criteria remain relatively unexplored. We therefore explored the pathogenesis of periodontitis from the perspective of immune imbalance. By predicting the fractions of 22 immune cell types in periodontitis tissues and comparing them with normal tissues, we found that infiltration of multiple immune cell types was inhibited in periodontitis tissues and that this feature can clearly distinguish periodontitis from normal tissues. Further, a protein-protein interaction (PPI) network and a transcription regulation network were constructed based on differentially expressed genes (DEGs) to explore the interaction function modules and regulation pathways. Three functional modules were revealed, and top TFs such as EGR1 and ETS1 were shown to regulate the expression of periodontitis-related immune genes that play an important role in the formation of the immunosuppressive microenvironment. A classifier was also used to verify the reliability of the periodontitis features obtained at the cellular and molecular levels. In conclusion, we have revealed the immune microenvironment and molecular characteristics of periodontitis, which will help to better understand the mechanism of periodontitis and its application in clinical diagnosis and treatment.
Affiliation(s)
- Longfei He: Department of Stomatology, The Fourth Affiliated Hospital of Harbin Medical University, Harbin, China; Department of Stomatology, Weifang People's Hospital, Weifang, China
- Lijuan Liu: Department of Stomatology, Weifang People's Hospital, Weifang, China
- Ti Li: Department of Stomatology, Weifang People's Hospital, Weifang, China
- Deshu Zhuang: Department of Stomatology, The Fourth Affiliated Hospital of Harbin Medical University, Harbin, China; Department of Oral Biological and Medical Sciences, Faculty of Dentistry, University of British Columbia, Vancouver, BC, Canada
- Jiayin Dai: Department of Stomatology, The Fourth Affiliated Hospital of Harbin Medical University, Harbin, China
- Bo Wang: Department of Stomatology, The Fourth Affiliated Hospital of Harbin Medical University, Harbin, China
- Liangjia Bi: Department of Stomatology, The Fourth Affiliated Hospital of Harbin Medical University, Harbin, China
|
34
|
EMeth: An EM algorithm for cell type decomposition based on DNA methylation data. Sci Rep 2021; 11:5717. [PMID: 33707472 PMCID: PMC7952399 DOI: 10.1038/s41598-021-84864-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/15/2020] [Accepted: 02/22/2021] [Indexed: 12/31/2022] Open
Abstract
We introduce a new computational method named EMeth to estimate cell type proportions using DNA methylation data. EMeth is a reference-based method that requires cell type-specific DNA methylation data from relevant cell types. EMeth improves on the existing reference-based methods by detecting the CpGs whose DNA methylation is inconsistent with the deconvolution model and reducing their contributions to cell type decomposition. Another novel feature of EMeth is that it allows a cell type with known proportions but unknown reference and estimates its methylation. This is motivated by the study of methylation in tumor cells, where bulk tumor samples include tumor cells as well as other cell types such as infiltrating immune cells, and the tumor cell proportion can be estimated from copy number data. We demonstrate that EMeth delivers more accurate estimates of cell type proportions than several other methods using simulated data and in silico mixtures. Applications in cancer studies show that the proportions of T regulatory cells estimated from DNA methylation have the expected associations with mutation load and survival time, while the estimates from gene expression miss such associations.
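The idea of reducing the contribution of CpGs that disagree with the deconvolution model can be illustrated with a small sketch. This is not the EMeth algorithm: it swaps in iteratively reweighted least squares with Cauchy-style weights for a two-cell-type mixture, and the function names, the `scale` parameter and the toy methylation values are all hypothetical.

```python
def estimate_proportion(bulk, ref_a, ref_b, weights=None):
    """Weighted least-squares fraction of cell type A in a two-type mixture."""
    if weights is None:
        weights = [1.0] * len(bulk)
    num = sum(w * (y - b) * (a - b)
              for w, y, a, b in zip(weights, bulk, ref_a, ref_b))
    den = sum(w * (a - b) ** 2 for w, a, b in zip(weights, ref_a, ref_b))
    return min(1.0, max(0.0, num / den))


def robust_proportion(bulk, ref_a, ref_b, scale=0.1, n_iter=20):
    """Re-estimate the fraction while down-weighting CpGs with large residuals."""
    p = estimate_proportion(bulk, ref_a, ref_b)
    for _ in range(n_iter):
        # Residual of each CpG under the linear mixing model.
        residuals = [y - (p * a + (1.0 - p) * b)
                     for y, a, b in zip(bulk, ref_a, ref_b)]
        # Cauchy-style weights: CpGs that fit poorly contribute less.
        weights = [1.0 / (1.0 + (r / scale) ** 2) for r in residuals]
        p = estimate_proportion(bulk, ref_a, ref_b, weights)
    return p


# Toy reference methylation levels for six CpGs (hypothetical values).
ref_a = [0.9, 0.1, 0.8, 0.2, 0.7, 0.3]
ref_b = [0.1, 0.9, 0.2, 0.8, 0.3, 0.6]
true_fraction = 0.6

# Bulk sample following the model, with one CpG corrupted on purpose.
bulk = [true_fraction * a + (1.0 - true_fraction) * b
        for a, b in zip(ref_a, ref_b)]
bulk[0] = 0.1

plain = estimate_proportion(bulk, ref_a, ref_b)
robust = robust_proportion(bulk, ref_a, ref_b)
```

In this toy case the unweighted estimate is pulled well away from the true fraction by the single corrupted CpG, while the reweighted estimate stays close to it.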
|
35
|
Maternal DNA Methylation During Pregnancy: a Review. Reprod Sci 2021; 28:2758-2769. [PMID: 33469876 DOI: 10.1007/s43032-020-00456-4] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 06/29/2020] [Accepted: 12/29/2020] [Indexed: 12/19/2022]
Abstract
Multiple environmental, behavioral, and hereditary factors affect pregnancy. Recent studies suggest that epigenetic modifications, such as DNA methylation (DNAm), affect both maternal and fetal health during gestation. Some pregnancy-related risk factors can influence maternal DNAm, thus predisposing both the mother and the neonate to clinical adversities with long-lasting consequences. DNAm alterations in promoter and enhancer regions modulate gene expression changes that play a vital physiological role. In this review, we discuss recent advances in our understanding of maternal DNA methylation changes during pregnancy and its associated complications such as gestational diabetes and anemia, adverse pregnancy outcomes like preterm birth, and preeclampsia. We also highlight some major gaps and limitations in the area which, if addressed, might improve our understanding of pregnancy and its associated adverse clinical conditions, ultimately leading to healthy pregnancies and a reduced public health burden.
|
36
|
Chen Z, Wu A. Progress and challenge for computational quantification of tissue immune cells. Brief Bioinform 2021; 22:6065002. [PMID: 33401306 DOI: 10.1093/bib/bbaa358] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/28/2020] [Revised: 10/23/2020] [Accepted: 11/07/2020] [Indexed: 12/28/2022] Open
Abstract
Tissue immune cells have long been recognized as important regulators for the maintenance of balance in the body system. Quantifying the abundance of different immune cells will improve understanding of the correlation between immune cells and normal or abnormal conditions. Many computational methods for predicting tissue immune cell composition from bulk transcriptomes have now been developed, so a summary of their advantages and disadvantages is appropriate. In addition, an examination of the challenges and possible solutions for these computational models will assist the development of this field. The common hypothesis of these models is that the expression of signature genes for immune cell types might represent the proportion of immune cells that contribute to the tissue transcriptome. We group all reported tools into three categories: reference-free, reference-based scoring, and reference-based deconvolution methods. In this review, we summarize all currently reported computational immune cell quantification tools together with their applications, limitations, and perspectives. Furthermore, we identify several critical problems that have limited the performance and application of these models, including inadequate immune cell type coverage, the collinearity problem, the impact of the tissue environment on immune cell expression levels, and the lack of standard datasets for model validation. To address these issues, we propose tissue-specific training datasets that include all known immune cells, a hierarchical computational framework, and benchmark datasets that include both tissue expression profiles and the abundances of all the immune cells, to further promote the development of this field.
Affiliation(s)
- Ziyi Chen: Suzhou Institute of Systems Medicine, Center for Systems Medicine, Chinese Academy of Medical Sciences & Peking Union Medical College, Jiangsu, Suzhou, China
- Aiping Wu: Suzhou Institute of Systems Medicine, Center for Systems Medicine, Chinese Academy of Medical Sciences & Peking Union Medical College, Jiangsu, Suzhou, China
|
37
|
Li Z, Guo Z, Cheng Y, Jin P, Wu H. Robust partial reference-free cell composition estimation from tissue expression. Bioinformatics 2020; 36:3431-3438. [PMID: 32167531 DOI: 10.1093/bioinformatics/btaa184] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 10/22/2019] [Revised: 03/05/2020] [Accepted: 03/10/2020] [Indexed: 12/13/2022] Open
Abstract
MOTIVATION In the analysis of high-throughput omics data from tissue samples, estimating and accounting for cell composition have been recognized as important steps. High cost, intensive labor requirements and technical limitations hinder cell composition quantification using cell-sorting or single-cell technologies. Computational methods for cell composition estimation are available, but they are either limited by the availability of a reference panel or suffer from low accuracy. RESULTS We introduce TOols for the Analysis of heterogeneouS Tissues (TOAST) TOAST/-P and TOAST/+P, two partial reference-free algorithms for estimating the cell composition of heterogeneous tissues based on their gene expression profiles. TOAST/-P and TOAST/+P incorporate additional biological information, including cell-type-specific markers and prior knowledge of compositions, in the estimation procedure. Extensive simulation studies and real data analyses demonstrate that the proposed methods provide more accurate and robust cell composition estimation than existing methods. AVAILABILITY AND IMPLEMENTATION The proposed methods TOAST/-P and TOAST/+P are implemented as part of the R/Bioconductor package TOAST at https://bioconductor.org/packages/TOAST. CONTACT ziyi.li@emory.edu or hao.wu@emory.edu. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ziyi Li: Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA 30322, USA
- Zhenxing Guo: Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA 30322, USA
- Ying Cheng: Institute of Biomedical Research, Yunnan University, Kunming, China
- Peng Jin: Department of Human Genetics, Emory University, Atlanta, GA 30322, USA
- Hao Wu: Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA 30322, USA
|
38
|
Fan F, Chen D, Zhao Y, Wang H, Sun H, Sun K. Rapid preliminary purity evaluation of tumor biopsies using deep learning approach. Comput Struct Biotechnol J 2020; 18:1746-1753. [PMID: 32695267 PMCID: PMC7352054 DOI: 10.1016/j.csbj.2020.06.007] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/28/2020] [Revised: 05/18/2020] [Accepted: 06/05/2020] [Indexed: 12/29/2022] Open
Abstract
Tumor biopsy is one of the most widely used materials in cancer diagnoses and molecular studies, where the purity of the biopsies (i.e., the proportion of cells that are cancerous) is crucial for both applications. However, conventional approaches for tumor biopsy purity evaluation require experienced pathologists and/or various materials/experiments and are therefore time-consuming and error-prone. Rapid, easy-to-perform and cost-effective methods are thus still in demand. Recent studies have demonstrated that molecular signatures are informative for this task. Previously, we developed GeneCT, a deep learning-based cancerous status and tissue-of-origin classifier for pan-tumor/tissue biopsies. In the current work, we applied GeneCT to datasets collected from various groups, where the experimental protocols and cancer types differed from each other. We found that GeneCT showed high accuracy on most datasets; for samples with unexpected results, in-depth investigations suggested that they might suffer from imperfect purity. In silico mixture experiments further showed that GeneCT classification was highly indicative of the purity of the tumor biopsies. Considering that transcriptome profiling is a common and inexpensive experiment in molecular cancer studies, our deep learning-based GeneCT could thus serve as a valuable tool for rapid, preliminary tumor biopsy purity assessment.
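The in silico mixture experiment mentioned in this abstract can be illustrated with a minimal sketch. It is not GeneCT and does not reproduce the authors' protocol; it only assumes a linear mixing model in which a biopsy profile is a purity-weighted average of a tumor profile and a normal profile. The function names and toy expression values are hypothetical.

```python
def mix_profiles(tumor, normal, purity):
    """Linear in silico mixture: purity * tumor + (1 - purity) * normal."""
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be between 0 and 1")
    return [purity * t + (1.0 - purity) * n for t, n in zip(tumor, normal)]


def estimate_purity(mixture, tumor, normal):
    """Least-squares purity under the same linear model, clipped to [0, 1]."""
    diff = [t - n for t, n in zip(tumor, normal)]
    num = sum((m - n) * d for m, n, d in zip(mixture, normal, diff))
    den = sum(d * d for d in diff)
    if den == 0:
        raise ValueError("tumor and normal profiles are identical")
    return min(1.0, max(0.0, num / den))


# Toy expression values for four genes (hypothetical numbers).
tumor_profile = [12.0, 3.0, 8.0, 1.0]
normal_profile = [2.0, 9.0, 8.0, 5.0]

# Simulate a biopsy with 70% tumor content, then recover the purity.
mixture = mix_profiles(tumor_profile, normal_profile, purity=0.7)
estimated = estimate_purity(mixture, tumor_profile, normal_profile)
```

Because the simulated mixture follows the model exactly, the recovered purity matches the simulated one up to floating-point error; real biopsies would add noise and profile mismatch.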
Affiliation(s)
- Fei Fan: Department of Neurosurgery, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430022, China
- Dan Chen: The Third Affiliated Hospital (Provisional) of The Chinese University of Hong Kong, Shenzhen, Shenzhen 518172, China
- Yu Zhao: Li Ka Shing Institute of Health Sciences, The Chinese University of Hong Kong, Hong Kong SAR 999077, China
- Huating Wang: Li Ka Shing Institute of Health Sciences, The Chinese University of Hong Kong, Hong Kong SAR 999077, China; Department of Orthopaedics and Traumatology, The Chinese University of Hong Kong, Hong Kong SAR 999077, China
- Hao Sun: Li Ka Shing Institute of Health Sciences, The Chinese University of Hong Kong, Hong Kong SAR 999077, China; Department of Chemical Pathology, The Chinese University of Hong Kong, Hong Kong SAR 999077, China
- Kun Sun: Shenzhen Bay Laboratory, Shenzhen 518132, China
|
39
|
Danziger SA, Gibbs DL, Shmulevich I, McConnell M, Trotter MWB, Schmitz F, Reiss DJ, Ratushny AV. ADAPTS: Automated deconvolution augmentation of profiles for tissue specific cells. PLoS One 2019; 14:e0224693. [PMID: 31743345 PMCID: PMC6863530 DOI: 10.1371/journal.pone.0224693] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 07/11/2019] [Accepted: 10/18/2019] [Indexed: 12/19/2022] Open
Abstract
Immune cell infiltration of tumors and the tumor microenvironment can be an important component for determining patient outcomes. For example, immune and stromal cell presence inferred by deconvolving patient gene expression data may help identify high-risk patients or suggest a course of treatment. One particularly powerful family of deconvolution techniques uses signature matrices of genes that uniquely identify each cell type, as determined from gene expression data of purified single cell types. Many methods from this family have been recently published, often including new signature matrices appropriate for a single purpose, such as investigating a specific type of tumor. The package ADAPTS helps users make the most of this expanding knowledge base by introducing a framework for cell type deconvolution. ADAPTS implements modular tools for customizing signature matrices for new tissue types by adding custom cell types or building new matrices de novo, including from single cell RNAseq data. It includes a common interface to several popular deconvolution algorithms that use a signature matrix to estimate the proportion of cell types present in heterogeneous samples. ADAPTS also implements a novel method for clustering cell types into groups that are difficult to distinguish by deconvolution and then re-splitting those clusters using hierarchical deconvolution. We demonstrate that the techniques implemented in ADAPTS improve the ability to reconstruct the cell types present in a single cell RNAseq data set in a blind predictive analysis. ADAPTS is currently available for use in R on CRAN and GitHub.
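Signature-matrix deconvolution, as described in this abstract, amounts to explaining a bulk expression vector as a non-negative combination of per-cell-type signature columns. The sketch below is a generic illustration and does not use the ADAPTS API: it assumes a toy signature matrix and solves the non-negative least-squares problem with a simple multiplicative-update rule, which is only valid when all inputs are non-negative. All names and values are hypothetical.

```python
def deconvolve(signature, bulk, n_iter=2000):
    """Estimate cell type fractions for one bulk sample.

    signature: list of genes, each a list of non-negative expression values,
               one per cell type.
    bulk:      list of non-negative bulk expression values, one per gene.
    """
    n_genes = len(signature)
    n_types = len(signature[0])
    # Precompute S^T b and S^T S once; they do not change across iterations.
    stb = [sum(signature[g][k] * bulk[g] for g in range(n_genes))
           for k in range(n_types)]
    sts = [[sum(signature[g][j] * signature[g][k] for g in range(n_genes))
            for k in range(n_types)] for j in range(n_types)]
    frac = [1.0 / n_types] * n_types
    for _ in range(n_iter):
        denom = [sum(sts[j][k] * frac[k] for k in range(n_types))
                 for j in range(n_types)]
        # Multiplicative update keeps every coefficient non-negative.
        frac = [frac[j] * stb[j] / denom[j] if denom[j] > 0 else 0.0
                for j in range(n_types)]
    total = sum(frac)
    # Rescale so the coefficients can be read as proportions.
    return [f / total for f in frac]


# Toy signature matrix: 4 genes x 3 cell types (hypothetical values).
signature = [
    [10.0, 1.0, 1.0],
    [1.0, 8.0, 1.0],
    [1.0, 1.0, 12.0],
    [5.0, 5.0, 5.0],
]
true_fractions = [0.5, 0.3, 0.2]

# Noise-free bulk sample built from the signature and the true fractions.
bulk = [sum(s * f for s, f in zip(row, true_fractions)) for row in signature]

estimated = deconvolve(signature, bulk)
```

With a noise-free mixture and a well-conditioned signature matrix the estimate converges to the true fractions; cell types with nearly collinear signatures would be much harder to separate, which is the situation the clustering and hierarchical deconvolution in the abstract are aimed at.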
Affiliation(s)
- Samuel A. Danziger (corresponding author): Celgene Corporation, Seattle, Washington, United States of America
- David L. Gibbs: Institute for Systems Biology, Seattle, Washington, United States of America
- Ilya Shmulevich: Institute for Systems Biology, Seattle, Washington, United States of America
- Mark McConnell: Celgene Corporation, Seattle, Washington, United States of America
- Matthew W. B. Trotter: Celgene Corporation, Seattle, Washington, United States of America; Celgene Institute for Translational Research Europe, Seville, Sevilla, Spain
- Frank Schmitz: Celgene Corporation, Seattle, Washington, United States of America
- David J. Reiss: Celgene Corporation, Seattle, Washington, United States of America
- Alexander V. Ratushny (corresponding author): Celgene Corporation, Seattle, Washington, United States of America
|