1
Characterization and Optimization of Multiomic Single-Cell Epigenomic Profiling. Genes (Basel) 2023; 14:1245. [PMID: 37372428 PMCID: PMC10297939 DOI: 10.3390/genes14061245] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2023] [Revised: 06/01/2023] [Accepted: 06/07/2023] [Indexed: 06/29/2023]
Abstract
The snATAC + snRNA platform allows epigenomic profiling of open chromatin and gene expression with single-cell resolution. The most critical assay step is isolating high-quality nuclei for droplet-based single-nucleus isolation and barcoding. With the increasing popularity of multiomic profiling in various fields, there is a need for optimized and reliable nuclei isolation methods, mainly for human tissue samples. Herein we compared different nuclei isolation methods for cell suspensions, such as peripheral blood mononuclear cells (PBMC, n = 18), and for a solid tumor type, ovarian cancer (OC, n = 18), derived from debulking surgery. Nuclei morphology and sequencing output parameters were used to evaluate the quality of preparation. Our results show that NP-40 detergent-based nuclei isolation yields better sequencing results than collagenase tissue dissociation for OC, significantly impacting cell type identification and analysis. Given the utility of applying such techniques to frozen samples, we also tested frozen preparation and digestion (n = 6). A paired comparison between frozen and fresh samples validated the quality of both specimens. Finally, we demonstrate the reproducibility of the scRNA and snATAC + snRNA platforms by comparing the gene expression profiling of PBMC. Our results highlight how the choice of nuclei isolation methods is critical for obtaining quality data in multiomic assays. They also show that the measurement of expression between scRNA and snRNA is comparable and effective for cell type identification.
2
Abstract 806: Enhancer deregulation in TET2-mutant clonal hematopoiesis is associated with increased COVID-19 severity and mortality. Cancer Res 2023. [DOI: 10.1158/1538-7445.am2023-806] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/07/2023]
Abstract
Background: COVID-19 causes significant morbidity and mortality, albeit with considerable heterogeneity among affected individuals. It remains unclear which host factors determine disease severity and survival. Given the propensity of clonal hematopoiesis (CH) to promote inflammation in healthy individuals, we investigated its effect on COVID-19 outcomes.
Methods: We performed a multi-omics interrogation of the genome, epigenome, transcriptome, and proteome of peripheral blood mononuclear cells from COVID-19 patients (n=227). We obtained clinical data, laboratory studies, and survival outcomes. We determined CH status and TET2-related DNA methylation. We performed single-cell proteogenomics to understand clonal composition in relation to cell phenotype. We interrogated single-cell gene expression in isolation and in conjunction with DNA accessibility. We integrated these multi-omics data to understand the effect of CH on clonal composition, gene expression, methylation of cis-regulatory elements, and lineage commitment in COVID-19 patients. We performed shRNA knockdowns to validate the effect of one candidate transcription factor in myeloid cell lines.
Results: The presence of CH was strongly associated with COVID-19 severity and all-cause mortality, independent of age (HR 3.48, 95% CI 1.45-8.36, p=0.005). Differential methylation of promoters and enhancers was prevalent in TET2-mutant, but not DNMT3A-mutant CH. TET2-mutant CH was associated with enhanced classical/intermediate monocytosis and single-cell proteogenomics confirmed an enrichment of TET2 mutations in these cell types. We identified cell-type specific gene expression changes associated with TET2 mutations in 102,072 single cells (n=34). Single-cell RNA-seq confirmed the skewing of hematopoiesis towards classical and intermediate monocytes and demonstrated the downregulation of EGR1 (a transcription factor important for monocyte differentiation) along with up-regulation of the lncRNA MALAT1 in monocytes. Combined scRNA-/scATAC-seq in 43,160 single cells (n=18) confirmed the skewing of hematopoiesis and up-regulation of MALAT1 in monocytes along with decreased accessibility of EGR1 motifs in known cis-regulatory elements. Using myeloid cell lines for functional validation, shRNA knockdowns of EGR1 confirmed the up-regulation of MALAT1 (in comparison to wildtype controls).
Conclusions: CH is an independent prognostic factor in COVID-19 and skews hematopoiesis towards monocytosis. TET2-mutant CH is characterized by differential methylation and accessibility of enhancers binding myeloid transcription factors, including EGR1. The ensuing loss of EGR1 expression in monocytes causes overexpression of MALAT1, a factor known to promote monocyte differentiation and inflammation. These data provide mechanistic insight into the adverse prognostic impact of CH in COVID-19.
Citation Format: Moritz Binder, Terra L. Lasho, Wazim Mohammed Ismail, Nana A. Ben-Crentsil, Jenna A. Fernandez, Minsuk Kim, Susan M. Geyer, Amelia Mazzone, Christy M. Finke, Abhishek A. Mangaonkar, Jeong-Heon Lee, Kwan Hyun Kim, Vernadette A. Simon, Fariborz Rakhshan Rohakthar, Amik Munankarmy, Susan M. Schwager, Jonathan J. Harrington, Melissa R. Snyder, Nathalie M. Droin, Eric Solary, Keith D. Robertson, Eric D. Wieben, Eric Padron, Nicholas Chia, Alexandre Gaspar-Maia, Mrinal M. Patnaik. Enhancer deregulation in TET2-mutant clonal hematopoiesis is associated with increased COVID-19 severity and mortality [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2023; Part 1 (Regular and Invited Abstracts); 2023 Apr 14-19; Orlando, FL. Philadelphia (PA): AACR; Cancer Res 2023;83(7_Suppl):Abstract nr 806.
3
MacroH2A histone variants modulate enhancer activity to repress oncogenic programs and cellular reprogramming. Commun Biol 2023; 6:215. [PMID: 36823213 PMCID: PMC9950461 DOI: 10.1038/s42003-023-04571-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2022] [Accepted: 02/09/2023] [Indexed: 02/25/2023]
Abstract
Considerable efforts have been made to characterize active enhancer elements, which can be annotated by accessible chromatin and H3 lysine 27 acetylation (H3K27ac). However, apart from poised enhancers that are observed in early stages of development and putative silencers, the functional significance of cis-regulatory elements lacking H3K27ac is poorly understood. Here we show that macroH2A histone variants mark a subset of enhancers in normal and cancer cells, which we term 'macro-Bound Enhancers', and that they modulate enhancer activity. We find macroH2A variants localized at enhancer elements that are devoid of H3K27ac in a cell type-specific manner, indicating a role for macroH2A at inactive enhancers to maintain cell identity. Accordingly, reactivation of macro-bound enhancers is associated with oncogenic programs in breast cancer, and their repressive role is correlated with the activity of macroH2A2 as a negative regulator of BRD4 chromatin occupancy. Finally, through single-cell epigenomic profiling of normal mammary stem cells derived from mice, we show that macroH2A deficiency facilitates increased activity of transcription factors associated with stem cell activity.
4
New complexities of SOS-induced "untargeted" mutagenesis in Escherichia coli as revealed by mutation accumulation and whole-genome sequencing. DNA Repair (Amst) 2020; 90:102852. [PMID: 32388005 DOI: 10.1016/j.dnarep.2020.102852] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 01/16/2020] [Revised: 03/19/2020] [Accepted: 04/06/2020] [Indexed: 01/23/2023]
Abstract
When its DNA is damaged, Escherichia coli induces the SOS response, which consists of about 40 genes that encode activities to repair or tolerate the damage. Certain alleles of the major SOS-control genes, recA and lexA, cause constitutive expression of the response, resulting in an increase in spontaneous mutations. These mutations, historically called "untargeted", have been the subject of many previous studies. Here we re-examine SOS-induced mutagenesis using mutation accumulation followed by whole-genome sequencing (MA/WGS), which allows a detailed picture of the types of mutations induced as well as their sequence-specificity. Our results confirm previous findings that SOS expression specifically induces transversion base-pair substitutions, with rates averaging about 60-fold above wild-type levels. Surprisingly, the rates of G:C to C:G transversions, normally an extremely rare mutation, were induced an average of 160-fold above wild-type levels. The SOS-induced transversions showed strong sequence specificity, the most extreme of which was the G:C to C:G transversions, 60% of which occurred at the middle base of 5'GGC3'+5'GCC3' sites, although these sites represent only 8% of the G:C base pairs in the genome. SOS-induced transversions were also DNA strand-biased, occurring, on average, 2 to 4 times more often when the purine was on the leading-strand template and the pyrimidine on the lagging-strand template than in the opposite orientation. However, the strand bias was also sequence specific, and even of reverse orientation at some sites. By eliminating constraints on the mutations that can be recovered, the MA/WGS protocol revealed new complexities of SOS "untargeted" mutations.
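The sequence specificity reported above can be made concrete with a little arithmetic. The following is a minimal sketch, not taken from the paper: it uses only the two figures quoted in the abstract (60% of G:C to C:G transversions at sites that make up 8% of G:C base pairs), and the function names are hypothetical.

```python
def fold_enrichment(mutation_share, site_share):
    """Share of mutations at a site class divided by that class's share of eligible sites."""
    if not 0 < site_share <= 1:
        raise ValueError("site_share must be in (0, 1]")
    return mutation_share / site_share


def relative_per_site_rate(mutation_share, site_share):
    """Per-site rate in the site class relative to the per-site rate at all other sites."""
    inside = mutation_share / site_share
    outside = (1 - mutation_share) / (1 - site_share)
    return inside / outside


# Figures quoted in the abstract: 60% of the transversions fall on 8% of G:C base pairs.
enrichment = fold_enrichment(0.60, 0.08)
relative_rate = relative_per_site_rate(0.60, 0.08)
print(f"enrichment over uniform expectation: {enrichment:.1f}-fold")
print(f"per-site rate relative to other G:C sites: {relative_rate:.2f}-fold")
```

Under these two numbers alone, the GGC/GCC sites carry about 7.5 times as many of these transversions as a uniform distribution over G:C base pairs would predict.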
5
Abstract
BACKGROUND Bacterial cells during many replication cycles accumulate spontaneous mutations, which result in the birth of novel clones. As a result of this clonal expansion, an evolving bacterial population has different clonal composition over time, as revealed in the long-term evolution experiments (LTEEs). Accurately inferring the haplotypes of novel clones as well as the clonal frequencies and the clonal evolutionary history in a bacterial population is useful for the characterization of the evolutionary pressure on multiple correlated mutations instead of that on individual mutations. RESULTS In this paper, we study the computational problem of reconstructing the haplotypes of bacterial clones from the variant allele frequencies observed from an evolving bacterial population at multiple time points. We formalize the problem using a maximum likelihood function, which is defined under the assumption that mutations occur spontaneously, and thus the likelihood of a mutation occurring in a specific clone is proportional to the frequency of the clone in the population when the mutation occurs. We develop a series of heuristic algorithms to address the maximum likelihood inference, and show through simulation experiments that the algorithms are fast and achieve near optimal accuracy that is practically plausible under the maximum likelihood framework. We also validate our method using experimental data obtained from a recent study on long-term evolution of Escherichia coli. CONCLUSION We developed efficient algorithms to reconstruct the clonal evolution history from time course genomic sequencing data. Our algorithm can also incorporate clonal sequencing data to improve the reconstruction results when they are available. Based on the evaluation on both simulated and experimental sequencing data, our algorithms can achieve satisfactory results on the genome sequencing data from long-term evolution experiments. 
AVAILABILITY The program (ClonalTREE) is available as open-source software on GitHub at https://github.com/COL-IU/ClonalTREE.
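The likelihood assumption stated in the abstract (a mutation is more likely to arise in a clone that is more frequent at the time) can be illustrated with a small sketch. This is a simplified reading of that one sentence and is not the ClonalTREE implementation: the function name, the input layout, and the choice to use the time point just before a clone is first observed are all assumptions made for illustration.

```python
import math


def tree_log_likelihood(freqs, parents):
    """Log-likelihood of a clonal tree under a frequency-proportional parent model.

    freqs[t][c] is the frequency of clone c at time point t.
    parents[c] is the index of the parent clone of c, or None for the founder.
    Assumption: the probability that clone c arose from parent p equals p's share
    of the population at the time point just before c is first observed.
    """
    total = 0.0
    for clone, parent in enumerate(parents):
        if parent is None:
            continue  # the founder clone has no parent term
        first_seen = next(
            (t for t, row in enumerate(freqs) if row[clone] > 0), None
        )
        if first_seen is None or first_seen == 0:
            raise ValueError(f"clone {clone} needs an earlier time point")
        previous = freqs[first_seen - 1]
        if previous[parent] <= 0:
            return float("-inf")  # parent did not exist yet: impossible tree
        total += math.log(previous[parent] / sum(previous))
    return total


# Three clones over three time points (made-up frequencies).
freqs = [
    [1.0, 0.0, 0.0],
    [0.7, 0.3, 0.0],
    [0.4, 0.4, 0.2],
]
ll_star = tree_log_likelihood(freqs, [None, 0, 0])   # clone 2 descends from clone 0
ll_chain = tree_log_likelihood(freqs, [None, 0, 1])  # clone 2 descends from clone 1
print(ll_star, ll_chain)
```

In this toy example the tree in which clone 2 descends from the more frequent clone 0 scores higher, which is the behaviour the stated assumption implies. A search over candidate parent assignments would sit on top of a scoring function like this one.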
6
Algorithmic approaches to clonal reconstruction in heterogeneous cell populations. Quantitative Biology 2019; 7:255-265. [PMID: 32431959 PMCID: PMC7236794 DOI: 10.1007/s40484-019-0188-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 06/02/2019] [Revised: 08/09/2019] [Accepted: 08/25/2019] [Indexed: 12/15/2022]
Abstract
BACKGROUND The reconstruction of clonal haplotypes and their evolutionary history in evolving populations is a common problem in both microbial evolutionary biology and cancer biology. The clonal theory of evolution provides a theoretical framework for modeling the evolution of clones. RESULTS In this paper, we review the theoretical framework and assumptions over which the clonal reconstruction problem is formulated. We formally define the problem and then discuss the complexity and solution space of the problem. Various methods have been proposed to find the phylogeny that best explains the observed data. We categorize these methods based on the type of input data that they use (space-resolved or time-resolved), and also based on their computational formulation as either combinatorial or probabilistic. It is crucial to understand the different types of input data because each provides essential but distinct information for drastically reducing the solution space of the clonal reconstruction problem. Complementary information provided by single-cell sequencing or from whole-genome sequencing of randomly isolated clones can also improve the accuracy of clonal reconstruction. We briefly review the existing algorithms and their relationships. Finally, we summarize the tools that have been developed for either directly solving the clonal reconstruction problem or a related computational problem. CONCLUSIONS In this review, we discuss the various formulations of the problem of inferring the clonal evolutionary history from allele frequency data, review existing algorithms, and categorize them according to their problem formulation and solution approaches. We note that most of the available clonal inference algorithms were developed for elucidating tumor evolution, whereas clonal reconstruction for unicellular genomes is less addressed. We conclude the review by discussing more open problems, such as the lack of benchmark datasets and of performance comparisons between available tools.
7
The sequencing and interpretation of the genome obtained from a Serbian individual. PLoS One 2018; 13:e0208901. [PMID: 30566479 PMCID: PMC6300249 DOI: 10.1371/journal.pone.0208901] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 04/24/2018] [Accepted: 11/26/2018] [Indexed: 02/07/2023]
Abstract
Recent genetic studies and whole-genome sequencing projects have greatly improved our understanding of human variation and clinically actionable genetic information. Smaller ethnic populations, however, remain underrepresented in both individual and large-scale sequencing efforts and hence present an opportunity to discover new variants of biomedical and demographic significance. This report describes the sequencing and analysis of a genome obtained from an individual of Serbian origin, introducing tens of thousands of previously unknown variants to the currently available pool. Ancestry analysis places this individual in close proximity to Central and Eastern European populations; i.e., closest to Croatian, Bulgarian and Hungarian individuals and, in terms of other Europeans, furthest from Ashkenazi Jewish, Spanish, Sicilian and Baltic individuals. Our analysis confirmed gene flow between Neanderthal and ancestral pan-European populations, with similar contributions to the Serbian genome as those observed in other European groups. Finally, to assess the burden of potentially disease-causing/clinically relevant variation in the sequenced genome, we utilized manually curated genotype-phenotype association databases and variant-effect predictors. We identified several variants that have previously been associated with severe early-onset disease that is not evident in the proband, as well as putatively impactful variants that could yet prove to be clinically relevant to the proband over the next decades. The presence of numerous private and low-frequency variants, along with the observed and predicted disease-causing mutations in this genome, exemplify some of the global challenges of genome interpretation, especially in the context of under-studied ethnic groups.
8
A Maximum-Likelihood Approach to Estimating the Insertion Frequencies of Transposable Elements from Population Sequencing Data. Mol Biol Evol 2018; 35:2560-2571. [PMID: 30099533 PMCID: PMC6188571 DOI: 10.1093/molbev/msy152] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/22/2022]
Abstract
Transposable elements (TEs) contribute to a large fraction of the expansion of many eukaryotic genomes due to the capability of TEs duplicating themselves through transposition. A first step to understanding the roles of TEs in a eukaryotic genome is to characterize the population-wide variation of TE insertions in the species. Here, we present a maximum-likelihood (ML) method for estimating allele frequencies and detecting selection on TE insertions in a diploid population, based on the genotypes at TE insertion sites detected in multiple individuals sampled from the population using paired-end (PE) sequencing reads. Tests of the method on simulated data show that it can accurately estimate the allele frequencies of TE insertions even when the PE sequencing is conducted at a relatively low coverage (=5X). The method can also detect TE insertions under strong selection, and the detection ability increases with sample size in a population, although a substantial fraction of actual TE insertions under selection may be undetected. Application of the ML method to genomic sequencing data collected from a natural Daphnia pulex population shows that, on the one hand, most (>90%) TE insertions present in the reference D. pulex genome are either fixed or nearly fixed (with allele frequencies >0.95); on the other hand, among the nonreference TE insertions (i.e., those detected in some individuals in the population but absent from the reference genome), the majority (>70%) are still at low frequencies (<0.1). Finally, we detected a substantial fraction (∼9%) of nonreference TE insertions under selection.
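As a point of reference for the allele-frequency estimation described above, here is a deliberately simplified sketch. It assumes that every individual's genotype at a TE insertion site is known without error and that the population is in Hardy-Weinberg equilibrium, in which case the maximum-likelihood estimate reduces to allele counting. The published method works from paired-end reads at low coverage and is more involved; the function name and the example counts below are hypothetical.

```python
import math


def insertion_allele_frequency(n_hom_insertion, n_het, n_hom_absence):
    """Allele-counting ML estimate of a TE insertion frequency in a diploid sample.

    Assumes error-free genotype calls and Hardy-Weinberg equilibrium.
    Returns the estimated frequency and its binomial standard error.
    """
    n_individuals = n_hom_insertion + n_het + n_hom_absence
    if n_individuals == 0:
        raise ValueError("at least one genotyped individual is required")
    n_alleles = 2 * n_individuals
    p_hat = (2 * n_hom_insertion + n_het) / n_alleles
    std_err = math.sqrt(p_hat * (1 - p_hat) / n_alleles)
    return p_hat, std_err


# Made-up counts: 2 homozygous for the insertion, 6 heterozygous, 42 without it.
p_hat, std_err = insertion_allele_frequency(2, 6, 42)
print(f"estimated insertion frequency: {p_hat:.3f} (SE {std_err:.3f})")
```

With uncertain genotypes, as in low-coverage data, the counts would be replaced by genotype likelihoods and the estimate found by numerical optimization rather than by this closed form.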
9
MGEScan: a Galaxy-based system for identifying retrotransposons in genomes. Bioinformatics 2016; 32:2502-4. [PMID: 27153595 DOI: 10.1093/bioinformatics/btw157] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 12/09/2015] [Accepted: 03/17/2016] [Indexed: 01/08/2023]
Abstract
MGEScan-long terminal repeat (LTR) and MGEScan-non-LTR are successfully used programs for identifying LTR and non-LTR retrotransposons in eukaryotic genome sequences. However, these programs are neither supported by easy-to-use interfaces nor well suited for data visualization in general data formats. Here, we present MGEScan, a user-friendly system that combines these two programs with a Galaxy workflow system accelerated with MPI and Python threading on compute clusters. MGEScan and Galaxy empower researchers to identify transposable elements in a graphical user interface with ready-to-use workflows. MGEScan also visualizes the custom annotation tracks for mobile genetic elements in public genome browsers. A maximum speed-up of 3.26× is attained for execution time using concurrent processing and MPI on four virtual cores. MGEScan provides four operational modes: as a command line tool, as a Galaxy Toolshed, on a Galaxy-based web server, and on a virtual cluster on the Amazon cloud. AVAILABILITY AND IMPLEMENTATION MGEScan tutorials and source code are available at http://mgescan.readthedocs.org/ CONTACT hatang@indiana.edu or syoh@ajou.ac.kr SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
10
Abstract
BACKGROUND Metatranscriptomic sequencing is a highly sensitive bioassay of functional activity in a microbial community, providing complementary information to the metagenomic sequencing of the community. The acquisition of the metatranscriptomic sequences will enable us to refine the annotations of the metagenomes, and to study the gene activities and their regulation in complex microbial communities and their dynamics. RESULTS In this paper, we present TransGeneScan, a software tool for finding genes in assembled transcripts from metatranscriptomic sequences. By incorporating several features of metatranscriptomic sequencing, including strand-specificity, short intergenic regions, and putative antisense transcripts, into a Hidden Markov Model, TransGeneScan can predict a sense transcript containing one or multiple genes (in an operon) or an antisense transcript. CONCLUSION We tested TransGeneScan on a mock metatranscriptomic data set containing three known bacterial genomes. The results showed that TransGeneScan performs better than metagenomic gene finders (MetaGeneMark and FragGeneScan) at predicting protein-coding genes in assembled transcripts, and achieves comparable or even higher accuracy than gene finders for microbial genomes (Glimmer and GeneMark). These results imply that, with the assistance of metatranscriptomic sequencing, we can obtain a broad and precise picture of the genes (and their functions) in a microbial community. AVAILABILITY TransGeneScan is available as open-source software on SourceForge at https://sourceforge.net/projects/transgenescan/.
11
Tuberculous lymphadenitis: skin delayed-type hypersensitivity reaction and cellular immune responses. West Afr J Med 2011; 30:193-196. [PMID: 22120485] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2023]
Abstract
BACKGROUND Tuberculous lymphadenitis (TL) is the commonest form of extra-pulmonary tuberculosis in tropical countries. OBJECTIVE This study aimed to characterize in vivo and in vitro cellular immune responses to Mycobacterium PPD in TL patients as markers of disease and healing. METHODS Following informed consent, 36 TL patients, 40 patients with pulmonary tuberculosis (TB) and 20 apparently healthy individuals were enrolled when they met specific selection criteria. The tuberculin skin test (TST) and peripheral blood mononuclear cell (PBMC) culture were conducted using PPD. The cytokines were measured using commercial kits. RESULTS The mean TST was 24.6 ±8.0 mm for TL patients. The TST was variable in pulmonary TB patients and healthy individuals. It was reactive in a third of pulmonary TB patients with a mean of 20 ±3.0 mm and reactive in half of the healthy individuals with a mean of 12.6 ±3.2 mm. Pre- and post-treatment interferon gamma (IFN-γ) mean levels were 498.6 ±905.8 pg/ml and 710.0 ±844.6 pg/ml respectively (p=0.0001) for TL patients, while IL-10 mean levels were 93.0 ±136.0 pg/ml and 32.4 ±31.7 pg/ml respectively (p=0.0001). TST-reactive pulmonary TB patients had significantly higher IFN-γ (851 ±234.4 pg/ml) compared to TBLNT patients (p=0.0001), while pulmonary TB patients had significantly lower IL-10 compared to TBLNT patients (p=0.0001). Apparently healthy individuals had significantly lower IFN-γ and IL-10 levels compared to TBLNT and pulmonary TB patients (p=0.003). CONCLUSION Strong TST reactivity and high IFN-γ and IL-10 levels are good surrogate markers of active TBLNT, while increasing IFN-γ levels and decreasing IL-10 levels mark healing. Tuberculin skin test reactivity, although a good diagnostic marker, does not disappear with treatment.