1
Bohrer CH, Fursova NA, Larson DR. Enhancers: A Focus on Synthetic Biology and Correlated Gene Expression. ACS Synth Biol 2024; 13:3093-3108. [PMID: 39276360 DOI: 10.1021/acssynbio.4c00244] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/17/2024]
Abstract
Enhancers are central for the regulation of metazoan transcription but have proven difficult to study, primarily due to a myriad of interdependent variables shaping their activity. Consequently, synthetic biology has emerged as the main approach for dissecting mechanisms of enhancer function. We start by reviewing simple but highly parallel reporter assays, which have been successful in quantifying the complexity of the activator/coactivator mechanisms at enhancers. We then describe studies that examine how enhancers function in the genomic context and in combination with other enhancers, revealing that they activate genes through a variety of different mechanisms, working together as a system. Here, we primarily focus on synthetic reporter genes that can quantify the dynamics of enhancer biology through time. We end by considering the consequences of having many genes and enhancers within a 'local environment', which we believe leads to correlated gene expression and likely reports on the general principles of enhancer biology.
Affiliation(s)
- Christopher H Bohrer
  - Laboratory of Receptor Biology and Gene Expression, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- Nadezda A Fursova
  - Laboratory of Receptor Biology and Gene Expression, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- Daniel R Larson
  - Laboratory of Receptor Biology and Gene Expression, National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
2
Kawasaki K, Fukaya T. Regulatory landscape of enhancer-mediated transcriptional activation. Trends Cell Biol 2024; 34:826-837. [PMID: 38355349 DOI: 10.1016/j.tcb.2024.01.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 12/21/2023] [Accepted: 01/22/2024] [Indexed: 02/16/2024]
Abstract
Enhancers are noncoding regulatory elements that instruct spatial and temporal specificity of gene transcription in response to a variety of intrinsic and extrinsic signals during development. Although it has long been postulated that enhancers physically interact with target promoters through the formation of stable loops, recent studies have changed this static view: sequence-specific transcription factors (TFs) and coactivators are dynamically recruited to enhancers and assemble so-called transcription hubs. Dynamic assembly of transcription hubs appears to serve as a key scaffold to integrate regulatory information encoded by surrounding genome and biophysical properties of transcription machineries. In this review, we outline emerging new models of transcriptional regulation by enhancers and discuss future perspectives.
Affiliation(s)
- Koji Kawasaki
  - Laboratory of Transcription Dynamics, Research Center for Biological Visualization, Institute for Quantitative Biosciences, The University of Tokyo, Bunkyo-ku, Tokyo 113-0032, Japan
- Takashi Fukaya
  - Laboratory of Transcription Dynamics, Research Center for Biological Visualization, Institute for Quantitative Biosciences, The University of Tokyo, Bunkyo-ku, Tokyo 113-0032, Japan
  - Department of Life Sciences, Graduate School of Arts and Sciences, The University of Tokyo, Bunkyo-ku, Tokyo 113-0032, Japan
3
Nabi IR, Cardoen B, Khater IM, Gao G, Wong TH, Hamarneh G. AI analysis of super-resolution microscopy: Biological discovery in the absence of ground truth. J Cell Biol 2024; 223:e202311073. [PMID: 38865088 PMCID: PMC11169916 DOI: 10.1083/jcb.202311073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2023] [Revised: 04/02/2024] [Accepted: 05/21/2024] [Indexed: 06/13/2024] Open
Abstract
Super-resolution microscopy, or nanoscopy, enables the use of fluorescent-based molecular localization tools to study molecular structure at the nanoscale level in the intact cell, bridging the mesoscale gap to classical structural biology methodologies. Analysis of super-resolution data by artificial intelligence (AI), such as machine learning, offers tremendous potential for the discovery of new biology, that, by definition, is not known and lacks ground truth. Herein, we describe the application of weakly supervised paradigms to super-resolution microscopy and its potential to enable the accelerated exploration of the nanoscale architecture of subcellular macromolecules and organelles.
Affiliation(s)
- Ivan R. Nabi
  - Department of Cellular and Physiological Sciences, Life Sciences Institute, University of British Columbia, Vancouver, Canada
  - School of Biomedical Engineering, University of British Columbia, Vancouver, Canada
- Ben Cardoen
  - School of Computing Science, Simon Fraser University, Burnaby, Canada
- Ismail M. Khater
  - School of Computing Science, Simon Fraser University, Burnaby, Canada
  - Department of Electrical and Computer Engineering, Faculty of Engineering and Technology, Birzeit University, Birzeit, Palestine
- Guang Gao
  - Department of Cellular and Physiological Sciences, Life Sciences Institute, University of British Columbia, Vancouver, Canada
- Timothy H. Wong
  - Department of Cellular and Physiological Sciences, Life Sciences Institute, University of British Columbia, Vancouver, Canada
- Ghassan Hamarneh
  - School of Computing Science, Simon Fraser University, Burnaby, Canada
4
Yu H, Wu D, Mishra S, Shen G, Sun H, Hu M, Li Y. SnapFISH-IMPUTE: an imputation method for multiplexed DNA FISH data. Commun Biol 2024; 7:834. [PMID: 38982263 PMCID: PMC11233503 DOI: 10.1038/s42003-024-06428-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2024] [Accepted: 06/07/2024] [Indexed: 07/11/2024] Open
Abstract
Chromatin spatial organization plays a crucial role in gene regulation. Recently developed and prospering multiplexed DNA FISH technologies enable direct visualization of chromatin conformation in the nucleus. However, incomplete data caused by limited detection efficiency can substantially complicate and impair downstream analysis. Here, we present SnapFISH-IMPUTE that imputes missing values in multiplexed DNA FISH data. Analysis on multiple published datasets shows that the proposed method preserves the distribution of pairwise distances between imaging loci, and the imputed chromatin conformations are indistinguishable from the observed conformations. Additionally, imputation greatly improves downstream analyses such as identifying enhancer-promoter loops and clustering cells into distinct cell types. SnapFISH-IMPUTE is freely available at https://github.com/hyuyu104/SnapFISH-IMPUTE.
Affiliation(s)
- Hongyu Yu
  - Department of Statistics, University of Wisconsin-Madison, Madison, WI, USA
  - Department of Biochemistry, University of Wisconsin-Madison, Madison, WI, USA
- Daiqing Wu
  - Department of Statistics, University of Toronto, Ontario, Canada
- Shreya Mishra
  - Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic Foundation, Cleveland, OH, USA
- Guning Shen
  - Department of Computer Science, University of North Carolina, Chapel Hill, NC, USA
  - Department of Biology, University of North Carolina, Chapel Hill, NC, USA
- Huaigu Sun
  - Department of Genetics, University of North Carolina, Chapel Hill, NC, USA
- Ming Hu
  - Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic Foundation, Cleveland, OH, USA
- Yun Li
  - Department of Computer Science, University of North Carolina, Chapel Hill, NC, USA
  - Department of Genetics, University of North Carolina, Chapel Hill, NC, USA
  - Department of Biostatistics, University of North Carolina, Chapel Hill, NC, USA
5
Patel R, Pham K, Chandrashekar H, Phillips-Cremins JE. FISHnet: Detecting chromatin domains in single-cell sequential Oligopaints imaging data. bioRxiv 2024:2024.06.18.599627. [PMID: 38948824 PMCID: PMC11212945 DOI: 10.1101/2024.06.18.599627] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/02/2024]
Abstract
Sequential Oligopaints DNA FISH is an imaging technique that measures higher-order genome folding at single-allele resolution via multiplexed, probe-based tracing. Currently there is a paucity of algorithms to identify 3D genome features in sequential Oligopaints data. Here, we present FISHnet, a graph theory method based on optimization of network modularity to detect chromatin domains and boundaries in pairwise distance matrices. FISHnet uncovers cell type-specific domain-like folding patterns on single alleles, thus enabling future studies aiming to elucidate the role for single-cell folding variation on genome function.
Affiliation(s)
- Rohan Patel
  - Department of Bioengineering, University of Pennsylvania, Philadelphia, PA
  - Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania
- Kenneth Pham
  - Department of Bioengineering, University of Pennsylvania, Philadelphia, PA
  - Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania
- Harshini Chandrashekar
  - Department of Bioengineering, University of Pennsylvania, Philadelphia, PA
  - Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania
- Jennifer E Phillips-Cremins
  - Department of Bioengineering, University of Pennsylvania, Philadelphia, PA
  - Epigenetics Institute, Perelman School of Medicine, University of Pennsylvania
  - Department of Genetics, Perelman School of Medicine, University of Pennsylvania
6
Yu H, Wu D, Shen G, Hu M, Li Y. SnapFISH-IMPUTE: an imputation method for multiplexed DNA FISH data. bioRxiv 2024:2024.01.12.575427. [PMID: 38293083 PMCID: PMC10827092 DOI: 10.1101/2024.01.12.575427] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2024]
Abstract
Chromatin spatial organization plays a crucial role in gene regulation. Recently developed and prospering multiplexed DNA FISH technologies enable direct visualization of chromatin conformation in the nucleus. However, incomplete data caused by limited detection efficiency can substantially complicate and impair downstream analysis. Here, we present SnapFISH-IMPUTE that imputes missing values in multiplexed DNA FISH data. Analysis on multiple published datasets shows that the proposed method preserves the distribution of pairwise distances between imaging loci, and the imputed chromatin conformations are indistinguishable from the observed conformations. Additionally, imputation greatly improves downstream analyses such as identifying enhancer-promoter loops and clustering cells into distinct cell types. SnapFISH-IMPUTE is freely available at https://github.com/hyuyu104/SnapFISH-IMPUTE.
Affiliation(s)
- Hongyu Yu
  - Department of Statistics, University of Wisconsin-Madison, Madison, WI, USA
  - Department of Biochemistry, University of Wisconsin-Madison, Madison, WI, USA
- Daiqing Wu
  - Department of Mathematics, Sun Yat-Sen University, Guangzhou, Guangdong, China
- Guning Shen
  - Department of Computer Science, University of North Carolina, Chapel Hill, NC, USA
  - Department of Biology, University of North Carolina, Chapel Hill, NC, USA
- Ming Hu
  - Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic Foundation, Cleveland, OH, USA
- Yun Li
  - Department of Computer Science, University of North Carolina, Chapel Hill, NC, USA
  - Department of Biostatistics, University of North Carolina, Chapel Hill, NC, USA
  - Department of Genetics, University of North Carolina, Chapel Hill, NC, USA
7
Schaeffer M, Nollmann M. Contributions of 3D chromatin structure to cell-type-specific gene regulation. Curr Opin Genet Dev 2023; 79:102032. [PMID: 36893484 DOI: 10.1016/j.gde.2023.102032] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 11/30/2022] [Revised: 02/05/2023] [Accepted: 02/06/2023] [Indexed: 03/09/2023]
Abstract
Eukaryotic genomes are organized in 3D in a multiscale manner, and different mechanisms acting at each of these scales can contribute to transcriptional regulation. However, the large single-cell variability in 3D chromatin structures represents a challenge to understand how transcription may be differentially regulated between cell types in a robust and efficient manner. Here, we describe the different mechanisms by which 3D chromatin structure was shown to contribute to cell-type-specific transcriptional regulation. Excitingly, several novel methodologies able to measure 3D chromatin conformation and transcription in single cells in their native tissue context, or to detect the dynamics of cis-regulatory interactions, are starting to allow quantitative dissection of chromatin structure noise and relate it to how transcription may be regulated between different cell types and cell states.
Affiliation(s)
- Marie Schaeffer
  - Centre de Biologie Structurale, Univ Montpellier, CNRS UMR 5048, INSERM U1054, Montpellier, France
- Marcelo Nollmann
  - Centre de Biologie Structurale, Univ Montpellier, CNRS UMR 5048, INSERM U1054, Montpellier, France
8
Li Y, Matsunaga S. Various Strategies for Improved Signal-to-Noise Ratio in CRISPR-Based Live Cell Imaging. Cytologia 2023. [DOI: 10.1508/cytologia.88.3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/28/2023]
9
Sun X, Yin D, Qin F, Yu H, Lu W, Yao F, He Q, Huang X, Yan Z, Wang P, Deng C, Liu N, Yang Y, Liang W, Wang R, Wang C, Yokoya N, Hänsch R, Fu K. Revealing influencing factors on global waste distribution via deep-learning based dumpsite detection from satellite imagery. Nat Commun 2023; 14:1444. [PMID: 36922495 PMCID: PMC10015540 DOI: 10.1038/s41467-023-37136-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2022] [Accepted: 03/01/2023] [Indexed: 03/17/2023] Open
Abstract
With the advancement of global civilisation, monitoring and managing dumpsites have become essential parts of environmental governance in various countries. Dumpsite locations are difficult to obtain in a timely manner by local government agencies and environmental groups. The World Bank shows that governments need to spend massive labour and economic costs to collect illegal dumpsites to implement management. Here we show that applying novel deep convolutional networks to high-resolution satellite images can provide an effective, efficient, and low-cost method to detect dumpsites. In sampled areas of 28 cities around the world, our model detects nearly 1000 dumpsites that appeared around 2021. This approach reduces the investigation time by more than 96.8% compared with the manual method. With this novel and powerful methodology, it is now capable of analysing the relationship between dumpsites and various social attributes on a global scale, temporally and spatially.
Affiliation(s)
- Xian Sun
  - Aerospace Information Research Institute, Chinese Academy of Sciences, 100190, Beijing, China
  - School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, 100190, Beijing, China
  - Key Laboratory of Network Information System Technology (NIST), Aerospace Information Research Institute, Chinese Academy of Sciences, 100190, Beijing, China
- Dongshuo Yin
  - Aerospace Information Research Institute, Chinese Academy of Sciences, 100190, Beijing, China
  - School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, 100190, Beijing, China
  - Key Laboratory of Network Information System Technology (NIST), Aerospace Information Research Institute, Chinese Academy of Sciences, 100190, Beijing, China
- Fei Qin
  - School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, 100190, Beijing, China
- Hongfeng Yu
  - Aerospace Information Research Institute, Chinese Academy of Sciences, 100190, Beijing, China
  - School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, 100190, Beijing, China
  - Key Laboratory of Network Information System Technology (NIST), Aerospace Information Research Institute, Chinese Academy of Sciences, 100190, Beijing, China
- Wanxuan Lu
  - Aerospace Information Research Institute, Chinese Academy of Sciences, 100190, Beijing, China
  - School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, 100190, Beijing, China
  - Key Laboratory of Network Information System Technology (NIST), Aerospace Information Research Institute, Chinese Academy of Sciences, 100190, Beijing, China
- Fanglong Yao
  - Aerospace Information Research Institute, Chinese Academy of Sciences, 100190, Beijing, China
  - School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, 100190, Beijing, China
  - Key Laboratory of Network Information System Technology (NIST), Aerospace Information Research Institute, Chinese Academy of Sciences, 100190, Beijing, China
- Qibin He
  - Aerospace Information Research Institute, Chinese Academy of Sciences, 100190, Beijing, China
  - School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, 100190, Beijing, China
  - Key Laboratory of Network Information System Technology (NIST), Aerospace Information Research Institute, Chinese Academy of Sciences, 100190, Beijing, China
- Xingliang Huang
  - Aerospace Information Research Institute, Chinese Academy of Sciences, 100190, Beijing, China
  - School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, 100190, Beijing, China
  - Key Laboratory of Network Information System Technology (NIST), Aerospace Information Research Institute, Chinese Academy of Sciences, 100190, Beijing, China
- Zhiyuan Yan
  - Aerospace Information Research Institute, Chinese Academy of Sciences, 100190, Beijing, China
  - School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, 100190, Beijing, China
  - Key Laboratory of Network Information System Technology (NIST), Aerospace Information Research Institute, Chinese Academy of Sciences, 100190, Beijing, China
- Peijin Wang
  - Aerospace Information Research Institute, Chinese Academy of Sciences, 100190, Beijing, China
  - School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, 100190, Beijing, China
  - Key Laboratory of Network Information System Technology (NIST), Aerospace Information Research Institute, Chinese Academy of Sciences, 100190, Beijing, China
- Chubo Deng
  - Aerospace Information Research Institute, Chinese Academy of Sciences, 100190, Beijing, China
  - Key Laboratory of Network Information System Technology (NIST), Aerospace Information Research Institute, Chinese Academy of Sciences, 100190, Beijing, China
- Nayu Liu
  - Aerospace Information Research Institute, Chinese Academy of Sciences, 100190, Beijing, China
  - School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, 100190, Beijing, China
  - Key Laboratory of Network Information System Technology (NIST), Aerospace Information Research Institute, Chinese Academy of Sciences, 100190, Beijing, China
- Yiran Yang
  - Aerospace Information Research Institute, Chinese Academy of Sciences, 100190, Beijing, China
  - School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, 100190, Beijing, China
  - Key Laboratory of Network Information System Technology (NIST), Aerospace Information Research Institute, Chinese Academy of Sciences, 100190, Beijing, China
- Wei Liang
  - Aerospace Information Research Institute, Chinese Academy of Sciences, 100190, Beijing, China
- Ruiping Wang
  - Institute of Computing Technology, Chinese Academy of Sciences, 100190, Beijing, China
- Cheng Wang
  - Fujian Key Laboratory of Sensing and Computing for Smart Cities, School of Information Science and Engineering, Xiamen University, 361005, Xiamen, China
  - Fujian Collaborative Innovation Center for Big Data Applications in Governments, 350003, Fuzhou, China
- Naoto Yokoya
  - RIKEN Center for Advanced Intelligence Project, RIKEN, Tokyo, 103-0027, Japan
  - Department of Complexity Science and Engineering, The University of Tokyo, Tokyo, 113-8654, Japan
- Ronny Hänsch
  - German Aerospace Center (DLR), 82234, Weßling, Germany
- Kun Fu
  - Aerospace Information Research Institute, Chinese Academy of Sciences, 100190, Beijing, China
  - School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, 100190, Beijing, China
  - Key Laboratory of Network Information System Technology (NIST), Aerospace Information Research Institute, Chinese Academy of Sciences, 100190, Beijing, China
10
Abstract
In animals, the sequences for controlling gene expression do not concentrate just at the transcription start site of genes, but are frequently thousands to millions of base pairs distal to it. The interaction of these sequences with one another and their transcription start sites is regulated by factors that shape the three-dimensional (3D) organization of the genome within the nucleus. Over the past decade, indirect tools exploiting high-throughput DNA sequencing have helped to map this 3D organization, have identified multiple key regulators of its structure and, in the process, have substantially reshaped our view of how 3D genome architecture regulates transcription. Now, new tools for high-throughput super-resolution imaging of chromatin have directly visualized the 3D chromatin organization, settling some debates left unresolved by earlier indirect methods, challenging some earlier models of regulatory specificity and creating hypotheses about the role of chromatin structure in transcriptional regulation.
11
Multiple parameters shape the 3D chromatin structure of single nuclei at the doc locus in Drosophila. Nat Commun 2022; 13:5375. [PMID: 36104317 PMCID: PMC9474875 DOI: 10.1038/s41467-022-32973-y] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 01/25/2022] [Accepted: 08/25/2022] [Indexed: 11/08/2022] Open
Abstract
The spatial organization of chromatin at the scale of topologically associating domains (TADs) and below displays large cell-to-cell variations. Up until now, how this heterogeneity in chromatin conformation is shaped by chromatin condensation, TAD insulation, and transcription has remained mostly elusive. Here, we used Hi-M, a multiplexed DNA-FISH imaging technique providing developmental timing and transcriptional status, to show that the emergence of TADs at the ensemble level partially segregates the conformational space explored by single nuclei during the early development of Drosophila embryos. Surprisingly, a substantial fraction of nuclei display strong insulation even before TADs emerge. Moreover, active transcription within a TAD leads to minor changes to the local inter- and intra-TAD chromatin conformation in single nuclei and only weakly affects insulation to the neighboring TAD. Overall, our results indicate that multiple parameters contribute to shaping the chromatin architecture of single nuclei at the TAD scale.
12
Zeng W, Gautam A, Huson DH. DeepToA: An Ensemble Deep-Learning Approach to Predicting the Theater of Activity of a Microbiome. Bioinformatics 2022; 38:4670-4676. [PMID: 36029249 DOI: 10.1093/bioinformatics/btac584] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2022] [Revised: 07/19/2022] [Accepted: 08/26/2022] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION: Metagenomics is the study of microbiomes using DNA sequencing. A microbiome consists of an assemblage of microbes that is associated with a "theater of activity" (ToA). An important question is, to what degree does the taxonomic and functional content of the former depend on the (details of the) latter? Here we investigate a related technical question: Given a taxonomic and/or functional profile estimated from metagenomic sequencing data, how to predict the associated ToA? We present a deep-learning approach to this question. We use both taxonomic and functional profiles as input. We apply node2vec to embed hierarchical taxonomic profiles into numerical vectors. We then perform dimension reduction using clustering, to address the sparseness of the taxonomic data and thus make the problem more amenable to deep-learning algorithms. Functional features are combined with textual descriptions of protein families or domains. We present an ensemble deep-learning framework DeepToA for predicting the "theater of activity" of a microbial community, based on taxonomic and functional profiles. We use SHAP (SHapley Additive exPlanations) values to determine which taxonomic and functional features are important for the prediction.
RESULTS: Based on 7,560 metagenomic profiles downloaded from MGnify, classified into ten different theaters of activity, we demonstrate that DeepToA has an accuracy of 98.30%. We show that adding textual information to functional features increases the accuracy.
AVAILABILITY: Our approach is available at http://ab.inf.uni-tuebingen.de/software/deeptoa.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Wenhuan Zeng
  - Institute for Bioinformatics and Medical Informatics, University of Tübingen, Tübingen, 72076, Germany
- Anupam Gautam
  - Institute for Bioinformatics and Medical Informatics, University of Tübingen, Tübingen, 72076, Germany
  - International Max Planck Research School "From Molecules to Organisms", Max Planck Institute for Biology Tübingen, Max-Planck-Ring 5, Tübingen, 72076, Germany
- Daniel H Huson
  - Institute for Bioinformatics and Medical Informatics, University of Tübingen, Tübingen, 72076, Germany
  - International Max Planck Research School "From Molecules to Organisms", Max Planck Institute for Biology Tübingen, Max-Planck-Ring 5, Tübingen, 72076, Germany
  - Cluster of Excellence: Controlling Microbes to Fight Infection, Tübingen, Germany
13
Zhao C, Liu T, Wang Z. Functional Similarities of Protein-Coding Genes in Topologically Associating Domains and Spatially-Proximate Genomic Regions. Genes (Basel) 2022; 13:480. [PMID: 35328034 PMCID: PMC8951421 DOI: 10.3390/genes13030480] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/01/2022] [Revised: 02/26/2022] [Accepted: 03/05/2022] [Indexed: 02/01/2023] Open
Abstract
Topologically associating domains (TADs) are the structural and functional units of the genome. However, the functions of protein-coding genes existing in the same or different TADs have not been fully investigated. We compared the functional similarities of protein-coding genes existing in the same TAD and between different TADs, and also in the same gap region (the region between two consecutive TADs) and between different gap regions. We found that the protein-coding genes from the same TAD or gap region are more likely to share similar protein functions, and this trend is more obvious with TADs than the gap regions. We further created two types of gene–gene spatial interaction networks: the first type is based on Hi-C contacts, whereas the second type is based on both Hi-C contacts and the relationship of being in the same TAD. A graph auto-encoder was applied to learn the network topology, reconstruct the two types of networks, and predict the functions of the central genes/nodes based on the functions of the neighboring genes/nodes. It was found that better performance was achieved with the second type of network. Furthermore, we detected long-range spatially-interactive regions based on Hi-C contacts and calculated the functional similarities of the gene pairs from these regions.
14
Molecular architecture of enhancer–promoter interaction. Curr Opin Cell Biol 2022; 74:62-70. [DOI: 10.1016/j.ceb.2022.01.003] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 11/01/2021] [Revised: 12/29/2021] [Accepted: 01/10/2022] [Indexed: 12/26/2022]
15
Huminiecki Ł. Virtual Gene Concept and a Corresponding Pragmatic Research Program in Genetical Data Science. Entropy (Basel) 2021; 24:17. [PMID: 35052043 PMCID: PMC8774939 DOI: 10.3390/e24010017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2021] [Revised: 12/02/2021] [Accepted: 12/14/2021] [Indexed: 06/14/2023]
Abstract
Mendel proposed an experimentally verifiable paradigm of particle-based heredity that has been influential for over 150 years. The historical arguments have been reflected in the near past as Mendel's concept has been diversified by new types of omics data. As an effect of the accumulation of omics data, a virtual gene concept forms, giving rise to genetical data science. The concept integrates genetical, functional, and molecular features of the Mendelian paradigm. I argue that the virtual gene concept should be deployed pragmatically. Indeed, the concept has already inspired a practical research program related to systems genetics. The program includes questions about functionality of structural and categorical gene variants, about regulation of gene expression, and about roles of epigenetic modifications. The methodology of the program includes bioinformatics, machine learning, and deep learning. Education, funding, careers, standards, benchmarks, and tools to monitor research progress should be provided to support the research program.
Affiliation(s)
- Łukasz Huminiecki
  - Evolutionary, Computational, and Statistical Genetics, Department of Molecular Biology, Institute of Genetics and Animal Biotechnology, Polish Academy of Sciences, Postępu 36A, Jastrzębiec, 05-552 Warsaw, Poland
16
Kanapeckaitė A, Burokienė N, Mažeikienė A, Cottrell GS, Widera D. Biophysics is reshaping our perception of the epigenome: from DNA-level to high-throughput studies. Biophysical Reports 2021; 1:100028. [PMID: 36425454 PMCID: PMC9680810 DOI: 10.1016/j.bpr.2021.100028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2021] [Accepted: 09/24/2021] [Indexed: 06/16/2023]
Abstract
Epigenetic research holds great promise to advance our understanding of biomarkers and regulatory processes in health and disease. An increasing number of new approaches, ranging from molecular to biophysical analyses, enable identifying epigenetic changes on the level of a single gene or the whole epigenome. The aim of this review is to highlight how the field is shifting from completely molecular-biology-driven solutions to multidisciplinary strategies including more reliance on biophysical analysis tools. Biophysics not only offers technical advancements in imaging or structure analysis but also helps to explore regulatory interactions. New computational methods are also being developed to meet the demand of growing data volumes and their processing. Therefore, it is important to capture these new directions in epigenetics from a biophysical perspective and discuss current challenges as well as multiple applications of biophysical methods and tools. Specifically, we gradually introduce different biophysical research methods by first considering the DNA-level information and eventually higher-order chromatin structures. Moreover, we aim to highlight that the incorporation of bioinformatics, machine learning, and artificial intelligence into biophysical analysis allows gaining new insights into complex epigenetic processes. The gained understanding has already proven useful in translational and clinical research providing better patient stratification options or new therapeutic insights. Together, this offers a better readiness to transform bench-top experiments into industrial high-throughput applications with a possibility to employ developed methods in clinical practice and diagnostics.
Affiliation(s)
- Austė Kanapeckaitė
  - Algorithm379, Laisvės g. 7, LT 12007, Vilnius, Lithuania
  - Reading School of Pharmacy, Whiteknights, Reading, UK, RG6 6UB
- Neringa Burokienė
  - Clinics of Internal Diseases, Family Medicine and Oncology, Institute of Clinical Medicine, Faculty of Medicine, Vilnius University, M. K. Čiurlionio str. 21/27, LT-03101 Vilnius, Lithuania
- Asta Mažeikienė
  - Department of Physiology, Biochemistry, Microbiology and Laboratory Medicine, Institute of Biomedical Sciences, Faculty of Medicine, M. K. Čiurlionio str. 21/27, LT-03101 Vilnius, Lithuania
- Darius Widera
  - Reading School of Pharmacy, Whiteknights, Reading, UK, RG6 6UB