1
Kretschmer M, Fischer V, Gapp K. When Dad's Stress Gets under Kid's Skin-Impacts of Stress on Germline Cargo and Embryonic Development. Biomolecules 2023; 13:1750. [PMID: 38136621 PMCID: PMC10742275 DOI: 10.3390/biom13121750] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/04/2023] [Revised: 11/24/2023] [Accepted: 12/01/2023] [Indexed: 12/24/2023]
Abstract
Multiple lines of evidence suggest that paternal psychological stress contributes to an increased prevalence of neuropsychiatric and metabolic diseases in the progeny. While altered paternal care certainly plays a role in such transmitted disease risk, molecular factors in the germline might additionally be at play in humans. This is supported by findings on changes to the molecular makeup of germ cells and suggests an epigenetic component in transmission. Several rodent studies demonstrate a correlation between paternal stress-induced changes in epigenetic modifications and offspring phenotypic alterations, yet some intriguing cases also start to show mechanistic links between sperm and the early embryo. In this review, we summarise efforts to understand the mechanism of intergenerational transmission from sperm to the early embryo. In particular, we highlight how stress alters epigenetic modifications in sperm and discuss the potential for these modifications to propagate modified molecular trajectories in the early embryo to give rise to aberrant phenotypes in adult offspring.
Affiliation(s)
- Miriam Kretschmer
  - Laboratory of Epigenetics and Neuroendocrinology, Department of Health Sciences and Technology, Institute for Neuroscience, ETH Zürich, 8057 Zürich, Switzerland
  - Neuroscience Center Zurich, ETH Zürich and University of Zürich, 8057 Zürich, Switzerland
- Vincent Fischer
  - Laboratory of Epigenetics and Neuroendocrinology, Department of Health Sciences and Technology, Institute for Neuroscience, ETH Zürich, 8057 Zürich, Switzerland
  - Neuroscience Center Zurich, ETH Zürich and University of Zürich, 8057 Zürich, Switzerland
- Katharina Gapp
  - Laboratory of Epigenetics and Neuroendocrinology, Department of Health Sciences and Technology, Institute for Neuroscience, ETH Zürich, 8057 Zürich, Switzerland
  - Neuroscience Center Zurich, ETH Zürich and University of Zürich, 8057 Zürich, Switzerland

2
Marri D, Filipovic D, Kana O, Tischkau S, Bhattacharya S. Prediction of mammalian tissue-specific CLOCK-BMAL1 binding to E-box DNA motifs. Sci Rep 2023; 13:7742. [PMID: 37173345 PMCID: PMC10182026 DOI: 10.1038/s41598-023-34115-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/15/2023] [Accepted: 04/25/2023] [Indexed: 05/15/2023]
Abstract
The Brain and Muscle ARNTL-Like 1 protein (BMAL1) forms a heterodimer with either Circadian Locomotor Output Cycles Kaput (CLOCK) or Neuronal PAS domain protein 2 (NPAS2) to act as a master regulator of the mammalian circadian clock gene network. The dimer binds to E-box gene regulatory elements on DNA, activating downstream transcription of clock genes. Identification of transcription factor binding sites and genomic features that correlate to DNA binding by BMAL1 is a challenging problem, given that CLOCK-BMAL1 or NPAS2-BMAL1 bind to several distinct binding motifs (CANNTG) on DNA. Using three different types of tissue-specific machine learning models with features based on (1) DNA sequence, (2) DNA sequence plus DNA shape, and (3) DNA sequence and shape plus histone modifications, we developed an interpretable predictive model of genome-wide BMAL1 binding to E-box motifs and dissected the mechanisms underlying BMAL1-DNA binding. Our results indicated that histone modifications, the local shape of the DNA, and the flanking sequence of the E-box motif are sufficient predictive features for BMAL1-DNA binding. Our models also provide mechanistic insights into tissue specificity of DNA binding by BMAL1.
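As a rough illustration of the sequence-level starting point mentioned in this abstract, the sketch below scans a DNA string for the E-box consensus CANNTG and records the flanking bases, which the abstract names as a predictive feature. It is not the authors' pipeline: the function name `find_eboxes`, the `flank` parameter, and the example sequence are hypothetical, and DNA shape and histone features are not modelled here.

```python
import re

# E-box consensus from the abstract: CANNTG, where N is any nucleotide.
# The pattern class is its own reverse complement, so scanning one strand
# finds sites on both. The lookahead also reports overlapping matches.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_eboxes(sequence, flank=2):
    """Return (start, motif, left_flank, right_flank) for each E-box hit.

    `flank` (hypothetical parameter) is the number of bases kept on each
    side of the 6-bp motif; flanks are truncated at the sequence ends.
    """
    seq = sequence.upper()
    hits = []
    for match in EBOX.finditer(seq):
        start = match.start()
        motif = match.group(1)
        left = seq[max(0, start - flank):start]
        right = seq[start + 6:start + 6 + flank]
        hits.append((start, motif, left, right))
    return hits


# Made-up example sequence containing two E-box variants.
example = "ttCACGTGaaCAGCTGcc"
for hit in find_eboxes(example):
    print(hit)
```

Each hit could then be annotated with further features (for example shape or chromatin signals) before being passed to a classifier; that step is outside the scope of this sketch.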
Affiliation(s)
- Daniel Marri
  - Department of Biomedical Engineering, Michigan State University, East Lansing, MI, USA
  - Institute for Quantitative Health Science and Engineering, Michigan State University, East Lansing, MI, USA
- David Filipovic
  - Department of Biomedical Engineering, Michigan State University, East Lansing, MI, USA
  - Institute for Quantitative Health Science and Engineering, Michigan State University, East Lansing, MI, USA
  - Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI, USA
- Omar Kana
  - Institute for Quantitative Health Science and Engineering, Michigan State University, East Lansing, MI, USA
  - Department of Pharmacology and Toxicology, Michigan State University, East Lansing, MI, USA
  - Institute for Integrative Toxicology, Michigan State University, East Lansing, MI, USA
- Shelley Tischkau
  - Department of Pharmacology, Southern Illinois University School of Medicine, Springfield, IL, USA
- Sudin Bhattacharya
  - Department of Biomedical Engineering, Michigan State University, East Lansing, MI, USA
  - Institute for Quantitative Health Science and Engineering, Michigan State University, East Lansing, MI, USA
  - Department of Pharmacology and Toxicology, Michigan State University, East Lansing, MI, USA
  - Institute for Integrative Toxicology, Michigan State University, East Lansing, MI, USA

3
Sghaier N, Essemine J, Ayed RB, Gorai M, Ben Marzoug R, Rebai A, Qu M. An Evidence Theory and Fuzzy Logic Combined Approach for the Prediction of Potential ARF-Regulated Genes in Quinoa. Plants (Basel, Switzerland) 2022; 12:71. [PMID: 36616201 PMCID: PMC9824623 DOI: 10.3390/plants12010071] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/15/2022] [Accepted: 11/26/2022] [Indexed: 06/17/2023]
Abstract
Quinoa is among the plants most tolerant of challenging and harmful abiotic environmental factors. It was selected as one of the model crops for bio-saline agriculture that could contribute to staple food security for an ever-growing worldwide population under various climate change scenarios. The auxin response factors (ARFs) are among the main contributors to plant adaptation to severe environmental conditions. Thus, the determination of the ARF-binding sites represents a major step that could provide promising insights for plant breeding programs and for improving agronomic traits. However, determining the ARF-binding sites is a challenging task, particularly in species with large genome sizes. In this report, we present a data fusion approach based on Dempster-Shafer evidence theory and fuzzy set theory to predict the ARF-binding sites. We then performed an in silico identification of the ARF-binding sites in Chenopodium quinoa. The characterization of some known pathways implicated in auxin signaling in other higher plants confirms the reliability of our prediction. Furthermore, several pathways with little or no available information about their functions were identified as playing important roles in the adaptation of quinoa to environmental conditions. The predicted auxin response genes associated with the detected ARF-binding sites may help to explore the biological roles of some unknown genes newly identified in quinoa.
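For readers unfamiliar with Dempster-Shafer evidence theory, the sketch below shows Dempster's rule of combination fusing two independent evidence sources about whether a candidate position is a binding site. It illustrates the general rule only, not the authors' implementation: the source names (`motif_evidence`, `conservation_evidence`) and all mass values are invented, and the fuzzy-set component of the paper is not modelled.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule: combine two mass functions over the same frame.

    Each mass function maps a frozenset (a subset of the frame of
    discernment) to its mass; masses in each function should sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Mass assigned to contradictory hypotheses counts as conflict.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    # Renormalise by the non-conflicting mass.
    return {subset: mass / (1.0 - conflict) for subset, mass in combined.items()}


# Hypothetical frame: a candidate position is a binding site or it is not.
SITE, NOT_SITE = frozenset({"site"}), frozenset({"not_site"})
EITHER = SITE | NOT_SITE  # mass left on the whole frame expresses ignorance

# Invented masses for two illustrative evidence sources.
motif_evidence = {SITE: 0.6, EITHER: 0.4}
conservation_evidence = {SITE: 0.5, NOT_SITE: 0.2, EITHER: 0.3}

fused = combine(motif_evidence, conservation_evidence)
for subset, mass in fused.items():
    print(sorted(subset), round(mass, 4))
```

Keeping some mass on the whole frame lets a weak source express ignorance rather than being forced to vote against a site, which is the usual motivation for evidence theory over a plain product of probabilities.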
Affiliation(s)
- Nesrine Sghaier
  - National Nanfan Research Institute (Sanya), Chinese Academy of Agricultural Sciences, Sanya 572024, China
  - CAS Center for Excellence in Molecular Plant Sciences, Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200032, China
  - Laboratory of Advanced Technology and Intelligent Systems, National Engineering School of Sousse, Sousse 4023, Tunisia
- Jemaa Essemine
  - CAS Center for Excellence in Molecular Plant Sciences, Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200032, China
- Rayda Ben Ayed
  - Department of Agronomy and Plant Biotechnology, National Institute of Agronomy of Tunisia (INAT), 43 Avenue Charles Nicolle, 1082 El Mahrajène, University of Carthage-Tunis, Tunis 1082, Tunisia
  - Laboratory of Extremophile Plants, Centre of Biotechnology of Borj-Cédria, B.P. 901, Hammam Lif 2050, Tunisia
- Mustapha Gorai
  - Higher Institute of Applied Biology Medenine, University of Gabes, Medenine 4119, Tunisia
- Riadh Ben Marzoug
  - Laboratory of Molecular and Cellular Screening Processes, Sfax Biotechnology Center, B.P 1177, Sfax 3018, Tunisia
- Ahmed Rebai
  - Laboratory of Molecular and Cellular Screening Processes, Sfax Biotechnology Center, B.P 1177, Sfax 3018, Tunisia
- Mingnan Qu
  - National Nanfan Research Institute (Sanya), Chinese Academy of Agricultural Sciences, Sanya 572024, China
  - CAS Center for Excellence in Molecular Plant Sciences, Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200032, China

4
Rivière Q, Corso M, Ciortan M, Noël G, Verbruggen N, Defrance M. Exploiting Genomic Features to Improve the Prediction of Transcription Factor-Binding Sites in Plants. Plant & Cell Physiology 2022; 63:1457-1473. [PMID: 35799371 DOI: 10.1093/pcp/pcac095] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2021] [Revised: 06/07/2022] [Accepted: 07/06/2022] [Indexed: 06/15/2023]
Abstract
The identification of transcription factor (TF) target genes is central in biology. A popular approach is based on locating potential cis-regulatory elements (CREs) by pattern matching. During the last few years, tools integrating next-generation sequencing data have been developed to improve the performance of pattern matching. However, such tools have not yet been comprehensively evaluated in plants. Hence, we developed a new streamlined method aimed at predicting CREs and target genes of plant TFs in specific organs or conditions. Our approach implements a supervised machine learning strategy, which allows decision rule models to be learnt using TF ChIP-chip/seq experimental data. Different layers of genomic features were integrated in predictive models: the position on the gene, the DNA sequence conservation, the chromatin state and various CRE footprints. Among the tested features, the chromatin features were crucial for improving the accuracy of the method. Furthermore, we evaluated the transferability of predictive models across TFs, organs and species. Finally, we validated our method by correctly inferring the target genes of key TFs controlling metabolite biosynthesis at the organ level in Arabidopsis. We developed a tool, Wimtrap, to reproduce our approach in plant species and conditions/organs for which ChIP-chip/seq data are available. Wimtrap is a user-friendly R package that supports an R Shiny web interface and is provided with pre-built models that can be used to quickly get predictions of CREs and TF gene targets in different organs or conditions in Arabidopsis thaliana, Solanum lycopersicum, Oryza sativa and Zea mays.
Affiliation(s)
- Quentin Rivière
  - Brussels Bioengineering School, Laboratory of Plant Physiology and molecular Genetics, Université Libre de Bruxelles, Brussels 1050, Belgium
- Massimiliano Corso
  - Brussels Bioengineering School, Laboratory of Plant Physiology and molecular Genetics, Université Libre de Bruxelles, Brussels 1050, Belgium
  - INRAE, AgroParisTech, Institut Jean-Pierre Bourgin (IJPB), Université Paris-Saclay, Versailles 78000, France
- Madalina Ciortan
  - Interuniversity Institute of Bioinformatics in Brussels, Machine Learning Group, Université Libre de Bruxelles, Brussels 1050, Belgium
- Grégoire Noël
  - Functional and Evolutionary Entomology, Gembloux Agro-Bio Tech, University of Liège, Passage des Déportés 2, Gembloux 5030, Belgium
- Nathalie Verbruggen
  - Brussels Bioengineering School, Laboratory of Plant Physiology and molecular Genetics, Université Libre de Bruxelles, Brussels 1050, Belgium
- Matthieu Defrance
  - Interuniversity Institute of Bioinformatics in Brussels, Machine Learning Group, Université Libre de Bruxelles, Brussels 1050, Belgium

5
Zibetti C. Deciphering the Retinal Epigenome during Development, Disease and Reprogramming: Advancements, Challenges and Perspectives. Cells 2022; 11:806. [PMID: 35269428 PMCID: PMC8908986 DOI: 10.3390/cells11050806] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2022] [Revised: 02/15/2022] [Accepted: 02/18/2022] [Indexed: 02/01/2023]
Abstract
Retinal neurogenesis is driven by concerted actions of transcription factors, some of which are expressed in a continuum and across several cell subtypes throughout development. While seemingly redundant, many factors diversify their regulatory outcome on gene expression, by coordinating variations in chromatin landscapes to drive divergent retinal specification programs. Recent studies have furthered the understanding of the epigenetic contribution to the progression of age-related macular degeneration, a leading cause of blindness in the elderly. The knowledge of the epigenomic mechanisms that control the acquisition and stabilization of retinal cell fates and are evoked upon damage, holds the potential for the treatment of retinal degeneration. Herein, this review presents the state-of-the-art approaches to investigate the retinal epigenome during development, disease, and reprogramming. A pipeline is then reviewed to functionally interrogate the epigenetic and transcriptional networks underlying cell fate specification, relying on a truly unbiased screening of open chromatin states. The related work proposes an inferential model to identify gene regulatory networks, features the first footprinting analysis and the first tentative, systematic query of candidate pioneer factors in the retina ever conducted in any model organism, leading to the identification of previously uncharacterized master regulators of retinal cell identity, such as the nuclear factor I, NFI. This pipeline is virtually applicable to the study of genetic programs and candidate pioneer factors in any developmental context. Finally, challenges and limitations intrinsic to the current next-generation sequencing techniques are discussed, as well as recent advances in super-resolution imaging, enabling spatio-temporal resolution of the genome.
Affiliation(s)
- Cristina Zibetti
  - Department of Ophthalmology, Institute of Clinical Medicine, University of Oslo, Kirkeveien 166, Building 36, 0455 Oslo, Norway

6
Zhang Y, Wang Z, Zeng Y, Liu Y, Xiong S, Wang M, Zhou J, Zou Q. A novel convolution attention model for predicting transcription factor binding sites by combination of sequence and shape. Brief Bioinform 2021; 23:6470969. [PMID: 34929739 DOI: 10.1093/bib/bbab525] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 09/14/2021] [Revised: 10/28/2021] [Accepted: 11/13/2021] [Indexed: 12/17/2022]
Abstract
The discovery of putative transcription factor binding sites (TFBSs) is important for understanding the underlying binding mechanism and cellular functions. Recently, many computational methods have been proposed to jointly account for DNA sequence and shape properties in TFBS prediction. However, these methods fail to fully utilize the latent features derived from both sequence and shape profiles and have limitations in interpretability and knowledge discovery. To this end, we present a novel Deep Convolution Attention network combining Sequence and Shape, dubbed D-SSCA, for precisely predicting putative TFBSs. Experiments conducted on 165 ENCODE ChIP-seq datasets reveal that D-SSCA significantly outperforms several state-of-the-art methods in predicting TFBSs, and justify the utility of the channel attention module for feature refinement. In addition, a thorough analysis of the contribution of five shapes to TFBS prediction demonstrates that shape features can improve the predictive power for transcription factor-DNA binding. Furthermore, D-SSCA can realize the cross-cell line prediction of TFBSs, indicating the occupancy of common interplay patterns concerning both sequence and shape across various cell lines. The source code of D-SSCA can be found at https://github.com/MoonLord0525/.
Affiliation(s)
- Yongqing Zhang
  - School of Computer Science, Chengdu University of Information Technology, 610225, Chengdu, China
  - School of Computer Science and Engineering, University of Electronic Science and Technology of China, 611731, Chengdu, China
- Zixuan Wang
  - School of Computer Science, Chengdu University of Information Technology, 610225, Chengdu, China
- Yuanqi Zeng
  - School of Computer Science, Chengdu University of Information Technology, 610225, Chengdu, China
- Yuhang Liu
  - School of Computer Science, Chengdu University of Information Technology, 610225, Chengdu, China
- Shuwen Xiong
  - School of Computer Science, Chengdu University of Information Technology, 610225, Chengdu, China
- Maocheng Wang
  - School of Computer Science, Chengdu University of Information Technology, 610225, Chengdu, China
- Jiliu Zhou
  - School of Computer Science, Chengdu University of Information Technology, 610225, Chengdu, China
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, 610054, Chengdu, China

7
Zhang Y, Wang Z, Zeng Y, Zhou J, Zou Q. High-resolution transcription factor binding sites prediction improved performance and interpretability by deep learning method. Brief Bioinform 2021; 22:6322761. [PMID: 34272562 DOI: 10.1093/bib/bbab273] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 04/30/2021] [Revised: 06/19/2021] [Accepted: 06/25/2021] [Indexed: 11/14/2022]
Abstract
Transcription factors (TFs) are essential proteins in regulating the spatiotemporal expression of genes. It is crucial to infer the potential transcription factor binding sites (TFBSs) with high resolution to advance biology and realize precision medicine. Recently, deep learning-based models have shown exemplary performance in the prediction of TFBSs at the base-pair level. However, the previous models fail to integrate nucleotide position information and semantic information without noisy responses. Thus, there is still room for improvement. Moreover, both the inner mechanism and prediction results of these models are challenging to interpret. To this end, the Deep Attentive Encoder-Decoder Neural Network (D-AEDNet) is developed to identify the location of TF-DNA binding sites in DNA sequences. In particular, our model adopts a Skip Architecture to leverage the nucleotide position information in the encoder and removes noisy responses in the information fusion process with an Attention Gate. In addition, Transcription Factor Motif Discovery based on Sliding Window (TF-MoDSW), an approach to discover TF-DNA binding motifs by utilizing the output of neural networks, is proposed to understand the biological meaning of the predicted result. On ChIP-exo datasets, experimental results show that D-AEDNet performs better than competing methods. We also show, by visualization analysis, that the Attention Gate can improve the interpretability of our model. Furthermore, by conducting experiments on ChIP-seq datasets, we confirm that D-AEDNet outperforms state-of-the-art methods in learning TF-DNA binding motifs and that TF-MoDSW can discover biological sequence motifs in TF-DNA interactions.
Affiliation(s)
- Yongqing Zhang
  - School of Computer Science, Chengdu University of Information Technology, 610225, Chengdu, China
- Zixuan Wang
  - School of Computer Science, Chengdu University of Information Technology, 610225, Chengdu, China
- Yuanqi Zeng
  - School of Computer Science, Chengdu University of Information Technology, 610225, Chengdu, China
- Jiliu Zhou
  - School of Computer Science, Chengdu University of Information Technology, 610225, Chengdu, China
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, 610054, Chengdu, China

8
Jing F, Zhang SW, Cao Z, Zhang S. An Integrative Framework for Combining Sequence and Epigenomic Data to Predict Transcription Factor Binding Sites Using Deep Learning. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2021; 18:355-364. [PMID: 30835229 DOI: 10.1109/tcbb.2019.2901789] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 06/09/2023]
Abstract
Knowing the transcription factor binding sites (TFBSs) is essential for modeling the underlying binding mechanisms and follow-up cellular functions. Convolutional neural networks (CNNs) have outperformed other methods in predicting TFBSs from the primary DNA sequence. In addition to DNA sequences, histone modifications and chromatin accessibility are also important factors influencing TF binding, and they have recently been explored to predict TFBSs. However, current methods rarely take into account histone modifications and chromatin accessibility using CNNs in an integrative framework. To this end, we developed a general CNN model to integrate these data for predicting TFBSs. We systematically benchmarked a series of architecture variants by changing network structure in terms of width and depth, and explored the effects of sample length at flanking regions. We evaluated the performance of the three types of data and their combinations using 256 ChIP-seq experiments and also compared it with competing machine learning methods. We find that contributions from these three types of data are complementary to each other. Moreover, the integrative CNN framework is superior to traditional machine learning methods, with significant improvements.

9
Sazonova MA, Ryzhkova AI, Sinyov VV, Sazonova MD, Khasanova ZB, Nikitina NA, Karagodin VP, Orekhov AN, Sobenin IA. Creation of Cultures Containing Mutations Linked with Cardiovascular Diseases using Transfection and Genome Editing. Curr Pharm Des 2020; 25:693-699. [PMID: 30931844 DOI: 10.2174/1381612825666190329121532] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 02/25/2019] [Accepted: 03/25/2019] [Indexed: 12/18/2022]
Abstract
OBJECTIVE: In this review article, we analyzed the literature on the creation of cultures containing mutations associated with cardiovascular diseases (CVD) using transfection, transduction and editing of the human genome. METHODS: We described the different methods of transfection, transduction and editing of the human genome used in the literature. RESULTS: We reviewed the studies in which the creation of cell cultures containing mutations was described. According to the literature, the CRISPR/Cas9 system proved to be the most preferred method for editing the genome. We found mitochondrial transfection using a gene gun to be a promising and interesting, though practically undeveloped, direction. Such a gun can direct a genetically engineered construct containing human DNA mutations to the mitochondria using heavy metal particles. However, in human molecular genetics, the transfection method using a gene gun is unfairly forgotten and is almost never used. Ethical problems arising from editing the human genome were also discussed in our review. We came to the conclusion that it is impossible to stop scientific and technical progress. It is important that the editing of the genome takes place under the strict control of society and does not bear dangerous consequences for humanity. To achieve this, the constant interaction of science with society, culture and business is necessary. CONCLUSION: The most promising methods for the creation of cell cultures containing mutations linked with cardiovascular diseases were the CRISPR/Cas9 system and the gene gun.
Affiliation(s)
- Margarita A Sazonova
  - Laboratory of Medical Genetics, National Medical Research Center of Cardiology, Moscow, Russian Federation
  - Laboratory of Angiopathology, Institute of General Pathology and Pathophysiology, Moscow, Russian Federation
- Anastasia I Ryzhkova
  - Laboratory of Angiopathology, Institute of General Pathology and Pathophysiology, Moscow, Russian Federation
- Vasily V Sinyov
  - Laboratory of Medical Genetics, National Medical Research Center of Cardiology, Moscow, Russian Federation
- Marina D Sazonova
  - Laboratory of Angiopathology, Institute of General Pathology and Pathophysiology, Moscow, Russian Federation
- Zukhra B Khasanova
  - Laboratory of Medical Genetics, National Medical Research Center of Cardiology, Moscow, Russian Federation
- Nadezhda A Nikitina
  - Laboratory of Medical Genetics, National Medical Research Center of Cardiology, Moscow, Russian Federation
- Alexander N Orekhov
  - Laboratory of Angiopathology, Institute of General Pathology and Pathophysiology, Moscow, Russian Federation
- Igor A Sobenin
  - Laboratory of Medical Genetics, National Medical Research Center of Cardiology, Moscow, Russian Federation
  - Laboratory of Angiopathology, Institute of General Pathology and Pathophysiology, Moscow, Russian Federation

10
Yan F, Powell DR, Curtis DJ, Wong NC. From reads to insight: a hitchhiker's guide to ATAC-seq data analysis. Genome Biol 2020; 21:22. [PMID: 32014034 PMCID: PMC6996192 DOI: 10.1186/s13059-020-1929-3] [Citation(s) in RCA: 194] [Impact Index Per Article: 48.5] [Received: 07/22/2019] [Accepted: 01/08/2020] [Indexed: 12/16/2022]
Abstract
Assay of Transposase Accessible Chromatin sequencing (ATAC-seq) is widely used in studying chromatin biology, but a comprehensive review of the analysis tools has not been completed yet. Here, we discuss the major steps in ATAC-seq data analysis, including pre-analysis (quality check and alignment), core analysis (peak calling), and advanced analysis (peak differential analysis and annotation, motif enrichment, footprinting, and nucleosome position analysis). We also review the reconstruction of transcriptional regulatory networks with multiomics data and highlight the current challenges of each step. Finally, we describe the potential of single-cell ATAC-seq and highlight the necessity of developing ATAC-seq specific analysis tools to obtain biologically meaningful insights.
Affiliation(s)
- Feng Yan
  - Australian Centre for Blood Diseases, Central Clinical School, Monash University, Melbourne, VIC, Australia
- David R Powell
  - Monash Bioinformatics Platform, Monash University, Melbourne, VIC, Australia
- David J Curtis
  - Australian Centre for Blood Diseases, Central Clinical School, Monash University, Melbourne, VIC, Australia
  - Department of Clinical Haematology, Alfred Health, Melbourne, VIC, Australia
- Nicholas C Wong
  - Australian Centre for Blood Diseases, Central Clinical School, Monash University, Melbourne, VIC, Australia
  - Monash Bioinformatics Platform, Monash University, Melbourne, VIC, Australia

11
Xu T, Zheng X, Li B, Jin P, Qin Z, Wu H. A comprehensive review of computational prediction of genome-wide features. Brief Bioinform 2020; 21:120-134. [PMID: 30462144 PMCID: PMC10233247 DOI: 10.1093/bib/bby110] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 07/10/2018] [Revised: 10/15/2018] [Accepted: 10/16/2018] [Indexed: 12/15/2022]
Abstract
There are significant correlations among different types of genetic, genomic and epigenomic features within the genome. These correlations make the in silico feature prediction possible through statistical or machine learning models. With the accumulation of a vast amount of high-throughput data, feature prediction has gained significant interest lately, and a plethora of papers have been published in the past few years. Here we provide a comprehensive review on these published works, categorized by the prediction targets, including protein binding site, enhancer, DNA methylation, chromatin structure and gene expression. We also provide discussions on some important points and possible future directions.
Affiliation(s)
- Tianlei Xu
  - Department of Mathematics and Computer Science, Emory University, Atlanta, GA, USA
- Xiaoqi Zheng
  - Department of Mathematics, Shanghai Normal University, Shanghai, China
- Ben Li
  - Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Atlanta, GA, USA
- Peng Jin
  - Department of Human Genetics, Rollins School of Public Health, Emory University, Atlanta, GA, USA
- Zhaohui Qin
  - Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Atlanta, GA, USA
- Hao Wu
  - Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Atlanta, GA, USA

12
Behjati Ardakani F, Schmidt F, Schulz MH. Predicting transcription factor binding using ensemble random forest models. F1000Res 2019; 7:1603. [PMID: 31723409 PMCID: PMC6823902 DOI: 10.12688/f1000research.16200.2] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Accepted: 08/15/2019] [Indexed: 12/03/2022]
Abstract
Background: Understanding the location and cell-type specific binding of Transcription Factors (TFs) is important in the study of gene regulation. Computational prediction of TF binding sites is challenging, because TFs often bind only to short DNA motifs and cell-type specific co-factors may work together with the same TF to determine binding. Here, we consider the problem of learning a general model for the prediction of TF binding using DNase1-seq data and TF motif descriptions in the form of position specific energy matrices (PSEMs). Methods: We use TF ChIP-seq data as a gold standard for model training and evaluation. Our contribution is a novel ensemble learning approach using random forest classifiers. In the context of the ENCODE-DREAM in vivo TF binding site prediction challenge we consider different learning setups. Results: Our results indicate that the ensemble learning approach is able to better generalize across tissues and cell types compared to individual tissue-specific classifiers or a classifier built upon data aggregated across tissues. Furthermore, we show that incorporating DNase1-seq peaks is essential to reduce the false positive rate of TF binding predictions compared to considering the raw DNase1 signal. Conclusions: Analysis of important features reveals that the models preferentially select motifs of other TFs that are close interaction partners in existing protein-protein interaction networks. Code generated in the scope of this project is available on GitHub: https://github.com/SchulzLab/TFAnalysis (DOI: 10.5281/zenodo.1409697).
Affiliation(s)
- Fatemeh Behjati Ardakani
  - High throughput Genomics and Systems Biology, Cluster of Excellence on Multimodal Computing and Interaction, Saarland University, Saarbruecken, Saarland, 66123, Germany
  - Computational Biology and Applied Algorithmics, Max Planck Institute for Informatics, Saarbruecken, Saarland, 66123, Germany
  - Graduate School of Computer Science, Saarland University, Saarbruecken, Saarland, 66123, Germany
- Florian Schmidt
  - High throughput Genomics and Systems Biology, Cluster of Excellence on Multimodal Computing and Interaction, Saarland University, Saarbruecken, Saarland, 66123, Germany
  - Computational Biology and Applied Algorithmics, Max Planck Institute for Informatics, Saarbruecken, Saarland, 66123, Germany
  - Graduate School of Computer Science, Saarland University, Saarbruecken, Saarland, 66123, Germany
  - Computational Systems Biology, Genome Institute of Singapore, Singapore, Singapore
- Marcel H Schulz
  - High throughput Genomics and Systems Biology, Cluster of Excellence on Multimodal Computing and Interaction, Saarland University, Saarbruecken, Saarland, 66123, Germany
  - Computational Biology and Applied Algorithmics, Max Planck Institute for Informatics, Saarbruecken, Saarland, 66123, Germany
  - Institute for Cardiovascular Regeneration, Goethe University Frankfurt am Main, Frankfurt am Main, Hessen, 60590, Germany
|
13
|
Zibetti C, Liu S, Wan J, Qian J, Blackshaw S. Epigenomic profiling of retinal progenitors reveals LHX2 is required for developmental regulation of open chromatin. Commun Biol 2019; 2:142. [PMID: 31044167 PMCID: PMC6484012 DOI: 10.1038/s42003-019-0375-9] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 09/01/2018] [Accepted: 03/11/2019] [Indexed: 11/14/2022] Open
Abstract
Retinal neurogenesis occurs through partially overlapping temporal windows, driven by concerted actions of transcription factors which, in turn, may contribute to the establishment of divergent genetic programs in the developing retina by coordinating variations in chromatin landscapes. Here we comprehensively profile murine retinal progenitors by integrating next generation sequencing methods and interrogate changes in chromatin accessibility at embryonic and post-natal stages. An unbiased search for motifs in open chromatin regions identifies putative factors involved in the developmental progression of the epigenome in retinal progenitor cells. Among these factors, the transcription factor LHX2 exhibits a developmentally regulated cis-regulatory repertoire and stage-dependent motif instances. Using loss-of-function assays, we determine LHX2 coordinates variations in chromatin accessibility, by competition for nucleosome occupancy and secondary regulation of candidate pioneer factors.
Affiliation(s)
- Cristina Zibetti
  - Solomon H. Snyder Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD 21205 USA
- Sheng Liu
  - Department of Ophthalmology, Johns Hopkins University School of Medicine, Baltimore, MD 21205 USA
  - Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN 46202 USA
- Jun Wan
  - Department of Ophthalmology, Johns Hopkins University School of Medicine, Baltimore, MD 21205 USA
  - Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN 46202 USA
- Jiang Qian
  - Department of Ophthalmology, Johns Hopkins University School of Medicine, Baltimore, MD 21205 USA
  - The Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD 21205 USA
- Seth Blackshaw
  - Solomon H. Snyder Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD 21205 USA
  - Department of Ophthalmology, Johns Hopkins University School of Medicine, Baltimore, MD 21205 USA
  - Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD 21205 USA
  - Center for Human Systems Biology, Johns Hopkins University School of Medicine, Baltimore, MD 21205 USA
  - Institute for Cell Engineering, Johns Hopkins University School of Medicine, Baltimore, MD 21205 USA
|
14
|
Keilwagen J, Posch S, Grau J. Accurate prediction of cell type-specific transcription factor binding. Genome Biol 2019; 20:9. [PMID: 30630522 PMCID: PMC6327544 DOI: 10.1186/s13059-018-1614-y] [Citation(s) in RCA: 56] [Impact Index Per Article: 11.2] [Received: 07/25/2018] [Accepted: 12/18/2018] [Indexed: 01/11/2023] Open
Abstract
Prediction of cell type-specific, in vivo transcription factor binding sites is one of the central challenges in regulatory genomics. Here, we present our approach that earned a shared first rank in the "ENCODE-DREAM in vivo Transcription Factor Binding Site Prediction Challenge" in 2017. In post-challenge analyses, we benchmark the influence of different feature sets and find that chromatin accessibility and binding motifs are sufficient to yield state-of-the-art performance. Finally, we provide 682 lists of predicted peaks for a total of 31 transcription factors in 22 primary cell types and tissues and a user-friendly version of our approach, Catchitt, for download.
Affiliation(s)
- Jens Keilwagen
  - Institute for Biosafety in Plant Biotechnology, Julius Kühn-Institut (JKI) - Federal Research Centre for Cultivated Plants, Erwin-Baur-Straße 27, Quedlinburg, 06484 Germany
- Stefan Posch
  - Institute of Computer Science, Martin Luther University Halle–Wittenberg, Von-Seckendorff-Platz 1, Halle (Saale), 06120 Germany
- Jan Grau
  - Institute of Computer Science, Martin Luther University Halle–Wittenberg, Von-Seckendorff-Platz 1, Halle (Saale), 06120 Germany
|
15
|
Cabot B, Cabot RA. Chromatin remodeling in mammalian embryos. Reproduction 2018; 155:R147-R158. [PMID: 29339454 DOI: 10.1530/rep-17-0488] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 08/04/2017] [Accepted: 01/12/2018] [Indexed: 12/28/2022]
Abstract
The mammalian embryo undergoes a dramatic amount of epigenetic remodeling during the first week of development. In this review, we discuss several epigenetic changes that happen over the course of cleavage development, focusing on covalent marks (e.g., histone methylation and acetylation) and non-covalent remodeling (chromatin remodeling via remodeling complexes; e.g., SWI/SNF-mediated chromatin remodeling). Comparisons are also drawn between remodeling events that occur in embryos from a variety of mammalian species.
Affiliation(s)
- Birgit Cabot
  - Department of Animal Sciences, Purdue University, West Lafayette, Indiana, USA
- Ryan A Cabot
  - Department of Animal Sciences, Purdue University, West Lafayette, Indiana, USA
|