1
Huo Q, Song R, Ma Z. Recent advances in exploring transcriptional regulatory landscape of crops. Frontiers in Plant Science 2024; 15:1421503. [PMID: 38903438 PMCID: PMC11188431 DOI: 10.3389/fpls.2024.1421503] [Received: 04/22/2024] [Accepted: 05/23/2024] [Indexed: 06/22/2024]
Abstract
Crop breeding entails developing and selecting plant varieties with improved agronomic traits. Modern molecular techniques, such as genome editing, enable more efficient manipulation of plant phenotype by altering the expression of particular regulatory or functional genes. Hence, it is essential to thoroughly comprehend the transcriptional regulatory mechanisms that underpin these traits. In the multi-omics era, a large amount of omics data has been generated for diverse crop species, including genomics, epigenomics, transcriptomics, proteomics, and single-cell omics. The abundant data resources and the emergence of advanced computational tools offer unprecedented opportunities for obtaining a holistic view and profound understanding of the regulatory processes linked to desirable traits. This review focuses on integrated network approaches that utilize multi-omics data to investigate gene expression regulation. Various types of regulatory networks and their inference methods are discussed, focusing on recent advancements in crop plants. The integration of multi-omics data has been proven to be crucial for the construction of high-confidence regulatory networks. With the refinement of these methodologies, they will significantly enhance crop breeding efforts and contribute to global food security.
Affiliation(s)
- Zeyang Ma
  - State Key Laboratory of Maize Bio-breeding, Frontiers Science Center for Molecular Design Breeding, Joint International Research Laboratory of Crop Molecular Breeding, National Maize Improvement Center, College of Agronomy and Biotechnology, China Agricultural University, Beijing, China
2
Bell CC, Balic JJ, Talarmain L, Gillespie A, Scolamiero L, Lam EYN, Ang CS, Faulkner GJ, Gilan O, Dawson MA. Comparative cofactor screens show the influence of transactivation domains and core promoters on the mechanisms of transcription. Nat Genet 2024; 56:1181-1192. [PMID: 38769457 DOI: 10.1038/s41588-024-01749-z] [Received: 11/30/2023] [Accepted: 04/09/2024] [Indexed: 05/22/2024]
Abstract
Eukaryotic transcription factors (TFs) activate gene expression by recruiting cofactors to promoters. However, the relationships between TFs, promoters and their associated cofactors remain poorly understood. Here we combine GAL4-transactivation assays with comparative CRISPR-Cas9 screens to identify the cofactors used by nine different TFs and core promoters in human cells. Using this dataset, we associate TFs with cofactors, classify cofactors as ubiquitous or specific and discover transcriptional co-dependencies. Through a reductionistic, comparative approach, we demonstrate that TFs do not display discrete mechanisms of activation. Instead, each TF depends on a unique combination of cofactors, which influences distinct steps in transcription. By contrast, the influence of core promoters appears relatively discrete. Different promoter classes are constrained by either initiation or pause-release, which influences their dynamic range and compatibility with cofactors. Overall, our comparative cofactor screens characterize the interplay between TFs, cofactors and core promoters, identifying general principles by which they influence transcription.
Affiliation(s)
- Charles C Bell
  - Cancer Research Division, Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia
  - Sir Peter MacCallum Department of Oncology, University of Melbourne, Melbourne, Victoria, Australia
  - Mater Research Institute, University of Queensland, TRI Building, Woolloongabba, Queensland, Australia
- Jesse J Balic
  - Cancer Research Division, Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia
  - Sir Peter MacCallum Department of Oncology, University of Melbourne, Melbourne, Victoria, Australia
- Laure Talarmain
  - Cancer Research Division, Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia
  - Sir Peter MacCallum Department of Oncology, University of Melbourne, Melbourne, Victoria, Australia
- Andrea Gillespie
  - Cancer Research Division, Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia
- Laura Scolamiero
  - Cancer Research Division, Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia
  - Sir Peter MacCallum Department of Oncology, University of Melbourne, Melbourne, Victoria, Australia
- Enid Y N Lam
  - Cancer Research Division, Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia
- Ching-Seng Ang
  - Bio21 Mass Spectrometry and Proteomics Facility, The University of Melbourne, Parkville, Victoria, Australia
- Geoffrey J Faulkner
  - Mater Research Institute, University of Queensland, TRI Building, Woolloongabba, Queensland, Australia
  - Queensland Brain Institute, University of Queensland, Brisbane, Queensland, Australia
- Omer Gilan
  - Cancer Research Division, Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia
  - Australian Centre for Blood Diseases, Monash University, Melbourne, Victoria, Australia
- Mark A Dawson
  - Cancer Research Division, Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia
  - Sir Peter MacCallum Department of Oncology, University of Melbourne, Melbourne, Victoria, Australia
  - Department of Haematology, Peter MacCallum Cancer Centre, Melbourne, Victoria, Australia
  - Centre for Cancer Research, University of Melbourne, Melbourne, Victoria, Australia
3
Baumgarten N, Rumpf L, Kessler T, Schulz MH. A statistical approach for identifying single nucleotide variants that affect transcription factor binding. iScience 2024; 27:109765. [PMID: 38736546 PMCID: PMC11088338 DOI: 10.1016/j.isci.2024.109765] [Received: 07/14/2023] [Revised: 01/30/2024] [Accepted: 04/15/2024] [Indexed: 05/14/2024]
Abstract
Non-coding variants located within regulatory elements may alter gene expression by modifying transcription factor (TF) binding sites, thereby leading to functional consequences. Different TF models are being used to assess the effect of DNA sequence variants, such as single nucleotide variants (SNVs). Often existing methods are slow and do not assess statistical significance of results. We investigated the distribution of absolute maximal differential TF binding scores for general computational models that affect TF binding. We find that a modified Laplace distribution can adequately approximate the empirical distributions. A benchmark on in vitro and in vivo datasets showed that our approach improves upon an existing method in terms of performance and speed. Applications on eQTLs and on a genome-wide association study illustrate the usefulness of our statistics by highlighting cell type-specific regulators and target genes. An implementation of our approach is freely available on GitHub and as bioconda package.
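As a rough illustration of the idea summarized in this abstract, the sketch below scores a reference and an alternate allele with a log-odds position weight matrix, takes the maximal absolute score difference over all windows that overlap the variant, and converts it to a tail probability under a plain, unmodified Laplace assumption. The matrix values, the scale parameter `b`, the example sequences, and all function names are hypothetical; the authors' modified Laplace distribution and their released implementation are not reproduced here.

```python
import math

# Hypothetical log-odds PWM for a 3-bp motif (one dict of base scores per position).
PWM = [
    {"A": 1.2, "C": -0.8, "G": -0.5, "T": -1.0},
    {"A": -1.1, "C": 1.5, "G": -0.9, "T": -0.4},
    {"A": -0.7, "C": -1.2, "G": 1.3, "T": -0.6},
]

def window_score(seq, start, pwm):
    """Sum of log-odds scores for the window of len(pwm) beginning at start."""
    return sum(pwm[i][seq[start + i]] for i in range(len(pwm)))

def max_abs_diff(ref_seq, alt_seq, snv_pos, pwm):
    """Largest |score(ref) - score(alt)| over windows that overlap the SNV."""
    width = len(pwm)
    first = max(0, snv_pos - width + 1)
    last = min(snv_pos, len(ref_seq) - width)
    return max(
        abs(window_score(ref_seq, s, pwm) - window_score(alt_seq, s, pwm))
        for s in range(first, last + 1)
    )

def laplace_abs_pvalue(d, b):
    """P(|X| >= d) for X ~ Laplace(0, b); |X| is exponential with scale b."""
    return math.exp(-d / b)

ref = "TTACGTT"
alt = "TTAAGTT"  # C>A substitution at index 3 (made-up example)
d_max = max_abs_diff(ref, alt, snv_pos=3, pwm=PWM)
p = laplace_abs_pvalue(d_max, b=0.5)  # b = 0.5 is an assumed scale, not a fitted value
print(f"max |diff| = {d_max:.2f}, p = {p:.4g}")
```

In practice the scale would be estimated per TF model from the empirical distribution of differences rather than fixed by hand, and only the forward strand is scanned here for brevity.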
Affiliation(s)
- Nina Baumgarten
  - Institute of Cardiovascular Regeneration, Goethe University, 60590 Frankfurt am Main, Germany
  - Institute for Computational Genomic Medicine, Goethe University, 60590 Frankfurt am Main, Germany
  - Institute for Computer Science, Goethe University, 60590 Frankfurt am Main, Germany
  - German Center for Cardiovascular Research, Partner Site Rhein-Main, 60590 Frankfurt am Main, Germany
- Laura Rumpf
  - Institute of Cardiovascular Regeneration, Goethe University, 60590 Frankfurt am Main, Germany
  - Institute for Computational Genomic Medicine, Goethe University, 60590 Frankfurt am Main, Germany
  - Institute for Computer Science, Goethe University, 60590 Frankfurt am Main, Germany
  - German Center for Cardiovascular Research, Partner Site Rhein-Main, 60590 Frankfurt am Main, Germany
- Thorsten Kessler
  - German Heart Centre Munich, Department of Cardiology, School of Medicine and Health, Technical University of Munich, 80636 Munich, Germany
  - German Centre for Cardiovascular Research, Partner Site Munich Heart Alliance, 80636 Munich, Germany
- Marcel H. Schulz
  - Institute of Cardiovascular Regeneration, Goethe University, 60590 Frankfurt am Main, Germany
  - Institute for Computational Genomic Medicine, Goethe University, 60590 Frankfurt am Main, Germany
  - Institute for Computer Science, Goethe University, 60590 Frankfurt am Main, Germany
  - German Center for Cardiovascular Research, Partner Site Rhein-Main, 60590 Frankfurt am Main, Germany
4
Marešová A, Oravcová M, Rodríguez-López M, Hradilová M, Zemlianski V, Häsler R, Hernández P, Bähler J, Převorovský M. Critical importance of DNA binding for CSL protein functions in fission yeast. J Cell Sci 2024; 137:jcs261568. [PMID: 38482739 DOI: 10.1242/jcs.261568] [Received: 08/22/2023] [Accepted: 03/07/2024] [Indexed: 05/01/2024]
Abstract
CSL proteins [named after the homologs CBF1 (RBP-Jκ in mice), Suppressor of Hairless and LAG-1] are conserved transcription factors found in animals and fungi. In the fission yeast Schizosaccharomyces pombe, they regulate various cellular processes, including cell cycle progression, lipid metabolism and cell adhesion. CSL proteins bind to DNA through their N-terminal Rel-like domain and central β-trefoil domain. Here, we investigated the importance of DNA binding for CSL protein functions in fission yeast. We created CSL protein mutants with disrupted DNA binding and found that the vast majority of CSL protein functions depend on intact DNA binding. Specifically, DNA binding is crucial for the regulation of cell adhesion, lipid metabolism, cell cycle progression, long non-coding RNA expression and genome integrity maintenance. Interestingly, perturbed lipid metabolism leads to chromatin structure changes, potentially linking lipid metabolism to the diverse phenotypes associated with CSL protein functions. Our study highlights the critical role of DNA binding for CSL protein functions in fission yeast.
Affiliation(s)
- Anna Marešová
  - Department of Cell Biology, Faculty of Science, Charles University, Viničná 7, 128 00 Prague 2, Czechia
- Martina Oravcová
  - Department of Cell Biology, Faculty of Science, Charles University, Viničná 7, 128 00 Prague 2, Czechia
- María Rodríguez-López
  - Department of Cellular and Molecular Biology, Centro de Investigaciones Biológicas Margarita Salas, Consejo Superior de Investigaciones Científicas, Ramiro de Maeztu 9, 28040 Madrid, Spain
- Miluše Hradilová
  - Laboratory of Genomics and Bioinformatics, Institute of Molecular Genetics of the Czech Academy of Sciences, Vídeňská 1083, 142 20 Prague 4, Czechia
- Viacheslav Zemlianski
  - Department of Cell Biology, Faculty of Science, Charles University, Viničná 7, 128 00 Prague 2, Czechia
- Robert Häsler
  - Center for Inflammatory Skin Diseases, Department of Dermatology and Allergy, University Hospital Schleswig-Holstein, Campus Kiel, Rosalind-Franklin-Straße 9, 24105 Kiel, Germany
- Pablo Hernández
  - Department of Cellular and Molecular Biology, Centro de Investigaciones Biológicas Margarita Salas, Consejo Superior de Investigaciones Científicas, Ramiro de Maeztu 9, 28040 Madrid, Spain
- Jürg Bähler
  - Institute of Healthy Ageing and Department of Genetics, Evolution and Environment, University College London, Gower Street, London WC1E 6BT, UK
- Martin Převorovský
  - Department of Cell Biology, Faculty of Science, Charles University, Viničná 7, 128 00 Prague 2, Czechia
5
Gibson TJ, Larson ED, Harrison MM. Protein-intrinsic properties and context-dependent effects regulate pioneer factor binding and function. Nat Struct Mol Biol 2024; 31:548-558. [PMID: 38365978 PMCID: PMC11261375 DOI: 10.1038/s41594-024-01231-8] [Received: 04/03/2023] [Accepted: 01/22/2024] [Indexed: 02/18/2024]
Abstract
Chromatin is a barrier to the binding of many transcription factors. By contrast, pioneer factors access nucleosomal targets and promote chromatin opening. Despite binding to target motifs in closed chromatin, many pioneer factors display cell-type-specific binding and activity. The mechanisms governing pioneer factor occupancy and the relationship between chromatin occupancy and opening remain unclear. We studied three Drosophila transcription factors with distinct DNA-binding domains and biological functions: Zelda, Grainy head and Twist. We demonstrated that the level of chromatin occupancy is a key determinant of pioneering activity. Multiple factors regulate occupancy, including motif content, local chromatin and protein concentration. Regions outside the DNA-binding domain are required for binding and chromatin opening. Our results show that pioneering activity is not a binary feature intrinsic to a protein but occurs on a spectrum and is regulated by a variety of protein-intrinsic and cell-type-specific features.
Affiliation(s)
- Tyler J Gibson
  - Department of Biomolecular Chemistry, University of Wisconsin-Madison, Madison, WI, USA
- Elizabeth D Larson
  - Department of Biomolecular Chemistry, University of Wisconsin-Madison, Madison, WI, USA
- Melissa M Harrison
  - Department of Biomolecular Chemistry, University of Wisconsin-Madison, Madison, WI, USA
6
Xu J, Gao J, Ni P, Gerstein M. Less-is-more: selecting transcription factor binding regions informative for motif inference. Nucleic Acids Res 2024; 52:e20. [PMID: 38214231 PMCID: PMC10899791 DOI: 10.1093/nar/gkad1240] [Received: 12/08/2022] [Revised: 12/06/2023] [Accepted: 12/17/2023] [Indexed: 01/13/2024]
Abstract
Numerous statistical methods have emerged for inferring DNA motifs for transcription factors (TFs) from genomic regions. However, the process of selecting informative regions for motif inference remains understudied. Current approaches select regions with strong ChIP-seq signal for a given TF, assuming that such strong signal primarily results from specific interactions between the TF and its motif. Additionally, these selection approaches do not account for non-target motifs, i.e. motifs of other TFs; they presume the occurrence of these non-target motifs infrequent compared to that of the target motif, and thus assume these have minimal interference with the identification of the target. Leveraging extensive ChIP-seq datasets, we introduced the concept of TF signal 'crowdedness', referred to as C-score, for each genomic region. The C-score helps in highlighting TF signals arising from non-specific interactions. Moreover, by considering the C-score (and adjusting for the length of genomic regions), we can effectively mitigate interference of non-target motifs. Using these tools, we find that in many instances, strong ChIP-seq signal stems mainly from non-specific interactions, and the occurrence of non-target motifs significantly impacts the accurate inference of the target motif. Prioritizing genomic regions with reduced crowdedness and short length markedly improves motif inference. This 'less-is-more' effect suggests that ChIP-seq region selection warrants more attention.
Affiliation(s)
- Jinrui Xu
  - Department of Biology, Howard University, Washington, DC 20059, USA
  - Center for Applied Data Science and Analytics, Howard University, Washington, DC 20059, USA
- Jiahao Gao
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
- Pengyu Ni
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
- Mark Gerstein
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT 06520, USA
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
  - Department of Computer Science, Yale University, New Haven, CT 06520, USA
  - Department of Statistics and Data Science, Yale University, New Haven, CT 06520, USA
7
Schiopu I, Dragomir I, Asandei A. Single molecule technique unveils the role of electrostatic interactions in ssDNA-gp32 molecular complex stability. RSC Adv 2024; 14:5449-5460. [PMID: 38352678 PMCID: PMC10862658 DOI: 10.1039/d3ra07746b] [Received: 11/13/2023] [Accepted: 02/07/2024] [Indexed: 02/16/2024]
Abstract
The exploration of single-strand DNA-binding protein (SSB)-ssDNA interactions and their crucial roles in essential biological processes lagged behind other types of protein-nucleic acid interactions, such as protein-dsDNA and protein-RNA interactions. The ssDNA binding protein gene product 32 (gp32) of the T4 bacteriophage is a central integrating component of the replication complex that must continuously bind to and unbind from transiently exposed template strands during the DNA synthesis. To gain deeper insights into the electrostatic conditions influencing the stability of the ssDNA-gp32 molecular complex, like the salt concentration or some metal ions proven to specifically bind to gp32, we employed a method that performs rapid measurements of the DNA-protein stability using an α-Hemolysin (α-HL) protein nanopore. We indirectly probed the stability of a protein-nucleic acid complex by monitoring the dissociation process between the gp32 protein and the ssDNA molecular complex in single-molecular electrophysiology experiments, but also through fluorescence spectroscopy techniques. We have shown that the complex is more stable in 0.5 M KCl solution than in 2 M KCl solution and that the presence of Zn2+ ions further increases this stability for any salt used in the present study. This method can be applied to other nucleic acid-protein molecular complexes, as well as for an accurate determination of the drug-protein carrier stability.
Affiliation(s)
- Irina Schiopu
  - The Institute of Interdisciplinary Research, Department of Exact Sciences and Natural Sciences, "Alexandru Ioan Cuza" University of Iaşi, 700506 Iasi, Romania
- Isabela Dragomir
  - The Institute of Interdisciplinary Research, Department of Exact Sciences and Natural Sciences, "Alexandru Ioan Cuza" University of Iaşi, 700506 Iasi, Romania
- Alina Asandei
  - The Institute of Interdisciplinary Research, Department of Exact Sciences and Natural Sciences, "Alexandru Ioan Cuza" University of Iaşi, 700506 Iasi, Romania
8
Hunt G, Vaid R, Pirogov S, Pfab A, Ziegenhain C, Sandberg R, Reimegård J, Mannervik M. Tissue-specific RNA Polymerase II promoter-proximal pause release and burst kinetics in a Drosophila embryonic patterning network. Genome Biol 2024; 25:2. [PMID: 38166964 PMCID: PMC10763363 DOI: 10.1186/s13059-023-03135-0] [Received: 02/19/2023] [Accepted: 11/30/2023] [Indexed: 01/05/2024]
Abstract
BACKGROUND: Formation of tissue-specific transcriptional programs underlies multicellular development, including dorsoventral (DV) patterning of the Drosophila embryo. This involves interactions between transcriptional enhancers and promoters in a chromatin context, but how the chromatin landscape influences transcription is not fully understood. RESULTS: Here we comprehensively resolve differential transcriptional and chromatin states during Drosophila DV patterning. We find that RNA Polymerase II pausing is established at DV promoters prior to zygotic genome activation (ZGA), that pausing persists irrespective of cell fate, but that release into productive elongation is tightly regulated and accompanied by tissue-specific P-TEFb recruitment. DV enhancers acquire distinct tissue-specific chromatin states through CBP-mediated histone acetylation that predict the transcriptional output of target genes, whereas promoter states are more tissue-invariant. Transcriptome-wide inference of burst kinetics in different cell types revealed that while DV genes are generally characterized by a high burst size, either burst size or frequency can differ between tissues. CONCLUSIONS: The data suggest that pausing is established by pioneer transcription factors prior to ZGA and that release from pausing is imparted by enhancer chromatin state to regulate bursting in a tissue-specific manner in the early embryo. Our results uncover how developmental patterning is orchestrated by tissue-specific bursts of transcription from Pol II primed promoters in response to enhancer regulatory cues.
Affiliation(s)
- George Hunt
  - Department Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Stockholm, Sweden
- Roshan Vaid
  - Department Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Stockholm, Sweden
- Sergei Pirogov
  - Department Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Stockholm, Sweden
- Alexander Pfab
  - Department Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Stockholm, Sweden
- Rickard Sandberg
  - Department Cell and Molecular Biology, Karolinska Institutet, Stockholm, Sweden
- Johan Reimegård
  - Department Cell and Molecular Biology, National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Mattias Mannervik
  - Department Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Stockholm, Sweden
9
de Boer CG, Taipale J. Hold out the genome: a roadmap to solving the cis-regulatory code. Nature 2024; 625:41-50. [PMID: 38093018 DOI: 10.1038/s41586-023-06661-w] [Received: 02/17/2023] [Accepted: 09/20/2023] [Indexed: 01/05/2024]
Abstract
Gene expression is regulated by transcription factors that work together to read cis-regulatory DNA sequences. The 'cis-regulatory code' - how cells interpret DNA sequences to determine when, where and how much genes should be expressed - has proven to be exceedingly complex. Recently, advances in the scale and resolution of functional genomics assays and machine learning have enabled substantial progress towards deciphering this code. However, the cis-regulatory code will probably never be solved if models are trained only on genomic sequences; regions of homology can easily lead to overestimation of predictive performance, and our genome is too short and has insufficient sequence diversity to learn all relevant parameters. Fortunately, randomly synthesized DNA sequences enable testing a far larger sequence space than exists in our genomes, and designed DNA sequences enable targeted queries to maximally improve the models. As the same biochemical principles are used to interpret DNA regardless of its source, models trained on these synthetic data can predict genomic activity, often better than genome-trained models. Here we provide an outlook on the field, and propose a roadmap towards solving the cis-regulatory code by a combination of machine learning and massively parallel assays using synthetic DNA.
Affiliation(s)
- Carl G de Boer
  - School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia, Canada
- Jussi Taipale
  - Applied Tumor Genomics Research Program, Faculty of Medicine, University of Helsinki, Helsinki, Finland
  - Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Stockholm, Sweden
  - Department of Biochemistry, University of Cambridge, Cambridge, UK
10
Gera T, Kumar DK, Yaakov G, Barkai N, Jonas F. ChEC-Seq: A Comprehensive Guide for Scalable and Cost-Efficient Genome-Wide Profiling in Saccharomyces cerevisiae. Methods Mol Biol 2024; 2846:263-283. [PMID: 39141241 DOI: 10.1007/978-1-0716-4071-5_16] [Indexed: 08/15/2024]
Abstract
Chromatin endogenous cleavage coupled with high-throughput sequencing (ChEC-seq) is a profiling method for protein-DNA interactions that can detect binding locations in vivo, does not require antibodies or fixation, and provides genome-wide coverage at near nucleotide resolution. The core of this method is an MNase fusion of the target protein, which allows it, when triggered by calcium exposure, to cut DNA at its binding sites and to generate small DNA fragments that can be readily separated from the rest of the genome and sequenced. Improvements since the original protocol have increased the ease, lowered the costs, and multiplied the throughput of this method to enable a scale and resolution of experiments not available with traditional methods such as ChIP-seq. This method describes each step from the initial creation and verification of the MNase-tagged yeast strains, over the ChEC MNase activation and small fragment purification procedure to the sequencing library preparation. It also briefly touches on the bioinformatic steps necessary to create meaningful genome-wide binding profiles.
Affiliation(s)
- Tamar Gera
  - Department of Molecular Genetics, Weizmann Institute, Rehovot, Israel
- Gilad Yaakov
  - Department of Molecular Genetics, Weizmann Institute, Rehovot, Israel
- Naama Barkai
  - Department of Molecular Genetics, Weizmann Institute, Rehovot, Israel
- Felix Jonas
  - School of Science, Constructor University, Bremen, Germany
11
Ansari SA, Uhlenhaut NH. An Optimized High-Resolution Mapping Method for Glucocorticoid Receptor-DNA Binding in Mouse Primary Macrophages. Methods Mol Biol 2024; 2846:91-107. [PMID: 39141231 DOI: 10.1007/978-1-0716-4071-5_6] [Indexed: 08/15/2024]
Abstract
ChIP-exo is a powerful tool for achieving enhanced sensitivity and single-base-pair resolution of transcription factor (TF) binding, which utilizes a combination of chromatin immunoprecipitation (ChIP) and lambda exonuclease digestion (exo) followed by high-throughput sequencing. ChIP-nexus (chromatin immunoprecipitation experiments with nucleotide resolution through exonuclease, unique barcode, and single ligation) is an updated and simplified version of the original ChIP-exo method, which has reported an efficient adapter ligation through the DNA circularization step. Building upon an established method, we present a protocol for generating NGS (next-generation sequencing) ready and high-quality ChIP-nexus library for glucocorticoid receptor (GR). This method is specifically optimized for bone marrow-derived macrophage (BMDM) cells. The protocol is initiated by the formation of DNA-protein cross-links in intact cells. This is followed by chromatin shearing, chromatin immunoprecipitation, ligation of sequencing adapters, digestion of adapter-ligated DNA using lambda exonuclease, and purification of single-stranded DNA for circularization and library amplification.
Affiliation(s)
- Suhail A Ansari
  - German Center for Diabetes Research (DZD), Neuherberg, Germany
  - Institute for Diabetes and Endocrinology (IDE), Helmholtz Zentrum Muenchen, Neuherberg, Germany
- Nina Henriette Uhlenhaut
  - Institute for Diabetes and Endocrinology (IDE), Helmholtz Center Munich (HMGU), Neuherberg, Germany
  - German Center for Diabetes Research (DZD), Neuherberg, Germany
  - Metabolic Programming, School of Life Sciences Weihenstephan, ZIEL - Institute for Food and Health, Technical University of Munich (TUM), Freising, Germany
12
Perez AA, Goronzy IN, Blanco MR, Guo JK, Guttman M. ChIP-DIP: A multiplexed method for mapping hundreds of proteins to DNA uncovers diverse regulatory elements controlling gene expression. bioRxiv: the preprint server for biology 2023:2023.12.14.571730. [PMID: 38187704 PMCID: PMC10769186 DOI: 10.1101/2023.12.14.571730] [Indexed: 01/09/2024]
Abstract
Gene expression is controlled by the dynamic localization of thousands of distinct regulatory proteins to precise regions of DNA. Understanding this cell-type specific process has been a goal of molecular biology for decades yet remains challenging because most current DNA-protein mapping methods study one protein at a time. To overcome this, we developed ChIP-DIP (ChIP Done In Parallel), a split-pool based method that enables simultaneous, genome-wide mapping of hundreds of diverse regulatory proteins in a single experiment. We demonstrate that ChIP-DIP generates highly accurate maps for all classes of DNA-associated proteins, including histone modifications, chromatin regulators, transcription factors, and RNA Polymerases. Using these data, we explore quantitative combinations of protein localization on genomic DNA to define distinct classes of regulatory elements and their functional activity. Our data demonstrate that ChIP-DIP enables the generation of 'consortium level', context-specific protein localization maps within any molecular biology lab.
13
Schmidt CA, Hodkinson LJ, Comstra HS, Khan S, Torres H, Rieder LE. A cost-free CURE: using bioinformatics to identify DNA-binding factors at a specific genomic locus. Journal of Microbiology & Biology Education 2023; 24:e00120-23. [PMID: 38107989 PMCID: PMC10720551 DOI: 10.1128/jmbe.00120-23] [Received: 07/20/2023] [Accepted: 09/14/2023] [Indexed: 12/19/2023]
Abstract
Research experiences provide diverse benefits for undergraduates. Many academic institutions have adopted course-based undergraduate research experiences (CUREs) to improve student access to research opportunities. However, potential instructors of a CURE might still face financial or practical hurdles that prevent implementation. Bioinformatics research offers an alternative that is free, safe, compatible with remote learning, and may be more accessible for students with disabilities. Here, we describe a bioinformatics CURE that leverages publicly available datasets to discover novel proteins that target an instructor-determined genomic locus of interest. We use the free, user-friendly bioinformatics platform Galaxy to map ChIP-seq datasets to a genome, which removes the computing burden from students. Both faculty and students directly benefit from this CURE, as faculty can perform candidate screens and publish CURE results. Students gain not only basic bioinformatics knowledge, but also transferable skills, including scientific communication, database navigation, and primary literature experience. The CURE is flexible and can be expanded to analyze different types of high-throughput data or to investigate different genomic loci in any species.
Affiliation(s)
| | - Lauren J. Hodkinson
- Graduate Program in Genetics and Molecular Biology, Emory University, Atlanta, Georgia, USA
| | - H. Skye Comstra
- Department of Biology, Emory University, Atlanta, Georgia, USA
| | - Samia Khan
- Department of Biology, Emory University, Atlanta, Georgia, USA
| | | | - Leila E. Rieder
- Department of Biology, Emory University, Atlanta, Georgia, USA
- Graduate Program in Genetics and Molecular Biology, Emory University, Atlanta, Georgia, USA
|
14
|
Qin Z, Zhang K, He P, Zhang X, Xie M, Fu Y, Gu C, Zhu Y, Tong A, Wei H, Zhang C, Xiang Y. Discovering covalent inhibitors of protein-protein interactions from trillions of sulfur(VI) fluoride exchange-modified oligonucleotides. Nat Chem 2023; 15:1705-1714. [PMID: 37653229 DOI: 10.1038/s41557-023-01304-z] [Received: 06/04/2022] [Accepted: 07/24/2023] [Indexed: 09/02/2023]
Abstract
Molecules that covalently engage target proteins are widely used as activity-based probes and covalent drugs. The performance of these covalent inhibitors is, however, often compromised by the paradox of efficacy and risk, which demands a balance between reactivity and selectivity. The challenge is more evident when targeting protein-protein interactions owing to their low ligandability and undefined reactivity. Here we report sulfur(VI) fluoride exchange (SuFEx) in vitro selection, a general platform for high-throughput discovery of covalent inhibitors from trillions of SuFEx-modified oligonucleotides. With SuFEx in vitro selection, we identified covalent inhibitors that cross-link distinct residues of the SARS-CoV-2 spike protein at its protein-protein interaction interface with the human angiotensin-converting enzyme 2. A separate suite of covalent inhibitors was isolated for the human complement C5 protein. In both cases, we observed a clear disconnection between binding affinity and cross-linking reactivity, indicating that direct search for the aimed reactivity, as enabled by SuFEx in vitro selection, is vital for discovering covalent inhibitors of high selectivity and potency.
Affiliation(s)
- Zichen Qin
- Department of Chemistry, Key Laboratory of Bioorganic Phosphorus Chemistry and Chemical Biology (Ministry of Education), Tsinghua University, Beijing, China
| | - Kaining Zhang
- Department of Chemistry, Key Laboratory of Bioorganic Phosphorus Chemistry and Chemical Biology (Ministry of Education), Tsinghua University, Beijing, China
| | - Ping He
- CAS Key Laboratory of Special Pathogens and Biosafety, Centre for Biosafety Mega-Science, Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan, China
| | - Xue Zhang
- School of Chemistry and Chemical Engineering, Frontiers Science Center for Transformative Molecules, Shanghai Key Laboratory for Molecular Engineering of Chiral Drugs, State Key Laboratory of Metal Matrix Composites, Shanghai Jiao Tong University, Shanghai, China
| | - Miao Xie
- School of Chemistry and Chemical Engineering, Frontiers Science Center for Transformative Molecules, Shanghai Key Laboratory for Molecular Engineering of Chiral Drugs, State Key Laboratory of Metal Matrix Composites, Shanghai Jiao Tong University, Shanghai, China
| | - Yucheng Fu
- Department of Orthopedics, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - Chunmei Gu
- Department of Chemistry, Key Laboratory of Bioorganic Phosphorus Chemistry and Chemical Biology (Ministry of Education), Tsinghua University, Beijing, China
- Beijing Institute of Collaborative Innovation (BICI), Beijing, China
| | - Yiying Zhu
- Department of Chemistry, Key Laboratory of Bioorganic Phosphorus Chemistry and Chemical Biology (Ministry of Education), Tsinghua University, Beijing, China
| | - Aijun Tong
- Department of Chemistry, Key Laboratory of Bioorganic Phosphorus Chemistry and Chemical Biology (Ministry of Education), Tsinghua University, Beijing, China
| | - Hongping Wei
- CAS Key Laboratory of Special Pathogens and Biosafety, Centre for Biosafety Mega-Science, Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan, China
| | - Chuan Zhang
- School of Chemistry and Chemical Engineering, Frontiers Science Center for Transformative Molecules, Shanghai Key Laboratory for Molecular Engineering of Chiral Drugs, State Key Laboratory of Metal Matrix Composites, Shanghai Jiao Tong University, Shanghai, China
| | - Yu Xiang
- Department of Chemistry, Key Laboratory of Bioorganic Phosphorus Chemistry and Chemical Biology (Ministry of Education), Tsinghua University, Beijing, China.
|
15
|
Kaltbeitzel J, Wich PR. Protein-based Nanoparticles: From Drug Delivery to Imaging, Nanocatalysis and Protein Therapy. Angew Chem Int Ed Engl 2023; 62:e202216097. [PMID: 36917017 DOI: 10.1002/anie.202216097] [Received: 11/01/2022] [Revised: 03/12/2023] [Accepted: 03/13/2023] [Indexed: 03/16/2023]
Abstract
Proteins and enzymes are versatile biomaterials for a wide range of medical applications due to their high specificity for receptors and substrates, high degradability, low toxicity, and overall good biocompatibility. Protein nanoparticles are formed by the arrangement of several native or modified proteins into nanometer-sized assemblies. In this review, we will focus on artificial nanoparticle systems, where proteins are the main structural element and not just an encapsulated payload. While under natural conditions, only certain proteins form defined aggregates and nanoparticles, chemical modifications or a change in the physical environment can further extend the pool of available building blocks. This allows the assembly of many globular proteins and even enzymes. These advances in preparation methods led to the emergence of new generations of nanosystems that extend beyond transport vehicles to diverse applications, from multifunctional drug delivery to imaging, nanocatalysis and protein therapy.
Affiliation(s)
- Jonas Kaltbeitzel
- School of Chemical Engineering, University of New South Wales, Sydney, NSW 2052, Australia
- Australian Centre for NanoMedicine, University of New South Wales, Sydney, NSW 2052, Australia
| | - Peter R Wich
- School of Chemical Engineering, University of New South Wales, Sydney, NSW 2052, Australia
- Australian Centre for NanoMedicine, University of New South Wales, Sydney, NSW 2052, Australia
|
16
|
Brennan KJ, Weilert M, Krueger S, Pampari A, Liu HY, Yang AWH, Morrison JA, Hughes TR, Rushlow CA, Kundaje A, Zeitlinger J. Chromatin accessibility in the Drosophila embryo is determined by transcription factor pioneering and enhancer activation. Dev Cell 2023; 58:1898-1916.e9. [PMID: 37557175 PMCID: PMC10592203 DOI: 10.1016/j.devcel.2023.07.007] [Received: 12/28/2022] [Revised: 05/09/2023] [Accepted: 07/13/2023] [Indexed: 08/11/2023]
Abstract
Chromatin accessibility is integral to the process by which transcription factors (TFs) read out cis-regulatory DNA sequences, but it is difficult to differentiate between TFs that drive accessibility and those that do not. Deep learning models that learn complex sequence rules provide an unprecedented opportunity to dissect this problem. Using zygotic genome activation in Drosophila as a model, we analyzed high-resolution TF binding and chromatin accessibility data with interpretable deep learning and performed genetic validation experiments. We identify a hierarchical relationship between the pioneer TF Zelda and the TFs involved in axis patterning. Zelda consistently pioneers chromatin accessibility proportional to motif affinity, whereas patterning TFs augment chromatin accessibility in sequence contexts where they mediate enhancer activation. We conclude that chromatin accessibility occurs in two tiers: one through pioneering, which makes enhancers accessible but not necessarily active, and the second when the correct combination of TFs leads to enhancer activation.
Affiliation(s)
- Kaelan J Brennan
- Stowers Institute for Medical Research, Kansas City, MO 64110, USA
| | - Melanie Weilert
- Stowers Institute for Medical Research, Kansas City, MO 64110, USA
| | - Sabrina Krueger
- Stowers Institute for Medical Research, Kansas City, MO 64110, USA
| | - Anusri Pampari
- Department of Computer Science, Stanford University, Palo Alto, CA 94305, USA
| | - Hsiao-Yun Liu
- Department of Biology, New York University, New York, NY 10003, USA
| | - Ally W H Yang
- Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada
| | - Jason A Morrison
- Stowers Institute for Medical Research, Kansas City, MO 64110, USA
| | - Timothy R Hughes
- Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada
| | | | - Anshul Kundaje
- Department of Computer Science, Stanford University, Palo Alto, CA 94305, USA; Department of Genetics, Stanford University, Palo Alto, CA 94305, USA
| | - Julia Zeitlinger
- Stowers Institute for Medical Research, Kansas City, MO 64110, USA; Department of Pathology & Laboratory Medicine, The University of Kansas Medical Center, Kansas City, KS 66160, USA.
|
17
|
Alexandari AM, Horton CA, Shrikumar A, Shah N, Li E, Weilert M, Pufall MA, Zeitlinger J, Fordyce PM, Kundaje A. De novo distillation of thermodynamic affinity from deep learning regulatory sequence models of in vivo protein-DNA binding. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.05.11.540401. [PMID: 37214836 PMCID: PMC10197627 DOI: 10.1101/2023.05.11.540401] [Indexed: 05/24/2023]
Abstract
Transcription factors (TFs) are proteins that bind DNA in a sequence-specific manner to regulate gene transcription. Despite their unique intrinsic sequence preferences, in vivo genomic occupancy profiles of TFs differ across cellular contexts. Hence, deciphering the sequence determinants of TF binding, both intrinsic and context-specific, is essential to understand gene regulation and the impact of regulatory, non-coding genetic variation. Biophysical models trained on in vitro TF binding assays can estimate intrinsic affinity landscapes and predict occupancy based on TF concentration and affinity. However, these models cannot adequately explain context-specific, in vivo binding profiles. Conversely, deep learning models, trained on in vivo TF binding assays, effectively predict and explain genomic occupancy profiles as a function of complex regulatory sequence syntax, albeit without a clear biophysical interpretation. To reconcile these complementary models of in vitro and in vivo TF binding, we developed Affinity Distillation (AD), a method that extracts thermodynamic affinities de novo from deep learning models of TF chromatin immunoprecipitation (ChIP) experiments by marginalizing away the influence of genomic sequence context. Applied to neural networks modeling diverse classes of yeast and mammalian TFs, AD predicts energetic impacts of sequence variation within and surrounding motifs on TF binding as measured by diverse in vitro assays with superior dynamic range and accuracy compared to motif-based methods. Furthermore, AD can accurately discern affinities of TF paralogs. Our results highlight thermodynamic affinity as a key determinant of in vivo binding, suggest that deep learning models of in vivo binding implicitly learn high-resolution affinity landscapes, and show that these affinities can be successfully distilled using AD.
This new biophysical interpretation of deep learning models enables high-throughput in silico experiments to explore the influence of sequence context and variation on both intrinsic affinity and in vivo occupancy.
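The idea of "marginalizing away the influence of genomic sequence context" can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the abstract does not give the actual procedure, `toy_model` is a hypothetical stand-in for a trained deep learning model, and the function name, background length, and number of backgrounds are arbitrary choices. The sketch inserts a candidate site into many random backgrounds and averages the change in the model's prediction.

```python
import random

def marginal_effect(model, motif, n_backgrounds=200, length=50, seed=0):
    """Average change in model output when `motif` replaces the centre of
    random background sequences (context is averaged out)."""
    rng = random.Random(seed)
    start = (length - len(motif)) // 2
    total = 0.0
    for _ in range(n_backgrounds):
        background = "".join(rng.choice("ACGT") for _ in range(length))
        with_motif = background[:start] + motif + background[start + len(motif):]
        total += model(with_motif) - model(background)
    return total / n_backgrounds

def toy_model(seq):
    """Hypothetical stand-in for a trained model: a strong and a weak site."""
    return 2.0 * seq.count("CACGTG") + 0.5 * seq.count("CACGTT")

strong = marginal_effect(toy_model, "CACGTG")
weak = marginal_effect(toy_model, "CACGTT")
absent = marginal_effect(toy_model, "AAAAAA")
```

With a real model, the averaged scores for a panel of sequence variants would then be compared against in vitro affinity measurements; that comparison is not shown here.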
Affiliation(s)
- Amr M. Alexandari
- Department of Computer Science, Stanford University, Stanford, CA 94305
| | | | - Avanti Shrikumar
- Department of Earth System Science, Stanford University, Stanford, CA 94305
| | - Nilay Shah
- Stowers Institute for Medical Research, Kansas City, MO, USA
| | - Eileen Li
- Department of Genetics, Stanford University, Stanford, CA 94305
| | - Melanie Weilert
- Stowers Institute for Medical Research, Kansas City, MO, USA
| | - Miles A. Pufall
- Department of Biochemistry, Carver College of Medicine, University of Iowa, Iowa City, Iowa 52242, USA
| | - Julia Zeitlinger
- Stowers Institute for Medical Research, Kansas City, MO, USA
- The University of Kansas Medical Center, Kansas City, KS, USA
| | - Polly M. Fordyce
- Department of Genetics, Stanford University, Stanford, CA 94305
- Department of Bioengineering, Stanford University, Stanford, CA 94305
- ChEM-H Institute, Stanford University, Stanford, CA 94305
- Chan Zuckerberg Biohub, San Francisco, CA 94110
| | - Anshul Kundaje
- Department of Computer Science, Stanford University, Stanford, CA 94305
- Department of Genetics, Stanford University, Stanford, CA 94305
|
18
|
Gibson TJ, Harrison MM. Protein-intrinsic properties and context-dependent effects regulate pioneer-factor binding and function. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.03.18.533281. [PMID: 37066406 PMCID: PMC10103944 DOI: 10.1101/2023.03.18.533281] [Indexed: 06/19/2023]
Abstract
Chromatin is a barrier to the binding of many transcription factors. By contrast, pioneer factors access nucleosomal targets and promote chromatin opening. Despite binding to target motifs in closed chromatin, many pioneer factors display cell-type specific binding and activity. The mechanisms governing pioneer-factor occupancy and the relationship between chromatin occupancy and opening remain unclear. We studied three Drosophila transcription factors with distinct DNA-binding domains and biological functions: Zelda, Grainy head, and Twist. We demonstrated that the level of chromatin occupancy is a key determinant of pioneering activity. Multiple factors regulate occupancy, including motif content, local chromatin, and protein concentration. Regions outside the DNA-binding domain are required for binding and chromatin opening. Our results show that pioneering activity is not a binary feature intrinsic to a protein but occurs on a spectrum and is regulated by a variety of protein-intrinsic and cell-type-specific features.
Affiliation(s)
- Tyler J. Gibson
- Department of Biomolecular Chemistry, University of Wisconsin-Madison, Madison, WI
| | - Melissa M. Harrison
- Department of Biomolecular Chemistry, University of Wisconsin-Madison, Madison, WI
|
19
|
Using unique molecular identifiers to improve allele calling in low-template mixtures. Forensic Sci Int Genet 2023; 63:102807. [PMID: 36462297 DOI: 10.1016/j.fsigen.2022.102807] [Received: 06/14/2022] [Revised: 10/20/2022] [Accepted: 11/18/2022] [Indexed: 11/27/2022]
Abstract
PCR artifacts are an ever-present challenge in sequencing applications. These artifacts can seriously limit the analysis and interpretation of low-template samples and mixtures, especially with respect to a minor contributor. In medicine, molecular barcoding techniques have been employed to decrease the impact of PCR error and to allow the examination of low-abundance somatic variation. In principle, it should be possible to apply the same techniques to the forensic analysis of mixtures. To that end, several short tandem repeat loci were selected for targeted sequencing, and a bioinformatic pipeline for analyzing the sequence data was developed. The pipeline notes the relevant unique molecular identifiers (UMIs) attached to each read and, using machine learning, filters the noise products out of the set of potential alleles. To evaluate this pipeline, DNA from pairs of individuals was mixed at different ratios (1:1, 1:9) and sequenced with different starting amounts of DNA (10, 1 and 0.1 ng). Naïvely using the information in the molecular barcodes led to increased performance, with the machine learning resulting in an additional benefit. In concrete terms, using the UMI data results in less noise for a given amount of drop-out. For instance, if thresholds are selected that filter out a quarter of the true alleles, using read counts accepts 2381 noise alleles and using raw UMI counts accepts 1726 noise alleles, while the machine learning approach only accepts 307.
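The difference between read counts and raw UMI counts can be illustrated with a minimal sketch. The function names, the `(UMI, allele)` input tuples, and the toy data are assumptions for illustration, not the authors' pipeline, and the machine-learning filter is not reproduced.

```python
from collections import Counter

def read_counts(reads):
    """Count every read supporting each allele (PCR duplicates included)."""
    return Counter(allele for _umi, allele in reads)

def umi_counts(reads):
    """Count distinct UMIs per allele, so copies of one molecule count once."""
    return Counter(allele for _umi, allele in set(reads))

# Hypothetical toy data: one (UMI, called allele) pair per read.
reads = [
    ("AAAC", "12"), ("AAAC", "12"), ("AAAC", "12"),  # one molecule, amplified 3x
    ("GGTA", "12"),
    ("CCGT", "13"),
    ("AAAC", "11"),  # artifact read sharing a UMI with a true molecule
]

by_reads = read_counts(reads)  # allele "12" is supported by 4 reads
by_umis = umi_counts(reads)    # allele "12" is supported by 2 distinct UMIs
```

In this toy case raw UMI counting removes the amplification bias for allele "12" but still accepts the artifact allele "11", which is the kind of residual noise an additional filter would need to target.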
|
20
|
van der Sande M, Frölich S, van Heeringen SJ. Computational approaches to understand transcription regulation in development. Biochem Soc Trans 2023; 51:1-12. [PMID: 36695505 PMCID: PMC9988001 DOI: 10.1042/bst20210145] [Received: 10/28/2022] [Revised: 01/07/2023] [Accepted: 01/13/2023] [Indexed: 01/26/2023]
Abstract
Gene regulatory networks (GRNs) serve as useful abstractions to understand transcriptional dynamics in developmental systems. Computational prediction of GRNs has been successfully applied to genome-wide gene expression measurements with the advent of microarrays and RNA-sequencing. However, these inferred networks are inaccurate and mostly based on correlative rather than causative interactions. In this review, we highlight three approaches that significantly impact GRN inference: (1) moving from one genome-wide functional modality, gene expression, to multi-omics, (2) single cell sequencing, to measure cell type-specific signals and predict context-specific GRNs, and (3) neural networks as flexible models. Together, these experimental and computational developments have the potential to significantly impact the quality of inferred GRNs. Ultimately, accurately modeling the regulatory interactions between transcription factors and their target genes will be essential to understand the role of transcription factors in driving developmental gene expression programs and to derive testable hypotheses for validation.
Affiliation(s)
| | | | - Simon J. van Heeringen
- Radboud University, Department of Molecular Developmental Biology, Faculty of Science, Radboud Institute for Molecular Life Sciences, 6525GA Nijmegen, The Netherlands
|
21
|
Li Z, Gao E, Zhou J, Han W, Xu X, Gao X. Applications of deep learning in understanding gene regulation. CELL REPORTS METHODS 2023; 3:100384. [PMID: 36814848 PMCID: PMC9939384 DOI: 10.1016/j.crmeth.2022.100384] [Indexed: 05/22/2023]
Abstract
Gene regulation is a central topic in cell biology. Advances in omics technologies and the accumulation of omics data have provided better opportunities for gene regulation studies than ever before. For this reason deep learning, as a data-driven predictive modeling approach, has been successfully applied to this field during the past decade. In this article, we aim to give a brief yet comprehensive overview of representative deep-learning methods for gene regulation. Specifically, we discuss and compare the design principles and datasets used by each method, creating a reference for researchers who wish to replicate or improve existing methods. We also discuss the common problems of existing approaches and prospectively introduce the emerging deep-learning paradigms that will potentially alleviate them. We hope that this article will provide a rich and up-to-date resource and shed light on future research directions in this area.
Affiliation(s)
- Zhongxiao Li
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
| | - Elva Gao
- The KAUST School, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
| | - Juexiao Zhou
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
| | - Wenkai Han
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
| | - Xiaopeng Xu
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
| | - Xin Gao
- Computer Science Program, Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
- KAUST Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia
|
22
|
Delos Santos NP, Duttke S, Heinz S, Benner C. MEPP: more transparent motif enrichment by profiling positional correlations. NAR Genom Bioinform 2022; 4:lqac075. [PMID: 36267125 PMCID: PMC9575187 DOI: 10.1093/nargab/lqac075] [Received: 06/06/2022] [Revised: 08/18/2022] [Accepted: 09/23/2022] [Indexed: 11/11/2022] [Open Access]
Abstract
Score-based motif enrichment analysis (MEA) is typically applied to regulatory DNA to infer transcription factors (TFs) that may modulate transcription and chromatin state in different conditions. Most MEA methods determine motif enrichment independent of motif position within a sequence, even when those sequences harbor anchor points that motifs and their bound TFs may functionally interact with in a distance-dependent fashion, such as other TF binding motifs, transcription start sites (TSS), sequencing assay cleavage sites, or other biologically meaningful features. We developed motif enrichment positional profiling (MEPP), a novel MEA method that outputs a positional enrichment profile of a given TF's binding motif relative to key anchor points (e.g. transcription start sites, or other motifs) within the analyzed sequences while accounting for lower-order nucleotide bias. Using transcription initiation and TF binding as test cases, we demonstrate MEPP's utility in determining the sequence positions where motif presence correlates with measures of biological activity, inferring positional dependencies of binding site function. We demonstrate how MEPP can be applied to interpretation and hypothesis generation from experiments that quantify transcription initiation, chromatin structure, or TF binding measurements. MEPP is available for download from https://github.com/npdeloss/mepp.
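The notion of a positional enrichment profile can be illustrated with a minimal sketch. It is not the MEPP implementation: it assumes equal-length sequences already aligned on an anchor point, uses exact string matching instead of motif scoring, and omits the correction for lower-order nucleotide bias. The function names and toy data are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input has no variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)

def positional_profile(seqs, scores, motif):
    """For each start position, correlate motif presence (0/1) with the
    per-sequence activity score."""
    k = len(motif)
    profile = []
    for pos in range(len(seqs[0]) - k + 1):
        presence = [1.0 if s[pos:pos + k] == motif else 0.0 for s in seqs]
        profile.append(pearson(presence, scores))
    return profile

# Hypothetical toy data: the motif at position 0 tracks high activity.
seqs = ["TATAAGGC", "TATAACCG", "GGCTATAA", "CCGTATAA"]
scores = [5.0, 4.0, 1.0, 0.5]
profile = positional_profile(seqs, scores, "TATAA")
```

Here `profile[0]` is strongly positive and `profile[3]` strongly negative, so the same motif is associated with opposite activity depending on its distance from the anchor, which a position-independent enrichment test would not reveal.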
Affiliation(s)
- Nathaniel P Delos Santos
- Department of Biomedical Informatics, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0634, USA
| | - Sascha Duttke
- School of Molecular Biosciences, College of Veterinary Medicine, Washington State University, Pullman, WA, USA
| | - Sven Heinz
- Department of Medicine, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0634, USA
| | - Christopher Benner
- Department of Medicine, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0634, USA
|
23
|
Towards a better understanding of TF-DNA binding prediction from genomic features. Comput Biol Med 2022; 149:105993. [DOI: 10.1016/j.compbiomed.2022.105993] [Received: 06/15/2022] [Revised: 07/12/2022] [Accepted: 08/14/2022] [Indexed: 11/17/2022]
|
24
|
RNA Polymerase II “Pause” Prepares Promoters for Upcoming Transcription during Drosophila Development. Int J Mol Sci 2022; 23:ijms231810662. [PMID: 36142573 PMCID: PMC9503990 DOI: 10.3390/ijms231810662] [Received: 08/12/2022] [Revised: 09/07/2022] [Accepted: 09/08/2022] [Indexed: 11/17/2022] [Open Access]
Abstract
According to previous studies, during Drosophila embryogenesis, the recruitment of RNA polymerase II precedes active gene transcription. This work is aimed at exploring whether this mechanism is used during Drosophila metamorphosis. In addition, the composition of the RNA polymerase II “paused” complexes associated with promoters at different developmental stages is described in detail. For this purpose, we performed ChIP-Seq analysis using antibodies for various modifications of RNA polymerase II (total, Pol II CTD Ser5P, and Pol II CTD Ser2P) as well as for subunits of the NELF, DSIF, and PAF complexes and Brd4/Fs(1)h that control transcription elongation. We found that during metamorphosis, similar to mid-embryogenesis, the promoters were bound by RNA polymerase II in the “paused” state, preparing for activation at later stages of development. During mid-embryogenesis, RNA polymerase II in a “pause” state was phosphorylated at Ser5 and Ser2 of Pol II CTD and bound the NELF, DSIF, and PAF complexes, but not Brd4/Fs(1)h. During metamorphosis, the “paused” RNA polymerase II complex included Brd4/Fs(1)h in addition to NELF, DSIF, and PAF. The RNA polymerase II in this complex was phosphorylated at Ser5 of Pol II CTD, but not at Ser2. These results indicate that, during mid-embryogenesis, RNA polymerase II stalls in the “post-pause” state, being phosphorylated at Ser2 of Pol II CTD (after the stage of p-TEFb action). During metamorphosis, the “pause” mechanism is closer to classical promoter-proximal pausing and is characterized by a low level of Pol II CTD Ser2P.
|
25
|
Yi R, Cho K, Bonneau R. NetTIME: a Multitask and Base-pair Resolution Framework for Improved Transcription Factor Binding Site Prediction. Bioinformatics 2022; 38:4762-4770. [PMID: 35997560 PMCID: PMC9563695 DOI: 10.1093/bioinformatics/btac569] [Received: 05/06/2021] [Revised: 08/16/2022] [Accepted: 08/20/2022] [Indexed: 12/05/2022] [Open Access]
Abstract
Motivation: Machine learning models for predicting cell-type-specific transcription factor (TF) binding sites have become increasingly accurate thanks to the increased availability of next-generation sequencing data and more standardized model evaluation criteria. However, knowledge transfer from data-rich to data-limited TFs and cell types remains crucial for improving TF binding prediction models because available binding labels are highly skewed towards a small collection of TFs and cell types. Transfer prediction of TF binding sites can potentially benefit from a multitask learning approach; however, existing methods typically use shallow single-task models to generate low-resolution predictions. Here, we propose NetTIME, a multitask learning framework for predicting cell-type-specific TF binding sites with base-pair resolution.
Results: We show that the multitask learning strategy for TF binding prediction is more efficient than the single-task approach due to the increased data availability. NetTIME trains high-dimensional embedding vectors to distinguish TF and cell-type identities. We show that this approach is critical for the success of the multitask learning strategy and allows our model to make accurate transfer predictions within and beyond the training panels of TFs and cell types. We additionally train a linear-chain conditional random field (CRF) to classify binding predictions and show that this CRF eliminates the need for setting a probability threshold and reduces classification noise. We compare our method’s predictive performance with two state-of-the-art methods, Catchitt and Leopard, and show that our method outperforms previous methods under both supervised and transfer learning settings.
Availability and implementation: NetTIME is freely available at https://github.com/ryi06/NetTIME and the code is also archived at https://doi.org/10.5281/zenodo.6994897.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ren Yi
- Department of Computer Science, New York University, New York, NY, 10011, USA
| | - Kyunghyun Cho
- Department of Computer Science, New York University, New York, NY, 10011, USA; Center for Data Science, New York University, New York, NY, 10011, USA; Prescient Design, a Genentech accelerator, New York, NY, 10010, USA
| | - Richard Bonneau
- Department of Computer Science, New York University, New York, NY, 10011, USA; Center for Data Science, New York University, New York, NY, 10011, USA; Department of Biology, New York University, New York, NY, 10003, USA; Prescient Design, a Genentech accelerator, New York, NY, 10010, USA
|
26
|
Angelov D, Boopathi R, Lone IN, Menoni H, Dimitrov S, Cadet J. Capturing Protein-Nucleic Acid Interactions by High-Intensity Laser-Induced Covalent Crosslinking. Photochem Photobiol 2022; 99:296-312. [PMID: 35997098 DOI: 10.1111/php.13699] [Received: 05/12/2022] [Accepted: 07/21/2022] [Indexed: 11/30/2022]
Abstract
Interactions of DNA with structural proteins such as histones, regulatory proteins, and enzymes play a crucial role in major cellular processes such as transcription, replication and repair. The in vivo mapping and characterization of the binding sites of the involved biomolecules are of primary importance for a better understanding of genomic deployment that is implicated in tissue- and developmental stage-specific gene expression regulation. The most powerful and commonly used approach to date is immunoprecipitation of chemically cross-linked chromatin (XChIP) coupled with sequencing analysis (ChIP-seq). While the resolution and sensitivity of high-throughput sequencing techniques have been constantly improved, little progress has been achieved in the crosslinking step. Because of their low efficiency, the use of conventional UVC lamps remains very limited, while formaldehyde has been established as the "gold standard" crosslinking agent. Efficient biphotonic crosslinking of directly interacting nucleic acid-protein complexes by a single short UV laser pulse has been introduced as an innovative technique for overcoming limitations of conventionally used chemical and photochemical approaches. In this survey, the main available methods, including the laser approach, are critically reviewed for their ability to generate DNA-protein crosslinks in in vitro model systems and in cells.
Affiliation(s)
- Dimitar Angelov
- Université de Lyon, Ecole Normale Supérieure de Lyon, CNRS, Laboratoire de Biologie et de Modélisation de la Cellule LBMC, CNRS UMR 5239, 46 Allée d'Italie, 69007, Lyon, France; Izmir Biomedicine and Genome Center, Dokuz Eylul University Health Campus, Balçova, Izmir 35330, Turkey
- Ramachandran Boopathi
- Université de Lyon, Ecole Normale Supérieure de Lyon, CNRS, Laboratoire de Biologie et de Modélisation de la Cellule LBMC, CNRS UMR 5239, 46 Allée d'Italie, 69007, Lyon, France; Université Grenoble Alpes, CNRS, CEA, Institut de Biologie Structurale (IBS), 38000, Grenoble, France
- Imtiaz Nisar Lone
- Izmir Biomedicine and Genome Center, Dokuz Eylul University Health Campus, Balçova, Izmir 35330, Turkey
- Hervé Menoni
- Université Grenoble Alpes, CNRS UMR 5309, INSERM U1209, Institute for Advanced Biosciences (IAB), Site Santé - Allée des Alpes, 38700, La Tronche, France
- Stefan Dimitrov
- Université Grenoble Alpes, CNRS UMR 5309, INSERM U1209, Institute for Advanced Biosciences (IAB), Site Santé - Allée des Alpes, 38700, La Tronche, France
- Jean Cadet
- Département de Médecine nucléaire et Radiobiologie, Faculté de Médecine, Université de Sherbrooke, Sherbrooke, J1H 5N4, Québec, Canada
|
27
|
Yin YH, Shen LC, Jiang Y, Gao S, Song J, Yu DJ. Improving the prediction of DNA-protein binding by integrating multi-scale dense convolutional network with fault-tolerant coding. Anal Biochem 2022; 656:114878. [DOI: 10.1016/j.ab.2022.114878] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2022] [Revised: 08/18/2022] [Accepted: 08/23/2022] [Indexed: 11/01/2022]
|
28
|
Hajheidari M, Huang SSC. Elucidating the biology of transcription factor-DNA interaction for accurate identification of cis-regulatory elements. CURRENT OPINION IN PLANT BIOLOGY 2022; 68:102232. [PMID: 35679803 PMCID: PMC10103634 DOI: 10.1016/j.pbi.2022.102232] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 03/28/2022] [Revised: 04/26/2022] [Accepted: 05/02/2022] [Indexed: 05/03/2023]
Abstract
Transcription factors (TFs) play a critical role in determining cell fate decisions by integrating developmental and environmental signals through binding to specific cis-regulatory modules and regulating the spatio-temporal specificity of gene expression patterns. Precise identification of functional TF binding sites in time and space will not only revolutionize our understanding of the regulatory networks governing cell fate decisions but is also instrumental in uncovering how genetic variations cause morphological diversity or disease. In this review, we discuss recent advances in mapping TF binding sites and characterizing the various parameters underlying the complexity of binding site recognition by TFs.
Affiliation(s)
- Mohsen Hajheidari
- Center for Genomics and Systems Biology, Department of Biology, New York University, 12 Waverly Pl, New York, NY 10003, USA
- Shao-Shan Carol Huang
- Center for Genomics and Systems Biology, Department of Biology, New York University, 12 Waverly Pl, New York, NY 10003, USA.
|
29
|
Osmala M, Eraslan G, Lähdesmäki H. ChromDMM: a Dirichlet-multinomial mixture model for clustering heterogeneous epigenetic data. Bioinformatics 2022; 38:3863-3870. [PMID: 35786716 PMCID: PMC9364382 DOI: 10.1093/bioinformatics/btac444] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2022] [Revised: 05/20/2022] [Accepted: 06/30/2022] [Indexed: 12/24/2022]
Abstract
MOTIVATION: Research on epigenetic modifications and other chromatin features at genomic regulatory elements elucidates essential biological mechanisms including the regulation of gene expression. Despite the growing number of epigenetic datasets, new tools are still needed to discover novel distinctive patterns of heterogeneous epigenetic signals at regulatory elements.
RESULTS: We introduce ChromDMM, a product Dirichlet-multinomial mixture model for clustering genomic regions that are characterized by multiple chromatin features. ChromDMM extends the mixture model framework by profile shifting and flipping that can probabilistically account for inaccuracies in the position and strand-orientation of the genomic regions. Owing to hyper-parameter optimization, ChromDMM can also regularize the smoothness of the epigenetic profiles across the consecutive genomic regions. With simulated data, we demonstrate that ChromDMM clusters, shifts and strand-orients the profiles more accurately than previous methods. With ENCODE data, we show that the clustering of enhancer regions in the human genome reveals distinct patterns in several chromatin features. We further validate the enhancer clusters by their enrichment for transcriptional regulatory factor binding sites.
AVAILABILITY AND IMPLEMENTATION: ChromDMM is implemented as an R package and is available at https://github.com/MariaOsmala/ChromDMM.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
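As a rough illustration of the model class named in this abstract, the following Python sketch evaluates a plain Dirichlet-multinomial mixture for one region's binned read counts. It is not the ChromDMM implementation: the profile shifting, strand flipping, product over chromatin features, and hyper-parameter optimization described above are all omitted, and the function names `dm_log_likelihood` and `responsibilities` as well as the example numbers are hypothetical.

```python
from math import exp, lgamma, log


def dm_log_likelihood(counts, alpha):
    """Log-probability of a count vector under a Dirichlet-multinomial."""
    n = sum(counts)
    a = sum(alpha)
    # Multinomial coefficient n! / prod(x_k!)
    log_coef = lgamma(n + 1) - sum(lgamma(x + 1) for x in counts)
    # Ratio of Dirichlet normalising constants
    log_ratio = lgamma(a) - lgamma(n + a)
    log_terms = sum(lgamma(x + ak) - lgamma(ak) for x, ak in zip(counts, alpha))
    return log_coef + log_ratio + log_terms


def responsibilities(counts, alphas, weights):
    """Posterior probability of each mixture component for one region."""
    logs = [log(w) + dm_log_likelihood(counts, a) for w, a in zip(weights, alphas)]
    top = max(logs)  # subtract the maximum for numerical stability
    unnorm = [exp(v - top) for v in logs]
    total = sum(unnorm)
    return [u / total for u in unnorm]


# Hypothetical example: three bins, two clusters with opposite profile shapes.
region_counts = [8, 1, 1]
cluster_alphas = [[5.0, 1.0, 1.0], [1.0, 1.0, 5.0]]
cluster_weights = [0.5, 0.5]
print(responsibilities(region_counts, cluster_alphas, cluster_weights))
```

In an expectation-maximization fit, responsibilities of this kind would be recomputed for every region in each iteration before the cluster parameters are updated.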
Affiliation(s)
- Harri Lähdesmäki
- Department of Computer Science, Aalto University, Espoo 02150, Finland
|
30
|
Singh NP, Krumlauf R. Diversification and Functional Evolution of HOX Proteins. Front Cell Dev Biol 2022; 10:798812. [PMID: 35646905 PMCID: PMC9136108 DOI: 10.3389/fcell.2022.798812] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/20/2021] [Accepted: 04/08/2022] [Indexed: 01/07/2023]
Abstract
Gene duplication and divergence is a major contributor to the generation of morphological diversity and the emergence of novel features in vertebrates during evolution. The availability of sequenced genomes has facilitated our understanding of the evolution of genes and regulatory elements. However, progress in understanding conservation and divergence in the function of proteins has been slow and mainly assessed by comparing protein sequences in combination with in vitro analyses. These approaches help to classify proteins into different families and sub-families, such as distinct types of transcription factors, but how protein function varies within a gene family is less well understood. Some studies have explored the functional evolution of closely related proteins and important insights have begun to emerge. In this review, we will provide a general overview of gene duplication and functional divergence and then focus on the functional evolution of HOX proteins to illustrate evolutionary changes underlying diversification and their role in animal evolution.
Affiliation(s)
- Robb Krumlauf
- Stowers Institute for Medical Research, Kansas City, MO, United States
- Department of Anatomy and Cell Biology, Kansas University Medical Center, Kansas City, KS, United States
- *Correspondence: Robb Krumlauf,
|
31
|
Lee D, Kim S. Knowledge-guided artificial intelligence technologies for decoding complex multiomics interactions in cells. Clin Exp Pediatr 2022; 65:239-249. [PMID: 34844399 PMCID: PMC9082244 DOI: 10.3345/cep.2021.01438] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2021] [Revised: 10/19/2021] [Accepted: 10/21/2021] [Indexed: 11/27/2022]
Abstract
Cells survive and proliferate through complex interactions among diverse molecules across multiomics layers. Conventional experimental approaches for identifying these interactions have built a firm foundation for molecular biology, but their scalability is gradually becoming inadequate compared to the rapid accumulation of multiomics data measured by high-throughput technologies. Therefore, the need for data-driven computational modeling of interactions within cells has been highlighted in recent years. The complexity of multiomics interactions is primarily due to their nonlinearity. That is, their accurate modeling requires intricate conditional dependencies, synergies, or antagonisms between considered genes or proteins, which retard experimental validations. Artificial intelligence (AI) technologies, including deep learning models, are optimal choices for handling complex nonlinear relationships between features that are scalable and produce large amounts of data. Thus, they have great potential for modeling multiomics interactions. Although there exist many AI-driven models for computational biology applications, relatively few explicitly incorporate the prior knowledge within model architectures or training procedures. Such guidance of models by domain knowledge will greatly reduce the amount of data needed to train models and constrain their vast expressive powers to focus on the biologically relevant space. Therefore, it can enhance a model's interpretability, reduce spurious interactions, and prove its validity and utility. Thus, to facilitate further development of knowledge-guided AI technologies for the modeling of multiomics interactions, here we review representative bioinformatics applications of deep learning models for multiomics interactions developed to date by categorizing them by guidance mode.
Affiliation(s)
- Dohoon Lee
- Bioinformatics Institute, Seoul National University, Seoul, Korea
- Sun Kim
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Korea
- Department of Computer Science and Engineering, Seoul National University, Seoul, Korea
- Institute of Engineering Research, Seoul National University, Seoul, Korea
- AIGENDRUG Co., Ltd., Seoul, Korea
|
32
|
Transcriptional Regulation and Implications for Controlling Hox Gene Expression. J Dev Biol 2022; 10:jdb10010004. [PMID: 35076545 PMCID: PMC8788451 DOI: 10.3390/jdb10010004] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 12/17/2021] [Revised: 01/04/2022] [Accepted: 01/06/2022] [Indexed: 02/06/2023]
Abstract
Hox genes play key roles in axial patterning and regulating the regional identity of cells and tissues in a wide variety of animals from invertebrates to vertebrates. Nested domains of Hox expression generate a combinatorial code that provides a molecular framework for specifying the properties of tissues along the A–P axis. Hence, it is important to understand the regulatory mechanisms that coordinately control the precise patterns of the transcription of clustered Hox genes required for their roles in development. New insights are emerging about the dynamics and molecular mechanisms governing transcriptional regulation, and there is interest in understanding how these may play a role in contributing to the regulation of the expression of the clustered Hox genes. In this review, we summarize some of the recent findings, ideas and emerging mechanisms underlying the regulation of transcription in general and consider how they may be relevant to understanding the transcriptional regulation of Hox genes.
|
33
|
Castro-Mondragon JA, Riudavets-Puig R, Rauluseviciute I, Berhanu Lemma R, Turchi L, Blanc-Mathieu R, Lucas J, Boddie P, Khan A, Manosalva Pérez N, Fornes O, Leung T, Aguirre A, Hammal F, Schmelter D, Baranasic D, Ballester B, Sandelin A, Lenhard B, Vandepoele K, Wasserman WW, Parcy F, Mathelier A. JASPAR 2022: the 9th release of the open-access database of transcription factor binding profiles. Nucleic Acids Res 2022; 50:D165-D173. [PMID: 34850907 PMCID: PMC8728201 DOI: 10.1093/nar/gkab1113] [Citation(s) in RCA: 936] [Impact Index Per Article: 468.0] [Received: 09/15/2021] [Revised: 10/20/2021] [Accepted: 10/22/2021] [Indexed: 12/18/2022]
Abstract
JASPAR (http://jaspar.genereg.net/) is an open-access database containing manually curated, non-redundant transcription factor (TF) binding profiles for TFs across six taxonomic groups. In this 9th release, we expanded the CORE collection with 341 new profiles (148 for plants, 101 for vertebrates, 85 for urochordates, and 7 for insects), which corresponds to a 19% expansion over the previous release. We added 298 new profiles to the Unvalidated collection when no orthogonal evidence was found in the literature. All the profiles were clustered to provide familial binding profiles for each taxonomic group. Moreover, we revised the structural classification of DNA binding domains to consider plant-specific TFs. This release introduces word clouds to represent the scientific knowledge associated with each TF. We updated the genome tracks of TFBSs predicted with JASPAR profiles in eight organisms; the human and mouse TFBS predictions can be visualized as native tracks in the UCSC Genome Browser. Finally, we provide a new tool to perform JASPAR TFBS enrichment analysis in user-provided genomic regions. All the data is accessible through the JASPAR website, its associated RESTful API, the R/Bioconductor data package, and a new Python package, pyJASPAR, that facilitates serverless access to the data.
Affiliation(s)
- Jaime A Castro-Mondragon
- Centre for Molecular Medicine Norway (NCMM), Nordic EMBL Partnership, University of Oslo, 0318 Oslo, Norway
- Rafael Riudavets-Puig
- Centre for Molecular Medicine Norway (NCMM), Nordic EMBL Partnership, University of Oslo, 0318 Oslo, Norway
- Ieva Rauluseviciute
- Centre for Molecular Medicine Norway (NCMM), Nordic EMBL Partnership, University of Oslo, 0318 Oslo, Norway
- Roza Berhanu Lemma
- Centre for Molecular Medicine Norway (NCMM), Nordic EMBL Partnership, University of Oslo, 0318 Oslo, Norway
- Laura Turchi
- Laboratoire Physiologie Cellulaire et Végétale, Univ. Grenoble Alpes, CNRS, CEA, INRAE, IRIG-DBSCI-LPCV, 17 avenue des martyrs, F-38054, Grenoble, France
- Romain Blanc-Mathieu
- Laboratoire Physiologie Cellulaire et Végétale, Univ. Grenoble Alpes, CNRS, CEA, INRAE, IRIG-DBSCI-LPCV, 17 avenue des martyrs, F-38054, Grenoble, France
- Jeremy Lucas
- Laboratoire Physiologie Cellulaire et Végétale, Univ. Grenoble Alpes, CNRS, CEA, INRAE, IRIG-DBSCI-LPCV, 17 avenue des martyrs, F-38054, Grenoble, France
- Paul Boddie
- Centre for Molecular Medicine Norway (NCMM), Nordic EMBL Partnership, University of Oslo, 0318 Oslo, Norway
- Aziz Khan
- Stanford Cancer Institute, Stanford University School of Medicine, Stanford, CA 94305, USA
- Nicolás Manosalva Pérez
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 71, 9052 Ghent, Belgium
- VIB Center for Plant Systems Biology, Technologiepark 71, 9052 Ghent, Belgium
- Oriol Fornes
- Centre for Molecular Medicine and Therapeutics, Department of Medical Genetics, BC Children's Hospital Research Institute, University of British Columbia, 950 W 28th Ave, Vancouver, BC V5Z 4H4, Canada
- Tiffany Y Leung
- Centre for Molecular Medicine and Therapeutics, Department of Medical Genetics, BC Children's Hospital Research Institute, University of British Columbia, 950 W 28th Ave, Vancouver, BC V5Z 4H4, Canada
- Alejandro Aguirre
- Centre for Molecular Medicine and Therapeutics, Department of Medical Genetics, BC Children's Hospital Research Institute, University of British Columbia, 950 W 28th Ave, Vancouver, BC V5Z 4H4, Canada
- Daniel Schmelter
- UCSC Genome Browser, University of California Santa Cruz, Santa Cruz, CA 95060, USA
- Damir Baranasic
- MRC London Institute of Medical Sciences, Du Cane Road, London, W12 0NN, UK
- Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, Hammersmith Hospital Campus, Du Cane Road, London W12 0NN, UK
- Albin Sandelin
- The Bioinformatics Centre, Department of Biology & Biotech Research and Innovation Centre, University of Copenhagen, Ole Maaloes Vej 5, DK2200 Copenhagen N, Denmark
- Boris Lenhard
- MRC London Institute of Medical Sciences, Du Cane Road, London, W12 0NN, UK
- Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, Hammersmith Hospital Campus, Du Cane Road, London W12 0NN, UK
- Klaas Vandepoele
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Technologiepark 71, 9052 Ghent, Belgium
- VIB Center for Plant Systems Biology, Technologiepark 71, 9052 Ghent, Belgium
- Bioinformatics Institute Ghent, Ghent University, Technologiepark 71, 9052 Ghent, Belgium
- Wyeth W Wasserman
- Centre for Molecular Medicine and Therapeutics, Department of Medical Genetics, BC Children's Hospital Research Institute, University of British Columbia, 950 W 28th Ave, Vancouver, BC V5Z 4H4, Canada
- François Parcy
- Laboratoire Physiologie Cellulaire et Végétale, Univ. Grenoble Alpes, CNRS, CEA, INRAE, IRIG-DBSCI-LPCV, 17 avenue des martyrs, F-38054, Grenoble, France
- Anthony Mathelier
- Centre for Molecular Medicine Norway (NCMM), Nordic EMBL Partnership, University of Oslo, 0318 Oslo, Norway
- Department of Medical Genetics, Institute of Clinical Medicine, University of Oslo and Oslo University Hospital, Oslo, Norway
|
34
|
Abstract
Mapping the epigenome is key to describing the relationship between chromatin landscapes and the control of DNA-based cellular processes such as transcription. Cleavage under targets and release using nuclease (CUT&RUN) is an in situ chromatin profiling strategy in which controlled cleavage by antibody-targeted Micrococcal Nuclease solubilizes specific protein-DNA complexes for paired-end DNA sequencing. When applied to budding yeast, CUT&RUN profiling yields precise genome-wide maps of histone modifications, histone variants, transcription factors, and ATP-dependent chromatin remodelers, while avoiding the cross-linking and solubilization issues associated with the most commonly used chromatin profiling technique, Chromatin Immunoprecipitation (ChIP). Furthermore, targeted chromatin complexes cleanly released by CUT&RUN can be used as input for a subsequent native immunoprecipitation step (CUT&RUN.ChIP) to simultaneously map two epitopes in single molecules genome-wide. The intrinsically low background and high resolution of CUT&RUN and CUT&RUN.ChIP allow for identification of transient genomic features such as dynamic nucleosome-remodeling intermediates. Starting from cells, one can perform CUT&RUN or CUT&RUN.ChIP and obtain purified DNA for sequencing library preparation in 2 days.
Affiliation(s)
- Sandipan Brahma
- Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Howard Hughes Medical Institute, Seattle, WA, USA
- Steven Henikoff
- Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA.
- Howard Hughes Medical Institute, Seattle, WA, USA.
|
35
|
Soffers JHM, Alcantara SGM, Li X, Shao W, Seidel CW, Li H, Zeitlinger J, Abmayr SM, Workman JL. The SAGA core module is critical during Drosophila oogenesis and is broadly recruited to promoters. PLoS Genet 2021; 17:e1009668. [PMID: 34807910 PMCID: PMC8648115 DOI: 10.1371/journal.pgen.1009668] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2021] [Revised: 12/06/2021] [Accepted: 10/22/2021] [Indexed: 11/19/2022]
Abstract
The Spt/Ada-Gcn5 Acetyltransferase (SAGA) coactivator complex has multiple modules with different enzymatic and non-enzymatic functions. How each module contributes to gene expression is not well understood. During Drosophila oogenesis, the enzymatic functions are not equally required, which may indicate that different genes require different enzymatic functions. An analogy for this phenomenon is the handyman principle: while a handyman has many tools, which tool he uses depends on what requires maintenance. Here we analyzed the role of the non-enzymatic core module during Drosophila oogenesis, which interacts with TBP. We show that depletion of SAGA-specific core subunits blocked egg chamber development at earlier stages than depletion of enzymatic subunits. These results, as well as additional genetic analyses, point to an interaction with TBP and suggest a differential role of SAGA modules at different promoter types. However, SAGA subunits co-occupied all promoter types of active genes in ChIP-seq and ChIP-nexus experiments, and the complex was not specifically associated with distinct promoter types in the ovary. The high-resolution genomic binding profiles were congruent with SAGA recruitment by activators upstream of the start site, and retention on chromatin by interactions with modified histones downstream of the start site. Our data illustrate that a distinct genetic requirement for specific components may conceal the fact that the entire complex is physically present and suggests that the biological context defines which module functions are critical.
Embryonic development critically relies on the differential expression of genes in different tissues. This involves the dynamic interplay between DNA, sequence-specific transcription factors, coactivators and chromatin remodelers, which guide the transcription machinery to the appropriate promoters for productive transcription. To understand how this happens at the molecular level, we need to understand when and how coactivator complexes such as SAGA function. SAGA consists of multiple modules with well characterized enzymatic functions. This study shows that the non-enzymatic core module of SAGA is required for Drosophila oogenesis, while the enzymatic functions are largely dispensable. Despite this differential requirement, SAGA subunits appear to be broadly recruited to all promoter types, consistent with the biochemical integrity of the complex. These results suggest that genetic requirements for different modules depend on the developmental demands.
Affiliation(s)
- Jelly H. M. Soffers
- Stowers Institute for Medical Research, Kansas City, Missouri, United States of America
- Sergio G-M Alcantara
- Stowers Institute for Medical Research, Kansas City, Missouri, United States of America
- Xuanying Li
- Stowers Institute for Medical Research, Kansas City, Missouri, United States of America
- Wanqing Shao
- Stowers Institute for Medical Research, Kansas City, Missouri, United States of America
- Christopher W. Seidel
- Stowers Institute for Medical Research, Kansas City, Missouri, United States of America
- Hua Li
- Stowers Institute for Medical Research, Kansas City, Missouri, United States of America
- Julia Zeitlinger
- Stowers Institute for Medical Research, Kansas City, Missouri, United States of America
- Department of Pathology and Laboratory Medicine, University of Kansas School of Medicine, Kansas City, Kansas, United States of America
- Susan M. Abmayr
- Stowers Institute for Medical Research, Kansas City, Missouri, United States of America
- Department of Anatomy and Cell Biology, University of Kansas School of Medicine, Kansas City, Kansas, United States of America
- Jerry L. Workman
- Stowers Institute for Medical Research, Kansas City, Missouri, United States of America
- * E-mail:
|
36
|
Abstract
To predict transcription, one needs a mechanistic understanding of how the numerous required transcription factors (TFs) explore the nuclear space to find their target genes, assemble, cooperate, and compete with one another. Advances in fluorescence microscopy have made it possible to visualize real-time TF dynamics in living cells, leading to two intriguing observations: first, most TFs contact chromatin only transiently; and second, TFs can assemble into clusters through their intrinsically disordered regions. These findings suggest that highly dynamic events and spatially structured nuclear microenvironments might play key roles in transcription regulation that are not yet fully understood. The emerging model is that while some promoters directly convert TF-binding events into on/off cycles of transcription, many others apply complex regulatory layers that ultimately lead to diverse phenotypic outputs. Cracking this kinetic code is an ongoing and challenging task that is made possible by combining innovative imaging approaches with biophysical models.
Affiliation(s)
- Feiyue Lu
- Institute for Systems Genetics and Cell Biology Department, NYU School of Medicine, New York, New York 10016, USA
- Timothée Lionnet
- Institute for Systems Genetics and Cell Biology Department, NYU School of Medicine, New York, New York 10016, USA
|
37
|
Zhang J, Cavallaro M, Hebenstreit D. Timing RNA polymerase pausing with TV-PRO-seq. CELL REPORTS METHODS 2021; 1:100083. [PMID: 34723238 PMCID: PMC8547241 DOI: 10.1016/j.crmeth.2021.100083] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2019] [Revised: 08/03/2021] [Accepted: 08/18/2021] [Indexed: 11/28/2022]
Abstract
Transcription of many genes in metazoans is subject to polymerase pausing, which is the transient stop of transcriptionally engaged polymerases. This is known to mainly occur in promoter-proximal regions but it is not well understood. In particular, a genome-wide measurement of pausing times at high resolution has been lacking. We present here the time-variant precision nuclear run-on and sequencing (TV-PRO-seq) assay, an extension of the standard PRO-seq that allows us to estimate genome-wide pausing times at single-base resolution. Its application to human cells demonstrates that, proximal to promoters, polymerases pause more frequently but for shorter times than in other genomic regions. Comparison with single-cell gene expression data reveals that the polymerase pausing times are longer in highly expressed genes, while transcriptionally noisier genes have higher pausing frequencies and slightly longer pausing times. Analyses of histone modifications suggest that the marker H3K36me3 is related to the polymerase pausing.
Affiliation(s)
- Jie Zhang
- School of Life Sciences, Gibbet Hill Campus, the University of Warwick, CV4 7AL Coventry, UK
- Massimo Cavallaro
- School of Life Sciences, Gibbet Hill Campus, the University of Warwick, CV4 7AL Coventry, UK
- Mathematics Institute and Zeeman Institute for Systems Biology and Infectious Disease Epidemiology Research, the University of Warwick, CV4 7AL Coventry, UK
- Daniel Hebenstreit
- School of Life Sciences, Gibbet Hill Campus, the University of Warwick, CV4 7AL Coventry, UK
|
38
|
Libbrecht MW, Chan RCW, Hoffman MM. Segmentation and genome annotation algorithms for identifying chromatin state and other genomic patterns. PLoS Comput Biol 2021; 17:e1009423. [PMID: 34648491 PMCID: PMC8516206 DOI: 10.1371/journal.pcbi.1009423] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 11/19/2022]
Abstract
Segmentation and genome annotation (SAGA) algorithms are widely used to understand genome activity and gene regulation. These algorithms take as input epigenomic datasets, such as chromatin immunoprecipitation-sequencing (ChIP-seq) measurements of histone modifications or transcription factor binding. They partition the genome and assign a label to each segment such that positions with the same label exhibit similar patterns of input data. SAGA algorithms discover categories of activity such as promoters, enhancers, or parts of genes without prior knowledge of known genomic elements. In this sense, they generally act in an unsupervised fashion like clustering algorithms, but with the additional simultaneous function of segmenting the genome. Here, we review the common methodological framework that underlies these methods, review variants of and improvements upon this basic framework, and discuss the outlook for future work. This review is intended for those interested in applying SAGA methods and for computational researchers interested in improving upon them.
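The partition-and-label step described in this abstract can be illustrated with a small hidden Markov model decoded by the Viterbi algorithm, which is one common basis for such methods. The Python sketch below is only an illustration under stated assumptions: the two labels, the binarized per-bin signal, and all probabilities are hypothetical and fixed by hand, whereas SAGA tools learn their parameters from the data without supervision.

```python
from math import log


def viterbi(obs, states, start, trans, emit):
    """Most probable label sequence for obs under a discrete HMM (log space)."""
    score = [{s: log(start[s]) + log(emit[s][obs[0]]) for s in states}]
    back = []
    for o in obs[1:]:
        prev = score[-1]
        col, ptr = {}, {}
        for s in states:
            # Best predecessor label for landing in label s at this bin
            best = max(states, key=lambda p: prev[p] + log(trans[p][s]))
            col[s] = prev[best] + log(trans[best][s]) + log(emit[s][o])
            ptr[s] = best
        score.append(col)
        back.append(ptr)
    last = max(states, key=lambda s: score[-1][s])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]


def to_segments(labels):
    """Collapse per-bin labels into (start_bin, end_bin_exclusive, label) runs."""
    segments = []
    start = 0
    for i in range(1, len(labels) + 1):
        if i == len(labels) or labels[i] != labels[start]:
            segments.append((start, i, labels[start]))
            start = i
    return segments


# Hypothetical binarized signal over ten genomic bins (1 = mark present).
signal = [0, 0, 0, 1, 1, 1, 1, 0, 0, 0]
states = ("quiescent", "active")
start = {"quiescent": 0.5, "active": 0.5}
trans = {"quiescent": {"quiescent": 0.9, "active": 0.1},
         "active": {"quiescent": 0.1, "active": 0.9}}
emit = {"quiescent": {0: 0.9, 1: 0.1},
        "active": {0: 0.2, 1: 0.8}}

labels = viterbi(signal, states, start, trans, emit)
segments = to_segments(labels)
print(segments)
```

The high self-transition probabilities are what make neighbouring bins share a label, so segmentation and labelling happen in one pass rather than as separate clustering and merging steps.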
Affiliation(s)
- Rachel C. W. Chan
- Department of Computer Science, University of Toronto, Toronto, Canada
- Princess Margaret Cancer Centre, University Health Network, Toronto, Canada
- Michael M. Hoffman
- Department of Computer Science, University of Toronto, Toronto, Canada
- Princess Margaret Cancer Centre, University Health Network, Toronto, Canada
- Department of Medical Biophysics, University of Toronto, Toronto, Canada
- Vector Institute for Artificial Intelligence, Toronto, Canada
|
39
|
Lin J, Huang L, Chen X, Zhang S, Wong KC. DeepMotifSyn: a deep learning approach to synthesize heterodimeric DNA motifs. Brief Bioinform 2021; 23:6370301. [PMID: 34524404 DOI: 10.1093/bib/bbab334] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2021] [Revised: 07/21/2021] [Accepted: 07/28/2021] [Indexed: 11/12/2022]
Abstract
The cooperativity of transcription factors (TFs) is a widespread phenomenon in gene regulation. However, the interaction patterns between TF binding motifs remain elusive. A recent high-throughput assay, CAP-SELEX, has identified over 600 composite DNA sites (i.e. heterodimeric motifs) bound by cooperative TF pairs. However, there are over 25 000 inferentially effective heterodimeric TFs in human cells. It is not practically feasible to validate all heterodimeric motifs experimentally because of cost and labor. We introduce DeepMotifSyn, a deep learning-based tool for synthesizing heterodimeric motifs from monomeric motif pairs. Specifically, DeepMotifSyn is composed of a heterodimeric motif generator and an evaluator. The generator is a U-Net-based neural network that can synthesize heterodimeric motifs from aligned motif pairs. The evaluator is a machine learning-based model that can score the generated heterodimeric motif candidates based on the motif sequence features. Systematic evaluations on CAP-SELEX data illustrate that DeepMotifSyn significantly outperforms the current state-of-the-art predictors. In addition, DeepMotifSyn can synthesize multiple heterodimeric motifs with different orientation and spacing settings, a feature that addresses the shortcomings of previous models. We believe DeepMotifSyn is a more practical and reliable model than current predictors for heterodimeric motif synthesis. Contact: kc.w@cityu.edu.hk.
Affiliation(s)
- Jiecong Lin
- Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong SAR
- Lei Huang
- Hong Kong Institute for Data Science, City University of Hong Kong, Kowloon, Hong Kong SAR
- Xingjian Chen
- School of Computer Science and Technology, Xidian University, Xi'an, China
- Shixiong Zhang
- School of Computer Science and Technology, Xidian University, Xi'an, China
- Ka-Chun Wong
- Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong SAR
| |
|
40
|
Feng Y, Huang W, Paul C, Liu X, Sadayappan S, Wang Y, Pauklin S. Mitochondrial nucleoid in cardiac homeostasis: bidirectional signaling of mitochondria and nucleus in cardiac diseases. Basic Res Cardiol 2021; 116:49. [PMID: 34392401 PMCID: PMC8364536 DOI: 10.1007/s00395-021-00889-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 06/10/2021] [Accepted: 07/20/2021] [Indexed: 01/11/2023]
Abstract
Metabolic function and energy production in eukaryotic cells are regulated by mitochondria, which have been recognized as the intracellular 'powerhouses' of eukaryotic cells for their regulation of cellular homeostasis. Mitochondrial function is important not only in normal developmental and physiological processes, but also in a variety of human pathologies, including cardiac diseases. An emerging topic in the field of cardiovascular medicine is the implication of mitochondrial nucleoid for metabolic reprogramming. This review describes the linear/3D architecture of the mitochondrial nucleoid (e.g., highly organized protein-DNA structure of nucleoid) and how it is regulated by a variety of factors, such as noncoding RNA and its associated R-loop, for metabolic reprogramming in cardiac diseases. In addition, we highlight many of the presently unsolved questions regarding cardiac metabolism in terms of bidirectional signaling of mitochondrial nucleoid and 3D chromatin structure in the nucleus. In particular, we explore novel techniques to dissect the 3D structure of mitochondrial nucleoid and propose new insights into the mitochondrial retrograde signaling, and how it regulates the nuclear (3D) chromatin structures in mitochondrial diseases.
Affiliation(s)
- Yuliang Feng
- Botnar Research Centre, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, Old Road, University of Oxford, Oxford, OX3 7LD, UK
| | - Wei Huang
- Department of Pathology and Laboratory Medicine, Regenerative Medicine Research, University of Cincinnati College of Medicine, 231 Albert Sabin Way, Cincinnati, OH, 45267-0529, USA
| | - Christian Paul
- Department of Pathology and Laboratory Medicine, Regenerative Medicine Research, University of Cincinnati College of Medicine, 231 Albert Sabin Way, Cincinnati, OH, 45267-0529, USA
| | - Xingguo Liu
- Bioland Laboratory (Guangzhou Regenerative Medicine and Health Guangdong Laboratory), CAS Key Laboratory of Regenerative Biology, Joint School of Life Sciences, Hefei Institute of Stem Cell and Regenerative Medicine, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, 510530, China
- Guangzhou Regenerative Medicine and Health Guangdong Laboratory, CAS Key Laboratory of Regenerative Biology, Joint School of Life Sciences, Hefei Institute of Stem Cell and Regenerative Medicine, Guangzhou Institutes of Biomedicine and Health, Guangzhou Medical University, Guangzhou, 510530, China
- Guangdong Provincial Key Laboratory of Stem Cell and Regenerative Medicine, Institute for Stem Cell and Regeneration, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou, 510530, China
| | - Sakthivel Sadayappan
- Heart, Lung and Vascular Institute, Division of Cardiovascular Health and Disease, Department of Internal Medicine, University of Cincinnati, Cincinnati, OH, 45267, USA
| | - Yigang Wang
- Department of Pathology and Laboratory Medicine, Regenerative Medicine Research, University of Cincinnati College of Medicine, 231 Albert Sabin Way, Cincinnati, OH, 45267-0529, USA
| | - Siim Pauklin
- Botnar Research Centre, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, Old Road, University of Oxford, Oxford, OX3 7LD, UK.
| |
|
41
|
Weidemüller P, Kholmatov M, Petsalaki E, Zaugg JB. Transcription factors: Bridge between cell signaling and gene regulation. Proteomics 2021; 21:e2000034. [PMID: 34314098 DOI: 10.1002/pmic.202000034] [Citation(s) in RCA: 83] [Impact Index Per Article: 27.7] [Received: 03/19/2021] [Revised: 07/05/2021] [Accepted: 07/16/2021] [Indexed: 01/17/2023]
Abstract
Transcription factors (TFs) are key regulators of intrinsic cellular processes, such as differentiation and development, and of the cellular response to external perturbation through signaling pathways. In this review we focus on the role of TFs as a link between signaling pathways and gene regulation. Cell signaling tends to result in the modulation of a set of TFs that then lead to changes in the cell's transcriptional program. We highlight the molecular layers at which TF activity can be measured and the associated technical and conceptual challenges. These layers include post-translational modifications (PTMs) of the TF, regulation of TF binding to DNA through chromatin accessibility and epigenetics, and expression of target genes. We highlight that a large number of TFs are understudied in both signaling and gene regulation studies, and that our knowledge about known TF targets has a strong literature bias. We argue that TFs serve as a perfect bridge between the fields of gene regulation and signaling, and that separating these fields hinders our understanding of cell functions. Multi-omics approaches that measure multiple dimensions of TF activity are ideally suited to study the interplay of cell signaling and gene regulation using TFs as the anchor to link the two fields.
Affiliation(s)
- Paula Weidemüller
- European Bioinformatics Institute, European Molecular Biology Laboratory, Wellcome Genome Campus, Hinxton, CB10 1SD, UK
| | - Maksim Kholmatov
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Meyerhofstraße 1, Heidelberg, 69117, Germany
| | - Evangelia Petsalaki
- European Bioinformatics Institute, European Molecular Biology Laboratory, Wellcome Genome Campus, Hinxton, CB10 1SD, UK
| | - Judith B Zaugg
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Meyerhofstraße 1, Heidelberg, 69117, Germany
| |
|
42
|
Mehrmohamadi M, Sepehri MH, Nazer N, Norouzi MR. A Comparative Overview of Epigenomic Profiling Methods. Front Cell Dev Biol 2021; 9:714687. [PMID: 34368164 PMCID: PMC8340004 DOI: 10.3389/fcell.2021.714687] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Received: 05/25/2021] [Accepted: 06/30/2021] [Indexed: 11/13/2022] Open
Abstract
In the past decade, assays that profile different aspects of the epigenome have grown exponentially in number and variation. However, standard guidelines for researchers to choose between available tools depending on their needs are lacking. Here, we introduce a comprehensive collection of the most commonly used bulk and single-cell epigenomic assays and compare and contrast their strengths and weaknesses. We summarize some of the most important technical and experimental parameters that should be considered for making an appropriate decision when designing epigenomic experiments.
Affiliation(s)
- Mahya Mehrmohamadi
- Department of Biotechnology, College of Science, University of Tehran, Tehran, Iran
| | | | - Naghme Nazer
- Department of Electrical Engineering, Sharif University of Technology, Tehran, Iran
| | | |
|
43
|
Biswas A, Narlikar L. Resolving diverse protein-DNA footprints from exonuclease-based ChIP experiments. Bioinformatics 2021; 37:i367-i375. [PMID: 34252930 PMCID: PMC8275329 DOI: 10.1093/bioinformatics/btab274] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/13/2023] Open
Abstract
MOTIVATION High-throughput chromatin immunoprecipitation (ChIP) sequencing-based assays capture genomic regions associated with the profiled transcription factor (TF). ChIP-exo is a modified protocol, which uses lambda exonuclease to digest DNA close to the TF-DNA complex, in order to improve on the positional resolution of the TF-DNA contact. Because the digestion occurs in the 5'-3' orientation, the protocol produces directional footprints close to the complex, on both sides of the double stranded DNA. Like all ChIP-based methods, ChIP-exo reports a mixture of different regions associated with the TF: those bound directly to the TF as well as via intermediaries. However, the distribution of footprints is likely to be indicative of the complex forming at the DNA. RESULTS We present ExoDiversity, which uses a model-based framework to learn a joint distribution over footprints and motifs, thus resolving the mixture of ChIP-exo footprints into diverse binding modes. It uses no prior motif or TF information and automatically learns the number of different modes from the data. We show its application on a wide range of TFs and organisms/cell-types. Because its goal is to explain the complete set of reported regions, it is able to identify co-factor TF motifs that appear in a small fraction of the dataset. Further, ExoDiversity discovers small nucleotide variations within and outside canonical motifs, which co-occur with variations in footprints, suggesting that the TF-DNA structural configuration at those regions is likely to be different. Finally, we show that detected modes have specific DNA shape features and conservation signals, giving insights into the structure and function of the putative TF-DNA complexes. AVAILABILITY AND IMPLEMENTATION The code for ExoDiversity is available on https://github.com/NarlikarLab/exoDIVERSITY. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Anushua Biswas
- Department of Chemical Engineering, CSIR-National Chemical Laboratory, Pune 411008, India; Academy of Scientific and Innovative Research, Ghaziabad 201002, India
| | - Leelavati Narlikar
- Department of Chemical Engineering, CSIR-National Chemical Laboratory, Pune 411008, India; Academy of Scientific and Innovative Research, Ghaziabad 201002, India
| |
|
44
|
Zhang Y, Ho TD, Buchler NE, Gordân R. Competition for DNA binding between paralogous transcription factors determines their genomic occupancy and regulatory functions. Genome Res 2021; 31:1216-1229. [PMID: 33975875 PMCID: PMC8256859 DOI: 10.1101/gr.275145.120] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 12/15/2020] [Accepted: 05/06/2021] [Indexed: 11/24/2022]
Abstract
Most eukaryotic transcription factors (TFs) are part of large protein families, with members of the same family (i.e., paralogous TFs) recognizing similar DNA-binding motifs but performing different regulatory functions. Many TF paralogs are coexpressed in the cell and thus can compete for target sites across the genome. However, this competition is rarely taken into account when studying the in vivo binding patterns of eukaryotic TFs. Here, we show that direct competition for DNA binding between TF paralogs is a major determinant of their genomic binding patterns. Using yeast proteins Cbf1 and Pho4 as our model system, we designed a high-throughput quantitative assay to capture the genomic binding profiles of competing TFs in a cell-free system. Our data show that Cbf1 and Pho4 greatly influence each other's occupancy by competing for their common putative genomic binding sites. The competition is different at different genomic sites, as dictated by the TFs' expression levels and their divergence in DNA-binding specificity and affinity. Analyses of ChIP-seq data show that the biophysical rules that dictate the competitive TF binding patterns in vitro are also followed in vivo, in the complex cellular environment. Furthermore, the Cbf1-Pho4 competition for genomic sites, as characterized in vitro using our new assay, plays a critical role in the specific activation of their target genes in the cell. Overall, our study highlights the importance of direct TF-TF competition for genomic binding and gene regulation by TF paralogs, and proposes an approach for studying this competition in a quantitative and high-throughput manner.
Affiliation(s)
- Yuning Zhang
- Center for Genomic and Computational Biology, Duke University, Durham, North Carolina 27708, USA
- Program in Computational Biology and Bioinformatics, Duke University, Durham, North Carolina 27708, USA
| | - Tiffany D Ho
- Center for Genomic and Computational Biology, Duke University, Durham, North Carolina 27708, USA
- Department of Biostatistics and Bioinformatics, Duke University, Durham, North Carolina 27708, USA
| | - Nicolas E Buchler
- Department of Molecular Biomedical Sciences, North Carolina State University, Raleigh, North Carolina 27606, USA
| | - Raluca Gordân
- Center for Genomic and Computational Biology, Duke University, Durham, North Carolina 27708, USA
- Department of Biostatistics and Bioinformatics, Duke University, Durham, North Carolina 27708, USA
- Department of Computer Science, Department of Molecular Genetics and Microbiology, Duke University, Durham, North Carolina 27708, USA
| |
|
45
|
Hua P, Badat M, Hanssen LLP, Hentges LD, Crump N, Downes DJ, Jeziorska DM, Oudelaar AM, Schwessinger R, Taylor S, Milne TA, Hughes JR, Higgs DR, Davies JOJ. Defining genome architecture at base-pair resolution. Nature 2021; 595:125-129. [PMID: 34108683 DOI: 10.1038/s41586-021-03639-4] [Citation(s) in RCA: 79] [Impact Index Per Article: 26.3] [Received: 02/25/2020] [Accepted: 05/13/2021] [Indexed: 12/16/2022]
Abstract
In higher eukaryotes, many genes are regulated by enhancers that are 10^4-10^6 base pairs (bp) away from the promoter. Enhancers contain transcription-factor-binding sites (which are typically around 7-22 bp), and physical contact between the promoters and enhancers is thought to be required to modulate gene expression. Although chromatin architecture has been mapped extensively at resolutions of 1 kilobase and above, it has not been possible to define physical contacts at the scale of the proteins that determine gene expression. Here we define these interactions in detail using a chromosome conformation capture method (Micro-Capture-C) that enables the physical contacts between different classes of regulatory elements to be determined at base-pair resolution. We find that highly punctate contacts occur between enhancers, promoters and CCCTC-binding factor (CTCF) sites and we show that transcription factors have an important role in the maintenance of the contacts between enhancers and promoters. Our data show that interactions between CTCF sites are increased when active promoters and enhancers are located within the intervening chromatin. This supports a model in which chromatin loop extrusion is dependent on cohesin loading at active promoters and enhancers, which explains the formation of tissue-specific chromatin domains without changes in CTCF binding.
Affiliation(s)
- Peng Hua
- MRC Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, Radcliffe Department of Medicine, University of Oxford, Oxford, UK
| | - Mohsin Badat
- MRC Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, Radcliffe Department of Medicine, University of Oxford, Oxford, UK
| | - Lars L P Hanssen
- MRC Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, Radcliffe Department of Medicine, University of Oxford, Oxford, UK
| | - Lance D Hentges
- MRC Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, Radcliffe Department of Medicine, University of Oxford, Oxford, UK
- MRC WIMM Centre for Computational Biology, MRC Weatherall Institute of Molecular Medicine, Radcliffe Department of Medicine, University of Oxford, Oxford, UK
| | - Nicholas Crump
- MRC Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, Radcliffe Department of Medicine, University of Oxford, Oxford, UK
| | - Damien J Downes
- MRC Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, Radcliffe Department of Medicine, University of Oxford, Oxford, UK
| | - Danuta M Jeziorska
- MRC Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, Radcliffe Department of Medicine, University of Oxford, Oxford, UK
| | | | - Ron Schwessinger
- MRC Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, Radcliffe Department of Medicine, University of Oxford, Oxford, UK
- MRC WIMM Centre for Computational Biology, MRC Weatherall Institute of Molecular Medicine, Radcliffe Department of Medicine, University of Oxford, Oxford, UK
| | - Stephen Taylor
- MRC WIMM Centre for Computational Biology, MRC Weatherall Institute of Molecular Medicine, Radcliffe Department of Medicine, University of Oxford, Oxford, UK
| | - Thomas A Milne
- MRC Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, Radcliffe Department of Medicine, University of Oxford, Oxford, UK
| | - Jim R Hughes
- MRC Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, Radcliffe Department of Medicine, University of Oxford, Oxford, UK
- MRC WIMM Centre for Computational Biology, MRC Weatherall Institute of Molecular Medicine, Radcliffe Department of Medicine, University of Oxford, Oxford, UK
| | - Doug R Higgs
- Laboratory of Gene Regulation, MRC Weatherall Institute of Molecular Medicine, Radcliffe Department of Medicine, University of Oxford, Oxford, UK
| | - James O J Davies
- MRC Molecular Haematology Unit, MRC Weatherall Institute of Molecular Medicine, Radcliffe Department of Medicine, University of Oxford, Oxford, UK.
| |
|
46
|
Zrimec J, Buric F, Kokina M, Garcia V, Zelezniak A. Learning the Regulatory Code of Gene Expression. Front Mol Biosci 2021; 8:673363. [PMID: 34179082 PMCID: PMC8223075 DOI: 10.3389/fmolb.2021.673363] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 02/27/2021] [Accepted: 05/24/2021] [Indexed: 11/13/2022] Open
Abstract
Data-driven machine learning is the method of choice for predicting molecular phenotypes from nucleotide sequence, modeling gene expression events including protein-DNA binding, chromatin states as well as mRNA and protein levels. Deep neural networks automatically learn informative sequence representations and interpreting them enables us to improve our understanding of the regulatory code governing gene expression. Here, we review the latest developments that apply shallow or deep learning to quantify molecular phenotypes and decode the cis-regulatory grammar from prokaryotic and eukaryotic sequencing data. Our approach is to build from the ground up, first focusing on the initiating protein-DNA interactions, then specific coding and non-coding regions, and finally on advances that combine multiple parts of the gene and mRNA regulatory structures, achieving unprecedented performance. We thus provide a quantitative view of gene expression regulation from nucleotide sequence, concluding with an information-centric overview of the central dogma of molecular biology.
Affiliation(s)
- Jan Zrimec
- Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
| | - Filip Buric
- Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
| | - Mariia Kokina
- Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
- Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kongens Lyngby, Denmark
| | - Victor Garcia
- School of Life Sciences and Facility Management, Zurich University of Applied Sciences, Wädenswil, Switzerland
| | - Aleksej Zelezniak
- Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
- Science for Life Laboratory, Stockholm, Sweden
| |
|
47
|
Tang Y, Jia Z, Xu H, Da LT, Wu Q. Mechanism of REST/NRSF regulation of clustered protocadherin α genes. Nucleic Acids Res 2021; 49:4506-4521. [PMID: 33849071 PMCID: PMC8096226 DOI: 10.1093/nar/gkab248] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 02/26/2021] [Revised: 03/23/2021] [Accepted: 03/26/2021] [Indexed: 12/16/2022] Open
Abstract
Repressor element-1 silencing transcription factor (REST) or neuron-restrictive silencer factor (NRSF) is a zinc-finger (ZF) containing transcriptional repressor that recognizes thousands of neuron-restrictive silencer elements (NRSEs) in mammalian genomes. How REST/NRSF regulates gene expression remains incompletely understood. Here, we investigate the binding pattern and regulation mechanism of REST/NRSF in the clustered protocadherin (PCDH) genes. We find that REST/NRSF directionally forms base-specific interactions with NRSEs via tandem ZFs in an anti-parallel manner but with striking conformational changes. In addition, REST/NRSF recruitment to the HS5-1 enhancer leads to the decrease of long-range enhancer-promoter interactions and downregulation of the clustered PCDHα genes. Thus, REST/NRSF represses PCDHα gene expression through directional binding to a repertoire of NRSEs within the distal enhancer and variable target genes.
Affiliation(s)
- Yuanxiao Tang
- Center for Comparative Biomedicine, MOE Key Laboratory of Systems Biomedicine, State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Institute of Systems Biomedicine, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Zhilian Jia
- Center for Comparative Biomedicine, MOE Key Laboratory of Systems Biomedicine, State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Institute of Systems Biomedicine, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Honglin Xu
- Center for Comparative Biomedicine, MOE Key Laboratory of Systems Biomedicine, State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Institute of Systems Biomedicine, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Lin-tai Da
- Center for Comparative Biomedicine, MOE Key Laboratory of Systems Biomedicine, State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Institute of Systems Biomedicine, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Qiang Wu
- Center for Comparative Biomedicine, MOE Key Laboratory of Systems Biomedicine, State Key Laboratory of Microbial Metabolism, Joint International Research Laboratory of Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Institute of Systems Biomedicine, Shanghai Jiao Tong University, Shanghai 200240, China
| |
|
48
|
Patty BJ, Hainer SJ. Transcription factor chromatin profiling genome-wide using uliCUT&RUN in single cells and individual blastocysts. Nat Protoc 2021; 16:2633-2666. [PMID: 33911257 PMCID: PMC8177051 DOI: 10.1038/s41596-021-00516-2] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 04/30/2020] [Accepted: 02/04/2021] [Indexed: 02/02/2023]
Abstract
Determining chromatin-associated protein localization across the genome has provided insight into the functions of DNA-binding proteins and their connections to disease. However, established protocols requiring large quantities of cell or tissue samples currently limit applications for clinical and biomedical research in this field. Furthermore, most technologies have been optimized to assess abundant histone protein localization, prohibiting the investigation of nonhistone protein localization in low cell numbers. We recently described a protocol to profile chromatin-associated protein localization in as low as one cell: ultra-low-input cleavage under targets and release using nuclease (uliCUT&RUN). Optimized from chromatin immunocleavage and CUT&RUN, uliCUT&RUN is a tethered enzyme-based protocol that utilizes a combination of recombinant protein, antibody recognition and stringent purification to selectively target proteins of interest and isolate the associated DNA. Performed in native conditions, uliCUT&RUN profiles protein localization to chromatin with low input and high precision. Compared with other profiling technologies, uliCUT&RUN can determine nonhistone protein chromatin occupancies in low cell numbers, permitting the investigation into the molecular functions of a range of DNA-binding proteins within rare samples. From sample preparation to sequencing library submission, the uliCUT&RUN protocol takes <2 d to perform, with the accompanying data analysis timeline dependent on experience level.
Affiliation(s)
- Benjamin J Patty
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, USA
| | - Sarah J Hainer
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, USA.
| |
|
49
|
Mylonas C, Lee C, Auld AL, Cisse II, Boyer LA. A dual role for H2A.Z.1 in modulating the dynamics of RNA polymerase II initiation and elongation. Nat Struct Mol Biol 2021; 28:435-442. [PMID: 33972784 DOI: 10.1038/s41594-021-00589-3] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 01/27/2021] [Accepted: 04/06/2021] [Indexed: 02/03/2023]
Abstract
RNA polymerase II (RNAPII) pausing immediately downstream of the transcription start site is a critical rate-limiting step for the expression of most metazoan genes. During pause release, RNAPII encounters a highly conserved +1 H2A.Z nucleosome, yet how this histone variant contributes to transcription is poorly understood. Here, using an inducible protein degron system combined with genomic approaches and live cell super-resolution microscopy, we show that H2A.Z.1 modulates RNAPII dynamics across most genes in murine embryonic stem cells. Our quantitative analysis shows that H2A.Z.1 slows the rate of RNAPII pause release and consequently impacts negative elongation factor dynamics as well as nascent transcription. Consequently, H2A.Z.1 also impacts re-loading of the pre-initiation complex components TFIIB and TBP. Altogether, this work provides a critical mechanistic link between H2A.Z.1 and the proper induction of mammalian gene expression programs through the regulation of RNAPII dynamics and pause release.
Affiliation(s)
- Constantine Mylonas
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Choongman Lee
- Department of Physics, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Alexander L Auld
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Ibrahim I Cisse
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA; Department of Physics, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Laurie A Boyer
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA; Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
| |
|
50
|
Jaeger MG, Winter GE. Fast-acting chemical tools to delineate causality in transcriptional control. Mol Cell 2021; 81:1617-1630. [PMID: 33689749 DOI: 10.1016/j.molcel.2021.02.015] [Citation(s) in RCA: 43] [Impact Index Per Article: 14.3] [Received: 11/13/2020] [Revised: 01/20/2021] [Accepted: 02/11/2021] [Indexed: 12/11/2022]
Abstract
Multi-dimensional omics profiling continues to illuminate the complexity of cellular processes. Because of difficult mechanistic interpretation of phenotypes induced by slow perturbation, fast experimental setups are increasingly used to dissect causal interactions directly in cells. Here we review a growing body of studies that leverage rapid pharmacological perturbation to delineate causality in gene control. When coupled with kinetically matched readouts, fast chemical genetic tools allow recording of primary phenotypes before confounding secondary effects manifest. The toolbox encompasses directly acting probes, such as active-site inhibitors and proteolysis-targeting chimeras, as well as strategies using genetic engineering to render target proteins chemically tractable, such as analog-sensitive and degron systems. We anticipate that extrapolation of these concepts to single-cell setups will further transform our mechanistic understanding of transcriptional control in the future. Importantly, the concept of leveraging speed to derive causality should be broadly applicable to many aspects of biological regulation.
Affiliation(s)
- Martin G Jaeger
- CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
| | - Georg E Winter
- CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria.
| |
|