Reference Citation Analysis

For: Keilwagen J, Posch S, Grau J. Accurate prediction of cell type-specific transcription factor binding. Genome Biol 2019;20:9. [PMID: 30630522 PMCID: PMC6327544 DOI: 10.1186/s13059-018-1614-y] [Citation(s) in RCA: 56] [Impact Index Per Article: 11.2] [Received: 07/25/2018] [Accepted: 12/18/2018] [Indexed: 01/11/2023] Open

Cited by Other Article(s)

Raditsa V, Tsukanov A, Bogomolov A, Levitsky V. Genomic background sequences systematically outperform synthetic ones in de novo motif discovery for ChIP-seq data. NAR Genom Bioinform 2024;6:lqae090. [PMID: 39071850 PMCID: PMC11282361 DOI: 10.1093/nargab/lqae090] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2024] [Revised: 06/03/2024] [Accepted: 07/19/2024] [Indexed: 07/30/2024] Open

Xu C, Kleinschmidt H, Yang J, Leith EM, Johnson J, Tan S, Mahony S, Bai L. Systematic dissection of sequence features affecting binding specificity of a pioneer factor reveals binding synergy between FOXA1 and AP-1. Mol Cell 2024;84:2838-2855.e10. [PMID: 39019045 PMCID: PMC11334613 DOI: 10.1016/j.molcel.2024.06.022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2024] [Revised: 04/23/2024] [Accepted: 06/21/2024] [Indexed: 07/19/2024]

Affiliation(s)

Cheng Xu Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA; Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA
Holly Kleinschmidt Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA; Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA
Jianyu Yang Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA; Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA
Erik M Leith Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA; Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA
Jenna Johnson Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
Song Tan Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA; Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA
Shaun Mahony Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA; Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA
Lu Bai Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA; Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA; Department of Physics, The Pennsylvania State University, University Park, PA 16802, USA.

Elkayam S, Tziony I, Orenstein Y. DeepCRISTL: deep transfer learning to predict CRISPR/Cas9 on-target editing efficiency in specific cellular contexts. Bioinformatics 2024;40:btae481. [PMID: 39073893 PMCID: PMC11319645 DOI: 10.1093/bioinformatics/btae481] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2023] [Revised: 05/28/2024] [Accepted: 07/27/2024] [Indexed: 07/31/2024] Open

Abstract

MOTIVATION

CRISPR/Cas9 technology has been revolutionizing the field of gene editing. Guide RNAs (gRNAs) enable Cas9 proteins to target specific genomic loci for editing. However, editing efficiency varies between gRNAs and so computational methods were developed to predict editing efficiency for any gRNA of interest. High-throughput datasets of Cas9 editing efficiencies were produced to train machine-learning models to predict editing efficiency. However, these high-throughput datasets have a low correlation with functional and endogenous datasets, which are too small to train accurate machine-learning models on.

RESULTS

We developed DeepCRISTL, a deep-learning model to predict the editing efficiency in a specific cellular context. DeepCRISTL takes advantage of high-throughput datasets to learn general patterns of gRNA editing efficiency and then fine-tunes the model on functional or endogenous data to fit a specific cellular context. We tested two state-of-the-art models trained on high-throughput datasets for editing efficiency prediction, our newly improved DeepHF and CRISPRon, combined with various transfer-learning approaches. The combination of CRISPRon and fine-tuning all model weights was the overall best performer. DeepCRISTL outperformed state-of-the-art methods in predicting editing efficiency in a specific cellular context on functional and endogenous datasets. Using saliency maps, we identified and compared the important features learned by DeepCRISTL across cellular contexts. We believe DeepCRISTL will improve prediction performance in many other CRISPR/Cas9 editing contexts by leveraging transfer learning to utilize both high-throughput datasets and smaller and more biologically relevant datasets.

AVAILABILITY AND IMPLEMENTATION

DeepCRISTL is available via https://github.com/OrensteinLab/DeepCRISTL.
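The pretrain-then-fine-tune idea summarized in the DeepCRISTL abstract above can be illustrated with a toy sketch. This is not the DeepCRISTL code or architecture: it assumes a plain linear model trained by gradient descent, and the synthetic datasets, weights, learning rate, and all function names (`train`, `mse`, `make_data`) are hypothetical choices made only for illustration.

```python
import random

def predict(w, x):
    """Linear model: dot product of weights and features."""
    return sum(wi * xi for wi, xi in zip(w, x))

def mse(w, data):
    """Mean squared error of weights w on a list of (x, y) pairs."""
    return sum((predict(w, x) - y) ** 2 for x, y in data) / len(data)

def train(weights, data, lr=0.05, epochs=200):
    """Full-batch gradient descent on MSE, starting from `weights`."""
    w = list(weights)
    n = len(data)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in data:
            err = predict(w, x) - y
            for j, xj in enumerate(x):
                grad[j] += 2.0 * err * xj / n
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w

def make_data(true_w, n, rng, noise=0.0):
    """Synthetic regression data (stand-in for real efficiency measurements)."""
    data = []
    for _ in range(n):
        x = [rng.uniform(-1.0, 1.0) for _ in true_w]
        data.append((x, predict(true_w, x) + rng.gauss(0.0, noise)))
    return data

rng = random.Random(0)
source_w = [1.0, -2.0, 0.5]  # assumed "general" pattern in the large dataset
target_w = [1.2, -1.8, 0.9]  # assumed shifted pattern in one cellular context

large_set = make_data(source_w, 500, rng, noise=0.05)  # high-throughput stand-in
small_set = make_data(target_w, 20, rng, noise=0.05)   # context-specific stand-in
held_out = make_data(target_w, 200, rng)               # evaluation in the target context

pretrained = train([0.0, 0.0, 0.0], large_set)       # learn general patterns
fine_tuned = train(pretrained, small_set, epochs=50)  # update all weights on small data

print("pretrained MSE in target context:", round(mse(pretrained, held_out), 4))
print("fine-tuned MSE in target context:", round(mse(fine_tuned, held_out), 4))
```

In this toy setting every weight is updated during fine-tuning, loosely mirroring the "fine-tuning all model weights" variant named in the abstract; freezing some weights would be the alternative transfer-learning design.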

Yang Y, Pe’er D. REUNION: transcription factor binding prediction and regulatory association inference from single-cell multi-omics data. Bioinformatics 2024;40:i567-i575. [PMID: 38940155 PMCID: PMC11211829 DOI: 10.1093/bioinformatics/btae234] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/29/2024] Open

Abstract

MOTIVATION

Profiling of gene expression and chromatin accessibility by single-cell multi-omics approaches can help to systematically decipher how transcription factors (TFs) regulate target gene expression via cis-region interactions. However, integrating information from different modalities to discover regulatory associations is challenging, in part because motif scanning approaches miss many likely TF binding sites.

RESULTS

We develop REUNION, a framework for predicting genome-wide TF binding and cis-region-TF-gene "triplet" regulatory associations using single-cell multi-omics data. The first component of REUNION, Unify, utilizes information theory-inspired complementary score functions that incorporate TF expression, chromatin accessibility, and target gene expression to identify regulatory associations. The second component, Rediscover, takes Unify estimates as input for pseudo semi-supervised learning to predict TF binding in accessible genomic regions that may or may not include detected TF motifs. Rediscover leverages latent chromatin accessibility and sequence feature spaces of the genomic regions, without requiring chromatin immunoprecipitation data for model training. Applied to peripheral blood mononuclear cell data, REUNION outperforms alternative methods in TF binding prediction on average performance. In particular, it recovers missing region-TF associations from regions lacking detected motifs, which circumvents the reliance on motif scanning and facilitates discovery of novel associations involving potential co-binding transcriptional regulators. Newly identified region-TF associations, even in regions lacking a detected motif, improve the prediction of target gene expression in regulatory triplets, and are thus likely to genuinely participate in the regulation.

AVAILABILITY AND IMPLEMENTATION

All source code is available at https://github.com/yangymargaret/REUNION.

Ehle C, Iyer-Bierhoff A, Wu Y, Xing S, Kiehntopf M, Mosig AS, Godmann M, Heinzel T. Downregulation of HNF4A enables transcriptomic reprogramming during the hepatic acute-phase response. Commun Biol 2024;7:589. [PMID: 38755249 PMCID: PMC11099168 DOI: 10.1038/s42003-024-06288-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Accepted: 05/03/2024] [Indexed: 05/18/2024] Open

Saotome M, Poduval D, Grimm SA, Nagornyuk A, Gunarathna S, Shimbo T, Wade P, Takaku M. Genomic transcription factor binding site selection is edited by the chromatin remodeling factor CHD4. Nucleic Acids Res 2024;52:3607-3622. [PMID: 38281186 PMCID: PMC11039999 DOI: 10.1093/nar/gkae025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2023] [Revised: 12/19/2023] [Accepted: 01/04/2024] [Indexed: 01/30/2024] Open

Yang Z, Li X, Sheng L, Zhu M, Lan X, Gu F. Multiomics-integrated deep language model enables in silico genome-wide detection of transcription factor binding site in unexplored biosamples. Bioinformatics 2024;40:btae013. [PMID: 38216534 PMCID: PMC10812877 DOI: 10.1093/bioinformatics/btae013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2023] [Revised: 12/07/2023] [Accepted: 01/11/2024] [Indexed: 01/14/2024] Open

Neikes HK, Kliza KW, Gräwe C, Wester RA, Jansen PWTC, Lamers LA, Baltissen MP, van Heeringen SJ, Logie C, Teichmann SA, Lindeboom RGH, Vermeulen M. Quantification of absolute transcription factor binding affinities in the native chromatin context using BANC-seq. Nat Biotechnol 2023;41:1801-1809. [PMID: 36973556 DOI: 10.1038/s41587-023-01715-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2022] [Accepted: 02/16/2023] [Indexed: 03/29/2023]

Affiliation(s)

Hannah K Neikes Department of Molecular Biology, Faculty of Science, Radboud Institute for Molecular Life Sciences, Oncode Institute, Radboud University Nijmegen, Nijmegen, the Netherlands
Katarzyna W Kliza Department of Molecular Biology, Faculty of Science, Radboud Institute for Molecular Life Sciences, Oncode Institute, Radboud University Nijmegen, Nijmegen, the Netherlands
Cathrin Gräwe Department of Molecular Biology, Faculty of Science, Radboud Institute for Molecular Life Sciences, Oncode Institute, Radboud University Nijmegen, Nijmegen, the Netherlands
Roelof A Wester Department of Molecular Biology, Faculty of Science, Radboud Institute for Molecular Life Sciences, Oncode Institute, Radboud University Nijmegen, Nijmegen, the Netherlands
Pascal W T C Jansen Department of Molecular Biology, Faculty of Science, Radboud Institute for Molecular Life Sciences, Oncode Institute, Radboud University Nijmegen, Nijmegen, the Netherlands
Lieke A Lamers Department of Molecular Biology, Faculty of Science, Radboud Institute for Molecular Life Sciences, Oncode Institute, Radboud University Nijmegen, Nijmegen, the Netherlands
Marijke P Baltissen Department of Molecular Biology, Faculty of Science, Radboud Institute for Molecular Life Sciences, Oncode Institute, Radboud University Nijmegen, Nijmegen, the Netherlands
Simon J van Heeringen Department of Molecular Developmental Biology, Faculty of Science, Radboud Institute for Molecular Life Sciences, Radboud University Nijmegen, Nijmegen, the Netherlands
Colin Logie Department of Molecular Biology, Faculty of Science, Radboud Institute for Molecular Life Sciences, Radboud University Nijmegen, Nijmegen, the Netherlands
Sarah A Teichmann Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK
Rik G H Lindeboom Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK. The Netherlands Cancer Institute, Amsterdam, the Netherlands.
Michiel Vermeulen Department of Molecular Biology, Faculty of Science, Radboud Institute for Molecular Life Sciences, Oncode Institute, Radboud University Nijmegen, Nijmegen, the Netherlands. The Netherlands Cancer Institute, Amsterdam, the Netherlands.

Filipovic D, Qi W, Kana O, Marri D, LeCluyse EL, Andersen ME, Cuddapah S, Bhattacharya S. Interpretable predictive models of genome-wide aryl hydrocarbon receptor-DNA binding reveal tissue-specific binding determinants. Toxicol Sci 2023;196:170-186. [PMID: 37707797 PMCID: PMC10682972 DOI: 10.1093/toxsci/kfad094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/15/2023] Open

Affiliation(s)

David Filipovic Department of Biomedical Engineering, Michigan State University, East Lansing, Michigan 48824, USA; Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, Michigan 48824, USA; Institute for Quantitative Health Science & Engineering, Michigan State University, East Lansing, Michigan 48824, USA
Wenjie Qi Department of Biomedical Engineering, Michigan State University, East Lansing, Michigan 48824, USA; Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, Michigan 48824, USA; Institute for Quantitative Health Science & Engineering, Michigan State University, East Lansing, Michigan 48824, USA
Omar Kana Institute for Quantitative Health Science & Engineering, Michigan State University, East Lansing, Michigan 48824, USA; Department of Pharmacology & Toxicology, Michigan State University, East Lansing, Michigan 48824, USA; Institute for Integrative Toxicology, Michigan State University, East Lansing, Michigan 48824, USA
Daniel Marri Department of Biomedical Engineering, Michigan State University, East Lansing, Michigan 48824, USA; Institute for Quantitative Health Science & Engineering, Michigan State University, East Lansing, Michigan 48824, USA
Edward L LeCluyse LifeSciences Division, LifeNet Health, Research Triangle Park, North Carolina 27709, USA
Melvin E Andersen ScitoVation LLC, Durham, North Carolina 27713, USA
Suresh Cuddapah Division of Environmental Medicine, Department of Medicine, New York University School of Medicine, New York, New York 10010, USA
Sudin Bhattacharya Department of Biomedical Engineering, Michigan State University, East Lansing, Michigan 48824, USA; Institute for Quantitative Health Science & Engineering, Michigan State University, East Lansing, Michigan 48824, USA; Department of Pharmacology & Toxicology, Michigan State University, East Lansing, Michigan 48824, USA; Institute for Integrative Toxicology, Michigan State University, East Lansing, Michigan 48824, USA; Center for Research on Ingredient Safety, Michigan State University, East Lansing, Michigan 48824, USA

Xu C, Kleinschmidt H, Yang J, Leith E, Johnson J, Tan S, Mahony S, Bai L. Systematic Dissection of Sequence Features Affecting the Binding Specificity of a Pioneer Factor Reveals Binding Synergy Between FOXA1 and AP-1. bioRxiv 2023:2023.11.08.566246. [PMID: 37986839 PMCID: PMC10659273 DOI: 10.1101/2023.11.08.566246] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2023]

Affiliation(s)

Cheng Xu Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA; Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA
Holly Kleinschmidt Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA; Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA
Jianyu Yang Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA; Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA
Erik Leith Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA; Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA
Jenna Johnson Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
Song Tan Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA; Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA
Shaun Mahony Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA; Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA
Lu Bai Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA; Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA 16802, USA; Department of Physics, The Pennsylvania State University, University Park, PA 16802, USA

Grau J, Schmidt F, Schulz MH. Widespread effects of DNA methylation and intra-motif dependencies revealed by novel transcription factor binding models. Nucleic Acids Res 2023;51:e95. [PMID: 37650641 PMCID: PMC10570048 DOI: 10.1093/nar/gkad693] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2022] [Revised: 07/20/2023] [Accepted: 08/10/2023] [Indexed: 09/01/2023] Open

Walker M, Li Y, Morales-Hernandez A, Qi Q, Parupalli C, Brown S, Christian C, Clements WK, Cheng Y, McKinney-Freeman S. An NFIX-mediated regulatory network governs the balance of hematopoietic stem and progenitor cells during hematopoiesis. Blood Adv 2023;7:4677-4689. [PMID: 36478187 PMCID: PMC10468369 DOI: 10.1182/bloodadvances.2022007811] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 04/08/2022] [Revised: 10/07/2022] [Accepted: 11/09/2022] [Indexed: 12/12/2022] Open

Villaman C, Pollastri G, Saez M, Martin AJ. Benefiting from the intrinsic role of epigenetics to predict patterns of CTCF binding. Comput Struct Biotechnol J 2023;21:3024-3031. [PMID: 37266407 PMCID: PMC10229758 DOI: 10.1016/j.csbj.2023.05.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2022] [Revised: 05/11/2023] [Accepted: 05/11/2023] [Indexed: 06/03/2023] Open

Computational approaches to understand transcription regulation in development. Biochem Soc Trans 2023;51:1-12. [PMID: 36695505 PMCID: PMC9988001 DOI: 10.1042/bst20210145] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2022] [Revised: 01/07/2023] [Accepted: 01/13/2023] [Indexed: 01/26/2023]

Cazares TA, Rizvi FW, Iyer B, Chen X, Kotliar M, Bejjani AT, Wayman JA, Donmez O, Wronowski B, Parameswaran S, Kottyan LC, Barski A, Weirauch MT, Prasath VBS, Miraldi ER. maxATAC: Genome-scale transcription-factor binding prediction from ATAC-seq with deep neural networks. PLoS Comput Biol 2023;19:e1010863. [PMID: 36719906 PMCID: PMC9917285 DOI: 10.1371/journal.pcbi.1010863] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 07/06/2022] [Revised: 02/10/2023] [Accepted: 01/10/2023] [Indexed: 02/01/2023] Open

Affiliation(s)

Tareian A. Cazares Immunology Graduate Program, University of Cincinnati College of Medicine, Cincinnati, Ohio, United States of America
Faiz W. Rizvi Systems Biology and Physiology Graduate Program, University of Cincinnati College of Medicine, Cincinnati, Ohio, United States of America
Balaji Iyer Division of Biomedical Informatics, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America; Department of Electrical Engineering and Computer Science, University of Cincinnati, Cincinnati, Ohio, United States of America
Xiaoting Chen The Center for Autoimmune Genetics and Etiology (CAGE), Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America
Michael Kotliar Division of Allergy and Immunology, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America
Anthony T. Bejjani Molecular and Developmental Biology Graduate Program, University of Cincinnati College of Medicine, Cincinnati, Ohio, United States of America
Joseph A. Wayman Division of Immunobiology, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America
Omer Donmez The Center for Autoimmune Genetics and Etiology (CAGE), Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America
Benjamin Wronowski Division of Allergy and Immunology, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America
Sreeja Parameswaran The Center for Autoimmune Genetics and Etiology (CAGE), Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America
Leah C. Kottyan The Center for Autoimmune Genetics and Etiology (CAGE), Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America; Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, Ohio, United States of America; Division of Human Genetics, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America
Artem Barski Division of Allergy and Immunology, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America; Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, Ohio, United States of America; Division of Human Genetics, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America
Matthew T. Weirauch Division of Biomedical Informatics, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America; The Center for Autoimmune Genetics and Etiology (CAGE), Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America; Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, Ohio, United States of America; Division of Human Genetics, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America; Division of Developmental Biology, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America
V. B. Surya Prasath Division of Biomedical Informatics, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America; Department of Electrical Engineering and Computer Science, University of Cincinnati, Cincinnati, Ohio, United States of America; Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, Ohio, United States of America
Emily R. Miraldi Division of Biomedical Informatics, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America; Department of Electrical Engineering and Computer Science, University of Cincinnati, Cincinnati, Ohio, United States of America; Division of Immunobiology, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio, United States of America; Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, Ohio, United States of America

Zhang Q, Teng P, Wang S, He Y, Cui Z, Guo Z, Liu Y, Yuan C, Liu Q, Huang DS. Computational prediction and characterization of cell-type-specific and shared binding sites. Bioinformatics 2022;39:6885447. [PMID: 36484687 PMCID: PMC9825777 DOI: 10.1093/bioinformatics/btac798] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2022] [Revised: 11/24/2022] [Accepted: 12/08/2022] [Indexed: 12/13/2022] Open

Abstract

MOTIVATION

Cell-type-specific gene expression is maintained in large part by transcription factors (TFs) selectively binding to distinct sets of sites in different cell types. Recent studies have provided evidence that such cell-type-specific binding is determined by TFs' intrinsic sequence preferences, cooperative interactions with co-factors, cell-type-specific chromatin landscapes and 3D chromatin interactions. However, computational prediction and characterization of cell-type-specific and shared binding sites has rarely been studied.

RESULTS

In this article, we propose two computational approaches for predicting and characterizing cell-type-specific and shared binding sites by integrating multiple types of features: one is based on XGBoost and the other on a convolutional neural network (CNN). To validate the performance of our proposed approaches, ChIP-seq datasets of 10 binding factors were collected from the GM12878 (lymphoblastoid) and K562 (erythroleukemic) human hematopoietic cell lines, and the binding sites of each factor were further categorized into cell-type-specific (GM12878- and K562-specific) and shared binding sites. Then, multiple types of features for these binding sites were integrated to train the XGBoost- and CNN-based models. Experimental results show that our proposed approaches significantly outperform other competing methods on three classification tasks. Moreover, we identified independent feature contributions for cell-type-specific and shared sites through SHAP values and explored the ability of the CNN-based model to predict cell-type-specific and shared binding sites by excluding or including DNase signals. Furthermore, we investigated the generalization ability of our proposed approaches to different binding factors in the same cellular environment.

AVAILABILITY AND IMPLEMENTATION

The source code is available at: https://github.com/turningpoint1988/CSSBS.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Yan W, Li Z, Pian C, Wu Y. PlantBind: an attention-based multi-label neural network for predicting plant transcription factor binding sites. Brief Bioinform 2022;23:6713513. [PMID: 36155619 DOI: 10.1093/bib/bbac425] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/18/2022] [Revised: 08/29/2022] [Accepted: 08/31/2022] [Indexed: 12/14/2022] Open

Rivière Q, Corso M, Ciortan M, Noël G, Verbruggen N, Defrance M. Exploiting Genomic Features to Improve the Prediction of Transcription Factor-Binding Sites in Plants. Plant Cell Physiol 2022;63:1457-1473. [PMID: 35799371 DOI: 10.1093/pcp/pcac095] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2021] [Revised: 06/07/2022] [Accepted: 07/06/2022] [Indexed: 06/15/2023]

McAfee JC, Bell JL, Krupa O, Matoba N, Stein JL, Won H. Focus on your locus with a massively parallel reporter assay. J Neurodev Disord 2022;14:50. [PMID: 36085003 PMCID: PMC9463819 DOI: 10.1186/s11689-022-09461-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/19/2021] [Accepted: 09/01/2022] [Indexed: 01/01/2023] Open

Lal A. Deciphering the regulatory syntax of genomic DNA with deep learning. J Biosci 2022. [DOI: 10.1007/s12038-022-00291-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]

Yang MG, Ling E, Cowley CJ, Greenberg ME, Vierbuchen T. Characterization of sequence determinants of enhancer function using natural genetic variation. eLife 2022;11:76500. [PMID: 36043696 PMCID: PMC9662815 DOI: 10.7554/elife.76500] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 12/18/2021] [Accepted: 08/30/2022] [Indexed: 02/04/2023] Open

Yi R, Cho K, Bonneau R. NetTIME: a Multitask and Base-pair Resolution Framework for Improved Transcription Factor Binding Site Prediction. Bioinformatics 2022;38:4762-4770. [PMID: 35997560 PMCID: PMC9563695 DOI: 10.1093/bioinformatics/btac569] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2021] [Revised: 08/16/2022] [Accepted: 08/20/2022] [Indexed: 12/05/2022] Open

Abstract

Motivation

Machine learning models for predicting cell-type-specific transcription factor (TF) binding sites have become increasingly more accurate thanks to the increased availability of next-generation sequencing data and more standardized model evaluation criteria. However, knowledge transfer from data-rich to data-limited TFs and cell types remains crucial for improving TF binding prediction models because available binding labels are highly skewed towards a small collection of TFs and cell types. Transfer prediction of TF binding sites can potentially benefit from a multitask learning approach; however, existing methods typically use shallow single-task models to generate low-resolution predictions. Here, we propose NetTIME, a multitask learning framework for predicting cell-type-specific TF binding sites with base-pair resolution.

Results

We show that the multitask learning strategy for TF binding prediction is more efficient than the single-task approach due to the increased data availability. NetTIME trains high-dimensional embedding vectors to distinguish TF and cell-type identities. We show that this approach is critical for the success of the multitask learning strategy and allows our model to make accurate transfer predictions within and beyond the training panels of TFs and cell types. We additionally train a linear-chain conditional random field (CRF) to classify binding predictions and show that this CRF eliminates the need for setting a probability threshold and reduces classification noise. We compare our method’s predictive performance with two state-of-the-art methods, Catchitt and Leopard, and show that our method outperforms previous methods under both supervised and transfer learning settings.

Availability and implementation

NetTIME is freely available at https://github.com/ryi06/NetTIME and the code is also archived at https://doi.org/10.5281/zenodo.6994897.

Supplementary information

Supplementary data are available at Bioinformatics online.
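
The CRF step mentioned in the NetTIME abstract can be illustrated with a minimal sketch of Viterbi decoding for a linear-chain CRF. This is not the NetTIME implementation: the function name `viterbi_decode`, the two-label setup (0 = unbound, 1 = bound), and all scores are hypothetical values chosen for illustration. It only shows why decoding a whole label sequence needs no probability threshold and can smooth out isolated noisy positions.

```python
def viterbi_decode(emissions, transitions):
    """Return the highest-scoring label path for a linear-chain CRF.

    emissions[t][k]   : score of label k at position t (hypothetical model output)
    transitions[j][k] : score of moving from label j to label k (hypothetical)
    """
    n_labels = len(transitions)
    # best[k] = best score of any path that ends in label k at the current position
    best = list(emissions[0])
    backpointers = []
    for scores in emissions[1:]:
        step_best, step_back = [], []
        for k in range(n_labels):
            # choose the previous label that maximises the score of reaching k
            prev = max(range(n_labels), key=lambda j: best[j] + transitions[j][k])
            step_best.append(best[prev] + transitions[prev][k] + scores[k])
            step_back.append(prev)
        best = step_best
        backpointers.append(step_back)
    # trace the best path back from the best final label
    label = max(range(n_labels), key=lambda k: best[k])
    path = [label]
    for step_back in reversed(backpointers):
        label = step_back[label]
        path.append(label)
    path.reverse()
    return path


# Illustrative scores only: columns are (unbound, bound) per base pair.
emissions = [[2.0, 0.0], [0.4, 0.6], [2.0, 0.0], [0.0, 2.0], [0.0, 2.0]]
# Staying in the same state is rewarded, switching is penalised.
transitions = [[1.0, -1.0], [-1.0, 1.0]]

independent = [row.index(max(row)) for row in emissions]
decoded = viterbi_decode(emissions, transitions)
print(independent)  # per-position argmax: [0, 1, 0, 1, 1]
print(decoded)      # sequence-level decoding: [0, 0, 0, 1, 1]
```

In this toy input the isolated "bound" call at the second position disappears once transitions are taken into account, which is the kind of noise reduction the abstract attributes to the CRF.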

Ng JWK, Ong EHQ, Tucker-Kellogg L, Tucker-Kellogg G. Deep learning for de-convolution of Smad2 versus Smad3 binding sites. BMC Genomics 2022;23:525. [PMID: 35858839 PMCID: PMC9297549 DOI: 10.1186/s12864-022-08565-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/18/2022] [Accepted: 04/19/2022] [Indexed: 11/10/2022] Open

Hernandez-Corchado A, Najafabadi HS. Toward a base-resolution panorama of the in vivo impact of cytosine methylation on transcription factor binding. Genome Biol 2022;23:151. [PMID: 35799193 PMCID: PMC9264634 DOI: 10.1186/s13059-022-02713-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/21/2021] [Accepted: 06/19/2022] [Indexed: 11/10/2022] Open

Abstract

Background

While methylation of CpG dinucleotides is traditionally considered antagonistic to the DNA-binding activity of most transcription factors (TFs), recent in vitro studies have revealed a more complex picture, suggesting that over a third of TFs may preferentially bind to methylated sequences. Expanding these in vitro observations to in vivo TF binding preferences is challenging since the effect of methylation of individual CpG sites cannot be easily isolated from the confounding effects of DNA accessibility and regional DNA methylation. Thus, in vivo methylation preferences of most TFs remain uncharacterized.

Results

We introduce joint accessibility-methylation-sequence (JAMS) models, which connect the strength of the binding signal observed in ChIP-seq to the DNA accessibility of the binding site, regional methylation level, DNA sequence, and base-resolution cytosine methylation. We show that JAMS models quantitatively explain TF occupancy, recapitulate cell type-specific TF binding, and have high positive predictive value for identification of TFs affected by intra-motif methylation. Analysis of 2209 ChIP-seq experiments results in high-confidence JAMS models for 260 TFs, revealing a negative association between in vivo TF occupancy and intra-motif methylation for 45% of studied TFs, as well as 16 TFs that are predicted to bind to methylated sites, including 11 novel methyl-binding TFs mostly from the multi-zinc finger family.

Conclusions

Our study substantially expands the repertoire of in vivo methyl-binding TFs, but also suggests that most TFs that prefer methylated CpGs in vitro present themselves as methylation agnostic in vivo, potentially due to the balancing effect of competition with other methyl-binding proteins.

Supplementary Information

The online version contains supplementary material available at 10.1186/s13059-022-02713-y.

Karimzadeh M, Hoffman MM. Virtual ChIP-seq: predicting transcription factor binding by learning from the transcriptome. Genome Biol 2022;23:126. [PMID: 35681170 PMCID: PMC9185870 DOI: 10.1186/s13059-022-02690-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/04/2022] [Accepted: 05/16/2022] [Indexed: 11/29/2022] Open

Luo K, Zhong J, Safi A, Hong LK, Tewari AK, Song L, Reddy TE, Ma L, Crawford GE, Hartemink AJ. Profiling the quantitative occupancy of myriad transcription factors across conditions by modeling chromatin accessibility data. Genome Res 2022;32:1183-1198. [PMID: 35609992 PMCID: PMC9248881 DOI: 10.1101/gr.272203.120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/30/2020] [Accepted: 05/06/2022] [Indexed: 11/24/2022]

Affiliation(s)

Kaixuan Luo Computational Biology & Bioinformatics Graduate Program, Duke University, Durham, North Carolina 27708, USA Center for Genomic and Computational Biology, Duke University, Durham, North Carolina 27708, USA Department of Computer Science, Duke University, Durham, North Carolina 27708, USA Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
Jianling Zhong Computational Biology & Bioinformatics Graduate Program, Duke University, Durham, North Carolina 27708, USA Center for Genomic and Computational Biology, Duke University, Durham, North Carolina 27708, USA Department of Computer Science, Duke University, Durham, North Carolina 27708, USA
Alexias Safi Center for Genomic and Computational Biology, Duke University, Durham, North Carolina 27708, USA Department of Pediatrics, Duke University Medical Center, Durham, North Carolina 27710, USA
Linda K Hong Center for Genomic and Computational Biology, Duke University, Durham, North Carolina 27708, USA Department of Pediatrics, Duke University Medical Center, Durham, North Carolina 27710, USA
Alok K Tewari Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, Massachusetts 02215, USA
Lingyun Song Center for Genomic and Computational Biology, Duke University, Durham, North Carolina 27708, USA Department of Pediatrics, Duke University Medical Center, Durham, North Carolina 27710, USA
Timothy E Reddy Computational Biology & Bioinformatics Graduate Program, Duke University, Durham, North Carolina 27708, USA Center for Genomic and Computational Biology, Duke University, Durham, North Carolina 27708, USA Department of Biostatistics and Bioinformatics, Durham, North Carolina 27710, USA Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, North Carolina 27710, USA Department of Biomedical Engineering, Duke University, Durham, North Carolina 27708, USA
Li Ma Computational Biology & Bioinformatics Graduate Program, Duke University, Durham, North Carolina 27708, USA Department of Statistical Science, Duke University, Durham, North Carolina 27708, USA
Gregory E Crawford Computational Biology & Bioinformatics Graduate Program, Duke University, Durham, North Carolina 27708, USA Center for Genomic and Computational Biology, Duke University, Durham, North Carolina 27708, USA Department of Pediatrics, Duke University Medical Center, Durham, North Carolina 27710, USA
Alexander J Hartemink Computational Biology & Bioinformatics Graduate Program, Duke University, Durham, North Carolina 27708, USA Center for Genomic and Computational Biology, Duke University, Durham, North Carolina 27708, USA Department of Computer Science, Duke University, Durham, North Carolina 27708, USA Department of Biology, Duke University, Durham, North Carolina 27708, USA

Jing F, Zhang SW, Zhang S. Prediction of the transcription factor binding sites with meta-learning. Methods 2022;203:207-213. [DOI: 10.1016/j.ymeth.2022.04.010] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/11/2021] [Revised: 04/01/2022] [Accepted: 04/17/2022] [Indexed: 11/26/2022] Open

Sapoval N, Aghazadeh A, Nute MG, Antunes DA, Balaji A, Baraniuk R, Barberan CJ, Dannenfelser R, Dun C, Edrisi M, Elworth RAL, Kille B, Kyrillidis A, Nakhleh L, Wolfe CR, Yan Z, Yao V, Treangen TJ. Current progress and open challenges for applying deep learning across the biosciences. Nat Commun 2022;13:1728. [PMID: 35365602 PMCID: PMC8976012 DOI: 10.1038/s41467-022-29268-7] [Citation(s) in RCA: 76] [Impact Index Per Article: 38.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/25/2021] [Accepted: 03/09/2022] [Indexed: 11/19/2022] Open

Heller IS, Guenther CA, Meireles AM, Talbot WS, Kingsley DM. Characterization of mouse Bmp5 regulatory injury element in zebrafish wound models. Bone 2022;155:116263. [PMID: 34826632 PMCID: PMC9007314 DOI: 10.1016/j.bone.2021.116263] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 06/10/2021] [Revised: 11/17/2021] [Accepted: 11/18/2021] [Indexed: 11/21/2022]

Erkes A, Mücke S, Reschke M, Boch J, Grau J. Epigenetic features improve TALE target prediction. BMC Genomics 2021;22:914. [PMID: 34965853 PMCID: PMC8717664 DOI: 10.1186/s12864-021-08210-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/17/2021] [Accepted: 11/25/2021] [Indexed: 11/20/2022] Open

Abstract

Background

The yield of many crop plants can be substantially reduced by plant-pathogenic Xanthomonas bacteria. The infection strategy of many Xanthomonas strains is based on transcription activator-like effectors (TALEs), which are secreted into the host cells and act as transcriptional activators of plant genes that are beneficial for the bacteria. The modular DNA binding domain of TALEs contains tandem repeats, each comprising two hyper-variable amino acids. These repeat-variable diresidues (RVDs) bind to their target box and determine the specificity of a TALE. All available tools for the prediction of TALE targets within the host plant suffer from many false positives. In this paper we propose a strategy to improve prediction accuracy by considering the epigenetic state of the host plant genome in the region of the target box.

Results

To this end, we extend our previously published tool PrediTALE by considering two epigenetic features: (i) chromatin accessibility of potentially bound regions and (ii) DNA methylation of cytosines within target boxes. Here, we determine the epigenetic features from publicly available DNase-seq, ATAC-seq, and WGBS data in rice. We benchmark the utility of both epigenetic features separately and in combination, deriving ground-truth from RNA-seq data of infection studies in rice. We find an improvement for each individual epigenetic feature, but especially for the combination of both. Having established an advantage in TALE target prediction when considering epigenetic features, we use these data for promoterome and genome-wide scans by our new tool EpiTALE, leading to several novel putative virulence targets.

Conclusions

Our results suggest that it would be worthwhile to collect condition-specific chromatin accessibility data and methylation information when studying putative virulence targets of Xanthomonas TALEs.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12864-021-08210-z.

Constructing gene regulatory networks using epigenetic data. NPJ Syst Biol Appl 2021;7:45. [PMID: 34887443 PMCID: PMC8660777 DOI: 10.1038/s41540-021-00208-3] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/30/2020] [Accepted: 11/01/2021] [Indexed: 12/24/2022] Open

Morrow A, Hughes J, Singh J, Joseph A, Yosef N. Epitome: predicting epigenetic events in novel cell types with multi-cell deep ensemble learning. Nucleic Acids Res 2021;49:e110. [PMID: 34379786 PMCID: PMC8565335 DOI: 10.1093/nar/gkab676] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/11/2021] [Revised: 07/19/2021] [Accepted: 07/25/2021] [Indexed: 01/04/2023] Open

Wang H, Huang B, Wang J. Predict long-range enhancer regulation based on protein-protein interactions between transcription factors. Nucleic Acids Res 2021;49:10347-10368. [PMID: 34570239 PMCID: PMC8501976 DOI: 10.1093/nar/gkab841] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/20/2021] [Revised: 08/10/2021] [Accepted: 09/10/2021] [Indexed: 12/18/2022] Open

Xu Q, Georgiou G, Frölich S, van der Sande M, Veenstra G, Zhou H, van Heeringen S. ANANSE: an enhancer network-based computational approach for predicting key transcription factors in cell fate determination. Nucleic Acids Res 2021;49:7966-7985. [PMID: 34244796 PMCID: PMC8373078 DOI: 10.1093/nar/gkab598] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/04/2020] [Revised: 06/02/2021] [Accepted: 06/28/2021] [Indexed: 12/21/2022] Open

Shu H, Zhou J, Lian Q, Li H, Zhao D, Zeng J, Ma J. Modeling gene regulatory networks using neural network architectures. Nat Comput Sci 2021;1:491-501. [PMID: 38217125 DOI: 10.1038/s43588-021-00099-8] [Citation(s) in RCA: 47] [Impact Index Per Article: 15.7] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/25/2021] [Accepted: 06/15/2021] [Indexed: 01/15/2024]

Zrimec J, Buric F, Kokina M, Garcia V, Zelezniak A. Learning the Regulatory Code of Gene Expression. Front Mol Biosci 2021;8:673363. [PMID: 34179082 PMCID: PMC8223075 DOI: 10.3389/fmolb.2021.673363] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/27/2021] [Accepted: 05/24/2021] [Indexed: 11/13/2022] Open

Schreiber J, Singh R. Machine learning for profile prediction in genomics. Curr Opin Chem Biol 2021;65:35-41. [PMID: 34107341 DOI: 10.1016/j.cbpa.2021.04.008] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/31/2021] [Revised: 04/21/2021] [Accepted: 04/24/2021] [Indexed: 02/08/2023]

Meyer P, Saez-Rodriguez J. Advances in systems biology modeling: 10 years of crowdsourcing DREAM challenges. Cell Syst 2021;12:636-653. [PMID: 34139170 DOI: 10.1016/j.cels.2021.05.015] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/17/2020] [Revised: 03/29/2021] [Accepted: 05/18/2021] [Indexed: 02/07/2023]

Patel N, Bush WS. Modeling transcriptional regulation using gene regulatory networks based on multi-omics data sources. BMC Bioinformatics 2021;22:200. [PMID: 33874910 PMCID: PMC8056605 DOI: 10.1186/s12859-021-04126-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/20/2020] [Accepted: 04/09/2021] [Indexed: 11/17/2022] Open

Abstract

Background

Transcriptional regulation is complex, requiring multiple cis (local) and trans acting mechanisms working in concert to drive gene expression, with disruption of these processes linked to multiple diseases. Previous computational attempts to understand the influence of regulatory mechanisms on gene expression have used prediction models containing input features derived from cis regulatory factors. However, local chromatin looping and trans-acting mechanisms are known to also influence transcriptional regulation, and their inclusion may improve model accuracy and interpretation. In this study, we create a general model of transcription factor influence on gene expression by incorporating both cis and trans gene regulatory features.

Results

We describe a computational framework to model gene expression for GM12878 and K562 cell lines. This framework weights the impact of transcription factor-based regulatory data using multi-omics gene regulatory networks to account for both cis and trans acting mechanisms, and measures of the local chromatin context. These prediction models perform significantly better compared to models containing cis-regulatory features alone. Models that additionally integrate long distance chromatin interactions (or chromatin looping) between distal transcription factor binding regions and gene promoters also show improved accuracy. As a demonstration of their utility, effect estimates from these models were used to weight cis-regulatory rare variants for sequence kernel association test analyses of gene expression.

Conclusions

Our models generate refined effect estimates for the influence of individual transcription factors on gene expression, allowing characterization of their roles across the genome. This work also provides a framework for integrating multiple data types into a single model of transcriptional regulation.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12859-021-04126-3.

Li H, Guan Y. Fast decoding cell type-specific transcription factor binding landscape at single-nucleotide resolution. Genome Res 2021;31:721-731. [PMID: 33741685 PMCID: PMC8015851 DOI: 10.1101/gr.269613.120] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/30/2020] [Accepted: 02/17/2021] [Indexed: 01/22/2023]

Nakato R, Sakata T. Methods for ChIP-seq analysis: A practical workflow and advanced applications. Methods 2021;187:44-53. [PMID: 32240773 DOI: 10.1016/j.ymeth.2020.03.005] [Citation(s) in RCA: 90] [Impact Index Per Article: 30.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/18/2020] [Revised: 03/17/2020] [Accepted: 03/18/2020] [Indexed: 12/13/2022] Open

Integrative analysis identifies bHLH transcription factors as contributors to Parkinson's disease risk mechanisms. Sci Rep 2021;11:3502. [PMID: 33568722 PMCID: PMC7875985 DOI: 10.1038/s41598-021-83087-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/06/2019] [Accepted: 01/26/2021] [Indexed: 11/08/2022] Open

Xu L, Zu T, Li T, Li M, Mi J, Bai F, Liu G, Wen J, Li H, Brakebusch C, Wang X, Wu X. ATF3 downmodulates its new targets IFI6 and IFI27 to suppress the growth and migration of tongue squamous cell carcinoma cells. PLoS Genet 2021;17:e1009283. [PMID: 33539340 PMCID: PMC7888615 DOI: 10.1371/journal.pgen.1009283] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/15/2020] [Revised: 02/17/2021] [Accepted: 11/18/2020] [Indexed: 01/16/2023] Open

Abstract

Activating transcription factor 3 (ATF3) is a key transcription factor involved in regulating cellular stress responses, with different expression levels and functions in different tissues. ATF3 has also been shown to play crucial roles in regulating tumor development and progression; however, its potential role in oral squamous cell carcinomas has not been fully explored. In this study, we examined biopsies of tongue squamous cell carcinomas (TSCCs) and found that the nuclear expression level of ATF3 correlated negatively with the differentiation status of TSCCs, which was validated by analysis of the ATGC database. By using gain- or loss-of-function analyses of ATF3 in four different TSCC cell lines, we demonstrated that ATF3 negatively regulates the growth and migration of human TSCC cells in vitro. RNA-seq analysis identified two new downstream targets of ATF3, interferon alpha inducible proteins 6 (IFI6) and 27 (IFI27), which were upregulated in ATF3-deleted cells and were downregulated in ATF3-overexpressing cells. Chromatin immunoprecipitation assays showed that ATF3 binds the promoter regions of the IFI6 and IFI27 genes. Both IFI6 and IFI27 were highly expressed in TSCC biopsies and knockdown of either IFI6 or IFI27 in TSCC cells blocked the cell growth and migration induced by the deletion of ATF3. Conversely, overexpression of either IFI6 or IFI27 counteracted the inhibition of TSCC cell growth and migration induced by the overexpression of ATF3. Finally, an in vivo study in mice confirmed those in vitro findings. Our study suggests that ATF3 plays an anti-tumor function in TSCCs through the negative regulation of its downstream targets, IFI6 and IFI27.

Activating transcription factor 3 (ATF3), a stress response gene, has been shown to play either tumor promoting or tumor suppressing functions depending on the type of tumor cell and the stromal context. Here we discovered that ATF3 plays an anti-tumor role in tongue squamous cell carcinoma (TSCC) cells through the transcriptional suppression of its new downstream targets interferon alpha inducible proteins 6 (IFI6) and 27 (IFI27). This finding contributes to understanding how ATF3, a transcriptional repressor, can target specific downstream genes in different tumor cells to play anti-tumor or pro-tumor functions. A thorough understanding of ATF3 functions and its downstream signaling pathways provides a potential approach to develop new therapeutics for the treatment of tumors such as TSCCs.

Affiliation(s)

Lin Xu Department of Tissue Engineering and Regeneration, School and Hospital of Stomatology, Cheeloo College of Medicine, Shandong University & Shandong Key Laboratory of Oral Tissue Regeneration and Shandong Engineering Laboratory for Dental Materials and Oral Tissue Regeneration, Jinan, Shandong, China Department of Oral and Maxillofacial Surgery, School and Hospital of Stomatology, Cheeloo College of Medicine, Shandong University & Shandong Key Laboratory of Oral Tissue Regeneration & Shandong Engineering Laboratory for Dental Materials and Oral Tissue Regeneration, Shandong, China Department of Orthodontics, Liaocheng People’s Hospital, Liaocheng, Shandong, China Precision Biomedical Key Laboratory, Liaocheng People’s Hospital, Liaocheng, Shandong, China
Tingjian Zu Department of Tissue Engineering and Regeneration, School and Hospital of Stomatology, Cheeloo College of Medicine, Shandong University & Shandong Key Laboratory of Oral Tissue Regeneration and Shandong Engineering Laboratory for Dental Materials and Oral Tissue Regeneration, Jinan, Shandong, China School of Stomatology, Shandong First Medical University & Shandong Academy of Medical Sciences, Tai’an, Shandong, China
Tao Li Department of Tissue Engineering and Regeneration, School and Hospital of Stomatology, Cheeloo College of Medicine, Shandong University & Shandong Key Laboratory of Oral Tissue Regeneration and Shandong Engineering Laboratory for Dental Materials and Oral Tissue Regeneration, Jinan, Shandong, China Department of Oral and Maxillofacial Surgery, School and Hospital of Stomatology, Cheeloo College of Medicine, Shandong University & Shandong Key Laboratory of Oral Tissue Regeneration & Shandong Engineering Laboratory for Dental Materials and Oral Tissue Regeneration, Shandong, China
Min Li Precision Biomedical Key Laboratory, Liaocheng People’s Hospital, Liaocheng, Shandong, China
Jun Mi Department of Tissue Engineering and Regeneration, School and Hospital of Stomatology, Cheeloo College of Medicine, Shandong University & Shandong Key Laboratory of Oral Tissue Regeneration and Shandong Engineering Laboratory for Dental Materials and Oral Tissue Regeneration, Jinan, Shandong, China
Fuxiang Bai Department of Tissue Engineering and Regeneration, School and Hospital of Stomatology, Cheeloo College of Medicine, Shandong University & Shandong Key Laboratory of Oral Tissue Regeneration and Shandong Engineering Laboratory for Dental Materials and Oral Tissue Regeneration, Jinan, Shandong, China
Guanyi Liu Department of Tissue Engineering and Regeneration, School and Hospital of Stomatology, Cheeloo College of Medicine, Shandong University & Shandong Key Laboratory of Oral Tissue Regeneration and Shandong Engineering Laboratory for Dental Materials and Oral Tissue Regeneration, Jinan, Shandong, China
Jie Wen Department of Tissue Engineering and Regeneration, School and Hospital of Stomatology, Cheeloo College of Medicine, Shandong University & Shandong Key Laboratory of Oral Tissue Regeneration and Shandong Engineering Laboratory for Dental Materials and Oral Tissue Regeneration, Jinan, Shandong, China
Hui Li Department of Hematology, Southwest Hospital, Third Military Medical University, Chongqing, China
Cord Brakebusch Biotech Research and Innovation Centre (BRIC), University of Copenhagen, Ole Maaløes Vej 5, Copenhagen, Denmark
Xuxia Wang Department of Oral and Maxillofacial Surgery, School and Hospital of Stomatology, Cheeloo College of Medicine, Shandong University & Shandong Key Laboratory of Oral Tissue Regeneration & Shandong Engineering Laboratory for Dental Materials and Oral Tissue Regeneration, Shandong, China
Xunwei Wu Department of Tissue Engineering and Regeneration, School and Hospital of Stomatology, Cheeloo College of Medicine, Shandong University & Shandong Key Laboratory of Oral Tissue Regeneration and Shandong Engineering Laboratory for Dental Materials and Oral Tissue Regeneration, Jinan, Shandong, China

Chen C, Hou J, Shi X, Yang H, Birchler JA, Cheng J. DeepGRN: prediction of transcription factor binding site across cell-types using attention-based deep neural networks. BMC Bioinformatics 2021;22:38. [PMID: 33522898 PMCID: PMC7852092 DOI: 10.1186/s12859-020-03952-1] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/09/2020] [Accepted: 12/29/2020] [Indexed: 12/21/2022] Open

Abstract

Background

Due to the complexity of biological systems, the prediction of potential DNA binding sites for transcription factors remains a difficult problem in computational biology. Genomic DNA sequences and experimental results from parallel sequencing provide information about the affinity and accessibility of the genome and are commonly used features in binding site prediction. The attention mechanism in deep learning has shown its capability to learn long-range dependencies from sequential data, such as sentences and voices. Until now, no study has applied this approach in binding site inference from massively parallel sequencing data. The successful applications of the attention mechanism in similar input contexts motivate us to build and test new methods that can accurately determine the binding sites of transcription factors.

Results

In this study, we propose a novel tool (named DeepGRN) for transcription factor binding site prediction based on the combination of two components: a single attention module and a pairwise attention module. The performance of our methods is evaluated on the ENCODE-DREAM in vivo Transcription Factor Binding Site Prediction Challenge datasets. The results show that DeepGRN achieves higher unified scores in 6 of 13 targets than any of the top four methods in the DREAM challenge. We also demonstrate that the attention weights learned by the model are correlated with potential informative inputs, such as DNase-Seq coverage and motifs, which provide possible explanations for the predictive improvements in DeepGRN.

Conclusions

DeepGRN can automatically and effectively predict transcription factor binding sites from DNA sequences and DNase-Seq coverage. Furthermore, the visualization techniques we developed for the attention modules help to interpret how critical patterns from different types of input features are recognized by our model.
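
As a rough illustration of what "attention weights" over input positions mean in the DeepGRN abstract, the following is a minimal sketch of generic softmax attention pooling. It is not the DeepGRN architecture: the abstract does not specify how the single and pairwise attention modules are built, so the function names `softmax` and `attention_pool`, the two-dimensional features, and the scores are all hypothetical.

```python
import math


def softmax(scores):
    """Turn arbitrary real-valued scores into weights that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features, scores):
    """Weighted average of per-position feature vectors.

    features[t] : feature vector at position t (e.g. hypothetical sequence and
                  accessibility channels)
    scores[t]   : unnormalised relevance score for position t; in a trained model
                  this would be learned, here it is supplied by hand
    """
    weights = softmax(scores)
    dim = len(features[0])
    pooled = [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]
    return weights, pooled


# Illustrative values only: three positions, two feature channels each.
features = [[0.1, 1.0], [0.9, 0.0], [0.2, 0.5]]
scores = [0.0, 2.0, 0.5]

weights, pooled = attention_pool(features, scores)
print(weights)  # the second position receives the largest weight
print(pooled)   # pooled vector is dominated by the second position's features
```

Because the weights are explicit and sum to one, they can be plotted along the input to see which positions the model relies on, which is the general idea behind the visualization the abstract describes.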

Zhou M, Li H, Wang X, Guan Y. Evidence of widespread, independent sequence signature for transcription factor cobinding. Genome Res 2021;31:265-278. [PMID: 33303494 PMCID: PMC7849410 DOI: 10.1101/gr.267310.120] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/12/2020] [Accepted: 12/03/2020] [Indexed: 01/03/2023]

Srivastava D, Aydin B, Mazzoni EO, Mahony S. An interpretable bimodal neural network characterizes the sequence and preexisting chromatin predictors of induced transcription factor binding. Genome Biol 2021;22:20. [PMID: 33413545 PMCID: PMC7788824 DOI: 10.1186/s13059-020-02218-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/18/2019] [Accepted: 12/03/2020] [Indexed: 02/06/2023] Open

Abstract

Background

Transcription factor (TF) binding specificity is determined via a complex interplay between the transcription factor's DNA binding preference and cell type-specific chromatin environments. The chromatin features that correlate with transcription factor binding in a given cell type have been well characterized. For instance, the binding sites for a majority of transcription factors display concurrent chromatin accessibility. However, concurrent chromatin features reflect the binding activities of the transcription factor itself and thus provide limited insight into how genome-wide TF-DNA binding patterns became established in the first place. To understand the determinants of transcription factor binding specificity, we therefore need to examine how newly activated transcription factors interact with sequence and preexisting chromatin landscapes.

Results

Here, we investigate the sequence and preexisting chromatin predictors of TF-DNA binding by examining the genome-wide occupancy of transcription factors that have been induced in well-characterized chromatin environments. We develop Bichrom, a bimodal neural network that jointly models sequence and preexisting chromatin data to interpret the genome-wide binding patterns of induced transcription factors. We find that the preexisting chromatin landscape is a differential global predictor of TF-DNA binding; incorporating preexisting chromatin features improves our ability to explain the binding specificity of some transcription factors substantially, but not others. Furthermore, by analyzing site-level predictors, we show that transcription factor binding in previously inaccessible chromatin tends to correspond to the presence of more favorable cognate DNA sequences.

Conclusions

Bichrom thus provides a framework for modeling, interpreting, and visualizing the joint sequence and chromatin landscapes that determine TF-DNA binding dynamics.

Endometriosis Is Associated with a Significant Increase in hTERC and Altered Telomere/Telomerase Associated Genes in the Eutopic Endometrium, an Ex-Vivo and In Silico Study. Biomedicines 2020;8:biomedicines8120588. [PMID: 33317189 PMCID: PMC7764055 DOI: 10.3390/biomedicines8120588] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/30/2020] [Revised: 12/02/2020] [Accepted: 12/03/2020] [Indexed: 12/13/2022] Open

López-Rivera F, Foster Rhoades OK, Vincent BJ, Pym ECG, Bragdon MDJ, Estrada J, DePace AH, Wunderlich Z. A Mutation in the Drosophila melanogaster eve Stripe 2 Minimal Enhancer Is Buffered by Flanking Sequences. G3 (Bethesda) 2020;10:4473-4482. [PMID: 33037064 PMCID: PMC7718739 DOI: 10.1534/g3.120.401777] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 08/01/2020] [Accepted: 10/01/2020] [Indexed: 01/18/2023]

Martin PC, Zabet NR. Dissecting the binding mechanisms of transcription factors to DNA using a statistical thermodynamics framework. Comput Struct Biotechnol J 2020;18:3590-3605. [PMID: 33304457 PMCID: PMC7708957 DOI: 10.1016/j.csbj.2020.11.006] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/18/2020] [Revised: 11/02/2020] [Accepted: 11/04/2020] [Indexed: 01/22/2023] Open

Nameki R, Chang H, Reddy J, Corona RI, Lawrenson K. Transcription factors in epithelial ovarian cancer: histotype-specific drivers and novel therapeutic targets. Pharmacol Ther 2020;220:107722. [PMID: 33137377 DOI: 10.1016/j.pharmthera.2020.107722] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/24/2020] [Accepted: 10/26/2020] [Indexed: 02/06/2023]