1
Patrick R, Naval-Sanchez M, Deshpande N, Huang Y, Zhang J, Chen X, Yang Y, Tiwari K, Esmaeili M, Tran M, Mohamed AR, Wang B, Xia D, Ma J, Bayliss J, Wong K, Hun ML, Sun X, Cao B, Cottle DL, Catterall T, Barzilai-Tutsch H, Troskie RL, Chen Z, Wise AF, Saini S, Soe YM, Kumari S, Sweet MJ, Thomas HE, Smyth IM, Fletcher AL, Knoblich K, Watt MJ, Alhomrani M, Alsanie W, Quinn KM, Merson TD, Chidgey AP, Ricardo SD, Yu D, Jardé T, Cheetham SW, Marcelle C, Nilsson SK, Nguyen Q, White MD, Nefzger CM. The activity of early-life gene regulatory elements is hijacked in aging through pervasive AP-1-linked chromatin opening. Cell Metab 2024; 36:1858-1881.e23. [PMID: 38959897 DOI: 10.1016/j.cmet.2024.06.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/21/2023] [Revised: 03/28/2024] [Accepted: 06/06/2024] [Indexed: 07/05/2024]
Abstract
A mechanistic connection between aging and development is largely unexplored. Through profiling age-related chromatin and transcriptional changes across 22 murine cell types, analyzed alongside previous mouse and human organismal maturation datasets, we uncovered a transcription factor binding site (TFBS) signature common to both processes. Early-life candidate cis-regulatory elements (cCREs), progressively losing accessibility during maturation and aging, are enriched for cell-type identity TFBSs. Conversely, cCREs gaining accessibility throughout life have a lower abundance of cell identity TFBSs but elevated activator protein 1 (AP-1) levels. We implicate TF redistribution toward these AP-1 TFBS-rich cCREs, in synergy with mild downregulation of cell identity TFs, as driving early-life cCRE accessibility loss and altering developmental and metabolic gene expression. Such remodeling can be triggered by elevating AP-1 or depleting repressive H3K27me3. We propose that AP-1-linked chromatin opening drives organismal maturation by disrupting cell identity TFBS-rich cCREs, thereby reprogramming transcriptome and cell function, a mechanism hijacked in aging through ongoing chromatin opening.
Affiliation(s)
- Ralph Patrick
- Institute for Molecular Bioscience, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
- Marina Naval-Sanchez
- Institute for Molecular Bioscience, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
- Nikita Deshpande
- Institute for Molecular Bioscience, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia; WHO Collaborating Centre for Reference and Research on Influenza, The Peter Doherty Institute for Infection and Immunity, Melbourne, VIC 3000, Australia
- Yifei Huang
- Institute for Molecular Bioscience, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
- Jingyu Zhang
- Institute for Molecular Bioscience, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
- Xiaoli Chen
- Institute for Molecular Bioscience, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
- Ying Yang
- Institute for Molecular Bioscience, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
- Kanupriya Tiwari
- Institute for Molecular Bioscience, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
- Mohammadhossein Esmaeili
- Institute for Molecular Bioscience, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
- Minh Tran
- Institute for Molecular Bioscience, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
- Amin R Mohamed
- Institute for Molecular Bioscience, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
- Binxu Wang
- Institute for Molecular Bioscience, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
- Di Xia
- Genome Innovation Hub, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
- Jun Ma
- Genome Innovation Hub, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
- Jacqueline Bayliss
- Department of Anatomy and Physiology, Faculty of Medicine Dentistry and Health Sciences, The University of Melbourne, Parkville, VIC 3010, Australia
- Kahlia Wong
- Department of Anatomy and Developmental Biology, Monash Biomedicine Discovery Institute, Monash University, Clayton, VIC 3800, Australia
- Michael L Hun
- Department of Anatomy and Developmental Biology, Monash Biomedicine Discovery Institute, Monash University, Clayton, VIC 3800, Australia
- Xuan Sun
- Biomedical Manufacturing, Commonwealth Scientific and Industrial Research Organization, Melbourne, VIC, Australia; Australian Regenerative Medicine Institute, Monash University, Clayton, VIC 3800, Australia
- Benjamin Cao
- Biomedical Manufacturing, Commonwealth Scientific and Industrial Research Organization, Melbourne, VIC, Australia; Australian Regenerative Medicine Institute, Monash University, Clayton, VIC 3800, Australia
- Denny L Cottle
- Department of Anatomy and Developmental Biology, Monash Biomedicine Discovery Institute, Monash University, Clayton, VIC 3800, Australia
- Tara Catterall
- St. Vincent's Institute of Medical Research, Fitzroy, VIC 3065, Australia
- Hila Barzilai-Tutsch
- Australian Regenerative Medicine Institute, Monash University, Clayton, VIC 3800, Australia; Institut NeuroMyoGène, University Claude Bernard Lyon 1, 69008 Lyon, France
- Robin-Lee Troskie
- Australian Institute for Bioengineering and Nanotechnology, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
- Zhian Chen
- Frazer Institute, Faculty of Medicine, The University of Queensland, Brisbane, QLD 4102, Australia
- Andrea F Wise
- Department of Pharmacology, Monash Biomedicine Discovery Institute, Monash University, Clayton, VIC 3800, Australia
- Sheetal Saini
- Department of Pharmacology, Monash Biomedicine Discovery Institute, Monash University, Clayton, VIC 3800, Australia
- Ye Mon Soe
- Frazer Institute, Faculty of Medicine, The University of Queensland, Brisbane, QLD 4102, Australia
- Snehlata Kumari
- Frazer Institute, Faculty of Medicine, The University of Queensland, Brisbane, QLD 4102, Australia
- Matthew J Sweet
- Institute for Molecular Bioscience, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia; Australian Infectious Diseases Research Centre, The University of Queensland, Brisbane, QLD, Australia
- Helen E Thomas
- St. Vincent's Institute of Medical Research, Fitzroy, VIC 3065, Australia
- Ian M Smyth
- Department of Anatomy and Developmental Biology, Monash Biomedicine Discovery Institute, Monash University, Clayton, VIC 3800, Australia
- Anne L Fletcher
- Department of Biochemistry and Molecular Biology, Monash Biomedicine Discovery Institute, Monash University, Clayton, VIC 3800, Australia
- Konstantin Knoblich
- Department of Biochemistry and Molecular Biology, Monash Biomedicine Discovery Institute, Monash University, Clayton, VIC 3800, Australia
- Matthew J Watt
- Department of Anatomy and Physiology, Faculty of Medicine Dentistry and Health Sciences, The University of Melbourne, Parkville, VIC 3010, Australia
- Majid Alhomrani
- Department of Clinical Laboratories Sciences, Faculty of Applied Medical Sciences, Taif University, Taif, Saudi Arabia; Research Centre for Health Sciences, Taif University, Taif, Saudi Arabia
- Walaa Alsanie
- Department of Clinical Laboratories Sciences, Faculty of Applied Medical Sciences, Taif University, Taif, Saudi Arabia; Research Centre for Health Sciences, Taif University, Taif, Saudi Arabia
- Kylie M Quinn
- Department of Biochemistry and Molecular Biology, Monash Biomedicine Discovery Institute, Monash University, Clayton, VIC 3800, Australia; School of Health and Biomedical Sciences, RMIT University, Bundoora, VIC 3083, Australia
- Tobias D Merson
- Australian Regenerative Medicine Institute, Monash University, Clayton, VIC 3800, Australia; National Institute of Mental Health, National Institutes of Health, Bethesda, MD 20892, USA
- Ann P Chidgey
- Department of Anatomy and Developmental Biology, Monash Biomedicine Discovery Institute, Monash University, Clayton, VIC 3800, Australia
- Sharon D Ricardo
- Department of Pharmacology, Monash Biomedicine Discovery Institute, Monash University, Clayton, VIC 3800, Australia
- Di Yu
- Frazer Institute, Faculty of Medicine, The University of Queensland, Brisbane, QLD 4102, Australia; Ian Frazer Centre for Children's Immunotherapy Research, Child Health Research Centre, Faculty of Medicine, The University of Queensland, Brisbane, QLD 4102, Australia
- Thierry Jardé
- Department of Anatomy and Developmental Biology, Monash Biomedicine Discovery Institute, Monash University, Clayton, VIC 3800, Australia; Development and Stem Cells Program, Monash Biomedicine Discovery Institute, Monash University, Clayton, VIC 3800, Australia; Cancer Program, Monash Biomedicine Discovery Institute, Monash University, Clayton, VIC 3800, Australia; Department of Surgery, Cabrini Monash University, Malvern, VIC 3144, Australia
- Seth W Cheetham
- Australian Institute for Bioengineering and Nanotechnology, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
- Christophe Marcelle
- Australian Regenerative Medicine Institute, Monash University, Clayton, VIC 3800, Australia; Institut NeuroMyoGène, University Claude Bernard Lyon 1, 69008 Lyon, France
- Susan K Nilsson
- Biomedical Manufacturing, Commonwealth Scientific and Industrial Research Organization, Melbourne, VIC, Australia; Australian Regenerative Medicine Institute, Monash University, Clayton, VIC 3800, Australia
- Quan Nguyen
- Institute for Molecular Bioscience, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia; School of Biomedical Sciences, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
- Melanie D White
- Institute for Molecular Bioscience, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia; School of Biomedical Sciences, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
- Christian M Nefzger
- Institute for Molecular Bioscience, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia; Department of Anatomy and Developmental Biology, Monash Biomedicine Discovery Institute, Monash University, Clayton, VIC 3800, Australia; Development and Stem Cells Program, Monash Biomedicine Discovery Institute, Monash University, Clayton, VIC 3800, Australia; School of Chemistry and Molecular Biosciences, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
2
Kang JS, Kim D, Rhee J, Seo JY, Park I, Kim JH, Lee D, Lee W, Kim YL, Yoo K, Bae S, Chung J, Seong RH, Kong YY. Baf155 regulates skeletal muscle metabolism via HIF-1a signaling. PLoS Biol 2023; 21:e3002192. [PMID: 37478146 PMCID: PMC10396025 DOI: 10.1371/journal.pbio.3002192] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/06/2022] [Accepted: 06/12/2023] [Indexed: 07/23/2023]
Abstract
During exercise, skeletal muscle is exposed to low-oxygen conditions (hypoxia). Under hypoxia, the transcription factor hypoxia-inducible factor-1α (HIF-1α) is stabilized and induces expression of its target genes, which regulate glycolytic metabolism. Here, using a skeletal muscle-specific gene ablation mouse model, we show that Brg1/Brm-associated factor 155 (Baf155), a core subunit of the switch/sucrose non-fermentable (SWI/SNF) complex, is essential for HIF-1α signaling in skeletal muscle. Muscle-specific ablation of Baf155 increases oxidative metabolism by reducing HIF-1α function, which is accompanied by decreased lactate production during exercise. Furthermore, the augmented oxidation leads to a high intramuscular adenosine triphosphate (ATP) level and enhances endurance exercise capacity. Mechanistically, our chromatin immunoprecipitation (ChIP) analysis reveals that Baf155 modulates the DNA-binding activity of HIF-1α at the promoters of its target genes. In addition, for this regulatory function, Baf155 requires phosphorylated signal transducer and activator of transcription 3 (pSTAT3), which forms a coactivator complex with HIF-1α, to activate HIF-1α signaling. Our findings reveal the crucial role of Baf155 in the energy metabolism of skeletal muscle and the interaction between Baf155 and hypoxia signaling.
Affiliation(s)
- Jong-Seol Kang
- School of Biological Sciences, Seoul National University, Seoul, South Korea
- Dongha Kim
- Department of Anatomy, College of Medicine, The Catholic University of Korea, Seoul, South Korea
- Joonwoo Rhee
- School of Biological Sciences, Seoul National University, Seoul, South Korea
- Ji-Yun Seo
- School of Biological Sciences, Seoul National University, Seoul, South Korea
- Inkuk Park
- School of Biological Sciences, Seoul National University, Seoul, South Korea
- Ji-Hoon Kim
- School of Biological Sciences, Seoul National University, Seoul, South Korea
- Daewon Lee
- Institute of Molecular Biology and Genetics, Seoul National University, Seoul, South Korea
- WonUk Lee
- School of Biological Sciences, Seoul National University, Seoul, South Korea
- Ye Lynne Kim
- School of Biological Sciences, Seoul National University, Seoul, South Korea
- Kyusang Yoo
- School of Biological Sciences, Seoul National University, Seoul, South Korea
- Sunghwan Bae
- School of Biological Sciences, Seoul National University, Seoul, South Korea
- Jongkyeong Chung
- School of Biological Sciences, Seoul National University, Seoul, South Korea
- Institute of Molecular Biology and Genetics, Seoul National University, Seoul, South Korea
- Rho Hyun Seong
- School of Biological Sciences, Seoul National University, Seoul, South Korea
- Institute of Molecular Biology and Genetics, Seoul National University, Seoul, South Korea
- Young-Yun Kong
- School of Biological Sciences, Seoul National University, Seoul, South Korea
3
Cain B, Webb J, Yuan Z, Cheung D, Lim HW, Kovall R, Weirauch MT, Gebelein B. Prediction of cooperative homeodomain DNA binding sites from high-throughput-SELEX data. Nucleic Acids Res 2023; 51:6055-6072. [PMID: 37114997 PMCID: PMC10325903 DOI: 10.1093/nar/gkad318] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/18/2022] [Revised: 04/12/2023] [Accepted: 04/25/2023] [Indexed: 04/29/2023]
Abstract
Homeodomain proteins constitute one of the largest families of metazoan transcription factors. Genetic studies have demonstrated that homeodomain proteins regulate many developmental processes. Yet, biochemical data reveal that most bind highly similar DNA sequences. Defining how homeodomain proteins achieve DNA binding specificity has therefore been a long-standing goal. Here, we developed a novel computational approach to predict cooperative dimeric binding of homeodomain proteins using High-Throughput (HT) SELEX data. Importantly, we found that 15 of 88 homeodomain factors form cooperative homodimer complexes on DNA sites with precise spacing requirements. Approximately one third of the paired-like homeodomain proteins cooperatively bind palindromic sequences spaced 3 bp apart, whereas other homeodomain proteins cooperatively bind sites with distinct orientation and spacing requirements. Combining structural models of a paired-like factor with our cooperativity predictions identified key amino acid differences that help differentiate between cooperative and non-cooperative factors. Finally, we confirmed predicted cooperative dimer sites in vivo using available genomic data for a subset of factors. These findings demonstrate how HT-SELEX data can be computationally mined to predict cooperativity. In addition, the binding site spacing requirements of select homeodomain proteins provide a mechanism by which seemingly similar AT-rich DNA sequences can preferentially recruit specific homeodomain factors.
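The idea of spacing-dependent dimer sites in the abstract above can be illustrated with a minimal sketch that tallies how far apart a half-site and its reverse complement occur in a set of sequences. This is not the authors' method: the half-site `TAAT`, the function names, and the example reads are assumptions chosen only for illustration, and a real analysis would compare the observed spacer counts against a background expectation.

```python
from collections import Counter


def revcomp(seq):
    """Return the reverse complement of a DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[b] for b in reversed(seq))


def spacer_counts(sequences, half_site="TAAT", max_spacer=10):
    """Count spacer lengths between a half-site and a downstream
    reverse-complement half-site (a palindromic arrangement).

    half_site and max_spacer are illustrative assumptions.
    """
    partner = revcomp(half_site)
    k = len(half_site)
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            if seq[i:i + k] != half_site:
                continue
            # Try every allowed gap between the two half-sites.
            for spacer in range(max_spacer + 1):
                j = i + k + spacer
                if seq[j:j + k] == partner:
                    counts[spacer] += 1
    return counts


# Hypothetical reads; the first contains TAAT, a 3 bp spacer, then ATTA.
reads = ["GGTAATCGGATTAGG", "TAATATTA", "CCTAATGCATCG"]
print(spacer_counts(reads))
```

A spacer length that is strongly over-represented relative to the others would be a candidate for a cooperative dimer site, which is the kind of signal the abstract describes for 3 bp spacing.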
Affiliation(s)
- Brittany Cain
- Department of Biomedical Engineering, University of Cincinnati, Cincinnati, OH 45221, USA
- Division of Developmental Biology, Cincinnati Children's Hospital Medical Center, 3333 Burnet Ave, MLC 7007, Cincinnati, OH 45229, USA
- Jordan Webb
- Department of Molecular Genetics, Biochemistry and Microbiology, University of Cincinnati College of Medicine, Cincinnati, OH 45267, USA
- Zhenyu Yuan
- Department of Molecular Genetics, Biochemistry and Microbiology, University of Cincinnati College of Medicine, Cincinnati, OH 45267, USA
- David Cheung
- Graduate Program in Molecular and Developmental Biology, Cincinnati Children's Hospital Research Foundation, Cincinnati, OH 45229, USA
- Hee-Woong Lim
- Division of Biomedical Informatics, Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA
- Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH 45229, USA
- Rhett A Kovall
- Department of Molecular Genetics, Biochemistry and Microbiology, University of Cincinnati College of Medicine, Cincinnati, OH 45267, USA
- Matthew T Weirauch
- Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH 45229, USA
- Divisions of Human Genetics, Biomedical Informatics and Developmental Biology, Center for Autoimmune Genomics and Etiology (CAGE), Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA
- Brian Gebelein
- Division of Developmental Biology, Cincinnati Children's Hospital Medical Center, 3333 Burnet Ave, MLC 7007, Cincinnati, OH 45229, USA
- Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH 45229, USA
4
Vahed M, Vahed M, Garmire LX. BML: a versatile web server for bipartite motif discovery. Brief Bioinform 2021; 23:6490318. [PMID: 34974623 PMCID: PMC8769915 DOI: 10.1093/bib/bbab536] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2021] [Revised: 11/18/2021] [Accepted: 11/19/2021] [Indexed: 11/28/2022]
Abstract
Motif discovery and characterization are important for gene regulation analysis. The lack of intuitive and integrative web servers impedes the effective use of motifs. Most motif discovery web tools are either not designed for non-expert users or lack optimization steps when used with default settings. Here we describe bipartite motifs learning (BML), a parameter-free web server that provides a user-friendly portal for online discovery and analysis of sequence motifs, using high-throughput sequencing data as the input. BML utilizes both a position weight matrix and a dinucleotide weight matrix, the latter of which captures the interdependencies of neighboring bases. When input parameters concerning the motifs are given, BML achieves significantly higher accuracy than other available tools for motif finding. When no parameters are given by non-expert users, unlike other tools, BML employs a learning method to identify motifs automatically and achieves accuracy comparable to the scenario where the parameters are set. The BML web server is freely available at http://motif.t-ridership.com/ (https://github.com/Mohammad-Vahed/BML).
Affiliation(s)
- Mohammad Vahed
- Department of Pathology & Laboratory Medicine, David Geffen School of Medicine, University of California Los Angeles (UCLA), California, USA
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, 48105, USA
- Majid Vahed
- Pharmaceutical Sciences Research Center, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Lana X Garmire
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, 48105, USA
5
Novakovsky G, Saraswat M, Fornes O, Mostafavi S, Wasserman WW. Biologically relevant transfer learning improves transcription factor binding prediction. Genome Biol 2021; 22:280. [PMID: 34579793 PMCID: PMC8474956 DOI: 10.1186/s13059-021-02499-5] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 12/21/2020] [Accepted: 09/15/2021] [Indexed: 12/27/2022]
Abstract
BACKGROUND: Deep learning has proven to be a powerful technique for transcription factor (TF) binding prediction but requires large training datasets. Transfer learning can reduce the amount of data required for deep learning, while improving overall model performance, compared to training a separate model for each new task. RESULTS: We assess a transfer learning strategy for TF binding prediction consisting of a pre-training step, wherein we train a multi-task model with multiple TFs, and a fine-tuning step, wherein we initialize single-task models for individual TFs with the weights learned by the multi-task model, after which the single-task models are trained at a lower learning rate. We corroborate that transfer learning improves model performance, especially if in the pre-training step the multi-task model is trained with biologically relevant TFs. We show the effectiveness of transfer learning for TFs with ~500 ChIP-seq peak regions. Using model interpretation techniques, we demonstrate that the features learned in the pre-training step are refined in the fine-tuning step to resemble the binding motif of the target TF (i.e., the recipient of transfer learning in the fine-tuning step). Moreover, pre-training with biologically relevant TFs allows single-task models in the fine-tuning step to learn useful features other than the motif of the target TF. CONCLUSIONS: Our results confirm that transfer learning is a powerful technique for TF binding prediction.
Affiliation(s)
- Gherman Novakovsky
- Centre for Molecular Medicine and Therapeutics, BC Children's Hospital Research Institute, Vancouver, BC, V5Z 4H4, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, BC, V6H 3N1, Canada
- Manu Saraswat
- Centre for Molecular Medicine and Therapeutics, BC Children's Hospital Research Institute, Vancouver, BC, V5Z 4H4, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, BC, V6H 3N1, Canada
- Oriol Fornes
- Centre for Molecular Medicine and Therapeutics, BC Children's Hospital Research Institute, Vancouver, BC, V5Z 4H4, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, BC, V6H 3N1, Canada
- Sara Mostafavi
- Centre for Molecular Medicine and Therapeutics, BC Children's Hospital Research Institute, Vancouver, BC, V5Z 4H4, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, BC, V6H 3N1, Canada
- Department of Statistics, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
- Canadian Institute for Advanced Research, CIFAR AI Chair, and Child and Brain Development, Toronto, ON, M5G 1M1, Canada
- Wyeth W Wasserman
- Centre for Molecular Medicine and Therapeutics, BC Children's Hospital Research Institute, Vancouver, BC, V5Z 4H4, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, BC, V6H 3N1, Canada
6
Rogan PK, Mucaki EJ, Shirley BC. A proposed molecular mechanism for pathogenesis of severe RNA-viral pulmonary infections. F1000Res 2020; 9:943. [PMID: 33299552 PMCID: PMC7676395 DOI: 10.12688/f1000research.25390.1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Accepted: 07/23/2020] [Indexed: 12/19/2022]
Abstract
Background: Certain riboviruses can cause severe pulmonary complications leading to death in some infected patients. We propose that DNA damage-induced apoptosis accelerates viral release, triggered by depletion of host RNA binding proteins (RBPs) from nuclear RNA bound to replicating viral sequences. Methods: Information theory-based analysis of interactions between RBPs and individual sequences in the Severe Acute Respiratory Syndrome CoronaVirus 2 (SARS-CoV-2), Influenza A (H3N1), HIV-1, and Dengue genomes identifies strong RBP binding sites in these viral genomes. Replication and expression of viral sequences are expected to increasingly sequester the RBPs SRSF1 and RNPS1. Ordinarily, RBPs bound to nascent host transcripts prevent their annealing to complementary DNA. Their depletion induces destabilizing R-loops. Chromosomal breakage occurs when an excess of unresolved R-loops collides with incoming replication forks, overwhelming the DNA repair machinery. We estimated the stoichiometry of inhibition of RBPs in host nuclear RNA by counting competing binding sites in replicating viral genomes and host RNA. Results: Host RBP binding sites are frequent and conserved among different strains of RNA viral genomes. Similar binding motifs of SRSF1 and RNPS1 explain why DNA damage resulting from SRSF1 depletion is complemented by expression of RNPS1. Clustering of strong RBP binding sites coincides with the distribution of RNA-DNA hybridization sites across the genome. SARS-CoV-2 replication is estimated to require 32.5-41.8 hours to effectively compete for binding of an equal proportion of SRSF1 binding sites in host-encoded nuclear RNAs. Significant changes in expression of transcripts encoding DNA repair and apoptotic proteins were found in an analysis of influenza A and Dengue-infected cells in some individuals.
Conclusions: R-loop-induced apoptosis indirectly resulting from viral replication could release significant quantities of membrane-associated virions into neighboring alveoli. These could infect adjacent pneumocytes and other tissues, rapidly compromising lung function, causing multiorgan system failure and other described symptoms.
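For readers unfamiliar with information theory-based scoring of binding sites, the sketch below shows one common way to compute a per-site information score in bits, using a weight of 2 + log2 f(b, l) for base b at position l. It is a simplified illustration, not the authors' pipeline: the example sites are invented, the pseudocount is an assumption, and the small-sample correction usually applied in such analyses is omitted.

```python
import math

BASES = "ACGU"


def information_weight_matrix(sites, pseudo=0.25):
    """Build per-position weights 2 + log2 f(b, l) from aligned sites.

    pseudo is an assumed pseudocount per base that avoids log2(0);
    no small-sample correction is applied in this sketch.
    """
    n = len(sites)
    length = len(sites[0])
    matrix = []
    for pos in range(length):
        column = [s[pos] for s in sites]
        weights = {}
        for b in BASES:
            freq = (column.count(b) + pseudo) / (n + 4 * pseudo)
            weights[b] = 2.0 + math.log2(freq)
        matrix.append(weights)
    return matrix


def individual_information(matrix, seq):
    """Sum the per-position weights for one candidate site (in bits)."""
    return sum(col[b] for col, b in zip(matrix, seq))


# Hypothetical aligned RNA sites, invented for illustration only.
example_sites = ["GAAGAA", "GAAGAA", "GAAGAC", "GGAGAA"]
weights = information_weight_matrix(example_sites)
print(individual_information(weights, "GAAGAA"))
print(individual_information(weights, "CUCUCU"))
```

Under this scoring, sequences resembling the training sites receive positive scores, while poorly matching sequences score near or below zero; scanning a genome with such a matrix is one way to locate and rank candidate strong binding sites.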
Affiliation(s)
- Peter K. Rogan
- Biochemistry, University of Western Ontario, London, Ontario, N6A 2C8, Canada
- CytoGnomix Inc, London, Ontario, N5X 3X5, Canada
- Eliseos J. Mucaki
- Biochemistry, University of Western Ontario, London, Ontario, N6A 2C8, Canada
7
Rogan PK, Mucaki EJ, Shirley BC. A proposed molecular mechanism for pathogenesis of severe RNA-viral pulmonary infections. F1000Res 2020; 9:943. [PMID: 33299552 PMCID: PMC7676395 DOI: 10.12688/f1000research.25390.2] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Accepted: 12/16/2020] [Indexed: 12/19/2022]
Abstract
Background: Certain riboviruses can cause severe pulmonary complications leading to death in some infected patients. We propose that DNA damage-induced apoptosis accelerates viral release, triggered by depletion of host RNA binding proteins (RBPs) from nuclear RNA bound to replicating viral sequences. Methods: Information theory-based analysis of interactions between RBPs and individual sequences in the Severe Acute Respiratory Syndrome CoronaVirus 2 (SARS-CoV-2), Influenza A (H3N2), HIV-1, and Dengue genomes identifies strong RBP binding sites in these viral genomes. Replication and expression of viral sequences are expected to increasingly sequester the RBPs SRSF1 and RNPS1. Ordinarily, RBPs bound to nascent host transcripts prevent their annealing to complementary DNA. Their depletion induces destabilizing R-loops. Chromosomal breakage occurs when an excess of unresolved R-loops collides with incoming replication forks, overwhelming the DNA repair machinery. We estimated the stoichiometry of inhibition of RBPs in host nuclear RNA by counting competing binding sites in replicating viral genomes and host RNA. Results: Host RBP binding sites are frequent and conserved among different strains of RNA viral genomes. Similar binding motifs of SRSF1 and RNPS1 explain why DNA damage resulting from SRSF1 depletion is complemented by expression of RNPS1. Clustering of strong RBP binding sites coincides with the distribution of RNA-DNA hybridization sites across the genome. SARS-CoV-2 replication is estimated to require 32.5-41.8 hours to effectively compete for binding of an equal proportion of SRSF1 binding sites in host-encoded nuclear RNAs. Significant changes in expression of transcripts encoding DNA repair and apoptotic proteins were found in an analysis of influenza A and Dengue-infected cells in some individuals.
Conclusions: R-loop-induced apoptosis indirectly resulting from viral replication could release significant quantities of membrane-associated virions into neighboring alveoli. These could infect adjacent pneumocytes and other tissues, rapidly compromising lung function, causing multiorgan system failure and other described symptoms.
Affiliation(s)
- Peter K. Rogan
- Biochemistry, University of Western Ontario, London, Ontario, N6A 2C8, Canada
- CytoGnomix Inc, London, Ontario, N5X 3X5, Canada
- Eliseos J. Mucaki
- Biochemistry, University of Western Ontario, London, Ontario, N6A 2C8, Canada
8
Toivonen J, Das PK, Taipale J, Ukkonen E. MODER2: first-order Markov modeling and discovery of monomeric and dimeric binding motifs. Bioinformatics 2020; 36:2690-2696. [PMID: 31999322 PMCID: PMC7203737 DOI: 10.1093/bioinformatics/btaa045] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 05/21/2019] [Revised: 12/23/2019] [Accepted: 01/23/2020] [Indexed: 12/21/2022]
Abstract
MOTIVATION: Position-specific probability matrices (PPMs, also called position-specific weight matrices) have been the dominating model for transcription factor (TF)-binding motifs in DNA. There is, however, increasing recent evidence of better performance of higher order models such as Markov models of order one, also called adjacent dinucleotide matrices (ADMs). ADMs can model dependencies between adjacent nucleotides, unlike PPMs. A modeling technique and software tool that would estimate such models simultaneously both for monomers and their dimers have been missing. RESULTS: We present an ADM-based mixture model for monomeric and dimeric TF-binding motifs and an expectation maximization algorithm MODER2 for learning such models from training data and seeds. The model is a mixture that includes monomers and dimers, built from the monomers, with a description of the dimeric structure (spacing, orientation). The technique is modular, meaning that the co-operative effect of dimerization is made explicit by evaluating the difference between expected and observed models. The model is validated using HT-SELEX and generated datasets, and by comparing to some earlier PPM and ADM techniques. The ADM models explain data slightly better than PPM models for 314 tested TFs (or their DNA-binding domains) from four families (bHLH, bZIP, ETS and Homeodomain), the ADM mixture models by MODER2 being the best on average. AVAILABILITY AND IMPLEMENTATION: Software implementation is available from https://github.com/jttoivon/moder2. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
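The difference between a PPM and a first-order (adjacent dinucleotide) model described above can be made concrete with a small sketch. This is not the MODER2 algorithm, which is an expectation maximization procedure over a mixture of monomers and dimers; it only fits both model types to a set of already aligned sites by counting. The function names, the pseudocount, and the toy sites are assumptions for illustration.

```python
import math
from collections import Counter

BASES = "ACGT"


def fit_ppm(sites, pseudo=1.0):
    """Per-position base probabilities (positions treated as independent)."""
    total = len(sites) + 4 * pseudo
    ppm = []
    for pos in range(len(sites[0])):
        counts = Counter(s[pos] for s in sites)
        ppm.append({b: (counts[b] + pseudo) / total for b in BASES})
    return ppm


def fit_adm(sites, pseudo=1.0):
    """First-order model: P(x_0) and P(x_l | x_{l-1}) for each position l."""
    first = fit_ppm([s[0] for s in sites], pseudo)[0]
    transitions = []
    for pos in range(1, len(sites[0])):
        pairs = Counter((s[pos - 1], s[pos]) for s in sites)
        table = {}
        for a in BASES:
            row_total = sum(pairs[(a, b)] for b in BASES) + 4 * pseudo
            table[a] = {b: (pairs[(a, b)] + pseudo) / row_total for b in BASES}
        transitions.append(table)
    return first, transitions


def loglik_ppm(ppm, seq):
    return sum(math.log(col[b]) for col, b in zip(ppm, seq))


def loglik_adm(adm, seq):
    first, transitions = adm
    ll = math.log(first[seq[0]])
    for pos in range(1, len(seq)):
        ll += math.log(transitions[pos - 1][seq[pos - 1]][seq[pos]])
    return ll


# Toy sites with a strict dependency: A is always followed by C, G by T.
toy_sites = ["AC"] * 10 + ["GT"] * 10
ppm, adm = fit_ppm(toy_sites), fit_adm(toy_sites)
print(loglik_ppm(ppm, "AC"), loglik_ppm(ppm, "AT"))
print(loglik_adm(adm, "AC"), loglik_adm(adm, "AT"))
```

In this toy case the PPM scores "AC" and "AT" identically because it only sees marginal base frequencies, whereas the first-order model penalizes "AT", a dinucleotide never observed in the training sites. That is the kind of adjacent-nucleotide dependency the abstract says ADMs can capture.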
Affiliation(s)
- Jarkko Toivonen
  - Department of Computer Science, University of Helsinki, Helsinki FI-00014, Finland
- Pratyush K Das
  - Applied Tumor Genomics, Research Programs Unit, University of Helsinki, Helsinki FI-00014, Finland
- Jussi Taipale
  - Department of Biochemistry, University of Cambridge, CB2 1GA Cambridge, UK
  - Division of Functional Genomics and Systems Biology, Department of Medical Biochemistry and Biophysics, SE 141 83 Stockholm, Sweden
  - Department of Biosciences and Nutrition, Karolinska Institutet, SE 141 83 Stockholm, Sweden
  - Genome-Scale Biology Program, University of Helsinki, Helsinki FI-00014, Finland
- Esko Ukkonen
  - Department of Computer Science, University of Helsinki, Helsinki FI-00014, Finland
|
9
|
Mucaki EJ, Shirley BC, Rogan PK. Expression Changes Confirm Genomic Variants Predicted to Result in Allele-Specific, Alternative mRNA Splicing. Front Genet 2020; 11:109. [PMID: 32211018 PMCID: PMC7066660 DOI: 10.3389/fgene.2020.00109] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 09/13/2019] [Accepted: 01/30/2020] [Indexed: 12/11/2022] Open
Abstract
Splice isoform structure and abundance can be affected by either noncoding or masquerading coding variants that alter the structure or abundance of transcripts. When these variants are common in the population, these nonconstitutive transcripts are sufficiently frequent so as to resemble naturally occurring, alternative mRNA splicing. Prediction of the effects of such variants has been shown to be accurate using information theory-based methods. Single nucleotide polymorphisms (SNPs) predicted to significantly alter natural and/or cryptic splice site strength were shown to affect gene expression. Splicing changes for known SNP genotypes were confirmed in HapMap lymphoblastoid cell lines with gene expression microarrays and custom designed q-RT-PCR or TaqMan assays. The majority of these SNPs (15 of 22) as well as an independent set of 24 variants were then subjected to RNAseq analysis using the ValidSpliceMut web beacon (http://validsplicemut.cytognomix.com), which is based on data from the Cancer Genome Atlas and International Cancer Genome Consortium. SNPs from different genes analyzed with gene expression microarray and q-RT-PCR exhibited significant changes in affected splice site use. Thirteen SNPs directly affected exon inclusion and 10 altered cryptic site use. Homozygous SNP genotypes resulting in stronger splice sites exhibited higher levels of processed mRNA than alleles associated with weaker sites. Four SNPs exhibited variable expression among individuals with the same genotypes, masking statistically significant expression differences between alleles. Genome-wide information theory and expression analyses (RNAseq) in tumor exomes and genomes confirmed splicing effects for 7 of the HapMap SNPs and 14 SNPs identified from tumor genomes. q-RT-PCR resolved rare splice isoforms with read abundance too low for statistical significance in ValidSpliceMut. Nevertheless, the web-beacon provides evidence of unanticipated splicing outcomes, for example, intron retention due to compromised recognition of constitutive splice sites. Thus, ValidSpliceMut and q-RT-PCR represent complementary resources for identification of allele-specific, alternative splicing.
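As a rough illustration of how an information theory-based method can quantify a change in splice site strength, the sketch below computes the individual information of a site (in bits) as the sum over positions of 2 + log2 f(base, position), and compares a reference and a variant sequence. This is an assumption about the general form of such measures, not the authors' pipeline: the small-sample correction used in practice is omitted, and the three-position frequency table is invented for the example.

```python
import math

def individual_information(site, freqs):
    """Information content of one site in bits.

    Sums 2 + log2 f(base, position) over positions; the small-sample
    correction term applied in practice is left out of this sketch.
    """
    return sum(2.0 + math.log2(freqs[pos][base]) for pos, base in enumerate(site))

# Hypothetical position-specific base frequencies (not real splice-site data).
freqs = [
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},          # uninformative position
    {"A": 0.0625, "C": 0.0625, "G": 0.8125, "T": 0.0625},  # strongly prefers G
    {"A": 0.0625, "C": 0.0625, "G": 0.0625, "T": 0.8125},  # strongly prefers T
]

reference = "AGT"
variant = "AGC"  # single-base change at the last position
delta = individual_information(variant, freqs) - individual_information(reference, freqs)
print(f"change in information: {delta:.2f} bits")
```

A negative change indicates that the variant is predicted to weaken the site; a positive change at a nearby cryptic site would indicate strengthening.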
Affiliation(s)
- Eliseos J Mucaki
  - Department of Biochemistry, University of Western Ontario, London, ON, Canada
- Peter K Rogan
  - Department of Biochemistry, University of Western Ontario, London, ON, Canada
  - CytoGnomix, London, ON, Canada
  - Department of Oncology, University of Western Ontario, London, ON, Canada
  - Department of Computer Science, University of Western Ontario, London, ON, Canada
|
10
|
Lin QXX, Thieffry D, Jha S, Benoukraf T. TFregulomeR reveals transcription factors' context-specific features and functions. Nucleic Acids Res 2020; 48:e10. [PMID: 31754708 PMCID: PMC6954419 DOI: 10.1093/nar/gkz1088] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 08/01/2019] [Revised: 10/25/2019] [Accepted: 11/01/2019] [Indexed: 12/25/2022] Open
Abstract
Transcription factors (TFs) are sequence-specific DNA binding proteins, fine-tuning spatiotemporal gene expression. Since genomic occupancy of a TF is highly dynamic, it is crucial to study TF binding sites (TFBSs) in a cell-specific context. To date, thousands of ChIP-seq datasets have portrayed the genomic binding landscapes of numerous TFs in different cell types. Although these datasets can be browsed via several platforms, tools that can operate on that data flow are still lacking. Here, we introduce TFregulomeR (https://github.com/benoukraflab/TFregulomeR), an R-library linked to an up-to-date compendium of cistrome and methylome datasets, implemented with functionalities that facilitate integrative analyses. In particular, TFregulomeR enables the characterization of TF binding partners and cell-specific TFBSs, along with the study of TF’s functions in the context of different partnerships and DNA methylation levels. We demonstrated that TFs’ target gene ontologies can differ notably depending on their partners and, by re-analyzing well characterized TFs, we brought to light that numerous leucine zipper TFBSs derived from ChIP-seq experiments documented in current databases were inadequately characterized, due to the fact that their position weight matrices were assembled using a mixture of homodimer and heterodimer binding sites. Altogether, analyses of context-specific transcription regulation with TFregulomeR foster our understanding of regulatory network-dependent TF functions.
Affiliation(s)
- Quy Xiao Xuan Lin
  - Cancer Science Institute of Singapore, National University of Singapore, Singapore 117599, Singapore
- Denis Thieffry
  - Computational Systems Biology Team, Institut de Biologie de l'École Normale Supérieure (IBENS), CNRS, INSERM, École Normale Supérieure, PSL Research University, Paris 75005, France
- Sudhakar Jha
  - Cancer Science Institute of Singapore, National University of Singapore, Singapore 117599, Singapore
  - Department of Biochemistry, National University of Singapore, Singapore 117596, Singapore
- Touati Benoukraf
  - Cancer Science Institute of Singapore, National University of Singapore, Singapore 117599, Singapore
  - Discipline of Genetics, Faculty of Medicine, Memorial University of Newfoundland, St. John's, NL A1B 3V6, Canada
|
11
|
Vahed M, Ishihara JI, Takahashi H. DIpartite: A tool for detecting bipartite motifs by considering base interdependencies. PLoS One 2019; 14:e0220207. [PMID: 31469855 PMCID: PMC6716629 DOI: 10.1371/journal.pone.0220207] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 04/01/2019] [Accepted: 07/10/2019] [Indexed: 12/22/2022] Open
Abstract
It is extremely important to identify transcription factor binding sites (TFBSs). Some TFBSs are proposed to be bipartite motifs known as two-block motifs separated by gap sequences with variable lengths. While position weight matrix (PWM) is commonly used for the representation and prediction of TFBSs, dinucleotide weight matrix (DWM) enables expression of the interdependencies of neighboring bases. By incorporating DWM into the detection of bipartite motifs, we have developed a novel tool for ab initio motif detection, DIpartite (bipartite motif detection tool based on dinucleotide weight matrix) using a Gibbs sampling strategy and the minimization of Shannon’s entropy. DIpartite predicts the bipartite motifs by considering the interdependencies of neighboring positions, that is, DWM. We compared DIpartite with other available alternatives by using test datasets, namely, of CRP in E. coli, sigma factors in B. subtilis, and promoter sequences in humans. We have developed DIpartite for the detection of TFBSs, particularly bipartite motifs. DIpartite enables ab initio prediction of conserved motifs based on not only PWM, but also DWM. We evaluated the performance of DIpartite by comparing it with freely available tools, such as MEME, BioProspector, BiPad, and AMD. Taken together, the findings indicate that DIpartite performs equivalently to or better than these other tools, especially for detecting bipartite motifs with variable gaps. DIpartite requires users to specify the motif lengths, gap length, and PWM or DWM. DIpartite is available for use at https://github.com/Mohammad-Vahed/DIpartite.
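To see why a dinucleotide view can separate alignments that look identical to a mononucleotide model, the following sketch compares Shannon entropy computed over single columns with entropy computed over adjacent base pairs. It is only an illustration of the entropy criterion mentioned above, not DIpartite's actual objective or its Gibbs sampler; the function names and the two toy alignments are hypothetical.

```python
import math
from collections import Counter

def entropy(symbols):
    """Shannon entropy (bits) of the empirical distribution of symbols."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def mono_entropy(sites):
    """Sum of per-column entropies over single bases (PWM-style view)."""
    return sum(entropy([s[i] for s in sites]) for i in range(len(sites[0])))

def dinuc_entropy(sites):
    """Sum of entropies over adjacent base pairs (DWM-style view)."""
    return sum(entropy([s[i:i + 2] for s in sites]) for i in range(len(sites[0]) - 1))

# Two toy alignments with identical single-base column composition.
coupled = ["AT", "AT", "CG", "CG"]    # neighbouring bases are dependent
uncoupled = ["AT", "AG", "CG", "CT"]  # neighbouring bases are independent

print(mono_entropy(coupled), mono_entropy(uncoupled))
print(dinuc_entropy(coupled), dinuc_entropy(uncoupled))
```

Both alignments have the same mononucleotide entropy, but the coupled alignment has lower dinucleotide entropy, so an entropy-minimizing search over dinucleotides would prefer it.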
Affiliation(s)
- Mohammad Vahed
  - Medical Mycology Research Center, Chiba University, Chiba, Japan
- Hiroki Takahashi
  - Medical Mycology Research Center, Chiba University, Chiba, Japan
  - Molecular Chirality Research Center, Chiba University, Chiba, Japan
|
12
|
Toivonen J, Kivioja T, Jolma A, Yin Y, Taipale J, Ukkonen E. Modular discovery of monomeric and dimeric transcription factor binding motifs for large data sets. Nucleic Acids Res 2019; 46:e44. [PMID: 29385521 PMCID: PMC5934673 DOI: 10.1093/nar/gky027] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 12/01/2017] [Accepted: 01/12/2018] [Indexed: 01/06/2023] Open
Abstract
In some dimeric cases of transcription factor (TF) binding, the specificity of dimeric motifs has been observed to differ notably from what would be expected were the two factors to bind to DNA independently of each other. Current motif discovery methods are unable to learn monomeric and dimeric motifs in modular fashion such that deviations from the expected motif would become explicit and the noise from dimeric occurrences would not corrupt monomeric models. We propose a novel modeling technique and an expectation maximization algorithm, implemented as software tool MODER, for discovering monomeric TF binding motifs and their dimeric combinations. Given training data and seeds for monomeric motifs, the algorithm learns in the same probabilistic framework a mixture model which represents monomeric motifs as standard position-specific probability matrices (PPMs), and dimeric motifs as pairs of monomeric PPMs, with associated orientation and spacing preferences. For dimers the model represents deviations from pure modular model of two independent monomers, thus making co-operative binding effects explicit. MODER can analyze in reasonable time tens of Mbps of training data. We validated the tool on HT-SELEX and ChIP-seq data. Our findings include some TFs whose expected model has palindromic symmetry but the observed model is directional.
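The idea of an "expected" dimer built from two independent monomers can be sketched as follows. This is a simplified illustration, not MODER's implementation: it assumes a non-negative gap filled with uniform columns (overlapping half-sites are not handled), and the function names, the `flip_second` parameter, and the two-column half-site are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
UNIFORM = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

def reverse_complement(ppm):
    """Reverse the column order and swap each base for its complement."""
    return [{COMPLEMENT[base]: p for base, p in col.items()} for col in reversed(ppm)]

def expected_dimer(ppm1, ppm2, gap, flip_second=False):
    """Expected dimer PPM if the two monomers bound independently.

    gap is the number of unconstrained columns between the half-sites;
    flip_second places the second monomer on the opposite strand.
    """
    second = reverse_complement(ppm2) if flip_second else ppm2
    return ppm1 + [dict(UNIFORM) for _ in range(gap)] + second

# Hypothetical two-column half-site preferring "AG".
half_site = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
]
dimer = expected_dimer(half_site, half_site, gap=1, flip_second=True)
for column in dimer:
    print(max(column, key=column.get), column)
```

Comparing such an expected matrix column by column with a dimer matrix learned from data is one way to make deviations from independent binding explicit, which is the kind of comparison the abstract describes.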
Affiliation(s)
- Jarkko Toivonen
  - Department of Computer Science, P.O. Box 68, FI-00014 University of Helsinki, Helsinki, Finland
- Teemu Kivioja
  - Genome-Scale Biology Program, P.O. Box 63, FI-00014 University of Helsinki, Helsinki, Finland
- Arttu Jolma
  - Division of Functional Genomics and Systems Biology, Department of Medical Biochemistry and Biophysics, and Department of Biosciences and Nutrition, Karolinska Institutet, SE 141 83 Stockholm, Sweden
- Yimeng Yin
  - Division of Functional Genomics and Systems Biology, Department of Medical Biochemistry and Biophysics, and Department of Biosciences and Nutrition, Karolinska Institutet, SE 141 83 Stockholm, Sweden
- Jussi Taipale
  - Genome-Scale Biology Program, P.O. Box 63, FI-00014 University of Helsinki, Helsinki, Finland
  - Division of Functional Genomics and Systems Biology, Department of Medical Biochemistry and Biophysics, and Department of Biosciences and Nutrition, Karolinska Institutet, SE 141 83 Stockholm, Sweden
  - Department of Biochemistry, University of Cambridge, CB2 1GA Cambridge, UK
- Esko Ukkonen
  - Department of Computer Science, P.O. Box 68, FI-00014 University of Helsinki, Helsinki, Finland
  - Helsinki Institute for Information Technology HIIT, University of Helsinki & Aalto University, Helsinki, Finland
|
13
|
Lu R, Rogan PK. Transcription factor binding site clusters identify target genes with similar tissue-wide expression and buffer against mutations. F1000Res 2018; 7:1933. [PMID: 31001412 PMCID: PMC6464064 DOI: 10.12688/f1000research.17363.1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Accepted: 12/05/2018] [Indexed: 10/12/2023] Open
Abstract
Background: The distribution and composition of cis-regulatory modules composed of transcription factor (TF) binding site (TFBS) clusters in promoters substantially determine gene expression patterns and TF targets. TF knockdown experiments have revealed that TF binding profiles and gene expression levels are correlated. We use TFBS features within accessible promoter intervals to predict genes with similar tissue-wide expression patterns and TF targets. Methods: Genes with correlated expression patterns across 53 tissues and TF targets were respectively identified from Bray-Curtis Similarity and TF knockdown experiments. Corresponding promoter sequences were reduced to DNase I-accessible intervals; TFBSs were then identified within these intervals using information theory-based position weight matrices for each TF (iPWMs) and clustered. Features from information-dense TFBS clusters predicted these genes with machine learning classifiers, which were evaluated for accuracy, specificity and sensitivity. Mutations in TFBSs were analyzed in silico to examine their impact on cluster densities and the regulatory states of target genes. Results: We initially chose the glucocorticoid receptor gene (NR3C1), whose regulation has been extensively studied, to test this approach. SLC25A32 and TANK were found to exhibit the most similar expression patterns to NR3C1. A Decision Tree classifier exhibited the largest area under the Receiver Operating Characteristic (ROC) curve in detecting such genes. Target gene prediction was confirmed using siRNA knockdown of TFs, which was found to be more accurate than those predicted after CRISPR/CAS9 inactivation. In-silico mutation analyses of TFBSs also revealed that one or more information-dense TFBS clusters in promoters are required for accurate target gene prediction. Conclusions: Machine learning based on TFBS information density, organization, and chromatin accessibility accurately identifies gene targets with comparable tissue-wide expression patterns. Multiple information-dense TFBS clusters in promoters appear to protect promoters from effects of deleterious binding site mutations in a single TFBS that would otherwise alter regulation of these genes.
Affiliation(s)
- Ruipeng Lu
  - Computer Science, University of Western Ontario, London, Ontario, N6A 5B7, Canada
- Peter K. Rogan
  - Computer Science, University of Western Ontario, London, Ontario, N6A 5B7, Canada
  - Biochemistry, University of Western Ontario, London, Ontario, N6A 5C1, Canada
  - Cytognomix, London, Ontario, N5X 3X5, Canada
|
14
|
Lu R, Rogan PK. Transcription factor binding site clusters identify target genes with similar tissue-wide expression and buffer against mutations. F1000Res 2018; 7:1933. [PMID: 31001412 PMCID: PMC6464064 DOI: 10.12688/f1000research.17363.2] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Accepted: 03/28/2019] [Indexed: 12/20/2022] Open
Abstract
Background: The distribution and composition of cis-regulatory modules composed of transcription factor (TF) binding site (TFBS) clusters in promoters substantially determine gene expression patterns and TF targets. TF knockdown experiments have revealed that TF binding profiles and gene expression levels are correlated. We use TFBS features within accessible promoter intervals to predict genes with similar tissue-wide expression patterns and TF targets using Machine Learning (ML). Methods: Bray-Curtis Similarity was used to identify genes with correlated expression patterns across 53 tissues. TF targets from knockdown experiments were also analyzed by this approach to set up the ML framework. TFBSs were selected within DNase I-accessible intervals of corresponding promoter sequences using information theory-based position weight matrices (iPWMs) for each TF. Features from information-dense clusters of TFBSs were input to ML classifiers which predict these gene targets along with their accuracy, specificity and sensitivity. Mutations in TFBSs were analyzed in silico to examine their impact on TFBS clustering and predict changes in gene regulation. Results: The glucocorticoid receptor gene (NR3C1), whose regulation has been extensively studied, was selected to test this approach. SLC25A32 and TANK exhibited the most similar expression patterns to NR3C1. A Decision Tree classifier exhibited the best performance in detecting such genes, based on Area Under the Receiver Operating Characteristic curve (ROC). TF target gene prediction was confirmed using siRNA knockdown, which was more accurate than CRISPR/CAS9 inactivation. TFBS mutation analyses revealed that accurate target gene prediction required at least 1 information-dense TFBS cluster. Conclusions: ML based on TFBS information density, organization, and chromatin accessibility accurately identifies gene targets with comparable tissue-wide expression patterns. Multiple information-dense TFBS clusters in promoters appear to protect promoters from effects of deleterious binding site mutations in a single TFBS that would otherwise alter regulation of these genes.
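The Bray-Curtis similarity used to compare tissue-wide expression profiles can be sketched in a few lines. This is a minimal illustration of the standard definition (one minus the sum of absolute differences divided by the sum of all values), assuming non-negative expression values; the gene names and the four-tissue profiles are hypothetical stand-ins for the 53-tissue vectors described above.

```python
def bray_curtis_similarity(x, y):
    """1 minus the Bray-Curtis dissimilarity of two non-negative profiles."""
    numerator = sum(abs(a - b) for a, b in zip(x, y))
    denominator = sum(a + b for a, b in zip(x, y))
    return 1.0 - numerator / denominator

# Hypothetical expression values across four tissues (made-up numbers).
gene_a = [10.0, 20.0, 5.0, 0.0]
gene_b = [12.0, 18.0, 5.0, 1.0]
gene_c = [0.0, 1.0, 30.0, 25.0]

print(round(bray_curtis_similarity(gene_a, gene_b), 3))
print(round(bray_curtis_similarity(gene_a, gene_c), 3))
```

Profiles with similar expression across tissues score close to 1, while profiles expressed in different tissues score close to 0.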
Affiliation(s)
- Ruipeng Lu
  - Computer Science, University of Western Ontario, London, Ontario, N6A 5B7, Canada
- Peter K. Rogan
  - Computer Science, University of Western Ontario, London, Ontario, N6A 5B7, Canada
  - Biochemistry, University of Western Ontario, London, Ontario, N6A 5C1, Canada
  - Cytognomix, London, Ontario, N5X 3X5, Canada
|
15
|
Burke LJ, Sevcik J, Gambino G, Tudini E, Mucaki EJ, Shirley BC, Whiley P, Parsons MT, De Leeneer K, Gutiérrez‐Enríquez S, Santamariña M, Caputo SM, Santana dos Santos E, Soukupova J, Janatova M, Zemankova P, Lhotova K, Stolarova L, Borecka M, Moles‐Fernández A, Manoukian S, Bonanni B, Edwards SL, Blok MJ, van Overeem Hansen T, Rossing M, Diez O, Vega A, Claes KB, Goldgar DE, Rouleau E, Radice P, Peterlongo P, Rogan PK, Caligo M, Spurdle AB, Brown MA. BRCA1 and BRCA2 5' noncoding region variants identified in breast cancer patients alter promoter activity and protein binding. Hum Mutat 2018; 39:2025-2039. [PMID: 30204945 PMCID: PMC6282814 DOI: 10.1002/humu.23652] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 04/16/2018] [Revised: 09/01/2018] [Accepted: 09/07/2018] [Indexed: 12/13/2022]
Abstract
The widespread use of next generation sequencing for clinical testing is detecting an escalating number of variants in noncoding regions of the genome. The clinical significance of the majority of these variants is currently unknown, which presents a significant clinical challenge. We have screened over 6,000 early-onset and/or familial breast cancer (BC) cases collected by the ENIGMA consortium for sequence variants in the 5' noncoding regions of BC susceptibility genes BRCA1 and BRCA2, and identified 141 rare variants with global minor allele frequency < 0.01, 76 of which have not been reported previously. Bioinformatic analysis identified a set of 21 variants most likely to impact transcriptional regulation, and luciferase reporter assays detected altered promoter activity for four of these variants. Electrophoretic mobility shift assays demonstrated that three of these altered the binding of proteins to the respective BRCA1 or BRCA2 promoter regions, including NFYA binding to BRCA1:c.-287C>T and PAX5 binding to BRCA2:c.-296C>T. Clinical classification of variants affecting promoter activity, using existing prediction models, found no evidence to suggest that these variants confer a high risk of disease. Further studies are required to determine if such variation may be associated with a moderate or low risk of BC.
Affiliation(s)
- Leslie J. Burke
  - School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, Australia
- Jan Sevcik
  - School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, Australia
  - Institute of Biochemistry and Experimental Oncology, First Faculty of Medicine, Charles University, Prague, Czech Republic
- Gaetana Gambino
  - School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, Australia
  - Section of Molecular Genetics, Department of Laboratory Medicine, University Hospital of Pisa, Pisa, Italy
- Emma Tudini
  - School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, Australia
  - Department of Genetics and Computational Biology, QIMR Berghofer Medical Research Institute, Brisbane, Australia
- Eliseos J. Mucaki
  - University of Western Ontario, Department of Biochemistry, Schulich School of Medicine and Dentistry, London, Ontario, Canada
- Phillip Whiley
  - School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, Australia
  - Department of Genetics and Computational Biology, QIMR Berghofer Medical Research Institute, Brisbane, Australia
- Michael T. Parsons
  - Department of Genetics and Computational Biology, QIMR Berghofer Medical Research Institute, Brisbane, Australia
- Kim De Leeneer
  - Center for Medical Genetics, Ghent University Hospital, and Cancer Research Institute Ghent (CRIG), Ghent University, Ghent, Belgium
- Marta Santamariña
  - Fundación Pública Galega de Medicina Xenómica‐SERGAS, Grupo de Medicina Xenómica‐USC, CIBERER, IDIS, Santiago de Compostela, Spain
- Sandrine M. Caputo
  - Service de Génétique, Department de Biologie des Tumeurs, Institut Curie, Paris, France
- Elizabeth Santana dos Santos
  - Service de Génétique, Department de Biologie des Tumeurs, Institut Curie, Paris, France
  - Department of Oncology, Center for Translational Oncology, Cancer Institute of the State of São Paulo ‐ ICESP, São Paulo, Brazil
  - A.C.Camargo Cancer Center, São Paulo, Brazil
- Jana Soukupova
  - Institute of Biochemistry and Experimental Oncology, First Faculty of Medicine, Charles University, Prague, Czech Republic
- Marketa Janatova
  - Institute of Biochemistry and Experimental Oncology, First Faculty of Medicine, Charles University, Prague, Czech Republic
- Petra Zemankova
  - Institute of Biochemistry and Experimental Oncology, First Faculty of Medicine, Charles University, Prague, Czech Republic
- Klara Lhotova
  - Institute of Biochemistry and Experimental Oncology, First Faculty of Medicine, Charles University, Prague, Czech Republic
- Lenka Stolarova
  - Institute of Biochemistry and Experimental Oncology, First Faculty of Medicine, Charles University, Prague, Czech Republic
- Mariana Borecka
  - Institute of Biochemistry and Experimental Oncology, First Faculty of Medicine, Charles University, Prague, Czech Republic
- Siranoush Manoukian
  - Unit of Medical Genetics, Department of Medical Oncology and Hematology, Fondazione IRCCS (Istituto di Ricovero e Cura a Carattere Scientifico) Istituto Nazionale dei Tumori (INT), Milan, Italy
- Bernardo Bonanni
  - Division of Cancer Prevention and Genetics, Istituto Europeo di Oncologia, Milan, Italy
- ENIGMA Consortium
  - School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, Australia
- Stacey L. Edwards
  - Department of Genetics and Computational Biology, QIMR Berghofer Medical Research Institute, Brisbane, Australia
- Marinus J. Blok
  - Department of Clinical Genetics, Maastricht University Medical Centre, Maastricht, The Netherlands
- Maria Rossing
  - Center for Genomic Medicine, Copenhagen University Hospital, Rigshospitalet, Copenhagen, Denmark
- Orland Diez
  - Oncogenetics Group, Vall d'Hebron Institute of Oncology (VHIO), Barcelona, Spain
  - Area of Clinical and Molecular Genetics, University Hospital Vall d'Hebron (UHVH), Barcelona, Spain
- Ana Vega
  - Fundación Pública Galega de Medicina Xenómica‐SERGAS, Grupo de Medicina Xenómica‐USC, CIBERER, IDIS, Santiago de Compostela, Spain
- Kathleen B.M. Claes
  - Center for Medical Genetics, Ghent University Hospital, and Cancer Research Institute Ghent (CRIG), Ghent University, Ghent, Belgium
- Paolo Radice
  - Unit of Molecular Bases of Genetic Risk and Genetic Testing, Department of Research, Fondazione IRCCS Istituto Nazionale dei Tumori di Milano, Milan, Italy
- Peter K. Rogan
  - University of Western Ontario, Department of Biochemistry, Schulich School of Medicine and Dentistry, London, Ontario, Canada
  - CytoGnomix Inc., London, Ontario, Canada
- Maria Caligo
  - Section of Molecular Genetics, Department of Laboratory Medicine, University Hospital of Pisa, Pisa, Italy
- Amanda B. Spurdle
  - Department of Genetics and Computational Biology, QIMR Berghofer Medical Research Institute, Brisbane, Australia
- Melissa A. Brown
  - School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, Australia
|
16
|
Yang X, Vingron M. Classifying human promoters by occupancy patterns identifies recurring sequence elements, combinatorial binding, and spatial interactions. BMC Biol 2018; 16:138. [PMID: 30442124 PMCID: PMC6238301 DOI: 10.1186/s12915-018-0585-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 03/14/2018] [Accepted: 10/04/2018] [Indexed: 12/14/2022] Open
Abstract
Background: Characterizing recurring sequence patterns in human promoters has been a challenging undertaking, even now that a near-complete overview of promoters exists. However, with the more recent availability of genomic location (ChIP-seq) data, one can approach that question through the identification of characteristic patterns of transcription factor occupancy and histone modifications. Results: Based on the ENCODE annotation and integration of sequence motifs as well as three-dimensional chromatin data, we have undertaken a re-analysis of occupancy and sequence patterns in human promoters. We identify clear groups of CAAT-box and E-box sequence motif containing promoters, as well as a group of promoters whose interaction with an enhancer appears to be mediated by CCCTC-binding factor (CTCF) binding on the promoter. We also extend our analysis to inactive promoters, showing that only a surprisingly small number of inactive promoters is repressed by the polycomb complex. We also identify combinatorial patterns of transcription factor interactions indicated by the ChIP-seq signals. Conclusion: Our analysis defines subgroups of promoters characterized by stereotypic patterns of transcription factor occupancy, and combinations of specific sequence patterns which are required for their binding. This grouping provides new hypotheses concerning the assembly and dynamics of transcription factor complexes at their respective promoter groups, as well as questions on the evolutionary origin of these groups.
Affiliation(s)
- Xinyi Yang
  - Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany, Ihnestraße 63-73, Berlin, 14195, Germany
- Martin Vingron
  - Max Planck Institute for Molecular Genetics, 14195 Berlin, Germany, Ihnestraße 63-73, Berlin, 14195, Germany
|
17
|
Dos Santos ES, Caputo SM, Castera L, Gendrot M, Briaux A, Breault M, Krieger S, Rogan PK, Mucaki EJ, Burke LJ, Bièche I, Houdayer C, Vaur D, Stoppa-Lyonnet D, Brown MA, Lallemand F, Rouleau E. Assessment of the functional impact of germline BRCA1/2 variants located in non-coding regions in families with breast and/or ovarian cancer predisposition. Breast Cancer Res Treat 2017; 168:311-325. [PMID: 29236234 DOI: 10.1007/s10549-017-4602-0] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 04/28/2017] [Accepted: 11/28/2017] [Indexed: 12/19/2022]
Abstract
PURPOSE: The molecular mechanism of breast and/or ovarian cancer susceptibility remains unclear in the majority of patients. While germline mutations in the regulatory non-coding regions of BRCA1 and BRCA2 genes have been described, screening has generally been limited to coding regions. The aim of this study was to evaluate the contribution of BRCA1/2 non-coding variants. METHODS: Four BRCA1/2 non-coding regions were screened using high-resolution melting analysis/Sanger sequencing or next-generation sequencing on DNA extracted from index cases with breast and ovarian cancer predisposition (3926 for BRCA1 and 3910 for BRCA2). The impact of a set of variants on BRCA1/2 gene regulation was evaluated by site-directed mutagenesis and transfection, followed by a luciferase gene reporter assay. RESULTS: We identified a total of 117 variants and tested 12 BRCA1 and 8 BRCA2 variants mapping to promoter and intronic regions. We highlighted two neighboring BRCA1 promoter variants (c.-130del; c.-125C>T) and one BRCA2 promoter variant (c.-296C>T) that significantly inhibited promoter activity. In the functional assays, a regulatory region within intron 12 was found to have the same enhancing impact as the one within intron 2. Furthermore, the variants c.81-3980A>G and c.4186-2022C>T suppress the positive effect of introns 2 and 12, respectively, on BRCA1 promoter activity. We also found some variants that increased promoter activity. CONCLUSION: In this study, we highlighted some variants, among many, that negatively modulate the promoter activity of BRCA1 or BRCA2 and thus have a potential impact on the risk of developing cancer. This selection makes it possible to conduct future validation studies on a limited number of variants.
Affiliation(s)
- E Santana Dos Santos
  - Department of Oncology, Center for Translational Oncology, Cancer Institute of the State of São Paulo - ICESP, São Paulo, Brazil
  - Service de Génétique, Institut Curie, Paris, France
  - A.C.Camargo Cancer Center, São Paulo, Brazil
- S M Caputo
  - Service de Génétique, Institut Curie, Paris, France
- L Castera
  - Laboratoire de Biologie et de Génétique du Cancer, CLCC François Baclesse, INSERM 1079 Centre Normand de Génomique et de Médecine Personnalisée, Caen, France
- M Gendrot
  - Service de Génétique, Institut Curie, Paris, France
- A Briaux
  - Service de Génétique, Institut Curie, Paris, France
- M Breault
  - Service de Génétique, Institut Curie, Paris, France
- S Krieger
  - Laboratoire de Biologie et de Génétique du Cancer, CLCC François Baclesse, INSERM 1079 Centre Normand de Génomique et de Médecine Personnalisée, Caen, France
- P K Rogan
  - Department of Biochemistry, Schulich School of Medicine and Dentistry, University of Western Ontario, London, Canada
- E J Mucaki
  - Department of Biochemistry, Schulich School of Medicine and Dentistry, University of Western Ontario, London, Canada
- L J Burke
  - School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia
- I Bièche
  - Service de Génétique, Institut Curie, Paris, France
  - Université Paris Descartes, Paris, France
- C Houdayer
  - Service de Génétique, Institut Curie, Paris, France
  - Université Paris Descartes, Paris, France
- D Vaur
  - Laboratoire de Biologie et de Génétique du Cancer, CLCC François Baclesse, INSERM 1079 Centre Normand de Génomique et de Médecine Personnalisée, Caen, France
- D Stoppa-Lyonnet
  - Service de Génétique, Institut Curie, Paris, France
  - Université Paris Descartes, Paris, France
- M A Brown
  - School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, Australia
- F Lallemand
  - Service de Génétique, Institut Curie, Paris, France
18
Kakumanu A, Velasco S, Mazzoni E, Mahony S. Deconvolving sequence features that discriminate between overlapping regulatory annotations. PLoS Comput Biol 2017; 13:e1005795. [PMID: 29049320 PMCID: PMC5663517 DOI: 10.1371/journal.pcbi.1005795] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 05/09/2017] [Revised: 10/31/2017] [Accepted: 09/26/2017] [Indexed: 11/19/2022]
Abstract
Genomic loci with regulatory potential can be annotated with various properties. For example, genomic sites bound by a given transcription factor (TF) can be divided according to whether they are proximal or distal to known promoters. Sites can be further labeled according to the cell types and conditions in which they are active. Given such a collection of labeled sites, it is natural to ask what sequence features are associated with each annotation label. However, discovering such label-specific sequence features is often confounded by overlaps between the labels; e.g. if regulatory sites specific to a given cell type are also more likely to be promoter-proximal, it is difficult to assess whether motifs identified in that set of sites are associated with the cell type or associated with promoters. In order to meet this challenge, we developed SeqUnwinder, a principled approach to deconvolving interpretable discriminative sequence features associated with overlapping annotation labels. We demonstrate the novel analysis abilities of SeqUnwinder using three examples. Firstly, SeqUnwinder is able to unravel sequence features associated with the dynamic binding behavior of TFs during motor neuron programming from features associated with chromatin state in the initial embryonic stem cells. Secondly, we characterize distinct sequence properties of multi-condition and cell-specific TF binding sites after controlling for uneven associations with promoter proximity. Finally, we demonstrate the scalability of SeqUnwinder to discover cell-specific sequence features from over one hundred thousand genomic loci that display DNase I hypersensitivity in one or more ENCODE cell lines.
Transcription factor proteins control gene expression by recognizing and interacting with short DNA sequence patterns in regulatory regions on the genome. Current genomics experiments allow us to find regulatory regions associated with a particular biochemical activity over the entire genome; for example, all regions where a particular transcription factor interacts with the genome in a given cell type. Given a collection of regulatory regions, we often aim to discover short DNA sequence patterns that are more common in the collection than in other regions. Performing such “DNA motif-finding” analysis can give us hints about the patterns that determine gene regulation in the analyzed cell type. Here we describe a new method for DNA motif-finding called SeqUnwinder. Our approach analyzes collections of regulatory regions where each has been labeled according to various biological properties. For example, the labels could correspond to various cell types in which the regulatory region is active. SeqUnwinder then performs machine-learning analysis to unravel DNA sequence features that are characteristic of each label (e.g. features that distinguish regulatory regions in each cell type from other cell types). SeqUnwinder is the first method to enable analysis of regulatory region collections that contain several overlapping labels.
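The confounding problem described in this abstract can be illustrated with a small toy example. The sketch below is not SeqUnwinder and does not reproduce its model; it only shows, on made-up data, how a k-mer can appear cell-type-specific when labels are compared marginally yet turn out to track promoter proximity once sites are stratified. The sequences, label names, and helper functions (`make_site`, `kmer_frequency`) are all hypothetical.

```python
# Toy illustration of label confounding; NOT the SeqUnwinder algorithm.
# All sequences, labels, and helper names are hypothetical.

def make_site(seq, cell, region):
    """Bundle a sequence with its two annotation labels."""
    return {"seq": seq, "cell": cell, "region": region}

def kmer_frequency(sites, kmer, **labels):
    """Fraction of sites carrying all the given labels that contain `kmer`."""
    selected = [s for s in sites if all(s[k] == v for k, v in labels.items())]
    if not selected:
        return float("nan")
    return sum(kmer in s["seq"] for s in selected) / len(selected)

# Cell type A is mostly promoter-proximal, cell type B mostly distal.
# The k-mer "CGCG" occurs only in the promoter-proximal sequences.
SITES = (
    [make_site("ATCGCGTA", "A", "proximal")] * 3
    + [make_site("ATTATATA", "A", "distal")]
    + [make_site("GGCGCGAT", "B", "proximal")]
    + [make_site("TTATAATT", "B", "distal")] * 3
)

KMER = "CGCG"

# Marginal view: compare cell types while ignoring promoter proximity.
marginal = {c: kmer_frequency(SITES, KMER, cell=c) for c in ("A", "B")}

# Stratified view: compare cell types within each proximity class.
stratified = {
    (c, r): kmer_frequency(SITES, KMER, cell=c, region=r)
    for c in ("A", "B")
    for r in ("proximal", "distal")
}

print(marginal)
print(stratified)
```

In this toy data the marginal comparison makes the k-mer look enriched in cell type A, while the stratified comparison shows identical frequencies for both cell types within each proximity class. Manual stratification of this kind becomes impractical as the number of overlapping labels grows, which is the situation the abstract says SeqUnwinder is designed for.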
Affiliation(s)
- Akshay Kakumanu
  - Center for Eukaryotic Gene Regulation, Department of Biochemistry & Molecular Biology, The Pennsylvania State University, University Park, PA, United States of America
- Silvia Velasco
  - Department of Biology, New York University, 100 Washington Square East, New York, NY, United States of America
- Esteban Mazzoni
  - Department of Biology, New York University, 100 Washington Square East, New York, NY, United States of America
- Shaun Mahony
  - Center for Eukaryotic Gene Regulation, Department of Biochemistry & Molecular Biology, The Pennsylvania State University, University Park, PA, United States of America
19
Yang XR, Devi BCR, Sung H, Guida J, Mucaki EJ, Xiao Y, Best A, Garland L, Xie Y, Hu N, Rodriguez-Herrera M, Wang C, Jones K, Luo W, Hicks B, Tang TS, Moitra K, Rogan PK, Dean M. Prevalence and spectrum of germline rare variants in BRCA1/2 and PALB2 among breast cancer cases in Sarawak, Malaysia. Breast Cancer Res Treat 2017; 165:687-697. [PMID: 28664506 DOI: 10.1007/s10549-017-4356-8] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 06/17/2017] [Accepted: 06/23/2017] [Indexed: 12/29/2022]
Abstract
PURPOSE To characterize the spectrum of germline mutations in BRCA1, BRCA2, and PALB2 in population-based unselected breast cancer cases in an Asian population. METHODS Germline DNA from 467 breast cancer patients in Sarawak General Hospital, Malaysia, where 93% of the breast cancer patients in Sarawak are treated, was sequenced for the entire coding regions of BRCA1, BRCA2, and PALB2; exons 6, 7, and 8 of TP53; and exons 7 and 8 of PTEN. Pathogenic variants included known pathogenic variants in ClinVar, loss-of-function variants, and variants that disrupt splice sites. RESULTS We found 27 pathogenic variants (11 BRCA1, 10 BRCA2, 4 PALB2, and 2 TP53) in 34 patients, which gave a prevalence of germline mutations of 2.8, 3.23, and 0.86% for BRCA1, BRCA2, and PALB2, respectively. Compared to mutation non-carriers, BRCA1 mutation carriers were more likely to have an earlier age at onset, triple-negative subtype, and lower body mass index, whereas BRCA2 mutation carriers were more likely to have a positive family history. Mutation carrier cases had worse survival compared to non-carriers; however, the association was mostly driven by stage and tumor subtype. We also identified 19 variants of unknown significance, and some of them were predicted to alter splicing or transcription factor binding sites. CONCLUSION Our data provide insight into the genetics of breast cancer in this understudied group and suggest the need for modifying genetic testing guidelines for this population with a much younger age at diagnosis and more limited resources compared with Caucasian populations.
Affiliation(s)
- Xiaohong R Yang
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, NCI/NIH, Bethesda, Rockville, MD, USA
- Beena C R Devi
  - Department of Radiotherapy, Oncology and Palliative Care, Sarawak General Hospital, Kuching, Sarawak, Malaysia
- Hyuna Sung
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, NCI/NIH, Bethesda, Rockville, MD, USA
- Jennifer Guida
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, NCI/NIH, Bethesda, Rockville, MD, USA
- Eliseos J Mucaki
  - Department of Biochemistry, Schulich School of Medicine and Dentistry, University of Western Ontario, London, ON, Canada
- Yanzi Xiao
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, NCI/NIH, Bethesda, Rockville, MD, USA
- Ana Best
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, NCI/NIH, Bethesda, Rockville, MD, USA
- Lisa Garland
  - Cancer Genomics Research Laboratory, Leidos Biomedical Research, Frederick National Laboratory for Cancer Research, Frederick, MD, USA
- Yi Xie
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, NCI/NIH, Bethesda, Rockville, MD, USA
- Nan Hu
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, NCI/NIH, Bethesda, Rockville, MD, USA
- Maria Rodriguez-Herrera
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, NCI/NIH, Bethesda, Rockville, MD, USA
- Chaoyu Wang
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, NCI/NIH, Bethesda, Rockville, MD, USA
- Kristine Jones
  - Cancer Genomics Research Laboratory, Leidos Biomedical Research, Frederick National Laboratory for Cancer Research, Frederick, MD, USA
- Wen Luo
  - Cancer Genomics Research Laboratory, Leidos Biomedical Research, Frederick National Laboratory for Cancer Research, Frederick, MD, USA
- Belynda Hicks
  - Cancer Genomics Research Laboratory, Leidos Biomedical Research, Frederick National Laboratory for Cancer Research, Frederick, MD, USA
- Tieng Swee Tang
  - Department of Radiotherapy, Oncology and Palliative Care, Sarawak General Hospital, Kuching, Sarawak, Malaysia
- Karobi Moitra
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, NCI/NIH, Bethesda, Rockville, MD, USA
  - Department of Biology, Trinity Washington University, Washington, DC, USA
- Peter K Rogan
  - Department of Biochemistry, Schulich School of Medicine and Dentistry, University of Western Ontario, London, ON, Canada
- Michael Dean
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, NCI/NIH, Bethesda, Rockville, MD, USA