1
Morffy N, Van den Broeck L, Miller C, Emenecker RJ, Bryant JA, Lee TM, Sageman-Furnas K, Wilkinson EG, Pathak S, Kotha SR, Lam A, Mahatma S, Pande V, Waoo A, Wright RC, Holehouse AS, Staller MV, Sozzani R, Strader LC. Identification of plant transcriptional activation domains. Nature 2024; 632:166-173. [PMID: 39020176 DOI: 10.1038/s41586-024-07707-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2023] [Accepted: 06/12/2024] [Indexed: 07/19/2024]
Abstract
Gene expression in Arabidopsis is regulated by more than 1,900 transcription factors (TFs), which have been identified genome-wide by the presence of well-conserved DNA-binding domains. Activator TFs contain activation domains (ADs) that recruit coactivator complexes; however, for nearly all Arabidopsis TFs, we lack knowledge about the presence, location and transcriptional strength of their ADs (ref. 1). To address this gap, here we use a yeast library approach to experimentally identify Arabidopsis ADs on a proteome-wide scale, and find that more than half of the Arabidopsis TFs contain an AD. We annotate 1,553 ADs, the vast majority of which are, to our knowledge, previously unknown. Using the dataset generated, we develop a neural network to accurately predict ADs and to identify sequence features that are necessary to recruit coactivator complexes. We uncover six distinct combinations of sequence features that result in activation activity, providing a framework to interrogate the subfunctionalization of ADs. Furthermore, we identify ADs in the ancient AUXIN RESPONSE FACTOR family of TFs, revealing that AD positioning is conserved in distinct clades. Our findings provide a deep resource for understanding transcriptional activation, a framework for examining function in intrinsically disordered regions and a predictive model of ADs.
Affiliation(s)
- Lisa Van den Broeck
  - Department of Plant and Microbial Biology, North Carolina State University, Raleigh, NC, USA
- Caelan Miller
  - Department of Biology, Duke University, Durham, NC, USA
- Ryan J Emenecker
  - Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO, USA
  - Center for Biomolecular Condensates, Washington University in St. Louis, St. Louis, MO, USA
- John A Bryant
  - Biological Systems Engineering, Virginia Tech, Blacksburg, VA, USA
- Tyler M Lee
  - Department of Biology, Duke University, Durham, NC, USA
- Sunita Pathak
  - Department of Biology, Duke University, Durham, NC, USA
- Sanjana R Kotha
  - Center for Computational Biology, University of California, Berkeley, Berkeley, CA, USA
- Angelica Lam
  - Center for Computational Biology, University of California, Berkeley, Berkeley, CA, USA
- Saloni Mahatma
  - Department of Plant and Microbial Biology, North Carolina State University, Raleigh, NC, USA
- Vikram Pande
  - Department of Plant and Microbial Biology, North Carolina State University, Raleigh, NC, USA
- Aman Waoo
  - Department of Plant and Microbial Biology, North Carolina State University, Raleigh, NC, USA
- R Clay Wright
  - Biological Systems Engineering, Virginia Tech, Blacksburg, VA, USA
- Alex S Holehouse
  - Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO, USA
  - Center for Biomolecular Condensates, Washington University in St. Louis, St. Louis, MO, USA
- Max V Staller
  - Center for Computational Biology, University of California, Berkeley, Berkeley, CA, USA
- Rosangela Sozzani
  - Department of Plant and Microbial Biology, North Carolina State University, Raleigh, NC, USA
2
Hummel NFC, Markel K, Stefani J, Staller MV, Shih PM. Systematic identification of transcriptional activation domains from non-transcription factor proteins in plants and yeast. Cell Syst 2024; 15:662-672.e4. [PMID: 38866009 DOI: 10.1016/j.cels.2024.05.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Revised: 04/26/2024] [Accepted: 05/22/2024] [Indexed: 06/14/2024]
Abstract
Transcription factors can promote gene expression through activation domains. Whole-genome screens have systematically mapped activation domains in transcription factors but not in non-transcription factor proteins (e.g., chromatin regulators and coactivators). To fill this knowledge gap, we employed the activation domain predictor PADDLE to analyze the proteomes of Arabidopsis thaliana and Saccharomyces cerevisiae. We screened 18,000 predicted activation domains from >800 non-transcription factor genes in both species, confirming that 89% of candidate proteins contain active fragments. Our work enables the annotation of hundreds of nuclear proteins as putative coactivators, many of which have never been ascribed any function in plants. Analysis of peptide sequence compositions reveals how the distribution of key amino acids dictates activity. Finally, we validated short, "universal" activation domains with comparable performance to state-of-the-art activation domains used for genome engineering. Our approach enables the genome-wide discovery and annotation of activation domains that can function across diverse eukaryotes.
Affiliation(s)
- Niklas F C Hummel
  - Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720, USA; Feedstocks Division, Joint BioEnergy Institute, Emeryville, CA 94608, USA; Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA; Department of Biology, Technische Universität Darmstadt, 64287 Darmstadt, Germany
- Kasey Markel
  - Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720, USA; Feedstocks Division, Joint BioEnergy Institute, Emeryville, CA 94608, USA; Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Jordan Stefani
  - Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720, USA
- Max V Staller
  - Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720, USA; Center for Computational Biology, University of California, Berkeley, CA 94720, USA; Chan Zuckerberg Biohub-San Francisco, San Francisco, CA 9415, USA.
- Patrick M Shih
  - Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720, USA; Feedstocks Division, Joint BioEnergy Institute, Emeryville, CA 94608, USA; Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA; Innovative Genomics Institute, University of California, Berkeley, CA 94720, USA.
3
McDonnell AF, Plech M, Livesey BJ, Gerasimavicius L, Owen LJ, Hall HN, FitzPatrick DR, Marsh JA, Kudla G. Deep mutational scanning quantifies DNA binding and predicts clinical outcomes of PAX6 variants. Mol Syst Biol 2024; 20:825-844. [PMID: 38849565 PMCID: PMC11219921 DOI: 10.1038/s44320-024-00043-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/04/2023] [Revised: 04/05/2024] [Accepted: 05/14/2024] [Indexed: 06/09/2024] Open
Abstract
Nonsense and missense mutations in the transcription factor PAX6 cause a wide range of eye development defects, including aniridia, microphthalmia and coloboma. To understand how changes of PAX6:DNA binding cause these phenotypes, we combined saturation mutagenesis of the paired domain of PAX6 with a yeast one-hybrid (Y1H) assay in which expression of a PAX6-GAL4 fusion gene drives antibiotic resistance. We quantified binding of more than 2700 single amino-acid variants to two DNA sequence elements. Mutations in DNA-facing residues of the N-terminal subdomain and linker region were most detrimental, as were mutations to prolines and to negatively charged residues. Many variants caused sequence-specific molecular gain-of-function effects, including variants in position 71 that increased binding to the LE9 enhancer but decreased binding to a SELEX-derived binding site. In the absence of antibiotic selection, variants that retained DNA binding slowed yeast growth, likely because such variants perturbed the yeast transcriptome. Benchmarking against known patient variants and applying ACMG/AMP guidelines to variant classification, we obtained supporting-to-moderate evidence that 977 variants are likely pathogenic and 1306 are likely benign. Our analysis shows that most pathogenic mutations in the paired domain of PAX6 can be explained simply by the effects of these mutations on PAX6:DNA association, and establishes Y1H as a generalisable assay for the interpretation of variant effects in transcription factors.
Affiliation(s)
- Alexander F McDonnell
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, EH4 2XU, UK
- Marcin Plech
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, EH4 2XU, UK
- Benjamin J Livesey
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, EH4 2XU, UK
- Lukas Gerasimavicius
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, EH4 2XU, UK
- Liusaidh J Owen
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, EH4 2XU, UK
- Hildegard Nikki Hall
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, EH4 2XU, UK
- David R FitzPatrick
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, EH4 2XU, UK
- Joseph A Marsh
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, EH4 2XU, UK
- Grzegorz Kudla
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, EH4 2XU, UK.
4
Ginell GM, Emenecker RJ, Lotthammer JM, Usher ET, Holehouse AS. Direct prediction of intermolecular interactions driven by disordered regions. bioRxiv 2024:2024.06.03.597104. [PMID: 38895487 PMCID: PMC11185574 DOI: 10.1101/2024.06.03.597104] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/21/2024]
Abstract
Intrinsically disordered regions (IDRs) are critical for a wide variety of cellular functions, many of which involve interactions with partner proteins. Molecular recognition is typically considered through the lens of sequence-specific binding events. However, a growing body of work has shown that IDRs often interact with partners in a manner that does not depend on the precise order of the amino acids and is instead driven by complementary chemical interactions leading to disordered bound-state complexes. Despite this emerging paradigm, we lack tools to describe, quantify, predict, and interpret these types of structurally heterogeneous interactions from the underlying amino acid sequences. Here, we repurpose the chemical physics developed originally for molecular simulations to develop an approach for predicting intermolecular interactions between IDRs and partner proteins. Our approach enables the direct prediction of phase diagrams, the identification of chemically-specific interaction hotspots on IDRs, and a route to develop and test mechanistic hypotheses regarding IDR function in the context of molecular recognition. We use our approach to examine a range of systems and questions to highlight its versatility and applicability.
Affiliation(s)
- Garrett M. Ginell
  - Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO
  - Center for Biomolecular Condensates (CBC), Washington University in St. Louis, St. Louis, MO
- Ryan J. Emenecker
  - Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO
  - Center for Biomolecular Condensates (CBC), Washington University in St. Louis, St. Louis, MO
- Jeffrey M. Lotthammer
  - Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO
  - Center for Biomolecular Condensates (CBC), Washington University in St. Louis, St. Louis, MO
- Emery T. Usher
  - Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO
  - Center for Biomolecular Condensates (CBC), Washington University in St. Louis, St. Louis, MO
- Alex S. Holehouse
  - Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO
  - Center for Biomolecular Condensates (CBC), Washington University in St. Louis, St. Louis, MO
5
Singleton MD, Eisen MB. Evolutionary analyses of intrinsically disordered regions reveal widespread signals of conservation. PLoS Comput Biol 2024; 20:e1012028. [PMID: 38662765 PMCID: PMC11075841 DOI: 10.1371/journal.pcbi.1012028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2024] [Revised: 05/07/2024] [Accepted: 03/28/2024] [Indexed: 05/08/2024] Open
Abstract
Intrinsically disordered regions (IDRs) are segments of proteins without stable three-dimensional structures. As this flexibility allows them to interact with diverse binding partners, IDRs play key roles in cell signaling and gene expression. Despite the prevalence and importance of IDRs in eukaryotic proteomes and various biological processes, associating them with specific molecular functions remains a significant challenge due to their high rates of sequence evolution. However, by comparing the observed values of various IDR-associated properties against those generated under a simulated model of evolution, a recent study found most IDRs across the entire yeast proteome contain conserved features. Furthermore, it showed clusters of IDRs with common "evolutionary signatures," i.e. patterns of conserved features, were associated with specific biological functions. To determine if similar patterns of conservation are found in the IDRs of other systems, in this work we applied a series of phylogenetic models to over 7,500 orthologous IDRs identified in the Drosophila genome to dissect the forces driving their evolution. By comparing models of unconstrained and constrained continuous trait evolution using the Brownian motion and Ornstein-Uhlenbeck models, respectively, we identified signals of widespread constraint, indicating that conservation of distributed features is a mechanism of IDR evolution common to multiple biological systems. In contrast to the previous study in yeast, however, we observed limited evidence of IDR clusters with specific biological functions, which suggests a more complex relationship between evolutionary constraints and function in the IDRs of multicellular organisms.
Affiliation(s)
- Marc D. Singleton
  - Howard Hughes Medical Institute, UC Berkeley, Berkeley, California, United States of America
- Michael B. Eisen
  - Howard Hughes Medical Institute, UC Berkeley, Berkeley, California, United States of America
  - Department of Molecular and Cell Biology, UC Berkeley, Berkeley, California, United States of America
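Note on the two trait-evolution models named in this abstract: their standard textbook forms along a single lineage can be written as below. This is a generic sketch and not necessarily the parameterization used by Singleton and Eisen; the symbols $X_t$ (trait value), $\sigma$ (rate), $\alpha$ (strength of constraint), $\theta$ (optimum) and the AIC comparison are assumptions introduced here for illustration.

```latex
% Brownian motion (unconstrained): trait variance grows linearly with time
dX_t = \sigma \, dW_t, \qquad
\operatorname{Var}[X_t \mid X_0] = \sigma^2 t

% Ornstein-Uhlenbeck (constrained): the trait is pulled toward an optimum \theta
dX_t = \alpha(\theta - X_t)\, dt + \sigma \, dW_t, \qquad
\operatorname{Var}[X_t \mid X_0] = \frac{\sigma^2}{2\alpha}\left(1 - e^{-2\alpha t}\right)

% As \alpha \to 0 the OU model reduces to Brownian motion, so the models are
% nested and can be compared with, e.g., \mathrm{AIC} = 2k - 2\ln\hat{L}
```

Under the Ornstein-Uhlenbeck form the variance saturates at $\sigma^2 / (2\alpha)$, which is why a better fit of that model is usually read as a signal of constraint.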
6
Monté D, Lens Z, Dewitte F, Villeret V, Verger A. Assessment of machine-learning predictions for the Mediator complex subunit MED25 ACID domain interactions with transactivation domains. FEBS Lett 2024; 598:758-773. [PMID: 38436147 DOI: 10.1002/1873-3468.14837] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2023] [Revised: 02/01/2024] [Accepted: 02/10/2024] [Indexed: 03/05/2024]
Abstract
The human Mediator complex subunit MED25 binds transactivation domains (TADs) present in various cellular and viral proteins using two binding interfaces, named H1 and H2, which are found on opposite sides of its ACID domain. Here, we use and compare deep learning methods to characterize human MED25-TAD interfaces and assess the predicted models against published experimental data. For the H1 interface, AlphaFold produces predictions with high-reliability scores that agree well with experimental data, while the H2 interface predictions appear inconsistent, preventing the assignment of reliable binding modes. Despite these limitations, we experimentally assess the validity of MED25 interface predictions with the viral transcriptional activators Lana-1 and IE62. AlphaFold predictions also suggest the existence of a unique hydrophobic pocket for the Arabidopsis MED25 ACID domain.
Affiliation(s)
- Didier Monté
  - CNRS EMR 9002 Integrative Structural Biology, Inserm U 1167 - RID-AGE, Univ. Lille, CHU Lille, Institut Pasteur de Lille, France
- Zoé Lens
  - CNRS EMR 9002 Integrative Structural Biology, Inserm U 1167 - RID-AGE, Univ. Lille, CHU Lille, Institut Pasteur de Lille, France
- Frédérique Dewitte
  - CNRS EMR 9002 Integrative Structural Biology, Inserm U 1167 - RID-AGE, Univ. Lille, CHU Lille, Institut Pasteur de Lille, France
- Vincent Villeret
  - CNRS EMR 9002 Integrative Structural Biology, Inserm U 1167 - RID-AGE, Univ. Lille, CHU Lille, Institut Pasteur de Lille, France
- Alexis Verger
  - CNRS EMR 9002 Integrative Structural Biology, Inserm U 1167 - RID-AGE, Univ. Lille, CHU Lille, Institut Pasteur de Lille, France
7
Mindel V, Brodsky S, Cohen A, Manadre W, Jonas F, Carmi M, Barkai N. Intrinsically disordered regions of the Msn2 transcription factor encode multiple functions using interwoven sequence grammars. Nucleic Acids Res 2024; 52:2260-2272. [PMID: 38109289 PMCID: PMC10954448 DOI: 10.1093/nar/gkad1191] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2023] [Revised: 11/04/2023] [Accepted: 12/11/2023] [Indexed: 12/20/2023] Open
Abstract
Intrinsically disordered regions (IDRs) are abundant in eukaryotic proteins, but their sequence-function relationship remains poorly understood. IDRs of transcription factors (TFs) can direct promoter selection and recruit coactivators, as shown for the budding yeast TF Msn2. To examine how IDRs encode both these functions, we compared genomic binding specificity, coactivator recruitment, and gene induction amongst a large set of designed Msn2-IDR mutants. We find that both functions depend on multiple regions across the > 600AA IDR. Yet, transcription activity was readily disrupted by mutations that showed no effect on the Msn2 binding specificity. Our data attribute this differential sensitivity to the integration of a relaxed, composition-based code directing binding specificity with a more stringent, motif-based code controlling the recruitment of coactivators and transcription activity. Therefore, Msn2 utilizes interwoven sequence grammars for encoding multiple functions, suggesting a new IDR design paradigm of potentially general use.
Affiliation(s)
- Vladimir Mindel
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Sagie Brodsky
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Aileen Cohen
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Wajd Manadre
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Felix Jonas
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Miri Carmi
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Naama Barkai
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
8
Swint-Kruse L, Fenton AW. Rheostats, toggles, and neutrals, Oh my! A new framework for understanding how amino acid changes modulate protein function. J Biol Chem 2024; 300:105736. [PMID: 38336297 PMCID: PMC10914490 DOI: 10.1016/j.jbc.2024.105736] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2023] [Revised: 01/09/2024] [Accepted: 01/25/2024] [Indexed: 02/12/2024] Open
Abstract
Advances in personalized medicine and protein engineering require accurately predicting outcomes of amino acid substitutions. Many algorithms correctly predict that evolutionarily-conserved positions show "toggle" substitution phenotypes, which is defined when a few substitutions at that position retain function. In contrast, predictions often fail for substitutions at the less-studied "rheostat" positions, which are defined when different amino acid substitutions at a position sample at least half of the possible functional range. This review describes efforts to understand the impact and significance of rheostat positions: (1) They have been observed in globular soluble, integral membrane, and intrinsically disordered proteins; within single proteins, their prevalence can be up to 40%. (2) Substitutions at rheostat positions can have biological consequences and ∼10% of substitutions gain function. (3) Although both rheostat and "neutral" (defined when all substitutions exhibit wild-type function) positions are nonconserved, the two classes have different evolutionary signatures. (4) Some rheostat positions have pleiotropic effects on function, simultaneously modulating multiple parameters (e.g., altering both affinity and allosteric coupling). (5) In structural studies, substitutions at rheostat positions appear to cause only local perturbations; the overall conformations appear unchanged. (6) Measured functional changes show promising correlations with predicted changes in protein dynamics; the emergent properties of predicted, dynamically coupled amino acid networks might explain some of the complex functional outcomes observed when substituting rheostat positions. Overall, rheostat positions provide unique opportunities for using single substitutions to tune protein function. Future studies of these positions will yield important insights into the protein sequence/function relationship.
Affiliation(s)
- Liskin Swint-Kruse
  - Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, USA.
- Aron W Fenton
  - Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, USA
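Note on the three position classes defined in this abstract: the definitions can be turned into a rough decision rule, sketched below in Python. This is an illustrative assumption, not the scoring method used by Swint-Kruse and Fenton. The function name `classify_position`, the tolerance `tol`, the bin count `n_bins`, the `toggle_fraction` cutoff and the extra "intermediate" label are all hypothetical choices made here.

```python
from typing import Sequence


def classify_position(
    values: Sequence[float],
    wild_type: float = 1.0,
    dead: float = 0.0,
    tol: float = 0.1,             # assumed tolerance, as a fraction of the range
    n_bins: int = 10,             # assumed histogram resolution
    toggle_fraction: float = 0.75,  # assumed share of "dead" variants for a toggle
) -> str:
    """Label one protein position from the measured function of its substitutions."""
    if not values:
        raise ValueError("need at least one substitution measurement")
    span = wild_type - dead
    if span <= 0:
        raise ValueError("wild_type must be greater than dead")

    # Rescale so that 0.0 means no function and 1.0 means wild-type function.
    scaled = [(v - dead) / span for v in values]

    # Neutral: every substitution behaves like wild type.
    if all(abs(s - 1.0) <= tol for s in scaled):
        return "neutral"

    # Rheostat: substitutions occupy at least half of the functional range,
    # approximated here by the fraction of equal-width bins that are occupied.
    occupied = {min(n_bins - 1, max(0, int(s * n_bins))) for s in scaled}
    if len(occupied) / n_bins >= 0.5:
        return "rheostat"

    # Toggle: most substitutions abolish function, few retain it.
    dead_like = sum(1 for s in scaled if s <= tol)
    if dead_like / len(scaled) >= toggle_fraction:
        return "toggle"

    # Anything else is left unlabeled rather than forced into a class.
    return "intermediate"


if __name__ == "__main__":
    print(classify_position([1.0, 0.95, 1.05, 1.02]))
    print(classify_position([0.05, 0.15, 0.35, 0.55, 0.75, 0.95]))
    print(classify_position([0.0, 0.02, 0.05, 0.0, 1.0]))
```

The bin-occupancy test is only one way to read "sample at least half of the possible functional range"; a real analysis would also need to account for measurement error and for how many of the 19 possible substitutions were assayed.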
9
Holehouse AS, Kragelund BB. The molecular basis for cellular function of intrinsically disordered protein regions. Nat Rev Mol Cell Biol 2024; 25:187-211. [PMID: 37957331 DOI: 10.1038/s41580-023-00673-0] [Citation(s) in RCA: 48] [Impact Index Per Article: 48.0] [Accepted: 09/26/2023] [Indexed: 11/15/2023]
Abstract
Intrinsically disordered protein regions exist in a collection of dynamic interconverting conformations that lack a stable 3D structure. These regions are structurally heterogeneous, ubiquitous and found across all kingdoms of life. Despite the absence of a defined 3D structure, disordered regions are essential for cellular processes ranging from transcriptional control and cell signalling to subcellular organization. Through their conformational malleability and adaptability, disordered regions extend the repertoire of macromolecular interactions and are readily tunable by their structural and chemical context, making them ideal responders to regulatory cues. Recent work has led to major advances in understanding the link between protein sequence and conformational behaviour in disordered regions, yet the link between sequence and molecular function is less well defined. Here we consider the biochemical and biophysical foundations that underlie how and why disordered regions can engage in productive cellular functions, provide examples of emerging concepts and discuss how protein disorder contributes to intracellular information processing and regulation of cellular function.
Affiliation(s)
- Alex S Holehouse
  - Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St Louis, MO, USA.
  - Center for Biomolecular Condensates, Washington University in St Louis, St Louis, MO, USA.
- Birthe B Kragelund
  - REPIN, Structural Biology and NMR Laboratory, Department of Biology, University of Copenhagen, Copenhagen, Denmark.
10
Zheng Y, Chen S. Transcriptional precision in photoreceptor development and diseases - Lessons from 25 years of CRX research. Front Cell Neurosci 2024; 18:1347436. [PMID: 38414750 PMCID: PMC10896975 DOI: 10.3389/fncel.2024.1347436] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2023] [Accepted: 01/19/2024] [Indexed: 02/29/2024] Open
Abstract
The vertebrate retina is made up of six specialized neuronal cell types and one glia that are generated from a common retinal progenitor. The development of these distinct cell types is programmed by transcription factors that regulate the expression of specific genes essential for cell fate specification and differentiation. Because of the complex nature of transcriptional regulation, understanding transcription factor functions in development and disease is challenging. Research on the Cone-rod homeobox transcription factor CRX provides an excellent model to address these challenges. In this review, we reflect on 25 years of mammalian CRX research and discuss recent progress in elucidating the distinct pathogenic mechanisms of four CRX coding variant classes. We highlight how in vitro biochemical studies of CRX protein functions facilitate understanding CRX regulatory principles in animal models. We conclude with a brief discussion of the emerging systems biology approaches that could accelerate precision medicine for CRX-linked diseases and beyond.
Affiliation(s)
- Yiqiao Zheng
  - Molecular Genetics and Genomics Graduate Program, Division of Biological and Biomedical Sciences, Saint Louis, MO, United States
  - Department of Ophthalmology and Visual Sciences, Saint Louis, MO, United States
- Shiming Chen
  - Molecular Genetics and Genomics Graduate Program, Division of Biological and Biomedical Sciences, Saint Louis, MO, United States
  - Department of Ophthalmology and Visual Sciences, Saint Louis, MO, United States
  - Department of Developmental Biology, Washington University in St. Louis, Saint Louis, MO, United States
11
Lobel JH, Ingolia NT. Defining the mechanisms and properties of post-transcriptional regulatory disordered regions by high-throughput functional profiling. bioRxiv 2024:2024.02.01.578453. [PMID: 38370681 PMCID: PMC10871298 DOI: 10.1101/2024.02.01.578453] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/20/2024]
Abstract
Disordered regions within RNA binding proteins are required to control mRNA decay and protein synthesis. To understand how these disordered regions modulate gene expression, we surveyed regulatory activity across the entire disordered proteome using a high-throughput functional assay. We identified hundreds of regulatory sequences within intrinsically disordered regions and demonstrate how these elements cooperate with core mRNA decay machinery to promote transcript turnover. Coupling high-throughput functional profiling with mutational scanning revealed diverse molecular features, ranging from defined motifs to overall sequence composition, underlying the regulatory effects of disordered peptides. Machine learning analysis implicated aromatic residues in particular contexts as critical determinants of repressor activity, consistent with their roles in forming protein-protein interactions with downstream effectors. Our results define the molecular principles and biochemical mechanisms that govern post-transcriptional gene regulation by disordered regions and exemplify the encoding of diverse yet specific functions in the absence of well-defined structure.
Affiliation(s)
- Joseph H Lobel
  - Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720, USA
- Nicholas T Ingolia
  - Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720, USA
  - Lead contact
12
Sreenivasan S, Heffren P, Suh K, Rodnin MV, Kosa E, Fenton AW, Ladokhin AS, Smith PE, Fontes JD, Swint‐Kruse L. The intrinsically disordered transcriptional activation domain of CIITA is functionally tuneable by single substitutions: An exception or a new paradigm? Protein Sci 2024; 33:e4863. [PMID: 38073129 PMCID: PMC10806935 DOI: 10.1002/pro.4863] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2023] [Revised: 12/04/2023] [Accepted: 12/07/2023] [Indexed: 01/27/2024]
Abstract
During protein evolution, some amino acid substitutions modulate protein function ("tuneability"). In most proteins, the tuneable range is wide and can be sampled by a set of protein variants that each contains multiple amino acid substitutions. In other proteins, the full tuneable range can be accessed by a set of variants that each contains a single substitution. Indeed, in some globular proteins, the full tuneable range can be accessed by the set of site-saturating substitutions at an individual "rheostat" position. However, in proteins with intrinsically disordered regions (IDRs), most functional studies-which would also detect tuneability-used multiple substitutions or small deletions. In disordered transcriptional activation domains (ADs), studies with multiple substitutions led to the "acidic exposure" model, which does not anticipate the existence of rheostat positions. In the few studies that did assess effects of single substitutions on AD function, results were mixed: the ADs of two full-length transcription factors did not show tuneability, whereas a fragment of a third AD was tuneable by single substitutions. In this study, we tested tuneability in the AD of full-length human class II transactivator (CIITA). Sequence analyses and experiments showed that CIITA's AD is an IDR. Functional assays of singly-substituted AD variants showed that CIITA's function was highly tuneable, with outcomes not predicted by the acidic exposure model. Four tested positions showed rheostat behavior for transcriptional activation. Thus, tuneability of different IDRs can vary widely. Future studies are needed to illuminate the biophysical features that govern whether an IDR is tuneable by single substitutions.
Affiliation(s)
- Shwetha Sreenivasan
  - Department of Biochemistry and Molecular Biology, University of Kansas Medical Center, Kansas City, Kansas, USA
- Paul Heffren
  - Department of Biochemistry and Molecular Biology, University of Kansas Medical Center, Kansas City, Kansas, USA
  - Present address: Department of Biosciences, Kansas City University, Kansas City, Missouri, USA
- Kyung‐Shin Suh
  - Department of Chemistry, Kansas State University, Manhattan, Kansas, USA
- Mykola V. Rodnin
  - Department of Biochemistry and Molecular Biology, University of Kansas Medical Center, Kansas City, Kansas, USA
- Edina Kosa
  - Department of Biochemistry and Molecular Biology, University of Kansas Medical Center, Kansas City, Kansas, USA
- Aron W. Fenton
  - Department of Biochemistry and Molecular Biology, University of Kansas Medical Center, Kansas City, Kansas, USA
- Alexey S. Ladokhin
  - Department of Biochemistry and Molecular Biology, University of Kansas Medical Center, Kansas City, Kansas, USA
- Paul E. Smith
  - Department of Chemistry, Kansas State University, Manhattan, Kansas, USA
- Joseph D. Fontes
  - Department of Biochemistry and Molecular Biology, University of Kansas Medical Center, Kansas City, Kansas, USA
- Liskin Swint‐Kruse
  - Department of Biochemistry and Molecular Biology, University of Kansas Medical Center, Kansas City, Kansas, USA
| |
Collapse
|
13
|
Udupa A, Kotha SR, Staller MV. Commonly asked questions about transcriptional activation domains. Curr Opin Struct Biol 2024; 84:102732. [PMID: 38056064 PMCID: PMC11193542 DOI: 10.1016/j.sbi.2023.102732] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2023] [Revised: 10/23/2023] [Accepted: 10/27/2023] [Indexed: 12/08/2023]
Abstract
Eukaryotic transcription factors activate gene expression with their DNA-binding domains and activation domains. DNA-binding domains bind the genome by recognizing structurally related DNA sequences; they are structured, conserved, and predictable from protein sequences. Activation domains recruit chromatin modifiers, coactivator complexes, or basal transcriptional machinery via structurally diverse protein-protein interactions. Activation domains and DNA-binding domains have been called independent, modular units, but there are many departures from modularity, including interactions between these regions and overlap in function. Compared to DNA-binding domains, activation domains are poorly understood because they are poorly conserved, intrinsically disordered, and difficult to predict from protein sequences. This review, organized around commonly asked questions, describes recent progress that the field has made in understanding the sequence features that control activation domains and predicting them from sequence.
Affiliation(s)
- Aditya Udupa
- Department of Molecular and Cell Biology, University of California, Berkeley, 94720, USA
- Sanjana R Kotha
- Department of Molecular and Cell Biology, University of California, Berkeley, 94720, USA; Center for Computational Biology, University of California, Berkeley, 94720, USA
- Max V Staller
- Department of Molecular and Cell Biology, University of California, Berkeley, 94720, USA; Center for Computational Biology, University of California, Berkeley, 94720, USA; Chan Zuckerberg Biohub-San Francisco, San Francisco, CA 94158, USA
|
14
|
Theisen FF, Prestel A, Elkjær S, Leurs YHA, Morffy N, Strader LC, O'Shea C, Teilum K, Kragelund BB, Skriver K. Molecular switching in transcription through splicing and proline-isomerization regulates stress responses in plants. Nat Commun 2024; 15:592. [PMID: 38238333 PMCID: PMC10796322 DOI: 10.1038/s41467-024-44859-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 06/26/2023] [Accepted: 01/09/2024] [Indexed: 01/22/2024] Open
Abstract
The Arabidopsis thaliana DREB2A transcription factor interacts with the negative regulator RCD1 and the ACID domain of subunit 25 of the transcriptional co-regulator mediator (Med25) to integrate stress signals for gene expression, with elusive molecular interplay. Using biophysical and structural analyses together with high-throughput screening, we reveal a bivalent binding switch in DREB2A containing an ACID-binding motif (ABS) and the known RCD1-binding motif (RIM). The RIM is lacking in a stress-induced DREB2A splice variant with retained transcriptional activity. ABS and RIM bind to separate sites on Med25-ACID, and NMR analyses show a structurally heterogeneous complex deriving from a DREB2A-ABS proline residue populating cis- and trans-isomers with remote impact on the RIM. The cis-isomer stabilizes an α-helix, while the trans-isomer may introduce energetic frustration facilitating rapid exchange between activators and repressors. Thus, DREB2A uses a post-transcriptionally and post-translationally modulated switch for transcriptional regulation.
Affiliation(s)
- Frederik Friis Theisen
- The REPIN and The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Structural Biology and NMR Laboratory, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Andreas Prestel
- Structural Biology and NMR Laboratory, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Steffie Elkjær
- The REPIN and The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Yannick H A Leurs
- The REPIN and The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Charlotte O'Shea
- The REPIN and The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Kaare Teilum
- Structural Biology and NMR Laboratory, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Birthe B Kragelund
- The REPIN and The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Structural Biology and NMR Laboratory, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Karen Skriver
- The REPIN and The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
|
15
|
Lemma RB, Fuglerud BM, Frampton J, Gabrielsen OS. MYB: A Key Transcription Factor in the Hematopoietic System Subject to Many Levels of Control. Adv Exp Med Biol 2024; 1459:3-29. [PMID: 39017837 DOI: 10.1007/978-3-031-62731-6_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/18/2024]
Abstract
MYB is a master regulator and pioneer factor highly expressed in hematopoietic progenitor cells (HPCs), where it contributes to the reprogramming processes operating during hematopoietic development. MYB plays a complex role, being involved in several lineages of the hematopoietic system. At the molecular level, the MYB gene is subject to intricate regulation at many levels through several enhancer and promoter elements, through transcriptional elongation control, as well as post-transcriptional regulation. The protein is modulated by post-translational modifications (PTMs) such as SUMOylation, which restricts the expression of its downstream targets. Together with a range of interaction partners, cooperating transcription factors (TFs) and epigenetic regulators, MYB orchestrates a fine-tuned symphony of genes expressed during various stages of hematopoiesis. At the same time, the complex MYB system is vulnerable, being a target for unbalanced control and cancer development.
Affiliation(s)
- Roza Berhanu Lemma
- Department of Biosciences, University of Oslo, Oslo, Norway
- Centre for Molecular Medicine Norway (NCMM), Nordic EMBL Partnership, University of Oslo, Oslo, Norway
- Jon Frampton
- Department of Cancer & Genomic Sciences, College of Medicine & Health, University of Birmingham, Edgbaston, Birmingham, UK
|
16
|
DelRosso N, Bintu L. Using High-Throughput Measurements to Identify Principles of Transcriptional and Epigenetic Regulators. Methods Mol Biol 2024; 2842:79-101. [PMID: 39012591 DOI: 10.1007/978-1-0716-4051-7_4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/17/2024]
Abstract
To achieve exquisite control over the epigenome, we need a better predictive understanding of how transcription factors, chromatin regulators, and their individual domains function, both as modular parts and as full proteins. Transcriptional effector domains are one class of protein domains that regulate transcription and chromatin. These effector domains either repress or activate gene expression by interacting with chromatin-modifying enzymes, transcriptional cofactors, and/or general transcriptional machinery. Here, we discuss important design considerations for high-throughput investigations of effector domains, and recent advances in discovering new domains in human cells and in testing how domain function depends on amino acid sequence. For every effector domain, we would like to know the following: What effects do cell type, signaling state, and targeted context have on activation, silencing, and epigenetic memory? Large-scale measurements of transcriptional activities can help systematically answer these questions and identify general rules for how all these parameters affect effector domain activities. Last, we discuss what steps need to be taken to turn a newly discovered effector domain into a robust, precise epigenome editor. With more carefully considered high-throughput investigations, soon we will have better predictive control over the epigenome.
|
17
|
Notin P, Kollasch AW, Ritter D, van Niekerk L, Paul S, Spinner H, Rollins N, Shaw A, Weitzman R, Frazer J, Dias M, Franceschi D, Orenbuch R, Gal Y, Marks DS. ProteinGym: Large-Scale Benchmarks for Protein Design and Fitness Prediction. bioRxiv 2023:2023.12.07.570727. [PMID: 38106144 PMCID: PMC10723403 DOI: 10.1101/2023.12.07.570727] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Indexed: 12/19/2023]
Abstract
Predicting the effects of mutations in proteins is critical to many applications, from understanding genetic disease to designing novel proteins that can address our most pressing challenges in climate, agriculture and healthcare. Despite a surge in machine learning-based protein models to tackle these questions, an assessment of their respective benefits is challenging due to the use of distinct, often contrived, experimental datasets, and the variable performance of models across different protein families. Addressing these challenges requires scale. To that end we introduce ProteinGym, a large-scale and holistic set of benchmarks specifically designed for protein fitness prediction and design. It encompasses both a broad collection of over 250 standardized deep mutational scanning assays, spanning millions of mutated sequences, and curated clinical datasets providing high-quality expert annotations about mutation effects. We devise a robust evaluation framework that combines metrics for both fitness prediction and design, factors in known limitations of the underlying experimental methods, and covers both zero-shot and supervised settings. We report the performance of a diverse set of over 70 high-performing models from various subfields (e.g., alignment-based, inverse folding) in a unified benchmark suite. We open source the corresponding codebase, datasets, MSAs, structures, and model predictions, and develop a user-friendly website that facilitates data access and analysis.
Affiliation(s)
- Ada Shaw
- Applied Mathematics, Harvard University
- Mafalda Dias
- Centre for Genomic Regulation, Universitat Pompeu Fabra
- Yarin Gal
- Computer Science, University of Oxford
|
18
|
Emenecker RJ, Guadalupe K, Shamoon NM, Sukenik S, Holehouse AS. Sequence-ensemble-function relationships for disordered proteins in live cells. bioRxiv 2023:2023.10.29.564547. [PMID: 37961106 PMCID: PMC10634935 DOI: 10.1101/2023.10.29.564547] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 11/15/2023]
Abstract
Intrinsically disordered protein regions (IDRs) are ubiquitous across all kingdoms of life and play a variety of essential cellular roles. IDRs exist in a collection of structurally distinct conformers known as an ensemble. An IDR's amino acid sequence determines its ensemble, which in turn can play an important role in dictating molecular function. Yet a clear link connecting IDR sequence, its ensemble properties, and its molecular function in living cells has not been directly established. Here, we set out to test this sequence-ensemble-function paradigm using a novel computational method (GOOSE) that enables the rational design of libraries of IDRs by systematically varying specific sequence properties. Using ensemble FRET, we measured the ensemble dimensions of a library of rationally designed IDRs in human-derived cell lines, revealing how IDR sequence influences ensemble dimensions in situ. Furthermore, we show that the interplay between sequence and ensemble can tune an IDR's ability to sense changes in cell volume - a de novo molecular function for these synthetic sequences. Our results establish biophysical rules for intracellular sequence-ensemble relationships, enable a new route for understanding how IDR sequences map to function in live cells, and set the ground for the design of synthetic IDRs with de novo function.
Affiliation(s)
- Ryan J. Emenecker
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO
- Center for Biomolecular Condensates (CBC), Washington University in St. Louis, St. Louis, MO
- Karina Guadalupe
- Department of Chemistry and Biochemistry, University of California, Merced, CA
- Center for Cellular and Biomolecular Machines, University of California, Merced, CA
- Nora M. Shamoon
- Center for Cellular and Biomolecular Machines, University of California, Merced, CA
- Quantitative Systems Biology Program, University of California, Merced, CA
- Shahar Sukenik
- Department of Chemistry and Biochemistry, University of California, Merced, CA
- Center for Cellular and Biomolecular Machines, University of California, Merced, CA
- Quantitative Systems Biology Program, University of California, Merced, CA
- Health Sciences Research Institute, University of California, Merced, CA
- Alex S. Holehouse
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO
- Center for Biomolecular Condensates (CBC), Washington University in St. Louis, St. Louis, MO
|
19
|
Holehouse A, Emenecker R, Guadalupe K, Shamoon N, Sukenik S. Sequence-ensemble-function relationships for disordered proteins in live cells. Research Square 2023:rs.3.rs-3501110. [PMID: 37986812 PMCID: PMC10659550 DOI: 10.21203/rs.3.rs-3501110/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2023]
Abstract
Intrinsically disordered protein regions (IDRs) are ubiquitous across all kingdoms of life and play a variety of essential cellular roles. IDRs exist in a collection of structurally distinct conformers known as an ensemble. An IDR's amino acid sequence determines its ensemble, which in turn can play an important role in dictating molecular function. Yet a clear link connecting IDR sequence, its ensemble properties, and its molecular function in living cells has not been systematically established. Here, we set out to test this sequence-ensemble-function paradigm using a novel computational method (GOOSE) that enables the rational design of libraries of IDRs by systematically varying specific sequence properties. Using ensemble FRET, we measured the ensemble dimensions of a library of rationally designed IDRs in human-derived cell lines, revealing how IDR sequence influences ensemble dimensions in situ. Furthermore, we show that the interplay between sequence and ensemble can tune an IDR's ability to sense changes in cell volume - a de novo molecular function for these synthetic sequences. Our results establish biophysical rules for intracellular sequence-ensemble relationships, enable a new route for understanding how IDR sequences map to function in live cells, and set the ground for the design of synthetic IDRs with de novo function.
|
20
|
Kotha SR, Staller MV. Clusters of acidic and hydrophobic residues can predict acidic transcriptional activation domains from protein sequence. Genetics 2023; 225:iyad131. [PMID: 37462277 PMCID: PMC10550315 DOI: 10.1093/genetics/iyad131] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 05/16/2023] [Accepted: 07/03/2023] [Indexed: 10/06/2023] Open
Abstract
Transcription factors activate gene expression in development, homeostasis, and stress with DNA binding domains and activation domains. Although there exist excellent computational models for predicting DNA binding domains from protein sequence, models for predicting activation domains from protein sequence have lagged, particularly in metazoans. We recently developed a simple and accurate predictor of acidic activation domains on human transcription factors. Here, we show how the accuracy of this human predictor arises from the clustering of aromatic, leucine, and acidic residues, which together are necessary for acidic activation domain function. When we combine our predictor with the predictions of convolutional neural network (CNN) models trained in yeast, the intersection is more accurate than individual models, emphasizing that each approach carries orthogonal information. We synthesize these findings into a new set of activation domain predictions on human transcription factors.
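Illustrative note: the idea of scoring a sequence by local clusters of aromatic, leucine, and acidic residues can be sketched as a sliding-window count. The Python below is only a minimal sketch under stated assumptions and is not the authors' predictor: the function names, the window width, and both thresholds are hypothetical placeholders chosen for illustration, since the abstract gives no parameter values.

```python
from typing import List, Tuple

AROMATIC = set("WFY")   # aromatic residues
ACIDIC = set("DE")      # acidic residues


def window_counts(window: str) -> Tuple[int, int, int]:
    """Return (aromatic, leucine, acidic) residue counts for one window."""
    aromatic = sum(ch in AROMATIC for ch in window)
    leucine = window.count("L")
    acidic = sum(ch in ACIDIC for ch in window)
    return aromatic, leucine, acidic


def candidate_windows(
    seq: str,
    width: int = 39,            # hypothetical window width
    min_hydrophobic: int = 7,   # hypothetical threshold on aromatic + leucine
    min_acidic: int = 9,        # hypothetical threshold on acidic residues
) -> List[Tuple[int, int]]:
    """List (start, end) windows that meet both composition thresholds.

    Coordinates are 0-based and end-exclusive. Sequences shorter than
    `width` yield no windows.
    """
    seq = seq.upper()
    hits = []
    for start in range(max(len(seq) - width + 1, 0)):
        aromatic, leucine, acidic = window_counts(seq[start:start + width])
        if aromatic + leucine >= min_hydrophobic and acidic >= min_acidic:
            hits.append((start, start + width))
    return hits
```

Requiring both thresholds in the same window reflects the abstract's statement that the residue classes are needed together; overlapping hits would normally be merged into regions before comparison with another model's predictions.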
Affiliation(s)
- Sanjana R Kotha
- Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720, USA
- Center for Computational Biology, University of California, Berkeley, CA 94720, USA
- Max Valentín Staller
- Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720, USA
- Center for Computational Biology, University of California, Berkeley, CA 94720, USA
- Chan Zuckerberg Biohub—San Francisco, San Francisco, CA 94158, USA
|
21
|
Mukund AX, Tycko J, Allen SJ, Robinson SA, Andrews C, Sinha J, Ludwig CH, Spees K, Bassik MC, Bintu L. High-throughput functional characterization of combinations of transcriptional activators and repressors. Cell Syst 2023; 14:746-763.e5. [PMID: 37543039 PMCID: PMC10642976 DOI: 10.1016/j.cels.2023.07.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2022] [Revised: 06/26/2023] [Accepted: 07/06/2023] [Indexed: 08/07/2023]
Abstract
Despite growing knowledge of the functions of individual human transcriptional effector domains, much less is understood about how multiple effector domains within the same protein combine to regulate gene expression. Here, we measure transcriptional activity for 8,400 effector domain combinations by recruiting them to reporter genes in human cells. In our assay, weak and moderate activation domains synergize to drive strong gene expression, whereas combining strong activators often results in weaker activation. In contrast, repressors combine linearly and produce full gene silencing, and repressor domains often overpower activation domains. We use this information to build a synthetic transcription factor whose function can be tuned between repression and activation independent of recruitment to target genes by using a small-molecule drug. Altogether, we outline the basic principles of how effector domains combine to regulate gene expression and demonstrate their value in building precise and flexible synthetic biology tools. A record of this paper's transparent peer review process is included in the supplemental information.
Affiliation(s)
- Adi X Mukund
- Biophysics Program, Stanford University, Stanford, CA 94305, USA
- Josh Tycko
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Sage J Allen
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Cecelia Andrews
- Department of Developmental Biology, Stanford University, Stanford, CA 94305, USA
- Joydeb Sinha
- Department of Chemical and Systems Biology, Stanford University, Stanford, CA 94305, USA
- Connor H Ludwig
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
- Kaitlyn Spees
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Michael C Bassik
- Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Lacramioara Bintu
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
|
22
|
Hummel NFC, Markel K, Stefani J, Staller MV, Shih PM. Systematic identification of transcriptional activator domains from non-transcription factor proteins in plants and yeast. bioRxiv 2023:2023.09.12.557247. [PMID: 37745555 PMCID: PMC10515812 DOI: 10.1101/2023.09.12.557247] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/26/2023]
Abstract
Transcription factors promote gene expression via trans-regulatory activation domains. Although whole genome scale screens in model organisms (e.g. human, yeast, fly) have helped identify activation domains from transcription factors, such screens have been less extensively used to explore the occurrence of activation domains in non-transcription factor proteins, such as transcriptional coactivators, chromatin regulators and some cytosolic proteins, leaving a blind spot on what role activation domains in these proteins could play in regulating transcription. We utilized the activation domain predictor PADDLE to mine the entire proteomes of two model eukaryotes, Arabidopsis thaliana and Saccharomyces cerevisiae (1). We characterized 18,000 fragments covering predicted activation domains from >800 non-transcription factor genes in both species, and experimentally validated that 89% of proteins contained fragments capable of activating transcription in yeast. Peptides with similar sequence composition show a broad range of activities, which is explained by the arrangement of key amino acids. We also annotated hundreds of nuclear proteins with activation domains as putative coactivators, many of which have never been ascribed any function in plants. Furthermore, our library contains >250 non-nuclear proteins containing peptides with activation domain function across both eukaryotic lineages, suggesting that there are unknown biological roles of these peptides beyond transcription. Finally, we identify and validate short, 'universal' eukaryotic activation domains that activate transcription in both yeast and plants with performance comparable to or stronger than state-of-the-art activation domains. Overall, our dual host screen provides a blueprint on how to systematically discover novel genetic parts for synthetic biology that function across a wide diversity of eukaryotes.
Significance Statement
Activation domains promote transcription and play a critical role in regulating gene expression. Although the mapping of activation domains from transcription factors has been carried out in previous genome-wide screens, their occurrence in non-transcription factors has been less explored. We utilize an activation domain predictor to mine the entire proteomes of Arabidopsis thaliana and Saccharomyces cerevisiae for new activation domains on non-transcription factor proteins. We validate peptides derived from >750 non-transcription factor proteins capable of activating transcription, discovering many potentially new coactivators in plants. Importantly, we identify novel genetic parts that can function across both species, representing unique synthetic biology tools.
|
23
|
Christou-Kent M, Cuartero S, Garcia-Cabau C, Ruehle J, Naderi J, Erber J, Neguembor MV, Plana-Carmona M, Alcoverro-Bertran M, De Andres-Aguayo L, Klonizakis A, Julià-Vilella E, Lynch C, Serrano M, Hnisz D, Salvatella X, Graf T, Stik G. CEBPA phase separation links transcriptional activity and 3D chromatin hubs. Cell Rep 2023; 42:112897. [PMID: 37516962 DOI: 10.1016/j.celrep.2023.112897] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 06/28/2022] [Revised: 06/02/2023] [Accepted: 07/14/2023] [Indexed: 08/01/2023] Open
Abstract
Cell identity is orchestrated through an interplay between transcription factor (TF) action and genome architecture. The mechanisms used by TFs to shape three-dimensional (3D) genome organization remain incompletely understood. Here we present evidence that the lineage-instructive TF CEBPA drives extensive chromatin compartment switching and promotes the formation of long-range chromatin hubs during induced B cell-to-macrophage transdifferentiation. Mechanistically, we find that the intrinsically disordered region (IDR) of CEBPA undergoes in vitro phase separation (PS) dependent on aromatic residues. Both overexpressing B cells and native CEBPA-expressing cell types such as primary granulocyte-macrophage progenitors, liver cells, and trophectoderm cells reveal nuclear CEBPA foci and long-range 3D chromatin hubs at CEBPA-bound regions. In short, we show that CEBPA can undergo PS through its IDR, which may underlie in vivo foci formation and suggest a potential role of PS in regulating CEBPA function.
Affiliation(s)
- Marie Christou-Kent
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, 08003 Barcelona, Spain
- Sergi Cuartero
- Josep Carreras Leukaemia Research Institute (IJC), Badalona, Spain; Germans Trias I Pujol Research Institute (IGTP), Badalona, Spain
- Carla Garcia-Cabau
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac 10, 08028 Barcelona, Spain
- Julia Ruehle
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, 08003 Barcelona, Spain
- Julian Naderi
- Department of Genome Regulation, Max Planck Institute for Molecular Genetics, Ihnestrasse 63-73, 14195 Berlin, Germany
- Julia Erber
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, 08003 Barcelona, Spain
- Maria Victoria Neguembor
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, 08003 Barcelona, Spain
- Marcos Plana-Carmona
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, 08003 Barcelona, Spain
- Luisa De Andres-Aguayo
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, 08003 Barcelona, Spain
- Antonios Klonizakis
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, 08003 Barcelona, Spain
- Cian Lynch
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac 10, 08028 Barcelona, Spain; Altos Labs, Cambridge Institute of Science, Cambridge CB21 6GP, UK
- Manuel Serrano
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac 10, 08028 Barcelona, Spain; Altos Labs, Cambridge Institute of Science, Cambridge CB21 6GP, UK
- Denes Hnisz
- Department of Genome Regulation, Max Planck Institute for Molecular Genetics, Ihnestrasse 63-73, 14195 Berlin, Germany
- Xavier Salvatella
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac 10, 08028 Barcelona, Spain; ICREA, Passeig Lluís Companys 23, 08010 Barcelona, Spain
- Thomas Graf
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, 08003 Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Grégoire Stik
- Josep Carreras Leukaemia Research Institute (IJC), Badalona, Spain
|
24
|
Meeussen JVW, Pomp W, Brouwer I, de Jonge WJ, Patel HP, Lenstra TL. Transcription factor clusters enable target search but do not contribute to target gene activation. Nucleic Acids Res 2023; 51:5449-5468. [PMID: 36987884 PMCID: PMC10287935 DOI: 10.1093/nar/gkad227] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 12/23/2022] [Revised: 03/06/2023] [Accepted: 03/16/2023] [Indexed: 03/30/2023] Open
Abstract
Many transcription factors (TFs) localize in nuclear clusters of locally increased concentrations, but how TF clustering is regulated and how it influences gene expression is not well understood. Here, we use quantitative microscopy in living cells to study the regulation and function of clustering of the budding yeast TF Gal4 in its endogenous context. Our results show that Gal4 forms clusters that overlap with the GAL loci. Cluster number, density and size are regulated in different growth conditions by the Gal4-inhibitor Gal80 and Gal4 concentration. Gal4 truncation mutants reveal that Gal4 clustering is facilitated by, but does not completely depend on DNA binding and intrinsically disordered regions. Moreover, we discover that clustering acts as a double-edged sword: self-interactions aid TF recruitment to target genes, but recruited Gal4 molecules that are not DNA-bound do not contribute to, and may even inhibit, transcription activation. We propose that cells need to balance the different effects of TF clustering on target search and transcription activation to facilitate proper gene expression.
Affiliation(s)
- Joseph V W Meeussen
- Division of Gene Regulation, The Netherlands Cancer Institute, Oncode Institute, 1066CX Amsterdam, The Netherlands
- Wim Pomp
- Division of Gene Regulation, The Netherlands Cancer Institute, Oncode Institute, 1066CX Amsterdam, The Netherlands
- Ineke Brouwer
- Division of Gene Regulation, The Netherlands Cancer Institute, Oncode Institute, 1066CX Amsterdam, The Netherlands
- Wim J de Jonge
- Division of Gene Regulation, The Netherlands Cancer Institute, Oncode Institute, 1066CX Amsterdam, The Netherlands
- Heta P Patel
- Division of Gene Regulation, The Netherlands Cancer Institute, Oncode Institute, 1066CX Amsterdam, The Netherlands
- Tineke L Lenstra
- Division of Gene Regulation, The Netherlands Cancer Institute, Oncode Institute, 1066CX Amsterdam, The Netherlands
|
25
|
Ludwig CH, Thurm AR, Morgens DW, Yang KJ, Tycko J, Bassik MC, Glaunsinger BA, Bintu L. High-throughput discovery and characterization of viral transcriptional effectors in human cells. Cell Syst 2023; 14:482-500.e8. [PMID: 37348463 PMCID: PMC10350249 DOI: 10.1016/j.cels.2023.05.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2022] [Revised: 02/17/2023] [Accepted: 05/23/2023] [Indexed: 06/24/2023]
Abstract
Viruses encode transcriptional regulatory proteins critical for controlling viral and host gene expression. Given their multifunctional nature and high sequence divergence, it is unclear which viral proteins can affect transcription and which specific sequences contribute to this function. Using a high-throughput assay, we measured the transcriptional regulatory potential of over 60,000 protein tiles across ∼1,500 proteins from 11 coronaviruses and all nine human herpesviruses. We discovered hundreds of transcriptional effector domains, including a conserved repression domain in all coronavirus Spike homologs, dual activation-repression domains in viral interferon regulatory factors (VIRFs), and an activation domain in six herpesvirus homologs of the single-stranded DNA-binding protein that we show is important for viral replication and late gene expression in Kaposi's sarcoma-associated herpesvirus (KSHV). For the effector domains we identified, we investigated their mechanisms via high-throughput sequence and chemical perturbations, pinpointing sequence motifs essential for function. This work massively expands viral protein annotations, serving as a springboard for studying their biological and health implications and providing new candidates for compact gene regulation tools.
Affiliation(s)
- Connor H Ludwig
  - Bioengineering Department, Stanford University, Stanford, CA 94305, USA
- Abby R Thurm
  - Biophysics Graduate Program, Stanford University, Stanford, CA 94305, USA
- David W Morgens
  - Department of Plant and Microbial Biology, UC Berkeley, Berkeley, CA 94720, USA
- Kevin J Yang
  - Department of Molecular and Cell Biology, UC Berkeley, Berkeley, CA 94720, USA
- Josh Tycko
  - Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Michael C Bassik
  - Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Britt A Glaunsinger
  - Department of Plant and Microbial Biology, UC Berkeley, Berkeley, CA 94720, USA; Department of Molecular and Cell Biology, UC Berkeley, Berkeley, CA 94720, USA; Howard Hughes Medical Institute, UC Berkeley, Berkeley, CA 94720, USA
- Lacramioara Bintu
  - Bioengineering Department, Stanford University, Stanford, CA 94305, USA
26
Jonas F, Carmi M, Krupkin B, Steinberger J, Brodsky S, Jana T, Barkai N. The molecular grammar of protein disorder guiding genome-binding locations. Nucleic Acids Res 2023; 51:4831-4844. [PMID: 36938874 PMCID: PMC10250222 DOI: 10.1093/nar/gkad184] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/15/2022] [Revised: 01/25/2023] [Accepted: 03/15/2023] [Indexed: 03/21/2023] Open
Abstract
Intrinsically disordered regions (IDRs) direct transcription factors (TFs) towards selected genomic occurrences of their binding motif, as exemplified by budding yeast's Msn2. However, the sequence basis of IDR-directed TF binding selectivity remains unknown. To reveal this sequence grammar, we analyze the genomic localizations of >100 designed IDR mutants, each carrying up to 122 mutations within this 567-AA region. Our data points at multivalent interactions, carried by hydrophobic (mostly aliphatic) residues dispersed within a disordered environment and independent of linear sequence motifs, as the key determinants of Msn2 genomic localization. The implications of our results for the mechanistic basis of IDR-based TF binding preferences are discussed.
Affiliation(s)
- Felix Jonas
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Miri Carmi
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Beniamin Krupkin
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Joseph Steinberger
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Sagie Brodsky
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Tamar Jana
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Naama Barkai
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
27
Reynaud K, McGeachy AM, Noble D, Meacham ZA, Ingolia NT. Surveying the global landscape of post-transcriptional regulators. Nat Struct Mol Biol 2023; 30:740-752. [PMID: 37231154 PMCID: PMC10279529 DOI: 10.1038/s41594-023-00999-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 07/01/2021] [Accepted: 04/17/2023] [Indexed: 05/27/2023]
Abstract
Numerous proteins regulate gene expression by modulating mRNA translation and decay. To uncover the full scope of these post-transcriptional regulators, we conducted an unbiased survey that quantifies regulatory activity across the budding yeast proteome and delineates the protein domains responsible for these effects. Our approach couples a tethered function assay with quantitative single-cell fluorescence measurements to analyze ~50,000 protein fragments and determine their effects on a tethered mRNA. We characterize hundreds of strong regulators, which are enriched for canonical and unconventional mRNA-binding proteins. Regulatory activity typically maps outside the RNA-binding domains themselves, highlighting a modular architecture that separates mRNA targeting from post-transcriptional regulation. Activity often aligns with intrinsically disordered regions that can interact with other proteins, even in core mRNA translation and degradation factors. Our results thus reveal networks of interacting proteins that control mRNA fate and illuminate the molecular basis for post-transcriptional gene regulation.
Affiliation(s)
- Kendra Reynaud
  - California Institute for Quantitative Biosciences, University of California, Berkeley, Berkeley, CA, USA
- Anna M McGeachy
  - Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
- David Noble
  - Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
- Zuriah A Meacham
  - Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
- Nicholas T Ingolia
  - California Institute for Quantitative Biosciences, University of California, Berkeley, Berkeley, CA, USA
  - Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
28
DelRosso N, Tycko J, Suzuki P, Andrews C, Aradhana, Mukund A, Liongson I, Ludwig C, Spees K, Fordyce P, Bassik MC, Bintu L. Large-scale mapping and mutagenesis of human transcriptional effector domains. Nature 2023; 616:365-372. [PMID: 37020022 PMCID: PMC10484233 DOI: 10.1038/s41586-023-05906-y] [Citation(s) in RCA: 39] [Impact Index Per Article: 39.0] [Received: 07/14/2022] [Accepted: 03/01/2023] [Indexed: 04/07/2023]
Abstract
Human gene expression is regulated by more than 2,000 transcription factors and chromatin regulators [1,2]. Effector domains within these proteins can activate or repress transcription. However, for many of these regulators we do not know what type of effector domains they contain, their location in the protein, their activation and repression strengths, and the sequences that are necessary for their functions. Here, we systematically measure the effector activity of more than 100,000 protein fragments tiling across most chromatin regulators and transcription factors in human cells (2,047 proteins). By testing the effect they have when recruited at reporter genes, we annotate 374 activation domains and 715 repression domains, roughly 80% of which are new and have not been previously annotated [3-5]. Rational mutagenesis and deletion scans across all the effector domains reveal aromatic and/or leucine residues interspersed with acidic, proline, serine and/or glutamine residues are necessary for activation domain activity. Furthermore, most repression domain sequences contain sites for small ubiquitin-like modifier (SUMO)ylation, short interaction motifs for recruiting corepressors or are structured binding domains for recruiting other repressive proteins. We discover bifunctional domains that can both activate and repress, some of which dynamically split a cell population into high- and low-expression subpopulations. Our systematic annotation and characterization of effector domains provide a rich resource for understanding the function of human transcription factors and chromatin regulators, engineering compact tools for controlling gene expression and refining predictive models of effector domain function.
Affiliation(s)
- Josh Tycko
  - Department of Genetics, Stanford University, Stanford, CA, USA
- Peter Suzuki
  - Department of Bioengineering, Stanford University, Stanford, CA, USA
- Cecelia Andrews
  - Department of Developmental Biology, Stanford University, Stanford, CA, USA
- Aradhana
  - Department of Genetics, Stanford University, Stanford, CA, USA
- Adi Mukund
  - Biophysics Program, Stanford University, Stanford, CA, USA
- Ivan Liongson
  - Department of Biology, Stanford University, Stanford, CA, USA
- Connor Ludwig
  - Department of Bioengineering, Stanford University, Stanford, CA, USA
- Kaitlyn Spees
  - Department of Genetics, Stanford University, Stanford, CA, USA
- Polly Fordyce
  - Department of Genetics, Stanford University, Stanford, CA, USA
  - Department of Bioengineering, Stanford University, Stanford, CA, USA
  - ChEM-H Institute, Stanford University, Stanford, CA, USA
  - Chan Zuckerberg Biohub, San Francisco, CA, USA
- Lacramioara Bintu
  - Department of Bioengineering, Stanford University, Stanford, CA, USA
29
Chandra S, Manjunath K, Asok A, Varadarajan R. Mutational scan inferred binding energetics and structure in intrinsically disordered protein CcdA. Protein Sci 2023; 32:e4580. [PMID: 36714997 PMCID: PMC9951195 DOI: 10.1002/pro.4580] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2022] [Revised: 01/02/2023] [Accepted: 01/25/2023] [Indexed: 01/31/2023]
Abstract
Unlike globular proteins, mutational effects on the function of Intrinsically Disordered Proteins (IDPs) are not well-studied. Deep Mutational Scanning of a yeast surface displayed mutant library yields insights into sequence-function relationships in the CcdA IDP. The approach enables facile prediction of interface residues and local structural signatures of the bound conformation. In contrast to previous titration-based approaches which use a number of ligand concentrations, we show that use of a single rationally chosen ligand concentration can provide quantitative estimates of relative binding constants for large numbers of protein variants. This is because the extended interface of IDP ensures that energetic effects of point mutations are spread over a much smaller range than for globular proteins. Our data also provides insights into the much-debated role of helicity and disorder in partner binding of IDPs. Based on this exhaustive mutational sensitivity dataset, a rudimentary model was developed in an attempt to predict mutational effects on binding affinity of IDPs that form alpha-helical structures upon binding.
Affiliation(s)
- Aparna Asok
  - Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India
30
Klaus L, de Almeida BP, Vlasova A, Nemčko F, Schleiffer A, Bergauer K, Hofbauer L, Rath M, Stark A. Systematic identification and characterization of repressive domains in Drosophila transcription factors. EMBO J 2023; 42:e112100. [PMID: 36545802 PMCID: PMC9890238 DOI: 10.15252/embj.2022112100] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 07/13/2022] [Revised: 11/21/2022] [Accepted: 12/01/2022] [Indexed: 12/24/2022] Open
Abstract
All multicellular life relies on differential gene expression, determined by regulatory DNA elements and DNA-binding transcription factors that mediate activation and repression via cofactor recruitment. While activators have been extensively characterized, repressors are less well studied: the identities and properties of their repressive domains (RDs) are typically unknown and the specific co-repressors (CoRs) they recruit have not been determined. Here, we develop a high-throughput, next-generation sequencing-based screening method, repressive-domain (RD)-seq, to systematically identify RDs in complex DNA-fragment libraries. Screening more than 200,000 fragments covering the coding sequences of all transcription-related proteins in Drosophila melanogaster, we identify 195 RDs in known repressors and in proteins not previously associated with repression. Many RDs contain recurrent short peptide motifs, which are conserved between fly and human and are required for RD function, as demonstrated by motif mutagenesis. Moreover, we show that RDs that contain one of five distinct repressive motifs interact with and depend on different CoRs, such as Groucho, CtBP, Sin3A, or Smrter. These findings advance our understanding of repressors, their sequences, and the functional impact of sequence-altering mutations and should provide a valuable resource for further studies.
Affiliation(s)
- Loni Klaus
  - Research Institute of Molecular Pathology (IMP), Vienna BioCenter (VBC), Vienna, Austria
  - Vienna BioCenter PhD Program, Doctoral School of the University of Vienna and Medical University of Vienna, Vienna, Austria
- Bernardo P de Almeida
  - Research Institute of Molecular Pathology (IMP), Vienna BioCenter (VBC), Vienna, Austria
  - Vienna BioCenter PhD Program, Doctoral School of the University of Vienna and Medical University of Vienna, Vienna, Austria
- Anna Vlasova
  - Research Institute of Molecular Pathology (IMP), Vienna BioCenter (VBC), Vienna, Austria
- Filip Nemčko
  - Research Institute of Molecular Pathology (IMP), Vienna BioCenter (VBC), Vienna, Austria
  - Vienna BioCenter PhD Program, Doctoral School of the University of Vienna and Medical University of Vienna, Vienna, Austria
- Alexander Schleiffer
  - Research Institute of Molecular Pathology (IMP), Vienna BioCenter (VBC), Vienna, Austria
  - Institute of Molecular Biotechnology (IMBA), Vienna BioCenter (VBC), Vienna, Austria
- Katharina Bergauer
  - Research Institute of Molecular Pathology (IMP), Vienna BioCenter (VBC), Vienna, Austria
- Lorena Hofbauer
  - Research Institute of Molecular Pathology (IMP), Vienna BioCenter (VBC), Vienna, Austria
  - Vienna BioCenter PhD Program, Doctoral School of the University of Vienna and Medical University of Vienna, Vienna, Austria
- Martina Rath
  - Research Institute of Molecular Pathology (IMP), Vienna BioCenter (VBC), Vienna, Austria
- Alexander Stark
  - Research Institute of Molecular Pathology (IMP), Vienna BioCenter (VBC), Vienna, Austria
  - Medical University of Vienna, Vienna BioCenter (VBC), Vienna, Austria
31
Conti MM, Li R, Narváez Ramos MA, Zhu LJ, Fazzio TG, Benanti JA. Phosphosite Scanning reveals a complex phosphorylation code underlying CDK-dependent activation of Hcm1. Nat Commun 2023; 14:310. [PMID: 36658165 PMCID: PMC9852432 DOI: 10.1038/s41467-023-36035-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 08/01/2022] [Accepted: 01/11/2023] [Indexed: 01/20/2023] Open
Abstract
Ordered cell cycle progression is coordinated by cyclin dependent kinases (CDKs). CDKs often phosphorylate substrates at multiple sites clustered within disordered regions. However, for most substrates, it is not known which phosphosites are functionally important. We developed a high-throughput approach, Phosphosite Scanning, that tests the importance of each phosphosite within a multisite phosphorylated domain. We show that Phosphosite Scanning identifies multiple combinations of phosphosites that can regulate protein function and reveals specific phosphorylations that are required for phosphorylation at additional sites within a domain. We applied this approach to the yeast transcription factor Hcm1, a conserved regulator of mitotic genes that is critical for accurate chromosome segregation. Phosphosite Scanning revealed a complex CDK-regulatory circuit that mediates Cks1-dependent phosphorylation of key activating sites in vivo. These results illuminate the mechanism of Hcm1 activation by CDK and establish Phosphosite Scanning as a powerful tool for decoding multisite phosphorylated domains.
Affiliation(s)
- Michelle M Conti
  - Department of Molecular, Cell and Cancer Biology, University of Massachusetts Chan Medical School, Worcester, MA, 01605, USA
- Rui Li
  - Department of Molecular, Cell and Cancer Biology, University of Massachusetts Chan Medical School, Worcester, MA, 01605, USA
- Michelle A Narváez Ramos
  - Department of Molecular, Cell and Cancer Biology, University of Massachusetts Chan Medical School, Worcester, MA, 01605, USA
- Lihua Julie Zhu
  - Department of Molecular, Cell and Cancer Biology, University of Massachusetts Chan Medical School, Worcester, MA, 01605, USA
  - Program in Bioinformatics and Integrative Biology, University of Massachusetts Chan Medical School, Worcester, MA, 01605, USA
- Thomas G Fazzio
  - Department of Molecular, Cell and Cancer Biology, University of Massachusetts Chan Medical School, Worcester, MA, 01605, USA
- Jennifer A Benanti
  - Department of Molecular, Cell and Cancer Biology, University of Massachusetts Chan Medical School, Worcester, MA, 01605, USA
32
Wei H, Li X. Deep mutational scanning: A versatile tool in systematically mapping genotypes to phenotypes. Front Genet 2023; 14:1087267. [PMID: 36713072 PMCID: PMC9878224 DOI: 10.3389/fgene.2023.1087267] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 11/02/2022] [Accepted: 01/02/2023] [Indexed: 01/13/2023] Open
Abstract
Unveiling how genetic variations lead to phenotypic variations is one of the key questions in evolutionary biology, genetics, and biomedical research. Deep mutational scanning (DMS) technology has allowed the mapping of tens of thousands of genetic variations to phenotypic variations efficiently and economically. Since its first systematic introduction about a decade ago, we have witnessed the use of deep mutational scanning in many research areas leading to scientific breakthroughs. Also, the methods in each step of deep mutational scanning have become much more versatile thanks to the oligo-synthesizing technology, high-throughput phenotyping methods and deep sequencing technology. However, each specific possible step of deep mutational scanning has its pros and cons, and some limitations still await further technological development. Here, we discuss recent scientific accomplishments achieved through the deep mutational scanning and describe widely used methods in each step of deep mutational scanning. We also compare these different methods and analyze their advantages and disadvantages, providing insight into how to design a deep mutational scanning study that best suits the aims of the readers' projects.
Affiliation(s)
- Huijin Wei
  - Zhejiang University-University of Edinburgh Institute, Zhejiang University, Haining, Zhejiang, China
- Xianghua Li
  - Zhejiang University-University of Edinburgh Institute, Zhejiang University, Haining, Zhejiang, China
  - Deanery of Biomedical Sciences, University of Edinburgh, Edinburgh, United Kingdom
  - The Second Affiliated Hospital of Zhejiang University, Hangzhou, Zhejiang, China
  - Biomedical and Health Translational Centre of Zhejiang Province, Haining, Zhejiang, China
  - Correspondence: Xianghua Li
33
Fu Y, Bedő J, Papenfuss AT, Rubin AF. Integrating deep mutational scanning and low-throughput mutagenesis data to predict the impact of amino acid variants. Gigascience 2022; 12:giad073. [PMID: 37721410 PMCID: PMC10506130 DOI: 10.1093/gigascience/giad073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2023] [Revised: 07/02/2023] [Accepted: 08/23/2023] [Indexed: 09/19/2023] Open
Abstract
BACKGROUND: Evaluating the impact of amino acid variants has been a critical challenge for studying protein function and interpreting genomic data. High-throughput experimental methods like deep mutational scanning (DMS) can measure the effect of large numbers of variants in a target protein, but because DMS studies have not been performed on all proteins, researchers also model DMS data computationally to estimate variant impacts by predictors. RESULTS: In this study, we extended a linear regression-based predictor to explore whether incorporating data from alanine scanning (AS), a widely used low-throughput mutagenesis method, would improve prediction results. To evaluate our model, we collected 146 AS datasets, mapping to 54 DMS datasets across 22 distinct proteins. CONCLUSIONS: We show that improved model performance depends on the compatibility of the DMS and AS assays, and the scale of improvement is closely related to the correlation between DMS and AS results.
Affiliation(s)
- Yunfan Fu
  - The Walter and Eliza Hall Institute of Medical Research, Bioinformatics Division, 1G Royal Pde, Parkville, Victoria 3052, Australia
  - The University of Melbourne, Department of Medical Biology, Parkville, Victoria 3010, Australia
- Justin Bedő
  - The Walter and Eliza Hall Institute of Medical Research, Bioinformatics Division, 1G Royal Pde, Parkville, Victoria 3052, Australia
  - The University of Melbourne, Department of Medical Biology, Parkville, Victoria 3010, Australia
- Anthony T Papenfuss
  - The Walter and Eliza Hall Institute of Medical Research, Bioinformatics Division, 1G Royal Pde, Parkville, Victoria 3052, Australia
  - The University of Melbourne, Department of Medical Biology, Parkville, Victoria 3010, Australia
  - Peter MacCallum Cancer Centre, Melbourne, Victoria 3000, Australia
- Alan F Rubin
  - The Walter and Eliza Hall Institute of Medical Research, Bioinformatics Division, 1G Royal Pde, Parkville, Victoria 3052, Australia
  - The University of Melbourne, Department of Medical Biology, Parkville, Victoria 3010, Australia
34
Ibrahim Z, Wang T, Destaing O, Salvi N, Hoghoughi N, Chabert C, Rusu A, Gao J, Feletto L, Reynoird N, Schalch T, Zhao Y, Blackledge M, Khochbin S, Panne D. Structural insights into p300 regulation and acetylation-dependent genome organisation. Nat Commun 2022; 13:7759. [PMID: 36522330 PMCID: PMC9755262 DOI: 10.1038/s41467-022-35375-2] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 07/25/2022] [Accepted: 11/29/2022] [Indexed: 12/23/2022] Open
Abstract
Histone modifications are deposited by chromatin modifying enzymes and read out by proteins that recognize the modified state. BRD4-NUT is an oncogenic fusion protein of the acetyl lysine reader BRD4 that binds to the acetylase p300 and enables formation of long-range intra- and interchromosomal interactions. We here examine how acetylation reading and writing enable formation of such interactions. We show that NUT contains an acidic transcriptional activation domain that binds to the TAZ2 domain of p300. We use NMR to investigate the structure of the complex and found that the TAZ2 domain has an autoinhibitory role for p300. NUT-TAZ2 interaction or mutations found in cancer that interfere with autoinhibition by TAZ2 allosterically activate p300. p300 activation results in a self-organizing, acetylation-dependent feed-forward reaction that enables long-range interactions by bromodomain multivalent acetyl-lysine binding. We discuss the implications for chromatin organisation, gene regulation and dysregulation in disease.
Affiliation(s)
- Ziad Ibrahim
  - Leicester Institute of Structural and Chemical Biology, Department of Molecular and Cell Biology, University of Leicester, Leicester, UK
  - Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, United States
- Tao Wang
  - CNRS UMR 5309, INSERM U1209, Université Grenoble Alpes, Institute for Advanced Biosciences, Grenoble, France
- Olivier Destaing
  - CNRS UMR 5309, INSERM U1209, Université Grenoble Alpes, Institute for Advanced Biosciences, Grenoble, France
- Nicola Salvi
  - Institut de Biologie Structurale, CNRS, CEA, UGA, Grenoble, France
- Naghmeh Hoghoughi
  - CNRS UMR 5309, INSERM U1209, Université Grenoble Alpes, Institute for Advanced Biosciences, Grenoble, France
- Clovis Chabert
  - CNRS UMR 5309, INSERM U1209, Université Grenoble Alpes, Institute for Advanced Biosciences, Grenoble, France
- Alexandra Rusu
  - Leicester Institute of Structural and Chemical Biology, Department of Molecular and Cell Biology, University of Leicester, Leicester, UK
- Jinjun Gao
  - Ben May Department of Cancer Research, The University of Chicago, Chicago, IL, 60637, USA
- Leonardo Feletto
  - Leicester Institute of Structural and Chemical Biology, Department of Molecular and Cell Biology, University of Leicester, Leicester, UK
- Nicolas Reynoird
  - CNRS UMR 5309, INSERM U1209, Université Grenoble Alpes, Institute for Advanced Biosciences, Grenoble, France
- Thomas Schalch
  - Leicester Institute of Structural and Chemical Biology, Department of Molecular and Cell Biology, University of Leicester, Leicester, UK
- Yingming Zhao
  - Ben May Department of Cancer Research, The University of Chicago, Chicago, IL, 60637, USA
- Saadi Khochbin
  - CNRS UMR 5309, INSERM U1209, Université Grenoble Alpes, Institute for Advanced Biosciences, Grenoble, France
- Daniel Panne
  - Leicester Institute of Structural and Chemical Biology, Department of Molecular and Cell Biology, University of Leicester, Leicester, UK
35
Shinn MK, Cohan MC, Bullock JL, Ruff KM, Levin PA, Pappu RV. Connecting sequence features within the disordered C-terminal linker of Bacillus subtilis FtsZ to functions and bacterial cell division. Proc Natl Acad Sci U S A 2022; 119:e2211178119. [PMID: 36215496 PMCID: PMC9586301 DOI: 10.1073/pnas.2211178119] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 06/29/2022] [Accepted: 09/20/2022] [Indexed: 11/21/2022] Open
Abstract
Intrinsically disordered regions (IDRs) can function as autoregulators of folded enzymes to which they are tethered. One example is the bacterial cell division protein FtsZ. This includes a folded core and a C-terminal tail (CTT) that encompasses a poorly conserved, disordered C-terminal linker (CTL) and a well-conserved 17-residue C-terminal peptide (CT17). Sites for GTPase activity of FtsZs are formed at the interface between GTP binding sites and T7 loops on cores of adjacent subunits within dimers. Here, we explore the basis of autoregulatory functions of the CTT in Bacillus subtilis FtsZ (Bs-FtsZ). Molecular simulations show that the CT17 of Bs-FtsZ makes statistically significant CTL-mediated contacts with the T7 loop. Statistical coupling analysis of more than 1,000 sequences from FtsZ orthologs reveals clear covariation of the T7 loop and the CT17 with most of the core domain, whereas the CTL is under independent selection. Despite this, we discover the conservation of nonrandom sequence patterns within CTLs across orthologs. To test how the nonrandom patterns of CTLs mediate CTT-core interactions and modulate FtsZ functionalities, we designed Bs-FtsZ variants by altering the patterning of oppositely charged residues within the CTL. Such alterations disrupt the core-CTT interactions, lead to anomalous assembly and inefficient GTP hydrolysis in vitro and protein degradation, aberrant assembly, and disruption of cell division in vivo. Our findings suggest that viable CTLs in FtsZs are likely to be IDRs that encompass nonrandom, functionally relevant sequence patterns that also preserve three-way covariation of the CT17, the T7 loop, and core domain.
Affiliation(s)
- Min Kyung Shinn
  - Department of Biomedical Engineering, Washington University in St. Louis, St. Louis, MO 63130
  - Center for Biomolecular Condensates, James McKelvey School of Engineering, Washington University in St. Louis, St. Louis, MO 63130
- Megan C. Cohan
  - Department of Biomedical Engineering, Washington University in St. Louis, St. Louis, MO 63130
- Jessie L. Bullock
  - Department of Biology, Washington University in St. Louis, St. Louis, MO 63130
- Kiersten M. Ruff
  - Department of Biomedical Engineering, Washington University in St. Louis, St. Louis, MO 63130
  - Center for Biomolecular Condensates, James McKelvey School of Engineering, Washington University in St. Louis, St. Louis, MO 63130
- Petra A. Levin
  - Department of Biology, Washington University in St. Louis, St. Louis, MO 63130
- Rohit V. Pappu
  - Department of Biomedical Engineering, Washington University in St. Louis, St. Louis, MO 63130
  - Center for Biomolecular Condensates, James McKelvey School of Engineering, Washington University in St. Louis, St. Louis, MO 63130
36
Staller MV. Transcription factors perform a 2-step search of the nucleus. Genetics 2022; 222:iyac111. [PMID: 35939561 PMCID: PMC9526044 DOI: 10.1093/genetics/iyac111] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 05/10/2022] [Accepted: 07/14/2022] [Indexed: 01/02/2023] Open
Abstract
Transcription factors regulate gene expression by binding to regulatory DNA and recruiting regulatory protein complexes. The DNA-binding and protein-binding functions of transcription factors are traditionally described as independent functions performed by modular protein domains. Here, I argue that genome binding can be a 2-part process with both DNA-binding and protein-binding steps, enabling transcription factors to perform a 2-step search of the nucleus to find their appropriate binding sites in a eukaryotic genome. I support this hypothesis with new and old results in the literature, discuss how this hypothesis parsimoniously resolves outstanding problems, and present testable predictions.
Affiliation(s)
- Max Valentín Staller
  - Corresponding author: Center for Computational Biology, University of California, Berkeley, Berkeley, CA 94720, USA.

37
Baughman HER, Narang D, Chen W, Villagrán Suárez AC, Lee J, Bachochin MJ, Gunther TR, Wolynes PG, Komives EA. An intrinsically disordered transcription activation domain increases the DNA binding affinity and reduces the specificity of NFκB p50/RelA. J Biol Chem 2022; 298:102349. [PMID: 35934050 PMCID: PMC9440430 DOI: 10.1016/j.jbc.2022.102349] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 04/20/2022] [Revised: 07/25/2022] [Accepted: 07/26/2022] [Indexed: 12/03/2022]
Abstract
Many transcription factors contain intrinsically disordered transcription activation domains (TADs), which mediate interactions with coactivators to activate transcription. Historically, DNA-binding domains and TADs have been considered as modular units, but recent studies have shown that TADs can influence DNA binding. Whether these results can be generalized to more TADs is not clear. Here, we biophysically characterized the NFκB p50/RelA heterodimer including the RelA TAD and investigated the TAD's influence on NFκB-DNA interactions. In solution, we show the RelA TAD is disordered but compact, with helical tendency in two regions that interact with coactivators. We determined that the presence of the TAD increased the stoichiometry of NFκB-DNA complexes containing promoter DNA sequences with tandem κB recognition motifs by promoting the binding of NFκB dimers in excess of the number of κB sites. In addition, we measured the binding affinity of p50/RelA for DNA containing tandem κB sites and single κB sites. While the presence of the TAD enhanced the binding affinity of p50/RelA for all κB sequences tested, it also increased the affinity for nonspecific DNA sequences by over 10-fold, leading to an overall decrease in specificity for κB DNA sequences. In contrast, previous studies have generally reported that TADs decrease DNA-binding affinity and increase sequence specificity. Our results reveal a novel function of the RelA TAD in promoting binding to nonconsensus DNA, which sheds light on previous observations of extensive nonconsensus DNA binding by NFκB in vivo in response to strong inflammatory signals.
Affiliation(s)
- Hannah E R Baughman
  - Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, California, USA
- Dominic Narang
  - Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, California, USA
- Wei Chen
  - Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, California, USA
- Amalia C Villagrán Suárez
  - Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, California, USA
- Joan Lee
  - Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, California, USA
- Maxwell J Bachochin
  - Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, California, USA
- Tristan R Gunther
  - Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, California, USA
- Peter G Wolynes
  - Department of Chemistry and Center for Theoretical Biological Physics, Rice University, Houston, Texas, USA
- Elizabeth A Komives
  - Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, California, USA.

38
Sangster AG, Zarin T, Moses AM. Evolution of short linear motifs and disordered proteins Topic: yeast as model system to study evolution. Curr Opin Genet Dev 2022; 76:101964. [PMID: 35939968 DOI: 10.1016/j.gde.2022.101964] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 05/04/2022] [Revised: 06/29/2022] [Accepted: 07/08/2022] [Indexed: 11/26/2022]
Abstract
Evolutionary preservation of protein structure had a major influence on the field of molecular evolution: changes in individual amino acids that did not disrupt protein folding would either have no effect or subtly change the 'lock' so that it could fit a new 'key'. Homology of individual amino acids could be confidently assigned through sequence alignments, and models of evolution could be tested. This view of molecular evolution excluded large regions of proteins that could not be confidently aligned, such as intrinsically disordered regions (IDRs) that do not fold into stable structures. In the last decade, major progress has been made in understanding the evolution of IDRs, much of it facilitated by new experimental and computational approaches in yeast. Here, we review this progress as well as several still outstanding questions.
Affiliation(s)
- Ami G Sangster
  - Cell & Systems Biology, University of Toronto, 25 Harbord St., Toronto, ON M5S 3G5, Canada
- Taraneh Zarin
  - Cell & Systems Biology, University of Toronto, 25 Harbord St., Toronto, ON M5S 3G5, Canada. https://twitter.com/@taraneh_z
- Alan M Moses
  - Cell & Systems Biology, University of Toronto, 25 Harbord St., Toronto, ON M5S 3G5, Canada.

39
Tomaž Š, Gruden K, Coll A. TGA transcription factors-Structural characteristics as basis for functional variability. Front Plant Sci 2022; 13:935819. [PMID: 35958211 PMCID: PMC9360754 DOI: 10.3389/fpls.2022.935819] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 05/04/2022] [Accepted: 07/04/2022] [Indexed: 06/15/2023]
Abstract
TGA transcription factors are essential regulators of various cellular processes, their activity connected to different hormonal pathways, interacting proteins and regulatory elements. Belonging to the basic region leucine zipper (bZIP) family, TGAs operate by binding to their target DNA sequence as dimers through a conserved bZIP domain. Despite sharing the core DNA-binding sequence, the TGA paralogues exert somewhat different DNA-binding preferences. Sequence variability of their N- and C-terminal protein parts indicates their importance in defining TGA functional specificity through interactions with diverse proteins, affecting their DNA-binding properties. In this review, we provide a short and concise summary on plant TGA transcription factors from a structural point of view, including the relation of their structural characteristics to their functional roles in transcription regulation.
Affiliation(s)
- Špela Tomaž
  - Department of Biotechnology and Systems Biology, National Institute of Biology, Ljubljana, Slovenia
  - Jožef Stefan International Postgraduate School, Ljubljana, Slovenia
- Kristina Gruden
  - Department of Biotechnology and Systems Biology, National Institute of Biology, Ljubljana, Slovenia
- Anna Coll
  - Department of Biotechnology and Systems Biology, National Institute of Biology, Ljubljana, Slovenia

40
Loell K, Wu Y, Staller MV, Cohen B. Activation domains can decouple the mean and noise of gene expression. Cell Rep 2022; 40:111118. [PMID: 35858548 PMCID: PMC9912357 DOI: 10.1016/j.celrep.2022.111118] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2021] [Revised: 01/18/2022] [Accepted: 06/28/2022] [Indexed: 11/03/2022]
Abstract
Regulatory mechanisms set a gene's average level of expression, but a gene's expression constantly fluctuates around that average. These stochastic fluctuations, or expression noise, play a role in cell-fate transitions, bet hedging in microbes, and the development of chemotherapeutic resistance in cancer. An outstanding question is what regulatory mechanisms contribute to noise. Here, we demonstrate that, for a fixed mean level of expression, strong activation domains (ADs) at low abundance produce high expression noise, while weak ADs at high abundance generate lower expression noise. We conclude that differences in noise can be explained by the interplay between a TF's nuclear concentration and the strength of its AD's effect on mean expression, without invoking differences between classes of ADs. These results raise the possibility of engineering gene expression noise independently of mean levels in synthetic biology contexts and provide a potential mechanism for natural selection to tune the noisiness of gene expression.
Affiliation(s)
- Kaiser Loell
  - Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA; The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA
- Yawei Wu
  - Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA; The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA
- Max V. Staller
  - Center for Computational Biology, University of California, Berkeley, Berkeley, CA 94720, USA
- Barak Cohen
  - Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA; The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine in St. Louis, St. Louis, MO 63108, USA.

41
Discovering molecular features of intrinsically disordered regions by using evolution for contrastive learning. PLoS Comput Biol 2022; 18:e1010238. [PMID: 35767567 PMCID: PMC9275697 DOI: 10.1371/journal.pcbi.1010238] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 08/04/2021] [Revised: 07/12/2022] [Accepted: 05/23/2022] [Indexed: 02/07/2023]
Abstract
A major challenge to the characterization of intrinsically disordered regions (IDRs), which are widespread in the proteome, but relatively poorly understood, is the identification of molecular features that mediate functions of these regions, such as short motifs, amino acid repeats and physicochemical properties. Here, we introduce a proteome-scale feature discovery approach for IDRs. Our approach, which we call “reverse homology”, exploits the principle that important functional features are conserved over evolution. We use this as a contrastive learning signal for deep learning: given a set of homologous IDRs, the neural network has to correctly choose a held-out homolog from another set of IDRs sampled randomly from the proteome. We pair reverse homology with a simple architecture and standard interpretation techniques, and show that the network learns conserved features of IDRs that can be interpreted as motifs, repeats, or bulk features like charge or amino acid propensities. We also show that our model can be used to produce visualizations of what residues and regions are most important to IDR function, generating hypotheses for uncharacterized IDRs. Our results suggest that feature discovery using unsupervised neural networks is a promising avenue to gain systematic insight into poorly understood protein sequences.

Intrinsically disordered regions (IDRs) are widespread in proteins but are poorly understood on a systematic level because they evolve too rapidly for classic bioinformatics methods to be effective. We designed a neural network that learns what features (for example, electrostatic charge, or the presence of certain motifs) might be important to the function of IDRs, even when we don’t have prior knowledge of function. Our neural network learns by exploiting principles of evolution. Important features tend to be conserved over species, so guessing what sequences evolved from the same common ancestor helps the neural network identify these features. Importantly, training a neural network this way can be defined as a fully automatic operation, so no manual effort is required. After our neural network is trained, we can apply interpretation techniques to understand what kinds of features are important to IDRs globally in the proteome, and to form hypotheses about specific IDRs. We show that many of the features our neural network learns are consistent with features we already know to be important to IDRs. We hope that our neural network can be applied to help biologists form hypotheses about poorly characterized IDRs.

42
Zeng X, Ruff KM, Pappu RV. Competing interactions give rise to two-state behavior and switch-like transitions in charge-rich intrinsically disordered proteins. Proc Natl Acad Sci U S A 2022; 119:e2200559119. [PMID: 35512095 PMCID: PMC9171777 DOI: 10.1073/pnas.2200559119] [Citation(s) in RCA: 32] [Impact Index Per Article: 16.0] [Received: 01/11/2022] [Accepted: 04/12/2022] [Indexed: 11/18/2022]
Abstract
The most commonly occurring intrinsically disordered proteins (IDPs) are polyampholytes, which are defined by the duality of low net charge per residue and high fractions of charged residues. Recent experiments have uncovered nuances regarding sequence–ensemble relationships of model polyampholytic IDPs. These include differences in conformational preferences for sequences with lysine vs. arginine and the suggestion that well-mixed sequences form a range of conformations, including globules, conformations with ensemble averages that are reminiscent of ideal chains, or self-avoiding walks. Here, we explain these observations by analyzing results from atomistic simulations. We find that polyampholytic IDPs generally sample two distinct stable states, namely, globules and self-avoiding walks. Globules are favored by electrostatic attractions between oppositely charged residues, whereas self-avoiding walks are favored by favorable free energies of hydration of charged residues. We find sequence-specific temperatures of bistability at which globules and self-avoiding walks can coexist. At these temperatures, ensemble averages over coexisting states give rise to statistics that resemble ideal chains without there being an actual counterbalancing of intrachain and chain-solvent interactions. At equivalent temperatures, arginine-rich sequences tilt the preference toward globular conformations whereas lysine-rich sequences tilt the preference toward self-avoiding walks. We also identify differences between aspartate- and glutamate-containing sequences, whereby the shorter aspartate side chain engenders preferences for metastable, necklace-like conformations. Finally, although segregation of oppositely charged residues within the linear sequence maintains the overall two-state behavior, compact states are highly favored by such systems.
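The two quantities that define a polyampholyte in the abstract above, the fraction of charged residues (FCR) and the net charge per residue (NCPR), can be illustrated with a minimal sketch. It is not taken from the paper: it assumes fixed charges of +1 for Lys/Arg and -1 for Asp/Glu, treats histidine and the termini as neutral, and ignores pH. The function name `charge_fractions` is hypothetical.

```python
def charge_fractions(seq: str) -> tuple[float, float]:
    """Return (FCR, NCPR) for a one-letter amino acid sequence.

    Assumption (not from the source): K and R carry +1, D and E carry -1,
    and every other residue is neutral.
    """
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must be non-empty")
    positive = sum(seq.count(r) for r in "KR")
    negative = sum(seq.count(r) for r in "DE")
    n = len(seq)
    fcr = (positive + negative) / n   # fraction of charged residues
    ncpr = (positive - negative) / n  # net charge per residue
    return fcr, ncpr


# A fully charged but net-neutral toy sequence: high FCR, NCPR of zero.
fcr, ncpr = charge_fractions("EKEKEKEKRDRD")
```

Under these assumptions, a sequence with high FCR and NCPR near zero matches the polyampholyte definition quoted in the abstract.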
Affiliation(s)
- Xiangze Zeng
  - Department of Biomedical Engineering, Washington University in St. Louis, St. Louis, MO 63130
  - Center for Science & Engineering of Living Systems, Washington University in St. Louis, St. Louis, MO 63130
- Kiersten M. Ruff
  - Department of Biomedical Engineering, Washington University in St. Louis, St. Louis, MO 63130
  - Center for Science & Engineering of Living Systems, Washington University in St. Louis, St. Louis, MO 63130
- Rohit V. Pappu
  - Department of Biomedical Engineering, Washington University in St. Louis, St. Louis, MO 63130
  - Center for Science & Engineering of Living Systems, Washington University in St. Louis, St. Louis, MO 63130

43
Soto L, Li Z, Santoso CS, Berenson A, Ho I, Shen VX, Yuan S, Bass JIF. Compendium of human transcription factor effector domains. Mol Cell 2022; 82:514-526. [PMID: 34863368 PMCID: PMC8818021 DOI: 10.1016/j.molcel.2021.11.007] [Citation(s) in RCA: 40] [Impact Index Per Article: 20.0] [Received: 08/09/2021] [Revised: 10/16/2021] [Accepted: 11/03/2021] [Indexed: 02/08/2023]
Abstract
Transcription factors (TFs) regulate gene expression by binding to DNA sequences and modulating transcriptional activity through their effector domains. Despite the central role of effector domains in TF function, there is a current lack of a comprehensive resource and characterization of effector domains. Here, we provide a catalog of 924 effector domains across 594 human TFs. Using this catalog, we characterized the amino acid composition of effector domains, their conservation across species and across the human population, and their roles in human diseases. Furthermore, we provide a classification system for effector domains that constitutes a valuable resource and a blueprint for future experimental studies of TF effector domain function.
Affiliation(s)
- Luis Soto
  - Escuela Profesional de Genética y Biotecnología, Facultad de Ciencias Biológicas, Universidad Nacional Mayor de San Marcos, Lima 15081, Perú
- Zhaorong Li
  - Bioinformatics Program, Boston University, Boston MA 02215
- Clarissa S Santoso
  - Biology Department, Boston University, Boston MA 02215, Molecular Biology, Cellular Biology and Biochemistry Program, Boston University, Boston MA 02215
- Anna Berenson
  - Biology Department, Boston University, Boston MA 02215, Molecular Biology, Cellular Biology and Biochemistry Program, Boston University, Boston MA 02215
- Isabella Ho
  - Biology Department, Boston University, Boston MA 02215
- Vivian X Shen
  - Biology Department, Boston University, Boston MA 02215
- Samson Yuan
  - Biology Department, Boston University, Boston MA 02215
- Juan I Fuxman Bass
  - Bioinformatics Program, Boston University, Boston MA 02215, Biology Department, Boston University, Boston MA 02215, Molecular Biology, Cellular Biology and Biochemistry Program, Boston University, Boston MA 02215 (corresponding author)

44
Cohan MC, Shinn MK, Lalmansingh JM, Pappu RV. Uncovering Non-random Binary Patterns Within Sequences of Intrinsically Disordered Proteins. J Mol Biol 2022; 434:167373. [PMID: 34863777 PMCID: PMC10178624 DOI: 10.1016/j.jmb.2021.167373] [Citation(s) in RCA: 34] [Impact Index Per Article: 17.0] [Received: 08/18/2021] [Revised: 10/24/2021] [Accepted: 11/16/2021] [Indexed: 01/21/2023]
Abstract
Sequence-ensemble relationships of intrinsically disordered proteins (IDPs) are governed by binary patterns such as the linear clustering or mixing of specific residues or residue types with respect to one another. To enable the discovery of potentially important, shared patterns across sequence families, we describe a computational method referred to as NARDINI for Non-random Arrangement of Residues in Disordered Regions Inferred using Numerical Intermixing. This work was partially motivated by the observation that parameters that are currently in use for describing different binary patterns are not interoperable across IDPs of different amino acid compositions and lengths. In NARDINI, we generate an ensemble of scrambled sequences to set up a composition-specific null model for the patterning parameters of interest. We then compute a series of pattern-specific z-scores to quantify how each pattern deviates from a null model for the IDP of interest. The z-scores help in identifying putative non-random linear sequence patterns within an IDP. We demonstrate the use of NARDINI derived z-scores by identifying sequence patterns in three well-studied IDP systems. We also demonstrate how NARDINI can be deployed to study archetypal IDPs across homologs and orthologs. Overall, NARDINI is likely to aid in designing novel IDPs with a view toward engineering new sequence-function relationships or uncovering cryptic ones. We further propose that the z-scores introduced here are likely to be useful for theoretical and computational descriptions of sequence-ensemble relationships across IDPs of different compositions and lengths.
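The scramble-based null model described in the abstract above can be sketched as follows. This is an illustration of the general idea only, not the NARDINI implementation: the patterning parameter used here (the fraction of adjacent residue pairs that both belong to a chosen residue group) is a simple stand-in, and the names `adjacent_pair_fraction` and `patterning_zscore`, the number of scrambles, and the seed are all hypothetical choices.

```python
import random
import statistics


def adjacent_pair_fraction(seq, group):
    """Toy patterning parameter: fraction of adjacent pairs with both residues in `group`."""
    pairs = list(zip(seq, seq[1:]))
    return sum(a in group and b in group for a, b in pairs) / len(pairs)


def patterning_zscore(seq, group, n_scrambles=1000, seed=0):
    """Z-score of the observed parameter against composition-preserving scrambles."""
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two residues")
    rng = random.Random(seed)
    observed = adjacent_pair_fraction(seq, group)
    residues = list(seq)
    null = []
    for _ in range(n_scrambles):
        rng.shuffle(residues)  # shuffling keeps composition and length fixed
        null.append(adjacent_pair_fraction(residues, group))
    mu = statistics.mean(null)
    sigma = statistics.pstdev(null)
    if sigma == 0:
        return 0.0  # degenerate null, e.g. a single-residue-type sequence
    return (observed - mu) / sigma


acidic = set("DE")
z_blocky = patterning_zscore("EEEEEEEEKKKKKKKK", acidic)  # acidic residues clustered
z_mixed = patterning_zscore("EKEKEKEKEKEKEKEK", acidic)   # acidic residues well mixed
```

Because every scramble has the same composition and length as the input, the resulting z-score is comparable across sequences of different compositions and lengths, which is the motivation stated in the abstract. A positive value indicates more clustering than expected by chance, a negative value more mixing.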
Affiliation(s)
- Megan C Cohan
  - Department of Biomedical Engineering and Center for Science & Engineering of Living Systems (CSELS), Washington University in St. Louis, MO 63130, USA
- Min Kyung Shinn
  - Department of Biomedical Engineering and Center for Science & Engineering of Living Systems (CSELS), Washington University in St. Louis, MO 63130, USA
- Rohit V Pappu
  - Department of Biomedical Engineering and Center for Science & Engineering of Living Systems (CSELS), Washington University in St. Louis, MO 63130, USA.

45
Staller MV, Ramirez E, Kotha SR, Holehouse AS, Pappu RV, Cohen BA. Directed mutational scanning reveals a balance between acidic and hydrophobic residues in strong human activation domains. Cell Syst 2022; 13:334-345.e5. [PMID: 35120642 PMCID: PMC9241528 DOI: 10.1016/j.cels.2022.01.002] [Citation(s) in RCA: 37] [Impact Index Per Article: 18.5] [Received: 04/15/2021] [Revised: 10/20/2021] [Accepted: 01/05/2022] [Indexed: 01/01/2023]
Abstract
Acidic activation domains are intrinsically disordered regions of the transcription factors that bind coactivators. The intrinsic disorder and low evolutionary conservation of activation domains have made it difficult to identify the sequence features that control activity. To address this problem, we designed thousands of variants in seven acidic activation domains and measured their activities with a high-throughput assay in human cell culture. We found that strong activation domain activity requires a balance between the number of acidic residues and aromatic and leucine residues. These findings motivated a predictor of acidic activation domains that scans the human proteome for clusters of aromatic and leucine residues embedded in regions of high acidity. This predictor identifies known activation domains and accurately predicts previously unidentified ones. Our results support a flexible acidic exposure model of activation domains in which the acidic residues solubilize hydrophobic motifs so that they can interact with coactivators.

A record of this paper’s transparent peer review process is included in the supplemental information.

Transcriptional activation domains are poorly conserved, intrinsically disordered regions of the transcription factors that remain difficult to predict from protein sequences. A high-throughput method reveals how strong activation domains require a balance between acidic and hydrophobic residues. This balance powers an accurate predictor of activation domains on human transcription factors.
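The scanning idea in the abstract above (clusters of aromatic and leucine residues embedded in acidic regions) can be illustrated with a sliding-window sketch. This is not the published predictor: the window length, both thresholds, the fixed-charge model (K/R as +1, D/E as -1) and the function name `candidate_windows` are illustrative assumptions.

```python
def candidate_windows(seq, window=30, min_hydrophobic=5, max_net_charge=-5):
    """Return (start, n_WFYL, net_charge) for windows that are both acidic and W/F/Y/L-rich.

    All defaults are hypothetical values chosen for illustration only.
    """
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - window + 1):
        sub = seq[start:start + window]
        hydrophobic = sum(sub.count(r) for r in "WFYL")  # aromatic + leucine count
        net = sum(sub.count(r) for r in "KR") - sum(sub.count(r) for r in "DE")
        if hydrophobic >= min_hydrophobic and net <= max_net_charge:
            hits.append((start, hydrophobic, net))
    return hits


# Toy 20-residue sequence: 6 W/F/L residues interleaved with 14 acidic residues.
hits = candidate_windows("DDFDDLEEWDDLEEFDDLDE", window=20)
```

Requiring both conditions in the same window reflects the balance between acidic and hydrophobic residues that the abstract reports. A real predictor would need thresholds calibrated against measured activities.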
Affiliation(s)
- Max V Staller
  - Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine in St. Louis, Saint Louis, MO 63110, USA; Department of Genetics, Washington University School of Medicine in St. Louis, Saint Louis, MO 63110, USA; Center for Computational Biology, University of California Berkeley, Berkeley, CA 94720, USA.
- Eddie Ramirez
  - Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine in St. Louis, Saint Louis, MO 63110, USA; Department of Genetics, Washington University School of Medicine in St. Louis, Saint Louis, MO 63110, USA
- Sanjana R Kotha
  - Center for Computational Biology, University of California Berkeley, Berkeley, CA 94720, USA
- Alex S Holehouse
  - Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine in St. Louis, Saint Louis, MO 63110, USA; Center for Science and Engineering of Living Systems, Washington University in St. Louis, St. Louis, MO 63130, USA
- Rohit V Pappu
  - Center for Science and Engineering of Living Systems, Washington University in St. Louis, St. Louis, MO 63130, USA; Department of Biomedical Engineering, Washington University in St. Louis, St. Louis, MO 63130, USA
- Barak A Cohen
  - Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine in St. Louis, Saint Louis, MO 63110, USA; Department of Genetics, Washington University School of Medicine in St. Louis, Saint Louis, MO 63110, USA.

46
Abstract
Auxin signaling regulates growth and developmental processes in plants. The core of nuclear auxin signaling relies on just three components: TIR1/AFBs, Aux/IAAs, and ARFs. Each component is itself made up of several domains, all of which contribute to the regulation of auxin signaling. Studies of the structural aspects of these three core signaling components have deepened our understanding of auxin signaling dynamics and regulation. In addition to the structured domains of these components, intrinsically disordered regions within the proteins also impact auxin signaling outcomes. New research is beginning to uncover the role intrinsic disorder plays in auxin-regulated degradation and subcellular localization. Structured and intrinsically disordered domains affect auxin perception, protein degradation dynamics, and DNA binding. Taken together, subtle differences within the domains and motifs of each class of auxin signaling component affect signaling outcomes and specificity.
Affiliation(s)
- Nicholas Morffy
  - Department of Biology, Duke University, Durham, North Carolina 27708, USA
  - Center for Science and Engineering Living Systems (CSELS), Washington University, St. Louis, Missouri 63130, USA
- Lucia C Strader
  - Department of Biology, Duke University, Durham, North Carolina 27708, USA
  - Center for Science and Engineering Living Systems (CSELS), Washington University, St. Louis, Missouri 63130, USA
  - Center for Engineering Mechanobiology, Washington University, St. Louis, Missouri 63130, USA

47
Albarnaz JD, Ren H, Torres AA, Shmeleva EV, Melo CA, Bannister AJ, Brember MP, Chung BYW, Smith GL. Molecular mimicry of NF-κB by vaccinia virus protein enables selective inhibition of antiviral responses. Nat Microbiol 2022; 7:154-168. [PMID: 34949827 PMCID: PMC7614822 DOI: 10.1038/s41564-021-01004-9] [Citation(s) in RCA: 26] [Impact Index Per Article: 13.0] [Received: 10/24/2020] [Accepted: 10/21/2021] [Indexed: 12/16/2022]
Abstract
Infection of mammalian cells with viruses activates NF-κB to induce the expression of cytokines and chemokines and initiate an antiviral response. Here, we show that a vaccinia virus protein mimics the transactivation domain of the p65 subunit of NF-κB to inhibit selectively the expression of NF-κB-regulated genes. Using co-immunoprecipitation assays, we found that the vaccinia virus protein F14 associates with NF-κB co-activator CREB-binding protein (CBP) and disrupts the interaction between p65 and CBP. This abrogates CBP-mediated acetylation of p65, after which it reduces promoter recruitment of the transcriptional regulator BRD4 and diminishes stimulation of NF-κB-regulated genes CXCL10 and CCL2. Recruitment of BRD4 to the promoters of NFKBIA and CXCL8 remains unaffected by either F14 or JQ1 (a competitive inhibitor of BRD4 bromodomains), indicating that BRD4 recruitment is acetylation-independent. Unlike other viral proteins that are general antagonists of NF-κB, F14 is a selective inhibitor of NF-κB-dependent gene expression. An in vivo model of infection demonstrated that F14 promotes virulence. Molecular mimicry of NF-κB may be conserved because other orthopoxviruses, including variola, monkeypox and cowpox viruses, encode orthologues of F14.
Affiliation(s)
- Jonas D Albarnaz
  - Department of Pathology, University of Cambridge, Cambridge, UK.
  - Cambridge Institute for Medical Research, University of Cambridge, Cambridge, UK.
- Hongwei Ren
  - Department of Pathology, University of Cambridge, Cambridge, UK
  - Department of Immunology and Inflammation, Imperial College London, Hammersmith Campus, London, UK
- Alice A Torres
  - Department of Pathology, University of Cambridge, Cambridge, UK
- Evgeniya V Shmeleva
  - Department of Pathology, University of Cambridge, Cambridge, UK
  - Department of Obstetrics and Gynaecology, University of Cambridge, Cambridge, UK
- Carlos A Melo
  - The Gurdon Institute, University of Cambridge, Cambridge, UK
- Betty Y-W Chung
  - Department of Pathology, University of Cambridge, Cambridge, UK
- Geoffrey L Smith
  - Department of Pathology, University of Cambridge, Cambridge, UK.

48
Fossat MJ, Posey AE, Pappu RV. Quantifying charge state heterogeneity for proteins with multiple ionizable residues. Biophys J 2021; 120:5438-5453. [PMID: 34826385 PMCID: PMC8715249 DOI: 10.1016/j.bpj.2021.11.2886] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/31/2021] [Revised: 11/03/2021] [Accepted: 11/19/2021] [Indexed: 01/07/2023]
Abstract
Ionizable residues can release and take up protons and this has an influence on protein structure and function. The extent of protonation is linked to the overall pH of the solution and the local environments of ionizable residues. Binding or unbinding of a single proton generates a distinct charge microstate defined by a specific pattern of charges. Accordingly, the overall partition function is a sum over all charge microstates and Boltzmann weights of all conformations associated with each of the charge microstates. This ensemble-of-ensembles description recast as a q-canonical ensemble allows us to analyze and interpret potentiometric titrations that provide information regarding net charge as a function of pH. In the q-canonical ensemble, charge microstates are grouped into mesostates where each mesostate is a collection of microstates of the same net charge. Here, we show that leveraging the structure of the q-canonical ensemble allows us to decouple contributions of net proton binding and release from proton arrangement and conformational considerations. Through application of the q-canonical formalism to analyze potentiometric measurements of net charge in proteins with repetitive patterns of Lys and Glu residues, we determine the underlying mesostate pKa values and, more importantly, we estimate relative mesostate populations as a function of pH. This is a strength of using the q-canonical approach that cannot be replicated using purely site-specific analyses. Overall, our work shows how measurements of charge equilibria, decoupled from measurements of conformational equilibria, and analyzed using the framework of the q-canonical ensemble, provide protein-specific quantitative descriptions of pH-dependent populations of mesostates. This method is of direct relevance for measuring and understanding how different charge states contribute to conformational, binding, and phase equilibria of proteins.
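The grouping of charge microstates into mesostates of equal net charge, as described in the abstract above, can be made concrete with a small counting sketch. It is a combinatorial illustration under simplifying assumptions that are not from the paper: each acidic site is either neutral or -1, each basic site is either neutral or +1, and termini and histidine are ignored. The function name `mesostate_sizes` is hypothetical, and no energies or pKa values are computed.

```python
from collections import Counter
from math import comb


def mesostate_sizes(n_acidic: int, n_basic: int) -> dict[int, int]:
    """Map each net charge to the number of charge microstates that share it."""
    sizes = Counter()
    for charged_bases in range(n_basic + 1):
        for charged_acids in range(n_acidic + 1):
            net = charged_bases - charged_acids
            # Number of ways to choose which sites carry the charges.
            sizes[net] += comb(n_basic, charged_bases) * comb(n_acidic, charged_acids)
    return dict(sorted(sizes.items()))


# Two Glu and two Lys sites: 2**4 = 16 microstates spread over five mesostates.
sizes = mesostate_sizes(n_acidic=2, n_basic=2)
```

For this toy case the net-neutral mesostate holds 6 of the 16 microstates. That shows why a measurement of net charge alone constrains mesostate populations but not the arrangement of protons within a mesostate.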
Collapse
Affiliation(s)
- Martin J Fossat
- Department of Biomedical Engineering and Center for Science & Engineering of Living Systems (CSELS), Washington University in St. Louis, St. Louis, Missouri
- Ammon E Posey
- Department of Biomedical Engineering and Center for Science & Engineering of Living Systems (CSELS), Washington University in St. Louis, St. Louis, Missouri
- Rohit V Pappu
- Department of Biomedical Engineering and Center for Science & Engineering of Living Systems (CSELS), Washington University in St. Louis, St. Louis, Missouri
|
49
|
Abstract
To predict transcription, one needs a mechanistic understanding of how the numerous required transcription factors (TFs) explore the nuclear space to find their target genes, assemble, cooperate, and compete with one another. Advances in fluorescence microscopy have made it possible to visualize real-time TF dynamics in living cells, leading to two intriguing observations: first, most TFs contact chromatin only transiently; and second, TFs can assemble into clusters through their intrinsically disordered regions. These findings suggest that highly dynamic events and spatially structured nuclear microenvironments might play key roles in transcription regulation that are not yet fully understood. The emerging model is that while some promoters directly convert TF-binding events into on/off cycles of transcription, many others apply complex regulatory layers that ultimately lead to diverse phenotypic outputs. Cracking this kinetic code is an ongoing and challenging task that is made possible by combining innovative imaging approaches with biophysical models.
Affiliation(s)
- Feiyue Lu
- Institute for Systems Genetics and Cell Biology Department, NYU School of Medicine, New York, New York 10016, USA
- Timothée Lionnet
- Institute for Systems Genetics and Cell Biology Department, NYU School of Medicine, New York, New York 10016, USA
|
50
|
Broyles BK, Gutierrez AT, Maris TP, Coil DA, Wagner TM, Wang X, Kihara D, Class CA, Erkine AM. Activation of gene expression by detergent-like protein domains. iScience 2021; 24:103017. [PMID: 34522860 PMCID: PMC8426559 DOI: 10.1016/j.isci.2021.103017] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/11/2021] [Revised: 07/08/2021] [Accepted: 08/18/2021] [Indexed: 11/24/2022] Open
Abstract
The mechanisms by which transcriptional activation domains (tADs) initiate eukaryotic gene expression have been an enigma for decades because most tADs lack specificity in sequence, structure, and interactions with targets. Machine learning analysis of data sets of tAD sequences generated in vivo elucidated several functionality rules: the functional tAD sequences should (i) be devoid of or depleted with basic amino acid residues, (ii) be enriched with aromatic and acidic residues, (iii) be with aromatic residues localized mostly near the terminus of the sequence, and acidic residues localized more internally within a span of 20-30 amino acids, (iv) be with both aromatic and acidic residues preferably spread out in the sequence and not clustered, and (v) not be separated by occasional basic residues. These and other more subtle rules are not absolute, reflecting absence of a tAD consensus sequence, enormous variability, and consistent with surfactant-like tAD biochemical properties. The findings are compatible with the paradigm-shifting nucleosome detergent mechanism of gene expression activation, contributing to the development of the liquid-liquid phase separation model and the biochemistry of near-stochastic functional allosteric interactions.
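Rules (i) to (iii) in this abstract concern residue composition and position, which can be computed directly from a sequence. The following is a minimal Python sketch under stated assumptions: basic residues are taken as Lys and Arg (His is not counted), aromatic residues as Phe, Trp, and Tyr, and "near the terminus" is read as distance to the nearest sequence end. The function names and residue groupings are hypothetical, no activity thresholds are implied, and this is not the machine learning model used by the authors.

```python
ACIDIC = set("DE")
BASIC = set("KR")        # assumption: His is not counted as basic
AROMATIC = set("FWY")


def composition_features(seq):
    """Fraction of acidic, basic, and aromatic residues in a sequence."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("sequence must be non-empty")

    def frac(group):
        return sum(1 for aa in seq if aa in group) / n

    return {
        "acidic": frac(ACIDIC),
        "basic": frac(BASIC),
        "aromatic": frac(AROMATIC),
    }


def mean_terminal_distance(seq, group):
    """Mean distance (in residues) from residues in `group` to the nearest
    sequence end; returns None if no residue of that group is present."""
    seq = seq.upper()
    n = len(seq)
    dists = [min(i, n - 1 - i) for i, aa in enumerate(seq) if aa in group]
    return sum(dists) / len(dists) if dists else None
```

For a candidate sequence, a low basic fraction, elevated acidic and aromatic fractions, and a smaller mean terminal distance for aromatic than for acidic residues would be in line with rules (i) to (iii); as the abstract notes, the rules are not absolute.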
Affiliation(s)
- Bradley K Broyles
- College of Pharmacy and Health Sciences, Butler University, Indianapolis, IN 46208, USA
- Andrew T Gutierrez
- College of Pharmacy and Health Sciences, Butler University, Indianapolis, IN 46208, USA
- Theodore P Maris
- College of Pharmacy and Health Sciences, Butler University, Indianapolis, IN 46208, USA
- Daniel A Coil
- College of Pharmacy and Health Sciences, Butler University, Indianapolis, IN 46208, USA
- Thomas M Wagner
- College of Pharmacy and Health Sciences, Butler University, Indianapolis, IN 46208, USA
- Xiao Wang
- Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA
- Daisuke Kihara
- Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA
- Caleb A Class
- College of Pharmacy and Health Sciences, Butler University, Indianapolis, IN 46208, USA
- Alexandre M Erkine
- College of Pharmacy and Health Sciences, Butler University, Indianapolis, IN 46208, USA
|