1
Faure AJ, Lehner B, Miró Pina V, Serrano Colome C, Weghorn D. An extension of the Walsh-Hadamard transform to calculate and model epistasis in genetic landscapes of arbitrary shape and complexity. PLoS Comput Biol 2024; 20:e1012132. [PMID: 38805561 PMCID: PMC11161127 DOI: 10.1371/journal.pcbi.1012132] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2023] [Revised: 06/07/2024] [Accepted: 05/04/2024] [Indexed: 05/30/2024] Open
Abstract
Accurate models describing the relationship between genotype and phenotype are necessary in order to understand and predict how mutations to biological sequences affect the fitness and evolution of living organisms. The apparent abundance of epistasis (genetic interactions), both between and within genes, complicates this task and how to build mechanistic models that incorporate epistatic coefficients (genetic interaction terms) is an open question. The Walsh-Hadamard transform represents a rigorous computational framework for calculating and modeling epistatic interactions at the level of individual genotypic values (known as genetical, biological or physiological epistasis), and can therefore be used to address fundamental questions related to sequence-to-function encodings. However, one of its main limitations is that it can only accommodate two alleles (amino acid or nucleotide states) per sequence position. In this paper we provide an extension of the Walsh-Hadamard transform that allows the calculation and modeling of background-averaged epistasis (also known as ensemble epistasis) in genetic landscapes with an arbitrary number of states per position (20 for amino acids, 4 for nucleotides, etc.). We also provide a recursive formula for the inverse matrix and then derive formulae to directly extract any element of either matrix without having to rely on the computationally intensive task of constructing or inverting large matrices. Finally, we demonstrate the utility of our theory by using it to model epistasis within both simulated and empirical multiallelic fitness landscapes, revealing that both pairwise and higher-order genetic interactions are enriched between physically interacting positions.
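For the classical two-allele case that this abstract takes as its starting point, the Walsh-Hadamard transform can be illustrated with a short sketch. The function name `fwht`, the example phenotype values, the genotype ordering, and the unnormalized sign convention are assumptions chosen here for illustration; they are not taken from the paper, and the multiallelic extension it describes is not implemented.

```python
def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform (natural/Hadamard order).

    `values` holds one phenotype per genotype of a biallelic landscape,
    so its length must be a power of two (2**n for n positions).
    """
    out = list(values)
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Butterfly step: combine blocks of size h into sums and differences.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


# Hypothetical two-position landscape, genotypes ordered 00, 01, 10, 11.
phenotypes = [0, 1, 1, 3]
coefficients = fwht(phenotypes)
print(coefficients)  # -> [5, -3, -3, 1]
```

In this convention the last coefficient equals y00 - y01 - y10 + y11, so it is zero for a purely additive landscape such as `[0, 1, 1, 2]` and nonzero when the two positions interact. Applying `fwht` twice returns the input scaled by its length, which is why an inverse only needs a division. Published epistasis formalisms apply additional normalization and sign conventions that this sketch leaves out.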
Affiliation(s)
- Andre J. Faure
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Spain
- Ben Lehner
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
  - ICREA, Pg. Lluis Companys 23, Barcelona 08010, Spain
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, United Kingdom
- Verónica Miró Pina
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Spain
- Claudia Serrano Colome
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Spain
- Donate Weghorn
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
2
Ishigami Y, Wong MS, Martí-Gómez C, Ayaz A, Kooshkbaghi M, Hanson SM, McCandlish DM, Krainer AR, Kinney JB. Specificity, synergy, and mechanisms of splice-modifying drugs. Nat Commun 2024; 15:1880. [PMID: 38424098 PMCID: PMC10904865 DOI: 10.1038/s41467-024-46090-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2023] [Accepted: 02/10/2024] [Indexed: 03/02/2024] Open
Abstract
Drugs that target pre-mRNA splicing hold great therapeutic potential, but the quantitative understanding of how these drugs work is limited. Here we introduce mechanistically interpretable quantitative models for the sequence-specific and concentration-dependent behavior of splice-modifying drugs. Using massively parallel splicing assays, RNA-seq experiments, and precision dose-response curves, we obtain quantitative models for two small-molecule drugs, risdiplam and branaplam, developed for treating spinal muscular atrophy. The results quantitatively characterize the specificities of risdiplam and branaplam for 5' splice site sequences, suggest that branaplam recognizes 5' splice sites via two distinct interaction modes, and contradict the prevailing two-site hypothesis for risdiplam activity at SMN2 exon 7. The results also show that anomalous single-drug cooperativity, as well as multi-drug synergy, are widespread among small-molecule drugs and antisense-oligonucleotide drugs that promote exon inclusion. Our quantitative models thus clarify the mechanisms of existing treatments and provide a basis for the rational development of new therapies.
Affiliation(s)
- Yuma Ishigami
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
- Mandy S Wong
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
  - Beam Therapeutics, Cambridge, MA, 02142, USA
- Andalus Ayaz
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
- Mahdi Kooshkbaghi
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
  - The Estée Lauder Companies, New York, NY, 10153, USA
- Adrian R Krainer
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
- Justin B Kinney
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
3
Parisutham V, Chhabra S, Ali MZ, Brewster RC. Tunable transcription factor library for robust quantification of regulatory properties in Escherichia coli. Mol Syst Biol 2022; 18:e10843. [PMID: 35694815 PMCID: PMC9189660 DOI: 10.15252/msb.202110843] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2021] [Revised: 05/11/2022] [Accepted: 05/13/2022] [Indexed: 11/12/2022] Open
Abstract
Predicting the quantitative regulatory function of transcription factors (TFs) based on factors such as binding sequence, binding location, and promoter type is not possible. The interconnected nature of gene networks and the difficulty in tuning individual TF concentrations make the isolated study of TF function challenging. Here, we present a library of Escherichia coli strains designed to allow for precise control of the concentration of individual TFs enabling the study of the role of TF concentration on physiology and regulation. We demonstrate the usefulness of this resource by measuring the regulatory function of the zinc‐responsive TF, ZntR, and the paralogous TF pair, GalR/GalS. For ZntR, we find that zinc alters ZntR regulatory function in a way that enables activation of the regulated gene to be robust with respect to ZntR concentration. For GalR and GalS, we are able to demonstrate that these paralogous TFs have fundamentally distinct regulatory roles beyond differences in binding affinity.
Affiliation(s)
- Vinuselvi Parisutham
  - Department of Systems Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
- Shivani Chhabra
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Md Zulfikar Ali
  - Department of Systems Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
- Robert C Brewster
  - Department of Systems Biology, University of Massachusetts Chan Medical School, Worcester, MA, USA
  - Department of Microbiology and Physiological Systems, University of Massachusetts Chan Medical School, Worcester, MA, USA
4
Faure AJ, Domingo J, Schmiedel JM, Hidalgo-Carcedo C, Diss G, Lehner B. Mapping the energetic and allosteric landscapes of protein binding domains. Nature 2022; 604:175-183. [PMID: 35388192 DOI: 10.1038/s41586-022-04586-4] [Citation(s) in RCA: 84] [Impact Index Per Article: 42.0] [Received: 09/14/2021] [Accepted: 02/25/2022] [Indexed: 11/09/2022]
Abstract
Allosteric communication between distant sites in proteins is central to biological regulation but still poorly characterized, limiting understanding, engineering and drug development [1-6]. An important reason for this is the lack of methods to comprehensively quantify allostery in diverse proteins. Here we address this shortcoming and present a method that uses deep mutational scanning to globally map allostery. The approach uses an efficient experimental design to infer en masse the causal biophysical effects of mutations by quantifying multiple molecular phenotypes (here, binding and protein abundance) in multiple genetic backgrounds and fitting thermodynamic models using neural networks. We apply the approach to two of the most common protein interaction domains found in humans, an SH3 domain and a PDZ domain, to produce comprehensive atlases of allosteric communication. Allosteric mutations are abundant, with a large mutational target space of network-altering 'edgetic' variants. Mutations are more likely to be allosteric closer to binding interfaces, at glycine residues and at specific residues connecting to an opposite surface within the PDZ domain. This general approach of quantifying mutational effects for multiple molecular phenotypes and in multiple genetic backgrounds should enable the energetic and allosteric landscapes of many proteins to be rapidly and comprehensively mapped.
Affiliation(s)
- Andre J Faure
  - Center for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Júlia Domingo
  - Center for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
  - New York Genome Center (NYGC), New York, NY, USA
- Jörn M Schmiedel
  - Center for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Cristina Hidalgo-Carcedo
  - Center for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Guillaume Diss
  - Center for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
  - Friedrich Miescher Institute for Biomedical Research (FMI), Basel, Switzerland
- Ben Lehner
  - Center for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
5
Lagator M, Sarikas S, Steinrueck M, Toledo-Aparicio D, Bollback JP, Guet CC, Tkačik G. Predicting bacterial promoter function and evolution from random sequences. eLife 2022; 11:64543. [PMID: 35080492 PMCID: PMC8791639 DOI: 10.7554/elife.64543] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 11/02/2020] [Accepted: 01/09/2022] [Indexed: 12/12/2022] Open
Abstract
Predicting function from sequence is a central problem of biology. Currently, this is possible only locally in a narrow mutational neighborhood around a wildtype sequence rather than globally from any sequence. Using random mutant libraries, we developed a biophysical model that accounts for multiple features of σ70 binding bacterial promoters to predict constitutive gene expression levels from any sequence. We experimentally and theoretically estimated that 10–20% of random sequences lead to expression and ~80% of non-expressing sequences are one mutation away from a functional promoter. The potential for generating expression from random sequences is so pervasive that selection acts against σ70-RNA polymerase binding sites even within inter-genic, promoter-containing regions. This pervasiveness of σ70-binding sites implies that emergence of promoters is not the limiting step in gene regulatory evolution. Ultimately, the inclusion of novel features of promoter function into a mechanistic model enabled not only more accurate predictions of gene expression levels, but also identified that promoters evolve more rapidly than previously thought.
Affiliation(s)
- Mato Lagator
  - School of Biological Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, United Kingdom
  - Institute of Science and Technology Austria, Klosterneuburg, Austria
- Srdjan Sarikas
  - Institute of Science and Technology Austria, Klosterneuburg, Austria
  - Center for Physiology and Pharmacology, Medical University of Vienna, Klosterneuburg, Austria
- Jonathan P Bollback
  - Institute of Integrative Biology, Functional and Comparative Genomics, University of Liverpool, Liverpool, United Kingdom
- Calin C Guet
  - Institute of Science and Technology Austria, Klosterneuburg, Austria
- Gašper Tkačik
  - Institute of Science and Technology Austria, Klosterneuburg, Austria
6
Guharajan S, Chhabra S, Parisutham V, Brewster RC. Quantifying the regulatory role of individual transcription factors in Escherichia coli. Cell Rep 2021; 37:109952. [PMID: 34758318 PMCID: PMC8667592 DOI: 10.1016/j.celrep.2021.109952] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/16/2021] [Revised: 08/02/2021] [Accepted: 10/13/2021] [Indexed: 11/30/2022] Open
Abstract
Gene regulation often results from the action of multiple transcription factors (TFs) acting at a promoter, obscuring the individual regulatory effect of each TF on RNA polymerase (RNAP). Here we measure the fundamental regulatory interactions of TFs in E. coli by designing synthetic target genes that isolate individual TFs' regulatory effects. Using a thermodynamic model, each TF's regulatory interactions are decoupled from TF occupancy and interpreted as acting through (de)stabilization of RNAP and (de)acceleration of transcription initiation. We find that the contribution of each mechanism depends on TF identity and binding location; regulation immediately downstream of the promoter is insensitive to TF identity, but the same TFs regulate by distinct mechanisms upstream of the promoter. These two mechanisms are uncoupled and can act coherently, to reinforce the observed regulatory role (activation/repression), or incoherently, wherein the TF regulates two distinct steps with opposing effects.
Affiliation(s)
- Sunil Guharajan
  - Department of Systems Biology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA
- Shivani Chhabra
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Vinuselvi Parisutham
  - Department of Systems Biology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA
- Robert C Brewster
  - Department of Systems Biology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA
  - Department of Microbiology and Physiological Systems, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA
7
Abstract
Bacterial protein synthesis rates have evolved to maintain preferred stoichiometries at striking precision, from the components of protein complexes to constituents of entire pathways. Setting relative protein production rates to be well within a factor of two requires concerted tuning of transcription, RNA turnover, and translation, allowing many potential regulatory strategies to achieve the preferred output. The last decade has seen a greatly expanded capacity for precise interrogation of each step of the central dogma genome-wide. Here, we summarize how these technologies have shaped the current understanding of diverse bacterial regulatory architectures underpinning stoichiometric protein synthesis. We focus on the emerging expanded view of bacterial operons, which encode diverse primary and secondary mRNA structures for tuning protein stoichiometry. Emphasis is placed on how quantitative tuning is achieved. We discuss the challenges and open questions in the application of quantitative, genome-wide methodologies to the problem of precise protein production. Expected final online publication date for the Annual Review of Microbiology, Volume 75 is October 2021. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
Affiliation(s)
- James C Taggart
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
- Jean-Benoît Lalanne
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
  - Department of Physics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
  - Current affiliation: Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
- Gene-Wei Li
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
8
Abstract
Simple biophysical models successfully describe bacterial regulatory code, by predicting gene expression from DNA sequences that bind specialized regulatory proteins. Analogous simple models fail in multicellular organisms, where regulatory proteins bind DNA very transiently, yet, nevertheless, effect precise control over gene expression. To date, the more general, “nonequilibrium” models have proven difficult to analyze and connect to data. Here, we reduce this complexity theoretically, by constructing simple nonequilibrium models which perform optimal gene regulation within known experimental constraints. In prokaryotes, thermodynamic models of gene regulation provide a highly quantitative mapping from promoter sequences to gene-expression levels that is compatible with in vivo and in vitro biophysical measurements. Such concordance has not been achieved for models of enhancer function in eukaryotes. In equilibrium models, it is difficult to reconcile the reported short transcription factor (TF) residence times on the DNA with the high specificity of regulation. In nonequilibrium models, progress is difficult due to an explosion in the number of parameters. Here, we navigate this complexity by looking for minimal nonequilibrium enhancer models that yield desired regulatory phenotypes: low TF residence time, high specificity, and tunable cooperativity. We find that a single extra parameter, interpretable as the “linking rate,” by which bound TFs interact with Mediator components, enables our models to escape equilibrium bounds and access optimal regulatory phenotypes, while remaining consistent with the reported phenomenology and simple enough to be inferred from upcoming experiments. We further find that high specificity in nonequilibrium models is in a trade-off with gene-expression noise, predicting bursty dynamics—an experimentally observed hallmark of eukaryotic transcription. 
By drastically reducing the vast parameter space of nonequilibrium enhancer models to a much smaller subspace that optimally realizes biological function, we deliver a rich class of models that could be tractably inferred from data in the near future.
9
Ireland WT, Beeler SM, Flores-Bautista E, McCarty NS, Röschinger T, Belliveau NM, Sweredoski MJ, Moradian A, Kinney JB, Phillips R. Deciphering the regulatory genome of Escherichia coli, one hundred promoters at a time. eLife 2020; 9:e55308. [PMID: 32955440 PMCID: PMC7567609 DOI: 10.7554/elife.55308] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 01/20/2020] [Accepted: 09/18/2020] [Indexed: 01/28/2023] Open
Abstract
Advances in DNA sequencing have revolutionized our ability to read genomes. However, even in the most well-studied of organisms, the bacterium Escherichia coli, for ≈65% of promoters we remain ignorant of their regulation. Until we crack this regulatory Rosetta Stone, efforts to read and write genomes will remain haphazard. We introduce a new method, Reg-Seq, that links massively parallel reporter assays with mass spectrometry to produce a base pair resolution dissection of more than 100 E. coli promoters in 12 growth conditions. We demonstrate that the method recapitulates known regulatory information. Then, we examine regulatory architectures for more than 80 promoters which previously had no known regulatory information. In many cases, we also identify which transcription factors mediate their regulation. This method clears a path for highly multiplexed investigations of the regulatory genome of model organisms, with the potential of moving to an array of microbes of ecological and medical relevance.
Affiliation(s)
- William T Ireland
  - Department of Physics, California Institute of Technology, Pasadena, United States
- Suzannah M Beeler
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, United States
- Emanuel Flores-Bautista
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, United States
- Nicholas S McCarty
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, United States
- Tom Röschinger
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, United States
- Nathan M Belliveau
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, United States
- Michael J Sweredoski
  - Proteome Exploration Laboratory, Division of Biology and Biological Engineering, Beckman Institute, California Institute of Technology, Pasadena, United States
- Annie Moradian
  - Proteome Exploration Laboratory, Division of Biology and Biological Engineering, Beckman Institute, California Institute of Technology, Pasadena, United States
- Justin B Kinney
  - Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, United States
- Rob Phillips
  - Department of Physics, California Institute of Technology, Pasadena, United States
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, United States
10
Tareen A, Kinney JB. Logomaker: beautiful sequence logos in Python. Bioinformatics 2020; 36:2272-2274. [PMID: 31821414 PMCID: PMC7141850 DOI: 10.1093/bioinformatics/btz921] [Citation(s) in RCA: 187] [Impact Index Per Article: 46.8] [Received: 05/10/2019] [Revised: 11/14/2019] [Accepted: 12/06/2019] [Indexed: 01/09/2023] Open
Abstract
Summary: Sequence logos are visually compelling ways of illustrating the biological properties of DNA, RNA and protein sequences, yet it is currently difficult to generate and customize such logos within the Python programming environment. Here we introduce Logomaker, a Python API for creating publication-quality sequence logos. Logomaker can produce both standard and highly customized logos from either a matrix-like array of numbers or a multiple-sequence alignment. Logos are rendered as native matplotlib objects that are easy to stylize and incorporate into multi-panel figures. Availability and implementation: Logomaker can be installed using the pip package manager and is compatible with both Python 2.7 and Python 3.6. Documentation is provided at http://logomaker.readthedocs.io; source code is available at http://github.com/jbkinney/logomaker.
Affiliation(s)
- Ammar Tareen
  - Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Justin B Kinney
  - Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
11
Yim SS, Johns NI, Park J, Gomes ALC, McBee RM, Richardson M, Ronda C, Chen SP, Garenne D, Noireaux V, Wang HH. Multiplex transcriptional characterizations across diverse bacterial species using cell-free systems. Mol Syst Biol 2019; 15:e8875. [PMID: 31464371 PMCID: PMC6692573 DOI: 10.15252/msb.20198875] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Received: 02/19/2019] [Revised: 07/01/2019] [Accepted: 07/03/2019] [Indexed: 12/14/2022] Open
Abstract
Cell-free expression systems enable rapid prototyping of genetic programs in vitro. However, current throughput of cell-free measurements is limited by the use of channel-limited fluorescent readouts. Here, we describe DNA Regulatory element Analysis by cell-Free Transcription and Sequencing (DRAFTS), a rapid and robust in vitro approach for multiplexed measurement of transcriptional activities from thousands of regulatory sequences in a single reaction. We employ this method in active cell lysates developed from ten diverse bacterial species. Interspecies analysis of transcriptional profiles from > 1,000 diverse regulatory sequences reveals functional differences in promoter activity that can be quantitatively modeled, providing a rich resource for tuning gene expression in diverse bacterial species. Finally, we examine the transcriptional capacities of dual-species hybrid lysates that can simultaneously harness gene expression properties of multiple organisms. We expect that this cell-free multiplex transcriptional measurement approach will improve genetic part prototyping in new bacterial chassis for synthetic biology.
Affiliation(s)
- Sung Sun Yim
  - Department of Systems Biology, Columbia University, New York, NY, USA
- Nathan I Johns
  - Department of Systems Biology, Columbia University, New York, NY, USA
  - Integrated Program in Cellular, Molecular, and Biomedical Studies, Columbia University, New York, NY, USA
  - Present address: Department of Bioengineering, Stanford University, Stanford, CA, USA
- Jimin Park
  - Department of Systems Biology, Columbia University, New York, NY, USA
  - Integrated Program in Cellular, Molecular, and Biomedical Studies, Columbia University, New York, NY, USA
- Antonio LC Gomes
  - Department of Immunology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Ross M McBee
  - Department of Systems Biology, Columbia University, New York, NY, USA
  - Department of Biological Sciences, Columbia University, New York, NY, USA
- Miles Richardson
  - Department of Systems Biology, Columbia University, New York, NY, USA
  - Integrated Program in Cellular, Molecular, and Biomedical Studies, Columbia University, New York, NY, USA
- Carlotta Ronda
  - Department of Systems Biology, Columbia University, New York, NY, USA
- Sway P Chen
  - Department of Systems Biology, Columbia University, New York, NY, USA
  - Integrated Program in Cellular, Molecular, and Biomedical Studies, Columbia University, New York, NY, USA
- David Garenne
  - School of Physics and Astronomy, University of Minnesota, Minneapolis, MN, USA
- Vincent Noireaux
  - School of Physics and Astronomy, University of Minnesota, Minneapolis, MN, USA
- Harris H Wang
  - Department of Systems Biology, Columbia University, New York, NY, USA
  - Department of Pathology and Cell Biology, Columbia University, New York, NY, USA
12
How the avidity of polymerase binding to the -35/-10 promoter sites affects gene expression. Proc Natl Acad Sci U S A 2019; 116:13340-13345. [PMID: 31196959 PMCID: PMC6613100 DOI: 10.1073/pnas.1905615116] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Indexed: 11/18/2022] Open
Abstract
Although the key promoter elements necessary to drive transcription in Escherichia coli have long been understood, we still cannot predict the behavior of arbitrary novel promoters, hampering our ability to characterize the myriad sequenced regulatory architectures as well as to design new synthetic circuits. This work builds upon a beautiful recent experiment by Urtecho et al. [G. Urtecho, et al., Biochemistry, 58, 1539-1551 (2019)] who measured the gene expression of over 10,000 promoters spanning all possible combinations of a small set of regulatory elements. Using these data, we demonstrate that a central claim in energy matrix models of gene expression (that each promoter element contributes independently and additively to gene expression) contradicts experimental measurements. We propose that a key missing ingredient from such models is the avidity between the -35 and -10 RNA polymerase binding sites and develop what we call a multivalent model that incorporates this effect and can successfully characterize the full suite of gene expression data. We explore several applications of this framework, namely, how multivalent binding at the -35 and -10 sites can buffer RNA polymerase (RNAP) kinetics against mutations and how promoters that bind overly tightly to RNA polymerase can inhibit gene expression. The success of our approach suggests that avidity represents a key physical principle governing the interaction of RNA polymerase with its promoter.
13
Kinney JB, McCandlish DM. Massively Parallel Assays and Quantitative Sequence-Function Relationships. Annu Rev Genomics Hum Genet 2019; 20:99-127. [PMID: 31091417 DOI: 10.1146/annurev-genom-083118-014845] [Citation(s) in RCA: 71] [Impact Index Per Article: 14.2] [Indexed: 11/09/2022]
Abstract
Over the last decade, a rich variety of massively parallel assays have revolutionized our understanding of how biological sequences encode quantitative molecular phenotypes. These assays include deep mutational scanning, high-throughput SELEX, and massively parallel reporter assays. Here, we review these experimental methods and how the data they produce can be used to quantitatively model sequence-function relationships. In doing so, we touch on a diverse range of topics, including the identification of clinically relevant genomic variants, the modeling of transcription factor binding to DNA, the functional and evolutionary landscapes of proteins, and cis-regulatory mechanisms in both transcription and mRNA splicing. We further describe a unified conceptual framework and a core set of mathematical modeling strategies that studies in these diverse areas can make use of. Finally, we highlight key aspects of experimental design and mathematical modeling that are important for the results of such studies to be interpretable and reproducible.
Affiliation(s)
- Justin B Kinney
  - Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA
- David M McCandlish
  - Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA