1
|
Dhillon N, Kamakaka RT. Transcriptional silencing in Saccharomyces cerevisiae: known unknowns. Epigenetics Chromatin 2024; 17:28. [PMID: 39272151 PMCID: PMC11401328 DOI: 10.1186/s13072-024-00553-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2024] [Accepted: 09/02/2024] [Indexed: 09/15/2024] Open
Abstract
Transcriptional silencing in Saccharomyces cerevisiae is a persistent and highly stable form of gene repression. It involves DNA silencers and repressor proteins that bind nucleosomes. The silenced state is influenced by numerous factors including the concentration of repressors, nature of activators, architecture of regulatory elements, modifying enzymes and the dynamics of chromatin. Silencers function to increase the residence time of repressor Sir proteins at silenced domains while clustering of silenced domains enables increased concentrations of repressors and helps facilitate long-range interactions. The presence of an accessible NDR at the regulatory regions of silenced genes, the cycling of chromatin configurations at regulatory sites, the mobility of Sir proteins, and the non-uniform distribution of the Sir proteins across the silenced domain, all result in silenced chromatin that only stably silences weak promoters and enhancers via changes in transcription burst duration and frequency. These data collectively suggest that silencing is probabilistic and the robustness of silencing is achieved through sub-optimization of many different nodes of action such that a stable expression state is generated and maintained even though individual constituents are in constant flux.
Affiliation(s)
- Namrita Dhillon
  - Department of Biomolecular Engineering, University of California, 1156 High Street, Santa Cruz, CA, 95064, USA
- Rohinton T Kamakaka
  - Department of MCD Biology, University of California, 1156 High Street, Santa Cruz, CA, 95064, USA
|
2
|
Wang S, Wang W. Interpretable prediction of mRNA abundance from promoter sequence using contextual regression models. NAR Genom Bioinform 2024; 6:lqae055. [PMID: 38807713 PMCID: PMC11131020 DOI: 10.1093/nargab/lqae055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Revised: 04/08/2024] [Accepted: 05/12/2024] [Indexed: 05/30/2024] Open
Abstract
While machine learning models have been successfully applied to predicting gene expression from promoter sequences, it remains a great challenge to derive intuitive interpretations of the model and to reveal DNA motif grammar such as motif cooperation and distance constraints between motif sites. Previous interpretation approaches are often time-consuming or have difficulty learning the combinatorial rules. In this work, we designed interpretable neural network models to predict mRNA expression levels from DNA sequences. By applying the Contextual Regression framework we developed, we extracted weighted features to cluster samples into different groups, which have different gene expression levels. We performed motif analysis in each cluster and found motifs with active or repressive regulation of gene expression. By comparing the co-occurrence locations of discovered motifs, we also uncovered multiple grammars of motif combination including communities of cooperative motifs and distance constraints between motif pairs. These results revealed new insights into the regulatory architecture of promoter sequences.
Affiliation(s)
- Song Wang
  - Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA 92093-0359, USA
- Wei Wang
  - Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA 92093-0359, USA
  - Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA 92093-0359, USA
|
3
|
Vaknin I, Willinger O, Mandl J, Heuberger H, Ben-Ami D, Zeng Y, Goldberg S, Orenstein Y, Amit R. A universal system for boosting gene expression in eukaryotic cell-lines. Nat Commun 2024; 15:2394. [PMID: 38493141 PMCID: PMC10944472 DOI: 10.1038/s41467-024-46573-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Accepted: 03/04/2024] [Indexed: 03/18/2024] Open
Abstract
We demonstrate a transcriptional regulatory design algorithm that can boost expression in yeast and mammalian cell lines. The system consists of a simplified transcriptional architecture composed of a minimal core promoter and a synthetic upstream regulatory region (sURS) composed of up to three motifs selected from a list of 41 motifs conserved in the eukaryotic lineage. The sURS system was first characterized using an oligo-library containing 189,990 variants. We validate the resultant expression model using a set of 43 unseen sURS designs. The validation sURS experiments indicate that a generic set of grammar rules for boosting and attenuation may exist in yeast cells. Finally, we demonstrate that this generic set of grammar rules functions similarly in mammalian CHO-K1 and HeLa cells. Consequently, our work provides a design algorithm for boosting the expression of promoters used for expressing industrially relevant proteins in yeast and mammalian cell lines.
Affiliation(s)
- Inbal Vaknin
  - Department of Biotechnology and Food Engineering, Technion, Haifa, Israel
- Or Willinger
  - Department of Biotechnology and Food Engineering, Technion, Haifa, Israel
- Jonathan Mandl
  - Department of Computer Science, Bar-Ilan University, Ramat Gan, Israel
- Hadar Heuberger
  - School of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Dan Ben-Ami
  - School of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Beer Sheva, Israel
- Yi Zeng
  - Department of Biotechnology and Food Engineering, Technion, Haifa, Israel
- Sarah Goldberg
  - Department of Biotechnology and Food Engineering, Technion, Haifa, Israel
- Yaron Orenstein
  - Department of Computer Science, Bar-Ilan University, Ramat Gan, Israel
  - The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat Gan, Israel
- Roee Amit
  - Department of Biotechnology and Food Engineering, Technion, Haifa, Israel
  - The Russell Berrie Nanotechnology Institute, Technion, Haifa, Israel
|
4
|
Harden TT, Vincent BJ, DePace AH. Transcriptional activators in the early Drosophila embryo perform different kinetic roles. Cell Syst 2023; 14:258-272.e4. [PMID: 37080162 PMCID: PMC10473017 DOI: 10.1016/j.cels.2023.03.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2021] [Revised: 06/26/2022] [Accepted: 03/21/2023] [Indexed: 04/22/2023]
Abstract
Combinatorial regulation of gene expression by transcription factors (TFs) may in part arise from kinetic synergy, wherein TFs regulate different steps in the transcription cycle. Kinetic synergy requires that TFs play distinguishable kinetic roles. Here, we used live imaging to determine the kinetic roles of three TFs that activate transcription in the Drosophila embryo (Zelda, Bicoid, and Stat92E) by introducing their binding sites into the even-skipped stripe 2 enhancer. These TFs influence different sets of kinetic parameters, and their influence can change over time. All three TFs increased the fraction of transcriptionally active nuclei; Zelda also shortened the first-passage time into transcription and regulated the interval between transcription events. Stat92E also increased the lifetimes of active transcription. Different TFs can therefore play distinct kinetic roles in activating transcription. This has consequences for understanding the composition and flexibility of regulatory DNA sequences and the biochemical function of TFs. A record of this paper's transparent peer review process is included in the supplemental information.
Affiliation(s)
- Timothy T Harden
  - Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
- Ben J Vincent
  - Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
- Angela H DePace
  - Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
|
5
|
Kari H, Bandi SMS, Kumar A, Yella VR. DeePromClass: Delineator for Eukaryotic Core Promoters Employing Deep Neural Networks. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:802-807. [PMID: 35353704 DOI: 10.1109/tcbb.2022.3163418] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
Computational promoter identification in eukaryotes is a classical biological problem that should be revisited given the availability of an avalanche of experimental data and emerging deep learning technologies. Current knowledge indicates that eukaryotic core promoters display multifarious signals such as TATA-Box, Inr element, TCT, and Pause-button, and structural motifs such as G-quadruplexes. In the present study, we combined the power of deep learning with a plethora of promoter motifs to delineate promoters and non-promoters gleaned from the statistical properties of DNA sequence arrangement. To this end, we implemented convolutional neural network (CNN) and long short-term memory (LSTM) recurrent neural network architectures for five model systems, with [-100 to +50] segments relative to the transcription start site being the core promoter. Unlike previous state-of-the-art tools, which furnish a binary decision of promoter or non-promoter, we classify a 151-mer sequence as a promoter, along with its consensus signal type, or as a non-promoter. The combined CNN-LSTM model, which we call "DeePromClass", achieved testing accuracies of 90.6%, 93.6%, 91.8%, 86.5%, and 84.0% for S. cerevisiae, C. elegans, D. melanogaster, Mus musculus, and Homo sapiens, respectively. In total, our tool provides an insightful update on next-generation promoter prediction tools for promoter biologists.
|
6
|
Shahein A, López-Malo M, Istomin I, Olson EJ, Cheng S, Maerkl SJ. Systematic analysis of low-affinity transcription factor binding site clusters in vitro and in vivo establishes their functional relevance. Nat Commun 2022; 13:5273. [PMID: 36071116 PMCID: PMC9452512 DOI: 10.1038/s41467-022-32971-0] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 06/22/2022] [Accepted: 08/25/2022] [Indexed: 11/10/2022] Open
Abstract
Binding to binding site clusters has yet to be characterized in depth, and the functional relevance of low-affinity clusters remains uncertain. We characterized transcription factor binding to low-affinity clusters in vitro and found that transcription factors can bind concurrently to overlapping sites, challenging the notion of binding exclusivity. Furthermore, small clusters with binding sites an order of magnitude lower in affinity give rise to high mean occupancies at physiologically-relevant transcription factor concentrations. To assess whether the observed in vitro occupancies translate to transcriptional activation in vivo, we tested low-affinity binding site clusters in a synthetic and native gene regulatory network in S. cerevisiae. In both systems, clusters of low-affinity binding sites generated transcriptional output comparable to single or even multiple consensus sites. This systematic characterization demonstrates that clusters of low-affinity binding sites achieve substantial occupancies, and that this occupancy can drive expression in eukaryotic promoters.
Affiliation(s)
- Amir Shahein
  - Institute of Bioengineering, School of Engineering, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Maria López-Malo
  - Institute of Bioengineering, School of Engineering, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Ivan Istomin
  - Institute of Bioengineering, School of Engineering, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Evan J Olson
  - Institute of Bioengineering, School of Engineering, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Shiyu Cheng
  - Institute of Bioengineering, School of Engineering, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Sebastian J Maerkl
  - Institute of Bioengineering, School of Engineering, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
|
7
|
Foe VE. Does the Pachytene Checkpoint, a Feature of Meiosis, Filter Out Mistakes in Double-Strand DNA Break Repair and as a Side-Effect Strongly Promote Adaptive Speciation? Integr Org Biol 2022; 4:obac008. [PMID: 36827645 PMCID: PMC8998493 DOI: 10.1093/iob/obac008] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/24/2022] Open
Abstract
This essay aims to explain two biological puzzles: why eukaryotic transcription units are composed of short segments of coding DNA interspersed with long stretches of non-coding (intron) DNA, and the near ubiquity of sexual reproduction. As is well known, alternative splicing of its coding sequences enables one transcription unit to produce multiple variants of each encoded protein. Additionally, padding transcription units with non-coding DNA (often many thousands of base pairs long) provides a readily evolvable way to set how soon in a cell cycle the various mRNAs will begin being expressed and the total amount of mRNA that each transcription unit can make during a cell cycle. This regulation complements control via the transcriptional promoter and facilitates the creation of complex eukaryotic cell types, tissues, and organisms. However, it also makes eukaryotes exceedingly vulnerable to double-strand DNA breaks, which end-joining break repair pathways can repair incorrectly. Transcription units cover such a large fraction of the genome that any mis-repair producing a reorganized chromosome has a high probability of destroying a gene. During meiosis, the synaptonemal complex aligns homologous chromosome pairs and the pachytene checkpoint detects, selectively arrests, and in many organisms actively destroys gamete-producing cells with chromosomes that cannot adequately synapse; this creates a filter favoring transmission to the next generation of chromosomes that retain the parental organization, while selectively culling those with interrupted transcription units. This same meiotic checkpoint, reacting to accidental chromosomal reorganizations inflicted by error-prone break repair, can, as a side effect, provide a mechanism for the formation of new species in sympatry. It has been a long-standing puzzle how something as seemingly maladaptive as hybrid sterility between such new species can arise. 
I suggest that this paradox is resolved by understanding the adaptive importance of the pachytene checkpoint, as outlined above.
|
8
|
Vanaja A, Yella VR. Delineation of the DNA Structural Features of Eukaryotic Core Promoter Classes. ACS Omega 2022; 7:5657-5669. [PMID: 35224327 PMCID: PMC8867553 DOI: 10.1021/acsomega.1c04603] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 08/23/2021] [Accepted: 01/27/2022] [Indexed: 05/02/2023]
Abstract
Eukaryotic transcription is orchestrated from a DNA region known as the core promoter. Multifarious and punctilious core promoter signals, viz., TATA-box, Inr, BREs, and Pause Button, are associated with a subset of genes and regulate their spatiotemporal expression. However, the core promoter architecture linked with these signals has not been investigated exhaustively for several species. In this study, we attempted to envisage the adaptive binding landscape of the transcription initiation machinery as a function of DNA structure. To this end, we deployed a set of k-mer based DNA structural estimates and regular expression models derived from experiments, molecular dynamics simulations, and theoretical frameworks, and high-throughput promoter data sets retrieved from the eukaryotic promoter database. We categorized protein-coding gene core promoters based on characteristic motifs at precise locations and analyzed the B-DNA structural properties and non-B-DNA structural motifs for 15 different eukaryotic genomes. We observed that Inr, BREd, and no-motif classes display common patterns of DNA sequence and structural environment. TATA-containing, BREu, and Pause Button classes show a deviant behavior with the TATA class displaying varied axial and twisting flexibility while BREu and Pause Button leaned toward G-quadruplex motif enrichment. Intriguingly, DNA meltability and shape signals are conserved irrespective of the presence or absence of distinct core promoter motifs in the majority of species. Altogether, here we delineated the conserved DNA structural signals associated with several promoter classes that may contribute to the chromatin configuration, orchestration of transcription machinery, and DNA duplex melting during the transcription process.
Affiliation(s)
- Akkinepally Vanaja
  - Department of Biotechnology, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur 522502, Andhra Pradesh, India
  - KL College of Pharmacy, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur 522502, Andhra Pradesh, India
- Venkata Rajesh Yella
  - Department of Biotechnology, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur 522502, Andhra Pradesh, India
|
9
|
Jiang C, Wan S, Hu P, Li Y, Li S. Editorial: Transcriptional Regulation in Metabolism and Immunology. Front Genet 2022; 13:845697. [PMID: 35186050 PMCID: PMC8847674 DOI: 10.3389/fgene.2022.845697] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/30/2021] [Accepted: 01/17/2022] [Indexed: 11/13/2022] Open
Affiliation(s)
- Chunjie Jiang
  - Department of Medicine, Division of Diabetes, Endocrinology and Metabolism, Baylor College of Medicine, Houston, TX, United States
- Shibiao Wan
  - Center for Applied Bioinformatics, St. Jude Children's Research Hospital, Memphis, TN, United States
- Peng Hu
  - College of Fisheries and Life Science, Shanghai Ocean University, Shanghai, China
- Yongsheng Li
  - Key Laboratory of Tropical Translational Medicine of Ministry of Education, College of Biomedical Information and Engineering, Hainan Medical University, Haikou, China
- Shengli Li
  - Precision Research Center for Refractory Diseases, Institute for Clinical Research, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai, China
|
10
|
Schmitz RJ, Grotewold E, Stam M. Cis-regulatory sequences in plants: Their importance, discovery, and future challenges. The Plant Cell 2022; 34:718-741. [PMID: 34918159 PMCID: PMC8824567 DOI: 10.1093/plcell/koab281] [Citation(s) in RCA: 125] [Impact Index Per Article: 62.5] [Received: 06/04/2021] [Accepted: 10/20/2021] [Indexed: 05/19/2023]
Abstract
The identification and characterization of cis-regulatory DNA sequences and how they function to coordinate responses to developmental and environmental cues is of paramount importance to plant biology. Key to these regulatory processes are cis-regulatory modules (CRMs), which include enhancers and silencers. Despite the extraordinary advances in high-quality sequence assemblies and genome annotations, the identification and understanding of CRMs, and how they regulate gene expression, lag significantly behind. This is especially true for their distinguishing characteristics and activity states. Here, we review the current knowledge on CRMs and breakthrough technologies enabling identification, characterization, and validation of CRMs; we compare the genomic distributions of CRMs with respect to their target genes between different plant species, and discuss the role of transposable elements harboring CRMs in the evolution of gene expression. This is an exciting time to study cis-regulomes in plants; however, significant existing challenges need to be overcome to fully understand and appreciate the role of CRMs in plant biology and in crop improvement.
Affiliation(s)
- Robert J Schmitz
  - Department of Genetics, University of Georgia, Athens, Georgia 30602, USA
- Erich Grotewold
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, USA
|
11
|
Poon GMK. The Non-continuum Nature of Eukaryotic Transcriptional Regulation. Advances in Experimental Medicine and Biology 2022; 1371:11-32. [PMID: 33616894 PMCID: PMC8380751 DOI: 10.1007/5584_2021_618] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/25/2024]
Abstract
Eukaryotic transcription factors are versatile mediators of specificity in gene regulation. This versatility is achieved through mutual specification by context-specific DNA binding on the one hand, and identity-specific protein-protein partnerships on the other. This interactivity, known as combinatorial control, enables a repertoire of complex transcriptional outputs that are qualitatively disjoint, or non-continuum, with respect to binding affinity. This feature contrasts starkly with prokaryotic gene regulators, whose activities in general vary quantitatively in step with binding affinity. Biophysical studies on prokaryotic model systems and more recent investigations on transcription factors highlight an important role for folded state dynamics and molecular hydration in protein/DNA recognition. Analysis of molecular models of combinatorial control and recent literature in low-affinity gene regulation suggest that transcription factors harbor unique conformational dynamics that are inaccessible or unused by prokaryotic DNA-binding proteins. Thus, understanding the intrinsic dynamics involved in DNA binding and co-regulator recruitment appears to be a key to understanding how transcription factors mediate non-continuum outcomes in eukaryotic gene expression, and how such capability might have evolved from ancient, structurally conserved counterparts.
Affiliation(s)
- Gregory M K Poon
  - Department of Chemistry, Georgia State University, Atlanta, GA, USA
  - Center for Diagnostics and Therapeutics, Georgia State University, Atlanta, GA, USA
|
12
|
Protein innovation through template switching in the Saccharomyces cerevisiae lineage. Sci Rep 2021; 11:22558. [PMID: 34799587 PMCID: PMC8604942 DOI: 10.1038/s41598-021-01736-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2021] [Accepted: 10/27/2021] [Indexed: 11/08/2022] Open
Abstract
DNA polymerase template switching between short, non-identical inverted repeats (IRs) is a genetic mechanism that leads to the homogenization of IR arms and to IR spacer inversion, which cause multinucleotide mutations (MNMs). It is unknown if and how template switching affects gene evolution. In this study, we performed a phylogenetic analysis to determine the effect of template switching between IR arms on coding DNA of Saccharomyces cerevisiae. To achieve this, perfect IRs that co-occurred with MNMs between a strain and its parental node were identified in S. cerevisiae strains. We determined that template switching introduced MNMs into 39 protein-coding genes through S. cerevisiae evolution, resulting in both arm homogenization and inversion of the IR spacer. These events in turn resulted in nonsynonymous substitutions and up to five neighboring amino acid replacements in a single gene. The study demonstrates that template switching is a powerful generator of multiple substitutions within codons. Additionally, some template switching events occurred more than once during S. cerevisiae evolution. Our findings suggest that template switching constitutes a general mutagenic mechanism that results in both nonsynonymous substitutions and parallel evolution, which are traditionally considered as evidence for positive selection, without the need for adaptive explanations.
|
13
|
Guharajan S, Chhabra S, Parisutham V, Brewster RC. Quantifying the regulatory role of individual transcription factors in Escherichia coli. Cell Rep 2021; 37:109952. [PMID: 34758318 PMCID: PMC8667592 DOI: 10.1016/j.celrep.2021.109952] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/16/2021] [Revised: 08/02/2021] [Accepted: 10/13/2021] [Indexed: 11/30/2022] Open
Abstract
Gene regulation often results from the action of multiple transcription factors (TFs) acting at a promoter, obscuring the individual regulatory effect of each TF on RNA polymerase (RNAP). Here we measure the fundamental regulatory interactions of TFs in E. coli by designing synthetic target genes that isolate individual TFs' regulatory effects. Using a thermodynamic model, each TF's regulatory interactions are decoupled from TF occupancy and interpreted as acting through (de)stabilization of RNAP and (de)acceleration of transcription initiation. We find that the contribution of each mechanism depends on TF identity and binding location; regulation immediately downstream of the promoter is insensitive to TF identity, but the same TFs regulate by distinct mechanisms upstream of the promoter. These two mechanisms are uncoupled and can act coherently, to reinforce the observed regulatory role (activation/repression), or incoherently, wherein the TF regulates two distinct steps with opposing effects.
Affiliation(s)
- Sunil Guharajan
  - Department of Systems Biology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA
- Shivani Chhabra
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Vinuselvi Parisutham
  - Department of Systems Biology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA
- Robert C Brewster
  - Department of Systems Biology, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA
  - Department of Microbiology and Physiological Systems, University of Massachusetts Chan Medical School, Worcester, MA 01605, USA
|
14
|
Anderson DA, Voigt CA. Competitive dCas9 binding as a mechanism for transcriptional control. Mol Syst Biol 2021; 17:e10512. [PMID: 34747560 PMCID: PMC8574044 DOI: 10.15252/msb.202110512] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 06/17/2021] [Revised: 10/10/2021] [Accepted: 10/11/2021] [Indexed: 12/24/2022] Open
Abstract
Catalytically dead Cas9 (dCas9) is a programmable transcription factor that can be targeted to promoters through the design of small guide RNAs (sgRNAs), where it can function as an activator or repressor. Natural promoters use overlapping binding sites as a mechanism for signal integration, where the binding of one can block, displace, or augment the activity of the other. Here, we implemented this strategy in Escherichia coli using pairs of sgRNAs designed to repress and then derepress transcription through competitive binding. When designed to target a promoter, this led to 27-fold repression and complete derepression. This system was also capable of ratiometric input comparison over two orders of magnitude. Additionally, we used this mechanism for promoter sequence-independent control by adopting it for elongation control, achieving 8-fold repression and 4-fold derepression. This work demonstrates a new genetic control mechanism that could be used to build analog circuits or implement cis-regulatory logic on CRISPRi-targeted native genes.
Affiliation(s)
- Daniel A Anderson
  - Synthetic Biology Center, Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
- Christopher A Voigt
  - Synthetic Biology Center, Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
|
15
|
Fan K, Moore JE, Zhang XO, Weng Z. Genetic and epigenetic features of promoters with ubiquitous chromatin accessibility support ubiquitous transcription of cell-essential genes. Nucleic Acids Res 2021; 49:5705-5725. [PMID: 33978759 PMCID: PMC8191798 DOI: 10.1093/nar/gkab345] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 01/02/2021] [Revised: 03/19/2021] [Accepted: 05/01/2021] [Indexed: 12/04/2022] Open
Abstract
Gene expression is controlled by regulatory elements within accessible chromatin. Although most regulatory elements are cell type-specific, a subset is accessible in nearly all the 517 human and 94 mouse cell and tissue types assayed by the ENCODE consortium. We systematically analyzed 9000 human and 8000 mouse ubiquitously-accessible candidate cis-regulatory elements (cCREs) with promoter-like signatures (PLSs) from ENCODE, which we denote ubi-PLSs. These are more CpG-rich than non-ubi-PLSs and correspond to genes with ubiquitously high transcription, including a majority of cell-essential genes. ubi-PLSs are enriched with motifs of ubiquitously-expressed transcription factors and preferentially bound by transcriptional cofactors regulating ubiquitously-expressed genes. They are highly conserved between human and mouse at the synteny level but exhibit frequent turnover of motif sites; accordingly, ubi-PLSs show increased variation at their centers compared with flanking regions among the ∼186 thousand human genomes sequenced by the TOPMed project. Finally, ubi-PLSs are enriched in genes implicated in Mendelian diseases, especially diseases broadly impacting most cell types, such as deficiencies in mitochondrial functions. Thus, a set of roughly 9000 mammalian promoters are actively maintained in an accessible state across cell types by a distinct set of transcription factors and cofactors to ensure the transcriptional programs of cell-essential genes.
Affiliation(s)
- Kaili Fan
- Program in Bioinformatics and Integrative Biology, UMass Medical School, Worcester, MA, USA
| | - Jill E Moore
- Program in Bioinformatics and Integrative Biology, UMass Medical School, Worcester, MA, USA
| | - Xiao-ou Zhang
- Program in Bioinformatics and Integrative Biology, UMass Medical School, Worcester, MA, USA
| | - Zhiping Weng
- Program in Bioinformatics and Integrative Biology, UMass Medical School, Worcester, MA, USA
| |
|
16
|
Wang Y, Jaime-Lara RB, Roy A, Sun Y, Liu X, Joseph PV. SeqEnhDL: sequence-based classification of cell type-specific enhancers using deep learning models. BMC Res Notes 2021; 14:104. [PMID: 33741075 PMCID: PMC7980595 DOI: 10.1186/s13104-021-05518-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/12/2020] [Accepted: 03/09/2021] [Indexed: 11/12/2022] Open
Abstract
Objective: To address the challenge of computational identification of cell type-specific regulatory elements on a genome-wide scale. Results: We propose SeqEnhDL, a deep learning framework for classifying cell type-specific enhancers based on sequence features. DNA sequences of “strong enhancer” chromatin states in nine cell types from the ENCODE project were retrieved to build and test enhancer classifiers. For any DNA sequence, positional k-mer (k = 5, 7, 9 and 11) fold changes relative to randomly selected non-coding sequences across each nucleotide position were used as features for deep learning models. Three deep learning models were implemented, including multi-layer perceptron (MLP), Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN). All models in SeqEnhDL outperform state-of-the-art enhancer classifiers (including gkm-SVM and DanQ) in distinguishing cell type-specific enhancers from randomly selected non-coding sequences. Moreover, SeqEnhDL can directly discriminate enhancers from different cell types, which has not been achieved by other enhancer classifiers. Our analysis suggests that both enhancers and their tissue-specificity can be accurately identified based on their sequence features. SeqEnhDL is publicly available at https://github.com/wyp1125/SeqEnhDL. Supplementary Information: The online version contains supplementary material available at 10.1186/s13104-021-05518-7.
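The positional k-mer fold-change features mentioned in this abstract can be illustrated with a minimal sketch. It is one plausible reading of the abstract, not the SeqEnhDL code: the function names, the pseudocount, and the assumption that all sequences share one length are hypothetical choices made here for illustration.

```python
from collections import Counter
from typing import Dict, List, Tuple


def positional_kmer_counts(seqs: List[str], k: int) -> Dict[Tuple[int, str], int]:
    """Count how often each k-mer starts at each position across a set of sequences."""
    counts: Counter = Counter()
    for s in seqs:
        for i in range(len(s) - k + 1):
            counts[(i, s[i:i + k])] += 1
    return counts


def fold_change_features(seq: str, enhancers: List[str], background: List[str],
                         k: int, pseudo: float = 1.0) -> List[float]:
    """For each position in seq, return the enhancer/background frequency ratio
    of the k-mer starting there. The pseudocount (an assumption) avoids division by zero."""
    enh_counts = positional_kmer_counts(enhancers, k)
    bg_counts = positional_kmer_counts(background, k)
    features = []
    for i in range(len(seq) - k + 1):
        key = (i, seq[i:i + k])
        freq_enh = (enh_counts[key] + pseudo) / (len(enhancers) + pseudo)
        freq_bg = (bg_counts[key] + pseudo) / (len(background) + pseudo)
        features.append(freq_enh / freq_bg)
    return features


# Toy data, invented for illustration only.
toy_enhancers = ["ACGTA", "ACGTT"]
toy_background = ["TTTTT", "ACGTA"]
toy_features = fold_change_features("ACGTA", toy_enhancers, toy_background, k=2)
```

In practice one such vector would be computed per value of k and the vectors concatenated before being passed to a classifier; that step is omitted here.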
Affiliation(s)
- Yupeng Wang
- BDX Research and Consulting LLC, Herndon, VA, 20171, USA; Division of Intramural Research, National Institute of Nursing Research, National Institutes of Health, Bethesda, MD, 20892, USA.
| | - Rosario B Jaime-Lara
- Division of Intramural Clinical and Biological Research (DICBR), National Institute on Alcohol Abuse and Alcoholism, National Institutes of Health, Bethesda, MD, 20892, USA; Division of Intramural Research, National Institute of Nursing Research, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Abhrarup Roy
- Division of Intramural Research, National Institute of Nursing Research, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Ying Sun
- BDX Research and Consulting LLC, Herndon, VA, 20171, USA
| | - Xinyue Liu
- BDX Research and Consulting LLC, Herndon, VA, 20171, USA
| | - Paule V Joseph
- Division of Intramural Clinical and Biological Research (DICBR), National Institute on Alcohol Abuse and Alcoholism, National Institutes of Health, Bethesda, MD, 20892, USA; Division of Intramural Research, National Institute of Nursing Research, National Institutes of Health, Bethesda, MD, 20892, USA.
| |
|
17
|
Carbon Catabolite Repression Governs Diverse Physiological Processes and Development in Aspergillus nidulans. mBio 2021; 13:e0373421. [PMID: 35164551 PMCID: PMC8844935 DOI: 10.1128/mbio.03734-21] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 01/08/2023] Open
Abstract
Carbon catabolite repression (CCR) is a common phenomenon of microorganisms that enables efficient utilization of carbon nutrients, critical for the fitness of microorganisms in the wild and for pathogenic species to cause infection. In most filamentous fungal species, the conserved transcription factor CreA/Cre1 mediates CCR. Previous studies demonstrated a primary function for CreA/Cre1 in carbon metabolism; however, the phenotype of creA/cre1 mutants indicated broader roles. The global function and regulatory mechanism of this wide-domain transcription factor have remained elusive. Here, we applied two powerful genomics methods (transcriptome sequencing and chromatin immunoprecipitation sequencing) to delineate the direct and indirect roles of Aspergillus nidulans CreA across diverse physiological processes, including secondary metabolism, iron homeostasis, oxidative stress response, development, N-glycan biosynthesis, unfolded protein response, and nutrient and ion transport. The results indicate intricate connections between the regulation of carbon metabolism and diverse cellular functions. Moreover, our work also provides key mechanistic insights into CreA regulation and identifies CreA as a master regulator controlling many transcription factors of different regulatory networks. The discoveries for this highly conserved transcriptional regulator in a model fungus have important implications for CCR in related pathogenic and industrial species. IMPORTANCE The ability to scavenge and use a wide range of nutrients for growth is crucial for microorganisms' survival in the wild. Carbon catabolite repression (CCR) is a transcriptional regulatory phenomenon of both bacteria and fungi to coordinate the expression of genes required for preferential utilization of carbon sources. Since carbon metabolism is essential for growth, CCR is central to the fitness of microorganisms.
In filamentous fungi, CCR is mediated by the conserved transcription factor CreA/Cre1, whose function in carbon metabolism has been well established. However, the global roles and regulatory mechanism of CreA/Cre1 are poorly defined. This study uncovers the direct and indirect functions of CreA in the model organism Aspergillus nidulans over diverse physiological processes and development and provides mechanistic insights into how CreA controls different regulatory networks. The work also reveals an interesting functional divergence between filamentous fungal and yeast CreA/Cre1 orthologues.
|
18
|
Patra P, Das M, Kundu P, Ghosh A. Recent advances in systems and synthetic biology approaches for developing novel cell-factories in non-conventional yeasts. Biotechnol Adv 2021; 47:107695. [PMID: 33465474 DOI: 10.1016/j.biotechadv.2021.107695] [Citation(s) in RCA: 70] [Impact Index Per Article: 23.3] [Received: 09/13/2020] [Revised: 12/14/2020] [Accepted: 01/09/2021] [Indexed: 12/14/2022]
Abstract
Microbial bioproduction of chemicals, proteins, and primary metabolites from cheap carbon sources is currently an advancing area in industrial research. The model yeast, Saccharomyces cerevisiae, is a well-established biorefinery host that has been used extensively for commercial manufacturing of bioethanol from myriad carbon sources. However, its Crabtree-positive nature often limits the use of this organism for the biosynthesis of commercial molecules that do not belong in the fermentative pathway. To avoid extensive strain engineering of S. cerevisiae for the production of metabolites other than ethanol, non-conventional yeasts can be selected as hosts based on their natural capacity to produce desired commodity chemicals. Non-conventional yeasts like Kluyveromyces marxianus, K. lactis, Yarrowia lipolytica, Pichia pastoris, Scheffersomyces stipitis, Hansenula polymorpha, and Rhodotorula toruloides have been considered as potential industrial eukaryotic hosts owing to their desirable phenotypes such as thermotolerance, assimilation of a wide range of carbon sources, as well as ability to secrete high titers of protein and lipid. However, the advanced metabolic engineering efforts in these organisms are still lacking due to the limited availability of systems and synthetic biology methods like in silico models, well-characterised genetic parts, and optimized genome engineering tools. This review provides an insight into the recent advances and challenges of systems and synthetic biology as well as metabolic engineering endeavours towards the commercial usage of non-conventional yeasts. Particularly, the approaches in emerging non-conventional yeasts for the production of enzymes, therapeutic proteins, lipids, and metabolites for commercial applications are extensively discussed here. 
Various attempts to address current limitations in designing novel cell factories have been highlighted that include the advances in the fields of genome-scale metabolic model reconstruction, flux balance analysis, 'omics'-data integration into models, genome-editing toolkit development, and rewiring of cellular metabolisms for desired chemical production. Additionally, the understanding of metabolic networks using 13C-labelling experiments as well as the utilization of metabolomics in deciphering intracellular fluxes and reactions have also been discussed here. Application of cutting-edge nuclease-based genome editing platforms like CRISPR/Cas9, and its optimization towards efficient strain engineering in non-conventional yeasts have also been described. Additionally, the impact of the advances in promising non-conventional yeasts for efficient commercial molecule synthesis has been meticulously reviewed. In the future, a cohesive approach involving systems and synthetic biology will help in widening the horizon of the use of unexplored non-conventional yeast species towards industrial biotechnology.
Affiliation(s)
- Pradipta Patra
- School of Energy Science and Engineering, Indian Institute of Technology Kharagpur, West Bengal 721302, India
| | - Manali Das
- School of Bioscience, Indian Institute of Technology Kharagpur, West Bengal 721302, India
| | - Pritam Kundu
- School of Energy Science and Engineering, Indian Institute of Technology Kharagpur, West Bengal 721302, India
| | - Amit Ghosh
- School of Energy Science and Engineering, Indian Institute of Technology Kharagpur, West Bengal 721302, India; P.K. Sinha Centre for Bioenergy and Renewables, Indian Institute of Technology Kharagpur, West Bengal 721302, India.
| |
|
19
|
Georgakopoulos-Soares I, Mouratidis I, Parada GE, Matharu N, Hemberg M, Ahituv N. Asymmetron: a toolkit for the identification of strand asymmetry patterns in biological sequences. Nucleic Acids Res 2021; 49:e4. [PMID: 33211865 PMCID: PMC7797064 DOI: 10.1093/nar/gkaa1052] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/19/2020] [Revised: 10/15/2020] [Accepted: 10/20/2020] [Indexed: 11/23/2022] Open
Abstract
DNA strand asymmetries can have a major effect on several biological functions, including replication, transcription and transcription factor binding. As such, DNA strand asymmetries and mutational strand bias can provide information about biological function. However, a versatile tool to explore this does not exist. Here, we present Asymmetron, a user-friendly computational tool that performs statistical analysis and visualizations for the evaluation of strand asymmetries. Asymmetron takes as input DNA features provided with strand annotation and outputs strand asymmetries for consecutive occurrences of a single DNA feature or between pairs of features. We illustrate the use of Asymmetron by identifying transcriptional and replicative strand asymmetries of germline structural variant breakpoints. We also show that the orientation of the binding sites of 45% of human transcription factors analyzed have a significant DNA strand bias in transcribed regions, that is also corroborated in ChIP-seq analyses, and is likely associated with transcription. In summary, we provide a novel tool to assess DNA strand asymmetries and show how it can be used to derive new insights across a variety of biological disciplines.
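The idea of measuring strand asymmetry across consecutive occurrences of a strand-annotated feature can be sketched as follows. This is not Asymmetron's interface or its statistical procedure; the tuple layout, function names, and the choice of an exact two-sided binomial test against a 50:50 expectation are assumptions made here for illustration.

```python
from math import comb
from typing import List, Tuple

Feature = Tuple[str, int, str]  # (chromosome, start, strand), a hypothetical layout


def consecutive_strand_pairs(features: List[Feature]) -> Tuple[int, int]:
    """Count neighbouring feature pairs on the same strand vs. opposite strands."""
    ordered = sorted(features)  # sorts by chromosome, then start position
    same = opposite = 0
    for (chrom_a, _, strand_a), (chrom_b, _, strand_b) in zip(ordered, ordered[1:]):
        if chrom_a != chrom_b:
            continue  # do not pair features across chromosome boundaries
        if strand_a == strand_b:
            same += 1
        else:
            opposite += 1
    return same, opposite


def binomial_two_sided_p(k: int, n: int) -> float:
    """Exact two-sided binomial p-value for k successes in n trials with p = 0.5."""
    if n == 0:
        return 1.0
    tail = min(k, n - k)
    p_value = 2 * sum(comb(n, i) for i in range(tail + 1)) / 2 ** n
    return min(1.0, p_value)


# Toy annotations, invented for illustration only.
toy_features = [("chr1", 100, "+"), ("chr1", 50, "+"), ("chr1", 200, "-"), ("chr2", 10, "-")]
toy_same, toy_opposite = consecutive_strand_pairs(toy_features)
toy_p = binomial_two_sided_p(toy_same, toy_same + toy_opposite)
```

A strong excess of same-strand (or opposite-strand) neighbours yields a small p-value, which is the kind of signal a strand-asymmetry analysis looks for.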
Affiliation(s)
- Ilias Georgakopoulos-Soares
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
| | - Ioannis Mouratidis
- Aristotle University of Thessaloniki, Department of Mathematics, Thessaloniki, GR, Greece
| | - Guillermo E Parada
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- Wellcome Trust Cancer Research UK Gurdon Institute, University of Cambridge, Tennis Court Road, Cambridge CB2 1QN, UK
| | - Navneet Matharu
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
- Innovative Genomics Institute, University of California San Francisco, San Francisco, CA, USA
| | - Martin Hemberg
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, CB10 1SA, UK
- Wellcome Trust Cancer Research UK Gurdon Institute, University of Cambridge, Tennis Court Road, Cambridge CB2 1QN, UK
| | - Nadav Ahituv
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
| |
|
20
|
Kreimer A, Yan Z, Ahituv N, Yosef N. Meta-analysis of massively parallel reporter assays enables prediction of regulatory function across cell types. Hum Mutat 2019; 40:1299-1313. [PMID: 31131957 PMCID: PMC6771677 DOI: 10.1002/humu.23820] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 01/16/2019] [Revised: 05/18/2019] [Accepted: 05/24/2019] [Indexed: 01/01/2023]
Abstract
Deciphering the potential of noncoding loci to influence gene regulation has been the subject of intense research, with important implications in understanding genetic underpinnings of human diseases. Massively parallel reporter assays (MPRAs) can measure regulatory activity of thousands of DNA sequences and their variants in a single experiment. With an increasing number of publicly available MPRA data sets, one can now develop data-driven models which, given a DNA sequence, predict its regulatory activity. Here, we performed a comprehensive meta-analysis of several MPRA data sets in a variety of cellular contexts. We first applied an ensemble of methods to predict MPRA output in each context and observed that the most predictive features are consistent across data sets. We then demonstrate that predictive models trained in one cellular context can be used to predict MPRA output in another, with loss of accuracy attributed to cell-type-specific features. Finally, we show that our approach achieves top performance in the Fifth Critical Assessment of Genome Interpretation "Regulation Saturation" Challenge for predicting effects of single-nucleotide variants. Overall, our analysis provides insights into how MPRA data can be leveraged to highlight functional regulatory regions throughout the genome and can guide effective design of future experiments by better prioritizing regions of interest.
Affiliation(s)
- Anat Kreimer
- Department of Electrical Engineering and Computer Sciences, Center for Computational Biology, University of California, Berkeley, California
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, California
| | - Zhongxia Yan
- Department of Electrical Engineering and Computer Sciences, Center for Computational Biology, University of California, Berkeley, California
| | - Nadav Ahituv
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, California
| | - Nir Yosef
- Department of Electrical Engineering and Computer Sciences, Center for Computational Biology, University of California, Berkeley, California
- Ragon Institute of MGH, MIT and Harvard, Cambridge, Massachusetts
- Chan Zuckerberg Biohub, San Francisco, California
| |
|
21
|
Frumkin I, Yofe I, Bar-Ziv R, Gurvich Y, Lu YY, Voichek Y, Towers R, Schirman D, Krebber H, Pilpel Y. Evolution of intron splicing towards optimized gene expression is based on various Cis- and Trans-molecular mechanisms. PLoS Biol 2019; 17:e3000423. [PMID: 31442222 PMCID: PMC6728054 DOI: 10.1371/journal.pbio.3000423] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 10/09/2018] [Revised: 09/05/2019] [Accepted: 08/08/2019] [Indexed: 01/09/2023] Open
Abstract
Splicing expands, reshapes, and regulates the transcriptome of eukaryotic organisms. Despite its importance, key questions remain unanswered, including the following: Can splicing evolve when organisms adapt to new challenges? How does evolution optimize inefficiency of introns’ splicing and of the splicing machinery? To explore these questions, we evolved yeast cells that were engineered to contain an inefficiently spliced intron inside a gene whose protein product was under selection for an increased expression level. We identified a combination of mutations in Cis (within the gene of interest) and in Trans (in mRNA-maturation machinery). Surprisingly, the mutations in Cis resided outside of known intronic functional sites and improved the intron’s splicing efficiency potentially by easing tight mRNA structures. One of these mutations hampered a protein’s domain that was not under selection, demonstrating the evolutionary flexibility of multi-domain proteins as one domain functionality was improved at the expense of the other domain. The Trans adaptations resided in two proteins, Npl3 and Gbp2, that bind pre-mRNAs and are central to their maturation. Interestingly, these mutations either increased or decreased the affinity of these proteins to mRNA, presumably allowing faster spliceosome recruitment or increased time before degradation of the pre-mRNAs, respectively. Altogether, our work reveals various mechanistic pathways toward optimizations of intron splicing to ultimately adapt gene expression patterns to novel demands. An experimental evolution study involving an inefficiently spliced intron reveals that the splicing machinery, introns, and RNA quality control factors evolve in Cis and in Trans when cells optimize their transcriptome to new challenges.
Affiliation(s)
- Idan Frumkin
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
| | - Ido Yofe
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
| | - Raz Bar-Ziv
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
| | - Yonat Gurvich
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
| | - Yen-Yun Lu
- Abteilung für Molekulare Genetik, Institut für Mikrobiologie und Genetik, Göttinger Zentrum für Molekulare Biowissenschaften (GZMB), Georg-August Universität Göttingen, Göttingen, Germany
| | - Yoav Voichek
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
| | - Ruth Towers
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
| | - Dvir Schirman
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
| | - Heike Krebber
- Abteilung für Molekulare Genetik, Institut für Mikrobiologie und Genetik, Göttinger Zentrum für Molekulare Biowissenschaften (GZMB), Georg-August Universität Göttingen, Göttingen, Germany
| | - Yitzhak Pilpel
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
| |
|
22
|
Chahal G, Tyagi S, Ramialison M. Navigating the non-coding genome in heart development and Congenital Heart Disease. Differentiation 2019; 107:11-23. [PMID: 31102825 DOI: 10.1016/j.diff.2019.05.001] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 10/03/2018] [Revised: 01/14/2019] [Accepted: 05/06/2019] [Indexed: 12/12/2022]
Abstract
Congenital Heart Disease (CHD) is characterised by a wide range of cardiac defects, from mild to life-threatening, which occur in babies worldwide. To date, there is no cure for CHD; however, progress in surgery has reduced its mortality, allowing children affected by CHD to reach adulthood. In an effort to understand its genetic basis, several studies involving whole-genome sequencing (WGS) of patients with CHD have been undertaken and have generated a great wealth of information. The majority of putative causative mutations identified in WGS studies fall into the non-coding part of the genome. Unfortunately, because the function of these non-coding mutations is poorly understood, it is challenging to establish a causal link between a non-coding mutation and the disease. Thus, here we review the state-of-the-art approaches to interpret non-coding mutations in the context of CHD and address the following questions: What are the non-coding sequences important for cardiac function? Which technologies are used to identify them? Which resources are available to analyse them? What mutations are expected in these non-coding sequences? Learning from developmental processes, what is their expected role in CHD?
Affiliation(s)
- Gulrez Chahal
- Australian Regenerative Medicine Institute (ARMI), 15 Innovation Walk, Monash University, Wellington Road, Clayton, 3800, VIC, Australia; Systems Biology Institute (SBI), Wellington Road, Clayton, 3800, VIC, Australia
| | - Sonika Tyagi
- School of Biological Sciences, Monash University, Wellington Road, Clayton, 3800, VIC, Australia; Australian Genome Research Facility, 305 Grattan Street, Melbourne, VIC, 3000, Australia.
| | - Mirana Ramialison
- Australian Regenerative Medicine Institute (ARMI), 15 Innovation Walk, Monash University, Wellington Road, Clayton, 3800, VIC, Australia; Systems Biology Institute (SBI), Wellington Road, Clayton, 3800, VIC, Australia.
| |
|
23
|
Huang K, Xhani S, Albrecht AV, Ha VLT, Esaki S, Poon GMK. Mechanism of cognate sequence discrimination by the ETS-family transcription factor ETS-1. J Biol Chem 2019; 294:9666-9678. [PMID: 31048376 DOI: 10.1074/jbc.ra119.007866] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 02/01/2019] [Revised: 05/01/2019] [Indexed: 12/19/2022] Open
Abstract
Functional evidence increasingly implicates low-affinity DNA recognition by transcription factors as a general mechanism for the spatiotemporal control of developmental genes. Although the DNA sequence requirements for affinity are well-defined, the dynamic mechanisms that execute cognate recognition are much less resolved. To address this gap, here we examined ETS1, a paradigm developmental transcription factor, as a model for which cognate discrimination remains enigmatic. Using molecular dynamics simulations, we interrogated the DNA-binding domain of murine ETS1 alone and when bound to high- and low-affinity cognate sites or to nonspecific DNA. The results of our analyses revealed collective backbone and side-chain motions that distinguished cognate versus nonspecific as well as high- versus low-affinity cognate DNA binding. Combined with binding experiments with site-directed ETS1 mutants, the molecular dynamics data disclosed a triad of residues that respond specifically to low-affinity cognate DNA. We found that a DNA-contacting residue (Gln-336) specifically recognizes low-affinity DNA and triggers the loss of a distal salt bridge (Glu-343/Arg-378) via a large side-chain motion that compromises the hydrophobic packing of two core helices. As an intact Glu-343/Arg-378 bridge is the default state in unbound ETS1 and maintained in high-affinity and nonspecific complexes, the low-affinity complex represents a unique conformational adaptation to the suboptimization of developmental enhancers.
Affiliation(s)
| | | | | | | | | | - Gregory M K Poon
- Department of Chemistry and Center for Diagnostics and Therapeutics, Georgia State University, Atlanta, Georgia 30303
| |
|
24
|
Thormann V, Glaser LV, Rothkegel MC, Borschiwer M, Bothe M, Fuchs A, Meijsing SH. Expanding the repertoire of glucocorticoid receptor target genes by engineering genomic response elements. Life Sci Alliance 2019; 2:2/2/e201800283. [PMID: 30867223 PMCID: PMC6417287 DOI: 10.26508/lsa.201800283] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 12/17/2018] [Revised: 03/05/2019] [Accepted: 03/05/2019] [Indexed: 01/25/2023] Open
Abstract
This study shows that addition of a single transcription factor binding site can be sufficient to convert a gene into a glucocorticoid receptor target. The glucocorticoid receptor (GR), a hormone-activated transcription factor, binds to a myriad of genomic binding sites yet seems to regulate a much smaller number of genes. Genome-wide analysis of GR binding and gene regulation has shown that the likelihood of GR-dependent regulation increases with decreased distance of its binding to the transcriptional start site of a gene. To test if we can adopt this knowledge to expand the repertoire of GR target genes, we used CRISPR/Cas-mediated homology-directed repair to add a single GR-binding site directly upstream of the transcriptional start site of each of four genes. To our surprise, we found that the addition of a single GR-binding site can be enough to convert a gene into a GR target. The gain of GR-dependent regulation was observed for two of four genes analyzed and coincided with acquired GR binding at the introduced binding site. However, the gene-specific gain of GR-dependent regulation could not be explained by obvious differences in chromatin accessibility between converted genes and their non-converted counterparts. Furthermore, by introducing GR-binding sequences with different nucleotide compositions, we show that activation can be facilitated by distinct sequences without obvious differences in activity between the GR-binding sequence variants we tested. The approach to use genome engineering to build genomic response elements facilitates the generation of cell lines with tailored repertoires of GR-responsive genes and a framework to test and refine our understanding of the cis-regulatory logic of gene regulation by testing if engineered response elements behave as predicted.
Affiliation(s)
- Verena Thormann
- Max Planck Institute for Molecular Genetics, Berlin, Germany
| | - Laura V Glaser
- Max Planck Institute for Molecular Genetics, Berlin, Germany
| | | | | | - Melissa Bothe
- Max Planck Institute for Molecular Genetics, Berlin, Germany
| | - Alisa Fuchs
- Max Planck Institute for Molecular Genetics, Berlin, Germany
| | | |
|
25
|
Orlomoski R, Bogle A, Loss J, Simons R, Dresch JM, Drewell RA, Spratt DE. Rapid and efficient purification of Drosophila homeodomain transcription factors for biophysical characterization. Protein Expr Purif 2019; 158:9-14. [PMID: 30738927 DOI: 10.1016/j.pep.2019.02.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2019] [Accepted: 02/03/2019] [Indexed: 10/27/2022]
Abstract
Homeodomain transcription factors (HD TFs) are a large class of evolutionarily conserved DNA binding proteins that contain a basic 60-amino acid region required for binding to specific DNA sites. In Drosophila melanogaster, many of these HD TFs are expressed in the early embryo and control transcription of target genes in development through their interaction with cis-regulatory modules. Previous studies where some of the Drosophila HD TFs were purified required the use of strong denaturants (i.e. 6 M urea) and multiple chromatography columns, making the downstream biochemical examination of the isolated protein difficult. To circumvent these obstacles, we have developed a streamlined expression and purification protocol to produce large yields of Drosophila HD TFs. Using the HD TFs FUSHI-TARAZU (FTZ), ANTENNAPEDIA (ANTP), ABDOMINAL-A (ABD-A), ABDOMINAL-B (ABD-B), and ULTRABITHORAX (UBX) as examples, we demonstrate that our 3-day protocol involving the overexpression of His6-SUMO fusion constructs in E. coli followed by a Ni2+-IMAC, SUMO-tag cleavage with the SUMO protease Ulp1, and a heparin column purification produces pure, soluble protein in biological buffers around pH 7 in the absence of denaturants. Electrophoretic mobility shift assays (EMSA) confirm that the purified HD proteins are functional and nuclear magnetic resonance (NMR) spectra confirm that the purified HDs are well-folded. These purified HD TFs can be used in future biophysical experiments to structurally and biochemically characterize how and why these HD TFs bind to different DNA sequences and further probe how nucleotide differences contribute to TF-DNA specificity in the HD family.
Affiliation(s)
- Rachel Orlomoski
- Gustaf H. Carlson School of Chemistry & Biochemistry, Clark University, 950 Main St, Worcester, MA, 01610, USA; Department of Biology, Clark University, 950 Main St, Worcester, MA, 01610, USA
| | - Aaron Bogle
- Gustaf H. Carlson School of Chemistry & Biochemistry, Clark University, 950 Main St, Worcester, MA, 01610, USA; Department of Biology, Clark University, 950 Main St, Worcester, MA, 01610, USA
| | - Jeanmarie Loss
- Gustaf H. Carlson School of Chemistry & Biochemistry, Clark University, 950 Main St, Worcester, MA, 01610, USA; Department of Biology, Clark University, 950 Main St, Worcester, MA, 01610, USA
| | - Rylee Simons
- Gustaf H. Carlson School of Chemistry & Biochemistry, Clark University, 950 Main St, Worcester, MA, 01610, USA; Department of Biology, Clark University, 950 Main St, Worcester, MA, 01610, USA
| | - Jacqueline M Dresch
- Department of Math & Computer Science, Clark University, 950 Main St, Worcester, MA, 01610, USA
| | - Robert A Drewell
- Department of Biology, Clark University, 950 Main St, Worcester, MA, 01610, USA.
| | - Donald E Spratt
- Gustaf H. Carlson School of Chemistry & Biochemistry, Clark University, 950 Main St, Worcester, MA, 01610, USA.
| |
|
26
|
Thermodynamic model of gene regulation for the Or59b olfactory receptor in Drosophila. PLoS Comput Biol 2019; 15:e1006709. [PMID: 30653495 PMCID: PMC6353224 DOI: 10.1371/journal.pcbi.1006709] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 05/03/2018] [Revised: 01/30/2019] [Accepted: 12/07/2018] [Indexed: 12/22/2022] Open
Abstract
Complex eukaryotic promoters normally contain multiple cis-regulatory sequences for different transcription factors (TFs). The binding patterns of the TFs to these sites, as well as the way the TFs interact with each other and with the RNA polymerase (RNAp), lead to combinatorial problems rarely understood in detail, especially under varying epigenetic conditions. The aim of this paper is to build a model describing how the main regulatory cluster of the olfactory receptor Or59b drives transcription of this gene in Drosophila. The cluster-driven expression of this gene is represented as the equilibrium probability of RNAp being bound to the promoter region, using a statistical thermodynamic approach. The RNAp equilibrium probability is computed in terms of the occupancy probabilities of the single TFs of the cluster at the corresponding binding sites, and of the interaction rules among TFs and RNAp, using experimental data of Or59b expression to tune the model parameters. The model correctly reproduces the changes in RNAp binding probability induced by various mutations of specific sites and by epigenetic modifications. Some of its predictions have also been validated in novel experiments.
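The statistical thermodynamic approach described in this abstract, in which expression is represented by the equilibrium probability of RNAp being bound, can be illustrated with a generic sketch. It enumerates all occupancy states of a small set of sites and takes the ratio of Boltzmann weights. This is a textbook-style formulation, not the Or59b model itself: the site weights, the pairwise interaction factors, and the function name are hypothetical.

```python
from itertools import product
from typing import Dict, Sequence, Tuple


def rnap_binding_probability(q: Sequence[float],
                             omega: Dict[Tuple[int, int], float],
                             rnap_index: int) -> float:
    """Equilibrium probability that the site at rnap_index is occupied.

    q[i] is the statistical weight of site i being occupied (concentration
    times binding constant); omega[(i, j)] is a multiplicative interaction
    factor applied when sites i and j are both occupied (>1 cooperative,
    <1 repressive). All values are illustrative assumptions.
    """
    n_sites = len(q)
    z_total = 0.0   # partition function over all 2^n occupancy states
    z_bound = 0.0   # partial sum over states with RNAp bound
    for state in product((0, 1), repeat=n_sites):
        weight = 1.0
        for i in range(n_sites):
            if state[i]:
                weight *= q[i]
        for (i, j), w_ij in omega.items():
            if state[i] and state[j]:
                weight *= w_ij
        z_total += weight
        if state[rnap_index]:
            z_bound += weight
    return z_bound / z_total


# Site 0: an activating TF; site 1: RNAp. Numbers are invented for illustration.
p_with_activation = rnap_binding_probability([2.0, 0.1], {(0, 1): 10.0}, rnap_index=1)
p_independent = rnap_binding_probability([2.0, 0.1], {}, rnap_index=1)
```

With these toy numbers the cooperative interaction raises the RNAp occupancy from about 0.09 to about 0.41. Mutating a site can be mimicked by setting its weight to zero and recomputing the probability.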
|
27
|
Weingarten-Gabbay S, Nir R, Lubliner S, Sharon E, Kalma Y, Weinberger A, Segal E. Systematic interrogation of human promoters. Genome Res 2019; 29:171-183. [PMID: 30622120 PMCID: PMC6360817 DOI: 10.1101/gr.236075.118] [Citation(s) in RCA: 67] [Impact Index Per Article: 13.4] [Received: 02/14/2018] [Accepted: 12/05/2018] [Indexed: 12/19/2022]
Abstract
Despite much research, our understanding of the architecture and cis-regulatory elements of human promoters is still lacking. Here, we devised a high-throughput assay to quantify the activity of approximately 15,000 fully designed sequences that we integrated and expressed from a fixed location within the human genome. We used this method to investigate thousands of native promoters and preinitiation complex (PIC) binding regions followed by in-depth characterization of the sequence motifs underlying promoter activity, including core promoter elements and TF binding sites. We find that core promoters drive transcription mostly unidirectionally and that sequences originating from promoters exhibit stronger activity than those originating from enhancers. By testing multiple synthetic configurations of core promoter elements, we dissect the motifs that positively and negatively regulate transcription as well as the effect of their combinations and distances, including a 10-bp periodicity in the optimal distance between the TATA and the initiator. By comprehensively screening 133 TF binding sites, we find that in contrast to core promoters, TF binding sites maintain similar activity levels in both orientations, supporting a model by which divergent transcription is driven by two distinct unidirectional core promoters sharing bidirectional TF binding sites. Finally, we find a striking agreement between the effect of binding site multiplicity of individual TFs in our assay and their tendency to appear in homotypic clusters throughout the genome. Overall, our study systematically assays the elements that drive expression in core and proximal promoter regions and sheds light on organization principles of regulatory regions in the human genome.
Affiliation(s)
- Shira Weingarten-Gabbay
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 76100, Israel.,Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot 76100, Israel
| | - Ronit Nir
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 76100, Israel.,Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot 76100, Israel
| | - Shai Lubliner
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 76100, Israel.,Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot 76100, Israel
| | - Eilon Sharon
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 76100, Israel.,Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot 76100, Israel
| | - Yael Kalma
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 76100, Israel.,Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot 76100, Israel
| | - Adina Weinberger
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 76100, Israel.,Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot 76100, Israel
| | - Eran Segal
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 76100, Israel.,Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot 76100, Israel
| |
|
28
|
Martins-Santana L, Nora LC, Sanches-Medeiros A, Lovate GL, Cassiano MHA, Silva-Rocha R. Systems and Synthetic Biology Approaches to Engineer Fungi for Fine Chemical Production. Front Bioeng Biotechnol 2018; 6:117. [PMID: 30338257 PMCID: PMC6178918 DOI: 10.3389/fbioe.2018.00117] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Received: 04/09/2018] [Accepted: 08/02/2018] [Indexed: 01/16/2023] Open
Abstract
Since the advent of systems and synthetic biology, many studies have sought to harness microbes as cell factories through genetic and metabolic engineering approaches. Yeast and filamentous fungi have been successfully harnessed to produce fine and high value-added chemical products. In this review, we present some of the most promising advances from recent years in the use of fungi for this purpose, focusing on the manipulation of fungal strains using systems and synthetic biology tools to improve metabolic flow and the flow of secondary metabolites by pathway redesign. We also review the roles of bioinformatics analysis and predictions in synthetic circuits, highlighting in silico systemic approaches to improve the efficiency of synthetic modules.
Affiliation(s)
- Leonardo Martins-Santana
- Systems and Synthetic Biology Laboratory, Cell and Molecular Biology Department, Ribeirão Preto Medical School, São Paulo University (FMRP-USP), Ribeirão Preto, Brazil
| | - Luisa C Nora
- Systems and Synthetic Biology Laboratory, Cell and Molecular Biology Department, Ribeirão Preto Medical School, São Paulo University (FMRP-USP), Ribeirão Preto, Brazil
| | - Ananda Sanches-Medeiros
- Systems and Synthetic Biology Laboratory, Cell and Molecular Biology Department, Ribeirão Preto Medical School, São Paulo University (FMRP-USP), Ribeirão Preto, Brazil
| | - Gabriel L Lovate
- Systems and Synthetic Biology Laboratory, Cell and Molecular Biology Department, Ribeirão Preto Medical School, São Paulo University (FMRP-USP), Ribeirão Preto, Brazil
| | - Murilo H A Cassiano
- Systems and Synthetic Biology Laboratory, Cell and Molecular Biology Department, Ribeirão Preto Medical School, São Paulo University (FMRP-USP), Ribeirão Preto, Brazil
| | - Rafael Silva-Rocha
- Systems and Synthetic Biology Laboratory, Cell and Molecular Biology Department, Ribeirão Preto Medical School, São Paulo University (FMRP-USP), Ribeirão Preto, Brazil
| |
|
29
|
Prielhofer R, Reichinger M, Wagner N, Claes K, Kiziak C, Gasser B, Mattanovich D. Superior protein titers in half the fermentation time: Promoter and process engineering for the glucose-regulated GTH1 promoter of Pichia pastoris. Biotechnol Bioeng 2018; 115:2479-2488. [PMID: 30016537 PMCID: PMC6221138 DOI: 10.1002/bit.26800] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 04/03/2018] [Revised: 06/01/2018] [Accepted: 07/02/2018] [Indexed: 12/17/2022]
Abstract
Protein production in Pichia pastoris is often based on the methanol-inducible PAOX1 promoter, which drives the expression of the target gene. The use of methanol has major drawbacks, so there is a demand for alternative promoters with good induction properties, such as the glucose-regulated PGTH1 promoter which we reported recently. To further increase its potential, we investigated its regulation in more detail by screening promoter variants harboring deletions and mutations. We thereby identified the main regulatory region and important putative transcription factor binding sites of PGTH1. From these results we conclude that yeast metabolic regulators, monomeric Gal4-class motifs, carbon source-responsive elements, and yeast GC-box proteins likely contribute to the regulation of the promoter. We engineered a PGTH1 variant with greatly enhanced induction properties compared with those of the wild-type promoter. Based on that, a model-based bioprocess design for high volumetric productivity in a limited time was developed for the PGTH1 variant, employing a glucose fed-batch strategy that clearly outperformed a classical methanol fed-batch of a PAOX1 strain in terms of titer and process performance.
Affiliation(s)
- Roland Prielhofer
- Department of Biotechnology, BOKU-University of Natural Resources and Life Sciences Vienna, Muthgasse, Austria
| | - Brigitte Gasser
- Department of Biotechnology, BOKU-University of Natural Resources and Life Sciences Vienna, Muthgasse, Austria.,Christian Doppler-Laboratory for Growth-decoupled Protein Production in Yeast, BOKU-University of Natural Resources and Life Sciences Vienna, Muthgasse, Austria
| | - Diethard Mattanovich
- Department of Biotechnology, BOKU-University of Natural Resources and Life Sciences Vienna, Muthgasse, Austria
| |
|
30
|
Serrão VHB, Silva IR, da Silva MTA, Scortecci JF, de Freitas Fernandes A, Thiemann OH. The unique tRNASec and its role in selenocysteine biosynthesis. Amino Acids 2018; 50:1145-1167. [DOI: 10.1007/s00726-018-2595-6] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 03/13/2018] [Accepted: 05/26/2018] [Indexed: 12/26/2022]
|
31
|
A New Algorithm for Identifying Cis-Regulatory Modules Based on Hidden Markov Model. BioMed Research International 2018; 2017:6274513. [PMID: 28497059 PMCID: PMC5405574 DOI: 10.1155/2017/6274513] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 10/23/2016] [Revised: 03/06/2017] [Accepted: 03/23/2017] [Indexed: 11/24/2022]
Abstract
The discovery of cis-regulatory modules (CRMs) is the key to understanding mechanisms of transcription regulation. Since CRMs have specific regulatory structures that are the basis for the regulation of gene expression, how to model the regulatory structure of CRMs has a considerable impact on the performance of CRM identification. The paper proposes a CRM discovery algorithm called ComSPS. ComSPS builds a regulatory structure model of CRMs based on HMM by exploring the rules of CRM transcriptional grammar that governs the internal motif site arrangement of CRMs. We test ComSPS on three benchmark datasets and compare it with five existing methods. Experimental results show that ComSPS performs better than them.
|
32
|
Dubois-Chevalier J, Mazrooei P, Lupien M, Staels B, Lefebvre P, Eeckhoute J. Organizing combinatorial transcription factor recruitment at cis-regulatory modules. Transcription 2017; 9:233-239. [PMID: 29105538 DOI: 10.1080/21541264.2017.1394424] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 01/14/2023] Open
Abstract
Gene transcriptional regulation relies on cis-regulatory DNA modules (CRMs), which serve as nexus sites for integration of multiple transcription factor (TF) activities. Here, we provide evidence and discuss recent literature indicating that TF recruitment to CRMs is organized into combinations of trans-regulatory protein modules (TRMs). We propose that TRMs are functional entities composed of TFs displaying the most highly interdependent chromatin binding which are, in addition, able to modulate their recruitment to CRMs through inter-TRM effects. These findings shed light on the architectural organization of TF recruitment encoded by their recognition motifs within CRMs.
Affiliation(s)
- Julie Dubois-Chevalier
- a Université de Lille - Inserm - Chru de Lille, Institut Pasteur de Lille, U1011-EGID, F-59000 Lille, France
| | - Parisa Mazrooei
- b The Princess Margaret Cancer Centre, University Health Network, Department of Medical Biophysics, University of Toronto, Toronto, ON M5G 1L7, Canada
| | - Mathieu Lupien
- b The Princess Margaret Cancer Centre, University Health Network, Department of Medical Biophysics, University of Toronto, Toronto, ON M5G 1L7, Canada
| | - Bart Staels
- a Université de Lille - Inserm - Chru de Lille, Institut Pasteur de Lille, U1011-EGID, F-59000 Lille, France
| | - Philippe Lefebvre
- a Université de Lille - Inserm - Chru de Lille, Institut Pasteur de Lille, U1011-EGID, F-59000 Lille, France
| | - Jérôme Eeckhoute
- a Université de Lille - Inserm - Chru de Lille, Institut Pasteur de Lille, U1011-EGID, F-59000 Lille, France
| |
|
33
|
Brown AJ, Gibson SJ, Hatton D, James DC. In silico design of context-responsive mammalian promoters with user-defined functionality. Nucleic Acids Res 2017; 45:10906-10919. [PMID: 28977454 PMCID: PMC5737543 DOI: 10.1093/nar/gkx768] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 05/23/2017] [Accepted: 08/22/2017] [Indexed: 12/19/2022] Open
Abstract
Comprehensive de novo-design of complex mammalian promoters is restricted by unpredictable combinatorial interactions between constituent transcription factor regulatory elements (TFREs). In this study, we show that modular binding sites that do not function cooperatively can be identified by analyzing host cell transcription factor expression profiles, and subsequently testing cognate TFRE activities in varying homotypic and heterotypic promoter architectures. TFREs that displayed position-insensitive, additive function within a specific expression context could be rationally combined together in silico to create promoters with highly predictable activities. As TFRE order and spacing did not affect the performance of these TFRE-combinations, compositions could be specifically arranged to preclude the formation of undesirable sequence features. This facilitated simple in silico-design of promoters with context-required, user-defined functionalities. To demonstrate this, we de novo-created promoters for biopharmaceutical production in CHO cells that exhibited precisely designed activity dynamics and long-term expression-stability, without causing observable retroactive effects on cellular performance. The design process described can be utilized for applications requiring context-responsive, customizable promoter function, particularly where co-expression of synthetic TFs is not suitable. Although the synthetic promoter structure utilized does not closely resemble native mammalian architectures, our findings also provide additional support for a flexible billboard model of promoter regulation.
Affiliation(s)
- Adam J Brown
- Department of Chemical and Biological Engineering, University of Sheffield, Mappin St., Sheffield S1 3JD, UK
| | - Suzanne J Gibson
- Biopharmaceutical Development, MedImmune, Cambridge CB21 6GH, UK
| | - Diane Hatton
- Biopharmaceutical Development, MedImmune, Cambridge CB21 6GH, UK
| | - David C James
- Department of Chemical and Biological Engineering, University of Sheffield, Mappin St., Sheffield S1 3JD, UK
| |
|
34
|
Kreimer A, Zeng H, Edwards MD, Guo Y, Tian K, Shin S, Welch R, Wainberg M, Mohan R, Sinnott-Armstrong NA, Li Y, Eraslan G, Amin TB, Goke J, Mueller NS, Kellis M, Kundaje A, Beer MA, Keles S, Gifford DK, Yosef N. Predicting gene expression in massively parallel reporter assays: A comparative study. Hum Mutat 2017; 38:1240-1250. [PMID: 28220625 PMCID: PMC5560998 DOI: 10.1002/humu.23197] [Citation(s) in RCA: 31] [Impact Index Per Article: 4.4] [Received: 08/26/2016] [Revised: 01/19/2017] [Accepted: 02/12/2017] [Indexed: 02/03/2023]
Abstract
In many human diseases, associated genetic changes tend to occur within noncoding regions, whose effect might be related to transcriptional control. A central goal in human genetics is to understand the function of such noncoding regions: given a region that is statistically associated with changes in gene expression (expression quantitative trait locus [eQTL]), does it in fact play a regulatory role? And if so, how is this role "coded" in its sequence? These questions were the subject of the Critical Assessment of Genome Interpretation eQTL challenge. Participants were given a set of sequences that flank eQTLs in humans and were asked to predict whether these are capable of regulating transcription (as evaluated by massively parallel reporter assays), and whether this capability changes between alternative alleles. Here, we report lessons learned from this community effort. By inspecting predictive properties in isolation, and conducting meta-analysis over the competing methods, we find that using chromatin accessibility and transcription factor binding as features in an ensemble of classifiers or regression models leads to the most accurate results. We then characterize the loci that are harder to predict, putting the spotlight on areas of weakness, which we expect to be the subject of future studies.
Affiliation(s)
- Anat Kreimer
- Department of Electrical Engineering and Computer Science and Center for Computational Biology, University of California, Berkeley, Berkeley, CA 94720, USA
- Department of Bioengineering and Therapeutic Sciences, Institute for Human Genetics, University of California, San Francisco, San Francisco, California, USA
| | - Haoyang Zeng
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
| | - Matthew D. Edwards
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
| | - Yuchun Guo
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
| | - Kevin Tian
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
| | - Sunyoung Shin
- Department of Statistics, Department of Biostatistics and Medical Informatics University of Wisconsin-Madison, Madison, Wisconsin, USA
| | - Rene Welch
- Department of Statistics, Department of Biostatistics and Medical Informatics University of Wisconsin-Madison, Madison, Wisconsin, USA
| | - Michael Wainberg
- Department of Genetics, Stanford University School of Medicine, Department of Computer Science, Stanford, California 94305, USA
| | - Rahul Mohan
- Department of Genetics, Stanford University School of Medicine, Department of Computer Science, Stanford, California 94305, USA
| | - Nicholas A. Sinnott-Armstrong
- Department of Genetics, Stanford University School of Medicine, Department of Computer Science, Stanford, California 94305, USA
| | - Yue Li
- Computer Science and Artificial Intelligence Lab, Massachusetts Institute of Technology, 32 Vassar St, Cambridge, Massachusetts 02139, USA
| | - Gökcen Eraslan
- Computational Cell Maps, Institute of Computational Biology, Helmholtz Zentrum München, Ingolstädter Landstr. 1 85764 Neuherberg, Germany
| | - Talal Bin Amin
- Computational and Systems Biology, Genome Institute of Singapore, Singapore 138672, Singapore
| | - Jonathan Goke
- Computational and Systems Biology, Genome Institute of Singapore, Singapore 138672, Singapore
| | - Nikola S. Mueller
- Computational Cell Maps, Institute of Computational Biology, Helmholtz Zentrum München, Ingolstädter Landstr. 1 85764 Neuherberg, Germany
| | - Manolis Kellis
- Computer Science and Artificial Intelligence Lab, Massachusetts Institute of Technology, 32 Vassar St, Cambridge, Massachusetts 02139, USA
| | - Anshul Kundaje
- Department of Genetics, Stanford University School of Medicine, Department of Computer Science, Stanford, California 94305, USA
| | - Michael A Beer
- McKusick-Nathans Institute of Genetic Medicine, Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
| | - Sunduz Keles
- Department of Statistics, Department of Biostatistics and Medical Informatics University of Wisconsin-Madison, Madison, Wisconsin, USA
| | - David K. Gifford
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02142, USA
| | - Nir Yosef
- Department of Electrical Engineering and Computer Science and Center for Computational Biology, University of California, Berkeley, Berkeley, CA 94720, USA
- Ragon Institute of Massachusetts General Hospital, MIT and Harvard, Cambridge, MA, 02139
| |
|
35
|
Gritsenko AA, Weingarten-Gabbay S, Elias-Kirma S, Nir R, de Ridder D, Segal E. Sequence features of viral and human Internal Ribosome Entry Sites predictive of their activity. PLoS Comput Biol 2017; 13:e1005734. [PMID: 28922394 PMCID: PMC5630158 DOI: 10.1371/journal.pcbi.1005734] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 09/07/2016] [Revised: 10/06/2017] [Accepted: 08/22/2017] [Indexed: 01/25/2023] Open
Abstract
Translation of mRNAs through Internal Ribosome Entry Sites (IRESs) has emerged as a prominent mechanism of cellular and viral initiation. It supports cap-independent translation of select cellular genes under normal conditions, and in conditions when cap-dependent translation is inhibited. IRES structure and sequence are believed to be involved in this process. However, due to the small number of IRESs known, there have been no systematic investigations of the determinants of IRES activity. With the recent discovery of thousands of novel IRESs in human and viruses, the next challenge is to decipher the sequence determinants of IRES activity. We present the first in-depth computational analysis of a large body of IRESs, exploring RNA sequence features predictive of IRES activity. We identified predictive k-mer features resembling IRES trans-acting factor (ITAF) binding motifs across human and viral IRESs, and found that their effect on expression depends on their sequence, number and position. Our results also suggest that the architecture of retroviral IRESs differs from that of other viruses, presumably due to their exposure to the nuclear environment. Finally, we measured IRES activity of synthetically designed sequences to confirm our prediction of increasing activity as a function of the number of short IRES elements.
Affiliation(s)
- Alexey A. Gritsenko
- The Delft Bioinformatics Laboratory, Department of Intelligent Systems, Delft University of Technology, Delft, The Netherlands
- Platform Green Synthetic Biology, Delft, The Netherlands
- Kluyver Centre for Genomics of Industrial Fermentation, Delft, The Netherlands
| | - Shira Weingarten-Gabbay
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot, Israel
- Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
| | - Shani Elias-Kirma
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot, Israel
- Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
| | - Ronit Nir
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot, Israel
- Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
| | - Dick de Ridder
- The Delft Bioinformatics Laboratory, Department of Intelligent Systems, Delft University of Technology, Delft, The Netherlands
- Platform Green Synthetic Biology, Delft, The Netherlands
- Kluyver Centre for Genomics of Industrial Fermentation, Delft, The Netherlands
- Bioinformatics Group, Wageningen University, Wageningen, The Netherlands
| | - Eran Segal
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot, Israel
- Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
| |
|
36
|
Vockley CM, McDowell IC, D'Ippolito AM, Reddy TE. A long-range flexible billboard model of gene activation. Transcription 2017; 8:261-267. [PMID: 28598247 DOI: 10.1080/21541264.2017.1317694] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Indexed: 10/19/2022] Open
Abstract
Gene regulation is fundamentally important for the coordination of diverse biologic processes including homeostasis and responses to developmental and environmental stimuli. Transcription factor (TF) binding sites are one of the major functional subunits of gene regulation. They are arranged in cis-regulatory modules (CRMs) that can be more active than the sum of their individual effects. Recently, we described a mechanism of glucocorticoid (GC)-induced gene regulation in which the glucocorticoid receptor (GR) binds coordinately to multiple CRMs that are 10s of kilobases apart in the genome. In those results, the minority of GR binding sites appear to involve direct TF:DNA interactions. Meanwhile, other GR binding sites in a cluster interact with those direct binding sites to tune their gene regulatory activity. Here, we consider the implications of those and related results in the context of existing models of gene regulation. Based on our analyses, we propose that the billboard and regulatory grammar models of cis-regulatory element activity be expanded to consider the influence of long-range interactions between cis-regulatory modules.
Affiliation(s)
- Christopher M Vockley
- a Department of Biostatistics & Bioinformatics, Duke University, Durham, NC, USA.,b Center for Genomic & Computational Biology, Duke University, Durham, NC, USA
| | - Ian C McDowell
- b Center for Genomic & Computational Biology, Duke University, Durham, NC, USA.,c Program in Computational Biology & Bioinformatics, Duke University, Durham, NC, USA
| | - Antony M D'Ippolito
- b Center for Genomic & Computational Biology, Duke University, Durham, NC, USA.,d University Program in Genetics & Genomics, Duke University, Durham, NC, USA
| | - Timothy E Reddy
- a Department of Biostatistics & Bioinformatics, Duke University, Durham, NC, USA.,b Center for Genomic & Computational Biology, Duke University, Durham, NC, USA
| |
|
37
|
Singh P, Han EH, Endrizzi JA, O'Brien RM, Chi YI. Crystal structures reveal a new and novel FoxO1 binding site within the human glucose-6-phosphatase catalytic subunit 1 gene promoter. J Struct Biol 2017; 198:54-64. [PMID: 28223045 DOI: 10.1016/j.jsb.2017.02.006] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 01/11/2017] [Revised: 02/10/2017] [Accepted: 02/14/2017] [Indexed: 01/07/2023]
Abstract
Human glucose-6-phosphatase plays a vital role in blood glucose homeostasis and holds promise as a therapeutic target for diabetes. Expression of its catalytic subunit gene 1 (G6PC1) is tightly regulated by metabolic-response transcription factors such as FoxO1 and CREB. Although at least three potential FoxO1 binding sites (insulin response elements, IREs) and one CREB binding site (cAMP response element, CRE) within the proximal region of the G6PC1 promoter have been identified, the interplay between FoxO1 and CREB and between FoxO1 bound at multiple IREs has not been well characterized. Here we present the crystal structures of the FoxO1 DNA binding domain in complex with the G6PC1 promoter. These complexes reveal the presence of a new non-consensus FoxO1 binding site that overlaps the CRE, suggesting a mutual exclusion mechanism for FoxO1 and CREB binding at the G6PC1 promoter. Additional findings include (i) non-canonical FoxO1 recognition sites, (ii) incomplete FoxO1 occupancies at the available IRE sites, and (iii) FoxO1 dimeric interactions that may play a role in stabilizing DNA looping. These findings provide insight into the regulation of G6PC1 gene transcription by FoxO1, and demonstrate a high versatility of target gene recognition by FoxO1 that correlates with its diverse roles in biology.
Affiliation(s)
- Puja Singh
- Section of Structural Biology, Hormel Institute, University of Minnesota, Austin, MN 55912, United States
| | - Eun Hee Han
- Section of Structural Biology, Hormel Institute, University of Minnesota, Austin, MN 55912, United States
| | - James A Endrizzi
- Section of Structural Biology, Hormel Institute, University of Minnesota, Austin, MN 55912, United States
| | - Richard M O'Brien
- Department of Molecular Physiology and Biophysics, Vanderbilt University School of Medicine, Nashville, TN 37232, United States.
| | - Young-In Chi
- Section of Structural Biology, Hormel Institute, University of Minnesota, Austin, MN 55912, United States.
| |
|
38
|
Guo Y, Gifford DK. Modular combinatorial binding among human trans-acting factors reveals direct and indirect factor binding. BMC Genomics 2017; 18:45. [PMID: 28061806 PMCID: PMC5219757 DOI: 10.1186/s12864-016-3434-3] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 04/28/2016] [Accepted: 12/19/2016] [Indexed: 11/25/2022] Open
Abstract
Background: The combinatorial binding of trans-acting factors (TFs) to the DNA is critical to the spatial and temporal specificity of gene regulation. For certain regulatory regions, more than one regulatory module (set of TFs that bind together) are combined to achieve context-specific gene regulation. However, previous approaches are limited to either pairwise TF co-association analysis or assuming that only one module is used in each regulatory region. Results: We present a new computational approach that models the modular organization of TF combinatorial binding. Our method learns compact and coherent regulatory modules from in vivo binding data using a topic model. We found that the binding of 115 TFs in K562 cells can be organized into 49 interpretable modules. Furthermore, we found that tens of thousands of regulatory regions use multiple modules, a structure that cannot be observed with previous hard clustering based methods. The modules discovered recapitulate many published protein-protein physical interactions, have consistent functional annotations of chromatin states, and uncover context-specific co-binding such as gene proximal binding of NFY + FOS + SP and distal binding of NFY + FOS + USF. For certain TFs, the co-binding partners of direct binding (motif present) differ from those of indirect binding (motif absent); the distinct set of co-binding partners can predict whether the TF binds directly or indirectly with up to 95% accuracy. Joint analysis across two cell types reveals both cell-type-specific and shared regulatory modules. Conclusions: Our results provide comprehensive cell-type-specific combinatorial binding maps and suggest a modular organization of combinatorial binding.
Affiliation(s)
- Yuchun Guo
- MIT, Computer Science and Artificial Intelligence Laboratory, Cambridge, MA, 02139, USA
- David K Gifford
- MIT, Computer Science and Artificial Intelligence Laboratory, Cambridge, MA, 02139, USA.
|
39
|
Abstract
Promoter functionality is highly context dependent, as exemplified by gene-specific expression profiles across different tissues and cell types. Cell type-specific promoter regulation is a function of each cell's unique complement of transcriptional machinery components. Accordingly, to achieve high levels of transcriptional activity within a particular cell type, synthetic promoters must be specifically designed to harness that cell type's discrete repertoire of available transcription factors. Here, we describe a method for constructing very strong cell type-specific synthetic promoters for use in any given mammalian host cell. Transcription factor regulatory elements (TFREs; or transcription factor binding sites) that can independently mediate activation of recombinant gene transcription in the chosen host cells by using available transcription factor activity are identified and utilized as building blocks to construct novel promoter sequences with varying activities. Bioinformatics analysis of the synthetic promoters' TFRE compositions is then performed to determine how differing relative TFRE abundances explain variations in relative promoter activities. This information is used to derive an optimal second-generation promoter library construction design space, such that promoters with maximal transcriptional activity in the host cell type can be created.
Affiliation(s)
- Adam J Brown
- Department of Chemical and Biological Engineering, University of Sheffield, Mappin Street, Sheffield, S1 3JD, England, UK.
- David C James
- Department of Chemical and Biological Engineering, University of Sheffield, Mappin Street, Sheffield, S1 3JD, England, UK
|
40
|
Scholes C, DePace AH, Sánchez Á. Combinatorial Gene Regulation through Kinetic Control of the Transcription Cycle. Cell Syst 2016; 4:97-108.e9. [PMID: 28041762 DOI: 10.1016/j.cels.2016.11.012] [Citation(s) in RCA: 50] [Impact Index Per Article: 6.3] [Received: 02/09/2016] [Revised: 08/09/2016] [Accepted: 11/23/2016] [Indexed: 11/20/2022]
Abstract
Cells decide when, where, and to what level to express their genes by "computing" information from transcription factors (TFs) binding to regulatory DNA. How is the information contained in multiple TF-binding sites integrated to dictate the rate of transcription? The dominant conceptual and quantitative model is that TFs combinatorially recruit one another and RNA polymerase to the promoter by direct physical interactions. Here, we develop a quantitative framework to explore kinetic control, an alternative model in which combinatorial gene regulation can result from TFs working on different kinetic steps of the transcription cycle. Kinetic control can generate a wide range of analog and Boolean computations without requiring the input TFs to be simultaneously bound to regulatory DNA. We propose experiments that will illuminate the role of kinetic control in transcription and discuss implications for deciphering the cis-regulatory "code."
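The idea of kinetic control can be illustrated with a minimal sketch. The code below is not the authors' framework; it assumes a hypothetical two-step, irreversible transcription cycle in which one TF (called A here) speeds up the first step and another (B) speeds up the second. The function names, basal rates, and fold acceleration are illustrative assumptions. For sequential irreversible steps, the steady-state rate is the reciprocal of the summed mean step times.

```python
def transcription_rate(step_rates):
    """Steady-state rate of a cycle of sequential irreversible steps.

    The mean time per cycle is the sum of the mean step times (1/k_i),
    so the rate is the reciprocal of that sum.
    """
    return 1.0 / sum(1.0 / k for k in step_rates)


def rate_with_tfs(a_present, b_present, basal=(0.1, 0.1), boost=100.0):
    """Rate when hypothetical TF A accelerates step 1 and TF B step 2.

    `basal` and `boost` are illustrative values, not measured parameters.
    """
    k1 = basal[0] * (boost if a_present else 1.0)
    k2 = basal[1] * (boost if b_present else 1.0)
    return transcription_rate((k1, k2))


if __name__ == "__main__":
    # Print the rate for each combination of the two inputs.
    for a in (False, True):
        for b in (False, True):
            print(f"A={a!s:5} B={b!s:5} rate={rate_with_tfs(a, b):.4f}")
```

With these assumed numbers, either TF alone roughly doubles the rate because the other step remains limiting, while both together raise it about a hundredfold. This AND-like response arises without any direct interaction between the two TFs in the model.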
Affiliation(s)
- Clarissa Scholes
- Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA; Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA 02138, USA
- Angela H DePace
- Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA.
- Álvaro Sánchez
- The Rowland Institute at Harvard, Harvard University, Cambridge, MA 02142, USA.
|
41
|
Weingarten-Gabbay S, Segal E. Toward a systematic understanding of translational regulatory elements in human and viruses. RNA Biol 2016; 13:927-933. [PMID: 27442807 DOI: 10.1080/15476286.2016.1212802] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 12/27/2022] Open
Abstract
Translational regulation is a critical step in the production of proteins from genomic material in both human and viruses. However, unlike other steps of the central dogma, such as transcriptional regulation, little is known about the cis-regulatory elements involved. In a recent study we devised a high-throughput bicistronic reporter assay for the discovery and the characterization of thousands of novel Internal Ribosome Entry Sites (IRESs) in human and hundreds of viral genomes. Our results provide insights into the landscape of IRES elements in human and viral transcripts and the cis-regulatory sequences underlying their activity. Here, we discuss these results as well as emerging insights from other studies, providing new views about translational regulation in human and viruses. In addition, we highlight recent high-throughput technologies in the field and discuss how combining insights from high- and low-throughput approaches can illuminate yet uncovered mechanisms of translational regulation.
Affiliation(s)
- Shira Weingarten-Gabbay
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot, Israel; Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
- Eran Segal
- Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot, Israel; Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
|
42
|
|
43
|
Shabbir Hussain M, Gambill L, Smith S, Blenner MA. Engineering Promoter Architecture in Oleaginous Yeast Yarrowia lipolytica. ACS Synth Biol 2016; 5:213-23. [PMID: 26635071 DOI: 10.1021/acssynbio.5b00100] [Citation(s) in RCA: 67] [Impact Index Per Article: 8.4] [Indexed: 01/02/2023]
Abstract
Eukaryotic promoters have a complex architecture to control both the strength and timing of gene transcription spanning up to thousands of bases from the initiation site. This complexity makes rational fine-tuning of promoters in fungi difficult to predict; however, this very same complexity enables multiple possible strategies for engineering promoter strength. Here, we studied promoter architecture in the oleaginous yeast, Yarrowia lipolytica. While recent studies have focused on upstream activating sequences, we systematically examined various components common in fungal promoters. Here, we examine several promoter components including upstream activating sequences, proximal promoter sequences, core promoters, and the TATA box in autonomously replicating expression plasmids and integrated into the genome. Our findings show that promoter strength can be fine-tuned through the engineering of the TATA box sequence, core promoter, and upstream activating sequences. Additionally, we identified a previously unreported oleic acid responsive transcription enhancement in the XPR2 upstream activating sequences, which illustrates the complexity of fungal promoters. The promoters engineered here provide new genetic tools for metabolic engineering in Y. lipolytica and provide promoter engineering strategies that may be useful in engineering other non-model fungal systems.
Affiliation(s)
- Murtaza Shabbir Hussain
- Department of Chemical and Biomolecular Engineering and Department of Genetics and Biochemistry, Clemson University, Clemson, South Carolina 29634, United States
- Lauren Gambill
- Department of Chemical and Biomolecular Engineering and Department of Genetics and Biochemistry, Clemson University, Clemson, South Carolina 29634, United States
- Spencer Smith
- Department of Chemical and Biomolecular Engineering and Department of Genetics and Biochemistry, Clemson University, Clemson, South Carolina 29634, United States
- Mark A. Blenner
- Department of Chemical and Biomolecular Engineering and Department of Genetics and Biochemistry, Clemson University, Clemson, South Carolina 29634, United States
|
44
|
Estrada J, Ruiz-Herrero T, Scholes C, Wunderlich Z, DePace AH. SiteOut: An Online Tool to Design Binding Site-Free DNA Sequences. PLoS One 2016; 11:e0151740. [PMID: 26987123 PMCID: PMC4795680 DOI: 10.1371/journal.pone.0151740] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 01/12/2016] [Accepted: 03/03/2016] [Indexed: 11/18/2022] Open
Abstract
DNA-binding proteins control many fundamental biological processes such as transcription, recombination and replication. A major goal is to decipher the role that DNA sequence plays in orchestrating the binding and activity of such regulatory proteins. To address this goal, it is useful to rationally design DNA sequences with desired numbers, affinities and arrangements of protein binding sites. However, removing binding sites from DNA is computationally non-trivial since one risks creating new sites in the process of deleting or moving others. Here we present an online binding site removal tool, SiteOut, that enables users to design arbitrary DNA sequences that entirely lack binding sites for factors of interest. SiteOut can also be used to delete sites from a specific sequence, or to introduce site-free spacers between functional sequences without creating new sites at the junctions. In combination with commercial DNA synthesis services, SiteOut provides a powerful and flexible platform for synthetic projects that interrogate regulatory DNA. Here we describe the algorithm and illustrate the ways in which SiteOut can be used; it is publicly available at https://depace.med.harvard.edu/siteout/.
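The difficulty described above, that removing one site can create another, can be sketched in a few lines. The code below is not the SiteOut algorithm; it is a simplified illustration that assumes binding sites are exact string motifs on the forward strand only (real tools typically score position weight matrices on both strands). The function names `find_sites` and `remove_sites` are hypothetical. The key point is that the whole sequence is re-scanned after every mutation, so any newly created site is caught.

```python
import random


def find_sites(seq, motifs):
    """Return (start, motif) for every exact motif occurrence in seq."""
    hits = []
    for motif in motifs:
        start = seq.find(motif)
        while start != -1:
            hits.append((start, motif))
            start = seq.find(motif, start + 1)  # allow overlapping hits
    return hits


def remove_sites(seq, motifs, max_iter=10_000, seed=0):
    """Mutate seq until no motif remains, re-scanning after each change.

    Simplifying assumption: one random point mutation inside the first
    remaining site per iteration. Raises if max_iter is exhausted.
    """
    rng = random.Random(seed)
    bases = list(seq)
    for _ in range(max_iter):
        hits = find_sites("".join(bases), motifs)
        if not hits:
            return "".join(bases)
        start, motif = hits[0]
        pos = start + rng.randrange(len(motif))
        # Replace the chosen base with a different one.
        bases[pos] = rng.choice([b for b in "ACGT" if b != bases[pos]])
    raise RuntimeError("could not remove all sites within max_iter")


if __name__ == "__main__":
    motifs = ["GATA", "GATTA"]
    cleaned = remove_sites("AAGATAAGGGATTACCGATA", motifs)
    print(cleaned, find_sites(cleaned, motifs))
```

Re-scanning the full sequence is wasteful for long inputs, but it keeps the sketch obviously correct: the function only returns once a complete scan finds no sites.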
Affiliation(s)
- Javier Estrada
- Department of Systems Biology, Harvard Medical School, Boston, MA, United States of America
- Teresa Ruiz-Herrero
- John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, MA, United States of America
- Clarissa Scholes
- Department of Systems Biology, Harvard Medical School, Boston, MA, United States of America
- Zeba Wunderlich
- Department of Systems Biology, Harvard Medical School, Boston, MA, United States of America
- Angela H. DePace
- Department of Systems Biology, Harvard Medical School, Boston, MA, United States of America
|
45
|
Vincent BJ, Estrada J, DePace AH. The appeasement of Doug: a synthetic approach to enhancer biology. Integr Biol (Camb) 2016; 8:475-84. [DOI: 10.1039/c5ib00321k] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Indexed: 12/13/2022]
Affiliation(s)
- Ben J. Vincent
- Department of Systems Biology, Harvard Medical School, 200 Longwood Avenue, Boston, MA 02115, USA
- Javier Estrada
- Department of Systems Biology, Harvard Medical School, 200 Longwood Avenue, Boston, MA 02115, USA
- Angela H. DePace
- Department of Systems Biology, Harvard Medical School, 200 Longwood Avenue, Boston, MA 02115, USA
|
46
|
Peng PC, Hassan Samee MA, Sinha S. Incorporating chromatin accessibility data into sequence-to-expression modeling. Biophys J 2016; 108:1257-67. [PMID: 25762337 DOI: 10.1016/j.bpj.2014.12.037] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 08/18/2014] [Revised: 12/01/2014] [Accepted: 12/11/2014] [Indexed: 01/30/2023] Open
Abstract
Prediction of gene expression levels from regulatory sequences is one of the major challenges of genomic biology today. A particularly promising approach to this problem is that taken by thermodynamics-based models that interpret an enhancer sequence in a given cellular context specified by transcription factor concentration levels and predict precise expression levels driven by that enhancer. Such models have so far not accounted for the effect of chromatin accessibility on interactions between transcription factor and DNA and consequently on gene-expression levels. Here, we extend a thermodynamics-based model of gene expression, called GEMSTAT (Gene Expression Modeling Based on Statistical Thermodynamics), to incorporate chromatin accessibility data and quantify its effect on accuracy of expression prediction. In the new model, called GEMSTAT-A, accessibility at a binding site is assumed to affect the transcription factor's binding strength at the site, whereas all other aspects are identical to the GEMSTAT model. We show that this modification results in significantly better fits in a data set of over 30 enhancers regulating spatial expression patterns in the blastoderm-stage Drosophila embryo. It is important to note that the improved fits result not from an overall elevated accessibility in active enhancers but from the variation of accessibility levels within an enhancer. With whole-genome DNA accessibility measurements becoming increasingly popular, our work demonstrates how such data may be useful for sequence-to-expression models. It also calls for future advances in modeling accessibility levels from sequence and the transregulatory context, so as to predict accurately the effect of cis and trans perturbations on gene expression.
Affiliation(s)
- Pei-Chen Peng
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Md Abul Hassan Samee
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Saurabh Sinha
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois; Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois.
|
47
|
|
48
|
Brown AJ, James DC. Precision control of recombinant gene transcription for CHO cell synthetic biology. Biotechnol Adv 2015; 34:492-503. [PMID: 26721629 DOI: 10.1016/j.biotechadv.2015.12.012] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 10/05/2015] [Revised: 12/11/2015] [Accepted: 12/22/2015] [Indexed: 11/30/2022]
Abstract
The next generation of mammalian cell factories for biopharmaceutical production will be genetically engineered to possess both generic and product-specific manufacturing capabilities that may not exist naturally. Introduction of entirely new combinations of synthetic functions (e.g. novel metabolic or stress-response pathways), and retro-engineering of existing functional cell modules will drive disruptive change in cellular manufacturing performance. However, before we can apply the core concepts underpinning synthetic biology (design, build, test) to CHO cell engineering we must first develop practical and robust enabling technologies. Fundamentally, we will require the ability to precisely control the relative stoichiometry of numerous functional components we simultaneously introduce into the host cell factory. In this review we discuss how this can be achieved by design of engineered promoters that enable concerted control of recombinant gene transcription. We describe the specific mechanisms of transcriptional regulation that affect promoter function during bioproduction processes, and detail the highly specific promoter design criteria that are required in the context of CHO cell engineering. The relative applicability of diverse promoter development strategies is discussed, including re-engineering of natural sequences, design of synthetic transcription factor-based systems, and construction of synthetic promoters. This review highlights the potential of promoter engineering to achieve precision transcriptional control for CHO cell synthetic biology.
Affiliation(s)
- Adam J Brown
- Department of Chemical and Biological Engineering, University of Sheffield, Sheffield S1 3JD, England, United Kingdom
- David C James
- Department of Chemical and Biological Engineering, University of Sheffield, Sheffield S1 3JD, England, United Kingdom.
|
49
|
Davey NE, Cyert MS, Moses AM. Short linear motifs - ex nihilo evolution of protein regulation. Cell Commun Signal 2015; 13:43. [PMID: 26589632 PMCID: PMC4654906 DOI: 10.1186/s12964-015-0120-z] [Citation(s) in RCA: 142] [Impact Index Per Article: 15.8] [Received: 06/21/2015] [Accepted: 11/13/2015] [Indexed: 12/12/2022] Open
Abstract
Short sequence motifs are ubiquitous across the three major types of biomolecules: hundreds of classes and thousands of instances of DNA regulatory elements, RNA motifs and protein short linear motifs (SLiMs) have been characterised. The increase in complexity of transcriptional, post-transcriptional and post-translational regulation in higher Eukaryotes has coincided with a significant expansion of motif use. But how did the eukaryotic cell acquire such a vast repertoire of motifs? In this review, we curate the available literature on protein motif evolution and discuss the evidence that suggests SLiMs can be acquired by mutations, insertions and deletions in disordered regions. We propose a mechanism of ex nihilo SLiM evolution – the evolution of a novel SLiM from “nothing” – adding a functional module to a previously non-functional region of protein sequence. In our model, hundreds of motif-binding domains in higher eukaryotic proteins connect simple motif specificities with useful functions to create a large functional motif space. Accessible peptides that match the specificity of these motif-binding domains are continuously created and destroyed by mutations in rapidly evolving disordered regions, creating a dynamic supply of new interactions that may have advantageous phenotypic novelty. This provides a reservoir of diversity to modify existing interaction networks. Evolutionary pressures will act on these motifs to retain beneficial instances. However, most will be lost on an evolutionary timescale as negative selection and genetic drift act on deleterious and neutral motifs respectively. In light of the parallels between the presented model and the evolution of motifs in the regulatory segments of genes and (pre-)mRNAs, we suggest our understanding of regulatory networks would benefit from the creation of a shared model describing the evolution of transcriptional, post-transcriptional and post-translational regulation.
Affiliation(s)
- Norman E Davey
- Conway Institute of Biomolecular and Biomedical Sciences, University College Dublin, Dublin 4, Ireland.
- Martha S Cyert
- Department of Biology, Stanford University, Stanford, CA, 94305, USA.
- Alan M Moses
- Department of Cell & Systems Biology, University of Toronto, Toronto, Canada; Centre for the Analysis of Genome Evolution and Function, University of Toronto, Toronto, Canada.
|
50
|
Function does not follow form in gene regulatory circuits. Sci Rep 2015; 5:13015. [PMID: 26290154 PMCID: PMC4542331 DOI: 10.1038/srep13015] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.0] [Received: 03/02/2015] [Accepted: 07/06/2015] [Indexed: 11/08/2022] Open
Abstract
Gene regulatory circuits are to the cell what arithmetic logic units are to the chip: fundamental components of information processing that map an input onto an output. Gene regulatory circuits come in many different forms, distinct structural configurations that determine who regulates whom. Studies that have focused on the gene expression patterns (functions) of circuits with a given structure (form) have examined just a few structures or gene expression patterns. Here, we use a computational model to exhaustively characterize the gene expression patterns of nearly 17 million three-gene circuits in order to systematically explore the relationship between circuit form and function. Three main conclusions emerge. First, function does not follow form. A circuit of any one structure can have between twelve and nearly thirty thousand distinct gene expression patterns. Second, and conversely, form does not follow function. Most gene expression patterns can be realized by more than one circuit structure. And third, multifunctionality severely constrains circuit form. The number of circuit structures able to drive multiple gene expression patterns decreases rapidly with the number of these patterns. These results indicate that it is generally not possible to infer circuit function from circuit form, or vice versa.
|