51
McAfee JC, Bell JL, Krupa O, Matoba N, Stein JL, Won H. Focus on your locus with a massively parallel reporter assay. J Neurodev Disord 2022; 14:50. [PMID: 36085003 PMCID: PMC9463819 DOI: 10.1186/s11689-022-09461-x] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 10/19/2021] [Accepted: 09/01/2022] [Indexed: 01/01/2023] Open
Abstract
A growing number of variants associated with risk for neurodevelopmental disorders have been identified by genome-wide association and whole genome sequencing studies. As common risk variants often fall within large haplotype blocks covering long stretches of the noncoding genome, the causal variants within an associated locus are often unknown. Similarly, the effect of rare noncoding risk variants identified by whole genome sequencing on molecular traits is seldom known without functional assays. A massively parallel reporter assay (MPRA) can functionally validate thousands of regulatory elements simultaneously using high-throughput sequencing and barcode technology. MPRA has been adapted to various experimental designs that measure gene regulatory effects of genetic variants within cis- and trans-regulatory elements as well as posttranscriptional processes. This review discusses different MPRA designs that have been or could be used in the future to experimentally validate genetic variants associated with neurodevelopmental disorders. Though MPRA has limitations, such as not modeling genomic context, this assay can help narrow down the underlying genetic causes of neurodevelopmental disorders by screening thousands of sequences in one experiment. We conclude by describing future directions of this technique, such as applications of MPRA for gene-by-environment interactions and pharmacogenetics.
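The abstract does not say how reporter activity is computed. As a minimal sketch, assuming the common convention of scoring each element by the log2 ratio of pooled RNA to pooled DNA barcode counts, a comparison of two alleles might look like the following. The function names, the pseudocount, and the counts are illustrative assumptions and are not taken from the review.

```python
import math
from typing import Sequence


def mpra_activity(rna_counts: Sequence[int], dna_counts: Sequence[int],
                  pseudocount: float = 1.0) -> float:
    """Return log2(RNA/DNA) for one element, pooling counts over its barcodes.

    Assumption: counts are already normalized for sequencing depth.
    """
    if len(rna_counts) != len(dna_counts):
        raise ValueError("each barcode needs one RNA and one DNA count")
    rna = sum(rna_counts) + pseudocount
    dna = sum(dna_counts) + pseudocount
    return math.log2(rna / dna)


def allelic_effect(ref_activity: float, alt_activity: float) -> float:
    """Difference in log2 activity between the alternate and reference allele."""
    return alt_activity - ref_activity


# Made-up barcode counts for one variant (two barcodes per allele).
ref = mpra_activity([40, 40], [10, 10], pseudocount=0.0)
alt = mpra_activity([20, 20], [10, 10], pseudocount=0.0)
effect = allelic_effect(ref, alt)  # negative: the alternate allele is less active
```

Real analyses usually model barcode-level and replicate-level variance rather than pooling counts; the sketch only shows the quantity being compared.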
Affiliation(s)
- Jessica C McAfee
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
  - UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Jessica L Bell
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
  - UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Oleh Krupa
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
  - UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Nana Matoba
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
  - UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Jason L Stein
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
  - UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
- Hyejung Won
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
  - UNC Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27599, USA
52
Schreiber J, Nair S, Balsubramani A, Kundaje A. Accelerating in silico saturation mutagenesis using compressed sensing. Bioinformatics 2022; 38:3557-3564. [PMID: 35678521 PMCID: PMC9272795 DOI: 10.1093/bioinformatics/btac385] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2022] [Revised: 05/10/2022] [Accepted: 06/06/2022] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION: In silico saturation mutagenesis (ISM) is a popular approach in computational genomics for calculating feature attributions on biological sequences that proceeds by systematically perturbing each position in a sequence and recording the difference in model output. However, this method can be slow because systematically perturbing each position requires performing a number of forward passes proportional to the length of the sequence being examined. RESULTS: In this work, we propose a modification of ISM that leverages the principles of compressed sensing to require only a constant number of forward passes, regardless of sequence length, when applied to models that contain operations with a limited receptive field, such as convolutions. Our method, named Yuzu, can reduce the time that ISM spends in convolution operations by several orders of magnitude and, consequently, Yuzu can speed up ISM on several commonly used architectures in genomics by over an order of magnitude. Notably, we found that Yuzu provides speedups that increase with the complexity of the convolution operation and the length of the sequence being analyzed, suggesting that Yuzu provides large benefits in realistic settings. AVAILABILITY AND IMPLEMENTATION: We have made this tool available at https://github.com/kundajelab/yuzu.
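The baseline procedure described in the abstract (perturb each position, record the change in model output) can be sketched as follows. This is plain, unaccelerated ISM and not the Yuzu method; `toy_model` is a hypothetical stand-in for a trained network, and all names are assumptions made for illustration.

```python
from typing import Callable, Dict, List

ALPHABET = "ACGT"


def toy_model(seq: str) -> float:
    """Hypothetical stand-in for a trained model: counts 'TATA' occurrences."""
    return float(sum(seq[i:i + 4] == "TATA" for i in range(len(seq) - 3)))


def naive_ism(seq: str, model: Callable[[str], float]) -> List[Dict[str, float]]:
    """Score every single-base substitution as model(mutant) - model(reference)."""
    reference = model(seq)  # one forward pass for the unmodified sequence
    scores: List[Dict[str, float]] = []
    for pos, ref_base in enumerate(seq):
        row: Dict[str, float] = {}
        for alt in ALPHABET:
            if alt == ref_base:
                row[alt] = 0.0  # no change, no forward pass needed
                continue
            mutant = seq[:pos] + alt + seq[pos + 1:]
            row[alt] = model(mutant) - reference  # one forward pass per mutant
        scores.append(row)
    return scores


def forward_passes(seq_len: int) -> int:
    """Forward passes used by naive_ism: the reference plus 3 mutants per position."""
    return 1 + 3 * seq_len


scores = naive_ism("GGTATAGG", toy_model)
```

The pass count grows linearly with sequence length, which is the cost the abstract sets out to reduce.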
Affiliation(s)
- Jacob Schreiber
  - Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Surag Nair
  - Department of Computer Science, Stanford University, Stanford, CA 94305, USA
- Akshay Balsubramani
  - Department of Computer Science, Stanford University, Stanford, CA 94305, USA
- Anshul Kundaje
  - Department of Genetics, Stanford University, Stanford, CA 94305, USA
  - Department of Computer Science, Stanford University, Stanford, CA 94305, USA
53
Perkins ML, Gandara L, Crocker J. A synthetic synthesis to explore animal evolution and development. Philos Trans R Soc Lond B Biol Sci 2022; 377:20200517. [PMID: 35634925 PMCID: PMC9149795 DOI: 10.1098/rstb.2020.0517] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/12/2022] Open
Abstract
Identifying the general principles by which genotypes are converted into phenotypes remains a challenge in the post-genomic era. We still lack a predictive understanding of how genes shape interactions among cells and tissues in response to signalling and environmental cues, and hence how regulatory networks generate the phenotypic variation required for adaptive evolution. Here, we discuss how techniques borrowed from synthetic biology may facilitate a systematic exploration of evolvability across biological scales. Synthetic approaches permit controlled manipulation of both endogenous and fully engineered systems, providing a flexible platform for investigating causal mechanisms in vivo. Combining synthetic approaches with multi-level phenotyping (phenomics) will supply a detailed, quantitative characterization of how internal and external stimuli shape the morphology and behaviour of living organisms. We advocate integrating high-throughput experimental data with mathematical and computational techniques from a variety of disciplines in order to pursue a comprehensive theory of evolution. This article is part of the theme issue ‘Genetic basis of adaptation and speciation: from loci to causative mutations’.
Affiliation(s)
- Mindy Liu Perkins
  - Developmental Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
- Lautaro Gandara
  - Developmental Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
- Justin Crocker
  - Developmental Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
54
Cho MH, Hobbs BD, Silverman EK. Genetics of chronic obstructive pulmonary disease: understanding the pathobiology and heterogeneity of a complex disorder. Lancet Respir Med 2022; 10:485-496. [PMID: 35427534 PMCID: PMC11197974 DOI: 10.1016/s2213-2600(21)00510-5] [Citation(s) in RCA: 39] [Impact Index Per Article: 19.5] [Received: 06/15/2021] [Revised: 10/20/2021] [Accepted: 11/09/2021] [Indexed: 12/20/2022]
Abstract
Chronic obstructive pulmonary disease (COPD) is a deadly and highly morbid disease. Susceptibility to and heterogeneity of COPD are incompletely explained by environmental factors such as cigarette smoking. Family-based and population-based studies have shown that a substantial proportion of COPD risk is related to genetic variation. Genetic association studies have identified hundreds of genetic variants that affect risk for COPD, decreased lung function, and other COPD-related traits. These genetic variants are associated with other pulmonary and non-pulmonary traits, demonstrate a genetic basis for at least part of COPD heterogeneity, have a substantial effect on COPD risk in aggregate, implicate early-life events in COPD pathogenesis, and often involve genes not previously suspected to have a role in COPD. Additional progress will require larger genetic studies with more ancestral diversity, improved profiling of rare variants, and better statistical methods. Through integration of genetic data with other omics data and comprehensive COPD phenotypes, as well as functional description of causal mechanisms for genetic risk variants, COPD genetics will continue to inform novel approaches to understanding the pathobiology of COPD and developing new strategies for management and treatment.
Affiliation(s)
- Michael H Cho
  - Channing Division of Network Medicine and Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, MA, USA
  - Harvard Medical School, Boston, MA, USA
- Brian D Hobbs
  - Channing Division of Network Medicine and Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, MA, USA
  - Harvard Medical School, Boston, MA, USA
- Edwin K Silverman
  - Channing Division of Network Medicine and Division of Pulmonary and Critical Care Medicine, Brigham and Women's Hospital, Boston, MA, USA
  - Harvard Medical School, Boston, MA, USA
55
Nair S, Shrikumar A, Schreiber J, Kundaje A. fastISM: performant in silico saturation mutagenesis for convolutional neural networks. Bioinformatics 2022; 38:2397-2403. [PMID: 35238376 PMCID: PMC9048647 DOI: 10.1093/bioinformatics/btac135] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/26/2021] [Revised: 02/09/2022] [Accepted: 03/01/2022] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION: Deep-learning models, such as convolutional neural networks, are able to accurately map biological sequences to associated functional readouts and properties by learning predictive de novo representations. In silico saturation mutagenesis (ISM) is a popular feature attribution technique for inferring contributions of all characters in an input sequence to the model's predicted output. The main drawback of ISM is its runtime, as it involves multiple forward propagations of all possible mutations of each character in the input sequence through the trained model to predict the effects on the output. RESULTS: We present fastISM, an algorithm that speeds up ISM by a factor of over 10× for commonly used convolutional neural network architectures. fastISM is based on the observations that the majority of computation in ISM is spent in convolutional layers, and a single mutation only disrupts a limited region of intermediate layers, rendering most computation redundant. fastISM reduces the gap between backpropagation-based feature attribution methods and ISM. It far surpasses the runtime of backpropagation-based methods on multi-output architectures, making it feasible to run ISM on a large number of sequences. AVAILABILITY AND IMPLEMENTATION: An easy-to-use Keras/TensorFlow 2 implementation of fastISM is available at https://github.com/kundajelab/fastISM. fastISM can be installed using pip install fastism. A hands-on tutorial can be found at https://colab.research.google.com/github/kundajelab/fastISM/blob/master/notebooks/colab/DeepSEA.ipynb. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
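The observation that a single mutation disrupts only a limited region of a convolutional layer's output can be checked on a toy example. The sketch below uses a plain-Python 1D "valid" convolution on one input channel; it illustrates the locality argument only and is not the fastISM implementation or its API. All function names are assumptions.

```python
from typing import List


def conv1d_valid(signal: List[float], kernel: List[float]) -> List[float]:
    """1D cross-correlation without padding ('valid' mode)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def affected_outputs(n: int, k: int, pos: int) -> range:
    """Output indices of a width-k 'valid' convolution whose window covers input pos."""
    return range(max(0, pos - k + 1), min(n - k, pos) + 1)


def changed_indices(a: List[float], b: List[float]) -> List[int]:
    """Positions at which two equally long outputs differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


signal = [float(v) for v in range(10)]   # toy input of length 10
kernel = [1.0, -2.0, 3.0]                # toy filter of width 3
mutated = list(signal)
mutated[5] += 10.0                       # "mutate" a single input position

changed = changed_indices(conv1d_valid(signal, kernel),
                          conv1d_valid(mutated, kernel))
expected = list(affected_outputs(len(signal), len(kernel), 5))
```

Only the outputs whose window covers the mutated position change, so the remaining outputs could be reused from the reference pass instead of being recomputed.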
Affiliation(s)
- Surag Nair
  - Department of Computer Science, Stanford University, Stanford, CA 94305, USA
- Avanti Shrikumar
  - Department of Computer Science, Stanford University, Stanford, CA 94305, USA
- Jacob Schreiber
  - Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Anshul Kundaje
  - Department of Computer Science, Stanford University, Stanford, CA 94305, USA
  - Department of Genetics, Stanford University, Stanford, CA 94305, USA
56
Warren TL, Lambert JT, Nord AS. AAV Deployment of Enhancer-Based Expression Constructs In Vivo in Mouse Brain. J Vis Exp 2022:10.3791/62650. [PMID: 35435902 PMCID: PMC10010840 DOI: 10.3791/62650] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/31/2022] Open
Abstract
Enhancers are binding platforms for a diverse array of transcription factors that drive specific expression patterns of tissue- and cell-type-specific genes. Multiple means of assessing non-coding DNA and various chromatin states have proven useful in predicting the presence of enhancer sequences in the genome, but validating the activity of these sequences and finding the organs and developmental stages they are active in is a labor-intensive process. Recent advances in adeno-associated virus (AAV) vectors have enabled the widespread delivery of transgenes to mouse tissues, enabling in vivo enhancer testing without necessitating a transgenic animal. This protocol shows how a reporter construct that expresses EGFP under the control of a minimal promoter, which does not drive significant expression on its own, can be used to study the activity patterns of candidate enhancer sequences in the mouse brain. An AAV-packaged reporter construct is delivered to the mouse brain and incubated for 1-4 weeks, after which the animal is sacrificed, and brain sections are observed under a microscope. EGFP appears in cells in which the tested enhancer is sufficient to initiate gene expression, pinpointing the location and developmental stage in which the enhancer is active in the brain. Standard cloning methods, low-cost AAV packaging, and expanding AAV serotypes and methods for in vivo delivery and standard imaging readout make this an accessible approach for the study of how gene expression is regulated in the brain.
Affiliation(s)
- Tracy L Warren
  - Department of Psychiatry and Behavioral Sciences, University of California, Davis
  - Department of Neurobiology, Physiology and Behavior, University of California, Davis
- Jason T Lambert
  - Department of Psychiatry and Behavioral Sciences, University of California, Davis
  - Department of Neurobiology, Physiology and Behavior, University of California, Davis
- Alex S Nord
  - Department of Psychiatry and Behavioral Sciences, University of California, Davis
  - Department of Neurobiology, Physiology and Behavior, University of California, Davis
57
Abell NS, DeGorter MK, Gloudemans MJ, Greenwald E, Smith KS, He Z, Montgomery SB. Multiple causal variants underlie genetic associations in humans. Science 2022; 375:1247-1254. [PMID: 35298243 PMCID: PMC9725108 DOI: 10.1126/science.abj5117] [Citation(s) in RCA: 72] [Impact Index Per Article: 36.0] [Indexed: 12/16/2022]
Abstract
Associations between genetic variation and traits are often in noncoding regions with strong linkage disequilibrium (LD), where a single causal variant is assumed to underlie the association. We applied a massively parallel reporter assay (MPRA) to functionally evaluate genetic variants in high, local LD for independent cis-expression quantitative trait loci (eQTL). We found that 17.7% of eQTLs exhibit more than one major allelic effect in tight LD. The detected regulatory variants were highly and specifically enriched for activating chromatin structures and allelic transcription factor binding. Integration of MPRA profiles with eQTL/complex trait colocalizations across 114 human traits and diseases identified causal variant sets demonstrating how genetic association signals can manifest through multiple, tightly linked causal variants.
Affiliation(s)
- Nathan S. Abell
  - Department of Genetics, School of Medicine, Stanford University, Stanford, CA, 94305, USA
- Marianne K. DeGorter
  - Department of Pathology, School of Medicine, Stanford University, Stanford, CA, 94305, USA
- Emily Greenwald
  - Department of Genetics, School of Medicine, Stanford University, Stanford, CA, 94305, USA
- Kevin S. Smith
  - Department of Pathology, School of Medicine, Stanford University, Stanford, CA, 94305, USA
- Zihuai He
  - Department of Neurology and Neurological Sciences, Stanford University, Stanford, CA 94305, USA
  - Quantitative Sciences Unit, Department of Medicine, Stanford University, Stanford, CA, 94305, USA
- Stephen B. Montgomery
  - Department of Genetics, School of Medicine, Stanford University, Stanford, CA, 94305, USA
  - Department of Pathology, School of Medicine, Stanford University, Stanford, CA, 94305, USA
58
Yan J, Huangfu D. Epigenome rewiring in human pluripotent stem cells. Trends Cell Biol 2022; 32:259-271. [PMID: 34955367 PMCID: PMC8840982 DOI: 10.1016/j.tcb.2021.12.001] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/10/2021] [Revised: 11/29/2021] [Accepted: 12/02/2021] [Indexed: 01/10/2023]
Abstract
The epigenome plays a crucial role in modulating the activity of regulatory elements, thereby orchestrating diverse transcriptional programs during embryonic development. Human (h)PSC stepwise differentiation provides an excellent platform for capturing dynamic epigenomic events during lineage transition in human development. Here we discuss how recent technological advances, from epigenomic mapping to targeted perturbation, are providing a more comprehensive appreciation of remodeling of the chromatin landscape during human development with implications for aberrant rewiring in disease. We predict that the continuous innovation of hPSC differentiation methods, epigenome mapping, and CRISPR (clustered regularly interspaced short palindromic repeats) perturbation technologies will allow researchers to build toward not only a comprehensive understanding of the epigenomic mechanisms governing development, but also a highly flexible way to model diseases with opportunities for translation.
Affiliation(s)
- Jielin Yan
  - Sloan Kettering Institute, 1275 York Avenue, New York, NY 10065, USA
  - Louis V. Gerstner Jr. Graduate School of Biomedical Sciences, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, NY 10065, USA
- Danwei Huangfu
  - Sloan Kettering Institute, 1275 York Avenue, New York, NY 10065, USA
59
Tosi L, Chaikban L, Larman BH, Rosenfield J, Parekkadan B. Massively parallel DNA target capture using long adapter single stranded oligonucleotide (LASSO) probes assembled through a novel DNA recombinase mediated methodology. Biotechnol J 2022; 17:e2100240. [PMID: 34775678 PMCID: PMC8825753 DOI: 10.1002/biot.202100240] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/06/2021] [Revised: 11/05/2021] [Accepted: 11/05/2021] [Indexed: 02/03/2023]
Abstract
In the attempt to bridge the widening gap from DNA sequence to biological function, we developed a novel methodology to assemble Long-Adapter Single-Strand Oligonucleotide (LASSO) probe libraries that enabled the massively multiplexed capture of kilobase-sized DNA fragments for downstream long-read DNA sequencing or expression. This method uses short DNA oligonucleotides (pre-LASSO probes) and a plasmid vector that supplies the linker sequence for the mature LASSO probe through Cre-LoxP intramolecular recombination. This strategy generates high-quality LASSO probe libraries (≈46% of correct probes). We performed NGS analysis of the post-capture PCR amplification of DNA circles obtained from the LASSO capture of 3087 Escherichia coli ORFs ranging from 400 to 5000 bp. The median enrichment of all targeted ORFs versus untargeted ORFs was 30-fold. For ORFs up to 1 kb in size, targeted ORFs were enriched up to a median of 260-fold. Here, we show that LASSO probes obtained in this manner were able to capture full-length open reading frames from total human cDNA. Furthermore, we show that the LASSO capture specificity and sensitivity are sufficient for target capture from total human genomic DNA template. This technology can be used for the preparation of long-read sequencing libraries and for massively multiplexed cloning of human sequences.
Affiliation(s)
- Lorenzo Tosi
  - Department of Biomedical Engineering, Rutgers University, Piscataway, New Jersey 08854, USA
- Lamia Chaikban
  - Department of Biomedical Engineering, Rutgers University, Piscataway, New Jersey 08854, USA
- Benjamin H. Larman
  - Institute of Cell Engineering, Division of Immunology, Department of Pathology, Johns Hopkins University, Baltimore, MD, USA
- Jeffrey Rosenfield
  - Cancer Institute of New Jersey, New Brunswick, New Jersey 08854, USA
  - Department of Pathology, Robert Wood Johnson Medical School, New Brunswick, NJ 08903, USA
- Biju Parekkadan
  - Department of Biomedical Engineering, Rutgers University, Piscataway, New Jersey 08854, USA
  - Cancer Institute of New Jersey, New Brunswick, New Jersey 08854, USA
  - Correspondence and requests for materials should be addressed to B.P. (599 Taylor Road, Piscataway, NJ 08854)
60
Qi Z, Jung C, Bandilla P, Ludwig C, Heron M, Sophie Kiesel A, Museridze M, Philippou‐Massier J, Nikolov M, Renna Max Schnepf A, Unnerstall U, Ceolin S, Mühlig B, Gompel N, Soeding J, Gaul U. Large-scale analysis of Drosophila core promoter function using synthetic promoters. Mol Syst Biol 2022; 18:e9816. [PMID: 35156763 PMCID: PMC8842121 DOI: 10.15252/msb.20209816] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/26/2020] [Revised: 01/11/2022] [Accepted: 01/13/2022] [Indexed: 02/02/2023] Open
Abstract
The core promoter plays a central role in setting metazoan gene expression levels, but how exactly it "computes" expression remains poorly understood. To dissect its function, we carried out a comprehensive structure-function analysis in Drosophila. First, we performed a genome-wide bioinformatic analysis, providing an improved picture of the sequence motif architecture. We then measured the activities of ~3,000 mutational variants of synthetic promoters, with and without an external stimulus (hormonal activation), at large scale and with high accuracy using robotics and a dual luciferase reporter assay. We observed a strong impact on activity of the different types of mutations, including knockout of individual sequence motifs and motif combinations, variations of motif strength, nucleosome positioning, and flanking sequences. A linear combination of the individual motif features largely accounts for the combinatorial effects on core promoter activity. These findings shed new light on the quantitative assessment of gene expression in metazoans.
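The statement that a linear combination of motif features accounts for combinatorial effects can be made concrete with a toy additive model. The sketch below is an assumption-laden illustration: the motif labels, the intercept, and the weights are invented for the example and are not values or code from the study.

```python
from typing import Dict

# Hypothetical parameters, chosen only for illustration (not fitted to any data).
INTERCEPT = 1.0
MOTIF_WEIGHTS: Dict[str, float] = {"TATA": 2.0, "Inr": 1.5, "DPE": 0.5}


def predict_activity(motifs_present: Dict[str, int]) -> float:
    """Additive model: intercept plus the weight of every intact motif (1 = intact)."""
    return INTERCEPT + sum(MOTIF_WEIGHTS[m] * present
                           for m, present in motifs_present.items())


def knockout_effect(motifs_present: Dict[str, int], *knocked_out: str) -> float:
    """Predicted change in activity when the named motifs are knocked out."""
    mutant = {m: (0 if m in knocked_out else p) for m, p in motifs_present.items()}
    return predict_activity(mutant) - predict_activity(motifs_present)


wild_type = {"TATA": 1, "Inr": 1, "DPE": 1}
single_tata = knockout_effect(wild_type, "TATA")
single_dpe = knockout_effect(wild_type, "DPE")
double = knockout_effect(wild_type, "TATA", "DPE")
```

Under a purely additive model the double knockout effect equals the sum of the two single knockout effects; deviations from that sum in measured data would indicate interactions that the linear model does not capture.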
Affiliation(s)
- Zhan Qi
  - Department of Biochemistry, Gene Center, Ludwig‐Maximilians‐Universität München, Feodor‐Lynen‐str 25, Munich, Germany
- Christophe Jung
  - Department of Biochemistry, Gene Center, Ludwig‐Maximilians‐Universität München, Feodor‐Lynen‐str 25, Munich, Germany
- Peter Bandilla
  - Department of Biochemistry, Gene Center, Ludwig‐Maximilians‐Universität München, Feodor‐Lynen‐str 25, Munich, Germany
- Claudia Ludwig
  - Department of Biochemistry, Gene Center, Ludwig‐Maximilians‐Universität München, Feodor‐Lynen‐str 25, Munich, Germany
- Mark Heron
  - Department of Biochemistry, Gene Center, Ludwig‐Maximilians‐Universität München, Feodor‐Lynen‐str 25, Munich, Germany
- Anja Sophie Kiesel
  - Department of Biochemistry, Gene Center, Ludwig‐Maximilians‐Universität München, Feodor‐Lynen‐str 25, Munich, Germany
- Mariam Museridze
  - Department of Biology II, Evolutionary Biology, Ludwig‐Maximilians‐Universität München, Planegg‐Martinsried, Germany
- Julia Philippou‐Massier
  - Department of Biochemistry, Gene Center, Ludwig‐Maximilians‐Universität München, Feodor‐Lynen‐str 25, Munich, Germany
- Miroslav Nikolov
  - Department of Biochemistry, Gene Center, Ludwig‐Maximilians‐Universität München, Feodor‐Lynen‐str 25, Munich, Germany
- Alessio Renna Max Schnepf
  - Department of Biochemistry, Gene Center, Ludwig‐Maximilians‐Universität München, Feodor‐Lynen‐str 25, Munich, Germany
- Ulrich Unnerstall
  - Department of Biochemistry, Gene Center, Ludwig‐Maximilians‐Universität München, Feodor‐Lynen‐str 25, Munich, Germany
- Stefano Ceolin
  - Department of Biology II, Evolutionary Biology, Ludwig‐Maximilians‐Universität München, Planegg‐Martinsried, Germany
- Bettina Mühlig
  - Department of Biology II, Evolutionary Biology, Ludwig‐Maximilians‐Universität München, Planegg‐Martinsried, Germany
- Nicolas Gompel
  - Department of Biology II, Evolutionary Biology, Ludwig‐Maximilians‐Universität München, Planegg‐Martinsried, Germany
- Johannes Soeding
  - Department of Biochemistry, Gene Center, Ludwig‐Maximilians‐Universität München, Feodor‐Lynen‐str 25, Munich, Germany
  - Max Planck Institute for Biophysical Chemistry, Göttingen, Germany
- Ulrike Gaul
  - Department of Biochemistry, Gene Center, Ludwig‐Maximilians‐Universität München, Feodor‐Lynen‐str 25, Munich, Germany
61
Bray D, Hook H, Zhao R, Keenan JL, Penvose A, Osayame Y, Mohaghegh N, Chen X, Parameswaran S, Kottyan LC, Weirauch MT, Siggers T. CASCADE: high-throughput characterization of regulatory complex binding altered by non-coding variants. Cell Genom 2022; 2. [PMID: 35252945 PMCID: PMC8896503 DOI: 10.1016/j.xgen.2022.100098] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 01/23/2023]
Abstract
Non-coding DNA variants (NCVs) impact gene expression by altering binding sites for regulatory complexes. New high-throughput methods are needed to characterize the impact of NCVs on regulatory complexes. We developed CASCADE (Customizable Approach to Survey Complex Assembly at DNA Elements), an array-based high-throughput method to profile cofactor (COF) recruitment. CASCADE identifies DNA-bound transcription factor-cofactor (TF-COF) complexes in nuclear extracts and quantifies the impact of NCVs on their binding. We demonstrate CASCADE sensitivity in characterizing condition-specific recruitment of COFs p300 and RBBP5 (MLL subunit) to the CXCL10 promoter in lipopolysaccharide (LPS)-stimulated human macrophages and quantify the impact of all possible NCVs. To demonstrate applicability to NCV screens, we profile TF-COF binding to ~1,700 single-nucleotide polymorphism quantitative trait loci (SNP-QTLs) in human macrophages and identify perturbed ETS domain-containing complexes. CASCADE will facilitate high-throughput testing of molecular mechanisms of NCVs for diverse biological applications. Bray et al. develop CASCADE, a method to profile transcription factor (TF)-cofactor (COF) complexes binding to DNA. They demonstrate the approach by profiling complex binding across the CXCL10 cytokine promoter and to ~1,700 single-nucleotide polymorphisms (SNPs). They anticipate that CASCADE can be applied to diverse biological systems to examine regulatory complex binding to DNA.
Affiliation(s)
- David Bray
  - Department of Biology, Boston University, Boston, MA, USA
  - Biological Design Center, Boston University, Boston, MA, USA
  - Bioinformatics Program, Boston University, Boston, MA, USA
- Heather Hook
  - Department of Biology, Boston University, Boston, MA, USA
  - Biological Design Center, Boston University, Boston, MA, USA
- Rose Zhao
  - Department of Biology, Boston University, Boston, MA, USA
  - Biological Design Center, Boston University, Boston, MA, USA
- Jessica L. Keenan
  - Department of Biology, Boston University, Boston, MA, USA
  - Biological Design Center, Boston University, Boston, MA, USA
  - Bioinformatics Program, Boston University, Boston, MA, USA
- Ashley Penvose
  - Department of Biology, Boston University, Boston, MA, USA
  - Biological Design Center, Boston University, Boston, MA, USA
- Yemi Osayame
  - Department of Biology, Boston University, Boston, MA, USA
  - Biological Design Center, Boston University, Boston, MA, USA
- Nima Mohaghegh
  - Department of Biology, Boston University, Boston, MA, USA
  - Biological Design Center, Boston University, Boston, MA, USA
- Xiaoting Chen
  - Center for Autoimmune Genomics and Etiology, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH, USA
- Sreeja Parameswaran
  - Center for Autoimmune Genomics and Etiology, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH, USA
- Leah C. Kottyan
  - Center for Autoimmune Genomics and Etiology, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH, USA
  - Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
  - Division of Allergy and Immunology, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio 45229, USA
- Matthew T. Weirauch
  - Center for Autoimmune Genomics and Etiology, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH, USA
  - Division of Developmental Biology, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH, USA
  - Division of Biomedical Informatics, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH, USA
  - Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, USA
- Trevor Siggers
  - Department of Biology, Boston University, Boston, MA, USA
  - Biological Design Center, Boston University, Boston, MA, USA
  - Corresponding author
62
Ding J, Frantzeskos A, Orozco G. Functional interrogation of autoimmune disease genetics using CRISPR/Cas9 technologies and massively parallel reporter assays. Semin Immunopathol 2022; 44:137-147. [PMID: 34508276 PMCID: PMC8837574 DOI: 10.1007/s00281-021-00887-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/30/2021] [Accepted: 08/13/2021] [Indexed: 02/07/2023]
Abstract
Genetic studies, including genome-wide association studies, have identified many common variants that are associated with autoimmune diseases. Strikingly, in addition to being frequently observed in healthy individuals, a number of these variants are shared across diseases with diverse clinical presentations. This highlights the potential for improved autoimmune disease understanding, which could be achieved by characterising the mechanism by which variants lead to increased risk of disease. Of particular interest is the potential for identifying novel drug targets or for repositioning drugs currently used in other diseases. The majority of autoimmune disease variants do not alter coding regions, and it is often difficult to generate a plausible hypothetical mechanism by which variants affect disease-relevant genes and pathways. Given the interest in this area, considerable effort has been invested in developing and applying appropriate methodologies. Two of the most important technologies in this space are low- and high-throughput genomic perturbation using the CRISPR/Cas9 system and massively parallel reporter assays. In this review, we introduce the field of autoimmune disease functional genomics and use numerous examples to demonstrate the recent and potential future impact of these technologies.
Affiliation(s)
- James Ding
- Centre for Genetics and Genomics Versus Arthritis, Division of Musculoskeletal and Dermatological Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, AV Hill Building, Oxford Road, Manchester, M13 9LJ, UK.
| | - Antonios Frantzeskos
- Centre for Genetics and Genomics Versus Arthritis, Division of Musculoskeletal and Dermatological Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, AV Hill Building, Oxford Road, Manchester, M13 9LJ, UK
| | - Gisela Orozco
- Centre for Genetics and Genomics Versus Arthritis, Division of Musculoskeletal and Dermatological Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, AV Hill Building, Oxford Road, Manchester, M13 9LJ, UK
- NIHR Manchester Biomedical Research Centre, Manchester University NHS Foundation Trust, Manchester Academic Health Science Centre, Manchester, M13 9WL, UK
| |
|
63
|
Yokoshi M, Kawasaki K, Cambón M, Fukaya T. Dynamic modulation of enhancer responsiveness by core promoter elements in living Drosophila embryos. Nucleic Acids Res 2021; 50:92-107. [PMID: 34897508 PMCID: PMC8754644 DOI: 10.1093/nar/gkab1177] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 06/24/2021] [Revised: 11/08/2021] [Accepted: 11/12/2021] [Indexed: 11/12/2022] Open
Abstract
Regulatory interactions between enhancers and core promoters are fundamental for the temporal and spatial specificity of gene expression in development. The central role of core promoters is to initiate productive transcription in response to activation cues from enhancers. However, it has not been systematically assessed how individual core promoter elements affect the induction of transcriptional bursting by enhancers. Here, we provide evidence that each core promoter element differentially modulates functional parameters of transcriptional bursting in developing Drosophila embryos. Quantitative live imaging analysis revealed that the timing and the continuity of burst induction are common regulatory steps affected by core promoter elements. We further show that the upstream TATA also affects the burst amplitude. On the other hand, Inr, MTE and DPE mainly contribute to the regulation of the burst frequency. Genome editing analysis of the pair-rule gene fushi tarazu revealed that the endogenous TATA and DPE are both essential for its correct expression and function during the establishment of body segments in early embryos. We suggest that core promoter elements serve as a key regulatory module in converting enhancer activity into transcription dynamics during animal development.
Affiliation(s)
- Moe Yokoshi
- Laboratory of Transcription Dynamics, Research Center for Biological Visualization, Institute for Quantitative Biosciences, The University of Tokyo, Bunkyo-ku, Tokyo, Japan
| | - Koji Kawasaki
- Laboratory of Transcription Dynamics, Research Center for Biological Visualization, Institute for Quantitative Biosciences, The University of Tokyo, Bunkyo-ku, Tokyo, Japan
| | - Manuel Cambón
- Applied Mathematics Department, University of Granada, Granada, Spain
| | - Takashi Fukaya
- Laboratory of Transcription Dynamics, Research Center for Biological Visualization, Institute for Quantitative Biosciences, The University of Tokyo, Bunkyo-ku, Tokyo, Japan; Department of Life Sciences, Graduate School of Arts and Sciences, The University of Tokyo, Bunkyo-ku, Tokyo, Japan
| |
|
64
|
Ren N, Li B, Liu Q, Yang L, Liu X, Huang Q. Dinucleotide tag-based parallel reporter gene assay method enables efficient identification of regulatory mutations. Biotechnol J 2021; 17:e2100341. [PMID: 34894203 DOI: 10.1002/biot.202100341] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2021] [Revised: 12/07/2021] [Accepted: 12/09/2021] [Indexed: 11/06/2022]
Abstract
BACKGROUND: The causal single nucleotide polymorphisms (SNPs) leading to increased cancer predisposition mainly function as gene regulatory elements, the evaluation of which largely relies on the parallel reporter gene assay system. However, the common DNA barcodes used in parallel reporter gene assay systems typically cause nucleotide composition bias, and many barcodes must be allocated for each sequence to reduce the bias effect. MAIN METHODS AND MAJOR RESULTS: Here, a versatile dinucleotide-tag reporter system (DiR) that enables parallel analysis of regulatory elements with minimized bias based on next-generation sequencing is described. The DiR system is more robust than the classical luciferase assay method, particularly for the investigation of moderate-level regulatory elements. The authors applied the DiR-seq assay in the functional evaluation of SNPs associated with prostate cancer risk and nominated two and six regulatory SNPs in PC-3 and LNCaP cells, respectively. CONCLUSIONS AND IMPLICATIONS: The DiR system has great potential to advance the functional study of SNPs associated with polygenic disease risks.
Affiliation(s)
- Naixia Ren
- Shandong Provincial Key Laboratory of Animal Cell and Developmental Biology, School of Life Sciences, Shandong University, Qingdao, China
| | - Bo Li
- Shandong Provincial Key Laboratory of Animal Cell and Developmental Biology, School of Life Sciences, Shandong University, Qingdao, China
| | - Qingqing Liu
- Shandong Provincial Key Laboratory of Animal Cell and Developmental Biology, School of Life Sciences, Shandong University, Qingdao, China
| | - Lele Yang
- Shandong Provincial Key Laboratory of Animal Cell and Developmental Biology, School of Life Sciences, Shandong University, Qingdao, China
| | - Xiaodan Liu
- Shandong Provincial Key Laboratory of Animal Cell and Developmental Biology, School of Life Sciences, Shandong University, Qingdao, China
| | - Qilai Huang
- Shandong Provincial Key Laboratory of Animal Cell and Developmental Biology, School of Life Sciences, Shandong University, Qingdao, China
| |
|
65
|
Wang Z, Wei K, Xiong M, Wang J, Zhang C, Fan X, Huang L, Zhao D, Liu Q, Li Q. Glucan, Water-Dikinase 1 (GWD1), an ideal biotechnological target for potential improving yield and quality in rice. Plant Biotechnology Journal 2021; 19:2606-2618. [PMID: 34416068 PMCID: PMC8633486 DOI: 10.1111/pbi.13686] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 04/10/2021] [Revised: 08/13/2021] [Accepted: 08/15/2021] [Indexed: 05/07/2023]
Abstract
The source-sink relationship determines the overall agronomic performance of rice. Cloning and characterizing key genes involved in the regulation of source and sink dynamics is imperative for improving rice yield. However, few source genes with potential application in rice have been identified. Glucan, Water-Dikinase 1 (GWD1) is an essential enzyme that plays a pivotal role in the first step of transitory starch degradation in source tissues. In the present study, we successfully generated gwd1 weak mutants by promoter editing using the CRISPR/Cas9 system, as well as leaf-dominant overexpression lines of GWD1 driven by the Osl2 promoter. Analysis of the gwd1 plants indicated that down-regulation of GWD1 mediated by promoter editing caused no observable effects on rice growth and development, and only mildly modified grain transparency and seed germination. However, the transgenic pOsl2::GWD1 overexpression lines showed improvements in multiple key traits, including rice yield, grain shape, rice quality, seed germination and stress tolerance. Therefore, our study shows that GWD1 is not only involved in transitory starch degradation in source tissues, but also plays key roles in seeds, a sink tissue. In conclusion, we find that GWD1 is an ideal biotechnological target with promising potential for the breeding of elite rice cultivars via genetic engineering.
Affiliation(s)
- Zhen Wang
- Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding / Key Laboratory of Plant Functional Genomics of the Ministry of Education, College of Agriculture, Yangzhou University, Yangzhou, Jiangsu, China
| | - Ke Wei
- Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding / Key Laboratory of Plant Functional Genomics of the Ministry of Education, College of Agriculture, Yangzhou University, Yangzhou, Jiangsu, China
| | - Min Xiong
- Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding / Key Laboratory of Plant Functional Genomics of the Ministry of Education, College of Agriculture, Yangzhou University, Yangzhou, Jiangsu, China
| | - Jin‐Dong Wang
- Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding / Key Laboratory of Plant Functional Genomics of the Ministry of Education, College of Agriculture, Yangzhou University, Yangzhou, Jiangsu, China
| | - Chang‐Quan Zhang
- Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding / Key Laboratory of Plant Functional Genomics of the Ministry of Education, College of Agriculture, Yangzhou University, Yangzhou, Jiangsu, China
- Co‐Innovation Center for Modern Production Technology of Grain Crops of Jiangsu Province / Jiangsu Key Laboratory of Crop Genetics and Physiology, Yangzhou University, Yangzhou, Jiangsu, China
| | - Xiao‐Lei Fan
- Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding / Key Laboratory of Plant Functional Genomics of the Ministry of Education, College of Agriculture, Yangzhou University, Yangzhou, Jiangsu, China
- Co‐Innovation Center for Modern Production Technology of Grain Crops of Jiangsu Province / Jiangsu Key Laboratory of Crop Genetics and Physiology, Yangzhou University, Yangzhou, Jiangsu, China
| | - Li‐Chun Huang
- Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding / Key Laboratory of Plant Functional Genomics of the Ministry of Education, College of Agriculture, Yangzhou University, Yangzhou, Jiangsu, China
- Co‐Innovation Center for Modern Production Technology of Grain Crops of Jiangsu Province / Jiangsu Key Laboratory of Crop Genetics and Physiology, Yangzhou University, Yangzhou, Jiangsu, China
| | - Dong‐Sheng Zhao
- Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding / Key Laboratory of Plant Functional Genomics of the Ministry of Education, College of Agriculture, Yangzhou University, Yangzhou, Jiangsu, China
- Co‐Innovation Center for Modern Production Technology of Grain Crops of Jiangsu Province / Jiangsu Key Laboratory of Crop Genetics and Physiology, Yangzhou University, Yangzhou, Jiangsu, China
| | - Qiao‐Quan Liu
- Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding / Key Laboratory of Plant Functional Genomics of the Ministry of Education, College of Agriculture, Yangzhou University, Yangzhou, Jiangsu, China
- Co‐Innovation Center for Modern Production Technology of Grain Crops of Jiangsu Province / Jiangsu Key Laboratory of Crop Genetics and Physiology, Yangzhou University, Yangzhou, Jiangsu, China
| | - Qian‐Feng Li
- Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding / Key Laboratory of Plant Functional Genomics of the Ministry of Education, College of Agriculture, Yangzhou University, Yangzhou, Jiangsu, China
- Co‐Innovation Center for Modern Production Technology of Grain Crops of Jiangsu Province / Jiangsu Key Laboratory of Crop Genetics and Physiology, Yangzhou University, Yangzhou, Jiangsu, China
| |
|
66
|
Kuiper BP, Prins RC, Billerbeck S. Oligo Pools as an Affordable Source of Synthetic DNA for Cost-Effective Library Construction in Protein- and Metabolic Pathway Engineering. Chembiochem 2021; 23:e202100507. [PMID: 34817110 PMCID: PMC9300125 DOI: 10.1002/cbic.202100507] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2021] [Revised: 11/23/2021] [Indexed: 11/11/2022]
Abstract
The construction of custom libraries is critical for rational protein engineering and directed evolution. Array‐synthesized oligo pools of thousands of user‐defined sequences (up to ∼350 bases in length) have emerged as a low‐cost commercially available source of DNA. These pools cost ≤10 % (depending on error rate and length) of other commercial sources of custom DNA, and this significant cost difference can determine whether an enzyme engineering project can be realized on a given research budget. However, while being cheap, oligo pools do suffer from a low concentration of individual oligos and relatively high error rates. Several powerful techniques that specifically make use of oligo pools have been developed and proven valuable or even essential for next‐generation protein and pathway engineering strategies, such as sequence‐function mapping, enzyme minimization, or de‐novo design. Here we consolidate the knowledge on these techniques and their applications to facilitate the use of oligo pools within the protein engineering community.
Affiliation(s)
- Bastiaan P Kuiper
- Molecular Microbiology, Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Groningen, The Netherlands
| | - Rianne C Prins
- Molecular Microbiology, Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Groningen, The Netherlands
| | - Sonja Billerbeck
- Molecular Microbiology, Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Groningen, The Netherlands
| |
|
67
|
The non-coding genome in genetic brain disorders: new targets for therapy? Essays Biochem 2021; 65:671-683. [PMID: 34414418 PMCID: PMC8564736 DOI: 10.1042/ebc20200121] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/24/2021] [Revised: 07/12/2021] [Accepted: 07/26/2021] [Indexed: 11/30/2022]
Abstract
The non-coding genome, comprising more than 98% of all genetic information in humans and once dismissed as ‘Junk DNA’, is increasingly moving into the spotlight in the field of human genetics. Non-coding regulatory elements (NCREs) are crucial to ensure correct spatio-temporal gene expression. Technological advancements have allowed the identification of NCREs on a large scale, and mechanistic studies have helped to explain the biological mechanisms underlying their function. It is becoming increasingly clear that genetic alterations of NCREs can cause genetic disorders, including brain diseases. In this review, we concisely discuss mechanisms of gene regulation and how to investigate them, and give examples of non-coding alterations of NCREs that give rise to human brain disorders. The cross-talk between basic and clinical studies enhances the understanding of normal and pathological function of NCREs, allowing better interpretation of existing and novel data. Improved functional annotation of NCREs will not only benefit diagnostics for patients, but might also lead to novel areas of investigation for targeted therapies, applicable to a wide panel of genetic disorders. The intrinsic complexity and precision of the gene regulation process can be turned to the advantage of highly specific treatments. We further discuss this exciting new field of ‘enhancer therapy’ based on recent examples.
|
68
|
Herholt A, Sahoo VK, Popovic L, Wehr MC, Rossner MJ. Dissecting intercellular and intracellular signaling networks with barcoded genetic tools. Curr Opin Chem Biol 2021; 66:102091. [PMID: 34644670 DOI: 10.1016/j.cbpa.2021.09.002] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/08/2021] [Revised: 08/25/2021] [Accepted: 09/03/2021] [Indexed: 11/19/2022]
Abstract
The power of next-generation sequencing has stimulated the development of many analysis techniques for transcriptomics and genomics. More recently, the concept of 'molecular barcoding' has broadened the spectrum of sequencing-based applications to dissect different aspects of intracellular and intercellular signaling. In these assay formats, barcode reporters replace standard reporter genes. The virtually infinite number of expressed barcode sequences allows high levels of multiplexing, hence accelerating experimental progress. Furthermore, reporter barcodes are used to quantitatively monitor a variety of biological events in living cells, which has already provided much insight into complex cellular signaling and will further increase our knowledge in the future.
Affiliation(s)
- Alexander Herholt
- Department of Psychiatry and Psychotherapy, University Hospital, LMU Munich, Nussbaumstr. 7, 80336 Munich, Germany; Systasy Bioscience GmbH, Balanstr. 6, 81669 Munich, Germany
| | - Vivek K Sahoo
- Systasy Bioscience GmbH, Balanstr. 6, 81669 Munich, Germany
| | - Luksa Popovic
- Department of Psychiatry and Psychotherapy, University Hospital, LMU Munich, Nussbaumstr. 7, 80336 Munich, Germany; Systasy Bioscience GmbH, Balanstr. 6, 81669 Munich, Germany
| | - Michael C Wehr
- Department of Psychiatry and Psychotherapy, University Hospital, LMU Munich, Nussbaumstr. 7, 80336 Munich, Germany; Systasy Bioscience GmbH, Balanstr. 6, 81669 Munich, Germany
| | - Moritz J Rossner
- Department of Psychiatry and Psychotherapy, University Hospital, LMU Munich, Nussbaumstr. 7, 80336 Munich, Germany.
| |
|
69
|
Findlay GM. Linking genome variants to disease: scalable approaches to test the functional impact of human mutations. Hum Mol Genet 2021; 30:R187-R197. [PMID: 34338757 PMCID: PMC8490018 DOI: 10.1093/hmg/ddab219] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 07/19/2021] [Revised: 07/19/2021] [Accepted: 07/19/2021] [Indexed: 11/13/2022] Open
Abstract
The application of genomics to medicine has accelerated the discovery of mutations underlying disease and has enhanced our knowledge of the molecular underpinnings of diverse pathologies. As the amount of human genetic material queried via sequencing has grown exponentially in recent years, so too has the number of rare variants observed. Despite progress, our ability to distinguish which rare variants have clinical significance remains limited. Over the last decade, however, powerful experimental approaches have emerged to characterize variant effects orders of magnitude faster than before. Fueled by improved DNA synthesis and sequencing and, more recently, by CRISPR/Cas9 genome editing, multiplex functional assays provide a means of generating variant effect data in wide-ranging experimental systems. Here, I review recent applications of multiplex assays that link human variants to disease phenotypes and I describe emerging strategies that will enhance their clinical utility in coming years.
Affiliation(s)
- Gregory M Findlay
- The Francis Crick Institute, The Genome Function Laboratory, London NW1 1AT, UK
| |
|
70
|
Duveau F, Vande Zande P, Metzger BP, Diaz CJ, Walker EA, Tryban S, Siddiq MA, Yang B, Wittkopp PJ. Mutational sources of trans-regulatory variation affecting gene expression in Saccharomyces cerevisiae. eLife 2021; 10:67806. [PMID: 34463616 PMCID: PMC8456550 DOI: 10.7554/elife.67806] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 02/23/2021] [Accepted: 08/03/2021] [Indexed: 12/15/2022] Open
Abstract
Heritable variation in a gene’s expression arises from mutations impacting cis- and trans-acting components of its regulatory network. Here, we investigate how trans-regulatory mutations are distributed within the genome and within a gene regulatory network by identifying and characterizing 69 mutations with trans-regulatory effects on expression of the same focal gene in Saccharomyces cerevisiae. Relative to 1766 mutations without effects on expression of this focal gene, we found that these trans-regulatory mutations were enriched in coding sequences of transcription factors previously predicted to regulate expression of the focal gene. However, over 90% of the trans-regulatory mutations identified mapped to other types of genes involved in diverse biological processes including chromatin state, metabolism, and signal transduction. These data show how genetic changes in diverse types of genes can impact a gene’s expression in trans, revealing properties of trans-regulatory mutations that provide the raw material for trans-regulatory variation segregating within natural populations.
Affiliation(s)
- Fabien Duveau
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, United States; Laboratory of Biology and Modeling of the Cell, Ecole Normale Supérieure de Lyon, CNRS, Université Claude Bernard Lyon, Université de Lyon, Lyon, France
| | - Petra Vande Zande
- Department of Molecular, Cellular, and Developmental Biology, University of Michigan, Ann Arbor, United States
| | - Brian Ph Metzger
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, United States
| | - Crisandra J Diaz
- Department of Molecular, Cellular, and Developmental Biology, University of Michigan, Ann Arbor, United States
| | - Elizabeth A Walker
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, United States
| | - Stephen Tryban
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, United States
| | - Mohammad A Siddiq
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, United States
| | - Bing Yang
- Department of Molecular, Cellular, and Developmental Biology, University of Michigan, Ann Arbor, United States
| | - Patricia J Wittkopp
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, United States; Department of Molecular, Cellular, and Developmental Biology, University of Michigan, Ann Arbor, United States
| |
|
71
|
Sinyakov AN, Ryabinin VA, Kostina EV. Application of Array-Based Oligonucleotides for Synthesis of Genetic Designs. Mol Biol 2021. [DOI: 10.1134/s0026893321030109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
|
72
|
Ren N, Liu Q, Yan L, Huang Q. Parallel Reporter Assays Identify Altered Regulatory Role of rs684232 in Leading to Prostate Cancer Predisposition. Int J Mol Sci 2021; 22:8792. [PMID: 34445492 PMCID: PMC8395720 DOI: 10.3390/ijms22168792] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 07/09/2021] [Revised: 08/07/2021] [Accepted: 08/13/2021] [Indexed: 02/06/2023] Open
Abstract
Functional characterization of cancer risk-associated single nucleotide polymorphisms (SNPs) identified by genome-wide association studies (GWAS) has become a major challenge. To identify the regulatory risk SNPs that can lead to transcriptional misregulation, we performed parallel reporter gene assays with both alleles of 213 prostate cancer risk-associated GWAS SNPs in 22Rv1 cells. We identified 32 regulatory SNPs whose two alleles exhibited different regulatory activities. For one of the regulatory SNPs, rs684232, we found that the variation altered chromatin binding of the transcription factor FOXA1 at the DNA region and led to aberrant expression of VPS53, FAM57A, and GEMIN4, which play vital roles in prostate cancer malignancy. Our findings reveal the roles and underlying mechanism of rs684232 in prostate cancer progression and hold great promise for benefiting prostate cancer patients through prognostic prediction and targeted therapies.
Affiliation(s)
| | | | | | - Qilai Huang
- Shandong Provincial Key Laboratory of Animal Cell and Developmental Biology, School of Life Sciences, Shandong University, Qingdao 266237, China; (N.R.); (Q.L.); (L.Y.)
| |
|
73
|
Cazier AP, Blazeck J. Advances in promoter engineering: novel applications and predefined transcriptional control. Biotechnol J 2021; 16:e2100239. [PMID: 34351706 DOI: 10.1002/biot.202100239] [Citation(s) in RCA: 45] [Impact Index Per Article: 15.0] [Received: 05/06/2021] [Revised: 07/30/2021] [Accepted: 08/03/2021] [Indexed: 11/08/2022]
Abstract
Synthetic biology continues to progress by relying on more robust tools for transcriptional control, of which promoters are the most fundamental component. Numerous studies have sought to characterize promoter function, determine principles to guide their engineering, and create promoters with stronger expression or tailored inducible control. In this review, we will summarize promoter architecture and highlight recent advances in the field, focusing on the novel applications of inducible promoter design and engineering towards metabolic engineering and cellular therapeutic development. Additionally, we will highlight how the expansion of new machine learning techniques for modeling and engineering promoter sequences is enabling more accurate prediction of promoter characteristics.
Affiliation(s)
- Andrew P Cazier
- School of Chemical and Biomolecular Engineering, Georgia Institute of Technology, 311 Ferst St. NW, Atlanta, Georgia, 30332, USA
| | - John Blazeck
- School of Chemical and Biomolecular Engineering, Georgia Institute of Technology, 311 Ferst St. NW, Atlanta, Georgia, 30332, USA
| |
|
74
|
Mulvey B, Dougherty JD. Transcriptional-regulatory convergence across functional MDD risk variants identified by massively parallel reporter assays. Transl Psychiatry 2021; 11:403. [PMID: 34294677 PMCID: PMC8298436 DOI: 10.1038/s41398-021-01493-6] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 03/15/2021] [Revised: 06/02/2021] [Accepted: 06/16/2021] [Indexed: 02/07/2023] Open
Abstract
Family and population studies indicate clear heritability of major depressive disorder (MDD), though its underlying biology remains unclear. The majority of single-nucleotide polymorphism (SNP) linkage blocks associated with MDD by genome-wide association studies (GWASes) are believed to alter transcriptional regulators (e.g., enhancers, promoters) based on enrichment of marks correlated with these functions. A key to understanding MDD pathophysiology will be elucidation of which SNPs are functional and how such functional variants biologically converge to elicit the disease. Furthermore, retinoids can elicit MDD in patients and promote depressive-like behaviors in rodent models, acting via a regulatory system of retinoid receptor transcription factors (TFs). We therefore sought to simultaneously identify functional genetic variants and assess retinoid pathway regulation of MDD risk loci. Using Massively Parallel Reporter Assays (MPRAs), we functionally screened over 1000 SNPs prioritized from 39 neuropsychiatric trait/disease GWAS loci, selecting SNPs based on overlap with predicted regulatory features, including expression quantitative trait loci (eQTL) and histone marks, from human brains and cell cultures. We identified >100 SNPs with allelic effects on expression in a retinoid-responsive model system. Functional SNPs were enriched for binding sequences of retinoic acid-receptive TFs, with additional allelic differences unmasked by treatment with all-trans retinoic acid (ATRA). Finally, motifs overrepresented across functional SNPs corresponded to TFs highly specific to serotonergic neurons, suggesting an in vivo site of action. Our application of MPRAs to screen MDD-associated SNPs suggests a shared transcriptional-regulatory program across loci, a component of which is unmasked by retinoids.
Affiliation(s)
- Bernard Mulvey
- Departments of Genetics and Psychiatry, Washington University in St. Louis, St. Louis, MO, USA
| | - Joseph D Dougherty
- Departments of Genetics and Psychiatry, Washington University in St. Louis, St. Louis, MO, USA.
| |
|
75
|
Lee D, Kapoor A, Lee C, Mudgett M, Beer MA, Chakravarti A. Sequence-based correction of barcode bias in massively parallel reporter assays. Genome Res 2021; 31:1638-1645. [PMID: 34285053 PMCID: PMC8415370 DOI: 10.1101/gr.268599.120] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/18/2020] [Accepted: 07/07/2021] [Indexed: 11/24/2022]
Abstract
Massively parallel reporter assays (MPRAs) are a high-throughput method for evaluating in vitro activities of thousands of candidate cis-regulatory elements (CREs). In these assays, candidate sequences are cloned upstream or downstream from a reporter gene tagged by unique DNA sequences. However, tag sequences may themselves affect reporter gene expression and lead to major potential biases in the measured cis-regulatory activity. Here, we present a sequence-based method for correcting tag-sequence-specific effects and show that our method can significantly reduce this source of variation and improve the identification of functional regulatory variants by MPRAs. We also show that our model captures sequence features associated with post-transcriptional regulation of mRNA. Thus, this new method helps not only to improve detection of regulatory signals in MPRA experiments but also to design better MPRA protocols.
Affiliation(s)
| | - Ashish Kapoor
- University of Texas Health Science Center at Houston
| | | | | | | | | |
|
76
|
Degtyareva AO, Antontseva EV, Merkulova TI. Regulatory SNPs: Altered Transcription Factor Binding Sites Implicated in Complex Traits and Diseases. Int J Mol Sci 2021; 22:6454. [PMID: 34208629 PMCID: PMC8235176 DOI: 10.3390/ijms22126454] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Received: 05/05/2021] [Revised: 06/15/2021] [Accepted: 06/15/2021] [Indexed: 12/19/2022] Open
Abstract
The vast majority of the genetic variants (mainly SNPs) associated with various human traits and diseases map to the noncoding part of the genome and are enriched in its regulatory compartment, suggesting that many causal variants may affect gene expression. The leading mechanism of action of these SNPs is alteration of transcription factor binding via creation or disruption of transcription factor binding sites (TFBSs) or a change in the affinity of these regulatory proteins for their cognate sites. In this review, we first focus on the history of the discovery of regulatory SNPs (rSNPs) and give a systematic description of the existing methodological approaches to their study. Then, we briefly review recent comprehensive examples of rSNPs studied from the discovery of the changes in the TFBS sequence caused by a nucleotide substitution, through identification of its effect on target gene expression, and eventually to phenotype. We also describe state-of-the-art genome-wide approaches to the identification of regulatory variants, including both making molecular sense of genome-wide association studies (GWAS) and the alternative approaches whose primary goal is to determine the functionality of genetic variants. Among these approaches, special attention is paid to expression quantitative trait loci (eQTL) analysis and the search for allele-specific events in RNA-seq (ASE events) as well as in ChIP-seq, DNase-seq, and ATAC-seq (ASB events) data.
Affiliation(s)
- Arina O. Degtyareva
- Department of Molecular Genetic, Institute of Cytology and Genetics, 630090 Novosibirsk, Russia; (A.O.D.); (E.V.A.)
| | - Elena V. Antontseva
- Department of Molecular Genetic, Institute of Cytology and Genetics, 630090 Novosibirsk, Russia; (A.O.D.); (E.V.A.)
| | - Tatiana I. Merkulova
- Department of Molecular Genetic, Institute of Cytology and Genetics, 630090 Novosibirsk, Russia; (A.O.D.); (E.V.A.)
- Department of Natural Sciences, Novosibirsk State University, 630090 Novosibirsk, Russia
|
77
|
Jores T, Tonnies J, Wrightsman T, Buckler ES, Cuperus JT, Fields S, Queitsch C. Synthetic promoter designs enabled by a comprehensive analysis of plant core promoters. Nat Plants 2021; 7:842-855. [PMID: 34083762 PMCID: PMC10246763 DOI: 10.1038/s41477-021-00932-y] [Citation(s) in RCA: 64] [Impact Index Per Article: 21.3] [Received: 01/13/2021] [Accepted: 04/27/2021] [Indexed: 05/24/2023]
Abstract
Targeted engineering of plant gene expression holds great promise for ensuring food security and for producing biopharmaceuticals in plants. However, this engineering requires thorough knowledge of cis-regulatory elements to precisely control either endogenous or introduced genes. To generate this knowledge, we used a massively parallel reporter assay to measure the activity of nearly complete sets of promoters from Arabidopsis, maize and sorghum. We demonstrate that core promoter elements-notably the TATA box-as well as promoter GC content and promoter-proximal transcription factor binding sites influence promoter strength. By performing the experiments in two assay systems, leaves of the dicot tobacco and protoplasts of the monocot maize, we detect species-specific differences in the contributions of GC content and transcription factors to promoter strength. Using these observations, we built computational models to predict promoter strength in both assay systems, allowing us to design highly active promoters comparable in activity to the viral 35S minimal promoter. Our results establish a promising experimental approach to optimize native promoter elements and generate synthetic ones with desirable features.
Affiliation(s)
- Tobias Jores
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Jackson Tonnies
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Graduate Program in Biology, University of Washington, Seattle, WA, USA
| | - Travis Wrightsman
- Section of Plant Breeding and Genetics, Cornell University, Ithaca, NY, USA
| | - Edward S Buckler
- Section of Plant Breeding and Genetics, Cornell University, Ithaca, NY, USA
- Agricultural Research Service, United States Department of Agriculture, Ithaca, NY, USA
- Institute for Genomic Diversity, Cornell University, Ithaca, NY, USA
| | - Josh T Cuperus
- Department of Genome Sciences, University of Washington, Seattle, WA, USA.
| | - Stanley Fields
- Department of Genome Sciences, University of Washington, Seattle, WA, USA.
- Department of Medicine, University of Washington, Seattle, WA, USA.
| | - Christine Queitsch
- Department of Genome Sciences, University of Washington, Seattle, WA, USA.
|
78
|
Manrubia S, Cuesta JA, Aguirre J, Ahnert SE, Altenberg L, Cano AV, Catalán P, Diaz-Uriarte R, Elena SF, García-Martín JA, Hogeweg P, Khatri BS, Krug J, Louis AA, Martin NS, Payne JL, Tarnowski MJ, Weiß M. From genotypes to organisms: State-of-the-art and perspectives of a cornerstone in evolutionary dynamics. Phys Life Rev 2021; 38:55-106. [PMID: 34088608 DOI: 10.1016/j.plrev.2021.03.004] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Received: 11/24/2020] [Accepted: 03/01/2021] [Indexed: 12/21/2022]
Abstract
Understanding how genotypes map onto phenotypes, fitness, and eventually organisms is arguably the next major missing piece in a fully predictive theory of evolution. We refer to this generally as the problem of the genotype-phenotype map. Though we are still far from achieving a complete picture of these relationships, our current understanding of simpler questions, such as the structure induced in the space of genotypes by sequences mapped to molecular structures, has revealed important facts that deeply affect the dynamical description of evolutionary processes. Empirical evidence supporting the fundamental relevance of features such as phenotypic bias is mounting as well, while the synthesis of conceptual and experimental progress leads to questioning current assumptions on the nature of evolutionary dynamics-cancer progression models or synthetic biology approaches being notable examples. This work delves with a critical and constructive attitude into our current knowledge of how genotypes map onto molecular phenotypes and organismal functions, and discusses theoretical and empirical avenues to broaden and improve this comprehension. As a final goal, this community should aim at deriving an updated picture of evolutionary processes soundly relying on the structural properties of genotype spaces, as revealed by modern techniques of molecular and functional analysis.
Affiliation(s)
- Susanna Manrubia
- Department of Systems Biology, Centro Nacional de Biotecnología (CSIC), Madrid, Spain; Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain.
| | - José A Cuesta
- Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain; Departamento de Matemáticas, Universidad Carlos III de Madrid, Leganés, Spain; Instituto de Biocomputación y Física de Sistemas Complejos (BiFi), Universidad de Zaragoza, Spain; UC3M-Santander Big Data Institute (IBiDat), Getafe, Madrid, Spain
| | - Jacobo Aguirre
- Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain; Centro de Astrobiología, CSIC-INTA, ctra. de Ajalvir km 4, 28850 Torrejón de Ardoz, Madrid, Spain
| | - Sebastian E Ahnert
- Department of Chemical Engineering and Biotechnology, University of Cambridge, Philippa Fawcett Drive, Cambridge CB3 0AS, UK; The Alan Turing Institute, British Library, 96 Euston Road, London NW1 2DB, UK
| | - Alejandro V Cano
- Institute of Integrative Biology, ETH Zurich, Zurich, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Pablo Catalán
- Grupo Interdisciplinar de Sistemas Complejos (GISC), Madrid, Spain; Departamento de Matemáticas, Universidad Carlos III de Madrid, Leganés, Spain
| | - Ramon Diaz-Uriarte
- Department of Biochemistry, Universidad Autónoma de Madrid, Madrid, Spain; Instituto de Investigaciones Biomédicas "Alberto Sols" (UAM-CSIC), Madrid, Spain
| | - Santiago F Elena
- Instituto de Biología Integrativa de Sistemas, I(2)SysBio (CSIC-UV), València, Spain; The Santa Fe Institute, Santa Fe, NM, USA
| | - Paulien Hogeweg
- Theoretical Biology and Bioinformatics Group, Utrecht University, the Netherlands
| | - Bhavin S Khatri
- The Francis Crick Institute, London, UK; Department of Life Sciences, Imperial College London, London, UK
| | - Joachim Krug
- Institute for Biological Physics, University of Cologne, Köln, Germany
| | - Ard A Louis
- Rudolf Peierls Centre for Theoretical Physics, University of Oxford, Oxford, UK
| | - Nora S Martin
- Theory of Condensed Matter Group, Cavendish Laboratory, University of Cambridge, Cambridge, UK; Sainsbury Laboratory, University of Cambridge, Cambridge, UK
| | - Joshua L Payne
- Institute of Integrative Biology, ETH Zurich, Zurich, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Marcel Weiß
- Theory of Condensed Matter Group, Cavendish Laboratory, University of Cambridge, Cambridge, UK; Sainsbury Laboratory, University of Cambridge, Cambridge, UK
|
79
|
Letiagina AE, Omelina ES, Ivankin AV, Pindyurin AV. MPRAdecoder: Processing of the Raw MPRA Data With a priori Unknown Sequences of the Region of Interest and Associated Barcodes. Front Genet 2021; 12:618189. [PMID: 34046055 PMCID: PMC8148044 DOI: 10.3389/fgene.2021.618189] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2020] [Accepted: 03/25/2021] [Indexed: 11/13/2022] Open
Abstract
Massively parallel reporter assays (MPRAs) enable high-throughput functional evaluation of numerous DNA regulatory elements and/or their mutant variants. The assays are based on the construction of reporter plasmid libraries containing two variable parts, a region of interest (ROI) and a barcode (BC), located outside and within the transcription unit, respectively. Importantly, each plasmid molecule in such a highly diverse library is characterized by a unique BC-ROI association. The reporter constructs are delivered to target cells and expression of BCs at the transcript level is assayed by RT-PCR followed by next-generation sequencing (NGS). The obtained values are normalized to the abundance of BCs in the plasmid DNA sample. Altogether, this allows evaluating the regulatory potential of the associated ROI sequences. However, depending on the MPRA library construction design, the BC and ROI sequences as well as their associations can be a priori unknown. In such a case, the BC and ROI sequences, their possible mutant variants, and unambiguous BC-ROI associations have to be identified, whereas all uncertain cases have to be excluded from the analysis. Besides the preparation of additional "mapping" samples for NGS, this also requires specific bioinformatics tools. Here, we present a pipeline for processing raw MPRA data obtained by NGS for reporter construct libraries with a priori unknown sequences of BCs and ROIs. The pipeline robustly identifies unambiguous (so-called genuine) BCs and ROIs associated with them, calculates the normalized expression level for each BC and the averaged values for each ROI, and provides a graphical visualization of the processed data.
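The normalization step described in this abstract (barcode expression divided by barcode abundance in the plasmid DNA, then averaged per ROI) can be illustrated with a minimal sketch. This is not the MPRAdecoder implementation: the function name `normalized_expression`, its parameters, the library-size scaling, and the `min_dna` cutoff are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from statistics import mean


def normalized_expression(rna_counts, dna_counts, bc_to_roi, min_dna=1):
    """Return (per-barcode RNA/DNA ratios, per-ROI mean ratios).

    rna_counts / dna_counts: dicts mapping barcode -> read count.
    bc_to_roi: dict mapping each unambiguous barcode -> its ROI.
    min_dna: hypothetical cutoff; barcodes with fewer plasmid reads are skipped.
    """
    rna_total = sum(rna_counts.values())
    dna_total = sum(dna_counts.values())
    if rna_total == 0 or dna_total == 0:
        raise ValueError("both samples must contain at least one read")

    per_bc = {}
    for bc, dna in dna_counts.items():
        # Barcodes without an unambiguous ROI association are excluded.
        if bc not in bc_to_roi or dna < min_dna:
            continue
        # Scale each count by its sample's sequencing depth (an assumption),
        # then normalize transcript abundance to plasmid abundance.
        rna_frac = rna_counts.get(bc, 0) / rna_total
        dna_frac = dna / dna_total
        per_bc[bc] = rna_frac / dna_frac

    # Average the barcode-level values belonging to the same ROI.
    grouped = defaultdict(list)
    for bc, value in per_bc.items():
        grouped[bc_to_roi[bc]].append(value)
    per_roi = {roi: mean(values) for roi, values in grouped.items()}
    return per_bc, per_roi
```

Barcodes missing from `bc_to_roi` are dropped rather than guessed, which mirrors the abstract's requirement that uncertain cases be excluded from the analysis.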
Affiliation(s)
- Anna E Letiagina
- Institute of Molecular and Cellular Biology of the Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia; Faculty of Natural Sciences, Novosibirsk State University, Novosibirsk, Russia
| | - Evgeniya S Omelina
- Institute of Molecular and Cellular Biology of the Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
| | - Anton V Ivankin
- Institute of Molecular and Cellular Biology of the Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
| | - Alexey V Pindyurin
- Institute of Molecular and Cellular Biology of the Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
|
80
|
Roberts BS, Partridge EC, Moyers BA, Agarwal V, Newberry KM, Martin BK, Shendure J, Myers RM, Cooper GM. Genome-wide strand asymmetry in massively parallel reporter activity favors genic strands. Genome Res 2021; 31:866-876. [PMID: 33879525 PMCID: PMC8092006 DOI: 10.1101/gr.270751.120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2020] [Accepted: 02/18/2021] [Indexed: 11/24/2022]
Abstract
Massively parallel reporter assays (MPRAs) are useful tools to characterize regulatory elements in human genomes. An aspect of MPRAs that is not typically the focus of analysis is their intrinsic ability to differentiate activity levels for a given sequence element when placed in both of its possible orientations relative to the reporter construct. Here, we describe pervasive strand asymmetry of MPRA signals in data sets from multiple reporter configurations in both published and newly reported data. These effects are reproducible across different cell types and in different treatments within a cell type and are observed both within and outside of annotated regulatory elements. From elements in gene bodies, MPRA strand asymmetry favors the sense strand, suggesting that function related to endogenous transcription is driving the phenomenon. Similarly, we find that within Alu mobile element insertions, strand asymmetry favors the transcribed strand of the ancestral retrotransposon. The effect is consistent across the multiplicity of Alu elements in human genomes and is more pronounced in less diverged Alu elements. We find sequence features driving MPRA strand asymmetry and show its prediction from sequence alone. We see some evidence for RNA stabilization and transcriptional activation mechanisms and hypothesize that the effect is driven by natural selection favoring efficient transcription. Our results indicate that strand asymmetry is a pervasive and reproducible feature in MPRA data. More importantly, the fact that MPRA asymmetry favors naturally transcribed strands suggests that it stems from preserved biological functions that have a substantial, global impact on gene and genome evolution.
Affiliation(s)
- Brian S Roberts
- HudsonAlpha Institute for Biotechnology, Huntsville, Alabama 35806, USA; Department of Biological Sciences, The University of Alabama in Huntsville, Huntsville, Alabama 35899, USA
| | - Bryan A Moyers
- HudsonAlpha Institute for Biotechnology, Huntsville, Alabama 35806, USA
| | - Vikram Agarwal
- Calico Life Sciences LLC, South San Francisco, California 94080, USA
| | - Beth K Martin
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Jay Shendure
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA; Howard Hughes Medical Institute, Seattle, Washington 98195, USA; Brotman Baty Institute for Precision Medicine, University of Washington, Seattle, Washington 98195, USA
| | - Richard M Myers
- HudsonAlpha Institute for Biotechnology, Huntsville, Alabama 35806, USA
| | - Gregory M Cooper
- HudsonAlpha Institute for Biotechnology, Huntsville, Alabama 35806, USA
|
81
|
Lalanne J, Parker DJ, Li G. Spurious regulatory connections dictate the expression-fitness landscape of translation factors. Mol Syst Biol 2021; 17:e10302. [PMID: 33900014 PMCID: PMC8073009 DOI: 10.15252/msb.202110302] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 02/19/2021] [Revised: 03/12/2021] [Accepted: 03/16/2021] [Indexed: 12/21/2022] Open
Abstract
During steady-state cell growth, individual enzymatic fluxes can be directly inferred from growth rate by mass conservation, but the inverse problem remains unsolved. Perturbing the flux and expression of a single enzyme could have pleiotropic effects that may or may not dominate the impact on cell fitness. Here, we quantitatively dissect the molecular and global responses to varied expression of translation termination factors (peptide release factors, RFs) in the bacterium Bacillus subtilis. While endogenous RF expression maximizes proliferation, deviations in expression lead to unexpected distal regulatory responses that dictate fitness reduction. Molecularly, RF depletion causes expression imbalance at specific operons, which activates master regulators and detrimentally overrides the transcriptome. Through these spurious connections, RF abundances are thus entrenched by focal points within the regulatory network, in one case located at a single stop codon. Such regulatory entrenchment suggests that predictive bottom-up models of expression-fitness landscapes will require near-exhaustive characterization of parts.
Affiliation(s)
- Jean‐Benoît Lalanne
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
- Department of Physics, Massachusetts Institute of Technology, Cambridge, MA, USA
- Present address: Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Darren J Parker
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
- Present address: Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA
| | - Gene‐Wei Li
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA
|
82
|
The Context-Dependent Influence of Promoter Sequence Motifs on Transcription Initiation Kinetics and Regulation. J Bacteriol 2021; 203:JB.00512-20. [PMID: 33139481 DOI: 10.1128/jb.00512-20] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 12/13/2022] Open
Abstract
The fitness of an individual bacterial cell is highly dependent upon the temporal tuning of gene expression levels when subjected to different environmental cues. Kinetic regulation of transcription initiation is a key step in modulating the levels of transcribed genes to promote bacterial survival. The initiation phase encompasses the binding of RNA polymerase (RNAP) to promoter DNA and a series of coupled protein-DNA conformational changes prior to entry into processive elongation. The time required to complete the initiation phase can vary by orders of magnitude and is ultimately dictated by the DNA sequence of the promoter. In this review, we aim to provide the required background to understand how promoter sequence motifs may affect initiation kinetics during promoter recognition and binding, subsequent conformational changes which lead to DNA opening around the transcription start site, and promoter escape. By calculating the steady-state flux of RNA production as a function of these effects, we illustrate that the presence/absence of a consensus promoter motif cannot be used in isolation to make conclusions regarding promoter strength. Instead, the entire series of linked, sequence-dependent structural transitions must be considered holistically. Finally, we describe how individual transcription factors take advantage of the broad distribution of sequence-dependent basal kinetics to either increase or decrease RNA flux.
|
83
|
Global discovery of lupus genetic risk variant allelic enhancer activity. Nat Commun 2021; 12:1611. [PMID: 33712590 PMCID: PMC7955039 DOI: 10.1038/s41467-021-21854-5] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 01/18/2020] [Accepted: 02/16/2021] [Indexed: 12/17/2022] Open
Abstract
Genome-wide association studies of Systemic Lupus Erythematosus (SLE) nominate 3073 genetic variants at 91 risk loci. To systematically screen these variants for allelic transcriptional enhancer activity, we construct a massively parallel reporter assay (MPRA) library comprising 12,396 DNA oligonucleotides containing the genomic context around every allele of each SLE variant. Transfection into the Epstein-Barr virus-transformed B cell line GM12878 reveals 482 variants with enhancer activity, with 51 variants showing genotype-dependent (allelic) enhancer activity at 27 risk loci. Comparison of MPRA results in GM12878 and Jurkat T cell lines highlights shared and unique allelic transcriptional regulatory mechanisms at SLE risk loci. In-depth analysis of allelic transcription factor (TF) binding at and around allelic variants identifies one class of TFs whose DNA-binding motif tends to be directly altered by the risk variant and a second class of TFs that bind allelically without direct alteration of their motif by the variant. Collectively, our approach provides a blueprint for the discovery of allelic gene regulation at risk loci for any disease and offers insight into the transcriptional regulatory mechanisms underlying SLE. Thousands of genetic variants have been associated with lupus, but causal variants and mechanisms are unknown. Here, the authors combine a massively parallel reporter assay with genome-wide ChIP experiments to identify risk variants with allelic enhancer activity mediated through transcription factor binding.
|
84
|
Abstract
Motivation: The universal expressibility assumption of Deep Neural Networks (DNNs) is the key motivation behind recent works in the systems biology community to employ DNNs to solve important problems in functional genomics and molecular genetics. Typically, such investigations have taken a ‘black box’ approach in which the internal structure of the model used is set purely by machine learning considerations with little consideration of representing the internal structure of the biological system by the mathematical structure of the DNN. DNNs have not yet been applied to the detailed modeling of transcriptional control in which mRNA production is controlled by the binding of specific transcription factors to DNA, in part because such models are in part formulated in terms of specific chemical equations that appear different in form from those used in neural networks. Results: In this paper, we give an example of a DNN which can model the detailed control of transcription in a precise and predictive manner. Its internal structure is fully interpretable and is faithful to underlying chemistry of transcription factor binding to DNA. We derive our DNN from a systems biology model that was not previously recognized as having a DNN structure. Although we apply our DNN to data from the early embryo of the fruit fly Drosophila, this system serves as a test bed for analysis of much larger datasets obtained by systems biology studies on a genomic scale. Availability and implementation: The implementation and data for the models used in this paper are in a zip file in the supplementary material. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yi Liu
- Department of Statistics, Ecology and Evolution, Molecular Genetics & Cell Biology, Institute of Genomics and Systems Biology, University of Chicago, Chicago, IL 60637, USA
| | - Kenneth Barr
- Department of Human Genetics, Ecology and Evolution, Molecular Genetics & Cell Biology, Institute of Genomics and Systems Biology, University of Chicago, Chicago, IL 60637, USA
| | - John Reinitz
- Departments of Statistics, Ecology and Evolution, Molecular Genetics & Cell Biology, Institute of Genomics and Systems Biology, University of Chicago, Chicago, IL 60637, USA
|
85
|
Massively parallel assessment of human variants with base editor screens. Cell 2021; 184:1064-1080.e20. [PMID: 33606977 DOI: 10.1016/j.cell.2021.01.012] [Citation(s) in RCA: 160] [Impact Index Per Article: 53.3] [Received: 05/05/2020] [Revised: 10/21/2020] [Accepted: 01/07/2021] [Indexed: 12/26/2022]
Abstract
Understanding the functional consequences of single-nucleotide variants is critical to uncovering the genetic underpinnings of diseases, but technologies to characterize variants are limiting. Here, we leverage CRISPR-Cas9 cytosine base editors in pooled screens to scalably assay variants at endogenous loci in mammalian cells. We benchmark the performance of base editors in positive and negative selection screens, identifying known loss-of-function mutations in BRCA1 and BRCA2 with high precision. To demonstrate the utility of base editor screens to probe small molecule-protein interactions, we screen against BH3 mimetics and PARP inhibitors, identifying point mutations that confer drug sensitivity or resistance. We also create a library of single guide RNAs (sgRNAs) predicted to generate 52,034 ClinVar variants in 3,584 genes and conduct screens in the presence of cellular stressors, identifying loss-of-function variants in numerous DNA damage repair genes. We anticipate that this screening approach will be broadly useful to readily and scalably functionalize genetic variants.
|
86
|
PsychENCODE and beyond: transcriptomics and epigenomics of brain development and organoids. Neuropsychopharmacology 2021; 46:70-85. [PMID: 32659782 PMCID: PMC7689467 DOI: 10.1038/s41386-020-0763-3] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 05/13/2020] [Revised: 06/24/2020] [Accepted: 06/25/2020] [Indexed: 12/13/2022]
Abstract
Crucial decisions involving cell fate and connectivity that shape the distinctive development of the human brain occur in the embryonic and fetal stages-stages that are difficult to access and investigate in humans. The last decade has seen an impressive increase in resources-from atlases and databases to biological models-that is progressively lifting the curtain on this critical period. In this review, we describe the current state of genomic, transcriptomic, and epigenomic datasets charting the development of normal human brain with a particular focus on recent single-cell technologies. We discuss the emergence of brain organoids generated from pluripotent stem cells as a model to compensate for the limited availability of fetal tissue. Indeed, comparisons of neural lineages, transcriptional dynamics, and noncoding element activity between fetal brain and organoids have helped identify gene regulatory networks functioning at early stages of brain development. Altogether, we argue that large multi-omics investigations have pushed brain development into the "big data" era, and that current and future transversal approaches needed to leverage both fetal brain and organoid resources promise to answer major questions of brain biology and psychiatry.
|
87
|
Mulvey B, Lagunas T, Dougherty JD. Massively Parallel Reporter Assays: Defining Functional Psychiatric Genetic Variants Across Biological Contexts. Biol Psychiatry 2021; 89:76-89. [PMID: 32843144 PMCID: PMC7938388 DOI: 10.1016/j.biopsych.2020.06.011] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 02/11/2020] [Revised: 06/09/2020] [Accepted: 06/10/2020] [Indexed: 12/18/2022]
Abstract
Neuropsychiatric phenotypes have long been known to be influenced by heritable risk factors, directly confirmed by the past decade of genetic studies that have revealed specific genetic variants enriched in disease cohorts. However, the initial hope that a small set of genes would be responsible for a given disorder proved false. The more complex reality is that a given disorder may be influenced by myriad small-effect noncoding variants and/or by rare but severe coding variants, many de novo. Noncoding genomic sequences-for which molecular functions cannot usually be inferred-harbor a large portion of these variants, creating a substantial barrier to understanding higher-order molecular and biological systems of disease. Fortunately, novel genetic technologies-scalable oligonucleotide synthesis, RNA sequencing, and CRISPR (clustered regularly interspaced short palindromic repeats)-have opened novel avenues to experimentally identify biologically significant variants en masse. Massively parallel reporter assays (MPRAs) are an especially versatile technique resulting from such innovations. MPRAs are powerful molecular genetics tools that can be used to screen thousands of untranscribed or untranslated sequences and their variants for functional effects in a single experiment. This approach, though underutilized in psychiatric genetics, has several useful features for the field. We review methods for assaying putatively functional genetic variants and regions, emphasizing MPRAs and the opportunities they hold for dissection of psychiatric polygenicity. We discuss literature applying functional assays in neurogenetics, highlighting strengths, caveats, and design considerations-especially regarding disease-relevant variables (cell type, neurodevelopment, and sex), and we ultimately propose applications of MPRA to both computational and experimental neurogenetics of polygenic disease risk.
Affiliation(s)
- Bernard Mulvey
- Division of Biology and Biomedical Sciences, Washington University School of Medicine in St. Louis, St. Louis, Missouri; Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, Missouri; Department of Psychiatry, Washington University School of Medicine in St. Louis, St. Louis, Missouri
| | - Tomás Lagunas
- Division of Biology and Biomedical Sciences, Washington University School of Medicine in St. Louis, St. Louis, Missouri; Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, Missouri; Department of Psychiatry, Washington University School of Medicine in St. Louis, St. Louis, Missouri
| | - Joseph D Dougherty
- Department of Genetics, Washington University School of Medicine in St. Louis, St. Louis, Missouri; Department of Psychiatry, Washington University School of Medicine in St. Louis, St. Louis, Missouri.
|
88
|
Kartje ZJ, Janis HI, Mukhopadhyay S, Gagnon KT. Revisiting T7 RNA polymerase transcription in vitro with the Broccoli RNA aptamer as a simplified real-time fluorescent reporter. J Biol Chem 2020; 296:100175. [PMID: 33303627 PMCID: PMC7948468 DOI: 10.1074/jbc.ra120.014553] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 05/26/2020] [Revised: 12/03/2020] [Accepted: 12/10/2020] [Indexed: 11/06/2022] Open
Abstract
Methods for rapid and high-throughput screening of transcription in vitro to examine reaction conditions, enzyme mutants, promoter variants, and small molecule modulators can be extremely valuable tools. However, these techniques may be difficult to establish or inaccessible to many researchers. To develop a straightforward and cost-effective platform for assessing transcription in vitro, we used the "Broccoli" RNA aptamer as a direct, real-time fluorescent transcript readout. To demonstrate the utility of our approach, we screened the effect of common reaction conditions and components on bacteriophage T7 RNA polymerase (RNAP) activity using a common quantitative PCR instrument for fluorescence detection. Several essential conditions for in vitro transcription by T7 RNAP were confirmed with this assay, including the importance of enzyme and substrate concentrations, covariation of magnesium and nucleoside triphosphates, and the effects of several typical additives. When we used this method to assess all possible point mutants of a canonical T7 RNAP promoter, our results coincided well with previous reports. This approach should translate well to a broad variety of bacteriophage in vitro transcription systems and provides a platform for developing fluorescence-based readouts of more complex transcription systems in vitro.
Affiliation(s)
- Zachary J Kartje
- Department of Chemistry and Biochemistry, Southern Illinois University, Carbondale, Illinois, USA
| | - Helen I Janis
- Department of Chemistry and Biochemistry, Southern Illinois University, Carbondale, Illinois, USA
| | - Shaoni Mukhopadhyay
- Department of Biochemistry and Molecular Biology, Southern Illinois University School of Medicine, Carbondale, Illinois, USA
| | - Keith T Gagnon
- Department of Chemistry and Biochemistry, Southern Illinois University, Carbondale, Illinois, USA; Department of Biochemistry and Molecular Biology, Southern Illinois University School of Medicine, Carbondale, Illinois, USA.
| |
|
89
|
Muller R, Meacham ZA, Ferguson L, Ingolia NT. CiBER-seq dissects genetic networks by quantitative CRISPRi profiling of expression phenotypes. Science 2020; 370:eabb9662. [PMID: 33303588 PMCID: PMC7819735 DOI: 10.1126/science.abb9662] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 03/30/2020] [Accepted: 10/22/2020] [Indexed: 12/12/2022]
Abstract
To realize the promise of CRISPR-Cas9-based genetics, approaches are needed to quantify a specific, molecular phenotype across genome-wide libraries of genetic perturbations. We addressed this challenge by profiling transcriptional, translational, and posttranslational reporters using CRISPR interference (CRISPRi) with barcoded expression reporter sequencing (CiBER-seq). Our barcoding approach allowed us to connect an entire library of guides to their individual phenotypic consequences using pooled sequencing. CiBER-seq profiling fully recapitulated the integrated stress response (ISR) pathway in yeast. Genetic perturbations causing uncharged transfer RNA (tRNA) accumulation activated ISR reporter transcription. Notably, tRNA insufficiency also activated the reporter, independent of the uncharged tRNA sensor. By uncovering alternate triggers for ISR activation, we illustrate how precise, comprehensive CiBER-seq profiling provides a powerful and broadly applicable tool for dissecting genetic networks.
Affiliation(s)
- Ryan Muller
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720, USA
| | - Zuriah A Meacham
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720, USA
| | - Lucas Ferguson
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720, USA
| | - Nicholas T Ingolia
- Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720, USA.
- California Institute for Quantitative Biosciences, University of California, Berkeley, Berkeley, CA 94720, USA
| |
|
90
|
López-Rivera F, Foster Rhoades OK, Vincent BJ, Pym ECG, Bragdon MDJ, Estrada J, DePace AH, Wunderlich Z. A Mutation in the Drosophila melanogaster eve Stripe 2 Minimal Enhancer Is Buffered by Flanking Sequences. G3 (Bethesda) 2020; 10:4473-4482. [PMID: 33037064 PMCID: PMC7718739 DOI: 10.1534/g3.120.401777] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 08/01/2020] [Accepted: 10/01/2020] [Indexed: 01/18/2023]
Abstract
Enhancers are DNA sequences composed of transcription factor binding sites that drive complex patterns of gene expression in space and time. Until recently, studying enhancers in their genomic context was technically challenging. Therefore, minimal enhancers, the shortest pieces of DNA that can drive an expression pattern that resembles a gene's endogenous pattern, are often used to study features of enhancer function. However, evidence suggests that some enhancers require sequences outside the minimal enhancer to maintain function under environmental perturbations. We hypothesized that these additional sequences also prevent misexpression caused by a transcription factor binding site mutation within a minimal enhancer. Using the Drosophila melanogaster even-skipped stripe 2 enhancer as a case study, we tested the effect of a Giant binding site mutation (gt-2) on the expression patterns driven by minimal and extended enhancer reporter constructs. We found that, in contrast to the misexpression caused by the gt-2 binding site deletion in the minimal enhancer, the same gt-2 binding site deletion in the extended enhancer did not have an effect on expression. The buffering of expression levels, but not expression pattern, is partially explained by an additional Giant binding site outside the minimal enhancer. Deleting the gt-2 binding site in the endogenous locus had no significant effect on stripe 2 expression. Our results indicate that rules derived from mutating enhancer reporter constructs may not represent what occurs in the endogenous context.
Affiliation(s)
- Francheska López-Rivera
- Department of Systems Biology, Harvard Medical School, Boston, MA 02115
- GSAS Research Scholar Initiative, Harvard University, Cambridge, MA 02138
| | | | - Ben J Vincent
- Department of Systems Biology, Harvard Medical School, Boston, MA 02115
| | - Edward C G Pym
- Department of Systems Biology, Harvard Medical School, Boston, MA 02115
| | | | - Javier Estrada
- Department of Systems Biology, Harvard Medical School, Boston, MA 02115
| | - Angela H DePace
- Department of Systems Biology, Harvard Medical School, Boston, MA 02115
| | - Zeba Wunderlich
- Department of Systems Biology, Harvard Medical School, Boston, MA 02115
| |
|
91
|
Molecular and evolutionary processes generating variation in gene expression. Nat Rev Genet 2020; 22:203-215. [PMID: 33268840 DOI: 10.1038/s41576-020-00304-w] [Citation(s) in RCA: 121] [Impact Index Per Article: 30.3] [Accepted: 10/21/2020] [Indexed: 12/18/2022]
Abstract
Heritable variation in gene expression is common within and between species. This variation arises from mutations that alter the form or function of molecular gene regulatory networks that are then filtered by natural selection. High-throughput methods for introducing mutations and characterizing their cis- and trans-regulatory effects on gene expression (particularly, transcription) are revealing how different molecular mechanisms generate regulatory variation, and studies comparing these mutational effects with variation seen in the wild are teasing apart the role of neutral and non-neutral evolutionary processes. This integration of molecular and evolutionary biology allows us to understand how the variation in gene expression we see today came to be and to predict how it is most likely to evolve in the future.
|
92
|
Townsley KG, Brennand KJ, Huckins LM. Massively parallel techniques for cataloguing the regulome of the human brain. Nat Neurosci 2020; 23:1509-1521. [PMID: 33199899 PMCID: PMC8018778 DOI: 10.1038/s41593-020-00740-1] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 08/11/2020] [Accepted: 10/13/2020] [Indexed: 12/14/2022]
Abstract
Complex brain disorders are highly heritable and arise from a complex polygenic risk architecture. Many disease-associated loci are found in non-coding regions that house regulatory elements. These elements influence the transcription of target genes (many of which demonstrate cell-type-specific expression patterns) and thereby affect phenotypically relevant molecular pathways. Thus, cell-type-specificity must be considered when prioritizing candidate risk loci, variants and target genes. This Review discusses the use of high-throughput assays in human induced pluripotent stem cell-based neurodevelopmental models to probe genetic risk in a cell-type- and patient-specific manner. The application of massively parallel reporter assays in human induced pluripotent stem cells can characterize the human regulome and test the transcriptional responses of putative regulatory elements. Parallel CRISPR-based screens can further functionally dissect this genetic regulatory architecture. The integration of these emerging technologies could decode genetic risk into medically actionable information, thereby improving genetic diagnosis and identifying novel points of therapeutic intervention.
Affiliation(s)
- Kayla G Townsley
- Graduate School of Biomedical Science, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Nash Family Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Pamela Sklar Division of Psychiatric Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Kristen J Brennand
- Graduate School of Biomedical Science, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Nash Family Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Black Family Stem Cell Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Department of Genetics and Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
| | - Laura M Huckins
- Pamela Sklar Division of Psychiatric Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Department of Genetics and Genomics, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Seaver Autism Center for Research and Treatment, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Mental Illness Research, Education and Clinical Centers, James J. Peters Department of Veterans Affairs Medical Center, Bronx, NY, USA.
| |
|
93
|
Renganaath K, Chong R, Day L, Kosuri S, Kruglyak L, Albert FW. Systematic identification of cis-regulatory variants that cause gene expression differences in a yeast cross. eLife 2020; 9:e62669. [PMID: 33179598 PMCID: PMC7685706 DOI: 10.7554/elife.62669] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 09/02/2020] [Accepted: 11/11/2020] [Indexed: 02/06/2023] Open
Abstract
Sequence variation in regulatory DNA alters gene expression and shapes genetically complex traits. However, the identification of individual, causal regulatory variants is challenging. Here, we used a massively parallel reporter assay to measure the cis-regulatory consequences of 5832 natural DNA variants in the promoters of 2503 genes in the yeast Saccharomyces cerevisiae. We identified 451 causal variants, which underlie genetic loci known to affect gene expression. Several promoters harbored multiple causal variants. In five promoters, pairs of variants showed non-additive, epistatic interactions. Causal variants were enriched at conserved nucleotides, tended to have low derived allele frequency, and were depleted from promoters of essential genes, which is consistent with the action of negative selection. Causal variants were also enriched for alterations in transcription factor binding sites. Models integrating these features provided modest, but statistically significant, ability to predict causal variants. This work revealed a complex molecular basis for cis-acting regulatory variation.
Affiliation(s)
- Kaushik Renganaath
- Department of Genetics, Cell Biology, & Development, University of Minnesota, Minneapolis, United States
| | - Rockie Chong
- Department of Chemistry & Biochemistry, University of California, Los Angeles, Los Angeles, United States
| | - Laura Day
- Department of Human Genetics, University of California, Los Angeles, Los Angeles, United States
- Department of Biological Chemistry, University of California, Los Angeles, Los Angeles, United States
- Howard Hughes Medical Institute, University of California, Los Angeles, Los Angeles, United States
| | - Sriram Kosuri
- Department of Chemistry & Biochemistry, University of California, Los Angeles, Los Angeles, United States
| | - Leonid Kruglyak
- Department of Human Genetics, University of California, Los Angeles, Los Angeles, United States
- Department of Biological Chemistry, University of California, Los Angeles, Los Angeles, United States
- Howard Hughes Medical Institute, University of California, Los Angeles, Los Angeles, United States
| | - Frank W Albert
- Department of Genetics, Cell Biology, & Development, University of Minnesota, Minneapolis, United States
| |
|
94
|
Klein JC, Agarwal V, Inoue F, Keith A, Martin B, Kircher M, Ahituv N, Shendure J. A systematic evaluation of the design and context dependencies of massively parallel reporter assays. Nat Methods 2020; 17:1083-1091. [PMID: 33046894 PMCID: PMC7727316 DOI: 10.1038/s41592-020-0965-y] [Citation(s) in RCA: 95] [Impact Index Per Article: 23.8] [Received: 07/08/2020] [Accepted: 08/27/2020] [Indexed: 01/02/2023]
Abstract
Massively parallel reporter assays (MPRAs) functionally screen thousands of sequences for regulatory activity in parallel. To date, there are limited studies that systematically compare differences in MPRA design. Here, we screen a library of 2,440 candidate liver enhancers and controls for regulatory activity in HepG2 cells using nine different MPRA designs. We identify subtle but significant differences that correlate with epigenetic and sequence-level features, as well as differences in dynamic range and reproducibility. We also validate that enhancer activity is largely independent of orientation, at least for our library and designs. Finally, we assemble and test the same enhancers as 192-mers, 354-mers and 678-mers and observe sizable differences. This work provides a framework for the experimental design of high-throughput reporter assays, suggesting that the extended sequence context of tested elements and to a lesser degree the precise assay, influence MPRA results.
Affiliation(s)
- Jason C Klein
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Vikram Agarwal
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Calico Life Sciences LLC, South San Francisco, CA, USA
| | - Fumitaka Inoue
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA
- Institute for the Advanced Study of Human Biology (WPI-ASHBi), Kyoto University, Kyoto, Japan
| | - Aidan Keith
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Beth Martin
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Martin Kircher
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Berlin Institute of Health (BIH), Berlin, Germany
- Charité - Universitätsmedizin Berlin, Berlin, Germany
| | - Nadav Ahituv
- Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA, USA.
- Institute for Human Genetics, University of California San Francisco, San Francisco, CA, USA.
| | - Jay Shendure
- Department of Genome Sciences, University of Washington, Seattle, WA, USA.
- Howard Hughes Medical Institute, Seattle, WA, USA.
- Brotman Baty Institute for Precision Medicine, University of Washington, Seattle, WA, USA.
| |
|
95
|
Sidore AM, Plesa C, Samson JA, Lubock NB, Kosuri S. DropSynth 2.0: high-fidelity multiplexed gene synthesis in emulsions. Nucleic Acids Res 2020; 48:e95. [PMID: 32692349 PMCID: PMC7498354 DOI: 10.1093/nar/gkaa600] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 11/18/2019] [Revised: 06/13/2020] [Accepted: 07/11/2020] [Indexed: 01/12/2023] Open
Abstract
Multiplexed assays allow functional testing of large synthetic libraries of genetic elements, but are limited by the designability, length, fidelity and scale of the input DNA. Here, we improve DropSynth, a low-cost, multiplexed method that builds gene libraries by compartmentalizing and assembling microarray-derived oligonucleotides in vortexed emulsions. By optimizing enzyme choice, adding enzymatic error correction and increasing scale, we show that DropSynth can build thousands of gene-length fragments at >20% fidelity.
Affiliation(s)
- Angus M Sidore
- Department of Chemical and Biomolecular Engineering, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Calin Plesa
- Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Joyce A Samson
- Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Nathan B Lubock
- Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA 90095, USA
| | - Sriram Kosuri
- Department of Chemistry and Biochemistry, University of California, Los Angeles, Los Angeles, CA 90095, USA; UCLA-DOE Institute for Genomics and Proteomics, Molecular Biology Institute, Quantitative and Computational Biology Institute, Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, Jonsson Comprehensive Cancer Center, University of California, Los Angeles, Los Angeles, CA 90095, USA
| |
|
96
|
Chen X, Zhang C, Lindley ND. Metabolic Engineering Strategies for Sustainable Terpenoid Flavor and Fragrance Synthesis. J Agric Food Chem 2020; 68:10252-10264. [PMID: 31865696 DOI: 10.1021/acs.jafc.9b06203] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Indexed: 06/10/2023]
Abstract
Terpenoids derived from plant material are widely applied in the flavor and fragrance industry. Traditional extraction methods are unsustainable, but microbial synthesis offers a promising solution to attain efficient production of natural-identical terpenoids. Overproduction of terpenoids in microbes requires careful balancing of the synthesis pathway constituents within the constraints of host cell metabolism. Advances in metabolic engineering have greatly facilitated overcoming the challenges of achieving high titers, rates, and yields (TRYs). This review summarizes recent developments in the molecular biology toolbox to achieve high TRYs for terpenoid biosynthesis, mainly in the two industrial platform microorganisms: Escherichia coli and Saccharomyces cerevisiae. The biosynthetic pathways, including alternative pathway designs, are briefly introduced, followed by recently developed methodologies used for pathway, genome, and strain optimization. Integrated applications of these tools are important to achieve high TRYs of terpenoid production and pave the way for translating laboratory research into successful commercial manufacturing.
Affiliation(s)
- Xixian Chen
- Biotransformation Innovation Platform, Agency for Science, Technology and Research (A*STAR), 61 Biopolis Drive, Singapore 138673
| | - Congqiang Zhang
- Biotransformation Innovation Platform, Agency for Science, Technology and Research (A*STAR), 61 Biopolis Drive, Singapore 138673
| | - Nicholas D Lindley
- Biotransformation Innovation Platform, Agency for Science, Technology and Research (A*STAR), 61 Biopolis Drive, Singapore 138673
- TBI, Université de Toulouse, CNRS, INRA, INSA, 31077 Toulouse, France
| |
|
97
|
Tobias IC, Abatti LE, Moorthy SD, Mullany S, Taylor T, Khader N, Filice MA, Mitchell JA. Transcriptional enhancers: from prediction to functional assessment on a genome-wide scale. Genome 2020; 64:426-448. [PMID: 32961076 DOI: 10.1139/gen-2020-0104] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 12/20/2022]
Abstract
Enhancers are cis-regulatory sequences located distally to target genes. These sequences consolidate developmental and environmental cues to coordinate gene expression in a tissue-specific manner. Enhancer function and tissue specificity depend on the expressed set of transcription factors, which recognize binding sites and recruit cofactors that regulate local chromatin organization and gene transcription. Unlike other genomic elements, enhancers are challenging to identify because they function independently of orientation, are often distant from their promoters, have poorly defined boundaries, and display no reading frame. In addition, there are no defined genetic or epigenetic features that are unambiguously associated with enhancer activity. Over recent years there have been developments in both empirical assays and computational methods for enhancer prediction. We review genome-wide tools, CRISPR advancements, and high-throughput screening approaches that have improved our ability to both observe and manipulate enhancers in vitro at the level of primary genetic sequences, chromatin states, and spatial interactions. We also highlight contemporary animal models and their importance to enhancer validation. Together, these experimental systems and techniques complement one another and broaden our understanding of enhancer function in development, evolution, and disease.
Affiliation(s)
- Ian C Tobias
- Department of Cell and Systems Biology, University of Toronto, Toronto, ON, M5S 3G5, Canada
| | - Luis E Abatti
- Department of Cell and Systems Biology, University of Toronto, Toronto, ON, M5S 3G5, Canada
| | - Sakthi D Moorthy
- Department of Cell and Systems Biology, University of Toronto, Toronto, ON, M5S 3G5, Canada
| | - Shanelle Mullany
- Department of Cell and Systems Biology, University of Toronto, Toronto, ON, M5S 3G5, Canada
| | - Tiegh Taylor
- Department of Cell and Systems Biology, University of Toronto, Toronto, ON, M5S 3G5, Canada
| | - Nawrah Khader
- Department of Cell and Systems Biology, University of Toronto, Toronto, ON, M5S 3G5, Canada
| | - Mario A Filice
- Department of Cell and Systems Biology, University of Toronto, Toronto, ON, M5S 3G5, Canada
| | - Jennifer A Mitchell
- Department of Cell and Systems Biology, University of Toronto, Toronto, ON, M5S 3G5, Canada
| |
|
98
|
Ireland WT, Beeler SM, Flores-Bautista E, McCarty NS, Röschinger T, Belliveau NM, Sweredoski MJ, Moradian A, Kinney JB, Phillips R. Deciphering the regulatory genome of Escherichia coli, one hundred promoters at a time. eLife 2020; 9:e55308. [PMID: 32955440 PMCID: PMC7567609 DOI: 10.7554/elife.55308] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 01/20/2020] [Accepted: 09/18/2020] [Indexed: 01/28/2023] Open
Abstract
Advances in DNA sequencing have revolutionized our ability to read genomes. However, even in the most well-studied of organisms, the bacterium Escherichia coli, for ≈65% of promoters we remain ignorant of their regulation. Until we crack this regulatory Rosetta Stone, efforts to read and write genomes will remain haphazard. We introduce a new method, Reg-Seq, that links massively parallel reporter assays with mass spectrometry to produce a base pair resolution dissection of more than a hundred E. coli promoters in 12 growth conditions. We demonstrate that the method recapitulates known regulatory information. Then, we examine regulatory architectures for more than 80 promoters which previously had no known regulatory information. In many cases, we also identify which transcription factors mediate their regulation. This method clears a path for highly multiplexed investigations of the regulatory genome of model organisms, with the potential of moving to an array of microbes of ecological and medical relevance.
Affiliation(s)
- William T Ireland
- Department of Physics, California Institute of Technology, Pasadena, United States
| | - Suzannah M Beeler
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, United States
| | - Emanuel Flores-Bautista
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, United States
| | - Nicholas S McCarty
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, United States
| | - Tom Röschinger
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, United States
| | - Nathan M Belliveau
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, United States
| | - Michael J Sweredoski
- Proteome Exploration Laboratory, Division of Biology and Biological Engineering, Beckman Institute, California Institute of Technology, Pasadena, United States
| | - Annie Moradian
- Proteome Exploration Laboratory, Division of Biology and Biological Engineering, Beckman Institute, California Institute of Technology, Pasadena, United States
| | - Justin B Kinney
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, United States
| | - Rob Phillips
- Department of Physics, California Institute of Technology, Pasadena, United States
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, United States
| |
|
99
|
Identification of the human DPR core promoter element using machine learning. Nature 2020; 585:459-463. [PMID: 32908305 PMCID: PMC7501168 DOI: 10.1038/s41586-020-2689-7] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Received: 11/27/2019] [Accepted: 06/16/2020] [Indexed: 01/31/2023]
Abstract
The RNA polymerase II (Pol II) core promoter is the strategic site of convergence of the signals that lead to transcription initiation [1-5], but the downstream core promoter in humans has been difficult to decipher [1-3]. Here, we analyze the human Pol II core promoter and use machine learning to generate predictive models for the downstream core promoter region (DPR) and the TATA box. We developed a method termed HARPE (high-throughput analysis of randomized promoter elements) to create hundreds of thousands of DPR (or TATA box) variants that are each of known transcriptional strength. We then analyzed the HARPE data by support vector regression (SVR) to provide comprehensive models for the sequence motifs, and found that the SVR-based approach is more effective than a consensus-based method for predicting transcriptional activity. These studies revealed that the DPR is a functionally important core promoter element that is widely used in human promoters. Importantly, there appears to be a duality between the DPR and TATA box, as many promoters contain one or the other element. More broadly, these findings show that functional DNA motifs can be identified by machine learning analysis of a comprehensive set of sequence variants.
|
100
|
DiMSum: an error model and pipeline for analyzing deep mutational scanning data and diagnosing common experimental pathologies. Genome Biol 2020; 21:207. [PMID: 32799905 PMCID: PMC7429474 DOI: 10.1186/s13059-020-02091-3] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Received: 12/04/2019] [Accepted: 07/05/2020] [Indexed: 12/30/2022] Open
Abstract
Deep mutational scanning (DMS) enables multiplexed measurement of the effects of thousands of variants of proteins, RNAs, and regulatory elements. Here, we present a customizable pipeline, DiMSum, that represents an end-to-end solution for obtaining variant fitness and error estimates from raw sequencing data. A key innovation of DiMSum is the use of an interpretable error model that captures the main sources of variability arising in DMS workflows, outperforming previous methods. DiMSum is available as an R/Bioconda package and provides summary reports to help researchers diagnose common DMS pathologies and take remedial steps in their analyses.
|