1
Schaepe JM, Fries T, Doughty BR, Crocker OJ, Hinks MM, Marklund E, Greenleaf WJ. Thermodynamic principles link in vitro transcription factor affinities to single-molecule chromatin states in cells. bioRxiv 2025:2025.01.27.635162. [PMID: 39975040 PMCID: PMC11838358 DOI: 10.1101/2025.01.27.635162] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/21/2025]
Abstract
The molecular details governing transcription factor (TF) binding and the formation of accessible chromatin are not yet quantitatively understood - including how sequence context modulates affinity, how TFs search DNA, the kinetics of TF occupancy, and how motif grammars coordinate binding. To resolve these questions for a human TF, erythroid Krüppel-like factor (eKLF/KLF1), we quantitatively compare, in high throughput, in vitro TF binding rates and affinities with in vivo single molecule TF and nucleosome occupancies across engineered DNA sequences. We find that 40-fold flanking sequence effects on affinity are consistent with distal flanks tuning TF search parameters and captured by a linear energy model. Motif recognition probability, rather than time in the bound state, drives affinity changes, and in vitro and in nuclei measurements exhibit consistent, minutes-long TF residence times. Finally, pairing in vitro biophysical parameters with thermodynamic models accurately predicts in vivo single-molecule chromatin states for unseen motif grammars.
Affiliation(s)
- Julia M Schaepe
  - Bioengineering Department, Stanford University, Stanford, CA 94305, USA
- Torbjörn Fries
  - Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden
- Olivia J Crocker
  - Genetics Department, Stanford University, Stanford, CA 94305, USA
- Michaela M Hinks
  - Bioengineering Department, Stanford University, Stanford, CA 94305, USA
- Emil Marklund
  - Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden
- William J Greenleaf
  - Genetics Department, Stanford University, Stanford, CA 94305, USA
  - Department of Applied Physics, Stanford University, Stanford, CA 94305, USA
2
Aguirre Rivera J, Mao G, Sabantsev A, Panfilov M, Hou Q, Lindell M, Chanez C, Ritort F, Jinek M, Deindl S. Massively parallel analysis of single-molecule dynamics on next-generation sequencing chips. Science 2024; 385:892-898. [PMID: 39172826 DOI: 10.1126/science.adn5371] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2023] [Accepted: 06/12/2024] [Indexed: 08/24/2024]
Abstract
Single-molecule techniques are ideally poised to characterize complex dynamics but are typically limited to investigating a small number of different samples. However, a large sequence or chemical space often needs to be explored to derive a comprehensive understanding of complex biological processes. Here we describe multiplexed single-molecule characterization at the library scale (MUSCLE), a method that combines single-molecule fluorescence microscopy with next-generation sequencing to enable highly multiplexed observations of complex dynamics. We comprehensively profiled the sequence dependence of DNA hairpin properties and Cas9-induced target DNA unwinding-rewinding dynamics. The ability to explore a large sequence space for Cas9 allowed us to identify a number of target sequences with unexpected behaviors. We envision that MUSCLE will enable the mechanistic exploration of many fundamental biological processes.
Affiliation(s)
- J Aguirre Rivera
  - Department of Cell and Molecular Biology, Science for Life Laboratory, Uppsala University, 75105 Uppsala, Sweden
- G Mao
  - Department of Cell and Molecular Biology, Science for Life Laboratory, Uppsala University, 75105 Uppsala, Sweden
- A Sabantsev
  - Department of Cell and Molecular Biology, Science for Life Laboratory, Uppsala University, 75105 Uppsala, Sweden
- M Panfilov
  - Department of Cell and Molecular Biology, Science for Life Laboratory, Uppsala University, 75105 Uppsala, Sweden
- Q Hou
  - Department of Cell and Molecular Biology, Science for Life Laboratory, Uppsala University, 75105 Uppsala, Sweden
- M Lindell
  - Department of Medical Sciences, Science for Life Laboratory, Uppsala University, 75144 Uppsala, Sweden
- C Chanez
  - Department of Biochemistry, University of Zürich, 8057 Zürich, Switzerland
- F Ritort
  - Small Biosystems Lab, Condensed Matter Physics Department, Universitat de Barcelona, 08028 Barcelona, Spain
  - Institut de Nanociència i Nanotecnologia (IN2UB), Universitat de Barcelona, 08028 Barcelona, Spain
- M Jinek
  - Department of Biochemistry, University of Zürich, 8057 Zürich, Switzerland
- S Deindl
  - Department of Cell and Molecular Biology, Science for Life Laboratory, Uppsala University, 75105 Uppsala, Sweden
3
Severins I, Bastiaanssen C, Kim SH, Simons RB, van Noort J, Joo C. Single-molecule structural and kinetic studies across sequence space. Science 2024; 385:898-904. [PMID: 39172834 DOI: 10.1126/science.adn5968] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2023] [Accepted: 07/01/2024] [Indexed: 08/24/2024]
Abstract
At the core of molecular biology lies the intricate interplay between sequence, structure, and function. Single-molecule techniques provide in-depth dynamic insights into structure and function, but laborious assays impede functional screening of large sequence libraries. We introduce high-throughput Single-molecule Parallel Analysis for Rapid eXploration of Sequence space (SPARXS), integrating single-molecule fluorescence with next-generation sequencing. We applied SPARXS to study the sequence-dependent kinetics of the Holliday junction, a critical intermediate in homologous recombination. By examining the dynamics of millions of Holliday junctions, covering thousands of distinct sequences, we demonstrated the ability of SPARXS to uncover sequence patterns, evaluate sequence motifs, and construct thermodynamic models. SPARXS emerges as a versatile tool for untangling the mechanisms that underlie sequence-specific processes at the molecular scale.
Affiliation(s)
- Ivo Severins
  - Department of BioNanoScience, Kavli Institute of Nanoscience, Delft University of Technology, Van der Maasweg 9, 2629 HZ Delft, the Netherlands
  - Biological and Soft Matter Physics, Huygens-Kamerlingh Onnes Laboratory, Leiden University, Niels Bohrweg 2, 2333 CA Leiden, Netherlands
- Carolien Bastiaanssen
  - Department of BioNanoScience, Kavli Institute of Nanoscience, Delft University of Technology, Van der Maasweg 9, 2629 HZ Delft, the Netherlands
- Sung Hyun Kim
  - Department of BioNanoScience, Kavli Institute of Nanoscience, Delft University of Technology, Van der Maasweg 9, 2629 HZ Delft, the Netherlands
  - Department of Physics, Ewha Womans University, Seoul 03760, Republic of Korea
- Roy B Simons
  - Department of BioNanoScience, Kavli Institute of Nanoscience, Delft University of Technology, Van der Maasweg 9, 2629 HZ Delft, the Netherlands
- John van Noort
  - Biological and Soft Matter Physics, Huygens-Kamerlingh Onnes Laboratory, Leiden University, Niels Bohrweg 2, 2333 CA Leiden, Netherlands
- Chirlmin Joo
  - Department of BioNanoScience, Kavli Institute of Nanoscience, Delft University of Technology, Van der Maasweg 9, 2629 HZ Delft, the Netherlands
  - Department of Physics, Ewha Womans University, Seoul 03760, Republic of Korea
4
Kuo YA, Chen YI, Wang Y, Korkmaz Z, Yonas S, He Y, Nguyen TD, Hong S, Nguyen AT, Kim S, Seifi S, Fan PH, Wu Y, Yang Z, Liu HW, Lu Y, Ren P, Yeh HC. Fluorogenic Aptamer Optimizations on a Massively Parallel Sequencing Platform. bioRxiv 2024:2024.07.07.602435. [PMID: 39026723 PMCID: PMC11257435 DOI: 10.1101/2024.07.07.602435] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/20/2024]
Abstract
Fluorogenic aptamers (FAPs) have become an increasingly important tool in cellular sensing and pathogen diagnostics. However, fine-tuning FAPs for enhanced performance remains challenging even with the structural details provided by X-ray crystallography. Here we present a novel approach to optimize a DNA-based FAP (D-FAP), Lettuce, on repurposed Illumina next-generation sequencing (NGS) chips. When substituting its cognate chromophore, DFHBI-1T, with TO1-biotin, Lettuce not only shows a red-shifted emission peak by 53 nm (from 505 to 558 nm), but also a 4-fold bulk fluorescence enhancement. After screening 8,821 Lettuce variants complexed with TO1-biotin, the C14T mutation is found to exhibit an improved apparent dissociation constant ( vs. 0.82 µM), an increased quantum yield (QY: 0.62 vs. 0.59) and an elongated fluorescence lifetime (τ: 6.00 vs. 5.77 ns), giving 45% more ensemble fluorescence than the canonical Lettuce/TO1-biotin complex. Molecular dynamics simulations further indicate that the π-π stacking interaction is key to determining the coordination structure of TO1-biotin in Lettuce. Our screening-and-simulation pipeline can effectively optimize FAPs without any prior structural knowledge of the canonical FAP/chromophore complexes, providing not only improved molecular probes for fluorescence sensing but also insights into aptamer-chromophore interactions.
5
Yang Y, Chaffin TA, Shao Y, Balasubramanian VK, Markillie M, Mitchell H, Rubio‐Wilhelmi MM, Ahkami AH, Blumwald E, Neal Stewart C. Novel synthetic inducible promoters controlling gene expression during water-deficit stress with green tissue specificity in transgenic poplar. Plant Biotechnology Journal 2024; 22:1596-1609. [PMID: 38232002 PMCID: PMC11123411 DOI: 10.1111/pbi.14289] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2023] [Revised: 11/16/2023] [Accepted: 01/03/2024] [Indexed: 01/19/2024]
Abstract
Synthetic promoters may be designed using short cis-regulatory elements (CREs) and core promoter sequences for specific purposes. We identified novel conserved DNA motifs from the promoter sequences of leaf palisade and vascular cell type-specific expressed genes in water-deficit stressed poplar (Populus tremula × Populus alba), collected through low-input RNA-seq analysis using laser capture microdissection. Hexamerized sequences of four conserved 20-base motifs were inserted into each synthetic promoter construct. Two of these synthetic promoters (Syn2 and Syn3) induced GFP in transformed poplar mesophyll protoplasts incubated in 0.5 M mannitol solution. To identify effect of length and sequence from a valuable 20 base motif, 5' and 3' regions from a basic sequence (GTTAACTTCAGGGCCTGTGG) of Syn3 were hexamerized to generate two shorter synthetic promoters, Syn3-10b-1 (5': GTTAACTTCA) and Syn3-10b-2 (3': GGGCCTGTGG). These promoters' activities were compared with Syn3 in plants. Syn3 and Syn3-10b-1 were specifically induced in transient agroinfiltrated Nicotiana benthamiana leaves in water cessation for 3 days. In stable transgenic poplar, Syn3 presented as a constitutive promoter but had the highest activity in leaves. Syn3-10b-1 had stronger induction in green tissues under water-deficit stress conditions than mock control. Therefore, a synthetic promoter containing the 5' sequence of Syn3 endowed both tissue-specificity and water-deficit inducibility in transgenic poplar, whereas the 3' sequence did not. Consequently, we have added two new synthetic promoters to the poplar engineering toolkit: Syn3-10b-1, a green tissue-specific and water-deficit stress-induced promoter, and Syn3, a green tissue-preferential constitutive promoter.
Affiliation(s)
- Yongil Yang
  - Center for Agricultural Synthetic Biology, University of Tennessee Institute of Agriculture, Knoxville, Tennessee, USA
- Timothy A. Chaffin
  - Center for Agricultural Synthetic Biology, University of Tennessee Institute of Agriculture, Knoxville, Tennessee, USA
- Yuanhua Shao
  - Center for Agricultural Synthetic Biology, University of Tennessee Institute of Agriculture, Knoxville, Tennessee, USA
  - Department of Plant Sciences, University of Tennessee, Knoxville, Tennessee, USA
- Meng Markillie
  - Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, WA, USA
- Hugh Mitchell
  - Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, WA, USA
- Amir H. Ahkami
  - Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, WA, USA
- Eduardo Blumwald
  - Department of Plant Sciences, University of California, Davis, California, USA
- C. Neal Stewart
  - Center for Agricultural Synthetic Biology, University of Tennessee Institute of Agriculture, Knoxville, Tennessee, USA
  - Department of Plant Sciences, University of Tennessee, Knoxville, Tennessee, USA
6
Porebski BT, Balmforth M, Browne G, Riley A, Jamali K, Fürst MJLJ, Velic M, Buchanan A, Minter R, Vaughan T, Holliger P. Rapid discovery of high-affinity antibodies via massively parallel sequencing, ribosome display and affinity screening. Nat Biomed Eng 2024; 8:214-232. [PMID: 37814006 PMCID: PMC10963267 DOI: 10.1038/s41551-023-01093-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/05/2022] [Accepted: 08/23/2023] [Indexed: 10/11/2023]
Abstract
Developing therapeutic antibodies is laborious and costly. Here we report a method for antibody discovery that leverages the Illumina HiSeq platform to, within 3 days, screen in the order of 10⁸ antibody-antigen interactions. The method, which we named 'deep screening', involves the clustering and sequencing of antibody libraries, the conversion of the DNA clusters into complementary RNA clusters covalently linked to the instrument's flow-cell surface on the same location, the in situ translation of the clusters into antibodies tethered via ribosome display, and their screening via fluorescently labelled antigens. By using deep screening, we discovered low-nanomolar nanobodies to a model antigen using 4 × 10⁶ unique variants from yeast-display-enriched libraries, and high-picomolar single-chain antibody fragment leads for human interleukin-7 directly from unselected synthetic repertoires. We also leveraged deep screening of a library of 2.4 × 10⁵ sequences of the third complementarity-determining region of the heavy chain of an anti-human epidermal growth factor receptor 2 (HER2) antibody as input for a large language model that generated new single-chain antibody fragment sequences with higher affinity for HER2 than those in the original library.
Affiliation(s)
- Aidan Riley
  - Biologics Engineering, AstraZeneca, Cambridge, UK
- Maximillian J L J Fürst
  - MRC Laboratory of Molecular Biology, Cambridge, UK
  - Groningen Biomolecular Sciences and Biotechnology Institute, University of Groningen, Groningen, the Netherlands
- Ralph Minter
  - Biologics Engineering, AstraZeneca, Cambridge, UK
  - Alchemab Therapeutics, London, UK
7
Luthra I, Jensen C, Chen XE, Salaudeen AL, Rafi AM, de Boer CG. Regulatory activity is the default DNA state in eukaryotes. Nat Struct Mol Biol 2024; 31:559-567. [PMID: 38448573 DOI: 10.1038/s41594-024-01235-4] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 11/29/2023] [Accepted: 01/29/2024] [Indexed: 03/08/2024]
Abstract
Genomes encode for genes and non-coding DNA, both capable of transcriptional activity. However, unlike canonical genes, many transcripts from non-coding DNA have limited evidence of conservation or function. Here, to determine how much biological noise is expected from non-genic sequences, we quantify the regulatory activity of evolutionarily naive DNA using RNA-seq in yeast and computational predictions in humans. In yeast, more than 99% of naive DNA bases were transcribed. Unlike the evolved transcriptome, naive transcripts frequently overlapped with opposite sense transcripts, suggesting selection favored coherent gene structures in the yeast genome. In humans, regulation-associated chromatin activity is predicted to be common in naive dinucleotide-content-matched randomized DNA. Here, naive and evolved DNA have similar co-occurrence and cell-type specificity of chromatin marks, challenging these as indicators of selection. However, in both yeast and humans, extreme high activities were rare in naive DNA, suggesting they result from selection. Overall, basal regulatory activity seems to be the default, which selection can hone to evolve a function or, if detrimental, repress.
Affiliation(s)
- Ishika Luthra
  - School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia, Canada
- Cassandra Jensen
  - School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia, Canada
- Xinyi E Chen
  - School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia, Canada
- Asfar Lathif Salaudeen
  - School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia, Canada
- Abdul Muntakim Rafi
  - School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia, Canada
- Carl G de Boer
  - School of Biomedical Engineering, University of British Columbia, Vancouver, British Columbia, Canada
8
Schiopu I, Dragomir I, Asandei A. Single molecule technique unveils the role of electrostatic interactions in ssDNA-gp32 molecular complex stability. RSC Adv 2024; 14:5449-5460. [PMID: 38352678 PMCID: PMC10862658 DOI: 10.1039/d3ra07746b] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2023] [Accepted: 02/07/2024] [Indexed: 02/16/2024]
Abstract
The exploration of single-strand DNA-binding protein (SSB)-ssDNA interactions and their crucial roles in essential biological processes has lagged behind other types of protein-nucleic acid interactions, such as protein-dsDNA and protein-RNA interactions. The ssDNA binding protein gene product 32 (gp32) of the T4 bacteriophage is a central integrating component of the replication complex that must continuously bind to and unbind from transiently exposed template strands during DNA synthesis. To gain deeper insights into the electrostatic conditions influencing the stability of the ssDNA-gp32 molecular complex, like the salt concentration or some metal ions proven to specifically bind to gp32, we employed a method that performs rapid measurements of the DNA-protein stability using an α-Hemolysin (α-HL) protein nanopore. We indirectly probed the stability of a protein-nucleic acid complex by monitoring the dissociation process between the gp32 protein and the ssDNA molecular complex in single-molecule electrophysiology experiments, but also through fluorescence spectroscopy techniques. We have shown that the complex is more stable in 0.5 M KCl solution than in 2 M KCl solution and that the presence of Zn²⁺ ions further increases this stability for any salt used in the present study. This method can be applied to other nucleic acid-protein molecular complexes, as well as for an accurate determination of the drug-protein carrier stability.
Affiliation(s)
- Irina Schiopu
  - The Institute of Interdisciplinary Research, Department of Exact Sciences and Natural Sciences, "Alexandru Ioan Cuza" University of Iaşi, 700506 Iasi, Romania
- Isabela Dragomir
  - The Institute of Interdisciplinary Research, Department of Exact Sciences and Natural Sciences, "Alexandru Ioan Cuza" University of Iaşi, 700506 Iasi, Romania
- Alina Asandei
  - The Institute of Interdisciplinary Research, Department of Exact Sciences and Natural Sciences, "Alexandru Ioan Cuza" University of Iaşi, 700506 Iasi, Romania
9
Yasmeen E, Wang J, Riaz M, Zhang L, Zuo K. Designing artificial synthetic promoters for accurate, smart, and versatile gene expression in plants. Plant Communications 2023:100558. [PMID: 36760129 PMCID: PMC10363483 DOI: 10.1016/j.xplc.2023.100558] [Citation(s) in RCA: 32] [Impact Index Per Article: 16.0] [Received: 09/20/2022] [Revised: 01/30/2023] [Accepted: 02/06/2023] [Indexed: 06/18/2023]
Abstract
With the development of high-throughput biology techniques and artificial intelligence, it has become increasingly feasible to design and construct artificial biological parts, modules, circuits, and even whole systems. To overcome the limitations of native promoters in controlling gene expression, artificial promoter design aims to synthesize short, inducible, and conditionally controlled promoters to coordinate the expression of multiple genes in diverse plant metabolic and signaling pathways. Synthetic promoters are versatile and can drive gene expression accurately with smart responses; they show potential for enhancing desirable traits in crops, thereby improving crop yield, nutritional quality, and food security. This review first illustrates the importance of synthetic promoters, then introduces promoter architecture and thoroughly summarizes advances in synthetic promoter construction. Restrictions to the development of synthetic promoters and future applications of such promoters in synthetic plant biology and crop improvement are also discussed.
Affiliation(s)
- Erum Yasmeen
  - Single Cell Research Center, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai 200240, China
- Jin Wang
  - Biotechnology Research Institute, Chinese Academy of Agricultural Sciences, Beijing, China
- Muhammad Riaz
  - Single Cell Research Center, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai 200240, China
- Lida Zhang
  - Single Cell Research Center, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai 200240, China
- Kaijing Zuo
  - Single Cell Research Center, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai 200240, China
10
Wu D, Feagin T, Mage P, Rangel A, Wan L, Kong D, Li A, Coller J, Eisenstein M, Soh H. Flow-Cell-Based Technology for Massively Parallel Characterization of Base-Modified DNA Aptamers. Anal Chem 2023; 95:2645-2652. [PMID: 36693249 DOI: 10.1021/acs.analchem.1c04777] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 01/25/2023]
Abstract
Aptamers incorporating chemically modified bases can achieve superior affinity and specificity compared to natural aptamers, but their characterization remains a labor-intensive, low-throughput task. Here, we describe the "non-natural aptamer array" (N2A2) system, in which a minimally modified Illumina MiSeq instrument is used for the high-throughput generation and characterization of large libraries of base-modified DNA aptamer candidates based on both target binding and specificity. We first demonstrate the capability to screen multiple different base modifications to identify the optimal chemistry for high-affinity target binding. We next use N2A2 to generate aptamers that can maintain excellent specificity even in complex samples, with equally strong target affinity in both buffer and diluted human serum. For both aptamers, affinity was formally calculated with gold-standard binding assays. Given that N2A2 requires only minor mechanical modifications to the MiSeq, we believe that N2A2 offers a broadly accessible tool for generating high-quality affinity reagents for diverse applications.
Affiliation(s)
- Diana Wu
  - Department of Bioengineering, Stanford University, Stanford, California 94305, United States
- Trevor Feagin
  - Department of Radiology, Stanford University, Stanford, California 94305, United States
- Peter Mage
  - Department of Electrical Engineering, Stanford University, Stanford, California 94305, United States
- Alexandra Rangel
  - Department of Radiology, Stanford University, Stanford, California 94305, United States
- Leighton Wan
  - Department of Bioengineering, Stanford University, Stanford, California 94305, United States
- Dehui Kong
  - Department of Radiology, Stanford University, Stanford, California 94305, United States
- Anping Li
  - Department of Radiology, Stanford University, Stanford, California 94305, United States
- John Coller
  - Stanford Functional Genomics Facility, School of Medicine, Stanford University, Stanford, California 94305, United States
- Michael Eisenstein
  - Department of Radiology, Stanford University, Stanford, California 94305, United States
  - Department of Electrical Engineering, Stanford University, Stanford, California 94305, United States
- Hyongsok Soh
  - Department of Radiology, Stanford University, Stanford, California 94305, United States
  - Department of Electrical Engineering, Stanford University, Stanford, California 94305, United States
11
Physicochemical models of protein-DNA binding with standard and modified base pairs. Proc Natl Acad Sci U S A 2023; 120:e2205796120. [PMID: 36656856 PMCID: PMC9942898 DOI: 10.1073/pnas.2205796120] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Indexed: 01/20/2023]
Abstract
DNA-binding proteins play important roles in various cellular processes, but the mechanisms by which proteins recognize genomic target sites remain incompletely understood. Functional groups at the edges of the base pairs (bp) exposed in the DNA grooves represent physicochemical signatures. As these signatures enable proteins to form specific contacts between protein residues and bp, their study can provide mechanistic insights into protein-DNA binding. Existing experimental methods, such as X-ray crystallography, can reveal such mechanisms based on physicochemical interactions between proteins and their DNA target sites. However, the low throughput of structural biology methods limits mechanistic insights for selection of many genomic sites. High-throughput binding assays enable prediction of potential target sites by determining relative binding affinities of a protein to massive numbers of DNA sequences. Many currently available computational methods are based on the sequence of standard Watson-Crick bp. They assume that the contribution of overall binding affinity is independent for each base pair, or alternatively include dinucleotides or short k-mers. These methods cannot directly expand to physicochemical contacts, and they are not suitable to apply to DNA modifications or non-Watson-Crick bp. These variations include DNA methylation, and synthetic or mismatched bp. The proposed method, DeepRec, can predict relative binding affinities as function of physicochemical signatures and the effect of DNA methylation or other chemical modifications on binding. Sequence-based modeling methods are in comparison a coarse-grain description and cannot achieve such insights. Our chemistry-based modeling framework provides a path towards understanding genome function at a mechanistic level.
12
Marklund E, Ke Y, Greenleaf WJ. High-throughput biochemistry in RNA sequence space: predicting structure and function. Nat Rev Genet 2023; 24:401-414. [PMID: 36635406 DOI: 10.1038/s41576-022-00567-5] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Accepted: 12/08/2022] [Indexed: 01/14/2023]
Abstract
RNAs are central to fundamental biological processes in all known organisms. The set of possible intramolecular interactions of RNA nucleotides defines the range of alternative structural conformations of a specific RNA that can coexist, and these structures enable functional catalytic properties of RNAs and/or their productive intermolecular interactions with other RNAs or proteins. However, the immense combinatorial space of potential RNA sequences has precluded predictive mapping between RNA sequence and molecular structure and function. Recent advances in high-throughput approaches in vitro have enabled quantitative thermodynamic and kinetic measurements of RNA-RNA and RNA-protein interactions, across hundreds of thousands of sequence variations. In this Review, we explore these techniques, how they can be used to understand RNA function and how they might form the foundations of an accurate model to predict the structure and function of an RNA directly from its nucleotide sequence. The experimental techniques and modelling frameworks discussed here are also highly relevant for the sampling of sequence-structure-function space of DNAs and proteins.
Affiliation(s)
- Emil Marklund
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
- Yuxi Ke
  - Department of Bioengineering, Stanford University, Stanford, CA, USA
- William J Greenleaf
  - Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
13
Walton RT, Singh A, Blainey PC. Pooled genetic screens with image-based profiling. Mol Syst Biol 2022; 18:e10768. [PMID: 36366905 PMCID: PMC9650298 DOI: 10.15252/msb.202110768] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 10/22/2021] [Revised: 09/12/2022] [Accepted: 09/16/2022] [Indexed: 11/13/2022]
Abstract
Spatial structure in biology, spanning molecular, organellular, cellular, tissue, and organismal scales, is encoded through a combination of genetic and epigenetic factors in individual cells. Microscopy remains the most direct approach to exploring the intricate spatial complexity defining biological systems and the structured dynamic responses of these systems to perturbations. Genetic screens with deep single-cell profiling via image features or gene expression programs have the capacity to show how biological systems work in detail by cataloging many cellular phenotypes with one experimental assay. Microscopy-based cellular profiling provides information complementary to next-generation sequencing (NGS) profiling and has only recently become compatible with large-scale genetic screens. Optical screening now offers the scale needed for systematic characterization and is poised for further scale-up. We discuss how these methodologies, together with emerging technologies for genetic perturbation and microscopy-based multiplexed molecular phenotyping, are powering new approaches to reveal genotype-phenotype relationships.
Collapse
Affiliation(s)
- Russell T Walton
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biological Engineering, MIT, Cambridge, MA, USA
| | - Avtar Singh
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Present address: Department of Cellular and Tissue Genomics, Genentech, South San Francisco, CA, USA
| | - Paul C Blainey
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Biological Engineering, MIT, Cambridge, MA, USA
- Koch Institute for Integrative Cancer Research, MIT, Cambridge, MA, USA
| |
|
14
|
Kuo YA, Jung C, Chen YA, Kuo HC, Zhao OS, Nguyen TD, Rybarski JR, Hong S, Chen YI, Wylie DC, Hawkins JA, Walker JN, Shields SWJ, Brodbelt JS, Petty JT, Finkelstein IJ, Yeh HC. Massively Parallel Selection of NanoCluster Beacons. Advanced Materials (Deerfield Beach, Fla.) 2022; 34:e2204957. [PMID: 35945159 PMCID: PMC9588665 DOI: 10.1002/adma.202204957] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 06/01/2022] [Revised: 07/18/2022] [Indexed: 06/15/2023]
Abstract
NanoCluster Beacons (NCBs) are multicolor silver nanocluster probes whose fluorescence can be activated or tuned by a proximal DNA strand called the activator. While a single-nucleotide difference in a pair of activators can lead to drastically different activation outcomes, termed polar opposite twins (POTs), it is difficult to discover new POT-NCBs using the conventional low-throughput characterization approaches. Here, a high-throughput selection method is reported that takes advantage of repurposed next-generation-sequencing chips to screen the activation fluorescence of ≈40 000 activator sequences. It is found that the nucleobases at positions 7-12 of the 18-nucleotide-long activator are critical to creating bright NCBs and positions 4-6 and 2-4 are hotspots to generate yellow-orange and red POTs, respectively. Based on these findings, a "zipper-bag" model is proposed that can explain how these hotspots facilitate the formation of distinct silver cluster chromophores and alter their chemical yields. Combining high-throughput screening with machine-learning algorithms, a pipeline is established to design bright and multicolor NCBs in silico.
Affiliation(s)
- Yu-An Kuo
- Department of Biomedical Engineering, University of Texas at Austin, Austin, TX, 78712, USA
| | - Cheulhee Jung
- Department of Biotechnology, College of Life Sciences and Biotechnology, Korea University, Seoul, 02841, Korea
| | - Yu-An Chen
- Department of Biomedical Engineering, University of Texas at Austin, Austin, TX, 78712, USA
| | - Hung-Che Kuo
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, 78712, USA
- Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, TX, 78712, USA
| | - Oliver S Zhao
- Department of Biomedical Engineering, University of Texas at Austin, Austin, TX, 78712, USA
| | - Trung D Nguyen
- Department of Biomedical Engineering, University of Texas at Austin, Austin, TX, 78712, USA
| | - James R Rybarski
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, 78712, USA
- Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, TX, 78712, USA
| | - Soonwoo Hong
- Department of Biomedical Engineering, University of Texas at Austin, Austin, TX, 78712, USA
| | - Yuan-I Chen
- Department of Biomedical Engineering, University of Texas at Austin, Austin, TX, 78712, USA
| | - Dennis C Wylie
- Computational Biology and Bioinformatics, Center for Biomedical Research Support, University of Texas at Austin, Austin, TX, 78712, USA
| | - John A Hawkins
- European Molecular Biology Laboratory (EMBL), 69117, Heidelberg, Germany
| | - Jada N Walker
- Department of Chemistry, University of Texas at Austin, Austin, TX, 78712, USA
| | - Samuel W J Shields
- Department of Chemistry, University of Texas at Austin, Austin, TX, 78712, USA
| | - Jennifer S Brodbelt
- Department of Chemistry, University of Texas at Austin, Austin, TX, 78712, USA
| | - Jeffrey T Petty
- Department of Chemistry, Furman University, Greenville, SC, 29617, USA
| | - Ilya J Finkelstein
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, 78712, USA
- Center for Systems and Synthetic Biology, University of Texas at Austin, Austin, TX, 78712, USA
| | - Hsin-Chih Yeh
- Department of Biomedical Engineering, University of Texas at Austin, Austin, TX, 78712, USA
- Texas Materials Institute, University of Texas at Austin, Austin, TX, 78712, USA
| |
|
15
|
Barissi S, Sala A, Wieczór M, Battistini F, Orozco M. DNAffinity: a machine-learning approach to predict DNA binding affinities of transcription factors. Nucleic Acids Res 2022; 50:9105-9114. [PMID: 36018808 PMCID: PMC9458447 DOI: 10.1093/nar/gkac708] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 01/20/2022] [Revised: 07/21/2022] [Accepted: 08/08/2022] [Indexed: 12/24/2022]
Abstract
We present a physics-based machine learning approach to predict in vitro transcription factor binding affinities from structural and mechanical DNA properties directly derived from atomistic molecular dynamics simulations. The method is able to predict affinities obtained with techniques as different as uPBM, gcPBM and HT-SELEX with an excellent performance, much better than existing algorithms. Due to its nature, the method can be extended to epigenetic variants, mismatches, mutations, or any non-coding nucleobases. When complemented with chromatin structure information, our in vitro trained method provides also good estimates of in vivo binding sites in yeast.
Affiliation(s)
| | | | - Miłosz Wieczór
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac 10–12, 08028 Barcelona, Spain; Department of Physical Chemistry, Gdansk University of Technology, 80-233 Gdańsk, Poland
| | | | - Modesto Orozco
- Correspondence may also be addressed to Modesto Orozco. Tel: +34 934 037 156;
| |
|
16
|
Severins I, Joo C, van Noort J. Exploring molecular biology in sequence space: The road to next-generation single-molecule biophysics. Mol Cell 2022; 82:1788-1805. [PMID: 35561688 DOI: 10.1016/j.molcel.2022.04.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2022] [Revised: 04/11/2022] [Accepted: 04/19/2022] [Indexed: 10/18/2022]
Abstract
Next-generation sequencing techniques have led to a new quantitative dimension in the biological sciences. In particular, integrating sequencing techniques with biophysical tools allows sequence-dependent mechanistic studies. Using the millions of DNA clusters that are generated during sequencing to perform high-throughput binding affinity and kinetics measurements enabled the construction of energy landscapes in sequence space, uncovering relationships between sequence, structure, and function. Here, we review the approaches to perform ensemble fluorescence experiments on next-generation sequencing chips for variations of DNA, RNA, and protein sequences. As the next step, we anticipate that these fluorescence experiments will be pushed to the single-molecule level, which can directly uncover kinetics and molecular heterogeneity in an unprecedented high-throughput fashion. Molecular biophysics in sequence space, both at the ensemble and single-molecule level, leads to new mechanistic insights. The wide spectrum of applications in biology and medicine ranges from the fundamental understanding of evolutionary pathways to the development of new therapeutics.
Affiliation(s)
- Ivo Severins
- Department of BioNanoScience, Kavli Institute of Nanoscience, Delft University of Technology, Van der Maasweg 9, 2629 HZ Delft, the Netherlands; Biological and Soft Matter Physics, Huygens-Kamerlingh Onnes Laboratory, Leiden University, Niels Bohrweg 2, 2333 CA Leiden, the Netherlands
| | - Chirlmin Joo
- Department of BioNanoScience, Kavli Institute of Nanoscience, Delft University of Technology, Van der Maasweg 9, 2629 HZ Delft, the Netherlands.
| | - John van Noort
- Biological and Soft Matter Physics, Huygens-Kamerlingh Onnes Laboratory, Leiden University, Niels Bohrweg 2, 2333 CA Leiden, the Netherlands.
| |
|
17
|
Jouravleva K, Vega-Badillo J, Zamore PD. Principles and pitfalls of high-throughput analysis of microRNA-binding thermodynamics and kinetics by RNA Bind-n-Seq. Cell Reports Methods 2022; 2:100185. [PMID: 35475222 PMCID: PMC9017153 DOI: 10.1016/j.crmeth.2022.100185] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/27/2021] [Revised: 01/18/2022] [Accepted: 02/25/2022] [Indexed: 12/24/2022]
Abstract
RNA Bind-n-Seq (RBNS) is a cost-effective, high-throughput method capable of identifying the sequence preferences of RNA-binding proteins and of qualitatively defining relative dissociation constants. Although RBNS is often described as an unbiased method, several factors may influence the outcome of the analysis. Here, we discuss these biases and present an analytical strategy to estimate absolute binding affinities from RBNS data, extend RBNS to kinetic studies, and develop a framework to compute relative association and dissociation rate constants. As proof of principle, we measured the equilibrium binding properties of mammalian Argonaute2 (AGO2) guided by eight microRNAs (miRNAs) and kinetic parameters for let-7a. The miRNA-binding site repertoires, dissociation constants, and kinetic parameters calculated from RBNS data using our methods correlate well with values measured by traditional ensemble and single-molecule approaches. Our data provide additional quantitative measurements for Argonaute-bound miRNA binding that should facilitate development of quantitative targeting rules for individual miRNAs.
Affiliation(s)
- Karina Jouravleva
- RNA Therapeutics Institute, University of Massachusetts Medical School, 368 Plantation Street, Worcester, MA 01605, USA
| | - Joel Vega-Badillo
- RNA Therapeutics Institute, University of Massachusetts Medical School, 368 Plantation Street, Worcester, MA 01605, USA
| | - Phillip D. Zamore
- RNA Therapeutics Institute, University of Massachusetts Medical School, 368 Plantation Street, Worcester, MA 01605, USA
- Howard Hughes Medical Institute, University of Massachusetts Medical School, 368 Plantation Street, Worcester, MA 01605, USA
| |
|
18
|
Pandit K, Petrescu J, Cuevas M, Stephenson W, Smibert P, Phatnani H, Maniatis S. An open source toolkit for repurposing Illumina sequencing systems as versatile fluidics and imaging platforms. Sci Rep 2022; 12:5081. [PMID: 35332182 PMCID: PMC8948189 DOI: 10.1038/s41598-022-08740-w] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/27/2021] [Accepted: 03/11/2022] [Indexed: 12/04/2022]
Abstract
Fluorescence microscopy is a key method in the life sciences. State of the art -omics methods combine fluorescence microscopy with complex protocols to visualize tens to thousands of features in each of millions of pixels across samples. These -omics methods require precise control of temperature, reagent application, and image acquisition parameters during iterative chemistry and imaging cycles conducted over the course of days or weeks. Automated execution of such methods enables robust and reproducible data generation. However, few commercial solutions exist for temperature controlled, fluidics coupled fluorescence imaging, and implementation of bespoke instrumentation requires specialized engineering expertise. Here we present PySeq2500, an open source Python code base and flow cell design that converts the Illumina HiSeq 2500 instrument, comprising an epifluorescence microscope with integrated fluidics, into an open platform for programmable applications without need for specialized engineering or software development expertise. Customizable PySeq2500 protocols enable experimental designs involving simultaneous 4-channel image acquisition, temperature control, reagent exchange, stable positioning, and sample integrity over extended experiments. To demonstrate accessible automation of complex, multi-day workflows, we use the PySeq2500 system for unattended execution of iterative indirect immunofluorescence imaging (4i). Our automated 4i method uses off-the-shelf antibodies over multiple cycles of staining, imaging, and antibody elution to build highly multiplexed maps of cell types and pathological features in mouse and postmortem human spinal cord sections. Given the widespread availability of HiSeq 2500 platforms and the simplicity of the modifications required to repurpose these systems, PySeq2500 enables non-specialists to develop and implement state of the art fluidics coupled imaging methods in a widely available benchtop system.
Affiliation(s)
- Kunal Pandit
- Technology Innovation Lab, New York Genome Center, New York, NY, USA.
| | - Joana Petrescu
- Department of Neurology, Columbia University Irving Medical Center, New York, NY, USA
- Center for Motor Neuron Biology and Disease, Columbia University Irving Medical Center, New York, NY, USA
- Center for Genomics of Neurodegenerative Disease, New York Genome Center, New York, NY, USA
| | - Miguel Cuevas
- Department of Neurology, Columbia University Irving Medical Center, New York, NY, USA
- Center for Motor Neuron Biology and Disease, Columbia University Irving Medical Center, New York, NY, USA
| | | | - Peter Smibert
- Technology Innovation Lab, New York Genome Center, New York, NY, USA
| | - Hemali Phatnani
- Department of Neurology, Columbia University Irving Medical Center, New York, NY, USA.
- Center for Motor Neuron Biology and Disease, Columbia University Irving Medical Center, New York, NY, USA.
- Center for Genomics of Neurodegenerative Disease, New York Genome Center, New York, NY, USA.
| | - Silas Maniatis
- Technology Innovation Lab, New York Genome Center, New York, NY, USA.
| |
|
19
|
Wu D, Gordon CKL, Shin JH, Eisenstein M, Soh HT. Directed Evolution of Aptamer Discovery Technologies. Acc Chem Res 2022; 55:685-695. [PMID: 35130439 DOI: 10.1021/acs.accounts.1c00724] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Indexed: 12/21/2022]
Abstract
Although antibodies are a powerful tool for molecular biology and clinical diagnostics, there are many emerging applications for which nucleic acid-based aptamers can be advantageous. However, generating high-quality aptamers with sufficient affinity and specificity for biomedical applications is a challenging feat for most research laboratories. In this Account, we describe four techniques developed in our laboratory to accelerate the discovery of high-quality aptamer reagents that can achieve robust binding even for challenging molecular targets. The first method is particle display, in which we convert solution-phase aptamers into aptamer particles that can be screened via fluorescence-activated cell sorting (FACS) to quantitatively isolate individual aptamer particles based on their affinity. This enables the efficient isolation of high-affinity aptamers in fewer selection rounds than conventional methods, thereby minimizing selection biases and reducing the emergence of artifacts in the final aptamer pool. We subsequently developed the multiparametric particle display (MPPD) method, which employs two-color FACS to isolate aptamer particles based on both affinity and specificity, yielding aptamers that exhibit excellent target binding even in complex matrixes such as serum. The third method is an alkyne-azide chemistry ("click chemistry")-based particle display (click-PD) that enables the generation and screening of "non-natural" aptamers with a wide range of base modifications. We have shown that these base-modified aptamers can achieve robust affinity and specificity for targets that have proven challenging or inaccessible with natural nucleotide-based aptamer libraries. 
Finally, we describe the non-natural aptamer array (N2A2) platform in which a modified benchtop sequencing instrument is used to characterize base-modified aptamers in high throughput, enabling the efficient identification of molecules with excellent affinity and specificity for their targets. This system first generates aptamer clusters on the flow-cell surface that incorporate alkyne-modified nucleobases and then performs a click reaction to couple those nucleobases to an azide-modified chemical moiety. This yields a sequence-defined array of tens of millions of base-modified sequences, which can then be characterized for affinity and specificity in a high-throughput fashion. Collectively, we believe that these advancements are helping to make aptamer technology more accessible, efficient, and robust, thereby enabling the use of these affinity reagents for a wider range of molecular recognition and detection-based applications.
|
20
|
Abstract
The scale of genetic, epigenomic, transcriptomic, cheminformatic and proteomic data available today, coupled with easy-to-use machine learning (ML) toolkits, has propelled the application of supervised learning in genomics research. However, the assumptions behind the statistical models and performance evaluations in ML software frequently are not met in biological systems. In this Review, we illustrate the impact of several common pitfalls encountered when applying supervised ML in genomics. We explore how the structure of genomics data can bias performance evaluations and predictions. To address the challenges associated with applying cutting-edge ML methods to genomics, we describe solutions and appropriate use cases where ML modelling shows great potential.
|
21
|
Bacterial Transcriptional Regulators: A Road Map for Functional, Structural, and Biophysical Characterization. Int J Mol Sci 2022; 23:ijms23042179. [PMID: 35216300 PMCID: PMC8879271 DOI: 10.3390/ijms23042179] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 01/17/2022] [Revised: 02/11/2022] [Accepted: 02/11/2022] [Indexed: 12/12/2022]
Abstract
The different niches through which bacteria move during their life cycle require a fast response to the many environmental cues they encounter. The sensing of these stimuli and their correct response is driven primarily by transcriptional regulators. This kind of protein is involved in sensing a wide array of chemical species, a process that ultimately leads to the regulation of gene transcription. The allosteric-coupling mechanism of sensing and regulation is a central aspect of biological systems and has become an important field of research during the last decades. In this review, we summarize the state-of-the-art techniques applied to unravel these complex mechanisms. We introduce a roadmap that may serve for experimental design, depending on the answers we seek and the initial information we have about the system of study. We also provide information on databases containing available structural information on each family of transcriptional regulators. Finally, we discuss the recent results of research about the allosteric mechanisms of sensing and regulation involving many transcriptional regulators of interest, highlighting multipronged strategies and novel experimental techniques. The aim of the experiments discussed here was to provide a better understanding at a molecular level of how bacteria adapt to the different environmental threats they face.
|
22
|
Evans GW, Craggs T, Kapanidis AN. The Rate-limiting Step of DNA Synthesis by DNA Polymerase Occurs in the Fingers-closed Conformation. J Mol Biol 2022; 434:167410. [PMID: 34929202 PMCID: PMC8783057 DOI: 10.1016/j.jmb.2021.167410] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/27/2021] [Revised: 11/22/2021] [Accepted: 12/12/2021] [Indexed: 12/03/2022]
Abstract
DNA polymerases maintain genomic integrity by copying DNA with high fidelity, part of which relies on the polymerase fingers opening-closing transition, a series of conformational changes during the DNA synthesis reaction cycle. Fingers opening and closing has been challenging to study, mainly due to the need to synchronise molecular ensembles. We previously studied fingers opening-closing on single polymerase-DNA complexes using single-molecule FRET; however, our work was limited to pre-chemistry reaction steps. Here, we advance our analysis to extensible substrates, and observe DNA polymerase (Pol) conformational changes across the entire DNA polymerisation reaction in real-time, gaining direct access to an elusive post-chemistry step rate-limiting for DNA synthesis. Our results showed that Pol adopts the fingers-closed conformation during polymerisation, and that the post-chemistry rate-limiting step occurs in the fingers-closed conformation. We found that fingers-opening in the Pol-DNA binary complex in the absence of polymerisation is slow (∼5.3 s⁻¹), and comparable to the rate of fingers-opening after polymerisation (3.4 s⁻¹); this indicates that the fingers-opening step itself could be largely responsible for the slow post-chemistry step, with the residual rate potentially accounted for by pyrophosphate release. We also observed that DNA chain-termination of the 3' end of the primer substantially increases the rate of fingers-opening in the Pol-DNA binary complex (5.3 → 29 s⁻¹), demonstrating that the 3'-OH residue is important for the kinetics of fingers conformational changes. Our observations offer mechanistic insight, and tools for gaining such insight, for all nucleic acid polymerases.
Affiliation(s)
- Geraint W Evans
- Department of Physics and Biological Physics Research Group, Clarendon Laboratory, University of Oxford, Parks Road, Oxford OX1 3PU, United Kingdom. https://twitter.com/geraintwe
| | - Timothy Craggs
- Department of Physics and Biological Physics Research Group, Clarendon Laboratory, University of Oxford, Parks Road, Oxford OX1 3PU, United Kingdom; Sheffield Institute for Nucleic Acids, Department of Chemistry, University of Sheffield, Brook Hill, Sheffield S3 7HF, United Kingdom. https://twitter.com/Craggs_Lab
| | - Achillefs N Kapanidis
- Department of Physics and Biological Physics Research Group, Clarendon Laboratory, University of Oxford, Parks Road, Oxford OX1 3PU, United Kingdom.
| |
|
23
|
Drees A, Fischer M. High-Throughput Selection and Characterisation of Aptamers on Optical Next-Generation Sequencers. Int J Mol Sci 2021; 22:9202. [PMID: 34502110 PMCID: PMC8431662 DOI: 10.3390/ijms22179202] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 07/14/2021] [Revised: 08/19/2021] [Accepted: 08/20/2021] [Indexed: 02/07/2023]
Abstract
Aptamers feature a number of advantages, compared to antibodies. However, their application has been limited so far, mainly because of the complex selection process. 'High-throughput sequencing fluorescent ligand interaction profiling' (HiTS-FLIP) significantly increases the selection efficiency and is consequently a very powerful and versatile technology for the selection of high-performance aptamers. It is the first experiment to allow the direct and quantitative measurement of the affinity and specificity of millions of aptamers simultaneously by harnessing the potential of optical next-generation sequencing platforms to perform fluorescence-based binding assays on the clusters displayed on the flow cells and determining their sequence and position in regular high-throughput sequencing. Many variants of the experiment have been developed that allow automation and in situ conversion of DNA clusters into base-modified DNA, RNA, peptides, and even proteins. In addition, the information from mutational assays, performed with HiTS-FLIP, provides deep insights into the relationship between the sequence, structure, and function of aptamers. This enables a detailed understanding of the sequence-specific rules that determine affinity, and thus, supports the evolution of aptamers. Current variants of the HiTS-FLIP experiment and its application in the field of aptamer selection, characterisation, and optimisation are presented in this review.
Affiliation(s)
- Alissa Drees
- Hamburg School of Food Science, Institute of Food Chemistry, University of Hamburg, Grindelallee 117, 20146 Hamburg, Germany
| | - Markus Fischer
- Hamburg School of Food Science, Institute of Food Chemistry, University of Hamburg, Grindelallee 117, 20146 Hamburg, Germany
- Center for Hybrid Nanostructures (CHyN), Department of Physics, University of Hamburg, Luruper Chaussee 149, 22761 Hamburg, Germany
| |
|
24
|
Boyle EA, Becker WR, Bai HB, Chen JS, Doudna JA, Greenleaf WJ. Quantification of Cas9 binding and cleavage across diverse guide sequences maps landscapes of target engagement. Science Advances 2021; 7:eabe5496. [PMID: 33608277 PMCID: PMC7895440 DOI: 10.1126/sciadv.abe5496] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 08/30/2020] [Accepted: 01/07/2021] [Indexed: 06/12/2023]
Abstract
The RNA-guided nuclease Cas9 has unlocked powerful methods for perturbing both the genome through targeted DNA cleavage and the regulome through targeted DNA binding, but limited biochemical data have hampered efforts to quantitatively model sequence perturbation of target binding and cleavage across diverse guide sequences. We present scalable, sequencing-based platforms for high-throughput filter binding and cleavage and then perform 62,444 quantitative binding and cleavage assays on 35,047 on- and off-target DNA sequences across 90 Cas9 ribonucleoproteins (RNPs) loaded with distinct guide RNAs. We observe that binding and cleavage efficacy, as well as specificity, vary substantially across RNPs; canonically studied guides often have atypically high specificity; sequence context surrounding the target modulates Cas9 on-rate; and Cas9 RNPs may sequester targets in nonproductive states that contribute to "proofreading" capability. Lastly, we distill our findings into an interpretable biophysical model that predicts changes in binding and cleavage for diverse target sequence perturbations.
Affiliation(s)
- Evan A Boyle
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA.
| | - Winston R Becker
- Program in Biophysics, Stanford University, Stanford, CA 94305, USA
| | - Hua B Bai
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Janice S Chen
- Department of Molecular and Cell Biology, California Institute for Quantitative Biosciences (QB3), University of California, Howard Hughes Medical Institute, Department of Chemistry, and the Innovative Genomics Institute, University of California, Berkeley, Berkeley, CA 94720, USA
| | - Jennifer A Doudna
- Department of Molecular and Cell Biology, California Institute for Quantitative Biosciences (QB3), University of California, Howard Hughes Medical Institute, Department of Chemistry, and the Innovative Genomics Institute, University of California, Berkeley, Berkeley, CA 94720, USA
- MBIB Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94710, USA
- Gladstone Institutes, University of California, San Francisco, San Francisco, CA 94158, USA
| | - William J Greenleaf
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA.
- Department of Applied Physics, Stanford University, Stanford, CA 94305, USA
| |
|
25
|
Aditham AK, Markin CJ, Mokhtari DA, DelRosso N, Fordyce PM. High-Throughput Affinity Measurements of Transcription Factor and DNA Mutations Reveal Affinity and Specificity Determinants. Cell Syst 2020; 12:112-127.e11. [PMID: 33340452 DOI: 10.1016/j.cels.2020.11.012] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 06/24/2020] [Revised: 08/08/2020] [Accepted: 11/24/2020] [Indexed: 01/28/2023]
Abstract
Transcription factors (TFs) bind regulatory DNA to control gene expression, and mutations to either TFs or DNA can alter binding affinities to rewire regulatory networks and drive phenotypic variation. While studies have profiled energetic effects of DNA mutations extensively, we lack similar information for TF variants. Here, we present STAMMP (simultaneous transcription factor affinity measurements via microfluidic protein arrays), a high-throughput microfluidic platform enabling quantitative characterization of hundreds of TF variants simultaneously. Measured affinities for ∼210 mutants of a model yeast TF (Pho4) interacting with 9 oligonucleotides (>1,800 Kds) reveal that many combinations of mutations to poorly conserved TF residues and nucleotides flanking the core binding site alter but preserve physiological binding, providing a mechanism by which combinations of mutations in cis and trans could modulate TF binding to tune occupancies during evolution. Moreover, biochemical double-mutant cycles across the TF-DNA interface reveal molecular mechanisms driving recognition, linking sequence to function. A record of this paper's Transparent Peer Review process is included in the Supplemental Information.
Affiliation(s)
- Arjun K Aditham
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA; Stanford ChEM-H, Stanford University, Stanford, CA 94305, USA
| | - Craig J Markin
- Department of Biochemistry, Stanford University, Stanford, CA 94305, USA
| | - Daniel A Mokhtari
- Department of Biochemistry, Stanford University, Stanford, CA 94305, USA
| | - Nicole DelRosso
- Graduate Program in Biophysics, Stanford University, Stanford, CA 94305, USA
| | - Polly M Fordyce
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA; Stanford ChEM-H, Stanford University, Stanford, CA 94305, USA; Department of Genetics, Stanford University, Stanford, CA 94305, USA; Chan Zuckerberg Biohub, San Francisco, CA 94110, USA.
| |
|
26
|
Development of a sequencing system for spatial decoding of DNA barcode molecules at single-molecule resolution. Commun Biol 2020; 3:788. [PMID: 33339962 PMCID: PMC7749132 DOI: 10.1038/s42003-020-01499-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 05/11/2020] [Accepted: 11/17/2020] [Indexed: 11/19/2022]
Abstract
Single-cell transcriptome analysis has been revolutionized by DNA barcodes that index cDNA libraries, allowing highly multiplexed analyses to be performed. Furthermore, DNA barcodes are being leveraged for spatial transcriptomes. Although spatial resolution relies on methods used to decode DNA barcodes, achieving single-molecule decoding remains a challenge. Here, we developed an in-house sequencing system inspired by a single-molecule sequencing system, HeliScope, to spatially decode DNA barcode molecules at single-molecule resolution. We benchmarked our system with 30 types of DNA barcode molecules and obtained an average read length of ~20 nt with an error rate of less than 5% per nucleotide, which was sufficient to spatially identify them. Additionally, we spatially identified DNA barcode molecules bound to antibodies at single-molecule resolution. Leveraging this, we devised a method, termed “molecular foot printing”, showing potential for applying our system not only to spatial transcriptomics, but also to spatial proteomics. Oguchi et al. developed an in-house sequencing system to spatially decode DNA barcode molecules at single-molecule resolution. They obtain an average read length of 20 nucleotides with an error rate of less than 5% per nucleotide. Leveraging this system, they devised a molecular foot printing method that can be applied spatial proteomics as well as spatial transcriptomics.
27
Schnepf M, von Reutern M, Ludwig C, Jung C, Gaul U. Transcription Factor Binding Affinities and DNA Shape Readout. iScience 2020; 23:101694. [PMID: 33163946 PMCID: PMC7607496 DOI: 10.1016/j.isci.2020.101694] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 05/29/2020] [Revised: 09/30/2020] [Accepted: 10/13/2020] [Indexed: 12/16/2022]
Abstract
An essential event in gene regulation is the binding of a transcription factor (TF) to its target DNA. Models considering the interactions between the TF and the DNA geometry proved to be successful approaches to describe this binding event, while conserving data interpretability. However, a direct characterization of the DNA shape contribution to binding is still missing due to the lack of accurate and large-scale binding affinity data. Here, we use a binding assay we recently established to measure with high sensitivity the binding specificities of 13 Drosophila TFs, including dinucleotide dependencies to capture non-independent amino acid-base interactions. Correlating the binding affinities with all DNA shape features, we find that shape readout is widely used by these factors. A shape readout/TF-DNA complex structure analysis validates our approach while providing biological insights, such as the finding that positively charged or highly polar amino acids often contact nucleotides that exhibit strong shape readout.
The DNA shape contribution to Drosophila TFs-DNA binding is directly characterized. Zeroth- and first-order TF-DNA binding specificities are measured with high accuracy. DNA shape readout is widely used by these TFs. A shape readout/structural correlation analysis provides biological insights.
Affiliation(s)
- Max Schnepf
- Gene Center and Department of Biochemistry, Center for Protein Science Munich (CIPSM), Ludwig-Maximilians-Universität München, Feodor-Lynen-Strasse 25, 81377 München, Germany
- Marc von Reutern
- Gene Center and Department of Biochemistry, Center for Protein Science Munich (CIPSM), Ludwig-Maximilians-Universität München, Feodor-Lynen-Strasse 25, 81377 München, Germany
- Claudia Ludwig
- Gene Center and Department of Biochemistry, Center for Protein Science Munich (CIPSM), Ludwig-Maximilians-Universität München, Feodor-Lynen-Strasse 25, 81377 München, Germany
- Christophe Jung
- Gene Center and Department of Biochemistry, Center for Protein Science Munich (CIPSM), Ludwig-Maximilians-Universität München, Feodor-Lynen-Strasse 25, 81377 München, Germany
- Ulrike Gaul
- Gene Center and Department of Biochemistry, Center for Protein Science Munich (CIPSM), Ludwig-Maximilians-Universität München, Feodor-Lynen-Strasse 25, 81377 München, Germany
28
Tack DS, Romantseva EF, Tonner PD, Pressman A, Rammohan J, Strychalski EA. Measurements drive progress in directed evolution for precise engineering of biological systems. Current Opinion in Systems Biology 2020; 23:32-37. [PMID: 34611570 PMCID: PMC8489032 DOI: 10.1016/j.coisb.2020.09.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
Precise engineering of biological systems requires quantitative, high-throughput measurements, exemplified by progress in directed evolution. New approaches allow high-throughput measurements of phenotypes and their corresponding genotypes. When integrated into directed evolution, these quantitative approaches enable the precise engineering of biological function. At the same time, the increasingly routine availability of large, high-quality data sets supports the integration of machine learning with directed evolution. Together, these advances herald striking capabilities for engineering biology.
Affiliation(s)
- Drew S Tack
- National Institute of Standards and Technology, Gaithersburg, MD, 20898, USA
- Peter D Tonner
- National Institute of Standards and Technology, Gaithersburg, MD, 20898, USA
- Abe Pressman
- National Institute of Standards and Technology, Gaithersburg, MD, 20898, USA
- Jayan Rammohan
- National Institute of Standards and Technology, Gaithersburg, MD, 20898, USA
29
Micura R, Höbartner C. Fundamental studies of functional nucleic acids: aptamers, riboswitches, ribozymes and DNAzymes. Chem Soc Rev 2020; 49:7331-7353. [PMID: 32944725 DOI: 10.1039/d0cs00617c] [Citation(s) in RCA: 126] [Impact Index Per Article: 25.2] [Indexed: 12/19/2022]
Abstract
This review aims at juxtaposing common versus distinct structural and functional strategies that are applied by aptamers, riboswitches, and ribozymes/DNAzymes. Focusing on recently discovered systems, we begin our analysis with small-molecule binding aptamers, with emphasis on in vitro-selected fluorogenic RNA aptamers and their different modes of ligand binding and fluorescence activation. Fundamental insights are much needed to advance RNA imaging probes for detection of exo- and endogenous RNA and for RNA process tracking. Secondly, we discuss the latest gene expression-regulating mRNA riboswitches that respond to the alarmone ppGpp, to PRPP, to NAD+, to adenosine and cytidine diphosphates, and to precursors of thiamine biosynthesis (HMP-PP), and we outline new subclasses of SAM and tetrahydrofolate-binding RNA regulators. Many riboswitches bind protein enzyme cofactors that, in principle, can catalyse a chemical reaction. For RNA, however, only one system (glmS ribozyme) has been identified in Nature thus far that utilizes a small molecule - glucosamine-6-phosphate - to participate directly in reaction catalysis (phosphodiester cleavage). We wonder why that is the case and what is to be done to reveal such likely existing cellular activities that could be more diverse than currently imagined. Thirdly, this brings us to the four latest small nucleolytic ribozymes termed twister, twister-sister, pistol, and hatchet as well as to in vitro selected DNA and RNA enzymes that promote new chemistry, mainly by exploiting their ability for RNA labelling and nucleoside modification recognition. Enormous progress in understanding the strategies of nucleic acids catalysts has been made by providing thorough structural fundaments (e.g. first structure of a DNAzyme, structures of ribozyme transition state mimics) in combination with functional assays and atomic mutagenesis.
Affiliation(s)
- Ronald Micura
- Institute of Organic Chemistry and Center for Molecular Biosciences Innsbruck CMBI, Leopold-Franzens University Innsbruck, Innsbruck, Austria.
30
Jarmoskaite I, AlSadhan I, Vaidyanathan PP, Herschlag D. How to measure and evaluate binding affinities. eLife 2020; 9:e57264. [PMID: 32758356 PMCID: PMC7452723 DOI: 10.7554/elife.57264] [Citation(s) in RCA: 311] [Impact Index Per Article: 62.2] [Received: 03/26/2020] [Accepted: 08/05/2020] [Indexed: 12/23/2022]
Abstract
Quantitative measurements of biomolecule associations are central to biological understanding and are needed to build and test predictive and mechanistic models. Given the advances in high-throughput technologies and the projected increase in the availability of binding data, we found it especially timely to evaluate the current standards for performing and reporting binding measurements. A review of 100 studies revealed that in most cases essential controls for establishing the appropriate incubation time and concentration regime were not documented, making it impossible to determine measurement reliability. Moreover, several reported affinities could be concluded to be incorrect, thereby impacting biological interpretations. Given these challenges, we provide a framework for a broad range of researchers to evaluate, teach about, perform, and clearly document high-quality equilibrium binding measurements. We apply this framework and explain underlying fundamental concepts through experimental examples with the RNA-binding protein Puf4.
Affiliation(s)
- Inga Jarmoskaite
- Department of Biochemistry, Stanford University, Stanford, United States
- Ishraq AlSadhan
- Department of Biochemistry, Stanford University, Stanford, United States
- Daniel Herschlag
- Department of Biochemistry, Stanford University, Stanford, United States
- Department of Chemical Engineering, Stanford University, Stanford, United States
- Stanford ChEM-H, Stanford University, Stanford, United States
31
Furukawa T, Scheven MT, Misslinger M, Zhao C, Hoefgen S, Gsaller F, Lau J, Jöchl C, Donaldson I, Valiante V, Brakhage AA, Bromley MJ, Haas H, Hortschansky P. The fungal CCAAT-binding complex and HapX display highly variable but evolutionary conserved synergetic promoter-specific DNA recognition. Nucleic Acids Res 2020; 48:3567-3590. [PMID: 32086516 PMCID: PMC7144946 DOI: 10.1093/nar/gkaa109] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Received: 05/10/2019] [Revised: 02/07/2020] [Accepted: 02/18/2020] [Indexed: 12/13/2022]
Abstract
To sustain iron homeostasis, microorganisms have evolved fine-tuned mechanisms for uptake, storage and detoxification of the essential metal iron. In the human pathogen Aspergillus fumigatus, the fungal-specific bZIP-type transcription factor HapX coordinates adaption to both iron starvation and iron excess and is thereby crucial for virulence. Previous studies indicated that a HapX homodimer interacts with the CCAAT-binding complex (CBC) to cooperatively bind bipartite DNA motifs; however, the mode of HapX-DNA recognition had not been resolved. Here, combination of in vivo (genetics and ChIP-seq), in vitro (surface plasmon resonance) and phylogenetic analyses identified an astonishing plasticity of CBC:HapX:DNA interaction. DNA motifs recognized by the CBC:HapX protein complex comprise a bipartite DNA binding site 5′-CSAATN12RWT-3′ and an additional 5′-TKAN-3′ motif positioned 11–23 bp downstream of the CCAAT motif, i.e. occasionally overlapping the 3′-end of the bipartite binding site. Phylogenetic comparison taking advantage of 20 resolved Aspergillus species genomes revealed that DNA recognition by the CBC:HapX complex shows promoter-specific cross-species conservation rather than regulon-specific conservation. Moreover, we show that CBC:HapX interaction is absolutely required for all known functions of HapX. The plasticity of the CBC:HapX:DNA interaction permits fine tuning of CBC:HapX binding specificities that could support adaptation of pathogens to their host niches.
Affiliation(s)
- Takanori Furukawa
- Manchester Fungal Infection Group, Institute of Inflammation and Repair, University of Manchester, Manchester M13 9PL, UK
- Mareike Thea Scheven
- Department of Molecular and Applied Microbiology, Leibniz Institute for Natural Product Research and Infection Biology (HKI), Jena D-07745, Germany
- Matthias Misslinger
- Division of Molecular Biology/Biocenter, Innsbruck Medical University, Innsbruck, A-6020, Austria
- Can Zhao
- Manchester Fungal Infection Group, Institute of Inflammation and Repair, University of Manchester, Manchester M13 9PL, UK
- Sandra Hoefgen
- Leibniz Research Group Biobricks of Microbial Natural Product Syntheses, Leibniz Institute for Natural Product Research and Infection Biology (HKI), Jena D-07745, Germany
- Fabio Gsaller
- Division of Molecular Biology/Biocenter, Innsbruck Medical University, Innsbruck, A-6020, Austria
- Jeffrey Lau
- Manchester Fungal Infection Group, Institute of Inflammation and Repair, University of Manchester, Manchester M13 9PL, UK
- Christoph Jöchl
- Division of Molecular Biology/Biocenter, Innsbruck Medical University, Innsbruck, A-6020, Austria
- Ian Donaldson
- Manchester Fungal Infection Group, Institute of Inflammation and Repair, University of Manchester, Manchester M13 9PL, UK
- Vito Valiante
- Leibniz Research Group Biobricks of Microbial Natural Product Syntheses, Leibniz Institute for Natural Product Research and Infection Biology (HKI), Jena D-07745, Germany; Friedrich Schiller University Jena, Jena D-07745, Germany
- Axel A Brakhage
- Department of Molecular and Applied Microbiology, Leibniz Institute for Natural Product Research and Infection Biology (HKI), Jena D-07745, Germany; Friedrich Schiller University Jena, Jena D-07745, Germany
- Michael J Bromley
- Manchester Fungal Infection Group, Institute of Inflammation and Repair, University of Manchester, Manchester M13 9PL, UK
- Hubertus Haas
- Division of Molecular Biology/Biocenter, Innsbruck Medical University, Innsbruck, A-6020, Austria
- Peter Hortschansky
- Department of Molecular and Applied Microbiology, Leibniz Institute for Natural Product Research and Infection Biology (HKI), Jena D-07745, Germany
32
Ye X, Jankowsky E. High throughput approaches to study RNA-protein interactions in vitro. Methods 2020; 178:3-10. [PMID: 31494245 PMCID: PMC7071787 DOI: 10.1016/j.ymeth.2019.09.006] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 03/06/2019] [Revised: 06/07/2019] [Accepted: 09/01/2019] [Indexed: 02/08/2023]
Abstract
To understand the regulation of gene expression it is critical to determine how proteins interact with and discriminate between different RNAs. In this review, we discuss experimental techniques that utilize high throughput approaches to characterize the interactions of proteins with large numbers of RNAs in vitro. We describe the underlying principles for the main methods, briefly discuss their scope and limitations, and outline how insight from the techniques contributes to our understanding of specificity for RNA-protein interactions.
Affiliation(s)
- Xuan Ye
- Center for RNA Science and Therapeutics, School of Medicine, Case Western Reserve University, Cleveland, OH, United States
- Eckhard Jankowsky
- Center for RNA Science and Therapeutics, School of Medicine, Case Western Reserve University, Cleveland, OH, United States.
33
Wang L, You ZH, Huang DS, Zhou F. Combining High Speed ELM Learning with a Deep Convolutional Neural Network Feature Encoding for Predicting Protein-RNA Interactions. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2020; 17:972-980. [PMID: 30296240 DOI: 10.1109/tcbb.2018.2874267] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Indexed: 05/25/2023]
Abstract
Emerging evidence has shown that RNA plays a crucial role in many cellular processes, and its biological functions are primarily achieved by binding with a variety of proteins. High-throughput biological experiments provide a lot of valuable information for the initial identification of RNA-protein interactions (RPIs), but as RPI networks grow more complex, such experiments become increasingly expensive and time-consuming. Therefore, there is an urgent need for high-speed and reliable methods to predict RNA-protein interactions. In this study, we propose a computational method for predicting RNA-protein interactions using sequence information. The deep learning convolutional neural network (CNN) algorithm is utilized to mine the hidden high-level discriminative features from the RNA and protein sequences and feed them into the extreme learning machine (ELM) classifier. The experimental results with 5-fold cross-validation indicate that the proposed method achieves superior performance on benchmark datasets (RPI1807, RPI2241, and RPI369), with accuracies of 98.83, 90.83, and 85.63 percent, respectively. We further evaluate the performance of the proposed model by comparing it with the state-of-the-art SVM classifier and other existing methods on the same benchmark data set. In addition, we predicted the independent NPInter v2.0 data set using the model trained on RPI369. The experimental results show that our model can serve as a useful tool for predicting RNA-protein interactions.
34
Longwell SA, Fordyce PM. micrIO: an open-source autosampler and fraction collector for automated microfluidic input-output. Lab on a Chip 2020; 20:93-106. [PMID: 31701110 PMCID: PMC6923132 DOI: 10.1039/c9lc00512a] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Indexed: 05/26/2023]
Abstract
Microfluidic devices are an enabling technology for many labs, facilitating a wide range of applications spanning high-throughput encapsulation, molecular separations, and long-term cell culture. In many cases, however, their utility is limited by a 'world-to-chip' barrier that makes it difficult to serially interface samples with these devices. As a result, many researchers are forced to rely on low-throughput, manual approaches for managing device input and output (IO) of samples, reagents, and effluent. Here, we present a hardware-software platform for automated microfluidic IO (micrIO). The platform, which is uniquely compatible with positive-pressure microfluidics, comprises an 'AutoSipper' for input and a 'Fraction Collector' for output. To facilitate widespread adoption, both are open-source builds constructed from components that are readily purchased online or fabricated from included design files. The software control library, written in Python, allows the platform to be integrated with existing experimental setups and to coordinate IO with other functions such as valve actuation and assay imaging. We demonstrate these capabilities by coupling both the AutoSipper and Fraction Collector to two microfluidic devices: a simple, valved inlet manifold and a microfluidic droplet generator that produces beads with distinct spectral codes. Analysis of the collected materials in each case establishes the ability of the platform to draw from and output to specific wells of multiwell plates with negligible cross-contamination between samples.
Affiliation(s)
- Scott A Longwell
- Department of Bioengineering, Stanford University, Stanford, CA, USA.
- Polly M Fordyce
- Department of Bioengineering, Stanford University, Stanford, CA, USA; Department of Genetics, Stanford University, Stanford, CA, USA; ChEM-H Institute, Stanford University, Stanford, CA, USA; Chan Zuckerberg Biohub, San Francisco, CA, USA
35
Pal S, Hoinka J, Przytycka TM. Co-SELECT reveals sequence non-specific contribution of DNA shape to transcription factor binding in vitro. Nucleic Acids Res 2020; 47:6632-6641. [PMID: 31226207 DOI: 10.1093/nar/gkz540] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 10/02/2018] [Revised: 05/31/2019] [Accepted: 06/06/2019] [Indexed: 12/22/2022]
Abstract
Understanding the principles of DNA binding by transcription factors (TFs) is of primary importance for studying gene regulation. Recently, several lines of evidence suggested that both DNA sequence and shape contribute to TF binding. However, the following compelling question is yet to be considered: in the absence of any sequence similarity to the binding motif, can DNA shape still increase binding probability? To address this challenge, we developed Co-SELECT, a computational approach to analyze the results of in vitro HT-SELEX experiments for TF-DNA binding. Specifically, Co-SELECT leverages the presence of motif-free sequences in late HT-SELEX rounds; their enrichment in weak binders allows Co-SELECT to detect evidence for the role of DNA shape features in TF binding. Our approach revealed that, even in the absence of the sequence motif, TFs have a propensity to bind to DNA molecules whose shape is consistent with motif-specific binding. This provides the first direct evidence that shape features that accompany the preferred sequence motifs also bestow an advantage for weak, sequence non-specific binding.
Affiliation(s)
- Soumitra Pal
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA
- Jan Hoinka
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA
- Teresa M Przytycka
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA
36
Gräwe C, Makowski MM, Vermeulen M. PAQMAN: Protein-nucleic acid affinity quantification by MAss spectrometry in nuclear extracts. Methods 2019; 184:70-77. [PMID: 31857188 DOI: 10.1016/j.ymeth.2019.12.003] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 07/03/2019] [Revised: 12/10/2019] [Accepted: 12/11/2019] [Indexed: 12/01/2022]
Abstract
In recent years, various mass spectrometry-based approaches have been developed to determine global protein-DNA binding specificities using DNA affinity purifications from crude nuclear extracts. However, these assays are semi-quantitative and do not provide information about interaction affinities. We recently developed a technology that we call Protein-nucleic acid Affinity Quantification by MAss spectrometry in Nuclear extracts, or PAQMAN, that can be used to determine apparent affinities between multiple nuclear proteins and a nucleic acid sequence of interest in one experiment. In PAQMAN, a series of affinity purifications with increasing bait concentrations and fixed amounts of crude nuclear extracts are combined with isobaric stable isotope labeling and quantitative mass spectrometry to generate Hill-like Kd curves for dozens of proteins in a single experiment. Here, we apply PAQMAN to determine apparent affinities for a genetic variant, rs36115365-C, which regulates TERT expression and is associated with an increased risk of developing various malignancies. Furthermore, we describe a detailed protocol for this method including important quality checks.
Affiliation(s)
- Cathrin Gräwe
- Department of Molecular Biology, Faculty of Science, Radboud Institute for Molecular Life Sciences, Oncode Institute, Radboud University Nijmegen, 6525 GA Nijmegen, The Netherlands
- Matthew M Makowski
- Department of Molecular Biology, Faculty of Science, Radboud Institute for Molecular Life Sciences, Oncode Institute, Radboud University Nijmegen, 6525 GA Nijmegen, The Netherlands
- Michiel Vermeulen
- Department of Molecular Biology, Faculty of Science, Radboud Institute for Molecular Life Sciences, Oncode Institute, Radboud University Nijmegen, 6525 GA Nijmegen, The Netherlands.
37
Deciphering eukaryotic gene-regulatory logic with 100 million random promoters. Nat Biotechnol 2019; 38:56-65. [PMID: 31792407 PMCID: PMC6954276 DOI: 10.1038/s41587-019-0315-8] [Citation(s) in RCA: 161] [Impact Index Per Article: 26.8] [Received: 01/24/2019] [Accepted: 10/16/2019] [Indexed: 11/26/2022]
Abstract
How transcription factors (TFs) interpret cis-regulatory DNA sequence to control gene expression remains unclear, largely because past studies using native and engineered sequences had insufficient scale. Here, we measure the expression output of >100 million synthetic yeast promoter sequences that are fully random. These sequences yield diverse, reproducible expression levels that can be explained by their chance inclusion of functional TF binding sites. We use machine learning to build interpretable models of transcriptional regulation that predict ~94% of the expression driven from independent test promoters and ~89% of the expression driven from native yeast promoter fragments. These models allow us to characterize each TF’s specificity, activity, and interactions with chromatin. TF activity depends on binding-site strand, position, DNA helical face and chromatin context. Notably, expression level is influenced by weak regulatory interactions, which confound designed-sequence studies. Our analyses show that massive-throughput assays of fully random DNA can provide the big data necessary to develop complex, predictive models of gene regulation.
Gene expression levels in yeast are predicted using a massive dataset on promoters with random sequences.
38
Mamet N, Harari G, Zamir A, Bachelet I. Simulating the Monty Hall problem in a DNA sequencing machine. Comput Biol Chem 2019; 83:107122. [DOI: 10.1016/j.compbiolchem.2019.107122] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 04/06/2019] [Revised: 08/04/2019] [Accepted: 09/04/2019] [Indexed: 02/04/2023]
39
Unified rational protein engineering with sequence-based deep representation learning. Nat Methods 2019; 16:1315-1322. [PMID: 31636460 DOI: 10.1038/s41592-019-0598-1] [Citation(s) in RCA: 565] [Impact Index Per Article: 94.2] [Received: 04/08/2019] [Accepted: 09/11/2019] [Indexed: 01/03/2023]
Abstract
Rational protein engineering requires a holistic understanding of protein function. Here, we apply deep learning to unlabeled amino-acid sequences to distill the fundamental features of a protein into a statistical representation that is semantically rich and structurally, evolutionarily and biophysically grounded. We show that the simplest models built on top of this unified representation (UniRep) are broadly applicable and generalize to unseen regions of sequence space. Our data-driven approach predicts the stability of natural and de novo designed proteins, and the quantitative function of molecularly diverse mutants, competitively with the state-of-the-art methods. UniRep further enables two orders of magnitude efficiency improvement in a protein engineering task. UniRep is a versatile summary of fundamental protein features that can be applied across protein engineering informatics.
40
Kemble H, Nghe P, Tenaillon O. Recent insights into the genotype-phenotype relationship from massively parallel genetic assays. Evol Appl 2019; 12:1721-1742. [PMID: 31548853 PMCID: PMC6752143 DOI: 10.1111/eva.12846] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Received: 03/27/2019] [Revised: 06/21/2019] [Accepted: 07/02/2019] [Indexed: 12/20/2022]
Abstract
With the molecular revolution in Biology, a mechanistic understanding of the genotype-phenotype relationship became possible. Recently, advances in DNA synthesis and sequencing have enabled the development of deep mutational scanning assays, capable of scoring comprehensive libraries of genotypes for fitness and a variety of phenotypes in massively parallel fashion. The resulting empirical genotype-fitness maps pave the way to predictive models, potentially accelerating our ability to anticipate the behaviour of pathogen and cancerous cell populations from sequencing data. Besides cellular fitness, phenotypes of direct application in industry (e.g. enzyme activity) and medicine (e.g. antibody binding) can be quantified and even selected directly by these assays. This review discusses the technological basis of and recent developments in massively parallel genetics, along with the trends it is uncovering in the genotype-phenotype relationship (distribution of mutation effects, epistasis), their possible mechanistic bases and future directions for advancing towards the goal of predictive genetics.
Affiliation(s)
- Harry Kemble
- Infection, Antimicrobials, Modelling, Evolution, INSERM, Unité Mixte de Recherche 1137, Université Paris Diderot, Université Paris Nord, Paris, France
- École Supérieure de Physique et de Chimie Industrielles de la Ville de Paris (ESPCI Paris), UMR CNRS‐ESPCI CBI 8231, PSL Research University, Paris Cedex 05, France
- Philippe Nghe
- École Supérieure de Physique et de Chimie Industrielles de la Ville de Paris (ESPCI Paris), UMR CNRS‐ESPCI CBI 8231, PSL Research University, Paris Cedex 05, France
- Olivier Tenaillon
- Infection, Antimicrobials, Modelling, Evolution, INSERM, Unité Mixte de Recherche 1137, Université Paris Diderot, Université Paris Nord, Paris, France
41
Denny SK, Greenleaf WJ. Linking RNA Sequence, Structure, and Function on Massively Parallel High-Throughput Sequencers. Cold Spring Harb Perspect Biol 2019; 11:a032300. [PMID: 30322887 PMCID: PMC6771372 DOI: 10.1101/cshperspect.a032300] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Indexed: 02/07/2023]
Abstract
High-throughput sequencing methods have revolutionized our ability to catalog the diversity of RNAs and RNA-protein interactions that can exist in our cells. However, the relationship between RNA sequence, structure, and function is enormously complex, demonstrating the need for methods that can provide quantitative thermodynamic and kinetic measurements of macromolecular interaction with RNA, at a scale commensurate with the sequence diversity of RNA. Here, we discuss a class of methods that extend the core functionality of DNA sequencers to enable high-throughput measurements of RNA folding and RNA-protein interactions. Topics discussed include a description of the method and multiple applications to RNA-binding proteins, riboswitch design and engineering, and RNA tertiary structure energetics.
Affiliation(s)
- Sarah K Denny
- Stanford University Department of Genetics, Stanford, California 94305
- William J Greenleaf
- Stanford University Department of Genetics, Stanford, California 94305
- Stanford University Department of Applied Physics, Stanford, California 94025
- Chan Zuckerberg Biohub, San Francisco, California 94158
42
Wang W, Langlois R, Langlois M, Genchev GZ, Wang X, Lu H. Functional Site Discovery From Incomplete Training Data: A Case Study With Nucleic Acid-Binding Proteins. Front Genet 2019; 10:729. [PMID: 31543893 PMCID: PMC6729729 DOI: 10.3389/fgene.2019.00729] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 12/21/2018] [Accepted: 07/11/2019] [Indexed: 12/27/2022]
Abstract
Function annotation efforts provide a foundation to our understanding of cellular processes and the functioning of the living cell. This motivates high-throughput computational methods to characterize new protein members of a particular function. Research work has focused on discriminative machine-learning methods, which promise to make efficient, de novo predictions of protein function. Furthermore, available function annotation exists predominantly for individual proteins rather than residues, of which only a subset is necessary for the conveyance of a particular function. This limits discriminative approaches to predicting functions for which there is sufficient residue-level annotation, e.g., identification of DNA-binding proteins, or where an excellent global representation can be divined. Complete understanding of the various functions of proteins requires discovery and functional annotation at the residue level. Herein, we cast this problem into the setting of multiple-instance learning, which only requires knowledge of the protein’s function yet identifies functionally relevant residues and need not rely on homology. We developed a new multiple-instance learning algorithm derived from AdaBoost and benchmarked this algorithm against two well-studied protein function prediction tasks: annotating proteins that bind DNA and RNA. This algorithm outperforms certain previous approaches in annotating protein function while identifying functionally relevant residues involved in binding both DNA and RNA, and on one protein-DNA benchmark, it achieves near perfect classification.
Affiliation(s)
- Wenchuan Wang
- SJTU-Yale Joint Center for Biostatistics and Data Science, Department of Bioinformatics and Biostatistics, College of Life Science and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
| | - Robert Langlois
- Department of Bioengineering and Department of Computer Science, University of Illinois at Chicago, Chicago, IL, United States
| | - Marina Langlois
- Department of Bioengineering and Department of Computer Science, University of Illinois at Chicago, Chicago, IL, United States
| | - Georgi Z Genchev
- SJTU-Yale Joint Center for Biostatistics and Data Science, Department of Bioinformatics and Biostatistics, College of Life Science and Biotechnology, Shanghai Jiao Tong University, Shanghai, China; Department of Bioengineering and Department of Computer Science, University of Illinois at Chicago, Chicago, IL, United States; Bulgarian Institute for Genomics and Precision Medicine, Sofia, Bulgaria
| | - Xiaolei Wang
- SJTU-Yale Joint Center for Biostatistics and Data Science, Department of Bioinformatics and Biostatistics, College of Life Science and Biotechnology, Shanghai Jiao Tong University, Shanghai, China; Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China
| | - Hui Lu
- SJTU-Yale Joint Center for Biostatistics and Data Science, Department of Bioinformatics and Biostatistics, College of Life Science and Biotechnology, Shanghai Jiao Tong University, Shanghai, China; Department of Bioengineering and Department of Computer Science, University of Illinois at Chicago, Chicago, IL, United States; Center for Biomedical Informatics, Shanghai Children's Hospital, Shanghai, China
| |
|
43
|
Wu X, Bartel DP. kpLogo: positional k-mer analysis reveals hidden specificity in biological sequences. Nucleic Acids Res 2017; 45:W534-W538. [PMID: 28460012 PMCID: PMC5570168 DOI: 10.1093/nar/gkx323] [Citation(s) in RCA: 89] [Impact Index Per Article: 14.8] [Received: 02/12/2017] [Accepted: 04/13/2017] [Indexed: 12/26/2022] Open
Abstract
Motifs of only 1–4 letters can play important roles when present at key locations within macromolecules. Because existing motif-discovery tools typically miss these position-specific short motifs, we developed kpLogo, a probability-based logo tool for integrated detection and visualization of position-specific ultra-short motifs from a set of aligned sequences. kpLogo also overcomes the limitations of conventional motif-visualization tools in handling positional interdependencies and utilizing ranked or weighted sequences increasingly available from high-throughput assays. kpLogo can be found at http://kplogo.wi.mit.edu/.
Affiliation(s)
- Xuebing Wu
- Howard Hughes Medical Institute and Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA; Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - David P Bartel
- Howard Hughes Medical Institute and Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA; Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| |
|
44
|
Zhang Q, Zhu L, Huang DS. High-Order Convolutional Neural Network Architecture for Predicting DNA-Protein Binding Sites. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2019; 16:1184-1192. [PMID: 29993783 DOI: 10.1109/tcbb.2018.2819660] [Citation(s) in RCA: 55] [Impact Index Per Article: 9.2] [Indexed: 05/15/2023]
Abstract
Although deep learning algorithms have outperformed conventional methods in predicting the sequence specificities of DNA-protein binding, they fail to consider the dependencies among nucleotides and the diverse binding lengths for different transcription factors (TFs). To address these two limitations simultaneously, we propose a high-order convolutional neural network architecture (HOCNN), which employs a high-order encoding method to build high-order dependencies among nucleotides, and a multi-scale convolutional layer to capture motif features of different lengths. The experimental results on real ChIP-seq datasets show that the proposed method outperforms the state-of-the-art deep learning method (DeepBind) in the motif discovery task. In addition, we provide further insights about the importance of introducing additional convolutional kernels and the degeneration problem of importing high-order in the motif discovery task.
|
45
|
Layton CJ, McMahon PL, Greenleaf WJ. Large-Scale, Quantitative Protein Assays on a High-Throughput DNA Sequencing Chip. Mol Cell 2019; 73:1075-1082.e4. [PMID: 30849388 DOI: 10.1016/j.molcel.2019.02.019] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Received: 10/05/2018] [Revised: 01/18/2019] [Accepted: 02/14/2019] [Indexed: 01/22/2023]
Abstract
High-throughput DNA sequencing techniques have enabled diverse approaches for linking DNA sequence to biochemical function. In contrast, assays of protein function have substantial limitations in terms of throughput, automation, and widespread availability. We have adapted an Illumina high-throughput sequencing chip to display an immense diversity of ribosomally translated proteins and peptides and then carried out fluorescence-based functional assays directly on this flow cell, demonstrating that a single, widely available high-throughput platform can perform both sequencing-by-synthesis and protein assays. We quantified the binding of the M2 anti-FLAG antibody to a library of 1.3 × 10⁴ variant FLAG peptides, exploring non-additive effects of combinations of mutations and discovering a "superFLAG" epitope variant. We also measured the enzymatic activity of 1.56 × 10⁵ molecular variants of full-length human O⁶-alkylguanine-DNA alkyltransferase (SNAP-tag). This comprehensive corpus of catalytic rates revealed amino acid interaction networks and cooperativity, linked positive cooperativity to structural proximity, and revealed ubiquitous positively cooperative interactions with histidine residues.
Affiliation(s)
- Curtis J Layton
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Peter L McMahon
- Department of Applied Physics, Stanford University, Stanford, CA 94305, USA
| | - William J Greenleaf
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA; Department of Applied Physics, Stanford University, Stanford, CA 94305, USA; Chan-Zuckerberg Initiative, Palo Alto, CA 94301, USA.
| |
|
46
|
|
47
|
Munzar JD, Ng A, Juncker D. Duplexed aptamers: history, design, theory, and application to biosensing. Chem Soc Rev 2019; 48:1390-1419. [PMID: 30707214 DOI: 10.1039/c8cs00880a] [Citation(s) in RCA: 134] [Impact Index Per Article: 22.3] [Indexed: 12/22/2022]
Abstract
Nucleic acid aptamers are single stranded DNA or RNA sequences that specifically bind a cognate ligand. In addition to their widespread use as stand-alone affinity binding reagents in analytical chemistry, aptamers have been engineered into a variety of ligand-specific biosensors, termed aptasensors. One of the most common aptasensor formats is the duplexed aptamer (DA). As defined herein, DAs are aptasensors containing two nucleic acid elements coupled via Watson-Crick base pairing: (i) an aptamer sequence, which serves as a ligand-specific receptor, and (ii) an aptamer-complementary element (ACE), such as a short DNA oligonucleotide, which is designed to hybridize to the aptamer. The ACE competes with ligand binding, such that DAs generate a signal upon ligand-dependent ACE-aptamer dehybridization. DAs possess intrinsic advantages over other aptasensor designs. For example, DA biosensing designs generalize across DNA and RNA aptamers, DAs are compatible with many readout methods, and DAs are inherently tunable on the basis of nucleic acid hybridization. However, despite their utility and popularity, DAs have not been well defined in the literature, leading to confusion over the differences between DAs and other aptasensor formats. In this review, we introduce a framework for DAs based on ACEs, and use this framework to distinguish DAs from other aptasensor formats and to categorize cis- and trans-DA designs. We then explore the ligand binding dynamics and chemical properties that underpin DA systems, which fall under conformational selection and induced fit models, and which mirror classical SN1 and SN2 models of nucleophilic substitution reactions. We further review a variety of in vitro and in vivo applications of DAs in the chemical and biological sciences, including riboswitches and riboregulators. Finally, we present future directions of DAs as ligand-responsive nucleic acids. 
Owing to their tractability, versatility and ease of engineering, DA biosensors bear a great potential for the development of new applications and technologies in fields ranging from analytical chemistry and mechanistic modeling to medicine and synthetic biology.
Affiliation(s)
- Jeffrey D Munzar
- McGill University and Genome Quebec Innovation Centre, Montreal, Quebec, Canada.
| | | | | |
|
48
|
Cole KH, Lupták A. High-throughput methods in aptamer discovery and analysis. Methods Enzymol 2019; 621:329-346. [PMID: 31128787 DOI: 10.1016/bs.mie.2019.02.009] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 01/30/2023]
Abstract
Aptamers are small, functional nucleic acids that bind a variety of targets, often with high specificity and affinity. Genomic aptamers constitute the ligand-binding domains of riboswitches, whereas synthetic aptamers find applications as diagnostic and therapeutic tools, and as ligand-binding domains of regulatory RNAs in synthetic biology. Discovery and characterization of aptamers have been limited by a lack of high-throughput approaches that uncover the target-binding domains and the biochemical properties of individual sequences. With the advent of high-throughput sequencing, large-scale analysis of in vitro selected populations of aptamers (and catalytic nucleic acids, such as ribozymes and DNAzymes) became possible. In recent years the development of new experimental approaches and software tools has led to significant streamlining of the selection-pool analysis. This article provides an overview of post-selection data analysis and describes high-throughput methods that facilitate rapid discovery and biochemical characterization of aptamers.
Affiliation(s)
- Kyle H Cole
- Department of Molecular Biology and Biochemistry, University of California, Irvine, CA, United States
| | - Andrej Lupták
- Department of Molecular Biology and Biochemistry, University of California, Irvine, CA, United States; Department of Pharmaceutical Sciences, University of California, Irvine, CA, United States; Department of Chemistry, University of California, Irvine, CA, United States.
| |
|
49
|
Barnes SL, Belliveau NM, Ireland WT, Kinney JB, Phillips R. Mapping DNA sequence to transcription factor binding energy in vivo. PLoS Comput Biol 2019; 15:e1006226. [PMID: 30716072 PMCID: PMC6375646 DOI: 10.1371/journal.pcbi.1006226] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Received: 05/22/2018] [Revised: 02/14/2019] [Accepted: 11/06/2018] [Indexed: 11/18/2022] Open
Abstract
Despite the central importance of transcriptional regulation in biology, it has proven difficult to determine the regulatory mechanisms of individual genes, let alone entire gene networks. It is particularly difficult to decipher the biophysical mechanisms of transcriptional regulation in living cells and determine the energetic properties of binding sites for transcription factors and RNA polymerase. In this work, we present a strategy for dissecting transcriptional regulatory sequences using in vivo methods (massively parallel reporter assays) to formulate quantitative models that map a transcription factor binding site’s DNA sequence to transcription factor-DNA binding energy. We use these models to predict the binding energies of transcription factor binding sites to within 1 kBT of their measured values. We further explore how such a sequence-energy mapping relates to the mechanisms of transcriptional regulation in various promoter contexts. Specifically, we show that our models can be used to design specific induction responses, analyze the effects of amino acid mutations on DNA sequence preference, and determine how regulatory context affects a transcription factor’s sequence specificity.
It has been said that we live in the “genomic era,” a time where we can readily sequence full genomes at will. However, it remains difficult to interpret much of the information within a genome. This is especially true of non-coding sequences such as promoters, which contain a number of features such as transcription factor binding sites that determine how genes are regulated. There is no straightforward regulatory “code” that tells us how transcription factor binding sites are organized within a promoter. In this work we examine how DNA sequence determines one of the most important features of a promoter, the strength with which a transcription factor binds to its DNA binding site. We discuss an approach to modeling DNA sequence-specific transcription factor binding energies in vivo using a massively parallel reporter assay. We develop models that allow us to predict the binding energy between a transcription factor and a mutated version of its binding site. We then show that this modeling technique can be used to address a number of scientific and design questions, such as engineering the behavior of genetic circuit elements or examining how transcription factors and their binding sites co-evolve.
Affiliation(s)
- Stephanie L. Barnes
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, United States of America
| | - Nathan M. Belliveau
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, United States of America
| | - William T. Ireland
- Department of Physics, California Institute of Technology, Pasadena, California, United States of America
| | - Justin B. Kinney
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
| | - Rob Phillips
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California, United States of America
- Department of Physics, California Institute of Technology, Pasadena, California, United States of America
| |
|
50
|
Yella VR, Bhimsaria D, Ghoshdastidar D, Rodríguez-Martínez J, Ansari AZ, Bansal M. Flexibility and structure of flanking DNA impact transcription factor affinity for its core motif. Nucleic Acids Res 2018; 46:11883-11897. [PMID: 30395339 PMCID: PMC6294565 DOI: 10.1093/nar/gky1057] [Citation(s) in RCA: 52] [Impact Index Per Article: 7.4] [Received: 04/12/2018] [Revised: 10/11/2018] [Accepted: 10/17/2018] [Indexed: 01/13/2023] Open
Abstract
Spatial and temporal expression of genes is essential for maintaining phenotype integrity. Transcription factors (TFs) modulate expression patterns by binding to specific DNA sequences in the genome. Along with the core binding motif, the flanking sequence context can play a role in DNA-TF recognition. Here, we employ high-throughput in vitro and in silico analyses to understand the influence of sequences flanking the cognate sites on binding by the three most prevalent eukaryotic TF families (zinc finger, homeodomain and bZIP). In vitro binding preferences of each TF toward the entire DNA sequence space were correlated with a wide range of DNA structural parameters, including DNA flexibility. Results demonstrate that conformational plasticity of flanking regions modulates binding affinity of certain TF families. DNA duplex stability and minor groove width also play an important role in DNA-TF recognition but differ in how exactly they influence the binding in each specific case. Our analyses further reveal that the structural features of preferred flanking sequences are not universal, as similar DNA-binding folds can employ distinct DNA recognition modes.
Affiliation(s)
- Venkata Rajesh Yella
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore 560012, India
- Department of Biotechnology, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur, Andhra Pradesh 522502, India
| | - Devesh Bhimsaria
- Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706, USA
| | | | - José A Rodríguez-Martínez
- Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706, USA
- Department of Biology, University of Puerto Rico-Rio Piedras, San Juan, PR 00925, USA
| | - Aseem Z Ansari
- Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706, USA
- The Genome Center of Wisconsin, Madison, WI 53706, USA
| | - Manju Bansal
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore 560012, India
| |
|