1
Chen SK, Liu J, Van Nynatten A, Tudor-Price BM, Chang BSW. Sampling Strategies for Experimentally Mapping Molecular Fitness Landscapes Using High-Throughput Methods. J Mol Evol 2024:10.1007/s00239-024-10179-8. [PMID: 38886207 DOI: 10.1007/s00239-024-10179-8] [Received: 02/28/2024] [Accepted: 05/20/2024]
Abstract
Empirical studies of genotype-phenotype-fitness maps of proteins are fundamental to understanding the evolutionary process, in elucidating the space of possible genotypes accessible through mutations in a landscape of phenotypes and fitness effects. Yet, comprehensively mapping molecular fitness landscapes remains challenging since all possible combinations of amino acid substitutions for even a few protein sites are encoded by an enormous genotype space. High-throughput mapping of genotype space can be achieved using large-scale screening experiments known as multiplexed assays of variant effect (MAVEs). However, to accommodate such multi-mutational studies, the size of MAVEs has grown to the point where a priori determination of sampling requirements is needed. To address this problem, we propose calculations and simulation methods to approximate minimum sampling requirements for multi-mutational MAVEs, which we combine with a new library construction protocol to experimentally validate our approximation approaches. Analysis of our simulated data reveals how sampling trajectories differ between simulations of nucleotide versus amino acid variants and among mutagenesis schemes. For this, we show quantitatively that marginal gains in sampling efficiency demand increasingly greater sampling effort when sampling for nucleotide sequences over their encoded amino acid equivalents. We present a new library construction protocol that efficiently maximizes sequence variation, and demonstrate using ultradeep sequencing that the library encodes virtually all possible combinations of mutations within the experimental design. Insights learned from our analyses together with the methodological advances reported herein are immediately applicable toward pooled experimental screens of arbitrary design, enabling further assay upscaling and expanded testing of genotype space.
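The contrast between sampling nucleotide variants and sampling their encoded amino acid variants can be illustrated with a small calculation. The sketch below is not the authors' simulation method. It assumes NNK codon randomization at every site, independent uniform draws from the library, and the standard genetic code; the names `nnk_variant_probs` and `expected_coverage` are hypothetical.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, with bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Assumption: each site is randomized with NNK (K = G or T), giving 32 codons.
NNK_CODONS = [c for c in CODON_TABLE if c[2] in "GT"]


def nnk_variant_probs(n_sites):
    """Return (nucleotide-level, amino-acid-level) variant probabilities."""
    n_codons = len(NNK_CODONS)
    nt_probs = [1 / n_codons**n_sites] * n_codons**n_sites
    # Per-site amino acid frequencies; stop codons are not counted as targets.
    counts = Counter(CODON_TABLE[c] for c in NNK_CODONS)
    site_probs = [k / n_codons for aa, k in counts.items() if aa != "*"]
    aa_probs = [prod(combo) for combo in product(site_probs, repeat=n_sites)]
    return nt_probs, aa_probs


def expected_coverage(probs, n_draws):
    """Expected fraction of target variants seen at least once in n_draws."""
    return sum(1 - (1 - p) ** n_draws for p in probs) / len(probs)


nt_probs, aa_probs = nnk_variant_probs(n_sites=2)
for n_draws in (500, 1000, 2000, 4000, 8000):
    print(
        n_draws,
        round(expected_coverage(nt_probs, n_draws), 3),
        round(expected_coverage(aa_probs, n_draws), 3),
    )
```

Under these assumptions every amino acid variant is at least as probable as any single nucleotide variant, so expected amino acid coverage is never lower than nucleotide coverage at the same sampling depth.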
Affiliation(s)
- Steven K Chen
  - Department of Cell & Systems Biology, University of Toronto, Toronto, ON, Canada
- Jing Liu
  - Department of Cell & Systems Biology, University of Toronto, Toronto, ON, Canada
- Alexander Van Nynatten
  - Department of Biological Science, University of Toronto Scarborough, Toronto, ON, Canada
- Belinda S W Chang
  - Department of Cell & Systems Biology, University of Toronto, Toronto, ON, Canada
  - Department of Ecology & Evolutionary Biology, University of Toronto, Toronto, ON, Canada
  - Centre for the Analysis of Genome Evolution & Function, University of Toronto, Toronto, ON, Canada
2
Lindenburg L, Huovinen T, van de Wiel K, Herger M, Snaith MR, Hollfelder F. Split & mix assembly of DNA libraries for ultrahigh throughput on-bead screening of functional proteins. Nucleic Acids Res 2020; 48:e63. [PMID: 32383757 PMCID: PMC7293038 DOI: 10.1093/nar/gkaa270] [Received: 01/17/2020] [Revised: 04/02/2020] [Accepted: 04/21/2020]
Abstract
Site-saturation libraries reduce protein screening effort in directed evolution campaigns by focusing on a limited number of rationally chosen residues. However, uneven library synthesis efficiency leads to amino acid bias, remedied at high cost by expensive custom synthesis of oligonucleotides, or through use of proprietary library synthesis platforms. To address these shortcomings, we have devised a method where DNA libraries are constructed on the surface of microbeads by ligating dsDNA fragments onto growing, surface-immobilised DNA, in iterative split-and-mix cycles. This method, termed SpliMLiB (Split-and-Mix Library on Beads), was applied towards the directed evolution of an anti-IgE Affibody (ZIgE), generating a 160,000-membered, 4-site, saturation library on the surface of 8 million monoclonal beads. Deep sequencing confirmed excellent library balance (5.1% ± 0.77 per amino acid) and coverage (99.3%). As SpliMLiB beads are monoclonal, they were amenable to direct functional screening in water-in-oil emulsion droplets with cell-free expression. FACS-based sorting of the library beads allowed recovery of hits improved in Kd over wild-type ZIgE by up to 3.5-fold, while a consensus mutant of the best hits provided a 10-fold improvement. With SpliMLiB, directed evolution workflows are accelerated by integrating high-quality DNA library generation with an ultra-high throughput protein screening platform.
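The library size follows from the design: four saturated sites with 20 amino acids each give 20^4 = 160,000 variants, and a perfectly balanced library would contain each amino acid at 5% per site. The sketch below idealizes split-and-mix assembly as a random, equal split of beads into one pool per amino acid in each cycle. It ignores ligation efficiency and bead loss, and `split_and_mix` is a hypothetical name rather than part of the published protocol.

```python
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 building blocks per site


def split_and_mix(n_beads, n_sites, building_blocks, rng):
    """Idealized split-and-mix: one building block is added per cycle."""
    beads = [""] * n_beads
    for _ in range(n_sites):
        rng.shuffle(beads)  # mix: pool all beads and randomize their order
        # split: deal beads evenly into one pool per building block, then
        # extend every bead in a pool with that pool's building block
        beads = [
            seq + building_blocks[i % len(building_blocks)]
            for i, seq in enumerate(beads)
        ]
    return beads


print(len(AMINO_ACIDS) ** 4)  # theoretical diversity of a 4-site library

beads = split_and_mix(100_000, 4, AMINO_ACIDS, random.Random(0))
site_counts = Counter(seq[0] for seq in beads)
print({aa: n / len(beads) for aa, n in site_counts.items()})
print(len(set(beads)))  # distinct variants among the simulated beads
```

Because each bead passes through exactly one pool per cycle, every simulated bead carries a single sequence, which mirrors the monoclonality described in the abstract.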
Affiliation(s)
- Laurens Lindenburg
  - Department of Biochemistry, University of Cambridge, 80 Tennis Court Rd, Cambridge CB2 1GA, UK
- Tuomas Huovinen
  - Department of Biochemistry, University of Cambridge, 80 Tennis Court Rd, Cambridge CB2 1GA, UK
- Kayleigh van de Wiel
  - Department of Biochemistry, University of Cambridge, 80 Tennis Court Rd, Cambridge CB2 1GA, UK
- Michael Herger
  - Department of Biochemistry, University of Cambridge, 80 Tennis Court Rd, Cambridge CB2 1GA, UK
  - AstraZeneca Medimmune Cambridge, Antibody Discovery and Protein Engineering, Cambridge, UK
- Michael R Snaith
  - AstraZeneca Medimmune Cambridge, Antibody Discovery and Protein Engineering, Cambridge, UK
- Florian Hollfelder
  - Department of Biochemistry, University of Cambridge, 80 Tennis Court Rd, Cambridge CB2 1GA, UK
3
Currin A, Swainston N, Day PJ, Kell DB. Synthetic biology for the directed evolution of protein biocatalysts: navigating sequence space intelligently. Chem Soc Rev 2015; 44:1172-239. [PMID: 25503938 PMCID: PMC4349129 DOI: 10.1039/c4cs00351a] [Received: 10/20/2014]
Abstract
The amino acid sequence of a protein affects both its structure and its function. Thus, the ability to modify the sequence, and hence the structure and activity, of individual proteins in a systematic way, opens up many opportunities, both scientifically and (as we focus on here) for exploitation in biocatalysis. Modern methods of synthetic biology, whereby increasingly large sequences of DNA can be synthesised de novo, allow an unprecedented ability to engineer proteins with novel functions. However, the number of possible proteins is far too large to test individually, so we need means for navigating the 'search space' of possible protein sequences efficiently and reliably in order to find desirable activities and other properties. Enzymologists distinguish binding (Kd) and catalytic (kcat) steps. In a similar way, judicious strategies have blended design (for binding, specificity and active site modelling) with the more empirical methods of classical directed evolution (DE) for improving kcat (where natural evolution rarely seeks the highest values), especially with regard to residues distant from the active site and where the functional linkages underpinning enzyme dynamics are both unknown and hard to predict. Epistasis (where the 'best' amino acid at one site depends on that or those at others) is a notable feature of directed evolution. The aim of this review is to highlight some of the approaches that are being developed to allow us to use directed evolution to improve enzyme properties, often dramatically. We note that directed evolution differs in a number of ways from natural evolution, including in particular the available mechanisms and the likely selection pressures. Thus, we stress the opportunities afforded by techniques that enable one to map sequence to (structure and) activity in silico, as an effective means of modelling and exploring protein landscapes. 
Because known landscapes may be assessed and reasoned about as a whole, simultaneously, this offers opportunities for protein improvement not readily available to natural evolution on rapid timescales. Intelligent landscape navigation, informed by sequence-activity relationships and coupled to the emerging methods of synthetic biology, offers scope for the development of novel biocatalysts that are both highly active and robust.
Affiliation(s)
- Andrew Currin
  - Manchester Institute of Biotechnology, The University of Manchester, 131, Princess St, Manchester M1 7DN, UK; http://dbkgroup.org/; @dbkell; Tel: +44 (0)161 306 4492
  - School of Chemistry, The University of Manchester, Manchester M13 9PL, UK
  - Centre for Synthetic Biology of Fine and Speciality Chemicals (SYNBIOCHEM), The University of Manchester, 131, Princess St, Manchester M1 7DN, UK
- Neil Swainston
  - Manchester Institute of Biotechnology, The University of Manchester, 131, Princess St, Manchester M1 7DN, UK; http://dbkgroup.org/; @dbkell; Tel: +44 (0)161 306 4492
  - Centre for Synthetic Biology of Fine and Speciality Chemicals (SYNBIOCHEM), The University of Manchester, 131, Princess St, Manchester M1 7DN, UK
  - School of Computer Science, The University of Manchester, Manchester M13 9PL, UK
- Philip J. Day
  - Manchester Institute of Biotechnology, The University of Manchester, 131, Princess St, Manchester M1 7DN, UK; http://dbkgroup.org/; @dbkell; Tel: +44 (0)161 306 4492
  - Centre for Synthetic Biology of Fine and Speciality Chemicals (SYNBIOCHEM), The University of Manchester, 131, Princess St, Manchester M1 7DN, UK
  - Faculty of Medical and Human Sciences, The University of Manchester, Manchester M13 9PT, UK
- Douglas B. Kell
  - Manchester Institute of Biotechnology, The University of Manchester, 131, Princess St, Manchester M1 7DN, UK; http://dbkgroup.org/; @dbkell; Tel: +44 (0)161 306 4492
  - School of Chemistry, The University of Manchester, Manchester M13 9PL, UK
  - Centre for Synthetic Biology of Fine and Speciality Chemicals (SYNBIOCHEM), The University of Manchester, 131, Princess St, Manchester M1 7DN, UK
4
Chen JR, Deng ZN, Chen YB, Hu BW, Lü JJ, Long YL, Xiong XY. Construction of tandem repeats of DNA fragments by a polymerase chain reaction-based method. DNA Cell Biol 2011; 31:600-6. [PMID: 22176214 DOI: 10.1089/dna.2011.1379]
Abstract
We describe a new application of megaprimer polymerase chain reaction (PCR) for constructing a tandemly repeated DNA sequence using the drought responsive element (DRE) from Arabidopsis thaliana as an example. The key feature of the procedure was the use of PCR primers with partial complementarity but differing melting temperatures (T(m)). The reverse primer had a higher T(m), a 3' end complementary to the DRE sequence and a 5' region complementary to the forward primer. The initial cycles of the PCR were conducted at a lower primer annealing temperature to generate products that served as megaprimers in the later cycles, which were conducted at a higher temperature to prevent annealing of the forward primer. The region of overlap between the megaprimers was extended to generate products with a variable copy number (one to four copies) of tandem DRE sequence repeats (71 bp). The PCR product with four tandem repeats (4× DRE) was used as a template to generate tandem repeats with higher copy numbers (more than four) and was shown to bind DRE-binding protein in a yeast one-hybrid assay using promoterless reporter genes (HIS and lacZ). This PCR protocol has numerous applications for generating DNA fragments of repeated sequences.
Affiliation(s)
- Ji-Ren Chen
  - College of Horticulture and Gardening, Hunan Agricultural University, Changsha, People's Republic of China
5
When second best is good enough: another probabilistic look at saturation mutagenesis. Appl Environ Microbiol 2011; 78:258-62. [PMID: 22038607 DOI: 10.1128/aem.06265-11]
Abstract
We developed new criteria for determining the library size in a saturation mutagenesis experiment. When the number of all possible distinct variants is large, any of the top-performing variants (e.g., any of the top three) is likely to meet the design requirements, so the probability that the library contains at least one of them is a sensible criterion for determining the library size. By using a criterion of this type, one may significantly reduce the library size and thus save costs and labor while minimally compromising the quality of the best variant discovered. We present the probabilistic tools underlying these criteria and use them to compare the efficiencies of four randomization schemes: NNN, which uses all 64 codons; NNB, which uses 48 codons; NNK, which uses 32 codons; and MAX, which assigns equal probabilities to each of the 20 amino acids. MAX was found to be the most efficient randomization scheme and NNN the least efficient. TopLib, a computer program for carrying out the related calculations, is available through a user-friendly Web server.
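The "at least one of the top variants" criterion can be sketched with a simple calculation. The code below is not TopLib and does not reproduce the paper's probabilistic tools. It makes the pessimistic assumption that each top variant is encoded by exactly one codon combination, which is exact for MAX and a worst case for NNN, NNB, and NNK; the function name `library_size` and the default 95% confidence are hypothetical choices.

```python
import math

# Codons per randomized site for each scheme, as listed in the abstract.
CODONS_PER_SITE = {"NNN": 64, "NNB": 48, "NNK": 32, "MAX": 20}


def library_size(n_sites, codons_per_site, top_k=1, confidence=0.95):
    """Smallest library with P(at least one of the top_k variants) >= confidence.

    Assumption: each top variant is encoded by exactly one codon combination,
    so a single random clone is a top variant with probability top_k / space.
    """
    space = codons_per_site**n_sites
    p_hit = top_k / space
    if p_hit >= 1:
        return 1
    # Solve 1 - (1 - p_hit)**n >= confidence for n.
    return math.ceil(math.log(1 - confidence) / math.log1p(-p_hit))


for scheme, codons in CODONS_PER_SITE.items():
    best = library_size(3, codons, top_k=1)
    top3 = library_size(3, codons, top_k=3)
    print(f"{scheme}: best variant {best}, any of top three {top3}")
```

Under this simplification, accepting any of the top three variants cuts the required library size roughly threefold, and the required size shrinks as the scheme uses fewer codons per site.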
6
Abstract
A dual-peak LPFG (long-period fibre grating), inscribed in an optical fibre, has been employed to sense DNA hybridization in real time, over a 1 h period. One strand of the DNA was immobilized on the fibre, while the other was free in solution. After hybridization, the fibre was stripped and hybridization was detected again, demonstrating that the device is reusable. Neither strand of DNA was fluorescently or otherwise labelled. The present paper will provide an overview of our early-stage experimental data and methodology, examine the potential of fibre gratings for use as biosensors to monitor both nucleic acid and other biomolecular interactions and then give a summary of the theory and fabrication of fibre gratings from a biological standpoint. Finally, the potential for improving signal strength and possible future directions of fibre grating biosensors will be addressed.
7
Generation and functional analysis of zinc finger nucleases. Methods in Molecular Biology (Clifton, N.J.) 2008; 434:277-90. [PMID: 18470651 DOI: 10.1007/978-1-60327-248-3_17]
Abstract
The recent development of artificial endonucleases with tailored specificities has opened the door for a wide range of new applications, including the correction of mutated genes directly in the chromosome. This kind of gene therapy is based on homologous recombination, which can be stimulated by the creation of a targeted DNA double-strand break (DSB) near the site of the desired recombination event. Artificial nucleases containing zinc finger DNA-binding domains have provided important proofs of concept, showing that inserting a DSB in the target locus leads to gene correction frequencies of 1-18% in human cells. In this paper, we describe how zinc finger nucleases are assembled by polymerase chain reaction (PCR) and present two methods to assess these custom nucleases quickly in vitro and in a cell-based recombination assay.
8
Papworth M, Kolasinska P, Minczuk M. Designer zinc-finger proteins and their applications. Gene 2006; 366:27-38. [PMID: 16298089 DOI: 10.1016/j.gene.2005.09.011] [Received: 09/05/2005] [Accepted: 09/18/2005]
Abstract
The Cys(2)His(2) zinc finger is one of the most common DNA-binding motifs in Eukaryota. The simple mode of DNA recognition by the Cys(2)His(2) zinc finger domain provides an ideal scaffold for designing proteins with novel sequence specificities. The ability of these domains to bind specifically to virtually any DNA sequence, combined with the potential to fuse them with effector domains, has led to the engineering of chimeric DNA-modifying enzymes and transcription factors. This in turn has opened the possibility of using engineered zinc finger-based factors as novel human therapeutics. One such synthetic factor, a designer zinc finger transcription activator of the vascular endothelial growth factor A gene, has recently entered clinical trials to evaluate its ability to stimulate the growth of blood vessels in the treatment of peripheral arterial obstructive disease. This review concentrates on the evolution of natural Cys(2)His(2) zinc fingers and the fundamental steps in the design of engineered zinc finger proteins. The applications of engineered zinc finger proteins are discussed in the context of the mechanisms mediating their effect on the targeted DNA. Furthermore, the regulation of the expression of zinc finger proteins and their targeting to various cellular compartments and to chromatin and non-chromatin target templates are described. Possible future applications of designer zinc finger proteins are also discussed.
Affiliation(s)
- Monika Papworth
  - MRC Laboratory of Molecular Biology, Hills Road, CB2 2QH, UK
9
Patrick WM, Firth AE. Strategies and computational tools for improving randomized protein libraries. Biomol Eng 2005; 22:105-12. [PMID: 16095966 DOI: 10.1016/j.bioeng.2005.06.001] [Received: 05/02/2005] [Revised: 06/20/2005] [Accepted: 06/21/2005]
Abstract
In the last decade, directed evolution has become a routine approach for engineering proteins with novel or altered properties. Concurrently, a trend away from purely 'blind' randomization strategies and towards more 'semi-rational' approaches has also become apparent. In this review, we discuss ways in which structural information and predictive computational tools are playing an increasingly important role in guiding the design of randomized libraries: web servers such as ConSurf-HSSP and SCHEMA allow the prediction of sites to target for producing functional variants, while algorithms such as GLUE, PEDEL and DRIVeR are useful for estimating library completeness and diversity. In addition, we review recent methodological developments that facilitate the construction of unbiased libraries, which are inherently more diverse than biased libraries and therefore more likely to yield improved variants.
Affiliation(s)
- Wayne M Patrick
  - Center for Fundamental and Applied Molecular Evolution, Emory University, 1510 Clifton Road, Atlanta GA 30322, USA
10
Abstract
We have investigated the statistics associated with constructing and sampling large protein-encoding libraries. Using fairly simple statistics we have written algorithms for estimating the diversity in libraries generated by the most commonly used protocols, including error-prone PCR, DNA shuffling, StEP PCR, oligonucleotide-directed randomization, MAX randomization, synthetic shuffling, DHR, ADO and SISDC. Availability: Web interface and C++ source code available at http://guinevere.otago.ac.nz/stats.html. Supplementary information: Complete mathematical notes, model assumptions and justification, users' guide and worked examples at above website.
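As a rough illustration of the kind of statistic involved, the sketch below uses a standard Poisson approximation for a library of N clones drawn uniformly from V equally likely variants, where the expected number of distinct variants is V(1 - e^(-N/V)). This is an assumption made for illustration and not a reproduction of the authors' algorithms or C++ code; the function names are hypothetical.

```python
import math


def expected_distinct(n_variants, library_size):
    """Expected number of distinct variants among library_size uniform draws."""
    # V * (1 - exp(-N / V)), written with expm1 for numerical accuracy.
    return n_variants * -math.expm1(-library_size / n_variants)


def size_for_completeness(n_variants, fraction):
    """Library size at which the expected completeness reaches `fraction`."""
    # Invert V * (1 - exp(-N / V)) = fraction * V for N.
    return math.ceil(-n_variants * math.log1p(-fraction))


V = 8000  # e.g. three fully randomized sites at the amino acid level
for N in (V, 3 * V, 5 * V):
    print(N, round(expected_distinct(V, N) / V, 4))
print(size_for_completeness(V, 0.95))
```

A library as large as the variant space is expected to contain only about 63% of the variants, which is why oversampling by several fold is needed for near-complete coverage under this model.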
Affiliation(s)
- Andrew E Firth
  - Department of Biochemistry, University of Otago, PO Box 56, Dunedin, New Zealand