1
Shimko TC, Fordyce PM, Orenstein Y. DeCoDe: degenerate codon design for complete protein-coding DNA libraries. Bioinformatics 2020; 36:3357-3364. [PMID: 32176271 DOI: 10.1093/bioinformatics/btaa162] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 08/13/2019] [Revised: 02/13/2020] [Accepted: 03/13/2020] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION High-throughput protein screening is a critical technique for dissecting and designing protein function. Libraries for these assays can be created through a number of means, including targeted or random mutagenesis of a template protein sequence or direct DNA synthesis. However, mutagenic library construction methods often yield vastly more nonfunctional than functional variants and, despite advances in large-scale DNA synthesis, individual synthesis of each desired DNA template is often prohibitively expensive. Consequently, many protein-screening libraries rely on the use of degenerate codons (DCs), mixtures of DNA bases incorporated at specific positions during DNA synthesis, to generate highly diverse protein-variant pools from only a few low-cost synthesis reactions. However, selecting DCs for sets of sequences that covary at multiple positions dramatically increases the difficulty of designing a DC library and leads to the creation of many undesired variants that can quickly outstrip screening capacity. RESULTS We introduce a novel algorithm for total DC library optimization, degenerate codon design (DeCoDe), based on integer linear programming. DeCoDe significantly outperforms state-of-the-art DC optimization algorithms and scales well to more than a hundred proteins sharing complex patterns of covariation (e.g. the lab-derived avGFP lineage). Moreover, DeCoDe is, to our knowledge, the first DC design algorithm with the capability to encode mixed-length protein libraries. We anticipate DeCoDe to be broadly useful for a variety of library generation problems, ranging from protein engineering attempts that leverage mutual information to the reconstruction of ancestral protein states. AVAILABILITY AND IMPLEMENTATION github.com/OrensteinLab/DeCoDe. CONTACT yaronore@bgu.ac.il. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
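To make the notion of a degenerate codon concrete, the following minimal Python sketch expands a DC written in IUPAC ambiguity codes into its concrete codons and the amino acids they encode. It is not part of DeCoDe; the function names `expand` and `encoded_amino_acids` are hypothetical and chosen only for illustration. The IUPAC codes and the standard genetic code are the only facts assumed.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code, with bases ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def expand(dc):
    """Return every concrete codon encoded by a degenerate codon such as 'NNK'."""
    return ["".join(p) for p in product(*(IUPAC[b] for b in dc.upper()))]


def encoded_amino_acids(dc):
    """Return the set of amino acids ('*' = stop) encoded by a degenerate codon."""
    return {CODON_TABLE[codon] for codon in expand(dc)}


if __name__ == "__main__":
    for dc in ("NNK", "RST"):
        print(dc, len(expand(dc)), sorted(encoded_amino_acids(dc)))
```

A single DC such as NNK covers all 20 amino acids with 32 codons, which illustrates why DCs are cheap to synthesize but can also introduce variants nobody asked for.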
Affiliation(s)
- Polly M Fordyce
- Department of Genetics
- Department of Bioengineering
- Stanford ChEM-H, Stanford University, Stanford, CA 94305, USA
- Chan Zuckerberg Biohub, San Francisco, CA 94158, USA
- Yaron Orenstein
- School of Electrical and Computer Engineering, Ben-Gurion University of the Negev, Beer-Sheva 8410501, Israel
2
Suchsland R, Appel B, Müller S. Preparation of trinucleotide phosphoramidites as synthons for the synthesis of gene libraries. Beilstein J Org Chem 2018. [PMID: 29520304 PMCID: PMC5827815 DOI: 10.3762/bjoc.14.28] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 01/01/2023] Open
Abstract
The preparation of protein libraries is a key issue in protein engineering and biotechnology. Such libraries can be prepared by a variety of methods, starting from the respective gene library. The challenge in gene library preparation is to achieve controlled total or partial randomization at any predefined number and position of codons of a given gene, in order to obtain a library with a maximum number of potentially successful candidates. This purpose is best achieved by the usage of trinucleotide synthons for codon-based gene synthesis. We here review the strategies for the preparation of fully protected trinucleotides, emphasizing more recent developments for their synthesis on solid phase and on soluble polymers, and their use as synthons in standard DNA synthesis.
Affiliation(s)
- Ruth Suchsland
- Institut für Biochemie, Ernst-Moritz-Arndt-Universität Greifswald, Felix-Hausdorff-Str. 4, D-17489 Greifswald, Germany
- Bettina Appel
- Institut für Biochemie, Ernst-Moritz-Arndt-Universität Greifswald, Felix-Hausdorff-Str. 4, D-17489 Greifswald, Germany
- Sabine Müller
- Institut für Biochemie, Ernst-Moritz-Arndt-Universität Greifswald, Felix-Hausdorff-Str. 4, D-17489 Greifswald, Germany
3
Jacobs TM, Yumerefendi H, Kuhlman B, Leaver-Fay A. SwiftLib: rapid degenerate-codon-library optimization through dynamic programming. Nucleic Acids Res 2014; 43:e34. [PMID: 25539925 PMCID: PMC4357694 DOI: 10.1093/nar/gku1323] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.3] [Indexed: 11/13/2022] Open
Abstract
Degenerate codon (DC) libraries efficiently address the experimental library-size limitations of directed evolution by focusing diversity toward the positions and toward the amino acids (AAs) that are most likely to generate hits; however, manually constructing DC libraries is challenging, error prone and time consuming. This paper provides a dynamic programming solution to the task of finding the best DCs while keeping the size of the library beneath some given limit, improving on the existing integer-linear programming formulation. It then extends the algorithm to consider multiple DCs at each position, a heretofore unsolved problem, while adhering to a constraint on the number of primers needed to synthesize the library. In the two library-design problems examined here, the use of multiple DCs produces libraries that very nearly cover the set of desired AAs while still staying within the experimental size limits. Surprisingly, the algorithm is able to find near-perfect libraries where the ratio of amino-acid sequences to nucleic-acid sequences approaches 1; it effectively side-steps the degeneracy of the genetic code. Our algorithm is freely available through our web server and solves most design problems in about a second.
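The size-constrained selection problem described in this abstract can be sketched as a small dynamic program. The code below is an illustrative simplification, not SwiftLib's actual algorithm: it assumes one DC per position, and the function name `best_library`, the candidate lists, and the scores (standing in for the number of desired amino acids covered) are all hypothetical.

```python
def best_library(options, size_limit):
    """Pick one choice per position to maximize total score.

    options[i] is a list of (name, size, score) tuples for position i, where
    size is the number of codons the choice expands to. The product of the
    chosen sizes must not exceed size_limit. Returns (score, names) or None
    if no combination fits.
    """
    # table maps a reachable cumulative library size to the best
    # (score, chosen names) achieving exactly that size so far.
    table = {1: (0, [])}
    for choices in options:
        next_table = {}
        for size_so_far, (score_so_far, picked) in table.items():
            for name, size, score in choices:
                new_size = size_so_far * size
                if new_size > size_limit:
                    continue  # this choice would exceed the library limit
                candidate = (score_so_far + score, picked + [name])
                best = next_table.get(new_size)
                if best is None or candidate[0] > best[0]:
                    next_table[new_size] = candidate
        table = next_table
    if not table:
        return None
    return max(table.values(), key=lambda entry: entry[0])


if __name__ == "__main__":
    # Hypothetical candidates: (degenerate codon, codon count, desired AAs covered).
    options = [
        [("GCT", 1, 1), ("RST", 4, 4)],
        [("AAA", 1, 1), ("ARA", 2, 2), ("NNK", 32, 5)],
    ]
    print(best_library(options, size_limit=10))
    print(best_library(options, size_limit=200))
```

Keying the table on the cumulative size keeps only one best partial solution per reachable size, which is what makes the search tractable compared with enumerating every combination of DCs.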
Affiliation(s)
- Timothy M Jacobs
- Department of Biochemistry, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Hayretin Yumerefendi
- Department of Biochemistry, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Brian Kuhlman
- Department of Biochemistry, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Andrew Leaver-Fay
- Department of Biochemistry, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
4
Optimal codon randomization via mathematical programming. J Theor Biol 2013; 335:147-52. [PMID: 23792109 DOI: 10.1016/j.jtbi.2013.05.034] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 02/12/2013] [Accepted: 05/28/2013] [Indexed: 01/21/2023]
Abstract
Codon randomization via degenerate oligonucleotides is a widely used approach for generating protein libraries. We use integer programming methodology to model and solve the problem of computing the minimal mixture of oligonucleotides required to induce an arbitrary target probability over the 20 standard amino acids. We consider both randomization via conventional degenerate oligonucleotides, which incorporate at each position of the randomized codon certain nucleotides in equal probabilities, and randomization via spiked oligonucleotides, which admit arbitrary nucleotide distribution at each of the codon's positions. Existing methods for computing such mixtures rely on various heuristics.
5
Arunachalam TS, Wichert C, Appel B, Müller S. Mixed oligonucleotides for random mutagenesis: best way of making them. Org Biomol Chem 2012; 10:4641-50. [PMID: 22552713 DOI: 10.1039/c2ob25328c] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Indexed: 11/21/2022]
Abstract
The generation of proteins, especially enzymes, with predetermined, novel properties is a major challenge in the field of protein engineering. Over the years, this aim has been critically facilitated by newly emerging combinatorial and evolutionary techniques, such as combinatorial gene synthesis followed by functional screening of many structural variants generated in parallel (a library). Libraries can be generated by a large number of available methods. Among these, the use of mixtures of pre-formed trinucleotide blocks representing codons for the 20 canonical amino acids in oligonucleotide synthesis stands out, as it allows fully controlled partial (or total) randomization individually at any number of arbitrarily chosen codon positions of a given gene. This has created substantial demand for fully protected trinucleotide synthons with good reactivity in standard oligonucleotide synthesis. We here review methods for the preparation of oligonucleotide mixtures with a strong focus on codon-specific trinucleotide blocks.
Affiliation(s)
- Tamil Selvi Arunachalam
- Institut für Biochemie, Ernst Moritz Arndt Universität, Felix Hausdorff Strasse 4, Greifswald, D-17487, Germany
6
Hidalgo A, Schliessmann A, Molina R, Hermoso J, Bornscheuer UT. A one-pot, simple methodology for cassette randomisation and recombination for focused directed evolution. Protein Eng Des Sel 2008; 21:567-76. [PMID: 18559369 DOI: 10.1093/protein/gzn034] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.9] [Indexed: 11/14/2022] Open
Abstract
Protein engineering is currently performed either by rational design, focusing in most cases on only a few positions modified by site-directed mutagenesis, or by directed molecular evolution, in which the entire protein-encoding gene is subjected to random mutagenesis followed by screening or selection of desired phenotypes. A novel alternative is focused directed evolution, in which only fragments of a protein are randomised while the overall scaffold of a protein remains unchanged. For this purpose, we developed a PCR technique using long, spiked oligonucleotides, which allow randomising of one or several cassettes in any given position of a gene. This method allows over 95% incorporation of mutations independently of their position within the gene, yielding sufficient product to generate large libraries, and the possibility of simultaneously randomising more than one locus at a time, thus originating recombination. The high efficiency of this method was verified by creating focused mutant libraries of Pseudomonas fluorescens esterase I (PFEI), screening for altered substrate selectivity and validating against libraries created by error-prone PCR. This led to the identification of two mutants within the OSCARR library with a 10-fold higher catalytic efficiency towards p-nitrophenyl dodecanoate. These PFEI variants were also modelled in order to explain the observed effects.
Affiliation(s)
- Aurelio Hidalgo
- Department of Biotechnology and Enzyme Catalysis, Institute of Biochemistry, Ernst-Moritz-Arndt University Greifswald, Felix-Hausdorff-Str. 4, D-17487 Greifswald, Germany
7
Volles MJ, Lansbury PT. A computer program for the estimation of protein and nucleic acid sequence diversity in random point mutagenesis libraries. Nucleic Acids Res 2005; 33:3667-77. [PMID: 15990391 PMCID: PMC1166583 DOI: 10.1093/nar/gki669] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.2] [Indexed: 11/23/2022] Open
Abstract
A computer program for the generation and analysis of in silico random point mutagenesis libraries is described. The program operates by mutagenizing an input nucleic acid sequence according to mutation parameters specified by the user for each sequence position and type of point mutation. The program can mimic almost any type of random mutagenesis library, including those produced via error-prone PCR (ep-PCR), mutator Escherichia coli strains, chemical mutagenesis, and doped or random oligonucleotide synthesis. The program analyzes the generated nucleic acid sequences and/or the associated protein library to produce several estimates of library diversity (number of unique sequences, point mutations, and single point mutants) and the rate of saturation of these diversities during experimental screening or selection of clones. This information allows one to select the optimal screen size for a given mutagenesis library, necessary to efficiently obtain a certain coverage of the sequence-space. The program also reports the abundance of each specific protein mutation at each sequence position, which is useful as a measure of the level and type of mutation bias in the library. Alternatively, one can use the program to evaluate the relative merits of preexisting libraries, or to examine various hypothetical mutation schemes to determine the optimal method for creating a library that serves the screen/selection of interest. Simulated libraries of at least 10^9 sequences are accessible by the numerical algorithm with currently available personal computers; an analytical algorithm is also available which can rapidly calculate a subset of the numerical statistics in libraries of arbitrarily large size. A multi-type double-strand stochastic model of ep-PCR is developed in an appendix to demonstrate the applicability of the algorithm to amplifying mutagenesis procedures. Estimators of DNA polymerase mutation-type-specific error rates are derived using the model. Analyses of an alpha-synuclein ep-PCR library and NNS synthetic oligonucleotide libraries are given as examples.
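The relationship between screen size and coverage mentioned in this abstract can be illustrated with a standard sampling argument. The sketch below is not the authors' program: it assumes each screened clone is an independent draw from fixed variant probabilities, and the function names are hypothetical.

```python
import math


def expected_unique(probs, n_clones):
    """Expected number of distinct variants seen after screening n_clones.

    probs holds the probability of each variant in the library. A variant
    with probability p is missed by all n draws with probability (1 - p)**n.
    """
    return sum(1.0 - (1.0 - p) ** n_clones for p in probs)


def clones_for_coverage(n_variants, fraction):
    """Clones needed so the expected coverage of an equiprobable library of
    n_variants reaches the given fraction (0 < fraction < 1)."""
    if n_variants == 1:
        return 1
    return math.ceil(math.log(1.0 - fraction) / math.log(1.0 - 1.0 / n_variants))


if __name__ == "__main__":
    # A single NNS position has 32 equally likely codons at the DNA level.
    n = clones_for_coverage(32, 0.95)
    print(n, expected_unique([1 / 32] * 32, n))
```

For libraries with unequal variant frequencies, `expected_unique` still applies, but the required screen size has to be found numerically because rare variants dominate the tail.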
Affiliation(s)
- Michael J Volles
- Center for Neurologic Diseases, Brigham and Women's Hospital and Department of Neurology, Harvard Medical School, 65 Landsdowne Street, Cambridge, MA 02139, USA
8
Tabuchi I, Soramoto S, Ueno S, Husimi Y. Multi-line split DNA synthesis: a novel combinatorial method to make high quality peptide libraries. BMC Biotechnol 2004; 4:19. [PMID: 15341664 PMCID: PMC520752 DOI: 10.1186/1472-6750-4-19] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 03/31/2004] [Accepted: 09/01/2004] [Indexed: 11/30/2022] Open
Abstract
Background: We developed a method to make various high-quality random peptide libraries for evolutionary protein engineering based on combinatorial DNA synthesis. Results: A split synthesis in codon units was performed with mixtures of bases optimally designed by using a Genetic Algorithm program. It required only standard DNA synthetic reagents and standard DNA synthesizers in three lines. This multi-line split DNA synthesis (MLSDS) is simply realized by adding a mix-and-split process to the normal DNA synthesis protocol. The superiority of the MLSDS method over other methods was shown. We demonstrated the synthesis of oligonucleotide libraries with 10^16 diversity, and the construction of a library with random sequences coding 120 amino acids containing few stop codons. Conclusions: Owing to the flexibility of the MLSDS method, it will be possible to design various "rational" libraries by using bioinformatics databases.
Affiliation(s)
- Ichiro Tabuchi
- Tokyo Evolution Research Center, 1-1-45-504, Okubo, Shinjuku-ku, Tokyo 169-0072, Japan
- Department of Functional Materials Science, Saitama University, 255 Shimo-Okubo, Saitama 338-8570, Japan
- Sayaka Soramoto
- Department of Functional Materials Science, Saitama University, 255 Shimo-Okubo, Saitama 338-8570, Japan
- Shingo Ueno
- Department of Functional Materials Science, Saitama University, 255 Shimo-Okubo, Saitama 338-8570, Japan
- Yuzuru Husimi
- Department of Functional Materials Science, Saitama University, 255 Shimo-Okubo, Saitama 338-8570, Japan
9
Ness JE, Del Cardayré SB, Minshull J, Stemmer WP. Molecular breeding: the natural approach to protein design. Advances in Protein Chemistry 2001; 55:261-92. [PMID: 11050936 DOI: 10.1016/s0065-3233(01)55006-8] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.7] [Indexed: 02/18/2023]
10
Gaytán P, Yáñez J, Sánchez F, Soberón X. Orthogonal combinatorial mutagenesis: a codon-level combinatorial mutagenesis method useful for low multiplicity and amino acid-scanning protocols. Nucleic Acids Res 2001; 29:E9. [PMID: 11160911 PMCID: PMC30410 DOI: 10.1093/nar/29.3.e9] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.7] [Indexed: 11/14/2022] Open
Abstract
We describe here a method to generate combinatorial libraries of oligonucleotides mutated at the codon-level, with control of the mutagenesis rate so as to create predictable binomial distributions of mutants. The method allows enrichment of the libraries with single, double or larger multiplicity of amino acid replacements by appropriate choice of the mutagenesis rate, depending on the concentration of synthetic precursors. The method makes use of two sets of deoxynucleoside-phosphoramidites bearing orthogonal protecting groups [4,4'-dimethoxytrityl (DMT) and 9-fluorenylmethoxycarbonyl (Fmoc)] in the 5' hydroxyl. These phosphoramidites are divergently combined during automated synthesis in such a way that wild-type codons are assembled with commercial DMT-deoxynucleoside-methyl-phosphoramidites while mutant codons are assembled with Fmoc-deoxynucleoside-methyl-phosphoramidites in an NNG/C fashion in a single synthesis column. This method is easily automated and suitable for low mutagenesis rates and large windows, such as those required for directed evolution and alanine scanning. Through the assembly of three oligonucleotide libraries at different mutagenesis rates, followed by cloning at the polylinker region of plasmid pUC18 and sequencing of 129 clones, we concluded that the method performs essentially as intended.
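The "predictable binomial distributions of mutants" mentioned in this abstract can be written out directly. The sketch below assumes each codon in the window is replaced independently with the same probability; it ignores the fact that an NNG/C mutant codon may happen to re-encode the wild-type amino acid. The function name and the example window size are hypothetical.

```python
from math import comb


def mutant_distribution(n_codons, rate):
    """Fraction of library members carrying k = 0..n_codons mutant codons,
    assuming each codon is replaced independently with probability `rate`."""
    return [
        comb(n_codons, k) * rate ** k * (1.0 - rate) ** (n_codons - k)
        for k in range(n_codons + 1)
    ]


if __name__ == "__main__":
    # Hypothetical 10-codon window at three mutagenesis rates.
    for rate in (0.05, 0.10, 0.30):
        dist = mutant_distribution(10, rate)
        print(rate, [round(x, 3) for x in dist[:4]])
```

Under this model the fraction of single mutants, n·r·(1−r)^(n−1), peaks at a rate of r = 1/n, which is one way to see why low rates suit scanning-style libraries over large windows.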
Affiliation(s)
- P Gaytán
- Unidad de Síntesis and Departamento de Reconocimiento Molecular y Bioestructura, Instituto de Biotecnología/UNAM Ap. Postal 510-3 Cuernavaca, Morelos 62250, México
11
Daugherty PS, Olsen MJ, Iverson BL, Georgiou G. Development of an optimized expression system for the screening of antibody libraries displayed on the Escherichia coli surface. Protein Engineering 1999; 12:613-21. [PMID: 10436088 DOI: 10.1093/protein/12.7.613] [Citation(s) in RCA: 107] [Impact Index Per Article: 4.3] [Indexed: 11/14/2022]
Abstract
Polypeptide library screening technologies are critically dependent upon the characteristics of the expression system employed. A comparative analysis of the lpp-lac, tet and araBAD promoters was performed to determine the importance of tight regulation and expression level in library screening applications. The surface display of single-chain antibody (scFv) in Escherichia coli as an Lpp-OmpA' fusion was monitored using a fluorescently tagged antigen in conjunction with flow cytometry. In contrast to the lpp-lac promoter, both tet and araBAD promoters could be tightly repressed. Tight regulation was found to be essential for preventing rapid depletion of library clones expressing functional scFv and thus for maintaining the initial library diversity. Induction with subsaturating inducer concentrations yielded mixed populations of uninduced and fully induced cells for both the tet and araBAD expression systems. In contrast, homogeneous expression levels were obtained throughout the population using saturating inducer concentrations and could be adjusted by varying the induction time and plasmid copy number. Under optimal induction conditions for the araBAD system, protein expression did not compromise either cell viability or library diversity. This expression system was used to screen a library of random scFv mutants specific for digoxigenin for clones exhibiting improved hapten dissociation kinetics. Thus, an expression system has been developed which allows library diversity to be preserved and is generally applicable to the screening of E. coli surface displayed libraries.
Affiliation(s)
- P S Daugherty
- Department of Chemical Engineering, Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX 78712, USA
12
Jensen LJ, Andersen KV, Svendsen A, Kretzschmar T. Scoring functions for computational algorithms applicable to the design of spiked oligonucleotides. Nucleic Acids Res 1998; 26:697-702. [PMID: 9443959 PMCID: PMC147326 DOI: 10.1093/nar/26.3.697] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.6] [Indexed: 02/05/2023] Open
Abstract
Protein engineering by inserting stretches of random DNA sequences into target genes in combination with adequate screening or selection methods is a versatile technique to elucidate and improve protein functions. Established compounds for generating semi-random DNA sequences are spiked oligonucleotides which are synthesised by interspersing wild type (wt) nucleotides of the target sequence with certain amounts of other nucleotides. Directed spiking strategies reduce the complexity of a library to a manageable format compared with completely random libraries. Computational algorithms render feasible the calculation of appropriate nucleotide mixtures to encode specified amino acid subpopulations. The crucial element in the ranking of spiked codons generated during an iterative algorithm is the scoring function. In this report three scoring functions are analysed: the sum-of-square-differences function s, a modified cubic function c, and a scoring function m derived from maximum likelihood considerations. The impact of these scoring functions on calculated amino acid distributions is demonstrated by an example of mutagenising a domain surrounding the active site serine of subtilisin-like proteases. At default weight settings of one for each amino acid, the new scoring function m is superior to functions s and c in finding matches to a given amino acid population.
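As a rough illustration of how a spiked codon can be scored against a target amino acid population, the sketch below computes the amino acid distribution implied by three per-position nucleotide mixtures and then a weighted sum of squared differences. The abstract names the function s but does not give its formula, so the exact form used here, the default weight of one, and all function names are assumptions.

```python
from itertools import product

# Standard genetic code, with bases ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def aa_distribution(mix):
    """mix is a list of three dicts (base -> fraction), one per codon position.
    Returns a dict amino acid -> probability ('*' = stop)."""
    dist = {}
    for a, b, c in product(BASES, repeat=3):
        p = mix[0].get(a, 0.0) * mix[1].get(b, 0.0) * mix[2].get(c, 0.0)
        if p > 0.0:
            aa = CODON_TABLE[a + b + c]
            dist[aa] = dist.get(aa, 0.0) + p
    return dist


def sum_sq_diff(target, achieved, weights=None):
    """Weighted sum of squared differences (assumed form); lower is better.
    Amino acids absent from `weights` get a weight of one."""
    weights = weights or {}
    keys = set(target) | set(achieved)
    return sum(
        weights.get(k, 1.0) * (target.get(k, 0.0) - achieved.get(k, 0.0)) ** 2
        for k in keys
    )


if __name__ == "__main__":
    # Hypothetical spiked codon: G at position 1, A/C mix at position 2, T at 3.
    mix = [{"G": 1.0}, {"A": 0.5, "C": 0.5}, {"T": 1.0}]
    achieved = aa_distribution(mix)
    print(achieved, sum_sq_diff({"D": 1.0}, achieved))
```

An iterative design algorithm of the kind the abstract describes would call such a scoring function repeatedly while adjusting the nucleotide fractions; that search loop is omitted here.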
Affiliation(s)
- L J Jensen
- Department of Enzyme Design, Novo Nordisk A/S, DK-2880 Bagsvaerd, Denmark
13
Tomandl D, Schober A, Schwienhorst A. Optimizing doped libraries by using genetic algorithms. J Comput Aided Mol Des 1997; 11:29-38. [PMID: 9139109 DOI: 10.1023/a:1008071310472] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.6] [Indexed: 02/04/2023]
Abstract
The insertion of random sequences into protein-encoding genes in combination with biological selection techniques has become a valuable tool in the design of molecules that have useful and possibly novel properties. By employing highly effective screening protocols, a functional and unique structure that had not been anticipated can be distinguished among a huge collection of inactive molecules that together represent all possible amino acid combinations. This technique is severely limited by its restriction to a library of manageable size. One approach for limiting the size of a mutant library relies on 'doping schemes', where subsets of amino acids are generated that reveal only certain combinations of amino acids in a protein sequence. Three mononucleotide mixtures for each codon concerned must be designed, such that the resulting codons that are assembled during chemical gene synthesis represent the desired amino acid mixture on the level of the translated protein. In this paper we present a doping algorithm that 'reverse translates' a desired mixture of certain amino acids into three mixtures of mononucleotides. The algorithm is designed to optimally bias these mixtures towards the codons of choice. This approach combines a genetic algorithm with local optimization strategies based on the downhill simplex method. Disparate relative representations of all amino acids (and stop codons) within a target set can be generated. Optional weighting factors are employed to emphasize the frequencies of certain amino acids and their codon usage, and to compensate for reaction rates of different mononucleotide building blocks (synthons) during chemical DNA synthesis. The effect of statistical errors that accompany an experimental realization of calculated nucleotide mixtures on the generated mixtures of amino acids is simulated. These simulations show that the robustness of different optima with respect to small deviations from calculated values depends on their concomitant fitness. Furthermore, the calculations probe the fitness landscape locally and allow a preliminary assessment of its structure.
Affiliation(s)
- D Tomandl
- Department of Molecular Evolution Biology, Institute for Molecular Biotechnology, Jena, Germany
14
Collins J. Phage display. Annual Reports in Combinatorial Chemistry and Molecular Diversity 1997. [DOI: 10.1007/978-0-306-46904-6_15] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.2] [Indexed: 02/11/2023]
15
Kayushin AL, Korosteleva MD, Miroshnikov AI, Kosch W, Zubov D, Piel N. A convenient approach to the synthesis of trinucleotide phosphoramidites--synthons for the generation of oligonucleotide/peptide libraries. Nucleic Acids Res 1996; 24:3748-55. [PMID: 8871554 PMCID: PMC146157 DOI: 10.1093/nar/24.19.3748] [Citation(s) in RCA: 51] [Impact Index Per Article: 1.8] [Indexed: 02/02/2023] Open
Abstract
Trinucleotide phosphoramidites that correspond to the codons of all 20 amino acids were synthesized in high yield on a 5 g scale. Precursors of those amidites (trinucleotide phosphotriesters) have been prepared using the phosphotriester approach without protection of the 3'-hydroxyl function. The structures of trinucleotide phosphotriesters and intermediates were confirmed by ¹H- and ³¹P-NMR spectra, mass spectra and by analysis of SPDE-hydrolysates of deprotected preparations. Purity of the target products has been confirmed by test reactions. The synthons have been used for automated synthesis of oligonucleotides and corresponding libraries by a phosphite-triester approach. A 54mer, containing 12 randomized internal bases, and a 72mer with 24 internal randomized bases have been synthesized.
Affiliation(s)
- A L Kayushin
- Shemiakin and Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, Moscow, Russia
16
Loeb LA. Unnatural nucleotide sequences in biopharmaceutics. Advances in Pharmacology (San Diego, Calif.) 1996; 35:321-47. [PMID: 8920210 DOI: 10.1016/s1054-3589(08)60280-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.2] [Indexed: 02/03/2023]
Affiliation(s)
- L A Loeb
- Department of Pathology, University of Washington School of Medicine, Seattle 98195, USA
17
Youvan DC, Goldman E, Delagrave S, Yang MM. Digital imaging spectroscopy for massively parallel screening of mutants. Methods Enzymol 1995; 246:732-48. [PMID: 7752945 DOI: 10.1016/0076-6879(95)46031-4] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.4] [Indexed: 01/26/2023]
Affiliation(s)
- D C Youvan
- Palo Alto Institute of Molecular Medicine, Mountain View, California 94043, USA
18
Virnekäs B, Ge L, Plückthun A, Schneider KC, Wellnhofer G, Moroney SE. Trinucleotide phosphoramidites: ideal reagents for the synthesis of mixed oligonucleotides for random mutagenesis. Nucleic Acids Res 1994; 22:5600-7. [PMID: 7838712 PMCID: PMC310122 DOI: 10.1093/nar/22.25.5600] [Citation(s) in RCA: 173] [Impact Index Per Article: 5.8] [Indexed: 01/27/2023] Open
Abstract
Trinucleotide phosphoramidites representing codons for all 20 amino acids have been prepared and used in automated, solid-phase DNA synthesis. In contrast to an earlier report, we show that these substances can be used to introduce entire codons into oligonucleotides in excess of 98% yield, and are ideal reagents for the synthesis of mixed oligonucleotides for random mutagenesis.
19
Goodson RJ, Doyle MV, Kaufman SE, Rosenberg S. High-affinity urokinase receptor antagonists identified with bacteriophage peptide display. Proc Natl Acad Sci U S A 1994; 91:7129-33. [PMID: 8041758 PMCID: PMC44352 DOI: 10.1073/pnas.91.15.7129] [Citation(s) in RCA: 141] [Impact Index Per Article: 4.7] [Indexed: 01/28/2023] Open
Abstract
Affinity selection of a 15-mer random peptide library displayed on bacteriophage M13 has been used to identify potent ligands for the human urokinase receptor, a key mediator of tumor cell invasion. A family of receptor binding bacteriophage ligands was obtained by sequentially and alternately selecting the peptide library on COS-7 monkey kidney cells and baculovirus-infected Sf9 insect cells overexpressing the human urokinase receptor. Nineteen peptides encoded by the random DNA regions of the selected bacteriophage were synthesized and tested in a urokinase receptor binding assay, where they competed with the labeled N-terminal fragment of urokinase with IC50 values ranging from 10 nM to 10 microM. All of the isolated peptides were linear and showed two relatively short conserved subsequences: LWXXAr (Ar = Y, W, F, or H) and XFXXYLW, neither of which is found in urokinase or its receptor. Competition experiments demonstrated that the most potent peptide, clone 20, prevented binding of bacteriophage displaying the urokinase receptor binding sequence (urokinase residues 13-32). In addition, this peptide blocked other apparently unrelated receptor binding bacteriophage, suggesting overlapping receptor interaction sites for all of these sequences. These results provide a demonstration of bacteriophage display identifying peptide ligands for a receptor expressed on cells and yield leads for the development of urokinase receptor antagonists.
20
Abstract
In vitro selection from molecular libraries has rapidly come of age as a protein-engineering tool. Dramatic increases in protein affinity can be engineered using phage-display libraries, and specific antibodies can be selected directly from a single 'naïve' library of their genes. Repertoires of small molecules are a potentially valuable resource for drug discovery. Libraries of linear peptides provide ligands for proteins that recognize continuous epitopes, and low-affinity mimics of some small molecules, but generally do not contain mimics of large molecular interfaces. Switching to constrained peptide formats, and deploying more diverse, non-peptide chemical libraries, may bring greater success.
Affiliation(s)
- T Clackson
- Department of Protein Engineering, Genentech, Inc., South San Francisco, CA 94080
|
21
|
LaBean TH, Kauffman SA. Design of synthetic gene libraries encoding random sequence proteins with desired ensemble characteristics. Protein Sci 1993; 2:1249-54. [PMID: 8401210 PMCID: PMC2142438 DOI: 10.1002/pro.5560020807] [Citation(s) in RCA: 37] [Impact Index Per Article: 1.2] [Indexed: 01/30/2023]
Abstract
Libraries of random sequence polypeptides are useful as sources of unevolved proteins, novel ligands, and potential lead compounds for the development of vaccines and therapeutics. The expression of small random peptides has been achieved previously using DNA synthesized with equimolar mixtures of nucleotides. For many potential uses of random polypeptide libraries, concerns such as avoiding termination codons and matching target amino acid compositions make more complex designs necessary. In this study, three mixtures of nucleotides, corresponding to the three positions in the codon, were designed such that semirandom DNA synthesized by repeated cycles of the three mixtures created an open reading frame encoding random sequence polypeptides with desired ensemble characteristics. Two methods were used to design the nucleotide mixtures: the manual use of a spreadsheet and a refining grid search algorithm. Using design targets of less than or equal to 1% stop codons and an amino acid composition based on the average ratios observed in natural, globular proteins, the search methods yielded similar nucleotide ratios. Semirandom DNA, synthesized with a designed, three-residue repeat pattern, can encode libraries of very high diversity and represents an important tool for the construction of random polypeptide libraries.
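The quantity being optimized in the abstract above, the expected stop-codon and amino acid frequencies produced by three per-position nucleotide mixtures, can be sketched in a few lines of Python. This is only an illustration, not the authors' spreadsheet or grid search. The function name `ensemble_composition` is hypothetical, and the two example mixtures (equimolar NNN and the common NNK scheme) are assumptions chosen for illustration; the abstract does not report the designed ratios.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def ensemble_composition(mixtures):
    """Expected amino acid frequencies ('*' = stop) for three nucleotide
    mixtures, one per codon position, each a dict mapping base -> fraction."""
    freqs = {}
    for codon, aa in CODON_TABLE.items():
        p = 1.0
        for pos, base in enumerate(codon):
            p *= mixtures[pos].get(base, 0.0)  # bases absent from a mixture have probability 0
        freqs[aa] = freqs.get(aa, 0.0) + p
    return freqs


# Illustrative mixtures (assumed, not taken from the paper).
equimolar = {b: 0.25 for b in BASES}
nnn = ensemble_composition([equimolar, equimolar, equimolar])
nnk = ensemble_composition([equimolar, equimolar, {"G": 0.5, "T": 0.5}])

print(f"stop fraction, equimolar NNN: {nnn['*']:.4f}")
print(f"stop fraction, NNK:           {nnk['*']:.4f}")
```

A design search of the kind the abstract describes would vary the twelve base fractions and score each candidate with a function like this against the stop-codon and composition targets.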
Affiliation(s)
- T H LaBean
- Department of Biochemistry and Biophysics, University of Pennsylvania, Philadelphia 19104
|
22
|
Arkin AP, Youvan DC. An algorithm for protein engineering: simulations of recursive ensemble mutagenesis. Proc Natl Acad Sci U S A 1992; 89:7811-5. [PMID: 1502200 PMCID: PMC49801 DOI: 10.1073/pnas.89.16.7811] [Citation(s) in RCA: 24] [Impact Index Per Article: 0.8] [Indexed: 12/27/2022]
Abstract
An algorithm for protein engineering, termed recursive ensemble mutagenesis, has been developed to produce diverse populations of phenotypically related mutants whose members differ in amino acid sequence. This method uses a feedback mechanism to control successive rounds of combinatorial cassette mutagenesis. Starting from partially randomized "wild-type" DNA sequences, a highly parallel search of sequence space for peptides fitting an experimenter's criteria is performed. Each iteration uses information gained from the previous rounds to search the space more efficiently. Simulations of the technique indicate that, under a variety of conditions, the algorithm can rapidly produce a diverse population of proteins fitting specific criteria. In the experimental analog, genetic selection or screening applied during recursive ensemble mutagenesis should force the evolution of an ensemble of mutants to a targeted cluster of related phenotypes.
Affiliation(s)
- A P Arkin
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge 02139
|