1
Nguyen TD, Saito Y, Kameda T. CodonAdjust: a software for in silico design of a mutagenesis library with specific amino acid profiles. Protein Eng Des Sel 2020; 32:503-511. [PMID: 32705123 DOI: 10.1093/protein/gzaa013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/17/2019] [Revised: 03/27/2020] [Accepted: 06/19/2020] [Indexed: 11/12/2022]
Abstract
In protein engineering, generation of mutagenesis libraries is a key step to study the functions of mutants. To generate mutants with a desired composition of amino acids (AAs), a codon consisting of a mixture of nucleotides is widely applied. Several computational methods have been proposed to calculate a codon nucleotide composition for generating a given amino acid profile based on mathematical optimization. However, these previous methods need to manually tune weights of amino acids in objective functions, which are time-consuming and, more importantly, lack publicly available software implementations. Here, we develop CodonAdjust, a software to adjust a codon nucleotide composition for mimicking a given amino acid profile. We propose different options of CodonAdjust, which provide various customizations in practical scenarios such as setting a guaranteeing threshold for the frequencies of amino acids without any manual tasks. We demonstrate the capability of CodonAdjust in the experiments on the complementarity-determining regions (CDRs) of antibodies and T-cell receptors (TCRs) as well as millions of amino acid profiles from Pfam. These results suggest that CodonAdjust is a productive software for codon design and may accelerate library generation. CodonAdjust is freely available at https://github.com/tiffany-nguyen/CodonAdjust. Paper edited by Dr. Jeffery Saven, Board Member for PEDS.
Affiliation(s)
- Thuy Duong Nguyen
- Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan
- Yutaka Saito
- Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan; AIST-Waseda University Computational Bio Big-Data Open Innovation Laboratory (CBBD-OIL), 3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan; Graduate School of Frontier Sciences, University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8561, Japan
- Tomoshi Kameda
- Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan
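The abstract above describes mapping a codon's nucleotide mixture to the amino acid profile it produces. The forward direction of that mapping can be sketched as follows. This is a minimal illustration, not CodonAdjust's code: the function name `aa_profile`, the input layout (one base-to-fraction dict per codon position), and the assumption that the three positions are mixed independently under the standard genetic code are all choices made for this example.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (stop codons shown as "*").
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def aa_profile(mix):
    """Return {amino acid: probability} for a mixed-nucleotide codon.

    `mix` is a list of three dicts, one per codon position, mapping a base
    to its fraction at that position (hypothetical input format).
    Positions are assumed to be independent.
    """
    profile = {}
    for codon, aa in CODON_TABLE.items():
        p = 1.0
        for pos, base in enumerate(codon):
            p *= mix[pos].get(base, 0.0)
        if p > 0.0:
            profile[aa] = profile.get(aa, 0.0) + p
    return profile


# Example: the common NNK scheme (N = equal A/C/G/T, K = equal G/T).
NNK = [dict.fromkeys("ACGT", 0.25)] * 2 + [{"G": 0.5, "T": 0.5}]

if __name__ == "__main__":
    for aa, p in sorted(aa_profile(NNK).items()):
        print(aa, round(p, 5))
```

A design tool works in the opposite direction, searching for the per-position fractions whose resulting profile best matches a target. The function above only evaluates one candidate mixture.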
2
Rosenfeld L, Heyne M, Shifman JM, Papo N. Protein Engineering by Combined Computational and In Vitro Evolution Approaches. Trends Biochem Sci 2016; 41:421-433. [PMID: 27061494 DOI: 10.1016/j.tibs.2016.03.002] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.0] [Received: 01/12/2016] [Revised: 02/29/2016] [Accepted: 03/09/2016] [Indexed: 12/30/2022]
Abstract
Two alternative strategies are commonly used to study protein-protein interactions (PPIs) and to engineer protein-based inhibitors. In one approach, binders are selected experimentally from combinatorial libraries of protein mutants that are displayed on a cell surface. In the other approach, computational modeling is used to explore an astronomically large number of protein sequences to select a small number of sequences for experimental testing. While both approaches have some limitations, their combination produces superior results in various protein engineering applications. Such applications include the design of novel binders and inhibitors, the enhancement of affinity and specificity, and the mapping of binding epitopes. The combination of these approaches also aids in the understanding of the specificity profiles of various PPIs.
Affiliation(s)
- Lior Rosenfeld
- Department of Biotechnology Engineering and the National Institute of Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer-Sheva, Israel
- Michael Heyne
- Department of Biotechnology Engineering and the National Institute of Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer-Sheva, Israel; Department of Biological Chemistry, The Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
- Julia M Shifman
- Department of Biological Chemistry, The Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel.
- Niv Papo
- Department of Biotechnology Engineering and the National Institute of Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer-Sheva, Israel.
3
Jacobs TM, Yumerefendi H, Kuhlman B, Leaver-Fay A. SwiftLib: rapid degenerate-codon-library optimization through dynamic programming. Nucleic Acids Res 2014; 43:e34. [PMID: 25539925 PMCID: PMC4357694 DOI: 10.1093/nar/gku1323] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.3] [Indexed: 11/13/2022]
Abstract
Degenerate codon (DC) libraries efficiently address the experimental library-size limitations of directed evolution by focusing diversity toward the positions and toward the amino acids (AAs) that are most likely to generate hits; however, manually constructing DC libraries is challenging, error prone and time consuming. This paper provides a dynamic programming solution to the task of finding the best DCs while keeping the size of the library beneath some given limit, improving on the existing integer-linear programming formulation. It then extends the algorithm to consider multiple DCs at each position, a heretofore unsolved problem, while adhering to a constraint on the number of primers needed to synthesize the library. In the two library-design problems examined here, the use of multiple DCs produces libraries that very nearly cover the set of desired AAs while still staying within the experimental size limits. Surprisingly, the algorithm is able to find near-perfect libraries where the ratio of amino-acid sequences to nucleic-acid sequences approaches 1; it effectively side-steps the degeneracy of the genetic code. Our algorithm is freely available through our web server and solves most design problems in about a second.
Affiliation(s)
- Timothy M Jacobs
- Department of Biochemistry, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Hayretin Yumerefendi
- Department of Biochemistry, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Brian Kuhlman
- Department of Biochemistry, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA; Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Andrew Leaver-Fay
- Department of Biochemistry, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
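The two quantities the abstract above trades off, the amino acids a degenerate codon covers and the library size it implies, can be computed with a short sketch. This is not the SwiftLib dynamic program itself, only the bookkeeping such an optimizer would evaluate. The helper names `expand`, `amino_acids` and `library_sizes` are hypothetical. The IUPAC ambiguity codes and the standard genetic code are the only domain facts used.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (stop codons shown as "*").
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand(dc):
    """List every concrete codon encoded by a 3-letter degenerate codon."""
    return ["".join(c) for c in product(*(IUPAC[s] for s in dc))]


def amino_acids(dc):
    """Set of amino acids (and "*" for stop) reachable from a degenerate codon."""
    return {CODON_TABLE[c] for c in expand(dc)}


def library_sizes(dcs):
    """Return (DNA-level size, protein-level size) for one degenerate codon per position.

    The protein-level count treats a stop codon as one more symbol.
    """
    dna = protein = 1
    for dc in dcs:
        dna *= len(expand(dc))
        protein *= len(amino_acids(dc))
    return dna, protein


if __name__ == "__main__":
    print(sorted(amino_acids("NDT")))     # 12 distinct amino acids, no stop
    print(library_sizes(["NDT", "RST"]))  # equal counts: no redundancy at either position
```

When the two numbers returned by `library_sizes` are equal, each protein sequence is encoded by exactly one DNA sequence, which is the ratio-approaching-1 situation the abstract mentions.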
4
Chen TS, Palacios H, Keating AE. Structure-based redesign of the binding specificity of anti-apoptotic Bcl-x(L). J Mol Biol 2012; 425:171-85. [PMID: 23154169 DOI: 10.1016/j.jmb.2012.11.009] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Received: 08/28/2012] [Revised: 11/05/2012] [Accepted: 11/06/2012] [Indexed: 12/29/2022]
Abstract
Many native proteins are multi-specific and interact with numerous partners, which can confound analysis of their functions. Protein design provides a potential route to generating synthetic variants of native proteins with more selective binding profiles. Redesigned proteins could be used as research tools, diagnostics or therapeutics. In this work, we used a library screening approach to reengineer the multi-specific anti-apoptotic protein Bcl-x(L) to remove its interactions with many of its binding partners, making it a high-affinity and selective binder of the BH3 region of pro-apoptotic protein Bad. To overcome the enormity of the potential Bcl-x(L) sequence space, we developed and applied a computational/experimental framework that used protein structure information to generate focused combinatorial libraries. Sequence features were identified using structure-based modeling, and an optimization algorithm based on integer programming was used to select degenerate codons that maximally covered these features. A constraint on library size was used to ensure thorough sampling. Using yeast surface display to screen a designed library of Bcl-x(L) variants, we successfully identified a protein with ~1000-fold improvement in binding specificity for the BH3 region of Bad over the BH3 region of Bim. Although negative design was targeted only against the BH3 region of Bim, the best redesigned protein was globally specific against binding to 10 other peptides corresponding to native BH3 motifs. Our design framework demonstrates an efficient route to highly specific protein binders and may readily be adapted for application to other design problems.
Affiliation(s)
- T Scott Chen
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
5
Chen TS, Keating AE. Designing specific protein-protein interactions using computation, experimental library screening, or integrated methods. Protein Sci 2012; 21:949-63. [PMID: 22593041 DOI: 10.1002/pro.2096] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.3] [Received: 05/01/2012] [Accepted: 05/11/2012] [Indexed: 11/11/2022]
Abstract
Given the importance of protein-protein interactions for nearly all biological processes, the design of protein affinity reagents for use in research, diagnosis or therapy is an important endeavor. Engineered proteins would ideally have high specificities for their intended targets, but achieving interaction specificity by design can be challenging. There are two major approaches to protein design or redesign. Most commonly, proteins and peptides are engineered using experimental library screening and/or in vitro evolution. An alternative approach involves using protein structure and computational modeling to rationally choose sequences predicted to have desirable properties. Computational design has successfully produced novel proteins with enhanced stability, desired interactions and enzymatic function. Here we review the strengths and limitations of experimental library screening and computational structure-based design, giving examples where these methods have been applied to designing protein interaction specificity. We highlight recent studies that demonstrate strategies for combining computational modeling with library screening. The computational methods provide focused libraries predicted to be enriched in sequences with the properties of interest. Such integrated approaches represent a promising way to increase the efficiency of protein design and to engineer complex functionality such as interaction specificity.
Affiliation(s)
- T Scott Chen
- Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
6
Shi L, Wheeler JC, Sweet RW, Lu J, Luo J, Tornetta M, Whitaker B, Reddy R, Brittingham R, Borozdina L, Chen Q, Amegadzie B, Knight DM, Almagro JC, Tsui P. De novo selection of high-affinity antibodies from synthetic fab libraries displayed on phage as pIX fusion proteins. J Mol Biol 2010; 397:385-96. [PMID: 20114051 DOI: 10.1016/j.jmb.2010.01.034] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.0] [Received: 11/18/2009] [Revised: 01/06/2010] [Accepted: 01/14/2010] [Indexed: 10/19/2022]
Abstract
Filamentous phage was the first display platform employed to isolate antibodies in vitro and is still the most broadly used. The success of phage display is due to its robustness, ease of use, and comprehensive technology development, as well as a broad range of selection methods developed during the last two decades. We report here the first combinatorial synthetic Fab libraries displayed on pIX, a fusion partner different from the widely used pIII. The libraries were constructed on four V(L) and three V(H) domains encoded by IGV and IGJ germ-line genes frequently used in human antibodies, which were diversified to mirror the variability observed in the germ-line genes and antibodies isolated from natural sources. Two sets of libraries were built, one with diversity focused on V(H) by keeping V(L) in the germ-line gene configuration and the other with diversity in both V domains. After selection on a diverse panel of proteins, numerous specific Fabs with affinities ranging from 0.2 nM to 20 nM were isolated. V(H) diversity was sufficient for isolating Fabs to most antigens, whereas variability in V(L) was required for isolation of antibodies to some targets. After the application of an integrated maturation process consisting of reshuffling V(L) diversity, the affinity of selected antibodies was improved up to 100-fold to the low picomolar range, suitable for in vivo studies. The results demonstrate the feasibility of displaying complex Fab libraries as pIX fusion proteins for antibody discovery and optimization and lay the foundation for studies on the structure-function relationships of antibodies.
Affiliation(s)
- Lei Shi
- Centocor R&D, Inc., 145 King of Prussia Road, Radnor, PA 19087, USA
7
Craig RA, Lu J, Luo J, Shi L, Liao L. Optimizing nucleotide sequence ensembles for combinatorial protein libraries using a genetic algorithm. Nucleic Acids Res 2009; 38:e10. [PMID: 19889723 PMCID: PMC2811015 DOI: 10.1093/nar/gkp906] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 11/13/2022]
Abstract
Protein libraries are essential to the field of protein engineering. Increasingly, probabilistic protein design is being used to synthesize combinatorial protein libraries, which allow the protein engineer to explore a vast space of amino acid sequences, while at the same time placing restrictions on the amino acid distributions. To this end, if site-specific amino acid probabilities are input as the target, then the codon nucleotide distributions that match this target distribution can be used to generate a partially randomized gene library. However, it turns out to be a highly nontrivial computational task to find the codon nucleotide distributions that exactly matches a given target distribution of amino acids. We first showed that for any given target distribution an exact solution may not exist at all. Formulated as a constrained optimization problem, we then developed a genetic algorithm-based approach to find codon nucleotide distributions that match as closely as possible to the target amino acid distribution. As compared with the previous gradient descent method on various objective functions, the new method consistently gave more optimized distributions as measured by the relative entropy between the calculated and the target distributions. To simulate the actual lab solutions, new objective functions were designed to allow for two separate sets of codons in seeking a better match to the target amino acid distribution.
Affiliation(s)
- Roger A Craig
- Department of Computer and Information Sciences, University of Delaware, Newark, DE 19716, USA
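The abstract above scores candidate codon nucleotide distributions by the relative entropy between the calculated and target amino acid distributions. A minimal sketch of that score follows. The function name `relative_entropy`, the dict-based inputs, and the direction of the divergence (target relative to calculated) are assumptions made for this example. The abstract does not specify them, and the genetic algorithm that would minimize the score is not shown.

```python
import math


def relative_entropy(target, calculated):
    """Kullback-Leibler divergence D(target || calculated) in nats.

    Both arguments map amino acid letters to probabilities. Returns
    math.inf when the calculated distribution gives zero probability
    to an amino acid that the target requires.
    """
    total = 0.0
    for aa, t in target.items():
        if t == 0.0:
            continue  # terms with zero target mass contribute nothing
        c = calculated.get(aa, 0.0)
        if c == 0.0:
            return math.inf
        total += t * math.log(t / c)
    return total


if __name__ == "__main__":
    # Toy two-residue example with made-up numbers.
    target = {"A": 0.5, "G": 0.5}
    calculated = {"A": 0.25, "G": 0.75}
    print(relative_entropy(target, target))      # 0.0 for a perfect match
    print(relative_entropy(target, calculated))  # positive for a mismatch
```

A score of zero means the calculated profile reproduces the target exactly. Since the abstract notes that an exact solution may not exist, an optimizer would in general settle for the smallest positive value it can find.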
8
Philibert P, Stoessel A, Wang W, Sibler AP, Bec N, Larroque C, Saven JG, Courtête J, Weiss E, Martineau P. A focused antibody library for selecting scFvs expressed at high levels in the cytoplasm. BMC Biotechnol 2007; 7:81. [PMID: 18034894 PMCID: PMC2241821 DOI: 10.1186/1472-6750-7-81] [Citation(s) in RCA: 56] [Impact Index Per Article: 3.3] [Received: 06/18/2007] [Accepted: 11/22/2007] [Indexed: 12/21/2022]
Abstract
BACKGROUND: Intrabodies are defined as antibody molecules which are ectopically expressed inside the cell. Such intrabodies can be used to visualize or inhibit the targeted antigen in living cells. However, most antibody fragments cannot be used as intrabodies because they do not fold under the reducing conditions of the cell cytosol and nucleus.
RESULTS: We describe the construction and validation of a large synthetic human single chain antibody fragment library based on a unique framework and optimized for cytoplasmic expression. Focusing the library by mimicking the natural diversity of CDR3 loops ensured that the scFvs were fully human and functional. We show that the library is highly diverse and functional since it has been possible to isolate by phage-display several strong binders against the five proteins tested in this study, the Syk and Aurora-A protein kinases, the alphabeta tubulin dimer, the papillomavirus E6 protein and the core histones. Some of the selected scFvs are expressed at an exceptional high level in the bacterial cytoplasm, allowing the purification of 1 mg of active scFv from only 20 ml of culture. Finally, we show that after three rounds of selection against core histones, more than half of the selected scFvs were active when expressed in vivo in human cells since they were essentially localized in the nucleus.
CONCLUSION: This new library is a promising tool not only for an easy and large-scale selection of functional intrabodies but also for the isolation of highly expressed scFvs that could be used in numerous biotechnological and therapeutic applications.
Affiliation(s)
- Pascal Philibert
- CNRS, UMR5160, CRLC, 15 av. Charles Flahault, BP14491, 34093 Montpellier Cedex 5, France
9
Park S, Kono H, Wang W, Boder ET, Saven JG. Progress in the development and application of computational methods for probabilistic protein design. Comput Chem Eng 2005. [DOI: 10.1016/j.compchemeng.2004.07.037] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Indexed: 10/26/2022]
10
Hale MB, Nolan GP, Wolkowicz R. Oligonucleotide-directed site-specific integration of high complexity libraries into ssDNA templates. Nucleic Acids Res 2004; 32:e22. [PMID: 14752044 PMCID: PMC373376 DOI: 10.1093/nar/gnh021] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/14/2022]
Abstract
We present an approach that generates an oligomer-based library with minimal need for restriction site modification of sequences in the target vector. The technique has the advantage that it can be applied for generating peptide aptamer libraries at sites within proteins without the need for introducing flanking enzyme sites. As an example we present a phagemid retroviral shuttle vector that can be used to achieve stable expression of the library in mammalian cells for the purpose of screening for peptides with desired biological activity.
Affiliation(s)
- M B Hale
- Department of Molecular Pharmacology, School of Medicine, Stanford University, Stanford, CA 94305, USA
11
Moore GL, Maranas CD. Computational challenges in combinatorial library design for protein engineering. AIChE J 2004. [DOI: 10.1002/aic.10025] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.2] [Indexed: 11/06/2022]