51
|
Goldberg NW, Knight AM, Zhang RK, Arnold FH. Nitrene Transfer Catalyzed by a Non-Heme Iron Enzyme and Enhanced by Non-Native Small-Molecule Ligands. J Am Chem Soc 2019; 141:19585-19588. [PMID: 31790588 DOI: 10.1021/jacs.9b11608] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Indexed: 01/07/2023]
Abstract
Transition-metal catalysis is a powerful tool for the construction of chemical bonds. Here we show that Pseudomonas savastanoi ethylene-forming enzyme, a non-heme iron enzyme, can catalyze olefin aziridination and nitrene C-H insertion, and that these activities can be improved by directed evolution. The non-heme iron center allows for facile modification of the primary coordination sphere by addition of metal-coordinating molecules, enabling control over enzyme activity and selectivity using small molecules.
|
52
|
Watkins EJ, Almhjell PJ, Arnold FH. Direct Enzymatic Synthesis of a Deep-Blue Fluorescent Noncanonical Amino Acid from Azulene and Serine. Chembiochem 2019; 21:80-83. [PMID: 31513332 DOI: 10.1002/cbic.201900497] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 08/08/2019] [Indexed: 12/21/2022]
Abstract
We report a simple, one-step enzymatic synthesis of the blue fluorescent noncanonical amino acid β-(1-azulenyl)-L-alanine (AzAla). By using an engineered tryptophan synthase β-subunit (TrpB), stereochemically pure AzAla can be synthesized at scale starting from commercially available azulene and L-serine. Mutation of a universally conserved catalytic glutamate in the active site to glycine has only a modest effect on native activity with indole but abolishes activity on azulene, suggesting that this glutamate activates azulene for nucleophilic attack by stabilization of the aromatic ion.
|
53
|
Yang Y, Cho I, Qi X, Liu P, Arnold FH. An enzymatic platform for the asymmetric amination of primary, secondary and tertiary C(sp3)-H bonds. Nat Chem 2019; 11:987-993. [PMID: 31611634 PMCID: PMC6998391 DOI: 10.1038/s41557-019-0343-5] [Citation(s) in RCA: 115] [Impact Index Per Article: 23.0] [Received: 04/12/2019] [Accepted: 08/15/2019] [Indexed: 01/22/2023]
Abstract
The ability to selectively functionalize ubiquitous C-H bonds streamlines the construction of complex molecular architectures from easily available precursors. Here we report enzyme catalysts derived from a cytochrome P450 that use a nitrene transfer mechanism for the enantioselective amination of primary, secondary and tertiary C(sp3)-H bonds. These fully genetically encoded enzymes are produced and function in bacteria, where they can be optimized by directed evolution for a broad spectrum of enantioselective C(sp3)-H amination reactions. These catalysts can aminate a variety of benzylic, allylic and aliphatic C-H bonds in excellent enantioselectivity with access to either antipode of product. Enantioselective amination of primary C(sp3)-H bonds in substrates that bear geminal dimethyl substituents furnished chiral amines that feature a quaternary stereocentre. Moreover, these enzymes enabled the enantioconvergent transformation of racemic substrates that possess a tertiary C(sp3)-H bond to afford products that bear a tetrasubstituted stereocentre, a process that has eluded small-molecule catalysts. Further engineering allowed for the enantioselective construction of methyl-ethyl stereocentres, which is notoriously challenging in asymmetric catalysis.
|
54
|
Bedbrook CN, Yang KK, Robinson JE, Mackey ED, Gradinaru V, Arnold FH. Machine learning-guided channelrhodopsin engineering enables minimally invasive optogenetics. Nat Methods 2019; 16:1176-1184. [PMID: 31611694 PMCID: PMC6858556 DOI: 10.1038/s41592-019-0583-8] [Citation(s) in RCA: 111] [Impact Index Per Article: 22.2] [Received: 09/25/2018] [Accepted: 08/22/2019] [Indexed: 12/13/2022]
Abstract
We engineered light-gated channelrhodopsins (ChRs) whose current strength and light sensitivity enable minimally-invasive neuronal circuit interrogation. Current ChR tools applied to the mammalian brain require intracranial surgery for transgene delivery and implantation of invasive fiber-optic cables to produce light-dependent activation of a small volume of tissue. To facilitate expansive optogenetics without the need for invasive implants, our engineering approach leverages the significant literature of ChR variants to train statistical models for the design of new, high-performance ChRs. With Gaussian Process models trained on a limited experimental set of 102 functionally characterized ChRs, we designed high-photocurrent ChRs with unprecedented light sensitivity; three of these, ChRger1–3, enable optogenetic activation of the nervous system via minimally-invasive systemic transgene delivery, not possible previously due to low per-cell transgene copy produced by systemic delivery. ChRger2 enables light-induced neuronal excitation without invasive intracranial surgery for virus delivery or fiber optic implantation, i.e. enables minimally-invasive optogenetics.
|
55
|
Romney DK, Sarai NS, Arnold FH. Nitroalkanes as Versatile Nucleophiles for Enzymatic Synthesis of Noncanonical Amino Acids. ACS Catal 2019; 9:8726-8730. [PMID: 33274115 DOI: 10.1021/acscatal.9b02089] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Indexed: 11/29/2022]
Abstract
C-C bond-forming reactions often require nucleophilic carbon species rarely compatible with aqueous reaction media, thus restricting their appearance in biocatalysis. Here we report the use of nitroalkanes as a structurally versatile class of nucleophilic substrates for C-C bond formation catalyzed by variants of the β-subunit of tryptophan synthase (TrpB). The enzymes accept a wide range of nitroalkanes to form noncanonical amino acids, where the nitro group can serve as a handle for further modification. Using nitroalkane nucleophiles greatly expands the scope of compounds made by TrpB variants and establishes nitroalkanes as a valuable substrate class for biocatalytic C-C bond formation.
|
56
|
Brandenberg OF, Miller DC, Markel U, Ouald Chaib A, Arnold FH. Engineering Chemoselectivity in Hemoprotein-Catalyzed Indole Amidation. ACS Catal 2019; 9:8271-8275. [PMID: 31938573 DOI: 10.1021/acscatal.9b02508] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Indexed: 12/11/2022]
Abstract
Here we report a cytochrome P450 variant that catalyzes C2-amidation of 1-methylindoles with tosyl azide via nitrene transfer. Before evolutionary optimization the enzyme exhibited two undesired side reactivities resulting in reduction of the putative iron-nitrenoid intermediate or cycloaddition between the two substrates to form triazole products. We speculated that triazole formation was a promiscuous cycloaddition activity of the P450 heme domain, while sulfonamide formation likely arose from surplus electron transfer from the reductase domain. Directed evolution involving mutagenesis of both the heme and reductase domains delivered an enzyme providing the desired indole amidation products with up to 8400 turnovers, 90% yield, and a shift in chemoselectivity from 2:19:1 to 110:12:1 in favor of nitrene transfer over reduction or triazole formation. This work expands the substrate scope of hemoprotein nitrene transferases to heterocycles and highlights the adaptability of the P450 scaffold to solve challenging chemoselectivity problems in non-natural enzymatic catalysis.
|
57
|
Arnold FH. Innovation durch Evolution: Wie man neue Chemie zum Leben erweckt (Nobel‐Vortrag). Angew Chem Int Ed Engl 2019. [DOI: 10.1002/ange.201907729] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Indexed: 02/02/2023]
|
58
|
Arnold FH. Innovation by Evolution: Bringing New Chemistry to Life (Nobel Lecture). Angew Chem Int Ed Engl 2019; 58:14420-14426. [PMID: 31433107 DOI: 10.1002/anie.201907729] [Citation(s) in RCA: 215] [Impact Index Per Article: 43.0] [Received: 06/21/2019] [Indexed: 12/21/2022]
Abstract
The directed evolution of enzymes is now routinely used to develop new catalysts with various applications, such as in environmentally friendly production of chemicals and renewable fuels. In her Nobel lecture, F. Arnold describes how lessons from nature inspired the development of methods for directed evolution.
|
59
|
Yang KK, Wu Z, Arnold FH. Machine-learning-guided directed evolution for protein engineering. Nat Methods 2019; 16:687-694. [PMID: 31308553 DOI: 10.1038/s41592-019-0496-6] [Citation(s) in RCA: 426] [Impact Index Per Article: 85.2] [Received: 10/25/2018] [Accepted: 06/17/2019] [Indexed: 02/06/2023]
Abstract
Protein engineering through machine-learning-guided directed evolution enables the optimization of protein functions. Machine-learning approaches predict how sequence maps to function in a data-driven manner without requiring a detailed model of the underlying physics or biological pathways. Such methods accelerate directed evolution by learning from the properties of characterized variants and using that information to select sequences that are likely to exhibit improved properties. Here we introduce the steps required to build machine-learning sequence-function models and to use those models to guide engineering, making recommendations at each stage. This review covers basic concepts relevant to the use of machine learning for protein engineering, as well as the current literature and applications of this engineering paradigm. We illustrate the process with two case studies. Finally, we look to future opportunities for machine learning to enable the discovery of unknown protein functions and uncover the relationship between protein sequence and function.
|
60
|
Zhang J, Huang X, Zhang RK, Arnold FH. Enantiodivergent α-Amino C-H Fluoroalkylation Catalyzed by Engineered Cytochrome P450s. J Am Chem Soc 2019; 141:9798-9802. [PMID: 31187993 DOI: 10.1021/jacs.9b04344] [Citation(s) in RCA: 76] [Impact Index Per Article: 15.2] [Indexed: 12/20/2022]
Abstract
The introduction of fluoroalkyl groups into organic compounds can significantly alter pharmacological characteristics. One enabling but underexplored approach for the installation of fluoroalkyl groups is selective C(sp3)-H functionalization due to the ubiquity of C-H bonds in organic molecules. We have engineered heme enzymes that can insert fluoroalkyl carbene intermediates into α-amino C(sp3)-H bonds and enable enantiodivergent synthesis of fluoroalkyl-containing molecules. Using directed evolution, we engineered cytochrome P450 enzymes to catalyze this abiological reaction under mild conditions with total turnovers (TTN) up to 4070 and enantiomeric excess (ee) up to 99%. The iron-heme catalyst is fully genetically encoded and configurable by directed evolution so that just a few mutations to the enzyme completely inverted product enantioselectivity. These catalysts provide a powerful method for synthesis of chiral organofluorine molecules that is currently not possible with small-molecule catalysts.
|
61
|
Cho I, Jia ZJ, Arnold FH. RETRACTED: Site-selective enzymatic C‒H amidation for synthesis of diverse lactams. Science 2019; 364:575-578. [PMID: 31073063 DOI: 10.1126/science.aaw9068] [Citation(s) in RCA: 45] [Impact Index Per Article: 9.0] [Received: 02/04/2019] [Accepted: 03/28/2019] [Indexed: 02/23/2024]
Abstract
A major challenge in carbon‒hydrogen (C‒H) bond functionalization is to have the catalyst control precisely where a reaction takes place. In this study, we report engineered cytochrome P450 enzymes that perform unprecedented enantioselective C‒H amidation reactions and control the site selectivity to divergently construct β-, γ-, and δ-lactams, completely overruling the inherent reactivities of the C‒H bonds. The enzymes, expressed in Escherichia coli cells, accomplish this abiological carbon‒nitrogen bond formation via reactive iron-bound carbonyl nitrenes generated from nature-inspired acyl-protected hydroxamate precursors. This transformation is exceptionally efficient (up to 1,020,000 total turnovers) and selective (up to 25:1 regioselectivity and 97% enantiomeric excess; please refer to compound 2v), and can be performed easily on preparative scale.
|
62
|
Brandenberg OF, Chen K, Arnold FH. Directed Evolution of a Cytochrome P450 Carbene Transferase for Selective Functionalization of Cyclic Compounds. J Am Chem Soc 2019; 141:8989-8995. [DOI: 10.1021/jacs.9b02931] [Citation(s) in RCA: 74] [Impact Index Per Article: 14.8] [Indexed: 01/02/2023]
|
63
|
Chen K, Huang X, Zhang SQ, Zhou AZ, Kan SBJ, Hong X, Arnold FH. Engineered Cytochrome c-Catalyzed Lactone-Carbene B-H Insertion. Synlett 2019; 30:378-382. [PMID: 30930550 PMCID: PMC6436545 DOI: 10.1055/s-0037-1611662] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Indexed: 01/13/2023]
Abstract
Previous work has demonstrated that variants of a heme protein, Rhodothermus marinus cytochrome c (Rma cyt c), catalyze abiological carbene boron-hydrogen (B-H) bond insertion with high efficiency and selectivity. Here we investigated this carbon-boron bond-forming chemistry with cyclic, lactone-based carbenes. Using directed evolution, we obtained a Rma cyt c variant BOR LAC that shows high selectivity and efficiency for B-H insertion of 5- and 6-membered lactone carbenes (up to 24,500 total turnovers and 97.1:2.9 enantiomeric ratio). The enzyme shows low activity with a 7-membered lactone carbene. Computational studies revealed a highly twisted geometry of the 7-membered lactone carbene intermediate relative to 5- and 6-membered ones. Directed evolution of cytochrome c together with computational characterization of key iron-carbene intermediates has allowed us to expand the scope of enzymatic carbene B-H insertion to produce new lactone-based organoborons.
|
64
|
Huang X, Garcia-Borràs M, Miao K, Kan SBJ, Zutshi A, Houk KN, Arnold FH. A Biocatalytic Platform for Synthesis of Chiral α-Trifluoromethylated Organoborons. ACS Cent Sci 2019; 5:270-276. [PMID: 30834315 PMCID: PMC6396380 DOI: 10.1021/acscentsci.8b00679] [Citation(s) in RCA: 64] [Impact Index Per Article: 12.8] [Received: 09/25/2018] [Indexed: 05/04/2023]
Abstract
There are few biocatalytic transformations that produce fluorine-containing molecules prevalent in modern pharmaceuticals. To expand the scope of biocatalysis for organofluorine synthesis, we have developed an enzymatic platform for highly enantioselective carbene B-H bond insertion to yield versatile α-trifluoromethylated (α-CF3) organoborons, an important class of organofluorine molecules that contain stereogenic centers bearing both CF3 and boron groups. In contrast to current "carbene transferase" enzymes that use a limited set of simple diazo compounds as carbene precursors, this system based on Rhodothermus marinus cytochrome c (Rma cyt c) can accept a broad range of trifluorodiazo alkanes and deliver versatile chiral α-CF3 organoborons with total turnovers up to 2870 and enantiomeric ratios up to 98.5:1.5. Computational modeling reveals that this broad diazo scope is enabled by an active-site environment that directs the alkyl substituent on the heme CF3-carbene intermediate toward the solvent-exposed face, thereby allowing the protein to accommodate diazo compounds with diverse structural features.
|
65
|
Cho I, Prier CK, Jia Z, Zhang RK, Görbe T, Arnold FH. Enantioselective Aminohydroxylation of Styrenyl Olefins Catalyzed by an Engineered Hemoprotein. Angew Chem Int Ed Engl 2019. [DOI: 10.1002/ange.201812968] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Indexed: 12/11/2022]
|
66
|
Cho I, Prier CK, Jia ZJ, Zhang RK, Görbe T, Arnold FH. Enantioselective Aminohydroxylation of Styrenyl Olefins Catalyzed by an Engineered Hemoprotein. Angew Chem Int Ed Engl 2019; 58:3138-3142. [PMID: 30600873 DOI: 10.1002/anie.201812968] [Citation(s) in RCA: 82] [Impact Index Per Article: 16.4] [Received: 11/12/2018] [Indexed: 12/14/2022]
Abstract
Chiral 1,2-amino alcohols are widely represented in biologically active compounds from neurotransmitters to antivirals. While many synthetic methods have been developed for accessing amino alcohols, the direct aminohydroxylation of alkenes to unprotected, enantioenriched amino alcohols remains a challenge. Using directed evolution, we have engineered a hemoprotein biocatalyst based on a thermostable cytochrome c that directly transforms alkenes to amino alcohols with high enantioselectivity (up to 2500 TTN and 90% ee) under anaerobic conditions with O-pivaloylhydroxylamine as an aminating reagent. The reaction is proposed to proceed via a reactive iron-nitrogen species generated in the enzyme active site, enabling tuning of the catalyst's activity and selectivity by protein engineering.
|
67
|
Chen K, Zhang SQ, Brandenberg OF, Hong X, Arnold FH. Alternate Heme Ligation Steers Activity and Selectivity in Engineered Cytochrome P450-Catalyzed Carbene-Transfer Reactions. J Am Chem Soc 2018; 140:16402-16407. [DOI: 10.1021/jacs.8b09613] [Citation(s) in RCA: 85] [Impact Index Per Article: 14.2] [Indexed: 12/11/2022]
|
68
|
Zhang RK, Huang X, Arnold FH. Selective C-H bond functionalization with engineered heme proteins: new tools to generate complexity. Curr Opin Chem Biol 2018; 49:67-75. [PMID: 30343008 DOI: 10.1016/j.cbpa.2018.10.004] [Citation(s) in RCA: 94] [Impact Index Per Article: 15.7] [Received: 09/08/2018] [Revised: 09/27/2018] [Accepted: 10/02/2018] [Indexed: 12/13/2022]
Abstract
C-H functionalization is an attractive strategy to construct and diversify molecules. Heme proteins, predominantly cytochromes P450, are responsible for an array of C-H oxidations in biology. Recent work has coupled concepts from synthetic chemistry, computation, and natural product biosynthesis to engineer heme protein systems to deliver products with tailored oxidation patterns. Heme protein catalysis has been shown to go well beyond these native reactions and now accesses new-to-nature C-H transformations, including C-N and C-C bond-forming processes. Emerging work with these systems moves us along the ambitious path of building complexity from the ubiquitous C-H bond.
|
69
|
Boville CE, Scheele RA, Koch P, Brinkmann-Chen S, Buller AR, Arnold FH. Engineered Biosynthesis of β-Alkyl Tryptophan Analogues. Angew Chem Int Ed Engl 2018; 57:14764-14768. [PMID: 30215880 DOI: 10.1002/anie.201807998] [Citation(s) in RCA: 41] [Impact Index Per Article: 6.8] [Received: 07/15/2018] [Indexed: 11/12/2022]
Abstract
Noncanonical amino acids (ncAAs) with dual stereocenters at the α and β positions are valuable precursors to natural products and therapeutics. Despite the potential applications of such bioactive β-branched ncAAs, their availability is limited due to the inefficiency of the multistep methods used to prepare them. Herein we report a stereoselective biocatalytic synthesis of β-branched tryptophan analogues using an engineered variant of Pyrococcus furiosus tryptophan synthase (PfTrpB), PfTrpB7E6. PfTrpB7E6 is the first biocatalyst to synthesize bulky β-branched tryptophan analogues in a single step, with demonstrated access to 27 ncAAs. The molecular basis for the efficient catalysis and broad substrate tolerance of PfTrpB7E6 was explored through X-ray crystallography and UV/Vis spectroscopy, which revealed that a combination of active-site and remote mutations increases the abundance and persistence of a key reactive intermediate. PfTrpB7E6 provides an operationally simple and environmentally benign platform for the preparation of β-branched tryptophan building blocks.
|
70
|
Boville CE, Scheele RA, Koch P, Brinkmann-Chen S, Buller AR, Arnold FH. Engineered Biosynthesis of β-Alkyl Tryptophan Analogues. Angew Chem Int Ed Engl 2018. [DOI: 10.1002/ange.201807998] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Indexed: 02/05/2023]
|
71
|
Almhjell PJ, Boville CE, Arnold FH. Engineering enzymes for noncanonical amino acid synthesis. Chem Soc Rev 2018; 47:8980-8997. [PMID: 30280154 DOI: 10.1039/c8cs00665b] [Citation(s) in RCA: 69] [Impact Index Per Article: 11.5] [Indexed: 12/14/2022]
Abstract
The standard proteinogenic amino acids grant access to a myriad of chemistries that harmonize to create life. Outside of these twenty canonical protein building blocks are countless noncanonical amino acids (ncAAs), either found in nature or created by man. Interest in ncAAs has grown as research has unveiled their importance as precursors to natural products and pharmaceuticals, biological probes, and more. Despite their broad applications, synthesis of ncAAs remains a challenge, as poor stereoselectivity and low functional-group compatibility stymie effective preparative routes. The use of enzymes has emerged as a versatile approach to prepare ncAAs, and nature's enzymes can be engineered to synthesize ncAAs more efficiently and expand the amino acid alphabet. In this tutorial review, we briefly outline different enzyme engineering strategies and then discuss examples where engineering has generated new 'ncAA synthases' for efficient, environmentally benign production of a wide and growing collection of valuable ncAAs.
|
72
|
Abstract
Not satisfied with nature’s vast enzyme repertoire, we want to create new ones and expand the space of genetically encoded enzyme functions. We use the most powerful biological design process, evolution, to optimize existing enzymes and invent new ones, thereby circumventing our profound ignorance of how sequence encodes function. Mimicking nature’s evolutionary tricks and using a little chemical intuition, we can generate whole new enzyme families that catalyze important reactions, including ones not known in biology. These new capabilities increase the scope of molecules and materials we can build using biology.
|
73
|
Yang KK, Wu Z, Bedbrook CN, Arnold FH. Learned protein embeddings for machine learning. Bioinformatics 2018; 34:2642-2648. [PMID: 29584811 PMCID: PMC6061698 DOI: 10.1093/bioinformatics/bty178] [Citation(s) in RCA: 131] [Impact Index Per Article: 21.8] [Received: 11/30/2017] [Revised: 03/20/2018] [Accepted: 03/22/2018] [Indexed: 12/26/2022]
Abstract
Motivation: Machine-learning models trained on protein sequences and their measured functions can infer biological properties of unseen sequences without requiring an understanding of the underlying physical or biological mechanisms. Such models enable the prediction and discovery of sequences with optimal properties. Machine-learning models generally require that their inputs be vectors, and the conversion from a protein sequence to a vector representation affects the model's ability to learn. We propose to learn embedded representations of protein sequences that take advantage of the vast quantity of unmeasured protein sequence data available. These embeddings are low-dimensional and can greatly simplify downstream modeling. Results: The predictive power of Gaussian process models trained using embeddings is comparable to that of models trained on existing representations, which suggests that embeddings enable accurate predictions despite having orders of magnitude fewer dimensions. Moreover, embeddings are simpler to obtain because they do not require alignments, structural data, or selection of informative amino-acid properties. Visualizing the embedding vectors shows that meaningful relationships between the embedded proteins are captured. Availability and implementation: The embedding vectors and code to reproduce the results are available at https://github.com/fhalab/embeddings_reproduction/. Supplementary information: Supplementary data are available at Bioinformatics online.
|
74
|
Romney DK, Arnold FH, Lipshutz BH, Li CJ. Chemistry Takes a Bath: Reactions in Aqueous Media. J Org Chem 2018; 83:7319-7322. [DOI: 10.1021/acs.joc.8b01412] [Citation(s) in RCA: 68] [Impact Index Per Article: 11.3] [Indexed: 01/05/2023]
|
75
|
Lewis RD, Garcia-Borràs M, Chalkley MJ, Buller AR, Houk KN, Kan SBJ, Arnold FH. Catalytic iron-carbene intermediate revealed in a cytochrome c carbene transferase. Proc Natl Acad Sci U S A 2018; 115:7308-7313. [PMID: 29946033 PMCID: PMC6048479 DOI: 10.1073/pnas.1807027115] [Citation(s) in RCA: 84] [Impact Index Per Article: 14.0] [Indexed: 11/18/2022]
Abstract
Recently, heme proteins have been discovered and engineered by directed evolution to catalyze chemical transformations that are biochemically unprecedented. Many of these nonnatural enzyme-catalyzed reactions are assumed to proceed through a catalytic iron porphyrin carbene (IPC) intermediate, although this intermediate has never been observed in a protein. Using crystallographic, spectroscopic, and computational methods, we have captured and studied a catalytic IPC intermediate in the active site of an enzyme derived from thermostable Rhodothermus marinus (Rma) cytochrome c. High-resolution crystal structures and computational methods reveal how directed evolution created an active site for carbene transfer in an electron transfer protein and how the laboratory-evolved enzyme achieves perfect carbene transfer stereoselectivity by holding the catalytic IPC in a single orientation. We also discovered that the IPC in Rma cytochrome c has a singlet ground electronic state and that the protein environment uses geometrical constraints and noncovalent interactions to influence different IPC electronic states. This information helps us to understand the impressive reactivity and selectivity of carbene transfer enzymes and offers insights that will guide and inspire future engineering efforts.
|