1
|
Alford RF, Leaver-Fay A, Jeliazkov JR, O’Meara MJ, DiMaio FP, Park H, Shapovalov MV, Renfrew PD, Mulligan VK, Kappel K, Labonte JW, Pacella MS, Bonneau R, Bradley P, Dunbrack RL, Das R, Baker D, Kuhlman B, Kortemme T, Gray JJ. The Rosetta All-Atom Energy Function for Macromolecular Modeling and Design. J Chem Theory Comput 2017; 13:3031-3048. [PMID: 28430426 PMCID: PMC5717763 DOI: 10.1021/acs.jctc.7b00125] [Citation(s) in RCA: 901] [Impact Index Per Article: 112.6] [Indexed: 12/11/2022]
Abstract
Over the past decade, the Rosetta biomolecular modeling suite has informed diverse biological questions and engineering challenges ranging from interpretation of low-resolution structural data to design of nanomaterials, protein therapeutics, and vaccines. Central to Rosetta's success is the energy function: a model parametrized from small-molecule and X-ray crystal structure data used to approximate the energy associated with each biomolecule conformation. This paper describes the mathematical models and physical concepts that underlie the latest Rosetta energy function, called the Rosetta Energy Function 2015 (REF15). Applying these concepts, we explain how to use Rosetta energies to identify and analyze the features of biomolecular models. Finally, we discuss the latest advances in the energy function that extend its capabilities from soluble proteins to also include membrane proteins, peptides containing noncanonical amino acids, small molecules, carbohydrates, nucleic acids, and other macromolecules.
|
research-article |
8 |
901 |
2
|
Leman JK, Weitzner BD, Lewis SM, Adolf-Bryfogle J, Alam N, Alford RF, Aprahamian M, Baker D, Barlow KA, Barth P, Basanta B, Bender BJ, Blacklock K, Bonet J, Boyken SE, Bradley P, Bystroff C, Conway P, Cooper S, Correia BE, Coventry B, Das R, De Jong RM, DiMaio F, Dsilva L, Dunbrack R, Ford AS, Frenz B, Fu DY, Geniesse C, Goldschmidt L, Gowthaman R, Gray JJ, Gront D, Guffy S, Horowitz S, Huang PS, Huber T, Jacobs TM, Jeliazkov JR, Johnson DK, Kappel K, Karanicolas J, Khakzad H, Khar KR, Khare SD, Khatib F, Khramushin A, King IC, Kleffner R, Koepnick B, Kortemme T, Kuenze G, Kuhlman B, Kuroda D, Labonte JW, Lai JK, Lapidoth G, Leaver-Fay A, Lindert S, Linsky T, London N, Lubin JH, Lyskov S, Maguire J, Malmström L, Marcos E, Marcu O, Marze NA, Meiler J, Moretti R, Mulligan VK, Nerli S, Norn C, Ó'Conchúir S, Ollikainen N, Ovchinnikov S, Pacella MS, Pan X, Park H, Pavlovicz RE, Pethe M, Pierce BG, Pilla KB, Raveh B, Renfrew PD, Burman SSR, Rubenstein A, Sauer MF, Scheck A, Schief W, Schueler-Furman O, Sedan Y, Sevy AM, Sgourakis NG, Shi L, Siegel JB, Silva DA, Smith S, Song Y, Stein A, Szegedy M, Teets FD, Thyme SB, Wang RYR, Watkins A, Zimmerman L, Bonneau R. Macromolecular modeling and design in Rosetta: recent methods and frameworks. Nat Methods 2020; 17:665-680. [PMID: 32483333 PMCID: PMC7603796 DOI: 10.1038/s41592-020-0848-2] [Citation(s) in RCA: 484] [Impact Index Per Article: 96.8] [Received: 04/29/2019] [Accepted: 04/22/2020] [Indexed: 12/12/2022]
Abstract
The Rosetta software for macromolecular modeling, docking and design is extensively used in laboratories worldwide. During two decades of development by a community of laboratories at more than 60 institutions, Rosetta has been continuously refactored and extended. Its advantages are its performance and interoperability between broad modeling capabilities. Here we review tools developed in the last 5 years, including over 80 methods. We discuss improvements to the score function, user interfaces and usability. Rosetta is available at http://www.rosettacommons.org.
|
Research Support, N.I.H., Extramural |
5 |
484 |
3
|
Firnberg E, Labonte JW, Gray JJ, Ostermeier M. A comprehensive, high-resolution map of a gene's fitness landscape. Mol Biol Evol 2014; 31:1581-92. [PMID: 24567513 PMCID: PMC4032126 DOI: 10.1093/molbev/msu081] [Citation(s) in RCA: 207] [Impact Index Per Article: 18.8] [Indexed: 11/25/2022] Open
Abstract
Mutations are central to evolution, providing the genetic variation upon which selection acts. A mutation’s effect on the suitability of a gene to perform a particular function (gene fitness) can be positive, negative, or neutral. Knowledge of the distribution of fitness effects (DFE) of mutations is fundamental for understanding evolutionary dynamics, molecular-level genetic variation, complex genetic disease, the accumulation of deleterious mutations, and the molecular clock. We present comprehensive DFEs for point and codon mutants of the Escherichia coli TEM-1 β-lactamase gene and missense mutations in the TEM-1 protein. These DFEs provide insight into the inherent benefits of the genetic code’s architecture, support for the hypothesis that mRNA stability dictates codon usage at the beginning of genes, an extensive framework for understanding protein mutational tolerance, and evidence that mutational effects on protein thermodynamic stability shape the DFE. Contrary to prevailing expectations, we find that deleterious effects of mutation primarily arise from a decrease in specific protein activity and not cellular protein levels.
|
Research Support, U.S. Gov't, Non-P.H.S. |
11 |
207 |
4
|
Crawford JM, Korman TP, Labonte JW, Vagstad AL, Hill EA, Kamari-Bidkorpeh O, Tsai SC, Townsend CA. Structural basis for biosynthetic programming of fungal aromatic polyketide cyclization. Nature 2009; 461:1139-43. [PMID: 19847268 DOI: 10.1038/nature08475] [Citation(s) in RCA: 142] [Impact Index Per Article: 8.9] [Received: 03/25/2009] [Accepted: 08/28/2009] [Indexed: 11/09/2022]
Abstract
Polyketides are a class of natural products with diverse structures and biological activities. The structural variability of aromatic products of fungal nonreducing, multidomain iterative polyketide synthases (NR-PKS group of IPKSs) results from regiospecific cyclizations of reactive poly-beta-keto intermediates. How poly-beta-keto species are synthesized and stabilized, how their chain lengths are determined, and, in particular, how specific cyclization patterns are controlled have been largely inaccessible and functionally unknown until recently. A product template (PT) domain is responsible for controlling specific aldol cyclization and aromatization of these mature polyketide precursors, but the mechanistic basis is unknown. Here we present the 1.8 Å crystal structure and mutational studies of a dissected PT monodomain from PksA, the NR-PKS that initiates the biosynthesis of the potent hepatocarcinogen aflatoxin B1 in Aspergillus parasiticus. Despite having minimal sequence similarity to known enzymes, the structure displays a distinct 'double hot dog' (DHD) fold. Co-crystal structures with palmitate or a bicyclic substrate mimic illustrate that PT can bind both linear and bicyclic polyketides. Docking and mutagenesis studies reveal residues important for substrate binding and catalysis, and identify a phosphopantetheine localization channel and a deep two-part interior binding pocket and reaction chamber. Sequence similarity and extensive conservation of active site residues in PT domains suggest that the mechanistic insights gleaned from these studies will prove general for this class of IPKSs, and lay a foundation for defining the molecular rules controlling NR-PKS cyclization specificity.
|
Research Support, U.S. Gov't, Non-P.H.S. |
16 |
142 |
5
|
Piepenbrink KH, Lillehoj E, Harding CM, Labonte JW, Zuo X, Rapp CA, Munson RS, Goldblum SE, Feldman MF, Gray JJ, Sundberg EJ. Structural Diversity in the Type IV Pili of Multidrug-resistant Acinetobacter. J Biol Chem 2016; 291:22924-22935. [PMID: 27634041 PMCID: PMC5087714 DOI: 10.1074/jbc.m116.751099] [Citation(s) in RCA: 48] [Impact Index Per Article: 5.3] [Received: 08/01/2016] [Indexed: 11/06/2022] Open
Abstract
Acinetobacter baumannii is a Gram-negative coccobacillus found primarily in hospital settings that has recently emerged as a source of hospital-acquired infections. A. baumannii expresses a variety of virulence factors, including type IV pili, bacterial extracellular appendages often essential for attachment to host cells. Here, we report the high resolution structures of the major pilin subunit, PilA, from three Acinetobacter strains, demonstrating that A. baumannii subsets produce morphologically distinct type IV pilin glycoproteins. We examine the consequences of this heterogeneity for protein folding and assembly as well as host-cell adhesion by Acinetobacter. Comparisons of genomic and structural data with pilin proteins from other species of soil gammaproteobacteria suggest that these structural differences stem from evolutionary pressure that has resulted in three distinct classes of type IVa pilins, each found in multiple species.
|
Journal Article |
9 |
48 |
6
|
Drew K, Renfrew PD, Craven TW, Butterfoss GL, Chou FC, Lyskov S, Bullock BN, Watkins A, Labonte JW, Pacella M, Kilambi KP, Leaver-Fay A, Kuhlman B, Gray JJ, Bradley P, Kirshenbaum K, Arora PS, Das R, Bonneau R. Adding diverse noncanonical backbones to rosetta: enabling peptidomimetic design. PLoS One 2013; 8:e67051. [PMID: 23869206 PMCID: PMC3712014 DOI: 10.1371/journal.pone.0067051] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.0] [Received: 02/04/2013] [Accepted: 05/13/2013] [Indexed: 11/19/2022] Open
Abstract
Peptidomimetics are classes of molecules that mimic structural and functional attributes of polypeptides. Peptidomimetic oligomers can frequently be synthesized using efficient solid phase synthesis procedures similar to peptide synthesis. Conformationally ordered peptidomimetic oligomers are finding broad applications for molecular recognition and for inhibiting protein-protein interactions. One critical limitation is the limited set of design tools for identifying oligomer sequences that can adopt desired conformations. Here, we present expansions to the ROSETTA platform that enable structure prediction and design of five non-peptidic oligomer scaffolds (noncanonical backbones), oligooxopiperazines, oligo-peptoids, [Formula: see text]-peptides, hydrogen bond surrogate helices and oligosaccharides. This work is complementary to prior additions to model noncanonical protein side chains in ROSETTA. The main purpose of our manuscript is to give a detailed description to current and future developers of how each of these noncanonical backbones was implemented. Furthermore, we provide a general outline for implementation of new backbone types not discussed here. To illustrate the utility of this approach, we describe the first tests of the ROSETTA molecular mechanics energy function in the context of oligooxopiperazines, using quantum mechanical calculations as comparison points, scanning through backbone and side chain torsion angles for a model peptidomimetic. Finally, as an example of a novel design application, we describe the automated design of an oligooxopiperazine that inhibits the p53-MDM2 protein-protein interaction. For the general biological and bioengineering community, several noncanonical backbones have been incorporated into web applications that allow users to freely and rapidly test the presented protocols (http://rosie.rosettacommons.org). 
This work helps address the peptidomimetic community's need for an automated and expandable modeling tool for noncanonical backbones.
|
Research Support, N.I.H., Extramural |
12 |
48 |
7
|
Labonte JW, Townsend CA. Active site comparisons and catalytic mechanisms of the hot dog superfamily. Chem Rev 2012. [PMID: 23205964 DOI: 10.1021/cr300169a] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.2] [Indexed: 11/28/2022]
|
Review |
13 |
42 |
8
|
Labonte JW, Adolf-Bryfogle J, Schief WR, Gray JJ. Residue-centric modeling and design of saccharide and glycoconjugate structures. J Comput Chem 2016; 38:276-287. [PMID: 27900782 DOI: 10.1002/jcc.24679] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Received: 07/29/2016] [Revised: 09/23/2016] [Accepted: 11/06/2016] [Indexed: 01/18/2023]
Abstract
The RosettaCarbohydrate framework is a new tool for modeling a wide variety of saccharide and glycoconjugate structures. This report describes the development of the framework and highlights its applications. The framework integrates with established protocols within the Rosetta modeling and design suite, and it handles the vast complexity and variety of carbohydrate molecules, including branching and sugar modifications. To address challenges of sampling and scoring, RosettaCarbohydrate can sample glycosidic bonds, side-chain conformations, and ring forms, and it utilizes a glycan-specific term within its scoring function. Rosetta can work with standard PDB, GLYCAM, and GlycoWorkbench (.gws) file formats. Saccharide residue-specific chemical information is stored internally, permitting glycoengineering and design. Carbohydrate-specific applications described herein include virtual glycosylation, loop-modeling of carbohydrates, and docking of glyco-ligands to antibodies. Benchmarking data are presented and compared to other studies, demonstrating Rosetta's ability to predict glyco-ligand binding. The framework expands the tools available to glycoscientists and engineers. © 2016 Wiley Periodicals, Inc.
|
Research Support, Non-U.S. Gov't |
9 |
37 |
9
|
Kilambi KP, Pacella MS, Xu J, Labonte JW, Porter JR, Muthu P, Drew K, Kuroda D, Schueler-Furman O, Bonneau R, Gray JJ. Extending RosettaDock with water, sugar, and pH for prediction of complex structures and affinities for CAPRI rounds 20-27. Proteins 2013; 81:2201-9. [PMID: 24123494 PMCID: PMC4037910 DOI: 10.1002/prot.24425] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Received: 06/17/2013] [Revised: 09/12/2013] [Accepted: 09/13/2013] [Indexed: 11/09/2022]
Abstract
Rounds 20-27 of the Critical Assessment of PRotein Interactions (CAPRI) provided a testing platform for computational methods designed to address a wide range of challenges. The diverse targets drove the creation of and new combinations of computational tools. In this study, RosettaDock and other novel Rosetta protocols were used to successfully predict four of the 10 blind targets. For example, for DNase domain of Colicin E2-Im2 immunity protein, RosettaDock and RosettaLigand were used to predict the positions of water molecules at the interface, recovering 46% of the native water-mediated contacts. For α-repeat Rep4-Rep2 and g-type lysozyme-PliG inhibitor complexes, homology models were built and standard and pH-sensitive docking algorithms were used to generate structures with interface RMSD values of 3.3 Å and 2.0 Å, respectively. A novel flexible sugar-protein docking protocol was also developed and used for structure prediction of the BT4661-heparin-like saccharide complex, recovering 71% of the native contacts. Challenges remain in the generation of accurate homology models for protein mutants and sampling during global docking. On proteins designed to bind influenza hemagglutinin, only about half of the mutations were identified that affect binding (T55: 54%; T56: 48%). The prediction of the structure of the xylanase complex involving homology modeling and multidomain docking pushed the limits of global conformational sampling and did not result in any successful prediction. The diversity of problems at hand requires computational algorithms to be versatile; the recent additions to the Rosetta suite expand the capabilities to encompass more biologically realistic docking problems.
|
Research Support, N.I.H., Extramural |
12 |
22 |
10
|
Keceli G, Moore CD, Labonte JW, Toscano JP. NMR detection and study of hydrolysis of HNO-derived sulfinamides. Biochemistry 2013; 52:7387-96. [PMID: 24073927 DOI: 10.1021/bi401110f] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Indexed: 11/28/2022]
Abstract
Nitroxyl (HNO), a potential heart failure therapeutic, is known to post-translationally modify cysteine residues. Among reactive nitrogen oxide species, the modification of cysteine residues to sulfinamides [RS(O)NH2] is unique to HNO. We have applied ¹⁵N-edited ¹H NMR techniques to detect the HNO-induced thiol to sulfinamide modification in several small organic molecules, peptides, and the cysteine protease, papain. Relevant reactions of sulfinamides involve reduction to free thiols in the presence of excess thiol and hydrolysis to form sulfinic acids [RS(O)OH]. We have investigated sulfinamide hydrolysis at physiological pH and temperature. Studies with papain and a related model peptide containing the active site thiol suggest that sulfinamide hydrolysis can be enhanced in a protein environment. These findings are also supported by modeling studies. In addition, analysis of peptide sulfinamides at various pH values suggests that hydrolysis becomes more facile under acidic conditions.
|
Research Support, U.S. Gov't, Non-P.H.S. |
12 |
21 |
11
|
Koehler Leman J, Weitzner BD, Renfrew PD, Lewis SM, Moretti R, Watkins AM, Mulligan VK, Lyskov S, Adolf-Bryfogle J, Labonte JW, Krys J, Bystroff C, Schief W, Gront D, Schueler-Furman O, Baker D, Bradley P, Dunbrack R, Kortemme T, Leaver-Fay A, Strauss CEM, Meiler J, Kuhlman B, Gray JJ, Bonneau R. Better together: Elements of successful scientific software development in a distributed collaborative community. PLoS Comput Biol 2020; 16:e1007507. [PMID: 32365137 PMCID: PMC7197760 DOI: 10.1371/journal.pcbi.1007507] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Indexed: 12/14/2022] Open
Abstract
Many scientific disciplines rely on computational methods for data analysis, model generation, and prediction. Implementing these methods is often accomplished by researchers with domain expertise but without formal training in software engineering or computer science. This arrangement has led to underappreciation of sustainability and maintainability of scientific software tools developed in academic environments. Some software tools have avoided this fate, including the scientific library Rosetta. We use this software and its community as a case study to show how modern software development can be accomplished successfully, irrespective of subject area. Rosetta is one of the largest software suites for macromolecular modeling, with 3.1 million lines of code and many state-of-the-art applications. Since the mid 1990s, the software has been developed collaboratively by the RosettaCommons, a community of academics from over 60 institutions worldwide with diverse backgrounds including chemistry, biology, physiology, physics, engineering, mathematics, and computer science. Developing this software suite has provided us with more than two decades of experience in how to effectively develop advanced scientific software in a global community with hundreds of contributors. Here we illustrate the functioning of this development community by addressing technical aspects (like version control, testing, and maintenance), community-building strategies, diversity efforts, software dissemination, and user support. We demonstrate how modern computational research can thrive in a distributed collaborative community. The practices described here are independent of subject area and can be readily adopted by other software development communities.
|
Research Support, N.I.H., Extramural |
5 |
20 |
12
|
Mulligan VK, Kang CS, Sawaya MR, Rettie S, Li X, Antselovich I, Craven TW, Watkins AM, Labonte JW, DiMaio F, Yeates TO, Baker D. Computational design of mixed chirality peptide macrocycles with internal symmetry. Protein Sci 2021; 29:2433-2445. [PMID: 33058266 PMCID: PMC7679966 DOI: 10.1002/pro.3974] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 08/25/2020] [Revised: 10/12/2020] [Accepted: 10/13/2020] [Indexed: 12/27/2022]
Abstract
Cyclic symmetry is frequent in protein and peptide homo‐oligomers, but extremely rare within a single chain, as it is not compatible with free N‐ and C‐termini. Here we describe the computational design of mixed‐chirality peptide macrocycles with rigid structures that feature internal cyclic symmetries or improper rotational symmetries inaccessible to natural proteins. Crystal structures of three C2‐ and C3‐symmetric macrocycles, and of six diverse S2‐symmetric macrocycles, match the computationally‐designed models with backbone heavy‐atom RMSD values of 1 Å or better. Crystal structures of an S4‐symmetric macrocycle (consisting of a sequence and structure segment mirrored at each of three successive repeats) designed to bind zinc reveal a large‐scale zinc‐driven conformational change from an S4‐symmetric apo‐state to a nearly inverted S4‐symmetric holo‐state almost identical to the design model. These symmetric structures provide promising starting points for applications ranging from design of cyclic peptide based metal organic frameworks to creation of high affinity binders of symmetric protein homo‐oligomers. More generally, this work demonstrates the power of computational design for exploring symmetries and structures not found in nature, and for creating synthetic switchable systems. PDB Code(s): 6UFU, 6UG2, 6UG3, 6UG6, 6UGB, 6UGC, 6UCX, 6UD9, 6UDR, 6UDW, 6UDZ, 6UF4, 6UF7, 6UF8, 6UFA and 6UF9;
|
Research Support, U.S. Gov't, Non-P.H.S. |
4 |
13 |
13
|
Burman SSR, Nance ML, Jeliazkov JR, Labonte JW, Lubin JH, Biswas N, Gray JJ. Novel sampling strategies and a coarse-grained score function for docking homomers, flexible heteromers, and oligosaccharides using Rosetta in CAPRI rounds 37-45. Proteins 2020; 88:973-985. [PMID: 31742764 PMCID: PMC8589291 DOI: 10.1002/prot.25855] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 09/02/2019] [Revised: 11/04/2019] [Accepted: 11/13/2019] [Indexed: 02/06/2023]
Abstract
Critical Assessment of PRediction of Interactions (CAPRI) rounds 37 through 45 introduced larger complexes, new macromolecules, and multistage assemblies. For these rounds, we used and expanded docking methods in Rosetta to model 23 target complexes. We successfully predicted 14 target complexes and recognized and refined near-native models generated by other groups for two further targets. Notably, for targets T110 and T136, we achieved the closest prediction of any CAPRI participant. We created several innovative approaches during these rounds. Since round 39 (target 122), we have used the new RosettaDock 4.0, which has a revamped coarse-grained energy function and the ability to perform conformer selection during docking with hundreds of pregenerated protein backbones. Ten of the complexes had some degree of symmetry in their interactions, so we tested Rosetta SymDock, realized its shortcomings, and developed the next-generation symmetric docking protocol, SymDock2, which includes docking of multiple backbones and induced-fit refinement. Since the last CAPRI assessment, we also developed methods for modeling and designing carbohydrates in Rosetta, and we used them to successfully model oligosaccharide-protein complexes in round 41. Although the results were broadly encouraging, they also highlighted the pressing need to invest in (a) flexible docking algorithms with the ability to model loop and linker motions and in (b) new sampling and scoring methods for oligosaccharide-protein interactions.
|
Research Support, N.I.H., Extramural |
5 |
12 |
14
|
Nance ML, Labonte JW, Adolf-Bryfogle J, Gray JJ. Development and Evaluation of GlycanDock: A Protein-Glycoligand Docking Refinement Algorithm in Rosetta. J Phys Chem B 2021; 125:10.1021/acs.jpcb.1c00910. [PMID: 34133179 PMCID: PMC8742512 DOI: 10.1021/acs.jpcb.1c00910] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 02/07/2023]
Abstract
Carbohydrate chains are ubiquitous in the complex molecular processes of life. These highly diverse chains are recognized by a variety of protein receptors, enabling glycans to regulate many biological functions. High-resolution structures of protein-glycoligand complexes reveal the atomic details necessary to understand this level of molecular recognition and inform application-focused scientific and engineering pursuits. When experimental challenges hinder high-throughput determination of quality structures, computational tools can, in principle, fill the gap. In this work, we introduce GlycanDock, a residue-centric protein-glycoligand docking refinement algorithm developed within the Rosetta macromolecular modeling and design software suite. We performed a benchmark docking assessment using a set of 109 experimentally determined protein-glycoligand complexes as well as 62 unbound protein structures. The GlycanDock algorithm can sample and discriminate among protein-glycoligand models of native-like structural accuracy with statistical reliability from starting structures of up to 7 Å root-mean-square deviation in the glycoligand ring atoms. We show that GlycanDock-refined models qualitatively replicated the known binding specificity of a bacterial carbohydrate-binding module. Finally, we present a protein-glycoligand docking pipeline for generating putative protein-glycoligand complexes when only the glycoligand sequence and unbound protein structure are known. In combination with other carbohydrate modeling tools, the GlycanDock docking refinement algorithm will accelerate research in the glycosciences.
|
research-article |
4 |
12 |
15
|
Pierre B, Labonte JW, Xiong T, Aoraha E, Williams A, Shah V, Chau E, Helal KY, Gray JJ, Kim JR. Molecular Determinants for Protein Stabilization by Insertional Fusion to a Thermophilic Host Protein. Chembiochem 2015; 16:2392-402. [DOI: 10.1002/cbic.201500310] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 06/20/2015] [Indexed: 12/26/2022]
|
|
10 |
11 |
16
|
Mathew MP, Tan E, Labonte JW, Shah S, Saeui CT, Liu L, Bhattacharya R, Bovonratwet P, Gray JJ, Yarema KJ. Glycoengineering of Esterase Activity through Metabolic Flux-Based Modulation of Sialic Acid. Chembiochem 2017; 18:1204-1215. [PMID: 28218815 PMCID: PMC5757160 DOI: 10.1002/cbic.201600698] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 12/27/2016] [Indexed: 01/09/2023]
Abstract
This report describes the metabolic glycoengineering (MGE) of intracellular esterase activity in human colon cancer (LS174T) and Chinese hamster ovary (CHO) cells. In silico analysis of carboxylesterases CES1 and CES2 suggested that these enzymes are modified with sialylated N-glycans, which are proposed to stabilize the active multimeric forms of these enzymes. This premise was supported by treating cells with butanoylated ManNAc to increase sialylation, which in turn increased esterase activity. By contrast, hexosamine analogues not targeted to sialic acid biosynthesis (e.g., butanoylated GlcNAc or GalNAc) had minimal impact. Measurement of mRNA and protein confirmed that esterase activity was controlled through glycosylation and not through transcription or translation. Azide-modified ManNAc analogues widely used in MGE also enhanced esterase activity and provided a way to enrich targeted glycoengineered proteins (such as CES2), thereby providing unambiguous evidence that the compounds were converted to sialosides and installed into the glycan structures of esterases as intended. Overall, this study provides a pioneering example of the modulation of intracellular enzyme activity through MGE, which expands the value of this technology from its current status as a labeling strategy and modulator of cell surface biological events.
|
Research Support, N.I.H., Extramural |
8 |
8 |
17
|
Koehler Leman J, Lyskov S, Lewis SM, Adolf-Bryfogle J, Alford RF, Barlow K, Ben-Aharon Z, Farrell D, Fell J, Hansen WA, Harmalkar A, Jeliazkov J, Kuenze G, Krys JD, Ljubetič A, Loshbaugh AL, Maguire J, Moretti R, Mulligan VK, Nance ML, Nguyen PT, Ó Conchúir S, Roy Burman SS, Samanta R, Smith ST, Teets F, Tiemann JKS, Watkins A, Woods H, Yachnin BJ, Bahl CD, Bailey-Kellogg C, Baker D, Das R, DiMaio F, Khare SD, Kortemme T, Labonte JW, Lindorff-Larsen K, Meiler J, Schief W, Schueler-Furman O, Siegel JB, Stein A, Yarov-Yarovoy V, Kuhlman B, Leaver-Fay A, Gront D, Gray JJ, Bonneau R. Ensuring scientific reproducibility in bio-macromolecular modeling via extensive, automated benchmarks. Nat Commun 2021; 12:6947. [PMID: 34845212 PMCID: PMC8630030 DOI: 10.1038/s41467-021-27222-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 04/08/2021] [Accepted: 11/02/2021] [Indexed: 01/14/2023] Open
Abstract
Each year vast international resources are wasted on irreproducible research. The scientific community has been slow to adopt standard software engineering practices, despite the increases in high-dimensional data, complexities of workflows, and computational environments. Here we show how scientific software applications can be created in a reproducible manner when simple design goals for reproducibility are met. We describe the implementation of a test server framework and 40 scientific benchmarks, covering numerous applications in Rosetta bio-macromolecular modeling. High performance computing cluster integration allows these benchmarks to run continuously and automatically. Detailed protocol captures are useful for developers and users of Rosetta and other macromolecular modeling tools. The framework and design concepts presented here are valuable for developers and users of any type of scientific software and for the scientific community to create reproducible methods. Specific examples highlight the utility of this framework, and the comprehensive documentation illustrates the ease of adding new tests in a matter of hours.
|
Research Support, N.I.H., Extramural |
4 |
8 |
18
|
Buller AR, Labonte JW, Freeman MF, Wright NT, Schildbach JF, Townsend CA. Autoproteolytic activation of ThnT results in structural reorganization necessary for substrate binding and catalysis. J Mol Biol 2012; 422:508-18. [PMID: 22706025 PMCID: PMC3428426 DOI: 10.1016/j.jmb.2012.06.012] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 04/02/2012] [Revised: 06/02/2012] [Accepted: 06/08/2012] [Indexed: 11/01/2022]
Abstract
cis-Autoproteolysis is a post-translational modification necessary for the function of ThnT, an enzyme involved in the biosynthesis of the β-lactam antibiotic thienamycin. This modification generates an N-terminal threonine nucleophile that is used to hydrolyze the pantetheinyl moiety of its natural substrate. We determined the crystal structure of autoactivated ThnT to 1.8Å through X-ray crystallography. Comparison to a mutationally inactivated precursor structure revealed several large conformational rearrangements near the active site. To probe the relevance of these transitions, we designed a pantetheine-like chloromethyl ketone inactivator and co-crystallized it with ThnT. Although this class of inhibitor has been in use for several decades, the mode of inactivation had not been determined for an enzyme that uses an N-terminal nucleophile. The co-crystal structure revealed the chloromethyl ketone bound to the N-terminal nucleophile of ThnT through an ether linkage, and analysis suggests inactivation through a direct displacement mechanism. More importantly, this inactivated complex shows that three regions of ThnT that are critical to the formation of the substrate binding pocket undergo rearrangement upon autoproteolysis. Comparison of ThnT with other autoproteolytic enzymes of disparate evolutionary lineage revealed a high degree of similarity within the proenzyme active site, reflecting shared chemical constraints. However, after autoproteolysis, many enzymes, like ThnT, are observed to rearrange in order to accommodate their specific substrate. We propose that this is a general phenomenon, whereby autoprocessing systems with shared chemistry may possess similar structural features that dissipate upon rearrangement into a mature state.
|
Research Support, N.I.H., Extramural |
13 |
8 |
19
|
Firnberg E, Labonte JW, Gray JJ, Ostermeier M. A Comprehensive, High-Resolution Map of a Gene's Fitness Landscape. Mol Biol Evol 2016; 33:1378. [PMID: 26912810 PMCID: PMC4839222 DOI: 10.1093/molbev/msw021] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Indexed: 11/20/2022] Open
|
Published Erratum |
9 |
7 |
20
|
Mahajan SP, Srinivasan Y, Labonte JW, DeLisa MP, Gray JJ. Structural basis for peptide substrate specificities of glycosyltransferase GalNAc-T2. ACS Catal 2021; 11:2977-2991. [PMID: 34322281 DOI: 10.1021/acscatal.0c04609] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 12/22/2022]
Abstract
The polypeptide N-acetylgalactosaminyl transferase (GalNAc-T) enzyme family initiates O-linked mucin-type glycosylation. The family constitutes 20 isoenzymes in humans. GalNAc-Ts exhibit both redundancy and finely tuned specificity for a wide range of peptide substrates. In this work, we deciphered the sequence and structural motifs that determine the peptide substrate preferences for the GalNAc-T2 isoform. Our approach involved sampling and characterization of peptide-enzyme conformations obtained from Rosetta Monte Carlo-minimization-based flexible docking. We computationally scanned 19 amino acid residues at positions -1 and +1 of an eight-residue peptide substrate, which comprised a dataset of 361 (19x19) peptides with previously characterized experimental GalNAc-T2 glycosylation efficiencies. The calculations recapitulated experimental specificity data, successfully discriminating between glycosylatable and non-glycosylatable peptides with a probability of 96.5% (ROC-AUC score), a balanced accuracy of 85.5% and a false positive rate of 7.3%. The glycosylatable peptide substrates viz. peptides with proline, serine, threonine, and alanine at the -1 position of the peptide preferentially exhibited cognate sequon-like conformations. The preference for specific residues at the -1 position of the peptide was regulated by enzyme residues R362, K363, Q364, H365 and W331, which modulate the pocket size and specific enzyme-peptide interactions. For the +1 position of the peptide, enzyme residues K281 and K363 formed gating interactions with aromatics and glutamines at the +1 position of the peptide, leading to modes of peptide-binding sub-optimal for catalysis. Overall, our work revealed enzyme features that lead to the finely tuned specificity observed for a broad range of peptide substrates for the GalNAc-T2 enzyme. We anticipate that the key sequence and structural motifs can be extended to analyze specificities of other isoforms of the GalNAc-T family and can be used to guide design of variants with tailored specificity.
|
Journal Article |
4 |
6 |
21
|
Le KH, Adolf-Bryfogle J, Klima JC, Lyskov S, Labonte J, Bertolani S, Burman SSR, Leaver-Fay A, Weitzner B, Maguire J, Rangan R, Adrianowycz MA, Alford RF, Adal A, Nance ML, Wu Y, Willis J, Kulp DW, Das R, Dunbrack RL, Schief W, Kuhlman B, Siegel JB, Gray JJ. PyRosetta Jupyter Notebooks Teach Biomolecular Structure Prediction and Design. Biophysicist (Rockville, Md.) 2021; 2:108-122. [PMID: 35128343 DOI: 10.35459/tbp.2019.000147] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 12/27/2022]
Abstract
Biomolecular structure drives function, and computational capabilities have progressed such that the prediction and computational design of biomolecular structures is increasingly feasible. Because computational biophysics attracts students from many different backgrounds and with different levels of resources, teaching the subject can be challenging. One strategy to teach diverse learners is with interactive multimedia material that promotes self-paced, active learning. We have created a hands-on education strategy with a set of sixteen modules that teach topics in biomolecular structure and design, from fundamentals of conformational sampling and energy evaluation to applications like protein docking, antibody design, and RNA structure prediction. Our modules are based on PyRosetta, a Python library that encapsulates all computational modules and methods in the Rosetta software package. The workshop-style modules are implemented as Jupyter Notebooks that can be executed in the Google Colaboratory, allowing learners access with just a web browser. The digital format of Jupyter Notebooks allows us to embed images, molecular visualization movies, and interactive coding exercises. This multimodal approach may better reach students from different disciplines and experience levels as well as attract more researchers from smaller labs and cognate backgrounds to leverage PyRosetta in their science and engineering research. All materials are freely available at https://github.com/RosettaCommons/PyRosetta.notebooks.
|
|
4 |
5 |
22
|
Labonte JW, Kudo F, Freeman MF, Raber ML, Townsend CA. Engineering the synthetic potential of β-lactam synthetase and the importance of catalytic loop dynamics. MedChemComm 2012; 3:960-966. [PMID: 23616913 DOI: 10.1039/c2md00305h] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 01/20/2023]
Abstract
The 2-azetidinone ring of the Class A and D β-lactamase inhibitor clavulanic acid (1) is synthesized by the ATP-utilizing enzyme β-lactam synthetase (βLS). A hydroxyethyl group attached to C-6 of 1 in the (S) configuration markedly enhances the efficacy of this compound against Class C β-lactamases. Guided by a series of X-ray structures of βLS, we have engineered this enzyme to act upon a methylated substrate analogue to give selectively the (3S)-methyl β-lactam core, which, upon closure of the second ring of the bicyclic system of 1, would lead to the (6S)-methylated clavulanic acid derivative.
|
|
13 |
3 |
23
|
Alford RF, Leaver-Fay A, Jeliazkov JR, O'Meara MJ, DiMaio FP, Park H, Shapovalov MV, Renfrew PD, Mulligan VK, Kappel K, Labonte JW, Pacella MS, Bonneau R, Bradley P, Dunbrack RL, Das R, Baker D, Kuhlman B, Kortemme T, Gray JJ. Correction to "The Rosetta All-Atom Energy Function for Macromolecular Modeling and Design". J Chem Theory Comput 2022; 18:4594. [PMID: 35667008 DOI: 10.1021/acs.jctc.2c00500] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/28/2022]
|
Published Erratum |
3 |
2 |
24
|
Crone KK, Labonte JW, Elias MH, Freeman MF. α-N-Methyltransferase regiospecificity is mediated by proximal, redundant enzyme-substrate interactions. Protein Sci 2025; 34:e70021. [PMID: 39840790 PMCID: PMC11751858 DOI: 10.1002/pro.70021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2024] [Revised: 11/15/2024] [Accepted: 12/15/2024] [Indexed: 01/23/2025]
Abstract
N-Methylation of the peptide backbone confers pharmacologically beneficial characteristics to peptides that include greater membrane permeability and resistance to proteolytic degradation. The borosin family of ribosomally synthesized and post-translationally modified peptides offer a post-translational route to install amide backbone α-N-methylations. Previous work has elucidated the substrate scope and engineering potential of two examples of type I borosins, which feature autocatalytic precursors that encode N-methyltransferases that methylate their own C-termini in trans. We recently reported the first discrete N-methyltransferase and precursor peptide from Shewanella oneidensis MR-1, a minimally iterative, type IV borosin that allowed the first detailed kinetic analyses of borosin N-methyltransferases. Herein, we characterize the substrate scope and resilient regiospecificity of this discrete N-methyltransferase by comparison of relative rates and methylation patterns of over 40 precursor peptide variants along with structure analyses of nine enzyme-substrate complexes. Sequences critical to methylation are identified and demonstrated in assaying minimal peptide substrates and non-native peptide sequences for assessment of secondary structure requirements and engineering potential. This work grants understanding towards the mechanism of substrate recognition and iterative activity by discrete borosin N-methyltransferases.
|
research-article |
1 |
|
25
|
Adolf-Bryfogle J, Labonte JW, Kraft JC, Shapovalov M, Raemisch S, Lütteke T, DiMaio F, Bahl CD, Pallesen J, King NP, Gray JJ, Kulp DW, Schief WR. Growing Glycans in Rosetta: Accurate de novo glycan modeling, density fitting, and rational sequon design. PLoS Comput Biol 2024; 20:e1011895. [PMID: 38913746 PMCID: PMC11288642 DOI: 10.1371/journal.pcbi.1011895] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2023] [Revised: 07/30/2024] [Accepted: 02/06/2024] [Indexed: 06/26/2024] Open
Abstract
Carbohydrates and glycoproteins modulate key biological functions. However, experimental structure determination of sugar polymers is notoriously difficult. Computational approaches can aid in carbohydrate structure prediction, structure determination, and design. In this work, we developed a glycan-modeling algorithm, GlycanTreeModeler, that computationally builds glycans layer-by-layer, using adaptive kernel density estimates (KDE) of common glycan conformations derived from data in the Protein Data Bank (PDB) and from quantum mechanics (QM) calculations. GlycanTreeModeler was benchmarked on a test set of glycan structures of varying lengths, or "trees". Structures predicted by GlycanTreeModeler agreed with native structures at high accuracy for both de novo modeling and experimental density-guided building. We employed these tools to design de novo glycan trees into a protein nanoparticle vaccine to shield regions of the scaffold from antibody recognition, and experimentally verified shielding. This work will inform glycoprotein model prediction, glycan masking, and further aid computational methods in experimental structure determination and refinement.
|
research-article |
1 |
|