1
Randolph NZ, Kuhlman B. Invariant point message passing for protein side chain packing. Proteins 2024. [PMID: 38790143 DOI: 10.1002/prot.26705] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Revised: 04/19/2024] [Accepted: 05/13/2024] [Indexed: 05/26/2024]
Abstract
Protein side chain packing (PSCP) is a fundamental problem in the field of protein engineering, as high-confidence and low-energy conformations of amino acid side chains are crucial for understanding (and designing) protein folding, protein-protein interactions, and protein-ligand interactions. Traditional PSCP methods (such as the Rosetta Packer) often rely on a library of discrete side chain conformations, or rotamers, and a forcefield to guide the structure to low-energy conformations. Recently, deep learning (DL) based methods (such as DLPacker, AttnPacker, and DiffPack) have demonstrated state-of-the-art predictions and speed in the PSCP task. Building off the success of geometric graph neural networks for protein modeling, we present the Protein Invariant Point Packer (PIPPack) which effectively processes local structural and sequence information to produce realistic, idealized side chain coordinates using χ-angle distribution predictions and geometry-aware invariant point message passing (IPMP). On a test set of ∼1400 high-quality protein chains, PIPPack is highly competitive with other state-of-the-art PSCP methods in rotamer recovery and per-residue RMSD but is significantly faster.
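The rotamer recovery metric named in the abstract above can be illustrated with a minimal Python sketch. It is not taken from PIPPack: the function names, the 20° tolerance, and the rule that every χ angle of a residue must match are assumptions chosen for illustration. The sketch also ignores symmetric side chains (for example, a 180° flip of phenylalanine χ2), which real evaluations usually handle separately.

```python
def angle_diff_deg(a, b):
    """Smallest absolute difference between two angles in degrees (periodic in 360)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def residue_recovered(pred_chis, native_chis, tol_deg=20.0):
    """True if every predicted chi angle is within tol_deg of the native one.

    tol_deg=20.0 is an assumed, commonly used tolerance, not a value from the abstract.
    """
    if len(pred_chis) != len(native_chis):
        raise ValueError("predicted and native residues must have the same number of chi angles")
    return all(angle_diff_deg(p, n) <= tol_deg for p, n in zip(pred_chis, native_chis))


def rotamer_recovery(pred, native, tol_deg=20.0):
    """Fraction of residues whose chi angles are all recovered.

    pred and native are lists of per-residue chi-angle lists (degrees).
    Residues without chi angles (for example Gly, Ala) are skipped.
    """
    if len(pred) != len(native):
        raise ValueError("pred and native must describe the same residues")
    scored = [(p, n) for p, n in zip(pred, native) if n]
    if not scored:
        raise ValueError("no residues with chi angles to score")
    hits = sum(residue_recovered(p, n, tol_deg) for p, n in scored)
    return hits / len(scored)


if __name__ == "__main__":
    # Toy, made-up angles for two residues; not data from the paper.
    predicted = [[60.0, 179.0], [-60.0]]
    native = [[65.0, -179.0], [60.0]]
    print(rotamer_recovery(predicted, native))
```

The periodic difference matters because χ angles of 179° and -179° are only 2° apart, even though their naive difference is 358°.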
Affiliation(s)
- Nicholas Z Randolph
  - Department of Bioinformatics and Computational Biology, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
  - Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
- Brian Kuhlman
  - Department of Bioinformatics and Computational Biology, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
  - Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA

2
Dong HL, Zhang C, Dai L, Zhang Y, Zhang XH, Tan ZJ. The origin of different bending stiffness between double-stranded RNA and DNA revealed by magnetic tweezers and simulations. Nucleic Acids Res 2024; 52:2519-2529. [PMID: 38321947 PMCID: PMC10954459 DOI: 10.1093/nar/gkae063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2023] [Revised: 01/16/2024] [Accepted: 01/19/2024] [Indexed: 02/08/2024] Open
Abstract
The subtle differences in the chemical structures of double-stranded (ds) RNA and DNA lead to significant variations in their biological roles and medical implications, largely due to their distinct biophysical properties, such as bending stiffness. Although it is well known that A-form dsRNA is stiffer than B-form dsDNA under physiological salt conditions, the underlying cause of this difference remains unclear. In this study, we employ high-precision magnetic-tweezer experiments along with molecular dynamics simulations and reveal that the relative bending stiffness between dsRNA and dsDNA is primarily determined by the structure- and salt-concentration-dependent ion distribution around their helical structures. At near-physiological salt conditions, dsRNA shows a sparser ion distribution surrounding its phosphate groups compared to dsDNA, causing its greater stiffness. However, at very high monovalent salt concentrations, phosphate groups in both dsRNA and dsDNA become fully neutralized by excess ions, resulting in a similar intrinsic bending persistence length of approximately 39 nm. This similarity in intrinsic bending stiffness of dsRNA and dsDNA is coupled to the analogous fluctuations in their total groove widths and further coupled to the similar fluctuation of base-pair inclination, despite their distinct A-form and B-form helical structures.
Affiliation(s)
- Hai-Long Dong
  - School of Physics and Technology, College of Life Sciences, Renmin Hospital of Wuhan University, Wuhan University, Wuhan 430072, China
- Chen Zhang
  - School of Physics and Technology, College of Life Sciences, Renmin Hospital of Wuhan University, Wuhan University, Wuhan 430072, China
- Liang Dai
  - Department of Physics, City University of Hong Kong, Hong Kong 999077, China
- Yan Zhang
  - Department of Clinical Laboratory, Renmin Hospital of Wuhan University, Wuhan 430072, China
- Xing-Hua Zhang
  - School of Physics and Technology, College of Life Sciences, Renmin Hospital of Wuhan University, Wuhan University, Wuhan 430072, China
- Zhi-Jie Tan
  - School of Physics and Technology, College of Life Sciences, Renmin Hospital of Wuhan University, Wuhan University, Wuhan 430072, China

3
Guerin N, Childs H, Zhou P, Donald BR. DexDesign: A new OSPREY-based algorithm for designing de novo D-peptide inhibitors. bioRxiv 2024:2024.02.12.579944. [PMID: 38405797 PMCID: PMC10888900 DOI: 10.1101/2024.02.12.579944] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/27/2024]
Abstract
With over 270 unique occurrences in the human genome, peptide-recognizing PDZ domains play a central role in modulating polarization, signaling, and trafficking pathways. Mutations in PDZ domains lead to diseases such as cancer and cystic fibrosis, making PDZ domains attractive targets for therapeutic intervention. D-peptide inhibitors offer unique advantages as therapeutics, including increased metabolic stability and low immunogenicity. Here, we introduce DexDesign, a novel OSPREY-based algorithm for computationally designing de novo D-peptide inhibitors. DexDesign leverages three novel techniques that are broadly applicable to computational protein design: the Minimum Flexible Set, K*-based Mutational Scan, and Inverse Alanine Scan, which enable exponential reductions in the size of the peptide sequence search space. We apply these techniques and DexDesign to generate novel D-peptide inhibitors of two biomedically important PDZ domain targets: CAL and MAST2. We introduce a new framework for analyzing de novo peptides (evaluation along a replication/restitution axis) and apply it to the DexDesign-generated D-peptides. Notably, the peptides we generated are predicted to bind their targets tighter than their targets' endogenous ligands, validating the peptides' potential as lead therapeutic candidates. We provide an implementation of DexDesign in the free and open source computational protein design software OSPREY.
4
Guerin N, Childs H, Zhou P, Donald BR. DexDesign: an OSPREY-based algorithm for designing de novo D-peptide inhibitors. Protein Eng Des Sel 2024; 37:gzae007. [PMID: 38757573 PMCID: PMC11099876 DOI: 10.1093/protein/gzae007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/03/2023] [Revised: 04/17/2024] [Indexed: 05/18/2024] Open
Abstract
With over 270 unique occurrences in the human genome, peptide-recognizing PDZ domains play a central role in modulating polarization, signaling, and trafficking pathways. Mutations in PDZ domains lead to diseases such as cancer and cystic fibrosis, making PDZ domains attractive targets for therapeutic intervention. D-peptide inhibitors offer unique advantages as therapeutics, including increased metabolic stability and low immunogenicity. Here, we introduce DexDesign, a novel OSPREY-based algorithm for computationally designing de novo D-peptide inhibitors. DexDesign leverages three novel techniques that are broadly applicable to computational protein design: the Minimum Flexible Set, K*-based Mutational Scan, and Inverse Alanine Scan. We apply these techniques and DexDesign to generate novel D-peptide inhibitors of two biomedically important PDZ domain targets: CAL and MAST2. We introduce a framework for analyzing de novo peptides (evaluation along a replication/restitution axis) and apply it to the DexDesign-generated D-peptides. Notably, the peptides we generated are predicted to bind their targets tighter than their targets' endogenous ligands, validating the peptides' potential as lead inhibitors. We also provide an implementation of DexDesign in the free and open source computational protein design software OSPREY.
Affiliation(s)
- Nathan Guerin
  - Department of Computer Science, Duke University, 308 Research Drive, Durham, NC 27708, United States
- Henry Childs
  - Department of Chemistry, Duke University, 124 Science Drive, Durham, NC 27708, United States
- Pei Zhou
  - Department of Biochemistry, Duke University School of Medicine, 307 Research Drive, Durham, NC 22710, United States
- Bruce R Donald
  - Department of Computer Science, Duke University, 308 Research Drive, Durham, NC 27708, United States
  - Department of Chemistry, Duke University, 124 Science Drive, Durham, NC 27708, United States
  - Department of Biochemistry, Duke University School of Medicine, 307 Research Drive, Durham, NC 22710, United States
  - Department of Mathematics, Duke University, 120 Science Drive, Durham, NC 27708, United States

5
Hong SY, Yoon J, An YJ, Lee S, Cha HG, Pandey A, Yoo YJ, Joo JC. Statistical Analysis of the Role of Cavity Flexibility in Thermostability of Proteins. Polymers (Basel) 2024; 16:291. [PMID: 38276699 PMCID: PMC10819066 DOI: 10.3390/polym16020291] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2023] [Revised: 01/14/2024] [Accepted: 01/17/2024] [Indexed: 01/27/2024] Open
Abstract
Conventional statistical investigations have primarily focused on the comparison of the simple one-dimensional characteristics of protein cavities, such as number, surface area, and volume. These studies have failed to discern the crucial distinctions in cavity properties between thermophilic and mesophilic proteins that contribute to protein thermostability. In this study, the significance of cavity properties, i.e., flexibility and location, in protein thermostability was investigated by comparing structural differences between homologous thermophilic and mesophilic proteins. Three dimensions of protein structure were categorized into three regions (core, boundary, and surface) and a comparative analysis of cavity properties using this structural index was conducted. The statistical analysis revealed that cavity flexibility is closely related to protein thermostability. The core cavities of thermophilic proteins were less flexible than those of mesophilic proteins (averaged B' factor values, -0.6484 and -0.5111), which might be less deleterious to protein thermostability. Thermophilic proteins exhibited fewer cavities in the boundary and surface regions. Notably, cavities in mesophilic proteins, across all regions, exhibited greater flexibility than those in thermophilic proteins (>95% probability). The increased flexibility of cavities in the boundary and surface regions of mesophilic proteins, as opposed to thermophilic proteins, may compromise stability. Recent protein engineering investigations involving mesophilic xylanase and protease showed results consistent with the findings of this study, suggesting that the manipulation of flexible cavities in the surface region can enhance thermostability. Consequently, our findings suggest that a rational or computational approach to the design of flexible cavities in surface or boundary regions could serve as an effective strategy to enhance the thermostability of mesophilic proteins.
Affiliation(s)
- So Yeon Hong
  - Department of Chemical and Biological Engineering, Inha Technical College, Inha-ro 100, Michuhol-gu, Incheon 22212, Republic of Korea
- Jihyun Yoon
  - Department of Biotechnology, The Catholic University of Korea, Bucheon-si 14662, Republic of Korea
- Young Joo An
  - Department of Biotechnology, The Catholic University of Korea, Bucheon-si 14662, Republic of Korea
- Siseon Lee
  - Department of Biotechnology, The Catholic University of Korea, Bucheon-si 14662, Republic of Korea
- Haeng-Geun Cha
  - Department of Biotechnology, The Catholic University of Korea, Bucheon-si 14662, Republic of Korea
- Ashutosh Pandey
  - Institute for Water and Wastewater Technology, Durban University of Technology, 19 Steve Biko Road, Durban 4000, South Africa
  - Department of Biotechnology, Faculty of Life Science and Technology, AKS University, Satna 485001, Madhya Pradesh, India
- Young Je Yoo
  - School of Chemical and Biological Engineering, Seoul National University, Seoul 08826, Republic of Korea
- Jeong Chan Joo
  - Department of Biotechnology, The Catholic University of Korea, Bucheon-si 14662, Republic of Korea

6
Samanta R, Gray JJ. Implicit model to capture electrostatic features of membrane environment. PLoS Comput Biol 2024; 20:e1011296. [PMID: 38252688 PMCID: PMC10833867 DOI: 10.1371/journal.pcbi.1011296] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2023] [Revised: 02/01/2024] [Accepted: 12/13/2023] [Indexed: 01/24/2024] Open
Abstract
Membrane protein structure prediction and design are challenging due to the complexity of capturing the interactions in the lipid layer, such as those arising from electrostatics. Accurately capturing electrostatic energies in the low-dielectric membrane often requires expensive Poisson-Boltzmann calculations that are not scalable for membrane protein structure prediction and design. In this work, we have developed a fast-to-compute implicit energy function that considers the realistic characteristics of different lipid bilayers, making design calculations tractable. This method captures the impact of the lipid head group using a mean-field-based approach and uses a depth-dependent dielectric constant to characterize the membrane environment. This energy function Franklin2023 (F23) is built upon Franklin2019 (F19), which is based on experimentally derived hydrophobicity scales in the membrane bilayer. We evaluated the performance of F23 on five different tests probing (1) protein orientation in the bilayer, (2) stability, and (3) sequence recovery. Relative to F19, F23 has improved the calculation of the tilt angle of membrane proteins for 90% of WALP peptides, 15% of TM-peptides, and 25% of the adsorbed peptides. The performances for stability and design tests were equivalent for F19 and F23. The speed and calibration of the implicit model will help F23 access biophysical phenomena at long time and length scales and accelerate the membrane protein design pipeline.
Affiliation(s)
- Rituparna Samanta
  - Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland, United States of America
- Jeffrey J. Gray
  - Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland, United States of America
  - Program in Molecular Biophysics, Johns Hopkins University, Baltimore, Maryland, United States of America
  - Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins School of Medicine, Baltimore, Maryland, United States of America

7
Randolph NZ, Kuhlman B. Invariant point message passing for protein side chain packing. bioRxiv 2023:2023.08.03.551328. [PMID: 38187664 PMCID: PMC10769188 DOI: 10.1101/2023.08.03.551328] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2024]
Abstract
Protein side chain packing (PSCP) is a fundamental problem in the field of protein engineering, as high-confidence and low-energy conformations of amino acid side chains are crucial for understanding (and designing) protein folding, protein-protein interactions, and protein-ligand interactions. Traditional PSCP methods (such as the Rosetta Packer) often rely on a library of discrete side chain conformations, or rotamers, and a forcefield to guide the structure to low-energy conformations. Recently, deep learning (DL) based methods (such as DLPacker, AttnPacker, and DiffPack) have demonstrated state-of-the-art predictions and speed in the PSCP task. Building off the success of geometric graph neural networks for protein modeling, we present the Protein Invariant Point Packer (PIPPack) which effectively processes local structural and sequence information to produce realistic, idealized side chain coordinates using χ-angle distribution predictions and geometry-aware invariant point message passing (IPMP). On a test set of ~1,400 high-quality protein chains, PIPPack is highly competitive with other state-of-the-art PSCP methods in rotamer recovery and per-residue RMSD but is significantly faster.
Affiliation(s)
- Nicholas Z Randolph
  - Department of Bioinformatics and Computational Biology, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
  - Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
- Brian Kuhlman
  - Department of Bioinformatics and Computational Biology, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
  - Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA

8
Vaissier Welborn V. Understanding Cysteine Reactivity in Protein Environments with Electric Fields. J Phys Chem B 2023; 127:9936-9942. [PMID: 37962274 DOI: 10.1021/acs.jpcb.3c05749] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]
Abstract
The role cysteine residues play in proteins is mediated by their protonation state, whereby the thiolate form of the side chain is highly reactive while the thiol form is more inert. However, the pKa of cysteine residues is hard to predict as it can differ widely from its reference value in solution, an effect that is accentuated by local effects in the heterogeneous protein environment. Here, we present a new approach to the prediction of cysteine reactivity based on electric field calculations at the thiol/thiolate group. We validated our approach by predicting the protonation state of cysteine residues in different protein environments (in the active site, at the protein surface, and buried within the protein interior), including Cys-25 in papaya protease omega, which was proven problematic for the more traditional constant pH molecular dynamics (MD) technique. We predict pKa shifts consistent with experimental observations, and the decomposition of the electric fields into contributions from molecular fragments provides a direct handle to rationalize local pH and pKa effects in proteins without introducing parameters other than those of the force field used for MD simulations.
Affiliation(s)
- Valerie Vaissier Welborn
  - Department of Chemistry, Virginia Tech, Blacksburg, Virginia 24060, United States
  - Macromolecules Innovation Institute (MII), Virginia Tech, Blacksburg, Virginia 24060, United States

9
Khakzad H, Igashov I, Schneuing A, Goverde C, Bronstein M, Correia B. A new age in protein design empowered by deep learning. Cell Syst 2023; 14:925-939. [PMID: 37972559 DOI: 10.1016/j.cels.2023.10.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2023] [Revised: 06/22/2023] [Accepted: 10/11/2023] [Indexed: 11/19/2023]
Abstract
The rapid progress in the field of deep learning has had a significant impact on protein design. Deep learning methods have recently produced a breakthrough in protein structure prediction, leading to the availability of high-quality models for millions of proteins. Along with novel architectures for generative modeling and sequence analysis, they have revolutionized the protein design field in the past few years remarkably by improving the accuracy and ability to identify novel protein sequences and structures. Deep neural networks can now learn and extract the fundamental features of protein structures, predict how they interact with other biomolecules, and have the potential to create new effective drugs for treating disease. As their applicability in protein design is rapidly growing, we review the recent developments and technology in deep learning methods and provide examples of their performance to generate novel functional proteins.
Affiliation(s)
- Hamed Khakzad
  - Université de Lorraine, CNRS, Inria, LORIA, 54000 Nancy, France; École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
- Ilia Igashov
  - École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
- Arne Schneuing
  - École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
- Casper Goverde
  - École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
- Bruno Correia
  - École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland; Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland

10
Lategan FA, Schreiber C, Patterton HG. SeqPredNN: a neural network that generates protein sequences that fold into specified tertiary structures. BMC Bioinformatics 2023; 24:373. [PMID: 37789284 PMCID: PMC10546711 DOI: 10.1186/s12859-023-05498-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2022] [Accepted: 09/25/2023] [Indexed: 10/05/2023] Open
Abstract
BACKGROUND: The relationship between the sequence of a protein, its structure, and the resulting connection between its structure and function, is a foundational principle in biological science. Only recently has the computational prediction of protein structure based only on protein sequence been addressed effectively by AlphaFold, a neural network approach that can predict the majority of protein structures with X-ray crystallographic accuracy. A question that is now of acute relevance is the "inverse protein folding problem": predicting the sequence of a protein that folds into a specified structure. This will be of immense value in protein engineering and biotechnology, and will allow the design and expression of recombinant proteins that can, for instance, fold into specified structures as a scaffold for the attachment of recombinant antigens, or enzymes with modified or novel catalytic activities. Here we describe the development of SeqPredNN, a feed-forward neural network trained with X-ray crystallographic structures from the RCSB Protein Data Bank to predict the identity of amino acids in a protein structure using only the relative positions, orientations, and backbone dihedral angles of nearby residues.
RESULTS: We predict the sequence of a protein expected to fold into a specified structure and assess the accuracy of the prediction using both AlphaFold and RoseTTAFold to computationally generate the fold of the derived sequence. We show that the sequences predicted by SeqPredNN fold into a structure with a median TM-score of 0.638 when compared to the crystal structure according to AlphaFold predictions, yet these sequences are unique and only 28.4% identical to the sequence of the crystallized protein.
CONCLUSIONS: We propose that SeqPredNN will be a valuable tool to generate proteins of defined structure for the design of novel biomaterials, pharmaceuticals, catalysts, and reporter systems. The low sequence identity of its predictions compared to the native sequence could prove useful for developing proteins with modified physical properties, such as water solubility and thermal stability. The speed and ease of use of SeqPredNN offers a significant advantage over physics-based protein design methods.
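The sequence identity figure quoted in the abstract above (28.4% identical to the native sequence) is a position-by-position comparison. A minimal Python sketch of that calculation follows. It is not SeqPredNN code: the function name and the toy sequences are hypothetical, and the sketch assumes the designed and native sequences have the same length and share residue numbering, as is typical when a sequence is predicted for a fixed backbone.

```python
def percent_identity(designed, native):
    """Percentage of positions at which two equal-length sequences carry the same residue.

    Assumes the sequences are already position-aligned (same backbone, no gaps).
    """
    if len(designed) != len(native):
        raise ValueError("sequences must have the same length")
    if not native:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(designed, native))
    return 100.0 * matches / len(native)


if __name__ == "__main__":
    # Made-up five-residue example, not data from the paper.
    print(percent_identity("MKTAY", "MKSAF"))
```

Sequences of different lengths would first need a pairwise alignment, which this sketch deliberately leaves out.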
Affiliation(s)
- F Adriaan Lategan
  - Center for Bioinformatics and Computational Biology, Stellenbosch University, Stellenbosch, 7600, South Africa
- Caroline Schreiber
  - Center for Bioinformatics and Computational Biology, Stellenbosch University, Stellenbosch, 7600, South Africa
- Hugh G Patterton
  - Center for Bioinformatics and Computational Biology, Stellenbosch University, Stellenbosch, 7600, South Africa

11
Yan J, Li S, Zhang Y, Hao A, Zhao Q. ZetaDesign: an end-to-end deep learning method for protein sequence design and side-chain packing. Brief Bioinform 2023; 24:bbad257. [PMID: 37429578 DOI: 10.1093/bib/bbad257] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 03/28/2023] [Revised: 06/05/2023] [Accepted: 06/21/2023] [Indexed: 07/12/2023] Open
Abstract
Computational protein design has been demonstrated to be the most powerful tool in the last few years among protein designing and repacking tasks. In practice, these two tasks are strongly related but often treated separately. Besides, state-of-the-art deep-learning-based methods cannot provide interpretability from an energy perspective, affecting the accuracy of the design. Here we propose a new systematic approach, including both a posterior probability and a joint probability parts, to solve the two essential questions once for all. This approach takes the physicochemical property of amino acids into consideration and uses the joint probability model to ensure the convergence between structure and amino acid type. Our results demonstrated that this method could generate feasible, high-confidence sequences with low-energy side conformations. The designed sequences can fold into target structures with high confidence and maintain relatively stable biochemical properties. The side chain conformation has a significantly lower energy landscape without delegating to a rotamer library or performing the expensive conformational searches. Overall, we propose an end-to-end method that combines the advantages of both deep learning and energy-based methods. The design results of this model demonstrate high efficiency, and precision, as well as a low energy state and good interpretability.
Affiliation(s)
- Junyu Yan
  - State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China
- Shuai Li
  - State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China
- Ying Zhang
  - The Key Laboratory of Cell Proliferation and Regulation Biology, Ministry of Education, College of Life Sciences, Beijing Normal University, Beijing, China
- Aimin Hao
  - State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China
- Qinping Zhao
  - State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China

12
Samanta R, Gray JJ. Implicit model to capture electrostatic features of membrane environment. bioRxiv 2023:2023.06.26.546486. [PMID: 37425950 PMCID: PMC10327106 DOI: 10.1101/2023.06.26.546486] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/11/2023]
Abstract
Membrane protein structure prediction and design are challenging due to the complexity of capturing the interactions in the lipid layer, such as those arising from electrostatics. Accurately capturing electrostatic energies in the low-dielectric membrane often requires expensive Poisson-Boltzmann calculations that are not scalable for membrane protein structure prediction and design. In this work, we have developed a fast-to-compute implicit energy function that considers the realistic characteristics of different lipid bilayers, making design calculations tractable. This method captures the impact of the lipid head group using a mean-field-based approach and uses a depth-dependent dielectric constant to characterize the membrane environment. This energy function Franklin2023 (F23) is built upon Franklin2019 (F19), which is based on experimentally derived hydrophobicity scales in the membrane bilayer. We evaluated the performance of F23 on five different tests probing (1) protein orientation in the bilayer, (2) stability, and (3) sequence recovery. Relative to F19, F23 has improved the calculation of the tilt angle of membrane proteins for 90% of WALP peptides, 15% of TM-peptides, and 25% of the adsorbed peptides. The performances for stability and design tests were equivalent for F19 and F23. The speed and calibration of the implicit model will help F23 access biophysical phenomena at long time and length scales and accelerate the membrane protein design pipeline.
Affiliation(s)
- Rituparna Samanta
  - Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland 21218, United States
- Jeffrey J Gray
  - Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland 21218, United States
  - Program in Molecular Biophysics, Johns Hopkins University, Baltimore, Maryland 21218, United States
  - Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins School of Medicine, Baltimore, Maryland, United States

13
Ali M, Khramushin A, Yadav VK, Schueler-Furman O, Ivarsson Y. Elucidation of Short Linear Motif-Based Interactions of the FERM Domains of Ezrin, Radixin, Moesin, and Merlin. Biochemistry 2023. [PMID: 37224425 DOI: 10.1021/acs.biochem.3c00096] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 05/26/2023]
Abstract
The ERM (ezrin, radixin, and moesin) family of proteins and the related protein merlin participate in scaffolding and signaling events at the cell cortex. The proteins share an N-terminal FERM [band four-point-one (4.1) ERM] domain composed of three subdomains (F1, F2, and F3) with binding sites for short linear peptide motifs. By screening the FERM domains of the ERMs and merlin against a phage library that displays peptides representing the intrinsically disordered regions of the human proteome, we identified a large number of novel ligands. We determined the affinities for the ERM and merlin FERM domains interacting with 18 peptides and validated interactions with full-length proteins through pull-down experiments. The majority of the peptides contained an apparent Yx[FILV] motif; others show alternative motifs. We defined distinct binding sites for two types of similar but distinct binding motifs (YxV and FYDF) using a combination of Rosetta FlexPepDock computational peptide docking protocols and mutational analysis. We provide a detailed molecular understanding of how the two types of peptides with distinct motifs bind to different sites on the moesin FERM phosphotyrosine binding-like subdomain and uncover interdependencies between the different types of ligands. The study expands the motif-based interactomes of the ERMs and merlin and suggests that the FERM domain acts as a switchable interaction hub.
Affiliation(s)
- Muhammad Ali
- Department of Chemistry - BMC, Uppsala University, Husargatan 3, 751 23 Uppsala, Sweden
| | - Alisa Khramushin
- Department of Microbiology and Molecular Genetics, Institute for Medical Research Israel-Canada, Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem 9112102, Israel
| | - Vikash K Yadav
- Department of Chemistry - BMC, Uppsala University, Husargatan 3, 751 23 Uppsala, Sweden
| | - Ora Schueler-Furman
- Department of Microbiology and Molecular Genetics, Institute for Medical Research Israel-Canada, Faculty of Medicine, The Hebrew University of Jerusalem, Jerusalem 9112102, Israel
| | - Ylva Ivarsson
- Department of Chemistry - BMC, Uppsala University, Husargatan 3, 751 23 Uppsala, Sweden
| |
|
14
|
Yang S, Gong W, Zhou T, Sun X, Chen L, Zhou W, Li C. emPDBA: protein-DNA binding affinity prediction by combining features from binding partners and interface learned with ensemble regression model. Brief Bioinform 2023:7165253. [PMID: 37193676 DOI: 10.1093/bib/bbad192] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2023] [Revised: 04/26/2023] [Accepted: 04/29/2023] [Indexed: 05/18/2023] Open
Abstract
Protein-deoxyribonucleic acid (DNA) interactions are important in a variety of biological processes. Accurately predicting protein-DNA binding affinity has been one of the most attractive and challenging issues in computational biology. However, the existing approaches still have much room for improvement. In this work, we propose an ensemble model for Protein-DNA Binding Affinity prediction (emPDBA), which combines six base models with one meta-model. The complexes are classified into four types based on the DNA structure (double-stranded or other forms) and the percentage of interface residues. For each type, emPDBA is trained with the sequence-based, structure-based and energy features from binding partners and complex structures. Through feature selection by the sequential forward selection method, it is found that there do exist considerable differences in the key factors contributing to intermolecular binding affinity. The complex classification is beneficial for the important feature extraction for binding affinity prediction. The performance comparison of our method with other peer ones on the independent testing dataset shows that emPDBA outperforms the state-of-the-art methods with the Pearson correlation coefficient of 0.53 and the mean absolute error of 1.11 kcal/mol. The comprehensive results demonstrate that our method has a good performance for protein-DNA binding affinity prediction. Availability and implementation: The source code is available at https://github.com/ChunhuaLiLab/emPDBA/.
Affiliation(s)
- Shuang Yang
- Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
| | - Weikang Gong
- Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
| | - Tong Zhou
- Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
| | - Xiaohan Sun
- Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
| | - Lei Chen
- Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
| | - Wenxue Zhou
- Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
| | - Chunhua Li
- Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
| |
|
15
|
Huang J, Xie X, Zheng Z, Ye L, Wang P, Xu L, Wu Y, Yan J, Yang M, Yan Y. De Novo Computational Design of a Lipase with Hydrolysis Activity towards Middle-Chained Fatty Acid Esters. Int J Mol Sci 2023; 24:ijms24108581. [PMID: 37239928 DOI: 10.3390/ijms24108581] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/20/2023] [Revised: 05/08/2023] [Accepted: 05/09/2023] [Indexed: 05/28/2023] Open
Abstract
Innovations in biocatalysts provide great prospects for intolerant environments or novel reactions. Due to the limited catalytic capacity and the long-term and labor-intensive characteristics of mining enzymes with the desired functions, de novo enzyme design was developed to obtain industrial application candidates in a rapid and convenient way. Here, based on the catalytic mechanisms and the known structures of proteins, we proposed a computational protein design strategy combining de novo enzyme design and laboratory-directed evolution. Starting with the theozyme constructed using a quantum-mechanical approach, the theoretical enzyme-skeleton combinations were assembled and optimized via the Rosetta "inside-out" protocol. A small number of designed sequences were experimentally screened using SDS-PAGE, mass spectrometry and a qualitative activity assay in which the designed enzyme 1a8uD1 exhibited a measurable hydrolysis activity of 24.25 ± 0.57 U/g towards p-nitrophenyl octanoate. To improve the activity of the designed enzyme, molecular dynamics simulations and the RosettaDesign application were utilized to further optimize the substrate binding mode and amino acid sequence, thus keeping the residues of theozyme intact. The redesigned lipase 1a8uD1-M8 displayed enhanced hydrolysis activity towards p-nitrophenyl octanoate, 3.34 times higher than that of 1a8uD1. Meanwhile, the natural skeleton protein (PDB entry 1a8u) did not display any hydrolysis activity, confirming that the hydrolysis abilities of the designed 1a8uD1 and the redesigned 1a8uD1-M8 were devised from scratch. More importantly, the designed 1a8uD1-M8 was also able to hydrolyze the natural middle-chained substrate (glycerol trioctanoate), for which the activity was 27.67 ± 0.69 U/g. This study indicates that the strategy employed here has great potential to generate novel enzymes exhibiting the desired reactions.
Affiliation(s)
- Jinsha Huang
- Key Laboratory of Molecular Biophysics, Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
| | - Xiaoman Xie
- Key Laboratory of Molecular Biophysics, Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
| | - Zhen Zheng
- Key Laboratory of Molecular Biophysics, Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
| | - Luona Ye
- Key Laboratory of Molecular Biophysics, Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
| | - Pengbo Wang
- Key Laboratory of Molecular Biophysics, Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
| | - Li Xu
- Key Laboratory of Molecular Biophysics, Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
| | - Ying Wu
- Key Laboratory of Molecular Biophysics, Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
| | - Jinyong Yan
- Key Laboratory of Molecular Biophysics, Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
| | - Min Yang
- Key Laboratory of Molecular Biophysics, Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
| | - Yunjun Yan
- Key Laboratory of Molecular Biophysics, Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
| |
|
16
|
Kynast JP, Höcker B. Atligator Web: A Graphical User Interface for Analysis and Design of Protein-Peptide Interactions. BIODESIGN RESEARCH 2023; 5:0011. [PMID: 37849459 PMCID: PMC10521702 DOI: 10.34133/bdr.0011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2022] [Accepted: 04/14/2023] [Indexed: 10/19/2023] Open
Abstract
A key functionality of proteins is based on their ability to form interactions with other proteins or peptides. These interactions are neither easily described nor fully understood, which is why the design of specific protein-protein interaction interfaces remains a challenge for protein engineering. We recently developed the software ATLIGATOR to extract common interaction patterns between different types of amino acids and store them in a database. The tool enables the user to better understand frequent interaction patterns and find groups of interactions. Furthermore, frequent motifs can be directly transferred from the database to a user-defined scaffold as a starting point for the engineering of new binding capabilities. Since three-dimensional visualization is a crucial part of ATLIGATOR, we created ATLIGATOR web, a web server offering an intuitive graphical user interface (GUI) available at https://atligator.uni-bayreuth.de. This new interface empowers users to apply ATLIGATOR by providing easy access with having all parts directly connected. Moreover, we extended the web by a design functionality so that, overall, ATLIGATOR web facilitates the use of ATLIGATOR with a more intuitive UI and advanced design options.
Affiliation(s)
- Josef Paul Kynast
- Department of Biochemistry, University of Bayreuth, Bayreuth, Germany
| | - Birte Höcker
- Department of Biochemistry, University of Bayreuth, Bayreuth, Germany
| |
|
17
|
Love AC, Caldwell DR, Kolbaba-Kartchner B, Townsend KM, Halbers LP, Yao Z, Brennan CK, Ivanic J, Hadjian T, Mills JH, Schnermann MJ, Prescher JA. Red-Shifted Coumarin Luciferins for Improved Bioluminescence Imaging. J Am Chem Soc 2023; 145:3335-3345. [PMID: 36745536 PMCID: PMC10519142 DOI: 10.1021/jacs.2c07220] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 02/07/2023]
Abstract
Multicomponent bioluminescence imaging in vivo requires an expanded collection of tissue-penetrant probes. Toward this end, we generated a new class of near-infrared (NIR) emitting coumarin luciferin analogues (CouLuc-3s). The scaffolds were easily accessed from commercially available dyes. Complementary mutant luciferases for the CouLuc-3 analogues were also identified. The brightest probes enabled sensitive imaging in vivo. The CouLuc-3 scaffolds are also orthogonal to popular bioluminescent reporters and can be used for multicomponent imaging applications. Collectively, this work showcases a new set of bioluminescent tools that can be readily implemented for multiplexed imaging in a variety of biological settings.
Affiliation(s)
- Anna C Love
- Department of Chemistry, University of California, Irvine, Irvine, California 92697, United States
| | - Donald R Caldwell
- Chemical Biology Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Frederick, Maryland 21702, United States
| | - Bethany Kolbaba-Kartchner
- School of Molecular Sciences, Arizona State University, Tempe, Arizona 85281, United States
- The Biodesign Center for Molecular Design and Biomimetics, Arizona State University, Tempe, Arizona 85281, United States
| | - Katherine M Townsend
- Department of Chemistry, University of California, Irvine, Irvine, California 92697, United States
| | - Lila P Halbers
- Department of Pharmaceutical Sciences, University of California, Irvine, Irvine, California 92697, United States
| | - Zi Yao
- Department of Chemistry, University of California, Irvine, Irvine, California 92697, United States
| | - Caroline K Brennan
- Department of Chemistry, University of California, Irvine, Irvine, California 92697, United States
| | - Joseph Ivanic
- Advanced Biomedical Computational Science, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research, Inc., Frederick, Maryland 21702, United States
| | - Tanya Hadjian
- Department of Chemistry, University of California, Irvine, Irvine, California 92697, United States
| | - Jeremy H Mills
- School of Molecular Sciences, Arizona State University, Tempe, Arizona 85281, United States
- The Biodesign Center for Molecular Design and Biomimetics, Arizona State University, Tempe, Arizona 85281, United States
| | - Martin J Schnermann
- Chemical Biology Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Frederick, Maryland 21702, United States
| | - Jennifer A Prescher
- Department of Chemistry, University of California, Irvine, Irvine, California 92697, United States
- Department of Molecular Biology & Biochemistry, University of California, Irvine, Irvine, California 92697, United States
- Department of Pharmaceutical Sciences, University of California, Irvine, Irvine, California 92697, United States
| |
|
18
|
Anderson DM, Jayanthi LP, Gosavi S, Meiering EM. Engineering the kinetic stability of a β-trefoil protein by tuning its topological complexity. Front Mol Biosci 2023; 10:1021733. [PMID: 36845544 PMCID: PMC9945329 DOI: 10.3389/fmolb.2023.1021733] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2022] [Accepted: 01/02/2023] [Indexed: 02/11/2023] Open
Abstract
Kinetic stability, defined as the rate of protein unfolding, is central to determining the functional lifetime of proteins, both in nature and in wide-ranging medical and biotechnological applications. Further, high kinetic stability is generally correlated with high resistance against chemical and thermal denaturation, as well as proteolytic degradation. Despite its significance, specific mechanisms governing kinetic stability remain largely unknown, and few studies address the rational design of kinetic stability. Here, we describe a method for designing protein kinetic stability that uses protein long-range order, absolute contact order, and simulated free energy barriers of unfolding to quantitatively analyze and predict unfolding kinetics. We analyze two β-trefoil proteins: hisactophilin, a quasi-three-fold symmetric natural protein with moderate stability, and ThreeFoil, a designed three-fold symmetric protein with extremely high kinetic stability. The quantitative analysis identifies marked differences in long-range interactions across the protein hydrophobic cores that partially account for the differences in kinetic stability. Swapping the core interactions of ThreeFoil into hisactophilin increases kinetic stability with close agreement between predicted and experimentally measured unfolding rates. These results demonstrate the predictive power of readily applied measures of protein topology for altering kinetic stability and recommend core engineering as a tractable target for rationally designing kinetic stability that may be widely applicable.
Affiliation(s)
| | - Lakshmi P. Jayanthi
- Simons Centre for the Study of Living Machines, National Centre for Biological Sciences, Tata Institute of Fundamental Research, Bangalore, India
| | - Shachi Gosavi
- Simons Centre for the Study of Living Machines, National Centre for Biological Sciences, Tata Institute of Fundamental Research, Bangalore, India
| | - Elizabeth M. Meiering
- Department of Chemistry, University of Waterloo, Waterloo, ON, Canada
- *Correspondence: Elizabeth M. Meiering
| |
|
19
|
Design, Production, and Characterization of Catalytically Active Inclusion Bodies. Methods Mol Biol 2023; 2617:49-74. [PMID: 36656516 DOI: 10.1007/978-1-0716-2930-7_4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/20/2023]
Abstract
Catalytically active inclusion bodies (CatIBs) are promising biologically produced enzyme/protein immobilizates for application in biocatalysis, synthetic chemistry, and biomedicine. CatIB formation is commonly induced by fusion of suitable aggregation-inducing tags to a given target protein. Heterologous production of the fusion protein in turn yields CatIBs. This chapter presents the methodology needed to design, produce, and characterize CatIBs.
|
20
|
Syrlybaeva R, Strauch EM. Deep learning of protein sequence design of protein-protein interactions. Bioinformatics 2023; 39:6827796. [PMID: 36377772 PMCID: PMC9947925 DOI: 10.1093/bioinformatics/btac733] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2022] [Revised: 09/16/2022] [Accepted: 11/14/2022] [Indexed: 11/16/2022] Open
Abstract
MOTIVATION: As more data of experimentally determined protein structures are becoming available, data-driven models to describe protein sequence-structure relationships become more feasible. Within this space, the amino acid sequence design of protein-protein interactions is still a rather challenging subproblem with very low success rates, yet it is central to most biological processes. RESULTS: We developed an attention-based deep learning model inspired by algorithms used for image-caption assignments to design peptides or protein fragment sequences. Our trained model can be applied for the redesign of natural protein interfaces or the designed protein interaction fragments. Here, we validate the potential by recapitulating naturally occurring protein-protein interactions including antibody-antigen complexes. The designed interfaces accurately capture essential native interactions and have comparable native-like binding affinities in silico. Furthermore, our model does not need a precise backbone location, making it an attractive tool for working with de novo design of protein-protein interactions. AVAILABILITY AND IMPLEMENTATION: The source code of the method is available at https://github.com/strauchlab/iNNterfaceDesign. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Raulia Syrlybaeva
- Department of Pharmaceutical and Biomedical Sciences, University of Georgia, Athens, GA 30602, USA
| | - Eva-Maria Strauch
- Department of Pharmaceutical and Biomedical Sciences, University of Georgia, Athens, GA 30602, USA
- Institute of Bioinformatics, University of Georgia, Athens, GA 30602, USA
| |
|
21
|
Gao Y, Wang B, Hu S, Zhu T, Zhang JZH. An efficient method to predict protein thermostability in alanine mutation. Phys Chem Chem Phys 2022; 24:29629-29639. [PMID: 36449314 DOI: 10.1039/d2cp04236c] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
The relationship between protein sequence and its thermodynamic stability is a critical aspect of computational protein design. In this work, we present a new theoretical method to calculate the free energy change (ΔΔG) resulting from a single-point amino acid mutation to alanine in a protein sequence. The method is derived based on physical interactions and is very efficient in estimating the free energy changes caused by a series of alanine mutations from just a single molecular dynamics (MD) trajectory. Numerical calculations are carried out on a total of 547 alanine mutations in 19 diverse proteins whose experimental results are available. The comparison between the experimental ΔΔGexp and the calculated values shows a generally good correlation with a correlation coefficient of 0.67. Both the advantages and limitations of this method are discussed. This method provides an efficient and valuable tool for protein design and engineering.
Affiliation(s)
- Ya Gao
- School of Mathematics, Physics and Statistics, Shanghai University of Engineering Science, Shanghai 201620, China
| | - Bo Wang
- Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China.
| | - Shiyu Hu
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
| | - Tong Zhu
- Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
| | - John Z H Zhang
- Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
- Shenzhen Institute of Synthetic Biology, Faculty of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
| |
|
22
|
Slobodyanyuk M, Banda-Vázquez JA, Thompson MJ, Dean RA, Baenziger JE, Chica RA, daCosta CJB. Origin of acetylcholine antagonism in ELIC, a bacterial pentameric ligand-gated ion channel. Commun Biol 2022; 5:1264. [PMID: 36400839 PMCID: PMC9674596 DOI: 10.1038/s42003-022-04227-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2022] [Accepted: 11/04/2022] [Indexed: 11/20/2022] Open
Abstract
ELIC is a prokaryotic homopentameric ligand-gated ion channel that is homologous to vertebrate nicotinic acetylcholine receptors. Acetylcholine binds to ELIC but fails to activate it, despite bringing about conformational changes indicative of activation. Instead, acetylcholine competitively inhibits agonist-activated ELIC currents. What makes acetylcholine an agonist in an acetylcholine receptor context, and an antagonist in an ELIC context, is not known. Here we use available structures and statistical coupling analysis to identify residues in the ELIC agonist-binding site that contribute to agonism. Substitution of these ELIC residues for their acetylcholine receptor counterparts does not convert acetylcholine into an ELIC agonist, but in some cases reduces the sensitivity of ELIC to acetylcholine antagonism. Acetylcholine antagonism can be abolished by combining two substitutions that together appear to knock out acetylcholine binding. Thus, making the ELIC agonist-binding site more acetylcholine receptor-like, paradoxically reduces the apparent affinity for acetylcholine, demonstrating that residues important for agonist binding in one context can be deleterious in another. These findings reinforce the notion that although agonism originates from local interactions within the agonist-binding site, it is a global property with cryptic contributions from distant residues. Finally, our results highlight an underappreciated mechanism of antagonism, where agonists with appreciable affinity, but negligible efficacy, present as competitive antagonists. A structural and functional study of the prokaryotic ligand-gated ion channel, ELIC, provides insight into the origin of agonism and antagonism at nicotinic acetylcholine receptors.
|
23
|
Liu H, Chen Q. Computational protein design with data‐driven approaches: Recent developments and perspectives. WIRES COMPUTATIONAL MOLECULAR SCIENCE 2022. [DOI: 10.1002/wcms.1646] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Affiliation(s)
- Haiyan Liu
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
- Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, Anhui, China
- School of Data Science, University of Science and Technology of China, Hefei, Anhui, China
| | - Quan Chen
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
- Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, Anhui, China
| |
|
24
|
Abstract
De novo protein design enables the exploration of novel sequences and structures absent from the natural protein universe. De novo design also stands as a stringent test for our understanding of the underlying physical principles of protein folding and may lead to the development of proteins with unmatched functional characteristics. The first fundamental challenge of de novo design is to devise "designable" structural templates leading to sequences that will adopt the predicted fold. Here, we built on the TopoBuilder (TB) de novo design method, to automatically assemble structural templates with native-like features starting from string descriptors that capture the overall topology of proteins. Our framework eliminates the dependency of hand-crafted and fold-specific rules through an iterative, data-driven approach that extracts geometrical parameters from structural tertiary motifs. We evaluated the TopoBuilder framework by designing sequences for a set of five protein folds and experimental characterization revealed that several sequences were folded and stable in solution. The TopoBuilder de novo design framework will be broadly useful to guide the generation of artificial proteins with customized geometries, enabling the exploration of the protein universe.
|
25
|
Mahajan SP, Ruffolo JA, Frick R, Gray JJ. Hallucinating structure-conditioned antibody libraries for target-specific binders. Front Immunol 2022; 13:999034. [PMID: 36341416 PMCID: PMC9635398 DOI: 10.3389/fimmu.2022.999034] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/20/2022] [Accepted: 09/22/2022] [Indexed: 11/29/2022] Open
Abstract
Antibodies are widely developed and used as therapeutics to treat cancer, infectious disease, and inflammation. During development, initial leads routinely undergo additional engineering to increase their target affinity. Experimental methods for affinity maturation are expensive, laborious, and time-consuming and rarely allow the efficient exploration of the relevant design space. Deep learning (DL) models are transforming the field of protein engineering and design. While several DL-based protein design methods have shown promise, the antibody design problem is distinct, and specialized models for antibody design are desirable. Inspired by hallucination frameworks that leverage accurate structure prediction DL models, we propose the FvHallucinator for designing antibody sequences, especially the CDR loops, conditioned on an antibody structure. Such a strategy generates targeted CDR libraries that retain the conformation of the binder and thereby the mode of binding to the epitope on the antigen. On a benchmark set of 60 antibodies, FvHallucinator generates sequences resembling natural CDRs and recapitulates perplexity of canonical CDR clusters. Furthermore, the FvHallucinator designs amino acid substitutions at the VH-VL interface that are enriched in human antibody repertoires and therapeutic antibodies. We propose a pipeline that screens FvHallucinator designs to obtain a library enriched in binders for an antigen of interest. We apply this pipeline to the CDR H3 of the Trastuzumab-HER2 complex to generate in silico designs predicted to improve upon the binding affinity and interfacial properties of the original antibody. Thus, the FvHallucinator pipeline enables generation of inexpensive, diverse, and targeted antibody libraries enriched in binders for antibody affinity maturation.
Affiliation(s)
- Sai Pooja Mahajan
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, MD, United States
| | - Jeffrey A. Ruffolo
- Program in Molecular Biophysics, Johns Hopkins University, Baltimore, MD, United States
| | - Rahel Frick
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, MD, United States
| | - Jeffrey J. Gray
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, MD, United States
- Program in Molecular Biophysics, Johns Hopkins University, Baltimore, MD, United States
- Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins School of Medicine, Baltimore, MD, United States
- *Correspondence: Jeffrey J. Gray
| |
|
26
|
Chidyausiku TM, Mendes SR, Klima JC, Nadal M, Eckhard U, Roel-Touris J, Houliston S, Guevara T, Haddox HK, Moyer A, Arrowsmith CH, Gomis-Rüth FX, Baker D, Marcos E. De novo design of immunoglobulin-like domains. Nat Commun 2022; 13:5661. [PMID: 36192397 PMCID: PMC9530121 DOI: 10.1038/s41467-022-33004-6] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 03/24/2022] [Accepted: 08/17/2022] [Indexed: 11/21/2022] Open
Abstract
Antibodies, and antibody derivatives such as nanobodies, contain immunoglobulin-like (Ig) β-sandwich scaffolds which anchor the hypervariable antigen-binding loops and constitute the largest growing class of drugs. Current engineering strategies for this class of compounds rely on naturally existing Ig frameworks, which can be hard to modify and have limitations in manufacturability, designability and range of action. Here, we develop design rules for the central feature of the Ig fold architecture—the non-local cross-β structure connecting the two β-sheets—and use these to design highly stable Ig domains de novo, confirm their structures through X-ray crystallography, and show they can correctly scaffold functional loops. Our approach opens the door to the design of antibody-like scaffolds with tailored structures and superior biophysical properties. The immunoglobulin domain framework of antibodies has been a long standing design challenge. Here, the authors describe design rules for tailoring these domains and show they can be accurately designed, de novo, with high stability and the ability to scaffold functional loops.
Affiliation(s)
- Tamuka M Chidyausiku
- Department of Biochemistry, University of Washington, Seattle, WA, 98195, USA
- Institute for Protein Design, University of Washington, Seattle, WA, 98195, USA
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, 98195, USA
- Novartis Institutes for BioMedical Research Inc., San Diego, CA, 92121, USA
| | - Soraia R Mendes
- Proteolysis Laboratory, Department of Structural and Molecular Biology, Molecular Biology Institute of Barcelona (IBMB-CSIC), Baldiri Reixac 15, 08028, Barcelona, Spain
| | - Jason C Klima
- Department of Biochemistry, University of Washington, Seattle, WA, 98195, USA
- Institute for Protein Design, University of Washington, Seattle, WA, 98195, USA
- Encodia, Inc., San Diego, CA, 92121, USA
| | - Marta Nadal
- Protein Design and Modeling Lab, Department of Structural and Molecular Biology, Molecular Biology Institute of Barcelona (IBMB-CSIC), Baldiri Reixac 15, 08028, Barcelona, Spain
| | - Ulrich Eckhard
- Proteolysis Laboratory, Department of Structural and Molecular Biology, Molecular Biology Institute of Barcelona (IBMB-CSIC), Baldiri Reixac 15, 08028, Barcelona, Spain
| | - Jorge Roel-Touris
- Protein Design and Modeling Lab, Department of Structural and Molecular Biology, Molecular Biology Institute of Barcelona (IBMB-CSIC), Baldiri Reixac 15, 08028, Barcelona, Spain
| | - Scott Houliston
- Structural Genomics Consortium, University of Toronto, Toronto, ON, M5G 1L7, Canada.,Princess Margaret Cancer Centre and Department of Medical Biophysics, University of Toronto, Toronto, ON, M5G 2M9, Canada
| | - Tibisay Guevara
- Proteolysis Laboratory, Department of Structural and Molecular Biology, Molecular Biology Institute of Barcelona (IBMB-CSIC), Baldiri Reixac 15, 08028, Barcelona, Spain
| | - Hugh K Haddox
- Institute for Protein Design, University of Washington, Seattle, WA, 98195, USA
| | - Adam Moyer
- Institute for Protein Design, University of Washington, Seattle, WA, 98195, USA
| | - Cheryl H Arrowsmith
- Structural Genomics Consortium, University of Toronto, Toronto, ON, M5G 1L7, Canada.,Princess Margaret Cancer Centre and Department of Medical Biophysics, University of Toronto, Toronto, ON, M5G 2M9, Canada
| | - F Xavier Gomis-Rüth
- Proteolysis Laboratory, Department of Structural and Molecular Biology, Molecular Biology Institute of Barcelona (IBMB-CSIC), Baldiri Reixac 15, 08028, Barcelona, Spain.
| | - David Baker
- Department of Biochemistry, University of Washington, Seattle, WA, 98195, USA.,Institute for Protein Design, University of Washington, Seattle, WA, 98195, USA.,Howard Hughes Medical Institute, University of Washington, Seattle, WA, 98195, USA.
| | - Enrique Marcos
- Protein Design and Modeling Lab, Department of Structural and Molecular Biology, Molecular Biology Institute of Barcelona (IBMB-CSIC), Baldiri Reixac 15, 08028, Barcelona, Spain.
|
27
|
Thieker DF, Maguire JB, Kudlacek ST, Leaver‐Fay A, Lyskov S, Kuhlman B. Stabilizing proteins, simplified: A Rosetta-based webtool for predicting favorable mutations. Protein Sci 2022; 31:e4428. [PMID: 36173174 PMCID: PMC9490798 DOI: 10.1002/pro.4428] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/18/2022] [Revised: 08/06/2022] [Accepted: 08/21/2022] [Indexed: 11/07/2022]
Abstract
Many proteins have low thermodynamic stability, which can lead to low expression yields and limit functionality in research, industrial and clinical settings. This article introduces two web-based tools that use the high-resolution structure of a protein along with the Rosetta molecular modeling program to predict stabilizing mutations. The protocols were recently applied to three genetically and structurally distinct proteins and successfully predicted mutations that improved thermal stability and/or protein yield. In all three cases, combining the stabilizing mutations raised the protein unfolding temperatures by more than 20°C. The first protocol evaluates point mutations and can generate a site saturation mutagenesis heatmap. The second identifies mutation clusters around user-defined positions. Both applications only require a protein structure and are particularly valuable when a deep multiple sequence alignment is not available. These tools were created to simplify protein engineering and enable research that would otherwise be infeasible due to poor expression and stability of the native molecule.
Affiliation(s)
- David F. Thieker
- Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
| | - Jack B. Maguire
- Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
| | - Stephan T. Kudlacek
- Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
| | - Andrew Leaver‐Fay
- Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
| | - Sergey Lyskov
- Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, Maryland, USA
| | - Brian Kuhlman
- Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
|
28
|
Liu YJ. Understanding the complete bioluminescence cycle from a multiscale computational perspective: A review. JOURNAL OF PHOTOCHEMISTRY AND PHOTOBIOLOGY C: PHOTOCHEMISTRY REVIEWS 2022. [DOI: 10.1016/j.jphotochemrev.2022.100537] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 12/16/2022]
|
29
|
Miniproteins in medicinal chemistry. Bioorg Med Chem Lett 2022; 71:128806. [PMID: 35660515 DOI: 10.1016/j.bmcl.2022.128806] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2022] [Revised: 05/11/2022] [Accepted: 05/16/2022] [Indexed: 11/20/2022]
Abstract
Miniproteins exhibit great potential as scaffolds for drug candidates because of their well-defined structure and good synthetic availability. Because of recently described methodologies for their de novo design, the field of miniproteins is emerging and can provide molecules that effectively bind to problematic targets, i.e., those that have been previously considered to be undruggable. This review describes methodologies for the development of miniprotein scaffolds and for the construction of biologically active miniproteins.
|
30
|
Arndt T, Jaudzems K, Shilkova O, Francis J, Johansson M, Laity PR, Sahin C, Chatterjee U, Kronqvist N, Barajas-Ledesma E, Kumar R, Chen G, Strömberg R, Abelein A, Langton M, Landreh M, Barth A, Holland C, Johansson J, Rising A. Spidroin N-terminal domain forms amyloid-like fibril based hydrogels and provides a protein immobilization platform. Nat Commun 2022; 13:4695. [PMID: 35970823 PMCID: PMC9378615 DOI: 10.1038/s41467-022-32093-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/08/2021] [Accepted: 07/15/2022] [Indexed: 11/24/2022] Open
Abstract
Recombinant spider silk proteins (spidroins) have multiple potential applications in development of novel biomaterials, but their multimodal and aggregation-prone nature have complicated production and straightforward applications. Here, we report that recombinant miniature spidroins, and importantly also the N-terminal domain (NT) on its own, rapidly form self-supporting and transparent hydrogels at 37 °C. The gelation is caused by NT α-helix to β-sheet conversion and formation of amyloid-like fibrils, and fusion proteins composed of NT and green fluorescent protein or purine nucleoside phosphorylase form hydrogels with intact functions of the fusion moieties. Our findings demonstrate that recombinant NT and fusion proteins give high expression yields and bestow attractive properties to hydrogels, e.g., transparency, cross-linker free gelation and straightforward immobilization of active proteins at high density. Recombinant spider silks are of interest but the multimodal and aggregation-prone nature of them is a limitation. Here, the authors report on a miniature spidroin based on the N-terminal domain which forms a hydrogel at 37 °C which allows for ease of production and fusion protein modification to generate functional biomaterials.
Affiliation(s)
- Tina Arndt
- Department of Biosciences and Nutrition, Karolinska Institutet, Neo, Blickagången 16, Huddinge, 141 52, Sweden
| | - Kristaps Jaudzems
- Department of Physical Organic Chemistry, Latvian Institute of Organic Synthesis, Riga, LV-1006, Latvia
| | - Olga Shilkova
- Department of Biosciences and Nutrition, Karolinska Institutet, Neo, Blickagången 16, Huddinge, 141 52, Sweden
| | - Juanita Francis
- Department of Biosciences and Nutrition, Karolinska Institutet, Neo, Blickagången 16, Huddinge, 141 52, Sweden
| | - Mathias Johansson
- Department of Molecular Sciences, Swedish University of Agricultural Sciences, Uppsala, 750 07, Sweden, Box 7015
| | - Peter R Laity
- Department of Materials Science and Engineering, The University of Sheffield, Sir Robert Hadfield Building, Mappin Street, Sheffield, S1 3JD, UK
| | - Cagla Sahin
- Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Solnavägen 9, 171 65, Solna, Sweden
| | - Urmimala Chatterjee
- Department of Biosciences and Nutrition, Karolinska Institutet, Neo, Blickagången 16, Huddinge, 141 52, Sweden
| | - Nina Kronqvist
- Department of Biosciences and Nutrition, Karolinska Institutet, Neo, Blickagången 16, Huddinge, 141 52, Sweden
| | - Edgar Barajas-Ledesma
- Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Solnavägen 9, 171 65, Solna, Sweden
| | - Rakesh Kumar
- Department of Biosciences and Nutrition, Karolinska Institutet, Neo, Blickagången 16, Huddinge, 141 52, Sweden
| | - Gefei Chen
- Department of Biosciences and Nutrition, Karolinska Institutet, Neo, Blickagången 16, Huddinge, 141 52, Sweden
| | - Roger Strömberg
- Department of Biosciences and Nutrition, Karolinska Institutet, Neo, Blickagången 16, Huddinge, 141 52, Sweden
| | - Axel Abelein
- Department of Biosciences and Nutrition, Karolinska Institutet, Neo, Blickagången 16, Huddinge, 141 52, Sweden
| | - Maud Langton
- Department of Molecular Sciences, Swedish University of Agricultural Sciences, Uppsala, 750 07, Sweden, Box 7015
| | - Michael Landreh
- Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet, Solnavägen 9, 171 65, Solna, Sweden
| | - Andreas Barth
- Department of Biochemistry and Biophysics, The Arrhenius Laboratories for Natural Sciences, Stockholm University, 10691, Stockholm, Sweden
| | - Chris Holland
- Department of Materials Science and Engineering, The University of Sheffield, Sir Robert Hadfield Building, Mappin Street, Sheffield, S1 3JD, UK
| | - Jan Johansson
- Department of Biosciences and Nutrition, Karolinska Institutet, Neo, Blickagången 16, Huddinge, 141 52, Sweden
| | - Anna Rising
- Department of Biosciences and Nutrition, Karolinska Institutet, Neo, Blickagången 16, Huddinge, 141 52, Sweden.,Department of Anatomy, Physiology and Biochemistry, Swedish University of Agricultural Sciences, Uppsala, 750 07, Sweden.
|
31
|
Pintado-Grima C, Bárcenas O, Manglano-Artuñedo Z, Vilaça R, Macedo-Ribeiro S, Pallarès I, Santos J, Ventura S. CARs-DB: A Database of Cryptic Amyloidogenic Regions in Intrinsically Disordered Proteins. Front Mol Biosci 2022; 9:882160. [PMID: 35898309 PMCID: PMC9309178 DOI: 10.3389/fmolb.2022.882160] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/23/2022] [Accepted: 04/15/2022] [Indexed: 12/20/2022] Open
Abstract
Proteome-wide analyses suggest that most globular proteins contain at least one amyloidogenic region, whereas these aggregation-prone segments are thought to be underrepresented in intrinsically disordered proteins (IDPs). In recent work, we reported that intrinsically disordered regions (IDRs) indeed sustain a significant amyloid load in the form of cryptic amyloidogenic regions (CARs). CARs are widespread in IDRs, but they are necessarily exposed to solvent, and thus they should be more polar and have a milder aggregation potential than conventional amyloid regions protected inside globular proteins. CARs are connected with IDPs function and, in particular, with the establishment of protein-protein interactions through their IDRs. However, their presence also appears associated with pathologies like cancer or Alzheimer’s disease. Given the relevance of CARs for both IDPs function and malfunction, we developed CARs-DB, a database containing precomputed predictions for all CARs present in the IDPs deposited in the DisProt database. This web tool allows for the fast and comprehensive exploration of previously unnoticed amyloidogenic regions embedded within IDRs sequences and might turn helpful in identifying disordered interacting regions. It contains >8,900 unique CARs identified in a total of 1711 IDRs. CARs-DB is freely available for users and can be accessed at http://carsdb.ppmclab.com. To validate CARs-DB, we demonstrate that two previously undescribed CARs selected from the database display full amyloidogenic potential. Overall, CARs-DB allows easy access to a previously unexplored amyloid sequence space.
Affiliation(s)
- Carlos Pintado-Grima
- Institut de Biotecnologia i de Biomedicina and Departament de Bioquímica i Biologia Molecular, Universitat Autònoma de Barcelona, Barcelona, Spain
| | - Oriol Bárcenas
- Institut de Biotecnologia i de Biomedicina and Departament de Bioquímica i Biologia Molecular, Universitat Autònoma de Barcelona, Barcelona, Spain
| | - Zoe Manglano-Artuñedo
- Institut de Biotecnologia i de Biomedicina and Departament de Bioquímica i Biologia Molecular, Universitat Autònoma de Barcelona, Barcelona, Spain
| | - Rita Vilaça
- Instituto de Biologia Molecular e Celular and Instituto de Investigação e Inovação Em Saúde, Universidade Do Porto, Porto, Portugal
| | - Sandra Macedo-Ribeiro
- Instituto de Biologia Molecular e Celular and Instituto de Investigação e Inovação Em Saúde, Universidade Do Porto, Porto, Portugal
| | - Irantzu Pallarès
- Institut de Biotecnologia i de Biomedicina and Departament de Bioquímica i Biologia Molecular, Universitat Autònoma de Barcelona, Barcelona, Spain
| | - Jaime Santos
- Institut de Biotecnologia i de Biomedicina and Departament de Bioquímica i Biologia Molecular, Universitat Autònoma de Barcelona, Barcelona, Spain
| | - Salvador Ventura
- Institut de Biotecnologia i de Biomedicina and Departament de Bioquímica i Biologia Molecular, Universitat Autònoma de Barcelona, Barcelona, Spain
|
32
|
Van Stappen C, Deng Y, Liu Y, Heidari H, Wang JX, Zhou Y, Ledray AP, Lu Y. Designing Artificial Metalloenzymes by Tuning of the Environment beyond the Primary Coordination Sphere. Chem Rev 2022; 122:11974-12045. [PMID: 35816578 DOI: 10.1021/acs.chemrev.2c00106] [Citation(s) in RCA: 32] [Impact Index Per Article: 16.0] [Indexed: 02/06/2023]
Abstract
Metalloenzymes catalyze a variety of reactions using a limited number of natural amino acids and metallocofactors. Therefore, the environment beyond the primary coordination sphere must play an important role in both conferring and tuning their phenomenal catalytic properties, enabling active sites with otherwise similar primary coordination environments to perform a diverse array of biological functions. However, since the interactions beyond the primary coordination sphere are numerous and weak, it has been difficult to pinpoint structural features responsible for the tuning of activities of native enzymes. Designing artificial metalloenzymes (ArMs) offers an excellent basis to elucidate the roles of these interactions and to further develop practical biological catalysts. In this review, we highlight how the secondary coordination spheres of ArMs influence metal binding and catalysis, with particular focus on the use of native protein scaffolds as templates for the design of ArMs by either rational design aided by computational modeling, directed evolution, or a combination of both approaches. In describing successes in designing heme, nonheme Fe, and Cu metalloenzymes, heteronuclear metalloenzymes containing heme, and those ArMs containing other metal centers (including those with non-native metal ions and metallocofactors), we have summarized insights gained on how careful controls of the interactions in the secondary coordination sphere, including hydrophobic and hydrogen bonding interactions, allow the generation and tuning of these respective systems to approach, rival, and, in a few cases, exceed those of native enzymes. We have also provided an outlook on the remaining challenges in the field and future directions that will allow for a deeper understanding of the secondary coordination sphere to be gained, and in turn to guide the design of a broader and more efficient variety of ArMs.
Affiliation(s)
- Casey Van Stappen
- Department of Chemistry, University of Texas at Austin, 105 East 24th Street, Austin, Texas 78712, United States
| | - Yunling Deng
- Department of Chemistry, University of Texas at Austin, 105 East 24th Street, Austin, Texas 78712, United States
| | - Yiwei Liu
- Department of Chemistry, University of Illinois, Urbana-Champaign, 505 South Mathews Avenue, Urbana, Illinois 61801, United States
| | - Hirbod Heidari
- Department of Chemistry, University of Texas at Austin, 105 East 24th Street, Austin, Texas 78712, United States
| | - Jing-Xiang Wang
- Department of Chemistry, University of Texas at Austin, 105 East 24th Street, Austin, Texas 78712, United States
| | - Yu Zhou
- Department of Chemistry, University of Texas at Austin, 105 East 24th Street, Austin, Texas 78712, United States
| | - Aaron P Ledray
- Department of Chemistry, University of Texas at Austin, 105 East 24th Street, Austin, Texas 78712, United States
| | - Yi Lu
- Department of Chemistry, University of Texas at Austin, 105 East 24th Street, Austin, Texas 78712, United States.,Department of Chemistry, University of Illinois, Urbana-Champaign, 505 South Mathews Avenue, Urbana, Illinois 61801, United States
|
33
|
Magi Meconi G, Sasselli IR, Bianco V, Onuchic JN, Coluzza I. Key aspects of the past 30 years of protein design. REPORTS ON PROGRESS IN PHYSICS. PHYSICAL SOCIETY (GREAT BRITAIN) 2022; 85:086601. [PMID: 35704983 DOI: 10.1088/1361-6633/ac78ef] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/03/2021] [Accepted: 06/15/2022] [Indexed: 06/15/2023]
Abstract
Proteins are the workhorse of life. They are the building infrastructure of living systems; they are the most efficient molecular machines known, and their enzymatic activity is still unmatched in versatility by any artificial system. Perhaps proteins' most remarkable feature is their modularity. The large amount of information required to specify each protein's function is analogically encoded with an alphabet of just ∼20 letters. The protein folding problem is how to encode all such information in a sequence of 20 letters. In this review, we go through the last 30 years of research to summarize the state of the art and highlight some applications related to fundamental problems of protein evolution.
Affiliation(s)
- Giulia Magi Meconi
- Computational Biophysics Lab, Center for Cooperative Research in Biomaterials (CIC biomaGUNE), Basque Research and Technology Alliance (BRTA), Paseo de Miramon 182, 20014, Donostia-San Sebastián, Spain
| | - Ivan R Sasselli
- Computational Biophysics Lab, Center for Cooperative Research in Biomaterials (CIC biomaGUNE), Basque Research and Technology Alliance (BRTA), Paseo de Miramon 182, 20014, Donostia-San Sebastián, Spain
| | | | - Jose N Onuchic
- Center for Theoretical Biological Physics, Department of Physics & Astronomy, Department of Chemistry, Department of Biosciences, Rice University, Houston, TX 77251, United States of America
| | - Ivan Coluzza
- BCMaterials, Basque Center for Materials, Applications and Nanostructures, Bld. Martina Casiano, UPV/EHU Science Park, Barrio Sarriena s/n, 48940 Leioa, Spain
- Basque Foundation for Science, Ikerbasque, 48009, Bilbao, Spain
|
34
|
Tao L, Byrnes J, Varshney V, Li Y. Machine learning strategies for the structure-property relationship of copolymers. iScience 2022; 25:104585. [PMID: 35789847 PMCID: PMC9249671 DOI: 10.1016/j.isci.2022.104585] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/28/2022] [Revised: 05/26/2022] [Accepted: 06/07/2022] [Indexed: 11/15/2022] Open
Abstract
Establishing the structure-property relationship is extremely valuable for the molecular design of copolymers. However, machine learning (ML) models that can incorporate both chemical composition and sequence distribution of monomers, and that have the generalization ability to process various copolymer types (e.g., alternating, random, block, and gradient copolymers) with a unified approach, are missing. To address this challenge, we formulate four different ML models for investigation, including a feedforward neural network (FFNN) model, a convolutional neural network (CNN) model, a recurrent neural network (RNN) model, and a combined FFNN/RNN (Fusion) model. We use various copolymer types to systematically validate the performance and generalizability of different models. We find that the RNN architecture that processes the monomer sequence information both forward and backward is a more suitable ML model for copolymers with better generalizability. As a supplement to polymer informatics, our proposed approach provides an efficient way for the evaluation of copolymers. Highlights: Establish structure-property relationships of copolymers with machine learning (ML). Incorporate both chemical composition and sequential distribution of copolymers. Analyze various copolymer types with different models in a unified approach. Differentiate the effects of random, block, and gradient patterns of copolymers.
Affiliation(s)
- Lei Tao
- Department of Mechanical Engineering, University of Connecticut, Storrs, CT 06269, USA
| | | | - Vikas Varshney
- Materials and Manufacturing Directorate, Air Force Research Laboratory, Wright-Patterson Air Force Base, Ohio 45433, USA
| | - Ying Li
- Department of Mechanical Engineering, University of Connecticut, Storrs, CT 06269, USA
- Polymer Program, Institute of Materials Science, University of Connecticut, Storrs, CT 06269, USA
- Corresponding author
|
35
|
Biswas G, Ghosh S, Basu S, Bhattacharyya D, Datta AK, Banerjee R. Can the jigsaw puzzle model of protein folding re‐assemble a hydrophobic core? Proteins 2022; 90:1390-1412. [DOI: 10.1002/prot.26321] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/30/2021] [Revised: 01/11/2022] [Accepted: 01/28/2022] [Indexed: 12/30/2022]
Affiliation(s)
- Gargi Biswas
- Saha Institute of Nuclear Physics, Kolkata, India
- Homi Bhabha National Institute, Mumbai, India
| | | | - Sankar Basu
- Saha Institute of Nuclear Physics, Kolkata, India
| | | | | | - Rahul Banerjee
- Saha Institute of Nuclear Physics, Kolkata, India
- Homi Bhabha National Institute, Mumbai, India
|
36
|
Arndt T, Greco G, Schmuck B, Bunz J, Shilkova O, Francis J, Pugno NM, Jaudzems K, Barth A, Johansson J, Rising A. Engineered Spider Silk Proteins for Biomimetic Spinning of Fibers with Toughness Equal to Dragline Silks. ADVANCED FUNCTIONAL MATERIALS 2022; 32:2200986. [PMID: 36505976 PMCID: PMC9720699 DOI: 10.1002/adfm.202200986] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 01/24/2022] [Revised: 03/10/2022] [Indexed: 06/17/2023]
Abstract
Spider silk is the toughest fiber found in nature, and bulk production of artificial spider silk that matches its mechanical properties remains elusive. Development of miniature spider silk proteins (mini-spidroins) has made large-scale fiber production economically feasible, but the fibers' mechanical properties are inferior to native silk. The spider silk fiber's tensile strength is conferred by poly-alanine stretches that are zipped together by tight side chain packing in β-sheet crystals. Spidroins are secreted so they must be void of long stretches of hydrophobic residues, since such segments get inserted into the endoplasmic reticulum membrane. At the same time, hydrophobic residues have high β-strand propensity and can mediate tight inter-β-sheet interactions, features that are attractive for generation of strong artificial silks. Protein production in prokaryotes can circumvent biological laws that spiders, being eukaryotic organisms, must obey, and the authors thus design mini-spidroins that are predicted to more avidly form stronger β-sheets than the wildtype protein. Biomimetic spinning of the engineered mini-spidroins indeed results in fibers with increased tensile strength and two fiber types display toughness equal to native dragline silks. Bioreactor expression and purification result in a protein yield of ≈9 g L⁻¹ which is in line with requirements for economically feasible bulk scale production.
Affiliation(s)
- Tina Arndt
- Department of Biosciences and Nutrition, Karolinska Institutet, Neo, Huddinge, 14183, Sweden
| | - Gabriele Greco
- Laboratory for Bioinspired, Bionic, Nano, Meta, Materials & Mechanics, Department of Civil, Environmental and Mechanical Engineering, University of Trento, Via Mesiano 77, Trento, 38123, Italy
- Department of Anatomy, Physiology and Biochemistry, Swedish University of Agricultural Sciences, Uppsala, 75007, Sweden
| | - Benjamin Schmuck
- Department of Biosciences and Nutrition, Karolinska Institutet, Neo, Huddinge, 14183, Sweden
- Department of Anatomy, Physiology and Biochemistry, Swedish University of Agricultural Sciences, Uppsala, 75007, Sweden
| | - Jessica Bunz
- Department of Biosciences and Nutrition, Karolinska Institutet, Neo, Huddinge, 14183, Sweden
- Present address: Spiber Technologies AB, AlbaNova University Center, SE‐10691 Stockholm, Sweden
| | - Olga Shilkova
- Department of Biosciences and Nutrition, Karolinska Institutet, Neo, Huddinge, 14183, Sweden
| | - Juanita Francis
- Department of Biosciences and Nutrition, Karolinska Institutet, Neo, Huddinge, 14183, Sweden
| | - Nicola M Pugno
- Laboratory for Bioinspired, Bionic, Nano, Meta, Materials & Mechanics, Department of Civil, Environmental and Mechanical Engineering, University of Trento, Via Mesiano 77, Trento, 38123, Italy
- School of Engineering and Materials Sciences, Queen Mary University of London, Mile End Road, London, E1 4NS, UK
| | - Kristaps Jaudzems
- Department of Physical Organic Chemistry, Latvian Institute of Organic Synthesis, Riga, LV‐1006, Latvia
| | - Andreas Barth
- Department of Biochemistry and Biophysics, The Arrhenius Laboratories for Natural Sciences, Stockholm University, Stockholm, 10691, Sweden
| | - Jan Johansson
- Department of Biosciences and Nutrition, Karolinska Institutet, Neo, Huddinge, 14183, Sweden
| | - Anna Rising
- Department of Biosciences and Nutrition, Karolinska Institutet, Neo, Huddinge, 14183, Sweden
- Department of Anatomy, Physiology and Biochemistry, Swedish University of Agricultural Sciences, Uppsala, 75007, Sweden
|
37
|
Swanson S, Sivaraman V, Grigoryan G, Keating AE. Tertiary motifs as building blocks for the design of protein‐binding peptides. Protein Sci 2022; 31:e4322. [PMID: 35634780 PMCID: PMC9088223 DOI: 10.1002/pro.4322] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/28/2022] [Revised: 04/12/2022] [Accepted: 04/14/2022] [Indexed: 11/07/2022]
Affiliation(s)
- Sebastian Swanson
- Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
| | - Venkatesh Sivaraman
- Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
| | - Gevorg Grigoryan
- Department of Computer Science, Dartmouth College, Hanover, New Hampshire, USA
| | - Amy E. Keating
- Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Koch Center for Integrative Cancer Research, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
|
38
|
Ding W, Nakai K, Gong H. Protein design via deep learning. Brief Bioinform 2022; 23:6554124. [PMID: 35348602 PMCID: PMC9116377 DOI: 10.1093/bib/bbac102] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 12/16/2021] [Revised: 02/26/2022] [Accepted: 03/01/2022] [Indexed: 12/11/2022] Open
Abstract
Proteins with desired functions and properties are important in fields like nanotechnology and biomedicine. De novo protein design enables the production of previously unseen proteins from the ground up and is believed to be key to addressing real societal challenges. The recent introduction of deep learning into design methods has had a transformative influence and is expected to represent a promising and exciting future direction. In this review, we survey the major aspects of current advances in deep-learning-based design procedures and illustrate their novelty in comparison with conventional knowledge-based approaches through notable cases. We not only describe deep learning developments in structure-based protein design and direct sequence design, but also highlight recent applications of deep reinforcement learning in protein design. The future perspectives on design goals, challenges and opportunities are also comprehensively discussed.
Affiliation(s)
- Wenze Ding
- School of Artificial Intelligence, Nanjing University of Information Science and Technology, Nanjing 210044, China.,School of Future Technology, Nanjing University of Information Science and Technology, Nanjing 210044, China.,MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing 100084, China.,Beijing Advanced Innovation Center for Structural Biology, Tsinghua University, Beijing 100084, China
| | - Kenta Nakai
- Institute of Medical Science, the University of Tokyo, Tokyo 1088639, Japan
| | - Haipeng Gong
- MOE Key Laboratory of Bioinformatics, School of Life Sciences, Tsinghua University, Beijing 100084, China.,Beijing Advanced Innovation Center for Structural Biology, Tsinghua University, Beijing 100084, China
|
39
|
Malik A, Banerjee A, Pal A, Mitra P. A sequence space search engine for computational protein design to modulate molecular functionality. J Biomol Struct Dyn 2022; 41:2937-2946. [PMID: 35220920 DOI: 10.1080/07391102.2022.2042386] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
Abstract
De-novo protein design explores the untapped sequence space that is otherwise less discovered during the evolutionary process. This necessitates an efficient sequence space search engine for effective convergence in computational protein design. We propose a greedy simulated annealing-based Monte-Carlo parallel search algorithm for better sequence-structure compatibility probing in protein design. The guidance provided by the evolutionary profile, the greedy approach, and the cooling schedule adopted in the Monte Carlo simulation ensures sufficient exploration and exploitation of the search space leading to faster convergence. On evaluating the proposed algorithm, we find that a dataset of 76 target scaffolds reports an average root-mean-square deviation (RMSD) of 1.07 Å and an average TM-Score of 0.93 with the modeled designed protein sequences. High sequence recapitulation of 48.7% (59.4%) observed in the design sequences for all (hydrophobic) solvent-inaccessible residues again establishes the goodness of the proposed algorithm. A high (93.4%) intra-group recapitulation of hydrophobic residues in the solvent-inaccessible region indicates that the proposed protein design algorithm preserves the core residues in the protein and provides alternative residue combinations in the solvent-accessible regions of the target protein. Furthermore, a COFACTOR-based protein functional analysis shows that the design sequences exhibit altered molecular functionality and introduce new molecular functions compared to the target scaffolds. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Ayush Malik
- Department of Computer Science and Engineering, Indian Institute of Technology Kharagpur, Kharagpur, West Bengal, India
| | - Anupam Banerjee
- School of Medical Science and Technology, Indian Institute of Technology Kharagpur, Kharagpur, West Bengal, India
| | - Abantika Pal
- Department of Computer Science and Engineering, Indian Institute of Technology Kharagpur, Kharagpur, West Bengal, India
| | - Pralay Mitra
- Department of Computer Science and Engineering, Indian Institute of Technology Kharagpur, Kharagpur, West Bengal, India
| |
|
40
|
Abstract
The task of protein sequence design is central to nearly all rational protein engineering problems, and enormous effort has gone into the development of energy functions to guide design. Here, we investigate the capability of a deep neural network model to automate design of sequences onto protein backbones, having learned directly from crystal structure data and without any human-specified priors. The model generalizes to native topologies not seen during training, producing experimentally stable designs. We evaluate the generalizability of our method to a de novo TIM-barrel scaffold. The model produces novel sequences, and high-resolution crystal structures of two designs show excellent agreement with in silico models. Our findings demonstrate the tractability of an entirely learned method for protein sequence design. Rational protein design to achieve a given protein backbone conformation is needed to engineer specific functions. Here Anand et al. describe a machine learning method using a learned neural network potential for fixed-backbone protein design.
|
41
|
Pulavarti SVSRK, Maguire JB, Yuen S, Harrison JS, Griffin J, Premkumar L, Esposito EA, Makhatadze GI, Garcia AE, Weiss TM, Snell EH, Kuhlman B, Szyperski T. From Protein Design to the Energy Landscape of a Cold Unfolding Protein. J Phys Chem B 2022; 126:1212-1231. [PMID: 35128921 PMCID: PMC9281400 DOI: 10.1021/acs.jpcb.1c10750] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
Understanding protein folding is crucial for protein sciences. The conformational spaces and energy landscapes of cold (unfolded) protein states, as well as the associated transitions, are hardly explored. Furthermore, it is not known how structure relates to the cooperativity of cold transitions, if cold and heat unfolded states are thermodynamically similar, and if cold states play important roles for protein function. We created the cold unfolding 4-helix bundle DCUB1 with a de novo designed bipartite hydrophilic/hydrophobic core featuring a hydrogen bond network which extends across the bundle in order to study the relative importance of hydrophobic versus hydrophilic protein-water interactions for cold unfolding. Structural and thermodynamic characterization resulted in the discovery of a complex energy landscape for cold transitions, while the heat unfolded state is a random coil. Below ∼0 °C, the core of DCUB1 disintegrates in a largely cooperative manner, while a near-native helical content is retained. The resulting cold core-unfolded state is compact and features extensive internal dynamics. Below -5 °C, two additional cold transitions are seen, that is, (i) the formation of a water-mediated, compact, and highly dynamic dimer, and (ii) the onset of cold helix unfolding decoupled from cold core unfolding. Our results suggest that cold unfolding is initiated by the intrusion of water into the hydrophilic core network and that cooperativity can be tuned by varying the number of core hydrogen bond networks. Protein design has proven to be invaluable to explore the energy landscapes of cold states and to robustly test related theories.
Affiliation(s)
- Surya V S R K Pulavarti
- Department of Chemistry, University at Buffalo, The State University of New York, Buffalo, New York 14260, United States
| | - Jack B Maguire
- Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, United States
| | - Shirley Yuen
- Department of Chemistry, University at Buffalo, The State University of New York, Buffalo, New York 14260, United States
| | - Joseph S Harrison
- Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, United States
| | - Jermel Griffin
- Department of Chemistry, University at Buffalo, The State University of New York, Buffalo, New York 14260, United States
| | - Lakshmanane Premkumar
- Department of Microbiology and Immunology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, United States
| | - Edward A Esposito
- Malvern Panalytical Inc, Northampton, Massachusetts 01060, United States
| | - George I Makhatadze
- Department of Biological Sciences, Rensselaer Polytechnic Institute, Troy, New York 08544, United States
| | - Angel E Garcia
- Center for Non Linear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
| | - Thomas M Weiss
- Stanford Synchrotron Radiation Lightsource, Stanford Linear Accelerator Center, Stanford University, Menlo Park, California 94025, United States
| | - Edward H Snell
- Hauptman-Woodward Medical Research Institute, 700 Ellicott Street, Buffalo, New York 14203, United States; Department of Materials Design and Innovation, University at Buffalo, The State University of New York, Buffalo, New York 14260, United States
| | - Brian Kuhlman
- Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, United States
| | - Thomas Szyperski
- Department of Chemistry, University at Buffalo, The State University of New York, Buffalo, New York 14260, United States
| |
|
42
|
Bouchiba Y, Ruffini M, Schiex T, Barbe S. Computational Design of Miniprotein Binders. Methods Mol Biol 2022; 2405:361-382. [PMID: 35298822 DOI: 10.1007/978-1-0716-1855-4_17] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Miniprotein binders hold a great interest as a class of drugs that bridges the gap between monoclonal antibodies and small molecule drugs. Like monoclonal antibodies, they can be designed to bind to therapeutic targets with high affinity, but they are more stable and easier to produce and to administer. In this chapter, we present a structure-based computational generic approach for miniprotein inhibitor design. Specifically, we describe step-by-step the implementation of the approach for the design of miniprotein binders against the SARS-CoV-2 coronavirus, using available structural data on the SARS-CoV-2 spike receptor binding domain (RBD) in interaction with its native target, the human receptor ACE2. Structural data being increasingly accessible around many protein-protein interaction systems, this method might be applied to the design of miniprotein binders against numerous therapeutic targets. The computational pipeline exploits provable and deterministic artificial intelligence-based protein design methods, with some recent additions in terms of binding energy estimation, multistate design and diverse library generation.
Affiliation(s)
- Younes Bouchiba
- TBI, Université de Toulouse, CNRS, INRAE, INSA, ANITI, Toulouse, France
| | - Manon Ruffini
- TBI, Université de Toulouse, CNRS, INRAE, INSA, ANITI, Toulouse, France
- Université Fédérale de Toulouse, ANITI, INRAE, UR 875, Toulouse, France
| | - Thomas Schiex
- Université Fédérale de Toulouse, ANITI, INRAE, UR 875, Toulouse, France
| | - Sophie Barbe
- TBI, Université de Toulouse, CNRS, INRAE, INSA, ANITI, Toulouse, France.
| |
|
43
|
Green biomanufacturing promoted by automatic retrobiosynthesis planning and computational enzyme design. Chin J Chem Eng 2022. [DOI: 10.1016/j.cjche.2021.08.017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
|
44
|
Prates ET, Garvin MR, Jones P, Miller JI, Sullivan KA, Cliff A, Gazolla JGFM, Shah MB, Walker AM, Lane M, Rentsch CT, Justice A, Pavicic M, Romero J, Jacobson D. Antiviral Strategies Against SARS-CoV-2: A Systems Biology Approach. Methods Mol Biol 2022; 2452:317-351. [PMID: 35554915 DOI: 10.1007/978-1-0716-2111-0_19] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
The unprecedented scientific achievements in combating the COVID-19 pandemic reflect a global response informed by unprecedented access to data. We now have the ability to rapidly generate a diversity of information on an emerging pathogen and, by using high-performance computing and a systems biology approach, we can mine this wealth of information to understand the complexities of viral pathogenesis and contagion like never before. These efforts will aid in the development of vaccines, antiviral medications, and inform policymakers and clinicians. Here we detail computational protocols developed as SARS-CoV-2 began to spread across the globe. They include pathogen detection, comparative structural proteomics, evolutionary adaptation analysis via network and artificial intelligence methodologies, and multiomic integration. These protocols constitute a core framework on which to build a systems-level infrastructure that can be quickly brought to bear on future pathogens before they evolve into pandemic proportions.
Affiliation(s)
- Erica T Prates
- Oak Ridge National Laboratory, Computational Systems Biology, Oak Ridge, TN, USA
- National Virtual Biotechnology Laboratory, US Department of Energy, Washington, DC, USA
| | - Michael R Garvin
- Oak Ridge National Laboratory, Computational Systems Biology, Oak Ridge, TN, USA
- National Virtual Biotechnology Laboratory, US Department of Energy, Washington, DC, USA
| | - Piet Jones
- The Bredesen Center for Interdisciplinary Research and Graduate Education, University of Tennessee Knoxville, Knoxville, TN, USA
| | - J Izaak Miller
- Oak Ridge National Laboratory, Computational Systems Biology, Oak Ridge, TN, USA
- National Virtual Biotechnology Laboratory, US Department of Energy, Washington, DC, USA
| | - Kyle A Sullivan
- Oak Ridge National Laboratory, Computational Systems Biology, Oak Ridge, TN, USA
- National Virtual Biotechnology Laboratory, US Department of Energy, Washington, DC, USA
| | - Ashley Cliff
- The Bredesen Center for Interdisciplinary Research and Graduate Education, University of Tennessee Knoxville, Knoxville, TN, USA
| | - Joao Gabriel Felipe Machado Gazolla
- Oak Ridge National Laboratory, Computational Systems Biology, Oak Ridge, TN, USA
- National Virtual Biotechnology Laboratory, US Department of Energy, Washington, DC, USA
| | - Manesh B Shah
- Genome Science and Technology, University of Tennessee Knoxville, Knoxville, TN, USA
| | - Angelica M Walker
- The Bredesen Center for Interdisciplinary Research and Graduate Education, University of Tennessee Knoxville, Knoxville, TN, USA
| | - Matthew Lane
- The Bredesen Center for Interdisciplinary Research and Graduate Education, University of Tennessee Knoxville, Knoxville, TN, USA
| | - Christopher T Rentsch
- Faculty of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, London, UK
- VA Connecticut Healthcare/General Internal Medicine, West Haven, CT, USA
| | - Amy Justice
- VA Connecticut Healthcare/General Internal Medicine, West Haven, CT, USA
- Yale University School of Medicine, New Haven, CT, USA
| | - Mirko Pavicic
- Oak Ridge National Laboratory, Computational Systems Biology, Oak Ridge, TN, USA
- National Virtual Biotechnology Laboratory, US Department of Energy, Washington, DC, USA
| | - Jonathon Romero
- The Bredesen Center for Interdisciplinary Research and Graduate Education, University of Tennessee Knoxville, Knoxville, TN, USA
| | - Daniel Jacobson
- Oak Ridge National Laboratory, Computational Systems Biology, Oak Ridge, TN, USA.
- National Virtual Biotechnology Laboratory, US Department of Energy, Washington, DC, USA.
- The Bredesen Center for Interdisciplinary Research and Graduate Education, University of Tennessee Knoxville, Knoxville, TN, USA.
- Genome Science and Technology, University of Tennessee Knoxville, Knoxville, TN, USA.
- Department of Psychology, NeuroNet Research Center, University of Tennessee Knoxville, Knoxville, TN, USA.
| |
|
45
|
Liang S, Li Z, Zhan J, Zhou Y. De novo protein design by an energy function based on series expansion in distance and orientation dependence. Bioinformatics 2021; 38:86-93. [PMID: 34406339 DOI: 10.1093/bioinformatics/btab598] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 06/05/2021] [Revised: 08/11/2021] [Accepted: 08/16/2021] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION: Despite many successes, de novo protein design is not yet a solved problem as its success rate remains low. The low success rate is largely because we do not yet have an accurate energy function for describing the solvent-mediated interaction between amino acid residues in a protein chain. Previous studies showed that an energy function based on series expansions with its parameters optimized for side-chain and loop conformations can lead to one of the most accurate methods for side chain (OSCAR) and loop prediction (LEAP). Following the same strategy, we developed an energy function based on series expansions with the parameters optimized in four separate stages (recovering single-residue types without and with orientation dependence, selecting loop decoys and maintaining the composition of amino acids). We tested the energy function for de novo design by using Monte Carlo simulated annealing. RESULTS: The method for protein design (OSCAR-Design) is found to be as accurate as OSCAR and LEAP for side-chain and loop prediction, respectively. In de novo design, it can recover native residue types ranging from 38% to 43% depending on test sets, conserve hydrophobic/hydrophilic residues at ∼75%, and yield the overall similarity in amino acid compositions at more than 90%. These performance measures are all statistically significantly better than several protein design programs compared. Moreover, the largest hydrophobic patch areas in designed proteins are near or smaller than those in native proteins. Thus, an energy function based on series expansion can be made useful for protein design. AVAILABILITY AND IMPLEMENTATION: The Linux executable version is freely available for academic users at http://zhouyq-lab.szbl.ac.cn/resources/.
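The results above are reported as native residue recovery and hydrophobic/hydrophilic conservation. The sketch below shows one plausible way such percentages could be computed for a pair of aligned sequences. It is an illustration, not the paper's evaluation code: the function names are hypothetical, and the `HYDROPHOBIC` residue grouping is an assumption, since the abstract does not say which residues are counted as hydrophobic.

```python
# Assumed hydrophobic grouping; the abstract does not specify one.
HYDROPHOBIC = set("AVILMFWC")


def native_recovery(native, designed):
    """Fraction of positions where the designed residue equals the native one."""
    if len(native) != len(designed):
        raise ValueError("sequences must have equal length")
    return sum(a == b for a, b in zip(native, designed)) / len(native)


def class_conservation(native, designed):
    """Fraction of positions that keep the same hydrophobic/hydrophilic class."""
    if len(native) != len(designed):
        raise ValueError("sequences must have equal length")
    same = sum((a in HYDROPHOBIC) == (b in HYDROPHOBIC)
               for a, b in zip(native, designed))
    return same / len(native)


# Made-up four-residue example: three identical positions out of four.
print(native_recovery("ACDE", "ACDF"))     # 0.75
# No identical residues, but every position keeps its class.
print(class_conservation("AVKE", "LIRD"))  # 1.0
```

The second call shows why the two measures can diverge: a design may replace every residue yet still preserve the hydrophobic pattern of the native sequence.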
Affiliation(s)
- Shide Liang
- Department of R & D, Bio-Thera Solutions, Guangzhou 510530, China
| | - Zhixiu Li
- Institute of Health and Biomedical Innovation, Queensland University of Technology at Translational Research Institute, Woolloongabba, QLD 3001, Australia
| | - Jian Zhan
- Institute for Glycomics and School of Information and Communication Technology, Griffith University, Gold Coast Campus, Southport, QLD 4222, Australia; Institute for Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518055, China
| | - Yaoqi Zhou
- Institute for Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518055, China; Peking University Shenzhen Graduate School, Shenzhen 518055, China
| |
|
46
|
Hussain M, Cummins MC, Endo-Streeter S, Sondek J, Kuhlman B. Designer proteins that competitively inhibit Gαq by targeting its effector site. J Biol Chem 2021; 297:101348. [PMID: 34715131 PMCID: PMC8633581 DOI: 10.1016/j.jbc.2021.101348] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/14/2021] [Revised: 10/12/2021] [Accepted: 10/22/2021] [Indexed: 11/30/2022] Open
Abstract
During signal transduction, the G protein, Gαq, binds and activates phospholipase C-β isozymes. Several diseases have been shown to manifest upon constitutively activating mutation of Gαq, such as uveal melanoma. Therefore, methods are needed to directly inhibit Gαq. Previously, we demonstrated that a peptide derived from a helix-turn-helix (HTH) region of PLC-β3 (residues 852-878) binds Gαq with low micromolar affinity and inhibits Gαq by competing with full-length PLC-β isozymes for binding. Since the HTH peptide is unstructured in the absence of Gαq, we hypothesized that embedding the HTH in a folded protein might stabilize the binding-competent conformation and further improve the potency of inhibition. Using the molecular modeling software Rosetta, we searched the Protein Data Bank for proteins with similar HTH structures near their surface. The candidate proteins were computationally docked against Gαq, and their surfaces were redesigned to stabilize this interaction. We then used yeast surface display to affinity mature the designs. The most potent design bound Gαq/i with high affinity in vitro (KD = 18 nM) and inhibited activation of PLC-β isozymes in HEK293 cells. We anticipate that our genetically encoded inhibitor will help interrogate the role of Gαq in healthy and disease model systems. Our work demonstrates that grafting interaction motifs into folded proteins is a powerful approach for generating inhibitors of protein-protein interactions.
Affiliation(s)
- Mahmud Hussain
- Department of Biochemistry and Biophysics, University of North Carolina, Chapel Hill, North Carolina, USA
| | - Matthew C Cummins
- Department of Pharmacology, University of North Carolina, Chapel Hill, North Carolina, USA
| | - Stuart Endo-Streeter
- Department of Pharmacology, University of North Carolina, Chapel Hill, North Carolina, USA
| | - John Sondek
- Department of Biochemistry and Biophysics, University of North Carolina, Chapel Hill, North Carolina, USA; Department of Pharmacology, University of North Carolina, Chapel Hill, North Carolina, USA; Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, North Carolina, USA.
| | - Brian Kuhlman
- Department of Biochemistry and Biophysics, University of North Carolina, Chapel Hill, North Carolina, USA; Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, North Carolina, USA.
| |
|
47
|
Zhu J, Avakyan N, Kakkis AA, Hoffnagle AM, Han K, Li Y, Zhang Z, Choi TS, Na Y, Yu CJ, Tezcan FA. Protein Assembly by Design. Chem Rev 2021; 121:13701-13796. [PMID: 34405992 PMCID: PMC9148388 DOI: 10.1021/acs.chemrev.1c00308] [Citation(s) in RCA: 89] [Impact Index Per Article: 29.7] [Indexed: 12/11/2022]
Abstract
Proteins are nature's primary building blocks for the construction of sophisticated molecular machines and dynamic materials, ranging from protein complexes such as photosystem II and nitrogenase that drive biogeochemical cycles to cytoskeletal assemblies and muscle fibers for motion. Such natural systems have inspired extensive efforts in the rational design of artificial protein assemblies in the last two decades. As molecular building blocks, proteins are highly complex, in terms of both their three-dimensional structures and chemical compositions. To enable control over the self-assembly of such complex molecules, scientists have devised many creative strategies by combining tools and principles of experimental and computational biophysics, supramolecular chemistry, inorganic chemistry, materials science, and polymer chemistry, among others. Owing to these innovative strategies, what started as a purely structure-building exercise two decades ago has, in short order, led to artificial protein assemblies with unprecedented structures and functions and protein-based materials with unusual properties. Our goal in this review is to give an overview of this exciting and highly interdisciplinary area of research, first outlining the design strategies and tools that have been devised for controlling protein self-assembly, then describing the diverse structures of artificial protein assemblies, and finally highlighting the emergent properties and functions of these assemblies.
Affiliation(s)
| | | | - Albert A. Kakkis
- Department of Chemistry and Biochemistry, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093-0340, United States
| | - Alexander M. Hoffnagle
- Department of Chemistry and Biochemistry, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093-0340, United States
| | - Kenneth Han
- Department of Chemistry and Biochemistry, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093-0340, United States
| | - Yiying Li
- Department of Chemistry and Biochemistry, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093-0340, United States
| | - Zhiyin Zhang
- Department of Chemistry and Biochemistry, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093-0340, United States
| | - Tae Su Choi
- Department of Chemistry and Biochemistry, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093-0340, United States
| | - Youjeong Na
- Department of Chemistry and Biochemistry, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093-0340, United States
| | - Chung-Jui Yu
- Department of Chemistry and Biochemistry, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093-0340, United States
| | - F. Akif Tezcan
- Department of Chemistry and Biochemistry, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093-0340, United States
| |
|
48
|
Defresne M, Barbe S, Schiex T. Protein Design with Deep Learning. Int J Mol Sci 2021; 22:11741. [PMID: 34769173 PMCID: PMC8584038 DOI: 10.3390/ijms222111741] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 09/30/2021] [Revised: 10/23/2021] [Accepted: 10/26/2021] [Indexed: 12/21/2022] Open
Abstract
Computational Protein Design (CPD) has produced impressive results for engineering new proteins, resulting in a wide variety of applications. In the past few years, various efforts have aimed at replacing or improving existing design methods using Deep Learning technology to leverage the amount of publicly available protein data. Deep Learning (DL) is a very powerful tool to extract patterns from raw data, provided that data are formatted as mathematical objects and the architecture processing them is well suited to the targeted problem. In the case of protein data, specific representations are needed for both the amino acid sequence and the protein structure in order to capture respectively 1D and 3D information. As no consensus has been reached about the most suitable representations, this review describes the representations used so far, discusses their strengths and weaknesses, and details their associated DL architecture for design and related tasks.
Affiliation(s)
- Marianne Defresne
- Toulouse Biotechnology Institute, Université de Toulouse, CNRS, INRAE, INSA, ANITI, 31077 Toulouse, France; (M.D.); (S.B.)
- Université Fédérale de Toulouse, ANITI, INRAE, UR 875, 31326 Toulouse, France
| | - Sophie Barbe
- Toulouse Biotechnology Institute, Université de Toulouse, CNRS, INRAE, INSA, ANITI, 31077 Toulouse, France; (M.D.); (S.B.)
| | - Thomas Schiex
- Université Fédérale de Toulouse, ANITI, INRAE, UR 875, 31326 Toulouse, France
| |
|
49
|
Saikia B, Gogoi CR, Rahman A, Baruah A. Identification of an optimal foldability criterion to design misfolding resistant protein. J Chem Phys 2021; 155:144102. [PMID: 34654294 DOI: 10.1063/5.0057533] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/28/2022] Open
Abstract
Proteins achieve their functional, active, and operative three dimensional native structures by overcoming the possibility of being trapped in non-native energy minima present in the energy landscape. The enormous and intricate interactions that play an important role in protein folding also determine the stability of the proteins. The large number of stabilizing/destabilizing interactions makes proteins to be only marginally stable as compared to the other competing structures. Therefore, there are some possibilities that they become trapped in the non-native conformations and thus get misfolded. These misfolded proteins lead to several debilitating diseases. This work performs a comparative study of some existing foldability criteria in the computational design of misfold resistant protein sequences based on self-consistent mean field theory. The foldability criteria selected for this study are Ef, Δ, and Φ that are commonly used in protein design procedures to determine the most efficient foldability criterion for the design of misfolding resistant proteins. The results suggest that the foldability criterion Δ is significantly better in designing a funnel energy landscape stabilizing the target state. The results also suggest that inclusion of negative design features is important for designing misfolding resistant proteins, but more information about the non-native conformations in terms of Φ leads to worse results compared to even simple positive design. The sequences designed using Δ show better resistance to misfolding in the Monte Carlo simulations performed in the study.
Affiliation(s)
- Bondeepa Saikia
- Department of Chemistry, Dibrugarh University, Dibrugarh 786004, India
| | - Chimi Rekha Gogoi
- Department of Chemistry, Dibrugarh University, Dibrugarh 786004, India
| | - Aziza Rahman
- Department of Chemistry, Dibrugarh University, Dibrugarh 786004, India
| | - Anupaul Baruah
- Department of Chemistry, Dibrugarh University, Dibrugarh 786004, India
| |
|
50
|
Pal A, Mulumudy R, Mitra P. Modularity-based parallel protein design algorithm with an implementation using shared memory programming. Proteins 2021; 90:658-669. [PMID: 34651333 DOI: 10.1002/prot.26263] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/17/2021] [Revised: 09/23/2021] [Accepted: 10/01/2021] [Indexed: 01/08/2023]
Abstract
Given a target protein structure, the prime objective of protein design is to find amino acid sequences that will fold/acquire to the given three-dimensional structure. The protein design problem belongs to the non-deterministic polynomial-time-hard class as sequence search space increases exponentially with protein length. To ensure better search space exploration and faster convergence, we propose a protein modularity-based parallel protein design algorithm. The modular architecture of the protein structure is exploited by considering an intermediate structural organization between secondary structure and domain defined as protein unit (PU). Here, we have incorporated a divide-and-conquer approach where a protein is split into PUs and each PU region is explored in a parallel fashion. It has been further analyzed that our shared memory implementation of modularity-based parallel sequence search leads to better search space exploration compared to the case of traditional full protein design. Sequence-based analysis on design sequences depicts an average of 39.7% sequence similarity on the benchmark data set. Structure-based comparison of the modeled structures of the design protein with the target structure exhibited an average root-mean-square deviation of 1.17 Å and an average template modeling score of 0.89. The selected modeled structures of the design protein sequences are validated using 100 ns molecular dynamics simulations where 80% of the proteins have shown better or similar stability to the respective target proteins. Our study informs that our modularity-based protein design algorithm can be extended to protein interaction design as well.
Affiliation(s)
- Abantika Pal
- Department of Computer Science and Engineering, Indian Institute of Technology Kharagpur, Kharagpur, West Bengal, India
| | - Rohith Mulumudy
- Department of Computer Science and Engineering, Indian Institute of Technology Kharagpur, Kharagpur, West Bengal, India
| | - Pralay Mitra
- Department of Computer Science and Engineering, Indian Institute of Technology Kharagpur, Kharagpur, West Bengal, India
| |
|