1
Randolph NZ, Kuhlman B. Invariant point message passing for protein side chain packing. Proteins 2024. [PMID: 38790143 DOI: 10.1002/prot.26705] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Revised: 04/19/2024] [Accepted: 05/13/2024] [Indexed: 05/26/2024]
Abstract
Protein side chain packing (PSCP) is a fundamental problem in the field of protein engineering, as high-confidence and low-energy conformations of amino acid side chains are crucial for understanding (and designing) protein folding, protein-protein interactions, and protein-ligand interactions. Traditional PSCP methods (such as the Rosetta Packer) often rely on a library of discrete side chain conformations, or rotamers, and a forcefield to guide the structure to low-energy conformations. Recently, deep learning (DL) based methods (such as DLPacker, AttnPacker, and DiffPack) have demonstrated state-of-the-art predictions and speed in the PSCP task. Building off the success of geometric graph neural networks for protein modeling, we present the Protein Invariant Point Packer (PIPPack), which effectively processes local structural and sequence information to produce realistic, idealized side chain coordinates using χ-angle distribution predictions and geometry-aware invariant point message passing (IPMP). On a test set of ∼1400 high-quality protein chains, PIPPack is highly competitive with other state-of-the-art PSCP methods in rotamer recovery and per-residue RMSD but is significantly faster.
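The rotamer recovery metric named in the abstract can be illustrated with a minimal sketch. The abstract does not give a definition, so this assumes the common convention that a residue counts as recovered when every predicted χ angle lies within 20° of the native value. The function names and the `tol_deg` parameter are hypothetical, and symmetric side chains (for example the χ2 of Phe) are not given special treatment here.

```python
def angular_diff_deg(a, b):
    """Smallest absolute difference between two angles in degrees (handles wrap-around)."""
    d = (a - b) % 360.0
    return min(d, 360.0 - d)


def rotamer_recovered(pred_chis, native_chis, tol_deg=20.0):
    """True when every predicted chi angle is within tol_deg of its native value.

    The 20 degree default is an assumed convention, not taken from the abstract.
    """
    return all(angular_diff_deg(p, n) <= tol_deg
               for p, n in zip(pred_chis, native_chis))


def rotamer_recovery(pred, native, tol_deg=20.0):
    """Fraction of residues whose chi angles are all recovered."""
    hits = [rotamer_recovered(p, n, tol_deg) for p, n in zip(pred, native)]
    return sum(hits) / len(hits) if hits else 0.0


# Two residues: the first is recovered (175 and -178 differ by only 7 degrees
# once wrap-around is handled), the second is not.
example = rotamer_recovery([[60.0, 175.0], [-65.0]], [[62.0, -178.0], [180.0]])
```

The wrap-around handling matters because χ angles are periodic: 179° and -179° are 2° apart, not 358°.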
Affiliation(s)
- Nicholas Z Randolph
- Department of Bioinformatics and Computational Biology, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
- Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
- Brian Kuhlman
- Department of Bioinformatics and Computational Biology, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
- Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
2
Lewis CTA, Melhedegaard EG, Ognjanovic MM, Olsen MS, Laitila J, Seaborne RAE, Gronset M, Zhang C, Iwamoto H, Hessel AL, Kuehn MN, Merino C, Amigo N, Frobert O, Giroud S, Staples JF, Goropashnaya AV, Fedorov VB, Barnes B, Toien O, Drew K, Sprenger RJ, Ochala J. Remodeling of skeletal muscle myosin metabolic states in hibernating mammals. eLife 2024; 13:RP94616. [PMID: 38752835 PMCID: PMC11098559 DOI: 10.7554/elife.94616] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/18/2024] Open
Abstract
Hibernation is a period of metabolic suppression utilized by many small and large mammal species to survive during winter periods. As the underlying cellular and molecular mechanisms remain incompletely understood, our study aimed to determine whether skeletal muscle myosin and its metabolic efficiency undergo alterations during hibernation to optimize energy utilization. We isolated muscle fibers from small hibernators, Ictidomys tridecemlineatus and Eliomys quercinus, and larger hibernators, Ursus arctos and Ursus americanus. We then conducted loaded Mant-ATP chase experiments alongside X-ray diffraction to measure resting myosin dynamics and its ATP demand. In parallel, we performed multiple proteomics analyses. Our results showed a preservation of myosin structure in U. arctos and U. americanus during hibernation, whilst in I. tridecemlineatus and E. quercinus, changes in myosin metabolic states during torpor unexpectedly led to higher levels of energy expenditure in type II, fast-twitch muscle fibers at ambient lab temperatures (20 °C). Upon repeating loaded Mant-ATP chase experiments at 8 °C (near the body temperature of torpid animals), we found that myosin ATP consumption in type II muscle fibers was reduced by 77-107% during torpor compared to active periods. Additionally, we observed Myh2 hyper-phosphorylation during torpor in I. tridecemlineatus, which was predicted to stabilize the myosin molecule. This may act as a potential molecular mechanism mitigating myosin-associated increases in skeletal muscle energy expenditure during periods of torpor in response to cold exposure. Altogether, we demonstrate that resting myosin is altered in hibernating mammals, contributing to significant changes to the ATP consumption of skeletal muscle. Additionally, we observe that it is further altered in response to cold exposure and highlight myosin as a potential contributor to skeletal muscle non-shivering thermogenesis.
Affiliation(s)
- Marija M Ognjanovic
- Department of Biomedical Sciences, University of Copenhagen, Copenhagen, Denmark
- Mathilde S Olsen
- Department of Biomedical Sciences, University of Copenhagen, Copenhagen, Denmark
- Jenni Laitila
- Department of Biomedical Sciences, University of Copenhagen, Copenhagen, Denmark
- Robert AE Seaborne
- Department of Biomedical Sciences, University of Copenhagen, Copenhagen, Denmark
- Centre for Human and Applied Physiological Sciences, Faculty of Life Sciences & Medicine, King’s College London, London, United Kingdom
- Magnus Gronset
- Department of Cellular and Molecular Medicine, University of Copenhagen, Copenhagen, Denmark
- Changxin Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, United States
- Hiroyuki Iwamoto
- Spring-8, Japan Synchrotron Radiation Research Institute, Hyogo, Japan
- Anthony L Hessel
- Institute of Physiology II, University of Muenster, Muenster, Germany
- Accelerated Muscle Biotechnologies Consultants, Boston, United States
- Michel N Kuehn
- Institute of Physiology II, University of Muenster, Muenster, Germany
- Accelerated Muscle Biotechnologies Consultants, Boston, United States
- Ole Frobert
- Department of Clinical Medicine, Faculty of Health, Aarhus University, Aarhus, Denmark
- Faculty of Health, Department of Cardiology, Örebro University, Örebro, Sweden
- Sylvain Giroud
- Energetics Lab, Department of Biology, Northern Michigan University, Marquette, United States
- Research Institute of Wildlife Ecology, Department of Interdisciplinary Life Sciences, University of Veterinary Medicine Vienna, Vienna, Austria
- James F Staples
- Department of Biology, University of Western Ontario, London, Canada
- Anna V Goropashnaya
- Center for Transformative Research in Metabolism, Institute of Arctic Biology, University of Alaska Fairbanks, Fairbanks, United States
- Vadim B Fedorov
- Center for Transformative Research in Metabolism, Institute of Arctic Biology, University of Alaska Fairbanks, Fairbanks, United States
- Brian Barnes
- Center for Transformative Research in Metabolism, Institute of Arctic Biology, University of Alaska Fairbanks, Fairbanks, United States
- Oivind Toien
- Center for Transformative Research in Metabolism, Institute of Arctic Biology, University of Alaska Fairbanks, Fairbanks, United States
- Kelly Drew
- Center for Transformative Research in Metabolism, Institute of Arctic Biology, University of Alaska Fairbanks, Fairbanks, United States
- Ryan J Sprenger
- Department of Zoology, University of British Columbia, Vancouver, Canada
- Julien Ochala
- Department of Biomedical Sciences, University of Copenhagen, Copenhagen, Denmark
3
Tandiana R, Barletta GP, Soler MA, Fortuna S, Rocchia W. Computational Mutagenesis of Antibody Fragments: Disentangling Side Chains from ΔΔG Predictions. J Chem Theory Comput 2024; 20:2630-2642. [PMID: 38445482 DOI: 10.1021/acs.jctc.3c01225] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/07/2024]
Abstract
The development of highly potent antibodies and antibody fragments as binding agents holds significant implications in fields such as biosensing and biotherapeutics. Their binding strength is intricately linked to the arrangement and composition of residues at the binding interface. Computational techniques offer a robust means to predict the three-dimensional structure of these complexes and to assess the affinity changes resulting from mutations. Given the interdependence of structure and affinity prediction, our objective here is to disentangle their roles. We independently evaluate six side-chain reconstruction methods and ten binding affinity estimation techniques. This evaluation is pivotal for predicting affinity alterations due to single mutations, a key step in computational affinity maturation protocols. Our analysis focuses on a data set comprising 27 distinct antibody/hen egg white lysozyme complexes, each with crystal structures and experimentally determined binding affinities. Using six different side-chain reconstruction methods, we transformed each structure into its corresponding mutant via in silico single-point mutations. Subsequently, these structures underwent minimization and molecular dynamics simulation. We then estimated ΔΔG values based on the original crystal structure, its energy-minimized form, and the ensuing molecular dynamics trajectories. Our research underscores the critical importance of selecting reliable side-chain reconstruction methods and conducting thorough molecular dynamics simulations to accurately predict the impact of mutations. In summary, our study demonstrates that the integration of conformational sampling and scoring is a potent approach to precisely characterizing mutation processes in single-point mutagenesis protocols and is crucial for computational antibody design.
Affiliation(s)
- Rika Tandiana
- Computational MOdelling of NanosCalE and BioPhysical SysTems─CONCEPT Lab Istituto Italiano di Tecnologia (IIT), Via Melen-83, B Block, 16152 Genoa, Italy
- German P Barletta
- Computational MOdelling of NanosCalE and BioPhysical SysTems─CONCEPT Lab Istituto Italiano di Tecnologia (IIT), Via Melen-83, B Block, 16152 Genoa, Italy
- The Abdus Salam International Centre for Theoretical Physics─ICTP, Strada Costiera 11, 34151 Trieste, Italy
- Miguel Angel Soler
- Dipartimento di Scienze Matematiche, Informatiche e Fisiche, Universita' di Udine, Via delle Scienze 206, 33100 Udine, Italy
- Sara Fortuna
- Computational MOdelling of NanosCalE and BioPhysical SysTems─CONCEPT Lab Istituto Italiano di Tecnologia (IIT), Via Melen-83, B Block, 16152 Genoa, Italy
- Walter Rocchia
- Computational MOdelling of NanosCalE and BioPhysical SysTems─CONCEPT Lab Istituto Italiano di Tecnologia (IIT), Via Melen-83, B Block, 16152 Genoa, Italy
4
Lewis CTA, Melhedegaard EG, Ognjanovic MM, Olsen MS, Laitila J, Seaborne RAE, Gronset MN, Zhang C, Iwamoto H, Hessel AL, Kuehn MN, Merino C, Amigo N, Frobert O, Giroud S, Staples JF, Goropashnaya AV, Fedorov VB, Barnes BM, Toien O, Drew KL, Sprenger RJ, Ochala J. Remodelling of Skeletal Muscle Myosin Metabolic States in Hibernating Mammals. bioRxiv 2024:2023.11.14.566992. [PMID: 38014200 PMCID: PMC10680686 DOI: 10.1101/2023.11.14.566992] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2023]
Abstract
Hibernation is a period of metabolic suppression utilized by many small and large mammal species to survive during winter periods. As the underlying cellular and molecular mechanisms remain incompletely understood, our study aimed to determine whether skeletal muscle myosin and its metabolic efficiency undergo alterations during hibernation to optimize energy utilization. We isolated muscle fibers from small hibernators, Ictidomys tridecemlineatus and Eliomys quercinus, and larger hibernators, Ursus arctos and Ursus americanus. We then conducted loaded Mant-ATP chase experiments alongside X-ray diffraction to measure resting myosin dynamics and its ATP demand. In parallel, we performed multiple proteomics analyses. Our results showed a preservation of myosin structure in U. arctos and U. americanus during hibernation, whilst in I. tridecemlineatus and E. quercinus, changes in myosin metabolic states during torpor unexpectedly led to higher levels of energy expenditure in type II, fast-twitch muscle fibers at ambient lab temperatures (20°C). Upon repeating loaded Mant-ATP chase experiments at 8°C (near the body temperature of torpid animals), we found that myosin ATP consumption in type II muscle fibers was reduced by 77-107% during torpor compared to active periods. Additionally, we observed Myh2 hyper-phosphorylation during torpor in I. tridecemlineatus, which was predicted to stabilize the myosin molecule. This may act as a potential molecular mechanism mitigating myosin-associated increases in skeletal muscle energy expenditure during periods of torpor in response to cold exposure. Altogether, we demonstrate that resting myosin is altered in hibernating mammals, contributing to significant changes to the ATP consumption of skeletal muscle. Additionally, we observe that it is further altered in response to cold exposure and highlight myosin as a potential contributor to skeletal muscle non-shivering thermogenesis.
5
Chu HY, Fong JHC, Thean DGL, Zhou P, Fung FKC, Huang Y, Wong ASL. Accurate top protein variant discovery via low-N pick-and-validate machine learning. Cell Syst 2024; 15:193-203.e6. [PMID: 38340729 DOI: 10.1016/j.cels.2024.01.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2023] [Revised: 10/11/2023] [Accepted: 01/18/2024] [Indexed: 02/12/2024]
Abstract
A strategy to obtain the greatest number of best-performing variants with the least amount of experimental effort over the vast combinatorial mutational landscape would have enormous utility in boosting resource producibility for protein engineering. Toward this goal, we present a simple and effective machine learning-based strategy that outperforms other state-of-the-art methods. Our strategy integrates zero-shot prediction and multi-round sampling to direct active learning via experimenting with only a few predicted top variants. We find that four rounds of low-N pick-and-validate sampling of 12 variants for machine learning yielded the best accuracy of up to 92.6% in selecting the true top 1% variants in combinatorial mutant libraries, whereas two rounds of 24 variants can also be used. We demonstrate our strategy in successfully discovering high-performance protein variants from diverse families including CRISPR-based genome editors, supporting its generalizable application for solving protein engineering tasks. A record of this paper's transparent peer review process is included in the supplemental information.
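The multi-round idea in this abstract (start from zero-shot scores, measure a small batch, refit, repeat) can be sketched roughly as follows. This is only an illustration under stated assumptions: the listing does not describe the paper's models or features, so closed-form ridge regression on binary mutation features is swapped in as a stand-in, the data are synthetic, and every name here (`pick_and_validate`, `measure`, `ridge_fit_predict`) is hypothetical. Only the defaults of four rounds and 12 variants per round come from the abstract.

```python
import numpy as np


def ridge_fit_predict(X_train, y_train, X_all, alpha=1.0):
    """Closed-form ridge regression (no intercept); predicts every row of X_all."""
    d = X_train.shape[1]
    w = np.linalg.solve(X_train.T @ X_train + alpha * np.eye(d), X_train.T @ y_train)
    return X_all @ w


def pick_and_validate(X, zero_shot, measure, rounds=4, batch=12):
    """Round 1 picks by zero-shot score; later rounds pick by a model refit on all measurements."""
    measured = {}                                # variant index -> measured fitness
    scores = np.asarray(zero_shot, dtype=float)
    for _ in range(rounds):
        ranked = np.argsort(-scores)             # best predicted variant first
        picks = [int(i) for i in ranked if int(i) not in measured][:batch]
        for i in picks:
            measured[i] = measure(i)             # stands in for the wet-lab assay
        idx = np.array(sorted(measured))
        y = np.array([measured[i] for i in idx])
        scores = ridge_fit_predict(X[idx], y, X)  # refit on everything measured so far
    return measured


# Synthetic library (assumption): 500 variants, 20 binary mutation features,
# additive ground-truth fitness, and a noisy zero-shot score.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(500, 20)).astype(float)
fitness = X @ rng.normal(size=20)
zero_shot = fitness + rng.normal(scale=2.0, size=500)
measured = pick_and_validate(X, zero_shot, measure=lambda i: float(fitness[i]))
```

With the defaults, the loop measures 4 × 12 = 48 distinct variants in total; setting `rounds=2, batch=24` gives the alternative schedule the abstract mentions at the same experimental budget.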
Affiliation(s)
- Hoi Yee Chu
- Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Pokfulam, Hong Kong SAR, China; Centre for Oncology and Immunology, Hong Kong Science Park, Hong Kong SAR, China
- John H C Fong
- Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Pokfulam, Hong Kong SAR, China
- Dawn G L Thean
- Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Pokfulam, Hong Kong SAR, China
- Peng Zhou
- Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Pokfulam, Hong Kong SAR, China; Centre for Oncology and Immunology, Hong Kong Science Park, Hong Kong SAR, China
- Frederic K C Fung
- Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Pokfulam, Hong Kong SAR, China; Centre for Oncology and Immunology, Hong Kong Science Park, Hong Kong SAR, China
- Yuanhua Huang
- School of Biomedical Sciences, The University of Hong Kong, Pokfulam, Hong Kong SAR, China; Department of Statistics and Actuarial Science, The University of Hong Kong, Pokfulam, Hong Kong SAR, China
- Alan S L Wong
- Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Pokfulam, Hong Kong SAR, China; Centre for Oncology and Immunology, Hong Kong Science Park, Hong Kong SAR, China
6
Vincenzi M, Mercurio FA, Leone M. Virtual Screening of Peptide Libraries: The Search for Peptide-Based Therapeutics Using Computational Tools. Int J Mol Sci 2024; 25:1798. [PMID: 38339078 PMCID: PMC10855943 DOI: 10.3390/ijms25031798] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2023] [Revised: 01/26/2024] [Accepted: 01/30/2024] [Indexed: 02/12/2024] Open
Abstract
Over the last few decades, we have witnessed growing interest from both academic and industrial laboratories in peptides as possible therapeutics. Bioactive peptides have a high potential to treat various diseases with specificity and biological safety. Compared to small molecules, peptides represent better candidates as inhibitors (or general modulators) of key protein-protein interactions. In fact, undruggable proteins containing large and smooth surfaces can be more easily targeted with the conformational plasticity of peptides. The discovery of bioactive peptides that act against disease-relevant protein targets generally requires the high-throughput screening of large libraries, and in silico approaches are widely used for their low cost and efficiency. The present review reports on the potential challenges linked to the employment of peptides as therapeutics and describes computational approaches, mainly structure-based virtual screening (SBVS), to support the identification of novel peptides for therapeutic implementations. Cutting-edge SBVS strategies are reviewed along with examples of applications focused on diverse classes of bioactive peptides (i.e., anticancer, antimicrobial/antiviral peptides, peptides blocking amyloid fiber formation).
Affiliation(s)
- Marilisa Leone
- Institute of Biostructures and Bioimaging, Via Pietro Castellino 111, 80131 Naples, Italy; (M.V.); (F.A.M.)
7
Castorina LV, Ünal SM, Subr K, Wood CW. TIMED-Design: flexible and accessible protein sequence design with convolutional neural networks. Protein Eng Des Sel 2024; 37:gzae002. [PMID: 38288671 PMCID: PMC10939383 DOI: 10.1093/protein/gzae002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2023] [Revised: 12/12/2023] [Accepted: 01/12/2024] [Indexed: 02/18/2024] Open
Abstract
Sequence design is a crucial step in the process of designing or engineering proteins. Traditionally, physics-based methods have been used to solve for optimal sequences, with the main disadvantage being that they are computationally intensive for the end user. Deep learning-based methods offer an attractive alternative, outperforming physics-based methods at a significantly lower computational cost. In this paper, we explore the application of Convolutional Neural Networks (CNNs) for sequence design. We describe the development and benchmarking of a range of networks, as well as reimplementations of previously described CNNs. We demonstrate the flexibility of representing proteins in a three-dimensional voxel grid by encoding additional design constraints into the input data. Finally, we describe TIMED-Design, a web application and command line tool for exploring and applying the models described in this paper. The user interface will be available at the URL: https://pragmaticproteindesign.bio.ed.ac.uk/timed. The source code for TIMED-Design is available at https://github.com/wells-wood-research/timed-design.
Affiliation(s)
- Leonardo V Castorina
- School of Informatics, University of Edinburgh, 10 Crichton Street, Edinburgh EH8 9AB, United Kingdom
- Suleyman Mert Ünal
- School of Biological Sciences, University of Edinburgh, Roger Land Building, Edinburgh EH9 3FF, United Kingdom
- Kartic Subr
- School of Informatics, University of Edinburgh, 10 Crichton Street, Edinburgh EH8 9AB, United Kingdom
- Christopher W Wood
- School of Biological Sciences, University of Edinburgh, Roger Land Building, Edinburgh EH9 3FF, United Kingdom
8
Liu Y, Liu H. Protein sequence design on given backbones with deep learning. Protein Eng Des Sel 2024; 37:gzad024. [PMID: 38157313 DOI: 10.1093/protein/gzad024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2023] [Revised: 12/08/2023] [Accepted: 12/18/2023] [Indexed: 01/03/2024] Open
Abstract
Deep learning methods for protein sequence design focus on modeling and sampling the many-dimensional distribution of amino acid sequences conditioned on the backbone structure. To produce physically foldable sequences, inter-residue couplings need to be considered properly. These couplings are treated explicitly in iterative methods or autoregressive methods. Non-autoregressive models treating these couplings implicitly are computationally more efficient, but still await tests by wet experiment. Currently, sequence design methods are evaluated mainly using native sequence recovery rate and native sequence perplexity. These metrics can be complemented by sequence-structure compatibility metrics obtained from energy calculation or structure prediction. However, existing computational metrics have important limitations that may render the generalization of computational test results to performance in real applications unwarranted. Validation of design methods by wet experiments should be encouraged.
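The two evaluation metrics named in this abstract can be written down compactly. The sketch below uses their standard definitions, which the abstract itself does not spell out: recovery is the fraction of positions where the designed residue equals the native one, and perplexity is the exponential of the mean negative log-probability the model assigns to the native residues. The function names and the `native_probs` input format are assumptions made for illustration.

```python
import math


def sequence_recovery(designed, native):
    """Fraction of positions where the designed residue matches the native residue."""
    if len(designed) != len(native):
        raise ValueError("sequences must have equal length")
    matches = sum(d == n for d, n in zip(designed, native))
    return matches / len(native)


def perplexity(native_probs):
    """exp(mean negative log-likelihood) of the native sequence.

    native_probs[i] is the probability the model assigns to the native
    residue at position i (assumed input format).
    """
    nll = -sum(math.log(p) for p in native_probs) / len(native_probs)
    return math.exp(nll)


recovery_example = sequence_recovery("ACDE", "ACDF")   # 3 of 4 positions match
uniform_example = perplexity([0.05] * 10)              # uniform over 20 amino acids
```

A model that spreads probability uniformly over the 20 amino acids has a perplexity of 20, and a model that always assigns probability 1 to the native residue has a perplexity of 1, which is why lower values indicate a better fit to native sequences.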
Affiliation(s)
- Yufeng Liu
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- Haiyan Liu
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui 230027, China
- Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, Anhui 230027, China
- School of Biomedical Engineering, Suzhou Institute for Advanced Research, University of Science and Technology of China, Suzhou, Jiangsu 215004, China
9
Randolph NZ, Kuhlman B. Invariant point message passing for protein side chain packing. bioRxiv 2023:2023.08.03.551328. [PMID: 38187664 PMCID: PMC10769188 DOI: 10.1101/2023.08.03.551328] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2024]
Abstract
Protein side chain packing (PSCP) is a fundamental problem in the field of protein engineering, as high-confidence and low-energy conformations of amino acid side chains are crucial for understanding (and designing) protein folding, protein-protein interactions, and protein-ligand interactions. Traditional PSCP methods (such as the Rosetta Packer) often rely on a library of discrete side chain conformations, or rotamers, and a forcefield to guide the structure to low-energy conformations. Recently, deep learning (DL) based methods (such as DLPacker, AttnPacker, and DiffPack) have demonstrated state-of-the-art predictions and speed in the PSCP task. Building off the success of geometric graph neural networks for protein modeling, we present the Protein Invariant Point Packer (PIPPack) which effectively processes local structural and sequence information to produce realistic, idealized side chain coordinates using χ-angle distribution predictions and geometry-aware invariant point message passing (IPMP). On a test set of ~1,400 high-quality protein chains, PIPPack is highly competitive with other state-of-the-art PSCP methods in rotamer recovery and per-residue RMSD but is significantly faster.
Affiliation(s)
- Nicholas Z Randolph
- Department of Bioinformatics and Computational Biology, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
- Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
- Brian Kuhlman
- Department of Bioinformatics and Computational Biology, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
- Department of Biochemistry and Biophysics, University of North Carolina School of Medicine, Chapel Hill, North Carolina, USA
10
Ptak CP, Peterson TA, Hopkins JB, Ahern CA, Shy ME, Piper RC. Homomeric interactions of the MPZ Ig domain and their relation to Charcot-Marie-Tooth disease. Brain 2023; 146:5110-5123. [PMID: 37542466 PMCID: PMC10690024 DOI: 10.1093/brain/awad258] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/04/2023] [Revised: 06/28/2023] [Accepted: 07/17/2023] [Indexed: 08/07/2023] Open
Abstract
Mutations in MPZ (myelin protein zero) can cause demyelinating early-onset Charcot-Marie-Tooth type 1B disease or later onset type 2I/J disease characterized by axonal degeneration, reflecting the diverse roles of MPZ in Schwann cells. MPZ holds apposing membranes of the myelin sheath together, with the adhesion role fulfilled by its extracellular immunoglobulin-like domain (IgMPZ), which oligomerizes. Models for how the IgMPZ might form oligomeric assemblies have been extrapolated from a protein crystal structure in which individual rat IgMPZ subunits are packed together under artificial conditions, forming three weak interfaces. One interface organizes the IgMPZ into tetramers, a second 'dimer' interface links tetramers together across the intraperiod line, and a third hydrophobic interface mediates binding to lipid bilayers or the same hydrophobic surface on another IgMPZ domain. Presently, there are no data confirming whether the proposed IgMPZ interfaces actually mediate oligomerization in solution, whether they are required for the adhesion activity of MPZ, whether they are important for myelination, or whether their loss results in disease. We performed nuclear magnetic resonance spectroscopy and small angle X-ray scattering analysis of wild-type IgMPZ as well as mutant forms with amino acid substitutions designed to interrupt its presumptive oligomerization interfaces. Here, we confirm the interface that mediates IgMPZ tetramerization, but find that dimerization is mediated by a distinct interface that has yet to be identified. We next correlated different types of Charcot-Marie-Tooth disease symptoms to subregions within IgMPZ tetramers. Variants causing axonal late-onset disease (CMT2I/J) map to surface residues of IgMPZ proximal to the transmembrane domain. Variants causing early-onset demyelinating disease (CMT1B) segregate into two groups: one is described by variants that disrupt the stability of the Ig-fold itself and are largely located within the core of the IgMPZ domain, whereas another describes a region on the surface of IgMPZ tetramers, accessible to protein interactions. Computational docking studies predict that this latter disease-relevant subregion may potentially mediate dimerization of IgMPZ tetramers.
Affiliation(s)
- Christopher P Ptak
- Biomolecular Nuclear Magnetic Resonance Facility, University of Iowa Carver College of Medicine, Iowa City, IA 52242, USA
- Tabitha A Peterson
- Department of Molecular Physiology and Biophysics, University of Iowa Carver College of Medicine, Iowa City, IA 52242, USA
- Jesse B Hopkins
- BioCAT, Department of Physics, Illinois Institute of Technology, Chicago, IL 60616, USA
- Christopher A Ahern
- Department of Molecular Physiology and Biophysics, University of Iowa Carver College of Medicine, Iowa City, IA 52242, USA
- Michael E Shy
- Department of Neurology, University of Iowa Carver College of Medicine, Iowa City, IA 52242, USA
- Robert C Piper
- Department of Molecular Physiology and Biophysics, University of Iowa Carver College of Medicine, Iowa City, IA 52242, USA
11
Zhang J, Liu S, Chen M, Chu H, Wang M, Wang Z, Yu J, Ni N, Yu F, Chen D, Yang YI, Xue B, Yang L, Liu Y, Gao YQ. Unsupervisedly Prompting AlphaFold2 for Accurate Few-Shot Protein Structure Prediction. J Chem Theory Comput 2023; 19:8460-8471. [PMID: 37947474 DOI: 10.1021/acs.jctc.3c00528] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2023]
Abstract
Data-driven predictive methods that can efficiently and accurately transform protein sequences into biologically active structures are highly valuable for scientific research and medical development. Determining an accurate folding landscape using coevolutionary information is fundamental to the success of modern protein structure prediction methods. As the state of the art, AlphaFold2 has dramatically raised the accuracy without performing explicit coevolutionary analysis. Nevertheless, its performance still shows strong dependence on available sequence homologues. Based on an investigation of the cause of this dependence, we present EvoGen, a meta generative model, to remedy the underperformance of AlphaFold2 for poor MSA targets. By prompting the model with calibrated or virtually generated homologue sequences, EvoGen helps AlphaFold2 fold accurately in the low-data regime and even achieve encouraging performance with single-sequence predictions. Being able to make accurate predictions with few-shot MSA not only generalizes AlphaFold2 better for orphan sequences but also democratizes its use for high-throughput applications. In addition, EvoGen combined with AlphaFold2 yields a probabilistic structure generation method that could explore alternative conformations of protein sequences, and the task-aware differentiable algorithm for sequence generation will benefit other related tasks including protein design.
Affiliation(s)
- Jun Zhang
- Changping Laboratory, Beijing 102200, China
- Sirui Liu
- Changping Laboratory, Beijing 102200, China
- Mengyun Chen
- Huawei Hangzhou Research Institute, Huawei Technologies Co. Ltd., Hangzhou 310051, China
- Haotian Chu
- Huawei Hangzhou Research Institute, Huawei Technologies Co. Ltd., Hangzhou 310051, China
- Min Wang
- Huawei Hangzhou Research Institute, Huawei Technologies Co. Ltd., Hangzhou 310051, China
- Zidong Wang
- Huawei Hangzhou Research Institute, Huawei Technologies Co. Ltd., Hangzhou 310051, China
- Jialiang Yu
- Huawei Hangzhou Research Institute, Huawei Technologies Co. Ltd., Hangzhou 310051, China
- Ningxi Ni
- Huawei Hangzhou Research Institute, Huawei Technologies Co. Ltd., Hangzhou 310051, China
- Fan Yu
- Huawei Hangzhou Research Institute, Huawei Technologies Co. Ltd., Hangzhou 310051, China
- Dechin Chen
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518055, China
- Yi Isaac Yang
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518055, China
- Boxin Xue
- Beijing National Laboratory for Molecular Sciences, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Lijiang Yang
- Beijing National Laboratory for Molecular Sciences, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Yuan Liu
- Department of Chemical Biology, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Yi Qin Gao
- Changping Laboratory, Beijing 102200, China
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518055, China
- Beijing National Laboratory for Molecular Sciences, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Biomedical Pioneering Innovation Center, Peking University, Beijing 100871, China
12
Kasapoglu AG, Ilhan E, Aydin M, Yigider E, Inal B, Buyuk I, Taspinar MS, Ciltas A, Agar G. Characterization of Two-Component System gene (TCS) in melatonin-treated common bean under salt and drought stress. Physiol Mol Biol Plants 2023; 29:1733-1754. [PMID: 38162914 PMCID: PMC10754802 DOI: 10.1007/s12298-023-01406-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2023] [Revised: 10/21/2023] [Accepted: 12/12/2023] [Indexed: 01/03/2024]
Abstract
The two-component system (TCS) generally consists of three elements, namely the histidine kinase (HK), response regulator (RR), and histidine phosphotransfer (HP) gene families. This study aimed to assess the expression of TCS genes in P. vulgaris leaf tissue under salt and drought stress and perform a genome-wide analysis of TCS gene family members using bioinformatics methods. This study identified 67 PvTCS genes, including 10 PvHP, 38 PvRR, and 19 PvHK, in the bean genome. PvHK2 had the most amino acids (1261), whilst PvHP8 had the fewest (87). In addition, their theoretical isoelectric points were between 4.56 (PvHP8) and 9.15 (PvPRR10). The majority of PvTCS genes are unstable. Phylogenetic analysis of TCS genes in A. thaliana, G. max, and bean found that PvTCS genes had close phylogenetic relationships with the genes of other plants. Segmental and tandem duplicate gene pairs were detected among the TCS genes, and the TCS genes have been subjected to purifying selection pressure in the evolutionary process. Furthermore, the TCS gene family, which has an important role in abiotic stress and hormonal responses in plants, was characterized for the first time in beans, and the expression of TCS genes in bean leaves under salt and drought stress was established using RNAseq and qRT-PCR analyses. The findings of this study will aid future functional and genomic studies by providing essential information about the members of the TCS gene family in beans. Supplementary Information: The online version contains supplementary material available at 10.1007/s12298-023-01406-5.
Affiliation(s)
- Ayse Gul Kasapoglu
- Department of Molecular Biology and Genetics, Faculty of Science, Erzurum Technical University, 25050 Erzurum, Turkey
- Emre Ilhan
- Department of Molecular Biology and Genetics, Faculty of Science, Erzurum Technical University, 25050 Erzurum, Turkey
- Murat Aydin
- Department of Agricultural Biotechnology, Faculty of Agriculture, Ataturk University, 25050 Erzurum, Turkey
- Esma Yigider
- Department of Agricultural Biotechnology, Faculty of Agriculture, Ataturk University, 25050 Erzurum, Turkey
- Behcet Inal
- Department of Agricultural Biotechnology, Faculty of Agriculture, Siirt University, 56100 Siirt, Turkey
- Ilker Buyuk
- Department of Biology, Faculty of Science, Ankara University, 06100 Ankara, Turkey
- Mahmut Sinan Taspinar
- Department of Agricultural Biotechnology, Faculty of Agriculture, Ataturk University, 25050 Erzurum, Turkey
- Abdulkadir Ciltas
- Department of Agricultural Biotechnology, Faculty of Agriculture, Ataturk University, 25050 Erzurum, Turkey
- Guleray Agar
- Department of Biology, Faculty of Science, Ataturk University, 25050 Erzurum, Turkey
13
Xu G, Luo Z, Zhou R, Wang Q, Ma J. OPUS-Fold3: a gradient-based protein all-atom folding and docking framework on TensorFlow. Brief Bioinform 2023; 24:bbad365. [PMID: 37833840 DOI: 10.1093/bib/bbad365] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2023] [Revised: 08/29/2023] [Accepted: 09/20/2023] [Indexed: 10/15/2023] Open
Abstract
For refining and designing protein structures, it is essential to have an efficient protein folding and docking framework that generates a protein 3D structure based on given constraints. In this study, we introduce OPUS-Fold3 as a gradient-based, all-atom protein folding and docking framework, which accurately generates 3D protein structures in compliance with specified constraints, such as a potential function as long as it can be expressed as a function of positions of heavy atoms. Our tests show that, for example, OPUS-Fold3 achieves performance comparable to pyRosetta in backbone folding and significantly better in side-chain modeling. Developed using Python and TensorFlow 2.4, OPUS-Fold3 is user-friendly for any source-code level modifications and can be seamlessly combined with other deep learning models, thus facilitating collaboration between the biology and AI communities. The source code of OPUS-Fold3 can be downloaded from http://github.com/OPUS-MaLab/opus_fold3. It is freely available for academic usage.
Affiliation(s)
- Gang Xu
- Multiscale Research Institute of Complex Systems, Fudan University, Shanghai, 200433, China
- Zhangjiang Fudan International Innovation Center, Fudan University, Shanghai, 201210, China
- Shanghai AI Laboratory, Shanghai, 200030, China
- Zhenwei Luo
- Multiscale Research Institute of Complex Systems, Fudan University, Shanghai, 200433, China
- Zhangjiang Fudan International Innovation Center, Fudan University, Shanghai, 201210, China
- Shanghai AI Laboratory, Shanghai, 200030, China
- Ruhong Zhou
- Institute of Quantitative Biology, College of Life Sciences, Zhejiang University, Hangzhou 310058, China
- Shanghai Institute for Advanced Study, Zhejiang University, Shanghai, 201203, China
- Qinghua Wang
- Center for Biomolecular Innovation, Harcam Biomedicines, Shanghai, 200131, China
- Jianpeng Ma
- Multiscale Research Institute of Complex Systems, Fudan University, Shanghai, 200433, China
- Zhangjiang Fudan International Innovation Center, Fudan University, Shanghai, 201210, China
- Shanghai AI Laboratory, Shanghai, 200030, China
- Shanghai Institute for Advanced Study, Zhejiang University, Shanghai, 201203, China
14
Huang X, Sun Y, Osawa Y, Chen YE, Zhang H. Computational redesign of cytochrome P450 CYP102A1 for highly stereoselective omeprazole hydroxylation by UniDesign. J Biol Chem 2023; 299:105050. [PMID: 37451479 PMCID: PMC10413352 DOI: 10.1016/j.jbc.2023.105050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2023] [Revised: 07/03/2023] [Accepted: 07/09/2023] [Indexed: 07/18/2023] Open
Abstract
Cytochrome P450 CYP102A1 is a prototypic biocatalyst that has great potential in chemical synthesis, drug discovery, and biotechnology. CYP102A1 variants engineered by directed evolution and/or rational design are capable of catalyzing the oxidation of a wide range of organic compounds. However, it is difficult to foresee the outcome of engineering CYP102A1 for a compound of interest. Here, we introduce UniDesign as a computational framework for enzyme design and engineering. We tested UniDesign by redesigning CYP102A1 for stereoselective metabolism of omeprazole (OMP), a proton pump inhibitor, starting from an active but nonstereoselective triple mutant (TM: A82F/F87V/L188Q). To shift stereoselectivity toward (R)-OMP, we computationally scanned three active site positions (75, 264, and 328) for mutations that would stabilize the binding of the transition state of (R)-OMP while destabilizing that of (S)-OMP and picked three variants, namely UD1 (TM/L75I), UD2 (TM/A264G), and UD3 (TM/A328V), for experimentation, based on computed energy scores and models. UD1, UD2, and UD3 exhibit high turnover rates of 55 ± 4.7, 84 ± 4.8, and 79 ± 5.7 min⁻¹, respectively, for (R)-OMP hydroxylation, whereas the corresponding rates for (S)-OMP are only 2.2 ± 0.19, 6.0 ± 0.68, and 14 ± 2.8 min⁻¹, yielding enantiomeric excess values of 92%, 87%, and 70%, respectively. These results suggest the critical roles of L75I, A264G, and A328V in steering OMP in the optimal orientation for stereoselective oxidation and demonstrate the utility of UniDesign for engineering CYP102A1 to produce drug metabolites of interest. The results are discussed in the context of protein structures.
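The enantiomeric excess values quoted in this abstract can be reproduced from the reported turnover rates. The sketch below is only a reader's arithmetic check, not the authors' analysis: it assumes that the ee is computed directly from the mean (R)- and (S)-OMP turnover rates as (R − S)/(R + S), and it ignores the reported uncertainties. The function and variable names are illustrative.

```python
# Mean turnover rates (min^-1) quoted in the abstract, as (R)-OMP vs (S)-OMP.
rates = {
    "UD1": (55.0, 2.2),
    "UD2": (84.0, 6.0),
    "UD3": (79.0, 14.0),
}


def enantiomeric_excess(rate_r: float, rate_s: float) -> float:
    """Percent ee, assuming product ratios follow the two turnover rates."""
    return 100.0 * (rate_r - rate_s) / (rate_r + rate_s)


# Compute ee for each variant from its pair of rates.
ee_percent = {name: enantiomeric_excess(r, s) for name, (r, s) in rates.items()}

for name, ee in ee_percent.items():
    print(f"{name}: ee = {ee:.0f}%")
```

Under that assumption the rounded values agree with the 92%, 87%, and 70% stated in the abstract.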
Affiliation(s)
- Xiaoqiang Huang
- Department of Internal Medicine, University of Michigan, Ann Arbor, Michigan, USA
- Yudong Sun
- Department of Pharmacology, University of Michigan, Ann Arbor, Michigan, USA
- Yoichi Osawa
- Department of Pharmacology, University of Michigan, Ann Arbor, Michigan, USA
- Y Eugene Chen
- Department of Internal Medicine, University of Michigan, Ann Arbor, Michigan, USA
- Haoming Zhang
- Department of Pharmacology, University of Michigan, Ann Arbor, Michigan, USA
15
Wang G, Liu X, Wang K, Gao Y, Li G, Baptista-Hon DT, Yang XH, Xue K, Tai WH, Jiang Z, Cheng L, Fok M, Lau JYN, Yang S, Lu L, Zhang P, Zhang K. Deep-learning-enabled protein-protein interaction analysis for prediction of SARS-CoV-2 infectivity and variant evolution. Nat Med 2023; 29:2007-2018. [PMID: 37524952 DOI: 10.1038/s41591-023-02483-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 09/13/2022] [Accepted: 06/28/2023] [Indexed: 08/02/2023]
Abstract
Host-pathogen interactions and pathogen evolution are underpinned by protein-protein interactions between viral and host proteins. An understanding of how viral variants affect protein-protein binding is important for predicting viral-host interactions, such as the emergence of new pathogenic SARS-CoV-2 variants. Here we propose an artificial intelligence-based framework called UniBind, in which proteins are represented as a graph at the residue and atom levels. UniBind integrates protein three-dimensional structure and binding affinity and is capable of multi-task learning for heterogeneous biological data integration. In systematic tests on benchmark datasets and further experimental validation, UniBind effectively and scalably predicted the effects of SARS-CoV-2 spike protein variants on their binding affinities to the human ACE2 receptor, as well as to SARS-CoV-2 neutralizing monoclonal antibodies. Furthermore, in a cross-species analysis, UniBind could be applied to predict host susceptibility to SARS-CoV-2 variants and to predict future viral variant evolutionary trends. This in silico approach has the potential to serve as an early warning system for problematic emerging SARS-CoV-2 variants, as well as to facilitate research on protein-protein interactions in general.
Affiliation(s)
- Guangyu Wang
- State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China
- Xiaohong Liu
- Institute for Artificial Intelligence in Medicine and Faculty of Medicine, Macau University of Science and Technology, Macau, China
- UCL Cancer Institute, University College London, London, UK
- Kai Wang
- Department of Big Data and Biomedical Artificial Intelligence, National Biomedical Imaging Center, College of Future Technology, Peking University and Peking-Tsinghua Center for Life Sciences, Beijing, China
- Yuanxu Gao
- Guangzhou National Laboratory, Guangzhou, China
- Gen Li
- Guangzhou National Laboratory, Guangzhou, China
- Guangzhou Women and Children's Medical Center, Guangzhou, China
- Daniel T Baptista-Hon
- Institute for Artificial Intelligence in Medicine and Faculty of Medicine, Macau University of Science and Technology, Macau, China
- Zhuhai International Eye Center and Provincial Key Laboratory of Tumor Interventional Diagnosis and Treatment, Zhuhai People's Hospital and the First Affiliated Hospital of Faculty of Medicine, Macau University of Science and Technology, Guangdong, China
- Xiaohong Helena Yang
- Institute for Artificial Intelligence in Medicine and Faculty of Medicine, Macau University of Science and Technology, Macau, China
- Kanmin Xue
- Nuffield Laboratory of Ophthalmology, Department of Clinical Neurosciences, University of Oxford, Oxford, UK
- Wa Hou Tai
- Institute for Artificial Intelligence in Medicine and Faculty of Medicine, Macau University of Science and Technology, Macau, China
- Zeyu Jiang
- State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China
- Linling Cheng
- Institute for Artificial Intelligence in Medicine and Faculty of Medicine, Macau University of Science and Technology, Macau, China
- Zhuhai International Eye Center and Provincial Key Laboratory of Tumor Interventional Diagnosis and Treatment, Zhuhai People's Hospital and the First Affiliated Hospital of Faculty of Medicine, Macau University of Science and Technology, Guangdong, China
- Manson Fok
- Institute for Artificial Intelligence in Medicine and Faculty of Medicine, Macau University of Science and Technology, Macau, China
- Johnson Yiu-Nam Lau
- Departments of Biology and Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR, China
- Shengyong Yang
- State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, Chengdu, China
- Ligong Lu
- Institute for Artificial Intelligence in Medicine and Faculty of Medicine, Macau University of Science and Technology, Macau, China
- Zhuhai International Eye Center and Provincial Key Laboratory of Tumor Interventional Diagnosis and Treatment, Zhuhai People's Hospital and the First Affiliated Hospital of Faculty of Medicine, Macau University of Science and Technology, Guangdong, China
- Ping Zhang
- State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China
- Kang Zhang
- Institute for Artificial Intelligence in Medicine and Faculty of Medicine, Macau University of Science and Technology, Macau, China
- Department of Big Data and Biomedical Artificial Intelligence, National Biomedical Imaging Center, College of Future Technology, Peking University and Peking-Tsinghua Center for Life Sciences, Beijing, China
- Guangzhou National Laboratory, Guangzhou, China
- Zhuhai International Eye Center and Provincial Key Laboratory of Tumor Interventional Diagnosis and Treatment, Zhuhai People's Hospital and the First Affiliated Hospital of Faculty of Medicine, Macau University of Science and Technology, Guangdong, China
16
Huang X, Zhou J, Yang D, Zhang J, Xia X, Chen YE, Xu J. Decoding CRISPR-Cas PAM recognition with UniDesign. Brief Bioinform 2023; 24:bbad133. [PMID: 37078688 PMCID: PMC10199764 DOI: 10.1093/bib/bbad133] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2022] [Revised: 02/09/2023] [Accepted: 03/16/2023] [Indexed: 04/21/2023] Open
Abstract
The critical first step in Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)-associated (CRISPR-Cas) protein-mediated gene editing is recognizing a preferred protospacer adjacent motif (PAM) on target DNAs by the protein's PAM-interacting amino acids (PIAAs). Thus, accurate computational modeling of PAM recognition is useful in assisting CRISPR-Cas engineering to relax or tighten PAM requirements for subsequent applications. Here, we describe a universal computational protein design framework (UniDesign) for designing protein-nucleic acid interactions. As a proof of concept, we applied UniDesign to decode the PAM-PIAA interactions for eight Cas9 and two Cas12a proteins. We show that, given native PIAAs, the UniDesign-predicted PAMs are largely identical to the natural PAMs of all Cas proteins. In turn, given natural PAMs, the computationally redesigned PIAA residues largely recapitulated the native PIAAs (74% and 86% in terms of identity and similarity, respectively). These results demonstrate that UniDesign faithfully captures the mutual preference between natural PAMs and native PIAAs, suggesting it is a useful tool for engineering CRISPR-Cas and other nucleic acid-interacting proteins. UniDesign is open-sourced at https://github.com/tommyhuangthu/UniDesign.
Affiliation(s)
- Xiaoqiang Huang
- Center for Advanced Models for Translational Sciences and Therapeutics, Department of Internal Medicine, University of Michigan Medical School, 2800 Plymouth Road, Ann Arbor, MI 48109, USA
- Jun Zhou
- Center for Advanced Models for Translational Sciences and Therapeutics, Department of Internal Medicine, University of Michigan Medical School, 2800 Plymouth Road, Ann Arbor, MI 48109, USA
- Dongshan Yang
- Center for Advanced Models for Translational Sciences and Therapeutics, Department of Internal Medicine, University of Michigan Medical School, 2800 Plymouth Road, Ann Arbor, MI 48109, USA
- Jifeng Zhang
- Center for Advanced Models for Translational Sciences and Therapeutics, Department of Internal Medicine, University of Michigan Medical School, 2800 Plymouth Road, Ann Arbor, MI 48109, USA
- Xiaofeng Xia
- Research & Development, ATGC Inc., 100 E Lancaster Avenue, LIMR Building Lab 129, Wynnewood, PA 19096, USA
- Yuqing Eugene Chen
- Center for Advanced Models for Translational Sciences and Therapeutics, Department of Internal Medicine, University of Michigan Medical School, 2800 Plymouth Road, Ann Arbor, MI 48109, USA
- Jie Xu
- Center for Advanced Models for Translational Sciences and Therapeutics, Department of Internal Medicine, University of Michigan Medical School, 2800 Plymouth Road, Ann Arbor, MI 48109, USA
17
Braunstein EM, Imada E, Pasca S, Wang S, Chen H, Alba C, Hupalo DN, Wilkerson M, Dalgard CL, Ghannam J, Liu Y, Marchionni L, Moliterno A, Hourigan CS, Gondek LP. Recurrent germline variant in ATM associated with familial myeloproliferative neoplasms. Leukemia 2023; 37:627-635. [PMID: 36543879 DOI: 10.1038/s41375-022-01797-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2022] [Revised: 12/07/2022] [Accepted: 12/12/2022] [Indexed: 12/24/2022]
Abstract
Genetic predisposition (familial risk) in the myeloproliferative neoplasms (MPNs) is more common than the risk observed in most other cancers, including breast, prostate, and colon. Up to 10% of MPNs are considered to be familial. Recent genome-wide association studies have identified genomic loci associated with an MPN diagnosis. However, the identification of variants with functional contributions to the development of MPN remains limited. In this study, we have included 630 MPN patients and whole genome sequencing was performed in 64 individuals with familial MPN to uncover recurrent germline predisposition variants. Both targeted and unbiased filtering of single nucleotide variants (SNVs) was performed, with a comparison to 218 individuals with MPN unselected for familial status. This approach identified an ATM L2307F SNV occurring in nearly 8% of individuals with familial MPN. Structural protein modeling of this variant suggested stabilization of inactive ATM dimer, and alteration of the endogenous ATM locus in a human myeloid cell line resulted in decreased phosphorylation of the downstream tumor suppressor CHEK2. These results implicate ATM, and the DNA-damage response pathway, in predisposition to MPN.
Affiliation(s)
- Evan M Braunstein
- Division of Hematological Malignancies, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University, Baltimore, MD, USA
- Division of Hematology, Department of Medicine, Johns Hopkins Hospital, Baltimore, MD, USA
- Eddie Imada
- Division of Computational and Systems Pathology, Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
- Sergiu Pasca
- Division of Hematological Malignancies, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University, Baltimore, MD, USA
- Shiyu Wang
- Division of Hematological Malignancies, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University, Baltimore, MD, USA
- Hang Chen
- Division of Hematology, Department of Medicine, Johns Hopkins Hospital, Baltimore, MD, USA
- Committee on Genetics, Genomics and Systems Biology, Biological Sciences Division, University of Chicago, Chicago, IL, USA
- Camille Alba
- Henry Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, USA
- The American Genome Center, Uniformed Services University of the Health Sciences, Bethesda, MD, USA
- Dan N Hupalo
- Henry Jackson Foundation for the Advancement of Military Medicine, Bethesda, MD, USA
- Matthew Wilkerson
- Department of Anatomy Physiology & Genetics, Uniformed Services University of the Health Sciences, Bethesda, MD, USA
- Clifton L Dalgard
- The American Genome Center, Uniformed Services University of the Health Sciences, Bethesda, MD, USA
- Department of Anatomy Physiology & Genetics, Uniformed Services University of the Health Sciences, Bethesda, MD, USA
- Jack Ghannam
- Laboratory of Myeloid Malignancies, National Heart, Lung and Blood Institute, National Institutes of Health, Bethesda, MD, USA
- Yujia Liu
- Department of Biochemistry and Molecular Biology, Biological Sciences Division, University of Chicago, Chicago, IL, USA
- Luigi Marchionni
- Division of Computational and Systems Pathology, Department of Pathology and Laboratory Medicine, Weill Cornell Medicine, New York, NY, USA
- Alison Moliterno
- Division of Hematology, Department of Medicine, Johns Hopkins Hospital, Baltimore, MD, USA
- Christopher S Hourigan
- Laboratory of Myeloid Malignancies, National Heart, Lung and Blood Institute, National Institutes of Health, Bethesda, MD, USA
- Lukasz P Gondek
- Division of Hematological Malignancies, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University, Baltimore, MD, USA
18
De novo protein fold design through sequence-independent fragment assembly simulations. Proc Natl Acad Sci U S A 2023; 120:e2208275120. [PMID: 36656852 PMCID: PMC9942881 DOI: 10.1073/pnas.2208275120] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Indexed: 01/20/2023] Open
Abstract
De novo protein design generally consists of two steps, including structure and sequence design. Many protein design studies have focused on sequence design with scaffolds adapted from native structures in the PDB, which renders novel areas of protein structure and function space unexplored. We developed FoldDesign to create novel protein folds from specific secondary structure (SS) assignments through sequence-independent replica-exchange Monte Carlo (REMC) simulations. The method was tested on 354 non-redundant topologies, where FoldDesign consistently created stable structural folds, while recapitulating on average 87.7% of the SS elements. Meanwhile, the FoldDesign scaffolds had well-formed structures with buried residues and solvent-exposed areas closely matching their native counterparts. Despite the high fidelity to the input SS restraints and local structural characteristics of native proteins, a large portion of the designed scaffolds possessed global folds completely different from natural proteins in the PDB, highlighting the ability of FoldDesign to explore novel areas of protein fold space. Detailed data analyses revealed that the major contributions to the successful structure design lay in the optimal energy force field, which contains a balanced set of SS packing terms, and REMC simulations, which were coupled with multiple auxiliary movements to efficiently search the conformational space. Additionally, the ability to recognize and assemble uncommon super-SS geometries, rather than the unique arrangement of common SS motifs, was the key to generating novel folds. These results demonstrate a strong potential to explore both structural and functional spaces through computational design simulations that natural proteins have not reached through evolution.
19
Hernández IM, Dehouck Y, Bastolla U, López-Blanco JR, Chacón P. Predicting protein stability changes upon mutation using a simple orientational potential. Bioinformatics 2023; 39:6984713. [PMID: 36629451 PMCID: PMC9850275 DOI: 10.1093/bioinformatics/btad011] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 06/26/2022] [Revised: 11/17/2022] [Accepted: 01/10/2023] [Indexed: 01/12/2023] Open
Abstract
MOTIVATION: Structure-based stability prediction upon mutation is crucial for protein engineering and design, and for understanding genetic diseases or drug resistance events. For this task, we adopted a simple residue-based orientational potential that considers only three backbone atoms, previously applied in protein modeling. Its application to stability prediction only requires parametrizing 12 amino acid-dependent weights using cross-validation strategies on a curated dataset in which we tried to reduce the number of mutations that belong to protein-protein or protein-ligand interfaces, extreme conditions, and the alanine over-representation. RESULTS: Our method, called KORPM, accurately predicts mutational effects on an independent benchmark dataset, whether the wild-type or mutated structure is used as starting point. Compared with state-of-the-art methods on this balanced dataset, our approach obtained the lowest root mean square error (RMSE) and the highest correlation between predicted and experimental ΔΔG measures, as well as better receiver operating characteristics and precision-recall curves. Our method is almost anti-symmetric by construction, and it thus performs similarly for the direct and reverse mutations with the corresponding wild-type and mutated structures. Despite the strong limitations of the available experimental mutation data in terms of size, variability, and heterogeneity, we show competitive results with a simple sum of energy terms, which is more efficient and less prone to overfitting. AVAILABILITY AND IMPLEMENTATION: https://github.com/chaconlab/korpm. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Iván Martín Hernández
- Department of Biological Physical Chemistry, Rocasolano Institute of Physical Chemistry, CSIC, 28006 Madrid, Spain
- Yves Dehouck
- Bioinformatic Unit, Centro de Biología Molecular “Severo Ochoa,” CSIC-UAM Cantoblanco, Madrid 28049, Spain
- Ugo Bastolla
- Bioinformatic Unit, Centro de Biología Molecular “Severo Ochoa,” CSIC-UAM Cantoblanco, Madrid 28049, Spain
- José Ramón López-Blanco
- Department of Biological Physical Chemistry, Rocasolano Institute of Physical Chemistry, CSIC, 28006 Madrid, Spain
20
Thepsuwan P, Bhattacharya A, Song Z, Hippleheuser S, Feng S, Wei X, Das NK, Sierra M, Wei J, Fang D, Huang YMM, Zhang K, Shah YM, Sun S. Hepatic SEL1L-HRD1 ER-associated degradation regulates systemic iron homeostasis via ceruloplasmin. Proc Natl Acad Sci U S A 2023; 120:e2212644120. [PMID: 36595688 PMCID: PMC9926173 DOI: 10.1073/pnas.2212644120] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 07/22/2022] [Accepted: 11/18/2022] [Indexed: 01/05/2023] Open
Abstract
Iron homeostasis is critical for cellular and organismal function and is tightly regulated to prevent toxicity or anemia due to iron excess or deficiency, respectively. However, subcellular regulatory mechanisms of iron remain largely unexplored. Here, we report that the SEL1L-HRD1 protein complex of endoplasmic reticulum (ER)-associated degradation (ERAD) in hepatocytes controls systemic iron homeostasis in a ceruloplasmin (CP)-dependent, and ER stress-independent, manner. Mice with hepatocyte-specific Sel1L deficiency exhibit altered basal iron homeostasis and are sensitized to iron deficiency while resistant to iron overload. Proteomics screening for a factor linking ERAD deficiency to altered iron homeostasis identifies CP, a key ferroxidase involved in systemic iron distribution by catalyzing iron oxidation and efflux from tissues. Indeed, CP is highly unstable and a bona fide substrate of SEL1L-HRD1 ERAD. In the absence of ERAD, CP protein accumulates in the ER and is shunted to refolding, leading to elevated secretion. Underscoring the clinical relevance of these findings, SEL1L-HRD1 ERAD is responsible for the degradation of a subset of disease-causing CP mutants, thereby attenuating their pathogenicity. Together, this study uncovers the role of SEL1L-HRD1 ERAD in systemic iron homeostasis and provides insights into protein misfolding-associated proteotoxicity.
Affiliation(s)
- Pattaraporn Thepsuwan
- Center for Molecular Medicine and Genetics, Wayne State University School of Medicine, Detroit, MI 48201
| | - Asmita Bhattacharya
- Department of Molecular & Integrative Physiology, University of Michigan Medical School, Ann Arbor, MI 48105
| | - Zhenfeng Song
- Center for Molecular Medicine and Genetics, Wayne State University School of Medicine, Detroit, MI 48201
| | - Stephen Hippleheuser
- Center for Molecular Medicine and Genetics, Wayne State University School of Medicine, Detroit, MI 48201
| | - Shaobin Feng
- Center for Molecular Medicine and Genetics, Wayne State University School of Medicine, Detroit, MI 48201
| | - Xiaoqiong Wei
- Department of Molecular & Integrative Physiology, University of Michigan Medical School, Ann Arbor, MI 48105
| | - Nupur K. Das
- Department of Molecular & Integrative Physiology, University of Michigan Medical School, Ann Arbor, MI 48105
| | - Mariana Sierra
- Department of Physics and Astronomy, Wayne State University, Detroit, MI 48201
| | - Juncheng Wei
- Department of Pathology, Northwestern University Feinberg School of Medicine, Chicago, IL 60611
| | - Deyu Fang
- Department of Pathology, Northwestern University Feinberg School of Medicine, Chicago, IL 60611
| | - Yu-ming M. Huang
- Department of Physics and Astronomy, Wayne State University, Detroit, MI 48201
| | - Kezhong Zhang
- Center for Molecular Medicine and Genetics, Wayne State University School of Medicine, Detroit, MI 48201
- Department of Biochemistry, Microbiology and Immunology, Wayne State University School of Medicine, Detroit, MI 48201
| | - Yatrik M. Shah
- Department of Molecular & Integrative Physiology, University of Michigan Medical School, Ann Arbor, MI 48105
- Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, MI 48109
| | - Shengyi Sun
- Center for Molecular Medicine and Genetics, Wayne State University School of Medicine, Detroit, MI 48201
- Department of Biochemistry, Microbiology and Immunology, Wayne State University School of Medicine, Detroit, MI 48201
| |
|
21
|
Castorina LV, Petrenas R, Subr K, Wood CW. PDBench: evaluating computational methods for protein-sequence design. Bioinformatics 2023; 39:btad027. [PMID: 36637198 PMCID: PMC9869650 DOI: 10.1093/bioinformatics/btad027] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 08/30/2022] [Revised: 11/14/2022] [Accepted: 01/12/2023] [Indexed: 01/14/2023] Open
Abstract
SUMMARY Ever increasing amounts of protein structure data, combined with advances in machine learning, have led to the rapid proliferation of methods available for protein-sequence design. In order to utilize a design method effectively, it is important to understand the nuances of its performance and how it varies by design target. Here, we present PDBench, a set of proteins and a number of standard tests for assessing the performance of sequence-design methods. PDBench aims to maximize the structural diversity of the benchmark, compared with previous benchmarking sets, in order to provide useful biological insight into the behaviour of sequence-design methods, which is essential for evaluating their performance and practical utility. We believe that these tools are useful for guiding the development of novel sequence design algorithms and will enable users to choose a method that best suits their design target. AVAILABILITY AND IMPLEMENTATION https://github.com/wells-wood-research/PDBench. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Leonardo V Castorina
- School of Informatics, University of Edinburgh, 10 Crichton Street, Newington, Edinburgh EH8 9AB, UK
| | - Rokas Petrenas
- School of Biological Sciences, University of Edinburgh, Roger Land Building, Edinburgh EH9 3FF, UK
| | - Kartic Subr
- School of Informatics, University of Edinburgh, 10 Crichton Street, Newington, Edinburgh EH8 9AB, UK
| | - Christopher W Wood
- School of Biological Sciences, University of Edinburgh, Roger Land Building, Edinburgh EH9 3FF, UK
| |
|
22
|
Liu H, Chen Q. Computational protein design with data‐driven approaches: Recent developments and perspectives. WIRES COMPUTATIONAL MOLECULAR SCIENCE 2022. [DOI: 10.1002/wcms.1646] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Affiliation(s)
- Haiyan Liu
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
- Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, Anhui, China
- School of Data Science, University of Science and Technology of China, Hefei, Anhui, China
| | - Quan Chen
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
- Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, Anhui, China
| |
|
23
|
Chowdhury R, Bouatta N, Biswas S, Floristean C, Kharkar A, Roy K, Rochereau C, Ahdritz G, Zhang J, Church GM, Sorger PK, AlQuraishi M. Single-sequence protein structure prediction using a language model and deep learning. Nat Biotechnol 2022; 40:1617-1623. [PMID: 36192636 PMCID: PMC10440047 DOI: 10.1038/s41587-022-01432-w] [Citation(s) in RCA: 120] [Impact Index Per Article: 60.0] [Received: 09/28/2021] [Accepted: 07/15/2022] [Indexed: 12/30/2022]
Abstract
AlphaFold2 and related computational systems predict protein structure using deep learning and co-evolutionary relationships encoded in multiple sequence alignments (MSAs). Despite high prediction accuracy achieved by these systems, challenges remain in (1) prediction of orphan and rapidly evolving proteins for which an MSA cannot be generated; (2) rapid exploration of designed structures; and (3) understanding the rules governing spontaneous polypeptide folding in solution. Here we report development of an end-to-end differentiable recurrent geometric network (RGN) that uses a protein language model (AminoBERT) to learn latent structural information from unaligned proteins. A linked geometric module compactly represents Cα backbone geometry in a translationally and rotationally invariant way. On average, RGN2 outperforms AlphaFold2 and RoseTTAFold on orphan proteins and classes of designed proteins while achieving up to a 10⁶-fold reduction in compute time. These findings demonstrate the practical and theoretical strengths of protein language models relative to MSAs in structure prediction.
Affiliation(s)
- Ratul Chowdhury
- Laboratory of Systems Pharmacology, Program in Therapeutic Science, Harvard Medical School, Boston, MA, USA
| | - Nazim Bouatta
- Laboratory of Systems Pharmacology, Program in Therapeutic Science, Harvard Medical School, Boston, MA, USA.
| | - Surojit Biswas
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Nabla Bio, Inc., Boston, MA, USA
| | | | - Anant Kharkar
- Department of Computer Science, Columbia University, New York, NY, USA
| | - Koushik Roy
- Department of Computer Science, Columbia University, New York, NY, USA
| | - Charlotte Rochereau
- Integrated Program in Cellular, Molecular, and Biomedical Studies, Columbia University, New York, NY, USA
| | - Gustaf Ahdritz
- Department of Systems Biology, Columbia University, New York, NY, USA
| | - Joanna Zhang
- Department of Computer Science, Columbia University, New York, NY, USA
| | - George M Church
- Laboratory of Systems Pharmacology, Program in Therapeutic Science, Harvard Medical School, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Peter K Sorger
- Laboratory of Systems Pharmacology, Program in Therapeutic Science, Harvard Medical School, Boston, MA, USA.
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA.
| | - Mohammed AlQuraishi
- Department of Computer Science, Columbia University, New York, NY, USA.
- Department of Systems Biology, Columbia University, New York, NY, USA.
| |
|
24
|
|
25
|
Woodard J, Iqbal S, Mashaghi A. Circuit topology predicts pathogenicity of missense mutations. Proteins 2022; 90:1634-1644. [PMID: 35394672 PMCID: PMC9543832 DOI: 10.1002/prot.26342] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 12/15/2021] [Revised: 03/07/2022] [Accepted: 03/30/2022] [Indexed: 12/05/2022]
Abstract
The contact topology of a protein determines important aspects of the folding process. The topological measure of contact order has been shown to be predictive of the rate of folding. Circuit topology is emerging as another fundamental descriptor of biomolecular structure, with predicted effects on the folding rate. We analyze the residue‐based circuit topological environments of 21 K mutations labeled as pathogenic or benign. Multiple statistical lines of reasoning support the conclusion that the number of contacts in two specific circuit topological arrangements, namely inverse parallel and cross relations, with contacts involving the mutated residue have discriminatory value in determining the pathogenicity of human variants. We investigate how results vary with residue type and according to whether the gene is essential. We further explore the relationship to a number of structural features and find that circuit topology provides nonredundant information on protein structures and pathogenicity of mutations. Results may have implications for the polymer physics of protein folding and suggest that “local” topological information, including residue‐based circuit topology and residue contact order, could be useful in improving state‐of‐the‐art machine learning algorithms for pathogenicity prediction.
Affiliation(s)
- Jaie Woodard
- Medical Systems Biophysics and Bioengineering, Leiden Academic Centre for Drug Research, Faculty of Science, Leiden University, Leiden, The Netherlands
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA
| | - Sumaiya Iqbal
- Center for the Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, Massachusetts, USA
| | - Alireza Mashaghi
- Medical Systems Biophysics and Bioengineering, Leiden Academic Centre for Drug Research, Faculty of Science, Leiden University, Leiden, The Netherlands
- Centre for Interdisciplinary Genome Research, Faculty of Science, Leiden University, Leiden, The Netherlands
| |
|
26
|
Liu Y, Zhang L, Wang W, Zhu M, Wang C, Li F, Zhang J, Li H, Chen Q, Liu H. Rotamer-free protein sequence design based on deep learning and self-consistency. NATURE COMPUTATIONAL SCIENCE 2022; 2:451-462. [PMID: 38177863 DOI: 10.1038/s43588-022-00273-6] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 12/27/2021] [Accepted: 06/07/2022] [Indexed: 01/06/2024]
Abstract
Several previously proposed deep learning methods to design amino acid sequences that autonomously fold into a given protein backbone yielded promising results in computational tests but did not outperform conventional energy function-based methods in wet experiments. Here we present the ABACUS-R method, which uses an encoder-decoder network trained using a multitask learning strategy to predict the sidechain type of a central residue from its three-dimensional local environment, which includes, besides other features, the types but not the conformations of the surrounding sidechains. This eliminates the need to reconstruct and optimize sidechain structures, and drastically simplifies the sequence design process. Thus iteratively applying the encoder-decoder to different central residues is able to produce self-consistent overall sequences for a target backbone. Results of wet experiments, including five structures solved by X-ray crystallography, show that ABACUS-R outperforms state-of-the-art energy function-based methods in success rate and design precision.
Affiliation(s)
- Yufeng Liu
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
| | - Lu Zhang
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
| | - Weilun Wang
- CAS Key Laboratory of GIPAS, School of Information Science and Technology, Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei, Anhui, China
| | - Min Zhu
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
| | - Chenchen Wang
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
| | - Fudong Li
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
- Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, Anhui, China
| | - Jiahai Zhang
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China
- Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, Anhui, China
| | - Houqiang Li
- CAS Key Laboratory of GIPAS, School of Information Science and Technology, Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei, Anhui, China.
| | - Quan Chen
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China.
- Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, Anhui, China.
| | - Haiyan Liu
- MOE Key Laboratory for Membraneless Organelles and Cellular Dynamics, School of Life Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, China.
- Biomedical Sciences and Health Laboratory of Anhui Province, University of Science and Technology of China, Hefei, Anhui, China.
- School of Data Science, University of Science and Technology of China, Hefei, Anhui, China.
| |
|
27
|
Li L, Gao M, Jiao P, Zu S, Deng YQ, Wan D, Cao Y, Duan J, Aliyari SR, Li J, Shi Y, Rao Z, Qin CF, Guo Y, Cheng G, Yang H. Antibody engineering improves neutralization activity against K417 spike mutant SARS-CoV-2 variants. Cell Biosci 2022; 12:63. [PMID: 35581593 PMCID: PMC9113379 DOI: 10.1186/s13578-022-00794-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2022] [Accepted: 04/18/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Neutralizing antibodies are approved drugs to treat coronavirus disease-2019 (COVID-19) patients, yet mutations in severe acute respiratory syndrome coronavirus (SARS-CoV-2) variants may reduce the antibody neutralizing activity. New monoclonal antibodies (mAbs) and antibody remolding strategies are recalled in the battle with COVID-19 epidemic. RESULTS We identified multiple mAbs from antibody phage display library made from COVID-19 patients and further characterized the R3P1-E4 clone, which effectively suppressed SARS-CoV-2 infection and rescued the lethal phenotype in mice infected with SARS-CoV-2. Crystal structural analysis not only explained why R3P1-E4 had selectively reduced binding and neutralizing activity to SARS-CoV-2 variants carrying K417 mutations, but also allowed us to engineer mutant antibodies with improved neutralizing activity against these variants. Thus, we screened out R3P1-E4 mAb which inhibits SARS-CoV-2 and related mutations in vitro and in vivo. Antibody engineering improved neutralizing activity of R3P1-E4 against K417 mutations. CONCLUSION Our studies have outlined a strategy to identify and engineer neutralizing antibodies against SARS-CoV-2 variants.
Affiliation(s)
- Lili Li
- Institute of Systems Medicine, Chinese Academy of Medical Science & Peking Union College, Beijing, 100005, China
- Suzhou Institute of Systems Medicine, Suzhou, 215123, China
| | - Meiling Gao
- Institute of Systems Medicine, Chinese Academy of Medical Science & Peking Union College, Beijing, 100005, China
- Suzhou Institute of Systems Medicine, Suzhou, 215123, China
| | - Peng Jiao
- State Key Laboratory of Medicinal Chemical Biology and College of Life Sciences, Nankai University, Tianjin, 300071, China
| | - Shulong Zu
- Institute of Systems Medicine, Chinese Academy of Medical Science & Peking Union College, Beijing, 100005, China
- Suzhou Institute of Systems Medicine, Suzhou, 215123, China
| | - Yong-Qiang Deng
- Department of Virology, State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, AMMS, Beijing, 100071, China
| | - Dingyi Wan
- AtaGenix Laboratories (Wuhan) Co., Ltd, Wuhan, 430075, China
| | - Yang Cao
- Center of Growth, Metabolism and Aging, Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu, 610065, China
| | - Jing Duan
- AtaGenix Laboratories (Wuhan) Co., Ltd, Wuhan, 430075, China
| | - Saba R Aliyari
- Department of Microbiology, Immunology & Molecular Genetics, University of California, Los Angeles, CA, 90095, USA
| | - Jie Li
- Department of Laboratory Medicine, Taihe Hospital, Hubei University of Medicine, Shiyan, 442000, China
| | - Yueyue Shi
- Institute of Systems Medicine, Chinese Academy of Medical Science & Peking Union College, Beijing, 100005, China
- Suzhou Institute of Systems Medicine, Suzhou, 215123, China
| | - Zihe Rao
- State Key Laboratory of Medicinal Chemical Biology and College of Life Sciences, Nankai University, Tianjin, 300071, China
- Guangzhou Laboratory, B1, Standard Property Unite 4, Guangzhou international bio-island, Guangzhou, 510320, China
| | - Cheng-Feng Qin
- Department of Virology, State Key Laboratory of Pathogen and Biosecurity, Beijing Institute of Microbiology and Epidemiology, AMMS, Beijing, 100071, China.
| | - Yu Guo
- State Key Laboratory of Medicinal Chemical Biology and College of Life Sciences, Nankai University, Tianjin, 300071, China.
| | - Genhong Cheng
- Department of Microbiology, Immunology & Molecular Genetics, University of California, Los Angeles, CA, 90095, USA
| | - Heng Yang
- Institute of Systems Medicine, Chinese Academy of Medical Science & Peking Union College, Beijing, 100005, China.
- Suzhou Institute of Systems Medicine, Suzhou, 215123, China.
| |
|
28
|
Dubois C, Lahfa M, Pissarra J, de Guillen K, Barthe P, Kroj T, Roumestand C, Padilla A. Combining High-Pressure NMR and Geometrical Sampling to Obtain a Full Topological Description of Protein Folding Landscapes: Application to the Folding of Two MAX Effectors from Magnaporthe oryzae. Int J Mol Sci 2022; 23:ijms23105461. [PMID: 35628267 PMCID: PMC9141691 DOI: 10.3390/ijms23105461] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/13/2022] [Revised: 05/10/2022] [Accepted: 05/12/2022] [Indexed: 11/16/2022] Open
Abstract
Despite advances in experimental and computational methods, the mechanisms by which an unstructured polypeptide chain regains its unique three-dimensional structure remains one of the main puzzling questions in biology. Single-molecule techniques, ultra-fast perturbation and detection approaches and improvement in all-atom and coarse-grained simulation methods have greatly deepened our understanding of protein folding and the effects of environmental factors on folding landscape. However, a major challenge remains the detailed characterization of the protein folding landscape. Here, we used high hydrostatic pressure 2D NMR spectroscopy to obtain high-resolution experimental structural information in a site-specific manner across the polypeptide sequence and along the folding reaction coordinate. We used this residue-specific information to constrain Cyana3 calculations, in order to obtain a topological description of the entire folding landscape. This approach was used to describe the conformers populating the folding landscape of two small globular proteins, AVR-Pia and AVR-Pib, that belong to the structurally conserved but sequence-unrelated MAX effectors superfamily. Comparing the two folding landscapes, we found that, in spite of their divergent sequences, the folding pathway of these two proteins involves a similar, inescapable, folding intermediate, even if, statistically, the routes used are different.
Affiliation(s)
- Cécile Dubois
- Centre de Biologie Structurale, University of Montpellier, INSERM U1054, CNRS UMR 5048, 34000 Montpellier, France
| | - Mounia Lahfa
- Centre de Biologie Structurale, University of Montpellier, INSERM U1054, CNRS UMR 5048, 34000 Montpellier, France
| | - Joana Pissarra
- Centre de Biologie Structurale, University of Montpellier, INSERM U1054, CNRS UMR 5048, 34000 Montpellier, France
| | - Karine de Guillen
- Centre de Biologie Structurale, University of Montpellier, INSERM U1054, CNRS UMR 5048, 34000 Montpellier, France
| | - Philippe Barthe
- Centre de Biologie Structurale, University of Montpellier, INSERM U1054, CNRS UMR 5048, 34000 Montpellier, France
| | - Thomas Kroj
- PHIM Plant Health Institute, University of Montpellier, INRAE, CIRAD, Institut Agro, IRD, 34000 Montpellier, France
| | - Christian Roumestand
- Centre de Biologie Structurale, University of Montpellier, INSERM U1054, CNRS UMR 5048, 34000 Montpellier, France
| | - André Padilla
- Centre de Biologie Structurale, University of Montpellier, INSERM U1054, CNRS UMR 5048, 34000 Montpellier, France
| |
|
29
|
Talluri S. Algorithms for protein design. ADVANCES IN PROTEIN CHEMISTRY AND STRUCTURAL BIOLOGY 2022; 130:1-38. [PMID: 35534105 DOI: 10.1016/bs.apcsb.2022.01.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/16/2022]
Abstract
Computational Protein Design has the potential to contribute to major advances in enzyme technology, vaccine design, receptor-ligand engineering, biomaterials, nanosensors, and synthetic biology. Although Protein Design is a challenging problem, proteins can be designed by experts in Protein Design, as well as by non-experts whose primary interests are in the applications of Protein Design. The increased accessibility of Protein Design technology is attributable to the accumulated knowledge and experience with Protein Design as well as to the availability of software and online resources. The objective of this review is to serve as a guide to the relevant literature with a focus on the novel methods and algorithms that have been developed or applied for Protein Design, and to assist in the selection of algorithms for Protein Design. Novel algorithms and models that have been introduced to utilize the enormous amount of experimental data and novel computational hardware have the potential for producing substantial increases in the accuracy, reliability and range of applications of designed proteins.
Affiliation(s)
- Sekhar Talluri
- Department of Biotechnology, GITAM, Visakhapatnam, India.
| |
|
30
|
Gupta S, Azadvari N, Hosseinzadeh P. Design of Protein Segments and Peptides for Binding to Protein Targets. BIODESIGN RESEARCH 2022; 2022:9783197. [PMID: 37850124 PMCID: PMC10521657 DOI: 10.34133/2022/9783197] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 12/29/2021] [Accepted: 03/16/2022] [Indexed: 10/19/2023] Open
Abstract
Recent years have witnessed a rise in methods for accurate prediction of structure and design of novel functional proteins. Design of functional protein fragments and peptides occupy a small, albeit unique, space within the general field of protein design. While the smaller size of these peptides allows for more exhaustive computational methods, flexibility in their structure and sparsity of data compared to proteins, as well as presence of noncanonical building blocks, add additional challenges to their design. This review summarizes the current advances in the design of protein fragments and peptides for binding to targets and discusses the challenges in the field, with an eye toward future directions.
Affiliation(s)
- Suchetana Gupta
- Knight Campus Center for Accelerating Scientific Impact, University of Oregon, Eugene OR 97403, USA
| | - Noora Azadvari
- Knight Campus Center for Accelerating Scientific Impact, University of Oregon, Eugene OR 97403, USA
| | - Parisa Hosseinzadeh
- Knight Campus Center for Accelerating Scientific Impact, University of Oregon, Eugene OR 97403, USA
| |
|
31
|
Multiple expansions of globally uncommon SARS-CoV-2 lineages in Nigeria. Nat Commun 2022; 13:688. [PMID: 35115515 PMCID: PMC8813984 DOI: 10.1038/s41467-022-28317-5] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 11/10/2021] [Accepted: 01/18/2022] [Indexed: 12/28/2022] Open
Abstract
Disparities in SARS-CoV-2 genomic surveillance have limited our understanding of the viral population dynamics and may delay identification of globally important variants. Despite being the most populated country in Africa, Nigeria has remained critically under sampled. Here, we report sequences from 378 SARS-CoV-2 isolates collected in Oyo State, Nigeria between July 2020 and August 2021. In early 2021, most isolates belonged to the Alpha “variant of concern” (VOC) or the Eta lineage. Eta outcompeted Alpha in Nigeria and across West Africa, persisting in the region even after expansion of an otherwise rare Delta sub-lineage. Spike protein from the Eta variant conferred increased infectivity and decreased neutralization by convalescent sera in vitro. Phylodynamic reconstructions suggest that Eta originated in West Africa before spreading globally and represented a VOC in early 2021. These results demonstrate a distinct distribution of SARS-CoV-2 lineages in Nigeria, and emphasize the need for improved genomic surveillance worldwide. SARS-CoV-2 genomic surveillance has been important for informing pandemic responses but many regions remain under-sampled, limiting knowledge of circulating strains. Here, the authors sequence 378 isolates from Nigeria and identify two strains that appear to be important locally though globally uncommon.
|
32
|
Liang S, Li Z, Zhan J, Zhou Y. De novo protein design by an energy function based on series expansion in distance and orientation dependence. Bioinformatics 2021; 38:86-93. [PMID: 34406339 DOI: 10.1093/bioinformatics/btab598] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 06/05/2021] [Revised: 08/11/2021] [Accepted: 08/16/2021] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION Despite many successes, de novo protein design is not yet a solved problem as its success rate remains low. The low success rate is largely because we do not yet have an accurate energy function for describing the solvent-mediated interaction between amino acid residues in a protein chain. Previous studies showed that an energy function based on series expansions with its parameters optimized for side-chain and loop conformations can lead to one of the most accurate methods for side chain (OSCAR) and loop prediction (LEAP). Following the same strategy, we developed an energy function based on series expansions with the parameters optimized in four separate stages (recovering single-residue types without and with orientation dependence, selecting loop decoys and maintaining the composition of amino acids). We tested the energy function for de novo design by using Monte Carlo simulated annealing. RESULTS The method for protein design (OSCAR-Design) is found to be as accurate as OSCAR and LEAP for side-chain and loop prediction, respectively. In de novo design, it can recover native residue types ranging from 38% to 43% depending on test sets, conserve hydrophobic/hydrophilic residues at ∼75%, and yield the overall similarity in amino acid compositions at more than 90%. These performance measures are all statistically significantly better than several protein design programs compared. Moreover, the largest hydrophobic patch areas in designed proteins are near or smaller than those in native proteins. Thus, an energy function based on series expansion can be made useful for protein design. AVAILABILITY AND IMPLEMENTATION The Linux executable version is freely available for academic users at http://zhouyq-lab.szbl.ac.cn/resources/.
Affiliation(s)
- Shide Liang
- Department of R & D, Bio-Thera Solutions, Guangzhou 510530, China
| | - Zhixiu Li
- Institute of Health and Biomedical Innovation, Queensland University of Technology at Translational Research Institute, Woolloongabba, QLD 3001, Australia
| | - Jian Zhan
- Institute for Glycomics and School of Information and Communication Technology, Griffith University, Gold Coast Campus, Southport, QLD 4222, Australia
- Institute for Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518055, China
| | - Yaoqi Zhou
- Institute for Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen 518055, China
- Peking University Shenzhen Graduate School, Shenzhen 518055, China
| |
|
33
|
Havranek B, Chan KK, Wu A, Procko E, Islam SM. Computationally Designed ACE2 Decoy Receptor Binds SARS-CoV-2 Spike (S) Protein with Tight Nanomolar Affinity. J Chem Inf Model 2021; 61:4656-4669. [PMID: 34427448 PMCID: PMC8409145 DOI: 10.1021/acs.jcim.1c00783] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 07/02/2021] [Indexed: 12/25/2022]
Abstract
Even with the availability of vaccines, therapeutic options for COVID-19 still remain highly desirable, especially in hospitalized patients with moderate or severe disease. Soluble ACE2 (sACE2) is a promising therapeutic candidate that neutralizes SARS CoV-2 infection by acting as a decoy. Using computational mutagenesis, we designed a number of sACE2 derivatives carrying three to four mutations. The top-predicted sACE2 decoy based on the in silico mutagenesis scan was subjected to molecular dynamics and free-energy calculations for further validation. After illuminating the mechanism of increased binding for our designed sACE2 derivative, the design was verified experimentally by flow cytometry and BLI-binding experiments. The computationally designed sACE2 decoy (ACE2-FFWF) bound the receptor-binding domain of SARS-CoV-2 tightly with low nanomolar affinity and ninefold affinity enhancement over the wild type. Furthermore, cell surface expression was slightly greater than wild-type ACE2, suggesting that the design is well-folded and stable. Having an arsenal of high-affinity sACE2 derivatives will help to buffer against the emergence of SARS CoV-2 variants. Here, we show that computational methods have become sufficiently accurate for the design of therapeutics for current and future viral pandemics.
Affiliation(s)
- Brandon Havranek
- Department of Chemistry, University of Illinois at Chicago, Chicago, Illinois 60607, United States
| | - Kui K. Chan
- Orthogonal Biologics Inc., Urbana, Illinois 61801, United States
| | - Austin Wu
- Department of Computer Science, Northwestern University, Evanston, Illinois 60208, United States
| | - Erik Procko
- Department of Biochemistry and Cancer Center at Illinois, University of Illinois, Urbana, Illinois 61801, United States
| | - Shahidul M. Islam
- Department of Chemistry, University of Illinois at Chicago, Chicago, Illinois 60607, United States
| |
|
34
|
Greener JG, Jones DT. Differentiable molecular simulation can learn all the parameters in a coarse-grained force field for proteins. PLoS One 2021; 16:e0256990. [PMID: 34473813 PMCID: PMC8412298 DOI: 10.1371/journal.pone.0256990] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/19/2021] [Accepted: 08/19/2021] [Indexed: 11/26/2022] Open
Abstract
Finding optimal parameters for force fields used in molecular simulation is a challenging and time-consuming task, partly due to the difficulty of tuning multiple parameters at once. Automatic differentiation presents a general solution: run a simulation, obtain gradients of a loss function with respect to all the parameters, and use these to improve the force field. This approach takes advantage of the deep learning revolution whilst retaining the interpretability and efficiency of existing force fields. We demonstrate that this is possible by parameterising a simple coarse-grained force field for proteins, based on training simulations of up to 2,000 steps learning to keep the native structure stable. The learned potential matches chemical knowledge and PDB data, can fold and reproduce the dynamics of small proteins, and shows ability in protein design and model scoring applications. Problems in applying differentiable molecular simulation to all-atom models of proteins are discussed along with possible solutions and the variety of available loss functions. The learned potential, simulation scripts and training code are made available at https://github.com/psipred/cgdms.
Affiliation(s)
- Joe G. Greener
- Department of Computer Science, University College London, London, United Kingdom
| | - David T. Jones
- Department of Computer Science, University College London, London, United Kingdom
| |
|
35
|
Nazet J, Lang E, Merkl R. Rosetta:MSF:NN: Boosting performance of multi-state computational protein design with a neural network. PLoS One 2021; 16:e0256691. [PMID: 34437621 PMCID: PMC8389498 DOI: 10.1371/journal.pone.0256691] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/23/2020] [Accepted: 08/12/2021] [Indexed: 12/05/2022] Open
Abstract
Rational protein design aims at the targeted modification of existing proteins. To reach this goal, software suites like Rosetta propose sequences to introduce the desired properties. Challenging design problems necessitate the representation of a protein by means of a structural ensemble. Thus, Rosetta multi-state design (MSD) protocols have been developed wherein each state represents one protein conformation. Computational demands of MSD protocols are high, because for each of the candidate sequences a costly three-dimensional (3D) model has to be created and assessed for all states. Each of these scores contributes one data point to a complex, design-specific energy landscape. As neural networks (NN) proved well-suited to learn such solution spaces, we integrated one into the framework Rosetta:MSF instead of the so far used genetic algorithm with the aim to reduce computational costs. As its predecessor, Rosetta:MSF:NN administers a set of candidate sequences and their scores and scans sequence space iteratively. During each iteration, the union of all candidate sequences and their Rosetta scores are used to re-train NNs that possess a design-specific architecture. The enormous speed of the NNs allows an extensive assessment of alternative sequences, which are ranked on the scores predicted by the NN. Costly 3D models are computed only for a small fraction of best-scoring sequences; these and the corresponding 3D-based scores replace half of the candidate sequences during each iteration. The analysis of two sets of candidate sequences generated for a specific design problem by means of a genetic algorithm confirmed that the NN predicted 3D-based scores quite well; the Pearson correlation coefficient was at least 0.95. Applying Rosetta:MSF:NN:enzdes to a benchmark consisting of 16 ligand-binding problems showed that this protocol converges ten-times faster than the genetic algorithm and finds sequences with comparable scores.
Affiliation(s)
- Julian Nazet
- Institute of Biophysics and Physical Biochemistry, University of Regensburg, Regensburg, Germany
| | - Elmar Lang
- Institute of Biophysics and Physical Biochemistry, University of Regensburg, Regensburg, Germany
| | - Rainer Merkl
- Institute of Biophysics and Physical Biochemistry, University of Regensburg, Regensburg, Germany
| |
|
36
|
Huang X, Pearce R, Omenn GS, Zhang Y. Identification of 13 Guanidinobenzoyl- or Aminidinobenzoyl-Containing Drugs to Potentially Inhibit TMPRSS2 for COVID-19 Treatment. Int J Mol Sci 2021; 22:7060. [PMID: 34209110 PMCID: PMC8269196 DOI: 10.3390/ijms22137060] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/25/2021] [Revised: 06/18/2021] [Accepted: 06/28/2021] [Indexed: 12/26/2022] Open
Abstract
Positively charged groups that mimic arginine or lysine in a natural substrate of trypsin are necessary for drugs to inhibit the trypsin-like serine protease TMPRSS2 that is involved in the viral entry and spread of coronaviruses, including SARS-CoV-2. Based on this assumption, we identified a set of 13 approved or clinically investigational drugs with positively charged guanidinobenzoyl and/or aminidinobenzoyl groups, including the experimentally verified TMPRSS2 inhibitors Camostat and Nafamostat. Molecular docking using the C-I-TASSER-predicted TMPRSS2 catalytic domain model suggested that the guanidinobenzoyl or aminidinobenzoyl group in all the drugs could form putative salt bridge interactions with the side-chain carboxyl group of Asp435 located in the S1 pocket of TMPRSS2. Molecular dynamics simulations further revealed the high stability of the putative salt bridge interactions over long-time (100 ns) simulations. The molecular mechanics/generalized Born surface area-binding free energy assessment and per-residue energy decomposition analysis also supported the strong binding interactions between TMPRSS2 and the proposed drugs. These results suggest that the proposed compounds, in addition to Camostat and Nafamostat, could be effective TMPRSS2 inhibitors for COVID-19 treatment by occupying the S1 pocket with the hallmark positively charged groups.
Affiliation(s)
- Xiaoqiang Huang
- Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109, USA
| | - Robin Pearce
- Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109, USA
| | - Gilbert S. Omenn
- Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109, USA
- Departments of Internal Medicine and Human Genetics and School of Public Health, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109, USA
| | - Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109, USA
- Department of Biological Chemistry, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109, USA
| |
|
37
|
Pal A, Pal D, Mitra P. A computational framework for modeling functional protein-protein interactions. Proteins 2021; 89:1353-1364. [PMID: 34076296 DOI: 10.1002/prot.26156] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2020] [Revised: 04/17/2021] [Accepted: 05/19/2021] [Indexed: 11/06/2022]
Abstract
Protein interactions and their assemblies assist in understanding the cellular mechanisms through the knowledge of interactome. Despite recent advances, a vast number of interacting protein complexes is not annotated by three-dimensional structures. Therefore, a computational framework is a suitable alternative to fill the large gap between identified interactions and the interactions with known structures. In this work, we develop an automated computational framework for modeling functionally related protein-complex structures utilizing GO-based semantic similarity technique and co-evolutionary information of the interaction sites. The framework can consider protein sequence and structure information as input and employ both rigid-body docking and template-based modeling exploiting the existing structural templates and sequence homology information from the PDB. Our framework combines geometric as well as physicochemical features for re-ranking the docking decoys. The proposed framework has an 83% success rate when tested on a benchmark dataset while considering Top1 models for template-based modeling and Top10 models for the docking pipeline. We believe that our computational framework can be used for any pair of proteins with higher confidence to identify the functional protein-protein interactions.
Affiliation(s)
- Abantika Pal
- Department of Computer Science and Engineering, Indian Institute of Technology Kharagpur, Kharagpur, India
| | - Debnath Pal
- Department of Computational and Data Sciences, Indian Institute of Science Bangalore, Bangalore, India
| | - Pralay Mitra
- Department of Computer Science and Engineering, Indian Institute of Technology Kharagpur, Kharagpur, India
| |
|
38
|
Pearce R, Zhang Y. Deep learning techniques have significantly impacted protein structure prediction and protein design. Curr Opin Struct Biol 2021; 68:194-207. [PMID: 33639355 PMCID: PMC8222070 DOI: 10.1016/j.sbi.2021.01.007] [Citation(s) in RCA: 56] [Impact Index Per Article: 18.7] [Received: 10/05/2020] [Revised: 01/09/2021] [Accepted: 01/18/2021] [Indexed: 12/26/2022]
Abstract
Protein structure prediction and design can be regarded as two inverse processes governed by the same folding principle. Although progress remained stagnant over the past two decades, the recent application of deep neural networks to spatial constraint prediction and end-to-end model training has significantly improved the accuracy of protein structure prediction, largely solving the problem at the fold level for single-domain proteins. The field of protein design has also witnessed dramatic improvement, where noticeable examples have shown that information stored in neural-network models can be used to advance functional protein design. Thus, incorporation of deep learning techniques into different steps of protein folding and design approaches represents an exciting future direction and should continue to have a transformative impact on both fields.
Affiliation(s)
- Robin Pearce
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA; Department of Biological Chemistry, University of Michigan, Ann Arbor, MI 48109, USA.
| |
|
39
|
Amengual-Rigo P, Fernández-Recio J, Guallar V. UEP: an open-source and fast classifier for predicting the impact of mutations in protein-protein complexes. Bioinformatics 2021; 37:334-341. [PMID: 32761082 DOI: 10.1093/bioinformatics/btaa708] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/07/2019] [Revised: 07/23/2020] [Accepted: 07/31/2020] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION: Single protein residue mutations may reshape the binding affinity of protein-protein interactions. Therefore, predicting their effects is of great interest in biotechnology and biomedicine. Unfortunately, the availability of experimental data on binding affinity changes upon mutation is limited, which hampers the development of new and more precise algorithms. Here, we propose UEP, a classifier for predicting beneficial and detrimental mutations in protein-protein complexes trained on interactome data. RESULTS: Despite the simplicity of the UEP algorithm, which is based on a three-body contact potential derived from interactome data, we report competitive results with the gold standard methods in this field with the advantage of being faster in terms of computational time. Moreover, we propose a consensus selection procedure involving the combination of three predictors that showed higher classification accuracy in our benchmark: UEP, pyDock and EvoEF1/FoldX. Overall, we demonstrate that the analysis of interactome data allows predicting the impact of protein-protein mutations using UEP, a fast and reliable open-source code. AVAILABILITY AND IMPLEMENTATION: The UEP algorithm can be found at: https://github.com/pepamengual/UEP. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Pep Amengual-Rigo
- Department of Life Sciences, Barcelona Supercomputing Center (BSC), 08034 Barcelona, Spain
| | - Juan Fernández-Recio
- Department of Life Sciences, Barcelona Supercomputing Center (BSC), 08034 Barcelona, Spain; Instituto de Ciencias de la Vid y del Vino (ICVV), CSIC-Universidad de la Rioja-Gobierno de la Rioja, 26007 Logroño, Spain
| | - Victor Guallar
- Department of Life Sciences, Barcelona Supercomputing Center (BSC), 08034 Barcelona, Spain; ICREA: Institució Catalana de Recerca i Estudis Avançats, 08010 Barcelona, Spain
| |
|
40
|
Frappier V, Keating AE. Data-driven computational protein design. Curr Opin Struct Biol 2021; 69:63-69. [PMID: 33910104 DOI: 10.1016/j.sbi.2021.03.009] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 12/06/2020] [Revised: 03/18/2021] [Accepted: 03/19/2021] [Indexed: 01/28/2023]
Abstract
Computational protein design can generate proteins not found in nature that adopt desired structures and perform novel functions. Although proteins could, in theory, be designed with ab initio methods, practical success has come from using large amounts of data that describe the sequences, structures, and functions of existing proteins and their variants. We present recent creative uses of multiple-sequence alignments, protein structures, and high-throughput functional assays in computational protein design. Approaches range from enhancing structure-based design with experimental data to building regression models to training deep neural nets that generate novel sequences. Looking ahead, deep learning will be increasingly important for maximizing the value of data for protein design.
Affiliation(s)
- Vincent Frappier
- Generate Biomedicines, 26 Landsdowne Street, Cambridge, MA, 02139, USA
| | - Amy E Keating
- MIT Departments of Biology and Biological Engineering, 77 Massachusetts Ave., Cambridge, MA, 02139, USA.
| |
|
41
|
Yoshida S, Wei X, Zhang G, O'Connor CL, Torres M, Zhou Z, Lin L, Menon R, Xu X, Zheng W, Xiong Y, Otto E, Tang CHA, Hua R, Verma R, Mori H, Zhang Y, Hu CCA, Liu M, Garg P, Hodgin JB, Sun S, Bitzer M, Qi L. Endoplasmic reticulum-associated degradation is required for nephrin maturation and kidney glomerular filtration function. J Clin Invest 2021; 131:143988. [PMID: 33591954 DOI: 10.1172/jci143988] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 09/04/2020] [Accepted: 02/11/2021] [Indexed: 02/06/2023] Open
Abstract
Podocytes are key to the glomerular filtration barrier by forming a slit diaphragm between interdigitating foot processes; however, the molecular details and functional importance of protein folding and degradation in the ER remain unknown. Here, we show that the SEL1L-HRD1 protein complex of ER-associated degradation (ERAD) is required for slit diaphragm formation and glomerular filtration function. SEL1L-HRD1 ERAD is highly expressed in podocytes of both mouse and human kidneys. Mice with podocyte-specific Sel1L deficiency develop podocytopathy and severe congenital nephrotic syndrome with an impaired slit diaphragm shortly after weaning and die prematurely, with a median lifespan of approximately 3 months. We show mechanistically that nephrin, a type 1 membrane protein causally linked to congenital nephrotic syndrome, is an endogenous ERAD substrate. ERAD deficiency attenuated the maturation of nascent nephrin, leading to its retention in the ER. We also show that various autosomal-recessive nephrin disease mutants were highly unstable and broken down by SEL1L-HRD1 ERAD, which attenuated the pathogenicity of the mutants toward the WT allele. This study uncovers a critical role of SEL1L-HRD1 ERAD in glomerular filtration barrier function and provides insights into the pathogenesis associated with autosomal-recessive disease mutants.
Affiliation(s)
- Sei Yoshida
- Department of Molecular and Integrative Physiology, University of Michigan Medical School, Ann Arbor, Michigan, USA; State Key Laboratory of Medical Chemical Biology, College of Life Sciences, Frontiers Science Center for Cell Responses, Nankai University, Tianjin, China
| | - Xiaoqiong Wei
- Department of Molecular and Integrative Physiology, University of Michigan Medical School, Ann Arbor, Michigan, USA
| | - Gensheng Zhang
- Department of Molecular and Integrative Physiology, University of Michigan Medical School, Ann Arbor, Michigan, USA
| | - Christopher L O'Connor
- Division of Nephrology, Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, Michigan, USA
| | - Mauricio Torres
- Department of Molecular and Integrative Physiology, University of Michigan Medical School, Ann Arbor, Michigan, USA
| | - Zhangsen Zhou
- Department of Molecular and Integrative Physiology, University of Michigan Medical School, Ann Arbor, Michigan, USA
| | - Liangguang Lin
- Department of Molecular and Integrative Physiology, University of Michigan Medical School, Ann Arbor, Michigan, USA
| | - Rajasree Menon
- Division of Nephrology, Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, Michigan, USA
| | - Xiaoxi Xu
- Department of Endocrinology and Metabolism, Tianjin Medical University General Hospital, Tianjin, China
| | - Wenyue Zheng
- State Key Laboratory of Medical Chemical Biology, College of Life Sciences, Frontiers Science Center for Cell Responses, Nankai University, Tianjin, China
| | - Yi Xiong
- Center for Molecular Medicine and Genetics, Department of Biochemistry, Microbiology and Immunology, Wayne State University School of Medicine, Detroit, Michigan, USA
| | - Edgar Otto
- Division of Nephrology, Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, Michigan, USA
| | - Chih-Hang Anthony Tang
- Houston Methodist Cancer Center, Houston Methodist Academic Institute, Houston, Texas, USA
| | - Rui Hua
- State Key Laboratory of Medical Chemical Biology, College of Life Sciences, Frontiers Science Center for Cell Responses, Nankai University, Tianjin, China
| | - Rakesh Verma
- Division of Nephrology, Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, Michigan, USA
| | - Hiroyuki Mori
- Department of Molecular and Integrative Physiology, University of Michigan Medical School, Ann Arbor, Michigan, USA
| | - Yang Zhang
- Department of Computational Medicine and Bioinformatics and Department of Biological Chemistry and
| | - Chih-Chi Andrew Hu
- Houston Methodist Cancer Center, Houston Methodist Academic Institute, Houston, Texas, USA
| | - Ming Liu
- Department of Endocrinology and Metabolism, Tianjin Medical University General Hospital, Tianjin, China
| | - Puneet Garg
- Division of Nephrology, Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, Michigan, USA
| | | | - Shengyi Sun
- Center for Molecular Medicine and Genetics, Department of Biochemistry, Microbiology and Immunology, Wayne State University School of Medicine, Detroit, Michigan, USA
| | - Markus Bitzer
- Division of Nephrology, Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, Michigan, USA
| | - Ling Qi
- Department of Molecular and Integrative Physiology, University of Michigan Medical School, Ann Arbor, Michigan, USA; Division of Metabolism, Endocrinology & Diabetes, Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, Michigan, USA
| |
|
42
|
Kruglikov A, Rakesh M, Wei Y, Xia X. Applications of Protein Secondary Structure Algorithms in SARS-CoV-2 Research. J Proteome Res 2021; 20:1457-1463. [PMID: 33617253 PMCID: PMC7927282 DOI: 10.1021/acs.jproteome.0c00734] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/20/2020] [Indexed: 01/25/2023]
Abstract
Since the outset of COVID-19, the pandemic has prompted immediate global efforts to sequence SARS-CoV-2, and over 450 000 complete genomes have been publicly deposited over the course of 12 months. Despite this, comparative nucleotide and amino acid sequence analyses often fall short in answering key questions in vaccine design. For example, the binding affinity between different ACE2 receptors and SARS-CoV-2 spike protein cannot be fully explained by amino acid similarity at ACE2 contact sites because protein structure similarities are not fully reflected by amino acid sequence similarities. To comprehensively compare protein homology, secondary structure (SS) analysis is required. While protein structure is slow and difficult to obtain, SS predictions can be made rapidly, and a well-predicted SS may serve as a viable proxy to gain biological insight. Here we review algorithms and information used in predicting protein SS to highlight its potential application in pandemic research. We also show examples of how SS predictions can be used to compare ACE2 proteins and to evaluate the zoonotic origins of viruses. As computational tools are much faster than wet-lab experiments, these applications can be important for research, especially in times when quickly obtained biological insights can help speed up the response to pandemics.
Affiliation(s)
- Alibek Kruglikov
- Department of Biology, University of Ottawa, Ottawa, Ontario K1N 6N5, Canada
| | - Mohan Rakesh
- Department of Biology, University of Ottawa, Ottawa, Ontario K1N 6N5, Canada
| | - Yulong Wei
- Department of Biology, University of Ottawa, Ottawa, Ontario K1N 6N5, Canada
| | - Xuhua Xia
- Department of Biology, University of Ottawa, Ottawa, Ontario K1N 6N5, Canada
- Ottawa Institute of Systems Biology, University of Ottawa, Ottawa, Ontario K1N 6N5, Canada
| |
|
43
|
Stam MJ, Wood CW. DE-STRESS: a user-friendly web application for the evaluation of protein designs. Protein Eng Des Sel 2021; 34:gzab029. [PMID: 34908138 PMCID: PMC8672653 DOI: 10.1093/protein/gzab029] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/07/2021] [Revised: 10/11/2021] [Accepted: 10/25/2021] [Indexed: 11/16/2022] Open
Abstract
De novo protein design is a rapidly growing field, and there are now many interesting and useful examples of designed proteins in the literature. However, most designs could be classed as failures when characterised in the lab, usually as a result of low expression, misfolding, aggregation or lack of function. This high attrition rate makes protein design unreliable and costly. It is possible that some of these failures could be caught earlier in the design process if it were quick and easy to generate information and a set of high-quality metrics regarding designs, which could be used to make reproducible and data-driven decisions about which designs to characterise experimentally. We present DE-STRESS (DEsigned STRucture Evaluation ServiceS), a web application for evaluating structural models of designed and engineered proteins. DE-STRESS has been designed to be simple, intuitive to use and responsive. It provides a wealth of information regarding designs, as well as tools to help contextualise the results and formally describe the properties that a design requires to be fit for purpose.
Affiliation(s)
- Michael J Stam
- School of Informatics, University of Edinburgh, Edinburgh EH8 9AB, UK
| | - Christopher W Wood
- School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3FF, UK
| |
|
44
|
Calzini MA, Malico AA, Mitchler MM, Williams GJ. Protein engineering for natural product biosynthesis and synthetic biology applications. Protein Eng Des Sel 2021; 34:gzab015. [PMID: 34137436 PMCID: PMC8209613 DOI: 10.1093/protein/gzab015] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/15/2021] [Revised: 05/12/2021] [Accepted: 05/17/2021] [Indexed: 11/14/2022] Open
Abstract
As protein engineering grows more salient, many strategies have emerged to alter protein structure and function, with the goal of redesigning and optimizing natural product biosynthesis. Computational tools, including machine learning and molecular dynamics simulations, have enabled the rational mutagenesis of key catalytic residues for enhanced or altered biocatalysis. Semi-rational, directed evolution and microenvironment engineering strategies have optimized catalysis for native substrates and increased enzyme promiscuity beyond the scope of traditional rational approaches. These advances are made possible using novel high-throughput screens, including designer protein-based biosensors with engineered ligand specificity. Herein, we detail the most recent of these advances, focusing on polyketides, non-ribosomal peptides and isoprenoids, including their native biosynthetic logic to provide clarity for future applications of these technologies for natural product synthetic biology.
Affiliation(s)
- Miles A Calzini
- Department of Chemistry, NC State University, Raleigh, NC 27695-8204, USA
| | - Alexandra A Malico
- Department of Chemistry, NC State University, Raleigh, NC 27695-8204, USA
| | - Melissa M Mitchler
- Department of Chemistry, NC State University, Raleigh, NC 27695-8204, USA
| | - Gavin J Williams
- Department of Chemistry, NC State University, Raleigh, NC 27695-8204, USA
- Comparative Medicine Institute, NC State University Raleigh, Raleigh, NC 27695-8204, USA
| |
|
45
|
ADDRESS: A Database of Disease-associated Human Variants Incorporating Protein Structure and Folding Stabilities. J Mol Biol 2021; 433:166840. [PMID: 33539887 DOI: 10.1016/j.jmb.2021.166840] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 10/01/2020] [Revised: 01/17/2021] [Accepted: 01/20/2021] [Indexed: 11/22/2022]
Abstract
Numerous human diseases are caused by mutations in genomic sequences. Since amino acid changes affect protein function through mechanisms often predictable from protein structure, the integration of structural and sequence data enables us to estimate with greater accuracy whether and how a given mutation will lead to disease. Publicly available annotated databases enable hypothesis assessment and benchmarking of prediction tools. However, the results are often presented as summary statistics or black box predictors, without providing full descriptive information. We developed a new semi-manually curated human variant database presenting information on the protein contact-map, sequence-to-structure mapping, amino acid identity change, and stability prediction for the popular UniProt database. We found that the profiles of pathogenic and benign missense polymorphisms can be effectively deduced using decision trees and comparative analyses based on the presented dataset. The database is made publicly available through https://zhanglab.ccmb.med.umich.edu/ADDRESS.
|
46
|
Badhe Y, Gupta R, Rai B. In silico design of peptides with binding to the receptor binding domain (RBD) of the SARS-CoV-2 and their utility in bio-sensor development for SARS-CoV-2 detection. RSC Adv 2021; 11:3816-3826. [PMID: 35424358 PMCID: PMC8694220 DOI: 10.1039/d0ra09123e] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 10/26/2020] [Accepted: 01/13/2021] [Indexed: 12/23/2022] Open
Abstract
The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has infected millions of people across the globe and created not only a health emergency but also a financial crisis. This virus attacks the angiotensin-converting enzyme 2 (ACE2) receptor situated on the surface of the host cell membrane. The spike protein of the virus binds to this receptor, which is a critical step in infection. A molecule which can specifically stop this binding could be a potential therapeutic agent. In this study, we have tested 12 potential peptides which can bind to the receptor binding domain (RBD) of the spike protein of the virus and thus can potentially inhibit the binding of the latter on ACE2 receptors. These peptides are screened based on their binding with the RBD of the spike protein and aqueous stability, obtained using several atomistic molecular dynamics simulations. The potential of mean force calculation of peptides confirmed their binding to the RBD of the spike protein. Furthermore, two potential peptides were tested for use in a biosensing application for SARS-CoV-2 detection. Two types of biosensing platforms, a graphene sheet and a carbon nanotube (CNT), were tested. The peptides were modified in order to functionalize the graphene and CNT. Based on the interaction between the substrate, peptide and spike protein, the utility of the screened peptide for a given biosensing platform is discussed and recommended. The protocol for peptide design and testing for its usage as a sensor.
Affiliation(s)
- Yogesh Badhe
- Physical Science Research Area, Tata Research Development and Design Centre, TCS Research, Tata Consultancy Services, Pune-411013
| | - Rakesh Gupta
- Physical Science Research Area, Tata Research Development and Design Centre, TCS Research, Tata Consultancy Services, Pune-411013
| | - Beena Rai
- Physical Science Research Area, Tata Research Development and Design Centre, TCS Research, Tata Consultancy Services, Pune-411013
47
Ong E, Huang X, Pearce R, Zhang Y, He Y. Computational design of SARS-CoV-2 spike glycoproteins to increase immunogenicity by T cell epitope engineering. Comput Struct Biotechnol J 2020; 19:518-529. [PMID: 33398234 PMCID: PMC7773544 DOI: 10.1016/j.csbj.2020.12.039] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 10/21/2020] [Revised: 12/24/2020] [Accepted: 12/24/2020] [Indexed: 01/12/2023]
Abstract
The development of effective and safe vaccines is the ultimate way to efficiently stop the ongoing COVID-19 pandemic, which is caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Built on the fact that SARS-CoV-2 utilizes the association of its Spike (S) protein with the human angiotensin-converting enzyme 2 (ACE2) receptor to invade host cells, we computationally redesigned the S protein sequence to improve its immunogenicity and antigenicity. Toward this purpose, we extended an evolutionary protein design algorithm, EvoDesign, to create thousands of stable S protein variants that perturb the core protein sequence but keep the surface conformation and B cell epitopes. The T cell epitope content and similarity scores of the perturbed sequences were calculated and evaluated. Out of 22,914 designs with favorable stability energy, 301 candidates contained at least two pre-existing immunity-related epitopes and had promising immunogenic potential. The benchmark tests showed that, although the epitope restraints were not included in the scoring function of EvoDesign, the top S protein design successfully recovered 31 out of the 32 major histocompatibility complex (MHC)-II T cell promiscuous epitopes in the native S protein, where two epitopes were present in all seven human coronaviruses. Moreover, the newly designed S protein introduced nine new MHC-II T cell promiscuous epitopes that do not exist in the wildtype SARS-CoV-2. These results demonstrated a new and effective avenue to enhance a target protein's immunogenicity using rational protein design, which could be applied for new vaccine design against COVID-19 and other pathogens.
Affiliation(s)
- Edison Ong
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Xiaoqiang Huang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Robin Pearce
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Department of Biological Chemistry, University of Michigan, Ann Arbor, MI 48109, USA
- Yongqun He
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
- Unit for Laboratory Animal Medicine, Department of Microbiology and Immunology, University of Michigan, Ann Arbor, MI 48109, USA
48
Huang X, Pearce R, Zhang Y. FASPR: an open-source tool for fast and accurate protein side-chain packing. Bioinformatics 2020; 36:3758-3765. [PMID: 32259206 DOI: 10.1093/bioinformatics/btaa234] [Citation(s) in RCA: 46] [Impact Index Per Article: 11.5] [Received: 01/11/2020] [Revised: 03/30/2020] [Accepted: 04/01/2020] [Indexed: 01/04/2023]
Abstract
MOTIVATION: Protein structure and function are essentially determined by how the side-chain atoms interact with each other. Thus, accurate protein side-chain packing (PSCP) is a critical step toward protein structure prediction and protein design. Despite the importance of the problem, however, the accuracy and speed of current PSCP programs are still not satisfactory.
RESULTS: We present FASPR for fast and accurate PSCP by using an optimized scoring function in combination with a deterministic searching algorithm. The performance of FASPR was compared with four state-of-the-art PSCP methods (CISRR, RASP, SCATD and SCWRL4) on both native and non-native protein backbones. For the assessment on native backbones, FASPR achieved a good performance by correctly predicting 69.1% of all the side-chain dihedral angles using a stringent tolerance criterion of 20°, compared favorably with SCWRL4, CISRR, RASP and SCATD which successfully predicted 68.8%, 68.6%, 67.8% and 61.7%, respectively. Additionally, FASPR achieved the highest speed for packing the 379 test protein structures in only 34.3 s, which was significantly faster than the control methods. For the assessment on non-native backbones, FASPR showed an equivalent or better performance on I-TASSER predicted backbones and the backbones perturbed from experimental structures. Detailed analyses showed that the major advantage of FASPR lies in the optimal combination of the dead-end elimination and tree decomposition with a well optimized scoring function, which makes FASPR of practical use for both protein structure modeling and protein design studies.
AVAILABILITY AND IMPLEMENTATION: The web server, source code and datasets are freely available at https://zhanglab.ccmb.med.umich.edu/FASPR and https://github.com/tommyhuangthu/FASPR.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Robin Pearce
- Department of Computational Medicine and Bioinformatics
- Yang Zhang
- Department of Computational Medicine and Bioinformatics
- Department of Biological Chemistry, University of Michigan, Ann Arbor, MI 48109, USA
49
Huang X, Zhang C, Pearce R, Omenn GS, Zhang Y. Identifying the Zoonotic Origin of SARS-CoV-2 by Modeling the Binding Affinity between the Spike Receptor-Binding Domain and Host ACE2. J Proteome Res 2020; 19:4844-4856. [PMID: 33175551 PMCID: PMC7770890 DOI: 10.1021/acs.jproteome.0c00717] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 09/15/2020] [Indexed: 12/14/2022]
Abstract
Despite considerable research progress on SARS-CoV-2, the direct zoonotic origin (intermediate host) of the virus remains ambiguous. The most definitive approach to identify the intermediate host would be the detection of SARS-CoV-2-like coronaviruses in wild animals. However, due to the high number of animal species, it is not feasible to screen all the species in the laboratory. Given that binding to ACE2 proteins is the first step for the coronaviruses to invade host cells, we propose a computational pipeline to identify potential intermediate hosts of SARS-CoV-2 by modeling the binding affinity between the Spike receptor-binding domain (RBD) and host ACE2. Using this pipeline, we systematically examined 285 ACE2 variants from mammals, birds, fish, reptiles, and amphibians, and found that the binding energies calculated for the modeled Spike-RBD/ACE2 complex structures correlated closely with the effectiveness of animal infection as determined by multiple experimental data sets. Built on the optimized binding affinity cutoff, we suggest a set of 96 mammals, including 48 experimentally investigated ones, which are permissive to SARS-CoV-2, with candidates from primates, rodents, and carnivores at the highest risk of infection. Overall, this work not only suggests a limited range of potential intermediate SARS-CoV-2 hosts for further experimental investigation, but also, more importantly, it proposes a new structure-based approach to general zoonotic origin and susceptibility analyses that are critical for human infectious disease control and wildlife protection.
Affiliation(s)
- Xiaoqiang Huang
- Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109, USA
- Chengxin Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109, USA
- Robin Pearce
- Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109, USA
- Gilbert S. Omenn
- Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109, USA
- Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109, USA
- Department of Biological Chemistry, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109, USA
50
Grimm M, Liu Y, Yang X, Bu C, Xiao Z, Cao Y. LigMate: A Multifeature Integration Algorithm for Ligand-Similarity-Based Virtual Screening. J Chem Inf Model 2020; 60:6044-6053. [DOI: 10.1021/acs.jcim.9b01210] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 02/06/2023]
Affiliation(s)
- Maximilian Grimm
- Center of Growth, Metabolism and Aging, Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, China
- Yang Liu
- Center of Growth, Metabolism and Aging, Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, China
- Xiaocong Yang
- Center of Growth, Metabolism and Aging, Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, China
- Chunya Bu
- College of Biological Science and Engineering, Beijing University of Agriculture, Beijing 102206, China
- Zhixiong Xiao
- Center of Growth, Metabolism and Aging, Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, China
- Yang Cao
- Center of Growth, Metabolism and Aging, Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, China