1
Wang T, Wang L, Zhang X, Shen C, Zhang O, Wang J, Wu J, Jin R, Zhou D, Chen S, Liu L, Wang X, Hsieh CY, Chen G, Pan P, Kang Y, Hou T. Comprehensive assessment of protein loop modeling programs on large-scale datasets: prediction accuracy and efficiency. Brief Bioinform 2023; 25:bbad486. [PMID: 38171930 PMCID: PMC10764206 DOI: 10.1093/bib/bbad486] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2023] [Revised: 12/04/2023] [Accepted: 12/05/2023] [Indexed: 01/05/2024]
Abstract
Protein loops play a critical role in the dynamics of proteins and are essential for numerous biological functions, and various computational approaches to loop modeling have been proposed over the past decades. However, a comprehensive understanding of the strengths and weaknesses of each method is lacking. In this work, we constructed two high-quality datasets (i.e. the General dataset and the CASP dataset) and systematically evaluated the accuracy and efficiency of 13 commonly used loop modeling approaches from the perspective of loop lengths, protein classes and residue types. The results indicate that the knowledge-based method FREAD generally outperforms the other tested programs in most cases, but encountered challenges when predicting loops longer than 15 and 30 residues on the CASP and General datasets, respectively. The ab initio method Rosetta NGK demonstrated exceptional modeling accuracy for short loops with four to eight residues and achieved the highest success rate on the CASP dataset. The well-known AlphaFold2 and RoseTTAFold require more resources for better performance, but they exhibit promise for predicting loops longer than 16 and 30 residues in the CASP and General datasets. These observations can provide valuable insights for selecting suitable methods for specific loop modeling tasks and contribute to future advancements in the field.
Affiliation(s)
- Tianyue Wang
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Langcheng Wang
  - Department of Pathology, New York University Medical Center, 550 First Avenue, New York, NY 10016, USA
- Xujun Zhang
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Chao Shen
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Odin Zhang
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Jike Wang
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Jialu Wu
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Ruofan Jin
  - College of Life Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Donghao Zhou
  - Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, Guangdong, China
- Shicheng Chen
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Liwei Liu
  - Advanced Computing and Storage Laboratory, Central Research Institute, 2012 Laboratories, Huawei Technologies Co., Ltd., Shenzhen 518129, Guangdong, China
- Xiaorui Wang
  - State Key Laboratory of Quality Research in Chinese Medicines, Macau University of Science and Technology, Macao, China
- Chang-Yu Hsieh
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Guangyong Chen
  - Zhejiang Lab, Zhejiang University, Hangzhou 311121, Zhejiang, China
- Peichen Pan
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Yu Kang
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Tingjun Hou
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
2
Mirzaei R, Shafiee S, Vafaei R, Salehi M, Jalili N, Nazerian Z, Muhammadnajad A, Yadegari F, Reza Esmailinejad M, Farahmand L. Production of novel recombinant anti-EpCAM antibody as targeted therapy for breast cancer. Int Immunopharmacol 2023; 122:110656. [PMID: 37473710 DOI: 10.1016/j.intimp.2023.110656] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/22/2023] [Revised: 07/10/2023] [Accepted: 07/13/2023] [Indexed: 07/22/2023]
Abstract
BACKGROUND The use of monoclonal antibodies (moAbs) in the biopharmaceutical field is developing and maturing. In line with this, we produced a modeled anti-EpCAM scFv for breast cancer tumors. METHODS After cloning and expression of the recombinant antibody in Escherichia coli, the correctness of the desired antibody was checked by western blotting. Flow cytometry was used to determine the capacity of the recombinant antibody to bind to the desired receptors in the malignant breast cancer (BC) cell line. The recombinant antibody (anti-EpCAM scFv) was examined for preclinical efficacy in reducing tumor growth, angiogenesis, and invasiveness (in vitro and in vivo). FINDINGS A target antibody-mediated attenuation of migration and invasion in the examined cancer cell lines was substantiated (P-value < 0.05). Grafted breast cancer tumors in mice showed significant suppression of tumor growth and a decrease in blood supply in response to the recombinant anti-EpCAM intervention. Evaluations of immunohistochemical and histopathological findings revealed an enhanced response rate to the treatment. CONCLUSION The desired anti-EpCAM scFv can be a therapeutic tool to reduce invasion and proliferation in malignant breast cancer.
Affiliation(s)
- Roya Mirzaei
  - Recombinant Proteins Department, Breast Cancer Research Center, Motamed Cancer Institute, ACECR, Tehran, Iran
- Soodabeh Shafiee
  - Recombinant Proteins Department, Breast Cancer Research Center, Motamed Cancer Institute, ACECR, Tehran, Iran
- Rana Vafaei
  - Recombinant Proteins Department, Breast Cancer Research Center, Motamed Cancer Institute, ACECR, Tehran, Iran
  - Department of Surgery and Radiology, Faculty of Veterinary Medicine, University of Tehran, Tehran, Iran
- Malihe Salehi
  - Recombinant Proteins Department, Breast Cancer Research Center, Motamed Cancer Institute, ACECR, Tehran, Iran
- Neda Jalili
  - Recombinant Proteins Department, Breast Cancer Research Center, Motamed Cancer Institute, ACECR, Tehran, Iran
- Zahra Nazerian
  - Recombinant Proteins Department, Breast Cancer Research Center, Motamed Cancer Institute, ACECR, Tehran, Iran
- Ahad Muhammadnajad
  - Cancer Biology Research Center, Cancer Institute of Iran, Tehran University of Medical Sciences, Tehran, Iran
- Fatemeh Yadegari
  - Recombinant Proteins Department, Breast Cancer Research Center, Motamed Cancer Institute, ACECR, Tehran, Iran
- Mohamad Reza Esmailinejad
  - Department of Surgery and Radiology, Faculty of Veterinary Medicine, University of Tehran, Tehran, Iran
- Leila Farahmand
  - Recombinant Proteins Department, Breast Cancer Research Center, Motamed Cancer Institute, ACECR, Tehran, Iran
3
Yong Joon Kim J, Sang Z, Xiang Y, Shen Z, Shi Y. Nanobodies: Robust miniprotein binders in biomedicine. Adv Drug Deliv Rev 2023; 195:114726. [PMID: 36754285 DOI: 10.1016/j.addr.2023.114726] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 05/19/2022] [Revised: 12/30/2022] [Accepted: 02/02/2023] [Indexed: 02/10/2023]
Abstract
Variable domains of heavy chain-only antibodies (VHH), also known as nanobodies (Nbs), are monomeric antigen-binding domains derived from the camelid heavy chain-only antibodies. Nbs are characterized by small size, high target selectivity, and marked solubility and stability, which collectively facilitate high-quality drug development. In addition, Nbs are readily expressed from various expression systems, including E. coli and yeast cells. For these reasons, Nbs have emerged as preferred antibody fragments for protein engineering, disease diagnosis, and treatment. To date, two Nb-based therapies have been approved by the U.S. Food and Drug Administration (FDA). Numerous candidates spanning a wide spectrum of diseases such as cancer, immune disorders, infectious diseases, and neurodegenerative disorders are under preclinical and clinical investigation. Here, we discuss the structural features of Nbs that allow for specific, versatile, and strong target binding. We also summarize emerging technologies for identification, structural analysis, and humanization of Nbs. Our main focus is to review recent advances in using Nbs as a modular scaffold to facilitate the engineering of multivalent polymers for cutting-edge applications. Finally, we discuss remaining challenges for Nb development and envision new opportunities in Nb-based research.
Affiliation(s)
- Jeffrey Yong Joon Kim
  - Center of Protein Engineering and Therapeutics, Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, 1, Gustave L. Levy Pl, New York, NY 10029, USA
  - Medical Scientist Training Program, University of Pittsburgh School of Medicine and Carnegie Mellon University, Pittsburgh, PA, USA
- Zhe Sang
  - Center of Protein Engineering and Therapeutics, Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, 1, Gustave L. Levy Pl, New York, NY 10029, USA
- Yufei Xiang
  - Center of Protein Engineering and Therapeutics, Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, 1, Gustave L. Levy Pl, New York, NY 10029, USA
- Zhuolun Shen
  - State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou, China
- Yi Shi
  - Center of Protein Engineering and Therapeutics, Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, 1, Gustave L. Levy Pl, New York, NY 10029, USA
4
Stevens AO, He Y. Benchmarking the Accuracy of AlphaFold 2 in Loop Structure Prediction. Biomolecules 2022; 12:985. [PMID: 35883541 PMCID: PMC9312937 DOI: 10.3390/biom12070985] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 05/30/2022] [Revised: 07/05/2022] [Accepted: 07/12/2022] [Indexed: 01/22/2023]
Abstract
The inhibition of protein-protein interactions is a growing strategy in drug development. In addition to structured regions, many protein loop regions are involved in protein-protein interactions and thus have been identified as potential drug targets. To effectively target such regions, knowledge of the protein structure is critical. Loop structure prediction is a challenging subgroup in the field of protein structure prediction because of the reduced level of conservation in protein sequences compared to the secondary structure elements. AlphaFold 2 has been suggested to be one of the greatest achievements in the field of protein structure prediction. AlphaFold 2 predicted protein structures at near X-ray resolution in the Critical Assessment of protein Structure Prediction (CASP 14) competition in 2020. The purpose of this work is to survey the performance of AlphaFold 2 in specifically predicting protein loop regions. We have constructed an independent dataset of 31,650 loop regions from 2613 proteins (deposited after AlphaFold 2 was trained) with both experimentally determined structures and AlphaFold 2 predicted structures. With extensive evaluation using our dataset, the results indicate that AlphaFold 2 is a good predictor of the structure of loop regions, especially for short loop regions. Loops less than 10 residues in length have an average Root Mean Square Deviation (RMSD) of 0.33 Å and an average Template Modeling score (TM-score) of 0.82. However, we see that as the number of residues in a given loop increases, the accuracy of AlphaFold 2's prediction decreases. Loops more than 20 residues in length have an average RMSD of 2.04 Å and an average TM-score of 0.55. Such a correlation between accuracy and length of the loop is directly linked to the increase in flexibility. Moreover, AlphaFold 2 does slightly over-predict α-helices and β-strands in proteins.
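For readers unfamiliar with the RMSD figures quoted in this abstract, the metric can be sketched as below. This is only an illustration of the standard formula, not the authors' evaluation pipeline: the function name `loop_rmsd` and the coordinates are hypothetical, and the sketch assumes the predicted and experimental loops have already been superimposed and contain the same atoms in the same order.

```python
import math


def loop_rmsd(predicted, reference):
    """Root-mean-square deviation (in angstroms) between two coordinate lists.

    Each argument is a list of (x, y, z) tuples, one per atom. The structures
    are assumed to be superimposed already; no fitting is performed here.
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (px - rx) ** 2 + (py - ry) ** 2 + (pz - rz) ** 2
        for (px, py, pz), (rx, ry, rz) in zip(predicted, reference)
    )
    return math.sqrt(squared / len(predicted))


# Hypothetical C-alpha coordinates for a three-residue loop (illustrative only).
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
# Every atom of the "prediction" is displaced by exactly 1 angstrom along z.
predicted = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]

print(loop_rmsd(predicted, reference))  # prints 1.0
```

Because every atom in the toy prediction is off by exactly 1 Å, the RMSD is 1 Å; in practice a superposition step would normally precede this calculation.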
Affiliation(s)
- Amy O. Stevens
  - Department of Chemistry and Chemical Biology, University of New Mexico, Albuquerque, NM 87131, USA
- Yi He
  - Department of Chemistry and Chemical Biology, University of New Mexico, Albuquerque, NM 87131, USA
  - Translational Informatics Division, Department of Internal Medicine, University of New Mexico, Albuquerque, NM 87131, USA
5
Gilodi M, Lisi S, F. Dudás E, Fantini M, Puglisi R, Louka A, Marcatili P, Cattaneo A, Pastore A. Selection and Modelling of a New Single-Domain Intrabody Against TDP-43. Front Mol Biosci 2022; 8:773234. [PMID: 35237655 PMCID: PMC8884700 DOI: 10.3389/fmolb.2021.773234] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/09/2021] [Accepted: 11/29/2021] [Indexed: 12/13/2022]
Abstract
Amyotrophic lateral sclerosis (ALS) is a neurodegenerative disorder associated with deteriorating motor and cognitive functions, and short survival. The disease is caused by neuronal death which results in progressive muscle wasting and weakness, ultimately leading to lethal respiratory failure. The misbehaviour of a specific protein, TDP-43, which aggregates and becomes toxic in ALS patients’ neurons, is thought to be one of the causes. TDP-43 is a DNA/RNA-binding protein involved in several functions related to nucleic acid metabolism. Sequestration of TDP-43 aggregates is a possible therapeutic strategy that could alleviate or block pathology. Here, we describe the selection and characterization of a new intracellular antibody (intrabody) against TDP-43 from a llama nanobody library. The structure of the selected intrabody was predicted in silico and the model was used to suggest mutations that improved its expression yield, facilitating its experimental validation. We showed how coupling experimental methodologies with in silico design may allow us to obtain an antibody able to recognize the RNA binding regions of TDP-43. Our findings illustrate a strategy for the mitigation of TDP-43 proteinopathy in ALS and provide a potential new tool for diagnostics.
Affiliation(s)
- Martina Gilodi
  - Department of Molecular Medicine, University of Pavia, Pavia, Italy
  - Dementia Research Institute at King’s College London, The Wohl Institute, London, United Kingdom
- Simonetta Lisi
  - Bio@SNS Laboratory, Scuola Normale Superiore, Piazza dei Cavalieri, Pisa, Italy
- Erika F. Dudás
  - Dementia Research Institute at King’s College London, The Wohl Institute, London, United Kingdom
- Marco Fantini
  - Bio@SNS Laboratory, Scuola Normale Superiore, Piazza dei Cavalieri, Pisa, Italy
- Rita Puglisi
  - Dementia Research Institute at King’s College London, The Wohl Institute, London, United Kingdom
- Alexandra Louka
  - Dementia Research Institute at King’s College London, The Wohl Institute, London, United Kingdom
- Paolo Marcatili
  - Department of Bioinformatics, Technical University of Denmark, Kongens Lyngby, Denmark
- Antonino Cattaneo
  - Bio@SNS Laboratory, Scuola Normale Superiore, Piazza dei Cavalieri, Pisa, Italy
  - *Correspondence: Annalisa Pastore; Antonino Cattaneo
- Annalisa Pastore
  - Dementia Research Institute at King’s College London, The Wohl Institute, London, United Kingdom
  - *Correspondence: Annalisa Pastore; Antonino Cattaneo
6
Rudnev VR, Kulikova LI, Nikolsky KS, Malsagova KA, Kopylov AT, Kaysheva AL. Current Approaches in Supersecondary Structures Investigation. Int J Mol Sci 2021; 22:11879. [PMID: 34769310 PMCID: PMC8584461 DOI: 10.3390/ijms222111879] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/21/2021] [Revised: 10/27/2021] [Accepted: 10/29/2021] [Indexed: 11/16/2022]
Abstract
Proteins expressed during the cell cycle determine cell function, topology, and responses to environmental influences. The development and improvement of experimental methods in the field of structural biology provide valuable information about the structure and functions of individual proteins. This work is devoted to the study of supersecondary structures of proteins and determination of their structural motifs, description of experimental methods for their detection, databases, and repositories for storage, as well as methods of molecular dynamics research. The interest in the study of supersecondary structures in proteins is due to their autonomous stability outside the protein globule, which makes it possible to study folding processes, conformational changes in protein isoforms, and aberrant proteins with high productivity.
Affiliation(s)
- Vladimir R. Rudnev
  - Biobanking Group, Branch of Institute of Biomedical Chemistry “Scientific and Education Center”, 109028 Moscow, Russia
  - Institute of Theoretical and Experimental Biophysics, Russian Academy of Sciences, 142290 Pushchino, Russia
- Liudmila I. Kulikova
  - Biobanking Group, Branch of Institute of Biomedical Chemistry “Scientific and Education Center”, 109028 Moscow, Russia
  - Institute of Theoretical and Experimental Biophysics, Russian Academy of Sciences, 142290 Pushchino, Russia
  - Institute of Mathematical Problems of Biology RAS—The Branch of Keldysh Institute of Applied Mathematics of Russian Academy of Sciences, 142290 Pushchino, Russia
- Kirill S. Nikolsky
  - Biobanking Group, Branch of Institute of Biomedical Chemistry “Scientific and Education Center”, 109028 Moscow, Russia
- Kristina A. Malsagova
  - Biobanking Group, Branch of Institute of Biomedical Chemistry “Scientific and Education Center”, 109028 Moscow, Russia
- Arthur T. Kopylov
  - Biobanking Group, Branch of Institute of Biomedical Chemistry “Scientific and Education Center”, 109028 Moscow, Russia
- Anna L. Kaysheva
  - Biobanking Group, Branch of Institute of Biomedical Chemistry “Scientific and Education Center”, 109028 Moscow, Russia
7
Barozet A, Chacón P, Cortés J. Current approaches to flexible loop modeling. Curr Res Struct Biol 2021; 3:187-191. [PMID: 34409304 PMCID: PMC8361254 DOI: 10.1016/j.crstbi.2021.07.002] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 09/30/2020] [Revised: 06/30/2021] [Accepted: 07/25/2021] [Indexed: 01/14/2023]
Abstract
Loops are key components of protein structures, involved in many biological functions. Due to their conformational variability, the structural investigation of loops is a difficult topic, requiring a combination of experimental and computational methods. This paper provides a brief overview of current computational approaches to flexible loop modeling, and presents the main ingredients of the most standard protocols. Despite great progress in recent years, accurately modeling the conformational variability of long flexible loops remains a challenging problem. Future advances in this field will likely come from a tight coupling of experimental and computational techniques, which would enable a better understanding of the relationships between loop sequence, structural flexibility, and functional roles. In fine, accurate loop modeling will open the road to loop design problems of interest for applications in biomedicine and biotechnology.
Affiliation(s)
- Amélie Barozet
  - LAAS-CNRS, Université de Toulouse, CNRS, Toulouse, France
- Pablo Chacón
  - Department of Biological Physical Chemistry, Rocasolano Physical Chemistry Institute C.S.I.C., Madrid, Spain
- Juan Cortés
  - LAAS-CNRS, Université de Toulouse, CNRS, Toulouse, France
8
Feng JJ, Chen JN, Kang W, Wu YD. Accurate Structure Prediction for Protein Loops Based on Molecular Dynamics Simulations with RSFF2C. J Chem Theory Comput 2021; 17:4614-4628. [PMID: 34170125 DOI: 10.1021/acs.jctc.1c00341] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/28/2022]
Abstract
Protein loops, connecting the α-helices and β-strands, are involved in many important biological processes. However, due to their conformational flexibility, it is still challenging to accurately determine three-dimensional (3D) structures of long loops experimentally and computationally. Herein, we present a systematic study of the protein loop structure prediction via a total of ∼850 μs molecular dynamics (MD) simulations. For a set of 15 long (10-16 residues) and solvent-exposed loops, we first evaluated the performance of four state-of-the-art loop modeling algorithms, DaReUS-Loop, Sphinx, Rosetta-NGK, and MODELLER, on each loop, and none of them could accurately predict the structures for most loops. Then, temperature replica exchange molecular dynamics (REMD) simulations were conducted with three recent force fields, RSFF2C with TIP3P water model, CHARMM36m with CHARMM-modified TIP3P, and AMBER ff19SB with OPC. We found that our recently developed residue-specific force field RSFF2C performed the best and successfully predicted 12 out of 15 loops with a root-mean-square deviation (RMSD) < 1.5 Å. As an alternative with lower computational cost, normal MD simulations at high temperatures (380, 500, and 620 K) were investigated. Temperature-dependent performance was observed for each force field, and, for RSFF2C+TIP3P, we found that three independent 100-ns MD simulations at 500 K gave comparable results with REMD simulations. These results suggest that MD simulations, especially with enhanced sampling techniques such as replica exchange, with the RSFF2C force field could be useful for accurate loop structure prediction.
Affiliation(s)
- Jia-Jie Feng
  - Lab of Computational Chemistry and Drug Design, State Key Laboratory of Chemical Oncogenomics, Peking University Shenzhen Graduate School, Shenzhen 518055, China
- Jia-Nan Chen
  - Lab of Computational Chemistry and Drug Design, State Key Laboratory of Chemical Oncogenomics, Peking University Shenzhen Graduate School, Shenzhen 518055, China
- Wei Kang
  - Pingshan Translational Medicine Center, Shenzhen Bay Laboratory, Shenzhen 518132, China
- Yun-Dong Wu
  - Lab of Computational Chemistry and Drug Design, State Key Laboratory of Chemical Oncogenomics, Peking University Shenzhen Graduate School, Shenzhen 518055, China
  - College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
  - Shenzhen Bay Laboratory, Shenzhen 518132, China
9
Del Alamo D, Fischer AW, Moretti R, Alexander NS, Mendenhall J, Hyman NJ, Meiler J. Efficient Sampling of Protein Loop Regions Using Conformational Hashing Complemented with Random Coordinate Descent. J Chem Theory Comput 2021; 17:560-570. [PMID: 33373213 DOI: 10.1021/acs.jctc.0c00836] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/29/2022]
Abstract
De novo construction of loop regions is an important problem in computational structural biology. Compared to regions with well-defined secondary structure, loops tend to exhibit significant conformational heterogeneity. As a result, their structures are often ambiguous when determined using experimental data obtained by crystallography, cryo-EM, or NMR. Although structurally diverse models could provide a more relevant representation of proteins in their native states, obtaining large numbers of biophysically realistic and physiologically relevant loop conformations is a resource-consuming task. To address this need, we developed a novel loop construction algorithm, Hash/RCD, that combines knowledge-based conformational hashing with random coordinate descent (RCD). This hybrid approach achieved a closure rate of 100% on a benchmark set of 195 loops in 29 proteins that range from 3 to 31 residues. More importantly, the use of templates allows Hash/RCD to maintain the accuracy of state-of-the-art coordinate descent methods while reducing sampling time from over 400 to 141 ms. These results highlight how the integration of coordinate descent with knowledge-based sampling overcomes barriers inherent to either approach in isolation. This method may facilitate the identification of native-like loop conformations using experimental data or full-atom scoring functions by allowing rapid sampling of large numbers of loops. In this manuscript, we investigate and discuss the advantages, bottlenecks, and limitations of combining conformational hashing with RCD. By providing a detailed technical description of the Hash/RCD algorithm, we hope to facilitate its implementation by other researchers.
Affiliation(s)
- Diego Del Alamo
  - Department of Chemistry and Center for Structural Biology, Vanderbilt University, Nashville, 37235 Tennessee, United States
- Axel W Fischer
  - Department of Chemistry and Center for Structural Biology, Vanderbilt University, Nashville, 37235 Tennessee, United States
- Rocco Moretti
  - Department of Chemistry and Center for Structural Biology, Vanderbilt University, Nashville, 37235 Tennessee, United States
- Nathan S Alexander
  - Department of Chemistry and Center for Structural Biology, Vanderbilt University, Nashville, 37235 Tennessee, United States
- Jeffrey Mendenhall
  - Department of Chemistry and Center for Structural Biology, Vanderbilt University, Nashville, 37235 Tennessee, United States
- Nicholas J Hyman
  - Department of Chemistry and Center for Structural Biology, Vanderbilt University, Nashville, 37235 Tennessee, United States
- Jens Meiler
  - Department of Chemistry and Center for Structural Biology, Vanderbilt University, Nashville, 37235 Tennessee, United States
  - Institut for Drug Discovery, Leipzig University, Leipzig SAC 04103, Germany
10
Studer G, Tauriello G, Bienert S, Biasini M, Johner N, Schwede T. ProMod3-A versatile homology modelling toolbox. PLoS Comput Biol 2021; 17:e1008667. [PMID: 33507980 PMCID: PMC7872268 DOI: 10.1371/journal.pcbi.1008667] [Citation(s) in RCA: 130] [Impact Index Per Article: 43.3] [Received: 07/16/2020] [Revised: 02/09/2021] [Accepted: 01/03/2021] [Indexed: 11/18/2022]
Abstract
Computational methods for protein structure modelling are routinely used to complement experimental structure determination, thus they help to address a broad spectrum of scientific questions in biomedical research. The most accurate methods today are based on homology modelling, i.e. detecting a homologue to the desired target sequence that can be used as a template for modelling. Here we present a versatile open source homology modelling toolbox as foundation for flexible and computationally efficient modelling workflows. ProMod3 is a fully scriptable software platform that can perform all steps required to generate a protein model by homology. Its modular design aims at fast prototyping of novel algorithms and implementing flexible modelling pipelines. Common modelling tasks, such as loop modelling, sidechain modelling or generating a full protein model by homology, are provided as production ready pipelines, forming the starting point for own developments and enhancements. ProMod3 is the central software component of the widely used SWISS-MODEL web-server.
Affiliation(s)
- Gabriel Studer
  - Biozentrum, University of Basel, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Gerardo Tauriello
  - Biozentrum, University of Basel, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Stefan Bienert
  - Biozentrum, University of Basel, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Marco Biasini
  - Biozentrum, University of Basel, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Niklaus Johner
  - Biozentrum, University of Basel, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Torsten Schwede
  - Biozentrum, University of Basel, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Basel, Switzerland
11
Wong WK, Georges G, Ros F, Kelm S, Lewis AP, Taddese B, Leem J, Deane CM. SCALOP: sequence-based antibody canonical loop structure annotation. Bioinformatics 2020; 35:1774-1776. [PMID: 30321295 PMCID: PMC6513161 DOI: 10.1093/bioinformatics/bty877] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 05/08/2018] [Revised: 09/17/2018] [Accepted: 10/13/2018] [Indexed: 11/14/2022]
Abstract
MOTIVATION Canonical forms of the antibody complementarity-determining regions (CDRs) were first described in 1987 and have been redefined on multiple occasions since. The canonical forms are often used to approximate the antibody binding site shape as they can be predicted from sequence. A rapid predictor would facilitate the annotation of CDR structures in the large amounts of repertoire data now becoming available from next generation sequencing experiments. RESULTS SCALOP annotates CDR canonical forms for antibody sequences, supported by an auto-updating database to capture the latest cluster information. Its accuracy is comparable to that of a standard structural predictor but it is 800 times faster. The auto-updating nature of SCALOP ensures that it always attains the best possible coverage. AVAILABILITY AND IMPLEMENTATION SCALOP is available as a web application and for download under a GPLv3 license at opig.stats.ox.ac.uk/webapps/scalop. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Wing Ki Wong
  - Department of Statistics, University of Oxford, Oxford, UK
- Guy Georges
  - Roche Pharma Research and Early Development, Large Molecule Research Roche Innovation Center Munich, Penzberg, Germany
- Francesca Ros
  - Roche Pharma Research and Early Development, Large Molecule Research Roche Innovation Center Munich, Penzberg, Germany
- Alan P Lewis
  - Computational and Modelling Sciences, GlaxoSmithKline Research and Development, Stevenage, UK
- Bruck Taddese
  - Antibody Discovery and Protein Engineering, MedImmune, Granta Park, Cambridge, UK
- Jinwoo Leem
  - Department of Statistics, University of Oxford, Oxford, UK
12
|
Karami Y, Rey J, Postic G, Murail S, Tufféry P, de Vries SJ. DaReUS-Loop: a web server to model multiple loops in homology models. Nucleic Acids Res 2020; 47:W423-W428. [PMID: 31114872 PMCID: PMC6602439 DOI: 10.1093/nar/gkz403] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 03/06/2019] [Revised: 04/20/2019] [Accepted: 05/06/2019] [Indexed: 02/07/2023] Open
Abstract
Loop regions in protein structures often have crucial roles, and they are much more variable in sequence and structure than other regions. In homology modeling, this leads to larger deviations from the homologous templates, and loop modeling of homology models remains an open problem. To address this issue, we have previously developed the DaReUS-Loop protocol, leading to significant improvement over existing methods. Here, a DaReUS-Loop web server is presented, providing an automated platform for modeling or remodeling loops in the context of homology models. This is the first web server accepting a protein with up to 20 loop regions, and modeling them all in parallel. It also provides a prediction confidence level that corresponds to the expected accuracy of the loops. DaReUS-Loop facilitates the analysis of the results through its interactive graphical interface and is freely available at http://bioserv.rpbs.univ-paris-diderot.fr/services/DaReUS-Loop/.
Affiliation(s)
- Yasaman Karami
  - Sorbonne Paris Cité, Université Paris Diderot, CNRS UMR 8251, INSERM ERL U1133, Paris, France; Ressource Parisienne en Bioinformatique Structurale (RPBS), Paris, France
- Julien Rey
  - Sorbonne Paris Cité, Université Paris Diderot, CNRS UMR 8251, INSERM ERL U1133, Paris, France; Ressource Parisienne en Bioinformatique Structurale (RPBS), Paris, France
- Guillaume Postic
  - Sorbonne Paris Cité, Université Paris Diderot, CNRS UMR 8251, INSERM ERL U1133, Paris, France; Ressource Parisienne en Bioinformatique Structurale (RPBS), Paris, France; Institut Français de Bioinformatique (IFB), UMS 3601-CNRS, Université Paris-Saclay, Orsay, France
- Samuel Murail
  - Sorbonne Paris Cité, Université Paris Diderot, CNRS UMR 8251, INSERM ERL U1133, Paris, France
- Pierre Tufféry
  - Sorbonne Paris Cité, Université Paris Diderot, CNRS UMR 8251, INSERM ERL U1133, Paris, France; Ressource Parisienne en Bioinformatique Structurale (RPBS), Paris, France
- Sjoerd J de Vries
  - Sorbonne Paris Cité, Université Paris Diderot, CNRS UMR 8251, INSERM ERL U1133, Paris, France; Ressource Parisienne en Bioinformatique Structurale (RPBS), Paris, France
|
13
|
Kovaltsuk A, Raybould MIJ, Wong WK, Marks C, Kelm S, Snowden J, Trück J, Deane CM. Structural diversity of B-cell receptor repertoires along the B-cell differentiation axis in humans and mice. PLoS Comput Biol 2020; 16:e1007636. [PMID: 32069281 PMCID: PMC7048297 DOI: 10.1371/journal.pcbi.1007636] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 10/06/2019] [Revised: 02/28/2020] [Accepted: 01/07/2020] [Indexed: 01/18/2023] Open
Abstract
Most current analysis tools for antibody next-generation sequencing data work with primary sequence descriptors, leaving accompanying structural information unharnessed. We have used novel rapid methods to structurally characterize the complementarity-determining regions (CDRs) of more than 180 million human and mouse B-cell receptor (BCR) repertoire sequences. These structurally annotated CDRs provide unprecedented insights into both the structural predetermination and dynamics of the adaptive immune response. We show that B-cell types can be distinguished based solely on these structural properties. Antigen-unexperienced BCR repertoires use the highest number and diversity of CDR structures and these patterns of naïve repertoire paratope usage are highly conserved across subjects. In contrast, more differentiated B-cells are more personalized in terms of CDR structure usage. Our results establish the CDR structure differences in BCR repertoires and have applications for many fields including immunodiagnostics, phage display library generation, and “humanness” assessment of BCR repertoires from transgenic animals. The software tool for structural annotation of BCR repertoires, SAAB+, is available at https://github.com/oxpig/saab_plus.
B-cell receptors (BCR) are the major components of the adaptive immune system. These are immunoglobulin molecules that bind to foreign substances known as antigens. Each individual has a huge BCR repertoire, where each individual BCR has a specific binding site composed of the complementarity-determining regions (CDRs) capable of recognising a specific antigen. Drug discovery and immunodiagnostics inspired by the adaptive immune system rely on our ability to accurately interrogate the structural diversity of the binding sites of the BCR repertoire. Here we report our novel rapid pipeline, SAAB+, which has enabled us to interrogate how the structure of the CDR changes in BCR repertoires along the B-cell differentiation axis. By analysing human and mouse BCR repertoires at an unprecedented scale, we observed species-specific structural predetermination and detected CDR dynamics across multiple stages of B-cell differentiation. We showed that naïve repertoires share the highest number and diversity of CDR structures, a pattern which was highly conserved in all B-cell donors. Our results suggest that increased B-cell differentiation is associated with a personalization of CDR structure usages. Finally, we established the differences in CDR usages between humans and mice, analysis with immediate relevance for BCR repertoire “humanness” assessment and rational immunotherapeutic engineering.
Affiliation(s)
- Wing Ki Wong
  - Department of Statistics, University of Oxford, Oxford, United Kingdom
- Claire Marks
  - Department of Statistics, University of Oxford, Oxford, United Kingdom
- Johannes Trück
  - Division of Immunology, University Children's Hospital, University of Zurich, Zurich, Switzerland
- Charlotte M. Deane
  - Department of Statistics, University of Oxford, Oxford, United Kingdom
|
14
|
Dhar J, Kishore R, Chakrabarti P. Delineation of a new structural motif involving NHN γ-turn. Proteins 2019; 88:431-439. [PMID: 31587358 DOI: 10.1002/prot.25820] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 04/16/2019] [Revised: 09/17/2019] [Accepted: 09/18/2019] [Indexed: 10/25/2022]
Abstract
Macromolecules are characterized by distinctive arrangement of hydrogen bonds. Different patterns of hydrogen bonds give rise to distinct and stable structural motifs. An analysis of 4114 non-redundant protein chains reveals the existence of a three-residue, (i - 1) to (i + 1), structural motif, having two hydrogen-bonded five-membered pseudo rings (the first, an NH···OC involving the first residue, and the second being NH···N involving the last two residues), separated by a peptide bond. There could be an additional hydrogen bond between the side-chain at (i - 1) and the main-chain NH of (i + 1). The average backbone torsion angles of -76(±21)° and -12(±17)° at i create a tight turn in the polypeptide chain, akin to a γ-turn. Indeed, a search of three-residue fragments with restriction on the terminal Cα···Cα distance and the existence of the two pseudo rings on either side revealed the presence of 14 846 cases of a variant, termed NHN γ-turn, distinct from the NHO γ-turn (2032 cases) that has traditionally been characterized by the presence of an NHO hydrogen bond linking the terminal main-chain atoms. As in the latter, the newly identified γ-turns are also of two types, classical and inverse, occurring in the ratio of 1:6. The propensities of residues to occur in these turns and their secondary structural features have been enumerated. An understanding of these turns would be useful for structure prediction and loop modeling, and may serve as models to represent some of the unfolded state or disordered region in proteins.
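The torsion angles quoted in this abstract can be made concrete with a short sketch. The function below computes a signed torsion angle from four atom positions using the standard atan2 formulation; it is a generic illustration, not the authors' code. The helper `near_reported_mean` is hypothetical: it assumes the two reported values are the backbone φ and ψ at residue i (the abstract does not name them) and simply checks whether a pair of angles lies within a chosen number of standard deviations of the reported means.

```python
import math

def dihedral_deg(p0, p1, p2, p3):
    """Signed torsion angle in degrees for four points (IUPAC sign convention)."""
    dot = lambda u, v: sum(x * y for x, y in zip(u, v))
    b0 = [a - b for a, b in zip(p0, p1)]   # from p1 back to p0
    b1 = [a - b for a, b in zip(p2, p1)]   # central bond p1 -> p2
    b2 = [a - b for a, b in zip(p3, p2)]   # from p2 on to p3
    norm = math.sqrt(dot(b1, b1))
    b1 = [c / norm for c in b1]
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    v = [a - dot(b0, b1) * b for a, b in zip(b0, b1)]
    w = [a - dot(b2, b1) * b for a, b in zip(b2, b1)]
    cross = (b1[1] * v[2] - b1[2] * v[1],
             b1[2] * v[0] - b1[0] * v[2],
             b1[0] * v[1] - b1[1] * v[0])
    return math.degrees(math.atan2(dot(cross, w), dot(v, w)))

def near_reported_mean(phi, psi, n_sd=1.0):
    """Hypothetical check against the reported means -76(±21)° and -12(±17)°.

    Assumes the two values are phi and psi at residue i; no angle wrap-around
    is handled, which is adequate only for values near the means.
    """
    return abs(phi + 76.0) <= n_sd * 21.0 and abs(psi + 12.0) <= n_sd * 17.0
```

For φ, the four points would be C(i - 1), N(i), Cα(i), C(i); for ψ, N(i), Cα(i), C(i), N(i + 1).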
Affiliation(s)
- Jesmita Dhar
  - Bioinformatics Centre, Bose Institute, Kolkata, India
- Raghuvansh Kishore
  - Department of Zoology and Department of Biotechnology, Mizoram University, Aizawl, India
- Pinak Chakrabarti
  - Bioinformatics Centre, Bose Institute, Kolkata, India; Department of Biochemistry, Bose Institute, Kolkata, India
|
15
|
Marks C, Deane CM. Increasing the accuracy of protein loop structure prediction with evolutionary constraints. Bioinformatics 2019; 35:2585-2592. [PMID: 30535347 DOI: 10.1093/bioinformatics/bty996] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 06/29/2018] [Revised: 09/28/2018] [Accepted: 12/07/2018] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION Accurate prediction of loop structures remains challenging. This is especially true for long loops where the large conformational space and limited coverage of experimentally determined structures often lead to low accuracy. Co-evolutionary contact predictors, which provide information about the proximity of pairs of residues, have been used to improve whole-protein models generated through de novo techniques. Here we investigate whether these evolutionary constraints can enhance the prediction of long loop structures. RESULTS As a first stage, we assess the accuracy of predicted contacts that involve loop regions. We find that these are less accurate than contacts in general. We also observe that some incorrectly predicted contacts can be identified as they are never satisfied in any of our generated loop conformations. We examined two different strategies for incorporating contacts, and on a test set of long loops (10 residues or more), both approaches improve the accuracy of prediction. For a set of 135 loops, contacts were predicted and hence our methods were applicable in 97 cases. Both strategies result in an increase in the proportion of near-native decoys in the ensemble, leading to more accurate predictions and in some cases improving the root-mean-square deviation of the final model by more than 3 Å. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
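The root-mean-square deviation (RMSD) used in this abstract as the accuracy measure can be illustrated with a minimal sketch. It assumes the model and native loop atoms are already in the same coordinate frame (for example after superimposing the rest of the protein) and are listed in the same order; the function name `loop_rmsd` is hypothetical, and the abstract does not state which atoms or superposition the authors used.

```python
import math

def loop_rmsd(model, native):
    """RMSD in Å between two equally ordered lists of (x, y, z) coordinates.

    No superposition is performed; both structures are assumed to share
    one coordinate frame already.
    """
    if len(model) != len(native) or not model:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum((a - b) ** 2
                  for p, q in zip(model, native)
                  for a, b in zip(p, q))
    return math.sqrt(squared / len(model))

# Toy example: every model atom is displaced by 3 Å along z.
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
model = [(0.0, 0.0, 3.0), (3.8, 0.0, 3.0)]
print(loop_rmsd(model, native))
```

An improvement "by more than 3 Å" then means the difference between two such values, computed for the model before and after adding the contact information, exceeds 3.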
Affiliation(s)
- Claire Marks
  - Department of Statistics, University of Oxford, Oxford, UK
|
16
|
Leem J, Deane CM. High-Throughput Antibody Structure Modeling and Design Using ABodyBuilder. METHODS IN MOLECULAR BIOLOGY (CLIFTON, N.J.) 2019; 1851:367-380. [PMID: 30298409 DOI: 10.1007/978-1-4939-8736-8_21] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/26/2022]
Abstract
Antibodies are proteins of the adaptive immune system; they can be designed to bind almost any molecule, and are increasingly being used as biotherapeutics. Experimental antibody design is an expensive and time-consuming process, and computational antibody design methods can now be used to help develop new therapeutics and diagnostics. Within the design pipeline, accurate antibody structure modeling is essential, as it provides the basis for antibody-antigen docking, binding affinity prediction, and estimating thermal stability. Ideally, models should be rapidly generated, allowing the exploration of the breadth of antibody space. This allows methods to replicate the natural processes of antibody diversification (e.g., V(D)J recombination and somatic hypermutation), and cope with large volumes of data that are typical of next-generation sequencing datasets. Here we describe ABodyBuilder and PEARS, algorithms that build and mutate antibody model structures. These methods take ~30 s to generate a model antibody structure.
Affiliation(s)
- Jinwoo Leem
  - Department of Statistics, University of Oxford, Oxford, UK
|
17
|
Kundert K, Kortemme T. Computational design of structured loops for new protein functions. Biol Chem 2019; 400:275-288. [PMID: 30676995 PMCID: PMC6530579 DOI: 10.1515/hsz-2018-0348] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 08/16/2018] [Accepted: 12/18/2018] [Indexed: 12/20/2022]
Abstract
The ability to engineer the precise geometries, fine-tuned energetics and subtle dynamics that are characteristic of functional proteins is a major unsolved challenge in the field of computational protein design. In natural proteins, functional sites exhibiting these properties often feature structured loops. However, unlike the elements of secondary structures that comprise idealized protein folds, structured loops have been difficult to design computationally. Addressing this shortcoming in a general way is a necessary first step towards the routine design of protein function. In this perspective, we will describe the progress that has been made on this problem and discuss how recent advances in the field of loop structure prediction can be harnessed and applied to the inverse problem of computational loop design.
Affiliation(s)
- Kale Kundert
  - Graduate Group in Biophysics, University of California San Francisco, San Francisco, CA 94158, USA
  - Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA 94158, USA
- Tanja Kortemme
  - Graduate Group in Biophysics, University of California San Francisco, San Francisco, CA 94158, USA
  - Department of Bioengineering and Therapeutic Sciences, University of California San Francisco, San Francisco, CA 94158, USA
  - Chan Zuckerberg Biohub, 499 Illinois St, San Francisco, CA 94158, USA
|
18
|
Marks C, Shi J, Deane CM. Predicting loop conformational ensembles. Bioinformatics 2018; 34:949-956. [PMID: 29136084 DOI: 10.1093/bioinformatics/btx718] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 07/13/2017] [Accepted: 11/09/2017] [Indexed: 12/23/2022] Open
Abstract
Motivation: Protein function is often facilitated by the existence of multiple stable conformations. Structure prediction algorithms need to be able to model these different conformations accurately and produce an ensemble of structures that represent a target's conformational diversity rather than just a single state. Here, we investigate whether current loop prediction algorithms are capable of this. We use the algorithms to predict the structures of loops with multiple experimentally determined conformations, and the structures of loops with only one conformation, and assess their ability to generate and select decoys that are close to any, or all, of the observed structures. Results: We find that while loops with only one known conformation are predicted well, conformationally diverse loops are modelled poorly, and in most cases the predictions returned by the methods do not resemble any of the known conformers. Our results contradict the often-held assumption that multiple native conformations will be present in the decoy set, making the production of accurate conformational ensembles impossible, and hence indicating that current methodologies are not well suited to prediction of conformationally diverse, often functionally important protein regions. Contact: marks@stats.ox.ac.uk. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Claire Marks
  - Department of Statistics, University of Oxford, Oxford OX1 3LB, UK
- Jiye Shi
  - Department of Chemistry, UCB Pharma, Slough SL1 3WE, UK
|
19
|
Dhar J, Chakrabarti P. Structural motif, topi and its role in protein function and fibrillation. Mol Omics 2018; 14:247-256. [PMID: 29896602 DOI: 10.1039/c8mo00048d] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 12/31/2022]
Abstract
A protein chain is arranged into regions in which the backbone is organized into regular patterns (of conformation and hydrogen bonding) to form the most common secondary structures, α-helix and β-sheet, which are interspersed by turns and more irregular loop regions. A structural motif, topi, is discussed in which a pair of 2-residue segments, each containing hydrogen-bonded five-membered fused-ring motifs, distant in sequence are linked to each other by a hydrogen bond. Though a small motif, it appears to be important in the context of local folding patterns of proteins and occurs near protein active sites. The motif shows quite significant residue preference, and a Cys (or Ser) occupying the second position may further stabilize the motif by forming an additional hydrogen bond across it. Remarkably, topi is found within disease causing misfolded proteins, such as the fibrilled form of Aβ42, and also across the interface between two protein chains. This motif may be an important component of fibrillation and useful for modeling loop regions.
Affiliation(s)
- Jesmita Dhar
  - Bioinformatics Centre, Bose Institute, P1/12 CIT Scheme VIIM, Kolkata 700054, India
|
20
|
Karami Y, Guyon F, De Vries S, Tufféry P. DaReUS-Loop: accurate loop modeling using fragments from remote or unrelated proteins. Sci Rep 2018; 8:13673. [PMID: 30209260 PMCID: PMC6135855 DOI: 10.1038/s41598-018-32079-w] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 06/20/2018] [Accepted: 08/31/2018] [Indexed: 11/08/2022] Open
Abstract
Despite efforts during the past decades, loop modeling remains a difficult part of protein structure modeling. Several approaches have been developed in the framework of crystal structures. However, for homology models, the modeling of loops is still far from being solved. We propose DaReUS-Loop, a data-based approach that identifies loop candidates mining the complete set of experimental structures available in the Protein Data Bank. Candidate filtering relies on local conformation profile-profile comparison, together with physico-chemical scoring. Applied to three different template-based test sets, DaReUS-Loop shows significant increase in the number of high-accuracy loops, and significant enhancement for modeling long loops. A special advantage is that our method proposes a prediction confidence score that correlates well with the expected accuracy of the loops. Strikingly, over 50% of successful loop models are derived from unrelated proteins, indicating that fragments under similar constraints tend to adopt similar structure, beyond mere homology.
Affiliation(s)
- Yasaman Karami
  - Molécules Thérapeutiques in silico, UMR-S973, Institut National de la Santé et de la Recherche Médicale (INSERM), Université Paris Diderot, Sorbonne Paris Cité, RPBS, 75013, Paris, France
- Frédéric Guyon
  - Molécules Thérapeutiques in silico, UMR-S973, Institut National de la Santé et de la Recherche Médicale (INSERM), Université Paris Diderot, Sorbonne Paris Cité, RPBS, 75013, Paris, France
- Sjoerd De Vries
  - Molécules Thérapeutiques in silico, UMR-S973, Institut National de la Santé et de la Recherche Médicale (INSERM), Université Paris Diderot, Sorbonne Paris Cité, RPBS, 75013, Paris, France
- Pierre Tufféry
  - Molécules Thérapeutiques in silico, UMR-S973, Institut National de la Santé et de la Recherche Médicale (INSERM), Université Paris Diderot, Sorbonne Paris Cité, RPBS, 75013, Paris, France
|
21
|
Krawczyk K, Kelm S, Kovaltsuk A, Galson JD, Kelly D, Trück J, Regep C, Leem J, Wong WK, Nowak J, Snowden J, Wright M, Starkie L, Scott-Tucker A, Shi J, Deane CM. Structurally Mapping Antibody Repertoires. Front Immunol 2018; 9:1698. [PMID: 30083160 PMCID: PMC6064724 DOI: 10.3389/fimmu.2018.01698] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Received: 04/20/2018] [Accepted: 07/10/2018] [Indexed: 12/15/2022] Open
Abstract
Every human possesses millions of distinct antibodies. It is now possible to analyze this diversity via next-generation sequencing of immunoglobulin genes (Ig-seq). This technique produces large volume sequence snapshots of B-cell receptors that are indicative of the antibody repertoire. In this paper, we enrich these large-scale sequence datasets with structural information. Enriching a sequence with its structural data allows better approximation of many vital features, such as its binding site and specificity. Here, we describe the structural annotation of antibodies pipeline that maps the outputs of large Ig-seq experiments to known antibody structures. We demonstrate the viability of our protocol on five separate Ig-seq datasets covering ca. 35 million unique amino acid sequences from ca. 600 individuals. Despite the great theoretical diversity of antibodies, we find that the majority of sequences coming from such studies can be reliably mapped to an existing structure.
Affiliation(s)
- Konrad Krawczyk
  - Department of Statistics, Oxford University, Oxford, United Kingdom
- Jacob D Galson
  - Division of Immunology, Children's Research Center, University Children's Hospital, Zurich, Switzerland
- Dominic Kelly
  - Division of Immunology, Children's Research Center, University Children's Hospital, Zurich, Switzerland
- Johannes Trück
  - Division of Immunology, Children's Research Center, University Children's Hospital, Zurich, Switzerland; Oxford Vaccine Group, University of Oxford, NIHR Oxford Biomedical Research Centre, Oxford, United Kingdom
- Cristian Regep
  - Department of Statistics, Oxford University, Oxford, United Kingdom
- Jinwoo Leem
  - Department of Statistics, Oxford University, Oxford, United Kingdom
- Wing K Wong
  - Department of Statistics, Oxford University, Oxford, United Kingdom
- Jaroslaw Nowak
  - Department of Statistics, Oxford University, Oxford, United Kingdom
- Jiye Shi
  - UCB Pharma, Slough, United Kingdom
|
22
|
Won J, Lee GR, Park H, Seok C. GalaxyGPCRloop: Template-Based and Ab Initio Structure Sampling of the Extracellular Loops of G-Protein-Coupled Receptors. J Chem Inf Model 2018; 58:1234-1243. [DOI: 10.1021/acs.jcim.8b00148] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 02/06/2023]
Affiliation(s)
- Jonghun Won
  - Department of Chemistry, Seoul National University, Seoul 08826, Republic of Korea
- Gyu Rie Lee
  - Department of Chemistry, Seoul National University, Seoul 08826, Republic of Korea
- Hahnbeom Park
  - Department of Chemistry, Seoul National University, Seoul 08826, Republic of Korea
- Chaok Seok
  - Department of Chemistry, Seoul National University, Seoul 08826, Republic of Korea
|
23
|
Bansal N, Zheng Z, Song LF, Pei J, Merz KM. The Role of the Active Site Flap in Streptavidin/Biotin Complex Formation. J Am Chem Soc 2018; 140:5434-5446. [PMID: 29607642 DOI: 10.1021/jacs.8b00743] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Indexed: 12/20/2022]
Abstract
Obtaining a detailed description of how active site flap motion affects substrate or ligand binding will advance structure-based drug design (SBDD) efforts on systems including the kinases, HSP90, HIV protease, ureases, etc. Through this understanding, we will be able to design better inhibitors and better proteins that have desired functions. Herein we address this issue by generating the relevant configurational states of a protein flap on the molecular energy landscape using an approach we call MTFlex-b and then following this with a procedure to estimate the free energy associated with the motion of the flap region. To illustrate our overall workflow, we explored the free energy changes in the streptavidin/biotin system upon introducing conformational flexibility in loop3-4 in the biotin unbound (apo) and bound (holo) state. The free energy surfaces were created using the Movable Type free energy method, and for further validation, we compared them to potential of mean force (PMF) generated free energy surfaces using MD simulations employing the FF99SBILDN and FF14SB force fields. We also estimated the free energy thermodynamic cycle using an ensemble of closed-like and open-like end states for the ligand unbound and bound states and estimated the binding free energy to be approximately -16.2 kcal/mol (experimental -18.3 kcal/mol). The good agreement of MTFlex-b, in combination with the MT method, with experiment and MD simulations supports the effectiveness of our strategy in obtaining unique insights into the motions in proteins that can then be used in a range of biological and biomedical applications.
Affiliation(s)
- Nupur Bansal
  - Department of Chemistry and Department of Biochemistry and Molecular Biology, Michigan State University, 578 South Shaw Lane, East Lansing, Michigan 48824, United States
- Zheng Zheng
  - Department of Chemistry and Department of Biochemistry and Molecular Biology, Michigan State University, 578 South Shaw Lane, East Lansing, Michigan 48824, United States
- Lin Frank Song
  - Department of Chemistry and Department of Biochemistry and Molecular Biology, Michigan State University, 578 South Shaw Lane, East Lansing, Michigan 48824, United States
- Jun Pei
  - Department of Chemistry and Department of Biochemistry and Molecular Biology, Michigan State University, 578 South Shaw Lane, East Lansing, Michigan 48824, United States
- Kenneth M Merz
  - Department of Chemistry and Department of Biochemistry and Molecular Biology, Michigan State University, 578 South Shaw Lane, East Lansing, Michigan 48824, United States; Institute for Cyber Enabled Research, Michigan State University, 567 Wilson Road, East Lansing, Michigan 48824, United States
|
24
|
Marks C, Nowak J, Klostermann S, Georges G, Dunbar J, Shi J, Kelm S, Deane CM. Sphinx: merging knowledge-based and ab initio approaches to improve protein loop prediction. Bioinformatics 2018; 33:1346-1353. [PMID: 28453681 PMCID: PMC5408792 DOI: 10.1093/bioinformatics/btw823] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 07/25/2016] [Accepted: 01/09/2017] [Indexed: 01/31/2023] Open
Abstract
Motivation: Loops are often vital for protein function; however, their irregular structures make them difficult to model accurately. Current loop modelling algorithms can mostly be divided into two categories: knowledge-based, where databases of fragments are searched to find suitable conformations, and ab initio, where conformations are generated computationally. Existing knowledge-based methods only use fragments that are the same length as the target, even though loops of slightly different lengths may adopt similar conformations. Here, we present a novel method, Sphinx, which combines ab initio techniques with the potential extra structural information contained within loops of a different length to improve structure prediction. Results: We show that Sphinx is able to generate high-accuracy predictions and decoy sets enriched with near-native loop conformations, performing better than the ab initio algorithm on which it is based. In addition, it is able to provide predictions for every target, unlike some knowledge-based methods. Sphinx can be used successfully for the difficult problem of antibody H3 prediction, outperforming RosettaAntibody, one of the leading H3-specific ab initio methods, both in accuracy and speed. Availability and Implementation: Sphinx is available at http://opig.stats.ox.ac.uk/webapps/sphinx. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Claire Marks
  - Department of Statistics, University of Oxford, Oxford, UK
- Jaroslaw Nowak
  - Department of Statistics, University of Oxford, Oxford, UK
- Guy Georges
  - Pharma Research and Early Development, Large Molecule Research, Roche Innovation Center Munich, Penzberg, DE, Germany
- James Dunbar
  - Pharma Research and Early Development, Large Molecule Research, Roche Innovation Center Munich, Penzberg, DE, Germany
- Jiye Shi
  - Department of Informatics, UCB Pharma, Slough, UK
|
25
|
Kovaltsuk A, Krawczyk K, Galson JD, Kelly DF, Deane CM, Trück J. How B-Cell Receptor Repertoire Sequencing Can Be Enriched with Structural Antibody Data. Front Immunol 2017; 8:1753. [PMID: 29276518 PMCID: PMC5727015 DOI: 10.3389/fimmu.2017.01753] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.6] [Received: 10/11/2017] [Accepted: 11/27/2017] [Indexed: 12/24/2022] Open
Abstract
Next-generation sequencing of immunoglobulin gene repertoires (Ig-seq) allows the investigation of large-scale antibody dynamics at a sequence level. However, structural information, a crucial descriptor of antibody binding capability, is not collected in Ig-seq protocols. Developing systematic relationships between the antibody sequence information gathered from Ig-seq and low-throughput techniques such as X-ray crystallography could radically improve our understanding of antibodies. The mapping of Ig-seq datasets to known antibody structures can indicate structurally, and perhaps functionally, uncharted areas. Furthermore, contrasting naïve and antigenically challenged datasets using structural antibody descriptors should provide insights into antibody maturation. As the number of antibody structures steadily increases and more and more Ig-seq datasets become available, the opportunities that arise from combining the two types of information increase as well. Here, we review how these data types enrich one another and show potential for advancing our knowledge of the immune system and improving antibody engineering.
Affiliation(s)
- Konrad Krawczyk
  - Department of Statistics, University of Oxford, Oxford, United Kingdom
- Jacob D Galson
  - Division of Immunology and the Children's Research Center, University Children's Hospital, University of Zürich, Zürich, Switzerland
- Dominic F Kelly
  - Oxford Vaccine Group, Department of Paediatrics, University of Oxford and the NIHR Oxford Biomedical Research Center, Oxford, United Kingdom
- Charlotte M Deane
  - Department of Statistics, University of Oxford, Oxford, United Kingdom
- Johannes Trück
  - Division of Immunology and the Children's Research Center, University Children's Hospital, University of Zürich, Zürich, Switzerland
|
26
|
Heo S, Lee J, Joo K, Shin HC, Lee J. Protein Loop Structure Prediction Using Conformational Space Annealing. J Chem Inf Model 2017; 57:1068-1078. [DOI: 10.1021/acs.jcim.6b00742] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Indexed: 11/28/2022]
Affiliation(s)
- Seungryong Heo
- School of Systems Biomedical Science, Soongsil University, Seoul 06978, Korea
- Juyong Lee
- Laboratory of Computational Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- Hang-Cheol Shin
- School of Systems Biomedical Science, Soongsil University, Seoul 06978, Korea
|
27
|
Marks C, Deane C. Antibody H3 Structure Prediction. Comput Struct Biotechnol J 2017; 15:222-231. [PMID: 28228926 PMCID: PMC5312500 DOI: 10.1016/j.csbj.2017.01.010] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.9] [Received: 12/16/2016] [Revised: 01/24/2017] [Accepted: 01/27/2017] [Indexed: 01/20/2023] Open
Abstract
Antibodies are proteins of the immune system that are able to bind to a huge variety of different substances, making them attractive candidates for therapeutic applications. Antibody structures have the potential to be useful during drug development, allowing the implementation of rational design procedures. The most challenging part of the antibody structure to experimentally determine or model is the H3 loop, which in addition is often the most important region in an antibody's binding site. This review summarises the approaches used so far in the pursuit of accurate computational H3 structure prediction.
Affiliation(s)
- C. Marks
- Department of Statistics, University of Oxford, 24-29 St Giles', Oxford OX1 3LB, United Kingdom
|
28
|
Abstract
Comparative protein structure modeling predicts the three-dimensional structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates). The prediction process consists of fold assignment, target-template alignment, model building, and model evaluation. This unit describes how to calculate comparative models using the program MODELLER and how to use the ModBase database of such models, and discusses all four steps of comparative modeling, frequently observed errors, and some applications. Modeling lactate dehydrogenase from Trichomonas vaginalis (TvLDH) is described as an example. The download and installation of the MODELLER software is also described. © 2016 by John Wiley & Sons, Inc.
Affiliation(s)
- Benjamin Webb
- University of California at San Francisco, San Francisco, California
- Andrej Sali
- University of California at San Francisco, San Francisco, California
|
29
|
Warfarin resistance associated with genetic polymorphism of VKORC1: linking clinical response to molecular mechanism using computational modeling. Pharmacogenet Genomics 2016; 26:44-50. [PMID: 26513304 DOI: 10.1097/fpc.0000000000000184] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Indexed: 11/26/2022]
Abstract
The variable response to warfarin treatment often has a genetic basis. A protein homology model of human vitamin K epoxide reductase, subunit 1 (VKORC1), was generated to elucidate the mechanism of warfarin resistance observed in a patient with the Val66Met mutation. The VKORC1 homology model comprises four transmembrane (TM) helical domains and a half helical lid domain. Cys132 and Cys135, located in the N-terminal end of TM-4, are linked through a disulfide bond. Two distinct binding sites for warfarin were identified. Site-1, which binds vitamin K epoxide (KO) in a catalytically favorable orientation, shows higher affinity for S-warfarin compared with R-warfarin. Site-2, positioned in the domain occupied by the hydrophobic tail of KO, binds both warfarin enantiomers with similar affinity. Displacement of Arg37 occurs in the Val66Met mutant, blocking access of warfarin (but not KO) to Site-1, consistent with clinical observation of warfarin resistance.
|
30
|
Leem J, Dunbar J, Georges G, Shi J, Deane CM. ABodyBuilder: Automated antibody structure prediction with data-driven accuracy estimation. MAbs 2016; 8:1259-1268. [PMID: 27392298 PMCID: PMC5058620 DOI: 10.1080/19420862.2016.1205773] [Citation(s) in RCA: 153] [Impact Index Per Article: 19.1] [Indexed: 11/01/2022] Open
Abstract
Computational modeling of antibody structures plays a critical role in therapeutic antibody design. Several antibody modeling pipelines exist, but no freely available methods currently model nanobodies, provide estimates of expected model accuracy, or highlight potential issues with the antibody's experimental development. Here, we describe our automated antibody modeling pipeline, ABodyBuilder, designed to overcome these issues. The algorithm itself follows the standard 4 steps of template selection, orientation prediction, complementarity-determining region (CDR) loop modeling, and side chain prediction. ABodyBuilder then annotates the 'confidence' of the model as a probability that a component of the antibody (e.g., CDRL3 loop) will be modeled within a root-mean-square deviation threshold. It also flags structural motifs on the model that are known to cause issues during in vitro development. ABodyBuilder was tested on 4 separate datasets, including the 11 antibodies from the Antibody Modeling Assessment-II competition. ABodyBuilder builds models that are of similar quality to other methodologies, with sub-Angstrom predictions for the 'canonical' CDR loops. Its ability to model nanobodies and to generate models rapidly (∼30 seconds per model) widens its potential usage. ABodyBuilder can also help users in decision-making for the development of novel antibodies because it provides model confidence and potential sequence liabilities. ABodyBuilder is freely available at http://opig.stats.ox.ac.uk/webapps/abodybuilder.
Affiliation(s)
- Jinwoo Leem
- Department of Statistics, University of Oxford, Oxford, UK
- James Dunbar
- Department of Statistics, University of Oxford, Oxford, UK; Roche Pharma Research and Early Development, Large Molecule Research, Roche Innovation Center Munich, Penzberg, Germany
- Guy Georges
- Roche Pharma Research and Early Development, Large Molecule Research, Roche Innovation Center Munich, Penzberg, Germany
- Jiye Shi
- Informatics Department, UCB Pharma, Slough, UK
|
31
|
Webb B, Sali A. Comparative Protein Structure Modeling Using MODELLER. Curr Protoc Bioinformatics 2016; 54:5.6.1-5.6.37. [PMID: 27322406 PMCID: PMC5031415 DOI: 10.1002/cpbi.3] [Citation(s) in RCA: 1845] [Impact Index Per Article: 230.6] [Indexed: 12/22/2022]
Abstract
Comparative protein structure modeling predicts the three-dimensional structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates). The prediction process consists of fold assignment, target-template alignment, model building, and model evaluation. This unit describes how to calculate comparative models using the program MODELLER and how to use the ModBase database of such models, and discusses all four steps of comparative modeling, frequently observed errors, and some applications. Modeling lactate dehydrogenase from Trichomonas vaginalis (TvLDH) is described as an example. The download and installation of the MODELLER software is also described. © 2016 by John Wiley & Sons, Inc.
Affiliation(s)
- Benjamin Webb
- University of California at San Francisco, San Francisco, California
- Andrej Sali
- University of California at San Francisco, San Francisco, California
|
32
|
Ismer J, Rose AS, Tiemann JKS, Goede A, Preissner R, Hildebrand PW. SL2: an interactive webtool for modeling of missing segments in proteins. Nucleic Acids Res 2016; 44:W390-4. [PMID: 27105847 PMCID: PMC4987885 DOI: 10.1093/nar/gkw297] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 01/29/2016] [Accepted: 04/11/2016] [Indexed: 11/22/2022] Open
Abstract
SuperLooper2 (SL2) (http://proteinformatics.charite.de/sl2) is the updated version of our previous web server SuperLooper, a fragment-based tool for the prediction and interactive placement of loop structures into globular and helical membrane proteins. In comparison to our previous version, SL2 benefits from both a considerably enlarged database of fragments derived from high-resolution 3D protein structures of globular and helical membrane proteins, and the integration of a new protein viewer. The database, now with double the content, significantly improved the coverage of fragment conformations and prediction quality. The employment of the NGL viewer for visualization of the protein under investigation and interactive selection of appropriate loops makes SL2 independent of third-party plug-ins and additional installations.
Affiliation(s)
- Jochen Ismer
- Institute of Medical Physics and Biophysics, University Medicine, Berlin, 10117 Berlin, Germany
- Alexander S Rose
- Institute of Medical Physics and Biophysics, University Medicine, Berlin, 10117 Berlin, Germany
- Johanna K S Tiemann
- Institute of Medical Physics and Biophysics, University Medicine, Berlin, 10117 Berlin, Germany
- Andrean Goede
- Institute of Physiology & Experimental Clinical Research Center, University Medicine, Berlin, 13125, Germany
- Robert Preissner
- Institute of Physiology & Experimental Clinical Research Center, University Medicine, Berlin, 13125, Germany
- Peter W Hildebrand
- Institute of Medical Physics and Biophysics, University Medicine, Berlin, 10117 Berlin, Germany
|
33
|
Choi Y, Hua C, Sentman CL, Ackerman ME, Bailey-Kellogg C. Antibody humanization by structure-based computational protein design. MAbs 2015; 7:1045-57. [PMID: 26252731 PMCID: PMC5045135 DOI: 10.1080/19420862.2015.1076600] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Received: 03/19/2015] [Revised: 07/06/2015] [Accepted: 07/20/2015] [Indexed: 12/15/2022] Open
Abstract
Antibodies derived from non-human sources must be modified for therapeutic use so as to mitigate undesirable immune responses. While complementarity-determining region (CDR) grafting-based humanization techniques have been successfully applied in many cases, it remains challenging to maintain the desired stability and antigen binding affinity upon grafting. We developed an alternative humanization approach called CoDAH ("Computationally-Driven Antibody Humanization") in which computational protein design methods directly select sets of amino acids to incorporate from human germline sequences to increase humanness while maintaining structural stability. Retrospective studies show that CoDAH is able to identify variants deemed beneficial according to both humanness and structural stability criteria, even for targets lacking crystal structures. Prospective application to TZ47, a murine anti-human B7H6 antibody, demonstrates the approach. Four diverse humanized variants were designed, and all possible unique VH/VL combinations were produced as full-length IgG1 antibodies. Soluble and cell surface expressed antigen binding assays showed that 75% (6 of 8) of the computationally designed VH/VL variants were successfully expressed and competed with the murine TZ47 for binding to B7H6 antigen. Furthermore, 4 of the 6 bound with an estimated KD within an order of magnitude of the original TZ47 antibody. In contrast, a traditional CDR-grafted variant could not be expressed. These results suggest that the computational protein design approach described here can be used to efficiently generate functional humanized antibodies and provide humanized templates for further affinity maturation.
Affiliation(s)
- Yoonjoo Choi
- Department of Computer Science, Dartmouth College, Hanover, NH, USA
- Casey Hua
- Thayer School of Engineering, Dartmouth College, Hanover, NH, USA
- Department of Microbiology and Immunology, Geisel School of Medicine, Dartmouth College, Lebanon, NH, USA
- Charles L Sentman
- Department of Microbiology and Immunology, Geisel School of Medicine, Dartmouth College, Lebanon, NH, USA
- Margaret E Ackerman
- Thayer School of Engineering, Dartmouth College, Hanover, NH, USA
- Department of Microbiology and Immunology, Geisel School of Medicine, Dartmouth College, Lebanon, NH, USA
|
34
|
Messih MA, Lepore R, Tramontano A. LoopIng: a template-based tool for predicting the structure of protein loops. Bioinformatics 2015; 31:3767-72. [PMID: 26249814 PMCID: PMC4653384 DOI: 10.1093/bioinformatics/btv438] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 04/07/2015] [Accepted: 07/21/2015] [Indexed: 12/31/2022] Open
Abstract
Motivation: Predicting the structure of protein loops is very challenging, mainly because they are not necessarily subject to strong evolutionary pressure. This implies that, unlike the rest of the protein, standard homology modeling techniques are not very effective in modeling their structure. However, loops are often involved in protein function, hence inferring their structure is important for predicting protein structure as well as function. Results: We describe a method, LoopIng, based on the Random Forest automated learning technique, which, given a target loop, selects a structural template for it from a database of loop candidates. Compared to the most recently available methods, LoopIng is able to achieve similar accuracy for short loops (4–10 residues) and significant enhancements for long loops (11–20 residues). The quality of the predictions is robust to errors that unavoidably affect the stem regions when these are modeled. The method returns a confidence score for the predicted template loops and has the advantage of being very fast (on average: 1 min/loop). Availability and implementation: www.biocomputing.it/looping Contact: anna.tramontano@uniroma1.it Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Rosalba Lepore
- Department of Physics, Sapienza University, 00185 Rome, Italy
- Anna Tramontano
- Department of Physics, Sapienza University, 00185 Rome, Italy and Istituto Pasteur-Fondazione Cenci Bolognetti, Viale Regina Elena 291, 00161 Rome, Italy
|
35
|
Abstract
Functional characterization of a protein sequence is one of the most frequent problems in biology. This task is usually facilitated by accurate three-dimensional (3-D) structure of the studied protein. In the absence of an experimentally determined structure, comparative or homology modeling can sometimes provide a useful 3-D model for a protein that is related to at least one known protein structure. Comparative modeling predicts the 3-D structure of a given protein sequence (target) based primarily on its alignment to one or more proteins of known structure (templates). The prediction process consists of fold assignment, target-template alignment, model building, and model evaluation. This unit describes how to calculate comparative models using the program MODELLER and discusses all four steps of comparative modeling, frequently observed errors, and some applications. Modeling lactate dehydrogenase from Trichomonas vaginalis (TvLDH) is described as an example. The download and installation of the MODELLER software is also described.
Affiliation(s)
- Benjamin Webb
- University of California at San Francisco, San Francisco, California
|
36
|
Holtby D, Li SC, Li M. LoopWeaver: loop modeling by the weighted scaling of verified proteins. J Comput Biol 2014; 20:212-23. [PMID: 23461572 DOI: 10.1089/cmb.2012.0078] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Indexed: 11/13/2022] Open
Abstract
Modeling loops is a necessary step in protein structure determination, even with experimental nuclear magnetic resonance (NMR) data, and it is widely known to be difficult. Database techniques have the advantage of producing a higher proportion of predictions with subangstrom accuracy when compared with ab initio techniques, but the disadvantage of also producing a higher proportion of clashing or highly inaccurate predictions. We introduce LoopWeaver, a database method that uses multidimensional scaling to achieve better, clash-free placement of loops obtained from a database of protein structures. This allows us to maintain the above-mentioned advantage while avoiding the disadvantage. Test results show that we achieve significantly better results than all other methods, including Modeler, Loopy, SuperLooper, and Rapper, before refinement. With refinement, our results (LoopWeaver and Loopy consensus) are better than ROSETTA, with 0.42 Å RMSD on average for 206 length 6 loops, 0.64 Å local RMSD for 168 length 7 loops, 0.81 Å RMSD for 117 length 8 loops, and 0.98 Å RMSD for length 9 loops, while ROSETTA has 0.55, 0.79, 1.16, and 1.42 Å, respectively, at the same average time limit (3 hours). When we allow ROSETTA to run for over a week, it approaches, but does not surpass, our accuracy.
Affiliation(s)
- Daniel Holtby
- David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, Canada.
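The LoopWeaver abstract above reports accuracy as an average RMSD in Å per loop length. As a minimal sketch of that kind of metric, the snippet below computes a plain coordinate RMSD and its mean over a set of loops. It assumes the predicted and reference coordinates are already expressed in a common frame (no superposition is performed), and the names `loop_rmsd` and `mean_rmsd` are hypothetical; it is not the authors' evaluation protocol.

```python
import math


def loop_rmsd(predicted, reference):
    """Root-mean-square deviation between two equal-length lists of
    (x, y, z) atom coordinates, in the units of the input (e.g. Å).

    Assumption: both structures are already in the same frame; no
    superposition or atom selection is done here.
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    # Sum of squared distances between corresponding atoms.
    squared = sum(
        (p - r) ** 2
        for atom_p, atom_r in zip(predicted, reference)
        for p, r in zip(atom_p, atom_r)
    )
    return math.sqrt(squared / len(predicted))


def mean_rmsd(loop_pairs):
    """Average RMSD over an iterable of (predicted, reference) loop pairs."""
    values = [loop_rmsd(pred, ref) for pred, ref in loop_pairs]
    return sum(values) / len(values)
```

For instance, two atoms that are each displaced by 1 Å along z give `loop_rmsd([(0, 0, 0), (1, 0, 0)], [(0, 0, 1), (1, 0, 1)])` equal to 1.0.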
|
37
|
Abstract
Structural proteomics aims to understand the structural basis of protein interactions and functions. A prerequisite for this is the availability of 3D protein structures that mediate the biochemical interactions. The explosion in the number of available gene sequences set the stage for the next step in genome-scale projects -- to obtain 3D structures for each protein. To achieve this ambitious goal, the slow and costly structure determination experiments are supplemented with theoretical approaches. The current state and recent advances in structure modeling approaches are reviewed here, with special emphasis on comparative protein structure modeling techniques.
Affiliation(s)
- András Fiser
- Department of Biochemistry, Seaver Foundation Center for Bioinformatics, Albert Einstein College of Medicine, 1300 Morris Park Ave., Bronx, NY 10461, USA.
|
38
|
Computational Approaches and Resources in Single Amino Acid Substitutions Analysis Toward Clinical Research. Adv Protein Chem Struct Biol 2014; 94:365-423. [DOI: 10.1016/b978-0-12-800168-4.00010-x] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Indexed: 02/08/2023]
|
39
|
Webb B, Eswar N, Fan H, Khuri N, Pieper U, Dong G, Sali A. Comparative Modeling of Drug Target Proteins. Reference Module in Chemistry, Molecular Sciences and Chemical Engineering 2014. [PMCID: PMC7157477 DOI: 10.1016/b978-0-12-409547-2.11133-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]
Abstract
In this perspective, we begin by describing the comparative protein structure modeling technique and the accuracy of the corresponding models. We then discuss the significant role that comparative prediction plays in drug discovery. We focus on virtual ligand screening against comparative models and illustrate the state-of-the-art by a number of specific examples.
|
40
|
Kelm S, Vangone A, Choi Y, Ebejer JP, Shi J, Deane CM. Fragment-based modeling of membrane protein loops: successes, failures, and prospects for the future. Proteins 2013; 82:175-86. [PMID: 23589399 DOI: 10.1002/prot.24299] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 11/08/2012] [Revised: 02/22/2013] [Accepted: 03/26/2013] [Indexed: 11/12/2022]
Abstract
Membrane proteins (MPs) have become a major focus in structure prediction, due to their medical importance. There is, however, a lack of fast and reliable methods that specialize in the modeling of MP loops. Often methods designed for soluble proteins (SPs) are applied directly to MPs. In this article, we investigate the validity of such an approach in the realm of fragment-based methods. We also examined the differences in membrane and soluble protein loops that might affect accuracy. We test our ability to predict soluble and MP loops with the previously published method FREAD. We show that it is possible to predict accurately the structure of MP loops using a database of MP fragments (0.5-1 Å median root-mean-square deviation). The presence of homologous proteins in the database helps prediction accuracy. However, even when homologues are removed better results are still achieved using fragments of MPs (0.8-1.6 Å) rather than SPs (1-4 Å) to model MP loops. We find that many fragments of SPs have shapes similar to their MP counterparts but have very different sequences; however, they do not appear to differ in their substitution patterns. Our findings may allow further improvements to fragment-based loop modeling algorithms for MPs. The current version of our proof-of-concept loop modeling protocol produces high-accuracy loop models for MPs and is available as a web server at http://medeller.info/fread.
Affiliation(s)
- Sebastian Kelm
- Department of Statistics, University of Oxford, Oxford, United Kingdom
|
41
|
Mishra S, Saxena A, Sangwan RS. Fundamentals of Homology Modeling Steps and Comparison among Important Bioinformatics Tools: An Overview. ACTA ACUST UNITED AC 2013. [DOI: 10.17311/sciintl.2013.237.252] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Indexed: 11/03/2022]
|
42
|
Ebejer JP, Hill JR, Kelm S, Shi J, Deane CM. Memoir: template-based structure prediction for membrane proteins. Nucleic Acids Res 2013; 41:W379-83. [PMID: 23640332 PMCID: PMC3692111 DOI: 10.1093/nar/gkt331] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Indexed: 11/17/2022] Open
Abstract
Membrane proteins are estimated to be the targets of 50% of drugs that are currently in development, yet we have few membrane protein crystal structures. As a result, for a membrane protein of interest, the much-needed structural information usually comes from a homology model. Current homology modelling software is optimized for globular proteins, and ignores the constraints that the membrane is known to place on protein structure. Our Memoir server produces homology models using alignment and coordinate generation software that has been designed specifically for transmembrane proteins. Memoir is easy to use, with the only inputs being a structural template and the sequence that is to be modelled. We provide a video tutorial and a guide to assessing model quality. Supporting data aid manual refinement of the models. These data include a set of alternative conformations for each modelled loop, and a multiple sequence alignment that incorporates the query and template. Memoir works with both α-helical and β-barrel types of membrane proteins and is freely available at http://opig.stats.ox.ac.uk/webapps/memoir.
Affiliation(s)
- Jean-Paul Ebejer
- Department of Statistics, Oxford University, Oxford, OX1 3TG, UK
|
43
|
Li Y. Conformational sampling in template-free protein loop structure modeling: an overview. Comput Struct Biotechnol J 2013; 5:e201302003. [PMID: 24688696 PMCID: PMC3962101 DOI: 10.5936/csbj.201302003] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 10/19/2012] [Revised: 01/23/2013] [Accepted: 01/28/2013] [Indexed: 01/04/2023] Open
Abstract
Accurately modeling protein loops is an important step in predicting three-dimensional structures as well as in understanding the functions of many proteins. Because of their high flexibility, modeling the three-dimensional structures of loops is difficult and is usually treated as a "mini protein folding problem" under geometric constraints. In the past decade, there has been remarkable progress in template-free loop structure modeling due to advances in computational methods as well as the steadily increasing number of known structures available in the PDB. This mini review provides an overview of recent computational approaches for loop structure modeling. In particular, we focus on approaches for sampling the loop conformation space, which is a critical step in obtaining high-resolution models in template-free methods. We review the potential energy functions for loop modeling, loop buildup mechanisms to satisfy geometric constraints, and loop conformation sampling algorithms. The recent loop modeling results are also summarized.
Affiliation(s)
- Yaohang Li
- Department of Computer Science, Old Dominion University, Norfolk, VA 23529, USA
|
44
|
The interactions of apamin and tetraethylammonium are differentially affected by single mutations in the pore mouth of small conductance calcium-activated potassium (SK) channels. Biochem Pharmacol 2013; 85:560-9. [DOI: 10.1016/j.bcp.2012.12.015] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 10/23/2012] [Revised: 12/14/2012] [Accepted: 12/17/2012] [Indexed: 10/27/2022]
|
45
|
Fernandez-Fuentes N, Fiser A. A modular perspective of protein structures: application to fragment based loop modeling. Methods Mol Biol 2013; 932:141-58. [PMID: 22987351 PMCID: PMC3635063 DOI: 10.1007/978-1-62703-065-6_9] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 01/30/2023]
Abstract
Proteins can be decomposed into supersecondary structure modules. We used a generic definition of supersecondary structure elements, so-called Smotifs, which are composed of two flanking regular secondary structures connected by a loop, to explore the evolution and current variety of structure building blocks. Here, we discuss recent observations about the saturation of Smotif geometries in protein structures and how it opens new avenues in protein structure modeling and design. As a first application of these observations, we describe our loop conformation modeling algorithm, ArchPred, which takes advantage of the Smotif classification. In this application, instead of focusing on specific loop properties, the method narrows down possible template conformations in other, often non-homologous, structures by identifying the most likely supersecondary structure environment that cradles the loop. Beyond identifying the correct starting supersecondary structure geometry, it takes into account the fit of anchor residues, steric clashes, the match of predicted and observed dihedral angle preferences, and local sequence signal.
Affiliation(s)
- Narcis Fernandez-Fuentes
- Leeds Institute of Molecular Medicine, Section of Experimental Therapeutics, University of Leeds, St. James's University Hospital, Leeds LS9 7TF, UK
- Andras Fiser
- Department of Systems and Computational Biology, Department of Biochemistry, Albert Einstein College of Medicine, 1301 Morris Park Ave, Bronx, NY 10461, USA
|
46
|
Abstract
The prediction of loop structures is considered one of the main challenges in the protein folding problem. Regardless of the dependence of the overall algorithm on the protein data bank, the flexibility of loop regions dictates the need for special attention to their structures. In this article, we present algorithms for loop structure prediction with fixed stem and flexible stem geometry. In the flexible stem geometry problem, only the secondary structure of three stem residues on either side of the loop is known. In the fixed stem geometry problem, the structure of the three stem residues on either side of the loop is also known. Initial loop structures are generated using a probability database for the flexible stem geometry problem, and using torsion angle dynamics for the fixed stem geometry problem. Three rotamer optimization algorithms are introduced to alleviate steric clashes between the generated backbone structures and the side chain rotamers. The structures are optimized by energy minimization using an all-atom force field. The optimized structures are clustered using a traveling salesman problem-based clustering algorithm. The structures in the densest clusters are then utilized to refine dihedral angle bounds on all amino acids in the loop. The entire procedure is carried out for a number of iterations, leading to improved structure prediction and refined dihedral angle bounds. The algorithms presented in this article have been tested on 3190 loops from the PDBSelect25 data set and on targets from the recently concluded CASP9 community-wide experiment.
Affiliation(s)
- A. Subramani
- Department of Chemical and Biological Engineering, Princeton University, Princeton, NJ 08544-5263, U.S.A
- C. A. Floudas
- Department of Chemical and Biological Engineering, Princeton University, Princeton, NJ 08544-5263, U.S.A
|
47
|
St-Pierre JF, Mousseau N. Large loop conformation sampling using the activation relaxation technique, ART-nouveau method. Proteins 2012; 80:1883-94. [PMID: 22488731 DOI: 10.1002/prot.24085] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 11/29/2011] [Revised: 03/19/2011] [Accepted: 03/30/2012] [Indexed: 12/25/2022]
Abstract
We present an adaptation of the ART-nouveau energy surface sampling method to the problem of loop structure prediction. This method, previously used to study protein folding pathways and peptide aggregation, is well suited to the problem of sampling the conformation space of large loops by targeting probable folding pathways instead of sampling exhaustively that space. The number of sampled conformations needed by ART nouveau to find the global energy minimum for a loop was found to scale linearly with the sequence length of the loop for loops between 8 and about 20 amino acids. Considering the linear scaling dependence of the computation cost on the loop sequence length for sampling new conformations, we estimate the total computational cost of sampling larger loops to scale quadratically compared to the exponential scaling of exhaustive search methods.
Affiliation(s)
- Jean-François St-Pierre
- Département de Physique and Regroupement Québécois sur les Matériaux de Pointe, Université de Montréal, CP 6128, Succursale Centre-Ville, Montréal, Québec, Canada H3C 3J7
|
48
|
Structural modelling and dynamics of proteins for insights into drug interactions. Adv Drug Deliv Rev 2012; 64:323-43. [PMID: 22155026 DOI: 10.1016/j.addr.2011.11.011] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Received: 07/13/2011] [Revised: 11/17/2011] [Accepted: 11/24/2011] [Indexed: 12/27/2022]
Abstract
Proteins are the workhorses of biomolecules, and their function is affected by their structure and their structural rearrangements during ligand entry, ligand binding and protein-protein interactions. Hence, knowledge of protein structure and, importantly, of the dynamic behaviour of that structure is critical for understanding how the protein performs its function. The structure and its dynamic behaviour can be predicted by combining structure modelling with molecular dynamics simulations. The simulations also need to be sensitive to the constraints of the environment in which the protein resides. Standard computational methods now exist in this field to support the experimental effort of solving protein structures. This review presents a comprehensive overview of the basis of the calculations and the well-established computational methods used to generate and understand protein structure and function and to study their dynamic behaviour, with reference to lung-related targets.
|
49
|
Sacan A, Ekins S, Kortagere S. Applications and limitations of in silico models in drug discovery. Methods Mol Biol 2012; 910:87-124. [PMID: 22821594 DOI: 10.1007/978-1-61779-965-5_6] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.6] [Indexed: 02/07/2023]
Abstract
Drug discovery in the late twentieth and early twenty-first century has witnessed a myriad of changes that were adopted to predict whether a compound is likely to be successful or, conversely, to identify molecules with liabilities as early as possible. These changes include the integration of in silico strategies for lead design and optimization that complement the traditional in vitro and in vivo approaches. The in silico models are facilitated by the availability of large datasets associated with high-throughput screening, bioinformatics algorithms to mine and annotate the data from a target perspective, and chemoinformatics methods to integrate chemistry methods into the lead design process. This chapter highlights the applications of some of these methods and their limitations. We hope this serves as an introduction to in silico drug discovery.
Affiliation(s)
- Ahmet Sacan
- School of Biomedical Engineering, Drexel University, Philadelphia, PA, USA
|
50
|
Adhikari AN, Peng J, Wilde M, Xu J, Freed KF, Sosnick TR. Modeling large regions in proteins: applications to loops, termini, and folding. Protein Sci 2012; 21:107-21. [PMID: 22095743 PMCID: PMC3323786 DOI: 10.1002/pro.767] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 08/20/2011] [Revised: 11/02/2011] [Accepted: 11/06/2011] [Indexed: 11/10/2022]
Abstract
Template-based methods for predicting protein structure provide models for a significant portion of the protein but often contain insertions or chain ends (InsEnds) of indeterminate conformation. The local structure prediction "problem" entails modeling the InsEnds onto the rest of the protein. A well-known limit involves predicting loops of ≤12 residues in crystal structures. However, InsEnds may contain as many as ~50 amino acids, and the template-based model of the protein itself may be imperfect. To address these challenges, we present a free modeling method for predicting the local structure of loops and large InsEnds in both crystal structures and template-based models. The approach uses single amino acid torsional angle "pivot" moves of the protein backbone with a Cβ-level representation. Nevertheless, our accuracy for loops is comparable to existing methods. We also apply a more stringent test, the blind structure prediction and refinement categories of the CASP9 tournament, where we improve the quality of several homology based models by modeling InsEnds as long as 45 amino acids, sizes generally inaccessible to existing loop prediction methods. Our approach ranks as one of the best in the CASP9 refinement category that involves improving template-based models so that they can function as molecular replacement models to solve the phase problem for crystallographic structure determination.
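The torsional "pivot" move mentioned in this abstract can be illustrated with a minimal geometric sketch. It is not the authors' implementation: the function names `rotate_about_axis` and `pivot_move` are hypothetical, the chain is a bare list of 3D points rather than a Cβ-level protein model, and no energy function or acceptance criterion is included. The sketch only shows the core operation, rotating everything downstream of one bond about that bond's axis while leaving bond lengths unchanged.

```python
import math


def rotate_about_axis(point, origin, axis, angle):
    """Rotate `point` by `angle` (radians) about the line through `origin` along `axis`."""
    # Normalize the rotation axis.
    norm = math.sqrt(sum(a * a for a in axis))
    k = [a / norm for a in axis]
    # Vector from the axis origin to the point.
    v = [p - o for p, o in zip(point, origin)]
    cos_t, sin_t = math.cos(angle), math.sin(angle)
    # Rodrigues' formula: v*cos + (k x v)*sin + k*(k.v)*(1 - cos)
    k_dot_v = sum(ki * vi for ki, vi in zip(k, v))
    k_cross_v = [
        k[1] * v[2] - k[2] * v[1],
        k[2] * v[0] - k[0] * v[2],
        k[0] * v[1] - k[1] * v[0],
    ]
    rotated = [
        vi * cos_t + ci * sin_t + ki * k_dot_v * (1.0 - cos_t)
        for vi, ci, ki in zip(v, k_cross_v, k)
    ]
    return [r + o for r, o in zip(rotated, origin)]


def pivot_move(coords, bond_start, angle):
    """Rotate all points after the bond (bond_start, bond_start + 1) about that bond."""
    a, b = coords[bond_start], coords[bond_start + 1]
    axis = [bj - aj for aj, bj in zip(a, b)]
    # Points up to and including the bond stay fixed.
    fixed = [list(c) for c in coords[: bond_start + 2]]
    # Everything downstream rotates rigidly about the bond axis.
    moved = [rotate_about_axis(c, b, axis, angle) for c in coords[bond_start + 2:]]
    return fixed + moved


# Hypothetical four-point chain; pivot by 180 degrees about the first bond.
chain = [[0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 1.0], [1.0, 0.0, 2.0]]
new_chain = pivot_move(chain, 0, math.pi)
```

Because a single pivot changes one torsion and moves the whole downstream segment rigidly, each move is cheap to evaluate, which is one reason such moves are attractive for sampling long segments.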
Affiliation(s)
- Aashish N Adhikari
- Department of Chemistry, The University of Chicago, Chicago, Illinois 60637
- The James Franck Institute, The University of Chicago, Chicago, Illinois 60637
- Jian Peng
- Toyota Technological Institute at Chicago, Chicago, Illinois 60637
- Michael Wilde
- Department of Biochemistry and Molecular Biology, The University of Chicago, Chicago, Illinois 60637
- Jinbo Xu
- Toyota Technological Institute at Chicago, Chicago, Illinois 60637
- Karl F Freed
- Department of Chemistry, The University of Chicago, Chicago, Illinois 60637
- The James Franck Institute, The University of Chicago, Chicago, Illinois 60637
- Computation Institute, The University of Chicago and Argonne National Laboratory, Chicago, Illinois 60637
- Tobin R Sosnick
- Computation Institute, The University of Chicago and Argonne National Laboratory, Chicago, Illinois 60637
- Department of Biochemistry and Molecular Biology, The University of Chicago, Chicago, Illinois 60637
- Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois 60637
|