1
|
Li SS, Liu ZM, Li J, Ma YB, Dong ZY, Hou JW, Shen FJ, Wang WB, Li QM, Su JG. Prediction of mutation-induced protein stability changes based on the geometric representations learned by a self-supervised method. BMC Bioinformatics 2024; 25:282. [PMID: 39198740 PMCID: PMC11360314 DOI: 10.1186/s12859-024-05876-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2024] [Accepted: 07/19/2024] [Indexed: 09/01/2024] Open
Abstract
BACKGROUND Thermostability is a fundamental property that proteins need to maintain their biological functions. Predicting protein stability changes upon mutation is important for understanding protein structure-function relationships, and is also of great interest in protein engineering and pharmaceutical design. RESULTS Here we present mutDDG-SSM, a deep learning-based framework that uses the geometric representations encoded in protein structure to predict mutation-induced protein stability changes. mutDDG-SSM consists of two parts: a graph attention network-based protein structural feature extractor that is trained with a self-supervised learning scheme using large-scale high-resolution protein structures, and an eXtreme Gradient Boosting model-based stability change predictor, which has the advantage of alleviating the overfitting problem. The performance of mutDDG-SSM was tested on several widely used independent datasets. Then, myoglobin and p53 were used as case studies to illustrate the effectiveness of the model in predicting protein stability changes upon mutations. Our results show that mutDDG-SSM achieved high performance in estimating the effects of mutations on protein stability. In addition, mutDDG-SSM exhibited good unbiasedness: the prediction accuracy on the inverse mutations is as good as that on the direct mutations. CONCLUSION Meaningful features can be extracted from our pre-trained model to build downstream tasks, and our model may serve as a valuable tool for protein engineering and drug design.
Affiliation(s)
- Shan Shan Li
- High Performance Computing Center, National Vaccine and Serum Institute (NVSI), Beijing, China
- National Engineering Center for New Vaccine Research, Beijing, China
| | - Zhao Ming Liu
- National Engineering Center for New Vaccine Research, Beijing, China
- The Sixth Laboratory, National Vaccine and Serum Institute (NVSI), Beijing, China
| | - Jiao Li
- High Performance Computing Center, National Vaccine and Serum Institute (NVSI), Beijing, China
- National Engineering Center for New Vaccine Research, Beijing, China
| | - Yi Bo Ma
- High Performance Computing Center, National Vaccine and Serum Institute (NVSI), Beijing, China
- National Engineering Center for New Vaccine Research, Beijing, China
| | - Ze Yuan Dong
- High Performance Computing Center, National Vaccine and Serum Institute (NVSI), Beijing, China
- National Engineering Center for New Vaccine Research, Beijing, China
| | - Jun Wei Hou
- National Engineering Center for New Vaccine Research, Beijing, China
- The Sixth Laboratory, National Vaccine and Serum Institute (NVSI), Beijing, China
| | - Fu Jie Shen
- National Engineering Center for New Vaccine Research, Beijing, China
- The Sixth Laboratory, National Vaccine and Serum Institute (NVSI), Beijing, China
| | - Wei Bu Wang
- High Performance Computing Center, National Vaccine and Serum Institute (NVSI), Beijing, China
- National Engineering Center for New Vaccine Research, Beijing, China
| | - Qi Ming Li
- National Engineering Center for New Vaccine Research, Beijing, China.
- The Sixth Laboratory, National Vaccine and Serum Institute (NVSI), Beijing, China.
| | - Ji Guo Su
- High Performance Computing Center, National Vaccine and Serum Institute (NVSI), Beijing, China.
- National Engineering Center for New Vaccine Research, Beijing, China.
| |
|
2
|
Álvarez-Machancoses Ó, Faraggi E, deAndrés-Galiana EJ, Fernández-Martínez JL, Kloczkowski A. Prediction of Deleterious Single Amino Acid Polymorphisms with a Consensus Holdout Sampler. Curr Genomics 2024; 25:171-184. [PMID: 39086995 PMCID: PMC11288160 DOI: 10.2174/0113892029236347240308054538] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2023] [Revised: 08/03/2023] [Accepted: 09/22/2023] [Indexed: 08/02/2024] Open
Abstract
Background Single Amino Acid Polymorphisms (SAPs) or nonsynonymous Single Nucleotide Variants (nsSNVs) are the most common genetic variations. They result from missense mutations where a single base pair substitution changes the genetic code in such a way that the triplet of bases (codon) at a given position codes for a different amino acid. Since genetic mutations sometimes cause genetic diseases, it is important to understand and predict which variations are harmful and which ones are neutral (not causing changes in the phenotype). This can be posed as a classification problem. Methods Computational methods using machine intelligence are gradually replacing repetitive and exceedingly expensive mutagenesis tests. By and large, the uneven quality, deficiencies, and irregularities of nsSNV datasets reduce the usefulness of artificial intelligence-based methods. Consequently, more robust and more accurate approaches are needed to address these problems. In the present work, we show a consensus classifier built on the holdout sampler, which appears robust and precise and outperforms all other popular methods. Results We produced 100 holdouts to test the structures and diverse classification variables of diverse classifiers during the training phase. The best-performing holdouts were chosen to develop a consensus classifier and tested using a k-fold (1 ≤ k ≤ 5) cross-validation method. We also examined which protein properties have the biggest impact on the precise prediction of the effects of nsSNVs. Conclusion Our Consensus Holdout Sampler outperforms other popular algorithms and gives excellent results that are highly accurate with low standard deviation. The advantage of our method emerges from using a tree of holdouts, where diverse LM/AI-based programs are sampled in diverse ways.
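The idea of keeping only the best-performing holdouts and letting them vote can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' Consensus Holdout Sampler: the names `select_best` and `consensus_predict`, the plain majority vote, and the toy threshold classifiers are all hypothetical.

```python
def select_best(holdouts, keep):
    """Keep the `keep` models with the highest validation accuracy.

    `holdouts` is a list of (validation_accuracy, predict_fn) pairs,
    where predict_fn maps a sample to 0 (neutral) or 1 (deleterious).
    """
    ranked = sorted(holdouts, key=lambda h: h[0], reverse=True)
    return [predict for _, predict in ranked[:keep]]


def consensus_predict(models, sample):
    """Majority vote over the selected models (assumed voting rule)."""
    votes = sum(model(sample) for model in models)
    return 1 if 2 * votes > len(models) else 0


# Toy holdout models: simple threshold classifiers on a single score.
holdouts = [
    (0.90, lambda x: int(x > 0.5)),
    (0.60, lambda x: 1),             # weak model, dropped by selection
    (0.80, lambda x: int(x > 0.3)),
    (0.85, lambda x: int(x > 0.7)),
]
models = select_best(holdouts, keep=3)
high = consensus_predict(models, 0.6)  # two of three models vote 1
low = consensus_predict(models, 0.4)   # one of three models votes 1
```

Using an odd number of retained models avoids ties in the vote.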
Affiliation(s)
- Óscar Álvarez-Machancoses
- Group of Inverse Problems, Optimization and Machine Learning, Department of Mathematics, University of Oviedo, C. Federico García Lorca, 18, 33007, Oviedo, Spain
| | - Eshel Faraggi
- School of Science, Indiana University-Purdue University Indianapolis, IN, USA
| | - Enrique J deAndrés-Galiana
- Group of Inverse Problems, Optimization and Machine Learning, Department of Mathematics, University of Oviedo, C. Federico García Lorca, 18, 33007, Oviedo, Spain
- Department of Computer Science, University of Oviedo, C. Federico García Lorca, 18, 33007, Oviedo, Spain
| | - Juan L Fernández-Martínez
- Group of Inverse Problems, Optimization and Machine Learning, Department of Mathematics, University of Oviedo, C. Federico García Lorca, 18, 33007, Oviedo, Spain
| | - Andrzej Kloczkowski
- Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, USA
- Department of Pediatrics, The Ohio State University, Columbus, OH, USA
| |
|
3
|
Serghini A, Portelli S, Troadec G, Song C, Pan Q, Pires DEV, Ascher DB. Characterizing and predicting ccRCC-causing missense mutations in Von Hippel-Lindau disease. Hum Mol Genet 2024; 33:224-232. [PMID: 37883464 PMCID: PMC10800015 DOI: 10.1093/hmg/ddad181] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/20/2023] [Revised: 10/19/2023] [Accepted: 10/20/2023] [Indexed: 10/28/2023] Open
Abstract
BACKGROUND Mutations within the Von Hippel-Lindau (VHL) tumor suppressor gene are known to cause VHL disease, which is characterized by the formation of cysts and tumors in multiple organs of the body, particularly clear cell renal cell carcinoma (ccRCC). A major challenge in clinical practice is determining tumor risk from a given mutation in the VHL gene. Previous efforts have been hindered by limited available clinical data and technological constraints. METHODS To overcome this, we first manually curated the largest set of clinically validated VHL mutations to date, enabling a robust assessment of existing predictive tools on an independent test set. Additionally, we comprehensively characterized the effects of mutations within VHL using in silico biophysical tools describing changes in protein stability, dynamics and affinity to binding partners to provide insights into the structure-phenotype relationship. These descriptive properties were used as molecular features for the construction of a machine learning model designed to predict the risk of ccRCC development as a result of a VHL missense mutation. RESULTS Analysis of our model showed an accuracy of 0.81 in the identification of ccRCC-causing missense mutations, and a Matthews correlation coefficient of 0.44 on a non-redundant blind test, a significant improvement in comparison to the previously available approaches. CONCLUSION This work highlights the power of using protein 3D structure to fully explore the range of molecular and functional consequences of genomic variants. We believe this optimized model will better enable its clinical implementation and assist in guiding patient risk stratification and management.
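The abstract reports both accuracy and the Matthews correlation coefficient (MCC); the two can diverge on imbalanced data, which is why both are given. A minimal sketch of how each is computed from a binary confusion matrix follows. The labels are toy values chosen for illustration and are not data from the study.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; defined as 0 when the denominator vanishes."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Toy, imbalanced example: 8 positives and 2 negatives.
y_true = [1] * 8 + [0] * 2
y_pred = [1] * 8 + [1, 0]          # one negative is misclassified
counts = confusion_counts(y_true, y_pred)
acc = accuracy(*counts)            # 9 correct out of 10
coef = mcc(*counts)                # noticeably lower than the accuracy
```

In this toy case a single error on the minority class leaves accuracy at 0.9 while the MCC drops to about 0.67, since the MCC weighs all four cells of the confusion matrix.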
Affiliation(s)
- Adam Serghini
- School of Chemistry and Molecular Biosciences, Chemistry Building 68, Cooper Road, The University of Queensland, St Lucia, QLD 4072, Queensland, Australia
| | - Stephanie Portelli
- School of Chemistry and Molecular Biosciences, Chemistry Building 68, Cooper Road, The University of Queensland, St Lucia, QLD 4072, Queensland, Australia
| | - Guillaume Troadec
- School of Computing and Information Systems, University of Melbourne, Melbourne, VIC 3010, Australia
| | - Catherine Song
- School of Computing and Information Systems, University of Melbourne, Melbourne, VIC 3010, Australia
| | - Qisheng Pan
- School of Chemistry and Molecular Biosciences, Chemistry Building 68, Cooper Road, The University of Queensland, St Lucia, QLD 4072, Queensland, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, 75 Commercial Road, Melbourne, VIC 3004, Australia
| | - Douglas E V Pires
- School of Computing and Information Systems, University of Melbourne, Melbourne, VIC 3010, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, 75 Commercial Road, Melbourne, VIC 3004, Australia
| | - David B Ascher
- School of Chemistry and Molecular Biosciences, Chemistry Building 68, Cooper Road, The University of Queensland, St Lucia, QLD 4072, Queensland, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, 75 Commercial Road, Melbourne, VIC 3004, Australia
| |
|
4
|
Yazdanpanah N, Jumentier B, Yazdanpanah M, Ong KK, Perry JRB, Manousaki D. Mendelian randomization identifies circulating proteins as biomarkers for age at menarche and age at natural menopause. Commun Biol 2024; 7:47. [PMID: 38184718 PMCID: PMC10771430 DOI: 10.1038/s42003-023-05737-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2023] [Accepted: 12/21/2023] [Indexed: 01/08/2024] Open
Abstract
Age at menarche (AAM) and age at natural menopause (ANM) are highly heritable traits and have been linked to various health outcomes. We aimed to identify circulating proteins associated with altered ANM and AAM using an unbiased two-sample Mendelian randomization (MR) and colocalization approach. By testing causal effects of 1,271 proteins on AAM, we identified 22 proteins causally associated with AAM in MR, among which 13 proteins (GCKR, FOXO3, SEMA3G, PATE4, AZGP1, NEGR1, LHB, DLK1, ANXA2, YWHAB, DNAJB12, RMDN1 and HPGDS) colocalized. Among 1,349 proteins tested for causal association with ANM using MR, we identified 19 causal proteins among which 7 proteins (CPNE1, TYMP, DNER, ADAMTS13, LCT, ARL and PLXNA1) colocalized. Follow-up pathway and gene enrichment analyses demonstrated links between AAM-related proteins and obesity and diabetes, and between AAM and ANM-related proteins and various types of cancer. In conclusion, we identified proteomic signatures of reproductive ageing in women, highlighting biological processes at both ends of the reproductive lifespan.
Affiliation(s)
- Nahid Yazdanpanah
- Research Center of the Sainte-Justine University Hospital, University of Montreal, Montreal, Quebec, Canada
| | - Basile Jumentier
- Research Center of the Sainte-Justine University Hospital, University of Montreal, Montreal, Quebec, Canada
| | - Mojgan Yazdanpanah
- Research Center of the Sainte-Justine University Hospital, University of Montreal, Montreal, Quebec, Canada
| | - Ken K Ong
- MRC Epidemiology Unit, Wellcome-MRC Institute of Metabolic Science, University of Cambridge School of Clinical Medicine, Cambridge, CB2 0QQ, UK
| | - John R B Perry
- MRC Epidemiology Unit, Wellcome-MRC Institute of Metabolic Science, University of Cambridge School of Clinical Medicine, Cambridge, CB2 0QQ, UK
- Metabolic Research Laboratory, Wellcome-MRC Institute of Metabolic Science, University of Cambridge School of Clinical Medicine, Cambridge, CB2 0QQ, UK
| | - Despoina Manousaki
- Research Center of the Sainte-Justine University Hospital, University of Montreal, Montreal, Quebec, Canada.
- Departments of Pediatrics, Biochemistry and Molecular Medicine, University of Montreal, Montreal, Canada.
| |
|
5
|
Abstract
The greatest challenge in drug discovery remains the high rate of attrition across the different phases of the process, which costs the industry billions of dollars every year. While all phases remain crucial to ensure pharmaceutical-level safety, quality, and efficacy of the end product, streamlining these efforts toward compounds with success potential is pivotal for a more efficient and cost-effective process. The use of artificial intelligence (AI) within the pharmaceutical industry aims at just this, and has applications in preclinical screening for biological activity, optimization of pharmacokinetic properties for improved drug formulation, early toxicity prediction, which reduces attrition, and pre-emptive screening for genetic changes in the biological target to improve therapeutic longevity. Here, we present a series of in silico tools that address these applications in small molecule development and describe how they can be embedded within the current pharmaceutical development pipeline.
Affiliation(s)
- Adam Serghini
- School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, QLD, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia
| | - Stephanie Portelli
- School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, QLD, Australia.
| | - David B Ascher
- School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, QLD, Australia.
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia.
| |
|
6
|
Ahangari N, Gholampour-Faroji N, Doosti M, Ghayour Mobarhan M, Shahrokhzadeh S, Karimiani EG, Hasani-Sabzevar B, Torbati PN, Haddad-Mashadrizeh A. ECEL1 novel mutation in arthrogryposis type 5D: A molecular dynamic simulation study. Mol Genet Genomic Med 2023:e2153. [PMID: 36794879 DOI: 10.1002/mgg3.2153] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/18/2021] [Revised: 09/26/2021] [Accepted: 02/03/2023] [Indexed: 02/17/2023] Open
Abstract
BACKGROUND ECEL1 has been presented as a causal gene of an autosomal recessive form of distal arthrogryposis (DA), which affects the distal joints. The present study focused on bioinformatic analysis of a novel mutation in ECEL1, c.535A>G (p.Lys179Glu), which was reported in a family with two affected boys and a fetus through prenatal diagnosis. METHODS Whole-exome sequencing data were analyzed, followed by molecular dynamics (MD) simulation of the native ECEL1 protein and mutant structures using GROMACS software. One homozygous variant, c.535A>G (p.Lys179Glu), in the ECEL1 gene was detected in the proband and validated in all family members through Sanger sequencing. RESULTS MD simulation demonstrated remarkable structural differences between the wild-type and the novel mutant ECEL1 protein. The reason for the lack of Zn ion binding in the mutant ECEL1 protein was identified by average atomic distance and SMD analysis of the wild-type and mutant. CONCLUSION Overall, in this study, we present knowledge of the effect of the studied variant on the ECEL1 protein, leading to a neurodegenerative disorder in humans. This work may supplement classical molecular dynamics in resolving the mutational effects on cofactor-dependent proteins.
Affiliation(s)
- Najmeh Ahangari
- Innovative Medical Research Center, Faculty of Medicine, Mashhad Medical Science, Islamic Azad University, Mashhad, Iran; Department of Medical Genetics, Next Generation Genetic Polyclinic, Mashhad, Iran
| | - Nazanin Gholampour-Faroji
- Biotechnology Department, Iranian Research Organization for Science and Technology (IROST), Tehran, Iran
| | - Mohammad Doosti
- Department of Medical Genetics, Next Generation Genetic Polyclinic, Mashhad, Iran
| | - Majid Ghayour Mobarhan
- Metabolic Syndrome Research Center, School of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
| | - Sima Shahrokhzadeh
- Department of Medical Genetics, Next Generation Genetic Polyclinic, Mashhad, Iran
| | - Ehsan Ghayoor Karimiani
- Department of Medical Genetics, Next Generation Genetic Polyclinic, Mashhad, Iran; Molecular and Clinical Sciences Institute, St. George's University of London, Cranmer Terrace, London, UK; Department of Molecular Genetics, Next Generation Genetic Polyclinic, Mashhad, Iran
| | | | | | - Aliakbar Haddad-Mashadrizeh
- Industrial Biotechnology Research Group, Institute of Biotechnology, Ferdowsi University of Mashhad, Mashhad, Iran
| |
|
7
|
Benevenuta S, Birolo G, Sanavia T, Capriotti E, Fariselli P. Challenges in predicting stabilizing variations: An exploration. Front Mol Biosci 2023; 9:1075570. [PMID: 36685278 PMCID: PMC9849384 DOI: 10.3389/fmolb.2022.1075570] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 10/20/2022] [Accepted: 12/15/2022] [Indexed: 01/06/2023] Open
Abstract
An open challenge of computational and experimental biology is understanding the impact of non-synonymous DNA variations on protein function and, subsequently, human health. The effects of these variants on protein stability can be measured as the difference in the free energy of unfolding (ΔΔG) between the mutated structure of the protein and its wild-type form. Throughout the years, bioinformaticians have developed a wide variety of tools and approaches to predict the ΔΔG. Although the performance of these tools is highly variable, overall they are less accurate in predicting stabilizing variations than destabilizing ones. Here, we analyze the possible reasons for this difference by focusing on the relationship between experimentally measured ΔΔG and seven protein properties on three widely used datasets (S2648, VariBench, Ssym) and a recently introduced one (S669). These properties include protein structural information, different physical properties and statistical potentials. We found that two highly used input features, i.e., hydrophobicity and the Blosum62 substitution matrix, show a performance close to random choice when trying to separate stabilizing variants from either neutral or destabilizing ones. We then speculate that, since destabilizing variations are the most abundant class in the available datasets, the overall performance of the methods is higher when including features that improve the prediction for the destabilizing variants at the expense of the stabilizing ones. These findings highlight the need to design predictive methods that can also exploit input features highly correlated with the stabilizing variants. New tools should also be tested on a dataset that is not artificially balanced, reporting the performance on all three classes (i.e., stabilizing, neutral and destabilizing variants) and not only the overall results.
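The recommendation to report performance per class, rather than only overall, can be illustrated with a short sketch. It assumes the sign convention ΔΔG = ΔG_unfold(mutant) − ΔG_unfold(wild type), so positive values are stabilizing; the opposite convention is also in use. The ±0.5 kcal/mol neutral band, the function names, and the toy values are assumptions for illustration and are not taken from the paper.

```python
from collections import Counter

NEUTRAL_BAND = 0.5  # kcal/mol; assumed cutoff, not specified in the abstract


def ddg(dg_unfold_mutant, dg_unfold_wild_type):
    """Difference in unfolding free energy, mutant minus wild type (assumed sign)."""
    return dg_unfold_mutant - dg_unfold_wild_type


def classify(ddg_value, band=NEUTRAL_BAND):
    """Map a ddG value to one of the three classes."""
    if ddg_value > band:
        return "stabilizing"
    if ddg_value < -band:
        return "destabilizing"
    return "neutral"


def per_class_recall(measured, predicted, band=NEUTRAL_BAND):
    """Fraction of variants in each measured class whose predicted class matches."""
    totals, hits = Counter(), Counter()
    for m, p in zip(measured, predicted):
        cls = classify(m, band)
        totals[cls] += 1
        if classify(p, band) == cls:
            hits[cls] += 1
    return {cls: hits[cls] / totals[cls] for cls in totals}


# Toy data dominated by destabilizing variants.
measured = [-2.0, -1.5, -1.0, -0.8, 0.1, 1.2]
predicted = [-1.8, -1.1, -0.9, -0.2, 0.0, -0.3]
recall = per_class_recall(measured, predicted)
overall = sum(classify(m) == classify(p)
              for m, p in zip(measured, predicted)) / len(measured)
```

Here four of six variants are classified correctly overall, yet the single stabilizing variant is missed entirely, which an overall score alone would hide.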
Affiliation(s)
| | - Giovanni Birolo
- Department of Medical Sciences, University of Torino, Torino, Italy
| | - Tiziana Sanavia
- Department of Medical Sciences, University of Torino, Torino, Italy
| | - Emidio Capriotti
- Department of Pharmacy and Biotechnology (FaBiT), University of Bologna, Bologna, Italy
| | - Piero Fariselli
- Department of Medical Sciences, University of Torino, Torino, Italy
| |
|
8
|
Ascher DB, Kaminskas LM, Myung Y, Pires DEV. Using Graph-Based Signatures to Guide Rational Antibody Engineering. Methods Mol Biol 2023; 2552:375-397. [PMID: 36346604 DOI: 10.1007/978-1-0716-2609-2_21] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Antibodies are essential experimental and diagnostic tools and as biotherapeutics have significantly advanced our ability to treat a range of diseases. With recent innovations in computational tools to guide protein engineering, we can now rationally design better antibodies with improved efficacy, stability, and pharmacokinetics. Here, we describe the use of the mCSM web-based in silico suite, which uses graph-based signatures to rapidly identify the structural and functional consequences of mutations, to guide rational antibody engineering to improve stability, affinity, and specificity.
Affiliation(s)
- David B Ascher
- Structural Biology and Bioinformatics, Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Parkville, VIC, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia
- Department of Biochemistry, Cambridge University, Cambridge, UK
- School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, Queensland, Australia
| | - Lisa M Kaminskas
- School of Biological Sciences, University of Queensland, St Lucia, QLD, Australia
| | - Yoochan Myung
- Structural Biology and Bioinformatics, Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Parkville, VIC, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia
- School of Chemistry and Molecular Biosciences, University of Queensland, St Lucia, Queensland, Australia
| | - Douglas E V Pires
- Structural Biology and Bioinformatics, Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Parkville, VIC, Australia.
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC, Australia.
- School of Computing and Information Systems, University of Melbourne, Parkville, VIC, Australia.
| |
|
9
|
Parthasarathy S, Ruggiero SM, Gelot A, Soardi FC, Ribeiro BFR, Pires DEV, Ascher DB, Schmitt A, Rambaud C, Represa A, Xie HM, Lusk L, Wilmarth O, McDonnell PP, Juarez OA, Grace AN, Buratti J, Mignot C, Gras D, Nava C, Pierce SR, Keren B, Kennedy BC, Pena SDJ, Helbig I, Cuddapah VA. A recurrent de novo splice site variant involving DNM1 exon 10a causes developmental and epileptic encephalopathy through a dominant-negative mechanism. Am J Hum Genet 2022; 109:2253-2269. [PMID: 36413998 PMCID: PMC9748255 DOI: 10.1016/j.ajhg.2022.11.002] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 05/31/2022] [Accepted: 11/01/2022] [Indexed: 11/23/2022] Open
Abstract
Heterozygous pathogenic variants in DNM1 cause developmental and epileptic encephalopathy (DEE) as a result of a dominant-negative mechanism impeding vesicular fission. Thus far, pathogenic variants in DNM1 have been studied with a canonical transcript that includes the alternatively spliced exon 10b. However, after performing RNA sequencing in 39 pediatric brain samples, we find the primary transcript expressed in the brain includes the downstream exon 10a instead. Using this information, we evaluated genotype-phenotype correlations of variants affecting exon 10a and identified a cohort of eleven previously unreported individuals. Eight individuals harbor a recurrent de novo splice site variant, c.1197-8G>A (GenBank: NM_001288739.1), which affects exon 10a and leads to DEE consistent with the classical DNM1 phenotype. We find this splice site variant leads to disease through an unexpected dominant-negative mechanism. Functional testing reveals an in-frame upstream splice acceptor causing insertion of two amino acids predicted to impair oligomerization-dependent activity. This is supported by neuropathological samples showing accumulation of enlarged synaptic vesicles adherent to the plasma membrane consistent with impaired vesicular fission. Two additional individuals with missense variants affecting exon 10a, p.Arg399Trp and p.Gly401Asp, had a similar DEE phenotype. In contrast, one individual with a missense variant affecting exon 10b, p.Pro405Leu, which is less expressed in the brain, had a correspondingly less severe presentation. Thus, we implicate variants affecting exon 10a as causing the severe DEE typically associated with DNM1-related disorders. We highlight the importance of considering relevant isoforms for disease-causing variants as well as the possibility of splice site variants acting through a dominant-negative mechanism.
Affiliation(s)
- Shridhar Parthasarathy
- Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; The Epilepsy NeuroGenetics Initiative, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA 19146, USA
| | - Sarah McKeown Ruggiero
- Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; The Epilepsy NeuroGenetics Initiative, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA 19146, USA
| | - Antoinette Gelot
- AP-HP, Hôpital Armand-Trousseau, Service d'Anatomie Pathologique, 75012 Paris, France; INMED INSERM U 901 Parc Scientifique de Luminy, 13273 Marseille, France; Centre de Recherche Clinique ConCer-LD, Paris, France
| | - Fernanda C Soardi
- GENE - Núcleo de Genética Médica, Belo Horizonte, MG, Brazil; Departamento de Bioquímica e Imunologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil; Laboratório de Genômica Clínica, Faculdade de Medicina, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
| | | | - Douglas E V Pires
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC 3004, Australia; Systems and Computational Biology, Bio21 Institute, University of Melbourne, 30 Flemington Rd, Parkville, VIC 3052, Australia; School of Computing and Information Systems, University of Melbourne, Melbourne, VIC 3053, Australia
| | - David B Ascher
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC 3004, Australia; Systems and Computational Biology, Bio21 Institute, University of Melbourne, 30 Flemington Rd, Parkville, VIC 3052, Australia; School of Chemistry and Molecular Biology, University of Queensland, St Lucia, QLD 4072, Australia
| | - Alain Schmitt
- INSERM U 1016, Institut Cochin, Paris, France; CNRS UMR 8104, Paris, France; Université Paris Descartes, Sorbonne Paris Cité, Paris, France
| | - Caroline Rambaud
- AP-HP, Hôpital Raymond-Poincaré, Laboratoire Anatomie Pathologique, Garches, France
| | - Alfonso Represa
- INMED, INSERM, Aix-Marseille Université, Campus de Luminy, 13009 Marseille, France
| | - Hongbo M Xie
- Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA 19146, USA
| | - Laina Lusk
- Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; The Epilepsy NeuroGenetics Initiative, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA 19146, USA
| | - Olivia Wilmarth
- Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; The Epilepsy NeuroGenetics Initiative, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Pamela Pojomovsky McDonnell
- Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; The Epilepsy NeuroGenetics Initiative, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Neurology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
| | - Olivia A Juarez
- Baylor College of Medicine Genetics Clinic, Children's Hospital of San Antonio, San Antonio, TX, USA
| | - Alexandra N Grace
- Baylor College of Medicine Genetics Clinic, Children's Hospital of San Antonio, San Antonio, TX, USA
| | - Julien Buratti
- AP-HP, Hôpital de la Pitié Salpêtrière, Département de Génétique, 75013 Paris, France
| | - Cyril Mignot
- AP-HP, Hôpital de la Pitié Salpêtrière, Département de Génétique, 75013 Paris, France; Sorbonne Universités, UPMC Univ Paris 06, UMR S 1127, INSERM U 1127, CNRS UMR 7225, ICM, 75013 Paris, France; AP-HP, Hôpital Robert Debré, Service de Neurologie Pediatrique et de Maladies Métaboliques, 75019 Paris, France
| | - Domitille Gras
- AP-HP, Hôpital Robert Debré, Service de Neurologie Pediatrique et de Maladies Métaboliques, 75019 Paris, France
| | - Caroline Nava
- AP-HP, Hôpital de la Pitié Salpêtrière, Département de Génétique, 75013 Paris, France; Sorbonne Universités, UPMC Univ Paris 06, UMR S 1127, INSERM U 1127, CNRS UMR 7225, ICM, 75013 Paris, France; AP-HP, Hôpital Robert Debré, Service de Neurologie Pediatrique et de Maladies Métaboliques, 75019 Paris, France
| | - Samuel R Pierce
- The Epilepsy NeuroGenetics Initiative, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Physical Therapy, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
| | - Boris Keren
- AP-HP, Hôpital de la Pitié Salpêtrière, Département de Génétique, 75013 Paris, France; Sorbonne Universités, UPMC Univ Paris 06, UMR S 1127, INSERM U 1127, CNRS UMR 7225, ICM, 75013 Paris, France; AP-HP, Hôpital Robert Debré, Service de Neurologie Pediatrique et de Maladies Métaboliques, 75019 Paris, France
| | - Benjamin C Kennedy
- Division of Neurosurgery, Children's Hospital of Philadelphia, Philadelphia, PA 19146, USA; Department of Neurosurgery, The University of Pennsylvania, Philadelphia, PA 19104, USA
| | - Sergio D J Pena
- GENE - Núcleo de Genética Médica, Belo Horizonte, MG, Brazil; Departamento de Bioquímica e Imunologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil; Laboratório de Genômica Clínica, Faculdade de Medicina, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
| | - Ingo Helbig
- Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; The Epilepsy NeuroGenetics Initiative, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; Department of Biomedical and Health Informatics, Children's Hospital of Philadelphia, Philadelphia, PA 19146, USA; Department of Neurology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
| | - Vishnu Anand Cuddapah
- Division of Neurology, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA; The Epilepsy NeuroGenetics Initiative, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA.
| |
|
10
|
Iqbal S, Ge F, Li F, Akutsu T, Zheng Y, Gasser RB, Yu DJ, Webb GI, Song J. PROST: AlphaFold2-aware Sequence-Based Predictor to Estimate Protein Stability Changes upon Missense Mutations. J Chem Inf Model 2022; 62:4270-4282. [PMID: 35973091 DOI: 10.1021/acs.jcim.2c00799] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Indexed: 01/17/2023]
Abstract
An essential step in engineering proteins and understanding disease-causing missense mutations is to accurately model protein stability changes when such mutations occur. Here, we developed a new sequence-based predictor for the protein stability (PROST) change (Gibbs free energy change, ΔΔG) upon a single-point missense mutation. PROST extracts multiple descriptors from the most promising sequence-based predictors, such as BoostDDG, SAAFEC-SEQ, and DDGun. PROST also extracts descriptors from iFeature and AlphaFold2. The extracted descriptors include sequence-based features, physicochemical properties, evolutionary information, evolutionary-based physicochemical properties, and predicted structural features. The PROST predictor is a weighted average ensemble model based on extreme gradient boosting (XGBoost) decision trees and an extra-trees regressor; PROST is trained on both direct and hypothetical reverse mutations using the S5294 data set (S2647 direct mutations + S2647 inverse mutations). The parameters for the PROST model are optimized using grid searching with 5-fold cross-validation, and feature importance analysis unveils the most relevant features. The performance of PROST is evaluated in a blinded manner, employing nine distinct data sets and existing state-of-the-art sequence-based and structure-based predictors. This method consistently performs well on frataxin, S217, S349, Ssym, S669, Myoglobin, and CAGI5 data sets in blind tests and similarly to the state-of-the-art predictors for p53 and S276 data sets. When the performance of PROST is compared with the latest predictors such as BoostDDG, SAAFEC-SEQ, ACDC-NN-seq, and DDGun, PROST outperforms these predictors. A case study of mutation scanning of the frataxin protein for nine wild-type residues demonstrates the utility of PROST. Taken together, these findings indicate that PROST is a well-suited predictor when no protein structural information is available.
The source code of PROST, data sets, examples, and pretrained models along with how to use PROST are available at https://github.com/ShahidIqb/PROST and https://prost.erc.monash.edu/seq.
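The weighted average ensemble described in this abstract can be illustrated with a minimal sketch. This is not the PROST implementation: the function name, the weights (0.6 and 0.4) and the per-model ΔΔG values below are hypothetical placeholders, and the two input lists merely stand in for the outputs of an XGBoost model and an extra-trees regressor.

```python
from typing import List, Sequence


def weighted_average_ensemble(
    predictions: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> List[float]:
    """Combine per-model ΔΔG predictions with a normalized weighted average.

    predictions[m][i] is the ΔΔG (kcal/mol) predicted by model m for mutation i.
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n = len(predictions[0])
    if any(len(p) != n for p in predictions):
        raise ValueError("all models must predict the same mutations")
    # Weighted mean across models, computed independently for each mutation.
    return [
        sum(w * p[i] for w, p in zip(weights, predictions)) / total
        for i in range(n)
    ]


# Hypothetical outputs of two base regressors for three mutations.
xgb_like = [-1.2, 0.4, -0.3]          # stand-in for an XGBoost model
extra_trees_like = [-0.8, 0.0, -0.5]  # stand-in for an extra-trees regressor

combined = weighted_average_ensemble([xgb_like, extra_trees_like], weights=[0.6, 0.4])
print(combined)
```

In practice the weights would be tuned on held-out data (the abstract mentions grid searching with 5-fold cross-validation); the fixed values here are only for illustration.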
Affiliation(s)
- Shahid Iqbal
- Department of Data Science and AI, Faculty of IT, Monash University, Clayton, Victoria 3800, Australia; Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Clayton, Victoria 3800, Australia; Monash Data Futures Institute, Monash University, Clayton, Victoria 3800, Australia
| | - Fang Ge
- School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
| | - Fuyi Li
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Clayton, Victoria 3800, Australia; Monash Data Futures Institute, Monash University, Clayton, Victoria 3800, Australia
| | - Tatsuya Akutsu
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto 611-0011, Japan
| | - Yuanting Zheng
- Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, Victoria 3010, Australia
| | - Robin B Gasser
- Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, Victoria 3010, Australia
| | - Dong-Jun Yu
- School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
| | - Geoffrey I Webb
- Department of Data Science and AI, Faculty of IT, Monash University, Clayton, Victoria 3800, Australia; Monash Data Futures Institute, Monash University, Clayton, Victoria 3800, Australia
| | - Jiangning Song
- Department of Data Science and AI, Faculty of IT, Monash University, Clayton, Victoria 3800, Australia; Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Clayton, Victoria 3800, Australia; Monash Data Futures Institute, Monash University, Clayton, Victoria 3800, Australia
| |
|
11
|
Tastan Bishop Ö, Mutemi Musyoka T, Barozi V. Allostery and missense mutations as intermittently linked promising aspects of modern computational drug discovery. J Mol Biol 2022; 434:167610. [DOI: 10.1016/j.jmb.2022.167610] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/01/2022] [Revised: 04/21/2022] [Accepted: 04/22/2022] [Indexed: 12/15/2022]
|
12
|
Pan Q, Nguyen TB, Ascher DB, Pires DEV. Systematic evaluation of computational tools to predict the effects of mutations on protein stability in the absence of experimental structures. Brief Bioinform 2022; 23:bbac025. [PMID: 35189634 PMCID: PMC9155634 DOI: 10.1093/bib/bbac025] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 07/19/2021] [Revised: 01/13/2022] [Accepted: 01/30/2022] [Indexed: 12/26/2022] Open
Abstract
Changes in protein sequence can have dramatic effects on how proteins fold, their stability and dynamics. Over the last 20 years, pioneering methods have been developed to try to estimate the effects of missense mutations on protein stability, leveraging the growing availability of protein 3D structures. These, however, have been developed and validated using experimentally derived structures and biophysical measurements. A large proportion of protein structures remain to be experimentally elucidated and, while many studies have based their conclusions on predictions made using homology models, there has been no systematic evaluation of the reliability of these tools in the absence of experimental structural data. We have, therefore, systematically investigated the performance and robustness of ten widely used structural methods when presented with homology models built using templates at a range of sequence identity levels (from 15% to 95%) and contrasted performance with sequence-based tools, as a baseline. We found there is indeed performance deterioration on homology models built using templates with sequence identity below 40%, where sequence-based tools might become preferable. This was most marked for mutations in solvent-exposed residues and stabilizing mutations. As structure prediction tools improve, the reliability of these predictors is expected to follow; however, we strongly suggest that these factors should be taken into consideration when interpreting results from structure-based predictors of mutation effects on protein stability.
Affiliation(s)
- Qisheng Pan
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria 3004, Australia
- School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane City, Queensland 4072, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, 30 Flemington Rd, Parkville, Victoria 3052, Australia
| | - Thanh Binh Nguyen
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria 3004, Australia
- School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane City, Queensland 4072, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, 30 Flemington Rd, Parkville, Victoria 3052, Australia
| | - David B Ascher
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria 3004, Australia
- School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane City, Queensland 4072, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, 30 Flemington Rd, Parkville, Victoria 3052, Australia
- Department of Biochemistry, University of Cambridge, 80 Tennis Ct Rd, Cambridge CB2 1GA, UK
| | - Douglas E V Pires
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria 3004, Australia
- School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane City, Queensland 4072, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, 30 Flemington Rd, Parkville, Victoria 3052, Australia
- School of Computing and Information Systems, University of Melbourne, Melbourne, Victoria 3053, Australia
| |
|
13
|
Pancotti C, Benevenuta S, Birolo G, Alberini V, Repetto V, Sanavia T, Capriotti E, Fariselli P. Predicting protein stability changes upon single-point mutation: a thorough comparison of the available tools on a new dataset. Brief Bioinform 2022; 23:6502552. [PMID: 35021190 PMCID: PMC8921618 DOI: 10.1093/bib/bbab555] [Citation(s) in RCA: 38] [Impact Index Per Article: 19.0] [Received: 07/28/2021] [Revised: 11/29/2021] [Accepted: 12/05/2021] [Indexed: 12/13/2022] Open
Abstract
Predicting the difference in thermodynamic stability between protein variants is crucial for protein design and understanding the genotype-phenotype relationships. So far, several computational tools have been created to address this task. Nevertheless, most of them have been trained or optimized on the same and ‘all’ available data, making a fair comparison unfeasible. Here, we introduce a novel dataset, collected and manually cleaned from the latest version of the ThermoMutDB database, consisting of 669 variants not included in the most widely used training datasets. The prediction performance and the ability to satisfy the antisymmetry property by considering both direct and reverse variants were evaluated across 21 different tools. The Pearson correlations of the tested tools were in the ranges of 0.21–0.5 and 0–0.45 for the direct and reverse variants, respectively. When both direct and reverse variants are considered, the antisymmetric methods perform better, achieving a Pearson correlation in the range of 0.51–0.62. The tested methods seem relatively insensitive to the physiological conditions, performing well also on the variants measured with more extreme pH and temperature values. A common issue with all the tested methods is the compression of the ΔΔG predictions toward zero. Furthermore, the thermodynamic stability of the most significantly stabilizing variants was found to be more challenging to predict. This study is the most extensive comparison of prediction methods using an entirely novel set of variants never tested before.
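The antisymmetry property referred to in this abstract states that the stability change of a reverse variant should be the negative of the direct one, ΔΔG(reverse) = −ΔΔG(direct). A minimal sketch of how this could be checked for a predictor follows; it is not the evaluation code of the study. The two measures shown (Pearson correlation between direct and reverse predictions, ideally −1, and the mean of their half-sum, ideally 0) are an assumed, commonly used pair, and all ΔΔG values are hypothetical.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def antisymmetry_bias(direct: Sequence[float], reverse: Sequence[float]) -> float:
    """Mean of (direct + reverse) / 2; zero for a perfectly antisymmetric predictor."""
    return sum(d + r for d, r in zip(direct, reverse)) / (2 * len(direct))


# Hypothetical predicted ΔΔG values (kcal/mol) for four direct variants.
direct = [-1.5, 0.3, -0.7, 1.1]

# A perfectly antisymmetric predictor returns the negated value for each reverse variant.
ideal_reverse = [-d for d in direct]

# A predictor with a constant shift on reverse variants: still perfectly
# anti-correlated, but no longer unbiased.
shifted_reverse = [-d - 0.4 for d in direct]

print(pearson(direct, ideal_reverse), antisymmetry_bias(direct, ideal_reverse))
print(pearson(direct, shifted_reverse), antisymmetry_bias(direct, shifted_reverse))
```

The shifted example shows why the two measures are complementary: a constant offset leaves the correlation at −1 while the bias moves away from zero.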
Affiliation(s)
- Corrado Pancotti
- Department of Medical Sciences, University of Torino, Via Santena 19, 10126 Torino, Italy
| | - Silvia Benevenuta
- Department of Medical Sciences, University of Torino, Via Santena 19, 10126 Torino, Italy
| | - Giovanni Birolo
- Department of Medical Sciences, University of Torino, Via Santena 19, 10126 Torino, Italy
| | - Virginia Alberini
- Department of Medical Sciences, University of Torino, Via Santena 19, 10126 Torino, Italy
| | - Valeria Repetto
- Department of Medical Sciences, University of Torino, Via Santena 19, 10126 Torino, Italy
| | - Tiziana Sanavia
- Department of Medical Sciences, University of Torino, Via Santena 19, 10126 Torino, Italy
| | - Emidio Capriotti
- Department of Pharmacy and Biotechnology (FaBiT), University of Bologna, Bologna, Italy
| | - Piero Fariselli
- Department of Medical Sciences, University of Torino, Via Santena 19, 10126 Torino, Italy
| |
|
14
|
Rodrigues CHM, Pires DEV, Ascher DB. mmCSM-PPI: predicting the effects of multiple point mutations on protein-protein interactions. Nucleic Acids Res 2021; 49:W417-W424. [PMID: 33893812 PMCID: PMC8262703 DOI: 10.1093/nar/gkab273] [Citation(s) in RCA: 35] [Impact Index Per Article: 11.7] [Received: 01/27/2021] [Revised: 03/18/2021] [Accepted: 04/15/2021] [Indexed: 11/16/2022] Open
Abstract
Protein-protein interactions play a crucial role in all cellular functions and biological processes and mutations leading to their disruption are enriched in many diseases. While a number of computational methods to assess the effects of variants on protein-protein binding affinity have been proposed, they are in general limited to the analysis of single point mutations and have been shown to perform poorly on independent test sets. Here, we present mmCSM-PPI, a scalable and effective machine learning model for accurately assessing changes in protein-protein binding affinity caused by single and multiple missense mutations. We expanded our well-established graph-based signatures in order to capture physicochemical and geometrical properties of multiple wild-type residue environments and integrated them with substitution scores and dynamics terms from normal mode analysis. mmCSM-PPI was able to achieve a Pearson's correlation of up to 0.75 (RMSE = 1.64 kcal/mol) under 10-fold cross-validation and 0.70 (RMSE = 2.06 kcal/mol) on a non-redundant blind test, outperforming existing methods. Our method is freely available as a user-friendly and easy-to-use web server and API at http://biosig.unimelb.edu.au/mmcsm_ppi.
Affiliation(s)
- Carlos H M Rodrigues
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Structural Biology and Bioinformatics, Department of Biochemistry and Pharmacology, University of Melbourne, Melbourne, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
| | - Douglas E V Pires
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Structural Biology and Bioinformatics, Department of Biochemistry and Pharmacology, University of Melbourne, Melbourne, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- School of Computing and Information Systems, University of Melbourne, Melbourne, Victoria, Australia
| | - David B Ascher
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Structural Biology and Bioinformatics, Department of Biochemistry and Pharmacology, University of Melbourne, Melbourne, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- Department of Biochemistry, University of Cambridge, Cambridge, UK
| |
|
15
|
Molecular Analysis of Streptomycin Resistance Genes in Clinical Strains of Mycobacterium tuberculosis and Biocomputational Analysis of the MtGidB L101F Variant. Antibiotics (Basel) 2021; 10:antibiotics10070807. [PMID: 34356728 PMCID: PMC8300841 DOI: 10.3390/antibiotics10070807] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/01/2021] [Revised: 06/29/2021] [Accepted: 06/30/2021] [Indexed: 12/30/2022] Open
Abstract
Globally, tuberculosis (TB) remains a prevalent threat to public health. In 2019, TB affected 10 million people and caused 1.4 million deaths. The major challenge for controlling this infectious disease is the emergence and spread of drug-resistant Mycobacterium tuberculosis, the causative agent of TB. The antibiotic streptomycin is not a current first-line anti-TB drug. However, WHO recommends its use in patients infected with a streptomycin-sensitive strain. Several mutations in the M. tuberculosis rpsL, rrs and gidB genes have a proven association with streptomycin resistance. In this study, we performed a molecular analysis of these genes in clinical isolates to determine the prevalence of known or novel mutations. Here, we describe the genetic analysis outcome. Furthermore, a biocomputational analysis of the MtGidB L101F variant, the product of a novel mutation detected in gidB during molecular analysis, is also reported as a theoretical approach to study the apparent genotype-phenotype association.
|
16
|
Iqbal S, Li F, Akutsu T, Ascher DB, Webb GI, Song J. Assessing the performance of computational predictors for estimating protein stability changes upon missense mutations. Brief Bioinform 2021; 22:6289890. [PMID: 34058752 DOI: 10.1093/bib/bbab184] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 02/08/2021] [Revised: 04/07/2021] [Accepted: 04/21/2021] [Indexed: 11/14/2022] Open
Abstract
Understanding how a mutation might affect protein stability is of significant importance to protein engineering and for understanding protein evolution and genetic diseases. While a number of computational tools have been developed to predict the effect of missense mutations on protein stability, they are known to exhibit large biases imparted in part by the data used to train and evaluate them. Here, we provide a comprehensive overview of predictive tools, which has provided an evolving insight into the importance and relevance of features that can discern the effects of mutations on protein stability. A diverse selection of these freely available tools was benchmarked using a large mutation-level blind dataset of 1342 experimentally characterised mutations across 130 proteins from ThermoMutDB, a second test dataset encompassing 630 experimentally characterised mutations across 39 proteins from iStable2.0 and a third blind test dataset consisting of 268 mutations in 27 proteins from the newly published ProThermDB. The performance of the methods was further evaluated with respect to the site of mutation, type of mutant residue and by varying the pH and temperature. Additionally, the classification performance was also evaluated by classifying the mutations as stabilizing (∆∆G ≥ 0) or destabilizing (∆∆G < 0). The results reveal that the performance of the predictors is affected by the site of mutation and the type of mutant residue. Further, the results show very low performance for pH values 6-8 and temperature higher than 65 for all predictors except iStable2.0 on the S630 dataset. To illustrate how stability and structure change upon single point mutation, we considered four stabilizing, two destabilizing and two stabilizing mutations from two proteins, namely the toxin protein and bovine liver cytochrome.
Overall, the results on S268, S630 and S1342 datasets show that the performance of the integrated predictors is better than the mechanistic or individual machine learning predictors. We expect that this paper will provide useful guidance for the design and development of next-generation bioinformatic tools for predicting protein stability changes upon mutations.
Affiliation(s)
- Shahid Iqbal
- Computer System Engineering from Ghulam Ishaq Khan Institute of Engineering Sciences and Technology, Pakistan
| | - Fuyi Li
- Department of Microbiology and Immunology, Peter Doherty Institute for Infection and Immunity, the University of Melbourne, Australia
| | - Tatsuya Akutsu
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Japan
| | | | - Geoffrey I Webb
- Monash Centre for Data Science, Faculty of Information Technology, Monash University, Victoria 3800, Australia
| | - Jiangning Song
- Monash Biomedicine Discovery Institute, Monash University, Australia
| |
|
17
|
Portelli S, Barr L, de Sá AG, Pires DE, Ascher DB. Distinguishing between PTEN clinical phenotypes through mutation analysis. Comput Struct Biotechnol J 2021; 19:3097-3109. [PMID: 34141133 PMCID: PMC8180946 DOI: 10.1016/j.csbj.2021.05.028] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 01/31/2021] [Revised: 04/29/2021] [Accepted: 05/19/2021] [Indexed: 12/28/2022] Open
Abstract
Phosphatase and tensin homolog on chromosome ten (PTEN) germline mutations are associated with an overarching condition known as PTEN hamartoma tumor syndrome. Clinical phenotypes associated with this syndrome range from macrocephaly and autism spectrum disorder (ASD) to Cowden syndrome, which manifests as multiple noncancerous tumor-like growths (hamartomas), and an increased predisposition to certain cancers. The basis by which mutations might lead to these very diverse phenotypic outcomes, however, remains unclear. Here we show that, by considering the molecular consequences of mutations in PTEN on protein structure and function, we can accurately distinguish PTEN mutations exhibiting different phenotypes. Changes in phosphatase activity, protein stability, and intramolecular interactions appeared to be major drivers of clinical phenotype, with cancer-associated variants leading to the most drastic changes, while ASD and non-pathogenic variants were associated with milder and neutral changes, respectively. Importantly, we show via saturation mutagenesis that more than half of variants of unknown significance could be associated with disease phenotypes, while over half of Cowden syndrome mutations likely lead to cancer. These insights can assist in exploring potentially important clinical outcomes delineated by PTEN variation.
Affiliation(s)
- Stephanie Portelli
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
| | - Lucy Barr
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
| | - Alex G.C. de Sá
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Baker Department of Cardiometabolic Health, Melbourne Medical School, University of Melbourne, Melbourne, Victoria, Australia
| | - Douglas E.V. Pires
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- School of Computing and Information Systems, University of Melbourne, Melbourne, Victoria, Australia
| | - David B. Ascher
- Structural Biology and Bioinformatics, Department of Biochemistry, University of Melbourne, Melbourne, Victoria, Australia
- Systems and Computational Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Baker Department of Cardiometabolic Health, Melbourne Medical School, University of Melbourne, Melbourne, Victoria, Australia
- Department of Biochemistry, University of Cambridge, 80 Tennis Ct Rd, Cambridge CB2 1GA, UK
| |
|
18
|
Singh P, Jamal S, Ahmed F, Saqib N, Mehra S, Ali W, Roy D, Ehtesham NZ, Hasnain SE. Computational modeling and bioinformatic analyses of functional mutations in drug target genes in Mycobacterium tuberculosis. Comput Struct Biotechnol J 2021; 19:2423-2446. [PMID: 34025934 PMCID: PMC8113780 DOI: 10.1016/j.csbj.2021.04.034] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 01/13/2021] [Revised: 04/09/2021] [Accepted: 04/15/2021] [Indexed: 11/29/2022] Open
Abstract
MycoTRAP-DB, a database of mutations and their impact on normal functionality of protein in M.tb genes. Several secondary mutations were identified with significant impact on protein structure and function. Comprehensive information gives insight for screening of suspected hotspots in advance to combat drug resistant TB.
Tuberculosis (TB) continues to be the leading cause of deaths due to its persistent drug resistance and the consequent ineffectiveness of anti-TB treatment. Recent years have witnessed a huge amount of sequencing data, revealing mutations responsible for drug resistance. However, the lack of an up-to-date repository remains a barrier towards utilization of these data and identifying major mutations associated with resistance. Amongst all mutations, non-synonymous mutations alter the amino acid sequence of a protein and have a much greater effect on pathogenicity. Hence, this type of gene mutation is of prime interest to the present study. The purpose of this study is to develop an updated database comprising almost all reported substitutions within the Mycobacterium tuberculosis (M.tb) drug target genes rpoB, inhA, katG, pncA, gyrA and gyrB. Various bioinformatics prediction tools were used to assess the structural and biophysical impacts of the resistance-causing non-synonymous single nucleotide polymorphisms (nsSNPs) at the molecular level. This was followed by evaluating the impact of these mutations on binding affinity of the drugs to target proteins. We have developed a comprehensive online resource named MycoTRAP-DB (Mycobacterium tuberculosis Resistance Associated Polymorphisms Database) that connects mutations in genes with their structural, functional and pathogenic implications on protein. This database is accessible at http://139.59.12.92. This integrated platform would enable comprehensive analysis and prioritization of SNPs for the development of improved diagnostics and antimycobacterial medications. Moreover, our study puts forward secondary mutations that can be important for prognostic assessments of drug-resistance mechanism and actionable anti-TB drugs.
Affiliation(s)
- Pooja Singh
- Jamia Hamdard Institute of Molecular Medicine, Jamia Hamdard, New Delhi 110062, India
| | - Salma Jamal
- Jamia Hamdard Institute of Molecular Medicine, Jamia Hamdard, New Delhi 110062, India
| | - Faraz Ahmed
- Jamia Hamdard Institute of Molecular Medicine, Jamia Hamdard, New Delhi 110062, India
| | - Najumu Saqib
- Jamia Hamdard Institute of Molecular Medicine, Jamia Hamdard, New Delhi 110062, India
| | - Seema Mehra
- Jamia Hamdard Institute of Molecular Medicine, Jamia Hamdard, New Delhi 110062, India
| | - Waseem Ali
- Jamia Hamdard Institute of Molecular Medicine, Jamia Hamdard, New Delhi 110062, India
| | - Deodutta Roy
- Department of Environmental and Occupational Health, Florida International University, Miami 33029, USA
| | - Nasreen Z Ehtesham
- ICMR-National Institute of Pathology, Safdarjung Hospital Campus, New Delhi, India
| | - Seyed E Hasnain
- Department of Life Sciences, School of Basic Sciences and Research, Sharda University, Greater Noida 201301, India; Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology, Delhi (IIT-D), Hauz Khas, New Delhi 110016, India
| |
|
19
|
Xavier JS, Nguyen TB, Karmarkar M, Portelli S, Rezende PM, Velloso JPL, Ascher DB, Pires DEV. ThermoMutDB: a thermodynamic database for missense mutations. Nucleic Acids Res 2021; 49:D475-D479. [PMID: 33095862 PMCID: PMC7778973 DOI: 10.1093/nar/gkaa925] [Citation(s) in RCA: 40] [Impact Index Per Article: 13.3] [Received: 08/15/2020] [Revised: 09/21/2020] [Accepted: 10/12/2020] [Indexed: 01/17/2023] Open
Abstract
Proteins are intricate, dynamic structures, and small changes in their amino acid sequences can lead to large effects on their folding, stability and dynamics. To facilitate the further development and evaluation of methods to predict these changes, we have developed ThermoMutDB, a manually curated database containing >14,669 experimental data of thermodynamic parameters for wild type and mutant proteins. This represents an increase of 83% in unique mutations over previous databases and includes thermodynamic information on 204 new proteins. During manual curation we have also corrected annotation errors in previously curated entries. Associated with each entry, we have included information on the unfolding Gibbs free energy and melting temperature change, and have associated entries with available experimental structural information. ThermoMutDB allows users to contribute new data points and supports programmatic access to the database via a RESTful API. ThermoMutDB is freely available at: http://biosig.unimelb.edu.au/thermomutdb.
Affiliation(s)
- Joicymara S Xavier
- Institute of Agricultural Sciences, Universidade Federal dos Vales do Jequitinhonha e Mucuri; Instituto René Rachou, Fundação Oswaldo Cruz
| | | | - Malancha Karmarkar
- Bio 21 Institute, University of Melbourne; Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute
| | - Stephanie Portelli
- Bio 21 Institute, University of Melbourne; Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute
| | | | | | - David B Ascher
- Bio 21 Institute, University of Melbourne; Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute; Department of Biochemistry, University of Cambridge
| | - Douglas E V Pires
- Bio 21 Institute, University of Melbourne.,Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute.,School of Computing and Information Systems, University of Melbourne
|
20
|
Modi T, Campitelli P, Kazan IC, Ozkan SB. Protein folding stability and binding interactions through the lens of evolution: a dynamical perspective. Curr Opin Struct Biol 2020; 66:207-215. [PMID: 33388636 DOI: 10.1016/j.sbi.2020.11.007] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 09/17/2020] [Revised: 11/02/2020] [Accepted: 11/26/2020] [Indexed: 01/06/2023]
Abstract
While the function of a protein depends heavily on its ability to fold into a correct 3D structure, billions of years of evolution have tailored proteins from highly stable objects to flexible molecules as they adapted to environmental changes. Nature maintains the fine balance of protein folding and stability while still evolving towards new function through generations of fine-tuning necessary interactions with other proteins and small molecules. Here we focus on recent computational and experimental studies that shed light onto how evolution molds protein folding and the functional landscape from a conformational dynamics' perspective. Particularly, we explore the importance of dynamic allostery throughout protein evolution and discuss how the protein anisotropic network can give rise to allosteric and epistatic interactions.
Affiliation(s)
- Tushar Modi
- Department of Physics and Center for Biological Physics, Arizona State University, Tempe, AZ 85287-1504, USA
| | - Paul Campitelli
- Department of Physics and Center for Biological Physics, Arizona State University, Tempe, AZ 85287-1504, USA
| | - Ismail Can Kazan
- Department of Physics and Center for Biological Physics, Arizona State University, Tempe, AZ 85287-1504, USA
| | - Sefika Banu Ozkan
- Department of Physics and Center for Biological Physics, Arizona State University, Tempe, AZ 85287-1504, USA.
|
21
|
HARP: a database of structural impacts of systematic missense mutations in drug targets of Mycobacterium leprae. Comput Struct Biotechnol J 2020; 18:3692-3704. [PMID: 33304465 PMCID: PMC7711215 DOI: 10.1016/j.csbj.2020.11.013] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 09/07/2020] [Accepted: 11/08/2020] [Indexed: 12/20/2022] Open
Abstract
Computational Saturation Mutagenesis is an in-silico approach that employs systematic mutagenesis of each amino acid residue in the protein to all other amino acid types, and predicts changes in thermodynamic stability and affinity to the other subunits/protein counterparts, ligands and nucleic acid molecules. The data thus generated are useful in understanding the functional consequences of mutations in antimicrobial resistance phenotypes. In this study, we applied computational saturation mutagenesis to three important drug targets in Mycobacterium leprae (M. leprae) for the drugs dapsone, rifampin and ofloxacin, namely Dihydropteroate Synthase (DHPS), RNA Polymerase (RNAP) and DNA Gyrase (GYR), respectively. M. leprae causes leprosy and is an obligate intracellular bacillus with limited protein structural information associating mutations with phenotypic resistance outcomes in leprosy. Experimentally solved structures of DHPS, RNAP and GYR of M. leprae are not available in the Protein Data Bank; therefore, we modelled the structures of these proteins using template-based comparative modelling and introduced systematic mutations in each model, generating 80,902 mutations and mutant structures for all three proteins. Impacts of mutations on stability and protein-subunit, protein-ligand and protein-nucleic acid affinities were computed using various in-house developed and other published protein stability and affinity prediction software. A consensus impact was estimated for each mutation using qualitative scoring metrics for physicochemical properties and by a categorical grouping of stability and affinity predictions. We developed a web database named HARP (a database of Hansen's Disease Antimicrobial Resistance Profiles), which is accessible at https://harp-leprosy.org and provides the details of each of these predictions.
|
22
|
Tunstall T, Portelli S, Phelan J, Clark TG, Ascher DB, Furnham N. Combining structure and genomics to understand antimicrobial resistance. Comput Struct Biotechnol J 2020; 18:3377-3394. [PMID: 33294134 PMCID: PMC7683289 DOI: 10.1016/j.csbj.2020.10.017] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 05/18/2020] [Revised: 10/15/2020] [Accepted: 10/17/2020] [Indexed: 02/07/2023] Open
Abstract
Antimicrobials against bacterial, viral and parasitic pathogens have transformed human and animal health. Nevertheless, their widespread use (and misuse) has led to the emergence of antimicrobial resistance (AMR) which poses a potentially catastrophic threat to public health and animal husbandry. There are several routes, both intrinsic and acquired, by which AMR can develop. One major route is through non-synonymous single nucleotide polymorphisms (nsSNPs) in coding regions. Large scale genomic studies using high-throughput sequencing data have provided powerful new ways to rapidly detect and respond to such genetic mutations linked to AMR. However, these studies are limited in their mechanistic insight. Computational tools can rapidly and inexpensively evaluate the effect of mutations on protein function and evolution. Subsequent insights can then inform experimental studies, and direct existing or new computational methods. Here we review a range of sequence and structure-based computational tools, focussing on tools successfully used to investigate mutational effect on drug targets in clinically important pathogens, particularly Mycobacterium tuberculosis. Combining genomic results with the biophysical effects of mutations can help reveal the molecular basis and consequences of resistance development. Furthermore, we summarise how the application of such a mechanistic understanding of drug resistance can be applied to limit the impact of AMR.
Affiliation(s)
- Tanushree Tunstall
- Department of Infection Biology, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, UK
| | - Stephanie Portelli
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Australia
- Structural Biology and Bioinformatics, Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Australia
| | - Jody Phelan
- Department of Infection Biology, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, UK
| | - Taane G. Clark
- Department of Infection Biology, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, UK
- Department of Infectious Disease Epidemiology, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, UK
| | - David B. Ascher
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Australia
- Structural Biology and Bioinformatics, Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Australia
| | - Nicholas Furnham
- Department of Infection Biology, London School of Hygiene and Tropical Medicine, Keppel Street, London WC1E 7HT, UK
|
23
|
Portelli S, Myung Y, Furnham N, Vedithi SC, Pires DEV, Ascher DB. Prediction of rifampicin resistance beyond the RRDR using structure-based machine learning approaches. Sci Rep 2020; 10:18120. [PMID: 33093532 PMCID: PMC7581776 DOI: 10.1038/s41598-020-74648-y] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 07/11/2020] [Accepted: 09/21/2020] [Indexed: 01/23/2023] Open
Abstract
Rifampicin resistance is a major therapeutic challenge, particularly in tuberculosis, leprosy, P. aeruginosa and S. aureus infections, where it develops via missense mutations in the gene rpoB. Previously we have highlighted that these mutations reduce protein affinities within the RNA polymerase complex, subsequently reducing nucleic acid affinity. Here, we have used these insights to develop a computational rifampicin resistance predictor capable of identifying resistant mutations even outside the well-defined rifampicin resistance determining region (RRDR), using clinical M. tuberculosis sequencing information. Our tool correctly identified up to 90.9% of M. tuberculosis rpoB variants, with sensitivity of 92.2%, specificity of 83.6% and MCC of 0.69, outperforming the current gold-standard GeneXpert-MTB/RIF. We show our model can be translated to other clinically relevant organisms: M. leprae, P. aeruginosa and S. aureus, despite weak sequence identity. Our method was implemented as an interactive tool, SUSPECT-RIF (StrUctural Susceptibility PrEdiCTion for RIFampicin), freely available at https://biosig.unimelb.edu.au/suspect_rif/ .
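For readers less familiar with the metrics quoted in this abstract, the following is a minimal sketch of how sensitivity, specificity, accuracy and the Matthews correlation coefficient (MCC) are derived from a binary confusion matrix. The function name and the counts are hypothetical, chosen only so the arithmetic is easy to follow; they are not data from the SUSPECT-RIF study.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # fraction of resistant variants detected
    specificity = tn / (tn + fp)          # fraction of susceptible variants detected
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally reported as 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts (illustration only, not taken from the paper).
example = binary_metrics(tp=90, fp=10, tn=40, fn=10)
print(example)
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is one reason it is often preferred for imbalanced resistance datasets.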
Affiliation(s)
- Stephanie Portelli
- Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Victoria, 3010, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, 3004, VIC, Australia
| | - Yoochan Myung
- Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Victoria, 3010, Australia
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, 3004, VIC, Australia
| | - Nicholas Furnham
- Department of Infection Biology, London School of Hygiene and Tropical Medicine, Keppel Street, London, WC1E 7HT, UK
| | | | - Douglas E V Pires
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, 3004, VIC, Australia
- School of Computing and Information Systems, University of Melbourne, Victoria, 3010, Australia
| | - David B Ascher
- Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Victoria, 3010, Australia.
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, 3004, VIC, Australia.
- Department of Biochemistry, University of Cambridge, Cambridge, UK.
|
24
|
Myung Y, Rodrigues CHM, Ascher DB, Pires DEV. mCSM-AB2: guiding rational antibody design using graph-based signatures. Bioinformatics 2020; 36:1453-1459. [PMID: 31665262 DOI: 10.1093/bioinformatics/btz779] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 05/09/2019] [Revised: 10/07/2019] [Accepted: 10/23/2019] [Indexed: 12/11/2022] Open
Abstract
MOTIVATION A lack of accurate computational tools to guide rational mutagenesis has made affinity maturation a recurrent challenge in antibody (Ab) development. We previously showed that graph-based signatures can be used to predict the effects of mutations on Ab binding affinity. RESULTS Here we present an updated and refined version of this approach, mCSM-AB2, capable of accurately modelling the effects of mutations on Ab-antigen binding affinity, through the inclusion of evolutionary and energetic terms. Using a new and expanded database of over 1800 mutations with experimental binding measurements and structural information, mCSM-AB2 achieved a Pearson's correlation of 0.73 and 0.77 across training and blind tests, respectively, outperforming available methods currently used for rational Ab engineering. AVAILABILITY AND IMPLEMENTATION mCSM-AB2 is available as a user-friendly and freely accessible web server providing rapid analysis of either individual mutations or the entire binding interface to guide rational antibody affinity maturation at http://biosig.unimelb.edu.au/mcsm_ab2. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yoochan Myung
- Department of Biochemistry and Molecular Biology.,ACRF Facility for Innovative Cancer Drug Discovery, Bio21 Institute, University of Melbourne, Melbourne, VIC 3010, Australia.,Structural Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC 3004, Australia
| | - Carlos H M Rodrigues
- Department of Biochemistry and Molecular Biology.,ACRF Facility for Innovative Cancer Drug Discovery, Bio21 Institute, University of Melbourne, Melbourne, VIC 3010, Australia.,Structural Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC 3004, Australia
| | - David B Ascher
- Department of Biochemistry and Molecular Biology.,ACRF Facility for Innovative Cancer Drug Discovery, Bio21 Institute, University of Melbourne, Melbourne, VIC 3010, Australia.,Structural Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC 3004, Australia.,Department of Biochemistry, University of Cambridge, Cambridge CB2 1GA, UK
| | - Douglas E V Pires
- Department of Biochemistry and Molecular Biology.,ACRF Facility for Innovative Cancer Drug Discovery, Bio21 Institute, University of Melbourne, Melbourne, VIC 3010, Australia.,Structural Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, VIC 3004, Australia.,School of Computing and Information Systems, University of Melbourne, Melbourne, VIC 3010, Australia
|
25
|
Rodrigues CHM, Pires DEV, Ascher DB. DynaMut2: Assessing changes in stability and flexibility upon single and multiple point missense mutations. Protein Sci 2020; 30:60-69. [PMID: 32881105 PMCID: PMC7737773 DOI: 10.1002/pro.3942] [Citation(s) in RCA: 239] [Impact Index Per Article: 59.8] [Received: 06/30/2020] [Revised: 08/27/2020] [Accepted: 08/28/2020] [Indexed: 12/11/2022]
Abstract
Predicting the effect of missense variations on protein stability and dynamics is important for understanding their role in diseases, and the link between protein structure and function. Approaches to estimate these changes have been proposed, but most only consider single-point missense variants and a static state of the protein, while those that incorporate dynamics are computationally expensive. Here we present DynaMut2, a web server that combines Normal Mode Analysis (NMA) methods to capture protein motion and our graph-based signatures to represent the wildtype environment to investigate the effects of single and multiple point mutations on protein stability and dynamics. DynaMut2 was able to accurately predict the effects of missense mutations on protein stability, achieving Pearson's correlation of up to 0.72 (RMSE: 1.02 kcal/mol) on single-point and 0.64 (RMSE: 1.80 kcal/mol) on multiple-point missense mutations across 10-fold cross-validation and independent blind tests. For single-point mutations, DynaMut2 achieved comparable performance with other methods when predicting variations in Gibbs Free Energy (ΔΔG) and in melting temperature (ΔTm). We anticipate our tool to be a valuable suite for protein flexibility analysis and for studying the role of variants in disease. DynaMut2 is freely available as a web server and API at http://biosig.unimelb.edu.au/dynamut2.
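The two figures of merit reported here, Pearson's correlation and RMSE, can be illustrated with a short sketch. The helper names and the ΔΔG values below are hypothetical and serve only to show the arithmetic; they are not DynaMut2 outputs or benchmark data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rmse(xs, ys):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(xs, ys)) / len(xs))


# Hypothetical experimental and predicted ddG values in kcal/mol (illustration only).
experimental = [-1.0, 0.0, 1.0, 2.0]
predicted = [-0.5, 0.5, 0.5, 1.5]

r = pearson_r(experimental, predicted)
error = rmse(experimental, predicted)
print(f"Pearson r = {r:.3f}, RMSE = {error:.2f} kcal/mol")
```

The two measures are complementary: correlation captures whether predictions rank mutations correctly, while RMSE captures how far off the predicted magnitudes are.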
Affiliation(s)
- Carlos H M Rodrigues
- Structural Biology and Bioinformatics, Department of Biochemistry, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia.,Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
| | - Douglas E V Pires
- Structural Biology and Bioinformatics, Department of Biochemistry, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia.,Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia.,School of Computing and Information Systems, University of Melbourne, Melbourne, Victoria, Australia
| | - David B Ascher
- Structural Biology and Bioinformatics, Department of Biochemistry, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia.,Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia.,Department of Biochemistry, University of Cambridge, Cambridge, UK
|
26
|
Abstract
Mutations in protein-coding regions can lead to large biological changes and are associated with genetic conditions, including cancers and Mendelian diseases, as well as drug resistance. Although whole genome and exome sequencing help to elucidate potential genotype-phenotype correlations, there is a large gap between the identification of new variants and deciphering their molecular consequences. A comprehensive understanding of these mechanistic consequences is crucial to better understand and treat diseases in a more personalized and effective way. This is particularly relevant considering estimates that over 80% of mutations associated with a disease are incorrectly assumed to be causative. A thorough analysis of potential effects of mutations is required to correctly identify the molecular mechanisms of disease and enable the distinction between disease-causing and non-disease-causing variation within a gene. Here we present an overview of our integrative mutation analysis platform, which focuses on refining the current genotype-phenotype correlation methods by using the wealth of protein structural information.
|
27
|
Pires DEV, Ascher DB. mycoCSM: Using Graph-Based Signatures to Identify Safe Potent Hits against Mycobacteria. J Chem Inf Model 2020; 60:3450-3456. [PMID: 32615035 DOI: 10.1021/acs.jcim.0c00362] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Indexed: 12/16/2022]
Abstract
Development of new potent, safe drugs to treat Mycobacteria has proven to be challenging, with limited hit rates of initial screens restricting subsequent development efforts. Despite significant efforts and the evolution of quantitative structure-activity relationship as well as machine learning-based models for computationally predicting molecule bioactivity, there is an unmet need for efficient and reliable methods for identifying biologically active compounds against Mycobacterium that are also safe for humans. Here we developed mycoCSM, a graph-based signature approach to rapidly identify compounds likely to be active against bacteria from the genus Mycobacterium, or against specific Mycobacteria species. mycoCSM was trained and validated on eight organism-specific and for the first time a general Mycobacteria data set, achieving correlation coefficients of up to 0.89 on cross-validation and 0.88 on independent blind tests, when predicting bioactivity in terms of minimum inhibitory concentration. In addition, we also developed a predictor to identify those compounds likely to penetrate in necrotic tuberculosis foci, which achieved a correlation coefficient of 0.75. Together with a built-in estimator of the maximum tolerated dose in humans, we believe this method will provide a valuable resource to enrich screening libraries with potent, safe molecules. To provide simple guidance in the selection of libraries with favorable anti-Mycobacteria properties, we made mycoCSM freely available online at http://biosig.unimelb.edu.au/myco_csm.
Affiliation(s)
- Douglas E V Pires
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, 75 Commercial Road, Melbourne 3004, VIC, Australia.,Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, 30 Flemington Rd, Parkville 3052, VIC, Australia.,School of Computing and Information Systems, University of Melbourne, Parkville 3052, VIC, Australia
| | - David B Ascher
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, 75 Commercial Road, Melbourne 3004, VIC, Australia.,Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, 30 Flemington Rd, Parkville 3052, VIC, Australia.,Department of Biochemistry, University of Cambridge, 80 Tennis Court Road, Cambridge CB2 1GA, England
|
28
|
Karmakar M, Rodrigues CHM, Horan K, Denholm JT, Ascher DB. Structure guided prediction of Pyrazinamide resistance mutations in pncA. Sci Rep 2020; 10:1875. [PMID: 32024884 PMCID: PMC7002382 DOI: 10.1038/s41598-020-58635-x] [Citation(s) in RCA: 50] [Impact Index Per Article: 12.5] [Received: 07/11/2019] [Accepted: 11/28/2019] [Indexed: 11/29/2022] Open
Abstract
Pyrazinamide plays an important role in tuberculosis treatment; however, its use is complicated by side-effects and challenges with reliable drug susceptibility testing. Resistance to pyrazinamide is largely driven by mutations in pyrazinamidase (pncA), responsible for drug activation, but genetic heterogeneity has hindered development of a molecular diagnostic test. We proposed to use information on how variants were likely to affect the 3D structure of pncA to identify variants likely to lead to pyrazinamide resistance. We curated 610 pncA mutations with high confidence experimental and clinical information on pyrazinamide susceptibility. The molecular consequences of each mutation on protein stability, conformation, and interactions were computationally assessed using our comprehensive suite of graph-based signature methods, mCSM. The molecular consequences of the variants were used to train a classifier with an accuracy of 80%. Our model was tested against internationally curated clinical datasets, achieving up to 85% accuracy. Screening of 600 Victorian clinical isolates identified a set of previously unreported variants, for which our model showed 71% agreement with drug susceptibility testing. Here, we have shown the 3D structure of pncA can be used to accurately identify pyrazinamide resistance mutations. SUSPECT-PZA is freely available at: http://biosig.unimelb.edu.au/suspect_pza/.
Affiliation(s)
- Malancha Karmakar
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
- Victorian Tuberculosis Program, Melbourne Health and Department of Microbiology and Immunology, University of Melbourne, Melbourne, Victoria, Australia
| | - Carlos H M Rodrigues
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia
- Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia
| | - Kristy Horan
- Microbiological Diagnostic Unit Public Health Laboratory, University of Melbourne at The Peter Doherty Institute for Infection & Immunity, Melbourne, Victoria, Australia
| | - Justin T Denholm
- Victorian Tuberculosis Program, Melbourne Health and Department of Microbiology and Immunology, University of Melbourne, Melbourne, Victoria, Australia
| | - David B Ascher
- Computational Biology and Clinical Informatics, Baker Heart and Diabetes Institute, Melbourne, Victoria, Australia.
- Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Melbourne, Victoria, Australia.
- Department of Biochemistry, University of Cambridge, Cambridge, CB2 1GA, UK.
|
29
|
Udhaya Kumar S, Thirumal Kumar D, Mandal PD, Sankar S, Haldar R, Kamaraj B, Walter CEJ, Siva R, George Priya Doss C, Zayed H. Comprehensive in silico screening and molecular dynamics studies of missense mutations in Sjogren-Larsson syndrome associated with the ALDH3A2 gene. Adv Protein Chem Struct Biol 2020; 120:349-377. [PMID: 32085885 DOI: 10.1016/bs.apcsb.2019.11.004] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Indexed: 12/12/2022]
Abstract
Sjögren-Larsson syndrome (SLS) is an autoimmune disorder inherited in an autosomal recessive pattern. To date, 80 missense mutations have been identified in association with the Aldehyde Dehydrogenase 3 Family Member A2 (ALDH3A2) gene causing SLS. Disruption of the function of ALDH3A2 leads to excessive accumulation of fat in the cells, which interferes with the normal function of protective membranes or materials that are necessary for the body to function normally. We retrieved 54 missense mutations in the ALDH3A2 from the OMIM, UniProt, dbSNP, and HGMD databases that are known to cause SLS. These mutations were examined with various in silico stability tools, which predicted that the mutations p.S308N and p.R423H that are located at the protein-protein interaction domains are the most destabilizing. Furthermore, to determine the atomistic-level differences within the protein-protein interactions owing to mutations, we performed macromolecular simulation (MMS) using GROMACS to validate the motion patterns and dynamic behavior of the biological system. We found that both mutations (p.S380N and p.R423H) had significant effects on the protein-protein interaction and disrupted the dimeric interactions. The computational pipeline provided in this study helps to elucidate the potential structural and functional differences between the ALDH3A2 native and mutant homodimeric proteins, and will pave the way for drug discovery against specific targets in the SLS patients.
Affiliation(s)
- S Udhaya Kumar
- School of BioSciences and Technology, Vellore Institute of Technology, Vellore, Tamil Nadu, India
| | - D Thirumal Kumar
- School of BioSciences and Technology, Vellore Institute of Technology, Vellore, Tamil Nadu, India
| | - Pinky D Mandal
- School of BioSciences and Technology, Vellore Institute of Technology, Vellore, Tamil Nadu, India
| | - Srivarshini Sankar
- School of BioSciences and Technology, Vellore Institute of Technology, Vellore, Tamil Nadu, India
| | - Rishin Haldar
- School of Computer Science and Engineering, Vellore Institute of Technology, Vellore, Tamil Nadu, India
| | - Balu Kamaraj
- Department of Neuroscience Technology, College of Applied Medical Sciences, Imam Abdulrahman Bin Faisal University, Jubail, Saudi Arabia
| | - Charles Emmanuel Jebaraj Walter
- Department of Biotechnology, Sri Ramachandra Institute of Higher Education and Research (Deemed to be University), Chennai, India
| | - R Siva
- School of BioSciences and Technology, Vellore Institute of Technology, Vellore, Tamil Nadu, India
| | - C George Priya Doss
- School of BioSciences and Technology, Vellore Institute of Technology, Vellore, Tamil Nadu, India
| | - Hatem Zayed
- Department of Biomedical Sciences, College of Health and Sciences, Qatar University, Doha, Qatar
|
30
|
A Comprehensive Computational Platform to Guide Drug Development Using Graph-Based Signature Methods. Methods Mol Biol 2020. [PMID: 32006280 DOI: 10.1007/978-1-0716-0270-6_7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/08/2023]
Abstract
High-throughput computational techniques have become invaluable tools to help increase the overall success and process efficiency of drug development and to reduce its associated costs. By designing ligands tailored to specific protein structures in a disease of interest, an understanding of molecular interactions and ways to optimize them can be achieved prior to chemical synthesis. This understanding can help direct crucial chemical and biological experiments by focusing available resources on higher quality leads. Moreover, predicting molecular binding affinity within specific biological contexts, as well as ligand pharmacokinetics and toxicities, can aid in filtering out redundant leads early on within the process. We describe a set of computational tools which can aid in drug discovery at different stages, from hit identification (EasyVS) to lead optimization and candidate selection (CSM-lig, mCSM-lig, Arpeggio, pkCSM). Incorporating these tools along the drug development process can help ensure that candidate leads are chemically and biologically feasible to become successful and tractable drugs.
|
31
|
Vedithi SC, Rodrigues CHM, Portelli S, Skwark MJ, Das M, Ascher DB, Blundell TL, Malhotra S. Computational saturation mutagenesis to predict structural consequences of systematic mutations in the beta subunit of RNA polymerase in Mycobacterium leprae. Comput Struct Biotechnol J 2020; 18:271-286. [PMID: 32042379 PMCID: PMC7000446 DOI: 10.1016/j.csbj.2020.01.002] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 07/18/2019] [Revised: 01/03/2020] [Accepted: 01/07/2020] [Indexed: 11/26/2022] Open
Abstract
Rifampin resistance in leprosy may remain undetected due to the lack of rapid and effective diagnostic tools. A quick and reliable method is essential to determine the impacts of emerging detrimental mutations in the drug targets. The functional consequences of missense mutations in the β-subunit of RNA polymerase (RNAP) in Mycobacterium leprae (M. leprae) contribute to phenotypic resistance to rifampin in leprosy. Here, we report in-silico saturation mutagenesis of all residues in the β-subunit of RNAP to all other 19 amino acid types (generating 21,394 mutations for 1126 residues) and predict their impacts on overall thermodynamic stability, on interactions at subunit interfaces, and on β-subunit-RNA and rifampin affinities (only for the rifampin binding site) using state-of-the-art structure, sequence and normal mode analysis-based methods. Mutations in the conserved residues that line the active-site cleft show largely destabilizing effects, resulting in increased relative solvent accessibility and a concomitant decrease in residue-depth (the extent to which a residue is buried in the protein structure space) of the mutant residues. The mutations at residue positions S437, G459, H451, P489, K884 and H1035 are identified as extremely detrimental as they induce highly destabilizing effects on the overall protein stability, and nucleic acid and rifampin affinities. Destabilizing effects were predicted for all the clinically/experimentally identified rifampin-resistant mutations in M. leprae, indicating that this model can be used as a surveillance tool to monitor emerging detrimental mutations that destabilise RNAP-rifampin interactions and confer rifampin resistance in leprosy.
Author summary
The emergence of primary and secondary drug resistance to rifampin in leprosy is a growing concern and poses a threat to leprosy control and elimination measures globally. In the absence of an effective in-vitro system to detect and monitor phenotypic resistance to rifampin in leprosy, diagnosis mainly relies on the presence of mutations in drug resistance determining regions of the rpoB gene that encodes the β-subunit of RNAP in M. leprae. Few labs in the world perform mouse foot pad propagation of M. leprae in the presence of drugs (rifampin) to determine growth patterns and confirm resistance; however, these methods take 8 to 12 months, making them impractical for diagnosis. Understanding molecular mechanisms of drug resistance is vital to associating mutations with clinically detected drug resistance in leprosy. Here we propose an in-silico saturation mutagenesis approach to comprehensively elucidate the structural implications of any mutations that exist or that can arise in the β-subunit of RNAP in M. leprae. Most of the predicted mutations may not occur in M. leprae due to fitness costs, but the information generated by this approach helps decipher the impacts of mutations across the structure and conversely enables identification of stable regions in the protein that are least impacted by mutations (mutation coolspots), which can be a potential choice for small molecule binding and structure guided drug discovery.
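The mutation count in this abstract follows directly from the enumeration: each of the 1126 residues is substituted by the 19 other standard amino acids, giving 1126 × 19 = 21,394 single-point mutations. A minimal sketch of that enumeration step is shown below; the function name and the three-residue toy sequence are hypothetical, and the sketch covers only the listing of mutations, not the stability or affinity predictions the authors ran on them.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes


def saturation_mutations(sequence, start=1):
    """Yield (wild_type, position, mutant) for every single-residue substitution."""
    for offset, wild_type in enumerate(sequence):
        if wild_type not in AMINO_ACIDS:
            raise ValueError(f"non-standard residue {wild_type!r} at {start + offset}")
        for mutant in AMINO_ACIDS:
            if mutant != wild_type:
                yield wild_type, start + offset, mutant


# Toy three-residue sequence (hypothetical, for illustration only).
labels = [f"{wt}{pos}{mut}" for wt, pos, mut in saturation_mutations("MKT")]
print(len(labels), labels[:3])

# The same rule applied to the residue count quoted in the abstract.
n_residues = 1126
total_mutations = n_residues * (len(AMINO_ACIDS) - 1)
print(total_mutations)
```

Each label uses the conventional wild-type, position, mutant notation (for example `M1A`), which is the form most stability predictors accept as input.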
Affiliation(s)
- Carlos H M Rodrigues
- Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Parkville, VIC 3052, Australia
- Structural Biology and Bioinformatics, Baker Heart and Diabetes Institute, Melbourne, VIC 3004, Australia
- Stephanie Portelli
- Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Parkville, VIC 3052, Australia
- Structural Biology and Bioinformatics, Baker Heart and Diabetes Institute, Melbourne, VIC 3004, Australia
- Marcin J Skwark
- Department of Biochemistry, University of Cambridge, Tennis Court Rd., CB2 1GA, UK
- Madhusmita Das
- Molecular Biology Laboratory, Schieffelin Institute of Health-Research and Leprosy Center, Karigiri, Vellore, Tamil Nadu 632106, India
- David B Ascher
- Department of Biochemistry, University of Cambridge, Tennis Court Rd., CB2 1GA, UK
- Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Parkville, VIC 3052, Australia
- Structural Biology and Bioinformatics, Baker Heart and Diabetes Institute, Melbourne, VIC 3004, Australia
- Tom L Blundell
- Department of Biochemistry, University of Cambridge, Tennis Court Rd., CB2 1GA, UK
- Sony Malhotra
- Department of Biochemistry, University of Cambridge, Tennis Court Rd., CB2 1GA, UK
|
32
|
Pandurangan AP, Blundell TL. Prediction of impacts of mutations on protein structure and interactions: SDM, a statistical approach, and mCSM, using machine learning. Protein Sci 2020; 29:247-257. [PMID: 31693276 PMCID: PMC6933854 DOI: 10.1002/pro.3774] [Citation(s) in RCA: 46] [Impact Index Per Article: 11.5] [Received: 08/30/2019] [Revised: 10/31/2019] [Accepted: 10/31/2019] [Indexed: 02/02/2023]
Abstract
Next-generation sequencing methods have not only allowed an understanding of genome sequence variation during the evolution of organisms but have also provided invaluable information about genetic variants in inherited disease and the emergence of resistance to drugs in cancers and infectious disease. A challenge is to distinguish mutations that are drivers of disease or drug resistance from passengers that are neutral or even selectively advantageous to the organism. This requires an understanding of the impacts of missense mutations on gene expression and regulation, and on the disruption of protein function by modulating protein stability or disturbing interactions with proteins, nucleic acids, small molecule ligands, and other biological molecules. Experimental approaches to understanding differences between wild-type and mutant proteins are most accurate but are also time-consuming and costly. Computational tools used to predict the impacts of mutations can provide useful information more quickly. Here, we focus on two widely used structure-based approaches, originally developed in the Blundell lab: site-directed mutator (SDM), a statistical approach to analyze amino acid substitutions, and mutation cutoff scanning matrix (mCSM), which uses graph-based signatures to represent the wild-type structural environment and machine learning to predict the effect of mutations on protein stability. We also describe DUET, which uses machine learning to combine the two approaches. We discuss briefly the development of mCSM for understanding the impacts of mutations on interfaces with other proteins, nucleic acids, and ligands, and we exemplify the wide application of these approaches to understand human genetic disorders and drug resistance mutations relevant to cancer and mycobacterial infections.
STATEMENT FOR A BROADER AUDIENCE: Genetic or somatic changes in genes can lead to mutations in human proteins, which give rise to genetic disorders or cancer, or to genes of pathogens leading to drug resistance. Computer software described here, using statistical approaches or machine learning, uses the information from genome sequencing of humans and pathogens, together with experimental or modeled 3D structures of gene products, the proteins, to predict impacts of mutations in genetic disease, cancer and drug resistance.
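The idea of a cutoff-scanning, graph-based signature mentioned in this abstract can be illustrated with a small sketch: for a series of distance cutoffs, count the atom pairs of each class combination that lie within the cutoff, and concatenate the counts into one feature vector. The atom classes, coordinates, cutoff values, and the cumulative counting used here are assumptions made for illustration; they are not the parameters or the implementation of mCSM itself.

```python
from itertools import combinations
from math import dist


def cutoff_scanning_signature(atoms, cutoffs):
    """Return pair counts per cutoff and class pair as a flat feature vector.

    atoms is a list of (class_label, (x, y, z)) tuples. Counts are cumulative:
    a pair closer than a small cutoff is counted again at every larger cutoff.
    """
    classes = sorted({label for label, _ in atoms})
    class_pairs = [(a, b) for i, a in enumerate(classes) for b in classes[i:]]
    signature = []
    for cutoff in cutoffs:
        counts = dict.fromkeys(class_pairs, 0)
        for (label_a, xyz_a), (label_b, xyz_b) in combinations(atoms, 2):
            if dist(xyz_a, xyz_b) <= cutoff:
                counts[tuple(sorted((label_a, label_b)))] += 1
        signature.extend(counts[pair] for pair in class_pairs)
    return signature


# Hypothetical three-atom environment (distances 3, 4 and 5 angstroms).
toy_atoms = [
    ("hydrophobic", (0.0, 0.0, 0.0)),
    ("polar", (3.0, 0.0, 0.0)),
    ("hydrophobic", (0.0, 4.0, 0.0)),
]
signature = cutoff_scanning_signature(toy_atoms, cutoffs=[3.5, 4.5, 5.5])
print(signature)  # [0, 1, 0, 1, 1, 0, 1, 2, 0]
```

A vector of this kind, computed around the mutated residue, is the sort of input a supervised regressor could be trained on to predict a stability change.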
Affiliation(s)
- Arun Prasad Pandurangan
- Department of Biochemistry, University of Cambridge, Cambridge, UK
- MRC Laboratory of Molecular Biology, Cambridge, UK
- Tom L. Blundell
- Department of Biochemistry, University of Cambridge, Cambridge, UK
|
33
|
dendPoint: a web resource for dendrimer pharmacokinetics investigation and prediction. Sci Rep 2019; 9:15465. [PMID: 31664080 PMCID: PMC6820739 DOI: 10.1038/s41598-019-51789-3] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 09/14/2018] [Accepted: 09/24/2019] [Indexed: 01/01/2023] Open
Abstract
Nanomedicine development currently suffers from a lack of efficient tools to predict pharmacokinetic behavior without relying upon testing in large numbers of animals, impacting success rates and development costs. This work presents dendPoint, the first in silico model to predict the intravenous pharmacokinetics of dendrimers, a commonly explored drug vector, based on physicochemical properties. We have manually curated the largest relational database of dendrimer pharmacokinetic parameters and their structural/physicochemical properties. This was used to develop a machine learning-based model capable of accurately predicting pharmacokinetic parameters, including half-life, clearance, volume of distribution and dose recovered in the liver and urine. dendPoint successfully predicts dendrimer pharmacokinetic properties, achieving correlations of up to r = 0.83 and Q2 up to 0.68. dendPoint is freely available as a user-friendly web-service and database at http://biosig.unimelb.edu.au/dendpoint. This platform is ultimately expected to be used to guide dendrimer construct design and refinement prior to embarking on more time consuming and expensive in vivo testing.
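The two performance figures quoted in this abstract, a correlation r and a Q2 value, can be computed as in the sketch below. It assumes the common definition of Q2 as 1 - PRESS/TSS evaluated on held-out (cross-validated) predictions; the abstract does not state which definition the authors used, and the numbers are toy values, not dendPoint data.

```python
from math import sqrt


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / sqrt(var_o * var_p)


def q_squared(observed, held_out_predictions):
    """Q2 = 1 - PRESS / TSS, with predictions made on held-out samples."""
    mean_o = sum(observed) / len(observed)
    press = sum((o - p) ** 2 for o, p in zip(observed, held_out_predictions))
    tss = sum((o - mean_o) ** 2 for o in observed)
    return 1 - press / tss


# Toy values (hypothetical), e.g. a pharmacokinetic parameter on a log scale.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.5, 3.5, 3.5]
print(round(pearson_r(observed, predicted), 3))  # 0.894
print(round(q_squared(observed, predicted), 3))  # 0.8
```

The toy case shows why the two numbers differ: r only measures linear association, while Q2 also penalises the absolute size of the prediction errors.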
|
34
|
Karpiyevich M, Adjalley S, Mol M, Ascher DB, Mason B, van der Heden van Noort GJ, Laman H, Ovaa H, Lee MCS, Artavanis-Tsakonas K. Nedd8 hydrolysis by UCH proteases in Plasmodium parasites. PLoS Pathog 2019; 15:e1008086. [PMID: 31658303 PMCID: PMC6837540 DOI: 10.1371/journal.ppat.1008086] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 04/09/2019] [Revised: 11/07/2019] [Accepted: 09/16/2019] [Indexed: 11/19/2022] Open
Abstract
Plasmodium parasites are the causative agents of malaria, a disease with wide public health repercussions. Increasing drug resistance and the absence of a vaccine make finding new chemotherapeutic strategies imperative. Components of the ubiquitin and ubiquitin-like pathways have garnered increased attention as novel targets given their necessity to parasite survival. Understanding how these pathways are regulated in Plasmodium and identifying differences to the host is paramount to selectively interfering with parasites. Here, we focus on Nedd8 modification in Plasmodium falciparum, given its central role to cell division and DNA repair, processes critical to Plasmodium parasites given their unusual cell cycle and requirement for refined repair mechanisms. By applying a functional chemical approach, we show that deNeddylation is controlled by a different set of enzymes in the parasite versus the human host. We elucidate the molecular determinants of the unusual dual ubiquitin/Nedd8 recognition by the essential PfUCH37 enzyme and, through parasite transgenics and drug assays, determine that only its ubiquitin activity is critical to parasite survival. Our experiments reveal interesting evolutionary differences in how neddylation is controlled in higher versus lower eukaryotes, and highlight the Nedd8 pathway as worthy of further exploration for therapeutic targeting in antimalarial drug design. Ubiquitin and ubiquitin-like post-translational modifications are evolutionarily conserved and involved in fundamental cellular processes essential to all eukaryotes. As such, enzymatic components of these pathways present attractive targets for therapeutic intervention for both chronic and communicable diseases. Nedd8 modification of cullin ubiquitin E3 ligases is critical to the viability of eukaryotic organisms and mediates cell cycle progression and DNA damage repair. 
Given the complex lifecycle and unusual replication mechanisms of the malaria parasite, one would expect neddylation to be of central importance to its survival, yet little is known about this pathway in Plasmodium. Here we present our findings on how Nedd8 removal is controlled in Plasmodium falciparum and how this pathway differs to that of its human host.
Affiliation(s)
- Maryia Karpiyevich
- Department of Pathology, University of Cambridge, Cambridge, United Kingdom
- Sophie Adjalley
- Parasites and Microbes Programme, Wellcome Sanger Institute, Cambridge, United Kingdom
- Marco Mol
- Department of Pathology, University of Cambridge, Cambridge, United Kingdom
- David B. Ascher
- Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
- Department of Biochemistry, University of Melbourne, Melbourne, Australia
- Bethany Mason
- Department of Pathology, University of Cambridge, Cambridge, United Kingdom
- Heike Laman
- Department of Pathology, University of Cambridge, Cambridge, United Kingdom
- Huib Ovaa
- Oncode Institute and Department of Cell and Chemical Biology, Leiden University Medical Centre, Leiden, The Netherlands
- Marcus C. S. Lee
- Parasites and Microbes Programme, Wellcome Sanger Institute, Cambridge, United Kingdom
|
35
|
Karmakar M, Globan M, Fyfe JAM, Stinear TP, Johnson PDR, Holmes NE, Denholm JT, Ascher DB. Analysis of a Novel pncA Mutation for Susceptibility to Pyrazinamide Therapy. Am J Respir Crit Care Med 2018; 198:541-544. [PMID: 29694240 DOI: 10.1164/rccm.201712-2572le] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Indexed: 11/16/2022] Open
Affiliation(s)
- Maria Globan
- University of Melbourne, Melbourne, Victoria, Australia
- Melbourne Health, Melbourne, Victoria, Australia
- Janet A M Fyfe
- University of Melbourne, Melbourne, Victoria, Australia
- Melbourne Health, Melbourne, Victoria, Australia
- Paul D R Johnson
- University of Melbourne, Melbourne, Victoria, Australia
- World Health Organization Collaborating Centre for Mycobacterium ulcerans, Melbourne, Victoria, Australia
- Natasha E Holmes
- University of Melbourne, Melbourne, Victoria, Australia
- Austin Health, Heidelberg, Victoria, Australia
- David B Ascher
- University of Melbourne, Melbourne, Victoria, Australia
- University of Cambridge, Cambridge, United Kingdom
|
36
|
Protein stability engineering insights revealed by domain-wide comprehensive mutagenesis. Proc Natl Acad Sci U S A 2019; 116:16367-16377. [PMID: 31371509 DOI: 10.1073/pnas.1903888116] [Citation(s) in RCA: 108] [Impact Index Per Article: 21.6] [Indexed: 12/21/2022] Open
Abstract
The accurate prediction of protein stability upon sequence mutation is an important but unsolved challenge in protein engineering. Large mutational datasets are required to train computational predictors, but traditional methods for collecting stability data are either low-throughput or measure protein stability indirectly. Here, we develop an automated method to generate thermodynamic stability data for nearly every single mutant in a small 56-residue protein. Analysis reveals that most single mutants have a neutral effect on stability, mutational sensitivity is largely governed by residue burial, and unexpectedly, hydrophobics are the best tolerated amino acid type. Correlating the output of various stability-prediction algorithms against our data shows that nearly all perform better on boundary and surface positions than for those in the core and are better at predicting large-to-small mutations than small-to-large ones. We show that the most stable variants in the single-mutant landscape are better identified using combinations of 2 prediction algorithms and including more algorithms can provide diminishing returns. In most cases, poor in silico predictions were tied to compositional differences between the data being analyzed and the datasets used to train the algorithm. Finally, we find that strategies to extract stabilities from high-throughput fitness data such as deep mutational scanning are promising and that data produced by these methods may be applicable toward training future stability-prediction tools.
|
37
|
Munir A, Kumar N, Ramalingam SB, Tamilzhalagan S, Shanmugam SK, Palaniappan AN, Nair D, Priyadarshini P, Natarajan M, Tripathy S, Ranganathan UD, Peacock SJ, Parkhill J, Blundell TL, Malhotra S. Identification and Characterization of Genetic Determinants of Isoniazid and Rifampicin Resistance in Mycobacterium tuberculosis in Southern India. Sci Rep 2019; 9:10283. [PMID: 31311987 PMCID: PMC6635374 DOI: 10.1038/s41598-019-46756-x] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 03/18/2019] [Accepted: 06/28/2019] [Indexed: 02/02/2023] Open
Abstract
Drug-resistant tuberculosis (TB), one of the leading causes of death worldwide, arises mainly from spontaneous mutations in the genome of Mycobacterium tuberculosis. There is an urgent need to understand the mechanisms by which the mutations confer resistance in order to identify new drug targets and to design new drugs. Previous studies have reported numerous mutations that confer resistance to anti-TB drugs, but there has been little systematic analysis to understand their genetic background and the potential impacts on the drug target stability and/or interactions. Here, we report the analysis of whole-genome sequence data for 98 clinical M. tuberculosis isolates from a city in southern India. The collection was screened for phenotypic resistance and sequenced to mine the genetic mutations conferring resistance to isoniazid and rifampicin. The most frequent mutations among isoniazid- and rifampicin-resistant isolates were S315T in katG and S450L in rpoB, respectively. The impacts of mutations on protein stability, protein-protein interactions and protein-ligand interactions were analysed using both statistical and machine-learning approaches. Drug-resistant mutations were predicted not only to target active sites in an orthosteric manner, but also to act through allosteric mechanisms arising from distant sites, sometimes at the protein-protein interface.
Affiliation(s)
- Asma Munir
- Department of Biochemistry, University of Cambridge, Tennis Court Rd., Cambridge, CB2 1GA, UK
- Narender Kumar
- Department of Medicine, University of Cambridge, Hills Rd., Cambridge, CB2 0QQ, UK
- Suresh Babu Ramalingam
- ICMR-National Institute for Research in Tuberculosis, Chennai, 600031, India
- Sembulingam Tamilzhalagan
- ICMR-National Institute for Research in Tuberculosis, Chennai, 600031, India
- Siva Kumar Shanmugam
- ICMR-National Institute for Research in Tuberculosis, Chennai, 600031, India
- Dina Nair
- ICMR-National Institute for Research in Tuberculosis, Chennai, 600031, India
- Padma Priyadarshini
- ICMR-National Institute for Research in Tuberculosis, Chennai, 600031, India
- Mohan Natarajan
- ICMR-National Institute for Research in Tuberculosis, Chennai, 600031, India
- Srikanth Tripathy
- ICMR-National Institute for Research in Tuberculosis, Chennai, 600031, India
- Uma Devi Ranganathan
- ICMR-National Institute for Research in Tuberculosis, Chennai, 600031, India
- Sharon J. Peacock
- Department of Medicine, University of Cambridge, Hills Rd., Cambridge, CB2 0QQ, UK
- London School of Hygiene & Tropical Medicine, Keppel Street, London, WC1E 7HT, UK
- Julian Parkhill
- Wellcome Trust Sanger Institute, Hinxton, Cambridgeshire, CB10 1SA, UK
- Tom L. Blundell
- Department of Biochemistry, University of Cambridge, Tennis Court Rd., Cambridge, CB2 1GA, UK
- Sony Malhotra
- Department of Biochemistry, University of Cambridge, Tennis Court Rd., Cambridge, CB2 1GA, UK
- Present Address: Birkbeck College, University of London, Malet Street, London, WC1E 7HX, UK
|
38
|
Rodrigues CHM, Myung Y, Pires DEV, Ascher DB. mCSM-PPI2: predicting the effects of mutations on protein-protein interactions. Nucleic Acids Res 2019; 47:W338-W344. [PMID: 31114883 PMCID: PMC6602427 DOI: 10.1093/nar/gkz383] [Citation(s) in RCA: 200] [Impact Index Per Article: 40.0] [Received: 02/11/2019] [Revised: 04/30/2019] [Accepted: 05/20/2019] [Indexed: 12/13/2022] Open
Abstract
Protein-protein interactions are involved in most fundamental biological processes, with disease-causing mutations enriched at their interfaces. Here we present mCSM-PPI2, a novel machine learning computational tool designed to more accurately predict the effects of missense mutations on protein-protein interaction binding affinity. mCSM-PPI2 uses graph-based structural signatures to model effects of variations on the inter-residue interaction network, evolutionary information, complex network metrics and energetic terms to generate an optimised predictor. We demonstrate that our method outperforms previous methods, ranking first among 26 others on CAPRI blind tests. mCSM-PPI2 is freely available as a user-friendly webserver at http://biosig.unimelb.edu.au/mcsm_ppi2/.
Affiliation(s)
- Carlos H M Rodrigues
- Department of Biochemistry and Molecular Biology, University of Melbourne, Melbourne, Australia
- ACRF Facility for Innovative Cancer Drug Discovery, Bio21 Institute, University of Melbourne, Melbourne, Australia
- Structural Biology and Bioinformatics, Baker Heart and Diabetes Institute, Melbourne, Australia
- Yoochan Myung
- Department of Biochemistry and Molecular Biology, University of Melbourne, Melbourne, Australia
- ACRF Facility for Innovative Cancer Drug Discovery, Bio21 Institute, University of Melbourne, Melbourne, Australia
- Structural Biology and Bioinformatics, Baker Heart and Diabetes Institute, Melbourne, Australia
- Douglas E V Pires
- Department of Biochemistry and Molecular Biology, University of Melbourne, Melbourne, Australia
- ACRF Facility for Innovative Cancer Drug Discovery, Bio21 Institute, University of Melbourne, Melbourne, Australia
- Structural Biology and Bioinformatics, Baker Heart and Diabetes Institute, Melbourne, Australia
- David B Ascher
- Department of Biochemistry and Molecular Biology, University of Melbourne, Melbourne, Australia
- ACRF Facility for Innovative Cancer Drug Discovery, Bio21 Institute, University of Melbourne, Melbourne, Australia
- Structural Biology and Bioinformatics, Baker Heart and Diabetes Institute, Melbourne, Australia
- Department of Biochemistry, University of Cambridge, Cambridge, UK
|
39
|
Pires DEV, Ascher DB. mCSM-NA: predicting the effects of mutations on protein-nucleic acids interactions. Nucleic Acids Res 2017; 45:W241-W246. [PMID: 28383703 PMCID: PMC5570212 DOI: 10.1093/nar/gkx236] [Citation(s) in RCA: 85] [Impact Index Per Article: 17.0] [Received: 02/09/2017] [Accepted: 04/03/2017] [Indexed: 01/17/2023] Open
Abstract
Over the past two decades, several computational methods have been proposed to predict how missense mutations can affect protein structure and function, either by altering protein stability or interactions with its partners, shedding light on potential molecular mechanisms giving rise to different phenotypes. Effectively and efficiently predicting consequences of mutations on protein–nucleic acid interactions, however, remained until recently a great and unmet challenge. Here we report an updated webserver for mCSM–NA, the only scalable method we are aware of capable of quantitatively predicting the effects of mutations in protein coding regions on nucleic acid binding affinities. We have significantly enhanced the original method by including pharmacophore modelling and information on nucleic acid properties in our graph-based signatures, by considering the reverse mutation, and by using a refined, more reliable data set based on a new release of the ProNIT database, which has significantly improved the reliability and applicability of the methodology. Our new predictive model was capable of achieving a correlation coefficient of up to 0.70 on cross-validation and 0.68 on blind-tests, outperforming its previous version. The server is freely available via a user-friendly web interface at: http://structure.bioc.cam.ac.uk/mcsm_na.
Affiliation(s)
- David B Ascher
- Centro de Pesquisas René Rachou, Fundação Oswaldo Cruz, Brazil
- Department of Biochemistry, University of Cambridge, Cambridge, UK
- Department of Biochemistry and Molecular Biology, University of Melbourne, Melbourne, Australia
|
40
|
Pandurangan AP, Ochoa-Montaño B, Ascher DB, Blundell TL. SDM: a server for predicting effects of mutations on protein stability. Nucleic Acids Res 2017; 45:W229-W235. [PMID: 28525590 PMCID: PMC5793720 DOI: 10.1093/nar/gkx439] [Citation(s) in RCA: 333] [Impact Index Per Article: 66.6] [Received: 02/10/2017] [Accepted: 05/15/2017] [Indexed: 02/02/2023] Open
Abstract
Here, we report a webserver for the improved SDM, used for predicting the effects of mutations on protein stability. As a pioneering knowledge-based approach, SDM has been highlighted as the most appropriate method to use in combination with many other approaches. We have updated the environment-specific amino-acid substitution tables based on the current expanded PDB (a 5-fold increase in information), and introduced new residue-conformation and interaction parameters, including packing density and residue depth. The updated server has been extensively tested using a benchmark containing 2690 point mutations from 132 different protein structures. The revised method correlates well against the hypothetical reverse mutations, better than comparable methods built using machine-learning approaches, highlighting the strength of our knowledge-based approach for identifying stabilising mutations. Given a PDB file (a Protein Data Bank file format containing the 3D coordinates of the protein atoms), and a point mutation, the server calculates the stability difference score between the wildtype and mutant protein. The server is available at http://structure.bioc.cam.ac.uk/sdm2
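The comparison against hypothetical reverse mutations mentioned in this abstract rests on thermodynamic antisymmetry: the stability change of the reverse mutation should be the negative of the forward one. A minimal sketch of that check follows, using invented scores rather than SDM output. The bias measure used here, the mean of forward plus reverse predictions, is one common convention and is an assumption, not something taken from the paper.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical predicted stability changes (kcal/mol) for four mutations
# and for the corresponding reverse mutations on the mutant structures.
forward = [-1.2, 0.5, -2.0, 0.3]
reverse = [1.0, -0.7, 2.2, -0.1]

# A perfectly antisymmetric predictor gives r = -1 and bias = 0.
r_forward_reverse = pearson_r(forward, reverse)
bias = sum(f + r for f, r in zip(forward, reverse)) / len(forward)

print(f"r(forward, reverse) = {r_forward_reverse:.3f}")
print(f"bias = {bias:.3f}")
```

Values of r close to -1 together with a bias close to 0 indicate that the predictor treats stabilising and destabilising mutations consistently.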
Affiliation(s)
- David B Ascher
- Department of Biochemistry, University of Cambridge, Cambridge CB2 1GA, UK
- Department of Biochemistry and Molecular Biology, University of Melbourne, Australia
- Tom L Blundell
- Department of Biochemistry, University of Cambridge, Cambridge CB2 1GA, UK
|
41
|
Empirical ways to identify novel Bedaquiline resistance mutations in AtpE. PLoS One 2019; 14:e0217169. [PMID: 31141524 PMCID: PMC6541270 DOI: 10.1371/journal.pone.0217169] [Citation(s) in RCA: 43] [Impact Index Per Article: 8.6] [Received: 01/31/2019] [Accepted: 05/01/2019] [Indexed: 12/28/2022] Open
Abstract
Clinical resistance against Bedaquiline, the first new anti-tuberculosis compound with a novel mechanism of action in over 40 years, has already been detected in Mycobacterium tuberculosis. As a new drug, however, there is currently insufficient clinical data to facilitate reliable and timely identification of genomic determinants of resistance. Here we investigate the structural basis for M. tuberculosis associated bedaquiline resistance in the drug target, AtpE. Together with the 9 previously identified resistance-associated variants in AtpE, 54 non-resistance-associated mutations were identified through comparisons of bedaquiline susceptibility across 23 different mycobacterial species. Computational analysis of the structural and functional consequences of these variants revealed that resistance associated variants were mainly localized at the drug binding site, disrupting key interactions with bedaquiline leading to reduced binding affinity. This was used to train a supervised predictive algorithm, which accurately identified likely resistance mutations (93.3% accuracy). Application of this model to circulating variants present in the Asia-Pacific region suggests that current circulating variants are likely to be susceptible to bedaquiline. We have made this model freely available through a user-friendly web interface called SUSPECT-BDQ, StrUctural Susceptibility PrEdiCTion for bedaquiline (http://biosig.unimelb.edu.au/suspect_bdq/). This tool could be useful for the rapid characterization of novel clinical variants, to help guide the effective use of bedaquiline, and to minimize the spread of clinical resistance.
|
42
|
Synthesis and Structure-Activity relationship of 1-(5-isoquinolinesulfonyl)piperazine analogues as inhibitors of Mycobacterium tuberculosis IMPDH. Eur J Med Chem 2019; 174:309-329. [PMID: 31055147 PMCID: PMC6990405 DOI: 10.1016/j.ejmech.2019.04.027] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 12/28/2018] [Revised: 04/11/2019] [Accepted: 04/11/2019] [Indexed: 02/06/2023]
Abstract
Tuberculosis (TB) is a major infectious disease associated increasingly with drug resistance. Thus, new anti-tubercular agents with novel mechanisms of action are urgently required for the treatment of drug-resistant TB. In prior work, we identified compound 1 (cyclohexyl(4-(isoquinolin-5-ylsulfonyl)piperazin-1-yl)methanone) and showed that its anti-tubercular activity is attributable to inhibition of inosine-5′-monophosphate dehydrogenase (IMPDH) in Mycobacterium tuberculosis. In the present study, we explored the structure–activity relationship around compound 1 by synthesizing and evaluating the inhibitory activity of analogues against M. tuberculosis IMPDH in biochemical and whole-cell assays. X-ray crystallography was performed to elucidate the mode of binding of selected analogues to IMPDH. We establish the importance of the cyclohexyl, piperazine and isoquinoline rings for activity, and report the identification of an analogue with IMPDH-selective activity against a mutant of M. tuberculosis that is highly resistant to compound 1. We also show that the nitrogen in urea analogues is required for anti-tubercular activity and identify benzylurea derivatives as promising inhibitors that warrant further investigation. Forty-eight analogues of 1-(5-isoquinolinesulfonyl)piperazine were synthesized. Biochemical, whole-cell, and X-ray studies were performed to elucidate the IMPDH inhibition. Piperazine and isoquinoline rings were essential for target-selective whole-cell activity. Compound 47 showed improved IC50 against the MtbIMPDH and maintained on-target whole-cell activity. Compound 21 showed activity against IMPDH in both wild type M. tuberculosis and a resistant mutant of compound 1.
|
43
|
Tang N, Dehury B, Kepp KP. Computing the Pathogenicity of Alzheimer’s Disease Presenilin 1 Mutations. J Chem Inf Model 2019; 59:858-870. [DOI: 10.1021/acs.jcim.8b00896] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Indexed: 11/28/2022]
Affiliation(s)
- Ning Tang
- Department of Chemistry, Technical University of Denmark, DK-2800 Kongens Lyngby, Denmark
- Budheswar Dehury
- Department of Chemistry, Technical University of Denmark, DK-2800 Kongens Lyngby, Denmark
- Kasper P. Kepp
- Department of Chemistry, Technical University of Denmark, DK-2800 Kongens Lyngby, Denmark
|
44
|
Waman VP, Vedithi SC, Thomas SE, Bannerman BP, Munir A, Skwark MJ, Malhotra S, Blundell TL. Mycobacterial genomics and structural bioinformatics: opportunities and challenges in drug discovery. Emerg Microbes Infect 2019; 8:109-118. [PMID: 30866765 PMCID: PMC6334779 DOI: 10.1080/22221751.2018.1561158] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 10/15/2018] [Revised: 12/03/2018] [Accepted: 12/09/2018] [Indexed: 01/08/2023]
Abstract
Of the more than 190 distinct species of the genus Mycobacterium, many are economically and clinically important pathogens of humans or animals. Among the mycobacteria that infect humans, three species, namely Mycobacterium tuberculosis (causative agent of tuberculosis), Mycobacterium leprae (causative agent of leprosy) and Mycobacterium abscessus (causative agent of chronic pulmonary infections), pose a concern to global public health. Although antibiotics have been successfully developed to combat each of these, the emergence of drug-resistant strains is an increasing challenge for treatment and drug discovery. Here we describe the impact of the rapid expansion of genome sequencing and genome/pathway annotations that have greatly improved the progress of structure-guided drug discovery. We focus on the applications of comparative genomics, metabolomics, evolutionary bioinformatics and structural proteomics to identify potential drug targets. The opportunities and challenges for the design of drugs for M. tuberculosis, M. leprae and M. abscessus to combat resistance are discussed.
Affiliation(s)
- Asma Munir
- Department of Biochemistry, University of Cambridge, Cambridge, UK
- Marcin J. Skwark
- Department of Biochemistry, University of Cambridge, Cambridge, UK
- Sony Malhotra
- Institute of Structural and Molecular Biology, Department of Biological Sciences, Birkbeck College, University of London, London, UK
- Tom L. Blundell
- Department of Biochemistry, University of Cambridge, Cambridge, UK
|
45
|
Pires DEV, Rodrigues CHM, Albanaz ATS, Karmakar M, Myung Y, Xavier J, Michanetzi EM, Portelli S, Ascher DB. Exploring Protein Supersecondary Structure Through Changes in Protein Folding, Stability, and Flexibility. Methods Mol Biol 2019; 1958:173-185. [PMID: 30945219 DOI: 10.1007/978-1-4939-9161-7_9] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 12/29/2022]
Abstract
The ability to predict how mutations affect protein structure, folding, and flexibility can elucidate the molecular mechanisms leading to disruption of supersecondary structures and the emergence of phenotypes, as well as guiding rational protein engineering. The advent of fast and accurate computational tools has enabled us to comprehensively explore the landscape of mutation effects on protein structures, prioritizing mutations for rational experimental validation. Here we describe the use of two complementary web-based in silico methods, DUET and DynaMut, developed to infer the effects of mutations on folding, stability, and flexibility, and show how they can be used to explore and interpret these effects on protein supersecondary structures.
Affiliation(s)
- Douglas E V Pires
  - Instituto René Rachou, Fundação Oswaldo Cruz, Rio de Janeiro, Brazil
  - Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Melbourne, VIC, Australia
- Carlos H M Rodrigues
  - Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Melbourne, VIC, Australia
- Malancha Karmakar
  - Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Melbourne, VIC, Australia
- Yoochan Myung
  - Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Melbourne, VIC, Australia
- Joicymara Xavier
  - Instituto René Rachou, Fundação Oswaldo Cruz, Rio de Janeiro, Brazil
- Eleni-Maria Michanetzi
  - Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Melbourne, VIC, Australia
- Stephanie Portelli
  - Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Melbourne, VIC, Australia
- David B Ascher
  - Instituto René Rachou, Fundação Oswaldo Cruz, Rio de Janeiro, Brazil
  - Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Melbourne, VIC, Australia
  - Department of Biochemistry, University of Cambridge, Cambridge, UK

46
Portelli S, Phelan JE, Ascher DB, Clark TG, Furnham N. Understanding molecular consequences of putative drug resistant mutations in Mycobacterium tuberculosis. Sci Rep 2018; 8:15356. [PMID: 30337649 PMCID: PMC6193939 DOI: 10.1038/s41598-018-33370-6] [Citation(s) in RCA: 52] [Impact Index Per Article: 8.7] [Received: 05/22/2018] [Accepted: 09/26/2018] [Indexed: 12/21/2022] Open
Abstract
Genomic studies of Mycobacterium tuberculosis bacteria have revealed loci associated with resistance to anti-tuberculosis drugs. However, the molecular consequences of polymorphism within these candidate loci remain poorly understood. To address this, we have used computational tools to quantify the effects of point mutations conferring resistance to three major anti-tuberculosis drugs, isoniazid (n = 189), rifampicin (n = 201) and D-cycloserine (n = 48), within their primary targets, katG, rpoB, and alr. Notably, mild biophysical effects brought about by high incidence mutations were considered more tolerable, while different structural effects brought about by haplotype combinations reflected differences in their functional importance. Additionally, highly destabilising mutations, such as alr Y388, highlighted the functional importance of the wildtype residue. Our qualitative analysis enabled us to map resistance mutations onto a theoretical landscape linking enthalpic changes with phenotype. Such insights will aid the development of new resistance-resistant drugs and, via an integration into predictive tools, in pathogen surveillance.
Affiliation(s)
- Stephanie Portelli
  - Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Victoria, 3051, Australia
- Jody E Phelan
  - Department of Pathogen Molecular Biology, London School of Hygiene and Tropical Medicine, Keppel Street, London, WC1E 7HT, UK
- David B Ascher
  - Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne, Victoria, 3051, Australia
- Taane G Clark
  - Department of Pathogen Molecular Biology, London School of Hygiene and Tropical Medicine, Keppel Street, London, WC1E 7HT, UK
  - Department of Infectious Disease Epidemiology, London School of Hygiene and Tropical Medicine, Keppel Street, London, WC1E 7HT, UK
- Nicholas Furnham
  - Department of Pathogen Molecular Biology, London School of Hygiene and Tropical Medicine, Keppel Street, London, WC1E 7HT, UK

47
Abayakoon P, Jin Y, Lingford JP, Petricevic M, John A, Ryan E, Wai-Ying Mui J, Pires DE, Ascher DB, Davies GJ, Goddard-Borger ED, Williams SJ. Structural and Biochemical Insights into the Function and Evolution of Sulfoquinovosidases. ACS Central Science 2018; 4:1266-1273. [PMID: 30276262 PMCID: PMC6161063 DOI: 10.1021/acscentsci.8b00453] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 07/11/2018] [Indexed: 06/08/2023]
Abstract
An estimated 10 billion tonnes of sulfoquinovose (SQ) are produced and degraded each year. Prokaryotic sulfoglycolytic pathways catabolize SQ liberated from plant sulfolipid, or its delipidated form α-D-sulfoquinovosyl glycerol (SQGro), through the action of a sulfoquinovosidase (SQase), but little is known about the capacity of SQ glycosides to support growth. Structural studies of the first reported SQase (Escherichia coli YihQ) have identified three conserved residues that are essential for substrate recognition, but crossover mutations exploring active-site residues of predicted SQases from other organisms have yielded inactive mutants, casting doubt on bioinformatic functional assignment. Here, we show that SQGro can support the growth of E. coli on par with D-glucose, and that the E. coli SQase prefers the naturally occurring diastereomer of SQGro. A predicted, but divergent, SQase from Agrobacterium tumefaciens proved to have highly specific activity toward SQ glycosides, and structural, mutagenic, and bioinformatic analyses revealed the molecular coevolution of catalytically important amino acid pairs directly involved in substrate recognition, as well as structurally important pairs distal to the active site. Understanding the defining features of SQases empowers bioinformatic approaches for mapping sulfur metabolism in diverse microbial communities and sheds light on this poorly understood arm of the biosulfur cycle.
Affiliation(s)
- Palika Abayakoon
  - School of Chemistry and Bio21 Molecular Science and Biotechnology Institute, University of Melbourne, Parkville, Victoria 3010, Australia
- Yi Jin
  - York Structural Biology Laboratory, Department of Chemistry, University of York, Heslington YO10 5DD, United Kingdom
- James P. Lingford
  - ACRF Chemical Biology Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria 3010, Australia
  - Department of Medical Biology, University of Melbourne, Parkville, Victoria 3010, Australia
- Marija Petricevic
  - School of Chemistry and Bio21 Molecular Science and Biotechnology Institute, University of Melbourne, Parkville, Victoria 3010, Australia
- Alan John
  - ACRF Chemical Biology Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria 3010, Australia
  - Department of Medical Biology, University of Melbourne, Parkville, Victoria 3010, Australia
- Eileen Ryan
  - School of Chemistry and Bio21 Molecular Science and Biotechnology Institute, University of Melbourne, Parkville, Victoria 3010, Australia
- Janice Wai-Ying Mui
  - School of Chemistry and Bio21 Molecular Science and Biotechnology Institute, University of Melbourne, Parkville, Victoria 3010, Australia
- Douglas E.V. Pires
  - Department of Biochemistry and Molecular Biology, and Bio21 Molecular Science and Biotechnology Institute, University of Melbourne, Parkville, Victoria 3010, Australia
- David B. Ascher
  - Department of Biochemistry and Molecular Biology, and Bio21 Molecular Science and Biotechnology Institute, University of Melbourne, Parkville, Victoria 3010, Australia
- Gideon J. Davies
  - York Structural Biology Laboratory, Department of Chemistry, University of York, Heslington YO10 5DD, United Kingdom
- Ethan D. Goddard-Borger
  - ACRF Chemical Biology Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria 3010, Australia
  - Department of Medical Biology, University of Melbourne, Parkville, Victoria 3010, Australia
- Spencer J. Williams
  - School of Chemistry and Bio21 Molecular Science and Biotechnology Institute, University of Melbourne, Parkville, Victoria 3010, Australia

48
Kulandaisamy A, Srivastava A, Kumar P, Nagarajan R, Priya SB, Gromiha MM. Identification and Analysis of Key Residues in Protein-RNA Complexes. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2018; 15:1436-1444. [PMID: 29993582 DOI: 10.1109/tcbb.2018.2834387] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
Protein-RNA complexes play important roles in various biological processes. The functions of protein-RNA complexes are dictated by their interactions, binding, stability, and affinity. In this work, we have identified the key residues (KRs), which are involved in both stability and binding. We found that 42 percent of the considered proteins share common binding and stabilizing residues, whereas these residues are distinct in 58 percent of the proteins. Overall, 5 percent of stabilizing and 3 percent of binding residues serve as key residues. These residues are enriched with a combination of polar, charged, aliphatic, and aromatic residues. Analysis of subclasses of protein-RNA complexes based on protein structural class, function and RNA type showed that regulatory proteins, and complexes with single-stranded RNA and rRNA, have an appreciable number of key residues. Specifically, Arg, Tyr, and Thr are preferred in most of the subclasses of protein-RNA complexes. In addition, residues with similar chemical behavior have different preferences to be KRs, such that Arg, Tyr, Val, and Thr are preferred over Lys, Trp, Ile, and Ser, respectively. Atomic level contacts revealed that charged and polar-nonpolar contacts are dominant in enzymes, polar in structural, and nonpolar in regulatory proteins. On the other hand, polar-nonpolar contacts are enriched in all these classes of protein-RNA complexes. Further, the influence of sequence and structural features such as conservation score, surrounding hydrophobicity, solvent accessibility, secondary structure, and long-range order on key residues is also discussed. We envisage that the present study provides insights to understand the structural and functional aspects of protein-RNA complexes.
49
Rodrigues CHM, Pires DEV, Ascher DB. DynaMut: predicting the impact of mutations on protein conformation, flexibility and stability. Nucleic Acids Res 2018; 46:W350-W355. [PMID: 29718330 PMCID: PMC6031064 DOI: 10.1093/nar/gky300] [Citation(s) in RCA: 647] [Impact Index Per Article: 107.8] [Received: 01/31/2018] [Revised: 04/03/2018] [Accepted: 04/16/2018] [Indexed: 12/31/2022] Open
Abstract
Proteins are highly dynamic molecules, whose function is intrinsically linked to their molecular motions. Despite the pivotal role of protein dynamics, their computational simulation cost has led to most structure-based approaches for assessing the impact of mutations on protein structure and function relying upon static structures. Here we present DynaMut, a web server implementing two distinct, well-established normal mode approaches, which can be used to analyze and visualize protein dynamics by sampling conformations and to assess the impact of mutations on protein dynamics and stability resulting from vibrational entropy changes. DynaMut integrates our graph-based signatures along with normal mode dynamics to generate a consensus prediction of the impact of a mutation on protein stability. We demonstrate that our approach outperforms alternative approaches to predict the effects of mutations on protein stability and flexibility (P-value < 0.001), achieving a correlation of up to 0.70 on blind tests. DynaMut also provides a comprehensive suite for protein motion and flexibility analysis and visualization via a freely available, user-friendly web server at http://biosig.unimelb.edu.au/dynamut/.
Affiliation(s)
- Carlos HM Rodrigues
  - Department of Biochemistry and Molecular Biology, Bio21 Molecular Science and Biotechnology Institute, University of Melbourne, Australia
- David B Ascher
  - Department of Biochemistry and Molecular Biology, Bio21 Molecular Science and Biotechnology Institute, University of Melbourne, Australia
  - Instituto René Rachou, Fundação Oswaldo Cruz, Brazil
  - Department of Biochemistry, University of Cambridge, UK

50
Rodrigues CHM, Ascher DB, Pires DEV. Kinact: a computational approach for predicting activating missense mutations in protein kinases. Nucleic Acids Res 2018; 46:W127-W132. [PMID: 29788456 PMCID: PMC6031004 DOI: 10.1093/nar/gky375] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 01/31/2018] [Revised: 04/15/2018] [Accepted: 04/28/2018] [Indexed: 12/31/2022] Open
Abstract
Protein phosphorylation is tightly regulated due to its vital role in many cellular processes. While gain-of-function mutations leading to constitutive activation of protein kinases are known to be driver events of many cancers, the identification of these mutations has proven challenging. Here we present Kinact, a novel machine learning approach for predicting kinase activating missense mutations using information from sequence and structure. By adapting our graph-based signatures, Kinact represents both structural and sequence information, which are used as evidence to train predictive models. We show that the combination of structural and sequence features significantly improved the overall accuracy compared to considering either primary or tertiary structure alone, highlighting their complementarity. Kinact achieved precisions of 87% and 94% and areas under the ROC curve of 0.89 and 0.92 on 10-fold cross-validation and on blind tests, respectively, outperforming well-established tools (P < 0.01). We further show that Kinact performs equally well on homology models built using templates with sequence identity as low as 33%. Kinact is freely available as a user-friendly web server at http://biosig.unimelb.edu.au/kinact/.
Affiliation(s)
- Carlos HM Rodrigues
  - Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne
- David B Ascher
  - Department of Biochemistry and Molecular Biology, Bio21 Institute, University of Melbourne
  - Department of Biochemistry, University of Cambridge
  - Instituto René Rachou, Fundação Oswaldo Cruz
