1
Moth CW, Sheehan JH, Mamun AA, Sivley RM, Gulsevin A, Rinker D, Capra JA, Meiler J. VUStruct: a compute pipeline for high throughput and personalized structural biology. bioRxiv 2024:2024.08.06.606224. [PMID: 39149406 PMCID: PMC11326201 DOI: 10.1101/2024.08.06.606224] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/17/2024]
Abstract
Effective diagnosis and treatment of rare genetic disorders requires the interpretation of a patient's genetic variants of unknown significance (VUSs). Today, clinical decision-making is primarily guided by gene-phenotype association databases and DNA-based scoring methods. Our web-accessible variant analysis pipeline, VUStruct, supplements these established approaches by deeply analyzing the downstream molecular impact of variation in the context of 3D protein structure. VUStruct's growing impact is fueled by the co-proliferation of protein 3D structural models, gene sequencing, compute power, and artificial intelligence. Contextualizing VUSs in protein 3D structural models also illuminates longitudinal genomics studies and biochemical bench research focused on VUSs, and we created VUStruct for clinicians and researchers alike. We now introduce VUStruct to the broad scientific community as a mature, web-facing, extensible, high-performance computing (HPC) software pipeline. VUStruct maps missense variants onto automatically selected protein structures and launches a broad range of analyses. These include energy-based assessments of protein folding and stability, pathogenicity prediction through spatial clustering analysis, and machine learning (ML) predictors of binding surface disruptions and nearby post-translational modification sites. The pipeline also considers the entire input set of VUSs and identifies genes potentially involved in digenic disease. VUStruct's utility in clinical rare disease genome interpretation has been demonstrated through its analysis of over 175 Undiagnosed Disease Network (UDN) patient cases. VUStruct-leveraged hypotheses have often informed clinicians in their consideration of additional patient testing, and we report here details from two cases where VUStruct was key to their solution. We also note successes with academic research collaborators, for whom VUStruct has informed research directions in both computational genomics and wet lab studies.
Affiliation(s)
- Christopher W. Moth
- Departments of Chemistry, Pharmacology, and Biomedical Informatics; Center for Structural Biology and Institute of Chemical Biology; Vanderbilt Univ., Nashville, TN 37232, USA
- Jonathan H. Sheehan
- Division of Infectious Diseases, Milliken Dept. of Internal Medicine, Washington Univ. School of Medicine in St. Louis, MO 63110, USA
- Abdullah Al Mamun
- Departments of Chemistry, Pharmacology, and Biomedical Informatics; Center for Structural Biology and Institute of Chemical Biology; Vanderbilt Univ., Nashville, TN 37232, USA
- Alican Gulsevin
- Department of Pharmaceutical Sciences, College of Pharmacy and Health Sciences, Butler University, Indianapolis, IN 46208, USA
- David Rinker
- Department of Biological Sciences, Evolutionary Studies Initiative; Vanderbilt Univ., Nashville, TN 37232, USA
- John A. Capra
- Bakar Computational Health Science Institute and Department of Epidemiology and Biostatistics, Univ. of California San Francisco, CA 94143, USA
- Jens Meiler
- Departments of Chemistry, Pharmacology, and Biomedical Informatics; Center for Structural Biology and Institute of Chemical Biology; Vanderbilt Univ., Nashville, TN 37232, USA
- Leipzig University Medical School, Institute for Drug Discovery, Brüderstraße 34, 04103 Leipzig, Germany
2
Rashmi M, Murmu S, Nagrale DT, Singh MK, Behera SK, Shankar R, Ranjan R, Jha GK, Chaurasia A, Kumar S. Dataset on double mutation in PGIP of Glycine max improves defense to PG of Sclerotinia sclerotiorum. Data Brief 2024; 54:110518. [PMID: 38827253 PMCID: PMC11141275 DOI: 10.1016/j.dib.2024.110518] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2024] [Accepted: 05/02/2024] [Indexed: 06/04/2024] [Open Access]
Abstract
Polygalacturonases (PGs) secreted by the fungus Sclerotinia sclerotiorum alter the cell wall of Glycine max, causing disease and quality losses. In soybean, resistance proteins called polygalacturonase-inhibiting proteins (PGIPs) bind to PGs to block fungal infection. The active-site residues of PGIP3, VAL170 and GLN242, are naturally substituted by various amino acids in different types of PGIPs. The mutation of VAL170 to GLY is ineffective, whereas the substitution of GLN242 by LYS significantly alters the structure and is crucial for interaction with the PG protein. Docking and molecular dynamics simulations provide a comprehensive evaluation of the interactions between gmPGIP and ssPG. By elucidating the structural basis of the interaction between gmPGIP and ssPG, this investigation lays a foundation for the development of targeted strategies to enhance soybean resistance against Sclerotinia sclerotiorum. By leveraging this knowledge, researchers can potentially engineer soybean varieties with improved resistance to the fungus, thereby reducing disease incidence and improving crop yields.
Affiliation(s)
- Mayank Rashmi
- ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
- Sneha Murmu
- ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
- Raja Shankar
- ICAR-Indian Institute of Horticultural Research, Bengaluru, India
- Girish Kumar Jha
- ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
- Sunil Kumar
- ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India
3
Schijven D, Soheili-Nezhad S, Fisher SE, Francks C. Exome-wide analysis implicates rare protein-altering variants in human handedness. Nat Commun 2024; 15:2632. [PMID: 38565598 PMCID: PMC10987538 DOI: 10.1038/s41467-024-46277-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2023] [Accepted: 02/20/2024] [Indexed: 04/04/2024] [Open Access]
Abstract
Handedness is a manifestation of brain hemispheric specialization. Left-handedness occurs at increased rates in neurodevelopmental disorders. Genome-wide association studies have identified common genetic effects on handedness or brain asymmetry, which mostly involve variants outside protein-coding regions and may affect gene expression. Implicated genes include several that encode tubulins (microtubule components) or microtubule-associated proteins. Here we examine whether left-handedness is also influenced by rare coding variants (frequencies ≤ 1%), using exome data from 38,043 left-handed and 313,271 right-handed individuals from the UK Biobank. The beta-tubulin gene TUBB4B shows exome-wide significant association, with a rate of rare coding variants 2.7 times higher in left-handers than right-handers. The TUBB4B variants are mostly heterozygous missense changes, but include two frameshifts found only in left-handers. Other TUBB4B variants have been linked to sensorineural and/or ciliopathic disorders, but not the variants found here. Among genes previously implicated in autism or schizophrenia by exome screening, DSCAM and FOXP1 show evidence for rare coding variant association with left-handedness. The exome-wide heritability of left-handedness due to rare coding variants was 0.91%. This study reveals a role for rare, protein-altering variants in left-handedness, providing further evidence for the involvement of microtubules and disorder-relevant genes.
Affiliation(s)
- Dick Schijven
- Language & Genetics Department, Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Sourena Soheili-Nezhad
- Language & Genetics Department, Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Simon E Fisher
- Language & Genetics Department, Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Clyde Francks
- Language & Genetics Department, Max Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
- Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, The Netherlands
- Department of Cognitive Neuroscience, Radboud University Medical Center, Nijmegen, The Netherlands
4
Versini R, Sritharan S, Aykac Fas B, Tubiana T, Aimeur SZ, Henri J, Erard M, Nüsse O, Andreani J, Baaden M, Fuchs P, Galochkina T, Chatzigoulas A, Cournia Z, Santuz H, Sacquin-Mora S, Taly A. A Perspective on the Prospective Use of AI in Protein Structure Prediction. J Chem Inf Model 2024; 64:26-41. [PMID: 38124369 DOI: 10.1021/acs.jcim.3c01361] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2023]
Abstract
AlphaFold2 (AF2) and RoseTTAFold (RF) have revolutionized structural biology, serving as highly reliable and effective methods for predicting protein structures. This article explores their impact and limitations, focusing on their integration into experimental pipelines and their application in diverse protein classes, including membrane proteins, intrinsically disordered proteins (IDPs), and oligomers. In experimental pipelines, AF2 models help X-ray crystallography in resolving the phase problem, while complementarity with mass spectrometry and NMR data enhances structure determination and protein flexibility prediction. Predicting the structure of membrane proteins remains challenging for both AF2 and RF due to difficulties in capturing conformational ensembles and interactions with the membrane. Improvements in incorporating membrane-specific features and predicting the structural effect of mutations are crucial. For intrinsically disordered proteins, AF2's confidence score (pLDDT) serves as a competitive disorder predictor, but integrative approaches including molecular dynamics (MD) simulations or hydrophobic cluster analyses are advocated for accurate dynamics representation. AF2 and RF show promising results for oligomeric models, outperforming traditional docking methods, with AlphaFold-Multimer showing improved performance. However, some caveats remain, in particular for membrane proteins. Real-life examples demonstrate AF2's predictive capabilities in unknown protein structures, but models should be evaluated for their agreement with experimental data. Furthermore, AF2 models can be used complementarily with MD simulations. In this Perspective, we propose a "wish list" for improving deep-learning-based protein folding prediction models, including using experimental data as constraints and modifying models with binding partners or post-translational modifications. Additionally, a meta-tool for ranking and suggesting composite models is proposed, driving future advancements in this rapidly evolving field.
Affiliation(s)
- Raphaelle Versini
- Laboratoire de Biochimie Théorique, CNRS (UPR9080), Université Paris Cité, F-75005 Paris, France
- Sujith Sritharan
- Laboratoire de Biochimie Théorique, CNRS (UPR9080), Université Paris Cité, F-75005 Paris, France
- Burcu Aykac Fas
- Laboratoire de Biochimie Théorique, CNRS (UPR9080), Université Paris Cité, F-75005 Paris, France
- Thibault Tubiana
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Sana Zineb Aimeur
- Université Paris-Saclay, CNRS, Institut de Chimie Physique, 91405 Orsay, France
- Julien Henri
- Sorbonne Université, CNRS, Laboratoire de Biologie Computationnelle et Quantitative UMR 7238, Institut de Biologie Paris-Seine, 4 Place Jussieu, F-75005 Paris, France
- Marie Erard
- Université Paris-Saclay, CNRS, Institut de Chimie Physique, 91405 Orsay, France
- Oliver Nüsse
- Université Paris-Saclay, CNRS, Institut de Chimie Physique, 91405 Orsay, France
- Jessica Andreani
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Marc Baaden
- Laboratoire de Biochimie Théorique, CNRS (UPR9080), Université Paris Cité, F-75005 Paris, France
- Patrick Fuchs
- Sorbonne Université, École Normale Supérieure, PSL University, CNRS, Laboratoire des Biomolécules, LBM, 75005 Paris, France
- Université de Paris, UFR Sciences du Vivant, 75013 Paris, France
- Tatiana Galochkina
- Université Paris Cité and Université des Antilles and Université de la Réunion, INSERM, BIGR, F-75014 Paris, France
- Alexios Chatzigoulas
- Biomedical Research Foundation, Academy of Athens, 11527 Athens, Greece
- Department of Informatics and Telecommunications, National and Kapodistrian University of Athens, 15784 Athens, Greece
- Zoe Cournia
- Biomedical Research Foundation, Academy of Athens, 11527 Athens, Greece
- Department of Informatics and Telecommunications, National and Kapodistrian University of Athens, 15784 Athens, Greece
- Hubert Santuz
- Laboratoire de Biochimie Théorique, CNRS (UPR9080), Université Paris Cité, F-75005 Paris, France
- Sophie Sacquin-Mora
- Laboratoire de Biochimie Théorique, CNRS (UPR9080), Université Paris Cité, F-75005 Paris, France
- Antoine Taly
- Laboratoire de Biochimie Théorique, CNRS (UPR9080), Université Paris Cité, F-75005 Paris, France