1
|
Chu SKS, Narang K, Siegel JB. Protein stability prediction by fine-tuning a protein language model on a mega-scale dataset. PLoS Comput Biol 2024; 20:e1012248. [PMID: 39038042 PMCID: PMC11293664 DOI: 10.1371/journal.pcbi.1012248] [Received: 12/11/2023] [Revised: 08/01/2024] [Accepted: 06/13/2024] [Indexed: 07/24/2024] Open
Abstract
Protein stability plays a crucial role in a variety of applications, such as food processing, therapeutics, and the identification of pathogenic mutations. Engineering campaigns commonly seek to improve protein stability, and there is a strong interest in streamlining these processes to enable rapid optimization of highly stabilized proteins with fewer iterations. In this work, we explore utilizing a mega-scale dataset to develop a protein language model optimized for stability prediction. ESMtherm is trained on the folding stability of 528k natural and de novo sequences derived from 461 protein domains and can accommodate deletions, insertions, and multiple-point mutations. We show that a protein language model can be fine-tuned to predict folding stability. ESMtherm performs reasonably on small protein domains and generalizes to sequences distal from the training set. Lastly, we discuss our model's limitations compared to other state-of-the-art methods in generalizing to larger protein scaffolds. Our results highlight the need for large-scale stability measurements on a diverse dataset that mirrors the distribution of sequence lengths commonly observed in nature.
Affiliation(s)
- Simon K. S. Chu
- Biophysics Graduate Program, University of California Davis, Davis, California, United States of America
| | - Kush Narang
- College of Biological Sciences, University of California Davis, Davis, California, United States of America
| | - Justin B. Siegel
- Genome Center, University of California Davis, Davis, California, United States of America
- Department of Chemistry, University of California Davis, Davis, California, United States of America
- Department of Biochemistry and Molecular Medicine, University of California Davis, Davis, California, United States of America
| |
|
2
|
Rao J, Xin R, Macdonald C, Howard MK, Estevam GO, Yee SW, Wang M, Fraser JS, Coyote-Maestas W, Pimentel H. Rosace: a robust deep mutational scanning analysis framework employing position and mean-variance shrinkage. Genome Biol 2024; 25:138. [PMID: 38789982 PMCID: PMC11127319 DOI: 10.1186/s13059-024-03279-7] [Received: 10/31/2023] [Accepted: 05/14/2024] [Indexed: 05/26/2024] Open
Abstract
Deep mutational scanning (DMS) measures the effects of thousands of genetic variants in a protein simultaneously. The small sample size renders classical statistical methods ineffective. For example, p-values cannot be correctly calibrated when treating variants independently. We propose Rosace, a Bayesian framework for analyzing growth-based DMS data. Rosace leverages amino acid position information to increase power and control the false discovery rate by sharing information across parameters via shrinkage. We also developed Rosette for simulating the distributional properties of DMS. We show that Rosace is robust to the violation of model assumptions and is more powerful than existing tools.
Affiliation(s)
- Jingyou Rao
- Department of Computer Science, UCLA, Los Angeles, CA, USA
| | - Ruiqi Xin
- Computational and Systems Biology Interdepartmental Program, UCLA, Los Angeles, CA, USA
| | - Christian Macdonald
- Department of Bioengineering and Therapeutic Sciences, UCSF, San Francisco, CA, USA
| | - Matthew K Howard
- Department of Bioengineering and Therapeutic Sciences, UCSF, San Francisco, CA, USA
- Tetrad Graduate Program, UCSF, San Francisco, CA, USA
- Department of Pharmaceutical Chemistry, UCSF, San Francisco, CA, USA
| | - Gabriella O Estevam
- Department of Bioengineering and Therapeutic Sciences, UCSF, San Francisco, CA, USA
- Tetrad Graduate Program, UCSF, San Francisco, CA, USA
| | - Sook Wah Yee
- Department of Bioengineering and Therapeutic Sciences, UCSF, San Francisco, CA, USA
| | - Mingsen Wang
- Department of Mathematics, Baruch College, CUNY, New York, NY, USA
| | - James S Fraser
- Department of Bioengineering and Therapeutic Sciences, UCSF, San Francisco, CA, USA
- Quantitative Biosciences Institute, UCSF, San Francisco, CA, USA
| | - Willow Coyote-Maestas
- Department of Bioengineering and Therapeutic Sciences, UCSF, San Francisco, CA, USA.
- Quantitative Biosciences Institute, UCSF, San Francisco, CA, USA.
| | - Harold Pimentel
- Department of Computer Science, UCLA, Los Angeles, CA, USA.
- Department of Computational Medicine, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA.
- Department of Human Genetics, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA.
| |
|
3
|
Clausen L, Okarmus J, Voutsinos V, Meyer M, Lindorff-Larsen K, Hartmann-Petersen R. PRKN-linked familial Parkinson's disease: cellular and molecular mechanisms of disease-linked variants. Cell Mol Life Sci 2024; 81:223. [PMID: 38767677 PMCID: PMC11106057 DOI: 10.1007/s00018-024-05262-8] [Received: 02/27/2024] [Revised: 04/25/2024] [Accepted: 05/02/2024] [Indexed: 05/22/2024]
Abstract
Parkinson's disease (PD) is a common and incurable neurodegenerative disorder that arises from the loss of dopaminergic neurons in the substantia nigra and is mainly characterized by progressive loss of motor function. Monogenic familial PD is associated with highly penetrant variants in specific genes, notably the PRKN gene, where homozygous or compound heterozygous loss-of-function variants predominate. PRKN encodes Parkin, an E3 ubiquitin-protein ligase important for protein ubiquitination and mitophagy of damaged mitochondria. Accordingly, Parkin plays a central role in mitochondrial quality control but is itself also subject to a strict protein quality control system that rapidly eliminates certain disease-linked Parkin variants. Here, we summarize the cellular and molecular functions of Parkin, highlighting the various mechanisms by which PRKN gene variants result in loss-of-function. We emphasize the importance of high-throughput assays and computational tools for the clinical classification of PRKN gene variants and how detailed insights into the pathogenic mechanisms of PRKN gene variants may impact the development of personalized therapeutics.
Affiliation(s)
- Lene Clausen
- Department of Biology, Linderstrøm-Lang Centre for Protein Science, University of Copenhagen, 2200, Copenhagen, Denmark
| | - Justyna Okarmus
- Department of Neurobiology Research, Institute of Molecular Medicine, University of Southern Denmark, 5230, Odense, Denmark
| | - Vasileios Voutsinos
- Department of Biology, Linderstrøm-Lang Centre for Protein Science, University of Copenhagen, 2200, Copenhagen, Denmark
| | - Morten Meyer
- Department of Neurobiology Research, Institute of Molecular Medicine, University of Southern Denmark, 5230, Odense, Denmark
- Department of Neurology, Odense University Hospital, 5000, Odense, Denmark
- Department of Clinical Research, BRIDGE, Brain Research Inter Disciplinary Guided Excellence, University of Southern Denmark, 5230, Odense, Denmark
| | - Kresten Lindorff-Larsen
- Department of Biology, Linderstrøm-Lang Centre for Protein Science, University of Copenhagen, 2200, Copenhagen, Denmark
| | - Rasmus Hartmann-Petersen
- Department of Biology, Linderstrøm-Lang Centre for Protein Science, University of Copenhagen, 2200, Copenhagen, Denmark.
| |
|
4
|
Grønbæk-Thygesen M, Voutsinos V, Johansson KE, Schulze TK, Cagiada M, Pedersen L, Clausen L, Nariya S, Powell RL, Stein A, Fowler DM, Lindorff-Larsen K, Hartmann-Petersen R. Deep mutational scanning reveals a correlation between degradation and toxicity of thousands of aspartoacylase variants. Nat Commun 2024; 15:4026. [PMID: 38740822 DOI: 10.1038/s41467-024-48481-0] [Received: 10/18/2023] [Accepted: 05/02/2024] [Indexed: 05/16/2024] Open
Abstract
Unstable proteins are prone to form non-native interactions with other proteins and thereby may become toxic. To mitigate this, destabilized proteins are targeted by the protein quality control network. Here we present systematic studies of the cytosolic aspartoacylase, ASPA, where variants are linked to Canavan disease, a lethal neurological disorder. We determine the abundance of 6152 of the 6260 (~98%) possible single amino acid substitutions and nonsense ASPA variants in human cells. Most low abundance variants are degraded through the ubiquitin-proteasome pathway and become toxic upon prolonged expression. The data correlate with predicted changes in thermodynamic stability and evolutionary conservation, and separate disease-linked variants from benign variants. Mapping of degradation signals (degrons) shows that these are often buried and that the C-terminal region functions as a degron. The data can be used to interpret Canavan disease variants and provide insight into the relationship between protein stability, degradation and cell fitness.
Affiliation(s)
- Martin Grønbæk-Thygesen
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Vasileios Voutsinos
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Kristoffer E Johansson
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Thea K Schulze
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Matteo Cagiada
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Line Pedersen
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Lene Clausen
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Snehal Nariya
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Rachel L Powell
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Amelie Stein
- Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Douglas M Fowler
- Department of Genome Sciences, University of Washington, Seattle, WA, USA.
- Department of Bioengineering, University of Washington, Seattle, WA, USA.
| | - Kresten Lindorff-Larsen
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark.
| | - Rasmus Hartmann-Petersen
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark.
| |
|
5
|
Claussnitzer M, Parikh VN, Wagner AH, Arbesfeld JA, Bult CJ, Firth HV, Muffley LA, Nguyen Ba AN, Riehle K, Roth FP, Tabet D, Bolognesi B, Glazer AM, Rubin AF. Minimum information and guidelines for reporting a multiplexed assay of variant effect. Genome Biol 2024; 25:100. [PMID: 38641812 PMCID: PMC11027375 DOI: 10.1186/s13059-024-03223-9] [Received: 06/18/2023] [Accepted: 03/25/2024] [Indexed: 04/21/2024] Open
Abstract
Multiplexed assays of variant effect (MAVEs) have emerged as a powerful approach for interrogating thousands of genetic variants in a single experiment. The flexibility and widespread adoption of these techniques across diverse disciplines have led to a heterogeneous mix of data formats and descriptions, which complicates the downstream use of the resulting datasets. To address these issues and promote reproducibility and reuse of MAVE data, we define a set of minimum information standards for MAVE data and metadata and outline a controlled vocabulary aligned with established biomedical ontologies for describing these experimental designs.
Affiliation(s)
- Melina Claussnitzer
- The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Harvard Medical School, Cambridge, MA, 02142, USA
| | - Victoria N Parikh
- Stanford Center for Inherited Cardiovascular Disease, Stanford University School of Medicine, Stanford, CA, 94305, USA
| | - Alex H Wagner
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, 43215, USA
- Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH, 43210, USA
| | - Jeremy A Arbesfeld
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, 43215, USA
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, 43210, USA
| | - Carol J Bult
- The Jackson Laboratory, Bar Harbor, ME, 04609, USA
| | - Helen V Firth
- Wellcome Sanger Institute, Hinxton, Cambridge, UK
- Dept of Medical Genetics, Cambridge University Hospitals NHS Trust, Cambridge, UK
| | - Lara A Muffley
- Department of Genome Sciences, University of Washington, Seattle, WA, 98105, USA
| | - Alex N Nguyen Ba
- Department of Biology, University of Toronto at Mississauga, Mississauga, ON, Canada
| | - Kevin Riehle
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Frederick P Roth
- Donnelly Centre, University of Toronto, Toronto, ON, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Department of Computer Science, University of Toronto, Toronto, ON, Canada
- Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
| | - Daniel Tabet
- Donnelly Centre, University of Toronto, Toronto, ON, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Department of Computer Science, University of Toronto, Toronto, ON, Canada
- Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
| | - Benedetta Bolognesi
- Institute for Bioengineering of Catalunya (IBEC), The Barcelona Institute of Science and Technology, Barcelona, Spain.
| | - Andrew M Glazer
- Vanderbilt University Medical Center, Nashville, TN, 37232, USA.
| | - Alan F Rubin
- Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia.
- Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia.
| |
|
6
|
Grønbæk-Thygesen M, Hartmann-Petersen R. Cellular and molecular mechanisms of aspartoacylase and its role in Canavan disease. Cell Biosci 2024; 14:45. [PMID: 38582917 PMCID: PMC10998430 DOI: 10.1186/s13578-024-01224-6] [Received: 12/15/2023] [Accepted: 03/24/2024] [Indexed: 04/08/2024] Open
Abstract
Canavan disease is an autosomal recessive and lethal neurological disorder, characterized by the spongy degeneration of the white matter in the brain. The disease is caused by a deficiency of the cytosolic aspartoacylase (ASPA) enzyme, which catalyzes the hydrolysis of N-acetyl-aspartate (NAA), an abundant brain metabolite, into aspartate and acetate. On the physiological level, the mechanism of pathogenicity remains somewhat obscure, with multiple, not mutually exclusive, suggested hypotheses. At the molecular level, recent studies have shown that most disease-linked ASPA gene variants lead to structural destabilization and subsequent proteasomal degradation of the ASPA protein variants, and accordingly Canavan disease should in general be considered a protein misfolding disorder. Here, we comprehensively summarize the molecular and cell biology of ASPA, with a particular focus on disease-linked gene variants and the pathophysiology of Canavan disease. We highlight the importance of high-throughput technologies and computational prediction tools for making genotype-phenotype predictions as we await the results of ongoing trials with gene therapy for Canavan disease.
Affiliation(s)
- Martin Grønbæk-Thygesen
- The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, 2200N, Copenhagen, Denmark.
| | - Rasmus Hartmann-Petersen
- The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, 2200N, Copenhagen, Denmark.
| |
|
7
|
Bursch KL, Goetz CJ, Jiao G, Nuñez R, Olp MD, Dhiman A, Khurana M, Zimmermann MT, Urrutia RA, Dykhuizen EC, Smith BC. Cancer-associated polybromo-1 bromodomain 4 missense variants variably impact bromodomain ligand binding and cell growth suppression. J Biol Chem 2024; 300:107146. [PMID: 38460939 PMCID: PMC11002309 DOI: 10.1016/j.jbc.2024.107146] [Received: 07/17/2023] [Revised: 02/12/2024] [Accepted: 02/29/2024] [Indexed: 03/11/2024] Open
Abstract
The polybromo, brahma-related gene 1-associated factors (PBAF) chromatin remodeling complex subunit polybromo-1 (PBRM1) contains six bromodomains that recognize and bind acetylated lysine residues on histone tails and other nuclear proteins. PBRM1 bromodomains thus provide a link between epigenetic posttranslational modifications and PBAF modulation of chromatin accessibility and transcription. As a putative tumor suppressor in several cancers, PBRM1 protein expression is often abrogated by truncations and deletions. However, ∼33% of PBRM1 mutations in cancer are missense and cluster within its bromodomains. Such mutations may generate full-length PBRM1 variant proteins with undetermined structural and functional characteristics. Here, we employed computational, biophysical, and cellular assays to interrogate the effects of PBRM1 bromodomain missense variants on bromodomain stability and function. Since mutations in the fourth bromodomain of PBRM1 (PBRM1-BD4) comprise nearly 20% of all cancer-associated PBRM1 missense mutations, we focused our analysis on PBRM1-BD4 missense protein variants. Selecting 16 potentially deleterious PBRM1-BD4 missense protein variants for further study based on high residue mutational frequency and/or conservation, we show that cancer-associated PBRM1-BD4 missense variants exhibit varied bromodomain stability and ability to bind acetylated histones. Our results demonstrate the effectiveness of identifying the unique impacts of individual PBRM1-BD4 missense variants on protein structure and function, based on affected residue location within the bromodomain. This knowledge provides a foundation for drawing correlations between specific cancer-associated PBRM1 missense variants and distinct alterations in PBRM1 function, informing future cancer personalized medicine approaches.
Affiliation(s)
- Karina L Bursch
- Department of Biochemistry, Medical College of Wisconsin, Milwaukee, Wisconsin, USA; Structural Genomics Unit, Linda T. and John A. Mellowes Center for Genomic Sciences and Precision Medicine, Medical College of Wisconsin, Milwaukee, Wisconsin, USA
| | - Christopher J Goetz
- Department of Biochemistry, Medical College of Wisconsin, Milwaukee, Wisconsin, USA
| | - Guanming Jiao
- Department of Medicinal Chemistry and Molecular Pharmacology, Purdue University, West Lafayette, Indiana, USA
| | - Raymundo Nuñez
- Department of Biochemistry, Medical College of Wisconsin, Milwaukee, Wisconsin, USA
| | - Michael D Olp
- Department of Biochemistry, Medical College of Wisconsin, Milwaukee, Wisconsin, USA
| | - Alisha Dhiman
- Department of Medicinal Chemistry and Molecular Pharmacology, Purdue University, West Lafayette, Indiana, USA
| | - Mallika Khurana
- Department of Biochemistry, Medical College of Wisconsin, Milwaukee, Wisconsin, USA
| | - Michael T Zimmermann
- Department of Biochemistry, Medical College of Wisconsin, Milwaukee, Wisconsin, USA; Structural Genomics Unit, Linda T. and John A. Mellowes Center for Genomic Sciences and Precision Medicine, Medical College of Wisconsin, Milwaukee, Wisconsin, USA; Clinical and Translational Sciences Institute, Medical College of Wisconsin, Milwaukee, Wisconsin, USA
| | - Raul A Urrutia
- Department of Biochemistry, Medical College of Wisconsin, Milwaukee, Wisconsin, USA; Structural Genomics Unit, Linda T. and John A. Mellowes Center for Genomic Sciences and Precision Medicine, Medical College of Wisconsin, Milwaukee, Wisconsin, USA; Department of Surgery, Medical College of Wisconsin, Milwaukee, Wisconsin, USA
| | - Emily C Dykhuizen
- Department of Medicinal Chemistry and Molecular Pharmacology, Purdue University, West Lafayette, Indiana, USA
| | - Brian C Smith
- Department of Biochemistry, Medical College of Wisconsin, Milwaukee, Wisconsin, USA; Structural Genomics Unit, Linda T. and John A. Mellowes Center for Genomic Sciences and Precision Medicine, Medical College of Wisconsin, Milwaukee, Wisconsin, USA; Program in Chemical Biology, Medical College of Wisconsin, Milwaukee, Wisconsin, USA.
| |
|
8
|
Sifeddine N, Elkhattabi L, Ait El Cadi C, Krami AM, Mounaji K, el khalfi B, Barakat A. Insights from the SNP analysis of TYMP gene linking MNGIE. Bioinformation 2024; 20:261-270. [PMID: 38712004 PMCID: PMC11069602 DOI: 10.6026/973206300200261] [Received: 03/01/2024] [Revised: 03/31/2024] [Accepted: 03/31/2024] [Indexed: 05/08/2024] Open
Abstract
The TYMP gene codes for thymidine phosphorylase (TP), which is also known as platelet-derived endothelial cell growth factor (PD-ECGF). TP plays crucial roles in nucleotide metabolism and angiogenesis. Mutations in the TYMP gene can lead to Mitochondrial Neurogastrointestinal Encephalopathy (MNGIE) syndrome, a rare genetic disorder. Our main objective was to evaluate the impact of detrimental non-synonymous single nucleotide polymorphisms (nsSNPs) on TP protein structure and predict harmful variants in untranslated regions (UTR). We employed a combination of predictive algorithms to identify nsSNPs with potential deleterious effects, followed by molecular modeling analysis to understand their effects on protein structure and function. Using 13 algorithms, we identified 119 potentially deleterious nsSNPs, with 82 located in highly conserved regions. Of these, 53 nsSNPs were functional and exposed, while 79 nsSNPs reduced TP protein stability. Further analysis of 18 nsSNPs through 3D protein structure analysis revealed alterations in amino acid interactions, indicating their potential impact on protein function. These findings will help in the development of faster and more efficient genetic tests for detecting TYMP gene mutations.
Affiliation(s)
- Najat Sifeddine
- Laboratory of Genomics and Human Genetics, Institut Pasteur du Maroc, Casablanca, Morocco
- Laboratory of Physiology and Molecular Genetics, Department of Biology, Faculty of Sciences Ain Chock, Hassan II University of Casablanca, Casablanca, Morocco
| | - Lamiae Elkhattabi
- Laboratory of Genomics and Human Genetics, Institut Pasteur du Maroc, Casablanca, Morocco
| | - Chaimaa Ait El Cadi
- Laboratory of Genomics and Human Genetics, Institut Pasteur du Maroc, Casablanca, Morocco
| | - Al Mehdi Krami
- Laboratory of Genomics and Human Genetics, Institut Pasteur du Maroc, Casablanca, Morocco
| | - Khadija Mounaji
- Laboratory of Physiology and Molecular Genetics, Department of Biology, Faculty of Sciences Ain Chock, Hassan II University of Casablanca, Casablanca, Morocco
| | - Bouchra el khalfi
- Laboratory of Physiology and Molecular Genetics, Department of Biology, Faculty of Sciences Ain Chock, Hassan II University of Casablanca, Casablanca, Morocco
| | - Abdelhamid Barakat
- Laboratory of Genomics and Human Genetics, Institut Pasteur du Maroc, Casablanca, Morocco
| |
|
9
|
Kang J, Wei S, Jia Z, Ma Y, Chen H, Sun C, Xu J, Tao J, Dong Y, Lv W, Tian H, Guo X, Bi S, Zhang C, Jiang Y, Lv H, Zhang M. Effects of genetic variation on the structure of RNA and protein. Proteomics 2024; 24:e2300235. [PMID: 38197532 DOI: 10.1002/pmic.202300235] [Received: 05/31/2023] [Revised: 12/15/2023] [Accepted: 12/19/2023] [Indexed: 01/11/2024]
Abstract
Changes in the structure of RNA and protein have an important impact on biological functions and are even important determinants of disease pathogenesis and treatment. Some genetic variations, including copy number variation and single nucleotide variation, can lead to changes in biological function and increased susceptibility to certain diseases by changing the structure of RNA or protein. With the development of structural biology and sequencing technology, a large amount of RNA and protein structure data and genetic variation data has emerged that can be used to explain biological processes. Here, we review the effects of genetic variation on the structure of RNAs and proteins, and investigate their impact on several diseases. An online resource (http://www.onethird-lab.com/gems/) supporting convenient retrieval of common tools has also been built. Finally, the challenges and future development of research on the effects of genetic variation on RNA and protein are discussed.
Affiliation(s)
- Jingxuan Kang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- The Epigenome-Wide Association Study Project, Harbin, China
| | - Siyu Wei
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- The Epigenome-Wide Association Study Project, Harbin, China
| | - Zhe Jia
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- The Epigenome-Wide Association Study Project, Harbin, China
| | - Yingnan Ma
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- The Epigenome-Wide Association Study Project, Harbin, China
| | - Haiyan Chen
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- The Epigenome-Wide Association Study Project, Harbin, China
| | - Chen Sun
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- The Epigenome-Wide Association Study Project, Harbin, China
| | - Jing Xu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- The Epigenome-Wide Association Study Project, Harbin, China
| | - Junxian Tao
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- The Epigenome-Wide Association Study Project, Harbin, China
| | - Yu Dong
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- The Epigenome-Wide Association Study Project, Harbin, China
| | - Wenhua Lv
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Hongsheng Tian
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Xuying Guo
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Shuo Bi
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Chen Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Yongshuai Jiang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- The Epigenome-Wide Association Study Project, Harbin, China
| | - Hongchao Lv
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- The Epigenome-Wide Association Study Project, Harbin, China
| | - Mingming Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- The Epigenome-Wide Association Study Project, Harbin, China
| |
|
10
|
Clausen L, Voutsinos V, Cagiada M, Johansson KE, Grønbæk-Thygesen M, Nariya S, Powell RL, Have MKN, Oestergaard VH, Stein A, Fowler DM, Lindorff-Larsen K, Hartmann-Petersen R. A mutational atlas for Parkin proteostasis. Nat Commun 2024; 15:1541. [PMID: 38378758 PMCID: PMC10879094 DOI: 10.1038/s41467-024-45829-4] [Received: 07/05/2023] [Accepted: 02/01/2024] [Indexed: 02/22/2024] Open
Abstract
Proteostasis can be disturbed by mutations affecting folding and stability of the encoded protein. An example is the ubiquitin ligase Parkin, where gene variants result in autosomal recessive Parkinsonism. To uncover the pathological mechanism and provide comprehensive genotype-phenotype information, variant abundance by massively parallel sequencing (VAMP-seq) is leveraged to quantify the abundance of Parkin variants in cultured human cells. The resulting mutational map, covering 9219 out of the 9300 possible single-site amino acid substitutions and nonsense Parkin variants, shows that most low abundance variants are proteasome targets and are located within the structured domains of the protein. Half of the known disease-linked variants are found at low abundance. Systematic mapping of degradation signals (degrons) reveals an exposed degron region proximal to the so-called "activation element". This work provides examples of how missense variants may cause degradation either via destabilization of the native protein, or by introducing local signals for degradation.
Affiliation(s)
- Lene Clausen
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Vasileios Voutsinos
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Matteo Cagiada
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Kristoffer E Johansson
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Martin Grønbæk-Thygesen
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Snehal Nariya
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Rachel L Powell
- Department of Genome Sciences, University of Washington, Seattle, WA, USA
| | - Magnus K N Have
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | | | - Amelie Stein
- Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Douglas M Fowler
- Department of Genome Sciences, University of Washington, Seattle, WA, USA.
- Department of Bioengineering, University of Washington, Seattle, WA, USA.
| | - Kresten Lindorff-Larsen
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark.
| | - Rasmus Hartmann-Petersen
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark.
| |
|
11
|
Galano‐Frutos JJ, Sancho J. Energy, water, and protein folding: A molecular dynamics-based quantitative inventory of molecular interactions and forces that make proteins stable. Protein Sci 2024; 33:e4905. [PMID: 38284492 PMCID: PMC10804899 DOI: 10.1002/pro.4905] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2023] [Revised: 12/12/2023] [Accepted: 01/05/2024] [Indexed: 01/30/2024]
Abstract
Protein folding energetics can be determined experimentally on a case-by-case basis but it is not understood in sufficient detail to provide deep control in protein design. The fundamentals of protein stability have been outlined by calorimetry, protein engineering, and biophysical modeling, but these approaches still face great difficulty in elucidating the specific contributions of the intervening molecules and physical interactions. Recently, we have shown that the enthalpy and heat capacity changes associated with the protein folding reaction can be calculated within experimental error using molecular dynamics simulations of native protein structures and their corresponding unfolded ensembles. Analyzing in depth molecular dynamics simulations of four model proteins (CI2, barnase, SNase, and apoflavodoxin), we dissect here the energy contributions to ΔH (a key component of protein stability) made by the molecular players (polypeptide and solvent molecules) and physical interactions (electrostatic, van der Waals, and bonded) involved. Although the proteins analyzed differ in length, isoelectric point and fold class, their folding energetics is governed by the same quantitative pattern. Relative to the unfolded ensemble, the native conformations are enthalpically stabilized by comparable contributions from protein-protein and solvent-solvent interactions, and almost equally destabilized by interactions between protein and solvent molecules. The native protein surface seems to interact better with water than the unfolded one, but this is outweighed by the unfolded surface being larger. From the perspective of physical interactions, the native conformations are stabilized by van der Waals and Coulomb interactions and destabilized by conformational strain arising from bonded interactions. Also common to the four proteins, the sign of the heat capacity change is set by interactions between protein and solvent molecules or, from the alternative perspective, by Coulomb interactions.
Affiliation(s)
- Juan José Galano‐Frutos
- Biocomputation and Complex Systems Physics Institute (BIFI)‐Joint Unit GBsC‐CSIC, University of Zaragoza, Zaragoza, Spain
- Departamento de Bioquímica y Biología Molecular y Celular, Facultad de Ciencias, University of Zaragoza, Zaragoza, Spain
| | - Javier Sancho
- Biocomputation and Complex Systems Physics Institute (BIFI)‐Joint Unit GBsC‐CSIC, University of Zaragoza, Zaragoza, Spain
- Departamento de Bioquímica y Biología Molecular y Celular, Facultad de Ciencias, University of Zaragoza, Zaragoza, Spain
- Aragon Health Research Institute (IIS Aragón), Zaragoza, Spain
| |
|
12
|
Sinclair M, Stein RA, Sheehan JH, Hawes EM, O’Brien RM, Tajkhorshid E, Claxton DP. Integrative analysis of pathogenic variants in glucose-6-phosphatase based on an AlphaFold2 model. PNAS NEXUS 2024; 3:pgae036. [PMID: 38328777 PMCID: PMC10849595 DOI: 10.1093/pnasnexus/pgae036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2023] [Accepted: 01/09/2024] [Indexed: 02/09/2024]
Abstract
Mediating the terminal reaction of gluconeogenesis and glycogenolysis, the integral membrane protein glucose-6-phosphatase catalytic subunit 1 (G6PC1) regulates hepatic glucose production by catalyzing hydrolysis of glucose-6-phosphate (G6P) within the lumen of the endoplasmic reticulum. Consistent with its vital contribution to glucose homeostasis, inactivating mutations in G6PC1 cause glycogen storage disease (GSD) type 1a characterized by hepatomegaly and severe hypoglycemia. Despite its physiological importance, the structural basis of G6P binding to G6PC1 and the molecular disruptions induced by missense mutations within the active site that give rise to GSD type 1a are unknown. In this study, we determine the atomic interactions governing G6P binding as well as explore the perturbations imposed by disease-linked missense variants by subjecting an AlphaFold2 G6PC1 structural model to molecular dynamics simulations and in silico predictions of thermodynamic stability validated with robust in vitro and in situ biochemical assays. We identify a collection of side chains, including conserved residues from the signature phosphatidic acid phosphatase motif, that contribute to a hydrogen bonding and van der Waals network stabilizing G6P in the active site. The introduction of GSD type 1a mutations modified the thermodynamic landscape, altered side chain packing and substrate-binding interactions, and induced trapping of catalytic intermediates. Our results, which corroborate the high quality of the AF2 model as a guide for experimental design and to interpret outcomes, not only confirm the active-site structural organization but also identify previously unobserved mechanistic contributions of catalytic and noncatalytic side chains.
Affiliation(s)
- Matt Sinclair
- Theoretical and Computational Biophysics Group, NIH Center for Macromolecular Modeling and Visualization, Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
| | - Richard A Stein
- Department of Molecular Physiology and Biophysics, Vanderbilt University, Nashville, TN 37232, USA
- Center for Applied Artificial Intelligence in Protein Dynamics, Vanderbilt University, Nashville, TN 37240, USA
| | - Jonathan H Sheehan
- Center for Structural Biology, Vanderbilt University, Nashville, TN 37240, USA
- Division of Infectious Diseases, Department of Internal Medicine, Washington University School of Medicine, St Louis, MO 63110, USA
| | - Emily M Hawes
- Department of Molecular Physiology and Biophysics, Vanderbilt University, Nashville, TN 37232, USA
| | - Richard M O’Brien
- Department of Molecular Physiology and Biophysics, Vanderbilt University, Nashville, TN 37232, USA
| | - Emad Tajkhorshid
- Theoretical and Computational Biophysics Group, NIH Center for Macromolecular Modeling and Visualization, Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Department of Biochemistry, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
| | - Derek P Claxton
- Department of Molecular Physiology and Biophysics, Vanderbilt University, Nashville, TN 37232, USA
- Center for Applied Artificial Intelligence in Protein Dynamics, Vanderbilt University, Nashville, TN 37240, USA
- Center for Structural Biology, Vanderbilt University, Nashville, TN 37240, USA
| |
|
13
|
Nourbakhsh M, Degn K, Saksager A, Tiberti M, Papaleo E. Prediction of cancer driver genes and mutations: the potential of integrative computational frameworks. Brief Bioinform 2024; 25:bbad519. [PMID: 38261338 PMCID: PMC10805075 DOI: 10.1093/bib/bbad519] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2023] [Revised: 11/27/2023] [Accepted: 12/11/2023] [Indexed: 01/24/2024] Open
Abstract
The vast amount of available sequencing data allows the scientific community to explore different genetic alterations that may drive cancer or favor cancer progression. Software developers have proposed a myriad of predictive tools, allowing researchers and clinicians to compare and prioritize driver genes and mutations and their relative pathogenicity. However, there is little consensus on the computational approach or a gold standard for comparison. Hence, benchmarking the different tools depends highly on the input data, indicating that overfitting is still a massive problem. One of the solutions is to limit the scope and usage of specific tools. However, such limitations force researchers to walk a tightrope between creating and using high-quality tools for a specific purpose and describing the complex alterations driving cancer. While the knowledge of cancer development increases daily, many bioinformatic pipelines rely on single nucleotide variants or alterations in a vacuum without accounting for cellular compartments, mutational burden or disease progression. Even within bioinformatics and computational cancer biology, the research fields work in silos, risking overlooking potential synergies or breakthroughs. Here, we provide an overview of databases and datasets for building or testing predictive cancer driver tools. Furthermore, we introduce predictive tools for driver genes, driver mutations, and the impact of these based on structural analysis. Additionally, we suggest and recommend directions in the field to avoid silo-research, moving towards integrative frameworks.
Affiliation(s)
- Mona Nourbakhsh
- Cancer Systems Biology, Section for Bioinformatics, Department of Health Technology, Technical University of Denmark, 2800 Lyngby, Denmark
| | - Kristine Degn
- Cancer Systems Biology, Section for Bioinformatics, Department of Health Technology, Technical University of Denmark, 2800 Lyngby, Denmark
| | - Astrid Saksager
- Cancer Systems Biology, Section for Bioinformatics, Department of Health Technology, Technical University of Denmark, 2800 Lyngby, Denmark
| | - Matteo Tiberti
- Cancer Structural Biology, Danish Cancer Institute, 2100 Copenhagen, Denmark
| | - Elena Papaleo
- Cancer Systems Biology, Section for Bioinformatics, Department of Health Technology, Technical University of Denmark, 2800 Lyngby, Denmark
- Cancer Structural Biology, Danish Cancer Institute, 2100 Copenhagen, Denmark
| |
|
14
|
Wayment-Steele HK, Ojoawo A, Otten R, Apitz JM, Pitsawong W, Hömberger M, Ovchinnikov S, Colwell L, Kern D. Predicting multiple conformations via sequence clustering and AlphaFold2. Nature 2024; 625:832-839. [PMID: 37956700 PMCID: PMC10808063 DOI: 10.1038/s41586-023-06832-9] [Citation(s) in RCA: 51] [Impact Index Per Article: 51.0] [Received: 07/07/2023] [Accepted: 11/03/2023] [Indexed: 11/15/2023]
Abstract
AlphaFold2 (ref. 1) has revolutionized structural biology by accurately predicting single structures of proteins. However, a protein's biological function often depends on multiple conformational substates (ref. 2), and disease-causing point mutations often cause population changes within these substates (refs. 3,4). We demonstrate that clustering a multiple-sequence alignment by sequence similarity enables AlphaFold2 to sample alternative states of known metamorphic proteins with high confidence. Using this method, named AF-Cluster, we investigated the evolutionary distribution of predicted structures for the metamorphic protein KaiB (ref. 5) and found that predictions of both conformations were distributed in clusters across the KaiB family. We used nuclear magnetic resonance spectroscopy to confirm an AF-Cluster prediction: a cyanobacteria KaiB variant is stabilized in the opposite state compared with the more widely studied variant. To test AF-Cluster's sensitivity to point mutations, we designed and experimentally verified a set of three mutations predicted to flip KaiB from Rhodobacter sphaeroides from the ground to the fold-switched state. Finally, screening for alternative states in protein families without known fold switching identified a putative alternative state for the oxidoreductase Mpt53 in Mycobacterium tuberculosis. Further development of such bioinformatic methods in tandem with experiments will probably have a considerable impact on predicting protein energy landscapes, essential for illuminating biological function.
Affiliation(s)
- Hannah K Wayment-Steele
- Department of Biochemistry, Brandeis University and Howard Hughes Medical Institute, Waltham, MA, USA
| | - Adedolapo Ojoawo
- Department of Biochemistry, Brandeis University and Howard Hughes Medical Institute, Waltham, MA, USA
| | - Renee Otten
- Department of Biochemistry, Brandeis University and Howard Hughes Medical Institute, Waltham, MA, USA
- Treeline Biosciences, Watertown, MA, USA
| | - Julia M Apitz
- Department of Biochemistry, Brandeis University and Howard Hughes Medical Institute, Waltham, MA, USA
| | - Warintra Pitsawong
- Department of Biochemistry, Brandeis University and Howard Hughes Medical Institute, Waltham, MA, USA
- Biomolecular Discovery, Relay Therapeutics, Cambridge, MA, USA
| | - Marc Hömberger
- Department of Biochemistry, Brandeis University and Howard Hughes Medical Institute, Waltham, MA, USA
- Treeline Biosciences, Watertown, MA, USA
| | | | - Lucy Colwell
- Google Research, Cambridge, MA, USA
- Cambridge University, Cambridge, UK
| | - Dorothee Kern
- Department of Biochemistry, Brandeis University and Howard Hughes Medical Institute, Waltham, MA, USA.
| |
|
15
|
Rana MM, Nguyen DD. Geometric Graph Learning to Predict Changes in Binding Free Energy and Protein Thermodynamic Stability upon Mutation. J Phys Chem Lett 2023; 14:10870-10879. [PMID: 38032742 DOI: 10.1021/acs.jpclett.3c02679] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/02/2023]
Abstract
Accurate prediction of binding free energy changes upon mutations is vital for optimizing drugs, designing proteins, understanding genetic diseases, and cost-effective virtual screening. While machine learning methods show promise in this domain, achieving accuracy and generalization across diverse data sets remains a challenge. This study introduces Geometric Graph Learning for Protein-Protein Interactions (GGL-PPI), a novel approach integrating geometric graph representation and machine learning to forecast mutation-induced binding free energy changes. GGL-PPI leverages atom-level graph coloring and multiscale weighted colored geometric subgraphs to capture structural features of biomolecules, demonstrating superior performance on three standard data sets, namely, AB-Bind, SKEMPI 1.0, and SKEMPI 2.0 data sets. The model's efficacy extends to predicting protein thermodynamic stability in a blind test set, providing unbiased predictions for both direct and reverse mutations and showcasing notable generalization. GGL-PPI's precision in predicting changes in binding free energy and stability due to mutations enhances our comprehension of protein complexes, offering valuable insights for drug design endeavors.
Affiliation(s)
- Md Masud Rana
- Department of Mathematics, University of Kentucky, Lexington, Kentucky 40506, United States
| | - Duc Duy Nguyen
- Department of Mathematics, University of Kentucky, Lexington, Kentucky 40506, United States
| |
|
16
|
Flagg MP, Lam B, Lam DK, Le TM, Kao A, Slaiwa YI, Hampton RY. Exploring the "misfolding problem" by systematic discovery and analysis of functional-but-degraded proteins. Mol Biol Cell 2023; 34:ar125. [PMID: 37729018 PMCID: PMC10848938 DOI: 10.1091/mbc.e23-06-0248] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2023] [Revised: 09/08/2023] [Accepted: 09/11/2023] [Indexed: 09/22/2023] Open
Abstract
In both health and disease, the ubiquitin-proteasome system (UPS) degrades point mutants that retain partial function but have decreased stability compared with their wild-type counterparts. This class of UPS substrate includes routine translational errors and numerous human disease alleles, such as the most common cause of cystic fibrosis, ΔF508-CFTR. Yet, there is no systematic way to discover novel examples of these "minimally misfolded" substrates. To address that shortcoming, we designed a genetic screen to isolate functional-but-degraded point mutants, and we used the screen to study soluble, monomeric proteins with known structures. These simple parent proteins yielded diverse substrates, allowing us to investigate the structural features, cytotoxicity, and small-molecule regulation of minimal misfolding. Our screen can support numerous lines of inquiry, and it provides broad access to a class of poorly understood but biomedically critical quality-control substrates.
Affiliation(s)
- Matthew P. Flagg
- Division of Biological Sciences, the Section of Cell and Developmental Biology, University of California San Diego, La Jolla, CA 92093
| | - Breanna Lam
- Division of Biological Sciences, the Section of Cell and Developmental Biology, University of California San Diego, La Jolla, CA 92093
| | - Darren K. Lam
- Division of Biological Sciences, the Section of Cell and Developmental Biology, University of California San Diego, La Jolla, CA 92093
| | - Tiffany M. Le
- Division of Biological Sciences, the Section of Cell and Developmental Biology, University of California San Diego, La Jolla, CA 92093
| | - Andy Kao
- Division of Biological Sciences, the Section of Cell and Developmental Biology, University of California San Diego, La Jolla, CA 92093
| | - Yousif I. Slaiwa
- Division of Biological Sciences, the Section of Cell and Developmental Biology, University of California San Diego, La Jolla, CA 92093
| | - Randolph Y. Hampton
- Division of Biological Sciences, the Section of Cell and Developmental Biology, University of California San Diego, La Jolla, CA 92093
| |
|
17
|
McBride JM, Polev K, Abdirasulov A, Reinharz V, Grzybowski BA, Tlusty T. AlphaFold2 Can Predict Single-Mutation Effects. PHYSICAL REVIEW LETTERS 2023; 131:218401. [PMID: 38072605 DOI: 10.1103/physrevlett.131.218401] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 01/03/2023] [Accepted: 09/26/2023] [Indexed: 12/18/2023]
Abstract
AlphaFold2 (AF) is a promising tool, but is it accurate enough to predict single mutation effects? Here, we report that the localized structural deformation between protein pairs differing by only 1-3 mutations, as measured by the effective strain, is correlated across 3901 experimental and AF-predicted structures. Furthermore, analysis of ∼11 000 proteins shows that the local structural change correlates with various phenotypic changes. These findings suggest that AF can predict the range and magnitude of single-mutation effects on average, and we propose a method to improve precision of AF predictions and to indicate when predictions are unreliable.
Affiliation(s)
- John M McBride
- Center for Soft and Living Matter, Institute for Basic Science, Ulsan 44919, South Korea
| | - Konstantin Polev
- Center for Soft and Living Matter, Institute for Basic Science, Ulsan 44919, South Korea
- Department of Biomedical Engineering, Ulsan National Institute of Science and Technology, Ulsan 44919, South Korea
| | - Amirbek Abdirasulov
- Department of Computer Science and Engineering, Ulsan National Institute of Science and Technology, Ulsan 44919, South Korea
| | | | - Bartosz A Grzybowski
- Center for Soft and Living Matter, Institute for Basic Science, Ulsan 44919, South Korea
- Departments of Physics and Chemistry, Ulsan National Institute of Science and Technology, Ulsan 44919, South Korea
| | - Tsvi Tlusty
- Center for Soft and Living Matter, Institute for Basic Science, Ulsan 44919, South Korea
- Departments of Physics and Chemistry, Ulsan National Institute of Science and Technology, Ulsan 44919, South Korea
| |
|
18
|
M. S, V. J, Ahmad SF, Attia SM, Emran TB, Patil RB, Ahmed SSSJ. Structural Characteristics of PON1 with Leu55Met and Gln192Arg Variants Influencing Oxidative-Stress-Related Diseases: An Integrated Molecular Modeling and Dynamics Study. MEDICINA (KAUNAS, LITHUANIA) 2023; 59:2060. [PMID: 38138163 PMCID: PMC10744641 DOI: 10.3390/medicina59122060] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2023] [Revised: 11/04/2023] [Accepted: 11/20/2023] [Indexed: 12/24/2023]
Abstract
Background and Objectives: PON1 is a multi-functional antioxidant protein that hydrolyzes a variety of endogenous and exogenous substrates in the human system. Growing evidence suggests that the Leu55Met and Gln192Arg substitutions alter PON1 activity and are linked with a variety of oxidative-stress-related diseases. Materials and Methods: We implemented structural modeling and molecular dynamics (MD) simulation along with essential dynamics of PON1 and molecular docking with their endogenous (n = 4) and exogenous (n = 6) substrates to gain insights into conformational changes and binding affinity in order to characterize the specific functional ramifications of PON1 variants. Results: The Leu55Met variation had a higher root mean square deviation (0.249 nm) than the wild type (0.216 nm) and Gln192Arg (0.202 nm), implying increased protein flexibility. Furthermore, the essential dynamics analysis confirms the structural change in PON1 with Leu55Met vs. Gln192Arg and wild type. Additionally, PON1 with Leu55Met causes local conformational alterations at the substrate binding site, leading to changes in binding affinity with their substrates. Conclusions: Our findings highlight the structural consequences of the variants, which would increase understanding of the role of PON1 in the pathogenesis of oxidative-stress-related diseases, as well as the management of endogenous and exogenous chemicals in the treatment of diseases.
Affiliation(s)
- Sudhan M.
- Drug Discovery and Multi-Omics Laboratory, Faculty of Allied Health Sciences, Chettinad Hospital and Research Institute, Chettinad Academy of Research and Education, Kelambakkam 603103, Tamil Nadu, India
| | - Janakiraman V.
- Drug Discovery and Multi-Omics Laboratory, Faculty of Allied Health Sciences, Chettinad Hospital and Research Institute, Chettinad Academy of Research and Education, Kelambakkam 603103, Tamil Nadu, India
| | - Sheikh F. Ahmad
- Department of Pharmacology and Toxicology, College of Pharmacy, King Saud University, Riyadh 11451, Saudi Arabia
| | - Sabry M. Attia
- Department of Pharmacology and Toxicology, College of Pharmacy, King Saud University, Riyadh 11451, Saudi Arabia
| | - Talha Bin Emran
- Department of Pathology and Laboratory Medicine, Warren Alpert Medical School, Brown University, Providence, RI 02912, USA
- Legorreta Cancer Center, Brown University, Providence, RI 02912, USA
- Department of Pharmacy, Faculty of Allied Health Sciences, Daffodil International University, Dhaka 1207, Bangladesh
| | - Rajesh B. Patil
- Department of Pharmaceutical Chemistry, Sinhgad Technical Education Societys, Sinhgad College of Pharmacy, Vadgaon (BK), Pune 411041, Maharashtra, India
| | - Shiek S. S. J. Ahmed
- Drug Discovery and Multi-Omics Laboratory, Faculty of Allied Health Sciences, Chettinad Hospital and Research Institute, Chettinad Academy of Research and Education, Kelambakkam 603103, Tamil Nadu, India
| |
|
19
|
Sinclair M, Stein RA, Sheehan JH, Hawes EM, O'Brien RM, Tajkhorshid E, Claxton DP. Molecular mechanisms of catalytic inhibition for active site mutations in glucose-6-phosphatase catalytic subunit 1 linked to glycogen storage disease. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.03.13.532485. [PMID: 36993754 PMCID: PMC10054992 DOI: 10.1101/2023.03.13.532485] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/19/2023]
Abstract
Mediating the terminal reaction of gluconeogenesis and glycogenolysis, the integral membrane protein G6PC1 regulates hepatic glucose production by catalyzing hydrolysis of glucose-6-phosphate (G6P) within the lumen of the endoplasmic reticulum. Consistent with its vital contribution to glucose homeostasis, inactivating mutations in G6PC1 cause glycogen storage disease (GSD) type 1a characterized by hepatomegaly and severe hypoglycemia. Despite its physiological importance, the structural basis of G6P binding to G6PC1 and the molecular disruptions induced by missense mutations within the active site that give rise to GSD type 1a are unknown. Exploiting a computational model of G6PC1 derived from the groundbreaking structure prediction algorithm AlphaFold2 (AF2), we combine molecular dynamics (MD) simulations and computational predictions of thermodynamic stability with a robust in vitro screening platform to define the atomic interactions governing G6P binding as well as explore the energetic perturbations imposed by disease-linked variants. We identify a collection of side chains, including conserved residues from the signature phosphatidic acid phosphatase motif, that contribute to a hydrogen bonding and van der Waals network stabilizing G6P in the active site. Introduction of GSD type 1a mutations into the G6PC1 sequence elicits changes in G6P binding energy, thermostability and structural properties, suggesting multiple pathways of catalytic impairment. Our results, which corroborate the high quality of the AF2 model as a guide for experimental design and to interpret outcomes, not only confirm active site structural organization but also suggest novel mechanistic contributions of catalytic and non-catalytic side chains.
|
20
|
Tsuboyama K, Dauparas J, Chen J, Laine E, Mohseni Behbahani Y, Weinstein JJ, Mangan NM, Ovchinnikov S, Rocklin GJ. Mega-scale experimental analysis of protein folding stability in biology and design. Nature 2023; 620:434-444. [PMID: 37468638 PMCID: PMC10412457 DOI: 10.1038/s41586-023-06328-6] [Citation(s) in RCA: 50] [Impact Index Per Article: 50.0] [Received: 01/05/2023] [Accepted: 06/14/2023] [Indexed: 07/21/2023]
Abstract
Advances in DNA sequencing and machine learning are providing insights into protein sequences and structures on an enormous scale (ref. 1). However, the energetics driving folding are invisible in these structures and remain largely unknown (ref. 2). The hidden thermodynamics of folding can drive disease (refs. 3,4), shape protein evolution (refs. 5-7) and guide protein engineering (refs. 8-10), and new approaches are needed to reveal these thermodynamics for every sequence and structure. Here we present cDNA display proteolysis, a method for measuring thermodynamic folding stability for up to 900,000 protein domains in a one-week experiment. From 1.8 million measurements in total, we curated a set of around 776,000 high-quality folding stabilities covering all single amino acid variants and selected double mutants of 331 natural and 148 de novo designed protein domains 40-72 amino acids in length. Using this extensive dataset, we quantified (1) environmental factors influencing amino acid fitness, (2) thermodynamic couplings (including unexpected interactions) between protein sites, and (3) the global divergence between evolutionary amino acid usage and protein folding stability. We also examined how our approach could identify stability determinants in designed proteins and evaluate design methods. The cDNA display proteolysis method is fast, accurate and uniquely scalable, and promises to reveal the quantitative rules for how amino acid sequences encode folding stability.
Affiliation(s)
- Kotaro Tsuboyama
- Department of Pharmacology, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
- Center for Synthetic Biology, Northwestern University, Evanston, IL, USA
- PRESTO, Japan Science and Technology Agency, Tokyo, Japan
- Institute of Industrial Science, The University of Tokyo, Tokyo, Japan
| | - Justas Dauparas
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
| | - Jonathan Chen
- Department of Pharmacology, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
- Center for Synthetic Biology, Northwestern University, Evanston, IL, USA
- McCormick School of Engineering, Northwestern University, Evanston, IL, USA
| | - Elodie Laine
- Sorbonne Université, CNRS, IBPS, Laboratory of Computational and Quantitative Biology (LCQB), UMR 7238, Paris, France
| | - Yasser Mohseni Behbahani
- Sorbonne Université, CNRS, IBPS, Laboratory of Computational and Quantitative Biology (LCQB), UMR 7238, Paris, France
| | - Jonathan J Weinstein
- Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot, Israel
| | - Niall M Mangan
- Center for Synthetic Biology, Northwestern University, Evanston, IL, USA
- Department of Engineering Sciences and Applied Mathematics, Northwestern University, Evanston, IL, USA
| | - Sergey Ovchinnikov
- John Harvard Distinguished Science Fellowship Program, Harvard University, Cambridge, MA, USA
| | - Gabriel J Rocklin
- Department of Pharmacology, Northwestern University Feinberg School of Medicine, Chicago, IL, USA.
- Center for Synthetic Biology, Northwestern University, Evanston, IL, USA.
| |
|
21
|
Tollefson MR, Gogal RA, Weaver AM, Schaefer AM, Marini RJ, Azaiez H, Kolbe DL, Wang D, Weaver AE, Casavant TL, Braun TA, Smith RJH, Schnieders MJ. Assessing variants of uncertain significance implicated in hearing loss using a comprehensive deafness proteome. Hum Genet 2023; 142:819-834. [PMID: 37086329 PMCID: PMC10182131 DOI: 10.1007/s00439-023-02559-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 01/23/2023] [Accepted: 04/11/2023] [Indexed: 04/23/2023]
Abstract
Hearing loss is the leading sensory deficit, affecting ~ 5% of the population. It exhibits remarkable heterogeneity across 223 genes with 6328 pathogenic missense variants, making deafness-specific expertise a prerequisite for ascribing phenotypic consequences to genetic variants. Deafness-implicated variants are curated in the Deafness Variation Database (DVD) after classification by a genetic hearing loss expert panel and thorough informatics pipeline. However, seventy percent of the 128,167 missense variants in the DVD are "variants of uncertain significance" (VUS) due to insufficient evidence for classification. Here, we use the deep learning protein prediction algorithm, AlphaFold2, to curate structures for all DVD genes. We refine these structures with global optimization and the AMOEBA force field and use DDGun3D to predict folding free energy differences (∆∆GFold) for all DVD missense variants. We find that 5772 VUSs have a large, destabilizing ∆∆GFold that is consistent with pathogenic variants. When also filtered for CADD scores (> 25.7), we determine 3456 VUSs are likely pathogenic at a probability of 99.0%. Of the 224 genes in the DVD, 166 genes (74%) exhibit one or more missense variants predicted to cause a pathogenic change in protein folding stability. The VUSs prioritized here affect 119 patients (~ 3% of cases) sequenced by the OtoSCOPE targeted panel. Approximately half of these patients previously received an inconclusive report, and reclassification of these VUSs as pathogenic provides a new genetic diagnosis for six patients.
Affiliation(s)
- Mallory R Tollefson
- Roy J. Carver Department of Biomedical Engineering, University of Iowa, Iowa City, IA, 52242, USA
- Molecular Otolaryngology and Renal Research Laboratories, Department of Otolaryngology, University of Iowa Hospitals and Clinics, Iowa City, IA, 52242, USA
| | - Rose A Gogal
- Roy J. Carver Department of Biomedical Engineering, University of Iowa, Iowa City, IA, 52242, USA
| | - A Monique Weaver
- Molecular Otolaryngology and Renal Research Laboratories, Department of Otolaryngology, University of Iowa Hospitals and Clinics, Iowa City, IA, 52242, USA
| | - Amanda M Schaefer
- Molecular Otolaryngology and Renal Research Laboratories, Department of Otolaryngology, University of Iowa Hospitals and Clinics, Iowa City, IA, 52242, USA
| | - Robert J Marini
- Molecular Otolaryngology and Renal Research Laboratories, Department of Otolaryngology, University of Iowa Hospitals and Clinics, Iowa City, IA, 52242, USA
| | - Hela Azaiez
- Molecular Otolaryngology and Renal Research Laboratories, Department of Otolaryngology, University of Iowa Hospitals and Clinics, Iowa City, IA, 52242, USA
| | - Diana L Kolbe
- Molecular Otolaryngology and Renal Research Laboratories, Department of Otolaryngology, University of Iowa Hospitals and Clinics, Iowa City, IA, 52242, USA
| | - Donghong Wang
- Molecular Otolaryngology and Renal Research Laboratories, Department of Otolaryngology, University of Iowa Hospitals and Clinics, Iowa City, IA, 52242, USA
| | - Amy E Weaver
- Molecular Otolaryngology and Renal Research Laboratories, Department of Otolaryngology, University of Iowa Hospitals and Clinics, Iowa City, IA, 52242, USA
| | - Thomas L Casavant
- Roy J. Carver Department of Biomedical Engineering, University of Iowa, Iowa City, IA, 52242, USA
| | - Terry A Braun
- Roy J. Carver Department of Biomedical Engineering, University of Iowa, Iowa City, IA, 52242, USA
| | - Richard J H Smith
- Molecular Otolaryngology and Renal Research Laboratories, Department of Otolaryngology, University of Iowa Hospitals and Clinics, Iowa City, IA, 52242, USA.
| | - Michael J Schnieders
- Roy J. Carver Department of Biomedical Engineering, University of Iowa, Iowa City, IA, 52242, USA.
- Department of Biochemistry and Molecular Biology, University of Iowa, Iowa City, IA, 52242, USA.
|
22
|
Chauhan R, Bhattacharya J, Solanki R, Ahmad FJ, Alankar B, Kaur H. GUD-VE visualization tool for physicochemical properties of proteins. MethodsX 2023; 10:102226. [PMID: 37424755 PMCID: PMC10326500 DOI: 10.1016/j.mex.2023.102226] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2023] [Accepted: 05/17/2023] [Indexed: 07/11/2023] Open
Abstract
The physicochemical properties of a protein's primary sequence help determine both its structure and its biological function. Sequence analysis of proteins and nucleic acids is among the most fundamental elements of bioinformatics; without it, it is impossible to gain deeper insight into molecular and biochemical mechanisms. Computational methods such as bioinformatics tools help experts and novices alike resolve issues in protein analysis. This work proposes a graphical user interface (GUI) for prediction and visualization, built in a Jupyter Notebook with the tkinter package, which allows the program to be created and run on a local host platform and accessed by the programmer.
- When queried with a protein sequence, it predicts physicochemical parameters of the peptides.
- Users can choose to visualize the findings either anonymously or via a user-specified email address, and compare the biophysical properties of one protein with another using amino acid (AA) sequences.
The aim of this paper is to meet the requirements of experimentalists, not just hardcore bioinformaticians, for predicting biophysical properties and comparing them with those of other proteins. The code has been uploaded to GitHub (an online code repository) in private mode.
Affiliation(s)
- Ritu Chauhan
- Amity University, Noida 201313, Uttar Pradesh, India
| | | | - Rubi Solanki
- School of Interdisciplinary Sciences and Technology, Jamia Hamdard, New Delhi 110062, India
| | - Farhan Jalees Ahmad
- School of Interdisciplinary Sciences and Technology, Jamia Hamdard, New Delhi 110062, India
| | - Bhavya Alankar
- Department of Computer Science and Engineering, School of Engineering Sciences and Technology, Jamia Hamdard, New Delhi 110062, India
| | - Harleen Kaur
- Department of Computer Science and Engineering, School of Engineering Sciences and Technology, Jamia Hamdard, New Delhi 110062, India
|
23
|
Licata L, Via A, Turina P, Babbi G, Benevenuta S, Carta C, Casadio R, Cicconardi A, Facchiano A, Fariselli P, Giordano D, Isidori F, Marabotti A, Martelli PL, Pascarella S, Pinelli M, Pippucci T, Russo R, Savojardo C, Scafuri B, Valeriani L, Capriotti E. Resources and tools for rare disease variant interpretation. Front Mol Biosci 2023; 10:1169109. [PMID: 37234922 PMCID: PMC10206239 DOI: 10.3389/fmolb.2023.1169109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2023] [Accepted: 04/25/2023] [Indexed: 05/28/2023] Open
Abstract
Collectively, rare genetic disorders affect a substantial portion of the world's population. In most cases, those affected face difficulties in receiving a clinical diagnosis and genetic characterization. The understanding of the molecular mechanisms of these diseases and the development of therapeutic treatments for patients are also challenging. However, the application of recent advancements in genome sequencing/analysis technologies and computer-aided tools for predicting phenotype-genotype associations can bring significant benefits to this field. In this review, we highlight the most relevant online resources and computational tools for genome interpretation that can enhance the diagnosis, clinical management, and development of treatments for rare disorders. Our focus is on resources for interpreting single nucleotide variants. Additionally, we present use cases for interpreting genetic variants in clinical settings and review the limitations of these results and prediction tools. Finally, we have compiled a curated set of core resources and tools for analyzing rare disease genomes. Such resources and tools can be utilized to develop standardized protocols that will enhance the accuracy and effectiveness of rare disease diagnosis.
Affiliation(s)
- Luana Licata
- Department of Biology, University of Rome Tor Vergata, Roma, Italy
| | - Allegra Via
- Department of Biochemical Sciences “A. Rossi Fanelli”, University of Rome “La Sapienza”, Roma, Italy
| | - Paola Turina
- Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
| | - Giulia Babbi
- Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
| | | | - Claudio Carta
- National Centre for Rare Diseases, Istituto Superiore di Sanità, Roma, Italy
| | - Rita Casadio
- Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
| | - Andrea Cicconardi
- Department of Physics, University of Genova, Genova, Italy
- Istituto Italiano di Tecnologia (IIT), Genova, Italy
| | - Angelo Facchiano
- National Research Council, Institute of Food Science, Avellino, Italy
| | - Piero Fariselli
- Department of Medical Sciences, University of Torino, Torino, Italy
| | - Deborah Giordano
- National Research Council, Institute of Food Science, Avellino, Italy
| | - Federica Isidori
- Medical Genetics Unit, IRCCS Azienda Ospedaliero-Universitaria di Bologna, Bologna, Italy
| | - Anna Marabotti
- Department of Chemistry and Biology “A. Zambelli”, University of Salerno, Fisciano, SA, Italy
| | - Pier Luigi Martelli
- Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
| | - Stefano Pascarella
- Department of Biochemical Sciences “A. Rossi Fanelli”, University of Rome “La Sapienza”, Roma, Italy
| | - Michele Pinelli
- Department of Molecular Medicine and Medical Biotechnology, University of Naples Federico II, Napoli, Italy
| | - Tommaso Pippucci
- Medical Genetics Unit, IRCCS Azienda Ospedaliero-Universitaria di Bologna, Bologna, Italy
| | - Roberta Russo
- Department of Molecular Medicine and Medical Biotechnology, University of Naples Federico II, Napoli, Italy
- CEINGE Biotecnologie Avanzate Franco Salvatore, Napoli, Italy
| | - Castrense Savojardo
- Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
| | - Bernardina Scafuri
- Department of Chemistry and Biology “A. Zambelli”, University of Salerno, Fisciano, SA, Italy
| | | | - Emidio Capriotti
- Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
|
24
|
Abildgaard AB, Nielsen SV, Bernstein I, Stein A, Lindorff-Larsen K, Hartmann-Petersen R. Lynch syndrome, molecular mechanisms and variant classification. Br J Cancer 2023; 128:726-734. [PMID: 36434153 PMCID: PMC9978028 DOI: 10.1038/s41416-022-02059-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 06/17/2022] [Revised: 10/31/2022] [Accepted: 11/02/2022] [Indexed: 11/27/2022] Open
Abstract
Patients with the heritable cancer disease, Lynch syndrome, carry germline variants in the MLH1, MSH2, MSH6 and PMS2 genes, encoding the central components of the DNA mismatch repair system. Loss-of-function variants disrupt the DNA mismatch repair system and give rise to a detrimental increase in the cellular mutational burden and cancer development. The treatment prospects for Lynch syndrome rely heavily on early diagnosis; however, accurate diagnosis is inextricably linked to correct clinical interpretation of individual variants. Protein variant classification traditionally relies on cumulative information from occurrence in patients, as well as experimental testing of the individual variants. Variant classification is complex because (1) variants of unknown significance are rare in the population, so phenotypic information on the specific variants is missing, and (2) individual variant testing is challenging, costly and slow. Here, we summarise recent developments in high-throughput technologies and computational prediction tools for the assessment of variants of unknown significance in Lynch syndrome. These approaches may vastly increase the number of interpretable variants and could also provide important mechanistic insights into the disease. These insights may in turn pave the road towards developing personalised treatment approaches for Lynch syndrome.
Affiliation(s)
- Amanda B Abildgaard
- The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Sofie V Nielsen
- The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark.
| | - Inge Bernstein
- Department of Surgical Gastroenterology, Aalborg University Hospital, Aalborg, Denmark
- Institute of Clinical Medicine, Aalborg University Hospital, Aalborg University, Aalborg, Denmark
| | - Amelie Stein
- The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Kresten Lindorff-Larsen
- The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark.
| | - Rasmus Hartmann-Petersen
- The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark.
|
25
|
Tollefson MR, Gogal RA, Weaver AM, Schaefer AM, Marini RJ, Azaiez H, Kolbe DL, Wang D, Weaver AE, Casavant TL, Braun TA, Smith RJH, Schnieders M. Assessing Variants of Uncertain Significance Implicated in Hearing Loss Using a Comprehensive Deafness Proteome. RESEARCH SQUARE 2023:rs.3.rs-2508462. [PMID: 36778238 PMCID: PMC9915777 DOI: 10.21203/rs.3.rs-2508462/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/05/2023]
Abstract
Hearing loss is the leading sensory deficit, affecting ~5% of the population. It exhibits remarkable heterogeneity across 223 genes with 6,328 pathogenic missense variants, making deafness-specific expertise a prerequisite for ascribing phenotypic consequences to genetic variants. Deafness-implicated variants are curated in the Deafness Variation Database (DVD) after classification by a genetic hearing loss expert panel and thorough informatics pipeline. However, seventy percent of the 128,167 missense variants in the DVD are "variants of uncertain significance" (VUS) due to insufficient evidence for classification. Here, we use the deep learning protein prediction algorithm, AlphaFold2, to curate structures for all DVD genes. We refine these structures with global optimization and the AMOEBA force field and use DDGun3D to predict folding free energy differences (∆∆G_Fold) for all DVD missense variants. We find that 5,772 VUSs have a large, destabilizing ∆∆G_Fold that is consistent with pathogenic variants. When also filtered for CADD scores (> 25.7), we determine 3,456 VUSs are likely pathogenic at a probability of 99.0%. These VUSs affect 119 patients (~3% of cases) sequenced by the OtoSCOPE targeted panel. Approximately half of these patients previously received an inconclusive report, and reclassification of these VUSs as pathogenic provides a new genetic diagnosis for six patients.
|
26
|
Johansson KE, Mashahreh B, Hartmann-Petersen R, Ravid T, Lindorff-Larsen K. Prediction of Quality-control Degradation Signals in Yeast Proteins. J Mol Biol 2023; 435:167915. [PMID: 36495918 DOI: 10.1016/j.jmb.2022.167915] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 04/15/2022] [Revised: 11/26/2022] [Accepted: 12/01/2022] [Indexed: 12/12/2022]
Abstract
Effective proteome homeostasis is key to cellular and organismal survival, and cells therefore contain efficient quality control systems to monitor and remove potentially toxic misfolded proteins. Such general protein quality control to a large extent relies on the efficient and robust delivery of misfolded or unfolded proteins to the ubiquitin-proteasome system. This is achieved via recognition of so-called degradation motifs (degrons) that are assumed to become exposed as a result of protein misfolding. Despite their importance, the nature and sequence properties of quality-control degrons remain elusive. Here, we have used data from a yeast-based screen of 23,600 17-residue peptides to build a predictor of quality-control degrons. The resulting model, QCDPred (Quality Control Degron Prediction), achieves good accuracy using only the sequence composition of the peptides as input. Our analysis reveals that strong degrons are enriched in hydrophobic amino acids and depleted in negatively charged amino acids, in line with the expectation that they are buried in natively folded proteins. We applied QCDPred to the yeast proteome, enabling us to analyse more widely the potential effects of degrons. As an example, we show a correlation between cellular abundance and degron potential in disordered regions of proteins. Together with recent results on membrane proteins, our work suggests that the recognition of exposed hydrophobic residues is a key and generic mechanism for proteome homeostasis. QCDPred is freely available as open source code and via a web interface.
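As a rough illustration of what "sequence composition" features for a 17-residue peptide might look like, the sketch below computes amino acid fractions and the share of hydrophobic and negatively charged residues. It is not the QCDPred model or its code: the function names are hypothetical, and the `HYDROPHOBIC` grouping is one common convention assumed here, not taken from the abstract.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
HYDROPHOBIC = set("AVILMFWY")          # assumed grouping; conventions differ
NEGATIVE = set("DE")                   # aspartate and glutamate


def composition(peptide):
    """Return the fraction of each standard amino acid in the peptide."""
    if not peptide:
        raise ValueError("peptide must not be empty")
    counts = Counter(peptide.upper())
    length = len(peptide)
    return {aa: counts.get(aa, 0) / length for aa in AMINO_ACIDS}


def class_fraction(peptide, residues):
    """Return the combined fraction of the given residue class."""
    comp = composition(peptide)
    return sum(comp[aa] for aa in residues)


if __name__ == "__main__":
    peptide = "AAAADDEEKKLLVVGGS"  # made-up 17-residue example
    print(class_fraction(peptide, HYDROPHOBIC))
    print(class_fraction(peptide, NEGATIVE))
```

A composition vector of this kind ignores residue order, which is consistent with the abstract's statement that the predictor uses only composition as input.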
Affiliation(s)
- Kristoffer E Johansson
- Linderstrøm-Lang Centre for Protein Science, Section for Biomolecular Sciences, Department of Biology, University of Copenhagen, Copenhagen, Denmark. https://twitter.com/kristofferenoee
| | - Bayan Mashahreh
- Department of Biological Chemistry, The Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
| | - Rasmus Hartmann-Petersen
- Linderstrøm-Lang Centre for Protein Science, Section for Biomolecular Sciences, Department of Biology, University of Copenhagen, Copenhagen, Denmark. https://twitter.com/rasmushartmannp
| | - Tommer Ravid
- Department of Biological Chemistry, The Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel.
| | - Kresten Lindorff-Larsen
- Linderstrøm-Lang Centre for Protein Science, Section for Biomolecular Sciences, Department of Biology, University of Copenhagen, Copenhagen, Denmark.
|
27
|
Wei H, Li X. Deep mutational scanning: A versatile tool in systematically mapping genotypes to phenotypes. Front Genet 2023; 14:1087267. [PMID: 36713072 PMCID: PMC9878224 DOI: 10.3389/fgene.2023.1087267] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 11/02/2022] [Accepted: 01/02/2023] [Indexed: 01/13/2023] Open
Abstract
Unveiling how genetic variations lead to phenotypic variations is one of the key questions in evolutionary biology, genetics, and biomedical research. Deep mutational scanning (DMS) technology has allowed the mapping of tens of thousands of genetic variations to phenotypic variations efficiently and economically. Since its first systematic introduction about a decade ago, we have witnessed the use of deep mutational scanning in many research areas leading to scientific breakthroughs. Also, the methods in each step of deep mutational scanning have become much more versatile thanks to the oligo-synthesizing technology, high-throughput phenotyping methods and deep sequencing technology. However, each specific possible step of deep mutational scanning has its pros and cons, and some limitations still await further technological development. Here, we discuss recent scientific accomplishments achieved through the deep mutational scanning and describe widely used methods in each step of deep mutational scanning. We also compare these different methods and analyze their advantages and disadvantages, providing insight into how to design a deep mutational scanning study that best suits the aims of the readers' projects.
Affiliation(s)
- Huijin Wei
- Zhejiang University—University of Edinburgh Institute, Zhejiang University, Haining, Zhejiang, China
| | - Xianghua Li
- Zhejiang University—University of Edinburgh Institute, Zhejiang University, Haining, Zhejiang, China
- Deanery of Biomedical Sciences, University of Edinburgh, Edinburgh, United Kingdom
- The Second Affiliated Hospital of Zhejiang University, Hangzhou, Zhejiang, China
- Biomedical and Health Translational Centre of Zhejiang Province, Haining, Zhejiang, China
- Correspondence: Xianghua Li
|
28
|
Molecular Mechanisms, Genotype-Phenotype Correlations and Patient-Specific Treatments in Inherited Metabolic Diseases. J Pers Med 2023; 13:jpm13010117. [PMID: 36675778 PMCID: PMC9864038 DOI: 10.3390/jpm13010117] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2022] [Accepted: 01/03/2023] [Indexed: 01/06/2023] Open
Abstract
Advances in DNA sequencing technologies are revealing a vast genetic heterogeneity in human population, which may predispose to metabolic alterations if the activity of metabolic enzymes is affected [...].
|
29
|
Landau J, Tsaban L, Yaacov A, Ben Cohen G, Rosenberg S. Shared Cancer Dataset Analysis Identifies and Predicts the Quantitative Effects of Pan-Cancer Somatic Driver Variants. Cancer Res 2023; 83:74-88. [PMID: 36264175 DOI: 10.1158/0008-5472.can-22-1038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2022] [Revised: 08/02/2022] [Accepted: 10/18/2022] [Indexed: 02/03/2023]
Abstract
Driver mutations endow tumors with selective advantages and produce an array of pathogenic effects. Determining the function of somatic variants is important for understanding cancer biology and identifying optimal therapies. Here, we compiled a shared dataset from several cancer genomic databases. Two measures were applied to 535 cancer genes based on observed and expected frequencies of driver variants as derived from cancer-specific rates of somatic mutagenesis. The first measure comprised a binary classifier based on a binomial test; the second was tumor variant amplitude (TVA), a continuous measure representing the selective advantage of individual variants. TVA outperformed all other computational tools in terms of its correlation with experimentally derived functional scores of cancer mutations. TVA also highly correlated with drug response, overall survival, and other clinical implications in relevant cancer genes. This study demonstrates how a selective advantage measure based on a large cancer dataset significantly impacts our understanding of the spectral effect of driver variants in cancer. The impact of this information will increase as cancer treatment becomes more precise and personalized to tumor-specific mutations.
Significance: A new selective advantage estimation assists in oncogenic driver identification and relative effect measurements, enabling better prognostication, therapy selection, and prioritization.
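The idea of a binary classifier built on a binomial test of observed versus expected variant frequency can be illustrated with a minimal sketch. This is a generic one-sided binomial test, not the authors' implementation: the function names, the `alpha` level, and the example numbers are assumptions, and the exact summation is only practical for small counts (a statistics library would be a better fit for real data).

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p), by exact summation."""
    return sum(
        comb(n, i) * p**i * (1 - p) ** (n - i)
        for i in range(k, n + 1)
    )


def is_enriched(observed, n, expected_rate, alpha=0.05):
    """Flag a variant seen more often than its expected rate would predict.

    observed: number of samples carrying the variant
    n: number of samples examined
    expected_rate: per-sample probability expected from background mutagenesis
    alpha: significance level (illustrative default)
    """
    return binomial_upper_tail(observed, n, expected_rate) < alpha


if __name__ == "__main__":
    # Made-up counts: 8 carriers among 10 samples at an expected rate of 0.1.
    print(is_enriched(8, 10, 0.1))
    # 1 carrier among 10 samples is unremarkable at the same expected rate.
    print(is_enriched(1, 10, 0.1))
```

A continuous measure such as the TVA described in the abstract would go beyond this yes/no call; the abstract does not give its formula, so it is not sketched here.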
Affiliation(s)
- Jakob Landau
- Gaffin Center for Neuro-Oncology, Sharett Institute for Oncology, Hadassah Medical Center and Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel
- The Wohl Institute for Translational Medicine, Hadassah Medical Center and Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel
| | - Linoy Tsaban
- Gaffin Center for Neuro-Oncology, Sharett Institute for Oncology, Hadassah Medical Center and Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel
- The Wohl Institute for Translational Medicine, Hadassah Medical Center and Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel
| | - Adar Yaacov
- Gaffin Center for Neuro-Oncology, Sharett Institute for Oncology, Hadassah Medical Center and Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel
- The Wohl Institute for Translational Medicine, Hadassah Medical Center and Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel
- Department of Microbiology and Molecular Genetics, IMRIC, Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel
| | - Gil Ben Cohen
- Gaffin Center for Neuro-Oncology, Sharett Institute for Oncology, Hadassah Medical Center and Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel
- The Wohl Institute for Translational Medicine, Hadassah Medical Center and Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel
| | - Shai Rosenberg
- Gaffin Center for Neuro-Oncology, Sharett Institute for Oncology, Hadassah Medical Center and Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel
- The Wohl Institute for Translational Medicine, Hadassah Medical Center and Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel
|
30
|
Chaperone-Dependent Mechanisms as a Pharmacological Target for Neuroprotection. Int J Mol Sci 2023; 24:ijms24010823. [PMID: 36614266 PMCID: PMC9820882 DOI: 10.3390/ijms24010823] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 12/02/2022] [Revised: 12/26/2022] [Accepted: 12/27/2022] [Indexed: 01/05/2023] Open
Abstract
Modern pharmacotherapy of neurodegenerative diseases is predominantly symptomatic and does not break the vicious circles that drive disease development. Protein misfolding is considered the most important pathogenetic factor of neurodegenerative diseases. Physiological mechanisms related to the function of chaperones, which contribute to the restoration of the native conformation of functionally important proteins, have emerged through evolution. These mechanisms can be considered promising for pharmacological regulation. Therefore, the aim of this review was to analyze the mechanisms of endoplasmic reticulum stress (ER stress) and the unfolded protein response (UPR) in the pathogenesis of neurodegenerative diseases. Data on BiP and Sigma1R chaperones in clinical and experimental studies of Alzheimer's disease, Parkinson's disease, amyotrophic lateral sclerosis, and Huntington's disease are presented. The possibility of a neuroprotective effect dependent on Sigma1R ligand activation in these diseases is also demonstrated. The interaction between Sigma1R and BiP-associated signaling in neuroprotection is discussed. The performed analysis suggests the feasibility of pharmacological regulation of chaperone function, the possibility of ligand activation of Sigma1R in order to achieve a neuroprotective effect, and the need for further studies of the conjugation of cellular mechanisms controlled by Sigma1R and BiP chaperones.
|
31
|
Tiemann JKS, Zschach H, Lindorff-Larsen K, Stein A. Interpreting the molecular mechanisms of disease variants in human transmembrane proteins. Biophys J 2023:S0006-3495(22)03941-8. [PMID: 36600598 DOI: 10.1016/j.bpj.2022.12.031] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 08/31/2022] [Revised: 11/19/2022] [Accepted: 12/21/2022] [Indexed: 01/06/2023] Open
Abstract
Next-generation sequencing of human genomes reveals millions of missense variants, some of which may lead to loss of protein function and ultimately disease. Here, we investigate missense variants in membrane proteins, which are key drivers in cell signaling and recognition. We find enrichment of pathogenic variants in the transmembrane region across 19,000 functionally classified variants in human membrane proteins. To accurately predict variant consequences, one fundamentally needs to understand the underlying molecular processes. A key mechanism underlying pathogenicity in missense variants of soluble proteins has been shown to be loss of stability. Membrane proteins, however, are widely understudied. Here, we interpret variant effects on a larger scale by performing structure-based estimations of changes in thermodynamic stability using a membrane-specific energy function and analyses of sequence conservation during evolution of 15 transmembrane proteins. We find evidence for loss of stability being the cause of pathogenicity in more than half of the pathogenic variants, indicating that this is a driving factor also in membrane-protein-associated diseases. Our findings show how computational tools aid in gaining mechanistic insights into variant consequences for membrane proteins. To enable broader analyses of disease-related and population variants, we include variant mappings for the entire human proteome.
Affiliation(s)
- Johanna Katarina Sofie Tiemann
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark; Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Henrike Zschach
- Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Kresten Lindorff-Larsen
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark.
| | - Amelie Stein
- Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, Copenhagen, Denmark.
|
32
|
Sora V, Laspiur AO, Degn K, Arnaudi M, Utichi M, Beltrame L, De Menezes D, Orlandi M, Stoltze UK, Rigina O, Sackett PW, Wadt K, Schmiegelow K, Tiberti M, Papaleo E. RosettaDDGPrediction for high-throughput mutational scans: From stability to binding. Protein Sci 2023; 32:e4527. [PMID: 36461907 PMCID: PMC9795540 DOI: 10.1002/pro.4527] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 09/02/2022] [Revised: 11/25/2022] [Accepted: 11/25/2022] [Indexed: 12/05/2022]
Abstract
Reliable prediction of free energy changes upon amino acid substitutions (ΔΔGs) is crucial to investigate their impact on protein stability and protein-protein interaction. Advances in experimental mutational scans allow high-throughput studies thanks to multiplex techniques. On the other hand, genomics initiatives provide a large amount of data on disease-related variants that can benefit from analyses with structure-based methods. Therefore, the computational field should keep the same pace and provide new tools for fast and accurate high-throughput ΔΔG calculations. In this context, the Rosetta modeling suite implements effective approaches to predict folding/unfolding ΔΔGs in a protein monomer upon amino acid substitutions and calculate the changes in binding free energy in protein complexes. However, their application can be challenging to users without extensive experience with Rosetta. Furthermore, Rosetta protocols for ΔΔG prediction are designed considering one variant at a time, making the setup of high-throughput screenings cumbersome. For these reasons, we devised RosettaDDGPrediction, a customizable Python wrapper designed to run free energy calculations on a set of amino acid substitutions using Rosetta protocols with little intervention from the user. Moreover, RosettaDDGPrediction assists with checking completed runs, aggregates raw data for multiple variants, and generates publication-ready graphics. We showed the potential of the tool in four case studies, including variants of uncertain significance in childhood cancer, proteins with known experimental unfolding ΔΔG values, interactions between target proteins and disordered motifs, and phosphomimetics. RosettaDDGPrediction is available, free of charge and under GNU General Public License v3.0, at https://github.com/ELELAB/RosettaDDGPrediction.
Affiliation(s)
- Valentina Sora
- Cancer Structural Biology, Danish Cancer Society Research Center, Copenhagen, Denmark
- Cancer Systems Biology, Section for Bioinformatics, Department of Health and Technology, Technical University of Denmark, Lyngby, Denmark
| | - Adrian Otamendi Laspiur
- Cancer Systems Biology, Section for Bioinformatics, Department of Health and Technology, Technical University of Denmark, Lyngby, Denmark
| | - Kristine Degn
- Cancer Systems Biology, Section for Bioinformatics, Department of Health and Technology, Technical University of Denmark, Lyngby, Denmark
| | - Matteo Arnaudi
- Cancer Structural Biology, Danish Cancer Society Research Center, Copenhagen, Denmark
- Cancer Systems Biology, Section for Bioinformatics, Department of Health and Technology, Technical University of Denmark, Lyngby, Denmark
| | - Mattia Utichi
- Cancer Structural Biology, Danish Cancer Society Research Center, Copenhagen, Denmark
- Cancer Systems Biology, Section for Bioinformatics, Department of Health and Technology, Technical University of Denmark, Lyngby, Denmark
| | - Ludovica Beltrame
- Cancer Structural Biology, Danish Cancer Society Research Center, Copenhagen, Denmark
- Cancer Systems Biology, Section for Bioinformatics, Department of Health and Technology, Technical University of Denmark, Lyngby, Denmark
| | - Dayana De Menezes
- Cancer Systems Biology, Section for Bioinformatics, Department of Health and Technology, Technical University of Denmark, Lyngby, Denmark
| | - Matteo Orlandi
- Cancer Systems Biology, Section for Bioinformatics, Department of Health and Technology, Technical University of Denmark, Lyngby, Denmark
| | - Ulrik Kristoffer Stoltze
- Department of Clinical Genetics, Copenhagen University Hospital Rigshospitalet, Copenhagen, Denmark
- Department of Pediatrics and Adolescent Medicine, University Hospital Rigshospitalet, Copenhagen, Denmark
- Institute of Clinical Medicine, Faculty of Medicine, University of Copenhagen, Copenhagen, Denmark
| | - Olga Rigina
- Cancer Systems Biology, Section for Bioinformatics, Department of Health and Technology, Technical University of Denmark, Lyngby, Denmark
| | - Peter Wad Sackett
- Cancer Systems Biology, Section for Bioinformatics, Department of Health and Technology, Technical University of Denmark, Lyngby, Denmark
| | - Karin Wadt
- Department of Clinical Genetics, Copenhagen University Hospital Rigshospitalet, Copenhagen, Denmark
- Institute of Clinical Medicine, Faculty of Medicine, University of Copenhagen, Copenhagen, Denmark
| | - Kjeld Schmiegelow
- Department of Pediatrics and Adolescent Medicine, University Hospital Rigshospitalet, Copenhagen, Denmark
- Institute of Clinical Medicine, Faculty of Medicine, University of Copenhagen, Copenhagen, Denmark
| | - Matteo Tiberti
- Cancer Structural Biology, Danish Cancer Society Research Center, Copenhagen, Denmark
| | - Elena Papaleo
- Cancer Structural Biology, Danish Cancer Society Research Center, Copenhagen, Denmark
- Cancer Systems Biology, Section for Bioinformatics, Department of Health and Technology, Technical University of Denmark, Lyngby, Denmark
| |
Collapse
|
33
|
Phosphorylation of Thr9 Affects the Folding Landscape of the N-Terminal Segment of Human AGT Enhancing Protein Aggregation of Disease-Causing Mutants. Molecules 2022; 27:8762. [PMID: 36557898 PMCID: PMC9786777 DOI: 10.3390/molecules27248762] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/14/2022] [Revised: 12/07/2022] [Accepted: 12/08/2022] [Indexed: 12/14/2022]
Abstract
The mutations G170R and I244T are the most common causes of disease in primary hyperoxaluria type I (PH1). These mutations cause the misfolding of the AGT protein in the minor allele AGT-LM that contains the P11L polymorphism, which may affect the folding of the N-terminal segment (NTT-AGT). The NTT-AGT is phosphorylated at T9, although the role of this event in PH1 is unknown. In this work, phosphorylation of T9 was mimicked by introducing the T9E mutation in the NTT-AGT peptide and the full-length protein. The NTT-AGT conformational landscape was studied by circular dichroism, NMR, and statistical mechanical methods. Functional and stability effects on the full-length AGT protein were characterized by spectroscopic methods. The T9E and P11L mutations together reshaped the conformational landscape of the isolated NTT-AGT peptide by stabilizing ordered conformations. In the context of the full-length AGT protein, the T9E mutation had no effect on the overall AGT function or conformation, but enhanced aggregation of the minor allele (LM) protein and synergized with the mutations G170R and I244T. Our findings indicate that phosphorylation of T9 may affect the conformation of the NTT-AGT and synergize with PH1-causing mutations to promote aggregation in a genotype-specific manner. Phosphorylation should be considered a novel regulatory mechanism in PH1 pathogenesis.
|
34
|
Gong J, Wang J, Zong X, Ma Z, Xu D. Prediction of protein stability changes upon single-point variant using 3D structure profile. Comput Struct Biotechnol J 2022; 21:354-364. [PMID: 36582438 PMCID: PMC9791599 DOI: 10.1016/j.csbj.2022.12.008] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 09/01/2022] [Revised: 12/04/2022] [Accepted: 12/05/2022] [Indexed: 12/13/2022]
Abstract
Identifying protein thermodynamic stability changes upon single-point variants is crucial for studying mutation-induced alterations in protein biophysics, genomic variants, and mutation-related diseases. In the last decade, various computational methods have been developed to predict the effects of single-point variants, but the prediction accuracy is still far from satisfactory for practical applications. Herein, we review approaches and tools for predicting stability changes upon the single-point variant. Most of these methods require tertiary protein structure as input to achieve reliable predictions. However, the availability of protein structures limits the immediate application of these tools. To improve the performance of a computational prediction from a protein sequence without experimental structural information, we introduce a new computational framework: MU3DSP. This method assesses the effects of single-point variants on protein thermodynamic stability based on point mutated protein 3D structure profile. Given a protein sequence with a single variant as input, MU3DSP integrates both sequence-level features and averaged features of 3D structures obtained from sequence alignment to PDB to assess the change of thermodynamic stability induced by the substitution. MU3DSP outperforms existing methods on various benchmarks, making it a reliable tool to assess both somatic and germline substitution variants and assist in protein design. MU3DSP is available as an open-source tool at https://github.com/hurraygong/MU3DSP.
Affiliation(s)
- Jianting Gong
  - School of Information Science and Technology, and Institution of Computational Biology, Northeast Normal University, Changchun 130117, China
  - Department of Electrical Engineering and Computer Science, and Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO, USA
- Juexin Wang
  - Department of Electrical Engineering and Computer Science, and Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO, USA
  - Department of BioHealth Informatics, School of Informatics and Computing, Indiana University Purdue University Indianapolis, Indianapolis, IN, USA
- Xizeng Zong
  - School of Computer Science and Engineering, Changchun University of Technology, Changchun 130117, China
- Zhiqiang Ma
  - School of Information Science and Technology, and Institution of Computational Biology, Northeast Normal University, Changchun 130117, China
  - Department of Computer Science, College of Humanities & Sciences of Northeast Normal University, Changchun 130117, China
  - Corresponding author
- Dong Xu
  - Department of Electrical Engineering and Computer Science, and Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO, USA
  - Corresponding author
|
35
|
Haynes LM, Huttinger ZM, Yee A, Kretz CA, Siemieniak DR, Lawrence DA, Ginsburg D. Deep mutational scanning and massively parallel kinetics of plasminogen activator inhibitor-1 functional stability to probe its latency transition. J Biol Chem 2022; 298:102608. [PMID: 36257408 PMCID: PMC9667310 DOI: 10.1016/j.jbc.2022.102608] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/21/2022] [Revised: 10/10/2022] [Accepted: 10/12/2022] [Indexed: 11/05/2022]
Abstract
Plasminogen activator inhibitor-1 (PAI-1), a member of the serine protease inhibitor superfamily of proteins, is unique among serine protease inhibitors for exhibiting a spontaneous conformational change to a latent or inactive state. The functional half-life for this transition at physiologic temperature and pH is ∼1 to 2 h. To better understand the molecular mechanisms underlying this transition, we now report on the analysis of a comprehensive PAI-1 variant library expressed on filamentous phage and selected for functional stability after 48 h at 37 °C. Of the 7201 possible single amino acid substitutions in PAI-1, we identified 439 that increased the functional stability of PAI-1 beyond that of the WT protein. We also found 1549 single amino acid substitutions that retained inhibitory activity toward the canonical target protease of PAI-1 (urokinase-like plasminogen activator), whereas exhibiting functional stability less than or equal to that of WT PAI-1. Missense mutations that increase PAI-1 functional stability are concentrated in highly flexible regions within the PAI-1 structure. Finally, we developed a method for simultaneously measuring the functional half-lives of hundreds of PAI-1 variants in a multiplexed, massively parallel manner, quantifying the functional half-lives for 697 single missense variants of PAI-1 by this approach. Overall, these findings provide novel insight into the mechanisms underlying the latency transition of PAI-1 and provide a database for interpreting human PAI-1 genetic variants.
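The figure of 7201 possible single amino acid substitutions quoted in this abstract is what one obtains from 19 alternative residues at each of 379 positions. The length of 379 is inferred here from the count and is not stated in the abstract. A minimal Python sketch of that enumeration, using a placeholder sequence rather than the real PAI-1 sequence (all names are illustrative):

```python
# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def single_substitutions(sequence):
    """Yield (position, wild_type, mutant) for every single amino acid substitution."""
    for position, wild_type in enumerate(sequence, start=1):
        for mutant in AMINO_ACIDS:
            if mutant != wild_type:
                yield position, wild_type, mutant


def count_single_substitutions(length):
    """Each position can be replaced by any of the 19 other residues."""
    return 19 * length


# Placeholder sequence (NOT the real PAI-1 sequence); only its length matters here.
placeholder = "A" * 379
variants = list(single_substitutions(placeholder))
print(len(variants), count_single_substitutions(379))
```

The same generator can be used to build the variant list for any saturation mutagenesis library once the real sequence is supplied.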
Affiliation(s)
- Laura M Haynes
  - Life Sciences Institute, University of Michigan, Ann Arbor, Michigan, USA
- Zachary M Huttinger
  - Life Sciences Institute, University of Michigan, Ann Arbor, Michigan, USA
  - Cellular and Molecular Biology Program, University of Michigan Medical School, Ann Arbor, Michigan, USA
- Andrew Yee
  - Department of Pediatrics, Baylor College of Medicine, Houston, Texas, USA
- Colin A Kretz
  - Department of Medicine, McMaster University and the Thrombosis and Atherosclerosis Research Institute, Hamilton, Ontario, Canada
- David R Siemieniak
  - Life Sciences Institute, University of Michigan, Ann Arbor, Michigan, USA
  - Howard Hughes Medical Institute
- Daniel A Lawrence
  - Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, Michigan, USA
  - Department of Pathology, University of Michigan Medical School, Ann Arbor, Michigan, USA
- David Ginsburg
  - Life Sciences Institute, University of Michigan, Ann Arbor, Michigan, USA
  - Cellular and Molecular Biology Program, University of Michigan Medical School, Ann Arbor, Michigan, USA
  - Howard Hughes Medical Institute
  - Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, Michigan, USA
  - Departments of Human Genetics and Pediatrics, University of Michigan, Ann Arbor, Michigan, USA
|
36
|
Valanciute A, Nygaard L, Zschach H, Maglegaard Jepsen M, Lindorff-Larsen K, Stein A. Accurate protein stability predictions from homology models. Comput Struct Biotechnol J 2022; 21:66-73. [PMID: 36514339 PMCID: PMC9729920 DOI: 10.1016/j.csbj.2022.11.048] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/15/2022] [Revised: 11/22/2022] [Accepted: 11/23/2022] [Indexed: 11/27/2022]
Abstract
Calculating changes in protein stability (ΔΔG) has been shown to be central for predicting the consequences of single amino acid substitutions in protein engineering as well as for interpreting genomic variants for disease risk. Structure-based calculations are considered most accurate; however, the tools used to calculate ΔΔGs have been developed on experimentally resolved structures. Extending those calculations to homology models based on related proteins would greatly extend their applicability, as large parts of, e.g., the human proteome are not structurally resolved. In this study we aim to investigate the accuracy of ΔΔG values predicted on homology models compared to crystal structures. Specifically, we identified four proteins with a large number of experimentally tested ΔΔGs and templates for homology modeling across a broad range of sequence identities, and selected three methods for ΔΔG calculations to test. We find that ΔΔG values predicted from homology models compare equally well to experimental ΔΔGs as those predicted on experimentally established crystal structures, as long as the sequence identity of the model template to the target protein is at least 40%. In particular, the Rosetta cartesian_ddg protocol is robust against the small perturbations in the structure which homology modeling introduces. In an independent assessment, we observe a similar trend when using ΔΔGs to categorize variants as low or wild-type-like abundance. Overall, our results show that stability calculations performed on homology models can substitute for those on crystal structures with acceptable accuracy as long as the model is built on a template with sequence identity of at least 40% to the target protein.
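The 40% sequence identity criterion in this abstract can be illustrated with a small Python sketch. It assumes a pairwise alignment is already available as two equal-length gapped strings, and it defines identity as matches over columns where neither sequence has a gap. The abstract does not say which definition of identity the authors used, so this definition, the function names, and the toy sequences are all assumptions:

```python
def percent_identity(aligned_target, aligned_template):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_target, aligned_template):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def template_meets_threshold(aligned_target, aligned_template, threshold=40.0):
    """True if the template reaches the identity threshold (40% in the abstract)."""
    return percent_identity(aligned_target, aligned_template) >= threshold


# Toy alignments, not real protein data.
print(percent_identity("ACDE-FGH", "ACDQWF-H"))
print(template_meets_threshold("AAAAA", "AACCC"))
```

Other identity definitions (for example, dividing by the length of the shorter sequence) give different numbers for the same alignment, so the denominator should be fixed before comparing templates.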
Affiliation(s)
- Audrone Valanciute
  - Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Lasse Nygaard
  - Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Henrike Zschach
  - Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Michael Maglegaard Jepsen
  - Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Kresten Lindorff-Larsen
  - Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
  - Corresponding author
- Amelie Stein
  - Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, Copenhagen, Denmark
  - Corresponding author
|
37
|
Pacheco-Garcia JL, Cagiada M, Tienne-Matos K, Salido E, Lindorff-Larsen K, Pey AL. Effect of naturally-occurring mutations on the stability and function of cancer-associated NQO1: Comparison of experiments and computation. Front Mol Biosci 2022; 9:1063620. [PMID: 36504709 PMCID: PMC9730889 DOI: 10.3389/fmolb.2022.1063620] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2022] [Accepted: 11/03/2022] [Indexed: 11/25/2022]
Abstract
Recent advances in DNA sequencing technologies are revealing a large individual variability of the human genome. Our capacity to establish genotype-phenotype correlations at such a large scale is, however, limited. This task is particularly challenging due to the multifunctional nature of many proteins. Here we describe an extensive analysis of the stability and function of naturally-occurring variants (found in the COSMIC and gnomAD databases) of the cancer-associated human NAD(P)H:quinone oxidoreductase 1 (NQO1). First, we performed in silico saturation mutagenesis studies (>5,000 substitutions) aimed at identifying regions in NQO1 important for stability and function. We then experimentally characterized twenty-two naturally-occurring variants in terms of protein levels during bacterial expression, solubility, thermal stability, and coenzyme binding. These studies showed a good overall correlation between experimental analysis and computational predictions; in addition, the magnitudes of the effects of the substitutions are similarly distributed in variants from the COSMIC and gnomAD databases. Outliers in these experimental-computational genotype-phenotype correlations remain, and we discuss these in light of the basis and limitations of our approaches. Our work represents a further step to characterize the mutational landscape of NQO1 in the human genome and may help to improve high-throughput in silico tools for genotype-phenotype correlations in this multifunctional protein associated with disease.
Affiliation(s)
- Matteo Cagiada
  - Department of Biology, Linderstrøm-Lang Centre for Protein Science, University of Copenhagen, Copenhagen, Denmark
- Eduardo Salido
  - Center for Rare Diseases (CIBERER), Hospital Universitario de Canarias, Universidad de la Laguna, La Laguna, Tenerife, Spain
- Kresten Lindorff-Larsen
  - Department of Biology, Linderstrøm-Lang Centre for Protein Science, University of Copenhagen, Copenhagen, Denmark
- Angel L. Pey
  - Departamento de Química Física, Unidad de Excelencia en Química Aplicada a Biomedicina y Medioambiente e Instituto de Biotecnología, Universidad de Granada, Granada, Spain
  - Correspondence: Angel L. Pey
|
38
|
Loss of stability and unfolding cooperativity in hPGK1 upon gradual structural perturbation of its N-terminal domain hydrophobic core. Sci Rep 2022; 12:17200. [PMID: 36229482 PMCID: PMC9561527 DOI: 10.1038/s41598-022-22088-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/09/2022] [Accepted: 10/10/2022] [Indexed: 01/06/2023]
Abstract
Phosphoglycerate kinase has been a model for the stability, folding cooperativity and catalysis of a two-domain protein. The human isoform 1 (hPGK1) is associated with cancer development and rare genetic diseases that affect several of its features. To investigate how mutations affect hPGK1 folding landscape and interaction networks, we have introduced mutations at a buried site in the N-terminal domain (F25 mutants) that either created cavities (F25L, F25V, F25A), enhanced conformational entropy (F25G) or introduced structural strain (F25W) and evaluated their effects using biophysical experimental and theoretical methods. All F25 mutants folded well, but showed reduced unfolding cooperativity, kinetic stability and altered activation energetics according to the results from thermal and chemical denaturation analyses. These alterations correlated well with the structural perturbation caused by mutations in the N-terminal domain and the destabilization caused in the interdomain interface as revealed by H/D exchange under native conditions. Importantly, experimental and theoretical analyses showed that these effects are significant even when the perturbation is mild and local. Our approach will be useful to establish the molecular basis of hPGK1 genotype-phenotype correlations due to phosphorylation events and single amino acid substitutions associated with disease.
|
39
|
Azbukina N, Zharikova A, Ramensky V. Intragenic compensation through the lens of deep mutational scanning. Biophys Rev 2022; 14:1161-1182. [PMID: 36345285 PMCID: PMC9636336 DOI: 10.1007/s12551-022-01005-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/14/2022] [Accepted: 09/26/2022] [Indexed: 12/20/2022]
Abstract
A significant fraction of mutations in proteins are deleterious and result in adverse consequences for protein function, stability, or interaction with other molecules. Intragenic compensation is a specific case of positive epistasis in which a neutral missense mutation cancels the effect of a deleterious mutation in the same protein. Permissive compensatory mutations facilitate protein evolution, since without them all sequences would be extremely conserved. Understanding compensatory mechanisms is an important scientific challenge at the intersection of protein biophysics and evolution. In human genetics, intragenic compensatory interactions are important since they may result in variable penetrance of pathogenic mutations or fixation of pathogenic human alleles in orthologous proteins from related species. The latter phenomenon complicates computational and clinical inference of an allele's pathogenicity. Deep mutational scanning is a relatively new technique that enables experimental studies of functional effects of thousands of mutations in proteins. We review the important aspects of the field and discuss existing limitations of current datasets. We reviewed ten published DMS datasets with quantified functional effects of single and double mutations and described rates and patterns of intragenic compensation in eight of them.
Affiliation(s)
- Nadezhda Azbukina
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 1-73, Leninskie Gory, 119991 Moscow, Russia
- Anastasia Zharikova
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 1-73, Leninskie Gory, 119991 Moscow, Russia
  - National Medical Research Center for Therapy and Preventive Medicine, Petroverigsky per., 10, Bld.3, 101000 Moscow, Russia
- Vasily Ramensky
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 1-73, Leninskie Gory, 119991 Moscow, Russia
  - National Medical Research Center for Therapy and Preventive Medicine, Petroverigsky per., 10, Bld.3, 101000 Moscow, Russia
|
40
|
Dellefave-Castillo LM, Cirino AL, Callis TE, Esplin ED, Garcia J, Hatchell KE, Johnson B, Morales A, Regalado E, Rojahn S, Vatta M, Nussbaum RL, McNally EM. Assessment of the Diagnostic Yield of Combined Cardiomyopathy and Arrhythmia Genetic Testing. JAMA Cardiol 2022; 7:966-974. [PMID: 35947370 PMCID: PMC9366660 DOI: 10.1001/jamacardio.2022.2455] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Indexed: 12/16/2022]
Abstract
Importance: Genetic testing can guide management of both cardiomyopathies and arrhythmias, but cost, yield, and uncertain results can be barriers to its use. It is unknown whether combined disease testing can improve diagnostic yield and clinical utility for patients with a suspected genetic cardiomyopathy or arrhythmia.
Objective: To evaluate the diagnostic yield and clinical management implications of combined cardiomyopathy and arrhythmia genetic testing through a no-charge, sponsored program for patients with a suspected genetic cardiomyopathy or arrhythmia.
Design, Setting, and Participants: This cohort study involved a retrospective review of DNA sequencing results for cardiomyopathy- and arrhythmia-associated genes. The study included 4782 patients with a suspected genetic cardiomyopathy or arrhythmia who were referred for genetic testing by 1203 clinicians; all patients participated in a no-charge, sponsored genetic testing program for cases of suspected genetic cardiomyopathy and arrhythmia at a single testing site from July 12, 2019, through July 9, 2020.
Main Outcomes and Measures: Positive gene findings from combined cardiomyopathy and arrhythmia testing were compared with findings from smaller subtype-specific gene panels and clinician-provided diagnoses.
Results: Among 4782 patients (mean [SD] age, 40.5 [21.3] years; 2551 male [53.3%]) who received genetic testing, 39 patients (0.8%) were Ashkenazi Jewish, 113 (2.4%) were Asian, 571 (11.9%) were Black or African American, 375 (7.8%) were Hispanic, 2866 (59.9%) were White, 240 (5.0%) were of multiple races and/or ethnicities, 138 (2.9%) were of other races and/or ethnicities, and 440 (9.2%) were of unknown race and/or ethnicity. A positive result (molecular diagnosis) was confirmed in 954 of 4782 patients (19.9%). Of those, 630 patients with positive results (66.0%) had the potential to inform clinical management associated with adverse clinical outcomes, increased arrhythmia risk, or targeted therapies. Combined cardiomyopathy and arrhythmia gene panel testing identified clinically relevant variants for 1 in 5 patients suspected of having a genetic cardiomyopathy or arrhythmia. If only patients with a high suspicion of genetic cardiomyopathy or arrhythmia had been tested, at least 137 positive results (14.4%) would have been missed. If testing had been restricted to panels associated with the clinician-provided diagnostic indications, 75 of 689 positive results (10.9%) would have been missed; 27 of 75 findings (36.0%) gained through combined testing involved a cardiomyopathy indication with an arrhythmia genetic finding or vice versa. Cascade testing of family members yielded 402 of 958 positive results (42.0%). Overall, 2446 of 4782 patients (51.2%) had only variants of uncertain significance. Patients referred for arrhythmogenic cardiomyopathy had the lowest rate of variants of uncertain significance (81 of 176 patients [46.0%]), and patients referred for catecholaminergic polymorphic ventricular tachycardia had the highest rate (48 of 76 patients [63.2%]).
Conclusions and Relevance: In this study, comprehensive genetic testing for cardiomyopathies and arrhythmias revealed diagnoses that would have been missed by disease-specific testing. In addition, comprehensive testing provided diagnostic and prognostic information that could have potentially changed management and monitoring strategies for patients and their family members. These results suggest that this improved diagnostic yield may outweigh the burden of uncertain results.
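The percentages in the Results of this abstract can be recomputed from the stated counts. The following Python sketch is only an arithmetic check; the counts and reported percentages are copied from the abstract, while the labels and variable names are illustrative:

```python
# (label, numerator, denominator, percentage as reported in the abstract)
REPORTED = [
    ("male patients", 2551, 4782, 53.3),
    ("positive results", 954, 4782, 19.9),
    ("positive results informing management", 630, 954, 66.0),
    ("positives missed by indication-restricted panels", 75, 689, 10.9),
    ("cross-indication findings", 27, 75, 36.0),
    ("positive cascade tests", 402, 958, 42.0),
    ("patients with only VUS", 2446, 4782, 51.2),
    ("VUS rate, arrhythmogenic cardiomyopathy", 81, 176, 46.0),
    ("VUS rate, CPVT", 48, 76, 63.2),
]


def percent(numerator, denominator):
    """Percentage rounded to one decimal place, as in the abstract."""
    return round(100.0 * numerator / denominator, 1)


# Collect every entry whose recomputed value differs from the reported one.
mismatches = [
    (label, percent(n, d), reported)
    for label, n, d, reported in REPORTED
    if abs(percent(n, d) - reported) > 1e-9
]
print(mismatches if mismatches else "all reported percentages are consistent")
```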
Affiliation(s)
- Lisa M Dellefave-Castillo
  - Center for Genetic Medicine, Feinberg School of Medicine, Northwestern University, Chicago, Illinois
- Allison L Cirino
  - Cardiovascular Division, Brigham and Women's Hospital, Boston, Massachusetts
  - Institute of Health Professions, Massachusetts General Hospital, Boston
- John Garcia
  - Invitae Corporation, San Francisco, California
- Ana Morales
  - Invitae Corporation, San Francisco, California
- Elizabeth M McNally
  - Center for Genetic Medicine, Feinberg School of Medicine, Northwestern University, Chicago, Illinois
|
41
|
Koren I. The hidden (degron) truth behind the degradation of DHFR disease-associated variants. Structure 2022; 30:1219-1221. [PMID: 36055220 DOI: 10.1016/j.str.2022.08.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
In this issue of Structure, Kampmeyer et al. provide detailed mechanistic insights into how structural changes in disease-associated dihydrofolate reductase (DHFR) missense variants affect their cellular protein abundance and discuss implications for hereditary megaloblastic anemia disease.
Affiliation(s)
- Itay Koren
  - The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan, 5290002, Israel
|
42
|
Wang B, Gamazon ER. Modeling mutational effects on biochemical phenotypes using convolutional neural networks: application to SARS-CoV-2. iScience 2022; 25:104500. [PMID: 35669036 PMCID: PMC9159778 DOI: 10.1016/j.isci.2022.104500] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2021] [Revised: 11/15/2021] [Accepted: 05/26/2022] [Indexed: 11/29/2022]
Abstract
Deep mutational scanning (DMS) experiments have been performed on SARS-CoV-2’s spike receptor-binding domain (RBD) and human angiotensin-converting enzyme 2 (ACE2) zinc-binding peptidase domain—both central players in viral infection and evolution and antibody evasion—quantifying how mutations impact biochemical phenotypes. We modeled biochemical phenotypes from massively parallel assays, using neural networks trained on protein sequence mutations in the virus and human host. Neural networks were significantly predictive of binding affinity, protein expression, and antibody escape, learning complex interactions and higher-order features that are difficult to capture with conventional methods from structural biology. Integrating the physicochemical properties of amino acids, such as hydrophobicity and long-range non-bonded energy per atom, significantly improved prediction (empirical p < 0.01). We observed concordance of the neural network predictions with molecular dynamics (multiple 500 ns or 1 μs all-atom) simulations of the spike protein-ACE2 interface, with critical implications for the use of deep learning to dissect molecular mechanisms.
Highlights
- Deep learning models of biochemical phenotypes from deep mutational scanning (DMS) data
- Prediction performance gain from using physicochemical properties of amino acids
- Concordance of neural network predictions with molecular dynamics simulations
- Improved causal inference properties for neural-network-defined phenotypes
Affiliation(s)
- Bo Wang
  - Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Eric R Gamazon
  - Division of Genetic Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, USA
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN 37232, USA
  - Data Science Institute, Vanderbilt University Medical Center, Nashville, TN 37232, USA
  - Clare Hall, University of Cambridge, Cambridge CB3 9AL, UK
|
43
|
Loss-of-function, gain-of-function and dominant-negative mutations have profoundly different effects on protein structure. Nat Commun 2022; 13:3895. [PMID: 35794153 PMCID: PMC9259657 DOI: 10.1038/s41467-022-31686-6] [Citation(s) in RCA: 73] [Impact Index Per Article: 36.5] [Received: 11/02/2021] [Accepted: 06/29/2022] [Indexed: 12/12/2022]
Abstract
Most known pathogenic mutations occur in protein-coding regions of DNA and change the way proteins are made. Taking protein structure into account has therefore provided great insight into the molecular mechanisms underlying human genetic disease. While there has been much focus on how mutations can disrupt protein structure and thus cause a loss of function (LOF), alternative mechanisms, specifically dominant-negative (DN) and gain-of-function (GOF) effects, are less understood. Here, we investigate the protein-level effects of pathogenic missense mutations associated with different molecular mechanisms. We observe striking differences between recessive vs dominant, and LOF vs non-LOF mutations, with dominant, non-LOF disease mutations having much milder effects on protein structure, and DN mutations being highly enriched at protein interfaces. We also find that nearly all computational variant effect predictors, even those based solely on sequence conservation, underperform on non-LOF mutations. However, we do show that non-LOF mutations could potentially be identified by their tendency to cluster in three-dimensional space. Overall, our work suggests that many pathogenic mutations that act via DN and GOF mechanisms are likely being missed by current variant prioritisation strategies, but that there is considerable scope to improve computational predictions through consideration of molecular disease mechanisms.
Most known pathogenic mutations occur in protein-coding regions of DNA and change the way proteins are made. Here the authors analyse the locations of thousands of human disease mutations and their predicted effects on protein structure and show that, while loss-of-function mutations tend to be highly disruptive, non-loss-of-function mutations are in general much milder at a protein structural level.
|
44
|
Anderson CL, Munawar S, Reilly L, Kamp TJ, January CT, Delisle BP, Eckhardt LL. How Functional Genomics Can Keep Pace With VUS Identification. Front Cardiovasc Med 2022; 9:900431. [PMID: 35859585 PMCID: PMC9291992 DOI: 10.3389/fcvm.2022.900431] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/20/2022] [Accepted: 06/09/2022] [Indexed: 01/03/2023] Open
Abstract
Over the last two decades, an exponentially expanding number of genetic variants have been identified associated with inherited cardiac conditions. These tremendous gains also present challenges in deciphering the clinical relevance of unclassified variants or variants of uncertain significance (VUS). This review provides an overview of the advancements (and challenges) in functional and computational approaches to characterize variants and help keep pace with VUS identification related to inherited heart diseases.
Affiliation(s)
- Corey L. Anderson
- Cellular and Molecular Arrythmias Program, Division of Cardiovascular Medicine, Department of Medicine, University of Wisconsin-Madison, Madison, WI, United States
| | - Saba Munawar
- Cellular and Molecular Arrythmias Program, Division of Cardiovascular Medicine, Department of Medicine, University of Wisconsin-Madison, Madison, WI, United States
| | - Louise Reilly
- Cellular and Molecular Arrythmias Program, Division of Cardiovascular Medicine, Department of Medicine, University of Wisconsin-Madison, Madison, WI, United States
| | - Timothy J. Kamp
- Cellular and Molecular Arrythmias Program, Division of Cardiovascular Medicine, Department of Medicine, University of Wisconsin-Madison, Madison, WI, United States
| | - Craig T. January
- Cellular and Molecular Arrythmias Program, Division of Cardiovascular Medicine, Department of Medicine, University of Wisconsin-Madison, Madison, WI, United States
| | - Brian P. Delisle
- Department of Physiology, University of Kentucky College of Medicine, Lexington, KY, United States
| | - Lee L. Eckhardt
- Cellular and Molecular Arrythmias Program, Division of Cardiovascular Medicine, Department of Medicine, University of Wisconsin-Madison, Madison, WI, United States
|
45
|
Kampmeyer C, Larsen-Ledet S, Wagnkilde MR, Michelsen M, Iversen HKM, Nielsen SV, Lindemose S, Caregnato A, Ravid T, Stein A, Teilum K, Lindorff-Larsen K, Hartmann-Petersen R. Disease-linked mutations cause exposure of a protein quality control degron. Structure 2022; 30:1245-1253.e5. [PMID: 35700725 DOI: 10.1016/j.str.2022.05.016] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/14/2021] [Revised: 04/08/2022] [Accepted: 05/18/2022] [Indexed: 10/18/2022]
Abstract
More than half of disease-causing missense variants are thought to lead to protein degradation, but the molecular mechanism of how these variants are recognized by the cell remains enigmatic. Degrons are stretches of amino acids that help mediate recognition by E3 ligases and thus confer protein degradation via the ubiquitin-proteasome system. While degrons that mediate controlled degradation of, for example, signaling components and cell-cycle regulators are well described, so-called protein-quality-control degrons that mediate the degradation of destabilized proteins are poorly understood. Here, we show that disease-linked dihydrofolate reductase (DHFR) missense variants are structurally destabilized and chaperone-dependent proteasome targets. We find two regions in DHFR that act as degrons, and the proteasomal turnover of one of these was dependent on the molecular chaperone Hsp70. Structural analyses by nuclear magnetic resonance (NMR) and hydrogen/deuterium exchange revealed that this degron is buried in wild-type DHFR but becomes transiently exposed in the disease-linked missense variants.
Affiliation(s)
- Caroline Kampmeyer
- The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, 2200 Copenhagen, Denmark
| | - Sven Larsen-Ledet
- The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, 2200 Copenhagen, Denmark
| | - Morten Rose Wagnkilde
- The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, 2200 Copenhagen, Denmark
| | - Mathias Michelsen
- The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, 2200 Copenhagen, Denmark
| | - Henriette K M Iversen
- The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, 2200 Copenhagen, Denmark
| | - Sofie V Nielsen
- The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, 2200 Copenhagen, Denmark
| | - Søren Lindemose
- The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, 2200 Copenhagen, Denmark
| | - Alberto Caregnato
- The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, 2200 Copenhagen, Denmark
| | - Tommer Ravid
- Department of Biological Chemistry, The Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Edmond J. Safra Campus, Givat-Ram, 91904 Jerusalem, Israel
| | - Amelie Stein
- The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, 2200 Copenhagen, Denmark
| | - Kaare Teilum
- The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, 2200 Copenhagen, Denmark
| | - Kresten Lindorff-Larsen
- The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, 2200 Copenhagen, Denmark
| | - Rasmus Hartmann-Petersen
- The Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Ole Maaløes Vej 5, 2200 Copenhagen, Denmark
|
46
|
Horne J, Shukla D. Recent Advances in Machine Learning Variant Effect Prediction Tools for Protein Engineering. Ind Eng Chem Res 2022; 61:6235-6245. [DOI: 10.1021/acs.iecr.1c04943] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 01/28/2023]
Affiliation(s)
- Jesse Horne
- Department of Chemical and Biomolecular Engineering, University of Illinois Urbana−Champaign, Champaign, Illinois 61801, United States
| | - Diwakar Shukla
- Department of Chemical and Biomolecular Engineering, University of Illinois Urbana−Champaign, Champaign, Illinois 61801, United States
- Department of Bioengineering, University of Illinois Urbana−Champaign, Champaign, Illinois 61801, United States
- Department of Plant Biology, University of Illinois Urbana−Champaign, Champaign, Illinois 61801, United States
- Cancer Center at Illinois, University of Illinois Urbana−Champaign, Champaign, Illinois 61801, United States
- Center for Biophysics and Quantitative Biology, University of Illinois Urbana−Champaign, Champaign, Illinois 61801, United States
|
47
|
Li B, Jin B, Capra JA, Bush WS. Integration of Protein Structure and Population-Scale DNA Sequence Data for Disease Gene Discovery and Variant Interpretation. Annu Rev Biomed Data Sci 2022; 5:141-161. [PMID: 35508071 DOI: 10.1146/annurev-biodatasci-122220-112147] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
The experimental and computational techniques for capturing information about protein structures and genetic variation within the human genome have advanced dramatically in the past 20 years, generating extensive new data resources. In this review, we discuss these advances, along with new approaches for determining the impact a genetic variant has on protein function. We focus on the potential of new methods that integrate human genetic variation into protein structures to discover relationships to disease, including the discovery of mutational hotspots in cancer-related proteins, the localization of protein-altering variants within protein regions for common complex diseases, and the assessment of variants of unknown significance for Mendelian traits. We expect that approaches that integrate these data sources will play increasingly important roles in disease gene discovery and variant interpretation.
Affiliation(s)
- Bian Li
- Department of Biological Sciences and Center for Structural Biology, Vanderbilt University, Nashville, Tennessee, USA
| | - Bowen Jin
- Graduate Program in Systems Biology and Bioinformatics, Department of Nutrition, School of Medicine, Case Western Reserve University, Cleveland, Ohio, USA
| | - John A Capra
- Bakar Computational Health Sciences Institute and Department of Epidemiology and Biostatistics, University of California, San Francisco, California, USA
| | - William S Bush
- Cleveland Institute for Computational Biology, Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, Ohio, USA
|
48
|
Abstract
In-cell structural biology aims at extracting structural information about proteins or nucleic acids in their native, cellular environment. This emerging field holds great promise and is already providing new facts and outlooks of interest at both fundamental and applied levels. NMR spectroscopy has important contributions on this stage: It brings information on a broad variety of nuclei at the atomic scale, which ensures its great versatility and uniqueness. Here, we detail the methods, the fundamental knowledge, and the applications in biomedical engineering related to in-cell structural biology by NMR. We finally propose a brief overview of the main other techniques in the field (EPR, smFRET, cryo-ET, etc.) to draw some advisable developments for in-cell NMR. In the era of large-scale screenings and deep learning, both accurate and qualitative experimental evidence are as essential as ever to understand the interior life of cells. In-cell structural biology by NMR spectroscopy can generate such a knowledge, and it does so at the atomic scale. This review is meant to deliver comprehensive but accessible information, with advanced technical details and reflections on the methods, the nature of the results, and the future of the field.
Affiliation(s)
- Francois-Xavier Theillet
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
|
49
|
Bossaerts L, Hendrickx Van de Craen E, Cacace R, Asselbergh B, Van Broeckhoven C. Rare missense mutations in ABCA7 might increase Alzheimer's disease risk by plasma membrane exclusion. Acta Neuropathol Commun 2022; 10:43. [PMID: 35361255 PMCID: PMC8973822 DOI: 10.1186/s40478-022-01346-3] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 02/07/2022] [Accepted: 03/11/2022] [Indexed: 11/10/2022] Open
Abstract
The adenosine triphosphate-binding cassette subfamily A member 7 gene (ABCA7) is associated with Alzheimer's disease (AD) in large genome-wide association studies. Targeted sequencing of ABCA7 suggests a role for rare premature termination codon (PTC) mutations in AD, with haploinsufficiency through nonsense-mediated mRNA decay as a plausible pathogenic mechanism. Since other classes of rare variants in ABCA7 are poorly understood, we investigated the contribution and pathogenicity of rare missense, indel and splice variants in ABCA7 in Belgian AD patient and control cohorts. We identified 8.36% rare variants in the patient cohort versus 6.05% in the control cohort. For 10 missense mutations identified in the Belgian cohort we analyzed the pathogenetic effect on protein localization in vitro using immunocytochemistry. Our results demonstrate that rare ABCA7 missense mutations can contribute to AD by inducing protein mislocalization, resulting in a lack of functional protein at the plasma membrane. In one pedigree, a mislocalization-inducing missense mutation in ABCA7 (p.G1820S) co-segregated with AD in an autosomal dominant inheritance pattern. Brain autopsy of six patient missense mutation carriers showed typical AD neuropathological characteristics including cerebral amyloid angiopathy type 1. Also, among the rare ABCA7 missense mutations, we observed mutations that affect amino acid residues that are conserved in ABCA1 and ABCA4, of which some correspond to established ABCA1 or ABCA4 disease-causing mutations involved in Tangier or Stargardt disease.
|
50
|
Tiberti M, Terkelsen T, Degn K, Beltrame L, Cremers TC, da Piedade I, Di Marco M, Maiani E, Papaleo E. MutateX: an automated pipeline for in silico saturation mutagenesis of protein structures and structural ensembles. Brief Bioinform 2022; 23:6552273. [PMID: 35323860 DOI: 10.1093/bib/bbac074] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 11/30/2021] [Revised: 01/28/2022] [Accepted: 02/16/2022] [Indexed: 12/26/2022] Open
Abstract
Mutations, which result in amino acid substitutions, influence the stability of proteins and their binding to biomolecules. A molecular understanding of the effects of protein mutations is both of biotechnological and medical relevance. Empirical free energy functions that quickly estimate the free energy change upon mutation (ΔΔG) can be exploited for systematic screenings of proteins and protein complexes. In silico saturation mutagenesis can guide the design of new experiments or rationalize the consequences of known mutations. Often software such as FoldX, while fast and reliable, lack the necessary automation features to apply them in a high-throughput manner. We introduce MutateX, a software to automate the prediction of ΔΔGs associated with the systematic mutation of each residue within a protein, or protein complex to all other possible residue types, using the FoldX energy function. MutateX also supports ΔΔG calculations over protein ensembles, upon post-translational modifications and in multimeric assemblies. At the heart of MutateX lies an automated pipeline engine that handles input preparation, parallelization and outputs publication-ready figures. We illustrate the MutateX protocol applied to different case studies. The results of the high-throughput scan provided by our tools can help in different applications, such as the analysis of disease-associated mutations, to complement experimental deep mutational scans, or assist the design of variants for industrial applications. MutateX is a collection of Python tools that relies on open-source libraries. It is available free of charge under the GNU General Public License from https://github.com/ELELAB/mutatex.
Affiliation(s)
- Matteo Tiberti
- Cancer Structural Biology, Danish Cancer Society Research Center, 2100, Copenhagen, Denmark
| | - Thilde Terkelsen
- Cancer Structural Biology, Danish Cancer Society Research Center, 2100, Copenhagen, Denmark
| | - Kristine Degn
- Cancer Systems Biology, Section for Bioinformatics, Department of Health and Technology, Technical University of Denmark, 2800, Lyngby, Denmark
| | - Ludovica Beltrame
- Cancer Structural Biology, Danish Cancer Society Research Center, 2100, Copenhagen, Denmark
| | - Tycho Canter Cremers
- Cancer Structural Biology, Danish Cancer Society Research Center, 2100, Copenhagen, Denmark
| | - Isabelle da Piedade
- Cancer Structural Biology, Danish Cancer Society Research Center, 2100, Copenhagen, Denmark
| | - Miriam Di Marco
- Cancer Structural Biology, Danish Cancer Society Research Center, 2100, Copenhagen, Denmark
| | - Emiliano Maiani
- Cancer Structural Biology, Danish Cancer Society Research Center, 2100, Copenhagen, Denmark
| | - Elena Papaleo
- Cancer Structural Biology, Danish Cancer Society Research Center, 2100, Copenhagen, Denmark
- Cancer Systems Biology, Section for Bioinformatics, Department of Health and Technology, Technical University of Denmark, 2800, Lyngby, Denmark
- Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
|