1
Shukla K, Idanwekhai K, Naradikian M, Ting S, Schoenberger SP, Brunk E. Machine Learning of Three-Dimensional Protein Structures to Predict the Functional Impacts of Genome Variation. J Chem Inf Model 2024; 64:5328-5343. [PMID: 38635316 DOI: 10.1021/acs.jcim.3c01967] [Indexed: 04/19/2024]
Abstract
Research in the human genome sciences generates a substantial amount of genetic data for hundreds of thousands of individuals, which concomitantly increases the number of variants of unknown significance (VUS). Bioinformatic analyses can successfully reveal rare variants and variants with clear associations with disease-related phenotypes. These studies have had a significant impact on how clinical genetic screens are interpreted and how patients are stratified for treatment. There are few, if any, computational methods for variants comparable to biological activity predictions. To address this gap, we developed a machine learning method that uses protein three-dimensional structures from AlphaFold to predict how a variant will influence changes to a gene's downstream biological pathways. We trained state-of-the-art machine learning classifiers to predict which protein regions will most likely impact transcriptional activities of two proto-oncogenes, nuclear factor erythroid 2 (NFE2L2)-related factor 2 (NRF2) and c-Myc. We have identified classifiers that attain accuracies higher than 80%, which have allowed us to identify a set of key protein regions that lead to significant perturbations in c-Myc or NRF2 transcriptional pathway activities.
Affiliation(s)
- Kriti Shukla
  - Department of Chemistry, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27516, United States
- Kelvin Idanwekhai
  - Department of Chemistry, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27516, United States
  - School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27516, United States
- Martin Naradikian
  - La Jolla Institute for Immunology, San Diego, California 92093, United States
- Stephanie Ting
  - Department of Chemistry, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27516, United States
  - Computational Medicine Program, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27516, United States
- Elizabeth Brunk
  - Department of Chemistry, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27516, United States
  - Department of Pharmacology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27516, United States
  - Integrative Program for Biological and Genome Sciences (IBGS), University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27516, United States
  - Computational Medicine Program, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27516, United States
2
Tabet DR, Kuang D, Lancaster MC, Li R, Liu K, Weile J, Coté AG, Wu Y, Hegele RA, Roden DM, Roth FP. Benchmarking computational variant effect predictors by their ability to infer human traits. Genome Biol 2024; 25:172. [PMID: 38951922 PMCID: PMC11218265 DOI: 10.1186/s13059-024-03314-7] [Received: 10/10/2022] [Accepted: 06/17/2024] [Indexed: 07/03/2024] [Open access]
Abstract
BACKGROUND: Computational variant effect predictors offer a scalable and increasingly reliable means of interpreting human genetic variation, but concerns of circularity and bias have limited previous methods for evaluating and comparing predictors. Population-level cohorts of genotyped and phenotyped participants that have not been used in predictor training can facilitate an unbiased benchmarking of available methods. Using a curated set of human gene-trait associations with a reported rare-variant burden association, we evaluate the correlations of 24 computational variant effect predictors with associated human traits in the UK Biobank and All of Us cohorts.
RESULTS: AlphaMissense outperformed all other predictors in inferring human traits based on rare missense variants in UK Biobank and All of Us participants. The overall rankings of computational variant effect predictors in these two cohorts showed a significant positive correlation.
CONCLUSION: We describe a method to assess computational variant effect predictors that sidesteps the limitations of previous evaluations. This approach is generalizable to future predictors and could continue to inform predictor choice for personal and clinical genetics.
Affiliation(s)
- Daniel R Tabet
  - Donnelly Centre, University of Toronto, Toronto, ON, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
- Da Kuang
  - Donnelly Centre, University of Toronto, Toronto, ON, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
- Megan C Lancaster
  - Division of Cardiovascular Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Roujia Li
  - Donnelly Centre, University of Toronto, Toronto, ON, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
- Karen Liu
  - Donnelly Centre, University of Toronto, Toronto, ON, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
- Jochen Weile
  - Donnelly Centre, University of Toronto, Toronto, ON, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
- Atina G Coté
  - Donnelly Centre, University of Toronto, Toronto, ON, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
- Yingzhou Wu
  - Donnelly Centre, University of Toronto, Toronto, ON, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
- Robert A Hegele
  - Department of Medicine, Department of Biochemistry, Schulich School of Medicine and Dentistry, Robarts Research Institute, Western University, London, ON, Canada
- Dan M Roden
  - Division of Cardiovascular Medicine, Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
  - Department of Pharmacology, Vanderbilt University Medical Center, Nashville, TN, USA
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
- Frederick P Roth
  - Donnelly Centre, University of Toronto, Toronto, ON, Canada.
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada.
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada.
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada.
  - Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA.
3
Chen SK, Liu J, Van Nynatten A, Tudor-Price BM, Chang BSW. Sampling Strategies for Experimentally Mapping Molecular Fitness Landscapes Using High-Throughput Methods. J Mol Evol 2024:10.1007/s00239-024-10179-8. [PMID: 38886207 DOI: 10.1007/s00239-024-10179-8] [Received: 02/28/2024] [Accepted: 05/20/2024] [Indexed: 06/20/2024]
Abstract
Empirical studies of genotype-phenotype-fitness maps of proteins are fundamental to understanding the evolutionary process, in elucidating the space of possible genotypes accessible through mutations in a landscape of phenotypes and fitness effects. Yet, comprehensively mapping molecular fitness landscapes remains challenging since all possible combinations of amino acid substitutions for even a few protein sites are encoded by an enormous genotype space. High-throughput mapping of genotype space can be achieved using large-scale screening experiments known as multiplexed assays of variant effect (MAVEs). However, to accommodate such multi-mutational studies, the size of MAVEs has grown to the point where a priori determination of sampling requirements is needed. To address this problem, we propose calculations and simulation methods to approximate minimum sampling requirements for multi-mutational MAVEs, which we combine with a new library construction protocol to experimentally validate our approximation approaches. Analysis of our simulated data reveals how sampling trajectories differ between simulations of nucleotide versus amino acid variants and among mutagenesis schemes. For this, we show quantitatively that marginal gains in sampling efficiency demand increasingly greater sampling effort when sampling for nucleotide sequences over their encoded amino acid equivalents. We present a new library construction protocol that efficiently maximizes sequence variation, and demonstrate using ultradeep sequencing that the library encodes virtually all possible combinations of mutations within the experimental design. Insights learned from our analyses together with the methodological advances reported herein are immediately applicable toward pooled experimental screens of arbitrary design, enabling further assay upscaling and expanded testing of genotype space.
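As a rough illustration of why a priori sampling estimates matter for large libraries, the sketch below applies the classical coupon-collector approximation to a combinatorial variant library. It is not the authors' method: it assumes every variant is equally likely to be drawn, which real libraries (for example, degenerate-codon designs) generally violate, and the function names and the three-site, 20-amino-acid example are hypothetical.

```python
import math


def expected_draws_full_coverage(n_variants: int) -> float:
    """Expected number of uniform random draws needed to observe every
    variant at least once (coupon collector: N * H_N)."""
    if n_variants < 1:
        raise ValueError("n_variants must be at least 1")
    harmonic = sum(1.0 / k for k in range(1, n_variants + 1))
    return n_variants * harmonic


def draws_for_expected_coverage(n_variants: int, fraction: float) -> int:
    """Smallest number of draws m whose expected coverage,
    1 - (1 - 1/N)^m, reaches the requested fraction."""
    if n_variants < 1:
        raise ValueError("n_variants must be at least 1")
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    if n_variants == 1:
        return 1  # a single draw already covers a one-variant library
    m = math.log(1.0 - fraction) / math.log(1.0 - 1.0 / n_variants)
    return math.ceil(m)


if __name__ == "__main__":
    # Hypothetical design: all amino acid combinations at three sites.
    n = 20 ** 3
    print("library size:", n)
    print("draws for full coverage (expected):", round(expected_draws_full_coverage(n)))
    print("draws for 95% expected coverage:", draws_for_expected_coverage(n, 0.95))
```

Under the uniform assumption, the last few unseen variants dominate the cost of full coverage, which is consistent with the abstract's point that marginal gains in sampling efficiency demand increasingly greater effort. Non-uniform variant frequencies, such as those arising from codon degeneracy, would push these requirements higher.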
Affiliation(s)
- Steven K Chen
  - Department of Cell & Systems Biology, University of Toronto, Toronto, ON, Canada
- Jing Liu
  - Department of Cell & Systems Biology, University of Toronto, Toronto, ON, Canada
- Alexander Van Nynatten
  - Department of Biological Science, University of Toronto Scarborough, Toronto, ON, Canada
- Belinda S W Chang
  - Department of Cell & Systems Biology, University of Toronto, Toronto, ON, Canada.
  - Department of Ecology & Evolutionary Biology, University of Toronto, Toronto, ON, Canada.
  - Centre for the Analysis of Genome Evolution & Function, University of Toronto, Toronto, ON, Canada.
4
Ma K, Gauthier LO, Cheung F, Huang S, Lek M. High-throughput assays to assess variant effects on disease. Dis Model Mech 2024; 17:dmm050573. [PMID: 38940340 PMCID: PMC11225591 DOI: 10.1242/dmm.050573] [Indexed: 06/29/2024] [Open access]
Abstract
Interpreting the wealth of rare genetic variants discovered in population-scale sequencing efforts and deciphering their associations with human health and disease present a critical challenge due to the lack of sufficient clinical case reports. One promising avenue to overcome this problem is deep mutational scanning (DMS), a method of introducing and evaluating large-scale genetic variants in model cell lines. DMS allows unbiased investigation of variants, including those that are not found in clinical reports, thus improving rare disease diagnostics. Currently, the main obstacle limiting the full potential of DMS is the availability of functional assays that are specific to disease mechanisms. Thus, we explore high-throughput functional methodologies suitable to examine broad disease mechanisms. We specifically focus on methods that do not require robotics or automation but instead use well-designed molecular tools to transform biological mechanisms into easily detectable signals, such as cell survival rate, fluorescence or drug resistance. Here, we aim to bridge the gap between disease-relevant assays and their integration into the DMS framework.
Affiliation(s)
- Kaiyue Ma
  - Department of Genetics, Yale School of Medicine, New Haven, CT 06510, USA
- Logan O. Gauthier
  - Department of Genetics, Yale School of Medicine, New Haven, CT 06510, USA
- Frances Cheung
  - Department of Genetics, Yale School of Medicine, New Haven, CT 06510, USA
- Shushu Huang
  - Department of Genetics, Yale School of Medicine, New Haven, CT 06510, USA
- Monkol Lek
  - Department of Genetics, Yale School of Medicine, New Haven, CT 06510, USA
5
Muhammad A, Calandranis ME, Li B, Yang T, Blackwell DJ, Harvey ML, Smith JE, Daniel ZA, Chew AE, Capra JA, Matreyek KA, Fowler DM, Roden DM, Glazer AM. High-throughput functional mapping of variants in an arrhythmia gene, KCNE1, reveals novel biology. Genome Med 2024; 16:73. [PMID: 38816749 PMCID: PMC11138074 DOI: 10.1186/s13073-024-01340-5] [Received: 06/28/2023] [Accepted: 04/26/2024] [Indexed: 06/01/2024] [Open access]
Abstract
BACKGROUND: KCNE1 encodes a 129-residue cardiac potassium channel (IKs) subunit. KCNE1 variants are associated with long QT syndrome and atrial fibrillation. However, most variants have insufficient evidence of clinical consequences and thus limited clinical utility.
METHODS: In this study, we leveraged the power of variant effect mapping, which couples saturation mutagenesis with high-throughput sequencing, to ascertain the function of thousands of protein-coding KCNE1 variants.
RESULTS: We comprehensively assayed KCNE1 variant cell surface expression (2554/2709 possible single-amino-acid variants) and function (2534 variants). Our study identified 470 loss- or partial loss-of-surface expression and 574 loss- or partial loss-of-function variants. Of the 574 loss- or partial loss-of-function variants, 152 (26.5%) had reduced cell surface expression, indicating that most functionally deleterious variants affect channel gating. Nonsense variants at residues 56-104 generally had WT-like trafficking scores but decreased functional scores, indicating that the latter half of the protein is dispensable for protein trafficking but essential for channel function. 22 of the 30 KCNE1 residues (73%) highly intolerant of variation (with >70% loss-of-function variants) were in predicted close contact with binding partners KCNQ1 or calmodulin. Our functional assay data were consistent with gold standard electrophysiological data (ρ = -0.64), population and patient cohorts (32/38 presumed benign or pathogenic variants with consistent scores), and computational predictors (ρ = -0.62). Our data provide moderate-strength evidence for the American College of Medical Genetics/Association of Molecular Pathology functional criteria for benign and pathogenic variants.
CONCLUSIONS: Comprehensive variant effect maps of KCNE1 can both provide insight into IKs channel biology and help reclassify variants of uncertain significance.
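The counts reported in this abstract can be cross-checked with a few lines of arithmetic. The sketch below recomputes the stated percentages from the stated counts. The decomposition of 2709 as 129 residues times 21 outcomes per residue is an assumption made here for illustration (the abstract gives only the total), and the helper and constant names are hypothetical.

```python
def percent(part: int, whole: int, digits: int = 1) -> float:
    """Return part/whole as a percentage rounded to `digits` decimals."""
    return round(100.0 * part / whole, digits)


# Counts taken from the abstract.
RESIDUES = 129
POSSIBLE_VARIANTS = 2709
SURFACE_ASSAYED = 2554
FUNCTION_ASSAYED = 2534
LOF_TOTAL = 574
LOF_WITH_REDUCED_SURFACE = 152
INTOLERANT_RESIDUES = 30
INTOLERANT_IN_CONTACT = 22

# Assumption (not stated in the abstract): 21 outcomes per residue.
OUTCOMES_PER_RESIDUE = POSSIBLE_VARIANTS // RESIDUES

if __name__ == "__main__":
    print("outcomes per residue:", OUTCOMES_PER_RESIDUE)
    print("surface-expression coverage (%):", percent(SURFACE_ASSAYED, POSSIBLE_VARIANTS))
    print("function coverage (%):", percent(FUNCTION_ASSAYED, POSSIBLE_VARIANTS))
    print("LOF with reduced surface expression (%):",
          percent(LOF_WITH_REDUCED_SURFACE, LOF_TOTAL))
    print("intolerant residues near partners (%):",
          percent(INTOLERANT_IN_CONTACT, INTOLERANT_RESIDUES, 0))
```

The recomputed 26.5% and 73% agree with the abstract. The complement of the first figure, the 422 loss-of-function variants without reduced surface expression, is what supports the statement that most functionally deleterious variants act on gating rather than trafficking.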
Affiliation(s)
- Ayesha Muhammad
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, 1235 Medical Research Building IV, 2215B Garland Avenue, Nashville, TN, 37232, USA
  - Medical Scientist Training Program, Vanderbilt University, Nashville, TN, 37232, USA
- Maria E Calandranis
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
- Bian Li
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
  - Regeneron Pharmaceuticals Inc., Tarrytown, NY, USA
- Tao Yang
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
- Daniel J Blackwell
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
- M Lorena Harvey
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
- Jeremy E Smith
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
- Zerubabell A Daniel
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
- Ashli E Chew
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
- John A Capra
  - Bakar Computational Health Sciences Institute and Department of Epidemiology and Biostatistics, University of California, San Francisco, CA, 94143, USA
- Kenneth A Matreyek
  - Department of Pathology, Case Western Reserve University School of Medicine, Cleveland, OH, 44106, USA
- Douglas M Fowler
  - Department of Genome Sciences, University of Washington, Seattle, WA, 98195, USA
- Dan M Roden
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, 1235 Medical Research Building IV, 2215B Garland Avenue, Nashville, TN, 37232, USA
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
  - Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
  - Department of Pharmacology, Vanderbilt University Medical Center, Nashville, TN, 37232, USA
- Andrew M Glazer
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, 1235 Medical Research Building IV, 2215B Garland Avenue, Nashville, TN, 37232, USA.
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, 37232, USA.
6
Clausen L, Okarmus J, Voutsinos V, Meyer M, Lindorff-Larsen K, Hartmann-Petersen R. PRKN-linked familial Parkinson's disease: cellular and molecular mechanisms of disease-linked variants. Cell Mol Life Sci 2024; 81:223. [PMID: 38767677 PMCID: PMC11106057 DOI: 10.1007/s00018-024-05262-8] [Received: 02/27/2024] [Revised: 04/25/2024] [Accepted: 05/02/2024] [Indexed: 05/22/2024]
Abstract
Parkinson's disease (PD) is a common and incurable neurodegenerative disorder that arises from the loss of dopaminergic neurons in the substantia nigra and is mainly characterized by progressive loss of motor function. Monogenic familial PD is associated with highly penetrant variants in specific genes, notably the PRKN gene, where homozygous or compound heterozygous loss-of-function variants predominate. PRKN encodes Parkin, an E3 ubiquitin-protein ligase important for protein ubiquitination and mitophagy of damaged mitochondria. Accordingly, Parkin plays a central role in mitochondrial quality control but is itself also subject to a strict protein quality control system that rapidly eliminates certain disease-linked Parkin variants. Here, we summarize the cellular and molecular functions of Parkin, highlighting the various mechanisms by which PRKN gene variants result in loss-of-function. We emphasize the importance of high-throughput assays and computational tools for the clinical classification of PRKN gene variants and how detailed insights into the pathogenic mechanisms of PRKN gene variants may impact the development of personalized therapeutics.
Affiliation(s)
- Lene Clausen
  - Department of Biology, Linderstrøm-Lang Centre for Protein Science, University of Copenhagen, 2200, Copenhagen, Denmark
- Justyna Okarmus
  - Department of Neurobiology Research, Institute of Molecular Medicine, University of Southern Denmark, 5230, Odense, Denmark
- Vasileios Voutsinos
  - Department of Biology, Linderstrøm-Lang Centre for Protein Science, University of Copenhagen, 2200, Copenhagen, Denmark
- Morten Meyer
  - Department of Neurobiology Research, Institute of Molecular Medicine, University of Southern Denmark, 5230, Odense, Denmark
  - Department of Neurology, Odense University Hospital, 5000, Odense, Denmark
  - Department of Clinical Research, BRIDGE, Brain Research Inter Disciplinary Guided Excellence, University of Southern Denmark, 5230, Odense, Denmark
- Kresten Lindorff-Larsen
  - Department of Biology, Linderstrøm-Lang Centre for Protein Science, University of Copenhagen, 2200, Copenhagen, Denmark
- Rasmus Hartmann-Petersen
  - Department of Biology, Linderstrøm-Lang Centre for Protein Science, University of Copenhagen, 2200, Copenhagen, Denmark.
7
Grønbæk-Thygesen M, Voutsinos V, Johansson KE, Schulze TK, Cagiada M, Pedersen L, Clausen L, Nariya S, Powell RL, Stein A, Fowler DM, Lindorff-Larsen K, Hartmann-Petersen R. Deep mutational scanning reveals a correlation between degradation and toxicity of thousands of aspartoacylase variants. Nat Commun 2024; 15:4026. [PMID: 38740822 DOI: 10.1038/s41467-024-48481-0] [Received: 10/18/2023] [Accepted: 05/02/2024] [Indexed: 05/16/2024] [Open access]
Abstract
Unstable proteins are prone to form non-native interactions with other proteins and thereby may become toxic. To mitigate this, destabilized proteins are targeted by the protein quality control network. Here we present systematic studies of the cytosolic aspartoacylase, ASPA, where variants are linked to Canavan disease, a lethal neurological disorder. We determine the abundance of 6152 of the 6260 (~98%) possible single amino acid substitutions and nonsense ASPA variants in human cells. Most low abundance variants are degraded through the ubiquitin-proteasome pathway and become toxic upon prolonged expression. The data correlates with predicted changes in thermodynamic stability, evolutionary conservation, and separate disease-linked variants from benign variants. Mapping of degradation signals (degrons) shows that these are often buried and the C-terminal region functions as a degron. The data can be used to interpret Canavan disease variants and provide insight into the relationship between protein stability, degradation and cell fitness.
Affiliation(s)
- Martin Grønbæk-Thygesen
  - Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Vasileios Voutsinos
  - Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Kristoffer E Johansson
  - Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Thea K Schulze
  - Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Matteo Cagiada
  - Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Line Pedersen
  - Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Lene Clausen
  - Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Snehal Nariya
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Rachel L Powell
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA
- Amelie Stein
  - Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Douglas M Fowler
  - Department of Genome Sciences, University of Washington, Seattle, WA, USA.
  - Department of Bioengineering, University of Washington, Seattle, WA, USA.
- Kresten Lindorff-Larsen
  - Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark.
- Rasmus Hartmann-Petersen
  - Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark.
8
Zhang R, Li S, Schippers K, Li Y, Eimers B, Lavrijsen M, Wang L, Cui G, Chen X, Peppelenbosch MP, Lebbink JH, Smits R. Analysis of Tumor-Associated AXIN1 Missense Mutations Identifies Variants That Activate β-Catenin Signaling. Cancer Res 2024; 84:1443-1459. [PMID: 38359148 PMCID: PMC11063763 DOI: 10.1158/0008-5472.can-23-2268] [Received: 08/09/2023] [Revised: 11/14/2023] [Accepted: 02/12/2024] [Indexed: 02/17/2024]
Abstract
AXIN1 is a major component of the β-catenin destruction complex and is frequently mutated in various cancer types, particularly liver cancers. Truncating AXIN1 mutations are recognized to encode a defective protein that leads to β-catenin stabilization, but the functional consequences of missense mutations are not well characterized. Here, we first identified the GSK3β, β-catenin, and RGS/APC interaction domains of AXIN1 that are the most critical for proper β-catenin regulation. Analysis of 80 tumor-associated variants in these domains identified 18 that significantly affected β-catenin signaling. Coimmunoprecipitation experiments revealed that most of them lost binding to the binding partner corresponding to the mutated domain. A comprehensive protein structure analysis predicted the consequences of these mutations, which largely overlapped with the observed effects on β-catenin signaling in functional experiments. The structure analysis also predicted that loss-of-function mutations within the RGS/APC interaction domain either directly affected the interface for APC binding or were located within the hydrophobic core and destabilized the entire structure. In addition, truncated AXIN1 length inversely correlated with the β-catenin regulatory function, with longer proteins retaining more functionality. These analyses suggest that all AXIN1-truncating mutations at least partially affect β-catenin regulation, whereas this is only the case for a subset of missense mutations. Consistently, most colorectal and liver cancers carrying missense variants acquire mutations in other β-catenin regulatory genes such as APC and CTNNB1. These results will aid the functional annotation of AXIN1 mutations identified in large-scale sequencing efforts or in individual patients.
SIGNIFICANCE: Characterization of 80 tumor-associated missense variants of AXIN1 reveals a subset of 18 mutations that disrupt its β-catenin regulatory function, whereas the majority are passenger mutations.
Affiliation(s)
- Ruyi Zhang
  - Department of Gastroenterology and Hepatology, Erasmus MC Cancer Institute, University Medical Center, Rotterdam, the Netherlands
- Shanshan Li
  - Department of Gastroenterology and Hepatology, Erasmus MC Cancer Institute, University Medical Center, Rotterdam, the Netherlands
- Kelly Schippers
  - Department of Gastroenterology and Hepatology, Erasmus MC Cancer Institute, University Medical Center, Rotterdam, the Netherlands
- Yunlong Li
  - Department of Gastroenterology and Hepatology, Erasmus MC Cancer Institute, University Medical Center, Rotterdam, the Netherlands
- Boaz Eimers
  - Department of Gastroenterology and Hepatology, Erasmus MC Cancer Institute, University Medical Center, Rotterdam, the Netherlands
- Marla Lavrijsen
  - Department of Gastroenterology and Hepatology, Erasmus MC Cancer Institute, University Medical Center, Rotterdam, the Netherlands
- Ling Wang
  - Department of Gastroenterology and Hepatology, Erasmus MC Cancer Institute, University Medical Center, Rotterdam, the Netherlands
- Guofei Cui
  - Cancer Biology Program, University of Hawaii Cancer Center, Honolulu, Hawaii
- Xin Chen
  - Cancer Biology Program, University of Hawaii Cancer Center, Honolulu, Hawaii
- Maikel P. Peppelenbosch
  - Department of Gastroenterology and Hepatology, Erasmus MC Cancer Institute, University Medical Center, Rotterdam, the Netherlands
- Joyce H.G. Lebbink
  - Department of Molecular Genetics, Oncode Institute, Erasmus MC Cancer Institute, Erasmus University Medical Center, Rotterdam, the Netherlands
  - Department of Radiotherapy, Erasmus University Medical Center, Rotterdam, the Netherlands
- Ron Smits
  - Department of Gastroenterology and Hepatology, Erasmus MC Cancer Institute, University Medical Center, Rotterdam, the Netherlands
9
Tsishyn M, Cia G, Hermans P, Kwasigroch J, Rooman M, Pucci F. FiTMuSiC: leveraging structural and (co)evolutionary data for protein fitness prediction. Hum Genomics 2024; 18:36. [PMID: 38627807 PMCID: PMC11020440 DOI: 10.1186/s40246-024-00605-9] [Received: 08/01/2023] [Accepted: 04/02/2024] [Indexed: 04/19/2024] [Open access]
Abstract
Systematically predicting the effects of mutations on protein fitness is essential for the understanding of genetic diseases. Indeed, predictions complement experimental efforts in analyzing how variants lead to dysfunctional proteins that in turn can cause diseases. Here we present our new fitness predictor, FiTMuSiC, which leverages structural, evolutionary and coevolutionary information. We show that FiTMuSiC predicts fitness with high accuracy despite the simplicity of its underlying model: it was among the top predictors on the hydroxymethylbilane synthase (HMBS) target of the sixth round of the Critical Assessment of Genome Interpretation challenge (CAGI6) and performs as well as much more complex deep learning models such as AlphaMissense. To further demonstrate FiTMuSiC's robustness, we compared its predictions with in vitro activity data on HMBS, variant fitness data on human glucokinase (GCK), and variant deleteriousness data on HMBS and GCK. These analyses further confirm FiTMuSiC's qualities and accuracy, which compare favorably with those of other predictors. Additionally, FiTMuSiC returns two scores that separately describe the functional and structural effects of the variant, thus providing mechanistic insight into why the variant leads to fitness loss or gain. We also provide an easy-to-use webserver at https://babylone.ulb.ac.be/FiTMuSiC, which is freely available for academic use and does not require any bioinformatics expertise, which simplifies the accessibility of our tool for the entire scientific community.
Collapse
Affiliation(s)
- Matsvei Tsishyn
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, 50 Roosevelt Ave, 1050, Brussels, Belgium
- Interuniversity Institute of Bioinformatics in Brussels, Triumph Bvd, 1050, Brussels, Belgium
| | - Gabriel Cia
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, 50 Roosevelt Ave, 1050, Brussels, Belgium
- Interuniversity Institute of Bioinformatics in Brussels, Triumph Bvd, 1050, Brussels, Belgium
| | - Pauline Hermans
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, 50 Roosevelt Ave, 1050, Brussels, Belgium
- Interuniversity Institute of Bioinformatics in Brussels, Triumph Bvd, 1050, Brussels, Belgium
| | - Jean Kwasigroch
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, 50 Roosevelt Ave, 1050, Brussels, Belgium
- Interuniversity Institute of Bioinformatics in Brussels, Triumph Bvd, 1050, Brussels, Belgium
| | - Marianne Rooman
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, 50 Roosevelt Ave, 1050, Brussels, Belgium
- Interuniversity Institute of Bioinformatics in Brussels, Triumph Bvd, 1050, Brussels, Belgium
| | - Fabrizio Pucci
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, 50 Roosevelt Ave, 1050, Brussels, Belgium.
- Interuniversity Institute of Bioinformatics in Brussels, Triumph Bvd, 1050, Brussels, Belgium.
| |
|
10
|
Rao S, Sadybekov A, DeWitt DC, Lipka J, Katritch V, Herring BE. Detection of autism spectrum disorder-related pathogenic trio variants by a novel structure-based approach. Mol Autism 2024; 15:12. [PMID: 38566250 PMCID: PMC10988830 DOI: 10.1186/s13229-024-00590-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2023] [Accepted: 02/16/2024] [Indexed: 04/04/2024] Open
Abstract
BACKGROUND: Glutamatergic synapse dysfunction is believed to underlie the development of Autism Spectrum Disorder (ASD) and Intellectual Disability (ID) in many individuals. However, identification of genetic markers that contribute to synaptic dysfunction in these individuals is notoriously difficult. Based on genomic analysis, structural modeling, and functional data, we recently established the involvement of the TRIO-RAC1 pathway in ASD and ID. Furthermore, we identified a pathological de novo missense mutation hotspot in TRIO's GEF1 domain. ASD/ID-related missense mutations within this domain compromise glutamatergic synapse function and likely contribute to the development of ASD/ID. The number of ASD/ID cases with mutations identified within TRIO's GEF1 domain is increasing. However, tools for accurately predicting whether such mutations are detrimental to protein function are lacking.
METHODS: Here we deployed advanced protein structural modeling techniques to predict potential de novo pathogenic and benign mutations within TRIO's GEF1 domain. Mutant TRIO-9 constructs were generated and expressed in CA1 pyramidal neurons of organotypic cultured hippocampal slices. AMPA receptor-mediated postsynaptic currents were examined in these neurons using dual whole-cell patch clamp electrophysiology. We also validated these findings using orthogonal co-immunoprecipitation and fluorescence lifetime imaging (FLIM-FRET) experiments to assay TRIO mutant overexpression effects on TRIO-RAC1 binding and on RAC1 activity in HEK293/T cells.
RESULTS: Missense mutations in TRIO's GEF1 domain that were predicted to disrupt TRIO-RAC1 binding or stability were tested experimentally and found to greatly impair TRIO-9's influence on glutamatergic synapse function. In contrast, missense mutations in TRIO's GEF1 domain that were predicted to have minimal effect on TRIO-RAC1 binding or stability did not impair TRIO-9's influence on glutamatergic synapse function in our experimental assays. In orthogonal assays, we find that most of the mutations predicted to disrupt binding display loss of function, but the results for mutants predicted to disrupt stability do not match our neuronal electrophysiological data.
LIMITATIONS: We present a method to predict missense mutations in TRIO's GEF1 domain that may compromise TRIO function and test for effects in a limited number of assays. Possible limitations arising from the model systems employed here can be addressed in future studies. Our method does not provide evidence for whether these mutations confer ASD/ID risk or the likelihood that such mutations will result in the development of ASD/ID.
CONCLUSIONS: Here we show that a combination of structure-based computational predictions and experimental validation can be employed to reliably predict whether missense mutations in the human TRIO gene impede TRIO protein function and compromise TRIO's role in glutamatergic synapse regulation. With the growing accessibility of genome sequencing, the use of such tools in the accurate identification of pathological mutations will be instrumental in diagnostics of ASD/ID.
Affiliation(s)
- Sadhna Rao
- Department of Biological Sciences, Neurobiology Section, Dornsife College of Letters, Arts and Sciences, University of Southern California, Los Angeles, CA, 90089, USA.
- Neuroscience Graduate Program, University of Southern California, Los Angeles, CA, 90089, USA.
| | - Anastasiia Sadybekov
- Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, 90089, USA
- Department of Chemistry, University of Southern California, Los Angeles, CA, 90089, USA
| | - David C DeWitt
- Department of Pathology, Genentech, Inc., South San Francisco, CA, 94080, USA
| | - Joanna Lipka
- Department of Neuroscience, Genentech, Inc., South San Francisco, CA, 94080, USA
| | - Vsevolod Katritch
- Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, 90089, USA.
- Department of Chemistry, University of Southern California, Los Angeles, CA, 90089, USA.
| | - Bruce E Herring
- Department of Biological Sciences, Neurobiology Section, Dornsife College of Letters, Arts and Sciences, University of Southern California, Los Angeles, CA, 90089, USA.
- Neuroscience Graduate Program, University of Southern California, Los Angeles, CA, 90089, USA.
| |
|
11
|
Scott BM, Chen SK, Van Nynatten A, Liu J, Schott RK, Heon E, Peisajovich SG, Chang BSW. Scaling up Functional Analyses of the G Protein-Coupled Receptor Rhodopsin. J Mol Evol 2024; 92:61-71. [PMID: 38324225 DOI: 10.1007/s00239-024-10154-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Accepted: 12/22/2023] [Indexed: 02/08/2024]
Abstract
Eukaryotic cells use G protein-coupled receptors (GPCRs) to convert external stimuli into internal signals to elicit cellular responses. However, how mutations in GPCR-coding genes affect GPCR activation and downstream signaling pathways remains poorly understood. Approaches such as deep mutational scanning show promise in investigations of GPCRs, but a high-throughput method to measure rhodopsin activation has yet to be achieved. Here, we scale up a fluorescent reporter assay in budding yeast that we engineered to study rhodopsin's light-activated signal transduction. Using this approach, we measured the mutational effects of over 1200 individual human rhodopsin mutants, generated by low-frequency random mutagenesis of the GPCR rhodopsin (RHO) gene. Analysis of the data in the context of rhodopsin's three-dimensional structure reveals that transmembrane helices are generally less tolerant to mutations compared to flanking helices that face the lipid bilayer, which suggests that mutational tolerance is contingent on both the local environment surrounding specific residues and the specific position of these residues in the protein structure. Comparison of functional scores from our screen to clinically identified rhodopsin disease variants found many pathogenic mutants to be loss of function. Lastly, functional scores from our assay were consistent with a complex counterion mechanism involved in ligand-binding and rhodopsin activation. Our results demonstrate that deep mutational scanning is possible for rhodopsin activation and can be an effective method for revealing properties of mutational tolerance that may be generalizable to other transmembrane proteins.
Affiliation(s)
- Benjamin M Scott
- Department of Cell and Systems Biology, University of Toronto, Toronto, ON, Canada
| | - Steven K Chen
- Department of Cell and Systems Biology, University of Toronto, Toronto, ON, Canada
| | | | - Jing Liu
- Department of Cell and Systems Biology, University of Toronto, Toronto, ON, Canada
| | - Ryan K Schott
- Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, ON, Canada
- Department of Biology and Centre for Vision Research, York University, Toronto, ON, Canada
- Department of Vertebrate Zoology, National Museum of Natural History, Smithsonian Institution, Washington, DC, USA
| | - Elise Heon
- Department of Ophthalmology, Hospital for Sick Children, Toronto, ON, Canada
| | - Sergio G Peisajovich
- Department of Cell and Systems Biology, University of Toronto, Toronto, ON, Canada
| | - Belinda S W Chang
- Department of Cell and Systems Biology, University of Toronto, Toronto, ON, Canada.
- Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, ON, Canada.
- Centre for the Analysis of Genome Evolution and Function, University of Toronto, Toronto, ON, Canada.
| |
|
12
|
Yeow D, Rudaks LI, Siow SF, Davis RL, Kumar KR. Genetic Testing of Movements Disorders: A Review of Clinical Utility. Tremor Other Hyperkinet Mov (N Y) 2024; 14:2. [PMID: 38222898 PMCID: PMC10785957 DOI: 10.5334/tohm.835] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/29/2023] [Accepted: 12/04/2023] [Indexed: 01/16/2024] Open
Abstract
Currently, pathogenic variants in more than 500 different genes are known to cause various movement disorders. The increasing accessibility and falling cost of genetic testing have resulted in increasing clinical use of genetic testing for the diagnosis of movement disorders. However, the optimal use case(s) for genetic testing at a patient level remain ill-defined. Here, we review the utility of genetic testing in patients with movement disorders and also highlight current challenges and limitations that need to be considered when making decisions about genetic testing in clinical practice.
Highlights: The utility of genetic testing extends across multiple clinical and non-clinical domains. Here we review different aspects of the utility of genetic testing for movement disorders and the numerous associated challenges and limitations. These factors should be weighed on a case-by-case basis when requesting genetic tests in clinical practice.
Affiliation(s)
- Dennis Yeow
- Translational Neurogenomics Group, Neurology Department & Molecular Medicine Laboratory, Concord Repatriation General Hospital, Concord, NSW, Australia
- Concord Clinical School, Sydney Medical School, Faculty of Health & Medicine, University of Sydney, Concord, NSW, Australia
- Rare Disease Program, Garvan Institute of Medical Research, Darlinghurst, NSW, Australia
- Department of Neurology, Prince of Wales Hospital, Randwick, NSW, Australia
- Neuroscience Research Australia, Randwick, NSW, Australia
| | - Laura I. Rudaks
- Translational Neurogenomics Group, Neurology Department & Molecular Medicine Laboratory, Concord Repatriation General Hospital, Concord, NSW, Australia
- Concord Clinical School, Sydney Medical School, Faculty of Health & Medicine, University of Sydney, Concord, NSW, Australia
- Rare Disease Program, Garvan Institute of Medical Research, Darlinghurst, NSW, Australia
| | - Sue-Faye Siow
- Department of Clinical Genetics, Royal North Shore Hospital, St Leonards, NSW, Australia
| | - Ryan L. Davis
- Rare Disease Program, Garvan Institute of Medical Research, Darlinghurst, NSW, Australia
- Neurogenetics Research Group, Kolling Institute, School of Medical Sciences, Faculty of Medicine and Health, University of Sydney and Northern Sydney Local Health District, St Leonards, NSW, Australia
| | - Kishore R. Kumar
- Translational Neurogenomics Group, Neurology Department & Molecular Medicine Laboratory, Concord Repatriation General Hospital, Concord, NSW, Australia
- Concord Clinical School, Sydney Medical School, Faculty of Health & Medicine, University of Sydney, Concord, NSW, Australia
- Rare Disease Program, Garvan Institute of Medical Research, Darlinghurst, NSW, Australia
- School of Clinical Medicine, University of New South Wales, Sydney, NSW, Australia
| |
|
13
|
Feng Y, Yi J, Yang L, Wang Y, Wen J, Zhao W, Kim P, Zhou X. COV2Var, a function annotation database of SARS-CoV-2 genetic variation. Nucleic Acids Res 2024; 52:D701-D713. [PMID: 37897356 PMCID: PMC10767816 DOI: 10.1093/nar/gkad958] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Revised: 09/29/2023] [Accepted: 10/16/2023] [Indexed: 10/30/2023] Open
Abstract
The COVID-19 pandemic, caused by the coronavirus SARS-CoV-2, has resulted in the loss of millions of lives and severe global economic consequences. Every time SARS-CoV-2 replicates, the viruses acquire new mutations in their genomes. Mutations in SARS-CoV-2 genomes have led to increased transmissibility, severe disease outcomes, evasion of the immune response, changes in clinical manifestations and reduced efficacy of vaccines or treatments. To date, multiple resources provide lists of detected mutations without key functional annotations. There is a lack of research examining the relationship between mutations and various factors such as disease severity, pathogenicity, patient age, patient gender, cross-species transmission, viral immune escape, immune response level, viral transmission capability, viral evolution, host adaptability, viral protein structure, viral protein function, viral protein stability and concurrent mutations. A deep understanding of the relationship between mutation sites and these factors is crucial for advancing our knowledge of SARS-CoV-2 and for developing effective responses. To fill this gap, we built COV2Var, a function annotation database of SARS-CoV-2 genetic variation, available at http://biomedbdc.wchscu.cn/COV2Var/. COV2Var aims to identify common mutations in SARS-CoV-2 variants and assess their effects, providing a valuable resource for intensive functional annotations of common mutations among SARS-CoV-2 variants.
Affiliation(s)
- Yuzhou Feng
- Department of Laboratory Medicine and West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu 610041, China
- Med-X Center for Informatics, Sichuan University, Chengdu 610041, China
| | - Jiahao Yi
- School of Big Health, Guizhou Medical University, Guiyang 550025, China
| | - Lin Yang
- Department of Cardiology and Laboratory of Gene Therapy for Heart Diseases, State Key Laboratory of Biotherapy, West China Hospital, Sichuan University and Collaborative Innovation Center for Biotherapy, Chengdu 610041, China
| | - Yanfei Wang
- Center for Computational Systems Medicine, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
| | - Jianguo Wen
- Center for Computational Systems Medicine, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
| | - Weiling Zhao
- Center for Computational Systems Medicine, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
| | - Pora Kim
- Center for Computational Systems Medicine, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
| | - Xiaobo Zhou
- Center for Computational Systems Medicine, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- McGovern Medical School, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- School of Dentistry, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA
| |
|
14
|
Fowler DM, Rehm HL. Will variants of uncertain significance still exist in 2030? Am J Hum Genet 2024; 111:5-10. [PMID: 38086381 PMCID: PMC10806733 DOI: 10.1016/j.ajhg.2023.11.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2023] [Revised: 11/12/2023] [Accepted: 11/13/2023] [Indexed: 12/28/2023] Open
Abstract
In 2020, the National Human Genome Research Institute (NHGRI) made ten "bold predictions," including that "the clinical relevance of all encountered genomic variants will be readily predictable, rendering the diagnostic designation 'variant of uncertain significance (VUS)' obsolete." We discuss the prospects for this prediction, arguing that many, if not most, VUS in coding regions will be resolved by 2030. We outline a confluence of recent changes making this possible, especially advances in the standards for variant classification that better leverage diverse types of evidence, improvements in computational variant effect predictor performance, scalable multiplexed assays of variant effect capable of saturating the genome, and data-sharing efforts that will maximize the information gained from each new individual sequenced and variant interpreted. We suggest that clinicians and researchers can realize a future where VUSs have largely been eliminated, in line with the NHGRI's bold prediction. The length of time taken to reach this future, and thus whether we are able to achieve the goal of largely eliminating VUSs by 2030, is largely a consequence of the choices made now and in the next few years. We believe that investing in eliminating VUSs is worthwhile, since their predominance remains one of the biggest challenges to precision genomic medicine.
Affiliation(s)
- Douglas M Fowler
- Department of Genome Sciences, University of Washington, Seattle, WA, USA; Department of Bioengineering, University of Washington, Seattle, WA, USA; Brotman Baty Institute for Precision Medicine, Seattle, WA, USA.
| | - Heidi L Rehm
- Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
| |
|
15
|
Huang H, Hu C, Na J, Hart SN, Gnanaolivu RD, Abozaid M, Rao T, Tecleab YA, Pesaran T, Lyra PCM, Karam R, Yadav S, Domchek SM, de la Hoya M, Robson M, Mehine M, Bandlamudi C, Mandelker D, Monteiro ANA, Boddicker N, Chen W, Richardson ME, Couch FJ. Saturation genome editing-based functional evaluation and clinical classification of BRCA2 single nucleotide variants. bioRxiv 2023:2023.12.14.571597. [PMID: 38168194 PMCID: PMC10760149 DOI: 10.1101/2023.12.14.571597] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/05/2024]
Abstract
Germline BRCA2 loss-of-function (LOF) variants identified by clinical genetic testing predispose to breast, ovarian, prostate and pancreatic cancer. However, variants of uncertain significance (VUS) (n>4000) limit the clinical use of testing results. Thus, there is an urgent need for functional characterization and clinical classification of all BRCA2 variants. Here we report on comprehensive saturation genome editing-based functional characterization of 97% of all possible single nucleotide variants (SNVs) in the BRCA2 DNA Binding Domain hotspot for pathogenic missense variants that is encoded by exons 15 to 26. The assay was based on deep sequence analysis of surviving endogenously targeted haploid cells. A total of 7013 SNVs were characterized as functionally abnormal (n=955), intermediate/uncertain, or functionally normal (n=5224) based on 95% agreement with ClinVar known pathogenic and benign standards. Results were validated relative to batches of nonsense and synonymous variants and variants evaluated using a homology directed repair (HDR) functional assay. Breast cancer case-control association studies showed that pooled SNVs encoding functionally abnormal missense variants were associated with increased risk of breast cancer (odds ratio (OR) 3.89, 95% CI: 2.77-5.51). In addition, 86% of tumors associated with abnormal missense SNVs displayed loss of heterozygosity (LOH), whereas 26% of tumors with normal variants had LOH. The functional data were added to other sources of information in a ClinGen/ACMG/AMP-like model and 700 functionally abnormal SNVs, including 220 missense SNVs, were classified as pathogenic or likely pathogenic, while 4862 functionally normal SNVs, including 3084 missense SNVs, were classified as benign or likely benign. These classified variants can now be used for risk assessment and clinical care of variant carriers and the remaining functional scores can be used directly for clinical classification and interpretation of many additional variants.
Summary: Germline BRCA2 loss-of-function (LOF) variants identified by clinical genetic testing predispose to several types of cancer. However, variants of uncertain significance (VUS) limit the clinical use of testing results. Thus, there is an urgent need for functional characterization and clinical classification of all BRCA2 variants to facilitate current and future clinical management of individuals with these variants. Here we show the results from a saturation genome editing (SGE) and functional analysis of all possible single nucleotide variants (SNVs) from exons 15 to 26 that encode the BRCA2 DNA Binding Domain hotspot for pathogenic missense variants. The assay was based on deep sequence analysis of surviving endogenously targeted human haploid HAP1 cells. The assay was calibrated relative to ClinVar known pathogenic and benign missense standards and 95% prevalence thresholds for functionally abnormal and normal variants were identified. Thresholds were validated based on nonsense and synonymous variants. SNVs encoding functionally abnormal missense variants were associated with increased risks of breast and ovarian cancer. The functional assay results were integrated into a ClinGen/ACMG/AMP-like model for clinical classification of the majority of BRCA2 SNVs as pathogenic/likely pathogenic or benign/likely benign. The classified variants can be used for improved clinical management of variant carriers.
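The case-control result quoted above is reported as an odds ratio with a 95% confidence interval. As a rough illustration of how such a figure is commonly derived from a 2x2 carrier table, the sketch below computes an odds ratio with a Wald interval on the log scale. The abstract does not say which interval method the authors used, so the Wald approach is an assumption, and the function name and all counts are hypothetical and are not taken from the study.

```python
import math


def odds_ratio_wald_ci(case_carriers, case_noncarriers,
                       control_carriers, control_noncarriers, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    Hypothetical helper for illustration only. z=1.96 gives an
    approximate 95% interval.
    """
    odds_ratio = (case_carriers * control_noncarriers) / (
        case_noncarriers * control_carriers
    )
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(
        1 / case_carriers + 1 / case_noncarriers
        + 1 / control_carriers + 1 / control_noncarriers
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Made-up counts: 40 carriers among 10,000 cases, 10 among 10,000 controls.
or_value, ci_low, ci_high = odds_ratio_wald_ci(40, 9960, 10, 9990)
print(f"OR = {or_value:.2f}, 95% CI: {ci_low:.2f}-{ci_high:.2f}")
```

The interval is symmetric on the log scale, which is why the point estimate sits closer to the lower bound than to the upper bound, as it also does in the reported 3.89 (2.77-5.51).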
|
16
|
Sahu S, Sullivan T, Southon E, Caylor D, Geh J, Sharan SK. Protocol for the saturation and multiplexing of genetic variants using CRISPR-Cas9. STAR Protoc 2023; 4:102702. [PMID: 37948185 PMCID: PMC10658368 DOI: 10.1016/j.xpro.2023.102702] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2023] [Revised: 10/11/2023] [Accepted: 10/20/2023] [Indexed: 11/12/2023] Open
Abstract
Here, we present a multiplexed assay for variant effect protocol to assess the functional impact of all possible genetic variations within a particular genomic region. We describe steps for saturation genome editing by designing and cloning of single-guide RNA (sgRNA). We then detail steps for nucleofection of sgRNA, testing drug response on variants, and amplification of genomic DNA for next-generation sequencing. For complete details on the use and execution of this protocol, please refer to Sahu et al. [1].
Affiliation(s)
- Sounak Sahu
- Mouse Cancer Genetics Program, Centre for Cancer Research, National Cancer Institute, Frederick, MD 21702, USA.
| | - Teresa Sullivan
- Mouse Cancer Genetics Program, Centre for Cancer Research, National Cancer Institute, Frederick, MD 21702, USA
| | - Eileen Southon
- Mouse Cancer Genetics Program, Centre for Cancer Research, National Cancer Institute, Frederick, MD 21702, USA
| | - Dylan Caylor
- Mouse Cancer Genetics Program, Centre for Cancer Research, National Cancer Institute, Frederick, MD 21702, USA
| | - Josephine Geh
- Mouse Cancer Genetics Program, Centre for Cancer Research, National Cancer Institute, Frederick, MD 21702, USA
| | - Shyam K Sharan
- Mouse Cancer Genetics Program, Centre for Cancer Research, National Cancer Institute, Frederick, MD 21702, USA.
| |
|
17
|
Xie MJ, Cromie GA, Owens K, Timour MS, Tang M, Kutz JN, El-Hattab AW, McLaughlin RN, Dudley AM. Constructing and interpreting a large-scale variant effect map for an ultrarare disease gene: Comprehensive prediction of the functional impact of PSAT1 genotypes. PLoS Genet 2023; 19:e1010972. [PMID: 37812589 PMCID: PMC10561871 DOI: 10.1371/journal.pgen.1010972] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2023] [Accepted: 09/13/2023] [Indexed: 10/11/2023] Open
Abstract
Reduced activity of the enzymes encoded by PHGDH, PSAT1, and PSPH causes a set of ultrarare, autosomal recessive diseases known as serine biosynthesis defects. These diseases present in a broad phenotypic spectrum: at the severe end is Neu-Laxova syndrome, in the intermediate range are infantile serine biosynthesis defects with severe neurological manifestations and growth deficiency, and at the mild end is childhood disease with intellectual disability. However, L-serine supplementation, especially if started early, can ameliorate and in some cases even prevent symptoms. Therefore, knowledge of pathogenic variants can improve clinical outcomes. Here, we use a yeast-based assay to individually measure the functional impact of 1,914 SNV-accessible amino acid substitutions in PSAT. Results of our assay agree well with clinical interpretations and protein structure-function relationships, supporting the inclusion of our data as functional evidence as part of the ACMG variant interpretation guidelines. We use existing ClinVar variants, disease alleles reported in the literature and variants present as homozygotes in the primAD database to define assay ranges that could aid clinical variant interpretation for up to 98% of the tested variants. In addition to measuring the functional impact of individual variants in yeast haploid cells, we also assay pairwise combinations of PSAT1 alleles that recapitulate human genotypes, including compound heterozygotes, in yeast diploids. Results from our diploid assay successfully distinguish the genotypes of affected individuals from those of healthy carriers and agree well with disease severity. Finally, we present a linear model that uses individual allele measurements to predict the biallelic function of ~1.8 million allele combinations corresponding to potential human genotypes. 
Taken together, our work provides an example of how large-scale functional assays in model systems can be powerfully applied to the study of ultrarare diseases.
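The abstract above mentions 1,914 assayed substitutions, roughly 1.8 million allele combinations, and a linear model that predicts biallelic function from individual allele measurements. The sketch below illustrates both ideas under stated assumptions. Counting unordered pairs of the 1,914 alleles, homozygotes included, gives a number of the same order as the reported figure, though the abstract does not state that this is how the figure was obtained. The regression is a generic least-squares line fitted to synthetic data, not the authors' model; `fit_line`, the allele scores, and the coefficients are all hypothetical.

```python
import math

N_VARIANTS = 1914  # substitutions assayed, as stated in the abstract

# Unordered genotypes: heterozygous pairs plus homozygotes (assumed counting).
n_genotypes = math.comb(N_VARIANTS, 2) + N_VARIANTS
print(n_genotypes)  # 1832655


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical haploid scores for the two alleles of each genotype.
allele_pairs = [(1.0, 1.0), (1.0, 0.2), (0.2, 0.2), (0.6, 0.0), (0.0, 0.0)]
summed_scores = [a + b for a, b in allele_pairs]
# Synthetic, noise-free diploid readouts generated from a known line.
diploid_scores = [0.05 + 0.45 * s for s in summed_scores]

intercept, slope = fit_line(summed_scores, diploid_scores)
print(round(intercept, 3), round(slope, 3))
```

Using the summed allele score as the single predictor is the simplest additive choice; a real analysis might weight the stronger and weaker allele differently or include an interaction term.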
Affiliation(s)
- Michael J. Xie
- Pacific Northwest Research Institute, Seattle, Washington, United States of America
- Molecular Engineering Graduate Program, University of Washington, Seattle, Washington, United States of America
| | - Gareth A. Cromie
- Pacific Northwest Research Institute, Seattle, Washington, United States of America
| | - Katherine Owens
- Pacific Northwest Research Institute, Seattle, Washington, United States of America
- Department of Applied Mathematics, University of Washington, Seattle, Washington, United States of America
| | - Martin S. Timour
- Pacific Northwest Research Institute, Seattle, Washington, United States of America
| | - Michelle Tang
- Pacific Northwest Research Institute, Seattle, Washington, United States of America
| | - J. Nathan Kutz
- Department of Applied Mathematics, University of Washington, Seattle, Washington, United States of America
| | - Ayman W. El-Hattab
- Department of Clinical Sciences, College of Medicine, University of Sharjah, Sharjah, United Arab Emirates
| | | | - Aimée M. Dudley
- Pacific Northwest Research Institute, Seattle, Washington, United States of America
- Molecular Engineering Graduate Program, University of Washington, Seattle, Washington, United States of America
| |
|
18
|
Sinha S, Li J, Tam B, Wang SM. Classification of PTEN missense VUS through exascale simulations. Brief Bioinform 2023; 24:bbad361. [PMID: 37843401 DOI: 10.1093/bib/bbad361] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2023] [Revised: 09/08/2023] [Accepted: 09/20/2023] [Indexed: 10/17/2023] Open
Abstract
Phosphatase and tensin homolog (PTEN), a tumor suppressor with dual phosphatase properties, is a key factor in the PI3K/AKT signaling pathway. Pathogenic germline variation in PTEN can abrogate its ability to dephosphorylate, causing high cancer risk. For lack of functional evidence, numerous PTEN variants are classified as variants of uncertain significance (VUS). Using molecular dynamics (MD) simulations, we evaluated 147 PTEN missense VUS, sorting them into 66 deleterious and 81 tolerated variants. Using replica exchange molecular dynamics (REMD) simulations, we further assessed the variants situated in the catalytic core of PTEN's phosphatase domain and uncovered conformational alterations influencing the structural stability of the phosphatase domain. There was a high degree of agreement between our results and the variants classified by Variant Abundance by Massively Parallel Sequencing, saturation mutagenesis, multiplexed functional data and experimental assays. Our extensive analysis of PTEN missense VUS should benefit their clinical applications in PTEN-related cancer.
SIGNIFICANCE STATEMENT: Classification of PTEN variants affecting its lipid phosphatase activity is important for understanding the roles of PTEN variation in the pathogenesis of hereditary and sporadic malignancies. Of the 3000 variants identified in PTEN, 1296 (43%) were assigned as VUS. Here, we applied MD and REMD simulations to investigate the effects of PTEN missense VUS on the structural integrity of the PTEN phosphatase domain, which comprises the WPD, P and TI active sites. We classified a total of 147 missense VUS into 66 deleterious and 81 tolerated variants by referring to the control group comprising 54 pathogenic and 12 benign variants. The classification was largely concordant with those obtained by experimental approaches.
Affiliation(s)
- Siddharth Sinha
- Ministry of Education Frontiers Science Center for Precision Oncology, Cancer Centre and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau S.A.R, China
| | - Jiaheng Li
- Ministry of Education Frontiers Science Center for Precision Oncology, Cancer Centre and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau S.A.R, China
| | - Benjamin Tam
- Ministry of Education Frontiers Science Center for Precision Oncology, Cancer Centre and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau S.A.R, China
| | - San Ming Wang
- Ministry of Education Frontiers Science Center for Precision Oncology, Cancer Centre and Institute of Translational Medicine, Faculty of Health Sciences, University of Macau, Macau S.A.R, China
| |
|
19
|
Fowler DM, Adams DJ, Gloyn AL, Hahn WC, Marks DS, Muffley LA, Neal JT, Roth FP, Rubin AF, Starita LM, Hurles ME. An Atlas of Variant Effects to understand the genome at nucleotide resolution. Genome Biol 2023; 24:147. [PMID: 37394429 PMCID: PMC10316620 DOI: 10.1186/s13059-023-02986-x] [Citation(s) in RCA: 26] [Impact Index Per Article: 26.0] [Received: 06/02/2023] [Accepted: 06/13/2023] [Indexed: 07/04/2023] Open
Abstract
Sequencing has revealed hundreds of millions of human genetic variants, and continued efforts will only add to this variant avalanche. Insufficient information exists to interpret the effects of most variants, limiting opportunities for precision medicine and comprehension of genome function. A solution lies in experimental assessment of the functional effect of variants, which can reveal their biological and clinical impact. However, variant effect assays have generally been undertaken reactively for individual variants only after and, in most cases long after, their first observation. Now, multiplexed assays of variant effect can characterise massive numbers of variants simultaneously, yielding variant effect maps that reveal the function of every possible single nucleotide change in a gene or regulatory element. Generating maps for every protein encoding gene and regulatory element in the human genome would create an 'Atlas' of variant effect maps and transform our understanding of genetics and usher in a new era of nucleotide-resolution functional knowledge of the genome. An Atlas would reveal the fundamental biology of the human genome, inform human evolution, empower the development and use of therapeutics and maximize the utility of genomics for diagnosing and treating disease. The Atlas of Variant Effects Alliance is an international collaborative group comprising hundreds of researchers, technologists and clinicians dedicated to realising an Atlas of Variant Effects to help deliver on the promise of genomics.
Affiliation(s)
- Douglas M. Fowler
- Department of Genome Sciences, University of Washington, Seattle, WA USA
- Department of Bioengineering, University of Washington, Seattle, WA USA
- Brotman Baty Institute for Precision Medicine, Seattle, WA USA
| | | | - Anna L. Gloyn
- Department of Pediatrics & Department of Genetics, Division of Endocrinology, Stanford School of Medicine, Stanford University, Stanford, CA USA
| | - William C. Hahn
- Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA USA
- Broad Institute of MIT and Harvard, Cambridge, MA USA
| | - Debora S. Marks
- Broad Institute of MIT and Harvard, Cambridge, MA USA
- Department of Systems Biology, Harvard Medical School, Cambridge, USA
| | - Lara A. Muffley
- Department of Genome Sciences, University of Washington, Seattle, WA USA
| | - James T. Neal
- Broad Institute of MIT and Harvard, Cambridge, MA USA
- Novo Nordisk Foundation Center for Genomic Mechanisms of Disease at Broad Institute, Cambridge, MA USA
| | - Frederick P. Roth
- Donnelly Centre and Departments of Molecular Genetics and Computer Science, University of Toronto, Toronto, ON Canada
- Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, ON Canada
| | - Alan F. Rubin
- Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC Australia
- Department of Medical Biology, University of Melbourne, Melbourne, VIC Australia
| | - Lea M. Starita
- Department of Genome Sciences, University of Washington, Seattle, WA USA
- Department of Bioengineering, University of Washington, Seattle, WA USA
- Brotman Baty Institute for Precision Medicine, Seattle, WA USA
| | | |
|
20
|
Lin WS. Translating Genetic Discovery into a Mechanistic Understanding of Pediatric Movement Disorders: Lessons from Genetic Dystonias and Related Disorders. Advanced Genetics (Hoboken, N.J.) 2023; 4:2200018. [PMID: 37288166 PMCID: PMC10242408 DOI: 10.1002/ggn2.202200018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/30/2022] [Indexed: 06/09/2023]
Abstract
The era of next-generation sequencing has increased the pace of gene discovery in the field of pediatric movement disorders. Following the identification of novel disease-causing genes, several studies have aimed to link the molecular and clinical aspects of these disorders. This perspective presents the developing stories of several childhood-onset movement disorders, including paroxysmal kinesigenic dyskinesia, myoclonus-dystonia syndrome, and other monogenic dystonias. These stories illustrate how gene discovery helps focus the research efforts of scientists trying to understand the mechanisms of disease. The genetic diagnosis of these clinical syndromes also helps clarify the associated phenotypic spectra and aids the search for additional disease-causing genes. Collectively, the findings of previous studies have led to increased recognition of the role of the cerebellum in the physiology and pathophysiology of motor control, a common theme in many pediatric movement disorders. To fully exploit the genetic information garnered in the clinical and research arenas, it is crucial that corresponding multi-omics analyses and functional studies also be performed at scale. Hopefully, these integrated efforts will provide us with a more comprehensive understanding of the genetic and neurobiological bases of movement disorders in childhood.
Affiliation(s)
- Wei-Sheng Lin
- Department of Pediatrics, Taipei Veterans General Hospital, Taipei 11217, Taiwan
- School of Medicine, National Yang Ming Chiao Tung University, Taipei 112304, Taiwan
| |
|
21
|
Lo RS, Cromie GA, Tang M, Teng K, Owens K, Sirr A, Kutz JN, Morizono H, Caldovic L, Ah Mew N, Gropman A, Dudley AM. The functional impact of 1,570 individual amino acid substitutions in human OTC. Am J Hum Genet 2023; 110:863-879. [PMID: 37146589 PMCID: PMC10183466 DOI: 10.1016/j.ajhg.2023.03.019] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/15/2022] [Accepted: 03/30/2023] [Indexed: 05/07/2023] Open
Abstract
Deleterious mutations in the X-linked gene encoding ornithine transcarbamylase (OTC) cause the most common urea cycle disorder, OTC deficiency. This rare but highly actionable disease can present with severe neonatal onset in males or with later onset in either sex. Individuals with neonatal onset appear normal at birth but rapidly develop hyperammonemia, which can progress to cerebral edema, coma, and death, outcomes ameliorated by rapid diagnosis and treatment. Here, we develop a high-throughput functional assay for human OTC and individually measure the impact of 1,570 variants, 84% of all SNV-accessible missense mutations. Comparison to existing clinical significance calls demonstrated that our assay distinguishes known benign from pathogenic variants and variants with neonatal onset from late-onset disease presentation. This functional stratification allowed us to identify score ranges corresponding to clinically relevant levels of impairment of OTC activity. Examining the results of our assay in the context of protein structure further allowed us to identify a 13 amino acid domain, the SMG loop, whose function appears to be required in human cells but not in yeast. Finally, inclusion of our data as PS3 evidence under the current ACMG guidelines, in a pilot reclassification of 34 variants with complete loss of activity, would change the classification of 22 from variants of unknown significance to clinically actionable likely pathogenic variants. These results illustrate how large-scale functional assays are especially powerful when applied to rare genetic diseases.
Affiliation(s)
- Russell S Lo
- Pacific Northwest Research Institute, Seattle, WA, USA
| | | | - Michelle Tang
- Pacific Northwest Research Institute, Seattle, WA, USA
| | - Kevin Teng
- Pacific Northwest Research Institute, Seattle, WA, USA
| | - Katherine Owens
- Pacific Northwest Research Institute, Seattle, WA, USA; Department of Applied Mathematics, University of Washington, Seattle, WA, USA
| | - Amy Sirr
- Pacific Northwest Research Institute, Seattle, WA, USA
| | - J Nathan Kutz
- Department of Applied Mathematics, University of Washington, Seattle, WA, USA
| | - Hiroki Morizono
- Center for Genetic Medicine Research, Children's National Research Institute, Children's National Hospital, Washington, DC, USA; Department of Genomics and Precision Medicine, School of Medicine and Health Sciences, The George Washington University, Washington, DC, USA
| | - Ljubica Caldovic
- Center for Genetic Medicine Research, Children's National Research Institute, Children's National Hospital, Washington, DC, USA; Department of Genomics and Precision Medicine, School of Medicine and Health Sciences, The George Washington University, Washington, DC, USA
| | - Nicholas Ah Mew
- Center for Genetic Medicine Research, Children's National Research Institute, Children's National Hospital, Washington, DC, USA; Department of Genomics and Precision Medicine, School of Medicine and Health Sciences, The George Washington University, Washington, DC, USA
| | - Andrea Gropman
- Center for Genetic Medicine Research, Children's National Research Institute, Children's National Hospital, Washington, DC, USA; Department of Genomics and Precision Medicine, School of Medicine and Health Sciences, The George Washington University, Washington, DC, USA; Department of Neurology, Division of Neurogenetics and Neurodevelopmental Disabilities, Children's National Hospital, Washington, DC, USA; Center for Neuroscience Research, Children's National Research Institute, Children's National Hospital, Washington, DC, USA
| | | |
|
22
|
Muhammad A, Calandranis ME, Li B, Yang T, Blackwell DJ, Harvey ML, Smith JE, Chew AE, Capra JA, Matreyek KA, Fowler DM, Roden DM, Glazer AM. High-throughput functional mapping of variants in an arrhythmia gene, KCNE1, reveals novel biology. bioRxiv 2023:2023.04.28.538612. [PMID: 37162834 PMCID: PMC10168370 DOI: 10.1101/2023.04.28.538612] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 05/11/2023]
Abstract
Background: KCNE1 encodes a 129-residue cardiac potassium channel (IKs) subunit. KCNE1 variants are associated with long QT syndrome and atrial fibrillation. However, most variants have insufficient evidence of clinical consequences and thus limited clinical utility. Results: Here, we demonstrate the power of variant effect mapping, which couples saturation mutagenesis with high-throughput sequencing, to ascertain the function of thousands of protein coding KCNE1 variants. We comprehensively assayed KCNE1 variant cell surface expression (2,554/2,709 possible single amino acid variants) and function (2,539 variants). We identified 470 loss-of-surface expression and 588 loss-of-function variants. Out of the 588 loss-of-function variants, only 155 had low cell surface expression. The latter half of the protein is dispensable for protein trafficking but essential for channel function. 22 of the 30 KCNE1 residues (73%) highly intolerant of variation were in predicted close contact with binding partners KCNQ1 or calmodulin. Our data were highly concordant with gold standard electrophysiological data (ρ = -0.65), population and patient cohorts (32/38 concordant variants), and computational metrics (ρ = -0.55). Our data provide moderate-strength evidence for the ACMG/AMP functional criteria for benign and pathogenic variants. Conclusions: Comprehensive variant effect maps of KCNE1 can both provide insight into IKs channel biology and help reclassify variants of uncertain significance.
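The proportions quoted in this abstract can be rechecked with a few lines of arithmetic. The sketch below only restates counts given in the abstract; the variable names and the `percent` helper are illustrative and do not come from the authors' analysis code.

```python
# Counts copied from the abstract; the variable names are illustrative only.
possible_variants = 2709      # possible single amino acid variants
scored_surface = 2554         # variants with a cell surface expression score
loss_of_function = 588        # loss-of-function variants
lof_low_surface = 155         # loss-of-function variants with low surface expression
intolerant_residues = 30      # residues highly intolerant of variation
intolerant_in_contact = 22    # of those, in predicted contact with KCNQ1 or calmodulin


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


surface_coverage = percent(scored_surface, possible_variants)
lof_with_low_surface = percent(lof_low_surface, loss_of_function)
contact_fraction = percent(intolerant_in_contact, intolerant_residues)

print(surface_coverage, lof_with_low_surface, contact_fraction)
```

Under these counts, surface expression was scored for roughly 94% of possible variants, and only about a quarter of the loss-of-function variants also showed low surface expression, which is the basis for the statement that most loss of function is not explained by trafficking.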
Affiliation(s)
- Ayesha Muhammad
- Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Medical Scientist Training Program, Vanderbilt University, Nashville, TN 37232, USA
| | - Maria E. Calandranis
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, USA
| | - Bian Li
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, USA
| | - Tao Yang
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, USA
| | - Daniel J. Blackwell
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, USA
| | - M. Lorena Harvey
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, USA
| | - Jeremy E. Smith
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, USA
| | - Ashli E. Chew
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, USA
| | - John A. Capra
- Bakar Computational Health Sciences Institute and Department of Epidemiology and Biostatistics, University of California, San Francisco, CA 94143, USA
| | - Kenneth A. Matreyek
- Department of Pathology, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA
| | - Douglas M. Fowler
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
| | - Dan M. Roden
- Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Department of Pharmacology, Vanderbilt University Medical Center, Nashville, TN 37232, USA
| | - Andrew M. Glazer
- Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN 37232, USA
| |
|
23
|
Hoskins I, Sun S, Cote A, Roth FP, Cenik C. satmut_utils: a simulation and variant calling package for multiplexed assays of variant effect. Genome Biol 2023; 24:82. [PMID: 37081510 PMCID: PMC10116734 DOI: 10.1186/s13059-023-02922-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/26/2022] [Accepted: 04/04/2023] [Indexed: 04/22/2023] Open
Abstract
The impact of millions of individual genetic variants on molecular phenotypes in coding sequences remains unknown. Multiplexed assays of variant effect (MAVEs) are scalable methods to annotate relevant variants, but existing software lacks standardization, requires cumbersome configuration, and does not scale to large targets. We present satmut_utils as a flexible solution for simulation and variant quantification. We then benchmark MAVE software using simulated and real MAVE data. We finally determine mRNA abundance for thousands of cystathionine beta-synthase variants using two experimental methods. The satmut_utils package enables high-performance analysis of MAVEs and reveals the capability of variants to alter mRNA abundance.
Affiliation(s)
- Ian Hoskins
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, 78712, USA
| | - Song Sun
- The Donnelly Centre and Departments of Molecular Genetics and Computer Science, University of Toronto, Toronto, ON, Canada
- Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
| | - Atina Cote
- The Donnelly Centre and Departments of Molecular Genetics and Computer Science, University of Toronto, Toronto, ON, Canada
- Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
| | - Frederick P Roth
- The Donnelly Centre and Departments of Molecular Genetics and Computer Science, University of Toronto, Toronto, ON, Canada
- Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
| | - Can Cenik
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, 78712, USA.
| |
|
24
|
Derbel H, Giacoletto CJ, Benjamin R, Chen G, Schiller MR, Liu Q. Accurate Prediction of Transcriptional Activity of Single Missense Variants in HIV Tat with Deep Learning. Int J Mol Sci 2023; 24:6138. [PMID: 37047108 PMCID: PMC10093788 DOI: 10.3390/ijms24076138] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/27/2023] [Revised: 03/23/2023] [Accepted: 03/23/2023] [Indexed: 04/14/2023] Open
Abstract
Tat is an essential gene for increasing the transcription of all HIV genes, and affects HIV replication, HIV exit from latency, and AIDS progression. The Tat gene frequently mutates in vivo and produces variants with diverse activities, contributing to HIV viral heterogeneity as well as drug-resistant clones. Thus, identifying the transcriptional activities of Tat variants will help to better understand AIDS pathology and treatment. We recently reported the missense mutation landscape of all single amino acid Tat variants. In these experiments, a fraction of double missense alleles exhibited intragenic epistasis. However, it is too time-consuming and costly to determine the effect of the variants for all double mutant alleles through experiments. Therefore, we propose a combined GigaAssay/deep learning approach. As a first step to determine activity landscapes for complex variants, we evaluated a deep learning framework using previously reported GigaAssay experiments to predict how transcription activity is affected by Tat variants with single missense substitutions. Our approach achieved a Pearson correlation coefficient of 0.94 between predicted and experimental activities. This hybrid approach can be extended to more complex Tat alleles for a better understanding of the genetic control of HIV genome transcription.
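The headline metric here is a Pearson correlation between predicted and measured activities. As a reminder of what that number measures, the following is a minimal standard-library sketch of the coefficient; the function name and the toy inputs are assumptions for illustration and are unrelated to the GigaAssay data or the authors' code.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Toy values only: hypothetical predicted vs. measured activities.
predicted = [0.10, 0.40, 0.55, 0.80, 0.95]
measured = [0.05, 0.35, 0.60, 0.85, 0.90]
r = pearson_r(predicted, measured)
```

A value near 1 indicates that predictions rise and fall linearly with the measurements; it does not by itself show that the predicted magnitudes are calibrated.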
Affiliation(s)
- Houssemeddine Derbel
- Nevada Institute of Personalized Medicine, University of Nevada Las Vegas, 4505 S Maryland Pkwy, Las Vegas, NV 89154, USA
| | - Christopher J. Giacoletto
- School of Life Sciences, College of Sciences, University of Nevada Las Vegas, 4505 S Maryland Pkwy, Las Vegas, NV 89154, USA
| | - Ronald Benjamin
- Nevada Institute of Personalized Medicine, University of Nevada Las Vegas, 4505 S Maryland Pkwy, Las Vegas, NV 89154, USA
- School of Life Sciences, College of Sciences, University of Nevada Las Vegas, 4505 S Maryland Pkwy, Las Vegas, NV 89154, USA
| | - Gordon Chen
- Nevada Institute of Personalized Medicine, University of Nevada Las Vegas, 4505 S Maryland Pkwy, Las Vegas, NV 89154, USA
| | - Martin R. Schiller
- Nevada Institute of Personalized Medicine, University of Nevada Las Vegas, 4505 S Maryland Pkwy, Las Vegas, NV 89154, USA
- School of Life Sciences, College of Sciences, University of Nevada Las Vegas, 4505 S Maryland Pkwy, Las Vegas, NV 89154, USA
| | - Qian Liu
- Nevada Institute of Personalized Medicine, University of Nevada Las Vegas, 4505 S Maryland Pkwy, Las Vegas, NV 89154, USA
- School of Life Sciences, College of Sciences, University of Nevada Las Vegas, 4505 S Maryland Pkwy, Las Vegas, NV 89154, USA
| |
|
25
|
Kurant DE. Opportunities and Challenges with Artificial Intelligence in Genomics. Clin Lab Med 2023; 43:87-97. [PMID: 36764810 DOI: 10.1016/j.cll.2022.09.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/15/2022]
Abstract
The development of artificial intelligence and machine learning algorithms may allow for advances in patient care. There are existing and potential applications in cancer diagnosis and monitoring, identification of at-risk groups of individuals, classification of genetic variants, and even prediction of patient ancestry. This article provides an overview of some current and future applications of artificial intelligence in genomic medicine, in addition to discussing challenges and considerations when bringing these tools into clinical practice.
Affiliation(s)
- Danielle E Kurant
- Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA.
| |
|
26
|
Xie MJ, Cromie GA, Owens K, Timour MS, Tang M, Kutz JN, El-Hattab AW, McLaughlin RN, Dudley AM. Predicting the functional effect of compound heterozygous genotypes from large scale variant effect maps. bioRxiv 2023:2023.01.11.523651. [PMID: 36711904 PMCID: PMC9882023 DOI: 10.1101/2023.01.11.523651] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/19/2023]
Abstract
Background: Pathogenic variants in PHGDH, PSAT1, and PSPH cause a set of rare, autosomal recessive diseases known as serine biosynthesis defects. Serine biosynthesis defects present in a broad phenotypic spectrum that includes, at the severe end, Neu-Laxova syndrome, a lethal multiple congenital anomaly disease, intermediately in the form of infantile serine biosynthesis defects with severe neurological manifestations and growth deficiency, and at the mild end, as childhood disease with intellectual disability. However, because L-serine supplementation, especially if started early, can ameliorate and in some cases even prevent symptoms, knowledge of pathogenic variants is highly actionable. Methods: Recently, our laboratory established a yeast-based assay for human PSAT1 function. We have now applied it at scale to assay the functional impact of 1,914 SNV-accessible amino acid substitutions. In addition to assaying the functional impact of individual variants in yeast haploid cells, we can assay pairwise combinations of PSAT1 alleles that recapitulate human genotypes, including compound heterozygotes, in yeast diploids. Results: Results of our assays of individual variants (in haploid yeast cells) agree well with clinical interpretations and protein structure-function relationships, supporting the use of our data as functional evidence under the ACMG interpretation guidelines. Results from our diploid assay successfully distinguish patient genotypes from those of healthy carriers and agree well with disease severity. Finally, we present a linear model that uses individual allele measurements (in haploid yeast cells) to accurately predict the biallelic function (in diploid yeast cells) of ~1.8 million allele combinations corresponding to potential human genotypes. Conclusions: Taken together, our work provides an example of how large-scale functional assays in model systems can be powerfully applied to the study of a rare disease.
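The figure of about 1.8 million allele combinations is consistent with counting unordered pairs of the 1,914 assayed substitutions. The abstract does not state how the number was derived, so the sketch below is one plausible reading only; the function name is illustrative, and whether homozygous pairs or the wild-type allele were included is an assumption.

```python
from math import comb

n_variants = 1914  # SNV-accessible amino acid substitutions assayed (from the abstract)


def unordered_pairs_with_repeats(n: int) -> int:
    """Number of unordered allele pairs drawn from n alleles, homozygous pairs included."""
    return comb(n, 2) + n


# Assumption 1: pairs are formed among the assayed variant alleles only.
variant_only = unordered_pairs_with_repeats(n_variants)

# Assumption 2: the wild-type allele is also available as a partner.
with_wild_type = unordered_pairs_with_repeats(n_variants + 1)

print(variant_only, with_wild_type)
```

Both readings give a little over 1.83 million genotypes, so the order of magnitude quoted in the abstract does not depend on which assumption is made.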
|
27
|
Wei H, Li X. Deep mutational scanning: A versatile tool in systematically mapping genotypes to phenotypes. Front Genet 2023; 14:1087267. [PMID: 36713072 PMCID: PMC9878224 DOI: 10.3389/fgene.2023.1087267] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 11/02/2022] [Accepted: 01/02/2023] [Indexed: 01/13/2023] Open
Abstract
Unveiling how genetic variations lead to phenotypic variations is one of the key questions in evolutionary biology, genetics, and biomedical research. Deep mutational scanning (DMS) technology has allowed the mapping of tens of thousands of genetic variations to phenotypic variations efficiently and economically. Since its first systematic introduction about a decade ago, we have witnessed the use of deep mutational scanning in many research areas leading to scientific breakthroughs. Also, the methods in each step of deep mutational scanning have become much more versatile thanks to the oligo-synthesizing technology, high-throughput phenotyping methods and deep sequencing technology. However, each specific possible step of deep mutational scanning has its pros and cons, and some limitations still await further technological development. Here, we discuss recent scientific accomplishments achieved through the deep mutational scanning and describe widely used methods in each step of deep mutational scanning. We also compare these different methods and analyze their advantages and disadvantages, providing insight into how to design a deep mutational scanning study that best suits the aims of the readers' projects.
Affiliation(s)
- Huijin Wei
- Zhejiang University-University of Edinburgh Institute, Zhejiang University, Haining, Zhejiang, China
| | - Xianghua Li
- Zhejiang University-University of Edinburgh Institute, Zhejiang University, Haining, Zhejiang, China; Deanery of Biomedical Sciences, University of Edinburgh, Edinburgh, United Kingdom; The Second Affiliated Hospital of Zhejiang University, Hangzhou, Zhejiang, China; Biomedical and Health Translational Centre of Zhejiang Province, Haining, Zhejiang, China
| |
|
28
|
Landau J, Tsaban L, Yaacov A, Ben Cohen G, Rosenberg S. Shared Cancer Dataset Analysis Identifies and Predicts the Quantitative Effects of Pan-Cancer Somatic Driver Variants. Cancer Res 2023; 83:74-88. [PMID: 36264175 DOI: 10.1158/0008-5472.can-22-1038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2022] [Revised: 08/02/2022] [Accepted: 10/18/2022] [Indexed: 02/03/2023]
Abstract
Driver mutations endow tumors with selective advantages and produce an array of pathogenic effects. Determining the function of somatic variants is important for understanding cancer biology and identifying optimal therapies. Here, we compiled a shared dataset from several cancer genomic databases. Two measures were applied to 535 cancer genes based on observed and expected frequencies of driver variants as derived from cancer-specific rates of somatic mutagenesis. The first measure comprised a binary classifier based on a binomial test; the second was tumor variant amplitude (TVA), a continuous measure representing the selective advantage of individual variants. TVA outperformed all other computational tools in terms of its correlation with experimentally derived functional scores of cancer mutations. TVA also highly correlated with drug response, overall survival, and other clinical implications in relevant cancer genes. This study demonstrates how a selective advantage measure based on a large cancer dataset significantly impacts our understanding of the spectral effect of driver variants in cancer. The impact of this information will increase as cancer treatment becomes more precise and personalized to tumor-specific mutations. SIGNIFICANCE: A new selective advantage estimation assists in oncogenic driver identification and relative effect measurements, enabling better prognostication, therapy selection, and prioritization.
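The abstract describes a binary classifier built on a binomial test that compares observed with expected variant frequencies. The sketch below shows the general shape of such a test using only the standard library. It is not the authors' implementation: the function names, the one-sided alternative, the 0.05 threshold, and the example counts are all assumptions made for illustration.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p): the one-sided p-value for enrichment."""
    if not 0 <= k <= n:
        raise ValueError("k must lie between 0 and n")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def is_enriched(observed: int, n_samples: int, expected_rate: float,
                alpha: float = 0.05) -> bool:
    """Call a variant enriched if it is seen more often than the background rate explains."""
    return binomial_upper_tail(observed, n_samples, expected_rate) < alpha


# Hypothetical numbers: a variant seen in 8 of 1,000 tumors when the
# background mutation rate predicts about 1 in 1,000.
p_value = binomial_upper_tail(8, 1000, 0.001)
call = is_enriched(8, 1000, 0.001)
```

In a genome-wide setting the threshold would normally be corrected for multiple testing; that step is omitted here to keep the sketch short.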
Affiliation(s)
- Jakob Landau
- Gaffin Center for Neuro-Oncology, Sharett Institute for Oncology, Hadassah Medical Center and Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel; The Wohl Institute for Translational Medicine, Hadassah Medical Center and Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel
| | - Linoy Tsaban
- Gaffin Center for Neuro-Oncology, Sharett Institute for Oncology, Hadassah Medical Center and Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel; The Wohl Institute for Translational Medicine, Hadassah Medical Center and Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel
| | - Adar Yaacov
- Gaffin Center for Neuro-Oncology, Sharett Institute for Oncology, Hadassah Medical Center and Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel; The Wohl Institute for Translational Medicine, Hadassah Medical Center and Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel; Department of Microbiology and Molecular Genetics, IMRIC, Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel
| | - Gil Ben Cohen
- Gaffin Center for Neuro-Oncology, Sharett Institute for Oncology, Hadassah Medical Center and Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel; The Wohl Institute for Translational Medicine, Hadassah Medical Center and Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel
| | - Shai Rosenberg
- Gaffin Center for Neuro-Oncology, Sharett Institute for Oncology, Hadassah Medical Center and Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel; The Wohl Institute for Translational Medicine, Hadassah Medical Center and Faculty of Medicine, Hebrew University of Jerusalem, Jerusalem, Israel
| |
|
29
|
Sora V, Laspiur AO, Degn K, Arnaudi M, Utichi M, Beltrame L, De Menezes D, Orlandi M, Stoltze UK, Rigina O, Sackett PW, Wadt K, Schmiegelow K, Tiberti M, Papaleo E. RosettaDDGPrediction for high-throughput mutational scans: From stability to binding. Protein Sci 2023; 32:e4527. [PMID: 36461907 PMCID: PMC9795540 DOI: 10.1002/pro.4527] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 09/02/2022] [Revised: 11/25/2022] [Accepted: 11/25/2022] [Indexed: 12/05/2022]
Abstract
Reliable prediction of free energy changes upon amino acid substitutions (ΔΔGs) is crucial to investigate their impact on protein stability and protein-protein interaction. Advances in experimental mutational scans allow high-throughput studies thanks to multiplex techniques. On the other hand, genomics initiatives provide a large amount of data on disease-related variants that can benefit from analyses with structure-based methods. Therefore, the computational field should keep the same pace and provide new tools for fast and accurate high-throughput ΔΔG calculations. In this context, the Rosetta modeling suite implements effective approaches to predict folding/unfolding ΔΔGs in a protein monomer upon amino acid substitutions and calculate the changes in binding free energy in protein complexes. However, their application can be challenging to users without extensive experience with Rosetta. Furthermore, Rosetta protocols for ΔΔG prediction are designed considering one variant at a time, making the setup of high-throughput screenings cumbersome. For these reasons, we devised RosettaDDGPrediction, a customizable Python wrapper designed to run free energy calculations on a set of amino acid substitutions using Rosetta protocols with little intervention from the user. Moreover, RosettaDDGPrediction assists with checking completed runs and aggregates raw data for multiple variants, as well as generates publication-ready graphics. We showed the potential of the tool in four case studies, including variants of uncertain significance in childhood cancer, proteins with known experimental unfolding ΔΔGs values, interactions between target proteins and disordered motifs, and phosphomimetics. RosettaDDGPrediction is available, free of charge and under GNU General Public License v3.0, at https://github.com/ELELAB/RosettaDDGPrediction.
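For readers less familiar with the notation, the quantity predicted for each substitution can be written as a difference of free energies between mutant and wild type. The subscripts below are labels chosen for this note, and the sign convention shown (mutant minus wild type) is a common one but is an assumption here, since the abstract does not spell it out.

```latex
% Stability: change in folding free energy upon substitution
\Delta\Delta G_{\text{fold}} = \Delta G_{\text{fold}}^{\text{mut}} - \Delta G_{\text{fold}}^{\text{wt}}

% Binding: change in binding free energy of a complex upon substitution
\Delta\Delta G_{\text{bind}} = \Delta G_{\text{bind}}^{\text{mut}} - \Delta G_{\text{bind}}^{\text{wt}}
```

Under this convention a positive value means the substitution makes the folded state or the complex less favorable than in the wild type.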
Affiliation(s)
- Valentina Sora
- Cancer Structural Biology, Danish Cancer Society Research Center, Copenhagen, Denmark
- Cancer Systems Biology, Section for Bioinformatics, Department of Health and Technology, Technical University of Denmark, Lyngby, Denmark
| | - Adrian Otamendi Laspiur
- Cancer Systems Biology, Section for Bioinformatics, Department of Health and Technology, Technical University of Denmark, Lyngby, Denmark
| | - Kristine Degn
- Cancer Systems Biology, Section for Bioinformatics, Department of Health and Technology, Technical University of Denmark, Lyngby, Denmark
| | - Matteo Arnaudi
- Cancer Structural Biology, Danish Cancer Society Research Center, Copenhagen, Denmark
- Cancer Systems Biology, Section for Bioinformatics, Department of Health and Technology, Technical University of Denmark, Lyngby, Denmark
| | - Mattia Utichi
- Cancer Structural Biology, Danish Cancer Society Research Center, Copenhagen, Denmark
- Cancer Systems Biology, Section for Bioinformatics, Department of Health and Technology, Technical University of Denmark, Lyngby, Denmark
| | - Ludovica Beltrame
- Cancer Structural Biology, Danish Cancer Society Research Center, Copenhagen, Denmark
- Cancer Systems Biology, Section for Bioinformatics, Department of Health and Technology, Technical University of Denmark, Lyngby, Denmark
| | - Dayana De Menezes
- Cancer Systems Biology, Section for Bioinformatics, Department of Health and Technology, Technical University of Denmark, Lyngby, Denmark
| | - Matteo Orlandi
- Cancer Systems Biology, Section for Bioinformatics, Department of Health and Technology, Technical University of Denmark, Lyngby, Denmark
| | - Ulrik Kristoffer Stoltze
- Department of Clinical Genetics, Copenhagen University Hospital Rigshospitalet, Copenhagen, Denmark
- Department of Pediatrics and Adolescent Medicine, University Hospital Rigshospitalet, Copenhagen, Denmark
- Institute of Clinical Medicine, Faculty of Medicine, University of Copenhagen, Copenhagen, Denmark
| | - Olga Rigina
- Cancer Systems Biology, Section for Bioinformatics, Department of Health and Technology, Technical University of Denmark, Lyngby, Denmark
| | - Peter Wad Sackett
- Cancer Systems Biology, Section for Bioinformatics, Department of Health and Technology, Technical University of Denmark, Lyngby, Denmark
| | - Karin Wadt
- Department of Clinical Genetics, Copenhagen University Hospital Rigshospitalet, Copenhagen, Denmark
- Institute of Clinical Medicine, Faculty of Medicine, University of Copenhagen, Copenhagen, Denmark
| | - Kjeld Schmiegelow
- Department of Pediatrics and Adolescent Medicine, University Hospital Rigshospitalet, Copenhagen, Denmark
- Institute of Clinical Medicine, Faculty of Medicine, University of Copenhagen, Copenhagen, Denmark
| | - Matteo Tiberti
- Cancer Structural Biology, Danish Cancer Society Research Center, Copenhagen, Denmark
| | - Elena Papaleo
- Cancer Structural Biology, Danish Cancer Society Research Center, Copenhagen, Denmark
- Cancer Systems Biology, Section for Bioinformatics, Department of Health and Technology, Technical University of Denmark, Lyngby, Denmark
| |
|
30
|
Llargués-Sistac G, Bonjoch L, Castellvi-Bel S. HAP1, a new revolutionary cell model for gene editing using CRISPR-Cas9. Front Cell Dev Biol 2023; 11:1111488. [PMID: 36936678 PMCID: PMC10020200 DOI: 10.3389/fcell.2023.1111488] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 11/29/2022] [Accepted: 02/22/2023] [Indexed: 03/06/2023]
Abstract
The use of next-generation sequencing (NGS) technologies has been instrumental in the characterization of the mutational landscape of complex human diseases like cancer. Despite the enormous rise in the identification of disease candidate genetic variants, their functional consequences have yet to be fully elucidated before they can clearly inform patient care. Haploid human cell models have become the tool of choice for functional gene studies, since they only contain one copy of the genome and can therefore show the unmasked phenotype of genetic variants. Over the past few years, the human near-haploid cell line HAP1 has become established as one of the preferred cell line models for functional genetic studies. Its rapid turnover, coupled with the fact that only one allele needs to be modified in order to express the desired phenotype, has made this human cell line a valuable tool for gene editing by CRISPR-Cas9 technologies. This review examines the recent uses of the HAP1 cell line model in functional genetic studies and high-throughput genetic screens using the CRISPR-Cas9 system. It covers its use in developing new and relevant disease models to further elucidate gene function and create new ways to understand the genetic basis of human diseases. We will cover the advantages and potential of CRISPR-Cas9 technology in HAP1 cells for easily and efficiently studying gene function and interpreting human single-nucleotide genetic variants of unknown significance identified through NGS technologies, and its implications for changes in clinical practice and patient care.
|
31
|
Fu Y, Bedő J, Papenfuss AT, Rubin AF. Integrating deep mutational scanning and low-throughput mutagenesis data to predict the impact of amino acid variants. Gigascience 2022; 12:giad073. [PMID: 37721410 PMCID: PMC10506130 DOI: 10.1093/gigascience/giad073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2023] [Revised: 07/02/2023] [Accepted: 08/23/2023] [Indexed: 09/19/2023]
Abstract
BACKGROUND Evaluating the impact of amino acid variants has been a critical challenge for studying protein function and interpreting genomic data. High-throughput experimental methods like deep mutational scanning (DMS) can measure the effect of large numbers of variants in a target protein, but because DMS studies have not been performed on all proteins, researchers also build computational predictors trained on DMS data to estimate variant impacts. RESULTS In this study, we extended a linear regression-based predictor to explore whether incorporating data from alanine scanning (AS), a widely used low-throughput mutagenesis method, would improve prediction results. To evaluate our model, we collected 146 AS datasets, mapping to 54 DMS datasets across 22 distinct proteins. CONCLUSIONS We show that improved model performance depends on the compatibility of the DMS and AS assays, and the scale of improvement is closely related to the correlation between DMS and AS results.
Affiliation(s)
- Yunfan Fu
- The Walter and Eliza Hall Institute of Medical Research, Bioinformatics Division, 1G Royal Pde, Parkville, Victoria 3052, Australia
- The University of Melbourne, Department of Medical Biology, Parkville, Victoria 3010, Australia
| | - Justin Bedő
- The Walter and Eliza Hall Institute of Medical Research, Bioinformatics Division, 1G Royal Pde, Parkville, Victoria 3052, Australia
- The University of Melbourne, Department of Medical Biology, Parkville, Victoria 3010, Australia
| | - Anthony T Papenfuss
- The Walter and Eliza Hall Institute of Medical Research, Bioinformatics Division, 1G Royal Pde, Parkville, Victoria 3052, Australia
- The University of Melbourne, Department of Medical Biology, Parkville, Victoria 3010, Australia
- Peter MacCallum Cancer Centre, Melbourne, Victoria 3000, Australia
| | - Alan F Rubin
- The Walter and Eliza Hall Institute of Medical Research, Bioinformatics Division, 1G Royal Pde, Parkville, Victoria 3052, Australia
- The University of Melbourne, Department of Medical Biology, Parkville, Victoria 3010, Australia
| |
|
32
|
Scott A, Hernandez F, Chamberlin A, Smith C, Karam R, Kitzman JO. Saturation-scale functional evidence supports clinical variant interpretation in Lynch syndrome. Genome Biol 2022; 23:266. [PMID: 36550560 PMCID: PMC9773515 DOI: 10.1186/s13059-022-02839-z] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 08/15/2022] [Accepted: 12/13/2022] [Indexed: 12/24/2022]
Abstract
BACKGROUND Lynch syndrome (LS) is a cancer predisposition syndrome affecting more than 1 in every 300 individuals worldwide. Clinical genetic testing for LS can be life-saving but is complicated by the heavy burden of variants of uncertain significance (VUS), especially missense changes. RESULT To address this challenge, we leverage a multiplexed analysis of variant effect (MAVE) map covering >94% of the 17,746 possible missense variants in the key LS gene MSH2. To establish this map's utility in large-scale variant reclassification, we overlay it on clinical databases of >15,000 individuals with LS gene variants uncovered during clinical genetic testing. We validate these functional measurements in a cohort of individuals with paired tumor-normal test results and find that MAVE-based function scores agree with the clinical interpretation for every one of the MSH2 missense variants with an available classification. We use these scores to attempt reclassification for 682 unique missense VUS, among which 34 scored as deleterious by our function map, in line with previously published rates for other cancer predisposition genes. Combining functional data and other evidence, ten missense VUS are reclassified as pathogenic/likely pathogenic, and another 497 could be moved to benign/likely benign. Finally, we apply these functional scores to paired tumor-normal genetic tests and identify a subset of patients with biallelic somatic loss of function, reflecting a sporadic Lynch-like Syndrome with distinct implications for treatment and relatives' risk. CONCLUSION This study demonstrates how high-throughput functional assays can empower scalable VUS resolution and prospectively generate strong evidence for variant classification.
Affiliation(s)
- Anthony Scott
- Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI 48109 USA; Division of Genetic Medicine, Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, MI 48109 USA
| | - Felicia Hernandez
- Ambry Genetics, Aliso Viejo, CA 92656 USA
| | - Adam Chamberlin
- Ambry Genetics, Aliso Viejo, CA 92656 USA
| | - Cathy Smith
- Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI 48109 USA; Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI 48109 USA
| | - Rachid Karam
- Ambry Genetics, Aliso Viejo, CA 92656 USA; Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI 48109 USA
| | - Jacob O. Kitzman
- Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI 48109 USA; Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI 48109 USA
| |
|
33
|
Dace P, Findlay GM. Reducing uncertainty in genetic testing with Saturation Genome Editing. MED GENET-BERLIN 2022; 34:297-304. [PMID: 38836089 PMCID: PMC11006300 DOI: 10.1515/medgen-2022-2159] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/06/2024]
Abstract
Accurate interpretation of human genetic data is critical for optimizing outcomes in the era of genomic medicine. Powerful methods for testing genetic variants for functional effects are allowing researchers to characterize thousands of variants across disease genes. Here, we review experimental tools enabling highly scalable assays of variants, focusing specifically on Saturation Genome Editing (SGE). We discuss examples of how this technique is being implemented for variant testing at scale and describe how SGE data for BRCA1 have been clinically validated and used to aid variant interpretation. The initial success at predicting variant pathogenicity with SGE has spurred efforts to expand this and related techniques to many more genes.
Affiliation(s)
- Phoebe Dace
- The Genome Function Laboratory, The Francis Crick Institute, 1 Midland Rd, London, United Kingdom
| | - Gregory M Findlay
- The Genome Function Laboratory, The Francis Crick Institute, 1 Midland Rd, London, United Kingdom
| |
|
34
|
Tabet D, Parikh V, Mali P, Roth FP, Claussnitzer M. Scalable Functional Assays for the Interpretation of Human Genetic Variation. Annu Rev Genet 2022; 56:441-465. [PMID: 36055970 DOI: 10.1146/annurev-genet-072920-032107] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Indexed: 12/24/2022]
Abstract
Scalable sequence-function studies have enabled the systematic analysis and cataloging of hundreds of thousands of coding and noncoding genetic variants in the human genome. This has improved clinical variant interpretation and provided insights into the molecular, biophysical, and cellular effects of genetic variants at an astonishing scale and resolution across the spectrum of allele frequencies. In this review, we explore current applications and prospects for the field and outline the principles underlying scalable functional assay design, with a focus on the study of single-nucleotide coding and noncoding variants.
Affiliation(s)
- Daniel Tabet
- Donnelly Centre, Department of Molecular Genetics, and Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
- Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, Ontario, Canada
| | - Victoria Parikh
- Center for Inherited Cardiovascular Disease, Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, California, USA
| | - Prashant Mali
- Department of Bioengineering, University of California, San Diego, California, USA
| | - Frederick P Roth
- Donnelly Centre, Department of Molecular Genetics, and Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
- Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, Ontario, Canada
| | - Melina Claussnitzer
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Center for Genomic Medicine and Endocrine Division, Massachusetts General Hospital, Boston, Massachusetts, USA
- Harvard Medical School, Harvard University, Boston, Massachusetts, USA
| |
|
35
|
Pooled image-base screening of mitochondria with microraft isolation distinguishes pathogenic mitofusin 2 mutations. Commun Biol 2022; 5:1128. [PMID: 36284160 PMCID: PMC9596453 DOI: 10.1038/s42003-022-04089-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2022] [Accepted: 10/11/2022] [Indexed: 11/08/2022]
Abstract
Most human genetic variation is classified as variants of uncertain significance. While advances in genome editing have allowed innovation in pooled screening platforms, many screens deal with relatively simple readouts (viability, fluorescence) and cannot identify the complex cellular phenotypes that underlie most human diseases. In this paper, we present a generalizable functional genomics platform that combines high-content imaging, machine learning, and microraft isolation in a method termed “Raft-Seq”. We highlight the efficacy of our platform by showing its ability to distinguish pathogenic point mutations of the mitochondrial regulator Mitofusin 2, even when the cellular phenotype is subtle. We also show that our platform achieves its efficacy using multiple cellular features, which can be configured on-the-fly. Raft-Seq enables a way to perform pooled screening on sets of mutations in biologically relevant cells, with the ability to physically capture any cell with a perturbed phenotype and expand it clonally, directly from the primary screen. Raft-Seq is a generalizable pooled screening platform that combines high-content imaging, machine learning and microraft isolation, and enables efficient screening of genetic perturbations based on their impact on phenotypes.
|
36
|
Azbukina N, Zharikova A, Ramensky V. Intragenic compensation through the lens of deep mutational scanning. Biophys Rev 2022; 14:1161-1182. [PMID: 36345285 PMCID: PMC9636336 DOI: 10.1007/s12551-022-01005-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/14/2022] [Accepted: 09/26/2022] [Indexed: 12/20/2022]
Abstract
A significant fraction of mutations in proteins are deleterious and result in adverse consequences for protein function, stability, or interaction with other molecules. Intragenic compensation is a specific case of positive epistasis in which a neutral missense mutation cancels the effect of a deleterious mutation in the same protein. Permissive compensatory mutations facilitate protein evolution, since without them all sequences would be extremely conserved. Understanding compensatory mechanisms is an important scientific challenge at the intersection of protein biophysics and evolution. In human genetics, intragenic compensatory interactions are important since they may result in variable penetrance of pathogenic mutations or fixation of pathogenic human alleles in orthologous proteins from related species. The latter phenomenon complicates computational and clinical inference of an allele's pathogenicity. Deep mutational scanning is a relatively new technique that enables experimental studies of functional effects of thousands of mutations in proteins. We review the important aspects of the field and discuss existing limitations of current datasets. We reviewed ten published DMS datasets with quantified functional effects of single and double mutations and described rates and patterns of intragenic compensation in eight of them.
Affiliation(s)
- Nadezhda Azbukina
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 1-73, Leninskie Gory, 119991 Moscow, Russia
| | - Anastasia Zharikova
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 1-73, Leninskie Gory, 119991 Moscow, Russia
- National Medical Research Center for Therapy and Preventive Medicine, Petroverigsky per., 10, Bld.3, 101000 Moscow, Russia
| | - Vasily Ramensky
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 1-73, Leninskie Gory, 119991 Moscow, Russia
- National Medical Research Center for Therapy and Preventive Medicine, Petroverigsky per., 10, Bld.3, 101000 Moscow, Russia
| |
|
37
|
High-throughput approaches to functional characterization of genetic variation in yeast. Curr Opin Genet Dev 2022; 76:101979. [PMID: 36075138 DOI: 10.1016/j.gde.2022.101979] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2022] [Revised: 07/29/2022] [Accepted: 08/02/2022] [Indexed: 11/20/2022]
Abstract
Expansion of sequencing efforts to include thousands of genomes is providing a fundamental resource for determining the genetic diversity that exists in a population. Now, high-throughput approaches are necessary to begin to understand the role these genotypic changes play in affecting phenotypic variation. Saccharomyces cerevisiae maintains its position as an excellent model system to determine the function of unknown variants with its exceptional genetic diversity, phenotypic diversity, and reliable genetic manipulation tools. Here, we review strategies and techniques developed in yeast that scale classic approaches of assessing variant function. These approaches improve our ability to better map quantitative trait loci at a higher resolution, even for rare variants, and are already providing greater insight into the role that different types of mutations play in phenotypic variation and evolution not just in yeast but across taxa.
|
38
|
Marquet C, Heinzinger M, Olenyi T, Dallago C, Erckert K, Bernhofer M, Nechaev D, Rost B. Embeddings from protein language models predict conservation and variant effects. Hum Genet 2022; 141:1629-1647. [PMID: 34967936 PMCID: PMC8716573 DOI: 10.1007/s00439-021-02411-y] [Citation(s) in RCA: 37] [Impact Index Per Article: 18.5] [Received: 06/01/2021] [Accepted: 12/06/2021] [Indexed: 12/13/2022]
Abstract
The emergence of SARS-CoV-2 variants underscored the demand for tools to interpret the effect of single amino acid variants (SAVs) on protein function. While Deep Mutational Scanning (DMS) sets continue to expand our understanding of the mutational landscape of single proteins, the results continue to challenge analyses. Protein Language Models (pLMs) use the latest deep learning (DL) algorithms to leverage growing databases of protein sequences. These methods learn to predict missing or masked amino acids from the context of entire sequence regions. Here, we used pLM representations (embeddings) to predict sequence conservation and SAV effects without multiple sequence alignments (MSAs). Embeddings alone predicted residue conservation almost as accurately from single sequences as ConSeq using MSAs (two-state Matthews Correlation Coefficient, MCC, of 0.596 ± 0.006 for ProtT5 embeddings vs. 0.608 ± 0.006 for ConSeq). Inputting the conservation prediction along with BLOSUM62 substitution scores and pLM mask reconstruction probabilities into a simplistic logistic regression (LR) ensemble for Variant Effect Score Prediction without Alignments (VESPA) predicted SAV effect magnitude without any optimization on DMS data. Comparing predictions for a standard set of 39 DMS experiments to other methods (incl. ESM-1v, DeepSequence, and GEMME) showed our approach to be competitive with the state-of-the-art (SOTA) methods using MSA input. No method outperformed all others, either consistently or with statistical significance, regardless of the performance measure applied (Spearman and Pearson correlation). Finally, we investigated binary effect predictions on DMS experiments for four human proteins. Overall, embedding-based methods have become competitive with methods relying on MSAs for SAV effect prediction at a fraction of the costs in computing/energy. Our method predicted SAV effects for the entire human proteome (~20k proteins) within 40 min on one Nvidia Quadro RTX 8000. All methods and data sets are freely available for local and online execution through bioembeddings.com, https://github.com/Rostlab/VESPA, and PredictProtein.
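For readers unfamiliar with the two-state Matthews Correlation Coefficient quoted in this abstract, the following is a minimal sketch of how such a score can be computed from binary labels. It is not the authors' evaluation code: the function name `two_state_mcc`, the 1 = conserved / 0 = variable encoding, and the toy labels are assumptions made purely for illustration.

```python
import math


def two_state_mcc(y_true, y_pred):
    """Matthews Correlation Coefficient for paired binary labels.

    Hypothetical encoding for illustration: 1 = conserved, 0 = variable.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")

    # Count the four cells of the 2x2 confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: report 0 when any marginal count is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Toy example (made-up labels, not data from the study).
observed = [1, 1, 0, 0]
predicted = [1, 0, 0, 0]
print(round(two_state_mcc(observed, predicted), 3))
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to +1 (perfect agreement), which makes it less sensitive to class imbalance than plain accuracy.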
Affiliation(s)
- Céline Marquet
- Department of Informatics, Bioinformatics and Computational Biology - i12, TUM-Technical University of Munich, Boltzmannstr. 3, Garching, 85748, Munich, Germany.
- TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Boltzmannstr. 11, 85748, Garching, Germany.
| | - Michael Heinzinger
- Department of Informatics, Bioinformatics and Computational Biology - i12, TUM-Technical University of Munich, Boltzmannstr. 3, Garching, 85748, Munich, Germany
- TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Boltzmannstr. 11, 85748, Garching, Germany
| | - Tobias Olenyi
- Department of Informatics, Bioinformatics and Computational Biology - i12, TUM-Technical University of Munich, Boltzmannstr. 3, Garching, 85748, Munich, Germany
- TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Boltzmannstr. 11, 85748, Garching, Germany
| | - Christian Dallago
- Department of Informatics, Bioinformatics and Computational Biology - i12, TUM-Technical University of Munich, Boltzmannstr. 3, Garching, 85748, Munich, Germany
- TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Boltzmannstr. 11, 85748, Garching, Germany
| | - Kyra Erckert
- Department of Informatics, Bioinformatics and Computational Biology - i12, TUM-Technical University of Munich, Boltzmannstr. 3, Garching, 85748, Munich, Germany
- TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Boltzmannstr. 11, 85748, Garching, Germany
| | - Michael Bernhofer
- Department of Informatics, Bioinformatics and Computational Biology - i12, TUM-Technical University of Munich, Boltzmannstr. 3, Garching, 85748, Munich, Germany
- TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Boltzmannstr. 11, 85748, Garching, Germany
| | - Dmitrii Nechaev
- Department of Informatics, Bioinformatics and Computational Biology - i12, TUM-Technical University of Munich, Boltzmannstr. 3, Garching, 85748, Munich, Germany
- TUM Graduate School, Center of Doctoral Studies in Informatics and its Applications (CeDoSIA), Boltzmannstr. 11, 85748, Garching, Germany
| | - Burkhard Rost
- Department of Informatics, Bioinformatics and Computational Biology - i12, TUM-Technical University of Munich, Boltzmannstr. 3, Garching, 85748, Munich, Germany
- Institute for Advanced Study (TUM-IAS), Lichtenbergstr. 2a, Garching, 85748, Munich, Germany
- TUM School of Life Sciences Weihenstephan (TUM-WZW), Alte Akademie 8, Freising, Germany
| |
|
39
|
Abdo AN, Rintisch C, Gabriel CH, Kramer A. Mutational scanning identified amino acids of the CLOCK exon 19-domain essential for circadian rhythms. Acta Physiol (Oxf) 2022; 234:e13794. [PMID: 35112498 DOI: 10.1111/apha.13794] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/13/2022] [Revised: 01/28/2022] [Accepted: 01/29/2022] [Indexed: 12/21/2022]
Abstract
AIM In the mammalian circadian clock, the CLOCK/BMAL1 heterodimer binds to E-box enhancer elements in the promoters of its target genes to activate transcription. The classical Clock mice, the first circadian mouse mutant discovered, are behaviourally arrhythmic. In this mutant, CLOCK lacks a 51 amino acid domain corresponding to exon 19 (CLOCKΔ19), which is required for normal transactivation. While the importance of this CLOCK domain for circadian rhythms is well established, the exact molecular mechanism is still unclear. METHODS Using CRISPR/Cas9 technology, we created a CLOCK knockout - CLOCK rescue system in human circadian reporter cells and performed systematic mutational scanning to assess the functionality of individual amino acids within the CLOCK exon 19-domain. RESULTS CLOCK knockout cells were arrhythmic, and circadian rhythms could be rescued by introducing wild-type CLOCK, but not CLOCKΔ19. In addition, we identified several residues whose mutation failed to rescue rhythms in CLOCK knockout cells. Many of these are part of the hydrophobic binding interface of the predicted dimer of the CLOCK exon 19-domain. CONCLUSION Our data not only indicate that CLOCK/BMAL1 oligomerization mediated by the exon 19-domain is important for circadian dynamics but also suggest that the exon 19-domain provides a platform for binding coactivators and repressors, which in turn is required for normal circadian rhythms.
Affiliation(s)
- Ashraf N Abdo
- Laboratory of Chronobiology, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Charité-Universitätsmedizin Berlin, Berlin, Germany
| | - Carola Rintisch
- Laboratory of Chronobiology, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Charité-Universitätsmedizin Berlin, Berlin, Germany
| | - Christian H Gabriel
- Laboratory of Chronobiology, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Charité-Universitätsmedizin Berlin, Berlin, Germany
| | - Achim Kramer
- Laboratory of Chronobiology, Corporate Member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Charité-Universitätsmedizin Berlin, Berlin, Germany
| |
|
40
|
Mahecha D, Nuñez H, Lattig MC, Duitama J. Machine Learning Models for Accurate Prioritization of Variants of Uncertain Significance. Hum Mutat 2022; 43:449-460. [PMID: 35143088 DOI: 10.1002/humu.24339] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/24/2020] [Revised: 01/04/2022] [Accepted: 01/23/2022] [Indexed: 11/08/2022]
Abstract
The growing use of next-generation sequencing technologies in genetic diagnosis has produced an exponential increase in the number of Variants of Uncertain Significance (VUS). In this manuscript we compare three machine learning methods to classify VUS as Pathogenic or Non-pathogenic, implementing a Random Forest (RF), a Support Vector Machine (SVM), and a Multilayer Perceptron (MLP). To train the models, we extracted high-quality variants from ClinVar that were previously classified as VUS. For each variant, we retrieved 9 conservation scores, the loss of function tool and allele frequencies. For the RF and SVM models, hyperparameters were tuned using cross-validation with a grid search. The three models were tested on a non-overlapping set of variants that had been classified as VUS at some point during the previous three years but had been reclassified in August 2020. The three models yielded superior accuracy on this set compared to the benchmarked tools. The RF-based model yielded the best performance across different variant types and was used to create VusPrize, an open-source software tool for prioritization of variants of uncertain significance. We believe that our model can improve the process of genetic diagnosis in research and clinical settings.
Affiliation(s)
- Daniel Mahecha
- SIGEN, Alianza Universidad de los Andes - Fundación Santa Fe de Bogota, Colombia; Systems and Computing Engineering Department, Universidad de los Andes, Colombia
| | - Haydemar Nuñez
- Systems and Computing Engineering Department, Universidad de los Andes, Colombia
| | - Maria C Lattig
- SIGEN, Alianza Universidad de los Andes - Fundación Santa Fe de Bogota, Colombia; Facultad de Ciencias, Universidad de los Andes
| | - Jorge Duitama
- Systems and Computing Engineering Department, Universidad de los Andes, Colombia
| |
|
41
|
Barbon L, Offord V, Radford EJ, Butler AP, Gerety SS, Adams DJ, Tan HK, Waters AJ. Variant Library Annotation Tool (VaLiAnT): an oligonucleotide library design and annotation tool for saturation genome editing and other deep mutational scanning experiments. Bioinformatics 2022; 38:892-899. [PMID: 34791067 PMCID: PMC8796380 DOI: 10.1093/bioinformatics/btab776] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/05/2021] [Revised: 07/13/2021] [Accepted: 11/10/2021] [Indexed: 02/04/2023]
Abstract
MOTIVATION CRISPR/Cas9-based technology allows for the functional analysis of genetic variants at single nucleotide resolution whilst maintaining genomic context. This approach, known as saturation genome editing (SGE), a form of deep mutational scanning, systematically alters each position in a target region to explore its function. SGE experiments require the design and synthesis of oligonucleotide variant libraries which are introduced into the genome. This technology is applicable to diverse fields such as disease variant identification, drug development, structure-function studies, synthetic biology, evolutionary genetics and host-pathogen interactions. Here, we present the Variant Library Annotation Tool (VaLiAnT) which can be used to generate variant libraries from user-defined genomic coordinates and standard input files. The software can accommodate user-specified species, reference sequences and transcript annotations. RESULTS Coordinates for a genomic range are provided by the user to retrieve a corresponding oligonucleotide reference sequence. A user-specified range within this sequence is then subject to systematic, nucleotide and/or amino acid saturating mutator functions. VaLiAnT provides a novel way to retrieve, mutate and annotate genomic sequences for oligonucleotide library generation. Specific features for SGE library generation can be employed. In addition, VaLiAnT is configurable, allowing for cDNA and prime editing saturation library generation, with other diverse applications possible. AVAILABILITY AND IMPLEMENTATION VaLiAnT is a command line tool written in Python. Source code, testing data, example input and output files and executables are available (https://github.com/cancerit/VaLiAnT) in addition to a detailed user manual (https://github.com/cancerit/VaLiAnT/wiki). VaLiAnT is licensed under AGPLv3. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Luca Barbon
- Cancer, Ageing and Somatic Mutation Programme, Wellcome Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK
| | - Victoria Offord
- Cancer, Ageing and Somatic Mutation Programme, Wellcome Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK
| | - Elizabeth J Radford
- Human Genetics Programme, Wellcome Sanger Institute, Hinxton, Cambridge CB10 1SA, UK
- Department of Paediatrics, University of Cambridge, Cambridge CB2 0QQ, UK
| | - Adam P Butler
- Cancer, Ageing and Somatic Mutation Programme, Wellcome Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK
| | - Sebastian S Gerety
- Human Genetics Programme, Wellcome Sanger Institute, Hinxton, Cambridge CB10 1SA, UK
| | - David J Adams
- Cancer, Ageing and Somatic Mutation Programme, Wellcome Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK
| | - Hong Kee Tan
- Human Genetics Programme, Wellcome Sanger Institute, Hinxton, Cambridge CB10 1SA, UK
| | - Andrew J Waters
- Cancer, Ageing and Somatic Mutation Programme, Wellcome Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK
| |
|
42
|
An oncogenic splice variant of PDGFRα in adult glioblastoma as a therapeutic target for selective CDK4/6 inhibitors. Sci Rep 2022; 12:1275. [PMID: 35075231 PMCID: PMC8786844 DOI: 10.1038/s41598-022-05391-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/14/2021] [Accepted: 01/06/2022] [Indexed: 12/15/2022]
Abstract
Understanding human genome alterations is necessary to optimize genome-based cancer therapeutics. However, some newly discovered mutations remain as variants of unknown significance (VUS). Here, the mutation c.1403A>G in exon 10 of the platelet-derived growth factor receptor-alpha (PDGFRA) gene, a VUS found in adult glioblastoma multiforme (GBM), was introduced in human embryonic kidney 293T (HEK293T) cells using genome editing to investigate its potential oncogenic functions. Genome editing was performed using CRISPR/Cas9; the proliferation, drug sensitivity, and carcinogenic potential of genome-edited cells were investigated. We also investigated the mechanism underlying the observed phenotypes. Three GBM patients carrying the c.1403A>G mutation were studied to validate the in vitro results. The c.1403A>G mutation led to a splice variant (p.K455_N468delinsN) because of the generation of a 3’-acceptor splice site in exon 10. PDGFRA-mutated HEK293T cells exhibited a higher proliferative activity via PDGFRα and the cyclin-dependent kinase (CDK)4/CDK6-cyclin D1 signaling pathway in a ligand-independent manner. They showed higher sensitivity to multi-kinase, receptor tyrosine kinase, and CDK4/CDK6 inhibitors. Of the three GBM patients studied, two harbored the p.K455_N468delinsN splice variant. The splicing mutation c.1403A>G in PDGFRA is oncogenic in nature. Kinase inhibitors targeting PDGFRα and CDK4/CDK6 signaling should be evaluated for treating GBM patients harboring this mutation.
|
43
|
Zeng Z, Aptekmann AA, Bromberg Y. Decoding the effects of synonymous variants. Nucleic Acids Res 2021; 49:12673-12691. [PMID: 34850938 PMCID: PMC8682775 DOI: 10.1093/nar/gkab1159] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 06/23/2021] [Revised: 11/02/2021] [Accepted: 11/08/2021] [Indexed: 12/12/2022]
Abstract
Synonymous single nucleotide variants (sSNVs) are common in the human genome but are often overlooked. However, sSNVs can have significant biological impact and may lead to disease. Existing computational methods for evaluating the effect of sSNVs suffer from the lack of gold-standard training/evaluation data and exhibit over-reliance on sequence conservation signals. We developed synVep (synonymous Variant effect predictor), a machine learning-based method that overcomes both of these limitations. Our training data was a combination of variants reported by gnomAD (observed) and those unreported, but possible in the human genome (generated). We used positive-unlabeled learning to purify the generated variant set of any likely unobservable variants. We then trained two sequential extreme gradient boosting models to identify subsets of the remaining variants putatively enriched and depleted in effect. Our method attained 90% precision/recall on a previously unseen set of variants. Furthermore, although synVep does not explicitly use conservation, its scores correlated with evolutionary distances between orthologs in cross-species variation analysis. synVep was also able to differentiate pathogenic vs. benign variants, as well as splice-site disrupting variants (SDV) vs. non-SDVs. Thus, synVep provides an important improvement in annotation of sSNVs, allowing users to focus on variants that most likely harbor effects.
Affiliation(s)
- Zishuo Zeng
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, NJ 08873, USA
| | - Ariel A Aptekmann
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, NJ 08873, USA
| | - Yana Bromberg
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, NJ 08873, USA
- Department of Genetics, Rutgers University, Piscataway, NJ 08854, USA
| |
|
44
|
Chu HY, Wong ASL. Facilitating Machine Learning-Guided Protein Engineering with Smart Library Design and Massively Parallel Assays. Adv Genet (Hoboken) 2021; 2:2100038. [PMID: 36619853 PMCID: PMC9744531 DOI: 10.1002/ggn2.202100038] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/07/2021] [Revised: 11/08/2021] [Indexed: 01/11/2023]
Abstract
Protein design plays an important role in recent medical advances from antibody therapy to vaccine design. Typically, exhaustive mutational screens or directed evolution experiments are used for the identification of the best design or for improvements to the wild-type variant. Even with high-throughput screening on pooled libraries and Next-Generation Sequencing to boost the scale of read-outs, surveying all the variants with combinatorial mutations for their empirical fitness scores is still orders of magnitude beyond the capacity of existing experimental settings. To tackle this challenge, in-silico approaches using machine learning to predict the fitness of novel variants based on a subset of empirical measurements are now employed. These machine learning models turn out to be useful in many cases, with the premise that the experimentally determined fitness scores and the amino-acid descriptors of the models are informative. The machine learning models can guide the search for the highest fitness variants, resolve complex epistatic relationships, and highlight bio-physical rules for protein folding. Using machine learning-guided approaches, researchers can build more focused libraries, thus relieving themselves from labor-intensive screens and fast-tracking the optimization process. Here, we describe the current advances in massive-scale variant screens, and how machine learning and mutagenesis strategies can be integrated to accelerate protein engineering. More specifically, we examine strategies to make screens more economical, informative, and effective in discovery of useful variants.
Affiliation(s)
- Hoi Yee Chu
- Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Hong Kong 852, China
| | - Alan S. L. Wong
- Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Hong Kong 852, China; Electrical and Electronic Engineering, The University of Hong Kong, Pokfulam, Hong Kong 852, China
| |
|
45
|
Chora JR, Iacocca MA, Tichý L, Wand H, Kurtz CL, Zimmermann H, Leon A, Williams M, Humphries SE, Hooper AJ, Trinder M, Brunham LR, Costa Pereira A, Jannes CE, Chen M, Chonis J, Wang J, Kim S, Johnston T, Soucek P, Kramarek M, Leigh SE, Carrié A, Sijbrands EJ, Hegele RA, Freiberger T, Knowles JW, Bourbon M. The Clinical Genome Resource (ClinGen) Familial Hypercholesterolemia Variant Curation Expert Panel consensus guidelines for LDLR variant classification. Genet Med 2021; 24:293-306. [PMID: 34906454 DOI: 10.1016/j.gim.2021.09.012] [Citation(s) in RCA: 43] [Impact Index Per Article: 14.3] [Received: 03/24/2021] [Revised: 08/06/2021] [Accepted: 09/15/2021] [Indexed: 01/02/2023]
Abstract
PURPOSE: In 2015, the American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology (AMP) published consensus standardized guidelines for sequence-level variant classification in Mendelian disorders. To increase accuracy and consistency, the Clinical Genome Resource Familial Hypercholesterolemia (FH) Variant Curation Expert Panel was tasked with optimizing the existing ACMG/AMP framework for disease-specific classification in FH. In this study, we provide consensus recommendations for the most common FH-associated gene, LDLR, where >2300 unique FH-associated variants have been identified. METHODS: The multidisciplinary FH Variant Curation Expert Panel met in person and through frequent emails and conference calls to develop LDLR-specific modifications of ACMG/AMP guidelines. Through iteration, pilot testing, debate, and commentary, consensus among experts was reached. RESULTS: The consensus LDLR variant modifications to existing ACMG/AMP guidelines include (1) alteration of population frequency thresholds, (2) delineation of loss-of-function variant types, (3) functional study criteria specifications, (4) cosegregation criteria specifications, and (5) specific use and thresholds for in silico prediction tools, among others. CONCLUSION: Establishment of these guidelines as the new standard in the clinical laboratory setting will result in a more evidence-based, harmonized method for LDLR variant classification worldwide, thereby improving the care of patients with FH.
Affiliation(s)
- Joana R Chora
- Department of Health Promotion and Prevention of Noncommunicable Diseases, National Institute of Health Dr. Ricardo Jorge, Lisbon, Portugal; BioISI - BioSystems & Integrative Sciences Institute, Department of Chemistry and Biochemistry, Faculty of Sciences, University of Lisbon, Lisbon, Portugal
| | - Michael A Iacocca
- Departments of Biomedical Data Science and Pathology, School of Medicine, Stanford University, Stanford, CA; Department of Paediatric Laboratory Medicine, The Hospital for Sick Children, Toronto, Ontario, Canada
| | - Lukáš Tichý
- Centre of Molecular Biology and Gene Therapy, University Hospital Brno, Brno, Czech Republic
| | - Hannah Wand
- Departments of Biomedical Data Science and Pathology, School of Medicine, Stanford University, Stanford, CA; Center for Inherited Cardiovascular Disease, Stanford Health Care, Stanford University, Stanford, CA
| | - C Lisa Kurtz
- Department of Genetics, School of Medicine, University of North Carolina at Chapel Hill, Chapel Hill, NC
| | | | | | - Maggie Williams
- Bristol Genetics Laboratory, North Bristol NHS Trust, Bristol, United Kingdom
| | - Steve E Humphries
- Centre for Cardiovascular Genetics, Institute of Cardiovascular Science, University College London, London, United Kingdom
| | - Amanda J Hooper
- Department of Clinical Biochemistry, PathWest Laboratory Medicine WA, Royal Perth Hospital and Fiona Stanley Hospital Network, University of Western Australia, Perth, Western Australia, Australia
| | - Mark Trinder
- Department of Medicine, Faculty of Medicine, The University of British Columbia, Vancouver, British Columbia, Canada
| | - Liam R Brunham
- Department of Medicine, Faculty of Medicine, The University of British Columbia, Vancouver, British Columbia, Canada
| | - Alexandre Costa Pereira
- Laboratory of Genetics and Molecular Cardiology, Heart Institute (InCor), Faculty of Medicine, São Paulo University, São Paulo, Brazil
| | - Cinthia E Jannes
- Laboratory of Genetics and Molecular Cardiology, Heart Institute (InCor), Faculty of Medicine, São Paulo University, São Paulo, Brazil
| | | | | | - Jian Wang
- Robarts Research Institute, Schulich School of Medicine & Dentistry, Western University, London, Ontario, Canada
| | | | | | - Premysl Soucek
- Centre for Cardiovascular Surgery and Transplantation, Brno, Czech Republic; Faculty of Medicine, Masaryk University, Brno, Czech Republic
| | - Michal Kramarek
- Centre for Cardiovascular Surgery and Transplantation, Brno, Czech Republic; Faculty of Medicine, Masaryk University, Brno, Czech Republic
| | | | - Alain Carrié
- University Hospitals Pitié-Salpêtrière/Charles-Foix, Molecular and Chromosomal Genetics Center, Obesity and Dyslipidemia Genetics Unit, Sorbonne University, Paris, France
| | - Eric J Sijbrands
- Academic Medical Center, Erasmus University, Rotterdam, Netherlands
| | - Robert A Hegele
- Robarts Research Institute, Schulich School of Medicine & Dentistry, Western University, London, Ontario, Canada
| | - Tomáš Freiberger
- Centre for Cardiovascular Surgery and Transplantation, Brno, Czech Republic; Faculty of Medicine, Masaryk University, Brno, Czech Republic
| | - Joshua W Knowles
- Division of Cardiovascular Medicine, Stanford Cardiovascular Institute, Prevention Research Center, and Diabetes Research Center, School of Medicine, Stanford University, Stanford, CA; FH Foundation, Pasadena, CA
| | - Mafalda Bourbon
- Department of Health Promotion and Prevention of Noncommunicable Diseases, National Institute of Health Dr. Ricardo Jorge, Lisbon, Portugal; BioISI - BioSystems & Integrative Sciences Institute, Department of Chemistry and Biochemistry, Faculty of Sciences, University of Lisbon, Lisbon, Portugal
| | | |
|
46
|
Wu Y, Liu H, Li R, Sun S, Weile J, Roth FP. Improved pathogenicity prediction for rare human missense variants. Am J Hum Genet 2021; 108:1891-1906. [PMID: 34551312 PMCID: PMC8546039 DOI: 10.1016/j.ajhg.2021.08.012] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Received: 05/24/2021] [Accepted: 08/18/2021] [Indexed: 01/01/2023]
Abstract
The success of personalized genomic medicine depends on our ability to assess the pathogenicity of rare human variants, including the important class of missense variation. There are many challenges in training accurate computational systems, e.g., in finding the balance between quantity, quality, and bias in the variant sets used as training examples and avoiding predictive features that can accentuate the effects of bias. Here, we describe VARITY, which judiciously exploits a larger reservoir of training examples with uncertain accuracy and representativity. To limit circularity and bias, VARITY excludes features informed by variant annotation and protein identity. To provide a rationale for each prediction, we quantified the contribution of features and feature combinations to the pathogenicity inference of each variant. VARITY outperformed all previous computational methods evaluated, identifying at least 10% more pathogenic variants at thresholds achieving high (90% precision) stringency.
|
47
|
Findlay GM. Linking genome variants to disease: scalable approaches to test the functional impact of human mutations. Hum Mol Genet 2021; 30:R187-R197. [PMID: 34338757 PMCID: PMC8490018 DOI: 10.1093/hmg/ddab219] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 07/19/2021] [Revised: 07/19/2021] [Accepted: 07/19/2021] [Indexed: 11/13/2022]
Abstract
The application of genomics to medicine has accelerated the discovery of mutations underlying disease and has enhanced our knowledge of the molecular underpinnings of diverse pathologies. As the amount of human genetic material queried via sequencing has grown exponentially in recent years, so too has the number of rare variants observed. Despite progress, our ability to distinguish which rare variants have clinical significance remains limited. Over the last decade, however, powerful experimental approaches have emerged to characterize variant effects orders of magnitude faster than before. Fueled by improved DNA synthesis and sequencing and, more recently, by CRISPR/Cas9 genome editing, multiplex functional assays provide a means of generating variant effect data in wide-ranging experimental systems. Here, I review recent applications of multiplex assays that link human variants to disease phenotypes and I describe emerging strategies that will enhance their clinical utility in coming years.
Affiliation(s)
- Gregory M Findlay
- The Francis Crick Institute, The Genome Function Laboratory, London NW1 1AT, UK
| |
|
48
|
Geck RC, Boyle G, Amorosi CJ, Fowler DM, Dunham MJ. Measuring Pharmacogene Variant Function at Scale Using Multiplexed Assays. Annu Rev Pharmacol Toxicol 2021; 62:531-550. [PMID: 34516287 DOI: 10.1146/annurev-pharmtox-032221-085807] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/09/2022]
Abstract
As costs of next-generation sequencing decrease, identification of genetic variants has far outpaced our ability to understand their functional consequences. This lack of understanding is a central challenge to a key promise of pharmacogenomics: using genetic information to guide drug selection and dosing. Recently developed multiplexed assays of variant effect enable experimental measurement of the function of thousands of variants simultaneously. Here, we describe multiplexed assays that have been performed on nearly 25,000 variants in eight key pharmacogenes (ADRB2, CYP2C9, CYP2C19, NUDT15, SLCO1B1, TPMT, VKORC1, and the LDLR promoter), discuss advances in experimental design, and explore key challenges that must be overcome to maximize the utility of multiplexed functional data.
Affiliation(s)
- Renee C Geck
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Gabriel Boyle
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Clara J Amorosi
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| | - Douglas M Fowler
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA; Department of Bioengineering, University of Washington, Seattle, Washington 98195, USA
| | - Maitreya J Dunham
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
| |
|
49
|
Markin CJ, Mokhtari DA, Sunden F, Appel MJ, Akiva E, Longwell SA, Sabatti C, Herschlag D, Fordyce PM. Revealing enzyme functional architecture via high-throughput microfluidic enzyme kinetics. Science 2021; 373:eabf8761. [PMID: 34437092 DOI: 10.1126/science.abf8761] [Citation(s) in RCA: 73] [Impact Index Per Article: 24.3] [Received: 11/25/2020] [Accepted: 05/24/2021] [Indexed: 12/21/2022]
Abstract
Systematic and extensive investigation of enzymes is needed to understand their extraordinary efficiency and meet current challenges in medicine and engineering. We present HT-MEK (High-Throughput Microfluidic Enzyme Kinetics), a microfluidic platform for high-throughput expression, purification, and characterization of more than 1500 enzyme variants per experiment. For 1036 mutants of the alkaline phosphatase PafA (phosphate-irrepressible alkaline phosphatase of Flavobacterium), we performed more than 670,000 reactions and determined more than 5000 kinetic and physical constants for multiple substrates and inhibitors. We uncovered extensive kinetic partitioning to a misfolded state and isolated catalytic effects, revealing spatially contiguous regions of residues linked to particular aspects of function. Regions included active-site proximal residues but extended to the enzyme surface, providing a map of underlying architecture not possible to derive from existing approaches. HT-MEK has applications that range from understanding molecular mechanisms to medicine, engineering, and design.
Affiliation(s)
- C J Markin
- Department of Biochemistry, Stanford University, Stanford, CA 94305, USA
| | - D A Mokhtari
- Department of Biochemistry, Stanford University, Stanford, CA 94305, USA
| | - F Sunden
- Department of Biochemistry, Stanford University, Stanford, CA 94305, USA
| | - M J Appel
- Department of Biochemistry, Stanford University, Stanford, CA 94305, USA
| | - E Akiva
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, CA 94158, USA
| | - S A Longwell
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
| | - C Sabatti
- Department of Biomedical Data Science, Stanford University, Stanford, CA 94305, USA; Department of Statistics, Stanford University, Stanford, CA 94305, USA
| | - D Herschlag
- Department of Biochemistry, Stanford University, Stanford, CA 94305, USA; Department of Chemical Engineering, Stanford University, Stanford, CA 94305, USA; ChEM-H Institute, Stanford University, Stanford, CA 94305, USA
| | - P M Fordyce
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA; ChEM-H Institute, Stanford University, Stanford, CA 94305, USA; Department of Genetics, Stanford University, Stanford, CA 94305, USA; Chan Zuckerberg Biohub, San Francisco, CA 94110, USA
| |
|
50
|
Shifting landscapes of human MTHFR missense-variant effects. Am J Hum Genet 2021; 108:1283-1300. [PMID: 34214447 PMCID: PMC8322931 DOI: 10.1016/j.ajhg.2021.05.009] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 01/19/2021] [Accepted: 05/18/2021] [Indexed: 12/20/2022]
Abstract
Most rare clinical missense variants cannot currently be classified as pathogenic or benign. Deficiency in human 5,10-methylenetetrahydrofolate reductase (MTHFR), the most common inherited disorder of folate metabolism, is caused primarily by rare missense variants. Further complicating variant interpretation, variant impacts often depend on environment. An important example of this phenomenon is the MTHFR variant p.Ala222Val (c.665C>T), which is carried by half of all humans and has a phenotypic impact that depends on dietary folate. Here we describe the results of 98,336 variant functional-impact assays, covering nearly all possible MTHFR amino acid substitutions in four folinate environments, each in the presence and absence of p.Ala222Val. The resulting atlas of MTHFR variant effects reveals many complex dependencies on both folinate and p.Ala222Val. MTHFR atlas scores can distinguish pathogenic from benign variants and, among individuals with severe MTHFR deficiency, correlate with age of disease onset. Providing a powerful tool for understanding structure-function relationships, the atlas suggests a role for a disordered loop in retaining cofactor at the active site and identifies variants that enable escape of inhibition by S-adenosylmethionine. Thus, a model based on eight MTHFR variant effect maps illustrates how shifting landscapes of environment- and genetic-background-dependent missense variation can inform our clinical, structural, and functional understanding of MTHFR deficiency.
|