1
De Leonardis M, Fernandez-de-Cossio-Diaz J, Uguzzoni G, Pagnani A. Unsupervised modeling of mutational landscapes of adeno-associated viruses viability. BMC Bioinformatics 2024; 25:229. [PMID: 38956474 DOI: 10.1186/s12859-024-05823-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2023] [Accepted: 06/03/2024] [Indexed: 07/04/2024] Open
Abstract
Adeno-associated viruses 2 (AAV2) are minute viruses renowned for their capacity to infect human cells and akin organisms. They have recently emerged as prominent candidates in the field of gene therapy, primarily attributed to their inherent non-pathogenic nature in humans and the safety associated with their manipulation. The efficacy of AAV2 as gene therapy vectors hinges on their ability to infiltrate host cells, a phenomenon reliant on their competence to construct a capsid capable of breaching the nucleus of the target cell. To enhance their infection potential, researchers have extensively scrutinized various combinatorial libraries by introducing mutations into the capsid, aiming to boost their effectiveness. The emergence of high-throughput experimental techniques, like deep mutational scanning (DMS), has made it feasible to experimentally assess the fitness of these libraries for their intended purpose. Notably, machine learning is starting to demonstrate its potential in addressing predictions within the mutational landscape from sequence data. In this context, we introduce a biophysically-inspired model designed to predict the viability of genetic variants in DMS experiments. This model is tailored to a specific segment of the CAP region within AAV2's capsid protein. To evaluate its effectiveness, we conduct model training with diverse datasets, each tailored to explore different aspects of the mutational landscape influenced by the selection process. Our assessment of the biophysical model centers on two primary objectives: (i) providing quantitative forecasts for the log-selectivity of variants and (ii) deploying it as a binary classifier to categorize sequences into viable and non-viable classes.
Affiliation(s)
- Matteo De Leonardis
  - DISAT, Politecnico di Torino, Corso Duca degli Abruzzi, 10129, Torino, Italy
- Jorge Fernandez-de-Cossio-Diaz
  - Laboratoire de Physique de l'Ecole Normale Supérieure, CNRS, PSL University, Sorbonne Université, Université Paris-Cité, 75005, Paris, France
- Guido Uguzzoni
  - Italian Institute for Genomic Medicine, IRCCS Candiolo, SP-142, 10060, Candiolo, Italy
- Andrea Pagnani
  - DISAT, Politecnico di Torino, Corso Duca degli Abruzzi, 10129, Torino, Italy
  - Italian Institute for Genomic Medicine, IRCCS Candiolo, SP-142, 10060, Candiolo, Italy
2
McDonnell AF, Plech M, Livesey BJ, Gerasimavicius L, Owen LJ, Hall HN, FitzPatrick DR, Marsh JA, Kudla G. Deep mutational scanning quantifies DNA binding and predicts clinical outcomes of PAX6 variants. Mol Syst Biol 2024; 20:825-844. [PMID: 38849565 DOI: 10.1038/s44320-024-00043-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/04/2023] [Revised: 04/05/2024] [Accepted: 05/14/2024] [Indexed: 06/09/2024] Open
Abstract
Nonsense and missense mutations in the transcription factor PAX6 cause a wide range of eye development defects, including aniridia, microphthalmia and coloboma. To understand how changes of PAX6:DNA binding cause these phenotypes, we combined saturation mutagenesis of the paired domain of PAX6 with a yeast one-hybrid (Y1H) assay in which expression of a PAX6-GAL4 fusion gene drives antibiotic resistance. We quantified binding of more than 2700 single amino-acid variants to two DNA sequence elements. Mutations in DNA-facing residues of the N-terminal subdomain and linker region were most detrimental, as were mutations to prolines and to negatively charged residues. Many variants caused sequence-specific molecular gain-of-function effects, including variants in position 71 that increased binding to the LE9 enhancer but decreased binding to a SELEX-derived binding site. In the absence of antibiotic selection, variants that retained DNA binding slowed yeast growth, likely because such variants perturbed the yeast transcriptome. Benchmarking against known patient variants and applying ACMG/AMP guidelines to variant classification, we obtained supporting-to-moderate evidence that 977 variants are likely pathogenic and 1306 are likely benign. Our analysis shows that most pathogenic mutations in the paired domain of PAX6 can be explained simply by the effects of these mutations on PAX6:DNA association, and establishes Y1H as a generalisable assay for the interpretation of variant effects in transcription factors.
Affiliation(s)
- Alexander F McDonnell
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, EH4 2XU, UK
- Marcin Plech
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, EH4 2XU, UK
- Benjamin J Livesey
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, EH4 2XU, UK
- Lukas Gerasimavicius
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, EH4 2XU, UK
- Liusaidh J Owen
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, EH4 2XU, UK
- Hildegard Nikki Hall
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, EH4 2XU, UK
- David R FitzPatrick
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, EH4 2XU, UK
- Joseph A Marsh
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, EH4 2XU, UK
- Grzegorz Kudla
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, EH4 2XU, UK
3
Ma K, Huang S, Ng KK, Lake NJ, Joseph S, Xu J, Lek A, Ge L, Woodman KG, Koczwara KE, Cohen J, Ho V, O'Connor CL, Brindley MA, Campbell KP, Lek M. Deep Mutational Scanning in Disease-related Genes with Saturation Mutagenesis-Reinforced Functional Assays (SMuRF). bioRxiv 2024:2023.07.12.548370. [PMID: 37873263 PMCID: PMC10592615 DOI: 10.1101/2023.07.12.548370] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2023]
Abstract
Interpretation of disease-causing genetic variants remains a challenge in human genetics. Current costs and complexity of deep mutational scanning methods hamper crowd-sourcing approaches toward genome-wide resolution of variants in disease-related genes. Our framework, Saturation Mutagenesis-Reinforced Functional assays (SMuRF), addresses these issues by offering simple and cost-effective saturation mutagenesis, as well as streamlining functional assays to enhance the interpretation of unresolved variants. Applying SMuRF to neuromuscular disease genes FKRP and LARGE1, we generated functional scores for all possible coding single nucleotide variants, which aid in resolving clinically reported variants of uncertain significance. SMuRF also demonstrates utility in predicting disease severity, resolving critical structural regions, and providing training datasets for the development of computational predictors. Our approach opens new directions for enabling variant-to-function insights for disease genes in a manner that is broadly useful for crowd-sourcing implementation across standard research laboratories.
4
Bendel AM, Skendo K, Klein D, Shimada K, Kauneckaite-Griguole K, Diss G. Optimization of a deep mutational scanning workflow to improve quantification of mutation effects on protein-protein interactions. BMC Genomics 2024; 25:630. [PMID: 38914936 PMCID: PMC11194945 DOI: 10.1186/s12864-024-10524-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2023] [Accepted: 06/14/2024] [Indexed: 06/26/2024] Open
Abstract
Deep Mutational Scanning (DMS) assays are powerful tools to study sequence-function relationships by measuring the effects of thousands of sequence variants on protein function. During a DMS experiment, several technical artefacts might distort non-linearly the functional score obtained, potentially biasing the interpretation of the results. We therefore tested several technical parameters in the deepPCA workflow, a DMS assay for protein-protein interactions, in order to identify technical sources of non-linearities. We found that parameters common to many DMS assays such as amount of transformed DNA, timepoint of harvest and library composition can cause non-linearities in the data. Designing experiments in a way to minimize these non-linear effects will improve the quantification and interpretation of mutation effects.
Affiliation(s)
- Alexandra M Bendel
  - Friedrich Miescher Institute for Biomedical Research (FMI), Basel, Switzerland
  - University of Basel, Basel, Switzerland
- Dominique Klein
  - Friedrich Miescher Institute for Biomedical Research (FMI), Basel, Switzerland
- Kenji Shimada
  - Friedrich Miescher Institute for Biomedical Research (FMI), Basel, Switzerland
- Kotryna Kauneckaite-Griguole
  - Friedrich Miescher Institute for Biomedical Research (FMI), Basel, Switzerland
  - University of Basel, Basel, Switzerland
- Guillaume Diss
  - Friedrich Miescher Institute for Biomedical Research (FMI), Basel, Switzerland
5
Biar CG, Pfeifer C, Carvill GL, Calhoun JD. Multimodal framework to resolve variants of uncertain significance in TSC2. bioRxiv 2024:2024.06.07.597916. [PMID: 38895336 PMCID: PMC11185720 DOI: 10.1101/2024.06.07.597916] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/21/2024]
Abstract
Efforts to resolve the functional impact of variants of uncertain significance (VUS) have lagged behind the identification of new VUS; as such, there is a critical need for scalable VUS resolution technologies. Computational variant effect predictors (VEPs), once trained, can predict pathogenicity for all missense variants in a gene, set of genes, or the exome. Existing tools have employed information on known pathogenic and benign variants throughout the genome to predict pathogenicity of VUS. We hypothesize that taking a gene-specific approach will improve pathogenicity prediction over globally-trained VEPs. We tested this hypothesis using the gene TSC2, whose loss of function results in tuberous sclerosis, a multisystem mTORopathy affecting about 1 in 6,000 individuals born in the United States. TSC2 has been identified as a high-priority target for VUS resolution, with (1) well-characterized molecular and patient phenotypes associated with loss-of-function variants, and (2) more than 2,700 VUS already documented in ClinVar. We developed Tuberous sclerosis classifier to Resolve variants of Uncertain Significance in TSC2 (TRUST), a machine learning model to predict pathogenicity of TSC2 missense VUS. To test whether these predictions are accurate, we further introduce curated loci prime editing (cliPE) as an accessible strategy for performing scalable multiplexed assays of variant effect (MAVEs). Using cliPE, we tested the effects of more than 200 TSC2 variants, including 106 VUS. It is highly likely this functional data alone would be sufficient to reclassify 92 VUS with most being reclassified as likely benign. We found that TRUST's classifications were correlated with the functional data, providing additional validation for the in silico predictions. We provide our pathogenicity predictions and MAVE data to aid with VUS resolution. In the near future, we plan to host these data on a public website and deposit into relevant databases such as MAVEdb as a community resource. Ultimately, this study provides a framework to complete variant effect maps of TSC1 and TSC2 and adapt this approach to other mTORopathy genes.
Affiliation(s)
- Carina G Biar
  - Ken and Ruth Davee Department of Neurology, Northwestern Feinberg School of Medicine, Chicago, Illinois
- Cole Pfeifer
  - Ken and Ruth Davee Department of Neurology, Northwestern Feinberg School of Medicine, Chicago, Illinois
- Gemma L Carvill
  - Ken and Ruth Davee Department of Neurology, Northwestern Feinberg School of Medicine, Chicago, Illinois
- Jeffrey D Calhoun
  - Ken and Ruth Davee Department of Neurology, Northwestern Feinberg School of Medicine, Chicago, Illinois
6
Yee SW, Macdonald CB, Mitrovic D, Zhou X, Koleske ML, Yang J, Buitrago Silva D, Rockefeller Grimes P, Trinidad DD, More SS, Kachuri L, Witte JS, Delemotte L, Giacomini KM, Coyote-Maestas W. The full spectrum of SLC22 OCT1 mutations illuminates the bridge between drug transporter biophysics and pharmacogenomics. Mol Cell 2024; 84:1932-1947.e10. [PMID: 38703769 DOI: 10.1016/j.molcel.2024.04.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2023] [Revised: 01/04/2024] [Accepted: 04/15/2024] [Indexed: 05/06/2024]
Abstract
Mutations in transporters can impact an individual's response to drugs and cause many diseases. Few variants in transporters have been evaluated for their functional impact. Here, we combine saturation mutagenesis and multi-phenotypic screening to dissect the impact of 11,213 missense, single-amino-acid deletion, and synonymous variants across the 554 residues of OCT1, a key liver xenobiotic transporter. By quantifying in parallel expression and substrate uptake, we find that most variants exert their primary effect on protein abundance, a phenotype not commonly measured alongside function. Using our mutagenesis results combined with structure prediction and molecular dynamic simulations, we develop accurate structure-function models of the entire transport cycle, providing biophysical characterization of all known and possible human OCT1 polymorphisms. This work provides a complete functional map of OCT1 variants along with a framework for integrating functional genomics, biophysical modeling, and human genetics to predict variant effects on disease and drug efficacy.
Affiliation(s)
- Sook Wah Yee
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94143, USA
- Christian B Macdonald
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94143, USA
- Darko Mitrovic
  - Science for Life Laboratory, Department of Applied Physics, KTH Royal Institute of Technology, 12121 Solna, Stockholm, Stockholm County 114 28, Sweden
- Xujia Zhou
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94143, USA
- Megan L Koleske
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94143, USA
- Jia Yang
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94143, USA
- Dina Buitrago Silva
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94143, USA
- Patrick Rockefeller Grimes
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94143, USA
- Donovan D Trinidad
  - Department of Medicine, Division of Infectious Disease, University of California, San Francisco, San Francisco, CA 94143, USA
- Swati S More
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94143, USA
- Linda Kachuri
  - Department of Epidemiology and Population Health, Stanford University, Stanford, CA 94305, USA; Stanford Cancer Institute, Stanford University, Stanford, CA 94305, USA
- John S Witte
  - Department of Epidemiology and Population Health, Stanford University, Stanford, CA 94305, USA; Stanford Cancer Institute, Stanford University, Stanford, CA 94305, USA
- Lucie Delemotte
  - Science for Life Laboratory, Department of Applied Physics, KTH Royal Institute of Technology, 12121 Solna, Stockholm, Stockholm County 114 28, Sweden
- Kathleen M Giacomini
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94143, USA
- Willow Coyote-Maestas
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94143, USA; Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, CA 94143, USA; Chan Zuckerberg Biohub, San Francisco, CA 94148, USA
7
Estevam GO, Linossi EM, Macdonald CB, Espinoza CA, Michaud JM, Coyote-Maestas W, Collisson EA, Jura N, Fraser JS. Conserved regulatory motifs in the juxtamembrane domain and kinase N-lobe revealed through deep mutational scanning of the MET receptor tyrosine kinase domain. bioRxiv 2024:2023.08.03.551866. [PMID: 37577651 PMCID: PMC10418267 DOI: 10.1101/2023.08.03.551866] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 08/15/2023]
Abstract
MET is a receptor tyrosine kinase (RTK) responsible for initiating signaling pathways involved in development and wound repair. MET activation relies on ligand binding to the extracellular receptor, which prompts dimerization, intracellular phosphorylation, and recruitment of associated signaling proteins. Mutations, which are predominantly observed clinically in the intracellular juxtamembrane and kinase domains, can disrupt typical MET regulatory mechanisms. Understanding how juxtamembrane variants, such as exon 14 skipping (METΔEx14), and rare kinase domain mutations can increase signaling, often leading to cancer, remains a challenge. Here, we perform a parallel deep mutational scan (DMS) of the MET intracellular kinase domain in two fusion protein backgrounds: wild type and METΔEx14. Our comparative approach has revealed a critical hydrophobic interaction between a juxtamembrane segment and the kinase αC-helix, pointing to potential differences in regulatory mechanisms between MET and other RTKs. Additionally, we have uncovered a β5 motif that acts as a structural pivot for the kinase domain in MET and other TAM family of kinases. We also describe a number of previously unknown activating mutations, aiding the effort to annotate driver, passenger, and drug resistance mutations in the MET kinase domain.
Affiliation(s)
- Gabriella O. Estevam
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, United States
  - Tetrad Graduate Program, University of California San Francisco, San Francisco, United States
- Edmond M. Linossi
  - Cardiovascular Research Institute, University of California San Francisco, San Francisco, United States
  - Department of Cellular and Molecular Pharmacology, University of California San Francisco, United States
- Christian B. Macdonald
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, United States
- Carla A. Espinoza
  - Tetrad Graduate Program, University of California San Francisco, San Francisco, United States
  - Cardiovascular Research Institute, University of California San Francisco, San Francisco, United States
  - Department of Cellular and Molecular Pharmacology, University of California San Francisco, United States
- Jennifer M. Michaud
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, United States
- Willow Coyote-Maestas
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, United States
  - Quantitative Biosciences Institute, University of California, San Francisco, United States
- Eric A. Collisson
  - Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, United States
  - Department of Medicine/Hematology and Oncology, University of California, San Francisco, United States
- Natalia Jura
  - Cardiovascular Research Institute, University of California San Francisco, San Francisco, United States
  - Department of Cellular and Molecular Pharmacology, University of California San Francisco, United States
  - Quantitative Biosciences Institute, University of California, San Francisco, United States
- James S. Fraser
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, United States
  - Quantitative Biosciences Institute, University of California, San Francisco, United States
8
Rozhoňová H, Martí-Gómez C, McCandlish DM, Payne JL. Robust genetic codes enhance protein evolvability. PLoS Biol 2024; 22:e3002594. [PMID: 38754362 PMCID: PMC11098591 DOI: 10.1371/journal.pbio.3002594] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Accepted: 03/19/2024] [Indexed: 05/18/2024] Open
Abstract
The standard genetic code defines the rules of translation for nearly every life form on Earth. It also determines the amino acid changes accessible via single-nucleotide mutations, thus influencing protein evolvability, the ability of mutation to bring forth adaptive variation in protein function. One of the most striking features of the standard genetic code is its robustness to mutation, yet it remains an open question whether such robustness facilitates or frustrates protein evolvability. To answer this question, we use data from massively parallel sequence-to-function assays to construct and analyze 6 empirical adaptive landscapes under hundreds of thousands of rewired genetic codes, including those of codon compression schemes relevant to protein engineering and synthetic biology. We find that robust genetic codes tend to enhance protein evolvability by rendering smooth adaptive landscapes with few peaks, which are readily accessible from throughout sequence space. However, the standard genetic code is rarely exceptional in this regard, because many alternative codes render smoother landscapes than the standard code. By constructing low-dimensional visualizations of these landscapes, which each comprise more than 16 million mRNA sequences, we show that such alternative codes radically alter the topological features of the network of high-fitness genotypes. Whereas the genetic codes that optimize evolvability depend to some extent on the detailed relationship between amino acid sequence and protein function, we also uncover general design principles for engineering nonstandard genetic codes for enhanced and diminished evolvability, which may facilitate directed protein evolution experiments and the bio-containment of synthetic organisms, respectively.
Affiliation(s)
- Hana Rozhoňová
  - Institute of Integrative Biology, ETH Zürich, Zürich, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Carlos Martí-Gómez
  - Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- David M. McCandlish
  - Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- Joshua L. Payne
  - Institute of Integrative Biology, ETH Zürich, Zürich, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
9
Shukla N, Roelle SM, Snell JC, DelSignore O, Bruchez AM, Matreyek KA. Pseudotyped virus infection of multiplexed ACE2 libraries reveals SARS-CoV-2 variant shifts in receptor usage. PLoS Pathog 2024; 20:e1012044. [PMID: 38768238 PMCID: PMC11142672 DOI: 10.1371/journal.ppat.1012044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2024] [Revised: 05/31/2024] [Accepted: 05/07/2024] [Indexed: 05/22/2024] Open
Abstract
Pairwise compatibility between virus and host proteins can dictate the outcome of infection. During transmission, both inter- and intraspecies variabilities in receptor protein sequences can impact cell susceptibility. Many viruses possess mutable viral entry proteins and the patterns of host compatibility can shift as the viral protein sequence changes. This combinatorial sequence space between virus and host is poorly understood, as traditional experimental approaches lack the throughput to simultaneously test all possible combinations of protein sequences. Here, we created a pseudotyped virus infection assay where a multiplexed target-cell library of host receptor variants can be assayed simultaneously using a DNA barcode sequencing readout. We applied this assay to test a panel of 30 ACE2 orthologs or human sequence mutants for infectability by the original SARS-CoV-2 spike protein or the Alpha, Beta, Gamma, Delta, and Omicron BA1 variant spikes. We compared these results to an analysis of the structural shifts that occurred for each variant spike's interface with human ACE2. Mutated residues were directly involved in the largest shifts, although there were also widespread indirect effects altering interface structure. The N501Y substitution in spike conferred a large structural shift for interaction with ACE2, which was partially recreated by indirect distal substitutions in Delta, which does not harbor N501Y. The structural shifts from N501Y greatly influenced the set of animal orthologs the variant spike was capable of interacting with. Out of the thirteen non-human orthologs, ten exhibited unique patterns of variant-specific compatibility, demonstrating that spike sequence changes during human transmission can toggle ACE2 compatibility and potential susceptibility of other animal species, and cumulatively increase overall compatibilities as new variants emerge. These experiments provide a blueprint for similar large-scale assessments of protein compatibility during entry by diverse viruses. This dataset demonstrates the complex compatibility relationships that occur between variable interacting host and virus proteins.
Affiliation(s)
- Nidhi Shukla
  - Department of Pathology, Case Western Reserve University School of Medicine, Cleveland, Ohio, United States of America
- Sarah M. Roelle
  - Department of Pathology, Case Western Reserve University School of Medicine, Cleveland, Ohio, United States of America
- John C. Snell
  - Department of Pathology, Case Western Reserve University School of Medicine, Cleveland, Ohio, United States of America
- Olivia DelSignore
  - Department of Pathology, Case Western Reserve University School of Medicine, Cleveland, Ohio, United States of America
- Anna M. Bruchez
  - Department of Pathology, Case Western Reserve University School of Medicine, Cleveland, Ohio, United States of America
- Kenneth A. Matreyek
  - Department of Pathology, Case Western Reserve University School of Medicine, Cleveland, Ohio, United States of America
10
Hoskins I, Rao S, Tante C, Cenik C. Integrated multiplexed assays of variant effect reveal determinants of catechol-O-methyltransferase gene expression. Mol Syst Biol 2024; 20:481-505. [PMID: 38355921 PMCID: PMC11066095 DOI: 10.1038/s44320-024-00018-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2023] [Revised: 01/16/2024] [Accepted: 01/18/2024] [Indexed: 02/16/2024] Open
Abstract
Multiplexed assays of variant effect are powerful methods to profile the consequences of rare variants on gene expression and organismal fitness. Yet, few studies have integrated several multiplexed assays to map variant effects on gene expression in coding sequences. Here, we pioneered a multiplexed assay based on polysome profiling to measure variant effects on translation at scale, uncovering single-nucleotide variants that increase or decrease ribosome load. By combining high-throughput ribosome load data with multiplexed mRNA and protein abundance readouts, we mapped the cis-regulatory landscape of thousands of catechol-O-methyltransferase (COMT) variants from RNA to protein and found numerous coding variants that alter COMT expression. Finally, we trained machine learning models to map signatures of variant effects on COMT gene expression and uncovered both directional and divergent impacts across expression layers. Our analyses reveal expression phenotypes for thousands of variants in COMT and highlight variant effects on both single and multiple layers of expression. Our findings prompt future studies that integrate several multiplexed assays for the readout of gene expression.
Affiliation(s)
- Ian Hoskins
  - Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, 78712, USA
- Shilpa Rao
  - Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, 78712, USA
- Charisma Tante
  - Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, 78712, USA
- Can Cenik
  - Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, 78712, USA
11
Claussnitzer M, Parikh VN, Wagner AH, Arbesfeld JA, Bult CJ, Firth HV, Muffley LA, Nguyen Ba AN, Riehle K, Roth FP, Tabet D, Bolognesi B, Glazer AM, Rubin AF. Minimum information and guidelines for reporting a multiplexed assay of variant effect. Genome Biol 2024; 25:100. [PMID: 38641812 PMCID: PMC11027375 DOI: 10.1186/s13059-024-03223-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2023] [Accepted: 03/25/2024] [Indexed: 04/21/2024] Open
Abstract
Multiplexed assays of variant effect (MAVEs) have emerged as a powerful approach for interrogating thousands of genetic variants in a single experiment. The flexibility and widespread adoption of these techniques across diverse disciplines have led to a heterogeneous mix of data formats and descriptions, which complicates the downstream use of the resulting datasets. To address these issues and promote reproducibility and reuse of MAVE data, we define a set of minimum information standards for MAVE data and metadata and outline a controlled vocabulary aligned with established biomedical ontologies for describing these experimental designs.
Affiliation(s)
- Melina Claussnitzer
  - The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, Harvard Medical School, Cambridge, MA, 02142, USA
- Victoria N Parikh
  - Stanford Center for Inherited Cardiovascular Disease, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Alex H Wagner
  - The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, 43215, USA
  - Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH, 43210, USA
- Jeremy A Arbesfeld
  - The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, 43215, USA
  - Department of Biomedical Informatics, The Ohio State University, Columbus, OH, 43210, USA
- Carol J Bult
  - The Jackson Laboratory, Bar Harbor, ME, 04609, USA
- Helen V Firth
  - Wellcome Sanger Institute, Hinxton, Cambridge, UK
  - Dept of Medical Genetics, Cambridge University Hospitals NHS Trust, Cambridge, UK
- Lara A Muffley
  - Department of Genome Sciences, University of Washington, Seattle, WA, 98105, USA
- Alex N Nguyen Ba
  - Department of Biology, University of Toronto at Mississauga, Mississauga, ON, Canada
- Kevin Riehle
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Frederick P Roth
  - Donnelly Centre, University of Toronto, Toronto, ON, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
- Daniel Tabet
  - Donnelly Centre, University of Toronto, Toronto, ON, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, Canada
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
- Benedetta Bolognesi
  - Institute for Bioengineering of Catalunya (IBEC), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Andrew M Glazer
  - Vanderbilt University Medical Center, Nashville, TN, 37232, USA
| | - Alan F Rubin
- Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia.
- Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia.
|
12
|
Howard MK, Hoppe N, Huang XP, Macdonald CB, Mehrota E, Grimes PR, Zahm A, Trinidad DD, English J, Coyote-Maestas W, Manglik A. Molecular basis of proton-sensing by G protein-coupled receptors. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.04.17.590000. [PMID: 38659943 PMCID: PMC11042331 DOI: 10.1101/2024.04.17.590000] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/26/2024]
Abstract
Three proton-sensing G protein-coupled receptors (GPCRs), GPR4, GPR65, and GPR68, respond to changes in extracellular pH to regulate diverse physiology and are implicated in a wide range of diseases. A central challenge in determining how protons activate these receptors is identifying the set of residues that bind protons. Here, we determine structures of each receptor to understand the spatial arrangement of putative proton-sensing residues in the active state. With a newly developed deep mutational scanning approach, we determined the functional importance of every residue in proton activation for GPR68 by generating ~9,500 mutants and measuring effects on signaling and surface expression. This unbiased screen revealed that, unlike other proton-sensitive cell surface channels and receptors, no single site is critical for proton recognition in GPR68. Instead, a network of titratable residues extends from the extracellular surface to the transmembrane region and converges on canonical class A GPCR activation motifs to activate proton-sensing GPCRs. More broadly, our approach integrating structure and unbiased functional interrogation defines a new framework for understanding the rich complexity of GPCR signaling.
Affiliation(s)
- Matthew K. Howard
- Tetrad graduate program, University of California, San Francisco, CA, USA
- Department of Pharmaceutical Chemistry, University of California, San Francisco, CA, USA
- Department of Bioengineering and Therapeutic Science, University of California, San Francisco, CA, USA
| | - Nicholas Hoppe
- Department of Pharmaceutical Chemistry, University of California, San Francisco, CA, USA
- Biophysics graduate program, University of California, San Francisco, CA, USA
| | - Xi-Ping Huang
- Department of Pharmacology and the National Institute of Mental Health Psychoactive Drug Screening Program (NIMH PDSP), The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - Christian B. Macdonald
- Department of Bioengineering and Therapeutic Science, University of California, San Francisco, CA, USA
| | - Eshan Mehrota
- Tetrad graduate program, University of California, San Francisco, CA, USA
- Department of Pharmaceutical Chemistry, University of California, San Francisco, CA, USA
- Medical Scientist Training Program, University of California, San Francisco, CA, USA
| | | | - Adam Zahm
- Department of Biochemistry, University of Utah School of Medicine, Salt Lake City, UT, USA
| | - Donovan D. Trinidad
- Department of Medicine, Division of Infectious Disease, University of California, San Francisco, United States
| | - Justin English
- Department of Biochemistry, University of Utah School of Medicine, Salt Lake City, UT, USA
| | - Willow Coyote-Maestas
- Department of Bioengineering and Therapeutic Science, University of California, San Francisco, CA, USA
- Chan Zuckerberg Biohub, San Francisco, CA, USA
- Quantitative Biosciences Institute, University of California, San Francisco, USA
| | - Aashish Manglik
- Department of Pharmaceutical Chemistry, University of California, San Francisco, CA, USA
- Chan Zuckerberg Biohub, San Francisco, CA, USA
- Quantitative Biosciences Institute, University of California, San Francisco, USA
- Department of Anesthesia and Perioperative Care, University of California, San Francisco, CA, USA
|
13
|
Sundar V, Tu B, Guan L, Esvelt K. FLIGHTED: Inferring Fitness Landscapes from Noisy High-Throughput Experimental Data. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.03.26.586797. [PMID: 38586054 PMCID: PMC10996587 DOI: 10.1101/2024.03.26.586797] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/09/2024]
Abstract
Machine learning (ML) for protein design requires large protein fitness datasets generated by high-throughput experiments for training, fine-tuning, and benchmarking models. However, most models do not account for experimental noise inherent in these datasets, harming model performance and changing model rankings in benchmarking studies. Here, we develop FLIGHTED, a Bayesian method for generating fitness landscapes with calibrated errors from noisy high-throughput experimental data. We apply FLIGHTED to single-step selection assays such as phage display and to a novel high-throughput assay DHARMA that ties fitness to base editing activity. Our results show that FLIGHTED robustly generates fitness landscapes with accurate errors. We demonstrate that FLIGHTED improves model performance and enables the generation of protein fitness datasets of up to 10^6 variants with DHARMA. FLIGHTED can be used on any high-throughput assay and makes it easy for ML scientists to account for experimental noise when modeling protein fitness.
Affiliation(s)
- Vikram Sundar
- Computational and Systems Biology Program, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, 02139, MA, USA
| | - Boqiang Tu
- Department of Biological Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, 02139, MA, USA
| | - Lindsey Guan
- Computational and Systems Biology Program, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, 02139, MA, USA
| | - Kevin Esvelt
- Media Lab, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, 02139, MA, USA
|
14
|
Gelman S, Johnson B, Freschlin C, D'Costa S, Gitter A, Romero PA. Biophysics-based protein language models for protein engineering. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.03.15.585128. [PMID: 38559182 PMCID: PMC10980077 DOI: 10.1101/2024.03.15.585128] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
Protein language models trained on evolutionary data have emerged as powerful tools for predictive problems involving protein sequence, structure, and function. However, these models overlook decades of research into biophysical factors governing protein function. We propose Mutational Effect Transfer Learning (METL), a protein language model framework that unites advanced machine learning and biophysical modeling. Using the METL framework, we pretrain transformer-based neural networks on biophysical simulation data to capture fundamental relationships between protein sequence, structure, and energetics. We finetune METL on experimental sequence-function data to harness these biophysical signals and apply them when predicting protein properties like thermostability, catalytic activity, and fluorescence. METL excels in challenging protein engineering tasks like generalizing from small training sets and position extrapolation, although existing methods that train on evolutionary signals remain powerful for many types of experimental assays. We demonstrate METL's ability to design functional green fluorescent protein variants when trained on only 64 examples, showcasing the potential of biophysics-based protein language models for protein engineering.
Affiliation(s)
- Sam Gelman
- Department of Computer Sciences, University of Wisconsin-Madison
- Morgridge Institute for Research
| | - Bryce Johnson
- Department of Computer Sciences, University of Wisconsin-Madison
- Morgridge Institute for Research
| | | | - Sameer D'Costa
- Department of Biochemistry, University of Wisconsin-Madison
| | - Anthony Gitter
- Department of Computer Sciences, University of Wisconsin-Madison
- Morgridge Institute for Research
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison
|
15
|
Case M, Smith M, Vinh J, Thurber G. Machine learning to predict continuous protein properties from binary cell sorting data and map unseen sequence space. Proc Natl Acad Sci U S A 2024; 121:e2311726121. [PMID: 38451939 PMCID: PMC10945751 DOI: 10.1073/pnas.2311726121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2023] [Accepted: 12/27/2023] [Indexed: 03/09/2024] Open
Abstract
Proteins are a diverse class of biomolecules responsible for wide-ranging cellular functions, from catalyzing reactions to recognizing pathogens. The ability to evolve proteins rapidly and inexpensively toward improved properties is a common objective for protein engineers. Powerful high-throughput methods like fluorescence-activated cell sorting and next-generation sequencing have dramatically improved directed evolution experiments. However, it is unclear how to best leverage these data to characterize protein fitness landscapes more completely and identify lead candidates. In this work, we develop a simple yet powerful framework to improve protein optimization by predicting continuous protein properties from simple directed evolution experiments using interpretable, linear machine learning models. Importantly, we find that these models, which use data from simple but imprecise experimental estimates of protein fitness, have predictive capabilities that approach more precise but expensive data. Evaluated across five diverse protein engineering tasks, continuous properties are consistently predicted from readily available deep sequencing data, demonstrating that protein fitness space can be reasonably well modeled by linear relationships among sequence mutations. To prospectively test the utility of this approach, we generated a library of stapled peptides and applied the framework to predict affinity and specificity from simple cell sorting data. We then coupled integer linear programming, a method to optimize protein fitness from linear weights, with mutation scores from machine learning to identify variants in unseen sequence space that have improved and co-optimal properties. This approach represents a versatile tool for improved analysis and identification of protein variants across many domains of protein engineering.
Affiliation(s)
- Marshall Case
- Chemical Engineering, University of Michigan, Ann Arbor, MI 48109
| | - Matthew Smith
- Chemical Engineering, University of Michigan, Ann Arbor, MI 48109
- Biointerfaces Institute, University of Michigan, Ann Arbor, MI 48109
| | - Jordan Vinh
- Biomedical Engineering, University of Michigan, Ann Arbor, MI 48109
| | - Greg Thurber
- Chemical Engineering, University of Michigan, Ann Arbor, MI 48109
- Biomedical Engineering, University of Michigan, Ann Arbor, MI 48109
|
16
|
Chakraborty S, Ahler E, Simon JJ, Fang L, Potter ZE, Sitko KA, Stephany JJ, Guttman M, Fowler DM, Maly DJ. Profiling of drug resistance in Src kinase at scale uncovers a regulatory network coupling autoinhibition and catalytic domain dynamics. Cell Chem Biol 2024; 31:207-220.e11. [PMID: 37683649 PMCID: PMC10902203 DOI: 10.1016/j.chembiol.2023.08.005] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 03/19/2023] [Revised: 07/03/2023] [Accepted: 08/16/2023] [Indexed: 09/10/2023]
Abstract
Kinase inhibitors are effective cancer therapies, but resistance often limits clinical efficacy. Despite the cataloging of numerous resistance mutations, our understanding of kinase inhibitor resistance is still incomplete. Here, we comprehensively profiled the resistance of ∼3,500 Src tyrosine kinase mutants to four different ATP-competitive inhibitors. We found that ATP-competitive inhibitor resistance mutations are distributed throughout Src's catalytic domain. In addition to inhibitor contact residues, residues that participate in regulating Src's phosphotransferase activity were prone to the development of resistance. Unexpectedly, we found that a resistance-prone cluster of residues located on the top face of the N-terminal lobe of Src's catalytic domain contributes to autoinhibition by reducing catalytic domain dynamics, and mutations in this cluster led to resistance by lowering inhibitor affinity and promoting kinase hyperactivation. Together, our studies demonstrate how drug resistance profiling can be used to define potential resistance pathways and uncover new mechanisms of kinase regulation.
Affiliation(s)
- Sujata Chakraborty
- Department of Chemistry, University of Washington, Seattle, WA 98195, USA
| | - Ethan Ahler
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA; Molecular and Cellular Biology, University of Washington, Seattle, WA 98195, USA
| | - Jessica J Simon
- Department of Chemistry, University of Washington, Seattle, WA 98195, USA
| | - Linglan Fang
- Department of Chemistry, University of Washington, Seattle, WA 98195, USA
| | - Zachary E Potter
- Department of Chemistry, University of Washington, Seattle, WA 98195, USA
| | - Katherine A Sitko
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
| | - Jason J Stephany
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA
| | - Miklos Guttman
- Department of Medicinal Chemistry, University of Washington, Seattle, WA 98195, USA
| | - Douglas M Fowler
- Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA; Department of Bioengineering, University of Washington, Seattle, WA 98195, USA.
| | - Dustin J Maly
- Department of Chemistry, University of Washington, Seattle, WA 98195, USA; Department of Biochemistry, University of Washington, Seattle, WA 98195, USA.
|
17
|
Shukla N, Roelle SM, Snell JC, DelSignore O, Bruchez AM, Matreyek KA. Pseudotyped virus infection of multiplexed ACE2 libraries reveals SARS-CoV-2 variant shifts in receptor usage. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.02.13.580056. [PMID: 38405739 PMCID: PMC10888787 DOI: 10.1101/2024.02.13.580056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/27/2024]
Abstract
Pairwise compatibility between virus and host proteins can dictate the outcome of infection. During transmission, both inter- and intraspecies variabilities in receptor protein sequences can impact cell susceptibility. Many viruses possess mutable viral entry proteins and the patterns of host compatibility can shift as the viral protein sequence changes. This combinatorial sequence space between virus and host is poorly understood, as traditional experimental approaches lack the throughput to simultaneously test all possible combinations of protein sequences. Here, we created a pseudotyped virus infection assay where a multiplexed target-cell library of host receptor variants can be assayed simultaneously using a DNA barcode sequencing readout. We applied this assay to test a panel of 30 ACE2 orthologs or human sequence mutants for infectability by the original SARS-CoV-2 spike protein or the Alpha, Beta, Gamma, Delta, and Omicron BA1 variant spikes. We compared these results to an analysis of the structural shifts that occurred for each variant spike's interface with human ACE2. Mutated residues were directly involved in the largest shifts, although there were also widespread indirect effects altering interface structure. The N501Y substitution in spike conferred a large structural shift for interaction with ACE2, which was partially recreated by indirect distal substitutions in Delta, which does not harbor N501Y. The structural shifts from N501Y greatly influenced the set of animal orthologs the variant spike was capable of interacting with. Out of the thirteen non-human orthologs, ten exhibited unique patterns of variant-specific compatibility, demonstrating that spike sequence changes during human transmission can toggle ACE2 compatibility and potential susceptibility of other animal species, and cumulatively increase overall compatibilities as new variants emerge. 
These experiments provide a blueprint for similar large-scale assessments of protein compatibility during entry by diverse viruses. This dataset demonstrates the complex compatibility relationships that occur between variable interacting host and virus proteins.
Affiliation(s)
- Nidhi Shukla
- Department of Pathology, Case Western Reserve University School of Medicine, Cleveland, Ohio, USA
| | - Sarah M Roelle
- Department of Pathology, Case Western Reserve University School of Medicine, Cleveland, Ohio, USA
| | - John C Snell
- Department of Pathology, Case Western Reserve University School of Medicine, Cleveland, Ohio, USA
| | - Olivia DelSignore
- Department of Pathology, Case Western Reserve University School of Medicine, Cleveland, Ohio, USA
| | - Anna M Bruchez
- Department of Pathology, Case Western Reserve University School of Medicine, Cleveland, Ohio, USA
| | - Kenneth A Matreyek
- Department of Pathology, Case Western Reserve University School of Medicine, Cleveland, Ohio, USA
|
18
|
Sesta L, Pagnani A, Fernandez-de-Cossio-Diaz J, Uguzzoni G. Inference of annealed protein fitness landscapes with AnnealDCA. PLoS Comput Biol 2024; 20:e1011812. [PMID: 38377054 PMCID: PMC10878520 DOI: 10.1371/journal.pcbi.1011812] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2023] [Accepted: 01/08/2024] [Indexed: 02/22/2024] Open
Abstract
The design of proteins with specific tasks is a major challenge in molecular biology with important diagnostic and therapeutic applications. High-throughput screening methods have been developed to systematically evaluate protein activity, but only a small fraction of possible protein variants can be tested using these techniques. Computational models that explore the sequence space in-silico to identify the fittest molecules for a given function are needed to overcome this limitation. In this article, we propose AnnealDCA, a machine-learning framework to learn the protein fitness landscape from sequencing data derived from a broad range of experiments that use selection and sequencing to quantify protein activity. We demonstrate the effectiveness of our method by applying it to antibody Rep-Seq data of immunized mice and screening experiments, assessing the quality of the fitness landscape reconstructions. Our method can be applied to several experimental cases where a population of protein variants undergoes various rounds of selection and sequencing, without relying on the computation of variants enrichment ratios, and thus can be used even in cases of disjoint sequence samples.
Affiliation(s)
- Luca Sesta
- Department of Applied Science and Technology, Politecnico di Torino, Torino, Italy
| | - Andrea Pagnani
- Department of Applied Science and Technology, Politecnico di Torino, Torino, Italy
- Italian Institute for Genomic Medicine, Torino, Italy
- INFN, Sezione di Torino, Torino, Italy
|
19
|
Fannjiang C, Listgarten J. Is Novelty Predictable? Cold Spring Harb Perspect Biol 2024; 16:a041469. [PMID: 38052497 PMCID: PMC10835614 DOI: 10.1101/cshperspect.a041469] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/07/2023]
Abstract
Machine learning-based design has gained traction in the sciences, most notably in the design of small molecules, materials, and proteins, with societal applications ranging from drug development and plastic degradation to carbon sequestration. When designing objects to achieve novel property values with machine learning, one faces a fundamental challenge: how to push past the frontier of current knowledge, distilled from the training data into the model, in a manner that rationally controls the risk of failure. If one trusts learned models too much in extrapolation, one is likely to design rubbish. In contrast, if one does not extrapolate, one cannot find novelty. Herein, we ponder how one might strike a useful balance between these two extremes. We focus in particular on designing proteins with novel property values, although much of our discussion is relevant to machine learning-based design more broadly.
Affiliation(s)
- Clara Fannjiang
- Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, California 94720, USA
| | - Jennifer Listgarten
- Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, California 94720, USA
|
20
|
Hong Z, Barton JP. popDMS infers mutation effects from deep mutational scanning data. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.01.29.577759. [PMID: 38352383 PMCID: PMC10862717 DOI: 10.1101/2024.01.29.577759] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/21/2024]
Abstract
Deep mutational scanning (DMS) experiments provide a powerful method to measure the functional effects of genetic mutations at massive scales. However, the data generated from these experiments can be difficult to analyze, with significant variation between experimental replicates. To overcome this challenge, we developed popDMS, a computational method based on population genetics theory, to infer the functional effects of mutations from DMS data. Through extensive tests, we found that the functional effects of single mutations and epistasis inferred by popDMS are highly consistent across replicates, comparing favorably with existing methods. Our approach is flexible and can be widely applied to DMS data that includes multiple time points, multiple replicates, and different experimental conditions.
Affiliation(s)
- Zhenchen Hong
- Department of Physics and Astronomy, University of California, Riverside, USA
| | - John P. Barton
- Department of Physics and Astronomy, University of California, Riverside, USA
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, USA
- Department of Physics and Astronomy, University of Pittsburgh, USA
|
21
|
Yang DD, Rusch LM, Widney KA, Morgenthaler AB, Copley SD. Synonymous edits in the Escherichia coli genome have substantial and condition-dependent effects on fitness. Proc Natl Acad Sci U S A 2024; 121:e2316834121. [PMID: 38252823 PMCID: PMC10835057 DOI: 10.1073/pnas.2316834121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2023] [Accepted: 12/19/2023] [Indexed: 01/24/2024] Open
Abstract
CRISPR-Cas-based genome editing is widely used in bacteria at scales ranging from construction of individual mutants to massively parallel libraries. This procedure relies on guide RNA-directed cleavage of the genome followed by repair with a template that introduces a desired mutation along with synonymous "immunizing" mutations to prevent re-cleavage of the genome after editing. Because the immunizing mutations do not change the protein sequence, they are often assumed to be neutral. However, synonymous mutations can change mRNA structures in ways that alter levels of the encoded proteins. We have tested the assumption that immunizing mutations are neutral by constructing a library of over 50,000 edits that consist of only synonymous mutations in Escherichia coli. Thousands of edits had substantial effects on fitness during growth of E. coli on acetate, a poor carbon source that is toxic at high concentrations. The percentage of high-impact edits varied considerably between genes and at different positions within genes. We reconstructed clones with high-impact edits and found that 69% indeed had significant effects on growth in acetate. Interestingly, fewer edits affected fitness during growth in glucose, a preferred carbon source, suggesting that changes in protein expression caused by synonymous mutations may be most important when an organism encounters challenging conditions. Finally, we showed that synonymous edits can have widespread effects; a synonymous edit at the 5' end of ptsI altered expression of hundreds of genes. Our results suggest that the synonymous immunizing edits introduced during CRISPR-Cas-based genome editing should not be assumed to be innocuous.
Affiliation(s)
- Dong-Dong Yang
- Department of Molecular, Cellular and Developmental Biology and the Cooperative Institute for Research in Environmental Sciences, University of Colorado Boulder, Boulder, CO 80309
| | - Leo M Rusch
- Department of Molecular, Cellular and Developmental Biology and the Cooperative Institute for Research in Environmental Sciences, University of Colorado Boulder, Boulder, CO 80309
| | - Karl A Widney
- Department of Molecular, Cellular and Developmental Biology and the Cooperative Institute for Research in Environmental Sciences, University of Colorado Boulder, Boulder, CO 80309
| | - Andrew B Morgenthaler
- Department of Molecular, Cellular and Developmental Biology and the Cooperative Institute for Research in Environmental Sciences, University of Colorado Boulder, Boulder, CO 80309
- Amyris, Inc., Emeryville, CA 94608
| | - Shelley D Copley
- Department of Molecular, Cellular and Developmental Biology and the Cooperative Institute for Research in Environmental Sciences, University of Colorado Boulder, Boulder, CO 80309
|
22
|
Kamath ND, Matreyek KA. Multiplex Functional Characterization of Protein Variant Libraries in Mammalian Cells with Single-Copy Genomic Integration and High-Throughput DNA Sequencing. Methods Mol Biol 2024; 2774:135-152. [PMID: 38441763 DOI: 10.1007/978-1-0716-3718-0_10] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/07/2024]
Abstract
Sequencing-based, massively parallel genetic assays have enabled simultaneous characterization of the genotype-phenotype relationships for libraries encoding thousands of unique protein variants. Since plasmid transfection and lentiviral transduction have characteristics that limit multiplexing with pooled libraries, we developed a mammalian synthetic biology platform that harnesses the Bxb1 bacteriophage DNA recombinase to insert single promoterless plasmids encoding a transgene of interest into a pre-engineered "landing pad" site within the cell genome. The transgene is expressed behind a genomically integrated promoter, ensuring only one transgene is expressed per cell, preserving a strict genotype-phenotype link. Upon selecting cells based on a desired phenotype, the transgene can be sequenced to ascribe each variant a phenotypic score. We describe how to create and utilize landing pad cells for large-scale, library-based genetic experiments. Using the provided examples, the experimental template can be adapted to explore protein variants in diverse biological problems within mammalian cells.
Affiliation(s)
- Nisha D Kamath
- Department of Pathology, Case Western Reserve University School of Medicine, Cleveland, OH, USA
| | - Kenneth A Matreyek
- Department of Pathology, Case Western Reserve University School of Medicine, Cleveland, OH, USA.
|
23
|
Nemoto T, Ocari T, Planul A, Tekinsoy M, Zin EA, Dalkara D, Ferrari U. ACIDES: on-line monitoring of forward genetic screens for protein engineering. Nat Commun 2023; 14:8504. [PMID: 38148337 PMCID: PMC10751290 DOI: 10.1038/s41467-023-43967-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2023] [Accepted: 11/24/2023] [Indexed: 12/28/2023] Open
Abstract
Forward genetic screens of mutated variants are a versatile strategy for protein engineering and investigation, which has been successfully applied to various studies like directed evolution (DE) and deep mutational scanning (DMS). While next-generation sequencing can track millions of variants during the screening rounds, the vast and noisy nature of the sequencing data impedes the estimation of the performance of individual variants. Here, we propose ACIDES that combines statistical inference and in-silico simulations to improve performance estimation in the library selection process by attributing accurate statistical scores to individual variants. We tested ACIDES first on a random-peptide-insertion experiment and then on multiple public datasets from DE and DMS studies. ACIDES allows experimentalists to reliably estimate variant performance on the fly and can aid protein engineering and research pipelines in a range of applications, including gene therapy.
Affiliation(s)
- Takahiro Nemoto
  - Institut de la Vision, Sorbonne Université, INSERM, CNRS, 17 rue Moreau, 75012, Paris, France
  - Graduate School of Informatics, Kyoto University, Yoshida Hon-machi, Sakyo-ku, Kyoto, 606-8501, Japan
  - Premium Research Institute for Human Metaverse Medicine (WPI-PRIMe), Osaka University, Suita, Osaka, 565-0871, Japan
- Tommaso Ocari
  - Institut de la Vision, Sorbonne Université, INSERM, CNRS, 17 rue Moreau, 75012, Paris, France
- Arthur Planul
  - Institut de la Vision, Sorbonne Université, INSERM, CNRS, 17 rue Moreau, 75012, Paris, France
- Muge Tekinsoy
  - Institut de la Vision, Sorbonne Université, INSERM, CNRS, 17 rue Moreau, 75012, Paris, France
- Emilia A Zin
  - Institut de la Vision, Sorbonne Université, INSERM, CNRS, 17 rue Moreau, 75012, Paris, France
- Deniz Dalkara
  - Institut de la Vision, Sorbonne Université, INSERM, CNRS, 17 rue Moreau, 75012, Paris, France
- Ulisse Ferrari
  - Institut de la Vision, Sorbonne Université, INSERM, CNRS, 17 rue Moreau, 75012, Paris, France

24
Hoffmann MD, Zdechlik AC, He Y, Nedrud D, Aslanidi G, Gordon W, Schmidt D. Multiparametric domain insertional profiling of adeno-associated virus VP1. Mol Ther Methods Clin Dev 2023; 31:101143. [PMID: 38027057 PMCID: PMC10661864 DOI: 10.1016/j.omtm.2023.101143] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2023] [Accepted: 10/21/2023] [Indexed: 12/01/2023]
Abstract
Several evolved properties of adeno-associated virus (AAV), such as broad tropism and immunogenicity in humans, are barriers to AAV-based gene therapy. Most efforts to re-engineer these properties have focused on variable regions near AAV's 3-fold protrusions and capsid protein termini. To comprehensively survey AAV capsids for engineerable hotspots, we determined multiple AAV fitness phenotypes upon insertion of six structured protein domains into the entire AAV-DJ capsid protein VP1. This is the largest and most comprehensive AAV domain insertion dataset to date. Our data revealed a surprising robustness of AAV capsids to accommodate large domain insertions. Insertion permissibility depended strongly on insertion position, domain type, and measured fitness phenotype, which clustered into contiguous structural units that we could link to distinct roles in AAV assembly, stability, and infectivity. We also identified engineerable hotspots of AAV that facilitate the covalent attachment of binding scaffolds, which may represent an alternative approach to re-direct AAV tropism.
Affiliation(s)
- Mareike D. Hoffmann
  - Department of Genetics, Cell Biology & Development, University of Minnesota, Minneapolis, MN 55455, USA
- Alina C. Zdechlik
  - Department of Biochemistry, Molecular Biology & Biophysics, University of Minnesota, Minneapolis, MN 55455, USA
- Yungui He
  - Department of Genetics, Cell Biology & Development, University of Minnesota, Minneapolis, MN 55455, USA
- David Nedrud
  - Department of Biochemistry, Molecular Biology & Biophysics, University of Minnesota, Minneapolis, MN 55455, USA
- Wendy Gordon
  - Department of Biochemistry, Molecular Biology & Biophysics, University of Minnesota, Minneapolis, MN 55455, USA
- Daniel Schmidt
  - Department of Genetics, Cell Biology & Development, University of Minnesota, Minneapolis, MN 55455, USA

25
Maes S, Deploey N, Peelman F, Eyckerman S. Deep mutational scanning of proteins in mammalian cells. Cell Reports Methods 2023; 3:100641. [PMID: 37963462 PMCID: PMC10694495 DOI: 10.1016/j.crmeth.2023.100641] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2023] [Revised: 07/06/2023] [Accepted: 10/20/2023] [Indexed: 11/16/2023]
Abstract
Protein mutagenesis is essential for unveiling the molecular mechanisms underlying protein function in health, disease, and evolution. In the past decade, deep mutational scanning methods have evolved to support the functional analysis of nearly all possible single-amino acid changes in a protein of interest. While historically these methods were developed in lower organisms such as E. coli and yeast, recent technological advancements have resulted in the increased use of mammalian cells, particularly for studying proteins involved in human disease. These advancements will aid significantly in the classification and interpretation of variants of unknown significance, which are being discovered at large scale due to the current surge in the use of whole-genome sequencing in clinical contexts. Here, we explore the experimental aspects of deep mutational scanning studies in mammalian cells and report the different methods used in each step of the workflow, ultimately providing a useful guide toward the design of such studies.
Affiliation(s)
- Stefanie Maes
  - VIB Center for Medical Biotechnology (CMB), Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium; Department of Biochemistry and Microbiology, Ghent University, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium; Department of Biomolecular Medicine, Ghent University, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
- Nick Deploey
  - VIB Center for Medical Biotechnology (CMB), Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium; Department of Biomolecular Medicine, Ghent University, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
- Frank Peelman
  - VIB Center for Medical Biotechnology (CMB), Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium; Department of Biomolecular Medicine, Ghent University, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
- Sven Eyckerman
  - VIB Center for Medical Biotechnology (CMB), Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium; Department of Biomolecular Medicine, Ghent University, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium

26
Hoskins I, Rao S, Tante C, Cenik C. Integrated multiplexed assays of variant effect reveal cis-regulatory determinants of catechol-O-methyltransferase gene expression. bioRxiv 2023:2023.08.02.551517. [PMID: 38014045 PMCID: PMC10680568 DOI: 10.1101/2023.08.02.551517] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2023]
Abstract
Multiplexed assays of variant effect are powerful methods to profile the consequences of rare variants on gene expression and organismal fitness. Yet, few studies have integrated several multiplexed assays to map variant effects on gene expression in coding sequences. Here, we pioneered a multiplexed assay based on polysome profiling to measure variant effects on translation at scale, uncovering single-nucleotide variants that increase and decrease ribosome load. By combining high-throughput ribosome load data with multiplexed mRNA and protein abundance readouts, we mapped the cis-regulatory landscape of thousands of catechol-O-methyltransferase (COMT) variants from RNA to protein and found numerous coding variants that alter COMT expression. Finally, we trained machine learning models to map signatures of variant effects on COMT gene expression and uncovered both directional and divergent impacts across expression layers. Our analyses reveal expression phenotypes for thousands of variants in COMT and highlight variant effects on both single and multiple layers of expression. Our findings prompt future studies that integrate several multiplexed assays for the readout of gene expression.
Affiliation(s)
- Ian Hoskins
  - Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78712, USA
- Shilpa Rao
  - Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78712, USA
- Charisma Tante
  - Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78712, USA
- Can Cenik
  - Department of Molecular Biosciences, University of Texas at Austin, Austin, TX 78712, USA

27
Nagy G, Diabate M, Banerjee T, Adamovich AI, Smith N, Jeon H, Dhar S, Liu W, Burgess K, Chung D, Starita LM, Parvin JD. Multiplexed assay of variant effect reveals residues of functional importance in the BRCA1 coiled-coil and serine cluster domains. PLoS One 2023; 18:e0293422. [PMID: 37917606 PMCID: PMC10621863 DOI: 10.1371/journal.pone.0293422] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2023] [Accepted: 10/12/2023] [Indexed: 11/04/2023] Open
Abstract
Delineating functionally normal variants from functionally abnormal variants in tumor suppressor proteins is critical for cancer surveillance, prognosis, and treatment options. BRCA1 is a protein that has many variants of uncertain significance which are not yet classified as functionally normal or abnormal. In vitro functional assays can be used to identify the functional impact of a variant when the variant has not yet been categorized through clinical observation. Here we employ a homology-directed repair (HDR) reporter assay to evaluate over 300 missense and nonsense BRCA1 variants between amino acid residues 1280 and 1576, which encompasses the coiled-coil and serine cluster domains. Functionally abnormal variants tended to cluster in residues known to interact with PALB2, which is critical for homology-directed repair. Multiplexed results were confirmed by singleton assay and by ClinVar database variant interpretations. Comparison of multiplexed results to designated benign or likely benign or pathogenic or likely pathogenic variants in the ClinVar database yielded 100% specificity and 100% sensitivity of the multiplexed assay. Clinicians can reference the results of this functional assay for help in guiding cancer treatment and surveillance options. These results are the first to evaluate this domain of BRCA1 using a multiplexed approach and indicate the importance of this domain in the DNA repair process.
Affiliation(s)
- Gregory Nagy
  - Department of Biomedical Informatics, The Ohio State University Comprehensive Cancer Center, Ohio State University, Columbus, Ohio, United States of America
- Mariame Diabate
  - Department of Biomedical Informatics, The Ohio State University Comprehensive Cancer Center, Ohio State University, Columbus, Ohio, United States of America
- Tapahsama Banerjee
  - Department of Biomedical Informatics, The Ohio State University Comprehensive Cancer Center, Ohio State University, Columbus, Ohio, United States of America
- Aleksandra I. Adamovich
  - Department of Biomedical Informatics, The Ohio State University Comprehensive Cancer Center, Ohio State University, Columbus, Ohio, United States of America
- Nahum Smith
  - Department of Genome Sciences, University of Washington and Brotman Baty Institute for Precision Medicine, Seattle, Washington, United States of America
- Hyeongseon Jeon
  - Department of Biomedical Informatics, The Ohio State University Comprehensive Cancer Center, Ohio State University, Columbus, Ohio, United States of America
- Shruti Dhar
  - Department of Biomedical Informatics, The Ohio State University Comprehensive Cancer Center, Ohio State University, Columbus, Ohio, United States of America
- Wenfang Liu
  - Department of Biomedical Informatics, The Ohio State University Comprehensive Cancer Center, Ohio State University, Columbus, Ohio, United States of America
- Katherine Burgess
  - Department of Biomedical Informatics, The Ohio State University Comprehensive Cancer Center, Ohio State University, Columbus, Ohio, United States of America
- Dongjun Chung
  - Department of Biomedical Informatics, The Ohio State University Comprehensive Cancer Center, Ohio State University, Columbus, Ohio, United States of America
- Lea M. Starita
  - Department of Genome Sciences, University of Washington and Brotman Baty Institute for Precision Medicine, Seattle, Washington, United States of America
- Jeffrey D. Parvin
  - Department of Biomedical Informatics, The Ohio State University Comprehensive Cancer Center, Ohio State University, Columbus, Ohio, United States of America

28
Choudhury A, Gachet B, Dixit Z, Faure R, Gill RT, Tenaillon O. Deep mutational scanning reveals the molecular determinants of RNA polymerase-mediated adaptation and tradeoffs. Nat Commun 2023; 14:6319. [PMID: 37813857 PMCID: PMC10562459 DOI: 10.1038/s41467-023-41882-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2023] [Accepted: 09/21/2023] [Indexed: 10/11/2023] Open
Abstract
RNA polymerase (RNAP) is emblematic of complex biological systems that control multiple traits involving trade-offs such as growth versus maintenance. Laboratory evolution has revealed that mutations in RNAP subunits, including RpoB, are frequently selected. However, we lack a systems view of how mutations alter the molecular functions of RNAP to promote adaptation. We therefore measured the fitness of thousands of mutations within a region of rpoB under multiple conditions and genetic backgrounds, and found that adaptive mutations cluster in two modules. Mutations in one module favor growth over maintenance through a partial loss of an interaction associated with faster elongation. Mutations in the other favor maintenance over growth through a destabilized RNAP-DNA complex. The two molecular handles capture the versatile RNAP-mediated adaptations. Combining both interaction losses simultaneously improved maintenance and growth, challenging the idea that the growth-maintenance trade-off results only from limited resources, and revealing how compensatory evolution operates within RNAP.
Affiliation(s)
- Alaksh Choudhury
  - Université de Paris Cité, INSERM, IAME, UMR 1137, 75018, Paris, France
  - Laboratoire Biophysique et Évolution (LBE), UMR Chimie Biologie Innovation 8231, ESPCI Paris, Université PSL, CNRS, 75005, Paris, France
- Benoit Gachet
  - Université de Paris Cité, INSERM, IAME, UMR 1137, 75018, Paris, France
- Zoya Dixit
  - Université de Paris Cité, INSERM, IAME, UMR 1137, 75018, Paris, France
  - Université de Paris Cité, INSERM, CNRS, Institut Cochin, UMR 1016, 75014, Paris, France
- Roland Faure
  - Université de Paris Cité, INSERM, IAME, UMR 1137, 75018, Paris, France
  - Université de Rennes, INRIA RBA, CNRS UMR 6074, Rennes, France
  - Service Evolution Biologique et Ecologie, Université libre de Bruxelles (ULB), 1050, Brussels, Belgium
- Ryan T Gill
  - Renewable and Sustainable Energy Institute (RASEI), University of Colorado-Boulder, Boulder, CO, 80309-0027, USA
  - Novo Nordisk Foundation, Denmark Technical University, 2800 Kgs, Lyngby, Denmark
- Olivier Tenaillon
  - Université de Paris Cité, INSERM, IAME, UMR 1137, 75018, Paris, France
  - Université de Paris Cité, INSERM, CNRS, Institut Cochin, UMR 1016, 75014, Paris, France

29
Busia A, Listgarten J. MBE: model-based enrichment estimation and prediction for differential sequencing data. Genome Biol 2023; 24:218. [PMID: 37784130 PMCID: PMC10544408 DOI: 10.1186/s13059-023-03058-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/27/2023] [Accepted: 09/14/2023] [Indexed: 10/04/2023] Open
Abstract
Characterizing differences in sequences between two conditions, such as with and without drug exposure, using high-throughput sequencing data is a prevalent problem involving quantifying changes in sequence abundances, and predicting such differences for unobserved sequences. A key shortcoming of current approaches is their extremely limited ability to share information across related but non-identical reads. Consequently, they cannot use sequencing data effectively, nor be directly applied in many settings of interest. We introduce model-based enrichment (MBE) to overcome this shortcoming. We evaluate MBE using both simulated and real data. Overall, MBE improves accuracy compared to current differential analysis methods.
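For orientation, the conventional count-based quantity that such differential analyses start from can be sketched as a per-sequence log-enrichment between the two conditions. This is not the MBE method itself, which the abstract does not specify. The function name, the pseudocount, and the toy counts are assumptions made for illustration only.

```python
import math


def log_enrichment(pre_counts, post_counts, pseudocount=1.0):
    """Per-sequence log-enrichment between two conditions.

    pre_counts / post_counts map each sequence to its read count.
    The pseudocount (an assumption of this sketch) avoids log(0)
    for sequences seen in only one condition.
    """
    sequences = set(pre_counts) | set(post_counts)
    n_pre = sum(pre_counts.values()) + pseudocount * len(sequences)
    n_post = sum(post_counts.values()) + pseudocount * len(sequences)
    scores = {}
    for seq in sequences:
        f_pre = (pre_counts.get(seq, 0) + pseudocount) / n_pre
        f_post = (post_counts.get(seq, 0) + pseudocount) / n_post
        scores[seq] = math.log(f_post) - math.log(f_pre)
    return scores


# Toy read counts, invented for illustration (not data from the study).
pre = {"AAA": 100, "AAC": 100}
post = {"AAA": 300, "AAC": 100}
scores = log_enrichment(pre, post)
```

Each sequence is scored independently here, so a sequence absent from both conditions receives no score at all. That is one way to see the information-sharing limitation the abstract describes.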
Affiliation(s)
- Akosua Busia
  - Department of Electrical Engineering & Computer Science, University of California, Berkeley, Berkeley, 94720, CA, USA
- Jennifer Listgarten
  - Department of Electrical Engineering & Computer Science, University of California, Berkeley, Berkeley, 94720, CA, USA

30
Truong A, Myerscough D, Campbell I, Atkinson J, Silberg JJ. A cellular selection identifies elongated flavodoxins that support electron transfer to sulfite reductase. Protein Sci 2023; 32:e4746. [PMID: 37551563 PMCID: PMC10503412 DOI: 10.1002/pro.4746] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2023] [Revised: 07/17/2023] [Accepted: 08/04/2023] [Indexed: 08/09/2023]
Abstract
Flavodoxins (Flds) mediate the flux of electrons between oxidoreductases in diverse metabolic pathways. To investigate whether Flds can support electron transfer to a sulfite reductase (SIR) that evolved to couple with a ferredoxin, we evaluated the ability of Flds to transfer electrons from a ferredoxin-NADP reductase (FNR) to a ferredoxin-dependent SIR using growth complementation of an Escherichia coli strain with a sulfur metabolism defect. We show that Flds from cyanobacteria complement this growth defect when coexpressed with an FNR and an SIR that evolved to couple with a plant ferredoxin. When we evaluated the effect of peptide insertion on Fld-mediated electron transfer, we observed a sensitivity to insertions within regions predicted to be proximal to the cofactor and partner binding sites, while a high insertion tolerance was detected within loops distal from the cofactor and within regions of helices and sheets that are proximal to those loops. Bioinformatic analysis showed that natural Fld sequence variability predicts a large fraction of the motifs that tolerate insertion of the octapeptide SGRPGSLS. These results represent the first evidence that Flds can support electron transfer to assimilatory SIRs, and they suggest that the pattern of insertion tolerance is influenced by interactions with oxidoreductase partners.
Affiliation(s)
- Albert Truong
  - Biochemistry and Cell Biology Graduate Program, Rice University, Houston, Texas, USA
  - Department of Biosciences, Rice University, Houston, Texas, USA
- Dru Myerscough
  - Department of Biosciences, Rice University, Houston, Texas, USA
- Ian Campbell
  - Department of Biosciences, Rice University, Houston, Texas, USA
- Joshua Atkinson
  - Department of Biosciences, Rice University, Houston, Texas, USA
- Jonathan J Silberg
  - Department of Biosciences, Rice University, Houston, Texas, USA
  - Department of Bioengineering, Rice University, Houston, Texas, USA
  - Department of Chemical and Biomolecular Engineering, Rice University, Houston, Texas, USA

31
Smith MD, Case MA, Makowski EK, Tessier PM. Position-Specific Enrichment Ratio Matrix scores predict antibody variant properties from deep sequencing data. Bioinformatics 2023; 39:btad446. [PMID: 37478351 PMCID: PMC10477941 DOI: 10.1093/bioinformatics/btad446] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/05/2023] [Revised: 06/21/2023] [Accepted: 07/20/2023] [Indexed: 07/23/2023] Open
Abstract
MOTIVATION: Deep sequencing of antibody and related protein libraries after phage or yeast-surface display sorting is widely used to identify variants with increased affinity, specificity, and/or improvements in key biophysical properties. Conventional approaches for identifying optimal variants typically use the frequencies of observation in enriched libraries or the corresponding enrichment ratios. However, these approaches disregard the vast majority of deep sequencing data and often fail to identify the best variants in the libraries.
RESULTS: Here, we present a method, Position-Specific Enrichment Ratio Matrix (PSERM) scoring, that uses entire deep sequencing datasets from pre- and post-selections to score each observed protein variant. The PSERM scores are the sum of the site-specific enrichment ratios observed at each mutated position. We find that PSERM scores are much more reproducible and correlate more strongly with experimentally measured properties than frequencies or enrichment ratios, including for multiple antibody properties (affinity and non-specific binding) for a clinical-stage antibody (emibetuzumab). We expect that this method will be broadly applicable to diverse protein engineering campaigns.
AVAILABILITY AND IMPLEMENTATION: All deep sequencing datasets and code to perform the analyses presented within are available via https://github.com/Tessier-Lab-UMich/PSERM_paper.
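The scoring rule stated in this abstract (sum the site-specific enrichment ratios at each mutated position) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the log2 ratio, the pseudocount, the function names, and the toy read counts are all choices made for this sketch. The authors' own code is in the repository linked above.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def site_frequencies(counts, length, pseudocount=1.0):
    """Per-position amino-acid frequencies from {sequence: read_count}."""
    tables = [Counter() for _ in range(length)]
    for seq, n_reads in counts.items():
        for pos, aa in enumerate(seq):
            tables[pos][aa] += n_reads
    freqs = []
    for table in tables:
        total = sum(table.values()) + pseudocount * len(AMINO_ACIDS)
        freqs.append({aa: (table[aa] + pseudocount) / total for aa in AMINO_ACIDS})
    return freqs


def enrichment_matrix(pre_counts, post_counts, length, pseudocount=1.0):
    """Position-specific log2 enrichment of post- over pre-selection frequencies.

    Taking log2 of the ratio is an assumption of this sketch.
    """
    pre = site_frequencies(pre_counts, length, pseudocount)
    post = site_frequencies(post_counts, length, pseudocount)
    return [
        {aa: math.log2(post[pos][aa] / pre[pos][aa]) for aa in AMINO_ACIDS}
        for pos in range(length)
    ]


def pserm_score(seq, wild_type, matrix):
    """Sum the site-specific enrichment values over mutated positions only."""
    return sum(
        matrix[pos][aa]
        for pos, (aa, wt) in enumerate(zip(seq, wild_type))
        if aa != wt
    )


# Toy three-residue library, invented for illustration (not data from the study).
wild_type = "ACD"
pre = {"ACD": 50, "AKD": 50}
post = {"ACD": 20, "AKD": 80}
matrix = enrichment_matrix(pre, post, length=3)
variant_score = pserm_score("AKD", wild_type, matrix)
wild_type_score = pserm_score("ACD", wild_type, matrix)
```

Because the matrix is built from every read in both libraries, a variant can be scored even when its own full-length sequence is rare, which is the contrast the abstract draws with per-variant frequencies.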
Affiliation(s)
- Matthew D Smith
  - Department of Chemical Engineering, University of Michigan, Ann Arbor, MI 48109-2200, United States
  - Biointerfaces Institute, University of Michigan, Ann Arbor, MI 48109-2200, United States
- Marshall A Case
  - Department of Chemical Engineering, University of Michigan, Ann Arbor, MI 48109-2200, United States
- Emily K Makowski
  - Biointerfaces Institute, University of Michigan, Ann Arbor, MI 48109-2200, United States
  - Department of Pharmaceutical Sciences, University of Michigan, Ann Arbor, MI 48109-2200, United States
- Peter M Tessier
  - Department of Chemical Engineering, University of Michigan, Ann Arbor, MI 48109-2200, United States
  - Biointerfaces Institute, University of Michigan, Ann Arbor, MI 48109-2200, United States
  - Department of Pharmaceutical Sciences, University of Michigan, Ann Arbor, MI 48109-2200, United States
  - Department of Biomedical Engineering, University of Michigan, Ann Arbor, MI 48109-2200, United States
  - Protein Folding Disease Initiative, University of Michigan, Ann Arbor, MI 48109-2200, United States
  - Michigan Alzheimer’s Disease Center, University of Michigan, Ann Arbor, MI 48109-2200, United States

32
Diabate M, Islam MM, Nagy G, Banerjee T, Dhar S, Smith N, Adamovich AI, Starita LM, Parvin JD. DNA repair function scores for 2172 variants in the BRCA1 amino-terminus. PLoS Genet 2023; 19:e1010739. [PMID: 37578980 PMCID: PMC10449183 DOI: 10.1371/journal.pgen.1010739] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2023] [Revised: 08/24/2023] [Accepted: 07/16/2023] [Indexed: 08/16/2023] Open
Abstract
Single nucleotide variants are the most frequent type of sequence change detected in the genome, and they are frequently variants of uncertain significance (VUS). VUS are changes in DNA for which disease risk association is unknown. Thus, methods that classify the functional impact of a VUS can be used as evidence for variant interpretation. In the case of the breast and ovarian cancer-specific tumor suppressor protein, BRCA1, pathogenic missense variants frequently score as loss of function in an assay for homology-directed repair (HDR) of DNA double-strand breaks. We previously published functional results using a multiplexed assay for 1056 amino acid substitutions at residues 2-192 in the amino terminus of BRCA1. In this study, we have re-assessed the data from this multiplexed assay using an improved analysis pipeline. These new analysis methods yield functional scores for more variants in the first 192 amino acids of BRCA1, and we also report new results for BRCA1 amino acid residues 193-302. We now present the functional classification of 2172 BRCA1 variants in residues 2-302 using the multiplexed HDR assay. Comparison of the functional determinations of the missense variants with clinically known benign or pathogenic variants indicated 93% sensitivity and 100% specificity for this assay. The results from BRCA1 variants tested in this assay are a resource for clinical geneticists seeking evidence to evaluate VUS in BRCA1.
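As a reminder of how the two reported metrics are defined, the sketch below computes sensitivity and specificity from paired assay calls and clinical labels. The function name and the toy labels are hypothetical and are not the study's data, so the numbers it produces do not reproduce the reported 93% and 100%.

```python
def sensitivity_specificity(assay_abnormal, truth_pathogenic):
    """Return (sensitivity, specificity) for paired boolean calls.

    assay_abnormal[i]   -> True if the assay scored variant i as loss of function
    truth_pathogenic[i] -> True if variant i is clinically known to be pathogenic
    """
    tp = fn = tn = fp = 0
    for called, truth in zip(assay_abnormal, truth_pathogenic):
        if truth and called:
            tp += 1  # pathogenic variant correctly flagged
        elif truth and not called:
            fn += 1  # pathogenic variant missed
        elif not truth and not called:
            tn += 1  # benign variant correctly passed
        else:
            fp += 1  # benign variant wrongly flagged
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Toy labels, invented for illustration: 10 pathogenic and 20 benign variants.
truth = [True] * 10 + [False] * 20
calls = [True] * 9 + [False] + [False] * 20
sens, spec = sensitivity_specificity(calls, truth)
```

In this toy set one pathogenic variant is missed and no benign variant is flagged, so sensitivity falls below 1 while specificity stays at 1, the same qualitative pattern as in the abstract.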
Affiliation(s)
- Mariame Diabate
  - The Ohio State University, Department of Biomedical Informatics, Columbus, Ohio, United States of America
  - The Ohio State University Comprehensive Cancer Center, Columbus, Ohio, United States of America
- Muhtadi M. Islam
  - The Ohio State University, Department of Biomedical Informatics, Columbus, Ohio, United States of America
  - The Ohio State University Comprehensive Cancer Center, Columbus, Ohio, United States of America
- Gregory Nagy
  - The Ohio State University, Department of Biomedical Informatics, Columbus, Ohio, United States of America
  - The Ohio State University Comprehensive Cancer Center, Columbus, Ohio, United States of America
- Tapahsama Banerjee
  - The Ohio State University, Department of Biomedical Informatics, Columbus, Ohio, United States of America
  - The Ohio State University Comprehensive Cancer Center, Columbus, Ohio, United States of America
- Shruti Dhar
  - The Ohio State University, Department of Biomedical Informatics, Columbus, Ohio, United States of America
  - The Ohio State University Comprehensive Cancer Center, Columbus, Ohio, United States of America
- Nahum Smith
  - The University of Washington, Department of Genome Sciences, Seattle, Washington, United States of America
  - Brotman Baty Institute for Precision Medicine, Seattle, Washington, United States of America
- Aleksandra I. Adamovich
  - The Ohio State University, Department of Biomedical Informatics, Columbus, Ohio, United States of America
  - The Ohio State University Comprehensive Cancer Center, Columbus, Ohio, United States of America
- Lea M. Starita
  - The University of Washington, Department of Genome Sciences, Seattle, Washington, United States of America
  - Brotman Baty Institute for Precision Medicine, Seattle, Washington, United States of America
- Jeffrey D. Parvin
  - The Ohio State University, Department of Biomedical Informatics, Columbus, Ohio, United States of America
  - The Ohio State University Comprehensive Cancer Center, Columbus, Ohio, United States of America

33
Smith MD, Case MA, Makowski EK, Tessier PM. Position-Specific Enrichment Ratio Matrix scores predict antibody variant properties from deep sequencing data. bioRxiv 2023:2023.07.10.548448. [PMID: 37503142 PMCID: PMC10369870 DOI: 10.1101/2023.07.10.548448] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/29/2023]
Abstract
Motivation: Deep sequencing of antibody and related protein libraries after phage or yeast-surface display sorting is widely used to identify variants with increased affinity, specificity and/or improvements in key biophysical properties. Conventional approaches for identifying optimal variants typically use the frequencies of observation in enriched libraries or the corresponding enrichment ratios. However, these approaches disregard the vast majority of deep sequencing data and often fail to identify the best variants in the libraries.
Results: Here, we present a method, Position-Specific Enrichment Ratio Matrix (PSERM) scoring, that uses entire deep sequencing datasets from pre- and post-selections to score each observed protein variant. The PSERM scores are the sum of the site-specific enrichment ratios observed at each mutated position. We find that PSERM scores are much more reproducible and correlate more strongly with experimentally measured properties than frequencies or enrichment ratios, including for multiple antibody properties (affinity and non-specific binding) for a clinical-stage antibody (emibetuzumab). We expect that this method will be broadly applicable to diverse protein engineering campaigns.
Availability: All deep sequencing datasets and code to do the analyses presented within are available via GitHub.
Contact: Peter Tessier, ptessier@umich.edu.
Supplementary information: Supplementary data are available at Bioinformatics online.

34
Bonadio A, Wenig BL, Hockla A, Radisky ES, Shifman JM. Designed Loop Extension Followed by Combinatorial Screening Confers High Specificity to a Broad Matrix Metalloproteinase Inhibitor. J Mol Biol 2023; 435:168095. [PMID: 37068580 PMCID: PMC10312305 DOI: 10.1016/j.jmb.2023.168095] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/01/2023] [Revised: 04/03/2023] [Accepted: 04/10/2023] [Indexed: 04/19/2023]
Abstract
Matrix metalloproteinases (MMPs) are key drivers of various diseases, including cancer. Development of probes and drugs capable of selectively inhibiting the individual members of the large MMP family remains a persistent challenge. The inhibitory N-terminal domain of tissue inhibitor of metalloproteinases-2 (N-TIMP2), a natural broad MMP inhibitor, can provide a scaffold for protein engineering to create more selective MMP inhibitors. Here, we pursued a unique approach harnessing both computational design and combinatorial screening to confer high binding specificity toward a target MMP in preference to an anti-target MMP. We designed a loop extension of N-TIMP2 to allow new interactions with the non-conserved MMP surface and generated an efficient focused library for yeast surface display, which was then screened for high binding to the target MMP-14 and low binding to anti-target MMP-3. Deep sequencing analysis identified the most promising variants, which were expressed, purified, and tested for selectivity of inhibition. Our best N-TIMP2 variant exhibited 29 pM binding affinity to MMP-14 and 2.4 µM affinity to MMP-3, revealing 7500-fold greater specificity than WT N-TIMP2. High-confidence structural models were obtained by including NGS data in the AlphaFold multiple sequence alignment. The modeling together with experimental mutagenesis validated our design predictions, demonstrating that the loop extension packs tightly against non-conserved residues on MMP-14 and clashes with MMP-3. This study demonstrates how introduction of loop extensions in a manner guided by target protein conservation data and loop design can offer an attractive strategy to achieve specificity in design of protein ligands.
Affiliation(s)
- Alessandro Bonadio
  - Department of Biological Chemistry, The Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Israel
- Bernhard L Wenig
  - Department of Cancer Biology, Mayo Clinic Comprehensive Cancer Center, Jacksonville, Florida, USA; Paracelsus Medical University, Salzburg, Austria
- Alexandra Hockla
  - Department of Cancer Biology, Mayo Clinic Comprehensive Cancer Center, Jacksonville, Florida, USA
- Evette S Radisky
  - Department of Cancer Biology, Mayo Clinic Comprehensive Cancer Center, Jacksonville, Florida, USA
- Julia M Shifman
  - Department of Biological Chemistry, The Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Israel

35
Nguyen V, Ahler E, Sitko KA, Stephany JJ, Maly DJ, Fowler DM. Molecular determinants of Hsp90 dependence of Src kinase revealed by deep mutational scanning. Protein Sci 2023; 32:e4656. [PMID: 37167432 PMCID: PMC10273359 DOI: 10.1002/pro.4656] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 01/13/2023] [Revised: 05/05/2023] [Accepted: 05/10/2023] [Indexed: 05/13/2023]
Abstract
Hsp90 is a molecular chaperone involved in the refolding and activation of numerous protein substrates referred to as clients. While the molecular determinants of Hsp90 client specificity are poorly understood and limited to a handful of client proteins, strong clients are thought to be destabilized and conformationally extended. Here, we measured the phosphotransferase activity of 3929 variants of the tyrosine kinase Src in both the presence and absence of an Hsp90 inhibitor. We identified 84 previously unknown functionally dependent client variants. Unexpectedly, many destabilized or extended variants were not functionally dependent on Hsp90. Instead, functionally dependent client variants were clustered in the αF pocket and β1-β2 strand regions of Src, which have yet to be described in driving Hsp90 dependence. Hsp90 dependence was also strongly correlated with kinase activity. We found that a combination of activation, global extension, and general conformational flexibility, primarily induced by variants at the αF pocket and β1-β2 strands, was necessary to render Src functionally dependent on Hsp90. Moreover, the degree of activation and flexibility required to transform Src into a functionally dependent client varied with variant location, suggesting that a combination of regulatory domain disengagement and catalytic domain flexibility are required for chaperone dependence. Thus, by studying the chaperone dependence of a massive number of variants, we highlight factors driving Hsp90 client specificity and propose a model of chaperone-kinase interactions.
Affiliation(s)
- Vanessa Nguyen
- Department of Bioengineering, University of Washington, Seattle, Washington, USA
| | - Ethan Ahler
- Department of Genome Sciences, University of Washington, Seattle, Washington, USA
| | - Katherine A. Sitko
- Department of Genome Sciences, University of Washington, Seattle, Washington, USA
| | - Jason J. Stephany
- Department of Genome Sciences, University of Washington, Seattle, Washington, USA
| | - Dustin J. Maly
- Department of Chemistry, University of Washington, Seattle, Washington, USA
| | - Douglas M. Fowler
- Department of Bioengineering, University of Washington, Seattle, Washington, USA
- Department of Genome Sciences, University of Washington, Seattle, Washington, USA
| |
|
36
|
Yee SW, Macdonald C, Mitrovic D, Zhou X, Koleske ML, Yang J, Silva DB, Grimes PR, Trinidad D, More SS, Kachuri L, Witte JS, Delemotte L, Giacomini KM, Coyote-Maestas W. The full spectrum of OCT1 (SLC22A1) mutations bridges transporter biophysics to drug pharmacogenomics. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.06.06.543963. [PMID: 37333090 PMCID: PMC10274788 DOI: 10.1101/2023.06.06.543963] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/20/2023]
Abstract
Membrane transporters play a fundamental role in the tissue distribution of endogenous compounds and xenobiotics and are major determinants of efficacy and side-effect profiles. Polymorphisms within these drug transporters result in inter-individual variation in drug response, with some patients not responding to the recommended dosage of drug whereas others experience catastrophic side effects. For example, variants within the major hepatic human organic cation transporter OCT1 (SLC22A1) can change the levels of endogenous organic cations and many prescription drugs. To understand how variants mechanistically impact drug uptake, we systematically study how all known and possible single missense and single amino acid deletion variants impact expression and substrate uptake of OCT1. We find that human variants primarily disrupt function via folding rather than substrate uptake. Our study revealed that the major determinants of folding reside in the first 300 amino acids, including the first 6 transmembrane domains and the extracellular domain (ECD), with a highly conserved stabilizing helical motif making key interactions between the ECD and transmembrane domains. Using the functional data combined with computational approaches, we determine and validate a structure-function model of OCT1's conformational ensemble without experimental structures. Using this model and molecular dynamics simulations of key mutants, we determine biophysical mechanisms for how specific human variants alter transport phenotypes. We identify differences in frequencies of reduced function alleles across populations, with East Asian and European populations having the lowest and highest frequencies of reduced function variants, respectively. Mining human population databases reveals that reduced function alleles of OCT1 identified in this study associate significantly with high LDL cholesterol levels. Our general approach broadly applied could transform the landscape of precision medicine by producing a mechanistic basis for understanding the effects of human mutations on disease and drug response.
Affiliation(s)
- Sook Wah Yee
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, United States
| | - Christian Macdonald
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, United States
| | - Darko Mitrovic
- Science for Life Laboratory, Department of Applied Physics, KTH Royal Institute of Technology, 12121 Solna, Sweden
| | - Xujia Zhou
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, United States
| | - Megan L Koleske
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, United States
| | - Jia Yang
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, United States
| | - Dina Buitrago Silva
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, United States
| | - Patrick Rockefeller Grimes
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, United States
| | - Donovan Trinidad
- Department of Medicine, Division of Infectious Disease, University of California, San Francisco, United States
| | - Swati S More
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, United States
- Current address: Center for Drug Design (CDD), College of Pharmacy, University of Minnesota, Minnesota, United States
| | - Linda Kachuri
- Epidemiology and Population Health, Stanford University, California, United States
- Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, United States
| | - John S Witte
- Epidemiology and Population Health, Stanford University, California, United States
- Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, United States
| | - Lucie Delemotte
- Science for Life Laboratory, Department of Applied Physics, KTH Royal Institute of Technology, 12121 Solna, Sweden
| | - Kathleen M Giacomini
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, United States
| | - Willow Coyote-Maestas
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, United States
- Quantitative Biosciences Institute, University of California, San Francisco, United States
| |
|
37
|
Soneson C, Bendel AM, Diss G, Stadler MB. mutscan-a flexible R package for efficient end-to-end analysis of multiplexed assays of variant effect data. Genome Biol 2023; 24:132. [PMID: 37264470 DOI: 10.1186/s13059-023-02967-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2022] [Accepted: 05/10/2023] [Indexed: 06/03/2023] Open
Abstract
Multiplexed assays of variant effect (MAVE) experimentally measure the effect of large numbers of sequence variants by selective enrichment of sequences with desirable properties followed by quantification by sequencing. mutscan is an R package for flexible analysis of such experiments, covering the entire workflow from raw reads up to statistical analysis and visualization. The core components are implemented in C++ for efficiency. Various experimental designs are supported, including single or paired reads with optional unique molecular identifiers. To find variants with changed relative abundance, mutscan employs established statistical models provided in the edgeR and limma packages. mutscan is available from https://github.com/fmicompbio/mutscan.
Affiliation(s)
- Charlotte Soneson
- Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland.
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland.
| | - Alexandra M Bendel
- Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland
| | - Guillaume Diss
- Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland
| | - Michael B Stadler
- Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland.
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland.
- University of Basel, Basel, Switzerland.
| |
|
38
|
Serebryany E, Zhao VY, Park K, Bitran A, Trauger SA, Budnik B, Shakhnovich EI. Systematic conformation-to-phenotype mapping via limited deep sequencing of proteins. Mol Cell 2023; 83:1936-1952.e7. [PMID: 37267908 PMCID: PMC10281453 DOI: 10.1016/j.molcel.2023.05.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2022] [Revised: 01/29/2023] [Accepted: 05/03/2023] [Indexed: 06/04/2023]
Abstract
Non-native conformations drive protein-misfolding diseases, complicate bioengineering efforts, and fuel molecular evolution. No current experimental technique is well suited for elucidating them and their phenotypic effects. Especially intractable are the transient conformations populated by intrinsically disordered proteins. We describe an approach to systematically discover, stabilize, and purify native and non-native conformations, generated in vitro or in vivo, and directly link conformations to molecular, organismal, or evolutionary phenotypes. This approach involves high-throughput disulfide scanning (HTDS) of the entire protein. To reveal which disulfides trap which chromatographically resolvable conformers, we devised a deep-sequencing method for double-Cys variant libraries of proteins that precisely and simultaneously locates both Cys residues within each polypeptide. HTDS of the abundant E. coli periplasmic chaperone HdeA revealed distinct classes of disordered hydrophobic conformers with variable cytotoxicity depending on where the backbone was cross-linked. HTDS can bridge conformational and phenotypic landscapes for many proteins that function in disulfide-permissive environments.
Affiliation(s)
- Eugene Serebryany
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA 02138, USA.
| | - Victor Y Zhao
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA 02138, USA
| | - Kibum Park
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA 02138, USA
| | - Amir Bitran
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA 02138, USA
| | - Sunia A Trauger
- Center for Mass Spectrometry, Harvard University, Cambridge, MA 02138, USA
| | - Bogdan Budnik
- Center for Mass Spectrometry, Harvard University, Cambridge, MA 02138, USA
| | - Eugene I Shakhnovich
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA 02138, USA.
| |
|
39
|
Chen Y, Hu R, Li K, Zhang Y, Fu L, Zhang J, Si T. Deep Mutational Scanning of an Oxygen-Independent Fluorescent Protein CreiLOV for Comprehensive Profiling of Mutational and Epistatic Effects. ACS Synth Biol 2023; 12:1461-1473. [PMID: 37066862 PMCID: PMC10204710 DOI: 10.1021/acssynbio.2c00662] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/08/2022] [Indexed: 04/18/2023]
Abstract
Oxygen-independent, flavin mononucleotide-based fluorescent proteins (FbFPs) are promising alternatives to green fluorescent protein in anaerobic contexts. Deep mutational scanning performs systematic profiling of protein sequence-function relationships but has not been applied to FbFPs. Focusing on CreiLOV from Chlamydomonas reinhardtii, we created and analyzed two comprehensive mutant collections: (1) single-residue, site-saturation mutagenesis libraries covering all 118 residues; and (2) a full combinatorial mutagenesis library among 20 mutations at 15 residues, where mutation and residue selection was based on single-site mutagenesis results. Notably, the second type of library is indispensable to study higher-order epistasis but underrepresented in the literature. Using optimized FACS-seq assays, 2,185 (>92.5%) out of 2,360 possible single-site mutants and 165,428 (>89.7%) out of 184,320 possible combinatorial mutants were reliably assigned fitness values. We constructed statistical and machine-learning models to analyze the CreiLOV data set, enabling accurate fitness prediction of higher-order mutants using lower-order mutagenesis data. In addition, we successfully isolated CreiLOV variants with improved fluorescence quantum yield and thermostability. This work provides new empirical data and design rules to engineer combinatorial protein variants.
Affiliation(s)
- Yongcan Chen
- CAS Key Laboratory for Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
| | - Ruyun Hu
- CAS Key Laboratory for Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
| | - Keyi Li
- CAS Key Laboratory for Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
| | - Yating Zhang
- CAS Key Laboratory for Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
| | - Lihao Fu
- CAS Key Laboratory for Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Jianzhi Zhang
- CAS Key Laboratory for Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
| | - Tong Si
- CAS Key Laboratory for Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- BGI-Shenzhen, Shenzhen 518083, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| |
|
40
|
Tayebi N, Leon‐Ricardo B, McCall K, Mehinovic E, Engelstad K, Huynh V, Turner TN, Weisenberg J, Thio LL, Hruz P, Williams RSB, De Vivo DC, Petit V, Haller G, Gurnett CA. Quantitative determination of SLC2A1 variant functional effects in GLUT1 deficiency syndrome. Ann Clin Transl Neurol 2023; 10:787-801. [PMID: 37000947 PMCID: PMC10187726 DOI: 10.1002/acn3.51767] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2022] [Revised: 03/08/2023] [Accepted: 03/12/2023] [Indexed: 04/03/2023] Open
Abstract
OBJECTIVE The goal of this study is to demonstrate the utility of a growth assay to quantify the functional impact of single nucleotide variants (SNVs) in SLC2A1, the gene responsible for Glut1DS. METHODS The functional impact of 40 SNVs in SLC2A1 was quantitatively determined in HAP1 cells in which SLC2A1 is required for growth. Donor libraries were introduced into the endogenous SLC2A1 gene in HAP1-Lig4KO cells using CRISPR/Cas9. Cell populations were harvested and sequenced to quantify the effect of variants on growth and generate a functional score. Quantitative functional scores were compared to 3-OMG uptake, SLC2A1 cell surface expression, CADD score, and clinical data, including CSF/blood glucose ratio. RESULTS Nonsense variants (N = 3) were reduced in cell culture over time resulting in negative scores (mean score: -1.15 ± 0.17), whereas synonymous variants (N = 10) were not depleted (mean score: 0.25 ± 0.12) (P < 2e-16). Missense variants (N = 27) yielded a range of functional scores including slightly negative scores, supporting a partial function and intermediate phenotype. Several variants with normal results on either cell surface expression (p.N34S and p.W65R) or 3-OMG uptake (p.W65R) had negative functional scores. There is a moderate but significant correlation between our functional scores and CADD scores. INTERPRETATION Cell growth is useful to quantitatively determine the functional effects of SLC2A1 variants. Nonsense variants were reliably distinguished from benign variants in this in vitro functional assay. For facilitating early diagnosis and therapeutic intervention, future work is needed to determine the functional effect of every possible variant in SLC2A1.
Affiliation(s)
- Naeimeh Tayebi
- Department of Neurology, Washington University in St Louis, St Louis, Missouri, USA
| | - Brian Leon‐Ricardo
- Department of Neurology, Washington University in St Louis, St Louis, Missouri, USA
| | - Kevin McCall
- Department of Neurology, Washington University in St Louis, St Louis, Missouri, USA
| | - Elvisa Mehinovic
- Department of Genetics, Washington University in St Louis, St Louis, Missouri, USA
| | - Kristin Engelstad
- Department of Neurology, Columbia University Irving Medical Center, New York, New York, USA
| | - Vincent Huynh
- Department of Neurology, Columbia University Irving Medical Center, New York, New York, USA
| | - Tychele N. Turner
- Department of Genetics, Washington University in St Louis, St Louis, Missouri, USA
| | - Judy Weisenberg
- Department of Neurology, Washington University in St Louis, St Louis, Missouri, USA
| | - Liu L. Thio
- Department of Neurology, Washington University in St Louis, St Louis, Missouri, USA
| | - Paul Hruz
- Department of Pediatrics, Washington University in St Louis, St Louis, Missouri, USA
| | - Robin S. B. Williams
- Centre for Biomedical Sciences, Department of Biological Sciences, Royal Holloway University of London, Egham, UK
| | - Darryl C. De Vivo
- Department of Neurology, Columbia University Irving Medical Center, New York, New York, USA
| | | | - Gabe Haller
- Department of Neurology, Washington University in St Louis, St Louis, Missouri, USA
- Department of Genetics, Washington University in St Louis, St Louis, Missouri, USA
- Department of Neurological Surgery, Washington University in St Louis, St Louis, Missouri, USA
| | | |
|
41
|
Hoffmann MD, Zdechlik AC, He Y, Nedrud D, Aslanidi G, Gordon W, Schmidt D. Multiparametric domain insertional profiling of Adeno-Associated Virus VP1. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.04.19.537549. [PMID: 37131661 PMCID: PMC10153220 DOI: 10.1101/2023.04.19.537549] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
Evolved properties of Adeno-Associated Virus (AAV), such as broad tropism and immunogenicity in humans, are barriers to AAV-based gene therapy. Previous efforts to re-engineer these properties have focused on variable regions near AAV’s 3-fold protrusions and capsid protein termini. To comprehensively survey AAV capsids for engineerable hotspots, we determined multiple AAV fitness phenotypes upon insertion of large, structured protein domains into the entire AAV-DJ capsid protein VP1. This is the largest and most comprehensive AAV domain insertion dataset to date. Our data revealed a surprising robustness of AAV capsids to accommodate large domain insertions. There was strong positional, domain-type, and fitness phenotype dependence of insertion permissibility, which clustered into correlated structural units that we could link to distinct roles in AAV assembly, stability, and infectivity. We also identified new engineerable hotspots of AAV that facilitate the covalent attachment of binding scaffolds, which may represent an alternative approach to re-direct AAV tropism.
|
42
|
Hoskins I, Sun S, Cote A, Roth FP, Cenik C. satmut_utils: a simulation and variant calling package for multiplexed assays of variant effect. Genome Biol 2023; 24:82. [PMID: 37081510 PMCID: PMC10116734 DOI: 10.1186/s13059-023-02922-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2022] [Accepted: 04/04/2023] [Indexed: 04/22/2023] Open
Abstract
The impact of millions of individual genetic variants on molecular phenotypes in coding sequences remains unknown. Multiplexed assays of variant effect (MAVEs) are scalable methods to annotate relevant variants, but existing software lacks standardization, requires cumbersome configuration, and does not scale to large targets. We present satmut_utils as a flexible solution for simulation and variant quantification. We then benchmark MAVE software using simulated and real MAVE data. We finally determine mRNA abundance for thousands of cystathionine beta-synthase variants using two experimental methods. The satmut_utils package enables high-performance analysis of MAVEs and reveals the capability of variants to alter mRNA abundance.
Affiliation(s)
- Ian Hoskins
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, 78712, USA
| | - Song Sun
- The Donnelly Centre and Departments of Molecular Genetics and Computer Science, University of Toronto, Toronto, ON, Canada
- Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
| | - Atina Cote
- The Donnelly Centre and Departments of Molecular Genetics and Computer Science, University of Toronto, Toronto, ON, Canada
- Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
| | - Frederick P Roth
- The Donnelly Centre and Departments of Molecular Genetics and Computer Science, University of Toronto, Toronto, ON, Canada
- Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
| | - Can Cenik
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, 78712, USA.
| |
|
43
|
Diabate M, Islam MM, Nagy G, Banerjee T, Dhar S, Smith N, Adamovich AI, Starita LM, Parvin JD. DNA Repair Function Scores for 2172 Variants in the BRCA1 Amino-Terminus. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.04.10.536331. [PMID: 37090572 PMCID: PMC10120616 DOI: 10.1101/2023.04.10.536331] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/25/2023]
Abstract
Single nucleotide variants are the most frequent type of sequence change detected in the genome, and these are frequently variants of uncertain significance (VUS). VUS are changes in DNA for which disease risk association is unknown. Thus, methods that classify the functional impact of a VUS can be used as evidence for variant interpretation. In the case of the breast and ovarian cancer specific tumor suppressor protein, BRCA1, pathogenic missense variants frequently score as loss of function in an assay for homology-directed repair (HDR) of DNA double-strand breaks. We previously published functional results using a multiplexed assay for 1056 amino acid substitutions at residues 2-192 in the amino terminus of BRCA1. In this study, we have re-assessed the data from this multiplexed assay using an improved analysis pipeline. These new analysis methods yield functional scores for more variants in the first 192 amino acids of BRCA1, and we report new results for BRCA1 amino acid residues 193-302. We now present the functional classification of 2172 BRCA1 variants in BRCA1 residues 2-302 using the multiplexed HDR assay. Comparison of the functional determinations of the missense variants with clinically known benign or pathogenic variants indicated 93% sensitivity and 100% specificity for this assay. The results from BRCA1 variants tested in this assay are a resource for clinical geneticists for evidence to evaluate VUS in BRCA1.
AUTHOR SUMMARY Most missense substitutions in BRCA1 are variants of unknown significance (VUS), and individuals with a VUS in BRCA1 cannot know from genetic information alone whether this variant predisposes to breast or ovarian cancer. We apply a multiplexed functional assay for homology-directed repair of DNA double-strand breaks to assess variant impact on this important BRCA1 protein function. We analyzed 2172 variants in the amino-terminus of BRCA1 and demonstrate that variants that are known as pathogenic have a loss of function in the DNA repair assay. Conversely, variants that are known to be benign are functionally normal in the multiplexed assay. We suggest that these functional determinations of BRCA1 variants can be used to augment the information that clinical cancer geneticists provide to patients who have a VUS in BRCA1.
Affiliation(s)
- Mariame Diabate
- The Ohio State University, Department of Biomedical Informatics, and The Ohio State University Comprehensive Center, Columbus, OH 43210
| | - Muhtadi M Islam
- The Ohio State University, Department of Biomedical Informatics, and The Ohio State University Comprehensive Center, Columbus, OH 43210
| | - Gregory Nagy
- The Ohio State University, Department of Biomedical Informatics, and The Ohio State University Comprehensive Center, Columbus, OH 43210
| | - Tapahsama Banerjee
- The Ohio State University, Department of Biomedical Informatics, and The Ohio State University Comprehensive Center, Columbus, OH 43210
| | - Shruti Dhar
- The Ohio State University, Department of Biomedical Informatics, and The Ohio State University Comprehensive Center, Columbus, OH 43210
| | - Nahum Smith
- The University of Washington, Department of Genome Sciences, Seattle, WA 98195
- Brotman Baty Institute for Precision Medicine, Seattle WA, 98195
| | - Aleksandra I Adamovich
- The Ohio State University, Department of Biomedical Informatics, and The Ohio State University Comprehensive Center, Columbus, OH 43210
| | - Lea M Starita
- The University of Washington, Department of Genome Sciences, Seattle, WA 98195
- Brotman Baty Institute for Precision Medicine, Seattle WA, 98195
| | - Jeffrey D Parvin
- The Ohio State University, Department of Biomedical Informatics, and The Ohio State University Comprehensive Center, Columbus, OH 43210
| |
|
44
|
Meier G, Thavarasah S, Ehrenbolger K, Hutter CAJ, Hürlimann LM, Barandun J, Seeger MA. Deep mutational scan of a drug efflux pump reveals its structure-function landscape. Nat Chem Biol 2023; 19:440-450. [PMID: 36443574 PMCID: PMC7615509 DOI: 10.1038/s41589-022-01205-1] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 10/01/2021] [Accepted: 10/10/2022] [Indexed: 11/30/2022]
Abstract
Drug efflux is a common resistance mechanism found in bacteria and cancer cells, but studies providing comprehensive functional insights are scarce. In this study, we performed deep mutational scanning (DMS) on the bacterial ABC transporter EfrCD to determine the drug efflux activity profile of more than 1,430 single variants. These systematic measurements revealed that the introduction of negative charges at different locations within the large substrate binding pocket results in strongly increased efflux activity toward positively charged ethidium, whereas additional aromatic residues did not display the same effect. Data analysis in the context of an inward-facing cryogenic electron microscopy structure of EfrCD uncovered a high-affinity binding site, which releases bound drugs through a peristaltic transport mechanism as the transporter transits to its outward-facing conformation. Finally, we identified substitutions resulting in rapid Hoechst influx without affecting the efflux activity for ethidium and daunorubicin. Hence, single mutations can convert EfrCD into a drug-specific ABC importer.
Affiliation(s)
- Gianmarco Meier
- Institute of Medical Microbiology, University of Zurich, Zurich, Switzerland
| | - Sujani Thavarasah
- Institute of Medical Microbiology, University of Zurich, Zurich, Switzerland
| | - Kai Ehrenbolger
- Laboratory for Molecular Infection Medicine Sweden (MIMS), Department of Molecular Biology, Umeå Centre for Microbial Research, Umeå University, Umeå, Sweden
- Science for Life Laboratory, Umeå University, Umeå, Sweden
| | - Cedric A J Hutter
- Institute of Medical Microbiology, University of Zurich, Zurich, Switzerland
- Linkster Therapeutics AG, Zurich, Switzerland
| | - Lea M Hürlimann
- Institute of Medical Microbiology, University of Zurich, Zurich, Switzerland
- Linkster Therapeutics AG, Zurich, Switzerland
| | - Jonas Barandun
- Laboratory for Molecular Infection Medicine Sweden (MIMS), Department of Molecular Biology, Umeå Centre for Microbial Research, Umeå University, Umeå, Sweden
- Science for Life Laboratory, Umeå University, Umeå, Sweden
| | - Markus A Seeger
- Institute of Medical Microbiology, University of Zurich, Zurich, Switzerland.
| |
|
45
|
Macdonald CB, Nedrud D, Grimes PR, Trinidad D, Fraser JS, Coyote-Maestas W. DIMPLE: deep insertion, deletion, and missense mutation libraries for exploring protein variation in evolution, disease, and biology. Genome Biol 2023; 24:36. [PMID: 36829241 PMCID: PMC9951526 DOI: 10.1186/s13059-023-02880-6] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 08/04/2022] [Accepted: 02/16/2023] [Indexed: 02/26/2023] Open
Abstract
Insertions and deletions (indels) enable evolution and cause disease. Due to technical challenges, indels are left out of most mutational scans, limiting our understanding of them in disease, biology, and evolution. We develop a low-cost, low-bias method, DIMPLE, for systematically generating deletions, insertions, and missense mutations in genes, which we test on a range of targets, including Kir2.1. We use DIMPLE to study how indels impact potassium channel structure, disease, and evolution. We find that deletions are most disruptive overall, beta sheets are most sensitive to indels, and flexible loops are sensitive to deletions yet tolerate insertions.
Affiliation(s)
- Christian B Macdonald
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, USA
| | | | | | - Donovan Trinidad
- Department of Medicine, Division of Infectious Disease, University of California, San Francisco, USA
| | - James S Fraser
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, USA
- Quantitative Biosciences Institute, University of California, San Francisco, USA
| | - Willow Coyote-Maestas
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, USA. .,Quantitative Biosciences Institute, University of California, San Francisco, USA.
46
Serebryany E, Zhao VY, Park K, Bitran A, Trauger SA, Budnik B, Shakhnovich EI. Systematic conformation-to-phenotype mapping via limited deep-sequencing of proteins. ARXIV 2023:2204.06159. [PMID: 36776823 PMCID: PMC9915745] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/14/2023]
Abstract
Non-native conformations drive protein misfolding diseases, complicate bioengineering efforts, and fuel molecular evolution. No current experimental technique is well-suited for elucidating them and their phenotypic effects. Especially intractable are the transient conformations populated by intrinsically disordered proteins. We describe an approach to systematically discover, stabilize, and purify native and non-native conformations, generated in vitro or in vivo, and directly link conformations to molecular, organismal, or evolutionary phenotypes. This approach involves high-throughput disulfide scanning (HTDS) of the entire protein. To reveal which disulfides trap which chromatographically resolvable conformers, we devised a deep-sequencing method for double-Cys variant libraries of proteins that precisely and simultaneously locates both Cys residues within each polypeptide. HTDS of the abundant E. coli periplasmic chaperone HdeA revealed distinct classes of disordered hydrophobic conformers with variable cytotoxicity depending on where the backbone was cross-linked. HTDS can bridge conformational and phenotypic landscapes for many proteins that function in disulfide-permissive environments.
Affiliation(s)
- Eugene Serebryany
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA
- Victor Y. Zhao
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA
- Kibum Park
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA
- Amir Bitran
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA
- Bogdan Budnik
- Center for Mass Spectrometry, Harvard University, Cambridge, MA
47
Dewachter L, Brooks AN, Noon K, Cialek C, Clark-ElSayed A, Schalck T, Krishnamurthy N, Versées W, Vranken W, Michiels J. Deep mutational scanning of essential bacterial proteins can guide antibiotic development. Nat Commun 2023; 14:241. [PMID: 36646716 PMCID: PMC9842644 DOI: 10.1038/s41467-023-35940-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 05/13/2022] [Accepted: 01/09/2023] [Indexed: 01/18/2023] Open
Abstract
Deep mutational scanning is a powerful approach to investigate a wide variety of research questions including protein function and stability. Here, we perform deep mutational scanning on three essential E. coli proteins (FabZ, LpxC and MurA) involved in cell envelope synthesis using high-throughput CRISPR genome editing, and study the effect of the mutations in their original genomic context. We use more than 17,000 variants of the proteins to interrogate protein function and the importance of individual amino acids in supporting viability. Additionally, we exploit these libraries to study resistance development against antimicrobial compounds that target the selected proteins. Among the three proteins studied, MurA seems to be the superior antimicrobial target due to its low mutational flexibility, which decreases the chance of acquiring resistance-conferring mutations that simultaneously preserve MurA function. Additionally, we rank anti-LpxC lead compounds for further development, guided by the number of resistance-conferring mutations against each compound. Our results show that deep mutational scanning studies can be used to guide drug development, which we hope will contribute towards the development of novel antimicrobial therapies.
Affiliation(s)
- Liselot Dewachter
- Centre of Microbial and Plant Genetics, KU Leuven, Leuven, Belgium; VIB-KU Leuven Center for Microbiology, Leuven, Belgium
- Thomas Schalck
- Centre of Microbial and Plant Genetics, KU Leuven, Leuven, Belgium; VIB-KU Leuven Center for Microbiology, Leuven, Belgium
- Wim Versées
- Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium; VIB-VUB Center for Structural Biology, Brussels, Belgium
- Wim Vranken
- Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium; VIB-VUB Center for Structural Biology, Brussels, Belgium; Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Brussels, Belgium
- Jan Michiels
- Centre of Microbial and Plant Genetics, KU Leuven, Leuven, Belgium; VIB-KU Leuven Center for Microbiology, Leuven, Belgium
48
Wei H, Li X. Deep mutational scanning: A versatile tool in systematically mapping genotypes to phenotypes. Front Genet 2023; 14:1087267. [PMID: 36713072 PMCID: PMC9878224 DOI: 10.3389/fgene.2023.1087267] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 11/02/2022] [Accepted: 01/02/2023] [Indexed: 01/13/2023] Open
Abstract
Unveiling how genetic variations lead to phenotypic variations is one of the key questions in evolutionary biology, genetics, and biomedical research. Deep mutational scanning (DMS) technology has allowed the mapping of tens of thousands of genetic variations to phenotypic variations efficiently and economically. Since its first systematic introduction about a decade ago, we have witnessed the use of deep mutational scanning in many research areas leading to scientific breakthroughs. Also, the methods in each step of deep mutational scanning have become much more versatile thanks to oligo-synthesis technology, high-throughput phenotyping methods and deep sequencing technology. However, each possible step of deep mutational scanning has its pros and cons, and some limitations still await further technological development. Here, we discuss recent scientific accomplishments achieved through deep mutational scanning and describe widely used methods in each step of deep mutational scanning. We also compare these different methods and analyze their advantages and disadvantages, providing insight into how to design a deep mutational scanning study that best suits the aims of the readers' projects.
Affiliation(s)
- Huijin Wei
- Zhejiang University-University of Edinburgh Institute, Zhejiang University, Haining, Zhejiang, China
- Xianghua Li
- Zhejiang University-University of Edinburgh Institute, Zhejiang University, Haining, Zhejiang, China; Deanery of Biomedical Sciences, University of Edinburgh, Edinburgh, United Kingdom; The Second Affiliated Hospital of Zhejiang University, Hangzhou, Zhejiang, China; Biomedical and Health Translational Centre of Zhejiang Province, Haining, Zhejiang, China
49
Abbott RC, Iliopoulos M, Watson KA, Arcucci V, Go M, Hughes-Parry HE, Smith P, Call MJ, Cross RS, Jenkins MR. Human EGFRvIII chimeric antigen receptor T cells demonstrate favorable safety profile and curative responses in orthotopic glioblastoma. Clin Transl Immunology 2023; 12:e1440. [PMID: 36890859 PMCID: PMC9986233 DOI: 10.1002/cti2.1440] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2022] [Revised: 01/30/2023] [Accepted: 02/07/2023] [Indexed: 03/07/2023] Open
Abstract
Objectives: Glioblastoma is a highly aggressive and fatal brain malignancy, and effective targeted therapies are required. The combination of standard treatments including surgery, chemotherapy and radiotherapy is not curative. Chimeric antigen receptor (CAR) T cells are known to cross the blood-brain barrier, mediating antitumor responses. A tumor-expressed deletion mutant of the epidermal growth factor receptor (EGFRvIII) is a robust CAR T cell target in glioblastoma. Here, we show that our de novo generated, high-affinity EGFRvIII-specific CAR, GCT02, demonstrates curative efficacy in human orthotopic glioblastoma models. Methods: The GCT02 binding epitope was predicted using Deep Mutational Scanning (DMS). GCT02 CAR T cell cytotoxicity was investigated in three glioblastoma models in vitro using the IncuCyte platform, and cytokine secretion with a cytometric bead array. GCT02 in vivo functionality was demonstrated in two NSG orthotopic glioblastoma models. The specificity profile was generated by measuring T cell degranulation in response to coculture with primary human healthy cells. Results: The GCT02 binding site was predicted to lie in a region shared by EGFR and EGFRvIII; however, the in vitro functionality remained exquisitely EGFRvIII specific. A single CAR T cell infusion generated curative responses in two orthotopic models of human glioblastoma in NSG mice. The safety analysis further validated the specificity of GCT02 for mutant-expressing cells. Conclusion: This study demonstrates the preclinical functionality of a highly specific CAR targeting EGFRvIII on human cells. This CAR could be an effective treatment for glioblastoma and warrants future clinical investigation.
Affiliation(s)
- Rebecca C Abbott
- Immunology Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia; The Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia
- Melinda Iliopoulos
- Immunology Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Katherine A Watson
- Immunology Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Valeria Arcucci
- Immunology Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Margareta Go
- Structural Biology Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Hannah E Hughes-Parry
- Immunology Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia; The Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia
- Pete Smith
- Myrio Therapeutics, Blackburn North, Melbourne, VIC, Australia
- Melissa J Call
- The Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia; Structural Biology Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Ryan S Cross
- Immunology Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Misty R Jenkins
- Immunology Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia; The Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia; Department of Biochemistry and Chemistry, Institute for Molecular Science, La Trobe University, Bundoora, VIC, Australia
50
Tabet D, Parikh V, Mali P, Roth FP, Claussnitzer M. Scalable Functional Assays for the Interpretation of Human Genetic Variation. Annu Rev Genet 2022; 56:441-465. [PMID: 36055970 DOI: 10.1146/annurev-genet-072920-032107] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Indexed: 12/24/2022]
Abstract
Scalable sequence-function studies have enabled the systematic analysis and cataloging of hundreds of thousands of coding and noncoding genetic variants in the human genome. This has improved clinical variant interpretation and provided insights into the molecular, biophysical, and cellular effects of genetic variants at an astonishing scale and resolution across the spectrum of allele frequencies. In this review, we explore current applications and prospects for the field and outline the principles underlying scalable functional assay design, with a focus on the study of single-nucleotide coding and noncoding variants.
Affiliation(s)
- Daniel Tabet
- Donnelly Centre, Department of Molecular Genetics, and Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
- Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, Ontario, Canada
- Victoria Parikh
- Center for Inherited Cardiovascular Disease, Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, California, USA
- Prashant Mali
- Department of Bioengineering, University of California, San Diego, California, USA
- Frederick P Roth
- Donnelly Centre, Department of Molecular Genetics, and Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
- Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, Ontario, Canada
- Melina Claussnitzer
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Center for Genomic Medicine and Endocrine Division, Massachusetts General Hospital, Boston, Massachusetts, USA
- Harvard Medical School, Harvard University, Boston, Massachusetts, USA