1. La Fleur A, Shi Y, Seelig G. Decoding biology with massively parallel reporter assays and machine learning. Genes Dev 2024; 38:843-865. [PMID: 39362779 PMCID: PMC11535156 DOI: 10.1101/gad.351800.124] [Indexed: 10/05/2024]
Abstract
Massively parallel reporter assays (MPRAs) are powerful tools for quantifying the impacts of sequence variation on gene expression. Reading out molecular phenotypes with sequencing enables interrogating the impact of sequence variation beyond genome scale. Machine learning models integrate and codify information learned from MPRAs and enable generalization by predicting sequences outside the training data set. Models can provide a quantitative understanding of cis-regulatory codes controlling gene expression, enable variant stratification, and guide the design of synthetic regulatory elements for applications from synthetic biology to mRNA and gene therapy. This review focuses on cis-regulatory MPRAs, particularly those that interrogate cotranscriptional and post-transcriptional processes: alternative splicing, cleavage and polyadenylation, translation, and mRNA decay.
Affiliation(s)
- Alyssa La Fleur
  - Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, Washington 98195, USA
- Yongsheng Shi
  - Department of Microbiology and Molecular Genetics, School of Medicine, University of California, Irvine, Irvine, California 92697, USA
- Georg Seelig
  - Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, Washington 98195, USA
  - Department of Electrical & Computer Engineering, University of Washington, Seattle, Washington 98195, USA
2. Wall RJ, MacGowan SA, Hallyburton I, Syed AJ, Ajay Castro S, Dey G, Milne R, Patterson S, Phelan J, Wiedemar N, Wyllie S. ResMAP-a saturation mutagenesis platform enabling parallel profiling of target-specific resistance-conferring mutations in Plasmodium. mBio 2024; 15:e0170824. [PMID: 39191404 PMCID: PMC11481570 DOI: 10.1128/mbio.01708-24] [Received: 06/04/2024] [Accepted: 07/29/2024] [Indexed: 08/29/2024] [Open access]
Abstract
New and improved drugs are required for the treatment and ultimate eradication of malaria. The efficacy of front-line therapies is now threatened by emerging drug resistance; thus, new tools to support the development of drugs with a lower propensity for resistance are needed. Here, we describe the development of a RESistance Mapping And Profiling (ResMAP) platform for the identification of resistance-conferring mutations in Plasmodium drug targets. Proof-of-concept studies focused on interrogating the antimalarial drug target, Plasmodium falciparum lysyl tRNA synthetase (PfKRS). Saturation mutagenesis was used to construct a plasmid library encoding all conceivable mutations within a 20-residue span at the base of the PfKRS ATP-binding pocket. The superior transfection efficiency of Plasmodium knowlesi was exploited to generate a high coverage parasite library expressing PfKRS bearing all possible amino acid changes within this region of the enzyme. The selection of the library with PfKRS inhibitors, cladosporin and DDD01510706, successfully identified multiple resistance-conferring substitutions. Genetic validation of a subset of these mutations confirmed their direct role in resistance, with computational modeling used to dissect the structural basis of resistance. The application of ResMAP to inform the development of resistance-resilient antimalarials of the future is discussed.
IMPORTANCE: An increase in treatment failures for malaria highlights an urgent need for new tools to understand and minimize the spread of drug resistance. We describe the development of a RESistance Mapping And Profiling (ResMAP) platform for the identification of resistance-conferring mutations in Plasmodium spp, the causative agent of malaria. Saturation mutagenesis was used to generate a mutation library containing all conceivable mutations for a region of the antimalarial-binding site of a promising drug target, Plasmodium falciparum lysyl tRNA synthetase (PfKRS). Screening of this high-coverage library with characterized PfKRS inhibitors revealed multiple resistance-conferring substitutions including several clinically relevant mutations. Genetic validation of these mutations confirmed resistance of up to 100-fold and computational modeling dissected their role in drug resistance. We discuss potential applications of this data including the potential to design compounds that can bypass the most serious resistance mutations and future resistance surveillance.
Affiliation(s)
- Richard J. Wall
  - Wellcome Center for Anti-infectives Research, Division of Biological Chemistry and Drug Discovery, School of Life Sciences, University of Dundee, Dow Street, Dundee, United Kingdom
- Stuart A. MacGowan
  - Division of Computational Biology, School of Life Sciences, University of Dundee, Dundee, United Kingdom
- Irene Hallyburton
  - Drug Discovery Unit, Wellcome Center for Anti-infectives Research, Division of Biological Chemistry and Drug Discovery, University of Dundee, Dundee, United Kingdom
- Aisha J. Syed
  - Wellcome Center for Anti-infectives Research, Division of Biological Chemistry and Drug Discovery, School of Life Sciences, University of Dundee, Dow Street, Dundee, United Kingdom
- Sowmya Ajay Castro
  - Division of Molecular Microbiology, School of Life Sciences, University of Dundee, Dundee, United Kingdom
- Gourav Dey
  - Wellcome Center for Anti-infectives Research, Division of Biological Chemistry and Drug Discovery, School of Life Sciences, University of Dundee, Dow Street, Dundee, United Kingdom
- Rachel Milne
  - Wellcome Center for Anti-infectives Research, Division of Biological Chemistry and Drug Discovery, School of Life Sciences, University of Dundee, Dow Street, Dundee, United Kingdom
- Stephen Patterson
  - Wellcome Center for Anti-infectives Research, Division of Biological Chemistry and Drug Discovery, School of Life Sciences, University of Dundee, Dow Street, Dundee, United Kingdom
- Jody Phelan
  - Department of Infection Biology, Faculty of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Natalie Wiedemar
  - Wellcome Center for Anti-infectives Research, Division of Biological Chemistry and Drug Discovery, School of Life Sciences, University of Dundee, Dow Street, Dundee, United Kingdom
- Susan Wyllie
  - Wellcome Center for Anti-infectives Research, Division of Biological Chemistry and Drug Discovery, School of Life Sciences, University of Dundee, Dow Street, Dundee, United Kingdom
3. Bendel AM, Faure AJ, Klein D, Shimada K, Lyautey R, Schiffelholz N, Kempf G, Cavadini S, Lehner B, Diss G. The genetic architecture of protein interaction affinity and specificity. Nat Commun 2024; 15:8868. [PMID: 39402041 PMCID: PMC11479274 DOI: 10.1038/s41467-024-53195-4] [Received: 10/17/2023] [Accepted: 10/04/2024] [Indexed: 10/17/2024] [Open access]
Abstract
The encoding and evolution of specificity and affinity in protein-protein interactions is poorly understood. Here, we address this question by quantifying how all mutations in one protein, JUN, alter binding to all other members of a protein family, the 54 human basic leucine zipper transcription factors. We fit a global thermodynamic model to the data to reveal that most affinity changing mutations equally affect JUN's affinity to all its interaction partners. Mutations that alter binding specificity are relatively rare but distributed throughout the interaction interface. Specificity is determined both by features that promote on-target interactions and by those that prevent off-target interactions. Approximately half of the specificity-defining residues in JUN contribute both to promoting on-target binding and preventing off-target binding. Nearly all specificity-altering mutations in the interaction interface are pleiotropic, also altering affinity to all partners. In contrast, mutations outside the interface can tune global affinity without affecting specificity. Our results reveal the distributed encoding of specificity and affinity in an interaction interface and how coiled-coils provide an elegant solution to the challenge of optimizing both specificity and affinity in a large protein family.
Affiliation(s)
- Alexandra M Bendel
  - Friedrich Miescher Institute for Biomedical Research (FMI), Basel, Switzerland
  - University of Basel, Basel, Switzerland
  - Swiss Institute for Experimental Cancer Research, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Andre J Faure
  - Center for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
  - ALLOX, C/ Dr. Aiguader, 88, PRBB Building, Barcelona, Spain
- Dominique Klein
  - Friedrich Miescher Institute for Biomedical Research (FMI), Basel, Switzerland
- Kenji Shimada
  - Friedrich Miescher Institute for Biomedical Research (FMI), Basel, Switzerland
- Romane Lyautey
  - Friedrich Miescher Institute for Biomedical Research (FMI), Basel, Switzerland
  - University of Basel, Basel, Switzerland
- Nicole Schiffelholz
  - Friedrich Miescher Institute for Biomedical Research (FMI), Basel, Switzerland
- Georg Kempf
  - Friedrich Miescher Institute for Biomedical Research (FMI), Basel, Switzerland
- Simone Cavadini
  - Friedrich Miescher Institute for Biomedical Research (FMI), Basel, Switzerland
- Ben Lehner
  - Center for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
  - Wellcome Sanger Institute, Hinxton, UK
- Guillaume Diss
  - Friedrich Miescher Institute for Biomedical Research (FMI), Basel, Switzerland
4. Faure AJ, Martí-Aranda A, Hidalgo-Carcedo C, Beltran A, Schmiedel JM, Lehner B. The genetic architecture of protein stability. Nature 2024; 634:995-1003. [PMID: 39322666 PMCID: PMC11499273 DOI: 10.1038/s41586-024-07966-0] [Received: 10/27/2023] [Accepted: 08/20/2024] [Indexed: 09/27/2024]
Abstract
There are more ways to synthesize a 100-amino acid (aa) protein (20^100) than there are atoms in the universe. Only a very small fraction of such a vast sequence space can ever be experimentally or computationally surveyed. Deep neural networks are increasingly being used to navigate high-dimensional sequence spaces [1]. However, these models are extremely complicated. Here, by experimentally sampling from sequence spaces larger than 10^10, we show that the genetic architecture of at least some proteins is remarkably simple, allowing accurate genetic prediction in high-dimensional sequence spaces with fully interpretable energy models. These models capture the nonlinear relationships between free energies and phenotypes but otherwise consist of additive free energy changes with a small contribution from pairwise energetic couplings. These energetic couplings are sparse and associated with structural contacts and backbone proximity. Our results indicate that protein genetics is actually both rather simple and intelligible.
Affiliation(s)
- Andre J Faure
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
  - ALLOX, Barcelona, Spain
- Aina Martí-Aranda
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
- Cristina Hidalgo-Carcedo
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Antoni Beltran
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Jörn M Schmiedel
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
  - factorize.bio, Berlin, Germany
- Ben Lehner
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
5. Thompson M, Martín M, Olmo TS, Rajesh C, Koo PK, Bolognesi B, Lehner B. Massive experimental quantification of amyloid nucleation allows interpretable deep learning of protein aggregation. bioRxiv 2024:2024.07.13.603366. [PMID: 39071305 PMCID: PMC11275847 DOI: 10.1101/2024.07.13.603366] [Indexed: 07/30/2024]
Abstract
Protein aggregation is a pathological hallmark of more than fifty human diseases and a major problem for biotechnology. Methods have been proposed to predict aggregation from sequence, but these have been trained and evaluated on small and biased experimental datasets. Here we directly address this data shortage by experimentally quantifying the amyloid nucleation of >100,000 protein sequences. This unprecedented dataset reveals the limited performance of existing computational methods and allows us to train CANYA, a convolution-attention hybrid neural network that accurately predicts amyloid nucleation from sequence. We adapt genomic neural network interpretability analyses to reveal CANYA's decision-making process and learned grammar. Our results illustrate the power of massive experimental analysis of random sequence-spaces and provide an interpretable and robust neural network model to predict amyloid nucleation.
Affiliation(s)
- Mike Thompson
  - Systems and Synthetic Biology, Centre for Genomic Regulation, The Barcelona Institute for Science and Technology (BIST), Barcelona, Spain
- Mariano Martín
  - Institute for Bioengineering of Catalonia (IBEC), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Trinidad Sanmartín Olmo
  - Institute for Bioengineering of Catalonia (IBEC), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Chandana Rajesh
  - Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Peter K. Koo
  - Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
- Benedetta Bolognesi
  - Institute for Bioengineering of Catalonia (IBEC), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Ben Lehner
  - Systems and Synthetic Biology, Centre for Genomic Regulation, The Barcelona Institute for Science and Technology (BIST), Barcelona, Spain
  - University Pompeu Fabra (UPF), Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
6. Ma K, Huang S, Ng KK, Lake NJ, Joseph S, Xu J, Lek A, Ge L, Woodman KG, Koczwara KE, Cohen J, Ho V, O'Connor CL, Brindley MA, Campbell KP, Lek M. Saturation mutagenesis-reinforced functional assays for disease-related genes. Cell 2024:S0092-8674(24)00976-0. [PMID: 39326416 DOI: 10.1016/j.cell.2024.08.047] [Received: 11/20/2023] [Revised: 07/29/2024] [Accepted: 08/23/2024] [Indexed: 09/28/2024]
Abstract
Interpretation of disease-causing genetic variants remains a challenge in human genetics. Current costs and complexity of deep mutational scanning methods are obstacles for achieving genome-wide resolution of variants in disease-related genes. Our framework, saturation mutagenesis-reinforced functional assays (SMuRF), offers simple and cost-effective saturation mutagenesis paired with streamlined functional assays to enhance the interpretation of unresolved variants. Applying SMuRF to neuromuscular disease genes FKRP and LARGE1, we generated functional scores for all possible coding single-nucleotide variants, which aid in resolving clinically reported variants of uncertain significance. SMuRF also demonstrates utility in predicting disease severity, resolving critical structural regions, and providing training datasets for the development of computational predictors. Overall, our approach enables variant-to-function insights for disease genes in a cost-effective manner that can be broadly implemented by standard research laboratories.
Affiliation(s)
- Kaiyue Ma
  - Department of Genetics, Yale School of Medicine, New Haven, CT, USA; Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
- Shushu Huang
  - Department of Genetics, Yale School of Medicine, New Haven, CT, USA
- Kenneth K Ng
  - Department of Genetics, Yale School of Medicine, New Haven, CT, USA
- Nicole J Lake
  - Department of Genetics, Yale School of Medicine, New Haven, CT, USA
- Soumya Joseph
  - Howard Hughes Medical Institute, Senator Paul D. Wellstone Muscular Dystrophy Specialized Research Center, Department of Molecular Physiology and Biophysics and Department of Neurology, Roy J. and Lucille A. Carver College of Medicine, The University of Iowa, Iowa City, IA, USA
- Jenny Xu
  - Yale University, New Haven, CT, USA
- Angela Lek
  - Department of Genetics, Yale School of Medicine, New Haven, CT, USA; Muscular Dystrophy Association, Chicago, IL, USA
- Lin Ge
  - Department of Genetics, Yale School of Medicine, New Haven, CT, USA; Department of Neurology, Beijing Children's Hospital, Capital Medical University, National Center for Children's Health, Beijing, China
- Keryn G Woodman
  - Department of Genetics, Yale School of Medicine, New Haven, CT, USA
- Justin Cohen
  - Department of Genetics, Yale School of Medicine, New Haven, CT, USA
- Vincent Ho
  - Department of Genetics, Yale School of Medicine, New Haven, CT, USA
- Melinda A Brindley
  - Department of Infectious Diseases, Department of Population Health, University of Georgia, Athens, GA, USA
- Kevin P Campbell
  - Howard Hughes Medical Institute, Senator Paul D. Wellstone Muscular Dystrophy Specialized Research Center, Department of Molecular Physiology and Biophysics and Department of Neurology, Roy J. and Lucille A. Carver College of Medicine, The University of Iowa, Iowa City, IA, USA
- Monkol Lek
  - Department of Genetics, Yale School of Medicine, New Haven, CT, USA
7. Arutyunyan A, Seuma M, Faure AJ, Bolognesi B, Lehner B. Energetic portrait of the amyloid beta nucleation transition state. bioRxiv 2024:2024.07.24.604935. [PMID: 39091732 PMCID: PMC11291115 DOI: 10.1101/2024.07.24.604935] [Indexed: 08/04/2024]
Abstract
Amyloid protein aggregates are pathological hallmarks of more than fifty human diseases including the most common neurodegenerative disorders. The atomic structures of amyloid fibrils have now been determined, but the process by which soluble proteins nucleate to form amyloids remains poorly characterised and difficult to study, even though this is the key step to understand to prevent the formation and spread of aggregates. Here we use massively parallel combinatorial mutagenesis, a kinetic selection assay, and machine learning to reveal the transition state of the nucleation reaction of amyloid beta, the protein that aggregates in Alzheimer's disease. By quantifying the nucleation of >140,000 proteins we infer the changes in activation energy for all 798 amino acid substitutions in amyloid beta and the energetic couplings between >600 pairs of mutations. This unprecedented dataset provides the first comprehensive view of the energy landscape and the first large-scale measurement of energetic couplings for a protein transition state. The energy landscape reveals that the amyloid beta nucleation transition state contains a short structured C-terminal hydrophobic core with a subset of interactions similar to mature fibrils. This study demonstrates the feasibility of using mutation-selection-sequencing experiments to study transition states and identifies the key molecular species that initiates amyloid beta aggregation and, potentially, Alzheimer's disease.
Affiliation(s)
- Mireia Seuma
  - Institute for Bioengineering of Catalonia (IBEC), The Barcelona Institute of Science and Technology (BIST), Baldiri Reixac 10-12, 08028, Barcelona, Spain
  - Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology (BIST), Barcelona, Spain
  - Current address: Medical Research Council Laboratory of Molecular Biology, Cambridge, UK
- Andre J. Faure
  - Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology (BIST), Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
  - Current address: ALLOX, C/Dr. Aiguader, 88, PRBB Building, 08003 Barcelona, Spain
- Benedetta Bolognesi
  - Institute for Bioengineering of Catalonia (IBEC), The Barcelona Institute of Science and Technology (BIST), Baldiri Reixac 10-12, 08028, Barcelona, Spain
- Ben Lehner
  - Wellcome Sanger Institute, Cambridge, UK
  - Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology (BIST), Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
8. Ramesh S, Go M, Call ME, Call MJ. Deep mutational scanning reveals transmembrane features governing surface expression of the B cell antigen receptor. Front Immunol 2024; 15:1426795. [PMID: 39108267 PMCID: PMC11300204 DOI: 10.3389/fimmu.2024.1426795] [Received: 05/02/2024] [Accepted: 07/05/2024] [Indexed: 09/17/2024] [Open access]
Abstract
B cells surveil the body for foreign matter using their surface-expressed B cell antigen receptor (BCR), a tetrameric complex comprising a membrane-tethered antibody (mIg) that binds antigens and a signaling dimer (CD79AB) that conveys this interaction to the B cell. Recent cryogenic electron microscopy (cryo-EM) structures of IgM and IgG isotype BCRs provide the first complete views of their architecture, revealing that the largest interaction surfaces between the mIg and CD79AB are in their transmembrane domains (TMDs). These structures support decades of biochemical work interrogating the requirements for assembly of a functional BCR and provide the basis for explaining the effects of mutations. Here we report a focused saturating mutagenesis to comprehensively characterize the nature of the interactions in the mIg TMD that are required for BCR surface expression. We examined the effects of 600 single-amino-acid changes simultaneously in a pooled competition assay and quantified their effects by next-generation sequencing. Our deep mutational scanning results reflect a feature-rich TMD sequence, with some positions completely intolerant to mutation and others requiring specific biochemical properties such as charge, polarity or hydrophobicity, emphasizing the high value of saturating mutagenesis over, for example, alanine scanning. The data agree closely with published mutagenesis and the cryo-EM structures, while also highlighting several positions and surfaces that have not previously been characterized or have effects that are difficult to rationalize purely based on structure. This unbiased and complete mutagenesis dataset serves as a reference and framework for informed hypothesis testing, design of therapeutics to regulate BCR surface expression and to annotate patient mutations.
Affiliation(s)
- Samyuktha Ramesh
  - Structural Biology Division, Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
  - Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia
- Margareta Go
  - Structural Biology Division, Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
  - Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia
- Matthew E. Call
  - Structural Biology Division, Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
  - Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia
- Melissa J. Call
  - Structural Biology Division, Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
  - Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia
9. Wu X, Go M, Nguyen JV, Kuchel NW, Lu BGC, Zeglinski K, Lowes KN, Calleja DJ, Mitchell JP, Lessene G, Komander D, Call ME, Call MJ. Mutational profiling of SARS-CoV-2 papain-like protease reveals requirements for function, structure, and drug escape. Nat Commun 2024; 15:6219. [PMID: 39043718 PMCID: PMC11266423 DOI: 10.1038/s41467-024-50566-9] [Received: 02/24/2024] [Accepted: 07/12/2024] [Indexed: 07/25/2024] [Open access]
Abstract
Papain-like protease (PLpro) is an attractive drug target for SARS-CoV-2 because it is essential for viral replication, cleaving viral poly-proteins pp1a and pp1ab, and has de-ubiquitylation and de-ISGylation activities, affecting innate immune responses. We employ Deep Mutational Scanning to evaluate the mutational effects on PLpro enzymatic activity and protein stability in mammalian cells. We confirm features of the active site and identify mutations in neighboring residues that alter activity. We characterize residues responsible for substrate binding and demonstrate that although residues in the blocking loop are remarkably tolerant to mutation, blocking loop flexibility is important for function. We additionally find a connected network of mutations affecting activity that extends far from the active site. We leverage our library to identify drug-escape variants to a common PLpro inhibitor scaffold and predict that plasticity in both the S4 pocket and blocking loop sequence should be considered during the drug design process.
Affiliation(s)
- Xinyu Wu
  - The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
  - Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia
- Margareta Go
  - The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Julie V Nguyen
  - The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
- Nathan W Kuchel
  - The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
  - Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia
- Bernadine G C Lu
  - The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
  - Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia
- Kathleen Zeglinski
  - The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
  - Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia
- Kym N Lowes
  - The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
  - Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia
- Dale J Calleja
  - The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
  - Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia
- Jeffrey P Mitchell
  - The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
  - Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia
- Guillaume Lessene
  - The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
  - Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia
  - Department of Biochemistry and Pharmacology, University of Melbourne, Parkville, VIC, Australia
- David Komander
  - The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
  - Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia
- Matthew E Call
  - The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
  - Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia
- Melissa J Call
  - The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia
  - Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia
10. Schnettler JD, Wang MS, Gantz M, Bunzel HA, Karas C, Hollfelder F, Hecht MH. Selection of a promiscuous minimalist cAMP phosphodiesterase from a library of de novo designed proteins. Nat Chem 2024; 16:1200-1208. [PMID: 38702405 PMCID: PMC11230910 DOI: 10.1038/s41557-024-01490-4] [Received: 02/13/2023] [Accepted: 02/27/2024] [Indexed: 05/06/2024]
Abstract
The ability of unevolved amino acid sequences to become biological catalysts was key to the emergence of life on Earth. However, billions of years of evolution separate complex modern enzymes from their simpler early ancestors. To probe how unevolved sequences can develop new functions, we use ultrahigh-throughput droplet microfluidics to screen for phosphoesterase activity amidst a library of more than one million sequences based on a de novo designed 4-helix bundle. Characterization of hits revealed that acquisition of function involved a large jump in sequence space enriching for truncations that removed >40% of the protein chain. Biophysical characterization of a catalytically active truncated protein revealed that it dimerizes into an α-helical structure, with the gain of function accompanied by increased structural dynamics. The identified phosphodiesterase is a manganese-dependent metalloenzyme that hydrolyses a range of phosphodiesters. It is most active towards cyclic AMP, with a rate acceleration of ~10^9 and a catalytic proficiency of >10^14 M^-1, comparable to larger enzymes shaped by billions of years of evolution.
Affiliation(s)
- Michael S Wang
  - Department of Chemistry, Princeton University, Princeton, USA
- Maximilian Gantz
  - Department of Biochemistry, University of Cambridge, Cambridge, UK
- H Adrian Bunzel
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
- Christina Karas
  - Department of Molecular Biology, Princeton University, Princeton, USA
- Michael H Hecht
  - Department of Chemistry, Princeton University, Princeton, USA
11
|
Bendel AM, Skendo K, Klein D, Shimada K, Kauneckaite-Griguole K, Diss G. Optimization of a deep mutational scanning workflow to improve quantification of mutation effects on protein-protein interactions. BMC Genomics 2024; 25:630. [PMID: 38914936 PMCID: PMC11194945 DOI: 10.1186/s12864-024-10524-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2023] [Accepted: 06/14/2024] [Indexed: 06/26/2024] Open
Abstract
Deep Mutational Scanning (DMS) assays are powerful tools to study sequence-function relationships by measuring the effects of thousands of sequence variants on protein function. During a DMS experiment, several technical artefacts might distort non-linearly the functional score obtained, potentially biasing the interpretation of the results. We therefore tested several technical parameters in the deepPCA workflow, a DMS assay for protein-protein interactions, in order to identify technical sources of non-linearities. We found that parameters common to many DMS assays such as amount of transformed DNA, timepoint of harvest and library composition can cause non-linearities in the data. Designing experiments in a way to minimize these non-linear effects will improve the quantification and interpretation of mutation effects.
Affiliation(s)
- Alexandra M Bendel
- Friedrich Miescher Institute for Biomedical Research (FMI), Basel, Switzerland
- University of Basel, Basel, Switzerland
- Dominique Klein
- Friedrich Miescher Institute for Biomedical Research (FMI), Basel, Switzerland
- Kenji Shimada
- Friedrich Miescher Institute for Biomedical Research (FMI), Basel, Switzerland
- Kotryna Kauneckaite-Griguole
- Friedrich Miescher Institute for Biomedical Research (FMI), Basel, Switzerland
- University of Basel, Basel, Switzerland
- Guillaume Diss
- Friedrich Miescher Institute for Biomedical Research (FMI), Basel, Switzerland.
|
12
|
Chen SK, Liu J, Van Nynatten A, Tudor-Price BM, Chang BSW. Sampling Strategies for Experimentally Mapping Molecular Fitness Landscapes Using High-Throughput Methods. J Mol Evol 2024:10.1007/s00239-024-10179-8. [PMID: 38886207 DOI: 10.1007/s00239-024-10179-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2024] [Accepted: 05/20/2024] [Indexed: 06/20/2024]
Abstract
Empirical studies of genotype-phenotype-fitness maps of proteins are fundamental to understanding the evolutionary process, in elucidating the space of possible genotypes accessible through mutations in a landscape of phenotypes and fitness effects. Yet, comprehensively mapping molecular fitness landscapes remains challenging since all possible combinations of amino acid substitutions for even a few protein sites are encoded by an enormous genotype space. High-throughput mapping of genotype space can be achieved using large-scale screening experiments known as multiplexed assays of variant effect (MAVEs). However, to accommodate such multi-mutational studies, the size of MAVEs has grown to the point where a priori determination of sampling requirements is needed. To address this problem, we propose calculations and simulation methods to approximate minimum sampling requirements for multi-mutational MAVEs, which we combine with a new library construction protocol to experimentally validate our approximation approaches. Analysis of our simulated data reveals how sampling trajectories differ between simulations of nucleotide versus amino acid variants and among mutagenesis schemes. For this, we show quantitatively that marginal gains in sampling efficiency demand increasingly greater sampling effort when sampling for nucleotide sequences over their encoded amino acid equivalents. We present a new library construction protocol that efficiently maximizes sequence variation, and demonstrate using ultradeep sequencing that the library encodes virtually all possible combinations of mutations within the experimental design. Insights learned from our analyses together with the methodological advances reported herein are immediately applicable toward pooled experimental screens of arbitrary design, enabling further assay upscaling and expanded testing of genotype space.
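As a rough illustration of why sampling requirements grow so quickly with library size, the sketch below estimates how many reads are needed to observe a given fraction of a combinatorial library. It is not the calculation proposed in the paper: it assumes every variant is equally abundant and sampled independently with replacement, and the function names (`library_size`, `expected_coverage`, `samples_for_coverage`) are hypothetical.

```python
import math


def library_size(sites: int, states_per_site: int) -> int:
    """Number of distinct variants when every site can take every state."""
    return states_per_site ** sites


def expected_coverage(num_variants: int, num_samples: int) -> float:
    """Expected fraction of variants seen at least once.

    Assumption: uniform sampling with replacement, so each variant is
    missed by a single draw with probability (1 - 1/num_variants).
    """
    return 1.0 - (1.0 - 1.0 / num_variants) ** num_samples


def samples_for_coverage(num_variants: int, target: float) -> int:
    """Smallest sample count whose expected coverage reaches `target`."""
    if not 0.0 < target < 1.0:
        raise ValueError("target must lie strictly between 0 and 1")
    if num_variants < 2:
        return 1  # a single-variant library is covered by one draw
    miss_probability = 1.0 - 1.0 / num_variants
    return math.ceil(math.log(1.0 - target) / math.log(miss_probability))


if __name__ == "__main__":
    # Three fully randomized sites, counted at the amino acid level (20 states)
    # and at the codon level (64 states).
    for label, states in (("amino acid", 20), ("codon", 64)):
        size = library_size(3, states)
        needed = samples_for_coverage(size, 0.95)
        print(f"{label}: {size} variants, about {needed} samples for 95% coverage")
```

Real libraries are rarely uniform, so figures from a model like this are best read as lower bounds on the sequencing depth required.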
Affiliation(s)
- Steven K Chen
- Department of Cell & Systems Biology, University of Toronto, Toronto, ON, Canada
- Jing Liu
- Department of Cell & Systems Biology, University of Toronto, Toronto, ON, Canada
- Alexander Van Nynatten
- Department of Biological Science, University of Toronto Scarborough, Toronto, ON, Canada
- Belinda S W Chang
- Department of Cell & Systems Biology, University of Toronto, Toronto, ON, Canada.
- Department of Ecology & Evolutionary Biology, University of Toronto, Toronto, ON, Canada.
- Centre for the Analysis of Genome Evolution & Function, University of Toronto, Toronto, ON, Canada.
|
13
|
Sung AY, Guerra RM, Steenberge LH, Alston CL, Murayama K, Okazaki Y, Shimura M, Prokisch H, Ghezzi D, Torraco A, Carrozzo R, Rötig A, Taylor RW, Keck JL, Pagliarini DJ. Systematic analysis of NDUFAF6 in complex I assembly and mitochondrial disease. Nat Metab 2024; 6:1128-1142. [PMID: 38720117 PMCID: PMC11395703 DOI: 10.1038/s42255-024-01039-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2023] [Accepted: 03/28/2024] [Indexed: 06/27/2024]
Abstract
Isolated complex I (CI) deficiencies are a major cause of primary mitochondrial disease. A substantial proportion of CI deficiencies are believed to arise from defects in CI assembly factors (CIAFs) that are not part of the CI holoenzyme. The biochemistry of these CIAFs is poorly defined, making their role in CI assembly unclear, and confounding interpretation of potential disease-causing genetic variants. To address these challenges, we devised a deep mutational scanning approach to systematically assess the function of thousands of NDUFAF6 genetic variants. Guided by these data, biochemical analyses and cross-linking mass spectrometry, we discovered that the CIAF NDUFAF6 facilitates incorporation of NDUFS8 into CI and reveal that NDUFS8 overexpression rectifies NDUFAF6 deficiency. Our data further provide experimental support of pathogenicity for seven novel NDUFAF6 variants associated with human pathology and introduce functional evidence for over 5,000 additional variants. Overall, our work defines the molecular function of NDUFAF6 and provides a clinical resource for aiding diagnosis of NDUFAF6-related diseases.
Affiliation(s)
- Andrew Y Sung
- Department of Biomolecular Chemistry, University of Wisconsin School of Medicine and Public Health, Madison, WI, USA
- Rachel M Guerra
- Department of Cell Biology and Physiology, Washington University School of Medicine in St. Louis, St. Louis, MO, USA
- Laura H Steenberge
- Department of Biochemistry, University of Wisconsin-Madison, Madison, WI, USA
- Charlotte L Alston
- Mitochondrial Research Group, Translational and Clinical Research Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, UK
- NHS Highly Specialised Service for Rare Mitochondrial Disorders, Newcastle upon Tyne Hospitals NHS Foundation Trust, Newcastle upon Tyne, UK
- Kei Murayama
- Department of Metabolism, Chiba Children's Hospital, Chiba, Japan
- Diagnostics and Therapeutic of Intractable Diseases, Intractable Disease Research Center, Graduate School of Medicine, Juntendo University, Tokyo, Japan
- Yasushi Okazaki
- Diagnostics and Therapeutic of Intractable Diseases, Intractable Disease Research Center, Graduate School of Medicine, Juntendo University, Tokyo, Japan
- Masaru Shimura
- Department of Metabolism, Chiba Children's Hospital, Chiba, Japan
- Institute of Neurogenomics, Computational Health Center, Helmholtz Zentrum München, Neuherberg, Germany
- Holger Prokisch
- Institute of Neurogenomics, Computational Health Center, Helmholtz Zentrum München, Neuherberg, Germany
- School of Medicine, Institute of Human Genetics, Technical University of Munich, Munich, Germany
- Daniele Ghezzi
- Department of Pathophysiology and Transplantation, University of Milan, Milan, Italy
- Medical Genetics and Neurogenetics Unit, Fondazione IRCCS Instituto Neurologico Carlo Besta, Milan, Italy
- Alessandra Torraco
- Unit of Cell Biology and Diagnosis of Mitochondrial Disorders, Laboratory of Medical Genetics, Bambino Gesù Children's Hospital, IRCCS, Rome, Italy
- Rosalba Carrozzo
- Unit of Cell Biology and Diagnosis of Mitochondrial Disorders, Laboratory of Medical Genetics, Bambino Gesù Children's Hospital, IRCCS, Rome, Italy
- Agnès Rötig
- Université Paris Cité, Imagine Institute, INSERM UMR 1163, Paris, France
- Robert W Taylor
- Mitochondrial Research Group, Translational and Clinical Research Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, UK
- NHS Highly Specialised Service for Rare Mitochondrial Disorders, Newcastle upon Tyne Hospitals NHS Foundation Trust, Newcastle upon Tyne, UK
- James L Keck
- Department of Biomolecular Chemistry, University of Wisconsin School of Medicine and Public Health, Madison, WI, USA
- David J Pagliarini
- Department of Cell Biology and Physiology, Washington University School of Medicine in St. Louis, St. Louis, MO, USA.
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO, USA.
- Department of Genetics, Washington University School of Medicine, St. Louis, MO, USA.
|
14
|
Faure AJ, Lehner B, Miró Pina V, Serrano Colome C, Weghorn D. An extension of the Walsh-Hadamard transform to calculate and model epistasis in genetic landscapes of arbitrary shape and complexity. PLoS Comput Biol 2024; 20:e1012132. [PMID: 38805561 PMCID: PMC11161127 DOI: 10.1371/journal.pcbi.1012132] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2023] [Revised: 06/07/2024] [Accepted: 05/04/2024] [Indexed: 05/30/2024] Open
Abstract
Accurate models describing the relationship between genotype and phenotype are necessary in order to understand and predict how mutations to biological sequences affect the fitness and evolution of living organisms. The apparent abundance of epistasis (genetic interactions), both between and within genes, complicates this task and how to build mechanistic models that incorporate epistatic coefficients (genetic interaction terms) is an open question. The Walsh-Hadamard transform represents a rigorous computational framework for calculating and modeling epistatic interactions at the level of individual genotypic values (known as genetical, biological or physiological epistasis), and can therefore be used to address fundamental questions related to sequence-to-function encodings. However, one of its main limitations is that it can only accommodate two alleles (amino acid or nucleotide states) per sequence position. In this paper we provide an extension of the Walsh-Hadamard transform that allows the calculation and modeling of background-averaged epistasis (also known as ensemble epistasis) in genetic landscapes with an arbitrary number of states per position (20 for amino acids, 4 for nucleotides, etc.). We also provide a recursive formula for the inverse matrix and then derive formulae to directly extract any element of either matrix without having to rely on the computationally intensive task of constructing or inverting large matrices. Finally, we demonstrate the utility of our theory by using it to model epistasis within both simulated and empirical multiallelic fitness landscapes, revealing that both pairwise and higher-order genetic interactions are enriched between physically interacting positions.
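For orientation, the two-allele case that the abstract describes as the starting point can be sketched in a few lines. The code below is not the authors' multi-state extension: it applies the standard Walsh-Hadamard transform to a complete biallelic landscape and rescales each coefficient so that it reads as a background-averaged effect. The function name and the genotype indexing convention (bit i of the list index is 1 when site i carries the mutant allele) are assumptions made for this example.

```python
def background_averaged_epistasis(phenotypes):
    """Background-averaged epistatic coefficients of a biallelic landscape.

    `phenotypes` must hold 2**n values, one per genotype, indexed so that
    bit i of the index is 1 when site i carries the mutant allele.
    Coefficient k describes the interaction among the sites whose bits are
    set in k (k = 0 gives the mean phenotype).
    """
    size = len(phenotypes)
    if size == 0 or size & (size - 1):
        raise ValueError("need a complete landscape of 2**n phenotypes")
    coeffs = [float(value) for value in phenotypes]

    # In-place fast Walsh-Hadamard transform (natural ordering).
    half = 1
    while half < size:
        for start in range(0, size, 2 * half):
            for i in range(start, start + half):
                a, b = coeffs[i], coeffs[i + half]
                coeffs[i], coeffs[i + half] = a + b, a - b
        half *= 2

    # Rescale: flip the sign once per interacting site and average over the
    # 2**(n - order) backgrounds formed by the remaining sites.
    num_sites = size.bit_length() - 1
    for index in range(size):
        order = bin(index).count("1")
        coeffs[index] *= (-1) ** order / 2 ** (num_sites - order)
    return coeffs


if __name__ == "__main__":
    # Two sites with additive effects 1 and 2, plus an extra +2 in the double mutant.
    landscape = [0.0, 1.0, 2.0, 5.0]  # genotypes 00, site 0 only, site 1 only, both
    print(background_averaged_epistasis(landscape))
```

For a purely additive landscape the pairwise coefficient is zero, which makes a convenient sanity check before applying the transform to measured data.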
Affiliation(s)
- Andre J. Faure
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Spain
- Ben Lehner
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- ICREA, Pg. Lluis Companys 23, Barcelona 08010, Spain
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, United Kingdom
- Verónica Miró Pina
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Spain
- Claudia Serrano Colome
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Spain
- Donate Weghorn
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona 08003, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
|
15
|
Zarin T, Lehner B. A complete map of specificity encoding for a partially fuzzy protein interaction. bioRxiv 2024:2024.04.25.591103. [PMID: 38712134 PMCID: PMC11071492 DOI: 10.1101/2024.04.25.591103] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/08/2024]
Abstract
Thousands of human proteins function by binding short linear motifs embedded in intrinsically disordered regions. How affinity and specificity are encoded in these binding domains and the motifs themselves is not well understood. The evolvability of binding specificity - how rapidly and extensively it can change upon mutation - is also largely unexplored, as is the contribution of 'fuzzy' dynamic residues to affinity and specificity in protein-protein interactions. Here we report the first complete map of specificity encoding for a globular protein domain. Quantifying >200,000 energetic interactions between a PDZ domain and its ligand identifies 20 major energetically coupled pairs of sites that control specificity. These are organized into six modules, with most mutations in each module reprogramming specificity for a single position in the ligand. Nine of the major energetic couplings controlling specificity are between structural contacts and 11 have an allosteric mechanism of action. The dynamic tail of the ligand is more robust to mutation than the structured residues but contributes additively to binding affinity and communicates with structured residues to enable changes in specificity. Our results quantify the binding specificities of >1,800 globular proteins to reveal how specificity is encoded and provide a direct comparison of the encoding of affinity and specificity in structured and dynamic molecular recognition.
Affiliation(s)
- Taraneh Zarin
- Centre for Genomic Regulation (CRG), Barcelona Institute for Science and Technology (BIST), Barcelona, Spain
- Ben Lehner
- Centre for Genomic Regulation (CRG), Barcelona Institute for Science and Technology (BIST), Barcelona, Spain
- Wellcome Sanger Institute, Cambridge, UK
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
|
16
|
Claussnitzer M, Parikh VN, Wagner AH, Arbesfeld JA, Bult CJ, Firth HV, Muffley LA, Nguyen Ba AN, Riehle K, Roth FP, Tabet D, Bolognesi B, Glazer AM, Rubin AF. Minimum information and guidelines for reporting a multiplexed assay of variant effect. Genome Biol 2024; 25:100. [PMID: 38641812 PMCID: PMC11027375 DOI: 10.1186/s13059-024-03223-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2023] [Accepted: 03/25/2024] [Indexed: 04/21/2024] Open
Abstract
Multiplexed assays of variant effect (MAVEs) have emerged as a powerful approach for interrogating thousands of genetic variants in a single experiment. The flexibility and widespread adoption of these techniques across diverse disciplines have led to a heterogeneous mix of data formats and descriptions, which complicates the downstream use of the resulting datasets. To address these issues and promote reproducibility and reuse of MAVE data, we define a set of minimum information standards for MAVE data and metadata and outline a controlled vocabulary aligned with established biomedical ontologies for describing these experimental designs.
Affiliation(s)
- Melina Claussnitzer
- The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Center for Genomic Medicine, Massachusetts General Hospital, Harvard Medical School, Cambridge, MA, 02142, USA
- Victoria N Parikh
- Stanford Center for Inherited Cardiovascular Disease, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Alex H Wagner
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, 43215, USA
- Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH, 43210, USA
- Jeremy A Arbesfeld
- The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, 43215, USA
- Department of Biomedical Informatics, The Ohio State University, Columbus, OH, 43210, USA
- Carol J Bult
- The Jackson Laboratory, Bar Harbor, ME, 04609, USA
- Helen V Firth
- Wellcome Sanger Institute, Hinxton, Cambridge, UK
- Dept of Medical Genetics, Cambridge University Hospitals NHS Trust, Cambridge, UK
- Lara A Muffley
- Department of Genome Sciences, University of Washington, Seattle, WA, 98105, USA
- Alex N Nguyen Ba
- Department of Biology, University of Toronto at Mississauga, Mississauga, ON, Canada
- Kevin Riehle
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA
- Frederick P Roth
- Donnelly Centre, University of Toronto, Toronto, ON, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Department of Computer Science, University of Toronto, Toronto, ON, Canada
- Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
- Daniel Tabet
- Donnelly Centre, University of Toronto, Toronto, ON, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Department of Computer Science, University of Toronto, Toronto, ON, Canada
- Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
- Benedetta Bolognesi
- Institute for Bioengineering of Catalunya (IBEC), The Barcelona Institute of Science and Technology, Barcelona, Spain.
- Andrew M Glazer
- Vanderbilt University Medical Center, Nashville, TN, 37232, USA.
- Alan F Rubin
- Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, Australia.
- Department of Medical Biology, University of Melbourne, Parkville, VIC, Australia.
|
17
|
McDermott SM, Pham V, Oliver B, Carnes J, Sather DN, Stuart KD. Deep mutational scanning of the RNase III-like domain in Trypanosoma brucei RNA editing protein KREPB4. Front Cell Infect Microbiol 2024; 14:1381155. [PMID: 38650737 PMCID: PMC11033214 DOI: 10.3389/fcimb.2024.1381155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2024] [Accepted: 03/14/2024] [Indexed: 04/25/2024] Open
Abstract
Kinetoplastid pathogens including Trypanosoma brucei, T. cruzi, and Leishmania species, are early diverged, eukaryotic, unicellular parasites. Functional understanding of many proteins from these pathogens has been hampered by limited sequence homology to proteins from other model organisms. Here we describe the development of a high-throughput deep mutational scanning approach in T. brucei that facilitates rapid and unbiased assessment of the impacts of many possible amino acid substitutions within a protein on cell fitness, as measured by relative cell growth. The approach leverages several molecular technologies: cells with conditional expression of a wild-type gene of interest and constitutive expression of a library of mutant variants, degron-controlled stabilization of I-SceI meganuclease to mediate highly efficient transfection of a mutant allele library, and a high-throughput sequencing readout for cell growth upon conditional knockdown of wild-type gene expression and exclusive expression of mutant variants. Using this method, we queried the effects of amino acid substitutions in the apparently non-catalytic RNase III-like domain of KREPB4 (B4), which is an essential component of the RNA Editing Catalytic Complexes (RECCs) that carry out mitochondrial RNA editing in T. brucei. We measured the impacts of thousands of B4 variants on bloodstream form cell growth and validated the most deleterious variants containing single amino acid substitutions. Crucially, there was no correlation between phenotypes and amino acid conservation, demonstrating the greater power of this method over traditional sequence homology searching to identify functional residues. The bloodstream form cell growth phenotypes were combined with structural modeling, RECC protein proximity data, and analysis of selected substitutions in procyclic form T. brucei. 
These analyses revealed that the B4 RNaseIII-like domain is essential for maintenance of RECC integrity and RECC protein abundances and is also involved in changes in RECCs that occur between bloodstream and procyclic form life cycle stages.
Affiliation(s)
- Suzanne M. McDermott
- Center for Global Infectious Disease Research, Seattle Children’s Research Institute, Seattle, WA, United States
- Department of Pediatrics, University of Washington School of Medicine, Seattle, WA, United States
- Vy Pham
- Center for Global Infectious Disease Research, Seattle Children’s Research Institute, Seattle, WA, United States
- Brian Oliver
- Center for Global Infectious Disease Research, Seattle Children’s Research Institute, Seattle, WA, United States
- Jason Carnes
- Center for Global Infectious Disease Research, Seattle Children’s Research Institute, Seattle, WA, United States
- D. Noah Sather
- Center for Global Infectious Disease Research, Seattle Children’s Research Institute, Seattle, WA, United States
- Department of Pediatrics, University of Washington School of Medicine, Seattle, WA, United States
- Kenneth D. Stuart
- Center for Global Infectious Disease Research, Seattle Children’s Research Institute, Seattle, WA, United States
- Department of Pediatrics, University of Washington School of Medicine, Seattle, WA, United States
|
18
|
Chan CWF, Wang B, Nan L, Huang X, Mao T, Chu HY, Luo C, Chu H, Choi GCG, Shum HC, Wong ASL. High-throughput screening of genetic and cellular drivers of syncytium formation induced by the spike protein of SARS-CoV-2. Nat Biomed Eng 2024; 8:291-309. [PMID: 37996617 DOI: 10.1038/s41551-023-01140-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/21/2022] [Accepted: 10/18/2023] [Indexed: 11/25/2023]
Abstract
Mapping mutations and discovering cellular determinants that cause the spike protein of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) to induce infected cells to form syncytia would facilitate the development of strategies for blocking the formation of such cell-cell fusion. Here we describe high-throughput screening methods based on droplet microfluidics and the size-exclusion selection of syncytia, coupled with large-scale mutagenesis and genome-wide knockout screening via clustered regularly interspaced short palindromic repeats (CRISPR), for the large-scale identification of determinants of cell-cell fusion. We used the methods to perform deep mutational scans in spike-presenting cells to pinpoint mutable syncytium-enhancing substitutions in two regions of the spike protein (the fusion peptide proximal region and the furin-cleavage site). We also used a genome-wide CRISPR screen in cells expressing the receptor angiotensin-converting enzyme 2 to identify inhibitors of clathrin-mediated endocytosis that impede syncytium formation, which we validated in hamsters infected with SARS-CoV-2. Finding genetic and cellular determinants of the formation of syncytia may reveal insights into the physiological and pathological consequences of cell-cell fusion.
Affiliation(s)
- Charles W F Chan
- Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Pokfulam, Hong Kong SAR, China
- Centre for Oncology and Immunology, Hong Kong Science Park, Shatin, Hong Kong SAR, China
- Bei Wang
- Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Pokfulam, Hong Kong SAR, China
- Centre for Oncology and Immunology, Hong Kong Science Park, Shatin, Hong Kong SAR, China
- Lang Nan
- Department of Mechanical Engineering, The University of Hong Kong, Pokfulam, Hong Kong SAR, China
- Advanced Biomedical Instrumentation Centre, Hong Kong Science Park, Shatin, Hong Kong SAR, China
- Xiner Huang
- State Key Laboratory of Emerging Infectious Diseases, Department of Microbiology, The University of Hong Kong, Pokfulam, Hong Kong SAR, China
- Tianjiao Mao
- Department of Mechanical Engineering, The University of Hong Kong, Pokfulam, Hong Kong SAR, China
- Advanced Biomedical Instrumentation Centre, Hong Kong Science Park, Shatin, Hong Kong SAR, China
- Hoi Yee Chu
- Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Pokfulam, Hong Kong SAR, China
- Centre for Oncology and Immunology, Hong Kong Science Park, Shatin, Hong Kong SAR, China
- Cuiting Luo
- State Key Laboratory of Emerging Infectious Diseases, Department of Microbiology, The University of Hong Kong, Pokfulam, Hong Kong SAR, China
- Hin Chu
- State Key Laboratory of Emerging Infectious Diseases, Department of Microbiology, The University of Hong Kong, Pokfulam, Hong Kong SAR, China.
- Centre for Virology, Vaccinology and Therapeutics, Hong Kong Science Park, Shatin, Hong Kong SAR, China.
- Department of Infectious Disease and Microbiology, The University of Hong Kong-Shenzhen Hospital, Shenzhen, People's Republic of China.
- Gigi C G Choi
- Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Pokfulam, Hong Kong SAR, China.
- Centre for Oncology and Immunology, Hong Kong Science Park, Shatin, Hong Kong SAR, China.
- Department of Mechanical Engineering, The University of Hong Kong, Pokfulam, Hong Kong SAR, China.
- Ho Cheung Shum
- Department of Mechanical Engineering, The University of Hong Kong, Pokfulam, Hong Kong SAR, China.
- Advanced Biomedical Instrumentation Centre, Hong Kong Science Park, Shatin, Hong Kong SAR, China.
- Alan S L Wong
- Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Pokfulam, Hong Kong SAR, China.
- Centre for Oncology and Immunology, Hong Kong Science Park, Shatin, Hong Kong SAR, China.
|
19
|
Sesta L, Pagnani A, Fernandez-de-Cossio-Diaz J, Uguzzoni G. Inference of annealed protein fitness landscapes with AnnealDCA. PLoS Comput Biol 2024; 20:e1011812. [PMID: 38377054 PMCID: PMC10878520 DOI: 10.1371/journal.pcbi.1011812] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2023] [Accepted: 01/08/2024] [Indexed: 02/22/2024] Open
Abstract
The design of proteins with specific tasks is a major challenge in molecular biology with important diagnostic and therapeutic applications. High-throughput screening methods have been developed to systematically evaluate protein activity, but only a small fraction of possible protein variants can be tested using these techniques. Computational models that explore the sequence space in-silico to identify the fittest molecules for a given function are needed to overcome this limitation. In this article, we propose AnnealDCA, a machine-learning framework to learn the protein fitness landscape from sequencing data derived from a broad range of experiments that use selection and sequencing to quantify protein activity. We demonstrate the effectiveness of our method by applying it to antibody Rep-Seq data of immunized mice and screening experiments, assessing the quality of the fitness landscape reconstructions. Our method can be applied to several experimental cases where a population of protein variants undergoes various rounds of selection and sequencing, without relying on the computation of variants enrichment ratios, and thus can be used even in cases of disjoint sequence samples.
Affiliation(s)
- Luca Sesta
- Department of Applied Science and Technology, Politecnico di Torino, Torino, Italy
- Andrea Pagnani
- Department of Applied Science and Technology, Politecnico di Torino, Torino, Italy
- Italian Institute for Genomic Medicine, Torino, Italy
- INFN, Sezione di Torino, Torino, Italy
|
20
|
Weng C, Faure AJ, Escobedo A, Lehner B. The energetic and allosteric landscape for KRAS inhibition. Nature 2024; 626:643-652. [PMID: 38109937 PMCID: PMC10866706 DOI: 10.1038/s41586-023-06954-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2022] [Accepted: 12/07/2023] [Indexed: 12/20/2023]
Abstract
Thousands of proteins have been validated genetically as therapeutic targets for human diseases1. However, very few have been successfully targeted, and many are considered 'undruggable'. This is particularly true for proteins that function via protein-protein interactions-direct inhibition of binding interfaces is difficult and requires the identification of allosteric sites. However, most proteins have no known allosteric sites, and a comprehensive allosteric map does not exist for any protein. Here we address this shortcoming by charting multiple global atlases of inhibitory allosteric communication in KRAS. We quantified the effects of more than 26,000 mutations on the folding of KRAS and its binding to six interaction partners. Genetic interactions in double mutants enabled us to perform biophysical measurements at scale, inferring more than 22,000 causal free energy changes. These energy landscapes quantify how mutations tune the binding specificity of a signalling protein and map the inhibitory allosteric sites for an important therapeutic target. Allosteric propagation is particularly effective across the central β-sheet of KRAS, and multiple surface pockets are genetically validated as allosterically active, including a distal pocket in the C-terminal lobe of the protein. Allosteric mutations typically inhibit binding to all tested effectors, but they can also change the binding specificity, revealing the regulatory, evolutionary and therapeutic potential to tune pathway activation. Using the approach described here, it should be possible to rapidly and comprehensively identify allosteric target sites in many proteins.
Affiliation(s)
- Chenchun Weng
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Andre J Faure
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Albert Escobedo
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Ben Lehner
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
  - University Pompeu Fabra (UPF), Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK

21
Nemoto T, Ocari T, Planul A, Tekinsoy M, Zin EA, Dalkara D, Ferrari U. ACIDES: on-line monitoring of forward genetic screens for protein engineering. Nat Commun 2023; 14:8504. [PMID: 38148337 PMCID: PMC10751290 DOI: 10.1038/s41467-023-43967-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2023] [Accepted: 11/24/2023] [Indexed: 12/28/2023] Open
Abstract
Forward genetic screens of mutated variants are a versatile strategy for protein engineering and investigation, which has been successfully applied to various studies like directed evolution (DE) and deep mutational scanning (DMS). While next-generation sequencing can track millions of variants during the screening rounds, the vast and noisy nature of the sequencing data impedes the estimation of the performance of individual variants. Here, we propose ACIDES that combines statistical inference and in-silico simulations to improve performance estimation in the library selection process by attributing accurate statistical scores to individual variants. We tested ACIDES first on a random-peptide-insertion experiment and then on multiple public datasets from DE and DMS studies. ACIDES allows experimentalists to reliably estimate variant performance on the fly and can aid protein engineering and research pipelines in a range of applications, including gene therapy.
Affiliation(s)
- Takahiro Nemoto
  - Institut de la Vision, Sorbonne Université, INSERM, CNRS, 17 rue Moreau, 75012, Paris, France
  - Graduate School of Informatics, Kyoto University, Yoshida Hon-machi, Sakyo-ku, Kyoto, 606-8501, Japan
  - Premium Research Institute for Human Metaverse Medicine (WPI-PRIMe), Osaka University, Suita, Osaka, 565-0871, Japan
- Tommaso Ocari
  - Institut de la Vision, Sorbonne Université, INSERM, CNRS, 17 rue Moreau, 75012, Paris, France
- Arthur Planul
  - Institut de la Vision, Sorbonne Université, INSERM, CNRS, 17 rue Moreau, 75012, Paris, France
- Muge Tekinsoy
  - Institut de la Vision, Sorbonne Université, INSERM, CNRS, 17 rue Moreau, 75012, Paris, France
- Emilia A Zin
  - Institut de la Vision, Sorbonne Université, INSERM, CNRS, 17 rue Moreau, 75012, Paris, France
- Deniz Dalkara
  - Institut de la Vision, Sorbonne Université, INSERM, CNRS, 17 rue Moreau, 75012, Paris, France
- Ulisse Ferrari
  - Institut de la Vision, Sorbonne Université, INSERM, CNRS, 17 rue Moreau, 75012, Paris, France

22
Maes S, Deploey N, Peelman F, Eyckerman S. Deep mutational scanning of proteins in mammalian cells. Cell Rep Methods 2023; 3:100641. [PMID: 37963462 PMCID: PMC10694495 DOI: 10.1016/j.crmeth.2023.100641] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2023] [Revised: 07/06/2023] [Accepted: 10/20/2023] [Indexed: 11/16/2023]
Abstract
Protein mutagenesis is essential for unveiling the molecular mechanisms underlying protein function in health, disease, and evolution. In the past decade, deep mutational scanning methods have evolved to support the functional analysis of nearly all possible single-amino acid changes in a protein of interest. While historically these methods were developed in lower organisms such as E. coli and yeast, recent technological advancements have resulted in the increased use of mammalian cells, particularly for studying proteins involved in human disease. These advancements will aid significantly in the classification and interpretation of variants of unknown significance, which are being discovered at large scale due to the current surge in the use of whole-genome sequencing in clinical contexts. Here, we explore the experimental aspects of deep mutational scanning studies in mammalian cells and report the different methods used in each step of the workflow, ultimately providing a useful guide toward the design of such studies.
Affiliation(s)
- Stefanie Maes
  - VIB Center for Medical Biotechnology (CMB), Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium; Department of Biochemistry and Microbiology, Ghent University, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium; Department of Biomolecular Medicine, Ghent University, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
- Nick Deploey
  - VIB Center for Medical Biotechnology (CMB), Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium; Department of Biomolecular Medicine, Ghent University, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
- Frank Peelman
  - VIB Center for Medical Biotechnology (CMB), Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium; Department of Biomolecular Medicine, Ghent University, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium
- Sven Eyckerman
  - VIB Center for Medical Biotechnology (CMB), Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium; Department of Biomolecular Medicine, Ghent University, Technologiepark-Zwijnaarde 75, 9052 Ghent, Belgium

23
Mighell TL, Toledano I, Lehner B. SUNi mutagenesis: Scalable and uniform nicking for efficient generation of variant libraries. PLoS One 2023; 18:e0288158. [PMID: 37418460 DOI: 10.1371/journal.pone.0288158] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2023] [Accepted: 06/20/2023] [Indexed: 07/09/2023] Open
Abstract
Multiplexed assays of variant effects (MAVEs) have made possible the functional assessment of all possible mutations to genes and regulatory sequences. A core pillar of the approach is generation of variant libraries, but current methods are either difficult to scale or not uniform enough to enable MAVEs at the scale of gene families or beyond. We present an improved method called Scalable and Uniform Nicking (SUNi) mutagenesis that combines massive scalability with high uniformity to enable cost-effective MAVEs of gene families and eventually genomes.
Affiliation(s)
- Taylor L Mighell
  - The Barcelona Institute of Science and Technology, Center for Genomic Regulation (CRG), Barcelona, Spain
- Ignasi Toledano
  - The Barcelona Institute of Science and Technology, Center for Genomic Regulation (CRG), Barcelona, Spain
- Ben Lehner
  - The Barcelona Institute of Science and Technology, Center for Genomic Regulation (CRG), Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, United Kingdom

24
Soneson C, Bendel AM, Diss G, Stadler MB. mutscan-a flexible R package for efficient end-to-end analysis of multiplexed assays of variant effect data. Genome Biol 2023; 24:132. [PMID: 37264470 DOI: 10.1186/s13059-023-02967-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2022] [Accepted: 05/10/2023] [Indexed: 06/03/2023] Open
Abstract
Multiplexed assays of variant effect (MAVE) experimentally measure the effect of large numbers of sequence variants by selective enrichment of sequences with desirable properties followed by quantification by sequencing. mutscan is an R package for flexible analysis of such experiments, covering the entire workflow from raw reads up to statistical analysis and visualization. The core components are implemented in C++ for efficiency. Various experimental designs are supported, including single or paired reads with optional unique molecular identifiers. To find variants with changed relative abundance, mutscan employs established statistical models provided in the edgeR and limma packages. mutscan is available from https://github.com/fmicompbio/mutscan.
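As a rough illustration of what "changed relative abundance" means in such an experiment, the sketch below computes a per-variant log2 enrichment relative to wild type from read counts before and after selection. This is not mutscan's method (mutscan is an R package that relies on edgeR and limma); the function name, the pseudocount and the example counts are hypothetical.

```python
import math


def log2_enrichment(counts_in, counts_out, wt="WT", pseudocount=0.5):
    """Return log2(output/input) per variant, normalised to the wild-type ratio."""
    wt_ratio = (counts_out[wt] + pseudocount) / (counts_in[wt] + pseudocount)
    scores = {}
    for variant, n_in in counts_in.items():
        n_out = counts_out.get(variant, 0)  # variants lost during selection count as 0
        ratio = (n_out + pseudocount) / (n_in + pseudocount)
        scores[variant] = math.log2(ratio / wt_ratio)
    return scores


# Hypothetical read counts before and after selection.
counts_in = {"WT": 1000, "A24G": 500, "K30E": 400}
counts_out = {"WT": 2000, "A24G": 250, "K30E": 1600}

for variant, score in log2_enrichment(counts_in, counts_out).items():
    print(variant, round(score, 2))
```

A dedicated statistical model is still preferable for real data, because a plain ratio ignores replicate variability and the higher noise of low-count variants.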
Affiliation(s)
- Charlotte Soneson
  - Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Basel, Switzerland
- Alexandra M Bendel
  - Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland
- Guillaume Diss
  - Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland
- Michael B Stadler
  - Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland
  - SIB Swiss Institute of Bioinformatics, Basel, Switzerland
  - University of Basel, Basel, Switzerland

25
Hoskins I, Sun S, Cote A, Roth FP, Cenik C. satmut_utils: a simulation and variant calling package for multiplexed assays of variant effect. Genome Biol 2023; 24:82. [PMID: 37081510 PMCID: PMC10116734 DOI: 10.1186/s13059-023-02922-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/26/2022] [Accepted: 04/04/2023] [Indexed: 04/22/2023] Open
Abstract
The impact of millions of individual genetic variants on molecular phenotypes in coding sequences remains unknown. Multiplexed assays of variant effect (MAVEs) are scalable methods to annotate relevant variants, but existing software lacks standardization, requires cumbersome configuration, and does not scale to large targets. We present satmut_utils as a flexible solution for simulation and variant quantification. We then benchmark MAVE software using simulated and real MAVE data. We finally determine mRNA abundance for thousands of cystathionine beta-synthase variants using two experimental methods. The satmut_utils package enables high-performance analysis of MAVEs and reveals the capability of variants to alter mRNA abundance.
Affiliation(s)
- Ian Hoskins
  - Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, 78712, USA
- Song Sun
  - The Donnelly Centre and Departments of Molecular Genetics and Computer Science, University of Toronto, Toronto, ON, Canada
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
- Atina Cote
  - The Donnelly Centre and Departments of Molecular Genetics and Computer Science, University of Toronto, Toronto, ON, Canada
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
- Frederick P Roth
  - The Donnelly Centre and Departments of Molecular Genetics and Computer Science, University of Toronto, Toronto, ON, Canada
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, ON, Canada
- Can Cenik
  - Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, 78712, USA

26
Wei H, Li X. Deep mutational scanning: A versatile tool in systematically mapping genotypes to phenotypes. Front Genet 2023; 14:1087267. [PMID: 36713072 PMCID: PMC9878224 DOI: 10.3389/fgene.2023.1087267] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 11/02/2022] [Accepted: 01/02/2023] [Indexed: 01/13/2023] Open
Abstract
Unveiling how genetic variations lead to phenotypic variations is one of the key questions in evolutionary biology, genetics, and biomedical research. Deep mutational scanning (DMS) technology has allowed the mapping of tens of thousands of genetic variations to phenotypic variations efficiently and economically. Since its first systematic introduction about a decade ago, we have witnessed the use of deep mutational scanning in many research areas leading to scientific breakthroughs. Also, the methods in each step of deep mutational scanning have become much more versatile thanks to the oligo-synthesizing technology, high-throughput phenotyping methods and deep sequencing technology. However, each specific possible step of deep mutational scanning has its pros and cons, and some limitations still await further technological development. Here, we discuss recent scientific accomplishments achieved through the deep mutational scanning and describe widely used methods in each step of deep mutational scanning. We also compare these different methods and analyze their advantages and disadvantages, providing insight into how to design a deep mutational scanning study that best suits the aims of the readers' projects.
Affiliation(s)
- Huijin Wei
  - Zhejiang University-University of Edinburgh Institute, Zhejiang University, Haining, Zhejiang, China
- Xianghua Li
  - Zhejiang University-University of Edinburgh Institute, Zhejiang University, Haining, Zhejiang, China
  - Deanery of Biomedical Sciences, University of Edinburgh, Edinburgh, United Kingdom
  - The Second Affiliated Hospital of Zhejiang University, Hangzhou, Zhejiang, China
  - Biomedical and Health Translational Centre of Zhejiang Province, Haining, Zhejiang, China

27
Tabet D, Parikh V, Mali P, Roth FP, Claussnitzer M. Scalable Functional Assays for the Interpretation of Human Genetic Variation. Annu Rev Genet 2022; 56:441-465. [PMID: 36055970 DOI: 10.1146/annurev-genet-072920-032107] [Citation(s) in RCA: 29] [Impact Index Per Article: 14.5] [Indexed: 12/24/2022]
Abstract
Scalable sequence-function studies have enabled the systematic analysis and cataloging of hundreds of thousands of coding and noncoding genetic variants in the human genome. This has improved clinical variant interpretation and provided insights into the molecular, biophysical, and cellular effects of genetic variants at an astonishing scale and resolution across the spectrum of allele frequencies. In this review, we explore current applications and prospects for the field and outline the principles underlying scalable functional assay design, with a focus on the study of single-nucleotide coding and noncoding variants.
Affiliation(s)
- Daniel Tabet
  - Donnelly Centre, Department of Molecular Genetics, and Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, Ontario, Canada
- Victoria Parikh
  - Center for Inherited Cardiovascular Disease, Division of Cardiovascular Medicine, Stanford University School of Medicine, Stanford, California, USA
- Prashant Mali
  - Department of Bioengineering, University of California, San Diego, California, USA
- Frederick P Roth
  - Donnelly Centre, Department of Molecular Genetics, and Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health, Toronto, Ontario, Canada
- Melina Claussnitzer
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
  - Center for Genomic Medicine and Endocrine Division, Massachusetts General Hospital, Boston, Massachusetts, USA
  - Harvard Medical School, Harvard University, Boston, Massachusetts, USA

28
Seuma M, Lehner B, Bolognesi B. An atlas of amyloid aggregation: the impact of substitutions, insertions, deletions and truncations on amyloid beta fibril nucleation. Nat Commun 2022; 13:7084. [PMID: 36400770 PMCID: PMC9674652 DOI: 10.1038/s41467-022-34742-3] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 01/27/2022] [Accepted: 11/04/2022] [Indexed: 11/19/2022] Open
Abstract
Multiplexed assays of variant effects (MAVEs) guide clinical variant interpretation and reveal disease mechanisms. To date, MAVEs have focussed on a single mutation type, amino acid (AA) substitutions, despite the diversity of coding variants that cause disease. Here we use Deep Indel Mutagenesis (DIM) to generate a comprehensive atlas of diverse variant effects for a disease protein, the amyloid beta (Aβ) peptide that aggregates in Alzheimer's disease (AD) and is mutated in familial AD (fAD). The atlas identifies known fAD mutations and reveals that many variants beyond substitutions accelerate Aβ aggregation and are likely to be pathogenic. Truncations, substitutions, insertions, single- and internal multi-AA deletions differ in their propensity to enhance or impair aggregation, but likely pathogenic variants from all classes are highly enriched in the polar N-terminal region of Aβ. This comparative atlas highlights the importance of including diverse mutation types in MAVEs and provides important mechanistic insights into amyloid nucleation.
Affiliation(s)
- Mireia Seuma
  - Institute for Bioengineering of Catalonia (IBEC), The Barcelona Institute of Science and Technology, Baldiri Reixac 10-12, 08028, Barcelona, Spain
- Ben Lehner
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Doctor Aiguader 88, 08003, Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
  - ICREA, Pg. Lluís Companys 23, Barcelona, 08010, Spain
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
- Benedetta Bolognesi
  - Institute for Bioengineering of Catalonia (IBEC), The Barcelona Institute of Science and Technology, Baldiri Reixac 10-12, 08028, Barcelona, Spain

29
Rotrattanadumrong R, Yokobayashi Y. Experimental exploration of a ribozyme neutral network using evolutionary algorithm and deep learning. Nat Commun 2022; 13:4847. [PMID: 35977956 PMCID: PMC9385714 DOI: 10.1038/s41467-022-32538-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 02/15/2022] [Accepted: 08/03/2022] [Indexed: 11/18/2022] Open
Abstract
A neutral network connects all genotypes with equivalent phenotypes in a fitness landscape and plays an important role in the mutational robustness and evolvability of biomolecules. In contrast to earlier theoretical works, evidence of large neutral networks has been lacking in recent experimental studies of fitness landscapes. This suggests that evolution could be constrained globally. Here, we demonstrate that a deep learning-guided evolutionary algorithm can efficiently identify neutral genotypes within the sequence space of an RNA ligase ribozyme. Furthermore, we measure the activities of all 2^16 (65,536) variants connecting two active ribozymes that differ by 16 mutations and analyze mutational interactions (epistasis) up to the 16th order. We discover an extensive network of neutral paths linking the two genotypes and reveal that these paths might be predicted using only information from lower-order interactions. Our experimental evaluation of over 120,000 ribozyme sequences provides important empirical evidence that neutral networks can increase the accessibility and predictability of the fitness landscape. Neutral networks, which are sets of genotypes connected via single mutations that share the same phenotype, are important for evolvability. Here, the authors provide experimental evidence of a neutral network in an RNA enzyme using a high-throughput assay and deep learning.
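The number of variants connecting two genotypes follows from simple combinatorics: each position at which the two sequences differ can carry either parent's base, so n differing positions give 2^n combinations, parents included. The sketch below enumerates these intermediates for a toy pair of sequences; the function name and the example sequences are hypothetical and unrelated to the ribozymes in the study.

```python
from itertools import product


def intermediates(seq_a, seq_b):
    """Yield every sequence that mixes the bases of seq_a and seq_b at the positions where they differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    diff = [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]
    for choice in product((0, 1), repeat=len(diff)):
        seq = list(seq_a)
        for pos, take_b in zip(diff, choice):
            if take_b:
                seq[pos] = seq_b[pos]
        yield "".join(seq)


# Toy example: the two sequences differ at 3 positions, giving 2**3 combinations.
variants = list(intermediates("GACUA", "GCGUU"))
print(len(variants))

# For 16 differing positions the same count is 2**16.
print(2 ** 16)
```

Enumeration is feasible here because the count grows as 2^n; at 16 differences it is still small enough to hold in memory, and according to the abstract small enough to measure experimentally.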
Affiliation(s)
- Rachapun Rotrattanadumrong
  - Nucleic Acid Chemistry and Engineering Unit, Okinawa Institute of Science and Technology Graduate University, Onna, Okinawa, 9040495, Japan
- Yohei Yokobayashi
  - Nucleic Acid Chemistry and Engineering Unit, Okinawa Institute of Science and Technology Graduate University, Onna, Okinawa, 9040495, Japan

30
Pelosi B. Developing a bioinformatics pipeline for comparative protein classification analysis. BMC Genom Data 2022; 23:43. [PMID: 35668373 PMCID: PMC9172112 DOI: 10.1186/s12863-022-01045-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2021] [Accepted: 03/11/2022] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND: Protein classification is a task of paramount importance in various fields of biology. Despite the great momentum of modern implementations of protein classification, machine learning techniques such as Random Forest and Neural Network cannot always be used, for several reasons: data collection, unbalanced classification or labelling of the data. As an alternative, I propose the use of a bioinformatics pipeline to search for and classify information from protein databases. To evaluate the efficiency and accuracy of the pipeline, I focused on the carotenoid biosynthetic genes and developed a filtering approach to retrieve ortholog clusters in two well-studied plants that belong to the Brassicaceae family: Arabidopsis thaliana and Brassica rapa Pekinensis group. The result obtained has been compared with previous studies on carotenoid biosynthetic genes in B. rapa where phylogenetic analysis was conducted.
RESULTS: The developed bioinformatics pipeline relies on commercial software and multiple databases, including the use of phylogeny, Gene Ontology terms (GOs) and Protein Families (Pfams) at a protein level. Furthermore, the phylogeny is coupled with "population analysis" to evaluate the potential orthologs. All the steps taken together give a final table of potential orthologs. The phylogenetic tree gives a result of 43 putative orthologs conserved in B. rapa Pekinensis group. Different A. thaliana proteins have more than one syntenic ortholog, as also shown in a previous finding (Li et al., BMC Genomics 16(1):1-11, 2015).
CONCLUSIONS: This study demonstrates that, when the biological features of proteins of interest are not specific, I can rely on a computational approach in filtering steps for classification purposes. The comparison of the results obtained here for the carotenoid biosynthetic genes with previous research confirmed the accuracy of the developed pipeline, which can therefore be applied for filtering different types of datasets.
Affiliation(s)
- Benedetta Pelosi
  - Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Stockholm, Sweden

31
Tareen A, Kooshkbaghi M, Posfai A, Ireland WT, McCandlish DM, Kinney JB. MAVE-NN: learning genotype-phenotype maps from multiplex assays of variant effect. Genome Biol 2022; 23:98. [PMID: 35428271 PMCID: PMC9011994 DOI: 10.1186/s13059-022-02661-7] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 07/16/2021] [Revised: 03/21/2022] [Accepted: 03/24/2022] [Indexed: 12/17/2022] Open
Abstract
Multiplex assays of variant effect (MAVEs) are a family of methods that includes deep mutational scanning experiments on proteins and massively parallel reporter assays on gene regulatory sequences. Despite their increasing popularity, a general strategy for inferring quantitative models of genotype-phenotype maps from MAVE data is lacking. Here we introduce MAVE-NN, a neural-network-based Python package that implements a broadly applicable information-theoretic framework for learning genotype-phenotype maps, including biophysically interpretable models, from MAVE datasets. We demonstrate MAVE-NN in multiple biological contexts, and highlight the ability of our approach to deconvolve mutational effects from otherwise confounding experimental nonlinearities and noise.
Affiliation(s)
- Ammar Tareen
  - Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, 11724, NY, USA
  - Present Address: Regeneron Pharmaceuticals, Inc., Tarrytown, 10591, NY, USA
- Mahdi Kooshkbaghi
  - Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, 11724, NY, USA
- Anna Posfai
  - Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, 11724, NY, USA
- William T Ireland
  - Department of Physics, California Institute of Technology, Pasadena, 91125, CA, USA
  - Present Address: Department of Applied Physics, Harvard University, Cambridge, 02134, MA, USA
- David M McCandlish
  - Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, 11724, NY, USA
- Justin B Kinney
  - Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, 11724, NY, USA

32
Barbon L, Offord V, Radford EJ, Butler AP, Gerety SS, Adams DJ, Tan HK, Waters AJ. Variant Library Annotation Tool (VaLiAnT): an oligonucleotide library design and annotation tool for saturation genome editing and other deep mutational scanning experiments. Bioinformatics 2022; 38:892-899. [PMID: 34791067 PMCID: PMC8796380 DOI: 10.1093/bioinformatics/btab776] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/05/2021] [Revised: 07/13/2021] [Accepted: 11/10/2021] [Indexed: 02/04/2023] Open
Abstract
MOTIVATION: CRISPR/Cas9-based technology allows for the functional analysis of genetic variants at single nucleotide resolution whilst maintaining genomic context. This approach, known as saturation genome editing (SGE), a form of deep mutational scanning, systematically alters each position in a target region to explore its function. SGE experiments require the design and synthesis of oligonucleotide variant libraries which are introduced into the genome. This technology is applicable to diverse fields such as disease variant identification, drug development, structure-function studies, synthetic biology, evolutionary genetics and host-pathogen interactions. Here, we present the Variant Library Annotation Tool (VaLiAnT) which can be used to generate variant libraries from user-defined genomic coordinates and standard input files. The software can accommodate user-specified species, reference sequences and transcript annotations.
RESULTS: Coordinates for a genomic range are provided by the user to retrieve a corresponding oligonucleotide reference sequence. A user-specified range within this sequence is then subject to systematic, nucleotide and/or amino acid saturating mutator functions. VaLiAnT provides a novel way to retrieve, mutate and annotate genomic sequences for oligonucleotide library generation. Specific features for SGE library generation can be employed. In addition, VaLiAnT is configurable, allowing for cDNA and prime editing saturation library generation, with other diverse applications possible.
AVAILABILITY AND IMPLEMENTATION: VaLiAnT is a command line tool written in Python. Source code, testing data, example input and output files and executables are available (https://github.com/cancerit/VaLiAnT) in addition to a detailed user manual (https://github.com/cancerit/VaLiAnT/wiki). VaLiAnT is licensed under AGPLv3.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
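To make the idea of a nucleotide-saturating mutator concrete, the sketch below generates every single-nucleotide substitution within a target range of a reference sequence. It is a minimal illustration and does not use or reproduce VaLiAnT's code or command line interface; the function name, the 0-based half-open coordinates and the example sequence are hypothetical.

```python
def snv_library(reference, start, end):
    """Yield (position, ref_base, alt_base, mutated_sequence) for each substitution in [start, end)."""
    for pos in range(start, end):
        ref_base = reference[pos]
        for alt_base in "ACGT":
            if alt_base != ref_base:
                mutated = reference[:pos] + alt_base + reference[pos + 1:]
                yield pos, ref_base, alt_base, mutated


# Hypothetical reference sequence; mutate positions 2 to 5 (0-based, end exclusive).
reference = "ATGGCCAAG"
library = list(snv_library(reference, 2, 6))

for pos, ref_base, alt_base, mutated in library[:3]:
    print(pos, f"{ref_base}>{alt_base}", mutated)
print(len(library))
```

Each targeted position contributes three variants, so a range of k nucleotides yields 3k oligonucleotides, all of the same length as the reference.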
Affiliation(s)
- Luca Barbon
  - Cancer, Ageing and Somatic Mutation Programme, Wellcome Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK
- Victoria Offord
  - Cancer, Ageing and Somatic Mutation Programme, Wellcome Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK
- Elizabeth J Radford
  - Human Genetics Programme, Wellcome Sanger Institute, Hinxton, Cambridge CB10 1SA, UK
  - Department of Paediatrics, University of Cambridge, Cambridge CB2 0QQ, UK
- Adam P Butler
  - Cancer, Ageing and Somatic Mutation Programme, Wellcome Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK
- Sebastian S Gerety
  - Human Genetics Programme, Wellcome Sanger Institute, Hinxton, Cambridge CB10 1SA, UK
- David J Adams
  - Cancer, Ageing and Somatic Mutation Programme, Wellcome Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK
- Hong Kee Tan
  - Human Genetics Programme, Wellcome Sanger Institute, Hinxton, Cambridge CB10 1SA, UK
- Andrew J Waters
  - Cancer, Ageing and Somatic Mutation Programme, Wellcome Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK

33
Dubé AK, Dandage R, Dibyachintan S, Dionne U, Després PC, Landry CR. Deep Mutational Scanning of Protein-Protein Interactions Between Partners Expressed from Their Endogenous Loci In Vivo. Methods Mol Biol 2022; 2477:237-259. [PMID: 35524121 DOI: 10.1007/978-1-0716-2257-5_14] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 06/14/2023]
Abstract
Deep mutational scanning (DMS) generates mutants of a protein of interest in a comprehensive manner. CRISPR-Cas9 technology enables large-scale genome editing with high efficiency. Using both DMS and CRISPR-Cas9 therefore allows us to investigate the effects of thousands of mutations inserted directly in the genome. Combined with protein-fragment complementation assay (PCA), which enables the quantitative measurement of protein-protein interactions (PPIs) in vivo, these methods allow for the systematic assessment of the effects of mutations on PPIs in living cells. Here, we describe a method leveraging DMS, CRISPR-Cas9, and PCA to study the effect of point mutations on PPIs mediated by protein domains in yeast.
Collapse
Affiliation(s)
- Alexandre K Dubé
- Département de Biochimie, Microbiologie et Bio-informatique, Faculté de Sciences et Génie, Université Laval, Québec, QC, Canada.
- PROTEO, le regroupement québécois de recherche sur la fonction, l'ingénierie et les applications des protéines, Université Laval, Québec, QC, Canada.
- Centre de Recherche en Données Massives (CRDM), Université Laval, Québec, QC, Canada.
- Institut de Biologie Intégrative et des Systèmes, Université Laval, Québec, QC, Canada.
- Département de Biologie, Faculté de Sciences et Génie, Université Laval, Québec, QC, Canada.
| | - Rohan Dandage
- Département de Biochimie, Microbiologie et Bio-informatique, Faculté de Sciences et Génie, Université Laval, Québec, QC, Canada
- PROTEO, le regroupement québécois de recherche sur la fonction, l'ingénierie et les applications des protéines, Université Laval, Québec, QC, Canada
- Centre de Recherche en Données Massives (CRDM), Université Laval, Québec, QC, Canada
- Institut de Biologie Intégrative et des Systèmes, Université Laval, Québec, QC, Canada
- Département de Biologie, Faculté de Sciences et Génie, Université Laval, Québec, QC, Canada
| | - Soham Dibyachintan
- Département de Biochimie, Microbiologie et Bio-informatique, Faculté de Sciences et Génie, Université Laval, Québec, QC, Canada
- PROTEO, le regroupement québécois de recherche sur la fonction, l'ingénierie et les applications des protéines, Université Laval, Québec, QC, Canada
- Centre de Recherche en Données Massives (CRDM), Université Laval, Québec, QC, Canada
- Département de Biologie, Faculté de Sciences et Génie, Université Laval, Québec, QC, Canada
- Department of Chemical Engineering, Indian Institute of Technology Bombay (IIT), Powai, Mumbai, Maharashtra, India
| | - Ugo Dionne
- PROTEO, le regroupement québécois de recherche sur la fonction, l'ingénierie et les applications des protéines, Université Laval, Québec, QC, Canada
- Centre de Recherche en Données Massives (CRDM), Université Laval, Québec, QC, Canada
- Institut de Biologie Intégrative et des Systèmes, Université Laval, Québec, QC, Canada
- Centre de recherche du Centre Hospitalier Universitaire (CHU) de Québec, Université Laval, Québec, QC, Canada
- Centre de recherche sur le cancer de l'Université Laval, Québec, QC, Canada
| | - Philippe C Després
- Département de Biochimie, Microbiologie et Bio-informatique, Faculté de Sciences et Génie, Université Laval, Québec, QC, Canada
- PROTEO, le regroupement québécois de recherche sur la fonction, l'ingénierie et les applications des protéines, Université Laval, Québec, QC, Canada
- Centre de Recherche en Données Massives (CRDM), Université Laval, Québec, QC, Canada
- Institut de Biologie Intégrative et des Systèmes, Université Laval, Québec, QC, Canada
| | - Christian R Landry
- Département de Biochimie, Microbiologie et Bio-informatique, Faculté de Sciences et Génie, Université Laval, Québec, QC, Canada.
- PROTEO, le regroupement québécois de recherche sur la fonction, l'ingénierie et les applications des protéines, Université Laval, Québec, QC, Canada.
- Centre de Recherche en Données Massives (CRDM), Université Laval, Québec, QC, Canada.
- Institut de Biologie Intégrative et des Systèmes, Université Laval, Québec, QC, Canada.
- Département de Biologie, Faculté de Sciences et Génie, Université Laval, Québec, QC, Canada.
| |
|
34
|
Hanning KR, Minot M, Warrender AK, Kelton W, Reddy ST. Deep mutational scanning for therapeutic antibody engineering. Trends Pharmacol Sci 2021; 43:123-135. [PMID: 34895944 DOI: 10.1016/j.tips.2021.11.010] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 09/30/2021] [Revised: 11/02/2021] [Accepted: 11/10/2021] [Indexed: 12/24/2022]
Abstract
The biophysical and functional properties of monoclonal antibody (mAb) drug candidates are often improved by protein engineering methods to increase the probability of clinical efficacy. One emerging method is deep mutational scanning (DMS) which combines the power of exhaustive protein mutagenesis and functional screening with deep sequencing and bioinformatics. The application of DMS has yielded significant improvements to the affinity, specificity, and stability of several preclinical antibodies alongside novel applications such as introducing multi-specific binding properties. DMS has also been applied directly on target antigens to precisely map antibody-binding epitopes and notably to profile the mutational escape potential of viral targets (e.g., SARS-CoV-2 variants). Finally, DMS combined with machine learning is enabling advances in the computational screening and engineering of therapeutic antibodies.
Affiliation(s)
- Kyrin R Hanning
- Te Huataki Waiora School of Health, University of Waikato, Hamilton 3240, New Zealand
| | - Mason Minot
- Department of Biosystems Science and Engineering, Eidgenössische Technische Hochschule (ETH) Zurich, Basel 4058, Switzerland
| | - Annmaree K Warrender
- Te Huataki Waiora School of Health, University of Waikato, Hamilton 3240, New Zealand
| | - William Kelton
- Te Huataki Waiora School of Health, University of Waikato, Hamilton 3240, New Zealand.
| | - Sai T Reddy
- Department of Biosystems Science and Engineering, Eidgenössische Technische Hochschule (ETH) Zurich, Basel 4058, Switzerland.
| |
|
35
|
Findlay GM. Linking genome variants to disease: scalable approaches to test the functional impact of human mutations. Hum Mol Genet 2021; 30:R187-R197. [PMID: 34338757 PMCID: PMC8490018 DOI: 10.1093/hmg/ddab219] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Received: 07/19/2021] [Revised: 07/19/2021] [Accepted: 07/19/2021] [Indexed: 11/13/2022] Open
Abstract
The application of genomics to medicine has accelerated the discovery of mutations underlying disease and has enhanced our knowledge of the molecular underpinnings of diverse pathologies. As the amount of human genetic material queried via sequencing has grown exponentially in recent years, so too has the number of rare variants observed. Despite progress, our ability to distinguish which rare variants have clinical significance remains limited. Over the last decade, however, powerful experimental approaches have emerged to characterize variant effects orders of magnitude faster than before. Fueled by improved DNA synthesis and sequencing and, more recently, by CRISPR/Cas9 genome editing, multiplex functional assays provide a means of generating variant effect data in wide-ranging experimental systems. Here, I review recent applications of multiplex assays that link human variants to disease phenotypes and I describe emerging strategies that will enhance their clinical utility in coming years.
Affiliation(s)
- Gregory M Findlay
- The Francis Crick Institute, The Genome Function Laboratory, London NW1 1AT, UK
| |
|
36
|
Soo VWC, Swadling JB, Faure AJ, Warnecke T. Fitness landscape of a dynamic RNA structure. PLoS Genet 2021; 17:e1009353. [PMID: 33524037 PMCID: PMC7877785 DOI: 10.1371/journal.pgen.1009353] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/29/2020] [Revised: 02/11/2021] [Accepted: 01/12/2021] [Indexed: 11/24/2022] Open
Abstract
RNA structures are dynamic. As a consequence, mutational effects can be hard to rationalize with reference to a single static native structure. We reasoned that deep mutational scanning experiments, which couple molecular function to fitness, should capture mutational effects across multiple conformational states simultaneously. Here, we provide a proof-of-principle that this is indeed the case, using the self-splicing group I intron from Tetrahymena thermophila as a model system. We comprehensively mutagenized two 4-bp segments of the intron. These segments first come together to form the P1 extension (P1ex) helix at the 5' splice site. Following cleavage at the 5' splice site, the two halves of the helix dissociate to allow formation of an alternative helix (P10) at the 3' splice site. Using an in vivo reporter system that couples splicing activity to fitness in E. coli, we demonstrate that fitness is driven jointly by constraints on P1ex and P10 formation. We further show that patterns of epistasis can be used to infer the presence of intramolecular pleiotropy. Using a machine learning approach that allows quantification of mutational effects in a genotype-specific manner, we demonstrate that the fitness landscape can be deconvoluted to implicate P1ex or P10 as the effective genetic background in which molecular fitness is compromised or enhanced. Our results highlight deep mutational scanning as a tool to study alternative conformational states, with the capacity to provide critical insights into the structure, evolution and evolvability of RNAs as dynamic ensembles. Our findings also suggest that, in the future, deep mutational scanning approaches might help reverse-engineer multiple alternative or successive conformations from a single fitness landscape.
Affiliation(s)
- Valerie W. C. Soo
- Medical Research Council London Institute of Medical Sciences, London, United Kingdom
- Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, London, United Kingdom
| | - Jacob B. Swadling
- Medical Research Council London Institute of Medical Sciences, London, United Kingdom
- Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, London, United Kingdom
| | - Andre J. Faure
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
| | - Tobias Warnecke
- Medical Research Council London Institute of Medical Sciences, London, United Kingdom
- Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, London, United Kingdom
| |
|
37
|
Seuma M, Faure AJ, Badia M, Lehner B, Bolognesi B. The genetic landscape for amyloid beta fibril nucleation accurately discriminates familial Alzheimer's disease mutations. eLife 2021; 10:e63364. [PMID: 33522485 PMCID: PMC7943193 DOI: 10.7554/elife.63364] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 09/23/2020] [Accepted: 02/01/2021] [Indexed: 12/20/2022] Open
Abstract
Plaques of the amyloid beta (Aβ) peptide are a pathological hallmark of Alzheimer's disease (AD), the most common form of dementia. Mutations in Aβ also cause familial forms of AD (fAD). Here, we use deep mutational scanning to quantify the effects of >14,000 mutations on the aggregation of Aβ. The resulting genetic landscape reveals mechanistic insights into fibril nucleation, including the importance of charge and gatekeeper residues in the disordered region outside of the amyloid core in preventing nucleation. Strikingly, unlike computational predictors and previous measurements, the empirical nucleation scores accurately identify all known dominant fAD mutations in Aβ, genetically validating that the mechanism of nucleation in a cell-based assay is likely to be very similar to the mechanism that causes the human disease. These results provide the first comprehensive atlas of how mutations alter the formation of any amyloid fibril and a resource for the interpretation of genetic variation in Aβ.
Affiliation(s)
- Mireia Seuma
- Institute for Bioengineering of Catalonia (IBEC), The Barcelona Institute of Science and Technology, Barcelona, Spain
| | - Andre J Faure
- Center for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
| | - Marta Badia
- Institute for Bioengineering of Catalonia (IBEC), The Barcelona Institute of Science and Technology, Barcelona, Spain
| | - Ben Lehner
- Center for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
- ICREA, Pg. Lluís Companys, Barcelona, Spain
| | - Benedetta Bolognesi
- Institute for Bioengineering of Catalonia (IBEC), The Barcelona Institute of Science and Technology, Barcelona, Spain
| |
|
38
|
Baeza-Centurion P, Miñana B, Valcárcel J, Lehner B. Mutations primarily alter the inclusion of alternatively spliced exons. eLife 2020; 9:e59959. [PMID: 33112234 PMCID: PMC7673789 DOI: 10.7554/elife.59959] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 06/12/2020] [Accepted: 10/27/2020] [Indexed: 12/17/2022] Open
Abstract
Genetic analyses and systematic mutagenesis have revealed that synonymous, non-synonymous and intronic mutations frequently alter the inclusion levels of alternatively spliced exons, consistent with the concept that altered splicing might be a common mechanism by which mutations cause disease. However, most exons expressed in any cell are highly-included in mature mRNAs. Here, by performing deep mutagenesis of highly-included exons and by analysing the association between genome sequence variation and exon inclusion across the transcriptome, we report that mutations only very rarely alter the inclusion of highly-included exons. This is true for both exonic and intronic mutations as well as for perturbations in trans. Therefore, mutations that affect splicing are not evenly distributed across primary transcripts but are focussed in and around alternatively spliced exons with intermediate inclusion levels. These results provide a resource for prioritising synonymous and other variants as disease-causing mutations.
Affiliation(s)
- Pablo Baeza-Centurion
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
| | - Belén Miñana
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
| | - Juan Valcárcel
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
| | - Ben Lehner
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology (BIST), Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
- Universitat Pompeu Fabra (UPF), Barcelona, Spain
| |
|