51
Jain I, Kolesnik M, Kuznedelov K, Minakhin L, Morozova N, Shiriaeva A, Kirillov A, Medvedeva S, Livenskyi A, Kazieva L, Makarova KS, Koonin EV, Borukhov S, Severinov K, Semenova E. tRNA anticodon cleavage by target-activated CRISPR-Cas13a effector. Science Advances 2024; 10:eadl0164. [PMID: 38657076; PMCID: PMC11042736; DOI: 10.1126/sciadv.adl0164] [Received: 09/23/2023] [Accepted: 03/20/2024] [Indexed: 04/26/2024]
Abstract
Type VI CRISPR-Cas systems are among the few CRISPR varieties that target exclusively RNA. The CRISPR RNA-guided, sequence-specific binding of target RNAs, such as phage transcripts, activates the type VI effector, Cas13. Once activated, Cas13 causes collateral RNA cleavage, which induces bacterial cell dormancy, thus protecting the host population from the phage spread. We show here that the principal form of collateral RNA degradation elicited by Leptotrichia shahii Cas13a expressed in Escherichia coli cells is the cleavage of anticodons in a subset of transfer RNAs (tRNAs) with uridine-rich anticodons. This tRNA cleavage is accompanied by inhibition of protein synthesis, thus providing defense from the phages. In addition, Cas13a-mediated tRNA cleavage indirectly activates the RNases of bacterial toxin-antitoxin modules cleaving messenger RNA, which could provide a backup defense. The mechanism of Cas13a-induced antiphage defense resembles that of bacterial anticodon nucleases, which is compatible with the hypothesis that type VI effectors evolved from an abortive infection module encompassing an anticodon nuclease.
Affiliation(s)
- Ishita Jain
  - Waksman Institute for Microbiology, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
- Matvey Kolesnik
  - Center for Molecular and Cellular Biology, Skolkovo Institute of Science and Technology, Moscow, Russia
- Konstantin Kuznedelov
  - Waksman Institute for Microbiology, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
- Leonid Minakhin
  - Waksman Institute for Microbiology, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
- Natalia Morozova
  - Peter the Great St. Petersburg Polytechnic University, Saint Petersburg, Russia
- Anna Shiriaeva
  - Peter the Great St. Petersburg Polytechnic University, Saint Petersburg, Russia
  - Saint Petersburg State University, Saint Petersburg, Russia
- Alexandr Kirillov
  - Peter the Great St. Petersburg Polytechnic University, Saint Petersburg, Russia
- Sofia Medvedeva
  - Center for Molecular and Cellular Biology, Skolkovo Institute of Science and Technology, Moscow, Russia
- Alexei Livenskyi
  - Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Institute of Gene Biology, Moscow, Russia
  - Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russia
- Kira S. Makarova
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Eugene V. Koonin
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Sergei Borukhov
  - Department of Cell Biology and Neuroscience, Rowan University School of Osteopathic Medicine at Stratford, Stratford, NJ, USA
- Konstantin Severinov
  - Waksman Institute for Microbiology, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
  - Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Institute of Gene Biology, Moscow, Russia
- Ekaterina Semenova
  - Waksman Institute for Microbiology, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
52
Pedrazzoli E, Demozzi M, Visentin E, Ciciani M, Bonuzzi I, Pezzè L, Lucchetta L, Maule G, Amistadi S, Esposito F, Lupo M, Miccio A, Auricchio A, Casini A, Segata N, Cereseto A. CoCas9 is a compact nuclease from the human microbiome for efficient and precise genome editing. Nat Commun 2024; 15:3478. [PMID: 38658578; PMCID: PMC11043407; DOI: 10.1038/s41467-024-47800-9] [Received: 01/05/2024] [Accepted: 04/11/2024] [Indexed: 04/26/2024]
Abstract
The expansion of the CRISPR-Cas toolbox is highly needed to accelerate the development of therapies for genetic diseases. Here, through the interrogation of a massively expanded repository of metagenome-assembled genomes, mostly from human microbiomes, we uncover a large variety (n = 17,173) of type II CRISPR-Cas loci. Among these we identify CoCas9, a strongly active and high-fidelity nuclease with reduced molecular size (1004 amino acids) isolated from an uncultivated Collinsella species. CoCas9 is efficiently co-delivered with its sgRNA through adeno associated viral (AAV) vectors, obtaining efficient in vivo editing in the mouse retina. With this study we uncover a collection of previously uncharacterized Cas9 nucleases, including CoCas9, which enriches the genome editing toolbox.
Affiliation(s)
- Eleonora Pedrazzoli
  - Department of Computational, Cellular and Integrative Biology (CIBIO), University of Trento, 38123, Trento, Italy
- Michele Demozzi
  - Department of Computational, Cellular and Integrative Biology (CIBIO), University of Trento, 38123, Trento, Italy
- Elisabetta Visentin
  - Department of Computational, Cellular and Integrative Biology (CIBIO), University of Trento, 38123, Trento, Italy
- Matteo Ciciani
  - Department of Computational, Cellular and Integrative Biology (CIBIO), University of Trento, 38123, Trento, Italy
- Ilaria Bonuzzi
  - Department of Computational, Cellular and Integrative Biology (CIBIO), University of Trento, 38123, Trento, Italy
- Lorenzo Lucchetta
  - Department of Computational, Cellular and Integrative Biology (CIBIO), University of Trento, 38123, Trento, Italy
- Giulia Maule
  - Department of Computational, Cellular and Integrative Biology (CIBIO), University of Trento, 38123, Trento, Italy
- Simone Amistadi
  - Department of Computational, Cellular and Integrative Biology (CIBIO), University of Trento, 38123, Trento, Italy
  - Université de Paris, Imagine Institute, Laboratory of chromatin and gene regulation during development, INSERM, UMR 1163, Paris, France
- Federica Esposito
  - Telethon Institute of Genetics and Medicine (TIGEM), 80078, Pozzuoli (NA), Italy
- Mariangela Lupo
  - Telethon Institute of Genetics and Medicine (TIGEM), 80078, Pozzuoli (NA), Italy
- Annarita Miccio
  - Université de Paris, Imagine Institute, Laboratory of chromatin and gene regulation during development, INSERM, UMR 1163, Paris, France
- Alberto Auricchio
  - Telethon Institute of Genetics and Medicine (TIGEM), 80078, Pozzuoli (NA), Italy
  - Medical Genetics, Department of Advanced Biomedical Sciences, University of Naples "Federico II", 80131, Naples, Italy
- Nicola Segata
  - Department of Computational, Cellular and Integrative Biology (CIBIO), University of Trento, 38123, Trento, Italy
- Anna Cereseto
  - Department of Computational, Cellular and Integrative Biology (CIBIO), University of Trento, 38123, Trento, Italy
53
Nishikawa KK, Chen J, Acheson JF, Harbaugh SV, Huss P, Frenkel M, Novy N, Sieren HR, Lodewyk EC, Lee DH, Chávez JL, Fox BG, Raman S. Highly multiplexed design of an allosteric transcription factor to sense novel ligands. bioRxiv 2024:2024.03.07.583947. [PMID: 38496486; PMCID: PMC10942455; DOI: 10.1101/2024.03.07.583947] [Indexed: 03/19/2024]
Abstract
Allosteric transcription factors (aTF), widely used as biosensors, have proven challenging to design for detecting novel molecules because mutation of ligand-binding residues often disrupts allostery. We developed Sensor-seq, a high-throughput platform to design and identify aTF biosensors that bind to non-native ligands. We screened a library of 17,737 variants of the aTF TtgR, a regulator of a multidrug exporter, against six non-native ligands of diverse chemical structures - four derivatives of the cancer therapeutic tamoxifen, the antimalarial drug quinine, and the opiate analog naltrexone - as well as two native flavonoid ligands, naringenin and phloretin. Sensor-seq identified novel biosensors for each of these ligands with high dynamic range and diverse specificity profiles. The structure of a naltrexone-bound design showed shape-complementary methionine-aromatic interactions driving ligand specificity. To demonstrate practical utility, we developed cell-free detection systems for naltrexone and quinine. Sensor-seq enables rapid, scalable design of new biosensors, overcoming constraints of natural biosensors.
Affiliation(s)
- Kyle K Nishikawa
  - Department of Biochemistry, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Jackie Chen
  - Department of Biochemistry, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Justin F Acheson
  - Department of Biochemistry, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Svetlana V Harbaugh
  - 711th Human Performance Wing, Air Force Research Laboratory, Wright Patterson Air Force Base, OH, USA
- Phil Huss
  - Department of Biochemistry, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Max Frenkel
  - Department of Biochemistry, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Nathan Novy
  - Department of Biochemistry, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Hailey R Sieren
  - Department of Biochemistry, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Ella C Lodewyk
  - Department of Biochemistry, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Daniel H Lee
  - Department of Biochemistry, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Jorge L Chávez
  - 711th Human Performance Wing, Air Force Research Laboratory, Wright Patterson Air Force Base, OH, USA
- Brian G Fox
  - Department of Biochemistry, University of Wisconsin-Madison, Madison, Wisconsin, USA
  - Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI, USA
- Srivatsan Raman
  - Department of Biochemistry, University of Wisconsin-Madison, Madison, Wisconsin, USA
  - Department of Bacteriology, University of Wisconsin-Madison, Madison, WI, USA
  - Department of Chemical and Biological Engineering, University of Wisconsin-Madison, Madison, WI, USA
  - Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, WI, USA
54
Singer A, Ramos A, Keating AE. Elaboration of the Homer1 Recognition Landscape Reveals Incomplete Divergence of Paralogous EVH1 Domains. bioRxiv 2024:2024.01.23.576863. [PMID: 38645240; PMCID: PMC11030225; DOI: 10.1101/2024.01.23.576863] [Indexed: 04/23/2024]
Abstract
Short sequences that mediate interactions with modular binding domains are ubiquitous throughout eukaryotic proteomes. Networks of Short Linear Motifs (SLiMs) and their corresponding binding domains orchestrate many cellular processes, and the low mutational barrier to evolving novel interactions provides a way for biological systems to rapidly sample selectable phenotypes. Mapping SLiM binding specificity and the rules that govern SLiM evolution is fundamental to uncovering the pathways regulated by these networks and developing the tools to manipulate them. We used high-throughput screening of the human proteome to identify sequences that bind to the Enabled/VASP homology 1 (EVH1) domain of the postsynaptic density scaffolding protein Homer1. In doing so, we expanded current understanding of the determinants of Homer EVH1 binding preferences and defined a new motif that can facilitate the discovery of additional Homer-mediated interactions. Interestingly, the Homer1 EVH1 domain preferentially binds to sequences containing an N-terminally overlapping motif that is bound by the paralogous family of Ena/VASP actin polymerases, and many of these sequences can bind to EVH1 domains from both protein families. We provide evidence from orthologous EVH1 domains in pre-metazoan organisms that the overlap in human Ena/VASP and Homer binding preferences corresponds to an incomplete divergence from a common Ena/VASP ancestor. Given this overlap in binding profiles, promiscuous sequences that can be recognized by both families either achieve specificity through extrinsic regulatory strategies or may provide functional benefits via multi-specificity. This may explain why these paralogs incompletely diverged despite the accessibility of further diverged isoforms.
Affiliation(s)
- Avinoam Singer
  - MIT Department of Biology, Cambridge, Massachusetts, USA
- Amy E. Keating
  - MIT Department of Biology, Cambridge, Massachusetts, USA
  - MIT Department of Biological Engineering, Cambridge, Massachusetts, USA
  - Koch Institute for Integrative Cancer Research, Cambridge, Massachusetts, USA
55
Kayrouz CM, Ireland KA, Ying V, Davis KM, Seyedsayamdost MR. Ovoselenol, a Selenium-containing Antioxidant Derived from Convergent Evolution. bioRxiv 2024:2024.04.10.588772. [PMID: 38645211; PMCID: PMC11030361; DOI: 10.1101/2024.04.10.588772] [Indexed: 04/23/2024]
Abstract
Selenium is an essential micronutrient, but its presence in biology has been limited to protein and nucleic acid biopolymers. The recent identification of the first biosynthetic pathway for selenium-containing small molecules suggests that there is a larger family of selenometabolites that remains to be discovered. Using a bioinformatic search strategy that relies on mapping of composite active site motifs, we identify a recently evolved branch of abundant and uncharacterized metalloenzymes that we predict are involved in selenometabolite biosynthesis. Biochemical studies confirm this prediction and show that these enzymes form an unusual C-Se bond onto histidine, thus giving rise to a novel selenometabolite and potent antioxidant that we have termed ovoselenol. Aside from providing insights into the evolution of this enzyme class and the structural basis of C-Se bond formation, our work offers a blueprint for charting the microbial selenometabolome in the future.
Affiliation(s)
- Chase M. Kayrouz
  - Department of Chemistry, Princeton University, Princeton, NJ 08544, United States
- Kendra A. Ireland
  - Department of Chemistry, Emory University, Atlanta, GA 30322, United States
- Vanessa Ying
  - Department of Chemistry, Princeton University, Princeton, NJ 08544, United States
- Katherine M. Davis
  - Department of Chemistry, Emory University, Atlanta, GA 30322, United States
- Mohammad R. Seyedsayamdost
  - Department of Chemistry, Princeton University, Princeton, NJ 08544, United States
  - Department of Molecular Biology, Princeton University, Princeton, NJ 08544, United States
56
Cuevas-Zuviría B, Garcia AK, Rivier AJ, Rucker HR, Carruthers BM, Kaçar B. Emergence of an Orphan Nitrogenase Protein Following Atmospheric Oxygenation. Mol Biol Evol 2024; 41:msae067. [PMID: 38526235; PMCID: PMC11018506; DOI: 10.1093/molbev/msae067] [Received: 12/08/2023] [Revised: 03/06/2024] [Accepted: 03/19/2024] [Indexed: 03/26/2024]
Abstract
Molecular innovations within key metabolisms can have profound impacts on element cycling and ecological distribution. Yet, much of the molecular foundations of early evolved enzymes and metabolisms are unknown. Here, we bring one such mystery to relief by probing the birth and evolution of the G-subunit protein, an integral component of certain members of the nitrogenase family, the only enzymes capable of biological nitrogen fixation. The G-subunit is a Paleoproterozoic-age orphan protein that appears more than 1 billion years after the origin of nitrogenases. We show that the G-subunit arose with novel nitrogenase metal dependence and the ecological expansion of nitrogen-fixing microbes following the transition in environmental metal availabilities and atmospheric oxygenation that began ∼2.5 billion years ago. We identify molecular features that suggest early G-subunit proteins mediated cofactor or protein interactions required for novel metal dependency, priming ancient nitrogenases and their hosts to exploit these newly diversified geochemical environments. We further examined the degree of functional specialization in G-subunit evolution with extant and ancestral homologs using laboratory reconstruction experiments. Our results indicate that permanent recruitment of the orphan protein depended on the prior establishment of conserved molecular features and showcase how contingent evolutionary novelties might shape ecologically important microbial innovations.
Affiliation(s)
- Amanda K Garcia
  - Department of Bacteriology, University of Wisconsin-Madison, Madison, WI, USA
- Alex J Rivier
  - Department of Bacteriology, University of Wisconsin-Madison, Madison, WI, USA
- Holly R Rucker
  - Department of Bacteriology, University of Wisconsin-Madison, Madison, WI, USA
- Brooke M Carruthers
  - Department of Bacteriology, University of Wisconsin-Madison, Madison, WI, USA
- Betül Kaçar
  - Department of Bacteriology, University of Wisconsin-Madison, Madison, WI, USA
57
Nikolaev A, Kuzmin A, Markeeva E, Kuznetsova E, Ryzhykau YL, Semenov O, Anuchina A, Remeeva A, Gushchin I. Reengineering of a flavin-binding fluorescent protein using ProteinMPNN. Protein Sci 2024; 33:e4958. [PMID: 38501498; PMCID: PMC10949330; DOI: 10.1002/pro.4958] [Received: 09/05/2023] [Revised: 01/12/2024] [Accepted: 02/18/2024] [Indexed: 03/20/2024]
Abstract
Recent advances in machine learning techniques have led to development of a number of protein design and engineering approaches. One of them, ProteinMPNN, predicts an amino acid sequence that would fold and match user-defined backbone structure. Its performance was previously tested for proteins composed of standard amino acids, as well as for peptide- and protein-binding proteins. In this short report, we test whether ProteinMPNN can be used to reengineer a non-proteinaceous ligand-binding protein, flavin-based fluorescent protein CagFbFP. We fixed the native backbone conformation and the identity of 20 amino acids interacting with the chromophore (flavin mononucleotide, FMN) while letting ProteinMPNN predict the rest of the sequence. The software package suggested replacing 36-48 out of the remaining 86 amino acids so that the resulting sequences are 55%-66% identical to the original one. The three designs that we tested experimentally displayed different expression levels, yet all were able to bind FMN and displayed fluorescence, thermal stability, and other properties similar to those of CagFbFP. Our results demonstrate that ProteinMPNN can be used to generate diverging unnatural variants of fluorescent proteins, and, more generally, to reengineer proteins without losing their ligand-binding capabilities.
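The identity range quoted in this abstract can be cross-checked with simple arithmetic. The sketch below assumes the designed sequence spans 20 + 86 = 106 positions (the fixed chromophore-contacting residues plus the redesigned remainder); that total is inferred from the abstract rather than stated in it, and the function name is purely illustrative.

```python
def percent_identity(total_positions: int, substitutions: int) -> float:
    """Percent of positions left unchanged after `substitutions` replacements."""
    if not 0 <= substitutions <= total_positions:
        raise ValueError("substitutions must be between 0 and total_positions")
    return 100.0 * (total_positions - substitutions) / total_positions


FIXED = 20        # residues held constant (contacting FMN), from the abstract
REDESIGNED = 86   # remaining residues open to redesign, from the abstract
TOTAL = FIXED + REDESIGNED  # assumption: these two groups cover the whole protein

# The abstract reports 36-48 replaced residues among the redesigned positions.
for n_substitutions in (36, 48):
    identity = percent_identity(TOTAL, n_substitutions)
    print(f"{n_substitutions} substitutions -> {identity:.1f}% identity")
```

Under that assumption, 36 and 48 substitutions round to 66% and 55% identity, which is consistent with the range given in the abstract.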
Affiliation(s)
- Andrey Nikolaev
  - Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Russia
- Alexander Kuzmin
  - Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Russia
- Elena Markeeva
  - Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Russia
- Elizaveta Kuznetsova
  - Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Russia
- Yury L. Ryzhykau
  - Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Russia
  - Frank Laboratory of Neutron Physics, Joint Institute for Nuclear Research, Dubna, Russia
- Oleg Semenov
  - Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Russia
- Arina Anuchina
  - Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Russia
- Alina Remeeva
  - Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Russia
- Ivan Gushchin
  - Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Russia
58
Datta S, Nabeel Asim M, Dengel A, Ahmed S. NTpred: a robust and precise machine learning framework for in silico identification of Tyrosine nitration sites in protein sequences. Brief Funct Genomics 2024; 23:163-179. [PMID: 37248673; DOI: 10.1093/bfgp/elad018] [Received: 02/14/2023] [Revised: 04/12/2023] [Accepted: 05/02/2023] [Indexed: 05/31/2023]
Abstract
Post-translational modifications (PTMs) either enhance a protein's activity in various sub-cellular processes, or degrade their activity which leads toward failure of intracellular processes. Tyrosine nitration (NT) modification degrades protein's activity that initiates and propagates various diseases including neurodegenerative, cardiovascular, autoimmune diseases and carcinogenesis. Identification of NT modification supports development of novel therapies and drug discoveries for associated diseases. Identification of NT modification in biochemical labs is expensive, time consuming and error-prone. To supplement this process, several computational approaches have been proposed. However these approaches fail to precisely identify NT modification, due to the extraction of irrelevant, redundant and less discriminative features from protein sequences. This paper presents the NTpred framework that is competent in extracting comprehensive features from raw protein sequences using four different sequence encoders. To reap the benefits of different encoders, it generates four additional feature spaces by fusing different combinations of individual encodings. Furthermore, it eradicates irrelevant and redundant features from eight different feature spaces through a Recursive Feature Elimination process. Selected features of four individual encodings and four feature fusion vectors are used to train eight different Gradient Boosted Tree classifiers. The probability scores from the trained classifiers are utilized to generate a new probabilistic feature space, which is used to train a Logistic Regression classifier. On the BD1 benchmark dataset, the proposed framework outperforms the existing best-performing predictor in 5-fold cross validation and independent test evaluation with combined improvement of 13.7% in MCC and 20.1% in AUC. Similarly, on the BD2 benchmark dataset, the proposed framework outperforms the existing best-performing predictor with combined improvement of 5.3% in MCC and 1.0% in AUC. NTpred is publicly available for further experimentation and predictive use at: https://sds_genetic_analysis.opendfki.de/PredNTS/.
Affiliation(s)
- Sourajyoti Datta
  - Department of Computer Science, Rheinland Pfälzische Technische Universität, Kaiserslautern, 67663, Germany
- Muhammad Nabeel Asim
  - German Research Center for Artificial Intelligence, Kaiserslautern, 67663, Germany
- Andreas Dengel
  - Department of Computer Science, Rheinland Pfälzische Technische Universität, Kaiserslautern, 67663, Germany
  - German Research Center for Artificial Intelligence, Kaiserslautern, 67663, Germany
- Sheraz Ahmed
  - German Research Center for Artificial Intelligence, Kaiserslautern, 67663, Germany
59
Gu K, Mok L, Wakefield MJ, Chong MMW. Non-canonical RNA substrates of Drosha lack many of the conserved features found in primary microRNA stem-loops. Sci Rep 2024; 14:6713. [PMID: 38509178; PMCID: PMC10954719; DOI: 10.1038/s41598-024-57330-5] [Received: 11/12/2023] [Accepted: 03/18/2024] [Indexed: 03/22/2024]
Abstract
The RNase III enzyme Drosha has a central role in microRNA (miRNA) biogenesis, where it is required to release the stem-loop intermediate from primary (pri)-miRNA transcripts. However, it can also cleave stem-loops embedded within messenger (m)RNAs. This destabilizes the mRNA causing target gene repression and appears to occur primarily in stem cells. While pri-miRNA stem-loops have been extensively studied, such non-canonical substrates of Drosha have yet to be characterized in detail. In this study, we employed high-throughput sequencing to capture all polyA-tailed RNAs that are cleaved by Drosha in mouse embryonic stem cells (ESCs) and compared the features of non-canonical versus miRNA stem-loop substrates. mRNA substrates are less efficiently processed than miRNA stem-loops. Sequence and structural analyses revealed that these mRNA substrates are also less stable and more likely to fold into alternative structures than miRNA stem-loops. Moreover, they lack the sequence and structural motifs found in miRNA stem-loops that are required for precise cleavage. Notably, we discovered a non-canonical Drosha substrate that is cleaved in an inverse manner, which is a process that is normally inhibited by features in miRNA stem-loops. Our study thus provides valuable insights into the recognition of non-canonical targets by Drosha.
Affiliation(s)
- Karen Gu
  - St Vincent's Institute of Medical Research, Fitzroy, VIC, 3065, Australia
  - Department of Medicine (St Vincent's), University of Melbourne, Fitzroy, VIC, 3065, Australia
- Lawrence Mok
  - St Vincent's Institute of Medical Research, Fitzroy, VIC, 3065, Australia
- Matthew J Wakefield
  - Walter and Eliza Hall Institute of Medical Research, Parkville, VIC, 3052, Australia
  - Department of Obstetrics and Gynaecology, University of Melbourne, Parkville, VIC, 3010, Australia
- Mark M W Chong
  - St Vincent's Institute of Medical Research, Fitzroy, VIC, 3065, Australia
  - Department of Medicine (St Vincent's), University of Melbourne, Fitzroy, VIC, 3065, Australia
60
Zheng Y, Stormo GD, Chen S. Aberrant homeodomain-DNA cooperative dimerization underlies distinct developmental defects in two dominant CRX retinopathy models. bioRxiv 2024:2024.03.12.584677. [PMID: 38559186; PMCID: PMC10979960; DOI: 10.1101/2024.03.12.584677] [Indexed: 04/04/2024]
Abstract
Paired-class homeodomain transcription factors (HD TFs) play essential roles in vertebrate development, and their mutations are linked to human diseases. One unique feature of paired-class HD is cooperative dimerization on specific palindrome DNA sequences. Yet, the functional significance of HD cooperative dimerization in animal development and its dysregulation in diseases remain elusive. Using the retinal TF Cone-rod Homeobox (CRX) as a model, we have studied how blindness-causing mutations in the paired HD, p.E80A and p.K88N, alter CRX's cooperative dimerization, lead to gene misexpression and photoreceptor developmental deficits in dominant manners. CRXE80A maintains binding at monomeric WT CRX motifs but is deficient in cooperative binding at dimeric motifs. CRXE80A's cooperativity defect impacts the exponential increase of photoreceptor gene expression in terminal differentiation and produces immature, non-functional photoreceptors in the CrxE80A retinas. CRXK88N is highly cooperative and localizes to ectopic genomic sites with strong enrichment of dimeric HD motifs. CRXK88N's altered biochemical properties disrupt CRX's ability to direct dynamic chromatin remodeling during development to activate photoreceptor differentiation programs and silence progenitor programs. Our study here provides in vitro and in vivo molecular evidence that paired-class HD cooperative dimerization regulates neuronal development and dysregulation of cooperative binding contributes to severe dominant blinding retinopathies.
Affiliation(s)
- Yiqiao Zheng
  - Molecular Genetics and Genomics Graduate Program, Division of Biological and Biomedical Sciences, Washington University in St Louis, Saint Louis, Missouri, 63110, USA
  - Department of Ophthalmology and Visual Sciences, Washington University in St Louis, Saint Louis, Missouri, 63110, USA
- Gary D. Stormo
  - Department of Genetics, Washington University in St Louis, Saint Louis, Missouri, 63110, USA
- Shiming Chen
  - Department of Ophthalmology and Visual Sciences, Washington University in St Louis, Saint Louis, Missouri, 63110, USA
  - Department of Developmental Biology, Washington University in St Louis, Saint Louis, Missouri, 63110, USA
61
Choppavarapu L, Fang K, Liu T, Jin VX. Hi-C profiling in tissues reveals 3D chromatin-regulated breast tumor heterogeneity and tumor-specific looping-mediated biological pathways. bioRxiv 2024:2024.03.13.584872. [PMID: 38559097; PMCID: PMC10979939; DOI: 10.1101/2024.03.13.584872] [Indexed: 04/04/2024]
Abstract
Current knowledge in three-dimensional (3D) chromatin regulation in normal and disease states was mostly accumulated through Hi-C profiling in in vitro cell culture system. The limitations include failing to recapitulate disease-specific physiological properties and often lacking clinically relevant disease microenvironment. In this study, we conduct tissue-specific Hi-C profiling in a pilot cohort of 12 breast tissues comprising of two normal tissues (NTs) and ten ER+ breast tumor tissues (TTs) including five primary tumors (PTs), and five tamoxifen-treated recurrent tumors (RTs). We find largely preserved compartments, highly heterogeneous topological associated domains (TADs) and intensively variable chromatin loops among breast tumors, demonstrating 3D chromatin-regulated breast tumor heterogeneity. Further cross-examination identifies RT-specific looping-mediated biological pathways and suggests CA2, an enhancer-promoter looping (EPL)-mediated target gene within the bicarbonate transport metabolism pathway, might play a role in driving the tamoxifen resistance. Remarkably, the inhibition of CA2 not only impedes tumor growth both in vitro and in vivo, but also reverses chromatin looping. Our study thus yields significant mechanistic insights into the role and clinical relevance of 3D chromatin architecture in breast cancer endocrine resistance.
|
62
|
Yang KB, Rasouly A, Epshtein V, Martinez C, Nguyen T, Shamovsky I, Nudler E. Persistence of backtracking by human RNA polymerase II. Mol Cell 2024; 84:897-909.e4. [PMID: 38340716 DOI: 10.1016/j.molcel.2024.01.019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2023] [Revised: 11/20/2023] [Accepted: 01/22/2024] [Indexed: 02/12/2024]
Abstract
RNA polymerase II (RNA Pol II) can backtrack during transcription elongation, exposing the 3' end of nascent RNA. Nascent RNA sequencing can approximate the location of backtracking events that are quickly resolved; however, the extent and genome-wide distribution of more persistent backtracking are unknown. Consequently, we developed a method to directly sequence the extruded, "backtracked" 3' RNA. Our data show that RNA Pol II slides backward more than 20 nt in human cells and can persist in this backtracked state. Persistent backtracking mainly occurs where RNA Pol II pauses near promoters and intron-exon junctions and is enriched in genes involved in translation, replication, and development, where gene expression is decreased if these events are unresolved. Histone genes are highly prone to persistent backtracking, and the resolution of such events is likely required for timely expression during cell division. These results demonstrate that persistent backtracking can potentially affect diverse gene expression programs.
Affiliation(s)
- Kevin B Yang
- Department of Biochemistry and Molecular Pharmacology, New York University Grossman School of Medicine, New York, NY 10016, USA
| | - Aviram Rasouly
- Department of Biochemistry and Molecular Pharmacology, New York University Grossman School of Medicine, New York, NY 10016, USA; Howard Hughes Medical Institute, NYU Langone Health, New York, NY 10016, USA
| | - Vitaly Epshtein
- Department of Biochemistry and Molecular Pharmacology, New York University Grossman School of Medicine, New York, NY 10016, USA
| | - Criseyda Martinez
- Department of Biochemistry and Molecular Pharmacology, New York University Grossman School of Medicine, New York, NY 10016, USA
| | - Thao Nguyen
- Department of Biochemistry and Molecular Pharmacology, New York University Grossman School of Medicine, New York, NY 10016, USA
| | - Ilya Shamovsky
- Department of Biochemistry and Molecular Pharmacology, New York University Grossman School of Medicine, New York, NY 10016, USA
| | - Evgeny Nudler
- Department of Biochemistry and Molecular Pharmacology, New York University Grossman School of Medicine, New York, NY 10016, USA; Howard Hughes Medical Institute, NYU Langone Health, New York, NY 10016, USA.
| |
|
63
|
Kohyama S, Frohn BP, Babl L, Schwille P. Machine learning-aided design and screening of an emergent protein function in synthetic cells. Nat Commun 2024; 15:2010. [PMID: 38443351 PMCID: PMC10914801 DOI: 10.1038/s41467-024-46203-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2023] [Accepted: 02/16/2024] [Indexed: 03/07/2024] Open
Abstract
Recently, utilization of Machine Learning (ML) has led to astonishing progress in computational protein design, bringing into reach the targeted engineering of proteins for industrial and biomedical applications. However, the design of proteins for emergent functions of core relevance to cells, such as the ability to spatiotemporally self-organize and thereby structure the cellular space, is still extremely challenging. While on the generative side conditional generative models and multi-state design are on the rise, for emergent functions there is a lack of tailored screening methods as typically needed in a protein design project, both computational and experimental. Here we describe a proof-of-principle of how such screening, in silico and in vitro, can be achieved for ML-generated variants of a protein that forms intracellular spatiotemporal patterns. For computational screening we use a structure-based divide-and-conquer approach to find the most promising candidates, while for the subsequent in vitro screening we use synthetic cell-mimics as established by Bottom-Up Synthetic Biology. We then show that the best screened candidate can indeed completely substitute the wildtype gene in Escherichia coli. These results raise great hopes for the next level of synthetic biology, where ML-designed synthetic proteins will be used to engineer cellular functions.
Affiliation(s)
- Shunshi Kohyama
- Dept. Cellular and Molecular Biophysics, Max Planck Institute of Biochemistry, Martinsried, D-82152, Germany
| | - Béla P Frohn
- Dept. Cellular and Molecular Biophysics, Max Planck Institute of Biochemistry, Martinsried, D-82152, Germany
| | - Leon Babl
- Dept. Cellular and Molecular Biophysics, Max Planck Institute of Biochemistry, Martinsried, D-82152, Germany
| | - Petra Schwille
- Dept. Cellular and Molecular Biophysics, Max Planck Institute of Biochemistry, Martinsried, D-82152, Germany.
| |
|
64
|
Hong L, Kortemme T. An integrative approach to protein sequence design through multiobjective optimization. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.03.01.582670. [PMID: 38496480 PMCID: PMC10942313 DOI: 10.1101/2024.03.01.582670] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/19/2024]
Abstract
With recent methodological advances in the field of computational protein design, in particular those based on deep learning, there is an increasing need for frameworks that allow for coherent, direct integration of different models and objective functions into the generative design process. Here we demonstrate how evolutionary multiobjective optimization techniques can be adapted to provide such an approach. With the established Non-dominated Sorting Genetic Algorithm II (NSGA-II) as the optimization framework, we use AlphaFold2 and ProteinMPNN confidence metrics to define the objective space, and a mutation operator composed of ESM-1v and ProteinMPNN to rank and then redesign the least favorable positions. Using the multistate design problem of the foldswitching protein RfaH as an in-depth case study, we show that the evolutionary multiobjective optimization approach leads to significant reduction in the bias and variance in RfaH native sequence recovery, compared to a direct application of ProteinMPNN. We suggest that this improvement is due to three factors: (i) the use of an informative mutation operator that accelerates the sequence space exploration, (ii) the parallel, iterative design process inherent to the genetic algorithm that improves upon the ProteinMPNN autoregressive sequence decoding scheme, and (iii) the explicit approximation of the Pareto front that leads to optimal design candidates representing diverse tradeoff conditions. We anticipate this approach to be readily adaptable to different models and broadly relevant for protein design tasks with complex specifications.
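The "explicit approximation of the Pareto front" mentioned in this abstract can be illustrated with a minimal Python sketch of non-dominated filtering, the core comparison behind NSGA-II-style sorting. This is not the authors' implementation: the function names and the two-objective scores are hypothetical, and both objectives are assumed to be maximized.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is no worse than b on every objective and
    strictly better on at least one (all objectives maximized)."""
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Return the indices of candidates that no other candidate dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical (state A confidence, state B confidence) scores for four designs.
scores = [(0.9, 0.2), (0.6, 0.6), (0.5, 0.5), (0.2, 0.9)]
front = pareto_front(scores)
print(front)
```

In this toy example the third design is dominated by the second, so the front keeps the remaining three, each representing a different tradeoff between the two states.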
Affiliation(s)
- Lu Hong
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Tanja Kortemme
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA 94158, USA
- Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, CA 94158, USA
- Chan Zuckerberg Biohub, San Francisco, CA 94158, USA
| |
|
65
|
Ishigami Y, Wong MS, Martí-Gómez C, Ayaz A, Kooshkbaghi M, Hanson SM, McCandlish DM, Krainer AR, Kinney JB. Specificity, synergy, and mechanisms of splice-modifying drugs. Nat Commun 2024; 15:1880. [PMID: 38424098 PMCID: PMC10904865 DOI: 10.1038/s41467-024-46090-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2023] [Accepted: 02/10/2024] [Indexed: 03/02/2024] Open
Abstract
Drugs that target pre-mRNA splicing hold great therapeutic potential, but the quantitative understanding of how these drugs work is limited. Here we introduce mechanistically interpretable quantitative models for the sequence-specific and concentration-dependent behavior of splice-modifying drugs. Using massively parallel splicing assays, RNA-seq experiments, and precision dose-response curves, we obtain quantitative models for two small-molecule drugs, risdiplam and branaplam, developed for treating spinal muscular atrophy. The results quantitatively characterize the specificities of risdiplam and branaplam for 5' splice site sequences, suggest that branaplam recognizes 5' splice sites via two distinct interaction modes, and contradict the prevailing two-site hypothesis for risdiplam activity at SMN2 exon 7. The results also show that anomalous single-drug cooperativity, as well as multi-drug synergy, are widespread among small-molecule drugs and antisense-oligonucleotide drugs that promote exon inclusion. Our quantitative models thus clarify the mechanisms of existing treatments and provide a basis for the rational development of new therapies.
Affiliation(s)
- Yuma Ishigami
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
| | - Mandy S Wong
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
- Beam Therapeutics, Cambridge, MA, 02142, USA
| | | | - Andalus Ayaz
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
| | - Mahdi Kooshkbaghi
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
- The Estée Lauder Companies, New York, NY, 10153, USA
| | | | | | - Adrian R Krainer
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA.
| | - Justin B Kinney
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA.
| |
|
66
|
Zhang E, Neugebauer ME, Krasnow NA, Liu DR. Phage-assisted evolution of highly active cytosine base editors with enhanced selectivity and minimal sequence context preference. Nat Commun 2024; 15:1697. [PMID: 38402281 PMCID: PMC10894238 DOI: 10.1038/s41467-024-45969-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2023] [Accepted: 02/07/2024] [Indexed: 02/26/2024] Open
Abstract
TadA-derived cytosine base editors (TadCBEs) enable programmable C•G-to-T•A editing while retaining the small size, high on-target activity, and low off-target activity of TadA deaminases. Existing TadCBEs, however, exhibit residual A•T-to-G•C editing at certain positions and lower editing efficiencies at some sequence contexts and with non-SpCas9 targeting domains. To address these limitations, we use phage-assisted evolution to evolve CBE6s from a TadA-mediated dual cytosine and adenine base editor, discovering mutations at N46 and Y73 in TadA that prevent A•T-to-G•C editing and improve C•G-to-T•A editing with expanded sequence-context compatibility, respectively. In E. coli, CBE6 variants offer high C•G-to-T•A editing and no detected A•T-to-G•C editing in any sequence context. In human cells, CBE6 variants exhibit broad Cas domain compatibility and retain low off-target editing despite exceeding BE4max and previous TadCBEs in on-target editing efficiency. Finally, we show that the high selectivity of CBE6 variants is well-suited for therapeutically relevant stop codon installation without creating unwanted missense mutations from residual A•T-to-G•C editing.
Affiliation(s)
- Emily Zhang
- Merkin Institute of Transformative Technologies in Healthcare, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA, USA
- Howard Hughes Medical Institute, Harvard University, Cambridge, MA, USA
| | - Monica E Neugebauer
- Merkin Institute of Transformative Technologies in Healthcare, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA, USA
- Howard Hughes Medical Institute, Harvard University, Cambridge, MA, USA
| | - Nicholas A Krasnow
- Merkin Institute of Transformative Technologies in Healthcare, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA, USA
- Howard Hughes Medical Institute, Harvard University, Cambridge, MA, USA
| | - David R Liu
- Merkin Institute of Transformative Technologies in Healthcare, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA, USA.
- Howard Hughes Medical Institute, Harvard University, Cambridge, MA, USA.
| |
|
67
|
Sørensen CV, Hofmann N, Rawat P, Sørensen FV, Ljungars A, Greiff V, Laustsen AH, Jenkins TP. ExpoSeq: simplified analysis of high-throughput sequencing data from antibody discovery campaigns. BIOINFORMATICS ADVANCES 2024; 4:vbae020. [PMID: 38425781 PMCID: PMC10902677 DOI: 10.1093/bioadv/vbae020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2023] [Revised: 01/08/2024] [Accepted: 02/08/2024] [Indexed: 03/02/2024]
Abstract
Summary: High-throughput sequencing (HTS) offers a modern, fast, and explorative solution to unveil the full potential of display techniques, like antibody phage display, in molecular biology. However, a significant challenge lies in the processing and analysis of such data. Furthermore, there is a notable absence of open-access user-friendly software tools that can be utilized by scientists lacking programming expertise. Here, we present ExpoSeq as an easy-to-use tool to explore, process, and visualize HTS data from antibody discovery campaigns like an expert while only requiring a beginner's knowledge.
Availability and implementation: The pipeline is distributed via GitHub and PyPI, and it can either be installed as a package with pip or the user can choose to clone the repository.
Affiliation(s)
- Christoffer V Sørensen
- Department of Biotechnology and Biomedicine, Technical University of Denmark, DK-2800 Kongens Lyngby, Denmark
| | - Nils Hofmann
- Department of Biotechnology and Biomedicine, Technical University of Denmark, DK-2800 Kongens Lyngby, Denmark
| | - Puneet Rawat
- Department of Immunology, University of Oslo and Oslo University Hospital, NO-0316 Oslo, Norway
| | | | - Anne Ljungars
- Department of Biotechnology and Biomedicine, Technical University of Denmark, DK-2800 Kongens Lyngby, Denmark
| | - Victor Greiff
- Department of Immunology, University of Oslo and Oslo University Hospital, NO-0316 Oslo, Norway
| | - Andreas H Laustsen
- Department of Biotechnology and Biomedicine, Technical University of Denmark, DK-2800 Kongens Lyngby, Denmark
| | - Timothy P Jenkins
- Department of Biotechnology and Biomedicine, Technical University of Denmark, DK-2800 Kongens Lyngby, Denmark
| |
|
68
|
Hernández G, García A, Weingarten-Gabbay S, Mishra R, Hussain T, Amiri M, Moreno-Hagelsieb G, Montiel-Dávalos A, Lasko P, Sonenberg N. Functional analysis of the AUG initiator codon context reveals novel conserved sequences that disfavor mRNA translation in eukaryotes. Nucleic Acids Res 2024; 52:1064-1079. [PMID: 38038264 PMCID: PMC10853783 DOI: 10.1093/nar/gkad1152] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/22/2023] [Revised: 11/09/2023] [Accepted: 11/15/2023] [Indexed: 12/02/2023] Open
Abstract
mRNA translation is a fundamental process for life. Selection of the translation initiation site (TIS) is crucial, as it establishes the correct open reading frame for mRNA decoding. Studies in vertebrate mRNAs discovered that a purine at -3 and a G at +4 (where the A of the AUG initiator codon is numbered +1) promote TIS recognition. However, the TIS context in other eukaryotes has received little experimental analysis. We analyzed in vitro the influence of the -3, -2, -1 and +4 positions of the TIS context in rabbit, Drosophila, wheat, and yeast. We observed that -3A conferred the best translational efficiency across these species. However, we found variability at the +4 position for optimal translation. In addition, the Kozak motif that was defined from mammalian cells was only weakly predictive for wheat and essentially non-predictive for yeast. We discovered eight conserved sequences that significantly disfavored translation. Due to the big differences in translational efficiency observed among weak TIS context sequences, we define a novel category that we termed 'barren AUG context sequences (BACS)', which represent sequences disfavoring translation. Analysis of the structures of mRNA-ribosome complexes provided insights into the function of BACS. The gene ontology of the BACS-containing mRNAs is presented.
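The position numbering used in this abstract (A of the AUG is +1, there is no position 0) can be made concrete with a short Python sketch. It only reads the -3 and +4 nucleotides and checks the vertebrate rule quoted above (purine at -3, G at +4); the function name, the returned keys, and the example sequence are hypothetical and not taken from the paper.

```python
def tis_context(mrna: str, aug_index: int) -> dict:
    """Report the -3 and +4 nucleotides around an AUG start codon.

    aug_index is the 0-based index of the A of the AUG (position +1).
    Because there is no position 0, -3 sits three bases upstream of the A,
    and +4 is the base immediately after the G of the AUG.
    """
    seq = mrna.upper().replace("T", "U")
    if seq[aug_index:aug_index + 3] != "AUG":
        raise ValueError("no AUG at the given index")
    if aug_index < 3 or aug_index + 3 >= len(seq):
        raise ValueError("sequence too short to read -3 and +4")
    minus3 = seq[aug_index - 3]
    plus4 = seq[aug_index + 3]
    return {
        "minus3": minus3,
        "plus4": plus4,
        "purine_at_minus3": minus3 in "AG",
        "g_at_plus4": plus4 == "G",
    }


# Hypothetical example sequence with the AUG starting at index 6.
context = tis_context("GCCACCAUGG", 6)
print(context)
```

For this example sequence the -3 base is A and the +4 base is G, so both checks pass. The sketch says nothing about translational efficiency, which the abstract reports varies by species.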
Affiliation(s)
- Greco Hernández
- mRNA and Cancer Laboratory, Unit of Biomedical Research on Cancer, National Institute of Cancer (INCan), Mexico City 14080, Mexico
| | - Alejandra García
- mRNA and Cancer Laboratory, Unit of Biomedical Research on Cancer, National Institute of Cancer (INCan), Mexico City 14080, Mexico
| | - Shira Weingarten-Gabbay
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
- Laboratory of Virology and Infectious Disease, The Rockefeller University, New York, NY, USA
| | - Rishi Kumar Mishra
- Department of Developmental Biology and Genetics, Indian Institute of Science, Bengaluru-560012, India
| | - Tanweer Hussain
- Department of Developmental Biology and Genetics, Indian Institute of Science, Bengaluru-560012, India
| | - Mehdi Amiri
- Department of Biochemistry and Goodman Cancer Institute, McGill University, Montreal, QC H3A 1A3, Canada
| | - Gabriel Moreno-Hagelsieb
- Department of Biology, Wilfrid Laurier University, 75 University Ave. W, Waterloo, ON N2L 3C5, Canada
| | - Angélica Montiel-Dávalos
- mRNA and Cancer Laboratory, Unit of Biomedical Research on Cancer, National Institute of Cancer (INCan), Mexico City 14080, Mexico
| | - Paul Lasko
- Department of Biology, McGill University, Montreal, QC H3G 0B1, Canada
| | - Nahum Sonenberg
- Department of Biochemistry and Goodman Cancer Institute, McGill University, Montreal, QC H3A 1A3, Canada
| |
|
69
|
Montua N, Thye P, Hartwig P, Kühle M, Sewald N. Enzymatic Peptide and Protein Bromination: The BromoTrp Tag. Angew Chem Int Ed Engl 2024; 63:e202314961. [PMID: 38009455 DOI: 10.1002/anie.202314961] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2023] [Revised: 11/21/2023] [Accepted: 11/24/2023] [Indexed: 11/28/2023]
Abstract
Bio-orthogonal reactions for modification of proteins and unprotected peptides are of high value in chemical biology. The combination of enzymatic halogenation with transition metal-catalyzed cross-coupling provides a feasible approach for the modification of proteins and unprotected peptides. By a semirational protein engineering approach, variants of the tryptophan 6-halogenase Thal were identified that enable efficient bromination of peptides with a C-terminal tryptophan residue. The substrate scope was explored using di-, tri-, and tetrapeptide arrays, leading to the identification of an optimized peptide tag we named BromoTrp tag. This tag was introduced into three model proteins. Preparative scale post-translational bromination was possible with only a single cultivation and purification step using the brominating E. coli coexpression system Brocoli. Palladium-catalyzed Suzuki-Miyaura cross-coupling of the bromoarene was achieved with Pd nanoparticle catalysts at 37 °C, highlighting the rich potential of this strategy for bio-orthogonal functionalization and conjugation.
Affiliation(s)
- Nicolai Montua
- Organic and Bioorganic Chemistry, Department of Chemistry, Bielefeld University, Universitätsstraße 25, 33615, Bielefeld, Germany
| | - Paula Thye
- Organic and Bioorganic Chemistry, Department of Chemistry, Bielefeld University, Universitätsstraße 25, 33615, Bielefeld, Germany
| | - Pia Hartwig
- Organic and Bioorganic Chemistry, Department of Chemistry, Bielefeld University, Universitätsstraße 25, 33615, Bielefeld, Germany
| | - Matthias Kühle
- Organic and Bioorganic Chemistry, Department of Chemistry, Bielefeld University, Universitätsstraße 25, 33615, Bielefeld, Germany
| | - Norbert Sewald
- Organic and Bioorganic Chemistry, Department of Chemistry, Bielefeld University, Universitätsstraße 25, 33615, Bielefeld, Germany
| |
|
70
|
Mehmood F, Arshad S, Shoaib M. ADH-Enhancer: an attention-based deep hybrid framework for enhancer identification and strength prediction. Brief Bioinform 2024; 25:bbae030. [PMID: 38385876 PMCID: PMC10885011 DOI: 10.1093/bib/bbae030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2023] [Revised: 12/30/2023] [Accepted: 01/11/2024] [Indexed: 02/23/2024] Open
Abstract
Enhancers play an important role in the regulation of gene expression. The abundance or absence of enhancers in a DNA sequence, and irregularities in enhancer strength, affect gene expression and can lead to the initiation and propagation of diverse genetic diseases such as hemophilia, bladder cancer, diabetes and congenital disorders. Enhancer identification and strength prediction through experimental approaches is expensive, time-consuming and error-prone. To accelerate research on enhancer identification and strength prediction, around 19 computational frameworks have been proposed. These frameworks use machine and deep learning methods that take raw DNA sequences and predict an enhancer's presence and strength. However, these frameworks still lack in performance and are not useful in real-time analysis. This paper presents a novel deep learning framework that uses language modeling strategies to transform DNA sequences into a statistical feature space. It applies transfer learning by training a language model in an unsupervised fashion to predict a group of nucleotides, also known as a k-mer, from the context of the existing k-mers in a sequence. At the classification stage, it presents a novel classifier that reaps the benefits of two different architectures: a convolutional neural network and an attention mechanism. The proposed framework is evaluated on the enhancer identification benchmark dataset, where it outperforms the existing best-performing framework by 5% and 9% in terms of accuracy and MCC, respectively. Similarly, when evaluated on the enhancer strength prediction benchmark dataset, it outperforms the existing best-performing framework by 4% and 7% in terms of accuracy and MCC, respectively.
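The k-mer representation that this abstract builds on can be sketched in a few lines of Python. This is a generic overlapping k-mer tokenizer, not the ADH-Enhancer code; the function name and the default values of k and stride are assumptions chosen for illustration.

```python
from typing import List


def kmers(seq: str, k: int = 3, stride: int = 1) -> List[str]:
    """Split a DNA sequence into k-mers taken every `stride` bases.

    With stride=1 consecutive k-mers overlap by k-1 bases, which is the
    usual way to turn a sequence into "words" for a language model.
    """
    if k <= 0 or stride <= 0:
        raise ValueError("k and stride must be positive")
    seq = seq.upper()
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


# Hypothetical six-base sequence.
tokens = kmers("ACGTAC", k=3)
print(tokens)
```

A sequence of length n yields n - k + 1 overlapping k-mers at stride 1, so the six-base example produces four 3-mers; a sequence shorter than k produces an empty list.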
Affiliation(s)
- Faiza Mehmood
- Department of Computer Science, University of Engineering and Technology Lahore (Faisalabad Campus), Pakistan
| | - Shazia Arshad
- Department of Computer Science, University of Engineering and Technology Lahore, 54890, Pakistan
| | - Muhammad Shoaib
- Department of Computer Science, University of Engineering and Technology Lahore, 54890, Pakistan
| |
|
71
|
Ballmer D, Lou HJ, Ishii M, Turk BE, Akiyoshi B. An unconventional regulatory circuitry involving Aurora B controls anaphase onset and error-free chromosome segregation in trypanosomes. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.01.20.576407. [PMID: 38293145 PMCID: PMC10827227 DOI: 10.1101/2024.01.20.576407] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2024]
Abstract
Accurate chromosome segregation during mitosis requires that all chromosomes establish stable bi-oriented attachments with the spindle apparatus. Kinetochores form the interface between chromosomes and spindle microtubules and as such are under tight control by complex regulatory circuitry. As part of the chromosomal passenger complex (CPC), the Aurora B kinase plays a central role within this circuitry by destabilizing improper kinetochore-microtubule attachments and relaying the attachment status to the spindle assembly checkpoint, a feedback control system that delays the onset of anaphase by inhibiting the anaphase-promoting complex/cyclosome. Intriguingly, Aurora B is conserved even in kinetoplastids, an evolutionarily divergent group of eukaryotes, whose kinetochores are composed of a unique set of structural and regulatory proteins. Kinetoplastids do not have a canonical spindle checkpoint and it remains unclear how their kinetochores are regulated to ensure the fidelity and timing of chromosome segregation. Here, we show in Trypanosoma brucei, the kinetoplastid parasite that causes African sleeping sickness, that inhibition of Aurora B using an analogue-sensitive approach arrests cells in metaphase, with a reduction in properly bi-oriented kinetochores. Aurora B phosphorylates several kinetochore proteins in vitro, including the N-terminal region of the divergent Bub1-like protein KKT14. Depletion of KKT14 partially overrides the cell cycle arrest caused by Aurora B inhibition, while overexpression of a non-phosphorylatable KKT14 protein results in a prominent delay in the metaphase-to-anaphase transition. Finally, we demonstrate using a nanobody-based system that re-targeting the catalytic module of the CPC to the outer kinetochore is sufficient to promote mitotic exit but causes massive chromosome mis-segregation in anaphase. 
Our results indicate that the CPC and KKT14 are involved in an unconventional pathway controlling mitotic exit and error-free chromosome segregation in trypanosomes.
Affiliation(s)
- Daniel Ballmer
- Department of Biochemistry, University of Oxford, South Parks Road, Oxford, OX1 3QU, United Kingdom
| | - Hua Jane Lou
- Department of Pharmacology, Yale School of Medicine, New Haven, CT, USA
| | - Midori Ishii
- Department of Biochemistry, University of Oxford, South Parks Road, Oxford, OX1 3QU, United Kingdom
- The Wellcome Centre for Cell Biology, Institute of Cell Biology, School of Biological Sciences, University of Edinburgh, Max Born Crescent Edinburgh, EH9 3BF, United Kingdom
| | - Benjamin E. Turk
- Department of Pharmacology, Yale School of Medicine, New Haven, CT, USA
| | - Bungo Akiyoshi
- Department of Biochemistry, University of Oxford, South Parks Road, Oxford, OX1 3QU, United Kingdom
- The Wellcome Centre for Cell Biology, Institute of Cell Biology, School of Biological Sciences, University of Edinburgh, Max Born Crescent Edinburgh, EH9 3BF, United Kingdom
| |
|
72
|
Rondthaler S, Sarker B, Howitz N, Shah I, Andrews LB. Toolbox of Characterized Genetic Parts for Staphylococcus aureus. ACS Synth Biol 2024; 13:103-118. [PMID: 38064657 PMCID: PMC10805105 DOI: 10.1021/acssynbio.3c00325] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2023] [Revised: 10/06/2023] [Accepted: 10/10/2023] [Indexed: 01/23/2024]
Abstract
Staphylococcus aureus is an important clinical bacterium prevalent in human-associated microbiomes and the cause of many diseases. However, S. aureus has been intractable to synthetic biology approaches due to limited characterized genetic parts for this nonmodel Gram-positive bacterium. Moreover, genetic manipulation of S. aureus has relied on cumbersome and inefficient cloning strategies. Here, we report the first standardized genetic parts toolbox for S. aureus, which includes characterized promoters, ribosome binding sites, terminators, and plasmid replicons from a variety of bacteria for precise control of gene expression. We established a standard relative expression unit (REU) for S. aureus using a plasmid reference and characterized genetic parts in standardized REUs using S. aureus ATCC 12600. We constructed promoter and terminator part plasmids that are compatible with an efficient Type IIS DNA assembly strategy to effectively build multipart DNA constructs. A library of 24 constitutive promoters was built and characterized in S. aureus, which showed a 380-fold activity range. This promoter library was also assayed in Bacillus subtilis (122-fold activity range) to demonstrate the transferability of the constitutive promoters between these Gram-positive bacteria. By applying an iterative design-build-test-learn cycle, we demonstrated the use of our toolbox for the rational design and engineering of a tetracycline sensor in S. aureus using the PXyl-TetO aTc-inducible promoter that achieved 25.8-fold induction. This toolbox greatly expands the growing number of genetic parts for Gram-positive bacteria and will allow researchers to leverage synthetic biology approaches to study and engineer cellular processes in S. aureus.
Affiliation(s)
- Stephen N. Rondthaler
- Department of Chemical Engineering, University of Massachusetts Amherst, Amherst, Massachusetts 01003, United States
| | - Biprodev Sarker
- Department of Chemical Engineering, University of Massachusetts Amherst, Amherst, Massachusetts 01003, United States
| | - Nathaniel Howitz
- Department of Chemical Engineering, University of Massachusetts Amherst, Amherst, Massachusetts 01003, United States
| | - Ishita Shah
- Department of Chemical Engineering, University of Massachusetts Amherst, Amherst, Massachusetts 01003, United States
| | - Lauren B. Andrews
- Department of Chemical Engineering, University of Massachusetts Amherst, Amherst, Massachusetts 01003, United States
- Molecular and Cellular Biology Graduate Program, University of Massachusetts Amherst, Amherst, Massachusetts 01003, United States
- Biotechnology Training Program, University of Massachusetts Amherst, Amherst, Massachusetts 01003, United States
| |
|
73
|
Naeem FM, Gemler BT, McNutt ZA, Bundschuh R, Fredrick K. Analysis of programmed frameshifting during translation of prfB in Flavobacterium johnsoniae. RNA (NEW YORK, N.Y.) 2024; 30:136-148. [PMID: 37949662 PMCID: PMC10798248 DOI: 10.1261/rna.079721.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2023] [Accepted: 10/27/2023] [Indexed: 11/12/2023]
Abstract
Ribosomes of Bacteroidia fail to recognize Shine-Dalgarno (SD) sequences due to sequestration of the 3' tail of the 16S rRNA on the 30S platform. Yet in these organisms, the prfB gene typically contains the programmed +1 frameshift site with its characteristic SD sequence. Here, we investigate prfB autoregulation in Flavobacterium johnsoniae, a member of the Bacteroidia. We find that the efficiency of prfB frameshifting in F. johnsoniae is low (∼7%) relative to that in Escherichia coli (∼50%). Mutation or truncation of bS21 in F. johnsoniae increases frameshifting substantially, suggesting that anti-SD (ASD) sequestration is responsible for the reduced efficiency. The frameshift site of certain Flavobacteriales, such as Winogradskyella psychrotolerans, has no SD. In F. johnsoniae, this W. psychrotolerans sequence supports frameshifting as well as the native sequence, and mutation of bS21 causes no enhancement. These data suggest that prfB frameshifting normally occurs without SD-ASD pairing, at least under optimal laboratory growth conditions. Chromosomal mutations that remove the frameshift or ablate the SD confer subtle growth defects in the presence of paraquat or streptomycin, respectively, indicating that both the autoregulatory mechanism and the SD element contribute to F. johnsoniae cell fitness. Analysis of prfB frameshift sites across 2686 representative bacteria shows loss of the SD sequence in many clades, with no obvious relationship to genome-wide SD usage. These data reveal unexpected variation in the mechanism of frameshifting and identify another group of organisms, the Verrucomicrobiales, that globally lack SD sequences.
Affiliation(s)
- Fawwaz M Naeem
- Ohio State Biochemistry Program, The Ohio State University, Columbus, Ohio 43210, USA
- Center for RNA Biology, The Ohio State University, Columbus, Ohio 43210, USA
| | - Bryan T Gemler
- Center for RNA Biology, The Ohio State University, Columbus, Ohio 43210, USA
- Interdisciplinary Biophysics Graduate Program, The Ohio State University, Columbus, Ohio 43210, USA
| | - Zakkary A McNutt
- Ohio State Biochemistry Program, The Ohio State University, Columbus, Ohio 43210, USA
- Center for RNA Biology, The Ohio State University, Columbus, Ohio 43210, USA
| | - Ralf Bundschuh
- Center for RNA Biology, The Ohio State University, Columbus, Ohio 43210, USA
- Interdisciplinary Biophysics Graduate Program, The Ohio State University, Columbus, Ohio 43210, USA
- Department of Physics, The Ohio State University, Columbus, Ohio 43210, USA
- Department of Chemistry and Biochemistry, The Ohio State University, Columbus, Ohio 43210, USA
- Division of Hematology, Department of Internal Medicine, The Ohio State University, Columbus, Ohio 43210, USA
| | - Kurt Fredrick
- Ohio State Biochemistry Program, The Ohio State University, Columbus, Ohio 43210, USA
- Center for RNA Biology, The Ohio State University, Columbus, Ohio 43210, USA
- Department of Microbiology, The Ohio State University, Columbus, Ohio 43210, USA
|
74
|
Klein TA, Shah PY, Gkragkopoulou P, Grebenc DW, Kim Y, Whitney JC. Structure of a tripartite protein complex that targets toxins to the type VII secretion system. Proc Natl Acad Sci U S A 2024; 121:e2312455121. [PMID: 38194450 PMCID: PMC10801868 DOI: 10.1073/pnas.2312455121] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 07/20/2023] [Accepted: 11/20/2023] [Indexed: 01/11/2024] Open
Abstract
Type VII secretion systems are membrane-embedded nanomachines used by Gram-positive bacteria to export effector proteins from the cytoplasm to the extracellular environment. Many of these effectors are polymorphic toxins comprised of an N-terminal Leu-x-Gly (LXG) domain of unknown function and a C-terminal toxin domain that inhibits the growth of bacterial competitors. In recent work, it was shown that LXG effectors require two cognate Lap proteins for T7SS-dependent export. Here, we present the 2.6 Å structure of the LXG domain of the TelA toxin from the opportunistic pathogen Streptococcus intermedius in complex with both of its cognate Lap targeting factors. The structure reveals an elongated α-helical bundle within which each Lap protein makes extensive hydrophobic contacts with either end of the LXG domain. Remarkably, despite low overall sequence identity, we identify striking structural similarity between our LXG complex and PE-PPE heterodimers exported by the distantly related ESX type VII secretion systems of Mycobacteria implying a conserved mechanism of effector export among diverse Gram-positive bacteria. Overall, our findings demonstrate that LXG domains, in conjunction with their cognate Lap targeting factors, represent a tripartite secretion signal for a widespread family of T7SS toxins.
Affiliation(s)
- Timothy A. Klein
- Michael DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, ON L8S 4K1, Canada
- Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, ON L8S 4K1, Canada
| | - Prakhar Y. Shah
- Michael DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, ON L8S 4K1, Canada
- Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, ON L8S 4K1, Canada
| | - Polyniki Gkragkopoulou
- Michael DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, ON L8S 4K1, Canada
- Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, ON L8S 4K1, Canada
| | - Dirk W. Grebenc
- Michael DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, ON L8S 4K1, Canada
- Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, ON L8S 4K1, Canada
| | - Youngchang Kim
- Structural Biology Center, X-ray Science Division, Advanced Photon Source, Argonne National Laboratory, Lemont, IL 60439
| | - John C. Whitney
- Michael DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, ON L8S 4K1, Canada
- Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, ON L8S 4K1, Canada
- David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, ON L8S 4K1, Canada
|
75
|
Zhu Y, Vvedenskaya IO, Sze SH, Nickels BE, Kaplan CD. Quantitative analysis of transcription start site selection reveals control by DNA sequence, RNA polymerase II activity and NTP levels. Nat Struct Mol Biol 2024; 31:190-202. [PMID: 38177677 PMCID: PMC10928753 DOI: 10.1038/s41594-023-01171-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2021] [Accepted: 11/03/2023] [Indexed: 01/06/2024]
Abstract
Transcription start site (TSS) selection is a key step in gene expression and occurs at many promoter positions over a wide range of efficiencies. Here we develop a massively parallel reporter assay to quantitatively dissect contributions of promoter sequence, nucleoside triphosphate substrate levels and RNA polymerase II (Pol II) activity to TSS selection by 'promoter scanning' in Saccharomyces cerevisiae (Pol II MAssively Systematic Transcript End Readout, 'Pol II MASTER'). Using Pol II MASTER, we measure the efficiency of Pol II initiation at 1,000,000 individual TSS sequences in a defined promoter context. Pol II MASTER confirms proposed critical qualities of S. cerevisiae TSS -8, -1 and +1 positions, quantitatively, in a controlled promoter context. Pol II MASTER extends quantitative analysis to surrounding sequences and determines that they tune initiation over a wide range of efficiencies. These results enabled the development of a predictive model for initiation efficiency based on sequence. We show that genetic perturbation of Pol II catalytic activity alters initiation efficiency mostly independently of TSS sequence, but selectively modulates preference for the initiating nucleotide. Intriguingly, we find that Pol II initiation efficiency is directly sensitive to guanosine-5'-triphosphate levels at the first five transcript positions and to cytosine-5'-triphosphate and uridine-5'-triphosphate levels at the second position genome wide. These results suggest individual nucleoside triphosphate levels can have transcript-specific effects on initiation, representing a cryptic layer of potential regulation at the level of Pol II biochemical properties. The results establish Pol II MASTER as a method for quantitative dissection of transcription initiation in eukaryotes.
Affiliation(s)
- Yunye Zhu
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, USA
| | - Irina O Vvedenskaya
- Department of Genetics and Waksman Institute, Rutgers University, Piscataway, NJ, USA
| | - Sing-Hoi Sze
- Department of Biochemistry and Biophysics, Texas A&M University, College Station, TX, USA
- Department of Computer Science and Engineering, Texas A&M University, College Station, TX, USA
| | - Bryce E Nickels
- Department of Genetics and Waksman Institute, Rutgers University, Piscataway, NJ, USA
| | - Craig D Kaplan
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, USA
|
76
|
Anaya J, Sidhom JW, Mahmood F, Baras AS. Multiple-instance learning of somatic mutations for the classification of tumour type and the prediction of microsatellite status. Nat Biomed Eng 2024; 8:57-67. [PMID: 37919367 PMCID: PMC10805698 DOI: 10.1038/s41551-023-01120-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2023] [Accepted: 09/30/2023] [Indexed: 11/04/2023]
Abstract
Large-scale genomic data are well suited to analysis by deep learning algorithms. However, for many genomic datasets, labels are at the level of the sample rather than for individual genomic measures. Machine learning models leveraging these datasets generate predictions by using statically encoded measures that are then aggregated at the sample level. Here we show that a single weakly supervised end-to-end multiple-instance-learning model with multi-headed attention can be trained to encode and aggregate the local sequence context or genomic position of somatic mutations, hence allowing for the modelling of the importance of individual measures for sample-level classification and thus providing enhanced explainability. The model solves synthetic tasks that conventional models fail at, and achieves best-in-class performance for the classification of tumour type and for predicting microsatellite status. By improving the performance of tasks that require aggregate information from genomic datasets, multiple-instance deep learning may generate biological insight.
Affiliation(s)
- Jordan Anaya
- Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - John-William Sidhom
- The Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Bloomberg~Kimmel Institute for Cancer Immunotherapy, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Faisal Mahmood
- Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
- Department of Pathology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Cancer Data Science Program, Dana-Farber Cancer Institute, Boston, MA, USA
- Harvard Data Science Initiative, Harvard University, Cambridge, MA, USA
| | - Alexander S Baras
- Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- The Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Bloomberg~Kimmel Institute for Cancer Immunotherapy, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA
|
77
|
Bachmann Salvy M, Santuari L, Schmid-Siegert E, Lykoskoufis N, Xenarios I, Arpat B. Seq2scFv: a toolkit for the comprehensive analysis of display libraries from long-read sequencing platforms. MAbs 2024; 16:2408344. [PMID: 39379324 PMCID: PMC11469439 DOI: 10.1080/19420862.2024.2408344] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2024] [Revised: 09/15/2024] [Accepted: 09/19/2024] [Indexed: 10/10/2024] Open
Abstract
Antibodies have emerged as the leading class of biotherapeutics, yet traditional screening methods face significant time and resource challenges in identifying lead candidates. Integrating high-throughput sequencing with computational approaches marks a pivotal advancement in antibody discovery, expanding the antibody space to explore. In this context, a major breakthrough has been the full-length sequencing of single-chain variable fragments (scFvs) used in in vitro display libraries. However, few tools address the task of annotating the paired heavy and light chain variable domains (VH and VL), which is the primary advantage of full-scFv sequencing. To address this methodological gap, we introduce Seq2scFv, a novel open-source toolkit designed for analyzing in vitro display libraries from long-read sequencing platforms. Seq2scFv facilitates the identification and thorough characterization of V(D)J recombination in both VH and VL regions. In addition to providing annotated scFvs, translated sequences and numbered chains, Seq2scFv enables linker inference and characterization, sequence encoding with unique identifiers and quantification of identical sequences across selection rounds, thereby simplifying enrichment identification. With its versatile and standalone functionality, we anticipate that the implementation of Seq2scFv tools in antibody discovery pipelines will efficiently expedite the full characterization of display libraries and potentially facilitate the identification of high-affinity antibody candidates.
Affiliation(s)
| | - Luca Santuari
- NGS-AI Division, JSR Life Sciences, Epalinges, Switzerland
| | - Bulak Arpat
- NGS-AI Division, JSR Life Sciences, Epalinges, Switzerland
|
78
|
Croote D, Wong JJW, Pecalvel C, Leveque E, Casanovas N, Kamphuis JBJ, Creeks P, Romero J, Sohail S, Bedinger D, Nadeau KC, Chinthrajah RS, Reber LL, Lowman HB. Widespread monoclonal IgE antibody convergence to an immunodominant, proanaphylactic Ara h 2 epitope in peanut allergy. J Allergy Clin Immunol 2024; 153:182-192.e7. [PMID: 37748654 PMCID: PMC10766438 DOI: 10.1016/j.jaci.2023.08.035] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 02/16/2023] [Revised: 07/25/2023] [Accepted: 08/31/2023] [Indexed: 09/27/2023]
Abstract
BACKGROUND: Despite their central role in peanut allergy, human monoclonal IgE antibodies have eluded characterization. OBJECTIVE: We sought to define the sequences, affinities, clonality, and functional properties of human monoclonal IgE antibodies in peanut allergy. METHODS: We applied our single-cell RNA sequencing-based SEQ SIFTER discovery platform to samples from allergic individuals who varied by age, sex, ethnicity, and geographic location in order to understand commonalities in the human IgE response to peanut allergens. Select antibodies were then recombinantly expressed and characterized for their allergen and epitope specificity, affinity, and functional properties. RESULTS: We found striking convergent evolution of IgE monoclonal antibodies (mAbs) from several clonal families comprising both memory B cells and plasmablasts. These antibodies bound with subnanomolar affinity to the immunodominant peanut allergen Ara h 2, specifically a linear, repetitive motif. Further characterization of these mAbs revealed their ability to single-handedly cause affinity-dependent degranulation of human mast cells and systemic anaphylaxis on peanut allergen challenge in humanized mice. Finally, we demonstrated that these mAbs, reengineered as IgGs, inhibit significant, but variable, amounts of Ara h 2- and peanut-mediated degranulation of mast cells sensitized with allergic plasma. CONCLUSIONS: Convergent evolution of IgE mAbs in peanut allergy is a common phenomenon that can reveal immunodominant epitopes on major allergenic proteins. Understanding the functional properties of these molecules is key to developing therapeutics, such as competitive IgG inhibitors, that are able to stoichiometrically outcompete endogenous IgE for allergen and thereby prevent allergic cascade in cases of accidental allergen exposure.
Affiliation(s)
| | - Cyprien Pecalvel
- Toulouse Institute for Infectious and Inflammatory Diseases (Infinity), UMR 1291, University of Toulouse, INSERM, CNRS, Toulouse, France
| | - Edouard Leveque
- Toulouse Institute for Infectious and Inflammatory Diseases (Infinity), UMR 1291, University of Toulouse, INSERM, CNRS, Toulouse, France
| | - Natacha Casanovas
- Toulouse Institute for Infectious and Inflammatory Diseases (Infinity), UMR 1291, University of Toulouse, INSERM, CNRS, Toulouse, France
| | - Jasper B J Kamphuis
- Toulouse Institute for Infectious and Inflammatory Diseases (Infinity), UMR 1291, University of Toulouse, INSERM, CNRS, Toulouse, France
| | - Kari C Nadeau
- Sean N. Parker Center for Allergy and Asthma Research, Stanford University School of Medicine, Stanford, Calif; Department of Medicine, Stanford University School of Medicine, Stanford, Calif
| | - Rebecca S Chinthrajah
- Sean N. Parker Center for Allergy and Asthma Research, Stanford University School of Medicine, Stanford, Calif; Department of Medicine, Stanford University School of Medicine, Stanford, Calif
| | - Laurent L Reber
- Toulouse Institute for Infectious and Inflammatory Diseases (Infinity), UMR 1291, University of Toulouse, INSERM, CNRS, Toulouse, France
|
79
|
Vuong CN, Reynolds KM, Rivera GS, Zeng B, Karimpourkalou Z, Norng M, Zhang Y, Chowdhury R, Pedersen D, Pantoja M, Collarini E, Garimalla S, Izquierdo S, Vajda EG, Antonio B, Srivastava DB, van de Lavoir MC, Abdiche Y, Harriman W, Leighton PA. Heavy chain-only antibodies with a stabilized human VH in transgenic chickens for therapeutic antibody discovery. MAbs 2024; 16:2435476. [PMID: 39607037 DOI: 10.1080/19420862.2024.2435476] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2024] [Revised: 11/23/2024] [Accepted: 11/24/2024] [Indexed: 11/29/2024] Open
Abstract
Heavy chain-only antibodies have found many applications where conventional heavy-light heterodimeric antibodies are not favorable. Heavy chain-only antibodies with their single antigen-binding domain offer the advantage of a smaller size and higher stability relative to conventional antibodies, and thus, the potential for novel targeting modalities. Domain antibodies have commonly been sourced from camelids with ex-vivo humanization or transgenic rodents expressing heavy chains without light chains, but these host species are all mammalian, limiting their capacity to elicit robust immune responses to conserved mammalian targets. We have developed transgenic chickens expressing heavy chain-only antibodies with a human variable region to combine the superior target recognition advantages of a divergent, non-mammalian host with the ability to discover single-domain binders. These birds produce robust immune responses, consisting of antigen-specific antibodies targeting diverse epitopes with a range of affinities. Biophysical attributes are favorable, with good developability profiles and low predicted immunogenicity.
|
80
|
Arnedo-Pac C, Muiños F, Gonzalez-Perez A, Lopez-Bigas N. Hotspot propensity across mutational processes. Mol Syst Biol 2024; 20:6-27. [PMID: 38177930 PMCID: PMC10883281 DOI: 10.1038/s44320-023-00001-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2023] [Revised: 10/30/2023] [Accepted: 11/09/2023] [Indexed: 01/06/2024] Open
Abstract
The sparsity of mutations observed across tumours hinders our ability to study mutation rate variability at nucleotide resolution. To circumvent this, here we investigated the propensity of mutational processes to form mutational hotspots as a readout of their mutation rate variability at single base resolution. Mutational signatures 1 and 17 have the highest hotspot propensity (5-78 times higher than other processes). After accounting for trinucleotide mutational probabilities, sequence composition and mutational heterogeneity at 10 Kbp, most (94-95%) signature 17 hotspots remain unexplained, suggesting a significant role of local genomic features. For signature 1, the inclusion of genome-wide distribution of methylated CpG sites into models can explain most (80-100%) of the hotspot propensity. There is an increased hotspot propensity of signature 1 in normal tissues and de novo germline mutations. We demonstrate that hotspot propensity is a useful readout to assess the accuracy of mutation rate models at nucleotide resolution. This new approach and the findings derived from it open up new avenues for a range of somatic and germline studies investigating and modelling mutagenesis.
Affiliation(s)
- Claudia Arnedo-Pac
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Centro de Investigación Biomédica en Red en Cáncer (CIBERONC), Instituto de Salud Carlos III, Madrid, Spain
| | - Ferran Muiños
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Centro de Investigación Biomédica en Red en Cáncer (CIBERONC), Instituto de Salud Carlos III, Madrid, Spain
| | - Abel Gonzalez-Perez
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Centro de Investigación Biomédica en Red en Cáncer (CIBERONC), Instituto de Salud Carlos III, Madrid, Spain
| | - Nuria Lopez-Bigas
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Centro de Investigación Biomédica en Red en Cáncer (CIBERONC), Instituto de Salud Carlos III, Madrid, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
- Department of Medicine and Life Sciences (MELIS), Universitat Pompeu Fabra (UPF), Barcelona, Spain
|
81
|
Zheng W, Fong JHC, Wan YK, Chu AHY, Huang Y, Wong ASL, Ho JWK. Discovery of regulatory motifs in 5' untranslated regions using interpretable multi-task learning models. Cell Syst 2023; 14:1103-1112.e6. [PMID: 38016465 DOI: 10.1016/j.cels.2023.10.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2023] [Revised: 09/18/2023] [Accepted: 10/31/2023] [Indexed: 11/30/2023]
Abstract
The sequence in the 5' untranslated regions (UTRs) is known to affect mRNA translation rates. However, the underlying regulatory grammar remains elusive. Here, we propose MTtrans, a multi-task translation rate predictor capable of learning common sequence patterns from datasets across various experimental techniques. The core premise is that common motifs are more likely to be genuinely involved in translation control. MTtrans outperforms existing methods in both accuracy and the ability to capture transferable motifs across species, highlighting its strength in identifying evolutionarily conserved sequence motifs. Our independent fluorescence-activated cell sorting coupled with deep sequencing (FACS-seq) experiment validates the impact of most motifs identified by MTtrans. Additionally, we introduce "GRU-rewiring," a technique to interpret the hidden states of the recurrent units. Gated recurrent unit (GRU)-rewiring allows us to identify regulatory element-enriched positions and examine the local effects of 5' UTR mutations. MTtrans is a powerful tool for deciphering the translation regulatory motifs.
Affiliation(s)
- Weizhong Zheng
- School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
| | - John H C Fong
- School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
| | - Yuk Kei Wan
- School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
| | - Athena H Y Chu
- School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China; Centre for Oncology and Immunology, Hong Kong Science Park, Hong Kong SAR, China
| | - Yuanhua Huang
- School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China; Department of Statistics and Actuarial Science, The University of Hong Kong, Hong Kong SAR, China; Center for Translational Stem Cell Biology, Hong Kong Science and Technology Park, Hong Kong SAR, China
| | - Alan S L Wong
- School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China; Centre for Oncology and Immunology, Hong Kong Science Park, Hong Kong SAR, China; Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong SAR, China
| | - Joshua W K Ho
- School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China; Laboratory of Data Discovery for Health (D24H) Limited, Hong Kong Science Park, Hong Kong SAR, China
|
82
|
Yang KB, Rasouly A, Epshtein V, Martinez C, Nguyen T, Shamovsky I, Nudler E. Persistence of backtracking by human RNA polymerase II. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.12.13.571520. [PMID: 38168453 PMCID: PMC10760130 DOI: 10.1101/2023.12.13.571520] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/05/2024]
Abstract
RNA polymerase II (pol II) can backtrack during transcription elongation, exposing the 3' end of nascent RNA. Nascent RNA sequencing can approximate the location of backtracking events that are quickly resolved; however, the extent and genome wide distribution of more persistent backtracking is unknown. Consequently, we developed a novel method to directly sequence the extruded, "backtracked" 3' RNA. Our data shows that pol II slides backwards more than 20 nucleotides in human cells and can persist in this backtracked state. Persistent backtracking mainly occurs where pol II pauses near promoters and intron-exon junctions, and is enriched in genes involved in translation, replication, and development, where gene expression is decreased if these events are unresolved. Histone genes are highly prone to persistent backtracking, and the resolution of such events is likely required for timely expression during cell division. These results demonstrate that persistent backtracking has the potential to affect diverse gene expression programs.
|
83
|
Yoon PH, Skopintsev P, Shi H, Chen L, Adler BA, Al-Shimary M, Craig RJ, Loi KJ, DeTurk EC, Li Z, Amerasekera J, Trinidad M, Nisonoff H, Chen K, Lahiri A, Boger R, Jacobsen S, Banfield JF, Doudna JA. Eukaryotic RNA-guided endonucleases evolved from a unique clade of bacterial enzymes. Nucleic Acids Res 2023; 51:12414-12427. [PMID: 37971304 PMCID: PMC10711439 DOI: 10.1093/nar/gkad1053] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 08/16/2023] [Revised: 10/19/2023] [Accepted: 10/24/2023] [Indexed: 11/19/2023] Open
Abstract
RNA-guided endonucleases form the crux of diverse biological processes and technologies, including adaptive immunity, transposition, and genome editing. Some of these enzymes are components of insertion sequences (IS) in the IS200/IS605 and IS607 transposon families. Both IS families encode a TnpA transposase and a TnpB nuclease, an RNA-guided enzyme ancestral to CRISPR-Cas12s. In eukaryotes, TnpB homologs occur as two distinct types, Fanzor1s and Fanzor2s. We analyzed the evolutionary relationships between prokaryotic TnpBs and eukaryotic Fanzors, which revealed that both Fanzor1s and Fanzor2s stem from a single lineage of IS607 TnpBs with unusual active site arrangement. The widespread nature of Fanzors implies that the properties of this particular lineage of IS607 TnpBs were particularly suited to adaptation in eukaryotes. Biochemical analysis of an IS607 TnpB and Fanzor1s revealed common strategies employed by TnpBs and Fanzors to co-evolve with their cognate transposases. Collectively, our results provide a new model of sequential evolution from IS607 TnpBs to Fanzor2s, and Fanzor2s to Fanzor1s that details how genes of prokaryotic origin evolve to give rise to new protein families in eukaryotes.
Affiliation(s)
- Peter H Yoon
- Department of Molecular and Cell Biology, University of California, Berkeley; Berkeley, CA, USA
- Innovative Genomics Institute; University of California, Berkeley, CA, USA
| | - Petr Skopintsev
- Innovative Genomics Institute; University of California, Berkeley, CA, USA
- California Institute for Quantitative Biosciences (QB3), University of California, Berkeley, CA, USA
| | - Honglue Shi
- Innovative Genomics Institute; University of California, Berkeley, CA, USA
- Howard Hughes Medical Institute, University of California, Berkeley; Berkeley, CA, USA
| | - LinXing Chen
- Innovative Genomics Institute; University of California, Berkeley, CA, USA
- Department of Earth and Planetary Science, University of California, Berkeley, CA, USA
| | - Benjamin A Adler
- Innovative Genomics Institute; University of California, Berkeley, CA, USA
- California Institute for Quantitative Biosciences (QB3), University of California, Berkeley, CA, USA
| | - Muntathar Al-Shimary
- Department of Molecular and Cell Biology, University of California, Berkeley; Berkeley, CA, USA
- Innovative Genomics Institute; University of California, Berkeley, CA, USA
| | - Rory J Craig
- California Institute for Quantitative Biosciences (QB3), University of California, Berkeley, CA, USA
| | - Kenneth J Loi
- Department of Molecular and Cell Biology, University of California, Berkeley; Berkeley, CA, USA
- Innovative Genomics Institute; University of California, Berkeley, CA, USA
| | - Evan C DeTurk
- Innovative Genomics Institute; University of California, Berkeley, CA, USA
- California Institute for Quantitative Biosciences (QB3), University of California, Berkeley, CA, USA
| | - Zheng Li
- Department of Molecular, Cell and Developmental Biology, University of California, Los Angeles, CA, USA
| | - Jasmine Amerasekera
- Department of Human Genetics, University of California, Los Angeles, CA, USA
| | - Marena Trinidad
- Innovative Genomics Institute; University of California, Berkeley, CA, USA
- Howard Hughes Medical Institute, University of California, Berkeley; Berkeley, CA, USA
| | - Hunter Nisonoff
- Center for Computational Biology, University of California, Berkeley; Berkeley, CA, USA
| | - Kai Chen
- Department of Molecular and Cell Biology, University of California, Berkeley; Berkeley, CA, USA
- Innovative Genomics Institute; University of California, Berkeley, CA, USA
| | - Arushi Lahiri
- Department of Molecular and Cell Biology, University of California, Berkeley; Berkeley, CA, USA
- Innovative Genomics Institute; University of California, Berkeley, CA, USA
| | - Ron Boger
- Innovative Genomics Institute; University of California, Berkeley, CA, USA
- California Institute for Quantitative Biosciences (QB3), University of California, Berkeley, CA, USA
| | - Steve Jacobsen
- Department of Molecular, Cell and Developmental Biology, University of California, Los Angeles, CA, USA
- Howard Hughes Medical Institute, University of California, Los Angeles CA, USA
| | - Jillian F Banfield
- Innovative Genomics Institute; University of California, Berkeley, CA, USA
- Department of Earth and Planetary Science, University of California, Berkeley, CA, USA
| | - Jennifer A Doudna
- Department of Molecular and Cell Biology, University of California, Berkeley; Berkeley, CA, USA
- Innovative Genomics Institute; University of California, Berkeley, CA, USA
- California Institute for Quantitative Biosciences (QB3), University of California, Berkeley, CA, USA
- Howard Hughes Medical Institute, University of California, Berkeley; Berkeley, CA, USA
- Gladstone Institutes; San Francisco, CA, USA
- Gladstone-UCSF Institute of Genomic Immunology; San Francisco, CA, USA
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory; Berkeley, CA, USA
- Department of Chemistry, University of California, Berkeley; Berkeley, CA, USA
| |
|
84
|
Jeong DE, Sundrani S, Hall RN, Krupovic M, Koonin EV, Fire AZ. DNA Polymerase Diversity Reveals Multiple Incursions of Polintons During Nematode Evolution. Mol Biol Evol 2023; 40:msad274. [PMID: 38069639 DOI: 10.1093/molbev/msad274] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Revised: 11/01/2023] [Accepted: 12/04/2023] [Indexed: 12/19/2023] Open
Abstract
Polintons are double-stranded DNA, virus-like self-synthesizing transposons widely found in eukaryotic genomes. Recent metagenomic discoveries of Polinton-like viruses are consistent with the hypothesis that Polintons invade eukaryotic host genomes through infectious viral particles. Nematode genomes contain multiple copies of Polintons and provide an opportunity to explore the natural distribution and evolution of Polintons during this process. We performed an extensive search of Polintons across nematode genomes, identifying multiple full-length Polinton copies in several species. We provide evidence of both ancient Polinton integrations and recent mobility in strains of the same nematode species. In addition to the major nematode Polinton family, we identified a group of Polintons that are overall closely related to the major family but encode a distinct protein-primed DNA polymerase B (pPolB) that is related to homologs from a different group of Polintons present outside of the Nematoda. Phylogenetic analyses on the pPolBs support the evolutionary scenarios in which these extrinsic pPolBs that seem to derive from Polinton families present in oomycetes and molluscs replaced the canonical pPolB in subsets of Polintons found in terrestrial and marine nematodes, respectively, suggesting interphylum horizontal gene transfers. The pPolBs of the terrestrial nematode and oomycete Polintons share a unique feature, an insertion of an HNH nuclease domain, whereas the pPolBs in the marine nematode Polintons share an insertion of a VSR nuclease domain with marine mollusc pPolBs. We hypothesize that horizontal gene transfer occurs among Polintons from widely different but cohabiting hosts.
Affiliation(s)
- Dae-Eun Jeong
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
| | - Sameer Sundrani
- Department of Bioengineering, Stanford University, Stanford, CA, USA
- Present address: Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, TN, USA
| | | | - Mart Krupovic
- Institut Pasteur, Université Paris Cité, Archaeal Virology Unit, Paris, France
| | - Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
| | - Andrew Z Fire
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| |
|
85
|
Teo QW, Wang Y, Lv H, Tan TJC, Lei R, Mao KJ, Wu NC. Stringent and complex sequence constraints of an IGHV1-69 broadly neutralizing antibody to influenza HA stem. Cell Rep 2023; 42:113410. [PMID: 37976161 PMCID: PMC10872586 DOI: 10.1016/j.celrep.2023.113410] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Revised: 09/29/2023] [Accepted: 10/24/2023] [Indexed: 11/19/2023] Open
Abstract
IGHV1-69 is frequently utilized by broadly neutralizing influenza antibodies to the hemagglutinin (HA) stem. These IGHV1-69 HA stem antibodies have diverse complementarity-determining region (CDR) H3 sequences. Besides, their light chains have minimal to no contact with the epitope. Consequently, sequence determinants that confer IGHV1-69 antibodies with HA stem specificity remain largely elusive. Using high-throughput experiments, this study reveals the importance of light-chain sequence for the IGHV1-69 HA stem antibody CR9114, which is the broadest influenza antibody known to date. Moreover, we demonstrate that the CDR H3 sequences from many other IGHV1-69 antibodies, including those to the HA stem, are incompatible with CR9114. Along with mutagenesis and structural analysis, our results indicate that light-chain and CDR H3 sequences coordinately determine the HA stem specificity of IGHV1-69 antibodies. Overall, this work provides molecular insights into broadly neutralizing antibody responses to influenza virus, which have important implications for universal influenza vaccine development.
Affiliation(s)
- Qi Wen Teo
- Department of Biochemistry, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA; Carl R. Woese Institute for Genomic Biology, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
| | - Yiquan Wang
- Department of Biochemistry, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
| | - Huibin Lv
- Department of Biochemistry, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA; Carl R. Woese Institute for Genomic Biology, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
| | - Timothy J C Tan
- Center for Biophysics and Quantitative Biology, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
| | - Ruipeng Lei
- Department of Biochemistry, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
| | - Kevin J Mao
- Department of Biochemistry, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
| | - Nicholas C Wu
- Department of Biochemistry, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA; Carl R. Woese Institute for Genomic Biology, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA; Center for Biophysics and Quantitative Biology, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA; Carle Illinois College of Medicine, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
| |
|
86
|
Zheng Y, Sun C, Zhang X, Ruzycki PA, Chen S. Missense mutations in CRX homeodomain cause dominant retinopathies through two distinct mechanisms. eLife 2023; 12:RP87147. [PMID: 37963072 PMCID: PMC10645426 DOI: 10.7554/elife.87147] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 11/16/2023] Open
Abstract
Homeodomain transcription factors (HD TFs) are instrumental to vertebrate development. Mutations in HD TFs have been linked to human diseases, but their pathogenic mechanisms remain elusive. Here, we use Cone-Rod Homeobox (CRX) as a model to decipher the disease-causing mechanisms of two HD mutations, p.E80A and p.K88N, that produce severe dominant retinopathies. Through integrated analysis of molecular and functional evidence in vitro and in knock-in mouse models, we uncover two novel gain-of-function mechanisms: p.E80A increases CRX-mediated transactivation of canonical CRX target genes in developing photoreceptors; p.K88N alters CRX DNA-binding specificity resulting in binding at ectopic sites and severe perturbation of CRX target gene expression. Both mechanisms produce novel retinal morphological defects and hinder photoreceptor maturation distinct from loss-of-function models. This study reveals the distinct roles of E80 and K88 residues in CRX HD regulatory functions and emphasizes the importance of transcriptional precision in normal development.
Affiliation(s)
- Yiqiao Zheng
- Molecular Genetic and Genomics Graduate Program, Division of Biological and Biomedical Sciences, Washington University in St Louis, Saint Louis, United States
- Department of Ophthalmology and Visual Sciences, Washington University in St Louis, Saint Louis, United States
| | - Chi Sun
- Molecular Genetic and Genomics Graduate Program, Division of Biological and Biomedical Sciences, Washington University in St Louis, Saint Louis, United States
- Department of Ophthalmology and Visual Sciences, Washington University in St Louis, Saint Louis, United States
| | - Xiaodong Zhang
- Department of Ophthalmology and Visual Sciences, Washington University in St Louis, Saint Louis, United States
| | - Philip A Ruzycki
- Department of Ophthalmology and Visual Sciences, Washington University in St Louis, Saint Louis, United States
- Department of Genetics, Washington University in St Louis, Saint Louis, United States
| | - Shiming Chen
- Molecular Genetic and Genomics Graduate Program, Division of Biological and Biomedical Sciences, Washington University in St Louis, Saint Louis, United States
- Department of Ophthalmology and Visual Sciences, Washington University in St Louis, Saint Louis, United States
- Department of Developmental Biology, Washington University in St Louis, Saint Louis, United States
| |
|
87
|
Kim GB, Kim JY, Lee JA, Norsigian CJ, Palsson BO, Lee SY. Functional annotation of enzyme-encoding genes using deep learning with transformer layers. Nat Commun 2023; 14:7370. [PMID: 37963869 PMCID: PMC10645960 DOI: 10.1038/s41467-023-43216-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/28/2023] [Accepted: 11/03/2023] [Indexed: 11/16/2023] Open
Abstract
Functional annotation of open reading frames in microbial genomes remains substantially incomplete. Enzymes constitute the most prevalent functional gene class in microbial genomes and can be described by their specific catalytic functions using the Enzyme Commission (EC) number. Consequently, the ability to predict EC numbers could substantially reduce the number of un-annotated genes. Here we present a deep learning model, DeepECtransformer, which utilizes transformer layers as a neural network architecture to predict EC numbers. Using the extensively studied Escherichia coli K-12 MG1655 genome, DeepECtransformer predicted EC numbers for 464 un-annotated genes. We experimentally validated the enzymatic activities predicted for three proteins (YgfF, YciO, and YjdM). Further examination of the neural network's reasoning process revealed that the trained neural network relies on functional motifs of enzymes to predict EC numbers. Thus, DeepECtransformer is a method that facilitates the functional annotation of uncharacterized genes.
Affiliation(s)
- Gi Bae Kim
- Metabolic and Biomolecular Engineering National Research Laboratory, Department of Chemical and Biomolecular Engineering (BK21 four), Korea Advanced Institute of Science and Technology (KAIST), Daejeon, 34141, Republic of Korea
- Systems Metabolic Engineering and Systems Healthcare Cross-Generation Collaborative Laboratory, Department of Chemical and Biomolecular Engineering (BK21 four), KAIST, Daejeon, 34141, Republic of Korea
- KAIST Institute for the BioCentury and KAIST Institute for Artificial Intelligence, KAIST, Daejeon, 34141, Republic of Korea
| | - Ji Yeon Kim
- Metabolic and Biomolecular Engineering National Research Laboratory, Department of Chemical and Biomolecular Engineering (BK21 four), Korea Advanced Institute of Science and Technology (KAIST), Daejeon, 34141, Republic of Korea
- Systems Metabolic Engineering and Systems Healthcare Cross-Generation Collaborative Laboratory, Department of Chemical and Biomolecular Engineering (BK21 four), KAIST, Daejeon, 34141, Republic of Korea
- KAIST Institute for the BioCentury and KAIST Institute for Artificial Intelligence, KAIST, Daejeon, 34141, Republic of Korea
| | - Jong An Lee
- Metabolic and Biomolecular Engineering National Research Laboratory, Department of Chemical and Biomolecular Engineering (BK21 four), Korea Advanced Institute of Science and Technology (KAIST), Daejeon, 34141, Republic of Korea
- Systems Metabolic Engineering and Systems Healthcare Cross-Generation Collaborative Laboratory, Department of Chemical and Biomolecular Engineering (BK21 four), KAIST, Daejeon, 34141, Republic of Korea
- KAIST Institute for the BioCentury and KAIST Institute for Artificial Intelligence, KAIST, Daejeon, 34141, Republic of Korea
| | - Charles J Norsigian
- Division of Biological Sciences, University of California San Diego, La Jolla, CA, 92093, USA
- Department of Bioengineering, University of California San Diego, La Jolla, CA, 92093, USA
| | - Bernhard O Palsson
- Department of Bioengineering, University of California San Diego, La Jolla, CA, 92093, USA
- Bioinformatics and Systems Biology Program, University of California San Diego, La Jolla, CA, 92093, USA
- Novo Nordisk Foundation Center for Biosustainability, 2800, Kongens Lyngby, Denmark
| | - Sang Yup Lee
- Metabolic and Biomolecular Engineering National Research Laboratory, Department of Chemical and Biomolecular Engineering (BK21 four), Korea Advanced Institute of Science and Technology (KAIST), Daejeon, 34141, Republic of Korea
- Systems Metabolic Engineering and Systems Healthcare Cross-Generation Collaborative Laboratory, Department of Chemical and Biomolecular Engineering (BK21 four), KAIST, Daejeon, 34141, Republic of Korea
- KAIST Institute for the BioCentury and KAIST Institute for Artificial Intelligence, KAIST, Daejeon, 34141, Republic of Korea
- BioProcess Engineering Research Center and BioInformatics Research Center, KAIST, Daejeon, 34141, Republic of Korea
| |
|
88
|
Diebold PJ, Rhee MW, Shi Q, Trung NV, Umrani F, Ahmed S, Kulkarni V, Deshpande P, Alexander M, Thi Hoa N, Christakis NA, Iqbal NT, Ali SA, Mathad JS, Brito IL. Clinically relevant antibiotic resistance genes are linked to a limited set of taxa within gut microbiome worldwide. Nat Commun 2023; 14:7366. [PMID: 37963868 PMCID: PMC10645880 DOI: 10.1038/s41467-023-42998-6] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 08/10/2023] [Accepted: 10/27/2023] [Indexed: 11/16/2023] Open
Abstract
The acquisition of antimicrobial resistance (AR) genes has rendered important pathogens nearly or fully unresponsive to antibiotics. It has been suggested that pathogens acquire AR traits from the gut microbiota, which collectively serve as a global reservoir for AR genes conferring resistance to all classes of antibiotics. However, only a subset of AR genes confers resistance to clinically relevant antibiotics, and, although these AR gene profiles are well-characterized for common pathogens, less is known about their taxonomic associations and transfer potential within diverse members of the gut microbiota. We examined a collection of 14,850 human metagenomes and 1666 environmental metagenomes from 33 countries, in addition to nearly 600,000 isolate genomes, to gain insight into the global prevalence and taxonomic range of clinically relevant AR genes. We find that several of the most concerning AR genes, such as those encoding the cephalosporinase CTX-M and carbapenemases KPC, IMP, NDM, and VIM, remain taxonomically restricted to Proteobacteria. Even cfiA, the most common carbapenemase gene within the human gut microbiome, remains tightly restricted to Bacteroides, despite being found on a mobilizable plasmid. We confirmed these findings in gut microbiome samples from India, Honduras, Pakistan, and Vietnam, using a high-sensitivity single-cell fusion PCR approach. Focusing on a set of genes encoding carbapenemases and cephalosporinases, thus far restricted to Bacteroides species, we find that few mutations are required for efficacy in a different phylum, raising the question of why these genes have not spread more widely. Overall, these data suggest that globally prevalent, clinically relevant AR genes have not yet established themselves across diverse commensal gut microbiota.
Affiliation(s)
- Peter J Diebold
- Meinig School of Biomedical Engineering, Cornell University, Ithaca, NY, USA
| | - Matthew W Rhee
- Meinig School of Biomedical Engineering, Cornell University, Ithaca, NY, USA
| | - Qiaojuan Shi
- Meinig School of Biomedical Engineering, Cornell University, Ithaca, NY, USA
| | - Nguyen Vinh Trung
- Oxford University Clinical Research Unit (OUCRU) in Ho Chi Minh City, Ho Chi Minh city, Viet Nam
| | | | | | - Vandana Kulkarni
- Johns Hopkins University Clinical Trials Unit, Byramjee Jeejeebhoy Government Medical College, Pune, Maharashtra, India
| | - Prasad Deshpande
- Johns Hopkins University Clinical Trials Unit, Byramjee Jeejeebhoy Government Medical College, Pune, Maharashtra, India
| | - Mallika Alexander
- Johns Hopkins University Clinical Trials Unit, Byramjee Jeejeebhoy Government Medical College, Pune, Maharashtra, India
| | - Ngo Thi Hoa
- Oxford University Clinical Research Unit (OUCRU) in Ho Chi Minh City, Ho Chi Minh city, Viet Nam
- Centre for Tropical Medicine, Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Microbiology Department and Center for Tropical Medicine Research, Ngoc Thach University of Medicine, Ho Chi Minh city, Vietnam
| | | | | | | | | | - Ilana L Brito
- Meinig School of Biomedical Engineering, Cornell University, Ithaca, NY, USA
| |
|
89
|
Miao R, Jahn M, Shabestary K, Peltier G, Hudson EP. CRISPR interference screens reveal growth-robustness tradeoffs in Synechocystis sp. PCC 6803 across growth conditions. THE PLANT CELL 2023; 35:3937-3956. [PMID: 37494719 PMCID: PMC10615215 DOI: 10.1093/plcell/koad208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2023] [Revised: 06/01/2023] [Accepted: 07/20/2023] [Indexed: 07/28/2023]
Abstract
Barcoded mutant libraries are a powerful tool for elucidating gene function in microbes, particularly when screened in multiple growth conditions. Here, we screened a pooled CRISPR interference library of the model cyanobacterium Synechocystis sp. PCC 6803 in 11 bioreactor-controlled conditions, spanning multiple light regimes and carbon sources. This gene repression library contained 21,705 individual mutants with high redundancy over all open reading frames and noncoding RNAs. Comparison of the derived gene fitness scores revealed multiple instances of gene repression being beneficial in 1 condition while generally detrimental in others, particularly for genes within light harvesting and conversion, such as antennae components at high light and PSII subunits during photoheterotrophy. Suboptimal regulation of such genes likely represents a tradeoff of reduced growth speed for enhanced robustness to perturbation. The extensive data set assigns condition-specific importance to many previously unannotated genes and suggests additional functions for central metabolic enzymes. Phosphoribulokinase, glyceraldehyde-3-phosphate dehydrogenase, and the small protein CP12 were critical for mixotrophy and photoheterotrophy, which implicates the ternary complex as important for redirecting metabolic flux in these conditions in addition to inactivation of the Calvin cycle in the dark. To predict the potency of sgRNA sequences, we applied machine learning on sgRNA sequences and gene repression data, which showed the importance of C enrichment and T depletion proximal to the PAM site. Fitness data for all genes in all conditions are compiled in an interactive web application.
Affiliation(s)
- Rui Miao
- School of Engineering Sciences in Chemistry, Biotechnology and Health, Science for Life Laboratory, KTH Royal Institute of Technology, Stockholm, SE-17165, Sweden
| | - Michael Jahn
- School of Engineering Sciences in Chemistry, Biotechnology and Health, Science for Life Laboratory, KTH Royal Institute of Technology, Stockholm, SE-17165, Sweden
- Max Planck Unit for the Science of Pathogens, 10117 Berlin, Germany
| | - Kiyan Shabestary
- School of Engineering Sciences in Chemistry, Biotechnology and Health, Science for Life Laboratory, KTH Royal Institute of Technology, Stockholm, SE-17165, Sweden
- Department of Bioengineering and Imperial College Centre for Synthetic Biology, Imperial College London, London SW7 2AZ, UK
| | - Gilles Peltier
- Aix Marseille Univ, CEA, CNRS, Institut de Biosciences et Biotechnologies Aix-Marseille, CEA Cadarache, 13108 Saint Paul-Lez-Durance, France
| | - Elton P Hudson
- School of Engineering Sciences in Chemistry, Biotechnology and Health, Science for Life Laboratory, KTH Royal Institute of Technology, Stockholm, SE-17165, Sweden
| |
|
90
|
Zhang Y, Guan J, Li C, Wang Z, Deng Z, Gasser RB, Song J, Ou HY. DeepSecE: A Deep-Learning-Based Framework for Multiclass Prediction of Secreted Proteins in Gram-Negative Bacteria. RESEARCH (WASHINGTON, D.C.) 2023; 6:0258. [PMID: 37886621 PMCID: PMC10599158 DOI: 10.34133/research.0258] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/03/2023] [Accepted: 10/08/2023] [Indexed: 10/28/2023]
Abstract
Proteins secreted by Gram-negative bacteria are tightly linked to the virulence and adaptability of these microbes to environmental changes. Accurate identification of such secreted proteins can facilitate the investigations of infections and diseases caused by these bacterial pathogens. However, current bioinformatic methods for predicting bacterial secreted substrate proteins have limited computational efficiency and application scope on a genome-wide scale. Here, we propose a novel deep-learning-based framework, DeepSecE, for the simultaneous inference of multiple distinct groups of secreted proteins produced by Gram-negative bacteria. DeepSecE remarkably improves their classification from nonsecreted proteins using a pretrained protein language model and transformer, achieving a macro-average accuracy of 0.883 on 5-fold cross-validation. Performance benchmarking suggests that DeepSecE achieves competitive performance with the state-of-the-art binary predictors specialized for individual types of secreted substrates. The attention mechanism corroborates salient patterns and motifs at the N or C termini of the protein sequences. Using this pipeline, we further investigate the genome-wide prediction of novel secreted proteins and their taxonomic distribution across ~1,000 Gram-negative bacterial genomes. The present analysis demonstrates that DeepSecE has major potential for the discovery of disease-associated secreted proteins in a diverse range of Gram-negative bacteria. An online web server of DeepSecE is also publicly available to predict and explore various secreted substrate proteins via the input of bacterial genome sequences.
Affiliation(s)
- Yumeng Zhang
- State Key Laboratory of Microbial Metabolism, Joint International Laboratory on Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Shanghai Key Laboratory of Veterinary Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Jiahao Guan
- State Key Laboratory of Microbial Metabolism, Joint International Laboratory on Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Chen Li
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
| | - Zhikang Wang
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
- Monash Data Futures Institute, Monash University, Melbourne, VIC 3800, Australia
| | - Zixin Deng
- State Key Laboratory of Microbial Metabolism, Joint International Laboratory on Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
| | - Robin B. Gasser
- Melbourne Veterinary School, Faculty of Science, The University of Melbourne, Parkville, VIC 3010, Australia
| | - Jiangning Song
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
- Monash Data Futures Institute, Monash University, Melbourne, VIC 3800, Australia
- Melbourne Veterinary School, Faculty of Science, The University of Melbourne, Parkville, VIC 3010, Australia
| | - Hong-Yu Ou
- State Key Laboratory of Microbial Metabolism, Joint International Laboratory on Metabolic & Developmental Sciences, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Shanghai Key Laboratory of Veterinary Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
| |
|
91
|
Sun M, Hu H, Pang W, Zhou Y. ACP-BC: A Model for Accurate Identification of Anticancer Peptides Based on Fusion Features of Bidirectional Long Short-Term Memory and Chemically Derived Information. Int J Mol Sci 2023; 24:15447. [PMID: 37895128 PMCID: PMC10607064 DOI: 10.3390/ijms242015447] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/12/2023] [Revised: 09/10/2023] [Accepted: 10/20/2023] [Indexed: 10/29/2023] Open
Abstract
Anticancer peptides (ACPs) have been proven to possess potent anticancer activities. Although computational methods have emerged for rapid ACP identification, their accuracy still needs improvement. In this study, we propose ACP-BC, a three-channel end-to-end model that utilizes various combinations of data augmentation techniques. In the first channel, features are extracted from the raw sequence using a bidirectional long short-term memory network. In the second channel, the entire sequence is converted into a chemical molecular formula, which is further simplified using Simplified Molecular Input Line Entry System notation to obtain deep abstract features through a Bidirectional Encoder Representations from Transformers (BERT) model. In the third channel, we manually selected four effective features according to dipeptide composition, binary profile feature, k-mer sparse matrix, and pseudo amino acid composition. Notably, the application of chemical BERT in predicting ACPs is novel and successfully integrated into our model. To validate the performance of our model, we selected two benchmark datasets, ACPs740 and ACPs240. ACP-BC achieved prediction accuracies of 87% and 90% on these two datasets, respectively, representing improvements of 1.3% and 7% compared to existing state-of-the-art methods on these datasets. Systematic comparative experiments therefore show that ACP-BC can effectively identify anticancer peptides.
Affiliation(s)
- Mingwei Sun
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China
| | - Haoyuan Hu
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China
| | - Wei Pang
- School of Mathematical and Computer Sciences, Heriot-Watt University, Edinburgh EH14 4AS, UK
| | - You Zhou
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China
- College of Software, Jilin University, Changchun 130012, China
| |
|
92
|
Zhang L, He W, Fu R, Wang S, Chen Y, Xu H. Guide-specific loss of efficiency and off-target reduction with Cas9 variants. Nucleic Acids Res 2023; 51:9880-9893. [PMID: 37615574 PMCID: PMC10570041 DOI: 10.1093/nar/gkad702] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/21/2023] [Revised: 08/08/2023] [Accepted: 08/14/2023] [Indexed: 08/25/2023] Open
Abstract
High-fidelity clustered regularly interspaced short palindromic repeats (CRISPR)-associated protein 9 (Cas9) variants have been developed to reduce the off-target effects of CRISPR systems at a cost of efficiency loss. To systematically evaluate the efficiency and off-target tolerance of Cas9 variants in complex with different single guide RNAs (sgRNAs), we applied high-throughput viability screens and a synthetic paired sgRNA-target system to assess thousands of sgRNAs in combination with two high-fidelity Cas9 variants, HiFi and LZ3. Comparing these variants against wild-type SpCas9, we found that ∼20% of sgRNAs are associated with a significant loss of efficiency when complexed with either HiFi or LZ3. The loss of efficiency is dependent on the sequence context in the seed region of sgRNAs, as well as at positions 15-18 in the non-seed region that interacts with the REC3 domain of Cas9, suggesting that the variant-specific mutations in the REC3 domain account for the loss of efficiency. We also observed various degrees of sequence-dependent off-target reduction when different sgRNAs are used in combination with the variants. Given these observations, we developed GuideVar, a transfer learning-based computational framework for the prediction of on-target efficiency and off-target effects with high-fidelity variants. GuideVar facilitates the prioritization of sgRNAs in applications with HiFi and LZ3, as demonstrated by the improvement of signal-to-noise ratios in high-throughput viability screens using these high-fidelity variants.
Affiliation(s)
- Liang Zhang
- Department of Epigenetics and Molecular Carcinogenesis, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Wei He
- Department of Epigenetics and Molecular Carcinogenesis, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Rongjie Fu
- Department of Epigenetics and Molecular Carcinogenesis, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Shuyue Wang
- Department of Epigenetics and Molecular Carcinogenesis, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Yiwen Chen
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| | - Han Xu
- Department of Epigenetics and Molecular Carcinogenesis, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- The Center for Cancer Epigenetics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
| |
|
93
|
Cao T, Li Q, Huang Y, Li A. plotnineSeqSuite: a Python package for visualizing sequence data using ggplot2 style. BMC Genomics 2023; 24:585. [PMID: 37789265 PMCID: PMC10546746 DOI: 10.1186/s12864-023-09677-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/05/2023] [Accepted: 09/14/2023] [Indexed: 10/05/2023] Open
Abstract
BACKGROUND: Sequence logo visualization has been an active area in the development of bioinformatics tools. ggseqlogo, written in R, has been the most popular API since it was published. With the rise of artificial intelligence and deep learning, Python has become the most popular programming language, and bioinformaticians have begun to shift to it. Providing Python APIs that resemble those in R reduces the cost of learning a new programming language. Moreover, plotting frameworks in Python are not as easy to use as ggplot2 in R. The appearance of plotnine (a Python implementation of ggplot2) makes it possible to unify the programming methods of bioinformatics visualization tools between R and Python. RESULTS: Here, we introduce plotnineSeqSuite, a new plotnine-based Python package that provides a ggseqlogo-like API for programmatic drawing of sequence logos, sequence alignment diagrams and sequence histograms. More specifically, it supports custom letters, color themes, and fonts. Moreover, the class for drawing layers follows an object-oriented design so that users can easily encapsulate and extend it. CONCLUSIONS: plotnineSeqSuite is the first ggplot2-style package to implement visualization of sequence-related graphs in Python. It enhances the uniformity of programmatic plotting between R and Python. Compared with existing tools, plotnineSeqSuite supports a much more complete set of plot categories. The source code of plotnineSeqSuite can be obtained on GitHub (https://github.com/caotianze/plotnineseqsuite) and PyPI (https://pypi.org/project/plotnineseqsuite), and the documentation homepage is freely available on GitHub at https://caotianze.github.io/plotnineseqsuite/.
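For readers unfamiliar with what a sequence logo encodes, the following is a minimal sketch of the per-position information content (in bits) that determines letter heights in a conventional logo. It does not use or reproduce the plotnineSeqSuite API; the function name `information_content`, the DNA alphabet default, and the example sequences are illustrative assumptions, and no small-sample correction is applied.

```python
import math
from collections import Counter


def information_content(seqs, alphabet="ACGT"):
    """Return a list of (bits, letter_heights) tuples, one per alignment column.

    Hypothetical helper for illustration only; it is not part of any package.
    """
    if not seqs:
        raise ValueError("need at least one sequence")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")

    max_bits = math.log2(len(alphabet))  # 2 bits for DNA, about 4.32 for proteins
    columns = []
    for i in range(length):
        counts = Counter(s[i] for s in seqs)
        total = sum(counts[a] for a in alphabet)
        if total == 0:
            # Column contains only symbols outside the alphabet (e.g. gaps).
            columns.append((0.0, {}))
            continue
        freqs = {a: counts[a] / total for a in alphabet}
        # Shannon entropy of the column; zero-frequency letters contribute nothing.
        entropy = -sum(p * math.log2(p) for p in freqs.values() if p > 0)
        bits = max_bits - entropy
        # Each letter is drawn with height proportional to its frequency.
        heights = {a: p * bits for a, p in freqs.items() if p > 0}
        columns.append((bits, heights))
    return columns


# Made-up aligned DNA sequences, used only to exercise the function.
example = ["ACGT", "ACGA", "ACTT", "AGGT"]
for pos, (bits, heights) in enumerate(information_content(example), start=1):
    print(pos, round(bits, 3), heights)
```

A fully conserved column reaches the maximum of log2 of the alphabet size, while a column with uniform letter frequencies carries zero bits and would be drawn flat.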
Affiliation(s)
- Tianze Cao
- School of Mathematics, Hangzhou Normal University, Hangzhou, Zhejiang Province, China
| | - Qian Li
- Department of Rehabilitation, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei Province, China
| | - Yuexia Huang
- School of Mathematics, Hangzhou Normal University, Hangzhou, Zhejiang Province, China
| | - Anshui Li
- Department of Statistics, Shaoxing University, Shaoxing, Zhejiang Province, China
| |
|
94
|
Diaz-Vidal T, Martínez-Pérez RB, Rosales-Rivera LC. Computational insights of the molecular recognition between volatile molecules and odorant binding proteins from the red palm weevil Rhynchophorus ferrugineus. J Biomol Struct Dyn 2023; 42:11285-11298. [PMID: 37776004 DOI: 10.1080/07391102.2023.2262583] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2023] [Accepted: 09/17/2023] [Indexed: 10/01/2023]
Abstract
The red palm weevil Rhynchophorus ferrugineus (Coleoptera: Curculionidae) is one of the most harmful pests of palm trees, causing serious economic damage worldwide. The present work aims to model and study the 3D structures of highly expressed odorant binding proteins from R. ferrugineus (RferOBPs) and to identify possible binding modes and the ligand release mechanism by docking and molecular dynamics. High-confidence 3D structures of a total of 11 odorant binding proteins (OBPs) were obtained with AlphaFold2. All modeled 3D RferOBP structures displayed six characteristic α-helices, except for RferOBP7 and RferOBP10, which had an extra terminal α-helix. Among the 11 modeled RferOBPs, RferOBP4 was highly expressed in the antennae and subsequently selected for further analyses. Molecular docking analyses demonstrated that ferruginol, α-pinene, DEET, and picaridin can favorably bind the RferOBP4 cavity with low affinity energies. Molecular dynamics simulations of RferOBP4 bound to ferruginol at different pH values showed that low pH environments dictate a structural change into an apo-state that modifies the number of tunnels where the ligand can coexist, further triggering ligand release by a pH-dependent mechanism. This is the first report concerning the modelling and study of ligand binding modes and release mechanism of R. ferrugineus OBPs. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Tania Diaz-Vidal
- Departamento de Ingeniería Química, Universidad de Guadalajara, Guadalajara, Mexico
| | - Raúl Balam Martínez-Pérez
- Departamento de Biotecnología y Ciencias Alimentarias, Instituto Tecnológico de Sonora, Ciudad Obregón, Mexico
| | | |
|
95
|
Spoendlin FC, Abanades B, Raybould MIJ, Wong WK, Georges G, Deane CM. Improved computational epitope profiling using structural models identifies a broader diversity of antibodies that bind to the same epitope. Front Mol Biosci 2023; 10:1237621. [PMID: 37790877 PMCID: PMC10544996 DOI: 10.3389/fmolb.2023.1237621] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2023] [Accepted: 08/28/2023] [Indexed: 10/05/2023] Open
Abstract
The function of an antibody is intrinsically linked to the epitope it engages. Clonal clustering methods, based on sequence identity, are commonly used to group antibodies that will bind to the same epitope. However, such methods neglect the fact that antibodies with highly diverse sequences can exhibit similar binding site geometries and engage common epitopes. In a previous study, we described SPACE1, a method that structurally clustered antibodies in order to predict their epitopes. This methodology was limited by the inaccuracies and incomplete coverage of template-based modeling. In addition, it was only benchmarked at the level of domain-consistency on one virus class. Here, we present SPACE2, which uses the latest machine learning-based structure prediction technology combined with a novel clustering protocol, and benchmark it on binding data that have epitope-level resolution. On six diverse sets of antigen-specific antibodies, we demonstrate that SPACE2 accurately clusters antibodies that engage common epitopes and achieves far higher dataset coverage than clonal clustering and SPACE1. Furthermore, we show that the functionally consistent structural clusters identified by SPACE2 are even more diverse in sequence, genetic lineage, and species origin than those found by SPACE1. These results reiterate that structural data improve our ability to identify antibodies that bind to the same epitope, adding information to sequence-based methods, especially in datasets of antibodies from diverse sources. SPACE2 is openly available on GitHub (https://github.com/oxpig/SPACE2).
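As an illustration of the sequence-identity baseline that the abstract contrasts with structural clustering, the sketch below groups CDR H3 sequences by single-linkage clustering at a fixed identity threshold. It is a simplified stand-in, not the SPACE2 method or any published clonotyping tool: the 0.8 threshold, the function names, and the example sequences are assumptions, and the V/J gene matching that clonal clustering usually also requires is omitted.

```python
def identity(a, b):
    """Fraction of identical positions between two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cluster_by_identity(cdr3s, threshold=0.8):
    """Single-linkage clustering of CDR3 sequences (hypothetical helper).

    Two sequences are linked when they have the same length and their
    identity is at least `threshold`; linked sequences share a cluster.
    """
    parent = list(range(len(cdr3s)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(cdr3s)):
        for j in range(i + 1, len(cdr3s)):
            a, b = cdr3s[i], cdr3s[j]
            if a and len(a) == len(b) and identity(a, b) >= threshold:
                parent[find(j)] = find(i)

    clusters = {}
    for i, seq in enumerate(cdr3s):
        clusters.setdefault(find(i), []).append(seq)
    return list(clusters.values())


# Made-up CDR H3 sequences, used only to exercise the function.
seqs = ["ARDYYGSSYFDY", "ARDYYGSSYFDV", "ARGGWELLRFDY", "AKDL"]
for cluster in cluster_by_identity(seqs):
    print(cluster)
```

Under this scheme, sequences of different length or low identity can never share a cluster, which is the limitation the abstract points to when antibodies with highly diverse sequences engage a common epitope.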
Affiliation(s)
- Fabian C. Spoendlin
- Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford, United Kingdom
| | - Brennan Abanades
- Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford, United Kingdom
| | - Matthew I. J. Raybould
- Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford, United Kingdom
| | - Wing Ki Wong
- Large Molecule Research, Roche Pharma Research and Early Development, Roche Innovation Center Munich, Penzberg, Germany
| | - Guy Georges
- Large Molecule Research, Roche Pharma Research and Early Development, Roche Innovation Center Munich, Penzberg, Germany
| | - Charlotte M. Deane
- Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford, United Kingdom
| |
|
96
|
Wang Y, Lv H, Lei R, Yeung YH, Shen IR, Choi D, Teo QW, Tan TJ, Gopal AB, Chen X, Graham CS, Wu NC. An explainable language model for antibody specificity prediction using curated influenza hemagglutinin antibodies. bioRxiv 2023:2023.09.11.557288. [PMID: 37745338 PMCID: PMC10515799 DOI: 10.1101/2023.09.11.557288] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/26/2023]
Abstract
Despite decades of antibody research, it remains challenging to predict the specificity of an antibody solely based on its sequence. Two major obstacles are the lack of appropriate models and inaccessibility of datasets for model training. In this study, we curated a dataset of >5,000 influenza hemagglutinin (HA) antibodies by mining research publications and patents, which revealed many distinct sequence features between antibodies to HA head and stem domains. We then leveraged this dataset to develop a lightweight memory B cell language model (mBLM) for sequence-based antibody specificity prediction. Model explainability analysis showed that mBLM captured key sequence motifs of HA stem antibodies. Additionally, by applying mBLM to HA antibodies with unknown epitopes, we discovered and experimentally validated many HA stem antibodies. Overall, this study not only advances our molecular understanding of antibody response to influenza virus, but also provides an invaluable resource for applying deep learning to antibody research.
Affiliation(s)
- Yiquan Wang
- Department of Biochemistry, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
| | - Huibin Lv
- Department of Biochemistry, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
- Carl R. Woese Institute for Genomic Biology, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
| | - Ruipeng Lei
- Department of Biochemistry, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
| | - Yuen-Hei Yeung
- Department of Biochemistry, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
- Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
- Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong SAR, China
| | - Ivana R. Shen
- Department of Biochemistry, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
| | - Danbi Choi
- Department of Biochemistry, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
| | - Qi Wen Teo
- Department of Biochemistry, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
- Carl R. Woese Institute for Genomic Biology, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
| | - Timothy J.C. Tan
- Center for Biophysics and Quantitative Biology, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
| | - Akshita B. Gopal
- Department of Biochemistry, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
| | - Xin Chen
- Center for Biophysics and Quantitative Biology, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
| | - Claire S. Graham
- Department of Biochemistry, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
| | - Nicholas C. Wu
- Department of Biochemistry, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
- Carl R. Woese Institute for Genomic Biology, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
- Center for Biophysics and Quantitative Biology, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
- Carle Illinois College of Medicine, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
| |
|
97
|
Leroy EC, Perry TN, Renault TT, Innis CA. Tetracenomycin X sequesters peptidyl-tRNA during translation of QK motifs. Nat Chem Biol 2023; 19:1091-1096. [PMID: 37322159 DOI: 10.1038/s41589-023-01343-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/28/2022] [Accepted: 04/18/2023] [Indexed: 06/17/2023]
Abstract
As antimicrobial resistance threatens our ability to treat common bacterial infections, new antibiotics with limited cross-resistance are urgently needed. In this regard, natural products that target the bacterial ribosome have the potential to be developed into potent drugs through structure-guided design, provided their mechanisms of action are well understood. Here we use inverse toeprinting coupled to next-generation sequencing to show that the aromatic polyketide tetracenomycin X primarily inhibits peptide bond formation between an incoming aminoacyl-tRNA and a terminal Gln-Lys (QK) motif in the nascent polypeptide. Using cryogenic electron microscopy, we reveal that translation inhibition at QK motifs occurs via an unusual mechanism involving sequestration of the 3' adenosine of peptidyl-tRNA(Lys) in the drug-occupied nascent polypeptide exit tunnel of the ribosome. Our study provides mechanistic insights into the mode of action of tetracenomycin X on the bacterial ribosome and suggests a path forward for the development of novel aromatic polyketide antibiotics.
Affiliation(s)
- Elodie C Leroy
- ARNA Laboratory, UMR 5320, U1212, Institut Européen de Chimie et Biologie, Univ. Bordeaux, Centre National de la Recherche Scientifique, Institut National de la Santé et de la Recherche Médicale, Pessac, France
- Human Technopole, Milan, Italy
| | - Thomas N Perry
- ARNA Laboratory, UMR 5320, U1212, Institut Européen de Chimie et Biologie, Univ. Bordeaux, Centre National de la Recherche Scientifique, Institut National de la Santé et de la Recherche Médicale, Pessac, France
- Human Technopole, Milan, Italy
| | - Thibaud T Renault
- ARNA Laboratory, UMR 5320, U1212, Institut Européen de Chimie et Biologie, Univ. Bordeaux, Centre National de la Recherche Scientifique, Institut National de la Santé et de la Recherche Médicale, Pessac, France
| | - C Axel Innis
- ARNA Laboratory, UMR 5320, U1212, Institut Européen de Chimie et Biologie, Univ. Bordeaux, Centre National de la Recherche Scientifique, Institut National de la Santé et de la Recherche Médicale, Pessac, France
| |
|
98
|
Jeong DE, Sundrani S, Hall RN, Krupovic M, Koonin EV, Fire AZ. DNA polymerase diversity reveals multiple incursions of Polintons during nematode evolution. bioRxiv 2023:2023.08.22.554363. [PMID: 37662302 PMCID: PMC10473752 DOI: 10.1101/2023.08.22.554363] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/05/2023]
Abstract
Polintons are dsDNA, virus-like self-synthesizing transposons widely found in eukaryotic genomes. Recent metagenomic discoveries of Polinton-like viruses are consistent with the hypothesis that Polintons invade eukaryotic host genomes through infectious viral particles. Nematode genomes contain multiple copies of Polintons and provide an opportunity to explore the natural distribution and evolution of Polintons during this process. We performed an extensive search of Polintons across nematode genomes, identifying multiple full-length Polinton copies in several species. We provide evidence of both ancient Polinton integrations and recent mobility in strains of the same nematode species. In addition to the major nematode Polinton family, we identified a group of Polintons that are overall closely related to the major family, but encode a distinct protein-primed B family DNA polymerase (pPolB) that is related to homologs from a different group of Polintons present outside of the Nematoda. Phylogenetic analyses of the pPolBs support the evolutionary scenarios in which these extrinsic pPolBs that seem to derive from Polinton families present in oomycetes and molluscs replaced the canonical pPolB in subsets of Polintons found in terrestrial and marine nematodes, respectively, suggesting inter-phylum horizontal gene transfers. The pPolBs of the terrestrial nematode and oomycete Polintons share a unique feature, an insertion of an HNH nuclease domain, whereas the pPolBs in the marine nematode Polintons share an insertion of a VSR nuclease domain with marine mollusc pPolBs. We hypothesize that horizontal gene transfer occurs among Polintons from widely different but cohabiting hosts.
|
99
|
Friedman RZ, Ramu A, Lichtarge S, Myers CA, Granas DM, Gause M, Corbo JC, Cohen BA, White MA. Active learning of enhancer and silencer regulatory grammar in photoreceptors. bioRxiv 2023:2023.08.21.554146. [PMID: 37662358 PMCID: PMC10473580 DOI: 10.1101/2023.08.21.554146] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 09/05/2023]
Abstract
Cis-regulatory elements (CREs) direct gene expression in health and disease, and models that can accurately predict their activities from DNA sequences are crucial for biomedicine. Deep learning represents one emerging strategy to model the regulatory grammar that relates CRE sequence to function. However, these models require training data on a scale that exceeds the number of CREs in the genome. We address this problem using active machine learning to iteratively train models on multiple rounds of synthetic DNA sequences assayed in live mammalian retinas. During each round of training the model actively selects sequence perturbations to assay, thereby efficiently generating informative training data. We iteratively trained a model that predicts the activities of sequences containing binding motifs for the photoreceptor transcription factor Cone-rod homeobox (CRX) using an order of magnitude less training data than current approaches. The model's internal confidence estimates of its predictions are reliable guides for designing sequences with high activity. The model correctly identified critical sequence differences between active and inactive sequences with nearly identical transcription factor binding sites, and revealed order and spacing preferences for combinations of motifs. Our results establish active learning as an effective method to train accurate deep learning models of cis-regulatory function after exhausting naturally occurring training examples in the genome.
Affiliation(s)
- Ryan Z. Friedman
- The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine, Saint Louis, MO, 63110
- Department of Genetics, Washington University School of Medicine, Saint Louis, MO, 63110
| | - Avinash Ramu
- The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine, Saint Louis, MO, 63110
- Department of Genetics, Washington University School of Medicine, Saint Louis, MO, 63110
| | - Sara Lichtarge
- The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine, Saint Louis, MO, 63110
- Department of Genetics, Washington University School of Medicine, Saint Louis, MO, 63110
| | - Connie A. Myers
- Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO, 63110
| | - David M. Granas
- The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine, Saint Louis, MO, 63110
- Department of Genetics, Washington University School of Medicine, Saint Louis, MO, 63110
| | - Maria Gause
- Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO, 63110
| | - Joseph C. Corbo
- Department of Pathology and Immunology, Washington University School of Medicine, Saint Louis, MO, 63110
| | - Barak A. Cohen
- The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine, Saint Louis, MO, 63110
- Department of Genetics, Washington University School of Medicine, Saint Louis, MO, 63110
| | - Michael A. White
- The Edison Family Center for Genome Sciences & Systems Biology, Washington University School of Medicine, Saint Louis, MO, 63110
- Department of Genetics, Washington University School of Medicine, Saint Louis, MO, 63110
| |
|
100
|
Teo QW, Wang Y, Lv H, Tan TJ, Lei R, Mao KJ, Wu NC. Stringent and complex sequence constraints of an IGHV1-69 broadly neutralizing antibody to influenza HA stem. bioRxiv 2023:2023.07.06.547908. [PMID: 37461670 PMCID: PMC10350038 DOI: 10.1101/2023.07.06.547908] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/25/2023]
Abstract
IGHV1-69 is frequently utilized by broadly neutralizing influenza antibodies to the hemagglutinin (HA) stem. These IGHV1-69 HA stem antibodies have diverse complementarity-determining region (CDR) H3 sequences. In addition, their light chains have minimal to no contact with the epitope. Consequently, sequence determinants that confer IGHV1-69 antibodies with HA stem specificity remain largely elusive. Using high-throughput experiments, this study revealed the importance of light chain sequence for the IGHV1-69 HA stem antibody CR9114, which is the broadest influenza antibody known to date. Moreover, we demonstrated that the CDR H3 sequences from many other IGHV1-69 antibodies, including those to HA stem, were incompatible with CR9114. Along with mutagenesis and structural analysis, our results indicate that light chain and CDR H3 sequences coordinately determine the HA stem specificity of IGHV1-69 antibodies. Overall, this work provides molecular insights into broadly neutralizing antibody responses to influenza virus, which have important implications for universal influenza vaccine development.
Affiliation(s)
- Qi Wen Teo
- Department of Biochemistry, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
- Carl R. Woese Institute for Genomic Biology, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
| | - Yiquan Wang
- Department of Biochemistry, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
| | - Huibin Lv
- Department of Biochemistry, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
- Carl R. Woese Institute for Genomic Biology, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
| | - Timothy J.C. Tan
- Center for Biophysics and Quantitative Biology, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
| | - Ruipeng Lei
- Department of Biochemistry, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
| | - Kevin J. Mao
- Department of Biochemistry, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
| | - Nicholas C. Wu
- Department of Biochemistry, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
- Carl R. Woese Institute for Genomic Biology, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
- Center for Biophysics and Quantitative Biology, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
- Carle Illinois College of Medicine, University of Illinois Urbana-Champaign, Urbana, IL 61801, USA
| |
|