1. Harrison PM. Optimizing strategy for the discovery of compositionally-biased or low-complexity regions in proteins. Sci Rep 2024; 14:680. [PMID: 38182699 PMCID: PMC10770407 DOI: 10.1038/s41598-023-50991-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2023] [Accepted: 12/28/2023] [Indexed: 01/07/2024] Open
Abstract
Proteins can contain tracts dominated by a subset of amino acids and that have a functional significance. These are often termed 'low-complexity regions' (LCRs) or 'compositionally-biased regions' (CBRs). However, a wide spectrum of compositional bias is possible, and program parameters used to annotate these regions are often arbitrarily chosen. Also, investigators are sometimes interested in longer regions, or sometimes very short ones. Here, two programs for annotating LCRs/CBRs, namely SEG and fLPS, are investigated in detail across the whole expanse of their parameter spaces. In doing so, boundary behaviours are resolved that are used to derive an optimized systematic strategy for annotating LCRs/CBRs. Sets of parameters that progressively annotate or 'cover' more of protein sequence space and are optimized for a given target length have been derived. This progressive annotation can be applied to discern the biological relevance of CBRs, e.g., in parsing domains for experimental constructs and in generating hypotheses. It is also useful for picking out candidate regions of interest of a given target length and bias signature, and for assessing the parameter dependence of annotations. This latter application is demonstrated for a set of human intrinsically-disordered proteins associated with cancer.
Affiliation(s)
- Paul M Harrison
- Department of Biology, McGill University, Montreal, QC, Canada.
2. Orlov YL, Orlova NG. Bioinformatics tools for the sequence complexity estimates. Biophys Rev 2023; 15:1367-1378. [PMID: 37974990 PMCID: PMC10643780 DOI: 10.1007/s12551-023-01140-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/15/2023] [Accepted: 09/01/2023] [Indexed: 11/19/2023] Open
Abstract
We review current methods and bioinformatics tools for text complexity estimates (information and entropy measures). The search for DNA regions with extreme statistical characteristics, such as low complexity regions, is important for biophysical models of chromosome function and gene transcription regulation at genome scale. We discuss complexity profiling for segmentation and delineation of genome sequences, the search for genome repeats and transposable elements, and applications to next-generation sequencing reads. We review the complexity methods and new application fields: analysis of mutation hotspot loci, analysis of short sequencing reads with quality control, and alignment-free genome comparisons. The algorithms implementing various numerical measures of text complexity, including combinatorial and linguistic measures, were developed before the genome sequencing era. A series of tools for estimating sequence complexity use compression approaches, mainly modifications of Lempel-Ziv compression. Most of the tools are available online, providing large-scale service for whole genome analysis. Novel machine learning applications for classification of complete genome sequences also include sequence compression and complexity algorithms. We present a comparison of the complexity methods on different sequence sets and their applications to the analysis of gene transcription regulatory regions. Furthermore, we discuss approaches to and applications of sequence complexity for proteins. The complexity measures for amino acid sequences can be calculated by the same entropy and compression-based algorithms, but the functional and evolutionary roles of low complexity regions in proteins have specific features that differ from those in DNA. The tools for protein sequence complexity are aimed at protein structural constraints. It has been shown that low complexity regions in protein sequences are conserved in evolution and have important biological and structural functions. Finally, we summarize recent findings in large scale genome complexity comparison and applications for coronavirus genome analysis.
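As an illustration of the entropy measures this abstract refers to, the following is a minimal sketch of a sliding-window Shannon entropy profile. It is not taken from the reviewed tools; the function names `shannon_entropy` and `entropy_profile` and the default window size are assumptions made for this example only.

```python
import math
from collections import Counter


def shannon_entropy(segment: str) -> float:
    """Shannon entropy (bits per symbol) of the symbol frequencies in a segment."""
    counts = Counter(segment)
    total = len(segment)
    # p * log2(1/p) summed over observed symbols; zero for a single-symbol segment.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def entropy_profile(sequence: str, window: int = 12) -> list[float]:
    """Entropy of every full-length window, sliding one position at a time.

    The window size is a hypothetical default, not a recommended setting.
    """
    return [
        shannon_entropy(sequence[i : i + window])
        for i in range(len(sequence) - window + 1)
    ]


if __name__ == "__main__":
    # Low values in the profile mark candidate low-complexity windows.
    for position, value in enumerate(entropy_profile("ACGTACGTAAAA", window=4)):
        print(position, round(value, 3))
```

Windows with entropy well below the maximum for the alphabet (2 bits for DNA) are the candidates a complexity profile would flag; where to set that cut-off is left to the reader.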
Affiliation(s)
- Yuriy L. Orlov
- The Digital Health Institute, I.M. Sechenov First Moscow State Medical University of the Russian Ministry of Health (Sechenov University), Moscow, 119991 Russia
- Institute of Cytology and Genetics SB RAS, 630090 Novosibirsk, Russia
- Agrarian and Technological Institute, Peoples’ Friendship University of Russia, 117198 Moscow, Russia
- Nina G. Orlova
- Department of Mathematics, Financial University under the Government of the Russian Federation, Moscow, 125167 Russia
3. Persi E, Wolf YI, Karamycheva S, Makarova KS, Koonin EV. Compensatory relationship between low-complexity regions and gene paralogy in the evolution of prokaryotes. Proc Natl Acad Sci U S A 2023; 120:e2300154120. [PMID: 37036997 PMCID: PMC10120016 DOI: 10.1073/pnas.2300154120] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 01/04/2023] [Accepted: 03/17/2023] [Indexed: 04/12/2023] Open
Abstract
The evolution of genomes in all life forms involves two distinct, dynamic types of genomic changes: gene duplication (and loss) that shape families of paralogous genes and extension (and contraction) of low-complexity regions (LCR), which occurs through dynamics of short repeats in protein-coding genes. Although the roles of each of these types of events in genome evolution have been studied, their co-evolutionary dynamics is not thoroughly understood. Here, by analyzing a wide range of genomes from diverse bacteria and archaea, we show that LCR and paralogy represent two distinct routes of evolution that are inversely correlated. The emergence of LCR is a prominent evolutionary mechanism in fast evolving, young protein families, whereas paralogy dominates the comparatively slow evolution of old protein families. The analysis of multiple prokaryotic genomes shows that the formation of LCR is likely a widespread, transient evolutionary mechanism that temporally and locally affects also ancestral functions, but apparently, fades away with time, under mutational and selective pressures, yielding to gene paralogy. We propose that compensatory relationships between short-term and longer-term evolutionary mechanisms are universal in the evolution of life.
Affiliation(s)
- Erez Persi
- National Center for Biotechnology Information, National Library of Medicine, NIH, Bethesda, MD 20894
- Yuri I. Wolf
- National Center for Biotechnology Information, National Library of Medicine, NIH, Bethesda, MD 20894
- Svetlana Karamycheva
- National Center for Biotechnology Information, National Library of Medicine, NIH, Bethesda, MD 20894
- Kira S. Makarova
- National Center for Biotechnology Information, National Library of Medicine, NIH, Bethesda, MD 20894
- Eugene V. Koonin
- National Center for Biotechnology Information, National Library of Medicine, NIH, Bethesda, MD 20894
4. Cascarina SM, Ross ED. The LCD-Composer webserver: high-specificity identification and functional analysis of low-complexity domains in proteins. Bioinformatics 2022; 38:5446-5448. [PMID: 36282522 PMCID: PMC9750097 DOI: 10.1093/bioinformatics/btac699] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/06/2022] [Revised: 10/12/2022] [Accepted: 10/21/2022] [Indexed: 12/25/2022] Open
Abstract
Summary: Low-complexity domains (LCDs) in proteins are regions enriched in a small subset of amino acids. LCDs exist in all domains of life, often have unusual biophysical behavior, and function in both normal and pathological processes. We recently developed an algorithm to identify LCDs based predominantly on amino acid composition thresholds. Here, we have integrated this algorithm with a webserver and augmented it with additional analysis options. Specifically, users can (i) search for LCDs in whole proteomes by setting minimum composition thresholds for individual or grouped amino acids, (ii) submit a known LCD sequence to search for similar LCDs, (iii) search for and plot LCDs within a single protein, (iv) statistically test for enrichment of LCDs within a user-provided protein set and (v) specifically identify proteins with multiple types of LCDs.
Availability and implementation: The LCD-Composer server can be accessed at http://lcd-composer.bmb.colostate.edu. The corresponding command-line scripts can be accessed at https://github.com/RossLabCSU/LCD-Composer/tree/master/WebserverScripts.
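To make the idea of a minimum composition threshold concrete, here is a simplified sketch of a window scan that reports regions enriched in a chosen set of residues. It is not the LCD-Composer implementation, which the abstract describes only as based predominantly on composition thresholds; the function name `composition_windows`, its parameters, and the default values are hypothetical.

```python
def composition_windows(
    sequence: str,
    residues: str,
    window: int = 20,
    min_fraction: float = 0.4,
) -> list[tuple[int, int]]:
    """Return merged half-open (start, end) spans covered by qualifying windows.

    A window qualifies when the combined fraction of the chosen residues
    is at least min_fraction. Defaults are illustrative assumptions only.
    """
    targets = set(residues)
    spans: list[tuple[int, int]] = []
    for start in range(len(sequence) - window + 1):
        segment = sequence[start : start + window]
        fraction = sum(aa in targets for aa in segment) / window
        if fraction >= min_fraction:
            end = start + window
            if spans and start <= spans[-1][1]:
                # Overlapping or adjacent to the previous span: extend it.
                spans[-1] = (spans[-1][0], end)
            else:
                spans.append((start, end))
    return spans


if __name__ == "__main__":
    protein = "MKT" + "Q" * 10 + "LVA"
    print(composition_windows(protein, "Q", window=5, min_fraction=0.8))
```

Passing several letters in `residues` (for example `"QN"`) treats them as one group, loosely mirroring the "individual or grouped amino acids" option mentioned in the abstract.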
Affiliation(s)
- Eric D Ross
- To whom correspondence should be addressed.
5. Tetreau G, Sawaya MR, De Zitter E, Andreeva EA, Banneville AS, Schibrowsky NA, Coquelle N, Brewster AS, Grünbein ML, Kovacs GN, Hunter MS, Kloos M, Sierra RG, Schiro G, Qiao P, Stricker M, Bideshi D, Young ID, Zala N, Engilberge S, Gorel A, Signor L, Teulon JM, Hilpert M, Foucar L, Bielecki J, Bean R, de Wijn R, Sato T, Kirkwood H, Letrun R, Batyuk A, Snigireva I, Fenel D, Schubert R, Canfield EJ, Alba MM, Laporte F, Després L, Bacia M, Roux A, Chapelle C, Riobé F, Maury O, Ling WL, Boutet S, Mancuso A, Gutsche I, Girard E, Barends TRM, Pellequer JL, Park HW, Laganowsky AD, Rodriguez J, Burghammer M, Shoeman RL, Doak RB, Weik M, Sauter NK, Federici B, Cascio D, Schlichting I, Colletier JP. De novo determination of mosquitocidal Cry11Aa and Cry11Ba structures from naturally-occurring nanocrystals. Nat Commun 2022; 13:4376. [PMID: 35902572 PMCID: PMC9334358 DOI: 10.1038/s41467-022-31746-x] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 02/08/2022] [Accepted: 06/30/2022] [Indexed: 11/08/2022] Open
Abstract
Cry11Aa and Cry11Ba are the two most potent toxins produced by mosquitocidal Bacillus thuringiensis subsp. israelensis and jegathesan, respectively. The toxins naturally crystallize within the host; however, the crystals are too small for structure determination at synchrotron sources. Therefore, we applied serial femtosecond crystallography at X-ray free electron lasers to in vivo-grown nanocrystals of these toxins. The structure of Cry11Aa was determined de novo using the single-wavelength anomalous dispersion method, which in turn enabled the determination of the Cry11Ba structure by molecular replacement. The two structures reveal a new pattern for in vivo crystallization of Cry toxins, whereby each of their three domains packs with a symmetrically identical domain, and a cleavable crystal packing motif is located within the protoxin rather than at the termini. The diversity of in vivo crystallization patterns suggests explanations for their varied levels of toxicity and rational approaches to improve these toxins for mosquito control.
Affiliation(s)
- Guillaume Tetreau
- Univ. Grenoble Alpes, CNRS, CEA, Institut de Biologie Structurale, 71 Avenue des martyrs, F-38000, Grenoble, France
- Michael R Sawaya
- UCLA-DOE Institute for Genomics and Proteomics, Department of Biological Chemistry, University of California, Los Angeles, CA, 90095-1570, USA
- Elke De Zitter
- Univ. Grenoble Alpes, CNRS, CEA, Institut de Biologie Structurale, 71 Avenue des martyrs, F-38000, Grenoble, France
- Elena A Andreeva
- Univ. Grenoble Alpes, CNRS, CEA, Institut de Biologie Structurale, 71 Avenue des martyrs, F-38000, Grenoble, France
- Max-Planck-Institut für medizinische Forschung, Jahnstrasse 29, 69120, Heidelberg, Germany
- Anne-Sophie Banneville
- Univ. Grenoble Alpes, CNRS, CEA, Institut de Biologie Structurale, 71 Avenue des martyrs, F-38000, Grenoble, France
- Natalie A Schibrowsky
- UCLA-DOE Institute for Genomics and Proteomics, Department of Biological Chemistry, University of California, Los Angeles, CA, 90095-1570, USA
- Department of Chemistry and Biochemistry, University of California, Los Angeles, CA, 90095, USA
- Nicolas Coquelle
- Large-Scale Structures Group, Institut Laue-Langevin, F-38000, Grenoble, France
- Aaron S Brewster
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Marie Luise Grünbein
- Max-Planck-Institut für medizinische Forschung, Jahnstrasse 29, 69120, Heidelberg, Germany
- Gabriela Nass Kovacs
- Max-Planck-Institut für medizinische Forschung, Jahnstrasse 29, 69120, Heidelberg, Germany
- Mark S Hunter
- Linac Coherent Light Source, SLAC National Accelerator Laboratory, Menlo Park, CA, 94025, USA
- Marco Kloos
- Max-Planck-Institut für medizinische Forschung, Jahnstrasse 29, 69120, Heidelberg, Germany
- European XFEL GmbH, Holzkoppel 4, 22869, Schenefeld, Germany
- Raymond G Sierra
- Linac Coherent Light Source, SLAC National Accelerator Laboratory, Menlo Park, CA, 94025, USA
- Giorgio Schiro
- Univ. Grenoble Alpes, CNRS, CEA, Institut de Biologie Structurale, 71 Avenue des martyrs, F-38000, Grenoble, France
- Pei Qiao
- Department of Chemistry, Texas A&M University, College Station, TX, 77845, USA
- Myriam Stricker
- Max-Planck-Institut für medizinische Forschung, Jahnstrasse 29, 69120, Heidelberg, Germany
- Dennis Bideshi
- Department of Entomology and Institute for Integrative Genome Biology, University of California, Riverside, CA, 92521, USA
- Department of Biological Sciences, California Baptist University, Riverside, CA, 92504, USA
- Iris D Young
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Ninon Zala
- Univ. Grenoble Alpes, CNRS, CEA, Institut de Biologie Structurale, 71 Avenue des martyrs, F-38000, Grenoble, France
- Sylvain Engilberge
- Univ. Grenoble Alpes, CNRS, CEA, Institut de Biologie Structurale, 71 Avenue des martyrs, F-38000, Grenoble, France
- Alexander Gorel
- Max-Planck-Institut für medizinische Forschung, Jahnstrasse 29, 69120, Heidelberg, Germany
- Luca Signor
- Univ. Grenoble Alpes, CNRS, CEA, Institut de Biologie Structurale, 71 Avenue des martyrs, F-38000, Grenoble, France
- Jean-Marie Teulon
- Univ. Grenoble Alpes, CNRS, CEA, Institut de Biologie Structurale, 71 Avenue des martyrs, F-38000, Grenoble, France
- Mario Hilpert
- Max-Planck-Institut für medizinische Forschung, Jahnstrasse 29, 69120, Heidelberg, Germany
- Lutz Foucar
- Max-Planck-Institut für medizinische Forschung, Jahnstrasse 29, 69120, Heidelberg, Germany
- Johan Bielecki
- European XFEL GmbH, Holzkoppel 4, 22869, Schenefeld, Germany
- Richard Bean
- European XFEL GmbH, Holzkoppel 4, 22869, Schenefeld, Germany
- Raphael de Wijn
- European XFEL GmbH, Holzkoppel 4, 22869, Schenefeld, Germany
- Tokushi Sato
- European XFEL GmbH, Holzkoppel 4, 22869, Schenefeld, Germany
- Henry Kirkwood
- European XFEL GmbH, Holzkoppel 4, 22869, Schenefeld, Germany
- Romain Letrun
- European XFEL GmbH, Holzkoppel 4, 22869, Schenefeld, Germany
- Alexander Batyuk
- Linac Coherent Light Source, SLAC National Accelerator Laboratory, Menlo Park, CA, 94025, USA
- Irina Snigireva
- European Synchrotron Radiation Facility (ESRF), BP 220, 38043, Grenoble, France
- Daphna Fenel
- Univ. Grenoble Alpes, CNRS, CEA, Institut de Biologie Structurale, 71 Avenue des martyrs, F-38000, Grenoble, France
- Robin Schubert
- European XFEL GmbH, Holzkoppel 4, 22869, Schenefeld, Germany
- Ethan J Canfield
- Mass Spectrometry Core Facility, School of Pharmacy, University of Southern California, Los Angeles, CA, 90089, USA
- Mario M Alba
- Department of Pharmacology and Pharmaceutical Sciences, School of Pharmacy, University of Southern California, Los Angeles, CA, 90089, USA
- Maria Bacia
- Univ. Grenoble Alpes, CNRS, CEA, Institut de Biologie Structurale, 71 Avenue des martyrs, F-38000, Grenoble, France
- Amandine Roux
- Univ. Lyon, ENS de Lyon, CNRS UMR 5182, Université Claude Bernard Lyon 1, Laboratoire de Chimie, F-69342, Lyon, France
- François Riobé
- Univ. Lyon, ENS de Lyon, CNRS UMR 5182, Université Claude Bernard Lyon 1, Laboratoire de Chimie, F-69342, Lyon, France
- Olivier Maury
- Univ. Lyon, ENS de Lyon, CNRS UMR 5182, Université Claude Bernard Lyon 1, Laboratoire de Chimie, F-69342, Lyon, France
- Wai Li Ling
- Univ. Grenoble Alpes, CNRS, CEA, Institut de Biologie Structurale, 71 Avenue des martyrs, F-38000, Grenoble, France
- Sébastien Boutet
- Linac Coherent Light Source, SLAC National Accelerator Laboratory, Menlo Park, CA, 94025, USA
- Adrian Mancuso
- European XFEL GmbH, Holzkoppel 4, 22869, Schenefeld, Germany
- Irina Gutsche
- Univ. Grenoble Alpes, CNRS, CEA, Institut de Biologie Structurale, 71 Avenue des martyrs, F-38000, Grenoble, France
- Eric Girard
- Univ. Grenoble Alpes, CNRS, CEA, Institut de Biologie Structurale, 71 Avenue des martyrs, F-38000, Grenoble, France
- Thomas R M Barends
- Max-Planck-Institut für medizinische Forschung, Jahnstrasse 29, 69120, Heidelberg, Germany
- Jean-Luc Pellequer
- Univ. Grenoble Alpes, CNRS, CEA, Institut de Biologie Structurale, 71 Avenue des martyrs, F-38000, Grenoble, France
- Hyun-Woo Park
- Department of Entomology and Institute for Integrative Genome Biology, University of California, Riverside, CA, 92521, USA
- Department of Biological Sciences, California Baptist University, Riverside, CA, 92504, USA
- Arthur D Laganowsky
- Department of Chemistry, Texas A&M University, College Station, TX, 77845, USA
- Jose Rodriguez
- UCLA-DOE Institute for Genomics and Proteomics, Department of Biological Chemistry, University of California, Los Angeles, CA, 90095-1570, USA
- Department of Chemistry and Biochemistry, University of California, Los Angeles, CA, 90095, USA
- Manfred Burghammer
- European Synchrotron Radiation Facility (ESRF), BP 220, 38043, Grenoble, France
- Robert L Shoeman
- Max-Planck-Institut für medizinische Forschung, Jahnstrasse 29, 69120, Heidelberg, Germany
- R Bruce Doak
- Max-Planck-Institut für medizinische Forschung, Jahnstrasse 29, 69120, Heidelberg, Germany
- Martin Weik
- Univ. Grenoble Alpes, CNRS, CEA, Institut de Biologie Structurale, 71 Avenue des martyrs, F-38000, Grenoble, France
- Nicholas K Sauter
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
- Brian Federici
- Department of Entomology and Institute for Integrative Genome Biology, University of California, Riverside, CA, 92521, USA
- Duilio Cascio
- UCLA-DOE Institute for Genomics and Proteomics, Department of Biological Chemistry, University of California, Los Angeles, CA, 90095-1570, USA
- Ilme Schlichting
- Max-Planck-Institut für medizinische Forschung, Jahnstrasse 29, 69120, Heidelberg, Germany
- Jacques-Philippe Colletier
- Univ. Grenoble Alpes, CNRS, CEA, Institut de Biologie Structurale, 71 Avenue des martyrs, F-38000, Grenoble, France.
6. Harrison PM. fLPS 2.0: rapid annotation of compositionally-biased regions in biological sequences. PeerJ 2021; 9:e12363. [PMID: 34760378 PMCID: PMC8557692 DOI: 10.7717/peerj.12363] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 07/23/2021] [Accepted: 09/30/2021] [Indexed: 12/12/2022] Open
Abstract
Compositionally-biased (CB) regions in biological sequences are enriched for a subset of sequence residue types. These can be shorter regions with a concentrated bias (i.e., those termed ‘low-complexity’), or longer regions that have a compositional skew. These regions comprise a prominent class of the uncharacterized ‘dark matter’ of the protein universe. Here, I report the latest version of the fLPS package for the annotation of CB regions, which includes added consideration of DNA sequences, to label the eight possible biased regions of DNA. In this version, the user is now able to restrict analysis to a specified subset of residue types, and also to filter for previously annotated domains to enable detection of discontinuous CB regions. A ‘thorough’ option has been added which enables the labelling of subtler biases, typically made from a skew for several residue types. In the output, protein CB regions are now labelled with bias classes reflecting the physico-chemical character of the biasing residues. The fLPS 2.0 package is available from: https://github.com/pmharrison/flps2 or in a Supplemental File of this paper.
Affiliation(s)
- Paul M Harrison
- Department of Biology, McGill University, Montreal, QC, Canada
7. Komura K, Inamoto T, Tsujino T, Matsui Y, Konuma T, Nishimura K, Uchimoto T, Tsutsumi T, Matsunaga T, Maenosono R, Yoshikawa Y, Taniguchi K, Tanaka T, Uehara H, Hirata K, Hirano H, Nomi H, Hirose Y, Ono F, Azuma H. Increased BUB1B/BUBR1 expression contributes to aberrant DNA repair activity leading to resistance to DNA-damaging agents. Oncogene 2021; 40:6210-6222. [PMID: 34545188 PMCID: PMC8553621 DOI: 10.1038/s41388-021-02021-y] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 10/29/2020] [Revised: 08/24/2021] [Accepted: 09/10/2021] [Indexed: 11/16/2022]
Abstract
There has been accumulating evidence for the clinical benefit of chemoradiation therapy (CRT), whereas mechanisms in CRT-recurrent clones derived from the primary tumor are still elusive. Herein, we identified an aberrant BUB1B/BUBR1 expression in CRT-recurrent clones in bladder cancer (BC) by comprehensive proteomic analysis. CRT-recurrent BC cells exhibited a cell-cycle-independent upregulation of BUB1B/BUBR1 expression rendering an enhanced DNA repair activity in response to DNA double-strand breaks (DSBs). With DNA repair analyses employing the CRISPR/cas9 system, we revealed that cells with aberrant BUB1B/BUBR1 expression dominantly exploit mutagenic nonhomologous end joining (NHEJ). We further found that phosphorylated ATM interacts with BUB1B/BUBR1 after ionizing radiation (IR) treatment, and the resistance to DSBs by increased BUB1B/BUBR1 depends on the functional ATM. In vivo, tumor growth of CRT-resistant T24R cells was abrogated by ATM inhibition using AZD0156. A dataset analysis identified FOXM1 as a putative BUB1B/BUBR1-targeting transcription factor causing its increased expression. These data collectively suggest a redundant role of BUB1B/BUBR1 underlying mutagenic NHEJ in an ATM-dependent manner, aside from the canonical activity of BUB1B/BUBR1 on the G2/M checkpoint, and offer novel clues to overcome CRT resistance.
Affiliation(s)
- Kazumasa Komura
- Department of Urology, Osaka Medical and Pharmaceutical University, Osaka, 569-8686, Japan
- Translational Research Program, Osaka Medical and Pharmaceutical University, Osaka, 569-8686, Japan
- Teruo Inamoto
- Department of Urology, Osaka Medical and Pharmaceutical University, Osaka, 569-8686, Japan
- Takuya Tsujino
- Division of Urology, Department of Surgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, 02115, United States
- Yusuke Matsui
- Biomedical and Health Informatics Unit, Department of Integrated Health Science, Nagoya University Graduate School of Medicine, Nagoya, 461-8673, Japan
- Institute for Glyco-core Research (iGCORE), Nagoya University, Nagoya, 461-8673, Japan
- Tsuyoshi Konuma
- Graduate School of Medical Life Science, Yokohama City University, Yokohama, 230-0045, Japan
- Kazuki Nishimura
- Department of Urology, Osaka Medical and Pharmaceutical University, Osaka, 569-8686, Japan
- Taizo Uchimoto
- Department of Urology, Osaka Medical and Pharmaceutical University, Osaka, 569-8686, Japan
- Takeshi Tsutsumi
- Division of Urology, Department of Surgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, 02115, United States
- Tomohisa Matsunaga
- Department of Urology, Osaka Medical and Pharmaceutical University, Osaka, 569-8686, Japan
- Ryoichi Maenosono
- Department of Urology, Osaka Medical and Pharmaceutical University, Osaka, 569-8686, Japan
- Yuki Yoshikawa
- Department of Urology, Osaka Medical and Pharmaceutical University, Osaka, 569-8686, Japan
- Kohei Taniguchi
- Translational Research Program, Osaka Medical and Pharmaceutical University, Osaka, 569-8686, Japan
- Tomohito Tanaka
- Translational Research Program, Osaka Medical and Pharmaceutical University, Osaka, 569-8686, Japan
- Hirofumi Uehara
- Department of Urology, Osaka Medical and Pharmaceutical University, Osaka, 569-8686, Japan
- Koichi Hirata
- Department of Pathology, Osaka Medical and Pharmaceutical University, Osaka, 569-8686, Japan
- Hajime Hirano
- Department of Urology, Osaka Medical and Pharmaceutical University, Osaka, 569-8686, Japan
- Hayahito Nomi
- Department of Urology, Osaka Medical and Pharmaceutical University, Osaka, 569-8686, Japan
- Yoshinobu Hirose
- Department of Pathology, Osaka Medical and Pharmaceutical University, Osaka, 569-8686, Japan
- Fumihito Ono
- Translational Research Program, Osaka Medical and Pharmaceutical University, Osaka, 569-8686, Japan
- Department of Physiology, Osaka Medical and Pharmaceutical University, Osaka, 569-8686, Japan
- Haruhito Azuma
- Department of Urology, Osaka Medical and Pharmaceutical University, Osaka, 569-8686, Japan
8. Hoefig KP, Reim A, Gallus C, Wong EH, Behrens G, Conrad C, Xu M, Kifinger L, Ito-Kureha T, Defourny KAY, Geerlof A, Mautner J, Hauck SM, Baumjohann D, Feederle R, Mann M, Wierer M, Glasmacher E, Heissmeyer V. Defining the RBPome of primary T helper cells to elucidate higher-order Roquin-mediated mRNA regulation. Nat Commun 2021; 12:5208. [PMID: 34471108 PMCID: PMC8410761 DOI: 10.1038/s41467-021-25345-5] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 11/10/2020] [Accepted: 07/28/2021] [Indexed: 01/01/2023] Open
Abstract
Post-transcriptional gene regulation in T cells is dynamic and complex as targeted transcripts respond to various factors. This is evident for the Icos mRNA encoding an essential costimulatory receptor that is regulated by several RNA-binding proteins (RBP), including Roquin-1 and Roquin-2. Here, we identify a core RBPome of 798 mouse and 801 human T cell proteins by utilizing global RNA interactome capture (RNA-IC) and orthogonal organic phase separation (OOPS). The RBPome includes Stat1, Stat4 and Vav1 proteins suggesting unexpected functions for these transcription factors and signal transducers. Based on proximity to Roquin-1, we select ~50 RBPs for testing coregulation of Roquin-1/2 targets by induced expression in wild-type or Roquin-1/2-deficient T cells. Besides Roquin-independent contributions from Rbms1 and Cpeb4 we also show Roquin-1/2-dependent and target-specific coregulation of Icos by Celf1 and Igf2bp3. Connecting the cellular RBPome in a post-transcriptional context, we find contributions from multiple RBPs to the prototypic regulation of mRNA targets by individual trans-acting factors.
An extensive RNA binding protein atlas (RBPome) for primary T cells would be a useful resource. Here the authors use two different methods to characterise the mouse and human T cell RBPome and show regulation of Roquin-1/2 dependent and independent pathways.
Affiliation(s)
- Kai P Hoefig
- Research Unit Molecular Immune Regulation, Helmholtz Center Munich, Munich, Germany
- Alexander Reim
- Department of Proteomics and Signal Transduction, Max-Planck-Institute of Biochemistry, Munich, Germany
- Christian Gallus
- Institute of Diabetes and Obesity, Helmholtz Center Munich, Munich, Germany
- Elaine H Wong
- Institute for Immunology, Biomedical Center, Ludwig Maximilians University Munich, Planegg-Martinsried, Germany
- Gesine Behrens
- Research Unit Molecular Immune Regulation, Helmholtz Center Munich, Munich, Germany
- Christine Conrad
- Institute for Immunology, Biomedical Center, Ludwig Maximilians University Munich, Planegg-Martinsried, Germany
- Meng Xu
- Research Unit Molecular Immune Regulation, Helmholtz Center Munich, Munich, Germany
- Lisa Kifinger
- Institute for Immunology, Biomedical Center, Ludwig Maximilians University Munich, Planegg-Martinsried, Germany
- Taku Ito-Kureha
- Institute for Immunology, Biomedical Center, Ludwig Maximilians University Munich, Planegg-Martinsried, Germany
- Kyra A Y Defourny
- Institute for Immunology, Biomedical Center, Ludwig Maximilians University Munich, Planegg-Martinsried, Germany
- Department of Biomolecular Health Sciences, Utrecht University, Utrecht, The Netherlands
- Arie Geerlof
- Institute of Structural Biology, Helmholtz Center Munich, Neuherberg, Germany
- Josef Mautner
- Research Unit Gene Vectors, Helmholtz Center Munich & Children's Hospital, TU Munich, Munich, Germany
- Stefanie M Hauck
- Research Unit Protein Science, Helmholtz Center Munich, Munich, Germany
- Dirk Baumjohann
- Institute for Immunology, Biomedical Center, Ludwig Maximilians University Munich, Planegg-Martinsried, Germany
- Medical Clinic III for Oncology, Immuno-Oncology and Rheumatology University Hospital Bonn, University of Bonn, Bonn, Germany
- Regina Feederle
- Monoclonal Antibody Core Facility and Research Group, Institute for Diabetes and Obesity, Helmholtz Center Munich, Neuherberg, Germany
- Matthias Mann
- Department of Proteomics and Signal Transduction, Max-Planck-Institute of Biochemistry, Munich, Germany
- Michael Wierer
- Department of Proteomics and Signal Transduction, Max-Planck-Institute of Biochemistry, Munich, Germany
- Proteomics Research Infrastructure, University of Copenhagen, Copenhagen, Denmark
- Elke Glasmacher
- Institute of Diabetes and Obesity, Helmholtz Center Munich, Munich, Germany
- Roche Pharma Research and Early Development, Large Molecule Research, Roche Innovation Center Munich, Penzberg, Germany
- Vigo Heissmeyer
- Research Unit Molecular Immune Regulation, Helmholtz Center Munich, Munich, Germany
- Institute for Immunology, Biomedical Center, Ludwig Maximilians University Munich, Planegg-Martinsried, Germany
9. Mier P, Paladin L, Tamana S, Petrosian S, Hajdu-Soltész B, Urbanek A, Gruca A, Plewczynski D, Grynberg M, Bernadó P, Gáspári Z, Ouzounis CA, Promponas VJ, Kajava AV, Hancock JM, Tosatto SCE, Dosztanyi Z, Andrade-Navarro MA. Disentangling the complexity of low complexity proteins. Brief Bioinform 2021; 21:458-472. [PMID: 30698641 PMCID: PMC7299295 DOI: 10.1093/bib/bbz007] [Citation(s) in RCA: 51] [Impact Index Per Article: 17.0] [Received: 11/12/2018] [Revised: 12/19/2018] [Accepted: 01/07/2019] [Indexed: 12/31/2022] Open
Abstract
There are multiple definitions for low complexity regions (LCRs) in protein sequences, with all of them broadly considering LCRs as regions with fewer amino acid types compared to an average composition. Following this view, LCRs can also be defined as regions showing composition bias. In this critical review, we focus on the definition of sequence complexity of LCRs and their connection with structure. We present statistics and methodological approaches that measure low complexity (LC) and related sequence properties. Composition bias is often associated with LC and disorder, but repeats, while compositionally biased, might also induce ordered structures. We illustrate this dichotomy, and more generally the overlaps between different properties related to LCRs, using examples. We argue that statistical measures alone cannot capture all structural aspects of LCRs and recommend the combined usage of a variety of predictive tools and measurements. While the methodologies available to study LCRs are already very advanced, we foresee that a more comprehensive annotation of sequences in the databases will enable the improvement of predictions and a better understanding of the evolution and the connection between structure and function of LCRs. This will require the use of standards for the generation and exchange of data describing all aspects of LCRs.
Short abstract: There are multiple definitions for low complexity regions (LCRs) in protein sequences. In this critical review, we focus on the definition of sequence complexity of LCRs and their connection with structure. We present statistics and methodological approaches that measure low complexity (LC) and related sequence properties. Composition bias is often associated with LC and disorder, but repeats, while compositionally biased, might also induce ordered structures. We illustrate this dichotomy, plus overlaps between different properties related to LCRs, using examples.
Affiliation(s)
- Pablo Mier
- Institute of Organismic and Molecular Evolution, Johannes Gutenberg University of Mainz, Mainz, Germany
- Lisanna Paladin
- Department of Biomedical Science, University of Padova, Padova, Italy
- Stella Tamana
- Bioinformatics Research Laboratory, Department of Biological Sciences, University of Cyprus, Nicosia, Cyprus
- Sophia Petrosian
- Biological Computation and Process Laboratory, Chemical Process & Energy Resources Institute, Centre for Research & Technology Hellas, Thessalonica, Greece
- Borbála Hajdu-Soltész
- MTA-ELTE Lendület Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest, Hungary
- Annika Urbanek
- Centre de Biochimie Structurale, INSERM, CNRS, Université de Montpellier, Montpellier, France
- Aleksandra Gruca
- Institute of Informatics, Silesian University of Technology, Gliwice, Poland
- Dariusz Plewczynski
- Center of New Technologies, University of Warsaw, Warsaw, Poland; Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland
- Pau Bernadó
- Centre de Biochimie Structurale, INSERM, CNRS, Université de Montpellier, Montpellier, France
- Zoltán Gáspári
- Faculty of Information Technology and Bionics, Pázmány Péter Catholic University, Budapest, Hungary
- Christos A Ouzounis
- Biological Computation and Process Laboratory, Chemical Process & Energy Resources Institute, Centre for Research & Technology Hellas, Thessalonica, Greece
- Vasilis J Promponas
- Bioinformatics Research Laboratory, Department of Biological Sciences, University of Cyprus, Nicosia, Cyprus
- Andrey V Kajava
- Centre de Recherche en Biologie Cellulaire de Montpellier, CNRS-UMR, Institut de Biologie Computationnelle, Université de Montpellier, Montpellier, France; Institute of Bioengineering, University ITMO, St. Petersburg, Russia
- John M Hancock
- Earlham Institute, Norwich, UK; ELIXIR Hub, Wellcome Genome Campus, Hinxton, UK
- Silvio C E Tosatto
- Department of Biomedical Science, University of Padova, Padova, Italy; CNR Institute of Neuroscience, Padova, Italy
- Zsuzsanna Dosztanyi
- MTA-ELTE Lendület Bioinformatics Research Group, Department of Biochemistry, Eötvös Loránd University, Budapest, Hungary
- Miguel A Andrade-Navarro
- Institute of Organismic and Molecular Evolution, Johannes Gutenberg University of Mainz, Mainz, Germany
10
Mier P, Andrade-Navarro MA. Assessing the low complexity of protein sequences via the low complexity triangle. PLoS One 2020; 15:e0239154. [PMID: 33378336 PMCID: PMC7773278 DOI: 10.1371/journal.pone.0239154] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/24/2020] [Accepted: 08/31/2020] [Indexed: 11/24/2022] Open
Abstract
Background: Proteins with low complexity regions (LCRs) have atypical sequence and structural features. Their amino acid composition varies from the expected, determined proteome-wise, and they do not follow the rules of structural folding that prevail in globular regions. One way to characterize these regions is by assessing the repeatability of a sequence, that is, calculating the local propensity of a region to be part of a repeat. Results: We combine two local measures of low complexity, repeatability (using the RES algorithm) and fraction of the most frequent amino acid, to evaluate different proteomes, datasets of protein regions with specific features, and individual cases of proteins with extreme compositions. We apply a representation called ‘low complexity triangle’ as a proof-of-concept to represent the measured low complexity values. Results show that proteomes have distinct signatures in the low complexity triangle, and that these signatures are associated with complexity features of the sequences. We developed a web tool called LCT (http://cbdm-01.zdv.uni-mainz.de/~munoz/lct/) to allow users to calculate the low complexity triangle of a given protein or region of interest. Conclusions: The low complexity triangle proves to be a suitable procedure to represent the general low complexity of a sequence or protein dataset. Homorepeats, direpeats, compositionally biased regions and globular regions occupy characteristic positions in the triangle. The described pipeline can be used to characterize LCRs and may help in quantifying the content of degenerated tandem repeats in proteins and proteomes.
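One of the two measures named in this abstract, the fraction of the most frequent amino acid, can be illustrated with a minimal Python sketch. This is not the LCT implementation and it does not cover the RES repeatability measure; the function names and the default window size of 20 are assumptions chosen for illustration only.

```python
from collections import Counter


def top_residue_fraction(window: str) -> float:
    """Return the fraction of a window taken up by its most frequent residue."""
    if not window:
        raise ValueError("window must be non-empty")
    counts = Counter(window.upper())
    # most_common(1) yields a single (residue, count) pair
    return counts.most_common(1)[0][1] / len(window)


def sliding_top_fraction(sequence: str, window_size: int = 20) -> list[float]:
    """Compute the top-residue fraction for every full window along a sequence.

    window_size is a hypothetical default, not a value taken from the paper.
    A sequence shorter than the window yields an empty list.
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    return [
        top_residue_fraction(sequence[i:i + window_size])
        for i in range(len(sequence) - window_size + 1)
    ]
```

Under these assumptions a homorepeat window scores 1.0, and more diverse windows score progressively lower, approaching 1/window_size when every residue in the window is different.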
Affiliation(s)
- Pablo Mier
- Faculty of Biology, Institute of Organismic and Molecular Evolution, Johannes Gutenberg University Mainz, Mainz, Germany
- Miguel A. Andrade-Navarro
- Faculty of Biology, Institute of Organismic and Molecular Evolution, Johannes Gutenberg University Mainz, Mainz, Germany
11
Wilfling F, Lee CW, Erdmann PS, Zheng Y, Sherpa D, Jentsch S, Pfander B, Schulman BA, Baumeister W. A Selective Autophagy Pathway for Phase-Separated Endocytic Protein Deposits. Mol Cell 2020; 80:764-778.e7. [PMID: 33207182 PMCID: PMC7721475 DOI: 10.1016/j.molcel.2020.10.030] [Citation(s) in RCA: 66] [Impact Index Per Article: 16.5] [Received: 05/15/2020] [Revised: 09/20/2020] [Accepted: 10/21/2020] [Indexed: 12/14/2022]
Abstract
Autophagy eliminates cytoplasmic content selected by autophagy receptors, which link cargo to the membrane-bound autophagosomal ubiquitin-like protein Atg8/LC3. Here, we report a selective autophagy pathway for protein condensates formed by endocytic proteins in yeast. In this pathway, the endocytic protein Ede1 functions as a selective autophagy receptor. Distinct domains within Ede1 bind Atg8 and mediate phase separation into condensates. Both properties are necessary for an Ede1-dependent autophagy pathway for endocytic proteins, which differs from regular endocytosis and does not involve other known selective autophagy receptors but requires the core autophagy machinery. Cryo-electron tomography of Ede1-containing condensates, at the plasma membrane and in autophagic bodies, shows a phase-separated compartment at the beginning and end of the Ede1-mediated selective autophagy route. Our data suggest a model for autophagic degradation of macromolecular protein complexes by the action of intrinsic autophagy receptors.
Affiliation(s)
- Florian Wilfling
- Molecular Cell Biology, Max Planck Institute of Biochemistry, 82152 Martinsried, Germany; Molecular Structural Biology, Max Planck Institute of Biochemistry, 82152 Martinsried, Germany; Molecular Machines and Signaling, Max Planck Institute of Biochemistry, 82152 Martinsried, Germany
- Chia-Wei Lee
- Molecular Cell Biology, Max Planck Institute of Biochemistry, 82152 Martinsried, Germany; Molecular Structural Biology, Max Planck Institute of Biochemistry, 82152 Martinsried, Germany
- Philipp S Erdmann
- Molecular Structural Biology, Max Planck Institute of Biochemistry, 82152 Martinsried, Germany
- Yumei Zheng
- Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA; Department of Microbiology, Immunology and Biochemistry, University of Tennessee Health Science Center, Memphis, TN, USA
- Dawafuti Sherpa
- Molecular Machines and Signaling, Max Planck Institute of Biochemistry, 82152 Martinsried, Germany
- Stefan Jentsch
- Molecular Cell Biology, Max Planck Institute of Biochemistry, 82152 Martinsried, Germany
- Boris Pfander
- DNA Replication and Genome Integrity, Max Planck Institute of Biochemistry, 82152 Martinsried, Germany
- Brenda A Schulman
- Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA; Molecular Machines and Signaling, Max Planck Institute of Biochemistry, 82152 Martinsried, Germany
- Wolfgang Baumeister
- Molecular Structural Biology, Max Planck Institute of Biochemistry, 82152 Martinsried, Germany
12
Jarnot P, Ziemska-Legiecka J, Dobson L, Merski M, Mier P, Andrade-Navarro MA, Hancock JM, Dosztányi Z, Paladin L, Necci M, Piovesan D, Tosatto SCE, Promponas VJ, Grynberg M, Gruca A. PlaToLoCo: the first web meta-server for visualization and annotation of low complexity regions in proteins. Nucleic Acids Res 2020; 48:W77-W84. [PMID: 32421769 PMCID: PMC7319588 DOI: 10.1093/nar/gkaa339] [Citation(s) in RCA: 69] [Impact Index Per Article: 17.3] [Received: 03/07/2020] [Revised: 04/08/2020] [Accepted: 05/01/2020] [Indexed: 12/25/2022] Open
Abstract
Low complexity regions (LCRs) in protein sequences are characterized by a less diverse amino acid composition compared to typically observed sequence diversity. Recent studies have shown that LCRs may co-occur with intrinsically disordered regions, are highly conserved in many organisms, and often play important roles in protein functions and in diseases. In previous decades, several methods have been developed to identify regions with LCRs or amino acid bias, but most of them are stand-alone applications, and currently there is no web-based tool that allows users to explore LCRs in protein sequences with additional functional annotations. We aim to fill this gap by providing PlaToLoCo (PLAtform of TOols for LOw COmplexity), a meta-server that integrates and collects the output of five different state-of-the-art tools for discovering LCRs and provides functional annotations such as domain detection, transmembrane segment prediction, and calculation of amino acid frequencies. In addition, the union or intersection of the results of the search on a query sequence can be obtained. By developing the PlaToLoCo meta-server, we provide the community with a fast and easily accessible tool for the analysis of LCRs with additional information included to aid the interpretation of the results. The PlaToLoCo platform is available at: http://platoloco.aei.polsl.pl/.
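The union and intersection of annotations mentioned in this abstract can be pictured with a short Python sketch. It is not PlaToLoCo code: the data layout (1-based inclusive `(start, end)` tuples keyed by tool name) and every function name are assumptions made for illustration.

```python
def covered_positions(regions: list[tuple[int, int]]) -> set[int]:
    """Expand inclusive (start, end) regions into a set of residue positions."""
    positions: set[int] = set()
    for start, end in regions:
        positions.update(range(start, end + 1))
    return positions


def to_regions(positions: set[int]) -> list[tuple[int, int]]:
    """Collapse residue positions back into sorted, merged inclusive regions."""
    regions: list[tuple[int, int]] = []
    for pos in sorted(positions):
        if regions and pos == regions[-1][1] + 1:
            regions[-1] = (regions[-1][0], pos)  # extend the current region
        else:
            regions.append((pos, pos))  # start a new region
    return regions


def combine(annotations: dict[str, list[tuple[int, int]]], mode: str = "union"):
    """Combine per-tool region lists by 'union' or 'intersection'."""
    if mode not in ("union", "intersection"):
        raise ValueError("mode must be 'union' or 'intersection'")
    sets = [covered_positions(regions) for regions in annotations.values()]
    if not sets:
        return []
    combined = set.union(*sets) if mode == "union" else set.intersection(*sets)
    return to_regions(combined)
```

In this sketch the union reports every residue flagged by at least one tool, while the intersection keeps only residues on which all tools agree, which is the stricter of the two views.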
Affiliation(s)
- Patryk Jarnot
- Department of Computer Networks and Systems, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland
- Laszlo Dobson
- Faculty of Information Technology and Bionics, Pázmány Péter Catholic University, Práter u. 50/A, 1083 Budapest, Hungary; Research Centre for Natural Sciences, Magyar Tudósok Körútja 2, 1117 Budapest, Hungary
- Matthew Merski
- Structural Biology Group, Biological and Chemical Research Centre, Department of Chemistry, University of Warsaw, Żwirki i Wigury 101, 02-089 Warsaw, Poland
- Pablo Mier
- Faculty of Biology, Johannes Gutenberg University Mainz, Hans-Dieter-Hüsch-Weg 15, 55128 Mainz, Germany
- Miguel A Andrade-Navarro
- Faculty of Biology, Johannes Gutenberg University Mainz, Hans-Dieter-Hüsch-Weg 15, 55128 Mainz, Germany
- John M Hancock
- ELIXIR, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Zsuzsanna Dosztányi
- Department of Biochemistry, ELTE Eötvös Loránd University, Budapest, Pázmány Péter stny 1/c 1117, Budapest, Hungary
- Lisanna Paladin
- Department of Biomedical Sciences, University of Padova, Via Ugo Bassi 58/B, 35131 Padova, Italy
- Marco Necci
- Department of Biomedical Sciences, University of Padova, Via Ugo Bassi 58/B, 35131 Padova, Italy
- Damiano Piovesan
- Department of Biomedical Sciences, University of Padova, Via Ugo Bassi 58/B, 35131 Padova, Italy
- Silvio C E Tosatto
- Department of Biomedical Sciences, University of Padova, Via Ugo Bassi 58/B, 35131 Padova, Italy
- Vasilis J Promponas
- Bioinformatics Research Laboratory, Department of Biological Sciences, University of Cyprus, P.O. Box 20537, Nicosia, CY 1678, Cyprus
- Marcin Grynberg
- Institute of Biochemistry and Biophysics PAS, Pawinskiego 5A, 02-106 Warsaw, Poland
- Aleksandra Gruca
- Department of Computer Networks and Systems, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland
13
Wang Z, Chen K, Jia Y, Chuang JC, Sun X, Lin YH, Celen C, Li L, Huang F, Liu X, Castrillon DH, Wang T, Zhu H. Dual ARID1A/ARID1B loss leads to rapid carcinogenesis and disruptive redistribution of BAF complexes. Nat Cancer 2020; 1:909-922. [PMID: 34386776 DOI: 10.1038/s43018-020-00109-0] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Indexed: 02/06/2023]
Abstract
SWI/SNF chromatin remodelers play critical roles in development and cancer. The causal links between SWI/SNF complex disassembly and carcinogenesis are obscured by redundancy between paralogous components. Canonical cBAF-specific paralogs ARID1A and ARID1B are synthetic lethal in some contexts, but simultaneous mutations in both ARID1s are prevalent in cancer. To understand if and how cBAF abrogation causes cancer, we examined the physiologic and biochemical consequences of ARID1A/ARID1B loss. In double knockout liver and skin, aggressive carcinogenesis followed de-differentiation and hyperproliferation. In double mutant endometrial cancer, add-back of either induced senescence. Biochemically, residual cBAF subcomplexes resulting from loss of ARID1 scaffolding were unexpectedly found to disrupt polybromo containing pBAF function. 37 of 69 mutations in the conserved scaffolding domains of ARID1 proteins observed in human cancer caused complex disassembly, partially explaining their mutation spectra. ARID1-less, cBAF-less states promote carcinogenesis across tissues, and suggest caution against paralog-directed therapies for ARID1-mutant cancer.
Affiliation(s)
- Zixi Wang
- Children's Research Institute, Departments of Pediatrics and Internal Medicine, Center for Regenerative Science and Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Kenian Chen
- Quantitative Biomedical Research Center, Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Yuemeng Jia
- Children's Research Institute, Departments of Pediatrics and Internal Medicine, Center for Regenerative Science and Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Jen-Chieh Chuang
- Children's Research Institute, Departments of Pediatrics and Internal Medicine, Center for Regenerative Science and Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Xuxu Sun
- Children's Research Institute, Departments of Pediatrics and Internal Medicine, Center for Regenerative Science and Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Yu-Hsuan Lin
- Children's Research Institute, Departments of Pediatrics and Internal Medicine, Center for Regenerative Science and Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Cemre Celen
- Children's Research Institute, Departments of Pediatrics and Internal Medicine, Center for Regenerative Science and Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Lin Li
- Children's Research Institute, Departments of Pediatrics and Internal Medicine, Center for Regenerative Science and Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Fang Huang
- Children's Research Institute, Departments of Pediatrics and Internal Medicine, Center for Regenerative Science and Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Xin Liu
- Children's Research Institute, Departments of Pediatrics and Internal Medicine, Center for Regenerative Science and Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Diego H Castrillon
- Department of Pathology, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Tao Wang
- Quantitative Biomedical Research Center, Department of Population and Data Sciences, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
- Hao Zhu
- Children's Research Institute, Departments of Pediatrics and Internal Medicine, Center for Regenerative Science and Medicine, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
14
Li W, Wu G, Wang M, Yue A, Du W, Liu D, Zhao J. Colorimetric detection of class A soybean saponins by coupling DNAzyme with the gap ligase chain reaction. Anal Methods 2020; 12:3361-3367. [PMID: 32930223 DOI: 10.1039/d0ay00820f] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
Class A saponins are responsible for the taste of soybean products, and the rapid identification of class A saponins from soybean food is essential for both food safety and cultivar screening. In this study, we propose a colorimetric assay based on the coupling of gap ligase chain reaction (Gap-LCR) with DNAzyme to detect the target GmSg-1 genes of class A soybean saponins with the naked eye, without the involvement of expensive instruments. The limits of detection (LODs) for the GmSg-1a and GmSg-1b genes were determined to be 0.1618 and 0.1625 μM, respectively, with a linear range of 0.2-1.2 μM. The DNAzyme-based Gap LCR assay was successfully employed to identify the target genes from different soybean cultivars, providing a simple means for monitoring the quality of soybean products.
Affiliation(s)
- Wenshuai Li
- College of Arts and Sciences, Shanxi Agricultural University, Taigu, Shanxi 030801, China
- Guorui Wu
- College of Agronomy, Shanxi Agricultural University, Taigu, Shanxi 030801, China
- Min Wang
- College of Agronomy, Shanxi Agricultural University, Taigu, Shanxi 030801, China
- Aiqin Yue
- College of Agronomy, Shanxi Agricultural University, Taigu, Shanxi 030801, China
- Weijun Du
- College of Agronomy, Shanxi Agricultural University, Taigu, Shanxi 030801, China
- Dingbin Liu
- College of Chemistry, Nankai University, Tianjin, 300071, China
- Jinzhong Zhao
- College of Arts and Sciences, Shanxi Agricultural University, Taigu, Shanxi 030801, China
15
Panayidou S, Georgiades K, Christofi T, Tamana S, Promponas VJ, Apidianakis Y. Pseudomonas aeruginosa core metabolism exerts a widespread growth-independent control on virulence. Sci Rep 2020; 10:9505. [PMID: 32528034 PMCID: PMC7289854 DOI: 10.1038/s41598-020-66194-4] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 01/22/2020] [Accepted: 05/13/2020] [Indexed: 02/04/2023] Open
Abstract
To assess the role of core metabolism genes in bacterial virulence, independently of their effect on growth, we correlated the genome, the transcriptome and the pathogenicity in flies and mice of 30 fully sequenced Pseudomonas strains. Gene presence correlates robustly with pathogenicity differences among all Pseudomonas species, but not among the P. aeruginosa strains. However, gene expression differences are evident between highly and lowly pathogenic P. aeruginosa strains in multiple virulence factors and a few metabolism genes. Moreover, a noticeable fraction (16.5%) of the core metabolism genes of P. aeruginosa strain PA14, compared to 8.5% of the non-metabolic genes tested, appear necessary for full virulence when mutated. Most of these virulence-defective core metabolism mutants are compromised in at least one key virulence mechanism independently of auxotrophy. A pathway-level analysis of PA14 core metabolism uncovers beta-oxidation and the biosynthesis of amino acids, succinate, citramalate, and chorismate to be important for full virulence. Strikingly, the relative expression among P. aeruginosa strains of genes belonging to these metabolic pathways is indicative of their pathogenicity. Thus, P. aeruginosa strain-to-strain virulence variation remains largely obscure at the genome level, but can be dissected at the pathway level via functional transcriptomics of core metabolism.
Affiliation(s)
- Stavria Panayidou
- Infection and Cancer Laboratory, Department of Biological Sciences, University of Cyprus, Nicosia, Cyprus
- Kaliopi Georgiades
- Infection and Cancer Laboratory, Department of Biological Sciences, University of Cyprus, Nicosia, Cyprus; Bioinformatics Research Laboratory, Department of Biological Sciences, University of Cyprus, Nicosia, Cyprus
- Theodoulakis Christofi
- Infection and Cancer Laboratory, Department of Biological Sciences, University of Cyprus, Nicosia, Cyprus
- Stella Tamana
- Bioinformatics Research Laboratory, Department of Biological Sciences, University of Cyprus, Nicosia, Cyprus
- Vasilis J Promponas
- Bioinformatics Research Laboratory, Department of Biological Sciences, University of Cyprus, Nicosia, Cyprus
- Yiorgos Apidianakis
- Infection and Cancer Laboratory, Department of Biological Sciences, University of Cyprus, Nicosia, Cyprus
16
Ntountoumi C, Vlastaridis P, Mossialos D, Stathopoulos C, Iliopoulos I, Promponas V, Oliver SG, Amoutzias GD. Low complexity regions in the proteins of prokaryotes perform important functional roles and are highly conserved. Nucleic Acids Res 2019; 47:9998-10009. [PMID: 31504783 PMCID: PMC6821194 DOI: 10.1093/nar/gkz730] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Received: 05/15/2019] [Revised: 07/16/2019] [Accepted: 08/15/2019] [Indexed: 01/27/2023] Open
Abstract
We provide the first high-throughput analysis of the properties and functional role of Low Complexity Regions (LCRs) in more than 1500 prokaryotic and phage proteomes. We observe that, contrary to a widespread belief based on older and sparse data, LCRs actually have a significant, persistent and highly conserved presence and role in many and diverse prokaryotes. Their specific amino acid content is linked to proteins with certain molecular functions, such as the binding of RNA, DNA, metal ions and polysaccharides. In addition, LCRs have been repeatedly identified in very ancient and usually highly expressed proteins of the translation machinery. Finally, based on the amino acid content enriched in certain categories, we have developed a neural network web server to identify LCRs and accurately predict whether they can bind nucleic acids or metal ions, or are involved in chaperone functions. An evaluation of the tool showed that it is highly accurate for eukaryotic proteins as well.
Affiliation(s)
- Chrysa Ntountoumi
- Bioinformatics Laboratory, Department of Biochemistry and Biotechnology, University of Thessaly, 41500, Greece
- Panayotis Vlastaridis
- Bioinformatics Laboratory, Department of Biochemistry and Biotechnology, University of Thessaly, 41500, Greece
- Dimitris Mossialos
- Microbial Biotechnology-Molecular Bacteriology-Virology Laboratory, Department of Biochemistry and Biotechnology, University of Thessaly, 41500, Greece
- Vasilios Promponas
- Bioinformatics Research Laboratory, Department of Biological Sciences, New Campus, University of Cyprus, PO Box 20537, CY-1678 Nicosia, Cyprus
- Stephen G Oliver
- Cambridge Systems Biology Centre & Department of Biochemistry, University of Cambridge, CB2 1GA, UK
- Grigoris D Amoutzias
- Bioinformatics Laboratory, Department of Biochemistry and Biotechnology, University of Thessaly, 41500, Greece
17
Harrison PM. Compositionally Biased Dark Matter in the Protein Universe. Proteomics 2018; 18:e1800069. [PMID: 30260558 DOI: 10.1002/pmic.201800069] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 05/16/2018] [Revised: 08/29/2018] [Indexed: 01/01/2023]
Abstract
Compositionally biased regions (BRs) occur when a few amino-acid types are enriched in a protein segment. There are possibly BR types in the known protein universe that have not been characterized experimentally. The UniProt protein database has been surveyed for evidence of such compositional "dark matter". A "dark biased region" (DBR) is defined as a biased region with low probability of being an individual structural domain or intrinsically disordered region. The bias annotation program fLPS is used to generate a list of >13 million BRs, which is then thoroughly filtered for structure and intrinsic disorder. About a third of BRs (31%) have both substantial intrinsic disorder and structure. After filtering, there are ≈0.9 million DBRs (≈7% of the original BRs in ≈1.4% of proteins). These DBRs are hugely enriched in eukaryotes and hugely depleted in bacteria. They tend to be more hydrophobic than other protein regions, but are made of less extreme combinations of hydrophobic/hydrophilic residues. Given varying assumptions, it has been estimated how many DBRs there might be for the high bias levels examined (with p-values < 1 × 10⁻⁶), deriving a reasonable range of 0.7-7.2% of proteins having such DBRs. Hypotheses are examined about what such DBRs might be, that is, that they are from un- or undersampled domain/region categories or are unappreciated categories somewhat like existing ones.
Affiliation(s)
- Paul M Harrison
- Department of Biology, McGill University, Montreal, QC, H3A 1B1, Canada
18
Pokhrel AR, Nguyen HT, Dhakal D, Chaudhary AK, Sohng JK. Implication of orphan histidine kinase (OhkAsp) in biosynthesis of doxorubicin and daunorubicin in Streptomyces peucetius ATCC 27952. Microbiol Res 2018; 214:37-46. [DOI: 10.1016/j.micres.2018.05.007] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 01/19/2018] [Revised: 04/18/2018] [Accepted: 05/09/2018] [Indexed: 12/30/2022]
19
A bioinformatics pipeline to search functional motifs within whole-proteome data: a case study of poxviruses. Virus Genes 2016; 53:173-178. [PMID: 28000080 PMCID: PMC5357487 DOI: 10.1007/s11262-016-1416-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 03/30/2016] [Accepted: 12/01/2016] [Indexed: 12/19/2022]
Abstract
Proteins harbor domains or short linear motifs, which facilitate their functions and interactions. Finding functional motifs in protein sequences could predict the putative cellular roles or characteristics of hypothetical proteins. In this study, we present Shetti-Motif, which is an interactive tool to (i) map UniProt and PROSITE flat files, (ii) search for multiple pre-defined consensus patterns or experimentally validated functional motifs in large datasets of protein sequences (proteome-wide), and (iii) search for motifs containing repeated residues (low-complexity regions, e.g., Leu-, SR-, PEST-rich motifs, etc.). As proof of principle, using this comparative proteomics pipeline, eleven proteomes encoded by members of the Poxviridae family were searched against about 100 experimentally validated functional motifs. Closely related viruses and viruses that infect the same host cells (e.g. vaccinia and variola viruses) show similar profiles of motif-containing proteins. The motifs encoded by these viruses are correlated, which explains why poxviruses are able to interact with a wide range of host cells. In conclusion, this in silico analysis is useful to establish dataset(s) or potential proteins for further investigation, or to compare between species.
20
Battistuzzi FU, Schneider KA, Spencer MK, Fisher D, Chaudhry S, Escalante AA. Profiles of low complexity regions in Apicomplexa. BMC Evol Biol 2016; 16:47. [PMID: 26923229 PMCID: PMC4770516 DOI: 10.1186/s12862-016-0625-0] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 09/12/2015] [Accepted: 02/17/2016] [Indexed: 11/10/2022] Open
Abstract
Background: Low complexity regions (LCRs) are a ubiquitous feature in genomes and yet their evolutionary history and functional roles are unclear. Previous studies have shown contrasting evidence in favor of both neutral and selective mechanisms of evolution for different sets of LCRs, suggesting that modes of identification of these regions may play a role in our ability to discern their evolutionary history. To further investigate this issue, we used a multiple threshold approach to identify species-specific profiles of proteome complexity and, by comparing properties of these sets, determine the influence that starting parameters have on evolutionary inferences. Results: We find that, although qualitatively similar, quantitatively each species has a unique LCR profile which represents the frequency of these regions within each genome. Inferences based on these profiles are more accurate in comparative analyses of genome complexity, as they allow the relative complexity of multiple genomes to be determined, as well as the type of repetitiveness that is most common in each. Based on the multiple threshold LCR sets obtained, we identified predominant evolutionary mechanisms at different complexity levels, which show neutral mechanisms acting on highly repetitive LCRs (e.g., homopolymers) and selective forces becoming more important as heterogeneity of the LCRs increases. Conclusions: Our results show how inferences based on LCRs are influenced by the parameters used to identify these regions. Sets of LCRs are heterogeneous aggregates of regions that include homo- and heteropolymers and, as such, evolve according to different mechanisms. LCR profiles provide a new way to investigate genome complexity across species and to determine the driving mechanism of their evolution.
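The idea of a multiple threshold profile can be sketched in Python as follows. This is an illustration only and not the authors' pipeline: it swaps in a plain Shannon-entropy window score as the complexity measure, and the window size, threshold values and function names are all assumptions.

```python
import math
from collections import Counter


def shannon_entropy(window: str) -> float:
    """Shannon entropy (in bits) of the residue composition of a window."""
    n = len(window)
    return -sum((c / n) * math.log2(c / n) for c in Counter(window).values())


def lcr_profile(sequence: str, window_size: int = 12,
                thresholds: tuple[float, ...] = (0.5, 1.0, 1.5, 2.0)) -> dict[float, float]:
    """Fraction of windows whose entropy is at or below each threshold.

    window_size and thresholds are hypothetical values for illustration.
    """
    n_windows = len(sequence) - window_size + 1
    if n_windows <= 0:
        return {t: 0.0 for t in thresholds}
    entropies = [
        shannon_entropy(sequence[i:i + window_size]) for i in range(n_windows)
    ]
    return {t: sum(e <= t for e in entropies) / n_windows for t in thresholds}
```

Reading the resulting dictionary from the strictest to the most permissive threshold gives a profile: strict thresholds capture only near-homopolymeric windows, while looser ones progressively admit more heterogeneous regions, which is why conclusions drawn from a single threshold can depend on the parameter chosen.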
Affiliation(s)
- Kristan A Schneider
- Department of MNI, University of Applied Sciences Mittweida, Mittweida, Germany
- Matthew K Spencer
- Department of Geology and Physics, Lake Superior State University, Sault Ste. Marie, MI, USA
- David Fisher
- David Eccles School of Business, University of Utah, Salt Lake City, UT, USA
- Sophia Chaudhry
- Department of Biological Sciences, Oakland University, Rochester, MI, USA; Center for Molecular Medicine and Genetics, Wayne State University, Detroit, MI, USA
- Ananias A Escalante
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA, USA