1
Monzon AM, Arrías PN, Elofsson A, Mier P, Andrade-Navarro MA, Bevilacqua M, Clementel D, Bateman A, Hirsh L, Fornasari MS, Parisi G, Piovesan D, Kajava AV, Tosatto SCE. A STRP-ed definition of Structured Tandem Repeats in Proteins. J Struct Biol 2023; 215:108023. [PMID: 37652396 DOI: 10.1016/j.jsb.2023.108023] [Received: 04/29/2023] [Revised: 07/31/2023] [Accepted: 08/28/2023] [Indexed: 09/02/2023]
Abstract
Tandem Repeat Proteins (TRPs) are a class of proteins with repetitive amino acid sequences that have been studied extensively for over two decades. Different features at the level of sequence, structure, function and evolution have been attributed to them by various authors. And yet many of their salient features appear only when looking at specific subclasses of protein tandem repeats. Here, we attempt to rationalize the existing knowledge on TRPs by pointing out several dichotomies. The emerging picture is more nuanced than generally assumed and allows us to draw some boundaries of what is not a "proper" TRP. We conclude with an operational definition of a specific subset, which we have denominated STRPs (Structural Tandem Repeat Proteins), which separates a subclass of tandem repeats with distinctive features from several other less well-defined types of repeats. We believe that this definition will help researchers in the field to better characterize the biological meaning of this large yet largely understudied group of proteins.
Affiliation(s)
- Alexander Miguel Monzon
  - Dept. of Information Engineering, University of Padova, via Giovanni Gradenigo 6/B, 35131 Padova, Italy
- Paula Nazarena Arrías
  - Dept. of Biomedical Sciences, University of Padova, via U. Bassi 58/b, 35121 Padova, Italy
- Arne Elofsson
  - Dept. of Biochemistry and Biophysics and Science for Life Laboratory, Stockholm University, Tomtebodavägen 23, 171 21 Solna, Sweden
- Pablo Mier
  - Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg University of Mainz, Hanns-Dieter-Hüsch-Weg 15, 55128 Mainz, Germany
- Miguel A Andrade-Navarro
  - Institute of Organismic and Molecular Evolution, Faculty of Biology, Johannes Gutenberg University of Mainz, Hanns-Dieter-Hüsch-Weg 15, 55128 Mainz, Germany
- Martina Bevilacqua
  - Dept. of Biomedical Sciences, University of Padova, via U. Bassi 58/b, 35121 Padova, Italy
- Damiano Clementel
  - Dept. of Biomedical Sciences, University of Padova, via U. Bassi 58/b, 35121 Padova, Italy
- Alex Bateman
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Layla Hirsh
  - Dept. of Engineering, Faculty of Science and Engineering, Pontifical Catholic University of Peru, Av. Universitaria 1801 San Miguel, Lima 32, Lima, Peru
- Maria Silvina Fornasari
  - Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, CONICET, Bernal, Buenos Aires, Argentina
- Gustavo Parisi
  - Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, CONICET, Bernal, Buenos Aires, Argentina
- Damiano Piovesan
  - Dept. of Biomedical Sciences, University of Padova, via U. Bassi 58/b, 35121 Padova, Italy
- Andrey V Kajava
  - Centre de Recherche en Biologie cellulaire de Montpellier (CRBM), UMR 5237 CNRS, Université Montpellier, 1919 Route de Mende, Cedex 5, 34293 Montpellier, France
- Silvio C E Tosatto
  - Dept. of Biomedical Sciences, University of Padova, via U. Bassi 58/b, 35121 Padova, Italy
2
Szatkownik A, Zea DJ, Richard H, Laine E. Building alternative splicing and evolution-aware sequence-structure maps for protein repeats. J Struct Biol 2023; 215:107997. [PMID: 37453591 DOI: 10.1016/j.jsb.2023.107997] [Received: 04/29/2023] [Revised: 06/15/2023] [Accepted: 07/05/2023] [Indexed: 07/18/2023]
Abstract
Alternative splicing of repeats in proteins provides a mechanism for rewiring and fine-tuning protein interaction networks. In this work, we developed a robust and versatile method, ASPRING, to identify alternatively spliced protein repeats from gene annotations. ASPRING leverages evolutionarily meaningful alternative splicing-aware hierarchical graphs to provide maps between protein repeat sequences and 3D structures. We re-think the definition of repeats by explicitly accounting for transcript diversity across several genes/species. Using a stringent sequence-based similarity criterion, we detected over 5,000 evolutionarily conserved repeats by screening virtually all human protein-coding genes and their orthologs across a dozen species. Through a joint analysis of their sequences and structures, we extracted specificity-determining sequence signatures and assessed their implication in experimentally resolved and modelled protein interactions. Our findings demonstrate the widespread alternative usage of protein repeats in modulating protein interactions and open avenues for targeting repeat-mediated interactions.
Affiliation(s)
- Antoine Szatkownik
  - Sorbonne Université, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), 75005 Paris, France; Bioinformatics Unit, Genome Competence Center (MF1), Robert Koch Institute, 13353 Berlin, Germany
- Diego Javier Zea
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Hugues Richard
  - Sorbonne Université, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), 75005 Paris, France; Bioinformatics Unit, Genome Competence Center (MF1), Robert Koch Institute, 13353 Berlin, Germany
- Elodie Laine
  - Sorbonne Université, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), 75005 Paris, France
3
Manasra S, Kajava AV. Why does the first protein repeat often become the only one? J Struct Biol 2023; 215:108014. [PMID: 37567371 DOI: 10.1016/j.jsb.2023.108014] [Received: 04/30/2023] [Revised: 08/06/2023] [Accepted: 08/09/2023] [Indexed: 08/13/2023]
Abstract
Proteins with two similar motifs in tandem are one of the most common cases of tandem repeat proteins. The question arises: why is the first repeat to emerge frequently fixed in the process of evolution, despite the ample opportunities to continue its multiplication at the DNA level? To answer this question, we systematically analyzed the structure and function of these proteins. Our analysis showed that, in the vast majority of cases, the structural repetitive units have a two-fold (C2) internal symmetry. These closed structures provide an internal structural limitation for the subsequent growth of the repeat number. Frequently, the units "swap" their secondary structure elements with each other. Moreover, the duplicated domains, in contrast to other tandem repeat proteins, form binding sites for small molecules around the axis of C2 symmetry. Thus, the closure of the C2 structures and the emergence of new functional sites around the axis of C2 symmetry provide plausible explanations for why a repeat, once it has appeared, becomes fixed in the evolutionary process. We have placed these structures within the general structural classification of tandem repeat proteins, classifying them as either Class IV or V depending on the size of the repetitive unit.
Affiliation(s)
- Simona Manasra
  - Institute of Bioengineering, ITMO University, Kronverksky Pr. 49, 197101 Saint Petersburg, Russia
- Andrey V Kajava
  - Centre de Recherche en Biologie cellulaire de Montpellier (CRBM), UMR 5237 CNRS, Université Montpellier, 1919 Route de Mende, Cedex 5, 34293 Montpellier, France
4
Michael D, Gurusaran M, Santhosh R, Hussain MK, Satheesh SN, Suhan S, Sivaranjan P, Jaiswal A, Sekar K. RepEx: A web server to extract sequence repeats from protein and DNA sequences. Comput Biol Chem 2019; 78:424-30. [PMID: 30598392 DOI: 10.1016/j.compbiolchem.2018.12.015] [Received: 11/09/2018] [Accepted: 12/25/2018] [Indexed: 11/20/2022]
Abstract
Evolution builds up new genetic material from existing material, not at random, but in highly ordered and eloquent patterns. Most of these sequence repeats reveal valuable information that contributes to areas such as disease research and the function of macromolecules. In the age of next-generation genome sequencing, rapid and efficient extraction of all unbiased sequence repeats from macromolecules is the need of the hour. With this in mind, an online web-based computing server, RepEx, has been developed to extract and display all possible repeats for DNA and protein sequences. Apart from exact or identical repeats, the server has been designed to identify and extract degenerate, inverted, everted and mirror repeats from both DNA and protein sequences. The server has striking output displays, featuring interactive graphs and comprehensive output files. In addition, RepEx has been equipped with an easy-to-use interface and search filters to facilitate a user-defined query or search, and it is freely available and accessible via the World Wide Web at http://bioserver2.physics.iisc.ac.in/RepEx/.