1
Gren BA, Antczak M, Zok T, Sulkowska JI, Szachniuk M. Knotted artifacts in predicted 3D RNA structures. PLoS Comput Biol 2024; 20:e1011959. [PMID: 38900780 PMCID: PMC11218946 DOI: 10.1371/journal.pcbi.1011959] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2024] [Revised: 07/02/2024] [Accepted: 06/01/2024] [Indexed: 06/22/2024] Open
Abstract
Unlike proteins, RNAs deposited in the Protein Data Bank do not contain topological knots. Recently, admittedly, the first trefoil knot and some lasso-type conformations have been found in experimental RNA structures, but these are still exceptional cases. Meanwhile, algorithms predicting 3D RNA models have happened to form knotted structures not so rarely. Interestingly, machine learning-based predictors seem to be more prone to generate knotted RNA folds than traditional methods. A similar situation is observed for the entanglements of structural elements. In this paper, we analyze all models submitted to the CASP15 competition in the 3D RNA structure prediction category. We show what types of topological knots and structure element entanglements appear in the submitted models and highlight what methods are behind the generation of such conformations. We also study the structural aspect of susceptibility to entanglement. We suggest that predictors take care of an evaluation of RNA models to avoid publishing structures with artifacts, such as unusual entanglements, that result from hallucinations of predictive algorithms.
Affiliation(s)
- Bartosz A. Gren
- Centre of New Technologies, University of Warsaw, Warsaw, Poland
- Maciej Antczak
- Institute of Computing Science, Poznan University of Technology, Poznan, Poland
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Tomasz Zok
- Institute of Computing Science, Poznan University of Technology, Poznan, Poland
- Marta Szachniuk
- Institute of Computing Science, Poznan University of Technology, Poznan, Poland
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
2
Sarzynska J, Popenda M, Antczak M, Szachniuk M. RNA tertiary structure prediction using RNAComposer in CASP15. Proteins 2023; 91:1790-1799. [PMID: 37615316 DOI: 10.1002/prot.26578] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/30/2023] [Revised: 06/14/2023] [Accepted: 08/08/2023] [Indexed: 08/25/2023]
Abstract
As CASP15 participants, in the new category of 3D RNA structure prediction, we applied expert modeling with the support of our proprietary system RNAComposer. Although RNAComposer is primarily known as an automated web server, its features allow it to be used interactively, for example, for homology-based modeling or assembling models from user-provided structural elements. In the paper, we present various scenarios of applying the system to predict the 3D RNA structures that we employed. Their combination with expert input, comparative analysis of models, and routines to select representative resultant structures form a ready-for-reuse workflow. With selected examples, we demonstrate its application for the in silico modeling of natural and synthetic RNA molecules targeted in CASP15.
Affiliation(s)
- Joanna Sarzynska
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Mariusz Popenda
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Maciej Antczak
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Institute of Computing Science, Poznan University of Technology, Poznan, Poland
- Marta Szachniuk
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Institute of Computing Science, Poznan University of Technology, Poznan, Poland
3
Justyna M, Antczak M, Szachniuk M. Machine learning for RNA 2D structure prediction benchmarked on experimental data. Brief Bioinform 2023; 24:7140288. [PMID: 37096592 DOI: 10.1093/bib/bbad153] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 01/13/2023] [Revised: 03/15/2023] [Accepted: 03/29/2023] [Indexed: 04/26/2023] Open
Abstract
Since the 1980s, dozens of computational methods have addressed the problem of predicting RNA secondary structure. Among them are those that follow standard optimization approaches and, more recently, machine learning (ML) algorithms. The former were repeatedly benchmarked on various datasets. The latter, on the other hand, have not yet undergone extensive analysis that could suggest to the user which algorithm best fits the problem to be solved. In this review, we compare 15 methods that predict the secondary structure of RNA, of which 6 are based on deep learning (DL), 3 on shallow learning (SL) and 6 control methods on non-ML approaches. We discuss the ML strategies implemented and perform three experiments in which we evaluate the prediction of (I) representatives of the RNA equivalence classes, (II) selected Rfam sequences and (III) RNAs from new Rfam families. We show that DL-based algorithms (such as SPOT-RNA and UFold) can outperform SL and traditional methods if the data distribution is similar in the training and testing set. However, when predicting 2D structures for new RNA families, the advantage of DL is no longer clear, and its performance is inferior or equal to that of SL and non-ML methods.
Affiliation(s)
- Marek Justyna
- Institute of Computing Science, Poznan University of Technology, Piotrowo 2, 60-965 Poznan, Poland
- Maciej Antczak
- Institute of Computing Science, Poznan University of Technology, Piotrowo 2, 60-965 Poznan, Poland
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, 61-704 Poznan, Poland
- Marta Szachniuk
- Institute of Computing Science, Poznan University of Technology, Piotrowo 2, 60-965 Poznan, Poland
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, 61-704 Poznan, Poland
4
Wiedemann J, Kaczor J, Milostan M, Zok T, Blazewicz J, Szachniuk M, Antczak M. RNAloops: a database of RNA multiloops. Bioinformatics 2022; 38:4200-4205. [PMID: 35809063 PMCID: PMC9438955 DOI: 10.1093/bioinformatics/btac484] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/05/2022] [Revised: 06/26/2022] [Accepted: 07/06/2022] [Indexed: 12/24/2022] Open
Abstract
Motivation: Knowledge of the 3D structure of RNA supports discovering its functions and is crucial for designing drugs and modern therapeutic solutions. Thus, much attention is devoted to experimental determination and computational prediction targeting the global fold of RNA and its local substructures. The latter include multi-branched loops, functionally significant elements that highly affect the spatial shape of the entire molecule. Unfortunately, their computational modeling constitutes a weak point of structural bioinformatics. A remedy for this is in collecting these motifs and analyzing their features. Results: RNAloops is a self-updating database that stores multi-branched loops identified in the PDB-deposited RNA structures. A description of each loop includes angular data (planar and Euler angles computed between pairs of adjacent helices) to allow studying their mutual arrangement in space. The system enables search and analysis of multiloops, presents their structure details numerically and visually, and computes data statistics. Availability and implementation: RNAloops is freely accessible at https://rnaloops.cs.put.poznan.pl. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jakub Wiedemann
- Institute of Computing Science, Poznan University of Technology, 60-965 Poznan, Poland
- Jacek Kaczor
- Institute of Computing Science, Poznan University of Technology, 60-965 Poznan, Poland
- Maciej Milostan
- Institute of Computing Science, Poznan University of Technology, 60-965 Poznan, Poland; Poznan Supercomputing and Networking Center, 61-131 Poznan, Poland
- Tomasz Zok
- Institute of Computing Science, Poznan University of Technology, 60-965 Poznan, Poland; Poznan Supercomputing and Networking Center, 61-131 Poznan, Poland
- Jacek Blazewicz
- Institute of Computing Science, Poznan University of Technology, 60-965 Poznan, Poland; Institute of Bioorganic Chemistry, Polish Academy of Sciences, 61-704 Poznan, Poland
5
Magnus M. rna-tools.online: a Swiss army knife for RNA 3D structure modeling workflow. Nucleic Acids Res 2022; 50:W657-W662. [PMID: 35580057 PMCID: PMC9252763 DOI: 10.1093/nar/gkac372] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/30/2022] [Revised: 04/20/2022] [Accepted: 05/02/2022] [Indexed: 11/15/2022] Open
Abstract
Significant improvements have been made in the efficiency and accuracy of RNA 3D structure prediction methods in recent years; however, many tools developed in the field stay exclusive to only a few bioinformatic groups. To perform a complete RNA 3D structure modeling analysis as proposed by the RNA-Puzzles community, researchers must familiarize themselves with a quite complex set of tools. In order to facilitate the processing of RNA sequences and structures, we previously developed the rna-tools package. However, using rna-tools requires the installation of a mixture of libraries and tools, basic knowledge of the command line and the Python programming language. To provide an opportunity for the broader community of biologists to take advantage of the new developments in RNA structural biology, we developed rna-tools.online. The web server provides a user-friendly platform to perform many standard analyses required for the typical modeling workflow: 3D structure manipulation and editing, structure minimization, structure analysis, quality assessment, and comparison. rna-tools.online supports biologists to start benefiting from the maturing field of RNA 3D structural bioinformatics and can be used for educational purposes. The web server is available at https://rna-tools.online.
Affiliation(s)
- Marcin Magnus
- ReMedy International Research Agenda Unit, IMol Polish Academy of Sciences, Warsaw, Poland
6
Popenda M, Zok T, Sarzynska J, Korpeta A, Adamiak R, Antczak M, Szachniuk M. Entanglements of structure elements revealed in RNA 3D models. Nucleic Acids Res 2021; 49:9625-9632. [PMID: 34432024 PMCID: PMC8464073 DOI: 10.1093/nar/gkab716] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 03/17/2021] [Revised: 08/02/2021] [Accepted: 08/06/2021] [Indexed: 01/14/2023] Open
Abstract
Computational methods to predict RNA 3D structure have more and more practical applications in molecular biology and medicine. Therefore, it is crucial to intensify efforts to improve the accuracy and quality of predicted three-dimensional structures. A significant role in this is played by the RNA-Puzzles initiative that collects, evaluates, and shares RNAs built computationally within currently nearly 30 challenges. RNA-Puzzles datasets, subjected to multi-criteria analysis, allow revealing the strengths and weaknesses of computer prediction methods. Here, we study the issue of entangled RNA fragments in the predicted RNA 3D structure models. By entanglement, we mean an arrangement of two structural elements such that one of them passes through the other. We propose the classification of entanglements driven by their topology and components. It distinguishes two general classes, interlaces and lassos, and subclasses characterized by element types (loops, dinucleotide steps, open single-stranded fragments) and puncture multiplicity. Our computational pipeline for entanglement detection, applied for 1,017 non-redundant models from RNA-Puzzles, has shown the frequency of different entanglements and allowed identifying 138 structures with intersected assemblies.
Affiliation(s)
- Mariusz Popenda
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, 61-704 Poznan, Poland
- Tomasz Zok
- Institute of Computing Science & European Centre for Bioinformatics and Genomics, Poznan University of Technology, Piotrowo 2, 60-965 Poznan, Poland
- Joanna Sarzynska
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, 61-704 Poznan, Poland
- Agnieszka Korpeta
- Institute of Computing Science & European Centre for Bioinformatics and Genomics, Poznan University of Technology, Piotrowo 2, 60-965 Poznan, Poland
- Ryszard W Adamiak
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, 61-704 Poznan, Poland
- Institute of Computing Science & European Centre for Bioinformatics and Genomics, Poznan University of Technology, Piotrowo 2, 60-965 Poznan, Poland
- Maciej Antczak
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, 61-704 Poznan, Poland
- Institute of Computing Science & European Centre for Bioinformatics and Genomics, Poznan University of Technology, Piotrowo 2, 60-965 Poznan, Poland
- Marta Szachniuk
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, 61-704 Poznan, Poland
- Institute of Computing Science & European Centre for Bioinformatics and Genomics, Poznan University of Technology, Piotrowo 2, 60-965 Poznan, Poland
7
Kudla M, Gutowska K, Synak J, Weber M, Bohnsack KS, Lukasiak P, Villmann T, Blazewicz J, Szachniuk M. Virxicon: A Lexicon Of Viral Sequences. Bioinformatics 2020; 36:5507-5513. [PMID: 33367605 PMCID: PMC8016492 DOI: 10.1093/bioinformatics/btaa1066] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/31/2020] [Revised: 11/18/2020] [Accepted: 12/11/2020] [Indexed: 11/12/2022] Open
Abstract
Motivation: Viruses are the most abundant biological entities and constitute a large reservoir of genetic diversity. In recent years, knowledge about them has increased significantly as a result of dynamic development in life sciences and rapid technological progress. This knowledge is scattered across various data repositories, making a comprehensive analysis of viral data difficult. Results: In response to the need for gathering a comprehensive knowledge of viruses and viral sequences, we developed Virxicon, a lexicon of all experimentally acquired sequences for RNA and DNA viruses. The ability to quickly obtain data for entire viral groups, searching sequences by levels of taxonomic hierarchy (according to the Baltimore classification and ICTV taxonomy) and tracking the distribution of viral data and its growth over time are unique features of our database compared to the other tools. Availability and implementation: Virxicon is a publicly available resource, updated weekly. It has an intuitive web interface and can be freely accessed at http://virxicon.cs.put.poznan.pl/.
Affiliation(s)
- Mateusz Kudla
- Institute of Computing Science and European Centre for Bioinformatics and Genomics, Poznan University of Technology, Poznan, 60-965, Poland; Saxon Institute for Computational Intelligence and Machine Learning, University of Applied Sciences Mittweida, Mittweida, 09648, Germany
- Kaja Gutowska
- Institute of Computing Science and European Centre for Bioinformatics and Genomics, Poznan University of Technology, Poznan, 60-965, Poland; Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, 61-704, Poland
- Jaroslaw Synak
- Institute of Computing Science and European Centre for Bioinformatics and Genomics, Poznan University of Technology, Poznan, 60-965, Poland
- Mirko Weber
- Saxon Institute for Computational Intelligence and Machine Learning, University of Applied Sciences Mittweida, Mittweida, 09648, Germany
- Katrin Sophie Bohnsack
- Saxon Institute for Computational Intelligence and Machine Learning, University of Applied Sciences Mittweida, Mittweida, 09648, Germany
- Piotr Lukasiak
- Institute of Computing Science and European Centre for Bioinformatics and Genomics, Poznan University of Technology, Poznan, 60-965, Poland; Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, 61-704, Poland
- Thomas Villmann
- Saxon Institute for Computational Intelligence and Machine Learning, University of Applied Sciences Mittweida, Mittweida, 09648, Germany
- Jacek Blazewicz
- Institute of Computing Science and European Centre for Bioinformatics and Genomics, Poznan University of Technology, Poznan, 60-965, Poland; Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, 61-704, Poland
- Marta Szachniuk
- Institute of Computing Science and European Centre for Bioinformatics and Genomics, Poznan University of Technology, Poznan, 60-965, Poland; Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, 61-704, Poland
8
Gumna J, Zok T, Figurski K, Pachulska-Wieczorek K, Szachniuk M. RNAthor - fast, accurate normalization, visualization and statistical analysis of RNA probing data resolved by capillary electrophoresis. PLoS One 2020; 15:e0239287. [PMID: 33002005 PMCID: PMC7529196 DOI: 10.1371/journal.pone.0239287] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 04/07/2020] [Accepted: 09/03/2020] [Indexed: 12/18/2022] Open
Abstract
RNAs adopt specific structures to perform their functions, which are critical to fundamental cellular processes. For decades, these structures have been determined and modeled with strong support from computational methods. Still, the accuracy of the latter ones depends on the availability of experimental data, for example, chemical probing information that can define pseudo-energy constraints for RNA folding algorithms. At the same time, diverse computational tools have been developed to facilitate analysis and visualization of data from RNA structure probing experiments followed by capillary electrophoresis or next-generation sequencing. RNAthor, a new software tool for the fully automated normalization of SHAPE and DMS probing data resolved by capillary electrophoresis, has recently joined this collection. RNAthor automatically identifies unreliable probing data. It normalizes the reactivity information to a uniform scale and uses it in the RNA secondary structure prediction. Our web server also provides tools for fast and easy RNA probing data visualization and statistical analysis that facilitates the comparison of multiple data sets. RNAthor is freely available at http://rnathor.cs.put.poznan.pl/.
Affiliation(s)
- Julita Gumna
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Tomasz Zok
- Institute of Computing Science, Poznan University of Technology, Poznan, Poland
- Kacper Figurski
- Institute of Computing Science, Poznan University of Technology, Poznan, Poland
- Marta Szachniuk
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland
- Institute of Computing Science, Poznan University of Technology, Poznan, Poland
9
Popenda M, Miskiewicz J, Sarzynska J, Zok T, Szachniuk M. Topology-based classification of tetrads and quadruplex structures. Bioinformatics 2020; 36:1129-1134. [PMID: 31588513 PMCID: PMC7031778 DOI: 10.1093/bioinformatics/btz738] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 05/15/2019] [Revised: 08/12/2019] [Accepted: 09/25/2019] [Indexed: 12/02/2022] Open
Abstract
Motivation: Quadruplexes attract the attention of researchers from many fields of bio-science. Due to a specific structure, these tertiary motifs are involved in various biological processes. They are also promising therapeutic targets in many strategies of drug development, including anticancer and neurological disease treatment. The uniqueness and diversity of their forms cause that quadruplexes show great potential in novel biological applications. The existing approaches for quadruplex analysis are based on sequence or 3D structure features and address canonical motifs only. Results: In our study, we analyzed tetrads and quadruplexes contained in nucleic acid molecules deposited in Protein Data Bank. Focusing on their secondary structure topology, we adjusted its graphical diagram and proposed new dot-bracket and arc representations. We defined the novel classification of these motifs. It can handle both canonical and non-canonical cases. Based on this new taxonomy, we implemented a method that automatically recognizes the types of tetrads and quadruplexes occurring as unimolecular structures. Finally, we conducted a statistical analysis of these motifs found in experimentally determined nucleic acid structures in relation to the new classification. Availability and implementation: https://github.com/tzok/eltetrado/ Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Mariusz Popenda
- Department of Structural Bioinformatics, Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan 61-704, Poland
- Joanna Miskiewicz
- Institute of Computing Science and European Centre for Bioinformatics and Genomics, Poznan University of Technology, Poznan 60-965, Poland
- Joanna Sarzynska
- Department of Structural Bioinformatics, Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan 61-704, Poland
- Tomasz Zok
- Institute of Computing Science and European Centre for Bioinformatics and Genomics, Poznan University of Technology, Poznan 60-965, Poland; Poznan Supercomputing and Networking Center, Poznan 61-139, Poland
- Marta Szachniuk
- Department of Structural Bioinformatics, Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan 61-704, Poland; Institute of Computing Science and European Centre for Bioinformatics and Genomics, Poznan University of Technology, Poznan 60-965, Poland
10
Miskiewicz J, Sarzynska J, Szachniuk M. How bioinformatics resources work with G4 RNAs. Brief Bioinform 2020; 22:5902714. [PMID: 32898859 PMCID: PMC8138894 DOI: 10.1093/bib/bbaa201] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 06/25/2020] [Revised: 08/03/2020] [Accepted: 08/04/2020] [Indexed: 12/17/2022] Open
Abstract
Quadruplexes (G4s) are of interest, which increases with the number of identified G4 structures and knowledge about their biomedical potential. These unique motifs form in many organisms, including humans, where their appearance correlates with various diseases. Scientists store and analyze quadruplexes using recently developed bioinformatic tools—many of them focused on DNA structures. With an expanding collection of G4 RNAs, we check how existing tools deal with them. We review all available bioinformatics resources dedicated to quadruplexes and examine their usefulness in G4 RNA analysis. We distinguish the following subsets of resources: databases, tools to predict putative quadruplex sequences, tools to predict secondary structure with quadruplexes and tools to analyze and visualize quadruplex structures. We share the results obtained from processing specially created RNA datasets with these tools. Contact: mszachniuk@cs.put.poznan.pl Supplementary information: Supplementary data are available at Briefings in Bioinformatics online.
Affiliation(s)
- Joanna Miskiewicz
- Institute of Computing Science and European Centre for Bioinformatics and Genomics, Poznan University of Technology, Piotrowo 2, 60-965 Poznan, Poland
- Joanna Sarzynska
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, 61-704 Poznan, Poland
- Marta Szachniuk
- Institute of Computing Science and European Centre for Bioinformatics and Genomics, Poznan University of Technology, Piotrowo 2, 60-965 Poznan, Poland
11
Zok T, Popenda M, Szachniuk M. ElTetrado: a tool for identification and classification of tetrads and quadruplexes. BMC Bioinformatics 2020; 21:40. [PMID: 32005130 PMCID: PMC6995151 DOI: 10.1186/s12859-020-3385-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 06/28/2019] [Accepted: 01/24/2020] [Indexed: 12/13/2022] Open
Abstract
Background: Quadruplexes are specific structure motifs occurring, e.g., in telomeres and transcriptional regulatory regions. Recent discoveries confirmed their importance in biomedicine and led to an intensified examination of their properties. So far, the study of these motifs has focused mainly on the sequence and the tertiary structure, and concerned canonical structures only. Whereas, more and more non-canonical quadruplex motifs are being discovered. Results: Here, we present ElTetrado, a software that identifies quadruplexes (composed of guanine- and other nucleobase-containing tetrads) in nucleic acid structures and classifies them according to the recently introduced ONZ taxonomy. The categorization is based on the secondary structure topology of quadruplexes and their component tetrads. It supports the analysis of canonical and non-canonical motifs. Besides the class recognition, ElTetrado prepares a dot-bracket and graphical representations of the secondary structure, which reflect the specificity of the quadruplex’s structure topology. It is implemented as a freely available, standalone application, available at https://github.com/tzok/eltetrado. Conclusions: The proposed software tool allows to identify and classify tetrads and quadruplexes based on the topology of their secondary structures. It complements existing approaches focusing on the sequence and 3D structure.
Affiliation(s)
- Tomasz Zok
- Institute of Computing Science and European Centre for Bioinformatics and Genomics, Poznan University of Technology, Piotrowo 2, Poznan, 60-965, Poland; Poznan Supercomputing and Networking Center, Jana Pawla II 10, Poznan, 61-139, Poland
- Mariusz Popenda
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, Poznan, 61-704, Poland
- Marta Szachniuk
- Institute of Computing Science and European Centre for Bioinformatics and Genomics, Poznan University of Technology, Piotrowo 2, Poznan, 60-965, Poland; Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, Poznan, 61-704, Poland
12
Magnus M, Antczak M, Zok T, Wiedemann J, Lukasiak P, Cao Y, Bujnicki JM, Westhof E, Szachniuk M, Miao Z. RNA-Puzzles toolkit: a computational resource of RNA 3D structure benchmark datasets, structure manipulation, and evaluation tools. Nucleic Acids Res 2020; 48:576-588. [PMID: 31799609 PMCID: PMC7145511 DOI: 10.1093/nar/gkz1108] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 08/19/2019] [Revised: 11/06/2019] [Accepted: 11/15/2019] [Indexed: 12/12/2022] Open
Abstract
Significant improvements have been made in the efficiency and accuracy of RNA 3D structure prediction methods during the succeeding challenges of RNA-Puzzles, a community-wide effort on the assessment of blind prediction of RNA tertiary structures. The RNA-Puzzles contest has shown, among others, that the development and validation of computational methods for RNA fold prediction strongly depend on the benchmark datasets and the structure comparison algorithms. Yet, there has been no systematic benchmark set or decoy structures available for the 3D structure prediction of RNA, hindering the standardization of comparative tests in the modeling of RNA structure. Furthermore, there has not been a unified set of tools that allows deep and complete RNA structure analysis, and at the same time, that is easy to use. Here, we present RNA-Puzzles toolkit, a computational resource including (i) decoy sets generated by different RNA 3D structure prediction methods (raw, for-evaluation and standardized datasets), (ii) 3D structure normalization, analysis, manipulation, visualization tools (RNA_format, RNA_normalizer, rna-tools) and (iii) 3D structure comparison metric tools (RNAQUA, MCQ4Structures). This resource provides a full list of computational tools as well as a standard RNA 3D structure prediction assessment protocol for the community.
Affiliation(s)
- Marcin Magnus
- International Institute of Molecular and Cell Biology in Warsaw, 02-109 Warsaw, Poland
- ReMedy-International Research Agenda Unit, Centre of New Technologies, University of Warsaw, 02-097 Warsaw, Poland
- Maciej Antczak
- Institute of Computing Science & European Centre for Bioinformatics and Genomics, Poznan University of Technology, 60-965 Poznan, Poland
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, 61-704 Poznan, Poland
- Tomasz Zok
- Institute of Computing Science & European Centre for Bioinformatics and Genomics, Poznan University of Technology, 60-965 Poznan, Poland
- Jakub Wiedemann
- Institute of Computing Science & European Centre for Bioinformatics and Genomics, Poznan University of Technology, 60-965 Poznan, Poland
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, 61-704 Poznan, Poland
- Piotr Lukasiak
- Institute of Computing Science & European Centre for Bioinformatics and Genomics, Poznan University of Technology, 60-965 Poznan, Poland
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, 61-704 Poznan, Poland
- Yang Cao
- Center of Growth, Metabolism and Aging, Key Laboratory of Bio-Resource and Eco-Environment of Ministry of Education, College of Life Sciences, Sichuan University, Chengdu 610065, PR China
- Janusz M Bujnicki
- International Institute of Molecular and Cell Biology in Warsaw, 02-109 Warsaw, Poland
- Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University, Poznan, Poland
- Eric Westhof
- Architecture et Réactivité de l’ARN, Université de Strasbourg, Institut de biologie moléculaire et cellulaire du CNRS, 12 allée Konrad Roentgen, 67084 Strasbourg, France
- Marta Szachniuk
- Institute of Computing Science & European Centre for Bioinformatics and Genomics, Poznan University of Technology, 60-965 Poznan, Poland
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, 61-704 Poznan, Poland
- Zhichao Miao
- Translational Research Institute of Brain and Brain-Like Intelligence and Department of Anesthesiology, Shanghai Fourth People's Hospital Affiliated to Tongji University School of Medicine, Shanghai 200081, China
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridge CB10 1SD, UK
- Newcastle Fibrosis Research Group, Institute of Cellular Medicine, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, UK