1
|
Boccaletto P, Machnicka MA, Purta E, Piatkowski P, Baginski B, Wirecki TK, de Crécy-Lagard V, Ross R, Limbach PA, Kotter A, Helm M, Bujnicki JM. MODOMICS: a database of RNA modification pathways. 2017 update. Nucleic Acids Res 2019; 46:D303-D307. [PMID: 29106616 PMCID: PMC5753262 DOI: 10.1093/nar/gkx1030] [Citation(s) in RCA: 1345] [Impact Index Per Article: 224.2] [Received: 09/15/2017] [Accepted: 10/18/2017] [Indexed: 12/13/2022] Open
Abstract
MODOMICS is a database of RNA modifications that provides comprehensive information concerning the chemical structures of modified ribonucleosides, their biosynthetic pathways, the location of modified residues in RNA sequences, and RNA-modifying enzymes. In the current database version, we included the following new features and data: extended mass spectrometry and liquid chromatography data for modified nucleosides; links between human tRNA sequences and MINTbase - a framework for the interactive exploration of mitochondrial and nuclear tRNA fragments; new, machine-friendly system of unified abbreviations for modified nucleoside names; sets of modified tRNA sequences for two bacterial species, updated collection of mammalian tRNA modifications, 19 newly identified modified ribonucleosides and 66 functionally characterized proteins involved in RNA modification. Data from MODOMICS have been linked to the RNAcentral database of RNA sequences. MODOMICS is available at http://modomics.genesilico.pl.
|
Research Support, Non-U.S. Gov't |
6 |
1345 |
2
|
Machnicka MA, Milanowska K, Osman Oglou O, Purta E, Kurkowska M, Olchowik A, Januszewski W, Kalinowski S, Dunin-Horkawicz S, Rother KM, Helm M, Bujnicki JM, Grosjean H. MODOMICS: a database of RNA modification pathways--2013 update. Nucleic Acids Res 2012; 41:D262-7. [PMID: 23118484 PMCID: PMC3531130 DOI: 10.1093/nar/gks1007] [Citation(s) in RCA: 813] [Impact Index Per Article: 62.5] [Indexed: 12/17/2022] Open
Abstract
MODOMICS is a database of RNA modifications that provides comprehensive information concerning the chemical structures of modified ribonucleosides, their biosynthetic pathways, RNA-modifying enzymes and location of modified residues in RNA sequences. In the current database version, accessible at http://modomics.genesilico.pl, we included new features: a census of human and yeast snoRNAs involved in RNA-guided RNA modification, a new section covering the 5'-end capping process, and a catalogue of 'building blocks' for chemical synthesis of a large variety of modified nucleosides. The MODOMICS collections of RNA modifications, RNA-modifying enzymes and modified RNAs have been also updated. A number of newly identified modified ribonucleosides and more than one hundred functionally and structurally characterized proteins from various organisms have been added. In the RNA sequences section, snRNAs and snoRNAs with experimentally mapped modified nucleosides have been added and the current collection of rRNA and tRNA sequences has been substantially enlarged. To facilitate literature searches, each record in MODOMICS has been cross-referenced to other databases and to selected key publications. New options for database searching and querying have been implemented, including a BLAST search of protein sequences and a PARALIGN search of the collected nucleic acid sequences.
|
Research Support, Non-U.S. Gov't |
13 |
813 |
3
|
Boccaletto P, Stefaniak F, Ray A, Cappannini A, Mukherjee S, Purta E, Kurkowska M, Shirvanizadeh N, Destefanis E, Groza P, Avşar G, Romitelli A, Pir P, Dassi E, Conticello SG, Aguilo F, Bujnicki JM. MODOMICS: a database of RNA modification pathways. 2021 update. Nucleic Acids Res 2021; 50:D231-D235. [PMID: 34893873 PMCID: PMC8728126 DOI: 10.1093/nar/gkab1083] [Citation(s) in RCA: 479] [Impact Index Per Article: 119.8] [Received: 09/29/2021] [Revised: 10/16/2021] [Accepted: 12/01/2021] [Indexed: 01/02/2023] Open
Abstract
The MODOMICS database has been, since 2006, a manually curated and centralized resource, storing and distributing comprehensive information about modified ribonucleosides. Originally, it only contained data on the chemical structures of modified ribonucleosides, their biosynthetic pathways, the location of modified residues in RNA sequences, and RNA-modifying enzymes. Over the years, prompted by the accumulation of new knowledge and new types of data, it has been updated with new information and functionalities. In this new release, we have created a catalog of RNA modifications linked to human diseases, e.g., due to mutations in genes encoding modification enzymes. MODOMICS has been linked extensively to RCSB Protein Data Bank, and sequences of experimentally determined RNA structures with modified residues have been added. This expansion was accompanied by including nucleotide 5′-monophosphate residues. We redesigned the web interface and upgraded the database backend. In addition, a search engine for chemically similar modified residues has been included that can be queried by SMILES codes or by drawing chemical molecules. Finally, previously available datasets of modified residues, biosynthetic pathways, and RNA-modifying enzymes have been updated. Overall, we provide users with a new, enhanced, and restyled tool for research on RNA modification. MODOMICS is available at https://iimcb.genesilico.pl/modomics/.
|
|
4 |
479 |
4
|
Sabates-Bellver J, Van der Flier LG, de Palo M, Cattaneo E, Maake C, Rehrauer H, Laczko E, Kurowski MA, Bujnicki JM, Menigatti M, Luz J, Ranalli TV, Gomes V, Pastorelli A, Faggiani R, Anti M, Jiricny J, Clevers H, Marra G. Transcriptome profile of human colorectal adenomas. Mol Cancer Res 2008; 5:1263-75. [PMID: 18171984 DOI: 10.1158/1541-7786.mcr-07-0267] [Citation(s) in RCA: 388] [Impact Index Per Article: 22.8] [Indexed: 02/06/2023]
Abstract
Colorectal cancers are believed to arise predominantly from adenomas. Although these precancerous lesions have been subjected to extensive clinical, pathologic, and molecular analyses, little is currently known about the global gene expression changes accompanying their formation. To characterize the molecular processes underlying the transformation of normal colonic epithelium, we compared the transcriptomes of 32 prospectively collected adenomas with those of normal mucosa from the same individuals. Important differences emerged not only between the expression profiles of normal and adenomatous tissues but also between those of small and large adenomas. A key feature of the transformation process was the remodeling of the Wnt pathway reflected in patent overexpression and underexpression of 78 known components of this signaling cascade. The expression of 19 Wnt targets was closely correlated with clear up-regulation of KIAA1199, whose function is currently unknown. In normal mucosa, KIAA1199 expression was confined to cells in the lower portion of intestinal crypts, where Wnt signaling is physiologically active, but it was markedly increased in all adenomas, where it was expressed in most of the epithelial cells, and in colon cancer cell lines, it was markedly reduced by inactivation of the beta-catenin/T-cell factor(s) transcription complex, the pivotal mediator of Wnt signaling. Our transcriptomic profiles of normal colonic mucosa and colorectal adenomas shed new light on the early stages of colorectal tumorigenesis and identified KIAA1199 as a novel target of the Wnt signaling pathway and a putative marker of colorectal adenomatous transformation.
|
Research Support, Non-U.S. Gov't |
17 |
388 |
5
|
Abstract
Rigorous assessments of protein structure prediction have demonstrated that fold recognition methods can identify remote similarities between proteins when standard sequence search methods fail. It has been shown that the accuracy of predictions is improved when refined multiple sequence alignments are used instead of single sequences and if different methods are combined to generate a consensus model. There are several meta-servers available that integrate protein structure predictions performed by various methods, but they do not allow for submission of user-defined multiple sequence alignments and they seldom offer confidentiality of the results. We developed a novel WWW gateway for protein structure prediction, which combines the useful features of other meta-servers available, but with much greater flexibility of the input. The user may submit an amino acid sequence or a multiple sequence alignment to a set of methods for primary, secondary and tertiary structure prediction. Fold-recognition results (target-template alignments) are converted into full-atom 3D models and the quality of these models is uniformly assessed. A consensus between different FR methods is also inferred. The results are conveniently presented on-line on a single web page over a secure, password-protected connection. The GeneSilico protein structure prediction meta-server is freely available for academic users at http://genesilico.pl/meta.
|
research-article |
22 |
345 |
6
|
Boniecki MJ, Lach G, Dawson WK, Tomala K, Lukasz P, Soltysinski T, Rother KM, Bujnicki JM. SimRNA: a coarse-grained method for RNA folding simulations and 3D structure prediction. Nucleic Acids Res 2015; 44:e63. [PMID: 26687716 PMCID: PMC4838351 DOI: 10.1093/nar/gkv1479] [Citation(s) in RCA: 243] [Impact Index Per Article: 24.3] [Received: 10/03/2014] [Accepted: 12/05/2015] [Indexed: 01/08/2023] Open
Abstract
RNA molecules play fundamental roles in cellular processes. Their function and interactions with other biomolecules are dependent on the ability to form complex three-dimensional (3D) structures. However, experimental determination of RNA 3D structures is laborious and challenging, and therefore, the majority of known RNAs remain structurally uncharacterized. Here, we present SimRNA: a new method for computational RNA 3D structure prediction, which uses a coarse-grained representation, relies on the Monte Carlo method for sampling the conformational space, and employs a statistical potential to approximate the energy and identify conformations that correspond to biologically relevant structures. SimRNA can fold RNA molecules using only sequence information, and, on established test sequences, it recapitulates secondary structure with high accuracy, including correct prediction of pseudoknots. For modeling of complex 3D structures, it can use additional restraints, derived from experimental or computational analyses, including information about secondary structure and/or long-range contacts. SimRNA also can be used to analyze conformational landscapes and identify potential alternative structures.
|
Research Support, Non-U.S. Gov't |
10 |
243 |
7
|
Zhang Z, Theler D, Kaminska KH, Hiller M, de la Grange P, Pudimat R, Rafalska I, Heinrich B, Bujnicki JM, Allain FHT, Stamm S. The YTH domain is a novel RNA binding domain. J Biol Chem 2010; 285:14701-10. [PMID: 20167602 DOI: 10.1074/jbc.m110.104711] [Citation(s) in RCA: 232] [Impact Index Per Article: 15.5] [Indexed: 11/06/2022] Open
Abstract
The YTH (YT521-B homology) domain was identified by sequence comparison and is found in 174 different proteins expressed in eukaryotes. It is characterized by 14 invariant residues within an alpha-helix/beta-sheet structure. Here we show that the YTH domain is a novel RNA binding domain that binds to a short, degenerated, single-stranded RNA sequence motif. The presence of the binding motif in alternative exons is necessary for YT521-B to directly influence splice site selection in vivo. Array analyses demonstrate that YT521-B predominantly regulates vertebrate-specific exons. An NMR titration experiment identified the binding surface for single-stranded RNA on the YTH domain. Structural analyses indicate that the YTH domain is related to the pseudouridine synthase and archaeosine transglycosylase (PUA) domain. Our data show that the YTH domain conveys RNA binding ability to a new class of proteins that are found in all eukaryotic organisms.
|
Research Support, Non-U.S. Gov't |
15 |
232 |
8
|
Cruz JA, Blanchet MF, Boniecki M, Bujnicki JM, Chen SJ, Cao S, Das R, Ding F, Dokholyan NV, Flores SC, Huang L, Lavender CA, Lisi V, Major F, Mikolajczak K, Patel DJ, Philips A, Puton T, Santalucia J, Sijenyi F, Hermann T, Rother K, Rother M, Serganov A, Skorupski M, Soltysinski T, Sripakdeevong P, Tuszynska I, Weeks KM, Waldsich C, Wildauer M, Leontis NB, Westhof E. RNA-Puzzles: a CASP-like evaluation of RNA three-dimensional structure prediction. RNA (NEW YORK, N.Y.) 2012; 18:610-25. [PMID: 22361291 PMCID: PMC3312550 DOI: 10.1261/rna.031054.111] [Citation(s) in RCA: 200] [Impact Index Per Article: 15.4] [Indexed: 05/04/2023]
Abstract
We report the results of a first, collective, blind experiment in RNA three-dimensional (3D) structure prediction, encompassing three prediction puzzles. The goals are to assess the leading edge of RNA structure prediction techniques; compare existing methods and tools; and evaluate their relative strengths, weaknesses, and limitations in terms of sequence length and structural complexity. The results should give potential users insight into the suitability of available methods for different applications and facilitate efforts in the RNA structure prediction community in ongoing efforts to improve prediction tools. We also report the creation of an automated evaluation pipeline to facilitate the analysis of future RNA structure prediction exercises.
|
Research Support, N.I.H., Extramural |
13 |
200 |
9
|
Rother M, Rother K, Puton T, Bujnicki JM. ModeRNA: a tool for comparative modeling of RNA 3D structure. Nucleic Acids Res 2011; 39:4007-22. [PMID: 21300639 PMCID: PMC3105415 DOI: 10.1093/nar/gkq1320] [Citation(s) in RCA: 197] [Impact Index Per Article: 14.1] [Indexed: 11/14/2022] Open
Abstract
RNA is a large group of functionally important biomacromolecules. In striking analogy to proteins, the function of RNA depends on its structure and dynamics, which in turn is encoded in the linear sequence. However, while there are numerous methods for computational prediction of protein three-dimensional (3D) structure from sequence, with comparative modeling being the most reliable approach, there are very few such methods for RNA. Here, we present ModeRNA, a software tool for comparative modeling of RNA 3D structures. As an input, ModeRNA requires a 3D structure of a template RNA molecule, and a sequence alignment between the target to be modeled and the template. It must be emphasized that a good alignment is required for successful modeling, and for large and complex RNA molecules the development of a good alignment usually requires manual adjustments of the input data based on previous expertise of the respective RNA family. ModeRNA can model post-transcriptional modifications, a functionally important feature analogous to post-translational modifications in proteins. ModeRNA can also model DNA structures or use them as templates. It is equipped with many functions for merging fragments of different nucleic acid structures into a single model and analyzing their geometry. Windows and UNIX implementations of ModeRNA with comprehensive documentation and a tutorial are freely available.
|
Research Support, Non-U.S. Gov't |
14 |
197 |
10
|
Manfredonia I, Nithin C, Ponce-Salvatierra A, Ghosh P, Wirecki TK, Marinus T, Ogando NS, Snijder E, van Hemert MJ, Bujnicki JM, Incarnato D. Genome-wide mapping of SARS-CoV-2 RNA structures identifies therapeutically-relevant elements. Nucleic Acids Res 2020; 48:12436-12452. [PMID: 33166999 PMCID: PMC7736786 DOI: 10.1093/nar/gkaa1053] [Citation(s) in RCA: 189] [Impact Index Per Article: 37.8] [Received: 09/13/2020] [Revised: 10/13/2020] [Accepted: 10/22/2020] [Indexed: 01/25/2023] Open
Abstract
SARS-CoV-2 is a betacoronavirus with a linear single-stranded, positive-sense RNA genome, whose outbreak caused the ongoing COVID-19 pandemic. The ability of coronaviruses to rapidly evolve, adapt, and cross species barriers makes the development of effective and durable therapeutic strategies a challenging and urgent need. As for other RNA viruses, genomic RNA structures are expected to play crucial roles in several steps of the coronavirus replication cycle. Despite this, only a handful of functionally-conserved coronavirus structural RNA elements have been identified to date. Here, we performed RNA structure probing to obtain single-base resolution secondary structure maps of the full SARS-CoV-2 coronavirus genome both in vitro and in living infected cells. Probing data recapitulate the previously described coronavirus RNA elements (5' UTR and s2m), and reveal new structures. Of these, ∼10.2% show significant covariation among SARS-CoV-2 and other coronaviruses, hinting at their functionally-conserved role. Secondary structure-restrained 3D modeling of these segments further allowed for the identification of putative druggable pockets. In addition, we identify a set of single-stranded segments in vivo, showing high sequence conservation, suitable for the development of antisense oligonucleotide therapeutics. Collectively, our work lays the foundation for the development of innovative RNA-targeted therapeutic strategies to fight SARS-related infections.
|
research-article |
5 |
189 |
11
|
Dunin-Horkawicz S, Czerwoniec A, Gajda MJ, Feder M, Grosjean H, Bujnicki JM. MODOMICS: a database of RNA modification pathways. Nucleic Acids Res 2006; 34:D145-9. [PMID: 16381833 PMCID: PMC1347447 DOI: 10.1093/nar/gkj084] [Citation(s) in RCA: 189] [Impact Index Per Article: 9.9] [Indexed: 11/13/2022] Open
Abstract
MODOMICS is the first comprehensive database resource for systems biology of RNA modification. It integrates information about the chemical structure of modified nucleosides, their localization in RNA sequences, pathways of their biosynthesis and enzymes that carry out the respective reactions. MODOMICS also provides literature information, and links to other databases, including the available protein sequence and structure data. The current list of modifications and pathways is comprehensive, while the dataset of enzymes is limited to Escherichia coli and Saccharomyces cerevisiae and sequence alignments are presented only for tRNAs from these organisms. RNAs and enzymes from other organisms will be included in the near future. MODOMICS can be queried by the type of nucleoside (e.g. A, G, C, U, I, m1A, nm5s2U, etc.), type of RNA, position of a particular nucleoside, type of reaction (e.g. methylation, thiolation, deamination, etc.) and name or sequence of an enzyme of interest. Options for data presentation include graphs of pathways involving the query nucleoside, multiple sequence alignments of RNA sequences and tabular forms with enzyme and literature data. The contents of MODOMICS can be accessed through the World Wide Web at .
|
Research Support, Non-U.S. Gov't |
19 |
189 |
12
|
Bujnicki JM, Feder M, Radlinska M, Blumenthal RM. Structure prediction and phylogenetic analysis of a functionally diverse family of proteins homologous to the MT-A70 subunit of the human mRNA:m(6)A methyltransferase. J Mol Evol 2002; 55:431-44. [PMID: 12355263 DOI: 10.1007/s00239-002-2339-8] [Citation(s) in RCA: 177] [Impact Index Per Article: 7.7] [Received: 08/18/2001] [Accepted: 04/02/2002] [Indexed: 10/27/2022]
Abstract
MT-A70 is the S-adenosylmethionine-binding subunit of human mRNA:m(6)A methyl-transferase (MTase), an enzyme that sequence-specifically methylates adenines in pre-mRNAs. The physiological importance yet limited understanding of MT-A70 and its apparent lack of similarity to other known RNA MTases combined to make this protein an attractive target for bioinformatic analysis. The sequence of MT-A70 was subjected to extensive in silico analysis to identify orthologous and paralogous polypeptides. This analysis revealed that the MT-A70 family comprises four subfamilies with varying degrees of interrelatedness. One subfamily is a small group of bacterial DNA:m(6)A MTases. The other three subfamilies are paralogous eukaryotic lineages, two of which have not been associated with MTase activity but include proteins having substantial regulatory effects. Multiple sequence alignments and structure prediction for members of all four subfamilies indicated a high probability that a consensus MTase fold domain is present. Significantly, this consensus fold shows the permuted topology characteristic of the b class of MTases, which to date has only been known to include DNA MTases.
|
|
23 |
177 |
13
|
Abstract
The Structure Prediction Meta Server offers a convenient way for biologists to utilize various high quality structure prediction servers available worldwide. The meta server translates the results obtained from remote services into a uniform format, which is then used to request a jury prediction from a remote consensus server, Pcons. AVAILABILITY: The structure prediction meta server is freely available at http://BioInfo.PL/meta/; some remote servers, however, have restrictions for non-academic users, which are respected by the meta server. SUPPLEMENTARY INFORMATION: Results of several sessions of the CAFASP and LiveBench programs for assessment of performance of fold-recognition servers carried out via the meta server are available at http://BioInfo.PL/services.html.
|
|
24 |
168 |
14
|
Kurowski MA, Bhagwat AS, Papaj G, Bujnicki JM. Phylogenomic identification of five new human homologs of the DNA repair enzyme AlkB. BMC Genomics 2003; 4:48. [PMID: 14667252 PMCID: PMC317286 DOI: 10.1186/1471-2164-4-48] [Citation(s) in RCA: 168] [Impact Index Per Article: 7.6] [Received: 09/16/2003] [Accepted: 12/10/2003] [Indexed: 11/10/2022] Open
Abstract
Background: Combination of biochemical and bioinformatic analyses led to the discovery of oxidative demethylation – a novel DNA repair mechanism catalyzed by the Escherichia coli AlkB protein and its two human homologs, hABH2 and hABH3. This discovery was based on the prediction made by Aravind and Koonin that AlkB is a member of the 2OG-Fe2+ oxygenase superfamily. Results: In this article, we report identification and sequence analysis of five human members of the (2OG-Fe2+) oxygenase superfamily designated here as hABH4 through hABH8. These experimentally uncharacterized and poorly annotated genes were not associated with the AlkB family in any database, but are predicted here to be phylogenetically and functionally related to the AlkB family (and specifically to the lineage that groups together hABH2 and hABH3) rather than to any other oxygenase family. Our analysis reveals the history of ABH gene duplications in the evolution of vertebrate genomes. Conclusions: We hypothesize that hABH 4–8 could either be back-up enzymes for hABH1-3 or may code for novel DNA or RNA repair activities. For example, enzymes that can dealkylate N3-methylpurines or N7-methylpurines in DNA have not been described. Our analysis will guide experimental confirmation of these novel human putative DNA repair enzymes.
|
Research Support, U.S. Gov't, P.H.S. |
22 |
168 |
15
|
Tuszynska I, Magnus M, Jonak K, Dawson W, Bujnicki JM. NPDock: a web server for protein-nucleic acid docking. Nucleic Acids Res 2015; 43:W425-30. [PMID: 25977296 PMCID: PMC4489298 DOI: 10.1093/nar/gkv493] [Citation(s) in RCA: 167] [Impact Index Per Article: 16.7] [Received: 03/04/2015] [Accepted: 05/02/2015] [Indexed: 01/03/2023] Open
Abstract
Protein–RNA and protein–DNA interactions play fundamental roles in many biological processes. A detailed understanding of these interactions requires knowledge about protein–nucleic acid complex structures. Because the experimental determination of these complexes is time-consuming and perhaps futile in some instances, we have focused on computational docking methods starting from the separate structures. Docking methods are widely employed to study protein–protein interactions; however, only a few methods have been made available to model protein–nucleic acid complexes. Here, we describe NPDock (Nucleic acid–Protein Docking); a novel web server for predicting complexes of protein–nucleic acid structures which implements a computational workflow that includes docking, scoring of poses, clustering of the best-scored models and refinement of the most promising solutions. The NPDock server provides a user-friendly interface and 3D visualization of the results. The smallest set of input data consists of a protein structure and a DNA or RNA structure in PDB format. Advanced options are available to control specific details of the docking process and obtain intermediate results. The web server is available at http://genesilico.pl/NPDock.
|
Research Support, Non-U.S. Gov't |
10 |
167 |
16
|
Czerwoniec A, Dunin-Horkawicz S, Purta E, Kaminska KH, Kasprzak JM, Bujnicki JM, Grosjean H, Rother K. MODOMICS: a database of RNA modification pathways. 2008 update. Nucleic Acids Res 2008; 37:D118-21. [PMID: 18854352 PMCID: PMC2686465 DOI: 10.1093/nar/gkn710] [Citation(s) in RCA: 164] [Impact Index Per Article: 9.6] [Indexed: 11/15/2022] Open
Abstract
MODOMICS, a database devoted to the systems biology of RNA modification, has been subjected to substantial improvements. It provides comprehensive information on the chemical structure of modified nucleosides, pathways of their biosynthesis, sequences of RNAs containing these modifications and RNA-modifying enzymes. MODOMICS also provides cross-references to other databases and to literature. In addition to the previously available manually curated tRNA sequences from a few model organisms, we have now included additional tRNAs and rRNAs, and all RNAs with 3D structures in the Nucleic Acid Database, in which modified nucleosides are present. In total, 3460 modified bases in RNA sequences of different organisms have been annotated. New RNA-modifying enzymes have been also added. The current collection of enzymes includes mainly proteins for the model organisms Escherichia coli and Saccharomyces cerevisiae, and is currently being expanded to include proteins from other organisms, in particular Archaea and Homo sapiens. For enzymes with known structures, links are provided to the corresponding Protein Data Bank entries, while for many others homology models have been created. Many new options for database searching and querying have been included. MODOMICS can be accessed at http://genesilico.pl/modomics.
|
Research Support, Non-U.S. Gov't |
17 |
164 |
17
|
Petrov AI, Kay SJE, Kalvari I, Howe KL, Gray KA, Bruford EA, Kersey PJ, Cochrane G, Finn RD, Bateman A, Kozomara A, Griffiths-Jones S, Frankish A, Zwieb CW, Lau BY, Williams KP, Chan PP, Lowe TM, Cannone JJ, Gutell R, Machnicka MA, Bujnicki JM, Yoshihama M, Kenmochi N, Chai B, Cole JR, Szymanski M, Karlowski WM, Wood V, Huala E, Berardini TZ, Zhao Y, Chen R, Zhu W, Paraskevopoulou MD, Vlachos IS, Hatzigeorgiou AG, Ma L, Zhang Z, Puetz J, Stadler PF, McDonald D, Basu S, Fey P, Engel SR, Cherry JM, Volders PJ, Mestdagh P, Wower J, Clark MB, Quek XC, Dinger ME. RNAcentral: a comprehensive database of non-coding RNA sequences. Nucleic Acids Res 2017; 45:D128-D134. [PMID: 27794554 PMCID: PMC5210518 DOI: 10.1093/nar/gkw1008] [Citation(s) in RCA: 154] [Impact Index Per Article: 19.3] [Received: 09/19/2016] [Revised: 10/13/2016] [Accepted: 10/18/2016] [Indexed: 12/12/2022] Open
Abstract
RNAcentral is a database of non-coding RNA (ncRNA) sequences that aggregates data from specialised ncRNA resources and provides a single entry point for accessing ncRNA sequences of all ncRNA types from all organisms. Since its launch in 2014, RNAcentral has integrated twelve new resources, taking the total number of collaborating database to 22, and began importing new types of data, such as modified nucleotides from MODOMICS and PDB. We created new species-specific identifiers that refer to unique RNA sequences within a context of single species. The website has been subject to continuous improvements focusing on text and sequence similarity searches as well as genome browsing functionality. All RNAcentral data is provided for free and is available for browsing, bulk downloads, and programmatic access at http://rnacentral.org/.
|
research-article |
8 |
154 |
18
|
Pawlowski M, Gajda MJ, Matlak R, Bujnicki JM. MetaMQAP: a meta-server for the quality assessment of protein models. BMC Bioinformatics 2008; 9:403. [PMID: 18823532 PMCID: PMC2573893 DOI: 10.1186/1471-2105-9-403] [Citation(s) in RCA: 149] [Impact Index Per Article: 8.8] [Received: 03/18/2008] [Accepted: 09/29/2008] [Indexed: 12/31/2022] Open
Abstract
Background: Computational models of protein structure are usually inaccurate and exhibit significant deviations from the true structure. The utility of models depends on the degree of these deviations. A number of predictive methods have been developed to discriminate between the globally incorrect and approximately correct models. However, only a few methods predict correctness of different parts of computational models. Several Model Quality Assessment Programs (MQAPs) have been developed to detect local inaccuracies in unrefined crystallographic models, but it is not known if they are useful for computational models, which usually exhibit different and much more severe errors.
Results: The ability to identify local errors in models was tested for eight MQAPs (VERIFY3D, PROSA, BALA, ANOLEA, PROVE, TUNE, REFINER, PROQRES) on 8251 models from the CASP-5 and CASP-6 experiments, by calculating the Spearman's rank correlation coefficients between per-residue scores of these methods and local deviations between C-alpha atoms in the models vs. experimental structures. As a reference, we calculated the value of correlation between the local deviations and trivial features that can be calculated for each residue directly from the models, i.e. solvent accessibility, depth in the structure, and the number of local and non-local neighbours. We found that absolute correlations of scores returned by the MQAPs and local deviations were poor for all methods. In addition, scores of PROQRES and several other MQAPs strongly correlate with 'trivial' features. Therefore, we developed MetaMQAP, a meta-predictor based on a multivariate regression model, which uses scores of the above-mentioned methods, but in which trivial parameters are controlled. MetaMQAP predicts the absolute deviation (in Ångströms) of individual C-alpha atoms between the model and the unknown true structure as well as global deviations (expressed as root mean square deviation and GDT_TS scores). Local model accuracy predicted by MetaMQAP shows an impressive correlation coefficient of 0.7 with true deviations from native structures, a significant improvement over all constituent primary MQAP scores. The global MetaMQAP score is correlated with model GDT_TS on the level of 0.89.
Conclusion: Finally, we compared our method with the MQAPs that scored best in the 7th edition of CASP, using CASP7 server models (not included in the MetaMQAP training set) as the test data. In our benchmark, MetaMQAP is outperformed only by PCONS6 and method QA_556 – methods that require comparison of multiple alternative models and score each of them depending on its similarity to other models. MetaMQAP is however the best among methods capable of evaluating just single models. We implemented the MetaMQAP as a web server available for free use by all academic users at the URL
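The evaluation described in this abstract rests on Spearman's rank correlation between per-residue quality scores and the observed C-alpha deviations. The sketch below illustrates that statistic only; it is not the MetaMQAP code. The function names and the six-residue data set are hypothetical, and ties are handled with average ranks, which is an assumption about the convention used.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-residue values for one model (not data from the paper).
predicted_error = [0.8, 1.1, 2.5, 4.0, 3.2, 0.9]  # predicted deviation, in Å
ca_deviation = [0.5, 2.4, 2.0, 5.1, 3.0, 0.7]     # observed C-alpha deviation, in Å

rho = spearman(predicted_error, ca_deviation)
print(f"Spearman rho = {rho:.3f}")
```

Because the statistic depends only on ranks, it rewards a method for ordering residues correctly from most to least accurate, even when the predicted magnitudes are off.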
|
Research Support, Non-U.S. Gov't |
17 |
149 |
19
|
Machnicka MA, Olchowik A, Grosjean H, Bujnicki JM. Distribution and frequencies of post-transcriptional modifications in tRNAs. RNA Biol 2015; 11:1619-29. [PMID: 25611331 DOI: 10.4161/15476286.2014.992273] [Citation(s) in RCA: 149] [Impact Index Per Article: 14.9] [Indexed: 12/21/2022]
Abstract
Functional tRNA molecules always contain a wide variety of post-transcriptionally modified nucleosides. These modifications stabilize tRNA structure, allow for proper interaction with other macromolecules and fine-tune the decoding of mRNAs during translation. Their presence in functionally important regions of tRNA is conserved in all domains of life. However, the identities of many of these modified residues depend strongly on the phylogeny of the organisms the tRNAs are found in, attesting to domain-specific strategies of tRNA maturation. In this work we present a new tool, the tRNAmodviz web server (http://genesilico.pl/trnamodviz), for easy comparative analysis and visualization of modification patterns in individual tRNAs, as well as in groups of selected tRNA sequences. We also present results of comparative analysis of tRNA sequences derived from 7 phylogenetically distinct groups of organisms: Gram-negative bacteria, Gram-positive bacteria, cytosol of eukaryotic single cell organisms, Fungi and Metazoa, cytosol of Viridiplantae, mitochondria, plastids and Euryarchaeota. These data update the study conducted 20 years ago with the tRNA sequences available at that time.
|
Review |
10 |
149 |
20
|
Xu Y, Keene DR, Bujnicki JM, Höök M, Lukomski S. Streptococcal Scl1 and Scl2 proteins form collagen-like triple helices. J Biol Chem 2002; 277:27312-8. [PMID: 11976327 DOI: 10.1074/jbc.m201163200] [Citation(s) in RCA: 147] [Impact Index Per Article: 6.4] [Indexed: 11/06/2022]
Abstract
The collagens are a family of animal proteins containing segments of repeated Gly-Xaa-Yaa (GXY) motifs that form a characteristic triple-helical structure. Genes encoding proteins with repeated GXY motifs have also been reported in bacteria and phages; however, it is unclear whether these prokaryotic proteins can form a collagen-like triple-helical structure. Here we used two recently identified streptococcal proteins, Scl1 and Scl2, containing extended GXY sequence repeats as model proteins. First we observed that prior to heat denaturation recombinant Scl proteins migrated as homotrimers in gel electrophoresis with and without SDS. We next showed that the collagen-like domain of Scl is resistant to proteolysis by trypsin. We further showed that circular dichroism spectra of the Scl proteins contained features characteristic of collagen triple helices, including a positive maximum of ellipticity at 220 nm. Furthermore, the triple helices of Scl1 and Scl2 showed a temperature-dependent unfolding with melting temperatures of 36.4 and 37.6 °C, respectively, which resemble those seen for collagens. We finally demonstrated by electron microscopy that the Scl proteins are organized into "lollipop-like" structures, similar to those seen in human proteins with collagenous domains. This implies that the repeated GXY tripeptide motif is a structural indicator of collagen-like triple helices in proteins from such phylogenetically distant sources as bacteria and humans.
|
|
23 |
147 |
21
|
Miao Z, Adamiak RW, Blanchet MF, Boniecki M, Bujnicki JM, Chen SJ, Cheng C, Chojnowski G, Chou FC, Cordero P, Cruz JA, Ferré-D'Amaré AR, Das R, Ding F, Dokholyan NV, Dunin-Horkawicz S, Kladwang W, Krokhotin A, Lach G, Magnus M, Major F, Mann TH, Masquida B, Matelska D, Meyer M, Peselis A, Popenda M, Purzycka KJ, Serganov A, Stasiewicz J, Szachniuk M, Tandon A, Tian S, Wang J, Xiao Y, Xu X, Zhang J, Zhao P, Zok T, Westhof E. RNA-Puzzles Round II: assessment of RNA structure prediction programs applied to three large RNA structures. RNA (NEW YORK, N.Y.) 2015; 21:1066-84. [PMID: 25883046 PMCID: PMC4436661 DOI: 10.1261/rna.049502.114] [Citation(s) in RCA: 141] [Impact Index Per Article: 14.1] [Received: 01/10/2015] [Accepted: 02/12/2015] [Indexed: 05/04/2023]
Abstract
This paper is a report of a second round of RNA-Puzzles, a collective and blind experiment in three-dimensional (3D) RNA structure prediction. Three puzzles, Puzzles 5, 6, and 10, represented sequences of three large RNA structures with limited or no homology with previously solved RNA molecules. A lariat-capping ribozyme, as well as riboswitches complexed to adenosylcobalamin and tRNA, were predicted by seven groups using RNAComposer, ModeRNA/SimRNA, Vfold, Rosetta, DMD, MC-Fold, 3dRNA, and AMBER refinement. Some groups derived models using data from state-of-the-art chemical-mapping methods (SHAPE, DMS, CMCT, and mutate-and-map). The comparisons between the predictions and the three subsequently released crystallographic structures, solved at diffraction resolutions of 2.5-3.2 Å, were carried out automatically using various sets of quality indicators. The comparisons clearly demonstrate the state of present-day de novo prediction abilities as well as the limitations of these state-of-the-art methods. All of the best prediction models have similar topologies to the native structures, which suggests that computational methods for RNA structure prediction can already provide useful structural information for biological problems. However, the prediction accuracy for non-Watson-Crick interactions, key to proper folding of RNAs, is low and some predicted models had high Clash Scores. These two difficulties point to some of the continuing bottlenecks in RNA structure prediction. All submitted models are available for download at http://ahsoka.u-strasbg.fr/rnapuzzles/.
|
Comparative Study |
10 |
141 |
22
|
Miao Z, Adamiak RW, Antczak M, Batey RT, Becka AJ, Biesiada M, Boniecki MJ, Bujnicki JM, Chen SJ, Cheng CY, Chou FC, Ferré-D'Amaré AR, Das R, Dawson WK, Ding F, Dokholyan NV, Dunin-Horkawicz S, Geniesse C, Kappel K, Kladwang W, Krokhotin A, Łach GE, Major F, Mann TH, Magnus M, Pachulska-Wieczorek K, Patel DJ, Piccirilli JA, Popenda M, Purzycka KJ, Ren A, Rice GM, Santalucia J, Sarzynska J, Szachniuk M, Tandon A, Trausch JJ, Tian S, Wang J, Weeks KM, Williams B, Xiao Y, Xu X, Zhang D, Zok T, Westhof E. RNA-Puzzles Round III: 3D RNA structure prediction of five riboswitches and one ribozyme. RNA (NEW YORK, N.Y.) 2017; 23:655-672. [PMID: 28138060 PMCID: PMC5393176 DOI: 10.1261/rna.060368.116] [Citation(s) in RCA: 140] [Impact Index Per Article: 17.5] [Received: 12/11/2016] [Accepted: 01/26/2017] [Indexed: 05/21/2023]
Abstract
RNA-Puzzles is a collective experiment in blind 3D RNA structure prediction. We report here a third round of RNA-Puzzles. Five puzzles, 4, 8, 12, 13, 14, all structures of riboswitch aptamers and puzzle 7, a ribozyme structure, are included in this round of the experiment. The riboswitch structures include biological binding sites for small molecules (S-adenosyl methionine, cyclic diadenosine monophosphate, 5-amino 4-imidazole carboxamide riboside 5'-triphosphate, glutamine) and proteins (YbxF), and one set describes large conformational changes between ligand-free and ligand-bound states. The Varkud satellite ribozyme is the most recently solved structure of a known large ribozyme. All puzzles have established biological functions and require structural understanding to appreciate their molecular mechanisms. Through the use of fast-track experimental data, including multidimensional chemical mapping, and accurate prediction of RNA secondary structure, a large portion of the contacts in 3D have been predicted correctly, leading to similar topologies for the top-ranking predictions. Template-based and homology-derived predictions could predict structures to particularly high accuracies. However, achieving biological insights from de novo prediction of RNA 3D structures still depends on the size and complexity of the RNA. Blind computational predictions of RNA structures already appear to provide useful structural information in many cases. Similar to the previous RNA-Puzzles Round II experiment, the prediction of non-Watson-Crick interactions and the observed high atomic clash scores reveal a notable need for algorithmic improvement. All prediction models and assessment results are available at http://ahsoka.u-strasbg.fr/rnapuzzles/.
|
Research Support, N.I.H., Extramural |
8 |
140 |
23
|
Pintard L, Lecointe F, Bujnicki JM, Bonnerot C, Grosjean H, Lapeyre B. Trm7p catalyses the formation of two 2'-O-methylriboses in yeast tRNA anticodon loop. EMBO J 2002; 21:1811-20. [PMID: 11927565 PMCID: PMC125368 DOI: 10.1093/emboj/21.7.1811] [Citation(s) in RCA: 139] [Impact Index Per Article: 6.0] [Indexed: 11/13/2022]
Abstract
The genome of Saccharomyces cerevisiae encodes three close homologues of the Escherichia coli 2'-O-rRNA methyltransferase FtsJ/RrmJ, designated Trm7p, Spb1p and Mrm2p. We present evidence that Trm7p methylates the 2'-O-ribose of nucleotides at positions 32 and 34 of the tRNA anticodon loop, both in vivo and in vitro. In a trm7Δ strain, which is viable but grows slowly, translation is impaired, thus indicating that these tRNA modifications could be important for translation efficiency. We discuss the emergence of a family of three 2'-O-RNA methyltransferases in Eukaryota and one in Prokaryota from a common ancestor. We propose that each eukaryotic enzyme is located in a different cell compartment, in which it would methylate a different RNA that can adopt a very similar secondary structure.
|
|
23 |
139 |
24
|
Kosinski J, Cymerman IA, Feder M, Kurowski MA, Sasin JM, Bujnicki JM. A "FRankenstein's monster" approach to comparative modeling: merging the finest fragments of Fold-Recognition models and iterative model refinement aided by 3D structure evaluation. Proteins 2004; 53 Suppl 6:369-79. [PMID: 14579325 DOI: 10.1002/prot.10545] [Citation(s) in RCA: 138] [Impact Index Per Article: 6.6] [Indexed: 11/09/2022]
Abstract
We applied a new multi-step protocol to predict the structures of all targets during CASP5, regardless of their potential category. 1) We used diverse fold-recognition (FR) methods to generate initial target-template alignments, which were converted into preliminary full-atom models by comparative modeling. All preliminary models were evaluated (scored) by VERIFY3D to identify well- and poorly-folded fragments. 2) Preliminary models with similar 3D folds were superimposed, poorly-scoring regions were deleted and the "average model" structure was created by merging the remaining segments. All template structures reported by FR were superimposed and a composite multiple-structure template was created from the most conserved fragments. 3) The average model was superimposed onto the composite template and the structure-based target-template alignment was inferred. This alignment was used to build a new (intermediate) comparative model of the target, again scored with VERIFY3D. 4) For all poorly scoring regions, a series of alternative alignments was generated by progressively shifting the "unfit" sequence fragment in either direction. Here, we considered additional information, such as secondary structure, placement of insertions and deletions in loops, conservation of putative catalytic residues, and the necessity to obtain a compact, well-folded structure. For all alternative alignments, new models were built and evaluated. 5) All models were superimposed and the "FRankenstein's monster" (FR, fold recognition) model was built from best-scoring segments. The final model was obtained after limited energy minimization to remove steric clashes between sidechains from different fragments. The novelty of this approach is in the focus on "vertical" recombination of structure fragments, typical for the ab initio field, rather than "horizontal" sequence alignment typical for comparative modeling.
We tested the usefulness of the "FRankenstein" approach for non-expert predictors: only the leader of our team had considerable experience in protein modeling - he registered as a separate group (020) and submitted models built only by himself. At the onset of CASP5, the other five members of the team (students) had very little or no experience with modeling. They followed the same protocol in a deliberately naïve way. In the fourth step they used solely the VERIFY3D criterion to compare their models and the leader's model (the latter regarded only as one of the many alternatives) and generated the hybrid or selected only one model for submission (group 517). In order to compare our protocol with the traditional "one target-one template-one alignment" approach, we submitted (as a separate group 242) models selected from those automatically generated by all CAFASP servers (i.e. obtained without any human intervention). Here, we compare the results obtained by the three "groups", describe successes and failures of the "FRankenstein" approach and discuss future developments of comparative modeling. The automatic version of our multi-step protocol is being developed as a meta-server; the prototype is freely available at http://genesilico.pl/meta/.
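Step 5 of the protocol above, building a hybrid from the best-scoring segments of several superimposed models, can be illustrated with a deliberately simplified sketch. It is not the authors' implementation. It assumes per-residue scores in which higher means better (in the spirit of VERIFY3D), models that already share residue numbering, and fixed-length windows. The function name, the window length and the scores are all hypothetical, and superposition, alignment shifting and energy minimization are left out.

```python
def pick_best_segments(scores_by_model, segment_length):
    """For each consecutive window, pick the model with the highest mean score.

    scores_by_model maps a model name to its per-residue scores; all models
    are assumed to cover the same residues in the same order.
    Returns a list of (start, end, model_name) tuples with end exclusive.
    """
    n_residues = len(next(iter(scores_by_model.values())))
    choices = []
    for start in range(0, n_residues, segment_length):
        end = min(start + segment_length, n_residues)
        best_model = max(
            scores_by_model,
            key=lambda name: sum(scores_by_model[name][start:end]) / (end - start),
        )
        choices.append((start, end, best_model))
    return choices


# Hypothetical per-residue scores for two alternative models of one target.
scores = {
    "model_a": [0.6, 0.5, 0.1, 0.0, 0.4, 0.5],
    "model_b": [0.2, 0.3, 0.4, 0.5, 0.3, 0.2],
}

plan = pick_best_segments(scores, segment_length=2)
for start, end, name in plan:
    print(f"residues {start}-{end - 1}: take coordinates from {name}")
```

The output is only a recombination plan. Joining the chosen fragments into one structure would still need the clash removal that the abstract describes as limited energy minimization.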
|
Research Support, Non-U.S. Gov't |
21 |
138 |
25
|
de Crécy-Lagard V, Boccaletto P, Mangleburg CG, Sharma P, Lowe TM, Leidel SA, Bujnicki JM. Matching tRNA modifications in humans to their known and predicted enzymes. Nucleic Acids Res 2019; 47:2143-2159. [PMID: 30698754 PMCID: PMC6412123 DOI: 10.1093/nar/gkz011] [Citation(s) in RCA: 111] [Impact Index Per Article: 18.5] [Received: 09/05/2018] [Revised: 12/28/2018] [Accepted: 01/10/2019] [Indexed: 12/25/2022]
Abstract
tRNAs are post-transcriptionally modified by chemical modifications that affect all aspects of tRNA biology. An increasing number of mutations underlying human genetic diseases map to genes encoding tRNA modification enzymes. However, our knowledge of human tRNA-modification genes remains fragmentary and the most comprehensive RNA modification database currently contains information on approximately 20% of human cytosolic tRNAs, primarily based on biochemical studies. Recent high-throughput methods such as DM-tRNA-seq now allow annotation of a majority of tRNAs for six specific base modifications. Furthermore, we identified large gaps in knowledge when we predicted all cytosolic and mitochondrial human tRNA modification genes. Only 48% of the candidate cytosolic tRNA modification enzymes have been experimentally validated in mammals (either directly or in a heterologous system). Approximately 23% of the modification genes (cytosolic and mitochondrial combined) remain unknown. We discuss these 'unidentified enzymes' cases in detail and propose candidates whenever possible. Finally, tissue-specific expression analysis shows that modification genes are highly expressed in proliferative tissues like testis and transformed cells, but scarcely in differentiated tissues, with the exception of the cerebellum. Our work provides a comprehensive, up-to-date compilation of human tRNA modifications and their enzymes that can be used as a resource for further studies.
|
Research Support, N.I.H., Extramural |
6 |
111 |