1
Hashempour A, Khodadad N, Bemani P, Ghasemi Y, Akbarinia S, Bordbari R, Tabatabaei AH, Falahi S. Design of multivalent-epitope vaccine models directed toward the world's population against HIV-Gag polyprotein: Reverse vaccinology and immunoinformatics. PLoS One 2024; 19:e0306559. [PMID: 39331650 DOI: 10.1371/journal.pone.0306559] [Received: 02/27/2024] [Accepted: 06/18/2024] [Indexed: 09/29/2024]
Abstract
Significant progress has been made in HIV-1 research; however, researchers have not yet achieved the objective of eradicating HIV-1 infection. Accordingly, in this study, eukaryotic and prokaryotic in silico vaccines were developed for HIV-Gag polyproteins from 100 major HIV subtypes and CRFs using immunoinformatic techniques to simulate immune responses in mice and humans. The epitopes located in the conserved domains of the Gag polyprotein were evaluated for allergenicity, antigenicity, immunogenicity, toxicity, homology, topology, and IFN-γ induction. Adjuvants, linkers, and CTL, HTL, and BCL epitopes were incorporated into the vaccine models. Strong binding affinities were detected between the vaccine models and HLA/MHC alleles, TLR-2, TLR-3, TLR-4, TLR-7, and TLR-9. Immunological simulation showed that innate and adaptive immune cells elicited active and consistent responses. The human vaccine model was matched with approximately 93.91% of the human population. The strong binding of the vaccine to MHC/HLA and TLR molecules was confirmed through molecular dynamics simulation. Codon optimization ensured the successful translation of the designed constructs in human cells and E. coli hosts. We believe that the HIV-1 Gag vaccine formulated in our research can reduce the challenges faced in developing an HIV-1 vaccine. Nevertheless, experimental verification is necessary to confirm the effectiveness of these vaccines in these models.
Affiliation(s)
- Ava Hashempour
  - HIV/AIDS Research Center, Institute of Health, Shiraz University of Medical Sciences, Shiraz, Iran
- Nastaran Khodadad
  - HIV/AIDS Research Center, Institute of Health, Shiraz University of Medical Sciences, Shiraz, Iran
- Peyman Bemani
  - HIV/AIDS Research Center, Institute of Health, Shiraz University of Medical Sciences, Shiraz, Iran
  - Department of Immunology, School of Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
- Younes Ghasemi
  - Department of Pharmaceutical Biotechnology, School of Pharmacy, Shiraz University of Medical Sciences, Shiraz, Iran
- Shokufeh Akbarinia
  - HIV/AIDS Research Center, Institute of Health, Shiraz University of Medical Sciences, Shiraz, Iran
- Reza Bordbari
  - HIV/AIDS Research Center, Institute of Health, Shiraz University of Medical Sciences, Shiraz, Iran
- Amir Hossein Tabatabaei
  - HIV/AIDS Research Center, Institute of Health, Shiraz University of Medical Sciences, Shiraz, Iran
- Shahab Falahi
  - HIV/AIDS Research Center, Institute of Health, Shiraz University of Medical Sciences, Shiraz, Iran
  - Zoonotic Diseases Research Center, Ilam University of Medical Sciences, Ilam, Iran
2
Tong Y, Childs-Disney JL, Disney MD. Targeting RNA with small molecules, from RNA structures to precision medicines: IUPHAR review: 40. Br J Pharmacol 2024. [PMID: 39224931 DOI: 10.1111/bph.17308] [Received: 04/30/2024] [Revised: 06/10/2024] [Accepted: 07/09/2024] [Indexed: 09/04/2024]
Abstract
RNA plays important roles in regulating both health and disease biology in all kingdoms of life. Notably, RNA can form intricate three-dimensional structures, and its biological functions depend on these structures. Targeting the structured regions of RNA with small molecules has gained increasing attention over the past decade, because it provides both chemical probes to study fundamental biological processes and lead medicines for diseases with unmet medical needs. Recent advances in RNA structure prediction and determination and RNA biology have accelerated the rational design and development of RNA-targeted small molecules to modulate disease pathology. However, challenges remain in advancing RNA-targeted small molecules towards clinical applications. This review summarizes strategies to study RNA structures, to identify small molecules recognizing these structures, and to augment the functionality of RNA-binding small molecules. We focus on recent advances in developing RNA-targeted small molecules as potential therapeutics in a variety of diseases, encompassing different modes of action and targeting strategies. Furthermore, we present the current gaps between early-stage discovery of RNA-binding small molecules and their clinical applications, as well as a roadmap to overcome these challenges in the near future.
Affiliation(s)
- Yuquan Tong
  - Department of Chemistry, The Scripps Research Institute, Jupiter, Florida, USA
  - Department of Chemistry, The Herbert Wertheim UF Scripps Institute for Biomedical Innovation & Technology, Jupiter, Florida, USA
- Jessica L Childs-Disney
  - Department of Chemistry, The Herbert Wertheim UF Scripps Institute for Biomedical Innovation & Technology, Jupiter, Florida, USA
- Matthew D Disney
  - Department of Chemistry, The Scripps Research Institute, Jupiter, Florida, USA
  - Department of Chemistry, The Herbert Wertheim UF Scripps Institute for Biomedical Innovation & Technology, Jupiter, Florida, USA
3
Mittal A, Turner DH, Mathews DH. NNDB: An Expanded Database of Nearest Neighbor Parameters for Predicting Stability of Nucleic Acid Secondary Structures. J Mol Biol 2024; 436:168549. [PMID: 38522645 PMCID: PMC11377154 DOI: 10.1016/j.jmb.2024.168549] [Received: 01/01/2024] [Revised: 03/18/2024] [Accepted: 03/19/2024] [Indexed: 03/26/2024]
Abstract
Nearest neighbor thermodynamic parameters are widely used for RNA and DNA secondary structure prediction and to model thermodynamic ensembles of secondary structures. The Nearest Neighbor Database (NNDB) is a freely available web resource (https://rna.urmc.rochester.edu/NNDB) that provides the functional forms, parameter values, and example calculations. The NNDB provides the 1999 and 2004 sets of RNA folding nearest neighbor parameters. We expanded the database to include a set of DNA parameters and a set of RNA parameters that includes m6A in addition to the canonical RNA nucleobases. The site was redesigned using the Quarto open-source publishing system. A downloadable PDF version of the complete resource and downloadable sets of nearest neighbor parameters are available.
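As a minimal sketch of the general nearest neighbor idea, the stability of a short helix can be approximated as an initiation term plus one stacking term for each pair of adjacent base pairs. The function name `helix_energy` and every numeric value below are hypothetical placeholders chosen for illustration; they are not NNDB parameters, and real calculations should use the published parameter sets, which also include additional terms not shown here.

```python
from typing import Dict

# Hypothetical placeholder stack energies in kcal/mol; NOT real NNDB values.
# Keys read "top 5'->3' dinucleotide / bottom 3'->5' dinucleotide".
TOY_STACKS: Dict[str, float] = {
    "GC/CG": -3.0,
    "CA/GU": -2.0,
    "AU/UA": -1.0,
}
TOY_INITIATION = 4.0  # hypothetical helix initiation penalty


def helix_energy(top: str, bottom: str,
                 stacks: Dict[str, float], initiation: float) -> float:
    """Sum an initiation term and one stacking term per adjacent pair of base pairs.

    `top` is written 5'->3' and `bottom` 3'->5', so position i of one
    strand pairs with position i of the other.
    """
    if len(top) != len(bottom):
        raise ValueError("strands must have equal length")
    total = initiation
    for i in range(len(top) - 1):
        key = f"{top[i:i + 2]}/{bottom[i:i + 2]}"
        total += stacks[key]  # a KeyError signals a stack missing from the table
    return total


if __name__ == "__main__":
    # Duplex 5'-GCAU-3' / 3'-CGUA-5' scored with the toy table above.
    print(helix_energy("GCAU", "CGUA", TOY_STACKS, TOY_INITIATION))
```

Passing the parameter table in as an argument keeps the scoring logic separate from the values, so the toy table can be swapped for a real parameter set without changing the function.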
Affiliation(s)
- Abhinav Mittal
  - Department of Biochemistry & Biophysics, University of Rochester Medical Center, Rochester, NY 14642, USA; Center for RNA Biology, University of Rochester Medical Center, Rochester, NY 14642, USA
- Douglas H Turner
  - Center for RNA Biology, University of Rochester Medical Center, Rochester, NY 14642, USA; Department of Chemistry, University of Rochester, Rochester, NY 14627, USA
- David H Mathews
  - Department of Biochemistry & Biophysics, University of Rochester Medical Center, Rochester, NY 14642, USA; Center for RNA Biology, University of Rochester Medical Center, Rochester, NY 14642, USA
4
Kim J, Seo M, Lim Y, Kim J. START: A Versatile Platform for Bacterial Ligand Sensing with Programmable Performances. ADVANCED SCIENCE (WEINHEIM, BADEN-WURTTEMBERG, GERMANY) 2024; 11:e2402029. [PMID: 39075726 PMCID: PMC11423158 DOI: 10.1002/advs.202402029] [Received: 02/26/2024] [Revised: 05/31/2024] [Indexed: 07/31/2024]
Abstract
Recognition of signaling molecules for coordinated regulation of target genes is a fundamental process for biological systems. Cells often rely on transcription factors to accomplish these intricate tasks, yet the subtle conformational changes of protein structures, coupled with the complexity of intertwined protein interaction networks, pose challenges for repurposing these for bioengineering applications. This study introduces a novel platform for ligand-responsive gene regulation, termed START (Synthetic Trans-Acting Riboswitch with Triggering RNA). Inspired by the bacterial ligand sensing system, riboswitch, and the synthetic gene regulator, toehold switch, the START platform enables the implementation of synthetic biosensors for various ligands. Rational sequence design with targeted domain optimization yields high-performance STARTs with a dynamic range up to 67.29-fold and a tunable ligand sensitivity, providing a simple and intuitive strategy for sensor engineering. The START platform also exhibits modularity and composability to allow flexible genetic circuit construction, enabling seamless implementation of OR, AND, and NOT Boolean logic gates for multiple ligand inputs. The START design principle is capable of broadening the suite of synthetic biosensors for diverse chemical and protein ligands, providing a novel riboregulator chassis for synthetic biology and bioengineering applications.
Affiliation(s)
- Jeongwon Kim
  - Department of Life Sciences, Pohang University of Science and Technology, Pohang 37673, South Korea
- Minchae Seo
  - Department of Life Sciences, Pohang University of Science and Technology, Pohang 37673, South Korea
- Yelin Lim
  - Department of Life Sciences, Pohang University of Science and Technology, Pohang 37673, South Korea
- Jongmin Kim
  - Department of Life Sciences, Pohang University of Science and Technology, Pohang 37673, South Korea
5
Oleynikov M, Jaffrey SR. RNA tertiary structure and conformational dynamics revealed by BASH MaP. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.04.11.589009. [PMID: 38645201 PMCID: PMC11030352 DOI: 10.1101/2024.04.11.589009] [Indexed: 04/23/2024]
Abstract
The functional effects of an RNA can arise from complex three-dimensional folds known as tertiary structures. However, predicting the tertiary structure of an RNA, and whether an RNA adopts distinct tertiary conformations, remains challenging. To address this, we developed BASH MaP, a single-molecule dimethyl sulfate (DMS) footprinting method, and DAGGER, a computational pipeline, to identify alternative tertiary structures adopted by different molecules of RNA. BASH MaP utilizes potassium borohydride to reveal the chemical accessibility of the N7 position of guanosine, a key mediator of tertiary structures. We used BASH MaP to identify diverse conformational states and dynamics of RNA G-quadruplexes, an important RNA tertiary motif, in vitro and in cells. BASH MaP and DAGGER analysis of the fluorogenic aptamer Spinach reveals that it adopts alternative tertiary conformations which determine its fluorescence states. BASH MaP thus provides an approach for structural analysis of RNA by revealing previously undetectable tertiary structures.
Affiliation(s)
- Maxim Oleynikov
  - Department of Pharmacology, Weill Medical College, Cornell University, New York, NY, USA
- Samie R. Jaffrey
  - Department of Pharmacology, Weill Medical College, Cornell University, New York, NY, USA
6
Chatterjee B, Thakur SS. miRNA-protein-metabolite interaction network reveals the regulatory network and players of pregnancy regulation in dairy cows. Front Cell Dev Biol 2024; 12:1377172. [PMID: 39156977 PMCID: PMC11329941 DOI: 10.3389/fcell.2024.1377172] [Received: 01/26/2024] [Accepted: 07/05/2024] [Indexed: 08/20/2024]
Abstract
Pregnancy is a complex process involving molecular interaction networks, including miRNA-protein, protein-protein, metabolite-metabolite, and protein-metabolite interactions. Advances in technology have led to the identification of many pregnancy-associated microRNA (miRNA), protein, and metabolite fingerprints in dairy cows. An array of miRNA, protein, and metabolite fingerprints produced during the early pregnancy of dairy cows is described. We identified in silico interaction networks for miRNA-protein, protein-protein, metabolite-metabolite, and protein-metabolite pairs. We manually constructed miRNA-protein-metabolite interaction networks such as the bta-miR-423-3p-IGFBP2-PGF2α interactome. This interactome was obtained by manually combining the bta-miR-423-3p-IGFBP2 interaction network with the IGFBP2-PGF2α interaction network, with IGFBP2 as the common interactor of bta-miR-423-3p and PGF2α, using the provided sources of evidence. The interaction between bta-miR-423-3p and IGFBP2 has many sources of evidence, including a high miRanda score of 169, a minimum free energy (MFE) score of -25.14, a binding probability (p) of 1, and an energy of -25.5. The interaction between IGFBP2 and PGF2α occurs at high confidence scores (≥0.7 or 70%). Interestingly, PGF2α is also found to interact with different metabolites, such as PGF2α-PGD2, PGF2α-thromboxane B2, PGF2α-PGE2, and PGF2α-6-keto-PGF1α, at high confidence scores (≥0.7 or 70%). Furthermore, the interactions between C3-PGE2, C3-PGD2, PGE2-PGD2, PGD2-thromboxane B2, PGE2-thromboxane B2, 6-keto-PGF1α-thromboxane B2, and PGE2-6-keto-PGF1α were also obtained at high confidence scores (≥0.7 or 70%).
Therefore, we propose that miRNA-protein-metabolite interactomes involving miRNA, protein, and metabolite fingerprints of early pregnancy of dairy cows, such as bta-miR-423-3p, IGFBP2, PGF2α, PGD2, C3, PGE2, 6-keto-PGF1α, and thromboxane B2, may form the key regulatory networks and players of pregnancy regulation in dairy cows. This is the first study involving miRNA-protein-metabolite interactomes obtained in the early pregnancy stage of dairy cows.
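The combination step described above, joining a miRNA-protein network to a protein-metabolite network through a shared protein, can be sketched as a simple join. This is only an illustration of the idea, not the authors' procedure (which the abstract describes as manual). The function name `chain_interactions`, the example scores, and the second metabolite are hypothetical placeholders; only the 0.7 confidence cutoff and the molecule names bta-miR-423-3p, IGFBP2, and PGF2α come from the abstract.

```python
from typing import List, Tuple

MirnaProtein = Tuple[str, str]               # (miRNA, protein), already accepted
ProteinMetabolite = Tuple[str, str, float]   # (protein, metabolite, confidence)


def chain_interactions(mirna_protein: List[MirnaProtein],
                       protein_metabolite: List[ProteinMetabolite],
                       min_confidence: float = 0.7
                       ) -> List[Tuple[str, str, str]]:
    """Join the two edge lists on the shared protein.

    A protein-metabolite edge is kept only if its confidence score
    reaches `min_confidence`.
    """
    chains = []
    for mirna, protein in mirna_protein:
        for prot, metabolite, score in protein_metabolite:
            if prot == protein and score >= min_confidence:
                chains.append((mirna, protein, metabolite))
    return chains


if __name__ == "__main__":
    mirna_edges = [("bta-miR-423-3p", "IGFBP2")]
    metabolite_edges = [
        ("IGFBP2", "PGF2α", 0.7),                    # placeholder score
        ("IGFBP2", "hypothetical-metabolite", 0.4),  # placeholder edge
    ]
    print(chain_interactions(mirna_edges, metabolite_edges))
```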
7
Yuan B, He G, Dong W. The first complete mitochondrial genome of the genus Laelaps with novel gene arrangement reveals extensive rearrangement and phylogenetics in the superfamily Dermanyssoidea. EXPERIMENTAL & APPLIED ACAROLOGY 2024:10.1007/s10493-024-00943-2. [PMID: 39017744 DOI: 10.1007/s10493-024-00943-2] [Received: 11/05/2023] [Accepted: 06/26/2024] [Indexed: 07/18/2024]
Abstract
We collected 56 specimens of Laelaps chini from the endemic Hengduan Mountain rat species (Eothenomys miletus) and obtained the first complete mitochondrial genome of L. chini by next-generation sequencing (NGS). The L. chini mitogenome is 16,507 bp in size and contains 37 genes and a control region of 2380 bp in length. The L. chini mitogenome has a high AT content and a compact arrangement with four overlapping regions ranging from 1 to 2 bp and 16 spacer regions ranging from 1 to 48 bp. We analyzed the 13 protein-coding genes of the L. chini mitogenome and found that they preferred codons ending in A/U and that the codon usage pattern was mainly influenced by natural selection. Cox1 has the slowest evolution rate and cox3 has the fastest evolution rate. We combined the mitochondrial genomes of eight species of gamasid mites in the superfamily Dermanyssoidea from GenBank with the L. chini mitochondrial genome to analyze rearrangement patterns and breakpoint numbers. We found that the L. chini mitogenome showed a novel arrangement pattern, and that the nine species of gamasid mites in the superfamily Dermanyssoidea whose complete mitochondrial genomes have been sequenced to date all showed different degrees of rearrangement. Laelaps chini, Echinolaelaps echidninus and Echinolaelaps fukinenensis were closely related species based on genetic distance and phylogenetic analyses. Notably, they clustered with Varroa destructor of the family Varroidae, suggesting that the family Varroidae is closely related to the family Laelapidae, but more data are needed to test whether Varroa can be classified under the family Laelapidae. The L. chini mitogenome is the first complete mitochondrial genome for the genus Laelaps, and contributes to further exploration of the mitochondrial gene rearrangements and phylogeny for the superfamily Dermanyssoidea.
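Two of the summary statistics mentioned above, overall AT content and the preference for codons ending in A/U, are simple to compute from sequence. The sketch below is a generic illustration, not the authors' pipeline; the function names and the tiny example sequences are hypothetical, and U is treated as T so the same code works for RNA or DNA input.

```python
def at_content(seq: str) -> float:
    """Fraction of A/T (or A/U) among the unambiguous bases of `seq`."""
    bases = [b for b in seq.upper().replace("U", "T") if b in "ACGT"]
    if not bases:
        return 0.0
    return sum(b in "AT" for b in bases) / len(bases)


def third_position_at_fraction(cds: str) -> float:
    """Fraction of complete codons in a coding sequence that end in A or T/U."""
    s = cds.upper().replace("U", "T")
    usable = len(s) - len(s) % 3  # ignore a trailing partial codon
    codons = [s[i:i + 3] for i in range(0, usable, 3)]
    if not codons:
        return 0.0
    return sum(codon[2] in "AT" for codon in codons) / len(codons)


if __name__ == "__main__":
    toy_cds = "ATGAAATTT"  # hypothetical three-codon example
    print(at_content(toy_cds), third_position_at_fraction(toy_cds))
```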
Affiliation(s)
- Bili Yuan
  - Yunnan Provincial Key Laboratory for Zoonosis Control and Prevention, Institute of Pathogens and Vectors, Dali University, Dali, 671000, Yunnan, China
- Gangxian He
  - Yunnan Provincial Key Laboratory for Zoonosis Control and Prevention, Institute of Pathogens and Vectors, Dali University, Dali, 671000, Yunnan, China
- Wenge Dong
  - Yunnan Provincial Key Laboratory for Zoonosis Control and Prevention, Institute of Pathogens and Vectors, Dali University, Dali, 671000, Yunnan, China
8
von Löhneysen S, Spicher T, Varenyk Y, Yao HT, Lorenz R, Hofacker I, Stadler PF. Phylogenetic and Chemical Probing Information as Soft Constraints in RNA Secondary Structure Prediction. J Comput Biol 2024; 31:549-563. [PMID: 38935442 DOI: 10.1089/cmb.2024.0519] [Indexed: 06/29/2024]
Abstract
Extrinsic, experimental information can be incorporated into thermodynamics-based RNA folding algorithms in the form of pseudo-energies. Evolutionary conservation of RNA secondary structure elements is detectable in alignments of phylogenetically related sequences and provides evidence for the presence of certain base pairs that can also be converted into pseudo-energy contributions. We show that the centroid base pairs computed from a consensus folding model such as RNAalifold result in a substantial improvement of the prediction accuracy for single sequences. Evidence for specific base pairs turns out to be more informative than a position-wise profile for the conservation of the pairing status. A comparison with chemical probing data, furthermore, strongly suggests that phylogenetic base pairing data are more informative than position-specific data on (un)pairedness as obtained from chemical probing experiments. In this context we demonstrate, in addition, that the conversion of signal from probing data into pseudo-energies is possible using thermodynamic structure predictions as a reference instead of known RNA structures.
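One common way to turn per-nucleotide probing reactivities into pseudo-energies is a linear-log transform of the form slope * ln(reactivity + 1) + intercept. The sketch below illustrates that general idea only. The abstract does not state which conversion the authors use, so the functional form is an assumption here, and the function names and the slope and intercept values in the usage line are arbitrary placeholders.

```python
import math
from typing import List


def pseudo_energy(reactivity: float, slope: float, intercept: float) -> float:
    """Linear-log conversion: slope * ln(reactivity + 1) + intercept.

    Negative reactivities are treated here as missing data and contribute 0.
    """
    if reactivity < 0:
        return 0.0
    return slope * math.log(reactivity + 1.0) + intercept


def profile_to_pseudo_energies(profile: List[float],
                               slope: float, intercept: float) -> List[float]:
    """Apply the conversion to every position of a reactivity profile."""
    return [pseudo_energy(r, slope, intercept) for r in profile]


if __name__ == "__main__":
    # Placeholder profile and parameters; -1.0 marks a position without data.
    print(profile_to_pseudo_energies([0.0, 0.5, 2.0, -1.0],
                                     slope=2.0, intercept=-0.5))
```

With a positive slope and a negative intercept, low reactivities yield a small bonus and high reactivities a penalty, which is why such terms act as soft rather than hard constraints: the folding algorithm can still override them when the thermodynamic model disagrees strongly.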
Affiliation(s)
- Sarah von Löhneysen
  - Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Leipzig, Germany
- Thomas Spicher
  - Institute for Theoretical Chemistry, University of Vienna, Vienna, Austria
  - UniVie Doctoral School Computer Science (DoCS), University of Vienna, Vienna, Austria
- Yuliia Varenyk
  - Institute for Theoretical Chemistry, University of Vienna, Vienna, Austria
  - Vienna BioCenter PhD Program, Doctoral School of the University of Vienna and Medical University of Vienna, Vienna, Austria
- Hua-Ting Yao
  - Institute for Theoretical Chemistry, University of Vienna, Vienna, Austria
- Ronny Lorenz
  - Institute for Theoretical Chemistry, University of Vienna, Vienna, Austria
- Ivo Hofacker
  - Institute for Theoretical Chemistry, University of Vienna, Vienna, Austria
- Peter F Stadler
  - Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Universität Leipzig, Leipzig, Germany
  - Institute for Theoretical Chemistry, University of Vienna, Vienna, Austria
  - Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany
  - Facultad de Ciencias, Universidad Nacional de Colombia, Bogotá, Colombia
  - Santa Fe Institute, Santa Fe, New Mexico, USA
9
Bugnon LA, Di Persia L, Gerard M, Raad J, Prochetto S, Fenoy E, Chorostecki U, Ariel F, Stegmayer G, Milone DH. sincFold: end-to-end learning of short- and long-range interactions in RNA secondary structure. Brief Bioinform 2024; 25:bbae271. [PMID: 38855913 PMCID: PMC11163250 DOI: 10.1093/bib/bbae271] [Received: 03/18/2024] [Revised: 05/03/2024] [Accepted: 05/24/2024] [Indexed: 06/11/2024]
Abstract
MOTIVATION: Coding and noncoding RNA molecules participate in many important biological processes. Noncoding RNAs fold into well-defined secondary structures to exert their functions. However, the computational prediction of the secondary structure from a raw RNA sequence is a long-standing unsolved problem, which after decades of almost unchanged performance has now re-emerged due to deep learning. Traditional RNA secondary structure prediction algorithms have been mostly based on thermodynamic models and dynamic programming for free energy minimization. More recently, deep learning methods have shown competitive performance compared with the classical ones, but there is still a wide margin for improvement. RESULTS: In this work we present sincFold, an end-to-end deep learning approach that predicts the nucleotide contact matrix using only the RNA sequence as input. The model is based on 1D and 2D residual neural networks that can learn short- and long-range interaction patterns. We show that structures can be accurately predicted with minimal physical assumptions. Extensive experiments were conducted on several benchmark datasets, considering sequence homology and cross-family validation. sincFold was compared with classical methods and recent deep learning models, showing that it can outperform the state-of-the-art methods.
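A contact matrix of the kind named above is a symmetric L x L binary matrix with a 1 wherever two nucleotides are paired. As a minimal sketch of that representation (not of the sincFold model itself), the code below builds such a matrix from dot-bracket notation. The function name is hypothetical and pseudoknot brackets are not handled.

```python
from typing import List


def contact_matrix(dot_bracket: str) -> List[List[int]]:
    """Return a symmetric 0/1 matrix with 1 at (i, j) when i pairs with j."""
    n = len(dot_bracket)
    matrix = [[0] * n for _ in range(n)]
    open_positions: List[int] = []
    for i, symbol in enumerate(dot_bracket):
        if symbol == "(":
            open_positions.append(i)
        elif symbol == ")":
            if not open_positions:
                raise ValueError(f"unmatched ')' at position {i}")
            j = open_positions.pop()  # most recent unmatched '('
            matrix[i][j] = matrix[j][i] = 1
        elif symbol != ".":
            raise ValueError(f"unsupported symbol {symbol!r}")
    if open_positions:
        raise ValueError("unmatched '(' in structure")
    return matrix


if __name__ == "__main__":
    for row in contact_matrix("((..))"):
        print(row)
```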
Affiliation(s)
- Leandro A Bugnon
  - Research Institute for Signals, Systems and Computational Intelligence, sinc(i), FICH-UNL, CONICET, Ciudad Universitaria UNL, 3000, Santa Fe, Argentina
- Leandro Di Persia
  - Research Institute for Signals, Systems and Computational Intelligence, sinc(i), FICH-UNL, CONICET, Ciudad Universitaria UNL, 3000, Santa Fe, Argentina
- Matias Gerard
  - Research Institute for Signals, Systems and Computational Intelligence, sinc(i), FICH-UNL, CONICET, Ciudad Universitaria UNL, 3000, Santa Fe, Argentina
- Jonathan Raad
  - Research Institute for Signals, Systems and Computational Intelligence, sinc(i), FICH-UNL, CONICET, Ciudad Universitaria UNL, 3000, Santa Fe, Argentina
- Santiago Prochetto
  - Research Institute for Signals, Systems and Computational Intelligence, sinc(i), FICH-UNL, CONICET, Ciudad Universitaria UNL, 3000, Santa Fe, Argentina
  - Instituto de Agrobiotecnología del Litoral, CONICET-UNL, CCT-Santa Fe, Ruta Nacional N° 168 Km 0, s/n, Paraje el Pozo, 3000, Santa Fe, Argentina
- Emilio Fenoy
  - Research Institute for Signals, Systems and Computational Intelligence, sinc(i), FICH-UNL, CONICET, Ciudad Universitaria UNL, 3000, Santa Fe, Argentina
- Uciel Chorostecki
  - Faculty of Medicine and Health Sciences, Universitat Internacional de Catalunya, Barcelona, Spain
- Federico Ariel
  - Instituto de Agrobiotecnología del Litoral, CONICET-UNL, CCT-Santa Fe, Ruta Nacional N° 168 Km 0, s/n, Paraje el Pozo, 3000, Santa Fe, Argentina
- Georgina Stegmayer
  - Research Institute for Signals, Systems and Computational Intelligence, sinc(i), FICH-UNL, CONICET, Ciudad Universitaria UNL, 3000, Santa Fe, Argentina
- Diego H Milone
  - Research Institute for Signals, Systems and Computational Intelligence, sinc(i), FICH-UNL, CONICET, Ciudad Universitaria UNL, 3000, Santa Fe, Argentina
10
Kolaitis A, Makris E, Karagiannis AA, Tsanakas P, Pavlatos C. Knotify_V2.0: Deciphering RNA Secondary Structures with H-Type Pseudoknots and Hairpin Loops. Genes (Basel) 2024; 15:670. [PMID: 38927606 PMCID: PMC11203014 DOI: 10.3390/genes15060670] [Received: 03/26/2024] [Revised: 05/19/2024] [Accepted: 05/22/2024] [Indexed: 06/28/2024]
Abstract
Accurately predicting the pairing order of bases in RNA molecules is essential for anticipating RNA secondary structures. Consequently, this task holds significant importance in unveiling previously unknown biological processes. The urgent need to comprehend RNA structures has been accentuated by the unprecedented impact of the widespread COVID-19 pandemic. This paper presents a framework, Knotify_V2.0, which makes use of syntactic pattern recognition techniques in order to predict RNA structures, with a specific emphasis on tackling the demanding task of predicting H-type pseudoknots that encompass bulges and hairpins. By leveraging the expressive capabilities of a Context-Free Grammar (CFG), the suggested framework integrates the inherent benefits of CFG and makes use of minimum free energy and maximum base pairing criteria. This integration enables the effective management of this inherently ambiguous task. The main contribution of Knotify_V2.0 compared to earlier versions lies in its capacity to identify additional motifs like bulges and hairpins within the internal loops of the pseudoknot. Notably, the proposed methodology, Knotify_V2.0, demonstrates superior accuracy in predicting core stems compared to state-of-the-art frameworks. Knotify_V2.0 exhibited exceptional performance by accurately identifying both core base pairings that form the ground truth pseudoknot in 70% of the examined sequences. Furthermore, Knotify_V2.0 narrowed the performance gap with Knotty, which had demonstrated better performance than Knotify and even surpassed it in Recall and F1-score metrics. Knotify_V2.0 achieved a higher count of true positives (tp) and a significantly lower count of false negatives (fn) compared to Knotify, highlighting improvements in Precision and Recall metrics, respectively. Consequently, Knotify_V2.0 achieved a higher F1-score than any other platform. The source code and comprehensive implementation details of Knotify_V2.0 are publicly available on GitHub.
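The metrics named above follow the standard definitions: precision = tp / (tp + fp), recall = tp / (tp + fn), and F1 is the harmonic mean of the two. The short sketch below shows how they relate to base-pair counts. The function name and the counts in the usage line are hypothetical and are not results from the paper.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Compute precision, recall and F1 from base-pair prediction counts.

    tp: predicted pairs present in the reference structure
    fp: predicted pairs absent from the reference structure
    fn: reference pairs that were not predicted
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical counts for a single predicted structure.
    print(precision_recall_f1(tp=8, fp=2, fn=2))
```

Because fn appears only in the recall denominator, lowering false negatives raises recall directly, while F1 improves only when precision does not fall by a comparable amount.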
Affiliation(s)
- Angelos Kolaitis
  - School of Electrical and Computer Engineering, National Technical University of Athens, 9 Iroon Polytechniou St., 15780 Athens, Greece
- Evangelos Makris
  - School of Electrical and Computer Engineering, National Technical University of Athens, 9 Iroon Polytechniou St., 15780 Athens, Greece
- Alexandros Anastasios Karagiannis
  - School of Electrical and Computer Engineering, National Technical University of Athens, 9 Iroon Polytechniou St., 15780 Athens, Greece
- Panayiotis Tsanakas
  - School of Electrical and Computer Engineering, National Technical University of Athens, 9 Iroon Polytechniou St., 15780 Athens, Greece
- Christos Pavlatos
  - Hellenic Air Force Academy, Dekelia Air Base, Acharnes, 13671 Athens, Greece
11
Chattopadhyay A, Jailani AAK, Roy A, Mukherjee SK, Mandal B. Expanding Possibilities for Foreign Gene Expression by Cucumber Green Mottle Mosaic Virus Genome-Based Bipartite Vector System. PLANTS (BASEL, SWITZERLAND) 2024; 13:1414. [PMID: 38794484 PMCID: PMC11124972 DOI: 10.3390/plants13101414] [Received: 03/15/2024] [Revised: 05/01/2024] [Accepted: 05/15/2024] [Indexed: 05/26/2024]
Abstract
Expanding possibilities for foreign gene expression in cucurbits, we present a novel approach utilising a bipartite vector system based on the cucumber green mottle mosaic virus (CGMMV) genome. Traditional full-length CGMMV vectors face limitations such as a restricted cargo capacity and unstable foreign gene expression. To address these challenges, we developed two 'deconstructed' CGMMV genomes, DG-1 and DG-2. DG-1 features a major internal deletion, resulting in the loss of crucial replicase enzyme domains, rendering it incapable of self-replication. However, a staggered infiltration of DG-1 in CGMMV-infected plants enabled successful replication and movement, facilitating gene-silencing experiments. Conversely, DG-2 was engineered to enhance replication rates and provide multiple cloning sites. Although it exhibited higher replication rates, DG-2 remained localised within infiltrated tissue, displaying trans-replication and restricted movement. Notably, DG-2 demonstrated utility in expressing GFP, with a peak expression observed between 6 and 10 days post-infiltration. Overall, our bipartite system represents a significant advancement in functional genomics, offering a robust tool for foreign gene expression in Nicotiana benthamiana.
Affiliation(s)
- Anirudha Chattopadhyay
  - Advanced Centre for Plant Virology, Division of Plant Pathology, Indian Agricultural Research Institute, New Delhi 110012, India
  - Pulses Research Station, Sardarkrushinagar Dantiwada Agricultural University, Sardarkrushinagar 385506, Gujarat, India
- A. Abdul Kader Jailani
  - Advanced Centre for Plant Virology, Division of Plant Pathology, Indian Agricultural Research Institute, New Delhi 110012, India
  - Plant Pathology Department, North Florida Research and Education Center, University of Florida, Quincy, FL 32351, USA
- Anirban Roy
  - Advanced Centre for Plant Virology, Division of Plant Pathology, Indian Agricultural Research Institute, New Delhi 110012, India
- Sunil Kumar Mukherjee
  - Advanced Centre for Plant Virology, Division of Plant Pathology, Indian Agricultural Research Institute, New Delhi 110012, India
  - Plant Molecular Biology Group, International Centre for Genetic Engineering and Biotechnology, New Delhi 110067, India
- Bikash Mandal
  - Advanced Centre for Plant Virology, Division of Plant Pathology, Indian Agricultural Research Institute, New Delhi 110012, India
12
Su M, Roberts SJ, Sutherland JD. Initial Amino Acid:Codon Assignments and Strength of Codon:Anticodon Binding. J Am Chem Soc 2024; 146:12857-12863. [PMID: 38676654 PMCID: PMC11082893 DOI: 10.1021/jacs.4c03644] [Received: 03/14/2024] [Revised: 04/16/2024] [Accepted: 04/18/2024] [Indexed: 04/29/2024]
Abstract
The ribosome brings 3'-aminoacyl-tRNA and 3'-peptidyl-tRNAs together to enable peptidyl transfer by binding them in two major ways. First, their anticodon loops are bound to mRNA, itself anchored at the ribosomal subunit interface, by contiguous anticodon:codon pairing augmented by interactions with the decoding center of the small ribosomal subunit. Second, their acceptor stems are bound by the peptidyl transferase center, which aligns the 3'-aminoacyl- and 3'-peptidyl-termini for optimal interaction of the nucleophilic amino group and electrophilic ester carbonyl group. Reasoning that intrinsic codon:anticodon binding might have been a major contributor to bringing tRNA 3'-termini into proximity at an early stage of ribosomal peptide synthesis, we wondered if primordial amino acids might have been assigned to those codons that bind the corresponding anticodon loops most tightly. By measuring the binding of anticodon stem loops to short oligonucleotides, we determined that family-box codon:anticodon pairings are typically tighter than split-box codon:anticodon pairings. Furthermore, we find that two family-box anticodon stem loops can tightly bind a pair of contiguous codons simultaneously, whereas two split-box anticodon stem loops cannot. The amino acids assigned to family boxes correspond to those accessible by what has been termed cyanosulfidic chemistry, supporting the contention that these limited amino acids might have been the first used in primordial coded peptide synthesis.
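For readers unfamiliar with the terminology, a family box is a set of four codons sharing the same first two nucleotides that all encode the same amino acid; a split box encodes more than one amino acid or includes a stop signal. The sketch below enumerates the family boxes of the standard genetic code. It illustrates the definition only and says nothing about binding strengths; the function and variable names are hypothetical.

```python
from itertools import product
from typing import Dict

BASES = "UCAG"
# Standard genetic code, codons enumerated with U, C, A, G at each position
# (third position varying fastest); '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE: Dict[str, str] = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def family_boxes(table: Dict[str, str] = CODON_TABLE) -> Dict[str, str]:
    """Map each two-letter codon prefix whose four codons agree to its amino acid."""
    boxes: Dict[str, str] = {}
    for first, second in product(BASES, repeat=2):
        prefix = first + second
        encoded = {table[prefix + third] for third in BASES}
        if len(encoded) == 1:  # all four third-position variants agree
            boxes[prefix] = encoded.pop()
    return boxes


if __name__ == "__main__":
    for prefix, amino_acid in sorted(family_boxes().items()):
        print(f"{prefix}N -> {amino_acid}")
```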
Affiliation(s)
- Meng Su
  - MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge CB2 0QH, U.K.
- Samuel J. Roberts
  - MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge CB2 0QH, U.K.
- John D. Sutherland
  - MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge CB2 0QH, U.K.
|
13
|
Yang C, Zhou Q, Shen Y, Liu L, Cao Y, Tian H, Cao S, Liu C. The co-dispersal strategy of Endocarpon (Verrucariaceae) shapes an unusual lichen population structure. MYCOSCIENCE 2024; 65:138-150. [PMID: 39233758 PMCID: PMC11369309 DOI: 10.47371/mycosci.2024.02.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2023] [Revised: 02/20/2024] [Accepted: 02/21/2024] [Indexed: 09/06/2024]
Abstract
The reproduction and dispersal strategies of lichens play a major role in shaping their population structure and photobiont diversity. Sexual reproduction, which is common, leads to high lichen genetic diversity and low photobiont selectivity. However, the lichen genus Endocarpon adopts a special co-dispersal model in which algal cells from the photobiont and ascospores from the mycobiont are released together into the environment. To explore the impact of dispersal strategy on population structure, a total of 62 Endocarpon individuals and 12 individuals from related Verrucariaceae genera, representing the co-dispersal strategy and the conventional independent dispersal mode, respectively, were studied. Phylogenetic analysis revealed that Endocarpon, with a large-scale geographical distribution, showed extremely high specificity in its symbiotic associations with its photobiont. Furthermore, three types of group I intron at site 1769 were found in most Endocarpon mycobionts, showing a high variety of group I introns at the same insertion site, even within the same species collected from one location. This study suggests that the ascospore-alga co-dispersal mode of Endocarpon resulted in this unusual mycobiont-photobiont relationship; it also provides evidence for the horizontal transfer of group I introns, which may suggest the origin of the complexity and diversity of lichen symbiotic associations.
Affiliation(s)
- ChunYan Yang
  - School of Life Science and Technology, Harbin Institute of Technology
- Yue Shen
  - Key Laboratory for Polar Science, State Ocean Administration, Polar Research Institute of China
- LuShan Liu
  - Emergency Department of China Rehabilitation Research Center, Capital Medical University
- YunShu Cao
  - Inner Mongolia Vocational and Technical College of Communications
- HuiMin Tian
  - Department of Physiology, Medical College, Chifeng University
- ShuNan Cao
  - Key Laboratory for Polar Science, State Ocean Administration, Polar Research Institute of China
- ChuanPeng Liu
  - School of Life Science and Technology, Harbin Institute of Technology
|
14
|
Mortazavi B, Molaei A, Fard NA. Multi-epitope vaccines, from design to expression; an in silico approach. Hum Immunol 2024; 85:110804. [PMID: 38658216 DOI: 10.1016/j.humimm.2024.110804] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2024] [Revised: 04/02/2024] [Accepted: 04/15/2024] [Indexed: 04/26/2024]
Abstract
The development of vaccines against a wide range of infectious diseases and pathogens often relies on multi-epitope strategies that can effectively stimulate both humoral and cellular immunity. Immunoinformatics tools play a pivotal role in designing such vaccines, enhancing immune response potential, and minimizing the risk of failure. This review presents a comprehensive overview of practical tools for epitope prediction and the associated immune responses. These immunoinformatics tools facilitate the selection of epitopes based on parameters such as antigenicity, absence of toxic and allergenic sequences, secondary and tertiary structures, sequence conservation, and population coverage. The chosen epitopes can be tailored for B-cells or T-cells, both of which require further assessments covered in this study. We offer a range of suitable linkers that effectively separate cytotoxic T lymphocyte and helper T lymphocyte epitopes while preserving their functionality. Additionally, we identify various adjuvants for specific purposes. We delve into the evaluation of MHC-epitope interactions, MHC clusters, and the simulation of final constructs through molecular docking techniques. We provide diverse linkers and adjuvants optimized for epitope functions to bolster immune responses through epitope attachment. By leveraging these comprehensive tools, the development of multi-epitope vaccines holds the promise of robust immunity and a significant reduction in experimental costs.
Affiliation(s)
- Behnam Mortazavi
  - Department of systems Biotechnology, Faculty of Industrial and Environmental Biotechnology, National Institute of Genetic Engineering and Biotechnology (NIGEB), Tehran, Iran
- Ali Molaei
  - Department of Biology, Science and Research Branch, Islamic Azad University, Tehran, Iran
- Najaf Allahyari Fard
  - Department of systems Biotechnology, Faculty of Industrial and Environmental Biotechnology, National Institute of Genetic Engineering and Biotechnology (NIGEB), Tehran, Iran
|
15
|
Trinity L, Stege U, Jabbari H. Tying the knot: Unraveling the intricacies of the coronavirus frameshift pseudoknot. PLoS Comput Biol 2024; 20:e1011787. [PMID: 38713726 PMCID: PMC11108256 DOI: 10.1371/journal.pcbi.1011787] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2023] [Revised: 05/21/2024] [Accepted: 04/27/2024] [Indexed: 05/09/2024] Open
Abstract
Understanding and targeting functional RNA structures towards treatment of coronavirus infection can help us to prepare for novel variants of SARS-CoV-2 (the virus causing COVID-19), and any other coronaviruses that could emerge via human-to-human transmission or potential zoonotic (inter-species) events. Leveraging the fact that all coronaviruses use a mechanism known as -1 programmed ribosomal frameshifting (-1 PRF) to replicate, we apply algorithms to predict the most energetically favourable secondary structures (each nucleotide involved in at most one pairing) that may be involved in regulating the -1 PRF event in coronaviruses, especially SARS-CoV-2. We compute previously unknown most stable structure predictions for the frameshift site of coronaviruses via hierarchical folding, a biologically motivated framework where initial non-crossing structure folds first, followed by subsequent, possibly crossing (pseudoknotted), structures. Using mutual information from 181 coronavirus sequences, in conjunction with the algorithm KnotAli, we compute secondary structure predictions for the frameshift site of different coronaviruses. We then utilize the Shapify algorithm to obtain most stable SARS-CoV-2 secondary structure predictions guided by frameshift sequence-specific and genome-wide experimental data. We build on our previous secondary structure investigation of the singular SARS-CoV-2 68 nt frameshift element sequence, by using Shapify to obtain predictions for 132 extended sequences and including covariation information. Previous investigations have not applied hierarchical folding to extended length SARS-CoV-2 frameshift sequences. By doing so, we simulate the effects of ribosome interaction with the frameshift site, providing insight to biological function. 
We contribute in-depth discussion to contextualize secondary structure dual-graph motifs for SARS-CoV-2, highlighting the energetic stability of the previously identified 3_8 motif alongside the known dominant 3_3 and 3_6 (native-type) -1 PRF structures. Using a combination of thermodynamic methods and sequence covariation, our novel predictions suggest function of the attenuator hairpin via previously unknown pseudoknotted base pairing. While certain initial RNA folding is consistent, other pseudoknotted base pairs form which indicate potential conformational switching between the two structures.
Affiliation(s)
- Luke Trinity
  - Department of Computer Science, University of Victoria, Victoria, British Columbia, Canada
- Ulrike Stege
  - Department of Computer Science, University of Victoria, Victoria, British Columbia, Canada
- Hosna Jabbari
  - Department of Biomedical Engineering, University of Alberta, Edmonton, Alberta, Canada
  - Institute on Aging and Lifelong Health, Victoria, British Columbia, Canada
|
16
|
Daniel Thomas S, Vijayakumar K, John L, Krishnan D, Rehman N, Revikumar A, Kandel Codi JA, Prasad TSK, S S V, Raju R. Machine Learning Strategies in MicroRNA Research: Bridging Genome to Phenome. OMICS : A JOURNAL OF INTEGRATIVE BIOLOGY 2024; 28:213-233. [PMID: 38752932 DOI: 10.1089/omi.2024.0047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/23/2024]
Abstract
MicroRNAs (miRNAs) have emerged as a prominent layer of regulation of gene expression. This article offers the salient and current aspects of machine learning (ML) tools and approaches from genome to phenome in miRNA research. First, we underline that the complexity in the analysis of miRNA function ranges from their modes of biogenesis to the target diversity in diverse biological conditions. Therefore, it is imperative to first ascertain the miRNA coding potential of genomes and understand the regulatory mechanisms of their expression. This knowledge enables the efficient classification of miRNA precursors and the identification of their mature forms and respective target genes. Second, because one miRNA can target multiple mRNAs and vice versa, another challenge is the assessment of the miRNA-mRNA target interaction network. Furthermore, long noncoding RNAs (lncRNAs) and circular RNAs (circRNAs) also contribute to this complexity. ML has been used to tackle these challenges at the high-dimensional data level. The present expert review covers more than 100 tools adopting various ML approaches pertaining to, for example, (1) miRNA promoter prediction, (2) precursor classification, (3) mature miRNA prediction, (4) miRNA target prediction, (5) miRNA-lncRNA and miRNA-circRNA interactions, (6) miRNA-mRNA expression profiling, (7) miRNA regulatory module detection, (8) miRNA-disease association, and (9) miRNA essentiality prediction. Taken together, we unpack, critically examine, and highlight the cutting-edge synergy of ML approaches and miRNA research so as to develop a dynamic and microlevel understanding of human health and diseases.
Affiliation(s)
- Sonet Daniel Thomas
  - Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to Be University), Mangalore, Karnataka, India
  - Centre for Systems Biology and Molecular Medicine (CSBMM), Yenepoya (Deemed to Be University), Mangalore, Karnataka, India
- Krithika Vijayakumar
  - Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to Be University), Mangalore, Karnataka, India
- Levin John
  - Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to Be University), Mangalore, Karnataka, India
- Deepak Krishnan
  - Centre for Systems Biology and Molecular Medicine (CSBMM), Yenepoya (Deemed to Be University), Mangalore, Karnataka, India
- Niyas Rehman
  - Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to Be University), Mangalore, Karnataka, India
- Amjesh Revikumar
  - Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to Be University), Mangalore, Karnataka, India
  - Kerala Genome Data Centre, Kerala Development and Innovation Strategic Council, Thiruvananthapuram, Kerala, India
- Jalaluddin Akbar Kandel Codi
  - Department of Surgical Oncology, Yenepoya Medical College, Yenepoya (Deemed to Be University), Mangalore, Karnataka, India
- Vinodchandra S S
  - Department of Computer Science, University of Kerala, Thiruvananthapuram, Kerala, India
- Rajesh Raju
  - Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to Be University), Mangalore, Karnataka, India
  - Centre for Systems Biology and Molecular Medicine (CSBMM), Yenepoya (Deemed to Be University), Mangalore, Karnataka, India
|
17
|
Koksaldi I, Park D, Atilla A, Kang H, Kim J, Seker UOS. RNA-Based Sensor Systems for Affordable Diagnostics in the Age of Pandemics. ACS Synth Biol 2024; 13:1026-1037. [PMID: 38588603 PMCID: PMC11036506 DOI: 10.1021/acssynbio.3c00698] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2023] [Revised: 03/25/2024] [Accepted: 03/25/2024] [Indexed: 04/10/2024]
Abstract
In the era of the COVID-19 pandemic, the significance of point-of-care (POC) diagnostic tools has become increasingly vital, driven by the need for quick and precise virus identification. RNA-based sensors, particularly toehold sensors, have emerged as promising candidates for POC detection systems due to their selectivity and sensitivity. Toehold sensors operate by employing an RNA switch that changes conformation when it binds to a target RNA molecule, resulting in a detectable signal. This review focuses on the development and deployment of RNA-based sensors for POC viral RNA detection with a particular emphasis on toehold sensors. The benefits and limits of toehold sensors are explored, and obstacles and future directions for improving their performance within POC detection systems are presented. The use of RNA-based sensors as a technology for rapid and sensitive detection of viral RNA holds great potential for effectively coping with present and future pandemics in resource-constrained settings.
Affiliation(s)
- Ilkay Cisil Koksaldi
  - UNAM - Institute of Materials Science and Nanotechnology, National Nanotechnology Research Center (UNAM), Bilkent University, Ankara 06800, Turkey
- Dongwon Park
  - Department of Life Sciences, Pohang University of Science and Technology, Pohang 37673, South Korea
- Abdurahman Atilla
  - UNAM - Institute of Materials Science and Nanotechnology, National Nanotechnology Research Center (UNAM), Bilkent University, Ankara 06800, Turkey
- Hansol Kang
  - Department of Life Sciences, Pohang University of Science and Technology, Pohang 37673, South Korea
- Jongmin Kim
  - Department of Life Sciences, Pohang University of Science and Technology, Pohang 37673, South Korea
- Urartu Ozgur Safak Seker
  - UNAM - Institute of Materials Science and Nanotechnology, National Nanotechnology Research Center (UNAM), Bilkent University, Ankara 06800, Turkey
|
18
|
White DS, Dunyak BM, Vaillancourt FH, Hoskins AA. A Sequential Binding Mechanism for 5' Splice Site Recognition and Modulation for the Human U1 snRNP. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.04.18.590139. [PMID: 38659798 PMCID: PMC11042371 DOI: 10.1101/2024.04.18.590139] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/26/2024]
Abstract
Splice site recognition is essential for defining the transcriptome. Drugs like risdiplam and branaplam change how U1 snRNP recognizes particular 5' splice sites (5'SS) and promote U1 snRNP binding and splicing at these locations. Despite the therapeutic potential of 5'SS modulators, the complexity of their interactions and snRNP substrates has precluded defining a mechanism for 5'SS modulation. We have determined a sequential binding mechanism for modulation of -1A bulged 5'SS by branaplam using a combination of ensemble kinetic measurements and colocalization single molecule spectroscopy (CoSMoS). Our mechanism establishes that U1-C protein binds reversibly to U1 snRNP, and branaplam binds to the U1 snRNP/U1-C complex only after it has engaged a -1A bulged 5'SS. Obligate orders of binding and unbinding explain how reversible branaplam interactions cause formation of long-lived U1 snRNP/5'SS complexes. Branaplam is a drug that targets a ribonucleoprotein, not an RNA duplex alone, and its action depends on fundamental properties of 5'SS recognition.
Affiliation(s)
- David S. White
  - Department of Biochemistry, University of Wisconsin-Madison, Madison, WI
  - Present Address: Element Biosciences, San Diego, CA
- Aaron A. Hoskins
  - Department of Biochemistry, University of Wisconsin-Madison, Madison, WI
  - Department of Chemistry, University of Wisconsin-Madison, Madison, WI
|
19
|
Gupta S, Pal D. Detection of intrinsic transcription termination sites in bacteria: consensus from hairpin detection approaches. J Biomol Struct Dyn 2024:1-11. [PMID: 38605579 DOI: 10.1080/07391102.2024.2325107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2023] [Accepted: 02/23/2024] [Indexed: 04/13/2024]
Abstract
We compare the WebGeSTer and INtrinsic transcription TERmination hairPIN (INTERPIN) databases used for intrinsic transcription termination (ITT) site prediction in bacteria. The former deploys inverted nucleotide repeat detection for identification of RNA hairpins, while the latter uses a pair-potential function; the hairpin energy score evaluation is identical for both. We find INTERPIN more sensitive than WebGeSTer, with about 6% and 51% additional predictions for ITTs in chromosomal and plasmid operons, respectively. INTERPIN hairpins are relatively shorter, with an ungapped stem, and are located even in AT-rich segments, compared to the GC-rich, longer hairpins with a gapped stem in WebGeSTer. The GC%, length, and energy score from INTERPIN transcription units (TUs) are the best inter-correlated, while those of the lowest-energy single hairpins from WebGeSTer, considered suitable for ITT, are the worst. Around 72% of TUs from the two databases overlap, and ∼60% of all alternate ITT sites downstream of TUs overlap, of which 65% are cluster hairpins. This helps highlight hairpin features that can be used to identify termination sites in bacteria across different prediction methods. Overall, the pair-potential-function-based hairpins screened appear to be more consistent with the kinetic and thermodynamic processes of ITT known to date. Communicated by Ramaswamy H. Sarma.
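The inverted-repeat approach to hairpin detection mentioned in this abstract can be illustrated with a minimal sketch. It is not the WebGeSTer or INTERPIN procedure: the function names, the fixed ungapped stem length, and the loop-size bounds are assumptions made for illustration, and no energy scoring is attempted.

```python
# Hypothetical sketch: find ungapped inverted repeats (candidate hairpin stems)
# in a DNA sequence. Parameter defaults are illustrative assumptions.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement; unknown bases map to 'N'."""
    return "".join(COMPLEMENT.get(base, "N") for base in reversed(seq))


def find_inverted_repeats(seq, stem=4, min_loop=3, max_loop=8):
    """Return (start, stem, loop) tuples where the left arm of length `stem`
    is the reverse complement of the right arm, separated by `loop` bases."""
    hits = []
    for start in range(len(seq)):
        left = seq[start:start + stem]
        for loop in range(min_loop, max_loop + 1):
            right_start = start + stem + loop
            right = seq[right_start:right_start + stem]
            if len(right) < stem:
                break  # right arm would run past the end of the sequence
            if left == reverse_complement(right):
                hits.append((start, stem, loop))
    return hits


if __name__ == "__main__":
    # Left arm GGGC, loop TTTT, right arm GCCC (reverse complement of GGGC).
    print(find_inverted_repeats("GGGCTTTTGCCC"))
```

A real predictor would additionally allow gapped stems, score each hairpin energetically, and consider sequence context, all of which are omitted here.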
Affiliation(s)
- Swati Gupta
  - Department of Computational and Data Sciences, Indian Institute of Science, Bengaluru, India
- Debnath Pal
  - Department of Computational and Data Sciences, Indian Institute of Science, Bengaluru, India
|
20
|
Morishita EC, Nakamura S. Recent applications of artificial intelligence in RNA-targeted small molecule drug discovery. Expert Opin Drug Discov 2024; 19:415-431. [PMID: 38321848 DOI: 10.1080/17460441.2024.2313455] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Accepted: 01/30/2024] [Indexed: 02/08/2024]
Abstract
INTRODUCTION: Targeting RNAs with small molecules offers an alternative to the conventional protein-targeted drug discovery and can potentially address unmet and emerging medical needs. The recent rise of interest in the strategy has already resulted in large amounts of data on disease-associated RNAs, as well as on small molecules that bind to such RNAs. Artificial intelligence (AI) approaches, including machine learning and deep learning, present an opportunity to speed up the discovery of RNA-targeted small molecules by improving decision-making efficiency and quality. AREAS COVERED: The topics described in this review include the recent applications of AI in the identification of RNA targets, RNA structure determination, screening of chemical compound libraries, and hit-to-lead optimization. The impact and limitations of the recent AI applications are discussed, along with an outlook on the possible applications of next-generation AI tools for the discovery of novel RNA-targeted small molecule drugs. EXPERT OPINION: Key areas for improvement include developing AI tools for understanding RNA dynamics and RNA-small molecule interactions. High-quality and comprehensive data still need to be generated, especially on the biological activity of small molecules that target RNAs.
|
21
|
Yao HT, Marchand B, Berkemer SJ, Ponty Y, Will S. Infrared: a declarative tree decomposition-powered framework for bioinformatics. Algorithms Mol Biol 2024; 19:13. [PMID: 38493130 PMCID: PMC10943887 DOI: 10.1186/s13015-024-00258-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2023] [Accepted: 02/13/2024] [Indexed: 03/18/2024] Open
Abstract
MOTIVATION: Many bioinformatics problems can be approached as optimization or controlled sampling tasks, and solved exactly and efficiently using Dynamic Programming (DP). However, such exact methods are typically tailored towards specific settings, complex to develop, and hard to implement and adapt to problem variations. METHODS: We introduce the Infrared framework to overcome such hindrances for a large class of problems. Its underlying paradigm is tailored toward problems that can be declaratively formalized as sparse feature networks, a generalization of constraint networks. Classic Boolean constraints specify a search space, consisting of putative solutions whose evaluation is performed through a combination of features. Problems are then solved using generic cluster tree elimination algorithms over a tree decomposition of the feature network. Their overall complexities are linear in the number of variables, and exponential only in the treewidth of the feature network. For sparse feature networks, associated with low to moderate treewidths, these algorithms make it possible to find optimal solutions, or generate controlled samples, with practical empirical efficiency. RESULTS: Implementing these methods, the Infrared software allows Python programmers to rapidly develop exact optimization and sampling applications based on efficient tree decomposition-based processing. Instead of directly coding specialized algorithms, problems are declaratively modeled as sets of variables over finite domains, whose dependencies are captured by constraints and functions. Such models are then automatically solved by generic DP algorithms. To illustrate the applicability of Infrared in bioinformatics and guide new users, we model and discuss variants of bioinformatics applications. 
We provide reimplementations and extensions of methods for RNA design, RNA sequence-structure alignment, parsimony-driven inference of ancestral traits in phylogenetic trees/networks, and design of coding sequences. Moreover, we demonstrate multidimensional Boltzmann sampling. These applications of the framework, together with our novel results, underline the practical relevance of Infrared. Remarkably, the achieved complexities are typically equivalent to the ones of specialized algorithms and implementations. AVAILABILITY: Infrared is available at https://amibio.gitlabpages.inria.fr/Infrared with extensive documentation, including various usage examples and API reference; it can be installed using Conda or from source.
Affiliation(s)
- Hua-Ting Yao
  - LIX, CNRS UMR 7161, Ecole Polytechnique, Institut Polytechnique de Paris, Palaiseau, France
  - Department of Theoretical Chemistry, University of Vienna, Vienna, Austria
  - School of Computer Science, McGill University, Montreal, Canada
- Bertrand Marchand
  - LIX, CNRS UMR 7161, Ecole Polytechnique, Institut Polytechnique de Paris, Palaiseau, France
- Sarah J Berkemer
  - LIX, CNRS UMR 7161, Ecole Polytechnique, Institut Polytechnique de Paris, Palaiseau, France
  - Earth-Life Science Institute, Tokyo Institute of Technology, Tokyo, Japan
- Yann Ponty
  - LIX, CNRS UMR 7161, Ecole Polytechnique, Institut Polytechnique de Paris, Palaiseau, France
- Sebastian Will
  - LIX, CNRS UMR 7161, Ecole Polytechnique, Institut Polytechnique de Paris, Palaiseau, France
|
22
|
Gu X, Qi Y, El-Kebir M. DERNA Enables Pareto Optimal RNA Design. J Comput Biol 2024; 31:179-196. [PMID: 38416637 DOI: 10.1089/cmb.2023.0283] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/01/2024] Open
Abstract
The design of an RNA sequence v that encodes an input target protein sequence w is a crucial aspect of messenger RNA (mRNA) vaccine development. There are an exponential number of possible RNA sequences for a single target protein due to codon degeneracy. These potential RNA sequences can assume various secondary structure conformations, each with distinct minimum free energy (MFE), impacting thermodynamic stability and mRNA half-life. Furthermore, the presence of species-specific codon usage bias, quantified by the codon adaptation index (CAI), plays a vital role in translation efficiency. While earlier studies focused on optimizing either MFE or CAI, recent research has underscored the advantages of simultaneously optimizing both objectives. However, optimizing one objective comes at the expense of the other. In this work, we present the Pareto Optimal RNA Design problem, aiming to identify the set of Pareto optimal solutions for which no alternative solutions exist that exhibit better MFE and CAI values. Our algorithm DEsign RNA (DERNA) uses the weighted sum method to enumerate the Pareto front by optimizing convex combinations of both objectives. We use dynamic programming to solve each convex combination in O(|w|^3) time and O(|w|^2) space. Compared with CDSfold, a previous approach that only optimizes MFE, we show on a benchmark data set that DERNA obtains solutions with identical MFE but superior CAI. Moreover, we show that DERNA matches the performance in terms of solution quality of LinearDesign, a recent approach that similarly seeks to balance MFE and CAI. We conclude by demonstrating our method's potential for mRNA vaccine design for the SARS-CoV-2 spike protein.
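The notions of Pareto optimality and the weighted sum method used in this abstract can be sketched on a small, made-up candidate set. This is not DERNA's dynamic program: the candidate tuples, function names, and the simple scalarization below are illustrative assumptions, and the MFE and CAI values are invented for the example.

```python
# Hypothetical sketch of Pareto filtering over (MFE, CAI) pairs.
# Lower MFE is better; higher CAI is better. All data are illustrative.

def dominates(a, b):
    """True if candidate a is no worse than b in both objectives
    and strictly better in at least one."""
    no_worse = a[1] <= b[1] and a[2] >= b[2]
    strictly_better = a[1] < b[1] or a[2] > b[2]
    return no_worse and strictly_better


def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


def weighted_sum_best(candidates, lam):
    """Pick the candidate minimizing lam * MFE - (1 - lam) * CAI,
    a convex combination of the two objectives for 0 <= lam <= 1."""
    return min(candidates, key=lambda c: lam * c[1] - (1.0 - lam) * c[2])


if __name__ == "__main__":
    # (label, MFE in kcal/mol, CAI): invented values for illustration only.
    candidates = [("a", -30.0, 0.70), ("b", -25.0, 0.90),
                  ("c", -20.0, 0.80), ("d", -30.0, 0.60)]
    print(pareto_front(candidates))
    print(weighted_sum_best(candidates, 0.5))
```

Sweeping `lam` from 0 to 1 recovers Pareto optimal points that lie on the convex hull of the front. Because MFE and CAI live on different scales, a practical scalarization would usually normalize both objectives first.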
Affiliation(s)
- Xinyu Gu
  - Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, Illinois, USA
- Yuanyuan Qi
  - Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, Illinois, USA
- Mohammed El-Kebir
  - Department of Computer Science, University of Illinois Urbana-Champaign, Urbana, Illinois, USA
  - Cancer Center at Illinois, University of Illinois Urbana-Champaign, Urbana, Illinois, USA
|
23
|
Matthies MC, Krueger R, Torda AE, Ward M. Differentiable partition function calculation for RNA. Nucleic Acids Res 2024; 52:e14. [PMID: 38038257 PMCID: PMC10853804 DOI: 10.1093/nar/gkad1168] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2023] [Revised: 10/24/2023] [Accepted: 11/28/2023] [Indexed: 12/02/2023] Open
Abstract
Ribonucleic acid (RNA) is an essential molecule in a wide range of biological functions. In 1990, McCaskill introduced a dynamic programming algorithm for computing the partition function of an RNA sequence. McCaskill's algorithm is widely used today for understanding the thermodynamic properties of RNA. In this work, we introduce a generalization of McCaskill's algorithm that is well-defined over continuous inputs. Crucially, this enables us to implement an end-to-end differentiable partition function calculation. The derivative can be computed with respect to the input, or to any other fixed values, such as the parameters of the energy model. This builds a bridge between RNA thermodynamics and the tools of differentiable programming including deep learning as it enables the partition function to be incorporated directly into any end-to-end differentiable pipeline. To demonstrate the effectiveness of our new approach, we tackle the inverse folding problem directly using gradient optimization. We find that using the gradient to optimize the sequence directly is sufficient to arrive at sequences with a high probability of folding into the desired structure. This indicates that the gradients we compute are meaningful.
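The kind of dynamic program this abstract refers to can be sketched with a deliberately simplified energy model. The code below is not McCaskill's algorithm with nearest-neighbor energies, nor the differentiable generalization described above: it assumes every allowed base pair contributes the same energy `pair_energy`, and the function name, parameter defaults, and minimum loop length are illustrative choices.

```python
# Hypothetical sketch: partition function over nested secondary structures
# under a toy model where each base pair contributes a constant energy.
import math
from functools import lru_cache

# Watson-Crick and G-U wobble pairs.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def partition_function(seq, pair_energy=-1.0, rt=0.616, min_loop=3):
    """Sum of Boltzmann weights over all nested structures of `seq`.

    pair_energy: energy per base pair in kcal/mol (toy assumption).
    rt: gas constant times temperature in kcal/mol (about 0.616 at 37 C).
    min_loop: minimum number of unpaired bases enclosed by a pair.
    """
    weight = math.exp(-pair_energy / rt)

    @lru_cache(maxsize=None)
    def z(i, j):
        # Empty or single-nucleotide intervals admit only the open structure.
        if j - i < 1:
            return 1.0
        total = z(i, j - 1)  # case 1: position j is unpaired
        for k in range(i, j - min_loop):  # case 2: j pairs with some k
            if (seq[k], seq[j]) in PAIRS:
                total += z(i, k - 1) * weight * z(k + 1, j - 1)
        return total

    return z(0, len(seq) - 1)


if __name__ == "__main__":
    print(partition_function("GGGAAAUCCC"))
```

Setting `pair_energy=0.0` makes every weight equal to 1, so the function then simply counts the admissible structures, which is a convenient sanity check. The recursion depth grows with sequence length, so this memoized form is only suitable for short sequences.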
Affiliation(s)
- Marco C Matthies
  - Centre for Bioinformatics, University of Hamburg, Bundesstr. 43, 20146 Hamburg, Germany
- Ryan Krueger
  - Department of Applied Mathematics, Harvard University, 29 Oxford St, Cambridge, MA 02138, USA
- Andrew E Torda
  - Centre for Bioinformatics, University of Hamburg, Bundesstr. 43, 20146 Hamburg, Germany
- Max Ward
  - Department of Computer Science and Software Engineering, The University of Western Australia, 241, 35 Stirling Hwy, Crawley, WA 6009, Australia
|
24
|
Di Mauro V, Lauta FC, Modica J, Appleton SL, De Franciscis V, Catalucci D. Diagnostic and Therapeutic Aptamers: A Promising Pathway to Improved Cardiovascular Disease Management. JACC Basic Transl Sci 2024; 9:260-277. [PMID: 38510714 PMCID: PMC10950404 DOI: 10.1016/j.jacbts.2023.06.013] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 05/19/2023] [Accepted: 06/29/2023] [Indexed: 03/22/2024]
Abstract
Despite advances in care, cardiovascular diseases remain the leading cause of death worldwide. As a result, identifying suitable biomarkers for early diagnosis and improving therapeutic and diagnostic strategies is crucial. Because of their significant advantages over other therapeutic approaches, nucleic acid-based therapies, particularly aptamers, are gaining increased attention. Aptamers are innovative synthetic polymers or oligomers of single-stranded DNA (ssDNA) or RNA molecules that can form 3-dimensional structures and thus interact with their targets with high specificity and affinity. Furthermore, they outperform classical protein-based antibodies in terms of in vitro selection, production, ease of modification and conjugation, high stability, low immunogenicity, and suitability for nanoparticle functionalization for targeted drug delivery. This work aims to review the advances made in the aptamer field in biomarker detection, diagnosis, imaging, and targeted therapy, which highlight their huge potential in the management of cardiovascular diseases.
Affiliation(s)
- Vittoria Di Mauro
  - Veneto Institute of Molecular Medicine, Padua, Italy
  - Institute of Genetic and Biomedical Research, Milan, Italy
  - Humanitas Cardio Center, IRCCS Humanitas Research Hospital, Rozzano, Milan, Italy
- Jessica Modica
  - Institute of Genetic and Biomedical Research, Milan, Italy
  - Humanitas Cardio Center, IRCCS Humanitas Research Hospital, Rozzano, Milan, Italy
- Silvia Lucia Appleton
  - Institute of Genetic and Biomedical Research, Milan, Italy
  - Humanitas Cardio Center, IRCCS Humanitas Research Hospital, Rozzano, Milan, Italy
- Daniele Catalucci
  - Institute of Genetic and Biomedical Research, Milan, Italy
  - Humanitas Cardio Center, IRCCS Humanitas Research Hospital, Rozzano, Milan, Italy
|
25
|
Ma Z, Xu W, Li S, Chen S, Yang Y, Li Z, Xing T, Zhao Z, Hou D, Li Q, Lu Z, Zhang H. Effect of RpoS on the survival, induction, resuscitation, morphology, and gene expression of viable but non-culturable Salmonella Enteritidis in powdered infant formula. Int J Food Microbiol 2024; 410:110463. [PMID: 38039925 DOI: 10.1016/j.ijfoodmicro.2023.110463] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2023] [Revised: 10/19/2023] [Accepted: 10/27/2023] [Indexed: 12/03/2023]
Abstract
Involvement of the transcriptional regulator RpoS in the persistence of the viable but non-culturable (VBNC) state has been demonstrated in several species of bacteria. This study investigated the role of RpoS in the formation and resuscitation of the VBNC state in Salmonella enterica serovar Enteritidis CICC 21482 by measuring bacterial survival, morphology, physiological characteristics, and gene expression in wild-type (WT) and rpoS-deletion (ΔrpoS) strains during long-term storage in powdered infant formula (PIF). The ΔrpoS strain was produced by allelic exchange using a suicide plasmid. Bacteria were inoculated into PIF and stored for 635 days. Survival, morphology, intracellular reactive oxygen species (ROS) levels and intercellular quorum sensing autoinducer-2 (AI-2) contents were regularly measured. Resuscitation assays were conducted after obtaining VBNC cells. Gene expression was measured using real-time quantitative polymerase chain reaction (qPCR). The results showed that RpoS and low temperature conditions were associated with enhanced culturability and recoverability of Salmonella Enteritidis after desiccation storage in low water activity (aw) PIF. In addition, the synthesis of intracellular ROS and intercellular quorum sensing AI-2 was regulated by RpoS, inducing the formation and resuscitation of VBNC cells. Expression of soxS, katG and relA was found to be strongly associated with RpoS. Lacking RpoS, the ΔrpoS strain could not normally synthesize SoxS, catalase and (p)ppGpp, resulting in its early shift to the VBNC state. This study elucidates the role of rpoS in desiccation stress and the formation and resuscitation mechanism of VBNC cells under desiccation stress. It serves as a basis for preventing and controlling the recovery of pathogenic bacteria in the VBNC state in low aw foods.
Affiliation(s)
- Zhuolin Ma
- College of Biological and Pharmaceutical Science, Guangdong University of Technology, No. 100 Waihuan Xi Road, Panyu District, Guangzhou 510006, China
- Weiying Xu
- College of Biological and Pharmaceutical Science, Guangdong University of Technology, No. 100 Waihuan Xi Road, Panyu District, Guangzhou 510006, China
- Shaoting Li
- College of Biological and Pharmaceutical Science, Guangdong University of Technology, No. 100 Waihuan Xi Road, Panyu District, Guangzhou 510006, China
- Siyi Chen
- College of Biological and Pharmaceutical Science, Guangdong University of Technology, No. 100 Waihuan Xi Road, Panyu District, Guangzhou 510006, China
- Yuheng Yang
- College of Biological and Pharmaceutical Science, Guangdong University of Technology, No. 100 Waihuan Xi Road, Panyu District, Guangzhou 510006, China
- Zefeng Li
- College of Biological and Pharmaceutical Science, Guangdong University of Technology, No. 100 Waihuan Xi Road, Panyu District, Guangzhou 510006, China
- Tong Xing
- College of Biological and Pharmaceutical Science, Guangdong University of Technology, No. 100 Waihuan Xi Road, Panyu District, Guangzhou 510006, China
- Zepeng Zhao
- College of Biological and Pharmaceutical Science, Guangdong University of Technology, No. 100 Waihuan Xi Road, Panyu District, Guangzhou 510006, China
- Dongping Hou
- College of Biological and Pharmaceutical Science, Guangdong University of Technology, No. 100 Waihuan Xi Road, Panyu District, Guangzhou 510006, China
- Qingqing Li
- College of Biological and Pharmaceutical Science, Guangdong University of Technology, No. 100 Waihuan Xi Road, Panyu District, Guangzhou 510006, China
- Ziying Lu
- College of Biological and Pharmaceutical Science, Guangdong University of Technology, No. 100 Waihuan Xi Road, Panyu District, Guangzhou 510006, China
- Hongmei Zhang
- College of Biological and Pharmaceutical Science, Guangdong University of Technology, No. 100 Waihuan Xi Road, Panyu District, Guangzhou 510006, China
|
26
|
Dutta N, Sarzynska J, Deb I, Lahiri A. Predicting nearest neighbor free energies of modified RNA with LIE: results for pseudouridine and N1-methylpseudouridine within RNA duplexes. Phys Chem Chem Phys 2024; 26:992-999. [PMID: 38088148 DOI: 10.1039/d3cp02442c] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/04/2024]
Abstract
Pseudouridine (Ψ) and N1-methylpseudouridine (m1Ψ) are among the key modifications in the field of mRNA therapeutics and vaccine research. The accuracy of the design and development of therapeutic RNAs containing such modifications depends on the accuracy of the secondary structure prediction, which in turn depends on the nearest neighbor (NN) thermodynamic parameters for the standard and modified residues. Here, we propose a simple approach based on molecular dynamics simulations and linear interaction energy (LIE) approximation that is able to predict the NN free energy parameters for U-A, Ψ-A and m1Ψ-A pairs in reasonable agreement with the recent experimental reports. We report the NN thermodynamic parameters for different U, Ψ and m1Ψ base pairs, which might be helpful for a deeper understanding of the effect of these modifications in RNA. The predicted NN free energy parameters in this study are able to closely reproduce the folding free energies of duplexes containing internal Ψ for which the thermodynamic data were available. Additionally, we report the predicted folding free energies for the duplexes containing internal m1Ψ.
Affiliation(s)
- Nivedita Dutta
- Department of Biophysics, Molecular Biology and Bioinformatics, University of Calcutta, 92, Acharya Prafulla Chandra Road, Kolkata 700009, West Bengal, India
- Joanna Sarzynska
- Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, Poznan 61-704, Poland
- Indrajit Deb
- Department of Biophysics, Molecular Biology and Bioinformatics, University of Calcutta, 92, Acharya Prafulla Chandra Road, Kolkata 700009, West Bengal, India
- Ansuman Lahiri
- Department of Biophysics, Molecular Biology and Bioinformatics, University of Calcutta, 92, Acharya Prafulla Chandra Road, Kolkata 700009, West Bengal, India
|
27
|
Kalirad A, Burch CL, Azevedo RBR. Genetic drift promotes and recombination hinders speciation on holey fitness landscapes. PLoS Genet 2024; 20:e1011126. [PMID: 38252672 PMCID: PMC10833538 DOI: 10.1371/journal.pgen.1011126] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2023] [Revised: 02/01/2024] [Accepted: 01/06/2024] [Indexed: 01/24/2024] Open
Abstract
Dobzhansky and Muller proposed a general mechanism through which microevolution, the substitution of alleles within populations, can cause the evolution of reproductive isolation between populations and, therefore, macroevolution. As allopatric populations diverge, many combinations of alleles differing between them have not been tested by natural selection and may thus be incompatible. Such genetic incompatibilities often cause low fitness in hybrids between species. Furthermore, the number of incompatibilities grows with the genetic distance between diverging populations. However, what determines the rate and pattern of accumulation of incompatibilities remains unclear. We investigate this question by simulating evolution on holey fitness landscapes on which genetic incompatibilities can be identified unambiguously. We find that genetic incompatibilities accumulate more slowly among genetically robust populations and identify two determinants of the accumulation rate: recombination rate and population size. In large populations with abundant genetic variation, recombination selects for increased genetic robustness and, consequently, incompatibilities accumulate more slowly. In small populations, genetic drift interferes with this process and promotes the accumulation of genetic incompatibilities. Our results suggest a novel mechanism by which genetic drift promotes and recombination hinders speciation.
Affiliation(s)
- Ata Kalirad
- Department of Biology and Biochemistry, University of Houston, Houston, Texas, United States of America
- Department for Integrative Evolutionary Biology, Max Planck Institute for Biology Tübingen, Tübingen, Germany
- Christina L. Burch
- Department of Biology, University of North Carolina, Chapel Hill, North Carolina, United States of America
- Ricardo B. R. Azevedo
- Department of Biology and Biochemistry, University of Houston, Houston, Texas, United States of America
|
28
|
Kühnl F, Stadler PF, Findeiß S. Assessing the Quality of Cotranscriptional Folding Simulations. Methods Mol Biol 2024; 2726:347-376. [PMID: 38780738 DOI: 10.1007/978-1-0716-3519-3_14] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/25/2024]
Abstract
Structural changes in RNAs are an important contributor to controlling gene expression not only at the posttranscriptional stage but also during transcription. A subclass of riboswitches and RNA thermometers located in the 5' region of the primary transcript regulates the downstream functional unit, usually an ORF, through premature termination of transcription. Not only do such elements occur naturally, but they are also attractive devices in synthetic biology. The possibility to design such riboswitches or RNA thermometers is thus of considerable practical interest. Since these functional RNA elements already act during transcription, it is important to model and understand the dynamics of folding and, in particular, the formation of intermediate structures concurrently with transcription. Cotranscriptional folding simulations are therefore an important step to verify the functionality of design constructs before conducting expensive and labor-intensive wet lab experiments. For RNAs, full-fledged molecular dynamics simulations are far beyond practical reach because of both the size of the molecules and the timescales of interest. Even at the simplified level of secondary structures, further approximations are necessary. The BarMap approach is based on representing the secondary structure landscape for each individual transcription step by a coarse-grained representation that only retains a small set of low-energy local minima and the energy barriers between them. The folding dynamics between two transcriptional elongation steps is modeled as a Markov process on this representation. Maps between pairs of consecutive coarse-grained landscapes make it possible to follow the folding process as it changes in response to transcription elongation. In its original implementation, the BarMap software provides a general framework to investigate RNA folding dynamics on temporally changing landscapes. It is, however, difficult to use, in particular for specific scenarios such as cotranscriptional folding. To overcome this limitation, we developed the user-friendly BarMap-QA pipeline described in detail in this contribution. It is illustrated here by an elaborate example that emphasizes the careful monitoring of several quality measures. Using an iterative workflow, a reliable and complete kinetics simulation of a synthetic, transcription-regulating riboswitch is obtained using minimal computational resources. All programs and scripts used in this contribution are free software and available for download as a source distribution for Linux® or as a platform-independent Docker® image including support for Apple macOS® and Microsoft Windows®.
Affiliation(s)
- Felix Kühnl
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Leipzig University, Leipzig, Germany
- Peter F Stadler
- Bioinformatics Group, Department of Computer Science, Interdisciplinary Center of Bioinformatics, German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Competence Center for Scalable Data Services and Solutions, and Leipzig Research Center for Civilization Diseases, Leipzig University, Leipzig, Germany
- Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany
- Institute for Theoretical Chemistry, University of Vienna, Vienna, Austria
- Facultad de Ciencias, Universidad Nacional de Colombia, Bogotá, D.C., Colombia
- Santa Fe Institute, Santa Fe, NM, USA
- Sven Findeiß
- Bioinformatics Group, Department of Computer Science, and Interdisciplinary Center for Bioinformatics, Leipzig University, Leipzig, Germany
|
29
|
Gupta S, Pal D. Utilizing RNA-seq Data to Infer Bacterial Transcription Termination Sites and Validate Predictions. Methods Mol Biol 2024; 2812:345-365. [PMID: 39068372 DOI: 10.1007/978-1-0716-3886-6_19] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/30/2024]
Abstract
The transcription termination process is an important part of gene expression in the cell. It has been studied extensively, but many aspects of the mechanism are not well understood. The widespread availability of RNA-seq data from high-throughput experiments provides a unique opportunity to infer the ends of transcription units genome-wide. These data are available for both the Rho-dependent and Rho-independent termination pathways that drive transcription termination in bacteria. This chapter gives an overview of the current knowledge of Rho-independent transcription termination mechanisms and the prediction approaches currently deployed to infer termination sites. Thereafter, we describe our method that uses cluster hairpins to detect Rho-independent transcription termination sites. These clusters are groups of hairpins that lie <15 bp from each other and are together capable of enforcing the termination process. The idea of a group of hairpins being extensively used for transcription termination is new, and results show that at least 52% of the total cases are of this type, while in the remaining cases, a single strong hairpin is capable of driving transcription termination. The reads derived from the RNA-seq data for the corresponding bacteria have been used to validate the predicted sites. The predictions that match these RNA-seq derived sites have higher confidence, and we find that almost 98% of the predicted sites, including alternate termination sites, match the RNA-seq data. We discuss the features of predicted hairpins in detail for a better understanding of the Rho-independent transcription termination mechanism in bacteria. We also explain how users can use the tools we developed to predict transcription terminators and design their experiments through genome-level visualization of the transcription termination sites from the precomputed INTERPIN database.
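The grouping rule in this abstract (hairpins lying <15 bp from each other form one cluster) can be sketched as follows. This is only an illustration, not the authors' tool: the function name `cluster_hairpins`, the `(start, end)` coordinate representation, and the choice to measure distance from the end of one hairpin to the start of the next are all assumptions.

```python
def cluster_hairpins(hairpins, max_gap=15):
    """Group hairpins whose gap to the previous hairpin is below max_gap bp.

    hairpins: iterable of (start, end) genome coordinates (hypothetical format).
    The gap is measured from the end of one hairpin to the start of the next,
    which is an assumption made for this sketch.
    """
    clusters = []
    for start, end in sorted(hairpins):
        if clusters and start - clusters[-1][-1][1] < max_gap:
            # Close enough to the last hairpin of the current cluster: extend it.
            clusters[-1].append((start, end))
        else:
            # Too far away (or first hairpin): open a new cluster.
            clusters.append([(start, end)])
    return clusters


# Made-up coordinates: the first two hairpins are 10 bp apart, the third is isolated.
example = [(100, 120), (130, 150), (200, 220)]
for cluster in cluster_hairpins(example):
    kind = "cluster" if len(cluster) > 1 else "single hairpin"
    print(kind, cluster)
```

Under this reading, a cluster with a single member corresponds to the "single strong hairpin" case mentioned in the abstract.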
Affiliation(s)
- Swati Gupta
- Department of Computational and Data Sciences, Indian Institute of Science, Bengaluru, Karnataka, India
- Debnath Pal
- Department of Computational and Data Sciences, Indian Institute of Science, Bengaluru, Karnataka, India
|
30
|
Greenwood T, Heitsch CE. How Parameters Influence SHAPE-Directed Predictions. Methods Mol Biol 2024; 2726:105-124. [PMID: 38780729 DOI: 10.1007/978-1-0716-3519-3_5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/25/2024]
Abstract
The structure of an RNA sequence encodes information about its biological function. Dynamic programming algorithms are often used to predict the conformation of an RNA molecule from its sequence alone, and adding experimental data as auxiliary information improves prediction accuracy. This auxiliary data is typically incorporated into the nearest neighbor thermodynamic model by converting the data into pseudoenergies. Here, we look at how much of the space of possible structures auxiliary data allows prediction methods to explore. We find that for a large class of RNA sequences, auxiliary data shifts the predictions significantly. Additionally, we find that predictions are highly sensitive to the parameters that define the auxiliary data pseudoenergies. In fact, the parameter space can typically be partitioned into regions where different structural predictions predominate.
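The abstract does not state how reactivities are converted into pseudoenergies. A minimal sketch, assuming the widely used log-linear form ΔG_SHAPE(i) = m ln(reactivity_i + 1) + b with slope m and intercept b as the tunable parameters, shows why predictions depend on those two values. The function name, the handling of missing data, and the example reactivities are assumptions for illustration.

```python
import math


def shape_pseudoenergy(reactivity, m=2.6, b=-0.8):
    """Per-nucleotide pseudoenergy (kcal/mol) from a SHAPE reactivity.

    Assumed log-linear form: m * ln(reactivity + 1) + b.
    The defaults are commonly quoted slope/intercept values; treat them
    as illustrative rather than as the chapter's settings.
    """
    if reactivity is None:
        # No data for this nucleotide: add no pseudoenergy (assumed convention).
        return 0.0
    # Clamp negative reactivities to zero so the logarithm stays defined.
    return m * math.log(max(reactivity, 0.0) + 1.0) + b


# Made-up reactivities; None marks a nucleotide without data.
reactivities = [0.0, 0.1, 0.5, 2.0, None]
for m, b in [(2.6, -0.8), (1.8, -0.6)]:
    terms = [round(shape_pseudoenergy(r, m, b), 2) for r in reactivities]
    print(f"m={m}, b={b}: {terms}")
```

Low reactivities yield a negative term (a bonus for pairing) and high reactivities a positive one (a penalty), so changing m or b moves the crossover point between the two.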
|
31
|
Ferreira I, Weber G. VarGibbs Usage in the Optimization of Nearest-Neighbor Parameters and Prediction of Melting Temperature of RNA Duplexes. Methods Mol Biol 2024; 2726:15-43. [PMID: 38780726 DOI: 10.1007/978-1-0716-3519-3_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/25/2024]
Abstract
The nearest-neighbor (NN) model is a general tool for the evaluation of oligonucleotide thermodynamic stability. It is primarily used for the prediction of melting temperatures but has also found use in RNA secondary structure prediction and theoretical models of hybridization kinetics. One of the key problems is to obtain the NN parameters from melting temperatures, and VarGibbs was designed to obtain those parameters directly from melting temperatures. Here we describe the basic workflow from RNA melting temperatures to NN parameters with the use of VarGibbs. We start with a brief review of the basic concepts of RNA hybridization and of the NN model and then show how to prepare the data files, run the parameter optimization, and interpret the results.
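As a rough illustration of the forward direction of the NN model (parameters to melting temperature), the sketch below sums per-step enthalpy and entropy terms along a duplex and applies the standard two-state formula Tm = ΔH / (ΔS + R ln(Ct/4)) for non-self-complementary strands. It does not use or reproduce VarGibbs: every number in `NN_PARAMS` and `INITIATION` is a hypothetical placeholder, and the function names are invented for this example.

```python
import math

R = 1.987  # gas constant in cal/(mol*K)
COMPLEMENT = str.maketrans("ACGU", "UGCA")

# HYPOTHETICAL parameters: (dH in kcal/mol, dS in cal/(mol*K)) per stacked pair,
# keyed by the 5'->3' dinucleotide step of one strand. Placeholder numbers only.
NN_PARAMS = {
    "AA": (-7.0, -19.0), "AC": (-11.0, -29.0), "AG": (-10.0, -27.0),
    "AU": (-9.0, -26.0), "CA": (-10.0, -27.0), "CC": (-13.0, -33.0),
    "CG": (-11.0, -27.0), "GA": (-12.0, -32.0), "GC": (-15.0, -37.0),
    "UA": (-8.0, -23.0),
}
INITIATION = (3.0, -10.0)  # hypothetical duplex initiation term


def canonical(step):
    """Map a step and its reverse complement to one shared key."""
    reverse_complement = step.translate(COMPLEMENT)[::-1]
    return min(step, reverse_complement)


def duplex_thermo(seq):
    """Sum initiation and nearest-neighbor terms along one strand of a duplex."""
    dh, ds = INITIATION
    for i in range(len(seq) - 1):
        step_dh, step_ds = NN_PARAMS[canonical(seq[i:i + 2])]
        dh += step_dh
        ds += step_ds
    return dh, ds


def melting_temperature(dh_kcal, ds_cal, total_strand_conc):
    """Two-state Tm in kelvin for a non-self-complementary duplex."""
    return 1000.0 * dh_kcal / (ds_cal + R * math.log(total_strand_conc / 4.0))


dh, ds = duplex_thermo("GCAUGGC")
print(dh, ds, melting_temperature(dh, ds, 1e-4) - 273.15)
```

Fitting goes the other way: given measured melting temperatures for many duplexes, an optimizer adjusts the entries of a table like `NN_PARAMS` until the predicted values match the measurements.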
Affiliation(s)
- Izabela Ferreira
- Departamento de Física, Universidade Federal de Minas Gerais, Belo Horizonte-MG, Brazil
- Gerald Weber
- Departamento de Física, Universidade Federal de Minas Gerais, Belo Horizonte-MG, Brazil
|
32
|
Tieng FYF, Abdullah-Zawawi MR, Md Shahri NAA, Mohamed-Hussein ZA, Lee LH, Mutalib NSA. A Hitchhiker's guide to RNA-RNA structure and interaction prediction tools. Brief Bioinform 2023; 25:bbad421. [PMID: 38040490 PMCID: PMC10753535 DOI: 10.1093/bib/bbad421] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/31/2023] [Revised: 10/16/2023] [Accepted: 10/26/2023] [Indexed: 12/03/2023] Open
Abstract
RNA biology has risen to prominence after the remarkable discovery of the diverse functions of noncoding RNA (ncRNA). Most untranslated transcripts exert their regulatory functions by forming RNA-RNA complexes via base pairing with complementary sequences in other RNAs. The interplay between RNAs is essential, as it has various functional roles in human cells, including genetic translation, RNA splicing, editing, ribosomal RNA maturation, RNA degradation and the regulation of metabolic pathways/riboswitches. Moreover, the pervasive transcription of the human genome allows for the discovery of novel genomic functions via RNA interactome investigation. The advancement of experimental procedures has resulted in an explosion of documented data, necessitating the development of efficient and precise computational tools and algorithms. This review provides an extensive update on RNA-RNA interaction (RRI) analysis via thermodynamic- and comparative-based RNA secondary structure prediction (RSP) and RNA-RNA interaction prediction (RIP) tools and their general functions. We also highlight the current knowledge of RRIs and the limitations of RNA interactome mapping via experimental data. Then, the gap between RSP and RIP, the importance of RNA homologues, the relationship between pseudoknots, and RNA folding thermodynamics are discussed. It is hoped that these emerging prediction tools will deepen the understanding of RNA-associated interactions in human diseases and hasten treatment processes.
Affiliation(s)
- Francis Yew Fu Tieng
- UKM Medical Molecular Biology Institute (UMBI), Universiti Kebangsaan Malaysia (UKM), Kuala Lumpur 56000, Malaysia
- Nur Alyaa Afifah Md Shahri
- UKM Medical Molecular Biology Institute (UMBI), Universiti Kebangsaan Malaysia (UKM), Kuala Lumpur 56000, Malaysia
- Zeti-Azura Mohamed-Hussein
- Institute of Systems Biology (INBIOSIS), UKM, Selangor 43600, Malaysia
- Department of Applied Physics, Faculty of Science and Technology, UKM, Selangor 43600, Malaysia
- Learn-Han Lee
- Sunway Microbiomics Centre, School of Medical and Life Sciences, Sunway University, Sunway City 47500, Malaysia
- Novel Bacteria and Drug Discovery Research Group, Microbiome and Bioresource Research Strength, Jeffrey Cheah School of Medicine and Health Sciences, Monash University of Malaysia, Selangor 47500, Malaysia
- Nurul-Syakima Ab Mutalib
- UKM Medical Molecular Biology Institute (UMBI), Universiti Kebangsaan Malaysia (UKM), Kuala Lumpur 56000, Malaysia
- Novel Bacteria and Drug Discovery Research Group, Microbiome and Bioresource Research Strength, Jeffrey Cheah School of Medicine and Health Sciences, Monash University of Malaysia, Selangor 47500, Malaysia
- Faculty of Health Sciences, UKM, Kuala Lumpur 50300, Malaysia
|
33
|
Ruiz-Ciancio D, Veeramani S, Embree E, Ortman C, Thiel KW, Thiel WH. AptamerRunner: An accessible aptamer structure prediction and clustering algorithm for visualization of selected aptamers. bioRxiv 2023:2023.11.13.566453. [PMID: 38014343 PMCID: PMC10680646 DOI: 10.1101/2023.11.13.566453] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2023]
Abstract
Aptamers are short single-stranded DNA or RNA molecules with high affinity and specificity for targets and are generated using the iterative Systematic Evolution of Ligands by EXponential enrichment (SELEX) process. Next-generation sequencing (NGS) revolutionized aptamer selections by allowing a more comprehensive analysis of SELEX-enriched aptamers as compared to Sanger sequencing. The current challenge with aptamer NGS datasets is identifying a diverse cohort of candidate aptamers with the highest likelihood of successful experimental validation. Herein we present AptamerRunner, an aptamer clustering algorithm that generates visual networks of aptamers that are related by sequence and/or structure. These networks can then be overlaid with ranking data, such as fold enrichment or data from scoring algorithms. The ability to visually integrate data using AptamerRunner represents a significant advancement over existing clustering tools by providing a natural context to depict groups of aptamers from which ranked or scored candidates can be chosen for experimental validation. The inherent flexibility, user-friendly design, and prospects for future enhancements of AptamerRunner have broad-reaching implications for aptamer researchers across a wide range of disciplines.
Affiliation(s)
- Dario Ruiz-Ciancio
- Instituto de Ciencias Biomédicas (ICBM), Facultad de Ciencias Médicas, Universidad Católica de Cuyo, Av. José Ignacio de la Roza 1516, Rivadavia, 5400, San Juan, Argentina
- National Council of Scientific and Technical Research (CONICET), Godoy Cruz 2290, C1425FQB Ciudad Autónoma de Buenos Aires, Argentina
- Cancer Genome Engineering Group, Vall d’Hebron Institute of Oncology (VHIO), Barcelona 08035, Spain
- Suresh Veeramani
- Department of Internal Medicine, University of Iowa, Iowa City, IA 52242, USA
- Holden Comprehensive Cancer Center, University of Iowa, Iowa City, IA, 52242, USA
- Eric Embree
- Carver College of Medicine, University of Iowa, Iowa City, IA 52242, USA
- Chris Ortman
- Institute for Clinical and Translational Science, University of Iowa, Iowa City, IA 52242, USA
- Kristina W. Thiel
- Holden Comprehensive Cancer Center, University of Iowa, Iowa City, IA, 52242, USA
- Department of Obstetrics and Gynecology, University of Iowa, Iowa City, IA 52242, USA
- William H Thiel
- Department of Internal Medicine, University of Iowa, Iowa City, IA 52242, USA
|
34
|
Binet T, Padiolleau-Lefèvre S, Octave S, Avalle B, Maffucci I. Comparative Study of Single-stranded Oligonucleotides Secondary Structure Prediction Tools. BMC Bioinformatics 2023; 24:422. [PMID: 37940855 PMCID: PMC10634105 DOI: 10.1186/s12859-023-05532-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2023] [Accepted: 10/13/2023] [Indexed: 11/10/2023] Open
Abstract
BACKGROUND: Single-stranded nucleic acids (ssNAs) have important biological roles and a high biotechnological potential linked to their ability to bind to numerous molecular targets. This depends on the different spatial conformations they can assume. The first level of ssNA spatial organisation corresponds to their base-pairing pattern, i.e. their secondary structure. Many computational tools have been developed to predict ssNA secondary structures, making the choice of the appropriate tool difficult, and an up-to-date guide on the limits and applicability of current secondary structure prediction tools is missing. Therefore, we performed a comparative study of the performance of 9 freely available tools (mfold, RNAfold, CentroidFold, CONTRAfold, MC-Fold, LinearFold, UFold, SPOT-RNA, and MXfold2) on a dataset of 538 ssNAs with known experimental secondary structure.
RESULTS: The minimum free energy-based tools, namely mfold and RNAfold, and some tools based on artificial intelligence, namely CONTRAfold and MXfold2, provided the best results, with [Formula: see text] of exact predictions, whilst MC-Fold seemed to be the worst performing tool, with only [Formula: see text] of exact predictions. In addition, UFold and SPOT-RNA are the only options for pseudoknot prediction. Including 5-10 suboptimal solutions in the analysis of the mfold and RNAfold results further improved the performance of these tools. Nevertheless, we observed issues in predicting particular motifs, such as multi-way junctions and mini-dumbbells, and ssNAs whose structure has been determined in complex with a protein. In addition, our benchmark shows that further effort is needed for ssDNA secondary structure prediction.
CONCLUSIONS: In general, mfold, RNAfold, and MXfold2 currently seem to be the best choice for ssNA secondary structure prediction, although they still show some limits linked to specific structural motifs. Nevertheless, current trends suggest that artificial intelligence has a high potential to overcome these remaining issues; for example, the recently developed UFold and SPOT-RNA have a high success rate in predicting pseudoknots.
Affiliation(s)
- Thomas Binet
- Université de technologie de Compiègne, UPJV, CNRS, Enzyme and Cell Engineering, Centre de recherche Royallieu - CS 60 319, 60203, Compiègne Cedex, France
- Séverine Padiolleau-Lefèvre
- Université de technologie de Compiègne, UPJV, CNRS, Enzyme and Cell Engineering, Centre de recherche Royallieu - CS 60 319, 60203, Compiègne Cedex, France
- Stéphane Octave
- Université de technologie de Compiègne, UPJV, CNRS, Enzyme and Cell Engineering, Centre de recherche Royallieu - CS 60 319, 60203, Compiègne Cedex, France
- Bérangère Avalle
- Université de technologie de Compiègne, UPJV, CNRS, Enzyme and Cell Engineering, Centre de recherche Royallieu - CS 60 319, 60203, Compiègne Cedex, France
- Irene Maffucci
- Université de technologie de Compiègne, UPJV, CNRS, Enzyme and Cell Engineering, Centre de recherche Royallieu - CS 60 319, 60203, Compiègne Cedex, France
35
|
Dong YW. Roles of multi-level temperature-adaptive responses and microhabitat variation in establishing distributions of intertidal species. J Exp Biol 2023; 226:jeb245745. [PMID: 37909420 DOI: 10.1242/jeb.245745] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2023]
Abstract
How intertidal species survive their harsh environment and how best to evaluate and forecast range shifts in species distribution are two important and closely related questions for intertidal ecologists and global change biologists. Adaptive variation in responses of organisms to environmental change across all levels of biological organization - from behavior to molecular systems - is of key importance in setting distribution patterns, yet studies often neglect the interactions of diverse types of biological variation (e.g. differences in thermal optima owing to genetic and acclimation-induced effects) with environmental variation, notably at the scale of microhabitats. Intertidal species have to cope with extreme and frequently changing thermal stress, and have shown high variation in thermal sensitivities and adaptive responses at different levels of biological organization. Here, I review the physiological and biochemical adaptations of intertidal species to environmental temperature on multiple spatial and temporal scales. With fine-scale datasets for the thermal limits of individuals and for environmental temperature variation at the microhabitat scale, we can map the thermal sensitivity for each individual in different microhabitats, and then scale up the thermal sensitivity analysis to the population level and, finally, to the species level by incorporating physiological traits into species distribution models. These more refined mechanistic models that include consideration of physiological variations have higher predictive power than models that neglect these variations, and they will be crucial to answering the questions posed above concerning adaptive mechanisms and the roles they play in governing distribution patterns in a rapidly changing world.
Affiliation(s)
- Yun-Wei Dong
- Ministry Key Laboratory of Mariculture, Fisheries College, Ocean University of China, Qingdao 266001, China
|
36
|
Ruggiero V, Fagioli C, de Pretis S, Di Carlo V, Landsberger N, Zacchetti D. Complex CDKL5 translational regulation and its potential role in CDKL5 deficiency disorder. Front Cell Neurosci 2023; 17:1231493. [PMID: 37964795 PMCID: PMC10642286 DOI: 10.3389/fncel.2023.1231493] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2023] [Accepted: 10/10/2023] [Indexed: 11/16/2023] Open
Abstract
CDKL5 is a kinase with relevant functions in correct neuronal development and in the shaping of synapses. A decrease in its expression or activity leads to a severe neurodevelopmental condition known as CDKL5 deficiency disorder (CDD). CDD arises from CDKL5 mutations that lie in the coding region of the gene. However, the identification of a SNP in the CDKL5 5'UTR in a patient with symptoms consistent with CDD, together with the complexity of the CDKL5 transcript leader, points toward a relevant translational regulation of CDKL5 expression with important consequences in physiological processes as well as in the pathogenesis of CDD. We performed a bioinformatics and molecular analysis of the 5'UTR of CDKL5 to identify translational regulatory features. We propose an important role for structural cis-acting elements, with the involvement of the eukaryotic translational initiation factor eIF4B. By evaluating both cap-dependent and cap-independent translation initiation, we suggest the presence of an IRES supporting the translation of CDKL5 mRNA and propose a pathogenic effect of the C>T -189 SNP in decreasing the translation of the downstream protein.
Affiliation(s)
- Valeria Ruggiero
- Vita-Salute San Raffaele University, Milan, Italy
- Division of Neuroscience, IRCCS San Raffaele Scientific Institute, Milan, Italy
| | - Claudio Fagioli
- Vita-Salute San Raffaele University, Milan, Italy
- Division of Neuroscience, IRCCS San Raffaele Scientific Institute, Milan, Italy
| | - Stefano de Pretis
- Vita-Salute San Raffaele University, Milan, Italy
- Division of Neuroscience, IRCCS San Raffaele Scientific Institute, Milan, Italy
| | - Valerio Di Carlo
- Department of Medical Biotechnology and Translational Medicine, University of Milan, Segrate, Italy
| | - Nicoletta Landsberger
- Division of Neuroscience, IRCCS San Raffaele Scientific Institute, Milan, Italy
- Department of Medical Biotechnology and Translational Medicine, University of Milan, Segrate, Italy
| | - Daniele Zacchetti
- Vita-Salute San Raffaele University, Milan, Italy
- Division of Neuroscience, IRCCS San Raffaele Scientific Institute, Milan, Italy
|
37
|
Zhu YJ, Liao ML, Dong YW. Exploring the adaptability of the secondary structure of mRNA to temperature in intertidal snails based on SHAPE experiments. J Exp Biol 2023; 226:jeb246544. [PMID: 37767692 DOI: 10.1242/jeb.246544] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2023] [Accepted: 09/22/2023] [Indexed: 09/29/2023]
Abstract
RNA-based thermal regulation is an important strategy for organisms to cope with temperature changes. Inhabiting the intertidal rocky shore, a key interface of the ocean, atmosphere and terrestrial environments, intertidal species have developed variable thermal adaptation mechanisms; however, adaptations at the RNA level remain largely uninvestigated. To examine the relationship between mRNA structural stability and species distribution, in the present study, the secondary structure of cytosolic malate dehydrogenase (cMDH) mRNA of Echinolittorina malaccana, Echinolittorina radiata and Littorina brevicula was determined using selective 2'-hydroxyl acylation analyzed by primer extension (SHAPE), and the change in folding free energy of formation (ΔGfold) was calculated. The results showed that ΔGfold increased as the temperature increased. The difference in ΔGfold (ΔΔGfold) between two specific temperatures (25 versus 0°C, 37 versus 0°C and 57 versus 0°C) differed among the three species, and the ΔΔGfold value of E. malaccana was significantly lower than those of E. radiata and L. brevicula. The number of stems of cMDH mRNA of the snails decreased with increasing temperature, and the breakpoint temperature of E. malaccana was the highest among these. The number of loops was also reduced with increasing temperature, while the length of the loop structure increased accordingly. Consequently, these structural changes can potentially affect the translational efficiency of mRNA. These results imply that there were interspecific differences in the thermal stability of RNA secondary structures in intertidal snails, and these differences may be related to snail distribution.
Affiliation(s)
- Ya-Jie Zhu
- The Key Laboratory of Mariculture, Ministry of Education, Fisheries College, Ocean University of China, Qingdao 266003, PR China
| | - Ming-Ling Liao
- The Key Laboratory of Mariculture, Ministry of Education, Fisheries College, Ocean University of China, Qingdao 266003, PR China
| | - Yun-Wei Dong
- The Key Laboratory of Mariculture, Ministry of Education, Fisheries College, Ocean University of China, Qingdao 266003, PR China
|
38
|
Zhang H, Li S, Dai N, Zhang L, Mathews DH, Huang L. LinearCoFold and LinearCoPartition: linear-time algorithms for secondary structure prediction of interacting RNA molecules. Nucleic Acids Res 2023; 51:e94. [PMID: 37650626 PMCID: PMC10570024 DOI: 10.1093/nar/gkad664] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2022] [Revised: 06/15/2023] [Accepted: 08/17/2023] [Indexed: 09/01/2023] Open
Abstract
Many RNAs function through RNA-RNA interactions. Fast and reliable RNA structure prediction with consideration of RNA-RNA interaction is useful; however, existing tools are either too simplistic or too slow. To address this issue, we present LinearCoFold, which approximates the complete minimum free energy structure of two strands in linear time, and LinearCoPartition, which approximates the cofolding partition function and base pairing probabilities in linear time. LinearCoFold and LinearCoPartition are orders of magnitude faster than RNAcofold. For example, on a sequence pair with combined length of 26,190 nt, LinearCoFold is 86.8× faster than RNAcofold MFE mode, and LinearCoPartition is 642.3× faster than RNAcofold partition function mode. Surprisingly, LinearCoFold and LinearCoPartition's predictions have higher PPV and sensitivity of intermolecular base pairs. Furthermore, we apply LinearCoFold to predict the RNA-RNA interaction between SARS-CoV-2 genomic RNA (gRNA) and human U4 small nuclear RNA (snRNA), which has been experimentally studied, and observe that LinearCoFold's prediction correlates better with the wet lab results than RNAcofold's.
Affiliation(s)
- He Zhang
- Baidu Research, Sunnyvale, CA, USA
- School of Electrical Engineering & Computer Science, Oregon State University, Corvallis, OR, USA
| | - Sizhen Li
- School of Electrical Engineering & Computer Science, Oregon State University, Corvallis, OR, USA
| | - Ning Dai
- School of Electrical Engineering & Computer Science, Oregon State University, Corvallis, OR, USA
| | - Liang Zhang
- School of Electrical Engineering & Computer Science, Oregon State University, Corvallis, OR, USA
| | - David H Mathews
- Department of Biochemistry & Biophysics, Rochester, NY 14642, USA
- Center for RNA Biology, Rochester, NY 14642, USA
- Department of Biostatistics & Computational Biology, University of Rochester Medical Center, Rochester, NY 14642, USA
| | - Liang Huang
- School of Electrical Engineering & Computer Science, Oregon State University, Corvallis, OR, USA
|
39
|
Nakajima M, Smith AD. Counting Distinguishable RNA Secondary Structures. J Comput Biol 2023; 30:1089-1097. [PMID: 37815558 DOI: 10.1089/cmb.2022.0501] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/11/2023] Open
Abstract
RNA secondary structures are essential abstractions for understanding spatial folding behaviors of those macromolecules. Many secondary structure algorithms involve a common dynamic programming setup to exploit the property that secondary structures can be decomposed into substructures. Dirks et al. noted that this setup cannot directly address an issue of distinguishability among secondary structures, which arises for classes of sequences that admit nontrivial symmetry. Circular sequences are among these. We examine the problem of counting distinguishable secondary structures. Drawing from elementary results in group theory, we identify useful subsets of secondary structures. We then extend an algorithm due to Hofacker et al. for computing the sizes of these subsets. This yields a cubic-time algorithm to count distinguishable structures compatible with a given circular sequence. Furthermore, this general approach may be used to solve similar problems for which the RNA structures of interest involve symmetries.
Affiliation(s)
- Masaru Nakajima
- Department of Physics and Astronomy, University of Southern California, Los Angeles, California, USA
| | - Andrew D Smith
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, California, USA
|
40
|
Sutanto K, Turcotte M. Assessing Global-Local Secondary Structure Fingerprints to Classify RNA Sequences With Deep Learning. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:2736-2747. [PMID: 34633933 DOI: 10.1109/tcbb.2021.3118358] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
RNA elements that are transcribed but not translated into proteins are called non-coding RNAs (ncRNAs). They play wide-ranging roles in biological processes and disorders. Just like proteins, their structure is often intimately linked to their function. Many examples have been documented where structure is conserved across taxa despite sequence divergence. Thus, structure is often used to identify function. Specifically, the secondary structure is predicted and ncRNAs with similar structures are assumed to have same or similar functions. However, a strand of RNA can fold into multiple possible structures, and some strands even fold differently in vivo and in vitro. Furthermore, ncRNAs often function as RNA-protein complexes, which can affect structure. Because of these, we hypothesized using one structure per sequence may discard information, possibly resulting in poorer classification accuracy. Therefore, we propose using secondary structure fingerprints, comprising two categories: a higher-level representation derived from RNA-As-Graphs (RAG), and free energy fingerprints based on a curated repertoire of small structural motifs. The fingerprints take into account the difference between global and local structural matches. We also evaluated our deep learning architecture with k-mers. By combining our global-local fingerprints with 6-mer, we achieved an accuracy, precision, and recall of 91.04%, 91.10%, and 91.00%.
|
41
|
Zhang H, Zhang L, Lin A, Xu C, Li Z, Liu K, Liu B, Ma X, Zhao F, Jiang H, Chen C, Shen H, Li H, Mathews DH, Zhang Y, Huang L. Algorithm for optimized mRNA design improves stability and immunogenicity. Nature 2023; 621:396-403. [PMID: 37130545 PMCID: PMC10499610 DOI: 10.1038/s41586-023-06127-z] [Citation(s) in RCA: 68] [Impact Index Per Article: 68.0] [Received: 03/12/2022] [Accepted: 04/25/2023] [Indexed: 05/04/2023]
Abstract
Messenger RNA (mRNA) vaccines are being used to combat the spread of COVID-19 (refs. 1-3), but they still exhibit critical limitations caused by mRNA instability and degradation, which are major obstacles for the storage, distribution and efficacy of the vaccine products (ref. 4). Increasing secondary structure lengthens mRNA half-life, which, together with optimal codons, improves protein expression (ref. 5). Therefore, a principled mRNA design algorithm must optimize both structural stability and codon usage. However, owing to synonymous codons, the mRNA design space is prohibitively large: for example, there are around 2.4 × 10^632 candidate mRNA sequences for the SARS-CoV-2 spike protein. This poses insurmountable computational challenges. Here we provide a simple and unexpected solution using the classical concept of lattice parsing in computational linguistics, where finding the optimal mRNA sequence is analogous to identifying the most likely sentence among similar-sounding alternatives (ref. 6). Our algorithm LinearDesign finds an optimal mRNA design for the spike protein in just 11 minutes, and can concurrently optimize stability and codon usage. LinearDesign substantially improves mRNA half-life and protein expression, and profoundly increases antibody titre by up to 128 times in mice compared to the codon-optimization benchmark on mRNA vaccines for COVID-19 and varicella-zoster virus. This result reveals the great potential of principled mRNA design and enables the exploration of previously unreachable but highly stable and efficient designs. Our work is a timely tool for vaccines and other mRNA-based medicines encoding therapeutic proteins such as monoclonal antibodies and anti-cancer drugs (refs. 7,8).
Affiliation(s)
- He Zhang
- Baidu Research USA, Sunnyvale, CA, USA
- School of EECS, Oregon State University, Corvallis, OR, USA
| | - Liang Zhang
- Baidu Research USA, Sunnyvale, CA, USA
- School of EECS, Oregon State University, Corvallis, OR, USA
- Vaccine Center, School of Basic Medicine and Clinical Pharmacy, China Pharmaceutical University, Nanjing, China
| | - Ang Lin
- StemiRNA Therapeutics, Shanghai, China
- Vaccine Center, School of Basic Medicine and Clinical Pharmacy, China Pharmaceutical University, Nanjing, China
| | | | - Ziyu Li
- Baidu Research USA, Sunnyvale, CA, USA
| | - Kaibo Liu
- Baidu Research USA, Sunnyvale, CA, USA
- School of EECS, Oregon State University, Corvallis, OR, USA
| | - Boxiang Liu
- Baidu Research USA, Sunnyvale, CA, USA
- Department of Pharmacy, National University of Singapore, Singapore, Singapore
| | | | | | | | | | | | | | - David H Mathews
- Department of Biochemistry and Biophysics, University of Rochester Medical Center, Rochester, NY, USA.
- Center for RNA Biology, University of Rochester Medical Center, Rochester, NY, USA.
- Department of Biostatistics and Computational Biology, University of Rochester Medical Center, Rochester, NY, USA.
- Coderna.ai, Inc., Sunnyvale, CA, USA.
| | - Yujian Zhang
- StemiRNA Therapeutics, Shanghai, China.
- , Gaithersburg, MD, USA.
| | - Liang Huang
- Baidu Research USA, Sunnyvale, CA, USA.
- School of EECS, Oregon State University, Corvallis, OR, USA.
- Coderna.ai, Inc., Sunnyvale, CA, USA.
|
42
|
Margasyuk S, Kalinina M, Petrova M, Skvortsov D, Cao C, Pervouchine DD. RNA in situ conformation sequencing reveals novel long-range RNA structures with impact on splicing. RNA (New York, N.Y.) 2023; 29:1423-1436. [PMID: 37295923 PMCID: PMC10573301 DOI: 10.1261/rna.079508.122] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/07/2022] [Accepted: 05/22/2023] [Indexed: 06/12/2023]
Abstract
Over recent years, long-range RNA structure has emerged as a factor that is fundamental to alternative splicing regulation. An increasing number of human disorders are now being associated with splicing defects; hence it is essential to develop methods that assess long-range RNA structure experimentally. RNA in situ conformation sequencing (RIC-seq) is a method that recapitulates RNA structure within physiological RNA-protein complexes. In this work, we juxtapose pairs of conserved complementary regions (PCCRs) that were predicted in silico with the results of RIC-seq experiments conducted in seven human cell lines. We show statistically that RIC-seq support of PCCRs correlates with their properties, such as equilibrium free energy, presence of compensatory substitutions, and occurrence of A-to-I RNA editing sites and forked eCLIP peaks. Exons enclosed in PCCRs that are supported by RIC-seq tend to have weaker splice sites and lower inclusion rates, which is indicative of post-transcriptional splicing regulation mediated by RNA structure. Based on these findings, we prioritize PCCRs according to their RIC-seq support and show, using antisense nucleotides and minigene mutagenesis, that PCCRs in two disease-associated human genes, PHF20L1 and CASK, and also PCCRs in their murine orthologs, impact alternative splicing. In sum, we demonstrate how RIC-seq experiments can be used to discover functional long-range RNA structures, and particularly those that regulate alternative splicing.
Affiliation(s)
- Sergey Margasyuk
- Skolkovo Institute of Science and Technology, Moscow 143026, Russia
| | - Marina Kalinina
- Skolkovo Institute of Science and Technology, Moscow 143026, Russia
| | - Marina Petrova
- Skolkovo Institute of Science and Technology, Moscow 143026, Russia
| | - Dmitry Skvortsov
- Skolkovo Institute of Science and Technology, Moscow 143026, Russia
- Moscow State University, Faculty of Chemistry, Moscow 119991, Russia
| | - Changchang Cao
- Key Laboratory of RNA Biology, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
|
43
|
Mathez G, Cagno V. Small Molecules Targeting Viral RNA. Int J Mol Sci 2023; 24:13500. [PMID: 37686306 PMCID: PMC10487773 DOI: 10.3390/ijms241713500] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2023] [Revised: 08/25/2023] [Accepted: 08/28/2023] [Indexed: 09/10/2023] Open
Abstract
The majority of antivirals available target viral proteins; however, RNA is emerging as a new and promising antiviral target due to the presence of highly structured RNA in viral genomes fundamental for their replication cycle. Here, we discuss methods for the identification of RNA-targeting compounds, starting from the determination of RNA structures either from purified RNA or in living cells, followed by in silico screening on RNA and phenotypic assays to evaluate viral inhibition. Moreover, we review the small molecules known to target the programmed ribosomal frameshifting element of SARS-CoV-2, the internal ribosomal entry site of different viruses, and RNA elements of HIV.
Affiliation(s)
| | - Valeria Cagno
- Institute of Microbiology, University Hospital of Lausanne, University of Lausanne, 1011 Lausanne, Switzerland
|
44
|
Lebo KJ, Zappulla DC. Inverse-Folding Design of Yeast Telomerase RNA Increases Activity In Vitro. Noncoding RNA 2023; 9:51. [PMID: 37736897 PMCID: PMC10514824 DOI: 10.3390/ncrna9050051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2023] [Revised: 08/19/2023] [Accepted: 08/24/2023] [Indexed: 09/23/2023] Open
Abstract
Saccharomyces cerevisiae telomerase RNA, TLC1, is an 1157 nt non-coding RNA that functions as both a template for DNA synthesis and a flexible scaffold for telomerase RNP holoenzyme protein subunits. The tractable budding yeast system has provided landmark discoveries about telomere biology in vivo, but yeast telomerase research has been hampered by the fact that the large TLC1 RNA subunit does not support robust telomerase activity in vitro. In contrast, 155-500 nt miniaturized TLC1 alleles comprising the catalytic core domain and lacking the RNA's long arms do reconstitute robust activity. We hypothesized that full-length TLC1 is prone to misfolding in vitro. To create a full-length yeast telomerase RNA, predicted to fold into its biologically relevant structure, we took an inverse RNA-folding approach, changing 59 nucleotides predicted to increase the energetic favorability of folding into the modeled native structure based on the p-num feature of Mfold software. The sequence changes lowered the predicted ∆G of this "determined-arm" allele, DA-TLC1, by 61 kcal/mol (-19%) compared to wild-type. We tested DA-TLC1 for reconstituted activity and found it to be ~5-fold more robust than wild-type TLC1, suggesting that the inverse-folding design indeed improved folding in vitro into a catalytically active conformation. We also tested if DA-TLC1 functions in vivo, discovering that it complements a tlc1∆ strain, allowing cells to avoid senescence and maintain telomeres of nearly wild-type length. However, all inverse-designed RNAs that we tested had reduced abundance in vivo. In particular, inverse-designing nearly all of the Ku arm caused a profound reduction in telomerase RNA abundance in the cell and very short telomeres. Overall, these results show that the inverse design of S. cerevisiae telomerase RNA increases activity in vitro, while reducing abundance in vivo. 
This study provides a biochemically and biologically tested approach to inverse-design RNAs using Mfold that could be useful for controlling RNA structure in basic research and biomedicine.
Affiliation(s)
- Kevin J. Lebo
- Department of Biology, Johns Hopkins University, Baltimore, MD 21218, USA
| | - David C. Zappulla
- Department of Biology, Johns Hopkins University, Baltimore, MD 21218, USA
- Department of Biological Sciences, Lehigh University, Bethlehem, PA 18015, USA
|
45
|
Hidalgo M, Ramos C, Zolla G. Analysis of lncRNAs in Lupinus mutabilis (Tarwi) and Their Potential Role in Drought Response. Noncoding RNA 2023; 9:48. [PMID: 37736894 PMCID: PMC10514842 DOI: 10.3390/ncrna9050048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2023] [Revised: 08/01/2023] [Accepted: 08/16/2023] [Indexed: 09/23/2023] Open
Abstract
Lupinus mutabilis is a legume with high agronomic potential and available transcriptomic data for which lncRNAs have not been studied. Therefore, our objective was to identify, characterize, and validate the drought-responsive lncRNAs in L. mutabilis. To achieve this, we used a multilevel approach based on lncRNA prediction, annotation, subcellular location, thermodynamic characterization, structural conservation, and validation. Thus, 590 lncRNAs were identified by at least two algorithms of lncRNA identification. Annotation with the PLncDB database showed 571 lncRNAs unique to tarwi and 19 lncRNAs with homology in 28 botanical families including Solanaceae (19), Fabaceae (17), Brassicaceae (17), Rutaceae (17), Rosaceae (16), and Malvaceae (16), among others. In total, 12 lncRNAs had homology in more than 40 species. A total of 67% of lncRNAs were located in the cytoplasm and 33% in exosomes. Thermodynamic characterization of S03 showed a stable secondary structure with -105.67 kcal/mol. This structure included three regions, with a multibranch loop containing a hairpin with a SECIS-like element. Evaluation of the structural conservation by CROSSalign revealed partial similarities between L. mutabilis (S03) and S. lycopersicum (Solyc04r022210.1). RT-PCR validation demonstrated that S03 was upregulated in a drought-tolerant accession of L. mutabilis. Finally, these results highlighted the importance of lncRNAs in tarwi improvement under drought conditions.
Affiliation(s)
- Manuel Hidalgo
- Programa de Estudio de Medicina Humana, Universidad Privada Antenor Orrego, Av. América Sur 3145, Trujillo 13008, Peru; (M.H.); (C.R.)
| | - Cynthia Ramos
- Programa de Estudio de Medicina Humana, Universidad Privada Antenor Orrego, Av. América Sur 3145, Trujillo 13008, Peru; (M.H.); (C.R.)
| | - Gaston Zolla
- Laboratorio de Fisiología Molecular de Plantas del Programa de Cereales y Granos Nativos, Facultad de Agronomía, Universidad Nacional Agraria La Molina, Lima 12, Peru
|
46
|
Oxenfarth A, Kümmerer F, Bottaro S, Schnieders R, Pinter G, Jonker HRA, Fürtig B, Richter C, Blackledge M, Lindorff-Larsen K, Schwalbe H. Integrated NMR/Molecular Dynamics Determination of the Ensemble Conformation of a Thermodynamically Stable CUUG RNA Tetraloop. J Am Chem Soc 2023. [PMID: 37479220 PMCID: PMC10401711 DOI: 10.1021/jacs.3c03578] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/23/2023]
Abstract
Both experimental and theoretical structure determinations of RNAs have remained challenging due to the intrinsic dynamics of RNAs. We report here an integrated nuclear magnetic resonance/molecular dynamics (NMR/MD) structure determination approach to describe the dynamic structure of the CUUG tetraloop. We show that the tetraloop undergoes substantial dynamics, leading to averaging of the experimental data. These dynamics are particularly linked to the temperature-dependent presence of a hydrogen bond within the tetraloop. Interpreting the NMR data by a single structure represents the low-temperature structure well but fails to capture all conformational states occurring at a higher temperature. We integrate MD simulations, starting from structures of CUUG tetraloops within the Protein Data Bank, with an extensive set of NMR data, and provide a structural ensemble that describes the dynamic nature of the tetraloop and the experimental NMR data well. We thus show that one of the most stable and frequently found RNA tetraloops displays substantial dynamics, warranting such an integrated structural approach.
Affiliation(s)
- Andreas Oxenfarth
- Institute for Organic Chemistry and Chemical Biology, Center for Biomolecular Magnetic Resonance (BMRZ), Goethe University Frankfurt am Main, Max-von-Laue-Str. 7, 60438 Frankfurt/Main, Hessen, Germany
| | - Felix Kümmerer
- Structural Biology and NMR Laboratory, Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, DK-2200 Copenhagen N, Denmark
| | - Sandro Bottaro
- Structural Biology and NMR Laboratory, Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, DK-2200 Copenhagen N, Denmark
- IRCCS Humanitas Research Hospital, Department of Biomedical Sciences, Humanitas University, Milan 20089, Italy
| | - Robbin Schnieders
- Institute for Organic Chemistry and Chemical Biology, Center for Biomolecular Magnetic Resonance (BMRZ), Goethe University Frankfurt am Main, Max-von-Laue-Str. 7, 60438 Frankfurt/Main, Hessen, Germany
| | - György Pinter
- Institute for Organic Chemistry and Chemical Biology, Center for Biomolecular Magnetic Resonance (BMRZ), Goethe University Frankfurt am Main, Max-von-Laue-Str. 7, 60438 Frankfurt/Main, Hessen, Germany
| | - Hendrik R A Jonker
- Institute for Organic Chemistry and Chemical Biology, Center for Biomolecular Magnetic Resonance (BMRZ), Goethe University Frankfurt am Main, Max-von-Laue-Str. 7, 60438 Frankfurt/Main, Hessen, Germany
| | - Boris Fürtig
- Institute for Organic Chemistry and Chemical Biology, Center for Biomolecular Magnetic Resonance (BMRZ), Goethe University Frankfurt am Main, Max-von-Laue-Str. 7, 60438 Frankfurt/Main, Hessen, Germany
| | - Christian Richter
- Institute for Organic Chemistry and Chemical Biology, Center for Biomolecular Magnetic Resonance (BMRZ), Goethe University Frankfurt am Main, Max-von-Laue-Str. 7, 60438 Frankfurt/Main, Hessen, Germany
| | - Martin Blackledge
- Institut de Biologie Structurale (IBS), CEA, CNRS, University Grenoble Alpes, Grenoble 38000, France
| | - Kresten Lindorff-Larsen
- Structural Biology and NMR Laboratory, Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, DK-2200 Copenhagen N, Denmark
| | - Harald Schwalbe
- Institute for Organic Chemistry and Chemical Biology, Center for Biomolecular Magnetic Resonance (BMRZ), Goethe University Frankfurt am Main, Max-von-Laue-Str. 7, 60438 Frankfurt/Main, Hessen, Germany
|
47
|
Wang X, Yu S, Lou E, Tan YL, Tan ZJ. RNA 3D Structure Prediction: Progress and Perspective. Molecules 2023; 28:5532. [PMID: 37513407 PMCID: PMC10386116 DOI: 10.3390/molecules28145532] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2023] [Revised: 07/05/2023] [Accepted: 07/13/2023] [Indexed: 07/30/2023] Open
Abstract
Ribonucleic acid (RNA) molecules play vital roles in numerous important biological functions such as catalysis and gene regulation. The functions of RNAs are strongly coupled to their structures or proper structure changes, and RNA structure prediction has received much attention in the last two decades. Some computational models have been developed to predict RNA three-dimensional (3D) structures in silico, and these models generally consist of predicting an RNA 3D structure ensemble, evaluating near-native RNAs from the structure ensemble, and refining the identified RNAs. In this review, we provide a comprehensive overview of the recent advances in RNA 3D structure modeling, including structure ensemble prediction, evaluation, and refinement. Finally, we emphasize some insights and perspectives in modeling RNA 3D structures.
Affiliation(s)
- Xunxun Wang
- Department of Physics, Key Laboratory of Artificial Micro & Nano-Structures of Ministry of Education, School of Physics and Technology, Wuhan University, Wuhan 430072, China
| | - Shixiong Yu
- Department of Physics, Key Laboratory of Artificial Micro & Nano-Structures of Ministry of Education, School of Physics and Technology, Wuhan University, Wuhan 430072, China
| | - En Lou
- Department of Physics, Key Laboratory of Artificial Micro & Nano-Structures of Ministry of Education, School of Physics and Technology, Wuhan University, Wuhan 430072, China
| | - Ya-Lan Tan
- School of Bioengineering and Health, Wuhan Textile University, Wuhan 430200, China
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, China
| | - Zhi-Jie Tan
- Department of Physics, Key Laboratory of Artificial Micro & Nano-Structures of Ministry of Education, School of Physics and Technology, Wuhan University, Wuhan 430072, China
|
48
|
Sato K, Hamada M. Recent trends in RNA informatics: a review of machine learning and deep learning for RNA secondary structure prediction and RNA drug discovery. Brief Bioinform 2023; 24:bbad186. [PMID: 37232359 PMCID: PMC10359090 DOI: 10.1093/bib/bbad186] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 01/16/2023] [Revised: 04/24/2023] [Accepted: 04/25/2023] [Indexed: 05/27/2023] Open
Abstract
Computational analysis of RNA sequences constitutes a crucial step in the field of RNA biology. As in other domains of the life sciences, the incorporation of artificial intelligence and machine learning techniques into RNA sequence analysis has gained significant traction in recent years. Historically, thermodynamics-based methods were widely employed for the prediction of RNA secondary structures; however, machine learning-based approaches have demonstrated remarkable advancements in recent years, enabling more accurate predictions. Consequently, the precision of sequence analysis pertaining to RNA secondary structures, such as RNA-protein interactions, has also been enhanced, making a substantial contribution to the field of RNA biology. Additionally, artificial intelligence and machine learning are also introducing technical innovations in the analysis of RNA-small molecule interactions for RNA-targeted drug discovery and in the design of RNA aptamers, where RNA serves as its own ligand. This review will highlight recent trends in the prediction of RNA secondary structure, RNA aptamers and RNA drug discovery using machine learning, deep learning and related technologies, and will also discuss potential future avenues in the field of RNA informatics.
Affiliation(s)
- Kengo Sato
- School of System Design and Technology, Tokyo Denki University, 5 Senju Asahi-cho, Adachi-ku, Tokyo 120-8551, Japan
- Michiaki Hamada
- Department of Electrical Engineering and Bioscience, Faculty of Science and Engineering, Waseda University, 55N-06-10, 3-4-1, Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
- Computational Bio Big-Data Open Innovation Laboratory (CBBD-OIL), National Institute of Advanced Industrial Science and Technology (AIST), 3-4-1, Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
- Graduate School of Medicine, Nippon Medical School, 1-1-5, Sendagi, Bunkyo-ku, Tokyo 113-8602, Japan
|
49
|
Mizuuchi R, Ichihashi N. Minimal RNA self-reproduction discovered from a random pool of oligomers. Chem Sci 2023; 14:7656-7664. [PMID: 37476714 PMCID: PMC10355099 DOI: 10.1039/d3sc01940c] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2023] [Accepted: 06/18/2023] [Indexed: 07/22/2023]
Abstract
The emergence of RNA self-reproduction from prebiotic components would have been crucial in developing a genetic system during the origins of life. However, all known self-reproducing RNA molecules are complex ribozymes, and how they could have arisen from abiotic materials remains unclear. Therefore, it has been proposed that the first self-reproducing RNA may have been short oligomers that assemble their components as templates. Here, we sought such minimal RNA self-reproduction in prebiotically accessible short random RNA pools that undergo spontaneous ligation and recombination. By examining enriched RNA families with common motifs, we identified a 20-nucleotide (nt) RNA variant that self-reproduces via template-directed ligation of two 10 nt oligonucleotides. The RNA oligomer contains a 2'-5' phosphodiester bond, which typically forms during prebiotically plausible RNA synthesis. This non-canonical linkage helps prevent the formation of inactive complexes between self-complementary oligomers while decreasing the ligation efficiency. The system appears to possess an autocatalytic property consistent with exponential self-reproduction despite the limitation of forming a ternary complex of the template and two substrates, similar to the behavior of a much larger ligase ribozyme. Such a minimal, ribozyme-independent RNA self-reproduction may represent the first step in the emergence of an RNA-based genetic system from primordial components. Simultaneously, our examination of random RNA pools highlights the likelihood that complex species interactions were necessary to initiate RNA reproduction.
Affiliation(s)
- Ryo Mizuuchi
- Department of Electrical Engineering and Bioscience, Faculty of Science and Engineering, Waseda University, Shinjuku, Tokyo 162-8480, Japan
- JST, FOREST, Kawaguchi, Saitama 332-0012, Japan
- Norikazu Ichihashi
- Komaba Institute for Science, The University of Tokyo, Meguro, Tokyo 153-8902, Japan
- Department of Life Science, Graduate School of Arts and Science, The University of Tokyo, Meguro, Tokyo 153-8902, Japan
- Universal Biology Institute, The University of Tokyo, Meguro, Tokyo 153-8902, Japan
|
50
|
Abstract
RNAstructure is a user-friendly program for the prediction and analysis of RNA secondary structure. It is available as a web server, a program with a graphical user interface, or a set of command line tools. The programs are available for Microsoft Windows, macOS, or Linux. This article provides protocols for prediction of RNA secondary structure (using the web server, the graphical user interface, or the command line) and high-affinity oligonucleotide binding sites to a structured RNA target (using the graphical user interface). © 2023 Wiley Periodicals LLC.
- Basic Protocol 1: Predicting RNA secondary structure using the RNAstructure web server
- Alternate Protocol 1: Predicting secondary structure and base pair probabilities using the RNAstructure graphical user interface
- Alternate Protocol 2: Predicting secondary structure and base pair probabilities using the RNAstructure command line interface
- Basic Protocol 2: Predicting binding affinities of oligonucleotides complementary to an RNA target using OligoWalk
Affiliation(s)
- Sara E. Ali
- Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, 601 Elmwood Avenue, Box 712, Rochester, New York 14642
- Abhinav Mittal
- Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, 601 Elmwood Avenue, Box 712, Rochester, New York 14642
- David H. Mathews
- Department of Biochemistry & Biophysics and Center for RNA Biology, University of Rochester Medical Center, 601 Elmwood Avenue, Box 712, Rochester, New York 14642
|