1
Han S, Liu L. GP-HTNLoc: A graph prototype head-tail network-based model for multi-label subcellular localization prediction of ncRNAs. Comput Struct Biotechnol J 2024; 23:2034-2048. [PMID: 38765609 PMCID: PMC11101938 DOI: 10.1016/j.csbj.2024.04.052] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2024] [Revised: 04/17/2024] [Accepted: 04/18/2024] [Indexed: 05/22/2024] Open
Abstract
Numerous studies have demonstrated that understanding the subcellular localization of non-coding RNAs (ncRNAs) is pivotal in elucidating their roles and regulatory mechanisms in cells. Despite the existence of over ten computational models dedicated to predicting the subcellular localization of ncRNAs, a majority of these models are designed solely for single-label prediction. In reality, ncRNAs often localize to multiple subcellular compartments. Furthermore, the existing multi-label localization prediction models do not adequately address the scarcity of training samples and the class imbalance in ncRNA datasets. To address these limitations, this study proposes a novel multi-label localization prediction model for ncRNAs, named GP-HTNLoc. To mitigate class imbalance, GP-HTNLoc adopts separate training approaches for head and tail location labels. Additionally, GP-HTNLoc introduces a graph prototype module to enhance its performance in small-sample, multi-label scenarios. The experimental results based on 10-fold cross-validation on benchmark datasets demonstrate that GP-HTNLoc achieves competitive predictive performance. The average results from 10 rounds of testing on an independent dataset show that GP-HTNLoc outperforms the best existing models on the human lncRNA, human snoRNA, and human miRNA subsets, with average precision improvements of 31.5%, 14.2%, and 5.6%, respectively, reaching 0.685, 0.632, and 0.704. A user-friendly online GP-HTNLoc server is accessible at https://56s8y85390.goho.co.
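The head/tail separation mentioned in the abstract can be illustrated with a minimal sketch. It assumes labels are partitioned by how often they occur in the training set; the threshold `min_count`, the function name `split_head_tail`, and the toy annotations are all hypothetical, and the abstract does not state the criterion GP-HTNLoc actually uses.

```python
from collections import Counter

def split_head_tail(label_sets, min_count):
    """Partition labels into head (frequent) and tail (rare) groups.

    min_count is an assumed cut-off, not a value taken from the paper.
    """
    counts = Counter(label for labels in label_sets for label in labels)
    head = sorted(l for l, c in counts.items() if c >= min_count)
    tail = sorted(l for l, c in counts.items() if c < min_count)
    return head, tail

# Toy multi-label annotations (hypothetical, not from any benchmark dataset).
samples = [
    {"nucleus", "cytoplasm"},
    {"nucleus"},
    {"cytoplasm", "exosome"},
    {"nucleus", "cytoplasm"},
    {"ribosome"},
]
head, tail = split_head_tail(samples, min_count=3)
print("head labels:", head)
print("tail labels:", tail)
```

Once the labels are split this way, each group can be given its own classifier or training schedule, which is the general idea behind treating head and tail labels separately.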
Affiliation(s)
- Shuangkai Han
- School of Information, Yunnan Normal University, Kunming, China
- Engineering Research Center of Computer Vision and Intelligent Control Technology, Department of Education of Yunnan Province, China
- Lin Liu
- School of Information, Yunnan Normal University, Kunming, China
- Engineering Research Center of Computer Vision and Intelligent Control Technology, Department of Education of Yunnan Province, China
2
Li F, Bi Y, Guo X, Tan X, Wang C, Pan S. Advancing mRNA subcellular localization prediction with graph neural network and RNA structure. Bioinformatics 2024; 40:btae504. [PMID: 39133151 PMCID: PMC11361792 DOI: 10.1093/bioinformatics/btae504] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2024] [Revised: 08/06/2024] [Accepted: 08/09/2024] [Indexed: 08/13/2024] Open
Abstract
MOTIVATION The asymmetrical distribution of expressed mRNAs tightly controls the precise synthesis of proteins within human cells. This non-uniform distribution, a cornerstone of developmental biology, plays a pivotal role in numerous cellular processes. To advance our comprehension of gene regulatory networks, it is essential to develop computational tools for accurately identifying the subcellular localizations of mRNAs. However, existing approaches give only limited consideration to multi-localization, and none considers the influence of RNA secondary structure. RESULTS In this study, we propose Allocator, a multi-view parallel deep learning framework that integrates RNA sequence-level and structure-level information to enhance the prediction of mRNA multi-localization. Allocator is equipped with four feature extractors, each designed to handle a different input. Two are tailored for sequence-based inputs, incorporating multilayer perceptron and multi-head self-attention mechanisms. The other two are specialized in processing structure-based inputs, employing graph neural networks. Benchmarking results underscore Allocator's superiority over state-of-the-art methods, showcasing its strength in revealing intricate localization associations. AVAILABILITY AND IMPLEMENTATION The webserver of Allocator is available at http://Allocator.unimelb-biotools.cloud.edu.au; the source code and datasets are available on GitHub (https://github.com/lifuyi774/Allocator) and Zenodo (https://doi.org/10.5281/zenodo.13235798).
Affiliation(s)
- Fuyi Li
- College of Information Engineering, Northwest A&F University, Yangling 712100, China
- South Australian immunoGENomics Cancer Institute (SAiGENCI), The University of Adelaide, Adelaide, SA 5005, Australia
- Yue Bi
- Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
- Xudong Guo
- College of Information Engineering, Northwest A&F University, Yangling 712100, China
- Xiaolan Tan
- Faculty of Information Technology, Monash University, Melbourne, VIC 3800, Australia
- Cong Wang
- College of Information Engineering, Northwest A&F University, Yangling 712100, China
- Shirui Pan
- Faculty of Information Technology, Monash University, Melbourne, VIC 3800, Australia
- School of Information and Communication Technology, Griffith University, Gold Coast, QLD 4222, Australia
3
Liu X, Yao X, Chen L. Expanding roles of circRNAs in cardiovascular diseases. Noncoding RNA Res 2024; 9:429-436. [PMID: 38511061 PMCID: PMC10950605 DOI: 10.1016/j.ncrna.2024.02.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2023] [Revised: 02/01/2024] [Accepted: 02/04/2024] [Indexed: 03/22/2024] Open
Abstract
CircRNAs are a class of single-stranded RNAs characterized by covalently looped structures. Recent advances have improved our understanding of circRNA biogenesis, nuclear export, biological functions, and functional mechanisms. Roles of circRNAs in diverse diseases have been increasingly recognized in the past decade, and novel approaches in bioinformatics analysis together with new strategies for modulating circRNA levels have made circRNAs a hot spot for therapeutic applications. Moreover, due to intrinsic features such as high stability, conservation, and tissue-/stage-specific expression, circRNAs are believed to be promising prognostic and diagnostic markers for diseases. Focusing on cardiovascular disease (CVD), one of the leading causes of mortality worldwide, we briefly summarize the current understanding of circRNAs, present recent progress on circRNA functions and functional mechanisms in CVD, and discuss future perspectives in both circRNA research and therapeutics based on existing knowledge.
Affiliation(s)
- Xu Liu
- Department of Cardiology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230001, China
- Xuelin Yao
- Department of Endocrinology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230001, China
- Department of Endocrinology, The First Affiliated Hospital of Anhui Medical University, Hefei, 230022, China
- Liang Chen
- Department of Cardiology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, 230001, China
4
Choudhury S, Bajiya N, Patiyal S, Raghava GPS. MRSLpred-a hybrid approach for predicting multi-label subcellular localization of mRNA at the genome scale. Frontiers in Bioinformatics 2024; 4:1341479. [PMID: 38379813 PMCID: PMC10877048 DOI: 10.3389/fbinf.2024.1341479] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Accepted: 01/15/2024] [Indexed: 02/22/2024] Open
Abstract
In the past, several methods have been developed for predicting the single-label subcellular localization of messenger RNA (mRNA). However, only a few methods are designed to predict the multi-label subcellular localization of mRNA. Furthermore, the existing methods are slow and cannot be applied at a transcriptome scale. In this study, a fast and reliable method has been developed for predicting the multi-label subcellular localization of mRNA that can be applied at a genome scale. Machine learning-based methods have been developed using mRNA sequence composition, where the XGBoost-based classifier achieved an average area under the receiver operating characteristic curve (AUROC) of 0.709 (0.668-0.732). In addition to alignment-free methods, we developed alignment-based methods using motif search techniques. Finally, a hybrid technique that combines the XGBoost model and the motif-based approach has been developed, achieving an average AUROC of 0.742 (0.708-0.816). Our method, MRSLpred, outperforms the existing state-of-the-art classifier in terms of performance and computational efficiency. A publicly accessible webserver and a standalone tool have been developed to assist researchers (webserver: https://webs.iiitd.edu.in/raghava/mrslpred/).
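An "average AUROC" with a range, as reported in the abstract, is typically obtained by scoring each location label separately and then averaging. The sketch below illustrates that calculation in plain Python under that assumption; the function names and the toy labels and scores are hypothetical and are not MRSLpred outputs.

```python
def auroc(y_true, scores):
    """AUROC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def mean_auroc(Y, S):
    """Return (mean AUROC, per-label AUROCs); Y and S hold one row per sample."""
    n_labels = len(Y[0])
    per_label = [
        auroc([row[j] for row in Y], [row[j] for row in S])
        for j in range(n_labels)
    ]
    return sum(per_label) / n_labels, per_label

# Toy data: 4 transcripts, 2 location labels (hypothetical values).
Y = [[1, 0], [0, 1], [1, 1], [0, 0]]
S = [[0.9, 0.2], [0.3, 0.8], [0.6, 0.4], [0.7, 0.1]]
mean, per_label = mean_auroc(Y, S)
print("per-label AUROC:", per_label, "mean:", mean)
```

Reporting the minimum and maximum of `per_label` alongside the mean gives a summary in the same shape as the bracketed ranges in the abstract.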
Affiliation(s)
- Gajendra P. S. Raghava
- Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
5
Fu X, Chen Y, Tian S. DlncRNALoc: A discrete wavelet transform-based model for predicting lncRNA subcellular localization. Mathematical Biosciences and Engineering: MBE 2023; 20:20648-20667. [PMID: 38124569 DOI: 10.3934/mbe.2023913] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2023]
Abstract
The prediction of long non-coding RNA (lncRNA) subcellular localization is essential to the understanding of its function and involvement in cellular regulation. Traditional biological experimental methods are costly and time-consuming, making computational methods the preferred approach for predicting lncRNA subcellular localization (LSL). However, existing computational methods have limitations due to the structural characteristics of lncRNAs and the uneven distribution of data across subcellular compartments. We propose a discrete wavelet transform (DWT)-based model for predicting LSL, called DlncRNALoc. We construct a physicochemical property matrix of 2-tuple bases from lncRNA sequences, and we introduce a DWT-based lncRNA feature extraction method. We use the Synthetic Minority Over-sampling Technique (SMOTE) for oversampling and the local Fisher discriminant analysis (LFDA) algorithm to optimize feature information. The optimized feature vectors are fed into a support vector machine (SVM) to construct a predictive model. DlncRNALoc was evaluated with five-fold cross-validation on three benchmark datasets. Extensive experiments have demonstrated the superiority and effectiveness of the DlncRNALoc model in predicting LSL.
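The feature-extraction step described above (map 2-tuple bases to a numeric property, then apply a DWT) can be sketched as follows. This is only an illustration under stated assumptions: the Haar wavelet is chosen for simplicity, and the property table (GC count per dinucleotide) is a hypothetical stand-in for the physicochemical properties used by DlncRNALoc, which the abstract does not list.

```python
import math

def dinucleotide_signal(seq, prop):
    """Map each overlapping 2-tuple of bases to a numeric property value."""
    return [prop[seq[i:i + 2]] for i in range(len(seq) - 1)]

def haar_dwt(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    if len(signal) % 2:
        signal = signal + [signal[-1]]  # pad odd-length input with its last value
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

# Hypothetical property: number of G/C bases in each dinucleotide.
prop = {a + b: float((a in "GC") + (b in "GC")) for a in "ACGU" for b in "ACGU"}

signal = dinucleotide_signal("ACGUG", prop)
approx, detail = haar_dwt(signal)
print("signal:", signal)
print("approximation:", approx)
print("detail:", detail)
```

Summary statistics of the approximation and detail coefficients (for example their mean, maximum, and standard deviation at each level) are a common way to turn variable-length sequences into fixed-length feature vectors before oversampling and classification.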
Affiliation(s)
- Xiangzheng Fu
- Neher's Biophysics Laboratory for Innovative Drug Discovery, State Key Laboratory of Quality Research in Chinese Medicine, Macau Institute for Applied Research in Medicine and Health, Macau University of Science and Technology, Macao, China
- College of Information Science and Engineering, Hunan University, Changsha, Hunan, China
- Department of Basic Biology, Changsha Medical College, Changsha, Hunan, China
- Yifan Chen
- College of Information Science and Engineering, Hunan University, Changsha, Hunan, China
- Department of Basic Biology, Changsha Medical College, Changsha, Hunan, China
- Sha Tian
- Department of Internal Medicine, College of Integrated Chinese and Western Medicine, Hunan University of Chinese Medicine, Changsha, Hunan, China
6
Forrest SL, Lee S, Nassir N, Martinez-Valbuena I, Sackmann V, Li J, Ahmed A, Tartaglia MC, Ittner LM, Lang AE, Uddin M, Kovacs GG. Cell-specific MAPT gene expression is preserved in neuronal and glial tau cytopathologies in progressive supranuclear palsy. Acta Neuropathol 2023; 146:395-414. [PMID: 37354322 PMCID: PMC10412651 DOI: 10.1007/s00401-023-02604-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 05/07/2023] [Revised: 06/11/2023] [Accepted: 06/16/2023] [Indexed: 06/26/2023]
Abstract
Microtubule-associated protein tau (MAPT) aggregates in neurons, astrocytes and oligodendrocytes in a number of neurodegenerative diseases, including progressive supranuclear palsy (PSP). Tau is a target of therapy, and the strategy includes either eliminating pathological tau aggregates or reducing MAPT expression, and thus the amount of tau protein made, to prevent its aggregation. Disease-associated tau affects brain regions in a sequential manner that includes cell-to-cell spreading. Involvement of glial cells that show tau aggregates is interpreted as glial cells taking up misfolded tau, on the assumption that glial cells do not express enough MAPT. Although studies have evaluated MAPT expression in human brain tissue homogenates, it is not clear whether MAPT expression is compromised in cells accumulating pathological tau. To address these perplexing aspects of disease pathogenesis, this study used RNAscope combined with immunofluorescence (AT8) and single-nucleus (sn) RNAseq to systematically map and quantify MAPT expression dynamics across different cell types and brain regions in controls (n = 3), and evaluated whether tau cytopathology affects MAPT expression in PSP (n = 3). MAPT transcripts were detected in neurons, astrocytes and oligodendrocytes, varied between brain regions and within each cell type, and were preserved in all cell types with tau aggregates in PSP. These results suggest a complex scenario in all cell types, where, in addition to the ingested misfolded tau, the preserved cellular MAPT expression provides a pool for local protein production that can (1) be phosphorylated and aggregated, or (2) feed the seeding of ingested misfolded tau by providing physiological tau, both accentuating the pathological process. Since tau cytopathology does not compromise MAPT gene expression in PSP, a complete loss of tau protein expression as an early pathogenic component is less likely. These observations provide a rationale for a dual approach to therapy by decreasing cellular MAPT expression and targeting removal of misfolded tau.
Affiliation(s)
- Shelley L Forrest
- Tanz Centre for Research in Neurodegenerative Disease (CRND), University of Toronto, Krembil Discovery Tower, 60 Leonard Ave, Toronto, ON, M5T 0S8, Canada
- Dementia Research Centre, Macquarie Medical School, Faculty of Medicine, Health and Human Sciences, Macquarie University, Sydney, Australia
- Laboratory Medicine Program and Krembil Brain Institute, University Health Network, Toronto, ON, Canada
- Seojin Lee
- Tanz Centre for Research in Neurodegenerative Disease (CRND), University of Toronto, Krembil Discovery Tower, 60 Leonard Ave, Toronto, ON, M5T 0S8, Canada
- Nasna Nassir
- College of Medicine, Mohammed Bin Rashid University of Medicine and Health Sciences, Dubai, UAE
- Ivan Martinez-Valbuena
- Tanz Centre for Research in Neurodegenerative Disease (CRND), University of Toronto, Krembil Discovery Tower, 60 Leonard Ave, Toronto, ON, M5T 0S8, Canada
- Valerie Sackmann
- Tanz Centre for Research in Neurodegenerative Disease (CRND), University of Toronto, Krembil Discovery Tower, 60 Leonard Ave, Toronto, ON, M5T 0S8, Canada
- Jun Li
- Tanz Centre for Research in Neurodegenerative Disease (CRND), University of Toronto, Krembil Discovery Tower, 60 Leonard Ave, Toronto, ON, M5T 0S8, Canada
- Awab Ahmed
- College of Medicine, Mohammed Bin Rashid University of Medicine and Health Sciences, Dubai, UAE
- Maria Carmela Tartaglia
- Tanz Centre for Research in Neurodegenerative Disease (CRND), University of Toronto, Krembil Discovery Tower, 60 Leonard Ave, Toronto, ON, M5T 0S8, Canada
- University Health Network Memory Clinic, Krembil Brain Institute, Toronto, ON, Canada
- Lars M Ittner
- Dementia Research Centre, Macquarie Medical School, Faculty of Medicine, Health and Human Sciences, Macquarie University, Sydney, Australia
- Anthony E Lang
- Edmond J. Safra Program in Parkinson's Disease, Rossy PSP Centre and the Morton and Gloria Shulman Movement Disorders Clinic, Toronto Western Hospital, Toronto, ON, Canada
- Mohammed Uddin
- College of Medicine, Mohammed Bin Rashid University of Medicine and Health Sciences, Dubai, UAE
- Cellular Intelligence (Ci) Lab, GenomeArc Inc., Toronto, ON, Canada
- Gabor G Kovacs
- Tanz Centre for Research in Neurodegenerative Disease (CRND), University of Toronto, Krembil Discovery Tower, 60 Leonard Ave, Toronto, ON, M5T 0S8, Canada.
- Dementia Research Centre, Macquarie Medical School, Faculty of Medicine, Health and Human Sciences, Macquarie University, Sydney, Australia.
- Laboratory Medicine Program and Krembil Brain Institute, University Health Network, Toronto, ON, Canada.
- Edmond J. Safra Program in Parkinson's Disease, Rossy PSP Centre and the Morton and Gloria Shulman Movement Disorders Clinic, Toronto Western Hospital, Toronto, ON, Canada.
- Department of Laboratory Medicine and Pathobiology and Department of Medicine, University of Toronto, Toronto, ON, Canada.
7
Li J, Zou Q, Yuan L. A review from biological mapping to computation-based subcellular localization. Molecular Therapy. Nucleic Acids 2023; 32:507-521. [PMID: 37215152 PMCID: PMC10192651 DOI: 10.1016/j.omtn.2023.04.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/24/2023]
Abstract
Subcellular localization is crucial to the study of viruses and diseases. Specifically, research on protein subcellular localization can help identify links between viruses and host cells that can aid in the design of targeted drugs. Research on RNA subcellular localization is significant for human diseases (such as Alzheimer's disease and colon cancer). To date, the published reviews of protein subcellular localization are outdated, and reviews of RNA subcellular localization are not comprehensive. Therefore, we collated the most up-to-date literature on protein and RNA subcellular localization to help researchers understand changes in the field. Extensive and complete methods for constructing subcellular localization models have also been summarized, which can help readers understand the changes in the application of biotechnology and computer science in subcellular localization research and explore how to use biological data to construct improved subcellular localization models. This paper is the first review to cover both protein subcellular localization and RNA subcellular localization. We urge researchers from biology and computational biology to jointly pay attention to the transformation patterns, interrelationships, differences, and causality of protein subcellular localization and RNA subcellular localization.
Affiliation(s)
- Jing Li
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, 1 Chengdian Road, Quzhou, Zhejiang 324000, China
- School of Biomedical Sciences, University of Hong Kong, Hong Kong, China
- Quan Zou
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, 1 Chengdian Road, Quzhou, Zhejiang 324000, China
- Lei Yuan
- Department of Hepatobiliary Surgery, Quzhou People's Hospital, 100 Minjiang Main Road, Quzhou, Zhejiang 324000, China
8
Wong YY, Harbison JE, Hope CM, Gundsambuu B, Brown KA, Wong SW, Brown CY, Couper JJ, Breen J, Liu N, Pederson SM, Köhne M, Klee K, Schultze J, Beyer M, Sadlon T, Barry SC. Parallel recovery of chromatin accessibility and gene expression dynamics from frozen human regulatory T cells. Sci Rep 2023; 13:5506. [PMID: 37016052 PMCID: PMC10073253 DOI: 10.1038/s41598-023-32256-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/15/2022] [Accepted: 03/24/2023] [Indexed: 04/06/2023] Open
Abstract
Epigenetic features such as DNA accessibility dictate transcriptional regulation in a cell type- and cell state-specific manner, and mapping this in health vs. disease in clinically relevant material is opening the door to new mechanistic insights and new targets for therapy. Assay for Transposase Accessible Chromatin Sequencing (ATAC-seq) allows chromatin accessibility profiling from low cell input, making it tractable on rare cell populations, such as regulatory T (Treg) cells. However, little is known about the compatibility of the assay with cryopreserved rare cell populations. Here we demonstrate the robustness of an ATAC-seq protocol by comparing primary Treg cells recovered from fresh or cryopreserved PBMC samples, in the steady state and in response to stimulation. We extend this method to explore the feasibility of conducting simultaneous quantitation of chromatin accessibility and transcriptome from a single aliquot of 50,000 cryopreserved Treg cells. Profiling of chromatin accessibility and gene expression in parallel within the same pool of cells controls for cellular heterogeneity and is particularly beneficial when constrained by limited input material. Overall, we observed a high correlation of accessibility patterns and transcription factor dynamics between fresh and cryopreserved samples. Furthermore, highly similar transcriptomic profiles were obtained from whole cells and from the supernatants recovered from ATAC-seq reactions. We highlight the feasibility of applying these techniques to profile the epigenomic landscape of cells recovered from cryopreservation biorepositories.
Affiliation(s)
- Ying Y Wong
- Robinson Research Institute, University of Adelaide, Adelaide, Australia
- Jessica E Harbison
- Robinson Research Institute, University of Adelaide, Adelaide, Australia
- Women's and Children's Hospital, North Adelaide, Australia
- Christopher M Hope
- Robinson Research Institute, University of Adelaide, Adelaide, Australia
- Women's and Children's Hospital, North Adelaide, Australia
- Katherine A Brown
- Robinson Research Institute, University of Adelaide, Adelaide, Australia
- Soon W Wong
- Robinson Research Institute, University of Adelaide, Adelaide, Australia
- Cheryl Y Brown
- Robinson Research Institute, University of Adelaide, Adelaide, Australia
- Women's and Children's Hospital, North Adelaide, Australia
- Jennifer J Couper
- Robinson Research Institute, University of Adelaide, Adelaide, Australia
- Women's and Children's Hospital, North Adelaide, Australia
- Jimmy Breen
- Robinson Research Institute, University of Adelaide, Adelaide, Australia
- Ning Liu
- Robinson Research Institute, University of Adelaide, Adelaide, Australia
- Stephen M Pederson
- Robinson Research Institute, University of Adelaide, Adelaide, Australia
- Maren Köhne
- German Center for Neurodegenerative Diseases, University of Bonn, Bonn, Germany
- Kathrin Klee
- German Center for Neurodegenerative Diseases, University of Bonn, Bonn, Germany
- Joachim Schultze
- German Center for Neurodegenerative Diseases, University of Bonn, Bonn, Germany
- Marc Beyer
- German Center for Neurodegenerative Diseases, University of Bonn, Bonn, Germany
- Timothy Sadlon
- Robinson Research Institute, University of Adelaide, Adelaide, Australia
- Women's and Children's Hospital, North Adelaide, Australia
- Simon C Barry
- Robinson Research Institute, University of Adelaide, Adelaide, Australia.
- Women's and Children's Hospital, North Adelaide, Australia.
9
Asim MN, Ibrahim MA, Imran Malik M, Dengel A, Ahmed S. Circ-LocNet: A Computational Framework for Circular RNA Sub-Cellular Localization Prediction. Int J Mol Sci 2022; 23:8221. [PMID: 35897818 PMCID: PMC9329987 DOI: 10.3390/ijms23158221] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2022] [Revised: 07/15/2022] [Accepted: 07/20/2022] [Indexed: 02/04/2023] Open
Abstract
Circular ribonucleic acids (circRNAs) are novel non-coding RNAs that emanate from alternative splicing of precursor mRNA in reversed order across exons. Despite the abundant presence of circRNAs in human genes and their involvement in diverse physiological processes, the functionality of most circRNAs remains a mystery. As with other non-coding RNAs, knowledge of the sub-cellular localization of circRNAs can help clarify their influence on protein synthesis, degradation, and destination, their association with different diseases, and their potential for drug development. To date, wet-lab experimental approaches are used to detect sub-cellular locations of circular RNAs. These approaches help to elucidate the role of circRNAs as protein scaffolds, RNA-binding protein (RBP) sponges, micro-RNA (miRNA) sponges, parental gene expression modifiers, alternative splicing regulators, and transcription regulators. To complement wet-lab experiments, and considering the progress made by machine learning approaches in determining the sub-cellular localization of other non-coding RNAs, this paper develops a computational framework, Circ-LocNet, to precisely detect circRNA sub-cellular localization. Circ-LocNet performs a comprehensive extrinsic evaluation of 7 residue frequency-based, residue order and frequency-based, and physico-chemical property-based sequence descriptors using the five most widely used machine learning classifiers. Further, it explores the performance impact of K-order sequence descriptor fusion, where it ensembles similar as well as dissimilar genres of statistical representation learning approaches to reap the combined benefits. Considering the diversity of statistical representation learning schemes, it assesses the performance of sequence descriptor fusion from second order up to seventh order. A comprehensive empirical evaluation of Circ-LocNet over a newly developed benchmark dataset using different settings reveals that standalone residue frequency-based sequence descriptors and tree-based classifiers are more suitable for predicting the sub-cellular localization of circular RNAs. Further, K-order heterogeneous sequence descriptor fusion in combination with tree-based classifiers most accurately predicts the sub-cellular localization of circular RNAs. We anticipate this study will act as a rich baseline and push the development of robust computational methodologies for the accurate sub-cellular localization determination of novel circRNAs.
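A residue frequency-based descriptor and a simple form of descriptor fusion can be sketched as below. This assumes k-mer frequencies as the descriptor and plain concatenation as the fusion step; both are illustrative choices, and the function names `kmer_frequencies` and `fuse` are hypothetical rather than part of Circ-LocNet.

```python
from itertools import product

def kmer_frequencies(seq, k, alphabet="ACGU"):
    """Normalized k-mer frequency vector in lexicographic k-mer order."""
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip windows containing ambiguous bases
            counts[kmer] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]

def fuse(*descriptors):
    """Fuse several descriptors by concatenating their feature vectors."""
    fused = []
    for d in descriptors:
        fused.extend(d)
    return fused

seq = "ACGUACGU"  # toy sequence, not from the benchmark dataset
mono = kmer_frequencies(seq, 1)   # 4 features
di = kmer_frequencies(seq, 2)     # 16 features
fused = fuse(mono, di)            # second-order fusion: 20 features
print(len(fused), fused[:4])
```

The fused vector can then be passed to any standard classifier; the abstract reports that tree-based classifiers worked best in the authors' evaluation.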
Affiliation(s)
- Muhammad Nabeel Asim
- German Research Center for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany
- Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany
- Muhammad Ali Ibrahim
- German Research Center for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany
- Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany
- Muhammad Imran Malik
- School of Computer Science & Electrical Engineering, National University of Sciences and Technology, Islamabad 44000, Pakistan
- Andreas Dengel
- German Research Center for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany
- Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany
- Sheraz Ahmed
- German Research Center for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany
- DeepReader GmbH, Trippstadter Str. 122, 67663 Kaiserslautern, Germany
10
Le P, Ahmed N, Yeo GW. Illuminating RNA biology through imaging. Nat Cell Biol 2022; 24:815-824. [PMID: 35697782 PMCID: PMC11132331 DOI: 10.1038/s41556-022-00933-9] [Citation(s) in RCA: 26] [Impact Index Per Article: 13.0] [Received: 11/28/2021] [Accepted: 05/06/2022] [Indexed: 12/14/2022]
Abstract
RNA processing plays a central role in accurately transmitting genetic information into functional RNA and protein regulators. To fully appreciate the RNA life-cycle, tools to observe RNA with high spatial and temporal resolution are critical. Here we review recent advances in RNA imaging and highlight how they will propel the field of RNA biology. We discuss current trends in RNA imaging and their potential to elucidate unanswered questions in RNA biology.
Affiliation(s)
- Phuong Le
- Department of Cellular and Molecular Medicine, University of California San Diego, La Jolla, CA, USA
- Stem Cell Program, University of California San Diego, La Jolla, CA, USA
- Institute for Genomic Medicine, University of California San Diego, La Jolla, CA, USA
- Noorsher Ahmed
- Department of Cellular and Molecular Medicine, University of California San Diego, La Jolla, CA, USA
- Stem Cell Program, University of California San Diego, La Jolla, CA, USA
- Institute for Genomic Medicine, University of California San Diego, La Jolla, CA, USA
- Biomedical Sciences Graduate Program, University of California San Diego, La Jolla, CA, USA
- Gene W Yeo
- Department of Cellular and Molecular Medicine, University of California San Diego, La Jolla, CA, USA.
- Stem Cell Program, University of California San Diego, La Jolla, CA, USA.
- Institute for Genomic Medicine, University of California San Diego, La Jolla, CA, USA.
- Biomedical Sciences Graduate Program, University of California San Diego, La Jolla, CA, USA.
11
Abstract
Most of the transcribed human genome codes for noncoding RNAs (ncRNAs), and long noncoding RNAs (lncRNAs) make up the lion's share of the human ncRNA space. Despite growing interest in lncRNAs, their catalog remains incomplete because there are so many of them and because of their tissue specialization and often lower abundance, and there are multiple ongoing efforts to improve it. Consequently, the number of human lncRNA genes may be lower than 10,000 or higher than 200,000. A key open challenge for lncRNA research, now that so many lncRNA species have been identified, is the characterization of lncRNA function and the interpretation of the roles of genetic and epigenetic alterations at their loci. After all, the most important human genes to catalog and study are those that contribute to important cellular functions, that is, those that affect development or cell differentiation and whose dysregulation may play a role in the genesis and progression of human diseases. Multiple efforts have used screens based on RNA-mediated interference (RNAi), antisense oligonucleotides (ASOs), and CRISPR to identify the consequences of lncRNA dysregulation and predict lncRNA function in select contexts, but these approaches have unresolved scalability and accuracy challenges. Instead, as was the case for better-studied ncRNAs in the past, researchers often focus on characterizing lncRNA interactions and investigating their effects on genes and pathways with known functions. Here, we focus most of our review on computational methods to identify lncRNA interactions and to predict the effects of their alterations and dysregulation on human disease pathways.
12
Hendra C, Pratanwanich PN, Wan YK, Goh WSS, Thiery A, Göke J. Detection of m6A from direct RNA sequencing using a multiple instance learning framework. Nat Methods 2022; 19:1590-1598. [PMID: 36357692 PMCID: PMC9718678 DOI: 10.1038/s41592-022-01666-1] [Citation(s) in RCA: 66] [Impact Index Per Article: 33.0] [Received: 09/07/2021] [Accepted: 09/27/2022] [Indexed: 11/12/2022]
Abstract
RNA modifications such as m6A methylation form an additional layer of complexity in the transcriptome. Nanopore direct RNA sequencing can capture this information in the raw current signal for each RNA molecule, enabling the detection of RNA modifications using supervised machine learning. However, experimental approaches provide only site-level training data, whereas the modification status for each single RNA molecule is missing. Here we present m6Anet, a neural-network-based method that leverages the multiple instance learning framework to specifically handle missing read-level modification labels in site-level training data. m6Anet outperforms existing computational methods, shows similar accuracy as experimental approaches, and generalizes with high accuracy to different cell lines and species without retraining model parameters. In addition, we demonstrate that m6Anet captures the underlying read-level stoichiometry, which can be used to approximate differences in modification rates. Overall, m6Anet offers a tool to capture the transcriptome-wide identification and quantification of m6A from a single run of direct RNA sequencing.
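The multiple instance learning setup described in this abstract can be illustrated with a minimal sketch. It assumes noisy-OR pooling, one common way to turn per-read probabilities into a site-level probability; the abstract does not say which pooling m6Anet uses, and the function names `site_probability`, `site_loss`, and `modification_rate` are hypothetical.

```python
import math

def site_probability(read_probs):
    """Noisy-OR pooling (an assumption): a site counts as modified
    if at least one of its reads is modified."""
    return 1.0 - math.prod(1.0 - p for p in read_probs)

def site_loss(read_probs, site_label, eps=1e-12):
    """Binary cross-entropy against the site-level label only;
    no read-level labels are needed."""
    p = min(max(site_probability(read_probs), eps), 1.0 - eps)
    return -(site_label * math.log(p) + (1 - site_label) * math.log(1.0 - p))

def modification_rate(read_probs, threshold=0.5):
    """Rough stoichiometry estimate: fraction of reads called modified.
    The 0.5 threshold is illustrative."""
    if not read_probs:
        return 0.0
    return sum(p > threshold for p in read_probs) / len(read_probs)

# Toy read-level probabilities for one candidate site (made-up values).
reads = [0.9, 0.1, 0.8, 0.05]
print(site_probability(reads), modification_rate(reads))
```

The point of the sketch is that the loss is computed from the pooled site probability, so training needs only site-level labels while the per-read probabilities remain available afterwards.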
Affiliation(s)
- Christopher Hendra
- Institute of Data Science, National University of Singapore, Singapore, Singapore
- Genome Institute of Singapore, A*STAR, Singapore, Singapore
- Department of Statistics and Data Science, National University of Singapore, Singapore, Singapore
- Ploy N. Pratanwanich
- Genome Institute of Singapore, A*STAR, Singapore, Singapore
- Department of Mathematics and Computer Science, Faculty of Science, Chulalongkorn University, Chulalongkorn, Thailand
- Chula Intelligent and Complex Systems Research Unit, Chulalongkorn University, Chulalongkorn, Thailand
- Yuk Kei Wan
- Genome Institute of Singapore, A*STAR, Singapore, Singapore
- Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
- W. S. Sho Goh
- Institute of Molecular Physiology, Shenzhen Bay Laboratory, Shenzhen, China
- Alexandre Thiery
- Department of Statistics and Data Science, National University of Singapore, Singapore, Singapore
- Jonathan Göke
- Genome Institute of Singapore, A*STAR, Singapore, Singapore
- Department of Statistics and Data Science, National University of Singapore, Singapore, Singapore
- National Cancer Center of Singapore, Singapore, Singapore
13
Savulescu AF, Bouilhol E, Beaume N, Nikolski M. Prediction of RNA subcellular localization: Learning from heterogeneous data sources. iScience 2021; 24:103298. [PMID: 34765919 PMCID: PMC8571491 DOI: 10.1016/j.isci.2021.103298] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 12/18/2022] Open
Abstract
RNA subcellular localization has recently emerged as a widespread phenomenon, which may apply to the majority of RNAs. The two main sources of data for characterization of RNA localization are sequence features and microscopy images, such as obtained from single-molecule fluorescent in situ hybridization-based techniques. Although such imaging data are ideal for characterization of RNA distribution, these techniques remain costly, time-consuming, and technically challenging. Given these limitations, imaging data exist only for a limited number of RNAs. We argue that the field of RNA localization would greatly benefit from complementary techniques able to characterize location of RNA. Here we discuss the importance of RNA localization and the current methodology in the field, followed by an introduction on prediction of location of molecules. We then suggest a machine learning approach based on the integration between imaging localization data and sequence-based data to assist in characterization of RNA localization on a transcriptome level.
Affiliation(s)
- Anca Flavia Savulescu
- Division of Chemical, Systems & Synthetic Biology, Institute for Infectious Disease & Molecular Medicine, Faculty of Health Sciences, University of Cape Town, 7925 Cape Town, South Africa
- Emmanuel Bouilhol
- Université de Bordeaux, Bordeaux Bioinformatics Center, Bordeaux, France
- Université de Bordeaux, CNRS, IBGC, UMR 5095, Bordeaux, France
- Nicolas Beaume
- Division of Medical Virology, Faculty of Health Sciences, University of Cape Town, 7925 Cape Town, South Africa
- Macha Nikolski
- Université de Bordeaux, Bordeaux Bioinformatics Center, Bordeaux, France
- Université de Bordeaux, CNRS, IBGC, UMR 5095, Bordeaux, France
14
Asim MN, Ibrahim MA, Imran Malik M, Dengel A, Ahmed S. Advances in Computational Methodologies for Classification and Sub-Cellular Locality Prediction of Non-Coding RNAs. Int J Mol Sci 2021; 22:8719. [PMID: 34445436 PMCID: PMC8395733 DOI: 10.3390/ijms22168719] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 07/17/2021] [Revised: 08/02/2021] [Accepted: 08/03/2021] [Indexed: 02/06/2023] Open
Abstract
Apart from protein-coding ribonucleic acids (RNAs), there exists a variety of non-coding RNAs (ncRNAs) which regulate complex cellular and molecular processes. High-throughput sequencing technologies and bioinformatics approaches have greatly advanced the exploration of ncRNAs, revealing their crucial roles in gene regulation, miRNA binding, protein interactions, and splicing. Furthermore, ncRNAs are involved in the development of complicated diseases like cancer. Categorization of ncRNAs is essential to understand the mechanisms of diseases and to develop effective treatments. Sub-cellular localization information of ncRNAs demystifies diverse functionalities of ncRNAs. To date, several computational methodologies have been proposed to precisely identify the class as well as sub-cellular localization patterns of RNAs. This paper discusses different types of ncRNAs, reviews computational approaches proposed in the last 10 years to distinguish coding RNA from ncRNA, to identify sub-types of ncRNAs such as piwi-associated RNA, micro RNA, long ncRNA, and circular RNA, and to determine sub-cellular localization of distinct ncRNAs and RNAs. Furthermore, it summarizes diverse ncRNA classification and sub-cellular localization determination datasets along with benchmark performance to aid the development and evaluation of novel computational methodologies. It identifies research gaps, heterogeneity, and challenges in the development of computational approaches for RNA sequence analysis. We consider that our expert analysis will help Artificial Intelligence researchers learn, on one platform, the state-of-the-art performance, model selection for various tasks, the dominantly used sequence descriptors and neural architectures, and how to interpret inter-species and intra-species performance deviation.
Affiliation(s)
- Muhammad Nabeel Asim
- German Research Center for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany
- Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany
- Muhammad Ali Ibrahim
- German Research Center for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany
- Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany
- Muhammad Imran Malik
- National Center for Artificial Intelligence (NCAI), National University of Sciences and Technology, Islamabad 44000, Pakistan
- School of Electrical Engineering & Computer Science, National University of Sciences and Technology, Islamabad 44000, Pakistan
- Andreas Dengel
- German Research Center for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany
- Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany
- Sheraz Ahmed
- German Research Center for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany
- DeepReader GmbH, Trippstadter Str. 122, 67663 Kaiserslautern, Germany
15
Su Y, Li Q, Zheng Z, Wei X, Hou P. Identification of genes, pathways and transcription factor-miRNA-target gene networks and experimental verification in venous thromboembolism. Sci Rep 2021; 11:16352. [PMID: 34381164 PMCID: PMC8357955 DOI: 10.1038/s41598-021-95909-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/05/2020] [Accepted: 08/02/2021] [Indexed: 12/17/2022] Open
Abstract
Venous thromboembolism (VTE) is a complex, multifactorial life-threatening disease that involves vascular endothelial cell (VEC) dysfunction. However, the exact pathogenesis and underlying mechanisms of VTE are not completely clear. The aim of this study was to identify the core genes and pathways in VECs that are involved in the development and progression of unprovoked VTE (uVTE). The microarray dataset GSE118259 was downloaded from the Gene Expression Omnibus database, and 341 up-regulated and 8 down-regulated genes were identified in the VTE patients relative to the healthy controls, including CREB1, HIF1α, CBL, ILK, ESM1 and the ribosomal protein family genes. The protein-protein interaction (PPI) network and the transcription factor (TF)-miRNA-target gene network were constructed with these differentially expressed genes (DEGs), and visualized using Cytoscape software 3.6.1. Eighty-nine miRNAs were predicted as the targeting miRNAs of the DEGs, and 197 TFs were predicted as regulators of these miRNAs. In addition, 237 node genes and 4 modules were identified in the PPI network. The significantly enriched pathways included metabolic, cell adhesion, cell proliferation and cellular response to growth factor stimulus pathways. CREB1 was a differentially expressed TF in the TF-miRNA-target gene network, which regulated six miRNA-target gene pairs. The up-regulation of ESM1, HIF1α and CREB1 was confirmed at the mRNA and protein level in the plasma of uVTE patients. Taken together, ESM1, HIF1α and the CREB1-miRNA-target genes axis play potential mechanistic roles in uVTE development.
Affiliation(s)
- Yiming Su
- Department of Vascular Surgery, Liuzhou Worker's Hospital, Fourth Affiliated Hospital of Guangxi Medical University, Liuzhou, 545005, Guangxi Province, China
- Qiyi Li
- Department of Vascular Surgery, Liuzhou Worker's Hospital, Fourth Affiliated Hospital of Guangxi Medical University, Liuzhou, 545005, Guangxi Province, China
- Zhiyong Zheng
- Department of Vascular Surgery, Liuzhou Worker's Hospital, Fourth Affiliated Hospital of Guangxi Medical University, Liuzhou, 545005, Guangxi Province, China
- Xiaomin Wei
- Department of Vascular Surgery, Liuzhou Worker's Hospital, Fourth Affiliated Hospital of Guangxi Medical University, Liuzhou, 545005, Guangxi Province, China
- Peiyong Hou
- Department of Vascular Surgery, Liuzhou Worker's Hospital, Fourth Affiliated Hospital of Guangxi Medical University, Liuzhou, 545005, Guangxi Province, China
16
Meher PK, Rai A, Rao AR. mLoc-mRNA: predicting multiple sub-cellular localization of mRNAs using random forest algorithm coupled with feature selection via elastic net. BMC Bioinformatics 2021; 22:342. [PMID: 34167457 PMCID: PMC8223360 DOI: 10.1186/s12859-021-04264-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/18/2020] [Accepted: 06/11/2021] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Localization of messenger RNAs (mRNAs) plays a crucial role in the growth and development of cells. In particular, it plays a major role in regulating spatio-temporal gene expression. In situ hybridization is a promising experimental technique for determining the localization of mRNAs, but it is costly and laborious. A single mRNA can also be present in more than one location, whereas the existing computational tools can predict only a single location for such mRNAs. Thus, a high-end computational tool is required for reliable and timely prediction of multiple subcellular locations of mRNAs. Hence, we developed the present computational model to predict the multiple localizations of mRNAs. RESULTS The mRNA sequences from 9 different localizations were considered. Each sequence was first transformed to a numeric feature vector of size 5460, based on the k-mer features of sizes 1-6. Out of 5460 k-mer features, 1812 important features were selected by the Elastic Net statistical model. The Random Forest supervised learning algorithm was then employed for predicting the localizations with the selected features. Five-fold cross-validation accuracies of 70.87, 68.32, 68.36, 68.79, 96.46, 73.44, 70.94, 97.42 and 71.77% were obtained for the cytoplasm, cytosol, endoplasmic reticulum, exosome, mitochondrion, nucleus, pseudopodium, posterior and ribosome respectively. With an independent test set, accuracies of 65.33, 73.37, 75.86, 72.99, 94.26, 70.91, 65.53, 93.60 and 73.45% were obtained for the respective localizations. The developed approach also achieved higher accuracies than the existing localization prediction tools. CONCLUSIONS This study presents a novel computational tool for predicting the multiple localization of mRNAs. Based on the proposed approach, an online prediction server "mLoc-mRNA" is accessible at http://cabgrid.res.in:8080/mlocmrna/. The developed approach is believed to supplement the existing tools and techniques for the localization prediction of mRNAs.
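The feature vector size of 5460 quoted above follows from counting all k-mers of sizes 1 to 6 over a four-letter alphabet (4 + 16 + 64 + 256 + 1024 + 4096). The sketch below shows one way such a vector could be built. Normalizing counts to per-k frequencies, mapping U to T, and skipping ambiguous bases are assumptions rather than details from the abstract, and `kmer_vocabulary` and `kmer_features` are hypothetical names.

```python
from itertools import product

ALPHABET = "ACGT"

def kmer_vocabulary(max_k=6):
    """All k-mers of length 1..max_k over ACGT, in a fixed order."""
    return ["".join(p)
            for k in range(1, max_k + 1)
            for p in product(ALPHABET, repeat=k)]

def kmer_features(seq, max_k=6):
    """Frequency of every k-mer (k = 1..max_k) in seq.
    Assumptions: U is treated as T, windows containing other
    symbols (e.g. N) are skipped, counts are divided by the
    number of windows of that length."""
    seq = seq.upper().replace("U", "T")
    vocab = kmer_vocabulary(max_k)
    index = {kmer: i for i, kmer in enumerate(vocab)}
    features = [0.0] * len(vocab)
    for k in range(1, max_k + 1):
        windows = len(seq) - k + 1
        if windows <= 0:
            continue  # sequence shorter than k
        for i in range(windows):
            j = index.get(seq[i:i + k])
            if j is not None:
                features[j] += 1.0 / windows
    return features

# Toy sequence, not taken from the paper's dataset.
features = kmer_features("AUGGCCAUUGUAAUGGGCCGCUGAAAGGGUGCCCGAUAG")
print(len(features))  # 5460
```

A vector like this would then be passed to a feature selector and a classifier; the abstract names Elastic Net and Random Forest for those two steps.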
Affiliation(s)
- Prabina Kumar Meher
- ICAR-Indian Agricultural Statistics Research Institute, New Delhi, 110012, India
- Anil Rai
- ICAR-Indian Agricultural Statistics Research Institute, New Delhi, 110012, India
17
Song B, Li Z, Lin X, Wang J, Wang T, Fu X. Pretraining model for biological sequence data. Brief Funct Genomics 2021; 20:181-195. [PMID: 34050350 PMCID: PMC8194843 DOI: 10.1093/bfgp/elab025] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 03/23/2021] [Revised: 04/13/2021] [Accepted: 04/21/2021] [Indexed: 12/26/2022] Open
Abstract
With the development of high-throughput sequencing technology, biological sequence data reflecting life information have become increasingly accessible. Particularly against the background of the COVID-19 pandemic, biological sequence data play an important role in detecting diseases, analyzing mechanisms and discovering specific drugs. In recent years, pretraining models that emerged in natural language processing have attracted widespread attention in many research fields, not only because they decrease training cost but also because they improve performance on downstream tasks. Pretraining models are used to embed biological sequences and extract features from large biological sequence corpora to comprehensively understand the biological sequence data. In this survey, we provide a broad review of pretraining models for biological sequence data. We first introduce biological sequences and corresponding datasets, including brief descriptions and accessible links. Subsequently, we systematically summarize popular pretraining models for biological sequences based on four categories: CNN, word2vec, LSTM and Transformer. Then, we present some applications of the proposed pretraining models on downstream tasks to explain the role of pretraining models. Next, we provide a novel pretraining scheme for protein sequences and a multitask benchmark for protein pretraining models. Finally, we discuss the challenges and future directions in pretraining models for biological sequences.
Affiliation(s)
- Xiangzheng Fu
- Corresponding author: Xiangzheng Fu, College of Information Science and Engineering, Hunan University, Changsha, Hunan, China. Tel: 86-0731-88821907; E-mail:
18
Wang D, Zhang Z, Jiang Y, Mao Z, Wang D, Lin H, Xu D. DM3Loc: multi-label mRNA subcellular localization prediction and analysis based on multi-head self-attention mechanism. Nucleic Acids Res 2021; 49:e46. [PMID: 33503258 PMCID: PMC8096227 DOI: 10.1093/nar/gkab016] [Citation(s) in RCA: 84] [Impact Index Per Article: 28.0] [Received: 10/25/2020] [Revised: 12/09/2020] [Accepted: 01/06/2021] [Indexed: 12/30/2022] Open
Abstract
Subcellular localization of messenger RNAs (mRNAs), as a prevalent mechanism, gives precise and efficient control of the translation process. There is mounting evidence for the important roles of this process in a variety of cellular events. Computational methods for mRNA subcellular localization prediction provide a useful approach for studying mRNA functions. However, few computational methods have been designed for mRNA subcellular localization prediction, and their performance has room for improvement. In particular, there is still no available tool for predicting the localization of mRNAs that have multiple localization annotations. In this paper, we propose a multi-head self-attention method, DM3Loc, for multi-label mRNA subcellular localization prediction. Evaluation results show that DM3Loc outperforms existing methods and tools in general. Furthermore, DM3Loc has the interpretation ability to analyze RNA-binding protein motifs and key signals on mRNAs for subcellular localization. Our analyses found hundreds of instances of mRNA isoform-specific subcellular localizations and many significantly enriched gene functions for mRNAs in different subcellular localizations.
Affiliation(s)
- Duolin Wang
- Department of Electrical Engineering and Computer Science, Bond Life Sciences Center, University of Missouri, Columbia, MO 65203, USA
- Zhaoyue Zhang
- Center for Information Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Yuexu Jiang
- Department of Electrical Engineering and Computer Science, Bond Life Sciences Center, University of Missouri, Columbia, MO 65203, USA
- Ziting Mao
- Department of Electrical Engineering and Computer Science, Bond Life Sciences Center, University of Missouri, Columbia, MO 65203, USA
- Dong Wang
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China
- Hao Lin
- Center for Information Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Dong Xu
- Department of Electrical Engineering and Computer Science, Bond Life Sciences Center, University of Missouri, Columbia, MO 65203, USA
19
Zhang J, Sun M, Zhao Y, Geng G, Hu Y. Identification of Gingivitis-Related Genes Across Human Tissues Based on the Summary Mendelian Randomization. Front Cell Dev Biol 2021; 8:624766. [PMID: 34026747 PMCID: PMC8134671 DOI: 10.3389/fcell.2020.624766] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/01/2020] [Accepted: 12/02/2020] [Indexed: 11/13/2022] Open
Abstract
Periodontal diseases, which affect the supporting structures of the teeth, lead to tooth loss, and contribute to systemic inflammation, are among the most frequent inflammatory diseases affecting children and adolescents. Gingivitis is the most common periodontal infection. It is mainly caused by a substance produced by microbial plaque, systemic disorders, and genetic abnormalities in the host. Identifying gingivitis-related genes across human tissues is significant not only for understanding disease mechanisms but also for disease development and clinical diagnosis. The genome-wide association study (GWAS) is a commonly used method to mine disease-related genetic variants. However, due to factors such as linkage disequilibrium, it is difficult for GWAS to identify genes directly related to the disease. Hence, we constructed a data integration method that uses Summary Mendelian randomization (SMR) to combine GWAS with expression quantitative trait locus (eQTL) data to identify gingivitis-related genes. Five eQTL studies from different human tissues and one GWAS study were referenced in this paper. This study identified several candidate SNPs and genes related to gingivitis in a tissue-specific or cross-tissue manner. Further, we also analyzed and explained the functions of these genes. The R program for the SMR method has been uploaded to GitHub (https://github.com/hxdde/SMR).
Affiliation(s)
- Jiahui Zhang
- Department of Stomatology and Dental Hygiene, The Fourth Affiliated Hospital, Harbin Medical University, Harbin, China
- Mingai Sun
- General Hospital of Heilongjiang Province Land Reclamation Bureau, Harbin, China
- Yuanyuan Zhao
- General Hospital of Heilongjiang Province Land Reclamation Bureau, Harbin, China
- Guannan Geng
- Department of Endocrinology, The First Affiliated Hospital of Harbin Medical University, Harbin, China
- Yang Hu
- School of Life Science and Technology, Harbin Institute of Technology, Harbin, China
20
ANPrAod: Identify Antioxidant Proteins by Fusing Amino Acid Clustering Strategy and N-Peptide Combination. Comput Math Methods Med 2021; 2021:5518209. [PMID: 33927782 PMCID: PMC8049822 DOI: 10.1155/2021/5518209] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/02/2021] [Revised: 03/02/2021] [Accepted: 03/10/2021] [Indexed: 11/18/2022]
Abstract
Antioxidant proteins, which can prevent free radicals from damaging organisms, perform significant functions in disease control and in delaying aging. Accurate identification of antioxidant proteins has important implications for the development of new drugs and the treatment of related diseases, as they play a critical role in the control or prevention of cancer and aging-related conditions. Since experimental identification techniques are time-consuming and expensive, many computational methods have been proposed to identify antioxidant proteins. Although the accuracy of these methods is acceptable, there are still some challenges. In this study, we developed a computational model called ANPrAod to identify antioxidant proteins based on a support vector machine. To eliminate potentially redundant features and improve prediction accuracy, we calculated 673 amino acid reduction alphabets to find the optimal feature representation scheme. The final model produced an overall accuracy of 87.53% with the ROC of 0.7266 in five-fold cross-validation, which was better than the existing methods. The results on the independent dataset also demonstrated the robustness and reliability of ANPrAod, which could be a promising tool for antioxidant protein identification and contribute to hypothesis-driven experimental design.
21
Tang Q, Nie F, Kang J, Chen W. mRNALocater: Enhance the prediction accuracy of eukaryotic mRNA subcellular localization by using model fusion strategy. Mol Ther 2021; 29:2617-2623. [PMID: 33823302 DOI: 10.1016/j.ymthe.2021.04.004] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Received: 12/13/2020] [Revised: 03/23/2021] [Accepted: 03/31/2021] [Indexed: 02/07/2023] Open
Abstract
The functions of mRNAs are closely correlated with their locations in cells. Knowledge about the subcellular locations of mRNA is helpful to understand their biological functions. In recent years, it has become a hot topic to develop effective computational models to predict eukaryotic mRNA subcellular localizations. However, existing state-of-the-art models still have certain deficiencies in terms of prediction accuracy and generalization ability. Therefore, it is urgent to develop novel methods to accurately predict mRNA subcellular localizations. In this study, a novel method called mRNALocater was proposed to detect the subcellular localization of eukaryotic mRNA by adopting the model fusion strategy. To fully extract information from mRNA sequences, the electron-ion interaction pseudopotential and pseudo k-tuple nucleotide composition were used to encode the sequences. Moreover, the correlation coefficient filtering algorithm and feature forward search technology were used to mine hidden feature information, which guarantees that mRNALocater can be more effectively applied to new sequences. The results based on the independent dataset tests demonstrate that mRNALocater yields promising performances for predicting eukaryotic mRNA subcellular localizations and is a powerful tool in practical applications. A freely available online web server for mRNALocater has been established at http://bio-bigdata.cn/mRNALocater.
Affiliation(s)
- Qiang Tang
- State Key Laboratory of Southwestern Chinese Medicine Resources, Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu 611137, China
- Fulei Nie
- School of Life Sciences, North China University of Science and Technology, Tangshan 063210, China; School of Public Health, North China University of Science and Technology, Tangshan 063210, China
- Juanjuan Kang
- Affiliated Foshan Maternity & Child Healthcare Hospital, Southern Medical University (Foshan Maternity & Child Healthcare Hospital), Foshan 528000, China
- Wei Chen
- State Key Laboratory of Southwestern Chinese Medicine Resources, Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu 611137, China; School of Life Sciences, North China University of Science and Technology, Tangshan 063210, China; School of Public Health, North China University of Science and Technology, Tangshan 063210, China
22
MicroRNAs and long non-coding RNAs as novel regulators of ribosome biogenesis. Biochem Soc Trans 2021; 48:595-612. [PMID: 32267487 PMCID: PMC7200637 DOI: 10.1042/bst20190854] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 01/31/2020] [Revised: 03/12/2020] [Accepted: 03/16/2020] [Indexed: 12/14/2022]
Abstract
Ribosome biogenesis is the fine-tuned, essential process that generates mature ribosomal subunits and ultimately enables all protein synthesis within a cell. Novel regulators of ribosome biogenesis continue to be discovered in higher eukaryotes. While many known regulatory factors are proteins or small nucleolar ribonucleoproteins, microRNAs (miRNAs), and long non-coding RNAs (lncRNAs) are emerging as a novel modulatory layer controlling ribosome production. Here, we summarize work uncovering non-coding RNAs (ncRNAs) as novel regulators of ribosome biogenesis and highlight their links to diseases of defective ribosome biogenesis. It is still unclear how many miRNAs or lncRNAs are involved in phenotypic or pathological disease outcomes caused by impaired ribosome production, as in the ribosomopathies, or by increased ribosome production, as in cancer. In time, we hypothesize that many more ncRNA regulators of ribosome biogenesis will be discovered, which will be followed by an effort to establish connections between disease pathologies and the molecular mechanisms of this additional layer of ribosome biogenesis control.
23
Zaghlool A, Niazi A, Björklund ÅK, Westholm JO, Ameur A, Feuk L. Characterization of the nuclear and cytosolic transcriptomes in human brain tissue reveals new insights into the subcellular distribution of RNA transcripts. Sci Rep 2021; 11:4076. [PMID: 33603054 PMCID: PMC7893067 DOI: 10.1038/s41598-021-83541-1] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 04/24/2020] [Accepted: 01/20/2021] [Indexed: 12/23/2022] Open
Abstract
Transcriptome analysis has mainly relied on analyzing RNA sequencing data from whole cells, overlooking the impact of subcellular RNA localization and its influence on our understanding of gene function and on the interpretation of gene expression signatures in cells. Here, we separated cytosolic and nuclear RNA from human fetal and adult brain samples and performed a comprehensive analysis of cytosolic and nuclear transcriptomes. There are significant differences in RNA expression for protein-coding and lncRNA genes between cytosol and nucleus. We show that transcripts encoding the nuclear-encoded mitochondrial proteins are significantly enriched in the cytosol compared to the rest of protein-coding genes. Differential expression analysis between fetal and adult frontal cortex shows that results obtained from the cytosolic RNA differ from results using nuclear RNA, both in transcript types and in the number of differentially expressed genes. Our data provide a resource for the subcellular localization of thousands of RNA transcripts in the human brain and highlight differences in using the cytosolic or the nuclear transcriptomes for expression analysis.
Collapse
Affiliation(s)
- Ammar Zaghlool
- Department of Immunology, Genetics and Pathology, Uppsala University, BMC B11:4, Box 815, 751 08, Uppsala, Sweden. .,Science for Life Laboratory in Uppsala, Uppsala University, Uppsala, Sweden.
| | - Adnan Niazi
- Department of Immunology, Genetics and Pathology, Uppsala University, BMC B11:4, Box 815, 751 08, Uppsala, Sweden.,Science for Life Laboratory in Uppsala, Uppsala University, Uppsala, Sweden
| | - Åsa K Björklund
- Department of Cell and Molecular Biology, National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Uppsala University, Husargatan 3, 752 37, Uppsala, Sweden
| | - Jakub Orzechowski Westholm
- Department of Biochemistry and Biophysics, National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Stockholm University, Box 1031, 17121, Solna, Sweden
| | - Adam Ameur
- Department of Immunology, Genetics and Pathology, Uppsala University, BMC B11:4, Box 815, 751 08, Uppsala, Sweden; Science for Life Laboratory in Uppsala, Uppsala University, Uppsala, Sweden
| | - Lars Feuk
- Department of Immunology, Genetics and Pathology, Uppsala University, BMC B11:4, Box 815, 751 08, Uppsala, Sweden; Science for Life Laboratory in Uppsala, Uppsala University, Uppsala, Sweden
| |
|
24
|
Chen J, Zhang J, Gao Y, Li Y, Feng C, Song C, Ning Z, Zhou X, Zhao J, Feng M, Zhang Y, Wei L, Pan Q, Jiang Y, Qian F, Han J, Yang Y, Wang Q, Li C. LncSEA: a platform for long non-coding RNA related sets and enrichment analysis. Nucleic Acids Res 2021; 49:D969-D980. [PMID: 33045741 PMCID: PMC7778898 DOI: 10.1093/nar/gkaa806] [Citation(s) in RCA: 65] [Impact Index Per Article: 21.7] [Received: 07/27/2020] [Revised: 09/03/2020] [Accepted: 09/30/2020] [Indexed: 02/01/2023] Open
Abstract
Long non-coding RNAs (lncRNAs) have been proven to play important roles in transcriptional processes and various biological functions. Establishing a comprehensive collection of human lncRNA sets is an urgent task. Using reference lncRNA sets, enrichment analyses will be useful for analyzing lncRNA lists of interest submitted by users. Therefore, we developed a human lncRNA sets database, called LncSEA, which aims to document a large number of available resources for human lncRNA sets and provide annotation and enrichment analyses for lncRNAs. LncSEA supports >40 000 lncRNA reference sets across 18 categories and 66 sub-categories, and covers over 50 000 lncRNAs. We not only collected lncRNA sets based on downstream regulatory data sources, but also identified a large number of lncRNA sets regulated by upstream transcription factors (TFs) and DNA regulatory elements by integrating TF ChIP-seq, DNase-seq, ATAC-seq and H3K27ac ChIP-seq data. Importantly, LncSEA provides annotation and enrichment analyses of lncRNA sets associated with upstream regulators and downstream targets. In summary, LncSEA is a powerful platform that provides a variety of types of lncRNA sets for users, and supports lncRNA annotations and enrichment analyses. The LncSEA database is freely accessible at http://bio.liclab.net/LncSEA/index.php.
Affiliation(s)
- Jiaxin Chen
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing 163319, China
| | - Jian Zhang
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing 163319, China
| | - Yu Gao
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing 163319, China
| | - Yanyu Li
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing 163319, China
| | - Chenchen Feng
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing 163319, China
| | - Chao Song
- Department of Pharmacology, Daqing Campus, Harbin Medical University, Daqing 163319, China
| | - Ziyu Ning
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing 163319, China
| | - Xinyuan Zhou
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing 163319, China
| | - Jianmei Zhao
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing 163319, China
| | - Minghong Feng
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing 163319, China
| | - Yuexin Zhang
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing 163319, China
| | - Ling Wei
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing 163319, China
| | - Qi Pan
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing 163319, China
| | - Yong Jiang
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing 163319, China
| | - Fengcui Qian
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing 163319, China
| | - Junwei Han
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
| | - Yongsan Yang
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing 163319, China
| | - Qiuyu Wang
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing 163319, China
| | - Chunquan Li
- School of Medical Informatics, Daqing Campus, Harbin Medical University, Daqing 163319, China
| |
|
25
|
Roth YD, Lian Z, Pochiraju S, Shaikh B, Karr JR. Datanator: an integrated database of molecular data for quantitatively modeling cellular behavior. Nucleic Acids Res 2021; 49:D516-D522. [PMID: 33174603 PMCID: PMC7779073 DOI: 10.1093/nar/gkaa1008] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 08/06/2020] [Revised: 10/12/2020] [Accepted: 10/21/2020] [Indexed: 12/23/2022] Open
Abstract
Integrative research about multiple biochemical subsystems has significant potential to help advance biology, bioengineering and medicine. However, it is difficult to obtain the diverse data needed for integrative research. To facilitate biochemical research, we developed Datanator (https://datanator.info), an integrated database and set of tools for finding clouds of multiple types of molecular data about specific molecules and reactions in specific organisms and environments, as well as data about chemically-similar molecules and reactions in phylogenetically-similar organisms in similar environments. Currently, Datanator includes metabolite concentrations, RNA modifications and half-lives, protein abundances and modifications, and reaction rate constants about a broad range of organisms. Going forward, we aim to launch a community initiative to curate additional data. Datanator also provides tools for filtering, visualizing and exporting these data clouds. We believe that Datanator can facilitate a wide range of research from integrative mechanistic models, such as whole-cell models, to comparative data-driven analyses of multiple organisms.
Affiliation(s)
- Yosef D Roth
- Icahn Institute for Data Science and Genomic Technology and Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1255 5th Avenue, Suite C2, New York, NY 10029, USA
| | - Zhouyang Lian
- Icahn Institute for Data Science and Genomic Technology and Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1255 5th Avenue, Suite C2, New York, NY 10029, USA
| | - Saahith Pochiraju
- Icahn Institute for Data Science and Genomic Technology and Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1255 5th Avenue, Suite C2, New York, NY 10029, USA
| | - Bilal Shaikh
- Icahn Institute for Data Science and Genomic Technology and Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1255 5th Avenue, Suite C2, New York, NY 10029, USA
| | - Jonathan R Karr
- Icahn Institute for Data Science and Genomic Technology and Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1255 5th Avenue, Suite C2, New York, NY 10029, USA
| |
|
26
|
Wang H, Ding Y, Tang J, Zou Q, Guo F. Identify RNA-associated subcellular localizations based on multi-label learning using Chou's 5-steps rule. BMC Genomics 2021; 22:56. [PMID: 33451286 PMCID: PMC7811227 DOI: 10.1186/s12864-020-07347-7] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 08/05/2020] [Accepted: 12/22/2020] [Indexed: 12/04/2022] Open
Abstract
BACKGROUND: Biological functions of biomolecules rely on the cellular compartments where they are located in cells. Importantly, RNAs are localized to specific locations of a cell, enabling the cell to carry out diverse biochemical processes concurrently. However, many existing RNA subcellular localization classifiers only solve the problem of single-label classification. It is of great practical significance to extend RNA subcellular localization to a multi-label classification problem. RESULTS: In this study, we extract multi-label classification datasets about RNA-associated subcellular localizations on various types of RNAs, and then construct subcellular localization datasets on four RNA categories. In order to study Homo sapiens, we further establish human RNA subcellular localization datasets. Furthermore, we utilize different nucleotide property composition models to extract effective features to adequately represent the important information of nucleotide sequences. In the most critical part, we address a major challenge, namely fusing the multivariate information through multiple kernel learning based on the Hilbert-Schmidt independence criterion. The optimal combined kernel can be put into an integration support vector machine model for identifying multi-label RNA subcellular localizations. Our method obtained average precision values of 0.703, 0.757, 0.787, and 0.800, respectively, on four RNA data sets. CONCLUSION: Our novel method outperforms other prediction tools on novel benchmark datasets. Moreover, we establish a user-friendly web server implementing our method.
Affiliation(s)
- Hao Wang
- School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
| | - Yijie Ding
- School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, China
| | - Jijun Tang
- School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
- School of Computational Science and Engineering, University of South Carolina, Columbia, 29208, SC, US
| | - Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
| | - Fei Guo
- School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China.
| |
|
27
|
Ning L, Cui T, Zheng B, Wang N, Luo J, Yang B, Du M, Cheng J, Dou Y, Wang D. MNDR v3.0: mammal ncRNA-disease repository with increased coverage and annotation. Nucleic Acids Res 2021; 49:D160-D164. [PMID: 32833025 PMCID: PMC7779040 DOI: 10.1093/nar/gkaa707] [Citation(s) in RCA: 85] [Impact Index Per Article: 28.3] [Received: 07/27/2020] [Revised: 08/12/2020] [Accepted: 08/14/2020] [Indexed: 02/07/2023] Open
Abstract
Many studies have indicated that non-coding RNA (ncRNA) dysfunction is closely related to numerous diseases. Recently, accumulated ncRNA-disease associations have made related databases insufficient to meet the demands of biomedical research. The constant updating of ncRNA-disease resources has become essential. Here, we have updated the mammal ncRNA-disease repository (MNDR, http://www.rna-society.org/mndr/) to version 3.0, containing more than one million entries, a four-fold increase in data compared to the previous version. Experimental and predicted circRNA-disease associations have been integrated, increasing the number of categories of ncRNAs to five, and the number of mammalian species to 11. Moreover, ncRNA-disease related drug annotations and associations, as well as ncRNA subcellular localizations and interactions, were added. In addition, three ncRNA-disease (miRNA/lncRNA/circRNA) prediction tools were provided, and the website was also optimized, making it more practical and user-friendly. In summary, MNDR v3.0 will be a valuable resource for the investigation of disease mechanisms and clinical treatment strategies.
Affiliation(s)
- Lin Ning
- Dermatology Hospital, Southern Medical University, Guangzhou 510091, China
| | - Tianyu Cui
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China
| | - Boyang Zheng
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China
| | - Nuo Wang
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China
| | - Jiaxin Luo
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China
| | - Beilei Yang
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China
| | - Mengze Du
- Qingyuan People's Hospital, The Sixth Affiliated Hospital of Guangzhou Medical University, B24 Yinquan South Road, Qingyuan 511518, Guangdong Province, People's Republic of China
| | - Jun Cheng
- Affiliated Foshan Maternity & Child Healthcare Hospital, Southern Medical University (Foshan Maternity & Child Healthcare Hospital)
| | - Yiying Dou
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China
| | - Dong Wang
- Dermatology Hospital, Southern Medical University, Guangzhou 510091, China
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 611731, China
| |
|
28
|
Huang Y, Wang J, Zhao Y, Wang H, Liu T, Li Y, Cui T, Li W, Feng Y, Luo J, Gong J, Ning L, Zhang Y, Wang D, Zhang Y. cncRNAdb: a manually curated resource of experimentally supported RNAs with both protein-coding and noncoding function. Nucleic Acids Res 2021; 49:D65-D70. [PMID: 33010163 PMCID: PMC7778915 DOI: 10.1093/nar/gkaa791] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 07/29/2020] [Revised: 08/30/2020] [Accepted: 09/11/2020] [Indexed: 12/14/2022] Open
Abstract
RNA endowed with both protein-coding and noncoding functions is referred to as 'dual-function RNA', 'binary functional RNA (bifunctional RNA)' or 'cncRNA (coding and noncoding RNA)'. Recently, an increasing number of cncRNAs have been identified, including both translated ncRNAs (ncRNAs with coding functions) and untranslated mRNAs (mRNAs with noncoding functions). However, an appropriate database for storing and organizing cncRNAs is still lacking. Here, we developed cncRNAdb, a manually curated database of experimentally supported cncRNAs, which aims to provide a resource for efficient manipulation, browsing and analysis of cncRNAs. The current version of cncRNAdb documents about 2600 manually curated entries of cncRNA functions with experimental evidence, involving more than 2,000 RNAs (including over 1300 translated ncRNAs and over 600 untranslated mRNAs) across over 20 species. In summary, we believe that cncRNAdb will help elucidate the functions and mechanisms of cncRNAs and develop new prediction methods. The database is available at http://www.rna-society.org/cncrnadb/.
MESH Headings
- 3' Untranslated Regions
- 5' Untranslated Regions
- Animals
- Databases, Nucleic Acid/organization & administration
- Drosophila melanogaster/genetics
- Humans
- Mice
- MicroRNAs/classification
- MicroRNAs/genetics
- Pan troglodytes/genetics
- RNA, Circular/classification
- RNA, Circular/genetics
- RNA, Long Noncoding/classification
- RNA, Long Noncoding/genetics
- RNA, Messenger/classification
- RNA, Messenger/genetics
- RNA, Ribosomal/classification
- RNA, Ribosomal/genetics
- RNA, Small Interfering/classification
- RNA, Small Interfering/genetics
- RNA, Transfer/classification
- RNA, Transfer/genetics
- Software
- Zebrafish/genetics
Affiliation(s)
- Yan Huang
- Shunde Hospital, Southern Medical University (The First People's Hospital of Shunde Foshan), Foshan 528308, China
| | - Jing Wang
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China
| | - Yue Zhao
- School of Basic Medical Sciences & Forensic Medicine, Hangzhou Medical College, Hangzhou 310053, China
| | - Huafeng Wang
- Shunde Hospital, Southern Medical University (The First People's Hospital of Shunde Foshan), Foshan 528308, China
| | - Tianyuan Liu
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China
| | - Yuhe Li
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China
| | - Tianyu Cui
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China
| | - Weiyi Li
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China
| | - Yige Feng
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China
| | - Jiaxin Luo
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China
| | - Jiaqi Gong
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China
| | - Lin Ning
- Dermatology Hospital, Southern Medical University, Guangzhou 510091, China
| | - Yong Zhang
- Shunde Hospital, Southern Medical University (The First People's Hospital of Shunde Foshan), Foshan 528308, China
| | - Dong Wang
- Shunde Hospital, Southern Medical University (The First People's Hospital of Shunde Foshan), Foshan 528308, China
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China
- Dermatology Hospital, Southern Medical University, Guangzhou 510091, China
| | - Yang Zhang
- Shunde Hospital, Southern Medical University (The First People's Hospital of Shunde Foshan), Foshan 528308, China
| |
|
29
|
Liu T, Chen JM, Zhang D, Zhang Q, Peng B, Xu L, Tang H. ApoPred: Identification of Apolipoproteins and Their Subfamilies With Multifarious Features. Front Cell Dev Biol 2021; 8:621144. [PMID: 33490085 PMCID: PMC7820372 DOI: 10.3389/fcell.2020.621144] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 10/25/2020] [Accepted: 11/24/2020] [Indexed: 01/24/2023] Open
Abstract
Apolipoproteins are a group of plasma proteins that are associated with a variety of diseases, such as hyperlipidemia, atherosclerosis, Alzheimer's disease, and diabetes. In order to investigate the function of apolipoproteins and to develop effective targets for related diseases, it is necessary to accurately identify and classify apolipoproteins. Although it is possible to identify apolipoproteins accurately through biochemical experiments, such experiments are expensive and time-consuming. This work aims to establish a high-efficiency and high-accuracy prediction model for recognition of apolipoproteins and their subfamilies. We first constructed a high-quality benchmark dataset including 270 apolipoproteins and 535 non-apolipoproteins. Based on the dataset, pseudo-amino acid composition (PseAAC) and composition of k-spaced amino acid pairs (CKSAAP) were used as input vectors. To improve the prediction accuracy and eliminate redundant information, analysis of variance (ANOVA) was used to rank the features, and incremental feature selection was then used to obtain the best feature subset. A support vector machine (SVM) was used to construct the classification model, which achieved an accuracy of 97.27%, sensitivity of 96.30%, and specificity of 97.76% for discriminating apolipoproteins from non-apolipoproteins in 10-fold cross-validation. In addition, the same process was repeated to generate a new model for predicting apolipoprotein subfamilies. The new model achieved an overall accuracy of 95.93% in 10-fold cross-validation. Based on our proposed model, a convenient webserver called ApoPred was established, which can be freely accessed at http://tang-biolab.com/server/ApoPred/service.html. We expect that this work will contribute to apolipoprotein function research and drug development in relevant diseases.
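The CKSAAP descriptor named in this abstract can be illustrated with a minimal Python sketch. This is not the ApoPred code: the function name `cksaap`, the `max_gap` default, and the choice to skip pairs containing non-standard residues are assumptions made here for illustration only.

```python
from itertools import product

# The 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def cksaap(sequence, max_gap=2):
    """Hypothetical helper: k-spaced amino acid pair frequencies.

    For each gap k in 0..max_gap, counts residue pairs separated by k
    positions and normalises by the number of counted pairs.
    Returns a flat list of 400 * (max_gap + 1) frequencies.
    """
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    features = []
    for gap in range(max_gap + 1):
        counts = dict.fromkeys(pairs, 0)
        total = 0
        for i in range(len(sequence) - gap - 1):
            pair = sequence[i] + sequence[i + gap + 1]
            if pair in counts:  # assumption: skip non-standard residues
                counts[pair] += 1
                total += 1
        features.extend(counts[p] / total if total else 0.0 for p in pairs)
    return features
```

For a sequence of standard residues that is longer than the gap plus one, each gap contributes 400 frequencies that sum to 1, so `cksaap(seq, max_gap=2)` yields a 1200-dimensional vector.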
Affiliation(s)
- Ting Liu
- School of Basic Medical Sciences, Southwest Medical University, Luzhou, China
| | - Jia-Mao Chen
- School of Basic Medical Sciences, Southwest Medical University, Luzhou, China
| | - Dan Zhang
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
| | - Qian Zhang
- School of Basic Medical Sciences, Southwest Medical University, Luzhou, China
| | - Bowen Peng
- Division of international Cooperation, Health Commission of Sichuan Province, Chengdu, China
| | - Lei Xu
- School of Electronic and Communication Engineering, Shenzhen Polytechnic, Shenzhen, China
| | - Hua Tang
- School of Basic Medical Sciences, Southwest Medical University, Luzhou, China
- Central Nervous System Drug Key Laboratory of Sichuan Province, Luzhou, China
| |
|
30
|
iBLP: An XGBoost-Based Predictor for Identifying Bioluminescent Proteins. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2021; 2021:6664362. [PMID: 33505515 PMCID: PMC7808816 DOI: 10.1155/2021/6664362] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Received: 11/21/2020] [Revised: 12/13/2020] [Accepted: 12/28/2020] [Indexed: 02/07/2023]
Abstract
Bioluminescent proteins (BLPs) are a class of proteins that are widely distributed in many living organisms with various mechanisms of light emission including bioluminescence and chemiluminescence from luminous organisms. Bioluminescence has been commonly used in various analytical research methods of cellular processes, such as gene expression analysis, drug discovery, cellular imaging, and toxicity determination. However, the identification of bioluminescent proteins is challenging as they share poor sequence similarities among them. In this paper, we briefly reviewed the development of the computational identification of BLPs and subsequently proposed a novel prediction framework for identifying BLPs based on the eXtreme gradient boosting algorithm (XGBoost) and using sequence-derived features. To train the models, we collected BLP data from bacteria, eukaryotes, and archaea. Then, to obtain more effective prediction models, we examined the performances of different feature extraction methods and their combinations as well as classification algorithms. Finally, based on the optimal model, a novel predictor named iBLP was constructed to identify BLPs. The robustness of iBLP has been proved by experiments on training and independent datasets. Comparison with other published methods further demonstrated that the proposed method is powerful and could provide good performance for BLP identification. The webserver and software package for BLP identification are freely available at http://lin-group.cn/server/iBLP.
|
31
|
Bharathi M, Senthil Kumar N, Chellapandi P. Functional Prediction and Assignment of Methanobrevibacter ruminantium M1 Operome Using a Combined Bioinformatics Approach. Front Genet 2020; 11:593990. [PMID: 33391347 PMCID: PMC7772410 DOI: 10.3389/fgene.2020.593990] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 08/12/2020] [Accepted: 11/17/2020] [Indexed: 12/11/2022] Open
Abstract
Methanobrevibacter ruminantium M1 (MRU) is a rod-shaped rumen methanogen with the ability to use H2, CO2, and formate as substrates for methane formation in ruminants. Enteric methane emitted from this organism can also contribute to the loss of dietary energy in ruminants and humans. To date, there is no successful technology to reduce methane due to a lack of knowledge on its molecular machinery and 73% conserved hypothetical proteins (HPs; operome) whose functions have not yet been ascertained. To address this issue, we have predicted and assigned a precise function to HPs and categorized them as metabolic enzymes, binding proteins, and transport proteins using a combined bioinformatics approach. The results of our study show that 257 (34%) HPs have well-defined functions and play essential roles in its growth physiology and host adaptation. The genome-neighborhood analysis identified 6 operon-like clusters such as hsp, TRAM, dsr, cbs and cas, which are responsible for protein folding, sudden heat-shock, host defense, and protection against the toxicities in the rumen. The functions predicted from the MRU operome comprised 96 metabolic enzymes with 17 metabolic subsystems, 31 transcriptional regulators, 23 transport, and 11 binding proteins. Functional annotation of its operome is thus imperative to unravel the molecular and cellular machinery at the systems-level. The functional assignment of its operome would advance strategies to develop new anti-methanogenic targets to mitigate methane production. Hence, our approach provides new insight into the understanding of its growth physiology and lifestyle in ruminants and may also help to reduce anthropogenic greenhouse gas emissions worldwide.
Affiliation(s)
- M Bharathi
- Molecular Systems Engineering Lab, Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli, India
| | - N Senthil Kumar
- Human Genetics Lab, Department of Biotechnology, School of Life Sciences, Mizoram University (Central University), Aizawl, India
| | - P Chellapandi
- Molecular Systems Engineering Lab, Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli, India
| |
|
32
|
MirLocPredictor: A ConvNet-Based Multi-Label MicroRNA Subcellular Localization Predictor by Incorporating k-Mer Positional Information. Genes (Basel) 2020; 11:genes11121475. [PMID: 33316943 PMCID: PMC7763197 DOI: 10.3390/genes11121475] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 11/03/2020] [Revised: 11/23/2020] [Accepted: 11/25/2020] [Indexed: 02/06/2023] Open
Abstract
MicroRNAs (miRNAs) are small noncoding RNA sequences consisting of about 22 nucleotides that are involved in the regulation of almost 60% of mammalian genes. Presently, there are very limited approaches for the visualization of miRNA locations present inside cells to support the elucidation of pathways and mechanisms behind miRNA function, transport, and biogenesis. MIRLocator, a state-of-the-art tool for the prediction of subcellular localization of miRNAs, makes use of a sequence-to-sequence model along with pretrained k-mer embeddings. Existing pretrained k-mer embedding generation methodologies focus on the extraction of semantics of k-mers. However, in RNA sequences, positional information of nucleotides is more important because distinct positions of the four nucleotides define the function of an RNA molecule. Considering the importance of the nucleotide position, we propose a novel approach (kmerPR2vec) which is a fusion of positional information of k-mers with randomly initialized neural k-mer embeddings. In contrast to existing k-mer-based representations, the proposed kmerPR2vec representation is much richer in terms of semantic information and has more discriminative power. Using the novel kmerPR2vec representation, we further present an end-to-end system (MirLocPredictor) which couples the discriminative power of kmerPR2vec with Convolutional Neural Networks (CNNs) for miRNA subcellular location prediction. The effectiveness of the proposed kmerPR2vec approach is evaluated with deep learning-based topologies (i.e., Convolutional Neural Networks (CNN) and Recurrent Neural Network (RNN)) and by using 9 different evaluation measures. Analysis of the results reveals that MirLocPredictor outperforms state-of-the-art methods by a significant margin of 18% and 19% in terms of precision and recall.
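The distinction this abstract draws between k-mer content and k-mer position can be made concrete with a small Python sketch. It only tokenizes a sequence into overlapping k-mers paired with their start offsets; it is not kmerPR2vec, whose encoding is not described here, and the function name `kmers_with_positions` is a hypothetical one chosen for illustration.

```python
def kmers_with_positions(sequence, k=3):
    """Hypothetical helper: overlapping k-mers with 0-based start offsets.

    Illustrative only; this is plain tokenization, not the kmerPR2vec
    representation described in the abstract.
    """
    return [(sequence[i:i + k], i) for i in range(len(sequence) - k + 1)]


# The same k-mer at two offsets stays distinguishable once the
# position is kept alongside it.
tokens = kmers_with_positions("AUGAUG", k=3)
```

A position-free bag of k-mers would count `AUG` twice in this example, whereas the paired form keeps the occurrences at offsets 0 and 3 separate.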
|
33
|
Kern F, Fehlmann T, Solomon J, Schwed L, Grammes N, Backes C, Van Keuren-Jensen K, Craig DW, Meese E, Keller A. miEAA 2.0: integrating multi-species microRNA enrichment analysis and workflow management systems. Nucleic Acids Res 2020; 48:W521-W528. [PMID: 32374865 PMCID: PMC7319446 DOI: 10.1093/nar/gkaa309] [Citation(s) in RCA: 126] [Impact Index Per Article: 31.5] [Received: 03/05/2020] [Revised: 04/06/2020] [Accepted: 04/22/2020] [Indexed: 01/01/2023] Open
Abstract
Gene set enrichment analysis has become one of the most frequently used applications in molecular biology research. Originally developed for gene sets, the same statistical principles are now available for all omics types. In 2016, we published the miRNA enrichment analysis and annotation tool (miEAA) for human precursor and mature miRNAs. Here, we present miEAA 2.0, supporting miRNA input from ten frequently investigated organisms. To facilitate inclusion of miEAA in workflow systems, we implemented an Application Programming Interface (API). Users can perform miRNA set enrichment analysis using either the web-interface, a dedicated Python package, or custom remote clients. Moreover, the number of category sets was raised by an order of magnitude. We implemented novel categories like annotation confidence level or localisation in biological compartments. In combination with the miRBase miRNA-version and miRNA-to-precursor converters, miEAA supports research settings where older releases of miRBase are in use. The web server also offers novel comprehensive visualizations such as heatmaps and running sum curves with background distributions. We demonstrate the new features with case studies for human kidney cancer, a biomarker study on Parkinson’s disease from the PPMI cohort, and a mouse model for breast cancer. The tool is freely accessible at: https://www.ccb.uni-saarland.de/mieaa2.
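As background for the set enrichment analysis this abstract refers to, one common statistic for over-representation testing is the upper tail of the hypergeometric distribution. The Python sketch below computes it with the standard library only. Whether and how miEAA 2.0 computes this quantity is not stated in the abstract, so the function name `enrichment_p_value` and its parameters are assumptions for illustration.

```python
from math import comb


def enrichment_p_value(population, annotated, drawn, hits):
    """Hypothetical helper: upper-tail hypergeometric P(X >= hits).

    population: number of miRNAs in the background set
    annotated:  background miRNAs belonging to the category
    drawn:      size of the user's miRNA list
    hits:       list members that belong to the category
    """
    max_hits = min(annotated, drawn)
    total = comb(population, drawn)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, drawn - k)
        for k in range(hits, max_hits + 1)
    )
    return tail / total
```

In practice such p-values are computed per category and then adjusted for multiple testing; that step is omitted from this sketch.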
Affiliation(s)
- Fabian Kern
- Chair for Clinical Bioinformatics, Saarland University, 66123 Saarbrücken, Germany
| | - Tobias Fehlmann
- Chair for Clinical Bioinformatics, Saarland University, 66123 Saarbrücken, Germany
| | - Jeffrey Solomon
- Chair for Clinical Bioinformatics, Saarland University, 66123 Saarbrücken, Germany
| | - Louisa Schwed
- Chair for Clinical Bioinformatics, Saarland University, 66123 Saarbrücken, Germany
| | - Nadja Grammes
- Chair for Clinical Bioinformatics, Saarland University, 66123 Saarbrücken, Germany
| | - Christina Backes
- Chair for Clinical Bioinformatics, Saarland University, 66123 Saarbrücken, Germany
| | | | - David Wesley Craig
- Institute of Translational Genomics, University of Southern California, Los Angeles, CA 90033, USA
| | - Eckart Meese
- Department of Human Genetics, Saarland University, 66421 Homburg, Germany
| | - Andreas Keller
- Chair for Clinical Bioinformatics, Saarland University, 66123 Saarbrücken, Germany; School of Medicine Office, Stanford University, Stanford, CA 94305, USA; Department of Neurology and Neurological Sciences, Stanford University, Stanford, CA 94304, USA
| |
Collapse
|
34
|
Aillaud M, Schulte LN. Emerging Roles of Long Noncoding RNAs in the Cytoplasmic Milieu. Noncoding RNA 2020; 6:ncrna6040044. [PMID: 33182489 PMCID: PMC7711603 DOI: 10.3390/ncrna6040044] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 09/30/2020] [Revised: 10/26/2020] [Accepted: 11/05/2020] [Indexed: 02/06/2023] Open
Abstract
While the important functions of long noncoding RNAs (lncRNAs) in nuclear organization are well documented, their orchestrating and architectural roles in the cytoplasmic environment have long been underestimated. However, recently developed fractionation and proximity labelling approaches have shown that a considerable proportion of cellular lncRNAs is exported into the cytoplasm and associates nonrandomly with proteins in the cytosol and organelles. The functions of these lncRNAs range from the control of translation and mitochondrial metabolism to the anchoring of cellular components on the cytoskeleton and regulation of protein degradation at the proteasome. In the present review, we provide an overview of the functions of lncRNAs in cytoplasmic structures and machineries and discuss their emerging roles in the coordination of the dense intracellular milieu. It is becoming apparent that further research into the functions of these lncRNAs will lead to an improved understanding of the spatiotemporal organization of cytoplasmic processes during homeostasis and disease.
Affiliation(s)
- Michelle Aillaud
- Institute for Lung Research, Philipps University Marburg, 35043 Marburg, Germany
- Leon N Schulte
- Institute for Lung Research, Philipps University Marburg, 35043 Marburg, Germany
- German Center for Lung Research (DZL), 35392 Giessen, Germany
- Correspondence:
|
35
|
An Integrating Immune-Related Signature to Improve Prognosis of Hepatocellular Carcinoma. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2020; 2020:8872329. [PMID: 33204302 PMCID: PMC7655255 DOI: 10.1155/2020/8872329] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 08/25/2020] [Revised: 09/26/2020] [Accepted: 10/15/2020] [Indexed: 01/27/2023]
Abstract
Growing evidence suggests that long noncoding RNAs (lncRNAs) and messenger RNAs (mRNAs) could act as biomarkers for cancer prognosis. However, a prognostic marker for hepatocellular carcinoma with high accuracy and sensitivity is still lacking. In this research, a retrospective, cohort-based study of genome-wide RNA-seq data of patients with hepatocellular carcinoma was carried out, and two protein-coding genes (GTPBP4, TREM-1) and one lncRNA (LINC00426) were selected to construct an integrative signature to predict the prognosis of patients. The results show that the model performs well, in terms of both AUC and C-index, in the TCGA validation dataset, a cross-platform GEO validation dataset, and different subsets divided by gender, stage, and grade. The expression pattern and functional analysis show that all three genes contained in the model are associated with immune infiltration, cell proliferation, invasion, and metastasis, providing further confirmation of this model. In summary, the proposed model can effectively distinguish the high- and low-risk groups of hepatocellular carcinoma patients and is expected to shed light on the treatment of hepatocellular carcinoma and greatly improve the patients' prognosis.
|
36
|
Garg A, Singhal N, Kumar R, Kumar M. mRNALoc: a novel machine-learning based in-silico tool to predict mRNA subcellular localization. Nucleic Acids Res 2020; 48:W239-W243. [PMID: 32421834 PMCID: PMC7319581 DOI: 10.1093/nar/gkaa385] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Received: 03/03/2020] [Revised: 04/14/2020] [Accepted: 04/30/2020] [Indexed: 02/06/2023] Open
Abstract
Recent evidence suggests that the localization of mRNAs near the subcellular compartment of the translated proteins is a more robust cellular tool, which optimizes protein expression post-transcriptionally. Retention of mRNA in the nucleus can regulate the amount of protein translated from each mRNA, thus allowing a tight temporal regulation of translation or buffering of protein levels from bursty transcription. Besides, mRNA localization performs a variety of additional roles like long-distance signaling, facilitating assembly of protein complexes and coordination of developmental processes. Here, we describe a novel machine-learning based tool, mRNALoc, to predict five sub-cellular locations of eukaryotic mRNAs using cDNA/mRNA sequences. During five-fold cross-validation, the maximum overall accuracy was 65.19, 75.36, 67.10, 99.70 and 73.59% for the extracellular region, endoplasmic reticulum, cytoplasm, mitochondria, and nucleus, respectively. Assessment on independent datasets revealed prediction accuracies of 58.10, 69.23, 64.55, 96.88 and 69.35% for the extracellular region, endoplasmic reticulum, cytoplasm, mitochondria, and nucleus, respectively. The corresponding values of AUC were 0.76, 0.75, 0.70, 0.98 and 0.74 for the extracellular region, endoplasmic reticulum, cytoplasm, mitochondria, and nucleus, respectively. The mRNALoc standalone software and web-server are freely available for academic use under GNU GPL at http://proteininformatics.org/mkumar/mrnaloc.
Affiliation(s)
- Anjali Garg
- Department of Biophysics, University of Delhi South Campus, New Delhi 110021, India
- Neelja Singhal
- Department of Biophysics, University of Delhi South Campus, New Delhi 110021, India
- Ravindra Kumar
- Department of Biophysics, University of Delhi South Campus, New Delhi 110021, India
- Manish Kumar
- Department of Biophysics, University of Delhi South Campus, New Delhi 110021, India
|
37
|
LncLocation: Efficient Subcellular Location Prediction of Long Non-Coding RNA-Based Multi-Source Heterogeneous Feature Fusion. Int J Mol Sci 2020; 21:ijms21197271. [PMID: 33019721 PMCID: PMC7582431 DOI: 10.3390/ijms21197271] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 08/23/2020] [Revised: 09/27/2020] [Accepted: 09/28/2020] [Indexed: 12/13/2022] Open
Abstract
Recent studies reveal that the subcellular location of long non-coding RNAs (lncRNAs) can provide significant information on their function. Due to the lack of experimental data, the number of lncRNAs with experimentally verified subcellular localization is very limited, and the numbers of lncRNAs located in different organelles are wildly imbalanced. The prediction of subcellular location of lncRNAs is therefore a small-sample, imbalanced multi-class classification problem. The imbalance of data results in poor recognition of small classes by machine learning models, which is a puzzling and challenging problem in the existing research. In this study, we integrate multi-source features to construct a sequence-based computational tool, lncLocation, to predict the subcellular location of lncRNAs. An autoencoder is used to enhance part of the features, and a binomial distribution-based filtering method and recursive feature elimination (RFE) are used to filter some of the features. This improves the representation ability of the data and mitigates the class imbalance of the multi-class data. Through comprehensive experiments on different feature combinations and machine learning models, we select the optimal feature and classifier scheme to construct a subcellular location prediction tool, lncLocation. LncLocation achieves 87.78% accuracy using 5-fold cross-validation on the benchmark data, which is higher than the state-of-the-art tools, and the classification performance, especially for small classes, is improved significantly.
|
38
|
Abstract
Systematics is described for annotation of variations in RNA molecules. The conceptual framework is part of Variation Ontology (VariO) and facilitates depiction of types of variations, their functional and structural effects and other consequences in any RNA molecule in any organism. There are more than 150 RNA related VariO terms in seven levels, which can be further combined to generate even more complicated and detailed annotations. The terms are described together with examples, usually for variations and effects in human and in diseases. RNA variation type has two subcategories: variation classification and origin with subterms. Altogether six terms are available for function description. Several terms are available for affected RNA properties. The ontology contains also terms for structural description for affected RNA type, post-transcriptional RNA modifications, secondary and tertiary structure effects and RNA sugar variations. Together with the DNA and protein concepts and annotations, RNA terms allow comprehensive description of variations of genetic and non-genetic origin at all possible levels. The VariO annotations are readable both for humans and computer programs for advanced data integration and mining.
Affiliation(s)
- Mauno Vihinen
- Department of Experimental Medical Science, Lund University, Lund, Sweden
|
39
|
miRNALoc: predicting miRNA subcellular localizations based on principal component scores of physico-chemical properties and pseudo compositions of di-nucleotides. Sci Rep 2020; 10:14557. [PMID: 32884018 PMCID: PMC7471944 DOI: 10.1038/s41598-020-71381-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 01/13/2020] [Accepted: 07/07/2020] [Indexed: 12/20/2022] Open
Abstract
MicroRNAs (miRNAs), one kind of non-coding RNA, play a vital role in regulating several physiological and developmental processes. Subcellular localization of miRNAs and their abundance in the native cell are central for maintaining physiological homeostasis. Besides, the RNA silencing activity of miRNAs is also influenced by their localization and stability. Thus, the development of computational methods for subcellular localization prediction of miRNAs is desired. In this work, we have proposed a computational method for predicting subcellular localizations of miRNAs based on principal component scores of thermodynamic and structural properties and pseudo compositions of di-nucleotides. Prediction accuracy was analyzed following fivefold cross-validation, where ~63–71% AUC-ROC and ~69–76% AUC-PR were observed. When evaluated with an independent test set, >50% of localizations were found to be correctly predicted. Besides, the developed computational model achieved higher accuracy than the existing methods. A user-friendly prediction server “miRNALoc” is freely accessible at https://cabgrid.res.in:8080/mirnaloc/, by which the user can predict localizations of miRNAs.
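To make the notion of a di-nucleotide composition feature concrete, the sketch below computes plain di-nucleotide frequencies for an RNA sequence. This is a simplified stand-in, assumed for illustration: it omits the pseudo (property-weighted) components and the principal component scores described in the abstract, and the function name is hypothetical.

```python
from itertools import product


def dinucleotide_composition(seq: str) -> dict[str, float]:
    """Return the relative frequency of each of the 16 RNA di-nucleotides.

    DNA-style input is accepted: T is mapped to U. Overlapping pairs that
    contain any other symbol (e.g. N) are skipped.
    """
    seq = seq.upper().replace("T", "U")
    keys = ["".join(pair) for pair in product("ACGU", repeat=2)]
    counts = dict.fromkeys(keys, 0)
    total = 0
    # Slide a window of width 2 over the sequence, one position at a time.
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    return {k: (counts[k] / total if total else 0.0) for k in keys}


if __name__ == "__main__":
    # Arbitrary example sequence, used only to show the output format.
    features = dinucleotide_composition("UGAGGUAGUAGGUUGUAUAGUU")
    print({k: round(v, 3) for k, v in features.items() if v > 0})
```

The resulting 16-dimensional vector could serve as one block of input features for a classifier.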
|
40
|
Non-Coding RNA Databases in Cardiovascular Research. Noncoding RNA 2020; 6:ncrna6030035. [PMID: 32887511 PMCID: PMC7549374 DOI: 10.3390/ncrna6030035] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 08/05/2020] [Revised: 08/28/2020] [Accepted: 09/01/2020] [Indexed: 12/11/2022] Open
Abstract
Cardiovascular diseases (CVDs) are of multifactorial origin and can be attributed to several genetic and environmental components. CVDs are the leading cause of mortality worldwide and they primarily damage the heart and the vascular system. Non-coding RNA (ncRNA) refers to functional RNA molecules that are transcribed from DNA but are not further translated into proteins. Recent transcriptomic studies have identified the presence of thousands of ncRNA molecules across species. In humans, less than 2% of the total genome represents the protein-coding genes. While the role of many ncRNAs is yet to be ascertained, some long non-coding RNAs (lncRNAs) and microRNAs (miRNAs) have been associated with disease progression, serving as useful diagnostic and prognostic biomarkers. A plethora of data repositories specialized in ncRNAs have been developed over the years using publicly available high-throughput data from next-generation sequencing and other approaches. They cover various facets of ncRNA research, such as basic and functional annotation, expression profiles, structural and molecular changes, and interactions with other biomolecules. Here, we provide a compendium of the current ncRNA databases relevant to cardiovascular research.
|
41
|
Garcia-Moreno A, Carmona-Saez P. Computational Methods and Software Tools for Functional Analysis of miRNA Data. Biomolecules 2020; 10:biom10091252. [PMID: 32872205 PMCID: PMC7563698 DOI: 10.3390/biom10091252] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 07/31/2020] [Revised: 08/24/2020] [Accepted: 08/26/2020] [Indexed: 12/15/2022] Open
Abstract
miRNAs are important regulators of gene expression that play a key role in many biological processes. High-throughput techniques allow researchers to discover and characterize large sets of miRNAs, and enrichment analysis tools are becoming increasingly important in decoding which miRNAs are implicated in biological processes. Enrichment analysis of miRNA targets is the standard technique for functional analysis, but this approach carries limitations and bias; alternatives are currently being proposed, based on direct and curated annotations. In this review, we describe the two workflows of miRNA enrichment analysis, based on target gene or miRNA annotations, highlighting statistical tests, software tools, up-to-date databases, and functional annotation resources in the study of metazoan miRNAs.
Affiliation(s)
- Adrian Garcia-Moreno
- Bioinformatics Unit, Centre for Genomics and Oncological Research (GENyO)—Pfizer/University of Granada/Andalusian Regional Government, PTS Granada, 18016 Granada, Spain
- Pedro Carmona-Saez
- Bioinformatics Unit, Centre for Genomics and Oncological Research (GENyO)—Pfizer/University of Granada/Andalusian Regional Government, PTS Granada, 18016 Granada, Spain
- Department of Statistics, University of Granada, 18071 Granada, Spain
- Correspondence:
|
42
|
Liu Z, Zhang Y, Han X, Li C, Yang X, Gao J, Xie G, Du N. Identifying Cancer-Related lncRNAs Based on a Convolutional Neural Network. Front Cell Dev Biol 2020; 8:637. [PMID: 32850792 PMCID: PMC7432192 DOI: 10.3389/fcell.2020.00637] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 06/04/2020] [Accepted: 06/24/2020] [Indexed: 12/15/2022] Open
Abstract
Millions of people are suffering from cancers, but accurate early diagnosis and effective treatment remain difficult. In recent years, long non-coding RNAs (lncRNAs) have been proven to play an important role in diseases, especially cancers. These lncRNAs execute their functions by regulating gene expression. Therefore, identifying lncRNAs which are related to cancers could help researchers gain a deeper understanding of cancer mechanisms and help them find treatment options. A large number of relationships between lncRNAs and cancers have been verified by biological experiments, which gives us a chance to use computational methods to identify cancer-related lncRNAs. In this paper, we applied a convolutional neural network (CNN) to identify cancer-related lncRNAs from their target genes and tissue expression specificity. Since lncRNAs regulate target gene expression and have been reported to have tissue expression specificity, their target genes and expression in different tissues were used as features of lncRNAs. Then, a deep belief network (DBN) was used for unsupervised encoding of the lncRNA features. Finally, the CNN was used to predict cancer-related lncRNAs based on known relationships between lncRNAs and cancers. For each type of cancer, we built a CNN model to predict its related lncRNAs. We identified more related lncRNAs for 41 kinds of cancers. Ten-fold cross-validation was used to assess the performance of our method. The results showed that our method is better than several previous methods, with an area under the curve (AUC) of 0.81 and an area under the precision–recall curve (AUPR) of 0.79. To verify the accuracy of our results, case studies were carried out.
Affiliation(s)
- Zihao Liu
- Department of Oncology, Medical School of Chinese PLA, Chinese PLA General Hospital, Beijing, China; Department of Oncology, The Fourth Medical Center, Chinese PLA General Hospital, Beijing, China
- Ying Zhang
- Department of Pharmacy, Heilongjiang Province Land Reclamation Headquarters General Hospital, Harbin, China
- Xudong Han
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Chenxi Li
- Department of Oncology, The Fourth Medical Center, Chinese PLA General Hospital, Beijing, China
- Xuhui Yang
- Department of Oncology, Medical School of Chinese PLA, Chinese PLA General Hospital, Beijing, China
- Jie Gao
- Department of Oncology, The Fourth Medical Center, Chinese PLA General Hospital, Beijing, China
- Ganfeng Xie
- Department of Oncology, Southwest Hospital, Army Medical University, Chongqing, China
- Nan Du
- Department of Oncology, Medical School of Chinese PLA, Chinese PLA General Hospital, Beijing, China; Department of Oncology, The Fourth Medical Center, Chinese PLA General Hospital, Beijing, China
|
43
|
Sun YM, Chen YQ. Principles and innovative technologies for decrypting noncoding RNAs: from discovery and functional prediction to clinical application. J Hematol Oncol 2020; 13:109. [PMID: 32778133 PMCID: PMC7416809 DOI: 10.1186/s13045-020-00945-8] [Citation(s) in RCA: 58] [Impact Index Per Article: 14.5] [Received: 06/01/2020] [Accepted: 07/27/2020] [Indexed: 12/20/2022] Open
Abstract
Noncoding RNAs (ncRNAs) are a large segment of the transcriptome that do not have apparent protein-coding roles, but they have been verified to play important roles in diverse biological processes, including disease pathogenesis. With the development of innovative technologies, an increasing number of novel ncRNAs have been uncovered; information about their prominent tissue-specific expression patterns, various interaction networks, and subcellular locations will undoubtedly enhance our understanding of their potential functions. Here, we summarized the principles and innovative methods for identifications of novel ncRNAs that have potential functional roles in cancer biology. Moreover, this review also provides alternative ncRNA databases based on high-throughput sequencing or experimental validation, and it briefly describes the current strategy for the clinical translation of cancer-associated ncRNAs to be used in diagnosis.
Affiliation(s)
- Yu-Meng Sun
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory for Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275 People’s Republic of China
- Yue-Qin Chen
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory for Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275 People’s Republic of China
|
44
|
Zeng C, Hamada M. RNA-Seq Analysis Reveals Localization-Associated Alternative Splicing across 13 Cell Lines. Genes (Basel) 2020; 11:E820. [PMID: 32708427 PMCID: PMC7397181 DOI: 10.3390/genes11070820] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 06/19/2020] [Revised: 07/15/2020] [Accepted: 07/17/2020] [Indexed: 12/14/2022] Open
Abstract
Alternative splicing, a ubiquitous phenomenon in eukaryotes, is a regulatory mechanism for the biological diversity of individual genes. Most studies have focused on the effects of alternative splicing on protein synthesis. However, the transcriptome-wide influence of alternative splicing on RNA subcellular localization has rarely been studied. By analyzing RNA-seq data obtained from subcellular fractions across 13 human cell lines, we identified 8720 switching genes between the cytoplasm and the nucleus. Consistent with previous reports, intron retention was observed to be enriched in the nuclear transcript variants. Interestingly, we found that short and structurally stable introns were positively correlated with nuclear localization. Motif analysis reveals that fourteen RNA-binding proteins (RBPs) tend to bind preferentially to such introns. To our knowledge, this is the first transcriptome-wide study to analyze and evaluate the effect of alternative splicing on RNA subcellular localization. Our findings reveal that alternative splicing plays a promising role in regulating RNA subcellular localization.
Affiliation(s)
- Chao Zeng
- AIST-Waseda University Computational Bio Big-Data Open Innovation Laboratory (CBBD-OIL), Tokyo 169-8555, Japan
- Faculty of Science and Engineering, Waseda University, Tokyo 169-8555, Japan
- Michiaki Hamada
- AIST-Waseda University Computational Bio Big-Data Open Innovation Laboratory (CBBD-OIL), Tokyo 169-8555, Japan
- Faculty of Science and Engineering, Waseda University, Tokyo 169-8555, Japan
- Institute for Medical-oriented Structural Biology, Waseda University, Tokyo 162-8480, Japan
- Graduate School of Medicine, Nippon Medical School, Tokyo 113-8602, Japan
|
45
|
Guan ZX, Li SH, Zhang ZM, Zhang D, Yang H, Ding H. A Brief Survey for MicroRNA Precursor Identification Using Machine Learning Methods. Curr Genomics 2020; 21:11-25. [PMID: 32655294 PMCID: PMC7324890 DOI: 10.2174/1389202921666200214125102] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 10/24/2019] [Revised: 01/24/2020] [Accepted: 01/30/2020] [Indexed: 11/22/2022] Open
Abstract
MicroRNAs, a group of short non-coding RNA molecules, can regulate gene expression. Many diseases are associated with abnormal expression of miRNAs. Therefore, accurate identification of miRNA precursors is necessary. In the past 10 years, experimental methods, comparative genomics methods, and artificial intelligence methods have been used to identify pre-miRNAs. However, experimental methods and comparative genomics methods have their disadvantages, such as being time-consuming. In contrast, machine learning-based methods are a better choice. Therefore, the review summarizes the current advances in pre-miRNA recognition based on computational methods, including the construction of benchmark datasets, feature extraction methods, prediction algorithms, and the results of the models. We also provide information about the predictors currently available. Finally, we give future perspectives on the identification of pre-miRNAs. The review provides scholars with the full background of pre-miRNA identification using machine learning methods, which can help researchers gain a clear understanding of the progress of research in this field.
Affiliation(s)
- Zheng-Xing Guan
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Shi-Hao Li
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Zi-Mei Zhang
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Dan Zhang
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Hui Yang
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Hui Ding
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
|
46
|
Geng G, Zhang Z, Cheng L. Identification of a Multi-Long Noncoding RNA Signature for the Diagnosis of Type 1 Diabetes Mellitus. Front Bioeng Biotechnol 2020; 8:553. [PMID: 32719778 PMCID: PMC7350420 DOI: 10.3389/fbioe.2020.00553] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 03/07/2020] [Accepted: 05/07/2020] [Indexed: 02/01/2023] Open
Abstract
Due to the increasing prevalence of type 1 diabetes mellitus (T1DM) and its complications, there is an urgent need to identify novel methods for predicting the occurrence and understanding the pathogenetic mechanisms of the disease. Accumulated data have demonstrated the potential of long noncoding RNAs (lncRNAs) as biomarkers in establishing diagnosis and predicting prognosis of numerous diseases. Yet, little is known about the expression patterns and regulatory roles of lncRNAs in the pathogenesis of T1DM and whether they can be used as diagnostic biomarkers for the disease. To further explore these questions, in the present study, we conducted a comparative analysis of the expression patterns of lncRNAs between 20 T1DM patients and 42 healthy controls by retrospectively analyzing a published microarray data set. Our results indicate that, compared with healthy controls, diabetic patients had altered levels of lncRNAs. Then, we used a three-time cross-validation strategy and a support vector machine to derive a specific 26-lncRNA signature (termed 26LncSigT1DM). This 26LncSigT1DM signature can be used to effectively distinguish between healthy and diabetic individuals (area under the curve = 0.825) of a validation cohort. After the 26LncSigT1DM was prospectively validated, we used Pearson correlation to identify 915 mRNAs whose expression levels were positively correlated with those of the 26 lncRNAs. According to their Gene Ontology annotations, these mRNAs participate in processes including cellular response to stimulus, cell communication, multicellular organismal process, and cell motility. Kyoto Encyclopedia of Genes and Genomes analysis demonstrated that the genes encoding the 915 mRNAs may be associated with the NOD-like receptor signaling pathway, transforming growth factor β signaling pathway, and mineral absorption, suggesting that the deregulation of these lncRNAs may mediate inflammatory abnormalities and immune dysfunctions, which jointly promote the pathogenesis of T1DM. Thus, our study identifies a novel diagnostic tool and may shed more light on the molecular mechanisms underlying the pathogenesis of T1DM.
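The co-expression step mentioned in this abstract rests on the Pearson correlation coefficient. A minimal sketch of that coefficient is given below, assuming two equally long expression vectors; the function name and the toy values are hypothetical, and the study's actual thresholds and software are not stated in the abstract.

```python
from math import sqrt
from typing import Sequence


def pearson_correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two expression profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of centered cross-products and squares (the 1/n factors cancel).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy expression values across five samples, invented for illustration.
    lncrna = [2.1, 3.4, 1.8, 4.0, 2.9]
    mrna = [1.0, 1.9, 0.7, 2.4, 1.5]
    print(pearson_correlation(lncrna, mrna))
```

An mRNA would count as positively correlated with a lncRNA when this coefficient exceeds a chosen cutoff and passes a significance test.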
Affiliation(s)
- Guannan Geng
- Department of Endocrinology, The First Affiliated Hospital of Harbin Medical University, Harbin, China
- Zicheng Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Liang Cheng
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
|
47
|
Lang B, Armaos A, Tartaglia GG. RNAct: Protein-RNA interaction predictions for model organisms with supporting experimental data. Nucleic Acids Res 2019; 47:D601-D606. [PMID: 30445601 PMCID: PMC6324028 DOI: 10.1093/nar/gky967] [Citation(s) in RCA: 67] [Impact Index Per Article: 16.8] [Received: 08/15/2018] [Accepted: 10/11/2018] [Indexed: 01/15/2023] Open
Abstract
Protein-RNA interactions are implicated in a number of physiological roles as well as diseases, with molecular mechanisms ranging from defects in RNA splicing, localization and translation to the formation of aggregates. Currently, ∼1400 human proteins have experimental evidence of RNA-binding activity. However, only ∼250 of these proteins currently have experimental data on their target RNAs from various sequencing-based methods such as eCLIP. To bridge this gap, we used an established, computationally expensive protein-RNA interaction prediction method, catRAPID, to populate a large database, RNAct. RNAct allows easy lookup of known and predicted interactions and enables global views of the human, mouse and yeast protein-RNA interactomes, expanding them in a genome-wide manner far beyond experimental data (http://rnact.crg.eu).
Affiliation(s)
- Benjamin Lang
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona 08003, Spain
- Alexandros Armaos
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona 08003, Spain
- Gian G Tartaglia
- Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona 08003, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), 23 Passeig Lluís Companys, Barcelona 08010, Spain; Universitat Pompeu Fabra (UPF), Department of Experimental and Health Sciences, Barcelona 08003, Spain; Department of Biology 'Charles Darwin', Sapienza University of Rome, P.le A. Moro 5, Rome 00185, Italy
|
48
|
Donato L, Scimone C, Alibrandi S, Rinaldi C, Sidoti A, D’Angelo R. Transcriptome Analyses of lncRNAs in A2E-Stressed Retinal Epithelial Cells Unveil Advanced Links between Metabolic Impairments Related to Oxidative Stress and Retinitis Pigmentosa. Antioxidants (Basel) 2020; 9:E318. [PMID: 32326576 PMCID: PMC7222347 DOI: 10.3390/antiox9040318] [Citation(s) in RCA: 44] [Impact Index Per Article: 11.0] [Received: 03/21/2020] [Revised: 04/08/2020] [Accepted: 04/14/2020] [Indexed: 12/12/2022] Open
Abstract
Long non-coding RNAs (lncRNAs) are untranslated transcripts which regulate many biological processes. Changes in lncRNA expression patterns are known to be related to various human disorders, such as ocular diseases. Among them, retinitis pigmentosa, one of the most heterogeneous inherited disorders, is strictly related to oxidative stress. However, little is known about the regulatory aspects that link oxidative stress to the etiopathogenesis of retinitis. Thus, we performed a total RNA-Seq experiment, analyzing human retinal pigment epithelium cells treated with the oxidant agent N-retinylidene-N-retinylethanolamine (A2E), considering three independent experimental groups (untreated control cells, cells treated for 3 h and cells treated for 6 h). Differentially expressed lncRNAs were filtered out, explored with specific tools and databases, and finally subjected to pathway analysis. We detected three 3'-overlapping ncRNAs, 107 antisense, 24 sense-intronic, four sense-overlapping and 227 lincRNAs very differentially expressed throughout all considered time points. The analyzed lncRNAs could be involved in several biochemical pathways related to compromised response to oxidative stress, carbohydrate and lipid metabolism impairment, melanin biosynthetic process alteration, deficiency in cellular response to amino acid starvation, and unbalanced regulation of cofactor metabolic process, all leading to retinal cell death. The explored lncRNAs could play a relevant role in retinitis pigmentosa etiopathogenesis, and seem to be ideal candidates for novel molecular markers and therapeutic strategies.
Affiliation(s)
- Luigi Donato
  - Department of Biomedical and Dental Sciences and Morphofunctional Imaging, Division of Medical Biotechnologies and Preventive Medicine, University of Messina, 98125 Messina, Italy
  - Department of Biomolecular Strategies, Genetics and Avant-Garde Therapies, I.E.ME.S.T., 90139 Palermo, Italy
- Concetta Scimone
  - Department of Biomedical and Dental Sciences and Morphofunctional Imaging, Division of Medical Biotechnologies and Preventive Medicine, University of Messina, 98125 Messina, Italy
  - Department of Biomolecular Strategies, Genetics and Avant-Garde Therapies, I.E.ME.S.T., 90139 Palermo, Italy
- Simona Alibrandi
  - Department of Biomedical and Dental Sciences and Morphofunctional Imaging, Division of Medical Biotechnologies and Preventive Medicine, University of Messina, 98125 Messina, Italy
  - Department of Chemical, Biological, Pharmaceutical and Environmental Sciences, University of Messina, 98125 Messina, Italy
- Carmela Rinaldi
  - Department of Biomedical and Dental Sciences and Morphofunctional Imaging, Division of Medical Biotechnologies and Preventive Medicine, University of Messina, 98125 Messina, Italy
- Antonina Sidoti
  - Department of Biomedical and Dental Sciences and Morphofunctional Imaging, Division of Medical Biotechnologies and Preventive Medicine, University of Messina, 98125 Messina, Italy
  - Department of Biomolecular Strategies, Genetics and Avant-Garde Therapies, I.E.ME.S.T., 90139 Palermo, Italy
- Rosalia D’Angelo
  - Department of Biomedical and Dental Sciences and Morphofunctional Imaging, Division of Medical Biotechnologies and Preventive Medicine, University of Messina, 98125 Messina, Italy
  - Department of Biomolecular Strategies, Genetics and Avant-Garde Therapies, I.E.ME.S.T., 90139 Palermo, Italy
|
49
|
A susceptibility biomarker identification strategy based on significantly differentially expressed ceRNA triplets for ischemic cardiomyopathy. Biosci Rep 2020; 40:221818. [PMID: 31919492 PMCID: PMC6981099 DOI: 10.1042/bsr20191731] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/28/2019] [Revised: 01/06/2020] [Accepted: 01/07/2020] [Indexed: 12/17/2022]
Abstract
Ischemic cardiomyopathy (ICM) is a common human heart disease that causes death. No effective biomarkers for ICM could be found in existing databases, which hinders in-depth study of this disease. In the present study, ICM susceptibility biomarkers were identified using a proposed strategy based on RNA-Seq and miRNA-Seq data of ICM and normal samples. Significantly differentially expressed competing endogenous RNA (ceRNA) triplets were constructed using permutation tests and differentially expressed mRNAs, miRNAs and lncRNAs. Candidate ICM susceptibility genes were screened out as differentially expressed genes in significantly differentially expressed ceRNA triplets enriched in ICM-related functional classes. Finally, eight ICM susceptibility genes and their significantly correlated lncRNAs with high classification accuracy were identified as ICM susceptibility biomarkers. These biomarkers could contribute to the diagnosis and treatment of ICM. The proposed strategy could be extended to other complex diseases without disease biomarkers in public databases.
|
50
|
Wang J, Cao Y, Lu X, Wang X, Kong X, Bo C, Li S, Bai M, Jiao Y, Gao H, Yao X, Ning S, Wang L, Zhang H. Identification of the Regulatory Role of lncRNA SNHG16 in Myasthenia Gravis by Constructing a Competing Endogenous RNA Network. Mol Ther Nucleic Acids 2020; 19:1123-1133. [PMID: 32059338 PMCID: PMC7016163 DOI: 10.1016/j.omtn.2020.01.005] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 08/21/2019] [Revised: 12/27/2019] [Accepted: 01/06/2020] [Indexed: 12/19/2022]
Abstract
Myasthenia gravis (MG) is an autoimmune disorder resulting from antibodies against the proteins at the neuromuscular junction. Emerging evidence indicates that long non-coding RNAs (lncRNAs), acting as competing endogenous RNAs (ceRNAs), are involved in various diseases. However, the regulatory mechanisms of ceRNAs underlying MG remain largely unknown. In this study, we constructed a lncRNA-mediated ceRNA network involved in MG using a multi-step computational strategy. Functional annotation analysis suggests that these lncRNAs may play crucial roles in the immunological mechanism underlying MG. Importantly, through manual literature mining, we found that lncRNA SNHG16 (small nucleolar RNA host gene 16), acting as a ceRNA, plays important roles in the immune processes. Further experiments showed that SNHG16 expression was upregulated in peripheral blood mononuclear cells (PBMCs) from MG patients compared to healthy controls. Luciferase reporter assays confirmed that SNHG16 is a target of the microRNA (miRNA) let-7c-5p. Subsequent experiments indicated that SNHG16 regulates the expression of the key MG gene interleukin (IL)-10 by sponging let-7c-5p in a ceRNA manner. Furthermore, functional assays showed that SNHG16 inhibits Jurkat cell apoptosis and promotes cell proliferation by sponging let-7c-5p. Our study will contribute to a deeper understanding of the regulatory mechanism of MG and will potentially provide new therapeutic targets for MG patients.
Affiliation(s)
- Jianjian Wang
  - Department of Neurology, The Second Affiliated Hospital of Harbin Medical University, Harbin 150081, China
- Yuze Cao
  - Department of Neurology, The Second Affiliated Hospital of Harbin Medical University, Harbin 150081, China
  - Department of Neurology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences, Beijing 100730, China
- Xiaoyu Lu
  - Department of Neurology, The Second Affiliated Hospital of Harbin Medical University, Harbin 150081, China
- Xiaolong Wang
  - Department of Orthopedics, Harbin Medical University Cancer Hospital, Harbin 150000, China
- Xiaotong Kong
  - Department of Neurology, The Second Affiliated Hospital of Harbin Medical University, Harbin 150081, China
- Chunrui Bo
  - Department of Neurology, The Second Affiliated Hospital of Harbin Medical University, Harbin 150081, China
- Shuang Li
  - Department of Neurology, The Second Affiliated Hospital of Harbin Medical University, Harbin 150081, China
- Ming Bai
  - Department of Neurology, The Second Affiliated Hospital of Harbin Medical University, Harbin 150081, China
- Yang Jiao
  - Department of Neurology, The Second Affiliated Hospital of Harbin Medical University, Harbin 150081, China
- Hongyu Gao
  - Department of Neurology, The Second Affiliated Hospital of Harbin Medical University, Harbin 150081, China
- Xiuhua Yao
  - Tianjin Neurosurgical Institute, Tianjin Key Laboratory of Cerebral Vascular and Neurodegenerative Diseases, Tianjin Huanhu Hospital, Tianjin 300350, China
- Shangwei Ning
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Lihua Wang
  - Department of Neurology, The Second Affiliated Hospital of Harbin Medical University, Harbin 150081, China
- Huixue Zhang
  - Department of Neurology, The Second Affiliated Hospital of Harbin Medical University, Harbin 150081, China
|