1. Zhang L, Liu Q, Guo Y, Tian L, Chen K, Bai D, Yu H, Han X, Luo W, Feng T, Deng S, Xie G. DNA-based molecular classifiers for the profiling of gene expression signatures. J Nanobiotechnology 2024; 22:189. [PMID: 38632615 PMCID: PMC11025223 DOI: 10.1186/s12951-024-02445-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2023] [Accepted: 03/28/2024] [Indexed: 04/19/2024]
Abstract
Although gene expression signatures offer tremendous potential for disease diagnosis and prognosis, their large size poses challenges for experimental detection and computational analysis in clinical settings. Here, we introduce a universal DNA-based molecular classifier for profiling gene expression signatures and generating immediate diagnostic outcomes. The molecular classifier begins with feature transformation, in which a modular and programmable strategy captures the relative relationships of low-concentration RNAs and converts them to general coding inputs. Then, competitive inhibition of the DNA catalytic reaction enables strict weight assignment for different inputs according to their importance, followed by summation, annihilation and reporting to accurately implement the mathematical model of the classifier. We validated the entire workflow by utilizing miRNA expression levels for the diagnosis of hepatocellular carcinoma (HCC) in clinical samples with an accuracy of 85.7%. The results demonstrate that the molecular classifier provides a universal solution to explore the correlation between gene expression patterns and disease diagnostics, monitoring, and prognosis, and supports personalized healthcare in primary care.
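The weighting, summation, annihilation and reporting steps named in this abstract can be read as a linear classifier. The sketch below is only an in-silico illustration under that assumption: the function name `classify`, the example weights and the zero threshold are hypothetical and are not taken from the paper, which implements these steps with DNA reactions rather than arithmetic.

```python
def classify(inputs, weights, threshold=0.0):
    """Toy linear classifier mirroring weight -> sum -> annihilate -> report.

    inputs  : transformed input levels (hypothetical values)
    weights : signed weights; the sign assigns an input to the positive
              or negative pool (hypothetical values)
    """
    # Summation: accumulate weighted inputs into two separate pools.
    positive = sum(w * x for w, x in zip(weights, inputs) if w > 0)
    negative = sum(-w * x for w, x in zip(weights, inputs) if w < 0)

    # Annihilation: equal amounts from the two pools cancel,
    # leaving only the excess of one pool over the other.
    remainder = positive - negative

    # Reporting: the sign of the remainder gives the class label.
    label = "positive" if remainder > threshold else "negative"
    return label, remainder


if __name__ == "__main__":
    label, remainder = classify([1.0, 2.0, 0.5], [2.0, -1.0, 1.0])
    print(label, remainder)
```

Splitting the sum into a positive and a negative pool before subtracting is a design choice made here to mirror the annihilation step; numerically it is the same as a single signed weighted sum.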
Affiliation(s)
- Li Zhang
  - Key Laboratory of Laboratory Medical Diagnostics, Ministry of Education, Department of Laboratory Medicine, Chongqing Medical University, Chongqing, 400016, China
  - Department of Forensic Medicine, Chongqing Medical University, Chongqing, 400016, China
- Qian Liu
  - Nuclear Medicine Department, The Second Affiliated Hospital of Chongqing Medical University, Chongqing, 400010, China
- Yongcan Guo
  - Clinical Laboratory, Traditional Chinese Medicine Hospital Affiliated to Southwest Medical University, Luzhou, 646000, China
- Luyao Tian
  - Key Laboratory of Laboratory Medical Diagnostics, Ministry of Education, Department of Laboratory Medicine, Chongqing Medical University, Chongqing, 400016, China
- Kena Chen
  - Key Laboratory of Laboratory Medical Diagnostics, Ministry of Education, Department of Laboratory Medicine, Chongqing Medical University, Chongqing, 400016, China
- Dan Bai
  - Key Laboratory of Laboratory Medical Diagnostics, Ministry of Education, Department of Laboratory Medicine, Chongqing Medical University, Chongqing, 400016, China
- Hongyan Yu
  - Key Laboratory of Laboratory Medical Diagnostics, Ministry of Education, Department of Laboratory Medicine, Chongqing Medical University, Chongqing, 400016, China
- Xiaole Han
  - Key Laboratory of Laboratory Medical Diagnostics, Ministry of Education, Department of Laboratory Medicine, Chongqing Medical University, Chongqing, 400016, China
- Wang Luo
  - Key Laboratory of Laboratory Medical Diagnostics, Ministry of Education, Department of Laboratory Medicine, Chongqing Medical University, Chongqing, 400016, China
- Tong Feng
  - Key Laboratory of Laboratory Medical Diagnostics, Ministry of Education, Department of Laboratory Medicine, Chongqing Medical University, Chongqing, 400016, China
- Shixiong Deng
  - Department of Forensic Medicine, Chongqing Medical University, Chongqing, 400016, China
- Guoming Xie
  - Key Laboratory of Laboratory Medical Diagnostics, Ministry of Education, Department of Laboratory Medicine, Chongqing Medical University, Chongqing, 400016, China
2. Huang M, Ma J, Zhang J. Inferring cell developmental stage-specific lncRNA regulation in the developing human neocortex with CDSlncR. Front Mol Neurosci 2023; 15:1037565. [PMID: 36710930 PMCID: PMC9880432 DOI: 10.3389/fnmol.2022.1037565] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2022] [Accepted: 12/26/2022] [Indexed: 01/15/2023]
Abstract
Noncoding RNAs (ncRNAs) occupy ~98% of the transcriptome in humans, and are usually not translated into proteins. Among ncRNAs, long non-coding RNAs (lncRNAs, >200 nucleotides) are important regulators that modulate gene expression, and are involved in many biological processes (e.g., cell development). To study lncRNA regulation, many computational approaches or tools have been proposed using bulk transcriptomics data. Nevertheless, previous bulk data-driven methods are mostly limited to exploring lncRNA regulation across all cells, rather than lncRNA regulation specific to cell developmental stages. Fortunately, recent advances in single-cell sequencing have provided a way to investigate cell developmental stage-specific lncRNA regulation. In this work, we present a novel computational method, CDSlncR (Cell Developmental Stage-specific lncRNA regulation), which combines putative lncRNA-target binding information with single-cell transcriptomics data to infer cell developmental stage-specific lncRNA regulation. For each cell developmental stage, CDSlncR constructs a stage-specific lncRNA regulatory network. To illustrate the effectiveness of CDSlncR, we apply it to single-cell transcriptomics data of the developing human neocortex to explore lncRNA regulation across different human neocortex developmental stages. Network analysis shows that lncRNA regulation is unique to each developmental stage of the human neocortex. As a case study, we also perform a focused analysis of the cell developmental stage-specific lncRNA regulation related to 18 known lncRNA biomarkers in autism spectrum disorder. Finally, the comparison result indicates that CDSlncR is an effective method for predicting cell developmental stage-specific lncRNA targets. CDSlncR is available at https://github.com/linxi159/CDSlncR.
Affiliation(s)
- Meng Huang
  - Department of Automation, Xiamen University, Xiamen, China
  - Department of Computer Science, University of Tsukuba, Tsukuba, Japan
- Jiangtao Ma
  - Department of Automation, Xiamen University, Xiamen, China
  - School of Engineering, Dali University, Dali, China
- Junpeng Zhang (correspondence)
  - School of Engineering, Dali University, Dali, China
3. Zhou Y, Wang X, Yao L, Zhu M. LDAformer: predicting lncRNA-disease associations based on topological feature extraction and Transformer encoder. Brief Bioinform 2022; 23:6696138. [PMID: 36094081 DOI: 10.1093/bib/bbac370] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 05/25/2022] [Revised: 07/27/2022] [Accepted: 08/06/2022] [Indexed: 12/14/2022]
Abstract
The identification of long noncoding RNA (lncRNA)-disease associations is of great value for disease diagnosis and treatment, and computational methods are now commonly used to predict potential lncRNA-disease associations. However, the existing methods do not sufficiently extract key features during data processing, and their learning models are either less powerful or overly complex. Therefore, there is still potential to achieve better predictive performance by improving these two aspects. In this work, we propose a novel lncRNA-disease association prediction method, LDAformer, based on topological feature extraction and a Transformer encoder. We construct the heterogeneous network by integrating the associations between lncRNAs, diseases and microRNAs (miRNAs). Intra-class similarities and inter-class associations are presented as the lncRNA-disease-miRNA weighted adjacency matrix to unify semantics. Next, we design a topological feature extraction process to further obtain multi-hop topological pathway features latent in the adjacency matrix. Finally, to capture the interdependencies between heterogeneous pathways, a Transformer encoder based on the global self-attention mechanism is employed to predict lncRNA-disease associations. The efficient feature extraction and the intuitive and powerful learning model lead to ideal performance. The results of computational experiments on two datasets show that our method outperforms the state-of-the-art baseline methods. Additionally, case studies further indicate its capability to discover new associations accurately.
Affiliation(s)
- Yi Zhou
  - College of Computer Science, Sichuan University, 1st Ring Road South 1 Section, 610065, Chengdu, China
- Xinyi Wang
  - College of Computer Science, Sichuan University, 1st Ring Road South 1 Section, 610065, Chengdu, China
- Lin Yao
  - College of Computer Science, Sichuan University, 1st Ring Road South 1 Section, 610065, Chengdu, China
- Min Zhu
  - College of Computer Science, Sichuan University, 1st Ring Road South 1 Section, 610065, Chengdu, China
4. Wang MN, Lei LL, He W, Ding DW. SPCMLMI: A structural perturbation-based matrix completion method to predict lncRNA–miRNA interactions. Front Genet 2022; 13:1032428. [DOI: 10.3389/fgene.2022.1032428] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2022] [Accepted: 10/28/2022] [Indexed: 11/17/2022]
Abstract
Accumulating evidence indicates that interactions between lncRNAs and miRNAs are crucial for gene regulation: they can regulate gene transcription and thereby affect the occurrence and development of many complex diseases. Accurate identification of interactions between lncRNAs and miRNAs is helpful for the diagnosis and therapeutics of complex diseases. However, the number of known lncRNA–miRNA interactions is still very limited, and identifying them through biological experiments is time-consuming and expensive. There is an urgent need to develop more accurate and efficient computational methods to infer lncRNA–miRNA interactions. In this work, we developed a matrix completion approach based on structural perturbation to infer lncRNA–miRNA interactions (SPCMLMI). Specifically, we first calculated the similarities of lncRNAs and miRNAs, including the lncRNA expression profile similarity, miRNA expression profile similarity, lncRNA sequence similarity, and miRNA sequence similarity. Second, a bilayer network was constructed by integrating the known interaction network, lncRNA similarity network, and miRNA similarity network. Finally, a structural perturbation-based matrix completion method was used to predict potential interactions of lncRNAs with miRNAs. To evaluate the prediction performance of SPCMLMI, five-fold cross validation and a series of comparison experiments were implemented. SPCMLMI achieved AUCs of 0.8984 and 0.9891 on two different datasets, which is superior to the other compared methods. Case studies for lncRNA XIST and miRNA hsa-mir-195-5p further confirmed the effectiveness of our method in inferring lncRNA–miRNA interactions. Furthermore, we found that the structural consistency of the bilayer network was higher than that of other related networks. The results suggest that SPCMLMI can be used as a useful tool to predict interactions between lncRNAs and miRNAs.
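For readers unfamiliar with the evaluation protocol mentioned in this abstract, the sketch below shows how an AUC can be computed from prediction scores and how samples can be partitioned into five folds. It is a generic illustration, not the SPCMLMI code: the function names `auc` and `k_fold_indices` are hypothetical, the scores are made up, and the fold assignment is deterministic, whereas a real evaluation would normally shuffle the samples first.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive sample is
    scored higher than a randomly chosen negative one (ties count 0.5).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_samples, k=5):
    """Split sample indices 0..n_samples-1 into k disjoint test folds."""
    return [list(range(i, n_samples, k)) for i in range(k)]


if __name__ == "__main__":
    # Made-up scores for four candidate interactions (1 = known interaction).
    print(auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
    for fold in k_fold_indices(10, k=5):
        print("held-out indices:", fold)
```

In each round of five-fold cross validation, one fold is held out for testing and the remaining four are used to fit the model; the AUC is then computed on the held-out scores.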
5. Zhang W, Wei H, Liu B. idenMD-NRF: a ranking framework for miRNA-disease association identification. Brief Bioinform 2022; 23:6604995. [PMID: 35679537 DOI: 10.1093/bib/bbac224] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/15/2022] [Revised: 04/18/2022] [Accepted: 05/11/2022] [Indexed: 11/12/2022]
Abstract
Identifying miRNA-disease associations is an important task for revealing the pathogenic mechanisms of complex diseases. Different computational methods have been proposed. Although these methods achieve encouraging performance in detecting missing associations between known miRNAs and diseases, accurately predicting the diseases associated with new miRNAs remains difficult. In this regard, a ranking framework named idenMD-NRF is proposed for miRNA-disease association identification. idenMD-NRF treats miRNA-disease association identification as an information retrieval task. Given a novel query miRNA, idenMD-NRF employs a Learning to Rank algorithm to rank associated diseases based on high-level association features and various predictors. The experimental results on two independent test datasets indicate that idenMD-NRF is superior to the other compared predictors. A user-friendly web server for the idenMD-NRF predictor is freely available at http://bliulab.net/idenMD-NRF/.
Affiliation(s)
- Wenxiang Zhang
  - School of Computer Science and Technology, Beijing Institute of Technology, Beijing, 100081, China
- Hang Wei
  - School of Computer Science and Technology, Xidian University, Xi'an, Shaanxi 710071, China
- Bin Liu
  - School of Computer Science and Technology, Beijing Institute of Technology, Beijing, 100081, China
  - Advanced Research Institute of Multidisciplinary Science, Beijing Institute of Technology, Beijing, 100081, China
6. Fan Y, Chen M, Pan X. GCRFLDA: scoring lncRNA-disease associations using graph convolution matrix completion with conditional random field. Brief Bioinform 2021; 23:6363052. [PMID: 34486019 DOI: 10.1093/bib/bbab361] [Citation(s) in RCA: 35] [Impact Index Per Article: 11.7] [Received: 05/25/2021] [Revised: 07/19/2021] [Accepted: 08/16/2021] [Indexed: 12/12/2022]
Abstract
Long noncoding RNAs (lncRNAs) play important roles in various biological regulatory processes, and are closely related to the occurrence and development of diseases. Identifying lncRNA-disease associations is valuable for revealing the molecular mechanisms of diseases and exploring treatment strategies. Thus, it is necessary to computationally predict lncRNA-disease associations as a complement to biological experiments. In this study, we proposed a novel prediction method, GCRFLDA, based on graph convolutional matrix completion. GCRFLDA first constructed a graph using the available lncRNA-disease association information. Then, it constructed an encoder consisting of a conditional random field and an attention mechanism to learn efficient embeddings of nodes, and a decoder layer to score lncRNA-disease associations. In GCRFLDA, the Gaussian interaction profile kernel similarity and cosine similarity were fused as side information of lncRNA and disease nodes. Experimental results on four benchmark datasets show that GCRFLDA is superior to other existing methods. Moreover, we conducted case studies on four diseases and observed that 70 of 80 predicted associated lncRNAs were confirmed by the literature.
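The two similarity measures named in this abstract have standard definitions, sketched below for binary association profiles. This is an illustrative sketch rather than the GCRFLDA implementation: the bandwidth parameter `gamma_prime`, the toy profiles, and the equal-weight averaging in `fuse` are assumptions, since the abstract does not say how the two similarities were fused.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two profile vectors (0.0 if either is all-zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel: K[i][j] = exp(-gamma * ||p_i - p_j||^2).

    gamma is the bandwidth gamma_prime divided by the mean squared norm
    of the profiles, the usual normalisation for this kernel.
    """
    mean_sq_norm = sum(sum(x * x for x in p) for p in profiles) / len(profiles)
    if mean_sq_norm == 0.0:
        raise ValueError("all interaction profiles are zero")
    gamma = gamma_prime / mean_sq_norm
    return [
        [math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(pi, pj))) for pj in profiles]
        for pi in profiles
    ]


def fuse(sim_a, sim_b):
    """Assumed fusion rule: element-wise average of two similarity matrices."""
    return [[(a + b) / 2.0 for a, b in zip(ra, rb)] for ra, rb in zip(sim_a, sim_b)]


if __name__ == "__main__":
    # Rows: toy lncRNAs; columns: toy diseases (1 = known association).
    profiles = [[1, 0, 1], [1, 0, 1], [0, 1, 0]]
    gip = gip_kernel(profiles)
    cos = [[cosine_similarity(p, q) for q in profiles] for p in profiles]
    print(fuse(gip, cos))
```

Each row of `profiles` is one node's interaction profile, so identical profiles get a kernel value of 1 and profiles with no shared associations get a cosine similarity of 0.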
Affiliation(s)
- Yongxian Fan
  - School of Computer Science and Information Security, Guilin University of Electronic Technology
- Meijun Chen
  - Guilin University of Electronic Technology, Guilin 541004, China
- Xiaoyong Pan
  - Department of Automation of Shanghai Jiao Tong University
7. Chowdhary A, Satagopam V, Schneider R. Long Non-coding RNAs: Mechanisms, Experimental, and Computational Approaches in Identification, Characterization, and Their Biomarker Potential in Cancer. Front Genet 2021; 12:649619. [PMID: 34276764 PMCID: PMC8281131 DOI: 10.3389/fgene.2021.649619] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 01/05/2021] [Accepted: 04/20/2021] [Indexed: 01/09/2023]
Abstract
Long non-coding RNAs are a diverse class of non-coding RNA molecules >200 base pairs in length with various functions such as gene regulation, dosage compensation, and epigenetic regulation. Dysregulation and genomic variations of several lncRNAs have been implicated in several diseases. Their tissue-specific and development-specific expression makes them viable indicators of the physiological states of cells. Here we present a comprehensive review of the molecular mechanisms and functions of lncRNAs, the state-of-the-art experimental and computational pipelines and challenges involved in their identification and functional annotation, and their prospects as biomarkers. We also illustrate the application of co-expression networks on the TCGA-LIHC dataset for putative functional predictions of lncRNAs having therapeutic potential in hepatocellular carcinoma (HCC).
Affiliation(s)
- Anshika Chowdhary
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Venkata Satagopam
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Reinhard Schneider
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
8. Barshir R, Fishilevich S, Iny-Stein T, Zelig O, Mazor Y, Guan-Golan Y, Safran M, Lancet D. GeneCaRNA: A Comprehensive Gene-centric Database of Human Non-coding RNAs in the GeneCards Suite. J Mol Biol 2021; 433:166913. [PMID: 33676929 DOI: 10.1016/j.jmb.2021.166913] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Received: 09/26/2020] [Revised: 01/14/2021] [Accepted: 02/25/2021] [Indexed: 12/20/2022]
Abstract
Non-coding RNA (ncRNA) genes assume increasing biological importance, with growing associations with diseases. Many ncRNA sources are transcript-centric, but for non-coding variant analysis and disease decipherment it is essential to transform this information into a comprehensive set of genome-mapped ncRNA genes. We present GeneCaRNA, a new all-inclusive gene-centric ncRNA database within the GeneCards Suite. GeneCaRNA information is integrated from four community-backed data structures: the major transcript database RNAcentral with its 20 encompassed databases, and the ncRNA entries of three major gene resources HGNC, Ensembl and NCBI Gene. GeneCaRNA presents 219,587 ncRNA gene pages, a 7-fold increase from those available in our three gene mining sources. Each ncRNA gene has wide-ranging annotation, mined from >100 worldwide sources, providing a powerful GeneCards-leveraged search. The latter empowers VarElect, our disease-gene interpretation tool, allowing one to systematically decipher ncRNA variants. The combined power of GeneCaRNA with GeneHancer, our regulatory elements database, facilitates wide-ranging scrutiny of the non-coding terra incognita of gene networks and whole genome analyses.
Affiliation(s)
- Ruth Barshir
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 7610010, Israel
- Simon Fishilevich
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 7610010, Israel
- Tsippi Iny-Stein
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 7610010, Israel
- Ofer Zelig
  - LifeMap Sciences Inc., Alameda, CA 94501, USA
- Yaron Mazor
  - LifeMap Sciences Inc., Alameda, CA 94501, USA
- Marilyn Safran
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 7610010, Israel
- Doron Lancet
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 7610010, Israel
9. Zhu L, Yang X, Zhu R, Yu L. Identifying Discriminative Biological Function Features and Rules for Cancer-Related Long Non-coding RNAs. Front Genet 2021; 11:598773. [PMID: 33391350 PMCID: PMC7772407 DOI: 10.3389/fgene.2020.598773] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/25/2020] [Accepted: 11/23/2020] [Indexed: 01/17/2023]
Abstract
Cancer has been a major public health problem worldwide for many centuries. Cancer is a complex disease associated with accumulative genetic mutations, epigenetic aberrations, chromosomal instability, and expression alteration. Increasing lines of evidence suggest that many non-coding transcripts, termed non-coding RNAs, have important regulatory roles in cancer. In particular, long non-coding RNAs (lncRNAs) play crucial roles in tumorigenesis. Cancer-related lncRNAs serve as oncogenic factors or tumor suppressors. Although many lncRNAs have been identified as potential regulators of tumorigenesis using traditional experimental methods, these methods are time-consuming and expensive given the very large number of lncRNAs to be examined. Thus, effective and fast approaches for recognizing tumor-related lncRNAs should be developed. Such an approach should not only help us understand the mechanisms by which lncRNAs participate in tumorigenesis but also perform satisfactorily in distinguishing cancer-related lncRNAs. In this study, we utilized a decision tree (DT), a type of rule learning algorithm, to investigate cancer-related lncRNAs with functional annotation contents [gene ontology (GO) terms and KEGG pathways] of their co-expressed genes. Cancer-related and other lncRNAs encoded by the key enrichment features of GO and KEGG filtered by feature selection methods were used to build an informative DT, which further induced several decision rules. The rules provided not only a new tool for identifying cancer-related lncRNAs but also connected the lncRNAs and cancers with the combinations of GO terms. The results provide new directions for understanding cancer-related lncRNAs.
Affiliation(s)
- Liucun Zhu
  - School of Life Sciences, Shanghai University, Shanghai, China
- Xin Yang
  - School of Life Sciences, Shanghai University, Shanghai, China
- Rui Zhu
  - School of Life Sciences, Shanghai University, Shanghai, China
- Lei Yu
  - Department of Medical Oncology, Shanghai Concord Medical Cancer Center, Shanghai, China
10. Yan C, Zhang Z, Bao S, Hou P, Zhou M, Xu C, Sun J. Computational Methods and Applications for Identifying Disease-Associated lncRNAs as Potential Biomarkers and Therapeutic Targets. Mol Ther Nucleic Acids 2020; 21:156-171. [PMID: 32585624 PMCID: PMC7321789 DOI: 10.1016/j.omtn.2020.05.018] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 03/13/2020] [Revised: 04/06/2020] [Accepted: 05/18/2020] [Indexed: 12/12/2022]
Abstract
Long non-coding RNAs (lncRNAs) have been recognized as critical components of a broad genomic regulatory network and play pivotal roles in physiological and pathological processes. Identification of disease-associated lncRNAs is becoming increasingly crucial for fundamentally improving our understanding of molecular mechanisms of disease and developing novel biomarkers and therapeutic targets. Considering lower efficiency and higher time and labor cost of biological experiments, computer-aided inference of disease-associated RNAs has become a promising avenue for facilitating the study of lncRNA functions and provides complementary value for experimental studies. In this study, we first summarize data and knowledge resources publicly available for the study of lncRNA-disease associations. Then, we present an updated systematic overview of dozens of computational methods and models for inferring lncRNA-disease associations proposed in recent years. Finally, we explore the perspectives and challenges for further studies. Our study provides a guide for biologists and medical scientists to look for dedicated resources and more competent tools for accelerating the unraveling of disease-associated lncRNAs.
Affiliation(s)
- Congcong Yan
  - School of Biomedical Engineering, School of Ophthalmology & Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325027, P.R. China
- Zicheng Zhang
  - School of Biomedical Engineering, School of Ophthalmology & Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325027, P.R. China
- Siqi Bao
  - School of Biomedical Engineering, School of Ophthalmology & Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325027, P.R. China
- Ping Hou
  - School of Biomedical Engineering, School of Ophthalmology & Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325027, P.R. China
- Meng Zhou
  - School of Biomedical Engineering, School of Ophthalmology & Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325027, P.R. China
- Chongyong Xu
  - Department of Radiology, The Second Affiliated Hospital of Wenzhou Medical University, Wenzhou 325027, P.R. China
- Jie Sun
  - School of Biomedical Engineering, School of Ophthalmology & Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325027, P.R. China
11. iterb-PPse: Identification of transcriptional terminators in bacterial by incorporating nucleotide properties into PseKNC. PLoS One 2020; 15:e0228479. [PMID: 32413030 PMCID: PMC7228126 DOI: 10.1371/journal.pone.0228479] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 01/15/2020] [Accepted: 05/01/2020] [Indexed: 11/19/2022]
Abstract
A terminator is a DNA sequence that gives RNA polymerase the transcriptional termination signal. Identifying terminators correctly can improve genome annotation and, more importantly, has considerable application value in disease diagnosis and therapy. However, accurate prediction methods are lacking and urgently needed. Therefore, we proposed a prediction method for terminators, "iterb-PPse", which incorporates 47 nucleotide properties into PseKNC-Ⅰ and PseKNC-Ⅱ and uses Extreme Gradient Boosting to predict terminators in Escherichia coli and Bacillus subtilis. Combining these with the preceding methods, we employed three new feature extraction methods, K-pwm, Base-content, and Nucleotidepro, to formulate raw samples. A two-step method was applied to select features. When identifying terminators based on the optimized features, we compared five single models as well as 16 ensemble models. As a result, the accuracy of our method on the benchmark dataset reached 99.88%, higher than that of the existing state-of-the-art predictor iTerm-PseKNC in a five-fold cross-validation test repeated 100 times. Its prediction accuracy on two independent datasets reached 94.24% and 99.45%, respectively. For the convenience of users, we developed software of the same name on the basis of "iterb-PPse". The open software and source code of "iterb-PPse" are available at https://github.com/Sarahyouzi/iterb-PPse.
12. Liu F, Dong H, Mei Z, Huang T. Investigation of miRNA and mRNA Co-expression Network in Ependymoma. Front Bioeng Biotechnol 2020; 8:177. [PMID: 32266223 PMCID: PMC7096354 DOI: 10.3389/fbioe.2020.00177] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 01/02/2020] [Accepted: 02/20/2020] [Indexed: 12/18/2022]
Abstract
Ependymoma (EPN) is a rare primary tumor of the central nervous system (CNS) that affects both children and adults. Despite the definition and classification of distinct molecular subgroups, there remains a group of EPNs with a balanced genome, which makes it difficult to predict the prognosis of patients with EPN. The role of the miRNA-mRNA network in EPN is still poorly understood. We assessed the involvement of miRNA-mRNA pairs in EPN by applying a weighted gene co-expression network analysis (WGCNA) approach. Using whole genome expression profile analysis followed by functional enrichment, we detected hub genes involved in active proliferation and DNA replication of nerve cells. Key genes including CYP11B1, KRT33B, RUNX1T1, SIK1, MAP3K4, MLANA, and SFRP5 identified in co-expression networks were regulated by miR-15a and miR-24-1. These seven miRNA-mRNA pairs were considered to influence not only pathways in cancer and tumor suppression processes, but also MAPK, NF-kappaB, and WNT signaling pathways, which are associated with tumorigenesis and development. This study provides novel insight into potential diagnostic biomarkers of EPN and may have value in choosing therapeutic targets with clinical utility.
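The core step of a weighted co-expression network is turning pairwise correlations into soft-thresholded edge weights. The sketch below illustrates that step only, as a general description of the technique rather than the analysis performed in this study: the gene labels `g1` to `g3`, the expression values, and the power `beta=6` are illustrative assumptions, and an unsigned network (absolute correlation) is assumed.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0.0 or sd_y == 0.0:
        return 0.0  # constant profile: correlation undefined, treat as no edge
    return cov / (sd_x * sd_y)


def soft_threshold_adjacency(expression, beta=6):
    """Unsigned weighted adjacency: a_ij = |cor(x_i, x_j)| ** beta.

    Raising to the power beta keeps strong correlations and pushes weak
    ones towards zero without applying a hard cut-off.
    """
    genes = list(expression)
    return {
        (a, b): abs(pearson(expression[a], expression[b])) ** beta
        for a in genes
        for b in genes
        if a != b
    }


if __name__ == "__main__":
    # Hypothetical expression values for three genes across four samples.
    expression = {
        "g1": [1.0, 2.0, 3.0, 4.0],
        "g2": [2.0, 4.0, 6.0, 8.0],
        "g3": [1.0, -1.0, 1.0, -1.0],
    }
    for pair, weight in sorted(soft_threshold_adjacency(expression).items()):
        print(pair, round(weight, 4))
```

Here the perfectly correlated pair keeps a weight close to 1, while the weakly correlated pairs are driven close to zero; modules and hub genes are then derived from this weighted adjacency.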
Affiliation(s)
- Feili Liu
  - Department of Neurosurgery, Huashan Hospital, Fudan University, Shanghai, China
- Hang Dong
  - Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China
- Zi Mei
  - Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China
- Tao Huang
  - Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China
13. Li Y, Jiang T, Zhou W, Li J, Li X, Wang Q, Jin X, Yin J, Chen L, Zhang Y, Xu J, Li X. Pan-cancer characterization of immune-related lncRNAs identifies potential oncogenic biomarkers. Nat Commun 2020; 11:1000. [PMID: 32081859 PMCID: PMC7035327 DOI: 10.1038/s41467-020-14802-2] [Citation(s) in RCA: 263] [Impact Index Per Article: 65.8] [Received: 07/05/2019] [Accepted: 02/03/2020] [Indexed: 12/18/2022]
Abstract
Long noncoding RNAs (lncRNAs) are emerging as critical regulators of gene expression and they play fundamental roles in immune regulation. Here we introduce an integrated algorithm, ImmLnc, for identifying lncRNA regulators of immune-related pathways. We comprehensively chart the landscape of lncRNA regulation in the immunome across 33 cancer types and show that cancers with similar tissue origin are likely to share lncRNA immune regulators. Moreover, the immune-related lncRNAs are likely to show expression perturbation in cancer and are significantly correlated with immune cell infiltration. ImmLnc can help prioritize cancer-related lncRNAs and further identify three molecular subtypes (proliferative, intermediate, and immunological) of non-small cell lung cancer. These subtypes are characterized by differences in mutation burden, immune cell infiltration, expression of immunomodulatory genes, response to chemotherapy, and prognosis. In summary, the ImmLnc pipeline and the resulting data serve as a valuable resource for understanding lncRNA function and to advance identification of immunotherapy targets.
Affiliation(s)
- Yongsheng Li
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, 150081, China
  - Key Laboratory of Tropical Translational Medicine of Ministry of Education, Hainan Medical University, Haikou, 571199, China
  - College of Biomedical Information and Engineering, Hainan Medical University, Haikou, Hainan, 570100, China
- Tiantongfei Jiang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, 150081, China
- Weiwei Zhou
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, 150081, China
- Junyi Li
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, 150081, China
- Xinhui Li
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, 150081, China
- Qi Wang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, 150081, China
- Xiaoyan Jin
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, 150081, China
- Jiaqi Yin
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, 150081, China
- Liuxin Chen
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, 150081, China
- Yunpeng Zhang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, 150081, China
- Juan Xu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, 150081, China
  - Key Laboratory of Tropical Translational Medicine of Ministry of Education, Hainan Medical University, Haikou, 571199, China
  - College of Biomedical Information and Engineering, Hainan Medical University, Haikou, Hainan, 570100, China
- Xia Li
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, 150081, China
  - Key Laboratory of Tropical Translational Medicine of Ministry of Education, Hainan Medical University, Haikou, 571199, China
  - College of Biomedical Information and Engineering, Hainan Medical University, Haikou, Hainan, 570100, China
14
|
Fan Y, Cui J, Zhu Q. Heterogeneous graph inference based on similarity network fusion for predicting lncRNA–miRNA interaction. RSC Adv 2020; 10:11634-11642. [PMID: 35496629 PMCID: PMC9050493 DOI: 10.1039/c9ra11043g] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 12/31/2019] [Accepted: 03/14/2020] [Indexed: 12/28/2022]
Abstract
LncRNA and miRNA are two non-coding RNA types that are popular in current research. LncRNA interacts with miRNA to regulate gene transcription, further affecting human health and disease. Accurate identification of lncRNA–miRNA interactions contributes to the in-depth study of the biological functions and mechanisms of non-coding RNA. However, relying on biological experiments to obtain interaction information is time-consuming and expensive. Considering the rapid accumulation of gene information and the few computational methods, it is urgent to supplement the effective computational models to predict lncRNA–miRNA interactions. In this work, we propose a heterogeneous graph inference method based on similarity network fusion (SNFHGILMI) to predict potential lncRNA–miRNA interactions. First, we calculated multiple similarity data, including lncRNA sequence similarity, miRNA sequence similarity, lncRNA Gaussian nuclear similarity, and miRNA Gaussian nuclear similarity. Second, the similarity network fusion method was employed to integrate the data and get the similarity network of lncRNA and miRNA. Then, we constructed a bipartite network by combining the known interaction network and similarity network of lncRNA and miRNA. Finally, the heterogeneous graph inference method was introduced to construct a prediction model. On the real dataset, the model SNFHGILMI achieved AUC of 0.9501 and 0.9426 ± 0.0035 based on LOOCV and 5-fold cross validation, respectively. Furthermore, case studies also demonstrate that SNFHGILMI is a high-performance prediction method that can accurately predict new lncRNA–miRNA interactions. The Matlab code and readme file of SNFHGILMI can be downloaded from https://github.com/cj-DaSE/SNFHGILMI.
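The "Gaussian nuclear similarity" named in this abstract is more commonly called Gaussian (interaction profile) kernel similarity. The abstract does not give the formula, so the following is only a minimal sketch of the form widely used in interaction-prediction work, not the authors' Matlab implementation: each lncRNA is represented by its row of a binary lncRNA-by-miRNA interaction matrix, and two rows are compared with a Gaussian kernel whose bandwidth is normalised by the mean squared row norm. The function name, the `gamma_prime` parameter, and the toy matrix are all hypothetical.

```python
import math


def gaussian_kernel_similarity(profiles, gamma_prime=1.0):
    """Pairwise Gaussian kernel similarity between interaction profiles.

    profiles: list of equal-length 0/1 lists, one row per lncRNA (or miRNA).
    gamma_prime: assumed bandwidth hyperparameter (not taken from the paper).
    """
    n = len(profiles)
    # Normalise the bandwidth by the average squared norm of the profiles.
    sq_norms = [sum(v * v for v in row) for row in profiles]
    mean_sq_norm = sum(sq_norms) / n
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = gamma_prime / mean_sq_norm

    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            dist_sq = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            sim[i][j] = sim[j][i] = math.exp(-gamma * dist_sq)
    return sim


# Toy interaction matrix (made up): rows are lncRNAs, columns are miRNAs.
interactions = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]
lnc_sim = gaussian_kernel_similarity(interactions)

# The miRNA-side similarity uses the transposed matrix (columns as profiles).
mirna_profiles = [list(col) for col in zip(*interactions)]
mi_sim = gaussian_kernel_similarity(mirna_profiles)
```

Identical profiles get similarity 1, and similarity decays as profiles share fewer partners. In a pipeline like the one described, matrices of this kind would be among the inputs passed to the similarity network fusion step.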
Affiliation(s)
- Yongxian Fan
- School of Computer and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
- Juan Cui
- School of Computer and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
- QingQi Zhu
- School of Computer and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
|
15
|
Pan X, Shen HB. Inferring Disease-Associated MicroRNAs Using Semi-supervised Multi-Label Graph Convolutional Networks. iScience 2019; 20:265-277. [PMID: 31605942 PMCID: PMC6817654 DOI: 10.1016/j.isci.2019.09.013] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 06/11/2019] [Revised: 09/05/2019] [Accepted: 09/11/2019] [Indexed: 01/22/2023]
Abstract
MicroRNAs (miRNAs) play crucial roles in biological processes involved in diseases. The associations between diseases and protein-coding genes (PCGs) have been well investigated, and miRNAs interact with PCGs to trigger them to be functional. We present a computational method, DimiG, to infer miRNA-associated diseases using a semi-supervised Graph Convolutional Network model (GCN). DimiG uses a multi-label framework to integrate PCG-PCG interactions, PCG-miRNA interactions, PCG-disease associations, and tissue expression profiles. DimiG is trained on disease-PCG associations and an interaction network using a GCN, which is further used to score associations between diseases and miRNAs. We evaluate DimiG on a benchmark set from verified disease-miRNA associations. Our results demonstrate that DimiG outperforms the best unsupervised method and is comparable to two supervised methods. Three case studies of prostate cancer, lung cancer, and inflammatory bowel disease further demonstrate the efficacy of DimiG, where top miRNAs predicted by DimiG are supported by literature.
Affiliation(s)
- Xiaoyong Pan
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, 200240 Shanghai, China; Department of Medical informatics, Erasmus Medical Center, 3015 CE Rotterdam, the Netherlands.
- Hong-Bin Shen
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, 200240 Shanghai, China.
|