51
Lin Y, Liu T, Cui T, Wang Z, Zhang Y, Tan P, Huang Y, Yu J, Wang D. RNAInter in 2020: RNA interactome repository with increased coverage and annotation. Nucleic Acids Res 2020; 48:D189-D197. [PMID: 31906603 PMCID: PMC6943043 DOI: 10.1093/nar/gkz804] [Citation(s) in RCA: 154] [Impact Index Per Article: 38.5] [Received: 08/08/2019] [Revised: 09/03/2019] [Accepted: 09/10/2019] [Indexed: 01/23/2023] Open
Abstract
Research on RNA-associated interactions has exploded in recent years, and increasing numbers of studies are not limited to RNA-RNA and RNA-protein interactions but also include RNA-DNA/compound interactions. To facilitate the development of the interactome and promote understanding of the biological functions and molecular mechanisms of RNA, we updated RAID v2.0 to RNAInter (RNA Interactome Database), a repository for RNA-associated interactions that is freely accessible at http://www.rna-society.org/rnainter/ or http://www.rna-society.org/raid/. Compared to RAID v2.0, new features in RNAInter include (i) 8-fold more interaction data and 94 additional species; (ii) more definite annotations organized, including RNA editing/localization/modification/structure and homology interaction; (iii) advanced functions including fuzzy/batch search, interaction network and RNA dynamic expression and (iv) four embedded RNA interactome tools: RIscoper, IntaRNA, PRIdictor and DeepBind. Consequently, RNAInter contains >41 million RNA-associated interaction entries, involving more than 450 thousand unique molecules, including RNA, protein, DNA and compound. Overall, RNAInter provides a comprehensive RNA interactome resource for researchers and paves the way to investigate the regulatory landscape of cellular RNAs.
Affiliation(s)
- Yunqing Lin
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Tianyuan Liu
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China
- Tianyu Cui
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China
- Zhao Wang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Yuncong Zhang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
- Puwen Tan
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China
- Yan Huang
  - Shunde Hospital, Southern Medical University (The First People's Hospital of Shunde), Foshan 528308, China
- Jia Yu
  - State Key Laboratory of Medical Molecular Biology, Department of Biochemistry & Molecular Biology, Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences (CAMS) & Peking Union Medical College (PUMC), Beijing 100730, China
- Dong Wang
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China
  - Shunde Hospital, Southern Medical University (The First People's Hospital of Shunde), Foshan 528308, China
  - Dermatology Hospital, Southern Medical University, Guangzhou 510091, China
  - Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 611731, China
  - To whom correspondence should be addressed. Tel: +86 20 61648279; Fax: +86 20 61648279; or
52
Abstract
As RNA in situ hybridization (ISH) moves into the mainstream lab and increasingly into clinical adoption and additional multiplexing techniques are developed to enable further RNA ISH identification, a set of guidelines on the validation of ISH is required. These guidelines include choice of methods, appropriate controls, and protocol optimization as well as a central core message of understanding the target, understanding the ISH technique, and using the most appropriate controlling mechanisms to enable reproducible and trustworthy data to be obtained.
53
Xu J, Cai L, Liao B, Zhu W, Wang P, Meng Y, Lang J, Tian G, Yang J. Identifying Potential miRNAs-Disease Associations With Probability Matrix Factorization. Front Genet 2019; 10:1234. [PMID: 31921290 PMCID: PMC6918542 DOI: 10.3389/fgene.2019.01234] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 05/29/2019] [Accepted: 11/06/2019] [Indexed: 12/30/2022] Open
Abstract
In recent years, miRNAs have been verified to play an irreplaceable role in biological processes associated with human disease. Discovering potential disease-related miRNAs helps explain the underlying pathogenesis of the disease at the molecular level. Given the high cost and labor intensity of biological experiments, computational predictions will be an indispensable alternative. Therefore, we design a new model called probability matrix factorization (PMFMDA). Specifically, we first integrate miRNA and disease similarity. Next, the known association matrix and integrated similarity matrix are utilized to construct a probability matrix factorization algorithm to identify potentially relevant miRNAs for disease. We find that PMFMDA achieves reliable performance in the frameworks of global leave-one-out cross validation (LOOCV) and 5-fold cross validation (AUCs are 0.9237 and 0.9187, respectively) in the HMDD (V2.0) dataset, significantly outperforming a few state-of-the-art methods including CMFMDA, IMCMDA, NCPMDA, RLSMDA, and RWRMDA. In addition, case studies show that PMFMDA has good predictive performance for new associations, and the evidence can be identified by literature mining.
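The factorization step described in this abstract can be illustrated with a minimal sketch. It is not the PMFMDA implementation: it shows only plain probabilistic matrix factorization fitted by gradient descent (the MAP estimate under Gaussian priors, which reduces to squared error plus L2 regularization) and omits the similarity integration the authors describe. The function names, the toy matrix and all hyperparameters are assumptions chosen for illustration.

```python
import numpy as np

def pmf(R, mask, rank=2, lam=0.01, lr=0.05, epochs=1000, seed=0):
    """Fit R ~ U @ V.T on the entries where mask is True (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    n, m = R.shape
    U = 0.1 * rng.standard_normal((n, rank))   # latent miRNA factors
    V = 0.1 * rng.standard_normal((m, rank))   # latent disease factors
    for _ in range(epochs):
        E = mask * (R - U @ V.T)               # residual on observed entries only
        U_new = U + lr * (E @ V - lam * U)     # gradient step with L2 shrinkage
        V = V + lr * (E.T @ U - lam * V)
        U = U_new
    return U, V

def observed_error(R, mask, U, V):
    """Sum of squared residuals over the observed entries."""
    return float(np.sum(mask * (R - U @ V.T) ** 2))

# Toy 4 miRNA x 3 disease association matrix (made-up values, not HMDD data).
R = np.array([[1, 0, 1],
              [1, 0, 0],
              [0, 1, 0],
              [0, 1, 1]], dtype=float)
mask = np.ones_like(R, dtype=bool)
mask[3, 2] = False                             # hide one entry, as in leave-one-out
U, V = pmf(R, mask)
score = (U @ V.T)[3, 2]                        # predicted score for the hidden pair
```

In a leave-one-out setting, the score of the hidden pair would be ranked against the scores of unobserved pairs to build the ROC curve.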
Affiliation(s)
- Junlin Xu
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Lijun Cai
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Bo Liao
  - School of Mathematics and Statistics, Hainan Normal University, Haikou, China
- Wen Zhu
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Peng Wang
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Yajie Meng
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Jidong Lang
  - Department of Science, Geneis Beijing Co., Ltd., Beijing, China
- Geng Tian
  - Department of Science, Geneis Beijing Co., Ltd., Beijing, China
- Jialiang Yang
  - School of Mathematics and Statistics, Hainan Normal University, Haikou, China
54
Lv H, Dao FY, Guan ZX, Zhang D, Tan JX, Zhang Y, Chen W, Lin H. iDNA6mA-Rice: A Computational Tool for Detecting N6-Methyladenine Sites in Rice. Front Genet 2019; 10:793. [PMID: 31552096 PMCID: PMC6746913 DOI: 10.3389/fgene.2019.00793] [Citation(s) in RCA: 47] [Impact Index Per Article: 9.4] [Received: 06/13/2019] [Accepted: 07/26/2019] [Indexed: 01/08/2023] Open
Abstract
DNA N6-methyladenine (6mA) is a dominant form of DNA modification and is involved in many biological functions. The accurate genome-wide identification of 6mA sites may increase understanding of its biological functions. Experimental methods for 6mA detection in eukaryotic genomes are laborious and expensive. Therefore, it is necessary to develop computational methods to identify 6mA sites on a genomic scale, especially for plant genomes. Based on this consideration, the study aims to develop a machine learning-based method of predicting 6mA sites in the rice genome. We initially used mono-nucleotide binary encoding to formulate positive and negative samples. Subsequently, the Random Forest machine learning algorithm was used to classify candidate sequences and identify 6mA sites. Our proposed method produced an area under the receiver operating characteristic curve of 0.964 with an overall accuracy of 0.917, as indicated by the fivefold cross-validation test. Furthermore, an independent dataset was established to assess the generalization ability of our method. On this dataset, an area under the receiver operating characteristic curve of 0.981 was obtained, suggesting that the proposed method performs well in predicting 6mA sites in the rice genome. For the convenience of retrieving 6mA sites, on the basis of the computational method, we built a freely accessible web server named iDNA6mA-Rice at http://lin-group.cn/server/iDNA6mA-Rice.
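The mono-nucleotide binary encoding mentioned in this abstract can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: the A/C/G/T column order, the all-zero vector for ambiguous bases and the function name `encode` are all hypothetical choices. The resulting vectors are the kind of fixed-length input a Random Forest classifier could consume.

```python
# Assumed mapping: one 4-bit indicator per nucleotide, in A, C, G, T order.
ONE_HOT = {
    "A": (1, 0, 0, 0),
    "C": (0, 1, 0, 0),
    "G": (0, 0, 1, 0),
    "T": (0, 0, 0, 1),
}

def encode(seq):
    """Flatten a DNA sequence into a binary feature vector of length 4 * len(seq)."""
    vec = []
    for base in seq.upper():
        # Assumption: unknown or ambiguous bases (e.g. N) become all zeros.
        vec.extend(ONE_HOT.get(base, (0, 0, 0, 0)))
    return vec

features = encode("ACGT")
```

Every sequence in a dataset must have the same length so that the feature vectors line up column by column.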
Affiliation(s)
- Hao Lv
  - Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Fu-Ying Dao
  - Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Zheng-Xing Guan
  - Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Dan Zhang
  - Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Jiu-Xin Tan
  - Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Yong Zhang
  - Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Wei Chen
  - Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu, China
- Hao Lin
  - Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
55
Pan-Cancer Analysis of lncRNA Regulation Supports Their Targeting of Cancer Genes in Each Tumor Context. Cell Rep 2018; 23:297-312.e12. [PMID: 29617668 PMCID: PMC5906131 DOI: 10.1016/j.celrep.2018.03.064] [Citation(s) in RCA: 175] [Impact Index Per Article: 35.0] [Received: 09/19/2017] [Revised: 02/12/2018] [Accepted: 03/15/2018] [Indexed: 12/13/2022] Open
Abstract
Long noncoding RNAs (lncRNAs) are commonly dysregulated in tumors, but only a handful are known to play pathophysiological roles in cancer. We inferred lncRNAs that dysregulate cancer pathways, oncogenes, and tumor suppressors (cancer genes) by modeling their effects on the activity of transcription factors, RNA-binding proteins, and microRNAs in 5,185 TCGA tumors and 1,019 ENCODE assays. Our predictions included hundreds of candidate onco- and tumor-suppressor lncRNAs (cancer lncRNAs) whose somatic alterations account for the dysregulation of dozens of cancer genes and pathways in each of 14 tumor contexts. To demonstrate proof of concept, we showed that perturbations targeting OIP5-AS1 (an inferred tumor suppressor) and TUG1 and WT1-AS (inferred onco-lncRNAs) dysregulated cancer genes and altered proliferation of breast and gynecologic cancer cells. Our analysis indicates that, although most lncRNAs are dysregulated in a tumor-specific manner, some, including OIP5-AS1, TUG1, NEAT1, MEG3, and TSIX, synergistically dysregulate cancer pathways in multiple tumor contexts.
Highlights
- Hundreds of lncRNAs target cancer genes and pathways in each tumor context
- lncRNA copy numbers are predictive of target cancer gene dysregulation
- Most lncRNAs are predicted to be transcriptional or post-transcriptional specialists
- lncRNAs are predicted to synergistically regulate proliferation pathways in cancer
56
Park J, Zhu Y, Tao X, Brazill JM, Li C, Wuchty S, Zhai RG. MicroRNA miR-1002 Enhances NMNAT-Mediated Stress Response by Modulating Alternative Splicing. iScience 2019; 19:1048-1064. [PMID: 31522116 PMCID: PMC6745518 DOI: 10.1016/j.isci.2019.08.052] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2018] [Revised: 05/07/2019] [Accepted: 08/27/2019] [Indexed: 11/30/2022] Open
Abstract
Understanding endogenous regulation of stress resistance and homeostasis maintenance is critical to developing neuroprotective therapies. Nicotinamide mononucleotide adenylyltransferase (NMNAT) is a conserved essential enzyme that confers extraordinary protection and stress resistance in many neurodegenerative disease models. Drosophila Nmnat is alternatively spliced to two mRNA variants, RA and RB. RB translates to protein isoform PD with robust protective activity and is upregulated upon stress to confer enhanced neuroprotection. The mechanisms regulating the alternative splicing and stress response of NMNAT remain unclear. We have discovered a Drosophila microRNA, dme-miR-1002, which promotes the splicing of NMNAT pre-mRNA to RB by disrupting a pre-mRNA stem-loop structure. NMNAT pre-mRNA is preferentially spliced to RA in basal conditions, whereas miR-1002 enhances NMNAT PD-mediated stress protection by binding via RISC component Argonaute1 to the pre-mRNA, facilitating the splicing switch to RB. These results outline a new process for microRNAs in regulating alternative splicing and modulating stress resistance.
Affiliation(s)
- Joun Park
  - Department of Molecular and Cellular Pharmacology, University of Miami Miller School of Medicine, Miami, FL 33136, USA
  - Program in Neuroscience, University of Miami Miller School of Medicine, Miami, FL 33136, USA
- Yi Zhu
  - Department of Molecular and Cellular Pharmacology, University of Miami Miller School of Medicine, Miami, FL 33136, USA
  - Program in Molecular and Cellular Pharmacology, University of Miami Miller School of Medicine, Miami, FL 33136, USA
- Xianzun Tao
  - Department of Molecular and Cellular Pharmacology, University of Miami Miller School of Medicine, Miami, FL 33136, USA
- Jennifer M Brazill
  - Department of Molecular and Cellular Pharmacology, University of Miami Miller School of Medicine, Miami, FL 33136, USA
  - Program in Molecular and Cellular Pharmacology, University of Miami Miller School of Medicine, Miami, FL 33136, USA
- Chong Li
  - Department of Molecular and Cellular Pharmacology, University of Miami Miller School of Medicine, Miami, FL 33136, USA
  - Program in Human Genetics and Genomics, University of Miami Miller School of Medicine, Miami, FL 33136, USA
- Stefan Wuchty
  - Department of Computer Science, University of Miami, Coral Gables, FL 33146, USA
- R Grace Zhai
  - Department of Molecular and Cellular Pharmacology, University of Miami Miller School of Medicine, Miami, FL 33136, USA
  - Program in Neuroscience, University of Miami Miller School of Medicine, Miami, FL 33136, USA
  - Program in Molecular and Cellular Pharmacology, University of Miami Miller School of Medicine, Miami, FL 33136, USA
  - Program in Human Genetics and Genomics, University of Miami Miller School of Medicine, Miami, FL 33136, USA
57
Assmann TS, Milagro FI, Martínez JA. Crosstalk between microRNAs, the putative target genes and the lncRNA network in metabolic diseases. Mol Med Rep 2019; 20:3543-3554. [PMID: 31485667 PMCID: PMC6755190 DOI: 10.3892/mmr.2019.10595] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 01/09/2019] [Accepted: 04/18/2019] [Indexed: 02/07/2023] Open
Abstract
MicroRNAs (miRNAs/miRs) are small non-coding RNAs (ncRNAs) that regulate gene expression. Emerging knowledge has suggested that miRNAs have a role in the pathogenesis of metabolic disorders, supporting the hypothesis that miRNAs may represent potential biomarkers or targets for this set of diseases. However, the current evidence is often controversial. Therefore, the aim of the present study was to determine the associations between miRNAs and target genes, miRNAs and long ncRNAs (lncRNAs), and miRNAs and small molecules in human metabolic diseases, including obesity, type 2 diabetes and non-alcoholic fatty liver disease. The metabolic disease-related miRNAs were obtained from the Human MicroRNA Disease Database (HMDD) and the miR2Disease database. A search of the Matrix Decomposition and Heterogeneous Graph Inference (MDHGI) and DisGeNET databases was also performed. miRNA target genes were obtained from three independent sources: Microcosm, TargetScan and miRTarBase. The miRNA-lncRNA and miRNA-small molecule interactions were identified using the miRNet web tool. The network analyses were performed using Cytoscape software. As a result, a total of 20 miRNAs were revealed to be associated with metabolic disorders in the present study. Notably, 6 miRNAs (miR-17-5p, miR-29c-3p, miR-34a-5p, miR-103a-3p, miR-107 and miR-132-3p) were found in all four resources (HMDD, miR2Disease, MDHGI, and DisGeNET) used for these analyses, presenting a stronger association with the diseases. Furthermore, the target genes of these miRNAs participate in several pathways previously associated with metabolic diseases. In addition, miRNA-lncRNA and miRNA-small molecule interactions were also found, suggesting that some molecules can modulate gene expression indirectly.
Thus, the results of this data mining and integration analysis provide further information on the possible molecular basis of metabolic disease pathogenesis and provide a path to search for potential biomarkers and therapeutic targets in metabolic diseases.
Affiliation(s)
- Taís Silveira Assmann
  - Department of Nutrition, Food Science and Physiology, Center for Nutrition Research, University of Navarra, 31008 Pamplona, Spain
- Fermín I Milagro
  - Department of Nutrition, Food Science and Physiology, Center for Nutrition Research, University of Navarra, 31008 Pamplona, Spain
- José Alfredo Martínez
  - Department of Nutrition, Food Science and Physiology, Center for Nutrition Research, University of Navarra, 31008 Pamplona, Spain
58
Greene J, Baird AM, Casey O, Brady L, Blackshields G, Lim M, O'Brien O, Gray SG, McDermott R, Finn SP. Circular RNAs are differentially expressed in prostate cancer and are potentially associated with resistance to enzalutamide. Sci Rep 2019; 9:10739. [PMID: 31341219 PMCID: PMC6656767 DOI: 10.1038/s41598-019-47189-2] [Citation(s) in RCA: 61] [Impact Index Per Article: 12.2] [Received: 09/03/2018] [Accepted: 07/04/2019] [Indexed: 12/19/2022] Open
Abstract
Most forms of castration-resistant prostate cancer (CRPC) are dependent on the androgen receptor (AR) for survival. While enzalutamide provides a substantial survival benefit, it is not curative and many patients develop resistance to therapy. Although not yet fully understood, resistance can develop through a number of mechanisms, such as AR copy number gain, the generation of splice variants such as AR-V7 and mutations within the ligand binding domain (LBD) of the AR. Circular RNAs (circRNAs) are a novel type of non-coding RNA, which can regulate the function of miRNA, and may play a key role in the development of drug resistance. circRNAs are highly resistant to degradation, are detectable in plasma and may therefore serve as clinical biomarkers. In this study, AR-V7 expression was assessed in an isogenic model of enzalutamide resistance. The model consisted of age-matched control cells and two sub-line clones displaying varied resistance to enzalutamide. circRNA profiling was performed on the panel using a high-throughput microarray assay. Bioinformatic analysis identified a number of differentially expressed circRNAs and predicted five miRNA binding sites for each circRNA. miRNAs were stratified based on known associations with prostate cancer, and targets were validated using qPCR. Overall, circRNAs were more often down-regulated in resistant cell lines compared with control (588 vs. 278). Of particular interest was hsa_circ_0004870, which was down-regulated in enzalutamide-resistant cells (p ≤ 0.05, vs. sensitive cells), decreased in cells that highly express AR (p ≤ 0.01, vs. AR negative), and decreased in malignant cells (p ≤ 0.01, vs. benign). The associated parental gene was identified as RBM39, a member of the U2AF65 family of proteins. Both genes were down-regulated in resistant cells (p < 0.05, vs. sensitive cells).
This is one of the first studies to profile and demonstrate discrete circRNA expression patterns in an enzalutamide-resistant cell line model of prostate cancer. Our data suggest that hsa_circ_0004870, through RBM39, may play a critical role in the development of enzalutamide resistance in CRPC.
Affiliation(s)
- John Greene
  - Department of Histopathology and Morbid Anatomy, School of Medicine, Trinity College Dublin, Dublin 8, Ireland
  - Department of Medical Oncology, Tallaght Hospital, Dublin 24, Ireland
- Anne-Marie Baird
  - Department of Histopathology and Morbid Anatomy, School of Medicine, Trinity College Dublin, Dublin 8, Ireland
  - Thoracic Oncology Research Group, Trinity Translational Medical Institute, St. James's Hospital, Dublin 8, Ireland
  - Department of Clinical Medicine, Trinity College Dublin, Dublin 2, Ireland
  - Cancer and Ageing Research Program, Institute of Health and Biomedical Innovation, Queensland University of Technology, Brisbane, Australia
- Orla Casey
  - Department of Histopathology and Morbid Anatomy, School of Medicine, Trinity College Dublin, Dublin 8, Ireland
- Lauren Brady
  - Department of Histopathology and Morbid Anatomy, School of Medicine, Trinity College Dublin, Dublin 8, Ireland
- Gordon Blackshields
  - Department of Histopathology and Morbid Anatomy, School of Medicine, Trinity College Dublin, Dublin 8, Ireland
- Marvin Lim
  - Department of Histopathology and Morbid Anatomy, School of Medicine, Trinity College Dublin, Dublin 8, Ireland
  - Department of Medical Oncology, Tallaght Hospital, Dublin 24, Ireland
- Steven G Gray
  - Thoracic Oncology Research Group, Trinity Translational Medical Institute, St. James's Hospital, Dublin 8, Ireland
  - Department of Clinical Medicine, Trinity College Dublin, Dublin 2, Ireland
  - Labmed Directorate, St. James's Hospital, Dublin 8, Ireland
  - HOPE Directorate, St. James's Hospital, Dublin 8, Ireland
- Raymond McDermott
  - Department of Medical Oncology, Tallaght Hospital, Dublin 24, Ireland
  - Department of Histopathology, St. James's Hospital, Dublin 8, Ireland
  - Department of Medical Oncology, St. Vincent's Hospital, Dublin 4, Ireland
- Stephen P Finn
  - Department of Histopathology and Morbid Anatomy, School of Medicine, Trinity College Dublin, Dublin 8, Ireland
  - Thoracic Oncology Research Group, Trinity Translational Medical Institute, St. James's Hospital, Dublin 8, Ireland
  - Department of Clinical Medicine, Trinity College Dublin, Dublin 2, Ireland
  - Department of Histopathology, St. James's Hospital, Dublin 8, Ireland
59
Biology and Bias in Cell Type-Specific RNAseq of Nucleus Accumbens Medium Spiny Neurons. Sci Rep 2019; 9:8350. [PMID: 31171808 PMCID: PMC6554355 DOI: 10.1038/s41598-019-44798-9] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 11/23/2018] [Accepted: 05/24/2019] [Indexed: 12/25/2022] Open
Abstract
Subcellular RNAseq promises to dissect transcriptional dynamics but is not well characterized. Furthermore, FACS may introduce bias but has not been benchmarked genome-wide. Finally, D1 and D2 dopamine receptor-expressing medium spiny neurons (MSNs) of the nucleus accumbens (NAc) are fundamental to neuropsychiatric traits but have only a short list of canonical surface markers. We address these gaps by systematically comparing nuclear-FACS, whole cell-FACS, and RiboTag affinity purification from D1- and D2-MSNs. Using differential expression, variance partitioning, and co-expression, we identify the following trade-offs for each method. RiboTag-seq best distinguishes D1- and D2-MSNs but has the lowest transcriptome coverage. Nuclear-FACS-seq generates the most differentially expressed genes and overlaps significantly with neuropsychiatric genetic risk loci, but un-annotated genes hamper interpretation. Whole cell-FACS is more similar to nuclear-FACS than RiboTag, but captures aspects of both. Using pan-method approaches, we discover that transcriptional regulation is predominant in D1-MSNs, while D2-MSNs tend towards cytosolic regulation. We are also the first to find evidence for moderate sexual dimorphism in these cell types at baseline. As these results are from 49 mice (n_male = 39, n_female = 10), they represent generalizable ground-truths. Together, these results guide RNAseq methods selection, define MSN transcriptomes, highlight neuronal sex differences, and provide a baseline for D1- and D2-MSNs.
60
Lessons learned from a lncRNA odyssey for two genes with vascular functions, DLL4 and TIE1. Vascul Pharmacol 2019; 114:103-109. [PMID: 30910126 DOI: 10.1016/j.vph.2018.06.010] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 12/11/2017] [Revised: 04/24/2018] [Accepted: 06/13/2018] [Indexed: 01/30/2023]
Abstract
Pervasive transcription is a feature of the human genome that requires better understanding. Over the last decade or so, RNA species longer than 200 nucleotides, dubbed long non-coding RNAs (lncRNAs), have been found in sense or anti-sense orientation within or outside of genes that encode proteins. Importantly, lncRNA-mediated gene regulation and the elements that control lncRNA expression are a source of fascination among molecular biologists. In vascular biology, a dozen or so lncRNAs have been identified, and progress occurs each day. In this review, we highlight our laboratories' contribution to the lncRNA field by discussing lessons learned from two lncRNAs in the tyrosine kinase containing immunoglobulin and epidermal growth factor homology 1 (Tie1) and delta-like 4 (Dll4) loci. These genes are responsible for basic vascular patterning and pathophysiological remodeling in angiogenesis.
61
Yin G, Zhang B, Li J. miR-221-3p promotes the cell growth of non-small cell lung cancer by targeting p27. Mol Med Rep 2019; 20:604-612. [PMID: 31180541 PMCID: PMC6580017 DOI: 10.3892/mmr.2019.10291] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 07/28/2018] [Accepted: 02/21/2019] [Indexed: 01/19/2023] Open
Abstract
Emerging evidence suggests the critical function of microRNAs in regulating the growth of cancer cells. In the present study, it was demonstrated that miR-221-3p was overexpressed in non-small cell lung cancer (NSCLC) tissues and cell lines compared with that noted in the normal controls. Downregulation of miR-221-3p suppressed the proliferation, colony formation and invasion of NSCLC cells. To further understand the molecular mechanisms underlying the potential oncogenic function of miR-221-3p in NSCLC, the downstream targets of miR-221-3p were predicted using bioinformatic databases. The prediction suggested the cell cycle regulator p27 as one of the targets of miR-221-3p. Molecular experiments showed that miR-221-3p was able to bind with the 3′-untranslated region (UTR) of p27 and decreased the expression of p27 in NSCLC cells. Consistent with the suppressive role of p27 in controlling cell cycle progression, overexpression of miR-221-3p decreased the expression of p27 and promoted cell cycle progression from G1 to S phase. Collectively, our findings identified miR-221-3p as a novel regulator of NSCLC cell growth via modulating the expression of p27.
Affiliation(s)
- Guoqing Yin
  - Department of Oncology, Xianyang Hospital, Yan'an University, Xianyang, Shaanxi 712000, P.R. China
- Bo Zhang
  - Radiation Department, People's Hospital of Ankang City, Ankang, Shaanxi 725000, P.R. China
- Jia Li
  - Department of Respiratory Medicine, Longnan Hospital, Daqing, Heilongjiang 163453, P.R. China
62
Liu S, Zheng B, Sheng Y, Kong Q, Jiang Y, Yang Y, Han X, Cheng L, Zhang Y, Han J. Identification of Cancer Dysfunctional Subpathways by Integrating DNA Methylation, Copy Number Variation, and Gene-Expression Data. Front Genet 2019; 10:441. [PMID: 31156704 PMCID: PMC6529853 DOI: 10.3389/fgene.2019.00441] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 02/08/2019] [Accepted: 04/29/2019] [Indexed: 12/29/2022] Open
Abstract
A subpathway is defined as the local region of a biological pathway with specific biological functions. With the generation of large-scale sequencing data, there are more opportunities to study the molecular mechanisms of cancer development. It is necessary to investigate the potential impact of DNA methylation, copy number variation (CNV), and gene-expression changes in the molecular states of oncogenic dysfunctional subpathways. We propose a novel method, Identification of Cancer Dysfunctional Subpathways (ICDS), by integrating multi-omics data and pathway topological information to identify dysfunctional subpathways. We first calculated gene-risk scores by integrating the three following types of data: DNA methylation, CNV, and gene expression. Second, we performed a greedy search algorithm to identify the key dysfunctional subpathways within pathways for which the discriminative scores were locally maximal. Finally, a permutation test was used to calculate the statistical significance level for these key dysfunctional subpathways. We validated the effectiveness of ICDS in identifying dysregulated subpathways using datasets from liver hepatocellular carcinoma (LIHC), head-neck squamous cell carcinoma (HNSC), cervical squamous cell carcinoma, and endocervical adenocarcinoma. We further compared ICDS with methods that performed the same subpathway identification algorithm but only considered DNA methylation, CNV, or gene expression (defined as ICDS_M, ICDS_CNV, or ICDS_G, respectively). With these analyses, we confirmed that ICDS better identified cancer-associated subpathways than the three other methods, which only considered one type of data. Our ICDS method has been implemented as a freely available R-based tool (https://cran.r-project.org/web/packages/ICDS).
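The final step of this abstract, a permutation test for the significance of a candidate subpathway, can be sketched as below. This is not the ICDS package's code or API: the scoring rule (mean gene-risk score), the null model (random gene sets of the same size) and every name in the snippet are assumptions made for illustration.

```python
import random

def permutation_p_value(scores, subpathway, n_perm=1000, seed=0):
    """Empirical p-value for the mean score of `subpathway` (hypothetical helper).

    scores     -- dict mapping gene name to an assumed gene-risk score
    subpathway -- list of gene names forming the candidate subpathway
    """
    rng = random.Random(seed)
    genes = list(scores)
    k = len(subpathway)
    observed = sum(scores[g] for g in subpathway) / k
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(genes, k)          # random gene set of equal size
        if sum(scores[g] for g in sample) / k >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)

# Toy scores (made up): gene g0 has score 0, g1 has score 1, and so on.
toy_scores = {f"g{i}": i for i in range(20)}
p_high = permutation_p_value(toy_scores, ["g17", "g18", "g19"])
p_low = permutation_p_value(toy_scores, ["g0", "g1", "g2"])
```

A subpathway built from the highest-scoring genes receives a small p-value, while one built from the lowest-scoring genes receives a p-value of 1 because every random gene set scores at least as high.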
Affiliation(s)
- Siyao Liu
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Baotong Zheng
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Yuqi Sheng
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Qingfei Kong
  - College of Basic Medical Science, Harbin Medical University, Harbin, China
- Ying Jiang
  - College of Basic Medical Science, Heilongjiang University of Chinese Medicine, Harbin, China
- Yang Yang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Xudong Han
  - College of Basic Medical Science, Harbin Medical University, Harbin, China
- Liang Cheng
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Yunpeng Zhang
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
- Junwei Han
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
|
63
|
A Novel Method for Predicting Disease-Associated LncRNA-MiRNA Pairs Based on the Higher-Order Orthogonal Iteration. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2019; 2019:7614850. [PMID: 31191710 PMCID: PMC6525924 DOI: 10.1155/2019/7614850] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 10/31/2018] [Revised: 01/25/2019] [Accepted: 02/10/2019] [Indexed: 12/30/2022]
Abstract
Many studies have shown that complex human diseases are associated not only with microRNAs (miRNAs) but also with long noncoding RNAs (lncRNAs). However, most existing studies focus on predicting disease-related miRNAs or lncRNAs individually, and to our knowledge few have examined the impact of miRNA-lncRNA pairs on diseases, although a growing number of studies show that both lncRNAs and miRNAs play important roles in cell proliferation and differentiation. The identification of disease-related genes provides great insight into the underlying pathogenesis of diseases at a system level. In this study, a novel model called PADLMHOOI is proposed to predict potential associations between diseases and lncRNA-miRNA pairs based on higher-order orthogonal iteration. To evaluate its prediction performance, global and local leave-one-out cross-validation (LOOCV) were implemented; simulation results demonstrated that PADLMHOOI achieved reliable AUCs of 0.9545 and 0.8874 in global and local LOOCV, respectively. Moreover, case studies further demonstrated the effectiveness of PADLMHOOI in inferring unknown disease-related lncRNA-miRNA pairs.
|
64
|
Tahir M, Tayara H, Chong KT. iPseU-CNN: Identifying RNA Pseudouridine Sites Using Convolutional Neural Networks. MOLECULAR THERAPY-NUCLEIC ACIDS 2019; 16:463-470. [PMID: 31048185 PMCID: PMC6488737 DOI: 10.1016/j.omtn.2019.03.010] [Citation(s) in RCA: 49] [Impact Index Per Article: 9.8] [Received: 02/07/2019] [Revised: 03/29/2019] [Accepted: 03/29/2019] [Indexed: 12/15/2022]
Abstract
Pseudouridine is the most prevalent RNA modification and has been found in both eukaryotes and prokaryotes. Pseudouridine has been demonstrated in several kinds of RNA, such as small nuclear RNA, rRNA, tRNA, mRNA, and small nucleolar RNA, which underlines its significance to academic research and drug development. Biochemical experiments for pseudouridine site identification have produced good outcomes, but these laboratory methods are expensive and time-consuming. Therefore, it is important to introduce efficient methods for identifying pseudouridine sites. In this study, an intelligent method for identifying pseudouridine sites using a deep-learning approach was developed. The proposed prediction model is called iPseU-CNN (identifying pseudouridine by convolutional neural networks). Existing methods used handcrafted features and machine-learning approaches to identify pseudouridine sites. In contrast, the proposed predictor extracts the features of pseudouridine sites automatically using a convolutional neural network model. The iPseU-CNN model yields better outcomes than the current state-of-the-art models in all evaluation parameters. The iPseU-CNN predictor is thus expected to become a helpful tool for academic research on pseudouridine site prediction in RNA, as well as in drug discovery.
Affiliation(s)
- Muhammad Tahir
  - Department of Electronics and Information Engineering, Chonbuk National University, Jeonju 54896, South Korea; Department of Computer Science, Abdul Wali Khan University, Mardan 23200, Pakistan
- Hilal Tayara
  - Department of Electronics and Information Engineering, Chonbuk National University, Jeonju 54896, South Korea
- Kil To Chong
  - Advanced Electronics and Information Research Center, Chonbuk National University, Jeonju 54896, South Korea
|
65
|
Hughes SC, Simmonds AJ. Drosophila mRNA Localization During Later Development: Past, Present, and Future. Front Genet 2019; 10:135. [PMID: 30899273 PMCID: PMC6416162 DOI: 10.3389/fgene.2019.00135] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 08/21/2018] [Accepted: 02/11/2019] [Indexed: 12/12/2022] Open
Abstract
Multiple mechanisms tightly regulate mRNAs during their transcription, translation, and degradation. Of these, the physical localization of mRNAs to specific cytoplasmic regions is relatively easy to detect; however, linking localization to functional regulatory roles has been more difficult to establish. Historically, Drosophila melanogaster has been a highly effective model for identifying localized mRNAs and has helped identify roles for this process in regulating various cell activities. The majority of the well-characterized functional roles for localizing mRNAs to sub-regions of the cytoplasm have come from the Drosophila oocyte and early syncytial embryo. At present, relatively few functional roles have been established for mRNA localization within the smaller, differentiated somatic cell lineages characteristic of later development, beginning with the cellular blastoderm, and the multiple cell lineages that make up the gastrulating embryo, larva, and adult. This review is divided into three parts. The first outlines past evidence for cytoplasmic mRNA localization affecting aspects of cellular activity in post-blastoderm development in Drosophila. The majority of these known examples come from highly polarized cell lineages such as differentiating neurons. The second part considers the present state of affairs, where we now know that many, if not most, mRNAs are localized to discrete cytoplasmic regions in one or more somatic cell lineages of cellularized embryos, larvae or adults. Assuming that cytoplasmic mRNA localization represents an underlying functional activity, its correlation with the encoded proteins suggests that mRNA localization is involved in far more than neuronal differentiation. Thus, it seems highly likely that past-identified examples represent only a small fraction of localization-based mRNA regulation in somatic cells. 
The last part highlights recent technological advances that now provide an opportunity for probing the role of mRNA localization in Drosophila, moving beyond cataloging the diversity of localized mRNAs to a similar understanding of how localization affects mRNA activity.
Affiliation(s)
- Sarah C Hughes
  - Department of Medical Genetics, Faculty of Medicine and Dentistry, University of Alberta, Edmonton, AB, Canada; Department of Cell Biology, Faculty of Medicine and Dentistry, University of Alberta, Edmonton, AB, Canada
- Andrew J Simmonds
  - Department of Cell Biology, Faculty of Medicine and Dentistry, University of Alberta, Edmonton, AB, Canada
|
66
|
Li L, Che D, Wang X, Zhang P, Rahman SU, Zhao J, Yu J, Tao S, Lu H, Liao M. CellSim: a novel software to calculate cell similarity and identify their co-regulation networks. BMC Bioinformatics 2019; 20:111. [PMID: 30832570 PMCID: PMC6399906 DOI: 10.1186/s12859-019-2699-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 07/05/2018] [Accepted: 02/22/2019] [Indexed: 12/14/2022] Open
Abstract
Background: Direct cell reprogramming technology has developed rapidly because of its low tumor risk and its avoidance of the ethical issues raised by stem cells, but it is still limited to specific cell types. Direct reprogramming from an original cell to a target cell type requires knowledge of cell similarity and of the cell-specific regulatory network. The position and function of cells in vivo can provide some hints about cell similarity; however, further clarification based on molecular-level studies is still needed. Results: CellSim was therefore developed to offer a solution for cell similarity calculation and to serve as a bioinformatics tool for researchers. CellSim is a novel tool that calculates the similarity of different cells based on cell ontology and molecular networks in over 2000 different human cell types and presents the shared regulation networks of some cells. CellSim can also infer cell types from an input list of genes, covering more than 250 human normal tissue-specific cell types and 130 cancer cell types. The results are shown in both tables and spider charts, which can be saved easily and freely. Conclusion: CellSim aims to provide a computational strategy for cell similarity and the identification of distinct cell types. Stable CellSim releases (Windows, Linux, and Mac OS/X) are available at: www.cellsim.nwsuaflmz.com, and source code is available at: https://github.com/lileijie1992/CellSim/.
Affiliation(s)
- Leijie Li
  - College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
- Dongxue Che
  - College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
- Xiaodan Wang
  - College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
- Peng Zhang
  - College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
- Siddiq Ur Rahman
  - College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
- Jianbang Zhao
  - College of Information Engineering, Northwest A&F University, Yangling, Shaanxi, China
- Jiantao Yu
  - College of Information Engineering, Northwest A&F University, Yangling, Shaanxi, China
- Shiheng Tao
  - College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
- Hui Lu
  - Department of Bioinformatics and Biostatistics, SJTU Yale Joint Center Biostatistics, Shanghai Jiao Tong University, Shanghai, China
- Mingzhi Liao
  - College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, China
|
67
|
Li T, Song R, Yin Q, Gao M, Chen Y. Identification of S-nitrosylation sites based on multiple features combination. Sci Rep 2019; 9:3098. [PMID: 30816267 PMCID: PMC6395632 DOI: 10.1038/s41598-019-39743-9] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 08/13/2018] [Accepted: 02/01/2019] [Indexed: 01/24/2023] Open
Abstract
Protein S-nitrosylation (SNO) is a typical reversible, redox-dependent post-translational modification that involves covalent modification of the thiol group of cysteine residues by nitric oxide (NO). Numerous experiments have shown that SNO plays a major role in cell function and pathophysiology. To analyze large datasets rapidly, computational methods for identifying SNO sites are considered necessary auxiliary tools. In this study, multiple features including Parallel correlation pseudo amino acid composition (PC-PseAAC), Basic kmer1 (kmer1), Basic kmer2 (kmer2), General parallel correlation pseudo amino acid composition (PC-PseAAC_G), Adapted Normal distribution Bi-Profile Bayes (ANBPB), Double Bi-Profile Bayes (DBPB), Bi-Profile Bayes (BPB), Incorporating Amino Acid Pairwise (IAAPair) and Position-specific Tri-Amino Acid Propensity (PSTAAP) were employed to extract the sequence information. To remove information redundancy, information gain (IG) was applied to evaluate the importance of amino acids; IG is the information entropy of the class minus the conditional entropy given the amino acid. The best prediction performance for SNO sites was determined using cross-validation and independent tests. In addition, we calculated four commonly used performance measurements, i.e. Sensitivity (Sn), Specificity (Sp), Accuracy (Acc), and the Matthews Correlation Coefficient (MCC). For the training dataset, the overall Acc was 83.11% and the MCC was 0.6617. For an independent test dataset, Acc was 73.17% and MCC was 0.3788. The results indicate that our method is likely to complement the existing prediction methods and is a useful tool for effective identification of SNO sites.
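The information gain definition and the four performance measures named in this abstract can be sketched in a few lines. The sketch below uses the standard textbook formulas; the function names and the toy inputs are hypothetical, and nothing here reproduces the authors' feature encodings or datasets.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Class entropy minus the conditional entropy given the feature value."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def binary_metrics(tp, tn, fp, fn):
    """Sn, Sp, Acc and MCC from the four confusion-matrix counts."""
    sn = tp / (tp + fn)
    sp = tn / (tn + fp)
    acc = (tp + tn) / (tp + tn + fp + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Sn": sn, "Sp": sp, "Acc": acc, "MCC": mcc}


# Toy example: residue at one position vs. a hypothetical site label.
print(information_gain(["C", "C", "A", "A"], [1, 1, 0, 0]))
print(binary_metrics(tp=40, tn=45, fp=5, fn=10))
```

Accuracy and MCC can diverge noticeably on imbalanced test sets, which is one plausible reading of the gap between the reported Acc and MCC on the independent dataset.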
Affiliation(s)
- Taoying Li
  - Department of Maritime Economics and Management, Dalian Maritime University, No. 1 Linghai Road, Dalian, 116026, China
- Runyu Song
  - Department of Maritime Economics and Management, Dalian Maritime University, No. 1 Linghai Road, Dalian, 116026, China
- Qian Yin
  - Department of Maritime Economics and Management, Dalian Maritime University, No. 1 Linghai Road, Dalian, 116026, China
- Mingyue Gao
  - Department of Maritime Economics and Management, Dalian Maritime University, No. 1 Linghai Road, Dalian, 116026, China
- Yan Chen
  - Department of Maritime Economics and Management, Dalian Maritime University, No. 1 Linghai Road, Dalian, 116026, China
|
68
|
Pyfrom SC, Luo H, Payton JE. PLAIDOH: a novel method for functional prediction of long non-coding RNAs identifies cancer-specific LncRNA activities. BMC Genomics 2019; 20:137. [PMID: 30767760 PMCID: PMC6377765 DOI: 10.1186/s12864-019-5497-4] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 08/17/2018] [Accepted: 01/29/2019] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND Long non-coding RNAs (lncRNAs) exhibit remarkable cell-type specificity and disease association. LncRNA's functional versatility includes epigenetic modification, nuclear domain organization, transcriptional control, regulation of RNA splicing and translation, and modulation of protein activity. However, most lncRNAs remain uncharacterized due to a shortage of predictive tools available to guide functional experiments. RESULTS To address this gap for lymphoma-associated lncRNAs identified in our studies, we developed a new computational method, Predicting LncRNA Activity through Integrative Data-driven 'Omics and Heuristics (PLAIDOH), which has several unique features not found in other methods. PLAIDOH integrates transcriptome, subcellular localization, enhancer landscape, genome architecture, chromatin interaction, and RNA-binding (eCLIP) data and generates statistically defined output scores. PLAIDOH's approach identifies and ranks functional connections between individual lncRNA, coding gene, and protein pairs using enhancer, transcript cis-regulatory, and RNA-binding protein interactome scores that predict the relative likelihood of these different lncRNA functions. When applied to 'omics datasets that we collected from lymphoma patients, or to publicly available cancer (TCGA) or ENCODE datasets, PLAIDOH identified and prioritized well-known lncRNA-target gene regulatory pairs (e.g., HOTAIR and HOX genes, PVT1 and MYC), validated hits in multiple lncRNA-targeted CRISPR screens, and lncRNA-protein binding partners (e.g., NEAT1 and NONO). Importantly, PLAIDOH also identified novel putative functional interactions, including one lymphoma-associated lncRNA based on analysis of data from our human lymphoma study. We validated PLAIDOH's predictions for this lncRNA using knock-down and knock-out experiments in lymphoma cell models. 
CONCLUSIONS Our study demonstrates that we have developed a new method for the prediction and ranking of functional connections between individual lncRNA, coding gene, and protein pairs, which were validated by genetic experiments and comparison to published CRISPR screens. PLAIDOH expedites validation and follow-on mechanistic studies of lncRNAs in any biological system. It is available at https://github.com/sarahpyfrom/PLAIDOH.
Affiliation(s)
- Sarah C. Pyfrom
  - Department of Pathology and Immunology, Washington University School of Medicine, St. Louis, MO 63110 USA
- Hong Luo
  - Department of Pathology and Immunology, Washington University School of Medicine, St. Louis, MO 63110 USA
- Jacqueline E. Payton
  - Department of Pathology and Immunology, Washington University School of Medicine, St. Louis, MO 63110 USA
|
69
|
Cheng L, Liu P, Wang D, Leung KS. Exploiting locational and topological overlap model to identify modules in protein interaction networks. BMC Bioinformatics 2019; 20:23. [PMID: 30642247 PMCID: PMC6332531 DOI: 10.1186/s12859-019-2598-7] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 05/02/2018] [Accepted: 01/03/2019] [Indexed: 12/27/2022] Open
Abstract
Background: Clustering molecular networks is a typical method in systems biology and is effective in predicting protein complexes or functional modules. However, few studies have recognized that biological molecules are spatiotemporally regulated to form a dynamic cellular network and that only a subset of interactions take place at the same location in cells. Results: In this study, considering the subcellular localization of proteins, we first construct a co-localization human protein interaction network (PIN) and systematically investigate the relationship between subcellular localization and biological functions. We then propose a Locational and Topological Overlap Model (LTOM) to preprocess the co-localization PIN to identify functional modules. LTOM requires the topological overlaps, the common partners shared by two proteins, to be annotated in the same localization as the two proteins. We observed that the model corresponds better with the reference protein complexes and shows more relevance to cancers, based on both human and yeast datasets and two clustering algorithms, ClusterONE and MCL. Conclusion: Taking protein localization and topological overlap into consideration can improve the performance of module detection from protein interaction networks. Electronic supplementary material: The online version of this article (10.1186/s12859-019-2598-7) contains supplementary material, which is available to authorized users.
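The idea of restricting topological overlap to co-localized common partners can be illustrated with a small sketch. The first function is the widely used unweighted topological overlap measure; the second is a hypothetical location-aware variant written only to illustrate the constraint described in the abstract. It is an assumption, not the formula published for LTOM, and the toy network and localization labels are invented.

```python
def topological_overlap(adj, u, v):
    """Standard unweighted topological overlap between nodes u and v."""
    shared = adj[u] & adj[v]
    a_uv = 1 if v in adj[u] else 0
    return (len(shared) + a_uv) / (min(len(adj[u]), len(adj[v])) + 1 - a_uv)


def colocalized_overlap(adj, loc, u, v):
    """Assumed variant: count only common partners annotated to a
    compartment that u and v themselves share."""
    common_loc = loc[u] & loc[v]
    if not common_loc:
        return 0.0
    shared = {w for w in adj[u] & adj[v] if loc[w] & common_loc}
    a_uv = 1 if v in adj[u] else 0
    return (len(shared) + a_uv) / (min(len(adj[u]), len(adj[v])) + 1 - a_uv)


# Toy network: A and B interact and share partners C and D,
# but D is annotated to a different compartment.
adj = {
    "A": {"B", "C", "D"},
    "B": {"A", "C", "D"},
    "C": {"A", "B"},
    "D": {"A", "B"},
}
loc = {
    "A": {"nucleus"},
    "B": {"nucleus"},
    "C": {"nucleus"},
    "D": {"cytoplasm"},
}

print(topological_overlap(adj, "A", "B"))
print(colocalized_overlap(adj, loc, "A", "B"))
```

In this toy case the location-aware score is lower than the plain overlap because partner D no longer counts, which is the kind of down-weighting a preprocessing step could apply before clustering with tools such as ClusterONE or MCL.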
Affiliation(s)
- Lixin Cheng
  - Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong; Institute of translation medicine, Shenzhen Second People's Hospital, First Affiliated Hospital of Shenzhen University, Shenzhen, China
- Pengfei Liu
  - Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong
- Dong Wang
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou, China
- Kwong-Sak Leung
  - Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong
|
70
|
Hybrid sequencing-based personal full-length transcriptomic analysis implicates proteostatic stress in metastatic ovarian cancer. Oncogene 2019; 38:3047-3060. [PMID: 30617306 DOI: 10.1038/s41388-018-0644-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 06/16/2018] [Revised: 10/16/2018] [Accepted: 12/04/2018] [Indexed: 12/27/2022]
Abstract
Comprehensive molecular characterization of myriad somatic alterations and aberrant gene expression at the personal level is key to precision cancer therapy. Yet, limited by current short-read sequencing technology, an individualized catalog of complete genomic and transcriptomic features has thus far been elusive. Here, we integrated second- and third-generation sequencing platforms to generate a multidimensional dataset on a patient affected by metastatic epithelial ovarian cancer. Whole-genome and hybrid transcriptome dissection captured global genetic and transcriptional variants at previously unparalleled resolution. Particularly, single-molecule mRNA sequencing identified a vast array of unannotated transcripts, novel long noncoding RNAs and gene chimeras, permitting accurate determination of transcription start, splice, polyadenylation and fusion sites. Phylogenetic and enrichment inference of isoform-level measurements implicated early functional divergence and cytosolic proteostatic stress in shaping ovarian tumorigenesis. A complementary imaging-based high-throughput drug screen was performed and subsequently validated, which consistently pinpointed proteasome inhibitors as an effective therapeutic regimen by inducing protein aggregates in ovarian cancer cells. Therefore, our study suggests that clinical application of the emerging long-read full-length analysis for improving molecular diagnostics is feasible and informative. An in-depth understanding of the tumor transcriptome complexity allowed by leveraging the hybrid sequencing approach lays the basis to reveal novel and valid therapeutic vulnerabilities in advanced ovarian malignancies.
|
71
|
Gao H, Duan Y, Fu X, Xie H, Liu Y, Yuan H, Zhou M, Xie C. Comparison of efficacy of SHENQI compound and rosiglitazone in the treatment of diabetic vasculopathy analyzing multi-factor mediated disease-causing modules. PLoS One 2018; 13:e0207683. [PMID: 30521536 PMCID: PMC6283585 DOI: 10.1371/journal.pone.0207683] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 08/14/2018] [Accepted: 11/05/2018] [Indexed: 01/09/2023] Open
Abstract
Atherosclerosis-predominant vasculopathy is a common complication of diabetes with high morbidity and mortality that severely impairs patients' daily lives. The traditional Chinese medicine (TCM) SHENQI compound and the western medicine rosiglitazone both play important roles in the treatment of diabetes. In particular, SHENQI compound has a significant inhibitory effect on vascular lesions. Here, to explore and compare the therapeutic mechanisms of SHENQI compound and rosiglitazone in diabetic vasculopathy, we first built 7 groups of mouse models. The behavioral, physiological and pathological morphological characteristics of these mice showed that SHENQI compound has a more comprehensive curative effect than rosiglitazone and a stronger inhibitory effect on vascular lesions, whereas rosiglitazone has a more effective, but not significant, hypoglycemic effect. Further, based on the gene expression of mice in each group, we performed differential expression analysis. Functional enrichment analysis of the differentially expressed genes (DEGs) revealed potential pathogenesis and treatment mechanisms of diabetic angiopathy. In addition, pivot regulator analysis showed that SHENQI compound mainly exerts comprehensive effects by regulating MCM8, IRF7, CDK7 and NEDD4L, while rosiglitazone can rapidly lower blood glucose levels by targeting PSMD3 and UBA52. We also identified some pivot TFs and ncRNAs for these potential disease-causing DEG modules, which may be the mediators bridging drugs and modules. Finally, in an analysis similar to the pivot regulator analysis, we identified regulation by some drugs (e.g. bumetanide, disopyramide and glyburide) that have been shown to have a certain effect on diabetes or diabetic angiopathy, supporting the scientific validity and objectivity of this study. 
Overall, this study not only provides an in-depth comparison of the efficacy of SHENQI compound and rosiglitazone in the treatment of diabetic vasculopathy, but also provides clinicians and drug designers with valuable theoretical guidance.
MESH Headings
- Animals
- Aorta, Abdominal/drug effects
- Aorta, Abdominal/pathology
- Cardiovascular Agents/therapeutic use
- Diabetes Mellitus, Experimental/drug therapy
- Diabetes Mellitus, Experimental/genetics
- Diabetes Mellitus, Experimental/pathology
- Diabetes Mellitus, Type 2/drug therapy
- Diabetes Mellitus, Type 2/genetics
- Diabetes Mellitus, Type 2/pathology
- Diabetic Angiopathies/drug therapy
- Diabetic Angiopathies/genetics
- Diabetic Angiopathies/pathology
- Disease Models, Animal
- Drugs, Chinese Herbal/therapeutic use
- Gene Expression/drug effects
- Humans
- Hypoglycemic Agents/therapeutic use
- Male
- Medicine, Chinese Traditional
- Mice
- Mice, Inbred C57BL
- Mice, Mutant Strains
- Phytotherapy
- Rosiglitazone/therapeutic use
- Signal Transduction/genetics
Affiliation(s)
- Hong Gao
  - Teaching Hospital, School of Clinical Medicine, Chengdu University of Traditional Chinese Medicine, Chengdu, China
- Yuhong Duan
  - Department Two of Endocrinology, Teaching Hospital, Shaanxi University of Traditional Chinese Medicine, Xianyang, China
- Xiaoxu Fu
  - Teaching Hospital, School of Clinical Medicine, Chengdu University of Traditional Chinese Medicine, Chengdu, China
- Hongyan Xie
  - Teaching Hospital, School of Clinical Medicine, Chengdu University of Traditional Chinese Medicine, Chengdu, China
- Ya Liu
  - Teaching Hospital, School of Clinical Medicine, Chengdu University of Traditional Chinese Medicine, Chengdu, China
- Haipo Yuan
  - Teaching Hospital, School of Clinical Medicine, Chengdu University of Traditional Chinese Medicine, Chengdu, China
- Mingyang Zhou
  - Teaching Hospital, School of Clinical Medicine, Chengdu University of Traditional Chinese Medicine, Chengdu, China
- Chunguang Xie
  - Teaching Hospital, School of Clinical Medicine, Chengdu University of Traditional Chinese Medicine, Chengdu, China
|
72
|
Identifying Genomic Variations in Monozygotic Twins Discordant for Autism Spectrum Disorder Using Whole-Genome Sequencing. MOLECULAR THERAPY-NUCLEIC ACIDS 2018; 14:204-211. [PMID: 30623854 PMCID: PMC6325071 DOI: 10.1016/j.omtn.2018.11.015] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 05/25/2018] [Revised: 09/08/2018] [Accepted: 11/21/2018] [Indexed: 11/24/2022]
Abstract
Autism spectrum disorder (ASD) comprises a set of childhood neurodevelopmental disorders with impairments in social communication and restricted, repetitive, and stereotyped patterns of behavior. Here, based on whole-genome sequencing (WGS) data from three pairs of monozygotic twins discordant for ASD, we explored multiple patient-specific genetic variations and prioritized a list of ASD risk genes. Our results identified DVMT (discordant variation in monozygotic twin) observed in at least two twin pairs, including 14,310 SNPs, 2,425 indels, and 16,735 CNVs, involving a total of 2,174 genes, 37 of which were covered by all three types of variation. Gene ontology (GO) enrichment analysis of biological processes for the 2,174 genes showed that the majority of these genes were related to neurodevelopmental processes. In addition, functional network analysis showed strong functional relevance among the 37 genes covered by all three types of variation. In conclusion, we conducted, for the first time, a comprehensive scan of genomic differences between monozygotic twins discordant for ASD, providing researchers with in-depth directions. It may also provide effective strategies for clinical treatment of individuals affected by ASD.
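The selection rule mentioned in the abstract, keeping discordant variations seen in at least two twin pairs, can be sketched with plain set operations. This is only an illustration under stated assumptions: the variant identifiers, the function names, and the choice to match at the level of individual variant IDs (rather than genes or regions) are hypothetical and not taken from the paper.

```python
from collections import Counter


def discordant_variants(twin_a, twin_b):
    """Variants present in exactly one twin of a pair (symmetric difference)."""
    return set(twin_a) ^ set(twin_b)


def recurrent_discordant(pairs, min_pairs=2):
    """Discordant variants observed in at least `min_pairs` twin pairs."""
    counts = Counter()
    for twin_a, twin_b in pairs:
        counts.update(discordant_variants(twin_a, twin_b))
    return {variant for variant, n in counts.items() if n >= min_pairs}


# Hypothetical variant calls for three twin pairs.
pairs = [
    ({"v1", "v2"}, {"v2"}),
    ({"v1", "v3"}, {"v3", "v4"}),
    ({"v5"}, {"v5"}),
]

print(recurrent_discordant(pairs))
```

Because each pair contributes a set, a variant is counted at most once per pair, so the threshold reflects the number of pairs rather than the number of individuals.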
|
73
|
Zaghlool A, Ameur A, Wu C, Westholm JO, Niazi A, Manivannan M, Bramlett K, Nilsson M, Feuk L. Expression profiling and in situ screening of circular RNAs in human tissues. Sci Rep 2018; 8:16953. [PMID: 30446675 PMCID: PMC6240052 DOI: 10.1038/s41598-018-35001-6] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 07/05/2018] [Accepted: 10/28/2018] [Indexed: 12/23/2022] Open
Abstract
Circular RNAs (circRNAs) were recently discovered as a class of widely expressed noncoding RNA and have been implicated in the regulation of gene expression. However, the function of the majority of circRNAs remains unknown. Studies of circRNAs have been hampered by a lack of essential approaches for detection, quantification and visualization. We therefore developed a target-enrichment sequencing method suitable for screening circRNAs and their linear counterparts in large numbers of samples. We also applied padlock probes and in situ sequencing to visualize and determine circRNA localization in human brain tissue at the subcellular level. We measured circRNA abundance across different human samples and tissues. Our results highlight the potential of this RNA class to act as a specific diagnostic marker in blood and serum, by detection of circRNAs from genes exclusively expressed in the brain. The powerful and scalable tools we present will enable studies of circRNA function and facilitate screening of circRNAs as diagnostic biomarkers.
Affiliation(s)
- Ammar Zaghlool
  - Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Adam Ameur
  - Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Chenglin Wu
  - National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden
- Jakub Orzechowski Westholm
  - National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden
- Adnan Niazi
  - Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Manimozhi Manivannan
  - Clinical Sequencing Division, Life Science Solutions Group, Thermo Fisher Scientific, San Francisco, CA, USA
- Kelli Bramlett
  - Clinical Sequencing Division, Life Science Solutions Group, Thermo Fisher Scientific, San Francisco, CA, USA
- Mats Nilsson
  - Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden; National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden
- Lars Feuk
  - Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
|
74
|
Non-coding RNAome of RPE cells under oxidative stress suggests unknown regulative aspects of Retinitis pigmentosa etiopathogenesis. Sci Rep 2018; 8:16638. [PMID: 30413775 PMCID: PMC6226517 DOI: 10.1038/s41598-018-35086-z] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 07/26/2018] [Accepted: 10/29/2018] [Indexed: 12/26/2022] Open
Abstract
The discovery of thousands of non-coding RNAs, which are implicated in several biological processes and diseases, has revolutionized molecular biology. To clarify the role of oxidative stress in Retinitis pigmentosa, a highly heterogeneous group of inherited ocular disorders characterized by progressive retinal degeneration, we performed a comparative transcriptome analysis of human retinal pigment epithelium cells, comparing two groups, one treated with oxLDL and one untreated, at four time points (1 h, 2 h, 4 h, 6 h). Data analysis followed a complex pipeline, starting from CLC Genomics Workbench, STAR and TopHat2/TopHat-Fusion alignment comparisons, followed by transcriptome assembly and expression quantification. We then selected the non-coding RNAs and continued the computational analysis with specific tools and databases for long non-coding RNAs (FEELnc), circular RNAs (CIRCexplorer, UROBORUS, CIRI, KNIFE, CircInteractome) and piwi-interacting RNAs (piRNABank, piRNA Cluster, piRBase, PILFER). Finally, all detected non-coding RNAs underwent pathway analysis with Cytoscape software. Eight hundred and fifty-four non-coding RNAs, comprising long non-coding RNAs and PIWI-interacting RNAs, were differentially expressed throughout all considered time points, in treated and untreated samples. These non-coding RNAs target host genes involved in several biochemical pathways related to compromised response to oxidative stress, visual functions, synaptic impairment of retinal neurotransmission, and impairment of the interphotoreceptor matrix and blood-retina barrier, all leading to retinal cell death. These data suggest that non-coding RNAs could play a relevant role in Retinitis pigmentosa etiopathogenesis.
|
75
|
Gudenas BL, Wang L. Prediction of LncRNA Subcellular Localization with Deep Learning from Sequence Features. Sci Rep 2018; 8:16385. [PMID: 30401954 PMCID: PMC6219567 DOI: 10.1038/s41598-018-34708-w] [Citation(s) in RCA: 86] [Impact Index Per Article: 14.3] [Received: 03/26/2018] [Accepted: 10/19/2018] [Indexed: 12/20/2022] Open
Abstract
Long non-coding RNAs are involved in biological processes throughout the cell including the nucleus, chromatin and cytosol. However, most lncRNAs remain unannotated and functional annotation of lncRNAs is difficult due to their low conservation and their tissue and developmentally specific expression. LncRNA subcellular localization is highly informative regarding its biological function, although it is difficult to discover because few prediction methods currently exist. While protein subcellular localization prediction is a well-established research field, lncRNA localization prediction is a novel research problem. We developed DeepLncRNA, a deep learning algorithm which predicts lncRNA subcellular localization directly from lncRNA transcript sequences. We analyzed 93 strand-specific RNA-seq samples of nuclear and cytosolic fractions from multiple cell types to identify differentially localized lncRNAs. We then extracted sequence-based features from the lncRNAs to construct our DeepLncRNA model, which achieved an accuracy of 72.4%, sensitivity of 83%, specificity of 62.4% and area under the receiver operating characteristic curve of 0.787. Our results suggest that primary sequence motifs are a major driving force in the subcellular localization of lncRNAs.
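For readers comparing the reported figures, the three threshold-based metrics quoted above can be derived from a binary confusion matrix. The following is a minimal sketch: the function name and the counts are hypothetical and are not taken from the paper, and the abstract does not say which class (nuclear or cytosolic) is treated as positive.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return accuracy, sensitivity, specificity


# Hypothetical counts chosen only for illustration (not the paper's data).
acc, sens, spec = binary_metrics(tp=83, fn=17, tn=62, fp=38)
print(f"accuracy={acc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

With balanced classes, accuracy is the mean of sensitivity and specificity, which is roughly the relationship between the three values reported in the abstract.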
Affiliation(s)
- Brian L Gudenas
- Department of Genetics and Biochemistry, Clemson University, Clemson, SC, USA
- Liangjiang Wang
- Department of Genetics and Biochemistry, Clemson University, Clemson, SC, USA.
|
76
|
Ariaeenejad S, Mousivand M, Moradi Dezfouli P, Hashemi M, Kavousi K, Hosseini Salekdeh G. A computational method for prediction of xylanase enzymes activity in strains of Bacillus subtilis based on pseudo amino acid composition features. PLoS One 2018; 13:e0205796. [PMID: 30346964 PMCID: PMC6197662 DOI: 10.1371/journal.pone.0205796] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 08/22/2017] [Accepted: 10/02/2018] [Indexed: 01/09/2023] Open
Abstract
Xylanases are hydrolytic enzymes that are classified into various glycoside hydrolase (GH) families based on their physicochemical properties, structure, mode of action and substrate specificities. The purpose of this study is to show that the activity of the members of the xylanase family under specified pH and temperature conditions can be computationally predicted. The proposed computational regression model was trained and tested with the Pseudo Amino Acid Composition (PseAAC) features extracted solely from the amino acid sequences of enzymes. The xylanases with experimentally determined activities were used as the training dataset to adjust the model parameters. To develop the model, 41 strains of Bacillus subtilis isolated from field soil were screened. Of these, the 28 strains with the largest halo diameters were selected for further study. The performance of the model for prediction of xylanase activity was evaluated under three different temperature and pH conditions using stratified cross-validation and jackknife methods. The trained model can be used for determining the activity of newly found xylanases under the specified conditions. Such computational models help to reduce experimental costs and save time by identifying enzymes with appropriate activity for scientific and industrial use. Our methodology for activity prediction of xylanase enzymes can potentially be applied to members of other enzyme families. The availability of sufficient experimental data under specified pH and temperature conditions is a prerequisite for training the learning model and achieving high accuracy.
Affiliation(s)
- Shohreh Ariaeenejad
- Department of Systems Biology, Agricultural Biotechnology Research Institute of Iran (ABRII), Agricultural Research Education and Extension Organization (AREO), Karaj, Iran
- Maryam Mousivand
- Department of Microbial Biotechnology, Agricultural Biotechnology Research Institute of Iran (ABRII), Agricultural Research Education and Extension Organization (AREO), Karaj, Iran
- Parinaz Moradi Dezfouli
- Department of Microbial Biotechnology, Agricultural Biotechnology Research Institute of Iran (ABRII), Agricultural Research Education and Extension Organization (AREO), Karaj, Iran
- Maryam Hashemi
- Department of Microbial Biotechnology, Agricultural Biotechnology Research Institute of Iran (ABRII), Agricultural Research Education and Extension Organization (AREO), Karaj, Iran
- Kaveh Kavousi
- Institute of Biochemistry and Biophysics (IBB), University of Tehran, Tehran, Iran
- Ghasem Hosseini Salekdeh
- Department of Systems Biology, Agricultural Biotechnology Research Institute of Iran (ABRII), Agricultural Research Education and Extension Organization (AREO), Karaj, Iran
|
77
|
Shao L, Gao H, Liu Z, Feng J, Tang L, Lin H. Identification of Antioxidant Proteins With Deep Learning From Sequence Information. Front Pharmacol 2018; 9:1036. [PMID: 30294271 PMCID: PMC6158654 DOI: 10.3389/fphar.2018.01036] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 05/06/2018] [Accepted: 08/27/2018] [Indexed: 01/26/2023] Open
Abstract
Antioxidant proteins have been found to be closely linked to disease control because of their ability to eliminate excess free radicals. Because of their medicinal value, research on identifying antioxidant proteins is on the rise. Many machine-learning classifiers have performed poorly owing to the nonlinear and unbalanced nature of biological data. Recently, deep learning techniques showed advantages over many state-of-the-art machine learning methods in various fields. In this study, a deep learning-based classifier was proposed to identify antioxidant proteins based on a mixed g-gap dipeptide composition feature vector. The classifier employed a deep autoencoder to extract a nonlinear representation from the raw input. t-Distributed Stochastic Neighbor Embedding (t-SNE) was used for dimensionality reduction. A support vector machine was finally used for classification. The classifier achieved an F1 score of 0.8842 and an MCC of 0.7409 in 10-fold cross-validation. Experimental results show that our proposed method outperformed the traditional machine learning methods and could be a promising tool for antioxidant protein identification. For the convenience of other researchers, we have developed a user-friendly web server called IDAod for antioxidant protein identification, which can be accessed freely at http://bigroup.uestc.edu.cn/IDAod/.
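The g-gap dipeptide composition named above can be illustrated with a short sketch. This is an assumption-laden example, not the authors' implementation: the function name and the example sequence are hypothetical, and the "mixed" vector is assumed here to mean concatenating the vectors obtained for several values of g.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def g_gap_dipeptide_composition(sequence, g):
    """Frequency of each residue pair separated by g intervening residues."""
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    counts = dict.fromkeys(pairs, 0)
    for i in range(max(len(sequence) - g - 1, 0)):
        pair = sequence[i] + sequence[i + g + 1]
        if pair in counts:  # skip non-standard residues such as X
            counts[pair] += 1
    total = sum(counts.values())
    return [counts[p] / total if total else 0.0 for p in pairs]


# Hypothetical peptide; g=0 would give the ordinary (adjacent) dipeptide composition.
features = g_gap_dipeptide_composition("MKTAYIAKQR", g=1)
print(len(features))
```

Each value of g yields a 400-dimensional vector (20 x 20 residue pairs), so a mixed representation over several gaps grows linearly with the number of gaps used.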
Affiliation(s)
- Lifen Shao
- Center for Informational Biology, School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Hui Gao
- Center for Informational Biology, School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Zhen Liu
- Center for Informational Biology, School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Juan Feng
- Key Laboratory for Neuro-Information of Ministry of Education, Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Lixia Tang
- Key Laboratory for Neuro-Information of Ministry of Education, Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Hao Lin
- Key Laboratory for Neuro-Information of Ministry of Education, Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
|
78
|
Tan JX, Dao FY, Lv H, Feng PM, Ding H. Identifying Phage Virion Proteins by Using Two-Step Feature Selection Methods. Molecules 2018; 23:2000. [PMID: 30103458 PMCID: PMC6222849 DOI: 10.3390/molecules23082000] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Received: 07/13/2018] [Revised: 07/30/2018] [Accepted: 08/08/2018] [Indexed: 12/31/2022] Open
Abstract
Accurate identification of phage virion proteins is not only a key step for understanding their function but also helpful for further understanding the lysis mechanism of the bacterial cell. Since traditional experimental methods for identifying phage virion proteins are time-consuming and costly, there is an urgent need to apply machine learning methods to identify them accurately and efficiently. In this work, a support vector machine (SVM) based method was proposed by mixing multiple sets of optimal g-gap dipeptide compositions. The analysis of variance (ANOVA) and the minimal-redundancy-maximal-relevance (mRMR) with an increment feature selection (IFS) were applied to single out the optimal feature set. In the five-fold cross-validation test, the proposed method achieved an overall accuracy of 87.95%. We believe that the proposed method will become an efficient and powerful tool for scientists studying phage virion proteins.
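The ANOVA step mentioned above ranks each feature by how well it separates the two classes. A minimal sketch of that ranking follows, assuming a two-class one-way F statistic; the function names and the toy feature rows are hypothetical, and the mRMR and IFS stages are not shown.

```python
def anova_f_score(pos_values, neg_values):
    """One-way ANOVA F statistic for a single feature across two classes."""
    groups = [pos_values, neg_values]
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the class means around the grand mean.
    between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of the samples around their own class mean.
    within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return (between / (len(groups) - 1)) / (within / (n_total - len(groups)))


def rank_features(pos_rows, neg_rows):
    """Return feature indices sorted from most to least discriminative."""
    n_features = len(pos_rows[0])
    scores = [
        anova_f_score([r[j] for r in pos_rows], [r[j] for r in neg_rows])
        for j in range(n_features)
    ]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


# Toy data: feature 0 separates the classes, feature 1 does not.
pos = [[0.9, 0.2], [0.8, 0.3], [1.0, 0.1]]
neg = [[0.1, 0.2], [0.2, 0.1], [0.0, 0.3]]
print(rank_features(pos, neg))
```

An incremental feature selection would then add features in this ranked order one at a time, retrain the classifier at each step, and keep the subset with the best cross-validated score.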
Affiliation(s)
- Jiu-Xin Tan
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China.
- Fu-Ying Dao
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China.
- Hao Lv
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China.
- Peng-Mian Feng
- Hebei Province Key Laboratory of Occupational Health and Safety for Coal Industry, School of Public Health, North China University of Science and Technology, Tangshan 063000, China.
- Hui Ding
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China.
|
79
|
Manavalan B, Shin TH, Kim MO, Lee G. PIP-EL: A New Ensemble Learning Method for Improved Proinflammatory Peptide Predictions. Front Immunol 2018; 9:1783. [PMID: 30108593 PMCID: PMC6079197 DOI: 10.3389/fimmu.2018.01783] [Citation(s) in RCA: 88] [Impact Index Per Article: 14.7] [Received: 03/08/2018] [Accepted: 07/19/2018] [Indexed: 02/03/2023] Open
Abstract
Proinflammatory cytokines have the capacity to increase the inflammatory reaction and play a central role in the first line of defence against invading pathogens. Proinflammatory inducing peptides (PIPs) have been used as antineoplastic agents, antibacterial agents and vaccines in immunization therapies. Advances in sequencing technologies have resulted in an avalanche of protein sequence data. Therefore, it is necessary to develop an automated computational method to enable fast and accurate identification of novel PIPs within the vast number of candidate proteins and peptides. To address this, we proposed a new predictor, PIP-EL, for predicting PIPs using the strategy of ensemble learning (EL). Our benchmarking dataset is imbalanced. Thus, we applied a random under-sampling technique to generate 10 balanced models for each composition. Technically, PIP-EL is the fusion of 50 independent random forest (RF) models, where each of the five different compositions (amino acid, dipeptide, composition-transition-distribution, physicochemical properties, and amino acid index) contributes 10 RF models. PIP-EL achieves a Matthews' correlation coefficient (MCC) of 0.435 in a 5-fold cross-validation test, which is ~2-5% higher than that of the individual classifiers and the hybrid feature-based classifier. Furthermore, we evaluate the performance of PIP-EL on the independent dataset, showing that our method outperforms the existing method and two different machine learning methods developed in this study, with an MCC of 0.454. These results indicate that PIP-EL will be a useful tool for predicting PIPs and for researchers working in the field of peptide therapeutics and immunotherapy. The user-friendly web server, PIP-EL, is freely accessible.
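The random under-sampling step described above can be sketched as follows. This is only an illustration under stated assumptions: the function name, the seed handling and the peptide identifiers are hypothetical, and the random forest training and fusion stages are not shown.

```python
import random


def balanced_subsets(positives, negatives, n_subsets=10, seed=0):
    """Draw n_subsets balanced training sets by random under-sampling."""
    rng = random.Random(seed)
    size = min(len(positives), len(negatives))
    subsets = []
    for _ in range(n_subsets):
        # Sampling without replacement keeps every subset at a 1:1 class ratio.
        pos = rng.sample(positives, size)
        neg = rng.sample(negatives, size)
        subsets.append((pos, neg))
    return subsets


# Hypothetical identifiers standing in for peptides of an imbalanced dataset.
positives = [f"pip_{i}" for i in range(5)]
negatives = [f"non_pip_{i}" for i in range(20)]
subsets = balanced_subsets(positives, negatives)
print(len(subsets), len(subsets[0][0]), len(subsets[0][1]))
```

One model would be trained per subset, and repeating this for each of the five feature compositions gives the 50 models that the abstract describes; averaging their predicted probabilities is one common fusion rule, although the abstract does not state which rule is used.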
Affiliation(s)
- Tae Hwan Shin
- Department of Physiology, Ajou University School of Medicine, Suwon, South Korea
- Institute of Molecular Science and Technology, Ajou University, Suwon, South Korea
- Myeong Ok Kim
- Division of Life Science and Applied Life Science (BK21 Plus), College of Natural Sciences, Gyeongsang National University, Jinju, South Korea
- Gwang Lee
- Department of Physiology, Ajou University School of Medicine, Suwon, South Korea
- Institute of Molecular Science and Technology, Ajou University, Suwon, South Korea
|
80
|
Bhatnager R, Dang AS. Comprehensive in-silico prediction of damage associated SNPs in Human Prolidase gene. Sci Rep 2018; 8:9430. [PMID: 29930383 PMCID: PMC6013436 DOI: 10.1038/s41598-018-27789-0] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 01/08/2018] [Accepted: 06/04/2018] [Indexed: 12/05/2022] Open
Abstract
Prolidase is a cytosolic manganese-dependent exopeptidase responsible for the catabolism of imido di- and tripeptides. Prolidase levels have been associated with a number of diseases such as bipolar disorder, erectile dysfunction and various cancers. Single nucleotide polymorphisms present in the coding region of proteins (nsSNPs) have the potential to alter the primary structure as well as the function of the protein. Hence, it becomes necessary to differentiate the potentially harmful nsSNPs from the neutral ones. In-silico analysis of 298 nsSNPs retrieved from the dbSNP database predicted 19 nsSNPs as damaging. ConSurf analysis showed that 18 out of 19 substitutions were present in conserved regions. Four substitutions (D276N, D287N, E412K, and G448R) observed to have a damaging effect are present in the catalytic pocket. Four SNPs listed in the splice site region were found to affect mRNA splicing by altering the acceptor site. In a 3′UTR scan of 77 SNPs listed in the SNP database, 9 SNPs were found to alter miRNA target sites. These results provide filtered data for exploring the effect of uncharacterized nsSNPs and SNPs related to the UTRs and splice sites of prolidase, for finding their association with disease susceptibility, and for designing target-dependent drugs for therapeutics.
Affiliation(s)
- Richa Bhatnager
- Centre For Medical Biotechnology, M. D. University, Rohtak, 124001, India
- Amita S Dang
- Centre For Medical Biotechnology, M. D. University, Rohtak, 124001, India.
|
81
|
Pan Y, Gao H, Lin H, Liu Z, Tang L, Li S. Identification of Bacteriophage Virion Proteins Using Multinomial Naïve Bayes with g-Gap Feature Tree. Int J Mol Sci 2018; 19:E1779. [PMID: 29914091 PMCID: PMC6032154 DOI: 10.3390/ijms19061779] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 05/24/2018] [Revised: 06/12/2018] [Accepted: 06/12/2018] [Indexed: 01/29/2023] Open
Abstract
Bacteriophages, which are tremendously important to the ecology and evolution of bacteria, play a key role in the development of genetic engineering. Bacteriophage virion proteins are essential components of the infectious viral particles and are responsible for several biological functions. The correct identification of bacteriophage virion proteins is of great importance for understanding both life at the molecular level and genetic evolution. However, few computational methods are available for identifying bacteriophage virion proteins. In this paper, we proposed a new method to predict bacteriophage virion proteins using a Multinomial Naïve Bayes classification model based on discrete features generated from the g-gap feature tree. The accuracy of the proposed model reaches 98.37% with an MCC of 96.27% in 10-fold cross-validation. This result suggests that the proposed method can be a useful approach for identifying bacteriophage virion proteins from sequence information. For the convenience of experimental scientists, a web server (PhagePred) that implements the proposed predictor is available and can be freely accessed on the Internet.
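The Matthews correlation coefficient (MCC) reported alongside accuracy in this and several neighbouring entries is computed from the four confusion-matrix counts. A minimal sketch, with a hypothetical function name and hypothetical counts that are not taken from any of the papers:

```python
import math


def matthews_corrcoef(tp, fn, tn, fp):
    """MCC in [-1, 1]; returns 0.0 when any marginal total is zero."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


# Hypothetical counts chosen only for illustration.
print(round(matthews_corrcoef(tp=90, fn=10, tn=95, fp=5), 4))
```

The coefficient lies between -1 and 1, so a value quoted as a percentage, as in the abstract above, corresponds to the coefficient multiplied by 100.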
Affiliation(s)
- Yanyuan Pan
- School of Computer Science and Engineering, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China.
- Hui Gao
- School of Computer Science and Engineering, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China.
- Hao Lin
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China.
- Zhen Liu
- School of Computer Science and Engineering, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China.
- Lixia Tang
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China.
- Songtao Li
- School of Computer Science and Engineering, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China.
|
82
|
Jiang J, Xing F, Zeng X, Zou Q. RicyerDB: A Database For Collecting Rice Yield-related Genes with Biological Analysis. Int J Biol Sci 2018; 14:965-970. [PMID: 29989091 PMCID: PMC6036756 DOI: 10.7150/ijbs.23328] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 10/14/2017] [Accepted: 12/25/2017] [Indexed: 11/16/2022] Open
Abstract
The Rice Yield-related Database (RicyerDB) was created to complement research on the multiple traits that influence rice (Oryza sativa L.) yield by manually curating related databases and literature, together with genomics and proteomics information that could be useful for a comprehensive understanding of rice biology. RicyerDB provides a valuable resource with which to efficiently investigate, browse and analyze yield-related genes. The whole data set can be easily queried and downloaded through the webpage. In addition, RicyerDB includes a protein-protein interaction network with biological analysis. The combined rice database opens a new path for researchers to obtain information on rice genes in terms of their effects on traits important for rice breeding. The web server is freely available at: http://server.malab.cn/Ricyer/index.html.
Affiliation(s)
- Jing Jiang
- School of Aerospace Engineering, Xiamen University, Xiamen, 361001, China
- Fei Xing
- School of Aerospace Engineering, Xiamen University, Xiamen, 361001, China
- Xiangxiang Zeng
- School of Information Science and Engineering, Xiamen University, Xiamen 361001, China
- Quan Zou
- School of Computer Science and Technology, Tianjin University, Tianjin, 300354, China
|
83
|
Yang H, Qiu WR, Liu G, Guo FB, Chen W, Chou KC, Lin H. iRSpot-Pse6NC: Identifying recombination spots in Saccharomyces cerevisiae by incorporating hexamer composition into general PseKNC. Int J Biol Sci 2018; 14:883-891. [PMID: 29989083 PMCID: PMC6036749 DOI: 10.7150/ijbs.24616] [Citation(s) in RCA: 135] [Impact Index Per Article: 22.5] [Received: 12/28/2017] [Accepted: 02/04/2018] [Indexed: 02/06/2023] Open
Abstract
Meiotic recombination is caused by meiotic double-strand DNA breaks. In some regions the frequency of DNA recombination is relatively higher, while in other regions the frequency is lower: the former is usually called a "recombination hotspot", and the latter a "recombination coldspot". Information on the hot and cold spots may provide important clues for understanding the mechanism of genome evolution. Therefore, it is important to accurately predict these spots. In this study, we rebuilt the benchmark dataset by unifying its samples to the same length (131 bp). On this foundation, and using an SVM (Support Vector Machine) classifier, a new predictor called "iRSpot-Pse6NC" was developed by incorporating the key hexamer features into the general PseKNC (Pseudo K-tuple Nucleotide Composition) via the binomial distribution approach. Rigorous cross-validation showed that the proposed predictor is superior to its counterparts in overall accuracy, stability, sensitivity and specificity. For the convenience of most experimental scientists, the web-server for iRSpot-Pse6NC has been established at http://lin-group.cn/server/iRSpot-Pse6NC, by which users can easily obtain their desired result without the need to go through the detailed mathematical equations involved.
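The hexamer composition underlying the predictor above is, at its simplest, a normalised count of every 6-nucleotide word in a sequence. The sketch below shows only that plain k-mer composition; it is not the general PseKNC formulation, the binomial-distribution feature selection is not shown, and the function name and example sequence are hypothetical.

```python
from itertools import product


def kmer_composition(sequence, k=6):
    """Normalised frequency of every k-mer over the alphabet ACGT."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    sequence = sequence.upper()
    for i in range(len(sequence) - k + 1):
        word = sequence[i:i + k]
        if word in counts:  # ignore windows containing N or other symbols
            counts[word] += 1
    total = sum(counts.values())
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in kmers}


# Hypothetical short sequence; the benchmark samples in the abstract are 131 bp long.
freqs = kmer_composition("ACGTACGTAC", k=6)
print(len(freqs))
```

With k = 6 there are 4^6 = 4096 possible hexamers, far more than the 126 windows in a 131 bp sample, which is one reason a feature selection step is needed to keep only the informative hexamers.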
Affiliation(s)
- Hui Yang
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Wang-Ren Qiu
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China; Computer Department, Jingdezhen Ceramic Institute, Jingdezhen, 333403, China
- Guoqing Liu
- School of Life Science and Technology, Inner Mongolia University of Science and Technology, Baotou, 014010, China
- Feng-Biao Guo
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Wei Chen
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China; Department of Physics, School of Sciences, and Center for Genomics and Computational Biology, North China University of Science and Technology, Tangshan 063000, China; Gordon Life Science Institute, Boston, MA 02478, USA
- Kuo-Chen Chou
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China; Gordon Life Science Institute, Boston, MA 02478, USA
- Hao Lin
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China; Gordon Life Science Institute, Boston, MA 02478, USA
|
84
|
Tang H, Zhao YW, Zou P, Zhang CM, Chen R, Huang P, Lin H. HBPred: a tool to identify growth hormone-binding proteins. Int J Biol Sci 2018; 14:957-964. [PMID: 29989085 PMCID: PMC6036759 DOI: 10.7150/ijbs.24174] [Citation(s) in RCA: 136] [Impact Index Per Article: 22.7] [Received: 12/04/2017] [Accepted: 01/15/2018] [Indexed: 12/19/2022] Open
Abstract
Hormone-binding protein (HBP) is a kind of soluble carrier protein that can selectively and non-covalently interact with hormones. HBP plays an important role in growth, but its function is still unclear. Correct recognition of HBPs is the first step towards further study of their function and understanding of their biological processes. However, it is difficult to correctly recognize HBPs among the growing number of proteins through traditional biochemical experiments because of the high cost and long duration of such experiments. To overcome these disadvantages, we designed a computational method for identifying HBPs accurately in this study. First, we collected HBP data from UniProt to establish a high-quality benchmark dataset. Based on the dataset, the dipeptide composition was extracted from HBP residue sequences. To find the optimal features that provide key clues for HBP identification, the analysis of variance (ANOVA) was performed for feature ranking. The optimal features were selected through the incremental feature selection strategy. Subsequently, the features were input into a support vector machine (SVM) for prediction model construction. Jackknife cross-validation results showed that 88.6% of HBPs and 81.3% of non-HBPs were correctly recognized, suggesting that our proposed model was powerful. This study provides a new strategy to identify HBPs. Moreover, based on the proposed model, we established a webserver called HBPred, which can be freely accessed at http://lin-group.cn/server/HBPred.
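The jackknife (leave-one-out) cross-validation mentioned above holds out each sample in turn and predicts it with a model trained on all the others. The sketch below illustrates the procedure only: a nearest-centroid classifier is swapped in for the SVM used in the paper so that the example stays dependency-free, and the function names and toy feature vectors are hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, query):
    """Predict the label whose class centroid is closest to the query vector."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(query, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def jackknife_accuracy(xs, ys):
    """Leave-one-out: each sample is predicted by a model trained on all others."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


# Toy two-feature vectors; label 1 stands for HBP and 0 for non-HBP.
xs = [[0.9, 0.1], [0.8, 0.2], [1.0, 0.0], [0.1, 0.9], [0.2, 0.8], [0.0, 1.0]]
ys = [1, 1, 1, 0, 0, 0]
print(jackknife_accuracy(xs, ys))
```

Counting the correct predictions separately for each class, rather than overall, would give per-class recognition rates of the kind reported in the abstract.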
Affiliation(s)
- Hua Tang
- Department of Pathophysiology, Southwest Medical University, Luzhou 646000, China
- Ya-Wei Zhao
- Key Laboratory for NeuroInformation of Ministry of Education, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Ping Zou
- Department of Pathophysiology, Southwest Medical University, Luzhou 646000, China
- Chun-Mei Zhang
- Department of Pathophysiology, Southwest Medical University, Luzhou 646000, China
- Rong Chen
- Department of Pathophysiology, Southwest Medical University, Luzhou 646000, China
- Po Huang
- Department of Pathophysiology, Southwest Medical University, Luzhou 646000, China
- Hao Lin
- Key Laboratory for NeuroInformation of Ministry of Education, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
|
85
|
Liu H, Lei C, He Q, Pan Z, Xiao D, Tao Y. Nuclear functions of mammalian MicroRNAs in gene regulation, immunity and cancer. Mol Cancer 2018; 17:64. [PMID: 29471827 PMCID: PMC5822656 DOI: 10.1186/s12943-018-0765-5] [Citation(s) in RCA: 235] [Impact Index Per Article: 39.2] [Received: 09/25/2017] [Accepted: 01/12/2018] [Indexed: 12/19/2022] Open
Abstract
MicroRNAs (miRNAs) are endogenous non-coding RNAs that contain approximately 22 nucleotides. They serve as key regulators in various biological processes and their dysregulation is implicated in many diseases including cancer and autoimmune disorders. It has been well established that the maturation of miRNAs occurs in the cytoplasm and that miRNAs exert post-transcriptional gene silencing (PTGS) via the RNA-induced silencing complex (RISC) pathway in the cytoplasm. However, numerous studies reaffirm the existence of mature miRNA in the nucleus, and a nucleus-cytoplasm transport mechanism has also been illustrated. Moreover, active regulatory functions of nuclear miRNAs have been found, including PTGS, transcriptional gene silencing (TGS), and transcriptional gene activation (TGA), in which miRNAs bind nascent RNA transcripts, gene promoter regions or enhancer regions and exert further effects via epigenetic pathways. Based on existing interaction rules, several software tools for predicting miRNA binding sites have been developed, and these are evaluated in this article. In addition, we attempt to explore and review the nuclear functions of miRNA in immunity, tumorigenesis and tumor invasiveness. As a non-canonical aspect of miRNA action, nuclear miRNAs supplement miRNA regulatory networks and could be applied in miRNA-based therapies.
Collapse
Affiliation(s)
- Hongyu Liu
  - Key Laboratory of Carcinogenesis and Cancer Invasion, Ministry of Education, Xiangya Hospital, Central South University, 87 Xiangya Road, Changsha, Hunan, 410008, China
  - Key Laboratory of Carcinogenesis, Ministry of Education, Cancer Research Institute, School of Basic Medicine, Central South University, 110 Xiangya Road, Changsha, Hunan, 410078, China
- Cheng Lei
  - Key Laboratory of Carcinogenesis and Cancer Invasion, Ministry of Education, Xiangya Hospital, Central South University, 87 Xiangya Road, Changsha, Hunan, 410008, China
  - Key Laboratory of Carcinogenesis, Ministry of Education, Cancer Research Institute, School of Basic Medicine, Central South University, 110 Xiangya Road, Changsha, Hunan, 410078, China
- Qin He
  - Key Laboratory of Carcinogenesis and Cancer Invasion, Ministry of Education, Xiangya Hospital, Central South University, 87 Xiangya Road, Changsha, Hunan, 410008, China
  - Key Laboratory of Carcinogenesis, Ministry of Education, Cancer Research Institute, School of Basic Medicine, Central South University, 110 Xiangya Road, Changsha, Hunan, 410078, China
- Zou Pan
  - Key Laboratory of Carcinogenesis and Cancer Invasion, Ministry of Education, Xiangya Hospital, Central South University, 87 Xiangya Road, Changsha, Hunan, 410008, China
  - Key Laboratory of Carcinogenesis, Ministry of Education, Cancer Research Institute, School of Basic Medicine, Central South University, 110 Xiangya Road, Changsha, Hunan, 410078, China
- Desheng Xiao
  - Department of Pathology, Xiangya Hospital, Central South University, Changsha, Hunan, 410078, China
- Yongguang Tao
  - Key Laboratory of Carcinogenesis and Cancer Invasion, Ministry of Education, Xiangya Hospital, Central South University, 87 Xiangya Road, Changsha, Hunan, 410008, China
  - Key Laboratory of Carcinogenesis, Ministry of Education, Cancer Research Institute, School of Basic Medicine, Central South University, 110 Xiangya Road, Changsha, Hunan, 410078, China
  - Department of Thoracic Surgery, Second Xiangya Hospital, Central South University, Changsha, China
86
Wen X, Gao L, Guo X, Li X, Huang X, Wang Y, Xu H, He R, Jia C, Liang F. lncSLdb: a resource for long non-coding RNA subcellular localization. Database (Oxford) 2018; 2018:1-6. [PMID: 30219837 PMCID: PMC6146130 DOI: 10.1093/database/bay085] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Received: 01/20/2018] [Accepted: 07/23/2018] [Indexed: 01/18/2023]
Abstract
Although long non-coding RNAs (lncRNAs) may play important roles in cellular function and biological processes, we still know little about them. Growing evidence indicates that the subcellular localization of lncRNAs may provide clues to their function. To help researchers functionally characterize thousands of lncRNAs, we developed a database-driven application, lncSLdb, which stores and manages user-collected qualitative and quantitative subcellular localization information on lncRNAs obtained by literature mining. The current release contains >11 000 transcripts from three species. Based on the region in which an lncRNA accumulates, we classify transcripts into three basic localization types (nucleus, cytoplasm and nucleus/cytoplasm). In some conditions, the nucleus and cytoplasm types can be divided into three more precise subtypes (chromosome, nucleoplasm and ribosome). Besides browsing and downloading data in lncSLdb, users can search by gene symbol, genome coordinates or sequence similarity with a set of comprehensive tools. We hope that lncSLdb will provide a convenient platform for researchers to investigate the functions and molecular mechanisms of lncRNAs from the perspective of subcellular localization.
Affiliation(s)
- Xiao Wen
  - School of Computer Science and Technology, Xidian University, Xi'an Shaanxi, PR China
- Lin Gao
  - School of Computer Science and Technology, Xidian University, Xi'an Shaanxi, PR China
- Xingli Guo
  - School of Computer Science and Technology, Xidian University, Xi'an Shaanxi, PR China
- Xing Li
  - School of Computer Science and Technology, Xidian University, Xi'an Shaanxi, PR China
- Xiaotai Huang
  - School of Computer Science and Technology, Xidian University, Xi'an Shaanxi, PR China
- Ying Wang
  - School of Computer Science and Technology, Xidian University, Xi'an Shaanxi, PR China
- Haifu Xu
  - School of Computer Science and Technology, Xidian University, Xi'an Shaanxi, PR China
- Ruijie He
  - School of Computer Science and Technology, Xidian University, Xi'an Shaanxi, PR China
- Chenglong Jia
  - School of Computer Science and Technology, Xidian University, Xi'an Shaanxi, PR China
- Feixiang Liang
  - School of Computer Science and Technology, Xidian University, Xi'an Shaanxi, PR China
87
Lynch CM, van Berkel VH, Frieboes HB. Application of unsupervised analysis techniques to lung cancer patient data. PLoS One 2017; 12:e0184370. [PMID: 28910336 PMCID: PMC5598970 DOI: 10.1371/journal.pone.0184370] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 05/31/2017] [Accepted: 08/22/2017] [Indexed: 11/24/2022]
Abstract
This study applies unsupervised machine learning techniques for classification and clustering to a collection of descriptive variables from 10,442 lung cancer patient records in the Surveillance, Epidemiology, and End Results (SEER) program database. The goal is to automatically classify lung cancer patients into groups based on clinically measurable disease-specific variables in order to estimate survival. Variables selected as inputs for machine learning include Number of Primaries, Age, Grade, Tumor Size, Stage, and TNM, which are numeric or can readily be converted to numeric type. Minimal up-front processing of the data enables exploring the out-of-the-box capabilities of established unsupervised learning techniques, with little human intervention through the entire process. The output of the techniques is used to predict survival time, with the efficacy of the prediction representing a proxy for the usefulness of the classification. A basic single variable linear regression against each unsupervised output is applied, and the associated Root Mean Squared Error (RMSE) value is calculated as a metric to compare between the outputs. The results show that self-ordering maps exhibit the best performance, while k-Means performs the best of the simpler classification techniques. Predicting against the full data set, it is found that their respective RMSE values (15.591 for self-ordering maps and 16.193 for k-Means) are comparable to supervised regression techniques, such as Gradient Boosting Machine (RMSE of 15.048). We conclude that unsupervised data analysis techniques may be of use to classify patients by defining the classes as effective proxies for survival prediction.
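The evaluation step described in this abstract (a single-variable linear regression of survival time on each unsupervised output, scored by RMSE) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the use of a numeric cluster label as the single predictor, and the toy data are all assumptions made for the example.

```python
import math

def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # If every x is identical, fall back to predicting the mean of y.
    slope = sxy / sxx if sxx else 0.0
    intercept = mean_y - slope * mean_x
    return slope, intercept

def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    squared = [(a - b) ** 2 for a, b in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

def regression_rmse(cluster_labels, survival_months):
    """Regress survival on one unsupervised output and return the RMSE."""
    slope, intercept = fit_line(cluster_labels, survival_months)
    predictions = [slope * c + intercept for c in cluster_labels]
    return rmse(survival_months, predictions)

# Hypothetical toy data: two clusters with different typical survival times.
labels = [0, 0, 1, 1]
survival = [10, 14, 30, 34]
score = regression_rmse(labels, survival)
```

A lower RMSE for one clustering than another would suggest, under this proxy, that its groups separate patients by survival more usefully.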
Affiliation(s)
- Chip M. Lynch
  - Department of Computer Engineering and Computer Science, University of Louisville, Louisville, KY, United States of America
- Victor H. van Berkel
  - Department of Cardiovascular and Thoracic Surgery, University of Louisville, Louisville, KY, United States of America
- Hermann B. Frieboes
  - Department of Bioengineering, University of Louisville, Louisville, KY, United States of America
  - James Graham Brown Cancer Center, University of Louisville, Louisville, KY, United States of America
88
Elaziz MA, Hemdan AM, Hassanien A, Oliva D, Xiong S. Analysis of Bioactive Amino Acids from Fish Hydrolysates with a New Bioinformatic Intelligent System Approach. Sci Rep 2017; 7:10860. [PMID: 28883610 PMCID: PMC5589738 DOI: 10.1038/s41598-017-10890-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 06/09/2017] [Accepted: 07/24/2017] [Indexed: 11/11/2022]
Abstract
The current economics of the fish protein industry demand rapid, accurate and expressive prediction algorithms at every step of protein production, especially given the challenge of global climate change. Such algorithms help to predict and analyze functional and nutritional quality and, consequently, to control food allergies in hyperallergic patients. Measuring the relevant concentrations by laboratory experiments is expensive and time-consuming, especially for large-scale projects. Therefore, this paper introduces a new intelligent algorithm that uses an adaptive neuro-fuzzy inference system based on the whale optimization algorithm. The algorithm is used to predict the concentration levels of bioactive amino acids in fish protein hydrolysates at different times during the year. The whale optimization algorithm is used to determine the optimal parameters of the adaptive neuro-fuzzy inference system. The results of the proposed algorithm are compared with those of other algorithms and indicate its higher performance.
Affiliation(s)
- Mohamed Abd Elaziz
  - School of Computer Science and Technology, Wuhan University of Technology, Wuhan, China
  - Department of Mathematics, Faculty of Science, Zagazig University, Zagazig, Egypt
- Ahmed Monem Hemdan
  - Faculty of Veterinary Medicine, Kafrelsheikh University, Kafrelsheikh, Egypt
- Diego Oliva
  - Departamento de Ciencias Computacionales, Universidad de Guadalajara, CUCEI, Av. Revolucion 1500, Guadalajara, Jal, Mexico
- Shengwu Xiong
  - School of Computer Science and Technology, Wuhan University of Technology, Wuhan, China
  - Hubei Collaborative Innovation Center of Basic Education Information Technology Services, Hubei University of Education, Wuhan, China
89
Mas-Ponte D, Carlevaro-Fita J, Palumbo E, Hermoso Pulido T, Guigo R, Johnson R. LncATLAS database for subcellular localization of long noncoding RNAs. RNA 2017; 23:1080-1087. [PMID: 28386015 PMCID: PMC5473142 DOI: 10.1261/rna.060814.117] [Citation(s) in RCA: 204] [Impact Index Per Article: 29.1] [Received: 01/23/2017] [Accepted: 03/31/2017] [Indexed: 05/04/2023]
Abstract
The subcellular localization of long noncoding RNAs (lncRNAs) holds valuable clues to their molecular function. However, measuring localization of newly discovered lncRNAs involves time-consuming and costly experimental methods. We have created "lncATLAS," a comprehensive resource of lncRNA localization in human cells based on RNA-sequencing data sets. Altogether, 6768 GENCODE-annotated lncRNAs are represented across various compartments of 15 cell lines. We introduce relative concentration index (RCI) as a useful measure of localization derived from ensemble RNA-seq measurements. LncATLAS is accessible through an intuitive and informative webserver, from which lncRNAs of interest are accessed using identifiers or names. Localization is presented across cell types and organelles, and may be compared to the distribution of all other genes. Publication-quality figures and raw data tables are automatically generated with each query, and the entire data set is also available to download. LncATLAS makes lncRNA subcellular localization data available to the widest possible number of researchers. It is available at lncatlas.crg.eu.
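The abstract names the relative concentration index (RCI) without stating its formula. The sketch below assumes that RCI is the log2 ratio of a gene's expression level (for example, FPKM) in one compartment to its level in another, such as cytoplasm versus nucleus. The function name, the argument names and the handling of zero values are illustrative assumptions, not taken from the LncATLAS code.

```python
import math

def relative_concentration_index(expr_a, expr_b):
    """Assumed definition: log2(expr_a / expr_b) for two compartments.

    Returns None when either value is not positive, because the log ratio
    is undefined there; how such genes should be handled is an assumption.
    """
    if expr_a <= 0 or expr_b <= 0:
        return None
    return math.log2(expr_a / expr_b)

# Hypothetical expression values for one lncRNA in one cell line.
cytoplasm_fpkm = 8.0
nucleus_fpkm = 2.0
rci = relative_concentration_index(cytoplasm_fpkm, nucleus_fpkm)
```

Under this assumed definition, a positive value indicates enrichment in the first compartment, a negative value enrichment in the second, and zero an equal concentration in both.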
Affiliation(s)
- David Mas-Ponte
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona 08003, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona 08003, Spain
- Joana Carlevaro-Fita
  - Department of Clinical Research, University of Bern, 3008 Bern, Switzerland
  - Department of Medical Oncology, Inselspital, University Hospital and University of Bern, 3010 Bern, Switzerland
- Emilio Palumbo
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona 08003, Spain
- Toni Hermoso Pulido
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona 08003, Spain
- Roderic Guigo
  - Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona 08003, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona 08003, Spain
  - Institut Hospital del Mar d'Investigacions Mèdiques (IMIM), 08003 Barcelona, Catalonia, Spain
- Rory Johnson
  - Department of Clinical Research, University of Bern, 3008 Bern, Switzerland
  - Department of Medical Oncology, Inselspital, University Hospital and University of Bern, 3010 Bern, Switzerland
90
Yu H, Chen X, Lu L. Large-scale prediction of microRNA-disease associations by combinatorial prioritization algorithm. Sci Rep 2017; 7:43792. [PMID: 28317855 PMCID: PMC5357838 DOI: 10.1038/srep43792] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Received: 10/31/2016] [Accepted: 01/30/2017] [Indexed: 12/12/2022]
Abstract
Identifying associations between microRNA molecules and human diseases from large-scale heterogeneous biological data is an important step toward understanding disease pathogenesis at the microRNA level. However, experimental verification of microRNA-disease associations is expensive and time-consuming. To overcome the drawbacks of conventional experimental methods, we present a combinatorial prioritization algorithm to predict microRNA-disease associations. Importantly, our method can predict microRNAs associated with diseases that have no known associated microRNAs, and diseases associated with microRNAs that have no known associated diseases. The predictive performance of our approach was evaluated and verified by internal cross-validation and external independent validation on standard association datasets. The results demonstrate that our method predicts microRNA-disease associations with an area under the receiver operating characteristic curve (AUC) of 86.93%, outperforming previous prediction methods. In particular, we observed that an ensemble-based method that integrates the predictions of multiple algorithms gives more reliable and robust predictions than a single algorithm, with the AUC improved to 92.26%. We applied our combinatorial prioritization algorithm to lung neoplasms and breast neoplasms and revealed their top 30 microRNA candidates, which are consistent with the published literature and databases.
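The AUC figures quoted in this abstract measure how well predicted scores rank true associations above non-associations. A minimal sketch of that metric is given below, using the pairwise (Mann-Whitney) formulation; the function name and the toy scores are assumptions for illustration and do not reproduce the authors' prioritization algorithm or data.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative one,
    with ties counted as one half.
    """
    positives = [s for s, l in zip(scores, labels) if l == 1]
    negatives = [s for s, l in zip(scores, labels) if l == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical scores for four candidate microRNA-disease pairs;
# label 1 marks a known association, 0 an unconfirmed pair.
example_scores = [0.9, 0.8, 0.3, 0.1]
example_labels = [1, 0, 1, 0]
example_auc = auc(example_scores, example_labels)
```

The quadratic pairwise loop is chosen for readability; a rank-based computation gives the same value more efficiently on large candidate lists.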
Affiliation(s)
- Hua Yu
  - State Key Laboratory of Plant Genomics, Institute of Genetic and Developmental Biology, Chinese Academy of Sciences, No. 1 West Beichen Road, Chaoyang District, Beijing, 100101, China
- Xiaojun Chen
  - Key Lab of Agricultural Biotechnology of Ningxia, Agricultural Biotechnology Center, Ningxia Academy of Agriculture and Forestry Sciences, 590 Huanghe East Road, Jinfeng District, Yinchuan, Ningxia, 750002, China
- Lu Lu
  - Beijing Computing Center, Beijing Academy of Science and Technology, Building 3 BeiKe Industrial park, Fengxian road 7, Haidian District, Beijing, 100094, China