1
Wan C, Gao J, Zhang H, Jiang X, Zang Q, Ban R, Zhang Y, Shi Q. CPSS 2.0: a computational platform update for the analysis of small RNA sequencing data. Bioinformatics 2017; 33:3289-3291. [PMID: 28177064 PMCID: PMC5860027 DOI: 10.1093/bioinformatics/btx066] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 10/27/2016] [Revised: 01/14/2017] [Accepted: 02/07/2017] [Indexed: 01/05/2023] Open
Abstract
Summary: Next-generation sequencing has been widely applied in recent decades to understand the complexity of non-coding RNAs (ncRNAs). Here, we present CPSS 2.0, an updated version of CPSS 1.0 for small RNA sequencing data analysis, with the following improvements: (i) a substantial increase in supported species from 10 to 48; (ii) improved strategies for detecting ncRNAs; (iii) detection and profiling of more ncRNA types, such as lncRNA and circRNA; (iv) identification of differentially expressed ncRNAs among multiple samples; (v) an enhanced visualization interface containing graphs and charts in the detailed analysis results. The new version of CPSS is an efficient bioinformatics tool for users in non-coding RNA research.
Availability and implementation: CPSS 2.0 is implemented in PHP + Perl + R and can be freely accessed at http://114.214.166.79/cpss2.0/.
Contact: zyuanwei@ustc.edu.cn or qshi@ustc.edu.cn.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Changlin Wan
- Molecular and Cell Genetics Laboratory, The CAS Key Laboratory of Innate Immunity and Chronic Diseases, Hefei National Laboratory for Physical Sciences at Microscale, School of Life Sciences, CAS Center for Excellence in Molecular Cell Science, University of Science and Technology of China, Collaborative Innovation Center of Genetics and Development, Collaborative Innovation Center for Cancer Medicine, Hefei, China
- Jianing Gao
- Molecular and Cell Genetics Laboratory, The CAS Key Laboratory of Innate Immunity and Chronic Diseases, Hefei National Laboratory for Physical Sciences at Microscale, School of Life Sciences, CAS Center for Excellence in Molecular Cell Science, University of Science and Technology of China, Collaborative Innovation Center of Genetics and Development, Collaborative Innovation Center for Cancer Medicine, Hefei, China
- Huan Zhang
- Molecular and Cell Genetics Laboratory, The CAS Key Laboratory of Innate Immunity and Chronic Diseases, Hefei National Laboratory for Physical Sciences at Microscale, School of Life Sciences, CAS Center for Excellence in Molecular Cell Science, University of Science and Technology of China, Collaborative Innovation Center of Genetics and Development, Collaborative Innovation Center for Cancer Medicine, Hefei, China
- Xiaohua Jiang
- Molecular and Cell Genetics Laboratory, The CAS Key Laboratory of Innate Immunity and Chronic Diseases, Hefei National Laboratory for Physical Sciences at Microscale, School of Life Sciences, CAS Center for Excellence in Molecular Cell Science, University of Science and Technology of China, Collaborative Innovation Center of Genetics and Development, Collaborative Innovation Center for Cancer Medicine, Hefei, China
- Qiguang Zang
- School of Information Science and Technology, University of Science and Technology of China, Hefei, China
- Rongjun Ban
- Molecular and Cell Genetics Laboratory, The CAS Key Laboratory of Innate Immunity and Chronic Diseases, Hefei National Laboratory for Physical Sciences at Microscale, School of Life Sciences, CAS Center for Excellence in Molecular Cell Science, University of Science and Technology of China, Collaborative Innovation Center of Genetics and Development, Collaborative Innovation Center for Cancer Medicine, Hefei, China
- Yuanwei Zhang
- Molecular and Cell Genetics Laboratory, The CAS Key Laboratory of Innate Immunity and Chronic Diseases, Hefei National Laboratory for Physical Sciences at Microscale, School of Life Sciences, CAS Center for Excellence in Molecular Cell Science, University of Science and Technology of China, Collaborative Innovation Center of Genetics and Development, Collaborative Innovation Center for Cancer Medicine, Hefei, China
- Qinghua Shi
- Molecular and Cell Genetics Laboratory, The CAS Key Laboratory of Innate Immunity and Chronic Diseases, Hefei National Laboratory for Physical Sciences at Microscale, School of Life Sciences, CAS Center for Excellence in Molecular Cell Science, University of Science and Technology of China, Collaborative Innovation Center of Genetics and Development, Collaborative Innovation Center for Cancer Medicine, Hefei, China
2
Bousios A, Gaut BS, Darzentas N. Considerations and complications of mapping small RNA high-throughput data to transposable elements. Mob DNA 2017; 8:3. [PMID: 28228849 PMCID: PMC5311732 DOI: 10.1186/s13100-017-0086-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 11/26/2016] [Accepted: 01/31/2017] [Indexed: 12/21/2022] Open
Abstract
Background: High-throughput sequencing (HTS) has revolutionized the way in which epigenetic research is conducted. When coupled with fully sequenced genomes, millions of small RNA (sRNA) reads are mapped to regions of interest and the results scrutinized for clues about epigenetic mechanisms. However, this approach requires careful consideration with regard to experimental design, especially when one investigates repetitive parts of genomes such as transposable elements (TEs), or when such genomes are large, as is often the case in plants.
Results: Here, in an attempt to shed light on the complications of mapping sRNAs to TEs, we focus on the 2,300 Mb maize genome, 85% of which is derived from TEs, and scrutinize methodological strategies that are commonly employed in TE studies. These include choices for the reference dataset, the normalization of multiply mapping sRNAs, and the selection among sRNA metrics. We further examine how these choices influence the relationship between sRNAs and the critical feature of TE age, and contrast their effect on low-copy genomic regions and other popular HTS data.
Conclusions: Based on our analyses, we share a series of take-home messages that may help with the design, implementation, and interpretation of high-throughput TE epigenetic studies specifically, but our conclusions may also apply to any work that involves analysis of HTS data.
Affiliation(s)
- Alexandros Bousios
- School of Life Sciences, University of Sussex, Brighton, East Sussex BN1 9RH UK
- Brandon S. Gaut
- Department of Ecology and Evolutionary Biology, UC Irvine, Irvine, CA 92697 USA
- Nikos Darzentas
- Central European Institute of Technology, Masaryk University, Brno, 62500 Czech Republic
3
Capece V, Garcia Vizcaino JC, Vidal R, Rahman RU, Pena Centeno T, Shomroni O, Suberviola I, Fischer A, Bonn S. Oasis: online analysis of small RNA deep sequencing data. Bioinformatics 2015; 31:2205-7. [PMID: 25701573 PMCID: PMC4481843 DOI: 10.1093/bioinformatics/btv113] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.9] [Received: 12/10/2014] [Accepted: 02/13/2015] [Indexed: 01/18/2023]
Abstract
Summary: Oasis is a web application that allows for the fast and flexible online analysis of small-RNA-seq (sRNA-seq) data. It was designed for the end user in the lab, providing an easy-to-use web frontend including video tutorials, demo data and best-practice step-by-step guidelines on how to analyze sRNA-seq data. Oasis' exclusive selling points are a differential expression module that allows for the multivariate analysis of samples, a classification module for robust biomarker detection and an advanced programming interface that supports the batch submission of jobs. Both modules include the analysis of novel miRNAs, miRNA targets and functional analyses including GO and pathway enrichment. Oasis generates downloadable interactive web reports for easy visualization, exploration and analysis of data on a local system. Finally, Oasis' modular workflow enables the rapid (re-)analysis of data.
Availability and implementation: Oasis is implemented in Python, R, Java, PHP, C++ and JavaScript. It is freely available at http://oasis.dzne.de.
Contact: stefan.bonn@dzne.de
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Vincenzo Capece
- Laboratory of Computational Systems Biology and Laboratory of Epigenetic Mechanisms in Dementia, German Center for Neurodegenerative Diseases, 37077 Goettingen, Germany
- Julio C Garcia Vizcaino
- Laboratory of Computational Systems Biology and Laboratory of Epigenetic Mechanisms in Dementia, German Center for Neurodegenerative Diseases, 37077 Goettingen, Germany
- Ramon Vidal
- Laboratory of Computational Systems Biology and Laboratory of Epigenetic Mechanisms in Dementia, German Center for Neurodegenerative Diseases, 37077 Goettingen, Germany
- Raza-Ur Rahman
- Laboratory of Computational Systems Biology and Laboratory of Epigenetic Mechanisms in Dementia, German Center for Neurodegenerative Diseases, 37077 Goettingen, Germany
- Tonatiuh Pena Centeno
- Laboratory of Computational Systems Biology and Laboratory of Epigenetic Mechanisms in Dementia, German Center for Neurodegenerative Diseases, 37077 Goettingen, Germany
- Orr Shomroni
- Laboratory of Computational Systems Biology and Laboratory of Epigenetic Mechanisms in Dementia, German Center for Neurodegenerative Diseases, 37077 Goettingen, Germany
- Irantzu Suberviola
- Laboratory of Computational Systems Biology and Laboratory of Epigenetic Mechanisms in Dementia, German Center for Neurodegenerative Diseases, 37077 Goettingen, Germany
- Andre Fischer
- Laboratory of Computational Systems Biology and Laboratory of Epigenetic Mechanisms in Dementia, German Center for Neurodegenerative Diseases, 37077 Goettingen, Germany
- Stefan Bonn
- Laboratory of Computational Systems Biology and Laboratory of Epigenetic Mechanisms in Dementia, German Center for Neurodegenerative Diseases, 37077 Goettingen, Germany
4
Oleksiewicz U, Tomczak K, Woropaj J, Markowska M, Stępniak P, Shah PK. Computational characterisation of cancer molecular profiles derived using next generation sequencing. Contemp Oncol (Pozn) 2015; 19:A78-91. [PMID: 25691827 PMCID: PMC4322529 DOI: 10.5114/wo.2014.47137] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 01/30/2023] Open
Abstract
Our current understanding of cancer genetics is grounded on the principle that cancer arises from a clone that has accumulated the requisite somatically acquired genetic aberrations, leading to malignant transformation. It also results in aberrant gene and protein expression. Next generation sequencing (NGS) or deep sequencing platforms are being used to create large catalogues of changes in copy numbers, mutations, structural variations, gene fusions, gene expression, and other types of information for cancer patients. However, inferring different types of biological changes from the raw reads generated in sequencing experiments is algorithmically and computationally challenging. In this article, we outline common steps for the quality control and processing of NGS data. We highlight the importance of accurate and application-specific alignment of these reads and the methodological steps and challenges in obtaining different types of information. We comment on the importance of integrating these data and building infrastructure to analyse them. We also provide exhaustive lists of available software to obtain information and point readers to articles comparing software for deeper insight into specialised areas. We hope that the article will guide readers in choosing the right tools for analysing oncogenomic datasets.
Affiliation(s)
- Urszula Oleksiewicz
- Laboratory of Gene Therapy, Department of Cancer Immunology, The Greater Poland Cancer Centre, Poznan, Poland; Department of Cancer Immunology and Diagnostics, Chair of Medical Biotechnology, Poznan University of Medical Sciences, Poznan, Poland; These authors contributed equally to this paper
- Katarzyna Tomczak
- Laboratory of Gene Therapy, Department of Cancer Immunology, The Greater Poland Cancer Centre, Poznan, Poland; Department of Cancer Immunology and Diagnostics, Chair of Medical Biotechnology, Poznan University of Medical Sciences, Poznan, Poland; Postgraduate School of Molecular Medicine, Medical University of Warsaw, Warsaw; These authors contributed equally to this paper
- Jakub Woropaj
- Poznan University of Economics, Poznań, Poland; These authors contributed equally to this paper
- Parantu K Shah
- Institute for Applied Cancer Science, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
5
Cheng WC, Chung IF, Tsai CF, Huang TS, Chen CY, Wang SC, Chang TY, Sun HJ, Chao JYC, Cheng CC, Wu CW, Wang HW. YM500v2: a small RNA sequencing (smRNA-seq) database for human cancer miRNome research. Nucleic Acids Res 2014; 43:D862-7. [PMID: 25398902 PMCID: PMC4383957 DOI: 10.1093/nar/gku1156] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.2] [Indexed: 01/30/2023] Open
Abstract
We previously presented YM500, an integrated database for miRNA quantification, isomiR identification, arm switching discovery and novel miRNA prediction from 468 human smRNA-seq datasets. In this updated YM500v2 database (http://ngs.ym.edu.tw/ym500/), we focus on the cancer miRNome to make the database more disease-oriented. New miRNA-related algorithms developed after YM500 were included in YM500v2 and, more significantly, more than 8000 cancer-related smRNA-seq datasets (including those of primary tumors, paired normal tissues, PBMC, recurrent tumors, and metastatic tumors) were incorporated into YM500v2. Novel miRNAs (miRNAs not included in miRBase R21) were not only predicted by three independent algorithms but also cleaned by a new in silico filtration strategy and validated by wet-lab data such as Cross-Linked ImmunoPrecipitation sequencing (CLIP-seq) to reduce the false-positive rate. A new 'Meta-analysis' function additionally allows users to identify differentially expressed miRNAs and arm-switching events in real time according to user-defined sample groups and dozens of clinical criteria tidied up by proficient clinicians. The cancer miRNAs identified hold potential for both basic research and biotech applications.
Affiliation(s)
- Wei-Chung Cheng
- Research Center for Tumor Medical Science, China Medical University, Taichung 40402, Taiwan
- I-Fang Chung
- Institute of Biomedical Informatics, National Yang-Ming University, Taipei 11221, Taiwan
- Cheng-Fong Tsai
- Institute of Biomedical Informatics, National Yang-Ming University, Taipei 11221, Taiwan; VGH-YM Genomic Research Center, National Yang-Ming University, Taipei 11221, Taiwan
- Tse-Shun Huang
- Institute of Engineering in Medicine, University of California, San Diego, La Jolla, CA 92093, USA
- Chen-Yang Chen
- Institute of Biomedical Informatics, National Yang-Ming University, Taipei 11221, Taiwan; VGH-YM Genomic Research Center, National Yang-Ming University, Taipei 11221, Taiwan
- Shao-Chuan Wang
- Institute of Biomedical Informatics, National Yang-Ming University, Taipei 11221, Taiwan
- Ting-Yu Chang
- VGH-YM Genomic Research Center, National Yang-Ming University, Taipei 11221, Taiwan
- Hsing-Jen Sun
- Institute of Biomedical Informatics, National Yang-Ming University, Taipei 11221, Taiwan
- Jeffrey Yung-Chuan Chao
- Institute of Clinical Medicine, Medical College, National Yang-Ming University, Taipei 11221, Taiwan; Department of Radiation Oncology, Taichung Veterans' General Hospital, Taichung 40705, Taiwan
- Cheng-Chung Cheng
- Division of Cardiology, Department of Internal Medicine, Tri-Service General Hospital, National Defence Medical Center, Taipei 11490, Taiwan
- Cheng-Wen Wu
- Institute of Clinical Medicine, Medical College, National Yang-Ming University, Taipei 11221, Taiwan; Institute of Biomedical Science, Academia Sinica, Taipei 11529, Taiwan; Institute of Biochemistry and Molecular Biology, National Yang Ming University, Taipei 11221, Taiwan
- Hsei-Wei Wang
- Institute of Biomedical Informatics, National Yang-Ming University, Taipei 11221, Taiwan; VGH-YM Genomic Research Center, National Yang-Ming University, Taipei 11221, Taiwan; Institute of Clinical Medicine, Medical College, National Yang-Ming University, Taipei 11221, Taiwan; Institute of Microbiology and Immunology, National Yang-Ming University, Taipei 11221, Taiwan; Department of Education and Research, Taipei City Hospital, Taipei 10341, Taiwan
6
Liu YX, Wang M, Wang XJ. Endogenous small RNA clusters in plants. Genomics Proteomics Bioinformatics 2014; 12:64-71. [PMID: 24769055 PMCID: PMC4411336 DOI: 10.1016/j.gpb.2014.04.003] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 02/19/2014] [Revised: 04/09/2014] [Accepted: 04/15/2014] [Indexed: 11/25/2022]
Abstract
In plants, small RNAs (sRNAs) usually refer to non-coding RNAs (ncRNAs) with lengths of 20–24 nucleotides. sRNAs are involved in the regulation of many essential processes related to plant development and environmental responses. sRNAs in plants are mainly grouped into microRNAs (miRNAs) and small interfering RNAs (siRNAs), and the latter can be further classified into trans-acting siRNAs (ta-siRNAs), repeat-associated siRNAs (ra-siRNAs), natural anti-sense siRNAs (nat-siRNAs), etc. Many sRNAs exhibit a clustered distribution pattern in the genome. Here, we summarize the features and functions of cluster-distributed sRNAs, aiming not only to provide a thorough picture of sRNA clusters (SRCs) in plants, but also to shed light on the identification of new classes of functional sRNAs.
Affiliation(s)
- Yong-Xin Liu
- State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100101, China
- Meng Wang
- State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China
- Xiu-Jie Wang
- State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China.