1
Singh J, Khanna NN, Rout RK, Singh N, Laird JR, Singh IM, Kalra MK, Mantella LE, Johri AM, Isenovic ER, Fouda MM, Saba L, Fatemi M, Suri JS. GeneAI 3.0: powerful, novel, generalized hybrid and ensemble deep learning frameworks for miRNA species classification of stationary patterns from nucleotides. Sci Rep 2024; 14:7154. [PMID: 38531923 DOI: 10.1038/s41598-024-56786-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2023] [Accepted: 03/11/2024] [Indexed: 03/28/2024]
Abstract
Due to the intricate relationship between small non-coding ribonucleic acid (miRNA) sequences, the classification of miRNA species, namely Human, Gorilla, Rat, and Mouse, is challenging. Previous methods are neither robust nor accurate. In this study, we present AtheroPoint's GeneAI 3.0, a powerful, novel, and generalized method for extracting features from the fixed patterns of purines and pyrimidines in each miRNA sequence, within ensemble machine learning (EML) and convolutional neural network (CNN)-based ensemble deep learning (EDL) frameworks. GeneAI 3.0 used five conventional features (Entropy, Dissimilarity, Energy, Homogeneity, and Contrast) and three contemporary features (Shannon entropy, Hurst exponent, and Fractal dimension) to generate a composite feature set from the given miRNA sequences, which was then passed into our ML and DL classification framework. A set of 11 new classifiers was designed, consisting of 5 EML and 6 EDL models, for binary/multiclass classification. It was benchmarked against 9 solo ML (SML), 6 solo DL (SDL), and 12 hybrid DL (HDL) models, for a total of 11 + 27 = 38 models. Four hypotheses were formulated and validated using explainable AI (XAI) as well as reliability/statistical tests. The order of the mean performance using accuracy (ACC)/area-under-the-curve (AUC) of the 24 DL classifiers was: EDL > HDL > SDL. The mean performance of EDL models with CNN layers was superior to that without CNN layers by 0.73%/0.92%. The mean performance of EML models was superior to that of SML models, with improvements in ACC/AUC of 6.24%/6.46%. EDL models performed significantly better than EML models, with a mean increase in ACC/AUC of 7.09%/6.96%. The GeneAI 3.0 tool produced the expected XAI feature plots, and the statistical tests showed significant p-values. Ensemble models with composite features are highly effective and generalized models for classifying miRNA sequences.
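One of the features named in the abstract, Shannon entropy over a purine/pyrimidine pattern, can be illustrated with a short sketch. The abstract does not describe how GeneAI 3.0 encodes sequences or computes its features, so the R/Y encoding, the function names, and the toy sequence below are assumptions made for illustration only.

```python
import math
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "U", "T"}


def purine_pyrimidine_pattern(seq):
    """Map each nucleotide to 'R' (purine) or 'Y' (pyrimidine).

    Assumed encoding for illustration; not taken from the paper.
    """
    pattern = []
    for base in seq.upper():
        if base in PURINES:
            pattern.append("R")
        elif base in PYRIMIDINES:
            pattern.append("Y")
        else:
            raise ValueError(f"unexpected nucleotide: {base!r}")
    return "".join(pattern)


def shannon_entropy(symbols):
    """Shannon entropy (in bits) of the symbol frequencies in a string."""
    n = len(symbols)
    counts = Counter(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Toy sequence (hypothetical, not a real miRNA)
pattern = purine_pyrimidine_pattern("AGCU")
print(pattern, shannon_entropy(pattern))
```

A balanced pattern gives the maximum of 1 bit for a two-symbol alphabet, while a pattern made of a single symbol gives 0, which is why such a value can serve as one coordinate of a composite feature vector.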
Affiliation(s)
- Jaskaran Singh
  - Department of Computer Science, Graphic Era Deemed to be University, Dehradun, Uttarakhand, India
- Narendra N Khanna
  - Department of Cardiology, Indraprastha APOLLO Hospitals, New Delhi, India
- Ranjeet K Rout
  - Department of Computer Science and Engineering, NIT Srinagar, Hazratbal, Srinagar, India
- Narpinder Singh
  - Department of Food Science, Graphic Era Deemed to be University, Dehradun, Uttarakhand, India
- John R Laird
  - Heart and Vascular Institute, Adventist Health St. Helena, St Helena, CA, USA
- Inder M Singh
  - Advanced Cardiac and Vascular Institute, Sacramento, CA, USA
- Mannudeep K Kalra
  - Department of Radiology, Massachusetts General Hospital, Boston, MA, 02115, USA
- Laura E Mantella
  - Department of Biomedical and Molecular Sciences, Queen's University, Kingston, ON, Canada
- Amer M Johri
  - Department of Biomedical and Molecular Sciences, Queen's University, Kingston, ON, Canada
- Esma R Isenovic
  - Laboratory for Molecular Genetics and Radiobiology, University of Belgrade, Belgrade, Serbia
- Mostafa M Fouda
  - Department of Electrical and Computer Engineering, Idaho State University, Pocatello, ID, 83209, USA
- Luca Saba
  - Department of Neurology, University of Cagliari, Cagliari, Italy
- Mostafa Fatemi
  - Department of Physiology and Biomedical Engineering, Mayo Clinic, Rochester, MN, 55905, USA
- Jasjit S Suri
  - Stroke Monitoring and Diagnostic Division, AtheroPoint LLC, Roseville, CA, 95661, USA
2
Rao S, Balyan S, Bansal C, Mathur S. An Integrated Bioinformatics and Functional Approach for miRNA Validation. Methods Mol Biol 2022; 2408:253-281. [PMID: 35325428 DOI: 10.1007/978-1-0716-1875-2_17] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
MicroRNAs (miRNAs) are small (20-24 nucleotides) non-coding ribo-regulatory molecules with significant roles in regulating target mRNA and long non-coding RNAs at transcriptional and post-transcriptional levels. Rapid advancement in the small RNA sequencing methods with integration of degradome sequencing has accelerated the understanding of miRNA-mediated regulatory hubs in plants and yielded extensive annotation of miRNAs and corresponding targets. However, it is becoming clear that large numbers of such annotations are questionable. Therefore, it is imperative to adopt reliable and strict bioinformatics pipelines for miRNA identification. Furthermore, sensitive methods are needed for validation and functional characterization of miRNA and its target(s). In this chapter, we have provided a comprehensive and streamlined methodology for miRNA identification and its functional validation in plants. This includes a combination of various in silico and experimental methodologies. To identify miRNA compendium from large-scale Next-Generation Sequencing (NGS) small RNA datasets, the miR-PREFeR (miRNA PREdiction From small RNA-Seq data) bioinformatics tool has been described. Also, a homology-based search protocol for finding members of a specific miRNA family has been discussed. The chapter also includes techniques to ascertain miRNA:target pair specificity using in silico target prediction from degradome NGS libraries using CleaveLand pipeline, miRNA:target validation by in planta transient assays, 5' RLM-RACE and expression analysis as well as functional techniques like miRNA overexpression, short tandem target mimic and resistant target approaches. The proposed strategy offers a reliable and sensitive way for miRNA:target identification and validation. Additionally, we strongly promulgate the use of multiple methodologies to validate a miRNA as well as its target.
Affiliation(s)
- Sombir Rao
  - National Institute of Plant Genome Research, New Delhi, India
- Sonia Balyan
  - National Institute of Plant Genome Research, New Delhi, India
- Chandni Bansal
  - National Institute of Plant Genome Research, New Delhi, India
- Saloni Mathur
  - National Institute of Plant Genome Research, New Delhi, India
3
Kajal M, Kaushal N, Kaur R, Singh K. Identification of novel microRNAs and their targets in Chlorophytum borivilianum by small RNA and degradome sequencing. Noncoding RNA Res 2020; 4:141-154. [PMID: 32072082 PMCID: PMC7012778 DOI: 10.1016/j.ncrna.2019.11.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2019] [Revised: 11/20/2019] [Accepted: 11/21/2019] [Indexed: 11/04/2022]
Abstract
Plant-specific miRNAs (novel miRNAs) are well known to perform distinctive functions in biological processes. Identification of new miRNAs is necessary to understand their gene regulation. The degradome provides an opportunity to explore miRNA functions by comparing the miRNA population with their degraded products. In the present study, small RNA sequencing data were used to identify novel miRNAs. Further, degradome sequencing was carried out to identify miRNA targets in the plant Chlorophytum borivilianum. The present study added 40 more novel miRNAs by correlating degradome data with the small RNAome. Novel miRNAs complementary to mRNA partial sequences obtained from degradome sequencing were actually targeting the latter. A large pool of miRNAs was established using Oryza sativa, Arabidopsis thaliana, Populus trichocarpa, Ricinus communis, and Vitis vinifera genomic data. Targets were identified for the novel miRNAs, and a total of 109 targets were predicted. BLAST2GO analysis elaborated on the localization of the novel miRNAs' targets and their corresponding KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways. The identified targets were annotated and found to be involved in significant biological processes such as nitrogen metabolism, pyruvate metabolism, the citrate cycle (TCA cycle), photosynthesis, and glycolysis/gluconeogenesis. The present study provides an overall view of miRNA regulation in multiple metabolic pathways that are involved in plant growth, pathogen resistance, and secondary metabolism of C. borivilianum.
Key Words
- AGO, Argonaute
- BLAST, Basic Local Alignment Search Tool
- BP, Biological Process
- CC, Cellular Component
- Chlorophytum borivilianum
- Degradome
- FAO, Food and Agriculture Organization of the United Nations
- GO, Gene Ontology
- IL, Interleukin
- Illumina sequencing
- KEGG, Kyoto Encyclopedia of Genes and Genomes
- MCF-7, PC3, HCT-116, Types of cell lines
- MEP, 2-C-methyl-D-erythritol-4-phosphate pathway
- MF, Molecular Function
- MFEs, Minimum Fold Energies
- MTT, 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyl tetrazolium bromide
- MVA, Mevalonic Acid Pathway
- RdDM, RNA-directed DNA methylation
- SRA, Sequence Read Archive
- TNF, Tumor Necrosis Factor
- iNOS, Inducible Nitric Oxide Synthase
- mg mL−1, milligram per millilitre
- microRNAs
- nt, nucleotide
Affiliation(s)
- Monika Kajal
  - Department of Biotechnology, Panjab University, BMS Block-I, Sector 25, Chandigarh, 160014, India
- Nishant Kaushal
  - Department of Biotechnology, Panjab University, BMS Block-I, Sector 25, Chandigarh, 160014, India
- Ravneet Kaur
  - Department of Biotechnology, Panjab University, BMS Block-I, Sector 25, Chandigarh, 160014, India
- Kashmir Singh
  - Department of Biotechnology, Panjab University, BMS Block-I, Sector 25, Chandigarh, 160014, India
4
Parveen A, Mustafa SH, Yadav P, Kumar A. Applications of Machine Learning in miRNA Discovery and Target Prediction. Curr Genomics 2020; 20:537-544. [PMID: 32581642 PMCID: PMC7290058 DOI: 10.2174/1389202921666200106111813] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 10/23/2019] [Revised: 12/05/2019] [Accepted: 12/09/2019] [Indexed: 11/28/2022]
Abstract
MicroRNA (miRNA) is a small non-coding molecule that is involved in gene regulation and RNA silencing through complementarity to its targets. Experimental methods for target prediction can be time-consuming and expensive. Computational approaches are therefore applied alongside experimental studies to ease these difficulties. However, there is still a need for an optimized approach in miRNA biology. Machine learning (ML) could therefore initiate a new era of research in miRNA biology directed towards potential disease biomarkers. In this article, we describe the application of ML approaches in miRNA discovery and target prediction, along with their functions and future prospects. The implementation of a new era of computational methodologies in this direction would initiate further advanced levels of discoveries in miRNA.
Affiliation(s)
- Authors: Alisha Parveen, Syed H Mustafa, Pankaj Yadav, Abhishek Kumar
  - 1. Institute of Medical Bioinformatics and Systems Medicine, Medical Center, Faculty of Medicine, Albert-Ludwigs University of Freiburg, 79110 Freiburg, Germany
  - 2. Department of Computer Engineering, Zakir Husain College of Engineering and Technology, Aligarh Muslim University, Aligarh, Uttar Pradesh, India
  - 3. Department of Bioscience and Bioengineering, Indian Institute of Technology, Jodhpur, India
  - 4. Institute of Bioinformatics, International Technology Park, Bangalore, 560066, India
  - 5. Manipal Academy of Higher Education (MAHE), Manipal 576104, Karnataka, India
5
Mármol-Sánchez E, Cirera S, Quintanilla R, Pla A, Amills M. Discovery and annotation of novel microRNAs in the porcine genome by using a semi-supervised transductive learning approach. Genomics 2019; 112:2107-2118. [PMID: 31816430 DOI: 10.1016/j.ygeno.2019.12.005] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 07/28/2019] [Revised: 11/13/2019] [Accepted: 12/05/2019] [Indexed: 12/15/2022]
Abstract
Despite the broad variety of available microRNA (miRNA) prediction tools, their application to the discovery and annotation of novel miRNA genes in domestic species is still limited. In this study we designed a comprehensive pipeline (eMIRNA) for miRNA identification in the still poorly annotated porcine genome and demonstrated the usefulness of implementing a motif search positional refinement strategy for the accurate determination of precursor miRNA boundaries. The small RNA fraction from gluteus medius skeletal muscle of 48 Duroc gilts was sequenced and used for the prediction of novel miRNA loci. Additionally, we selected the human miRNA annotation for a homology-based search of porcine miRNAs with orthologous genes in the human genome. A total of 20 novel expressed miRNAs were identified in the porcine muscle transcriptome, and 27 additional novel porcine miRNAs were detected by homology-based search using the human miRNA annotation. The existence of three selected novel miRNAs (ssc-miR-483, ssc-miR-484 and ssc-miR-200a) was further confirmed by reverse transcription quantitative real-time PCR analyses in the muscle and liver tissues of Göttingen minipigs. In summary, the eMIRNA pipeline presented in the current work allowed us to expand the catalogue of porcine miRNAs and showed better performance than other commonly used miRNA prediction approaches. More importantly, the flexibility of our pipeline makes its application possible in other still poorly annotated non-model species.
Affiliation(s)
- Emilio Mármol-Sánchez
  - Centre for Research in Agricultural Genomics (CRAG), CSIC-IRTA-UAB-UB, Universitat Autònoma de Barcelona, 08193 Bellaterra, Spain
- Susanna Cirera
  - Department of Veterinary and Animal Sciences, Faculty of Health and Medical Sciences, University of Copenhagen, Grønnegårdsvej 3, 2nd Floor, 1870 Frederiksberg C, Denmark
- Raquel Quintanilla
  - Animal Breeding and Genetics Program, Institute for Research and Technology in Food and Agriculture (IRTA), Torre Marimon, 08140 Caldes de Montbui, Spain
- Albert Pla
  - Department of Medical Genetics, University of Oslo and Oslo University Hospital, Oslo, Norway
- Marcel Amills
  - Centre for Research in Agricultural Genomics (CRAG), CSIC-IRTA-UAB-UB, Universitat Autònoma de Barcelona, 08193 Bellaterra, Spain; Departament de Ciència Animal i dels Aliments, Universitat Autònoma de Barcelona, 08193 Bellaterra, Barcelona, Spain
6
Toraih EA, Abdallah HY, Rashed EA, El-Wazir A, Tantawy MA, Fawzy MS. Comprehensive data analysis for development of custom qRT-PCR miRNA assay for glioblastoma: a prevalidation study. Epigenomics 2019; 11:367-380. [DOI: 10.2217/epi-2018-0134] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 12/21/2022]
Abstract
Aim: Glioblastoma (GB) is one notable example of miRNA-modulated neoplasms. Given its unique expression signature, proper miRNA profiling can help discriminate between GB and other types of brain tumors. The current work aimed to develop a more GB-specific and applicable custom designed quantitative real-time reverse transcription polymerase chain reaction (qRT-PCR) miRNA assay. Materials & methods: A comprehensive data analysis of bioinformatics databases, previous literature and commercially available pre-designed miRNA PCR arrays within the market. Results: A highly enriched panel of 84 deregulated and GB-specific miRNAs has been developed. Conclusion: After validation of this newly developed array, it can not only save the researcher's time and effort, but can also have a potential diagnostic and/or prognostic role in GB, paving the road toward personalized medicine.
Affiliation(s)
- Eman A Toraih
  - Department of Histology & Cell Biology, Genetics Unit, Faculty of Medicine, Suez Canal University, Ismailia 41522, Egypt
  - Center of Excellence of Molecular & Cellular Medicine, Suez Canal University, Ismailia, Egypt
- Hoda Y Abdallah
  - Department of Histology & Cell Biology, Genetics Unit, Faculty of Medicine, Suez Canal University, Ismailia 41522, Egypt
  - Center of Excellence of Molecular & Cellular Medicine, Suez Canal University, Ismailia, Egypt
- Essam A Rashed
  - Department of Mathematics, Faculty of Science, Suez Canal University, Ismailia 41522, Egypt
  - Department of Computer Science, Faculty of Informatics and Computer Science, The British University in Egypt, Cairo 11837, Egypt
- Aya El-Wazir
  - Department of Histology & Cell Biology, Genetics Unit, Faculty of Medicine, Suez Canal University, Ismailia 41522, Egypt
  - Center of Excellence of Molecular & Cellular Medicine, Suez Canal University, Ismailia, Egypt
- Mohamed A Tantawy
  - Hormones Department, Medical Research Division, National Research Center, Cairo, Egypt
- Manal S Fawzy
  - Department of Medical Biochemistry & Molecular Biology, Faculty of Medicine, Suez Canal University, Ismailia 41522, Egypt
  - Department of Biochemistry, Faculty of Medicine, Northern Border University, Arar, Saudi Arabia
7
Peace RJ, Sheikh Hassani M, Green JR. miPIE: NGS-based Prediction of miRNA Using Integrated Evidence. Sci Rep 2019; 9:1548. [PMID: 30733467 PMCID: PMC6367335 DOI: 10.1038/s41598-018-38107-z] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 07/25/2018] [Accepted: 12/18/2018] [Indexed: 12/12/2022]
Abstract
Methods for the de novo identification of microRNA (miRNA) have been developed using a range of sequence-based features. With the increasing availability of next generation sequencing (NGS) transcriptome data, there is a need for miRNA identification that integrates both NGS transcript expression-based patterns as well as advanced genomic sequence-based methods. While miRDeep2 does examine the predicted secondary structure of putative miRNA sequences, it does not leverage many of the sequence-based features used in state-of-the-art de novo methods. Meanwhile, other NGS-based methods, such as miRanalyzer, place an emphasis on sequence-based features without leveraging advanced expression-based features reflecting miRNA biosynthesis. This represents an opportunity to combine the strengths of NGS-based analysis with recent advances in de novo sequence-based miRNA prediction. We here develop a method, microRNA Prediction using Integrated Evidence (miPIE), which integrates both expression-based and sequence-based features to achieve significantly improved miRNA prediction performance. Feature selection identifies the 20 most discriminative features, 3 of which reflect strictly expression-based information. Evaluation using precision-recall curves, for six NGS data sets representing six diverse species, demonstrates substantial improvements in prediction performance compared to three methods: miRDeep2, miRanalyzer, and mirnovo. The individual contributions of expression-based and sequence-based features are also examined and we demonstrate that their combination is more effective than either alone.
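The precision-recall evaluation mentioned in the abstract can be sketched in a few lines. This is a generic illustration, not the miPIE evaluation code: the function name and the toy scores and labels are hypothetical, and ties between scores are not given any special handling.

```python
def precision_recall_points(scores, labels):
    """Return (recall, precision) pairs obtained by lowering the decision
    threshold one ranked prediction at a time.

    scores: classifier confidence that a candidate is a true miRNA
    labels: 1 for a known miRNA, 0 for a negative example
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("at least one positive label is required")
    # Rank candidates from most to least confident
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    tp = fp = 0
    points = []
    for _, label in ranked:
        if label:
            tp += 1
        else:
            fp += 1
        points.append((tp / total_pos, tp / (tp + fp)))
    return points


# Toy data (hypothetical): four candidates, two of them true miRNAs
for recall, precision in precision_recall_points([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0]):
    print(f"recall={recall:.2f} precision={precision:.2f}")
```

Precision-recall curves are a natural choice for this task because true miRNAs are rare among candidate sequences, so precision exposes false positives that overall accuracy would hide.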
Affiliation(s)
- R J Peace
  - Department of Systems and Computer Engineering, Carleton University, Ottawa, Ontario, Canada
- M Sheikh Hassani
  - Department of Systems and Computer Engineering, Carleton University, Ottawa, Ontario, Canada
- J R Green
  - Department of Systems and Computer Engineering, Carleton University, Ottawa, Ontario, Canada
8
Numnark S, Suwannik W. An emerging technique for reducing the response time in plant miRNA identification. Comput Biol Chem 2019; 78:382-388. [DOI: 10.1016/j.compbiolchem.2018.12.019] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 11/07/2018] [Accepted: 12/25/2018] [Indexed: 12/23/2022]
9
Abstract
microRNA molecules have been shown to play various significant roles in many physiological and pathophysiological processes in living organisms. The tremendous interest in these molecules has led to the significant development and constant release of a number of computational tools useful for basic as well as advanced miRNA-related analyses. These approaches have various constantly evolving utilities, such as detection, target prediction, functional annotation, and many others. In this chapter, we provide an overview of several computational tools useful for broadly defined plant miRNA analysis.
Affiliation(s)
- Anna Lukasik
  - Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Warsaw, Poland
- Piotr Zielenkiewicz
  - Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Warsaw, Poland
  - Department of Plant Molecular Biology, Institute of Experimental Plant Biology and Biotechnology, University of Warsaw, Warsaw, Poland
10
Yones C, Stegmayer G, Milone DH. Genome-wide pre-miRNA discovery from few labeled examples. Bioinformatics 2018; 34:541-549. [PMID: 29028911 DOI: 10.1093/bioinformatics/btx612] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 03/03/2017] [Accepted: 09/22/2017] [Indexed: 12/16/2022]
Abstract
Motivation: Although many machine learning techniques have been proposed for distinguishing miRNA hairpins from other stem-loop sequences, most of the current methods use supervised learning, which requires a very good set of positive and negative examples. Those methods have important practical limitations when they have to be applied to a real prediction task. First, there is the challenge of dealing with a scarce number of positive (well-known) pre-miRNA examples. Secondly, it is very difficult to build a good set of negative examples for representing the full spectrum of non-miRNA sequences. Thirdly, in any genome, there is a huge class imbalance (1:10 000) that is well known to particularly affect supervised classifiers.
Results: To enable efficient and speedy genome-wide predictions of novel miRNAs, we present miRNAss, which is a novel method based on semi-supervised learning. It takes advantage of the information provided by the unlabeled stem-loops, thereby improving the prediction rates, even when the number of labeled examples is low and not representative of the classes. An automatic method for searching negative examples to initialize the algorithm is also proposed so as to spare the user this difficult task. MiRNAss obtained better prediction rates and shorter execution times than state-of-the-art supervised methods. It was validated with genome-wide data from three model species, with more than one million hairpin sequences each, thereby demonstrating its applicability to a real prediction task.
Availability and implementation: An R package can be downloaded from https://cran.r-project.org/package=miRNAss. In addition, a web-demo for testing the algorithm is available at http://fich.unl.edu.ar/sinc/web-demo/mirnass. All the datasets that were used in this study and the sets of predicted pre-miRNA are available on http://sourceforge.net/projects/sourcesinc/files/mirnass.
Contact: cyones@sinc.unl.edu.ar.
Supplementary information: Supplementary data are available at Bioinformatics online.
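The effect of the 1:10 000 class imbalance cited in the abstract can be made concrete with a small calculation. The sketch below is illustrative only: the sensitivity and specificity values are assumptions chosen for the example, not figures reported for miRNAss or any other tool.

```python
def expected_precision(sensitivity, specificity, positives, negatives):
    """Expected precision of a classifier applied to an imbalanced set.

    positives / negatives: class sizes (only their ratio matters).
    """
    true_positives = sensitivity * positives
    false_positives = (1.0 - specificity) * negatives
    return true_positives / (true_positives + false_positives)


# Assumed operating point: 90% sensitivity, 99% specificity,
# applied at the 1:10 000 imbalance mentioned in the abstract.
precision = expected_precision(0.90, 0.99, positives=1, negatives=10_000)
print(f"expected precision: {precision:.4f}")
```

Under these assumed values, fewer than 1 in 100 predicted hairpins would be a true pre-miRNA, even though the classifier looks strong on a balanced test set. This is the practical limitation of supervised classifiers that the abstract points to.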
Affiliation(s)
- C Yones
  - Research Institute for Signals, Systems and Computational Intelligence, sinc(i), UNL-CONICET. Ciudad Universitaria, 4to piso FICH, Santa Fe 3000, Argentina
- G Stegmayer
  - Research Institute for Signals, Systems and Computational Intelligence, sinc(i), UNL-CONICET. Ciudad Universitaria, 4to piso FICH, Santa Fe 3000, Argentina
- D H Milone
  - Research Institute for Signals, Systems and Computational Intelligence, sinc(i), UNL-CONICET. Ciudad Universitaria, 4to piso FICH, Santa Fe 3000, Argentina
11
Vivek A. In silico identification and characterization of microRNAs based on EST and GSS in orphan legume crop, Lens culinaris Medik. (Lentil). Agri Gene 2018. [DOI: 10.1016/j.aggene.2018.05.003] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 12/20/2022]
12
Khan A, Shah S, Wahid F, Khan FG, Jabeen S. Identification of microRNA precursors using reduced and hybrid features. Mol Biosyst 2018; 13:1640-1645. [PMID: 28686281 DOI: 10.1039/c7mb00115k] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 11/21/2022]
Abstract
MicroRNAs (also called miRNAs) are a group of short non-coding RNA molecules. They play a vital role in gene expression at the transcriptional and post-transcriptional levels. However, abnormal expression of miRNAs has been observed in cancer, heart diseases and nervous system disorders. Therefore, for basic research and microRNA-based therapy, it is imperative to separate real pre-miRNAs from false ones (hairpin sequences similar to pre-miRNA stem loops). Different conservation-based and machine learning methods have been applied for the identification of miRNAs. However, machine learning algorithms have gained more popularity than conservation-based algorithms in terms of sensitivity and overall performance. Due to the avalanche of RNA sequences discovered in the post-genomic age, it is necessary to construct a predictor for the identification of pre-microRNAs in humans. We have developed a predictor called MicroR-Pred in which the RNA sequences are formulated by a hybrid feature vector. The novelty of the new predictor is in the use of the partial least squares technique for dimension reduction, followed by the Random Forest (RF) and Support Vector Machine (SVM) algorithms for classification. The performance of the MicroR-Pred model is quite promising compared to other state-of-the-art miRNA predictors. It has achieved accuracies of 88.40% and 93.90% for RF and SVM, respectively.
Affiliation(s)
- Asad Khan
  - Department of Computer Science, COMSATS Institute of IT, Abbottabad 22060, Pakistan
- Sajid Shah
  - Department of Computer Science, COMSATS Institute of IT, Abbottabad 22060, Pakistan
- Fazli Wahid
  - Department of Environmental Sciences, COMSATS Institute of IT, Abbottabad 22060, Pakistan
- Fiaz Gul Khan
  - Department of Computer Science, COMSATS Institute of IT, Abbottabad 22060, Pakistan
- Saima Jabeen
  - Department of Computer Science, COMSATS Institute of IT, Abbottabad 22060, Pakistan
13
Kim KH, Seo YM, Kim EY, Lee SY, Kwon J, Ko JJ, Lee KA. The miR-125 family is an important regulator of the expression and maintenance of maternal effect genes during preimplantational embryo development. Open Biol 2017; 6:rsob.160181. [PMID: 27906131 PMCID: PMC5133438 DOI: 10.1098/rsob.160181] [Citation(s) in RCA: 31] [Impact Index Per Article: 4.4] [Received: 06/15/2016] [Accepted: 11/03/2016] [Indexed: 02/03/2023]
Abstract
Previously, we reported that Sebox is a new maternal effect gene (MEG) that is required for early embryo development beyond the two-cell (2C) stage because this gene orchestrates the expression of important genes for zygotic genome activation (ZGA). However, regulators of Sebox expression remain unknown. Therefore, the objectives of the present study were to use bioinformatics tools to identify such regulatory microRNAs (miRNAs) and to determine the effects of the identified miRNAs on Sebox expression. Using computational algorithms, we identified a motif within the 3′UTR of Sebox mRNA that is specific to the seed region of the miR-125 family, which includes miR-125a-5p, miR-125b-5p and miR-351-5p. During our search for miRNAs, we found that the Lin28a 3′UTR also contains the same binding motif for the seed region of the miR-125 family. In addition, we confirmed that Lin28a also plays a role as a MEG and affects ZGA at the 2C stage, without affecting oocyte maturation or fertilization. Thus, we provide the first report indicating that the miR-125 family plays a crucial role in regulating MEGs related to the 2C block and in regulating ZGA through methods such as affecting Sebox and Lin28a in oocytes and embryos.
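The idea of locating a seed-region binding motif in a 3′UTR can be sketched as a simple string search. The abstract does not name the algorithms the authors used, so the code below is only a generic illustration: it assumes the common convention that the seed spans miRNA nucleotides 2 to 8 and that a candidate site is the reverse complement of the seed, and the miRNA and UTR sequences are toy examples, not miR-125 or Sebox sequences.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def to_rna(seq):
    """Upper-case a sequence and convert DNA 'T' to RNA 'U'."""
    return seq.upper().replace("T", "U")


def seed_site(mirna, start=2, end=8):
    """Reverse complement of the miRNA seed (positions start..end, 1-based)."""
    seed = to_rna(mirna)[start - 1:end]
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def find_seed_sites(mirna, utr):
    """Return 0-based positions in the UTR that perfectly match the seed site."""
    site = seed_site(mirna)
    utr = to_rna(utr)
    positions = []
    idx = utr.find(site)
    while idx != -1:
        positions.append(idx)
        idx = utr.find(site, idx + 1)
    return positions


# Toy sequences (hypothetical)
toy_mirna = "UACGGAUCCAUGCAUGCAUGC"
toy_utr = "AAAGAUCCGUAAA"
print(seed_site(toy_mirna), find_seed_sites(toy_mirna, toy_utr))
```

Because members of a miRNA family share the same seed, a single matching site in a UTR makes the transcript a candidate target for every member of the family. Real target prediction tools add further criteria such as site context and conservation.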
Collapse
Affiliation(s)
- Kyeoung-Hwa Kim
- Institute of Reproductive Medicine, Department of Biomedical Science, College of Life Science, CHA University, Pangyo, South Korea
- You-Mi Seo
- Department of Oral Histology-Developmental Biology, School of Dentistry and Dental Research Institute, Seoul National University, Seoul, South Korea
- Eun-Young Kim
- Institute of Reproductive Medicine, Department of Biomedical Science, College of Life Science, CHA University, Pangyo, South Korea
- Su-Yeon Lee
- Institute of Reproductive Medicine, Department of Biomedical Science, College of Life Science, CHA University, Pangyo, South Korea
- Jini Kwon
- Institute of Reproductive Medicine, Department of Biomedical Science, College of Life Science, CHA University, Pangyo, South Korea
- Jung-Jae Ko
- Institute of Reproductive Medicine, Department of Biomedical Science, College of Life Science, CHA University, Pangyo, South Korea
- Kyung-Ah Lee
- Institute of Reproductive Medicine, Department of Biomedical Science, College of Life Science, CHA University, Pangyo, South Korea
|
14
|
Luo J, Xiao Q. A novel approach for predicting microRNA-disease associations by unbalanced bi-random walk on heterogeneous network. J Biomed Inform 2017; 66:194-203. [PMID: 28104458 DOI: 10.1016/j.jbi.2017.01.008] [Citation(s) in RCA: 78] [Impact Index Per Article: 11.1] [Received: 05/24/2016] [Revised: 01/11/2017] [Accepted: 01/13/2017] [Indexed: 12/24/2022]
Abstract
MicroRNAs (miRNAs) play a critical role by regulating their targets at the post-transcriptional level. Identification of potential miRNA-disease associations will aid in deciphering the pathogenesis of human polygenic diseases. Several computational models have been developed to uncover novel miRNA-disease associations based on predicted target genes. However, because experimentally validated miRNA-target interactions are scarce and predicted target genes have relatively high false-positive and false-negative rates, it remains challenging for these prediction models to perform well. The purpose of this study is to prioritize miRNA candidates for diseases. We first construct a heterogeneous network, which consists of a disease similarity network, a miRNA functional similarity network and a known miRNA-disease association network. Then, an unbalanced bi-random walk-based algorithm on the heterogeneous network (BRWH) is adopted to discover potential associations by exploiting bipartite subgraphs. Based on 5-fold cross validation, the proposed network-based method achieves AUC values ranging from 0.782 to 0.907 for the 22 human diseases and an average AUC of almost 0.846. The experiments indicated that BRWH performs better than several popular methods. In addition, case studies of some common diseases further demonstrated the superior performance of the proposed method in prioritizing disease-related miRNA candidates.
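The general idea of an unbalanced bi-random walk can be sketched as follows. This is a generic illustration, not the BRWH algorithm itself: the symmetric normalization, the averaging of the two walks, the restart weight `alpha`, and all function and parameter names are assumptions, since the abstract does not give the update rule. The walk on the miRNA similarity network and the walk on the disease similarity network are allowed different numbers of steps, which is what makes the walk unbalanced.

```python
import math


def matmul(X, Y):
    """Plain-Python product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]


def sym_normalize(S):
    """Symmetric normalization: S[i][j] / sqrt(d_i * d_j), d = row sums."""
    d = [sum(row) for row in S]
    n = len(S)
    return [[S[i][j] / math.sqrt(d[i] * d[j]) if d[i] and d[j] else 0.0
             for j in range(n)] for i in range(n)]


def bi_random_walk(A, S_mirna, S_disease, alpha=0.8,
                   left_steps=3, right_steps=2):
    """Score miRNA-disease pairs (rows = miRNAs, columns = diseases).

    A         : known association matrix (1 = known, 0 = unknown)
    S_mirna   : miRNA-miRNA similarity matrix
    S_disease : disease-disease similarity matrix
    alpha     : weight of the walk versus the restart to known associations
    """
    M = sym_normalize(S_mirna)
    D = sym_normalize(S_disease)
    total = sum(map(sum, A))
    A0 = [[v / total for v in row] for row in A]  # restart distribution
    R = A0
    for step in range(1, max(left_steps, right_steps) + 1):
        walks = []
        if step <= left_steps:    # walk on the miRNA similarity network
            walks.append(matmul(M, R))
        if step <= right_steps:   # walk on the disease similarity network
            walks.append(matmul(R, D))
        R = [[alpha * sum(w[i][j] for w in walks) / len(walks)
              + (1 - alpha) * A0[i][j]
              for j in range(len(A0[0]))] for i in range(len(A0))]
    return R
```

In this sketch, a miRNA with no known associations inherits scores from similar miRNAs, so ranking each column of the returned matrix gives a candidate list per disease.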
Affiliation(s)
- Jiawei Luo
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China.
- Qiu Xiao
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
|
15
|
Samir M, Vaas LAI, Pessler F. MicroRNAs in the Host Response to Viral Infections of Veterinary Importance. Front Vet Sci 2016; 3:86. [PMID: 27800484 PMCID: PMC5065965 DOI: 10.3389/fvets.2016.00086] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Received: 06/01/2016] [Accepted: 09/12/2016] [Indexed: 12/13/2022] Open
Abstract
The discovery of small regulatory non-coding RNAs has been an exciting advance in the field of genomics. MicroRNAs (miRNAs) are endogenous RNA molecules, approximately 22 nucleotides in length, that regulate gene expression, mostly at the posttranscriptional level. MiRNA profiling technologies have made it possible to identify and quantify novel miRNAs and to study their regulation and potential roles in disease pathogenesis. Although miRNAs have been extensively investigated in viral infections of humans, their implications in viral diseases affecting animals of veterinary importance are much less understood. The number of annotated miRNAs in different animal species is growing continuously, and novel roles in regulating host–pathogen interactions are being discovered, for instance, miRNA-mediated augmentation of viral transcription and replication. In this review, we present an overview of synthesis and function of miRNAs and an update on the current state of research on host-encoded miRNAs in the genesis of viral infectious diseases in their natural animal host as well as in selected in vivo and in vitro laboratory models.
Affiliation(s)
- Mohamed Samir
- TWINCORE, Center for Experimental and Clinical Infection Research, Hannover, Germany; Department of Zoonoses, Faculty of Veterinary Medicine, Zagazig University, Zagazig, Egypt
- Lea A I Vaas
- TWINCORE, Center for Experimental and Clinical Infection Research, Hannover, Germany
- Frank Pessler
- TWINCORE, Center for Experimental and Clinical Infection Research, Hannover, Germany; Helmholtz Center for Infection Research, Braunschweig, Germany
|
16
|
Sun X, Zhang J. Dysfunctional miRNA-Mediated Regulation in Chromophobe Renal Cell Carcinoma. PLoS One 2016; 11:e0156324. [PMID: 27258182 PMCID: PMC4892590 DOI: 10.1371/journal.pone.0156324] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 12/18/2015] [Accepted: 05/12/2016] [Indexed: 01/05/2023] Open
Abstract
Past research on the pathogenesis of complex diseases suggests that differentially expressed messenger RNAs (mRNAs) can serve as biomarkers of a disease. However, significant changes in miRNA-mediated regulation might be a deeper underlying cause of a disease. In this study, a miRNA-mediated regulation module is defined based on Gene Ontology (GO) terms, and dysfunctional modules are identified as the suspected cause of a disease. A miRNA-mediated regulation module contains the mRNAs annotated to a GO term and the microRNAs (miRNAs) that regulate those mRNAs. Based on the miRNA-mediated regulation coefficients estimated from the expression profiles of the mRNAs and the miRNAs, an SW (single regulation-weight) value is designed to evaluate the miRNA-mediated regulation change of an mRNA, and the modules with significantly differential SW values are identified as dysfunctional modules. The approach is applied to chromophobe renal cell carcinoma, where it identifies 70 dysfunctional miRNA-mediated regulation modules from an initial 4381 modules. The identified dysfunctional modules are found to comprehensively reflect chromophobe renal cell carcinoma. The proposed approach suggests that accumulated alterations in miRNA-mediated regulation might cause functional alterations, which in turn cause a disease. Moreover, this approach can also be used to identify mRNAs with differential miRNA-mediated regulation, which show a more comprehensive underlying association with a disease than differentially expressed mRNAs.
Affiliation(s)
- Xiaohan Sun
- School of Computer Science and Technology, Xidian University, Xi'an, Shaanxi, P. R. China
- College of Mathematics and Information Science, Weinan Normal University, Weinan, Shaanxi, P. R. China
- Junying Zhang
- School of Computer Science and Technology, Xidian University, Xi'an, Shaanxi, P. R. China
|
17
|
Steinkraus BR, Toegel M, Fulga TA. Tiny giants of gene regulation: experimental strategies for microRNA functional studies. WILEY INTERDISCIPLINARY REVIEWS. DEVELOPMENTAL BIOLOGY 2016; 5:311-62. [PMID: 26950183 PMCID: PMC4949569 DOI: 10.1002/wdev.223] [Citation(s) in RCA: 44] [Impact Index Per Article: 5.5] [Received: 08/14/2015] [Revised: 11/19/2015] [Accepted: 11/28/2015] [Indexed: 12/11/2022]
Abstract
The discovery over two decades ago of short regulatory microRNAs (miRNAs) has led to the inception of a vast biomedical research field dedicated to understanding these powerful orchestrators of gene expression. Here we aim to provide a comprehensive overview of the methods and techniques underpinning the experimental pipeline employed for exploratory miRNA studies in animals. Some of the greatest challenges in this field have been uncovering the identity of miRNA-target interactions and deciphering their significance with regard to particular physiological or pathological processes. These endeavors relied almost exclusively on the development of powerful research tools encompassing novel bioinformatics pipelines, high-throughput target identification platforms, and functional target validation methodologies. Thus, in an unparalleled manner, the biomedical technology revolution unceasingly enhanced and refined our ability to dissect miRNA regulatory networks and understand their roles in vivo in the context of cells and organisms. Recurring motifs of target recognition have led to the creation of a large number of multifactorial bioinformatics analysis platforms, which have proved instrumental in guiding experimental miRNA studies. Subsequently, the need for discovery of miRNA-target binding events in vivo drove the emergence of a slew of high-throughput multiplex strategies, which now provide a viable prospect for elucidating genome-wide miRNA-target binding maps in a variety of cell types and tissues. Finally, deciphering the functional relevance of miRNA post-transcriptional gene silencing under physiological conditions, prompted the evolution of a host of technologies enabling systemic manipulation of miRNA homeostasis as well as high-precision interference with their direct, endogenous targets. For further resources related to this article, please visit the WIREs website.
Affiliation(s)
- Bruno R Steinkraus
- Weatherall Institute of Molecular Medicine, Radcliffe Department of Medicine, University of Oxford, Oxford, UK
- Markus Toegel
- Weatherall Institute of Molecular Medicine, Radcliffe Department of Medicine, University of Oxford, Oxford, UK
- Tudor A Fulga
- Weatherall Institute of Molecular Medicine, Radcliffe Department of Medicine, University of Oxford, Oxford, UK
|
18
|
Lukasik A, Wójcikowski M, Zielenkiewicz P. Tools4miRs - one place to gather all the tools for miRNA analysis. Bioinformatics 2016; 32:2722-4. [PMID: 27153626 PMCID: PMC5013900 DOI: 10.1093/bioinformatics/btw189] [Citation(s) in RCA: 59] [Impact Index Per Article: 7.4] [Received: 02/10/2016] [Accepted: 04/04/2016] [Indexed: 01/08/2023] Open
Abstract
Summary: MiRNAs are short, non-coding molecules that negatively regulate gene expression and thereby play several important roles in living organisms. Dozens of computational methods for miRNA-related research have been developed, and they differ greatly in various aspects. The large number of difficult-to-compare approaches makes it challenging for the user to select a proper tool and prompts the need for a solution that collects and categorizes all the methods. Here, we present tools4miRs, the first platform of its kind, which currently gathers more than 160 methods for broadly defined miRNA analysis. The collected tools are classified into several general and more detailed categories in which users can additionally filter the available methods according to their specific research needs, capabilities and preferences. Tools4miRs is also a web-based target prediction meta-server that incorporates user-designated target prediction methods into the analysis of user-provided data. Availability and Implementation: Tools4miRs is implemented in Python using Django and is freely available at tools4mirs.org. Contact: piotr@ibb.waw.pl Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Anna Lukasik
- Institute of Biochemistry and Biophysics, Polish Academy of Sciences, 02-106 Warsaw, Poland
- Maciej Wójcikowski
- Institute of Biochemistry and Biophysics, Polish Academy of Sciences, 02-106 Warsaw, Poland
- Piotr Zielenkiewicz
- Institute of Biochemistry and Biophysics, Polish Academy of Sciences, 02-106 Warsaw, Poland; Department of Systems Biology, Institute of Experimental Plant Biology and Biotechnology, University of Warsaw, 02-096 Warsaw, Poland
|
19
|
Singh NK. microRNAs Databases: Developmental Methodologies, Structural and Functional Annotations. Interdiscip Sci 2016; 9:357-377. [PMID: 27021491 DOI: 10.1007/s12539-016-0166-7] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 10/16/2015] [Revised: 02/08/2016] [Accepted: 03/11/2016] [Indexed: 12/31/2022]
Abstract
MicroRNA (miRNA) is an endogenous and evolutionarily conserved non-coding RNA involved in post-transcriptional processes as a gene repressor and in mRNA cleavage through formation of the RNA-induced silencing complex (RISC). In RISC, the miRNA binds by complementary base pairing to the targeted mRNA together with the Argonaute protein complex, causing gene repression or endonucleolytic cleavage of mRNAs, which is implicated in many diseases and syndromes. After the discovery of the miRNAs lin-4 and let-7, large numbers of miRNAs involved in various biological and metabolic processes were discovered by low-throughput and high-throughput experimental techniques along with computational methods. miRNAs are important non-coding RNAs for understanding the complex biological phenomena of organisms because they control gene regulation. This paper reviews miRNA databases with structural and functional annotations developed by various researchers. These databases contain structural and functional information on animal, plant and virus miRNAs, including miRNA-associated diseases, stress resistance in plants, the roles of miRNAs in various biological processes, the effect of miRNA interactions on drugs and environment, the effect of variation on miRNAs, miRNA gene expression analysis, and miRNA sequences and structures. This review focuses on the developmental methodology of miRNA databases, such as the computational tools and methods used to extract miRNA annotations from different resources or through experiment. This study also discusses the efficiency of the user interface design of every database along with the current entries and annotations of miRNAs (pathways, gene ontology, disease ontology, etc.). An integrated schematic diagram of the construction process for databases is also drawn, along with tabular and graphical comparisons of various types of entries in different databases. The aim of this paper is to present the importance of miRNA-related resources in a single place.
Affiliation(s)
- Nagendra Kumar Singh
- Department of Biological Science and Engineering, Maulana Azad National Institute of Technology, Bhopal, M.P., 462003, India.
|
20
|
Alptekin B, Akpinar BA, Budak H. A Comprehensive Prescription for Plant miRNA Identification. FRONTIERS IN PLANT SCIENCE 2016; 7:2058. [PMID: 28174574 PMCID: PMC5258749 DOI: 10.3389/fpls.2016.02058] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Received: 08/11/2016] [Accepted: 12/23/2016] [Indexed: 05/15/2023]
Abstract
microRNAs (miRNAs) are tiny ribo-regulatory molecules involved in various pathways essential for the persistence of cellular life, such as development, environmental adaptation, and stress response. In recent years, miRNAs have become a major focus in molecular biology because of their functional and diagnostic importance. This interest in miRNA research has resulted in the development of many specific software tools and pipelines for the identification of miRNAs and their specific targets, which is the key to elucidating miRNA-modulated gene expression. While the well-recognized importance of miRNAs in clinical research drove the emergence of many useful computational identification approaches in animals, fewer software tools and pipelines are available for plants. Additionally, existing approaches suffer from mis-identification and mis-annotation of plant miRNAs, since the miRNA mining process for plants is highly prone to false positives, particularly in cereals, which have highly repetitive genomes. Our group developed a homology-based in silico miRNA identification approach for plants, which utilizes two Perl scripts, "SUmirFind" and "SUmirFold", and this method has since helped identify many miRNAs, particularly from crop species such as Triticum or Aegilops. Herein, we describe a comprehensive updated guideline, implementing two new scripts, "SUmirPredictor" and "SUmirLocator," and refinements to our previous method, in order to identify genuine miRNAs with increased sensitivity in light of the miRNA identification problems in plants. Recent updates enable our method to provide more reliable and precise results in an automated fashion, in addition to solutions for eliminating most false-positive predictions and for miRNA naming and miRNA mis-annotation. It also provides a comprehensive view of the genome/transcriptome-wide location of miRNA precursors as well as their association with transposable elements. The "SUmirPredictor" and "SUmirLocator" scripts are freely available together with a reference high-confidence plant miRNA list.
Affiliation(s)
- Burcu Alptekin
- Cereal Genomics Lab, Department of Plant Sciences and Plant Pathology, Montana State University, Bozeman, MT, USA
- Bala A. Akpinar
- Sabanci University Nanotechnology Research and Application Centre, Sabanci University, Istanbul, Turkey
- Hikmet Budak
- Cereal Genomics Lab, Department of Plant Sciences and Plant Pathology, Montana State University, Bozeman, MT, USA
- *Correspondence: Hikmet Budak
|
21
|
miRNAfe: A comprehensive tool for feature extraction in microRNA prediction. Biosystems 2015; 138:1-5. [DOI: 10.1016/j.biosystems.2015.10.003] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Received: 04/16/2015] [Revised: 09/21/2015] [Accepted: 10/18/2015] [Indexed: 01/25/2023]
|
22
|
Quek C, Jung CH, Bellingham SA, Lonie A, Hill AF. iSRAP - a one-touch research tool for rapid profiling of small RNA-seq data. J Extracell Vesicles 2015; 4:29454. [PMID: 26561006 PMCID: PMC4641893 DOI: 10.3402/jev.v4.29454] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 08/16/2015] [Revised: 10/12/2015] [Accepted: 10/14/2015] [Indexed: 12/23/2022] Open
Abstract
Small non-coding RNAs are widely recognized as key modulators in many biological processes and are emerging as promising biomarkers for several diseases. These RNA species are transcribed in cells and can be packaged in extracellular vesicles, which are small vesicles released from many biotypes and are involved in intercellular communication. The advent of next-generation sequencing (NGS) technology for high-throughput profiling has further advanced biological insights into non-coding RNA on a genome-wide scale, and NGS has become the preferred approach for the discovery and quantification of non-coding RNA species. Despite the routine practice of NGS, the processing of large data sets poses difficulty for analysis before conducting downstream experiments. Often, current analysis tools are designed for specific RNA species, such as microRNA, and offer limited flexibility for modifying parameters for optimization. An analysis tool that allows for maximum control of different software is essential for drawing concrete conclusions about differentially expressed transcripts. Here, we developed a one-touch integrated small RNA analysis pipeline (iSRAP) research tool that is composed of widely used tools for rapid profiling of small RNAs. The performance test of iSRAP using publicly available and in-house data sets shows its ability to comprehensively profile small RNAs of various classes and to analyze differentially expressed small RNAs. iSRAP offers comprehensive analysis of small RNA sequencing data that supports informed decisions on the downstream analyses of small RNA studies, including those of extracellular vesicles such as exosomes.
Affiliation(s)
- Camelia Quek
- Department of Biochemistry and Molecular Biology, Bio21 Molecular Science and Biotechnology Institute, The University of Melbourne, Melbourne, VIC, Australia
- Chol-Hee Jung
- Victorian Life Sciences Computation Initiative (VLSCI), The University of Melbourne, Melbourne, VIC, Australia
- Shayne A Bellingham
- Department of Biochemistry and Molecular Biology, Bio21 Molecular Science and Biotechnology Institute, The University of Melbourne, Melbourne, VIC, Australia
- Andrew Lonie
- Victorian Life Sciences Computation Initiative (VLSCI), The University of Melbourne, Melbourne, VIC, Australia
- Andrew F Hill
- Department of Biochemistry and Molecular Biology, Bio21 Molecular Science and Biotechnology Institute, The University of Melbourne, Melbourne, VIC, Australia; Department of Biochemistry and Genetics, La Trobe Institute for Molecular Science, La Trobe University, Melbourne, VIC, Australia
|
23
|
Kleftogiannis D, Theofilatos K, Likothanassis S, Mavroudi S. YamiPred: A Novel Evolutionary Method for Predicting Pre-miRNAs and Selecting Relevant Features. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2015; 12:1183-1192. [PMID: 26451829 DOI: 10.1109/tcbb.2014.2388227] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Indexed: 06/05/2023]
Abstract
MicroRNAs (miRNAs) are small non-coding RNAs, which play a significant role in gene regulation. Predicting miRNA genes is a challenging bioinformatics problem, and existing experimental and computational methods fail to deal with it effectively. We developed YamiPred, an embedded classification method that combines the efficiency and robustness of support vector machines (SVM) with genetic algorithms (GA) for feature selection and parameter optimization. YamiPred was tested on a new and realistic human dataset and was compared with state-of-the-art computational intelligence approaches and the prevalent SVM-based tools for miRNA prediction. Experimental results indicate that YamiPred outperforms existing approaches in terms of accuracy and of the geometric mean of sensitivity and specificity. The embedded feature selection component selects a compact feature subset that contributes to the performance optimization. Further experimentation with this minimal feature subset achieved very high classification performance and revealed the minimum number of samples required for developing a robust predictor. YamiPred also confirmed the important role of commonly used features such as entropy and enthalpy, and uncovered the significance of newly introduced features, such as %A-U aggregate nucleotide frequency and positional entropy. The best model trained on human data successfully predicted pre-miRNAs in other organisms, including viruses.
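The evaluation metric named in this abstract, the geometric mean of sensitivity and specificity, can be computed as in the short sketch below. The function name and the label encoding (1 = pre-miRNA, 0 = negative) are assumptions made for illustration; the abstract does not describe YamiPred's evaluation code.

```python
import math


def geometric_mean_score(y_true, y_pred):
    """Geometric mean of sensitivity and specificity for binary labels.

    Assumes labels are 1 (positive) and 0 (negative) and that both
    classes occur at least once in y_true.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)   # recall on the positive class
    specificity = tn / (tn + fp)   # recall on the negative class
    return math.sqrt(sensitivity * specificity)
```

Unlike plain accuracy, this score drops to zero when a classifier ignores one class entirely, which is why it is often reported for imbalanced problems such as pre-miRNA prediction.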
|
24
|
Karathanou K, Theofilatos K, Kleftogiannis D, Alexakos C, Likothanassis S, Tsakalidis A, Mavroudi S. ncRNAclass: A Web Platform for Non-Coding RNA Feature Calculation and MicroRNAs and Targets Prediction. INT J ARTIF INTELL T 2015. [DOI: 10.1142/s0218213015400023] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/18/2022]
Abstract
According to the central dogma of biology, it was commonly accepted that most genetic information is transacted by proteins. Recent experimental techniques revealed that the majority of the genomes of mammals and other complex organisms are in fact transcribed into non-coding RNAs. Typically, non-coding RNAs are small nucleotide sequences that are not translated into proteins and have a profound regulatory role. Recent advances in computational biosciences have linked their abnormal functionality to many diseases and re-stated the principles of basic therapeutic strategies. The effective identification of non-coding RNAs and their biological role is emerging as a new and challenging bioinformatics problem. ncRNAclass ( http://biotools.ceid.upatras.gr/ncrnaclass/ ) is a web platform that allows for efficient computation of a set of features that can effectively describe the broad class of non-coding RNAs. Moreover, it enables the calculation of features that include information about the targeting behavior of miRNAs. The tool operates under a user-friendly interface, and its pilot implementation incorporates prediction models for the well-known class of microRNAs and for predicting their mRNA targets. The prediction models are based on two novel evolutionary machine learning algorithms that achieve very high classification performance in comparison with existing methods. The platform is also equipped with a data warehouse of manually curated sequences that enables fast information retrieval and data mining utilities.
Affiliation(s)
- Christos Alexakos
- Department of Computer Engineering and Informatics, University of Patras, Greece
- Spiros Likothanassis
- Department of Computer Engineering and Informatics, University of Patras, Greece
- Seferina Mavroudi
- Department of Computer Engineering and Informatics, University of Patras, Greece
- Department of Social Work, School of Sciences of Health and Care Technological Educational Institute of Patras, Greece
|
25
|
|
26
|
Guruceaga E, Segura V. Functional interpretation of microRNA-mRNA association in biological systems using R. Comput Biol Med 2013; 44:124-31. [PMID: 24377695 DOI: 10.1016/j.compbiomed.2013.11.001] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 04/25/2013] [Revised: 10/30/2013] [Accepted: 11/03/2013] [Indexed: 12/24/2022]
Abstract
The prediction of microRNA targets is a challenging task that has given rise to several prediction algorithms. Databases of predicted targets can be used in a microRNA target enrichment analysis, enhancing our capacity to extract functional information from gene lists. However, the available tools in this field analyze gene sets one by one, limiting their use in a meta-analysis. Here, we present an R system for miRNA enrichment analysis that is suitable for systems biology. This collection of R scripts and embedded data allows the use of predicted targets from public databases or a custom integration of them. As a proof-of-principle, we have successfully performed the challenging analysis of 2158 tumoral samples at a time. The results obtained have been summarized in a network where each cancer disease is linked to enriched miRNAs and overrepresented functions. These network connections have proven to be an invaluable resource for the study of biological and pathological causes and effects of the expression of miRNAs.
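A miRNA target enrichment analysis of the kind described here is often based on a one-sided hypergeometric test, sketched below. The choice of test is an assumption, since the abstract does not name the statistic, and the sketch is in Python rather than the authors' R system. The function name and the gene identifiers used with it are hypothetical.

```python
from math import comb


def enrichment_p_value(gene_list, mirna_targets, universe):
    """One-sided hypergeometric p-value P(X >= k) for target enrichment.

    gene_list     : genes of interest (e.g. deregulated in a sample)
    mirna_targets : predicted targets of one miRNA
    universe      : all genes that could have been selected
    """
    universe = set(universe)
    genes = set(gene_list) & universe
    targets = set(mirna_targets) & universe
    N = len(universe)           # population size
    K = len(targets)            # targets in the population
    n = len(genes)              # size of the gene list
    k = len(genes & targets)    # observed overlap
    upper = min(n, K)
    return sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, upper + 1)) / comb(N, n)
```

Running this function over every miRNA and every sample's gene list, then correcting for multiple testing, would give the miRNA-by-sample enrichment table that a network summary could be built from.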
Affiliation(s)
- Elizabeth Guruceaga
- Unit of Proteomics, Genomics and Bioinformatics, Center for Applied Medical Research (CIMA), University of Navarra, Pamplona, Spain.
- Victor Segura
- Unit of Proteomics, Genomics and Bioinformatics, Center for Applied Medical Research (CIMA), University of Navarra, Pamplona, Spain.
|