Ayachit G, Pandya H, Das J. miRDetect: A combinatorial approach for automated detection of novel miRNA precursors from plant EST data using homology and Random Forest classification. Genomics 2020; 112:3201-3206. [PMID: 32380232 DOI: 10.1016/j.ygeno.2020.05.002]
[Received: 02/29/2020] [Revised: 04/25/2020] [Accepted: 05/02/2020] [Indexed: 01/23/2023]
Abstract
Identifying microRNAs (miRNAs) in plants is a crucial step toward understanding pathway mechanisms and gene regulation. A number of tools have been developed for detecting miRNAs from small RNA-seq data. However, a pipeline for detecting miRNAs from EST datasets is lacking, even though a large body of EST data is publicly available and the method is known. Here we present miRDetect, a Python implementation that detects novel miRNA precursors from plant EST data using a combination of homology and machine learning. Ten-fold cross-validation with 112 features was used to choose the best classifier based on ROC, accuracy, MCC and F1 scores. miRDetect achieved a classification accuracy of 93.35% with a Random Forest classifier and outperformed other precursor detection tools. The miRDetect pipeline aids in identifying novel plant miRNA precursors using this mixed approach and will be helpful to researchers with limited informatics background.
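As an illustration of the selection metrics named in the abstract, the following is a minimal sketch of how accuracy, F1 and MCC can be computed from per-fold confusion counts and averaged across folds. It is not taken from the miRDetect code: the function names `classification_metrics` and `mean_over_folds` and the example counts are hypothetical, and ROC is left out because it requires classifier scores rather than hard counts.

```python
import math

def classification_metrics(tp, fp, tn, fn):
    """Accuracy, F1 and MCC from binary confusion counts (hypothetical helper)."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "f1": f1, "mcc": mcc}

def mean_over_folds(folds):
    """Average each metric over a list of (tp, fp, tn, fn) tuples, one per fold."""
    per_fold = [classification_metrics(*counts) for counts in folds]
    return {name: sum(m[name] for m in per_fold) / len(per_fold)
            for name in per_fold[0]}

if __name__ == "__main__":
    # Made-up counts for two folds, for illustration only.
    example_folds = [(50, 0, 50, 0), (45, 5, 45, 5)]
    print(mean_over_folds(example_folds))
```

Comparing candidate classifiers then amounts to running the same folds for each one and ranking them on the averaged metrics. MCC is often reported alongside accuracy because it stays informative when the positive and negative classes are imbalanced.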