Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Elyasigomari V, Lee D, Screen H, Shaheed M. Development of a two-stage gene selection method that incorporates a novel hybrid approach using the cuckoo optimization algorithm and harmony search for cancer classification. J Biomed Inform 2017;67:11-20. [DOI: 10.1016/j.jbi.2017.01.016] [Citation(s) in RCA: 56] [Impact Index Per Article: 7.0] [Received: 04/08/2016] [Revised: 01/24/2017] [Accepted: 01/31/2017] [Indexed: 12/24/2022]

Cited by Other Article(s)

Yaqoob A, Verma NK, Aziz RM. Optimizing Gene Selection and Cancer Classification with Hybrid Sine Cosine and Cuckoo Search Algorithm. J Med Syst 2024;48:10. [PMID: 38193948 DOI: 10.1007/s10916-023-02031-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Accepted: 12/28/2023] [Indexed: 01/10/2024]

Alweshah M. Coronavirus herd immunity optimizer to solve classification problems. Soft Comput 2023;27:3509-3529. [PMID: 35309595 PMCID: PMC8922087 DOI: 10.1007/s00500-022-06917-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/13/2022] [Indexed: 11/28/2022]

Al-Shaikh A, Mahafzah BA, Alshraideh M. Hybrid harmony search algorithm for social network contact tracing of COVID-19. Soft Comput 2023;27:3343-3365. [PMID: 34220301 PMCID: PMC8237257 DOI: 10.1007/s00500-021-05948-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Accepted: 06/04/2021] [Indexed: 02/05/2023]

Akinola OA, Agushaka JO, Ezugwu AE. Binary dwarf mongoose optimizer for solving high-dimensional feature selection problems. PLoS One 2022;17:e0274850. [PMID: 36201524 PMCID: PMC9536540 DOI: 10.1371/journal.pone.0274850] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/14/2022] [Accepted: 09/06/2022] [Indexed: 11/13/2022] Open

Obadina OO, Thaha MA, Mohamed Z, Shaheed MH. Grey-box modelling and fuzzy logic control of a Leader-Follower robot manipulator system: A hybrid Grey Wolf-Whale Optimisation approach. ISA Transactions 2022;129:572-593. [PMID: 35277266 DOI: 10.1016/j.isatra.2022.02.023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2021] [Revised: 01/04/2022] [Accepted: 02/15/2022] [Indexed: 06/14/2023]

Akinola OO, Ezugwu AE, Agushaka JO, Zitar RA, Abualigah L. Multiclass feature selection with metaheuristic optimization algorithms: a review. Neural Comput Appl 2022;34:19751-19790. [PMID: 36060097 PMCID: PMC9424068 DOI: 10.1007/s00521-022-07705-4] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 01/19/2022] [Accepted: 08/02/2022] [Indexed: 11/24/2022]

A Highly Discriminative Hybrid Feature Selection Algorithm for Cancer Diagnosis. ScientificWorldJournal 2022;2022:1056490. [PMID: 35983572 PMCID: PMC9381276 DOI: 10.1155/2022/1056490] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2022] [Accepted: 07/20/2022] [Indexed: 11/17/2022] Open

Al-Obeidat F, Rocha Á, Akram M, Razzaq S, Maqbool F. (CDRGI)-Cancer detection through relevant genes identification. Neural Comput Appl 2022. [DOI: 10.1007/s00521-021-05739-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]

Ai H. GSEA-SDBE: A gene selection method for breast cancer classification based on GSEA and analyzing differences in performance metrics. PLoS One 2022;17:e0263171. [PMID: 35472078 PMCID: PMC9041804 DOI: 10.1371/journal.pone.0263171] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/23/2021] [Accepted: 01/13/2022] [Indexed: 12/20/2022] Open

Abstract

MOTIVATION

Selecting the most relevant genes for sample classification is a common process in gene expression studies. Moreover, determining the smallest set of relevant genes that can achieve the required classification performance is particularly important in diagnosing cancer and improving treatment.

RESULTS

In this study, I propose a novel method to eliminate irrelevant and redundant genes, and thus determine the smallest set of relevant genes for breast cancer diagnosis. The method is based on random forest models, gene set enrichment analysis (GSEA), and my developed Sort Difference Backward Elimination (SDBE) algorithm; hence, the method is named GSEA-SDBE. Using this method, genes are filtered according to their importance following random forest training and GSEA is used to select genes by core enrichment of Kyoto Encyclopedia of Genes and Genomes pathways that are strongly related to breast cancer. Subsequently, the SDBE algorithm is applied to eliminate redundant genes and identify the most relevant genes for breast cancer diagnosis. In the SDBE algorithm, the differences in the Matthews correlation coefficients (MCCs) of performing random forest models are computed before and after the deletion of each gene to indicate the degree of redundancy of the corresponding deleted gene on the remaining genes during backward elimination. Next, the obtained MCC difference list is divided into two parts from a set position and each part is respectively sorted. By continuously iterating and changing the set position, the most relevant genes are stably assembled on the left side of the gene list, facilitating their identification, and the redundant genes are gathered on the right side of the gene list for easy elimination. A cross-comparison of the SDBE algorithm was performed by respectively computing differences between MCCs and ROC_AUC_score and then respectively using 10-fold classification models, e.g., random forest (RF), support vector machine (SVM), k-nearest neighbor (KNN), extreme gradient boosting (XGBoost), and extremely randomized trees (ExtraTrees). Finally, the classification performance of the proposed method was compared with that of three advanced algorithms for five cancer datasets. 
Results showed that analyzing MCC differences and using random forest models was the optimal solution for the SDBE algorithm. Accordingly, three consistently relevant genes (i.e., VEGFD, TSLP, and PKMYT1) were selected for the diagnosis of breast cancer. The performance metrics (MCC and ROC_AUC_score, respectively) of the random forest models based on 10-fold verification reached 95.28% and 98.75%. In addition, survival analysis showed that VEGFD and TSLP could be used to predict the prognosis of patients with breast cancer. Moreover, the proposed method significantly outperformed the other methods tested as it allowed selecting a smaller number of genes while maintaining the required classification accuracy.
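The MCC-difference step described in this abstract can be illustrated with a minimal sketch. It is not the author's SDBE implementation: the sorting and repositioning stages are omitted, and `toy_score`, the gene labels `g1` to `g3`, and their weights are hypothetical stand-ins for "train a random forest on these genes and return its MCC".

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom


def mcc_differences(genes, score_fn):
    """Delete each gene in turn and record how far the score drops."""
    baseline = score_fn(genes)
    diffs = {}
    for gene in genes:
        remaining = [g for g in genes if g != gene]
        diffs[gene] = baseline - score_fn(remaining)
    return diffs


def toy_score(genes):
    """Hypothetical scorer: only g1 and g2 carry signal, g3 is redundant."""
    weights = {"g1": 0.5, "g2": 0.3}
    return sum(weights.get(g, 0.0) for g in genes)


diffs = mcc_differences(["g1", "g2", "g3"], toy_score)
```

A difference near zero marks a gene whose removal costs nothing, which is the sense of "redundant" used in the abstract; a large difference marks a gene worth keeping.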

Sathya M, Jeyaselvi M, Joshi S, Pandey E, Pareek PK, Jamal SS, Kumar V, Atiglah HK. Cancer Categorization Using Genetic Algorithm to Identify Biomarker Genes. Journal of Healthcare Engineering 2022;2022:5821938. [PMID: 35242297 PMCID: PMC8888099 DOI: 10.1155/2022/5821938] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/24/2021] [Accepted: 12/14/2021] [Indexed: 11/18/2022]

Deng X, Li M, Deng S, Wang L. Hybrid gene selection approach using XGBoost and multi-objective genetic algorithm for cancer classification. Med Biol Eng Comput 2022;60:663-681. [PMID: 35028863 DOI: 10.1007/s11517-021-02476-x] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 04/10/2021] [Accepted: 11/23/2021] [Indexed: 12/15/2022]

Bose S, Das C, Banerjee A, Ghosh K, Chattopadhyay M, Chattopadhyay S, Barik A. An ensemble machine learning model based on multiple filtering and supervised attribute clustering algorithm for classifying cancer samples. PeerJ Comput Sci 2021;7:e671. [PMID: 34616883 PMCID: PMC8459790 DOI: 10.7717/peerj-cs.671] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 12/18/2020] [Accepted: 07/20/2021] [Indexed: 06/13/2023]

Abstract

BACKGROUND

Machine learning is one kind of machine intelligence technique that learns from data and detects inherent patterns from large, complex datasets. Due to this capability, machine learning techniques are widely used in medical applications, especially where large-scale genomic and proteomic data are used. Cancer classification based on bio-molecular profiling data is a very important topic for medical applications since it improves the diagnostic accuracy of cancer and enables a successful culmination of cancer treatments. Hence, machine learning techniques are widely used in cancer detection and prognosis.

METHODS

In this article, a new ensemble machine learning classification model named Multiple Filtering and Supervised Attribute Clustering algorithm based Ensemble Classification model (MFSAC-EC) is proposed which can handle class imbalance problem and high dimensionality of microarray datasets. This model first generates a number of bootstrapped datasets from the original training data where the oversampling procedure is applied to handle the class imbalance problem. The proposed MFSAC method is then applied to each of these bootstrapped datasets to generate sub-datasets, each of which contains a subset of the most relevant/informative attributes of the original dataset. The MFSAC method is a feature selection technique combining multiple filters with a new supervised attribute clustering algorithm. Then for every sub-dataset, a base classifier is constructed separately, and finally, the predictive accuracy of these base classifiers is combined using the majority voting technique forming the MFSAC-based ensemble classifier. Also, a number of most informative attributes are selected as important features based on their frequency of occurrence in these sub-datasets.

RESULTS

To assess the performance of the proposed MFSAC-EC model, it is applied on different high-dimensional microarray gene expression datasets for cancer sample classification. The proposed model is compared with well-known existing models to establish its effectiveness with respect to other models. From the experimental results, it has been found that the generalization performance/testing accuracy of the proposed classifier is significantly better compared to other well-known existing models. Apart from that, it has been also found that the proposed model can identify many important attributes/biomarker genes.
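The two generic building blocks named in the methods above, bootstrap resampling and majority voting, can be sketched as follows. This is only an illustration under simplifying assumptions: the oversampling step, the MFSAC feature selection, and the base classifiers themselves are left out, and the sample rows and labels are made up.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement from the training data."""
    return [rng.choice(rows) for _ in rows]


def majority_vote(predictions):
    """Return the most common label; ties go to the label seen first."""
    return Counter(predictions).most_common(1)[0][0]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
training_rows = [("s1", "tumor"), ("s2", "normal"), ("s3", "tumor"), ("s4", "normal")]

# One resampled dataset per base classifier.
resamples = [bootstrap_sample(training_rows, rng) for _ in range(3)]

# Suppose three base classifiers each labelled the same test sample.
ensemble_label = majority_vote(["tumor", "normal", "tumor"])
```

In the full model each resample would feed its own feature selection and classifier before the votes are pooled.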

A novel bio-inspired hybrid multi-filter wrapper gene selection method with ensemble classifier for microarray data. Neural Comput Appl 2021;35:11531-11561. [PMID: 34539088 PMCID: PMC8435304 DOI: 10.1007/s00521-021-06459-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/10/2020] [Accepted: 08/26/2021] [Indexed: 01/04/2023]

Abstract

Microarray technology is known as one of the most important tools for collecting DNA expression data. This technology allows researchers to investigate and examine types of diseases and their origins. However, microarray data are often associated with a small sample size, a significant number of genes, imbalanced data, etc., making classification models inefficient. Thus, a new hybrid solution based on a multi-filter and adaptive chaotic multi-objective forest optimization algorithm (AC-MOFOA) is presented to solve the gene selection problem and construct the Ensemble Classifier. In the proposed solution, a multi-filter model (i.e., ensemble filter) is proposed as a preprocessing step to reduce the dataset's dimensions, using a combination of five filter methods to remove redundant and irrelevant genes. Accordingly, the results of the five filter methods are combined using a voting-based function. Additionally, the results of the proposed multi-filter indicate that it has good capability in reducing the gene subset size and selecting relevant genes. Then, an AC-MOFOA based on the concepts of non-dominated sorting, crowding distance, chaos theory, and adaptive operators is presented. As a wrapper method, AC-MOFOA aims to simultaneously reduce dataset dimensions, optimize KELM, and increase classification accuracy. Next, in this method, an ensemble classifier model is presented using AC-MOFOA results to classify microarray data. The performance of the proposed algorithm was evaluated on nine public microarray datasets, and its results were compared in terms of the number of selected genes, classification efficiency, execution time, time complexity, hypervolume indicator, and spacing metric with five hybrid multi-objective methods and three hybrid single-objective methods.
According to the results, the proposed hybrid method could increase the accuracy of the KELM in most datasets by reducing the dataset's dimensions and achieve similar or superior performance compared to other multi-objective methods. Furthermore, the proposed Ensemble Classifier model could provide better classification accuracy and generalizability in seven of the nine microarray datasets compared to conventional ensemble methods. Moreover, the comparison of the Ensemble Classifier model with three state-of-the-art ensemble generation methods indicates its competitive performance: the proposed ensemble model achieved better results in five of the nine datasets.
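The abstract does not say which voting function combines the five filters, so the sketch below assumes the simplest one: keep a gene when at least `min_votes` filters selected it. The function name `vote_select`, the threshold, and the gene labels are all hypothetical.

```python
from collections import Counter


def vote_select(filter_outputs, min_votes):
    """Keep genes chosen by at least min_votes of the filter methods."""
    votes = Counter(g for selected in filter_outputs for g in set(selected))
    return sorted(g for g, n in votes.items() if n >= min_votes)


# Made-up output of five filter methods, each returning its top genes.
filter_outputs = [
    ["g1", "g2", "g3"],
    ["g1", "g3", "g5"],
    ["g1", "g2", "g4"],
    ["g1", "g3", "g4"],
    ["g2", "g3", "g6"],
]

kept = vote_select(filter_outputs, min_votes=3)
```

Raising `min_votes` shrinks the surviving gene subset, which is the dimension-reduction effect the multi-filter stage is meant to provide before the wrapper search runs.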

Mishra P, Bhoi N. Cancer gene recognition from microarray data with manta ray based enhanced ANFIS technique. Biocybern Biomed Eng 2021. [DOI: 10.1016/j.bbe.2021.06.004] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 01/13/2023]

Al-Shaikh A, Mahafzah BA, Alshraideh M. Hybrid harmony search algorithm for social network contact tracing of COVID-19. Soft Comput 2021. [DOI: 10.1007/s00500-021-05948-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/15/2023]

Al-Shaikh A, Mahafzah BA, Alshraideh M. Hybrid harmony search algorithm for social network contact tracing of COVID-19. Soft Comput 2021. [DOI: 10.1007/s00500-021-05948-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/17/2023]

Al-Rajab M, Lu J, Xu Q. A framework model using multifilter feature selection to enhance colon cancer classification. PLoS One 2021;16:e0249094. [PMID: 33861766 PMCID: PMC8691854 DOI: 10.1371/journal.pone.0249094] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 10/16/2020] [Accepted: 03/11/2021] [Indexed: 11/18/2022] Open

Hameed SS, Hassan WH, Latiff LA, Muhammadsharif FF. A comparative study of nature-inspired metaheuristic algorithms using a three-phase hybrid approach for gene selection and classification in high-dimensional cancer datasets. Soft Comput 2021. [DOI: 10.1007/s00500-021-05726-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/25/2022]

Feature Selection for Colon Cancer Detection Using K-Means Clustering and Modified Harmony Search Algorithm. Mathematics 2021. [DOI: 10.3390/math9050570] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 12/17/2022]

Abiodun EO, Alabdulatif A, Abiodun OI, Alawida M, Alabdulatif A, Alkhawaldeh RS. A systematic review of emerging feature selection optimization methods for optimal text classification: the present state and prospective opportunities. Neural Comput Appl 2021;33:15091-15118. [PMID: 34404964 PMCID: PMC8361413 DOI: 10.1007/s00521-021-06406-8] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 04/07/2021] [Accepted: 07/31/2021] [Indexed: 02/07/2023]

MotieGhader H, Masoudi-Sobhanzadeh Y, Ashtiani SH, Masoudi-Nejad A. mRNA and microRNA selection for breast cancer molecular subtype stratification using meta-heuristic based algorithms. Genomics 2020;112:3207-3217. [DOI: 10.1016/j.ygeno.2020.06.014] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 04/02/2020] [Revised: 05/13/2020] [Accepted: 06/02/2020] [Indexed: 02/06/2023]

Gholami J, Pourpanah F, Wang X. Feature selection based on improved binary global harmony search for data classification. Appl Soft Comput 2020. [DOI: 10.1016/j.asoc.2020.106402] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Indexed: 02/06/2023]

Integration of multi-objective PSO based feature selection and node centrality for medical datasets. Genomics 2020;112:4370-4384. [PMID: 32717320 DOI: 10.1016/j.ygeno.2020.07.027] [Citation(s) in RCA: 69] [Impact Index Per Article: 13.8] [Received: 03/06/2020] [Revised: 06/22/2020] [Accepted: 07/14/2020] [Indexed: 01/19/2023]

Boveiri HR. An enhanced cuckoo optimization algorithm for task graph scheduling in cluster-computing systems. Soft Comput 2020. [DOI: 10.1007/s00500-019-04520-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 11/24/2022]

A survey on single and multi omics data mining methods in cancer data classification. J Biomed Inform 2020;107:103466. [DOI: 10.1016/j.jbi.2020.103466] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 12/11/2019] [Revised: 05/01/2020] [Accepted: 05/31/2020] [Indexed: 01/09/2023]

Baliarsingh SK, Vipsita S. Chaotic emperor penguin optimised extreme learning machine for microarray cancer classification. IET Syst Biol 2020;14:85-95. [PMID: 32196467 DOI: 10.1049/iet-syb.2019.0028] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Indexed: 11/19/2022] Open

Breast and Colon Cancer Classification from Gene Expression Profiles Using Data Mining Techniques. Symmetry (Basel) 2020. [DOI: 10.3390/sym12030408] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Indexed: 12/29/2022] Open

MapReduce-Based Parallel Genetic Algorithm for CpG-Site Selection in Age Prediction. Genes (Basel) 2019;10:genes10120969. [PMID: 31775313 PMCID: PMC6947642 DOI: 10.3390/genes10120969] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 10/20/2019] [Revised: 11/12/2019] [Accepted: 11/15/2019] [Indexed: 11/23/2022] Open

Abstract

Genomic biomarkers such as DNA methylation (DNAm) are employed for age prediction. In recent years, several studies have suggested the association between changes in DNAm and its effect on human age. The high dimensional nature of this type of data significantly increases the execution time of modeling algorithms. To mitigate this problem, we propose a two-stage parallel algorithm for selection of age related CpG-sites. The algorithm first attempts to cluster the data into similar age ranges. In the next stage, a parallel genetic algorithm (PGA), based on the MapReduce paradigm (MR-based PGA), is used for selecting age-related features of each individual age range. In the proposed method, the execution of the algorithm for each age range (data parallel), the evaluation of chromosomes (task parallel) and the calculation of the fitness function (data parallel) are performed using a novel parallel framework. In this paper, we consider 16 different healthy DNAm datasets that are related to the human blood tissue and that contain the relevant age information. These datasets are combined into a single unioned set, which is in turn randomly divided into two sets of train and test data with a ratio of 7:3, respectively. We build a Gradient Boosting Regressor (GBR) model on the selected CpG-sites from the train set. To evaluate the model accuracy, we compared our results with state-of-the-art approaches that used these datasets, and observed that our method performs better on the unseen test dataset with a Mean Absolute Deviation (MAD) of 3.62 years, and a correlation (R²) of 95.96% between age and DNAm. In the train data, the MAD and R² are 1.27 years and 99.27%, respectively. Finally, we evaluate our method in terms of the effect of parallelization in computation time. The algorithm without parallelization requires 4123 min to complete, whereas the parallelized execution on 3 computing machines having 32 processing cores each, only takes a total of 58 min. 
This shows that our proposed algorithm is both efficient and scalable.
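The reported timings imply a speedup and a parallel efficiency that the abstract leaves unstated. The short calculation below works them out, assuming all 3 × 32 = 96 cores took part in the parallel run (the abstract gives the hardware but not the utilization).

```python
serial_minutes = 4123    # reported run time without parallelization
parallel_minutes = 58    # reported run time on the cluster
total_cores = 3 * 32     # 3 machines with 32 cores each (assumed fully used)

speedup = serial_minutes / parallel_minutes
efficiency = speedup / total_cores

print(f"speedup = {speedup:.1f}x, efficiency = {efficiency:.0%}")
# prints: speedup = 71.1x, efficiency = 74%
```

Under that assumption the run is about 71 times faster on 96 cores, or roughly 74% of ideal linear scaling.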

Optimum design of shallow foundation using evolutionary algorithms. Soft Comput 2019. [DOI: 10.1007/s00500-019-04316-5] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 10/25/2022]

A new optimal gene selection approach for cancer classification using enhanced Jaya-based forest optimization algorithm. Neural Comput Appl 2019. [DOI: 10.1007/s00521-019-04355-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 12/14/2022]

Kawabata T, Emoto R, Nishino J, Takahashi K, Matsui S. Two-stage analysis for selecting fixed numbers of features in omics association studies. Stat Med 2019;38:2956-2971. [PMID: 30931544 DOI: 10.1002/sim.8150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2018] [Revised: 12/31/2018] [Accepted: 02/28/2019] [Indexed: 11/07/2022]

A new data analysis method based on feature linear combination. J Biomed Inform 2019;94:103173. [PMID: 30965135 DOI: 10.1016/j.jbi.2019.103173] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 09/13/2018] [Revised: 04/02/2019] [Accepted: 04/06/2019] [Indexed: 01/15/2023]

Abstract

In biological data, feature relationships are complex and diverse, and they can reflect physiological and pathological changes. Defining simple and efficient classification rules based on feature relationships is helpful for discriminating different conditions and studying disease mechanisms. The popular data analysis method, k top scoring pairs (k-TSP), explores the feature relationship by focusing on the difference of the relative level of two features in different groups and classifies samples based on the exploration. To define more efficient classification rules, we propose a new data analysis method based on the linear combination of k > 0 top scoring pairs (LC-k-TSP). LC-k-TSP applies support vector machine (SVM) to define the best linear relationship of each feature pair, scores feature pairs by the discriminative abilities of the corresponding linear combinations and selects k disjoint top scoring pairs to construct an ensemble classifier. Experiments on twelve public datasets showed the superiority of LC-k-TSP over k-TSP, which evaluates the relationship of every two features in the same way. The experiments also illustrated that LC-k-TSP performed similarly to SVM and random forest (RF) in accuracy rate. Because LC-k-TSP learns a unique linear combination for each feature pair and defines simple classification rules, its results are easy to interpret biomedically. Finally, we applied LC-k-TSP to analyze the hepatocellular carcinoma (HCC) metabolomics data and define the simple classification rules for discrimination of different liver diseases. It obtained accuracy rates of 89.76% and 89.13% in distinguishing between small HCC and hepatic cirrhosis (CIR) groups as well as between HCC and CIR groups, superior to 87.99% and 80.35% by k-TSP. Hence, defining classification rules based on feature relationships is an effective way to analyze biological data.
LC-k-TSP, which checks different feature pairs by their corresponding unique best linear relationship, is superior to k-TSP, which checks each pair by the same linear relationship. Availability and implementation: http://www.402.dicp.ac.cn/download_ok_4.htm.
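For orientation, the baseline top-scoring-pair idea that LC-k-TSP builds on can be sketched in a few lines. The sketch covers only the plain pairwise ordering score, not the SVM-fitted linear combinations that LC-k-TSP introduces; the function names and the toy samples are hypothetical, and both classes are assumed to be non-empty.

```python
from itertools import combinations


def pair_score(samples, labels, i, j):
    """|P(x_i < x_j | class 0) - P(x_i < x_j | class 1)| for one feature pair."""
    rates = []
    for cls in (0, 1):
        rows = [s for s, y in zip(samples, labels) if y == cls]
        rates.append(sum(r[i] < r[j] for r in rows) / len(rows))
    return abs(rates[0] - rates[1])


def top_pair(samples, labels):
    """Return the feature pair whose ordering best separates the two classes."""
    n_features = len(samples[0])
    return max(
        combinations(range(n_features), 2),
        key=lambda p: pair_score(samples, labels, *p),
    )


# Toy data: feature 0 < feature 1 holds in class 0 and fails in class 1.
samples = [[1, 2, 0], [1, 3, 0], [5, 2, 0], [4, 1, 0]]
labels = [0, 0, 1, 1]

best = top_pair(samples, labels)
```

A pair with score 1.0 classifies the toy data perfectly with the single rule "predict class 0 when feature i is below feature j". LC-k-TSP replaces that fixed comparison with a learned linear boundary for each pair.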

Ventura-Molina E, Alarcón-Paredes A, Aldape-Pérez M, Yáñez-Márquez C, Adolfo Alonso G. Gene selection for enhanced classification on microarray data using a weighted k-NN based algorithm. Intell Data Anal 2019. [DOI: 10.3233/ida-173720] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/15/2022]

Mrmr+ and Cfs+ feature selection algorithms for high-dimensional data. Appl Intell 2018. [DOI: 10.1007/s10489-018-1381-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 10/27/2022]

A-COA: an adaptive cuckoo optimization algorithm for continuous and combinatorial optimization. Neural Comput Appl 2018. [DOI: 10.1007/s00521-018-3928-9] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 10/27/2022]

K T, N KV, S S, M P. Distributed ICSA Clustering Approach for Large Scale Protein Sequences and Cancer Diagnosis. Asian Pac J Cancer Prev 2018;19:3105-3109. [PMID: 30486549 PMCID: PMC6318385 DOI: 10.31557/apjcp.2018.19.11.3105] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/25/2022] Open

Sapre S, Mini S. Opposition-based moth flame optimization with Cauchy mutation and evolutionary boundary constraint handling for global optimization. Soft Comput 2018. [DOI: 10.1007/s00500-018-3586-y] [Citation(s) in RCA: 50] [Impact Index Per Article: 7.1] [Indexed: 11/24/2022]

Nikdelfaz O, Jalili S. Disease genes prediction by HMM based PU-learning using gene expression profiles. J Biomed Inform 2018;81:102-111. [PMID: 29571901 DOI: 10.1016/j.jbi.2018.03.006] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 08/15/2017] [Revised: 11/22/2017] [Accepted: 03/12/2018] [Indexed: 12/24/2022]

Gao L, Ye M, Lu X, Huang D. Hybrid Method Based on Information Gain and Support Vector Machine for Gene Selection in Cancer Classification. Genomics Proteomics & Bioinformatics 2017;15:389-395. [PMID: 29246519 PMCID: PMC5828665 DOI: 10.1016/j.gpb.2017.08.002] [Citation(s) in RCA: 42] [Impact Index Per Article: 5.3] [Received: 01/12/2017] [Revised: 07/25/2017] [Accepted: 08/08/2017] [Indexed: 12/30/2022]