Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Alhenawi E, Al-Sayyed R, Hudaib A, Mirjalili S. Feature selection methods on gene expression microarray data for cancer classification: A systematic review. Comput Biol Med 2022;140:105051. [PMID: 34839186 DOI: 10.1016/j.compbiomed.2021.105051] [Citation(s) in RCA: 37] [Impact Index Per Article: 18.5] [Received: 07/09/2021] [Revised: 11/01/2021] [Accepted: 11/15/2021] [Indexed: 11/29/2022]

Cited by Other Article(s)

Jeyananthan P. Performance comparison between multi-level gene expression data in cancer subgroup classification. Pathol Res Pract 2024;260:155419. [PMID: 38955118 DOI: 10.1016/j.prp.2024.155419] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2024] [Revised: 06/06/2024] [Accepted: 06/19/2024] [Indexed: 07/04/2024]

M S K, Rajaguru H, Nair AR. Enhancement of Classifier Performance with Adam and RanAdam Hyper-Parameter Tuning for Lung Cancer Detection from Microarray Data-In Pursuit of Precision. Bioengineering (Basel) 2024;11:314. [PMID: 38671736 PMCID: PMC11047746 DOI: 10.3390/bioengineering11040314] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/26/2024] [Revised: 03/18/2024] [Accepted: 03/20/2024] [Indexed: 04/28/2024] [Open Access]

Zhou M, Wang J, Shi J, Zhai G, Zhou X, Ye L, Li L, Hu M, Zhou Y. Prediction model of radiotherapy outcome for Ocular Adnexal Lymphoma using informative features selected by chemometric algorithms. Comput Biol Med 2024;170:108067. [PMID: 38301513 DOI: 10.1016/j.compbiomed.2024.108067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2023] [Revised: 12/28/2023] [Accepted: 01/27/2024] [Indexed: 02/03/2024]

Abstract

BACKGROUND

Ocular Adnexal Lymphoma (OAL) is a non-Hodgkin's lymphoma that most often appears in the tissues near the eye, and radiotherapy is currently the preferred treatment. The prognostic factors for systemic failure of OAL radiotherapy remain controversial, so a thorough evaluation before radiotherapy is highly recommended to improve the patient's prognosis and minimize the likelihood of adverse effects.

PURPOSE

To investigate the risk factors that contribute to incomplete remission in OAL radiotherapy and to establish a hybrid model for predicting the radiotherapy outcomes in OAL patients.

METHODS

A retrospective chart review was performed for 87 consecutive patients with OAL who received radiotherapy at our center between February 2011 and August 2022. Seven image features derived from MRI sequences were integrated with 122 clinical features to form comprehensive patient feature sets. Chemometric algorithms were then employed to distill highly informative features from these sets. Based on these refined features, SVM and XGBoost classifiers were trained to classify the effect of radiotherapy.

RESULTS

The clinical records of 87 OAL patients (median age: 60 months, IQR: 52-68 months; 62.1% male) treated with radiotherapy were reviewed. Analysis of Lasso (AUC = 0.75, 95% CI: 0.72-0.77) and Random Forest (AUC = 0.67, 95% CI: 0.62-0.70) algorithms revealed four potential features, resulting in an intersection AUC of 0.80 (95% CI: 0.75-0.82). Logistic Regression (AUC = 0.75, 95% CI: 0.72-0.77) identified two features. Furthermore, the integration of chemometric methods such as CARS (AUC = 0.66, 95% CI: 0.62-0.72), UVE (AUC = 0.71, 95% CI: 0.66-0.75), and GA (AUC = 0.65, 95% CI: 0.60-0.69) highlighted six features in total, with an intersection AUC of 0.82 (95% CI: 0.78-0.83). These features included enophthalmos, diplopia, tenderness, elevated ALT count, HBsAg positivity, and CD43 positivity in immunohistochemical tests.

CONCLUSION

The findings suggest the effectiveness of chemometric algorithms in pinpointing OAL risk factors, and the prediction model we proposed shows promise in helping clinicians identify OAL patients likely to achieve complete remission via radiotherapy. Notably, patients with a history of exophthalmos, diplopia, tenderness, elevated ALT levels, HBsAg positivity, and CD43 positivity are less likely to attain complete remission after radiotherapy. These insights offer more targeted management strategies for OAL patients. The developed model is accessible online at: https://lzz.testop.top/.
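
As an illustration of the intersection step reported in the RESULTS above, the following minimal Python sketch keeps only the features on which several selectors agree. The function name `consensus_features`, the `min_votes` parameter, and the placeholder feature names are assumptions made for this example. They are not taken from the cited study, and the selector outputs are invented toy data.

```python
from collections import Counter


def consensus_features(selections, min_votes=None):
    """Return features picked by at least `min_votes` selectors.

    `selections` maps a selector name to the list of features it chose.
    With the default `min_votes=None`, a feature must be chosen by every
    selector, which is a plain set intersection.
    """
    if not selections:
        return []
    if min_votes is None:
        min_votes = len(selections)
    # Count each feature once per selector, even if a selector lists it twice.
    votes = Counter(f for chosen in selections.values() for f in set(chosen))
    return sorted(f for f, n in votes.items() if n >= min_votes)


# Hypothetical selector outputs (toy data, not results from the study).
selections = {
    "lasso": ["feature_a", "feature_b", "feature_c"],
    "random_forest": ["feature_a", "feature_c", "feature_d"],
    "logistic_regression": ["feature_a", "feature_b"],
}

all_agree = consensus_features(selections)             # strict intersection
majority = consensus_features(selections, min_votes=2)  # at least two selectors
```

A strict intersection tends to give a small, stable feature set, while a majority vote trades some stability for coverage. Which one suits a given study is a design choice this sketch does not settle.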

Affiliation(s)

Min Zhou: Ophthalmology Department, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, 639 Zhizaoju Road, Shanghai 200011, China; Shanghai Key Laboratory of Orbital Diseases and Ocular Oncology, Shanghai 200011, China.
Jiaqi Wang: Shanghai Key Laboratory of Multidimensional Information Processing, East China Normal University, 500 Dongchuan Road, Shanghai 200241, China.
Jiahao Shi: Ophthalmology Department, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, 639 Zhizaoju Road, Shanghai 200011, China; Shanghai Key Laboratory of Orbital Diseases and Ocular Oncology, Shanghai 200011, China.
Guangtao Zhai: Institute of Image Communication and Network Engineering, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai 200240, China.
Xiaowen Zhou: Ophthalmology Department, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, 639 Zhizaoju Road, Shanghai 200011, China; Shanghai Key Laboratory of Orbital Diseases and Ocular Oncology, Shanghai 200011, China.
Lulu Ye: Department of Oral and Maxillofacial-Head Neck Oncology, Shanghai Ninth People's Hospital, College of Stomatology, Shanghai Jiao Tong University School of Medicine, 639 Zhizaoju Road, Shanghai 200011, China.
Lunhao Li: Ophthalmology Department, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, 639 Zhizaoju Road, Shanghai 200011, China; Shanghai Key Laboratory of Orbital Diseases and Ocular Oncology, Shanghai 200011, China.
Menghan Hu: Shanghai Key Laboratory of Multidimensional Information Processing, East China Normal University, 500 Dongchuan Road, Shanghai 200241, China.
Yixiong Zhou: Ophthalmology Department, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine, 639 Zhizaoju Road, Shanghai 200011, China; Shanghai Key Laboratory of Orbital Diseases and Ocular Oncology, Shanghai 200011, China.

Abbasi EY, Deng Z, Ali Q, Khan A, Shaikh A, Reshan MSA, Sulaiman A, Alshahrani H. A machine learning and deep learning-based integrated multi-omics technique for leukemia prediction. Heliyon 2024;10:e25369. [PMID: 38352790 PMCID: PMC10862685 DOI: 10.1016/j.heliyon.2024.e25369] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2023] [Revised: 12/13/2023] [Accepted: 01/25/2024] [Indexed: 02/16/2024] [Open Access]

Yang G, Li W, Xie W, Wang L, Yu K. An improved binary particle swarm optimization algorithm for clinical cancer biomarker identification in microarray data. Comput Methods Programs Biomed 2024;244:107987. [PMID: 38157825 DOI: 10.1016/j.cmpb.2023.107987] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2023] [Revised: 11/04/2023] [Accepted: 12/16/2023] [Indexed: 01/03/2024]

Abstract

BACKGROUND AND OBJECTIVE

The limited number of samples and the high dimensionality of microarray data make selecting a small number of features for disease diagnosis a challenging problem. Traditional feature selection methods based on evolutionary algorithms struggle to find the optimal feature set in limited time on such high-dimensional problems. This work proposes new solutions to these problems.

METHODS

In this paper, we propose a hybrid feature selection method (C-IFBPFE) for biomarker identification in microarray data, which combines clustering and improved binary particle swarm optimization while incorporating an embedded feature elimination strategy. Firstly, an adaptive redundant feature judgment method based on correlation clustering is proposed for feature screening to reduce the search space in the subsequent stage. Secondly, we propose an improved flipping probability-based binary particle swarm optimization (IFBPSO), better applicable to the binary particle swarm optimization problem. Finally, we also design a new feature elimination (FE) strategy embedded in the binary particle swarm optimization algorithm. This strategy gradually removes poorer features during iterations to reduce the number of features and improve accuracy.

RESULTS

We compared C-IFBPFE with other published hybrid feature selection methods on eight public datasets and analyzed the impact of each improvement. The proposed method outperforms other current state-of-the-art feature selection methods in terms of accuracy, number of features, sensitivity, and specificity. The ablation study validates the efficacy of each component; in particular, the proposed feature elimination strategy significantly improves the performance of the algorithm.

CONCLUSIONS

The hybrid feature selection method proposed in this paper helps address the issue of high-dimensional microarray data with few samples. It can select a small subset of features and achieve high classification accuracy on microarray datasets. Additionally, independent validation of the selected features shows that those chosen by C-IFBPFE have strong correlations with disease phenotypes and can identify important biomarkers from data related to biomedical problems.
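
The first stage described in METHODS, screening out redundant features before the swarm search, can be illustrated with a much simpler stand-in. The sketch below is not the adaptive correlation-clustering method of C-IFBPFE; it is a plain greedy filter that drops any feature whose absolute Pearson correlation with an already kept feature reaches a fixed threshold. The function names, the `threshold` value, and the toy expression values are all assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def drop_redundant(features, threshold=0.9):
    """Greedily keep features that are not highly correlated with any kept feature.

    `features` maps a feature name to its values across samples. Features are
    visited in insertion order, so earlier features win ties.
    """
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Toy expression values for three hypothetical genes across four samples.
features = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.1],  # nearly a scaled copy of g1, so redundant
    "g3": [4.0, 1.0, 3.0, 2.0],  # weakly correlated with g1
}
kept = drop_redundant(features)
```

Shrinking the candidate set this way reduces the search space that a later wrapper stage, such as a binary particle swarm, has to explore.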

Zhou K, Yin Z, Gu J, Zeng Z. A Feature Selection Method Based on Graph Theory for Cancer Classification. Comb Chem High Throughput Screen 2024;27:650-660. [PMID: 37056061 DOI: 10.2174/1386207326666230413085646] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2022] [Revised: 02/02/2023] [Accepted: 02/24/2023] [Indexed: 04/15/2023]

Mohamed TIA, Ezugwu AE, Fonou-Dombeu JV, Mohammed M, Greeff J, Elbashir MK. A novel feature selection algorithm for identifying hub genes in lung cancer. Sci Rep 2023;13:21671. [PMID: 38066059 PMCID: PMC10709567 DOI: 10.1038/s41598-023-48953-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2023] [Accepted: 12/01/2023] [Indexed: 12/18/2023] [Open Access]

Abstract

Lung cancer, a life-threatening disease primarily affecting lung tissue, remains a significant contributor to mortality in both developed and developing nations. Accurate biomarker identification is imperative for effective cancer diagnosis and therapeutic strategies. This study introduces the Voting-Based Enhanced Binary Ebola Optimization Search Algorithm (VBEOSA), an innovative ensemble-based approach combining binary optimization and the Ebola optimization search algorithm. VBEOSA harnesses the collective power of the state-of-the-art classification models through soft voting. Moreover, our research applies VBEOSA to an extensive lung cancer gene expression dataset obtained from TCGA, following essential preprocessing steps including outlier detection and removal, data normalization, and filtration. VBEOSA aids in feature selection, leading to the discovery of key hub genes closely associated with lung cancer, validated through comprehensive protein-protein interaction analysis. Notably, our investigation reveals ten significant hub genes (ADRB2, ACTB, ARRB2, GNGT2, ADRB1, ACTG1, ACACA, ATP5A1, ADCY9, and ADRA1B), each demonstrating substantial involvement in the domain of lung cancer. Furthermore, our pathway analysis sheds light on the prominence of strategic pathways such as salivary secretion and the calcium signaling pathway, providing invaluable insights into the intricate molecular mechanisms underpinning lung cancer. We also utilize the weighted gene co-expression network analysis (WGCNA) method to identify gene modules exhibiting strong correlations with clinical attributes associated with lung cancer. Our findings underscore the efficacy of VBEOSA in feature selection and offer profound insights into the multifaceted molecular landscape of lung cancer. Finally, we are confident that this research has the potential to improve diagnostic capabilities and further enrich our understanding of the disease, thus setting the stage for future advancements in the clinical management of lung cancer. The VBEOSA source code is publicly available at https://github.com/TEHNAN/VBEOSA-A-Novel-Feature-Selection-Algorithm-for-Identifying-hub-Genes-in-Lung-Cancer .

Huang HH, Lu CJ, Jhou MJ, Liu TC, Yang CT, Hsieh SJ, Yang WJ, Chang HC, Chen MS. Using a Decision Tree Algorithm Predictive Model for Sperm Count Assessment and Risk Factors in Health Screening Population. Risk Manag Healthc Policy 2023;16:2469-2478. [PMID: 38024496 PMCID: PMC10658962 DOI: 10.2147/rmhp.s433193] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Accepted: 11/01/2023] [Indexed: 12/01/2023] [Open Access]

Lv G, Xia Y, Qi Z, Zhao Z, Tang L, Chen C, Yang S, Wang Q, Gu L. LncRNA-protein interaction prediction with reweighted feature selection. BMC Bioinformatics 2023;24:410. [PMID: 37904080 PMCID: PMC10617115 DOI: 10.1186/s12859-023-05536-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2023] [Accepted: 10/16/2023] [Indexed: 11/01/2023] [Open Access]

Morabito F, Adornetto C, Monti P, Amaro A, Reggiani F, Colombo M, Rodriguez-Aldana Y, Tripepi G, D’Arrigo G, Vener C, Torricelli F, Rossi T, Neri A, Ferrarini M, Cutrona G, Gentile M, Greco G. Genes selection using deep learning and explainable artificial intelligence for chronic lymphocytic leukemia predicting the need and time to therapy. Front Oncol 2023;13:1198992. [PMID: 37719021 PMCID: PMC10501728 DOI: 10.3389/fonc.2023.1198992] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/02/2023] [Accepted: 07/31/2023] [Indexed: 09/19/2023] [Open Access]

Abstract

Analyzing gene expression profiles (GEP) through artificial intelligence provides meaningful insight into cancer disease. This study introduces DeepSHAP Autoencoder Filter for Genes Selection (DSAF-GS), a novel deep learning and explainable artificial intelligence-based approach for feature selection in genomics-scale data. DSAF-GS exploits the autoencoder's reconstruction capabilities without changing the original feature space, enhancing the interpretation of the results. Explainable artificial intelligence is then used to select the informative genes for chronic lymphocytic leukemia prognosis of 217 cases from a GEP database comprising roughly 20,000 genes. The model for prognosis prediction achieved an accuracy of 86.4%, a sensitivity of 85.0%, and a specificity of 87.5%. According to the proposed approach, predictions were strongly influenced by CEACAM19 and PIGP, moderately influenced by MKL1 and GNE, and poorly influenced by other genes. The 10 most influential genes were selected for further analysis. Among them, FADD, FIBP, FIBP, GNE, IGF1R, MKL1, PIGP, and SLC39A6 were identified in the Reactome pathway database as involved in signal transduction, transcription, protein metabolism, immune system, cell cycle, and apoptosis. Moreover, according to the network model of the 3D protein-protein interaction (PPI) explored using the NetworkAnalyst tool, FADD, FIBP, IGF1R, QTRT1, GNE, SLC39A6, and MKL1 appear coupled into a complex network. Finally, all 10 selected genes showed a predictive power on time to first treatment (TTFT) in univariate analyses on a basic prognostic model including IGHV mutational status, del(11q) and del(17p), NOTCH1 mutations, β2-microglobulin, Rai stage, and B-lymphocytosis known to predict TTFT in CLL. 
However, only IGF1R (hazard ratio [HR] 1.41, 95% CI 1.08-1.84, P=0.013), COL28A1 (HR 0.32, 95% CI 0.10-0.97, P=0.045), and QTRT1 (HR 7.73, 95% CI 2.48-24.04, P<0.001) genes were significantly associated with TTFT in multivariable analyses when combined with the prognostic factors of the basic model, ultimately increasing the Harrell's c-index and the explained variation to 78.6% (versus 76.5% of the basic prognostic model) and 52.6% (versus 42.2% of the basic prognostic model), respectively. Also, the goodness of model fit was enhanced (χ2 = 20.1, P=0.002), indicating its improved performance above the basic prognostic model. In conclusion, DSAF-GS identified a group of significant genes for CLL prognosis, suggesting future directions for bio-molecular research.

Affiliation(s)

Fortunato Morabito: Biotechnology Research Unit, ‘A. Sforza’ Foundation, Cosenza, Italy
Carlo Adornetto: Department of Mathematics and Computer Science, University of Calabria, Cosenza, Italy
Paola Monti: Mutagenesis and Cancer Prevention Unit, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Ospedale Policlinico San Martino, Genoa, Italy
Adriana Amaro: Tumor Epigenetics Unit, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Ospedale Policlinico San Martino, Genoa, Italy
Francesco Reggiani: Tumor Epigenetics Unit, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Ospedale Policlinico San Martino, Genoa, Italy
Monica Colombo: Molecular Pathology Unit, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Ospedale Policlinico San Martino, Genoa, Italy
Yissel Rodriguez-Aldana: Department of Mathematics and Computer Science, University of Calabria, Cosenza, Italy
Giovanni Tripepi: Consiglio Nazionale delle Ricerche, Istituto di Fisiologia Clinica del Consiglio Nazionale delle Ricerche (CNR), Reggio Calabria, Italy
Graziella D’Arrigo: Consiglio Nazionale delle Ricerche, Istituto di Fisiologia Clinica del Consiglio Nazionale delle Ricerche (CNR), Reggio Calabria, Italy
Claudia Vener: Department of Oncology and Hemato-Oncology, University of Milan, Milan, Italy
Federica Torricelli: Laboratory of Translational Research, Azienda Unità Sanitaria Locale - Istituto di Ricovero e Cura a Carattere Scientifico (USL-IRCCS) of Reggio Emilia, Reggio Emilia, Italy
Teresa Rossi: Laboratory of Translational Research, Azienda Unità Sanitaria Locale - Istituto di Ricovero e Cura a Carattere Scientifico (USL-IRCCS) of Reggio Emilia, Reggio Emilia, Italy
Antonino Neri: Scientific Directorate, Azienda Unità Sanitaria Locale - Istituto di Ricovero e Cura a Carattere Scientifico (USL-IRCCS) of Reggio Emilia, Reggio Emilia, Italy
Manlio Ferrarini: Unità Operativa (UO) Molecular Pathology, Ospedale Policlinico San Martino Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS), Genoa, Italy
Giovanna Cutrona: Molecular Pathology Unit, Istituto di Ricovero e Cura a Carattere Scientifico (IRCCS) Ospedale Policlinico San Martino, Genoa, Italy
Massimo Gentile: Hematology Unit, Department of Onco-Hematology, Azienda Ospedaliera (A.O.) of Cosenza, Cosenza, Italy; Department of Pharmacy and Health and Nutritional Sciences, University of Calabria, Cosenza, Italy
Gianluigi Greco: Department of Mathematics and Computer Science, University of Calabria, Cosenza, Italy

Kabzinski J, Kucharska-Lusina A, Majsterek I. RNA-Based Liquid Biopsy in Head and Neck Cancer. Cells 2023;12:1916. [PMID: 37508579 PMCID: PMC10377854 DOI: 10.3390/cells12141916] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2023] [Revised: 07/17/2023] [Accepted: 07/21/2023] [Indexed: 07/30/2023] [Open Access]

Fu Q, Li Q, Li X. An improved multi-objective marine predator algorithm for gene selection in classification of cancer microarray data. Comput Biol Med 2023;160:107020. [PMID: 37196457 DOI: 10.1016/j.compbiomed.2023.107020] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/15/2023] [Revised: 04/09/2023] [Accepted: 05/05/2023] [Indexed: 05/19/2023]

Vahabzadeh V, Moattar MH. Robust microarray data feature selection using a correntropy based distance metric learning approach. Comput Biol Med 2023;161:107056. [PMID: 37235945 DOI: 10.1016/j.compbiomed.2023.107056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2022] [Revised: 04/18/2023] [Accepted: 05/20/2023] [Indexed: 05/28/2023]

Wang Z, Zhou Y, Takagi T, Song J, Tian YS, Shibuya T. Genetic algorithm-based feature selection with manifold learning for cancer classification using microarray data. BMC Bioinformatics 2023;24:139. [PMID: 37031189 PMCID: PMC10082986 DOI: 10.1186/s12859-023-05267-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2022] [Accepted: 04/02/2023] [Indexed: 04/10/2023] [Open Access]

Alhenawi E, Al-Sayyed R, Hudaib A, Mirjalili S. Improved intelligent water drop-based hybrid feature selection method for microarray data processing. Comput Biol Chem 2023;103:107809. [PMID: 36696844 DOI: 10.1016/j.compbiolchem.2022.107809] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 10/10/2021] [Revised: 12/13/2022] [Accepted: 12/30/2022] [Indexed: 01/15/2023]

Abstract

Classifying microarray datasets, which usually contain many noise genes that degrade classifier performance and lower classification accuracy, is a competitive research topic. Feature selection (FS) is one of the most practical ways of finding the subset of genes that increases classification accuracy for diagnostic and prognostic prediction of tumors from microarray datasets. There is therefore a continuing need for more efficient FS methods that select only an optimal or close-to-optimal subset of features to improve classification performance. In this paper, we propose a hybrid FS method for microarray data processing that combines an ensemble filter with an Improved Intelligent Water Drop (IIWD) algorithm as a wrapper. The IIWD algorithm adds one of three local search (LS) algorithms, Tabu search (TS), Novel LS algorithm (NLSA), or Hill Climbing (HC), in each iteration of IWD, and uses a correlation coefficient filter as the heuristic undesirability (HUD) for next-node selection in the original IWD algorithm. The effect of adding the three LS algorithms was evaluated by comparing the proposed ensemble filter-IIWD-based wrapper without any LS algorithm (PHFS-IWD) against the variants that add TS, NLSA, or HC (PHFS-IWDTS, PHFS-IWDNLSA, and PHFS-IWDHC, respectively). A Naïve Bayes (NB) classifier and five microarray datasets were used to evaluate and compare the proposed hybrid FS methods. Results show that using LS algorithms in each iteration of the IWD algorithm improves the F-score by 5% on average compared with PHFS-IWD. Also, PHFS-IWDNLSA improves the F-score by an average of 4.15% over PHFS-IWDTS and 5.67% over PHFS-IWDHC, while PHFS-IWDTS outperforms PHFS-IWDHC by an average of 1.6%. In addition, the proposed hybrid FS methods improve accuracy by an average of 8.92% in three out of five datasets and reduce the number of genes by 58.5% in all five datasets compared with six of the most recent state-of-the-art FS methods.

Awotunde JB, Ayo FE, Panigrahi R, Garg A, Bhoi AK, Barsocchi P. A Multi-level Random Forest Model-Based Intrusion Detection Using Fuzzy Inference System for Internet of Things Networks. Int J Comput Intell Syst 2023. [DOI: 10.1007/s44196-023-00205-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/14/2023] [Open Access]

Abstract

Intrusion detection (ID) methods are security frameworks designed to safeguard network information systems. The strength of an intrusion detection method is dependent on the robustness of the feature selection method. This study developed a multi-level random forest algorithm for intrusion detection using a fuzzy inference system. The strengths of the filter and wrapper approaches are combined in this work to create a more advanced multi-level feature selection technique, which strengthens network security. The first stage of the multi-level feature selection is the filter method using a correlation-based feature selection to select essential features based on the multi-collinearity in the data. The correlation-based feature selection used a genetic search method to choose the best features from the feature set. The genetic search algorithm assesses the merits of each attribute and then delivers the characteristics with the highest fitness values for selection. A rule assessment has also been used to determine whether two feature subsets have the same fitness value, in which case the feature subset with the fewest features is returned. The second stage is a wrapper method based on the sequential forward selection method to further select top features based on the accuracy of the baseline classifier. The selected top features serve as input into the random forest algorithm for detecting intrusions. Finally, fuzzy logic was used to classify intrusions as either normal, low, medium, or high to reduce misclassification. When the developed intrusion method was compared to other existing models using the same dataset, the results revealed a higher accuracy, precision, sensitivity, specificity, and F1-score of 99.46%, 99.46%, 99.46%, 93.86%, and 99.46%, respectively. The classification of attacks using the fuzzy inference system also indicates that the developed method can correctly classify attacks with reduced misclassification. The use of a multi-level feature selection method to leverage the advantages of filter and wrapper feature selection methods and fuzzy logic for intrusion classification makes this study unique.
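
The wrapper stage named in this abstract, sequential forward selection, can be sketched generically as follows. This is a minimal illustration, not the authors' implementation: `sequential_forward_selection`, the `score` callback, and the toy scoring function with its `gains` table are hypothetical. In practice `score` would be the cross-validated accuracy of the baseline classifier on the candidate subset.

```python
def sequential_forward_selection(candidates, score, max_features=None):
    """Greedy forward selection.

    Starting from an empty subset, repeatedly add the single feature that
    raises `score(subset)` the most, and stop when no addition improves the
    score or when `max_features` is reached.
    """
    selected = []
    best = score(selected)
    remaining = list(candidates)
    while remaining and (max_features is None or len(selected) < max_features):
        trials = [(score(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(trials, key=lambda t: t[0])
        if top_score <= best:
            break  # no candidate improves on the current subset
        selected.append(top_feature)
        remaining.remove(top_feature)
        best = top_score
    return selected, best


# Toy stand-in for classifier accuracy (an assumption for illustration):
# each feature adds a fixed gain, and each extra feature costs 0.05.
gains = {"a": 0.30, "b": 0.20, "c": 0.01}


def toy_score(subset):
    return 0.5 + sum(gains[f] for f in subset) - 0.05 * len(subset)


selected, best = sequential_forward_selection(["a", "b", "c"], toy_score)
```

Here feature `c` is rejected because its gain does not cover the per-feature penalty, which mirrors how a wrapper discards features that do not improve the baseline classifier.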

Gokhale M, Mohanty SK, Ojha A. GeneViT: Gene Vision Transformer with Improved DeepInsight for cancer classification. Comput Biol Med 2023;155:106643. [PMID: 36803792 DOI: 10.1016/j.compbiomed.2023.106643] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/14/2022] [Revised: 01/03/2023] [Accepted: 02/05/2023] [Indexed: 02/09/2023]

Alromema N, Syed AH, Khan T. A Hybrid Machine Learning Approach to Screen Optimal Predictors for the Classification of Primary Breast Tumors from Gene Expression Microarray Data. Diagnostics (Basel) 2023;13:708. [PMID: 36832196 PMCID: PMC9955903 DOI: 10.3390/diagnostics13040708] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/05/2023] [Revised: 01/30/2023] [Accepted: 02/07/2023] [Indexed: 02/16/2023] [Open Access]

Hybrid Filter and Genetic Algorithm-Based Feature Selection for Improving Cancer Classification in High-Dimensional Microarray Data. Processes (Basel) 2023. [DOI: 10.3390/pr11020562] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/16/2023] [Open Access]

Arafa A, El-Fishawy N, Badawy M, Radad M. RN-Autoencoder: Reduced Noise Autoencoder for classifying imbalanced cancer genomic data. J Biol Eng 2023;17:7. [PMID: 36717866 PMCID: PMC9887895 DOI: 10.1186/s13036-022-00319-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2022] [Accepted: 12/12/2022] [Indexed: 01/31/2023] [Open Access]

Abstract

BACKGROUND

In the current genomic era, gene expression datasets have become one of the main tools utilized in cancer classification. Both the curse of dimensionality and class imbalance are inherent characteristics of these datasets, and they degrade the performance of most classifiers used to classify cancer from genomic data.

RESULTS

This paper introduces Reduced Noise-Autoencoder (RN-Autoencoder) for pre-processing imbalanced genomic datasets for precise cancer classification. Firstly, RN-Autoencoder addresses the curse of dimensionality by using an autoencoder for feature reduction, generating new extracted data with lower dimensionality. In the next stage, RN-Autoencoder passes the extracted data to the Reduced Noise-Synthetic Minority Over-sampling Technique (RN-SMOTE), which efficiently solves the problem of class imbalance in the extracted data. RN-Autoencoder has been evaluated using different classifiers and various imbalanced datasets with different imbalance ratios. The results show that RN-Autoencoder improved classifier performance over both the original data and the extracted data, by margins that depend on the classifier, dataset, and evaluation metric. Also, the performance of RN-Autoencoder has been compared with the current state of the art, yielding increases of up to 18.017, 19.183, 18.58 and 8.87% in test accuracy on the colon, leukemia, Diffuse Large B-Cell Lymphoma (DLBCL) and Wisconsin Diagnostic Breast Cancer (WDBC) datasets, respectively.

CONCLUSION

RN-Autoencoder is a model for cancer classification using imbalanced gene expression datasets. It utilizes the autoencoder to reduce the high dimensionality of the gene expression datasets and then handles the class imbalance using RN-SMOTE. RN-Autoencoder has been evaluated using many different classifiers and many different imbalanced datasets. The performance of many classifiers has improved and some have succeeded in classifying cancer with 100% performance in terms of all used metrics. In addition, RN-Autoencoder outperformed many recent works using the same datasets.
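
To make the oversampling idea concrete, here is a minimal sketch of the interpolation at the core of plain SMOTE: a synthetic minority sample is placed at a random point on the segment between a minority sample and one of its nearest minority neighbors. It deliberately omits the noise-reduction step that distinguishes RN-SMOTE, whose details are not given in this abstract. The function name `smote_like`, its parameters, and the toy points are assumptions for illustration.

```python
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Generate `n_new` synthetic points by interpolating between minority samples.

    `minority` is a list of at least two equal-length tuples of floats.
    For each synthetic point, a base sample and one of its `k` nearest
    minority neighbors (squared Euclidean distance) are picked at random.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbor)))
    return synthetic


# Toy minority class in a two-dimensional extracted feature space.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new_points = smote_like(minority, n_new=5)
```

Because every synthetic point lies between two real minority samples, oversampling noisy minority samples would amplify the noise, which is the motivation for filtering noise before applying SMOTE.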

Alharbi F, Vakanski A. Machine Learning Methods for Cancer Classification Using Gene Expression Data: A Review. Bioengineering (Basel) 2023;10:bioengineering10020173. [PMID: 36829667 PMCID: PMC9952758 DOI: 10.3390/bioengineering10020173] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/22/2022] [Revised: 01/24/2023] [Accepted: 01/26/2023] [Indexed: 01/31/2023] Open

A new ranking-based stability measure for feature selection algorithms. Soft comput 2023. [DOI: 10.1007/s00500-022-07767-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/05/2023]

Attallah O. MonDiaL-CAD: Monkeypox diagnosis via selected hybrid CNNs unified with feature selection and ensemble learning. Digit Health 2023;9:20552076231180054. [PMID: 37312961 PMCID: PMC10259124 DOI: 10.1177/20552076231180054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/09/2023] [Accepted: 05/18/2023] [Indexed: 06/15/2023] Open

Braik M. Enhanced Ali Baba and the forty thieves algorithm for feature selection. Neural Comput Appl 2023;35:6153-6184. [PMID: 36408290 PMCID: PMC9666985 DOI: 10.1007/s00521-022-08015-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/21/2022] [Accepted: 10/26/2022] [Indexed: 11/16/2022]

Pan X, Zhang G, Lin A, Guan X, Chen P, Ge Y, Chen X. An evaluation model for children's foot & ankle deformity severity using sparse multi-objective feature selection algorithm. Comput Biol Med 2022;151:106229. [PMID: 36308897 DOI: 10.1016/j.compbiomed.2022.106229] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/12/2022] [Revised: 10/08/2022] [Accepted: 10/16/2022] [Indexed: 12/27/2022]

Shaban WM. Insight into breast cancer detection: new hybrid feature selection method. Neural Comput Appl 2022. [DOI: 10.1007/s00521-022-08062-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/05/2022]

Hybrid Feature Selection Method for Intrusion Detection Systems Based on an Improved Intelligent Water Drop Algorithm. CYBERNETICS AND INFORMATION TECHNOLOGIES 2022. [DOI: 10.2478/cait-2022-0040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]

Zanella L, Facco P, Bezzo F, Cimetta E. Feature Selection and Molecular Classification of Cancer Phenotypes: A Comparative Study. Int J Mol Sci 2022;23:ijms23169087. [PMID: 36012350 PMCID: PMC9408964 DOI: 10.3390/ijms23169087] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/05/2022] [Revised: 08/09/2022] [Accepted: 08/11/2022] [Indexed: 11/16/2022] Open

Performance Analysis of Ovarian Cancer Detection and Classification for Microarray Gene Data. BIOMED RESEARCH INTERNATIONAL 2022;2022:6750457. [PMID: 35872866 PMCID: PMC9307352 DOI: 10.1155/2022/6750457] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 02/14/2022] [Accepted: 06/30/2022] [Indexed: 11/18/2022]

Azadifar S, Rostami M, Berahmand K, Moradi P, Oussalah M. Graph-based relevancy-redundancy gene selection method for cancer diagnosis. Comput Biol Med 2022;147:105766. [DOI: 10.1016/j.compbiomed.2022.105766] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/10/2022] [Revised: 06/12/2022] [Accepted: 06/18/2022] [Indexed: 11/26/2022]

Feature Subset Selection with Optimal Adaptive Neuro-Fuzzy Systems for Bioinformatics Gene Expression Classification. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2022;2022:1698137. [PMID: 35607459 PMCID: PMC9124108 DOI: 10.1155/2022/1698137] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 03/09/2022] [Revised: 04/20/2022] [Accepted: 04/27/2022] [Indexed: 01/28/2023]

Red Fox Optimizer with Data-Science-Enabled Microarray Gene Expression Classification Model. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12094172] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/13/2022]

Zhong P, Wei X, Li X, Wei X, Wu S, Huang W, Koidis A, Xu Z, Lei H. Untargeted metabolomics by liquid chromatography‐mass spectrometry for food authentication: A review. Compr Rev Food Sci Food Saf 2022;21:2455-2488. [DOI: 10.1111/1541-4337.12938] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/19/2021] [Revised: 02/20/2022] [Accepted: 02/21/2022] [Indexed: 12/17/2022]

Affiliation(s)

Peng Zhong Guangdong Provincial Key Laboratory of Food Quality and Safety / National–Local Joint Engineering Research Center for Precision Machining and Safety of Livestock and Poultry Products, College of Food Science South China Agricultural University Guangzhou 510642 China
Xiaoqun Wei Guangdong Provincial Key Laboratory of Food Quality and Safety / National–Local Joint Engineering Research Center for Precision Machining and Safety of Livestock and Poultry Products, College of Food Science South China Agricultural University Guangzhou 510642 China
Xiangmei Li Guangdong Provincial Key Laboratory of Food Quality and Safety / National–Local Joint Engineering Research Center for Precision Machining and Safety of Livestock and Poultry Products, College of Food Science South China Agricultural University Guangzhou 510642 China
Xiaoyi Wei Guangdong Provincial Key Laboratory of Food Quality and Safety / National–Local Joint Engineering Research Center for Precision Machining and Safety of Livestock and Poultry Products, College of Food Science South China Agricultural University Guangzhou 510642 China
Shaozong Wu Guangdong Provincial Key Laboratory of Food Quality and Safety / National–Local Joint Engineering Research Center for Precision Machining and Safety of Livestock and Poultry Products, College of Food Science South China Agricultural University Guangzhou 510642 China
Weijuan Huang Guangdong Provincial Key Laboratory of Food Quality and Safety / National–Local Joint Engineering Research Center for Precision Machining and Safety of Livestock and Poultry Products, College of Food Science South China Agricultural University Guangzhou 510642 China
Anastasios Koidis Institute for Global Food Security Queen's University Belfast Belfast UK
Zhenlin Xu Guangdong Provincial Key Laboratory of Food Quality and Safety / National–Local Joint Engineering Research Center for Precision Machining and Safety of Livestock and Poultry Products, College of Food Science South China Agricultural University Guangzhou 510642 China
Hongtao Lei Guangdong Provincial Key Laboratory of Food Quality and Safety / National–Local Joint Engineering Research Center for Precision Machining and Safety of Livestock and Poultry Products, College of Food Science South China Agricultural University Guangzhou 510642 China Guangdong Laboratory for Lingnan Modern Agriculture South China Agricultural University Guangzhou 510642 China

Tahmouresi A, Rashedi E, Yaghoobi MM, Rezaei M. Gene selection using pyramid gravitational search algorithm. PLoS One 2022;17:e0265351. [PMID: 35290401 PMCID: PMC8923457 DOI: 10.1371/journal.pone.0265351] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/08/2021] [Accepted: 02/28/2022] [Indexed: 11/24/2022] Open

Li F, Zhou Y, Zhang Y, Yin J, Qiu Y, Gao J, Zhu F. POSREG: proteomic signature discovered by simultaneously optimizing its reproducibility and generalizability. Brief Bioinform 2022;23:6532538. [PMID: 35183059 DOI: 10.1093/bib/bbac040] [Citation(s) in RCA: 69] [Impact Index Per Article: 34.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2021] [Revised: 01/21/2022] [Accepted: 01/27/2022] [Indexed: 12/17/2022] Open

Xavier D, Floris C, Fabrice P, Angoulvant D, Mewton N, Roubille F, Pascal R, Marc F, Valérie M, Laurane C, Alain F, Gabriel G, Loïc B, Delphine MP. Post-infarct cardiac remodeling predictions with machine learning. Int J Cardiol 2022;355:1-4. [DOI: 10.1016/j.ijcard.2022.02.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 11/21/2021] [Revised: 02/04/2022] [Accepted: 02/07/2022] [Indexed: 11/05/2022]

Cao Y. Possible relationship between the somatic mutations and the formation of cancers. BIO WEB OF CONFERENCES 2022. [DOI: 10.1051/bioconf/20225501009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/23/2022] Open