Reference Citation Analysis

For: Liu S, Xu C, Zhang Y, Liu J, Yu B, Liu X, Dehmer M. Feature selection of gene expression data for Cancer classification using double RBF-kernels. BMC Bioinformatics 2018;19:396. [PMID: 30373514 DOI: 10.1186/s12859-018-2400-2] [Citation(s) in RCA: 38] [Impact Index Per Article: 6.3] [Received: 04/19/2018] [Accepted: 09/26/2018] [Indexed: 12/31/2022] Open

Cited by Other Article(s)

Pradhan UK, Meher PK, Naha S, Sharma NK, Agarwal A, Gupta A, Parsad R. DBPMod: a supervised learning model for computational recognition of DNA-binding proteins in model organisms. Brief Funct Genomics 2024;23:363-372. [PMID: 37651627 DOI: 10.1093/bfgp/elad039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2023] [Revised: 08/09/2023] [Accepted: 08/15/2023] [Indexed: 09/02/2023] Open

Abstract

DNA-binding proteins (DBPs) play critical roles in many biological processes, including gene expression, DNA replication, recombination and repair. Understanding the molecular mechanisms underlying these processes depends on the precise identification of DBPs. In recent times, several computational methods have been developed to identify DBPs. However, because of their generic nature, these models are unable to identify species-specific DBPs with high accuracy. Therefore, a species-specific computational model is needed to predict species-specific DBPs. In this paper, we introduce the computational DBPMod method, which makes use of a machine learning approach to identify species-specific DBPs. For prediction, both shallow learning algorithms and deep learning models were used, with shallow learning models achieving higher accuracy. Additionally, the evolutionary features outperformed sequence-derived features in terms of accuracy. Five model organisms, including Caenorhabditis elegans, Drosophila melanogaster, Escherichia coli, Homo sapiens and Mus musculus, were used to assess the performance of DBPMod. Five-fold cross-validation and independent test set analyses were used to evaluate the prediction accuracy in terms of area under receiver operating characteristic curve (auROC) and area under precision-recall curve (auPRC), which were found to be ~89-92% and ~89-95%, respectively. The comparative results demonstrate that DBPMod outperforms 12 current state-of-the-art computational approaches in identifying the DBPs for all five model organisms. We further developed a web server for DBPMod to make it easier for researchers to detect DBPs; it is publicly available at https://iasri-sg.icar.gov.in/dbpmod/. DBPMod is expected to be an invaluable tool for discovering DBPs, supplementing the current experimental and computational methods.

Mendapara K. Development and evaluation of a chronic kidney disease risk prediction model using random forest. Front Genet 2024;15:1409755. [PMID: 38993480 PMCID: PMC11236722 DOI: 10.3389/fgene.2024.1409755] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/02/2024] [Accepted: 05/29/2024] [Indexed: 07/13/2024] Open

Pradhan UK, Meher PK, Naha S, Das R, Gupta A, Parsad R. ProkDBP: Toward more precise identification of prokaryotic DNA binding proteins. Protein Sci 2024;33:e5015. [PMID: 38747369 PMCID: PMC11094783 DOI: 10.1002/pro.5015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Revised: 04/18/2024] [Accepted: 04/21/2024] [Indexed: 05/19/2024]

Karimi-Fard A, Saidi A, TohidFar M, Emami SN. Novel candidate genes for environmental stresses response in Synechocystis sp. PCC 6803 revealed by machine learning algorithms. Braz J Microbiol 2024;55:1219-1229. [PMID: 38705959 PMCID: PMC11153407 DOI: 10.1007/s42770-024-01338-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2023] [Accepted: 04/03/2024] [Indexed: 05/07/2024] Open

Bai Z, Bartelo N, Aslam M, Murphy EA, Hale CR, Blachere NE, Parveen S, Spolaore E, DiCarlo E, Gravallese EM, Smith MH, Frank MO, Jiang CS, Zhang H, Pyrgaki C, Lewis MJ, Sikandar S, Pitzalis C, Lesnak JB, Mazhar K, Price TJ, Malfait AM, Miller RE, Zhang F, Goodman S, Darnell RB, Wang F, Orange DE. Synovial fibroblast gene expression is associated with sensory nerve growth and pain in rheumatoid arthritis. Sci Transl Med 2024;16:eadk3506. [PMID: 38598614 DOI: 10.1126/scitranslmed.adk3506] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2023] [Accepted: 03/21/2024] [Indexed: 04/12/2024]

Affiliation(s)

Zilong Bai: Weill Cornell Medicine, New York, NY 10065, USA
Nicholas Bartelo: Weill Cornell Medicine, New York, NY 10065, USA
Maryam Aslam: Rockefeller University, New York, NY 10065, USA
Elisabeth A Murphy: Rockefeller University, New York, NY 10065, USA
Caryn R Hale: Rockefeller University, New York, NY 10065, USA; Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
Nathalie E Blachere: Rockefeller University, New York, NY 10065, USA; Howard Hughes Medical Institute, Rockefeller University, New York, NY 10065, USA
Salina Parveen: Rockefeller University, New York, NY 10065, USA
Edoardo Spolaore: Hospital for Special Surgery, New York, NY 10021, USA
Edward DiCarlo: Hospital for Special Surgery, New York, NY 10021, USA
Ellen M Gravallese: Brigham and Women's Hospital, Boston, MA 02115, USA
Melanie H Smith: Hospital for Special Surgery, New York, NY 10021, USA
Mayu O Frank: Rockefeller University, New York, NY 10065, USA
Caroline S Jiang: Rockefeller University, New York, NY 10065, USA
Haotan Zhang: Weill Cornell Medicine, New York, NY 10065, USA
Christina Pyrgaki: Rockefeller University, New York, NY 10065, USA
Myles J Lewis: Queen Mary University of London & NIHR BRC Barts Health NHS Trust, London E1 4NS, UK
Shafaq Sikandar: Queen Mary University of London & NIHR BRC Barts Health NHS Trust, London E1 4NS, UK
Costantino Pitzalis: Queen Mary University of London & NIHR BRC Barts Health NHS Trust, London E1 4NS, UK; Department of Biomedical Sciences, Humanitas University & IRCC Humanitas Research Hospital, Milan 20072, Italy
Joseph B Lesnak: University of Texas at Dallas, Richardson, TX 75080, USA
Khadijah Mazhar: University of Texas at Dallas, Richardson, TX 75080, USA
Theodore J Price: University of Texas at Dallas, Richardson, TX 75080, USA
Anne-Marie Malfait: Rush University Medical Center, Chicago, IL 60612, USA
Rachel E Miller: Rush University Medical Center, Chicago, IL 60612, USA
Fan Zhang: University of Colorado School of Medicine, Aurora, CO 80045, USA
Susan Goodman: Hospital for Special Surgery, New York, NY 10021, USA
Robert B Darnell: Rockefeller University, New York, NY 10065, USA; Howard Hughes Medical Institute, Rockefeller University, New York, NY 10065, USA
Fei Wang: Weill Cornell Medicine, New York, NY 10065, USA
Dana E Orange: Rockefeller University, New York, NY 10065, USA; Hospital for Special Surgery, New York, NY 10021, USA

Mukherjee A, Abraham S, Singh A, Balaji S, Mukunthan KS. From Data to Cure: A Comprehensive Exploration of Multi-omics Data Analysis for Targeted Therapies. Mol Biotechnol 2024:10.1007/s12033-024-01133-6. [PMID: 38565775 DOI: 10.1007/s12033-024-01133-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2023] [Accepted: 02/27/2024] [Indexed: 04/04/2024]

Okimoto LYS, Mendonca-Neto R, Nakamura FG, Nakamura EF, Fenyö D, Silva CT. Few-shot genes selection: subset of PAM50 genes for breast cancer subtypes classification. BMC Bioinformatics 2024;25:92. [PMID: 38429657 PMCID: PMC10908178 DOI: 10.1186/s12859-024-05715-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2023] [Accepted: 02/21/2024] [Indexed: 03/03/2024] Open

Turfan D, Altunkaynak B, Yeniay Ö. A New Filter Approach Based on Effective Ranges for Classification of Gene Expression Data. BIG DATA 2023. [PMID: 37668992 DOI: 10.1089/big.2022.0086] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/06/2023]

Hybrid Filter and Genetic Algorithm-Based Feature Selection for Improving Cancer Classification in High-Dimensional Microarray Data. Processes (Basel) 2023. [DOI: 10.3390/pr11020562] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/16/2023] Open

Alharbi F, Vakanski A. Machine Learning Methods for Cancer Classification Using Gene Expression Data: A Review. Bioengineering (Basel) 2023;10:bioengineering10020173. [PMID: 36829667 PMCID: PMC9952758 DOI: 10.3390/bioengineering10020173] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Received: 12/22/2022] [Revised: 01/24/2023] [Accepted: 01/26/2023] [Indexed: 01/31/2023] Open

Identification of Biclusters in Huntington’s Disease Dataset Using a New Variant of Grey Wolf Optimizer. JOURNAL OF THE INSTITUTION OF ENGINEERS (INDIA): SERIES B 2022. [PMCID: PMC9640792 DOI: 10.1007/s40031-022-00815-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]

Guryleva MV, Penzar DD, Chistyakov DV, Mironov AA, Favorov AV, Sergeeva MG. Investigation of the Role of PUFA Metabolism in Breast Cancer Using a Rank-Based Random Forest Algorithm. Cancers (Basel) 2022;14:cancers14194663. [PMID: 36230586 PMCID: PMC9562210 DOI: 10.3390/cancers14194663] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2022] [Revised: 09/15/2022] [Accepted: 09/21/2022] [Indexed: 11/16/2022] Open

Abstract

Simple Summary

Polyunsaturated fatty acids (PUFAs) and their derivatives, oxylipins, are a constant focus of cancer research due to the relationship between cancer and processes of energy metabolism and inflammation, where a PUFA system is an active player. Only recently have methods been developed that allow for studying such complex systems. Using the Rank-based Random Forest (RF) model, we show that PUFA metabolism genes are critical for the pathogenesis of breast cancer (BC); BC subtypes differ in PUFA metabolism gene expression. The enrichment of BC subtypes with various genes associated with oxylipin signaling pathways indicates a different contribution of these compounds to the biology of subtypes.

Abstract

Polyunsaturated fatty acid (PUFA) metabolism is currently a focus in cancer research because PUFAs function as structural components of the membrane matrix, as fuel sources for energy production, and as sources of secondary messengers, the so-called oxylipins, which are important players in inflammatory processes. Although breast cancer (BC) is the leading cause of cancer death among women worldwide, no systematic study of PUFA metabolism as a system of interrelated processes in this disease has been carried out. Here, we implemented a Boruta-based feature selection algorithm to determine the list of the most important PUFA metabolism genes altered in breast cancer tissues compared with normal tissues. A rank-based Random Forest (RF) model was built on the selected gene list (33 genes) and applied to predict the cancer phenotype to ascertain the PUFA genes involved in carcinogenesis. It showed high performance in dichotomous classification (balanced accuracy of 0.94, ROC AUC of 0.99). We also retrieved a list of important PUFA genes (46 genes) that differed between breast cancer molecular subtypes. The balanced accuracy of the classification model built on the specified genes was 0.82, while the ROC AUC for the sensitivity analysis was 0.85. Specific patterns of PUFA metabolic changes were obtained for each molecular subtype of breast cancer. These results show evidence that (1) PUFA metabolism genes are critical for the pathogenesis of breast cancer; (2) BC subtypes differ in PUFA metabolism gene expression; and (3) the lists of genes selected in the models are enriched with genes involved in the metabolism of signaling lipids.

Zanella L, Facco P, Bezzo F, Cimetta E. Feature Selection and Molecular Classification of Cancer Phenotypes: A Comparative Study. Int J Mol Sci 2022;23:ijms23169087. [PMID: 36012350 PMCID: PMC9408964 DOI: 10.3390/ijms23169087] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2022] [Revised: 08/09/2022] [Accepted: 08/11/2022] [Indexed: 11/16/2022] Open

Virtual reality for the observation of oncology models (VROOM): immersive analytics for oncology patient cohorts. Sci Rep 2022;12:11337. [PMID: 35790803 PMCID: PMC9256599 DOI: 10.1038/s41598-022-15548-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 02/16/2022] [Accepted: 06/24/2022] [Indexed: 11/08/2022] Open

EGFAFS: A Novel Feature Selection Algorithm Based on Explosion Gravitation Field Algorithm. ENTROPY 2022;24:e24070873. [PMID: 35885095 PMCID: PMC9322764 DOI: 10.3390/e24070873] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2022] [Revised: 06/15/2022] [Accepted: 06/22/2022] [Indexed: 02/04/2023]

Mirzaei G. GraphChrom: A Novel Graph-Based Framework for Cancer Classification Using Chromosomal Rearrangement Endpoints. Cancers (Basel) 2022;14:cancers14133060. [PMID: 35804833 PMCID: PMC9265123 DOI: 10.3390/cancers14133060] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/18/2022] [Revised: 06/06/2022] [Accepted: 06/18/2022] [Indexed: 11/16/2022] Open

Jha A, Quesnel-Vallières M, Wang D, Thomas-Tikhonenko A, Lynch KW, Barash Y. Identifying common transcriptome signatures of cancer by interpreting deep learning models. Genome Biol 2022;23:117. [PMID: 35581644 PMCID: PMC9112525 DOI: 10.1186/s13059-022-02681-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 11/19/2021] [Accepted: 04/27/2022] [Indexed: 01/01/2023] Open

The ability to classify patients based on gene-expression data varies by algorithm and performance metric. PLoS Comput Biol 2022;18:e1009926. [PMID: 35275931 PMCID: PMC8942277 DOI: 10.1371/journal.pcbi.1009926] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/10/2021] [Revised: 03/23/2022] [Accepted: 02/15/2022] [Indexed: 01/02/2023] Open

Abstract

By classifying patients into subgroups, clinicians can provide more effective care than using a uniform approach for all patients. Such subgroups might include patients with a particular disease subtype, patients with a good (or poor) prognosis, or patients most (or least) likely to respond to a particular therapy. Transcriptomic measurements reflect the downstream effects of genomic and epigenomic variations. However, high-throughput technologies generate thousands of measurements per patient, and complex dependencies exist among genes, so it may be infeasible to classify patients using traditional statistical models. Machine-learning classification algorithms can help with this problem. However, hundreds of classification algorithms exist (and most support diverse hyperparameters), so it is difficult for researchers to know which are optimal for gene-expression biomarkers. We performed a benchmark comparison, applying 52 classification algorithms to 50 gene-expression datasets (143 class variables). We evaluated algorithms that represent diverse machine-learning methodologies and have been implemented in general-purpose, open-source, machine-learning libraries. When available, we combined clinical predictors with gene-expression data. Additionally, we evaluated the effects of performing hyperparameter optimization and feature selection using nested cross validation. Kernel- and ensemble-based algorithms consistently outperformed other types of classification algorithms; however, even the top-performing algorithms performed poorly in some cases. Hyperparameter optimization and feature selection typically improved predictive performance, and univariate feature-selection algorithms typically outperformed more sophisticated methods. Together, our findings illustrate that algorithm performance varies considerably when other factors are held constant and thus that algorithm selection is a critical step in biomarker studies.

Combination of Reduction Detection Using TOPSIS for Gene Expression Data Analysis. BIG DATA AND COGNITIVE COMPUTING 2022. [DOI: 10.3390/bdcc6010024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]

Rout S, Mallick PK, Mishra D. DRBF-DS: Double RBF Kernel-Based Deep Sampling with CNNs to Handle Complex Imbalanced Datasets. ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING 2022. [DOI: 10.1007/s13369-021-06480-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/02/2022]

Wang A, Liu H, Yang J, Chen G. Ensemble feature selection for stable biomarker identification and cancer classification from microarray expression data. Comput Biol Med 2022;142:105208. [PMID: 35016102 DOI: 10.1016/j.compbiomed.2021.105208] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 11/29/2021] [Revised: 12/19/2021] [Accepted: 12/31/2021] [Indexed: 01/31/2023]

Mori Y, Yokota H, Hoshino I, Iwatate Y, Wakamatsu K, Uno T, Suyari H. Deep learning-based gene selection in comprehensive gene analysis in pancreatic cancer. Sci Rep 2021;11:16521. [PMID: 34389782 PMCID: PMC8363643 DOI: 10.1038/s41598-021-95969-6] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 01/28/2021] [Accepted: 07/29/2021] [Indexed: 12/14/2022] Open

Del Giudice M, Peirone S, Perrone S, Priante F, Varese F, Tirtei E, Fagioli F, Cereda M. Artificial Intelligence in Bulk and Single-Cell RNA-Sequencing Data to Foster Precision Oncology. Int J Mol Sci 2021;22:ijms22094563. [PMID: 33925407 PMCID: PMC8123853 DOI: 10.3390/ijms22094563] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 03/20/2021] [Revised: 04/21/2021] [Accepted: 04/23/2021] [Indexed: 02/01/2023] Open

Affiliation(s)

Marco Del Giudice: Cancer Genomics and Bioinformatics Unit, IIGM (Italian Institute for Genomic Medicine), c/o IRCCS, Str. Prov.le 142, km 3.95, 10060 Candiolo, TO, Italy; Candiolo Cancer Institute, FPO-IRCCS, Str. Prov.le 142, km 3.95, 10060 Candiolo, TO, Italy
Serena Peirone: Cancer Genomics and Bioinformatics Unit, IIGM (Italian Institute for Genomic Medicine), c/o IRCCS, Str. Prov.le 142, km 3.95, 10060 Candiolo, TO, Italy; Department of Physics and INFN, Università degli Studi di Torino, via P.Giuria 1, 10125 Turin, Italy
Sarah Perrone: Cancer Genomics and Bioinformatics Unit, IIGM (Italian Institute for Genomic Medicine), c/o IRCCS, Str. Prov.le 142, km 3.95, 10060 Candiolo, TO, Italy; Department of Physics, Università degli Studi di Torino, via P.Giuria 1, 10125 Turin, Italy
Francesca Priante: Cancer Genomics and Bioinformatics Unit, IIGM (Italian Institute for Genomic Medicine), c/o IRCCS, Str. Prov.le 142, km 3.95, 10060 Candiolo, TO, Italy; Department of Physics, Università degli Studi di Torino, via P.Giuria 1, 10125 Turin, Italy
Fabiola Varese: Cancer Genomics and Bioinformatics Unit, IIGM (Italian Institute for Genomic Medicine), c/o IRCCS, Str. Prov.le 142, km 3.95, 10060 Candiolo, TO, Italy; Department of Life Science and System Biology, Università degli Studi di Torino, via Accademia Albertina 13, 10123 Turin, Italy
Elisa Tirtei: Paediatric Onco-Haematology Division, Regina Margherita Children’s Hospital, City of Health and Science of Turin, 10126 Turin, Italy
Franca Fagioli: Paediatric Onco-Haematology Division, Regina Margherita Children’s Hospital, City of Health and Science of Turin, 10126 Turin, Italy; Department of Public Health and Paediatric Sciences, University of Torino, 10124 Turin, Italy
Matteo Cereda: Cancer Genomics and Bioinformatics Unit, IIGM (Italian Institute for Genomic Medicine), c/o IRCCS, Str. Prov.le 142, km 3.95, 10060 Candiolo, TO, Italy; Candiolo Cancer Institute, FPO-IRCCS, Str. Prov.le 142, km 3.95, 10060 Candiolo, TO, Italy. Correspondence: Tel.: +39-011-993-3969

Seo H, Cho DH. Feature selection algorithm based on dual correlation filters for cancer-associated somatic variants. BMC Bioinformatics 2020;21:486. [PMID: 33121438 PMCID: PMC7596964 DOI: 10.1186/s12859-020-03767-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/23/2019] [Accepted: 09/18/2020] [Indexed: 12/30/2022] Open

Afshar M, Usefi H. High-dimensional feature selection for genomic datasets. Knowl Based Syst 2020. [DOI: 10.1016/j.knosys.2020.106370] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]

Xu D, Zhang J, Xu H, Zhang Y, Chen W, Gao R, Dehmer M. Multi-scale supervised clustering-based feature selection for tumor classification and identification of biomarkers and targets on genomic data. BMC Genomics 2020;21:650. [PMID: 32962626 PMCID: PMC7510277 DOI: 10.1186/s12864-020-07038-3] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 03/18/2020] [Accepted: 08/30/2020] [Indexed: 12/19/2022] Open

Abstract

Background

The small number of samples and the curse of dimensionality hamper the application of deep learning techniques for disease classification. Additionally, the performance of clustering-based feature selection algorithms is still far from satisfactory because they are limited to unsupervised learning methods. To enhance interpretability and overcome this problem, we developed a novel feature selection algorithm. Meanwhile, complex genomic data have brought great challenges for the identification of biomarkers and therapeutic targets. Some current feature selection methods suffer from low sensitivity and specificity in this field.

Results

In this article, we designed a multi-scale clustering-based feature selection algorithm named MCBFS, which simultaneously performs feature selection and model learning for genomic data analysis. The experimental results demonstrated that MCBFS is robust and effective by comparing it with seven benchmark and six state-of-the-art supervised methods on eight data sets. The visualization results and the statistical test showed that MCBFS can capture the informative genes and improve the interpretability and visualization of tumor gene expression and single-cell sequencing data. Additionally, we developed a general framework named McbfsNW using gene expression data and protein interaction data to identify robust biomarkers and therapeutic targets for diagnosis and therapy of diseases. The framework incorporates the MCBFS algorithm, a network recognition ensemble algorithm and a feature selection wrapper. McbfsNW has been applied to the lung adenocarcinoma (LUAD) data sets. The preliminary results demonstrated that the identified biomarkers attain higher prediction performance on the independent LUAD data set, and we also constructed a drug-target network that may be useful for LUAD therapy.

Conclusions

The proposed novel feature selection method is robust and effective for gene selection, classification, and visualization. The framework McbfsNW is practical and helpful for the identification of biomarkers and targets on genomic data. It is believed that the same methods and principles can be extended and applied to other kinds of data sets.

Kalina J, Matonoha C. A sparse pair-preserving centroid-based supervised learning method for high-dimensional biomedical data or images. Biocybern Biomed Eng 2020. [DOI: 10.1016/j.bbe.2020.03.008] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 11/25/2022]

Han X, Li D, Liu P, Wang L. Feature selection by recursive binary gravitational search algorithm optimization for cancer classification. Soft comput 2020. [DOI: 10.1007/s00500-019-04203-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 10/26/2022]

Specific glioblastoma multiforme prognostic-subtype distinctions based on DNA methylation patterns. Cancer Gene Ther 2019;27:702-714. [PMID: 31619751 DOI: 10.1038/s41417-019-0142-6] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 07/20/2019] [Revised: 10/01/2019] [Accepted: 10/04/2019] [Indexed: 12/14/2022]

Shi M, Wang J, Zhang C. Integration of Cancer Genomics Data for Tree-based Dimensionality Reduction and Cancer Outcome Prediction. Mol Inform 2019;39:e1900028. [PMID: 31490641 DOI: 10.1002/minf.201900028] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 02/20/2019] [Accepted: 08/22/2019] [Indexed: 11/10/2022]