Reference Citation Analysis

For: Ghorai S, Mukherjee A, Sengupta S, Dutta PK. Cancer classification from gene expression data by NPPC ensemble. IEEE/ACM Trans Comput Biol Bioinform 2011;8:659-671. [PMID: 20479504 DOI: 10.1109/tcbb.2010.36] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Indexed: 05/29/2023]

Cited by Other Article(s)

Mendonca-Neto R, Li Z, Fenyo D, Silva CT, Nakamura FG, Nakamura EF. A Gene Selection Method Based on Outliers for Breast Cancer Subtype Classification. IEEE/ACM Trans Comput Biol Bioinform 2022;19:2547-2559. [PMID: 34860652 DOI: 10.1109/tcbb.2021.3132339] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]

Bose S, Das C, Banerjee A, Ghosh K, Chattopadhyay M, Chattopadhyay S, Barik A. An ensemble machine learning model based on multiple filtering and supervised attribute clustering algorithm for classifying cancer samples. PeerJ Comput Sci 2021;7:e671. [PMID: 34616883 PMCID: PMC8459790 DOI: 10.7717/peerj-cs.671] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/18/2020] [Accepted: 07/20/2021] [Indexed: 06/13/2023]

Abstract

BACKGROUND

Machine learning is a machine intelligence technique that learns from data and detects inherent patterns in large, complex datasets. Because of this capability, machine learning techniques are widely used in medical applications, especially those involving large-scale genomic and proteomic data. Cancer classification based on bio-molecular profiling data is an important topic for medical applications because it improves the diagnostic accuracy of cancer and supports successful treatment outcomes. Hence, machine learning techniques are widely used in cancer detection and prognosis.

METHODS

In this article, a new ensemble machine learning classification model, the Multiple Filtering and Supervised Attribute Clustering algorithm based Ensemble Classification model (MFSAC-EC), is proposed; it can handle the class imbalance problem and the high dimensionality of microarray datasets. The model first generates a number of bootstrapped datasets from the original training data, applying an oversampling procedure to handle the class imbalance problem. The proposed MFSAC method is then applied to each of these bootstrapped datasets to generate sub-datasets, each of which contains a subset of the most relevant/informative attributes of the original dataset. The MFSAC method is a feature selection technique combining multiple filters with a new supervised attribute clustering algorithm. Then, for every sub-dataset, a base classifier is constructed separately, and finally the predictions of these base classifiers are combined by majority voting to form the MFSAC-based ensemble classifier. In addition, the most informative attributes are selected as important features based on their frequency of occurrence in these sub-datasets.

RESULTS

To assess the performance of the proposed MFSAC-EC model, it is applied to different high-dimensional microarray gene expression datasets for cancer sample classification and compared with well-known existing models. The experimental results show that the generalization performance (testing accuracy) of the proposed classifier is significantly better than that of the other models. The proposed model can also identify many important attributes (biomarker genes).
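
The pipeline outlined in the METHODS paragraph of this abstract (bootstrap with oversampling, per-bootstrap feature selection, one base classifier per sub-dataset, majority voting, and feature ranking by frequency) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' MFSAC-EC: the abstract does not specify the filters, the clustering step, or the base classifier, so a single F-like filter score and a nearest-centroid classifier are swapped in as stand-ins. All function names, parameter values, and the toy data are hypothetical.

```python
import random
from collections import Counter
from statistics import mean, pvariance


def bootstrap_oversample(X, y, rng):
    """Draw a bootstrap sample, then duplicate minority-class rows until classes are balanced."""
    n = len(X)
    idx = [rng.randrange(n) for _ in range(n)]
    # Assumption: make sure every class appears at least once in the bootstrap.
    for label in set(y):
        if all(y[i] != label for i in idx):
            idx.append(rng.choice([i for i in range(n) if y[i] == label]))
    by_class = {}
    for i in idx:
        by_class.setdefault(y[i], []).append(i)
    target = max(len(members) for members in by_class.values())
    for members in by_class.values():
        # Random oversampling: resample the class's own rows with replacement.
        idx.extend(rng.choice(members) for _ in range(target - len(members)))
    return [X[i] for i in idx], [y[i] for i in idx]


def filter_score(column, y):
    """Stand-in filter: spread of class means over mean within-class variance."""
    groups = {}
    for value, label in zip(column, y):
        groups.setdefault(label, []).append(value)
    between = pvariance([mean(g) for g in groups.values()])
    within = mean([pvariance(g) for g in groups.values()])
    return between / (within + 1e-12)


def select_features(X, y, k):
    """Return the indices of the k highest-scoring features."""
    scores = [filter_score([row[j] for row in X], y) for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


def fit_centroids(X, y, features):
    """Stand-in base classifier: class centroids on the selected features only."""
    centroids = {}
    for label in set(y):
        rows = [row for row, lab in zip(X, y) if lab == label]
        centroids[label] = [mean([r[j] for r in rows]) for j in features]
    return centroids


def predict_one(model, x):
    features, centroids = model

    def dist(centroid):
        return sum((x[j] - c) ** 2 for j, c in zip(features, centroid))

    return min(centroids, key=lambda label: dist(centroids[label]))


def fit_ensemble(X, y, n_models=15, k=2, seed=0):
    """One (selected features, base classifier) pair per bootstrapped dataset."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        Xb, yb = bootstrap_oversample(X, y, rng)
        feats = select_features(Xb, yb, k)
        models.append((feats, fit_centroids(Xb, yb, feats)))
    return models


def predict(models, x):
    """Majority vote over the base classifiers' predictions."""
    votes = Counter(predict_one(m, x) for m in models)
    return votes.most_common(1)[0][0]


def feature_frequency(models):
    """How often each feature index was selected across the sub-datasets."""
    return Counter(j for feats, _ in models for j in feats)


# Toy imbalanced data (12 vs 4 samples): features 0 and 1 are informative, 2-4 are noise.
rng = random.Random(1)
X = [[rng.gauss(0, 0.5), rng.gauss(0, 0.5)] + [rng.gauss(0, 1) for _ in range(3)]
     for _ in range(12)]
X += [[rng.gauss(3, 0.5), rng.gauss(3, 0.5)] + [rng.gauss(0, 1) for _ in range(3)]
      for _ in range(4)]
y = [0] * 12 + [1] * 4

models = fit_ensemble(X, y)
print(predict(models, [3.0, 3.0, 0.0, 0.0, 0.0]), feature_frequency(models).most_common(2))
```

Counting how often each feature index is selected across the sub-datasets mirrors the abstract's idea of ranking attributes by frequency of occurrence; on real microarray data the counts would be over gene identifiers rather than column indices.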

Das J, Barman Mandal S. Identification of Homo sapiens cancer classes based on fusion of hidden gene features. J Biomed Inform 2020;110:103555. [PMID: 32916304 DOI: 10.1016/j.jbi.2020.103555] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2020] [Revised: 07/08/2020] [Accepted: 09/02/2020] [Indexed: 10/23/2022]

A mapping study of ensemble classification methods in lung cancer decision support systems. Med Biol Eng Comput 2020;58:2177-2193. [PMID: 32621068 DOI: 10.1007/s11517-020-02223-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 01/05/2020] [Accepted: 06/25/2020] [Indexed: 10/23/2022]

A concise peephole model based transfer learning method for small sample temporal feature-based data-driven quality analysis. Knowl Based Syst 2020. [DOI: 10.1016/j.knosys.2020.105665] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 11/24/2022]

Menaga D, Revathi S. An empirical study of cancer classification techniques based on the neural networks. Biomed Eng Appl Basis Commun 2020. [DOI: 10.4015/s1016237220500131] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 12/09/2022]

Hosni M, Carrillo-de-Gea JM, Idri A, Fernandez-Aleman JL, Garcia-Berna JA. Using ensemble classification methods in lung cancer disease. Annu Int Conf IEEE Eng Med Biol Soc 2020;2019:1367-1370. [PMID: 31946147 DOI: 10.1109/embc.2019.8857435] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/06/2022]

Wang W, Xie G, Ren Z, Xie T, Li J. Gene Selection for the Discrimination of Colorectal Cancer. Curr Mol Med 2019;20:415-428. [PMID: 31746296 DOI: 10.2174/1566524019666191119105209] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 08/20/2019] [Revised: 10/29/2019] [Accepted: 10/31/2019] [Indexed: 12/15/2022]

Hosni M, Abnane I, Idri A, Carrillo de Gea JM, Fernández Alemán JL. Reviewing ensemble classification methods in breast cancer. Comput Methods Programs Biomed 2019;177:89-112. [PMID: 31319964 DOI: 10.1016/j.cmpb.2019.05.019] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Received: 02/07/2019] [Revised: 05/16/2019] [Accepted: 05/18/2019] [Indexed: 05/09/2023]

Abstract

CONTEXT

Ensemble methods consist of combining more than one single technique to solve the same task. This approach was designed to overcome the weaknesses of single techniques and consolidate their strengths. Ensemble methods are now widely used to carry out prediction tasks (e.g. classification and regression) in several fields, including that of bioinformatics. Researchers have particularly begun to employ ensemble techniques to improve research into breast cancer, as this is the most frequent type of cancer and accounts for most of the deaths among women.

OBJECTIVE AND METHOD

The goal of this study is to analyse the state of the art in ensemble classification methods when applied to breast cancer as regards 9 aspects: publication venues, medical tasks tackled, empirical and research types adopted, types of ensembles proposed, single techniques used to construct the ensembles, validation framework adopted to evaluate the proposed ensembles, tools used to build the ensembles, and optimization methods used for the single techniques. This paper was undertaken as a systematic mapping study.

RESULTS

A total of 193 papers published from the year 2000 onwards were selected from four online databases: IEEE Xplore, ACM Digital Library, Scopus and PubMed. Of the six medical tasks that exist, diagnosis was the most frequently researched, and the experiment-based empirical type and evaluation-based research type were the dominant approaches in the selected studies. Homogeneous ensembles were the most widely used to perform the classification task. With regard to single techniques, decision trees, support vector machines and artificial neural networks were the most frequently adopted to build ensemble classifiers. In the evaluation framework, the Wisconsin Breast Cancer dataset was the most frequently used for experiments, and the most common validation method was k-fold cross-validation. Several tools are available for experiments with ensemble classification methods, such as Weka and R. Few researchers optimised the single techniques composing their ensembles; among those who did, grid search was the method most frequently used to tune the parameter settings of a single classifier.

CONCLUSION

This paper reports an in-depth study of the application of ensemble methods as regards breast cancer. Our results show that there are several gaps and issues and we, therefore, provide researchers in the field of breast cancer research with recommendations. Moreover, after analysing the papers found in this systematic mapping study, we discovered that the majority report positive results concerning the accuracy of ensemble classifiers when compared to the single classifiers. In order to aggregate the evidence reported in literature, it will, therefore, be necessary to perform a systematic literature review and meta-analysis in which an in-depth analysis could be conducted so as to confirm the superiority of ensemble classifiers over the classical techniques.

Boosted neural network ensemble classification for lung cancer disease diagnosis. Appl Soft Comput 2019. [DOI: 10.1016/j.asoc.2019.04.031] [Citation(s) in RCA: 77] [Impact Index Per Article: 15.4] [Indexed: 11/21/2022]

Li J, Dong W, Meng D. Grouped Gene Selection of Cancer via Adaptive Sparse Group Lasso Based on Conditional Mutual Information. IEEE/ACM Trans Comput Biol Bioinform 2018;15:2028-2038. [PMID: 29028206 DOI: 10.1109/tcbb.2017.2761871] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Indexed: 06/07/2023]

Fronto-parietal numerical networks in relation with early numeracy in young children. Brain Struct Funct 2018;224:263-275. [PMID: 30315414 DOI: 10.1007/s00429-018-1774-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 03/29/2018] [Accepted: 10/05/2018] [Indexed: 10/28/2022]

Ye Q, Zhao H, Li Z, Yang X, Gao S, Yin T, Ye N. L1-Norm Distance Minimization-Based Fast Robust Twin Support Vector $k$-Plane Clustering. IEEE Trans Neural Netw Learn Syst 2018;29:4494-4503. [PMID: 28981431 DOI: 10.1109/tnnls.2017.2749428] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Indexed: 06/07/2023]

Piątek Ł, Grzymała-Busse JW. LEMRG: Decision Rule Generation Algorithm for Mining MicroRNA Expression Data. Adv Exp Med Biol 2017;1028:105-137. [DOI: 10.1007/978-981-10-6041-0_7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/19/2022]

Random Subspace Aggregation for Cancer Prediction with Gene Expression Profiles. Biomed Res Int 2016;2016:4596326. [PMID: 27999797 PMCID: PMC5143691 DOI: 10.1155/2016/4596326] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 07/03/2016] [Revised: 10/08/2016] [Accepted: 10/20/2016] [Indexed: 12/23/2022]

Ang JC, Mirzal A, Haron H, Hamed HNA. Supervised, Unsupervised, and Semi-Supervised Feature Selection: A Review on Gene Selection. IEEE/ACM Trans Comput Biol Bioinform 2016;13:971-989. [PMID: 26390495 DOI: 10.1109/tcbb.2015.2478454] [Citation(s) in RCA: 185] [Impact Index Per Article: 23.1] [Indexed: 05/04/2023]

Wang L, Wang Y, Chang Q. Feature selection methods for big data bioinformatics: A survey from the search perspective. Methods 2016;111:21-31. [PMID: 27592382 DOI: 10.1016/j.ymeth.2016.08.014] [Citation(s) in RCA: 110] [Impact Index Per Article: 13.8] [Received: 05/15/2016] [Revised: 08/25/2016] [Accepted: 08/30/2016] [Indexed: 11/26/2022]

Garro BA, Rodríguez K, Vázquez RA. Classification of DNA microarrays using artificial neural networks and ABC algorithm. Appl Soft Comput 2016. [DOI: 10.1016/j.asoc.2015.10.002] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.9] [Indexed: 10/22/2022]

Yu Z, Li L, Liu J, Han G. Hybrid adaptive classifier ensemble. IEEE Trans Cybern 2015;45:177-190. [PMID: 24860045 DOI: 10.1109/tcyb.2014.2322195] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Indexed: 06/03/2023]

Majid A, Ali S, Iqbal M, Kausar N. Prediction of human breast and colon cancers from imbalanced data using nearest neighbor and support vector machines. Comput Methods Programs Biomed 2014;113:792-808. [PMID: 24472367 DOI: 10.1016/j.cmpb.2014.01.001] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.2] [Received: 06/06/2013] [Revised: 12/29/2013] [Accepted: 01/03/2014] [Indexed: 06/03/2023]

Recognition of multiple imbalanced cancer types based on DNA microarray data using ensemble classifiers. Biomed Res Int 2013;2013:239628. [PMID: 24078908 PMCID: PMC3770038 DOI: 10.1155/2013/239628] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Received: 04/07/2013] [Revised: 07/08/2013] [Accepted: 07/17/2013] [Indexed: 11/24/2022]

Sarkar A, Maulik U. Cancer Gene Expression Data Analysis Using Rough Based Symmetrical Clustering. Bioinformatics 2013. [DOI: 10.4018/978-1-4666-3604-0.ch085] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]

Wang N, Su L, Tang J, Ye A. Informative gene selection using the Algebraic Connectivity Strength of Point and Scoring Criteria. Chin Sci Bull 2013. [DOI: 10.1007/s11434-012-5421-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]

Shao YH, Deng NY, Yang ZM, Chen WJ, Wang Z. Probabilistic outputs for twin support vector machines. Knowl Based Syst 2012. [DOI: 10.1016/j.knosys.2012.04.006] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Indexed: 10/28/2022]

Nanni L, Brahnam S, Lumini A. Combining multiple approaches for gene microarray classification. Bioinformatics 2012;28:1151-7. [DOI: 10.1093/bioinformatics/bts108] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.0] [Indexed: 11/13/2022]

Gene Expression Data Classification by VVRKFA. Procedia Technol 2012. [DOI: 10.1016/j.protcy.2012.05.050] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 11/22/2022]

Ghorai S, Mukherjee A, Dutta PK. Discriminant Analysis for Fast Multiclass Data Classification Through Regularized Kernel Function Approximation. IEEE Trans Neural Netw 2010;21:1020-9. [DOI: 10.1109/tnn.2010.2046646] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.9] [Indexed: 11/10/2022]