1
Li L, Yang J, Por LY, Khan MS, Hamdaoui R, Hussain L, Iqbal Z, Rotaru IM, Dobrotă D, Aldrdery M, Omar A. Enhancing lung cancer detection through hybrid features and machine learning hyperparameters optimization techniques. Heliyon 2024; 10:e26192. [PMID: 38404820 PMCID: PMC10884486 DOI: 10.1016/j.heliyon.2024.e26192] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Revised: 01/30/2024] [Accepted: 02/08/2024] [Indexed: 02/27/2024] Open
Abstract
Machine learning offers significant potential for lung cancer detection, enabling early diagnosis and potentially improving patient outcomes. Feature extraction remains a crucial challenge in this domain, and combining the most relevant features can further enhance detection accuracy. This study employed a hybrid feature extraction approach that combines gray-level co-occurrence matrix (GLCM), Haralick, and autoencoder features. These features were subsequently fed into supervised machine learning methods. Support Vector Machine (SVM) with a radial basis function (RBF) kernel and SVM Gaussian achieved perfect performance measures, while SVM polynomial produced an accuracy of 99.89%, when using GLCM with autoencoder, Haralick, and autoencoder features. When using GLCM with Haralick features, SVM Gaussian achieved an accuracy of 99.56% and SVM RBF an accuracy of 99.35%. These results demonstrate the potential of the proposed approach for developing improved diagnostic and prognostic lung cancer treatment planning and decision-making systems.
Affiliation(s)
- Liangyu Li
  - Center for Software Technology and Management, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia, 43600, Bangi, Selangor, Malaysia
  - Health Informatics Laboratory, Cancer Research Institute, Chifeng Cancer Hospital (Second Affiliated Hospital of Chifeng University), Medical Department, Chifeng University, Chifeng City, Inner Mongolia Autonomous Region, 024000, China
- Jing Yang
  - Department of Computer System and Technology, Faculty of Computer Science and Information Technology, Universiti Malaya, 50603, Kuala Lumpur, Malaysia
- Lip Yee Por
  - Department of Computer System and Technology, Faculty of Computer Science and Information Technology, Universiti Malaya, 50603, Kuala Lumpur, Malaysia
- Mohammad Shahbaz Khan
  - Children's National Hospital, 111 Michigan Ave NW, Washington, DC, 20010, United States
- Rim Hamdaoui
  - Department of Computer Science, College of Science and Human Studies Dawadmi, Shaqra University, Shaqra, Riyadh, Saudi Arabia
- Lal Hussain
  - Department of Computer Science and Information Technology, King Abdullah Campus Chatter Kalas, University of Azad Jammu and Kashmir, Muzaffarabad, 13100, Azad Kashmir, Pakistan
  - Department of Computer Science and Information Technology, Neelum Campus, University of Azad Jammu and Kashmir, Athmuqam, 13230, Azad Kashmir, Pakistan
- Zahoor Iqbal
  - School of Computer Science and Technology, Zhejiang Normal University, Jinhua, 321004, China
- Ionela Magdalena Rotaru
  - Department of Industrial Engineering and Management, Lucian Blaga University of Sibiu, Bulevardul Victoriei 10, Sibiu, 550024, Romania
- Dan Dobrotă
  - Faculty of Engineering, Lucian Blaga University of Sibiu, Bulevardul Victoriei 10, Sibiu, 550024, Romania
- Moutaz Aldrdery
  - Department of Chemical Engineering, College of Engineering, King Khalid University, Abha, 61411, Saudi Arabia
- Abdulfattah Omar
  - Department of English, College of Science & Humanities, Prince Sattam Bin Abdulaziz University, Saudi Arabia
2
Faruqui N, Yousuf MA, Kateb FA, Abdul Hamid M, Monowar MM. Healthcare As a Service (HAAS): CNN-based cloud computing model for ubiquitous access to lung cancer diagnosis. Heliyon 2023; 9:e21520. [PMID: 37942151 PMCID: PMC10628703 DOI: 10.1016/j.heliyon.2023.e21520] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2023] [Revised: 09/27/2023] [Accepted: 10/23/2023] [Indexed: 11/10/2023] Open
Abstract
The field of automated lung cancer diagnosis using Computed Tomography (CT) scans has been significantly advanced by the precise predictions offered by Convolutional Neural Network (CNN)-based classifiers. Critical areas of study include improving image quality, optimizing learning algorithms, and enhancing diagnostic accuracy. To facilitate a seamless transition from research laboratories to real-world applications, it is crucial to improve the technology's usability, yet current state-of-the-art research frequently neglects this factor and the need to expedite the transition. This paper introduces Healthcare-As-A-Service (HAAS), a concept inspired by Software-As-A-Service (SAAS) within the cloud computing paradigm. As a comprehensive lung cancer diagnosis service system, HAAS has the potential to reduce lung cancer mortality rates by providing early diagnosis opportunities to everyone. We present HAASNet, a cloud-compatible CNN with an accuracy of 96.07%. By integrating HAASNet predictions with physio-symptomatic data from the Internet of Medical Things (IoMT), the proposed HAAS model generates accurate and reliable lung cancer diagnosis reports. Leveraging IoMT and cloud technology, the proposed service is globally accessible via the Internet, transcending geographic boundaries. The service achieves average precision, recall, and F1-scores of 96.47%, 95.39%, and 94.81%, respectively.
Affiliation(s)
- Nuruzzaman Faruqui
  - Institute of Information Technology (IIT), Jahangirnagar University, Savar, Dhaka, 1342, Bangladesh
  - Department of Software Engineering, Daffodil International University, Daffodil Smart City, Birulia, Dhaka, 1216, Bangladesh
- Mohammad Abu Yousuf
  - Institute of Information Technology (IIT), Jahangirnagar University, Savar, Dhaka, 1342, Bangladesh
- Faris A. Kateb
  - Department of Information Technology, Faculty of Computing & Information Technology, King Abdulaziz University, Jeddah, 21589, Saudi Arabia
- Md. Abdul Hamid
  - Department of Information Technology, Faculty of Computing & Information Technology, King Abdulaziz University, Jeddah, 21589, Saudi Arabia
- Muhammad Mostafa Monowar
  - Department of Information Technology, Faculty of Computing & Information Technology, King Abdulaziz University, Jeddah, 21589, Saudi Arabia
3
Javed MA, Bin Liaqat H, Meraj T, Alotaibi A, Alshammari M. Identification and Classification of Lungs Focal Opacity Using CNN Segmentation and Optimal Feature Selection. Computational Intelligence and Neuroscience 2023; 2023:6357252. [PMID: 37538561 PMCID: PMC10396675 DOI: 10.1155/2023/6357252] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2022] [Revised: 09/07/2022] [Accepted: 09/26/2022] [Indexed: 08/05/2023]
Abstract
Lung cancer is one of the deadliest cancers worldwide, with a high mortality rate in comparison to other cancers. A lung cancer patient's survival probability in late stages is very low. However, if the disease can be detected early, the patient survival rate can be improved. Diagnosing lung cancer early is a complicated task because lung nodules are visually similar to the trachea, vessels, and other surrounding tissues, which leads to misclassification of lung nodules. Therefore, correct identification and classification of nodules is required. Previous studies have used noisy features, which compromises their results. To address this problem, a predictive model has been proposed to accurately detect and classify lung nodules. In the proposed framework, semantic segmentation is first performed to identify the nodules in images from the Lung Image Database Consortium (LIDC) dataset. Optimal features for classification, including histograms of oriented gradients (HOG), local binary patterns (LBP), and geometric features, are extracted after segmentation of the nodules. The results showed that support vector machines performed better in identifying the nodules than other classifiers, achieving the highest accuracy of 97.8% with a sensitivity of 100%, a specificity of 93%, and a false positive rate of 6.7%.
Affiliation(s)
- Hannan Bin Liaqat
  - Department of Information Technology, Division of Science and Technology, University of Education, Township Campus Lahore, Lahore, Pakistan
- Talha Meraj
  - Department of Computer Science, COMSATS University Islamabad, Wah Campus, Wah Cantt, Rawalpindi 47040, Pakistan
- Aziz Alotaibi
  - Department of Computer Science, College of Computers and Information Technology, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia
- Majid Alshammari
  - Department of Information Technology, College of Computers and Information Technology, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia
4
Carmo D, Ribeiro J, Dertkigil S, Appenzeller S, Lotufo R, Rittner L. A Systematic Review of Automated Segmentation Methods and Public Datasets for the Lung and its Lobes and Findings on Computed Tomography Images. Yearb Med Inform 2022; 31:277-295. [PMID: 36463886 PMCID: PMC9719778 DOI: 10.1055/s-0042-1742517] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 12/05/2022] Open
Abstract
OBJECTIVES: Automated computational segmentation of the lung and its lobes and findings in X-ray-based computed tomography (CT) images is a challenging problem with important applications, including medical research, surgical planning, and diagnostic decision support. With the increase in large imaging cohorts and the need for fast and robust evaluation of normal and abnormal lungs and their lobes, several authors have proposed automated methods for lung assessment on CT images. In this paper we intend to provide a comprehensive summarization of these methods.
METHODS: We used a systematic approach to perform an extensive review of automated lung segmentation methods. We chose Scopus, PubMed, and Scopus to conduct our review and included methods that perform segmentation of the lung parenchyma, lobes or internal disease related findings. The review was not limited by date, but rather by only including methods providing quantitative evaluation.
RESULTS: We organized and classified all 234 included articles into various categories according to methodological similarities among them. We provide summarizations of quantitative evaluations, public datasets, evaluation metrics, and overall statistics indicating recent research directions of the field.
CONCLUSIONS: We noted the rise of data-driven models in the last decade, especially due to the deep learning trend, increasing the demand for high-quality data annotation. This has instigated an increase of semi-supervised and uncertainty guided works that try to be less dependent on human annotation. In addition, the question of how to evaluate the robustness of data-driven methods remains open, given that evaluations derived from specific datasets are not general.
Affiliation(s)
- Diedre Carmo
  - School of Electrical and Computer Engineering, University of Campinas, Brazil
- Jean Ribeiro
  - School of Electrical and Computer Engineering, University of Campinas, Brazil
- Roberto Lotufo
  - School of Electrical and Computer Engineering, University of Campinas, Brazil
- Leticia Rittner
  - School of Electrical and Computer Engineering, University of Campinas, Brazil. Correspondence to: Leticia Rittner, Av. Albert Einstein, 400, Cidade Universitária Zeferino Vaz, Barão Geraldo - Campinas - SP 13083-852, Brazil
5
Lung Cancer Prediction Using Robust Machine Learning and Image Enhancement Methods on Extracted Gray-Level Co-Occurrence Matrix Features. Applied Sciences-Basel 2022. [DOI: 10.3390/app12136517] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 01/27/2023]
Abstract
In the present era, cancer is the leading cause of death in both men and women worldwide, with low survival rates due to inefficient diagnostic techniques. Recently, researchers have been devising methods to improve prediction performance. In medical image processing, image enhancement can further improve prediction performance. This study aimed to improve lung cancer image quality by employing various image enhancement methods, such as image adjustment, gamma correction, contrast stretching, thresholding, and histogram equalization. We extracted gray-level co-occurrence matrix (GLCM) features from the enhanced images, and applied and optimized robust machine learning classification algorithms, such as the decision tree (DT), naïve Bayes, and support vector machine (SVM) with Gaussian, radial basis function (RBF), and polynomial kernels. Without image enhancement, the highest performance was obtained using SVM with polynomial and RBF kernels, with an accuracy of 99.89%. The image enhancement methods, such as image adjustment, contrast stretching at threshold (0.02, 0.98), and gamma correction at a gamma value of 0.9, improved the prediction performance of our analysis on 945 images provided by the Lung Cancer Alliance MRI dataset, which yielded 100% accuracy and an AUC of 1.00 using SVM with RBF and polynomial kernels. The results revealed that the proposed methodology can be very helpful in improving lung cancer prediction for further diagnosis and prognosis by expert radiologists to decrease the mortality rate.
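The two enhancement settings quoted in this abstract, contrast stretching at (0.02, 0.98) and gamma correction at 0.9, can be illustrated with a minimal pure-Python sketch. It assumes intensities are already scaled to [0, 1] and that the thresholds are quantiles of the intensity distribution; the function names and the nearest-rank quantile rule are illustrative choices, not the authors' implementation.

```python
def gamma_correct(pixels, gamma=0.9):
    """Apply power-law (gamma) correction to intensities scaled to [0, 1]."""
    return [p ** gamma for p in pixels]


def contrast_stretch(pixels, low=0.02, high=0.98):
    """Linearly rescale so the low/high quantiles map to 0 and 1, clipping the rest."""
    ordered = sorted(pixels)
    last = len(ordered) - 1
    # Nearest-rank quantiles: a simplification assumed for this sketch.
    lo = ordered[round(low * last)]
    hi = ordered[round(high * last)]
    if hi == lo:
        return list(pixels)  # flat image: nothing to stretch
    return [min(1.0, max(0.0, (p - lo) / (hi - lo))) for p in pixels]


# Hypothetical test signal: a linear ramp of 101 intensities from 0.0 to 1.0.
ramp = [i / 100 for i in range(101)]
stretched = contrast_stretch(ramp)
brightened = gamma_correct(ramp)
```

With a gamma below 1, mid-range intensities are raised slightly while 0 and 1 stay fixed; the stretch saturates the darkest and brightest 2% of values.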
6
Ensemble Learning Framework with GLCM Texture Extraction for Early Detection of Lung Cancer on CT Images. Computational and Mathematical Methods in Medicine 2022; 2022:2733965. [PMID: 35693266 PMCID: PMC9184160 DOI: 10.1155/2022/2733965] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/08/2022] [Revised: 04/29/2022] [Accepted: 05/10/2022] [Indexed: 11/18/2022]
Abstract
Lung cancer has emerged as a major cause of death among all demographics worldwide, largely caused by a proliferation of smoking habits. However, early detection and diagnosis of lung cancer through technological improvements can save the lives of millions of individuals affected globally. Computerized tomography (CT) scan imaging is a proven and popular technique in the medical field, but diagnosing cancer with only CT scans is a difficult task even for doctors and experts. This is why computer-assisted diagnosis has revolutionized disease diagnosis, especially cancer detection. This study looks at 20 CT scan images of lungs. In a preprocessing step, we chose the best filter to apply to medical CT images from among median, Gaussian, 2D convolution, and mean filters, and established that the median filter is the most appropriate. Next, we improved image contrast by applying adaptive histogram equalization. Finally, the preprocessed, higher-quality image was subjected to two optimization algorithms, fuzzy c-means and k-means clustering, and their performance was compared. Fuzzy c-means showed the highest accuracy of 98%. Features were extracted using the Gray Level Co-occurrence Matrix (GLCM). For classification, three algorithms were compared: bagging, gradient boosting, and an ensemble (SVM, MLPNN, DT, logistic regression, and KNN). Gradient boosting performed the best among these three, with an accuracy of 90.9%.
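As a rough illustration of the GLCM texture extraction this abstract mentions, the sketch below builds a normalized co-occurrence matrix for one pixel offset and derives a contrast feature from it. The offset, the symmetric counting, the number of gray levels, and the function names are assumptions made for this example; the abstract does not state which settings or features the authors used.

```python
def glcm(image, levels, dx=1, dy=0, symmetric=True):
    """Normalized gray-level co-occurrence matrix for the pixel offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                if symmetric:
                    counts[j][i] += 1  # also count the pair in reverse order
    total = sum(map(sum, counts))
    if total == 0:
        raise ValueError("no pixel pairs exist for this offset")
    return [[v / total for v in row] for row in counts]


def glcm_contrast(p):
    """Contrast: co-occurrence probabilities weighted by squared gray-level difference."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


# Hypothetical 4x4 image already quantized to 4 gray levels (0..3).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(image, levels=4)
contrast = glcm_contrast(p)
```

For this toy image the horizontal, symmetric matrix holds 24 pair counts and the contrast evaluates to 14/24, about 0.583. Other Haralick-style features such as energy or homogeneity are computed from the same matrix `p`.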
7
Lung Segmentation in CT Images: A Residual U-Net Approach on a Cross-Cohort Dataset. Applied Sciences-Basel 2022. [DOI: 10.3390/app12041959] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 12/10/2022]
Abstract
Lung cancer is one of the most common causes of cancer-related mortality, and since the majority of cases are diagnosed when the tumor is in an advanced stage, the 5-year survival rate is dismally low. Nevertheless, the chances of survival can increase if the tumor is identified early on, which can be achieved through screening with computed tomography (CT). The clinical evaluation of CT images is a very time-consuming task, and computer-aided diagnosis systems can help reduce this burden. Segmentation of the lungs is usually the first step in automatic image analysis models of the thorax. However, this task is very challenging since the lungs present high variability in shape and size. Moreover, the co-occurrence of other respiratory comorbidities alongside lung cancer is frequent, and each pathology can present its own range of CT imaging appearances. This work investigated the development of a deep learning model whose architecture combines two structures, a U-Net and a ResNet34. The proposed model was designed on a cross-cohort dataset and achieved a mean Dice similarity coefficient (DSC) higher than 0.93 for the 4 different cohorts tested. The segmentation masks were qualitatively evaluated by two experienced radiologists to identify the main limitations of the developed model, despite the good overall performance obtained. The performance per pathology was assessed, and the results confirmed a small degradation for consolidation and pneumocystis pneumonia cases, with a DSC of 0.9015 ± 0.2140 and 0.8750 ± 0.1290, respectively. This work represents a relevant assessment of the lung segmentation model, taking into consideration the pathological cases that can be found in clinical routine, since a global assessment could not detail the fragilities of the model.
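The Dice similarity coefficient reported in this abstract measures the overlap between a predicted mask A and a reference mask B as 2|A ∩ B| / (|A| + |B|). A minimal sketch follows, assuming flattened binary masks; the function name and the convention of returning 1.0 for two empty masks are illustrative choices, not taken from the paper.

```python
def dice_coefficient(pred, truth):
    """Dice similarity coefficient between two flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical six-pixel masks: three foreground pixels each, two in common.
pred = [0, 1, 1, 1, 0, 0]
truth = [0, 1, 1, 0, 0, 1]
score = dice_coefficient(pred, truth)
```

Here the masks share two foreground pixels out of three each, so the score is 2·2 / (3 + 3) = 2/3. A mean DSC above 0.93, as reported, corresponds to near-complete overlap on average.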
8