Malik H, Anees T. Multi-modal deep learning methods for classification of chest diseases using different medical imaging and cough sounds.
PLoS One 2024;
19:e0296352. [PMID:
38470893 DOI:
10.1371/journal.pone.0296352]
[Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/16/2023] [Accepted: 12/11/2023] [Indexed: 03/14/2024] Open
Abstract
Chest disease refers to a wide range of conditions affecting the lungs, such as COVID-19, lung cancer (LC), consolidation lung (COL), and many more. When diagnosing chest disorders medical professionals may be thrown off by the overlapping symptoms (such as fever, cough, sore throat, etc.). Additionally, researchers and medical professionals make use of chest X-rays (CXR), cough sounds, and computed tomography (CT) scans to diagnose chest disorders. The present study aims to classify the nine different conditions of chest disorders, including COVID-19, LC, COL, atelectasis (ATE), tuberculosis (TB), pneumothorax (PNEUTH), edema (EDE), pneumonia (PNEU). Thus, we suggested four novel convolutional neural network (CNN) models that train distinct image-level representations for nine different chest disease classifications by extracting features from images. Furthermore, the proposed CNN employed several new approaches such as a max-pooling layer, batch normalization layers (BANL), dropout, rank-based average pooling (RBAP), and multiple-way data generation (MWDG). The scalogram method is utilized to transform the sounds of coughing into a visual representation. Before beginning to train the model that has been developed, the SMOTE approach is used to calibrate the CXR and CT scans as well as the cough sound images (CSI) of nine different chest disorders. The CXR, CT scan, and CSI used for training and evaluating the proposed model come from 24 publicly available benchmark chest illness datasets. The classification performance of the proposed model is compared with that of seven baseline models, namely Vgg-19, ResNet-101, ResNet-50, DenseNet-121, EfficientNetB0, DenseNet-201, and Inception-V3, in addition to state-of-the-art (SOTA) classifiers. The effectiveness of the proposed model is further demonstrated by the results of the ablation experiments. The proposed model was successful in achieving an accuracy of 99.01%, making it superior to both the baseline models and the SOTA classifiers. As a result, the proposed approach is capable of offering significant support to radiologists and other medical professionals.
Collapse