1
Parveen Rahamathulla M, Sam Emmanuel WR, Bindhu A, Mustaq Ahmed M. YOLOv8's advancements in tuberculosis identification from chest images. Front Big Data 2024; 7:1401981. [PMID: 38994120 PMCID: PMC11236731 DOI: 10.3389/fdata.2024.1401981] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2024] [Accepted: 05/29/2024] [Indexed: 07/13/2024]
Abstract
Tuberculosis (TB) is a chronic infectious disease that can be fatal. Many people are affected by TB because of inaccurate or late diagnosis and inadequate treatment. Early detection of TB is important to protect people from the severity of the disease and its consequences. Traditionally, TB has been identified by manual reading of chest X-rays and CT scans. Nevertheless, these approaches are time-consuming and often fail to achieve optimal results. To resolve this problem, several researchers have focused on automated TB prediction. However, existing methods suffer from limited accuracy, overfitting, and low speed. To improve TB prediction, the proposed research employs the Selection Focal Fusion (SFF) block in the You Only Look Once v8 (YOLOv8, Ultralytics software company, Los Angeles, United States) object detection model with an attention mechanism, using the Kaggle TBX-11k dataset. YOLOv8 is used for its ability to detect multiple objects in a single pass. However, it struggles with small objects and with fine-grained classification. To address this problem, the proposed research incorporates the SFF technique to improve detection performance and decrease the missed-detection rate for small objects. The efficacy of the proposed framework is evaluated using performance metrics such as recall, precision, F1 score, and mean Average Precision (mAP). Furthermore, comparison with existing models demonstrates the efficiency of the proposed approach. The present research is intended to contribute to the medical field and assist radiologists in identifying tuberculosis using the YOLOv8 model.
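The abstract names recall, precision, and F1 score as evaluation metrics. As a rough illustration only, the sketch below computes these three from detection counts. The function name `detection_scores` and the counts are hypothetical and are not taken from the paper; mAP is not computed here because it additionally requires ranked detections and an overlap threshold.

```python
def detection_scores(tp: int, fp: int, fn: int) -> dict:
    """Precision, recall and F1 from true-positive, false-positive
    and false-negative counts (hypothetical helper, not the paper's code)."""
    # Precision: fraction of predicted detections that are correct.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: fraction of ground-truth objects that were detected.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Made-up counts, for illustration only.
scores = detection_scores(tp=8, fp=2, fn=2)
```

With these made-up counts, precision and recall are both 8/10, so the F1 score is also approximately 0.8.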
Affiliation(s)
- Mohamudha Parveen Rahamathulla
  - Department of Basic Medical Science, College of Medicine, Prince Sattam bin Abdulaziz University, Al-Kharj, Saudi Arabia
- W. R. Sam Emmanuel
  - Department of Computer Science and Research Centre, Nesamony Memorial Christian College, Marthandam, Tamil Nadu, India
- A. Bindhu
  - Department of Computer Science, Infant Jesus College of Arts and Science for Women, Mulagumoodu, Tamil Nadu, India
- Mohamed Mustaq Ahmed
  - Department of Information Technology, The New College, Chennai, Tamil Nadu, India
2
Xu W, Bao X, Lou X, Liu X, Chen Y, Zhao X, Zhang C, Pan C, Liu W, Liu F. Feature fusion method for pulmonary tuberculosis patient detection based on cough sound. PLoS One 2024; 19:e0302651. [PMID: 38743758 PMCID: PMC11093322 DOI: 10.1371/journal.pone.0302651] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2023] [Accepted: 04/08/2024] [Indexed: 05/16/2024]
Abstract
Since the COVID-19 pandemic, cough sounds have been widely used for screening purposes. Intelligent analysis techniques have proven to be effective in detecting respiratory diseases. In 2021, there were up to 10 million TB-infected patients worldwide, with an annual growth rate of 4.5%. Most of the patients were from economically underdeveloped regions and countries. The PPD test, a common screening method in the community, has a sensitivity of as low as 77%. Although IGRA and Xpert MTB/RIF offer high specificity and sensitivity, their cost makes them less accessible. In this study, we proposed a feature fusion model-based cough sound classification method for primary TB screening in communities. Data were collected from hospitals using smartphones, including 230 cough sounds from 70 patients with TB and 226 cough sounds from 74 healthy subjects. We employed Bi-LSTM and Bi-GRU recurrent neural networks to analyze five traditional feature sets including the Mel frequency cepstrum coefficient (MFCC), zero-crossing rate (ZCR), short-time energy, root mean square, and chroma_cens. The incorporation of features extracted from the speech spectrogram by 2D convolution training into the Bi-LSTM model enhanced the classification results. With traditional features, the best TB patient detection result was achieved with the Bi-LSTM model, with 93.99% accuracy, 93.93% specificity, and 92.39% sensitivity. When combined with a speech spectrogram, the classification results showed 96.33% accuracy, 94.99% specificity, and 98.13% sensitivity. Our findings underscore that traditional features and deep features have good complementarity when fused using Bi-LSTM modelling, which outperforms existing PPD detection methods in terms of both efficiency and accuracy.
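Three of the traditional features listed in the abstract (zero-crossing rate, short-time energy, and root mean square) can be computed directly from one frame of audio samples. The sketch below is a minimal illustration under common textbook definitions; the function name `frame_features` and the toy frame are assumptions, and the paper's exact framing, windowing, and normalisation are not stated in the abstract and may differ.

```python
import math


def frame_features(frame):
    """Return (zero-crossing rate, short-time energy, RMS) for one frame.

    Assumes a frame of at least two samples. Definitions vary between
    toolkits (for example in how the crossing count is normalised).
    """
    n = len(frame)
    # Count sign changes between consecutive samples.
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    zcr = crossings / (n - 1)
    # Short-time energy: sum of squared amplitudes in the frame.
    energy = sum(x * x for x in frame)
    # Root mean square amplitude of the frame.
    rms = math.sqrt(energy / n)
    return zcr, energy, rms


# Toy frame that alternates sign on every sample.
zcr, energy, rms = frame_features([1.0, -1.0, 1.0, -1.0])
```

In a full pipeline these per-frame values would be stacked over time into sequences and fed, together with MFCC and chroma features, to a recurrent model; that part is not sketched here.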
Affiliation(s)
- Wenlong Xu
  - College of Information Engineering, China Jiliang University, Hangzhou, Zhejiang, China
- Xiaofan Bao
  - College of Information Engineering, China Jiliang University, Hangzhou, Zhejiang, China
- Xiaomin Lou
  - Hangzhou Red Cross Hospital, Hangzhou, Zhejiang, China
- Xiaofang Liu
  - College of Information Engineering, China Jiliang University, Hangzhou, Zhejiang, China
- Yuanyuan Chen
  - Hangzhou Red Cross Hospital, Hangzhou, Zhejiang, China
- Chenlu Zhang
  - Hangzhou Red Cross Hospital, Hangzhou, Zhejiang, China
- Chen Pan
  - College of Information Engineering, China Jiliang University, Hangzhou, Zhejiang, China
- Wenlong Liu
  - College of Information Engineering, China Jiliang University, Hangzhou, Zhejiang, China
- Feng Liu
  - School of Information Technology and Electrical Engineering, University of Queensland, Brisbane, Queensland, Australia
3
Isangula KG, Haule RJ. Leveraging AI and Machine Learning to Develop and Evaluate a Contextualized User-Friendly Cough Audio Classifier for Detecting Respiratory Diseases: Protocol for a Diagnostic Study in Rural Tanzania. JMIR Res Protoc 2024; 13:e54388. [PMID: 38652526 PMCID: PMC11077412 DOI: 10.2196/54388] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2023] [Revised: 02/14/2024] [Accepted: 02/21/2024] [Indexed: 04/25/2024]
Abstract
BACKGROUND Respiratory diseases, including active tuberculosis (TB), asthma, and chronic obstructive pulmonary disease (COPD), constitute substantial global health challenges, necessitating timely and accurate diagnosis for effective treatment and management. OBJECTIVE This research seeks to develop and evaluate a noninvasive user-friendly artificial intelligence (AI)-powered cough audio classifier for detecting these respiratory conditions in rural Tanzania. METHODS This is a nonexperimental cross-sectional research with the primary objective of collection and analysis of cough sounds from patients with active TB, asthma, and COPD in outpatient clinics to generate and evaluate a noninvasive cough audio classifier. Specialized cough sound recording devices, designed to be nonintrusive and user-friendly, will facilitate the collection of diverse cough sound samples from patients attending outpatient clinics in 20 health care facilities in the Shinyanga region. The collected cough sound data will undergo rigorous analysis, using advanced AI signal processing and machine learning techniques. By comparing acoustic features and patterns associated with TB, asthma, and COPD, a robust algorithm capable of automated disease discrimination will be generated facilitating the development of a smartphone-based cough sound classifier. The classifier will be evaluated against the calculated reference standards including clinical assessments, sputum smear, GeneXpert, chest x-ray, culture and sensitivity, spirometry and peak expiratory flow, and sensitivity and predictive values. RESULTS This research represents a vital step toward enhancing the diagnostic capabilities available in outpatient clinics, with the potential to revolutionize the field of respiratory disease diagnosis. Findings from the 4 phases of the study will be presented as descriptions supported by relevant images, tables, and figures. 
The anticipated outcome of this research is the creation of a reliable, noninvasive diagnostic cough classifier that empowers health care professionals and patients themselves to identify and differentiate these respiratory diseases based on cough sound patterns. CONCLUSIONS Cough sound classifiers use advanced technology for early detection and management of respiratory conditions, offering a less invasive and more efficient alternative to traditional diagnostics. This technology promises to ease public health burdens, improve patient outcomes, and enhance health care access in under-resourced areas, potentially transforming respiratory disease management globally. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID) PRR1-10.2196/54388.
Affiliation(s)
- Kahabi Ganka Isangula
  - School of Nursing and Midwifery, Aga Khan University, Dar Es Salaam, United Republic of Tanzania
- Rogers John Haule
  - School of Nursing and Midwifery, Aga Khan University, Dar Es Salaam, United Republic of Tanzania
4
Huddart S, Yadav V, Sieberts SK, Omberg L, Raberahona M, Rakotoarivelo R, Lyimo IN, Lweno O, Christopher DJ, Nhung NV, Theron G, Worodria W, Yu CY, Bachman CM, Burkot S, Dewan P, Kulhare S, Small PM, Cattamanchi A, Jaganath D, Lapierre SG. Solicited Cough Sound Analysis for Tuberculosis Triage Testing: The CODA TB DREAM Challenge Dataset. medRxiv 2024:2024.03.27.24304980. [PMID: 38585855 PMCID: PMC10996751 DOI: 10.1101/2024.03.27.24304980] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/09/2024]
Abstract
Cough is a common and commonly ignored symptom of lung disease. Cough is often perceived as difficult to quantify, frequently self-limiting, and non-specific. However, cough has a central role in the clinical detection of many lung diseases including tuberculosis (TB), which remains the leading infectious disease killer worldwide. TB screening currently relies on self-reported cough which fails to meet the World Health Organization (WHO) accuracy targets for a TB triage test. Artificial intelligence (AI) models based on cough sound have been developed for several respiratory conditions, with limited work being done in TB. To support the development of an accurate, point-of-care cough-based triage tool for TB, we have compiled a large multi-country database of cough sounds from individuals being evaluated for TB. The dataset includes more than 700,000 cough sounds from 2,143 individuals with detailed demographic, clinical and microbiologic diagnostic information. We aim to empower researchers in the development of cough sound analysis models to improve TB diagnosis, where innovative approaches are critically needed to end this long-standing pandemic.
Affiliation(s)
- Sophie Huddart
  - University of California San Francisco, School of Medicine, 533 Parnassus Ave, San Francisco, CA 94143 USA
- Larson Omberg
  - Sage Bionetworks, Seattle, WA 98103 USA
  - Currently at Koneksa Health, One World Trade Center 285 Fulton St. 77th Floor New York, NY, 10007
- Mihaja Raberahona
  - CHU Joseph Rasera Befelatanana, Antananarivo, 101, Analamanga, Madagascar
  - Centre d’Infectiologie Charles Mérieux, Antananarivo, 101, Analamanga, Madagascar
- Rivo Rakotoarivelo
  - CHU Tambohobe Fianarantsoa, 301, Haute-Matsiatra, Madagascar
  - Université de Fianarantsoa, Fianarantsoa, 301, Haute-Matsiatra, Madagascar
- Issa N. Lyimo
  - Ifakara Health Institute, Environmental and Ecological Sciences & Interventions and Clinical Trials Departments, Kiko Avenue, Plot 463, Mikocheni, Dar es Salaam, Tanzania
- Omar Lweno
  - Ifakara Health Institute, Environmental and Ecological Sciences & Interventions and Clinical Trials Departments, Kiko Avenue, Plot 463, Mikocheni, Dar es Salaam, Tanzania
- Nguyen Viet Nhung
  - National Tuberculosis Programme, 463 Hoang Hoa Tham, Ba Dinh District, Hanoi, Vietnam
- Grant Theron
  - Stellenbosch University, Division of Molecular Biology and Human Genetics, Matieland, 7602 South Africa
- Charles Y. Yu
  - De La Salle Medical and Health Sciences Institute, Governor D. Mangubat Avenue, Dasmarinas Cavite, Philippines 4114
- Stephen Burkot
  - Global Health Labs, 14360 SE Eastgate Way, Bellevue, WA 98007 USA
- Puneet Dewan
  - Global Health Labs, 14360 SE Eastgate Way, Bellevue, WA 98007 USA
- Sourabh Kulhare
  - Global Health Labs, 14360 SE Eastgate Way, Bellevue, WA 98007 USA
- Peter M Small
  - Global Health Labs, 14360 SE Eastgate Way, Bellevue, WA 98007 USA
- Adithya Cattamanchi
  - University of California Irvine, School of Medicine, 333 City Blvd. W Suite 400, Orange CA 92868 USA
- Devan Jaganath
  - University of California Irvine, School of Medicine, 333 City Blvd. W Suite 400, Orange CA 92868 USA
- Simon Grandjean Lapierre
  - Centre de Recherche du Centre Hospitalier de l’Université de Montréal, Immunopathology Axis, 900 St-Denis, Montréal, Québec, H2X 0A9 Canada
  - Université de Montréal, Department of Microbiology, Infectious Diseases and Immunology, 2900 Edouard-Montpetit, Montréal, Québec, H3T 1J4 Canada
5
Diab MS, Rodriguez-Villegas E. Feature evaluation of accelerometry signals for cough detection. Front Digit Health 2024; 6:1368574. [PMID: 38585283 PMCID: PMC10995234 DOI: 10.3389/fdgth.2024.1368574] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2024] [Accepted: 03/06/2024] [Indexed: 04/09/2024]
Abstract
Cough is a common symptom of multiple respiratory diseases, such as asthma and chronic obstructive pulmonary disorder. Various research works have targeted cough detection as a means for continuous monitoring of these respiratory health conditions. This has been mainly achieved using sophisticated machine learning or deep learning algorithms fed with audio recordings. In this work, we explore the use of an alternative detection method, since audio can generate privacy and security concerns related to the use of always-on microphones. This study proposes the use of a non-contact tri-axial accelerometer for motion detection to differentiate between cough and non-cough events/movements. A total of 43 time-domain features were extracted from the acquired tri-axial accelerometry signals. These features were evaluated and ranked for their importance using six methods with adjustable conditions, resulting in a total of 11 feature rankings. The ranking methods included model-based feature importance algorithms, first principal component, leave-one-out, permutation, and recursive feature elimination (RFE). The ranking results were further used to select the top 10, 20, and 30 features for use in cough detection. A total of 68 classification models using a simple logistic regression classifier are reported, using two approaches for data splitting: subject-record-split and leave-one-subject-out (LOSO). The best-performing model out of the 34 using subject-record-split obtained an accuracy of 92.20%, sensitivity of 90.87%, specificity of 93.52%, and F1 score of 92.09% using only 20 features selected by the RFE method. The best-performing model out of the 34 using LOSO obtained an accuracy of 89.57%, sensitivity of 85.71%, specificity of 93.43%, and F1 score of 88.72% using only 10 features selected by the RFE method. These results demonstrate the feasibility of a future motion-based wearable cough detector.
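The leave-one-subject-out (LOSO) splitting mentioned in the abstract holds out every record of one subject per fold, so that no subject contributes to both training and testing. A minimal sketch of that idea follows; the function name `leave_one_subject_out` and the subject labels are hypothetical, and the study's actual splitting code is not given in the abstract.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) per subject.

    subject_ids[i] is the subject that record i belongs to.
    """
    for held_out in sorted(set(subject_ids)):
        # Every record of the held-out subject goes to the test fold.
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        # All remaining records form the training fold.
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        yield held_out, train, test


# Five records from three hypothetical subjects.
subjects = ["s1", "s1", "s2", "s3", "s3"]
folds = list(leave_one_subject_out(subjects))
```

Compared with a record-level split, this gives a more conservative estimate of how a detector generalises to unseen people, which is consistent with the lower LOSO figures reported above.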
Affiliation(s)
- Maha S. Diab
  - Wearable Technologies Lab, Department of Electrical and Electronic Engineering, Imperial College London, London, United Kingdom
6
Malik H, Anees T. Multi-modal deep learning methods for classification of chest diseases using different medical imaging and cough sounds. PLoS One 2024; 19:e0296352. [PMID: 38470893 DOI: 10.1371/journal.pone.0296352] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2023] [Accepted: 12/11/2023] [Indexed: 03/14/2024]
Abstract
Chest disease refers to a wide range of conditions affecting the lungs, such as COVID-19, lung cancer (LC), consolidation lung (COL), and many more. When diagnosing chest disorders medical professionals may be thrown off by the overlapping symptoms (such as fever, cough, sore throat, etc.). Additionally, researchers and medical professionals make use of chest X-rays (CXR), cough sounds, and computed tomography (CT) scans to diagnose chest disorders. The present study aims to classify the nine different conditions of chest disorders, including COVID-19, LC, COL, atelectasis (ATE), tuberculosis (TB), pneumothorax (PNEUTH), edema (EDE), pneumonia (PNEU). Thus, we suggested four novel convolutional neural network (CNN) models that train distinct image-level representations for nine different chest disease classifications by extracting features from images. Furthermore, the proposed CNN employed several new approaches such as a max-pooling layer, batch normalization layers (BANL), dropout, rank-based average pooling (RBAP), and multiple-way data generation (MWDG). The scalogram method is utilized to transform the sounds of coughing into a visual representation. Before beginning to train the model that has been developed, the SMOTE approach is used to calibrate the CXR and CT scans as well as the cough sound images (CSI) of nine different chest disorders. The CXR, CT scan, and CSI used for training and evaluating the proposed model come from 24 publicly available benchmark chest illness datasets. The classification performance of the proposed model is compared with that of seven baseline models, namely Vgg-19, ResNet-101, ResNet-50, DenseNet-121, EfficientNetB0, DenseNet-201, and Inception-V3, in addition to state-of-the-art (SOTA) classifiers. The effectiveness of the proposed model is further demonstrated by the results of the ablation experiments. 
The proposed model was successful in achieving an accuracy of 99.01%, making it superior to both the baseline models and the SOTA classifiers. As a result, the proposed approach is capable of offering significant support to radiologists and other medical professionals.
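SMOTE, which the abstract applies before training, oversamples a minority class by interpolating between a minority sample and one of its nearest minority-class neighbours. The sketch below illustrates that interpolation step on plain feature vectors. It is an assumption-laden toy: the function name `smote_sample`, the value of `k`, and the example points are hypothetical, and the paper applies the technique to image data with settings that the abstract does not describe.

```python
import random


def smote_sample(minority, k=2, rng=None):
    """Create one synthetic minority-class point by SMOTE-style interpolation.

    minority: list of equal-length feature vectors (needs at least two).
    """
    rng = rng or random.Random(0)
    base = rng.choice(minority)
    # Rank the other minority points by squared Euclidean distance to base.
    others = [p for p in minority if p is not base]
    others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
    # Pick one of the k nearest neighbours at random.
    neighbour = rng.choice(others[:k])
    # Interpolate at a random position on the segment base -> neighbour.
    gap = rng.random()
    return [a + gap * (b - a) for a, b in zip(base, neighbour)]


# Four hypothetical minority-class points on the unit square.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
synthetic = smote_sample(minority, k=2, rng=random.Random(42))
```

Because the new point lies on a segment between two existing minority samples, it stays inside the region the minority class already occupies rather than duplicating a sample exactly.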
Affiliation(s)
- Hassaan Malik
  - Department of Computer Science, School of Systems and Technology, University of Management and Technology, Lahore, Pakistan
- Tayyaba Anees
  - Department of Software Engineering, School of Systems and Technology, University of Management and Technology, Lahore, Pakistan
7
Kapetanidis P, Kalioras F, Tsakonas C, Tzamalis P, Kontogiannis G, Karamanidou T, Stavropoulos TG, Nikoletseas S. Respiratory Diseases Diagnosis Using Audio Analysis and Artificial Intelligence: A Systematic Review. Sensors (Basel) 2024; 24:1173. [PMID: 38400330 PMCID: PMC10893010 DOI: 10.3390/s24041173] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2023] [Revised: 02/03/2024] [Accepted: 02/04/2024] [Indexed: 02/25/2024]
Abstract
Respiratory diseases represent a significant global burden, necessitating efficient diagnostic methods for timely intervention. Digital biomarkers based on audio, acoustics, and sound from the upper and lower respiratory system, as well as the voice, have emerged as valuable indicators of respiratory functionality. Recent advancements in machine learning (ML) algorithms offer promising avenues for the identification and diagnosis of respiratory diseases through the analysis and processing of such audio-based biomarkers. An ever-increasing number of studies employ ML techniques to extract meaningful information from audio biomarkers. Beyond disease identification, these studies explore diverse aspects such as the recognition of cough sounds amidst environmental noise, the analysis of respiratory sounds to detect respiratory symptoms like wheezes and crackles, as well as the analysis of the voice/speech for the evaluation of human voice abnormalities. To provide a more in-depth analysis, this review examines 75 relevant audio analysis studies across three distinct areas of concern based on respiratory diseases' symptoms: (a) cough detection, (b) lower respiratory symptoms identification, and (c) diagnostics from the voice and speech. Furthermore, publicly available datasets commonly utilized in this domain are presented. It is observed that research trends are influenced by the pandemic, with a surge in studies on COVID-19 diagnosis, mobile data acquisition, and remote diagnosis systems.
Affiliation(s)
- Panagiotis Kapetanidis
  - Computer Engineering and Informatics Department, University of Patras, 26504 Patras, Greece
- Fotios Kalioras
  - Computer Engineering and Informatics Department, University of Patras, 26504 Patras, Greece
- Constantinos Tsakonas
  - Computer Engineering and Informatics Department, University of Patras, 26504 Patras, Greece
- Pantelis Tzamalis
  - Computer Engineering and Informatics Department, University of Patras, 26504 Patras, Greece
- George Kontogiannis
  - Computer Engineering and Informatics Department, University of Patras, 26504 Patras, Greece
- Theodora Karamanidou
  - Pfizer Center for Digital Innovation, 55535 Thessaloniki, Greece
- Sotiris Nikoletseas
  - Computer Engineering and Informatics Department, University of Patras, 26504 Patras, Greece
8
Savage N. Tracking down tuberculosis. Nature 2024:10.1038/d41586-024-00087-8. [PMID: 38273057 DOI: 10.1038/d41586-024-00087-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/27/2024]
9
Sharma M, Nduba V, Njagi LN, Murithi W, Mwongera Z, Hawn TR, Patel SN, Horne DJ. TBscreen: A passive cough classifier for tuberculosis screening with a controlled dataset. Science Advances 2024; 10:eadi0282. [PMID: 38170773 PMCID: PMC10776005 DOI: 10.1126/sciadv.adi0282] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2023] [Accepted: 12/01/2023] [Indexed: 01/05/2024]
Abstract
Recent respiratory disease screening studies suggest promising performance of cough classifiers, but potential biases in model training and dataset quality preclude robust conclusions. To examine tuberculosis (TB) cough diagnostic features, we enrolled subjects with pulmonary TB (N = 149) and controls with other respiratory illnesses (N = 46) in Nairobi. We collected a dataset with 33,000 passive coughs and 1600 forced coughs in a controlled setting with similar demographics. We trained a ResNet18-based cough classifier using images of passive cough scalogram as input and obtained a fivefold cross-validation sensitivity of 0.70 (±0.11 SD). The smartphone-based model had better performance in subjects with higher bacterial load {receiver operating characteristic-area under the curve (ROC-AUC): 0.87 [95% confidence interval (CI): 0.87 to 0.88], P < 0.001} or lung cavities [ROC-AUC: 0.89 (95% CI: 0.88 to 0.89), P < 0.001]. Overall, our data suggest that passive cough features distinguish TB from non-TB subjects and are associated with bacterial burden and disease severity.
Affiliation(s)
- Manuja Sharma
  - Department of Electrical and Computer Engineering, University of Washington, 185 E Stevens Way NE, Seattle, WA 98195, USA
- Videlis Nduba
  - Centre for Respiratory Diseases Research, Kenya Medical Research Institute, Mbagathi Rd, Nairobi 610101, Kenya
- Lilian N. Njagi
  - Centre for Respiratory Diseases Research, Kenya Medical Research Institute, Mbagathi Rd, Nairobi 610101, Kenya
- Wilfred Murithi
  - Centre for Respiratory Diseases Research, Kenya Medical Research Institute, Mbagathi Rd, Nairobi 610101, Kenya
- Zipporah Mwongera
  - Centre for Respiratory Diseases Research, Kenya Medical Research Institute, Mbagathi Rd, Nairobi 610101, Kenya
- Thomas R. Hawn
  - Department of Medicine, University of Washington, 1959 NE Pacific Street, Seattle, WA 98195, USA
- Shwetak N. Patel
  - Department of Electrical and Computer Engineering, University of Washington, 185 E Stevens Way NE, Seattle, WA 98195, USA
  - Paul G. Allen School of Computer Science and Engineering, University of Washington, 185 E Stevens Way NE, Seattle, WA 98195, USA
- David J. Horne
  - Department of Medicine, University of Washington, 1959 NE Pacific Street, Seattle, WA 98195, USA
10
Gao Y, Zhang Y, Hu C, He P, Fu J, Lin F, Liu K, Fu X, Liu R, Sun J, Chen F, Yang W, Zhou Y. Distinguishing infectivity in patients with pulmonary tuberculosis using deep learning. Front Public Health 2023; 11:1247141. [PMID: 38089031 PMCID: PMC10711219 DOI: 10.3389/fpubh.2023.1247141] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2023] [Accepted: 11/07/2023] [Indexed: 12/18/2023]
Abstract
Introduction This study aimed to develop and assess a deep-learning model based on CT images for distinguishing infectivity in patients with pulmonary tuberculosis (PTB). Methods We labeled all 925 patients from four centers with weak and strong infectivity based on multiple sputum smears within a month for our deep-learning model named TBINet's training. We compared TBINet's performance in identifying infectious patients to that of the conventional 3D ResNet model. For model explainability, we used gradient-weighted class activation mapping (Grad-CAM) technology to identify the site of lesion activation in the CT images. Results The TBINet model demonstrated superior performance with an area under the curve (AUC) of 0.819 and 0.753 on the validation and external test sets, respectively, compared to existing deep learning methods. Furthermore, using Grad-CAM, we observed that CT images with higher levels of consolidation, voids, upper lobe involvement, and enlarged lymph nodes were more likely to come from patients with highly infectious forms of PTB. Conclusion Our study proves the feasibility of using CT images to identify the infectivity of PTB patients based on the deep learning method.
Affiliation(s)
- Yi Gao
  - Department of Infectious Disease and Hepatology Unit, Nanfang Hospital, Southern Medical University, Guangzhou, China
  - Department of Infectious Disease, Hainan General Hospital, Hainan Medical University, Haikou, China
  - Department of Gastroenterology, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Yiwen Zhang
  - School of Biomedical Engineering, Southern Medical University, Guangzhou, China
- Chengguang Hu
  - Department of Gastroenterology, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Pengyuan He
  - Department of Infectious Disease, The Fifth Affiliated Hospital, Sun Yat-sen University, Zhuhai, China
- Jian Fu
  - Department of Infectious Disease, Hainan General Hospital, Hainan Medical University, Haikou, China
- Feng Lin
  - Department of Infectious Disease, Hainan General Hospital, Hainan Medical University, Haikou, China
- Kehui Liu
  - Department of Radiology, Haikou Municipal People's Hospital and Central South University Xiangya Medical College Affiliated Hospital, Haikou, China
- Xianxian Fu
  - Clinical Lab, Haikou Municipal People's Hospital and Central South University Xiangya Medical College Affiliated Hospital, Haikou, China
- Rui Liu
  - Department of Infectious Disease, The Second Affiliated Hospital, Hainan Medical University, Haikou, China
- Jiarun Sun
  - Department of Infectious Disease and Hepatology Unit, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Feng Chen
  - Department of Radiology, Hainan General Hospital, Hainan Medical University, Haikou, China
- Wei Yang
  - School of Biomedical Engineering, Southern Medical University, Guangzhou, China
- Yuanping Zhou
  - Department of Infectious Disease and Hepatology Unit, Nanfang Hospital, Southern Medical University, Guangzhou, China
  - Department of Gastroenterology, Nanfang Hospital, Southern Medical University, Guangzhou, China
11
Dohál M, Porvazník I, Solovič I, Mokrý J. Advancing tuberculosis management: the role of predictive, preventive, and personalized medicine. Front Microbiol 2023; 14:1225438. [PMID: 37860132 PMCID: PMC10582268 DOI: 10.3389/fmicb.2023.1225438] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2023] [Accepted: 09/22/2023] [Indexed: 10/21/2023]
Abstract
Tuberculosis is a major global health issue, with approximately 10 million people falling ill and 1.4 million dying yearly. One of the most significant challenges to public health is the emergence of drug-resistant tuberculosis. For the last half-century, treating tuberculosis has adhered to a uniform management strategy in most patients. However, treatment ineffectiveness in some individuals with pulmonary tuberculosis presents a major challenge to the global tuberculosis control initiative. Unfavorable outcomes of tuberculosis treatment (including mortality, treatment failure, loss of follow-up, and unevaluated cases) may result in increased transmission of tuberculosis and the emergence of drug-resistant strains. Treatment failure may occur due to drug-resistant strains, non-adherence to medication, inadequate absorption of drugs, or low-quality healthcare. Identifying the underlying cause and adjusting the treatment accordingly to address treatment failure is important. This is where approaches such as artificial intelligence, genetic screening, and whole genome sequencing can play a critical role. In this review, we suggest a set of particular clinical applications of these approaches, which might have the potential to influence decisions regarding the clinical management of tuberculosis patients.
Collapse
Affiliation(s)
- Matúš Dohál
- Biomedical Centre Martin, Jessenius Faculty of Medicine in Martin, Comenius University in Bratislava, Martin, Slovakia
| | - Igor Porvazník
- National Institute of Tuberculosis, Lung Diseases and Thoracic Surgery, Vyšné Hágy, Slovakia
- Faculty of Health, Catholic University in Ružomberok, Ružomberok, Slovakia
| | - Ivan Solovič
- National Institute of Tuberculosis, Lung Diseases and Thoracic Surgery, Vyšné Hágy, Slovakia
- Faculty of Health, Catholic University in Ružomberok, Ružomberok, Slovakia
| | - Juraj Mokrý
- Department of Pharmacology, Jessenius Faculty of Medicine in Martin, Comenius University in Bratislava, Martin, Slovakia
| |
|
12
|
Malik H, Anees T, Al-Shamaylehs AS, Alharthi SZ, Khalil W, Akhunzada A. Deep Learning-Based Classification of Chest Diseases Using X-rays, CT Scans, and Cough Sound Images. Diagnostics (Basel) 2023; 13:2772. [PMID: 37685310 PMCID: PMC10486427 DOI: 10.3390/diagnostics13172772] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Revised: 08/14/2023] [Accepted: 08/21/2023] [Indexed: 09/10/2023] Open
Abstract
Chest disease refers to a variety of lung disorders, including lung cancer (LC), COVID-19, pneumonia (PNEU), tuberculosis (TB), and numerous other respiratory disorders. The symptoms (i.e., fever, cough, sore throat, etc.) of these chest diseases are similar, which might mislead radiologists and health experts when classifying chest diseases. Chest X-rays (CXR), cough sounds, and computed tomography (CT) scans are utilized by researchers and doctors to identify chest diseases such as LC, COVID-19, PNEU, and TB. The objective of the work is to identify nine different types of chest diseases, including COVID-19, edema (EDE), LC, PNEU, pneumothorax (PNEUTH), normal, atelectasis (ATE), and consolidation lung (COL). Therefore, we designed a novel deep learning (DL)-based chest disease detection network (DCDD_Net) that uses CXR, CT scans, and cough sound images for the identification of nine different types of chest diseases. The scalogram method is used to convert the cough sounds into an image. Before training the proposed DCDD_Net model, the borderline (BL) SMOTE is applied to balance the CXR, CT scans, and cough sound images of nine chest diseases. The proposed DCDD_Net model is trained and evaluated on 20 publicly available benchmark chest disease datasets of CXR, CT scan, and cough sound images. The classification performance of the DCDD_Net is compared with four baseline models, i.e., InceptionResNet-V2, EfficientNet-B0, DenseNet-201, and Xception, as well as state-of-the-art (SOTA) classifiers. The DCDD_Net achieved an accuracy of 96.67%, a precision of 96.82%, a recall of 95.76%, an F1-score of 95.61%, and an area under the curve (AUC) of 99.43%. The results reveal that DCDD_Net outperformed the other four baseline models in terms of many performance evaluation metrics. Thus, the proposed DCDD_Net model can provide significant assistance to radiologists and medical experts. Additionally, the proposed model was shown to be resilient by statistical evaluations of the datasets using McNemar and ANOVA tests.
Affiliation(s)
- Hassaan Malik
- School of Systems and Technology, University of Management and Technology, Lahore 54770, Pakistan
| | - Tayyaba Anees
- School of Systems and Technology, University of Management and Technology, Lahore 54770, Pakistan
| | - Ahmad Sami Al-Shamaylehs
- Department of Networks and Cybersecurity, Faculty of Information Technology, Al-Ahliyya Amman University, Amman 19328, Jordan
| | - Salman Z. Alharthi
- Department of Information System, College of Computers and Information Systems, Al-Lith Campus, Umm AL-Qura University, P.O. Box 7745, AL-Lith 21955, Saudi Arabia
| | - Wajeeha Khalil
- Department of Computer Science and Information Technology, University of Engineering and Technology Peshawar, Peshawar 25000, Pakistan
| | - Adnan Akhunzada
- College of Computing & IT, University of Doha for Science and Technology, Doha P.O. Box 24449, Qatar
| |
|
13
|
Yellapu GD, Rudraraju G, Sripada NR, Mamidgi B, Jalukuru C, Firmal P, Yechuri V, Varanasi S, Peddireddi VS, Bhimarasetty DM, Kanisetti S, Joshi N, Mohapatra P, Pamarthi K. Development and clinical validation of Swaasa AI platform for screening and prioritization of pulmonary TB. Sci Rep 2023; 13:4740. [PMID: 36959347 PMCID: PMC10034902 DOI: 10.1038/s41598-023-31772-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2022] [Accepted: 03/16/2023] [Indexed: 03/25/2023] Open
Abstract
Acoustic signal analysis has been employed in various medical devices. However, few studies have used cough sound analysis to screen potential pulmonary tuberculosis (PTB) suspects. The main objective of this cross-sectional validation study was to develop and validate the Swaasa AI platform to screen and prioritize at-risk patients for PTB based on the signature cough sound as well as symptomatic information provided by the subjects. The voluntary cough sound data was collected at Andhra Medical College, India. An algorithm based on a multimodal convolutional neural network architecture and a feedforward artificial neural network (tabular features) was built and validated on a total of 567 subjects, comprising 278 positive and 289 negative PTB cases. The output from these two models was combined to detect the likely presence (positive cases) of PTB. In the clinical validation phase, the AI model was found to be 86.82% accurate in detecting the likely presence of PTB, with 90.36% sensitivity and 84.67% specificity. Pilot testing of the model was conducted at a peripheral health care centre, RHC Simhachalam, India, on 65 presumptive PTB cases. Of these, 15 subjects were confirmed PTB positive, with a positive predictive value of 75%. The validation results obtained from the model are quite encouraging. This platform has the potential to fulfil the unmet need for a cost-effective PTB screening method. It works remotely, presents instantaneous results, and does not require a highly trained operator. Therefore, it could be implemented in various inaccessible, resource-poor parts of the world.
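The accuracy, sensitivity, specificity and positive predictive value (PPV) quoted in this abstract are all ratios of confusion-matrix counts. The following is a minimal Python sketch of those definitions, not the authors' code. The function names `screening_metrics` and `implied_flagged` are hypothetical. The pilot figure is an inference rather than a reported number: if the 75% PPV is read as confirmed positives divided by subjects flagged positive, then 15 confirmed cases imply 20 flagged subjects (15 / 0.75 = 20). The abstract does not give the remaining counts, so none are filled in here.

```python
def screening_metrics(tp, fp, tn, fn):
    """Return standard screening metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


def implied_flagged(true_positives, ppv):
    """Positive calls implied by a reported PPV, assuming PPV = TP / flagged."""
    return round(true_positives / ppv)


# Pilot figures from the abstract: 15 confirmed PTB cases, PPV of 75%.
flagged = implied_flagged(15, 0.75)
false_positives = flagged - 15
print(flagged, false_positives)
```

Under that reading, the pilot would have produced 5 false positives among the flagged subjects; sensitivity and specificity for the pilot cannot be recovered from the abstract alone.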
Affiliation(s)
| | | | | | | | - Charan Jalukuru
- Salcit Technologies, Jayabheri Silicon Towers, Hyderabad, India
| | - Priyanka Firmal
- Salcit Technologies, Jayabheri Silicon Towers, Hyderabad, India
| | - Venkat Yechuri
- Salcit Technologies, Jayabheri Silicon Towers, Hyderabad, India
| | | | | | | | | | - Niranjan Joshi
- Centre for Cellular and Molecular Platforms, Bengaluru, India
| | - Prasant Mohapatra
- Department of Computer Science, University of California, Davis, USA
| | | |
|
14
|
Zhang J, Wu J, Qiu Y, Song A, Li W, Li X, Liu Y. Intelligent speech technologies for transcription, disease diagnosis, and medical equipment interactive control in smart hospitals: A review. Comput Biol Med 2023; 153:106517. [PMID: 36623438 PMCID: PMC9814440 DOI: 10.1016/j.compbiomed.2022.106517] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 09/17/2022] [Revised: 12/23/2022] [Accepted: 12/31/2022] [Indexed: 01/07/2023]
Abstract
The growth and aging of the world population have driven a shortage of medical resources in recent years, especially during the COVID-19 pandemic. Fortunately, the rapid development of robotics and artificial intelligence technologies is helping the healthcare field adapt to these challenges. Among them, intelligent speech technology (IST) has served doctors and patients to improve the efficiency of medical behavior and alleviate the medical burden. However, problems like noise interference in complex medical scenarios and pronunciation differences between patients and healthy people hamper the broad application of IST in hospitals. In recent years, technologies such as machine learning have developed rapidly in intelligent speech recognition and are expected to solve these problems. This paper first introduces IST's procedure and system architecture and analyzes its application in medical scenarios. Secondly, we review existing IST applications in smart hospitals in detail, including electronic medical documentation, disease diagnosis and evaluation, and human-medical equipment interaction. In addition, we elaborate on an application case of IST in the early recognition, diagnosis, rehabilitation training, evaluation, and daily care of stroke patients. Finally, we discuss IST's limitations, challenges, and future directions in the medical field. Furthermore, we propose a novel medical voice analysis system architecture that employs active hardware, active software, and human-computer interaction to realize intelligent and evolvable speech recognition. This comprehensive review and the proposed architecture offer directions for future studies on IST and its applications in smart hospitals.
Affiliation(s)
- Jun Zhang
- The State Key Laboratory of Bioelectronics, School of Instrument Science and Engineering, Southeast University, Nanjing, 210096, China (corresponding author)
| | - Jingyue Wu
- The State Key Laboratory of Bioelectronics, School of Instrument Science and Engineering, Southeast University, Nanjing, 210096, China
| | - Yiyi Qiu
- The State Key Laboratory of Bioelectronics, School of Instrument Science and Engineering, Southeast University, Nanjing, 210096, China
| | - Aiguo Song
- The State Key Laboratory of Bioelectronics, School of Instrument Science and Engineering, Southeast University, Nanjing, 210096, China
| | - Weifeng Li
- Department of Emergency Medicine, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, 510080, China
| | - Xin Li
- Department of Emergency Medicine, Guangdong Provincial People's Hospital, Guangdong Academy of Medical Sciences, Guangzhou, 510080, China
| | - Yecheng Liu
- Emergency Department, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Science and Peking Union Medical College, Beijing, 100730, China
| |
|
15
|
Liebenberg D, Gordhan BG, Kana BD. Drug resistant tuberculosis: Implications for transmission, diagnosis, and disease management. Front Cell Infect Microbiol 2022; 12:943545. [PMID: 36211964 PMCID: PMC9538507 DOI: 10.3389/fcimb.2022.943545] [Citation(s) in RCA: 25] [Impact Index Per Article: 12.5] [Received: 05/13/2022] [Accepted: 09/06/2022] [Indexed: 01/17/2023] Open
Abstract
Drug resistant tuberculosis contributes significantly to the global burden of antimicrobial resistance, often consuming a large proportion of the healthcare budget and associated resources in many endemic countries. The rapid emergence of resistance to newer tuberculosis therapies signals the need to ensure appropriate antibiotic stewardship, together with a concerted drive to develop new regimens that are active against currently circulating drug resistant strains. Herein, we highlight that the current burden of drug resistant tuberculosis is driven by a combination of ongoing transmission and the intra-patient evolution of resistance through several mechanisms. Global control of tuberculosis will require interventions that effectively address these and related aspects. Interrupting tuberculosis transmission is dependent on the availability of novel rapid diagnostics which provide accurate results, as near-patient as is possible, together with appropriate linkage to care. Contact tracing, longitudinal follow-up for symptoms and active mapping of social contacts are essential elements to curb further community-wide spread of drug resistant strains. Appropriate prophylaxis for contacts of drug resistant index cases is imperative to limit disease progression and subsequent transmission. Preventing the evolution of drug resistant strains will require the development of shorter regimens that rapidly eliminate all populations of mycobacteria, whilst concurrently limiting bacterial metabolic processes that drive drug tolerance, mutagenesis and the ultimate emergence of resistance. Drug discovery programs that specifically target bacterial genetic determinants associated with these processes will be paramount to tuberculosis eradication. 
In addition, the development of appropriate clinical endpoints that quantify drug tolerant organisms in sputum, such as differentially culturable/detectable tubercle bacteria, is necessary to accurately assess the potential of new therapies to effectively shorten treatment duration. When combined, this holistic approach to addressing the critical problems associated with drug resistance will support delivery of quality care to patients suffering from tuberculosis and bolster efforts to eradicate this disease.
|
16
|
Pahar M, Miranda I, Diacon A, Niesler T. Automatic Non-Invasive Cough Detection based on Accelerometer and Audio Signals. Journal of Signal Processing Systems 2022; 94:821-835. [PMID: 35341095 PMCID: PMC8934184 DOI: 10.1007/s11265-022-01748-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/31/2021] [Revised: 01/09/2022] [Accepted: 02/23/2022] [Indexed: 12/01/2022]
Abstract
We present an automatic non-invasive way of detecting cough events based on both accelerometer and audio signals. The acceleration signals are captured by a smartphone firmly attached to the patient’s bed, using its integrated accelerometer. The audio signals are captured simultaneously by the same smartphone using an external microphone. We have compiled a manually-annotated dataset containing such simultaneously-captured acceleration and audio signals for approximately 6000 cough and 68000 non-cough events from 14 adult male patients. Logistic regression (LR), support vector machine (SVM) and multilayer perceptron (MLP) classifiers provide a baseline and are compared with three deep architectures, convolutional neural network (CNN), long short-term memory (LSTM) network, and residual-based architecture (Resnet50) using a leave-one-out cross-validation scheme. We find that it is possible to use either acceleration or audio signals to distinguish between coughing and other activities including sneezing, throat-clearing, and movement on the bed with high accuracy. However, in all cases, the deep neural networks outperform the shallow classifiers by a clear margin and the Resnet50 offers the best performance, achieving an area under the ROC curve (AUC) exceeding 0.98 and 0.99 for acceleration and audio signals respectively. While audio-based classification consistently offers better performance than acceleration-based classification, we observe that the difference is very small for the best systems. Since the acceleration signal requires less processing power, and since the need to record audio is sidestepped and thus privacy is inherently secured, and since the recording device is attached to the bed and not worn, an accelerometer-based highly accurate non-invasive cough detector may represent a more convenient and readily accepted method in long-term cough monitoring.
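The leave-one-out scheme mentioned in this abstract can be illustrated with a short Python sketch. This is an assumption-laden illustration, not the authors' pipeline: it assumes the held-out unit is a patient (the abstract does not say so explicitly), the function name `leave_one_patient_out` is hypothetical, and the event list is toy data rather than the study's recordings.

```python
def leave_one_patient_out(patient_ids):
    """Yield (held_out, train_idx, test_idx), holding out one patient per fold.

    patient_ids[i] is the patient that event i belongs to, so no patient's
    events ever appear in both the training and the test indices of a fold.
    """
    for held_out in sorted(set(patient_ids)):
        train_idx = [i for i, p in enumerate(patient_ids) if p != held_out]
        test_idx = [i for i, p in enumerate(patient_ids) if p == held_out]
        yield held_out, train_idx, test_idx


# Toy data: 3 events for each of 14 hypothetical patients (IDs 0..13).
patient_ids = [p for p in range(14) for _ in range(3)]
folds = list(leave_one_patient_out(patient_ids))
print(len(folds))  # one fold per patient
```

Splitting by patient rather than by event keeps a classifier from being tested on coughs from a person it has already heard during training, which would otherwise inflate the reported scores.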
Affiliation(s)
- Madhurananda Pahar
- Department of Electrical and Electronic Engineering, Stellenbosch University, Stellenbosch, 7600 Western Cape South Africa
| | - Igor Miranda
- Federal University of Recôncavo da Bahia, Cruz das Almas, 44.380-000 Bahia Brazil
| | - Andreas Diacon
- TASK Applied Science, Cape Town, Western Cape South Africa
| | - Thomas Niesler
- Department of Electrical and Electronic Engineering, Stellenbosch University, Stellenbosch, 7600 Western Cape South Africa
| |
|
17
|
Abstract
Cough assessment is central to the clinical management of respiratory diseases, including tuberculosis (TB), but strategies to objectively and unobtrusively measure cough are lacking. Acoustic epidemiology is an emerging field that uses technology to detect cough sounds and analyze cough patterns to improve health outcomes among people with respiratory conditions linked to cough. This field is increasingly exploring the potential of artificial intelligence (AI) for more advanced applications, such as analyzing cough sounds as a biomarker for disease screening. While much of the data are preliminary, objective cough assessment could potentially transform disease control programs, including TB, and support individual patient management. Here, we present an overview of recent advances in this field and describe how cough assessment, if validated, could support public health programs at various stages of the TB care cascade. Zimmer et al. discuss the importance of cough assessment in clinical management of tuberculosis (TB). They describe how acoustic epidemiology, which uses recording devices and artificial intelligence to detect, record and analyze cough, can be used in TB control and individual patient management.
|
18
|
Nathavitharana RR, Garcia-Basteiro AL, Ruhwald M, Cobelens F, Theron G. Reimagining the status quo: How close are we to rapid sputum-free tuberculosis diagnostics for all? EBioMedicine 2022; 78:103939. [PMID: 35339423 PMCID: PMC9043971 DOI: 10.1016/j.ebiom.2022.103939] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 01/04/2022] [Revised: 02/14/2022] [Accepted: 02/28/2022] [Indexed: 01/26/2023] Open
Abstract
Rapid, accurate, sputum-free tests for tuberculosis (TB) triage and confirmation are urgently needed to close the widening diagnostic gap. We summarise key technologies and review programmatic, systems, and resource issues that could affect the impact of diagnostics. Mid-to-early-stage technologies like artificial intelligence-based automated digital chest X-radiography and capillary blood point-of-care assays are particularly promising. Pitfalls in the diagnostic pipeline include a lack of community-based tools. We outline how these technologies may complement one another within the context of the TB care cascade, help overturn current paradigms (e.g., reducing syndromic triage reliance, permitting subclinical TB to be diagnosed), and expand options for extra-pulmonary TB. We review challenges such as the difficulty of detecting paucibacillary TB and the limitations of current reference standards, and discuss how researchers and developers can better design and evaluate assays to optimise programmatic uptake. Finally, we outline how leveraging the urgency and innovation applied to COVID-19 is critical to improving TB patients' diagnostic quality-of-care.
Affiliation(s)
- Ruvandhi R. Nathavitharana
- Division of Infectious Diseases, Beth Israel Deaconess Medical Center & Harvard Medical School, Boston, USA
| | - Alberto L. Garcia-Basteiro
- ISGlobal, Hospital Clínic - Universitat de Barcelona, Barcelona, Spain; Centro de Investigação em Saude de Manhiça, Maputo, Mozambique
| | - Morten Ruhwald
- FIND, the global alliance for diagnostics, Geneva, Switzerland
| | - Frank Cobelens
- Department of Global Health and Amsterdam Institute for Global Health and Development, Amsterdam University Medical Centers, Amsterdam, Netherlands
| | - Grant Theron
- DSI-NRF Centre of Excellence for Biomedical Tuberculosis Research, South African Medical Research Council Centre for Tuberculosis Research, Division of Molecular Biology and Human Genetics, Faculty of Medicine and Health Sciences, Stellenbosch University, South Africa (corresponding author)
| |
|
19
|
COVID-19 detection in cough, breath and speech using deep transfer learning and bottleneck features. Comput Biol Med 2021; 141:105153. [PMID: 34954610 PMCID: PMC8679499 DOI: 10.1016/j.compbiomed.2021.105153] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 11/17/2021] [Revised: 11/24/2021] [Accepted: 12/14/2021] [Indexed: 12/15/2022]
Abstract
We present an experimental investigation into the effectiveness of transfer learning and bottleneck feature extraction in detecting COVID-19 from audio recordings of cough, breath and speech. This type of screening is non-contact, does not require specialist medical expertise or laboratory facilities and can be deployed on inexpensive consumer hardware such as a smartphone. We use datasets that contain cough, sneeze, speech and other noises, but do not contain COVID-19 labels, to pre-train three deep neural networks: a CNN, an LSTM and a Resnet50. These pre-trained networks are subsequently either fine-tuned using smaller datasets of coughing with COVID-19 labels in the process of transfer learning, or are used as bottleneck feature extractors. Results show that a Resnet50 classifier trained by this transfer learning process delivers optimal or near-optimal performance across all datasets achieving areas under the receiver operating characteristic (ROC AUC) of 0.98, 0.94 and 0.92 respectively for all three sound classes: coughs, breaths and speech. This indicates that coughs carry the strongest COVID-19 signature, followed by breath and speech. Our results also show that applying transfer learning and extracting bottleneck features using the larger datasets without COVID-19 labels led not only to improved performance, but also to a marked reduction in the standard deviation of the classifier AUCs measured over the outer folds during nested cross-validation, indicating better generalisation. We conclude that deep transfer learning and bottleneck feature extraction can improve COVID-19 cough, breath and speech audio classification, yielding automatic COVID-19 detection with a better and more consistent overall performance.
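The two quantities this abstract reports, the ROC AUC and its standard deviation over the outer folds of nested cross-validation, can be sketched in plain Python as follows. This is an illustrative sketch only: the function names `roc_auc` and `summarize_folds` are hypothetical, the labels, scores and per-fold values are toy numbers rather than results from the study, and the AUC is computed with the rank-based (Mann-Whitney) formulation rather than whatever tooling the authors used.

```python
from statistics import mean, stdev


def roc_auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def summarize_folds(fold_aucs):
    """Mean and sample standard deviation of AUCs measured on the outer folds."""
    return mean(fold_aucs), stdev(fold_aucs)


# Toy example: four recordings, label 1 = positive class.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(auc)

# Toy per-fold AUCs; a smaller spread suggests more consistent generalisation.
print(summarize_folds([0.9, 0.8, 1.0]))
```

A lower standard deviation across outer folds is what the abstract interprets as better generalisation: the classifier's performance depends less on which subset of data it was evaluated on.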
|