1
Semmad A, Bahoura M. Comparative study of respiratory sounds classification methods based on cepstral analysis and artificial neural networks. Comput Biol Med 2024;171:108190. [PMID: 38387384] [DOI: 10.1016/j.compbiomed.2024.108190]
Abstract
In this paper, we investigated and evaluated various machine learning-based approaches for automatically detecting wheezing sounds. We conducted a comprehensive comparison of the proposed systems, assessing their classification performance through metrics such as sensitivity, specificity, and accuracy. The general approach to building a machine learning-based respiratory sound classifier combines a technique for extracting features from an unknown input sound with a classification method that determines the class to which it belongs. The characterization techniques used in this study are based on cepstral analysis, which has been extensively employed in automatic speech recognition. While MFCC (Mel-Frequency Cepstral Coefficients) feature extraction is commonly used in respiratory sound classification, our study introduces a novelty by employing GFCC (Gammatone-Frequency Cepstral Coefficients) and BFCC (Bark-Frequency Cepstral Coefficients) for this purpose. For the classification task, we employed two types of neural networks: the MLP (Multilayer Perceptron), a feedforward neural network, and the BiLSTM (Bidirectional LSTM), a variant of the LSTM (Long Short-Term Memory) recurrent neural network. The proposed classification systems were evaluated on a database of 497 wheezing segments and 915 normal respiratory segments, recorded from individuals diagnosed with asthma and individuals without any respiratory issues, respectively. The highest classification performance was achieved by the BFCC-BiLSTM model, which reached an accuracy of 99.8%.
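As an illustration of the pipeline this abstract describes, the sketch below pairs MFCC extraction with a small BiLSTM classifier. The libraries (librosa, PyTorch) and all hyperparameters are assumptions for illustration, not the authors' implementation; GFCC/BFCC extractors would slot in where MFCC is used.

```python
# Minimal sketch of a cepstral-feature + BiLSTM wheeze classifier in the spirit
# of this abstract. All parameter values are illustrative assumptions.
import librosa
import torch
import torch.nn as nn

def extract_mfcc(path, sr=4000, n_mfcc=13):
    """Load a respiratory-sound segment and return a (frames, n_mfcc) matrix."""
    y, sr = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)  # (n_mfcc, frames)
    return torch.tensor(mfcc.T, dtype=torch.float32)        # (frames, n_mfcc)

class BiLSTMClassifier(nn.Module):
    """Bidirectional LSTM over cepstral frames, binary wheeze/normal output."""
    def __init__(self, n_feats=13, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_feats, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, 2)  # wheeze vs. normal

    def forward(self, x):              # x: (batch, frames, n_feats)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])   # classify from the last time step

model = BiLSTMClassifier()
feats = torch.randn(8, 100, 13)        # dummy batch: 8 segments, 100 frames each
logits = model(feats)                  # (8, 2)
```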
Affiliation(s)
- Abdelkrim Semmad
- Department of Engineering, Université du Québec à Rimouski, 300, allée des Ursulines, Rimouski, Qc, Canada, G5L 3A1.
- Mohammed Bahoura
- Department of Engineering, Université du Québec à Rimouski, 300, allée des Ursulines, Rimouski, Qc, Canada, G5L 3A1.
2
Zhou W, Yu L, Zhang M, Xiao W. A low power respiratory sound diagnosis processing unit based on LSTM for wearable health monitoring. Biomed Tech (Berl) 2023;68:469-480. [PMID: 37080905] [DOI: 10.1515/bmt-2022-0421]
Abstract
Early prevention and detection of respiratory disease have attracted extensive attention due to the significant increase in people with respiratory issues, and restraining the spread and relieving the symptoms of such diseases is essential. However, the traditional auscultation technique demands a high level of medical skill, and computational respiratory sound analysis approaches are limited in constrained settings. A wearable auscultation device is therefore needed to monitor the health of the respiratory system in real time and offer convenience to users. In this work, we developed a Respiratory Sound Diagnosis Processing Unit (RSDPU) based on Long Short-Term Memory (LSTM). Experiments and analyses were conducted on the feature extraction and abnormality diagnosis algorithms for respiratory sound, and Dynamic Normalization Mapping (DNM) was proposed to make better use of the quantization bits and lessen overfitting. Furthermore, we developed a hardware implementation of the RSDPU, including a corrector to filter diagnosis noise. We present FPGA prototyping verification and the layout of the RSDPU for power and area evaluation. Experimental results demonstrated that the RSDPU achieved an abnormality diagnosis accuracy of 81.4%, an area of 1.57 × 1.76 mm in the SMIC 130 nm process, and a power consumption of 381.8 μW, meeting the requirements of high accuracy, low power consumption, and small area.
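The snippet below is a toy illustration of per-frame normalization mapped onto a fixed number of quantization bits, which is one plausible reading of the DNM idea; the actual DNM algorithm and the hardware corrector are defined in the paper, and everything here is an assumption for illustration.

```python
# Toy illustration: map a feature frame onto n quantization bits via dynamic
# (per-frame) normalization. A plausible reading of DNM, not the authors' method.
import numpy as np

def dynamic_normalize_quantize(frame, n_bits=8):
    """Rescale a frame to its own min/max, then quantize to n_bits integers.

    Using per-frame statistics (rather than a fixed global range) keeps the
    full quantization range in use even when signal levels drift.
    """
    lo, hi = frame.min(), frame.max()
    scale = (2**n_bits - 1) / (hi - lo + 1e-9)        # [lo, hi] -> [0, 2^n - 1]
    q = np.round((frame - lo) * scale).astype(np.int32)
    return q, (lo, scale)                             # keep params to de-quantize

frame = np.random.randn(40).astype(np.float32)        # dummy 40-dim feature frame
q, (lo, scale) = dynamic_normalize_quantize(frame)
recovered = q / scale + lo                            # approximate inverse mapping
print(np.abs(recovered - frame).max())                # quantization error
```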
Affiliation(s)
- Weixin Zhou
- Chinese Academy of Sciences, Institute of Semiconductors, Beijing, China
- Lina Yu
- Chinese Academy of Sciences, Institute of Semiconductors, Beijing, China
- Ming Zhang
- Chinese Academy of Sciences, Institute of Semiconductors, Beijing, China
- Wan'ang Xiao
- Chinese Academy of Sciences, Institute of Semiconductors, Beijing, China
3
Huang DM, Huang J, Qiao K, Zhong NS, Lu HZ, Wang WJ. Deep learning-based lung sound analysis for intelligent stethoscope. Mil Med Res 2023;10:44. [PMID: 37749643] [PMCID: PMC10521503] [DOI: 10.1186/s40779-023-00479-3]
Abstract
Auscultation is crucial for the diagnosis of respiratory system diseases. However, traditional stethoscopes have inherent limitations, such as inter-listener variability and subjectivity, and they cannot record respiratory sounds for offline/retrospective diagnosis or remote prescriptions in telemedicine. The emergence of digital stethoscopes has overcome these limitations by allowing physicians to store and share respiratory sounds for consultation and education. On this basis, machine learning, particularly deep learning, enables the fully automatic analysis of lung sounds and may pave the way for intelligent stethoscopes. This review therefore aims to provide a comprehensive overview of deep learning algorithms used for lung sound analysis and to emphasize the significance of artificial intelligence (AI) in this field. We focus on each component of deep learning-based lung sound analysis systems, including the task categories, public datasets, denoising methods, and, most importantly, existing deep learning methods, i.e., the state-of-the-art approaches that convert lung sounds into two-dimensional (2D) spectrograms and use convolutional neural networks for the end-to-end recognition of respiratory diseases or abnormal lung sounds. Additionally, this review highlights current challenges in the field, including the variety of devices, noise sensitivity, and the poor interpretability of deep models. To address the poor reproducibility and the variety of deep learning approaches in this field, this review also provides a scalable and flexible open-source framework that aims to standardize the algorithmic workflow and provide a solid basis for replication and future extension: https://github.com/contactless-healthcare/Deep-Learning-for-Lung-Sound-Analysis
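A condensed sketch of the spectrogram-plus-CNN recipe the review surveys is shown below; layer sizes and parameters are illustrative placeholders, and the authors' own framework is available at the repository linked above.

```python
# Condensed sketch of the recipe this review describes: convert a lung sound to
# a 2D log-mel spectrogram and classify it end-to-end with a CNN. Layer sizes
# are placeholders, not the authors' framework.
import librosa
import torch
import torch.nn as nn

def log_mel(path, sr=4000, n_mels=64):
    """Load a recording and return a log-scaled mel spectrogram (n_mels, frames)."""
    y, _ = librosa.load(path, sr=sr)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    return librosa.power_to_db(mel)

cnn = nn.Sequential(                  # tiny stand-in for e.g. a ResNet backbone
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
    nn.Flatten(), nn.Linear(32, 4),   # e.g. normal/crackle/wheeze/both
)
spec = torch.randn(1, 1, 64, 128)     # dummy batch: (N, channels, mels, frames)
print(cnn(spec).shape)                # torch.Size([1, 4])
```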
Affiliation(s)
- Dong-Min Huang
- Department of Biomedical Engineering, Southern University of Science and Technology, Shenzhen, 518055, Guangdong, China
- Jia Huang
- The Third People's Hospital of Shenzhen, Shenzhen, 518112, Guangdong, China
- Kun Qiao
- The Third People's Hospital of Shenzhen, Shenzhen, 518112, Guangdong, China
- Nan-Shan Zhong
- Guangzhou Institute of Respiratory Health, China State Key Laboratory of Respiratory Disease, National Clinical Research Center for Respiratory Disease, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, 510120, China.
- Hong-Zhou Lu
- The Third People's Hospital of Shenzhen, Shenzhen, 518112, Guangdong, China.
- Wen-Jin Wang
- Department of Biomedical Engineering, Southern University of Science and Technology, Shenzhen, 518055, Guangdong, China.
4
Choi Y, Lee H. Interpretation of lung disease classification with light attention connected module. Biomed Signal Process Control 2023;84:104695. [PMID: 36879856] [PMCID: PMC9978539] [DOI: 10.1016/j.bspc.2023.104695]
Abstract
Lung diseases lead to complications from obstructive diseases, and the COVID-19 pandemic has increased lung disease-related deaths. Medical practitioners use stethoscopes to diagnose lung disease; however, because the interpretation of respiratory sounds varies with a practitioner's experience, an artificial intelligence model capable of objective judgment is required. Therefore, in this study, we propose a lung disease classification model that uses an attention module and deep learning. Respiratory sound features were extracted using the log-Mel spectrogram and MFCC. Normal sounds and five types of adventitious sounds were effectively classified by improving VGGish and adding a light attention connected module to which the efficient channel attention module (ECA-Net) was applied. The performance of the model was evaluated in terms of accuracy, precision, sensitivity, specificity, F1-score, and balanced accuracy, which were 92.56%, 92.81%, 92.22%, 98.50%, 92.29%, and 95.4%, respectively, confirming the benefit of the attention mechanism. The basis for each classification was analyzed using gradient-weighted class activation mapping (Grad-CAM), and model performance was further compared on open lung sounds recorded with a Littmann 3200 stethoscope, with experts' opinions also included. Our results will contribute to the early diagnosis and interpretation of disease in patients with lung disease by utilizing these algorithms in smart medical stethoscopes.
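The sketch below shows an ECA-Net-style channel attention block of the kind this paper attaches to its light attention connected module: channel weights come from a 1D convolution over globally pooled channel descriptors. The kernel size and the block's placement in the network are assumptions.

```python
# Sketch of an Efficient Channel Attention (ECA) block: global average pooling,
# a 1D convolution across the channel descriptor, and sigmoid re-weighting.
import torch
import torch.nn as nn

class ECA(nn.Module):
    def __init__(self, k_size=3):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)                 # -> (N, C, 1, 1)
        self.conv = nn.Conv1d(1, 1, k_size, padding=k_size // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):                                   # x: (N, C, H, W)
        w = self.pool(x).squeeze(-1).transpose(1, 2)        # (N, 1, C)
        w = self.sigmoid(self.conv(w)).transpose(1, 2).unsqueeze(-1)  # (N, C, 1, 1)
        return x * w                                        # re-weight channels

feat = torch.randn(2, 64, 16, 16)   # feature map from a VGGish-style backbone
print(ECA()(feat).shape)            # torch.Size([2, 64, 16, 16])
```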
Affiliation(s)
- Youngjin Choi
- School of Industrial Management Engineering, Korea University, 145 Anam-ro, Seongbuk-gu, Seoul 02841, Republic of Korea
- Hongchul Lee
- School of Industrial Management Engineering, Korea University, 145 Anam-ro, Seongbuk-gu, Seoul 02841, Republic of Korea
5
Dianat B, La Torraca P, Manfredi A, Cassone G, Vacchi C, Sebastiani M, Pancaldi F. Classification of pulmonary sounds through deep learning for the diagnosis of interstitial lung diseases secondary to connective tissue diseases. Comput Biol Med 2023;160:106928. [PMID: 37156223] [DOI: 10.1016/j.compbiomed.2023.106928]
Abstract
Early diagnosis of interstitial lung diseases secondary to connective tissue diseases is critical for the treatment and survival of patients. Symptoms such as dry cough and dyspnea appear late in the clinical history and are not specific; moreover, the current approach to confirming the diagnosis of interstitial lung disease is based on high-resolution computed tomography. However, computed tomography involves X-ray exposure for patients and high costs for the health system, which prevents its use in massive screening campaigns for elderly people. In this work we investigate the use of deep learning techniques for the classification of pulmonary sounds acquired from patients affected by connective tissue diseases. The novelty of the work lies in a purpose-built pre-processing pipeline for de-noising and data augmentation. The proposed approach is combined with a clinical study in which the ground truth is provided by high-resolution computed tomography. Various convolutional neural networks achieved an overall accuracy as high as 91% in the classification of lung sounds, leading to a diagnostic accuracy in the range of 91%-93%. Modern high-performance hardware for edge computing can easily support our algorithms. This solution paves the way for large-scale screening of interstitial lung diseases in elderly people based on non-invasive and inexpensive thoracic auscultation.
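A generic sketch of a de-noising and augmentation pipeline follows; the paper's actual pipeline is its stated novelty, so the specific operations here (band-pass filtering, noise injection, time stretching, pitch shifting) are common choices assumed purely for illustration.

```python
# Generic sketch of lung sound de-noising plus data augmentation. Operations and
# parameter ranges are assumed common choices, not the authors' pipeline.
import librosa
import numpy as np
from scipy.signal import butter, sosfiltfilt

def bandpass(y, sr, lo=100.0, hi=1800.0):
    """Keep the band where most lung-sound energy lives; attenuate rumble/hiss."""
    sos = butter(4, [lo, hi], btype="bandpass", fs=sr, output="sos")
    return sosfiltfilt(sos, y)

def augment(y, sr, rng):
    """Return one randomly perturbed copy of a recording."""
    y = y + rng.normal(0, 0.005, y.shape)                    # additive noise
    y = librosa.effects.time_stretch(y, rate=rng.uniform(0.9, 1.1))
    y = librosa.effects.pitch_shift(y, sr=sr, n_steps=int(rng.integers(-2, 3)))
    return y

rng = np.random.default_rng(0)
y, sr = librosa.load(librosa.ex("trumpet"))   # stand-in signal for a lung sound
clean = bandpass(y, sr)
batch = [augment(clean, sr, rng) for _ in range(4)]  # 4 augmented copies
```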
Affiliation(s)
- Behnood Dianat
- University of Modena and Reggio Emilia, Department of Sciences and Methods for Engineering, Via G. Amendola 2, 42122 Reggio Emilia, Italy; University of Modena and Reggio Emilia, Artificial Intelligence Research and Innovation Center (AIRI), Via Pietro Vivarelli 10, 41125 Modena, Italy
- Paolo La Torraca
- University of Modena and Reggio Emilia, Department of Sciences and Methods for Engineering, Via G. Amendola 2, 42122 Reggio Emilia, Italy
- Andreina Manfredi
- University of Modena and Reggio Emilia, Department of Surgery, Medicine, Dentistry and Morphological Sciences with Transplant Surgery, Oncology and Regenerative Medicine Relevance, via del Pozzo 71, 41124, Modena, Italy; Azienda Policlinico di Modena, Rheumatology Unit, via del Pozzo 71, 41124, Modena, Italy
- Giulia Cassone
- University of Modena and Reggio Emilia, Department of Surgery, Medicine, Dentistry and Morphological Sciences with Transplant Surgery, Oncology and Regenerative Medicine Relevance, via del Pozzo 71, 41124, Modena, Italy; Azienda Policlinico di Modena, Rheumatology Unit, via del Pozzo 71, 41124, Modena, Italy
- Caterina Vacchi
- University of Modena and Reggio Emilia, Department of Surgery, Medicine, Dentistry and Morphological Sciences with Transplant Surgery, Oncology and Regenerative Medicine Relevance, via del Pozzo 71, 41124, Modena, Italy; Azienda Policlinico di Modena, Rheumatology Unit, via del Pozzo 71, 41124, Modena, Italy
- Marco Sebastiani
- University of Modena and Reggio Emilia, Department of Surgery, Medicine, Dentistry and Morphological Sciences with Transplant Surgery, Oncology and Regenerative Medicine Relevance, via del Pozzo 71, 41124, Modena, Italy; Azienda Policlinico di Modena, Rheumatology Unit, via del Pozzo 71, 41124, Modena, Italy
- Fabrizio Pancaldi
- University of Modena and Reggio Emilia, Department of Sciences and Methods for Engineering, Via G. Amendola 2, 42122 Reggio Emilia, Italy; University of Modena and Reggio Emilia, Artificial Intelligence Research and Innovation Center (AIRI), Via Pietro Vivarelli 10, 41125 Modena, Italy.
6
Song W, Han J. Patch-level contrastive embedding learning for respiratory sound classification. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2022.104338]
7
Tzudir M, Baghel S, Sarmah P, Prasanna SRM. Under-resourced dialect identification in Ao using source information. J Acoust Soc Am 2022;152:1755. [PMID: 36182313] [DOI: 10.1121/10.0014176]
Abstract
This paper reports the findings of an automatic dialect identification (DID) task conducted on Ao speech data using source features. Considering that Ao is a tone language, this study proposes the gammatonegram of the linear prediction residual as a feature for DID. As Ao is an under-resourced language, data augmentation was carried out to increase the size of the speech corpus; the results showed that augmentation improved DID by 14%. A perception test conducted on Ao speakers showed that subjects identified dialects better when the utterance duration was 3 s, and automatic DID was accordingly conducted on utterances of various durations. A baseline DID system using the S_lms feature attained an average F1-score of 53.84% on 3 s utterances. Including the source features S_ilpr and S improved the F1-score to 60.69%. In a final system combining the S_ilpr, S, S_lms, and Mel-frequency cepstral coefficient features, the F1-score increased to 61.46%.
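The sketch below computes the linear prediction (LP) residual that the proposed gammatonegram feature is built on; the LPC order is an assumption, and the gammatone filterbank stage is only indicated in a comment (librosa has no gammatonegram; third-party packages such as `gammatone` provide one).

```python
# Minimal sketch of the source signal this paper builds on: the LP residual,
# whose gammatonegram is proposed as a dialect ID feature. Order 16 is assumed.
import librosa
import numpy as np
from scipy.signal import lfilter

y, sr = librosa.load(librosa.ex("trumpet"), sr=16000)  # stand-in for Ao speech
a = librosa.lpc(y, order=16)         # prediction polynomial [1, a_1, ..., a_16]
residual = lfilter(a, [1.0], y)      # e[n] = x[n] + sum_k a_k x[n-k]
# A gammatonegram of `residual` would then be computed with a gammatone
# filterbank (e.g., via the third-party `gammatone` package).
print(residual.shape, np.abs(residual).mean())  # residual is typically whitened
```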
Affiliation(s)
- Moakala Tzudir
- Indian Institute of Technology Guwahati, Guwahati-781039, India
- Shikha Baghel
- Indian Institute of Technology Guwahati, Guwahati-781039, India
8
Neili Z, Sundaraj K. A comparative study of the spectrogram, scalogram, melspectrogram and gammatonegram time-frequency representations for the classification of lung sounds using the ICBHI database based on CNNs. Biomed Tech (Berl) 2022;67:367-390. [PMID: 35926850] [DOI: 10.1515/bmt-2022-0180]
Abstract
In lung sound classification using deep learning, the short-time Fourier transform (STFT) spectrogram has been the most commonly used 2D representation of the input data. STFT is thus a widely used analytical tool, but other time-frequency representations have also been developed. This study aims to evaluate and compare the performance of the spectrogram, scalogram, melspectrogram and gammatonegram representations, and to provide users with comparative information regarding the suitability of these time-frequency (TF) techniques for lung sound classification. The lung sound signals used in this study were obtained from the ICBHI 2017 respiratory sound database. These recordings were converted into images of the spectrogram, scalogram, melspectrogram and gammatonegram TF representations, respectively, and the four types of images were fed separately into the VGG16, ResNet-50 and AlexNet deep-learning architectures. Network performance was analyzed and compared based on accuracy, precision, recall and F1-score. The results indicate that the gammatonegram and scalogram TF images coupled with ResNet-50 achieved the maximum classification accuracies across these three commonly used CNN architectures.
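The sketch below generates three of the four TF images compared in this study (STFT spectrogram, mel-spectrogram, CWT scalogram); the gammatonegram additionally requires a gammatone filterbank from a third-party package, and all parameter values are illustrative assumptions.

```python
# Sketch of three time-frequency representations of a lung sound recording.
# FFT size, mel-band count and CWT scales are illustrative assumptions.
import librosa
import numpy as np
import pywt

y, sr = librosa.load(librosa.ex("trumpet"), sr=4000)  # stand-in for an ICBHI cycle
y = y[: sr * 2]                                       # keep a 2 s excerpt

spectrogram = librosa.amplitude_to_db(np.abs(librosa.stft(y, n_fft=512)))
melspec = librosa.power_to_db(librosa.feature.melspectrogram(y=y, sr=sr, n_mels=64))
scalogram, _ = pywt.cwt(y, scales=np.arange(1, 65), wavelet="morl")

# Each 2D array would then be rendered/resized into an image and fed to a CNN
# such as VGG16, ResNet-50 or AlexNet; a gammatonegram needs a gammatone filterbank.
for name, img in [("STFT", spectrogram), ("mel", melspec), ("CWT", scalogram)]:
    print(name, img.shape)
```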
Affiliation(s)
- Zakaria Neili
- Electronics Department, University of Badji Mokhtar Annaba, Annaba, Algeria
- Kenneth Sundaraj
- Faculty of Electronics and Computer Engineering, Universiti Teknikal Malaysia Melaka, Melaka, Malaysia