1
Satar M, Cengizler C, Hamitoglu S, Ozdemir M. Investigation of Relation Between Hypoxic-Ischemic Encephalopathy and Spectral Features of Infant Cry Audio. J Voice 2024; 38:1288-1295. [PMID: 35760634 DOI: 10.1016/j.jvoice.2022.05.015] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/04/2022] [Revised: 05/23/2022] [Accepted: 05/23/2022] [Indexed: 11/19/2022]
Abstract
Despite advances in medical technologies, Hypoxic-Ischemic Encephalopathy (HIE) continues to be a problem for neonatal intensive care units. Analysis of crying sounds may be a valuable tool for predicting neonatal disease. However, the characteristics of crying in newborns with HIE are still unclear. One factor limiting research on this subject is the lack of a commercially available infant cry database. Another factor that complicates classification is the varying characteristics of infant cries. Accordingly, crying sounds were recorded from 35 infants, and the demographic characteristics of the study groups are presented along with numerical representations of the spectral features. Experiments reveal that the presence of HIE causes distinctive variation in the energy, energy entropy, and spectral centroid features of the utterances, which leads us to conclude that the presented combination of spectral features would function well with any supervised or unsupervised machine learning algorithm.
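The three features named in this abstract (energy, energy entropy, spectral centroid) have common textbook definitions. The abstract does not give the authors' exact formulas, so the following is only a minimal sketch under assumed definitions; the function names, the `n_blocks` parameter, and the naive DFT are illustrative choices, not the paper's implementation.

```python
import cmath
import math


def frame_energy(frame):
    """Mean squared amplitude of one audio frame."""
    return sum(x * x for x in frame) / len(frame)


def energy_entropy(frame, n_blocks=8):
    """Shannon entropy (bits) of how energy is spread over equal sub-blocks.

    n_blocks is an assumed setting; low values mean the energy is
    concentrated in a few sub-blocks (abrupt changes within the frame).
    """
    block = len(frame) // n_blocks
    energies = [
        sum(x * x for x in frame[i * block:(i + 1) * block])
        for i in range(n_blocks)
    ]
    total = sum(energies)
    if total == 0:
        return 0.0
    probs = [e / total for e in energies if e > 0]
    return -sum(p * math.log2(p) for p in probs)


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency in Hz, using a naive O(n^2) DFT."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(
            frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        mags.append(abs(acc))
    total = sum(mags)
    if total == 0:
        return 0.0
    return sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total


# Usage: a synthetic 1 kHz tone sampled at 8 kHz stands in for a cry frame.
tone = [math.sin(2 * math.pi * 1000 * t / 8000) for t in range(64)]
features = (
    frame_energy(tone),
    energy_entropy(tone),
    spectral_centroid(tone, 8000),
)
```

In practice a recording would be split into short overlapping frames and these values computed per frame; the naive DFT is used here only to keep the sketch dependency-free.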
Affiliation(s)
- Mehmet Satar
- Division of Neonatology, Department of Pediatrics, Medical School, Cukurova University, Adana, Turkey.
- Caglar Cengizler
- Electric and Energy Program, AOSB Technical Sciences Vocational School, Cukurova University, Adana, Turkey
- Serif Hamitoglu
- Division of Neonatology, Department of Pediatrics, Medical School, Cukurova University, Adana, Turkey
- Mustafa Ozdemir
- Division of Neonatology, Department of Pediatrics, Medical School, Cukurova University, Adana, Turkey
2
Matikolaie FS, Tadj C. Machine Learning-Based Cry Diagnostic System for Identifying Septic Newborns. J Voice 2024; 38:963.e1-963.e14. [PMID: 35193790 DOI: 10.1016/j.jvoice.2021.12.021] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 10/06/2021] [Revised: 12/28/2021] [Accepted: 12/29/2021] [Indexed: 10/19/2022]
Abstract
BACKGROUND AND OBJECTIVE Processing the newborns' cry audio signal (CAS) provides valuable information about the newborns' condition. This information can be used to diagnose the disease. This article analyzes the CASs of newborns under two months old using machine learning approaches to develop an automatic diagnostic system for identifying septic infants from healthy ones. Septic infants have not been studied in this context. METHODOLOGY The proposed features include Mel frequency cepstral coefficients and the prosodic features of tilt, rhythm, and intensity. The performance of each feature set was evaluated using a collection of classifiers, including Support Vector Machine (SVM), decision tree, and discriminant analysis. We also examined the majority voting method for improving the classification results and feature manipulation and multiple classifier framework, which has not previously been reported in the literature on developing an automatic diagnostic system based on the infant's CAS. We tested our methodology on two datasets of expiration and inspiration episodes of newborns' CASs. RESULTS AND CONCLUSION The framework of the concatenation of all feature sets using quadratic SVM resulted in the best F-score with 86% for the expiration dataset. Furthermore, the framework of tilt feature set with quadratic discriminant with 83.90% resulted in the best F-score for inspiration. We found out that septic infants cry differently than healthy infants through these experiments. Thus, our proposed method can be used as a noninvasive tool for identifying septic infants from healthy ones only based on their CAS.
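The majority voting step mentioned in this abstract can be illustrated generically. This is a sketch of plain (unweighted) majority voting over per-classifier label lists; the function name, the tie-breaking rule, and the example labels are assumptions, since the abstract does not describe the authors' exact scheme.

```python
from collections import Counter


def majority_vote(predictions):
    """Fuse label predictions from several classifiers, sample by sample.

    predictions: list of label lists, one per classifier, all the same length.
    Ties go to the tied label that appears first in classifier order
    (an assumed rule; other tie-breaks are equally valid).
    """
    fused = []
    for labels in zip(*predictions):
        counts = Counter(labels)
        top = max(counts.values())
        winner = next(label for label in labels if counts[label] == top)
        fused.append(winner)
    return fused


# Usage with hypothetical outputs of three classifiers on three cry samples.
svm_out = ["septic", "healthy", "septic"]
tree_out = ["healthy", "healthy", "septic"]
da_out = ["septic", "septic", "septic"]
fused_labels = majority_vote([svm_out, tree_out, da_out])
```

An odd number of voters avoids ties in the two-class case, which is one reason three base classifiers are a common choice.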
Affiliation(s)
- Chakib Tadj
- Department of Electrical Engineering, École de Technologie Supérieure, Montreal, QC, H3C 1K3, Canada
3
Kumari P, Mahto K. A Narrative Review on Different Novel Machine Learning Techniques for Detecting Pathologies in Infants From Born Baby Cries. J Voice 2024:S0892-1997(24)00077-8. [PMID: 38714440 DOI: 10.1016/j.jvoice.2024.03.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2023] [Revised: 03/09/2024] [Accepted: 03/11/2024] [Indexed: 05/09/2024]
Abstract
This paper reviews the research work on the analysis and classification of pathological infant cries in the last 50 years. The literature review mainly covers the need and role of early clinical diagnosis, pathologies detected from cry samples, challenges in pathological cry signal data acquisition, signal processing techniques, and signal classifiers. The signal processing techniques include preprocessing, feature extraction from domains, such as time, spectral, time-frequency, prosodic, wavelet, etc, and feature selection for selecting dominant features. Literature covers traditional machine learning classifiers, such as Bayesian networks, decision trees, K-nearest neighbor, support vector machine, Gaussian mixture model, etc, and recently added neural network models, such as convolutional neural networks, regression neural networks, probabilistic neural networks, graph neural networks, etc. Significant experimental results of pathological cry identification and classification are listed for comparison. Finally, it suggests future research in the direction of database preparation, feature analysis and extraction, neural network classifiers to provide a non-invasive and robust automatic infant cry analysis model.
Affiliation(s)
- Preeti Kumari
- Department of Electronics and Communication Engineering, Birla Institute of Technology Mesra, Ranchi, Jharkhand, India.
- Kartik Mahto
- Department of Electronics and Communication Engineering, Birla Institute of Technology Mesra, Ranchi, Jharkhand, India.
4
Zhang K, Ting HN, Choo YM. Baby cry recognition based on WOA-VMD and an improved Dempster-Shafer evidence theory. Comput Methods Programs Biomed 2024; 245:108043. [PMID: 38306944 DOI: 10.1016/j.cmpb.2024.108043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2023] [Revised: 01/04/2024] [Accepted: 01/20/2024] [Indexed: 02/04/2024]
Abstract
BACKGROUND AND OBJECTIVE Conflict may happen when more than one classifier is used to perform prediction or classification. The recognition model error leads to conflicting evidence. These conflicts can cause decision errors in a baby cry recognition and further decrease its recognition accuracy. Thus, the objective of this study is to propose a method that can effectively minimize the conflict among deep learning models and improve the accuracy of baby cry recognition. METHODS An improved Dempster-Shafer evidence theory (DST) based on Wasserstein distance and Deng entropy was proposed to reduce the conflicts among the results by combining the credibility degree between evidence and the uncertainty degree of evidence. To validate the effectiveness of the proposed method, examples were analyzed, and applied in a baby cry recognition. The Whale optimization algorithm-Variational mode decomposition (WOA-VMD) was used to optimally decompose the baby cry signals. The deep features of decomposed components were extracted using the VGG16 model. Long Short-Term Memory (LSTM) models were used to classify baby cry signals. An improved DST decision method was used to obtain the decision fusion. RESULTS The proposed fusion method achieves an accuracy of 90.15% in classifying three types of baby cry. Improvement between 2.90% and 4.98% was obtained over the existing DST fusion methods. Recognition accuracy was improved by between 5.79% and 11.53% when compared to the latest methods used in baby cry recognition. CONCLUSION The proposed method optimally decomposes baby cry signal, effectively reduces the conflict among the results of deep learning models and improves the accuracy of baby cry recognition.
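For context on the decision fusion step, the classical Dempster rule of combination can be sketched as follows. This is the standard rule only, not the improved variant based on Wasserstein distance and Deng entropy that the abstract proposes; the function name and the three cry classes ("hunger", "pain", "sleepy") are hypothetical placeholders.

```python
def dempster_combine(m1, m2):
    """Combine two mass functions with the classical Dempster rule.

    m1, m2: dicts mapping frozenset hypotheses to masses that sum to 1.
    Returns (combined mass function, conflict coefficient K).
    """
    combined = {}
    conflict = 0.0
    for a, mass_a in m1.items():
        for b, mass_b in m2.items():
            common = a & b
            if common:
                combined[common] = combined.get(common, 0.0) + mass_a * mass_b
            else:
                # Mass assigned to incompatible hypotheses counts as conflict.
                conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    norm = 1.0 - conflict
    return {h: m / norm for h, m in combined.items()}, conflict


# Usage: two hypothetical classifiers expressing belief over three cry types.
HUNGER = frozenset({"hunger"})
PAIN = frozenset({"pain"})
ANY = frozenset({"hunger", "pain", "sleepy"})  # mass left on "unknown"

model_a = {HUNGER: 0.6, PAIN: 0.3, ANY: 0.1}
model_b = {HUNGER: 0.5, PAIN: 0.4, ANY: 0.1}
fused_mass, k = dempster_combine(model_a, model_b)
decision = max(fused_mass, key=fused_mass.get)
```

When K approaches 1 the normalization amplifies small agreements and can yield counterintuitive decisions, which is the kind of conflict problem that modified fusion rules such as the one in this paper aim to reduce.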
Affiliation(s)
- Ke Zhang
- Department of Biomedical Engineering, Faculty of Engineering, Universiti Malaya, Jalan Pantai Baharu, 50603 Kuala Lumpur, Malaysia
- Hua-Nong Ting
- Department of Biomedical Engineering, Faculty of Engineering, Universiti Malaya, Jalan Pantai Baharu, 50603 Kuala Lumpur, Malaysia; Faculty of Medical Engineering, Jining Medical University, University Park, National High-tech Zone, 272067 Jining City, Shandong Province, China.
- Yao-Mun Choo
- Department of Paediatrics, Faculty of Medicine, Universiti Malaya, Jalan Pantai Baharu, 50603 Kuala Lumpur, Malaysia
5
Zayed Y, Hasasneh A, Tadj C. Infant Cry Signal Diagnostic System Using Deep Learning and Fused Features. Diagnostics (Basel) 2023; 13:2107. [PMID: 37371002 DOI: 10.3390/diagnostics13122107] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/09/2023] [Revised: 06/09/2023] [Accepted: 06/13/2023] [Indexed: 06/29/2023] [Open Access]
Abstract
Early diagnosis of medical conditions in infants is crucial for ensuring timely and effective treatment. However, infants are unable to verbalize their symptoms, making it difficult for healthcare professionals to accurately diagnose their conditions. Crying is often the only way for infants to communicate their needs and discomfort. In this paper, we propose a medical diagnostic system for interpreting infants' cry audio signals (CAS) using a combination of different audio domain features and deep learning (DL) algorithms. The proposed system utilizes a dataset of labeled audio signals from infants with specific pathologies. The dataset covers two infant pathologies with high mortality rates, neonatal respiratory distress syndrome (RDS) and sepsis, together with cries of healthy infants. The system employed the harmonic ratio (HR) as a prosodic feature, the Gammatone frequency cepstral coefficients (GFCCs) as a cepstral feature, and image-based features from the spectrogram, which are extracted using a pretrained convolutional neural network (CNN) model and fused with the other features to draw on multiple domains and improve the classification rate and the accuracy of the model. The different combinations of the fused features are then fed into multiple machine learning algorithms, including random forest (RF), support vector machine (SVM), and deep neural network (DNN) models. Evaluation of the system using accuracy, precision, recall, F1-score, the confusion matrix, and the receiver operating characteristic (ROC) curve showed promising results for the early diagnosis of medical conditions in infants based on the crying signals only; the system achieved its highest accuracy of 97.50% using the combination of the spectrogram, HR, and GFCC through the deep learning process.
The findings demonstrated the importance of fusing different audio features, especially the spectrogram, through the learning process rather than by simple concatenation, and of using deep learning algorithms to extract sparsely represented features that can later be used in the classification problem, which improves the separation between different infant pathologies. The results outperformed the published benchmark paper by extending the task to multiclass classification (RDS, sepsis, and healthy), investigating a new type of feature, the spectrogram, and using a new feature fusion technique, namely fusion through the learning process of the deep learning model.
Affiliation(s)
- Yara Zayed
- Department of Natural, Engineering and Technology Sciences, Faculty of Graduate Studies, Arab American University, Ramallah P.O. Box 240, Palestine
- Ahmad Hasasneh
- Department of Natural, Engineering and Technology Sciences, Faculty of Graduate Studies, Arab American University, Ramallah P.O. Box 240, Palestine
- Chakib Tadj
- Department of Electrical Engineering, École de Technologie Supérieure, Université du Québec, Montréal, QC H3C 1K3, Canada
6
Ozseven T. Infant cry classification by using different deep neural network models and hand-crafted features. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2023.104648] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/10/2023]
7
Khalilzad Z, Tadj C. Using CCA-Fused Cepstral Features in a Deep Learning-Based Cry Diagnostic System for Detecting an Ensemble of Pathologies in Newborns. Diagnostics (Basel) 2023; 13:879. [PMID: 36900023 PMCID: PMC10000938 DOI: 10.3390/diagnostics13050879] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/27/2022] [Revised: 02/14/2023] [Accepted: 02/21/2023] [Indexed: 03/02/2023] [Open Access]
Abstract
Crying is one of the means of communication for a newborn. Newborn cry signals convey precious information about the newborn's health condition and their emotions. In this study, cry signals of healthy and pathologic newborns were analyzed for the purpose of developing an automatic, non-invasive, and comprehensive Newborn Cry Diagnostic System (NCDS) that identifies pathologic newborns from healthy infants. For this purpose, Mel-frequency Cepstral Coefficients (MFCC) and Gammatone Frequency Cepstral Coefficients (GFCC) were extracted as features. These feature sets were also combined and fused through Canonical Correlation Analysis (CCA), which provides a novel manipulation of the features that have not yet been explored in the literature on NCDS designs, to the best of our knowledge. All the mentioned feature sets were fed to the Support Vector Machine (SVM) and Long Short-term Memory (LSTM). Furthermore, two Hyperparameter optimization methods, Bayesian and grid search, were examined to enhance the system's performance. The performance of our proposed NCDS was evaluated with two different datasets of inspiratory and expiratory cries. The CCA fusion feature set using the LSTM classifier accomplished the best F-score in the study, with 99.86% for the inspiratory cry dataset. The best F-score regarding the expiratory cry dataset, 99.44%, belonged to the GFCC feature set employing the LSTM classifier. These experiments suggest the high potential and value of using the newborn cry signals in the detection of pathologies. The framework proposed in this study can be implemented as an early diagnostic tool for clinical studies and help in the identification of pathologic newborns.
8
Newborn Cry-Based Diagnostic System to Distinguish between Sepsis and Respiratory Distress Syndrome Using Combined Acoustic Features. Diagnostics (Basel) 2022; 12:2802. [PMID: 36428865 PMCID: PMC9689015 DOI: 10.3390/diagnostics12112802] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 10/12/2022] [Revised: 11/05/2022] [Accepted: 11/11/2022] [Indexed: 11/18/2022] [Open Access]
Abstract
Crying is the only means of communication for a newborn baby with its surrounding environment, but it also provides significant information about the newborn's health, emotions, and needs. The cries of newborn babies have long been known as a biomarker for the diagnosis of pathologies. However, to the best of our knowledge, exploring the discrimination of two pathology groups by means of cry signals is unprecedented. Therefore, this study aimed to distinguish septic newborns from those with Neonatal Respiratory Distress Syndrome (RDS) by employing the Machine Learning (ML) methods of Multilayer Perceptron (MLP) and Support Vector Machine (SVM). Furthermore, the cry signal was analyzed from the following two perspectives: 1) the musical perspective, by studying the spectral feature set of Harmonic Ratio (HR), and 2) the speech processing perspective, using the short-term feature set of Gammatone Frequency Cepstral Coefficients (GFCCs). In order to assess the role of employing features from both short-term and spectral modalities in distinguishing the two pathology groups, they were fused into one feature set named the combined features. The hyperparameters (HPs) of the implemented ML approaches were fine-tuned to fit each experiment. Finally, by normalizing and fusing the features originating from the two modalities, the overall performance of the proposed design was improved across all evaluation measures, achieving accuracies of 92.49% and 95.3% by the MLP and SVM classifiers, respectively. The MLP classifier was outperformed in terms of all evaluation measures presented in this study, except for the Area Under the Curve of the Receiver Operating Characteristic (AUC-ROC), which signifies the ability of the proposed design in class separation. The achieved results highlighted the role of combining features from different levels and modalities for a more powerful analysis of the cry signals, as well as including a neural network (NN)-based classifier.
Consequently, attaining a 95.3% accuracy for the separation of two entangled pathology groups of RDS and sepsis elucidated the promising potential for further studies with larger datasets and more pathology groups.
9
Khalilzad Z, Kheddache Y, Tadj C. An Entropy-Based Architecture for Detection of Sepsis in Newborn Cry Diagnostic Systems. Entropy (Basel) 2022; 24:1194. [PMID: 36141080 PMCID: PMC9498202 DOI: 10.3390/e24091194] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/22/2022] [Revised: 08/18/2022] [Accepted: 08/22/2022] [Indexed: 06/16/2023]
Abstract
The acoustic characteristics of cries are an exhibition of an infant's health condition and these characteristics have been acknowledged as indicators for various pathologies. This study focused on the detection of infants suffering from sepsis by developing a simplified design using acoustic features and conventional classifiers. The features for the proposed framework were Mel-frequency Cepstral Coefficients (MFCC), Spectral Entropy Cepstral Coefficients (SENCC) and Spectral Centroid Cepstral Coefficients (SCCC), which were classified through K-nearest Neighborhood (KNN) and Support Vector Machine (SVM) classification methods. The performance of the different combinations of the feature sets was also evaluated based on several measures such as accuracy, F1-score and Matthews Correlation Coefficient (MCC). Bayesian Hyperparameter Optimization (BHPO) was employed to tailor the classifiers uniquely to fit each experiment. The proposed methodology was tested on two datasets of expiratory cries (EXP) and voiced inspiratory cries (INSV). The highest accuracy and F-score were 89.99% and 89.70%, respectively. This framework also implemented a novel feature selection method based on Fuzzy Entropy (FE) as a final experiment. By employing FE, the number of features was reduced by more than 40%, whereas the evaluation measures were not hindered for the EXP dataset and were even enhanced for the INSV dataset. Therefore, it was deduced through these experiments that an entropy-based framework is successful for identifying sepsis in neonates and has the advantage of achieving high performance with conventional machine learning (ML) approaches, which makes it a reliable means for the early diagnosis of sepsis in deprived areas of the world.
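The evaluation measures named in this abstract (accuracy, F1-score, Matthews Correlation Coefficient) have standard definitions for binary classification. The sketch below computes them from confusion-matrix counts; the function name and the example counts are hypothetical and are not results from the paper.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Accuracy, F1-score, and MCC from binary confusion-matrix counts.

    tp/fp/fn/tn: true positives, false positives, false negatives,
    true negatives. Undefined ratios are reported as 0.0 by convention.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "f1": f1, "mcc": mcc}


# Usage with made-up counts for a balanced two-class test set of 100 samples.
scores = binary_metrics(tp=45, fp=5, fn=5, tn=45)
```

MCC uses all four cells of the confusion matrix, so it remains informative on imbalanced data where accuracy alone can look deceptively high.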
10
Lahmiri S, Tadj C, Gargour C. Nonlinear Statistical Analysis of Normal and Pathological Infant Cry Signals in Cepstrum Domain by Multifractal Wavelet Leaders. Entropy (Basel) 2022; 24:1166. [PMID: 36010830 PMCID: PMC9407617 DOI: 10.3390/e24081166] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/25/2021] [Revised: 04/06/2022] [Accepted: 08/19/2022] [Indexed: 06/15/2023]
Abstract
Multifractal behavior in the cepstrum representation of healthy and unhealthy infant cry signals is examined by means of wavelet leaders and compared using the Student t-test. The empirical results show that both expiration and inspiration signals exhibit clear evidence of multifractal properties under healthy and unhealthy conditions. In addition, expiration and inspiration signals exhibit more complexity under healthy conditions than under unhealthy conditions. Furthermore, distributions of multifractal characteristics are different across healthy and unhealthy conditions. Hence, this study improves the understanding of infant crying by providing a complete description of its intrinsic dynamics to better evaluate its health status.
Affiliation(s)
- Salim Lahmiri
- Department of Supply Chain and Business Technology Management, John Molson School of Business, Concordia University, Montreal, QC H3G 1M8, Canada
- Department of Electrical Engineering, École de Technologie Supérieure, Montreal, QC H3C 1K3, Canada
- Chakib Tadj
- Department of Electrical Engineering, École de Technologie Supérieure, Montreal, QC H3C 1K3, Canada
- Christian Gargour
- Department of Electrical Engineering, École de Technologie Supérieure, Montreal, QC H3C 1K3, Canada
11
12
Salehian Matikolaie F, Tadj C. On the use of long-term features in a newborn cry diagnostic system. Biomed Signal Process Control 2020. [DOI: 10.1016/j.bspc.2020.101889] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 10/24/2022]