1
|
Matikolaie FS, Tadj C. Machine Learning-Based Cry Diagnostic System for Identifying Septic Newborns. J Voice 2024; 38:963.e1-963.e14. [PMID: 35193790 DOI: 10.1016/j.jvoice.2021.12.021] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 10/06/2021] [Revised: 12/28/2021] [Accepted: 12/29/2021] [Indexed: 10/19/2022]
Abstract
BACKGROUND AND OBJECTIVE Processing the newborns' cry audio signal (CAS) provides valuable information about the newborns' condition. This information can be used to diagnose disease. This article analyzes the CASs of newborns under two months old using machine learning approaches to develop an automatic diagnostic system for distinguishing septic infants from healthy ones. Septic infants have not been studied in this context. METHODOLOGY The proposed features include Mel frequency cepstral coefficients and the prosodic features of tilt, rhythm, and intensity. The performance of each feature set was evaluated using a collection of classifiers, including Support Vector Machine (SVM), decision tree, and discriminant analysis. We also examined majority voting for improving the classification results, together with feature manipulation and a multiple-classifier framework, which has not previously been reported in the literature on developing an automatic diagnostic system based on the infant's CAS. We tested our methodology on two datasets of expiration and inspiration episodes of newborns' CASs. RESULTS AND CONCLUSION Concatenating all feature sets and classifying with a quadratic SVM gave the best F-score for the expiration dataset, 86%. The tilt feature set with a quadratic discriminant classifier gave the best F-score for the inspiration dataset, 83.90%. These experiments indicate that septic infants cry differently from healthy infants. Thus, our proposed method can be used as a noninvasive tool for distinguishing septic infants from healthy ones based only on their CAS.
Affiliation(s)
- Chakib Tadj
- Department of Electrical Engineering, École De Technologie Supérieure, Montreal, QC, H3C 1K3, Canada
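The majority-voting and F-score evaluation mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `majority_vote` and `f_score`, the label encoding (1 = septic, 0 = healthy), and the toy predictions are all assumptions made for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one label list.

    predictions: list of lists, one inner list of labels per classifier
    (hypothetical classifiers; all lists must have the same length).
    With an odd number of binary classifiers no ties can occur.
    """
    combined = []
    for votes in zip(*predictions):
        # Pick the label that most classifiers agreed on for this sample.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def f_score(y_true, y_pred, positive=1):
    """F1-score: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example: three hypothetical classifiers voting on four cry samples.
votes = [
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
]
fused = majority_vote(votes)
score = f_score([1, 0, 0, 1], fused)
```

Voting only helps when the individual classifiers make partly independent errors, which is one reason to combine classifiers trained on different feature sets.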
|
2
|
Keles E, Bagci U. The past, current, and future of neonatal intensive care units with artificial intelligence: a systematic review. NPJ Digit Med 2023; 6:220. [PMID: 38012349 PMCID: PMC10682088 DOI: 10.1038/s41746-023-00941-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2023] [Accepted: 10/05/2023] [Indexed: 11/29/2023] Open
Abstract
Machine learning and deep learning are two subsets of artificial intelligence that involve teaching computers to learn and make decisions from any sort of data. Most recent developments in artificial intelligence are coming from deep learning, which has proven revolutionary in almost all fields, from computer vision to health sciences. The effects of deep learning in medicine have changed the conventional ways of clinical application significantly. Although some sub-fields of medicine, such as pediatrics, have been relatively slow in receiving the critical benefits of deep learning, related research in pediatrics has started to accumulate to a significant level, too. Hence, in this paper, we review recently developed machine learning and deep learning-based solutions for neonatology applications. We systematically evaluate the roles of both classical machine learning and deep learning in neonatology applications, define the methodologies, including algorithmic developments, and describe the remaining challenges in the assessment of neonatal diseases by using PRISMA 2020 guidelines. To date, the primary areas of focus in neonatology regarding AI applications have included survival analysis, neuroimaging, analysis of vital parameters and biosignals, and retinopathy of prematurity diagnosis. We have categorically summarized 106 research articles from 1996 to 2022 and discussed their pros and cons, respectively. In this systematic review, we aimed to further enhance the comprehensiveness of the study. We also discuss possible directions for new AI models and the future of neonatology with the rising power of AI, suggesting roadmaps for the integration of AI into neonatal intensive care units.
Affiliation(s)
- Elif Keles
- Northwestern University, Feinberg School of Medicine, Department of Radiology, Chicago, IL, USA
- Ulas Bagci
- Northwestern University, Feinberg School of Medicine, Department of Radiology, Chicago, IL, USA
- Northwestern University, Department of Biomedical Engineering, Chicago, IL, USA
- Department of Electrical and Computer Engineering, Chicago, IL, USA
|
3
|
Liu Y, Zhang E, Jia X, Wu Y, Liu J, Brewer LM, Yu L. Tracheal sound-based apnea detection using hidden Markov model in sedated volunteers and post anesthesia care unit patients. J Clin Monit Comput 2023; 37:1061-1070. [PMID: 37140851 DOI: 10.1007/s10877-023-01015-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2022] [Accepted: 04/13/2023] [Indexed: 05/05/2023]
Abstract
The current method of apnea detection based on tracheal sounds is limited in certain situations. In this work, the Hidden Markov Model (HMM) algorithm based on segmentation is used to classify the respiratory and non-respiratory states of tracheal sounds, to achieve the purpose of apnea detection. Three groups of tracheal sounds were used, including two groups of data collected in the laboratory and a group of patient data in the post anesthesia care unit (PACU). One was used for model training, and the others (laboratory test group and clinical test group) were used for testing and apnea detection. The trained HMMs were used to segment the tracheal sounds in laboratory test data and clinical test data. Apnea was detected according to the segmentation results and respiratory flow rate/pressure which was the reference signal in two test groups. The sensitivity, specificity, and accuracy were calculated. For the laboratory test data, apnea detection sensitivity, specificity, and accuracy were 96.9%, 95.5%, and 95.7%, respectively. For the clinical test data, apnea detection sensitivity, specificity, and accuracy were 83.1%, 99.0% and 98.6%. Apnea detection based on tracheal sound using HMM is accurate and reliable for sedated volunteers and patients in PACU.
Affiliation(s)
- Yang Liu
- Department of Stomatology, The Fourth Affiliated Hospital of China Medical University, Shenyang, Liaoning, People's Republic of China
- Erpeng Zhang
- Department of Biomedical Engineering, School of Intelligent Medicine, China Medical University, No. 77, Puhe Road, Shenyang North New Area, Shenyang, 110122, Liaoning, People's Republic of China
- Xiuzhu Jia
- Department of Biomedical Engineering, School of Intelligent Medicine, China Medical University, No. 77, Puhe Road, Shenyang North New Area, Shenyang, 110122, Liaoning, People's Republic of China
- Yanan Wu
- College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, Liaoning, People's Republic of China
- Jing Liu
- Department of Nuclear Medicine, Zhongnan Hospital of Wuhan University, Wuhan, Hubei, People's Republic of China
- Lara M Brewer
- Department of Anesthesiology, University of Utah, Salt Lake City, Utah, USA
- Lu Yu
- Department of Biomedical Engineering, School of Intelligent Medicine, China Medical University, No. 77, Puhe Road, Shenyang North New Area, Shenyang, 110122, Liaoning, People's Republic of China
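The sensitivity, specificity, and accuracy figures quoted in the abstract above are standard confusion-matrix ratios. The sketch below shows how they could be computed from a reference annotation and a detector output. The function name `apnea_metrics`, the per-event boolean encoding (True = apnea), and the toy data are assumptions for illustration, not the study's evaluation code.

```python
def apnea_metrics(reference, detected):
    """Return (sensitivity, specificity, accuracy) for binary event labels.

    reference: ground-truth labels (True/1 = apnea), e.g. from a flow signal.
    detected:  labels produced by the detector under test.
    """
    tp = tn = fp = fn = 0
    for ref, det in zip(reference, detected):
        if ref and det:
            tp += 1          # apnea correctly detected
        elif ref and not det:
            fn += 1          # apnea missed
        elif det:
            fp += 1          # false alarm
        else:
            tn += 1          # normal breathing correctly passed
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return sensitivity, specificity, accuracy


# Toy data: 3 true apnea events among 10 evaluated segments.
reference = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
detected = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
sens, spec, acc = apnea_metrics(reference, detected)
```

When apnea events are rare, accuracy is dominated by the non-apnea segments, so sensitivity is the more informative figure for missed events.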
|
4
|
Khalilzad Z, Tadj C. Using CCA-Fused Cepstral Features in a Deep Learning-Based Cry Diagnostic System for Detecting an Ensemble of Pathologies in Newborns. Diagnostics (Basel) 2023; 13:879. [PMID: 36900023 PMCID: PMC10000938 DOI: 10.3390/diagnostics13050879] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/27/2022] [Revised: 02/14/2023] [Accepted: 02/21/2023] [Indexed: 03/02/2023] Open
Abstract
Crying is one of the means of communication for a newborn. Newborn cry signals convey precious information about the newborn's health condition and their emotions. In this study, cry signals of healthy and pathologic newborns were analyzed for the purpose of developing an automatic, non-invasive, and comprehensive Newborn Cry Diagnostic System (NCDS) that identifies pathologic newborns from healthy infants. For this purpose, Mel-frequency Cepstral Coefficients (MFCC) and Gammatone Frequency Cepstral Coefficients (GFCC) were extracted as features. These feature sets were also combined and fused through Canonical Correlation Analysis (CCA), which provides a novel manipulation of the features that have not yet been explored in the literature on NCDS designs, to the best of our knowledge. All the mentioned feature sets were fed to the Support Vector Machine (SVM) and Long Short-term Memory (LSTM). Furthermore, two Hyperparameter optimization methods, Bayesian and grid search, were examined to enhance the system's performance. The performance of our proposed NCDS was evaluated with two different datasets of inspiratory and expiratory cries. The CCA fusion feature set using the LSTM classifier accomplished the best F-score in the study, with 99.86% for the inspiratory cry dataset. The best F-score regarding the expiratory cry dataset, 99.44%, belonged to the GFCC feature set employing the LSTM classifier. These experiments suggest the high potential and value of using the newborn cry signals in the detection of pathologies. The framework proposed in this study can be implemented as an early diagnostic tool for clinical studies and help in the identification of pathologic newborns.
|
5
|
A self-training automatic infant-cry detector. Neural Comput Appl 2022. [DOI: 10.1007/s00521-022-08129-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2022]
Abstract
Infant cry is one of the first distinctive and informative life signals observed after birth. Neonatologists and automatic assistive systems can analyse infant cry to detect pathologies early. These analyses extensively use reference expert-curated databases containing annotated infant-cry audio samples. However, these databases are not publicly accessible because of their sensitive data. Moreover, the recorded data can under-represent specific phenomena or the operational conditions required by other medical teams. Additionally, building these databases requires significant investments that few hospitals can afford. This paper describes an open-source workflow for infant-cry detection, which identifies audio segments containing high-quality infant-cry samples with no other overlapping audio events (e.g. machine noise or adult speech). It requires minimal training because it trains an LSTM-with-self-attention model on infant-cry samples automatically detected from the recorded audio through cluster analysis and HMM classification. The audio signal processing uses energy and intonation acoustic features from 100-ms segments to improve spectral robustness to noise. The workflow annotates the input audio with intervals containing infant-cry samples suited for populating a database for neonatological and early diagnosis studies. On 16 min of hospital phone-audio recordings, it reached sufficient infant-cry detection accuracy in 3 neonatal care environments (nursery: 69%, sub-intensive: 82%, intensive: 77%) involving 20 infants subject to heterogeneous cry stimuli, and had substantial agreement with an expert’s annotation. Our workflow is a cost-effective solution, particularly suited for a sub-intensive care environment, scalable to monitor from one to many infants. It allows a hospital to build and populate an extensive high-quality infant-cry database with a minimal investment.
|
6
|
Khalilzad Z, Kheddache Y, Tadj C. An Entropy-Based Architecture for Detection of Sepsis in Newborn Cry Diagnostic Systems. Entropy (Basel) 2022; 24:1194. [PMID: 36141080 PMCID: PMC9498202 DOI: 10.3390/e24091194] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/22/2022] [Revised: 08/18/2022] [Accepted: 08/22/2022] [Indexed: 06/16/2023]
Abstract
The acoustic characteristics of cries are an exhibition of an infant's health condition and these characteristics have been acknowledged as indicators for various pathologies. This study focused on the detection of infants suffering from sepsis by developing a simplified design using acoustic features and conventional classifiers. The features for the proposed framework were Mel-frequency Cepstral Coefficients (MFCC), Spectral Entropy Cepstral Coefficients (SENCC) and Spectral Centroid Cepstral Coefficients (SCCC), which were classified through K-nearest Neighborhood (KNN) and Support Vector Machine (SVM) classification methods. The performance of the different combinations of the feature sets was also evaluated based on several measures such as accuracy, F1-score and Matthews Correlation Coefficient (MCC). Bayesian Hyperparameter Optimization (BHPO) was employed to tailor the classifiers uniquely to fit each experiment. The proposed methodology was tested on two datasets of expiratory cries (EXP) and voiced inspiratory cries (INSV). The highest accuracy and F-score were 89.99% and 89.70%, respectively. This framework also implemented a novel feature selection method based on Fuzzy Entropy (FE) as a final experiment. By employing FE, the number of features was reduced by more than 40%, whereas the evaluation measures were not hindered for the EXP dataset and were even enhanced for the INSV dataset. Therefore, it was deduced through these experiments that an entropy-based framework is successful for identifying sepsis in neonates and has the advantage of achieving high performance with conventional machine learning (ML) approaches, which makes it a reliable means for the early diagnosis of sepsis in deprived areas of the world.
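Of the evaluation measures listed in the abstract above, the Matthews Correlation Coefficient is the least familiar. A minimal sketch of its computation from confusion-matrix counts follows. The function name `matthews_corrcoef_counts` and the example counts are illustrative assumptions. Returning 0.0 when a marginal is empty is a common convention, not something stated in the abstract.

```python
import math


def matthews_corrcoef_counts(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from binary confusion-matrix counts.

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    Ranges from -1 (total disagreement) to +1 (perfect prediction).
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Convention assumed here: undefined MCC is reported as 0.0.
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts for a sepsis-vs-healthy classifier on 12 cry samples.
mcc = matthews_corrcoef_counts(tp=6, tn=3, fp=1, fn=2)
perfect = matthews_corrcoef_counts(tp=5, tn=5, fp=0, fn=0)
```

Unlike accuracy, MCC uses all four confusion-matrix cells, which makes it a useful companion to the F-score when the two classes are imbalanced.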
|
7
|
|
8
|
Anikin A, Reby D. Ingressive phonation conveys arousal in human nonverbal vocalizations. Bioacoustics 2022. [DOI: 10.1080/09524622.2022.2039295] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
Affiliation(s)
- Andrey Anikin
- Division of Cognitive Science, Lund University, Lund, Sweden
- Enes Sensory Neuro-Ethology Lab, Crnl, Jean Monnet University of Saint Étienne, St-Étienne, France
- David Reby
- Enes Sensory Neuro-Ethology Lab, Crnl, Jean Monnet University of Saint Étienne, St-Étienne, France
|
9
|
Manfredi C, Bandini A, Melino D, Viellevoye R, Kalenga M, Orlandi S. Automated detection and classification of basic shapes of newborn cry melody. Biomed Signal Process Control 2018. [DOI: 10.1016/j.bspc.2018.05.033] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 10/14/2022]
|
10
|
Abou-Abbas L, Tadj C, Fersaie HA. A fully automated approach for baby cry signal segmentation and boundary detection of expiratory and inspiratory episodes. J Acoust Soc Am 2017; 142:1318. [PMID: 28964073 PMCID: PMC5593797 DOI: 10.1121/1.5001491] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 07/01/2016] [Revised: 07/11/2017] [Accepted: 08/19/2017] [Indexed: 06/07/2023]
Abstract
The detection of cry sounds is generally an important pre-processing step for various applications involving cry analysis such as diagnostic systems, electronic monitoring systems, emotion detection, and robotics for baby caregivers. Given its complexity, automatic cry segmentation is a rather challenging topic. In this paper, a framework for automatic cry sound segmentation for application in a cry-based diagnostic system is proposed. The contribution of various additional time- and frequency-domain features to increase the robustness of a Gaussian mixture model/hidden Markov model (GMM/HMM)-based cry segmentation system in noisy environments is studied. A fully automated segmentation algorithm to extract cry sound components, namely, audible expiration and inspiration, is introduced and is based on two approaches: statistical analysis based on GMM or HMM classifiers and a post-processing method based on intensity, zero crossing rate, and fundamental frequency feature extraction. The main focus of this paper is to extend the systems developed in previous works to include a post-processing stage with a set of corrective and enhancing tools to improve the classification performance. This full approach makes it possible to determine precisely the start and end points of the expiratory and inspiratory components of a cry signal, EXP and INSV, respectively, in any given sound signal. Experimental results have indicated the effectiveness of the proposed solution. EXP and INSV detection rates of approximately 94.29% and 92.16%, respectively, were achieved by applying a tenfold cross-validation technique to avoid over-fitting.
Affiliation(s)
- Lina Abou-Abbas
- Department of Electrical Engineering, École de Technologie Supérieure, Quebec University, 1100 Rue Notre Dame Ouest, Montréal, Quebec H3C 1K3, Canada
- Chakib Tadj
- Department of Electrical Engineering, École de Technologie Supérieure, Quebec University, 1100 Rue Notre Dame Ouest, Montréal, Quebec H3C 1K3, Canada
- Hesam Alaie Fersaie
- Department of Electrical Engineering, École de Technologie Supérieure, Quebec University, 1100 Rue Notre Dame Ouest, Montréal, Quebec H3C 1K3, Canada
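Two of the post-processing features named in the abstract above, intensity and zero crossing rate, are simple to compute per frame. The sketch below is a generic illustration, not the paper's algorithm: the helper names, the frame length and hop, and the dB floor `eps` are all assumptions. Fundamental frequency estimation is omitted.

```python
import math


def frame_signal(samples, frame_len, hop):
    """Split a list of samples into overlapping fixed-length frames."""
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len] for i in range(0, last_start + 1, hop)]


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (0 counts as positive)."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def intensity_db(frame, eps=1e-12):
    """Mean power of the frame in decibels; eps avoids log(0) on silence."""
    power = sum(x * x for x in frame) / len(frame)
    return 10 * math.log10(power + eps)


# Toy signals: an alternating (noise-like) frame and a constant frame.
noisy = [1.0, -1.0, 1.0, -1.0]
steady = [1.0, 1.0, 1.0, 1.0]
frames = frame_signal(list(range(10)), frame_len=4, hop=2)
```

In general, voiced sounds tend to show high intensity with a low zero crossing rate, while breath noise and silence behave differently, which is why these features are often used to refine segment boundaries.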
|
11
|
Orlandi S, Reyes Garcia CA, Bandini A, Donzelli G, Manfredi C. Application of Pattern Recognition Techniques to the Classification of Full-Term and Preterm Infant Cry. J Voice 2016; 30:656-663. [DOI: 10.1016/j.jvoice.2015.08.007] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Received: 05/02/2015] [Accepted: 08/07/2015] [Indexed: 10/22/2022]
|
12
|
Koumura T, Okanoya K. Automatic Recognition of Element Classes and Boundaries in the Birdsong with Variable Sequences. PLoS One 2016; 11:e0159188. [PMID: 27442240 PMCID: PMC4956110 DOI: 10.1371/journal.pone.0159188] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 10/29/2015] [Accepted: 06/28/2016] [Indexed: 11/18/2022] Open
Abstract
Research on sequential vocalization often requires analysis of vocalizations in long continuous sounds. In studies such as developmental ones or studies across generations, in which days or months of vocalizations must be analyzed, methods for automatic recognition are highly desirable. Although methods for automatic speech recognition for application purposes have been intensively studied, blindly applying them for biological purposes may not be an optimal solution. This is because, unlike human speech recognition, analysis of sequential vocalizations often requires accurate extraction of timing information. In the present study we propose automated systems suitable for recognizing birdsong, one of the most intensively investigated sequential vocalizations, focusing on three properties of birdsong. First, a song is a sequence of vocal elements, called notes, which can be grouped into categories. Second, the temporal structure of birdsong is precisely controlled, meaning that temporal information is important in song analysis. Finally, notes are produced according to certain probabilistic rules, which may facilitate accurate song recognition. We divided the procedure of song recognition into three sub-steps: local classification, boundary detection, and global sequencing, each corresponding to one of the three properties of birdsong. We compared the performance of several different ways to arrange these three steps. As a result, we demonstrated that a hybrid model of a deep convolutional neural network and a hidden Markov model was effective. We propose suitable arrangements of methods according to whether accurate boundary detection is needed. We also designed a new measure to jointly evaluate the accuracy of note classification and boundary detection. Our methods should be applicable, with small modification and tuning, to the songs of other species that share the three properties of sequential vocalization.
Affiliation(s)
- Takuya Koumura
- Department of Life Sciences, Graduate School of Arts and Sciences, The University of Tokyo, Tokyo, Japan
- Research Fellow of Japan Society for the Promotion of Science, Tokyo, Japan
- Kazuo Okanoya
- Department of Life Sciences, Graduate School of Arts and Sciences, The University of Tokyo, Tokyo, Japan
- Cognition and Behavior Joint Laboratory, RIKEN Brain Science Institute, Saitama, Japan
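The "global sequencing" step in the abstract above combines per-frame classification scores with a hidden Markov model. One standard way to decode such a model is the Viterbi algorithm, sketched below in the log domain. This is a generic textbook illustration and not necessarily the authors' procedure. The two-state setup, the transition probabilities, and the per-frame likelihoods (which in a hybrid system would come from the neural network) are made-up numbers.

```python
import math


def viterbi(obs_loglik, log_trans, log_init):
    """Most likely state sequence for an HMM.

    obs_loglik[t][s]: log-likelihood of frame t under state s.
    log_trans[p][s]:  log-probability of moving from state p to state s.
    log_init[s]:      log-probability of starting in state s.
    """
    n_states = len(log_init)
    score = [log_init[s] + obs_loglik[0][s] for s in range(n_states)]
    backpointers = []
    for t in range(1, len(obs_loglik)):
        new_score, pointers = [], []
        for s in range(n_states):
            # Best predecessor state for landing in state s at frame t.
            best_prev = max(
                range(n_states), key=lambda p: score[p] + log_trans[p][s]
            )
            pointers.append(best_prev)
            new_score.append(
                score[best_prev] + log_trans[best_prev][s] + obs_loglik[t][s]
            )
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


log = math.log
# Sticky two-state model (0 = note A, 1 = note B), hypothetical numbers.
log_init = [log(0.5), log(0.5)]
log_trans = [[log(0.9), log(0.1)], [log(0.1), log(0.9)]]
frame_probs = [(0.9, 0.1), (0.9, 0.1), (0.4, 0.6), (0.9, 0.1), (0.1, 0.9), (0.1, 0.9)]
obs_loglik = [[log(a), log(b)] for a, b in frame_probs]
path = viterbi(obs_loglik, log_trans, log_init)
```

In this toy run the third frame slightly favours state 1 on its own, but the transition model makes a brief switch unlikely, so the decoded path stays in state 0 until the evidence for state 1 becomes consistent.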
|
13
|
Automatic detection of the expiratory and inspiratory phases in newborn cry signals. Biomed Signal Process Control 2015. [DOI: 10.1016/j.bspc.2015.03.007] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Indexed: 11/17/2022]
|