1
Kuo ZM, Chen KF, Tseng YJ. MoCab: A framework for the deployment of machine learning models across health information systems. Comput Methods Programs Biomed 2024;255:108336. [PMID: 39079482] [DOI: 10.1016/j.cmpb.2024.108336]
Abstract
BACKGROUND AND OBJECTIVE: Machine learning models are vital for enhancing healthcare services, but integrating them into health information systems (HISs) introduces challenges beyond clinical decision making, such as interoperability and diverse electronic health record (EHR) formats. We propose the Model Cabinet Architecture (MoCab), a framework that uses Fast Healthcare Interoperability Resources (FHIR) as the standard for data storage and retrieval when deploying machine learning models across HISs, addressing the challenges highlighted by platforms such as EPOCH®, ePRISM®, and KETOS.
METHODS: MoCab streamlines predictive modeling in healthcare through a structured framework of specialized components. The Data Service Center manages patient-data retrieval from FHIR servers. The Knowledge Model Center formats these data and feeds them into predictive models. The Model Retraining Center continuously updates the models to maintain accuracy in dynamic clinical environments. The framework further incorporates Clinical Decision Support (CDS) Hooks for issuing clinical alerts and uses Substitutable Medical Apps Reusable Technologies (SMART) on FHIR to build applications that display alerts, prediction results, and patient records.
RESULTS: The MoCab framework was demonstrated with three types of predictive models, a scoring model (qCSI), a machine learning model (NSTI), and a deep learning model (SPC), applied to synthetic data that mimic a major EHR system. The implementations showed how MoCab integrates predictive models with health data for clinical decision support, using CDS Hooks and SMART on FHIR for seamless HIS integration. The demonstration confirmed the practical utility of MoCab in supporting clinical decision making across varied healthcare settings.
CONCLUSIONS: We demonstrate MoCab's potential for promoting the interoperability of machine learning models and enhancing their utility across various EHRs. Despite challenges such as FHIR adoption, MoCab addresses key obstacles to adapting machine learning models to healthcare settings, paving the way for further enhancements and broader adoption.
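Although MoCab's implementation is not reproduced here, the retrieval step the abstract describes (the Data Service Center pulling patient data from a FHIR server before a model consumes it) can be sketched with the standard FHIR R4 REST search API. The endpoint URL, patient ID, and LOINC codes below are illustrative assumptions, not MoCab's configuration:

```python
import requests

FHIR_BASE = "https://fhir.example.org/r4"   # hypothetical FHIR R4 endpoint
PATIENT_ID = "example-patient"              # placeholder patient
VITALS = {                                  # LOINC codes for common vital signs
    "heart_rate": "8867-4",
    "respiratory_rate": "9279-1",
    "spo2": "59408-5",
}

def latest_value(code: str):
    """Return the most recent Observation value for one LOINC code, or None."""
    resp = requests.get(
        f"{FHIR_BASE}/Observation",
        params={
            "patient": PATIENT_ID,
            "code": f"http://loinc.org|{code}",
            "_sort": "-date",  # newest first
            "_count": 1,
        },
        headers={"Accept": "application/fhir+json"},
        timeout=10,
    )
    resp.raise_for_status()
    for entry in resp.json().get("entry", []):
        quantity = entry["resource"].get("valueQuantity")
        if quantity is not None:
            return quantity["value"]
    return None  # no such observation on record

# Flatten into the kind of feature dict a downstream model would consume.
features = {name: latest_value(code) for name, code in VITALS.items()}
print(features)
```

A CDS Hooks service would then wrap the model's prediction in a card returned to the EHR; that plumbing, and the SMART on FHIR display layer, are omitted from this sketch.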
Affiliation(s)
- Zhe-Ming Kuo: Department of Information Management, National Central University, Taoyuan, Taiwan
- Kuan-Fu Chen: College of Intelligent Computing, Chang Gung University, Taoyuan, Taiwan; Medical Statistics Research Center, Chang Gung University, Taoyuan, Taiwan; Department of Emergency Medicine, Chang Gung Memorial Hospital, Keelung, Taiwan
- Yi-Ju Tseng: Department of Computer Science, National Yang Ming Chiao Tung University, Hsinchu, Taiwan; Computational Health Informatics Program, Boston Children's Hospital, Boston, MA, USA
2
Rogers HP, Hseu A, Kim J, Silberholz E, Jo S, Dorste A, Jenkins K. Voice as a Biomarker of Pediatric Health: A Scoping Review. Children (Basel) 2024;11:684. [PMID: 38929263] [PMCID: PMC11201680] [DOI: 10.3390/children11060684]
Abstract
The human voice has the potential to serve as a valuable biomarker for the early detection, diagnosis, and monitoring of pediatric conditions. This scoping review synthesizes current knowledge on the application of artificial intelligence (AI) to pediatric voice as a biomarker for health. The included studies featured voice recordings from pediatric populations aged 0-17 years, utilized feature-extraction methods, and analyzed pathological biomarkers using AI models. Data from 62 studies were extracted, encompassing study and participant characteristics, recording sources, feature-extraction methods, and AI models. Data from 39 models across 35 studies were evaluated for accuracy, sensitivity, and specificity. The review showed a global representation of pediatric voice studies, with a focus on developmental, respiratory, speech, and language conditions. The most frequently studied conditions were autism spectrum disorder, intellectual disabilities, asphyxia, and asthma. Mel-frequency cepstral coefficients were the most utilized feature-extraction method, and support vector machines were the predominant AI model. AI-based analysis of pediatric voice shows promise as a non-invasive, cost-effective tool for a broad spectrum of pediatric conditions. Further research is necessary to standardize the feature-extraction methods and AI models used to evaluate pediatric voice as a biomarker for health; standardization has significant potential to enhance the accuracy and applicability of these tools in clinical settings across a variety of conditions and recording types. Further development of this field holds enormous potential for innovative diagnostic tools and interventions for pediatric populations globally.
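As a point of reference for the review's findings, a minimal sketch of its most commonly reported pairing (MFCC features with a support vector machine) is shown below. This is a generic illustration rather than any reviewed study's method; random stand-in features replace real recordings so the sketch runs as-is:

```python
import librosa
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def mfcc_features(path: str, n_mfcc: int = 13) -> np.ndarray:
    """Summarize one recording as per-coefficient MFCC means and stds."""
    y, sr = librosa.load(path, sr=16000)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)  # (n_mfcc, frames)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

# In practice X would stack mfcc_features(p) for every recording; random
# stand-in features are used here so the sketch runs without audio files.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 26))        # 40 clips x 26 summary features
y = rng.integers(0, 2, size=40)      # placeholder binary condition labels

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
scores = cross_val_score(clf, X, y, cv=5)
print(f"cross-validated accuracy: {scores.mean():.2f}")
```

Real studies would add segmentation, normalization, and speaker-level cross-validation so that clips from the same child never appear in both training and test folds.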
Affiliation(s)
- Hannah Paige Rogers: Department of Cardiology, Boston Children's Hospital, Harvard Medical School, 300 Longwood Avenue, Boston, MA 02115, USA
- Anne Hseu: Department of Otolaryngology, Boston Children's Hospital, 333 Longwood Ave, Boston, MA 02115, USA
- Jung Kim: Department of Pediatrics, Boston Children's Hospital, Boston, MA 02115, USA
- Stacy Jo: Department of Otolaryngology, Boston Children's Hospital, 333 Longwood Ave, Boston, MA 02115, USA
- Anna Dorste: Boston Children's Hospital, 300 Longwood Avenue, Boston, MA 02115, USA
- Kathy Jenkins: Department of Cardiology, Boston Children's Hospital, Harvard Medical School, 300 Longwood Avenue, Boston, MA 02115, USA
3
RethikumariAmma KN, Ranjana P. Pivotal region and optimized deep neuro fuzzy network for autism spectrum disorder detection. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2023.104634]
4
Lee JH, Lee GW, Bong G, Yoo HJ, Kim HK. End-to-End Model-Based Detection of Infants with Autism Spectrum Disorder Using a Pretrained Model. Sensors (Basel) 2022;23:202. [PMID: 36616801] [PMCID: PMC9823402] [DOI: 10.3390/s23010202]
Abstract
In this paper, we propose an end-to-end (E2E) neural network model that detects autism spectrum disorder (ASD) from children's voices without explicitly extracting deterministic features. To discriminate between the voices of children with ASD and those with typical development (TD), we combined two different feature-extraction models with a bidirectional long short-term memory (BLSTM)-based classifier that outputs the ASD/TD classification as a probability. One feature extractor is the bottleneck feature of an autoencoder trained on the extended Geneva minimalistic acoustic parameter set (eGeMAPS); the other is the context vector of a pretrained wav2vec 2.0-based model applied directly to the waveform input. In addition, we optimized the E2E models in two different ways: (1) fine-tuning and (2) joint optimization. To evaluate the proposed E2E models, we prepared two datasets from video recordings of ASD diagnoses collected between 2016 and 2018 at Seoul National University Bundang Hospital (SNUBH) and between 2019 and 2021 at a Living Lab. According to the experimental results, the proposed wav2vec 2.0-based E2E model with joint optimization achieved significant improvements in accuracy and unweighted average recall, from 64.74% to 71.66% and from 65.04% to 70.81%, respectively, compared with a conventional model using an autoencoder-based BLSTM and the deterministic eGeMAPS features.
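A rough sketch of the wav2vec 2.0 branch described here (a pretrained encoder whose context vectors feed a BLSTM classifier) might look as follows in PyTorch. The checkpoint name, layer sizes, and mean-pooling are assumptions; the paper's exact architecture and joint-optimization schedule are not reproduced:

```python
import torch
import torch.nn as nn
from transformers import Wav2Vec2Model

class Wav2Vec2BLSTM(nn.Module):
    """Pretrained wav2vec 2.0 encoder followed by a BLSTM classifier head."""

    def __init__(self, hidden: int = 128):
        super().__init__()
        self.encoder = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base")
        self.blstm = nn.LSTM(
            input_size=self.encoder.config.hidden_size,  # 768 for the base model
            hidden_size=hidden,
            batch_first=True,
            bidirectional=True,
        )
        self.head = nn.Linear(2 * hidden, 1)  # single logit for P(ASD)

    def forward(self, waveform: torch.Tensor) -> torch.Tensor:
        # waveform: (batch, samples) of 16 kHz audio
        context = self.encoder(waveform).last_hidden_state  # (batch, frames, dim)
        seq, _ = self.blstm(context)
        return torch.sigmoid(self.head(seq.mean(dim=1))).squeeze(-1)

model = Wav2Vec2BLSTM()
prob_asd = model(torch.randn(1, 16000))  # one second of dummy audio
```

Joint optimization in the paper's sense would train the encoder and the classifier together, i.e., leave the pretrained encoder's parameters unfrozen rather than fine-tuning only the head.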
Affiliation(s)
- Jung Hyuk Lee: School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, Gwangju 61005, Republic of Korea
- Geon Woo Lee: AI Graduate School, Gwangju Institute of Science and Technology, Gwangju 61005, Republic of Korea
- Guiyoung Bong: Department of Psychiatry, Seoul National University Bundang Hospital, Seongnam 13620, Republic of Korea
- Hee Jeong Yoo: Department of Psychiatry, Seoul National University Bundang Hospital, Seongnam 13620, Republic of Korea; College of Medicine, Seoul National University, Seoul 03980, Republic of Korea
- Hong Kook Kim: School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, Gwangju 61005, Republic of Korea; AI Graduate School, Gwangju Institute of Science and Technology, Gwangju 61005, Republic of Korea
5
Chi NA, Washington P, Kline A, Husic A, Hou C, He C, Dunlap K, Wall DP. Classifying Autism From Crowdsourced Semistructured Speech Recordings: Machine Learning Model Comparison Study. JMIR Pediatr Parent 2022;5:e35406. [PMID: 35436234] [PMCID: PMC9052034] [DOI: 10.2196/35406]
Abstract
BACKGROUND: Autism spectrum disorder (ASD) is a neurodevelopmental disorder that results in altered behavior, social development, and communication patterns. In recent years, autism prevalence has tripled, with 1 in 44 children now affected. Given that traditional diagnosis is a lengthy, labor-intensive process requiring trained physicians, significant attention has been given to developing systems that detect autism automatically. We work toward this goal by analyzing audio data, as prosody abnormalities are a signal of autism, with affected children displaying speech idiosyncrasies such as echolalia, monotonous intonation, atypical pitch, and irregular linguistic stress patterns.
OBJECTIVE: We aimed to test the ability of machine learning approaches to aid in the detection of autism in self-recorded speech audio captured from children with ASD and neurotypical (NT) children in their home environments.
METHODS: We considered three methods to detect autism in child speech: (1) random forests trained on extracted audio features (including Mel-frequency cepstral coefficients); (2) convolutional neural networks trained on spectrograms; and (3) fine-tuned wav2vec 2.0, a state-of-the-art transformer-based speech recognition model. We trained our classifiers on a novel data set of cellphone-recorded child speech curated from the Guess What? mobile game, an app designed to crowdsource videos of children with ASD and NT children in a natural home environment.
RESULTS: When classifying children's audio as either ASD or NT, the random forest classifier achieved 70% accuracy, the fine-tuned wav2vec 2.0 model 77%, and the convolutional neural network 79%. We used 5-fold cross-validation to evaluate model performance.
CONCLUSIONS: Our models were able to predict autism status when trained on a varied selection of home audio clips with inconsistent recording quality, which may be more representative of real-world conditions. The results demonstrate that machine learning methods offer promise for detecting autism automatically from speech without specialized equipment.
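For orientation, the best-performing approach (a convolutional neural network over spectrograms) can be sketched briefly; the layer configuration and spectrogram parameters below are illustrative assumptions, not the study's actual setup:

```python
import torch
import torch.nn as nn
import torchaudio

# Log-mel spectrogram front end (parameter choices are assumptions).
melspec = torchaudio.transforms.MelSpectrogram(sample_rate=16000, n_mels=64)
to_db = torchaudio.transforms.AmplitudeToDB()

# Small CNN producing two logits: [NT, ASD].
cnn = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, 2),
)

waveform = torch.randn(1, 16000 * 3)          # 3 s of dummy audio
spec = to_db(melspec(waveform)).unsqueeze(0)  # (batch=1, channel=1, mels, frames)
print(cnn(spec).softmax(dim=-1))              # predicted class probabilities
```

The adaptive pooling layer makes the network length-agnostic, which matters for home recordings of inconsistent duration like those described above.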
Affiliation(s)
- Nathan A Chi: Division of Systems Medicine, Department of Pediatrics, Stanford University, Palo Alto, CA, United States
- Peter Washington: Department of Bioengineering, Stanford University, Stanford, CA, United States
- Aaron Kline: Division of Systems Medicine, Department of Pediatrics, Stanford University, Palo Alto, CA, United States
- Arman Husic: Division of Systems Medicine, Department of Pediatrics, Stanford University, Palo Alto, CA, United States
- Cathy Hou: Department of Computer Science, Stanford University, Stanford, CA, United States
- Chloe He: Department of Biomedical Data Science, Stanford University, Stanford, CA, United States
- Kaitlyn Dunlap: Division of Systems Medicine, Department of Pediatrics, Stanford University, Palo Alto, CA, United States
- Dennis P Wall: Division of Systems Medicine, Department of Pediatrics, Stanford University, Palo Alto, CA, United States; Department of Biomedical Data Science, Stanford University, Stanford, CA, United States; Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA, United States