1
Alves CL, Martinelli T, Sallum LF, Rodrigues FA, Toutain TGLDO, Porto JAM, Thielemann C, Aguiar PMDC, Moeckel M. Multiclass classification of Autism Spectrum Disorder, attention deficit hyperactivity disorder, and typically developed individuals using fMRI functional connectivity analysis. PLoS One 2024; 19:e0305630. [PMID: 39418298 DOI: 10.1371/journal.pone.0305630] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Accepted: 06/03/2024] [Indexed: 10/19/2024] Open
Abstract
Neurodevelopmental conditions, such as Autism Spectrum Disorder (ASD) and Attention Deficit Hyperactivity Disorder (ADHD), present unique challenges due to overlapping symptoms, making accurate diagnosis and targeted intervention difficult. Our study employs advanced machine learning techniques to analyze functional magnetic resonance imaging (fMRI) data from individuals with ASD, ADHD, and typically developed (TD) controls, 120 subjects in total. Leveraging machine learning (ML) algorithms for multiclass classification, we achieve superior accuracy in distinguishing between ASD, ADHD, and TD groups, surpassing existing benchmarks with an area under the ROC curve near 98%. Our analysis reveals distinct neural signatures associated with ASD and ADHD: individuals with ADHD exhibit altered connectivity patterns of regions involved in attention and impulse control, whereas those with ASD show disruptions in brain regions critical for social and cognitive functions. The observed connectivity patterns, on which the ML classification rests, agree with established diagnostic approaches based on clinical symptoms. Furthermore, complex network analyses highlight differences in brain network integration and segregation among the three groups. Our findings pave the way for refined, ML-enhanced diagnostics in accordance with established practices, offering a promising avenue for developing trustworthy clinical decision-support systems.
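As a rough illustration of what "functional connectivity" can mean in practice, the sketch below builds a region-by-region Pearson correlation matrix from time series. The abstract does not say which connectivity measure the authors used, so Pearson correlation, the function names, and the toy signals are all assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length time series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # constant signal: correlation undefined, reported as 0 here
    return cov / (norm_x * norm_y)


def connectivity_matrix(series):
    """Symmetric matrix of pairwise correlations, one row per region."""
    n = len(series)
    return [
        [1.0 if i == j else pearson(series[i], series[j]) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical toy signals for three regions (not data from the study).
toy_series = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [4.0, 3.0, 2.0, 1.0],
]
fc = connectivity_matrix(toy_series)
```

In a pipeline of this kind, the upper triangle of such a matrix is commonly flattened into a feature vector for a classifier; whether the study did so is not stated in the abstract.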
Affiliation(s)
- Caroline L Alves
  - Laboratory for Hybrid Modeling, Aschaffenburg University of Applied Sciences, Aschaffenburg, Bayern, Germany
- Tiago Martinelli
  - Institute of Mathematical and Computer Sciences, University of São Paulo, São Paulo, São Paulo, Brazil
- Loriz Francisco Sallum
  - Institute of Mathematical and Computer Sciences, University of São Paulo, São Paulo, São Paulo, Brazil
- Joel Augusto Moura Porto
  - Institute of Physics of São Carlos (IFSC), University of São Paulo (USP), São Carlos, São Paulo, Brazil
  - Institute of Biological Information Processing, Heinrich Heine University Düsseldorf, Düsseldorf, North Rhine-Westphalia Land, Germany
- Christiane Thielemann
  - BioMEMS Lab, Aschaffenburg University of Applied Sciences, Aschaffenburg, Bayern, Germany
- Patrícia Maria de Carvalho Aguiar
  - Hospital Israelita Albert Einstein, São Paulo, São Paulo, Brazil
  - Department of Neurology and Neurosurgery, Federal University of São Paulo, São Paulo, São Paulo, Brazil
- Michael Moeckel
  - Laboratory for Hybrid Modeling, Aschaffenburg University of Applied Sciences, Aschaffenburg, Bayern, Germany
2
Wawer A, Chojnicka I, Sarzyńska-Wawer J, Krawczyk M. A cross-dataset study on automatic detection of autism spectrum disorder from text data. Acta Psychiatr Scand 2024. [PMID: 39032040 DOI: 10.1111/acps.13737] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2024] [Revised: 06/20/2024] [Accepted: 07/06/2024] [Indexed: 07/22/2024]
Abstract
OBJECTIVE The goals of this article are as follows. First, to investigate the possibility of detecting autism spectrum disorder (ASD) from text data using the latest generation of machine learning tools. Second, to compare model performance on two datasets of transcribed statements, collected using two different diagnostic tools. Third, to investigate the feasibility of knowledge transfer between models trained on both datasets and check if data augmentation can help alleviate the problem of a small number of observations.
METHOD We explore two techniques to detect ASD. The first one is based on fine-tuning HerBERT, a BERT-based, monolingual deep transformer neural network. The second one uses the newest, multipurpose text embeddings from OpenAI and a classifier. We apply the methods to two separate datasets of transcribed statements, collected using two different diagnostic tools: Thought, Language, and Communication (TLC) and the Autism Diagnostic Observation Schedule-2 (ADOS-2). We conducted several cross-dataset experiments in both a zero-shot setting and a setting where models are pretrained on one dataset and then training continues on another to test the possibility of knowledge transfer.
RESULTS Unlike previous studies, the models we tested obtained average results on ADOS-2 data but performed very well on TLC data. We did not observe any benefits from knowledge transfer between datasets. We observed relatively poor performance of models trained on augmented data and hypothesize that data augmentation by back translation obfuscates autism-specific signals.
CONCLUSION The quality of machine learning models that detect ASD from text data is improving, but model results depend on the type of input data or diagnostic tool.
Affiliation(s)
- Aleksander Wawer
  - Polish Academy of Sciences, Institute of Computer Science, Warsaw, Poland
- Izabela Chojnicka
  - Department of Health and Rehabilitation Psychology, Faculty of Psychology, University of Warsaw, Warsaw, Poland
3
Shi JM, Chiu VY, Avila CC, Lewis S, Park D, Peltier MR, Getahun D. Coding of Childhood Psychiatric and Neurodevelopmental Disorders in Electronic Health Records of a Large Integrated Health Care System: Validation Study. JMIR Ment Health 2024; 11:e56812. [PMID: 38771217 PMCID: PMC11107768 DOI: 10.2196/56812] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2024] [Revised: 04/08/2024] [Accepted: 04/09/2024] [Indexed: 05/22/2024] Open
Abstract
Background Mental, emotional, and behavioral disorders are chronic pediatric conditions, and their prevalence has been on the rise over recent decades. Affected children have long-term health sequelae and a decline in health-related quality of life. Due to the lack of a validated database for pharmacoepidemiological research on selected mental, emotional, and behavioral disorders, there is uncertainty in their reported prevalence in the literature.
Objective We aimed to evaluate the accuracy of coding related to pediatric mental, emotional, and behavioral disorders in a large integrated health care system's electronic health records (EHRs) and compare the coding quality before and after the implementation of the International Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM) coding as well as before and after the COVID-19 pandemic.
Methods Medical records of 1200 member children aged 2-17 years with at least 1 clinical visit before the COVID-19 pandemic (January 1, 2012, to December 31, 2014, the ICD-9-CM coding period; and January 1, 2017, to December 31, 2019, the ICD-10-CM coding period) and after the COVID-19 pandemic (January 1, 2021, to December 31, 2022) were selected with stratified random sampling from EHRs for chart review. Two trained research associates reviewed the EHRs for all potential cases of autism spectrum disorder (ASD), attention-deficit hyperactivity disorder (ADHD), major depressive disorder (MDD), anxiety disorder (AD), and disruptive behavior disorders (DBD) in children during the study period. Children were considered cases only if there was a mention of any one of the conditions (yes for diagnosis) in the electronic chart during the corresponding time period. The validity of diagnosis codes was evaluated by directly comparing them with the gold standard of chart abstraction using sensitivity, specificity, positive predictive value, negative predictive value, the summary statistics of the F-score, and Youden J statistic. The κ statistic for interrater reliability between the 2 abstractors was calculated.
Results The overall agreement between the identification of mental, behavioral, and emotional conditions using diagnosis codes compared to medical record abstraction was strong and similar across the ICD-9-CM and ICD-10-CM coding periods as well as during the prepandemic and pandemic time periods. The performance of AD coding, while strong, was relatively lower compared to the other conditions. The weighted sensitivity, specificity, positive predictive value, and negative predictive value for each of the 5 conditions were as follows: 100%, 100%, 99.2%, and 100%, respectively, for ASD; 100%, 99.9%, 99.2%, and 100%, respectively, for ADHD; 100%, 100%, 100%, and 100%, respectively, for DBD; 87.7%, 100%, 100%, and 99.2%, respectively, for AD; and 100%, 100%, 99.2%, and 100%, respectively, for MDD. The F-score and Youden J statistic ranged between 87.7% and 100%. The overall agreement between abstractors was almost perfect (κ=95%).
Conclusions Diagnostic codes are quite reliable for identifying selected childhood mental, behavioral, and emotional conditions. The findings remained similar during the pandemic and after the implementation of the ICD-10-CM coding in the EHR system.
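The validity measures named in this abstract can all be derived from a 2x2 table of code-based versus chart-based diagnoses. The following is a minimal sketch using the standard unweighted definitions; the study reports weighted estimates, and the function name and the counts below are hypothetical, not taken from the paper. The F-score is assumed here to be the F1-score.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard validity measures from a 2x2 confusion table.

    tp/fn: chart-confirmed cases with/without a diagnosis code.
    fp/tn: chart-negative children with/without a diagnosis code.
    Assumes every denominator is nonzero.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)  # positive predictive value
    npv = tn / (tn + fn)  # negative predictive value
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    youden_j = sensitivity + specificity - 1
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
        "youden_j": youden_j,
    }


# Hypothetical counts for illustration only.
m = diagnostic_metrics(tp=90, fp=5, fn=10, tn=895)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

Youden J rewards sensitivity and specificity equally, which is why a condition with perfect specificity but lower sensitivity, as reported for AD, pulls the lower end of the reported range down.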
Affiliation(s)
- Jiaxiao M Shi
  - Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, CA, United States
- Vicki Y Chiu
  - Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, CA, United States
- Chantal C Avila
  - Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, CA, United States
- Sierra Lewis
  - Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, CA, United States
- Daniella Park
  - Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, CA, United States
- Morgan R Peltier
  - Department of Psychiatry, Jersey Shore University Medical Center, Neptune, NJ, United States
- Darios Getahun
  - Department of Research and Evaluation, Kaiser Permanente Southern California, Pasadena, CA, United States
4
Li J, Washington P. A Comparison of Personalized and Generalized Approaches to Emotion Recognition Using Consumer Wearable Devices: Machine Learning Study. JMIR AI 2024; 3:e52171. [PMID: 38875573 PMCID: PMC11127131 DOI: 10.2196/52171] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2023] [Revised: 02/19/2024] [Accepted: 03/23/2024] [Indexed: 06/16/2024]
Abstract
BACKGROUND There is a wide range of potential adverse health effects, from headaches to cardiovascular disease, associated with long-term negative emotions and chronic stress. Because many indicators of stress are imperceptible to observers, the early detection of stress remains a pressing medical need, as it can enable early intervention. Physiological signals offer a noninvasive method for monitoring affective states and are recorded by a growing number of commercially available wearables.
OBJECTIVE We aim to study the differences between personalized and generalized machine learning models for 3-class emotion classification (neutral, stress, and amusement) using wearable biosignal data.
METHODS We developed a neural network for the 3-class emotion classification problem using data from the Wearable Stress and Affect Detection (WESAD) data set, a multimodal data set with physiological signals from 15 participants. We compared the results between a participant-exclusive generalized, a participant-inclusive generalized, and a personalized deep learning model.
RESULTS For the 3-class classification problem, our personalized model achieved an average accuracy of 95.06% and an F1-score of 91.71%; our participant-inclusive generalized model achieved an average accuracy of 66.95% and an F1-score of 42.50%; and our participant-exclusive generalized model achieved an average accuracy of 67.65% and an F1-score of 43.05%.
CONCLUSIONS Our results emphasize the need for increased research in personalized emotion recognition models given that they outperform generalized models in certain contexts. We also demonstrate that personalized machine learning models for emotion classification are viable and can achieve high performance.
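The three model types compared here differ mainly in which participants' data enter the training set. The sketch below shows one plausible reading of those splits; the abstract does not define them precisely, so the split logic, the function names, the 25% test fraction, and the toy records are assumptions for illustration.

```python
from collections import defaultdict


def _by_participant(samples):
    """Group samples by participant ID, preserving recording order."""
    groups = defaultdict(list)
    for sample in samples:
        groups[sample["participant"]].append(sample)
    return groups


def _head_tail(items, test_fraction):
    """Hold out the last portion of one participant's samples for testing."""
    n_test = max(1, int(len(items) * test_fraction))
    cut = len(items) - n_test
    return items[:cut], items[cut:]


def personalized_split(samples, participant, test_fraction=0.25):
    """Train and test only on the target participant's own data."""
    return _head_tail(_by_participant(samples)[participant], test_fraction)


def participant_inclusive_split(samples, participant, test_fraction=0.25):
    """Train on everyone, including part of the target participant's data."""
    groups = _by_participant(samples)
    own_train, own_test = _head_tail(groups[participant], test_fraction)
    others = [s for p, g in groups.items() if p != participant for s in g]
    return others + own_train, own_test


def participant_exclusive_split(samples, participant):
    """Train on all other participants; test on the unseen participant."""
    groups = _by_participant(samples)
    train = [s for p, g in groups.items() if p != participant for s in g]
    return train, list(groups[participant])


# Hypothetical toy records: 3 participants with 4 time-ordered samples each.
samples = [{"participant": p, "t": t} for p in ("S1", "S2", "S3") for t in range(4)]
pers_train, pers_test = personalized_split(samples, "S1")
incl_train, incl_test = participant_inclusive_split(samples, "S1")
excl_train, excl_test = participant_exclusive_split(samples, "S1")
```

Holding out the tail of each participant's recording, rather than a random subset, is a design choice made here to avoid leaking adjacent time windows between training and test data.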
Affiliation(s)
- Joe Li
  - Information and Computer Sciences, University of Hawaiʻi at Mānoa, Honolulu, HI, United States
- Peter Washington
  - Information and Computer Sciences, University of Hawaiʻi at Mānoa, Honolulu, HI, United States
5
Jaiswal A, Washington P. Using #ActuallyAutistic on Twitter for Precision Diagnosis of Autism Spectrum Disorder: Machine Learning Study. JMIR Form Res 2024; 8:e52660. [PMID: 38354045 PMCID: PMC10902768 DOI: 10.2196/52660] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Revised: 11/19/2023] [Accepted: 12/10/2023] [Indexed: 03/02/2024] Open
Abstract
BACKGROUND The increasing use of social media platforms has given rise to an unprecedented surge in user-generated content, with millions of individuals publicly sharing their thoughts, experiences, and health-related information. Social media can serve as a useful means to study and understand public health. Twitter (subsequently rebranded as "X") is one such social media platform that has proven to be a valuable source of rich information for both the general public and health officials. We conducted the first study applying Twitter data mining to autism screening.
OBJECTIVE This study used Twitter as the primary source of data to study the behavioral characteristics and real-time emotional projections of individuals identifying with autism spectrum disorder (ASD). We aimed to improve the rigor of ASD analytics research by using the digital footprint of an individual to study the linguistic patterns of individuals with ASD.
METHODS We developed a machine learning model to distinguish individuals with autism from their neurotypical peers based on the textual patterns from their public communications on Twitter. We collected 6,515,470 tweets from users who self-identified with autism using "#ActuallyAutistic" and from a separate control group to identify linguistic markers associated with ASD traits. To construct the data set, we targeted English-language tweets using the search query "#ActuallyAutistic" posted from January 1, 2014, to December 31, 2022. From these tweets, we identified unique users who used keywords such as "autism" OR "autistic" OR "neurodiverse" in their profile description and collected all the tweets from their timeline. To build the control group data set, we formulated a search query excluding the hashtag, "-#ActuallyAutistic," and collected 1000 tweets per day during the same time period. We trained a word2vec model and an attention-based, bidirectional long short-term memory model to validate the performance of per-tweet and per-profile classification models. We also illustrate the utility of the data set through common natural language processing tasks such as sentiment analysis and topic modeling.
RESULTS Our tweet classifier reached a 73% accuracy, a 0.728 area under the receiver operating characteristic curve score, and a 0.71 F1-score using word2vec representations fed into a logistic regression model, while the user profile classifier achieved a 0.78 area under the receiver operating characteristic curve score and an F1-score of 0.805 using an attention-based, bidirectional long short-term memory model. This is a promising start, demonstrating the potential for effective digital phenotyping studies and large-scale intervention using text data mined from social media.
CONCLUSIONS Textual differences in social media communications can help researchers and clinicians conduct symptomatology studies in natural settings.
Affiliation(s)
- Aditi Jaiswal
  - Department of Information and Computer Sciences, University of Hawaii at Manoa, Honolulu, HI, United States
- Peter Washington
  - Department of Information and Computer Sciences, University of Hawaii at Manoa, Honolulu, HI, United States
6
Parab S, Boster J, Washington P. Parkinson Disease Recognition Using a Gamified Website: Machine Learning Development and Usability Study. JMIR Form Res 2023; 7:e49898. [PMID: 37773607 PMCID: PMC10576230 DOI: 10.2196/49898] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/14/2023] [Revised: 08/16/2023] [Accepted: 09/04/2023] [Indexed: 10/01/2023] Open
Abstract
BACKGROUND Parkinson disease (PD) affects millions globally, causing motor function impairments. Early detection is vital, and diverse data sources aid diagnosis. We focus on lower arm movements during keyboard and trackpad or touchscreen interactions, which serve as reliable indicators of PD. Previous works explore keyboard tapping and unstructured device monitoring; we attempt to further these works with structured tests taking into account 2D hand movement in addition to finger tapping. Our feasibility study uses keystroke and mouse movement data from a remotely conducted, structured, web-based test combined with self-reported PD status to create a predictive model for detecting the presence of PD.
OBJECTIVE Analysis of finger tapping speed and accuracy through keyboard input and analysis of 2D hand movement through mouse input allowed differentiation between participants with and without PD. This comparative analysis enables us to establish clear distinctions between the two groups and explore the feasibility of using motor behavior to predict the presence of the disease.
METHODS Participants were recruited via email by the Hawaii Parkinson Association (HPA) and directed to a web application for the tests. The 2023 HPA symposium was also used as a forum to recruit participants and spread information about our study. The application recorded participant demographics, including age, gender, and race, as well as PD status. We conducted a series of tests to assess finger tapping, using on-screen prompts to request key presses of constant and random keys. Response times, accuracy, and unintended movements resulting in accidental presses were recorded. Participants performed a hand movement test consisting of tracing straight and curved on-screen ribbons using a trackpad or mouse, allowing us to evaluate stability and precision of 2D hand movement. From this tracing, the test collected and stored insights concerning lower arm motor movement.
RESULTS Our formative study included 31 participants, 18 without PD and 13 with PD, and analyzed their lower arm movement data collected from keyboards and computer mice. From the data set, we extracted 28 features and evaluated their importance using an extra trees classifier. A random forest model was trained using the 6 most important features identified by this classifier. These selected features provided insights into precision and movement speed derived from keyboard tapping and mouse tracing tests. This final model achieved an average F1-score of 0.7311 (SD 0.1663) and an average accuracy of 0.7429 (SD 0.1400) over 20 runs for predicting the presence of PD.
CONCLUSIONS This preliminary feasibility study suggests the possibility of using technology-based limb movement data to predict the presence of PD, demonstrating the practicality of implementing this approach in a cost-effective and accessible manner. In addition, this study demonstrates that structured mouse movement tests can be used in combination with finger tapping to detect PD.
Affiliation(s)
- Shubham Parab
  - University of Hawaii at Manoa, Honolulu, HI, United States
- Jerry Boster
  - Hawaii Parkinson Association, Honolulu, HI, United States
- Peter Washington
  - Department of Information & Computer Sciences, University of Hawaii at Manoa, Honolulu, HI, United States
7
Awaji B, Senan EM, Olayah F, Alshari EA, Alsulami M, Abosaq HA, Alqahtani J, Janrao P. Hybrid Techniques of Facial Feature Image Analysis for Early Detection of Autism Spectrum Disorder Based on Combined CNN Features. Diagnostics (Basel) 2023; 13:2948. [PMID: 37761315 PMCID: PMC10527645 DOI: 10.3390/diagnostics13182948] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2023] [Revised: 09/07/2023] [Accepted: 09/11/2023] [Indexed: 09/29/2023] Open
Abstract
Autism spectrum disorder (ASD) is a complex neurodevelopmental disorder characterized by difficulties in social communication and repetitive behaviors. The exact causes of ASD remain elusive and likely involve a combination of genetic, environmental, and neurobiological factors. Doctors often face challenges in accurately identifying ASD early due to its complex and diverse presentation. Early detection and intervention are crucial for improving outcomes for individuals with ASD. Early diagnosis allows for timely access to appropriate interventions, leading to better development of social and communication skills. Artificial intelligence techniques, particularly facial feature extraction using machine learning algorithms, show promise in aiding the early detection of ASD. By analyzing facial expressions and subtle cues, AI models identify patterns associated with ASD features. This study developed various hybrid systems to classify facial feature images from an ASD dataset by combining convolutional neural network (CNN) features. The first approach utilized pre-trained VGG16, ResNet101, and MobileNet models. The second approach employed a hybrid technique that combined CNN models (VGG16, ResNet101, and MobileNet) with XGBoost and random forest (RF) algorithms. The third strategy involved diagnosing ASD using XGBoost and RF based on the combined features of the VGG16-ResNet101, ResNet101-MobileNet, and VGG16-MobileNet models. Notably, the hybrid RF algorithm that utilized features from the VGG16-MobileNet models performed best, reaching an AUC of 99.25%, an accuracy of 98.8%, a precision of 98.9%, a sensitivity of 99%, and a specificity of 99.1%.
Affiliation(s)
- Bakri Awaji
  - Department of Computer Science, College of Computer Science and Information Systems, Najran University, Najran 6646, Saudi Arabia
- Ebrahim Mohammed Senan
  - Department of Artificial Intelligence, Faculty of Computer Science and Information Technology, Alrazi University, Sana’a, Yemen
- Fekry Olayah
  - Department of Information System, College of Computer Science and Information Systems, Najran University, Najran 6646, Saudi Arabia
- Eman A. Alshari
  - Department of Computer Science and Information Technology, Thamar University, Dhamar 87246, Yemen
  - Department of Artificial Intelligence, Faculty of Engineering and Smart Computing, Modern Specialized University, Sana’a, Yemen
- Mohammad Alsulami
  - Department of Computer Science, College of Computer Science and Information Systems, Najran University, Najran 6646, Saudi Arabia
- Hamad Ali Abosaq
  - Department of Computer Science, College of Computer Science and Information Systems, Najran University, Najran 6646, Saudi Arabia
- Jarallah Alqahtani
  - Department of Computer Science, College of Computer Science and Information Systems, Najran University, Najran 6646, Saudi Arabia
- Prachi Janrao
  - Thakur College of Engineering and Technology, Kandivali(E), Mumbai 400101, India