1. Ochi K, Kojima M, Ono N, Kuroda M, Owada K, Sagayama S, Yamasue H. Objective assessment of autism spectrum disorder based on performance in structured interpersonal acting-out tasks with prosodic stability and variability. Autism Res 2024; 17:395-409. [PMID: 38151701] [DOI: 10.1002/aur.3080]
Abstract
In this study, we sought to objectively and quantitatively characterize the prosodic features of autism spectrum disorder (ASD) using a newly developed structured speech experiment. Male adults with high-functioning ASD and age/intelligence-matched men with typical development (TD) were asked to read 29 brief scripts aloud in response to preceding auditory stimuli. To investigate whether (1) highly structured acting-out tasks can uncover prosodic differences between individuals with ASD and TD, and (2) prosodic stability and flexibility can be used for objective automatic assessment of ASD, we compared prosodic features such as fundamental frequency, intensity, and mora duration. The results indicate that individuals with ASD exhibit more stable pitch registers or volume levels than those with TD in some affective vocal-expression scenarios, such as those involving anger or sadness. However, unstable prosody was observed in some timing-control or emphasis tasks in the participants with ASD. Automatic classification of the ASD and TD groups using a support vector machine (SVM) with speech features achieved an accuracy of 90.4%. A machine learning-based assessment of the degree of ASD core symptoms using support vector regression (SVR) also performed well. These results may inform the development of a new easy-to-use assessment tool for ASD core symptoms using recorded audio signals.
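A classifier of the kind described, an SVM over standardized prosodic features, can be sketched with scikit-learn. The feature set and data below are hypothetical placeholders for illustration, not the study's pipeline or its 90.4%-accuracy model:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical per-participant prosodic features: mean F0 (Hz), F0 SD (Hz),
# mean intensity (dB), mean mora duration (s) -- one row per participant.
rng = np.random.default_rng(0)
X_asd = rng.normal(loc=[210.0, 15.0, 62.0, 0.16], scale=1.0, size=(20, 4))
X_td = rng.normal(loc=[200.0, 25.0, 60.0, 0.14], scale=1.0, size=(20, 4))
X = np.vstack([X_asd, X_td])
y = np.array([1] * 20 + [0] * 20)  # 1 = ASD, 0 = TD

# Standardize features, then fit an RBF-kernel SVM, scored by cross-validation.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
acc = cross_val_score(clf, X, y, cv=5).mean()
```

Standardizing before the kernel matters because F0 (Hz) and mora duration (s) live on very different numeric scales.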
Affiliation(s)
- Keiko Ochi
- Graduate School of Informatics, Kyoto University, Kyoto, Japan
- Masaki Kojima
- Graduate School of Medicine, University of Tokyo, Tokyo, Japan
- Nobutaka Ono
- Graduate School of Systems Design, Tokyo Metropolitan University, Tokyo, Japan
- Miho Kuroda
- Graduate School of Medicine, University of Tokyo, Tokyo, Japan
- Keiho Owada
- Graduate School of Medicine, University of Tokyo, Tokyo, Japan
- Hidenori Yamasue
- Graduate School of Medicine, University of Tokyo, Tokyo, Japan
- Department of Psychiatry, Hamamatsu University School of Medicine, Hamamatsu City, Japan
2. Mukherjee D, Bhavnani S, Lockwood Estrin G, Rao V, Dasgupta J, Irfan H, Chakrabarti B, Patel V, Belmonte MK. Digital tools for direct assessment of autism risk during early childhood: A systematic review. Autism 2024; 28:6-31. [PMID: 36336996] [PMCID: PMC10771029] [DOI: 10.1177/13623613221133176]
Abstract
LAY ABSTRACT The challenge of finding autistic children, and finding them early enough to make a difference for them and their families, becomes all the greater in parts of the world where human and material resources are in short supply. Poverty of resources delays interventions, translating into a poverty of outcomes. Digital tools carry potential to lessen this delay because they can be administered by non-specialists in children's homes, schools or other everyday environments, they can measure a wide range of autistic behaviours objectively and they can automate analysis without requiring an expert in computers or statistics. This literature review aimed to identify and describe digital tools for screening children who may be at risk for autism. These tools are predominantly at the 'proof-of-concept' stage. Both portable (laptops, mobile phones, smart toys) and fixed (desktop computers, virtual-reality platforms) technologies are used to present computerised games, or to record children's behaviours or speech. Computerised analysis of children's interactions with these technologies differentiates children with and without autism, with promising results. Tasks assessing social responses and hand and body movements are the most reliable in distinguishing autistic from typically developing children. Such digital tools hold immense potential for early identification of autism spectrum disorder risk at a large scale. Next steps should be to further validate these tools and to evaluate their applicability in a variety of settings. Crucially, stakeholders from underserved communities globally must be involved in this research, lest it fail to capture the issues that these stakeholders are facing.
Affiliation(s)
- Debarati Mukherjee
- Indian Institute of Public Health - Bengaluru, Public Health Foundation of India, India
- Vaisnavi Rao
- Institute for Democracy and Economic Affairs (IDEAS), Malaysia
- Vikram Patel
- Child Development Group, Sangath, India
- Harvard Medical School, USA
- Harvard T.H. Chan School of Public Health, USA
3. Chen A, Zhao R, Huang G, Li A, Cheung H. Successful lexical tone production of Mandarin Chinese autistic children with intellectual impairment. Int J Lang Commun Disord 2023; 58:1912-1926. [PMID: 37140200] [DOI: 10.1111/1460-6984.12881]
Abstract
BACKGROUND Atypical speech prosody has been commonly found among autistic children. Yet it remains unknown whether prosody impairment originates from poor pitch ability in general or whether it results from difficulty in understanding and using prosody for communicative purposes. AIMS To investigate whether native Mandarin Chinese-speaking autistic children with intellectual impairment were able to accurately produce native lexical tones, which are pitch patterns that distinguish word meaning lexically and serve little social purpose. METHODS & PROCEDURES Using a picture-naming task, thirteen 8-13-year-old Mandarin Chinese-speaking autistic children with intellectual impairment were tested on their production of Chinese lexical tones. Chronological age-matched typically developing (TD) children were included as the control group. Perceptual assessment and phonetic analyses were conducted on the produced lexical tones. OUTCOMES & RESULTS The majority of the lexical tones produced by the autistic children were perceived as accurate by adult judges. Phonetic analysis of the pitch contours found no significant difference between the two groups, and the autistic and TD children used the phonetic features in comparable ways when differentiating the lexical tones. However, the lexical tone accuracy rate was lower among the autistic children than among the TD children, and larger individual differences were observed among the autistic children. CONCLUSIONS & IMPLICATIONS These results indicate that autistic children are able to produce the global contours of the lexical tones, and pitch deficits do not seem to qualify as a core feature of autism. WHAT THIS PAPER ADDS What is already known on the subject Atypical prosody has been considered a marker of the speech of autistic children, and meta-analysis has found a significant difference in mean pitch and pitch range between TD and autistic children.
Yet it remains unknown whether the pitch deficits result from impaired perceptual-motoric ability or reflect failure in learning sentential prosody, which requires an understanding of the interlocutors' minds. In addition, research on the pitch ability of autistic children with intellectual disabilities has been scarce, and whether these children are able to produce pitch variation is largely unknown. What this paper adds to existing knowledge We tested native Mandarin Chinese autistic children with intellectual impairment on their production of native lexical tones. The lexical tones in Chinese are pitch variations realized on individual syllables that distinguish lexical meaning, but they do not serve social pragmatic purposes. We found that although these autistic children had developed only limited spoken language, the majority of their lexical tones were perceived as accurate. They were able to use the phonetic features in ways comparable to the TD children when distinguishing the lexical tones. What are the potential or actual clinical implications of this work? It seems unlikely that pitch processing at the lexical level is fundamentally impaired in autistic children, and pitch deficits do not seem to qualify as a core feature of their speech. Practitioners should be cautious when using pitch production as a clinical marker for autistic children.
Affiliation(s)
- Ao Chen
- School of Psychology, Beijing Language and Culture University, Beijing, China
- School of Communication Science, Beijing Language and Culture University, Beijing, China
- Ru Zhao
- School of Psychology, Beijing Language and Culture University, Beijing, China
- The Children's Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang, China
- Gan Huang
- Institute of Linguistics, Chinese Academy of Social Sciences, Beijing, China
- Aijun Li
- Institute of Linguistics, Chinese Academy of Social Sciences, Beijing, China
- Hintat Cheung
- Department of Audiology and Speech-Language Pathology, Asia University, Taichung, Taiwan, China
4. Godel M, Robain F, Journal F, Kojovic N, Latrèche K, Dehaene-Lambertz G, Schaer M. Prosodic signatures of ASD severity and developmental delay in preschoolers. NPJ Digit Med 2023; 6:99. [PMID: 37248317] [DOI: 10.1038/s41746-023-00845-4]
Abstract
Atypical prosody in speech production is a core feature of Autism Spectrum Disorder (ASD) that can impact everyday communication. Because the ability to modulate prosody develops around the age of speech acquisition, it might be affected by ASD symptoms and developmental delays that emerge in the same period. Here, we investigated the existence of a prosodic signature of developmental level and ASD symptom severity in a sample of 74 autistic preschoolers. We first developed an original diarization pipeline to extract preschoolers' vocalizations from recordings of naturalistic social interactions. Using this novel approach, we then found a robust voice-quality signature of ASD developmental difficulties in preschoolers. Furthermore, some prosodic measures were associated with outcome one year later in participants who had not yet acquired speech. Altogether, our results highlight the potential benefits of automated diarization algorithms and prosodic metrics for digital phenotyping in psychiatry, helping clinicians establish early diagnosis and prognosis.
Affiliation(s)
- Michel Godel
- Department of Psychiatry, University of Geneva School of Medicine, Geneva, Switzerland
- François Robain
- Department of Psychiatry, University of Geneva School of Medicine, Geneva, Switzerland
- Fiona Journal
- Department of Psychiatry, University of Geneva School of Medicine, Geneva, Switzerland
- Nada Kojovic
- Department of Psychiatry, University of Geneva School of Medicine, Geneva, Switzerland
- Kenza Latrèche
- Department of Psychiatry, University of Geneva School of Medicine, Geneva, Switzerland
- Ghislaine Dehaene-Lambertz
- Cognitive Neuroimaging Unit, CNRS ERL 9003, INSERM U992, CEA, Université Paris-Saclay, NeuroSpin Center, Gif/Yvette, France
- Marie Schaer
- Department of Psychiatry, University of Geneva School of Medicine, Geneva, Switzerland
5. Stagg SD, Thompson-Robertson L, Morgan C. Primary school children rate children with autism negatively on looks, speech and speech content. Br J Dev Psychol 2023; 41:37-49. [PMID: 36003025] [DOI: 10.1111/bjdp.12430]
Abstract
Adults and adolescents form negative first impressions of ASD adults and children. We examined primary school children's (6-9 years) first-impression ratings of their ASD peers. 146 school children rated either silent videos, speech, or transcribed speech from 14 actors (7 ASD, 7 TD). The ASD actors were rated more negatively than the typically developing actors on all three stimulus types. Children with ASD are likely to be judged more negatively than their peers at the very start of their formal education. Contrary to previous research, for primary school children, the content of the speech was judged as negatively as the delivery of the speech.
6. Gargan CE, Andrianopoulos MV. Receptive and expressive lexical stress in adolescents with autism. Int J Speech Lang Pathol 2022; 24:636-646. [PMID: 34871124] [DOI: 10.1080/17549507.2021.2008006]
Abstract
Purpose: Lexical stress abilities were investigated in individuals with autism spectrum disorder (ASD) compared to typically developing (TD) controls. We hypothesised that individuals with ASD would demonstrate atypical prosody on lexical and phrase stress tasks and would be perceived by listeners as sounding unnatural. Method: A between-group study was conducted to investigate lexical stress abilities among adolescents (12-20 years) with ASD (n = 11) compared to TD controls (n = 11) matched for age and gender. Two tasks were administered to assess the ability to receptively and expressively distinguish nouns from verbs and a noun phrase from a compound noun. Receptive tasks required participants to select visual stimuli corresponding with the utterance they heard. Expressive tasks were rated using perceptual judgments of accuracy, perceptual and acoustic measurements of duration, and perceptual ratings of "naturalness." Result: Individuals with ASD performed with significantly less accuracy on all prosody tasks, produced utterances of significantly longer duration, and were rated as sounding "unnatural" at a significantly higher rate than controls. Conclusion: This study provides converging evidence that atypical prosody is influenced by longer utterance duration and less accurate lexical and phrase stress. The clinical implications of this study support early assessment and intervention of prosodic disorders in ASD.
Affiliation(s)
- Colleen E Gargan
- Department of Communication Sciences and Disorders, Syracuse University, Syracuse, NY, USA
- Mary V Andrianopoulos
- Department of Communication Disorders, University of Massachusetts Amherst, Amherst, MA, USA
7. Xu K, Yan J, Ma C, Chang X, Chien YF. Atypical patterns of tone production in tone-language-speaking children with autism. Front Psychol 2022; 13:1023205. [DOI: 10.3389/fpsyg.2022.1023205]
Abstract
Speakers with autism spectrum disorder (ASD) are found to exhibit atypical pitch patterns in speech production. However, little is known about the production of lexical tones (T1, T2, T3, T4) as well as neutral tones (T1N, T2N, T3N, T4N) by tone-language speakers with ASD. Thus, this study investigated the height and shape of tones produced by Mandarin-speaking children with ASD and their age-matched typically developing (TD) peers. A pronunciation experiment was conducted in which the participants were asked to produce reduplicated nouns. The findings from the acoustic analyses showed that although ASD children generally produced both lexical tones and neutral tones with distinct tonal contours, there were significant differences between the ASD and TD groups for tone height and shape for T1/T1N, T3/T3N, and T4/T4N. However, we did not find any difference in T2/T2N. These data implied that the atypical acoustic pattern in the ASD group could be partially due to the suppression of the F0 range. Moreover, we found that ASD children tended to produce more errors for T2/T2N, T3/T3N than for T1/T1N, T4/T4N. The pattern of tone errors could be explained by the acquisition principle of pitch, similarities among different tones, and tone sandhi. We thus concluded that deficits in pitch processing could be responsible for the atypical tone pattern of ASD children, and speculated that the atypical tonal contours might also be due to imitation deficits. The present findings may eventually help enhance the comprehensive understanding of the representation of atypical pitch patterns in ASD across languages.
8. Lau JCY, Patel S, Kang X, Nayar K, Martin GE, Choy J, Wong PCM, Losh M. Cross-linguistic patterns of speech prosodic differences in autism: A machine learning study. PLoS One 2022; 17:e0269637. [PMID: 35675372] [PMCID: PMC9176813] [DOI: 10.1371/journal.pone.0269637]
Abstract
Differences in speech prosody are a widely observed feature of Autism Spectrum Disorder (ASD). However, it is unclear how prosodic differences in ASD manifest across different languages that demonstrate cross-linguistic variability in prosody. Using a supervised machine-learning analytic approach, we examined acoustic features relevant to rhythmic and intonational aspects of prosody derived from narrative samples elicited in English and Cantonese, two typologically and prosodically distinct languages. Our models revealed successful classification of ASD diagnosis using rhythm-relevant features within and across both languages. Classification with intonation-relevant features was significant for English but not Cantonese. Results highlight differences in rhythm as a key prosodic feature impacted in ASD, and also demonstrate important variability in other prosodic properties that appear to be modulated by language-specific differences, such as intonation.
Affiliation(s)
- Joseph C. Y. Lau
- Roxelyn and Richard Pepper Department of Communication Sciences and Disorders, Northwestern University, Evanston, Illinois, United States of America
- Shivani Patel
- Roxelyn and Richard Pepper Department of Communication Sciences and Disorders, Northwestern University, Evanston, Illinois, United States of America
- Xin Kang
- Department of Linguistics and Modern Languages, The Chinese University of Hong Kong, Hong Kong S.A.R., China
- Brain and Mind Institute, The Chinese University of Hong Kong, Hong Kong S.A.R., China
- Research Centre for Language, Cognition and Language Application, Chongqing University, Chongqing, China
- School of Foreign Languages and Cultures, Chongqing University, Chongqing, China
- Kritika Nayar
- Roxelyn and Richard Pepper Department of Communication Sciences and Disorders, Northwestern University, Evanston, Illinois, United States of America
- Gary E. Martin
- Department of Communication Sciences and Disorders, St. John’s University, Staten Island, New York, United States of America
- Jason Choy
- Department of Linguistics and Modern Languages, The Chinese University of Hong Kong, Hong Kong S.A.R., China
- Patrick C. M. Wong
- Department of Linguistics and Modern Languages, The Chinese University of Hong Kong, Hong Kong S.A.R., China
- Brain and Mind Institute, The Chinese University of Hong Kong, Hong Kong S.A.R., China
- Molly Losh
- Roxelyn and Richard Pepper Department of Communication Sciences and Disorders, Northwestern University, Evanston, Illinois, United States of America
9. Moffitt JM, Ahn YA, Custode S, Tao Y, Mathew E, Parlade M, Hale M, Durocher J, Alessandri M, Perry LK, Messinger DS. Objective measurement of vocalizations in the assessment of autism spectrum disorder symptoms in preschool age children. Autism Res 2022; 15:1665-1674. [PMID: 35466527] [DOI: 10.1002/aur.2731]
Abstract
Assessment of autism spectrum disorder (ASD) relies on expert clinician observation and judgment, but objective measurement tools have the potential to provide additional information on ASD symptom severity. Diagnostic evaluations for ASD typically include the autism diagnostic observation schedule (ADOS-2), a semi-structured assessment composed of a series of social presses. The current study examined associations between concurrent objective features of child vocalizations during the ADOS-2 and examiner-rated autism symptom severity. The sample included 66 children (49 male; M = 40 months, SD = 10.58) evaluated in a university-based clinic, 61 of whom received an ASD diagnosis. Research reliable administration of the ADOS-2 provided social affect (SA) and restricted and repetitive behavior (RRB) calibrated severity scores (CSS). Audio was recorded from examiner-worn eyeglasses during the ADOS-2 and child and adult speech were differentiated with LENA SP Hub. PRAAT was used to ascertain acoustic features of the audio signal, specifically the mean fundamental vocal frequency (F0) of LENA-identified child speech-like vocalizations (those with phonemic content), child cry vocalizations, and adult speech. Sphinx-4 was employed to estimate child and adult phonological features indexed by the average consonant and vowel count per vocalization. More than a quarter of the variance in ADOS-2 RRB CSS was predicted by the combination of child phoneme count per vocalization and child vocalization F0. Findings indicate that both acoustic and phonological features of child vocalizations are associated with expert clinician ratings of autism symptom severity. LAY SUMMARY: Determination of the severity of autism spectrum disorder is based in part on expert (but subjective) clinician observations during the ADOS-2. 
Two characteristics of child vocalizations, a smaller number of speech-like sounds per vocalization and higher-pitched vocalizations (including cries), were associated with greater autism symptom severity. The results suggest that objectively ascertained characteristics of children's vocalizations capture variance in children's restricted and repetitive behaviors that are reflected in clinician severity indices.
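The mean-F0 measurement performed here with PRAAT can be approximated, for illustration, by a simple autocorrelation pitch estimator. This is a generic sketch run on a synthetic 220 Hz tone, not the study's LENA/PRAAT pipeline:

```python
import numpy as np

def estimate_f0(frame, sr, fmin=75.0, fmax=500.0):
    """Estimate F0 of a voiced frame from the autocorrelation peak."""
    frame = frame - frame.mean()
    # Keep only non-negative lags of the autocorrelation.
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo = int(sr / fmax)  # shortest lag searched (highest admissible pitch)
    hi = int(sr / fmin)  # longest lag searched (lowest admissible pitch)
    lag = lo + int(np.argmax(ac[lo:hi]))
    return sr / lag

# Synthetic 50 ms voiced frame: a 220 Hz tone sampled at 16 kHz.
sr = 16000
t = np.arange(int(0.05 * sr)) / sr
f0 = estimate_f0(np.sin(2 * np.pi * 220.0 * t), sr)  # ~220 Hz
```

Real child speech needs voicing detection and framing first; this shows only the core lag-search step.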
Affiliation(s)
- Yeojin Amy Ahn
- Department of Psychology, University of Miami, Coral Gables, Florida, USA
- Stephanie Custode
- Department of Psychology, University of Miami, Coral Gables, Florida, USA
- Yudong Tao
- Department of Electrical and Computer Engineering, University of Miami, Coral Gables, Florida, USA
- Emilin Mathew
- Department of Psychology, University of Miami, Coral Gables, Florida, USA
- Meaghan Parlade
- Department of Psychology, University of Miami, Coral Gables, Florida, USA
- Melissa Hale
- Department of Psychology, University of Miami, Coral Gables, Florida, USA
- Jennifer Durocher
- Department of Psychology, University of Miami, Coral Gables, Florida, USA
- Michael Alessandri
- Department of Psychology, University of Miami, Coral Gables, Florida, USA
- Lynn K Perry
- Department of Psychology, University of Miami, Coral Gables, Florida, USA
- Daniel S Messinger
- Department of Psychology, University of Miami, Coral Gables, Florida, USA
- Department of Electrical and Computer Engineering, University of Miami, Coral Gables, Florida, USA
- Departments of Pediatrics and Music Engineering, University of Miami, Coral Gables, Florida, USA
10. Chi NA, Washington P, Kline A, Husic A, Hou C, He C, Dunlap K, Wall DP. Classifying Autism From Crowdsourced Semistructured Speech Recordings: Machine Learning Model Comparison Study. JMIR Pediatr Parent 2022; 5:e35406. [PMID: 35436234] [PMCID: PMC9052034] [DOI: 10.2196/35406]
Abstract
BACKGROUND Autism spectrum disorder (ASD) is a neurodevelopmental disorder that results in altered behavior, social development, and communication patterns. In recent years, autism prevalence has tripled, with 1 in 44 children now affected. Given that traditional diagnosis is a lengthy, labor-intensive process that requires the work of trained physicians, significant attention has been given to developing systems that automatically detect autism. We work toward this goal by analyzing audio data, as prosody abnormalities are a signal of autism, with affected children displaying speech idiosyncrasies such as echolalia, monotonous intonation, atypical pitch, and irregular linguistic stress patterns. OBJECTIVE We aimed to test the ability of machine learning approaches to aid in the detection of autism in self-recorded speech audio captured from children with ASD and neurotypical (NT) children in their home environments. METHODS We considered three methods to detect autism in child speech: (1) random forests trained on extracted audio features (including Mel-frequency cepstral coefficients); (2) convolutional neural networks trained on spectrograms; and (3) fine-tuned wav2vec 2.0, a state-of-the-art transformer-based speech recognition model. We trained our classifiers on our novel data set of cellphone-recorded child speech audio curated from the Guess What? mobile game, an app designed to crowdsource videos of children with ASD and NT children in a natural home environment. RESULTS The random forest classifier achieved 70% accuracy, the fine-tuned wav2vec 2.0 model achieved 77% accuracy, and the convolutional neural network achieved 79% accuracy when classifying children's audio as either ASD or NT. We used 5-fold cross-validation to evaluate model performance. CONCLUSIONS Our models were able to predict autism status when trained on a varied selection of home audio clips with inconsistent recording qualities, which may be more representative of real-world conditions. The results demonstrate that machine learning methods offer promise in detecting autism automatically from speech without specialized equipment.
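The 5-fold cross-validation protocol used to evaluate these models can be illustrated with a minimal index-splitting helper; this is a generic sketch, not the study's code:

```python
import numpy as np

def kfold_indices(n, k=5, seed=0):
    """Yield (train_idx, test_idx) index pairs for k-fold cross-validation."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)        # shuffle once so folds are randomized
    folds = np.array_split(idx, k)  # k near-equal held-out folds
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, test

# Every sample lands in exactly one held-out fold.
splits = list(kfold_indices(10, k=5))
```

Each fold's score is computed on data the model never saw during training; averaging over the k folds gives the reported accuracy.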
Affiliation(s)
- Nathan A Chi
- Division of Systems Medicine, Department of Pediatrics, Stanford University, Palo Alto, CA, United States
- Peter Washington
- Department of Bioengineering, Stanford University, Stanford, CA, United States
- Aaron Kline
- Division of Systems Medicine, Department of Pediatrics, Stanford University, Palo Alto, CA, United States
- Arman Husic
- Division of Systems Medicine, Department of Pediatrics, Stanford University, Palo Alto, CA, United States
- Cathy Hou
- Department of Computer Science, Stanford University, Stanford, CA, United States
- Chloe He
- Department of Biomedical Data Science, Stanford University, Stanford, CA, United States
- Kaitlyn Dunlap
- Division of Systems Medicine, Department of Pediatrics, Stanford University, Palo Alto, CA, United States
- Dennis P Wall
- Division of Systems Medicine, Department of Pediatrics, Stanford University, Palo Alto, CA, United States
- Department of Biomedical Data Science, Stanford University, Stanford, CA, United States
- Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA, United States
11. Analysis and classification of speech sounds of children with autism spectrum disorder using acoustic features. Comput Speech Lang 2022. [DOI: 10.1016/j.csl.2021.101287]
12. Mann CC, Karsten AM. Assessment and Treatment of Prosody Behavior in Individuals with Level 1 Autism: A Review and Call for Research. Anal Verbal Behav 2021; 37:171-193. [PMID: 35141105] [PMCID: PMC8789987] [DOI: 10.1007/s40616-021-00154-5]
Abstract
Differences in prosody behavior between individuals with autism spectrum disorder (ASD) and their typically developing peers have been considered a central feature of ASD since the earliest clinical descriptions of the disorder (e.g., Kanner, 1943/1973). Prosody includes pitch and volume among other dimensions of vocal-verbal behavior that discriminate responses of the listener; thus, people with ASD whose prosody has confusing or off-putting effects may have fewer social opportunities at work, at school, or in the community. The purpose of this review is to examine the state of the literature on prosody interventions for individuals with ASD and to provide recommendations for researchers who are interested in contributing to the scientific understanding of prosody.
Affiliation(s)
- Charlotte C. Mann
- Department of Psychology, Western New England University, Springfield, MA, USA
- Department of Counseling and Applied Behavioral Studies, University of Saint Joseph, 1678 Asylum Ave, West Hartford, CT 06117 USA
- Amanda M. Karsten
- Department of Psychology, Western New England University, Springfield, MA, USA
13. Asghari SZ, Farashi S, Bashirian S, Jenabi E. Distinctive prosodic features of people with autism spectrum disorder: a systematic review and meta-analysis study. Sci Rep 2021; 11:23093. [PMID: 34845298] [PMCID: PMC8630064] [DOI: 10.1038/s41598-021-02487-6]
Abstract
In this systematic review, we analyzed and evaluated the findings of studies on prosodic features of vocal productions of people with autism spectrum disorder (ASD) in order to recognize the statistically significant, most confirmed and reliable prosodic differences distinguishing people with ASD from typically developing (TD) individuals. Using suitable keywords, three major databases, including Web of Science, PubMed and Scopus, were searched. The results for prosodic features such as mean pitch, pitch range and variability, speech rate, intensity and voice duration were extracted from eligible studies. The pooled standardized mean difference (SMD) between ASD and control groups was extracted or calculated. Using the I² statistic and Cochran's Q test, between-study heterogeneity was evaluated. Furthermore, publication bias was assessed using a funnel plot and its significance was evaluated using Egger's and Begg's tests. Thirty-nine eligible studies were retrieved (including 910 and 850 participants for the ASD and control groups, respectively). This systematic review and meta-analysis showed that ASD group members had a significantly larger mean pitch (SMD = -0.4, 95% CI [-0.70, -0.10]), larger pitch range (SMD = -0.78, 95% CI [-1.34, -0.21]), longer voice duration (SMD = -0.43, 95% CI [-0.72, -0.15]), and larger pitch variability (SMD = -0.46, 95% CI [-0.84, -0.08]) compared with the typically developing control group. However, no significant differences in pitch standard deviation, voice intensity and speech rate were found between groups. Chronological age of participants and voice elicitation tasks were two sources of between-study heterogeneity. Furthermore, no publication bias was observed during the analyses (p > 0.05). Mean pitch, pitch range, pitch variability and voice duration were recognized as the prosodic features reliably distinguishing people with ASD from TD individuals.
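The pooling and heterogeneity statistics named here (inverse-variance pooled SMD, Cochran's Q, I²) follow standard textbook formulas and can be sketched directly; the per-study numbers below are illustrative placeholders, not the review's data:

```python
import numpy as np

def pool_smd(effects, variances):
    """Inverse-variance pooled SMD with Cochran's Q and the I^2 statistic."""
    effects = np.asarray(effects, dtype=float)
    w = 1.0 / np.asarray(variances, dtype=float)  # fixed-effect weights
    pooled = np.sum(w * effects) / np.sum(w)      # weighted mean effect
    q = np.sum(w * (effects - pooled) ** 2)       # Cochran's Q
    df = len(effects) - 1
    # I^2: share of total variation attributable to heterogeneity, floored at 0.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, q, i2

# Illustrative per-study SMDs and variances (not the review's data).
pooled, q, i2 = pool_smd([-0.5, -0.3, -0.4], [0.04, 0.05, 0.04])
```

With homogeneous inputs like these, Q falls below its degrees of freedom and I² is floored at 0%, i.e. no detected between-study heterogeneity.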
Affiliation(s)
- Sajjad Farashi
- Autism Spectrum Disorders Research Center, Hamadan University of Medical Sciences, Hamadan, Iran.
- Saeid Bashirian
- Department of Public Health, School of Health, Hamadan University of Medical Sciences, Hamadan, Iran.
- Ensiyeh Jenabi
- Autism Spectrum Disorders Research Center, Hamadan University of Medical Sciences, Hamadan, Iran

14
Wang L, Beaman CP, Jiang C, Liu F. Perception and Production of Statement-Question Intonation in Autism Spectrum Disorder: A Developmental Investigation. J Autism Dev Disord 2021; 52:3456-3472. [PMID: 34355295 PMCID: PMC9296411 DOI: 10.1007/s10803-021-05220-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 07/25/2021] [Indexed: 11/25/2022]
Abstract
Prosody or “melody in speech” in autism spectrum disorder (ASD) is often perceived as atypical. This study examined perception and production of statements and questions in 84 children, adolescents and adults with and without ASD, as well as participants’ pitch direction discrimination thresholds. The results suggested that the abilities to discriminate (in both speech and music conditions), identify, and imitate statement-question intonation were intact in individuals with ASD across age cohorts. Sensitivity to pitch direction predicted performance on intonation processing in both groups, who also exhibited similar developmental changes. These findings provide evidence for shared mechanisms in pitch processing between speech and music, as well as associations between low- and high-level pitch processing and between perception and production of pitch.
Affiliation(s)
- Li Wang
- School of Psychology and Clinical Language Sciences, University of Reading, Reading, UK
- C Philip Beaman
- School of Psychology and Clinical Language Sciences, University of Reading, Reading, UK
- Cunmei Jiang
- Music College, Shanghai Normal University, Shanghai, China
- Fang Liu
- School of Psychology and Clinical Language Sciences, University of Reading, Reading, UK.

15
Chen F, Cheung CCH, Peng G. Linguistic Tone and Non-Linguistic Pitch Imitation in Children with Autism Spectrum Disorders: A Cross-Linguistic Investigation. J Autism Dev Disord 2021; 52:2325-2343. [PMID: 34109462 DOI: 10.1007/s10803-021-05123-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 05/30/2021] [Indexed: 11/29/2022]
Abstract
Conclusions about prosodic pitch features in autism spectrum disorders (ASD) have primarily been derived from studies of non-tonal-language speakers. This cross-linguistic study evaluated the imitation of Cantonese lexical tones and their non-linguistic (nonspeech) counterparts by Cantonese- and Mandarin-speaking children with and without ASD. Acoustic analyses showed that, compared with typically developing peers, children with ASD exhibited increased pitch variations when imitating lexical tones, while performing similarly when imitating the nonspeech counterparts. Furthermore, Mandarin-speaking children with ASD failed to exploit phonological knowledge of the segments to improve their imitation accuracy for non-native lexical tones. These findings help clarify the speech-specific pitch-processing atypicality and phonological processing deficit in tone-language-speaking children with ASD.
Affiliation(s)
- Fei Chen
- School of Foreign Languages, Hunan University, Changsha, China.
- Candice Chi-Hang Cheung
- Research Centre for Language, Cognition, and Neuroscience, & Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University, Hong Kong SAR, China
- Gang Peng
- Research Centre for Language, Cognition, and Neuroscience, & Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University, Hong Kong SAR, China.

16
Kissine M, Geelhand P, Philippart De Foy M, Harmegnies B, Deliens G. Phonetic Inflexibility in Autistic Adults. Autism Res 2021; 14:1186-1196. [PMID: 33484063 DOI: 10.1002/aur.2477] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/28/2020] [Revised: 12/28/2020] [Accepted: 01/07/2021] [Indexed: 11/06/2022]
Abstract
This study examined whether the atypical speech style frequently reported in autistic adults is underpinned by an inflexible production of phonetic targets. In a first task, 20 male autistic adults and 20 neuro-typicals had to read and produce native vowels. To assess the extent to which phonetic inflexibility is due to an overall fine-grained control of phonetic behavior or to a lack of flexibility in the realization of one's phonological repertoire, a second task asked participants to reproduce artificial vowel-like sounds. Results confirmed greater articulatory stability in the production of native vowels in autistic adults. When instructed to imitate artificial vowel-like sounds, the autistic group did not approximate the targets' acoustic properties better than neuro-typicals, but their performance at reproducing artificial vowels was less variable and influenced to a greater extent by the articulatory properties of their own vocalic space. These findings suggest that the greater articulatory stability observed in autistic adults arises from a lack of flexibility in the production of their own native vowels. The two phonetic tasks are devoid of any pragmatic constraint, which indicates that phonetic inflexibility in autism is partly independent of register selection. LAY SUMMARY: Autistic and neuro-typical adults took part in two tasks: one in which they produced vowels from French, their native tongue, and another in which they imitated unfamiliar vowels. Autistic adults displayed significantly less variation in their production of different French vowels. In imitating unfamiliar vowels, they were more influenced by the way they pronounce French vowels. These results suggest that the atypical speech style frequently attested in autistic individuals could stem from an unusually stable pronunciation of speech sounds.
Affiliation(s)
- Mikhail Kissine
- ACTE, Université libre de Bruxelles, Bruxelles, Belgium; ULB Neuroscience Institute, Université libre de Bruxelles, Bruxelles, Belgium; Center for Research in Linguistics (LaDisco), Université libre de Bruxelles, Bruxelles, Belgium
- Philippine Geelhand
- ACTE, Université libre de Bruxelles, Bruxelles, Belgium; ULB Neuroscience Institute, Université libre de Bruxelles, Bruxelles, Belgium; Center for Research in Linguistics (LaDisco), Université libre de Bruxelles, Bruxelles, Belgium
- Marie Philippart De Foy
- Laboratoire de Phonétique, Institut de Recherche en Sciences et Technologies du Langage, Université de Mons, Mons, Belgium
- Bernard Harmegnies
- Laboratoire de Phonétique, Institut de Recherche en Sciences et Technologies du Langage, Université de Mons, Mons, Belgium
- Gaétane Deliens
- ACTE, Université libre de Bruxelles, Bruxelles, Belgium; ULB Neuroscience Institute, Université libre de Bruxelles, Bruxelles, Belgium; Center for Research in Cognition and Neurosciences (CRCN), Université libre de Bruxelles, Bruxelles, Belgium

17
Stroganova TA, Komarov KS, Sysoeva OV, Goiaeva DE, Obukhova TS, Ovsiannikova TM, Prokofyev AO, Orekhova EV. Left hemispheric deficit in the sustained neuromagnetic response to periodic click trains in children with ASD. Mol Autism 2020; 11:100. [PMID: 33384021 PMCID: PMC7775632 DOI: 10.1186/s13229-020-00408-4] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/28/2020] [Accepted: 12/17/2020] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Deficits in the perception and production of vocal pitch are often observed in people with autism spectrum disorder (ASD), but the neural basis of these deficits is unknown. In the magnetoencephalogram (MEG), spectrally complex periodic sounds trigger two continuous neural responses: the auditory steady-state response (ASSR) and the sustained field (SF). It has been shown that the SF in neurotypical individuals is associated with low-level analysis of pitch in the 'pitch processing center' of Heschl's gyrus. Therefore, alterations in this auditory response may reflect atypical processing of vocal pitch. The SF, however, has never been studied in people with ASD. METHODS We used MEG and individual brain models to investigate the ASSR and SF evoked by monaural 40 Hz click trains in boys with ASD (N = 35) and neurotypical (NT) boys (N = 35) aged 7-12 years. RESULTS In agreement with previous research in adults, the cortical sources of the SF in children were located in the left and right Heschl's gyri, anterolateral to those of the ASSR. In both groups, the SF and ASSR dominated in the right hemisphere and were higher in the hemisphere contralateral to the stimulated ear. The ASSR increased with age in both NT and ASD children and did not differ between the groups. The SF amplitude did not change significantly between the ages of 7 and 12 years. It was moderately attenuated in both hemispheres and was markedly delayed and displaced in the left hemisphere in boys with ASD. The SF delay in participants with ASD was present irrespective of their intelligence level and severity of autism symptoms. LIMITATIONS We did not test the language abilities of our participants. Therefore, the link between the SF and the processing of vocal pitch in children with ASD remains speculative. CONCLUSION Children with ASD demonstrate atypical processing of spectrally complex periodic sound at the level of the core auditory cortex of the left hemisphere. The observed neural deficit may contribute to the speech perception difficulties experienced by children with ASD, including their poor perception and production of linguistic prosody.
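A 40 Hz click train of the kind used as the stimulus here is straightforward to synthesize digitally; a minimal sketch (sampling rate, click width and duration are illustrative assumptions, not the study's exact stimulus):

```python
# Build a monaural 40 Hz click train: brief rectangular clicks whose
# onsets repeat every 1/40 s. Sampling rate, click width and duration
# are illustrative assumptions, not the study's exact stimulus.

def click_train(rate_hz=40, fs=44100, duration_s=1.0, click_samples=2):
    n = int(fs * duration_s)
    period = int(fs / rate_hz)              # samples between click onsets
    signal = [0.0] * n
    for onset in range(0, n, period):
        for k in range(click_samples):      # a 2-sample rectangular click
            if onset + k < n:
                signal[onset + k] = 1.0
    return signal

train = click_train()
# Count click onsets: samples that are 1.0 and follow a 0.0 (or the start)
onsets = [i for i, s in enumerate(train)
          if s == 1.0 and (i == 0 or train[i - 1] == 0.0)]
```

With these defaults the 1-s train contains roughly 40 click onsets (41 with this integer rounding of the period), i.e. the periodicity that entrains the 40 Hz ASSR.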
Affiliation(s)
- T A Stroganova
- Center for Neurocognitive Research (MEG Center), Moscow State University of Psychology and Education, Moscow, Russian Federation
- K S Komarov
- Center for Neurocognitive Research (MEG Center), Moscow State University of Psychology and Education, Moscow, Russian Federation
- O V Sysoeva
- Institute of Higher Nervous Activity, Russian Academy of Science, Moscow, Russian Federation
- D E Goiaeva
- Center for Neurocognitive Research (MEG Center), Moscow State University of Psychology and Education, Moscow, Russian Federation
- T S Obukhova
- Center for Neurocognitive Research (MEG Center), Moscow State University of Psychology and Education, Moscow, Russian Federation
- T M Ovsiannikova
- Center for Neurocognitive Research (MEG Center), Moscow State University of Psychology and Education, Moscow, Russian Federation
- A O Prokofyev
- Center for Neurocognitive Research (MEG Center), Moscow State University of Psychology and Education, Moscow, Russian Federation
- E V Orekhova
- Center for Neurocognitive Research (MEG Center), Moscow State University of Psychology and Education, Moscow, Russian Federation; MedTech West and the Institute of Neuroscience and Physiology, Sahlgrenska Academy, The University of Gothenburg, Gothenburg, Sweden.

18
Ochi K, Ono N, Owada K, Kojima M, Kuroda M, Sagayama S, Yamasue H. Quantification of speech and synchrony in the conversation of adults with autism spectrum disorder. PLoS One 2019; 14:e0225377. [PMID: 31805131 PMCID: PMC6894781 DOI: 10.1371/journal.pone.0225377] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/23/2019] [Accepted: 11/04/2019] [Indexed: 11/25/2022] Open
Abstract
Autism spectrum disorder (ASD) is a highly prevalent neurodevelopmental disorder characterized by impairments in social reciprocity and communication together with restricted interests and stereotyped behaviors. The Autism Diagnostic Observation Schedule (ADOS), which mainly depends on subjective assessments made by trained clinicians, is considered a ‘gold standard’ instrument for the diagnosis of ASD. To develop a quantitative and objective surrogate marker for ASD symptoms, we investigated speech features, including F0, speech rate, speaking time, and turn-taking gaps, extracted from footage recorded during a semi-structured, socially interactive situation from the ADOS. We computed statistics not only over the whole session of the ADOS activity but also in a block analysis, calculating the statistics of the prosodic features within each 8-s sliding window. The block analysis captured whether participants changed volume or pitch according to the flow of the conversation. We also measured the synchrony between the participant and the ADOS administrator. Compared with participants with typical development (TD), participants with high-functioning ASD showed significantly longer turn-taking gaps, a greater proportion of pause time, and less variability and less synchronous change in the blockwise mean of intensity (p < 0.05 corrected). In addition, the ASD group had a significantly wider distribution than the TD group in the within-participant variability of the blockwise mean of log F0 (p < 0.05 corrected). The clinical diagnosis could be discriminated using the speech features with 89% accuracy. The features of turn-taking and pausing were significantly correlated with ASD deficits in reciprocity (p < 0.05 corrected). Additionally, regression analysis yielded a mean absolute error of 1.35 in the prediction of deficits in reciprocity, to which the synchrony of intensity especially contributed. The findings suggest that the variability of speech features and the interaction and synchrony with a conversation partner are critical for characterizing atypical features in the conversation of people with ASD.
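The block analysis described above amounts to computing statistics of a per-frame feature track inside a sliding window; a minimal sketch (the window/hop sizes, frame rate and `f0_track` are illustrative assumptions, not the study's data or exact procedure):

```python
import math

# Blockwise statistics of a per-frame prosodic feature (e.g. log F0)
# in a sliding window, as in a block analysis of conversational speech.
# Window/hop sizes and the feature track below are illustrative.

def blockwise_means(track, frame_rate_hz, block_s=8.0, hop_s=1.0):
    """Mean of the feature within each sliding block of block_s seconds."""
    block = int(block_s * frame_rate_hz)    # frames per block
    hop = int(hop_s * frame_rate_hz)        # frames advanced per step
    return [sum(track[i:i + block]) / block
            for i in range(0, len(track) - block + 1, hop)]

# Hypothetical 100 Hz log-F0 track: 20 s of speech with a slow pitch rise
f0_track = [math.log(120.0 + 0.005 * t) for t in range(2000)]

block_means = blockwise_means(f0_track, frame_rate_hz=100)
# Within-participant variability of the blockwise means
mu = sum(block_means) / len(block_means)
var = sum((m - mu) ** 2 for m in block_means) / len(block_means)
```

The spread of `block_means` (here summarized by `var`) is the kind of within-participant variability statistic that the abstract reports as distributed differently in the ASD group.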
Affiliation(s)
- Keiko Ochi
- School of Media Science, Tokyo University of Technology, Hachioji, Japan
- * E-mail: (KO); (HY)
- Nobutaka Ono
- Department of Computer Science, Graduate School of Systems Design, Tokyo Metropolitan University, Hino, Japan
- Keiho Owada
- Department of Child Psychiatry, School of Medicine, The University of Tokyo, Tokyo, Japan
- Masaki Kojima
- Department of Child Psychiatry, School of Medicine, The University of Tokyo, Tokyo, Japan
- Miho Kuroda
- Department of Child Psychiatry, School of Medicine, The University of Tokyo, Tokyo, Japan
- Hidenori Yamasue
- Department of Psychiatry, Hamamatsu University School of Medicine, Hamamatsu, Japan
- * E-mail: (KO); (HY)

19
Yankowitz LD, Schultz RT, Parish-Morris J. Pre- and Paralinguistic Vocal Production in ASD: Birth Through School Age. Curr Psychiatry Rep 2019; 21:126. [PMID: 31749074 DOI: 10.1007/s11920-019-1113-1] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
Abstract
PURPOSE OF REVIEW We review what is known about how pre-linguistic vocal differences in autism spectrum disorder (ASD) unfold across development and consider whether vocalization features can serve as useful diagnostic indicators. RECENT FINDINGS Differences in the frequency and acoustic quality of several vocalization types (e.g., babbles and cries) during the first year of life are associated with later ASD diagnosis. Paralinguistic features (e.g., prosody) measured during early and middle childhood can accurately classify current ASD diagnosis using cross-validated machine learning approaches. Pre-linguistic vocalization differences in infants are promising behavioral markers of later ASD diagnosis. In older children, paralinguistic features hold promise as diagnostic indicators as well as clinical targets. Future research efforts should focus on (1) bridging the gap between basic research and practical implementations of early vocalization-based risk assessment tools, and (2) demonstrating the clinical impact of targeting atypical vocalization features during social skill interventions for older children.
Affiliation(s)
- Lisa D Yankowitz
- Center for Autism Research, Children's Hospital of Philadelphia, 2716 South St, Philadelphia, PA, 19104, USA; Department of Psychology, University of Pennsylvania, 3720 Walnut Street, Philadelphia, PA, 19104, USA
- Robert T Schultz
- Center for Autism Research, Children's Hospital of Philadelphia, 2716 South St, Philadelphia, PA, 19104, USA; Department of Psychiatry, University of Pennsylvania, 3400 Civic Center Blvd, Philadelphia, PA, 19105, USA; Department of Pediatrics, University of Pennsylvania, 3400 Civic Center Blvd, Philadelphia, PA, 19105, USA
- Julia Parish-Morris
- Center for Autism Research, Children's Hospital of Philadelphia, 2716 South St, Philadelphia, PA, 19104, USA; Department of Psychiatry, University of Pennsylvania, 3400 Civic Center Blvd, Philadelphia, PA, 19105, USA

20
Kissine M, Geelhand P. Brief Report: Acoustic Evidence for Increased Articulatory Stability in the Speech of Adults with Autism Spectrum Disorder. J Autism Dev Disord 2019; 49:2572-2580. [PMID: 30707332 DOI: 10.1007/s10803-019-03905-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/27/2022]
Abstract
Subjective impressions of speech delivery in Autism Spectrum Disorder (ASD) as monotonic or over-precise are widespread but still lack robust acoustic evidence. This study provides a detailed acoustic characterization of the specificities of speech in individuals with ASD using an extensive sample of speech data, from the production of narratives and from spontaneous conversation. Syllable-level analyses (30,843 tokens in total) were performed on audio recordings from two sub-tasks of the Autism Diagnostic Observation Schedule from 20 adults with ASD and 20 pairwise matched neuro-typical adults, providing acoustic measures of fundamental frequency, jitter, shimmer and the first three formants. The results suggest that participants with ASD display a greater articulatory stability in vowel production than neuro-typical participants, both in phonation and articulatory gestures.
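Jitter and shimmer, two of the acoustic measures used above, are standard cycle-to-cycle perturbation statistics; a minimal sketch of the common 'local' definitions (as in Praat's voice report), with made-up period and amplitude sequences for illustration:

```python
# Cycle-to-cycle perturbation measures of phonation: the 'local'
# definitions of jitter (pitch-period perturbation) and shimmer
# (amplitude perturbation). The data below are made up for the example.

def local_perturbation(values):
    """Mean absolute difference of consecutive values, relative to the mean."""
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))

# Hypothetical glottal cycle data: ~5 ms pitch periods and peak amplitudes
periods = [0.0050, 0.0051, 0.0049, 0.0050, 0.0052, 0.0049]   # seconds
amplitudes = [0.80, 0.82, 0.79, 0.81, 0.80, 0.78]

jitter = local_perturbation(periods)       # smaller = steadier phonation
shimmer = local_perturbation(amplitudes)
```

Greater articulatory stability of the kind reported in this study would show up as lower perturbation values of this sort across repeated vowel tokens.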
Affiliation(s)
- Mikhail Kissine
- ACTE at LaDisco & ULB Neuroscience Institute, Université libre de Bruxelles, CP 175, avenue F.D. Roosevelt, 1050, Brussels, Belgium.
- Philippine Geelhand
- ACTE at LaDisco & ULB Neuroscience Institute, Université libre de Bruxelles, CP 175, avenue F.D. Roosevelt, 1050, Brussels, Belgium

21
Mencattini A, Mosciano F, Comes MC, Di Gregorio T, Raguso G, Daprati E, Ringeval F, Schuller B, Di Natale C, Martinelli E. An emotional modulation model as signature for the identification of children developmental disorders. Sci Rep 2018; 8:14487. [PMID: 30262838 PMCID: PMC6160482 DOI: 10.1038/s41598-018-32454-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/21/2018] [Accepted: 08/06/2018] [Indexed: 12/15/2022] Open
Abstract
In recent years, applications like Apple's Siri or Microsoft's Cortana have created the illusion that one can actually "chat" with a machine. However, perfectly natural human-machine interaction is far from real, as none of these tools can empathize. This issue has raised increasing interest in speech emotion recognition systems, which offer the possibility of detecting the emotional state of the speaker. This possibility seems relevant to a broad number of domains, ranging from man-machine interfaces to diagnostics. With this in mind, in the present work we explored the possibility of applying a precision approach to the development of a statistical learning algorithm aimed at classifying samples of speech produced by children with developmental disorders (DD) and typically developing (TD) children. Under the assumption that acoustic features of vocal production could not be efficiently used as a direct marker of DD, we propose to apply the Emotional Modulation function (EMF) concept, rather than running analyses on acoustic features per se, to identify the different classes. The novel paradigm was applied to the French Child Pathological & Emotional Speech Database, obtaining a final accuracy of 0.79, with maximum performance reached in recognizing language impairment (0.92) and autism disorder (0.82).
Affiliation(s)
- Arianna Mencattini
- Department of Electronic Engineering, University of Rome Tor Vergata, via del Politecnico 1, 00133, Roma, Italy
- Francesco Mosciano
- Department of Electronic Engineering, University of Rome Tor Vergata, via del Politecnico 1, 00133, Roma, Italy
- Maria Colomba Comes
- Department of Electronic Engineering, University of Rome Tor Vergata, via del Politecnico 1, 00133, Roma, Italy
- Tania Di Gregorio
- Faculty of Science MM.FF.NN., University of Bari, Aldo Moro, University Campus Ernesto Quagliariello, Via Edoardo Orabona 4, 70126, Bari, Italy
- Grazia Raguso
- Faculty of Science MM.FF.NN., University of Bari, Aldo Moro, University Campus Ernesto Quagliariello, Via Edoardo Orabona 4, 70126, Bari, Italy
- Elena Daprati
- Department of Systems Medicine, CBMS, University of Rome Tor Vergata, via Montpellier 1, 00133, Roma, Italy
- Fabien Ringeval
- Laboratoire d'Informatique de Grenoble, Université Grenoble Alpes, 38401, St Martin d'Hères, France
- Bjorn Schuller
- GLAM - Group on Language, Audio & Music, Imperial College London, SW7 2AZ, London, UK
- Chair of Embedded Intelligence for Health Care and Wellbeing, University of Augsburg, 86159, Augsburg, Germany
- Eugenio Martinelli
- Department of Electronic Engineering, University of Rome Tor Vergata, via del Politecnico 1, 00133, Roma, Italy.

22
Tardif M, Berti L, Marino V, Pardo J, Bressmann T. Hypernasal Speech Is Perceived as More Monotonous than Typical Speech. Folia Phoniatr Logop 2018; 70:183-190. [DOI: 10.1159/000492385] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/02/2018] [Accepted: 07/23/2018] [Indexed: 11/19/2022] Open
23
Nakai Y, Takiguchi T, Matsui G, Yamaoka N, Takada S. Detecting Abnormal Word Utterances in Children With Autism Spectrum Disorders: Machine-Learning-Based Voice Analysis Versus Speech Therapists. Percept Mot Skills 2017. [PMID: 28649923 DOI: 10.1177/0031512517716855] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
Abnormal prosody is often evident in the voice intonations of individuals with autism spectrum disorders. We compared a machine-learning-based voice analysis with human hearing judgments made by 10 speech therapists for classifying children with autism spectrum disorders (n = 30) and typical development (n = 51). Using stimuli limited to single-word utterances, the machine-learning-based voice analysis was superior to the speech therapists' judgments. There was a significantly higher true-positive than false-negative rate for the machine-learning-based voice analysis but not for the speech therapists. Results are discussed in terms of the artificiality of clinician judgments based on single-word utterances and the objectivity that machine-learning-based voice analysis adds to judging abnormal prosody.
24
Olivati AG, Assumpção FB, Misquiatti ARN. Acoustic analysis of speech intonation pattern of individuals with Autism Spectrum Disorders. Codas 2017; 29:e20160081. [PMID: 28403279 DOI: 10.1590/2317-1782/20172016081] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/11/2016] [Accepted: 09/12/2016] [Indexed: 11/22/2022] Open
Abstract
PURPOSE This study aimed to analyze prosodic elements of speech segments of students with Autism Spectrum Disorders (ASD) and compare them with a control group using acoustic analysis. METHODS Speech recordings were obtained from a sample of male individuals with ASD (n = 19) and with typical development (n = 19), aged 8-33 years. The prosody questionnaire of ALIB (Brazilian Linguistic Atlas), which contains interrogative, affirmative and imperative sentences, was used as the script. Data were analyzed using the PRAAT software and submitted to statistical analysis to verify possible significant differences between the two groups in each prosodic parameter evaluated (fundamental frequency, intensity and duration) and its respective variables. RESULTS There were significant differences for the variables tessitura, melodic amplitude of the tonic vowel, melodic amplitude of the pretonic vowel, maximum intensity, minimum intensity, tonic vowel duration, pretonic vowel duration and phrase duration. CONCLUSION Individuals with ASD present significant differences in prosody compared with those with typical development. Additional studies on the characterization of prosodic aspects of the speech of individuals with ASD, with a larger sample and a more restricted age range, are nevertheless needed.
Affiliation(s)
- Ana Gabriela Olivati
- Universidade Estadual Paulista "Júlio de Mesquita Filho" - UNESP - Marília (SP), Brasil

25
Fusaroli R, Lambrechts A, Bang D, Bowler DM, Gaigg SB. “Is voice a marker for Autism spectrum disorder? A systematic review and meta-analysis”. Autism Res 2016; 10:384-407. [DOI: 10.1002/aur.1678] [Citation(s) in RCA: 72] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/07/2016] [Revised: 06/24/2016] [Accepted: 07/01/2016] [Indexed: 12/31/2022]
Affiliation(s)
- Dan Bang
- The Interacting Minds Centre, Aarhus University, Aarhus, Denmark
- Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, London, UK
- Calleva Research Centre for Evolution and Human Sciences, Magdalen College, University of Oxford, Oxford, UK

26
Do Individuals with High-Functioning Autism Who Speak a Tone Language Show Intonation Deficits? J Autism Dev Disord 2016; 46:1784-92. [DOI: 10.1007/s10803-016-2709-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/22/2022]