1
Using a small dataset to classify strength-interactions with an elastic display: a case study for the screening of autism spectrum disorder. INT J MACH LEARN CYB 2022. [DOI: 10.1007/s13042-022-01554-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
2
Chi NA, Washington P, Kline A, Husic A, Hou C, He C, Dunlap K, Wall DP. Classifying Autism From Crowdsourced Semistructured Speech Recordings: Machine Learning Model Comparison Study. JMIR Pediatr Parent 2022; 5:e35406. [PMID: 35436234 PMCID: PMC9052034 DOI: 10.2196/35406] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 12/09/2021] [Revised: 01/18/2022] [Accepted: 01/25/2022] [Indexed: 01/27/2023]
Abstract
BACKGROUND: Autism spectrum disorder (ASD) is a neurodevelopmental disorder that results in altered behavior, social development, and communication patterns. In recent years, autism prevalence has tripled, with 1 in 44 children now affected. Given that traditional diagnosis is a lengthy, labor-intensive process that requires the work of trained physicians, significant attention has been given to developing systems that automatically detect autism. We work toward this goal by analyzing audio data, as prosody abnormalities are a signal of autism, with affected children displaying speech idiosyncrasies such as echolalia, monotonous intonation, atypical pitch, and irregular linguistic stress patterns. OBJECTIVE: We aimed to test the ability of machine learning approaches to aid in the detection of autism in self-recorded speech audio captured from children with ASD and neurotypical (NT) children in their home environments. METHODS: We considered three methods to detect autism in child speech: (1) random forests trained on extracted audio features (including Mel-frequency cepstral coefficients); (2) convolutional neural networks trained on spectrograms; and (3) fine-tuned wav2vec 2.0, a state-of-the-art transformer-based speech recognition model. We trained our classifiers on our novel data set of cellphone-recorded child speech audio curated from the Guess What? mobile game, an app designed to crowdsource videos of children with ASD and NT children in a natural home environment. RESULTS: The random forest classifier achieved 70% accuracy, the fine-tuned wav2vec 2.0 model achieved 77% accuracy, and the convolutional neural network achieved 79% accuracy when classifying children's audio as either ASD or NT. We used 5-fold cross-validation to evaluate model performance. CONCLUSIONS: Our models were able to predict autism status when trained on a varied selection of home audio clips with inconsistent recording qualities, which may be more representative of real-world conditions. The results demonstrate that machine learning methods offer promise in detecting autism automatically from speech without specialized equipment.
Affiliation(s)
- Nathan A Chi
  - Division of Systems Medicine, Department of Pediatrics, Stanford University, Palo Alto, CA, United States
- Peter Washington
  - Department of Bioengineering, Stanford University, Stanford, CA, United States
- Aaron Kline
  - Division of Systems Medicine, Department of Pediatrics, Stanford University, Palo Alto, CA, United States
- Arman Husic
  - Division of Systems Medicine, Department of Pediatrics, Stanford University, Palo Alto, CA, United States
- Cathy Hou
  - Department of Computer Science, Stanford University, Stanford, CA, United States
- Chloe He
  - Department of Biomedical Data Science, Stanford University, Stanford, CA, United States
- Kaitlyn Dunlap
  - Division of Systems Medicine, Department of Pediatrics, Stanford University, Palo Alto, CA, United States
- Dennis P Wall
  - Division of Systems Medicine, Department of Pediatrics, Stanford University, Palo Alto, CA, United States
  - Department of Biomedical Data Science, Stanford University, Stanford, CA, United States
  - Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA, United States
3
Pokorny FB, Bartl-Pokorny KD, Zhang D, Marschik PB, Schuller D, Schuller BW. Efficient Collection and Representation of Preverbal Data in Typical and Atypical Development. JOURNAL OF NONVERBAL BEHAVIOR 2020; 44:419-436. [PMID: 33088008 PMCID: PMC7561537 DOI: 10.1007/s10919-020-00332-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 11/13/2022]
Abstract
Human preverbal development refers to the period of steadily increasing vocal capacities until the emergence of a child’s first meaningful words. Over the last decades, research has focused intensively on preverbal behavior in typical development. Preverbal vocal patterns have been phonetically classified and acoustically characterized. More recently, specific preverbal phenomena have been discussed as potential early indicators of atypical development. Recent advancements in audio signal processing and machine learning have allowed for novel approaches in preverbal behavior analysis, including automatic vocalization-based differentiation of typically and atypically developing individuals. In this paper, we give a methodological overview of current strategies for collecting and acoustically representing preverbal data for intelligent audio analysis paradigms. Efficiency in the context of data collection and data representation is discussed. Following current research trends, we set a special focus on challenges that arise when dealing with preverbal data of individuals with late detected developmental disorders, such as autism spectrum disorder or Rett syndrome.
Affiliation(s)
- Florian B Pokorny
  - iDN - interdisciplinary Developmental Neuroscience, Division of Phoniatrics, Medical University of Graz, Graz, Austria; Machine Intelligence & Signal Processing group (MISP), Chair of Human-Machine Communication, Technical University of Munich, Munich, Germany
- Katrin D Bartl-Pokorny
  - iDN - interdisciplinary Developmental Neuroscience, Division of Phoniatrics, Medical University of Graz, Graz, Austria
- Dajie Zhang
  - iDN - interdisciplinary Developmental Neuroscience, Division of Phoniatrics, Medical University of Graz, Graz, Austria; Department of Child and Adolescent Psychiatry and Psychotherapy, University Medical Center Göttingen, Göttingen, Germany; Leibniz ScienceCampus Primate Cognition, Göttingen, Germany
- Peter B Marschik
  - iDN - interdisciplinary Developmental Neuroscience, Division of Phoniatrics, Medical University of Graz, Graz, Austria; Department of Child and Adolescent Psychiatry and Psychotherapy, University Medical Center Göttingen, Göttingen, Germany; Leibniz ScienceCampus Primate Cognition, Göttingen, Germany; Center of Neurodevelopmental Disorders (KIND), Department of Women's and Children's Health, Karolinska Institutet, Stockholm, Sweden
- Björn W Schuller
  - audEERING GmbH, Gilching, Germany; ZD.B Chair of Embedded Intelligence for Health Care and Wellbeing, University of Augsburg, Augsburg, Germany; GLAM - Group on Language, Audio & Music, Department of Computing, Imperial College London, London, UK
4
VanDam M, Yoshinaga-Itano C. Use of the LENA Autism Screen with Children who are Deaf or Hard of Hearing. MEDICINA (KAUNAS, LITHUANIA) 2019; 55:E495. [PMID: 31426435 PMCID: PMC6723169 DOI: 10.3390/medicina55080495] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 06/22/2019] [Revised: 07/27/2019] [Accepted: 08/12/2019] [Indexed: 11/25/2022]
Abstract
Background and Objectives: This systematic review reports the evidence from the literature concerning the potential for using an automated vocal analysis, the Language ENvironment Analysis (LENA, LENA Research Foundation, Boulder, CO, USA) in the screening process for children at risk for autism spectrum disorder (ASD) and deaf or hard of hearing (D/HH). ASD and D/HH have increased comorbidity, but current behavioral diagnostic and screening tools have limitations. The LENA Language Autism Screen (LLAS) may offer an additional tool to disambiguate ASD from D/HH in young children. Materials and Methods: We examine empirical reports that use automatic vocal analysis methods to differentiate disordered from typically developing children. Results: Consensus across the sampled scientific literature shows support for use of automatic methods for screening and disambiguation of children with ASD and D/HH. There is some evidence of vocal differentiation between ASD, D/HH, and typically-developing children warranting use of the LLAS, but additional empirical evidence is needed to better understand the strengths and weaknesses of the tool. Conclusions: The findings reported here warrant further, more substantive, methodologically-sound research that is fully powered to show a reliable difference. Findings may be useful for both clinicians and researchers in better identification and understanding of communication disorders.
Affiliation(s)
- Mark VanDam
  - Department of Speech & Hearing Sciences, Elson S. Floyd College of Medicine, Washington State University, Spokane, WA 99202, USA.
  - Hearing Oral Program of Excellence (HOPE), Spokane, WA 99202, USA.
5
Lang S, Bartl-Pokorny KD, Pokorny FB, Garrido D, Mani N, Fox-Boyer AV, Zhang D, Marschik PB. Canonical Babbling: A Marker for Earlier Identification of Late Detected Developmental Disorders? CURRENT DEVELOPMENTAL DISORDERS REPORTS 2019; 6:111-118. [PMID: 31984204 PMCID: PMC6951805 DOI: 10.1007/s40474-019-00166-w] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Indexed: 11/30/2022]
Abstract
Purpose of Review: To summarize findings about the emergence and characteristics of canonical babbling in children with late detected developmental disorders (LDDDs), such as autism spectrum disorder, Rett syndrome, and fragile X syndrome. In particular, we ask whether infants’ vocal development in the first year of life contains any markers that may contribute to earlier detection of these disorders. Recent Findings: Only a handful of studies have investigated canonical babbling in infants with LDDDs. With divergent research paradigms and definitions applied, findings on the onset and characteristics of canonical babbling are inconsistent and difficult to compare. Infants with LDDDs showed a reduced likelihood of producing canonical babbling vocalizations. If achieved, this milestone was more likely to be reached beyond the critical time window of 5–10 months. Summary: Canonical babbling appears promising as a potential marker for early detection of infants at risk for developmental disorders. In-depth studies on babbling characteristics in LDDDs are warranted.
Affiliation(s)
- Sigrun Lang
  - iDN - interdisciplinary Developmental Neuroscience, Division of Phoniatrics, Medical University of Graz, Auenbruggerplatz 26, 8036 Graz, Austria
- Katrin D Bartl-Pokorny
  - iDN - interdisciplinary Developmental Neuroscience, Division of Phoniatrics, Medical University of Graz, Auenbruggerplatz 26, 8036 Graz, Austria
- Florian B Pokorny
  - iDN - interdisciplinary Developmental Neuroscience, Division of Phoniatrics, Medical University of Graz, Auenbruggerplatz 26, 8036 Graz, Austria; Machine Intelligence & Signal Processing group, Chair of Human-Machine Communication, Technical University of Munich, Munich, Germany
- Dunia Garrido
  - Mind, Brain, and Behavior Research Center, University of Granada, Granada, Spain
- Nivedita Mani
  - Psychology of Language Department, Georg-August University Göttingen, Göttingen, Germany; Leibniz-ScienceCampus Primate Cognition, Göttingen, Germany
- Annette V Fox-Boyer
  - Department of Human Communication Sciences, Sheffield University, Sheffield, Great Britain
- Dajie Zhang
  - iDN - interdisciplinary Developmental Neuroscience, Division of Phoniatrics, Medical University of Graz, Auenbruggerplatz 26, 8036 Graz, Austria; Leibniz-ScienceCampus Primate Cognition, Göttingen, Germany; Child and Adolescent Psychiatry and Psychotherapy, University Medical Center Göttingen, Göttingen, Germany
- Peter B Marschik
  - iDN - interdisciplinary Developmental Neuroscience, Division of Phoniatrics, Medical University of Graz, Auenbruggerplatz 26, 8036 Graz, Austria; Leibniz-ScienceCampus Primate Cognition, Göttingen, Germany; Child and Adolescent Psychiatry and Psychotherapy, University Medical Center Göttingen, Göttingen, Germany; Center of Neurodevelopmental Disorders (KIND), Center for Psychiatry Research, Department of Women's and Children's Health, Karolinska Institutet, Stockholm, Sweden
6
Abstract
In this study, we examined the accuracy of the Language ENvironment Analysis (LENA) system in European French. LENA is a digital recording device with software that facilitates the collection and analysis of audio recordings from young children, providing automated measures of the speech overheard and produced by the child. Eighteen native French-speaking children, who were divided into six age groups ranging from 3 to 48 months old, were recorded about 10-16 h per day, three days a week. A total of 324 samples (six 10-min chunks of recordings) were selected and then transcribed according to the CHAT format. Simple and mixed linear models between the LENA and human adult word count (AWC) and child vocalization count (CVC) estimates were performed, to determine to what extent the automatic and the human methods agreed. Both the AWC and CVC estimates were very reliable (r = .64 and .71, respectively) for the 324 samples. When controlling the random factors of participants and recordings, 1 h was sufficient to obtain a reliable sample. It was, however, found that two age groups (7-12 months and 13-18 months) had a significant effect on the AWC data and that the second day of recording had a significant effect on the CVC data. When noise-related factors were added to the model, only a significant effect of signal-to-noise ratio was found on the AWC data. All of these findings and their clinical implications are discussed, providing strong support for the reliability of LENA in French.
7
Zhang X, Xue L, Zhang Z, Zhang Y. A Novel Application System of Assessing the Pronunciation Differences Between Chinese Children and Adults. Open Biomed Eng J 2016; 10:91-100. [PMID: 27583037 PMCID: PMC4988092 DOI: 10.2174/1874120701610010091] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2015] [Revised: 04/25/2016] [Accepted: 06/01/2016] [Indexed: 11/22/2022]
Abstract
BACKGROUND: Children's health problems have long attracted the attention of parents and of society as a whole, and child language development is an active research topic. Researchers have found that appropriate early intervention by guardians can effectively promote the development of children's language and cognitive abilities, and have analyzed this quantitatively. Interventions based on artificial intelligence technology have a clear effect on autism spectrum disorders in children. OBJECTIVE AND METHODS: This paper presents a speech signal analysis system for children that preprocesses the speaker's speech signal and then calculates word counts in the speech of guardians and children, along with other characteristic parameters or indicators (e.g., the number of recognizable syllables and the continuity of the language). RESULTS: With these quantitative analysis tools and parameters, we can evaluate the quality of children's language and cognitive ability objectively and quantitatively, giving parents a basis for decision-making. They can then adopt appropriate measures to promote their children's language development and cognitive status. CONCLUSION: Drawing on existing studies of children's language development, we put forward several indicators for the automatic measurement of language development that influence the formation of children's language. The experimental results show that, after preprocessing (including signal enhancement and speech activity detection), both the divergence algorithm results and the subsequent word counts are quite satisfactory compared with the actual situation.
Affiliation(s)
- Xiaoyang Zhang
  - School of Communication and Information Engineering, Shanghai University, Shanghai, 200000, P.R. China
- Lei Xue
  - School of Communication and Information Engineering, Shanghai University, Shanghai, 200000, P.R. China
- Zhi Zhang
  - School of Communication and Information Engineering, Shanghai University, Shanghai, 200000, P.R. China
- Yiwen Zhang
  - Shanghai Children’s Medical Center, Shanghai, 200000, P.R. China
8
Suskind DL, Leffel KR, Graf E, Hernandez MW, Gunderson EA, Sapolich SG, Suskind E, Leininger L, Goldin-Meadow S, Levine SC. A parent-directed language intervention for children of low socioeconomic status: a randomized controlled pilot study. JOURNAL OF CHILD LANGUAGE 2016; 43:366-406. [PMID: 26041013 PMCID: PMC10835758 DOI: 10.1017/s0305000915000033] [Citation(s) in RCA: 105] [Impact Index Per Article: 13.1] [Indexed: 06/04/2023]
Abstract
We designed a parent-directed home-visiting intervention targeting socioeconomic status (SES) disparities in children's early language environments. A randomized controlled trial was used to evaluate whether the intervention improved parents' knowledge of child language development and increased the amount and diversity of parent talk. Twenty-three mother-child dyads (12 experimental, 11 control, aged 1;5-3;0) participated in eight weekly hour-long home-visits. In the experimental group, but not the control group, parent knowledge of language development increased significantly one week and four months after the intervention. In lab-based observations, parent word types and tokens and child word types increased significantly one week, but not four months, post-intervention. In home-based observations, adult word tokens, conversational turn counts, and child vocalization counts increased significantly during the intervention, but not post-intervention. The results demonstrate the malleability of child-directed language behaviors and knowledge of child language development among low-SES parents.
Affiliation(s)
- Dana L Suskind
  - University of Chicago Medicine, Department of Surgery, Division of Otolaryngology
- Kristin R Leffel
  - University of Chicago Medicine, Department of Surgery, Division of Otolaryngology
- Eileen Graf
  - University of Chicago Medicine, Department of Surgery, Division of Otolaryngology
- Elizabeth Suskind
  - University of Chicago Medicine, Department of Surgery, Division of Otolaryngology
9
McGinnis WR, Audhya T, Edelson SM. Proposed toxic and hypoxic impairment of a brainstem locus in autism. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2013; 10:6955-7000. [PMID: 24336025 PMCID: PMC3881151 DOI: 10.3390/ijerph10126955] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 09/20/2013] [Revised: 11/07/2013] [Accepted: 11/11/2013] [Indexed: 01/15/2023]
Abstract
Electrophysiological findings implicate site-specific impairment of the nucleus tractus solitarius (NTS) in autism. This invites hypothetical consideration of a large role for this small brainstem structure as the basis for seemingly disjointed behavioral and somatic features of autism. The NTS is the brain's point of entry for visceral afference, its relay for vagal reflexes, and its integration center for autonomic control of circulatory, immunological, gastrointestinal, and laryngeal function. The NTS facilitates normal cerebrovascular perfusion, and is the seminal point for an ascending noradrenergic system that modulates many complex behaviors. Microvascular configuration predisposes the NTS to focal hypoxia. A subregion--the "pNTS"--permits exposure to all blood-borne neurotoxins, including those that do not readily transit the blood-brain barrier. Impairment of acetylcholinesterase (mercury and cadmium cations, nitrates/nitrites, organophosphates, monosodium glutamate), competition for hemoglobin (carbon monoxide, nitrates/nitrites), and higher blood viscosity (net systemic oxidative stress) are suggested to potentiate microcirculatory insufficiency of the NTS, and thus autism.
Affiliation(s)
- Woody R. McGinnis
  - Autism Research Institute, 4182 Adams Avenue, San Diego, CA 92116, USA
  - Author to whom correspondence should be addressed; Tel.: +1-541-326-8822; Fax: +1-619-563-6840
- Tapan Audhya
  - Division of Endocrinology, Department of Medicine, New York University Medical School, New York, NY 10016, USA
- Stephen M. Edelson
  - Autism Research Institute, 4182 Adams Avenue, San Diego, CA 92116, USA