1
Heller Murray ES, Chao A. The Relationships Among Vocal Variability, Vocal-Articulatory Coordination, and Dysphonia in Children. J Voice 2023; 37:969.e43-969.e49. [PMID: 34272144 DOI: 10.1016/j.jvoice.2021.06.008] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 03/30/2021] [Revised: 06/01/2021] [Accepted: 06/10/2021] [Indexed: 10/20/2022]
Abstract
OBJECTIVE The purpose of this study was to evaluate the relationship between vocal variability and variability of vocal-articulatory coordination in children. Furthermore, this study examined if this relationship was impacted by pediatric dysphonia. STUDY DESIGN Retrospective analysis of speech samples in the Arizona Child Acoustic Database. METHODS Speech samples from children 2-7 years of age were selected for analysis. Vocal variability was defined as the coefficient of variation (CoV) of fundamental frequency, taken from the center of sustained vowels. Variability of vocal-articulatory coordination was defined as the CoV of voice onset time (VOT) of voiceless stop consonants. Both objective and subjective measures of dysphonia were completed for each participant. RESULTS Children had a negative correlation between VOT variability and vocal variability. Further analysis indicated that this relationship was present in children with typical developmental levels of dysphonia but absent for children with moderate to severe dysphonia. Increased dysphonia severity was associated with increased vocal variability. CONCLUSION Increased VOT variability was associated with decreased vocal variability in children with dysphonia severities consistent with typical vocal development. However, this relationship was not present in children with moderate to severe dysphonia. This study suggests that future work is needed to examine the relationships between the vocal system and vocal-articulatory coordination in children with and without diagnosed voice disorders.
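Both variability measures in this abstract are coefficients of variation. As a minimal sketch, assuming the conventional definition (sample standard deviation divided by the mean, which the abstract does not spell out) and using hypothetical measurement values invented purely for illustration, the computation could look like this:

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the sample standard deviation divided by the mean.

    Assumption: the abstract does not state which CoV formula was used;
    this is the conventional definition.
    """
    if len(values) < 2:
        raise ValueError("need at least two measurements")
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; CoV is undefined")
    return stdev(values) / m


# Hypothetical values for illustration only, not data from the study.
f0_hz = [250.0, 260.0, 255.0, 245.0]  # fundamental frequency per sustained vowel
vot_ms = [60.0, 80.0, 70.0, 90.0]     # voice onset time per voiceless stop

vocal_variability = coefficient_of_variation(f0_hz)
vot_variability = coefficient_of_variation(vot_ms)
```

Because the CoV is unitless, a value computed in Hz and one computed in milliseconds can be compared or correlated across children directly, which is presumably why it suits a study relating two different kinds of measurement.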
Affiliation(s)
- Andie Chao
- Department of Communication Sciences and Disorders, Temple University, Philadelphia, Pennsylvania
2
Cmejla R, Novotny M, Rusz J, Tykalova T, Vimr J, Hlavnicka J. The automated screening of speech motor development in children based on the sequential motion rate. Comput Biol Med 2023; 162:107086. [PMID: 37290387 DOI: 10.1016/j.compbiomed.2023.107086] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2022] [Revised: 04/28/2023] [Accepted: 05/27/2023] [Indexed: 06/10/2023]
Abstract
BACKGROUND Motor skills in children have traditionally been examined via challenging speech tasks such as syllable repetition, and calculating the syllabic rate using a stopwatch or by inspecting the oscillogram followed by a laborious comparison of the scores on a look-up table representing the typical performances of children of the given age and sex. As the commonly used performance tables are over-simplified to allow for manual scoring, we raise the question of whether a computational model of motor skills development could be more informative, and could allow for the automated screening of children to detect underdeveloped motor skills. METHODS We recruited a total of 275 children aged four to 15 years. All the participants were native Czech speakers with no history of hearing or neurological impairments. We recorded each child's performance of /pa/-/ta/-/ka/ syllable repetition. Various parameters of diadochokinesis (DDK; DDK rate, DDK regularity, voice onset time [VOT] ratio, syllable, vowel and VOT duration) were investigated in the acoustic signals using supervised reference labels. Female and male participants were analyzed separately by comparing younger, middle, and older age groups of children via ANOVA. Finally, we implemented a fully automated model that estimated the developmental age of a child based on the acoustic signal, and evaluated its accuracy using Pearson's correlation coefficient and normalized root-mean-squared errors (RMSEs). RESULTS The DDK rate reflected the ages of the children proportionally (p < 0.001). Other DDK parameters also showed strong sensitivity to age (p < 0.001), with the exception of VOT duration, which had a smaller effect (p = 0.091). The effect of age was found to be sex specific for the syllable length (p < 0.001) and DDK rate (p = 0.003). We observed that females spoke more slowly and had a longer VOT at preschool age (p < 0.001). The DDK rate obtained via the automated algorithm was strongly correlated with the reference (p < 0.001, Pearson's correlation coefficient of 0.97), with a low normalized RMSE of 3.77%. CONCLUSIONS As children develop their motor skills, they are capable of shortening the vowels to increase the rate of syllabic repetitions. The nonlinear development in childhood and adolescence, with a steady state in adulthood, follows a logistic function for the DDK rate. This study demonstrates that the development of motor skills can be examined sensitively and more appropriately by a fully automated noninvasive procedure that also accounts for the dispersion of values within age brackets.
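The agreement between the automated and reference DDK rates is reported with two metrics, Pearson's correlation coefficient and a normalized RMSE. The sketch below shows one way such metrics can be computed. The abstract does not say how the RMSE was normalized, so dividing by the range of the reference values is an assumption here, and the rates are hypothetical numbers invented for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(ss_x * ss_y)


def normalized_rmse(reference, estimate):
    """RMSE divided by the range of the reference values.

    Assumption: range normalization is one common convention; the
    abstract does not state which normalization the authors used.
    """
    if len(reference) != len(estimate) or not reference:
        raise ValueError("need two non-empty sequences of equal length")
    span = max(reference) - min(reference)
    if span == 0:
        raise ValueError("reference values have zero range")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    return math.sqrt(mse) / span


# Hypothetical DDK rates (syllables per second), not data from the study.
reference_rate = [4.0, 5.0, 6.0, 7.0]
automated_rate = [4.1, 4.9, 6.2, 6.9]

r = pearson_r(reference_rate, automated_rate)
nrmse_percent = 100.0 * normalized_rmse(reference_rate, automated_rate)
```

The two metrics answer different questions: the correlation shows whether the automated rates track the reference ordering, while the normalized RMSE shows how large the typical error is relative to the spread of the data.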
Affiliation(s)
- Roman Cmejla
- Department of Circuit Theory, Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic
- Michal Novotny
- Department of Circuit Theory, Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic
- Jan Rusz
- Department of Circuit Theory, Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic
- Tereza Tykalova
- Department of Circuit Theory, Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic
- Jan Vimr
- Department of Circuit Theory, Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic
- Jan Hlavnicka
- Department of Circuit Theory, Faculty of Electrical Engineering, Czech Technical University in Prague, Prague, Czech Republic
3
León M, Washington KN, McKenna VS, Crowe K, Fritz K, Boyce S. Characterizing Speech Sound Productions in Bilingual Speakers of Jamaican Creole and English: Application of Durational Acoustic Methods. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2023; 66:61-83. [PMID: 36580548 PMCID: PMC10023179 DOI: 10.1044/2022_jslhr-22-00304] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 05/27/2022] [Revised: 09/11/2022] [Accepted: 09/16/2022] [Indexed: 06/17/2023]
Abstract
PURPOSE This study examined the speech acoustic characteristics of Jamaican Creole (JC) and English in bilingual preschoolers and adults using acoustic duration measures. The aims were to determine if, for JC and English, (a) child and adult acoustic duration characteristics differ, (b) differences occur in preschoolers' duration patterns based on the language spoken, and (c) relationships exist between the preschoolers' personal contextual factors (i.e., age, sex, and percentage of language [%language] exposure and use) and acoustic duration. METHOD Data for this cross-sectional study were collected in Kingston, Jamaica, and New York City, New York, United States, during 2013-2019. Participants included typically developing simultaneous bilingual preschoolers (n = 120, ages 3;4-5;11 [years;months]) and adults (n = 15, ages 19;0-54;4) from the same linguistic community. Audio recordings of single-word productions of JC and English were collected through elicited picture-based tasks and used for acoustic analysis. Durational features (voice onset time [VOT], vowel duration, whole-word duration, and the proportion of vowel to whole-word duration) were measured using Praat, a speech analysis software program. RESULTS JC-English-speaking children demonstrated developing speech motor control through differences in durational patterns compared with adults, including VOT for voiced plosives. Children's VOT, vowel duration, and whole-word duration were produced similarly across JC and English. The contextual factor %language use was predictive of vowel and whole-word duration in English. CONCLUSIONS The findings from this study contribute to a foundation of understanding typical bilingual speech characteristics and motor development as well as schema in JC-English speakers. In particular, minimal acoustic duration differences were observed across the post-Creole continuum, a feature that may be attributed to the JC-English bilingual environment. 
SUPPLEMENTAL MATERIAL https://doi.org/10.23641/asha.21760469.
Affiliation(s)
- Michelle León
- Department of Communication Sciences & Disorders, University of Cincinnati, OH
- Karla N. Washington
- Department of Communication Sciences & Disorders, University of Cincinnati, OH
- Department of Speech-Language Pathology, University of Toronto, Ontario, Canada
- Department of Communicative Sciences and Disorders, New York University, NY
- Victoria S. McKenna
- Department of Communication Sciences & Disorders, University of Cincinnati, OH
- Department of Biomedical Engineering, University of Cincinnati, OH
- Department of Otolaryngology–Head and Neck Surgery, University of Cincinnati, OH
- Kathryn Crowe
- School of Education, Charles Sturt University, Bathurst, New South Wales, Australia
- School of Health Sciences, University of Iceland, Reykjavík
- Kristina Fritz
- Department of Psychology, California State University Northridge, Los Angeles
- Suzanne Boyce
- Department of Communication Sciences & Disorders, University of Cincinnati, OH
4
Paquette-Smith M, Schertz J, Johnson EK. Comparing Phonetic Convergence in Children and Adults. LANGUAGE AND SPEECH 2022; 65:240-260. [PMID: 33998342 DOI: 10.1177/00238309211013864] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/12/2023]
Abstract
Observations by sociolinguists suggest that when children relocate to a new community, they rapidly learn to imitate their peers, adopting the new local accent faster and more effectively than adults. However, few well-controlled laboratory experiments have been conducted comparing speech or accent imitation across ages. Here, we investigated Canadian English-speaking children's and adults' imitation of three model speakers: a Canadian English talker, an Australian English talker, and a non-native Mandarin English talker who learned English later in life. The speech of all three talkers was manipulated to have elongated voice onset time (VOT) on word initial stop consonants. The dependent measure was how much participants would lengthen their VOTs after exposure to one of the talkers in two paradigms: delayed shadowing (Experiment 1) and immediate shadowing (Experiment 2). We predicted that overall children would show more imitation than adults, particularly when imitating the Canadian English talker, given previous work on children's social preferences. Although we did not observe age differences in either study, when shadowing was immediate, we found that imitation was influenced by the accent of the speaker, but not in the manner we predicted: both age groups imitated the Mandarin-accented model more strongly than the Canadian model. When shadowing was delayed, we observed no evidence of imitation. We discuss our findings in light of other recent work, and conclude that the development of speech imitation is an area ripe for further investigation.
5
Lou Q, Wang X, Jiang L, Wang G, Chen Y, Liu Q. Subjective and Objective Evaluation of Speech in Adult Patients with Unrepaired Cleft Palate. J Craniofac Surg 2022; 33:e528-e532. [PMID: 35175986 DOI: 10.1097/scs.0000000000008567] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2021] [Accepted: 01/25/2022] [Indexed: 11/25/2022]
Abstract
OBJECTIVE To explore the speech outcomes of adult patients through subjective perception evaluation and objective acoustic analysis, and to compare the differences in pronunciation characteristics between adult patients with unrepaired cleft palate and their non-cleft peers. PARTICIPANTS AND INTERVENTION Subjective evaluation (speech intelligibility, nasality, and consonant missing rate) and objective acoustic analysis (normalized vowel formants, voice onset time, and the analysis of three-dimensional spectrogram and spectrum) were carried out on speech samples produced by 2 groups of speakers: (a) speakers with unrepaired cleft palate (n = 65, mean age = 25.1 years) and (b) typical speakers (n = 30, mean age = 23.7 years). RESULTS Compared with typical speakers, individuals with unrepaired cleft palate exhibited lower speech intelligibility with higher nasality and consonant missing rate; the missing rate was highest for the 6 consonants syllables. The acoustic differences were mainly manifested in vowel formants and voice onset time. CONCLUSIONS The results revealed important acoustical differences between adult patients with unrepaired cleft palate and typical speakers. The trend of spectral deviation may have contributed to the difficulty in producing pressure vowels and aspirated consonants in individuals with speech disorders related to cleft palate.
Affiliation(s)
- Qun Lou
- Department of Oral and Maxillofacial Surgery, Shanghai Ninth People's Hospital, College of Stomatology, Shanghai Jiao Tong University School of Medicine, National Clinical Research Center for Oral Diseases, Shanghai Key Laboratory of Stomatology and Shanghai Research Institute of Stomatology, Shanghai, China
6
Crosslinguistic Influence in the Discrimination of Korean Stop Contrast by Heritage Speakers and Second Language Learners. LANGUAGES 2022. [DOI: 10.3390/languages7010006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/06/2023]
Abstract
The present study examines the extent of crosslinguistic influence from English as a dominant language in the perception of the Korean lenis–aspirated contrast among Korean heritage speakers in the United States (N = 20) and English-speaking learners of Korean as a second language (N = 20), as compared to native speakers of Korean immersed in the first language environment (N = 20), by using an AX discrimination task. In addition, we sought to determine whether significant dependencies could be observed between participants’ linguistic background and experiences and their perceptual accuracy in the discrimination task. Results of a mixed-effects logistic regression model demonstrated that heritage speakers outperformed second language learners with 85% vs. 63% accurate discrimination, while no significant difference was detected between heritage speakers and first language-immersed native speakers (85% vs. 88% correct). Furthermore, higher verbal fluency was significantly predictive of greater perceptual accuracy for the heritage speakers. The results are compatible with the interpretation that the influence of English on the discrimination of the Korean laryngeal contrast was stronger for second language learners of Korean than for heritage speakers, while heritage speakers were not apparently affected by dominance in English in their discrimination of Korean lenis and aspirated stops.
7
Jacewicz E, Arzbecker LJ, Fox RA, Liu S. Variability in within-category implementation of stop consonant voicing in American English-speaking children. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2021; 150:3711. [PMID: 34852578 PMCID: PMC8730371 DOI: 10.1121/10.0007229] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2021] [Accepted: 10/26/2021] [Indexed: 06/13/2023]
Abstract
The development of stop consonant voicing in English-speaking children has been documented as a progressive mastery of phonological contrast, but implementation of voicing within one voicing category has not been systematically examined. This study provides a comprehensive account of structured variability in phonetic realization of /b/ in running speech by 8-12-year-old American children (n = 48) when compared to adults (n = 36). The stop always occurred word-initially, was followed by either a voiced or voiceless coda, and its position varied in a sentence, which created systematic conditions to examine acoustic variability in closure duration (CD) and voicing during the closure (VDC) stemming from phonetic context and prosodic prominence. Children demonstrated command of long-distance anticipatory coarticulation, providing evidence that information about coda voicing is distributed over an entire monosyllabic word and is available in the onset stop. They also manifested covariation of cues to stop voicing and command of prosodic variation, despite greater random variability, greater CD, reduced VDC, and exaggerated execution of sentential focus when compared to adults. Controlling for regional variation, dialect was a significant predictor for adults but not for children, who no longer adhered to the marked local variants in their implementation of stop voicing.
Affiliation(s)
- Ewa Jacewicz
- Department of Speech and Hearing Science, The Ohio State University, 1070 Carmack Road, Columbus, Ohio 43210, USA
- Lian J Arzbecker
- Department of Speech and Hearing Science, The Ohio State University, 1070 Carmack Road, Columbus, Ohio 43210, USA
- Robert A Fox
- Department of Speech and Hearing Science, The Ohio State University, 1070 Carmack Road, Columbus, Ohio 43210, USA
- Shuang Liu
- Department of Speech and Hearing Science, The Ohio State University, 1070 Carmack Road, Columbus, Ohio 43210, USA
8
Millasseau J, Yuen I, Bruggeman L, Demuth K. Acoustic cues to coda stop voicing contrasts in Australian English-speaking children. JOURNAL OF CHILD LANGUAGE 2021; 48:1262-1280. [PMID: 33563341 DOI: 10.1017/s0305000920000781] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
While voicing contrasts in word-onset position are acquired relatively early, much less is known about how and when they are acquired in word-coda position, where accurate production of these contrasts is also critical for distinguishing words (e.g., dog vs. dock). This study examined how the acoustic cues to coda voicing contrasts are realized in the speech of 4-year-old Australian English-speaking children. The results showed that children used similar acoustic cues to those of adults, including longer vowel duration and more frequent voice bar for voiced stops, and longer closure and burst durations for voiceless stops along with more frequent irregular pitch periods. This suggests that 4-year-olds have acquired productive use of the acoustic cues to coda voicing contrasts, though implementations are not yet fully adult-like. The findings have implications for understanding the development of phonological contrasts in populations for whom these may be challenging, such as children with hearing loss.
Affiliation(s)
- Julien Millasseau
- Department of Linguistics, Macquarie University, 16 University Avenue, Australian Hearing Hub, North Ryde, NSW 2109, Australia
- Ivan Yuen
- Department of Linguistics, Macquarie University, 16 University Avenue, Australian Hearing Hub, North Ryde, NSW 2109, Australia
- Laurence Bruggeman
- Department of Linguistics, Macquarie University, 16 University Avenue, Australian Hearing Hub, North Ryde, NSW 2109, Australia
- The MARCS Institute & ARC Centre of Excellence for the Dynamics of Language, Western Sydney University, Australia
- Katherine Demuth
- Department of Linguistics, Macquarie University, 16 University Avenue, Australian Hearing Hub, North Ryde, NSW 2109, Australia
9
Non-nutritive suck and voice onset time: Examining infant oromotor coordination. PLoS One 2021; 16:e0250529. [PMID: 33905427 PMCID: PMC8078818 DOI: 10.1371/journal.pone.0250529] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/26/2021] [Accepted: 04/08/2021] [Indexed: 11/19/2022]
Abstract
The variability of a child’s voice onset time (VOT) decreases during development as they learn to coordinate upper vocal tract and laryngeal articulatory gestures. Yet, little is known about the relationship between VOT and other early motor tasks. The aims of this study were to evaluate the relationship between infant vocalization and another early oromotor task, non-nutritive suck (NNS). Twenty-five full-term infants (11 male, 14 female) completed this study. NNS was measured with a customized pacifier at 3 months to evaluate this early reflex. Measures of mean VOT and variability of VOT (measured via coefficient of variation) were collected from 12-month-old infants using a Language Environmental Analysis device. Variability of VOTs at 12 months was significantly related to NNS measures at 3 months. Increased VOT variability was primarily driven by increased NNS intraburst frequency and increased NNS burst duration. There were no relationships between average VOT or range of VOT and NNS measures. Findings from this pilot study indicate a relationship between NNS measures of intraburst frequency and burst duration and VOT variability. Infants with increased NNS intraburst frequency and NNS burst duration had increased VOT variability, suggesting a relationship between the development of VOT and NNS in the first year of life. Future work is needed to continue to examine the relationship between these early oromotor actions and to evaluate how this may impact later speech development.
10
Schertz J, Johnson EK, Paquette-Smith M. The independent contribution of voice onset time to perceptual metrics of convergence. JASA EXPRESS LETTERS 2021; 1:045205. [PMID: 36154201 DOI: 10.1121/10.0004373] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
This work explores the relationship between phonetic and perceptual metrics for convergence in shadowed productions by adults and 6-year-old children by isolating the role of voice onset time (VOT) in listeners' similarity judgments. Results show a small but independent role for VOT: listeners were less likely to identify shadowed tokens as more similar to the model when natural VOT convergence present in the stimulus set had been artificially removed (experiments 1 and 2). However, VOT equivalence alone, when accompanied by naturally occurring variation along other dimensions, was not sufficient to drive listeners' judgments of similarity (experiment 3).
Affiliation(s)
- Jessamyn Schertz
- Department of Language Studies, University of Toronto Mississauga, Mississauga, Ontario, Canada
- Elizabeth K Johnson
- Department of Psychology, University of Toronto Mississauga, Mississauga, Ontario, Canada
- Melissa Paquette-Smith
- Department of Psychology, University of California, Los Angeles, California 90095, USA
11
Millasseau J, Bruggeman L, Yuen I, Demuth K. Temporal cues to onset voicing contrasts in Australian English-speaking children. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2021; 149:348. [PMID: 33514122 DOI: 10.1121/10.0003060] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2020] [Accepted: 12/14/2020] [Indexed: 06/12/2023]
Abstract
Voicing contrasts are lexically important for differentiating words in many languages (e.g., "bear" vs "pear"). Temporal differences in the voice onset time (VOT) and closure duration (CD) contribute to the voicing contrast in word-onset position. However, little is known about the acoustic realization of these voicing contrasts in Australian English-speaking children. This is essential for understanding the challenges faced by those with language delay. Therefore, the present study examined the VOT and CD values for word-initial stops as produced by 20 Australian English-speaking 4-5-year-olds. As anticipated, these children produced a systematic distinction between voiced and voiceless stops at all places of articulation (PoAs). However, although the children's VOT values for voiced stops were similar to those of adults, their VOTs for voiceless stops were longer. Like adults, the children also had different CD values for voiced and voiceless categories; however, these were systematically longer than those of adults. Even after adjusting for temporal differences by computing proportional ratios for the VOT and CD, children's voicing contrasts were not yet adultlike. These results suggest that children of this age are still developing appropriate timing and articulatory adjustments for voicing contrasts in the word-initial position.
Affiliation(s)
- Julien Millasseau
- Department of Linguistics, Macquarie University, Sydney, 16 University Avenue, New South Wales 2109, Australia
- Laurence Bruggeman
- Australian Research Council Centre of Excellence for the Dynamics of Language, The MARCS Institute, Western Sydney University, Australia
- Ivan Yuen
- Department of Linguistics, Macquarie University, Sydney, 16 University Avenue, New South Wales 2109, Australia
- Katherine Demuth
- Department of Linguistics, Macquarie University, Sydney, 16 University Avenue, New South Wales 2109, Australia
12
Beguš G. Generative Adversarial Phonology: Modeling Unsupervised Phonetic and Phonological Learning With Neural Networks. Front Artif Intell 2020; 3:44. [PMID: 33733161 PMCID: PMC7861218 DOI: 10.3389/frai.2020.00044] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 01/28/2020] [Accepted: 05/19/2020] [Indexed: 11/30/2022]
Abstract
Training deep neural networks on well-understood dependencies in speech data can provide new insights into how they learn internal representations. This paper argues that acquisition of speech can be modeled as a dependency between random space and generated speech data in the Generative Adversarial Network architecture and proposes a methodology to uncover the network's internal representations that correspond to phonetic and phonological properties. The Generative Adversarial architecture is uniquely appropriate for modeling phonetic and phonological learning because the network is trained on unannotated raw acoustic data and learning is unsupervised without any language-specific assumptions or pre-assumed levels of abstraction. A Generative Adversarial Network was trained on an allophonic distribution in English, in which voiceless stops surface as aspirated word-initially before stressed vowels, except if preceded by a sibilant [s]. The network successfully learns the allophonic alternation: the network's generated speech signal contains the conditional distribution of aspiration duration. The paper proposes a technique for establishing the network's internal representations that identifies latent variables that correspond to, for example, presence of [s] and its spectral properties. By manipulating these variables, we actively control the presence of [s] and its frication amplitude in the generated outputs. This suggests that the network learns to use latent variables as an approximation of phonetic and phonological representations. Crucially, we observe that the dependencies learned in training extend beyond the training interval, which allows for additional exploration of learning representations. 
The paper also discusses how the network's architecture and innovative outputs resemble and differ from linguistic behavior in language acquisition, speech disorders, and speech errors, and how well-understood dependencies in speech data can help us interpret how neural networks learn their representations.
Affiliation(s)
- Gašper Beguš
- Department of Linguistics, University of California, Berkeley, Berkeley, CA, United States
- Department of Linguistics, University of Washington, Seattle, WA, United States
13
Lowenstein JH, Nittrouer S. Perception-Production Links in Children's Speech. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2019; 62:853-867. [PMID: 30986136 PMCID: PMC6802887 DOI: 10.1044/2018_jslhr-s-18-0178] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 05/08/2018] [Revised: 09/25/2018] [Accepted: 12/13/2018] [Indexed: 06/02/2023]
Abstract
Purpose Child phonologists have long been interested in how tightly speech input constrains the speech production capacities of young children, and the question acquires clinical significance when children with hearing loss are considered. Children with sensorineural hearing loss often show differences in the spectral and temporal structures of their speech production, compared to children with normal hearing. The current study was designed to investigate the extent to which this problem can be explained by signal degradation. Method Ten 5-year-olds with normal hearing were recorded imitating 120 three-syllable nonwords presented in unprocessed form and as noise-vocoded signals. Target segments consisted of fricatives, stops, and vowels. Several measures were made: 2 duration measures (voice onset time and fricative length) and 4 spectral measures involving 2 segments (1st and 3rd moments of fricatives and 1st and 2nd formant frequencies for the point vowels). Results All spectral measures were affected by signal degradation, with vowel production showing the largest effects. Although a change in voice onset time was observed with vocoded signals for /d/, voicing category was not affected. Fricative duration remained constant. Conclusions Results support the hypothesis that quality of the input signal constrains the speech production capacities of young children. Consequently, it can be concluded that the production problems of children with hearing loss, including those with cochlear implants, can be explained to some extent by the degradation in the signal they hear. However, experience with both speech perception and production likely plays a role as well.
Affiliation(s)
- Joanna H. Lowenstein
- Department of Speech, Language, and Hearing Sciences, University of Florida, Gainesville
- Susan Nittrouer
- Department of Speech, Language, and Hearing Sciences, University of Florida, Gainesville
14
Yang J. Development of stop consonants in three- to six-year-old Mandarin-speaking children. JOURNAL OF CHILD LANGUAGE 2018; 45:1091-1115. [PMID: 29667563 DOI: 10.1017/s0305000918000090] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 06/08/2023]
Abstract
This study compared the temporal measurements of stop consonants in 29 three- to six-year-old Mandarin-speaking children and 12 Mandarin-speaking adults. Each participant produced 18 Mandarin disyllabic words which contained six stop consonants /p, pʰ, t, tʰ, k, kʰ/ each followed by three vowels /a, i, u/ at the word-initial position in the first syllable. The temporal measurements of VOT, overall burst duration, average duration per burst, number of bursts, and VOT-lag duration were obtained. Although adult-like short-lag VOTs were achieved in all children, the long-lag VOTs were widespread in the younger group and gradually developed to a concentrated distribution in the older children. Further analysis of the burst and VOT-lag revealed that these children tended to produce shorter average duration per burst and longer VOT-lag than the adults. These results indicate that children in this age range may not have developed adult-like laryngeal-oral timing pattern and airflow control for stop production.
Affiliation(s)
- Jing Yang, Department of Communication Sciences and Disorders, University of Central Arkansas
15
Lee SAS, Iverson GK. The emergence of phonetic categories in Korean-English bilingual children. JOURNAL OF CHILD LANGUAGE 2017; 44:1485-1515. [PMID: 28166843 DOI: 10.1017/s0305000916000659] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/06/2023]
Abstract
The present study examined the speech production of three-year-old Korean-English bilingual (KEB) children. English and Korean stops, as well as front vowels in both languages, were compared acoustically among the KEB children, then also measured against those of their age-equivalent monolingual counterparts. Evidence of distinctive phonetic categorization in bilingual children was more salient in vowels than in stops. Vowels and stops produced by the bilingual children were not significantly different from those of their monolingual counterparts. The findings suggest that, similar to other language domains, two linguistic systems are apparent in the phonetic production component of three-year-old KEB children, but that phonetic distinctiveness in production may not emerge holistically in an across-the-board fashion, appearing earlier in vowels than stops. Thus, the phonetic production systems of the two languages may develop with only limited interaction in simultaneous KEB children exposed to two languages at an early age.
16
Beckman ME, Plummer AR, Munson B, Reidy PF. Methods for eliciting, annotating, and analyzing databases for child speech development. COMPUT SPEECH LANG 2017; 45:278-299. [PMID: 28943715 DOI: 10.1016/j.csl.2017.02.010] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Indexed: 11/24/2022]
Abstract
Methods from automatic speech recognition (ASR), such as segmentation and forced alignment, have facilitated the rapid annotation and analysis of very large adult speech databases and databases of caregiver-infant interaction, enabling advances in speech science that were unimaginable just a few decades ago. This paper centers on two main problems that must be addressed in order to have analogous resources for developing and exploiting databases of young children's speech. The first problem is to understand and appreciate the differences between adult and child speech that cause ASR models developed for adult speech to fail when applied to child speech. These differences include the fact that children's vocal tracts are smaller than those of adult males and also changing rapidly in size and shape over the course of development, leading to between-talker variability across age groups that dwarfs the between-talker differences between adult men and women. Moreover, children do not achieve fully adult-like speech motor control until they are young adults, and their vocabularies and phonological proficiency are developing as well, leading to considerably more within-talker variability as well as more between-talker variability. The second problem then is to determine what annotation schemas and analysis techniques can most usefully capture relevant aspects of this variability. Indeed, standard acoustic characterizations applied to child speech reveal that adult-centered annotation schemas fail to capture phenomena such as the emergence of covert contrasts in children's developing phonological systems, while also revealing children's nonuniform progression toward community speech norms as they acquire the phonological systems of their native languages. 
Both problems point to the need for more basic research into the growth and development of the articulatory system (as well as of the lexicon and phonological system) that is oriented explicitly toward the construction of age-appropriate computational models.
Affiliation(s)
- Patrick F Reidy, Callier Center for Communication Disorders, University of Texas at Dallas
17
Ma J, Chen X, Wu Y, Zhang L. Effects of age and sex on voice onset time: Evidence from Mandarin voiceless stops. LOGOP PHONIATR VOCO 2017; 43:56-62. [PMID: 28511574 DOI: 10.1080/14015439.2017.1324915] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 10/19/2022]
Abstract
Numerous studies have addressed the effects of age and sex on voice onset time (VOT) in English. However, few studies have examined these effects on Mandarin stops. This study examines the effects of age and sex on VOT in Mandarin. A total of 85 Mandarin-speaking children, aged 4-18 years, and 13 adults as a reference group participated in a production experiment. Productions were elicited by reading target words in carrier phrases. Results showed that children aged 6-7 years had longer VOTs than older ones for highly aspirated stops, and the same tendency was not observed for unaspirated stops. However, no linear developmental trend was observed for either highly aspirated or unaspirated stops. In addition, females displayed longer VOTs for highly aspirated stops and shorter VOTs for unaspirated stops, whereas significant sex differences in VOTs existed from 14 years old to adulthood for highly aspirated stops, and no significant sex differences in VOTs were found for unaspirated stops in any group, indicating that sex differences in VOTs varied with age and aspiration. The findings suggest that physiological changes in, and differences between, males and females account for some, but not all, differences in VOTs across age and sex.
Affiliation(s)
- Junzhou Ma, College of Foreign Languages, Hunan University, Changsha, PR China
- Xiaoxiang Chen, College of Foreign Languages, Hunan University, Changsha, PR China
- Yezhou Wu, School of French, Xi'an International Studies University, Xi'an, PR China
- Linjie Zhang, College of Foreign Languages, Hunan University, Changsha, PR China
18
Chenausky K, Tager-Flusberg H. Acquisition of voice onset time in toddlers at high and low risk for autism spectrum disorder. Autism Res 2017; 10:1269-1279. [PMID: 28339140 DOI: 10.1002/aur.1775] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 03/27/2016] [Revised: 01/30/2017] [Accepted: 02/06/2017] [Indexed: 12/19/2022]
Abstract
Although language delay is common in autism spectrum disorder (ASD), research is equivocal on whether speech development is affected. We used acoustic methods to investigate the existence of sub-perceptual differences in the speech of toddlers who developed ASD. Development of the distinction between b and p was prospectively tracked in 22 toddlers at low risk for ASD (LRC), 22 at high risk for ASD without ASD (HRA-), and 11 at high risk for ASD who were diagnosed with ASD at 36 months (HRA+). Voice onset time (VOT), the main acoustic difference between b and p, was measured from spontaneously produced words at 18, 24, and 36 months. Number of words, number of tokens (instances) of syllable-initial b and p produced, error rates, language scores, and motor ability were also assessed. All groups' mean language scores were within the average range or slightly higher. No between-group differences were found in number of words, b's, p's, or errors produced; or in mean or standard deviation of VOT. Binary logistic regression showed that only diagnostic status, not language score, motor ability, number of words, number of b's and p's, or number of errors significantly predicted whether a toddler produced acoustically distinct b and p populations at 36 months. HRA+ toddlers were significantly less likely to produce acoustically distinct b's and p's at 36 months, which may indicate that the HRA+ group may be using different strategies to produce this distinction. Autism Res 2017, 10: 1269-1279. © 2017 International Society for Autism Research, Wiley Periodicals, Inc.
Affiliation(s)
- Karen Chenausky, Music and Neuroimaging Lab, Neurology Department, Beth Israel Deaconess Medical Center, 330 Brookline Avenue, Boston, Massachusetts
- Helen Tager-Flusberg, Department of Psychological and Brain Sciences, Center for Autism Research Excellence at Boston University, 100 Cummington Mall, Boston, Massachusetts
19
Lehet M, Holt LL. Dimension-Based Statistical Learning Affects Both Speech Perception and Production. Cogn Sci 2016; 41 Suppl 4:885-912. [PMID: 27666146 DOI: 10.1111/cogs.12413] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 07/28/2015] [Revised: 04/04/2016] [Accepted: 04/29/2016] [Indexed: 11/29/2022]
Abstract
Multiple acoustic dimensions signal speech categories. However, dimensions vary in their informativeness; some are more diagnostic of category membership than others. Speech categorization reflects these dimensional regularities such that diagnostic dimensions carry more "perceptual weight" and more effectively signal category membership to native listeners. Yet perceptual weights are malleable. When short-term experience deviates from long-term language norms, such as in a foreign accent, the perceptual weight of acoustic dimensions in signaling speech category membership rapidly adjusts. The present study investigated whether rapid adjustments in listeners' perceptual weights in response to speech that deviates from the norms also affect listeners' own speech productions. In a word recognition task, the correlation between two acoustic dimensions signaling consonant categories, fundamental frequency (F0) and voice onset time (VOT), matched the correlation typical of English, and then shifted to an "artificial accent" that reversed the relationship, and then shifted back. Brief, incidental exposure to the artificial accent caused participants to down-weight perceptual reliance on F0, consistent with previous research. Throughout the task, participants were intermittently prompted with pictures to produce these same words. In the block in which listeners heard the artificial accent with a reversed F0 × VOT correlation, F0 was a less robust cue to voicing in listeners' own speech productions. The statistical regularities of short-term speech input affect both speech perception and production, as evidenced via shifts in how acoustic dimensions are weighted.
Affiliation(s)
- Matthew Lehet, Department of Psychology and the Center for Neural Basis of Cognition, Carnegie Mellon University
- Lori L Holt, Department of Psychology and the Center for Neural Basis of Cognition, Carnegie Mellon University
20
Hattori M, Sumita YI, Elbashti ME, Kurtz KS, Taniguchi H. Effect of Experimental Palatal Prosthesis on Voice Onset Time. J Prosthodont 2016; 27:223-226. [PMID: 27482952 DOI: 10.1111/jopr.12494] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Accepted: 03/21/2016] [Indexed: 11/29/2022] Open
Abstract
PURPOSE Objective evaluation of a patient's speech is needed in prosthetic dentistry because the prostheses can affect the intelligibility of speech. Measurement of voice onset time is one evaluation method of consonant production used in phonetic science. The purpose of this study was to confirm the influence of a palatal prosthesis on consonant production by measuring voice onset time. MATERIALS AND METHODS In this study, voice onset time was measured in 10 healthy women (mean age 26.5 years) under two conditions: with and without an experimental palatal prosthesis. In this study, voice onset time of /ta/ and /ka/ were used to determine the effect of wearing a palatal prosthesis; /pa/ was tested as a control, with the null hypothesis that voice onset time of /ta/ and /ka/ would not change when wearing a palatal prosthesis. RESULTS Medial voice onset time of /pa/, /ta/, and /ka/ syllables without the palatal prosthesis was 22.5 ms, 19.5 ms, and 42.5 ms, whereas that with the palatal prosthesis was 22.5 ms, 23.5 ms, and 55.0 ms. Voice onset times for /ta/ and /ka/ were prolonged when wearing the experimental palatal prosthesis, whereas /pa/ showed no significant difference. CONCLUSION Consonant production was affected by wearing a palatal prosthesis, and this change in sound was detected by measuring voice onset time.
Affiliation(s)
- Mariko Hattori, Clinic for Maxillofacial Prosthetics, University Hospital, Faculty of Dentistry Tokyo Medical and Dental University (TMDU), Bunkyo, Japan
- Yuka I Sumita, Department of Maxillofacial Prosthetics, Graduate School, Tokyo Medical and Dental University (TMDU), Bunkyo, Japan
- Mahmoud E Elbashti, Department of Maxillofacial Prosthetics, Graduate School, Tokyo Medical and Dental University (TMDU), Bunkyo, Japan
- Kenneth S Kurtz, Division of Maxillofacial Prosthetics, Stony Brook University School of Dental Medicine, Stony Brook, NY
- Hisashi Taniguchi, Department of Maxillofacial Prosthetics, Graduate School, Tokyo Medical and Dental University (TMDU), Bunkyo, Japan
21
Falk S, Maslow E, Thum G, Hoole P. Temporal variability in sung productions of adolescents who stutter. JOURNAL OF COMMUNICATION DISORDERS 2016; 62:101-114. [PMID: 27323225 DOI: 10.1016/j.jcomdis.2016.05.012] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 06/01/2015] [Revised: 05/09/2016] [Accepted: 05/24/2016] [Indexed: 06/06/2023]
Abstract
Singing has long been used as a technique to enhance and reeducate temporal aspects of articulation in speech disorders. In the present study, differences in temporal structure of sung versus spoken speech were investigated in stuttering. In particular, the question was examined if singing helps to reduce VOT variability of voiceless plosives, which would indicate enhanced temporal coordination of oral and laryngeal processes. Eight German adolescents who stutter and eight typically fluent peers repeatedly spoke and sang a simple German congratulation formula in which a disyllabic target word (e.g., /'ki:ta/) was repeated five times. Every trial, the first syllable of the word was varied starting equally often with one of the three voiceless German stops /p/, /t/, /k/. Acoustic analyses showed that mean VOT and stop gap duration reduced during singing compared to speaking while mean vowel and utterance duration was prolonged in singing in both groups. Importantly, adolescents who stutter significantly reduced VOT variability (measured as the Coefficient of Variation) during sung productions compared to speaking in word-initial stressed positions while the control group showed a slight increase in VOT variability. However, in unstressed syllables, VOT variability increased in both adolescents who do and do not stutter from speech to song. In addition, vowel and utterance durational variability decreased in both groups, yet, adolescents who stutter were still more variable in utterance duration independent of the form of vocalization. These findings shed new light on how singing alters temporal structure and in particular, the coordination of laryngeal-oral timing in stuttering. Future perspectives for investigating how rhythmic aspects could aid the management of fluent speech in stuttering are discussed.
LEARNING OUTCOMES Readers will be able to describe (1) current perspectives on singing and its effects on articulation and fluency in stuttering and (2) acoustic parameters such as VOT variability which indicate the efficiency of control and coordination of laryngeal-oral movements. They will understand and be able to discuss (3) how singing reduces temporal variability in the productions of adolescents who do and do not stutter and (4) how this is linked to altered articulatory patterns in singing as well as to its rhythmic structure.
Affiliation(s)
- Simone Falk, Institute of German Philology, Ludwig-Maximilians-University, Schellingstr. 3, 80799 Munich, Germany; Laboratoire Parole et Langage, UMR 7309, Aix-Marseille University, CNRS, Aix-en-Provence, France
- Elena Maslow, Institute of Phonetics and Speech Processing, Ludwig-Maximilians-University, Munich, Germany
- Georg Thum, Counselling Service for Stuttering, Institute of Clinical Speech Therapy and Education (Sprachheilpädagogik), Ludwig-Maximilians-University, Munich, Germany
- Philip Hoole, Institute of Phonetics and Speech Processing, Ludwig-Maximilians-University, Munich, Germany
22
Pycha A, Dahan D. Differences in coda voicing trigger changes in gestural timing: A test case from the American English diphthong /aɪ/. JOURNAL OF PHONETICS 2016; 56:15-37. [PMID: 26966337 PMCID: PMC4780424 DOI: 10.1016/j.wocn.2016.01.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/05/2023]
Abstract
We investigate the hypothesis that duration and spectral differences in vowels before voiceless versus voiced codas originate from a single source, namely the reorganization of articulatory gestures relative to one another in time. As a test case, we examine the American English diphthong /aɪ/, in which the acoustic manifestations of the nucleus /a/ and offglide /ɪ/ gestures are relatively easy to identify, and we use the ratio of nucleus-to-offglide duration as an index of the temporal distance between these gestures. Experiment 1 demonstrates that, in production, the ratio is smaller before voiceless codas than before voiced codas; this effect is consistent across speakers as well as changes in speech rate and phrasal position. Experiment 2 demonstrates that, in perception, diphthongs with contextually incongruent ratios delay listeners' identification of target words containing voiceless codas, even when the other durational and spectral correlates of voicing remain intact. This, we argue, is evidence that listeners are sensitive to the gestural origins of voicing differences. Both sets of results support the idea that the voicing contrast triggers changes in timing: gestures are close to one another in time before voiceless codas, but separated from one another before voiced codas.
Affiliation(s)
- Anne Pycha, Department of Linguistics, University of Wisconsin, Milwaukee, P.O. Box 413, Milwaukee, Wisconsin 53211-0413, U.S.A.
- Delphine Dahan, Department of Psychology, University of Pennsylvania, 3401 Walnut Street, Room 412, Philadelphia, Pennsylvania 19104-6228
23
MacLeod AAN. Phonetic and phonological perspectives on the acquisition of voice onset time by French-speaking children. CLINICAL LINGUISTICS & PHONETICS 2016; 30:584-598. [PMID: 27014796 DOI: 10.3109/02699206.2016.1152509] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/05/2023]
Abstract
The goal of the present article is to describe the acquisition of the phonetic details and phonological categories of stop consonants in French. To this end, the stop consonants produced by children aged 2-4 years were transcribed and acoustically analysed. Stop consonants provide an interesting window in phonetic and phonological development since they are among the first phonemes to be acquired in French (MacLeod, Sutton, Thordardottir & Trudeau, 2011), yet the mastery of the phonetic detail of these phonemes can be more drawn out (Allen, 1985). The results of the study indicate that these children are producing significant voicing contrasts between homorganic stops using voice onset time, but at the phonetic level their productions are not yet within adult ranges.
Affiliation(s)
- Andrea A N MacLeod, Université de Montréal, Montréal, Quebec, Canada; CHU Sainte-Justine Research Center, Montréal, Quebec, Canada
24
Hitchcock ER, Koenig LL. Longitudinal observations of typical English voicing acquisition in a 2-year-old child: Stability of the contrast and considerations for clinical assessment. CLINICAL LINGUISTICS & PHONETICS 2015; 29:955-976. [PMID: 26513374 DOI: 10.3109/02699206.2015.1083617] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 06/05/2023]
Abstract
Early assessment of phonetic and phonological development requires knowledge of typical versus atypical speech patterns, as well as the range of individual developmental trajectories. The nature of data reporting in previous literature on typical voicing acquisition left aspects of the developmental process unclear and limited clinical applicability. This work extends a previous four-month group study to present data for one child over 12 months. Words containing initial /b p d t/ were elicited from a monolingual English-speaking 2-year-old child biweekly for 25 sessions. Voice onset time (VOT) was measured for each stop. For each consonant and recording session, we measured range as well as accuracy, overshoot and discreteness calculated for means and individual tokens. The results underscore the value of token-by-token analyses. They further reveal that typical development may involve an extended period of fluctuating voicing patterns, suggesting that the voiced/voiceless contrast may take months or years to stabilise.
Affiliation(s)
- Elaine R Hitchcock, Department of Communication Sciences and Disorders, Montclair State University, Bloomfield, NJ, USA
- Laura L Koenig, Haskins Laboratories, New Haven, CT, USA; Long Island University, Brooklyn, NY, USA
25
Yu VY, De Nil LF, Pang EW. Effects of Age, Sex and Syllable Number on Voice Onset Time: Evidence from Children's Voiceless Aspirated Stops. LANGUAGE AND SPEECH 2015; 58:152-67. [PMID: 26677640 PMCID: PMC4885737 DOI: 10.1177/0023830914522994] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Indexed: 05/25/2023]
Abstract
Voice onset time (VOT) is a temporal acoustic parameter that reflects motor speech coordination skills. This study investigated the patterns of age and sex differences across development of voice onset time in a group of 70 English-speaking children, ranging in age from 4.1 to 18.4 years, and 12 young adults. The effect of the number of syllables on VOT patterns was also examined. Speech samples were elicited by producing syllables /pa/ and /pataka/. Results supported previous findings showing that younger children produce longer VOT values with higher levels of variability. Markedly higher VOT values and increased variability were found for boys at ages between 8 and 11 years, confirming sex differences in VOT patterns and patterns of variability. In addition, all participants consistently produced shorter VOT with higher variability for multisyllables than monosyllables, indicating an effect of syllable number. Possible explanations for these findings and clinical implications are discussed.
26
Wiethan FM, Mota HB. A influência da escolha dos sons-alvo e do modelo de terapia em crianças que apresentam dessonorização [The influence of target-sound selection and therapy model in children who present devoicing]. REVISTA CEFAC 2015. [DOI: 10.1590/1982-0216201517s123912] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022] Open
Abstract
The aim of this study was to verify the effectiveness of the ABAB-Withdrawal and Multiple Probes model in restructuring the phonological system of children with devoicing, and the influence of target-sound selection on this process. Seven children were selected from the database of a federal institution. All had their guardians' authorization to participate and were required to present devoicing before intervention; they were then treated with the ABAB-Withdrawal and Multiple Probes model. Speech-language and complementary assessments were carried out to establish the diagnosis of phonological disorder. Disorder severity was determined by means of the Percentage of Consonants Correct-Revised. The subjects presented mild-moderate or moderate-severe disorders, and their ages ranged from 5 years to 7 years and 1 month. Three were treated with liquids and four with the fricative /ʒ/. Speech samples from the first two withdrawal sessions of the first therapy cycle were analyzed using the Mann-Whitney U test. Means were compared between initial and final assessment, or mean gains were compared between groups. There was a significant increase in the number of sounds acquired and in correct productions of fricatives. However, the same did not occur for the Percentage of Consonants Correct-Revised. Plosive and liquid consonants also showed no significant increase in correct productions. In the comparison between the group treated with liquids and the group treated with the fricative /ʒ/, there was no difference for any of the variables. It is concluded that the ABAB-Withdrawal and Multiple Probes model improves some aspects of the phonological systems of children with devoicing. The target sounds chosen for therapy, however, do not influence this process.
27
Wiethan F, Ceron MI, Marchetti P, Giacchini V, Mota HB. O uso da eletroglotografia, eletromiografia, espectografia e ultrassom nos estudos de fala - revisão teórica [The use of electroglottography, electromyography, spectrography, and ultrasound in speech studies: a theoretical review]. REVISTA CEFAC 2015. [DOI: 10.1590/s1516-18462013005000049] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/22/2022] Open
Abstract
The use of new technologies in speech assessment and therapy, based on a review of articles published in the last 5 years, is the subject of this study, which aims to review the national and international literature on studies that used electroglottography, spectrography, ultrasonography, and electromyography in the assessment and therapy of speech disorders. There is growing interest in incorporating these tools into speech studies; however, existing work that correlates them is still scarce.
28
Tar É. The acquisition of the voicing contrast in word-initial bilabial and alveolar stops--atypical data from Hungarian. CLINICAL LINGUISTICS & PHONETICS 2014; 28:269-282. [PMID: 24206421 DOI: 10.3109/02699206.2013.835445] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/02/2023]
Abstract
The present study aims to investigate, through acoustic analysis, the acquisition of voicing contrast in Hungarian word-initial bilabial and alveolar stops (p/b and t/d) produced by 15 children with primary language disorders between 5;6 and 7;7 years of age. Data collection was based on a picture-naming task involving at least four different pictures for each segment tested. The elicited data were audio-recorded and then evaluated in terms of the proportion of pre-voiced stops per target category, the duration of voice onset time (VOT), and the presence of other phonetic features. The results revealed that, while all target voiceless stops are produced without pre-voicing, there is a bimodal distribution of VOTs for target voiced categories. Regarding the duration of VOTs, accurate realisations show average VOTs of immature values, and no sub-phonemic level differences were revealed in the distribution of VOTs for inaccurate realisations. Furthermore, voiceless realisations present frequently double/multiple release bursts. Findings are discussed in relation to a study on VOT distribution in the speech of typically developing children and to suggestions for further investigations.
Affiliation(s)
- Éva Tar, Faculty of Special Education, Department of Phonetics, Speech and Language Development, Eötvös Loránd University, Budapest, Hungary
29
Hitchcock ER, Koenig LL. The effects of data reduction in determining the schedule of voicing acquisition in young children. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2013; 56:441-457. [PMID: 23275393 DOI: 10.1044/1092-4388(2012/11-0175)] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 06/01/2023]
Abstract
PURPOSE In this study, multiple measures of voicing acquisition were used to evaluate the extent to which developmental patterns based on voice onset time (VOT) mean data differed from those based on token-by-token analyses in typically developing 2-year-olds. METHOD Multiple repetitions of words containing initial /b p d t/ were elicited from 10 English-speaking children biweekly for 4 months. VOT was measured for each stop. For each child, consonant, and recording session, means and ranges were obtained, as were measures of accuracy, discreteness, and overshoot calculated for session means and for individual tokens. RESULTS The token-by-token analyses suggested lower accuracy and more category overlap than the session means and revealed an overshoot phase for all children. They also showed examples of both abrupt and gradual changes that were not always evident in the means. Measures of range, accuracy, discreteness, and overshoot all continued to change after statistically significant VOT differences were observed. CONCLUSIONS The findings suggest that some aspects of voicing development may not be evident in analyses that rely on VOT mean data and patterns of statistical significance. Token-by-token measures provide a more complete picture of stages of voicing development than those based solely on mean VOT values.
30
Berticelli A, Mota HB. Ocorrência das estratégias de reparo para os fonemas plosivos, considerando o grau do desvio fonológico [Occurrence of repair strategies for plosive phonemes, considering the severity of the phonological disorder]. REVISTA CEFAC 2012. [DOI: 10.1590/s1516-18462012005000027] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/22/2022] Open
Abstract
PURPOSE: to determine whether repair strategies occur for the phonemes /b/, /d/, /k/, and /g/, and how these strategies relate to the severity of the phonological disorder. METHOD: 54 subjects diagnosed with phonological disorder were selected; all showed repair strategies for the plosive consonants /b/, /d/, /k/, and /g/ in initial and/or medial onset position, with 40% use in their phonological system. The data were analyzed statistically with the Statistical Analysis System program, version 8.02, using Fisher's exact test. The significance level adopted for the statistical tests was 5% (p < 0.05). RESULTS: a statistically significant difference was found for /b/, with a higher frequency of devoicing in children with moderate-severe and severe disorder, and of backing, with two or more strategies being used by the children with severe disorder. A statistically significant difference was also found for /d/, with a higher frequency of backing in subjects with mild disorder, of devoicing and of two or more strategies in those with moderate-severe disorder, and of devoicing in those with severe disorder. CONCLUSION: the more complex the plosive phonemes are in terms of acquisition and production, the more repair strategies are used. Moreover, the greater the severity of the phonological disorder, the more often this resource is used, showing that the child has less phonological knowledge.
|
31
|
Melo RM, Mota HB, Mezzomo CL, Brasil BDC, Lovatto L, Arzeno L. Caracterização acústica da sonoridade dos fones plosivos do português brasileiro. REVISTA CEFAC 2011. [DOI: 10.1590/s1516-18462011005000143] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/22/2022] Open
Abstract
PURPOSE: to investigate and compare the acoustic characteristics of voiceless and voiced plosives in the speech of children with typical phonological development and of adults with speech patterns typical of the language. METHOD: the study sample comprised two groups: 17 adults and 11 children with typical phonological development. Using words/pseudowords (['papa], ['baba], ['tata], ['dada], ['kaka], and ['gaga]) embedded in carrier sentences ("Fala ___ papa de novo"), voice onset time, vowel duration, burst amplitude, and occlusion duration were measured. The acoustic records of voiceless and voiced plosives were compared within and between groups using statistical tests (p < 0.05). RESULTS: in general, it was observed that (1) voice onset time was greater for voiced plosives than for voiceless ones; (2) vowel duration was longer when the vowel was followed or preceded by a voiced plosive than when adjacent to a voiceless one; (3) burst amplitude was slightly higher during the production of voiced segments; and (4) occlusion duration was longer in the context of voiceless plosives. Adults and children were also found to show many similarities in the production of these parameters. CONCLUSION: the acoustic cues investigated are strong parameters in characterizing the voicing contrast of plosives. The results also indicate many similarities between adults and children with typical phonological patterns. However, where differences are evident, they occur in unstressed and medial syllable position.
|
32
|
Wiethan FM, Mota HB. Ambientes linguísticos para a produção das fricativas /z/, /ʃ/ e /ʒ/: variabilidades na aquisição fonológica de seis sujeitos. REVISTA CEFAC 2011. [DOI: 10.1590/s1516-18462011005000111] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/21/2022] Open
Abstract
BACKGROUND: phonological acquisition paths of children treated with favorable environments, as opposed to less favorable and neutral environments, for the production of /z/, /ʃ/, and /ʒ/ in phonological therapy. PROCEDURES: six children with phonological disorder, aged between 4 years 7 months and 7 years 8 months, were selected to participate in the study. All had been duly authorized by their guardians. Speech-language and complementary assessments were carried out to diagnose the phonological disorder. The subjects were paired according to severity of the phonological disorder, sex, age group, and aspects of the phonological system relating to the affected phonemes. Half of the children were treated with words in which the phonemes /z/, /ʃ/, and /ʒ/ occurred in favorable environments, and the other half with less favorable and neutral environments. Eight sessions were held, after which new assessments were carried out to describe and qualitatively compare the subjects' phonological acquisition paths using the Implicational Model of Feature Complexity. RESULTS: the results indicated a slight advantage in therapeutic progress for two subjects treated with favorable environments, relative to their pairs. However, one child treated with less favorable and neutral environments obtained more positive results than the paired child. CONCLUSION: environments favorable to the production of the fricatives /z/, /ʃ/, and /ʒ/ did not determine therapeutic success, but they did positively influence the phonological progress of the study subjects.
|
33
|
Melo RM, Mota HB, Mezzomo CL, Brasil BDC, Lovatto L, Arzeno L. Desvio fonológico e a dificuldade com a distinção do traço [voz] dos fonemas plosivos: dados de produção e percepção do contraste de sonoridade. REVISTA CEFAC 2011. [DOI: 10.1590/s1516-18462011005000083] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/22/2022] Open
Abstract
PURPOSE: to compare the voice onset time (VOT) values of plosive phonemes produced by children with phonological disorder and difficulty producing the voicing contrast, with the phonemes classified as voiceless or voiced on the basis of an auditory-perceptual analysis. METHOD: five boys with phonological disorder and difficulty establishing the [+voice] feature of plosives took part in the study. Using word pairs (['papa], ['baba], ['tata], ['dada], ['kaka], and ['gaga]) embedded in carrier sentences, the VOT of each plosive was extracted. On the same speech sample, three speech-language pathologists carried out an auditory-perceptual analysis in which they judged these phonemes as voiceless or voiced. The VOT values of the phonemes classified as voiceless were compared with the VOT values of those classified as voiced using statistical tests. RESULTS: a statistically significant difference was found only between the VOT values judged as voiceless or voiced for [p, d, g] in initial onset, as well as among the three places of articulation. The other variables analyzed did not reach statistical significance. CONCLUSION: overall, VOT was not found to be a determining cue for perceiving the voicing distinction in the deviant cases. However, this cue was shown to influence the discrimination of phonemes by place of articulation, in phonological disorder as well. The acoustic analysis showed that (a) the presence of prevoicing influences the judgment of the consonant as voiced; (b) the duration of positive VOT is not decisive for the voicing distinction; and (c) a null VOT is, in most cases, responsible for the identification of a plosive as voiced.
|
34
|
Berti LC, Marino VCDC. Contraste fônico encoberto entre /t/ e /k/: um estudo de caso de normalidade e de transtorno fonológico. REVISTA CEFAC 2011. [DOI: 10.1590/s1516-18462011005000010] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/22/2022] Open
Abstract
PURPOSE: to investigate, with the aid of acoustic analysis, the establishment of phonological contrast in the productions of a subject with phonological disorder who shows neutralization (by auditory analysis) of the contrast between the alveolar and velar stops. METHODS: productions of the stops /t/ and /k/ combined with the vowels /a/ and /u/ in stressed position were analyzed in two subjects, one with and one without phonological disorder, both male and aged between 5 and 6 years. The phonetic-acoustic parameters analyzed included acoustic inspection of the waveform, parameters relating to the spectral characteristics of the burst, acoustic parameters of the vowels adjacent to the stops, and acoustic parameters relating to the temporal pattern. RESULTS: in the productions of the child with phonological disorder, covert contrasts were observed between the stops investigated, marked both by inappropriate use of phonetic cues and by correct use with insufficient values. In the productions of the child without phonological disorder, at least one phonetic cue relating to the main characteristics of a stop segment was used which, with sufficient magnitude, allows listeners to recover the contrast between /t/ and /k/. In the productions of both subjects, the phonetic cues depended on vowel context. CONCLUSION: a given phonic contrast can be understood as a constellation of phonetic cues that vary in their interdependence and perceptual significance.
|
35
|
Souza APRD, Scott LC, Mezzomo CL, Dias RF, Giacchini V. Avaliações acústica e perceptiva de fala nos processos de dessonorização de obstruintes. REVISTA CEFAC 2010. [DOI: 10.1590/s1516-18462010005000039] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/22/2022] Open
Abstract
PURPOSE: to compare the perception and production of the voicing feature in two subjects, one with normal acquisition and the other with phonological disorder, and to analyze the acoustic and perceptual methodologies for investigating the [±voiced] feature. PROCEDURES: an instrument containing minimal pairs contrasting in the value of the voicing feature was created to elicit the subjects' speech and to allow acoustic and perceptual analysis of voicing in their speech. The speech data were recorded on a Sony MZ-R70 MiniDisc in an acoustically treated room and submitted to the Kay Elemetrics Sona-Graph 5500 program to check for the presence or absence of voicing. Two clinicians and researchers experienced in phonological acquisition judged the voicing contrast in the children's speech. The children's auditory discrimination of the voicing feature was also tested using the Levi (1994) instrument. RESULTS: both children showed that they perceived the voicing feature and produced it in some contexts. The child with normal acquisition showed more contexts with adequate production of the voicing feature. There was about 10% disagreement between the judges in judging the voicing contrast. CONCLUSION: acoustic and perceptual analyses are complementary in speech assessment. There are moments of devoicing in the speech of the subject with typical acquisition.
|