1
van Hugte TBR, Heeren WFL. Exploring Interspeaker Variation in Creaky Voice in Dutch. J Voice 2024:S0892-1997(24)00162-0. [PMID: 38902141 DOI: 10.1016/j.jvoice.2024.05.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2024] [Revised: 05/17/2024] [Accepted: 05/20/2024] [Indexed: 06/22/2024]
Abstract
OBJECTIVES This study explored the extent and discriminatory potential of interspeaker variation in creaky voice in Dutch men. METHODS Intervals of creaky voice for 30 speakers were manually segmented and annotated from a corpus of spontaneous speech data. For each speaker, at least 1500 syllables were analyzed. Total creakiness was calculated based on the proportion of creaky syllables. Creaky intervals were categorized into subtypes based on the degree of periodicity. Furthermore, acoustic measurements were taken from the intervals and tested for speaker-discriminating capacity by means of a linear discriminant analysis (LDA). RESULTS Speakers differed in what percentage of syllables they realized with creaky voice, with a range of roughly 0-5% of all syllables. They likewise differed in the proportion with which they used different subtypes of creaky voice, such that some speakers have very distinctive profiles. The LDA resulted in correct classifications of creaky intervals to speakers at a rate above chance level. CONCLUSIONS Interspeaker variation in creaky voice in Dutch male speech was confirmed and allowed for moderate speaker classification on the basis of speech acoustics.
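The two quantities this abstract relies on, total creakiness and the chance level for speaker classification, can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names are hypothetical, and the chance level assumes equally likely speakers, which the abstract does not state.

```python
def creaky_proportion(syllable_is_creaky):
    """Share of one speaker's syllables that were annotated as creaky.

    `syllable_is_creaky` is a list of booleans, one per analyzed syllable
    (a hypothetical encoding of the manual annotation).
    """
    if not syllable_is_creaky:
        raise ValueError("need at least one syllable")
    creaky = sum(1 for flag in syllable_is_creaky if flag)
    return creaky / len(syllable_is_creaky)


def uniform_chance_level(n_speakers):
    """Expected accuracy of random guessing, assuming balanced speakers."""
    if n_speakers < 1:
        raise ValueError("need at least one speaker")
    return 1.0 / n_speakers


# Example: 45 creaky syllables out of 1500 analyzed syllables.
labels = [True] * 45 + [False] * 1455
print(creaky_proportion(labels))   # proportion of creaky syllables
print(uniform_chance_level(30))    # baseline for a 30-speaker classifier
```

With 30 speakers and balanced data, a classifier would need to exceed roughly 3.3% accuracy to count as above chance under this assumption.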
Affiliation(s)
- Thom B R van Hugte
- Institut für Linguistik, Leipzig University, Leipzig, Germany; Leiden University Centre for Linguistics, Leiden University, Leiden, The Netherlands
- Willemijn F L Heeren
- Leiden University Centre for Linguistics, Leiden University, Leiden, The Netherlands
2
Zhang Y, Chen F, Xu F, Guo C, Li K. Acoustic characteristics of infant- and foreigner-directed speech with Mandarin as the target language. The Journal of the Acoustical Society of America 2024; 155:3877-3888. [PMID: 38888391 DOI: 10.1121/10.0026359] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2023] [Accepted: 05/28/2024] [Indexed: 06/20/2024]
Abstract
The quality of speech input influences the efficiency of L1 and L2 acquisition. This study examined modifications in infant-directed speech (IDS) and foreigner-directed speech (FDS) in Standard Mandarin, a tonal language, and explored how IDS and FDS features were manifested in disyllabic words and a longer discourse. The study aimed to determine which characteristics of IDS and FDS were enhanced in comparison with adult-directed speech (ADS), and how IDS and FDS differed when measured in a common set of acoustic parameters. For words, it was found that tone-bearing vowel duration, mean and range of fundamental frequency (F0), and the lexical tone contours were enhanced in IDS and FDS relative to ADS, except for the dipping Tone 3, which exhibited an unexpected lowering in FDS but no modification in IDS when compared with ADS. For the discourse, different aspects of temporal and F0 enhancements were emphasized in IDS and FDS: the mean F0 was higher in IDS whereas the total discourse duration was greater in FDS. These findings add to the growing literature on L1 and L2 speech input characteristics and their role in language acquisition.
Affiliation(s)
- Yu Zhang
- School of Foreign Languages, Hunan University, Changsha, China
- Fei Chen
- School of Foreign Languages, Hunan University, Changsha, China
- Feng Xu
- Department of Linguistics, Macquarie University, Sydney, New South Wales, Australia
- Chengyu Guo
- Faculty of Arts and Sciences, Beijing Normal University, Zhuhai, China
- Kexuan Li
- School of Foreign Languages, Hunan University, Changsha, China
3
Gao S, Ma EPM. The Relationship Between Voice Parameters and Speech Intelligibility: A Scoping Review. J Voice 2024:S0892-1997(24)00130-9. [PMID: 38755076 DOI: 10.1016/j.jvoice.2024.04.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2023] [Revised: 04/07/2024] [Accepted: 04/08/2024] [Indexed: 05/18/2024]
Abstract
OBJECTIVE To synthesize existing evidence of the relationship between voice parameters and speech intelligibility. METHODS Following Preferred Reporting Items for Systematic Reviews and Meta-Analysis extension for Scoping Review (PRISMA-ScR) guidelines, 13 databases were searched and a manual search was conducted. A narrative synthesis of methodological quality, study characteristics, participant demographics, voice parameter categorization, and their relationship to speech intelligibility was conducted. A Grading of Recommendations Assessment, Development, and Evaluation (GRADE) assessment was also performed. RESULTS A total of 5593 studies were retrieved, and 30 eligible studies were included in the final scoping review. The studies were given scores of 10-25 (average 16.93) out of 34 in the methodological quality assessment. Research that analyzed voice parameters related to speech intelligibility, encompassing perceptual, acoustic, and aerodynamic parameters, was included. Validated and nonvalidated perceptual voice assessments showed divergent results regarding the relationship between perceptual parameters and speech intelligibility. The relationship between acoustic parameters and speech intelligibility was found to be complex and the results were inconsistent. The limited research on aerodynamic parameters did not reach a consensus on their relationship with speech intelligibility. Studies in which listeners were not speech-language pathologists (SLPs) far outnumbered those with SLP listeners, and research conducted in English contexts significantly exceeded that in non-English contexts. The GRADE evaluation indicated that the quality of evidence varied from low to moderate. DISCUSSION The results for the relationship between voice parameters and intelligibility showed significant heterogeneity. Future research should consider age-related voice changes and include diverse age groups. 
To enhance validity and comparability, it will be necessary to report effect sizes, tool validity, inter-rater reliability, and calibration procedures. Voice assessments should account for the validation status of tools because of their potential impact on the outcomes. The linguistic context may also influence the results.
Affiliation(s)
- Shaohua Gao
- Voice Research Laboratory, Faculty of Education, The University of Hong Kong, Pok Fu Lam, Hong Kong
- Estella P-M Ma
- Voice Research Laboratory, Faculty of Education, The University of Hong Kong, Pok Fu Lam, Hong Kong
4
Højen A, Madsen TO, Bleses D. Danish 20-month-olds' recognition of familiar words with and without consonant and vowel mispronunciations. Phonetica 2023; 80:309-328. [PMID: 37533184 DOI: 10.1515/phon-2023-2001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/04/2023]
Abstract
Although several studies initially supported the proposal by Nespor et al. (Nespor, Marina, Marcela Peña & Jacques Mehler. 2003. On the different roles of vowels and consonants in speech processing and language acquisition. Lingue e Linguaggio 2. 221-247) that consonants are more informative than vowels in lexical processing, a more complex picture has emerged from recent research. Current evidence suggests that infants initially show a vowel bias in lexical processing and later transition to a consonant bias, possibly depending on the characteristics of the ambient language. Danish infants have shown a vowel bias in word learning at 20 months, an age at which infants learning French or Italian no longer show a vowel bias but rather a consonant bias, and infants learning English show no bias. The present study tested whether Danish 20-month-olds also have a vowel bias when recognizing familiar words. Specifically, using the Intermodal Preferential Looking paradigm, we tested whether Danish infants were more likely to ignore or accept consonant than vowel mispronunciations when matching familiar words with pictures. The infants successfully matched correctly pronounced familiar words with pictures but showed no vowel or consonant bias when matching mispronounced words with pictures. The lack of a bias for Danish vowels or consonants in familiar word recognition adds to evidence that lexical processing biases are language-specific and may additionally depend on developmental age and perhaps task difficulty.
Affiliation(s)
- Anders Højen
- School of Communication and Culture and TrygFonden's Centre for Child Research, Aarhus University, Aarhus V, Denmark
- Thomas O Madsen
- Department of Language, Culture, History and Communication, University of Southern Denmark, Odense, Denmark
- Dorthe Bleses
- School of Communication and Culture and TrygFonden's Centre for Child Research, Aarhus University, Aarhus V, Denmark
5
Fung RSY, Wong EYC. Separated and reunified: An apparent time investigation of the voice quality differences between Hong Kong Cantonese and Guangzhou Cantonese. PLoS One 2023; 18:e0293058. [PMID: 37851598 PMCID: PMC10584129 DOI: 10.1371/journal.pone.0293058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2022] [Accepted: 10/04/2023] [Indexed: 10/20/2023] [Open Access]
Abstract
Hong Kong Cantonese (HKC) and Guangzhou Cantonese (GZC) are two major accents of Cantonese spoken in two geographically non-contiguous cities in Southern China. Previous studies were unable to identify the phonetic features that distinguish the two accents, since they share the same phonological system. This study attempted to solve the puzzle by investigating the voice quality differences between the two accents through acoustic analysis of the speech output of 191 talkers in three age groups ranging from 18 to 65 years old. Among the various spectral and noise measurements of voice quality, we found that Cepstral Peak Prominence (CPP) was the best acoustic measure for distinguishing the two accents. Based on the CPP measure, GZC had more noise overall than HKC. Covariation of voice quality and tones was studied. The greatest CPP differences between the two accents were found in the two extreme tones: the high-level and the extra-low-level tones. Furthermore, creaky voice was found to be mainly tied to the extra-low-level tone in both accents. However, HKC exhibited a higher frequency of creaky voice than GZC. The creaky voice in GZC was characterized by increased noise and increased tension compared to that of HKC. Finally, age was found to be a mediating factor in the voice quality of the two accents. Under the Apparent Time Framework, the results indicate that voice quality in the two cities has undergone changes over time. The voice quality of the young generations of the two accents has merged among the three low tones. Furthermore, the prevalence of creaky voice was increasing across age groups in both accents, and it increased at a faster rate in HKC than GZC.
Affiliation(s)
- Roxana S. Y. Fung
- Department of Chinese and Bilingual Studies, The Hong Kong Polytechnic University, Hung Hom, Hong Kong
- Eugene Y. C. Wong
- Department of Speech-Language-Hearing Sciences, University of Minnesota, Minneapolis, Minnesota, United States of America
6
Creel SC, Obiri-Yeboah M, Rose S. Language-to-music transfer effects depend on the tone language: Akan vs. East Asian tone languages. Mem Cognit 2023; 51:1624-1639. [PMID: 37052771 PMCID: PMC10100610 DOI: 10.3758/s13421-023-01416-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/15/2023] [Indexed: 04/14/2023]
Abstract
Recent research suggests that speaking a tone language confers benefits in processing pitch in nonlinguistic contexts such as music. This research largely compares speakers of nontone European languages (English, French) with speakers of tone languages in East Asia (Mandarin, Cantonese, Vietnamese, Thai). However, tone languages exist on multiple continents, notably languages indigenous to Africa and the Americas. With one exception (Bradley, Psychomusicology, 26(4), 337-345, 2016), no research has assessed whether these tone languages also confer pitch processing advantages. Two studies presented a melody change detection task, using quasirandom note sequences drawn from Western major scale tone probabilities. Listeners were speakers of Akan, a tone language of Ghana, plus speakers from previously tested populations (nontone language speakers and East Asian tone language speakers). In both cases, East Asian tone language speakers showed the strongest musical pitch processing, but Akan speakers did not exceed nontone speakers, despite comparable or better instrument change detection. Results suggest more nuanced effects of tone languages on pitch processing. Greater numbers of tones, presence of contour tones in a language's tone inventory, or possibly greater functional load of tone may be more likely to confer pitch processing benefits than mere presence of tone contrasts.
Affiliation(s)
- Sarah C. Creel
- UC San Diego Cognitive Science, 9500 Gilman Drive Mail Code 0515, La Jolla, CA 92093-0515 USA
- Michael Obiri-Yeboah
- Georgetown University Linguistics, Washington, DC USA
- UC San Diego Linguistics, San Diego, CA USA
7
Chan MPY, Kuang J. The effect of tone language background on cue integration in pitch perception. The Journal of the Acoustical Society of America 2023; 154:819-830. [PMID: 37563829 DOI: 10.1121/10.0020565] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2023] [Accepted: 07/18/2023] [Indexed: 08/12/2023]
Abstract
This study explores the effect of native language and musicality on voice quality cue integration in pitch perception. Previous work by Cui and Kang [(2019). J. Acoust. Soc. Am. 146(6), 4086-4096] found no differences in pitch perception strategies between English and Mandarin speakers. The present study asks whether Cantonese listeners may perform differently, as Cantonese consists of multiple level tones. Participants completed two experiments: (i) a forced choice pitch classification experiment involving four spectral slope permutations that vary in fo across an 11 step continuum, and (ii) the MBEMA test that quantifies listeners' musicality. Results show that Cantonese speakers do not differ from English and Mandarin speakers in terms of overall categoricity and perceptual shift, that Cantonese speakers do not have advantages in musicality, and that musicality is a significant predictor for participants' pitch perception strategies. Listeners with higher musicality scores tend to rely more on fo cues than voice quality cues compared to listeners with lower musicality. These findings support the notion that voice quality integration in pitch perception is not language specific, and may be a universal psychoacoustic phenomenon at a non-lexical level.
Affiliation(s)
- May Pik Yu Chan
- Department of Linguistics, University of Pennsylvania, Philadelphia, Pennsylvania 19104-6228, USA
- Jianjing Kuang
- Department of Linguistics, University of Pennsylvania, Philadelphia, Pennsylvania 19104-6228, USA
8
Han JY, Hsiao CJ, Zheng WZ, Weng KC, Ho GM, Chang CY, Wang CT, Fang SH, Lai YH. Enhancing the Performance of Pathological Voice Quality Assessment System Through the Attention-Mechanism Based Neural Network. J Voice 2023:S0892-1997(22)00426-X. [PMID: 36732109 DOI: 10.1016/j.jvoice.2022.12.026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2022] [Revised: 12/28/2022] [Accepted: 12/28/2022] [Indexed: 02/04/2023]
Abstract
OBJECTIVE Doctors currently rely primarily on auditory-perceptual evaluation, such as the grade, roughness, breathiness, asthenia, and strain scale, to evaluate voice quality and determine the treatment. However, the results predicted by individual physicians often differ, because of subjective perceptions and the diagnosis time interval, if the patient's symptoms are hard to judge. Therefore, an accurate computerized pathological voice quality assessment system will improve the quality of assessment. METHOD This study proposes a self-attention-based deep learning system, named self-attention-based bidirectional long short-term memory (SA BiLSTM). Different pitches [low, normal, high] and vowels [/a/, /i/, /u/] were added into the proposed model, so that it learns how professional doctors evaluate the grade, roughness, breathiness, asthenia, and strain scale in a high-dimensional view. RESULTS The experimental results showed that the proposed system provided higher performance than the baseline systems. More specifically, the macro average of the F1 score, presented as a decimal, was used to compare the accuracy of classification. The (G, R, and B) scores of the proposed system were (0.768±0.011, 0.820±0.009, and 0.815±0.009), which are higher than those of the baseline systems: deep neural network (0.395±0.010, 0.312±0.019, 0.321±0.014) and convolutional neural network (0.421±0.052, 0.306±0.043, 0.3250±0.032), respectively. CONCLUSIONS The proposed system, with SA BiLSTM, pitches, and vowels, provides a more accurate way to evaluate the voice. This will be helpful for clinical voice evaluations and will improve patients' benefits from voice therapy.
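The macro-averaged F1 score used to report these results can be sketched as below. This is a generic illustration of the metric, not the authors' evaluation code; the function name and the toy labels are hypothetical.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores.

    Each class contributes equally, regardless of how many samples it has.
    A class with no true and no predicted samples cannot occur here because
    the label set is built from the data; a class with zero true positives
    scores 0.0.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN)
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example with three severity grades (0, 1, 2) as class labels.
truth = [0, 0, 1, 1, 2, 2]
predicted = [0, 1, 1, 1, 2, 0]
print(round(macro_f1(truth, predicted), 3))
```

Macro averaging is a common choice when rating categories are imbalanced, since a classifier cannot score well by favoring the majority class alone.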
Affiliation(s)
- Ji-Yan Han
- National Yang Ming Chiao Tung University, Department of Biomedical Engineering, Taipei, Taiwan
- Ching-Ju Hsiao
- National Yang Ming Chiao Tung University, Department of Biomedical Engineering, Taipei, Taiwan
- Wei-Zhong Zheng
- National Yang Ming Chiao Tung University, Department of Biomedical Engineering, Taipei, Taiwan
- Chi-Te Wang
- Far Eastern Memorial Hospital, Department of Otolaryngology Head and Neck Surgery, Taipei, Taiwan
- Shih-Hau Fang
- Yuan Ze University, Department of Electric Engineering, Taoyuan, Taiwan
- Ying-Hui Lai
- National Yang Ming Chiao Tung University, Department of Biomedical Engineering, Taipei, Taiwan; Medical Device Innovation & Translation Center, National Yang Ming Chiao Tung University, Taipei, Taiwan
9
Wang B, Kügler F, Genzel S. The interaction of focus and phrasing with downstep and post-low-bouncing in Mandarin Chinese. Front Psychol 2022; 13:884102. [PMID: 36248550 PMCID: PMC9561885 DOI: 10.3389/fpsyg.2022.884102] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2022] [Accepted: 08/01/2022] [Indexed: 11/13/2022] [Open Access]
Abstract
L(ow) tone in Mandarin Chinese causes both downstep and post-low-bouncing. Downstep refers to the lowering of a H(igh) tone after a L tone, which is usually measured by comparing the H tones in a “H…HLH…H” sentence with a “H…HHH…H” sentence (cross-comparison), investigating whether downstep sets a new pitch register for the scaling of subsequent tones. Post-low-bouncing refers to the raising of a H tone after a focused L tone. The current study investigates how downstep and post-low-bouncing interact with focus and phrasing in Mandarin Chinese. In the experiment, we systematically manipulated (a) the tonal environment by embedding two syllables with either LH or HH tone (syllable X and Y) sentence-medially in the same carrier sentences containing only H tones; (b) boundary strength between X and Y by introducing either a syllable boundary or a phonological phrase boundary; and (c) information structure by either placing a contrastive focus in the HL/HH word (XF), syllable Y (YF), or the sentence-final word (ZF). A wide-focus condition served as the baseline. With systematic control of focus and boundary strength around the L tone, the current study shows that the downstep effect in Mandarin is quite robust, lasting for 3–5 H tones after the L tone, but eventually levelling back again to the register reference line of a H tone. The way focus and phrasing interact with the downstep effect is unexpected. Firstly, sentence-final focus has no anticipatory effect on shortening the downstep effect; instead, it makes the downstep effect last longer than in the wide-focus condition. Secondly, the downstep effect still shows when the H tone after the L tone is on-focus (YF), in a weaker manner than in the wide-focus condition, and is overridden by post-focus compression. Thirdly, the downstep effect gets greater when the boundary after the L tone is stronger, because the L tone is longer and more likely to be creaky. We further analyzed downstep by measuring the F0 drop between the two H tones surrounding the L tone (sequential-comparison). Comparing it with the F0 drop in all-H sentences (i.e., declination) showed that the downstep effect was much greater and more robust than declination. However, creaky voice in the L tone was not the direct cause of downstep. Finally, when the L tone was under focus (XF), it caused a post-low-bouncing effect, which was weakened by a phonological phrase boundary. Altogether, the results showed that although intonation is largely controlled by informative functions, the physical-articulatory controls are relatively persistent, varying within the pitch range of 2.5 semitones. Downstep and post-low-bouncing in Mandarin Chinese thus seem to be mainly due to physical-articulatory movement on varying pitch, with the gradual tonal F0 change meeting the requirement of smooth transition across syllables, and avoiding confusion in informative F0 control.
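The sequential comparison described above, an F0 drop between the two H tones surrounding the L tone, can be expressed in semitones with the standard conversion 12·log2(f1/f2). The sketch below is illustrative only: the abstract does not say in which unit the drop itself was measured, and the function names and F0 values are hypothetical.

```python
import math


def semitone_difference(f0_hz, reference_hz):
    """Interval from `reference_hz` up to `f0_hz` in semitones."""
    if f0_hz <= 0 or reference_hz <= 0:
        raise ValueError("F0 values must be positive")
    return 12.0 * math.log2(f0_hz / reference_hz)


def sequential_f0_drop(h_before_hz, h_after_hz):
    """F0 drop (positive = lowering) from the H tone before the L tone
    to the H tone after it, in semitones."""
    return semitone_difference(h_before_hz, h_after_hz)


# Hypothetical values: H tone at 250 Hz before the L tone, 220 Hz after it.
drop_with_l_tone = sequential_f0_drop(250.0, 220.0)
# Hypothetical all-H sentence at the same positions (declination only).
drop_declination = sequential_f0_drop(250.0, 245.0)
print(drop_with_l_tone > drop_declination)
```

Working in semitones rather than hertz makes drops comparable across speakers with different pitch ranges.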
Affiliation(s)
- Bei Wang
- Key Laboratory of Language, Cognition and Computation, School of Foreign Languages, Beijing Institute of Technology, Beijing, China
- Frank Kügler
- Department of Linguistics, Goethe University Frankfurt, Frankfurt, Germany
- *Correspondence: Frank Kügler
10
Chai Y, Garellek M. On H1-H2 as an acoustic measure of linguistic phonation type. The Journal of the Acoustical Society of America 2022; 152:1856. [PMID: 36182308 DOI: 10.1121/10.0014175] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/29/2022] [Accepted: 09/01/2022] [Indexed: 06/16/2023]
Abstract
The measure H1-H2, the difference in amplitude between the first and second harmonics, is frequently used to distinguish phonation types and to characterize differences across voices and genders. While H1-H2 can differentiate voices and is used by listeners to perceive changes in voice quality, its relation to voice articulation is less straightforward. Its calculation also involves practical issues with error propagation. This paper highlights some developments in the use of H1-H2 and proposes a new measure that we call "residual H1." In residual H1, the amplitude of the first harmonic is normalized against the overall sound energy (as measured by root mean square energy) instead of against H2. Residual H1 may mitigate some of the issues with using H1-H2. The current study tests the correlation between residual H1 and electroglottographic contact quotient (CQ) and compares the ability of residual H1 vs H1-H2 to differentiate statistically across phonation types in !Xóõ and utterance-level changes in phonatory quality in Mandarin. The results show that residual H1 has a stronger correlation with CQ and differentiates contrastive and allophonic phonatory quality better than H1-H2, particularly for more constricted phonation types.
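The contrast between the two measures can be sketched as follows. H1-H2 is a plain difference of harmonic amplitudes. For residual H1, the abstract says only that H1 is normalized against overall energy; the sketch assumes, based on the name, that this means taking the residuals of a least-squares regression of H1 on energy. That reading, the function names, and the sample values are all assumptions, not the authors' published procedure.

```python
def h1_minus_h2(h1_db, h2_db):
    """Amplitude difference between the first and second harmonics (dB)."""
    return h1_db - h2_db


def residual_h1(h1_values, energy_values):
    """Residuals of H1 after a simple linear regression on energy.

    Assumed reading of 'normalized against overall sound energy':
    fit H1 = a + b * energy by ordinary least squares and return
    H1 minus the fitted value for each frame.
    """
    n = len(h1_values)
    if n != len(energy_values) or n < 2:
        raise ValueError("need two equal-length sequences with >= 2 frames")
    mean_x = sum(energy_values) / n
    mean_y = sum(h1_values) / n
    sxx = sum((x - mean_x) ** 2 for x in energy_values)
    if sxx == 0:
        raise ValueError("energy must vary across frames")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(energy_values, h1_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x)
            for x, y in zip(energy_values, h1_values)]


# Hypothetical per-frame measurements.
h1 = [30.0, 34.0, 29.0, 37.0]
energy = [60.0, 64.0, 61.0, 66.0]
print([round(r, 2) for r in residual_h1(h1, energy)])
```

One practical difference follows from the abstract: residual H1 does not require estimating H2, so errors in locating or measuring the second harmonic do not propagate into the measure.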
Affiliation(s)
- Yuan Chai
- Department of Linguistics, University of California San Diego, La Jolla, California 92093, USA
- Marc Garellek
- Department of Linguistics, University of California San Diego, La Jolla, California 92093, USA
11
Liu M, Chen Y, Schiller NO. Context Matters for Tone and Intonation Processing in Mandarin. Language and Speech 2022; 65:52-72. [PMID: 33482696 DOI: 10.1177/0023830920986174] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
In tonal languages such as Mandarin, both lexical tone and sentence intonation are primarily signaled by F0. Their F0 encodings are sometimes in conflict and sometimes in congruency. The present study investigated how tone and intonation, with F0 encodings in conflict or in congruency, are processed and how semantic context may affect their processing. To this end, tone and intonation identification experiments were conducted in both semantically neutral and constraining contexts. Results showed that the overall performance of tone identification was better than that of intonation. Specifically, tone identification was seldom affected by intonation information irrespective of semantic contexts. However, intonation identification, particularly question intonation, was susceptible to the final lexical tone identity and affected by the semantic context. In the semantically neutral context, questions ending with a rising tone and a falling tone were equally difficult to identify. In the semantically constraining context, questions ending with a falling tone were much better identified than those ending with a rising tone. This perceptual asymmetry suggests that top-down information provided by the semantically constraining context can play a facilitating role for listeners to disentangle intonational information from tonal information, but mainly in sentences with the lexical falling tone in the final position.
Affiliation(s)
- Min Liu
- College of Chinese Language and Culture & Institute of Applied Linguistics, Jinan University, China
- Niels O Schiller
- Leiden University Centre for Linguistics & Leiden Institute for Brain and Cognition, Leiden University, the Netherlands
12
Abstract
This study investigates the timing of stød, a type of phonological nonmodal phonation related to creaky voice in Danish, relative to the syllable. Stød-bearing syllables are characterized by high fundamental frequency (F0) and modal phonation at the beginning of the syllable followed by nonmodal, often creaky phonation and low F0 towards the end of the syllable (the stød phase proper). However, the timing of these two phases relative to the syllable and to each other has been debated. To investigate this, F0 throughout the word and the timing of the stød phase proper relative to the syllable were analyzed in five types of monosyllabic words. The results show that across word types the first stød phase (high F0) coordinates with the syllable rhyme onset, whilst the second phase is timed to the center of the sonorant rhyme, in contrast to previous hypotheses of stød timing. This relationship is formalized using the framework of Articulatory Phonology. In doing so, two additions to the theory are proposed to account for the biphasic nature of stød and the timing of the stød phase proper relative to the syllable.
13
Rhee N, Chen A, Kuang J. Musicality and Age Interaction in Tone Development. Front Neurosci 2022; 16:804042. [PMID: 35264924 PMCID: PMC8901167 DOI: 10.3389/fnins.2022.804042] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/28/2021] [Accepted: 01/26/2022] [Indexed: 11/13/2022] [Open Access]
Abstract
Vocal pitch, which involves not only F0 but also multiple covarying acoustic cues, is central to linguistic perception and production at various levels of prosodic structure. Recent studies on language development have shown that differences in learners' musicality affect the F0 cue development in perception of sentence-level intonation or in prosodic realization of focus. This study aims to contribute toward a fuller understanding of the effect of musicality on linguistic pitch development via a close investigation of the relationship between musicality, age, and lexical tone production covering both F0 and spectral cues in children. Forty-three native Mandarin-speaking children between the ages of 4 and 6 years were recruited to participate in both a semi-spontaneous tone production task and a musicality test. For each age (4, 5, and 6 years) and musicality (below or above the median score of each age group) group, the contrastivity of the four tones was evaluated by performing automatic tone classification using three sets of acoustic cues (F0, spectral cues, and both). Higher musicality was found to be associated with higher contrastivity of the tones produced at the ages of 4 and 5 years, but not at the age of 6 years. These results suggest that musicality promotes earlier development of tone production only in earlier stages of prosodic development; by the age of 6 years, the musicality advantage in tone production subsides.
Affiliation(s)
- Nari Rhee
- Department of Linguistics, University of Pennsylvania, Philadelphia, PA, United States
- *Correspondence: Nari Rhee
- Aoju Chen
- Utrecht Institute of Linguistics OTS, Utrecht University, Utrecht, Netherlands
- Jianjing Kuang
- Department of Linguistics, University of Pennsylvania, Philadelphia, PA, United States
- Jianjing Kuang
14
Tupper P, Leung KW, Wang Y, Jongman A, Sereno JA. The contrast between clear and plain speaking style for Mandarin tones. The Journal of the Acoustical Society of America 2021; 150:4464. [PMID: 34972264 DOI: 10.1121/10.0009142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2021] [Accepted: 12/03/2021] [Indexed: 06/14/2023]
Abstract
We examine the acoustic characteristics of clear and plain conversational productions of Mandarin tones. Twenty-one native Mandarin speakers were asked to produce a selection of Mandarin words in both plain and clear speaking styles. Several tokens were gathered for each of the four tones giving a total of 2045 productions. Six critical tonal cues were computed for each production: fundamental frequency (F0) mean, slope, and second derivative, duration, mean intensity, and a binary variable coding whether the production involved creaky voice. A linear mixed-effects regression model was used to explore how these cues changed with respect to the clear versus plain distinction for each tone, with speaking style as the fixed effect and speaker being a random effect. The strongest effects detected were that duration and mean intensity increased in clear speech across speakers and tones. Tones 2 and 3 increased in mean F0 and Tone 4 increased its slope. An additional finding was that, for contour tones, speakers accomplished the increase in duration by stretching out the tone contours in time while largely not changing the F0 range. These results are discussed in terms of signal-based (affecting all tones) and code-based (enhancing contrast between tones) change.
Affiliation(s)
- Paul Tupper
- Department of Mathematics, Simon Fraser University, Burnaby, British Columbia, V5A 1S6, Canada
| | - Keith W Leung
- Department of Linguistics, Simon Fraser University, Burnaby, British Columbia, V5A 1S6, Canada
| | - Yue Wang
- Department of Linguistics, Simon Fraser University, Burnaby, British Columbia, V5A 1S6, Canada
| | - Allard Jongman
- Department of Linguistics, University of Kansas, Lawrence, Kansas 66045, USA
| | - Joan A Sereno
- Department of Linguistics, University of Kansas, Lawrence, Kansas 66045, USA
| |
|
15
|
Zeng Y, Fiorentino R, Zhang J. Electrophysiological Signatures of Perceiving Alternated Tone in Mandarin Chinese: Mismatch Negativity to Underlying Tone Conflict. Front Psychol 2021; 12:735593. [PMID: 34646215 PMCID: PMC8504678 DOI: 10.3389/fpsyg.2021.735593] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/02/2021] [Accepted: 08/23/2021] [Indexed: 11/16/2022]
Abstract
Although phonological alternation is prevalent in languages, the process of perceiving phonologically alternated sounds is poorly understood, especially at the neurolinguistic level. We examined the process of perceiving Mandarin 3rd tone sandhi (T3 + T3 → T2 + T3) with a mismatch negativity (MMN) experiment. Our design has two independent variables (whether the deviant undergoes tone sandhi; whether the standard and the deviant have matched underlying tone). These two independent variables modulated ERP responses in both the first and the second syllables. Notably, despite the apparent segmental conflict between the standard and the deviant in all conditions, MMN is only observed when neither the standard nor the deviant undergoes tone sandhi, suggesting that discovering the underlying representation of an alternated sound could interfere with the generation of MMN. A tentative model with three hypothesized underlying processing mechanisms is proposed to explain the observed latency and amplitude differences across conditions. The results are also discussed in light of the potential electrophysiological signatures involved in the process of perceiving alternated sounds.
Affiliation(s)
- Yuyu Zeng
- Phonetics and Psycholinguistics Laboratory, Department of Linguistics, University of Kansas, Lawrence, KS, United States; Neurolinguistics and Language Processing Laboratory, Department of Linguistics, University of Kansas, Lawrence, KS, United States
| | - Robert Fiorentino
- Neurolinguistics and Language Processing Laboratory, Department of Linguistics, University of Kansas, Lawrence, KS, United States
| | - Jie Zhang
- Phonetics and Psycholinguistics Laboratory, Department of Linguistics, University of Kansas, Lawrence, KS, United States
| |
|
16
|
Rhee N, Chen A, Kuang J. Going beyond F0: The acquisition of Mandarin tones. JOURNAL OF CHILD LANGUAGE 2021; 48:387-398. [PMID: 32393402 DOI: 10.1017/s0305000920000239] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/11/2023]
Abstract
Using a semi-spontaneous speech corpus, we present evidence from computational modelling of tonal productions from Mandarin-speaking children (4 to 11 years old) and adults, showing that children exceed the adult-level tonal distinction at the age of 7 to 8 years using F0 cues, but do not reach the high adult-level distinction using spectral cues even at the age of 10 to 11 years. The difference in the developmental curves of F0 and spectral cues suggests that, in Mandarin tone production, secondary cues continue to develop even after the mastery of primary cues.
|
17
|
Lee-Kim SI. Stop Laryngeal Distinctions Driven by Contrastive Effects of Neighboring Tones. LANGUAGE AND SPEECH 2021; 64:98-122. [PMID: 32476568 DOI: 10.1177/0023830920922897] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
This study examined contrastive effects of neighboring tones that give rise to a systematic asymmetry in stop perception. Korean-speaking learners of Mandarin Chinese and naïve listeners labeled voiceless unaspirated stops preceded or followed by low or high extrinsic tonal context (e.g., maLO.pa vs. maHI.pa) either as lenis (associated with a low F0 at the vowel onset) or as fortis stops (with a high F0). Further, the target tone itself varied between level and rising (e.g., maLO.paLEV vs. maLO.paRIS). Both groups of listeners showed significant contrastive effects of extrinsic context. Specifically, more lenis responses were elicited in a high tone context than in a low one, and vice versa. This indicates that the onset F0 of a stop is perceived lower in a high tone context, which, in turn, provides positive evidence for lenis stops. This effect was more clearly pronounced for the level than for the contour tone target and also for the preceding than for the following context irrespective of linguistic experience. Despite qualitative similarities, learners showed larger effects for all F0 variables, indicating that the degree of context effects may be enhanced by one's phonetic knowledge, namely sensitivity to F0 cues along with the processing of consecutive tones acquired through learning a tone language.
|
18
|
Davidson L. The versatility of creaky phonation: Segmental, prosodic, and sociolinguistic uses in the world's languages. WILEY INTERDISCIPLINARY REVIEWS. COGNITIVE SCIENCE 2020; 12:e1547. [PMID: 33015958 DOI: 10.1002/wcs.1547] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 02/04/2020] [Revised: 09/11/2020] [Accepted: 09/15/2020] [Indexed: 11/11/2022]
Abstract
Creaky phonation (also known as creaky voice, vocal fry, laryngealization, or glottalization) is a voice quality that refers to shortened and thickened vocal folds that vibrate at a low and quasi-regular fundamental frequency with a long period of damping. Cross-linguistically, creaky phonation can span either short or long domains. When implemented on individual vowels or consonants (as in Zapotec or Montana Salish), it can signal phonemic contrast with other voice qualities, or it can be an additional acoustic cue to enhance other contrasts, such as tone (as in Mandarin or Cantonese). Another segmental use of creaky phonation in many languages is as a variant of glottal stop. Creaky phonation can also be implemented as a prosodic element that signals the end of a phrase (as in English or Mandarin), or indicates relinquishing a conversational turn (as in Finnish). It can also express meaning in a social interaction, such as irritation (in Vietnamese). Lastly, creaky phonation can be deployed as a sociolinguistic marker to establish identities, convey affect, or distinguish one speech group from another within the same language. In some social circumstances, such as the perception that young women use creaky phonation at greater rates than men do, it can be evaluated negatively by listeners. As creaky phonation can be combined with linguistic elements at various levels and is easily perceptible, it has taken on a remarkable number of roles in our linguistic repertoires. This article is categorized under: Linguistics > Language in Mind and Brain.
Affiliation(s)
- Lisa Davidson
- Department of Linguistics, New York University, New York, New York, USA
| |
|
19
|
Gao J, Hallé P, Draxler C. Breathy voice and low-register: A case of trading relation in Shanghai Chinese tone perception? LANGUAGE AND SPEECH 2020; 63:582-607. [PMID: 31496353 DOI: 10.1177/0023830919873080] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/10/2023]
Abstract
In Shanghai Chinese as well as many other Wu dialects, breathy voice is a well-documented accompaniment of the low-register tone syllables with obstruent as well as sonorant onsets. But Shanghai Chinese is rapidly changing and the breathy voice associated with low-register tones tends to disappear in young speakers' productions. In this study, we asked whether breathy voice is nevertheless still perceived and whether it pushes tone identification toward low-register tones. We conducted forced-choice tone identification tests on young native listeners of Shanghai Chinese, using low-high register tone continua, from tone T3 (23) to tone T2 (34), imposed on base syllables with either modal or breathy voice quality, and beginning with various onset consonants. We used continua constructed from either naturally produced or synthesized syllables. Our results show that breathy voice does bias tone identification responses toward the low-register tone T3. This result held for both synthesized and natural stimuli, except for the /m/-onset stimuli derived from naturally produced syllables. We propose that the phonetic change at issue (loss of breathiness in production) is not due to misperception but reflects the ever-stronger influence of Standard Mandarin Chinese. In other words, this particular case of sound change seems to be led by production rather than perception. It remains an open question whether this kind of sound change is only determined by sociolinguistic factors (here, the dominance of Mandarin Chinese) or is independently motivated by phonetic and/or phonological factors.
Affiliation(s)
- Jiayin Gao
- Sophia University, Tokyo, Japan
- Japan Society for the Promotion of Science, Tokyo, Japan
- Laboratoire de Phonétique et Phonologie, CNRS-Université Paris 3, France
- Laboratoire Langues et Civilisations à Tradition Orale, CNRS-Université Paris 3, France
| | - Pierre Hallé
- Laboratoire de Phonétique et Phonologie, CNRS-Université Paris 3, France
- Laboratoire Mémoire et Cognition, INSERM-Université Paris 5, France
| | - Christoph Draxler
- Institute of Phonetics and Speech Processing, Ludwig-Maximilian University, Munich, Germany
| |
|
20
|
Zhang Y, Kirby J. The role of F0 and phonation cues in Cantonese low tone perception. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2020; 148:EL40. [PMID: 32752729 DOI: 10.1121/10.0001523] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2020] [Accepted: 06/17/2020] [Indexed: 06/11/2023]
Abstract
For languages that primarily exploit F0 to signal tonal contrast, the role of phonation cues in tonal perception remains controversial. This study revisits the use of F0 and phonation cues in Cantonese low tone perception (tone 4, 21/tone 6, 22) using synthesized stimuli. In line with previous studies, F0 contour and height were found to be the most salient cues, with F0 height being more important. The effects of non-modal phonation (creaky and breathy voice) were relatively small. Non-modal phonation enhanced low tone perception only in the low F0 range. The results are consistent with the differential integration hypothesis that the perceptual role of phonation is dependent on F0 and that phonation cues integrate with F0 differently depending on F0 height.
Affiliation(s)
- Yubin Zhang
- Department of Linguistics, University of Southern California, Los Angeles, California 90089, USA
| | - James Kirby
- Department of Linguistics and English Language, University of Edinburgh, Edinburgh, EH8 9AD, United Kingdom
| |
|
21
|
Lai W, Kuang J. The effect of speaker gender on Cantonese tone perception. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2020; 147:4119. [PMID: 32611181 DOI: 10.1121/10.0001411] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/30/2019] [Accepted: 05/27/2020] [Indexed: 06/11/2023]
Abstract
This paper presents three experiments on the integration of speaker gender cues in Cantonese tone perception. Experiment 1 compared tone identification of F0-matched stimuli between different gender voices and showed that listeners tended to hear lower tones for stimuli with female-sounding voices and higher tones for stimuli with male-sounding voices. Experiment 2 investigated whether a similar voice gender normalization effect would occur in pitch perception. The results showed that unlike tone categorization shifting with voice gender systematically, voice gender interfered with pitch perception in listener-specific ways. In particular, musicians who were not affected by voice gender in pitch perception still showed a tone boundary shift induced by voice gender. Experiment 3 evaluated the influence of non-voice gender cues on tone identification with the guises of gendered names. The result shows that gendered names barely induced any shift on their own as guises of an identical set of gender-ambiguous stimuli; however, gendered names enhanced the shift when patterned with gender-prototypical voices of their gender. These findings support an additional phonological normalization process on top of psychoacoustic sensation. They also suggest that speaker normalization involves fine-grained processing of rich social cues conveyed by acoustic signals rather than merely abstract social labels.
Affiliation(s)
- Wei Lai
- Department of Linguistics, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
| | - Jianjing Kuang
- Department of Linguistics, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
| |
|
22
|
Shaw JA, Tyler MD. Effects of vowel coproduction on the timecourse of tone recognition. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2020; 147:2511. [PMID: 32359304 DOI: 10.1121/10.0001103] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 06/10/2019] [Accepted: 03/30/2020] [Indexed: 06/11/2023]
Abstract
Vowel contrasts tend to be perceived independently of pitch modulation, but it is not known whether pitch can be perceived independently of vowel quality. This issue was investigated in the context of a lexical tone language, Mandarin Chinese, using a printed word version of the visual world paradigm. Eye movements to four printed words were tracked while listeners heard target words that differed from competitors only in tone (test condition) or also in onset consonant and vowel (control condition). Results showed that the timecourse of tone recognition is influenced by vowel quality for high, low, and rising tones. For these tones, the time for the eyes to converge on the target word in the test condition (relative to control) depended on the vowel with which the tone was coarticulated, with /a/ and /i/ supporting faster recognition of high, low, and rising tones than /u/. These patterns are consistent with the hypothesis that tone-conditioned variation in the articulation of /a/ and /i/ facilitates rapid recognition of tones. The one exception to this general pattern (no effect of vowel quality on falling tone perception) may be due to fortuitous amplification of the harmonics relevant for pitch perception in this context.
Affiliation(s)
- Jason A Shaw
- Department of Linguistics, Yale University, Dow Hall, New Haven, Connecticut 06511, USA
| | - Michael D Tyler
- School of Psychology, Western Sydney University, Penrith, New South Wales 2751, Australia
| |
|
23
|
Kelterer A, Schuppler B. Phonation type contrasts and tone in Chichimec. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2020; 147:3043. [PMID: 32359325 DOI: 10.1121/10.0001015] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/03/2019] [Accepted: 03/16/2020] [Indexed: 06/11/2023]
Abstract
Chichimec (Otomanguean) has two tones, high and low, and a phonological three-way phonation contrast: modal /V/, breathy /V̤/, and creaky /V̰/. Tone and phonation type contrasts are used independently. This paper investigates the acoustic realization of modal, breathy, and creaky vowels; the timing of phonation in non-modal vowels; and the production of tone in combination with different phonation types. The results of cepstral peak prominence and three spectral tilt measures showed that phonation type contrasts are not distinguished by the same acoustic measures for women and men. In line with expectations for laryngeally complex languages, phonetic modal and non-modal phonation are sequenced in phonological breathy and creaky vowels. With respect to the timing pattern, however, the results show that non-modal phonation is not, as previously reported, mainly located in the middle of the vowel. Non-modal phonation is, instead, predominantly realized in the second half of phonological breathy and creaky vowels. Tone is distinguished in all three phonation types, and non-modal vowels do not exhibit distinct F0 ranges except for creaky vowels produced by women in which F0 declines in the creaky portion. The results of the acoustic analysis provide additional insights to phonological accounts of laryngeal complexity in Chichimec.
Affiliation(s)
- Anneliese Kelterer
- Department of Linguistics, University of Graz, Merangasse 70/III, 8010 Graz, Austria
| | - Barbara Schuppler
- Signal Processing and Speech Communication Laboratory, Graz University of Technology, Inffeldgasse 16c, 8010 Graz, Austria
| |
|
24
|
Huang Y. Different attributes of creaky voice distinctly affect Mandarin tonal perception. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2020; 147:1441. [PMID: 32237818 DOI: 10.1121/10.0000721] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/27/2019] [Accepted: 01/27/2020] [Indexed: 06/11/2023]
Abstract
Previous work has shown mixed findings concerning the role of voice quality cues in Mandarin tones, with some studies showing that creak improves identification. This study tests the linguistic importance of acoustic properties of creak for Mandarin tone perception. Mandarin speakers identified tones with four resynthesized creak manipulations: low spectral tilt, irregular F0, period doubling, and extra-low F0. Two experiments with three conditions were conducted. In Experiment 1, the manipulations were confined to a portion of the stimuli's duration; in Experiment 2 the creak manipulations were modified and lengthened throughout the stimuli, and in a second condition, noise was incorporated to weaken F0 cues. Listeners remained most sensitive to extra-low F0, which affected identification of the four tones differently: it improved the identification accuracy of Tone 3 and hindered that of Tones 1 and 4. Irregular F0 consistently hindered T1 identification. The effects of irregular F0, period doubling, and low spectral tilt emerged in Experiment 2, where F0 cues were less robust and creak cues were stronger. Thus, low F0 is the most prominent cue used in Mandarin tone identification, but other voice quality cues become more salient to listeners when the F0 cues are less retrievable.
Affiliation(s)
- Yaqian Huang
- Department of Linguistics, University of California San Diego, 9500 Gilman Drive #0108, La Jolla, California 92093-0108, USA
| |
|
25
|
Mok PPK, Li VG, Fung HSH. Development of Phonetic Contrasts in Cantonese Tone Acquisition. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2020; 63:95-108. [PMID: 31944874 DOI: 10.1044/2019_jslhr-19-00152] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 06/10/2023]
Abstract
Purpose Previous studies showed both early and late acquisition of Cantonese tones based on transcription data using different criteria, but very little acoustic data were reported. Our study examined Cantonese tone acquisition using both transcription and acoustic data, illustrating the early and protracted aspects of Cantonese tone acquisition. Method One hundred fifty-nine Cantonese-speaking children aged between 2;1 and 6;0 (years;months) and 10 reference speakers participated in a tone production experiment based on picture naming. Natural production materials with 30 monosyllabic words were transcribed by two native judges. Acoustic measurements included overall tonal dispersion and specific contrasts between similar tone pairs: ratios of average fundamental frequency height for the level tones (T1, T3, T6), magnitude of rise and inflection point for the rising tones (T2, T5), magnitude of fall, H1*-H2*, and harmonic-to-noise ratio for the low tones (T4, T6). Auditory assessment of creakiness for T4 was also included. Results Children in the eldest group (aged 5;7-6;0) were still not completely adultlike in production accuracy, although two thirds of them had production accuracy over 90%. Children in all age groups had production accuracy significantly higher than chance level, and they could produce the major acoustic contrasts between specific tone pairs similarly as reference speakers. Fine phonetic detail of the inflection point and creakiness was more challenging for children. Conclusion Our findings illustrated the multifaceted aspects (both early and late) of Cantonese tone acquisition and called for a wider perspective on how to define successful phonological acquisition. Supplemental Material https://doi.org/10.23641/asha.11594853.
Affiliation(s)
- Peggy Pik Ki Mok
- Department of Linguistics and Modern Languages, The Chinese University of Hong Kong, Shatin
| | - Vivian Guo Li
- Department of Linguistics and Modern Languages, The Chinese University of Hong Kong, Shatin
| | - Holly Sze Ho Fung
- Department of Linguistics and Modern Languages, The Chinese University of Hong Kong, Shatin
| |
|
26
|
Cui A, Kuang J. The effects of musicality and language background on cue integration in pitch perception. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2019; 146:4086. [PMID: 31893734 DOI: 10.1121/1.5134442] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 06/07/2019] [Accepted: 10/28/2019] [Indexed: 06/10/2023]
Abstract
Pitch perception involves the processing of multidimensional acoustic cues, and listeners can exhibit different cue integration strategies in interpreting pitch. This study aims to examine whether musicality and language experience have effects on listeners' pitch perception strategies. Both Mandarin and English listeners were recruited to participate in two experiments: (1) a pitch classification experiment that tested their relative reliance on f0 and spectral cues, and (2) the Montreal Battery of Evaluation of Musical Abilities that objectively quantified their musical aptitude as continuous musicality scores. Overall, the results show a strong musicality effect: Listeners with higher musicality scores relied more on f0 in pitch perception, while listeners with lower musicality scores were more likely to attend to spectral cues. However, there were no effects of language experience on musicality scores or cue integration strategies in pitch perception. These results suggest that less musical or even amusic subjects may not suffer impairment in linguistic pitch processing due to the multidimensional nature of pitch cues.
Affiliation(s)
- Aletheia Cui
- Department of Linguistics, University of Pennsylvania, 3401-C Walnut Street, Suite 300, Philadelphia, Pennsylvania 19104, USA
| | - Jianjing Kuang
- Department of Linguistics, University of Pennsylvania, 3401-C Walnut Street, Suite 300, Philadelphia, Pennsylvania 19104, USA
| |
|
27
|
Kuang J, Tian J, Jiang B. The effect of vocal effort on contrastive voice quality in Shaoxing Wu. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2019; 146:EL272. [PMID: 31590508 DOI: 10.1121/1.5126120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2019] [Accepted: 08/29/2019] [Indexed: 06/10/2023]
Abstract
Voice quality varies at different levels of communication functions. In order to better understand the range of voice quality variation in normal speech, it is important to examine the interaction between global functions and local functions. This study investigates the effect of vocal effort on the contrastive voice quality in Shaoxing Wu. Results show that register contrasts are maintained in all vocal effort conditions, suggesting that the controls for global vs local functions are rather independent. However, the contrastivity of the registers is modulated by the vocal effort conditions, and the register contrasts are less well-defined in the loud and soft conditions.
Affiliation(s)
- Jianjing Kuang
- Department of Linguistics, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
| | - Jia Tian
- Department of Linguistics, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA
| | - Bing'er Jiang
- Department of Linguistics, McGill University, Montreal, Quebec H3A 1A7
| |
|
28
|
Garellek M. Acoustic Discriminability of the Complex Phonation System in !Xóõ. PHONETICA 2019; 77:131-160. [PMID: 30739113 DOI: 10.1159/000494301] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 10/20/2017] [Accepted: 10/02/2018] [Indexed: 06/09/2023]
Abstract
Phonation types, or contrastive voice qualities, are minimally produced using complex movements of the vocal folds, but may additionally involve constriction in the supraglottal and pharyngeal cavities. These complex articulations in turn produce a multidimensional acoustic output that can be modeled in various ways. In this study, I investigate whether the psychoacoustic model of voice by Kreiman et al. (2014) succeeds at distinguishing six phonation types of !Xóõ. Linear discriminant analysis is performed using parameters from the model averaged over the entire vowel as well as for the first and final halves of the vowel. The results indicate very high classification accuracy for all phonation types. Measures averaged over the vowel's entire duration are closely correlated with the discriminant functions, suggesting that they are sufficient for distinguishing even dynamic phonation types. Measures from all classes of parameters are correlated with the linear discriminant functions; in particular, the "strident" vowels, which are harsh in quality, are characterized by their noise, changes in spectral tilt, decrease in voicing amplitude and frequency, and raising of the first formant. Despite the large number of contrasts and the time-varying characteristics of many of the phonation types, the phonation contrasts in !Xóõ remain well differentiated acoustically.
Affiliation(s)
- Marc Garellek
- Department of Linguistics, University of California, San Diego, San Diego, California, USA
| |
|
29
|
Kuang J, Liberman M. Integrating Voice Quality Cues in the Pitch Perception of Speech and Non-speech Utterances. Front Psychol 2018; 9:2147. [PMID: 30555365 PMCID: PMC6281971 DOI: 10.3389/fpsyg.2018.02147] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 03/09/2018] [Accepted: 10/18/2018] [Indexed: 11/13/2022]
Abstract
Pitch perception plays a crucial role in speech processing. Since F0 is highly ambiguous and variable in the speech signal, effective pitch-range perception is important in perceiving the intended linguistic pitch targets. This study argues that the effectiveness of pitch-range perception can be achieved by taking advantage of other signal-internal information that co-varies with F0, such as voice quality cues. This study provides direct perceptual evidence that voice quality cues as an indicator of pitch ranges can effectively affect the pitch-height perception. A series of forced-choice pitch classification experiments with four spectral conditions were conducted to investigate the degree to which manipulating spectral slope affects pitch-height perception. Both non-speech and speech stimuli were investigated. The results suggest that the pitch classification function is significantly shifted under different spectral conditions. Listeners are likely to perceive a higher pitch when the spectrum has higher high-frequency energy (i.e., tenser phonation). The direction of the shift is consistent with the correlation between voice quality and pitch range. Moreover, cue integration is affected by the speech mode, where listeners are more sensitive to relative difference within an utterance when hearing speech stimuli. This study generally supports the hypothesis that voice quality is an important enhancement cue for pitch range.
Affiliation(s)
- Jianjing Kuang
- Department of Linguistics, University of Pennsylvania, Philadelphia, PA, United States
| | | |
|
30
|
Zheng A, Hirata Y, Kelly SD. Exploring the Effects of Imitating Hand Gestures and Head Nods on L1 and L2 Mandarin Tone Production. JOURNAL OF SPEECH, LANGUAGE, AND HEARING RESEARCH : JSLHR 2018; 61:2179-2195. [PMID: 30193334 DOI: 10.1044/2018_jslhr-s-17-0481] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 12/26/2017] [Accepted: 05/07/2018] [Indexed: 06/08/2023]
Abstract
PURPOSE This study investigated the impact of metaphoric actions (head nods and hand gestures) in producing Mandarin tones for first language (L1) and second language (L2) speakers. METHOD In 2 experiments, participants imitated videos of Mandarin tones produced under 3 conditions: (a) speech alone, (b) speech + head nods, and (c) speech + hand gestures. Fundamental frequency was recorded for both L1 (Experiment 1) and L2 (Experiment 2a) speakers, and the output of the L2 speakers was rated for tonal accuracy by 7 native Mandarin judges (Experiment 2b). RESULTS Experiment 1 showed that 12 L1 speakers' fundamental frequency spectral data did not differ among the 3 conditions. In Experiment 2a, the conditions did not affect the production of 24 English speakers for the most part, but there was some evidence that hand gestures helped Tone 4. In Experiment 2b, native Mandarin judges found limited conditional differences in L2 productions, with Tone 3 showing a slight head nods benefit in a subset of "correct" L2 tokens. CONCLUSION Results suggest that metaphoric bodily actions do not influence the lowest levels of L1 speech production in a tonal language and may play a very modest role during preliminary L2 learning.
Affiliation(s)
- Annie Zheng
- Department of Neuroscience, Washington University, St. Louis, MO
- Center for Language and Brain, Colgate University, Hamilton, NY
| | - Yukari Hirata
- Department of East Asian Languages and Literatures, Colgate University, Hamilton, NY
- Center for Language and Brain, Colgate University, Hamilton, NY
| | - Spencer D Kelly
- Department of Psychological and Brain Sciences, Neuroscience Program, Colgate University, Hamilton, NY
- Center for Language and Brain, Colgate University, Hamilton, NY
31
Zhang Z. Vocal instabilities in a three-dimensional body-cover phonation model. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2018; 144:1216. [PMID: 30424612 PMCID: PMC6128715 DOI: 10.1121/1.5053116] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 06/04/2018] [Revised: 08/17/2018] [Accepted: 08/20/2018] [Indexed: 05/08/2023]
Abstract
The goal of this study is to identify vocal fold conditions that produce irregular vocal fold vibration and the underlying physical mechanisms. Using a three-dimensional computational model of phonation, parametric simulations are performed with co-variations in vocal fold geometry, stiffness, and vocal tract shape. For each simulation, the cycle-to-cycle variations in the amplitude and period of the glottal area function are calculated, based on which the voice is classified into three types corresponding to regular, quasi-steady or subharmonic, and chaotic phonation. The results show that vocal folds with a large medial surface vertical thickness and low transverse stiffness are more likely to exhibit irregular vocal fold vibration when tightly approximated and subject to high subglottal pressure. Transition from regular vocal fold vibration to vocal instabilities is often accompanied by energy redistribution among the first few vocal fold eigenmodes, presumably due to nonlinear interaction between eigenmodes during vocal fold contact. The presence of a vocal tract may suppress such contact-related vocal instabilities, but also induce new instabilities, particularly for less constricted vocal fold conditions, almost doubling the number of vocal fold conditions producing irregular vibration.
Affiliation(s)
- Zhaoyan Zhang
- Department of Head and Neck Surgery, University of California, Los Angeles, 31-24 Rehabilitation Center, 1000 Veteran Avenue, Los Angeles, California 90095-1794, USA
32
Kuang J. The influence of tonal categories and prosodic boundaries on the creakiness in Mandarin. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2018; 143:EL509. [PMID: 29960425 DOI: 10.1121/1.5043094] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 06/08/2023]
Abstract
This study examines the distribution of creaky voice as a function of various prosodic structures in a large-scale corpus of continuous speech of Mandarin. Both tonal categories and prosodic boundaries have strong effects on the likelihood of creak and relative creakiness. It was found that (1) creaky voice in Mandarin is indeed largely driven by the occurrence of low pitch and weakening; (2) Tone 3 sandhi and Tone 2 are different in both pitch and voice quality; (3) the creakiness of Tone 3 (low tone) and the neutral tone (weakening) is realized differently. Moreover, females and males do not differ in how they use creaky voice linguistically.
Affiliation(s)
- Jianjing Kuang
- Department of Linguistics, University of Pennsylvania, 3401 Walnut Street, Philadelphia, Pennsylvania 19104, USA