1
Steele CM, Mancopes R, Barrett E, Panes V, Peladeau-Pigeon M, Simmons MM, Smaoui S. Preliminary Exploration of Variations in Measures of Pharyngeal Area During Nonswallowing Tasks. J Speech Lang Hear Res 2024;67:4304-4313. PMID: 39467167. DOI: 10.1044/2024_jslhr-24-00418.
Abstract
PURPOSE Age- and disease-related changes in oropharyngeal anatomy and physiology may be identified through quantitative videofluoroscopic measures of pharyngeal area and dynamics. Pixel-based measures of nonconstricted pharyngeal area (PhAR) are typically taken during oral bolus hold tasks or on postswallow rest frames. A recent study in 87 healthy adults reported a mean postswallow PhAR of 62%(C2-4)² (range: 25%-135%) and significantly larger PhAR in males. The fact that measures were taken after initial bolus swallows, without controlling for the presence of subsequent clearing swallows, was identified as a potential source of variation. A subset of study participants had completed a protocol including additional static nonswallowing tasks, enabling us to explore variability across those tasks while taking sex differences into account. METHOD Videofluoroscopy still shots were analyzed for 20 healthy adults (10 males, 10 females; mean age = 26 years) in head-neutral, chin-down, and chin-up positions, during a sustained /a/ vowel vocalization, and during oral bolus hold tasks (1 cc, 5 cc). Trained raters used ImageJ software to measure PhAR in %(C2-4)² units. Measures were compared to previously reported mean postswallow PhAR for the same participants in three steps: (a) exploration of sex differences; (b) pairwise linear mixed-model analyses of variance (ANOVAs) of PhAR for each nonswallowing task versus postswallow measures, controlling for sex; and (c) a combined mixed-model ANOVA to confirm comparability of the subset of tasks showing no significant differences from postswallow measures in step (b). RESULTS Overall, PhAR measures were significantly larger in male participants; however, most pairwise task comparisons did not differ by sex. No significant differences from postswallow measures were seen for the 5-cc bolus hold, the chin-down and chin-up postures, and the second (but not the first) of two repeated head-neutral still shots. PhAR during a 5-cc bolus hold was most similar to postswallow measures: mean ± standard deviation of 51 ± 13%(C2-4)² in females and 64 ± 16%(C2-4)² in males. CONCLUSIONS PhAR is larger in men than in women. Oral bolus hold tasks with a 5-cc liquid bolus yield measures similar to those obtained from postswallow rest frames.
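As a quick illustration of the %(C2-4)² unit used above: a pixel-based area is expressed relative to the squared C2-C4 vertebral distance measured in the same image, which cancels the pixel-to-millimeter scale factor. A minimal sketch; the function name and the example numbers are illustrative, not taken from the study:

```python
def phar_percent(pharyngeal_area_px: float, c2_c4_distance_px: float) -> float:
    """Normalize a pixel-based pharyngeal area to %(C2-4)^2 units.

    Dividing by the squared C2-C4 distance cancels the image scale,
    so measures are comparable across participants and zoom settings.
    """
    return 100.0 * pharyngeal_area_px / (c2_c4_distance_px ** 2)

# Invented example: a 5000 px^2 area with a 90 px C2-C4 distance
print(round(phar_percent(5000.0, 90.0), 1))  # ~61.7, near the reported mean of 62
```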
Affiliation(s)
- Catriona M Steele
- Kite Research Institute, University Health Network, Toronto, Ontario, Canada
- Rehabilitation Sciences Institute, Temerty Faculty of Medicine, University of Toronto, Ontario, Canada
- Canada Research Chair in Swallowing and Food Oral Processing, Canada Research Chairs Secretariat, Ottawa, Ontario
- Renata Mancopes
- Kite Research Institute, University Health Network, Toronto, Ontario, Canada
- Emily Barrett
- Kite Research Institute, University Health Network, Toronto, Ontario, Canada
- Vanessa Panes
- Kite Research Institute, University Health Network, Toronto, Ontario, Canada
- Michelle M Simmons
- Kite Research Institute, University Health Network, Toronto, Ontario, Canada
- Sana Smaoui
- Kite Research Institute, University Health Network, Toronto, Ontario, Canada
- Rehabilitation Sciences Institute, Temerty Faculty of Medicine, University of Toronto, Ontario, Canada
- Department of Hearing and Speech Sciences, Faculty of Allied Health Sciences, Health Sciences Center, Kuwait University
2
Parrell B, Niziolek CA, Chen T. Sensorimotor adaptation to a nonuniform formant perturbation generalizes to untrained vowels. J Neurophysiol 2024;132:1437-1444. PMID: 39356074. DOI: 10.1152/jn.00240.2024.
Abstract
When speakers learn to change the way they produce a speech sound, how much does that learning generalize to other speech sounds? Past studies of speech sensorimotor learning have typically tested the generalization of a single transformation learned in a single context. Here, we investigate the ability of the speech motor system to generalize learning when multiple opposing sensorimotor transformations are learned in separate regions of the vowel space. We find that speakers adapt to a nonuniform "centralization" perturbation, learning to produce vowels with greater acoustic contrast, and that this adaptation generalizes to untrained vowels, which pattern like neighboring trained vowels and show increased contrast of a similar magnitude.
NEW & NOTEWORTHY We show that sensorimotor adaptation of vowels at the edges of the articulatory working space generalizes to intermediate vowels through local transfer of learning from adjacent vowels. These results extend findings on the locality of sensorimotor learning from upper limb control to speech, a complex task with an opaque and nonlinear transformation between motor actions and sensory consequences. Our results also suggest that our paradigm has potential to drive behaviorally relevant changes that improve communication effectiveness.
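One way to picture the local transfer described above is to model generalization at an untrained vowel as a distance-weighted mixture of the shifts learned at trained vowels, so nearer trained vowels dominate. The sketch below is a hypothetical illustration of that idea, not the authors' analysis; the vowel positions, shift magnitudes, and kernel width are all invented:

```python
import math

def predicted_generalization(untrained_f1, trained, sigma=150.0):
    """Distance-weighted transfer of learned formant shifts (Hz).

    trained: list of (f1_hz, learned_shift_hz) for trained vowels.
    A Gaussian kernel over F1 distance makes an untrained vowel
    pattern like its nearest trained neighbors.
    """
    weights = [math.exp(-((untrained_f1 - f1) ** 2) / (2 * sigma ** 2))
               for f1, _ in trained]
    total = sum(weights)
    return sum(w * shift for w, (_, shift) in zip(weights, trained)) / total

# Invented example: opposing shifts learned at an /i/-like region
# (300 Hz F1) and an /a/-like region (750 Hz F1); an untrained vowel
# at 650 Hz inherits mostly from its nearer neighbor.
print(round(predicted_generalization(650.0, [(300.0, -60.0), (750.0, 80.0)]), 1))
```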
Affiliation(s)
- Benjamin Parrell
- University of Wisconsin-Madison, Madison, Wisconsin, United States
- Taijing Chen
- University of Wisconsin-Madison, Madison, Wisconsin, United States
3
Kim KS, Gaines JL, Parrell B, Ramanarayanan V, Nagarajan SS, Houde JF. Mechanisms of sensorimotor adaptation in a hierarchical state feedback control model of speech. PLoS Comput Biol 2023;19:e1011244. PMID: 37506120. PMCID: PMC10434967. DOI: 10.1371/journal.pcbi.1011244.
Abstract
Upon perceiving sensory errors during movements, the human sensorimotor system updates future movements to compensate for the errors, a phenomenon called sensorimotor adaptation. One component of this adaptation is thought to be driven by sensory prediction errors: discrepancies between predicted and actual sensory feedback. However, the mechanisms by which prediction errors drive adaptation remain unclear. Here, auditory prediction error-based mechanisms involved in speech auditory-motor adaptation were examined via the Feedback Aware Control of Tasks in Speech (FACTS) model. Consistent with theoretical perspectives in both non-speech and speech motor control, the hierarchical architecture of FACTS relies on both higher-level task (vocal tract constriction) and lower-level articulatory state representations. Importantly, FACTS also computes sensory prediction errors as part of its state feedback control mechanism, a well-established framework in the field of motor control. We explored potential adaptation mechanisms and found that adaptive behavior was present only when prediction errors updated the articulatory-to-task state transformation. In contrast, designs in which prediction errors updated forward sensory prediction models alone did not generate adaptation. Thus, FACTS demonstrated that (1) prediction errors can drive adaptation through task-level updates, and (2) adaptation is likely driven by updates to task-level control rather than (only) to forward predictive models. Additionally, simulating adaptation with FACTS generated a number of important hypotheses regarding previously reported phenomena, such as the source(s) of incomplete adaptation and the driving factor(s) for changes in the second formant frequency during adaptation to a first formant perturbation. The proposed model design paves the way for a hierarchical state feedback control framework to be examined in the context of sensorimotor adaptation in both speech and non-speech effector systems.
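The qualitative signature of error-driven adaptation described above can be caricatured with a scalar learning loop: a mismatch between the feedback the speaker expects and the feedback actually heard updates an internal compensation term, and produced output drifts opposite the perturbation. This is a toy sketch with invented numbers, not the FACTS model itself:

```python
def simulate_adaptation(target=1000.0, perturb=100.0, lr=0.2, trials=30):
    """Toy scalar illustration of error-driven speech adaptation.

    The speaker plans a command using a learned compensation term;
    a perturbation shifts the auditory feedback; the mismatch between
    expected (target) and heard feedback updates the compensation.
    """
    compensation = 0.0
    heard_trace = []
    for _ in range(trials):
        command = target - compensation   # plan with current compensation
        heard = command + perturb         # perturbed auditory feedback
        error = heard - target            # expected vs. actual feedback
        compensation += lr * error        # update internal term
        heard_trace.append(heard)
    return heard_trace

trace = simulate_adaptation()
# First trial is fully perturbed (1100 Hz); later trials drift back
# toward the 1000 Hz target as compensation accumulates.
print(round(trace[0], 1), round(trace[-1], 1))
```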
Affiliation(s)
- Kwang S. Kim
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, Indiana, United States of America
- Jessica L. Gaines
- Graduate Program in Bioengineering, University of California Berkeley-University of California San Francisco, San Francisco, California, United States of America
- Benjamin Parrell
- Department of Communication Sciences and Disorders, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
- Vikram Ramanarayanan
- Department of Otolaryngology-Head and Neck Surgery, University of California San Francisco, San Francisco, California, United States of America
- Modality.AI, San Francisco, California, United States of America
- Srikantan S. Nagarajan
- Department of Radiology and Biomedical Imaging, University of California San Francisco, San Francisco, California, United States of America
- John F. Houde
- Department of Otolaryngology-Head and Neck Surgery, University of California San Francisco, San Francisco, California, United States of America
4
Parrell B, Ramanarayanan V, Nagarajan S, Houde J. The FACTS model of speech motor control: Fusing state estimation and task-based control. PLoS Comput Biol 2019;15:e1007321. PMID: 31479444. PMCID: PMC6743785. DOI: 10.1371/journal.pcbi.1007321.
Abstract
We present a new computational model of speech motor control: the Feedback-Aware Control of Tasks in Speech (FACTS) model. FACTS employs a hierarchical state feedback control architecture to control a simulated vocal tract and produce intelligible speech. The model includes higher-level control of speech tasks and lower-level control of speech articulators. The task controller is modeled as a dynamical system governing the creation of desired constrictions in the vocal tract, after Task Dynamics. Both the task and articulatory controllers rely on an internal estimate of the current state of the vocal tract to generate motor commands. This estimate is derived, based on efference copy of applied controls, from a forward model that predicts both the next vocal tract state and the expected auditory and somatosensory feedback. A comparison between predicted and actual feedback is then used to update the internal state prediction. FACTS qualitatively replicates many characteristics of the human speech system: the model is robust to noise in both the sensory and motor pathways, is relatively unaffected by a loss of auditory feedback but is more significantly impacted by a loss of somatosensory feedback, and responds appropriately to externally imposed alterations of auditory and somatosensory feedback. The model also replicates previously hypothesized trade-offs between reliance on auditory and somatosensory feedback and shows, for the first time, how this relationship may be mediated by acuity in each sensory domain. These results have important implications for our understanding of the speech motor control system in humans.
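The state-estimation core described above, predicting the next state from an efference copy of the control and then correcting with the sensory prediction error, can be sketched in a few lines. All gains and dynamics here are invented scalar placeholders, not FACTS parameters:

```python
def observer_step(x_est, u, y_meas, a=0.9, b=1.0, k=0.3):
    """One predict-correct update of a scalar vocal tract state estimate.

    Predict from the efference copy of control u via a forward model,
    then correct with the discrepancy between measured and predicted
    sensory feedback.
    """
    x_pred = a * x_est + b * u             # forward model: predict next state
    y_pred = x_pred                        # predicted sensory feedback
    return x_pred + k * (y_meas - y_pred)  # correct with prediction error

# Even with a constant feedback offset, the estimate tracks the true state.
x_true, x_est = 0.0, 0.0
for u in [1.0, 1.0, 1.0]:
    x_true = 0.9 * x_true + u              # true (simulated) plant
    x_est = observer_step(x_est, u, y_meas=x_true + 0.05)
print(round(x_true, 3), round(x_est, 3))
```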
Affiliation(s)
- Benjamin Parrell
- Department of Communication Sciences and Disorders, University of Wisconsin–Madison, Madison, Wisconsin, United States of America
- Vikram Ramanarayanan
- Department of Otolaryngology - Head and Neck Surgery, University of California, San Francisco, San Francisco, California, United States of America
- Educational Testing Service R&D, San Francisco, California, United States of America
- Srikantan Nagarajan
- Department of Otolaryngology - Head and Neck Surgery, University of California, San Francisco, San Francisco, California, United States of America
- Department of Radiology and Biomedical Imaging, University of California, San Francisco, San Francisco, California, United States of America
- John Houde
- Department of Otolaryngology - Head and Neck Surgery, University of California, San Francisco, San Francisco, California, United States of America
5
Masapollo M, Polka L, Ménard L, Franklin L, Tiede M, Morgan J. Asymmetries in unimodal visual vowel perception: The roles of oral-facial kinematics, orientation, and configuration. J Exp Psychol Hum Percept Perform 2018;44:1103-1118. PMID: 29517257. PMCID: PMC6037555. DOI: 10.1037/xhp0000518.
Abstract
Masapollo, Polka, and Ménard (2017) recently reported a robust directional asymmetry in unimodal visual vowel perception: Adult perceivers discriminate a change from an English /u/ viseme to a French /u/ viseme significantly better than a change in the reverse direction. This asymmetry replicates a frequent pattern found in unimodal auditory vowel perception that points to a universal bias favoring more extreme vocalic articulations, which lead to acoustic signals with increased formant convergence. In the present article, the authors report 5 experiments designed to investigate whether this asymmetry in the visual realm reflects a speech-specific or general processing bias. They successfully replicated the directional effect using Masapollo et al.'s dynamically articulating faces but failed to replicate the effect when the faces were shown under static conditions. Asymmetries also emerged during discrimination of canonically oriented point-light stimuli that retained the kinematics and configuration of the articulating mouth. In contrast, no asymmetries emerged during discrimination of rotated point-light stimuli or Lissajous patterns that retained the kinematics, but not the canonical orientation or spatial configuration, of the labial gestures. These findings suggest that the perceptual processes underlying asymmetries in unimodal visual vowel discrimination are sensitive to speech-specific motion and configural properties and raise foundational questions concerning the role of specialized and general processes in vowel perception.
Affiliation(s)
- Matthew Masapollo
- Brown University
- McGill University
- Centre for Research on Brain, Language, and Music
- Linda Polka
- McGill University
- Centre for Research on Brain, Language, and Music
- Lucie Ménard
- Centre for Research on Brain, Language, and Music
- University of Quebec at Montreal
6
Masapollo M, Polka L, Ménard L. A universal bias in adult vowel perception - By ear or by eye. Cognition 2017;166:358-370. PMID: 28601721. DOI: 10.1016/j.cognition.2017.06.001.
Abstract
Speech perceivers are universally biased toward "focal" vowels (i.e., vowels whose adjacent formants are close in frequency, which concentrates acoustic energy into a narrower spectral region). This bias is demonstrated in phonetic discrimination tasks as a directional asymmetry: a change from a relatively less to a relatively more focal vowel results in significantly better performance than a change in the reverse direction. We investigated whether the critical information for this directional effect is limited to the auditory modality, or whether visible articulatory information provided by the speaker's face also plays a role. Unimodal auditory and visual as well as bimodal (auditory-visual) vowel stimuli were created from video recordings of a speaker producing variants of /u/, differing in both their degree of focalization and visible lip rounding (i.e., lip compression and protrusion). In Experiment 1, we confirmed that subjects showed an asymmetry while discriminating the auditory vowel stimuli. We then found, in Experiment 2, a similar asymmetry when subjects lip-read those same vowels. In Experiment 3, we found asymmetries, comparable to those found for unimodal vowels, for bimodal vowels when the audio and visual channels were phonetically-congruent. In contrast, when the audio and visual channels were phonetically-incongruent (as in the "McGurk effect"), this asymmetry was disrupted. These findings collectively suggest that the perceptual processes underlying the "focal" vowel bias are sensitive to articulatory information available across sensory modalities, and raise foundational issues concerning the extent to which vowel perception derives from general-auditory or speech-gesture-specific processes.
Affiliation(s)
- Matthew Masapollo
- School of Communication Sciences and Disorders, McGill University, 2001 McGill College, 8th Floor, Montreal, QC H3A 1G1, Canada; Centre for Research on Brain, Language, and Music, McGill University, 3640 de la Montagne, Montreal, Quebec H3G 2A8, Canada.
- Linda Polka
- School of Communication Sciences and Disorders, McGill University, 2001 McGill College, 8th Floor, Montreal, QC H3A 1G1, Canada; Centre for Research on Brain, Language, and Music, McGill University, 3640 de la Montagne, Montreal, Quebec H3G 2A8, Canada
- Lucie Ménard
- Département de Linguistique, Université du Québec à Montréal, Pavillon J.-A. De sève, DS-4425, 320, Sainte-Catherine Est, Montréal, QC H2X 1L7, Canada; Centre for Research on Brain, Language, and Music, McGill University, 3640 de la Montagne, Montreal, Quebec H3G 2A8, Canada
7
Abstract
Carol Fowler has had a tremendous impact on the field of speech perception, in part by having people disagree with her. The disagreements arise, as they often do, from two incompatible sources: Her positions are often misunderstood and thus "disagreed" with only on the surface, and her positions are rejected because they challenge deeply held, intuitively appealing positions, without being shown to be wrong. The misunderstandings center largely on the assertion that perception is "direct." This is often taken to mean that we have access to the speaker's vocal tract by some means other than the (largely acoustic) speech signal, when, in fact, it asserts that the signal is sufficient to directly specify that production. It is unclear why this misunderstanding persists; while there are still issues to be resolved in this regard, the stance is clear. The challenge to "acoustic" theories of speech perception remains, and thus direct perception is still controversial, as it seems that acoustic theories are held by a majority of researchers. Decades' worth of evidence showing the lack of usefulness of purely acoustic properties and the coherence gained by a production perspective have not changed this situation. Some attempts at combining the two perspectives have emerged, but they largely miss the Gibsonian challenge that Fowler has espoused: Perception of speech is direct. It looks as though it will take some further decades of research and discussion to fully explore her position.
Affiliation(s)
- D H Whalen
- City University of New York, Haskins Laboratories, Yale University
8
Iskarous K. Compatible Dynamical Models of Environmental, Sensory, and Perceptual Systems. Ecol Psychol 2016. DOI: 10.1080/10407413.2016.1230377.
9
Lammert AC, Narayanan SS. On Short-Time Estimation of Vocal Tract Length from Formant Frequencies. PLoS One 2015;10:e0132193. PMID: 26177102. PMCID: PMC4503663. DOI: 10.1371/journal.pone.0132193.
Abstract
Vocal tract length is highly variable across speakers and determines many aspects of the acoustic speech signal, making it an essential parameter to consider for explaining behavioral variability. A method for accurate estimation of vocal tract length from formant frequencies would afford normalization of interspeaker variability and facilitate acoustic comparisons across speakers. A framework for considering estimation methods is developed from the basic principles of vocal tract acoustics, and an estimation method is proposed that follows naturally from this framework. The proposed method is evaluated using acoustic characteristics of simulated vocal tracts ranging from 14 to 19 cm in length, as well as real-time magnetic resonance imaging data with synchronous audio from five speakers whose vocal tracts range from 14.5 to 18.0 cm in length. Evaluations show improvements in accuracy over previously proposed methods, with 0.631 and 1.277 cm root mean square error on simulated and human speech data, respectively. Empirical results show that the effectiveness of the proposed method is based on emphasizing higher formant frequencies, which seem less affected by speech articulation. Theoretical predictions of formant sensitivity reinforce this empirical finding. Moreover, theoretical insights are explained regarding the reason for differences in formant sensitivity.
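The uniform-tube relation behind such estimators, Fn = (2n-1)c/(4L) for a tube closed at the glottis and open at the lips, can be inverted per formant and the per-formant estimates combined. The sketch below uses a simple linear weighting that favors higher formants, in the spirit of the empirical finding above, but it is not the paper's exact estimator:

```python
SPEED_OF_SOUND_CM_S = 35000.0  # ~35,000 cm/s in warm, moist air

def vtl_from_formants(formants_hz, weights=None):
    """Estimate vocal tract length (cm) from formant frequencies.

    Each formant gives L = (2n-1)c/(4*Fn) under the uniform-tube
    model; the estimates are averaged with weights that emphasize
    higher formants, which are less affected by articulation.
    """
    if weights is None:
        weights = [n + 1 for n in range(len(formants_hz))]  # favor higher formants
    ests = [(2 * (n + 1) - 1) * SPEED_OF_SOUND_CM_S / (4 * f)
            for n, f in enumerate(formants_hz)]
    return sum(w * e for w, e in zip(weights, ests)) / sum(weights)

# Formants of an ideal 17.5 cm neutral tube: 500, 1500, 2500 Hz
print(round(vtl_from_formants([500.0, 1500.0, 2500.0]), 2))  # 17.5
```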
Affiliation(s)
- Adam C. Lammert
- Computer Science Department, Swarthmore College, Swarthmore, PA, United States of America
- Shrikanth S. Narayanan
- Signal Analysis and Interpretation Laboratory, University of Southern California, Los Angeles, CA, United States of America
10
Zourmand A, Mirhassani SM, Ting HN, Bux SI, Ng KH, Bilgen M, Jalaludin MA. A magnetic resonance imaging study on the articulatory and acoustic speech parameters of Malay vowels. Biomed Eng Online 2014;13:103. PMID: 25060583. PMCID: PMC4115483. DOI: 10.1186/1475-925x-13-103.
Abstract
The phonetic properties of six Malay vowels are investigated using magnetic resonance imaging (MRI) to visualize the vocal tract and obtain dynamic articulatory parameters during speech production. To resolve image blurring due to tongue movement during scanning, a method based on active contour extraction is used to track tongue contours. The proposed method efficiently tracks tongue contours despite partial blurring of the MRI images, so articulatory parameters can be measured as tongue movement is observed, and the specific shape and position of the tongue are determined for all six uttered Malay vowels. Speech rehabilitation procedures demand a visually perceivable prototype of speech articulation. To assess the validity of the measured articulatory parameters against the acoustic theory of speech production, an acoustic analysis of the vowels uttered by the subjects was performed. When the acoustic and articulatory parameters of the uttered speech were examined, a correlation between formant frequencies and articulatory parameters was observed: a positive correlation between the constriction location of the tongue body and the first formant frequency, and a negative correlation between the constriction location of the tongue tip and the second formant frequency. The results demonstrate that the proposed method is an effective tool for the dynamic study of speech production.
Affiliation(s)
- Hua-Nong Ting
- Biomedical Engineering Department, Faculty of Engineering, University of Malaya, Kuala Lumpur, Malaysia.
11
Viswanathan N, Magnuson JS, Fowler CA. Information for coarticulation: Static signal properties or formant dynamics? J Exp Psychol Hum Percept Perform 2014;40:1228-1236. PMID: 24730744. DOI: 10.1037/a0036214.
Abstract
Perception of a speech segment changes depending on properties of surrounding segments in a phenomenon called compensation for coarticulation (Mann, 1980). The nature of information that drives these perceptual changes is a matter of debate. One account attributes perceptual shifts to low-level auditory system contrast effects based on static portions of the signal (e.g., third formant [F3] center or average frequency; Lotto & Kluender, 1998). An alternative account is that listeners' perceptual shifts result from listeners attuning to the acoustic effects of gestural overlap and that this information for coarticulation is necessarily dynamic (Fowler, 2006). In a pair of experiments, we used sinewave speech precursors to investigate the nature of information for compensation for coarticulation. In Experiment 1, as expected by both accounts, we found that sinewave speech precursors produce shifts in following segments. In Experiment 2, we investigated whether effects in Experiment 1 were driven by static F3 offsets of sinewave speech precursors, or by dynamic relationships among their formants. We temporally reversed F1 and F2 in sinewave precursors, preserving static F3 offset and average F1, F2 and F3 frequencies, but disrupting dynamic formant relationships. Despite having identical F3s, selectively reversed precursors produced effects that were significantly smaller and restricted to only a small portion of the continuum. We conclude that dynamic formant relations rather than static properties of the precursor provide information for compensation for coarticulation.
12
Abstract
I discuss language forms as the primary means that language communities provide to enable public language use. As such, they are adapted to public use most notably in being linguistically significant vocal tract actions, not the categories in the mind as proposed in phonological theories. Their primary function is to serve as vehicles for production of syntactically structured sequences of words. However, more than that, phonological actions themselves do work in public language use. In particular, they foster interpersonal coordination in social activities. An intriguing property of language forms that likely reflects their emergence in social communicative activities is that phonological forms that should be meaningless (in order to serve their role in the openness of language at the level of the lexicon) are not wholly meaningless. In fact, the form-meaning "rift" is bridged bidirectionally: The smallest language forms are meaningful, and the meanings of lexical language forms generally inhere, in part, in their embodiment by understanders.
Affiliation(s)
- Carol A Fowler
- University of Connecticut, Storrs, CT 06269, and Haskins Laboratories, New Haven, CT 06511
13
Nam H, Mooshammer C, Iskarous K, Whalen DH. Hearing tongue loops: perceptual sensitivity to acoustic signatures of articulatory dynamics. J Acoust Soc Am 2013;134:3808-3817. PMID: 24180790. PMCID: PMC3829900. DOI: 10.1121/1.4824161.
Abstract
Previous work has shown that velar stops are produced with a forward movement during closure, forming a forward (anterior) loop for a VCV sequence, when the preceding vowels are back or mid. Are listeners aware of this aspect of articulatory dynamics? The current study used articulatory synthesis to examine how such kinematic patterns are reflected in the acoustics, and whether those acoustic patterns elicit different goodness ratings. In Experiment I, the size and direction of the loops were modulated in articulatory synthesis. The resulting stimuli were presented to listeners for a naturalness judgment. Results show that listeners rate forward loops as more natural than backward loops, in agreement with typical productions. Acoustic analysis of the synthetic stimuli shows that forward loops exhibit shorter and shallower VC transitions than CV transitions. In Experiment II, three acoustic parameters incorporating F3-F2 distance, transition slope, and transition length were employed to systematically modulate the magnitude of VC and CV transitions. Listeners rated naturalness in accord with the results of Experiment I. This study reveals that there is sufficient information in the acoustic signature of "velar loops" to affect perceptual preference. Similarity to typical productions, not acoustic distinctiveness, seemed to determine preferences.
Affiliation(s)
- Hosung Nam
- Haskins Laboratories, 300 George Street, New Haven, Connecticut 06511