301
Pichora-Fuller MK. Use of supportive context by younger and older adult listeners: Balancing bottom-up and top-down information processing. Int J Audiol 2008; 47 Suppl 2:S72-82. [DOI: 10.1080/14992020802307404]
302

303

304
Sidaras SK, Alexander JED, Nygaard LC. Perceptual learning of systematic variation in Spanish-accented speech. J Acoust Soc Am 2009; 125:3306-16. [PMID: 19425672] [PMCID: PMC2736743] [DOI: 10.1121/1.3101452]
Abstract
Spoken language is characterized by an enormous amount of variability in how linguistic segments are realized. In order to investigate how speech perceptual processes accommodate to multiple sources of variation, adult native speakers of American English were trained with English words or sentences produced by six Spanish-accented talkers. At test, listeners transcribed utterances produced by six familiar or unfamiliar Spanish-accented talkers. With only brief exposure, listeners perceptually adapted to accent-general regularities in spoken language, generalizing to novel accented words and sentences produced by unfamiliar accented speakers. Acoustic properties of vowel production and their relation to identification performance were assessed to determine if the English listeners were sensitive to systematic variation in the realization of accented vowels. Vowels that showed the most improvement after Spanish-accented training were distinct from nearby vowels in terms of their acoustic characteristics. These findings suggest that the speech perceptual system dynamically adjusts to the acoustic consequences of changes in a talker's voice and accent.
Affiliation(s)
- Sabrina K Sidaras
- Department of Psychology, Emory University, 532 Kilgo Circle, Atlanta, Georgia 30322, USA

305
Munhall KG, MacDonald EN, Byrne SK, Johnsrude I. Talkers alter vowel production in response to real-time formant perturbation even when instructed not to compensate. J Acoust Soc Am 2009; 125:384-90. [PMID: 19173425] [PMCID: PMC2658635] [DOI: 10.1121/1.3035829]
Abstract
Talkers show sensitivity to a range of perturbations of auditory feedback (e.g., manipulation of vocal amplitude, fundamental frequency and formant frequency). Here, 50 subjects spoke a monosyllable ("head"), and the formants in their speech were shifted in real time using a custom signal processing system that provided feedback over headphones. First and second formants were altered so that the auditory feedback matched subjects' production of "had." Three different instructions were tested: (1) control, in which subjects were naive about the feedback manipulation, (2) ignore headphones, in which subjects were told that their voice might sound different and to ignore what they heard in the headphones, and (3) avoid compensation, in which subjects were informed in detail about the manipulation and were told not to compensate. Despite explicit instruction to ignore the feedback changes, subjects produced a robust compensation in all conditions. There were no differences in the magnitudes of the first or second formant changes between groups. In general, subjects altered their vowel formant values in a direction opposite to the perturbation, as if to cancel its effects. These results suggest that compensation in the face of formant perturbation is relatively automatic, and the response is not easily modified by conscious strategy.
Affiliation(s)
- K G Munhall
- Department of Psychology, Queen's University, Humphrey Hall, Kingston, Ontario, Canada.

306
Gow DW, Segawa JA, Ahlfors SP, Lin FH. Lexical influences on speech perception: a Granger causality analysis of MEG and EEG source estimates. Neuroimage 2008; 43:614-23. [PMID: 18703146] [DOI: 10.1016/j.neuroimage.2008.07.027]
Abstract
Behavioral and functional imaging studies have demonstrated that lexical knowledge influences the categorization of perceptually ambiguous speech sounds. However, methodological and inferential constraints have so far prevented resolution of the question of whether this interaction takes the form of direct top-down influences on perceptual processing, or feedforward convergence during a decision process. We examined top-down lexical influences on the categorization of segments in a /s/-/ʃ/ continuum presented in different lexical contexts to produce a robust Ganong effect. Using integrated MEG/EEG and MRI data we found that, within a network identified by 40 Hz gamma phase locking, activation in the supramarginal gyrus associated with wordform representation influences phonetic processing in the posterior superior temporal gyrus during a period of time associated with lexical processing. This result provides direct evidence that lexical processes influence lower level phonetic perception, and demonstrates the potential value of combining Granger causality analyses and high spatiotemporal resolution multimodal imaging data to explore the functional architecture of cognition.
Affiliation(s)
- David W Gow
- Neuropsychology Laboratory, Massachusetts General Hospital, 175 Cambridge St., CPZ S340, Boston, MA 02114, USA.

307
Loebach JL, Bent T, Pisoni DB. Multiple routes to the perceptual learning of speech. J Acoust Soc Am 2008; 124:552-61. [PMID: 18646998] [PMCID: PMC2677329] [DOI: 10.1121/1.2931948]
Abstract
A listener's ability to utilize indexical information in the speech signal can enhance their performance on a variety of speech perception tasks. It is unclear, however, whether such information plays a similar role for spectrally reduced speech signals, such as those experienced by individuals with cochlear implants. The present study compared the effects of training on linguistic and indexical tasks when adapting to cochlear implant simulations. Listening to sentences processed with an eight-channel sinewave vocoder, three separate groups of subjects were trained on a transcription task (transcription), a talker identification task (talker ID), or a gender identification task (gender ID). Pre- to posttest comparisons demonstrated that training produced significant improvement for all groups. Moreover, subjects from the talker ID and transcription training groups performed similarly at posttest and generalization, and significantly better than the subjects from the gender ID training group. These results suggest that training on an indexical task that requires high levels of controlled attention can provide benefits equivalent to training on a linguistic task. When listeners selectively focus their attention on the extralinguistic information in the speech signal, they still extract linguistic information; the degree to which they do so, however, appears to be task dependent.
Affiliation(s)
- Jeremy L Loebach
- Speech Research Laboratory, Department of Psychological and Brain Sciences, Indiana University, Bloomington, Indiana 47405, USA.

308
Fridriksson J, Moss J, Davis B, Baylis GC, Bonilha L, Rorden C. Motor speech perception modulates the cortical language areas. Neuroimage 2008; 41:605-13. [PMID: 18396063] [PMCID: PMC11239083] [DOI: 10.1016/j.neuroimage.2008.02.046]
Abstract
Traditionally, the left frontal and parietal lobes have been associated with language production while regions in the temporal lobe are seen as crucial for language comprehension. However, recent evidence suggests that the classical language areas constitute an integrated network where each area plays a crucial role both in speech production and perception. We used functional MRI to examine whether observing speech motor movements (without auditory speech), relative to non-speech motor movements, preferentially activates the cortical speech areas. Furthermore, we tested whether the activation in these regions was modulated by task difficulty. This dissociates areas that are actively involved in speech perception from regions that show obligatory activation in response to speech movements (e.g. areas that automatically activate in preparation for a motoric response). Specifically, we hypothesized that regions involved with decoding oral speech would show increasing activation with increasing difficulty. We found that speech movements preferentially activate the frontal and temporal language areas. In contrast, non-speech movements preferentially activate the parietal region. Degraded speech stimuli increased both frontal and parietal lobe activity but did not differentially excite the temporal region. These findings suggest that the frontal language area plays a role in visual speech perception and highlight the differential roles of the classical speech and language areas in processing others' motor speech movements.
Affiliation(s)
- Julius Fridriksson
- Department of Communication Sciences and Disorders, University of South Carolina, USA.

309
Stacey PC, Summerfield AQ. Comparison of word-, sentence-, and phoneme-based training strategies in improving the perception of spectrally distorted speech. J Speech Lang Hear Res 2008; 51:526-38. [PMID: 18367694] [DOI: 10.1044/1092-4388(2008/038)]
Abstract
PURPOSE To compare the effectiveness of 3 self-administered strategies for auditory training that might improve speech perception by adult users of cochlear implants. The strategies are based, respectively, on discriminating isolated words, words in sentences, and phonemes in nonsense syllables. METHOD Participants were 18 normal-hearing adults who listened to speech processed by a noise-excited vocoder to simulate the information provided by a cochlear implant. They were assigned randomly to word-, sentence-, or phoneme-based training and underwent 9 training sessions (20 min each) on separate days over a 2- to 3-week period. The effectiveness of training was assessed as the improvement in accuracy of discriminating vowels and consonants, as well as identifying words in sentences, relative to participants' best performance in repeated tests prior to training. RESULTS Word- and sentence-based training led to significant improvements in the ability to identify words in sentences that were significantly larger than the improvements produced by phoneme-based training. There were no significant differences between the effectiveness of word- and sentence-based training. No significant improvements in consonant or vowel discrimination were found for the sentence- or phoneme-based training groups, but some improvements were found for the word-based training group. CONCLUSION The word- and sentence-based training strategies were more effective than the phoneme-based strategy at improving the perception of spectrally distorted speech.
310
Loebach JL, Pisoni DB. Perceptual learning of spectrally degraded speech and environmental sounds. J Acoust Soc Am 2008; 123:1126-39. [PMID: 18247913] [PMCID: PMC3304448] [DOI: 10.1121/1.2823453]
Abstract
Adaptation to the acoustic world following cochlear implantation does not typically include formal training or extensive audiological rehabilitation. Can cochlear implant (CI) users benefit from formal training, and if so, what type of training is best? This study used a pre-/posttest design to evaluate the efficacy of training and generalization of perceptual learning in normal hearing subjects listening to CI simulations (eight-channel sinewave vocoder). Five groups of subjects were trained on words (simple/complex), sentences (meaningful/anomalous), or environmental sounds, and then were tested using an open-set identification task. Subjects were trained on only one set of materials but were tested on all stimuli. All groups showed significant improvement due to training, which successfully generalized to some, but not all, stimulus materials. For easier tasks, all types of training generalized equally well. For more difficult tasks, training specificity was observed. Training on speech did not generalize to the recognition of environmental sounds; however, explicit training on environmental sounds successfully generalized to speech. These data demonstrate that the perceptual learning of degraded speech is highly context dependent, and that the type of training and the specific stimulus materials that a subject experiences during perceptual learning have a substantial impact on generalization to new materials.
Affiliation(s)
- Jeremy L Loebach
- Speech Research Laboratory, Department of Psychological and Brain Sciences, Indiana University, Bloomington, Indiana 47405, USA.

311
Hopkins K, Moore BCJ, Stone MA. Effects of moderate cochlear hearing loss on the ability to benefit from temporal fine structure information in speech. J Acoust Soc Am 2008; 123:1140-53. [PMID: 18247914] [PMCID: PMC2688774] [DOI: 10.1121/1.2824018]
Abstract
Speech reception thresholds (SRTs) were measured with a competing talker background for signals processed to contain variable amounts of temporal fine structure (TFS) information, using nine normal-hearing and nine hearing-impaired subjects. Signals (speech and background talker) were bandpass filtered into channels. Channel signals for channel numbers above a "cut-off channel" (CO) were vocoded to remove TFS information, while channel signals for channel numbers of CO and below were left unprocessed. Signals from all channels were combined. As a group, hearing-impaired subjects benefited less than normal-hearing subjects from the additional TFS information that was available as CO increased. The amount of benefit varied between hearing-impaired individuals, with some showing no improvement in SRT and one showing an improvement similar to that for normal-hearing subjects. The reduced ability to take advantage of TFS information in speech may partially explain why subjects with cochlear hearing loss get less benefit from listening in a fluctuating background than normal-hearing subjects. TFS information may be important in identifying the temporal "dips" in such a background.
Affiliation(s)
- Kathryn Hopkins
- Department of Experimental Psychology, University of Cambridge, Downing Street, Cambridge CB2 3EB, United Kingdom.

312
Bradlow AR, Bent T. Perceptual adaptation to non-native speech. Cognition 2008; 106:707-29. [PMID: 17532315] [PMCID: PMC2213510] [DOI: 10.1016/j.cognition.2007.04.005]
Abstract
This study investigated talker-dependent and talker-independent perceptual adaptation to foreign-accent English. Experiment 1 investigated talker-dependent adaptation by comparing native English listeners' recognition accuracy for Chinese-accented English across single and multiple talker presentation conditions. Results showed that the native listeners adapted to the foreign-accented speech over the course of the single talker presentation condition with some variation in the rate and extent of this adaptation depending on the baseline sentence intelligibility of the foreign-accented talker. Experiment 2 investigated talker-independent perceptual adaptation to Chinese-accented English by exposing native English listeners to Chinese-accented English and then testing their perception of English produced by a novel Chinese-accented talker. Results showed that, if exposed to multiple talkers of Chinese-accented English during training, native English listeners could achieve talker-independent adaptation to Chinese-accented English. Taken together, these findings provide evidence for highly flexible speech perception processes that can adapt to speech that deviates substantially from the pronunciation norms in the native talker community along multiple acoustic-phonetic dimensions.
Affiliation(s)
- Ann R Bradlow
- Department of Linguistics, Northwestern University, Evanston, IL 60208, USA.

313
Sheldon S, Pichora-Fuller MK, Schneider BA. Effect of age, presentation method, and learning on identification of noise-vocoded words. J Acoust Soc Am 2008; 123:476-88. [PMID: 18177175] [DOI: 10.1121/1.2805676]
Abstract
Noise vocoding was used to investigate the ability of younger and older adults with normal audiometric thresholds in the speech range to use amplitude envelope cues to identify words. In Experiment 1, four 50-word lists were tested, with each word presented initially with one frequency band and the number of bands being incremented until it was correctly identified by the listener. Both age groups required an average of 5.25 bands for 50% correct word identification, and performance improved across the four lists. In Experiment 2, the same participants who completed Experiment 1 identified words in four blocked noise-vocoded conditions (16, 8, 4, 2 bands). Compared to Experiment 1, both age groups required more bands to reach the 50% correct word identification threshold in Experiment 2 (6.13 and 8.55 bands, respectively), with younger adults outperforming older adults. Experiment 3 was identical to Experiment 2 except that the participants had no prior experience with noise-vocoded speech. Again, younger adults outperformed older adults, with thresholds of 6.67 and 8.97 bands, respectively. The finding of age effects in Experiments 2 and 3, but not in Experiment 1, seems more likely to be related to differences in the presentation methods than to experience with noise vocoding.
Affiliation(s)
- Signy Sheldon
- Department of Psychology, University of Toronto, 3359 Mississauga Road North, Mississauga, Ontario L5L 1C6, Canada

314
Sheldon S, Pichora-Fuller MK, Schneider BA. Priming and sentence context support listening to noise-vocoded speech by younger and older adults. J Acoust Soc Am 2008; 123:489-99. [PMID: 18177176] [DOI: 10.1121/1.2783762]
Abstract
Older adults are known to benefit from supportive context in order to compensate for age-related reductions in perceptual and cognitive processing, including when comprehending spoken language in adverse listening conditions. In the present study, we examine how younger and older adults benefit from two types of contextual support, predictability from sentence context and priming, when identifying target words in noise-vocoded sentences. In the first part of the experiment, benefit from context based on primarily semantic knowledge was evaluated by comparing the accuracy of identification of sentence-final target words that were either highly predictable or not predictable from the sentence context. In the second part of the experiment, benefit from priming was evaluated by comparing the accuracy of identification of target words when noise-vocoded sentences were either primed or not by the presentation of the sentence context without noise vocoding and with the target word replaced with white noise. Younger and older adults benefited from each type of supportive context, with the most benefit realized when both types were combined. Supportive context reduced the number of noise-vocoded bands needed for 50% word identification more for older adults than their younger counterparts.
Affiliation(s)
- Signy Sheldon
- Department of Psychology, University of Toronto, 3359 Mississauga Road N., Mississauga, Ontario L5L 1C6, Canada

315
Clopper CG, Bradlow AR. Perception of dialect variation in noise: intelligibility and classification. Lang Speech 2008; 51:175-98. [PMID: 19626923] [PMCID: PMC2744323] [DOI: 10.1177/0023830908098539]
Abstract
Listeners can explicitly categorize unfamiliar talkers by regional dialect with above-chance performance under ideal listening conditions. However, the extent to which this important source of variation affects speech processing is largely unknown. In a series of four experiments, we examined the effects of dialect variation on speech intelligibility in noise and the effects of noise on perceptual dialect classification. Results revealed that, on the one hand, dialect-specific differences in speech intelligibility were more pronounced at harder signal-to-noise ratios, but were attenuated under more favorable listening conditions. Listener dialect did not interact with talker dialect; for all listeners, at a range of noise levels, the General American talkers were the most intelligible and the Mid-Atlantic talkers were the least intelligible. Dialect classification performance, on the other hand, was poor even with only moderate amounts of noise. These findings suggest that at moderate noise levels, listeners are able to adapt to dialect variation in the acoustic signal such that some cross-dialect intelligibility differences are neutralized, despite relatively poor explicit dialect classification performance. However, at more difficult noise levels, participants cannot effectively adapt to dialect variation in the acoustic signal and cross-dialect differences in intelligibility emerge for all listeners, regardless of their dialect.
316
An introduction to hearing loss and screening procedures for behavioral research. Behav Res Methods 2007; 39:667-72. [PMID: 17958180] [DOI: 10.3758/bf03193038]
Abstract
Hearing loss is a confounding variable that is rarely addressed in behavioral research despite its prevalence across the life span. Currently, the most common method of experimental control over hearing acuity is through self-report of perceived impairment. We argue that this technique may lack sensitivity and that researchers should more commonly utilize standardized hearing screening procedures. Distinctive patterns of hearing loss are reviewed with attention to populations that commonly participate in behavioral research. We explain standard techniques for conducting pure tone hearing screening using a conventional portable audiometer and outline a procedure for how researchers can modify a conventional laptop computer for audiometric screening when a standard audiometer is unavailable. We offer a sample hearing screening program that researchers may use toward the development of their own protocol. This program is freely available for download at www.psychonomic.org/archive.
317
Mattys SL, Melhorn JF. Sentential, lexical, and acoustic effects on the perception of word boundaries. J Acoust Soc Am 2007; 122:554-67. [PMID: 17614511] [DOI: 10.1121/1.2735105]
Abstract
This study investigates the effects of sentential context, lexical knowledge, and acoustic cues on the segmentation of connected speech. Listeners heard near-homophonous phrases (e.g., /plʌmpaɪ/ for "plum pie" versus "plump eye") in isolation, in a sentential context, or in a lexically biasing context. The sentential context and the acoustic cues were piloted to provide strong versus mild support for one segmentation alternative (plum pie) or the other (plump eye). The lexically biasing context favored one segmentation or the other (e.g., /skʌmpaɪ/ for "scum pie" versus *"scump eye," and /lʌmpaɪ/ for "lump eye" versus *"lum pie," with the asterisk denoting a lexically unacceptable parse). A forced-choice task, in which listeners indicated which of two words they thought they heard (e.g., "pie" or "eye"), revealed compensatory mechanisms between the sources of information. The effect of both sentential and lexical contexts on segmentation responses was larger when the acoustic cues were mild than when they were strong. Moreover, lexical effects were accompanied by a reduction in sensitivity to the acoustic cues. Sentential context only affected the listeners' response criterion. The results highlight the graded, interactive, and flexible nature of multicue segmentation, as well as functional differences between sentential and lexical contributions to this process.
Affiliation(s)
- Sven L Mattys
- Department of Experimental Psychology, University of Bristol, Bristol, Avon, United Kingdom.

318
Davis MH, Johnsrude IS. Hearing speech sounds: top-down influences on the interface between audition and speech perception. Hear Res 2007; 229:132-47. [PMID: 17317056] [DOI: 10.1016/j.heares.2007.01.014]
Abstract
This paper focuses on the cognitive and neural mechanisms of speech perception: the rapid, and highly automatic processes by which complex time-varying speech signals are perceived as sequences of meaningful linguistic units. We will review four processes that contribute to the perception of speech: perceptual grouping, lexical segmentation, perceptual learning and categorical perception, in each case presenting perceptual evidence to support highly interactive processes with top-down information flow driving and constraining interpretations of spoken input. The cognitive and neural underpinnings of these interactive processes appear to depend on two distinct representations of heard speech: an auditory, echoic representation of incoming speech, and a motoric/somatotopic representation of speech as it would be produced. We review the neuroanatomical system supporting these two key properties of speech perception and discuss how this system incorporates interactive processes and two parallel echoic and somato-motoric representations, drawing on evidence from functional neuroimaging studies in humans and from comparative anatomical studies. We propose that top-down interactive mechanisms within auditory networks play an important role in explaining the perception of spoken language.
319
Mirman D, McClelland JL, Holt LL. An interactive Hebbian account of lexically guided tuning of speech perception. Psychon Bull Rev 2007; 13:958-65. [PMID: 17484419] [PMCID: PMC2291357] [DOI: 10.3758/bf03213909]
Abstract
We describe an account of lexically guided tuning of speech perception based on interactive processing and Hebbian learning. Interactive feedback provides lexical information to prelexical levels, and Hebbian learning uses that information to retune the mapping from auditory input to prelexical representations of speech. Simulations of an extension of the TRACE model of speech perception are presented that demonstrate the efficacy of this mechanism. Further simulations show that acoustic similarity can account for the patterns of speaker generalization. This account addresses the role of lexical information in guiding both perception and learning with a single set of principles of information propagation.
Affiliation(s)
- Daniel Mirman
- Department of Psychology, University of Connecticut, 406 Babbidge Rd., Unit 1020, Storrs, CT 06269-1020, USA.

320
Stacey PC, Summerfield AQ. Effectiveness of computer-based auditory training in improving the perception of noise-vocoded speech. J Acoust Soc Am 2007; 121:2923-35. [PMID: 17550190] [DOI: 10.1121/1.2713668]
Abstract
Five experiments were designed to evaluate the effectiveness of "high-variability" lexical training in improving the ability of normal-hearing subjects to perceive noise-vocoded speech that had been spectrally shifted to simulate tonotopic misalignment. Two approaches to training were implemented. One training approach required subjects to recognize isolated words, while the other training approach required subjects to recognize words in sentences. Both approaches to training improved the ability to identify words in sentences. Improvements following a single session (lasting 1-2 h) of auditory training ranged between 7 and 12 percentage points and were significantly larger than improvements following a visual control task that was matched with the auditory training task in terms of the response demands. An additional three sessions of word- and sentence-based training led to further improvements, with the average overall improvement ranging from 13 to 18 percentage points. When a tonotopic misalignment of 3 mm rather than 6 mm was simulated, training with several talkers led to greater generalization to new talkers than training with a single talker. The results confirm that computer-based lexical training can help overcome the effects of spectral distortions in speech, and they suggest that training materials are most effective when several talkers are included.
Affiliation(s)
- Paula C Stacey
- Department of Psychology, University of York, Heslington, York YO10 5DD, United Kingdom.
321
Levi SV, Winters SJ, Pisoni DB. Speaker-independent factors affecting the perception of foreign accent in a second language. J Acoust Soc Am 2007; 121:2327-38. [PMID: 17471745] [PMCID: PMC3319010] [DOI: 10.1121/1.2537345] [Citation(s) in RCA: 10]
Abstract
Previous research on foreign accent perception has largely focused on speaker-dependent factors such as age of learning and length of residence. Factors that are independent of a speaker's language learning history have also been shown to affect perception of second language speech. The present study examined the effects of two such factors, listening context and lexical frequency, on the perception of foreign-accented speech. Listeners rated foreign accent in two listening contexts: auditory-only, where listeners only heard the target stimuli, and auditory + orthography, where listeners were presented with both an auditory signal and an orthographic display of the target word. Results revealed that higher frequency words were consistently rated as less accented than lower frequency words. The effect of the listening context emerged in two interactions: the auditory + orthography context reduced the effects of lexical frequency, but increased the perceived differences between native and non-native speakers. Acoustic measurements revealed some production differences for words of different levels of lexical frequency, though these differences could not account for all of the observed interactions from the perceptual experiment. These results suggest that factors independent of the speakers' actual speech articulations can influence the perception of degree of foreign accent.
Affiliation(s)
- Susannah V Levi
- Speech Research Laboratory, Department of Psychological and Brain Sciences, Indiana University, Bloomington 47405, USA.
322
Obleser J, Wise RJS, Dresner MA, Scott SK. Functional integration across brain regions improves speech perception under adverse listening conditions. J Neurosci 2007; 27:2283-9. [PMID: 17329425] [PMCID: PMC6673469] [DOI: 10.1523/jneurosci.4663-06.2007] [Citation(s) in RCA: 274]
Abstract
Speech perception is supported by both acoustic signal decomposition and semantic context. This study, using event-related functional magnetic resonance imaging, investigated the neural basis of this interaction with two speech manipulations, one acoustic (spectral degradation) and the other cognitive (semantic predictability). High compared with low predictability resulted in the greatest improvement in comprehension at an intermediate level of degradation, and this was associated with increased activity in the left angular gyrus, the medial and left lateral prefrontal cortices, and the posterior cingulate gyrus. Functional connectivity between these regions was also increased, particularly with respect to the left angular gyrus. In contrast, activity in both superior temporal sulci and the left inferior frontal gyrus correlated with the amount of spectral detail in the speech signal, regardless of predictability. These results demonstrate that increasing functional connectivity between high-order cortical areas, remote from the auditory cortex, facilitates speech comprehension when the clarity of speech is reduced.
Affiliation(s)
- Jonas Obleser
- Institute of Cognitive Neuroscience, University College London, London WC1N 3AR, United Kingdom.
323
Golomb JD, Peelle JE, Wingfield A. Effects of stimulus variability and adult aging on adaptation to time-compressed speech. J Acoust Soc Am 2007; 121:1701-8. [PMID: 17407906] [DOI: 10.1121/1.2436635] [Citation(s) in RCA: 24]
Abstract
With as few as 10-20 sentences of exposure, listeners are able to adapt to speech that is highly distorted compared to that which is encountered in everyday conversation. The current study examines the extent to which adaptation to time-compressed speech can be impeded by disrupting the continuity of the exposure sentences, and whether this differs between young and older adult listeners when they are equated for starting accuracy. In separate sessions conducted one week apart, the degree of adaptation was assessed in four exposure conditions, all of which involved exposure to the same number of time-compressed sentences. A continuous exposure condition involved presentation of the time-compressed sentences without interruption. Two alternation conditions alternated time-compressed speech and uncompressed speech by single sentences or groups of four sentences. A fourth condition presented sentences that were separated by a period of silence but no uncompressed speech. For all conditions, neither young nor older adults' overall level of learning was influenced by disruptions to the exposure sentences. In addition, participants' performance showed reliable improvement across the first and subsequent sessions. These results support robust learning mechanisms in speech perception that remain functional throughout the lifespan.
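Pitch-preserving time compression of the kind used to create such stimuli is commonly done with overlap-add resynthesis; a naive sketch (illustrative only, with invented parameters, not the study's stimulus-preparation method): analysis frames are taken farther apart than they are laid back down, shortening the signal without raising its pitch.

```python
import numpy as np

def time_compress(x, rate, frame=480, hop_ratio=0.5):
    """Naive overlap-add (OLA) time compression: analysis frames are
    spaced `rate` times farther apart than the synthesis frames, so the
    output is roughly 1/rate as long while keeping the original pitch."""
    hop_syn = int(frame * hop_ratio)
    hop_ana = int(hop_syn * rate)
    win = np.hanning(frame)
    n_frames = max(1, (len(x) - frame) // hop_ana + 1)
    out = np.zeros(hop_syn * (n_frames - 1) + frame)
    norm = np.zeros_like(out)
    for i in range(n_frames):
        seg = x[i * hop_ana : i * hop_ana + frame]
        if len(seg) < frame:
            break
        out[i * hop_syn : i * hop_syn + frame] += seg * win
        norm[i * hop_syn : i * hop_syn + frame] += win
    return out / np.maximum(norm, 1e-8)  # compensate window overlap

fs = 16000
x = np.sin(2 * np.pi * 220 * np.arange(fs) / fs)
y = time_compress(x, rate=2.0)  # roughly twice as fast
```

Production tools refine this with waveform-similarity alignment (WSOLA/PSOLA) to avoid phase discontinuities, but the frame-spacing idea is the same.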
Affiliation(s)
- Julie D Golomb
- Volen National Center for Complex Systems, Brandeis University, Waltham, Massachusetts 02454, USA
324
Zangl R, Fernald A. Increasing Flexibility in Children's Online Processing of Grammatical and Nonce Determiners in Fluent Speech. Lang Learn Dev 2007; 3:199-231. [PMID: 22081762] [PMCID: PMC3212392] [DOI: 10.1080/15475440701360564] [Citation(s) in RCA: 34]
Abstract
Two experiments using online speech processing measures with 18- to 36-month-olds extended research by Gerken & McIntosh (1993) showing that young children's comprehension is disrupted when the grammatical determiner in a noun phrase is replaced with a nonce determiner (the car vs. po car). In Experiment 1, 18-month-olds were slower and less accurate to identify familiar nouns on nonce-article than grammatical-article trials, although older children who produced determiners in their own speech showed no disruption. However, when tested on novel words in Experiment 2, even linguistically advanced 34-month-olds had greater difficulty identifying familiar as well as newly learned object names preceded by a nonce article. Children's success in "listening through" an uninformative functor-like nonce syllable before a familiar noun was related to their level of grammatical competence, but their attention to the nonce article also varied with lexical familiarity and the overall redundancy of the processing context.
325
Smith MW, Faulkner A. Perceptual adaptation by normally hearing listeners to a simulated "hole" in hearing. J Acoust Soc Am 2006; 120:4019-30. [PMID: 17225428] [DOI: 10.1121/1.2359235] [Citation(s) in RCA: 12]
Abstract
Simulations of cochlear implants have demonstrated that the deleterious effects of a frequency misalignment between analysis bands and characteristic frequencies at basally shifted simulated electrode locations are significantly reduced with training. However, a distortion of frequency-to-place mapping may also arise due to a region of dysfunctional neurons that creates a "hole" in the tonotopic representation. This study simulated a 10 mm hole in the mid-frequency region. Noise-band processors were created with six output bands (three apical and three basal to the hole). The spectral information that would have been represented in the hole was either dropped or reassigned to bands on either side. Such reassignment preserves information but warps the place code, which may in itself impair performance. Normally hearing subjects received three hours of training in two reassignment conditions. Speech recognition improved considerably with training. Scores were much lower in a baseline (untrained) condition where information from the hole region was dropped. A second group of subjects trained in this dropped condition did show some improvement; however, scores after training were significantly lower than in the reassignment conditions. These results are consistent with the view that speech processors should present the most informative frequency range irrespective of frequency misalignment.
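The study's two routing conditions, dropping the hole's information versus reassigning it to flanking bands, amount to a mapping from analysis bands to output places. A minimal sketch of that mapping logic (band edges and function names are invented, not the authors' processor):

```python
def route_bands(analysis_bands, hole, mode="reassign"):
    """Route analysis-band information around a dead ('hole') region:
    bands inside the hole are either dropped, or reassigned to the
    output place just apical or basal to the hole (warping the place
    code). Band edges are (low_hz, high_hz) tuples."""
    apical_edge, basal_edge = hole
    routed = {}
    for lo, hi in analysis_bands:
        if hi <= apical_edge or lo >= basal_edge:
            routed[(lo, hi)] = (lo, hi)   # normal tonotopic place
        elif mode == "drop":
            routed[(lo, hi)] = None       # information discarded
        else:                             # reassign toward the nearer side
            centre = (lo + hi) / 2
            routed[(lo, hi)] = ("apical" if centre < (apical_edge + basal_edge) / 2
                                else "basal")
    return routed

bands = [(100, 300), (300, 700), (700, 1500),
         (1500, 3000), (3000, 5000), (5000, 8000)]
hole = (700, 3000)  # mid-frequency dead region
reassigned = route_bands(bands, hole, mode="reassign")
dropped = route_bands(bands, hole, mode="drop")
```

Reassignment preserves the spectral information at the cost of a warped place code, which is the trade-off the experiment tested.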
Affiliation(s)
- Matthew W Smith
- Department of Phonetics and Linguistics, UCL, Wolfson House, 4 Stephenson Way, London NW1 2HE, United Kingdom.
326
Faulkner A, Rosen S, Norman C. The right information may matter more than frequency-place alignment: simulations of frequency-aligned and upward shifting cochlear implant processors for a shallow electrode array insertion. Ear Hear 2006; 27:139-52. [PMID: 16518142] [DOI: 10.1097/01.aud.0000202357.40662.85] [Citation(s) in RCA: 34]
Abstract
OBJECTIVE: It has been claimed that speech recognition with a cochlear implant is dependent on the correct frequency alignment of analysis bands in the speech processor with characteristic frequencies (CFs) at electrode locations. However, the use of filters aligned in frequency to a relatively basal electrode array position leads to significant loss of lower frequency speech information. This study uses an acoustic simulation to compare two approaches to the matching of speech processor filters to an electrode array having a relatively shallow depth within the typical range, such that the most apical element is at a CF of 1851 Hz. Two noise-excited vocoder speech processors are compared, one with CF-matched filters, and one with filters matched to CFs at basilar membrane locations 6 mm more apical than electrode locations.
DESIGN: An extended crossover training design examined pre- and post-training performance in the identification of vowels and words in sentences for both processors. Subjects received about 3 hours of training with each processor in turn.
RESULTS: Training improved performance with both processors, but training effects were greater for the shifted processor. For a male talker, the shifted processor led to higher post-training scores than the frequency-aligned processor with both vowels and sentences. For a female talker, post-training vowel scores did not differ significantly between processors, whereas sentence scores were higher with the frequency-aligned processor.
CONCLUSIONS: Even for a shallow electrode insertion, we conclude that a speech processor should represent information from important frequency regions below 1 kHz and that the possible cost of frequency misalignment can be significantly reduced with listening experience.
Affiliation(s)
- Andrew Faulkner
- Department of Phonetics and Linguistics, University College London, Wolfson House, London, United Kingdom.
327
McClelland JL, Mirman D, Holt LL. Are there interactive processes in speech perception? Trends Cogn Sci 2006; 10:363-9. [PMID: 16843037] [PMCID: PMC3523348] [DOI: 10.1016/j.tics.2006.06.007] [Citation(s) in RCA: 130]
Abstract
Lexical information facilitates speech perception, especially when sounds are ambiguous or degraded. The interactive approach to understanding this effect posits that this facilitation is accomplished through bi-directional flow of information, allowing lexical knowledge to influence pre-lexical processes. Alternative autonomous theories posit feed-forward processing with lexical influence restricted to post-perceptual decision processes. We review evidence supporting the prediction of interactive models that lexical influences can affect pre-lexical mechanisms, triggering compensation, adaptation and retuning of phonological processes generally taken to be pre-lexical. We argue that these and other findings point to interactive processing as a fundamental principle for perception of speech and other modalities.
Affiliation(s)
- James L McClelland
- Center for the Neural Basis of Cognition and Department of Psychology, Carnegie Mellon University, 5000 Forbes Ave, Pittsburgh, PA 15213, USA.