1. Beach SD, Ozernov-Palchik O, May SC, Centanni TM, Gabrieli JDE, Pantazis D. Neural Decoding Reveals Concurrent Phonemic and Subphonemic Representations of Speech Across Tasks. Neurobiology of Language 2021; 2:254-279. [PMID: 34396148] [PMCID: PMC8360503] [DOI: 10.1162/nol_a_00034]
Abstract
Robust and efficient speech perception relies on the interpretation of acoustically variable phoneme realizations, yet prior neuroimaging studies are inconclusive regarding the degree to which subphonemic detail is maintained over time as categorical representations arise. It is also unknown whether this depends on the demands of the listening task. We addressed these questions by using neural decoding to quantify the (dis)similarity of brain response patterns evoked during two different tasks. We recorded magnetoencephalography (MEG) as adult participants heard isolated, randomized tokens from a /ba/-/da/ speech continuum. In the passive task, their attention was diverted. In the active task, they categorized each token as ba or da. We found that linear classifiers successfully decoded ba vs. da perception from the MEG data. Data from the left hemisphere were sufficient to decode the percept early in the trial, while the right hemisphere was necessary but not sufficient for decoding at later time points. We also decoded stimulus representations and found that they were maintained longer in the active task than in the passive task; however, these representations did not pattern more like discrete phonemes when an active categorical response was required. Instead, in both tasks, early phonemic patterns gave way to a representation of stimulus ambiguity that coincided in time with reliable percept decoding. Our results suggest that the categorization process does not require the loss of subphonemic detail, and that the neural representation of isolated speech sounds includes concurrent phonemic and subphonemic information.
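The decoding logic described here (one cross-validated linear classifier per time point, trained on sensor patterns) is compact to sketch. Below is a minimal illustration with synthetic data standing in for the MEG epochs; the shapes, labels, and injected effect are assumptions for demonstration, not the authors' pipeline:
```python
# Time-resolved decoding: one cross-validated linear classifier per time point.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_trials, n_sensors, n_times = 200, 102, 120        # invented dimensions
X = rng.standard_normal((n_trials, n_sensors, n_times))
y = rng.integers(0, 2, n_trials)                    # 0 = "ba", 1 = "da" (toy labels)
X[y == 1, :10, 60:100] += 0.4                       # inject a weak, late class effect

clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
accuracy = np.array([cross_val_score(clf, X[:, :, t], y, cv=5).mean()
                     for t in range(n_times)])      # decode at each time point
print(f"peak accuracy {accuracy.max():.2f} at sample {accuracy.argmax()}")
```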
Affiliation(s)
- Sara D. Beach
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Program in Speech and Hearing Bioscience and Technology, Harvard University, Cambridge, MA, USA
- Ola Ozernov-Palchik
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Sidney C. May
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Lynch School of Education and Human Development, Boston College, Chestnut Hill, MA, USA
- Tracy M. Centanni
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Department of Psychology, Texas Christian University, Fort Worth, TX, USA
- John D. E. Gabrieli
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- Dimitrios Pantazis
- McGovern Institute for Brain Research, Massachusetts Institute of Technology, Cambridge, MA, USA
2. Silva DMR, Rothe-Neves R, Melges DB. Long-latency event-related responses to vowels: N1-P2 decomposition by two-step principal component analysis. Int J Psychophysiol 2019; 148:93-102. [PMID: 31863852] [DOI: 10.1016/j.ijpsycho.2019.11.010]
Abstract
The N1-P2 complex of the auditory event-related potential (ERP) has been used to examine neural activity associated with speech sound perception. Since it is thought to reflect multiple generator processes, its functional significance is difficult to infer. In the present study, a temporospatial principal component analysis (PCA) was used to decompose the N1-P2 response into latent factors underlying covariance patterns in ERP data recorded during passive listening to pairs of successive vowels. In each trial, one of six sounds drawn from an /i/-/e/ vowel continuum was followed either by an identical sound, a different token of the same vowel category, or a token from the other category. Responses were examined as to how they were modulated by within- and across-category vowel differences and by adaptation (repetition suppression) effects. Five PCA factors were identified as corresponding to three well-known N1 subcomponents and two P2 subcomponents. Results added evidence that the N1 peak reflects both generators that are sensitive to spectral information and generators that are not. For later latency ranges, different patterns of sensitivity to vowel quality were found, including category-related effects. Particularly, a subcomponent identified as the Tb wave showed release from adaptation in response to an /i/ followed by an /e/ sound. A P2 subcomponent varied linearly with spectral shape along the vowel continuum, while the other was stronger the closer the vowel was to the category boundary, suggesting separate processing of continuous and category-related information. Thus, the PCA-based decomposition of the N1-P2 complex was functionally meaningful, revealing distinct underlying processes at work during speech sound perception.
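A two-step temporospatial PCA of this kind can be approximated as a temporal PCA over time points followed by a spatial PCA over channels for each temporal factor. This is a minimal sketch on simulated data; the dimensions and component counts are invented, and the factor rotation (e.g., Promax) typically used in ERP work is omitted:
```python
# Two-step PCA: temporal factors first, then spatial factors per temporal factor.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
n_subj, n_cond, n_chan, n_times = 20, 6, 32, 256    # invented dimensions
erps = rng.standard_normal((n_subj, n_cond, n_chan, n_times))

# Step 1: temporal PCA; variables are time points, observations are every
# subject x condition x channel waveform.
temporal = PCA(n_components=5)
t_scores = temporal.fit_transform(erps.reshape(-1, n_times))
t_scores = t_scores.reshape(n_subj, n_cond, n_chan, 5)

# Step 2: spatial PCA for each temporal factor; variables are channels.
for k in range(5):
    spatial = PCA(n_components=2)
    spatial.fit(t_scores[..., k].reshape(-1, n_chan))
    print(f"temporal factor {k}: spatial variance ratios "
          f"{spatial.explained_variance_ratio_.round(2)}")
```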
Affiliation(s)
- Daniel M R Silva
- Phonetics Lab, Faculty of Letters, Federal University of Minas Gerais, Belo Horizonte, Brazil
- Rui Rothe-Neves
- Phonetics Lab, Faculty of Letters, Federal University of Minas Gerais, Belo Horizonte, Brazil
- Danilo B Melges
- Graduate Program in Electrical Engineering, Department of Electrical Engineering, Federal University of Minas Gerais, Belo Horizonte, Brazil
3. Ross B, Tremblay KL, Alain C. Simultaneous EEG and MEG recordings reveal vocal pitch elicited cortical gamma oscillations in young and older adults. Neuroimage 2019; 204:116253. [PMID: 31600592] [DOI: 10.1016/j.neuroimage.2019.116253]
Abstract
The frequency-following response with origin in the auditory brainstem represents the pitch contour of voice and can be recorded with electrodes from the scalp. MEG studies also revealed a cortical contribution to the high gamma oscillations at the fundamental frequency (f0) of a vowel stimulus. Therefore, studying the cortical component of the frequency-following response could provide insights into how pitch information is encoded at the cortical level. Comparing how aging affects the different responses may help to uncover the neural mechanisms underlying speech understanding deficits in older age. We simultaneously recorded EEG and MEG responses to the syllable /ba/. MEG beamformer analysis localized sources in bilateral auditory cortices and the midbrain. Time-frequency analysis showed a faithful representation of the pitch contour between 106 Hz and 138 Hz in the cortical activity. A cross-correlation revealed a latency of 20 ms. Furthermore, stimulus onsets elicited cortical 40-Hz responses. Both the 40-Hz and the f0 response amplitudes increased in older age and were larger in the right hemisphere. The effects of aging and laterality of the f0 response were evident in the MEG only, suggesting that both effects were characteristics of the cortical response. After comparing f0 and N1 responses in EEG and MEG, we estimated that approximately one-third of the scalp-recorded f0 response could be cortical in origin. We attributed the significance of the cortical f0 response to the precise timing of cortical neurons that serve as a time-sensitive code for pitch.
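The core of the time-frequency analysis, recovering a pitch contour in the 106-138 Hz band from a noisy response, can be sketched as a spectrogram peak-track. Everything below is synthetic; the sampling rate, window settings, and signal are illustrative assumptions:
```python
# Track a rising pitch contour (106-138 Hz) in a noisy synthetic response.
import numpy as np
from scipy.signal import chirp, spectrogram

fs = 1000                                           # Hz, assumed sampling rate
t = np.arange(0, 1.0, 1 / fs)
rng = np.random.default_rng(2)
ffr = chirp(t, f0=106, t1=1.0, f1=138) + 0.5 * rng.standard_normal(t.size)

freqs, frames, Sxx = spectrogram(ffr, fs=fs, nperseg=256, noverlap=192)
band = (freqs >= 90) & (freqs <= 150)
track = freqs[band][Sxx[band].argmax(axis=0)]       # dominant frequency per frame
print("tracked f0 (Hz):", track.round(0))
```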
Affiliation(s)
- Bernhard Ross
- Rotman Research Institute, Baycrest Centre, Toronto, Ontario, Canada; Department of Medical Biophysics, University of Toronto, Ontario, Canada
- Kelly L Tremblay
- Department of Speech and Hearing Sciences, University of Washington, Seattle, WA, USA
- Claude Alain
- Rotman Research Institute, Baycrest Centre, Toronto, Ontario, Canada; Department of Psychology, University of Toronto, Ontario, Canada
4. Fan CSD, Zhu X, Dosch HG, von Stutterheim C, Rupp A. Language related differences of the sustained response evoked by natural speech sounds. PLoS One 2017; 12:e0180441. [PMID: 28727776] [PMCID: PMC5519032] [DOI: 10.1371/journal.pone.0180441]
Abstract
In tonal languages, such as Mandarin Chinese, the pitch contour of vowels discriminates lexical meaning, which is not the case in non-tonal languages such as German. Recent data provide evidence that pitch processing is influenced by language experience. However, there are still many open questions concerning the representation of such phonological and language-related differences at the level of the auditory cortex (AC). Using magnetoencephalography (MEG), we recorded transient and sustained auditory evoked fields (AEF) in native Chinese and German speakers to investigate language-related phonological and semantic aspects in the processing of acoustic stimuli. AEF were elicited by spoken meaningful and meaningless syllables, by vowels, and by a French horn tone. Speech sounds were recorded from a native speaker and showed frequency modulations according to the pitch contours of Mandarin. The sustained field (SF) evoked by natural speech signals was significantly larger for Chinese than for German listeners. In contrast, the SF elicited by a horn tone was not significantly different between groups. Furthermore, the SF of Chinese subjects was larger when evoked by meaningful syllables compared to meaningless ones, but there was no significant difference regarding whether vowels were part of the Chinese phonological system or not. Moreover, the N100m gave subtle but clear evidence that, for Chinese listeners, factors other than purely physical properties play a role in processing meaningful signals. These findings show that the N100 and the SF generated in Heschl’s gyrus are influenced by language experience, which suggests that AC activity related to specific pitch contours of vowels is influenced in a top-down fashion by higher, language-related areas. Such interactions are in line with anatomical findings and neuroimaging data, as well as with the dual-stream model of language of Hickok and Poeppel, which highlights the close and reciprocal interaction between the superior temporal gyrus and sulcus.
Affiliation(s)
- Christina Siu-Dschu Fan
- Institut für Theoretische Physik, Heidelberg, Germany
- Storz Medical AG, Tägerwilen, Switzerland
- Xingyu Zhu
- Department for General and Applied Linguistics, University of Heidelberg, Heidelberg, Germany
- André Rupp
- Section of Biomagnetism, Department of Neurology, University of Heidelberg, Heidelberg, Germany
5. Neuromagnetic correlates of voice pitch, vowel type, and speaker size in auditory cortex. Neuroimage 2017; 158:79-89. [PMID: 28669914] [DOI: 10.1016/j.neuroimage.2017.06.065]
Abstract
Vowel recognition is largely immune to differences in speaker size despite the waveform differences associated with variation in speaker size. This has led to the suggestion that voice pitch and mean formant frequency (MFF) are extracted early in the hierarchy of hearing/speech processing and used to normalize the internal representation of vowel sounds. This paper presents a magnetoencephalographic (MEG) experiment designed to locate and compare neuromagnetic activity associated with voice pitch, MFF and vowel type in human auditory cortex. Sequences of six sustained vowels were used to contrast changes in the three components of vowel perception, and MEG responses to the changes were recorded from 25 participants. A staged procedure was employed to fit the MEG data with a source model having one bilateral pair of dipoles for each component of vowel perception. This dipole model showed that the activity associated with the three perceptual changes was functionally separable; the pitch source was located in Heschl's gyrus (bilaterally), while the vowel-type and formant-frequency sources were located (bilaterally) just behind Heschl's gyrus in planum temporale. The results confirm that vowel normalization begins in auditory cortex at an early point in the hierarchy of speech processing.
6. Silva DMR, Melges DB, Rothe-Neves R. N1 response attenuation and the mismatch negativity (MMN) to within- and across-category phonetic contrasts. Psychophysiology 2017; 54:591-600. [PMID: 28169421] [DOI: 10.1111/psyp.12824]
Abstract
According to the neural adaptation model of the mismatch negativity (MMN), the sensitivity of this event-related response to both acoustic and categorical information in speech sounds can be accounted for by assuming that (a) the degree of overlap between neural representations of two sounds depends on both the acoustic difference between them and whether or not they belong to distinct phonetic categories, and (b) a release from stimulus-specific adaptation causes an enhanced N1 obligatory response to infrequent deviant stimuli. On the basis of this view, we tested in Experiment 1 whether the N1 response to the second sound of a pair (S2) would be more attenuated in pairs of identical vowels compared with pairs of different vowels, and in pairs of exemplars of the same vowel category compared with pairs of exemplars of different categories. The psychoacoustic distance between S1 and S2 was the same for all within-category and across-category pairs. While N1 amplitudes decreased markedly from S1 to S2, responses to S2 were quite similar across pair types, indicating that the attenuation effect in such conditions is not stimulus specific. In Experiment 2, a pronounced MMN was elicited by a deviant vowel sound in an across-category oddball sequence, but not when the exact same deviant vowel was presented in a within-category oddball sequence. This adds evidence that MMN reflects categorical phonetic processing. Taken together, the results suggest that different neural processes underlie the attenuation of the N1 response to S2 and the MMN to vowels.
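The MMN itself is conventionally quantified as a deviant-minus-standard difference wave averaged over a post-stimulus window. A toy sketch follows; the epoch counts, sampling rate, and 200-300 ms window are our assumptions, not this study's parameters:
```python
# MMN as a deviant-minus-standard difference wave in a post-stimulus window.
import numpy as np

rng = np.random.default_rng(3)
fs, n_times = 500, 400                              # 800-ms epochs at 500 Hz
times = np.arange(n_times) / fs - 0.1               # 100-ms pre-stimulus baseline
standards = rng.standard_normal((300, n_times))
deviants = rng.standard_normal((60, n_times))
deviants[:, 150:200] -= 1.0                         # toy negativity at 200-300 ms

difference = deviants.mean(axis=0) - standards.mean(axis=0)
window = (times >= 0.2) & (times <= 0.3)
print(f"mean MMN amplitude, 200-300 ms: {difference[window].mean():.2f} (a.u.)")
```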
Affiliation(s)
- Daniel M R Silva
- Graduate Program in Neuroscience, Federal University of Minas Gerais, Belo Horizonte, Brazil
- Danilo B Melges
- Graduate Program in Electrical Engineering, Department of Electrical Engineering, Federal University of Minas Gerais, Belo Horizonte, Brazil
- Rui Rothe-Neves
- Phonetics Lab, Faculty of Letters, Federal University of Minas Gerais, Belo Horizonte, Brazil
7. Manca AD, Grimaldi M. Vowels and Consonants in the Brain: Evidence from Magnetoencephalographic Studies on the N1m in Normal-Hearing Listeners. Front Psychol 2016; 7:1413. [PMID: 27713712] [PMCID: PMC5031792] [DOI: 10.3389/fpsyg.2016.01413]
Abstract
Speech sound perception is one of the most fascinating tasks performed by the human brain. It involves a mapping from continuous acoustic waveforms onto the discrete phonological units computed to store words in the mental lexicon. In this article, we review the magnetoencephalographic studies that have explored the timing and morphology of the N1m component to investigate how vowels and consonants are computed and represented within the auditory cortex. The neurons involved in the N1m act to construct a sensory memory of the stimulus through spatially and temporally distributed activation patterns within the auditory cortex. Indeed, localization of auditory field maps in animals and humans has suggested two levels of sound coding: a tonotopy dimension for spectral properties and a tonochrony dimension for temporal properties of sounds. When the stimulus is a complex speech sound, tonotopy and tonochrony data may give important information to assess whether speech sound parsing and decoding are generated by pure bottom-up reflection of acoustic differences or are additionally affected by top-down processes related to phonological categories. Hints supporting pure bottom-up processing coexist with hints supporting top-down abstract phoneme representation. As they stand, N1m data (amplitude, latency, source generators, and hemispheric distribution) are limited and do not disentangle the issue. The nature of these limitations is discussed, and neurophysiological studies on animals and neuroimaging studies on humans are taken into consideration. We also compare the N1m findings with investigations of the magnetic mismatch negativity (MMNm) component and with the analogous electrical components, the N1 and the MMN. We conclude that the N1 seems more sensitive than the N1m in capturing lateralization and hierarchical processes, although the data are very preliminary. Finally, we suggest that MEG data should be integrated with EEG data in light of the neural oscillations framework, and we raise some concerns that should be addressed by future investigations if language research is to be aligned closely with the functional mechanisms of the brain.
Affiliation(s)
- Anna Dora Manca
- Dipartimento di Studi Umanistici, Centro di Ricerca Interdisciplinare sul Linguaggio, University of Salento, Lecce, Italy; Laboratorio Diffuso di Ricerca Interdisciplinare Applicata alla Medicina, Lecce, Italy
- Mirko Grimaldi
- Dipartimento di Studi Umanistici, Centro di Ricerca Interdisciplinare sul Linguaggio, University of Salento, Lecce, Italy; Laboratorio Diffuso di Ricerca Interdisciplinare Applicata alla Medicina, Lecce, Italy
8. Schomers MR, Pulvermüller F. Is the Sensorimotor Cortex Relevant for Speech Perception and Understanding? An Integrative Review. Front Hum Neurosci 2016; 10:435. [PMID: 27708566] [PMCID: PMC5030253] [DOI: 10.3389/fnhum.2016.00435]
Abstract
In the neuroscience of language, phonemes are frequently described as multimodal units whose neuronal representations are distributed across perisylvian cortical regions, including auditory and sensorimotor areas. A different position views phonemes primarily as acoustic entities with posterior temporal localization, which are functionally independent from frontoparietal articulatory programs. To address this current controversy, we here discuss experimental results from functional magnetic resonance imaging (fMRI) as well as transcranial magnetic stimulation (TMS) studies. At first glance, a mixed picture emerges, with earlier research documenting neurofunctional distinctions between phonemes in both temporal and frontoparietal sensorimotor systems, but some recent work seemingly failing to replicate the latter. Detailed analysis of methodological differences between studies reveals that the way experiments are set up explains whether sensorimotor cortex maps phonological information during speech perception or not. In particular, acoustic noise during the experiment and 'motor noise' caused by button press tasks work against the frontoparietal manifestation of phonemes. We highlight recent studies using sparse imaging and passive speech perception tasks along with multivariate pattern analysis (MVPA) and especially representational similarity analysis (RSA), which succeeded in separating acoustic-phonological from general-acoustic processes and in mapping specific phonological information onto temporal and frontoparietal regions. The question of whether sensorimotor cortex plays a causal role in speech perception and understanding is addressed by reviewing recent TMS studies. We conclude that frontoparietal cortices, including ventral motor and somatosensory areas, reflect phonological information during speech perception and exert a causal influence on language understanding.
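Representational similarity analysis, as invoked above, reduces to correlating a neural representational dissimilarity matrix (RDM) with a model RDM. A toy sketch with invented activity patterns and a hypothetical labial/coronal feature model:
```python
# RSA in miniature: correlate a neural RDM with a binary feature-model RDM.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(4)
patterns = rng.standard_normal((6, 50))             # 6 phonemes x 50 sensors/voxels
neural_rdm = pdist(patterns, metric="correlation")  # condition-pair dissimilarities

labels = np.array([[0], [0], [0], [1], [1], [1]])   # hypothetical labial vs. coronal
model_rdm = pdist(labels, metric="hamming")         # 1 across classes, 0 within
rho, p = spearmanr(neural_rdm, model_rdm)
print(f"model-neural RDM agreement: rho = {rho:.2f}, p = {p:.3f}")
```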
Affiliation(s)
- Malte R Schomers
- Brain Language Laboratory, Department of Philosophy and Humanities, Freie Universität Berlin, Berlin, Germany; Berlin School of Mind and Brain, Humboldt-Universität zu Berlin, Berlin, Germany
- Friedemann Pulvermüller
- Brain Language Laboratory, Department of Philosophy and Humanities, Freie Universität Berlin, Berlin, Germany; Berlin School of Mind and Brain, Humboldt-Universität zu Berlin, Berlin, Germany
9. Scharinger M, Monahan PJ, Idsardi WJ. Linguistic category structure influences early auditory processing: Converging evidence from mismatch responses and cortical oscillations. Neuroimage 2016; 128:293-301. [PMID: 26780574] [DOI: 10.1016/j.neuroimage.2016.01.003]
Abstract
While previous research has established that language-specific knowledge influences early auditory processing, it remains controversial which aspects of speech sound representations determine early speech perception. Here, we propose that early processing primarily depends on information propagated top-down from abstractly represented speech sound categories. In particular, we assume that mid-vowels (as in 'bet') exert weaker top-down effects than high-vowels (as in 'bit') because of their less specific (default) tongue height position compared to either high- or low-vowels (as in 'bat'). We tested this assumption in a magnetoencephalography (MEG) study where we contrasted mid- and high-vowels, as well as low- and high-vowels, in a passive oddball paradigm. Overall, significant differences between deviants and standards indexed reliable mismatch negativity (MMN) responses between 200 and 300 ms post-stimulus onset. MMN amplitudes differed in the mid/high-vowel contrasts and were significantly reduced when a mid-vowel standard was followed by a high-vowel deviant, extending previous findings. Furthermore, mid-vowel standards showed reduced oscillatory power in the pre-stimulus beta-frequency band (18-26 Hz), compared to high-vowel standards. We take this as converging evidence for linguistic category structure exerting top-down influences on auditory processing. The findings are interpreted within the linguistic model of underspecification and the neuropsychological predictive coding framework.
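The pre-stimulus beta-power measure can be sketched as band-pass filtering, Hilbert envelope extraction, and averaging over a pre-stimulus window. All parameters below (sampling rate, filter order, window) are illustrative assumptions:
```python
# Pre-stimulus beta power: band-pass 18-26 Hz, Hilbert envelope, window average.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

fs = 500                                            # Hz, assumed sampling rate
rng = np.random.default_rng(5)
epochs = rng.standard_normal((100, fs))             # 100 trials x 1 s; stimulus at 0.5 s

b, a = butter(4, [18, 26], btype="bandpass", fs=fs)
beta = filtfilt(b, a, epochs, axis=-1)
power = np.abs(hilbert(beta, axis=-1)) ** 2
prestim = power[:, int(0.2 * fs):int(0.5 * fs)].mean()
print(f"mean pre-stimulus beta power: {prestim:.3f} (a.u.)")
```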
Affiliation(s)
- Mathias Scharinger
- Department of Language and Literature, Max Planck Institute for Empirical Aesthetics, Frankfurt, Germany; Department of Linguistics, University of Maryland, College Park, MD, USA; Biological incl. Cognitive Psychology, Institute for Psychology, University of Leipzig, Germany
- Philip J Monahan
- Centre for French and Linguistics, University of Toronto Scarborough, Canada; Department of Linguistics, University of Toronto, Canada
- William J Idsardi
- Department of Linguistics, University of Maryland, College Park, MD, USA
10. Tuomainen J, Savela J, Obleser J, Aaltonen O. Attention modulates the use of spectral attributes in vowel discrimination: behavioral and event-related potential evidence. Brain Res 2012; 1490:170-83. [PMID: 23174416] [DOI: 10.1016/j.brainres.2012.10.067]
Abstract
Speech contains a variety of acoustic cues to auditory and phonetic contrasts that are exploited by the listener in decoding the acoustic signal. In three experiments, we tried to elucidate whether listeners rely on formant peak frequencies or whole spectrum attributes in vowel discrimination. We created two vowel continua in which the acoustic distance in formant frequencies was constant but the continua differed in spectral moments (i.e., the whole spectrum modeled as a probability density function). In Experiment 1, we measured reaction times and response accuracy while listeners performed a go/no-go discrimination task. The results indicated that the performance of the listeners was based on the spectral moments (especially the first and second moments), and not on formant peaks. Behavioral results in Experiment 2 showed that, when the stimuli were presented in noise eliminating differences in spectral moments between the two continua, listeners employed formant peak frequencies. In Experiment 3, using the same listeners and stimuli as in Experiment 1, we measured an automatic brain potential, the mismatch negativity (MMN), when listeners did not attend to the auditory stimuli. Results showed that the MMN reflects sensitivity only to the formant structure of the vowels. We suggest that the auditory cortex automatically and pre-attentively encodes formant peak frequencies, whereas attention can be deployed for processing additional spectral information, such as spectral moments, to enhance vowel discrimination.
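Treating the spectrum as a probability density, the first and second spectral moments fall out directly. A minimal sketch on a synthetic two-component signal; the formant-like frequencies are invented, not the study's stimuli:
```python
# Spectral moments: normalize the power spectrum to a PDF, take mean and spread.
import numpy as np

fs = 16000
t = np.arange(0, 0.2, 1 / fs)
sig = np.sin(2 * np.pi * 450 * t) + 0.6 * np.sin(2 * np.pi * 1900 * t)

power = np.abs(np.fft.rfft(sig)) ** 2
freqs = np.fft.rfftfreq(t.size, 1 / fs)
p = power / power.sum()                             # spectrum as a probability density
m1 = (freqs * p).sum()                              # first moment: spectral centroid
m2 = np.sqrt((((freqs - m1) ** 2) * p).sum())       # second moment: spectral spread
print(f"centroid {m1:.0f} Hz, spread {m2:.0f} Hz")
```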
Affiliation(s)
- J Tuomainen
- Department of Speech, Hearing and Phonetic Sciences, University College London, UK
11. Smolka E, Eviatar Z. Phonological and orthographic visual word recognition in the two cerebral hemispheres: Evidence from Hebrew. Cogn Neuropsychol 2012; 23:972-89. [PMID: 21049362] [DOI: 10.1080/02643290600654855]
Abstract
Studies on the cerebral mechanisms of reading have mostly used Latin-based writing systems and assume that the left, but not the right, cerebral hemisphere is capable of phonological processing. The present study used Hebrew as the test language to examine the effects of phonological and orthographic information in the two hemispheres. In unvoweled Hebrew script, words are read via consonant information alone. We used two naming tasks with an interference paradigm, where phonemically, orthographically, and figurally incorrect vowel information conflicted with the consonant information of words presented in the left, right, or central visual fields. Interference patterns indicated that the left hemisphere automatically transforms graphemes into phonemes (Experiments 1 and 2), whereas the right hemisphere processes vowel diacritics as visual objects (Experiment 1), although it possesses some phonological categories (Experiment 2). The significance of these findings for models of visual word recognition in the cerebral hemispheres is discussed.
Affiliation(s)
- Eva Smolka
- Department of Psychology, Philipps-University Marburg, Germany
12. Scharinger M, Monahan PJ, Idsardi WJ. Asymmetries in the processing of vowel height. Journal of Speech, Language, and Hearing Research 2012; 55:903-918. [PMID: 22232394] [DOI: 10.1044/1092-4388(2011/11-0065)]
Abstract
PURPOSE Speech perception can be described as the transformation of continuous acoustic information into discrete memory representations. Therefore, research on neural representations of speech sounds is particularly important for a better understanding of this transformation. Speech perception models make specific assumptions regarding the representation of mid vowels (e.g., [ε]) that are articulated with a neutral position in regard to height. One hypothesis is that their representation is less specific than the representation of vowels with a more specific position (e.g., [æ]). METHOD In a magnetoencephalography study, we tested the underspecification of the mid vowel in American English. Using a mismatch negativity (MMN) paradigm, mid and low lax vowels ([ε]/[æ]), and high and low lax vowels ([ɪ]/[æ]), were opposed, and M100/N1 dipole source parameters as well as MMN latency and amplitude were examined. RESULTS Larger MMNs occurred when the mid vowel [ε] was a deviant to the standard [æ], a result consistent with less specific representations for mid vowels. MMNs of equal magnitude were elicited in the high-low comparison, consistent with more specific representations for both high and low vowels. M100 dipole locations support early vowel categorization on the basis of linguistically relevant acoustic-phonetic features. CONCLUSION We take our results to reflect an abstract long-term representation of vowels that does not include redundant specifications at very early stages of processing the speech signal. Moreover, the dipole locations indicate extraction of distinctive features and their mapping onto representationally faithful cortical locations (i.e., a feature map).
13. Swink S, Stuart A. Auditory long latency responses to tonal and speech stimuli. Journal of Speech, Language, and Hearing Research 2012; 55:447-459. [PMID: 22199192] [DOI: 10.1044/1092-4388(2011/10-0364)]
Abstract
PURPOSE The effects of type of stimuli (i.e., nonspeech vs. speech), speech (i.e., natural vs. synthetic), gender of speaker and listener, speaker (i.e., self vs. other), and frequency alteration in self-produced speech on the late auditory cortical evoked potential were examined. METHOD Young adult men (n = 15) and women (n = 15), all with normal hearing, participated. P1-N1-P2 components were evoked with the following stimuli: 723-Hz tone bursts; naturally produced male and female /a/ tokens; synthetic male and female /a/ tokens; an /a/ token self-produced by each participant; and the same /a/ token produced by the participant but with a shift in frequency. RESULTS In general, P1-N1-P2 component latencies were significantly shorter when evoked with the tonal stimulus versus speech stimuli and natural versus synthetic speech (p < .05). Women had significantly shorter latencies for only the P2 component (p < .05). For the tonal versus speech stimuli, P1 amplitudes were significantly smaller, and N1 and P2 amplitudes were significantly larger (p < .05). There was no significant effect of gender on the P1, N1, or P2 amplitude (p > .05). CONCLUSION These findings are consistent with the notion that spectrotemporal characteristics of nonspeech and speech stimuli affect P1-N1-P2 latency and amplitude components.
14.
Abstract
Language processing is a trait of the human species. Knowledge about its neurobiological basis has increased considerably over the past decades. Different brain regions in the left and right hemisphere have been identified that support particular language functions. Networks involving the temporal cortex and the inferior frontal cortex with a clear left lateralization were shown to support syntactic processes, whereas less lateralized temporo-frontal networks subserve semantic processes. These networks have been substantiated both by functional and by structural connectivity data. Electrophysiological measures indicate that within these networks syntactic processes of local structure building precede the assignment of grammatical and semantic relations in a sentence. Suprasegmental prosodic information overtly available in the acoustic language input is processed predominantly in a temporo-frontal network in the right hemisphere associated with a clear electrophysiological marker. Studies with patients suffering from lesions in the corpus callosum reveal that the posterior portion of this structure plays a crucial role in the interaction of syntactic and prosodic information during language processing.
Affiliation(s)
- Angela D Friederici
- Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
15. Sjerps MJ, Mitterer H, McQueen JM. Listening to different speakers: On the time-course of perceptual compensation for vocal-tract characteristics. Neuropsychologia 2011; 49:3831-46. [DOI: 10.1016/j.neuropsychologia.2011.09.044]
16. Scharinger M, Idsardi WJ, Poe S. A comprehensive three-dimensional cortical map of vowel space. J Cogn Neurosci 2011; 23:3972-82. [PMID: 21568638] [DOI: 10.1162/jocn_a_00056]
Abstract
Mammalian cortex is known to contain various kinds of spatial encoding schemes for sensory information including retinotopic, somatosensory, and tonotopic maps. Tonotopic maps are especially interesting for human speech sound processing because they encode linguistically salient acoustic properties. In this study, we mapped the entire vowel space of a language (Turkish) onto cortical locations by using the magnetic N1 (M100), an auditory-evoked component that peaks approximately 100 msec after auditory stimulus onset. We found that dipole locations could be structured into two distinct maps, one for vowels produced with the tongue positioned toward the front of the mouth (front vowels) and one for vowels produced in the back of the mouth (back vowels). Furthermore, we found spatial gradients in lateral-medial, anterior-posterior, and inferior-superior dimensions that encoded the phonetic, categorical distinctions between all the vowels of Turkish. Statistical model comparisons of the dipole locations suggest that the spatial encoding scheme is not entirely based on acoustic bottom-up information but crucially involves featural-phonetic top-down modulation. Thus, multiple areas of excitation along the unidimensional basilar membrane are mapped into higher dimensional representations in auditory cortex.
17. Scharinger M, Monahan PJ, Idsardi WJ. You had me at "Hello": Rapid extraction of dialect information from spoken words. Neuroimage 2011; 56:2329-38. [PMID: 21511041] [DOI: 10.1016/j.neuroimage.2011.04.007]
Abstract
Research on the neuronal underpinnings of speaker identity recognition has identified voice-selective areas in the human brain with evolutionary homologues in non-human primates who have comparable areas for processing species-specific calls. Most studies have focused on estimating the extent and location of these areas. In contrast, relatively few experiments have investigated the time-course of speaker identity, and in particular, dialect processing and identification by electro- or neuromagnetic means. We show here that dialect extraction occurs speaker-independently, pre-attentively and categorically. We used Standard American English and African-American English exemplars of 'Hello' in a magnetoencephalographic (MEG) Mismatch Negativity (MMN) experiment. The MMN as an automatic change detection response of the brain reflected dialect differences that were not entirely reducible to acoustic differences between the pronunciations of 'Hello'. Source analyses of the M100, an auditory evoked response to the vowels suggested additional processing in voice-selective areas whenever a dialect change was detected. These findings are not only relevant for the cognitive neuroscience of language, but also for the social sciences concerned with dialect and race perception.
Affiliation(s)
- Mathias Scharinger
- Department of Linguistics, University of Maryland, College Park, MD, USA
18. Scharinger M, Merickel J, Riley J, Idsardi WJ. Neuromagnetic evidence for a featural distinction of English consonants: sensor- and source-space data. Brain and Language 2011; 116:71-82. [PMID: 21185073] [PMCID: PMC3031676] [DOI: 10.1016/j.bandl.2010.11.002]
Abstract
Speech sounds can be classified on the basis of their underlying articulators or on the basis of the acoustic characteristics resulting from particular articulatory positions. Research in speech perception suggests that distinctive features are based on both articulatory and acoustic information. In recent years, neuroelectric and neuromagnetic investigations provided evidence for the brain's early sensitivity to distinctive features and their acoustic consequences, particularly for place of articulation distinctions. Here, we compare English consonants in a Mismatch Field design across two broad and distinct places of articulation, labial and coronal, and provide further evidence that early evoked auditory responses are sensitive to these features. We further add to the findings of asymmetric consonant processing, although we do not find support for coronal underspecification. Labial glides (Experiment 1) and fricatives (Experiment 2) elicited larger Mismatch responses than their coronal counterparts. Interestingly, their M100 dipoles differed along the anterior/posterior dimension in the auditory cortex that has previously been found to spatially reflect place of articulation differences. Our results are discussed with respect to acoustic and articulatory bases of featural speech sound classifications and with respect to a model that maps distinctive phonetic features onto long-term representations of speech sounds.
Affiliation(s)
- Mathias Scharinger
- Department of Linguistics, University of Maryland, College Park, MD 20742-7505, USA
19. Miettinen I, Alku P, Salminen N, May PJ, Tiitinen H. Responsiveness of the human auditory cortex to degraded speech sounds: Reduction of amplitude resolution vs. additive noise. Brain Res 2011; 1367:298-309. [DOI: 10.1016/j.brainres.2010.10.037]
20. The analysis of simple and complex auditory signals in human auditory cortex: magnetoencephalographic evidence from M100 modulation. Ear Hear 2010; 31:515-26. [PMID: 20445455] [DOI: 10.1097/aud.0b013e3181d99a75]
Abstract
OBJECTIVE Ecologically valid signals (e.g., vowels) have multiple components of substantially different frequencies and amplitudes that may not be equally cortically represented. In this study, we investigate a relatively simple signal at an intermediate level of complexity, two-frequency composite tones, a stimulus lying between simple sinusoids and ecologically valid signals such as speech. We aim to characterize the cortical response properties to better understand how complex signals may be represented in auditory cortex. DESIGN Using magnetoencephalography, we assessed the sensitivity of the M100/N100m auditory-evoked component to manipulations of the power ratio of the individual frequency components of the two-frequency complexes. Fourteen right-handed subjects with normal hearing were scanned while passively listening to 10 complex and 12 simple signals. The complex signals were composed of one higher frequency and one lower frequency sinusoid; the lower frequency sinusoidal component was at one of the five loudness levels relative to the higher frequency one: -20, -10, 0, +10, +20 dB. The simple signals comprised all the complex signal components presented in isolation. RESULTS The data replicate and extend several previous findings: (1) the systematic dependence of the M100 latency on signal intensity and (2) the dependence of the M100 latency on signal frequency, with lower frequency signals (approximately 100 Hz) exhibiting longer latencies than higher frequency signals (approximately 1000 Hz) even at matched loudness levels. (3) Importantly, we observe that, relative to simple signals, complex signals show increased response amplitude, as one might predict, but decreased M100 latencies. CONCLUSION The data suggest that by the time the M100 is generated in auditory cortex (approximately 70 to 80 msec after stimulus onset), integrative processing across frequency channels has taken place, which is observable in the M100 modulation. In light of these data, models that attribute more time and processing resources to a complex stimulus merit reevaluation, in that our data show that acoustically more complex signals are associated with robust temporal facilitation, across frequencies and signal amplitude level.
21. Monahan PJ, Idsardi WJ. Auditory Sensitivity to Formant Ratios: Toward an Account of Vowel Normalization. Language and Cognitive Processes 2010; 25:808-839. [PMID: 20606713] [PMCID: PMC2893733] [DOI: 10.1080/01690965.2010.490047]
Abstract
A long-standing question in speech perception research is how listeners extract linguistic content from a highly variable acoustic input. In the domain of vowel perception, formant ratios, or the calculation of relative bark differences between vowel formants, have been a sporadically proposed solution. We propose a novel formant ratio algorithm in which the first (F1) and second (F2) formants are compared against the third formant (F3). Results from two magnetoencephalographic (MEG) experiments are presented that suggest auditory cortex is sensitive to formant ratios. Our findings also demonstrate that the perceptual system shows heightened sensitivity to formant ratios for tokens located in more crowded regions of the vowel space. Additionally, we present statistical evidence that this algorithm eliminates speaker-dependent variation based on age and gender from vowel productions. We conclude that these results present an impetus to reconsider formant ratios as a legitimate mechanistic component in the solution to the problem of speaker normalization.
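A formant-ratio code of this general shape, comparing F1 and F2 against F3 on the bark scale, can be sketched as below. We use Traunmüller's (1990) Hz-to-bark approximation; whether this matches the authors' exact algorithm, and the formant values themselves, are assumptions:
```python
# Formant-ratio code: bark-scale distances of F1 and F2 from F3.
def bark(f_hz: float) -> float:
    """Hz-to-bark approximation (Traunmüller, 1990)."""
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53

def formant_ratio_code(f1: float, f2: float, f3: float) -> tuple:
    """Represent a vowel by (F3-F1, F3-F2) in bark."""
    return bark(f3) - bark(f1), bark(f3) - bark(f2)

# Invented formant values for two male-like vowels.
print("/i/:", formant_ratio_code(300, 2300, 3000))
print("/a/:", formant_ratio_code(750, 1200, 2800))
```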
Affiliation(s)
- Philip J. Monahan
- Basque Center on Cognition, Brain and Language, Donostia-San Sebastián, Spain
- William J. Idsardi
- Department of Linguistics, University of Maryland, USA
- Neuroscience and Cognitive Science Program, University of Maryland, USA
22. Pulvermüller F, Fadiga L. Active perception: sensorimotor circuits as a cortical basis for language. Nat Rev Neurosci 2010; 11:351-60. [PMID: 20383203] [DOI: 10.1038/nrn2811]
Abstract
Action and perception are functionally linked in the brain, but a hotly debated question is whether perception and comprehension of stimuli depend on motor circuits. Brain language mechanisms are ideal for addressing this question. Neuroimaging investigations have found specific motor activations when subjects understand speech sounds, word meanings and sentence structures. Moreover, studies involving transcranial magnetic stimulation and patients with lesions affecting inferior frontal regions of the brain have shown contributions of motor circuits to the comprehension of phonemes, semantic categories and grammar. These data show that language comprehension benefits from frontocentral action systems, indicating that action and perception circuits are interdependent.
Affiliation(s)
- Friedemann Pulvermüller
- Medical Research Council, Cognition and Brain Sciences Unit, 15 Chaucer Road, Cambridge, CB2 2EF, UK
23. Matilainen LE, Talvitie SS, Pekkonen E, Alku P, May PJC, Tiitinen H. The effects of healthy aging on auditory processing in humans as indexed by transient brain responses. Clin Neurophysiol 2010; 121:902-11. [PMID: 20359943] [DOI: 10.1016/j.clinph.2010.01.007]
Abstract
OBJECTIVE The aim of the study was to investigate the effects of aging on human cortical auditory processing of rising-intensity sinusoids and speech sounds. We also aimed to evaluate the suitability of a recently discovered transient brain response for applied research. METHODS In young and aged adults, magnetic fields produced by cortical activity elicited by a 570-Hz pure-tone and a speech sound (Finnish vowel /a/) were measured using MEG. The stimuli rose smoothly in intensity from an inaudible to an audible level over 750 ms. We used both the active (attended) and the passive recording condition. In the attended condition, behavioral reaction times were measured. RESULTS The latency of the transient brain response was prolonged in the aged compared to the young and the accuracy of behavioral responses to sinusoids was diminished among the aged. In response amplitudes, no differences were found between the young and the aged. In both groups, spectral complexity of the stimuli enhanced response amplitudes. CONCLUSIONS Aging seems to affect the temporal dynamics of cortical auditory processing. The transient brain response is sensitive both to spectral complexity and aging-related changes in the timing of cortical activation. SIGNIFICANCE The transient brain responses elicited by rising-intensity sounds could be useful in revealing differences in auditory cortical processing in applied research.
Affiliation(s)
- Laura E Matilainen
- Department of Biomedical Engineering and Computational Science, Aalto University, School of Science and Technology, Finland
24. Pulvermüller F. Brain embodiment of syntax and grammar: discrete combinatorial mechanisms spelt out in neuronal circuits. Brain and Language 2010; 112:167-179. [PMID: 20132977] [DOI: 10.1016/j.bandl.2009.08.002]
Abstract
Neuroscience has greatly improved our understanding of the brain basis of abstract lexical and semantic processes. The neuronal devices underlying words and concepts are distributed neuronal assemblies reaching into sensory and motor systems of the cortex and, at the cognitive level, information binding in such widely dispersed circuits is mirrored by the sensorimotor grounding of form and meaning of symbols. Recent years have seen the emergence of evidence for similar brain embodiment of syntax. Neurophysiological studies have accumulated support for the linguistic notion of abstract combinatorial rules manifest as functionally discrete neuronal assemblies. Concepts immanent to the theory of abstract automata could be grounded in observations from modern neuroscience, so that it became possible to model abstract pushdown storage, which is critical for building linguistic tree structure representations, as ordered dynamics of memory circuits in the brain. At the same time, neurocomputational research showed how sequence detectors already known from animal brains can be neuronally linked so that they merge into larger functionally discrete units, thereby underpinning abstract rule representations that syntactically bind lexicosemantic classes of morphemes and words into larger meaningful constituents. Specific predictions of brain-based grammar models could be confirmed by neurophysiological and brain imaging experiments using MEG, EEG and fMRI. Neuroscience and neurocomputational research offering perspectives on understanding abstract linguistic mechanisms in terms of neuronal circuits and their interactions therefore point programmatic new ways to future theory-guided experimental investigation of the brain basis of grammar.
Affiliation(s)
- Friedemann Pulvermüller
- Medical Research Council, Cognition and Brain Sciences Unit, 15 Chaucer Road, Cambridge CB2 7EF, UK
25. Miettinen I, Tiitinen H, Alku P, May PJC. Sensitivity of the human auditory cortex to acoustic degradation of speech and non-speech sounds. BMC Neurosci 2010; 11:24. [PMID: 20175890] [PMCID: PMC2837048] [DOI: 10.1186/1471-2202-11-24]
Abstract
Background Recent studies have shown that the human right-hemispheric auditory cortex is particularly sensitive to reduction in sound quality, with an increase in distortion resulting in an amplification of the auditory N1m response measured in the magnetoencephalography (MEG). Here, we examined whether this sensitivity is specific to the processing of acoustic properties of speech or whether it can be observed also in the processing of sounds with a simple spectral structure. We degraded speech stimuli (vowel /a/), complex non-speech stimuli (a composite of five sinusoidals), and sinusoidal tones by decreasing the amplitude resolution of the signal waveform. The amplitude resolution was impoverished by reducing the number of bits to represent the signal samples. Auditory evoked magnetic fields (AEFs) were measured in the left and right hemisphere of sixteen healthy subjects. Results We found that the AEF amplitudes increased significantly with stimulus distortion for all stimulus types, which indicates that the right-hemispheric N1m sensitivity is not related exclusively to degradation of acoustic properties of speech. In addition, the P1m and P2m responses were amplified with increasing distortion similarly in both hemispheres. The AEF latencies were not systematically affected by the distortion. Conclusions We propose that the increased activity of AEFs reflects cortical processing of acoustic properties common to both speech and non-speech stimuli. More specifically, the enhancement is most likely caused by spectral changes brought about by the decrease of amplitude resolution, in particular the introduction of periodic, signal-dependent distortion to the original sound. Converging evidence suggests that the observed AEF amplification could reflect cortical sensitivity to periodic sounds.
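The amplitude-resolution manipulation, requantizing a waveform to a given number of bits, is compact to sketch. The signal and bit depths below are illustrative, not the study's stimuli:
```python
# Degrade amplitude resolution by requantizing a [-1, 1] waveform to n bits.
import numpy as np

def quantize(signal: np.ndarray, n_bits: int) -> np.ndarray:
    """Uniformly requantize to 2**n_bits amplitude levels."""
    step = 2.0 / (2 ** n_bits)
    return np.clip(np.round(signal / step) * step, -1.0, 1.0)

t = np.arange(0, 0.1, 1 / 16000)
vowel_like = 0.8 * np.sin(2 * np.pi * 120 * t)      # stand-in for the vowel /a/
for bits in (16, 4, 2):
    err = vowel_like - quantize(vowel_like, bits)
    print(f"{bits:2d} bits: RMS distortion {np.sqrt((err ** 2).mean()):.4f}")
```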
Affiliation(s)
- Ismo Miettinen
- Department of Biomedical Engineering and Computational Science, Aalto University School of Science and Technology, Espoo, Finland
26.
27. May PJC, Tiitinen H. Mismatch negativity (MMN), the deviance-elicited auditory deflection, explained. Psychophysiology 2010; 47:66-122. [DOI: 10.1111/j.1469-8986.2009.00856.x]
28. Blakely T, Miller KJ, Rao RPN, Holmes MD, Ojemann JG. Localization and classification of phonemes using high spatial resolution electrocorticography (ECoG) grids. Annu Int Conf IEEE Eng Med Biol Soc 2008:4964-7. [PMID: 19163831] [DOI: 10.1109/iembs.2008.4650328]
Abstract
We present results of cortical activity during phoneme pronunciation, recorded using miniaturized electrocorticography grids with high spatial resolution. A patient implanted with the miniature grid was instructed to audibly pronounce one of four phonemes. For each phoneme, we observed distinct spatial correlation patterns at the 3-mm electrode spacing. We applied a support vector machine classification scheme and, for the first time, were able to distinguish discrete phonemes with high accuracy. In addition, we found that sub-regions of our miniature array were specific for distinct pairs of phonemes, showing that cortical phoneme processing occurs at a higher resolution than previously thought.
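A support-vector-machine classification scheme of this general kind can be sketched with simulated per-channel features; the feature choice (e.g., high-gamma power per electrode), shapes, and class structure are assumptions, not the recorded data:
```python
# Linear SVM on simulated per-electrode features for four phoneme classes.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(6)
n_trials, n_channels = 160, 64                      # invented: trials x electrodes
X = rng.standard_normal((n_trials, n_channels))
y = np.repeat(np.arange(4), n_trials // 4)          # four phoneme labels
for k in range(4):                                  # class-specific channel patterns
    X[y == k, k * 8:(k + 1) * 8] += 0.7

clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
print("5-fold CV accuracy:", cross_val_score(clf, X, y, cv=5).mean().round(2))
```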
29. Ceponiene R, Torki M, Alku P, Koyama A, Townsend J. Event-related potentials reflect spectral differences in speech and non-speech stimuli in children and adults. Clin Neurophysiol 2008; 119:1560-77. [PMID: 18456550] [DOI: 10.1016/j.clinph.2008.03.005]
Abstract
OBJECTIVE Event-related brain potentials (ERP) may provide tools for examining normal and abnormal language development. To clarify the functional significance of auditory ERPs, we examined ERP indices of spectral differences in speech and non-speech sounds. METHODS Three Spectral Items (BA, DA, GA) were presented as three Stimulus Types: syllables, non-phonetics, and consonant-vowel transitions (CVT). Fourteen 7- to 10-year-old children and 14 adults were presented with equiprobable Spectral Item sequences blocked by Stimulus Type. RESULTS The Spectral Item effect appeared as P1, P2, N2, and N4 amplitude variations. The P2 was sensitive to all Stimulus Types in both groups. In adults, the P1 was also sensitive to transitions while the N4 was sensitive to syllables. In children, only the 50-ms CVT stimuli elicited N2 and N4 spectral effects. In both groups, non-phonetic stimuli elicited larger N1-P2 amplitudes while speech stimuli elicited larger N2-N4 amplitudes. CONCLUSIONS Auditory feature processing is reflected by the P1-P2 and N2-N4 peaks and matures earlier than supra-sensory integrative mechanisms, reflected by the N1-P2 peaks. The auditory P2 appears to pertain to both processing types. SIGNIFICANCE These results delineate an orderly processing organization whereby direct feature mapping occurs earlier in processing and, in part, serves sound detection, whereas relational mapping occurs later in processing and serves sound identification.
Affiliation(s)
- R Ceponiene
- Center for Research in Language, Project in Neural and Cognitive Development, University of California, San Diego, La Jolla, CA 92093-0113, USA.
|
30
|
Abstract
Voice onset time (VOT) provides an important auditory cue for recognizing spoken consonant-vowel syllables. Although changes in the neuromagnetic response to consonant-vowel syllables with different VOT have been examined, such experiments have only manipulated VOT with respect to voicing. We utilized the characteristics of a previously developed asymmetric VOT continuum [Liederman, J., Frye, R. E., McGraw Fisher, J., Greenwood, K., & Alexander, R. A temporally dynamic contextual effect that disrupts voice onset time discrimination of rapidly successive stimuli. Psychonomic Bulletin and Review, 12, 380-386, 2005] to determine if changes in the prominent M100 neuromagnetic response were linearly modulated by VOT. Eight right-handed, English-speaking, normally developing participants performed a VOT discrimination task during a whole-head neuromagnetic recording. The M100 was identified in the gradiometers overlying the right and left temporal cortices and single dipoles were fit to each M100 waveform. A repeated measures analysis of variance with post hoc contrast test for linear trend was used to determine whether characteristics of the M100 were linearly modulated by VOT. The morphology of the M100 gradiometer waveform and the peak latency of the dipole waveform were linearly modulated by VOT. This modulation was much greater in the left, as compared to the right, hemisphere. The M100 dipole moved in a linear fashion as VOT increased in both hemispheres, but along different axes in each hemisphere. This study suggests that VOT may linearly modulate characteristics of the M100, predominantly in the left hemisphere, and suggests that the VOT of consonant-vowel syllables, instead of, or in addition to, voicing, should be examined in future experiments.
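The post hoc contrast test for linear trend reported here can be sketched as a per-subject linear contrast over VOT levels followed by a one-sample t-test; the subject count, level count, and latencies below are invented for illustration:

    import numpy as np
    from scipy import stats

    # 8 subjects x 5 VOT steps of simulated M100 peak latencies (ms)
    latencies = np.random.default_rng(1).normal(130, 5, size=(8, 5))
    levels = np.arange(5)
    weights = levels - levels.mean()         # linear contrast weights (-2 .. 2)
    contrast = latencies @ weights           # one linear-trend score per subject
    t, p = stats.ttest_1samp(contrast, 0.0)  # does the trend differ from zero?
    print(f"linear trend: t = {t:.2f}, p = {p:.3f}")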
Affiliation(s)
- Richard E Frye
- University of Texas Health Science Center at Houston, TX 77030, USA.
|
31
|
Tavabi K, Obleser J, Dobel C, Pantev C. Auditory evoked fields differentially encode speech features: an MEG investigation of the P50m and N100m time courses during syllable processing. Eur J Neurosci 2007; 25:3155-62. [PMID: 17561829 DOI: 10.1111/j.1460-9568.2007.05572.x] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Abstract
The functional organization of speech sound processing in the human brain and its unfolding over time are still not well understood. While the N100/N100m is a comparatively well-studied, and quite late, component of the auditory evoked field elicited by speech, earlier processes such as those reflected in the P50m remain to be resolved. Using magnetoencephalography, the present study follows up on previous reports of N100m-centred spatiotemporal encoding of phonological features and coarticulatory processes in the auditory cortex during consonant-vowel syllable perception. Our results indicate that the time course and response strength of the P50m and N100m components of evoked magnetic fields are differentially influenced by mutually exclusive place-of-articulation features of a syllable's stop consonant and vowel segments. Topographical differences in P50m generators were driven by place contrasts between consonants in syllables, with spatial gradients orthogonal to the ones previously reported for N100m. Peak latency results replicated previous findings for the N100m and revealed a reverse pattern for the earlier P50m (shorter latencies depending on the presence of a back vowel [o]). Our findings allow attribution of a role in basic feature extraction to the comparatively early P50m time window. Moreover, the observations substantiate the assumption that the N100m response reflects a more abstract phonological representational stage during speech perception.
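Component timing of the kind analyzed here is usually quantified by peak picking within fixed latency windows; a minimal sketch with a simulated evoked trace and conventional (assumed) P50m/N100m windows:

    import numpy as np

    sfreq = 1000.0                            # sampling rate (Hz), assumed
    times = np.arange(-0.1, 0.4, 1 / sfreq)   # epoch time axis (s)
    evoked = np.random.default_rng(2).normal(size=times.size)  # stand-in waveform

    def peak_latency(signal, times, tmin, tmax):
        """Latency of the absolute peak inside [tmin, tmax]."""
        mask = (times >= tmin) & (times <= tmax)
        return times[mask][np.argmax(np.abs(signal[mask]))]

    p50m_lat = peak_latency(evoked, times, 0.03, 0.08)   # P50m window (assumed)
    n100m_lat = peak_latency(evoked, times, 0.08, 0.15)  # N100m window (assumed)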
Affiliation(s)
- Kambiz Tavabi
- Institute for Biomagnetism and Biosignalanalysis, University of Münster, Germany.
|
32
|
Obleser J, Boecker H, Drzezga A, Haslinger B, Hennenlotter A, Roettinger M, Eulitz C, Rauschecker JP. Vowel sound extraction in anterior superior temporal cortex. Hum Brain Mapp 2006; 27:562-71. [PMID: 16281283 PMCID: PMC6871493 DOI: 10.1002/hbm.20201] [Citation(s) in RCA: 130] [Impact Index Per Article: 7.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022] Open
Abstract
We investigated the functional neuroanatomy of vowel processing. We compared attentive auditory perception of natural German vowels to perception of nonspeech band-passed noise stimuli using functional magnetic resonance imaging (fMRI). More specifically, the mapping in auditory cortex of first and second formants was considered, which spectrally characterize vowels and are linked closely to phonological features. Multiple exemplars of natural German vowels were presented in sequences alternating either mainly along the first formant (e.g., [u]-[o], [i]-[e]) or along the second formant (e.g., [u]-[i], [o]-[e]). In fixed-effects and random-effects analyses, vowel sequences elicited more activation than did nonspeech noise in the anterior superior temporal cortex (aST) bilaterally. Partial segregation of different vowel categories was observed within the activated regions, suggestive of a speech sound mapping across the cortical surface. Our results add to the growing evidence that speech sounds, as one of the behaviorally most relevant classes of auditory objects, are analyzed and categorized in aST. These findings also support the notion of an auditory "what" stream, with highly object-specialized areas anterior to primary auditory cortex.
Affiliation(s)
- Jonas Obleser
- Fachgruppen Psychologie und Linguistik, Universität Konstanz, Konstanz, Germany
|
33
|
Ogata E, Yumoto M, Itoh K, Sekimoto S, Karino S, Kaga K. A magnetoencephalographic study of Japanese vowel processing. Neuroreport 2006; 17:1127-31. [PMID: 16837840 DOI: 10.1097/01.wnr.0000230503.47973.b7] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
Abstract
Magnetic brain responses were recorded to clarify the cortical representation of vowel processing in Japanese. We investigated the peak latencies and equivalent current dipoles of the auditory N1m responses to the Japanese vowels [a], [i], [o], and [u]. In intraindividual analyses for a single participant, well-replicated results for the dipole parameters supported the existence of phoneme-specific cortical maps for vowels. In the interindividual analyses for the eight participants, [a] and [i] elicited significantly earlier N1m responses than [u], and the dipole for [i] was oriented more posteriorly than that for [a] in the left hemisphere. The results of the current study suggest left-hemispheric predominance in vowel processing and indicate that factors associated with a different language system may modify the cortical map.
Affiliation(s)
- Erika Ogata
- Department of Sensory and Motor Neuroscience, Graduate School of Medicine, University of Tokyo, Tokyo, Japan.
|
34
|
Gage N, Roberts TPL, Hickok G. Temporal resolution properties of human auditory cortex: reflections in the neuromagnetic auditory evoked M100 component. Brain Res 2006; 1069:166-71. [PMID: 16403467 DOI: 10.1016/j.brainres.2005.11.023] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/03/2005] [Revised: 06/24/2005] [Accepted: 11/08/2005] [Indexed: 11/21/2022]
Abstract
Previous work has provided evidence for a brief, finite (approximately 35 ms) temporal window of integration (TWI) in M100 formation, during which stimulus attributes are accumulated in processes leading to the M100 peak. Here, we investigate resolution within the TWI by recording responses to tones containing silent gaps (0-20 ms). Gaps were inserted in 1 kHz tones in two conditions: +10 ms post-onset (10 ms masker), wherein the masker and the gap of longest duration (20 ms) were contained within the initial 35 ms of the stimulus, and +40 ms post-onset (40 ms masker), wherein all gaps were inserted +40 ms post-onset. Tones were presented binaurally and responses sampled from both hemispheres in 12 adults using a twin 37-channel biomagnetometer (MAGNES-II, BTi, San Diego, CA). Results for the 10 ms masker: M100 latency was prolonged and amplitude decreased as a function of gap duration, even with the shortest-duration (2 ms) gap, indicating that integrative processes underlying M100 formation are sensitive to fine-grained discontinuities within a brief, finite TWI. Results for the 40 ms masker: M100 latency and amplitude were unaffected by gaps inserted at +40 ms, providing further evidence for an M100 TWI of <40 ms. CONCLUSION: Within a brief integrative window in M100 formation, population-level responses are sensitive to discontinuities in sounds on a scale corresponding to psychophysical detection thresholds and minimum detectable gap thresholds in single-unit recordings. Cumulatively, results provide evidence that M100 resolution for brief fluctuations in sounds reflects temporal acuity properties that are both intrinsic to the auditory system and critical to the accurate perception of speech.
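The gap stimuli can be sketched directly: a 1 kHz tone with a silent gap inserted at a fixed post-onset delay. Gap and masker timings follow the abstract; the sampling rate and total tone duration are assumptions:

    import numpy as np

    fs = 44100                                    # sampling rate (Hz), assumed
    t = np.arange(int(0.4 * fs)) / fs             # 400 ms tone, assumed length
    tone = np.sin(2 * np.pi * 1000 * t)           # 1 kHz carrier

    def insert_gap(signal, onset_ms, gap_ms, fs):
        """Zero out gap_ms of signal starting onset_ms after tone onset."""
        start = int(onset_ms * fs / 1000)
        out = signal.copy()
        out[start:start + int(gap_ms * fs / 1000)] = 0.0
        return out

    stim = insert_gap(tone, onset_ms=10, gap_ms=20, fs=fs)  # 10 ms masker, 20 ms gap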
Affiliation(s)
- Nicole Gage
- Department of Cognitive Sciences, 3151 Social Science Plaza A, University of California, Irvine, CA 92697-5100, USA.
|
35
|
Tiitinen H, Mäkelä AM, Mäkinen V, May PJC, Alku P. Disentangling the effects of phonation and articulation: hemispheric asymmetries in the auditory N1m response of the human brain. BMC Neurosci 2005; 6:62. [PMID: 16225699 PMCID: PMC1280927 DOI: 10.1186/1471-2202-6-62] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/27/2005] [Accepted: 10/15/2005] [Indexed: 11/16/2022] Open
Abstract
Background The cortical activity underlying the perception of vowel identity has typically been addressed by manipulating the first and second formant frequency (F1 & F2) of the speech stimuli. These two values, originating from articulation, are already sufficient for the phonetic characterization of vowel category. In the present study, we investigated how the spectral cues caused by articulation are reflected in cortical speech processing when combined with phonation, the other major part of speech production, manifested as the fundamental frequency (F0) and its harmonic integer multiples. To study the combined effects of articulation and phonation we presented vowels with either high (/a/) or low (/u/) formant frequencies which were driven by three different types of excitation: a natural periodic pulseform reflecting the vibration of the vocal folds, an aperiodic noise excitation, or a tonal waveform. The auditory N1m response was recorded with whole-head magnetoencephalography (MEG) from ten human subjects in order to resolve whether brain events reflecting articulation and phonation are specific to the left or right hemisphere of the human brain. Results The N1m responses for the six stimulus types displayed a considerable dynamic range of 115–135 ms, and were elicited faster (~10 ms) by the high-formant /a/ than by the low-formant /u/, indicating an effect of articulation. While excitation type had no effect on the latency of the right-hemispheric N1m, the left-hemispheric N1m elicited by the tonally excited /a/ was some 10 ms earlier than that elicited by the periodic and the aperiodic excitation. The amplitude of the N1m in both hemispheres was systematically stronger to stimulation with natural periodic excitation. Also, stimulus type had a marked (up to 7 mm) effect on the source location of the N1m, with periodic excitation resulting in more anterior sources than aperiodic and tonal excitation. Conclusion The auditory brain areas of the two hemispheres exhibit differential tuning to natural speech signals, observable already in the passive recording condition. The variations in the latency and strength of the auditory N1m response can be traced back to the spectral structure of the stimuli. More specifically, the combined effects of the harmonic comb structure originating from the natural voice excitation caused by the fluctuating vocal folds and the location of the formant frequencies originating from the vocal tract lead to asymmetric behaviour of the left and right hemispheres.
Affiliation(s)
- Hannu Tiitinen
- Apperception & Cortical Dynamics (ACD), Department of Psychology, P.O.B. 9, FIN-00014 University of Helsinki, Finland
- BioMag Laboratory, Engineering Centre, Helsinki University Central Hospital, Finland
- Anna Mari Mäkelä
- Apperception & Cortical Dynamics (ACD), Department of Psychology, P.O.B. 9, FIN-00014 University of Helsinki, Finland
- BioMag Laboratory, Engineering Centre, Helsinki University Central Hospital, Finland
- Ville Mäkinen
- Apperception & Cortical Dynamics (ACD), Department of Psychology, P.O.B. 9, FIN-00014 University of Helsinki, Finland
- BioMag Laboratory, Engineering Centre, Helsinki University Central Hospital, Finland
- Patrick JC May
- Apperception & Cortical Dynamics (ACD), Department of Psychology, P.O.B. 9, FIN-00014 University of Helsinki, Finland
- BioMag Laboratory, Engineering Centre, Helsinki University Central Hospital, Finland
- Paavo Alku
- Laboratory of Acoustics and Audio Signal Processing, Helsinki University of Technology, Espoo, Finland
|
36
|
Ceponiene R, Alku P, Westerfield M, Torki M, Townsend J. ERPs differentiate syllable and nonphonetic sound processing in children and adults. Psychophysiology 2005; 42:391-406. [PMID: 16008768 DOI: 10.1111/j.1469-8986.2005.00305.x] [Citation(s) in RCA: 86] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Abstract
We examined maturation of speech-sound-related indices of auditory event-related brain potentials (ERPs). ERPs were elicited by syllables and nonphonetic correlates in children and adults. Compared with syllables, nonphonetic stimuli elicited larger N1 and P2 in adults and P1 in children. Because the nonphonetics were more perceptually salient, this N1 effect was consistent with known N1 sensitivity to sound onset features. Based on stimulus dependence and independent component structure, children's P1 appeared to contain overlapping P2-like activity. In both subject groups, syllables elicited larger N2/N4 peaks. This might reflect sound content feature processing, more extensive for speech than nonspeech sounds. Therefore, sound detection mechanisms (N1, P2) still develop whereas sound content processing (N2, N4) is largely mature during mid-childhood; in children and adults, speech sounds are processed more extensively than nonspeech sounds 200-400 ms poststimulus.
Affiliation(s)
- R Ceponiene
- Center for Research in Language, University of California, San Diego, California 92093-0113, USA.
|
37
|
Sittiprapaporn W, Tervaniemi M, Chindaduangratn C, Kotchabhakdi N. Preattentive discrimination of across-category and within-category change in consonant–vowel syllable. Neuroreport 2005; 16:1513-8. [PMID: 16110281 DOI: 10.1097/01.wnr.0000175618.46677.07] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]
Abstract
Event-related potentials to infrequently presented spoken deviant syllables /pi/ and /po/ among repetitive standard [see text] syllables were recorded in Thai study participants who ignored these stimuli while reading books of their choice. Both the across-category and the within-category vowel changes elicited a change-specific mismatch negativity response. Discrimination of across-category and within-category vowel changes in consonant-vowel syllables was also assessed using low-resolution electromagnetic tomography. The results of the low-resolution electromagnetic tomography mismatch negativity generator analysis suggest that a within-category vowel change is perceived as a change in the physical features of the stimuli, predominantly activating the right temporal cortex. In contrast, the left temporal cortex is predominantly activated by across-category vowel changes, emphasizing the role of the left hemisphere in speech processing, already at a preattentive processing level, for consonant-vowel syllables as well. The results support the hypothesis that a part of the superior temporal gyrus contains neurons specialized for speech perception.
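The mismatch negativity itself is conventionally quantified as a deviant-minus-standard difference wave averaged over a latency window; a minimal sketch with invented averaged ERPs and an assumed 100-250 ms window:

    import numpy as np

    times = np.arange(-0.1, 0.5, 0.001)           # epoch time axis (s)
    rng = np.random.default_rng(3)
    standard_erp = rng.normal(size=times.size)    # placeholder grand averages
    deviant_erp = rng.normal(size=times.size)

    mmn = deviant_erp - standard_erp              # difference wave
    window = (times >= 0.10) & (times <= 0.25)    # typical MMN range (assumed)
    mmn_amplitude = mmn[window].mean()            # mean amplitude in the window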
Affiliation(s)
- Wichian Sittiprapaporn
- Neuro-Behavioural Biology Center, Institute of Science and Technology for Research and Development, Mahidol University, Salaya, Nakhonpathom, Thailand.
|
38
|
Eichele T, Nordby H, Rimol LM, Hugdahl K. Asymmetry of evoked potential latency to speech sounds predicts the ear advantage in dichotic listening. ACTA ACUST UNITED AC 2005; 24:405-12. [PMID: 16099353 DOI: 10.1016/j.cogbrainres.2005.02.017] [Citation(s) in RCA: 60] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/30/2004] [Revised: 02/11/2005] [Accepted: 02/14/2005] [Indexed: 11/18/2022]
Abstract
The functional organization of the human auditory cortex is still not well understood with respect to speech perception and language lateralization. In particular, comparatively little data is available in the brain imaging literature on the timing of phonetic processing. We recorded auditory-evoked potentials (AEPs) from 27 scalp and additional EOG channels in 12 healthy volunteers performing a free-report dichotic listening task with simple speech sounds (CV syllables: [ba], [da], [ga], [pa], [ta], [ka]). ERP analysis employed independent component analysis (ICA) wavelet denoising for artifact reduction and improvement of the SNR. The main finding was a 15-ms shorter average latency of the N1 AEP recorded from the scalp approximately overlying the left supratemporal cortical plane compared to the N1 AEP over the homologous right side. Corresponding N1 amplitudes did not differ between these sites. The individual AEP latency differences correlated significantly with the ear advantage as an index of speech/language lateralization. The behaviorally relevant difference in N1 latency between the hemispheres indicates that an important key to understanding speech perception is to consider the functional implications of neuronal event timing.
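The reported brain-behaviour link amounts to correlating a per-subject hemispheric latency difference with a dichotic-listening laterality index; a sketch with invented numbers in place of the study's measurements:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(4)
    latency_asym = rng.normal(15, 5, size=12)   # right-minus-left N1 latency (ms), placeholder
    ear_advantage = 0.5 * latency_asym + rng.normal(0, 3, size=12)  # laterality index, placeholder

    r, p = stats.pearsonr(latency_asym, ear_advantage)
    print(f"r = {r:.2f}, p = {p:.3f}")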
Affiliation(s)
- Tom Eichele
- Department of Biological and Medical Psychology, University of Bergen, Jonas Lies Vei 91, N-5011 Bergen, Norway.
|
39
|
Mäkelä AM, Alku P, May PJC, Mäkinen V, Tiitinen H. Left-hemispheric brain activity reflects formant transitions in speech sounds. Neuroreport 2005; 16:549-53. [PMID: 15812305 DOI: 10.1097/00001756-200504250-00006] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022]
Abstract
Connected speech is characterized by formant transitions whereby formant frequencies change over time. Here, using magnetoencephalography, we investigated the cortical activity in 10 participants in response to constant-formant vowels and diphthongs with formant transitions. All the stimuli elicited prominent auditory N100m responses, but the formant transitions resulted in latency modulations specific to the left hemisphere. Following the elicitation of the N100m, cortical activity shifted some 10 mm towards anterior brain areas. This late activity resembled the N400m, typically obtained with more complex utterances such as words and/or sentences. Thus, the present study demonstrates how magnetoencephalography can be used to investigate the spatiotemporal evolution in cortical activity related to the various stages of the processing of speech.
Affiliation(s)
- Anna Mari Mäkelä
- Apperception & Cortical Dynamics, Department of Psychology, University of Helsinki, Helsinki, Finland.
|
40
|
Shestakova A, Brattico E, Soloviev A, Klucharev V, Huotilainen M. Orderly cortical representation of vowel categories presented by multiple exemplars. ACTA ACUST UNITED AC 2004; 21:342-50. [PMID: 15511650 DOI: 10.1016/j.cogbrainres.2004.06.011] [Citation(s) in RCA: 51] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 06/19/2004] [Indexed: 11/17/2022]
Abstract
This study aimed at determining how the human brain automatically processes phoneme categories irrespective of the large acoustic inter-speaker variability. Subjects were presented with 450 different speech stimuli, equally distributed across the [a], [i], and [u] vowel categories, and each uttered by a different male speaker. A 306-channel magnetoencephalogram (MEG) was used to record N1m, the magnetic counterpart of the N1 component of the auditory event-related potential (ERP). The N1m amplitude and source locations differed between vowel categories. We also found that the spectrum dissimilarities were reproduced in the cortical representations of the large set of the phonemes used in this study: vowels with similar spectral envelopes had closer cortical representations than those whose spectral differences were the largest. Our data further extend the notion of differential cortical representations in response to vowel categories, previously demonstrated by using only one or a few tokens representing each category.
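The claim that spectral dissimilarities are reproduced in the cortical representation can be sketched as a comparison of two pairwise-distance sets, one over vowel spectra and one over N1m source locations; the F1/F2 values are textbook approximations and the source coordinates are invented:

    import numpy as np
    from scipy.spatial.distance import pdist
    from scipy import stats

    spectral = np.array([[730, 1090],    # rough F1, F2 of [a] in Hz
                         [270, 2290],    # [i]
                         [300,  870]])   # [u]
    sources = np.random.default_rng(5).normal(size=(3, 3))  # stand-in dipole xyz (cm)

    rho, _ = stats.spearmanr(pdist(spectral), pdist(sources))
    print(f"spectral-to-cortical distance correlation: rho = {rho:.2f}")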
Affiliation(s)
- Anna Shestakova
- Cognitive Brain Research Unit, Department of Psychology, PO Box 9, FIN-00014 University of Helsinki, Helsinki, Finland.
|
41
|
Obleser J, Elbert T, Eulitz C. Attentional influences on functional mapping of speech sounds in human auditory cortex. BMC Neurosci 2004; 5:24. [PMID: 15268765 PMCID: PMC503386 DOI: 10.1186/1471-2202-5-24] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/17/2004] [Accepted: 07/21/2004] [Indexed: 11/10/2022] Open
Abstract
Background The speech signal contains both information about phonological features such as place of articulation and non-phonological features such as speaker identity. These are different aspects of the 'what'-processing stream (speaker vs. speech content), and here we show that they can be further segregated, as they may occur in parallel but within different neural substrates. Subjects listened to two different vowels, each spoken by two different speakers. During one block, they were asked to identify a given vowel irrespective of the speaker (phonological categorization), while during the other block the speaker had to be identified irrespective of the vowel (speaker categorization). Auditory evoked fields were recorded using 148-channel magnetoencephalography (MEG), and magnetic source imaging was obtained for 17 subjects. Results During phonological categorization, a vowel-dependent difference in N100m source location perpendicular to the main tonotopic gradient replicated previous findings. In speaker categorization, the relative mapping of vowels remained unchanged but sources were shifted towards more posterior and more superior locations. Conclusions These results imply that the N100m reflects the extraction of abstract invariants from the speech signal. This part of the processing is accomplished in auditory areas anterior to AI, which are part of the auditory 'what' system. This network seems to include spatially separable modules for identifying the phonological information and for associating it with a particular speaker, which are activated in synchrony but within different regions, suggesting that 'what' processing can be more adequately modeled by a stream of parallel stages. The relative activation of the parallel processing stages can be modulated by attentional or task demands.
Affiliation(s)
- Jonas Obleser
- Department of Psychology, University of Konstanz, Germany
- Department of Linguistics, University of Konstanz, Germany
- Thomas Elbert
- Department of Psychology, University of Konstanz, Germany
- Carsten Eulitz
- Department of Linguistics, University of Konstanz, Germany
- Department of Psychiatry and Psychotherapy, School of Medicine, University of Aachen, Germany
|
42
|
Jacobsen T, Schröger E, Alter K. Pre-attentive perception of vowel phonemes from variable speech stimuli. Psychophysiology 2004; 41:654-9. [PMID: 15189488 DOI: 10.1111/1469-8986.2004.00175.x] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Abstract
Understanding speech requires the construction of phonetic representations while abstracting from specific sound features. To understand different speakers of varying pitches of voice, loudness, or timbre, categorical phoneme information needs to be rapidly extracted from dynamic, changing speech input. The present study demonstrated a genuine MMN to tokens of /a/ and /i/ vowels varying in pitch of voice and amplitude envelope when they occurred infrequently among the respective other vowels. These data indicate that the speech perception system pre-attentively extracted the F1/F2 formant information despite the language-irrelevant variation in the sound input.
Affiliation(s)
- Thomas Jacobsen
- Institut für Allgemeine Psychologie, Universität Leipzig, Seeburgstrasse 14-20, 04103 Leipzig, Germany.
|
43
|
Eulitz C, Obleser J, Lahiri A. Intra-subject replication of brain magnetic activity during the processing of speech sounds. ACTA ACUST UNITED AC 2004; 19:82-91. [PMID: 14972361 DOI: 10.1016/j.cogbrainres.2003.11.004] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 11/18/2003] [Indexed: 11/20/2022]
Abstract
The present study examined the cortical activity during processing of vocalic segments by means of whole-head magnetoencephalography (MEG) to see whether respective cortical maps are stable across repeated measurements. We investigated the spatial configuration and temporal characteristics of the N100m generators of the auditory-evoked field during the processing of the synthetic German vowels [a], [e] and [i] across 10 repeated measurements in a single subject. Between vowels, N100m latency as well as source location differences were found with the latency differences being in accordance with tonochronic principles. The spatial configuration of the different vowel sources was related to differences in acoustic/phonological features. Vowels differing maximally in those features, i.e., [a] and [i], showed larger Euclidean distances between N100m vowel sources than [e] and [i]. This pattern was repeatable across sessions and independent of the source modeling strategy for left-hemispheric data. Compared to a pure tone control condition, the N100m generators of vowels were localized in more anterior, superior and lateral parts of the temporal lobe and showed longer latencies. Being aware of the limited significance of conclusions drawn from a single case study, the study yielded a repeatable spatial and temporal pattern of vowel source activity in the auditory cortex which was determined by the distinctiveness of the formant frequencies corresponding to abstract phonological features.
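The Euclidean-distance measure used here is simply the straight-line distance between equivalent-current-dipole locations for two vowels; a sketch with placeholder coordinates rather than the study's fitted dipoles:

    import numpy as np

    src_a = np.array([-4.2, 1.1, 6.0])    # [a] dipole location (cm), invented
    src_i = np.array([-4.6, 0.4, 6.8])    # [i] dipole location (cm), invented
    dist_ai = np.linalg.norm(src_a - src_i)
    print(f"Euclidean distance [a]-[i]: {dist_ai:.2f} cm")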
Affiliation(s)
- Carsten Eulitz
- Department of Clinical Psychology, University of Konstanz, Giessberg 10, P.O. Box D25, 78457 Konstanz, Germany.
|
44
|
Obleser J, Lahiri A, Eulitz C. Magnetic Brain Response Mirrors Extraction of Phonological Features from Spoken Vowels. J Cogn Neurosci 2004; 16:31-9. [PMID: 15006034 DOI: 10.1162/089892904322755539] [Citation(s) in RCA: 103] [Impact Index Per Article: 5.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
Abstract
This study further elucidates determinants of vowel perception in the human auditory cortex. The vowel inventory of a given language can be classified on the basis of phonological features which are closely linked to acoustic properties. A cortical representation of speech sounds based on these phonological features might explain the surprisingly inverse correlation between immense variance in the acoustic signal and high accuracy of speech recognition. We investigated timing and mapping of the N100m elicited by 42 tokens of seven natural German vowels varying along the phonological features tongue height (corresponding to the frequency of the first formant) and place of articulation (corresponding to the frequencies of the second and third formants). Auditory evoked fields were recorded using a 148-channel whole-head magnetometer while subjects performed target vowel detection tasks. Source location differences appeared to be driven by place of articulation: vowels with mutually exclusive place-of-articulation features, namely coronal and dorsal, elicited separate centers of activation along the posterior-anterior axis. Additionally, the time course of activation as reflected in the N100m peak latency distinguished between vowel categories, especially when the spatial distinctiveness of cortical activation was low. In sum, results suggest that both N100m latency and source location, as well as their interaction, reflect properties of speech stimuli that correspond to abstract phonological features.
Affiliation(s)
- Jonas Obleser
- University of Konstanz, Universitätstrasse 10, PO Box D25, 78457 Konstanz, Germany.
|
45
|
Mäkelä AM, Alku P, Tiitinen H. The auditory N1m reveals the left-hemispheric representation of vowel identity in humans. Neurosci Lett 2003; 353:111-4. [PMID: 14664913 DOI: 10.1016/j.neulet.2003.09.021] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]
Abstract
The cortical correlates of the perception of the sustained vowels /a/, /o/ and /u/ were studied using whole-head magnetoencephalography (MEG). The three vowels, which were located on a line in the space spanned by the first (F1) and second (F2) formants and had equal F2-F1 differences, evoked equally strong auditory N1m responses at 120 ms after stimulus onset. The left-hemispheric distribution of the source locations, estimated by equivalent current dipoles, reflected the acoustic similarity of the vowels: the growing distance between the vowels in the F2,F1-space was accompanied by a growing distance between the centres of gravity of the activation elicited by each vowel. Thus, direct evidence for the orderly left-hemispheric representation of phonemes in human auditory cortex was found.
Affiliation(s)
- Anna Mari Mäkelä
- Apperception & Cortical Dynamics, Department of Psychology, Box 9, University of Helsinki, Helsinki, Fin-00014, Finland.
|
46
|
Vihla M, Eulitz C. Topography of the auditory evoked potential in humans reflects differences between vowels embedded in pseudo-words. Neurosci Lett 2003; 338:189-92. [PMID: 12581828 DOI: 10.1016/s0304-3940(02)01403-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/27/2022]
Abstract
To study the processing of vowels embedded in more complex linguistic structures, we compared cortical responses for pseudo-words. Auditory evoked potentials were recorded in 11 right-handed females using a passive oddball paradigm, with /pemu/ and /pomu/ as standard stimuli, differing only with respect to the first syllable. Topographic differences in the N100 were observed between the standards: /pemu/ had larger amplitudes than /pomu/ at more posterior electrode sites whereas a reverse pattern was found at more anterior positions along the midline. This topographic difference can be explained by different generators for the two stimuli. Different vowels and/or the initial formant transition possibly activate different neural populations in the auditory cortex, also when the vowels are embedded in pseudo-words.
Affiliation(s)
- Minna Vihla
- Department of Psychology, University of Konstanz, PO Box D25, 78457 Konstanz, Germany
|
47
|
Obleser J, Elbert T, Lahiri A, Eulitz C. Cortical representation of vowels reflects acoustic dissimilarity determined by formant frequencies. BRAIN RESEARCH. COGNITIVE BRAIN RESEARCH 2003; 15:207-13. [PMID: 12527095 DOI: 10.1016/s0926-6410(02)00193-3] [Citation(s) in RCA: 65] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/27/2022]
Abstract
We studied neuromagnetic correlates of the processing of the German vowels [a], [e] and [i]. The aim was (i) to show an influence of acoustic/phonetic features on the timing and mapping of the N100m component and (ii) to demonstrate the retest reliability of these parameters. To assess the spatial configuration of the N100m generators, Euclidean distances between vowel sources were computed. Latency, amplitude, and source locations of the N100m component differed between vowels. The acoustically most dissimilar vowels [a] and [i] showed more distant source locations than the more similar vowels [e] and [i]. This pattern of results was reliably found in a second experimental session after at least 5 days. The results suggest the preservation of spectral dissimilarities, as mapped in an F1-F2 vowel space, in a cortical representation.
Affiliation(s)
- Jonas Obleser
- University of Konstanz, Department of Clinical Psychology, P.O. Box D25, 78457 Konstanz, Germany.
|
48
|
Menning H, Imaizumi S, Zwitserlood P, Pantev C. Plasticity of the human auditory cortex induced by discrimination learning of non-native, mora-timed contrasts of the Japanese language. Learn Mem 2002; 9:253-67. [PMID: 12359835 PMCID: PMC187135 DOI: 10.1101/lm.49402] [Citation(s) in RCA: 64] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
Abstract
In this magnetoencephalographic (MEG) study, we examined with high temporal resolution the traces of learning in the speech-dominant left-hemispheric auditory cortex as a function of newly trained mora-timing. In Japanese, the "mora" is a temporal unit that divides words into almost isochronous segments (e.g., na-ka-mu-ra and to-o-kyo-o each comprise four morae). Changes in the brain responses of a group of German and Japanese subjects to differences in the mora structure of Japanese words were compared. German subjects performed discrimination training in 10 sessions of 1.5 h, one per day. They learned to discriminate Japanese pairs of words (in a consonant condition, anni-ani, and a vowel condition, kiyo-kyo), where the second word was shortened by one mora in eight steps of 15 msec each. A significant increase in learning performance, as reflected by behavioral measures, was observed, accompanied by a significant increase in the amplitude of the Mismatch Negativity Field (MMF). The German subjects' hit rate for detecting durational deviants increased by up to 35%. Reaction times and MMF latencies decreased significantly across training sessions. Japanese subjects showed a more sensitive MMF to smaller differences. Thus, even in young adults, perceptual learning of non-native mora-timing occurs rapidly and deeply. The enhanced behavioral and neurophysiological sensitivity found after training indicates a strong relationship between learning and (plastic) changes in the cortical substrate.
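The duration continuum described here, one mora removed in eight 15 ms steps, can be written out explicitly; the 120 ms full-mora baseline is an assumption chosen so that the eight steps span exactly one mora:

    import numpy as np

    base_ms = 120.0                        # assumed duration of one mora (ms)
    shortening = 15 * np.arange(1, 9)      # cumulative shortening per step (ms)
    step_durations = base_ms - shortening  # remaining duration at each step
    print(step_durations)                  # 105, 90, ..., 15, 0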
Affiliation(s)
- Hans Menning
- Center for Biomagnetism, Institute of Experimental Audiology, Münster, Germany.
|
49
|
Palomäki KJ, Tiitinen H, Mäkinen V, May P, Alku P. Cortical processing of speech sounds and their analogues in a spatial auditory environment. BRAIN RESEARCH. COGNITIVE BRAIN RESEARCH 2002; 14:294-9. [PMID: 12067702 DOI: 10.1016/s0926-6410(02)00132-5] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/23/2022]
Abstract
We used magnetoencephalographic (MEG) measurements to study how speech sounds presented in a realistic spatial sound environment are processed in human cortex. A spatial sound environment was created by utilizing head-related transfer functions (HRTFs), and using a vowel, a pseudo-vowel, and a wide-band noise burst as stimuli. The behaviour of the most prominent auditory response, the cortically generated N1m, was investigated above the left and right hemisphere. We found that the N1m responses elicited by the vowel and by the pseudo-vowel were much larger in amplitude than those evoked by the noise burst. Corroborating previous observations, we also found that cortical activity reflecting the processing of spatial sound was more pronounced in the right than in the left hemisphere for all of the stimulus types and that both hemispheres exhibited contralateral tuning to sound direction.
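HRTF-based spatialization of the kind used here is typically implemented by convolving a monaural stimulus with left- and right-ear head-related impulse responses for the desired direction; the impulse responses below are placeholder arrays, where a real study would load measured ones:

    import numpy as np
    from scipy.signal import fftconvolve

    rng = np.random.default_rng(6)
    stimulus = rng.normal(size=22050)      # 0.5 s stimulus at 44.1 kHz, placeholder
    hrir_left = rng.normal(size=256)       # placeholder left-ear impulse response
    hrir_right = rng.normal(size=256)      # placeholder right-ear impulse response

    left = fftconvolve(stimulus, hrir_left)[:stimulus.size]
    right = fftconvolve(stimulus, hrir_right)[:stimulus.size]
    binaural = np.stack([left, right], axis=1)  # two-channel spatialized sound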
Affiliation(s)
- Kalle J Palomäki
- Speech and Hearing Research Group, Department of Computer Science, University of Sheffield, Sheffield, UK.
|
50
|
Ceponiene R, Shestakova A, Balan P, Alku P, Yiaguchi K, Näätänen R. Children's auditory event-related potentials index sound complexity and "speechness". Int J Neurosci 2001; 109:245-60. [PMID: 11699331 DOI: 10.3109/00207450108986536] [Citation(s) in RCA: 61] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]
Abstract
Children's long-latency auditory event-related potential (LLAEP) structure differs from that of adults, and the functional significance of childhood ERP components is largely unknown. In order to look for functional correlates in adults' and children's LLAEPs, stimulus-complexity effects were investigated in 8- to 10-year-old children. To this end, auditory ERPs to vowels, acoustically matched complex tones, and sinusoidal tones were recorded. All types of stimuli elicited the P100-N250-N450 ERP complex. Differences between the sinusoidal and complex tones were confined to the P100 and N250 peaks, with complex tones eliciting larger responses. Vowels elicited a smaller-amplitude N250 but a larger-amplitude N450 than the complex tones. Some stimulus-complexity effects observed for the N250 in children corresponded to those observed for the N1 in adults, whereas the N450 peak exhibited behaviour resembling that of the adult ERP components subsequent to the N1 wave.
Affiliation(s)
- R Ceponiene
- Cognitive Brain Research Unit, Dept. of Psychology, Meritullinkatu 1B, P.O. Box 13, University of Helsinki, 00014 Helsinki, Finland
|