451. Wasmann JWA, van Eijl RHM, Versnel H, van Zanten GA. Assessing auditory nerve condition by tone decay in deaf subjects with a cochlear implant. Int J Audiol 2018; 57:864-871. [PMID: 30261773] [DOI: 10.1080/14992027.2018.1498598]
Abstract
The condition of the auditory nerve is a factor determining hearing performance of cochlear implant (CI) recipients. Abnormal loudness adaptation is associated with poor auditory nerve survival. We examined which stimulus conditions are suitable for tone decay measurements to differentiate between CI recipients with respect to their speech perception. Tone decay was defined here as occurring when the percept disappears before the stimulus stops. We measured the duration of the percept of a 60-s pulse train. Current levels ranged from below threshold up to maximum acceptable loudness, pulse rates from 250 to 5000 pulses/s, and duty cycles (percentages of time the burst of pulses is on) from 10% to 100%. Ten adult CI recipients were included: seven with good and three with poor speech perception. The largest differences among the subjects were found at 5000 pulses/s and 100% duty cycle. The well-performing subjects had a continuous percept of the 60-s stimulus within 3 dB above threshold. Two poorly performing subjects showed abnormal loudness adaptation, that is, no continuous percept even at levels greater than 6 dB above threshold. We conclude that abnormal loudness adaptation can be detected via an electric tone decay test using a high pulse rate and 100% duty cycle.
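The stimulus parameterization above (pulse rate in pulses/s plus a duty cycle controlling what fraction of each burst period the pulse train is on) can be illustrated with a small Python sketch. The burst period, sampling rate, and unit-amplitude single-sample pulses are illustrative assumptions, not the authors' clinical stimulation parameters.

```python
import numpy as np

def gated_pulse_train(duration_s=60.0, rate_pps=5000, duty_cycle=1.0,
                      burst_period_s=0.1, fs=100_000):
    """Toy stimulus timeline: unit pulses at rate_pps pulses/s, gated on for
    duty_cycle of every burst_period_s window (duty_cycle=1.0 corresponds to
    the continuous, 100% condition)."""
    n = int(duration_s * fs)
    t = np.arange(n) / fs
    # place a single-sample pulse each time the pulse counter increments
    counter = np.floor(t * rate_pps)
    pulses = np.zeros(n)
    pulses[0] = 1.0
    pulses[1:] = (counter[1:] != counter[:-1]).astype(float)
    # duty-cycle gate: "on" during the first fraction of each burst period
    gate = (t % burst_period_s) / burst_period_s < duty_cycle
    return t, pulses * gate

# example: 1-s preview of a 250 pulses/s train at 10% duty cycle
t, stim = gated_pulse_train(duration_s=1.0, rate_pps=250, duty_cycle=0.1)
```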
452. Specht K, Wigglesworth P. The functional and structural asymmetries of the superior temporal sulcus. Scand J Psychol 2018; 59:74-82. [PMID: 29356006] [DOI: 10.1111/sjop.12410]
Abstract
The superior temporal sulcus (STS) is an anatomical structure that increasingly interests researchers. This structure appears to receive multisensory input and is involved in several perceptual and cognitive core functions, such as speech perception, audiovisual integration, (biological) motion processing and theory of mind capacities. In addition, the superior temporal sulcus is not only one of the longest sulci of the brain, but it also shows marked functional and structural asymmetries, some of which have only been found in humans. To explore the functional-structural relationships of these asymmetries in more detail, this study combines functional and structural magnetic resonance imaging. Using a speech perception task, an audiovisual integration task, and a theory of mind task, this study again demonstrated an involvement of the STS in these processes, with an expected strong leftward asymmetry for the speech perception task. Furthermore, this study confirmed the earlier-described, human-specific asymmetries, namely that the left STS is longer than the right STS and that the right STS is deeper than the left STS. However, this study did not find any relationship between these structural asymmetries and the detected brain activations or their functional asymmetries. This, however, lends further support to the notion that the structural asymmetry of the STS is not directly related to the functional asymmetry of speech perception or of the language system as a whole, and that it may have other causes and functions.
453. Kasisopa B, El-Khoury Antonios L, Jongman A, Sereno JA, Burnham D. Training Children to Perceive Non-native Lexical Tones: Tone Language Background, Bilingualism, and Auditory-Visual Information. Front Psychol 2018; 9:1508. [PMID: 30233446] [PMCID: PMC6131621] [DOI: 10.3389/fpsyg.2018.01508]
Abstract
This study investigates the role of language background and bilingual status in the perception of foreign lexical tones. Eight groups of children aged 6 and 8 years, drawn from four language background (tone or non-tone) × bilingual status (monolingual or bilingual) combinations (Thai monolingual, English monolingual, English-Thai bilingual, and English-Arabic bilingual), were trained to perceive the four Mandarin lexical tones. Half the children in each of these eight groups were given auditory-only (AO) training and half auditory-visual (AV) training. In each group, Mandarin tone identification was tested before and after (pre- and post-) training with both an auditory-only test (ao-test) and an auditory-visual test (av-test). The effect of training on Mandarin tone identification was minimal for 6-year-olds. On the other hand, 8-year-olds, particularly those with tone language experience, showed greater pre- to post-training improvement, and this was best indexed by ao-test trials. Bilingual vs. monolingual background did not facilitate overall improvement due to training, but it did modulate the efficacy of the training mode: for bilinguals both AO and AV training, and especially AO, resulted in performance gains, whereas for monolinguals training was most effective with AV stimuli. Again, this effect was best indexed by ao-test trials. These results suggest that tone language experience, be it monolingual or bilingual, is a strong predictor of learning unfamiliar tones; that monolinguals learn best from AV training trials and bilinguals from AO training trials; and that there is no metalinguistic advantage due to bilingualism in learning to perceive lexical tones.
454. Norris D, McQueen JM, Cutler A. Commentary on "Interaction in Spoken Word Recognition Models". Front Psychol 2018; 9:1568. [PMID: 30233453] [PMCID: PMC6129619] [DOI: 10.3389/fpsyg.2018.01568]
455. Dorsi J, Viswanathan N, Rosenblum LD, Dias JW. The role of speech fidelity in the irrelevant sound effect: Insights from noise-vocoded speech backgrounds. Q J Exp Psychol (Hove) 2018; 71:2152-2161. [PMID: 30226434] [DOI: 10.1177/1747021817739257]
Abstract
The Irrelevant Sound Effect (ISE) is the finding that background sound impairs accuracy for visually presented serial recall tasks. Among various auditory backgrounds, speech typically acts as the strongest distractor. Based on the changing-state hypothesis, speech is a disruptive background because it is more complex than other nonspeech backgrounds. In the current study, we evaluate an alternative explanation by examining whether the speech-likeness of the background (speech fidelity) contributes, beyond signal complexity, to the ISE. We did this by using noise-vocoded speech as a background. In Experiment 1, we varied the complexity of the background by manipulating the number of vocoding channels. Results indicate that the ISE increases with the number of channels, suggesting that more complex signals produce greater ISEs. In Experiment 2, we varied complexity and speech fidelity independently. At each channel level, we selectively reversed a subset of channels to design a low-fidelity signal that was equated in overall complexity. Experiment 2 results indicated that speech-like noise-vocoded speech produces a larger ISE than selectively reversed noise-vocoded speech. Finally, in Experiment 3, we evaluated the locus of the speech-fidelity effect by assessing the distraction produced by these stimuli in a missing-item task. In this task, even though noise-vocoded speech disrupted task performance relative to silence, neither its complexity nor speech fidelity contributed to this effect. Together, these findings indicate a clear role for speech fidelity of the background beyond its changing-state quality and its attention capture potential.
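For readers unfamiliar with the manipulation, the sketch below shows a minimal noise vocoder in Python: the signal is split into N frequency bands, each band's envelope is extracted and used to modulate band-limited noise, and the channels are summed. Reversing the envelopes of a subset of channels is one plausible reading of the "selectively reversed" low-fidelity condition; band edges, filter orders, and the reversal scheme are assumptions, not the authors' exact implementation.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocode(x, fs, n_channels=8, reversed_channels=()):
    """Minimal noise-vocoder sketch. `reversed_channels` lists channel indices
    whose envelopes are time-reversed, degrading speech fidelity while keeping
    overall signal complexity roughly constant."""
    edges = np.geomspace(80.0, min(8000.0, 0.45 * fs), n_channels + 1)
    rng = np.random.default_rng(0)
    out = np.zeros(len(x))
    for ch, (lo, hi) in enumerate(zip(edges[:-1], edges[1:])):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfiltfilt(sos, x)
        env = np.abs(hilbert(band))                 # amplitude envelope
        if ch in reversed_channels:
            env = env[::-1]                         # low-fidelity manipulation
        carrier = sosfiltfilt(sos, rng.standard_normal(len(x)))
        out += env * carrier
    return out / (np.max(np.abs(out)) + 1e-12)
```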
456. Ishida M, Arai T, Kashino M. Perceptual Restoration of Temporally Distorted Speech in L1 vs. L2: Local Time Reversal and Modulation Filtering. Front Psychol 2018; 9:1749. [PMID: 30283390] [PMCID: PMC6156149] [DOI: 10.3389/fpsyg.2018.01749]
Abstract
Speech is intelligible even when the temporal envelope of speech is distorted. The current study investigates how native and non-native speakers perceptually restore temporally distorted speech. Participants were native English speakers (NS) and native Japanese speakers who spoke English as a second language (NNS). In Experiment 1, participants listened to “locally time-reversed speech” in which every x ms of the speech signal was reversed on the temporal axis. Here, the local time reversal shifted the constituents of the speech signal forward or backward from the original position, and the amplitude envelope of speech was altered as a function of reversed segment length. In Experiment 2, participants listened to “modulation-filtered speech” in which the modulation frequency components of speech were low-pass filtered at a particular cut-off frequency. Here, the temporal envelope of speech was altered as a function of cut-off frequency. The results suggest that speech becomes gradually unintelligible as the length of reversed segments increases (Experiment 1), and as a lower cut-off frequency is imposed (Experiment 2). The two experiments yielded equivalent levels of speech intelligibility across the six levels of degradation for native and for non-native speakers, which raises the question of whether the regular occurrence of local time reversal can be discussed in the modulation frequency domain, by simply converting the length of reversed segments (ms) into frequency (Hz).
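A minimal Python sketch of local time reversal, together with one candidate conversion from reversed-segment length to a modulation frequency, may help make the closing question concrete. The specific mapping shown (treating one reversed segment as half a modulation period) is an illustrative assumption; the abstract leaves the exact conversion open.

```python
import numpy as np

def locally_time_reverse(x, fs, segment_ms=50.0):
    """Reverse every consecutive segment_ms window of the waveform
    (locally time-reversed speech)."""
    seg = max(1, int(round(fs * segment_ms / 1000.0)))
    y = np.array(x, dtype=float)
    for start in range(0, len(y), seg):
        y[start:start + seg] = y[start:start + seg][::-1]
    return y

def candidate_modulation_cutoff_hz(segment_ms):
    """One possible segment-length-to-frequency conversion: treating a reversed
    segment of T ms as half a modulation period gives 1000 / (2 * T) Hz."""
    return 1000.0 / (2.0 * segment_ms)
```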
457. Krasotkina A, Götz A, Höhle B, Schwarzer G. Perceptual Narrowing in Speech and Face Recognition: Evidence for Intra-individual Cross-Domain Relations. Front Psychol 2018; 9:1711. [PMID: 30258388] [PMCID: PMC6144632] [DOI: 10.3389/fpsyg.2018.01711]
Abstract
During the first year of life, infants undergo perceptual narrowing in the domains of speech and face perception. This is typically characterized by improvements in infants’ abilities in discriminating among stimuli of familiar types, such as native speech tones and same-race faces. Simultaneously, infants begin to decline in their ability to discriminate among stimuli of types with which they have little experience, such as non-native tones and other-race faces. The similarity in time-frames during which perceptual narrowing seems to occur in the domains of speech and face perception has led some researchers to hypothesize that the perceptual narrowing in these domains could be driven by shared domain-general processes. To explore this hypothesis, we tested 53 Caucasian 9-month-old infants from monolingual German households on their ability to discriminate among non-native Cantonese speech tones, as well as among same-race German faces and other-race Chinese faces. We tested the infants using an infant-controlled habituation-dishabituation paradigm, with infants’ preferences for looking at novel stimuli versus the habituated stimuli (dishabituation scores) acting as indicators of discrimination ability. As expected for their age, infants were able to discriminate between same-race faces, but not between other-race faces or non-native speech tones. Most interestingly, we found that infants’ dishabituation scores for the non-native speech tones and other-race faces showed significant positive correlations, while the dishabituation scores for non-native speech tones and same-race faces did not. These results therefore support the hypothesis that shared domain-general mechanisms may drive perceptual narrowing in the domains of speech and face perception.
458.
Abstract
We describe the performance of an aphasic individual, K.A., who showed a selective impairment affecting his ability to perceive spoken language, while largely sparing his ability to perceive written language and to produce spoken language. His spoken-language perception impairment left him unable to distinguish words or nonwords that differed by a single phoneme, and he was no better than chance at auditory lexical decision or at matching single spoken words to single pictures with phonological foils. Strikingly, despite this profound impairment, K.A. showed a selective sparing in his ability to perceive number words, which he was able to repeat and comprehend largely without error. This case adds to a growing literature demonstrating modality-specific dissociations between number word and non-number word processing. Because of the locus of K.A.'s speech perception deficit for non-number words, we argue that this distinction between number word and non-number word processing arises at a sublexical level of representation in speech perception, paralleling what has previously been argued for the organization of the sublexical level of representation in speech production.
459. Ward RM, Kelty-Stephen DG. Bringing the Nonlinearity of the Movement System to Gestural Theories of Language Use: Multifractal Structure of Spoken English Supports the Compensation for Coarticulation in Human Speech Perception. Front Physiol 2018; 9:1152. [PMID: 30233386] [PMCID: PMC6129613] [DOI: 10.3389/fphys.2018.01152]
Abstract
Coarticulation is the tendency for speech vocalization and articulation, even at the phonemic level, to change with context, and compensation for coarticulation (CfC) reflects the striking human ability to perceive phonemic stability despite this variability. A current controversy centers on whether CfC depends on contrast between formants of a speech-signal spectrogram (specifically, contrast between offset formants concluding the context stimuli and onset formants opening the target sound) or on speech-sound variability specific to the coordinative movement of speech articulators (e.g., vocal folds, postural muscles, lips, tongues). This manuscript aims to encode that coordinative-movement context in terms of speech-signal multifractal structure and to determine whether speech's multifractal structure might explain the crucial gestural support for any proposed spectral contrast. We asked human participants to categorize individual target stimuli drawn from an 11-step [ga]-to-[da] continuum as either "GA" or "DA." Three groups each heard a specific type of context stimulus preceding the target stimuli: real-speech [al] or [aɹ], sine-wave tones at the third-formant offset frequency of either [al] or [aɹ], or simulated-speech contexts [al] or [aɹ]. Here, simulating speech contexts involved randomizing the sequence of relatively homogeneous pitch periods within the vowel sound [a] of each [al] and [aɹ]. Crucially, simulated-speech contexts had the same offsets and extremely similar vowel formants as the real-speech contexts and, to additional naïve participants, sounded identical to them. However, randomization distorted the original speech-context multifractality, and effects of spectral contrast following speech only appeared after regression modeling of trial-by-trial "GA" judgments controlled for context-stimulus multifractality. Furthermore, simulated-speech contexts elicited faster responses (as tone contexts do) and weakened known biases in CfC, suggesting that spectral contrast depends on the nonlinear interactions across multiple scales that articulatory gestures express through the speech signal. Traditional mouse-tracking behaviors, measured as participants moved their computer-mouse cursor to register their "GA"-or-"DA" decisions with mouse clicks, suggest that listening to speech leads the movement system to resonate with the multifractality of context stimuli. We interpret these results as pointing to a new multifractal terrain on which to build a better understanding of how movement systems shape the way speech perception makes use of acoustic information.
460. Campbell JA, McSherry HL, Theodore RM. Contextual Influences on Phonetic Categorization in School-Aged Children. Front Commun 2018; 3:35. [PMID: 31763339] [PMCID: PMC6874108] [DOI: 10.3389/fcomm.2018.00035]
Abstract
Perceptual stability in adult listeners is supported by the ability to process acoustic-phonetic variation categorically and dynamically adjust category boundaries given systematic contextual influences. The current study examined the developmental trajectory of such flexibility. Adults and school-aged children (5-10 years of age) made voicing identification decisions to voice-onset-time (VOT) continua that differed in speaking rate and place of articulation. The results showed that both populations were sensitive to contextual influences; the voicing boundary was located at a longer VOT for the slow compared to the fast speaking rate continuum and for the velar compared to the labial continuum, and the magnitude of the displacement was slightly greater for the adults compared to the children. Moreover, the two populations differed in terms of the absolute location of the voicing boundaries and the categorization slopes, with slopes becoming more categorical as age increased. These results demonstrate that sensitivity to contextual influences on speech perception emerges early in development, but mature perceptual tuning requires extended experience.
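To make "voicing boundary" and "categorization slope" concrete, here is a sketch that fits a logistic identification function to hypothetical VOT-continuum responses; the data points and starting values are illustrative only, not the study's data.

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(vot, boundary, slope):
    """Probability of a 'voiceless' response as a function of VOT (ms);
    boundary is the 50% crossover, slope indexes how categorical it is."""
    return 1.0 / (1.0 + np.exp(-slope * (vot - boundary)))

# hypothetical identification data for a labial (b/p) continuum
vot_ms = np.array([0, 5, 10, 15, 20, 25, 30, 35, 40], dtype=float)
p_voiceless = np.array([0.02, 0.03, 0.08, 0.20, 0.55, 0.85, 0.95, 0.98, 0.99])

(boundary, slope), _ = curve_fit(logistic, vot_ms, p_voiceless, p0=(20.0, 0.5))
print(f"voicing boundary ≈ {boundary:.1f} ms VOT, slope ≈ {slope:.2f}")
```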
461. Feng L, Oxenham AJ. Spectral contrast effects produced by competing speech contexts. J Exp Psychol Hum Percept Perform 2018; 44:1447-1457. [PMID: 29847973] [PMCID: PMC6110988] [DOI: 10.1037/xhp0000546]
Abstract
The long-term spectrum of a preceding sentence can alter the perception of a following speech sound in a contrastive manner. This speech context effect contributes to our ability to extract reliable spectral characteristics of the surrounding acoustic environment and to compensate for the voice characteristics of different speakers or spectral colorations in different listening environments to maintain perceptual constancy. The extent to which such effects are mediated by low-level "automatic" processes, or require directed attention, remains unknown. This study investigated spectral context effects by measuring the effects of two competing sentences on the phoneme category boundary between /i/ and /ε/ in a following target word, while directing listeners' attention to one or the other context sentence. Spatial separation of the context sentences was achieved either by presenting them to different ears, or by presenting them to both ears but imposing an interaural time difference (ITD) between the ears. The results confirmed large context effects based on ear of presentation. Smaller effects were observed based on either ITD or attention. The results, combined with predictions from a two-stage model, suggest that ear-specific factors dominate speech context effects but that the effects can be modulated by higher-level features, such as perceived location, and by attention.
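As a side note on the spatial manipulation, an interaural time difference can be imposed on a diotic signal by delaying one channel by a fraction of a millisecond, as in the Python sketch below; the 500-microsecond default is an illustrative value rather than the parameter used in the study.

```python
import numpy as np

def apply_itd(mono, fs, itd_us=500.0, lead="left"):
    """Return a stereo signal in which one ear leads the other by itd_us
    microseconds, shifting the perceived location without changing which
    ear receives the signal."""
    delay = int(round(fs * itd_us * 1e-6))
    early = np.concatenate([np.asarray(mono, dtype=float), np.zeros(delay)])
    late = np.concatenate([np.zeros(delay), np.asarray(mono, dtype=float)])
    left, right = (early, late) if lead == "left" else (late, early)
    return np.stack([left, right], axis=-1)
```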
462. van de Ven M, Ernestus M. The role of segmental and durational cues in the processing of reduced words. Lang Speech 2018; 61:358-383. [PMID: 28870139] [PMCID: PMC6099978] [DOI: 10.1177/0023830917727774]
Abstract
In natural conversations, words are generally shorter and they often lack segments. It is unclear to what extent such durational and segmental reductions affect word recognition. The present study investigates to what extent reduction in the initial syllable hinders word comprehension, which types of segments listeners mostly rely on, and whether listeners use word duration as a cue in word recognition. We conducted three experiments in Dutch, in which we adapted the gating paradigm to study the comprehension of spontaneously uttered conversational speech by aligning the gates with the edges of consonant clusters or vowels. Participants heard the context and some segmental and/or durational information from reduced target words with unstressed initial syllables. The initial syllable varied in its degree of reduction, and in half of the stimuli the vowel was not clearly present. Participants gave answers that were too short when they were provided with only durational information from the target words, which shows that listeners are unaware of the reductions that can occur in spontaneous speech. More importantly, listeners required fewer segments to recognize target words if the vowel in the initial syllable was absent. This result strongly suggests that this vowel hardly plays a role in word comprehension, and that its presence may even delay this process. More important are the consonants and the stressed vowel.
463.
Abstract
Linguistic experience affects speech perception from early infancy, as previously evidenced by behavioral and brain measures. Current research focuses on whether linguistic effects on speech perception can be observed at an earlier stage in the neural processing of speech (i.e., auditory brainstem). Brainstem responses reflect rapid, automatic, and preattentive encoding of sounds. Positive experiential effects have been reported by examining the frequency-following response (FFR) component of the complex auditory brainstem response (cABR) in response to sustained high-energy periodic portions of speech sounds (vowels and lexical tones). The current study expands the existing literature by examining the cABR onset component in response to transient and low-energy portions of speech (consonants), employing simultaneous magnetoencephalography (MEG) in addition to electroencephalography (EEG), which provide complementary source information on cABR. Utilizing a cross-cultural design, we behaviorally measured perceptual responses to consonants in native Spanish- and English-speaking adults, in addition to cABR. Brain and behavioral relations were examined. Results replicated previous behavioral differences between language groups and further showed that individual consonant perception is strongly associated with EEG-cABR onset peak latency. MEG-cABR source analysis of the onset peaks complemented the EEG-cABR results by demonstrating subcortical sources for both peaks, with no group differences in peak locations. Current results demonstrate a brainstem-perception relation and show that the effects of linguistic experience on speech perception can be observed at the brainstem level.
464. Holmes E, Domingo Y, Johnsrude IS. Familiar Voices Are More Intelligible, Even if They Are Not Recognized as Familiar. Psychol Sci 2018; 29:1575-1583. [PMID: 30096018] [DOI: 10.1177/0956797618779083]
Abstract
We can recognize familiar people by their voices, and familiar talkers are more intelligible than unfamiliar talkers when competing talkers are present. However, whether the acoustic voice characteristics that permit recognition and those that benefit intelligibility are the same or different is unknown. Here, we recruited pairs of participants who had known each other for 6 months or longer and manipulated the acoustic correlates of two voice characteristics (vocal tract length and glottal pulse rate). These had different effects on explicit recognition of and the speech-intelligibility benefit realized from familiar voices. Furthermore, even when explicit recognition of familiar voices was eliminated, they were still more intelligible than unfamiliar voices, demonstrating that familiar voices do not need to be explicitly recognized to benefit intelligibility. Processing familiar-voice information appears therefore to depend on multiple, at least partially independent, systems that are recruited depending on the perceptual goal of the listener.
465. McMurray B, Danelz A, Rigler H, Seedorff M. Speech categorization develops slowly through adolescence. Dev Psychol 2018; 54:1472-1491. [PMID: 29952600] [PMCID: PMC6062449] [DOI: 10.1037/dev0000542]
Abstract
The development of the ability to categorize speech sounds is often viewed as occurring primarily during infancy via perceptual learning mechanisms. However, a number of studies suggest that even after infancy, children's categories become more categorical and well defined through about age 12. We investigated the cognitive changes that may be responsible for such development using a visual world paradigm experiment based on McMurray, Tanenhaus, and Aslin (2002). Children from 3 age groups (7-8, 12-13, and 17-18 years) heard a token from either a b/p or an s/∫ continuum spanning two words (beach/peach, ship/sip) and selected its referent from a screen containing 4 pictures of potential lexical candidates. Eye movements to each object were monitored as a measure of how strongly children were committing to each candidate as perception unfolded in real time. Results showed an ongoing sharpening of speech categories through age 18, which was particularly apparent during the early stages of real-time perception. When the analysis specifically targeted within-category sensitivity to continuous detail, children exhibited increasingly gradient categories over development, suggesting that increasing sensitivity to fine-grained detail in the signal enables these more discrete categorizations. Together, these results suggest that speech development is a protracted process in which children's increasing sensitivity to within-category detail in the signal enables increasingly sharp phonetic categories.
466. Liang B, Du Y. The Functional Neuroanatomy of Lexical Tone Perception: An Activation Likelihood Estimation Meta-Analysis. Front Neurosci 2018; 12:495. [PMID: 30087589] [PMCID: PMC6066585] [DOI: 10.3389/fnins.2018.00495]
Abstract
In tonal languages such as Chinese, lexical tone serves as a phonemic feature in determining word meaning. Meanwhile, it is close to prosody in terms of suprasegmental pitch variations and larynx-based articulation. The important yet mixed nature of lexical tone has evoked considerable studies, but no consensus has been reached on its functional neuroanatomy. This meta-analysis aimed at uncovering the neural network of lexical tone perception in comparison with that of phoneme and prosody in a unified framework. Independent Activation Likelihood Estimation meta-analyses were conducted for different linguistic elements: lexical tone by native tonal language speakers, lexical tone by non-tonal language speakers, phoneme, word-level prosody, and sentence-level prosody. Results showed that lexical tone and prosody studies demonstrated more extensive activations in the right than the left auditory cortex, whereas the opposite pattern was found for phoneme studies. Only tonal language speakers consistently recruited the left anterior superior temporal gyrus (STG) for processing lexical tone, an area implicated in phoneme processing and word-form recognition. Moreover, an anterior-lateral to posterior-medial gradient of activation as a function of element timescale was revealed in the right STG, in which the activation for lexical tone lay between that for phoneme and that for prosody. Another topological pattern was shown on the left precentral gyrus (preCG), with the activation for lexical tone overlapping with that for prosody but ventral to that for phoneme. These findings provide evidence that the neural network for lexical tone perception is hybrid with those for phoneme and prosody. That is, resembling prosody, lexical tone perception, regardless of language experience, involved the right auditory cortex, with activation localized between sites engaged by phonemic and prosodic processing, suggesting a hierarchical organization of representations in the right auditory cortex. For tonal language speakers, lexical tone additionally engaged the left STG lexical mapping network, consistent with the phonemic representation. Similarly, when processing lexical tone, only tonal language speakers engaged the left preCG site implicated in prosody perception, consistent with tonal language speakers having stronger articulatory representations for lexical tone in the laryngeal sensorimotor network. A dynamic dual-stream model for lexical tone perception was proposed and discussed.
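For readers unfamiliar with Activation Likelihood Estimation, the toy Python sketch below captures its core computation: each reported focus is modeled as a Gaussian probability blob, per-study modeled-activation maps are formed, and studies are combined as a probabilistic union. Grid size, FWHM, and the omission of sample-size weighting and permutation-based thresholding are simplifications for illustration, not the pipeline used in the paper.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def toy_ale_map(studies_foci, grid_shape=(40, 48, 40), fwhm_vox=3.0):
    """studies_foci: one list of (x, y, z) voxel coordinates per study.
    Returns a toy ALE map over the voxel grid."""
    sigma = fwhm_vox / 2.3548                      # FWHM -> Gaussian sigma
    ma_maps = []
    for foci in studies_foci:
        per_focus = []
        for x, y, z in foci:
            blob = np.zeros(grid_shape)
            blob[x, y, z] = 1.0
            blob = gaussian_filter(blob, sigma)
            per_focus.append(blob / blob.max())    # probability peaks at 1
        # modeled-activation map for one study: voxel-wise max over its foci
        ma_maps.append(np.max(per_focus, axis=0))
    # ALE value: probabilistic union across studies, 1 - prod(1 - MA_i)
    return 1.0 - np.prod(1.0 - np.stack(ma_maps), axis=0)

ale = toy_ale_map([[(10, 20, 15)], [(11, 21, 15), (30, 12, 22)]])
```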
467. Wenrich KA, Davidson LS, Uchanski RM. Segmental and Suprasegmental Perception in Children Using Hearing Aids. J Am Acad Audiol 2018; 28:901-912. [PMID: 29130438] [PMCID: PMC5726292] [DOI: 10.3766/jaaa.16105]
Abstract
BACKGROUND Suprasegmental perception (perception of stress, intonation, "how something is said" and "who says it") and segmental speech perception (perception of individual phonemes or perception of "what is said") are perceptual abilities that provide the foundation for the development of spoken language and effective communication. While there are numerous studies examining segmental perception in children with hearing aids (HAs), there are far fewer studies examining suprasegmental perception, especially for children with greater degrees of residual hearing. Examining the relation between acoustic hearing thresholds, and both segmental and suprasegmental perception for children with HAs, may ultimately enable better device recommendations (bilateral HAs, bimodal devices [one CI and one HA in opposite ears], bilateral CIs) for a particular degree of residual hearing. Examining both types of speech perception is important because segmental and suprasegmental cues are affected differentially by the type of hearing device(s) used (i.e., cochlear implant [CI] and/or HA). Additionally, suprathreshold measures, such as frequency resolution ability, may partially predict benefit from amplification and may assist audiologists in making hearing device recommendations. PURPOSE The purpose of this study is to explore the relationship between audibility (via hearing thresholds and speech intelligibility indices), and segmental and suprasegmental speech perception for children with HAs. A secondary goal is to explore the relationships among frequency resolution ability (via spectral modulation detection [SMD] measures), segmental and suprasegmental speech perception, and receptive language in these same children. RESEARCH DESIGN A prospective cross-sectional design. STUDY SAMPLE Twenty-three children, ages 4 yr 11 mo to 11 yr 11 mo, participated in the study. Participants were recruited from pediatric clinic populations, oral schools for the deaf, and mainstream schools. DATA COLLECTION AND ANALYSIS Audiological history and hearing device information were collected from participants and their families. Segmental and suprasegmental speech perception, SMD, and receptive vocabulary skills were assessed. Correlations were calculated to examine the significance (p < 0.05) of relations between audibility and outcome measures. RESULTS Measures of audibility and segmental speech perception are not significantly correlated, while low-frequency pure-tone average (unaided) is significantly correlated with suprasegmental speech perception. SMD is significantly correlated with all measures (measures of audibility, segmental and suprasegmental perception and vocabulary). Lastly, although age is not significantly correlated with measures of audibility, it is significantly correlated with all other outcome measures. CONCLUSIONS The absence of a significant correlation between audibility and segmental speech perception might be attributed to overall audibility being maximized through well-fit HAs. The significant correlation between low-frequency unaided audibility and suprasegmental measures is likely due to the strong, predominantly low-frequency nature of suprasegmental acoustic properties. Frequency resolution ability, via SMD performance, is significantly correlated with all outcomes and requires further investigation; its significant correlation with vocabulary suggests that linguistic ability may be partially related to frequency resolution ability. Last, all of the outcome measures are significantly correlated with age, suggestive of developmental effects.
468.
469. Kronenberger WG, Henning SC, Ditmars AM, Roman AS, Pisoni DB. Verbal learning and memory in prelingually deaf children with cochlear implants. Int J Audiol 2018; 57:746-754. [PMID: 29933710] [DOI: 10.1080/14992027.2018.1481538]
Abstract
OBJECTIVE Deaf children with cochlear implants (CIs) show poorer verbal working memory compared to normal-hearing (NH) peers, but little is known about their verbal learning and memory (VLM) processes involving multi-trial free recall. DESIGN Children with CIs were compared to NH peers using the California Verbal Learning Test for Children (CVLT-C). STUDY SAMPLE Participants were 21 children (6-16 years old) who were deaf before age 6 months and implanted prior to age 3 years, and 21 age- and IQ-matched NH peers. RESULTS Results revealed no differences between groups in the number of words recalled. However, CI users showed a pattern of increasing use of serial clustering strategies across learning trials, whereas NH peers decreased their use of serial clustering strategies. In the CI sample (but not in the NH sample), verbal working memory test scores were related to resistance to the build-up of proactive interference, and sentence recognition was associated with performance on the first exposure to the word list and to the use of recency recall strategies. CONCLUSIONS Children with CIs showed robust evidence of VLM comparable to NH peers. However, their VLM processing (especially recency and proactive interference) was related to speech perception outcomes and verbal WM in different ways from NH peers.
470. Neural Prediction Errors Distinguish Perception and Misperception of Speech. J Neurosci 2018; 38:6076-6089. [DOI: 10.1523/jneurosci.3258-17.2018]
Abstract
Humans use prior expectations to improve perception, especially of sensory signals that are degraded or ambiguous. However, if sensory input deviates from prior expectations, then correct perception depends on adjusting or rejecting prior expectations. Failure to adjust or reject the prior leads to perceptual illusions, especially if there is partial overlap (and thus partial mismatch) between expectations and input. With speech, "slips of the ear" occur when expectations lead to misperception. For instance, an entomologist might be more susceptible to hearing "The ants are my friends" for "The answer, my friend" (in the Bob Dylan song Blowin' in the Wind). Here, we contrast two mechanisms by which prior expectations may lead to misperception of degraded speech. First, clear representations of the common sounds in the prior and input (i.e., expected sounds) may lead to incorrect confirmation of the prior. Second, insufficient representations of sounds that deviate between prior and input (i.e., prediction errors) could lead to deception. We used crossmodal predictions from written words that partially match degraded speech to compare neural responses when male and female human listeners were deceived into accepting the prior or correctly rejected it. Combined behavioral and multivariate representational similarity analysis of fMRI data shows that veridical perception of degraded speech is signaled by representations of prediction error in the left superior temporal sulcus. Instead of using top-down processes to support perception of expected sensory input, our findings suggest that the strength of neural prediction error representations distinguishes correct perception and misperception. SIGNIFICANCE STATEMENT: Misperceiving spoken words is an everyday experience, with outcomes that range from shared amusement to serious miscommunication. For hearing-impaired individuals, frequent misperception can lead to social withdrawal and isolation, with severe consequences for wellbeing. In this work, we specify the neural mechanisms by which prior expectations, which are so often helpful for perception, can lead to misperception of degraded sensory signals. Most descriptive theories of illusory perception explain misperception as arising from a clear sensory representation of features or sounds that are in common between prior expectations and sensory input. Our work instead provides support for a complementary proposal: that misperception occurs when there is an insufficient sensory representation of the deviation between expectations and sensory signals.
471. Rizza A, Terekhov AV, Montone G, Olivetti-Belardinelli M, O'Regan JK. Why Early Tactile Speech Aids May Have Failed: No Perceptual Integration of Tactile and Auditory Signals. Front Psychol 2018; 9:767. [PMID: 29875719] [PMCID: PMC5974558] [DOI: 10.3389/fpsyg.2018.00767]
Abstract
Tactile speech aids, though extensively studied in the 1980s and 1990s, never became a commercial success. A hypothesis to explain this failure might be that it is difficult to obtain true perceptual integration of a tactile signal with information from auditory speech: exploitation of tactile cues from a tactile aid might require cognitive effort and so prevent speech understanding at the high rates typical of everyday speech. To test this hypothesis, we attempted to create true perceptual integration of tactile with auditory information in what might be considered the simplest situation encountered by a hearing-impaired listener. We created an auditory continuum between the syllables /BA/ and /VA/, and trained participants to associate /BA/ to one tactile stimulus and /VA/ to another tactile stimulus. After training, we tested if auditory discrimination along the continuum between the two syllables could be biased by incongruent tactile stimulation. We found that such a bias occurred only when the tactile stimulus was above, but not when it was below, its previously measured tactile discrimination threshold. Such a pattern is compatible with the idea that the effect is due to a cognitive or decisional strategy, rather than to truly perceptual integration. We therefore ran a further study (Experiment 2), where we created a tactile version of the McGurk effect. We extensively trained two subjects over 6 days to associate four recorded auditory syllables with four corresponding apparent motion tactile patterns. In a subsequent test, we presented stimulation that was either congruent or incongruent with the learnt association, and asked subjects to report the syllable they perceived. We found no analog to the McGurk effect, suggesting that the tactile stimulation was not being perceptually integrated with the auditory syllable. These findings strengthen our hypothesis that tactile aids failed because integration of tactile cues with auditory speech occurs at a cognitive or decisional level, rather than at a truly perceptual level.
472. Marian V, Lam TQ, Hayakawa S, Dhar S. Top-Down Cognitive and Linguistic Influences on the Suppression of Spontaneous Otoacoustic Emissions. Front Neurosci 2018; 12:378. [PMID: 29937708] [PMCID: PMC6002685] [DOI: 10.3389/fnins.2018.00378]
Abstract
Auditory sensation is often thought of as a bottom-up process, yet the brain exerts top-down control to affect how and what we hear. We report the discovery that the magnitude of top-down influence varies across individuals as a result of differences in linguistic background and executive function. Participants were 32 normal-hearing individuals (23 female) varying in language background (11 English monolinguals, 10 Korean-English late bilinguals, and 11 Korean-English early bilinguals), as well as cognitive abilities (working memory, cognitive control). To assess efferent control over inner ear function, participants were presented with speech-sounds (e.g., /ba/, /pa/) in one ear while spontaneous otoacoustic emissions (SOAEs) were measured in the contralateral ear. SOAEs are associated with the amplification of sound in the cochlea, and can be used as an index of top-down efferent activity. Individuals with bilingual experience and those with better cognitive control experienced larger reductions in the amplitude of SOAEs in response to speech stimuli, likely as a result of greater efferent suppression of amplification in the cochlea. This suppression may aid in the critical task of speech perception by minimizing the disruptive effects of noise. In contrast, individuals with better working memory exert less control over the cochlea, possibly due to a greater capacity to process complex stimuli at later stages. These findings demonstrate that even peripheral mechanics of auditory perception are shaped by top-down cognitive and linguistic influences.
473. Moses DA, Leonard MK, Chang EF. Real-time classification of auditory sentences using evoked cortical activity in humans. J Neural Eng 2018; 15:036005. [PMID: 29378977] [PMCID: PMC10560396] [DOI: 10.1088/1741-2552/aaab6f]
Abstract
OBJECTIVE Recent research has characterized the anatomical and functional basis of speech perception in the human auditory cortex. These advances have made it possible to decode speech information from activity in brain regions like the superior temporal gyrus, but no published work has demonstrated this ability in real-time, which is necessary for neuroprosthetic brain-computer interfaces. APPROACH Here, we introduce a real-time neural speech recognition (rtNSR) software package, which was used to classify spoken input from high-resolution electrocorticography signals in real-time. We tested the system with two human subjects implanted with electrode arrays over the lateral brain surface. Subjects listened to multiple repetitions of ten sentences, and rtNSR classified what was heard in real-time from neural activity patterns using direct sentence-level and HMM-based phoneme-level classification schemes. MAIN RESULTS We observed single-trial sentence classification accuracies of [Formula: see text] or higher for each subject with less than 7 minutes of training data, demonstrating the ability of rtNSR to use cortical recordings to perform accurate real-time speech decoding in a limited vocabulary setting. SIGNIFICANCE Further development and testing of the package with different speech paradigms could influence the design of future speech neuroprosthetic applications.
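The published rtNSR package is not reproduced here, but the direct sentence-level scheme it describes can be illustrated with a toy template-correlation classifier over trial-averaged high-gamma features; all shapes, names, and the simulated data below are assumptions for illustration only.

```python
import numpy as np

def fit_sentence_templates(trials, labels):
    """trials: array of shape (n_trials, n_electrodes, n_timepoints) of
    high-gamma features; labels: sentence identity per trial.
    Returns one trial-averaged template per sentence."""
    labels = np.asarray(labels)
    return {lab: trials[labels == lab].mean(axis=0) for lab in np.unique(labels)}

def classify_trial(trial, templates):
    """Assign a new trial to the sentence whose template it correlates with
    best (a toy stand-in for sentence-level classification)."""
    def corr(a, b):
        return np.corrcoef(a.ravel(), b.ravel())[0, 1]
    return max(templates, key=lambda lab: corr(trial, templates[lab]))

# usage with simulated data: 10 sentences x 5 repetitions, 128 electrodes, 150 frames
rng = np.random.default_rng(0)
trials = rng.standard_normal((50, 128, 150))
labels = np.repeat(np.arange(10), 5)
templates = fit_sentence_templates(trials, labels)
predicted = classify_trial(trials[0], templates)
```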
474. Dietrich S, Hertrich I, Müller-Dahlhaus F, Ackermann H, Belardinelli P, Desideri D, Seibold VC, Ziemann U. Reduced Performance During a Sentence Repetition Task by Continuous Theta-Burst Magnetic Stimulation of the Pre-supplementary Motor Area. Front Neurosci 2018; 12:361. [PMID: 29896086] [PMCID: PMC5987029] [DOI: 10.3389/fnins.2018.00361]
Abstract
The pre-supplementary motor area (pre-SMA) is engaged in speech comprehension under difficult circumstances such as poor acoustic signal quality or time-critical conditions. Previous studies found that the left pre-SMA is activated when subjects listen to accelerated speech. Here, the functional role of pre-SMA was tested for accelerated speech comprehension by inducing a transient “virtual lesion” using continuous theta-burst stimulation (cTBS). Participants were tested (1) prior to (pre-baseline), (2) 10 min after (test condition for the cTBS effect), and (3) 60 min after stimulation (post-baseline) using a sentence repetition task (formant-synthesized at rates of 8, 10, 12, 14, and 16 syllables/s). Speech comprehension was quantified by the percentage of correctly reproduced speech material. For high speech rates, subjects showed decreased performance after cTBS of pre-SMA. Regarding the error pattern, the number of incorrect words without any semantic or phonological similarity to the target context increased, while the number of related words decreased. Thus, the transient impairment of pre-SMA seems to affect its inhibitory function that normally eliminates erroneous speech material prior to speaking or, in the case of perception, prior to encoding into a semantically/pragmatically meaningful message.
475. Keetels M, Bonte M, Vroomen J. A Selective Deficit in Phonetic Recalibration by Text in Developmental Dyslexia. Front Psychol 2018; 9:710. [PMID: 29867675] [PMCID: PMC5962785] [DOI: 10.3389/fpsyg.2018.00710]
Abstract
Upon hearing an ambiguous speech sound, listeners may adjust their perceptual interpretation of the speech input in accordance with contextual information, like accompanying text or lipread speech (i.e., phonetic recalibration; Bertelson et al., 2003). As developmental dyslexia (DD) has been associated with reduced integration of text and speech sounds, we investigated whether this deficit becomes manifest when text is used to induce this type of audiovisual learning. Adults with DD and normal readers were exposed to ambiguous consonants halfway between /aba/ and /ada/ together with text or lipread speech. After this audiovisual exposure phase, they categorized auditory-only ambiguous test sounds. Results showed that individuals with DD, unlike normal readers, did not use text to recalibrate their phoneme categories, whereas their recalibration by lipread speech was spared. Individuals with DD demonstrated similar deficits when ambiguous vowels (halfway between /wIt/ and /wet/) were recalibrated by text. These findings indicate that DD is related to a specific letter-speech sound association deficit that extends over phoneme classes (vowels and consonants), but – as lipreading was spared – does not extend to a more general audio–visual integration deficit. In particular, these results highlight diminished reading-related audiovisual learning in addition to the commonly reported phonological problems in developmental dyslexia.