1
Calce RP, Rekow D, Barbero FM, Kiseleva A, Talwar S, Leleu A, Collignon O. Voice categorization in the four-month-old human brain. Curr Biol 2024; 34:46-55.e4. PMID: 38096819. DOI: 10.1016/j.cub.2023.11.042.
Abstract
Voices are the most relevant social sounds for humans and therefore have crucial adaptive value in development. Neuroimaging studies in adults have demonstrated the existence of regions in the superior temporal sulcus that respond preferentially to voices. Yet, whether voices represent a functionally specific category in the young infant's mind is largely unknown. We developed a highly sensitive paradigm relying on fast periodic auditory stimulation (FPAS) combined with scalp electroencephalography (EEG) to demonstrate that the infant brain implements a reliable preferential response to voices early in life. Twenty-three 4-month-old infants listened to sequences containing non-vocal sounds from different categories presented at 3.33 Hz, with highly heterogeneous vocal sounds appearing every third stimulus (1.11 Hz). We were able to isolate a voice-selective response over temporal regions, and individual voice-selective responses were found in most infants within only a few minutes of stimulation. This selective response was significantly reduced for the same frequency-scrambled sounds, indicating that voice selectivity is not simply driven by the envelope and the spectral content of the sounds. Such a robust selective response to voices as early as 4 months of age suggests that the infant brain is endowed with the ability to rapidly develop a functional selectivity to this socially relevant category of sounds.
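To make the frequency-tagging arithmetic concrete, the sketch below (Python; a hypothetical illustration, not code from the study) builds a stimulus schedule in which every third sound at a 3.33 Hz base rate is a voice, so a voice-selective response is expected at 3.33/3 ≈ 1.11 Hz and its harmonics in the EEG spectrum.

    import numpy as np

    base_rate_hz = 3.33      # stimulus presentation rate (all sounds)
    oddball_every = 3        # every third stimulus is a vocal sound
    n_stimuli = 120          # ~36 s of stimulation

    # Category schedule: 'V' = vocal oddball, 'N' = non-vocal base stimulus
    schedule = np.array(['V' if i % oddball_every == 0 else 'N'
                         for i in range(n_stimuli)])
    onsets = np.arange(n_stimuli) / base_rate_hz   # onset times in seconds

    # Frequencies at which responses are expected in the EEG spectrum
    general_rate_hz = base_rate_hz                 # 3.33 Hz: response to all sounds
    voice_rate_hz = base_rate_hz / oddball_every   # ~1.11 Hz: voice-selective response
    print(schedule[:9], round(voice_rate_hz, 2))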
Affiliation(s)
- Roberta P Calce
- Crossmodal Perception and Plasticity Laboratory, Institute of Research in Psychology (IPSY) and Institute of Neuroscience (IoNS), Université Catholique de Louvain, 1348 Louvain-la-Neuve, Belgium
- Diane Rekow
- Development of Olfactory Communication and Cognition Lab, Centre des Sciences du Goût et de l'Alimentation, Université Bourgogne Franche-Comté, Université de Bourgogne, CNRS, Inrae, Institut Agro Dijon, 21000 Dijon, France; Biological Psychology and Neuropsychology, University of Hamburg, 20146 Hamburg, Germany
- Francesca M Barbero
- Crossmodal Perception and Plasticity Laboratory, Institute of Research in Psychology (IPSY) and Institute of Neuroscience (IoNS), Université Catholique de Louvain, 1348 Louvain-la-Neuve, Belgium
- Anna Kiseleva
- Development of Olfactory Communication and Cognition Lab, Centre des Sciences du Goût et de l'Alimentation, Université Bourgogne Franche-Comté, Université de Bourgogne, CNRS, Inrae, Institut Agro Dijon, 21000 Dijon, France
- Siddharth Talwar
- Crossmodal Perception and Plasticity Laboratory, Institute of Research in Psychology (IPSY) and Institute of Neuroscience (IoNS), Université Catholique de Louvain, 1348 Louvain-la-Neuve, Belgium
- Arnaud Leleu
- Development of Olfactory Communication and Cognition Lab, Centre des Sciences du Goût et de l'Alimentation, Université Bourgogne Franche-Comté, Université de Bourgogne, CNRS, Inrae, Institut Agro Dijon, 21000 Dijon, France
- Olivier Collignon
- Crossmodal Perception and Plasticity Laboratory, Institute of Research in Psychology (IPSY) and Institute of Neuroscience (IoNS), Université Catholique de Louvain, 1348 Louvain-la-Neuve, Belgium; School of Health Sciences, HES-SO Valais-Wallis, The Sense Innovation and Research Center, 1007 Lausanne & Sion, Switzerland
2
Talwar S, Barbero FM, Calce RP, Collignon O. Automatic Brain Categorization of Discrete Auditory Emotion Expressions. Brain Topogr 2023; 36:854-869. PMID: 37639111. PMCID: PMC10522533. DOI: 10.1007/s10548-023-00983-8.
Abstract
Seamlessly extracting emotional information from voices is crucial for efficient interpersonal communication. However, it remains unclear how the brain categorizes vocal expressions of emotion beyond the processing of their acoustic features. In our study, we developed a new approach combining electroencephalographic recordings (EEG) in humans with a frequency-tagging paradigm to 'tag' automatic neural responses to specific categories of emotion expressions. Participants were presented with a periodic stream of heterogeneous non-verbal emotional vocalizations belonging to five emotion categories (anger, disgust, fear, happiness and sadness) at 2.5 Hz (stimulus duration of 350 ms with a 50 ms silent gap between stimuli). Importantly, unknown to the participants, a specific emotion category appeared at a target presentation rate of 0.83 Hz that would elicit an additional response in the EEG spectrum only if the brain discriminates the target emotion category from the other emotion categories and generalizes across heterogeneous exemplars of the target category. Stimuli were matched across emotion categories for harmonicity-to-noise ratio, spectral center of gravity and pitch. Additionally, participants were presented with a scrambled version of the stimuli with identical spectral content and periodicity but disrupted intelligibility. Both types of sequences had comparable envelopes and early auditory peripheral processing, computed via simulation of the cochlear response. In addition to the responses at the general presentation frequency (2.5 Hz) in both intact and scrambled sequences, a greater peak in the EEG spectrum at the target emotion presentation rate (0.83 Hz) and its harmonics emerged in the intact sequence compared with the scrambled sequence. This greater response at the target frequency in the intact sequence, together with our stimulus-matching procedure, suggests that the categorical brain response elicited by a specific emotion is at least partially independent of the low-level acoustic features of the sounds. Moreover, responses at the presentation rates of fearful and happy vocalizations elicited different topographies and different temporal dynamics, suggesting that different discrete emotions are represented differently in the brain. Our paradigm reveals the brain's ability to automatically categorize non-verbal vocal emotion expressions objectively (at a predefined frequency of interest), behavior-free, rapidly (within a few minutes of recording time) and robustly (with a high signal-to-noise ratio), making it a useful tool to study vocal emotion processing and auditory categorization in general, including in populations where behavioral assessments are more challenging.
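Frequency-tagged responses of this kind are typically quantified as a signal-to-noise ratio (SNR): the amplitude in the spectral bin at the tagged frequency divided by the mean amplitude of neighbouring bins. The sketch below is a minimal illustration on synthetic data; all parameters (sampling rate, recording length, bin counts) are chosen for the example rather than taken from the paper.

    import numpy as np

    fs = 512                                  # sampling rate (Hz), assumed
    t = np.arange(0, 60, 1 / fs)              # 60 s of one EEG channel
    rng = np.random.default_rng(0)
    # Synthetic signal: responses at 2.5 Hz (all stimuli) and 0.83 Hz (target)
    eeg = (np.sin(2 * np.pi * 2.5 * t) + 0.5 * np.sin(2 * np.pi * 0.83 * t)
           + rng.normal(0, 2, t.size))

    spectrum = np.abs(np.fft.rfft(eeg)) / t.size   # amplitude spectrum
    freqs = np.fft.rfftfreq(t.size, 1 / fs)

    def snr_at(f_target, n_neighbors=10, skip=1):
        """Amplitude at f_target divided by the mean of surrounding noise bins."""
        i = np.argmin(np.abs(freqs - f_target))
        lo = spectrum[i - skip - n_neighbors:i - skip]
        hi = spectrum[i + skip + 1:i + skip + 1 + n_neighbors]
        return spectrum[i] / np.concatenate([lo, hi]).mean()

    print("SNR at 2.5 Hz (general):", snr_at(2.5))
    print("SNR at 0.83 Hz (target):", snr_at(0.83))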
Affiliation(s)
- Siddharth Talwar
- Institute for Research in Psychology (IPSY) & Neuroscience (IoNS), Louvain Bionics, University of Louvain (UCLouvain), Louvain, Belgium
- Francesca M Barbero
- Institute for Research in Psychology (IPSY) & Neuroscience (IoNS), Louvain Bionics, University of Louvain (UCLouvain), Louvain, Belgium
- Roberta P Calce
- Institute for Research in Psychology (IPSY) & Neuroscience (IoNS), Louvain Bionics, University of Louvain (UCLouvain), Louvain, Belgium
- Olivier Collignon
- Institute for Research in Psychology (IPSY) & Neuroscience (IoNS), Louvain Bionics, University of Louvain (UCLouvain), Louvain, Belgium
- School of Health Sciences, HES-SO Valais-Wallis, The Sense Innovation and Research Center, Lausanne and Sion, Switzerland
3
Human voices escape the auditory attentional blink: Evidence from detections and pupil responses. Brain Cogn 2023; 165:105928. PMID: 36459865. DOI: 10.1016/j.bandc.2022.105928.
Abstract
Attentional selection of a second target in a rapid stream of stimuli tends to be briefly impaired when the two targets are presented in close temporal proximity, an effect known as the attentional blink (AB). Two target sounds (T1 and T2) were embedded in a rapid serial auditory presentation of environmental sounds with a short (Lag 3) or long (Lag 9) lag. Participants first identified T1 (bell or sine tone) and then detected T2 (present or absent). Individual stimuli had durations of either 30 or 90 ms and were presented in streams of 20 sounds. T2 varied in category: human voice, cello, or dog sound. Previous research has introduced pupillometry as a useful marker of the intensity of cognitive processing and attentional allocation in the visual AB paradigm. Results suggest that the interplay of stimulus factors is critical for target detection accuracy and support the hypothesis that the human voice is the least likely to show an auditory AB (in the 90 ms condition). For the other stimuli, accuracy for T2 was significantly worse at Lag 3 than at Lag 9 in the 90 ms condition, indicating the presence of an auditory AB. When the AB occurred (at Lag 3), we observed smaller pupil dilations, time-locked to the onset of T2, compared with Lag 9, reflecting reduced attentional processing when 'blinking' during target detection. Taken together, these findings support the conclusion that human voices escape the AB and that the pupillary changes are consistent with the so-called T2 attentional deficit. In addition, we found some indication that salient stimuli like human voices may require a less intense allocation of attention, or noradrenergic potentiation, than other auditory stimuli.
4
Akça M, Vuoskoski JK, Laeng B, Bishop L. Recognition of brief sounds in rapid serial auditory presentation. PLoS One 2023; 18:e0284396. PMID: 37053212. PMCID: PMC10101377. DOI: 10.1371/journal.pone.0284396.
Abstract
Two experiments were conducted to test the role of participant factors (i.e., musical sophistication, working memory capacity) and stimulus factors (i.e., sound duration, timbre) on auditory recognition using a rapid serial auditory presentation paradigm. Participants listened to a rapid stream of very brief sounds ranging from 30 to 150 milliseconds and were tested on their ability to distinguish the presence from the absence of a target sound, selected from various sound sources, placed amongst the distracters. Experiment 1a established that brief exposure to stimuli (60 to 150 milliseconds) does not necessarily correspond to impaired recognition. In Experiment 1b we found evidence that 30 milliseconds of exposure significantly impairs recognition of single auditory targets, but recognition of voice and sine-tone targets was impaired the least, suggesting that the lower limit for successful recognition of these targets may be below 30 milliseconds. Critically, the effect of sound duration on recognition disappeared entirely when differences in musical sophistication were controlled for. Participants' working memory capacities did not predict their recognition performance. Our behavioral results extend studies of the processing of brief timbres under temporal constraint by suggesting that musical sophistication may play a larger role than previously thought. These results also provide a working hypothesis for future research, namely that the neural mechanisms underlying the processing of different sound sources may operate under different temporal constraints.
Affiliation(s)
- Merve Akça
- RITMO Center for Interdisciplinary Studies in Rhythm, Time and Motion, University of Oslo, Oslo, Norway
- Department of Musicology, University of Oslo, Oslo, Norway
- Jonna Katariina Vuoskoski
- RITMO Center for Interdisciplinary Studies in Rhythm, Time and Motion, University of Oslo, Oslo, Norway
- Department of Musicology, University of Oslo, Oslo, Norway
- Department of Psychology, University of Oslo, Oslo, Norway
- Bruno Laeng
- RITMO Center for Interdisciplinary Studies in Rhythm, Time and Motion, University of Oslo, Oslo, Norway
- Department of Psychology, University of Oslo, Oslo, Norway
- Laura Bishop
- RITMO Center for Interdisciplinary Studies in Rhythm, Time and Motion, University of Oslo, Oslo, Norway
- Department of Musicology, University of Oslo, Oslo, Norway
5
Li M, Guo F, Wang X, Chen J, Ham J. Effects of robot gaze and voice human-likeness on users' subjective perception, visual attention, and cerebral activity in voice conversations. Comput Hum Behav 2022. DOI: 10.1016/j.chb.2022.107645.
6
Fan Y, Fang K, Sun R, Shen D, Yang J, Tang Y, Fang G. Hierarchical auditory perception for species discrimination and individual recognition in the music frog. Curr Zool 2021; 68:581-591. DOI: 10.1093/cz/zoab085.
Abstract
The ability to discriminate species and recognize individuals is crucial for reproductive success and/or survival in most animals. However, the temporal order and neural localization of these decision-making processes have remained unclear. In this study, event-related potentials (ERPs) were measured in the telencephalon, diencephalon, and mesencephalon of the music frog Nidirana daunchina. These ERPs were elicited by calls from 1 group of heterospecifics (recorded from a sympatric anuran species) and 2 groups of conspecifics that differed in their fundamental frequencies. In terms of the polarity and position within the ERP waveform, auditory ERPs generally consist of 4 main components that link to selective attention (N1), stimulus evaluation (P2), identification (N2), and classification (P3), occurring around 100, 200, 250, and 300 ms after stimulus onset, respectively. Our results show that the N1 amplitudes differed significantly between the heterospecific and conspecific calls, but not between the 2 groups of conspecific calls that differed in fundamental frequency. On the other hand, the N2 amplitudes differed significantly between the 2 groups of conspecific calls. Since N1 and N2 relate to selective attention and stimulus identification, respectively, this suggests that the music frogs discriminated the species first, followed by individual identification. Moreover, the P2 amplitudes evoked in females were significantly greater than those in males, indicating the existence of sexual dimorphism in auditory discrimination. In addition, both the N1 amplitudes in the left diencephalon and the P2 amplitudes in the left telencephalon were greater than in other brain areas, suggesting left-hemispheric dominance in auditory perception. Taken together, our results support the hypothesis that species discrimination and identification of individual characteristics are accomplished sequentially, and that auditory perception exhibits differences between sexes and in spatial dominance.
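ERP component amplitudes like those reported here are commonly measured as mean voltages in fixed post-stimulus windows of the trial-averaged, baseline-corrected waveform. The sketch below (synthetic single-electrode epochs, illustrative window bounds) shows that generic procedure, not the authors' exact pipeline.

    import numpy as np

    # Synthetic epochs: 100 trials x 500 samples, sampled at 1 kHz, -100..400 ms
    n_trials, n_samples = 100, 500
    times = np.arange(-100, 400) / 1000.0        # seconds relative to stimulus onset
    rng = np.random.default_rng(1)
    epochs = rng.normal(0, 5, (n_trials, n_samples))  # one electrode, microvolts

    erp = epochs.mean(axis=0)                    # average over trials
    erp -= erp[times < 0].mean()                 # baseline correction (pre-stimulus)

    # Mean amplitude in windows centred on the component latencies (~N1, P2, N2, P3)
    windows = {"N1": (0.080, 0.120), "P2": (0.180, 0.220),
               "N2": (0.230, 0.270), "P3": (0.280, 0.320)}
    amps = {c: erp[(times >= lo) & (times < hi)].mean()
            for c, (lo, hi) in windows.items()}
    print(amps)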
Affiliation(s)
- Yanzhu Fan
- Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu 610041, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Ke Fang
- Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu 610041, China
- School of Life Science, Anhui University, Hefei 230601, China
- Ruolei Sun
- Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu 610041, China
- School of Life Science, Anhui University, Hefei 230601, China
- Di Shen
- Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu 610041, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Jing Yang
- Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu 610041, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Yezhong Tang
- Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu 610041, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Guangzhan Fang
- Chengdu Institute of Biology, Chinese Academy of Sciences, Chengdu 610041, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
7
Fast Periodic Auditory Stimulation Reveals a Robust Categorical Response to Voices in the Human Brain. eNeuro 2021; 8:ENEURO.0471-20.2021. PMID: 34016602. PMCID: PMC8225406. DOI: 10.1523/eneuro.0471-20.2021.
Abstract
Voices are arguably among the most relevant sounds in humans' everyday life, and several studies have suggested the existence of voice-selective regions in the human brain. Despite two decades of research, defining the human brain regions supporting voice recognition remains challenging. Moreover, whether neural selectivity to voices is merely driven by acoustic properties specific to human voices (e.g., spectrogram, harmonicity), or whether it also reflects a higher-level categorization response, is still under debate. Here, we objectively measured rapid automatic categorization responses to human voices with fast periodic auditory stimulation (FPAS) combined with electroencephalography (EEG). Participants were tested with stimulation sequences containing heterogeneous non-vocal sounds from different categories presented at 4 Hz (i.e., four stimuli per second), with vocal sounds appearing every third stimulus (1.333 Hz). A few minutes of stimulation are sufficient to elicit robust 1.333 Hz voice-selective focal brain responses over superior temporal regions of individual participants. This response is virtually absent for sequences using frequency-scrambled sounds, but is clearly observed when voices are presented among sounds from musical instruments matched for pitch and harmonicity-to-noise ratio (HNR). Overall, our FPAS paradigm demonstrates that the human brain seamlessly categorizes human voices when compared with other sounds, including musical instrument sounds matched for low-level acoustic features, and that voice-selective responses are at least partially independent of low-level acoustic features, making FPAS a powerful and versatile tool for understanding human auditory categorization in general.
8
The processing of intimately familiar and unfamiliar voices: Specific neural responses of speaker recognition and identification. PLoS One 2021; 16:e0250214. PMID: 33861789. PMCID: PMC8051806. DOI: 10.1371/journal.pone.0250214.
Abstract
Research has repeatedly shown that familiar and unfamiliar voices elicit different neural responses. But it has also been suggested that different neural correlates associate with the feeling of having heard a voice and with knowing who the voice represents. The terminology used to designate these varying responses remains vague, creating a degree of confusion in the literature. Additionally, the terms serving to designate tasks of voice discrimination, voice recognition, and speaker identification are often inconsistent, creating further ambiguities. The present study used event-related potentials (ERPs) to clarify the difference between responses to (1) unknown voices, (2) trained-to-familiar voices as speech stimuli are repeatedly presented, and (3) intimately familiar voices. Thirteen participants listened to repeated utterances recorded from 12 speakers. Only one of the 12 voices was intimately familiar to a participant; the remaining 11 voices were unfamiliar. The frequency of presentation of the 11 unfamiliar voices varied, with only one being frequently presented (the trained-to-familiar voice). ERP analyses revealed different responses for intimately familiar and unfamiliar voices in two distinct time windows (P2 between 200-250 ms and a late positive component, LPC, between 450-850 ms post-onset), with the late responses occurring only for intimately familiar voices. The LPC presented sustained shifts, whereas the short-latency ERP components appear to reflect an early recognition stage. The trained voice also elicited distinct responses compared with rarely heard voices, but these occurred in a third time window (N250 between 300-350 ms post-onset). Overall, the timing of responses suggests that the processing of intimately familiar voices operates in two distinct steps: voice recognition, marked by a P2 on right centro-frontal sites, and speaker identification, marked by an LPC component. The recognition of frequently heard voices entails an independent recognition process marked by a differential N250. Based on the present results and previous observations, it is proposed that there is a need to distinguish between processes of voice "recognition" and "identification". The present study also specifies test conditions serving to reveal this distinction in neural responses, one of which bears on the length of speech stimuli, given the late responses associated with voice identification.
9
Ohgami Y, Kotani Y, Yoshida N, Kunimatsu A, Kiryu S, Inoue Y. Voice, rhythm, and beep stimuli differently affect the right hemisphere preponderance and components of stimulus-preceding negativity. Biol Psychol 2021; 160:108048. PMID: 33596460. DOI: 10.1016/j.biopsycho.2021.108048.
Abstract
The present study investigated whether auditory stimuli with different contents affect right laterality and the components of the stimulus-preceding negativity (SPN). A time-estimation task was performed under voice, rhythm, beep, and control conditions. The SPN interval during which participants anticipated the stimulus was divided into quarters to define early and late SPNs. Early and late components of the SPN were also extracted using a principal component analysis. The anticipation of voice sounds enhanced the early SPN and the early component, which reflected the anticipation of language processing. Beep sounds elicited a right-hemisphere preponderance of the early component, the early SPN, and the late SPN. The rhythmic sound tended to attenuate the amplitude compared with the other two stimuli. These findings further substantiate the existence of separate early and late components of the SPN. In addition, they suggest that the early component reflects selective anticipatory attention toward differing types of auditory feedback.
Affiliation(s)
- Yoshimi Ohgami
- Institute for Liberal Arts, Tokyo Institute of Technology, 2-12-1 Ohokayama, Meguro, Tokyo, Japan
- Yasunori Kotani
- Institute for Liberal Arts, Tokyo Institute of Technology, 2-12-1 Ohokayama, Meguro, Tokyo, Japan
- Nobukiyo Yoshida
- Department of Radiology, Institute of Medical Science, The University of Tokyo, 4-6-1 Shirokanedai, Minato, Tokyo, Japan
- Akira Kunimatsu
- Department of Radiology, Institute of Medical Science, The University of Tokyo, 4-6-1 Shirokanedai, Minato, Tokyo, Japan
- Shigeru Kiryu
- Department of Medicine, International University of Health and Welfare, 4-3 Kozunomori, Narita, Chiba, Japan
- Yusuke Inoue
- Department of Diagnostic Radiology, Kitasato University, 1-15-1 Kitasato, Minami, Sagamihara, Kanagawa, Japan
10
Talkington WJ, Donai J, Kadner AS, Layne ML, Forino A, Wen S, Gao S, Gray MM, Ashraf AJ, Valencia GN, Smith BD, Khoo SK, Gray SJ, Lass N, Brefczynski-Lewis JA, Engdahl S, Graham D, Frum CA, Lewis JW. Electrophysiological Evidence of Early Cortical Sensitivity to Human Conspecific Mimic Voice as a Distinct Category of Natural Sound. J Speech Lang Hear Res 2020; 63:3539-3559. PMID: 32936717. PMCID: PMC8060013. DOI: 10.1044/2020_jslhr-20-00063.
Abstract
Purpose: From an anthropological perspective of hominin communication, the human auditory system likely evolved to enable special sensitivity to sounds produced by the vocal tracts of human conspecifics, whether attended or passively heard. While numerous electrophysiological studies have used stereotypical human-produced verbal (speech voice and singing voice) and nonverbal vocalizations to identify human voice-sensitive responses, controversy remains as to when (and where) processing of acoustic signal attributes characteristic of "human voiceness" per se initiates in the brain.
Method: To explore this, we used animal vocalizations and human-mimicked versions of those calls ("mimic voice") to examine late auditory evoked potential responses in humans.
Results: We revealed an N1b component (96-120 ms poststimulus) during a nonattending listening condition showing significantly greater magnitude in response to mimics, beginning as early as primary auditory cortices and preceding the time window reported in previous studies, which found species-specific vocalization processing initiating in the range of 147-219 ms. During a sound discrimination task, a P600 component (500-700 ms poststimulus) showed specificity for accurate discrimination of human mimic voice. Distinct acoustic signal attributes and features of the stimuli were used in a classifier model, which could distinguish most human from animal voices comparably to behavioral data, though none of these single features could adequately distinguish human voiceness.
Conclusions: These results provide novel ideas for algorithms used in neuromimetic hearing aids, as well as direct electrophysiological support for a neurocognitive model of natural sound processing that informs both neurodevelopmental and anthropological models regarding the establishment of auditory communication systems in humans.
Supplemental Material: https://doi.org/10.23641/asha.12903839
Affiliation(s)
- William J. Talkington
- Department of Neuroscience, Rockefeller Neuroscience Institute, West Virginia University, Morgantown
- Jeremy Donai
- Department of Communication Sciences and Disorders, College of Education and Human Services, West Virginia University, Morgantown
- Alexandra S. Kadner
- Department of Neuroscience, Rockefeller Neuroscience Institute, West Virginia University, Morgantown
- Molly L. Layne
- Department of Neuroscience, Rockefeller Neuroscience Institute, West Virginia University, Morgantown
- Andrew Forino
- Department of Neuroscience, Rockefeller Neuroscience Institute, West Virginia University, Morgantown
- Sijin Wen
- Department of Biostatistics, West Virginia University, Morgantown
- Si Gao
- Department of Biostatistics, West Virginia University, Morgantown
- Margeaux M. Gray
- Department of Biology, Rockefeller Neuroscience Institute, West Virginia University, Morgantown
- Alexandria J. Ashraf
- Department of Biology, Rockefeller Neuroscience Institute, West Virginia University, Morgantown
- Gabriela N. Valencia
- Department of Neuroscience, Rockefeller Neuroscience Institute, West Virginia University, Morgantown
- Brandon D. Smith
- Department of Biology, Rockefeller Neuroscience Institute, West Virginia University, Morgantown
- Stephanie K. Khoo
- Department of Biology, Rockefeller Neuroscience Institute, West Virginia University, Morgantown
- Stephen J. Gray
- Department of Neuroscience, Rockefeller Neuroscience Institute, West Virginia University, Morgantown
- Norman Lass
- Department of Communication Sciences and Disorders, College of Education and Human Services, West Virginia University, Morgantown
- Susannah Engdahl
- Department of Neuroscience, Rockefeller Neuroscience Institute, West Virginia University, Morgantown
- David Graham
- Department of Computer Science and Electrical Engineering, West Virginia University, Morgantown
- Chris A. Frum
- Department of Neuroscience, Rockefeller Neuroscience Institute, West Virginia University, Morgantown
- James W. Lewis
- Department of Neuroscience, Rockefeller Neuroscience Institute, West Virginia University, Morgantown
11
Neurophysiological Differences in Emotional Processing by Cochlear Implant Users, Extending Beyond the Realm of Speech. Ear Hear 2020; 40:1197-1209. PMID: 30762600. DOI: 10.1097/aud.0000000000000701.
Abstract
Objective: Cochlear implants (CIs) restore a sense of hearing in deaf individuals. However, they do not transmit the acoustic signal with sufficient fidelity, leading to difficulties in recognizing emotions in voice and in music. The study aimed to explore the neurophysiological bases of these limitations.
Design: Twenty-two adults (18 to 70 years old) with CIs and 22 age-matched controls with normal hearing participated. Event-related potentials (ERPs) were recorded in response to emotional bursts (happy, sad, or neutral) produced in each modality (voice or music) that were for the most part correctly identified behaviorally.
Results: Compared to controls, the N1 and P2 components were attenuated and prolonged in CI users. To a smaller degree, N1 and P2 were also attenuated and prolonged in music compared to voice, in both populations. The N1-P2 complex was emotion-dependent (e.g., reduced and prolonged response to sadness), but this was also true in both populations. In contrast, the later portion of the response, between 600 and 850 ms, differentiated happy and sad from neutral stimuli in normal-hearing but not in CI listeners.
Conclusions: The early portion of the ERP waveform reflected primarily the general reduction in sensory encoding by CI users (largely due to CI processing itself), whereas altered emotional processing (by CI users) could be found in the later portion of the ERP and extended beyond the realm of speech.
12
Akça M, Laeng B, Godøy RI. No Evidence for an Auditory Attentional Blink for Voices Regardless of Musical Expertise. Front Psychol 2020; 10:2935. PMID: 31998190. PMCID: PMC6966238. DOI: 10.3389/fpsyg.2019.02935.
Abstract
Background: Attending to goal-relevant information can leave us metaphorically "blind" or "deaf" to the next relevant information while searching among distracters. This temporal cost to human selective attention, lasting about half a second, has long been explored using the attentional blink paradigm. Although there is evidence that certain visual stimuli relating to one's area of expertise can be less susceptible to attentional blink effects, it remains unexplored whether the dynamics of temporal selective attention vary with expertise and object types in the auditory modality.
Methods: Using the auditory version of the attentional blink paradigm, the present study investigated whether certain auditory objects relating to musical and perceptual expertise affect the transient costs of selective attention. Expert cellists and novice participants were asked to first identify a target sound, and then to detect instrumental timbres of cello or organ, or the human voice, as a second target in a rapid auditory stream.
Results: The results showed moderate evidence against an attentional blink effect for voices, independent of participants' musical expertise. Experts outperformed novices in their overall accuracy levels of target identification and detection, reflecting a clear benefit of musical expertise. Importantly, the musicianship advantage disappeared when human voices served as the second target in the stream.
Discussion: The results are discussed in terms of stimulus salience and the advantage of voice processing, as well as perceptual and musical expertise in relation to attention and working memory performance.
Affiliation(s)
- Merve Akça
- RITMO Center for Interdisciplinary Studies in Rhythm, Time and Motion, University of Oslo, Oslo, Norway
- Department of Musicology, University of Oslo, Oslo, Norway
- Bruno Laeng
- RITMO Center for Interdisciplinary Studies in Rhythm, Time and Motion, University of Oslo, Oslo, Norway
- Department of Psychology, University of Oslo, Oslo, Norway
- Rolf Inge Godøy
- RITMO Center for Interdisciplinary Studies in Rhythm, Time and Motion, University of Oslo, Oslo, Norway
- Department of Musicology, University of Oslo, Oslo, Norway
13
Zhang H, Liu M, Li W, Sommer W. Human voice attractiveness processing: Electrophysiological evidence. Biol Psychol 2019; 150:107827. PMID: 31756365. DOI: 10.1016/j.biopsycho.2019.107827.
Abstract
Voice attractiveness plays a significant role in social interaction and mate choice. However, how listeners perceive attractive voices, and whether this process is mandatory, is poorly understood. The current study explored this question using event-related brain potentials. Participants listened to syllables spoken by male and female voices of high or low attractiveness while completing an implicit (voice-unrelated) tone detection task or explicitly judging voice attractiveness. In both tasks, attractive male voices elicited a larger N1 than unattractive voices. However, an effect of voice attractiveness on the late positive complex (LPC) was seen only in the explicit task, and it was present for both same- and opposite-sex voices. Taken together, voice attractiveness processing during early stages appears to be rapid, mandatory, and related to mate selection, whereas during later elaborated processing, voice attractiveness evaluation is strategic and aesthetics-based, requiring attentional resources.
Affiliation(s)
- Hang Zhang
- Research Center of Brain and Cognitive Neuroscience, Liaoning Normal University, Dalian, China
- Meng Liu
- Research Center of Brain and Cognitive Neuroscience, Liaoning Normal University, Dalian, China
- Weijun Li
- Research Center of Brain and Cognitive Neuroscience, Liaoning Normal University, Dalian, China
- Werner Sommer
- Institut für Psychologie, Humboldt-Universität zu Berlin, Berlin, Germany
14
Salvari V, Paraskevopoulos E, Chalas N, Müller K, Wollbrink A, Dobel C, Korth D, Pantev C. Auditory Categorization of Man-Made Sounds Versus Natural Sounds by Means of MEG Functional Brain Connectivity. Front Neurosci 2019; 13:1052. PMID: 31636532. PMCID: PMC6787283. DOI: 10.3389/fnins.2019.01052.
Abstract
Previous neuroimaging studies have shown that sounds can be discriminated according to living-related or man-made-related characteristics and that this discrimination involves different brain regions. However, these studies have mainly provided source-space analyses, which offer simple maps of activated brain regions but do not explain how the regions of a distributed system are functionally organized under a specific task. In the present study, we aimed to further examine the functional connectivity of the auditory processing pathway across different categories of non-speech sounds in healthy adults by means of MEG. Our analyses demonstrated significant activation and interconnection differences between living and man-made object sounds in the prefrontal areas, anterior superior temporal gyrus (aSTG), posterior cingulate cortex (PCC), and supramarginal gyrus (SMG), occurring within the 80-120 ms post-stimulus interval. The current findings replicate previous ones in showing that regions beyond the auditory cortex are involved in auditory processing. The functional connectivity analysis revealed distinct brain networks across the categories, suggesting that sound-category discrimination relies on distinct cortical networks, a notion that has also been strongly argued in the literature in relation to the visual system.
Affiliation(s)
- Vasiliki Salvari
- Institute for Biomagnetism and Biosignalanalysis, University of Münster, Münster, Germany
- Evangelos Paraskevopoulos
- Institute for Biomagnetism and Biosignalanalysis, University of Münster, Münster, Germany; School of Medicine, Faculty of Health Sciences, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Nikolas Chalas
- School of Medicine, Faculty of Health Sciences, Aristotle University of Thessaloniki, Thessaloniki, Greece
- Kilian Müller
- Institute for Biomagnetism and Biosignalanalysis, University of Münster, Münster, Germany
- Andreas Wollbrink
- Institute for Biomagnetism and Biosignalanalysis, University of Münster, Münster, Germany
- Christian Dobel
- Department of Otorhinolaryngology, Friedrich-Schiller University of Jena, Jena, Germany
- Daniela Korth
- Department of Otorhinolaryngology, Friedrich-Schiller University of Jena, Jena, Germany
- Christo Pantev
- Institute for Biomagnetism and Biosignalanalysis, University of Münster, Münster, Germany
15
Ogg M, Carlson TA, Slevc LR. The Rapid Emergence of Auditory Object Representations in Cortex Reflect Central Acoustic Attributes. J Cogn Neurosci 2019; 32:111-123. PMID: 31560265. DOI: 10.1162/jocn_a_01472.
Abstract
Human listeners are bombarded by acoustic information that the brain rapidly organizes into coherent percepts of objects and events in the environment, which aids speech and music perception. The efficiency of auditory object recognition belies the critical constraint that acoustic stimuli necessarily require time to unfold. Using magnetoencephalography, we studied the time course of the neural processes that transform dynamic acoustic information into auditory object representations. Participants listened to a diverse set of 36 tokens comprising everyday sounds from a typical human environment. Multivariate pattern analysis was used to decode the sound tokens from the magnetoencephalographic recordings. We show that sound tokens can be decoded from brain activity beginning 90 msec after stimulus onset with peak decoding performance occurring at 155 msec poststimulus onset. Decoding performance was primarily driven by differences between category representations (e.g., environmental vs. instrument sounds), although within-category decoding was better than chance. Representational similarity analysis revealed that these emerging neural representations were related to harmonic and spectrotemporal differences among the stimuli, which correspond to canonical acoustic features processed by the auditory pathway. Our findings begin to link the processing of physical sound properties with the perception of auditory objects and events in cortex.
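Time-resolved multivariate decoding of this kind is usually implemented by training and cross-validating a classifier independently at each time point of the epoched sensor data. The following sketch uses synthetic data and scikit-learn, with shapes and parameters chosen purely for illustration rather than taken from the study.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(2)
    n_trials, n_sensors, n_times = 200, 50, 120   # epoched MEG: trials x sensors x time
    X = rng.normal(size=(n_trials, n_sensors, n_times))
    y = rng.integers(0, 2, n_trials)              # two sound categories

    # Decode the category separately at every time point
    accuracy = np.empty(n_times)
    for t in range(n_times):
        clf = LogisticRegression(max_iter=1000)
        accuracy[t] = cross_val_score(clf, X[:, :, t], y, cv=5).mean()

    # Peak decoding latency: the time index with best classification performance
    print("peak decoding at time index", accuracy.argmax())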
16
Liu X, Xu Y, Alter K, Tuomainen J. Emotional Connotations of Musical Instrument Timbre in Comparison With Emotional Speech Prosody: Evidence From Acoustics and Event-Related Potentials. Front Psychol 2018; 9:737. PMID: 29867690. PMCID: PMC5962697. DOI: 10.3389/fpsyg.2018.00737.
Abstract
Music and speech both communicate emotional meanings in addition to their domain-specific contents, but it is not clear whether and how the two kinds of emotional meanings are linked. The present study explores the emotional connotations of the musical timbre of isolated instrument sounds through the perspective of emotional speech prosody. The stimuli were isolated instrument sounds and emotional speech prosody categorized by listeners as anger, happiness, or sadness. We first analyzed the timbral features of the stimuli, which showed that the relations between the three emotions were relatively consistent in those features for speech and music. The results further echo the size-code hypothesis, according to which differences in sound timbre signal different projected body sizes. We then conducted an ERP experiment using a priming paradigm with isolated instrument sounds as primes and emotional speech prosody as targets. The results showed that emotionally incongruent instrument-speech pairs triggered a larger N400 response than emotionally congruent pairs. Taken together, this is the first study to provide evidence that the timbre of simple and isolated musical instrument sounds can convey emotion in a way similar to emotional speech prosody.
Affiliation(s)
- Xiaoluan Liu
- Department of Speech, Hearing and Phonetic Sciences, University College London, London, United Kingdom
- Yi Xu
- Department of Speech, Hearing and Phonetic Sciences, University College London, London, United Kingdom
- Kai Alter
- Faculty of Linguistics, Philology and Phonetics, University of Oxford, Oxford, United Kingdom; Institute of Neuroscience, Newcastle University, Newcastle upon Tyne, United Kingdom
- Jyrki Tuomainen
- Department of Speech, Hearing and Phonetic Sciences, University College London, London, United Kingdom
17
Ahmed DG, Paquette S, Zeitouni A, Lehmann A. Neural Processing of Musical and Vocal Emotions Through Cochlear Implants Simulation. Clin EEG Neurosci 2018; 49:143-151. PMID: 28958161. DOI: 10.1177/1550059417733386.
Abstract
Cochlear implants (CIs) partially restore the sense of hearing in the deaf. However, the ability to recognize emotions in speech and music is reduced due to the implant's electrical signal limitations and the patient's altered neural pathways. The electrophysiological correlates of these limitations are not yet well established. Here we aimed to characterize the effect of CIs on auditory emotion processing and, for the first time, to directly compare vocal and musical emotion processing through a CI simulator. We recorded 16 normal-hearing participants' electroencephalographic activity while they listened to vocal and musical emotional bursts in their original form and in a degraded (CI-simulated) condition. We found prolonged P50 latency and reduced N100-P200 complex amplitude in the CI-simulated condition, pointing to a limitation in the encoding of sound signals processed through CI simulation. When comparing the processing of vocal and musical bursts, we found delayed latencies for musical bursts relative to vocal bursts in both conditions (original and CI-simulated). This suggests that despite the cochlear implant's limitations, the auditory cortex can distinguish between vocal and musical stimuli. It also adds to the literature supporting the complexity of musical emotion. Replicating this study with actual CI users might help characterize emotional processing in CI users and could ultimately inform optimal rehabilitation programs or device processing strategies to improve CI users' quality of life.
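The abstract does not specify the simulator used, but a common way to approximate CI processing is noise vocoding: the signal is split into a small number of frequency bands, and each band's envelope is extracted and re-imposed on band-limited noise. The sketch below is a generic, assumption-laden illustration of that technique (band edges, filter order, and channel count are arbitrary choices), not the authors' implementation.

    import numpy as np
    from scipy.signal import butter, sosfiltfilt, hilbert

    def noise_vocode(signal, fs, n_channels=8, f_lo=100.0, f_hi=8000.0):
        """Crude noise-vocoder CI simulation: band-split, extract envelopes,
        re-impose them on band-limited noise carriers. Illustrative only."""
        edges = np.geomspace(f_lo, f_hi, n_channels + 1)   # log-spaced band edges
        rng = np.random.default_rng(3)
        out = np.zeros_like(signal)
        for lo, hi in zip(edges[:-1], edges[1:]):
            sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
            band = sosfiltfilt(sos, signal)
            env = np.abs(hilbert(band))                    # channel envelope
            carrier = sosfiltfilt(sos, rng.normal(size=signal.size))
            out += env * carrier
        return out / (np.abs(out).max() + 1e-12)           # normalise amplitude

    fs = 16000
    t = np.arange(0, 1.0, 1 / fs)
    tone = np.sin(2 * np.pi * 440 * t)                     # stand-in for a vocal burst
    degraded = noise_vocode(tone, fs)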
Affiliation(s)
- Duha G Ahmed
- International Laboratory for Brain Music and Sound Research, Center for Research on Brain, Language and Music, Department of Psychology, University of Montreal, Montreal, Quebec, Canada; Department of Otolaryngology, Head and Neck Surgery, McGill University, Montreal, Quebec, Canada; Department of Otolaryngology, Head and Neck Surgery, King Abdulaziz University, Rabigh Medical College, Jeddah, Saudi Arabia
- Sebastian Paquette
- International Laboratory for Brain Music and Sound Research, Center for Research on Brain, Language and Music, Department of Psychology, University of Montreal, Montreal, Quebec, Canada; Neurology Department, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, MA, USA
- Anthony Zeitouni
- Department of Otolaryngology, Head and Neck Surgery, McGill University, Montreal, Quebec, Canada
- Alexandre Lehmann
- International Laboratory for Brain Music and Sound Research, Center for Research on Brain, Language and Music, Department of Psychology, University of Montreal, Montreal, Quebec, Canada; Department of Otolaryngology, Head and Neck Surgery, McGill University, Montreal, Quebec, Canada
18
Bowman C, Yamauchi T. Processing emotions in sounds: cross-domain aftereffects of vocal utterances and musical sounds. Cogn Emot 2016; 31:1610-1626. PMID: 27848281. DOI: 10.1080/02699931.2016.1255588.
Abstract
Nonlinguistic signals in the voice and musical instruments play a critical role in communicating emotion. Although previous research suggests a common mechanism for emotion processing in music and speech, the precise relationship between the two domains is unclear due to the paucity of direct evidence. By applying the adaptation paradigm developed by Bestelmeyer, Rouger, DeBruine, and Belin (2010, Auditory adaptation in vocal affect perception. Cognition, 117(2), 217-223. doi: 10.1016/j.cognition.2010.08.008), this study shows cross-domain aftereffects from vocal to musical sounds. Participants heard an angry or fearful sound four times, followed by a test sound, and judged whether the test sound was angry or fearful. Results show cross-domain aftereffects in one direction only, from vocal utterances to musical sounds, and this effect occurred primarily for angry vocal sounds. It is argued that there is a unidirectional relationship between vocal and musical sounds in which emotion processing of vocal sounds encompasses musical sounds but not vice versa.
Affiliation(s)
- Casady Bowman
- Department of Psychology, Texas A&M University, College Station, TX, USA
- Takashi Yamauchi
- Department of Psychology, Texas A&M University, College Station, TX, USA
19
Stavropoulos KKM, Carver LJ. Neural Correlates of Attention to Human-Made Sounds: An ERP Study. PLoS One 2016; 11:e0165745. PMID: 27798701. PMCID: PMC5087949. DOI: 10.1371/journal.pone.0165745.
Abstract
Previous neuroimaging and electrophysiological studies have suggested that human-made sounds are processed differently from non-human-made sounds. Multiple groups have suggested that voices might be processed as "special," much like faces. Although previous literature has explored the neural correlates of voice perception under varying task demands, few studies have examined electrophysiological correlates of attention while directly comparing human-made and non-human-made sounds. In the present study, we used event-related potentials (ERPs) to compare attention to human-made versus non-human-made sounds in an oddball paradigm. The ERP components of interest were the P300 and the fronto-temporal positivity to voices (FTPV), which has been reported in previous investigations of voice versus non-voice stimuli. We found that participants who heard human-made sounds as "target" or infrequent stimuli had significantly larger FTPV amplitude, shorter FTPV latency, and larger P300 amplitude than those who heard non-human-made sounds as "target" stimuli. Our results are in concordance with previous findings that human-made and non-human-made sounds are processed differently, and they expand upon previous literature by demonstrating increased attention to human-made versus non-human-made sounds, even when the non-human-made sounds are ones that require immediate attention in daily life (e.g., a car horn). Heightened attention to human-made sounds is important theoretically and has potential application in tests of social interest in populations with autism.
Affiliation(s)
- Leslie J. Carver
- University of California San Diego, San Diego, California, United States of America
20
Herz N, Reuveni I, Goldstein A, Peri T, Schreiber S, Harpaz Y, Bonne O. Neural correlates of attention bias in posttraumatic stress disorder. Clin Neurophysiol 2016; 127:3268-3276. DOI: 10.1016/j.clinph.2016.07.016.
21
Fang G, Yang P, Xue F, Cui J, Brauth SE, Tang Y. Sound Classification and Call Discrimination Are Decoded in Order as Revealed by Event-Related Potential Components in Frogs. Brain Behav Evol 2015; 86:232-245. DOI: 10.1159/000441215.
Abstract
Species that use communication sounds to coordinate social and reproductive behavior must be able to distinguish vocalizations from nonvocal sounds as well as to identify individual vocalization types. In this study we sought to identify the neural localization of the processes involved and the temporal order in which they occur in an anuran species, the music frog Babina daunchina. To do this we measured telencephalic and mesencephalic event-related potentials (ERPs) elicited by synthesized white noise (WN), highly sexually attractive (HSA) calls produced by males from inside nests, and male calls of low sexual attractiveness (LSA) produced outside of nests. Each stimulus possessed similar temporal structures. The results showed the following: (1) the amplitudes of the first negative ERP component (N1) at ∼100 ms differed significantly between WN and conspecific calls but not between HSA and LSA calls, indicating that discrimination between conspecific calls and nonvocal sounds occurs within ∼100 ms; (2) the amplitudes of the second positive ERP component (P2) at ∼200 ms in the difference waves between HSA calls and WN were significantly higher than between LSA calls and WN in the right telencephalon, implying that call characteristic identification occurs within ∼200 ms; and (3) WN evoked a larger third positive ERP component (P3) at ∼300 ms than conspecific calls, suggesting the frogs had classified the conspecific calls into one category and perceived WN as novel. Thus, both the detection of sounds and the identification of call characteristics are accomplished quickly and in a specific temporal order, as reflected by ERP components. In addition, the most dynamic ERP patterns appeared in the left mesencephalon and the right telencephalon, indicating that these two brain regions might play key roles in anuran vocal communication.
22
Klyn NAM, Will U, Cheong YJ, Allen ET. Differential short-term memorisation for vocal and instrumental rhythms. Memory 2015; 24:766-791. PMID: 26274938. DOI: 10.1080/09658211.2015.1050400.
Abstract
This study explores differential processing of vocal and instrumental rhythms in short-term memory with three decision (same/different judgment) experiments and one reproduction experiment. In the first experiment, memory performance declined for delayed versus immediate recall, with accuracy for the two rhythm types being affected differently: musicians performed better than non-musicians on clapstick but not on vocal rhythms, and musicians were better on vocal rhythms in the same than in the different condition. Results for the second experiment showed that concurrent sub-vocal articulation and finger-tapping differentially affected the two rhythm types and same/different decisions, but produced no evidence for articulatory-loop involvement in delayed decision tasks. In a third experiment, which tested rhythm reproduction, concurrent sub-vocal articulation decreased memory performance, with a stronger deleterious effect on the reproduction of vocal than of clapstick rhythms. This suggests that the articulatory loop may be involved only in delayed reproduction, not in decision tasks. The fourth experiment tested whether differences between filled and empty rhythms (continuous vs. discontinuous sounds) can explain the different memorisation of vocal and clapstick rhythms. Though significant differences were found between empty and filled instrumental rhythms, the differences between vocal and clapstick rhythms can only be explained by considering additional voice-specific features.
Affiliation(s)
- Niall A M Klyn
- School of Music, The Ohio State University, Columbus, OH, USA; Department of Speech and Hearing Science, The Ohio State University, Columbus, OH, USA
- Udo Will
- School of Music, The Ohio State University, Columbus, OH, USA
- Yong-Jeon Cheong
- School of Music, The Ohio State University, Columbus, OH, USA
- Erin T Allen
- School of Music, The Ohio State University, Columbus, OH, USA
23
Rigoulot S, Pell MD, Armony JL. Time course of the influence of musical expertise on the processing of vocal and musical sounds. Neuroscience 2015; 290:175-184. PMID: 25637804. DOI: 10.1016/j.neuroscience.2015.01.033.
Abstract
Previous functional magnetic resonance imaging (fMRI) studies have suggested that different cerebral regions preferentially process human voice and music. Yet little is known about the temporal course of the brain processes that decode the category of sounds, or about how expertise in one sound category can impact these processes. To address this question, we recorded the electroencephalogram (EEG) of 15 musicians and 18 non-musicians while they listened to short musical excerpts (piano and violin) and vocal stimuli (speech and non-linguistic vocalizations). The task of the participants was to detect noise targets embedded within the stream of sounds. Event-related potentials revealed an early differentiation of sound category, within the first 100 ms after sound onset, with mostly increased responses to musical sounds. Importantly, this effect was modulated by the musical background of participants, as musicians were more responsive to music sounds than non-musicians, consistent with the notion that musical training increases sensitivity to music. In later temporal windows, brain responses were enhanced in response to vocal stimuli, but musicians remained more responsive to music. These results shed new light on the temporal course of the neural dynamics of auditory processing and reveal how it is impacted by stimulus category and the expertise of participants.
Collapse
Affiliation(s)
- S Rigoulot
- Centre for Research on Brain, Language and Music (CRBLM), Montreal, Canada; Department of Psychiatry, McGill University and Douglas Mental Health University Institute, Montreal, Canada.
| | - M D Pell
- Centre for Research on Brain, Language and Music (CRBLM), Montreal, Canada; School of Communication Sciences and Disorders, McGill University, Canada
| | - J L Armony
- Centre for Research on Brain, Language and Music (CRBLM), Montreal, Canada; Department of Psychiatry, McGill University and Douglas Mental Health University Institute, Montreal, Canada
| |
Collapse
|
24
|
Zhang D, Liu Y, Hou X, Sun G, Cheng Y, Luo Y. Discrimination of fearful and angry emotional voices in sleeping human neonates: a study of the mismatch brain responses. Front Behav Neurosci 2014; 8:422. [PMID: 25538587 PMCID: PMC4255595 DOI: 10.3389/fnbeh.2014.00422] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/20/2014] [Accepted: 11/18/2014] [Indexed: 02/04/2023] Open
Abstract
Appropriate processing of human voices with different threat-related emotions is of evolutionarily adaptive value for the survival of individuals. Nevertheless, it is still not clear whether sensitivity to threat-related information is present at birth. Using an oddball paradigm, the current study investigated the neural correlates underlying automatic processing of emotional voices of fear and anger in sleeping neonates. Event-related potential data showed that the fronto-central scalp distribution of the neonatal brain could discriminate fearful voices from angry voices; the mismatch response (MMR) was larger in response to the deviant stimuli of anger compared with the standard stimuli of fear. Furthermore, this fear-anger MMR discrimination was observed only when neonates were in an active sleep state. Although the neonates' sensitivity to threat-related voices is not likely associated with a conceptual understanding of fearful and angry emotions, this special discrimination in early life may provide a foundation for later emotional and social cognitive development.
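The mismatch analysis described here reduces, at its core, to subtracting the average standard response from the average deviant response and quantifying the difference wave in a latency window. A minimal sketch of that computation follows; the array shapes, the 100-300 ms window, and all variable names are illustrative assumptions, not the study's parameters.

```python
import numpy as np

def mismatch_response(epochs, labels, times, window=(0.1, 0.3)):
    """Deviant-minus-standard difference wave (MMR) for one channel group.

    epochs: (n_trials, n_samples) baseline-corrected EEG epochs
    labels: (n_trials,) strings, 'deviant' or 'standard'
    times:  (n_samples,) epoch time points in seconds
    window: latency window (s) over which to quantify the MMR amplitude
    """
    deviant_avg = epochs[labels == 'deviant'].mean(axis=0)
    standard_avg = epochs[labels == 'standard'].mean(axis=0)
    difference_wave = deviant_avg - standard_avg

    # Mean amplitude of the difference wave inside the latency window
    mask = (times >= window[0]) & (times <= window[1])
    mmr_amplitude = difference_wave[mask].mean()
    return difference_wave, mmr_amplitude

# Illustrative usage with synthetic data (170 standards, 30 deviants)
rng = np.random.default_rng(0)
times = np.linspace(-0.1, 0.6, 701)
epochs = rng.normal(size=(200, times.size))
labels = np.array(['standard'] * 170 + ['deviant'] * 30)
_, amp = mismatch_response(epochs, labels, times)
```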
Affiliation(s)
- Dandan Zhang
- Institute of Affective and Social Neuroscience, Shenzhen University, Shenzhen, China; State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China
- Yunzhe Liu
- Institute of Affective and Social Neuroscience, Shenzhen University, Shenzhen, China; State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China
- Xinlin Hou
- Department of Pediatrics, Peking University First Hospital, Beijing, China
- Guoyu Sun
- Department of Pediatrics, Peking University First Hospital, Beijing, China
- Yawei Cheng
- Institute of Neuroscience, Yang-Ming University, Taipei, Taiwan; Department of Rehabilitation, Yang-Ming University Hospital, Ilan, Taiwan
- Yuejia Luo
- Institute of Affective and Social Neuroscience, Shenzhen University, Shenzhen, China

25
Li Y, Gu F, Zhang X, Yang L, Chen L, Wei Z, Zha R, Wang Y, Li X, Zhou Y, Zhang X. Cerebral activity to opposite-sex voices reflected by event-related potentials. PLoS One 2014; 9:e94976. [PMID: 24727971 PMCID: PMC3984274 DOI: 10.1371/journal.pone.0094976] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/03/2014] [Accepted: 03/20/2014] [Indexed: 11/18/2022] Open
Abstract
The human voice is a gender-discriminating cue and is important to mate selection. This study employed electrophysiological recordings to examine whether there is specific cerebral activity when listeners are presented with opposite-sex voices as compared to same-sex voices. Male and female voices were pseudo-randomly presented to male and female participants. In Experiment 1, participants were instructed to determine the gender of each voice. A late positivity (LP) response around 750 ms after voice onset was elicited by opposite-sex voices, reflected as a more positive deflection of the ERP to opposite-sex voices than to same-sex voices. This LP response was prominent around parieto-occipital recording sites and suggests an opposite-sex-specific process, which may reflect emotion- and/or reward-related cerebral activity. In Experiment 2, participants were instructed to press a key when hearing a non-voice pure tone and to give no response when they heard voice stimuli. In this task, no differences were found between the ERPs to same-sex and opposite-sex voices, suggesting that the cerebral activity to opposite-sex voices may disappear without gender-related attention. These results have significant implications for the cognitive mechanisms underlying opposite-sex-specific voice processing.
Affiliation(s)
- Ya Li
- CAS Key Laboratory of Brain Function and Disease, School of Life Sciences, University of Science and Technology of China, Hefei, Anhui, China
- Feng Gu
- CAS Key Laboratory of Brain Function and Disease, School of Life Sciences, University of Science and Technology of China, Hefei, Anhui, China
- Xiliang Zhang
- CAS Key Laboratory of Brain Function and Disease, School of Life Sciences, University of Science and Technology of China, Hefei, Anhui, China
- Lizhuang Yang
- CAS Key Laboratory of Brain Function and Disease, School of Life Sciences, University of Science and Technology of China, Hefei, Anhui, China
- Lijun Chen
- CAS Key Laboratory of Brain Function and Disease, School of Life Sciences, University of Science and Technology of China, Hefei, Anhui, China
- Zhengde Wei
- CAS Key Laboratory of Brain Function and Disease, School of Life Sciences, University of Science and Technology of China, Hefei, Anhui, China
- Rujing Zha
- CAS Key Laboratory of Brain Function and Disease, School of Life Sciences, University of Science and Technology of China, Hefei, Anhui, China
- Ying Wang
- CAS Key Laboratory of Brain Function and Disease, School of Life Sciences, University of Science and Technology of China, Hefei, Anhui, China
- Xiaoming Li
- CAS Key Laboratory of Brain Function and Disease, School of Life Sciences, University of Science and Technology of China, Hefei, Anhui, China; Department of Medical Psychology, Anhui Medical University, Hefei, Anhui, China
- Yifeng Zhou
- CAS Key Laboratory of Brain Function and Disease, School of Life Sciences, University of Science and Technology of China, Hefei, Anhui, China
- Xiaochu Zhang
- CAS Key Laboratory of Brain Function and Disease, School of Life Sciences, University of Science and Technology of China, Hefei, Anhui, China; School of Humanities & Social Science, University of Science & Technology of China, Hefei, Anhui, China

26
Cossy N, Tzovara A, Simonin A, Rossetti AO, De Lucia M. Robust discrimination between EEG responses to categories of environmental sounds in early coma. Front Psychol 2014; 5:155. [PMID: 24611061 PMCID: PMC3933775 DOI: 10.3389/fpsyg.2014.00155] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/22/2013] [Accepted: 02/07/2014] [Indexed: 01/18/2023] Open
Abstract
Humans can recognize categories of environmental sounds, including vocalizations produced by humans and animals and the sounds of man-made objects. Most neuroimaging investigations of environmental sound discrimination have studied subjects while they consciously perceived and often explicitly recognized the stimuli. Consequently, it remains unclear to what extent auditory object processing occurs independently of task demands and consciousness. Studies in animal models have shown that environmental sound discrimination at a neural level persists even in anesthetized preparations, whereas data from anesthetized humans have thus far provided null results. Here, we studied comatose patients as a model of environmental sound discrimination capacities during unconsciousness. We recorded nineteen-channel electroencephalography (EEG) in 19 comatose patients treated with therapeutic hypothermia (TH) during the first 2 days of coma. At the level of each individual patient, we applied a decoding algorithm to quantify the differential EEG responses to human vs. animal vocalizations as well as to sounds from living sources vs. man-made objects. Discrimination between vocalization types was accurate in 11 patients, and discrimination between sounds from living and man-made sources in 10 patients. At the group level, the results were significant only for the comparison between vocalization types. These results lay the groundwork for disentangling truly preferential activations in response to auditory categories from the contribution of awareness to auditory category discrimination.
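As a rough illustration of single-patient decoding of this kind, one can cross-validate a classifier on single-trial EEG responses to the two sound categories and ask whether accuracy exceeds chance. The sketch below uses a generic scikit-learn pipeline; it is not the authors' algorithm, and all names and shapes are invented for illustration.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for one patient's data:
# 100 trials x (19 channels * 50 time samples), flattened per trial
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 19 * 50))   # single-trial EEG features
y = np.repeat([0, 1], 50)             # 0 = human, 1 = animal vocalization

# Cross-validated decoding accuracy for this patient
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scores = cross_val_score(clf, X, y, cv=10)
print(f"mean decoding accuracy: {scores.mean():.2f}")  # ~0.5 is chance here
```

In practice the per-patient accuracy would then be compared against a permutation-based chance distribution before declaring discrimination "accurate" for that patient.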
Affiliation(s)
- Natacha Cossy
- Electroencephalography Brain Mapping Core, Center for Biomedical Imaging (CIBM), University Hospital Center, University of Lausanne, Lausanne, Switzerland; Department of Radiology, University Hospital Center, University of Lausanne, Lausanne, Switzerland
- Athina Tzovara
- Electroencephalography Brain Mapping Core, Center for Biomedical Imaging (CIBM), University Hospital Center, University of Lausanne, Lausanne, Switzerland; Department of Radiology, University Hospital Center, University of Lausanne, Lausanne, Switzerland
- Alexandre Simonin
- Department of Clinical Neurosciences, University Hospital Center, University of Lausanne, Lausanne, Switzerland
- Andrea O Rossetti
- Department of Clinical Neurosciences, University Hospital Center, University of Lausanne, Lausanne, Switzerland
- Marzia De Lucia
- Electroencephalography Brain Mapping Core, Center for Biomedical Imaging (CIBM), University Hospital Center, University of Lausanne, Lausanne, Switzerland; Department of Radiology, University Hospital Center, University of Lausanne, Lausanne, Switzerland

27
Granot RY, Israel-Kolatt R, Gilboa A, Kolatt T. Accuracy of pitch matching significantly improved by live voice model. J Voice 2013; 27:390.e13-20. [PMID: 23528675 DOI: 10.1016/j.jvoice.2013.01.001] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/23/2012] [Accepted: 01/07/2013] [Indexed: 11/30/2022]
Abstract
Singing is, undoubtedly, the most fundamental expression of our musical capacity, yet an estimated 10-15% of the Western population sings "out-of-tune" (OOT). Previous research in children and adults suggests, albeit inconsistently, that imitating a human voice can improve pitch matching. In the present study, we focus on the potentially beneficial effects of the human voice, and especially the live human voice. Eighteen participants varying in their singing abilities were required to vocally imitate a set of nine ascending and descending intervals presented to them in five different randomized blocked conditions: live piano, recorded piano, live voice using optimal voice production, recorded voice using optimal voice production, and recorded voice using artificial forced voice production. Pitch and interval matching in singing were much more accurate when participants repeated sung intervals than intervals played to them on the piano. The advantage of the vocal over the piano stimuli was robust and emerged clearly regardless of whether piano tones were played live and in full view or were presented via recording. Live vocal stimuli elicited higher accuracy than recorded vocal stimuli, especially when the recorded vocal stimuli were produced with a forced vocal production. Remarkably, even those who would be considered OOT singers on the basis of their performance when repeating piano tones were able to pitch-match live vocal sounds, with deviations well within the range of what is considered accurate singing (M=46.0, standard deviation=39.2 cents). In fact, the participants who were most OOT gained the most from the live voice model. Results are discussed in light of the dual auditory-motor encoding of pitch, analogous to that found in speech.
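Deviations of this kind are conventionally measured in cents, hundredths of an equal-tempered semitone, computed as 1200 * log2(f_produced / f_target). A minimal sketch of that conversion; the example frequencies are illustrative, not the study's data.

```python
import math

def cents_deviation(f_produced: float, f_target: float) -> float:
    """Signed deviation of a produced pitch from a target pitch, in cents.

    100 cents = one equal-tempered semitone; 1200 cents = one octave.
    """
    return 1200.0 * math.log2(f_produced / f_target)

# Example: singing 446 Hz against a 440 Hz target is ~23.4 cents sharp,
# well within the ~46-cent mean deviation reported for live-voice models.
print(round(cents_deviation(446.0, 440.0), 1))
```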
Affiliation(s)
- Roni Y Granot
- Musicology Department, The Hebrew University of Jerusalem, Jerusalem, Israel

28
Kaganovich N, Kim J, Herring C, Schumaker J, Macpherson M, Weber-Fox C. Musicians show general enhancement of complex sound encoding and better inhibition of irrelevant auditory change in music: an ERP study. Eur J Neurosci 2013; 37:1295-307. [PMID: 23301775 DOI: 10.1111/ejn.12110] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/15/2011] [Revised: 11/19/2012] [Accepted: 11/25/2012] [Indexed: 11/30/2022]
Abstract
Using electrophysiology, we examined two questions in relation to musical training: whether it enhances sensory encoding of the human voice, and whether it improves the ability to ignore irrelevant auditory change. Participants performed an auditory distraction task, in which they identified each sound as either short (350 ms) or long (550 ms) and ignored a change in the timbre of the sounds. Sounds consisted of a male and a female voice saying a neutral sound [a], and of a cello and a French horn playing an F3 note. In some blocks, musical sounds occurred on 80% of trials and voice sounds on 20% of trials; in other blocks, the reverse was true. Participants heard naturally recorded sounds in half of the experimental blocks and their spectrally rotated versions in the other half. Regarding voice perception, we found that musicians had a larger N1 event-related potential component not only to vocal sounds but also to their never-before-heard spectrally rotated versions. We therefore conclude that musical training is associated with a general improvement in the early neural encoding of complex sounds. Regarding the ability to ignore irrelevant auditory change, musicians' accuracy tended to suffer less from the change in timbre, especially when the deviants were musical notes. This behavioral finding was accompanied by a marginally larger reorienting negativity in musicians, suggesting that their advantage may lie in a more efficient disengagement of attention from the distracting auditory dimension.
Affiliation(s)
- Natalya Kaganovich
- Department of Speech, Language, and Hearing Sciences, Purdue University, 500 Oval Drive, West Lafayette, IN 47907-2038, USA

29
Mai X, Xu L, Li M, Shao J, Zhao Z, deRegnier RA, Nelson CA, Lozoff B. Auditory recognition memory in 2-month-old infants as assessed by event-related potentials. Dev Neuropsychol 2012; 37:400-14. [PMID: 22799760 DOI: 10.1080/87565641.2011.650807] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/28/2022]
Abstract
Previous studies of auditory recognition memory in sleeping newborns reported two event-related potential (ERP) components, P2 and a negative slow wave (NSW), reflecting voice discrimination and detection of novelty, respectively. In the present study, using high-density recording arrays, ERPs were acquired from 26 awake 2-month-old infants as they were presented with a familiar and an unfamiliar voice (i.e., mother and stranger) with equal probability. In addition to the P2 and NSW, we observed a positive slow wave (PSW) over the right temporo-parietal scalp, indicating memory updating. Our study suggests that infants have the capacity to encode novel stimuli as early as 2 months of age.
Affiliation(s)
- Xiaoqin Mai
- Center for Human Growth and Development, University of Michigan, Ann Arbor, Michigan, USA

30

31
The superiority in voice processing of the blind arises from neural plasticity at sensory processing stages. Neuropsychologia 2012; 50:2056-67. [DOI: 10.1016/j.neuropsychologia.2012.05.006] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/06/2011] [Revised: 03/07/2012] [Accepted: 05/06/2012] [Indexed: 11/17/2022]

32
Capilla A, Belin P, Gross J. The early spatio-temporal correlates and task independence of cerebral voice processing studied with MEG. Cereb Cortex 2012; 23:1388-95. [PMID: 22610392 DOI: 10.1093/cercor/bhs119] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open
Abstract
Functional magnetic resonance imaging studies have repeatedly provided evidence for temporal voice areas (TVAs) with particular sensitivity to human voices along bilateral mid/anterior superior temporal sulci and superior temporal gyri (STS/STG). In contrast, electrophysiological studies of the spatio-temporal correlates of cerebral voice processing have yielded contradictory results, finding the earliest correlates either at ∼300-400 ms, or earlier at ∼200 ms ("fronto-temporal positivity to voice", FTPV). These contradictory results are likely the consequence of different stimulus sets and attentional demands. Here, we recorded magnetoencephalography activity while participants listened to diverse types of vocal and non-vocal sounds and performed different tasks varying in attentional demands. Our results confirm the existence of an early voice-preferential magnetic response (FTPVm, the magnetic counterpart of the FTPV) peaking at about 220 ms and distinguishing between vocal and non-vocal sounds as early as 150 ms after stimulus onset. The sources underlying the FTPVm were localized along bilateral mid-STS/STG, largely overlapping with the TVAs. The FTPVm was consistently observed across different stimulus subcategories, including speech and non-speech vocal sounds, and across different tasks. These results demonstrate the early, largely automatic recruitment of focal, voice-selective cerebral mechanisms with a time-course comparable to that of face processing.
Affiliation(s)
- Almudena Capilla
- Department of Biological and Health Psychology, Autonoma University of Madrid, Madrid, Spain

33
Lévêque Y, Giovanni A, Schön D. Effects of humanness and gender in voice processing. Logoped Phoniatr Vocol 2012; 37:137-43. [PMID: 22587690 DOI: 10.3109/14015439.2012.687763] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]
Abstract
When we observe a producible human movement, the brain performs a specific perception-action matching process, which possibly facilitates perceptual processing. In this work, we wanted to study whether the producibility of a sound affects the speed at which it is categorized. Participants were presented with isolated sounds, either sung by a natural male or female voice ('producible') or distorted by saturation ('non-producible'), and had to categorize them as produced by a voice or by a machine. We analyzed reaction time variations as a function of the gender and humanness of the voice. Results corroborate the existence of a 'human bias' in auditory perception, and suggest a processing speed asymmetry between natural female and male voices.
Affiliation(s)
- Yohana Lévêque
- Laboratoire Parole et Langage, CNRS & Aix-Marseille University, 5 avenue Pasteur, Aix-en-Provence 13604, France

34
Agus TR, Suied C, Thorpe SJ, Pressnitzer D. Fast recognition of musical sounds based on timbre. J Acoust Soc Am 2012; 131:4124-4133. [PMID: 22559384 DOI: 10.1121/1.3701865] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/31/2023]
Abstract
Human listeners seem to have an impressive ability to recognize a wide variety of natural sounds. However, there is surprisingly little quantitative evidence to characterize this fundamental ability. Here the speed and accuracy of musical-sound recognition were measured psychophysically with a rich but acoustically balanced stimulus set. The set comprised recordings of notes from musical instruments and sung vowels. In a first experiment, reaction times were collected for three target categories: voice, percussion, and strings. In a go/no-go task, listeners reacted as quickly as possible to members of a target category while withholding responses to distractors (a diverse set of musical instruments). Results showed near-perfect accuracy and fast reaction times, particularly for voices. In a second experiment, voices were recognized among strings and vice versa. Again, reaction times to voices were faster. In a third experiment, auditory chimeras were created to retain only spectral or temporal features of the voice. Chimeras were recognized accurately, but not as quickly as natural voices. Altogether, the data suggest rapid and accurate neural mechanisms for musical-sound recognition based on selectivity to complex spectro-temporal signatures of sound sources.
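The auditory chimeras mentioned above dissociate a sound's spectral content from its temporal features. One common construction, shown here purely as an illustration rather than the authors' exact procedure, imposes the temporal envelope of one sound on the spectral fine structure of another via the Hilbert transform (real chimera stimuli are typically built per frequency band rather than broadband, as here).

```python
import numpy as np
from scipy.signal import hilbert

def temporal_chimera(envelope_donor, structure_donor):
    """Impose one sound's temporal envelope on another's fine structure.

    Both inputs are 1-D arrays of equal length at the same sample rate.
    The result keeps the envelope of `envelope_donor` (e.g., a voice) and
    the spectral fine structure of `structure_donor` (e.g., an instrument).
    """
    env = np.abs(hilbert(envelope_donor))    # temporal envelope
    analytic = hilbert(structure_donor)
    fine = np.cos(np.angle(analytic))        # fine structure only
    chimera = env * fine
    return chimera / np.max(np.abs(chimera)) # normalize to +/-1

# Illustrative usage with synthetic tones
sr = 16000
t = np.arange(sr) / sr
voice_like = np.sin(2 * np.pi * 220 * t) * np.exp(-3 * t)  # decaying tone
string_like = np.sin(2 * np.pi * 440 * t)                  # steady tone
mix = temporal_chimera(voice_like, string_like)
```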
Affiliation(s)
- Trevor R Agus
- Laboratoire de Psychologie de la Perception, UMR CNRS 8158, Université Paris-Descartes & Département d'Études Cognitives, Ecole Normale Supérieure, 29 rue d'Ulm, 75005 Paris, France

35
Discriminating Male and Female Voices: Differentiating Pitch and Gender. Brain Topogr 2011; 25:194-204. [DOI: 10.1007/s10548-011-0207-9] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/19/2011] [Accepted: 10/24/2011] [Indexed: 11/26/2022]

36
Person identification through faces and voices: An ERP study. Brain Res 2011; 1407:13-26. [DOI: 10.1016/j.brainres.2011.03.029] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/10/2011] [Accepted: 03/11/2011] [Indexed: 11/17/2022]

37

38
Vanzella P, Schellenberg EG. Absolute pitch: effects of timbre on note-naming ability. PLoS One 2010; 5:e15449. [PMID: 21085598 PMCID: PMC2978713 DOI: 10.1371/journal.pone.0015449] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/28/2010] [Accepted: 09/22/2010] [Indexed: 11/24/2022] Open
Abstract
Background: Absolute pitch (AP) is the ability to identify or produce isolated musical tones. It is evident primarily among individuals who started music lessons in early childhood. Because AP requires memory for specific pitches as well as learned associations with verbal labels (i.e., note names), it represents a unique opportunity to study interactions in memory between linguistic and nonlinguistic information. One untested hypothesis is that the pitch of voices may be difficult for AP possessors to identify. A musician's first instrument may also affect performance and extend the sensitive period for acquiring accurate AP.
Methods/Principal Findings: A large sample of AP possessors was recruited online. Participants were required to identify test tones presented in four different timbres: piano, pure tone, natural (sung) voice, and synthesized voice. Note-naming accuracy was better for non-vocal (piano and pure tones) than for vocal (natural and synthesized voices) test tones. This difference could not be attributed solely to vibrato (pitch variation), which was more pronounced in the natural voice than in the synthesized voice. Although starting music lessons by age 7 was associated with enhanced note-naming accuracy, equivalent abilities were evident among listeners who started music lessons on piano at a later age.
Conclusions/Significance: Because the human voice is inextricably linked to language and meaning, it may be processed automatically by voice-specific mechanisms that interfere with note naming among AP possessors. Lessons on piano or other fixed-pitch instruments appear to enhance AP abilities and to extend the sensitive period for exposure to music in order to develop accurate AP.
Affiliation(s)
- E. Glenn Schellenberg
- Department of Psychology, University of Toronto, Mississauga, Ontario, Canada

39
Abstract
The ability to discriminate conspecific vocalizations is observed across species and early during development. However, its neurophysiologic mechanism remains controversial, particularly regarding whether it involves specialized processes with dedicated neural machinery. We identified spatiotemporal brain mechanisms for conspecific vocalization discrimination in humans by applying electrical neuroimaging analyses to auditory evoked potentials (AEPs) in response to acoustically and psychophysically controlled nonverbal human and animal vocalizations as well as sounds of man-made objects. AEP strength modulations in the absence of topographic modulations are suggestive of statistically indistinguishable brain networks. First, responses were significantly stronger, but topographically indistinguishable to human versus animal vocalizations starting at 169-219 ms after stimulus onset and within regions of the right superior temporal sulcus and superior temporal gyrus. This effect correlated with another AEP strength modulation occurring at 291-357 ms that was localized within the left inferior prefrontal and precentral gyri. Temporally segregated and spatially distributed stages of vocalization discrimination are thus functionally coupled and demonstrate how conventional views of functional specialization must incorporate network dynamics. Second, vocalization discrimination is not subject to facilitated processing in time, but instead lags more general categorization by approximately 100 ms, indicative of hierarchical processing during object discrimination. Third, although differences between human and animal vocalizations persisted when analyses were performed at a single-object level or extended to include additional (man-made) sound categories, at no latency were responses to human vocalizations stronger than those to all other categories. Vocalization discrimination transpires at times synchronous with that of face discrimination but is not functionally specialized.
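For readers unfamiliar with electrical neuroimaging terminology, response strength is typically quantified as global field power (GFP, the spatial standard deviation of the potential across electrodes), while topography is compared with a strength-independent dissimilarity index (DISS). A minimal sketch of both measures, assuming average-referenced data; this is a generic illustration, not the authors' analysis pipeline.

```python
import numpy as np

def global_field_power(v):
    """GFP at each time point: spatial standard deviation across electrodes.

    v: (n_electrodes, n_samples) average-referenced potentials.
    """
    return v.std(axis=0)

def topographic_dissimilarity(v1, v2):
    """Global dissimilarity (DISS) between two maps at each time point.

    Each map is normalized by its own GFP, so DISS is independent of
    response strength; 0 = identical topographies, 2 = inverted maps.
    """
    u1 = (v1 - v1.mean(axis=0)) / v1.std(axis=0)
    u2 = (v2 - v2.mean(axis=0)) / v2.std(axis=0)
    n_electrodes = v1.shape[0]
    return np.sqrt(((u1 - u2) ** 2).sum(axis=0) / n_electrodes)
```

Stronger GFP with statistically indistinguishable DISS is what licenses the abstract's inference of "statistically indistinguishable brain networks" differing only in response strength.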

40
Spierer L, De Lucia M, Bernasconi F, Grivel J, Bourquin NMP, Clarke S, Murray MM. Learning-induced plasticity in human audition: objects, time, and space. Hear Res 2010; 271:88-102. [PMID: 20430070 DOI: 10.1016/j.heares.2010.03.086] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 12/23/2009] [Revised: 02/16/2010] [Accepted: 03/03/2010] [Indexed: 10/19/2022]
Abstract
The human auditory system comprises specialized but interacting anatomical and functional pathways encoding object, spatial, and temporal information. We review how learning-induced plasticity manifests along these pathways and to what extent there are common mechanisms subserving such plasticity. A first series of experiments establishes a temporal hierarchy along which sounds of objects are discriminated along basic to fine-grained categorical boundaries and learned representations. A widespread network of temporal and (pre)frontal brain regions contributes to object discrimination via recursive processing. Learning-induced plasticity typically manifested as repetition suppression within a common set of brain regions. A second series considered how the temporal sequence of sound sources is represented. We show that lateralized responsiveness during the initial encoding phase of pairs of auditory spatial stimuli is critical for their accurate ordered perception. Finally, we consider how spatial representations are formed and modified through training-induced learning. A population-based model of spatial processing is supported, wherein temporal and parietal structures interact in the encoding of relative and absolute spatial information over the initial ~300 ms post-stimulus onset. Collectively, these data provide insights into the functional organization of human audition and open directions for new developments in targeted diagnostic and neurorehabilitation strategies.
Affiliation(s)
- Lucas Spierer
- Neuropsychology and Neurorehabilitation Service, Department of Clinical Neuroscience, Vaudois University Hospital Center and University of Lausanne, Switzerland

41
Sorokin A, Alku P, Kujala T. Change and novelty detection in speech and non-speech sound streams. Brain Res 2010; 1327:77-90. [DOI: 10.1016/j.brainres.2010.02.052] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/26/2009] [Revised: 01/30/2010] [Accepted: 02/18/2010] [Indexed: 10/19/2022]

42
Gordon RL, Schön D, Magne C, Astésano C, Besson M. Words and melody are intertwined in perception of sung words: EEG and behavioral evidence. PLoS One 2010; 5:e9889. [PMID: 20360991 PMCID: PMC2847603 DOI: 10.1371/journal.pone.0009889] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/19/2009] [Accepted: 02/26/2010] [Indexed: 11/19/2022] Open
Abstract
Language and music, two of the most unique human cognitive abilities, are combined in song, rendering it an ecological model for comparing speech and music cognition. The present study was designed to determine whether words and melodies in song are processed interactively or independently, and to examine the influence of attention on the processing of words and melodies in song. Event-Related brain Potentials (ERPs) and behavioral data were recorded while non-musicians listened to pairs of sung words (prime and target) presented in four experimental conditions: same word, same melody; same word, different melody; different word, same melody; different word, different melody. Participants were asked to attend to either the words or the melody, and to perform a same/different task. In both attentional tasks, different word targets elicited an N400 component, as predicted based on previous results. Most interestingly, different melodies (sung with the same word) elicited an N400 component followed by a late positive component. Finally, ERP and behavioral data converged in showing interactions between the linguistic and melodic dimensions of sung words. The finding that the N400 effect, a well-established marker of semantic processing, was modulated by musical melody in song suggests that variations in musical features affect word processing in sung language. Implications of the interactions between words and melody are discussed in light of evidence for shared neural processing resources between the phonological/semantic aspects of language and the melodic/harmonic aspects of music.
Affiliation(s)
- Reyna L Gordon
- Center for Complex Systems and Brain Sciences, Florida Atlantic University, Boca Raton, Florida, United States of America

43
Latinus M, VanRullen R, Taylor MJ. Top-down and bottom-up modulation in processing bimodal face/voice stimuli. BMC Neurosci 2010; 11:36. [PMID: 20222946 PMCID: PMC2850913 DOI: 10.1186/1471-2202-11-36] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/27/2009] [Accepted: 03/11/2010] [Indexed: 11/16/2022] Open
Abstract
Background: Processing of multimodal information is a critical capacity of the human brain, with classic studies showing that bimodal stimulation can either facilitate or interfere with perceptual processing. Comparing activity to congruent and incongruent bimodal stimuli can reveal sensory dominance in particular cognitive tasks.
Results: We investigated audiovisual interactions driven by stimulus properties (bottom-up influences) or by task (top-down influences) on congruent and incongruent simultaneously presented faces and voices while ERPs were recorded. Subjects performed gender categorisation, directing attention either to faces or to voices, and also judged whether the face/voice stimuli were congruent in terms of gender. Behaviourally, the unattended modality affected processing in the attended modality: the disruption was greater for attended voices. ERPs revealed top-down modulations of early brain processing (30-100 ms) over unisensory cortices. No effects were found on the N170 or VPP, but from 180-230 ms larger right frontal activity was seen for incongruent than congruent stimuli.
Conclusions: Our data demonstrate that in a gender categorisation task the processing of faces dominates over the processing of voices. Brain activity showed different modulation by top-down and bottom-up information. Top-down influences modulated early brain activity, whereas bottom-up interactions occurred relatively late.
Affiliation(s)
- Marianne Latinus
- Université de Toulouse, UPS, CNRS, Centre de recherche Cerveau et Cognition, Toulouse, France

44
Murray MM, Spierer L. Auditory spatio-temporal brain dynamics and their consequences for multisensory interactions in humans. Hear Res 2009; 258:121-33. [DOI: 10.1016/j.heares.2009.04.022] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 03/02/2009] [Revised: 04/28/2009] [Accepted: 04/28/2009] [Indexed: 11/24/2022]

45
Rogier O, Roux S, Belin P, Bonnet-Brilhault F, Bruneau N. An electrophysiological correlate of voice processing in 4- to 5-year-old children. Int J Psychophysiol 2009; 75:44-7. [PMID: 19896509 DOI: 10.1016/j.ijpsycho.2009.10.013] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/25/2009] [Revised: 10/29/2009] [Accepted: 10/30/2009] [Indexed: 11/16/2022]
Abstract
Cortical auditory evoked potentials were studied in response to voice and environmental sounds in 4- to 5-year-old children. A specific response to voice was dissociated from the response to environmental sounds. It appeared as a positive deflection recorded at right fronto-temporal sites and beginning within 60 ms of stimulus onset. We termed this response the Fronto-Temporal Positivity to Voice (FTPV).
Affiliation(s)
- Ophelie Rogier
- UMR INSERM U930, CNRS ERL 3106, Université François-Rabelais de Tours, CHRU de Tours, France

46
Charest I, Pernet CR, Rousselet GA, Quiñones I, Latinus M, Fillion-Bilodeau S, Chartrand JP, Belin P. Electrophysiological evidence for an early processing of human voices. BMC Neurosci 2009; 10:127. [PMID: 19843323 PMCID: PMC2770575 DOI: 10.1186/1471-2202-10-127] [Citation(s) in RCA: 84] [Impact Index Per Article: 5.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/20/2009] [Accepted: 10/20/2009] [Indexed: 11/10/2022] Open
Abstract
Background: Previous electrophysiological studies have identified a "voice specific response" (VSR) peaking around 320 ms after stimulus onset, a latency markedly longer than the 70 ms needed to discriminate living from non-living sound sources and the 150-200 ms needed for the processing of voice paralinguistic qualities. In the present study, we investigated whether an early electrophysiological difference between voice and non-voice stimuli could be observed.
Results: ERPs were recorded from 32 healthy volunteers who listened to 200 ms long stimuli from three sound categories - voices, bird songs and environmental sounds - whilst performing a pure-tone detection task. ERP analyses revealed voice/non-voice amplitude differences emerging as early as 164 ms post stimulus onset and peaking around 200 ms on fronto-temporal (positivity) and occipital (negativity) electrodes.
Conclusion: Our electrophysiological results suggest a rapid brain discrimination of sounds of voice, termed the "fronto-temporal positivity to voices" (FTPV), at latencies comparable to the well-known face-preferential N170.
Affiliation(s)
- Ian Charest
- Centre for Cognitive NeuroImaging (CCNi) & Department of Psychology, University of Glasgow, Glasgow, UK

47

48
Brancucci A, Lucci G, Mazzatenta A, Tommasi L. Asymmetries of the human social brain in the visual, auditory and chemical modalities. Philos Trans R Soc Lond B Biol Sci 2009; 364:895-914. [PMID: 19064350 PMCID: PMC2666086 DOI: 10.1098/rstb.2008.0279] [Citation(s) in RCA: 80] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/18/2022] Open
Abstract
Structural and functional asymmetries are present in many regions of the human brain responsible for motor control, sensory and cognitive functions and communication. Here, we focus on hemispheric asymmetries underlying the domain of social perception, broadly conceived as the analysis of information about other individuals based on acoustic, visual and chemical signals. By means of these cues the brain establishes the border between 'self' and 'other', and interprets the surrounding social world in terms of the physical and behavioural characteristics of conspecifics essential for impression formation and for creating bonds and relationships. We show that, considered from the standpoint of single- and multi-modal sensory analysis, the neural substrates of the perception of voices, faces, gestures, smells and pheromones, as evidenced by modern neuroimaging techniques, are characterized by a general pattern of right-hemispheric functional asymmetry that might benefit from other aspects of hemispheric lateralization rather than constituting a true specialization for social information.
Affiliation(s)
- Luca Tommasi
- Department of Biomedical Sciences, Institute for Advanced Biomedical Technologies, University of Chieti, Blocco A, Via dei Vestini 29, 66013 Chieti, Italy

49
Bonte M, Valente G, Formisano E. Dynamic and task-dependent encoding of speech and voice by phase reorganization of cortical oscillations. J Neurosci 2009; 29:1699-706. [PMID: 19211877 PMCID: PMC6666288 DOI: 10.1523/jneurosci.3694-08.2009] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/05/2008] [Revised: 01/05/2009] [Accepted: 01/07/2009] [Indexed: 11/21/2022] Open
Abstract
Speech and vocal sounds are at the core of human communication. Cortical processing of these sounds critically depends on behavioral demands. However, the neurocomputational mechanisms enabling this adaptive processing remain elusive. Here we examine the task-dependent reorganization of electroencephalographic responses to natural speech sounds (vowels /a/, /i/, /u/) spoken by three speakers (two female, one male) while listeners perform a one-back task on either vowel or speaker identity. We show that dynamic changes of sound-evoked responses and phase patterns of cortical oscillations in the alpha band (8-12 Hz) closely reflect the abstraction and analysis of the sounds along the task-relevant dimension. Vowel categorization leads to a significant temporal realignment of responses to the same vowel, e.g., /a/, independent of who pronounced this vowel, whereas speaker categorization leads to a significant temporal realignment of responses to the same speaker, e.g., speaker 1, independent of which vowel she/he pronounced. This transient and goal-dependent realignment of neuronal responses to physically different external events provides a robust cortical coding mechanism for forming and processing abstract representations of auditory (speech) input.
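Phase reorganization of this kind is commonly quantified by inter-trial phase coherence (ITC): band-limit the signal, extract each trial's instantaneous phase, and take the length of the mean phase vector across trials (1 = perfectly aligned phases, 0 = random). A minimal sketch for the 8-12 Hz alpha band named above; the filter order, sampling rate, and array shapes are illustrative assumptions, not the study's settings.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def alpha_itc(epochs, sr, band=(8.0, 12.0)):
    """Inter-trial phase coherence in the alpha band for one channel.

    epochs: (n_trials, n_samples) single-trial EEG
    sr:     sampling rate in Hz
    Returns an (n_samples,) array of ITC values in [0, 1].
    """
    b, a = butter(4, [band[0] / (sr / 2), band[1] / (sr / 2)], btype="band")
    filtered = filtfilt(b, a, epochs, axis=1)       # zero-phase band-pass
    phase = np.angle(hilbert(filtered, axis=1))     # instantaneous phase
    return np.abs(np.exp(1j * phase).mean(axis=0))  # mean resultant length

# Illustrative usage: ITC near 1 where trials share phase, near 0 otherwise
rng = np.random.default_rng(1)
sr, n_trials, n_samples = 250, 60, 500
epochs = rng.normal(size=(n_trials, n_samples))
itc = alpha_itc(epochs, sr)
```

The task-dependent "temporal realignment" the abstract describes would then appear as higher ITC for trials grouped along the task-relevant dimension (same vowel, or same speaker) than along the irrelevant one.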
Affiliation(s)
- Milene Bonte
- Department of Cognitive Neuroscience, Faculty of Psychology and Neuroscience, Maastricht University, 6200 MD Maastricht, The Netherlands

50
Dolce G, Riganello F, Quintieri M, Candelieri A, Conforti D. Personal Interaction in the Vegetative State. J Psychophysiol 2008. [DOI: 10.1027/0269-8803.22.3.150] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/23/2022]
Abstract
Background and purpose: Brain processing at varying levels of functional complexity and emotional reactions to relatives are anecdotally reported by the caregivers of patients in a vegetative state. In this study, computer-assisted machine-learning procedures were applied to identify heart rate variability changes or galvanic skin responses to a relative's presence.
Methods: The skin conductance (galvanic skin response) and heart beats were continuously recorded in 12 patients in a vegetative state, at rest (baseline) and while approached by a relative (usually the mother; test condition) or by a non-familiar person (control condition). The cardiotachogram (the series of consecutive intervals between heart beats) was analyzed in the time and frequency domains by computing the parametric and nonparametric frequency spectra. A machine-learning algorithm was applied to sort out the significant spectral parameter(s). For all patients, each condition (baseline, test, control) was characterized by the values of its spectral parameters, and the association between spectral parameter values and experimental condition was tested (WEKA machine-learning software).
Results and comments: A galvanic skin response was obtained in two patients. The machine-learning procedure independently selected the nu_LF spectral parameter and attributed each nu_LF measure to one of the three experimental conditions. 69.4% of attributions were correct (baseline: 58%; test condition: 75%; control: 75%). In seven patients, attribution changed when the subject was approached by the test person; specifically, sequential shifts from baseline to test condition ("the Mom effect") to control condition were identified in four patients (30.0%); the change from test to control was attributed correctly in seven patients (58%). The observation of heart rate changes tentatively attributable to emotional reaction in a vegetative state suggests residual rudimentary personal interaction, consistent with functioning limbic and paralimbic systems after massive brain damage. Machine learning proved applicable for sorting significant measures out of large samples and for controlling statistical alpha inflation.
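The nu_LF parameter selected by the classifier is the normalized low-frequency power of heart rate variability, LF / (LF + HF), computed over the conventional 0.04-0.15 Hz (LF) and 0.15-0.4 Hz (HF) bands. A minimal sketch from a series of RR intervals; the resampling rate and Welch parameters are illustrative choices, not the study's settings.

```python
import numpy as np
from scipy.signal import welch
from scipy.interpolate import interp1d

def nu_lf(rr_ms, fs=4.0):
    """Normalized LF power (nu_LF) from RR intervals in milliseconds.

    The irregularly sampled RR series is resampled at `fs` Hz, its power
    spectrum estimated with Welch's method, and nu_LF computed as
    LF / (LF + HF) over the conventional HRV bands.
    """
    t = np.cumsum(rr_ms) / 1000.0                    # beat times (s)
    even_t = np.arange(t[0], t[-1], 1.0 / fs)
    rr_even = interp1d(t, rr_ms)(even_t)             # evenly sampled series
    f, pxx = welch(rr_even - rr_even.mean(), fs=fs,
                   nperseg=min(256, len(rr_even)))
    lf_band = (f >= 0.04) & (f < 0.15)
    hf_band = (f >= 0.15) & (f < 0.40)
    lf = np.trapz(pxx[lf_band], f[lf_band])
    hf = np.trapz(pxx[hf_band], f[hf_band])
    return lf / (lf + hf)

# Illustrative usage with a synthetic RR series around 800 ms
rng = np.random.default_rng(3)
rr = 800 + rng.normal(0, 30, size=600)
print(round(nu_lf(rr), 2))
```

Per-condition nu_LF values of this kind would then serve as the features the WEKA classifier attributes to baseline, test, or control.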
Affiliation(s)
- G. Dolce
- Intensive Care Unit, S. Anna Institute, Crotone, Italy
- F. Riganello
- Intensive Care Unit, S. Anna Institute, Crotone, Italy
- M. Quintieri
- Intensive Care Unit, S. Anna Institute, Crotone, Italy
- A. Candelieri
- Department of Electronic Informatics and Systems, Laboratory of Decision Engineering for Health Care Delivery, University of Cosenza, Italy
- D. Conforti
- Department of Electronic Informatics and Systems, Laboratory of Decision Engineering for Health Care Delivery, University of Cosenza, Italy