1. Hakonen M, Dahmani L, Lankinen K, Ren J, Barbaro J, Blazejewska A, Cui W, Kotlarz P, Li M, Polimeni JR, Turpin T, Uluç I, Wang D, Liu H, Ahveninen J. Individual connectivity-based parcellations reflect functional properties of human auditory cortex. bioRxiv 2024:2024.01.20.576475. [PMID: 38293021] [PMCID: PMC10827228] [DOI: 10.1101/2024.01.20.576475]
Abstract
Neuroimaging studies of the functional organization of human auditory cortex have focused on group-level analyses to identify tendencies that represent the typical brain. Here, we mapped auditory areas of the human superior temporal cortex (STC) in 30 participants by combining functional network analysis and 1-mm isotropic resolution 7T functional magnetic resonance imaging (fMRI). Two resting-state fMRI sessions and one or two auditory and audiovisual speech localizer sessions were collected on 3-4 separate days. We generated a set of functional network-based parcellations from these data. Solutions with 4, 6, and 11 networks were selected for closer examination based on local maxima of Dice and Silhouette values. The resulting parcellation of auditory cortices showed high intraindividual reproducibility both between resting-state sessions (Dice coefficient: 69-78%) and between resting-state and task sessions (Dice coefficient: 62-73%). This demonstrates that auditory areas in STC can be reliably segmented into functional subareas. The interindividual variability was significantly larger than the intraindividual variability (Dice coefficient: 57-68%, p<0.001), indicating that the parcellations also captured meaningful interindividual variability. The individual-specific parcellations yielded the highest alignment with task response topographies, suggesting that individual variability in parcellations reflects individual variability in auditory function. Connectional homogeneity within networks was also highest for the individual-specific parcellations. Furthermore, the similarity in the functional parcellations was not explained by the similarity of macroanatomical properties of auditory cortex. Our findings suggest that individual-level parcellations capture meaningful idiosyncrasies in auditory cortex organization.
Affiliation(s)
- M Hakonen
  - Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Charlestown, MA, USA
  - Department of Radiology, Harvard Medical School, Boston, MA, USA
- L Dahmani
  - Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Charlestown, MA, USA
  - Department of Radiology, Harvard Medical School, Boston, MA, USA
- K Lankinen
  - Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Charlestown, MA, USA
  - Department of Radiology, Harvard Medical School, Boston, MA, USA
- J Ren
  - Division of Brain Sciences, Changping Laboratory, Beijing, China
- J Barbaro
  - Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Charlestown, MA, USA
- A Blazejewska
  - Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Charlestown, MA, USA
  - Department of Radiology, Harvard Medical School, Boston, MA, USA
- W Cui
  - Division of Brain Sciences, Changping Laboratory, Beijing, China
- P Kotlarz
  - Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Charlestown, MA, USA
- M Li
  - Division of Brain Sciences, Changping Laboratory, Beijing, China
- J R Polimeni
  - Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Charlestown, MA, USA
  - Department of Radiology, Harvard Medical School, Boston, MA, USA
  - Harvard-MIT Program in Health Sciences and Technology, Massachusetts Institute of Technology, Cambridge, MA, USA
- T Turpin
  - Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Charlestown, MA, USA
- I Uluç
  - Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Charlestown, MA, USA
  - Department of Radiology, Harvard Medical School, Boston, MA, USA
- D Wang
  - Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Charlestown, MA, USA
  - Department of Radiology, Harvard Medical School, Boston, MA, USA
- H Liu
  - Division of Brain Sciences, Changping Laboratory, Beijing, China
  - Biomedical Pioneering Innovation Center (BIOPIC), Peking University, Beijing, China
- J Ahveninen
  - Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Charlestown, MA, USA
  - Department of Radiology, Harvard Medical School, Boston, MA, USA
2. Silva Pereira S, Özer EE, Sebastian-Galles N. Complexity of STG signals and linguistic rhythm: a methodological study for EEG data. Cereb Cortex 2024; 34:bhad549. [PMID: 38236741] [DOI: 10.1093/cercor/bhad549]
Abstract
The superior temporal and Heschl's gyri of the human brain play a fundamental role in speech processing. Neurons synchronize their activity to the amplitude envelope of the speech signal to extract acoustic and linguistic features, a process known as neural tracking/entrainment. Electroencephalography has been extensively used in language-related research due to its high temporal resolution and reduced cost, but it does not allow precise source localization. Motivated by the lack of a unified methodology for the interpretation of source-reconstructed signals, we propose a method based on modularity and signal complexity. The procedure was tested on data from an experiment in which we investigated the impact of native language on tracking of linguistic rhythms in two groups: English natives and Spanish natives. In the experiment, we found no effect of native language but an effect of language rhythm. Here, we compare source-projected signals in the auditory areas of both hemispheres for the different conditions using nonparametric permutation tests, modularity, and a dynamical complexity measure. We found increasing values of complexity for decreased regularity in the stimuli, allowing us to conclude that languages with less complex rhythms are easier for the auditory cortex to track.
Affiliation(s)
- Silvana Silva Pereira
  - Center for Brain and Cognition, Department of Information and Communications Technologies, Universitat Pompeu Fabra, 08005 Barcelona, Spain
- Ege Ekin Özer
  - Center for Brain and Cognition, Department of Information and Communications Technologies, Universitat Pompeu Fabra, 08005 Barcelona, Spain
- Nuria Sebastian-Galles
  - Center for Brain and Cognition, Department of Information and Communications Technologies, Universitat Pompeu Fabra, 08005 Barcelona, Spain
3. Krason A, Vigliocco G, Mailend ML, Stoll H, Varley R, Buxbaum LJ. Benefit of visual speech information for word comprehension in post-stroke aphasia. Cortex 2023; 165:86-100. [PMID: 37271014] [PMCID: PMC10850036] [DOI: 10.1016/j.cortex.2023.04.011]
Abstract
Aphasia is a language disorder that often involves speech comprehension impairments affecting communication. In face-to-face settings, speech is accompanied by mouth and facial movements, but little is known about the extent to which they benefit aphasic comprehension. This study investigated the benefit of visual information accompanying speech for word comprehension in people with aphasia (PWA) and the neuroanatomic substrates of any benefit. Thirty-six PWA and 13 neurotypical matched control participants performed a picture-word verification task in which they indicated whether a picture of an animate/inanimate object matched a subsequent word produced by an actress in a video. Stimuli were either audiovisual (with visible mouth and facial movements) or auditory-only (still picture of a silhouette) with audio being clear (unedited) or degraded (6-band noise-vocoding). We found that visual speech information was more beneficial for neurotypical participants than PWA, and more beneficial for both groups when speech was degraded. A multivariate lesion-symptom mapping analysis for the degraded speech condition showed that lesions to superior temporal gyrus, underlying insula, primary and secondary somatosensory cortices, and inferior frontal gyrus were associated with reduced benefit of audiovisual compared to auditory-only speech, suggesting that the integrity of these fronto-temporo-parietal regions may facilitate cross-modal mapping. These findings provide initial insights into our understanding of the impact of audiovisual information on comprehension in aphasia and the brain regions mediating any benefit.
Affiliation(s)
- Anna Krason
  - Experimental Psychology, University College London, UK
  - Moss Rehabilitation Research Institute, Elkins Park, PA, USA
- Gabriella Vigliocco
  - Experimental Psychology, University College London, UK
  - Moss Rehabilitation Research Institute, Elkins Park, PA, USA
- Marja-Liisa Mailend
  - Moss Rehabilitation Research Institute, Elkins Park, PA, USA
  - Department of Special Education, University of Tartu, Tartu Linn, Estonia
- Harrison Stoll
  - Moss Rehabilitation Research Institute, Elkins Park, PA, USA
  - Applied Cognitive and Brain Science, Drexel University, Philadelphia, PA, USA
- Laurel J Buxbaum
  - Moss Rehabilitation Research Institute, Elkins Park, PA, USA
  - Department of Rehabilitation Medicine, Thomas Jefferson University, Philadelphia, PA, USA
4. Keshishian M, Akkol S, Herrero J, Bickel S, Mehta AD, Mesgarani N. Joint, distributed and hierarchically organized encoding of linguistic features in the human auditory cortex. Nat Hum Behav 2023; 7:740-753. [PMID: 36864134] [PMCID: PMC10417567] [DOI: 10.1038/s41562-023-01520-0]
Abstract
The precise role of the human auditory cortex in representing speech sounds and transforming them to meaning is not yet fully understood. Here we used intracranial recordings from the auditory cortex of neurosurgical patients as they listened to natural speech. We found an explicit, temporally ordered and anatomically distributed neural encoding of multiple linguistic features, including phonetic features, prelexical phonotactics, word frequency, and lexical-phonological and lexical-semantic information. Grouping neural sites on the basis of their encoded linguistic features revealed a hierarchical pattern, with distinct representations of prelexical and postlexical features distributed across various auditory areas. While sites with longer response latencies and greater distance from the primary auditory cortex encoded higher-level linguistic features, the encoding of lower-level features was preserved and not discarded. Our study reveals a cumulative mapping of sound to meaning and provides empirical evidence for validating neurolinguistic and psycholinguistic models of spoken word recognition that preserve the acoustic variations in speech.
Affiliation(s)
- Menoua Keshishian
  - Department of Electrical Engineering, Columbia University, New York, NY, USA
  - Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
- Serdar Akkol
  - Feinstein Institutes for Medical Research, Northwell Health, Manhasset, NY, USA
- Jose Herrero
  - Feinstein Institutes for Medical Research, Northwell Health, Manhasset, NY, USA
  - Department of Neurosurgery, Hofstra-Northwell School of Medicine, Manhasset, NY, USA
- Stephan Bickel
  - Feinstein Institutes for Medical Research, Northwell Health, Manhasset, NY, USA
  - Department of Neurosurgery, Hofstra-Northwell School of Medicine, Manhasset, NY, USA
- Ashesh D Mehta
  - Feinstein Institutes for Medical Research, Northwell Health, Manhasset, NY, USA
  - Department of Neurosurgery, Hofstra-Northwell School of Medicine, Manhasset, NY, USA
- Nima Mesgarani
  - Department of Electrical Engineering, Columbia University, New York, NY, USA
  - Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
5. Caravaglios G, Muscoso EG, Blandino V, Di Maria G, Gangitano M, Graziano F, Guajana F, Piccoli T. EEG Resting-State Functional Networks in Amnestic Mild Cognitive Impairment. Clin EEG Neurosci 2023; 54:36-50. [PMID: 35758261] [DOI: 10.1177/15500594221110036]
Abstract
Background. Alzheimer's cognitive-behavioral syndrome is the result of impaired connectivity between nerve cells, due to misfolded proteins that accumulate and disrupt specific brain networks. Electroencephalography, because of its excellent temporal resolution, is an optimal approach for assessing the communication between functionally related brain regions. Objective. To detect and compare EEG resting-state networks (RSNs) in patients with amnesic mild cognitive impairment (aMCI) and healthy elderly (HE) subjects. Methods. We recruited 125 aMCI patients and 70 healthy elderly subjects. One hundred and twenty seconds of artifact-free EEG data were selected and compared between the aMCI and HE groups. We applied standardized low-resolution brain electromagnetic tomography (sLORETA) with independent component analysis (ICA) to assess resting-state networks. Each network consisted of a set of images, one for each frequency band (delta, theta, alpha1/2, beta1/2). Results. The functional ICA analysis revealed 17 networks common to both groups. The statistical procedure demonstrated that aMCI patients used some networks differently than HE. The most relevant findings were as follows. aMCI patients had: i) increased delta/beta activity in the superior frontal gyrus and decreased alpha1 activity in the paracentral lobule (i.e., default mode network); ii) greater delta/theta/alpha/beta in the superior frontal gyrus (i.e., attention network); iii) lower alpha in the left superior parietal lobe, as well as lower delta/theta and beta, respectively, in the postcentral and superior frontal gyri (i.e., attention network). Conclusions. Our study confirms that the sLORETA-ICA method is effective in detecting functional resting-state networks, as well as between-group connectivity differences. The findings support the Alzheimer's network disconnection hypothesis.
Affiliation(s)
- G Caravaglios
  - U.O.C. Neurologia, A.O. Cannizzaro per l'emergenza, Catania, Italy
- E G Muscoso
  - U.O.C. Neurologia, A.O. Cannizzaro per l'emergenza, Catania, Italy
- V Blandino
  - Department of Biomedicine, Neuroscience and Advanced Diagnostics (Bi.N.D.), University of Palermo, Palermo, Italy
- G Di Maria
  - U.O.C. Neurologia, A.O. Cannizzaro per l'emergenza, Catania, Italy
- M Gangitano
  - Department of Biomedicine, Neuroscience and Advanced Diagnostics (Bi.N.D.), University of Palermo, Palermo, Italy
- F Graziano
  - U.O.C. Neurologia, A.O. Cannizzaro per l'emergenza, Catania, Italy
- F Guajana
  - U.O.C. Neurologia, A.O. Cannizzaro per l'emergenza, Catania, Italy
- T Piccoli
  - Department of Biomedicine, Neuroscience and Advanced Diagnostics (Bi.N.D.), University of Palermo, Palermo, Italy
6. Sakakura K, Sonoda M, Mitsuhashi T, Kuroda N, Firestone E, O'Hara N, Iwaki H, Lee MH, Jeong JW, Rothermel R, Luat AF, Asano E. Developmental organization of neural dynamics supporting auditory perception. Neuroimage 2022; 258:119342. [PMID: 35654375] [PMCID: PMC9354710] [DOI: 10.1016/j.neuroimage.2022.119342]
Abstract
Purpose: A prominent view of language acquisition involves learning to ignore irrelevant auditory signals through functional reorganization, enabling more efficient processing of relevant information. Yet, few studies have characterized the neural spatiotemporal dynamics supporting rapid detection and subsequent disregard of irrelevant auditory information in the developing brain. To address this unknown, the present study modeled the developmental acquisition of cost-efficient neural dynamics for auditory processing, using intracranial electrocorticographic responses measured in individuals receiving standard-of-care treatment for drug-resistant, focal epilepsy. We also provided evidence demonstrating the maturation of an anterior-to-posterior functional division within the superior temporal gyrus (STG), which is known to exist in the adult STG. Methods: We studied 32 patients undergoing extraoperative electrocorticography (age range: eight months to 28 years) and analyzed 2,039 intracranial electrode sites outside the seizure onset zone, interictal spike-generating areas, and MRI lesions. Patients were given forward (normal) speech sounds, backward-played speech sounds, and signal-correlated noises during a task-free condition. We then quantified sound processing-related neural costs at given time windows using high-gamma amplitude at 70-110 Hz and animated the group-level high-gamma dynamics on a spatially normalized three-dimensional brain surface. Finally, we determined if age independently contributed to high-gamma dynamics across brain regions and time windows. Results: Group-level analysis of noise-related neural costs in the STG revealed developmental enhancement of early high-gamma augmentation and diminution of delayed augmentation. Analysis of speech-related high-gamma activity demonstrated an anterior-to-posterior functional parcellation in the STG. The left anterior STG showed sustained augmentation throughout stimulus presentation, whereas the left posterior STG showed transient augmentation after stimulus onset. We found a double dissociation between the locations and developmental changes in speech sound-related high-gamma dynamics. Early left anterior STG high-gamma augmentation (i.e., within 200 ms post-stimulus onset) showed developmental enhancement, whereas delayed left posterior STG high-gamma augmentation declined with development. Conclusions: Our observations support the model that, with age, the human STG refines neural dynamics to rapidly detect and subsequently disregard uninformative acoustic noises. Our study also supports the notion that the anterior-to-posterior functional division within the left STG is gradually strengthened for efficient speech sound perception after birth.
Affiliation(s)
- Kazuki Sakakura
  - Department of Pediatrics, Children's Hospital of Michigan, Detroit Medical Center, Wayne State University, Detroit, Michigan, 48201, USA
  - Department of Neurosurgery, University of Tsukuba, Tsukuba, 3058575, Japan
- Masaki Sonoda
  - Department of Pediatrics, Children's Hospital of Michigan, Detroit Medical Center, Wayne State University, Detroit, Michigan, 48201, USA
  - Department of Neurosurgery, Yokohama City University, Yokohama, Kanagawa, 2360004, Japan
- Takumi Mitsuhashi
  - Department of Pediatrics, Children's Hospital of Michigan, Detroit Medical Center, Wayne State University, Detroit, Michigan, 48201, USA
  - Department of Neurosurgery, Juntendo University, School of Medicine, Tokyo, 1138421, Japan
- Naoto Kuroda
  - Department of Pediatrics, Children's Hospital of Michigan, Detroit Medical Center, Wayne State University, Detroit, Michigan, 48201, USA
  - Department of Epileptology, Tohoku University Graduate School of Medicine, Sendai, 9808575, Japan
- Ethan Firestone
  - Department of Pediatrics, Children's Hospital of Michigan, Detroit Medical Center, Wayne State University, Detroit, Michigan, 48201, USA
  - Department of Physiology, Wayne State University, Detroit, MI 48201, USA
- Nolan O'Hara
  - Translational Neuroscience Program, Wayne State University, Detroit, Michigan, 48201, USA
- Hirotaka Iwaki
  - Department of Pediatrics, Children's Hospital of Michigan, Detroit Medical Center, Wayne State University, Detroit, Michigan, 48201, USA
  - Department of Epileptology, Tohoku University Graduate School of Medicine, Sendai, 9808575, Japan
- Min-Hee Lee
  - Department of Pediatrics, Children's Hospital of Michigan, Detroit Medical Center, Wayne State University, Detroit, Michigan, 48201, USA
- Jeong-Won Jeong
  - Department of Pediatrics, Children's Hospital of Michigan, Detroit Medical Center, Wayne State University, Detroit, Michigan, 48201, USA
  - Department of Neurology, Children's Hospital of Michigan, Detroit Medical Center, Wayne State University, Detroit, Michigan, 48201, USA
  - Translational Neuroscience Program, Wayne State University, Detroit, Michigan, 48201, USA
- Robert Rothermel
  - Department of Psychiatry, Children's Hospital of Michigan, Detroit Medical Center, Wayne State University, Detroit, Michigan, 48201, USA
- Aimee F Luat
  - Department of Pediatrics, Children's Hospital of Michigan, Detroit Medical Center, Wayne State University, Detroit, Michigan, 48201, USA
  - Department of Neurology, Children's Hospital of Michigan, Detroit Medical Center, Wayne State University, Detroit, Michigan, 48201, USA
  - Department of Pediatrics, Central Michigan University, Mt. Pleasant, MI 48858, USA
- Eishi Asano
  - Department of Pediatrics, Children's Hospital of Michigan, Detroit Medical Center, Wayne State University, Detroit, Michigan, 48201, USA
  - Department of Neurology, Children's Hospital of Michigan, Detroit Medical Center, Wayne State University, Detroit, Michigan, 48201, USA
  - Translational Neuroscience Program, Wayne State University, Detroit, Michigan, 48201, USA
7. Michail G, Senkowski D, Holtkamp M, Wächter B, Keil J. Early beta oscillations in multisensory association areas underlie crossmodal performance enhancement. Neuroimage 2022; 257:119307. [PMID: 35577024] [DOI: 10.1016/j.neuroimage.2022.119307]
Abstract
The combination of signals from different sensory modalities can enhance perception and facilitate behavioral responses. While previous research described crossmodal influences in a wide range of tasks, it remains unclear how such influences drive performance enhancements. In particular, the neural mechanisms underlying performance-relevant crossmodal influences, as well as the latency and spatial profile of such influences are not well understood. Here, we examined data from high-density electroencephalography (N = 30) recordings to characterize the oscillatory signatures of crossmodal facilitation of response speed, as manifested in the speeding of visual responses by concurrent task-irrelevant auditory information. Using a data-driven analysis approach, we found that individual gains in response speed correlated with larger beta power difference (13-25 Hz) between the audiovisual and the visual condition, starting within 80 ms after stimulus onset in the secondary visual cortex and in multisensory association areas in the parietal cortex. In addition, we examined data from electrocorticography (ECoG) recordings in four epileptic patients in a comparable paradigm. These ECoG data revealed reduced beta power in audiovisual compared with visual trials in the superior temporal gyrus (STG). Collectively, our data suggest that the crossmodal facilitation of response speed is associated with reduced early beta power in multisensory association and secondary visual areas. The reduced early beta power may reflect an auditory-driven feedback signal to improve visual processing through attentional gating. These findings improve our understanding of the neural mechanisms underlying crossmodal response speed facilitation and highlight the critical role of beta oscillations in mediating behaviorally relevant multisensory processing.
Affiliation(s)
- Georgios Michail
  - Department of Psychiatry and Psychotherapy, Charité Campus Mitte (CCM), Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Charitéplatz 1, Berlin 10117, Germany
- Daniel Senkowski
  - Department of Psychiatry and Psychotherapy, Charité Campus Mitte (CCM), Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Charitéplatz 1, Berlin 10117, Germany
- Martin Holtkamp
  - Epilepsy-Center Berlin-Brandenburg, Institute for Diagnostics of Epilepsy, Berlin 10365, Germany
  - Department of Neurology, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Charité Campus Mitte (CCM), Charitéplatz 1, Berlin 10117, Germany
- Bettina Wächter
  - Epilepsy-Center Berlin-Brandenburg, Institute for Diagnostics of Epilepsy, Berlin 10365, Germany
- Julian Keil
  - Biological Psychology, Christian-Albrechts-University Kiel, Kiel 24118, Germany
8. The Role of the Interaction between the Inferior Parietal Lobule and Superior Temporal Gyrus in the Multisensory Go/No-go Task. Neuroimage 2022; 254:119140. [PMID: 35342002] [DOI: 10.1016/j.neuroimage.2022.119140]
Abstract
Information from multiple sensory modalities interacts. Using functional magnetic resonance imaging (fMRI), we aimed to identify the neural structures correlated with how a cooccurring sound modulates visual motor response execution. The reaction time (RT) to audiovisual stimuli was significantly faster than the RT to visual stimuli. Signal detection analyses showed no significant difference in perceptual sensitivity (d') between audiovisual and visual stimuli, while the response criterion (β or c) for audiovisual stimuli was lower than for visual stimuli. The functional connectivity between the left inferior parietal lobule (IPL) and bilateral superior temporal gyrus (STG) was enhanced in Go processing compared with No-go processing of audiovisual stimuli. Furthermore, the left precentral gyrus (PreCG) showed enhanced functional connectivity with the bilateral STG and other areas of the ventral stream in Go processing compared with No-go processing of audiovisual stimuli. These results revealed the neuronal network correlated with modulation of motor response execution when visual stimuli are presented along with a cooccurring sound in a multisensory Go/No-go task, including the left IPL, left PreCG, bilateral STG, and areas of the ventral stream. The role of the interaction between the IPL and STG in transforming audiovisual information into motor behavior is discussed. The current study provides a new perspective for exploring potential brain mechanisms underlying how humans execute appropriate behaviors on the basis of multisensory information.
9. Rennig J, Beauchamp MS. Intelligibility of audiovisual sentences drives multivoxel response patterns in human superior temporal cortex. Neuroimage 2022; 247:118796. [PMID: 34906712] [PMCID: PMC8819942] [DOI: 10.1016/j.neuroimage.2021.118796]
Abstract
Regions of the human posterior superior temporal gyrus and sulcus (pSTG/S) respond to the visual mouth movements that constitute visual speech and the auditory vocalizations that constitute auditory speech, and neural responses in pSTG/S may underlie the perceptual benefit of visual speech for the comprehension of noisy auditory speech. We examined this possibility through the lens of multivoxel pattern responses in pSTG/S. BOLD fMRI data were collected from 22 participants presented with speech consisting of English sentences in five different formats: visual-only; auditory with and without added auditory noise; and audiovisual with and without auditory noise. Participants reported the intelligibility of each sentence with a button press, and trials were sorted post hoc into those that were more or less intelligible. Response patterns were measured in regions of the pSTG/S identified with an independent localizer. Noisy audiovisual sentences with very similar physical properties evoked very different response patterns depending on their intelligibility. When a noisy audiovisual sentence was reported as intelligible, the pattern was nearly identical to that elicited by clear audiovisual sentences. In contrast, an unintelligible noisy audiovisual sentence evoked a pattern like that of visual-only sentences. This effect was less pronounced for noisy auditory-only sentences, which evoked similar response patterns regardless of intelligibility. The successful integration of visual and auditory speech produces a characteristic neural signature in pSTG/S, highlighting the importance of this region in generating the perceptual benefit of visual speech.
Affiliation(s)
- Johannes Rennig
  - Division of Neuropsychology, Center of Neurology, Hertie-Institute for Clinical Brain Research, University of Tübingen, Tübingen, Germany
- Michael S Beauchamp
  - Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Richards Medical Research Building, A607, 3700 Hamilton Walk, Philadelphia, PA 19104-6016, United States
10. Romanovska L, Bonte M. How Learning to Read Changes the Listening Brain. Front Psychol 2021; 12:726882. [PMID: 34987442] [PMCID: PMC8721231] [DOI: 10.3389/fpsyg.2021.726882]
Abstract
Reading acquisition reorganizes existing brain networks for speech and visual processing to form novel audio-visual language representations. This requires substantial cortical plasticity that is reflected in changes in brain activation and functional as well as structural connectivity between brain areas. The extent to which a child's brain can accommodate these changes may underlie the high variability in reading outcome in both typical and dyslexic readers. In this review, we focus on reading-induced functional changes of the dorsal speech network in particular and discuss how its reciprocal interactions with the ventral reading network contribute to reading outcome. We discuss how the dynamic and intertwined development of both reading networks may be best captured by approaching reading from a skill learning perspective, using audio-visual learning paradigms and longitudinal designs to follow neuro-behavioral changes while children's reading skills unfold.
Affiliation(s)
- Milene Bonte
- *Correspondence: Linda Romanovska; Milene Bonte
11
Abstract
Human speech perception results from neural computations that transform external acoustic speech signals into internal representations of words. The superior temporal gyrus (STG) contains the nonprimary auditory cortex and is a critical locus for phonological processing. Here, we describe how speech sound representation in the STG relies on fundamentally nonlinear and dynamical processes, such as categorization, normalization, contextual restoration, and the extraction of temporal structure. A spatial mosaic of local cortical sites on the STG exhibits complex auditory encoding for distinct acoustic-phonetic and prosodic features. We propose that as a population ensemble, these distributed patterns of neural activity give rise to abstract, higher-order phonemic and syllabic representations that support speech perception. This review presents a multi-scale, recurrent model of phonological processing in the STG, highlighting the critical interface between auditory and language systems. Expected final online publication date for the Annual Review of Psychology, Volume 73 is January 2022. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
Affiliation(s)
- Ilina Bhaya-Grossman
- Department of Neurological Surgery, University of California, San Francisco, California 94143, USA; Joint Graduate Program in Bioengineering, University of California, Berkeley and San Francisco, California 94720, USA
- Edward F Chang
- Department of Neurological Surgery, University of California, San Francisco, California 94143, USA
12
Karthik G, Plass J, Beltz AM, Liu Z, Grabowecky M, Suzuki S, Stacey WC, Wasade VS, Towle VL, Tao JX, Wu S, Issa NP, Brang D. Visual speech differentially modulates beta, theta, and high gamma bands in auditory cortex. Eur J Neurosci 2021; 54:7301-7317. [PMID: 34587350 DOI: 10.1111/ejn.15482] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/01/2021] [Revised: 08/20/2021] [Accepted: 08/28/2021] [Indexed: 12/13/2022]
Abstract
Speech perception is a central component of social communication. Although principally an auditory process, accurate speech perception in everyday settings is supported by meaningful information extracted from visual cues. Visual speech modulates activity in cortical areas subserving auditory speech perception including the superior temporal gyrus (STG). However, it is unknown whether visual modulation of auditory processing is a unitary phenomenon or, rather, consists of multiple functionally distinct processes. To explore this question, we examined neural responses to audiovisual speech measured from intracranially implanted electrodes in 21 patients with epilepsy. We found that visual speech modulated auditory processes in the STG in multiple ways, eliciting temporally and spatially distinct patterns of activity that differed across frequency bands. In the theta band, visual speech suppressed the auditory response from before auditory speech onset to after auditory speech onset (-93 to 500 ms) most strongly in the posterior STG. In the beta band, suppression was seen in the anterior STG from -311 to -195 ms before auditory speech onset and in the middle STG from -195 to 235 ms after speech onset. In high gamma, visual speech enhanced the auditory response from -45 to 24 ms only in the posterior STG. We interpret the visual-induced changes prior to speech onset as reflecting crossmodal prediction of speech signals. In contrast, modulations after sound onset may reflect a decrease in sustained feedforward auditory activity. These results are consistent with models that posit multiple distinct mechanisms supporting audiovisual speech perception.
Affiliation(s)
- G Karthik
- Department of Psychology, University of Michigan, Ann Arbor, Michigan, USA
- John Plass
- Department of Psychology, University of Michigan, Ann Arbor, Michigan, USA
- Adriene M Beltz
- Department of Psychology, University of Michigan, Ann Arbor, Michigan, USA
- Zhongming Liu
- Department of Biomedical Engineering and Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, Michigan, USA
- Marcia Grabowecky
- Department of Psychology, Northwestern University, Evanston, Illinois, USA
- Satoru Suzuki
- Department of Psychology, Northwestern University, Evanston, Illinois, USA
- William C Stacey
- Department of Neurology and Department of Biomedical Engineering, University of Michigan, Ann Arbor, Michigan, USA
- Vibhangini S Wasade
- Department of Neurology, Henry Ford Hospital, Detroit, Michigan, USA; Department of Neurology, Wayne State University School of Medicine, Detroit, Michigan, USA
- Vernon L Towle
- Department of Neurology, The University of Chicago, Chicago, Illinois, USA
- James X Tao
- Department of Neurology, The University of Chicago, Chicago, Illinois, USA
- Shasha Wu
- Department of Neurology, The University of Chicago, Chicago, Illinois, USA
- Naoum P Issa
- Department of Neurology, The University of Chicago, Chicago, Illinois, USA
- David Brang
- Department of Psychology, University of Michigan, Ann Arbor, Michigan, USA
13
Lasfargues-Delannoy A, Strelnikov K, Deguine O, Marx M, Barone P. Supra-normal skills in processing of visuo-auditory prosodic information by cochlear-implanted deaf patients. Hear Res 2021; 410:108330. [PMID: 34492444 DOI: 10.1016/j.heares.2021.108330] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 01/14/2021] [Revised: 07/08/2021] [Accepted: 08/02/2021] [Indexed: 10/20/2022]
Abstract
Cochlear-implanted (CI) adults with acquired deafness are known to depend on multisensory integration (MSI) skills for speech comprehension, fusing speech-reading skills with their deficient auditory perception. However, little is known about how CI patients perceive prosodic information relating to speech content. Our study aimed to identify how CI patients use MSI between visual and auditory information to process the paralinguistic prosodic information of multimodal speech, and which visual strategies they employ. A psychophysics assessment was developed in which CI patients and hearing controls (NH) had to distinguish between a question and a statement. The controls were separated into two age groups (young and age-matched) to dissociate any effect of aging. In addition, the oculomotor strategies used when facing a speaker in this prosodic decision task were recorded using an eye-tracking device and compared to controls. This study confirmed that prosodic processing is multisensory, but it revealed that CI patients showed significant supra-normal audiovisual integration for prosodic information compared to hearing controls, irrespective of age. This study clearly showed that CI patients had a visuo-auditory gain more than three times larger than that observed in hearing controls. Furthermore, CI participants performed better in the visuo-auditory situation through a specific oculomotor exploration of the face: they fixated the mouth region significantly more than young NH participants, who fixated the eyes, whereas the age-matched controls presented an intermediate exploration pattern divided equally between the eyes and mouth. To conclude, our study demonstrated that CI patients have supra-normal MSI skills when integrating visual and auditory linguistic prosodic information, and that they have developed a specific adaptive strategy that participates directly in speech content comprehension.
Affiliation(s)
- Anne Lasfargues-Delannoy
- Université Fédérale de Toulouse - Université Paul Sabatier (UPS), France; UMR 5549 CerCo, UPS CNRS, France; CHU Toulouse, Service d'Oto-Rhino-Laryngologie (ORL), Otoneurologie et ORL Pédiatrique, Hôpital Pierre Paul Riquet, site Purpan, France.
- Kuzma Strelnikov
- Université Fédérale de Toulouse - Université Paul Sabatier (UPS), France; UMR 5549 CerCo, UPS CNRS, France; CHU Toulouse, France
- Olivier Deguine
- Université Fédérale de Toulouse - Université Paul Sabatier (UPS), France; UMR 5549 CerCo, UPS CNRS, France; CHU Toulouse, Service d'Oto-Rhino-Laryngologie (ORL), Otoneurologie et ORL Pédiatrique, Hôpital Pierre Paul Riquet, site Purpan, France
- Mathieu Marx
- Université Fédérale de Toulouse - Université Paul Sabatier (UPS), France; UMR 5549 CerCo, UPS CNRS, France; CHU Toulouse, Service d'Oto-Rhino-Laryngologie (ORL), Otoneurologie et ORL Pédiatrique, Hôpital Pierre Paul Riquet, site Purpan, France
- Pascal Barone
- Université Fédérale de Toulouse - Université Paul Sabatier (UPS), France; UMR 5549 CerCo, UPS CNRS, France
14
Hamilton LS, Oganian Y, Hall J, Chang EF. Parallel and distributed encoding of speech across human auditory cortex. Cell 2021; 184:4626-4639.e13. [PMID: 34411517 DOI: 10.1016/j.cell.2021.07.019] [Citation(s) in RCA: 80] [Impact Index Per Article: 26.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/15/2020] [Revised: 02/11/2021] [Accepted: 07/19/2021] [Indexed: 12/27/2022]
Abstract
Speech perception is thought to rely on a cortical feedforward serial transformation of acoustic into linguistic representations. Using intracranial recordings across the entire human auditory cortex, electrocortical stimulation, and surgical ablation, we show that cortical processing across areas is not consistent with a serial hierarchical organization. Instead, response latency and receptive field analyses demonstrate parallel and distinct information processing in the primary and nonprimary auditory cortices. This functional dissociation was also observed where stimulation of the primary auditory cortex evokes auditory hallucination but does not distort or interfere with speech perception. Opposite effects were observed during stimulation of nonprimary cortex in superior temporal gyrus. Ablation of the primary auditory cortex does not affect speech perception. These results establish a distributed functional organization of parallel information processing throughout the human auditory cortex and demonstrate an essential independent role for nonprimary auditory cortex in speech processing.
Affiliation(s)
- Liberty S Hamilton
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
- Yulia Oganian
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
- Jeffery Hall
- Department of Neurology and Neurosurgery, McGill University Montreal Neurological Institute, Montreal, QC, H3A 2B4, Canada
- Edward F Chang
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA.
15
Paredes O, López JB, Covantes-Osuna C, Ocegueda-Hernández V, Romo-Vázquez R, Morales JA. A Transcriptome Community-and-Module Approach of the Human Mesoconnectome. ENTROPY (BASEL, SWITZERLAND) 2021; 23:1031. [PMID: 34441171 PMCID: PMC8393183 DOI: 10.3390/e23081031] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 06/24/2021] [Revised: 08/03/2021] [Accepted: 08/06/2021] [Indexed: 12/15/2022]
Abstract
Graph analysis allows exploring transcriptome compartments such as communities and modules for brain mesostructures. In this work, we proposed a bottom-up workflow from a gene regulatory network (GRN) to a brain-wise connectome. We estimated the gene communities across all brain regions from the Allen Brain Atlas transcriptome database. We selected the community-detection method that yielded the highest number of functional mesostructures in the network hierarchy organization, which allowed us to identify specific brain cell functions (e.g., neuroplasticity, axonogenesis and dendritogenesis communities). With these communities, we built brain-wise region modules that represent the connectome. Our findings match previously described anatomical and functional brain circuits, such as the default mode network and the default visual network, supporting the notion that the brain dynamics that carry out low- and higher-order functions originate from the modular composition of a GRN complex network.
Affiliation(s)
- Rebeca Romo-Vázquez
- Computer Sciences Department, Exact Sciences and Engineering University Centre, Universidad de Guadalajara, Guadalajara 44430, Mexico; (O.P.); (J.B.L.); (C.C.-O.); (V.O.-H.)
- J. Alejandro Morales
- Computer Sciences Department, Exact Sciences and Engineering University Centre, Universidad de Guadalajara, Guadalajara 44430, Mexico; (O.P.); (J.B.L.); (C.C.-O.); (V.O.-H.)
16
Liu Y, Shi G, Li M, Xing H, Song Y, Xiao L, Guan Y, Han Z. Early Top-Down Modulation in Visual Word Form Processing: Evidence From an Intracranial SEEG Study. J Neurosci 2021; 41:6102-6115. [PMID: 34011525 PMCID: PMC8276739 DOI: 10.1523/jneurosci.2288-20.2021] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/28/2020] [Revised: 05/04/2021] [Accepted: 05/09/2021] [Indexed: 11/21/2022] Open
Abstract
Visual word recognition, at a minimum, involves the processing of word form and lexical information. Opinions diverge on the spatiotemporal distribution of and interaction between the two types of information. Feedforward theory argues that they are processed sequentially, whereas interactive theory advocates that lexical information is processed fast and modulates early word form processing. To distinguish between the two theories, we applied stereoelectroencephalography (SEEG) to 33 human adults with epilepsy (25 males and eight females) during visual lexical decisions. The stimuli included real words (RWs), pseudowords (PWs) with legal radical positions, nonwords (NWs) with illegal radical positions, and stroke-changed words (SWs) in Chinese. Word form and lexical processing were measured by the word form effect (PW versus NW) and lexical effect (RW versus PW), respectively. Gamma-band (60 ∼ 140 Hz) SEEG activity was treated as an electrophysiological measure. A word form effect was found in eight left brain regions (i.e., the inferior parietal lobe, insula, fusiform, inferior temporal, middle temporal, middle occipital, precentral and postcentral gyri) from 50 ms poststimulus onset, whereas a lexical effect was observed in five left brain regions (i.e., the calcarine, middle temporal, superior temporal, precentral, and postcentral gyri) from 100 ms poststimulus onset. The two effects overlapped in the precentral (300 ∼ 500 ms) and postcentral (100 ∼ 200 ms and 250 ∼ 600 ms) gyri. Moreover, high-level regions provide early feedback to word form regions. These results demonstrate that lexical processing occurs early and modulates word form recognition, providing vital supportive evidence for interactive theory. SIGNIFICANCE STATEMENT A pivotal unresolved dispute in the field of word processing is whether word form recognition is obligatorily modulated by high-level lexical top-down information. 
To address this issue, we applied intracranial SEEG to 33 adults with epilepsy to precisely delineate the spatiotemporal dynamics between processing word form and lexical information during visual word recognition. We observed that lexical processing occurred from 100 ms poststimulus presentation and even spatiotemporally overlapped with word form processing. Moreover, the high-order regions provided feedback to the word form regions in the early stage of word recognition. These results revealed the crucial role of high-level lexical information in word form recognition, deepening our understanding of the functional coupling among brain regions in word processing networks.
Affiliation(s)
- Yi Liu
- State Key Laboratory of Cognitive Neuroscience and Learning and IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing 100875, China
- Gaofeng Shi
- Faculty of International Education of Chinese Language, Beijing Language and Culture University, Beijing 100083, China
- Mingyang Li
- State Key Laboratory of Cognitive Neuroscience and Learning and IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing 100875, China
- Hongbing Xing
- Faculty of International Education of Chinese Language, Beijing Language and Culture University, Beijing 100083, China
- Yan Song
- State Key Laboratory of Cognitive Neuroscience and Learning and IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing 100875, China
- Luchuan Xiao
- State Key Laboratory of Cognitive Neuroscience and Learning and IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing 100875, China
- Yuguang Guan
- Department of Neurosurgery, Sanbo Brain Hospital, Capital Medical University, Beijing 100093, China
- Zaizhu Han
- State Key Laboratory of Cognitive Neuroscience and Learning and IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing 100875, China
17
Visual Influences on Auditory Behavioral, Neural, and Perceptual Processes: A Review. J Assoc Res Otolaryngol 2021; 22:365-386. [PMID: 34014416 PMCID: PMC8329114 DOI: 10.1007/s10162-021-00789-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/13/2020] [Accepted: 02/07/2021] [Indexed: 01/03/2023] Open
Abstract
In a naturalistic environment, auditory cues are often accompanied by information from other senses, which can be redundant with or complementary to the auditory information. Although the multisensory interactions derived from this combination of information and that shape auditory function are seen across all sensory modalities, our greatest body of knowledge to date centers on how vision influences audition. In this review, we attempt to capture the state of our understanding at this point in time regarding this topic. Following a general introduction, the review is divided into 5 sections. In the first section, we review the psychophysical evidence in humans regarding vision's influence in audition, making the distinction between vision's ability to enhance versus alter auditory performance and perception. Three examples are then described that serve to highlight vision's ability to modulate auditory processes: spatial ventriloquism, cross-modal dynamic capture, and the McGurk effect. The final part of this section discusses models that have been built based on available psychophysical data and that seek to provide greater mechanistic insights into how vision can impact audition. The second section reviews the extant neuroimaging and far-field imaging work on this topic, with a strong emphasis on the roles of feedforward and feedback processes, on imaging insights into the causal nature of audiovisual interactions, and on the limitations of current imaging-based approaches. These limitations point to a greater need for machine-learning-based decoding approaches toward understanding how auditory representations are shaped by vision. The third section reviews the wealth of neuroanatomical and neurophysiological data from animal models that highlights audiovisual interactions at the neuronal and circuit level in both subcortical and cortical structures. 
It also speaks to the functional significance of audiovisual interactions for two critically important facets of auditory perception-scene analysis and communication. The fourth section presents current evidence for alterations in audiovisual processes in three clinical conditions: autism, schizophrenia, and sensorineural hearing loss. These changes in audiovisual interactions are postulated to have cascading effects on higher-order domains of dysfunction in these conditions. The final section highlights ongoing work seeking to leverage our knowledge of audiovisual interactions to develop better remediation approaches to these sensory-based disorders, founded in concepts of perceptual plasticity in which vision has been shown to have the capacity to facilitate auditory learning.
18
Ramos Nuñez AI, Yue Q, Pasalar S, Martin RC. The role of left vs. right superior temporal gyrus in speech perception: An fMRI-guided TMS study. BRAIN AND LANGUAGE 2020; 209:104838. [PMID: 32801090 DOI: 10.1016/j.bandl.2020.104838] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/19/2019] [Revised: 05/27/2020] [Accepted: 07/13/2020] [Indexed: 05/15/2023]
Abstract
Debate continues regarding the necessary role of right superior temporal gyrus (STG) regions in sublexical speech perception given the bilateral STG activation often observed in fMRI studies. To evaluate the causal roles, TMS pulses were delivered to inhibit and disrupt neuronal activity at the left and right STG regions during a nonword discrimination task based on peak activations from a blocked fMRI paradigm assessing speech vs. nonspeech perception (N = 20). Relative to a control region located in the posterior occipital lobe, TMS to the left anterior STG (laSTG) led to significantly worse accuracy, whereas TMS to the left posterior STG (lpSTG) and right anterior STG (raSTG) did not. Although the disruption from TMS was significantly greater for the laSTG than for raSTG, the difference in accuracy between the laSTG and lpSTG did not reach significance. The results argue for a causal role of the laSTG but not raSTG in speech perception. Further research is needed to establish the source of the differences between the laSTG and lpSTG.
Affiliation(s)
- Aurora I Ramos Nuñez
- Department of Social Sciences, College of Coastal Georgia, Brunswick, GA 31520, USA.
- Qiuhai Yue
- Department of Psychological Sciences, Rice University, Houston, TX 77005, USA; Department of Psychology, Vanderbilt University, Nashville, TN 37212, USA
- Siavash Pasalar
- Department of Psychological Sciences, Rice University, Houston, TX 77005, USA
- Randi C Martin
- Department of Psychological Sciences, Rice University, Houston, TX 77005, USA
19
Responses to Visual Speech in Human Posterior Superior Temporal Gyrus Examined with iEEG Deconvolution. J Neurosci 2020; 40:6938-6948. [PMID: 32727820 PMCID: PMC7470920 DOI: 10.1523/jneurosci.0279-20.2020] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/21/2020] [Revised: 06/01/2020] [Accepted: 06/02/2020] [Indexed: 12/22/2022] Open
Abstract
Experimentalists studying multisensory integration compare neural responses to multisensory stimuli with responses to the component modalities presented in isolation. This procedure is problematic for multisensory speech perception since audiovisual speech and auditory-only speech are easily intelligible but visual-only speech is not. To overcome this confound, we developed intracranial electroencephalography (iEEG) deconvolution. Individual stimuli always contained both auditory and visual speech, but jittering the onset asynchrony between modalities allowed for the time course of the unisensory responses and the interaction between them to be independently estimated. We applied this procedure to electrodes implanted in human epilepsy patients (both male and female) over the posterior superior temporal gyrus (pSTG), a brain area known to be important for speech perception. iEEG deconvolution revealed sustained positive responses to visual-only speech and larger, phasic responses to auditory-only speech. Confirming results from scalp EEG, responses to audiovisual speech were weaker than responses to auditory-only speech, demonstrating a subadditive multisensory neural computation. Leveraging the spatial resolution of iEEG, we extended these results to show that subadditivity is most pronounced in more posterior aspects of the pSTG. Across electrodes, subadditivity correlated with visual responsiveness, supporting a model in which visual speech enhances the efficiency of auditory speech processing in pSTG. The ability to separate neural processes may make iEEG deconvolution useful for studying a variety of complex cognitive and perceptual tasks. SIGNIFICANCE STATEMENT Understanding speech is one of the most important human abilities. Speech perception uses information from both the auditory and visual modalities. 
It has been difficult to study neural responses to visual speech because visual-only speech is difficult or impossible to comprehend, unlike auditory-only and audiovisual speech. We used intracranial electroencephalography deconvolution to overcome this obstacle. We found that visual speech evokes a positive response in the human posterior superior temporal gyrus, enhancing the efficiency of auditory speech processing.
20
Micheli C, Schepers IM, Ozker M, Yoshor D, Beauchamp MS, Rieger JW. Electrocorticography reveals continuous auditory and visual speech tracking in temporal and occipital cortex. Eur J Neurosci 2020; 51:1364-1376. [PMID: 29888819 PMCID: PMC6289876 DOI: 10.1111/ejn.13992] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/14/2017] [Revised: 05/19/2018] [Accepted: 05/29/2018] [Indexed: 12/11/2022]
Abstract
During natural speech perception, humans must parse temporally continuous auditory and visual speech signals into sequences of words. However, most studies of speech perception present only single words or syllables. We used electrocorticography (subdural electrodes implanted on the brains of epileptic patients) to investigate the neural mechanisms for processing continuous audiovisual speech signals consisting of individual sentences. Using partial correlation analysis, we found that posterior superior temporal gyrus (pSTG) and medial occipital cortex tracked both the auditory and the visual speech envelopes. These same regions, as well as inferior temporal cortex, responded more strongly to a dynamic video of a talking face compared to auditory speech paired with a static face. Occipital cortex and pSTG carry temporal information about both auditory and visual speech dynamics. Visual speech tracking in pSTG may be a mechanism for enhancing perception of degraded auditory speech.
Affiliation(s)
- Cristiano Micheli
- Department of Psychology, Carl von Ossietzky University, Oldenburg, Germany
- Donders Centre for Cognitive Neuroimaging, Radboud University, Nijmegen, The Netherlands
- Inga M Schepers
- Department of Psychology, Carl von Ossietzky University, Oldenburg, Germany
- Research Center Neurosensory Science, Carl von Ossietzky University, Oldenburg, Germany
- Müge Ozker
- Department of Neurosurgery, Baylor College of Medicine, Houston, Texas
- Daniel Yoshor
- Department of Neurosurgery, Baylor College of Medicine, Houston, Texas
- Michael E. DeBakey Veterans Affairs Medical Center, Houston, Texas
- Jochem W Rieger
- Department of Psychology, Carl von Ossietzky University, Oldenburg, Germany
- Research Center Neurosensory Science, Carl von Ossietzky University, Oldenburg, Germany
21
Karas PJ, Magnotti JF, Metzger BA, Zhu LL, Smith KB, Yoshor D, Beauchamp MS. The visual speech head start improves perception and reduces superior temporal cortex responses to auditory speech. eLife 2019; 8:e48116. [PMID: 31393261 PMCID: PMC6687434 DOI: 10.7554/elife.48116] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/02/2019] [Accepted: 07/17/2019] [Indexed: 12/30/2022] Open
Abstract
Visual information about speech content from the talker's mouth is often available before auditory information from the talker's voice. Here we examined perceptual and neural responses to words with and without this visual head start. For both types of words, perception was enhanced by viewing the talker's face, but the enhancement was significantly greater for words with a head start. Neural responses were measured from electrodes implanted over auditory association cortex in the posterior superior temporal gyrus (pSTG) of epileptic patients. The presence of visual speech suppressed responses to auditory speech, more so for words with a visual head start. We suggest that the head start inhibits representations of incompatible auditory phonemes, increasing perceptual accuracy and decreasing total neural responses. Together with previous work showing visual cortex modulation (Ozker et al., 2018b) these results from pSTG demonstrate that multisensory interactions are a powerful modulator of activity throughout the speech perception network.
Affiliation(s)
- Patrick J Karas
- Department of Neurosurgery, Baylor College of Medicine, Houston, United States
- John F Magnotti
- Department of Neurosurgery, Baylor College of Medicine, Houston, United States
- Brian A Metzger
- Department of Neurosurgery, Baylor College of Medicine, Houston, United States
- Lin L Zhu
- Department of Neurosurgery, Baylor College of Medicine, Houston, United States
- Kristen B Smith
- Department of Neurosurgery, Baylor College of Medicine, Houston, United States
- Daniel Yoshor
- Department of Neurosurgery, Baylor College of Medicine, Houston, United States
22
Yi HG, Leonard MK, Chang EF. The Encoding of Speech Sounds in the Superior Temporal Gyrus. Neuron 2019; 102:1096-1110. [PMID: 31220442 PMCID: PMC6602075 DOI: 10.1016/j.neuron.2019.04.023] [Citation(s) in RCA: 171] [Impact Index Per Article: 34.2] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/22/2019] [Revised: 04/08/2019] [Accepted: 04/16/2019] [Indexed: 01/02/2023]
Abstract
The human superior temporal gyrus (STG) is critical for extracting meaningful linguistic features from speech input. Local neural populations are tuned to acoustic-phonetic features of all consonants and vowels and to dynamic cues for intonational pitch. These populations are embedded throughout broader functional zones that are sensitive to amplitude-based temporal cues. Beyond speech features, STG representations are strongly modulated by learned knowledge and perceptual goals. Currently, a major challenge is to understand how these features are integrated across space and time in the brain during natural speech comprehension. We present a theory that temporally recurrent connections within STG generate context-dependent phonological representations, spanning longer temporal sequences relevant for coherent percepts of syllables, words, and phrases.
Affiliation(s)
- Han Gyol Yi
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
- Matthew K Leonard
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
- Edward F Chang
- Department of Neurological Surgery, University of California, San Francisco, 675 Nelson Rising Lane, San Francisco, CA 94158, USA
23
A task-invariant cognitive reserve network. Neuroimage 2018; 178:36-45. PMID: 29772378; DOI: 10.1016/j.neuroimage.2018.05.033.
Abstract
The concept of cognitive reserve (CR) can explain individual differences in susceptibility to cognitive or functional impairment in the presence of age- or disease-related brain changes. Epidemiologic evidence indicates that CR helps maintain performance in the face of pathology across multiple cognitive domains. We therefore tried to identify a single, "task-invariant" CR network that is active during the performance of many disparate tasks. In imaging data acquired from 255 individuals aged 20-80 while performing 12 different cognitive tasks, we used an iterative approach to derive a multivariate network that was expressed during the performance of all tasks, and whose degree of expression correlated with IQ, a proxy for CR. When applied to held-out data or forward-applied to fMRI data from an entirely different activation task, network expression correlated with IQ. Expression of the CR pattern accounted for additional variance in fluid reasoning performance over and above the influence of cortical thickness, and also moderated the relationship between cortical thickness and reasoning performance, consistent with the behavior of a CR network. The identification of a task-invariant CR network supports the idea that life experiences may result in brain processing differences that might provide reserve against age- or disease-related changes across multiple tasks.
24
Ozker M, Yoshor D, Beauchamp MS. Converging Evidence From Electrocorticography and BOLD fMRI for a Sharp Functional Boundary in Superior Temporal Gyrus Related to Multisensory Speech Processing. Front Hum Neurosci 2018; 12:141. PMID: 29740294; PMCID: PMC5928751; DOI: 10.3389/fnhum.2018.00141.
Abstract
Although humans can understand speech using the auditory modality alone, in noisy environments visual speech information from the talker's mouth can rescue otherwise unintelligible auditory speech. To investigate the neural substrates of multisensory speech perception, we compared neural activity from the human superior temporal gyrus (STG) in two datasets. One dataset consisted of direct neural recordings (electrocorticography, ECoG) from surface electrodes implanted in epilepsy patients (this dataset has been previously published). The second dataset consisted of indirect measures of neural activity using blood-oxygen-level-dependent functional magnetic resonance imaging (BOLD fMRI). Both ECoG and fMRI participants viewed the same clear and noisy audiovisual speech stimuli and performed the same speech recognition task. Both techniques demonstrated a sharp functional boundary in the STG, spatially coincident with an anatomical boundary defined by the posterior edge of Heschl's gyrus. Cortex on the anterior side of the boundary responded more strongly to clear audiovisual speech than to noisy audiovisual speech, while cortex on the posterior side of the boundary did not. For both ECoG and fMRI measurements, the transition between the functionally distinct regions happened within 10 mm of anterior-to-posterior distance along the STG. We relate this boundary to the multisensory neural code underlying speech perception and propose that it represents an important functional division within the human speech perception network.
Affiliation(s)
- Muge Ozker
- Department of Neurosurgery, Baylor College of Medicine, Houston, TX, United States
- Daniel Yoshor
- Department of Neurosurgery, Baylor College of Medicine, Houston, TX, United States; Michael E. DeBakey Veterans Affairs Medical Center, Houston, TX, United States
- Michael S Beauchamp
- Department of Neurosurgery, Baylor College of Medicine, Houston, TX, United States
25
Hauswald A, Lithari C, Collignon O, Leonardelli E, Weisz N. A Visual Cortical Network for Deriving Phonological Information from Intelligible Lip Movements. Curr Biol 2018; 28:1453-1459.e3. PMID: 29681475; PMCID: PMC5956463; DOI: 10.1016/j.cub.2018.03.044.
Abstract
Successful lip-reading requires a mapping from visual to phonological information [1]. Recently, visual and motor cortices have been implicated in tracking lip movements (e.g., [2]). It remains unclear, however, whether visuo-phonological mapping already occurs at the level of the visual cortex; that is, whether this structure tracks the acoustic signal in a functionally relevant manner. To elucidate this, we investigated how the cortex tracks (i.e., entrains to) absent acoustic speech signals carried by silent lip movements. Crucially, we contrasted entrainment to unheard forward (intelligible) and backward (unintelligible) acoustic speech. We observed that the visual cortex exhibited stronger entrainment to the unheard forward acoustic speech envelope than to the unheard backward acoustic speech envelope. Supporting the notion of a visuo-phonological mapping process, this forward-backward difference in occipital entrainment was not present for actually observed lip movements. Importantly, the respective occipital region received more top-down input, especially from left premotor, primary motor, and somatosensory regions and, to a lesser extent, also from posterior temporal cortex. Strikingly, across participants, the extent of top-down modulation of the visual cortex stemming from these regions partially correlated with the strength of entrainment to the absent acoustic forward speech envelope, but not to the present forward lip movements. Our findings demonstrate that a distributed cortical network, including key dorsal-stream auditory regions [3-5], influences how the visual cortex shows sensitivity to the intelligibility of speech while tracking silent lip movements.
Affiliation(s)
- Anne Hauswald
- Centre for Cognitive Neurosciences, University of Salzburg, Salzburg 5020, Austria; CIMeC, Center for Mind/Brain Sciences, Università degli studi di Trento, Trento 38123, Italy
- Chrysa Lithari
- Centre for Cognitive Neurosciences, University of Salzburg, Salzburg 5020, Austria; CIMeC, Center for Mind/Brain Sciences, Università degli studi di Trento, Trento 38123, Italy
- Olivier Collignon
- CIMeC, Center for Mind/Brain Sciences, Università degli studi di Trento, Trento 38123, Italy; Institute of Research in Psychology & Institute of NeuroScience, Université catholique de Louvain, Louvain 1348, Belgium
- Elisa Leonardelli
- CIMeC, Center for Mind/Brain Sciences, Università degli studi di Trento, Trento 38123, Italy
- Nathan Weisz
- Centre for Cognitive Neurosciences, University of Salzburg, Salzburg 5020, Austria; CIMeC, Center for Mind/Brain Sciences, Università degli studi di Trento, Trento 38123, Italy
26
Regenbogen C, Seubert J, Johansson E, Finkelmeyer A, Andersson P, Lundström JN. The intraparietal sulcus governs multisensory integration of audiovisual information based on task difficulty. Hum Brain Mapp 2017; 39:1313-1326. PMID: 29235185; DOI: 10.1002/hbm.23918.
Abstract
Object recognition benefits maximally from multimodal sensory input when stimulus presentation is noisy or degraded. Whether this advantage can be attributed specifically to the extent of overlap in object-related information or, rather, to object-unspecific enhancement due to the mere presence of additional sensory stimulation remains unclear. Further, the cortical processing differences driving increased multisensory integration (MSI) for degraded compared with clear information remain poorly understood. Here, two consecutive studies first compared behavioral benefits of audiovisual overlap of object-related information relative to conditions where one channel carried information and the other carried noise. A hierarchical drift diffusion model indicated performance enhancement when auditory and visual object-related information was simultaneously present for degraded stimuli. A subsequent fMRI study revealed visual dominance on a behavioral and neural level for clear stimuli, while degraded stimulus processing was mainly characterized by activation of a frontoparietal multisensory network, including the intraparietal sulcus (IPS). Connectivity analyses indicated that integration of degraded object-related information relied on IPS input, whereas clear stimuli were integrated through direct information exchange between visual and auditory sensory cortices. These results indicate that the inverse effectiveness observed for identification of degraded relative to clear objects in behavior and brain activation might be facilitated by selective recruitment of an executive cortical network that uses the IPS as a relay mediating crossmodal sensory information exchange.
Affiliation(s)
- Christina Regenbogen
- Department of Clinical Neuroscience, Karolinska Institutet, Stockholm, Sweden; Department of Psychiatry, Psychotherapy and Psychosomatics, Medical School, RWTH Aachen University, Germany; JARA-BRAIN Institute 1: Structure-Function Relationship: Decoding the Human Brain at Systemic Levels, Forschungszentrum Jülich, Germany
- Janina Seubert
- Department of Clinical Neuroscience, Karolinska Institutet, Stockholm, Sweden; Department of Neurobiology, Aging Research Center, Care Sciences and Society, Karolinska Institute and Stockholm University, Stockholm, Sweden
- Emilia Johansson
- Department of Clinical Neuroscience, Karolinska Institutet, Stockholm, Sweden
- Andreas Finkelmeyer
- Institute of Neuroscience, Newcastle University, Newcastle-upon-Tyne, United Kingdom
- Patrik Andersson
- Department of Clinical Neuroscience, Karolinska Institutet, Stockholm, Sweden; Stockholm University Brain Imaging Centre, Stockholm University, Sweden
- Johan N Lundström
- Department of Clinical Neuroscience, Karolinska Institutet, Stockholm, Sweden; Monell Chemical Senses Center, Philadelphia, Pennsylvania; Department of Psychology, University of Pennsylvania, Philadelphia, Pennsylvania