1
Borjigin A, Bharadwaj HM. Individual Differences Elucidate the Perceptual Benefits Associated with Robust Temporal Fine-Structure Processing. bioRxiv 2024:2023.09.20.558670. PMID: 37790457; PMCID: PMC10542537; DOI: 10.1101/2023.09.20.558670.
Abstract
The auditory system is unique among sensory systems in its ability to phase lock to, and precisely follow, very fast cycle-by-cycle fluctuations in the phase of sound-driven cochlear vibrations. Yet the perceptual role of this temporal fine structure (TFS) code is debated. This fundamental gap is attributable to our inability to experimentally manipulate TFS cues without altering other perceptually relevant cues. Here, we circumvented this limitation by leveraging individual differences across 200 participants to systematically compare variation in TFS sensitivity with performance on a range of speech perception tasks. TFS sensitivity was assessed through detection of interaural time/phase differences, while speech perception was evaluated by word identification under noise interference. Results suggest that greater TFS sensitivity is not associated with greater masking release from fundamental-frequency or spatial cues, but appears to contribute to resilience against the effects of reverberation. We also found that greater TFS sensitivity is associated with faster response times, indicating reduced listening effort. These findings highlight the perceptual significance of TFS coding for everyday hearing.
Affiliation(s)
- Agudemu Borjigin
- Weldon School of Biomedical Engineering, Purdue University, West Lafayette, IN 47907, USA
- Waisman Center, University of Wisconsin - Madison, Madison, WI 53705, USA
- Hari M. Bharadwaj
- Weldon School of Biomedical Engineering, Purdue University, West Lafayette, IN 47907, USA
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, IN 47907, USA
- Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, PA 15213, USA
2
Schirmer J, Wolpert S, Dapper K, Rühle M, Wertz J, Wouters M, Eldh T, Bader K, Singer W, Gaudrain E, Başkent D, Verhulst S, Braun C, Rüttiger L, Munk MHJ, Dalhoff E, Knipper M. Neural Adaptation at Stimulus Onset and Speed of Neural Processing as Critical Contributors to Speech Comprehension Independent of Hearing Threshold or Age. J Clin Med 2024; 13:2725. PMID: 38731254; PMCID: PMC11084258; DOI: 10.3390/jcm13092725.
Abstract
Background: It is assumed that speech comprehension deficits in background noise are caused by age-related or acquired hearing loss. Methods: We examined young, middle-aged, and older individuals with and without hearing threshold loss using pure-tone (PT) audiometry, short-pulsed distortion-product otoacoustic emissions (pDPOAEs), auditory brainstem responses (ABRs), auditory steady-state responses (ASSRs), speech comprehension (OLSA), and syllable discrimination in quiet and noise. Results: A noticeable decline of hearing sensitivity in extended high-frequency regions, and its influence on low-frequency-induced ABRs, was striking. When OLSA thresholds were normalized for PT thresholds (PTTs), marked differences in speech comprehension ability existed not only in noise but also in quiet, and throughout the whole age range investigated. Listeners with poor speech comprehension in quiet exhibited relatively lower pDPOAEs and, thus, cochlear amplifier performance independent of PTT, smaller and delayed ABRs, and lower performance in vowel-phoneme discrimination below phase-locking limits (/o/-/u/). When OLSA was tested in noise, listeners with poor speech comprehension independent of PTT had larger pDPOAEs and, thus, cochlear amplifier performance, larger ASSR amplitudes, and higher uncomfortable loudness levels, all linked with lower performance in vowel-phoneme discrimination above the phase-locking limit (/i/-/y/). Conclusions: This study indicates that listening in noise incurs a sizable disadvantage in envelope coding when basilar-membrane compression is compromised. Clearly, and in contrast to previous assumptions, both good and poor speech comprehension can exist independently of differences in PTTs and age, a phenomenon that urgently requires improved techniques to diagnose sound processing at stimulus onset in the clinical routine.
Affiliation(s)
- Jakob Schirmer
- Department of Otolaryngology, Head and Neck Surgery, University of Tübingen, Elfriede-Aulhorn-Str. 5, 72076 Tübingen, Germany
- Stephan Wolpert
- Department of Otolaryngology, Head and Neck Surgery, University of Tübingen, Elfriede-Aulhorn-Str. 5, 72076 Tübingen, Germany
- Konrad Dapper
- Department of Otolaryngology, Head and Neck Surgery, University of Tübingen, Elfriede-Aulhorn-Str. 5, 72076 Tübingen, Germany
- Department of Biology, Technical University Darmstadt, 64287 Darmstadt, Germany
- Moritz Rühle
- Department of Otolaryngology, Head and Neck Surgery, University of Tübingen, Elfriede-Aulhorn-Str. 5, 72076 Tübingen, Germany
- Jakob Wertz
- Department of Otolaryngology, Head and Neck Surgery, University of Tübingen, Elfriede-Aulhorn-Str. 5, 72076 Tübingen, Germany
- Marjoleen Wouters
- Department of Information Technology, Ghent University, Technologiepark 126, 9052 Zwijnaarde, Belgium
- Therese Eldh
- Department of Otolaryngology, Head and Neck Surgery, University of Tübingen, Elfriede-Aulhorn-Str. 5, 72076 Tübingen, Germany
- Katharina Bader
- Department of Otolaryngology, Head and Neck Surgery, University of Tübingen, Elfriede-Aulhorn-Str. 5, 72076 Tübingen, Germany
- Wibke Singer
- Department of Otolaryngology, Head and Neck Surgery, University of Tübingen, Elfriede-Aulhorn-Str. 5, 72076 Tübingen, Germany
- Etienne Gaudrain
- Lyon Neuroscience Research Center, Centre National de la Recherche Scientifique UMR5292, Inserm U1028, Université Lyon 1, Centre Hospitalier Le Vinatier-Bâtiment 462-Neurocampus, 95 Boulevard Pinel, 69675 Bron CEDEX, France
- Department of Otorhinolaryngology, University Medical Center Groningen (UMCG), Hanzeplein 1, BB21, 9700 RB Groningen, The Netherlands
- Deniz Başkent
- Department of Otorhinolaryngology, University Medical Center Groningen (UMCG), Hanzeplein 1, BB21, 9700 RB Groningen, The Netherlands
- Sarah Verhulst
- Department of Information Technology, Ghent University, Technologiepark 126, 9052 Zwijnaarde, Belgium
- Christoph Braun
- Magnetoencephalography-Centre and Hertie Institute for Clinical Brain Research, University of Tübingen, Otfried-Müller-Straße 27, 72076 Tübingen, Germany
- Center for Mind and Brain Research, University of Trento, Palazzo Fedrigotti-corso Bettini 31, 38068 Rovereto, Italy
- Lukas Rüttiger
- Department of Otolaryngology, Head and Neck Surgery, University of Tübingen, Elfriede-Aulhorn-Str. 5, 72076 Tübingen, Germany
- Matthias H. J. Munk
- Department of Biology, Technical University Darmstadt, 64287 Darmstadt, Germany
- Department of Psychiatry & Psychotherapy, University of Tübingen, Calwerstraße 14, 72076 Tübingen, Germany
- Ernst Dalhoff
- Department of Otolaryngology, Head and Neck Surgery, University of Tübingen, Elfriede-Aulhorn-Str. 5, 72076 Tübingen, Germany
- Marlies Knipper
- Department of Otolaryngology, Head and Neck Surgery, University of Tübingen, Elfriede-Aulhorn-Str. 5, 72076 Tübingen, Germany
3
Saldías O'Hrens M, Castro C, Espinoza VM, Stoney J, Quezada C, Laukkanen AM. Spectral features related to the auditory perception of twang-like voices. Logoped Phoniatr Vocol 2024:1-18. PMID: 38656176; DOI: 10.1080/14015439.2024.2345373.
Abstract
BACKGROUND To the best of our knowledge, studies on the relationship between spectral energy distribution and the degree of perceived twang in voices are still sparse. Through an auditory-perceptual test, we aimed to explore the spectral features that may relate to the auditory perception of twang-like voices. METHODS Ten judges who were blind to the test's tasks and stimuli rated the amount of twang perceived in seventy-six audio samples. The stimuli consisted of twenty voices recorded from eight CCM singers who sustained the vowel [a:] at different pitches, with and without a twang-like voice quality. Forty filtered and sixteen synthesized-manipulated stimuli were also included. RESULTS AND CONCLUSIONS Based on intra-rater reliability scores, four judges were identified as suitable for inclusion in the analyses. Results showed that the frequencies of F1 and F2 correlated strongly with the auditory perception of twang-like voices (0.90 and 0.74, respectively), whereas F3 showed a moderate negative correlation (-0.52). The frequency difference between F1 and F3 showed a strong negative correlation (-0.82). The mean energy between 1-2 kHz and 2-3 kHz correlated moderately (0.51 and 0.42, respectively). The frequencies of F4 and F5, and the energy above 3 kHz, showed weak correlations. Since spectral changes below 2 kHz have been associated with jaw, lip, and tongue adjustments (i.e. vowel articulation), and a higher vertical laryngeal position may affect the frequencies of all formants (including F1 and F2), our results suggest that vowel articulation and laryngeal height may be relevant when performing twang-like voices.
Affiliation(s)
- Christian Castro
- Departamento de Fonoaudiología, Universidad de Chile, Santiago, Chile
- Department Speech and Language Pathology, Universidad de Valparaíso, Valparaíso, Chile
- PhD Program in Health Sciences and Engineering, Universidad de Valparaíso, Valparaíso, Chile
- Justin Stoney
- New York Vocal Coaching Studio Inc, New York, NY, USA
- Camilo Quezada
- Departamento de Fonoaudiología, Universidad de Chile, Santiago, Chile
- Anne-Maria Laukkanen
- Speech and Voice Research Laboratory, Faculty of Social Sciences, Tampere University, Tampere, Finland
4
Ying R, Stolzberg DJ, Caras ML. Neural correlates of flexible sound perception in the auditory midbrain and thalamus. bioRxiv 2024:2024.04.12.589266. PMID: 38645241; PMCID: PMC11030403; DOI: 10.1101/2024.04.12.589266.
Abstract
Hearing is an active process in which listeners must detect and identify sounds, segregate and discriminate stimulus features, and extract their behavioral relevance. Adaptive changes in sound detection can emerge rapidly, during sudden shifts in acoustic or environmental context, or more slowly as a result of practice. Although we know that context- and learning-dependent changes in the spectral and temporal sensitivity of auditory cortical neurons support many aspects of flexible listening, the contribution of subcortical auditory regions to this process is less well understood. Here, we recorded single- and multi-unit activity from the central nucleus of the inferior colliculus (ICC) and the ventral subdivision of the medial geniculate nucleus (MGV) of Mongolian gerbils under two behavioral contexts: as animals performed an amplitude modulation (AM) detection task and as they were passively exposed to AM sounds. Using a signal detection framework to estimate neurometric sensitivity, we found that neural thresholds in both regions improved during task performance, and this improvement was driven by changes in firing rate rather than phase locking. We also found that ICC and MGV neurometric thresholds improved and correlated with behavioral performance as animals learned to detect small AM depths during a multi-day perceptual training paradigm. Finally, we show that in the MGV, but not the ICC, context-dependent enhancements in AM sensitivity grew stronger over perceptual training, mirroring prior observations in the auditory cortex. Together, our results suggest that the auditory midbrain and thalamus contribute to flexible sound processing and perception over both rapid and slow timescales.
Affiliation(s)
- Rose Ying
- Neuroscience and Cognitive Science Program, University of Maryland, College Park, Maryland, 20742
- Department of Biology, University of Maryland, College Park, Maryland, 20742
- Center for Comparative and Evolutionary Biology of Hearing, University of Maryland, College Park, Maryland, 20742
- Daniel J. Stolzberg
- Department of Biology, University of Maryland, College Park, Maryland, 20742
- Melissa L. Caras
- Neuroscience and Cognitive Science Program, University of Maryland, College Park, Maryland, 20742
- Department of Biology, University of Maryland, College Park, Maryland, 20742
- Center for Comparative and Evolutionary Biology of Hearing, University of Maryland, College Park, Maryland, 20742
- Department of Hearing and Speech Sciences, University of Maryland, College Park, Maryland, 20742
5
Zussino J, Zupan B, Preston R. "The barriers are plentiful." Speech-language pathologists' perspectives of enablers and barriers to hearing assessment for children in metropolitan, regional, and rural Australia: A mixed methods study. Int J Speech Lang Pathol 2024; 26:289-300. PMID: 37318161; DOI: 10.1080/17549507.2023.2215486.
Abstract
PURPOSE Access to hearing assessment is important for children, as poor auditory information can lead to poor speech and oral language development. This study aimed to identify enablers and barriers to accessing hearing assessments for Australian children from the perspective of speech-language pathologists (SLPs), comparing access in metropolitan, regional, and rural areas. METHOD This was a sequential, explanatory mixed-methods study. Forty-nine participants completed the quantitative survey and 14 participated in semi-structured interviews. The study was undertaken online and included participants from metropolitan, regional, and rural parts of Australian states and territories. RESULTS Similar accessibility issues were experienced across geographic locations, and access to hearing assessment was related to the complexity of individual contexts. Speech-language pathologists felt that awareness and knowledge of hearing loss were low among parents and health professionals. Participants discussed barriers such as long wait times, complex criteria, and inefficient services that lead to compromised outcomes for clients. CONCLUSION Barriers to hearing assessment are extensive and multifaceted. Future research might examine the accessibility of the health system in light of the barriers discussed in this research, and whether policies and procedures could be adapted to make services more easily accessible.
Affiliation(s)
- Jenna Zussino
- Central Queensland University, Rockhampton, Australia
- Barbra Zupan
- School of Speech Pathology, Central Queensland University, Rockhampton, Australia
- Robyn Preston
- Public Health, Central Queensland University, Townsville, Australia
6
Mok BA, Viswanathan V, Borjigin A, Singh R, Kafi H, Bharadwaj HM. Web-based psychoacoustics: Hearing screening, infrastructure, and validation. Behav Res Methods 2024; 56:1433-1448. PMID: 37326771; PMCID: PMC10704001; DOI: 10.3758/s13428-023-02101-9.
Abstract
Anonymous web-based experiments are increasingly used in many domains of behavioral research. However, online studies of auditory perception, especially of psychoacoustic phenomena pertaining to low-level sensory processing, are challenging because of limited available control of the acoustics, and the inability to perform audiometry to confirm normal-hearing status of participants. Here, we outline our approach to mitigate these challenges and validate our procedures by comparing web-based measurements to lab-based data on a range of classic psychoacoustic tasks. Individual tasks were created using jsPsych, an open-source JavaScript front-end library. Dynamic sequences of psychoacoustic tasks were implemented using Django, an open-source library for web applications, and combined with consent pages, questionnaires, and debriefing pages. Subjects were recruited via Prolific, a subject recruitment platform for web-based studies. Guided by a meta-analysis of lab-based data, we developed and validated a screening procedure to select participants for (putative) normal-hearing status based on their responses in a suprathreshold task and a survey. Headphone use was standardized by supplementing procedures from prior literature with a binaural hearing task. Individuals meeting all criteria were re-invited to complete a range of classic psychoacoustic tasks. For the re-invited participants, absolute thresholds were in excellent agreement with lab-based data for fundamental frequency discrimination, gap detection, and sensitivity to interaural time delay and level difference. Furthermore, word identification scores, consonant confusion patterns, and co-modulation masking release effect also matched lab-based studies. Our results suggest that web-based psychoacoustics is a viable complement to lab-based research. Source code for our infrastructure is provided.
Affiliation(s)
- Brittany A Mok
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, IN, USA
- Vibha Viswanathan
- Weldon School of Biomedical Engineering, Purdue University, West Lafayette, IN, USA
- Agudemu Borjigin
- Weldon School of Biomedical Engineering, Purdue University, West Lafayette, IN, USA
- Ravinderjit Singh
- Weldon School of Biomedical Engineering, Purdue University, West Lafayette, IN, USA
- Homeira Kafi
- Weldon School of Biomedical Engineering, Purdue University, West Lafayette, IN, USA
- Hari M Bharadwaj
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, IN, USA
- Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, PA, USA
7
Kalchev E. Beyond the Sound Waves: A Comprehensive Exploration of the Burn-In Phenomenon in Audio Equipment Across Physiological, Psychological, and Societal Domains. Cureus 2024; 16:e53097. PMID: 38414701; PMCID: PMC10898501; DOI: 10.7759/cureus.53097.
Abstract
Audio burn-in, often referred to as the process by which audio equipment undergoes a series of played sounds to achieve optimal performance, remains a topic of significant debate within both audiophile communities and relevant scientific fields. While some attribute perceived changes in sound quality to actual physical changes in the equipment, an emerging perspective points to the interplay of physiological, psychological, and social factors that might influence these perceptions. This narrative review delves into the intricate layers of auditory physiology, cognitive sound interpretation, and the wider societal beliefs around burn-in. We underscore the importance of discerning between actual physical changes in audio gear and the multifaceted human factors that potentially modulate our perception of sound. Through a comprehensive exploration, this article illuminates the complexities of this phenomenon, offering insights for both medical professionals and passionate audio enthusiasts and proposing directions for future research.
Affiliation(s)
- Emilian Kalchev
- Diagnostic Imaging, St. Marina University Hospital, Varna, BGR
8
Gautam D, Raza MU, Miyakoshi M, Molina JL, Joshi YB, Clayson PE, Light GA, Swerdlow NR, Sivarao DV. Click-train evoked steady state harmonic response as a novel pharmacodynamic biomarker of cortical oscillatory synchrony. Neuropharmacology 2023; 240:109707. PMID: 37673332; DOI: 10.1016/j.neuropharm.2023.109707.
Abstract
Sensory networks naturally entrain to rhythmic stimuli like a click train delivered at a particular frequency. Such synchronization is integral to information processing, can be measured by electroencephalography (EEG) and is an accessible index of neural network function. Click trains evoke neural entrainment not only at the driving frequency (F), referred to as the auditory steady state response (ASSR), but also at its higher multiples called the steady state harmonic response (SSHR). Since harmonics play an important and non-redundant role in acoustic information processing, we hypothesized that SSHR may differ from ASSR in presentation and pharmacological sensitivity. In female SD rats, a 2 s-long train stimulus was used to evoke ASSR at 20 Hz and its SSHR at 40, 60 and 80 Hz, recorded from a prefrontal epidural electrode. Narrow band evoked responses were evident at all frequencies; signal power was strongest at 20 Hz while phase synchrony was strongest at 80 Hz. SSHR at 40 Hz took the longest time (∼180 ms from stimulus onset) to establish synchrony. The NMDA antagonist MK801 (0.025-0.1 mg/kg) did not consistently affect 20 Hz ASSR phase synchrony but robustly and dose-dependently attenuated synchrony of all SSHR. Evoked power was attenuated by MK801 at 20 Hz ASSR and 40 Hz SSHR only. Thus, presentation as well as pharmacological sensitivity distinguished SSHR from ASSR, making them non-redundant markers of cortical network function. SSHR is a novel and promising translational biomarker of cortical oscillatory dynamics that may have important applications in CNS drug development and personalized medicine.
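The two readouts described above, narrow-band evoked power and inter-trial phase synchrony at the driving frequency and its harmonics, are standard EEG measures; phase synchrony is commonly quantified as an inter-trial phase-locking value (PLV) computed from single-trial spectra. A minimal NumPy sketch on synthetic trials (all parameters and the simulated response are illustrative, not the study's data or analysis code):

```python
import numpy as np

def phase_locking_value(trials, fs, freq):
    """Inter-trial phase-locking value at one frequency.

    trials: (n_trials, n_samples) array of evoked responses.
    PLV = |mean over trials of exp(i*phase)|, from 0 (no synchrony)
    to 1 (identical phase on every trial).
    """
    n = trials.shape[1]
    spectra = np.fft.rfft(trials, axis=1)
    k = int(round(freq * n / fs))          # FFT bin of the target frequency
    phases = np.angle(spectra[:, k])
    return float(np.abs(np.mean(np.exp(1j * phases))))

# Synthetic demo: each 2-s "trial" contains a 20 Hz response plus its
# 40/60/80 Hz harmonics, with ~2 ms latency jitter across trials, so
# phase synchrony degrades faster at higher frequencies.
rng = np.random.default_rng(0)
fs, t = 1000, np.arange(2000) / 1000
jitter = rng.normal(0.0, 0.002, size=50)
trials = np.stack([
    sum(np.sin(2 * np.pi * f * (t - tj)) for f in (20, 40, 60, 80))
    + 0.5 * rng.standard_normal(t.size)    # background noise
    for tj in jitter
])
for f in (20, 40, 60, 80):
    print(f, round(phase_locking_value(trials, fs, f), 2))
```

A fixed latency jitter translates into a phase jitter proportional to frequency, which is one reason driving-frequency and harmonic synchrony need not move together.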
Affiliation(s)
- Deepshila Gautam
- Department of Pharmaceutical Sciences, Bill Gatton College of Pharmacy, East Tennessee State University, Johnson City, TN, 37604, USA
- Muhammad Ummear Raza
- Department of Pharmaceutical Sciences, Bill Gatton College of Pharmacy, East Tennessee State University, Johnson City, TN, 37604, USA
- M Miyakoshi
- Division of Child and Adolescent Psychiatry, Cincinnati Children's Hospital Medical Center, Cincinnati, OH, USA
- J L Molina
- Department of Psychiatry, UCSD School of Medicine, La Jolla, CA, USA
- VISN 22 MIRECC, SD Veterans Administration Health System, La Jolla, CA, USA
- Y B Joshi
- Department of Psychiatry, UCSD School of Medicine, La Jolla, CA, USA
- VISN 22 MIRECC, SD Veterans Administration Health System, La Jolla, CA, USA
- P E Clayson
- Department of Psychology, University of South Florida, Tampa, FL, USA
- G A Light
- Department of Psychiatry, UCSD School of Medicine, La Jolla, CA, USA
- VISN 22 MIRECC, SD Veterans Administration Health System, La Jolla, CA, USA
- N R Swerdlow
- Department of Psychiatry, UCSD School of Medicine, La Jolla, CA, USA
- VISN 22 MIRECC, SD Veterans Administration Health System, La Jolla, CA, USA
- Digavalli V Sivarao
- Department of Pharmaceutical Sciences, Bill Gatton College of Pharmacy, East Tennessee State University, Johnson City, TN, 37604, USA
9
Kuo CY, Liu JW, Wang CH, Juan CH, Hsieh IH. The role of carrier spectral composition in the perception of musical pitch. Atten Percept Psychophys 2023; 85:2083-2099. PMID: 37479873; DOI: 10.3758/s13414-023-02761-x.
Abstract
Temporal envelope fluctuations of natural sounds convey information critical to speech and music processing. In particular, musical pitch perception is assumed to be primarily underpinned by temporal envelope encoding. While increasing evidence demonstrates the importance of carrier fine structure to complex pitch perception, how carrier spectral information affects musical pitch perception is less clear. Here, transposed tones designed to convey identical envelope information across different carriers were used to assess the effects of carrier spectral composition on pitch discrimination and on musical-interval and melody identification. Results showed that pitch discrimination thresholds became lower (better) with increasing carrier frequency from 1 to 10 kHz, with performance comparable to that for pure sinusoids. Musical intervals and melodies defined by the periodicity of sine or harmonic-complex envelopes were identified with greater than 85% accuracy across carriers, even on a 10-kHz carrier. Moreover, interval and melody identification performance improved with increasing carrier frequency up to 6 kHz. Findings suggest a perceptual enhancement of temporal envelope information with increasing carrier spectral region in musical pitch processing, at least for frequencies up to 6 kHz. For carriers in the extended high-frequency region (8-20 kHz), the use of temporal envelope information in musical pitch processing may vary with task requirements. Collectively, these results indicate that the contribution of temporal envelope information to musical pitch perception is more pronounced than previously considered, with ecological implications.
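The transposed tones referred to above are conventionally constructed by half-wave rectifying a low-frequency sinusoid, low-pass filtering the result, and using it to modulate a high-frequency carrier, so the same temporal envelope can ride on different carriers. A minimal NumPy sketch of that standard construction (parameter values and the FFT-based low-pass are illustrative choices, not the study's stimulus code):

```python
import numpy as np

def transposed_tone(f_env, f_carrier, dur, fs):
    """Impose the temporal envelope of a low-frequency tone on a
    high-frequency carrier: half-wave rectify a sinusoid at f_env,
    low-pass it, and multiply by a sinusoidal carrier at f_carrier."""
    t = np.arange(int(dur * fs)) / fs
    rectified = np.maximum(np.sin(2 * np.pi * f_env * t), 0.0)
    # FFT low-pass at 0.2 * f_carrier removes rectifier harmonics that
    # would otherwise introduce resolvable low-frequency spectral cues.
    spectrum = np.fft.rfft(rectified)
    freqs = np.fft.rfftfreq(t.size, 1.0 / fs)
    spectrum[freqs > 0.2 * f_carrier] = 0.0
    envelope = np.fft.irfft(spectrum, n=t.size)
    return envelope * np.sin(2 * np.pi * f_carrier * t)

# A 128 Hz envelope transposed onto a 4 kHz carrier (hypothetical values).
fs = 44100
tone = transposed_tone(f_env=128, f_carrier=4000, dur=0.5, fs=fs)
```

The resulting spectrum sits entirely around the carrier (carrier plus envelope sidebands), which is what lets envelope periodicity be varied independently of spectral region.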
Affiliation(s)
- Chao-Yin Kuo
- Institute of Cognitive Neuroscience, National Central University, No. 300, Zhongda Rd., Zhongli District, Taoyuan City, 320317, Taiwan
- Department of Otolaryngology-Head and Neck Surgery, Tri-Service General Hospital, National Defense Medical Center, Taipei City, Taiwan
- Jia-Wei Liu
- Institute of Cognitive Neuroscience, National Central University, No. 300, Zhongda Rd., Zhongli District, Taoyuan City, 320317, Taiwan
- Chih-Hung Wang
- Department of Otolaryngology-Head and Neck Surgery, Tri-Service General Hospital, National Defense Medical Center, Taipei City, Taiwan
- Chi-Hung Juan
- Institute of Cognitive Neuroscience, National Central University, No. 300, Zhongda Rd., Zhongli District, Taoyuan City, 320317, Taiwan
- Cognitive Intelligence and Precision Healthcare Center, National Central University, No. 300, Zhongda Rd., Zhongli District, Taoyuan City, 320317, Taiwan
- I-Hui Hsieh
- Institute of Cognitive Neuroscience, National Central University, No. 300, Zhongda Rd., Zhongli District, Taoyuan City, 320317, Taiwan
- Cognitive Intelligence and Precision Healthcare Center, National Central University, No. 300, Zhongda Rd., Zhongli District, Taoyuan City, 320317, Taiwan
10
Colligan T, Irish K, Emlen DJ, Wheeler TJ. DISCO: A deep learning ensemble for uncertainty-aware segmentation of acoustic signals. PLoS One 2023; 18:e0288172. PMID: 37494341; PMCID: PMC10370718; DOI: 10.1371/journal.pone.0288172.
Abstract
Recordings of animal sounds enable a wide range of observational inquiries into animal communication, behavior, and diversity. Automated labeling of sound events in such recordings can improve both the throughput and the reproducibility of analysis. Here, we describe our software package for labeling elements in recordings of animal sounds and demonstrate its utility on recordings of beetle courtships and whale songs. The software, DISCO, computes sensible confidence estimates and produces labels with high precision and accuracy. In addition to the core labeling software, it provides a simple tool for labeling training data and a visual system for analyzing the resulting labels. DISCO is open-source and easy to install, works with standard file formats, and presents a low barrier to entry.
Affiliation(s)
- Thomas Colligan
- College of Pharmacy, University of Arizona, Tucson, AZ, United States of America
- Department of Computer Science, University of Montana, Missoula, MT, United States of America
- Kayla Irish
- Department of Computer Science, University of Montana, Missoula, MT, United States of America
- Department of Statistics, University of Washington, Seattle, WA, United States of America
- Douglas J Emlen
- Division of Biological Sciences, University of Montana, Missoula, MT, United States of America
- Travis J Wheeler
- College of Pharmacy, University of Arizona, Tucson, AZ, United States of America
- Department of Computer Science, University of Montana, Missoula, MT, United States of America
11
McCarty MJ, Murphy E, Scherschligt X, Woolnough O, Morse CW, Snyder K, Mahon BZ, Tandon N. Intraoperative cortical localization of music and language reveals signatures of structural complexity in posterior temporal cortex. iScience 2023; 26:107223. PMID: 37485361; PMCID: PMC10362292; DOI: 10.1016/j.isci.2023.107223.
Abstract
Language and music involve the productive combination of basic units into structures. It remains unclear whether brain regions sensitive to linguistic and musical structure are co-localized. We report an intraoperative awake craniotomy in which a left-hemispheric language-dominant professional musician underwent cortical stimulation mapping (CSM) and electrocorticography of music and language perception and production during repetition tasks. Musical sequences were melodic or amelodic, and differed in algorithmic compressibility (Lempel-Ziv complexity). Auditory recordings of sentences differed in syntactic complexity (single vs. multiple phrasal embeddings). CSM of posterior superior temporal gyrus (pSTG) disrupted music perception and production, along with speech production. pSTG and posterior middle temporal gyrus (pMTG) activated for language and music (broadband gamma; 70-150 Hz). pMTG activity was modulated by musical complexity, while pSTG activity was modulated by syntactic complexity. This points to shared resources for music and language comprehension, but distinct neural signatures for the processing of domain-specific structural features.
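The "algorithmic compressibility (Lempel-Ziv complexity)" used to grade the musical sequences is typically the LZ76 phrase-counting measure: scan the symbol sequence left to right and count how many phrases cannot be copied from the history already seen. A minimal sketch of that standard measure (the note-to-symbol coding below is ours, purely illustrative, not the study's stimulus set):

```python
def lz76_complexity(s):
    """LZ76 complexity: number of new phrases met scanning s left to right.

    Each phrase is grown while it still occurs earlier in the sequence
    (overlap allowed); a phrase that cannot be copied from the history
    starts a new count. Low values indicate compressible (repetitive)
    sequences; high values indicate unpredictable ones.
    """
    i, c, n = 0, 0, len(s)
    while i < n:
        l = 1
        while i + l <= n and s[i:i + l] in s[: i + l - 1]:
            l += 1
        c += 1
        i += l
    return c

# Illustrative note sequences coded as one symbol per note:
repetitive = "CDECDECDECDE"   # a looping melodic figure, highly compressible
varied = "CEGBDFACEDBGF"      # a less predictable sequence
print(lz76_complexity(repetitive), lz76_complexity(varied))
```

The classic binary example from the LZ76 literature, "0001101001000101", parses into six phrases, which this function reproduces.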
Collapse
Affiliation(s)
- Meredith J. McCarty
- Vivian L. Smith Department of Neurosurgery, McGovern Medical School, University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Texas Institute for Restorative Neurotechnologies, University of Texas Health Science Center at Houston, Houston, TX 77030, USA
| | - Elliot Murphy
- Vivian L. Smith Department of Neurosurgery, McGovern Medical School, University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Texas Institute for Restorative Neurotechnologies, University of Texas Health Science Center at Houston, Houston, TX 77030, USA
| | - Xavier Scherschligt
- Vivian L. Smith Department of Neurosurgery, McGovern Medical School, University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Texas Institute for Restorative Neurotechnologies, University of Texas Health Science Center at Houston, Houston, TX 77030, USA
| | - Oscar Woolnough
- Vivian L. Smith Department of Neurosurgery, McGovern Medical School, University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Texas Institute for Restorative Neurotechnologies, University of Texas Health Science Center at Houston, Houston, TX 77030, USA
| | - Cale W. Morse
- Vivian L. Smith Department of Neurosurgery, McGovern Medical School, University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Texas Institute for Restorative Neurotechnologies, University of Texas Health Science Center at Houston, Houston, TX 77030, USA
| | - Kathryn Snyder
- Vivian L. Smith Department of Neurosurgery, McGovern Medical School, University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Texas Institute for Restorative Neurotechnologies, University of Texas Health Science Center at Houston, Houston, TX 77030, USA
| | - Bradford Z. Mahon
- Department of Psychology, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Neuroscience Institute, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Nitin Tandon
- Vivian L. Smith Department of Neurosurgery, McGovern Medical School, University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Texas Institute for Restorative Neurotechnologies, University of Texas Health Science Center at Houston, Houston, TX 77030, USA
- Memorial Hermann Hospital, Texas Medical Center, Houston, TX 77030, USA
| |
Collapse
|
12
|
Prinz R. Nothing in evolution makes sense except in the light of code biology. Biosystems 2023; 229:104907. [PMID: 37207840 DOI: 10.1016/j.biosystems.2023.104907] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/04/2023] [Revised: 04/29/2023] [Accepted: 05/02/2023] [Indexed: 05/21/2023]
Abstract
This article highlights the potential contribution of biological codes to the course and dynamics of evolution. The concept of organic codes, developed by Marcello Barbieri, has fundamentally changed our view of how living systems function. The notion that molecular interactions are built on adaptors that arbitrarily link molecules from different "worlds" in a conventional, i.e., rule-based, way departs significantly from the law-based constraints imposed on living things by physical and chemical mechanisms. In other words, living and non-living things behave according to rules and laws, respectively, but this important distinction is rarely considered in current evolutionary theory. The many known codes allow quantification of the codes that operate in a cell, or comparisons between different biological systems, and may pave the way to a quantitative and empirical research agenda in code biology. A starting point for such an endeavour is the introduction of a simple dichotomous classification of structural and regulatory codes. This classification can be used as a tool to analyse and quantify key organising principles of the living world, such as modularity, hierarchy, and robustness, based on organic codes. The implications for evolutionary research are related to the unique dynamics of codes, or 'Eigendynamics' (self-momentum), and how they determine the behaviour of biological systems from within, whereas physical constraints are imposed mainly from without. A speculation on the drivers of macroevolution in light of codes is followed by the conclusion that a meaningful and comprehensive understanding of evolution depends on including codes in the equation of life.
Collapse
|
13
|
Ekberg M, Stavrinos G, Andin J, Stenfelt S, Dahlström Ö. Acoustic Features Distinguishing Emotions in Swedish Speech. J Voice 2023:S0892-1997(23)00103-0. [PMID: 37045739 DOI: 10.1016/j.jvoice.2023.03.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/21/2022] [Revised: 03/09/2023] [Accepted: 03/10/2023] [Indexed: 04/14/2023]
Abstract
Few studies have examined which acoustic features of speech can be used to distinguish between different emotions, and how combinations of acoustic parameters contribute to identification of emotions. The aim of the present study was to investigate which acoustic parameters in Swedish speech are most important for differentiation between, and identification of, the emotions anger, fear, happiness, sadness, and surprise in Swedish sentences. One-way ANOVAs were used to compare acoustic parameters between the emotions, and both simple and multiple logistic regression models were used to examine the contribution of different acoustic parameters to differentiation between emotions. Results showed differences between emotions for several acoustic parameters in Swedish speech: surprise was the most distinct emotion, with significant differences compared to the other emotions across a range of acoustic parameters, while anger and happiness did not differ from each other on any parameter. The logistic regression models showed that fear was the best-predicted emotion while happiness was the most difficult to predict. Frequency- and spectral-balance-related parameters were best at predicting fear. Amplitude- and temporal-related parameters were most important for surprise, while a combination of frequency-, amplitude- and spectral-balance-related parameters was important for sadness. Assuming that there are similarities between acoustic models and how listeners infer emotions in speech, the results suggest that individuals with hearing loss, who have reduced frequency-detection abilities, may have greater difficulty than normal-hearing individuals in identifying fear in Swedish speech. Since happiness and fear relied primarily on amplitude- and spectral-balance-related parameters, their detection is probably facilitated more by hearing aid use.
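The simple logistic regression models described above map acoustic parameters to the probability of an emotion label. A minimal sketch of such a model follows; the feature names and coefficients here are hypothetical, chosen only to illustrate the form of the prediction, not values estimated in the study:

```python
import math

def emotion_probability(features, weights, bias):
    """Binary logistic model: probability that an utterance expresses a
    given emotion, from a weighted sum of acoustic parameters
    (e.g., F0 mean, intensity, spectral balance)."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))
```

With all weighted contributions cancelling, the model returns 0.5 (no evidence either way); larger positive weighted sums push the probability toward 1.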
Collapse
Affiliation(s)
- M Ekberg
- Department of Behavioural Sciences and Learning, Linköping University, Linköping, Östergötland, Sweden.
| | - G Stavrinos
- Department of Behavioural Sciences and Learning, Linköping University, Linköping, Östergötland, Sweden
| | - J Andin
- Department of Behavioural Sciences and Learning, Linköping University, Linköping, Östergötland, Sweden
| | - S Stenfelt
- Department of Biomedical and Clinical Sciences, Linköping University, Linköping, Östergötland, Sweden
| | - Ö Dahlström
- Department of Behavioural Sciences and Learning, Linköping University, Linköping, Östergötland, Sweden
| |
Collapse
|
14
|
Basiński K, Quiroga-Martinez DR, Vuust P. Temporal hierarchies in the predictive processing of melody - From pure tones to songs. Neurosci Biobehav Rev 2023; 145:105007. [PMID: 36535375 DOI: 10.1016/j.neubiorev.2022.105007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/04/2022] [Revised: 11/30/2022] [Accepted: 12/14/2022] [Indexed: 12/23/2022]
Abstract
Listening to musical melodies is a complex task that engages perceptual and memory-related processes. The processes underlying melody cognition happen simultaneously on different timescales, ranging from milliseconds to minutes. Although attempts have been made, research on melody perception is yet to produce a unified framework of how melody processing is achieved in the brain. This may in part be due to the difficulty of integrating concepts such as perception, attention and memory, which pertain to different temporal scales. Recent theories on brain processing, which hold prediction as a fundamental principle, offer potential solutions to this problem and may provide a unifying framework for explaining the neural processes that enable melody perception on multiple temporal levels. In this article, we review empirical evidence for predictive coding on the levels of pitch formation, basic pitch-related auditory patterns, more complex regularity processing extracted from basic patterns, and long-term expectations related to musical syntax. We also identify areas that would benefit from further inquiry and suggest future directions in research on musical melody perception.
Collapse
Affiliation(s)
- Krzysztof Basiński
- Division of Quality of Life Research, Medical University of Gdańsk, Poland
| | - David Ricardo Quiroga-Martinez
- Helen Wills Neuroscience Institute & Department of Psychology, University of California Berkeley, USA; Center for Music in the Brain, Aarhus University & The Royal Academy of Music, Denmark
| | - Peter Vuust
- Center for Music in the Brain, Aarhus University & The Royal Academy of Music, Denmark
| |
Collapse
|
15
|
Oxenham AJ. Questions and controversies surrounding the perception and neural coding of pitch. Front Neurosci 2023; 16:1074752. [PMID: 36699531 PMCID: PMC9868815 DOI: 10.3389/fnins.2022.1074752] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/19/2022] [Accepted: 12/16/2022] [Indexed: 01/12/2023] Open
Abstract
Pitch is a fundamental aspect of auditory perception that plays an important role in our ability to understand speech, appreciate music, and attend to one sound while ignoring others. The questions surrounding how pitch is represented in the auditory system, and how our percept relates to the underlying acoustic waveform, have been a topic of inquiry and debate for well over a century. New findings and technological innovations have led to challenges of some long-standing assumptions and have raised new questions. This article reviews some recent developments in the study of pitch coding and perception and focuses on the topic of how pitch information is extracted from peripheral representations based on frequency-to-place mapping (tonotopy), stimulus-driven auditory-nerve spike timing (phase locking), or a combination of both. Although a definitive resolution has proved elusive, the answers to these questions have potentially important implications for mitigating the effects of hearing loss via devices such as cochlear implants.
Collapse
Affiliation(s)
- Andrew J. Oxenham
- Center for Applied and Translational Sensory Science, University of Minnesota Twin Cities, Minneapolis, MN, United States
- Department of Psychology, University of Minnesota Twin Cities, Minneapolis, MN, United States
| |
Collapse
|
16
|
Forno E, Fra V, Pignari R, Macii E, Urgese G. Spike encoding techniques for IoT time-varying signals benchmarked on a neuromorphic classification task. Front Neurosci 2022; 16:999029. [PMID: 36620463 PMCID: PMC9811205 DOI: 10.3389/fnins.2022.999029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/20/2022] [Accepted: 11/30/2022] [Indexed: 12/24/2022] Open
Abstract
Spiking Neural Networks (SNNs), known for their potential to enable low energy consumption and computational cost, can bring significant advantages to the realm of embedded machine learning for edge applications. However, input coming from standard digital sensors must be encoded into spike trains before it can be processed with neuromorphic computing technologies. We present here a detailed comparison of available spike encoding techniques for the translation of time-varying signals into the event-based signal domain, tested on two different datasets both acquired through commercially available digital devices: the Free Spoken Digit dataset (FSD), consisting of 8-kHz audio files, and the WISDM dataset, composed of 20-Hz recordings of human activity through mobile and wearable inertial sensors. We propose a complete pipeline to benchmark these encoding techniques by performing time-dependent signal classification through a Spiking Convolutional Neural Network (sCNN), including a signal preprocessing step consisting of a bank of filters inspired by the human cochlea, feature extraction by production of a sonogram, transfer learning via an equivalent ANN, and model compression schemes aimed at resource optimization. The resulting performance comparison and analysis provide a powerful practical tool, empowering developers to select the most suitable coding method based on the type of data and the desired processing algorithms, and further expand the applicability of neuromorphic computational paradigms to embedded sensor systems widely employed in the IoT and industrial domains.
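One of the simplest spike encoding techniques in this family is send-on-delta (threshold-crossing) encoding, which emits a signed event whenever the signal has moved by a fixed threshold since the last event. The sketch below is illustrative only and is not the paper's benchmarked pipeline:

```python
def send_on_delta(signal, threshold):
    """Send-on-delta spike encoding: emit (sample_index, +1) when the
    signal has risen by `threshold` since the last event, and
    (sample_index, -1) when it has fallen by `threshold`."""
    events = []
    baseline = signal[0]
    for i, x in enumerate(signal[1:], start=1):
        while x - baseline >= threshold:   # one ON event per threshold step
            events.append((i, +1))
            baseline += threshold
        while baseline - x >= threshold:   # one OFF event per threshold step
            events.append((i, -1))
            baseline -= threshold
    return events
```

A slowly varying signal produces no events at all, which is the source of the energy savings such encodings promise on edge devices.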
Collapse
Affiliation(s)
| | | | | | | | - Gianvito Urgese
- Politecnico di Torino, Electronic Design Automation (EDA) Group, Turin, Italy
| |
Collapse
|
17
|
Guinan JJ. Cochlear amplification in the short-wave region by outer hair cells changing organ-of-Corti area to amplify the fluid traveling wave. Hear Res 2022. [DOI: 10.1016/j.heares.2022.108641] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 11/04/2022]
|
18
|
Hausrat TJ, Vogl C, Neef J, Schweizer M, Yee BK, Strenzke N, Kneussel M. Monoallelic loss of the F-actin-binding protein radixin facilitates startle reactivity and pre-pulse inhibition in mice. Front Cell Dev Biol 2022; 10:987691. [DOI: 10.3389/fcell.2022.987691] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/06/2022] [Accepted: 11/11/2022] [Indexed: 11/29/2022] Open
Abstract
Hearing impairment is one of the most common disorders with a global burden and increasing prevalence in an ever-aging population. Previous research has largely focused on peripheral sensory perception, while the brain circuits of auditory processing and integration remain poorly understood. Mutations in the rdx gene, encoding the F-actin binding protein radixin (Rdx), can induce hearing loss in human patients and homozygous depletion of Rdx causes deafness in mice. However, the precise physiological function of Rdx in hearing and auditory information processing is still ill-defined. Here, we investigated consequences of rdx monoallelic loss in the mouse. Unlike the homozygous (−/−) rdx knockout, which is characterized by the degeneration of actin-based stereocilia and subsequent hearing loss, our analysis of heterozygous (+/−) mutants has revealed a different phenotype. Specifically, monoallelic loss of rdx potentiated the startle reflex in response to acoustic stimulation of increasing intensities, suggesting a gain of function relative to wildtype littermates. The monoallelic loss of the rdx gene also facilitated pre-pulse inhibition of the acoustic startle reflex induced by weak auditory pre-pulse stimuli, indicating a modification to the circuit underlying sensorimotor gating of auditory input. However, the auditory brainstem response (ABR)-based hearing thresholds revealed a mild impairment in peripheral sound perception in rdx (+/-) mice, suggesting minor aberration of stereocilia structural integrity. Taken together, our data suggest a critical role of Rdx in the top-down processing and/or integration of auditory signals, and therefore a novel perspective to uncover further Rdx-mediated mechanisms in central auditory information processing.
Collapse
|
19
|
Torppa R, Kuuluvainen S, Lipsanen J. The development of cortical processing of speech differs between children with cochlear implants and normal hearing and changes with parental singing. Front Neurosci 2022; 16:976767. [PMID: 36507354 PMCID: PMC9731313 DOI: 10.3389/fnins.2022.976767] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/23/2022] [Accepted: 11/04/2022] [Indexed: 11/21/2022] Open
Abstract
Objective: The aim of the present study was to investigate speech processing development in children with normal hearing (NH) and with cochlear implants (CI) using a multifeature event-related potential (ERP) paradigm. Singing is associated with enhanced attention and speech perception; therefore, its connection to ERPs was investigated in the CI group.
Methods: The paradigm included five change types in a pseudoword: two easy-to-detect (duration, gap) and three difficult-to-detect (vowel, pitch, intensity) with CIs. The positive mismatch responses (pMMR), mismatch negativity (MMN), P3a and late differentiating negativity (LDN) responses of preschoolers (below 6 years 9 months) and schoolchildren (above 6 years 9 months) with NH or CIs at two time points (T1, T2) were investigated with Linear Mixed Modeling (LMM). For the CI group, the association between singing at home and ERP development was modeled with LMM.
Results: Overall, responses elicited by the easy- and difficult-to-detect changes differed between the CI and NH groups. Compared to the NH group, the CI group had smaller MMNs to vowel duration changes and gaps, larger P3a responses to gaps, and larger pMMRs and smaller LDNs to vowel identity changes. Preschoolers had smaller P3a responses and larger LDNs to gaps, and larger pMMRs to vowel identity changes, than schoolchildren. In addition, the pMMRs to gaps increased from T1 to T2 in preschoolers. More parental singing in the CI group was associated with increasing pMMR, and less parental singing with decreasing P3a amplitudes, from T1 to T2.
Conclusion: The multifeature paradigm is suitable for assessing cortical speech processing development in children. In children with CIs, cortical discrimination is often reflected in pMMR and P3a responses, and in MMN and LDN responses in children with NH. Moreover, the cortical speech discrimination of children with CIs develops late and, over time and age, their processing of speech sound changes develops as does that of children with NH. Importantly, multisensory activities such as parental singing can lead to improvement in discrimination of, and attention shifting toward, speech changes in children with CIs. These novel results should be taken into account in future research and rehabilitation.
Collapse
Affiliation(s)
- Ritva Torppa
- Department of Psychology and Logopedics, Faculty of Medicine, University of Helsinki, Helsinki, Finland; Cognitive Brain Research Unit, Department of Psychology and Logopedics, Faculty of Medicine, University of Helsinki, Helsinki, Finland; Centre of Excellence in Music, Mind, Body and Brain, Faculty of Medicine, University of Helsinki, Helsinki, Finland
| | - Soila Kuuluvainen
- Cognitive Brain Research Unit, Department of Psychology and Logopedics, Faculty of Medicine, University of Helsinki, Helsinki, Finland; Department of Digital Humanities, Faculty of Arts, University of Helsinki, Helsinki, Finland
| | - Jari Lipsanen
- Department of Psychology and Logopedics, Faculty of Medicine, University of Helsinki, Helsinki, Finland
| |
Collapse
|
20
|
Lengert L, Lohmann H, Johannsmeier S, Ripken T, Maier H, Heisterkamp A, Kalies S. Optoacoustic tones generated by nanosecond laser pulses can cover the entire human hearing range. JOURNAL OF BIOPHOTONICS 2022; 15:e202200161. [PMID: 36328060 DOI: 10.1002/jbio.202200161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/25/2022] [Revised: 08/01/2022] [Accepted: 08/02/2022] [Indexed: 06/16/2023]
Abstract
The aim of this work is to generate defined tones that cover the human hearing range in aqueous media, for later application in middle or inner ear implants. In our experiments, we investigated the characteristics of single nanosecond laser pulses and of pulse trains at different repetition rates, focused into a small volume of aqueous media. The frequency of the generated tones was limited by the spectral properties of the single acoustic pulses, which depended on the medium. Tones with fundamental frequencies above 8 kHz were generated using laser pulses focused into water. By replacing water with gel, tones between 500 Hz and 20 kHz could be produced. The generation of tones in the low-frequency range was only possible when laser pulse trains with pulse-density-modulated pulse patterns were applied in gel. This enabled the generation of tones between 20 Hz and 2 kHz. Consequently, combining different pulse patterns for the different frequency ranges allows generating optoacoustic tones between 20 Hz and 20 kHz in gel. Thus, we can cover the complete range of human hearing with optoacoustically generated tones.
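Pulse-density modulation of the kind described above can be sketched as a first-order sigma-delta loop: for each laser-pulse slot at the repetition rate, fire only when the accumulated target density crosses one, so that the local pulse density follows the desired tone frequency. The function below is a generic sketch with assumed parameter names; the authors' actual pulse patterns are not specified in the abstract:

```python
import math

def pdm_pulse_train(freq_hz, rep_rate_hz, duration_s):
    """First-order sigma-delta sketch of pulse-density modulation: returns
    a 0/1 firing decision per pulse slot at `rep_rate_hz`, with local pulse
    density following a sinusoid at the target tone frequency `freq_hz`."""
    n_slots = int(duration_s * rep_rate_hz)
    acc = 0.0
    pulses = []
    for k in range(n_slots):
        t = k / rep_rate_hz
        # Target density in [0, 1], oscillating at the tone frequency.
        target = 0.5 * (1.0 + math.sin(2.0 * math.pi * freq_hz * t))
        acc += target
        if acc >= 1.0:      # fire and carry the quantization error forward
            pulses.append(1)
            acc -= 1.0
        else:
            pulses.append(0)
    return pulses
```

Over whole periods of the target sinusoid, the mean firing density equals the mean target density (0.5 here), while the slot-to-slot pattern carries the low-frequency tone.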
Collapse
Affiliation(s)
- Liza Lengert
- Industrial and Biomedical Optics, Laser Zentrum Hannover e.V., Hannover, Germany
- Lower Saxony Centre for Biomedical Engineering, Implant Research and Development (NIFE), Hannover, Germany
- Cluster of Excellence Hearing4all, Hannover and Oldenburg, Hannover, Germany
| | - Hinnerk Lohmann
- Institute of Quantum Optics, Leibniz University Hannover, Hannover, Germany
| | - Sonja Johannsmeier
- Industrial and Biomedical Optics, Laser Zentrum Hannover e.V., Hannover, Germany
- Lower Saxony Centre for Biomedical Engineering, Implant Research and Development (NIFE), Hannover, Germany
| | - Tammo Ripken
- Industrial and Biomedical Optics, Laser Zentrum Hannover e.V., Hannover, Germany
- Lower Saxony Centre for Biomedical Engineering, Implant Research and Development (NIFE), Hannover, Germany
- Cluster of Excellence Hearing4all, Hannover and Oldenburg, Hannover, Germany
| | - Hannes Maier
- Lower Saxony Centre for Biomedical Engineering, Implant Research and Development (NIFE), Hannover, Germany
- Cluster of Excellence Hearing4all, Hannover and Oldenburg, Hannover, Germany
- Department of Otorhinolaryngology, Hannover Medical School MHH, Hannover, Germany
| | - Alexander Heisterkamp
- Industrial and Biomedical Optics, Laser Zentrum Hannover e.V., Hannover, Germany
- Lower Saxony Centre for Biomedical Engineering, Implant Research and Development (NIFE), Hannover, Germany
- Cluster of Excellence Hearing4all, Hannover and Oldenburg, Hannover, Germany
- Institute of Quantum Optics, Leibniz University Hannover, Hannover, Germany
| | - Stefan Kalies
- Lower Saxony Centre for Biomedical Engineering, Implant Research and Development (NIFE), Hannover, Germany
- Cluster of Excellence Hearing4all, Hannover and Oldenburg, Hannover, Germany
- Institute of Quantum Optics, Leibniz University Hannover, Hannover, Germany
| |
Collapse
|
21
|
Wang X, Mo Y, Kong F, Guo W, Zhou H, Zheng N, Schnupp JWH, Zheng Y, Meng Q. Cochlear-implant Mandarin tone recognition with a disyllabic word corpus. Front Psychol 2022; 13:1026116. [DOI: 10.3389/fpsyg.2022.1026116] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/23/2022] [Accepted: 09/28/2022] [Indexed: 11/13/2022] Open
Abstract
Despite pitch being considered the primary cue for discriminating lexical tones, there are secondary cues such as loudness contour and duration, which may allow some cochlear implant (CI) tone discrimination even with severely degraded pitch cues. To isolate pitch cues from other cues, we developed a new disyllabic word stimulus set (Di) whose primary (pitch) and secondary (loudness) cues varied independently. This Di set consists of 270 disyllabic words, each having a distinct meaning depending on the perceived tone. Thus, listeners who hear the primary pitch cue clearly may hear a different meaning from listeners who struggle with the pitch cue and must rely on the secondary loudness contour. A lexical tone recognition experiment was conducted, which compared Di with a monosyllabic set of natural recordings. Seventeen CI users and eight normal-hearing (NH) listeners took part in the experiment. Results showed that CI users had poorer pitch-cue encoding, and their tone recognition performance was significantly influenced by the "missing" or "confusing" secondary cues with the Di corpus. Pitch-contour-based tone recognition is still far from satisfactory for CI users compared to NH listeners, even if some appear to integrate multiple cues to achieve high scores. This disyllabic corpus could be used to examine the pitch recognition performance of CI users and the effectiveness of Mandarin tone enhancement strategies based on pitch-cue enhancement. The Di corpus is freely available online: https://github.com/BetterCI/DiTone.
Collapse
|
22
|
Mehta AH, Oxenham AJ. Role of perceptual integration in pitch discrimination at high frequencies. JASA EXPRESS LETTERS 2022; 2:084402. [PMID: 37311192 PMCID: PMC10264831 DOI: 10.1121/10.0013429] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/27/2022] [Accepted: 07/26/2022] [Indexed: 06/15/2023]
Abstract
At very high frequencies, fundamental-frequency difference limens (F0DLs) for five-component harmonic complex tones can be better than predicted by optimal integration of information, assuming performance is limited by noise at the peripheral level, but are in line with predictions based on more central sources of noise. This study investigates whether there is a minimum number of harmonic components needed for such super-optimal integration effects and if harmonic range or inharmonicity affects this super-optimal integration. Results show super-optimal integration, even with two harmonic components and for most combinations of consecutive harmonic, but not inharmonic, components.
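The "optimal integration" benchmark invoked above is the standard signal-detection prediction that independent cues combine as the root sum of squared per-component sensitivities; observed d' exceeding this bound is what the abstract calls super-optimal integration. A minimal sketch of that prediction:

```python
import math

def optimal_dprime(dprimes):
    """Predicted combined sensitivity when per-component d' values carry
    independent noise and are integrated optimally:
    d'_combined = sqrt(sum_i d'_i ** 2)."""
    return math.sqrt(sum(d * d for d in dprimes))
```

For example, five harmonic components each yielding d' = 1 predict a combined d' of sqrt(5), roughly 2.24; measured F0 discrimination better than this would count as super-optimal under a peripheral-noise account.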
Collapse
Affiliation(s)
- Anahita H Mehta
- Department of Psychology, University of Minnesota, Minneapolis, Minnesota 55455, USA
| | - Andrew J Oxenham
- Department of Psychology, University of Minnesota, Minneapolis, Minnesota 55455, USA
| |
Collapse
|
23
|
Steenken F, Oetjen H, Beutelmann R, Carney LH, Koeppl C, Klump GM. Neural processing and perception of Schroeder-phase harmonic tone complexes in the gerbil: Relating single-unit neurophysiology to behavior. Eur J Neurosci 2022; 56:4060-4085. [PMID: 35724973 PMCID: PMC9632632 DOI: 10.1111/ejn.15744] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/21/2021] [Revised: 05/22/2022] [Accepted: 05/25/2022] [Indexed: 11/30/2022]
Abstract
Schroeder-phase harmonic tone complexes have been used in physiological and psychophysical studies in several species to gain insight into cochlear function. Each pitch period of the Schroeder stimulus contains a linear frequency sweep; the duty cycle, sweep velocity, and direction are controlled by parameters of the phase spectrum. Here, responses to a range of Schroeder-phase harmonic tone complexes were studied both behaviorally and in neural recordings from the auditory nerve and inferior colliculus of Mongolian gerbils. Gerbils were able to discriminate Schroeder-phase harmonic tone complexes based on sweep direction, duty cycle, and/or velocity for fundamental frequencies up to 200 Hz. Temporal representation in neural responses based on the van Rossum spike-distance metric, with time constants of either 1 ms or related to the stimulus' period, was compared to average discharge rates. Neural responses and behavioral performance were both expressed in terms of sensitivity, d', to allow direct comparisons. Our results suggest that in the auditory nerve, stimulus fine structure is represented by spike timing while envelope is represented by rate. In the inferior colliculus, both temporal fine structure and envelope appear to be represented best by rate. However, correlations between neural d' values and behavioral sensitivity for sweep direction were strongest for both temporal metrics, for both auditory nerve and inferior colliculus. Furthermore, the high sensitivity observed in the inferior colliculus neural rate-based discrimination suggests that these neurons integrate across multiple inputs arising from the auditory periphery.
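The van Rossum spike-distance metric used above convolves each spike train with a causal exponential kernel (time constant tau) and takes the tau-normalized L2 norm of the difference of the filtered traces; small tau emphasizes fine spike timing, large tau approaches a rate comparison. A minimal discrete-time sketch (illustrative, not the study's analysis code):

```python
import math

def van_rossum_distance(train_a, train_b, tau, dt=1e-4, t_max=1.0):
    """Van Rossum distance between two spike trains (spike times in
    seconds): filter each train with exp(-t/tau), then take the
    tau-normalized L2 norm of the difference of the filtered traces."""
    n = int(t_max / dt)

    def filtered(train):
        trace = [0.0] * n
        for s in train:
            k0 = int(s / dt)
            for k in range(k0, n):  # causal exponential tail of each spike
                trace[k] += math.exp(-(k * dt - s) / tau)
        return trace

    fa, fb = filtered(train_a), filtered(train_b)
    sq = sum((a - b) ** 2 for a, b in zip(fa, fb))
    return math.sqrt(sq * dt / tau)
```

With this normalization, a single unmatched spike contributes a distance of about sqrt(1/2), independent of tau, which makes distances comparable across the 1-ms and period-related time constants mentioned above.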
Collapse
Affiliation(s)
- Friederike Steenken
- Cluster of Excellence "Hearing4all" and Research Centre Neurosensory Science, Department of Neuroscience, School of Medicine and Health Science, Carl von Ossietzky University Oldenburg, Oldenburg, Germany
| | - Henning Oetjen
- Cluster of Excellence "Hearing4all" and Research Centre Neurosensory Science, Department of Neuroscience, School of Medicine and Health Science, Carl von Ossietzky University Oldenburg, Oldenburg, Germany
| | - Rainer Beutelmann
- Cluster of Excellence "Hearing4all" and Research Centre Neurosensory Science, Department of Neuroscience, School of Medicine and Health Science, Carl von Ossietzky University Oldenburg, Oldenburg, Germany
| | - Laurel H Carney
- Department of Biomedical Engineering, University of Rochester, Rochester, NY, USA; Hanse-Wissenschaftskolleg, Delmenhorst, Germany
| | - Christine Koeppl
- Cluster of Excellence "Hearing4all" and Research Centre Neurosensory Science, Department of Neuroscience, School of Medicine and Health Science, Carl von Ossietzky University Oldenburg, Oldenburg, Germany
| | - Georg M Klump
- Cluster of Excellence "Hearing4all" and Research Centre Neurosensory Science, Department of Neuroscience, School of Medicine and Health Science, Carl von Ossietzky University Oldenburg, Oldenburg, Germany
| |
Collapse
|
24
|
Dellaferrera G, Asabuki T, Fukai T. Modeling the Repetition-Based Recovering of Acoustic and Visual Sources With Dendritic Neurons. Front Neurosci 2022; 16:855753. [PMID: 35573290 PMCID: PMC9097820 DOI: 10.3389/fnins.2022.855753] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/15/2022] [Accepted: 03/31/2022] [Indexed: 11/13/2022] Open
Abstract
In natural auditory environments, acoustic signals originate from the temporal superimposition of different sound sources. The problem of inferring individual sources from ambiguous mixtures of sounds is known as blind source decomposition. Experiments on humans have demonstrated that the auditory system can identify sound sources as repeating patterns embedded in the acoustic input. Source repetition produces temporal regularities that can be detected and used for segregation. Specifically, listeners can identify sounds occurring more than once across different mixtures, but not sounds heard only in a single mixture. However, whether such a behavior can be computationally modeled has not yet been explored. Here, we propose a biologically inspired computational model to perform blind source separation on sequences of mixtures of acoustic stimuli. Our method relies on a somatodendritic neuron model trained with a Hebbian-like learning rule which was originally conceived to detect spatio-temporal patterns recurring in synaptic inputs. We show that the segregation capabilities of our model are reminiscent of the features of human performance in a variety of experimental settings involving synthesized sounds with naturalistic properties. Furthermore, we extend the study to investigate the properties of segregation on task settings not yet explored with human subjects, namely natural sounds and images. Overall, our work suggests that somatodendritic neuron models offer a promising neuro-inspired learning strategy to account for the characteristics of the brain segregation capabilities as well as to make predictions on yet untested experimental settings.
Collapse
Affiliation(s)
- Giorgia Dellaferrera
- Neural Coding and Brain Computing Unit, Okinawa Institute of Science and Technology, Okinawa, Japan
- Institute of Neuroinformatics, University of Zurich and Swiss Federal Institute of Technology Zurich (ETH), Zurich, Switzerland
| | - Toshitake Asabuki
- Neural Coding and Brain Computing Unit, Okinawa Institute of Science and Technology, Okinawa, Japan
| | - Tomoki Fukai
- Neural Coding and Brain Computing Unit, Okinawa Institute of Science and Technology, Okinawa, Japan
| |
Collapse
|
25
|
Cooper T, Lai H, Gorlewicz J. Do You Hear What I Hear: The Balancing Act of Designing an Electronic Hockey Puck for Playing Hockey Non-Visually. ACM TRANSACTIONS ON ACCESSIBLE COMPUTING 2022. [DOI: 10.1145/3507660] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/18/2022]
Abstract
Blind hockey is a sport that is gaining popularity in the United States after having an international presence for years. In blind hockey, a modified puck is used that emits sounds via ball bearings that rattle inside the puck when it is moving. The modified puck's lifetime is minimal due to its lack of durability, and it does not provide feedback when the puck stops moving. This article presents an evaluation of multiple prototypes that investigate the appropriate acoustic profiles for an electronic version of a puck that has the ability to overcome some of these challenges. Our approach leverages the use of alternative 3D printable materials and the implementation of four distinct sound profiles: the league-standard puck (LSP) in blind hockey, a 3.5 kHz piezo buzzer, an 800 Hz sine tone, and simulated white noise. We present the design and prototype of the pucks, along with benchtop and user validation tests of the prototypes, comparing them to the LSP with a focus on acoustic performance. Participants rated the white noise sound profile highest in pleasantness and loudness and the LSP highest in localization. The white noise sound profile was associated with lower angle and distance errors. Of the prototypes produced, the white noise prototype puck appeared to demonstrate the most promise for playing hockey non-visually. We close with a discussion of recommendations for future electronic hockey puck designs to support blind hockey moving forward.
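Two of the four sound profiles named in the abstract (the 800 Hz sine tone and the simulated white noise) are simple to synthesize. The sketch below is an illustration only, not the authors' firmware; the sample rate, amplitude, and seed are assumed values.

```python
# Minimal sketch of two of the puck sound profiles (assumed parameters).
import math
import random

def sine_tone(freq_hz, dur_s, sr=44100, amp=0.8):
    """Pure tone, e.g. the 800 Hz profile."""
    n = int(dur_s * sr)
    return [amp * math.sin(2 * math.pi * freq_hz * t / sr) for t in range(n)]

def white_noise(dur_s, sr=44100, amp=0.8, seed=0):
    """Uniform white noise, approximating the 'simulated white noise' profile."""
    rng = random.Random(seed)
    n = int(dur_s * sr)
    return [amp * (2.0 * rng.random() - 1.0) for _ in range(n)]
```

In practice such samples would be scaled to the DAC range of the puck's audio hardware; the broadband spectrum of the noise profile is one plausible reason it localized well for participants.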
Collapse
|
26
|
Domingos LCF, Santos PE, Skelton PSM, Brinkworth RSA, Sammut K. A Survey of Underwater Acoustic Data Classification Methods Using Deep Learning for Shoreline Surveillance. SENSORS 2022; 22:s22062181. [PMID: 35336352 PMCID: PMC8954367 DOI: 10.3390/s22062181] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Subscribe] [Scholar Register] [Received: 12/22/2021] [Revised: 02/06/2022] [Accepted: 02/09/2022] [Indexed: 02/04/2023]
Abstract
This paper presents a comprehensive overview of current deep-learning methods for automatic object classification of underwater sonar data for shoreline surveillance, concentrating mostly on the classification of vessels from passive sonar data and the identification of objects of interest from active sonar (such as minelike objects, human figures or debris of wrecked ships). This work not only provides a systematic description of the state of the art of this field but also identifies five main ingredients in its current development: the application of deep-learning methods using convolutional layers alone; deep-learning methods that apply biologically inspired feature-extraction filters as a preprocessing step; classification of data from frequency and time–frequency analysis; methods using machine learning to extract features from original signals; and transfer learning methods. This paper also describes some of the most important datasets cited in the literature and discusses data-augmentation techniques. The latter are used for coping with the scarcity of annotated sonar datasets from real maritime missions.
Collapse
Affiliation(s)
- Lucas C. F. Domingos
- Department of Electrical and Electronics Engineering, Centro Universitário FEI, Sao Bernardo do Campo 09850-901, SP, Brazil;
- Department of Computer Vision, Instituto de Pesquisas Eldorado, Campinas 13083-898, SP, Brazil
- Correspondence:
| | - Paulo E. Santos
- Department of Electrical and Electronics Engineering, Centro Universitário FEI, Sao Bernardo do Campo 09850-901, SP, Brazil;
- Centre for Defence Engineering Research and Training, College of Science and Engineering, Flinders University, Tonsley, SA 5042, Australia; (P.S.M.S.); (R.S.A.B.); (K.S.)
| | - Phillip S. M. Skelton
- Centre for Defence Engineering Research and Training, College of Science and Engineering, Flinders University, Tonsley, SA 5042, Australia; (P.S.M.S.); (R.S.A.B.); (K.S.)
| | - Russell S. A. Brinkworth
- Centre for Defence Engineering Research and Training, College of Science and Engineering, Flinders University, Tonsley, SA 5042, Australia; (P.S.M.S.); (R.S.A.B.); (K.S.)
| | - Karl Sammut
- Centre for Defence Engineering Research and Training, College of Science and Engineering, Flinders University, Tonsley, SA 5042, Australia; (P.S.M.S.); (R.S.A.B.); (K.S.)
| |
Collapse
|
27
|
Effects of mild-to-moderate sensorineural hearing loss and signal amplification on vocal emotion recognition in middle-aged–older individuals. PLoS One 2022; 17:e0261354. [PMID: 34995305 PMCID: PMC8740977 DOI: 10.1371/journal.pone.0261354] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/20/2021] [Accepted: 11/29/2021] [Indexed: 11/19/2022] Open
Abstract
Previous research has shown deficits in vocal emotion recognition in sub-populations of individuals with hearing loss, making this a high-priority research topic. However, previous research has only examined vocal emotion recognition using verbal material, in which emotions are expressed through emotional prosody. There is evidence that older individuals with hearing loss suffer from deficits in general prosody recognition, not specific to emotional prosody. No study has examined the recognition of non-verbal vocalization, which constitutes another important source for the vocal communication of emotions. It might be the case that individuals with hearing loss have specific difficulties in recognizing emotions expressed through prosody in speech, but not non-verbal vocalizations. We aim to examine whether vocal emotion recognition difficulties in middle-aged to older individuals with sensorineural mild-to-moderate hearing loss are better explained by deficits in vocal emotion recognition specifically, or deficits in prosody recognition generally, by including both sentences and non-verbal expressions. Furthermore, some of the studies which have concluded that individuals with mild-to-moderate hearing loss have deficits in vocal emotion recognition ability have also found that the use of hearing aids does not improve recognition accuracy in this group. We aim to examine the effects of linear amplification and audibility on the recognition of different emotions expressed both verbally and non-verbally. Besides examining accuracy for different emotions, we will also look at patterns of confusion (which specific emotions are mistaken for which other specific emotions, and at what rates) during both amplified and non-amplified listening, and we will analyze all material acoustically and relate the acoustic content to performance. Together these analyses will provide clues to the effects of amplification on the perception of different emotions.
For these purposes, a total of 70 middle-aged to older individuals, half with mild-to-moderate hearing loss and half with normal hearing, will perform a computerized forced-choice vocal emotion recognition task with and without amplification.
Collapse
|
28
|
Knipper M, Singer W, Schwabe K, Hagberg GE, Li Hegner Y, Rüttiger L, Braun C, Land R. Disturbed Balance of Inhibitory Signaling Links Hearing Loss and Cognition. Front Neural Circuits 2022; 15:785603. [PMID: 35069123 PMCID: PMC8770933 DOI: 10.3389/fncir.2021.785603] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/29/2021] [Accepted: 12/08/2021] [Indexed: 12/19/2022] Open
Abstract
Neuronal hyperexcitability in the central auditory pathway linked to reduced inhibitory activity is associated with numerous forms of hearing loss, including noise damage, age-dependent hearing loss, and deafness, as well as tinnitus or auditory processing deficits in autism spectrum disorder (ASD). In most cases, the reduced central inhibitory activity and the accompanying hyperexcitability are interpreted as an active compensatory response to the absence of synaptic activity, linked to increased central neural gain control (increased output activity relative to reduced input). We here suggest that hyperexcitability also could be related to an immaturity or impairment of tonic inhibitory strength that typically develops in an activity-dependent process in the ascending auditory pathway with auditory experience. In these cases, high-SR auditory nerve fibers, which are critical for the shortest latencies and lowest sound thresholds, may have either not matured (possibly in congenital deafness or autism) or are dysfunctional (possibly after sudden, stressful auditory trauma or age-dependent hearing loss linked with cognitive decline). Fast auditory processing deficits can occur despite maintained basal hearing. In that case, tonic inhibitory strength is reduced in ascending auditory nuclei, and fast inhibitory parvalbumin positive interneuron (PV-IN) dendrites are diminished in auditory and frontal brain regions. This leads to deficits in central neural gain control linked to hippocampal LTP/LTD deficiencies, cognitive deficits, and unbalanced extra-hypothalamic stress control. Under these conditions, a diminished inhibitory strength may weaken local neuronal coupling to homeostatic vascular responses required for the metabolic support of auditory adjustment processes. 
We emphasize the need to distinguish these two states of excitatory/inhibitory imbalance in hearing disorders: (i) Under conditions of preserved fast auditory processing and sustained tonic inhibitory strength, an excitatory/inhibitory imbalance following auditory deprivation can maintain precise hearing through a memory-linked, transient disinhibition that leads to enhanced spiking fidelity (central neural gain⇑). (ii) Under conditions of critically diminished fast auditory processing and reduced tonic inhibitory strength, hyperexcitability can be part of an increased synchronization over a broader frequency range, linked to reduced spiking reliability (central neural gain⇓). This latter stage mutually reinforces diminished metabolic support for auditory adjustment processes, increasing the risks for canonical dementia syndromes.
Collapse
Affiliation(s)
- Marlies Knipper
- Department of Otolaryngology, Head and Neck Surgery, Tübingen Hearing Research Center (THRC), Molecular Physiology of Hearing, University of Tübingen, Tübingen, Germany
- *Correspondence: Marlies Knipper,
| | - Wibke Singer
- Department of Otolaryngology, Head and Neck Surgery, Tübingen Hearing Research Center (THRC), Molecular Physiology of Hearing, University of Tübingen, Tübingen, Germany
| | - Kerstin Schwabe
- Experimental Neurosurgery, Department of Neurosurgery, Hannover Medical School, Hanover, Germany
| | - Gisela E. Hagberg
- Department of Biomedical Magnetic Resonance, University Hospital Tübingen (UKT), Tübingen, Germany
- High-Field Magnetic Resonance, Max Planck Institute for Biological Cybernetics, Tübingen, Germany
| | - Yiwen Li Hegner
- MEG Center, University of Tübingen, Tübingen, Germany
- Center of Neurology, Hertie-Institute for Clinical Brain Research, University of Tübingen, Tübingen, Germany
| | - Lukas Rüttiger
- Department of Otolaryngology, Head and Neck Surgery, Tübingen Hearing Research Center (THRC), Molecular Physiology of Hearing, University of Tübingen, Tübingen, Germany
| | - Christoph Braun
- MEG Center, University of Tübingen, Tübingen, Germany
- Center of Neurology, Hertie-Institute for Clinical Brain Research, University of Tübingen, Tübingen, Germany
| | - Rüdiger Land
- Department of Experimental Otology, Institute for Audioneurotechnology, Hannover Medical School, Hanover, Germany
| |
Collapse
|
29
|
An Introduction to Musical Interactions. MULTIMODAL TECHNOLOGIES AND INTERACTION 2022. [DOI: 10.3390/mti6010004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022] Open
Abstract
The article presents a contextual survey of eight contributions in the special issue Musical Interactions (Volume I) in Multimodal Technologies and Interaction. The presentation includes (1) a critical examination of what it means to be musical, to devise the concept of music proper to MTI as well as multicultural proximity, and (2) a conceptual framework for instrumentation, design, and assessment of musical interaction research through five enabling dimensions: Affordance; Design Alignment; Adaptive Learning; Second-Order Feedback; Temporal Integration. Each dimension is discussed and applied in the survey. The results demonstrate how the framework provides an interdisciplinary scope required for musical interaction, and how this approach may offer a coherent way to describe and assess approaches to research and design as well as implementations of interactive musical systems. Musical interaction stipulates musical liveness for experiencing both music and technologies. While music may be considered ontologically incomplete without a listener, musical interaction is defined as ontological completion of a state of music and listening through a listener’s active engagement with musical resources in multimodal information flow.
Collapse
|
30
|
Wagner JD, Gelman A, Hancock KE, Chung Y, Delgutte B. Rabbits use both spectral and temporal cues to discriminate the fundamental frequency of harmonic complexes with missing fundamentals. J Neurophysiol 2022; 127:290-312. [PMID: 34879207 PMCID: PMC8759963 DOI: 10.1152/jn.00366.2021] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/03/2023] Open
Abstract
The pitch of harmonic complex tones (HCTs) common in speech, music, and animal vocalizations plays a key role in the perceptual organization of sound. Unraveling the neural mechanisms of pitch perception requires animal models, but little is known about complex pitch perception by animals, and some species appear to use different pitch mechanisms than humans. Here, we tested rabbits' ability to discriminate the fundamental frequency (F0) of HCTs with missing fundamentals, using a behavioral paradigm inspired by foraging behavior in which rabbits learned to harness a spatial gradient in F0 to find the location of a virtual target within a room for a food reward. Rabbits were initially trained to discriminate HCTs with F0s in the range 400-800 Hz and with harmonics covering a wide frequency range (800-16,000 Hz) and then tested with stimuli differing in spectral composition to test the role of harmonic resolvability (experiment 1) or in F0 range (experiment 2) or in both F0 and spectral content (experiment 3). Together, these experiments show that rabbits can discriminate HCTs over a wide F0 range (200-1,600 Hz) encompassing the range of conspecific vocalizations and can use either the spectral pattern of harmonics resolved by the cochlea for higher F0s or temporal envelope cues resulting from interaction between unresolved harmonics for lower F0s. The qualitative similarity of these results to human performance supports the use of rabbits as an animal model for studies of pitch mechanisms, providing species differences in cochlear frequency selectivity and F0 range of vocalizations are taken into account.NEW & NOTEWORTHY Understanding the neural mechanisms of pitch perception requires experiments in animal models, but little is known about pitch perception by animals. Here we show that rabbits, a popular animal in auditory neuroscience, can discriminate complex sounds differing in pitch using either spectral cues or temporal cues. 
The results suggest that the role of spectral cues in pitch perception by animals may have been underestimated by predominantly testing low frequencies in the range of the human voice.
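The missing-fundamental stimulus described above has a concrete construction: keep only the harmonics of F0 that fall inside a passband (800-16,000 Hz in experiment 1), so the fundamental component itself is absent while the F0 periodicity remains. A minimal sketch, with assumed duration, sample rate, and equal-amplitude harmonics (not the authors' stimulus code):

```python
# Harmonic complex tone with a missing fundamental (illustrative parameters).
import math

def harmonic_numbers(f0, lo=800.0, hi=16000.0):
    """Harmonic ranks n whose frequency n*f0 falls inside [lo, hi]."""
    return [n for n in range(1, int(hi / f0) + 1) if lo <= n * f0 <= hi]

def missing_f0_complex(f0, dur_s=0.05, sr=44100, lo=800.0, hi=16000.0):
    """Sum of equal-amplitude in-band harmonics, normalized to [-1, 1]."""
    ranks = harmonic_numbers(f0, lo, hi)
    n_samp = int(dur_s * sr)
    return [sum(math.sin(2 * math.pi * n * f0 * t / sr) for n in ranks) / len(ranks)
            for t in range(n_samp)]
```

For F0 = 400 Hz this passband keeps harmonics 2-40, so the 400 Hz component is physically absent even though listeners (and, per this study, rabbits) can still discriminate the F0.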
Collapse
Affiliation(s)
- Joseph D. Wagner
- Eaton-Peabody Laboratories, Massachusetts Eye and Ear, Boston, Massachusetts; Department of Biomedical Engineering, Boston University, Boston, Massachusetts
| | - Alice Gelman
- Eaton-Peabody Laboratories, Massachusetts Eye and Ear, Boston, Massachusetts
| | - Kenneth E. Hancock
- Eaton-Peabody Laboratories, Massachusetts Eye and Ear, Boston, Massachusetts; Department of Otolaryngology, Head and Neck Surgery, Harvard Medical School, Boston, Massachusetts
| | - Yoojin Chung
- Eaton-Peabody Laboratories, Massachusetts Eye and Ear, Boston, Massachusetts; Department of Otolaryngology, Head and Neck Surgery, Harvard Medical School, Boston, Massachusetts
| | - Bertrand Delgutte
- Eaton-Peabody Laboratories, Massachusetts Eye and Ear, Boston, Massachusetts; Department of Otolaryngology, Head and Neck Surgery, Harvard Medical School, Boston, Massachusetts
| |
Collapse
|
31
|
Variability in Quantity and Quality of Early Linguistic Experience in Children With Cochlear Implants: Evidence from Analysis of Natural Auditory Environments. Ear Hear 2022; 43:685-698. [PMID: 34611118 PMCID: PMC8881322 DOI: 10.1097/aud.0000000000001136] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]
Abstract
OBJECTIVES Understanding how quantity and quality of language input vary across children with cochlear implants (CIs) is important for explaining sources of large individual differences in language outcomes of this at-risk pediatric population. Studies have mostly focused either on intervention-related, device-related, and/or patient-related factors, or relied on data from parental reports and laboratory-based speech corpora to unravel factors explaining individual differences in language outcomes among children with CIs. However, little is known about the extent to which children with CIs differ in quantity and quality of language input they experience in their natural linguistic environments. To address this knowledge gap, the present study analyzed the quantity and quality of language input to early-implanted children (age of implantation <23 mo) during the first year after implantation. DESIGN Day-long Language ENvironment Analysis (LENA) recordings, derived from home environments of 14 early-implanted children, were analyzed to estimate numbers of words per day, type-token ratio (TTR), and mean length of utterance in morphemes (MLUm) in adults' speech. Properties of language input were analyzed across these three dimensions to examine how input in home environments varied across children with CIs in quantity, defined as number of words, and quality, defined as whether speech was child-directed or overheard. RESULTS Our per-day estimates demonstrated that children with CIs were highly variable in the number of total words (mean ± SD = 25,134 ± 9,267 words) and high-quality child-directed words (mean ± SD = 10,817 ± 7,187 words) they experienced in a day in their home environments during the first year after implantation. The results also showed that the patterns of variability across children in quantity and quality of language input change depending on whether the speech was child-directed or overheard.
Children also experienced highly different environments in terms of lexical diversity (as measured by TTR) and morphosyntactic complexity (as measured by MLUm) of language input. The results demonstrated that children with CIs varied substantially in the quantity and quality of language input experienced in their home environments. More importantly, individual children experienced highly variable amounts of high-quality, child-directed speech, which may drive variability in language outcomes across children with CIs. CONCLUSIONS Analyzing early language input in natural, linguistic environments of children with CIs showed that the quantity and quality of early linguistic input vary substantially across individual children with CIs. This substantial individual variability suggests that the quantity and quality of early linguistic input are potential sources of individual differences in outcomes of children with CIs and warrant further investigation to determine the effects of this variability on outcomes.
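The two input-quality measures named in the abstract, TTR and MLUm, are straightforward ratios once transcripts are tokenized. The sketch below is a simplification for illustration: LENA-derived analyses rest on automated transcription and morphological coding, whereas here utterances are assumed to arrive already segmented into morphemes.

```python
# Back-of-envelope versions of the two lexical/morphosyntactic measures
# (TTR and MLUm); input is assumed to be pre-tokenized.

def type_token_ratio(tokens):
    """Lexical diversity: number of unique word types over total tokens."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0

def mean_length_of_utterance(utterances):
    """MLUm: mean number of morphemes per utterance."""
    return sum(len(u) for u in utterances) / len(utterances)
```

Note that raw TTR falls as sample size grows, which is one reason day-long recordings are usually compared over equal-sized token windows.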
Collapse
|
32
|
Knipper M, Mazurek B, van Dijk P, Schulze H. Too Blind to See the Elephant? Why Neuroscientists Ought to Be Interested in Tinnitus. J Assoc Res Otolaryngol 2021; 22:609-621. [PMID: 34686939 PMCID: PMC8599745 DOI: 10.1007/s10162-021-00815-1] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/29/2021] [Accepted: 08/30/2021] [Indexed: 01/13/2023] Open
Abstract
A curative therapy for tinnitus currently does not exist. One may actually exist but cannot currently be causally linked to tinnitus due to the lack of consistency of concepts about the neural correlate of tinnitus. Depending on predictions, these concepts would require either a suppression or enhancement of brain activity or an increase in inhibition or disinhibition. Although procedures with a potential to silence tinnitus may exist, the lack of rationale for their curative success hampers an optimization of therapeutic protocols. We discuss here six candidate contributors to tinnitus that have been suggested by a variety of scientific experts in the field and that were addressed in a virtual panel discussion at the ARO round table in February 2021. In this discussion, several potential tinnitus contributors were considered: (i) inhibitory circuits, (ii) attention, (iii) stress, (iv) unidentified sub-entities, (v) maladaptive information transmission, and (vi) minor cochlear deafferentation. Finally, (vii) some potential therapeutic approaches were discussed. The results of this discussion are reflected here in view of potential blind spots that may still remain and that have been ignored in most tinnitus literature. We strongly suggest considering the high impact of connecting the controversial findings to unravel the whole complexity of the tinnitus phenomenon; an essential prerequisite for establishing suitable therapeutic approaches.
Collapse
Affiliation(s)
- Marlies Knipper
- Molecular Physiology of Hearing, Tübingen Hearing Research Centre (THRC), Department of Otolaryngology, Head & Neck Surgery, University of Tübingen, Elfriede-Aulhorn-Straße 5, 72076, Tübingen, Germany.
| | - Birgit Mazurek
- Tinnitus Center Charité, Universitätsmedizin Berlin, Berlin, Germany
| | - Pim van Dijk
- Department of Otorhinolaryngology/Head and Neck Surgery, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
- Graduate School of Medical Sciences (Research School of Behavioural and Cognitive Neurosciences), University of Groningen, Groningen, The Netherlands
| | - Holger Schulze
- Experimental Otolaryngology, Friedrich-Alexander Universität Erlangen-Nürnberg, Waldstrasse 1, 91054, Erlangen, Germany
| |
Collapse
|
33
|
Jeng FC, Hart BN, Lin CD. Separating the Novel Speech Sound Perception of Lexical Tone Chimeras From Their Auditory Signal Manipulations: Behavioral and Electroencephalographic Evidence. Percept Mot Skills 2021; 128:2527-2543. [PMID: 34586922 DOI: 10.1177/00315125211049723] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]
Abstract
Previous research has shown the novelty of lexical-tone chimeras (artificially constructed speech sounds created by combining normal speech sounds of a given language) to native speakers of the language from which the chimera components were drawn. However, the source of such novelty remains unclear. Our goal in this study was to separate the effects of chimeric tonal novelty in Mandarin speech from the effects of auditory signal manipulations. We recruited 20 native speakers of Mandarin and constructed two sets of lexical-tone chimeras by interchanging the envelopes and fine structures of both a falling /yi4/ and a rising /yi2/ Mandarin tone through 1, 2, 3, 4, 6, 8, 16, 32, and 64 auditory filter banks. We conducted pitch-perception ability tasks via a two-alternative, forced-choice paradigm to produce behavioral (versus physiological) pitch perception data. We also obtained electroencephalographic measurements through the scalp-recorded frequency-following response (FFR). Analyses of variance and post hoc Greenhouse-Geisser procedures revealed that the differences observed in the participants' reaction times and FFR measurements were attributable primarily to chimeric novelty rather than signal manipulation effects. These findings can be useful in assessing neuroplasticity and developing speech-processing strategies.
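The envelope/fine-structure exchange described above is classically done per filter band with the Hilbert transform: each band's analytic signal is split into a magnitude (envelope) and a unit-magnitude carrier (fine structure), and the envelope of one sound is imposed on the fine structure of the other. The sketch below shows a single band only (the study uses 1-64 bands) and computes the analytic signal with a plain DFT for self-containment; it is an illustration, not the authors' processing pipeline.

```python
# Single-band Hilbert-style chimera construction (illustrative only).
import cmath
import math

def analytic_signal(x):
    """Analytic signal via DFT: zero negative frequencies, double positive ones."""
    n = len(x)
    spec = [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]
    h = [0.0] * n
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        for k in range(1, n // 2):
            h[k] = 2.0
    else:
        for k in range(1, (n + 1) // 2):
            h[k] = 2.0
    return [sum(spec[k] * h[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]

def chimera(envelope_source, fine_source):
    """Envelope of one sound imposed on the fine structure of another."""
    env = [abs(z) for z in analytic_signal(envelope_source)]
    fine = [(z / abs(z)).real if abs(z) > 1e-12 else 0.0
            for z in analytic_signal(fine_source)]
    return [e * f for e, f in zip(env, fine)]
```

A sanity check on the construction: for a pure tone, `chimera(x, x)` reconstructs `x`, since the envelope is constant and the fine structure is the tone itself.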
Collapse
Affiliation(s)
- Fuh-Cherng Jeng
- Communication Sciences and Disorders, Ohio University, Athens, Ohio, United States; Department of Otolaryngology-HNS, Medical University Hospital, Taichung City
| | - Breanna N Hart
- Communication Sciences and Disorders, Ohio University, Athens, Ohio, United States
| | - Chia-Der Lin
- Department of Otolaryngology-HNS, Medical University Hospital, Taichung City
| |
Collapse
|
34
|
Mathematical framework for place coding in the auditory system. PLoS Comput Biol 2021; 17:e1009251. [PMID: 34339409 PMCID: PMC8360601 DOI: 10.1371/journal.pcbi.1009251] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/14/2020] [Revised: 08/12/2021] [Accepted: 07/06/2021] [Indexed: 11/18/2022] Open
Abstract
In the auditory system, tonotopy is postulated to be the substrate for a place code, where sound frequency is encoded by the location of the neurons that fire during the stimulus. Though conceptually simple, the computations that allow for the representation of intensity and complex sounds are poorly understood. Here, a mathematical framework is developed in order to define clearly the conditions that support a place code. To accommodate both frequency and intensity information, the neural network is described as a space with elements that represent individual neurons and clusters of neurons. A mapping is then constructed from acoustic space to neural space so that frequency and intensity are encoded, respectively, by the location and size of the clusters. Algebraic operations (addition and multiplication) are derived to elucidate the rules for representing, assembling, and modulating multi-frequency sound in networks. The resulting outcomes of these operations are consistent with network simulations as well as with electrophysiological and psychophysical data. The analyses show how both frequency and intensity can be encoded with a purely place code, without the need for rate or temporal coding schemes. The algebraic operations are used to describe loudness summation and suggest a mechanism for the critical band. The mathematical approach complements experimental and computational approaches and provides a foundation for interpreting data and constructing models.
Collapse
|
35
|
Simões PN, Lüders D, José MR, Romanelli G, Lüders V, Santos RS, de Araújo CM. Musical Perception Assessment of People With Hearing Impairment: A Systematic Review and Meta-Analysis. Am J Audiol 2021; 30:458-473. [PMID: 33784174 DOI: 10.1044/2021_aja-20-00146] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022] Open
Abstract
Purpose People with hearing impairment (HI) face numerous challenges that can be minimized with the use of hearing aids and cochlear implants. Despite technological advances in these assistive hearing devices, musical perception remains difficult for these people. Tests and protocols developed to assess the musical perception of this audience were the target of this systematic review, whose objective was to investigate how assessments of musical perception in people with HI are carried out. Method Searches for primary articles were carried out in the PubMed/MEDLINE, Scopus, Web of Science, Latin American and Caribbean Health Sciences Literature, and ASHAWire databases. Search results were managed using EndNote X9 software, and analysis was performed according to the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) Statement. Results The 16 included cross-sectional studies analyzed music perception data from people with HI compared to a control group of participants with normal hearing. Among these, four studies were selected to be included in a meta-analysis, performed with timbre and melody. Variability was observed in the tests and between the levels of auditory perception skills analyzed in relation to the components of music. With respect to the tests, sound stimuli generated by synthesizers were the most used stimuli; with the exception of timbre evaluation, the most frequent test environment was a booth with sound attenuation, and the average intensity for presenting sound stimuli was 70 dB SPL. The most evaluated sound component was pitch, followed by rhythm and timbre, with a pattern of responses based on adaptive and psychoacoustic methods. Conclusions The heterogeneity of the musical parameters and the auditory abilities evaluated by the tests is a fact that can compromise evidence found in this area of study.
It is worth considering the quality of samples that were recorded with real musical instruments and digitized afterward, in comparison with synthesized samples that do not seem to accurately represent real instruments. The need to minimize semantic parallelism that involves the auditory skills and elements of music involved in the assessment of musical perception is highlighted.
Collapse
Affiliation(s)
- Pierangela Nota Simões
- Postgraduate Program in Communication Disorders, Universidade Tuiuti do Paraná, Curitiba, Brazil
- Faculty of Arts, Universidade Estadual do Paraná, Curitiba, Brazil
| | - Debora Lüders
- Postgraduate Program in Communication Disorders, Universidade Tuiuti do Paraná, Curitiba, Brazil
| | - Maria Renata José
- Postgraduate Program in Communication Disorders, Universidade Tuiuti do Paraná, Curitiba, Brazil
| | - Guilherme Romanelli
- Postgraduate Program in Music, Universidade Federal do Paraná, Curitiba, Brazil
| | - Valéria Lüders
- Postgraduate Program in Music, Universidade Federal do Paraná, Curitiba, Brazil
| | - Rosane Sampaio Santos
- Postgraduate Program in Communication Disorders, Universidade Tuiuti do Paraná, Curitiba, Brazil
| | | |
Collapse
|
36
|
Guinan JJ, Lefler SM, Buchman CA, Goodman SS, Lichtenhan JT. Altered mapping of sound frequency to cochlear place in ears with endolymphatic hydrops provide insight into the pitch anomaly of diplacusis. Sci Rep 2021; 11:10380. [PMID: 34001971 PMCID: PMC8128888 DOI: 10.1038/s41598-021-89902-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/24/2021] [Accepted: 04/26/2021] [Indexed: 11/22/2022] Open
Abstract
A fundamental property of mammalian hearing is the conversion of sound pressure into a frequency-specific place of maximum vibration along the cochlear length, thereby creating a tonotopic map. The tonotopic map makes possible systematic frequency tuning across auditory-nerve fibers, which enables the brain to use pitch to separate sounds from different environmental sources and process the speech and music that connects us to people and the world. Sometimes a tone has a different pitch in the left and right ears, a perceptual anomaly known as diplacusis. Diplacusis has been attributed to a change in the cochlear frequency-place map, but the hypothesized abnormal cochlear map has never been demonstrated. Here we assess cochlear frequency-place maps in guinea-pig ears with experimentally-induced endolymphatic hydrops, a hallmark of Ménière’s disease. Our findings are consistent with the hypothesis that diplacusis is due to an altered cochlear map. Map changes can lead to altered pitch, but the size of the pitch change is also affected by neural synchrony. Our data show that the cochlear frequency-place map is not fixed but can be altered by endolymphatic hydrops. Map changes should be considered in assessing hearing pathologies and treatments.
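The cochlear frequency-place map discussed above is commonly approximated by Greenwood's function. A minimal sketch using the human parameter values, which are an illustrative assumption here (the guinea-pig map in this study uses different constants):

```python
def greenwood_frequency(x, A=165.4, a=2.1, k=0.88):
    """Characteristic frequency (Hz) at relative cochlear place x,
    where x = 0 is the apex and x = 1 the base (Greenwood, 1990,
    human parameter values)."""
    return A * (10 ** (a * x) - k)

# The apex codes low frequencies and the base high frequencies.
apex_hz = greenwood_frequency(0.0)   # ~20 Hz
base_hz = greenwood_frequency(1.0)   # ~20.7 kHz
```

In these terms, a hydrops-induced map shift as reported above corresponds to the same frequency being assigned to a different place x in the two ears, which is one way an interaural pitch mismatch could arise.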
Collapse
Affiliation(s)
- J J Guinan
- Eaton-Peabody Laboratories, Massachusetts Eye and Ear, Boston, MA, USA
- Department of Otolaryngology, Harvard Medical School, Boston, MA, USA
| | - S M Lefler
- Department of Otolaryngology, School of Medicine, Washington University St. Louis, Campus Box 8115, 660 South Euclid Avenue, Saint Louis, MO, 63110, USA
| | - C A Buchman
- Department of Otolaryngology, School of Medicine, Washington University St. Louis, Campus Box 8115, 660 South Euclid Avenue, Saint Louis, MO, 63110, USA
| | - S S Goodman
- Department of Communication Sciences and Disorders, University of Iowa, Iowa City, IA, USA
| | - J T Lichtenhan
- Department of Otolaryngology, School of Medicine, Washington University St. Louis, Campus Box 8115, 660 South Euclid Avenue, Saint Louis, MO, 63110, USA.
| |
Collapse
|
37
|
Wilson US, Browning-Kamins J, Durante AS, Boothalingam S, Moleti A, Sisto R, Dhar S. Cochlear tuning estimates from level ratio functions of distortion product otoacoustic emissions. Int J Audiol 2021; 60:890-899. [PMID: 33612052 DOI: 10.1080/14992027.2021.1886352] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/22/2022]
Abstract
Objective: Distortion product otoacoustic emission (DPOAE) levels plotted as a function of stimulus frequency ratio demonstrate a bandpass shape. This bandpass shape is narrower at higher frequencies than at lower frequencies and has therefore been thought to be related to cochlear mechanical tuning. However, the frequency and level dependence of these functions above 8 kHz is largely unknown, and how tuning estimates from these functions relate to behavioural tuning is not fully understood. Design: In experiment 1, we measured DPOAE level ratio functions (LRFs) for f2 = 0.75-16 kHz at two stimulus levels, 62/52 and 52/37 dB FPL. In experiment 2, we compared tuning estimates from DPOAE LRFs to behavioural tuning at 1 and 4 kHz. Study sample: Seven normal-hearing young adults in experiment 1 and 24 normal-hearing young adults in experiment 2. Results: LRFs became narrower with increasing frequency and decreasing level, and tuning estimates increased as expected from 1 to 8 kHz. Behavioural tuning generally predicted DPOAE LRF-estimated tuning. Conclusions: Our findings suggest that DPOAE LRFs generally reflect a tuning profile consistent with basilar membrane, neural, and behavioural tuning. However, further investigations are warranted to fully establish DPOAE LRFs as a clinical measure of cochlear tuning.
Collapse
Affiliation(s)
- Uzma Shaheen Wilson
- Roxelyn and Richard Pepper Department of Communication Sciences & Disorders, Northwestern University, Evanston, IL, USA
| | - Jenna Browning-Kamins
- Roxelyn and Richard Pepper Department of Communication Sciences & Disorders, Northwestern University, Evanston, IL, USA
| | | | | | - Arturo Moleti
- Physics Department, University of Roma Tor Vergata, Rome, Italy
| | | | - Sumitrajit Dhar
- Roxelyn and Richard Pepper Department of Communication Sciences & Disorders, Northwestern University, Evanston, IL, USA
- Knowles Hearing Center, Northwestern University, Evanston, IL, USA
| |
Collapse
|
38
|
Adadey SM, Quaye O, Amedofu GK, Awandare GA, Wonkam A. Screening for GJB2-R143W-Associated Hearing Impairment: Implications for Health Policy and Practice in Ghana. Public Health Genomics 2020; 23:184-189. [PMID: 33302283 DOI: 10.1159/000512121] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/03/2020] [Accepted: 09/23/2020] [Indexed: 11/19/2022] Open
Abstract
Genetic factors significantly contribute to the burden of hearing impairment (HI) in Ghana, as there is a high carrier frequency (1.5%) of the connexin 26 gene founder variant GJB2-R143W in the healthy Ghanaian population. The GJB2-R143W mutation accounts for nearly 26% of cases in families segregating congenital non-syndromic HI. Because HI is associated with high genetic fitness, Ghana will likely sustain an increase in the number of individuals living with inheritable HI. There is a universal newborn hearing screening (UNHS) program in Ghana; however, this program does not include genetic testing. Adding genetic testing for the GJB2-R143W mutation at the population, prenatal, and neonatal stages could guide genetic counseling for individuals and couples, enable early detection of HI in at-risk infants, and improve medical management, including speech therapy and audiologic intervention, as well as the provision of social services needed to enhance parenting and education for children with HI. Based on published research on the genetics of HI in Ghana, we recommend that the UNHS program include genetic screening for the GJB2-R143W variant for newborns who do not pass the initial UNHS tests. This will require upgrading and resourcing public health infrastructure to implement rapid and cost-effective GJB2-R143W testing, followed by appropriate genetic counseling and anticipatory guidance for medical care.
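As a rough illustration of scale, the reported 1.5% carrier frequency implies an expected birth prevalence that can be computed under Hardy-Weinberg equilibrium and recessive inheritance (both simplifying assumptions not stated in the abstract):

```python
carrier_freq = 0.015               # reported GJB2-R143W carrier frequency in Ghana
q = carrier_freq / 2               # allele frequency, since carrier freq 2pq ~ 2q for a rare allele
expected_affected = q ** 2         # homozygote frequency under Hardy-Weinberg
births_per_case = round(1 / expected_affected)   # roughly 1 affected birth per ~18,000
```

This back-of-the-envelope figure is illustrative only; real prevalence also depends on assortative mating and the other causes of HI the abstract mentions.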
Collapse
Affiliation(s)
- Samuel M Adadey
- West African Centre for Cell Biology of Infectious Pathogens (WACCBIP), University of Ghana, Accra, Ghana
- Division of Human Genetics, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
| | - Osbourne Quaye
- West African Centre for Cell Biology of Infectious Pathogens (WACCBIP), University of Ghana, Accra, Ghana
| | - Geoffrey K Amedofu
- Department of Eye Ear Nose & Throat, School of Medical Sciences, Kwame Nkrumah University of Science and Technology, Kumasi, Ghana
| | - Gordon A Awandare
- West African Centre for Cell Biology of Infectious Pathogens (WACCBIP), University of Ghana, Accra, Ghana
| | - Ambroise Wonkam
- Division of Human Genetics, Faculty of Health Sciences, University of Cape Town, Cape Town, South Africa
| |
Collapse
|
39
|
Wasiuk PA, Lavandier M, Buss E, Oleson J, Calandruccio L. The effect of fundamental frequency contour similarity on multi-talker listening in older and younger adults. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2020; 148:3527. [PMID: 33379934 PMCID: PMC7863686 DOI: 10.1121/10.0002661] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/04/2023]
Abstract
Older adults with hearing loss have greater difficulty recognizing target speech in multi-talker environments than young adults with normal hearing, especially when target and masker speech streams are perceptually similar. A difference in fundamental frequency (f0) contour depth is an effective stream segregation cue for young adults with normal hearing. This study examined whether older adults with varying degrees of sensorineural hearing loss are able to utilize differences in target/masker f0 contour depth to improve speech recognition in multi-talker listening. Speech recognition thresholds (SRTs) were measured for speech mixtures composed of target/masker streams with flat, normal, and exaggerated speaking styles, in which f0 contour depth systematically varied. Computational modeling estimated differences in energetic masking across listening conditions. Young adults had lower SRTs than older adults, a result that was partially explained by differences in audibility predicted by the model. However, audibility differences did not explain why young adults experienced a benefit from mismatched target/masker f0 contour depth, while in most conditions, older adults did not. A reduced ability to use segregation cues (differences in target/masker f0 contour depth) and deficits in grouping speech with variable f0 contours likely contribute to the difficulties experienced by older adults in challenging acoustic environments.
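The flat/normal/exaggerated contour-depth manipulation described above can be sketched by scaling log-f0 excursions around the mean, an illustrative method (the study's exact resynthesis procedure may differ):

```python
import numpy as np

def scale_f0_contour(f0_hz, depth):
    """Scale the depth of an f0 contour by `depth` (0 = flat,
    1 = unchanged, >1 = exaggerated), working in log frequency so
    that excursions scale in musical (semitone-like) units while the
    geometric-mean f0 is preserved."""
    log_f0 = np.log2(np.asarray(f0_hz, dtype=float))
    mean = log_f0.mean()
    return 2.0 ** (mean + depth * (log_f0 - mean))

f0 = np.array([180.0, 210.0, 160.0, 200.0, 190.0])  # hypothetical contour (Hz)
flat = scale_f0_contour(f0, 0.0)
exaggerated = scale_f0_contour(f0, 2.0)
```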
Collapse
Affiliation(s)
- Peter A Wasiuk
- Department of Psychological Sciences, 11635 Euclid Avenue, Case Western Reserve University, Cleveland, Ohio 44106, USA
| | - Mathieu Lavandier
- Univ. Lyon, ENTPE, Laboratoire Génie Civil et Bâtiment, Rue M. Audin, Vaulx-en-Velin Cedex, 69518, France
| | - Emily Buss
- Department of Otolaryngology/Head and Neck Surgery, University of North Carolina, CB#7070, Chapel Hill, North Carolina 27599, USA
| | - Jacob Oleson
- Department of Biostatistics, N300 CPHB, University of Iowa, 145 North Riverside Drive, Iowa City, Iowa 52242-2007, USA
| | - Lauren Calandruccio
- Department of Psychological Sciences, 11635 Euclid Avenue, Case Western Reserve University, Cleveland, Ohio 44106, USA
| |
Collapse
|
40
|
Gupta S, Bee MA. Treefrogs exploit temporal coherence to form perceptual objects of communication signals. Biol Lett 2020; 16:20200573. [PMID: 32961090 PMCID: PMC7532704 DOI: 10.1098/rsbl.2020.0573] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/03/2020] [Accepted: 09/07/2020] [Indexed: 11/12/2022] Open
Abstract
For many animals, navigating their environment requires an ability to organize continuous streams of sensory input into discrete 'perceptual objects' that correspond to physical entities in visual and auditory scenes. The human visual and auditory systems follow several Gestalt laws of perceptual organization to bind constituent features into coherent perceptual objects. A largely unexplored question is whether nonhuman animals follow similar Gestalt laws in perceiving behaviourally relevant stimuli, such as communication signals. We used females of Cope's grey treefrog (Hyla chrysoscelis) to test the hypothesis that temporal coherence-a powerful Gestalt principle in human auditory scene analysis-promotes perceptual binding in forming auditory objects of species-typical vocalizations. According to the principle of temporal coherence, sound elements that start and stop at the same time or that modulate coherently over time are likely to become bound together into the same auditory object. We found that the natural temporal coherence between two spectral components of advertisement calls promotes their perceptual binding into auditory objects of advertisement calls. Our findings confirm the broad ecological validity of temporal coherence as a Gestalt law of auditory perceptual organization guiding the formation of biologically relevant perceptual objects in animal behaviour.
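Temporal coherence in the sense used above can be illustrated as the correlation between the amplitude envelopes of two spectral components; a toy sketch (signal parameters are invented for illustration, not the call statistics of H. chrysoscelis):

```python
import numpy as np

fs = 1000
t = np.arange(0, 2, 1 / fs)
envelope = 0.5 * (1 + np.sin(2 * np.pi * 4 * t))   # shared 4 Hz amplitude modulation

# Two spectral components, e.g. the two bands of a two-component call
low = envelope * np.sin(2 * np.pi * 100 * t)
high_coherent = envelope * np.sin(2 * np.pi * 300 * t)
# Same component, but its envelope shifted half a modulation period
high_incoherent = np.roll(envelope, fs // 8) * np.sin(2 * np.pi * 300 * t)

def envelope_correlation(x, y, fs):
    """Correlate rectified-and-smoothed envelopes of two signals."""
    win = np.ones(fs // 20) / (fs // 20)           # 50 ms moving average
    ex = np.convolve(np.abs(x), win, mode="same")
    ey = np.convolve(np.abs(y), win, mode="same")
    return np.corrcoef(ex, ey)[0, 1]

r_coherent = envelope_correlation(low, high_coherent, fs)
r_incoherent = envelope_correlation(low, high_incoherent, fs)
```

Components that modulate together (high envelope correlation) are the ones temporal coherence predicts will bind into one auditory object.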
Collapse
Affiliation(s)
- Saumya Gupta
- Department of Ecology, Evolution, and Behavior, University of Minnesota, Saint Paul, MN 55108, USA
| | - Mark A. Bee
- Department of Ecology, Evolution, and Behavior, University of Minnesota, Saint Paul, MN 55108, USA
- Graduate Program in Neuroscience, University of Minnesota, Minneapolis, MN 55455, USA
| |
Collapse
|
41
|
Fox NP, Leonard M, Sjerps MJ, Chang EF. Transformation of a temporal speech cue to a spatial neural code in human auditory cortex. eLife 2020; 9:e53051. [PMID: 32840483 PMCID: PMC7556862 DOI: 10.7554/elife.53051] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/25/2019] [Accepted: 08/21/2020] [Indexed: 11/28/2022] Open
Abstract
In speech, listeners extract continuously-varying spectrotemporal cues from the acoustic signal to perceive discrete phonetic categories. Spectral cues are spatially encoded in the amplitude of responses in phonetically-tuned neural populations in auditory cortex. It remains unknown whether similar neurophysiological mechanisms encode temporal cues like voice-onset time (VOT), which distinguishes sounds like /b/ and /p/. We used direct brain recordings in humans to investigate the neural encoding of temporal speech cues with a VOT continuum from /ba/ to /pa/. We found that distinct neural populations respond preferentially to VOTs from one phonetic category, and are also sensitive to sub-phonetic VOT differences within a population's preferred category. In a simple neural network model, simulated populations tuned to detect either temporal gaps or coincidences between spectral cues captured encoding patterns observed in real neural data. These results demonstrate that a spatial/amplitude neural code underlies the cortical representation of both spectral and temporal speech cues.
Collapse
Affiliation(s)
- Neal P Fox
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, United States
| | - Matthew Leonard
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, United States
| | - Matthias J Sjerps
- Donders Institute for Brain, Cognition and Behaviour, Centre for Cognitive Neuroimaging, Radboud University, Nijmegen, Netherlands
- Max Planck Institute for Psycholinguistics, Nijmegen, Netherlands
| | - Edward F Chang
- Department of Neurological Surgery, University of California, San Francisco, San Francisco, United States
- Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, United States
| |
Collapse
|
42
|
Face-voice space: Integrating visual and auditory cues in judgments of person distinctiveness. Atten Percept Psychophys 2020; 82:3710-3727. [PMID: 32696231 DOI: 10.3758/s13414-020-02084-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]
Abstract
Faces and voices each convey multiple cues enabling us to tell people apart. Research on face and voice distinctiveness commonly utilizes multidimensional space to represent these complex, perceptual abilities. We extend this framework to examine how a combined face-voice space would relate to its constituent face and voice spaces. Participants rated videos of speakers for their dissimilarity in face only, voice only, and face-voice together conditions. Multidimensional scaling (MDS) and regression analyses showed that whereas face-voice space more closely resembled face space, indicating visual dominance, face-voice distinctiveness was best characterized by a multiplicative integration of face-only and voice-only distinctiveness, indicating that auditory and visual cues are used interactively in person-distinctiveness judgments. Further, the multiplicative integration could not be explained by the small correlation found between face-only and voice-only distinctiveness. As an exploratory analysis, we next identified auditory and visual features that correlated with the dimensions in the MDS solutions. Features pertaining to facial width, lip movement, spectral centroid, fundamental frequency, and loudness variation were identified as important features in face-voice space. We discuss the implications of our findings in terms of person perception, recognition, and face-voice matching abilities.
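The multiplicative-integration finding can be sketched as comparing additive and multiplicative regression models of combined ratings on unimodal ratings; the data below are synthetic stand-ins for the study's distinctiveness judgments:

```python
import numpy as np

rng = np.random.default_rng(0)
face = rng.uniform(1, 7, 200)     # synthetic face-only distinctiveness ratings
voice = rng.uniform(1, 7, 200)    # synthetic voice-only distinctiveness ratings
# Ground truth built to be multiplicative, plus rating noise
combined = 0.2 * face * voice + rng.normal(0, 0.3, 200)

def r_squared(predictors, y):
    """Ordinary least squares with intercept; return R^2."""
    X = np.column_stack([np.ones_like(y)] + list(predictors))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid.var() / y.var()

r2_additive = r_squared([face, voice], combined)
r2_multiplicative = r_squared([face, voice, face * voice], combined)
```

When the generating process is multiplicative, the model containing the face x voice interaction term recovers it and fits better, which mirrors the comparison reported above.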
Collapse
|
43
|
Saiz-Alía M, Reichenbach T. Computational modeling of the auditory brainstem response to continuous speech. J Neural Eng 2020; 17:036035. [DOI: 10.1088/1741-2552/ab970d] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/26/2022]
|
44
|
Abstract
This study presents a computational model to reproduce the biological dynamics of "listening to music." A biologically plausible model of periodicity pitch detection is proposed and simulated. Periodicity pitch is computed across a range of the auditory spectrum. Periodicity pitch is detected from subsets of activated auditory nerve fibers (ANFs). These activate connected model octopus cells, which trigger model neurons detecting onsets and offsets; thence model interval-tuned neurons are innervated at the right interval times; and finally, a set of common interval-detecting neurons indicate pitch. Octopus cells rhythmically spike with the pitch periodicity of the sound. Batteries of interval-tuned neurons stopwatch-like measure the inter-spike intervals of the octopus cells by coding interval durations as first spike latencies (FSLs). The FSL-triggered spikes synchronously coincide through a monolayer spiking neural network at the corresponding receiver pitch neurons.
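The interval-based pitch readout described above, measuring inter-spike intervals of octopus cells, is conceptually close to classic autocorrelation models of periodicity pitch. A minimal autocorrelation sketch on a synthetic harmonic complex (not the authors' spiking model):

```python
import numpy as np

fs = 16000
t = np.arange(0, 0.1, 1 / fs)
# Harmonic complex with a 200 Hz fundamental (harmonics 1-5)
x = sum(np.sin(2 * np.pi * 200 * h * t) for h in range(1, 6))

def periodicity_pitch(x, fs, fmin=80.0, fmax=1000.0):
    """Estimate f0 as the lag of the autocorrelation maximum within
    the plausible pitch-period range."""
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]
    lo, hi = int(fs / fmax), int(fs / fmin)
    lag = lo + int(np.argmax(ac[lo:hi + 1]))
    return fs / lag

f0 = periodicity_pitch(x, fs)   # ~200 Hz
```

The best autocorrelation lag plays the role that the common inter-spike interval plays in the neuron-level model above.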
Collapse
Affiliation(s)
- Frank Klefenz
- Fraunhofer Institute for Digital Media Technology IDMT, Ilmenau, Germany
| | - Tamas Harczos
- Fraunhofer Institute for Digital Media Technology IDMT, Ilmenau, Germany
- Auditory Neuroscience and Optogenetics Laboratory, German Primate Center, Göttingen, Germany
- audifon GmbH & Co. KG, Kölleda, Germany
| |
Collapse
|
45
|
Mishra SK. The role of efferents in human auditory development: efferent inhibition predicts frequency discrimination in noise for children. J Neurophysiol 2020; 123:2437-2448. [PMID: 32432503 DOI: 10.1152/jn.00136.2020] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022] Open
Abstract
The descending corticofugal fibers originate from the auditory cortex and exert control on the periphery via the olivocochlear efferents. Medial efferents are thought to enhance the discriminability of transient sounds in background noise. In addition, the observation of deleterious long-term effects of efferent sectioning on the response properties of auditory nerve fibers in neonatal cats supports an efferent-mediated control of normal development. However, the role of the efferent system in human hearing remains unclear. The objective of the present study was to test the hypothesis that the medial efferents are involved in the development of frequency discrimination in noise. The hypothesis was examined with a combined behavioral and physiological approach. Frequency discrimination in noise and efferent inhibition were measured in 5- to 12-yr-old children (n = 127) and young adults (n = 37). Medial efferent strength was noninvasively assayed with a rigorous otoacoustic emission protocol. Results revealed an age-mediated relationship between efferent inhibition and frequency discrimination in noise. Efferent inhibition strongly predicted frequency discrimination in noise for younger children (5-9 yr). However, for older children (>9 yr) and adults, efferent inhibition was not related to frequency discrimination in noise. These findings support the role of efferents in the development of hearing-in-noise in humans; specifically, younger children compared with older children and adults are relatively more dependent on efferent inhibition for extracting relevant cues in noise. Additionally, the present findings caution against postulating an oversimplified relationship between efferent inhibition and measures of auditory perception in humans. NEW & NOTEWORTHY: Despite several decades of research, the functional role of medial olivocochlear efferents in humans remains controversial and is thought to be insignificant.
Here it is shown that medial efferent inhibition strongly predicts frequency discrimination in noise for younger children but not for older children and adults. Young children are relatively more dependent on the efferent system for listening-in-noise. This study highlights the role of the efferent system in hearing-in-noise during childhood development.
Collapse
Affiliation(s)
- Srikanta K Mishra
- Department of Communication Sciences and Disorders, The University of Texas Rio Grande Valley, Edinburg, Texas
- Department of Communication Disorders, New Mexico State University, Las Cruces, New Mexico
| |
Collapse
|
46
|
Harrison PMC, Pearce MT. Simultaneous consonance in music perception and composition. Psychol Rev 2020; 127:216-244. [PMID: 31868392 PMCID: PMC7032667 DOI: 10.1037/rev0000169] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/21/2019] [Revised: 06/13/2019] [Accepted: 09/02/2019] [Indexed: 11/08/2022]
Abstract
Simultaneous consonance is a salient perceptual phenomenon corresponding to the perceived pleasantness of simultaneously sounding musical tones. Various competing theories of consonance have been proposed over the centuries, but recently a consensus has developed that simultaneous consonance is primarily driven by harmonicity perception. Here we question this view, substantiating our argument by critically reviewing historic consonance research from a broad variety of disciplines, reanalyzing consonance perception data from 4 previous behavioral studies representing more than 500 participants, and modeling three Western musical corpora representing more than 100,000 compositions. We conclude that simultaneous consonance is a composite phenomenon that derives in large part from three phenomena: interference, periodicity/harmonicity, and cultural familiarity. We formalize this conclusion with a computational model that predicts a musical chord's simultaneous consonance from these three features, and release this model in an open-source R package, incon, alongside 15 other computational models also evaluated in this paper. We hope that this package will facilitate further psychological and musicological research into simultaneous consonance.
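The "interference" component above can be illustrated with Sethares' parameterization of the Plomp-Levelt dissonance curve, one of the model families covered by packages like incon; this sketch is an independent approximation, not the paper's code:

```python
import math
from itertools import combinations

def pair_dissonance(f1, f2, a1=1.0, a2=1.0):
    """Plomp-Levelt roughness of two partials (Sethares' parameterization):
    zero for unison, peaking at a fraction of a critical bandwidth, then
    decaying for wider separations."""
    fmin = min(f1, f2)
    s = 0.24 / (0.021 * fmin + 19.0)
    d = abs(f2 - f1)
    return a1 * a2 * (math.exp(-3.5 * s * d) - math.exp(-5.75 * s * d))

def chord_roughness(fundamentals, n_harmonics=6):
    """Total pairwise roughness over all partials of all tones."""
    partials = [f * h for f in fundamentals for h in range(1, n_harmonics + 1)]
    return sum(pair_dissonance(f1, f2) for f1, f2 in combinations(partials, 2))

c4 = 261.63
fifth = chord_roughness([c4, c4 * 3 / 2])           # consonant interval
minor_second = chord_roughness([c4, c4 * 16 / 15])  # dissonant interval
```

The fifth's harmonics largely coincide (near-zero roughness per pair), while the minor second's harmonics fall within a critical bandwidth of each other, so its summed roughness is much higher.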
Collapse
Affiliation(s)
- Peter M C Harrison
- School of Electronic Engineering and Computer Science, Queen Mary University of London
| | - Marcus T Pearce
- School of Electronic Engineering and Computer Science, Queen Mary University of London
| |
Collapse
|
47
|
Mehta AH, Lu H, Oxenham AJ. The Perception of Multiple Simultaneous Pitches as a Function of Number of Spectral Channels and Spectral Spread in a Noise-Excited Envelope Vocoder. J Assoc Res Otolaryngol 2020; 21:61-72. [PMID: 32048077 DOI: 10.1007/s10162-019-00738-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/09/2018] [Accepted: 10/30/2019] [Indexed: 01/06/2023] Open
Abstract
Cochlear implant (CI) listeners typically perform poorly on tasks involving the pitch of complex tones. This limitation in performance is thought to be mainly due to the restricted number of active channels and the broad current spread that leads to channel interactions and subsequent loss of precise spectral information, with temporal information limited primarily to temporal-envelope cues. Little is known about the degree of spectral resolution required to perceive combinations of multiple pitches, or a single pitch in the presence of other interfering tones in the same spectral region. This study used noise-excited envelope vocoders that simulate the limited resolution of CIs to explore the perception of multiple pitches presented simultaneously. The results show that the resolution required for perceiving multiple complex pitches is comparable to that found in a previous study using single complex tones. Although relatively high performance can be achieved with 48 channels, performance remained near chance when even limited spectral spread (with filter slopes as steep as 144 dB/octave) was introduced to the simulations. Overall, these tight constraints suggest that current CI technology will not be able to convey the pitches of combinations of spectrally overlapping complex tones.
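A noise-excited envelope vocoder of the general kind used here can be sketched as follows; channel count, filter order, and band edges are illustrative choices, not the study's exact parameters:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocoder(x, fs, n_channels=8, fmin=100.0, fmax=6000.0):
    """Noise-excited envelope vocoder: split the input into log-spaced
    bands, extract each band's Hilbert envelope, and use it to modulate
    band-limited noise, discarding temporal fine structure."""
    edges = np.geomspace(fmin, fmax, n_channels + 1)
    rng = np.random.default_rng(0)
    out = np.zeros_like(x)
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        env = np.abs(hilbert(sosfiltfilt(sos, x)))          # band envelope
        carrier = sosfiltfilt(sos, rng.standard_normal(len(x)))  # band noise
        out += env * carrier
    return out

fs = 16000
t = np.arange(0, 0.2, 1 / fs)
x = np.sin(2 * np.pi * 440 * t)
y = noise_vocoder(x, fs)
```

Narrowing the analysis filters or adding shallow filter slopes simulates the channel interaction and spectral spread that the study manipulates.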
Collapse
Affiliation(s)
- Anahita H Mehta
- Department of Psychology, University of Minnesota, N218 Elliott Hall, 75 East River Parkway, Minneapolis, MN, 55455, USA.
| | - Hao Lu
- Department of Psychology, University of Minnesota, N218 Elliott Hall, 75 East River Parkway, Minneapolis, MN, 55455, USA
| | - Andrew J Oxenham
- Department of Psychology, University of Minnesota, N218 Elliott Hall, 75 East River Parkway, Minneapolis, MN, 55455, USA
| |
Collapse
|
48
|
Robust Rate-Place Coding of Resolved Components in Harmonic and Inharmonic Complex Tones in Auditory Midbrain. J Neurosci 2020; 40:2080-2093. [PMID: 31996454 DOI: 10.1523/jneurosci.2337-19.2020] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/25/2019] [Revised: 01/12/2020] [Accepted: 01/16/2020] [Indexed: 11/21/2022] Open
Abstract
Harmonic complex tones (HCTs) commonly occurring in speech and music evoke a strong pitch at their fundamental frequency (F0), especially when they contain harmonics individually resolved by the cochlea. When all frequency components of an HCT are shifted by the same amount, the pitch of the resulting inharmonic tone (IHCT) can also shift, although the envelope repetition rate is unchanged. A rate-place code, whereby resolved harmonics are represented by local maxima in firing rates along the tonotopic axis, has been characterized in the auditory nerve and primary auditory cortex, but little is known about intermediate processing stages. We recorded single-neuron responses to HCT and IHCT with varying F0 and sound level in the inferior colliculus (IC) of unanesthetized rabbits of both sexes. Many neurons showed peaks in firing rate when a low-numbered harmonic aligned with the neuron's characteristic frequency, demonstrating "rate-place" coding. The IC rate-place code was most prevalent for F0 > 800 Hz, was only moderately dependent on sound level over a 40 dB range, and was not sensitive to stimulus harmonicity. A spectral receptive-field model incorporating broadband inhibition better predicted the neural responses than a purely excitatory model, suggesting an enhancement of the rate-place representation by inhibition. Some IC neurons showed facilitation in response to HCT relative to pure tones, similar to cortical "harmonic template neurons" (Feng and Wang, 2017), but to a lesser degree. Our findings shed light on the transformation of rate-place coding of resolved harmonics along the auditory pathway. SIGNIFICANCE STATEMENT: Harmonic complex tones are ubiquitous in speech and music and produce strong pitch percepts when they contain frequency components that are individually resolved by the cochlea.
Here, we characterize a "rate-place" code for resolved harmonics in the auditory midbrain that is more robust across sound levels than the peripheral rate-place code and insensitive to the harmonic relationships among frequency components. We use a computational model to show that inhibition may play an important role in shaping the rate-place code. Our study fills a major gap in understanding the transformations in neural representations of resolved harmonics along the auditory pathway.
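The rate-place picture above, with local firing-rate maxima where low-numbered harmonics align with characteristic frequency, can be sketched with a toy Gaussian filterbank whose bandwidths follow the human ERB scale; this is an illustrative simplification, not a model of actual IC tuning:

```python
import numpy as np

def erb(f):
    """Equivalent rectangular bandwidth (Hz), Glasberg & Moore (1990)."""
    return 24.7 * (4.37 * f / 1000 + 1)

def excitation_pattern(component_freqs, cfs):
    """Toy rate-place profile: each CF's 'rate' is the summed output of a
    Gaussian auditory filter (width ~ ERB) over the tone's components."""
    rates = np.zeros(len(cfs))
    for i, cf in enumerate(cfs):
        rates[i] = sum(np.exp(-0.5 * ((f - cf) / erb(cf)) ** 2)
                       for f in component_freqs)
    return rates

f0 = 1000.0
harmonics = [f0 * h for h in range(1, 11)]
cfs = np.geomspace(500, 11000, 400)
profile = excitation_pattern(harmonics, cfs)
```

Low-numbered harmonics produce distinct peaks (they are resolved), while at high CFs the filters are wider than the 1 kHz harmonic spacing and the profile flattens, which is why only resolved harmonics support a rate-place code.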
Collapse
|
49
|
Auditory Selectivity for Spectral Contrast in Cortical Neurons and Behavior. J Neurosci 2019; 40:1015-1027. [PMID: 31826944 DOI: 10.1523/jneurosci.1200-19.2019] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/23/2019] [Revised: 12/04/2019] [Accepted: 12/06/2019] [Indexed: 12/17/2022] Open
Abstract
Vocal communication relies on the ability of listeners to identify, process, and respond to vocal sounds produced by others in complex environments. To accurately recognize these signals, animals' auditory systems must robustly represent acoustic features that distinguish vocal sounds from other environmental sounds. Vocalizations typically have spectral structure; power regularly fluctuates along the frequency axis, creating spectral contrast. Spectral contrast is closely related to harmonicity, which refers to spectral power peaks occurring at integer multiples of a fundamental frequency. Although both spectral contrast and harmonicity typify natural sounds, they may differ in salience for communication behavior and engage distinct neural mechanisms. Therefore, it is important to understand which of these properties of vocal sounds underlie the neural processing and perception of vocalizations. Here, we test the importance of vocalization-typical spectral features in behavioral recognition and neural processing of vocal sounds, using male zebra finches. We show that behavioral responses to natural and synthesized vocalizations rely on the presence of discrete frequency components, but not on harmonic ratios between frequencies. We identify a specific population of neurons in primary auditory cortex that are sensitive to the spectral resolution of vocal sounds. We find that behavioral and neural response selectivity is explained by sensitivity to spectral contrast rather than harmonicity. This selectivity emerges within the cortex; it is absent in the thalamorecipient region and present in the deep output region. Further, deep-region neurons that are contrast-sensitive show distinct temporal responses and selectivity for modulation density compared with unselective neurons. SIGNIFICANCE STATEMENT: Auditory coding and perception are critical for vocal communication.
Auditory neurons must encode acoustic features that distinguish vocalizations from other sounds in the environment and generate percepts that direct behavior. The acoustic features that drive neural and behavioral selectivity for vocal sounds are unknown, however. Here, we show that vocal response behavior scales with stimulus spectral contrast but not with harmonicity, in songbirds. We identify a distinct population of auditory cortex neurons in which response selectivity parallels behavioral selectivity. This neural response selectivity is explained by sensitivity to spectral contrast rather than to harmonicity. Our findings inform the understanding of how the auditory system encodes socially-relevant signals via detection of an acoustic feature that is ubiquitous in vocalizations.
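Spectral contrast in the sense used above, power peaks standing out from troughs along the frequency axis, can be quantified crudely as a peak-to-median ratio of the power spectrum; harmonic, vocal-like sounds score high and flat noise low. This is a toy metric, not the paper's analysis:

```python
import numpy as np

def spectral_contrast_db(x, fs, fmax=4000.0):
    """Crude spectral contrast: peak-to-median power ratio (dB) of the
    spectrum below fmax."""
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), 1 / fs)
    band = power[(freqs > 0) & (freqs < fmax)]
    return 10 * np.log10(band.max() / (np.median(band) + 1e-20))

fs = 16000
t = np.arange(0, 0.5, 1 / fs)
harmonic = sum(np.sin(2 * np.pi * 300 * h * t) for h in range(1, 9))
noise = np.random.default_rng(1).standard_normal(len(t))

contrast_harmonic = spectral_contrast_db(harmonic, fs)   # discrete components
contrast_noise = spectral_contrast_db(noise, fs)         # flat spectrum
```

Note that this measure is high for any sound with discrete frequency components, harmonically related or not, which is exactly the dissociation the study exploits.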
|
50
|
Graves JE, Pralus A, Fornoni L, Oxenham AJ, Caclin A, Tillmann B. Short- and long-term memory for pitch and non-pitch contours: Insights from congenital amusia. Brain Cogn 2019; 136:103614. [PMID: 31546175 PMCID: PMC6953621 DOI: 10.1016/j.bandc.2019.103614] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/19/2019] [Revised: 09/11/2019] [Accepted: 09/13/2019] [Indexed: 11/25/2022]
Abstract
Congenital amusia is a neurodevelopmental disorder characterized by deficits in music perception, including discriminating and remembering melodies and melodic contours. As non-amusic listeners can perceive contours in dimensions other than pitch, such as loudness and brightness, the present study investigated whether amusics' pitch contour deficits also extend to these other auditory dimensions. Amusic and control participants performed an identification task for ten familiar melodies and a short-term memory task requiring the discrimination of changes in the contour of novel four-tone melodies. For both tasks, melodic contour was defined by pitch, brightness, or loudness. Amusic participants showed some ability to extract contours in all three dimensions. For familiar melodies, amusic participants showed impairment in all conditions, perhaps reflecting the fact that the long-term memory representations of the familiar melodies were defined in pitch. In the contour discrimination task with novel melodies, amusic participants exhibited less impairment for loudness-based melodies than for pitch- or brightness-based melodies, suggesting some specificity of the deficit for spectral changes, if not for pitch alone. The results suggest that pitch and brightness may not be processed by the same mechanisms as loudness, and that short-term memory for loudness contours may be spared to some degree in congenital amusia.
Affiliation(s)
- Jackson E Graves
- Lyon Neuroscience Research Center (CRNL), CNRS, UMR 5292, Inserm U1028, Université Lyon 1, Lyon, France; Department of Psychology, University of Minnesota, Minneapolis, MN, USA; Laboratoire des systèmes perceptifs, Département d'études cognitives, École normale supérieure, PSL University, CNRS, 75005 Paris, France.
| | - Agathe Pralus
- Lyon Neuroscience Research Center (CRNL), CNRS, UMR 5292, Inserm U1028, Université Lyon 1, Lyon, France
| | - Lesly Fornoni
- Lyon Neuroscience Research Center (CRNL), CNRS, UMR 5292, Inserm U1028, Université Lyon 1, Lyon, France
| | - Andrew J Oxenham
- Department of Psychology, University of Minnesota, Minneapolis, MN, USA
| | - Anne Caclin
- Lyon Neuroscience Research Center (CRNL), CNRS, UMR 5292, Inserm U1028, Université Lyon 1, Lyon, France
| | - Barbara Tillmann
- Lyon Neuroscience Research Center (CRNL), CNRS, UMR 5292, Inserm U1028, Université Lyon 1, Lyon, France
| |
|