1. Tollefsrud MA, Joyner CN, Zakrzewski AC, Wisniewski MG. Not fully remembered, but not forgotten: interfering sounds worsen but do not eliminate the representation of pitch in working memory. Atten Percept Psychophys 2024; 86:855-865. PMID: 38231462; PMCID: PMC11217971; DOI: 10.3758/s13414-024-02845-2.
Abstract
Recent research has begun measuring auditory working memory with a continuous adjustment task in which listeners adjust attributes of a sound to match a stimulus presented earlier. This approach captures auditory memory's continuous nature better than standard change detection paradigms that collect binary ("same or different") memory measurements. In two experiments, we assessed the impact of different interference stimuli (multitone complexes vs. white noise vs. silence) on the precision and accuracy of participants' reproductions of pitch from memory. Participants were presented with a target multitone complex stimulus followed by eight successive interference signals. Across trials, these signals alternated between additional multitone complexes, randomly generated white noise samples, or (in Experiment 2) silence. This was followed by a response period in which participants adjusted the pitch of a response stimulus using a MIDI touchpad to match the target. Experiment 1 found a significant effect of interference type on performance, with tone interference signals producing the greatest impairments to participants' accuracy and precision compared to white noise. Interestingly, it also found a compression in the participants' responses, with overestimations of low-frequency targets and underestimations of high-frequency targets. Experiment 2 replicated results from Experiment 1, with an additional silence condition showing the best performance, suggesting that non-tonal signals also generate interference. In general, results support a shared resource model of working memory with a limited capacity that can be flexibly allocated to hold items in memory with varying levels of fidelity. Interference does not appear to knock items out of a fixed precision slot, but rather robs a portion of capacity from stored items.

2. McBride JM, Passmore S, Tlusty T. Convergent evolution in a large cross-cultural database of musical scales. PLoS One 2023; 18:e0284851. PMID: 38091315; PMCID: PMC10718441; DOI: 10.1371/journal.pone.0284851.
Abstract
Scales, sets of discrete pitches that form the basis of melodies, are thought to be one of the most universal hallmarks of music. But we know relatively little about cross-cultural diversity of scales or how they evolved. To remedy this, we assemble a cross-cultural database (Database of Musical Scales: DaMuSc) of scale data, collected over the past century by various ethnomusicologists. Statistical analyses of the data highlight that certain intervals (e.g., the octave, fifth, second) are used frequently across cultures. Despite some diversity among scales, it is the similarities across societies which are most striking: step intervals are restricted to 100-400 cents; most scales are found close to equidistant 5- and 7-note scales. We discuss potential mechanisms of variation and selection in the evolution of scales, and how the assembled data may be used to examine the root causes of convergent evolution.
Affiliation(s)
- John M. McBride
- Center for Soft and Living Matter, Institute for Basic Science, Ulsan, South Korea
- Sam Passmore
- Faculty of Environment and Information Studies, Keio University, Fujisawa, Japan
- Evolution of Cultural Diversity Initiative, College of Asia and the Pacific, Australian National University, Canberra, Australia
- Tsvi Tlusty
- Center for Soft and Living Matter, Institute for Basic Science, Ulsan, South Korea
- Departments of Physics and Chemistry, Ulsan National Institute of Science and Technology, Ulsan, South Korea

3. Rajappa N, Guest DR, Oxenham AJ. Benefits of Harmonicity for Hearing in Noise Are Limited to Detection and Pitch-Related Discrimination Tasks. Biology 2023; 12:1522. PMID: 38132348; PMCID: PMC10740545; DOI: 10.3390/biology12121522.
Abstract
Harmonic complex tones are easier to detect in noise than inharmonic complex tones, providing a potential perceptual advantage in complex auditory environments. Here, we explored whether the harmonic advantage extends to other auditory tasks that are important for navigating a noisy auditory environment, such as amplitude- and frequency-modulation detection. Sixty young normal-hearing listeners were tested, divided into two equal groups with and without musical training. Consistent with earlier studies, harmonic tones were easier to detect in noise than inharmonic tones, with a signal-to-noise ratio (SNR) advantage of about 2.5 dB, and the pitch discrimination of the harmonic tones was more accurate than that of inharmonic tones, even after differences in audibility were accounted for. In contrast, neither amplitude- nor frequency-modulation detection was superior with harmonic tones once differences in audibility were accounted for. Musical training was associated with better performance only in pitch-discrimination and frequency-modulation-detection tasks. The results confirm a detection and pitch-perception advantage for harmonic tones but reveal that the harmonic benefits do not extend to suprathreshold tasks that do not rely on extracting the fundamental frequency. A general theory is proposed that may account for the effects of both noise and memory on pitch-discrimination differences between harmonic and inharmonic tones.
Affiliation(s)
- Neha Rajappa
- Department of Psychology, University of Minnesota, Minneapolis, MN 55455, USA
- Daniel R. Guest
- Department of Biomedical Engineering, University of Rochester, Rochester, NY 14627, USA
- Andrew J. Oxenham
- Department of Psychology, University of Minnesota, Minneapolis, MN 55455, USA

4. Feather J, Leclerc G, Mądry A, McDermott JH. Model metamers reveal divergent invariances between biological and artificial neural networks. Nat Neurosci 2023; 26:2017-2034. PMID: 37845543; PMCID: PMC10620097; DOI: 10.1038/s41593-023-01442-0.
Abstract
Deep neural network models of sensory systems are often proposed to learn representational transformations with invariances like those in the brain. To reveal these invariances, we generated 'model metamers', stimuli whose activations within a model stage are matched to those of a natural stimulus. Metamers for state-of-the-art supervised and unsupervised neural network models of vision and audition were often completely unrecognizable to humans when generated from late model stages, suggesting differences between model and human invariances. Targeted model changes improved human recognizability of model metamers but did not eliminate the overall human-model discrepancy. The human recognizability of a model's metamers was well predicted by their recognizability by other models, suggesting that models contain idiosyncratic invariances in addition to those required by the task. Metamer recognizability dissociated from both traditional brain-based benchmarks and adversarial vulnerability, revealing a distinct failure mode of existing sensory models and providing a complementary benchmark for model assessment.
Affiliation(s)
- Jenelle Feather
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
- McGovern Institute, Massachusetts Institute of Technology, Cambridge, MA, USA
- Center for Brains, Minds and Machines, Massachusetts Institute of Technology, Cambridge, MA, USA
- Center for Computational Neuroscience, Flatiron Institute, New York, NY, USA
- Guillaume Leclerc
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
- Aleksander Mądry
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
- Josh H McDermott
- Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
- McGovern Institute, Massachusetts Institute of Technology, Cambridge, MA, USA
- Center for Brains, Minds and Machines, Massachusetts Institute of Technology, Cambridge, MA, USA
- Speech and Hearing Bioscience and Technology, Harvard University, Cambridge, MA, USA

5. Chen C, Cruces-Solís H, Ertman A, de Hoz L. Subcortical coding of predictable and unsupervised sound-context associations. Curr Res Neurobiol 2023; 5:100110. PMID: 38020811; PMCID: PMC10663128; DOI: 10.1016/j.crneur.2023.100110.
Abstract
Our environment is made of a myriad of stimuli present in combinations often patterned in predictable ways. For example, there is a strong association between where we are and the sounds we hear. Like many environmental patterns, sound-context associations are learned implicitly, in an unsupervised manner, and are highly informative and predictive of normality. Yet, we know little about where and how unsupervised sound-context associations are coded in the brain. Here we measured plasticity in the auditory midbrain of mice living over days in an enriched task-less environment in which entering a context triggered sound with different degrees of predictability. Plasticity in the auditory midbrain, a hub of auditory input and multimodal feedback, developed over days and reflected learning of contextual information in a manner that depended on the predictability of the sound-context association and not on reinforcement. Plasticity manifested as an increase in response gain and tuning shift that correlated with a general increase in neuronal frequency discrimination. Thus, the auditory midbrain is sensitive to unsupervised predictable sound-context associations, revealing a subcortical engagement in the detection of contextual sounds. By increasing frequency resolution, this detection might facilitate the processing of behaviorally relevant foreground information described to occur in cortical auditory structures.
Affiliation(s)
- Chi Chen
- Department of Neurogenetics, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
- International Max Planck Research School for Neurosciences, Göttingen, Germany
- Göttingen Graduate School of Neurosciences and Molecular Biosciences, Germany
- Charité Medical University, Neuroscience Research Center, Berlin, Germany
- Hugo Cruces-Solís
- Department of Neurogenetics, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
- International Max Planck Research School for Neurosciences, Göttingen, Germany
- Göttingen Graduate School of Neurosciences and Molecular Biosciences, Germany
- Alexandra Ertman
- Charité Medical University, Neuroscience Research Center, Berlin, Germany
- International Graduate Program Medical Neurosciences, Charité Medical University, Berlin, Germany
- Livia de Hoz
- Department of Neurogenetics, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
- Charité Medical University, Neuroscience Research Center, Berlin, Germany
- Bernstein Center for Computational Neuroscience, Berlin, Germany

6. Wisniewski MG, Tollefsrud MA. Auditory short-term memory for pitch loses precision over time. JASA Express Lett 2023; 3:034402. PMID: 37003712; DOI: 10.1121/10.0017518.
Abstract
The impact of retention interval duration on the fidelity of pitch memory was investigated. Listeners heard "target" pure tones, followed by a retention interval (2-8 s), then a response period in which the frequency of a novel sound was adjusted to match their memory of the target. The variability of pitch matches increased with retention interval duration. Supplemental analyses of the most accurate trials and temporal dynamics of matching suggest that decreasing precision was not due to differences in complete forgetting among intervals. Results suggest that the precision of short-term memory for pitch may continuously degrade over time.
Affiliation(s)
- Matthew G Wisniewski
- Department of Psychological Sciences, Kansas State University, Manhattan, Kansas 66506, USA
- Michael A Tollefsrud
- Department of Psychological Sciences, Kansas State University, Manhattan, Kansas 66506, USA

7. McPherson MJ, McDermott JH. Relative pitch representations and invariance to timbre. Cognition 2023; 232:105327. PMID: 36495710; PMCID: PMC10016107; DOI: 10.1016/j.cognition.2022.105327.
Abstract
Information in speech and music is often conveyed through changes in fundamental frequency (f0), perceived by humans as "relative pitch". Relative pitch judgments are complicated by two facts. First, sounds can simultaneously vary in timbre due to filtering imposed by a vocal tract or instrument body. Second, relative pitch can be extracted in two ways: by measuring changes in constituent frequency components from one sound to another, or by estimating the f0 of each sound and comparing the estimates. We examined the effects of timbral differences on relative pitch judgments, and whether any invariance to timbre depends on whether judgments are based on constituent frequencies or their f0. Listeners performed up/down and interval discrimination tasks with pairs of spoken vowels, instrument notes, or synthetic tones, synthesized to be either harmonic or inharmonic. Inharmonic sounds lack a well-defined f0, such that relative pitch must be extracted from changes in individual frequencies. Pitch judgments were less accurate when vowels/instruments were different compared to when they were the same, and were biased by the associated timbre differences. However, this bias was similar for harmonic and inharmonic sounds, and was observed even in conditions where judgments of harmonic sounds were based on f0 representations. Relative pitch judgments are thus not invariant to timbre, even when timbral variation is naturalistic, and when such judgments are based on representations of f0.
Affiliation(s)
- Malinda J McPherson
- Department of Brain and Cognitive Sciences, MIT, Cambridge, MA 02139, United States of America; Program in Speech and Hearing Biosciences and Technology, Harvard University, Boston, MA 02115, United States of America; McGovern Institute for Brain Research, MIT, Cambridge, MA 02139, United States of America.
- Josh H McDermott
- Department of Brain and Cognitive Sciences, MIT, Cambridge, MA 02139, United States of America; Program in Speech and Hearing Biosciences and Technology, Harvard University, Boston, MA 02115, United States of America; McGovern Institute for Brain Research, MIT, Cambridge, MA 02139, United States of America; Center for Brains Minds and Machines, MIT, Cambridge, MA 02139, United States of America

8. Basiński K, Quiroga-Martinez DR, Vuust P. Temporal hierarchies in the predictive processing of melody - From pure tones to songs. Neurosci Biobehav Rev 2023; 145:105007. PMID: 36535375; DOI: 10.1016/j.neubiorev.2022.105007.
Abstract
Listening to musical melodies is a complex task that engages perceptual and memory-related processes. The processes underlying melody cognition happen simultaneously on different timescales, ranging from milliseconds to minutes. Although attempts have been made, research on melody perception has yet to produce a unified framework for how melody processing is achieved in the brain. This may in part be due to the difficulty of integrating concepts such as perception, attention, and memory, which pertain to different temporal scales. Recent theories of brain processing, which hold prediction as a fundamental principle, offer potential solutions to this problem and may provide a unifying framework for explaining the neural processes that enable melody perception on multiple temporal levels. In this article, we review empirical evidence for predictive coding on the levels of pitch formation, basic pitch-related auditory patterns, more complex regularity processing extracted from basic patterns, and long-term expectations related to musical syntax. We also identify areas that would benefit from further inquiry and suggest future directions in research on musical melody perception.
Affiliation(s)
- Krzysztof Basiński
- Division of Quality of Life Research, Medical University of Gdańsk, Poland
- David Ricardo Quiroga-Martinez
- Helen Wills Neuroscience Institute & Department of Psychology, University of California Berkeley, USA; Center for Music in the Brain, Aarhus University & The Royal Academy of Music, Denmark
- Peter Vuust
- Center for Music in the Brain, Aarhus University & The Royal Academy of Music, Denmark

9. Rallapalli V, Souza P. Feasibility of Tablet-Based Remote Data Collection Method for Measuring Hearing Aid Preference. Am J Audiol 2022; 31:746-756. PMID: 35914021; DOI: 10.1044/2022_aja-21-00273.
Abstract
PURPOSE: The purpose of this study was to determine the feasibility of a tablet-based remote data collection method for measuring preference for hearing aid signal processing features. METHOD: Participants were nine individuals with bilateral mild to moderately severe sensorineural hearing loss. Stimuli were spatialized low-context sentences mixed with six-talker babble at two realistic signal-to-noise ratios (3 and 8 dB) and processed through a hearing aid simulator. Preference for full factorial combinations of three common hearing aid processing features (two levels each) was elicited using a paired-comparison task. Participants completed two versions of the experiment: The lab version was completed in a sound-treated booth using a custom MATLAB application on a desktop computer; the remote version was completed in a quiet room in the participant's home, using a custom MATLAB executable application on a tablet. Both versions used the same calibrated headphones. Strict infection control protocols were followed. RESULTS: McNemar's test showed no association between preference and data collection method for the majority of the conditions. Percentage agreement and kappa scores were moderate/fair across most conditions. The results indicated that the remote versus lab versions did not have a systematic effect on preference. However, the relatively low agreement and kappa scores suggested within-subject variability in the outcome (preference). CONCLUSION: The tablet-based version of remote experimentation was comparable to the lab-based version for eliciting preference for hearing aid signal processing features.
Affiliation(s)
- Varsha Rallapalli
- Roxelyn and Richard Pepper Department of Communication Sciences & Disorders, Northwestern University, Evanston, IL
- Pamela Souza
- Roxelyn and Richard Pepper Department of Communication Sciences & Disorders, Northwestern University, Evanston, IL
- Knowles Hearing Center, Northwestern University, Evanston, IL

10. Sihn D, Kim SP. Spatio-Temporally Efficient Coding Assigns Functions to Hierarchical Structures of the Visual System. Front Comput Neurosci 2022; 16:890447. PMID: 35694611; PMCID: PMC9184804; DOI: 10.3389/fncom.2022.890447.
Abstract
Hierarchical structures constitute a wide array of brain areas, including the visual system. One of the important questions regarding visual hierarchical structures is to identify computational principles for assigning functions that represent the external world to hierarchical structures of the visual system. Given that visual hierarchical structures contain both bottom-up and top-down pathways, the derived principles should encompass these bidirectional pathways. However, existing principles such as predictive coding do not provide an effective principle for bidirectional pathways. Therefore, we propose a novel computational principle for visual hierarchical structures as spatio-temporally efficient coding underscored by the efficient use of given resources in both neural activity space and processing time. This coding principle optimises bidirectional information transmissions over hierarchical structures by simultaneously minimising temporal differences in neural responses and maximising entropy in neural representations. Simulations demonstrated that the proposed spatio-temporally efficient coding was able to assign the function of appropriate neural representations of natural visual scenes to visual hierarchical structures. Furthermore, spatio-temporally efficient coding was able to predict well-known phenomena, including deviations in neural responses to unlearned inputs and bias in preferred orientations. Our proposed spatio-temporally efficient coding may facilitate deeper mechanistic understanding of the computational processes of hierarchical brain structures.
Affiliation(s)
- Sung-Phil Kim
- Department of Biomedical Engineering, Ulsan National Institute of Science and Technology, Ulsan, South Korea

11.
Abstract
Hearing in noise is a core problem in audition, and a challenge for hearing-impaired listeners, yet the underlying mechanisms are poorly understood. We explored whether harmonic frequency relations, a signature property of many communication sounds, aid hearing in noise for normal hearing listeners. We measured detection thresholds in noise for tones and speech synthesized to have harmonic or inharmonic spectra. Harmonic signals were consistently easier to detect than otherwise identical inharmonic signals. Harmonicity also improved discrimination of sounds in noise. The largest benefits were observed for two-note up-down "pitch" discrimination and melodic contour discrimination, both of which could be performed equally well with harmonic and inharmonic tones in quiet, but which showed large harmonic advantages in noise. The results show that harmonicity facilitates hearing in noise, plausibly by providing a noise-robust pitch cue that aids detection and discrimination.

12. Wagner JD, Gelman A, Hancock KE, Chung Y, Delgutte B. Rabbits use both spectral and temporal cues to discriminate the fundamental frequency of harmonic complexes with missing fundamentals. J Neurophysiol 2022; 127:290-312. PMID: 34879207; PMCID: PMC8759963; DOI: 10.1152/jn.00366.2021.
Abstract
The pitch of harmonic complex tones (HCTs) common in speech, music, and animal vocalizations plays a key role in the perceptual organization of sound. Unraveling the neural mechanisms of pitch perception requires animal models, but little is known about complex pitch perception by animals, and some species appear to use different pitch mechanisms than humans. Here, we tested rabbits' ability to discriminate the fundamental frequency (F0) of HCTs with missing fundamentals, using a behavioral paradigm inspired by foraging behavior in which rabbits learned to harness a spatial gradient in F0 to find the location of a virtual target within a room for a food reward. Rabbits were initially trained to discriminate HCTs with F0s in the range 400-800 Hz and with harmonics covering a wide frequency range (800-16,000 Hz) and then tested with stimuli differing in spectral composition to test the role of harmonic resolvability (experiment 1) or in F0 range (experiment 2) or in both F0 and spectral content (experiment 3). Together, these experiments show that rabbits can discriminate HCTs over a wide F0 range (200-1,600 Hz) encompassing the range of conspecific vocalizations and can use either the spectral pattern of harmonics resolved by the cochlea for higher F0s or temporal envelope cues resulting from interaction between unresolved harmonics for lower F0s. The qualitative similarity of these results to human performance supports the use of rabbits as an animal model for studies of pitch mechanisms, providing species differences in cochlear frequency selectivity and F0 range of vocalizations are taken into account.
NEW & NOTEWORTHY: Understanding the neural mechanisms of pitch perception requires experiments in animal models, but little is known about pitch perception by animals. Here we show that rabbits, a popular animal in auditory neuroscience, can discriminate complex sounds differing in pitch using either spectral cues or temporal cues. The results suggest that the role of spectral cues in pitch perception by animals may have been underestimated by predominantly testing low frequencies in the range of the human voice.
Affiliation(s)
- Joseph D. Wagner
- Eaton-Peabody Laboratories, Massachusetts Eye and Ear, Boston, Massachusetts
- Department of Biomedical Engineering, Boston University, Boston, Massachusetts
- Alice Gelman
- Eaton-Peabody Laboratories, Massachusetts Eye and Ear, Boston, Massachusetts
- Kenneth E. Hancock
- Eaton-Peabody Laboratories, Massachusetts Eye and Ear, Boston, Massachusetts
- Department of Otolaryngology, Head and Neck Surgery, Harvard Medical School, Boston, Massachusetts
- Yoojin Chung
- Eaton-Peabody Laboratories, Massachusetts Eye and Ear, Boston, Massachusetts
- Department of Otolaryngology, Head and Neck Surgery, Harvard Medical School, Boston, Massachusetts
- Bertrand Delgutte
- Eaton-Peabody Laboratories, Massachusetts Eye and Ear, Boston, Massachusetts
- Department of Otolaryngology, Head and Neck Surgery, Harvard Medical School, Boston, Massachusetts

13. Saddler MR, Gonzalez R, McDermott JH. Deep neural network models reveal interplay of peripheral coding and stimulus statistics in pitch perception. Nat Commun 2021; 12:7278. PMID: 34907158; PMCID: PMC8671597; DOI: 10.1038/s41467-021-27366-6.
Abstract
Perception is thought to be shaped by the environments for which organisms are optimized. These influences are difficult to test in biological organisms but may be revealed by machine perceptual systems optimized under different conditions. We investigated environmental and physiological influences on pitch perception, whose properties are commonly linked to peripheral neural coding limits. We first trained artificial neural networks to estimate fundamental frequency from biologically faithful cochlear representations of natural sounds. The best-performing networks replicated many characteristics of human pitch judgments. To probe the origins of these characteristics, we then optimized networks given altered cochleae or sound statistics. Human-like behavior emerged only when cochleae had high temporal fidelity and when models were optimized for naturalistic sounds. The results suggest pitch perception is critically shaped by the constraints of natural environments in addition to those of the cochlea, illustrating the use of artificial neural networks to reveal underpinnings of behavior.
Affiliation(s)
- Mark R Saddler
- Department of Brain and Cognitive Sciences, MIT, Cambridge, MA, USA
- McGovern Institute for Brain Research, MIT, Cambridge, MA, USA
- Center for Brains, Minds and Machines, MIT, Cambridge, MA, USA
- Ray Gonzalez
- Department of Brain and Cognitive Sciences, MIT, Cambridge, MA, USA
- McGovern Institute for Brain Research, MIT, Cambridge, MA, USA
- Center for Brains, Minds and Machines, MIT, Cambridge, MA, USA
- Josh H McDermott
- Department of Brain and Cognitive Sciences, MIT, Cambridge, MA, USA
- McGovern Institute for Brain Research, MIT, Cambridge, MA, USA
- Center for Brains, Minds and Machines, MIT, Cambridge, MA, USA
- Program in Speech and Hearing Biosciences and Technology, Harvard University, Cambridge, MA, USA

14. Homma NY, Bajo VM. Lemniscal Corticothalamic Feedback in Auditory Scene Analysis. Front Neurosci 2021; 15:723893. PMID: 34489635; PMCID: PMC8417129; DOI: 10.3389/fnins.2021.723893.
Abstract
Sound information is transmitted from the ear to central auditory stations of the brain via several nuclei. In addition to these ascending pathways, there exist descending projections that can influence information processing at each of these nuclei. A major descending pathway in the auditory system is the feedback projection from layer VI of the primary auditory cortex (A1) to the ventral division of the medial geniculate body (MGBv) in the thalamus. The corticothalamic axons have small glutamatergic terminals that can modulate thalamic processing and thalamocortical information transmission. Corticothalamic neurons also provide input to GABAergic neurons of the thalamic reticular nucleus (TRN), which receives collaterals from the ascending thalamic axons. The balance of corticothalamic and TRN inputs has been shown to refine frequency tuning, firing patterns, and gating of MGBv neurons. The thalamus is therefore not merely a relay stage in the chain of auditory nuclei but participates in complex aspects of sound processing, including top-down modulations. In this review, we aim (i) to examine how lemniscal corticothalamic feedback modulates responses in MGBv neurons, and (ii) to explore how this feedback contributes to auditory scene analysis, particularly to frequency and harmonic perception. Finally, we discuss potential implications of the role of corticothalamic feedback in music and speech perception, where precise spectral and temporal processing is essential.
Affiliation(s)
- Natsumi Y. Homma
- Center for Integrative Neuroscience, University of California, San Francisco, San Francisco, CA, United States
- Coleman Memorial Laboratory, Department of Otolaryngology – Head and Neck Surgery, University of California, San Francisco, San Francisco, CA, United States
- Victoria M. Bajo
- Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, United Kingdom
| |