1. Calcus A. Development of auditory scene analysis: a mini-review. Front Hum Neurosci 2024;18:1352247. PMID: 38532788; PMCID: PMC10963424; DOI: 10.3389/fnhum.2024.1352247.
Abstract
Most auditory environments contain multiple sound waves that are mixed before reaching the ears. In such situations, listeners must disentangle individual sounds from the mixture, performing auditory scene analysis. Analyzing complex auditory scenes relies on listeners' ability to segregate acoustic events into different streams and to selectively attend to the stream of interest. Both segregation and selective attention are known to be challenging for adults with normal hearing, and seem to be even more difficult for children. Here, we review the recent literature on the development of auditory scene analysis, presenting behavioral and neurophysiological results. In short, the cognitive and neural mechanisms supporting stream segregation are functional from birth but keep developing until adolescence. Similarly, from 6 months of age, infants can orient their attention toward a target in the presence of distractors. However, selective auditory attention in the presence of interfering streams only reaches maturity in late childhood at the earliest. Methodological limitations are discussed, and a new paradigm is proposed to clarify the relationship between auditory scene analysis and speech perception in noise throughout development.
Affiliation(s)
- Axelle Calcus
- Center for Research in Cognitive Neuroscience (CRCN), ULB Neuroscience Institute (UNI), Université Libre de Bruxelles, Brussels, Belgium
2. Saberi K, Hickok G. Confirming an antiphasic bicyclic pattern of forward entrainment in signal detection: A reanalysis of Sun et al. (2021). Eur J Neurosci 2022;56:5274-5286. PMID: 36057434; PMCID: PMC9826078; DOI: 10.1111/ejn.15816.
Abstract
Forward entrainment refers to that part of the entrainment process that persists after termination of an entraining stimulus. Hickok et al. (2015) reported forward entrainment in signal detection that lasted for two post-stimulus cycles. In a recent paper, Sun et al. (2021) reported new data which suggested an absence of entrainment effects (Eur. J. Neurosci, 1-18, doi.org/10.1111/ejn.15367). Here we show that when Sun et al.'s data are analysed using unbiased detection-theoretic measures, a clear antiphasic bicyclic pattern of entrainment is observed. We further show that the measure of entrainment strength used by Sun et al., the normalized Fourier transform of performance curves, is not only erroneously calculated but is also unreliable in estimating entrainment strength due to signal-processing artifacts.
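For context on the "unbiased detection-theoretic measures" referenced in this abstract: the standard sensitivity index d′ separates detectability from response bias by z-transforming hit and false-alarm rates. The sketch below is a generic illustration with invented trial counts, not the authors' reanalysis code; the log-linear correction is one common convention for avoiding infinite z-scores.

```python
from statistics import NormalDist

def d_prime(hits, misses, false_alarms, correct_rejections):
    """Unbiased sensitivity index: d' = z(hit rate) - z(false-alarm rate).

    A log-linear correction (add 0.5 to each cell) keeps the z-transform
    finite when a rate would otherwise be exactly 0 or 1.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)

# Hypothetical session: 45 hits / 5 misses, 10 false alarms / 40 correct rejections
print(round(d_prime(45, 5, 10, 40), 2))
```

Because d′ is computed from both hit and false-alarm rates, a shift in the listener's response criterion moves the two rates together and leaves d′ unchanged, which is what makes it "unbiased" relative to raw percent correct.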
Affiliation(s)
- Kourosh Saberi
- Department of Cognitive Sciences, University of California, Irvine, Irvine, California, USA
- Gregory Hickok
- Department of Cognitive Sciences, University of California, Irvine, Irvine, California, USA; Department of Language Science, University of California, Irvine, Irvine, California, USA
3. Development of Masked Speech Detection Thresholds in 2- to 15-year-old Children: Speech-Shaped Noise and Two-Talker Speech Maskers. Ear Hear 2021;42:1712-1726. PMID: 33928913; DOI: 10.1097/aud.0000000000001062.
Abstract
OBJECTIVES Data from school-aged children provide consistent evidence of a prolonged course of auditory development for perceiving speech embedded in competing background sounds. Furthermore, age-related differences are more prolonged and pronounced for a two-talker speech masker than for a speech-shaped noise masker. However, little is known about the course of development during the toddler and preschool years because it is difficult to collect reliable behavioral data from this age range. The goal of this study was to extend our lower age limit to include toddlers and preschoolers to characterize the developmental trajectory for masked speech detection thresholds across childhood. DESIGN Participants were 2- to 15-year-old children (n = 67) and adults (n = 17), all with normal hearing. Thresholds (71%) were measured for detecting a two-syllable word embedded in one of two maskers: speech-shaped noise or two-talker speech. The masker was presented at 55 dB SPL throughout testing. Stimuli were presented to the left ear via a lightweight headphone. Data were collected using an observer-based testing method in which the participant's behavior was judged by an experimenter using a two-interval, two-alternative testing paradigm. The participant's response to the stimulus was shaped by training him/her to perform a conditioned play-based response to the sound. For children, receptive vocabulary and working memory were measured. Data were fitted with a linear regression model to establish the course of development for each masker condition. Appropriateness of the test method was also evaluated by determining whether there were age-related differences in training data, inter-rater reliability, or slope or upper asymptote estimates from pooled psychometric functions across different age groups. RESULTS Child and adult speech detection thresholds were poorer in the two-talker masker than in the speech-shaped noise masker, but different developmental trajectories were seen for the two masker conditions. For the speech-shaped noise masker, threshold improved by about 5 dB across the age span tested, with adult-like performance being reached around 10 years of age. For the two-talker masker condition, thresholds improved by about 7 dB between 2.5 and 15 years. However, the linear fit for this condition failed to reach adult-like performance because of limited data from teenagers. No significant age-related differences were seen in training data, probe hit rate, or inter-rater reliability. Furthermore, slope and upper asymptote estimates from pooled psychometric functions were similar across different child age groups. CONCLUSIONS Different developmental patterns were seen across the two maskers, with more pronounced child-adult differences and prolonged immaturity during childhood for the two-talker masker relative to the speech-shaped noise masker. Our data do not support the idea that there is rapid improvement of masked speech detection thresholds between 2.5 and 5 years of age. This study also highlights that our observer-based method can be used to collect reliable behavioral data from toddlers and preschoolers, a period during which we know little about auditory development.
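The 71% threshold criterion reported in this abstract is the level tracked by a two-down, one-up adaptive rule, which converges on approximately 70.7% correct (Levitt, 1971). The abstract does not describe the exact adaptive procedure, so the staircase below is a generic sketch; the step size, reversal count, and stopping rule are assumptions for illustration, not the study's protocol.

```python
import random

def staircase_2down_1up(p_correct_at, start_level=30.0, step=4.0,
                        n_reversals=8, seed=1):
    """Two-down, one-up adaptive staircase (tracks ~70.7% correct).

    p_correct_at(level) -> probability of a correct response at that level.
    The threshold estimate is the mean level at the reversal points,
    discarding the first two reversals.
    """
    rng = random.Random(seed)
    level = start_level
    direction = 0          # -1 descending, +1 ascending, 0 before first move
    correct_run = 0
    reversals = []
    while len(reversals) < n_reversals:
        if rng.random() < p_correct_at(level):
            correct_run += 1
            if correct_run == 2:           # two consecutive correct: step down
                correct_run = 0
                if direction == +1:        # direction change: log a reversal
                    reversals.append(level)
                direction = -1
                level -= step
        else:                              # any incorrect response: step up
            correct_run = 0
            if direction == -1:
                reversals.append(level)
            direction = +1
            level += step
    return sum(reversals[2:]) / len(reversals[2:])
```

With `n_reversals=8`, the estimate averages the final six reversal levels, a common compromise between test time and estimate stability.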
4. Lalonde K, McCreery RW. Audiovisual Enhancement of Speech Perception in Noise by School-Age Children Who Are Hard of Hearing. Ear Hear 2021;41:705-719. PMID: 32032226; PMCID: PMC7822589; DOI: 10.1097/aud.0000000000000830.
Abstract
OBJECTIVES The purpose of this study was to examine age- and hearing-related differences in school-age children's benefit from visual speech cues. The study addressed three questions: (1) Do age and hearing loss affect the degree of audiovisual (AV) speech enhancement in school-age children? (2) Are there age- and hearing-related differences in the mechanisms underlying AV speech enhancement in school-age children? (3) What cognitive and linguistic variables predict individual differences in AV benefit among school-age children? DESIGN Forty-eight children between 6 and 13 years of age (19 with mild to severe sensorineural hearing loss; 29 with normal hearing) and 14 adults with normal hearing completed measures of auditory and AV syllable detection and/or sentence recognition in a two-talker masker and a spectrally matched noise masker. Children also completed standardized behavioral measures of receptive vocabulary, visuospatial working memory, and executive attention. Mixed linear modeling was used to examine effects of modality, listener group, and masker on sentence recognition accuracy and syllable detection thresholds. Pearson correlations were used to examine the relationship between individual differences in children's AV enhancement (AV minus auditory-only) and age, vocabulary, working memory, executive attention, and degree of hearing loss. RESULTS Significant AV enhancement was observed across all tasks, masker types, and listener groups. AV enhancement of sentence recognition was similar across maskers, but children with normal hearing exhibited less AV enhancement of sentence recognition than adults with normal hearing and children with hearing loss. AV enhancement of syllable detection was greater in the two-talker masker than in the noise masker, but did not vary significantly across listener groups. Degree of hearing loss positively correlated with individual differences in AV benefit on the sentence recognition task in noise, but not on the detection task. None of the cognitive and linguistic variables correlated with individual differences in AV enhancement of syllable detection or sentence recognition. CONCLUSIONS Although AV benefit to syllable detection results from the use of visual speech to increase temporal expectancy, AV benefit to sentence recognition requires that an observer extract phonetic information from the visual speech signal. The findings from this study suggest that all listener groups were equally good at using temporal cues in visual speech to detect auditory speech, but that adults with normal hearing and children with hearing loss were better than children with normal hearing at extracting phonetic information from the visual signal and/or using visual speech information to access phonetic/lexical representations in long-term memory. These results suggest that standard, auditory-only clinical speech recognition measures likely underestimate the real-world speech recognition skills of children with mild to severe hearing loss.
Affiliation(s)
- Kaylah Lalonde
- Center for Hearing Research, Boys Town National Research Hospital, Omaha, NE, USA
- Ryan W. McCreery
- Center for Hearing Research, Boys Town National Research Hospital, Omaha, NE, USA
5. Lalonde K, Werner LA. Development of the Mechanisms Underlying Audiovisual Speech Perception Benefit. Brain Sci 2021;11:49. PMID: 33466253; PMCID: PMC7824772; DOI: 10.3390/brainsci11010049.
Abstract
The natural environments in which infants and children learn speech and language are noisy and multimodal. Adults rely on the multimodal nature of speech to compensate for noisy environments during speech communication. Multiple mechanisms underlie mature audiovisual benefit to speech perception, including reduced uncertainty as to when auditory speech will occur, use of correlations between the amplitude envelope of auditory and visual signals in fluent speech, and use of visual phonetic knowledge for lexical access. This paper reviews evidence regarding infants' and children's use of temporal and phonetic mechanisms in audiovisual speech perception benefit. The ability to use temporal cues for audiovisual speech perception benefit emerges in infancy. Although infants are sensitive to the correspondence between auditory and visual phonetic cues, the ability to use this correspondence for audiovisual benefit may not emerge until age four. A more cohesive account of the development of audiovisual speech perception may follow from a more thorough understanding of the development of sensitivity to and use of various temporal and phonetic cues.
Affiliation(s)
- Kaylah Lalonde
- Center for Hearing Research, Boys Town National Research Hospital, Omaha, NE 68131, USA
- Lynne A. Werner
- Department of Speech and Hearing Sciences, University of Washington, Seattle, WA 98105, USA
6. Bonino AY, Wiens A, Nightengale EC, Vance EA. Interrater Reliability for a Two-Interval, Observer-Based Procedure for Measuring Hearing in Young Children. Am J Audiol 2020;29:762-773. PMID: 32966098; DOI: 10.1044/2020_aja-20-00022.
Abstract
Purpose To overcome methodological limitations for studying auditory development in young children, we have recently developed an observer-based procedure that uses a conditioned, play-based, motor response (see Bonino & Leibold, 2017). The purpose of this article was to examine interrater reliability for the method. Method Video recordings of test sessions of 2- to 4-year-old children (n = 17) were examined. Detection of a 1000-Hz warble tone was measured with the Play Observer-Based, Two-Interval (PlayO2I) method in each of two conditions: for a fixed intensity level (30 dB SPL) or for a variable intensity level signal (0-30 dB SPL). All test sessions were scored independently by three observers (one real-time, two offline). Observer consensus was evaluated with Fleiss' kappa statistic. To determine whether summary data were similar across the observers of each test session, the proportion of correct trials (fixed-level condition) or the threshold (variable-level condition) was computed. Results The strength of observer consensus was classified as "almost perfect" and "substantial" for the fixed-level and variable-level conditions, respectively. Follow-up analysis of the variable-level data indicated that differences in observer consensus were seen based on the signal level, the type of response behavior provided by the child, and the confidence level of the real-time observer. Resulting summary data were similar across the three observers of each test session: no significant differences for estimates of the proportion of correct trials or threshold. Conclusions Results from this study confirm strong interrater reliability for the method. The PlayO2I method is a powerful tool for measuring detection and discrimination abilities in young children. Supplemental Material https://doi.org/10.23641/asha.12978197.
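The Fleiss' kappa statistic named in this abstract generalizes Cohen's kappa to any fixed number of raters: it compares mean observed per-item agreement against chance agreement derived from the marginal category proportions. The sketch below uses an invented toy table, not the study's data.

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa for a table where ratings[i][j] is the number of
    raters who assigned item i to category j (same rater count per item)."""
    n_items = len(ratings)
    n_raters = sum(ratings[0])
    # Observed per-item agreement P_i, then its mean across items
    p_i = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
           for row in ratings]
    p_bar = sum(p_i) / n_items
    # Chance agreement from the marginal category proportions
    totals = [sum(row[j] for row in ratings) for j in range(len(ratings[0]))]
    p_j = [t / (n_items * n_raters) for t in totals]
    p_e = sum(p * p for p in p_j)
    return (p_bar - p_e) / (1 - p_e)

# Three observers scoring six trials as correct/incorrect (hypothetical counts)
table = [[3, 0], [3, 0], [2, 1], [3, 0], [0, 3], [3, 0]]
print(round(fleiss_kappa(table), 3))  # → 0.679
```

On common benchmark scales (e.g., Landis & Koch), values above roughly 0.61 are read as "substantial" and above 0.81 as "almost perfect" agreement, the labels used in the abstract.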
Affiliation(s)
- Angela Yarnell Bonino
- Department of Speech, Language, and Hearing Sciences, University of Colorado Boulder
- Ashton Wiens
- Laboratory for Interdisciplinary Statistical Analysis, Department of Applied Mathematics, University of Colorado Boulder
- Emily C. Nightengale
- Department of Speech, Language, and Hearing Sciences, University of Colorado Boulder
- Eric A. Vance
- Laboratory for Interdisciplinary Statistical Analysis, Department of Applied Mathematics, University of Colorado Boulder
7. Halverson DM, Lalonde K. Does visual speech provide release from perceptual masking in children? J Acoust Soc Am 2020;148:EL221. PMID: 33003896; PMCID: PMC7731949; DOI: 10.1121/10.0001867.
Abstract
Adults benefit more from visual speech in speech maskers than in noise maskers because visual speech helps perceptually isolate target talkers from competing talkers. To investigate whether children use visual speech to perceptually isolate target talkers, this study compared children's speech recognition thresholds in auditory and audiovisual conditions across two maskers: two-talker speech and noise. Children demonstrated similar audiovisual benefit in both maskers. Individual differences in speechreading accuracy predicted audiovisual benefit in each masker to a similar degree. Results suggest that although visual speech improves children's masked speech recognition thresholds, children may use visual speech in different ways than adults.
Affiliation(s)
- Destinee M Halverson
- Center for Hearing Research, Boys Town National Research Hospital, Omaha, Nebraska 68104
- Kaylah Lalonde
- Center for Hearing Research, Boys Town National Research Hospital, Omaha, Nebraska 68104
8. The rhythm of attention: Perceptual modulation via rhythmic entrainment is lowpass and attention mediated. Atten Percept Psychophys 2020;82:3558-3570. PMID: 32686065; DOI: 10.3758/s13414-020-02095-y.
Abstract
Modulation patterns are known to carry critical predictive cues to signal detection in complex acoustic environments. The current study investigated the persistence of masker modulation effects on postmodulation detection of probe signals. Hickok, Farahbod, and Saberi (Psychological Science, 26, 1006-1013, 2015) demonstrated that thresholds for a tone pulse in stationary noise follow a predictable periodic pattern when preceded by a 3-Hz amplitude modulated masker. They found entrainment of detection patterns to the modulation envelope lasting for approximately two cycles after termination of modulation. The current study extends these results to a wide range of modulation rates by mapping the temporal modulation transfer function for persistent modulatory effects. We found significant entrainment to modulation rates of 2 and 3 Hz, a weaker effect at 5 Hz, and no entrainment at higher rates (8 to 32 Hz). The effect seems critically dependent on attentional mechanisms, requiring temporal and level uncertainty of the probe signal. Our findings suggest that the persistence of modulatory effects on signal detection is lowpass in nature and attention based.
9. Lalonde K, Werner LA. Infants and Adults Use Visual Cues to Improve Detection and Discrimination of Speech in Noise. J Speech Lang Hear Res 2019;62:3860-3875. PMID: 31618097; PMCID: PMC7201336; DOI: 10.1044/2019_jslhr-h-19-0106.
Abstract
Purpose This study assessed the extent to which 6- to 8.5-month-old infants and 18- to 30-year-old adults detect and discriminate auditory syllables in noise better in the presence of visual speech than in auditory-only conditions. In addition, we examined whether visual cues to the onset and offset of the auditory signal account for this benefit. Method Sixty infants and 24 adults were randomly assigned to speech detection or discrimination tasks and were tested using a modified observer-based psychoacoustic procedure. Each participant completed 1-3 conditions: auditory-only, with visual speech, and with a visual signal that only cued the onset and offset of the auditory syllable. Results Mixed linear modeling indicated that infants and adults benefited from visual speech on both tasks. Adults relied on the onset-offset cue for detection, but the same cue did not improve their discrimination. The onset-offset cue benefited infants for both detection and discrimination. Whereas the onset-offset cue improved detection similarly for infants and adults, the full visual speech signal benefited infants to a lesser extent than adults on the discrimination task. Conclusions These results suggest that infants' use of visual onset-offset cues is mature, but their ability to use more complex visual speech cues is still developing. Additional research is needed to explore differences in audiovisual enhancement (a) of speech discrimination across speech targets and (b) with increasingly complex tasks and stimuli.
Affiliation(s)
- Kaylah Lalonde
- Department of Speech & Hearing Sciences, University of Washington, Seattle
- Lynne A. Werner
- Department of Speech & Hearing Sciences, University of Washington, Seattle
10. Bonino AY, Ramsey ME, McTee HM, Vance EA. Behavioral Assessment of Hearing in 2- to 7-Year-Old Children: Evaluation of a Two-Interval, Observer-Based Procedure Using Conditioned Play-Based Responses. Am J Audiol 2019;28:560-571. PMID: 31238003; PMCID: PMC7219350; DOI: 10.1044/2019_aja-19-0004.
Abstract
Purpose It is challenging to collect reliable behavioral data from toddlers and preschoolers. Consequently, we have significant gaps in our understanding of how auditory development unfolds during this time period. One method that appears to be promising is an observer-based procedure that uses conditioned, play-based responses (Bonino & Leibold, 2017). In order to evaluate the quality of data obtained with this method, this study presented a suprathreshold signal to determine the number of trials 2- to 7-year-old children could complete, as well as the associated hit rate and observer confidence. Method Participants were 23 children (2-7 years old). Children were taught to perform a play-based motor response when they detected the 1000-Hz warble tone signal (at 30 dB SPL). An observer evaluated children's behavior using a 2-interval, 2-alternative testing paradigm. Testing was terminated after 100 trials or earlier, if signs of habituation were observed. Results Data were successfully collected from 22 of the 23 children. Of the 22 children, all but 1 child completed 100 trials. Overall hit rate was high (0.88-1.0; M = 0.94) and improved with listener age. Hit rate was stable across the test session. Strong agreement was seen between the correctness of the response and the observer's confidence in the judgment. Conclusion Results of this study confirm that the 2-interval, observer-based procedure described in this article is a powerful tool for measuring detection and discrimination abilities in young children. Future research will (a) evaluate coder reliability and (b) examine stability of performance across a test session when the signal intensity is manipulated. Supplemental Material https://doi.org/10.23641/asha.8309273.
Affiliation(s)
- Michael E. Ramsey
- Laboratory for Interdisciplinary Statistical Analysis, Department of Applied Mathematics, University of Colorado Boulder
- Haley M. McTee
- Department of Speech, Language and Hearing Sciences, University of Colorado Boulder
- Eric A. Vance
- Laboratory for Interdisciplinary Statistical Analysis, Department of Applied Mathematics, University of Colorado Boulder
11. A Novel Communication Value Task Demonstrates Evidence of Response Bias in Cases with Presbyacusis. Sci Rep 2017;7:16512. PMID: 29184188; PMCID: PMC5705661; DOI: 10.1038/s41598-017-16673-y.
Abstract
Decision-making about the expected value of an experience or behavior can explain hearing health behaviors in older adults with hearing loss. Forty-four middle-aged to older adults (68.45 ± 7.73 years) performed a task in which they were asked to decide whether information from a surgeon or an administrative assistant would be important to their health in hypothetical communication scenarios across visual signal-to-noise ratios (SNR). Participants also could choose to view the briefly presented sentences multiple times. The number of these effortful attempts to read the stimuli served as a measure of demand for information to make a health importance decision. Participants with poorer high frequency hearing more frequently decided that information was important to their health compared to participants with better high frequency hearing. This appeared to reflect a response bias because participants with high frequency hearing loss demonstrated shorter response latencies when they rated the sentences as important to their health. However, elevated high frequency hearing thresholds did not predict demand for information to make a health importance decision. The results highlight the utility of a performance-based measure to characterize effort and expected value from performing tasks in older adults with hearing loss.
12.
Abstract
OBJECTIVES Pure-tone audiometry has been a staple of hearing assessments for decades. Many different procedures have been proposed for measuring thresholds with pure tones by systematically manipulating intensity one frequency at a time until a discrete threshold function is determined. The authors have developed a novel nonparametric approach for estimating a continuous threshold audiogram using Bayesian estimation and machine learning classification. The objective of this study was to assess the accuracy and reliability of this new method relative to a commonly used threshold measurement technique. DESIGN The authors performed air conduction pure-tone audiometry on 21 participants between the ages of 18 and 90 years with varying degrees of hearing ability. Two repetitions of automated machine learning audiogram estimation and one repetition of conventional modified Hughson-Westlake ascending-descending audiogram estimation were acquired by an audiologist. The estimated hearing thresholds of these two techniques were compared at standard audiogram frequencies (i.e., 0.25, 0.5, 1, 2, 4, 8 kHz). RESULTS The two threshold estimate methods delivered very similar estimates at standard audiogram frequencies. Specifically, the mean absolute difference between estimates was 4.16 ± 3.76 dB HL. The mean absolute difference between repeated measurements of the new machine learning procedure was 4.51 ± 4.45 dB HL. These values compare favorably with those of other threshold audiogram estimation procedures. Furthermore, the machine learning method generated threshold estimates from significantly fewer samples than the modified Hughson-Westlake procedure while returning a continuous threshold estimate as a function of frequency. CONCLUSIONS The new machine learning audiogram estimation technique produces continuous threshold audiogram estimates accurately, reliably, and efficiently, making it a strong candidate for widespread application in clinical and research audiometry.
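The conventional benchmark in this abstract, the modified Hughson-Westlake procedure, follows a simple ascending-descending rule: descend 10 dB after each response, ascend 5 dB after each non-response, and take as threshold the lowest level at which the listener responds on repeated ascending presentations. The sketch below is a simplified illustration of that logic (it uses a two-response criterion rather than the full 2-of-3 ascending bookkeeping), with a deterministic simulated listener; it is not clinical audiometry software.

```python
def hughson_westlake(responds, start=40, floor=-10, ceiling=120):
    """Simplified modified Hughson-Westlake search (down 10 dB, up 5 dB).

    responds(level) -> bool. Threshold is the lowest level that yields
    two responses during the bracketing phase.
    """
    level = start
    # Initial descent: drop 10 dB after each response until a non-response
    while level > floor and responds(level):
        level -= 10
    hits = {}
    while True:
        if responds(level):
            hits[level] = hits.get(level, 0) + 1
            if hits[level] >= 2:
                return level
            level -= 10                  # response: descend 10 dB
        else:
            level += 5                   # no response: ascend 5 dB
        if not (floor <= level <= ceiling):
            return None                  # no reliable threshold found

# Deterministic simulated listener who hears everything at or above 25 dB HL
print(hughson_westlake(lambda lv: lv >= 25))  # → 25
```

The Bayesian machine-learning method described in the abstract replaces this discrete bracketing with a continuous estimate over frequency, which is why it can reach similar accuracy from fewer samples.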