251
|
Rennies J, Brand T, Kollmeier B. Prediction of the influence of reverberation on binaural speech intelligibility in noise and in quiet. The Journal of the Acoustical Society of America 2011; 130:2999-3012. [PMID: 22087928 DOI: 10.1121/1.3641368] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Indexed: 05/22/2023]
Abstract
Reverberation usually degrades speech intelligibility for spatially separated speech and noise sources since spatial unmasking is reduced and late reflections decrease the fidelity of the received speech signal. The latter effect could not satisfactorily be predicted by a recently presented binaural speech intelligibility model [Beutelmann et al. (2010). J. Acoust. Soc. Am. 127, 2479-2497]. This study therefore evaluated three extensions of the model to improve its predictions: (1) an extension of the speech intelligibility index based on modulation transfer functions, (2) a correction factor based on the room acoustical quantity "definition," and (3) a separation of the speech signal into useful and detrimental parts. The predictions were compared to results of two experiments in which speech reception thresholds were measured in a reverberant room in quiet and in the presence of a noise source for listeners with normal hearing. All extensions yielded better predictions than the original model when the influence of reverberation was strong, while predictions were similar for conditions with less reverberation. Although model (3) differed substantially in the assumed interaction of binaural processing and early reflections, its predictions were very similar to those of model (2), which achieved the best fit to the data.
Affiliation(s)
- Jan Rennies
- Project Group Hearing, Speech and Audio Technology, Fraunhofer Institute for Digital Media Technology IDMT, Marie-Curie-Str. 2, D-26129 Oldenburg, Germany.
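Extension (2) in the abstract above relies on the room acoustical quantity "definition". As a rough illustration, the sketch below computes the common 50 ms form of that quantity (D50: early energy divided by total energy of a room impulse response). The synthetic exponential decay, the function names, and the choice of the strongest sample as the direct sound are assumptions made for this example; how the paper turns the quantity into a correction factor is not stated in the abstract and is not reproduced here.

```python
import math


def definition_d50(impulse_response, sample_rate_hz, early_window_s=0.050):
    """Energy arriving within 50 ms of the direct sound divided by total energy."""
    energies = [x * x for x in impulse_response]
    # Assumption: the strongest sample marks the arrival of the direct sound.
    onset = max(range(len(energies)), key=energies.__getitem__)
    split = onset + int(round(early_window_s * sample_rate_hz))
    early = sum(energies[onset:split])
    total = sum(energies[onset:])
    return early / total


def synthetic_impulse_response(rt60_s, sample_rate_hz, duration_s):
    """Idealized decay whose energy falls by 60 dB after rt60_s seconds."""
    # A 60 dB energy drop is a factor of 1000 in amplitude, hence ln(1000) = 3 ln(10).
    amplitude_decay = 3.0 * math.log(10.0) / rt60_s
    n_samples = int(duration_s * sample_rate_hz)
    return [math.exp(-amplitude_decay * i / sample_rate_hz) for i in range(n_samples)]


if __name__ == "__main__":
    for rt60 in (0.5, 2.0):
        ir = synthetic_impulse_response(rt60, 16000, 4.0)
        print(f"RT60 = {rt60} s -> D50 = {definition_d50(ir, 16000):.3f}")
```

For this idealized decay, a longer reverberation time gives a lower D50, which matches the abstract's point that late reflections reduce the usable part of the speech signal.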
|
252
|
Clinical evaluation of signal-to-noise ratio-based noise reduction in Nucleus® cochlear implant recipients. Ear Hear 2011; 32:382-90. [PMID: 21206365 DOI: 10.1097/aud.0b013e318201c200] [Citation(s) in RCA: 78] [Impact Index Per Article: 5.6] [Indexed: 11/26/2022]
Abstract
OBJECTIVE: The aim of this study was to investigate whether a real-time noise reduction algorithm provided speech perception benefit for Cochlear™ Nucleus® cochlear implant recipients in the laboratory. DESIGN: The noise reduction algorithm attenuated masker-dominated channels. It estimated the signal-to-noise ratio of each channel on a short-term basis from a single microphone input, using a recursive minimum statistics method. In this clinical evaluation, the algorithm was implemented in two programs (noise reduction programs 1 [NR1] and 2 [NR2]), which differed in their level of noise reduction. These programs used advanced combination encoder (ACE™) channel selection and were compared with ACE without noise reduction in 13 experienced cochlear implant subjects. An adaptive speech reception threshold (SRT) test provided the signal-to-noise ratio for 50% sentence intelligibility in three different types of noises: speech-weighted, cocktail party, and street-side city noise. RESULTS: In all three noise types, mean SRTs for both NR programs were significantly better than those for ACE. The greatest improvement occurred for speech-weighted noise; the SRT benefit over ACE was 1.77 dB for NR1 and 2.14 dB for NR2. There were no significant differences in speech perception scores between the two NR programs. Subjects reported no degradation in sound quality with the experimental programs. CONCLUSIONS: The noise reduction algorithm was successful in improving sentence perception in speech-weighted noise, as well as in more dynamic types of background noise. The algorithm is currently being trialed in a behind-the-ear processor for take-home use.
|
253
|
Arweiler I, Buchholz JM. The influence of spectral characteristics of early reflections on speech intelligibility. The Journal of the Acoustical Society of America 2011; 130:996-1005. [PMID: 21877812 DOI: 10.1121/1.3609258] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.5] [Indexed: 05/31/2023]
Abstract
The auditory system takes advantage of early reflections (ERs) in a room by integrating them with the direct sound (DS) and thereby increasing the effective speech level. In the present paper the benefit from realistic ERs on speech intelligibility in diffuse speech-shaped noise was investigated for normal-hearing and hearing-impaired listeners. Monaural and binaural speech intelligibility tests were performed in a virtual auditory environment where the spectral characteristics of ERs from a simulated room could be preserved. The useful ER energy was derived from the speech intelligibility results and the efficiency of the ERs was determined as the ratio of the useful ER energy to the total ER energy. Even though ER energy contributed to speech intelligibility, DS energy was always more efficient, leading to better speech intelligibility for both groups of listeners. The efficiency loss for the ERs was mainly ascribed to their altered spectrum compared to the DS and to the filtering by the torso, head, and pinna. No binaural processing other than a binaural summation effect could be observed.
Affiliation(s)
- Iris Arweiler
- Department of Electrical Engineering, Technical University of Denmark, Lyngby, Denmark.
|
254
|
Uslar V, Ruigendijk E, Hamann C, Brand T, Kollmeier B. How does linguistic complexity influence intelligibility in a German audiometric sentence intelligibility test? Int J Audiol 2011; 50:621-31. [PMID: 21714708 DOI: 10.3109/14992027.2011.582166] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Indexed: 11/13/2022]
Abstract
OBJECTIVE: We investigated whether linguistic complexity contributes to the variation of the speech reception threshold in noise (SRTN) and thus should be employed as an additional design criterion in sentence tests used for audiometry. DESIGN: Three test lists were established with sentences from the Göttingen sentence test (Kollmeier & Wesselkamp, 1997). One list contained linguistically simple sentences; the other two lists contained sentences with two types of linguistic complexity. For each listener the SRTN was determined for each list. STUDY SAMPLE: Younger and older listeners with normal hearing and older listeners with hearing impairment were tested. RESULTS: Younger listeners with normal hearing showed significantly worse SRTNs on the complex lists than on the simple list. This difference could not be found for either of the older groups. CONCLUSIONS: The effect of linguistic complexity on speech recognition seems to depend on age and/or hearing status. Hence, pending further research, linguistic complexity seems less relevant as a sentence test design criterion for clinical-audiological purposes, but we argue that a test with larger variation in linguistic complexity across sentences might show a relation between linguistic complexity and speech recognition even in a clinical population.
Affiliation(s)
- Verena Uslar
- Carl von Ossietzky University, Oldenburg, Germany.
|
255
|
Meister H, Landwehr M, Pyschny V, Grugel L, Walger M. Use of intonation contours for speech recognition in noise by cochlear implant recipients. The Journal of the Acoustical Society of America 2011; 129:EL204-EL209. [PMID: 21568376 DOI: 10.1121/1.3574501] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 05/30/2023]
Abstract
The corruption of intonation contours has detrimental effects on sentence-based speech recognition in normal-hearing listeners [Binns and Culling (2007). J. Acoust. Soc. Am. 122, 1765-1776]. This paper examines whether this finding also applies to cochlear implant (CI) recipients. The subjects' F0-discrimination and speech perception in the presence of noise were measured, using sentences with regular and inverted F0-contours. The results revealed that speech recognition for regular contours was significantly better than for inverted contours. This difference was related to the subjects' F0-discrimination, providing further evidence that the perception of intonation patterns is important for CI-mediated speech recognition in noise.
Affiliation(s)
- Hartmut Meister
- Jean-Uhrmacher-Institute for Clinical ENT-Research, University of Cologne, Geibelstrasse 29-31, D-50931 Cologne, Germany.
|
256
|
Sukowski H, Brand T, Wagener KC, Kollmeier B. [Comparison of the Göttingen sentence test and the monosyllabic rhyme test by von Wallenberg and Kollmeier with the Freiburg speech test: Investigation in a clinically representative group of listeners]. HNO 2010; 58:597-604. [PMID: 20533016 DOI: 10.1007/s00106-009-2066-1] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 10/19/2022]
Abstract
BACKGROUND: In a previous study [12] we compared the Freiburg speech test (number test and monosyllabic test) with the Göttingen sentence test and the monosyllabic rhyme test developed by von Wallenberg and Kollmeier. For a small group of participants we were able to demonstrate that the often criticized Freiburg speech test could be replaced by more modern test procedures. In the current study we verified this for a larger and more heterogeneous group of participants. METHOD: A total of 145 participants with hearing impairments were tested with the Freiburg speech test and the modern procedures. Both monosyllabic tests were carried out at three different presentation levels. Based on the findings of the previous study, the monosyllabic rhyme test was performed in each case with a presentation level reduced by 15 dB relative to the Freiburg monosyllabic test levels. RESULTS: The feasibility of replacing both parts of the Freiburg speech test with more modern test procedures was confirmed. The comparison of both monosyllabic tests showed that a reduction in the presentation level by 20 dB for the monosyllabic rhyme test would be most appropriate to achieve on average the same results with both procedures.
Affiliation(s)
- H Sukowski
- Medizinische Physik, Universität Oldenburg, Deutschland.
|
257
|
Ozimek E, Warzybok A, Kutzner D. Polish sentence matrix test for speech intelligibility measurement in noise. Int J Audiol 2010; 49:444-54. [PMID: 20482292 DOI: 10.3109/14992021003681030] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.3] [Indexed: 11/13/2022]
Abstract
The purpose of this study was to develop the Polish sentence matrix test (PSMT) to measure intelligibility of speech presented against a background noise. The PSMT consists of five columns containing: 10 names, 10 verbs, 10 numerals, 10 adjectives, and 10 nouns. Since each word was available as a separate sound file, it was possible to generate different sentences by juxtaposing randomly selected words taken from respective columns. This approach allows 100,000 unique sentences of a fixed grammatical structure to be generated. The speech reception threshold (SRT), i.e. the signal-to-noise ratio (SNR) providing 50% speech intelligibility and S(50), the slope of an intelligibility function at the SRT point, were shown to be -9.6 dB and 17.1 %/dB, respectively. Note that in this study dB is regarded as dB SNR, otherwise reference is given. PSMT was also evaluated using an adaptive 1-up/1-down staircase procedure in investigations with and without participation of an experimenter. No significant differences were shown for SRTs obtained in these investigations.
Affiliation(s)
- Edward Ozimek
- Institute of Acoustics, A. Mickiewicz University, Poznań, Poland.
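The matrix construction and the reported SRT and slope in the abstract above can be illustrated with a short sketch. The word lists are placeholders (the real PSMT vocabulary is not given here), and the logistic shape of the intelligibility function is an assumption chosen so that the curve passes through 50% at the SRT with the stated slope; the abstract does not say which function the authors fitted.

```python
import math
import random

# Placeholder vocabulary: five columns of ten items each, as in the abstract.
CATEGORIES = ["name", "verb", "numeral", "adjective", "noun"]
MATRIX = {c: [f"{c}{i}" for i in range(10)] for c in CATEGORIES}


def count_sentences(matrix):
    """Number of distinct sentences obtainable by picking one word per column."""
    total = 1
    for words in matrix.values():
        total *= len(words)
    return total


def random_sentence(matrix, rng):
    """Join one randomly selected word from each column, in column order."""
    return " ".join(rng.choice(matrix[c]) for c in CATEGORIES)


def intelligibility(snr_db, srt_db=-9.6, slope_per_db=0.171):
    """Assumed logistic function: 0.5 at the SRT, slope_per_db (fraction/dB) there."""
    return 1.0 / (1.0 + math.exp(-4.0 * slope_per_db * (snr_db - srt_db)))


if __name__ == "__main__":
    print(count_sentences(MATRIX))                 # 10**5 combinations
    print(random_sentence(MATRIX, random.Random(1)))
    for snr in (-12.0, -9.6, -7.0):
        print(snr, round(intelligibility(snr), 3))
```

The factor 4 in the exponent follows from the derivative of the logistic function at its midpoint, so that `slope_per_db` is the slope at the SRT itself.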
|
258
|
Battmer RD, Dillier N, Lai WK, Begall K, Leypon EE, González JCF, Manrique M, Morera C, Müller-Deile J, Wesarg T, Zarowski A, Killian MJ, von Wallenberg E, Smoorenburg GF. Speech perception performance as a function of stimulus pulse rate and processing strategy preference for the Cochlear™ Nucleus® CI24RE device: Relation to perceptual threshold and loudness comfort profiles. Int J Audiol 2010; 49:657-66. [DOI: 10.3109/14992021003801471] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 11/13/2022]
|
259
|
Beutelmann R, Brand T, Kollmeier B. Revision, extension, and evaluation of a binaural speech intelligibility model. The Journal of the Acoustical Society of America 2010; 127:2479-97. [PMID: 20370031 DOI: 10.1121/1.3295575] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.5] [Indexed: 05/25/2023]
Abstract
This study presents revision, extension, and evaluation of a binaural speech intelligibility model (Beutelmann, R., and Brand, T. (2006). J. Acoust. Soc. Am. 120, 331-342) that yields accurate predictions of speech reception thresholds (SRTs) in the presence of a stationary noise source at arbitrary azimuths and in different rooms. The modified model is based on an analytical expression of binaural unmasking for arbitrary input signals and is computationally more efficient, while maintaining the prediction quality of the original model. An extension for nonstationary interferers was realized by applying the model to short time frames of the input signals and averaging over the predicted SRT results. Binaural SRTs from 8 normal-hearing and 12 hearing-impaired subjects, incorporating all combinations of four rooms, three source setups, and three noise types were measured and compared to the model's predictions. Depending on the noise type, the parametric correlation coefficients between observed and predicted SRTs were 0.80-0.93 for normal-hearing subjects and 0.59-0.80 for hearing-impaired subjects. The mean absolute prediction error was 3 dB for the mean normal-hearing data and 4 dB for the individual hearing-impaired data. 70% of the variance of the SRTs of hearing-impaired subjects could be explained by the model, which is based only on the audiogram.
Affiliation(s)
- Rainer Beutelmann
- Medizinische Physik, Carl-von-Ossietzky-Universität Oldenburg, 26111 Oldenburg, Germany.
|
260
|
Luts H, Eneman K, Wouters J, Schulte M, Vormann M, Buechler M, Dillier N, Houben R, Dreschler WA, Froehlich M, Puder H, Grimm G, Hohmann V, Leijon A, Lombard A, Mauler D, Spriet A. Multicenter evaluation of signal enhancement algorithms for hearing aids. The Journal of the Acoustical Society of America 2010; 127:1491-1505. [PMID: 20329849 DOI: 10.1121/1.3299168] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.4] [Indexed: 05/26/2023]
Abstract
In the framework of the European HearCom project, promising signal enhancement algorithms were developed and evaluated for future use in hearing instruments. To assess the algorithms' performance, five of the algorithms were selected and implemented on a common real-time hardware/software platform. Four test centers in Belgium, The Netherlands, Germany, and Switzerland perceptually evaluated the algorithms. Listening tests were performed with large numbers of normal-hearing and hearing-impaired subjects. Three perceptual measures were used: speech reception threshold (SRT), listening effort scaling, and preference rating. Tests were carried out in two types of rooms. Speech was presented in multitalker babble arriving from one or three loudspeakers. In a pseudo-diffuse noise scenario, only one algorithm, the spatially preprocessed speech-distortion-weighted multi-channel Wiener filtering, provided an SRT improvement relative to the unprocessed condition. Despite the general lack of improvement in SRT, some algorithms were preferred over the unprocessed condition at all tested signal-to-noise ratios (SNRs). These effects were found across different subject groups and test sites. The listening effort scores were less consistent over test sites. For the algorithms that did not affect speech intelligibility, a reduction in listening effort was observed at 0 dB SNR.
Affiliation(s)
- Heleen Luts
- ExpORL, Department of Neurosciences, Katholieke Universiteit Leuven, Herestraat 49 bus 721, B-3000 Leuven, Belgium.
|
261
|
García-Pérez MA. Denoising forced-choice detection data. The British Journal of Mathematical and Statistical Psychology 2010; 63:75-100. [PMID: 19422731 DOI: 10.1348/000711009x424057] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Indexed: 05/27/2023]
Abstract
Observers in a two-alternative forced-choice (2AFC) detection task face the need to produce a response at random (a guess) on trials in which neither presentation appeared to display a stimulus. Observers could alternatively be instructed to use a 'guess' key on those trials, a key that would produce a random guess and would also record the resultant correct or wrong response as emanating from a computer-generated guess. A simulation study shows that 'denoising' 2AFC data with information regarding which responses are a result of guesses yields estimates of detection threshold and spread of the psychometric function that are far more precise than those obtained in the absence of this information, and parallel the precision of estimates obtained with yes-no tasks running for the same number of trials. Simulations also show that partial compliance with the instructions to use the 'guess' key reduces the quality of the estimates, which nevertheless continue to be more precise than those obtained from conventional 2AFC data if the observers are still moderately compliant. An empirical study testing the validity of simulation results showed that denoised 2AFC estimates of spread were clearly superior to conventional 2AFC estimates and similar to yes-no estimates, but variations in threshold across observers and across sessions hid the benefits of denoising for threshold estimation. The empirical study also proved the feasibility of using a 'guess' key in addition to the conventional response keys defined in 2AFC tasks.
Affiliation(s)
- Miguel A García-Pérez
- Departamento de Metodología, Facultad de Psicología, Universidad Complutense, Madrid, Spain.
|
262
|
Theunissen M, Swanepoel DW, Hanekom J. Sentence recognition in noise: Variables in compilation and interpretation of tests. Int J Audiol 2009; 48:743-57. [DOI: 10.3109/14992020903082088] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.2] [Indexed: 11/13/2022]
|
263
|
Jürgens T, Brand T. Microscopic prediction of speech recognition for listeners with normal hearing in noise using an auditory model. The Journal of the Acoustical Society of America 2009; 126:2635-48. [PMID: 19894841 DOI: 10.1121/1.3224721] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.8] [Indexed: 05/16/2023]
Abstract
This study compares the phoneme recognition performance in speech-shaped noise of a microscopic model for speech recognition with the performance of normal-hearing listeners. "Microscopic" is defined for this model in two ways. First, the speech recognition rate is predicted on a phoneme-by-phoneme basis. Second, microscopic modeling means that the signal waveforms to be recognized are processed by mimicking elementary parts of human auditory processing. The model is based on an approach by Holube and Kollmeier [J. Acoust. Soc. Am. 100, 1703-1716 (1996)] and consists of a psychoacoustically and physiologically motivated preprocessing and a simple dynamic-time-warp speech recognizer. The model is evaluated while presenting nonsense speech in a closed-set paradigm. Averaged phoneme recognition rates, specific phoneme recognition rates, and phoneme confusions are analyzed. The influence of different perceptual distance measures and of the model's a-priori knowledge is investigated. The results show that human performance can be predicted by this model using an optimal detector, i.e., identical speech waveforms for both training of the recognizer and testing. The best model performance is obtained with distance measures that focus mainly on small perceptual distances and neglect outliers.
Affiliation(s)
- Tim Jürgens
- Medizinische Physik, Universität Oldenburg, D-26111 Oldenburg, Germany.
|
264
|
Beutelmann R, Brand T, Kollmeier B. Prediction of binaural speech intelligibility with frequency-dependent interaural phase differences. The Journal of the Acoustical Society of America 2009; 126:1359-1368. [PMID: 19739750 DOI: 10.1121/1.3177266] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Indexed: 05/28/2023]
Abstract
The aim of this study was to test the hypothesis of independent processing strategies in adjacent binaural frequency bands underlying current models for binaural speech intelligibility in complex configurations and to investigate the effective binaural auditory bandwidth in broad-band signals. Speech reception thresholds (SRTs) were measured for binaural conditions with frequency-dependent interaural phase differences (IPDs) of speech and noise. SRT predictions with the binaural speech intelligibility model by Beutelmann and Brand (2006, J. Acoust. Soc. Am. 120, 331-342) were compared with the observed data. The IPDs of speech and noise had a sinusoidal shape on a logarithmic frequency scale. The bandwidth between zeros of the IPD function was varied from 1/8 to 4 octaves. Speech and noise had either the same IPD function (reference condition) or opposite signs of the IPD function (binaural condition). Each condition had two subconditions with alternating and non-alternating signs, respectively, of the IPD function. The binaural unmasking with respect to the reference condition decreased from 6 dB to zero with decreasing IPD bandwidth for the alternating condition while it stayed significantly larger than zero for the non-alternating condition. The observed results were well predicted by the model with an analysis filter bandwidth of 2.3 equivalent rectangular bandwidths (ERBs).
Affiliation(s)
- Rainer Beutelmann
- Medizinische Physik, Carl-von-Ossietzky-Universität Oldenburg, Oldenburg, Germany.
|
265
|
Kjems U, Boldt JB, Pedersen MS, Lunner T, Wang D. Role of mask pattern in intelligibility of ideal binary-masked noisy speech. The Journal of the Acoustical Society of America 2009; 126:1415-26. [PMID: 19739755 DOI: 10.1121/1.3179673] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Indexed: 05/16/2023]
Abstract
Intelligibility of ideal binary-masked noisy speech was measured in a group of normal-hearing individuals across mixture signal-to-noise ratio (SNR) levels, masker types, and local criteria for forming the binary mask. The binary mask is computed from time-frequency decompositions of target and masker signals using two different schemes: an ideal binary mask computed by thresholding the local SNR within time-frequency units and a target binary mask computed by comparing the local target energy against the long-term average speech spectrum. By depicting intelligibility scores as a function of the difference between mixture SNR and local SNR threshold, alignment of the performance curves is obtained for a large range of mixture SNR levels. Large intelligibility benefits are obtained for both sparse and dense binary masks. When an ideal mask is dense with many ones, the effect of changing mixture SNR level while fixing the mask is significant, whereas for sparser masks the effect is small or insignificant.
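The first masking scheme described above, thresholding the local SNR within each time-frequency unit, can be sketched in a few lines. The energy values, the function names, and the use of plain nested lists in place of a real time-frequency decomposition are assumptions for illustration only; the target binary mask variant and the paper's filterbank are not reproduced.

```python
import math


def local_snr_db(target_energy, masker_energy):
    """Local SNR of one time-frequency unit; both energies must be positive."""
    return 10.0 * math.log10(target_energy / masker_energy)


def ideal_binary_mask(target, masker, lc_db=0.0):
    """Return 1 where the local SNR exceeds the local criterion lc_db, else 0.

    target and masker are [channel][frame] grids of positive energies.
    """
    return [
        [1 if local_snr_db(t, m) > lc_db else 0 for t, m in zip(t_row, m_row)]
        for t_row, m_row in zip(target, masker)
    ]


def mask_density(mask):
    """Fraction of time-frequency units that the mask retains."""
    units = [value for row in mask for value in row]
    return sum(units) / len(units)


if __name__ == "__main__":
    # Toy energies for 2 channels x 3 frames (hypothetical values).
    target = [[4.0, 1.0, 0.5], [0.1, 10.0, 2.0]]
    masker = [[1.0, 1.0, 1.0], [1.0, 1.0, 4.0]]
    for lc in (0.0, -6.0):
        mask = ideal_binary_mask(target, masker, lc_db=lc)
        print(lc, mask, round(mask_density(mask), 2))
```

Lowering the local criterion makes the mask denser, which is the sparse-versus-dense distinction the abstract refers to.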
|
266
|
Abstract
The investigation of speech intelligibility under quiet and noisy conditions is essential to evaluate the auditory and verbal rehabilitation of cochlear implant patients. A set of audiological tests is introduced that has been validated and optimized by empirical investigations in adult cochlear implant users. The Kiel logatom test, the Freiburg speech intelligibility test and the adaptive measurement of speech perception threshold in noise with the Oldenburg sentence test are methods suggested for clinical use, which are also applicable for scientific investigations. The test battery provides results that can be interpreted by every professional involved in the rehabilitation process of cochlear implant patients.
|
267
|
Ozimek E, Kutzner D, Sęk A, Wicher A. Polish sentence tests for measuring the intelligibility of speech in interfering noise. Int J Audiol 2009; 48:433-43. [DOI: 10.1080/14992020902725521] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.5] [Indexed: 10/20/2022]
|
268
|
Wagener KC, Brand T, Kollmeier B. The role of silent intervals for sentence intelligibility in fluctuating noise in hearing-impaired listeners. Int J Audiol 2009; 45:26-33. [PMID: 16562561 DOI: 10.1080/14992020500243851] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.4] [Indexed: 10/25/2022]
Abstract
Fluctuating interfering noises are highly suitable for speech audiometry because of the large inter-individual variability in intelligibility results. This study explores the maximum duration of silent intervals in the masker as an important factor underlying sentence intelligibility in fluctuating noise. Three versions of speech-simulating fluctuating interfering noises based on the ICRA noises (Dreschler et al., 2001) were explored: the original noise, which simulates one interfering speaker and contains pause durations up to two seconds, as well as two modified versions with pause durations limited to 250 ms and 62.5 ms, respectively. In addition, a stationary speech-shaped noise was used. Test-retest reliability as well as speech reception threshold (SRT) and speech intelligibility function slope were determined with hearing-impaired subjects. All fluctuating noises differentiated very well between subjects. Partial rank correlation analysis showed that SRTs in fluctuating noise with the longest maximum pause durations mostly depended on SRTs in quiet. SRTs in fluctuating noises with shorter maximum pause durations correlated both with SRTs in quiet and in stationary noise.
Affiliation(s)
- Kirsten Carola Wagener
- Carl von Ossietzky Universität Oldenburg, Medizinische Physik, Fakultät V/Institut für Physik, Oldenburg, Germany.
|
269
|
Wagener KC, Brand T. Sentence intelligibility in noise for listeners with normal hearing and hearing impairment: Influence of measurement procedure and masking parameters. Int J Audiol 2009; 44:144-56. [PMID: 15916115 DOI: 10.1080/14992020500057517] [Citation(s) in RCA: 106] [Impact Index Per Article: 6.6] [Indexed: 10/23/2022]
Abstract
Speech intelligibility measurements strongly depend on several procedural parameters. In order to obtain comparable results from different test procedures, these parameters must be investigated as to which should be standardized and which could be set freely. This study investigates the influence of noise level, noise type, and presentation mode on speech reception thresholds (SRTs) and intelligibility function slopes in noise for normal-hearing and hearing-impaired subjects. The noise presentation level had no significant influence on either SRTs or slope values, provided that the presentation level exceeded hearing threshold. Two stationary, speech-shaped noises produced identical results. Speech-simulating fluctuating noise yielded about 14 dB lower SRTs for normal-hearing subjects and about 10 dB lower SRTs for 20% of the hearing-impaired subjects. Of the hearing-impaired subjects, 30% did not benefit from the modulations and showed similar SRTs as for stationary noise. Using continuous noise yielded lower SRTs compared to gated noise. However, the difference between the results in continuous and gated noise was not significant for the hearing-impaired subjects. A presentation level of 65 dB SPL (normal-hearing subjects) or 80 dB SPL (hearing-impaired subjects) and an interfering noise with a spectrum similar to the mean long-term average speech spectrum (LTASS) is suggested for comparable adaptive measurement procedures. A fluctuating, speech-shaped noise is recommended to differentiate between subjects.
Affiliation(s)
- Kirsten Carola Wagener
- Carl von Ossietzky Universität, Oldenburg, Medizinische Physik, Fakultät V/Institut für Physik, Oldenburg, Germany.
|
270
|
[Comparison of different speech intelligibility tests in German language (Freiburg speech test vs. Göttingen sentence test and monosyllabic rhyme test)]. HNO 2009; 57:239-50. [PMID: 18696020 DOI: 10.1007/s00106-008-1727-9] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Indexed: 10/21/2022]
Abstract
BACKGROUND: For assessing a noise-induced hearing loss, the Freiburg speech test (Freiburger Sprachtest) is traditionally used to examine speech recognition in silence. However, for many years this test has been shown to have serious shortcomings. Various modern procedures in German language are available as alternatives. METHODS: The aim of the current study was to compare the Freiburg number test (FBZ) with the Göttingen sentence test (GöSa) and the Freiburg monosyllabic test (FBE) with the monosyllabic rhyme test developed by von Wallenberg and Kollmeier (WaKo), all applied in silence. Overall, 31 participants with various degrees of hearing loss were tested in this study. Speech intelligibility was determined with both monosyllabic tests at presentation levels of 60 and 80 dB SPL and for some listeners also at 100 dB SPL. The maximum intelligibility was also determined. In addition, for the combination FBZ and FBE and for the combination FBZ and WaKo, the percentage of hearing loss based on speech audiometry was calculated. RESULTS: The results show that both of the modern speech tests can be used as an alternative to the Freiburg speech test. Altogether the monosyllabic rhyme test leads to higher speech intelligibility than the Freiburg monosyllabic test. Therefore, a reduction of the presentation level by 15 dB is recommended if it is intended to retain the existing tables for calculating the percentage of hearing loss. Reducing the presentation level also has the advantage that measurements at 100 dB SPL are not required anymore. A level of 100 dB SPL is assessed as unpleasant by many listeners.
|
271
|
Terband H, Drullman R. Study of an automated procedure for a Dutch sentence test for the measurement of the speech reception threshold in noise. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2008; 124:3225-3234. [PMID: 19045806 DOI: 10.1121/1.2990706] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 05/27/2023]
Abstract
A procedure was developed for the automated measurement of the speech reception threshold in stationary noise (SRTn), which can be administered by the subjects themselves using a computer. The procedure was based on the SRTn test for Dutch developed by Plomp and Mimpen [(1979). "Improving the reliability of testing the speech reception threshold for sentences," Audiology, 18, 43-52]. Because in the automated procedure the responses were entered on a keyboard, the question of how to deal with typing and spelling errors played a key role. At first the possibility of scoring on keywords only was examined. An experiment was conducted in which the adaptive procedure was varied. Results showed that the combination of scoring each keyword separately and a fixed scheme of the adaptation of the signal-to-noise ratio throughout the procedure yields the highest test-retest reliability. Subsequently, the collection and verification of responses using a keyboard were examined. Two different algorithms were developed and evaluated against the traditional task of verbal repetition and response verification by an experimenter. The results indicated a preference for verification by dynamic alignment over a spelling checker approach. In conclusion, the results show that it is possible to automate the test procedure while maintaining sufficient reliability.
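The abstract does not spell out the two verification algorithms, so the following is only an illustrative sketch of how typed keywords could be scored while tolerating small typing errors. It uses a plain Levenshtein edit distance as a stand-in for alignment-based verification; the function names `edit_distance` and `score_keywords` and the `max_errors` tolerance are hypothetical and are not taken from the paper.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings, computed by dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete a character of a
                            curr[j - 1] + 1,      # insert a character of b
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


def score_keywords(keywords, typed_response, max_errors=1):
    """Score each keyword separately.

    A keyword counts as correct if any typed word lies within
    max_errors edits of it (assumed tolerance, not from the paper).
    """
    typed = typed_response.lower().split()
    return [
        any(edit_distance(k.lower(), w) <= max_errors for w in typed)
        for k in keywords
    ]


if __name__ == "__main__":
    # "gardn" is one deletion away from "garden", so both keywords are accepted.
    print(score_keywords(["dog", "garden"], "the dog ran in the gardn"))
```

Scoring each keyword separately, as in this sketch, mirrors the scoring scheme the abstract reports as giving the highest test-retest reliability; the tolerance rule itself is an assumption.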
Affiliation(s)
- Hayo Terband
- TNO Human Factors, P.O. Box 23, 3769 ZG Soesterberg, The Netherlands
|
272
|
|
273
|
Wagener KC, Brand T, Kollmeier B. [Evaluation of the Oldenburg children's rhyme test in silence and in noise]. HNO 2007; 54:171-8. [PMID: 16132880 DOI: 10.1007/s00106-005-1304-4] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/29/2022]
Abstract
BACKGROUND The Oldenburg children's rhyme test (OlKi) was designed and optimized for speech intelligibility measurements for primary school pupils in silence [8, 3]. In the optimization, the intelligibility of the particular test words was equalized. METHODS The evaluation of the test with 147 primary school pupils with normal hearing in silence and 107 pupils in noise is presented in this article. The comparability between test lists for speech intelligibility was investigated and age dependent reference functions were determined. RESULTS The evaluation showed that intelligibility differences are larger across children within one grade than across different test lists. The reference functions of first grade pupils are shifted to slightly higher presentation levels and signal-to-noise ratios both in silence and noise. CONCLUSIONS The 12 optimized test lists of the OlKi test are equally intelligible both in silence and noise. No list effects are expected. The degree of difficulty of the Oldenburg children's rhyme test can be compared with the Göttingen children's test II and the Mainz children's test III.
|
274
|
Abstract
OBJECTIVE The goals of this research were to develop and evaluate a new version of the Listening in Spatialized Noise Test (LISN; Cameron Dillon & Newall, 2006a) by incorporating a simplified and more objective response protocol to make the test suitable for assessing the ability of children as young as 5 yr to understand speech in background noise. The LISN-Sentences test (LISN-S; Cameron & Dillon, Reference Note 1) produces a three-dimensional auditory environment under headphones and is presented by using a personal computer. A simple repetition response protocol is used to determine speech reception thresholds (SRTs) for sentences presented in competing speech under various conditions. In four LISN-S conditions, the maskers are manipulated with respect to location (0 degrees versus +/-90 degrees azimuth) and vocal quality of the speaker(s) of the stories (same as, or different than, the speaker of the target sentences). Performance is measured as two SRT measures and three "advantage" measures. These advantage measures represent the benefit in decibels gained when either talker, spatial, or both talker and spatial cues combined, are incorporated in the maskers. This use of difference scores minimizes the effects of between-listener variation in factors such as linguistic skills and general cognitive ability on LISN-S performance. DESIGN An initial experiment was conducted to determine the relative intelligibility of the sentences used in the test. Up to 30 sentences were presented adaptively to 24 children ages 8 to 9 yr to estimate the SRT (eSRT). Fifty sentences each were then presented at each participant's eSRT, eSRT +2 dB, and eSRT -2 dB. Psychometric functions were fitted and the sentences were adjusted in amplitude for equal intelligibility. After adjustment, intelligibility increased across sentences by approximately 17% for each 1 dB increase in signal-to-noise ratio (SNR). 
A second experiment was conducted to gather normative data on the LISN-S from 82 children with normal hearing, ages 5 to 11 yr. RESULTS For the 82 children in the normative data study, regression analysis showed that there was a strong trend of decreasing SRT and increasing advantage as age increased across all LISN-S performance measures. Analysis of variance revealed that significant differences in performance were most pronounced between the 5-yr-olds and the other age groups on the LISN-S measures that assess the ability to use spatial cues to understand speech in background noise, suggesting that binaural processing skills are still developing at age 5 yr. Inter-participant variation in performance on the various SRT and advantage measures was minimal for all groups, including the 5- and 6-yr-olds who exhibited standard deviations ranging from only 1.0 dB to 1.8 dB across measures. The intra-participant standard error ranged from 0.6 dB to 2.0 dB across age groups and conditions. Total time taken to administer all four LISN-S conditions was on average 12 minutes. CONCLUSIONS The LISN-S provides a quick, objective method of measuring a child's ability to understand speech in background noise. The small degree of inter- and intra-participant variation in the 5- and 6-yr-old children suggests that the test is capable of assessing auditory processing in this age group. However, because there appears to be a strong developmental curve in binaural processing skills in the 5-yr-olds, it is suggested that the LISN-S be used clinically with children from 6 yr of age. Cut-off scores, calculated as 2 standard deviations below the mean adjusted for age, were calculated for each performance measure for children ages 6 to 11 yr. These scores, which represent the level below which performance on the LISN-S is considered to be outside normal limits, will be used in future studies with children with suspected central auditory processing disorder.
|
275
|
Smits C, Houtgast T. Measurements and calculations on the simple up-down adaptive procedure for speech-in-noise tests. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2006; 120:1608-21. [PMID: 17004483 DOI: 10.1121/1.2221405] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.4] [Indexed: 05/12/2023]
Abstract
The simple up-down adaptive procedure is a common method for measuring speech reception thresholds. It is used by the Dutch speech-in-noise telephone screening test [National Hearing test; Smits and Houtgast Ear Hear. 26, 89-95 (2005)]. The test uses digit triplets to measure the speech reception threshold in noise by telephone (SRTT(n)). About 66 000 people took this test within four months of its introduction and details were stored of all individual measurements. Analyses of this large volume of data have revealed that the standard deviation of SRTT(n) estimates increases with hearing loss. This paper presents a calculation model which, using an intelligibility function as input, can determine the standard deviation of SRTT(n) estimates and the bias for the simple up-down procedure. The effects of variations in the slope of the intelligibility function, the guess rate, the starting level, the heterogeneity of the speech material, and the possibilities of optimizing SRTT(n) measurements were all explored with this model. The predicted decrease in the standard deviation of SRTT(n) estimates as a result of optimizing the speech material was confirmed by measurements in 244 listeners. The paper concludes by discussing possibilities for optimizing the development of comparable tests.
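The abstract describes a calculation model; the sketch below is not that model but a small Monte Carlo simulation of the same idea, showing how the standard deviation and bias of a simple up-down estimate can be derived from an intelligibility function. The logistic shape, the 2 dB step, the 23 trials, the averaging rule, and all function names are assumptions chosen for illustration.

```python
import math
import random
import statistics


def p_correct(snr, srt=-7.0, slope=0.20):
    """Logistic intelligibility function (assumed shape).

    slope is the gradient in proportion correct per dB at the 50% point.
    """
    return 1.0 / (1.0 + math.exp(-4.0 * slope * (snr - srt)))


def run_track(rng, n_trials=23, start_snr=0.0, step=2.0, skip=4,
              srt=-7.0, slope=0.20):
    """One simple up-down track: SNR drops after a correct response and
    rises after an error. The estimate is the mean of the presented
    levels after the first `skip` trials, plus the next (unpresented) level."""
    snr = start_snr
    levels = []
    for _ in range(n_trials):
        levels.append(snr)
        correct = rng.random() < p_correct(snr, srt, slope)
        snr += -step if correct else step
    levels.append(snr)  # level that would have been presented next
    return statistics.mean(levels[skip:])


def sd_and_bias(slope, n_tracks=2000, seed=1, srt=-7.0):
    """Standard deviation and bias of the estimates over many simulated tracks."""
    rng = random.Random(seed)
    estimates = [run_track(rng, srt=srt, slope=slope) for _ in range(n_tracks)]
    return statistics.stdev(estimates), statistics.mean(estimates) - srt


if __name__ == "__main__":
    for slope in (0.05, 0.10, 0.20):
        sd, bias = sd_and_bias(slope)
        print(f"slope {slope:.2f}/dB: SD {sd:.2f} dB, bias {bias:+.2f} dB")
```

In this sketch a shallower intelligibility function produces a larger spread of estimates, which is one of the factors the abstract says was explored with the model; the guess rate and the heterogeneity of the speech material are not modelled here.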
Affiliation(s)
- Cas Smits
- Department of Otolaryngology/Audiology, VU University Medical Center, Amsterdam, The Netherlands.
|
276
|
Beutelmann R, Brand T. Prediction of speech intelligibility in spatial noise and reverberation for normal-hearing and hearing-impaired listeners. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2006; 120:331-42. [PMID: 16875230 DOI: 10.1121/1.2202888] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.4] [Indexed: 05/11/2023]
Abstract
Binaural speech intelligibility of individual listeners under realistic conditions was predicted using a model consisting of a gammatone filter bank, an independent equalization-cancellation (EC) process in each frequency band, a gammatone resynthesis, and the speech intelligibility index (SII). Hearing loss was simulated by adding uncorrelated masking noises (according to the pure-tone audiogram) to the ear channels. Speech intelligibility measurements were carried out with 8 normal-hearing and 15 hearing-impaired listeners, collecting speech reception threshold (SRT) data for three different room acoustic conditions (anechoic, office room, cafeteria hall) and eight directions of a single noise source (speech in front). Artificial EC processing errors derived from binaural masking level difference data using pure tones were incorporated into the model. Except for an adjustment of the SII-to-intelligibility mapping function, no model parameter was fitted to the SRT data of this study. The overall correlation coefficient between predicted and observed SRTs was 0.95. The dependence of the SRT of an individual listener on the noise direction and on room acoustics was predicted with a median correlation coefficient of 0.91. The effect of individual hearing impairment was predicted with a median correlation coefficient of 0.95. However, for mild hearing losses the release from masking was overestimated.
Affiliation(s)
- Rainer Beutelmann
- Medizinische Physik, Fakultät V, Carl-von-Ossietzky-Universität Oldenburg, D-26111 Oldenburg, Germany.
|
277
|
Moelker A, Maas RAJJ, Pattynama PMT. Verbal communication in MR environments: effect of MR system acoustic noise on speech understanding. Radiology 2004; 232:107-13. [PMID: 15220495 DOI: 10.1148/radiol.2321030955] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Indexed: 11/11/2022]
Abstract
PURPOSE To assess the masking effect of magnetic resonance (MR)-related acoustic noise and the effect of passive hearing protection on speech understanding. MATERIALS AND METHODS Acoustic recordings were made at 1.5 T at patient and operator (interventionalist in the MR suite) locations for relevant pulse sequences. In an audiologic laboratory, speech-to-noise ratios (STNRs) were determined, defined as the difference between the absolute sound pressure levels of MR noise and speech. The recorded noise of the MR sequences was played simultaneously with the recorded sentences at various intensities, and 15 healthy volunteers (seven women, eight men; median age, 27 years) repeated these sentences as accurately as possible. The STNR that corresponded with a 50% correct repetition was used as the measure for speech intelligibility. In addition, the effect of passive hearing protection on speech intelligibility was tested by using an earplug model. RESULTS Overall, speech understanding was reduced more at operator than at patient location. Most problematic were fast gradient-recalled-echo train and spiral k-space sequences. As the absolute sound pressure level of these sequences was approximately 100 dB at patient location, the vocal effort needed to attain 50% intelligibility was shouting (>77 dB). At operator location, less effort was required because of the lower sound pressure levels of the MR noise. Fast spoiled gradient-recalled-echo and echo-planar imaging sequences showed relatively favorable results with raised voice at operator location and loud speaking at patient location. The use of hearing protection slightly improved STNR. CONCLUSION At 1.5 T, the level of MR noise requires that large vocal effort is used, at the operator and especially at the patient location. Depending on the specific MR sequence used, loud speaking or shouting is needed to achieve adequate bidirectional communication with the patient. 
The wearing of earplugs improves speech intelligibility.
Affiliation(s)
- Adriaan Moelker
- Department of Radiology, Erasmus Medical Center Rotterdam, 50 Dr Molewaterplein, PO Box 1738, 3000 DR Rotterdam, the Netherlands.
|
278
|
Richerson SJ, Faulkner LW, Robinson CJ, Redfern MS, Purucker MC. Acceleration threshold detection during short anterior and posterior perturbations on a translating platform. Gait Posture 2003; 18:11-9. [PMID: 14654203 DOI: 10.1016/s0966-6362(02)00189-3] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.3] [Indexed: 02/02/2023]
Abstract
Balance control systems have usually been studied under two conditions, during quiet standing or under large postural perturbations of a magnitude that requires a postural adjustment to prevent falling. Between these two extremes lie perturbations that can be repeated and measured while not forcing adaptive strategies from the postural control system. Unlike other studies of postural control, we employed very short translations with varying accelerations at the edge of psychophysical detectability. These perturbations were vibration-free anterior or posterior translations of the platform on which a subject stood. Using a full Latin-square design, a set of perturbations in the forward or backward direction, with a smooth or jerk acceleration profile, and of length 4 or 20 mm, was presented to five subjects. Perceptual peak acceleration thresholds were determined by an iterative psychophysical method that forced the subjects to choose in which of two sequential intervals they perceived a stimulus to have been presented. The only factor found that significantly correlated with detection was perturbation length. The 4 mm peak thresholds averaged 14.51 mm/s² while 20 mm thresholds averaged 8.55 mm/s². For the short perturbations employed in this study, detection of motion thus was dependent upon the magnitude of the acceleration, but it was independent of the acceleration profile (jerk versus smooth) or movement direction. By understanding the influences on the ability to perceptually detect motion underfoot, we can begin to understand what elements of the postural control system might be involved in the second-to-second control of balance.
Affiliation(s)
- S J Richerson
- Research Service, Overton Brooks VA Medical Center, Shreveport, LA, USA
|
279
|
Wagener K, Josvassen JL, Ardenkjaer R. Design, optimization and evaluation of a Danish sentence test in noise. Int J Audiol 2003; 42:10-7. [PMID: 12564511 DOI: 10.3109/14992020309056080] [Citation(s) in RCA: 109] [Impact Index Per Article: 5.0] [Indexed: 11/13/2022]
Abstract
The Danish sentence test DANTALE II was developed in analogy to the Swedish sentence test by Hagerman and the German Oldenburg sentence test as a new Danish sentence test in noise to determine the speech reception threshold in noise (SRT, i.e. the signal-to-noise ratio (SNR) that yields 50% intelligibility). Each sentence is generated by a random combination of the alternatives of a base list. This base list consists of 10 sentences with the same syntactical structure (name, verb, numeral, adjective, object). The test sentences were recorded and segmented in such a way that the coarticulation effects were taken into account in order to achieve a high perceived sound quality of the resynthesized sentences: 100 sentences were recorded, each coarticulation between each word and the 10 possible following word alternatives was recorded, and the correct coarticulation was used to generate the test sentences. Word-specific speech recognition curves were measured for each recorded word to optimize the homogeneity of the speech material and the measurement accuracy. Level corrections of particular words and a careful selection of the test lists produced a noticeable reduction in the variation in the distribution of word-specific SRT (standard deviation 1.75 dB instead of 3.78 dB). Therefore, the slope of the total intelligibility function was expected to increase from 8.30%/dB (raw test material) to 13.2%/dB (after modification). These theoretical expectations were evaluated by independent measurements with normal-hearing subjects, and, for the most part, confirmed. The reference data for the DANTALE II are: SRT=-8.43 dB SNR; slope at SRT, s50 = 13.2%/dB. The training effect was 2.2 dB and could be reduced to less than 1 dB if two training lists of 20 sentences were performed prior to data collection.
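As a rough illustration of the two ideas in this abstract, the sketch below evaluates an intelligibility function at the reported reference values and draws random sentences from a base list. The logistic form is an assumption (the abstract gives only the SRT and the slope at the SRT), and the English words in `BASE_LIST` are placeholders rather than the Danish test material, which has 10 alternatives per position.

```python
import math
import random

SRT_DB = -8.43   # reference SRT from the abstract, in dB SNR
S50 = 0.132      # reference slope at the SRT from the abstract (13.2 %/dB)


def intelligibility(snr_db, srt=SRT_DB, s50=S50):
    """Assumed logistic function with value 0.5 and gradient s50 at snr_db == srt."""
    return 1.0 / (1.0 + math.exp(4.0 * s50 * (srt - snr_db)))


# Placeholder base list (hypothetical words); one entry per syntactic position.
SLOTS = ("name", "verb", "numeral", "adjective", "object")
BASE_LIST = {
    "name": ["Anna", "Peter", "Maria"],
    "verb": ["buys", "sees", "owns"],
    "numeral": ["three", "seven", "twelve"],
    "adjective": ["red", "old", "small"],
    "object": ["boxes", "flowers", "cars"],
}


def random_sentence(rng):
    """Pick one alternative per position, giving a fixed syntactic structure."""
    return " ".join(rng.choice(BASE_LIST[slot]) for slot in SLOTS)


if __name__ == "__main__":
    for snr in (-12.0, -10.0, -8.43, -6.0, -4.0):
        print(f"{snr:6.2f} dB SNR -> {100 * intelligibility(snr):5.1f} % correct")
    print(random_sentence(random.Random(0)))
```

The factor 4 in the exponent makes the derivative at the midpoint equal to `s50`, so the reported slope can be plugged in directly.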
Affiliation(s)
- Kirsten Wagener
- Medizinische Physik, University of Oldenburg, 26111 Oldenburg, Germany,
|
280
|
Bronkhorst AW, Brand T, Wagener K. Evaluation of context effects in sentence recognition. THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA 2002; 111:2874-2886. [PMID: 12083221 DOI: 10.1121/1.1458025] [Citation(s) in RCA: 21] [Impact Index Per Article: 0.9] [Indexed: 05/23/2023]
Abstract
It was investigated whether the model for context effects, developed earlier by Bronkhorst et al. [J. Acoust. Soc. Am. 93, 499-509 (1993)], can be applied to results of sentence tests, used for the evaluation of speech recognition. Data for two German sentence tests, that differed with respect to their semantic content, were analyzed. They had been obtained from normal-hearing listeners using adaptive paradigms in which the signal-to-noise ratio was varied. It appeared that the model can accurately reproduce the complete pattern of scores as a function of signal-to-noise ratio: both sentence recognition scores and proportions of incomplete responses. In addition, it is shown that the model can provide a better account of the relationship between average word recognition probability (p(e)) and sentence recognition probability (p(w)) than the relationship p(w) = p(e)^j, which has been used in previous studies. Analysis of the relationship between j and the model parameters shows that j is, nevertheless, a very useful parameter, especially when it is combined with the parameter j', which can be derived using the equivalent relationship p(w,0) = (1 - p(e))^(j'), where p(w,0) is the probability of recognizing none of the words in the sentence. These parameters not only provide complementary information on context effects present in the speech material, but they also can be used to estimate the model parameters. Because the model can be applied to both speech and printed text, an experiment was conducted in which part of the sentences was presented orthographically with 1-3 missing words. The results revealed a large difference between the values of the model parameters for the two presentation modes. This is probably due to the fact that, with speech, subjects can reduce the number of alternatives for a certain word using partial information that they have perceived (i.e., not only using the sentence context). A method for mapping model parameters from one mode to the other is suggested, but the validity of this approach has to be confirmed with additional data.
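The two power-law relationships quoted in the abstract can be inverted to obtain j and j' from measured probabilities. The sketch below does only that; the function names are hypothetical and the input values are made up for illustration, not taken from the study.

```python
import math


def j_factor(p_sentence, p_word):
    """Solve p(w) = p(e)^j for j, given sentence and average word probabilities."""
    return math.log(p_sentence) / math.log(p_word)


def j_prime(p_none, p_word):
    """Solve p(w,0) = (1 - p(e))^(j') for j', where p_none is the
    probability of recognizing none of the words in the sentence."""
    return math.log(p_none) / math.log(1.0 - p_word)


if __name__ == "__main__":
    p_e = 0.8
    # Five words recognized independently: both exponents equal the word count.
    print(j_factor(p_e ** 5, p_e), j_prime((1 - p_e) ** 5, p_e))
    # A higher sentence score than independence predicts gives j below 5.
    print(j_factor(0.5, p_e))
```

For a sentence of n independently recognized words both exponents equal n, so values below n can be read as a sign that context makes the words behave as fewer independent units.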
|