1. Baker CP, Sundberg J, Purdy SC, Rakena TO, Leão SHDS. CPPS and Voice-Source Parameters: Objective Analysis of the Singing Voice. J Voice 2024; 38:549-560. [PMID: 35000836] [DOI: 10.1016/j.jvoice.2021.12.010]
Abstract
INTRODUCTION In recent years, cepstral analysis and specific cepstrum-based measures such as smoothed cepstral peak prominence (CPPS) have become increasingly researched and utilized in attempts to determine the extent of overall dysphonia in voice signals. Yet, few studies have extensively examined how specific voice-source parameters affect CPPS values. OBJECTIVE Using a range of synthesized tones, this exploratory study sought to systematically analyze the effect of fundamental frequency (fo), vibrato extent, source-spectrum tilt, and the amplitude of the voice-source fundamental on CPPS values. MATERIALS AND METHODS A series of scales was synthesized using the freeware Madde. Fundamental frequency, vibrato extent, source-spectrum tilt, and the amplitude of the voice-source fundamental were systematically and independently varied. The tones were analyzed in Praat, and statistical analyses were conducted in SPSS. RESULTS CPPS was significantly affected by both fo and source-spectrum tilt, independently. A nonlinear association was seen between vibrato extent and CPPS: CPPS values increased as vibrato extent grew from 0 to 0.6 semitones (ST), then rapidly decreased as it approached 1.0 ST. No relationship was seen between the amplitude of the voice-source fundamental and CPPS. CONCLUSION The large effect of fo should be taken into account when analyzing the voice, particularly in singing-voice research, when comparing pre- and posttreatment data, and when comparing inter-subject CPPS data.
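The cepstral measure at the heart of this entry can be illustrated with a minimal numerical sketch. The function below computes an unsmoothed cepstral peak prominence (CPP) for a single frame, not the smoothed CPPS that Praat reports; the fo search range and the linear regression over the search range are illustrative assumptions, not the paper's exact settings.

```python
import numpy as np

def cepstral_peak_prominence(frame, sr, fmin=60.0, fmax=330.0):
    """Unsmoothed CPP (dB) and the fo implied by the cepstral peak.

    The real cepstrum is the inverse FFT of the log magnitude spectrum;
    a periodic voice signal shows a peak at the quefrency 1/fo.  CPP is
    the height of that peak above a regression line fitted to the
    cepstrum over the fo search range.
    """
    windowed = frame * np.hanning(len(frame))
    log_spec = 20.0 * np.log10(np.abs(np.fft.rfft(windowed)) + 1e-12)
    cepstrum = np.fft.irfft(log_spec)
    lo, hi = int(sr / fmax), int(sr / fmin)   # quefrency bins to search
    q = np.arange(lo, hi)
    peak = lo + int(np.argmax(cepstrum[lo:hi]))
    trend = np.polyval(np.polyfit(q, cepstrum[lo:hi], 1), peak)
    return cepstrum[peak] - trend, sr / peak

# A 200 Hz pulse train has a clear cepstral peak at quefrency 5 ms.
sr = 16000
pulses = np.zeros(2048)
pulses[::sr // 200] = 1.0
cpp, fo = cepstral_peak_prominence(pulses, sr)
```

For a strongly periodic input like this, CPP is large and the peak quefrency recovers the fundamental; for noise, the cepstrum has no dominant rahmonic and CPP collapses, which is why the measure tracks dysphonia.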
Affiliation(s)
- Calvin P Baker
- Department of Voice, School of Music, University of Auckland, Auckland Central, Auckland, New Zealand.
- Johan Sundberg
- Division of Speech, Music and Hearing, School of Electrical Engineering and Computer Science, KTH (Royal Institute of Technology), Stockholm, Sweden; Department of Linguistics, Stockholm University, Stockholm, Sweden; University College of Music Education Stockholm, Sweden
- Suzanne C Purdy
- School of Psychology, University of Auckland, Auckland Central, Auckland, New Zealand
- Te Oti Rakena
- Department of Voice, School of Music, University of Auckland, Auckland Central, Auckland, New Zealand
- Sylvia H de S Leão
- Speech Science, School of Psychology, University of Auckland, Grafton, Auckland, New Zealand

2. Aaen M, McGlashan J, Christoph N, Sadolin C. Extreme Vocal Effects Distortion, Growl, Grunt, Rattle, and Creaking as Measured by Electroglottography and Acoustics in 32 Healthy Professional Singers. J Voice 2024; 38:795.e21-795.e35. [PMID: 34972633] [DOI: 10.1016/j.jvoice.2021.11.010]
Abstract
Vocal effects, also called extreme or extended vocal techniques, with the intention to sound hoarse or rough are widely used across many genres and styles of singing, yet are scarcely documented in research. Physiological studies detail the involvement of supraglottic structures in the production of vocal effects, yet the acoustic impact of such involvement has not been documented systematically across phonation types. PURPOSE To report acoustic and electroglottography (EGG) measurements for the five rough-sounding vocal effects Distortion, Growl, Grunt, Rattle, and Creaking across phonation types, to demonstrate differences between notes with and without vocal effects added. METHODS Thirty-two professional singers and singing teachers produced sustained vowels in each of the four vocal modes while alternately adding and removing the vocal effects. The singers were recorded with a microphone at a constant distance as well as with EGG. RESULTS The vocal effects Distortion, Growl, Grunt, Rattle, and Creaking affect the acoustic spectra in separate and systematic ways across genders and phonation types. Each vocal effect affected the spectrum with statistical significance in specific frequency regions between 0 and 3.5 kHz as well as in higher partials above 12 kHz. EGG waveforms were unaffected by most of the vocal effects produced using supraglottic sound sources, whereas the Grunt and Creaking conditions did affect EGG waveform signals, though not consistently between participants. EGG measures confirmed sustained and unchanged Qx and Fx for most conditions, with statistically significant changes in the noise measures harmonic-to-noise ratio, normalized noise energy, relative average perturbation, and cepstral peak prominence, despite sound pressure level differing significantly only for a few specific conditions. Singers scored an average of 5.95 on Voice Handicap Index questionnaires and were all reportedly healthy.
CONCLUSIONS Vocal effects added to phonation produce specific increases and specific decreases in particular frequency regions in a systematic way and can be produced in a healthy and sustainable manner, as measured by Voice Handicap Index. Vocal effects can be added to different phonation types with differing acoustic output and singers were able to sustain and control involvement of the supraglottic sound source(s) independently of phonation type.
Affiliation(s)
- Mathias Aaen
- Department of Otorhinolaryngology, Queen's Medical Centre Campus, Nottingham University Hospitals, Nottingham, UK.
- Julian McGlashan
- Department of Otorhinolaryngology, Queen's Medical Centre Campus, Nottingham University Hospitals, Nottingham, UK
- Noor Christoph
- Hogeschool InHolland, Domein Gezondheid, Sport en Welzijn, Amsterdam, Holland

3. Rosenberg S, Sundberg J, Lã FMB. Kulning: Acoustic and Perceptual Characteristics of a Calling Style Used Within the Scandinavian Herding Tradition. J Voice 2024; 38:585-594. [PMID: 34991935] [DOI: 10.1016/j.jvoice.2021.11.016]
Abstract
Kulning, a loud, high-pitched vocal calling technique belonging to the Scandinavian herding tradition, has attracted several researchers' attention, mainly focused on cultural, phonatory, and musical aspects. Less attention has been paid to the spectral and physiological properties that characterize kulning tones, and to whether there is a physiologically optimal pitch range. We analyzed tones produced by ten participants with varying experience in kulning. They performed a phrase, pitch range G5 to C6 (784 to 1046 Hz), in three different conditions: starting (1) on pitch A5, (2) on the participant's preferred pitch, and (3) after the deepest possible inhalation, also on the participant's preferred pitch. Subglottal pressure (Psub) was measured as the oral pressure during /p/-occlusion. The quality of the kulning was rated by a group of experts. The highest-rated tones all had a sound pressure level (SPL) at 0.3 m exceeding 115 dB and a pitch higher than 1010 Hz, while the SPL of the lowest-rated tones was less than 108 dB at a pitch below 900 Hz. A multiple regression analysis was performed to evaluate the relationship between the ratings and Psub, SPL, the level of the fundamental, and the frequency at which a spectrum envelope dip occurred. Highly rated tones were started at maximum lung volumes and on participants' preferred pitches. They all shared a high frequency of the spectrum envelope dip and a high level of the fundamental. In decreasing order of ratings, Condition 3 showed the highest values, followed by Condition 2 and Condition 1. Each singer seemed to perform best within an individual Psub and pitch range. The relevance of the results to voice pedagogy and to artistic and compositional work is discussed.
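The pitch-frequency correspondences quoted in this abstract (G5 near 784 Hz, C6 near 1046 Hz) follow from twelve-tone equal temperament with A4 = 440 Hz. A small sketch, assuming standard MIDI note numbering:

```python
# Equal-tempered pitch to frequency, with A4 = 440 Hz (MIDI note 69).
NOTE_NUMBERS = {"G5": 79, "A5": 81, "C6": 84}

def pitch_to_hz(midi_note: int) -> float:
    return 440.0 * 2.0 ** ((midi_note - 69) / 12.0)

g5 = pitch_to_hz(NOTE_NUMBERS["G5"])   # close to 784 Hz
a5 = pitch_to_hz(NOTE_NUMBERS["A5"])   # 880 Hz, the fixed start of Condition 1
c6 = pitch_to_hz(NOTE_NUMBERS["C6"])   # close to 1046.5 Hz
```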
Affiliation(s)
- Susanne Rosenberg
- Department of Folk Music, Academy 1, Royal College of Music in Stockholm (KMH), Stockholm, Sweden.
- Johan Sundberg
- Department of Speech Music Hearing, School of Electrical Engineering and Computer Science, Royal Institute of Technology (KTH), Stockholm, Sweden; Department of Linguistics, Stockholm University, Stockholm, Sweden; Voice Department, The Stockholm University College of Music Education (SMI), Stockholm, Sweden
- Filipa M B Lã
- Faculty of Education, Department of Didactics, School Organization and Special Didactics, The National Distance Education University (UNED), Madrid, Spain

4. Yeshoda K, Raveendran R. Exploring the Spectral and Temporal Characteristics of Human Beatbox Sounds: A Preliminary Study. J Voice 2024; 38:795.e1-795.e9. [PMID: 34952722] [DOI: 10.1016/j.jvoice.2021.10.011]
Abstract
PURPOSE Human beatboxing is a developing hip-hop branch of music wherein impersonations of percussion drums are performed using manipulations of the oro-laryngopharyngeal structures. This study presents a preliminary attempt at exploring and documenting the spectral and temporal measures of beatbox sounds produced by a single beatbox performer. METHOD An analytical observational study design was adopted wherein audio recordings were taken from a professional beatboxer. The participant produced five different types of beatbox sequences consisting of classic kick, inward Ph snare with /ʃ/, throat bass, and uvular oscillation sounds. The recorded beatbox sounds were segmented into preburst and postburst events and were analyzed acoustically. RESULTS The results revealed that the beatbox productions shared characteristic features of linguistic sounds, namely the stop, fricative, and affricate manners of production, with the oro- and laryngopharyngeal regions of the vocal tract as the places of articulation. CONCLUSION The art of beatboxing involves a wide variety of articulatory configurations and use of the resonatory subsystem of the vocal tract. This knowledge could expand the professional realm of speech-language pathology, requiring professionals to familiarize themselves with the vocal demands of this relatively new vocal art form.
Affiliation(s)
- Krishna Yeshoda
- Department of Speech-Language Sciences, All India Institute of Speech and Hearing (AIISH), Manasagangothri, Mysuru, India
- Revathi Raveendran
- Department of Speech-Language Sciences, All India Institute of Speech and Hearing (AIISH), Manasagangothri, Mysuru, India.

5. Fandel AD, Silva K, Bailey H. Vocal signatures affected by population identity and environmental sound levels. PLoS One 2024; 19:e0299250. [PMID: 38635752] [PMCID: PMC11025965] [DOI: 10.1371/journal.pone.0299250]
Abstract
Passive acoustic monitoring has improved our understanding of vocalizing organisms in remote habitats and during all weather conditions. Many vocally active species are highly mobile, and their populations overlap. However, distinct vocalizations allow the tracking and discrimination of individuals or populations. Using signature whistles, the individually distinctive calls of bottlenose dolphins, we calculated a minimum abundance of individuals, characterized and compared signature whistles from five locations, and determined reoccurrences of individuals throughout the Mid-Atlantic Bight and Chesapeake Bay, USA. We identified 1,888 signature whistles; their duration, number of extrema, and start, end, and minimum frequencies varied significantly by site. All of these characteristics were deemed important for determining the site from which a whistle originated. Given the distinct signature whistle characteristics and the lack of spatial mixing of the dolphins detected at the Offshore site, we suspect that these dolphins belong to a different population than those at the Coastal and Bay sites. Signature whistles were also found to be shorter when ambient sound levels were higher. Using only the passively recorded vocalizations of this marine top predator, we obtained information about its population and how it is affected by ambient sound levels, which will increase as offshore wind energy is developed. In this rapidly developing area, these calls offer critical management insights for this protected species.
Affiliation(s)
- Amber D. Fandel
- Chesapeake Biological Laboratory, University of Maryland Center for Environmental Science, Solomons, MD, United States of America
- Kirsten Silva
- Chesapeake Biological Laboratory, University of Maryland Center for Environmental Science, Solomons, MD, United States of America
- Helen Bailey
- Chesapeake Biological Laboratory, University of Maryland Center for Environmental Science, Solomons, MD, United States of America

6. Magnaterra AK, Rose EM, Ball GF, Dooling RJ. Hearing and vocalizations in a small songbird, the red-cheeked cordon bleu (Uraeginthus bengalus) (L). J Acoust Soc Am 2024; 155:2724-2727. [PMID: 38656337] [DOI: 10.1121/10.0025764]
Abstract
The auditory sensitivity of a small songbird, the red-cheeked cordon bleu, was measured using the standard methods of animal psychophysics. Hearing in cordon bleus is similar to other small passerines with best hearing in the frequency region from 2 to 4 kHz and sensitivity declining at the rate of about 10 dB/octave below 2 kHz and about 35 dB/octave as frequency increases from 4 to 9 kHz. While critical ratios are similar to other songbirds, the long-term average power spectrum of cordon bleu song falls above the frequency of best hearing in this species.
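The rolloff figures in this abstract express audiogram slope in dB per octave. A sketch of how such a slope is computed from two threshold measurements; the threshold values below are made-up illustrations, not data from the study:

```python
import math

def db_per_octave(f1_hz, thresh1_db, f2_hz, thresh2_db):
    """Threshold change divided by the number of octaves between f1 and f2."""
    octaves = math.log2(f2_hz / f1_hz)
    return (thresh2_db - thresh1_db) / octaves

# Hypothetical thresholds: 10 dB at 4 kHz rising to about 51 dB at 9 kHz
# corresponds to roughly a 35 dB/octave high-frequency rolloff.
slope = db_per_octave(4000, 10.0, 9000, 51.0)
```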
Affiliation(s)
- Anna K Magnaterra
- Department of Psychology, Neuroscience and Cognitive Science Program, University of Maryland College Park, College Park, Maryland 20742, USA
- Evangeline M Rose
- Department of Psychology, Neuroscience and Cognitive Science Program, University of Maryland College Park, College Park, Maryland 20742, USA
- Gregory F Ball
- Department of Psychology, Neuroscience and Cognitive Science Program, University of Maryland College Park, College Park, Maryland 20742, USA
- Robert J Dooling
- Department of Psychology, Neuroscience and Cognitive Science Program, University of Maryland College Park, College Park, Maryland 20742, USA

7. Barkley YM, Merkens KPB, Wood M, Oleson EM, Marques TA. Click detection rate variability of central North Pacific sperm whales from passive acoustic towed arrays. J Acoust Soc Am 2024; 155:2627-2635. [PMID: 38629884] [DOI: 10.1121/10.0025540]
Abstract
Passive acoustic monitoring (PAM) is an optimal method for detecting and monitoring cetaceans as they frequently produce sound while underwater. Cue counting, counting acoustic cues of deep-diving cetaceans instead of animals, is an alternative method for density estimation, but requires an average cue production rate to convert cue density to animal density. Limited information about click rates exists for sperm whales in the central North Pacific Ocean. In the absence of acoustic tag data, we used towed hydrophone array data to calculate the first sperm whale click rates from this region and examined their variability based on click type, location, distance of whales from the array, and group size estimated by visual observers. Our findings show click type to be the most important variable, with groups that include codas yielding the highest click rates. We also found a positive relationship between group size and click detection rates that may be useful for acoustic predictions of group size in future studies. Echolocation clicks detected using PAM methods are often the only indicator of deep-diving cetacean presence. Understanding the factors affecting their click rates provides important information for acoustic density estimation.
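Cue counting, as described in this abstract, converts an estimated cue (click) density into animal density by dividing by the average cue production rate. A schematic sketch with invented numbers; real applications also correct for effective survey area and detection probability in more sophisticated ways:

```python
def animal_density(n_cues, area_km2, hours, cue_rate_per_hour, p_detect=1.0):
    """Animals per km^2: cue density corrected for detection probability,
    then divided by the mean cue production rate per animal."""
    cue_density = n_cues / (area_km2 * hours * p_detect)  # cues / km^2 / hour
    return cue_density / cue_rate_per_hour

# Invented example: 7200 clicks over 10 km^2 in 2 h, with each whale
# producing on average 3600 clicks/h and half of all clicks detected.
d = animal_density(7200, 10.0, 2.0, 3600.0, p_detect=0.5)
```

This is why the click rates reported here matter: any bias in the assumed cue rate propagates directly into the density estimate.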
Affiliation(s)
- Yvonne M Barkley
- Cooperative Institute for Marine and Atmospheric Research, School of Ocean and Earth Science and Technology, University of Hawai'i at Mānoa, Honolulu, Hawaii 96822, USA
- Megan Wood
- Saltwater Inc., Anchorage, Alaska 99501, USA
- Erin M Oleson
- Pacific Islands Fisheries Science Center, National Marine Fisheries Service, National Oceanic and Atmospheric Administration, Honolulu, Hawaii 96818, USA
- Tiago A Marques
- Centre for Research into Ecological and Environmental Modelling, The Observatory, University of St Andrews, St Andrews, KY16 9LZ, Scotland
- Departamento de Biologia Animal, Centro de Estatística e Aplicações, Faculdade de Ciências da Universidade de Lisboa, Portugal

8. Diamant R, Testolin A, Shachar I, Galili O, Scheinin A. Observational study on the non-linear response of dolphins to the presence of vessels. Sci Rep 2024; 14:6062. [PMID: 38480760] [PMCID: PMC10937978] [DOI: 10.1038/s41598-024-56654-6]
Abstract
With the large increase in human marine activity, our seas have become populated with vessels that can be heard underwater from distances of up to 20 km. Prior investigations showed that such a dense presence of vessels impacts the behaviour of marine animals, and in particular dolphins. While previous explorations looked for linear changes in the features of dolphin whistles, in this work we examine non-linear responses of bottlenose dolphins (Tursiops truncatus) to the presence of vessels. We explored the response of dolphins to vessels by continuously recording acoustic data using two long-term acoustic recorders deployed near a shipping lane and a dolphin habitat in Eilat, Israel. Using deep learning methods, we detected some 50,000 whistles, which were clustered to associate whistle traces and to characterize their features, both structure and quantity, in order to discriminate dolphin vocalizations. Using a non-linear classifier, the whistles were categorized into two classes representing the presence or absence of a nearby vessel. Although our database shows no linearly observable change in whistle features, we obtained true positive and true negative rates exceeding 90% accuracy on separate, left-out test sets. We argue that this classification success serves as statistical evidence of a non-linear response of dolphins to the presence of vessels.
Affiliation(s)
- Roee Diamant
- Department of Marine Technologies, University of Haifa, Haifa, 3498838, Israel.
- Faculty of Electrical and Computing Engineering, University of Zagreb, Zagreb, Croatia.
- Alberto Testolin
- Department of Mathematics and the Department of General Psychology, University of Padova, 35131, Padova, Italy
- Ilan Shachar
- Department of Marine Technologies, University of Haifa, Haifa, 3498838, Israel
- Ori Galili
- Morris Kahn Marine Research Station, Department of Marine Biology, Leon H. Charney School of Marine Sciences, University of Haifa, Haifa, Israel
- Aviad Scheinin
- Morris Kahn Marine Research Station, Department of Marine Biology, Leon H. Charney School of Marine Sciences, University of Haifa, Haifa, Israel

9. Busquet F, Efthymiou F, Hildebrand C. Voice analytics in the wild: Validity and predictive accuracy of common audio-recording devices. Behav Res Methods 2024; 56:2114-2134. [PMID: 37253958] [PMCID: PMC10228884] [DOI: 10.3758/s13428-023-02139-9]
Abstract
The use of voice recordings in both research and industry practice has increased dramatically in recent years, from diagnosing a COVID-19 infection based on patients' self-recorded voice samples to predicting customer emotions during a service center call. Crowdsourced audio data collection in participants' natural environment using their own recording device has opened up new avenues for researchers and practitioners to conduct research at scale across a broad range of disciplines. The current research examines whether fundamental properties of the human voice are reliably and validly captured through common consumer-grade audio-recording devices in current medical, behavioral science, business, and computer science research. Specifically, this work provides evidence from a tightly controlled laboratory experiment analyzing 1800 voice samples and subsequent simulations that recording devices with high proximity to a speaker (such as a headset or a lavalier microphone) lead to inflated measures of amplitude compared to a benchmark studio-quality microphone, while recording devices with lower proximity to a speaker (such as a laptop or a smartphone in front of the speaker) systematically reduce measures of amplitude and can lead to biased measures of the speaker's true fundamental frequency. We further demonstrate through simulation studies that these differences can lead to biased and ultimately invalid conclusions in, for example, an emotion detection task. Finally, we outline a set of recording guidelines to ensure reliable and valid voice recordings and offer initial evidence for a machine-learning approach to bias correction in the case of distorted speech signals.
Affiliation(s)
- Francesc Busquet
- Institute of Behavioral Science and Technology, University of St. Gallen, Torstrasse 25, St. Gallen, 9000, Switzerland.
- Fotis Efthymiou
- Institute of Behavioral Science and Technology, University of St. Gallen, Torstrasse 25, St. Gallen, 9000, Switzerland
- Christian Hildebrand
- Institute of Behavioral Science and Technology, University of St. Gallen, Torstrasse 25, St. Gallen, 9000, Switzerland.

10. Shadle CH, Fulop SA, Chen WR, Whalen DH. Assessing accuracy of resonances obtained with reassigned spectrograms from the "ground truth" of physical vocal tract models. J Acoust Soc Am 2024; 155:1253-1263. [PMID: 38341748] [PMCID: PMC10858790] [DOI: 10.1121/10.0024548]
Abstract
The reassigned spectrogram (RS) has emerged as the most accurate way to infer vocal tract resonances from the acoustic signal [Shadle, Nam, and Whalen (2016). "Comparing measurement errors for formants in synthetic and natural vowels," J. Acoust. Soc. Am. 139(2), 713-727]. To date, validating its accuracy has depended on formant synthesis for ground truth values of these resonances. Synthesis is easily controlled, but it has many intrinsic assumptions that do not necessarily accurately realize the acoustics in the way that physical resonances would. Here, we show that physical models of the vocal tract with derivable resonance values allow a separate approach to the ground truth, with a different range of limitations. Our three-dimensional printed vocal tract models were excited by white noise, allowing an accurate determination of the resonance frequencies. Then, sources with a range of fundamental frequencies were implemented, allowing a direct assessment of whether RS avoided the systematic bias towards the nearest strong harmonic to which other analysis techniques are prone. RS was indeed accurate at fundamental frequencies up to 300 Hz; above that, accuracy was somewhat reduced. Future directions include testing mechanical models with the dimensions of children's vocal tracts and making RS more broadly useful by automating the detection of resonances.
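The white-noise approach in this abstract (excite the model, then read its resonances from the averaged spectrum) can be sketched numerically. The toy below excites a single second-order resonator, a stand-in for one vocal tract resonance, with white noise and recovers its peak from an averaged periodogram; the resonator parameters and frame sizes are arbitrary choices, not the authors' setup.

```python
import numpy as np

rng = np.random.default_rng(0)
sr, f0, r = 16000, 500.0, 0.99          # sample rate, resonance, pole radius

# Second-order all-pole resonator: y[n] = 2r*cos(w0)*y[n-1] - r^2*y[n-2] + x[n]
w0 = 2 * np.pi * f0 / sr
a1, a2 = 2 * r * np.cos(w0), -r * r
x = rng.standard_normal(200 * 1024)
y = np.empty_like(x)
y1 = y2 = 0.0
for n in range(len(x)):
    y[n] = a1 * y1 + a2 * y2 + x[n]
    y1, y2 = y[n], y1

# Average periodograms over 200 frames and locate the spectral peak.
frames = y.reshape(200, 1024)
psd = np.mean(np.abs(np.fft.rfft(frames * np.hanning(1024), axis=1)) ** 2, axis=0)
f_est = np.argmax(psd) * sr / 1024      # estimated resonance frequency (Hz)
```

With a broadband noise source there are no harmonics to bias the peak location, which is the same reason the authors use white-noise excitation to establish ground-truth resonance values.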
Affiliation(s)
- Christine H Shadle
- Yale Child Study Center, School of Medicine, Yale University, New Haven, Connecticut 06511, USA
- Sean A Fulop
- Department of Linguistics, Fresno State University, Fresno, California 93740, USA
- Wei-Rong Chen
- Yale Child Study Center, School of Medicine, Yale University, New Haven, Connecticut 06511, USA
- D H Whalen
- Yale Child Study Center, School of Medicine, Yale University, New Haven, Connecticut 06511, USA

11. Abildtrup Nielsen N, Dawson SM, Torres Ortiz S, Wahlberg M, Martin MJ. Hector's dolphins (Cephalorhynchus hectori) produce both narrowband high-frequency and broadband acoustic signals. J Acoust Soc Am 2024; 155:1437-1450. [PMID: 38364047] [DOI: 10.1121/10.0024820]
Abstract
Odontocetes produce clicks for echolocation and communication. Most odontocetes are thought to produce either broadband (BB) or narrowband high-frequency (NBHF) clicks. Here, we show that the click repertoire of Hector's dolphin (Cephalorhynchus hectori) comprises highly stereotypical NBHF clicks and far more variable broadband clicks, with some that are intermediate between these two categories. Both NBHF and broadband clicks were made in trains, buzzes, and burst-pulses. Most clicks within click trains were typical NBHF clicks, which had a median centroid frequency of 130.3 kHz (median -10 dB bandwidth = 29.8 kHz). Some, however, while having only marginally lower centroid frequency (median = 123.8 kHz), had significant energy below 100 kHz and approximately double the bandwidth (median -10 dB bandwidth = 69.8 kHz); we refer to these as broadband. Broadband clicks in buzzes and burst-pulses had lower median centroid frequencies (120.7 and 121.8 kHz, respectively) compared to NBHF buzzes and burst-pulses (129.5 and 130.3 kHz, respectively). Source levels of NBHF clicks, estimated by using a drone to measure ranges from a single hydrophone and by computing time-of-arrival differences at a vertical hydrophone array, ranged from 116 to 171 dB re 1 μPa at 1 m, whereas source levels of broadband clicks, obtained from array data only, ranged from 138 to 184 dB re 1 μPa at 1 m. Our findings challenge the grouping of toothed whales as either NBHF or broadband species.
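Source levels like those in this abstract are back-calculated from the received level and the range to the animal. A simplified sketch, assuming spherical spreading transmission loss of 20·log10(range) and ignoring absorption, which real analyses would also account for:

```python
import math

def source_level(received_level_db, range_m):
    """Source level (dB re 1 uPa at 1 m) under spherical spreading loss."""
    return received_level_db + 20.0 * math.log10(range_m)

# A click received at 130 dB re 1 uPa from a dolphin 100 m away implies a
# source level of 170 dB re 1 uPa at 1 m.
sl = source_level(130.0, 100.0)
```

This is why the range estimates (from the drone or the vertical array) are essential: without a range, a received level cannot be converted to a source level.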
Affiliation(s)
- Nicoline Abildtrup Nielsen
- Marine Biological Research Center, Department of Biology, University of Southern Denmark, 5300 Kerteminde, Denmark
- Stephen M Dawson
- Department of Marine Science, University of Otago, Dunedin 9054, New Zealand
- Sara Torres Ortiz
- Marine Biological Research Center, Department of Biology, University of Southern Denmark, 5300 Kerteminde, Denmark
- Magnus Wahlberg
- Marine Biological Research Center, Department of Biology, University of Southern Denmark, 5300 Kerteminde, Denmark
- Morgan J Martin
- Center for Marine Acoustics, Bureau of Ocean Energy Management, Sterling, Virginia 20166, USA

12. Christman KA, Finneran JJ, Mulsow J, Houser DS, Gentner TQ. The effects of range and echo-phase on range resolution in bottlenose dolphins (Tursiops truncatus) performing a successive comparison task. J Acoust Soc Am 2024; 155:274-283. [PMID: 38215217] [DOI: 10.1121/10.0024342]
Abstract
Echolocating bats and dolphins use biosonar to determine target range, but differences in range discrimination thresholds have been reported for the two species. Whether these differences represent a true difference in their sensory system capability is unknown. Here, the dolphin's range discrimination threshold as a function of absolute range and echo-phase was investigated. Using phantom echoes, the dolphins were trained to echo-inspect two simulated targets and indicate the closer target by pressing a paddle. One target was presented at a time, requiring the dolphin to hold the initial range in memory as they compared it to the second target. Range was simulated by manipulating echo-delay while the received echo levels, relative to the dolphins' clicks, were held constant. Range discrimination thresholds were determined at seven different ranges from 1.75 to 20 m. In contrast to bats, range discrimination thresholds increased from 4 to 75 cm, across the entire ranges tested. To investigate the acoustic features used more directly, discrimination thresholds were determined when the echo was given a random phase shift (±180°). Results for the constant-phase versus the random-phase echo were quantitatively similar, suggesting that dolphins used the envelope of the echo waveform to determine the difference in range.
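In the phantom-echo paradigm described above, simulated target range is set by the two-way travel time: range = c·delay/2, with c the speed of sound in water. A sketch, assuming a nominal c = 1500 m/s (the study's exact value is not given here):

```python
SOUND_SPEED = 1500.0  # m/s, nominal speed of sound in seawater

def delay_for_range(range_m, c=SOUND_SPEED):
    """Two-way echo delay (s) a phantom-echo system would apply."""
    return 2.0 * range_m / c

def range_for_delay(delay_s, c=SOUND_SPEED):
    return c * delay_s / 2.0

# Simulating a 20 m target requires roughly 26.7 ms of echo delay.
delay = delay_for_range(20.0)
```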
Affiliation(s)
- Katie A Christman
- Department of Psychology, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093, USA
- Department of Biologic and Bioacoustic Research, National Marine Mammal Foundation, 3131, 2240 Shelter Island Drive, San Diego, California 92106, USA
- James J Finneran
- United States Navy Marine Mammal Program, Naval Information Warfare Center Pacific Code 56710, 53560 Hull Street, San Diego, California 92152, USA
- Jason Mulsow
- Department of Biologic and Bioacoustic Research, National Marine Mammal Foundation, 3131, 2240 Shelter Island Drive, San Diego, California 92106, USA
- Dorian S Houser
- Department of Biologic and Bioacoustic Research, National Marine Mammal Foundation, 3131, 2240 Shelter Island Drive, San Diego, California 92106, USA
- Timothy Q Gentner
- Department of Psychology, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093, USA
- Department of Neurobiology, University of California, San Diego, 9500 Gilman Drive, La Jolla, California 92093, USA
13
|
Gransier R, Kastelein RA. Similar susceptibility to temporary hearing threshold shifts despite different audiograms in harbor porpoises and harbor seals. J Acoust Soc Am 2024; 155:396-404. PMID: 38240666. DOI: 10.1121/10.0024343.
Abstract
When they are exposed to loud fatiguing sounds in the ocean, marine mammals are susceptible to hearing damage in the form of temporary hearing threshold shifts (TTSs) or permanent hearing threshold shifts. We compared the level-dependent and frequency-dependent susceptibility to TTS in harbor seals and harbor porpoises, species with different hearing sensitivities in the low- and high-frequency regions. Both species were exposed to 100% duty cycle one-sixth-octave noise bands at frequencies that covered their entire hearing range; in the case of the 6.5 kHz exposure for the harbor seals, a pure tone (continuous wave) was used. TTS was quantified as a function of sound pressure level (SPL) half an octave above the center frequency of the fatiguing sound. Although the two species have different audiograms, their frequency-specific susceptibility to TTS was similar. The hearing frequency range in which both species were most susceptible to TTS was 22.5-50 kHz. Furthermore, these frequency ranges were characterized by similar critical levels (defined as the SPL of the fatiguing sound above which the magnitude of TTS induced as a function of SPL increases more strongly). This standardized between-species comparison indicates that the audiogram is not a good predictor of frequency-dependent susceptibility to TTS.
Affiliation(s)
- Robin Gransier
- Research Group Experimental Oto-rhino-laryngology (ExpORL), Department of Neurosciences, KU Leuven, Herestraat 49, Box 721, 3000 Leuven, Belgium
- Ronald A Kastelein
- Sea Mammal Research Company (SEAMARCO), Julianalaan 46, 3842 CC Harderwijk, The Netherlands
14
Chu MN, Gussenhoven C, van Hout R. A perception-induced /t/-to-/k/ sound change: evidence from a cross-linguistic study. Phonetica 2023; 80:465-493. PMID: 37852617. DOI: 10.1515/phon-2023-2003.
Abstract
John Ohala claimed that the source of sound change may lie in misperceptions which can be replicated in the laboratory. We tested this claim for a historical change of /t/ to /k/ in the coda in the Southern Min dialect of Chaoshan. We conducted a forced-choice segment identification task with CVC syllables in which the final C varied across the segments [p t k ʔ] in addition to a number of further variables, including the V, which ranged across [i u a]. The results from three groups of participants whose native languages have the coda systems /p t k ʔ/ (Zhangquan), /p k ʔ/ (Chaoshan) and /p t k/ (Dutch) indicate that [t] is the least stably perceived segment overall. It is particularly disfavoured when it follows [a], where there is a bias towards [k]. We argue that this finding supports a perceptual account of the historically documented scenario whereby a change from /at/ to /ak/ preceded and triggered a more general merger of /t/ with /k/ in the coda of Chaoshan. While we grant that perceptual sound changes are not the only or even the most common type of sound change, the fact that the perception results are essentially the same across the three language groups lends credibility to Ohala's perceptually motivated sound changes.
Affiliation(s)
- Man-Ni Chu
- Graduate Institute of Cross-Cultural Studies, Fu Jen Catholic University, New Taipei City, Taiwan
- Carlos Gussenhoven
- Department of Linguistics, Radboud University, Nijmegen, The Netherlands
- Roeland van Hout
- Department of Linguistics, Radboud University, Nijmegen, The Netherlands
15
Selbmann A, Miller PJO, Wensveen PJ, Svavarsson J, Samarra FIP. Call combination patterns in Icelandic killer whales (Orcinus orca). Sci Rep 2023; 13:21771. PMID: 38065973. PMCID: PMC10709340. DOI: 10.1038/s41598-023-48349-1.
Abstract
Acoustic sequences have been described in a range of species and in varying complexity. Cetaceans are known to produce complex song displays but these are generally limited to mysticetes; little is known about call combinations in odontocetes. Here we investigate call combinations produced by killer whales (Orcinus orca), a highly social and vocal species. Using acoustic recordings from 22 multisensor tags, we use a first order Markov model to show that transitions between call types or subtypes were significantly different from random, with repetitions and specific call combinations occurring more often than expected by chance. The mixed call combinations were composed of two or three calls and were part of three call combination clusters. Call combinations were recorded over several years, from different individuals, and several social clusters. The most common call combination cluster consisted of six call (sub-)types. Although different combinations were generated, there were clear rules regarding which were the first and last call types produced, and combinations were highly stereotyped. Two of the three call combination clusters were produced outside of feeding contexts, but their function remains unclear and further research is required to determine possible functions and whether these combinations could be behaviour- or group-specific.
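The first-order Markov analysis above amounts to counting call-to-call transitions and comparing them with what independence of successive calls would predict. A toy sketch of that bookkeeping (the call sequence and labels are invented for illustration; the study's significance testing is more involved):

```python
from collections import Counter

# Hypothetical call sequence; the call-type labels are invented.
sequence = ["C1", "C1", "C2", "C3", "C1", "C2", "C3", "C1", "C1", "C2"]

# Observed first-order transition counts.
transitions = Counter(zip(sequence, sequence[1:]))

# Expected counts under independence of successive calls:
# E[a -> b] = count(a in 'from' position) * P(b in 'to' position).
n_trans = len(sequence) - 1
from_counts = Counter(sequence[:-1])
to_freq = {c: n / n_trans for c, n in Counter(sequence[1:]).items()}
expected = {(a, b): from_counts[a] * to_freq[b]
            for a in from_counts for b in to_freq}

# Transitions occurring far more often than expected (repetitions or
# specific combinations) are the signature tested in the study.
for pair, obs in sorted(transitions.items()):
    print(pair, obs, round(expected[pair], 2))
```

In the study, the observed-versus-expected comparison is done with a proper Markov-model test across 22 tag records rather than raw counts from a single sequence.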
Affiliation(s)
- Anna Selbmann
- Faculty of Life and Environmental Sciences, University of Iceland, Reykjavík, Iceland
- Patrick J O Miller
- Sea Mammal Research Unit, School of Biology, University of St Andrews, St Andrews, UK
- Paul J Wensveen
- Faculty of Life and Environmental Sciences, University of Iceland, Reykjavík, Iceland
- Jörundur Svavarsson
- Faculty of Life and Environmental Sciences, University of Iceland, Reykjavík, Iceland
- Filipa I P Samarra
- Institute of Research Centres, University of Iceland, Vestmannaeyjar, Iceland
16
Webb McAdams AL, Smith ME. The relationship between body size and stridulatory sound production in loricariid catfishes. J Acoust Soc Am 2023; 154:3672-3683. PMID: 38059727. DOI: 10.1121/10.0022575.
Abstract
Sound production capabilities and characteristics in Loricariidae, the largest catfish family, have not been well examined. Sounds produced by three loricariid catfish species, Otocinclus affinis, Pterygoplichthys gibbiceps, and Pterygoplichthys pardalis, were recorded. Each of these species produces pulses via pectoral-fin spine stridulation by rubbing the ridged condyle of the dorsal process of the pectoral-fin spine base against a matching groove-like socket in the pectoral girdle. Light and scanning electron microscopy were used to examine the dorsal process of the pectoral-fin spines of these species. Mean distances between dorsal process ridges of O. affinis, P. gibbiceps, and P. pardalis were 53, 161, and 329 μm, respectively. Stridulation sounds occurred during either abduction (type A) or adduction (type B). O. affinis produced sounds through adduction only and P. pardalis through abduction only, whereas P. gibbiceps often produced pulse trains alternating between abduction and adduction. In these species, dominant frequency was an inverse function of sound duration, fish total length, and inter-ridge distance on the dorsal process of the pectoral-fin spine and sound duration increased with fish total length. While stridulation sounds are used in many behavioral contexts in catfishes, the functional significance of sound production in Loricariidae is currently unknown.
Affiliation(s)
- Amanda L Webb McAdams
- Department of Biology, Western Kentucky University, Bowling Green, Kentucky 42101, USA
- Michael E Smith
- Department of Biology, Western Kentucky University, Bowling Green, Kentucky 42101, USA
17
Luzum NR, Hamel BL, Shafiro V, Harris MS. Identification Accuracy of Safety-Relevant Environmental Sounds in Adult Cochlear Implant Users. Laryngoscope 2023; 133:2388-2393. PMID: 36317721. PMCID: PMC10149563. DOI: 10.1002/lary.30475.
Abstract
OBJECTIVE Examine cochlear implant (CI) users' ability to identify safety-relevant environmental sounds, which is imperative for safety, independence, and personal well-being. METHODS Twenty-one experienced adult CI users completed an Environmental Sound Identification (ESI) test consisting of 42 common environmental sounds, 28 of which were relevant to personal safety, along with 14 control sounds. Prior to sound identification, participants were shown sound names and asked to rate the familiarity and, separately, relevance to safety of each corresponding sound on a 1-5 scale. RESULTS Overall ESI accuracy was 57% correct for the safety-relevant sounds and 55% correct for control sounds. Participants rated safety-relevant sounds as more important to safety and more familiar than the non-safety sounds. ESI accuracy significantly correlated with familiarity ratings. CONCLUSION The present findings suggest mediocre ESI accuracy in postlingual adult CI users for safety-relevant and other environmental sounds. Deficits in the identification of these sounds may put CI listeners at increased risk of accidents or injuries and may require a specific rehabilitation program to improve CI outcomes. LEVEL OF EVIDENCE 4.
Affiliation(s)
- Benjamin L. Hamel
- Department of Pediatric and Adolescent Medicine, Mayo Clinic, Rochester, MN, USA
- Valeriy Shafiro
- Department of Communication Disorders & Sciences, College of Health Sciences & Graduate College, Rush University, Chicago, IL, USA
- Michael S. Harris
- Department of Otolaryngology & Communication Sciences, Medical College of Wisconsin, Milwaukee, WI, USA
- Department of Neurosurgery, Medical College of Wisconsin, Milwaukee, WI, USA
18
Hamza Y, Farhadi A, Schwarz DM, McDonough JM, Carney LH. Representations of fricatives in subcortical model responses: Comparisons with human consonant perception. J Acoust Soc Am 2023; 154:602-618. PMID: 37535429. PMCID: PMC10550336. DOI: 10.1121/10.0020536.
Abstract
Fricatives are obstruent sound contrasts made by airflow constrictions in the vocal tract that produce turbulence across the constriction or at a site downstream from the constriction. Fricatives exhibit significant intra/intersubject and contextual variability, yet they are perceived with high accuracy. The current study investigated modeled neural responses to fricatives in the auditory nerve (AN) and inferior colliculus (IC) with the hypothesis that response profiles across populations of neurons provide robust correlates of consonant perception. Stimuli were 270 intervocalic fricatives (10 speakers × 9 fricatives × 3 utterances). Computational model response profiles had characteristic frequencies that were log-spaced from 125 Hz to 8 or 20 kHz, to explore the impact of high-frequency responses. Confusion matrices were generated by k-nearest-neighbor subspace classifiers using profiles of average rates across characteristic frequencies as feature vectors, and were compared with published behavioral data. The modeled AN and IC neural responses provided better predictions of behavioral accuracy than the stimulus spectra, and IC showed better accuracy than AN. Behavioral fricative accuracy was explained by modeled neural response profiles, whereas confusions were only partially explained. Extended frequencies improved accuracy based on the model IC, corroborating the importance of extended high frequencies in speech perception.
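The classification idea above, labeling a rate profile by the training profiles it lies nearest to in feature space, can be sketched minimally with a 1-nearest-neighbor rule (the random stand-in "profiles" and class labels are invented; the study's subspace classifier and features are more elaborate):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical "response profiles": average rates across 30 model
# characteristic frequencies, three training profiles per class.
classes = ["s", "sh", "f"]
templates = {c: rng.random((3, 30)) + i for i, c in enumerate(classes)}
X = np.vstack([templates[c] for c in classes])
y = np.repeat(classes, 3)

def nearest_neighbor(profile):
    """Label a rate profile with the class of its nearest training profile."""
    d = np.linalg.norm(X - profile, axis=1)
    return str(y[int(np.argmin(d))])

# A probe close to an "sh" template is labelled "sh".
probe = templates["sh"][0] + 0.01
print(nearest_neighbor(probe))  # prints "sh"
```

Running a held-out set of profiles through such a classifier and tallying predicted versus true labels yields the confusion matrices compared against behavior in the study.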
Affiliation(s)
- Yasmeen Hamza
- Department of Biomedical Engineering, University of Rochester, Rochester, New York 14627, USA
- Afagh Farhadi
- Department of Electrical and Computer Engineering, University of Rochester, Rochester, New York 14627, USA
- Douglas M Schwarz
- Departments of Neuroscience and Biomedical Engineering, University of Rochester, Rochester, New York 14627, USA
- Joyce M McDonough
- Department of Linguistics, University of Rochester, Rochester, New York 14627, USA
- Laurel H Carney
- Departments of Biomedical Engineering, Neuroscience, and Electrical and Computer Engineering, University of Rochester, Rochester, New York 14627, USA
19
Sayigh LS, El Haddad N, Tyack PL, Janik VM, Wells RS, Jensen FH. Bottlenose dolphin mothers modify signature whistles in the presence of their own calves. Proc Natl Acad Sci U S A 2023; 120:e2300262120. PMID: 37364108. PMCID: PMC10318978. DOI: 10.1073/pnas.2300262120.
Abstract
Human caregivers interacting with children typically modify their speech in ways that promote attention, bonding, and language acquisition. Although this "motherese," or child-directed communication (CDC), occurs in a variety of human cultures, evidence among nonhuman species is very rare. We looked for its occurrence in a nonhuman mammalian species with long-term mother-offspring bonds that is capable of vocal production learning, the bottlenose dolphin (Tursiops truncatus). Dolphin signature whistles provide a unique opportunity to test for CDC in nonhuman animals, because we are able to quantify changes in the same vocalizations produced in the presence or absence of calves. We analyzed recordings made during brief catch-and-release events of wild bottlenose dolphins in waters near Sarasota Bay, Florida, United States, and found that females produced signature whistles with significantly higher maximum frequencies and wider frequency ranges when they were recorded with their own dependent calves vs. not with them. These differences align with the higher fundamental frequencies and wider pitch ranges seen in human CDC. Our results provide evidence in a nonhuman mammal for changes in the same vocalizations when produced in the presence vs. absence of offspring, and thus strongly support convergent evolution of motherese, or CDC, in bottlenose dolphins. CDC may function to enhance attention, bonding, and vocal learning in dolphin calves, as it does in human children. Our data add to the growing body of evidence that dolphins provide a powerful animal model for studying the evolution of vocal learning and language.
Affiliation(s)
- Laela S. Sayigh
- Biology Department, Woods Hole Oceanographic Institution, Falmouth, MA 02543
- Hampshire College, Amherst, MA 01002
- Nicole El Haddad
- Biology Department, Woods Hole Oceanographic Institution, Falmouth, MA 02543
- Earth and Environmental Sciences Department, University of Milano Bicocca, Milano 20126, Italy
- Peter L. Tyack
- Biology Department, Woods Hole Oceanographic Institution, Falmouth, MA 02543
- Sea Mammal Research Unit, Scottish Oceans Institute, University of St. Andrews, St. Andrews, KY16 8LB, United Kingdom
- Vincent M. Janik
- Sea Mammal Research Unit, Scottish Oceans Institute, University of St. Andrews, St. Andrews, KY16 8LB, United Kingdom
- Randall S. Wells
- Chicago Zoological Society's Sarasota Dolphin Research Program, c/o Mote Marine Laboratory, Sarasota, FL 34236
- Frants H. Jensen
- Biology Department, Woods Hole Oceanographic Institution, Falmouth, MA 02543
- Marine Mammal Research, Department of Ecoscience, Aarhus University, Roskilde 4000, Denmark
- Biology Department, Syracuse University, Syracuse, NY 13244
20
Huang D, Ren L, Lu H, Wang W. A Scene Adaption Framework for Infant Cry Detection in Obstetrics. Annu Int Conf IEEE Eng Med Biol Soc 2023; 2023:1-5. PMID: 38083776. DOI: 10.1109/embc40787.2023.10340693.
Abstract
Infant cry provides useful clinical insights for caregivers to make appropriate medical decisions, such as in obstetrics. However, robust infant cry detection in real clinical settings (e.g., obstetrics) is still challenging because of the limited training data available for such scenarios. In this paper, we propose a scene adaption framework (SAF) comprising two learning stages that can quickly adapt a cry detection model to a new environment. The first stage uses the acoustic principle that mixed sources in audio signals are approximately additive to imitate the sounds of clinical settings using public datasets. The second stage uses mutual learning to mine the shared characteristics of infant cries between the clinical setting and the public dataset, adapting to the scene in an unsupervised manner. A clinical trial was conducted in an obstetrics department, where crying audio from 200 infants was collected. The four classifiers evaluated for infant cry detection improved their F1-scores by nearly 30% when using SAF, achieving performance similar to supervised learning on the target setting. SAF is thus an effective plug-and-play tool for improving infant cry detection in new clinical settings. Our code is available at https://github.com/contactless-healthcare/Scene-Adaption-for-Infant-Cry-Detection.
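The additivity assumption behind SAF's first stage (a clinical scene is approximately a clean source plus background) is what makes simple simulation of new environments possible. A hedged sketch of additive mixing at a chosen signal-to-noise ratio, with synthetic stand-ins for the cry and ward noise (`mix_at_snr` is an illustrative helper, not part of the released code):

```python
import numpy as np

def mix_at_snr(signal, noise, snr_db):
    """Additively mix noise into signal at a target SNR in dB.

    Hypothetical helper illustrating the additivity assumption of
    SAF's first stage; not taken from the paper's repository.
    """
    noise = noise[: len(signal)]
    p_signal = np.mean(signal ** 2)
    p_noise = np.mean(noise ** 2)
    # Scale noise so p_signal / p_scaled_noise equals 10**(snr_db / 10).
    scale = np.sqrt(p_signal / (p_noise * 10 ** (snr_db / 10)))
    return signal + scale * noise

rng = np.random.default_rng(0)
cry = np.sin(2 * np.pi * 400 * np.arange(16_000) / 16_000)  # stand-in "cry"
ward = rng.standard_normal(16_000)                          # stand-in background
mixture = mix_at_snr(cry, ward, snr_db=10.0)
```

Sweeping `snr_db` over a range of values is a common way to make a detector trained on such mixtures robust to unseen noise levels.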
21
Somervuo P, Lauha P, Lokki T. Effects of landscape and distance in automatic audio based bird species identification. J Acoust Soc Am 2023; 154:245-254. PMID: 37439638. DOI: 10.1121/10.0020153.
Abstract
The present work focuses on how the landscape and distance between a bird and an audio recording unit affect automatic species identification. Moreover, it is shown that automatic species identification can be improved by taking into account the effects of landscape and distance. The proposed method uses measurements of impulse responses between the sound source and the recorder. These impulse responses, characterizing the effect of a landscape, can be measured in the real environment, after which they can be convolved with any number of recorded bird sounds to modify an existing set of bird sound recordings. The method is demonstrated using autonomous recording units on an open field and in two different types of forests, varying the distance between the sound source and the recorder. Species identification accuracy improves significantly when the landscape and distance effect is taken into account when building the classification model. The method is demonstrated using bird sounds, but the approach is applicable to other animal and non-animal vocalizations as well.
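The augmentation idea above (convolve a dry recording with a measured landscape impulse response to obtain the sound as heard at the recorder) can be sketched as follows. The "bird sound" and the impulse response here are synthetic stand-ins, not measured data:

```python
import numpy as np
from scipy.signal import fftconvolve

fs = 22_050
song = np.sin(2 * np.pi * 3000 * np.arange(fs // 2) / fs)  # synthetic "bird sound"

# Synthetic stand-in for a measured landscape impulse response:
# a direct path plus two attenuated, delayed reflections.
ir = np.zeros(2000)
ir[0] = 1.0
ir[700] = 0.4
ir[1500] = 0.15

# Convolving the dry recording with the IR yields the propagated sound;
# the study applies measured IRs this way to expand its training set.
propagated = fftconvolve(song, ir)[: len(song)]
```

Because convolution is cheap, one measured impulse response can be reused across an arbitrary number of archived recordings, which is exactly what makes the approach attractive for training classifiers.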
Affiliation(s)
- Panu Somervuo
- Faculty of Biological and Environmental Sciences, University of Helsinki, Helsinki, Finland
- Patrik Lauha
- Faculty of Biological and Environmental Sciences, University of Helsinki, Helsinki, Finland
- Tapio Lokki
- Acoustics Lab, Department of Information and Communications Engineering, Aalto University, Espoo, Finland
22
Wu Y, Li P, Guo W, Zhang B, Hu Z. Passive source depth estimation using beam intensity striations of a horizontal linear array in deep water. J Acoust Soc Am 2023; 154:255-269. PMID: 37449786. DOI: 10.1121/10.0020148.
Abstract
Source depth estimation is an important yet very difficult task for passive sonars, especially for horizontal linear arrays (HLAs). This paper proposes an efficient two-step depth estimation scheme using narrowband and broadband constructive and destructive striation patterns caused by interference between the direct (D) and sea-surface-reflected (SR) arrivals at an HLA on the bottom in deep water. First, the horizontal source-array ranges are derived from triangulation of solid-angle estimates obtained by subarray beamforming. The applicable areas of the method in deep water are investigated through Monte Carlo simulations, assuming different subarray partitioning schemes for a given HLA aperture. Second, cost functions are built to match the measured beam intensity striations with modeled ones. To mitigate the spatial smoothing of the beam intensity striations during beamforming, a criterion of the largest subarray aperture is established, and a computationally efficient way to model the replicas is presented, using the D-SR time-delay templates at a single element of the array calculated by ray theory. The performance degradation due to limited source range spans, distortion of the beam intensity striations, and range estimation errors is analyzed. Two experimental datasets verify the effectiveness of the proposed method.
Affiliation(s)
- Yanqun Wu
- College of Meteorology and Oceanography, National University of Defense Technology, Changsha 410073, China
- Pingzheng Li
- College of Meteorology and Oceanography, National University of Defense Technology, Changsha 410073, China
- Wei Guo
- College of Meteorology and Oceanography, National University of Defense Technology, Changsha 410073, China
- Bingbing Zhang
- College of Meteorology and Oceanography, National University of Defense Technology, Changsha 410073, China
- Zhengliang Hu
- College of Meteorology and Oceanography, National University of Defense Technology, Changsha 410073, China
23
Dong Z, Ding Q, Zhai W, Zhou M. A Speech Recognition Method Based on Domain-Specific Datasets and Confidence Decision Networks. Sensors (Basel) 2023; 23:6036. PMID: 37447886. PMCID: PMC10346893. DOI: 10.3390/s23136036.
Abstract
This paper proposes a speech recognition method based on a domain-specific language speech network (DSL-Net) and a confidence decision network (CD-Net). The method involves automatically training on a domain-specific dataset, using pre-trained model parameters for transfer learning, and obtaining a domain-specific speech model. Importance sampling weights were set for the trained domain-specific speech model, which was then integrated with the speech model trained on the benchmark dataset. This integration automatically expands the lexical content of the model to accommodate the input speech, based on the lexicon and language model. The adaptation attempts to address the issue of out-of-vocabulary words that are likely to arise in most realistic scenarios and utilizes external knowledge sources to extend the existing language model. By doing so, the approach enhances the adaptability of the language model in new domains or scenarios and improves the prediction accuracy of the model. For domain-specific vocabulary recognition, a deep fully convolutional neural network (DFCNN) and a connectionist temporal classification (CTC)-based approach were employed to achieve effective recognition of domain-specific vocabulary. Furthermore, a confidence-based classifier was added to enhance the accuracy and robustness of the overall approach. In the experiments, the method was tested on a proprietary domain audio dataset and compared with an automatic speech recognition (ASR) system trained on a large-scale dataset. Based on experimental verification, the model achieved an accuracy improvement from 82% to 91% in the medical domain. The inclusion of domain-specific datasets resulted in a 5% to 7% enhancement over the baseline, while the introduction of model confidence further improved the baseline by 3% to 5%. These findings demonstrate the significance of incorporating domain-specific datasets and model confidence in advancing speech recognition technology.
Affiliation(s)
- Zhe Dong
- School of Electrical and Control Engineering, North China University of Technology, Beijing 100041, China; (Q.D.); (W.Z.); (M.Z.)
24
Wei C, Houser D, Erbe C, Mátrai E, Ketten DR, Finneran JJ. Does rotation increase the acoustic field of view? Comparative models based on CT data of a live dolphin versus a dead dolphin. Bioinspir Biomim 2023; 18:035006. PMID: 36917857. DOI: 10.1088/1748-3190/acc43d.
Abstract
Rotational behaviour has been observed when dolphins track or detect targets; however, its role in echolocation is unknown. We used computed tomography data of one live and one recently deceased bottlenose dolphin, together with measurements of the acoustic properties of head tissues, to perform acoustic property reconstruction. The anatomical configuration and acoustic properties of the main forehead structures of the live and deceased dolphins were compared. Finite element analysis (FEA) was applied to simulate the generation and propagation of echolocation clicks, to compute their waveforms and spectra in both near- and far-fields, and to derive echolocation beam patterns. Modelling results from both the live and deceased dolphins were in good agreement with click recordings from other live, echolocating individuals. FEA was also used to estimate the acoustic scene experienced by a dolphin rotating 180° about its longitudinal axis to detect fish in the far-field at elevation angles of -20° to 20°. The results suggest that the rotational behaviour provides a wider insonification area and a wider receiving area. Thus, it may compensate for the dolphin's relatively narrow biosonar beam, asymmetries in sound reception, and constraints on the pointing direction imposed by limited head movement. The results also have implications for examining the accuracy of FEA in acoustic simulations using recently deceased specimens.
Affiliation(s)
- Chong Wei
- Centre for Marine Science and Technology, Curtin University, Perth, WA 6102, Australia
- Dorian Houser
- National Marine Mammal Foundation, 2240 Shelter Island Drive, #200, San Diego, CA 92106, United States of America
- Christine Erbe
- Centre for Marine Science and Technology, Curtin University, Perth, WA 6102, Australia
- Eszter Mátrai
- Research Department, Ocean Park, Hong Kong, People's Republic of China
- Darlene R Ketten
- Department of Biomedical Engineering, Boston University, Boston, MA 02215, United States of America
- James J Finneran
- United States Navy Marine Mammal Program, Naval Information Warfare Center Pacific Code 56710, 53560 Hull Street, San Diego, CA 92152, United States of America
25
Arranz P, Miranda D, Gkikopoulou KC, Cardona A, Alcazar J, Aguilar de Soto N, Thomas L, Marques TA. Comparison of visual and passive acoustic estimates of beaked whale density off El Hierro, Canary Islands. J Acoust Soc Am 2023; 153:2469. PMID: 37092951. DOI: 10.1121/10.0017921.
Abstract
Passive acoustic monitoring (PAM) offers considerable potential for density estimation of cryptic cetaceans, such as beaked whales. However, comparative studies on the accuracy of PAM density estimates for these species are lacking. Concurrent low-cost drifting PAM, with SoundTraps suspended at 200 m depth, and land-based sightings were conducted off the Canary Islands. Beaked whale density was estimated using a cue-count method, with click production rate and the probability of click detection derived from digital acoustic recording tags (DTags), and distance sampling techniques adapted to fixed-point visual surveys. Of 32 870 detections obtained throughout 206 h of PAM recordings, 68% were classified as "certain" beaked whale clicks. Acoustic detection probability was 0.15 [coefficient of variation (CV) 0.24] and click production rate was 0.46 clicks per second (CV 0.05). PAM density estimates were 21.5 or 48.6 whales per 1000 km2 [CV 0.50 or 0.44; 95% confidence interval (CI) 20.7-22.4 or 47-50.9], depending on whether "uncertain" clicks were considered. Density estimates from the concurrent sightings were 33.7 whales per 1000 km2 (CV 0.77, 95% CI 8.9-50.5). The cue-count PAM method applied here provides reliable estimates of beaked whale density over relatively long time periods and in realistic scenarios, as these matched the concurrent density estimates obtained from visual observations.
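The cue-count estimator underlying the PAM analysis divides the number of detected cues by the monitored area, the cue detection probability, the effort, and the per-animal cue rate. A rough sketch using the abstract's reported values where available; the truncation radius is an assumption (not reported in the abstract), so the resulting figure is illustrative only and will not reproduce the paper's estimates:

```python
import math

# Standard cue-count density estimator: D = n / (a * p * T * r).
n_clicks = 32_870 * 0.68   # "certain" clicks (from the abstract)
T_sec = 206 * 3600.0       # recording effort in seconds (from the abstract)
click_rate = 0.46          # clicks per second per whale (from the abstract)
p_detect = 0.15            # click detection probability (from the abstract)
w_km = 4.0                 # ASSUMED detection/truncation radius in km

area_km2 = math.pi * w_km ** 2        # monitored area around the sensor
density = n_clicks / (area_km2 * p_detect * T_sec * click_rate)
print(f"{density * 1000:.1f} whales per 1000 km^2 (illustrative)")
```

The published analysis additionally propagates the CVs of each component into the CV of the density estimate, which this arithmetic sketch omits.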
Affiliation(s)
- P Arranz: BIOECOMAC, Departamento de Biología Animal, Edafología y Geología, Universidad de La Laguna, Avenida Astrofísico F. Sánchez, s/n, 38206 San Cristóbal de La Laguna, Tenerife, Spain
- D Miranda: BIOECOMAC, Departamento de Biología Animal, Edafología y Geología, Universidad de La Laguna, Avenida Astrofísico F. Sánchez, s/n, 38206 San Cristóbal de La Laguna, Tenerife, Spain
- K C Gkikopoulou: Sea Mammal Research Unit, Scottish Oceans Institute, University of St Andrews, KY16 8LB St Andrews, Scotland
- A Cardona: Sea Mammal Research Unit, Scottish Oceans Institute, University of St Andrews, KY16 8LB St Andrews, Scotland
- J Alcazar: BIOECOMAC, Departamento de Biología Animal, Edafología y Geología, Universidad de La Laguna, Avenida Astrofísico F. Sánchez, s/n, 38206 San Cristóbal de La Laguna, Tenerife, Spain
- N Aguilar de Soto: BIOECOMAC, Departamento de Biología Animal, Edafología y Geología, Universidad de La Laguna, Avenida Astrofísico F. Sánchez, s/n, 38206 San Cristóbal de La Laguna, Tenerife, Spain
- L Thomas: Centre for Research into Ecological and Environmental Modelling, University of St Andrews, KY16 8LB St Andrews, Scotland
- T A Marques: Departamento de Biología Animal, Centro de Estatística e Aplicações, Faculdade de Ciências, Universidade de Lisboa, 1749-016 Campo Grande, Lisboa, Portugal
26
Figueiredo LDD, Maciel I, Viola FM, Savi MA, Simão SM. Nonlinear features in whistles produced by the short-beaked common dolphin (Delphinus delphis) off southeastern Brazil. J Acoust Soc Am 2023; 153:2436. [PMID: 37092947 DOI: 10.1121/10.0017883] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2022] [Accepted: 03/30/2023] [Indexed: 05/03/2023]
Abstract
Animal vocalizations have nonlinear characteristics responsible for features such as subharmonics, frequency jumps, biphonation, and deterministic chaos. This study describes the whistle repertoire of a short-beaked common dolphin (Delphinus delphis) group off the Brazilian coast and quantifies the nonlinear features of these whistles. Dolphins were recorded for a total of 67 min around Cabo Frio, Brazil. We identified 10 basic categories of whistle, with 75 different types, classified according to their contour shape. Most (45) of these 75 types had not been reported previously for the species. The duration of the whistles ranged from 0.04 to 3.67 s, with frequencies of 3.05-29.75 kHz. Overall, the whistle repertoire presented here has one of the widest frequency ranges and greatest levels of frequency modulation recorded in any study of D. delphis. All the nonlinear features sought during the study were confirmed, with at least one feature occurring in 38.4% of the whistles. Frequency jumps were the most common feature (29.75% of the whistles), and nonlinear time series analyses confirmed deterministic chaos in the chaotic-like segments. These results indicate that nonlinearities are a relevant characteristic of these whistles and that they are important in acoustic communication.
Affiliation(s)
- Israel Maciel: Department of Ecology, State University of Rio de Janeiro, Rio de Janeiro, Brazil
- Flavio M Viola: Center for Nonlinear Mechanics, COPPE-Mechanical Engineering, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
- Marcelo A Savi: Center for Nonlinear Mechanics, COPPE-Mechanical Engineering, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
- Sheila M Simão: Department of Environmental Science, Federal Rural University of Rio de Janeiro, Rio de Janeiro, Brazil
27
Stennette KA, Fishbein A, Prior N, Ball GF, Dooling RJ. Sound order discrimination in two species of birds-Taeniopygia guttata and Melopsittacus undulatus. J Comp Psychol 2023; 137:29-37. [PMID: 36931835 DOI: 10.1037/com0000340] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/18/2023]
Abstract
Recent psychophysical experiments have shown that zebra finches (Taeniopygia guttata-a songbird) are surprisingly insensitive to syllable sequence changes in their species-specific motifs while budgerigars (Melopsittacus undulatus-a psittacine) do much better when tested on exactly the same sounds. This is unexpected since zebra finch males learn the order of syllables in their songs when young and sing the same song throughout adulthood. Here we probe the limits of this species difference by testing birds on an order change involving just two syllables, hereafter called bi-syllable phrases. Results show budgerigars still perform better than zebra finches on an order change involving just two syllables. An analysis of response latencies shows that both species respond to an order change in a bi-syllable motif at the onset of the first syllable rather than listening to the entire sequence before responding. Additional tests with one syllable omitted or doubled, or with white noise bursts substituted for syllables, indicate that the first syllable in the sequence has a dominant effect on subsequent discrimination of changes in a bi-syllable pattern. These results are surprising in that zebra finch males sing their full motif syllable sequence with a high degree of stereotypy throughout life, suggesting that this consistency in production may not rely on perceptual mechanisms for processing syllable order in adulthood. Budgerigars, on the other hand, are quite sensitive to bi-syllable order changes, an ability that may be related to useful information being encoded in the sequence of syllables in their natural song. (PsycInfo Database Record (c) 2023 APA, all rights reserved).
Affiliation(s)
- Adam Fishbein: Department of Psychology, University of Maryland College Park
- Nora Prior: Department of Psychology, University of Maryland College Park
- Gregory F Ball: Department of Psychology, University of Maryland College Park
28
ZoBell VM, Gassmann M, Kindberg LB, Wiggins SM, Hildebrand JA, Frasier KE. Retrofit-induced changes in the radiated noise and monopole source levels of container ships. PLoS One 2023; 18:e0282677. [PMID: 36928448 PMCID: PMC10019734 DOI: 10.1371/journal.pone.0282677] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2022] [Accepted: 02/20/2023] [Indexed: 03/18/2023] Open
Abstract
The container shipping line Maersk undertook a Radical Retrofit to improve the energy efficiency of twelve sister container ships. Noise reduction, identified as a potential added benefit of the retrofitting effort, was investigated in this study. A passive acoustic recording dataset from the Santa Barbara Channel off Southern California was used to compile over 100 opportunistic vessel transits of the twelve G-Class container ships, pre- and post-retrofit. Post-retrofit, the G-Class vessels' capacity was increased from ~9,000 twenty-foot equivalent units (TEUs) to ~11,000 TEUs, which required a draft increase of the vessel by 1.5 m on average. The increased vessel draft resulted in higher radiated noise levels (<2 dB) in the mid- and high-frequency bands. Accounting for the Lloyd's mirror (dipole source) effect, the monopole source levels of the post-retrofit ships were found to be significantly lower (>5 dB) than the pre-retrofit ships in the low-frequency band and the reduction was greatest at low speed. Although multiple design changes occurred during retrofitting, the reduction in the low-frequency band most likely results from a reduction in cavitation due to changes in propeller and bow design.
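The Lloyd's mirror (dipole source) effect mentioned above can be illustrated with the standard surface-image interference term that relates radiated noise to a monopole source level. The sketch below is a textbook form, not the study's processing chain, and the frequency, depth, and angle values are illustrative.

```python
import math

def lloyds_mirror_term_db(freq_hz, source_depth_m, grazing_angle_rad, c_mps=1500.0):
    """Surface-image (dipole) interference term in dB.

    For a source at depth d below a pressure-release surface,
    |p_dipole| = 2*|sin(k*d*sin(phi))| * |p_monopole|, so subtracting this
    term from a radiated noise level recovers a monopole source level.
    """
    k = 2.0 * math.pi * freq_hz / c_mps  # acoustic wavenumber
    factor = 2.0 * abs(math.sin(k * source_depth_m * math.sin(grazing_angle_rad)))
    return 20.0 * math.log10(factor)

# When k*d*sin(phi) = pi/2 the direct and image paths add in phase:
# the field is 6 dB above the monopole alone
peak_db = lloyds_mirror_term_db(50.0, 7.5, math.pi / 2)
```

At low frequency the sine argument is small and the term goes strongly negative, which is why low-frequency monopole source levels sit well above the radiated noise levels measured at the array.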
Affiliation(s)
- Vanessa M. ZoBell: Scripps Institution of Oceanography, University of California San Diego, La Jolla, California, United States of America
- Martin Gassmann: Scripps Institution of Oceanography, University of California San Diego, La Jolla, California, United States of America
- Lee B. Kindberg: Maersk North America, Charlotte, North Carolina, United States of America
- Sean M. Wiggins: Scripps Institution of Oceanography, University of California San Diego, La Jolla, California, United States of America
- John A. Hildebrand: Scripps Institution of Oceanography, University of California San Diego, La Jolla, California, United States of America
- Kaitlin E. Frasier: Scripps Institution of Oceanography, University of California San Diego, La Jolla, California, United States of America
29
Veyrié A, Noreña A, Sarrazin JC, Pezard L. Investigating the influence of masker and target properties on the dynamics of perceptual awareness under informational masking. PLoS One 2023; 18:e0282885. [PMID: 36928693 PMCID: PMC10019711 DOI: 10.1371/journal.pone.0282885] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2022] [Accepted: 02/27/2023] [Indexed: 03/18/2023] Open
Abstract
Informational masking has been investigated using the detection of an auditory target embedded in a random multi-tone masker. The build-up of the target percept is influenced by the masker and target properties. Most studies dealing with discrimination performance neglect the dynamics of perceptual awareness. This study aims at investigating the dynamics of perceptual awareness using multi-level survival models in an informational masking paradigm by manipulating masker uncertainty, masker-target similarity and target repetition rate. Consistent with previous studies, it shows that high target repetition rates, low masker-target similarity and low masker uncertainty facilitate target detection. In the context of evidence accumulation models, these results can be interpreted by changes in the accumulation parameters. The probabilistic description of perceptual awareness provides a benchmark for the choice of target and masker parameters in order to examine the underlying cognitive and neural dynamics of perceptual awareness.
Affiliation(s)
- Alexandre Veyrié: Aix-Marseille Université, LNC, CNRS UMR 7291, Marseille, France; ONERA, The French Aerospace Lab, Salon de Provence, France
- Arnaud Noreña: Aix-Marseille Université, LNC, CNRS UMR 7291, Marseille, France
- Laurent Pezard: Aix-Marseille Université, LNC, CNRS UMR 7291, Marseille, France
30
Conant PC, Li P, Liu X, Klinck H, Fleishman E, Gillespie D, Nosal EM, Roch MA. Silbido profundo: An open source package for the use of deep learning to detect odontocete whistles. J Acoust Soc Am 2022; 152:3800. [PMID: 36586843 DOI: 10.1121/10.0016631] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2022] [Accepted: 12/08/2022] [Indexed: 06/17/2023]
Abstract
This work presents an open-source MATLAB software package for exploiting recent advances in extracting tonal signals from large acoustic data sets. A whistle extraction algorithm published by Li, Liu, Palmer, Fleishman, Gillespie, Nosal, Shiu, Klinck, Cholewiak, Helble, and Roch [(2020). Proceedings of the International Joint Conference on Neural Networks, July 19-24, Glasgow, Scotland, p. 10] is incorporated into silbido, an established software package for extraction of cetacean tonal calls. The precision and recall of the new system were over 96% and nearly 80%, respectively, when applied to a whistle extraction task on a challenging two-species subset of a conference-benchmark data set. A second data set was examined to assess whether the algorithm generalized to data that were collected across different recording devices and locations. These data included 487 h of weakly labeled, towed array data collected in the Pacific Ocean on two National Oceanographic and Atmospheric Administration (NOAA) cruises. Labels for these data consisted of regions of toothed whale presence for at least 15 species that were based on visual and acoustic observations and not limited to whistles. Although the lack of per whistle-level annotations prevented measurement of precision and recall, there was strong concurrence of automatic detections and the NOAA annotations, suggesting that the algorithm generalizes well to new data.
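Whistle-level precision and recall, as reported above, come down to matching detected time regions against ground-truth annotations. A toy interval-overlap scorer (this is an illustration of the metric, not silbido's actual evaluation code, and real scoring would also compare frequency content):

```python
def overlaps(a, b):
    """True if two (start, end) time intervals overlap."""
    return a[0] < b[1] and b[0] < a[1]

def precision_recall(detections, annotations):
    """Score detections against annotations by time overlap.

    precision = fraction of detections that match some annotation
    recall    = fraction of annotations matched by some detection
    """
    matched_det = sum(1 for d in detections if any(overlaps(d, a) for a in annotations))
    matched_ann = sum(1 for a in annotations if any(overlaps(d, a) for d in detections))
    return matched_det / len(detections), matched_ann / len(annotations)

# Two of three detections hit annotations; two of three annotations are found
p, r = precision_recall([(0.0, 1.0), (2.0, 3.0), (10.0, 11.0)],
                        [(0.5, 1.5), (2.5, 3.5), (5.0, 6.0)])
```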
Affiliation(s)
- Peter C Conant: Department of Computer Science, San Diego State University, San Diego, California 92182, USA
- Pu Li: Department of Computer Science, San Diego State University, San Diego, California 92182, USA
- Xiaobai Liu: Department of Computer Science, San Diego State University, San Diego, California 92182, USA
- Holger Klinck: K. Lisa Yang Center for Conservation Bioacoustics, Cornell Lab of Ornithology, Cornell University, Ithaca, New York 14850, USA
- Erica Fleishman: College of Earth, Ocean, and Atmospheric Sciences, Oregon State University, Corvallis, Oregon 97331, USA
- Douglas Gillespie: Sea Mammal Research Unit, Scottish Oceans Institute, University of St. Andrews, St. Andrews, KY16 9AJ, United Kingdom
- Eva-Marie Nosal: Department of Ocean and Resources Engineering, University of Hawai'i at Mānoa, Honolulu, Hawaii 96822, USA
- Marie A Roch: Department of Computer Science, San Diego State University, San Diego, California 92182, USA
31
Shin M, Hong W, Lee K, Choo Y. Passive Sonar Target Identification Using Multiple-Measurement Sparse Bayesian Learning. Sensors (Basel) 2022; 22:8511. [PMID: 36366208 PMCID: PMC9654619 DOI: 10.3390/s22218511] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2022] [Revised: 10/27/2022] [Accepted: 11/03/2022] [Indexed: 06/16/2023]
Abstract
Accurate estimation of frequency components is important for identifying and tracking marine objects (e.g., surface ships, submarines). In general, a passive sonar system consists of a sensor array in which each sensor receives data containing common information about the target signal. In this paper, we consider multiple-measurement sparse Bayesian learning (MM-SBL), which reconstructs sparse solutions in a linear system using Bayesian frameworks, to detect the common frequency components received by each sensor. In addition, direction-of-arrival estimation was performed on each detected common frequency component using MM-SBL based on beamforming. The azimuth for each common frequency component was confirmed in the frequency-azimuth plot, through which we identified the target. We also performed target tracking over time using the target detection results, which are derived from the sum of the signal spectrum at the azimuth angle. The performance of MM-SBL and a conventional target detection method based on energy detection were compared using in-situ data measured near the Korean peninsula, where MM-SBL displayed superior detection performance and high-resolution results.
32
Stilp CE, Shorey AE, King CJ. Nonspeech sounds are not all equally good at being nonspeech. J Acoust Soc Am 2022; 152:1842. [PMID: 36182316 DOI: 10.1121/10.0014174] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/11/2022] [Accepted: 08/30/2022] [Indexed: 06/16/2023]
Abstract
Perception of speech sounds has a long history of being compared to perception of nonspeech sounds, with rich and enduring debates regarding how closely they share similar underlying processes. In many instances, perception of nonspeech sounds is directly compared to that of speech sounds without a clear explanation of how related these sounds are to the speech they are selected to mirror (or not mirror). While the extreme acoustic variability of speech sounds is well documented, this variability is bounded by the common source of a human vocal tract. Nonspeech sounds do not share a common source, and as such, exhibit even greater acoustic variability than that observed for speech. This increased variability raises important questions about how well perception of a given nonspeech sound might resemble or model perception of speech sounds. Here, we offer a brief review of extremely diverse nonspeech stimuli that have been used in the efforts to better understand perception of speech sounds. The review is organized according to increasing spectrotemporal complexity: random noise, pure tones, multitone complexes, environmental sounds, music, speech excerpts that are not recognized as speech, and sinewave speech. Considerations are offered for stimulus selection in nonspeech perception experiments moving forward.
Affiliation(s)
- Christian E Stilp: Department of Psychological and Brain Sciences, University of Louisville, Louisville, Kentucky 40292, USA
- Anya E Shorey: Department of Psychological and Brain Sciences, University of Louisville, Louisville, Kentucky 40292, USA
- Caleb J King: Department of Psychological and Brain Sciences, University of Louisville, Louisville, Kentucky 40292, USA
33
Pinson S, Quilfen V, Le Courtois F, Real G, Fattaccioli D. Shallow-water waveguide acoustic analysis in a fluctuating environment. J Acoust Soc Am 2022; 152:1252. [PMID: 36182283 DOI: 10.1121/10.0013831] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2022] [Accepted: 08/11/2022] [Indexed: 06/16/2023]
Abstract
The Acoustic Laboratory for Marine Applications (ALMA) is a deployable and autonomous acoustic system, designed by DGA Naval Systems, to address problems in underwater acoustics, such as sound propagation in fluctuating environments. In this article, data from the ALMA-2016 at-sea campaign are used to analyze the influence of ocean fluctuations on sound propagation in a shallow-water waveguide. The experiment took place on the continental shelf of the island of Corsica in November 2016. A source and a receiver array were 9.3 km apart in a nearly constant water depth of 100 m. The source emitted a variety of signals, from which the chirp (1-13 kHz) is used to extract the waveguide eigenrays. To do so, time-domain beamforming is performed on the match-filtered received signals with automatic detection of local maxima in the time-of-arrival/direction-of-arrival (TOA/DOA) domain. Acquisitions at a 2 min period over more than 13 h show significant fluctuations in eigenray TOAs/DOAs. Qualitative comparisons with synthetic signals obtained from simulations in two and three dimensions reproduce the observed eigenray fluctuations without including range dependence of the sound-speed profile.
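The time-domain beamforming step described above can be sketched as a delay-and-sum over array channels. The toy version below uses integer-sample delays and synthetic signals; the array geometry, sample rate, and signal are illustrative values, not parameters of the ALMA campaign.

```python
import math

def delay_and_sum_power(signals, spacing_m, steer_rad, fs_hz, c_mps=1500.0):
    """Mean output power of a uniform line array steered to `steer_rad`.

    Integer-sample delays keep the sketch simple; a real beamformer would
    interpolate fractional delays.
    """
    n = len(signals[0])
    out = []
    for t in range(n):
        acc = 0.0
        for i, s in enumerate(signals):
            # Advance channel i by its expected arrival delay at this steering angle
            d = int(round(i * spacing_m * math.sin(steer_rad) / c_mps * fs_hz))
            u = t + d
            if 0 <= u < n:
                acc += s[u]
        out.append(acc)
    return sum(v * v for v in out) / n

# Synthetic plane wave hitting a 4-element array end-on: 2 samples of delay
# per element at fs = 1000 Hz with 3 m spacing
sigs = [[math.cos(0.5 * (t - 2 * i)) for t in range(400)] for i in range(4)]
on_target = delay_and_sum_power(sigs, 3.0, math.pi / 2, 1000.0)  # steered at the wave
broadside = delay_and_sum_power(sigs, 3.0, 0.0, 1000.0)          # steered elsewhere
```

Steering at the true arrival angle aligns the channels and maximizes output power, which is how local maxima in the TOA/DOA plane mark eigenray arrivals.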
Affiliation(s)
- Samuel Pinson: ENSTA Bretagne, 2 Rue François Verny, 29200 Brest, France
- Gaultier Real: Direction Générale de l'Armement, Avenue de la Tour Royale, 83000 Toulon, France
34
Lv Z, Du L, Li H, Wang L, Qin J, Yang M, Ren C. Influence of Temporal and Spatial Fluctuations of the Shallow Sea Acoustic Field on Underwater Acoustic Communication. Sensors (Basel) 2022; 22:5795. [PMID: 35957351 PMCID: PMC9371005 DOI: 10.3390/s22155795] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2022] [Revised: 07/21/2022] [Accepted: 07/28/2022] [Indexed: 12/05/2022]
Abstract
In underwater acoustic communication (UAC) systems, the channel characteristics are mainly affected by spatiotemporal changes, manifested in two factors: the effects of refraction and scattering by the layered seawater medium on the sound field, and random fluctuations from the sea floor and surface. Due to the time-varying and space-varying characteristics of the channel, communication signals vary significantly in time and space. Furthermore, the signal shows frequency-selective fading in the frequency domain and waveform distortion in the time domain, which seriously degrade the performance of a UAC system. Techniques such as error-correction coding or spatial diversity are usually adopted by UAC systems to neutralize or eliminate the effects of deep fading and signal distortion, at the cost of a significant share of limited communication resources. From the perspective of the sound field, this study used experimental data to analyze the spatiotemporal fluctuation characteristics of the signal and noise fields and summarized their temporal and spatial variation patterns. These findings can guide the parameter configuration and network-protocol optimization of a UAC system through reasonable selection of communication parameters such as frequency, bandwidth, equipment deployment depth, and horizontal distance.
Affiliation(s)
- Zhichao Lv: College of Ocean Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China
- Libin Du: College of Ocean Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China
- Huming Li: College of Ocean Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China
- Lei Wang: College of Ocean Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China
- Jixing Qin: State Key Laboratory of Acoustics, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190, China
- Min Yang: North China Sea Marine Technical Support Center, State Oceanic Administration, Qingdao 266061, China
- Chao Ren: Acoustic Science and Technology Laboratory, Harbin Engineering University, Harbin 150001, China
35
Chabot S, Braasch J. Walkable auralizations for experiential learning in an immersive classroom. J Acoust Soc Am 2022; 152:899. [PMID: 36050150 DOI: 10.1121/10.0012985] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2021] [Accepted: 07/08/2022] [Indexed: 06/15/2023]
Abstract
This paper proposes an experiential method for learning acoustics and consequences of room design through the rapid creation of audio-visual congruent walkable auralizations. An efficient method produces auralizations of acoustical landmarks using a two-dimensional ray-tracing algorithm and publicly available floor plans for a 128-channel wave-field synthesis system. Late reverberation parameters are calculated using additional volumetric data. Congruent visuals are produced using a web-based interface accessible via personal devices, which automatically formats for and transmits to the immersive display. Massive user-contributed online databases are harnessed through application programming interfaces, such as those offered by the Google Maps Platform, to provide near-instant access to innumerable locations. The approach allows the rapid sonic recreation of historical concert venues with adequate sound sources. Listeners can walk through these recreations over an extended user area (12 m × 10 m).
Affiliation(s)
- Samuel Chabot: School of Architecture, Rensselaer Polytechnic Institute, Troy, New York 12180, USA
- Jonas Braasch: School of Architecture, Rensselaer Polytechnic Institute, Troy, New York 12180, USA
36
Gee KL, Mathews LT, Anderson MC, Hart GW. Saturn-V sound levels: A letter to the Redditor. J Acoust Soc Am 2022; 152:1068. [PMID: 36050168 DOI: 10.1121/10.0013216] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2021] [Accepted: 07/12/2022] [Indexed: 06/15/2023]
Abstract
The Saturn V is a monument to one of mankind's greatest achievements: the human Moon landings. However, online claims about this vehicle's impressive acoustics by well-meaning individuals are often based on misunderstood or incorrect data. This article, intended for both educators and enthusiasts, discusses topics related to rocket acoustics and documents what is known about the Saturn V's levels: overall power, maximum overall sound pressure, and peak pressure. The overall power level was approximately 204 dB re 1 pW, whereas its lesser sound pressure levels were impacted by source size, directivity, and propagation effects. As this article is part of a special issue on Education in Acoustics in The Journal of the Acoustical Society of America, supplementary Saturn V-related homework problems are included.
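The quoted 204 dB re 1 pW is a sound power level, and converts to absolute radiated acoustic power with the usual decibel relation; a quick worked example:

```python
def acoustic_power_watts(level_db, ref_watts=1e-12):
    """Convert a sound power level in dB re `ref_watts` to watts."""
    return ref_watts * 10.0 ** (level_db / 10.0)

# 204 dB re 1 pW corresponds to roughly 2.5e8 W (~250 MW) of acoustic power
saturn_v_power = acoustic_power_watts(204.0)
```

Note this is acoustic power only, a small fraction of the vehicle's total mechanical power; the sound pressure level at any given point is a separate quantity shaped by the source size, directivity, and propagation effects the abstract mentions.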
Affiliation(s)
- Kent L Gee: Department of Physics and Astronomy, Brigham Young University, Provo, Utah 84602, USA
- Logan T Mathews: Department of Physics and Astronomy, Brigham Young University, Provo, Utah 84602, USA
- Mark C Anderson: Department of Physics and Astronomy, Brigham Young University, Provo, Utah 84602, USA
- Grant W Hart: Department of Physics and Astronomy, Brigham Young University, Provo, Utah 84602, USA
37
Tolkova I, Klinck H. Source separation with an acoustic vector sensor for terrestrial bioacoustics. J Acoust Soc Am 2022; 152:1123. [PMID: 36050162 DOI: 10.1121/10.0013505] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Passive acoustic monitoring is emerging as a low-cost, non-invasive methodology for automated species-level population surveys. However, systems for automating the detection and classification of vocalizations in complex soundscapes are significantly hindered by the overlap of calls and environmental noise. We propose addressing this challenge by utilizing an acoustic vector sensor to separate contributions from different sound sources. More specifically, we describe and implement an analytical pipeline consisting of (1) calculating direction-of-arrival, (2) decomposing the azimuth estimates into angular distributions for individual sources, and (3) numerically reconstructing source signals. Using both simulation and experimental recordings, we evaluate the accuracy of direction-of-arrival estimation through the active intensity method (AIM) against the baselines of white noise gain constraint beamforming (WNC) and multiple signal classification (MUSIC). Additionally, we demonstrate and compare source signal reconstruction with simple angular thresholding and a wrapped Gaussian mixture model. Overall, we show that AIM achieves higher performance than WNC and MUSIC, with a mean angular error of about 5°, robustness to environmental noise, flexible representation of multiple sources, and high fidelity in source signal reconstructions.
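The active intensity method (AIM) estimates direction of arrival from the time-averaged product of the pressure channel and the particle-velocity channels of a vector sensor. A minimal single-source, noiseless sketch (an illustration of the principle, not the authors' pipeline, which adds band filtering and per-source decomposition):

```python
import math

def aim_azimuth(p, vx, vy):
    """Azimuth of the active acoustic intensity vector from co-located
    pressure (p) and particle-velocity (vx, vy) sample sequences."""
    ix = sum(ps * vs for ps, vs in zip(p, vx)) / len(p)  # x-intensity component
    iy = sum(ps * vs for ps, vs in zip(p, vy)) / len(p)  # y-intensity component
    return math.atan2(iy, ix)

# Synthetic plane wave arriving from azimuth 1.0 rad: the particle velocity
# is in phase with pressure and points along the propagation direction
theta = 1.0
p = [math.cos(0.3 * t) for t in range(200)]
vx = [math.cos(theta) * s for s in p]
vy = [math.sin(theta) * s for s in p]
est = aim_azimuth(p, vx, vy)
```

For a single plane wave this recovers the arrival azimuth exactly; the paper's contribution is handling multiple overlapping sources and noise on top of this estimate.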
Affiliation(s)
- Irina Tolkova: School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts 02138, USA
- Holger Klinck: K. Lisa Yang Center for Conservation Bioacoustics, Cornell Lab of Ornithology, Cornell University, Ithaca, New York 14850, USA
38
Abstract
State-of-the-art mode estimation methods either utilize active source transmissions or rely on a full-spanning array to extract normal modes from noise radiated by a ship-of-opportunity. Modal-MUSIC, an adaptation of the MUSIC algorithm (best known for direction-of-arrival estimation), extracts normal modes from a moving source of unknown range recorded on a partially spanning vertical line array, given knowledge of the water column sound speed profile. The method is demonstrated on simulations, as well as on data from the SWellEx-96 experiment. Extracted normal modes from ship noise during the experiment are used to successfully localize a multitone source without any geoacoustic information.
Affiliation(s)
- F Hunter Akins: Scripps Institution of Oceanography, University of California, San Diego, La Jolla, California 92093-0701, USA
- W A Kuperman: Scripps Institution of Oceanography, University of California, San Diego, La Jolla, California 92093-0701, USA
39
Mills HE, Shorey AE, Theodore RM, Stilp CE. Context effects in perception of vowels differentiated by F1 are not influenced by variability in talkers' mean F1 or F3. J Acoust Soc Am 2022; 152:55. [PMID: 35931547 DOI: 10.1121/10.0011920] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/08/2021] [Accepted: 06/08/2022] [Indexed: 06/15/2023]
Abstract
Spectral properties of earlier sounds (context) influence recognition of later sounds (target). Acoustic variability in context stimuli can disrupt this process. When mean fundamental frequencies (f0's) of preceding context sentences were highly variable across trials, shifts in target vowel categorization [due to spectral contrast effects (SCEs)] were smaller than when sentence mean f0's were less variable; when sentences were rearranged to exhibit high or low variability in mean first formant frequencies (F1) in a given block, SCE magnitudes were equivalent [Assgari, Theodore, and Stilp (2019) J. Acoust. Soc. Am. 145(3), 1443-1454]. However, since sentences were originally chosen based on variability in mean f0, stimuli underrepresented the extent to which mean F1 could vary. Here, target vowels (/ɪ/-/ɛ/) were categorized following context sentences that varied substantially in mean F1 (experiment 1) or mean F3 (experiment 2) with variability in mean f0 held constant. In experiment 1, SCE magnitudes were equivalent whether context sentences had high or low variability in mean F1; the same pattern was observed in experiment 2 for new sentences with high or low variability in mean F3. Variability in some acoustic properties (mean f0) can be more perceptually consequential than others (mean F1, mean F3), but these results may be task-dependent.
Affiliation(s)
- Hannah E Mills: Department of Psychological and Brain Sciences, University of Louisville, Louisville, Kentucky 40292, USA
- Anya E Shorey: Department of Psychological and Brain Sciences, University of Louisville, Louisville, Kentucky 40292, USA
- Rachel M Theodore: Department of Speech, Language, and Hearing Sciences, University of Connecticut, Storrs, Connecticut 06269, USA
- Christian E Stilp: Department of Psychological and Brain Sciences, University of Louisville, Louisville, Kentucky 40292, USA
|
40
|
Guan S, Brookens T, Miner R. Kurtosis analysis of sounds from down-the-hole pile installation and the implications for marine mammal auditory impairment. JASA Express Lett 2022; 2:071201. [PMID: 36154049] [DOI: 10.1121/10.0012348]
Abstract
Sounds from down-the-hole pile installation contain both impulsive and non-impulsive components. Kurtosis values (β) were determined for two datasets to investigate the impulsiveness of piling sounds. When the hammer struck the pile(s), β was 21-30 at 10 m and approximately 10 at 200 m. When the hammer was used for drilling without contacting the pile, β was 4-6 at all distances. These findings suggest that a simple dichotomy of classifying sounds as impulsive or non-impulsive may be overly simplistic for assessing marine mammal auditory impacts and studies investigating the impacts from complex sound fields are needed.
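The kurtosis statistic β used above is the fourth standardized moment of the pressure-sample distribution. A minimal sketch (illustrative code, not the authors' implementation; names and signal shapes are ours) of how β separates impulsive from non-impulsive recordings:

```python
import math

def kurtosis(samples):
    """Pearson kurtosis beta = m4 / m2**2 (fourth central moment over
    squared variance). Gaussian noise gives beta ~ 3; a sustained tone
    gives ~ 1.5; impulsive content drives beta far above 3."""
    n = len(samples)
    mu = sum(samples) / n
    m2 = sum((x - mu) ** 2 for x in samples) / n
    m4 = sum((x - mu) ** 4 for x in samples) / n
    return m4 / (m2 * m2)

# Non-impulsive: a steady tone (10 full cycles).
tone = [math.sin(2 * math.pi * k / 100) for k in range(1000)]
# Impulsive: near-silence with a single hammer-strike-like spike.
strike = [0.0] * 999 + [1.0]

print(kurtosis(tone))    # ~1.5: non-impulsive
print(kurtosis(strike))  # ~1000: strongly impulsive
```

Because β grades sounds along a continuum, it supports the paper's point that a binary impulsive/non-impulsive dichotomy can be overly simplistic.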
Affiliation(s)
- Shane Guan, Bureau of Ocean Energy Management, Division of Environmental Sciences, Sterling, Virginia 20166, USA
- Robert Miner, Robert Miner Dynamic Testing of Alaska Inc., Manchester, Washington 98353, USA

41
Jones B, Tufano S, Ridgway S. Signature whistles exhibit a 'fade-in' and then 'fade-out' pattern of relative amplitude declination. Behav Processes 2022; 200:104690. [PMID: 35709885] [DOI: 10.1016/j.beproc.2022.104690]
Abstract
Bottlenose dolphins have individually distinct signature whistles that are characterized by a stereotyped frequency-time contour. Signature whistles are commonly exchanged with short time delays between calls. Dolphin whistles are produced by pressurized nasal sacs that increase and then decrease in pressure over the course of emission. This study found that the relative amplitude modulation pattern over time exhibited the same 'fade-in' and then 'fade-out' pattern in the signature whistles of eight bottlenose dolphins housed at the Navy facility in San Diego, CA. Both the initial and final five percent of the whistle's duration also had significantly lower mean relative amplitude than the center five percent. The current analysis of the amplitude-time relationship was then integrated into a previously reported model of the negative relationship between relative log amplitude and log peak frequency. This produced a more robust model accounting for the predictable aspects of the otherwise non-stereotyped amplitude modulations of signature whistles. Whether dolphins can intentionally manipulate these amplitude features or whether they are simple by-products of the sound production system, and further whether they are perceived and utilized by receivers, is an exciting area for continued research.
Affiliation(s)
- Brittany Jones, National Marine Mammal Foundation, 2240 Shelter Island Dr, San Diego, CA 92106, USA
- Samantha Tufano, National Marine Mammal Foundation, 2240 Shelter Island Dr, San Diego, CA 92106, USA
- Sam Ridgway, National Marine Mammal Foundation, 2240 Shelter Island Dr, San Diego, CA 92106, USA

42
Sun Z, Jiang J, Li Y, Li C, Li Z, Fu X, Duan F. An automated piecewise synthesis method for cetacean tonal sounds based on time-frequency spectrogram. J Acoust Soc Am 2022; 151:3758. [PMID: 35778203] [DOI: 10.1121/10.0011551]
Abstract
Bionic signal waveform design plays an important role in biological research, as well as in bionic underwater acoustic detection and communication. Most conventional methods cannot construct high-similarity bionic waveforms to match complex cetacean sounds, nor can they easily modify the time-frequency structure of the synthesized bionic signals. In our previous work, we proposed a synthesis and modification method for cetacean tonal sounds, but it requires substantial manual effort to construct each bionic signal segment to match the tonal sound contour. To solve these problems, an automated piecewise synthesis method is proposed. First, based on the time-frequency spectrogram of each tonal sound, the fundamental contour and each harmonic contour are automatically recognized and extracted. Then, based on the extracted contours, four sub-power frequency-modulation bionic signal models are combined to match the cetacean sound contours. Finally, combining the envelopes of the fundamental frequency and each harmonic, the synthesized bionic signal is obtained. Experimental results show that the Pearson correlation coefficients (PCCs) between all true cetacean sounds and their corresponding bionic signals are higher than 0.95, demonstrating that the proposed method can automatically imitate all kinds of simple and complex cetacean tonal sounds with high similarity.
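The similarity metric used above, the Pearson correlation coefficient, can be sketched in a few lines (illustrative code, not the authors' implementation):

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length
    waveforms: covariance normalized by the two standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

true_call = [math.sin(0.1 * k) for k in range(200)]
bionic = [2.0 * s + 0.5 for s in true_call]  # rescaled, offset copy
print(pearson_r(true_call, bionic))          # ~1.0: PCC ignores gain and offset
```

Note that PCC is invariant to amplitude scaling and offset, so a score above 0.95 indicates matched time-frequency structure rather than matched absolute level.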
Affiliation(s)
- Zhongbo Sun, State Key Lab of Precision Measuring Technology and Instruments, Tianjin University, Tianjin 300072, China
- Jiajia Jiang, State Key Lab of Precision Measuring Technology and Instruments, Tianjin University, Tianjin 300072, China
- Yao Li, Systems Engineering Research Institute, China State Shipbuilding Corporation (CSSC), Beijing 100036, China
- Chunyue Li, State Key Lab of Precision Measuring Technology and Instruments, Tianjin University, Tianjin 300072, China
- Zhuochen Li, State Key Lab of Precision Measuring Technology and Instruments, Tianjin University, Tianjin 300072, China
- Xiao Fu, State Key Lab of Precision Measuring Technology and Instruments, Tianjin University, Tianjin 300072, China
- Fajie Duan, State Key Lab of Precision Measuring Technology and Instruments, Tianjin University, Tianjin 300072, China

43
Song Z, Zhang C, Fu W, Gao Z, Ou W, Zhang J, Zhang Y. Investigation on whistle directivity in the Indo-Pacific humpback dolphin (Sousa chinensis) through numerical modeling. J Acoust Soc Am 2022; 151:3573. [PMID: 35778211] [DOI: 10.1121/10.0011513]
Abstract
Odontocetes have evolved special acoustic structures in the forehead to modulate echolocation and communication signals into directional beams that facilitate feeding and social behaviors. In the current paper, whistle directivity in the Indo-Pacific humpback dolphin (Sousa chinensis) was addressed by developing numerical models. Directivity was first examined at the fundamental frequency of 5 kHz, and simulations were then extended to the harmonics at 10, 15, 20, 25, and 30 kHz. At 5 kHz, the -3 dB beam widths in the vertical and horizontal planes were 149.3° and 119.4°, corresponding to directivity indexes (DIs) of 4.4 and 5.4 dB, respectively. More importantly, we incorporated the directivity of the fundamental frequency and the harmonics to produce an overall beam, resulting in -3 dB beam widths of 77.2° and 62.9° and DIs of 8.2 and 9.7 dB in the vertical and horizontal planes, respectively. Harmonics can thus enhance the directivity of the fundamental frequency by 3.8 and 4.3 dB in the two planes, respectively. These results suggest that the transmission system can modulate whistles into a directional projection and that harmonics can improve the DI.
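The DI figures above follow from the standard definition DI = 10·log10(on-axis intensity / mean intensity). A minimal single-plane sketch (illustrative only; the paper's DIs come from full numerically simulated fields, and a true 3-D DI integrates intensity over solid angle):

```python
import math

def directivity_index(pattern_db):
    """DI in one plane from a beam pattern sampled uniformly over 360
    degrees (values in dB re the on-axis level). This planar average is
    a simplification of the full solid-angle integral."""
    lin = [10 ** (p / 10.0) for p in pattern_db]  # dB -> linear intensity
    return 10.0 * math.log10(max(lin) / (sum(lin) / len(lin)))

omni = [0.0] * 360                 # uniform radiator: no directivity
beam = [0.0] * 90 + [-30.0] * 270  # energy concentrated in a 90-degree sector
print(directivity_index(omni))     # 0.0 dB
print(directivity_index(beam))     # ~6 dB
```

This illustrates why narrowing the beam (adding harmonics in the paper's overall-beam construction) raises the DI: the peak stays fixed while the angular average drops.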
Affiliation(s)
- Zhongchang Song, State Key Laboratory of Marine Environmental Science, College of the Environment and Ecology, Xiamen University, Xiamen 361005, China
- Chuang Zhang, State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen 361005, China
- Weijie Fu, State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen 361005, China
- Zhanyuan Gao, State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen 361005, China
- Wenzhan Ou, State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen 361005, China
- Jinhu Zhang, State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen 361005, China
- Yu Zhang, State Key Laboratory of Marine Environmental Science, College of Ocean and Earth Sciences, Xiamen University, Xiamen 361005, China

44
Tougaard J, Beedholm K, Madsen PT. Thresholds for noise induced hearing loss in harbor porpoises and phocid seals. J Acoust Soc Am 2022; 151:4252. [PMID: 35778178] [DOI: 10.1121/10.0011560]
Abstract
Intense sound sources, such as pile driving, airguns, and military sonars, have the potential to inflict hearing loss in marine mammals and are, therefore, regulated in many countries. The most recent criteria for noise-induced hearing loss are based on empirical data collected until 2015 and recommend frequency-weighted and species-group-specific thresholds to predict the onset of temporary threshold shift (TTS). Here, evidence made available after 2015 is reviewed in light of the current criteria for two functional hearing groups. For impulsive sounds (from pile driving and air guns), there is strong support for the current threshold for very high frequency cetaceans, including harbor porpoises (Phocoena phocoena). Less strong support also exists for the threshold for phocid seals in water, including harbor seals (Phoca vitulina). For non-impulsive sounds, there is good correspondence between exposure functions and empirical thresholds below 10 kHz for porpoises (applicable to the assessment and regulation of military sonars) and between 3 and 16 kHz for seals. Above 10 kHz for porpoises and outside of the range 3-16 kHz for seals, there are substantial differences (up to 35 dB) between the predicted thresholds for TTS and empirical results. These discrepancies call for further studies.
Affiliation(s)
- Jakob Tougaard, Department of Ecoscience, Marine Mammal Research, Aarhus University, C. F. Møllers Allé 3, Aarhus 8000, Denmark
- Kristian Beedholm, Department of Biology, Zoophysiology, Aarhus University, C. F. Møllers Allé 3, Aarhus 8000, Denmark
- Peter T Madsen, Department of Biology, Zoophysiology, Aarhus University, C. F. Møllers Allé 3, Aarhus 8000, Denmark

45
Stenton CA, Bolger EL, Michenot M, Dodd JA, Wale MA, Briers RA, Hartl MGJ, Diele K. Effects of pile driving sound playbacks and cadmium co-exposure on the early life stage development of the Norway lobster, Nephrops norvegicus. Mar Pollut Bull 2022; 179:113667. [PMID: 35533617] [DOI: 10.1016/j.marpolbul.2022.113667]
Abstract
There is an urgent need to understand how organisms respond to multiple, potentially interacting drivers in today's world. The effects of the pollutants anthropogenic sound (pile driving sound playbacks) and waterborne cadmium were investigated across multiple levels of biology in larval and juvenile Norway lobster, Nephrops norvegicus, under controlled laboratory conditions. Pile driving playbacks (170 dBpk-pk re 1 μPa) and cadmium acted synergistically at concentrations >9.62 μg[Cd] L-1, resulting in increased larval mortality, with sound playbacks otherwise being antagonistic to cadmium toxicity. Exposure to 63.52 μg[Cd] L-1 caused significant delays in larval development, a threshold that dropped to 6.48 μg[Cd] L-1 in the presence of piling playbacks. Pre-exposure to the combination of piling playbacks and 6.48 μg[Cd] L-1 led to significant differences in the swimming behaviour of the first juvenile stage. Biomarker analysis suggested oxidative stress as the mechanism underlying the resultant deleterious effects, with cellular metallothionein (MT) being the predominant protective mechanism.
Affiliation(s)
- C A Stenton, Aquatic Noise Research Group, School of Applied Sciences, Edinburgh Napier University, 9 Sighthill Court, Edinburgh EH11 4BN, UK; Centre for Conservation and Restoration Science, Edinburgh Napier University, 9 Sighthill Court, Edinburgh EH11 4BN, UK; St Abbs Marine Station, The Harbour, St Abbs, Eyemouth TD14 5PW, UK; Ocean Science Consulting Ltd., Spott Road, Dunbar EH42 1RR, UK
- E L Bolger, Aquatic Noise Research Group, School of Applied Sciences, Edinburgh Napier University, 9 Sighthill Court, Edinburgh EH11 4BN, UK; Centre for Conservation and Restoration Science, Edinburgh Napier University, 9 Sighthill Court, Edinburgh EH11 4BN, UK; St Abbs Marine Station, The Harbour, St Abbs, Eyemouth TD14 5PW, UK
- M Michenot, École Nationale des Travaux Publics de L'état, 3 Rue Maurice Audin, 69120 Vaulx-en-Velin, France
- J A Dodd, Aquatic Noise Research Group, School of Applied Sciences, Edinburgh Napier University, 9 Sighthill Court, Edinburgh EH11 4BN, UK; Centre for Conservation and Restoration Science, Edinburgh Napier University, 9 Sighthill Court, Edinburgh EH11 4BN, UK
- M A Wale, Aquatic Noise Research Group, School of Applied Sciences, Edinburgh Napier University, 9 Sighthill Court, Edinburgh EH11 4BN, UK; Centre for Conservation and Restoration Science, Edinburgh Napier University, 9 Sighthill Court, Edinburgh EH11 4BN, UK; St Abbs Marine Station, The Harbour, St Abbs, Eyemouth TD14 5PW, UK
- R A Briers, Aquatic Noise Research Group, School of Applied Sciences, Edinburgh Napier University, 9 Sighthill Court, Edinburgh EH11 4BN, UK; Centre for Conservation and Restoration Science, Edinburgh Napier University, 9 Sighthill Court, Edinburgh EH11 4BN, UK
- M G J Hartl, Centre for Marine Biodiversity & Biotechnology, Institute of Life and Earth Sciences, School of Energy, Geoscience, Infrastructure & Society, Heriot-Watt University, Edinburgh EH14 4AS, UK
- K Diele, Aquatic Noise Research Group, School of Applied Sciences, Edinburgh Napier University, 9 Sighthill Court, Edinburgh EH11 4BN, UK; Centre for Conservation and Restoration Science, Edinburgh Napier University, 9 Sighthill Court, Edinburgh EH11 4BN, UK; St Abbs Marine Station, The Harbour, St Abbs, Eyemouth TD14 5PW, UK

46
Abstract
While studies have demonstrated concept formation in animals, only humans are known to label concepts and to use them in mental simulations or predictions. To investigate whether other animals use labels comparably, we studied cross-modal, individual recognition in bottlenose dolphins (Tursiops truncatus), which use signature whistles as labels for conspecifics in their own communication. First, we tested whether dolphins could use gustatory stimuli and found that they could distinguish between water and urine samples, as well as between urine from familiar and unfamiliar individuals. Then, we paired playbacks of signature whistles of known animals with urine samples from either the same dolphin or a different, familiar animal. Dolphins investigated the presentation area longer when the acoustic and gustatory samples matched than when they mismatched. This demonstrates that dolphins recognize other individuals by gustation alone and can integrate information from acoustic and taste inputs, indicating a modality-independent, labeled concept for known conspecifics.
47
Li L, Qiao G, Qing X, Zhang H, Liu X, Liu S. Robust unsupervised Tursiops aduncus whistle-event detection using gammatone multi-channel Savitzky-Golay based whistle enhancement. J Acoust Soc Am 2022; 151:3509. [PMID: 35649921] [DOI: 10.1121/10.0011402]
Abstract
Detecting whistle events is essential when studying the population density and behavior of cetaceans. After eight months of passive acoustic monitoring in Xiamen, we obtained long calls from two Tursiops aduncus individuals. In this paper, we propose an algorithm based on an unbiased gammatone multi-channel Savitzky-Golay filter for smoothing dynamic continuous background noise and interference from long click trains. The algorithm uses the method of least squares to perform a local polynomial regression on the time-frequency representation of multi-frequency-resolution call measurements, which can effectively retain the whistle profiles while filtering out noise and interference. We prove that it is better at separating out whistles and has lower computational complexity than other smoothing methods. To further extract whistle features from the enhanced spectrograms, we also propose a set of multi-scale and multi-directional moving filter banks for various whistle durations and contour shapes. The final binary adaptive frame-level decisions for whistle events are obtained from the histograms of the multi-scale and multi-directional spectrograms. Finally, we explore the entire data set and find that the proposed scheme achieves higher frame-level F1-scores when detecting T. aduncus whistles than the baseline schemes, with an improvement of more than 6%.
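At its core, Savitzky-Golay smoothing replaces each sample with the centre value of a least-squares polynomial fitted over a sliding window, which is what lets it flatten noise and clicks while preserving smooth whistle contours. A minimal 1-D sketch with the classic 5-point quadratic kernel (illustrative only; the paper's filter is multi-channel and gammatone-warped):

```python
def savgol5(signal):
    """5-point quadratic Savitzky-Golay smoothing. Each interior output
    sample is the centre of a least-squares parabola over a 5-sample
    window; the closed-form kernel is (-3, 12, 17, 12, -3)/35.
    Edge samples are returned unsmoothed."""
    kernel = (-3.0, 12.0, 17.0, 12.0, -3.0)
    out = list(signal)
    for i in range(2, len(signal) - 2):
        out[i] = sum(k * signal[i + j - 2] for j, k in enumerate(kernel)) / 35.0
    return out

# A smooth (quadratic) contour passes through unchanged, since the
# parabola fit is exact for polynomials up to degree 3...
contour = [0.5 * t * t for t in range(10)]
# ...while an isolated click-like spike is strongly attenuated.
clicky = [0.0] * 5 + [35.0] + [0.0] * 5
print(savgol5(clicky)[5])  # 17.0 (spike reduced from 35.0)
```

This exactness-for-smooth-curves property is why the method retains whistle profiles better than a plain moving average, which would flatten the contour as well as the noise.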
Affiliation(s)
- Lei Li, Acoustic Science and Technology Laboratory, Harbin Engineering University, Harbin 150001, China
- Gang Qiao, Acoustic Science and Technology Laboratory, Harbin Engineering University, Harbin 150001, China
- Xin Qing, Acoustic Science and Technology Laboratory, Harbin Engineering University, Harbin 150001, China
- Huaying Zhang, Acoustic Science and Technology Laboratory, Harbin Engineering University, Harbin 150001, China
- Xinyu Liu, Acoustic Science and Technology Laboratory, Harbin Engineering University, Harbin 150001, China
- Songzuo Liu, Acoustic Science and Technology Laboratory, Harbin Engineering University, Harbin 150001, China

48
Song W, Gao D, Li X, Kang D, Li Y. Waveguide invariant in a gradual range- and azimuth-varying waveguide. JASA Express Lett 2022; 2:056002. [PMID: 36154066] [DOI: 10.1121/10.0010489]
Abstract
Experimental data indicate that in a sloped area, the value of β abruptly changes before and after a given source arrives at the closest point of approach to the hydrophone, which has not been previously reported. The adiabatic approximation is employed to explain the above abrupt change in β, and it is found that the azimuthal variance in the sound path is the reason for this phenomenon. Simulations are performed to confirm the model and experimental data, and perfect agreement is achieved. This work suggests that β should be carefully set in related applications in a sloped area.
Affiliation(s)
- Wenhua Song, College of Physics and Optoelectronic Engineering, Ocean University of China, Qingdao 266100, China
- Dazhi Gao, College of Marine Technology, Ocean University of China, Qingdao 266100, China
- Xiaolei Li, College of Marine Technology, Ocean University of China, Qingdao 266100, China
- Dexiang Kang, College of Marine Technology, Ocean University of China, Qingdao 266100, China
- Yuzheng Li, College of Marine Technology, Ocean University of China, Qingdao 266100, China

49
Kim D, Kim JS, Song J. Cancellation of dolphin sonar clicks in a communication signal based on adaptive time reversal processing. JASA Express Lett 2022; 2:056001. [PMID: 36154075] [DOI: 10.1121/10.0010375]
Abstract
In long-range underwater communication, conventional time reversal processing (CTRP) is used to mitigate the distortion caused by multipath and temporal spreading. However, signals produced by marine animals can contaminate communication signals. Impulsive signals, such as dolphin sonar clicks, have wide bandwidths and short pulse durations, making it difficult to isolate the communication signal. This letter proposes a method to cancel these sounds by estimating the Green's function of marine animal clicks and applying adaptive time reversal processing (ATRP). The effectiveness of the click nulling was verified by comparing the performances of CTRP and ATRP with seagoing experimental data.
Affiliation(s)
- Donghyeon Kim, Department of Convergence Study on the Ocean Science and Technology, Korea Maritime and Ocean University, Busan 49112, Korea
- J S Kim, Department of Ocean Engineering, Korea Maritime and Ocean University, Busan 49112, Korea
- Jiyoung Song, Department of Ocean Engineering, Korea Maritime and Ocean University, Busan 49112, Korea

50
Bowman DC, Rouse JW, Krishnamoorthy S, Silber EA. Infrasound direction of arrival determination using a balloon-borne aeroseismometer. JASA Express Lett 2022; 2:054001. [PMID: 36154067] [DOI: 10.1121/10.0010378]
Abstract
Free-floating balloons are an emerging platform for infrasound recording, but they cannot host arrays sufficiently wide for multi-sensor acoustic direction finding techniques. Because infrasound waves are longitudinal, the balloon motion in response to acoustic loading can be used to determine the signal azimuth. This technique, called "aeroseismometry," permits sparse balloon-borne networks to geolocate acoustic sources. This is demonstrated by using an aeroseismometer on a stratospheric balloon to measure the direction of arrival of acoustic waves from successive ground chemical explosions. A geolocation algorithm adapted from hydroacoustics is then used to calculate the location of the explosions.
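The geolocation step, intersecting azimuth bearings measured from widely separated platforms, can be sketched as a small least-squares problem (illustrative code under flat-earth Cartesian assumptions; the paper adapts a hydroacoustic algorithm, and all names and geometry here are ours):

```python
import math

def geolocate(sensors, azimuths_deg):
    """Least-squares intersection of bearing lines. Each bearing from
    sensor s with unit direction d constrains the source p to the line
    n . p = n . s, where n is normal to d; stacking the constraints
    gives the 2x2 normal equations A p = b, solved in closed form."""
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (sx, sy), az in zip(sensors, azimuths_deg):
        th = math.radians(az)                 # azimuth: clockwise from north (+y)
        nx, ny = -math.cos(th), math.sin(th)  # unit normal to the bearing
        c = nx * sx + ny * sy
        a11 += nx * nx; a12 += nx * ny; a22 += ny * ny
        b1 += nx * c; b2 += ny * c
    det = a11 * a22 - a12 * a12
    return ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)

# Two balloons taking bearings on one ground explosion at (3, 4) km.
balloons = [(0.0, 0.0), (10.0, 0.0)]
bearings = [math.degrees(math.atan2(3.0 - sx, 4.0 - sy)) for sx, sy in balloons]
print(geolocate(balloons, bearings))  # ~(3.0, 4.0)
```

With only two bearings this reduces to a plain line intersection; the least-squares form shows how additional balloons would over-determine and refine the fix.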
Affiliation(s)
- Daniel C Bowman, Geophysical Detection Systems, Sandia National Laboratories, Albuquerque, New Mexico 87123, USA
- Jerry W Rouse, Analytical Structural Dynamics, Sandia National Laboratories, Albuquerque, New Mexico 87123, USA
- Elizabeth A Silber, Geophysics, Sandia National Laboratories, Albuquerque, New Mexico 87123, USA