1
Kershenbaum A, Akçay Ç, Babu-Saheer L, Barnhill A, Best P, Cauzinille J, Clink D, Dassow A, Dufourq E, Growcott J, Markham A, Marti-Domken B, Marxer R, Muir J, Reynolds S, Root-Gutteridge H, Sadhukhan S, Schindler L, Smith BR, Stowell D, Wascher CAF, Dunn JC. Automatic detection for bioacoustic research: a practical guide from and for biologists and computer scientists. Biol Rev Camb Philos Soc 2024. PMID: 39417330. DOI: 10.1111/brv.13155.
Abstract
Recent years have seen a dramatic rise in the use of passive acoustic monitoring (PAM) for biological and ecological applications, and a corresponding increase in the volume of data generated. However, data sets are now often so large that analysing them manually is burdensome and unrealistic. Fortunately, we have also seen a corresponding rise in computing power and the capability of machine learning algorithms, which offer the possibility of performing some of the analysis required for PAM automatically. Nonetheless, the field of automatic detection of acoustic events is still in its infancy in biology and ecology. In this review, we examine the trends in bioacoustic PAM applications, and their implications for the burgeoning amount of data that needs to be analysed. We explore the different methods of machine learning and other tools for scanning, analysing, and extracting acoustic events automatically from large volumes of recordings. We then provide a step-by-step practical guide for using automatic detection in bioacoustics. One of the biggest challenges for the greater use of automatic detection in bioacoustics is that there is often a gulf in expertise between the biological sciences and the field of machine learning and computer science. Therefore, this review first presents an overview of the requirements for automatic detection in bioacoustics, intended to familiarise those from a computer science background with the needs of the bioacoustics community, followed by an introduction to the key elements of machine learning and artificial intelligence that a biologist needs to understand to incorporate automatic detection into their research. We then provide a practical guide to building an automatic detection pipeline for bioacoustic data, and conclude with a discussion of possible future directions in this field.
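The pipeline this review describes begins with scanning recordings for candidate acoustic events. As a deliberately minimal, hypothetical illustration of that first step (a frame-energy threshold detector on a synthetic signal; this is a toy baseline, not a method taken from the review, which covers far more capable machine-learning detectors):

```python
import numpy as np

def energy_detector(signal, frame_len=512, threshold_db=-20.0):
    """Flag frames whose RMS energy exceeds a threshold relative to the loudest frame.

    A toy baseline detector: practical bioacoustic pipelines use
    spectrogram features or trained models instead.
    """
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    rms_db = 20.0 * np.log10(rms / (rms.max() + 1e-12) + 1e-12)
    return rms_db > threshold_db  # boolean mask: True for "event" frames

# Synthetic recording: low-level noise with a 1 kHz tone burst in the middle.
rng = np.random.default_rng(0)
sig = 0.01 * rng.standard_normal(8192)
t = np.arange(2048) / 16000.0
sig[3000:5048] += np.sin(2 * np.pi * 1000.0 * t)
mask = energy_detector(sig)  # the five frames overlapping the burst are flagged
```

On real recordings the frame length and threshold would need tuning to the target species and the site's noise floor, which is precisely the hand-tuning that learned detectors aim to replace.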
Affiliation(s)
- Arik Kershenbaum
- Girton College and Department of Zoology, University of Cambridge, Huntingdon Road, Cambridge, CB3 0JG, UK
- Çağlar Akçay
- Behavioural Ecology Research Group, School of Life Sciences, Anglia Ruskin University, East Road, Cambridge, CB1 1PT, UK
- Lakshmi Babu-Saheer
- Computing Informatics and Applications Research Group, School of Computing and Information Sciences, Anglia Ruskin University, East Road, Cambridge, CB1 1PT, UK
- Alex Barnhill
- Pattern Recognition Lab, Department of Computer Science, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, 91058, Germany
- Paul Best
- Université de Toulon, Aix Marseille Univ, CNRS, LIS, ILCB, CS 60584, Toulon, 83041 CEDEX 9, France
- Jules Cauzinille
- Université de Toulon, Aix Marseille Univ, CNRS, LIS, ILCB, CS 60584, Toulon, 83041 CEDEX 9, France
- Dena Clink
- K. Lisa Yang Center for Conservation Bioacoustics, Cornell Lab of Ornithology, Cornell University, 159 Sapsucker Woods Road, Ithaca, New York, 14850, USA
- Angela Dassow
- Biology Department, Carthage College, 2001 Alford Park Dr, 68 David A Straz Jr, Kenosha, Wisconsin, 53140, USA
- Emmanuel Dufourq
- African Institute for Mathematical Sciences, 7 Melrose Road, Muizenberg, Cape Town, 7441, South Africa
- Stellenbosch University, Jan Celliers Road, Stellenbosch, 7600, South Africa
- African Institute for Mathematical Sciences - Research and Innovation Centre, District Gasabo, Secteur Kacyiru, Cellule Kamatamu, Rue KG590 ST No 1, Kigali, Rwanda
- Jonathan Growcott
- Centre of Ecology and Conservation, College of Life and Environmental Sciences, University of Exeter, Cornwall Campus, Exeter, TR10 9FE, UK
- Wildlife Conservation Research Unit, Recanati-Kaplan Centre, Tubney House, Abingdon Road Tubney, Abingdon, OX13 5QL, UK
- Andrew Markham
- Department of Computer Science, University of Oxford, Parks Road, Oxford, OX1 3QD, UK
- Ricard Marxer
- Université de Toulon, Aix Marseille Univ, CNRS, LIS, ILCB, CS 60584, Toulon, 83041 CEDEX 9, France
- Jen Muir
- Behavioural Ecology Research Group, School of Life Sciences, Anglia Ruskin University, East Road, Cambridge, CB1 1PT, UK
- Sam Reynolds
- Behavioural Ecology Research Group, School of Life Sciences, Anglia Ruskin University, East Road, Cambridge, CB1 1PT, UK
- Holly Root-Gutteridge
- School of Natural Sciences, University of Lincoln, Joseph Banks Laboratories, Beevor Street, Lincoln, Lincolnshire, LN5 7TS, UK
- Sougata Sadhukhan
- Institute of Environment Education and Research, Pune Bharati Vidyapeeth Educational Campus, Satara Road, Pune, Maharashtra, 411 043, India
- Loretta Schindler
- Department of Zoology, Faculty of Science, Charles University, Prague, 128 44, Czech Republic
- Bethany R Smith
- Institute of Zoology, Zoological Society of London, Outer Circle, London, NW1 4RY, UK
- Dan Stowell
- Tilburg University, Tilburg, The Netherlands
- Naturalis Biodiversity Center, Darwinweg 2, Leiden, 2333 CR, The Netherlands
- Claudia A F Wascher
- Behavioural Ecology Research Group, School of Life Sciences, Anglia Ruskin University, East Road, Cambridge, CB1 1PT, UK
- Jacob C Dunn
- Behavioural Ecology Research Group, School of Life Sciences, Anglia Ruskin University, East Road, Cambridge, CB1 1PT, UK
- Department of Archaeology, University of Cambridge, Downing Street, Cambridge, CB2 3DZ, UK
- Department of Behavioral and Cognitive Biology, University of Vienna, University Biology Building (UBB), Djerassiplatz 1, Vienna, 1030, Austria
2
Knight E, Rhinehart T, de Zwaan DR, Weldy MJ, Cartwright M, Hawley SH, Larkin JL, Lesmeister D, Bayne E, Kitzes J. Individual identification in acoustic recordings. Trends Ecol Evol 2024; 39:947-960. PMID: 38862357. DOI: 10.1016/j.tree.2024.05.007.
Abstract
Recent advances in bioacoustics combined with acoustic individual identification (AIID) could open new frontiers for ecological and evolutionary research, because traditional methods of identifying individuals are invasive, expensive, labor-intensive, and potentially biased. Despite overwhelming evidence that most taxa have individual acoustic signatures, the application of AIID remains challenging and uncommon. Furthermore, the methods most commonly used for AIID are not compatible with many potential AIID applications. Deep learning in adjacent disciplines suggests opportunities to advance AIID, but such progress is limited by the availability of training data. We suggest that broadscale implementation of AIID is achievable, but researchers should prioritize methods that maximize the potential applications of AIID, and should develop case studies with easy taxa at smaller spatiotemporal scales before progressing to more difficult scenarios.
Affiliation(s)
- Elly Knight
- Department of Biological Sciences, Alberta Biodiversity Monitoring Institute, University of Alberta, Edmonton, Alberta, T6G 2E6, Canada
- Tessa Rhinehart
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, 15260, USA
- Devin R de Zwaan
- Department of Biology, Mount Allison University, Sackville, NB, E4L 1E4, Canada; Acadia University, Wolfville, NS, B4P 2R6, Canada
- Matthew J Weldy
- Department of Forest Ecosystems and Society, Oregon State University, Corvallis, OR, 97331-5704, USA
- Mark Cartwright
- Department of Informatics, New Jersey Institute of Technology, Newark, NJ, 07102, USA
- Scott H Hawley
- Chemistry and Physics Department, Belmont University, Nashville, TN, 37212, USA
- Jeffery L Larkin
- Department of Biology, Indiana University of Pennsylvania, Indiana, PA, 15705-1081, USA; American Bird Conservancy, The Plains, VA, 20198, USA
- Damon Lesmeister
- USDA Forest Service, Pacific Northwest Research Station, Corvallis Forestry Science Laboratory, Oregon State University, Corvallis, OR, 97330, USA
- Erin Bayne
- Department of Biological Sciences, Alberta Biodiversity Monitoring Institute, University of Alberta, Edmonton, Alberta, T6G 2E6, Canada
- Justin Kitzes
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, 15260, USA
3
Sasek J, Allison B, Contina A, Knobles D, Wilson P, Keitt T. Semiautomated generation of species-specific training data from large, unlabeled acoustic datasets for deep supervised birdsong isolation. PeerJ 2024; 12:e17854. PMID: 39329137. PMCID: PMC11426315. DOI: 10.7717/peerj.17854. Open access.
Abstract
Background: Bioacoustic monitoring is an effective and minimally invasive method to study wildlife ecology. However, even state-of-the-art techniques for analyzing birdsongs decrease in accuracy in the presence of extraneous signals such as anthropogenic noise and vocalizations of non-target species. Deep supervised source separation (DSSS) algorithms have been shown to effectively separate mixtures of animal vocalizations. In practice, however, recording sites also have site-specific variations and unique background audio that need to be removed, creating a need for site-specific data.
Methods: Here, we test the potential of training DSSS models on site-specific bird vocalizations and background audio. We used a semiautomated workflow combining deep supervised classification and statistical cleaning to label and generate a site-specific source separation dataset by mixing birdsongs and background audio segments. We then trained a DSSS model on this generated dataset. Because most data are passively recorded and consequently noisy, the true isolated birdsongs are unavailable, which makes evaluation challenging. Therefore, in addition to traditional source separation (SS) metrics, we also show the effectiveness of our site-specific approach using metrics common in ornithological analyses, such as automated feature labeling and species-specific trilateration accuracy.
Results: Our approach of training on site-specific data boosts the source-to-distortion, source-to-interference, and source-to-artifact ratios (SDR, SIR, and SAR) by 9.33 dB, 24.07 dB, and 3.60 dB, respectively. We also find that our approach allows for automated feature labeling with single-digit mean absolute percent error and birdsong trilateration with a mean simulated trilateration error of 2.58 m.
Conclusion: Overall, we show that site-specific DSSS is a promising upstream solution for wildlife audio analysis tools that break down in the presence of background noise. By training on site-specific data, our method is robust to the unique, site-specific interference that caused previous methods to fail.
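The SDR gains reported above come from standard source-separation metrics. As a rough sketch of how a metric of this family can be computed (this is the scale-invariant SDR variant, an assumption on our part; the paper's exact metric implementation is not specified here):

```python
import numpy as np

def si_sdr(estimate, reference):
    """Scale-invariant signal-to-distortion ratio (SI-SDR) in dB.

    Projects the estimate onto the reference to isolate the target
    component, then compares target energy to residual energy.
    """
    reference = reference - reference.mean()
    estimate = estimate - estimate.mean()
    alpha = np.dot(estimate, reference) / np.dot(reference, reference)
    target = alpha * reference    # best scaled copy of the reference
    residual = estimate - target  # everything the estimate got wrong
    return 10.0 * np.log10(np.sum(target ** 2) / np.sum(residual ** 2))

# A clean 5 Hz tone versus the same tone with additive noise.
rng = np.random.default_rng(1)
clean = np.sin(2 * np.pi * 5 * np.linspace(0.0, 1.0, 4000))
noisy = clean + 0.1 * rng.standard_normal(4000)
sdr_db = si_sdr(noisy, clean)  # roughly 17 dB at this noise level
```

A higher SI-SDR means the estimate is closer to a scaled copy of the true source, which is why improvements are reported as dB gains.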
Affiliation(s)
- Justin Sasek
- Department of Computer Science, The University of Texas at Austin, Austin, TX, United States of America
- Brendan Allison
- Department of Integrative Biology, The University of Texas at Austin, Austin, TX, United States of America
- Andrea Contina
- School of Integrative Biological and Chemical Sciences, The University of Texas Rio Grande Valley, Brownsville, TX, United States of America
- David Knobles
- Knobles Scientific and Analysis, LLC, Austin, TX, United States of America
- Preston Wilson
- Walker Department of Mechanical Engineering, The University of Texas at Austin, Austin, TX, United States of America
- Timothy Keitt
- Department of Integrative Biology, The University of Texas at Austin, Austin, TX, United States of America
4
Cañas JS, Toro-Gómez MP, Sugai LSM, Benítez Restrepo HD, Rudas J, Posso Bautista B, Toledo LF, Dena S, Domingos AHR, de Souza FL, Neckel-Oliveira S, da Rosa A, Carvalho-Rocha V, Bernardy JV, Sugai JLMM, Dos Santos CE, Bastos RP, Llusia D, Ulloa JS. A dataset for benchmarking Neotropical anuran calls identification in passive acoustic monitoring. Sci Data 2023; 10:771. PMID: 37932332. PMCID: PMC10628131. DOI: 10.1038/s41597-023-02666-2. Open access.
Abstract
Global change is predicted to induce shifts in anuran acoustic behavior, which can be studied through passive acoustic monitoring (PAM). Understanding changes in calling behavior requires automatic identification of anuran species, which is challenging due to the particular characteristics of neotropical soundscapes. In this paper, we introduce a large-scale multi-species dataset of anuran amphibian calls recorded by PAM, comprising 27 hours of expert annotations for 42 different species from two Brazilian biomes. We provide open access to the dataset, including the raw recordings, experimental setup code, and a benchmark with a baseline model for the fine-grained categorization problem. Additionally, we highlight the challenges of the dataset to encourage machine learning researchers to address the problem of anuran call identification in support of conservation policy. All our experiments and resources are available at https://soundclim.github.io/anuraweb/.
Grants
- Group on Earth Observations (GEO) and Microsoft, under the GEO-Microsoft Planetary Computer Programme (October 2021)
- São Paulo Research Foundation (FAPESP #2016/25358-3; #2019/18335-5)
- National Council for Scientific and Technological Development (CNPq #302834/2020-6; #312338/2021-0, #307599/2021-3)
- CNPQ/MCTI/CONFAP-FAPS/PELD No 21/2020 (FAPESC 2021TR386)
- Comunidad de Madrid (2020-T1/AMB-20636, Atracción de Talento Investigador, Spain) and research projects funded by the European Commission (EAVESTROP–661408, Global Marie S. Curie fellowship, program H2020, EU); and the Ministerio de Economía, Industria y Competitividad (CGL2017-88764-R, MINECO/AEI/FEDER, Spain).
Affiliation(s)
- Juan Sebastián Cañas
- Instituto de Investigación de Recursos Biológicos Alexander von Humboldt, Avenida Paseo Bolívar 16-20, Bogotá, Colombia
- María Paula Toro-Gómez
- Instituto de Investigación de Recursos Biológicos Alexander von Humboldt, Avenida Paseo Bolívar 16-20, Bogotá, Colombia
- Larissa Sayuri Moreira Sugai
- K Lisa Yang Center for Conservation Bioacoustics, Cornell Lab of Ornithology, Cornell University, 159 Sapsucker Woods Road, 14850, Ithaca, New York, USA
- Jorge Rudas
- Instituto de Investigación de Recursos Biológicos Alexander von Humboldt, Avenida Paseo Bolívar 16-20, Bogotá, Colombia
- Breyner Posso Bautista
- Instituto de Investigación de Recursos Biológicos Alexander von Humboldt, Avenida Paseo Bolívar 16-20, Bogotá, Colombia
- Luís Felipe Toledo
- Laboratório de História Natural de Anfíbios Brasileiros (LaHNAB), Universidade Estadual de Campinas, Campinas, SP, Brazil
- Simone Dena
- Museu de Diversidade Biológica (MDBio), Universidade Estadual de Campinas, Campinas, SP, Brazil
- Franco Leandro de Souza
- Universidade Federal de Mato Grosso do Sul, Instituto de Biociências, Campo Grande, MS, Brazil
- Selvino Neckel-Oliveira
- Departamento de Ecologia e Zoologia, Universidade Federal de Santa Catarina, Florianopolis, SC, Brazil
- Anderson da Rosa
- Departamento de Ecologia e Zoologia, Universidade Federal de Santa Catarina, Florianopolis, SC, Brazil
- Vítor Carvalho-Rocha
- Departamento de Ecologia e Zoologia, Universidade Federal de Santa Catarina, Florianopolis, SC, Brazil
- Diego Llusia
- Terrestrial Ecology Group, Departamento de Ecología, Universidad Autónoma de Madrid, C/ Darwin, 2, Ciudad Universitaria de Cantoblanco, Facultad de Ciencias, Edificio de Biología, 28049, Madrid, Spain
- Centro de Investigación en Biodiversidad y Cambio Global (CIBC), Universidad Autónoma de Madrid, C/ Darwin 2, 28049, Madrid, Spain
- Laboratório de Herpetologia e Comportamento Animal, Departamento de Ecologia, Instituto de Ciências Biológicas, Universidade Federal de Goiás, Goiás, Brazil
- Juan Sebastián Ulloa
- Instituto de Investigación de Recursos Biológicos Alexander von Humboldt, Avenida Paseo Bolívar 16-20, Bogotá, Colombia
5
Brickson L, Zhang L, Vollrath F, Douglas-Hamilton I, Titus AJ. Elephants and algorithms: a review of the current and future role of AI in elephant monitoring. J R Soc Interface 2023; 20:20230367. PMID: 37963556. PMCID: PMC10645515. DOI: 10.1098/rsif.2023.0367. Open access.
Abstract
Artificial intelligence (AI) and machine learning (ML) present revolutionary opportunities to enhance our understanding of animal behaviour and conservation strategies. Using elephants, a crucial species in the protected areas of Africa and Asia, as our focal point, we delve into the role of AI and ML in their conservation. Given the increasing amounts of data gathered from a variety of sensors such as cameras, microphones, geophones, drones and satellites, the challenge lies in managing and interpreting this vast quantity of data. New AI and ML techniques offer solutions to streamline this process, helping us extract vital information that might otherwise be overlooked. This paper focuses on the different AI-driven monitoring methods and their potential for improving elephant conservation. Collaborative efforts between AI experts and ecological researchers are essential in leveraging these innovative technologies for enhanced wildlife conservation, setting a precedent for numerous other species.
Affiliation(s)
- Fritz Vollrath
- Save the Elephants, Nairobi, Kenya
- Department of Biology, University of Oxford, Oxford, UK
- Alexander J. Titus
- Colossal Biosciences, Dallas, TX, USA
- Information Sciences Institute, University of Southern California, Los Angeles, USA
6
Giordano AT, Jeng FC, Black TR, Bauer SW, Carriero AE, McDonald K, Lin TH, Wang CY. Effects of Silent Intervals on the Extraction of Human Frequency-Following Responses Using Non-Negative Matrix Factorization. Percept Mot Skills 2023; 130:1834-1851. PMID: 37534595. DOI: 10.1177/00315125231191303.
Abstract
Source-Separation Non-Negative Matrix Factorization (SSNMF) is a mathematical algorithm recently developed to extract scalp-recorded frequency-following responses (FFRs) from noise. Despite its initial success, the effects of silent intervals on algorithm performance remain undetermined. Our purpose in this study was to determine the effects of silent intervals on the extraction of FFRs, which are electrophysiological responses commonly used to evaluate auditory processing and neuroplasticity in the human brain. We used an English vowel /i/ with a rising frequency contour to evoke FFRs in 23 normal-hearing adults. The stimulus had a duration of 150 ms, and the silent interval between the offset of one stimulus and the onset of the next was also 150 ms. We computed FFR Enhancement and Noise Residue to estimate algorithm performance, with silent intervals either included (the WithSI condition) or excluded (the WithoutSI condition) from our analysis. The FFR Enhancements and Noise Residues obtained in the WithoutSI condition were significantly better (p < .05) than those obtained in the WithSI condition. On average, excluding silent intervals produced an 11.78% increase in FFR Enhancement and a 20.69% decrease in Noise Residue. These results not only quantify the effects of silent intervals on the extraction of human FFRs, but also provide recommendations for designing and improving the SSNMF algorithm in future research.
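The SSNMF algorithm studied above is a specialised variant; as a generic illustration of the underlying idea only (non-negative matrix factorization splitting a non-negative data matrix into spectral bases and time activations, using scikit-learn's plain NMF rather than the authors' SSNMF):

```python
import numpy as np
from sklearn.decomposition import NMF

# Toy non-negative "spectrogram": two spectral patterns (e.g. response
# vs. noise) mixed with independent time activations.
rng = np.random.default_rng(0)
W_true = np.abs(rng.standard_normal((64, 2)))   # 64 frequency bins, 2 components
H_true = np.abs(rng.standard_normal((2, 100)))  # activations over 100 frames
V = W_true @ H_true                             # observed magnitude matrix

model = NMF(n_components=2, init="nndsvda", max_iter=500, random_state=0)
W = model.fit_transform(V)   # learned spectral bases, shape (64, 2)
H = model.components_        # learned activations, shape (2, 100)

# The factorization should reconstruct the rank-2 input almost exactly;
# in SSNMF-style use, one component would be kept (the response of
# interest) and the other discarded (the noise).
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

The study's question of whether to feed silent intervals into the decomposition amounts to choosing which columns of the data matrix V enter the factorization.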
Affiliation(s)
- Allison T Giordano
- Communication Sciences and Disorders, Ohio University, Athens, Ohio, USA
- Fuh-Cherng Jeng
- Communication Sciences and Disorders, Ohio University, Athens, Ohio, USA
- Taylor R Black
- Communication Sciences and Disorders, Ohio University, Athens, Ohio, USA
- Sydney W Bauer
- Communication Sciences and Disorders, Ohio University, Athens, Ohio, USA
- Amanda E Carriero
- Communication Sciences and Disorders, Ohio University, Athens, Ohio, USA
- Kalyn McDonald
- Communication Sciences and Disorders, Ohio University, Athens, Ohio, USA
- Tzu-Hao Lin
- Biodiversity Research Center, Academia Sinica, Taipei, Taiwan
- Ching-Yuan Wang
- Department of Otolaryngology-HNS, China Medical University Hospital, Taichung, Taiwan
7
Hassan N, Ramli DA. Sparse Component Analysis (SCA) Based on Adaptive Time-Frequency Thresholding for Underdetermined Blind Source Separation (UBSS). Sensors (Basel) 2023; 23:2060. PMID: 36850657. PMCID: PMC9961070. DOI: 10.3390/s23042060.
Abstract
Blind source separation (BSS) recovers source signals from observations without knowing the mixing process or the source signals. Underdetermined blind source separation (UBSS) occurs when there are fewer mixtures than source signals. Sparse component analysis (SCA) is a general UBSS solution that exploits the sparsity of the source signals and consists of two stages: (1) mixing matrix estimation and (2) source recovery estimation. The first stage of SCA is crucial, as it affects the quality of the recovered sources. Single-source points (SSPs) are detected and clustered during mixing matrix estimation. Adaptive time-frequency thresholding (ATFT) was introduced to increase the accuracy of the mixing matrix estimation; ATFT uses only significant time-frequency (TF) coefficients to detect the SSPs. After the SSPs are identified, hierarchical clustering approximates the mixing matrix. The second stage of SCA estimates the recovered sources using least-squares methods. The mixing matrix and source recovery estimates were evaluated using the error rate and mean squared error (MSE) metrics. Experimental results on four bioacoustic signals demonstrated that the proposed technique outperformed the baseline method, Zhen's method, and three state-of-the-art methods over a wide range of signal-to-noise ratios (SNRs) while consuming less time.
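The two SCA stages described above can be sketched numerically under strong simplifying assumptions (noiseless mixtures and exactly one active source per observation, so every point is a single-source point; the paper's ATFT detection and hierarchical clustering are replaced here by simple direction normalisation and deduplication):

```python
import numpy as np

# Underdetermined setup: 2 mixtures of 3 sparse sources.
rng = np.random.default_rng(0)
A = np.array([[1.0, 0.0, 0.7],
              [0.0, 1.0, 0.7]])
A /= np.linalg.norm(A, axis=0)       # unit-norm mixing columns

n = 300
active = rng.integers(0, 3, size=n)  # one active source per point (ideal SSPs)
S = np.zeros((3, n))
S[active, np.arange(n)] = rng.standard_normal(n)
X = A @ S                            # observed mixtures, shape (2, n)

# Stage 1: mixing matrix estimation.  Each noiseless SSP lies along one
# column of A, so normalising the observation directions and merging
# duplicates recovers the columns, up to permutation (real data needs
# clustering of noisy directions instead).
keep = np.linalg.norm(X, axis=0) > 1e-8
D = X[:, keep] / np.linalg.norm(X[:, keep], axis=0)
idx = np.argmax(np.abs(D), axis=0)   # resolve the +/- sign ambiguity
D *= np.sign(D[idx, np.arange(D.shape[1])])
est_cols = np.unique(np.round(D, 6) + 0.0, axis=1)  # +0.0 folds -0.0 into 0.0

# Stage 2: with one active source per point, least-squares recovery
# reduces to a scalar projection onto the best-matching estimated column.
proj = est_cols.T @ X[:, keep]                 # projections onto each column
match = np.argmax(np.abs(proj), axis=0)        # column assignment per point
s_hat = proj[match, np.arange(proj.shape[1])]  # recovered source values
```

With noise and several simultaneously active sources, both stages become harder, which is exactly where the accuracy of SSP detection (the role of ATFT in the paper) matters.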
Affiliation(s)
- Norsalina Hassan
- Department of Electrical Engineering, Politeknik Seberang Perai, Jalan Permatang Pauh, Bukit Mertajam 13700, Pulau Pinang, Malaysia
- Dzati Athiar Ramli
- School of Electrical & Electronic Engineering, Universiti Sains Malaysia, Nibong Tebal 14300, Pulau Pinang, Malaysia