1. Bousquet CAH, Sueur C, King AJ, O'Bryan LR. Individual and ecological heterogeneity promote complex communication in social vertebrate group decisions. Philos Trans R Soc Lond B Biol Sci 2024; 379:20230204. PMID: 38768211; DOI: 10.1098/rstb.2023.0204.
Abstract
To receive the benefits of social living, individuals must make effective group decisions that enable them to achieve behavioural coordination and maintain cohesion. However, heterogeneity in the physical and social environments surrounding group decision-making contexts can increase the level of difficulty social organisms face in making decisions. Groups that live in variable physical environments (high ecological heterogeneity) can experience barriers to information transfer and increased levels of ecological uncertainty. In addition, in groups with large phenotypic variation (high individual heterogeneity), individuals can have substantial conflicts of interest regarding the timing and nature of activities, making it difficult for them to coordinate their behaviours or reach a consensus. In such cases, active communication can increase individuals' abilities to achieve coordination, such as by facilitating the transfer and aggregation of information about the environment or individual behavioural preferences. Here, we review the role of communication in vertebrate group decision-making and its relationship to heterogeneity in the ecological and social environment surrounding group decision-making contexts. We propose that complex communication has evolved to facilitate decision-making in specific socio-ecological contexts, and we provide a framework for studying this topic and testing related hypotheses as part of future research in this area. This article is part of the theme issue 'The power of sound: unravelling how acoustic communication shapes group dynamics'.
Affiliation(s)
- Christophe A H Bousquet: Department of Psychology, University of Konstanz, Konstanz 78457, Germany; Centre for the Advanced Study of Collective Behaviour, University of Konstanz, Konstanz 78457, Germany
- Cédric Sueur: Institut pluridisciplinaire Hubert Curien, Strasbourg 67000, France; Institut Universitaire de France, Paris 75005, France
- Andrew J King: Biosciences, Faculty of Science and Engineering, Swansea SA2 8PP, UK
- Lisa R O'Bryan: Department of Psychological Sciences, Rice University, Houston, TX 77005, USA
2. Wang B, Torok Z, Duffy A, Bell DG, Wongso S, Velho TAF, Fairhall AL, Lois C. Unsupervised restoration of a complex learned behavior after large-scale neuronal perturbation. Nat Neurosci 2024; 27:1176-1186. PMID: 38684893; DOI: 10.1038/s41593-024-01630-6.
Abstract
Reliable execution of precise behaviors requires that brain circuits are resilient to variations in neuronal dynamics. In adult songbirds with stereotyped songs, genetic perturbation of the majority of excitatory neurons in HVC, a brain region involved in song production, triggered severe degradation of the song. The song fully recovered within 2 weeks, and substantial improvement occurred even when animals were prevented from singing during the recovery period, indicating that offline mechanisms enable recovery in an unsupervised manner. Song restoration was accompanied by increased excitatory synaptic input to neighboring, unmanipulated neurons in the same brain region. A model inspired by the behavioral and electrophysiological findings suggests that unsupervised single-cell and population-level homeostatic plasticity rules can support functional restoration after large-scale disruption of networks that implement sequential dynamics. These observations suggest the existence of cellular and systems-level restorative mechanisms that ensure behavioral resilience.
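A minimal illustrative sketch of the kind of homeostatic rule the abstract refers to (not the authors' model): a toy feedforward chain whose synapses are multiplicatively rescaled toward a target activity after a perturbation, with no supervision signal. All sizes, rates, and the clipping bound are arbitrary assumptions.

```python
import numpy as np

n_units, n_steps, eta = 20, 2000, 0.05
r_target = 1.0                                   # homeostatic set point for each unit's activity

w = np.ones(n_units - 1)                         # feedforward weights of a simple activity chain
w[:5] *= 0.3                                     # "perturbation": weaken the first five synapses

def run_chain(w):
    """Propagate a unit pulse down the chain; r[i] is the activity reaching unit i."""
    r = np.empty(len(w) + 1)
    r[0] = 1.0
    for i, wi in enumerate(w):
        r[i + 1] = wi * r[i]                     # linear propagation, for simplicity
    return r

for _ in range(n_steps):
    r = run_chain(w)
    # Unsupervised synaptic scaling: each unit rescales its input weight so that its
    # own activity drifts back toward the set point; the clip keeps this toy rule stable.
    w *= np.clip(1.0 + eta * (r_target - r[1:]), 0.8, 1.2)

print(np.round(run_chain(w), 3))                 # activities relax back toward the set point
```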
Affiliation(s)
- Bo Wang: Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- Zsofia Torok: Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- Alison Duffy: Department of Physiology and Biophysics, University of Washington, Seattle, WA, USA; Computational Neuroscience Center, University of Washington, Seattle, WA, USA
- David G Bell: Computational Neuroscience Center, University of Washington, Seattle, WA, USA; Department of Physics, University of Washington, Seattle, WA, USA
- Shelyn Wongso: Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- Tarciso A F Velho: Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- Adrienne L Fairhall: Department of Physiology and Biophysics, University of Washington, Seattle, WA, USA; Computational Neuroscience Center, University of Washington, Seattle, WA, USA; Department of Physics, University of Washington, Seattle, WA, USA
- Carlos Lois: Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
3. Liao DA, Brecht KF, Veit L, Nieder A. Crows "count" the number of self-generated vocalizations. Science 2024; 384:874-877. PMID: 38781375; DOI: 10.1126/science.adl0984.
Abstract
Producing a specific number of vocalizations with purpose requires a sophisticated combination of numerical abilities and vocal control. Whether this capacity exists in animals other than humans remains unknown. We show that crows can flexibly produce variable numbers of one to four vocalizations in response to arbitrary cues associated with numerical values. The acoustic features of the first vocalization of a sequence were predictive of the total number of vocalizations, indicating a planning process. Moreover, the acoustic features of vocal units predicted their order in the sequence and could be used to read out counting errors during vocal production.
Affiliation(s)
- Diana A Liao: Animal Physiology, Institute of Neurobiology, University of Tübingen, Auf der Morgenstelle 28, 72076 Tübingen, Germany
- Katharina F Brecht: Animal Physiology, Institute of Neurobiology, University of Tübingen, Auf der Morgenstelle 28, 72076 Tübingen, Germany
- Lena Veit: Neurobiology of Vocal Communication, Institute of Neurobiology, University of Tübingen, Auf der Morgenstelle 28, 72076 Tübingen, Germany
- Andreas Nieder: Animal Physiology, Institute of Neurobiology, University of Tübingen, Auf der Morgenstelle 28, 72076 Tübingen, Germany
4. Erb WM, Ross W, Kazanecki H, Mitra Setia T, Madhusudhana S, Clink DJ. Vocal complexity in the long calls of Bornean orangutans. PeerJ 2024; 12:e17320. PMID: 38766489; PMCID: PMC11100477; DOI: 10.7717/peerj.17320.
Abstract
Vocal complexity is central to many evolutionary hypotheses about animal communication. Yet, quantifying and comparing complexity remains a challenge, particularly when vocal types are highly graded. Male Bornean orangutans (Pongo pygmaeus wurmbii) produce complex and variable "long call" vocalizations comprising multiple sound types that vary within and among individuals. Previous studies described six distinct call (or pulse) types within these complex vocalizations, but none quantified their discreteness or the ability of human observers to reliably classify them. We studied the long calls of 13 individuals to: (1) evaluate and quantify the reliability of audio-visual classification by three well-trained observers, (2) distinguish among call types using supervised classification and unsupervised clustering, and (3) compare the performance of different feature sets. Based on 46 acoustic features, we applied machine learning methods (support vector machines, affinity propagation, and fuzzy c-means) to identify call types and assess their discreteness. We additionally used Uniform Manifold Approximation and Projection (UMAP) to visualize the separation of pulses using both extracted features and spectrogram representations. Supervised approaches showed low inter-observer reliability and poor classification accuracy, indicating that pulse types were not discrete. We propose an updated pulse classification approach that is highly reproducible across observers and exhibits strong classification accuracy using support vector machines. Although the low number of call types suggests long calls are fairly simple, the continuous gradation of sounds seems to greatly boost the complexity of this system. This work responds to calls for more quantitative research to define call types and quantify gradedness in animal vocal systems and highlights the need for a more comprehensive framework for studying vocal complexity vis-à-vis graded repertoires.
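The general pipeline described above (acoustic features, supervised SVM classification, and a UMAP view of gradedness) can be sketched as follows; the data here are random stand-ins, so accuracies will be near chance, and all parameters are assumptions rather than the study's settings.

```python
import numpy as np
import umap  # umap-learn
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 46))            # stand-in for 46 acoustic features per pulse
y = rng.integers(0, 3, size=300)          # stand-in observer labels for 3 pulse types

# Supervised check: how well do observer-defined pulse types separate?
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
acc = cross_val_score(svm, X, y, cv=5).mean()
print(f"cross-validated SVM accuracy: {acc:.2f}")

# Unsupervised view: embed pulses in 2-D to inspect gradation vs. discreteness.
emb = umap.UMAP(n_neighbors=15, min_dist=0.1, random_state=1).fit_transform(
    StandardScaler().fit_transform(X)
)
print("UMAP embedding shape:", emb.shape)
```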
Affiliation(s)
- Wendy M. Erb: K. Lisa Yang Center for Conservation Bioacoustics, Cornell Lab of Ornithology, Cornell University, Ithaca, NY, United States of America; Department of Anthropology, Rutgers, The State University of New Jersey, New Brunswick, United States of America
- Whitney Ross: K. Lisa Yang Center for Conservation Bioacoustics, Cornell Lab of Ornithology, Cornell University, Ithaca, NY, United States of America
- Haley Kazanecki: K. Lisa Yang Center for Conservation Bioacoustics, Cornell Lab of Ornithology, Cornell University, Ithaca, NY, United States of America
- Tatang Mitra Setia: Primate Research Center, Universitas Nasional Jakarta, Jakarta, Indonesia; Department of Biology, Faculty of Biology and Agriculture, Universitas Nasional Jakarta, Jakarta, Indonesia
- Shyam Madhusudhana: K. Lisa Yang Center for Conservation Bioacoustics, Cornell Lab of Ornithology, Cornell University, Ithaca, NY, United States of America; Centre for Marine Science and Technology, Curtin University, Perth, Australia
- Dena J. Clink: K. Lisa Yang Center for Conservation Bioacoustics, Cornell Lab of Ornithology, Cornell University, Ithaca, NY, United States of America
5. Hasani Azhdari SM, Mahmoodzadeh A, Khishe M, Agahi H. Enhanced PRIM recognition using PRI sound and deep learning techniques. PLoS One 2024; 19:e0298373. PMID: 38691542; PMCID: PMC11062556; DOI: 10.1371/journal.pone.0298373.
Abstract
Pulse repetition interval modulation (PRIM) is integral to radar identification in modern electronic support measure (ESM) and electronic intelligence (ELINT) systems. Various distortions, including missing pulses, spurious pulses, unintended jitters, and noise from radar antenna scans, often hinder the accurate recognition of PRIM. This research introduces a novel three-stage approach for PRIM recognition, emphasizing the innovative use of PRI sound. A transfer learning-aided deep convolutional neural network (DCNN) is initially used for feature extraction. This is followed by an extreme learning machine (ELM) for real-time PRIM classification. Finally, a gray wolf optimizer (GWO) refines the network's robustness. To evaluate the proposed method, we developed a real experimental dataset consisting of the sounds of six common PRI patterns. We utilized eight pre-trained DCNN architectures for evaluation, with VGG16 and ResNet50V2 notably achieving recognition accuracies of 97.53% and 96.92%. Integrating ELM and GWO further optimized the accuracy rates to 98.80% and 97.58%, respectively. This research advances radar identification by offering an enhanced method for PRIM recognition, emphasizing the potential of PRI sound to address real-world distortions in ESM and ELINT systems.
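A rough sketch of the transfer-learning stage only, assuming TensorFlow/Keras and scikit-learn are available: a frozen pretrained VGG16 extracts features from spectrogram-like images, and a plain logistic-regression classifier stands in for the paper's ELM and GWO stages; inputs and class labels are random placeholders.

```python
import numpy as np
import tensorflow as tf
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Pretrained VGG16 as a frozen feature extractor (weights download on first use).
base = tf.keras.applications.VGG16(weights="imagenet", include_top=False,
                                   pooling="avg", input_shape=(224, 224, 3))
base.trainable = False

X_img = np.random.rand(60, 224, 224, 3).astype("float32")   # stand-in spectrogram images
y = np.random.randint(0, 6, size=60)                          # six PRI modulation classes

feats = base.predict(tf.keras.applications.vgg16.preprocess_input(X_img * 255.0), verbose=0)
X_tr, X_te, y_tr, y_te = train_test_split(feats, y, test_size=0.3, random_state=0)

# Ordinary logistic regression in place of the ELM/GWO classification stages.
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("held-out accuracy (placeholder data):", clf.score(X_te, y_te))
```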
Affiliation(s)
- Azar Mahmoodzadeh: Department of Electrical Engineering, Shiraz Branch, Islamic Azad University, Shiraz, Iran
- Mohammad Khishe: Department of Electrical Engineering, Imam Khomeini Marine Science University, Nowshahr, Iran
- Hamed Agahi: Department of Electrical Engineering, Shiraz Branch, Islamic Azad University, Shiraz, Iran
6. Santana GM, Dietrich MO. SqueakOut: Autoencoder-based segmentation of mouse ultrasonic vocalizations. bioRxiv [Preprint] 2024:2024.04.19.590368. PMID: 38712291; PMCID: PMC11071348; DOI: 10.1101/2024.04.19.590368.
Abstract
Mice emit ultrasonic vocalizations (USVs) that are important for social communication. Despite great advancements in tools to detect USVs from audio files in recent years, highly accurate segmentation of USVs from spectrograms (i.e., removing noise) remains a significant challenge. Here, we present a new dataset of 12,954 annotated spectrograms explicitly labeled for mouse USV segmentation. Leveraging this dataset, we developed SqueakOut, a lightweight (4.6M parameters) fully convolutional autoencoder that achieves high accuracy in supervised segmentation of USVs from spectrograms, with a Dice score of 90.22. SqueakOut combines a MobileNetV2 backbone with skip connections and transposed convolutions to precisely segment USVs. Using stochastic data augmentation techniques and a hybrid loss function, SqueakOut learns robust segmentation across varying recording conditions. We evaluate SqueakOut's performance, demonstrating substantial improvements over existing methods like VocalMat (63.82 Dice score). The accurate USV segmentations enabled by SqueakOut will facilitate novel methods for vocalization classification and more accurate analysis of mouse communication. To promote further research, we publicly release the annotated dataset of 12,954 spectrograms and the SqueakOut implementation.
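The Dice score used to evaluate segmentation can be computed directly from binary masks; a small sketch with made-up masks (reported here on the 0-100 scale the abstract uses).

```python
import numpy as np

def dice_score(pred: np.ndarray, target: np.ndarray, eps: float = 1e-8) -> float:
    """Dice coefficient between two binary masks: 2|A ∩ B| / (|A| + |B|)."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

rng = np.random.default_rng(0)
target = rng.random((128, 128)) > 0.7          # stand-in ground-truth USV mask
pred = target.copy()
pred[:8] = False                               # pretend the model missed a frequency band
print(f"Dice: {100 * dice_score(pred, target):.2f}")
```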
Affiliation(s)
- Gustavo M Santana: Laboratory of Physiology of Behavior, Interdepartmental Neuroscience Program, Program in Physics, Engineering and Biology, Yale University, USA; Graduate Program in Biochemistry, Federal University of Rio Grande do Sul, Brazil
- Marcelo O Dietrich: Laboratory of Physiology of Behavior, Department of Comparative Medicine, Department of Neuroscience, Yale University, USA
7. Koparkar A, Warren TL, Charlesworth JD, Shin S, Brainard MS, Veit L. Lesions in a songbird vocal circuit increase variability in song syntax. eLife 2024; 13:RP93272. PMID: 38635312; PMCID: PMC11026095; DOI: 10.7554/elife.93272.
Abstract
Complex skills like speech and dance are composed of ordered sequences of simpler elements, but the neuronal basis for the syntactic ordering of actions is poorly understood. Birdsong is a learned vocal behavior composed of syntactically ordered syllables, controlled in part by the songbird premotor nucleus HVC (proper name). Here, we test whether one of HVC's recurrent inputs, mMAN (medial magnocellular nucleus of the anterior nidopallium), contributes to sequencing in adult male Bengalese finches (Lonchura striata domestica). Bengalese finch song includes several patterns: (1) chunks, comprising stereotyped syllable sequences; (2) branch points, where a given syllable can be followed probabilistically by multiple syllables; and (3) repeat phrases, where individual syllables are repeated variable numbers of times. We found that following bilateral lesions of mMAN, acoustic structure of syllables remained largely intact, but sequencing became more variable, as evidenced by 'breaks' in previously stereotyped chunks, increased uncertainty at branch points, and increased variability in repeat numbers. Our results show that mMAN contributes to the variable sequencing of vocal elements in Bengalese finch song and demonstrate the influence of recurrent projections to HVC. Furthermore, they highlight the utility of species with complex syntax in investigating neuronal control of ordered sequences.
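One simple way to quantify sequencing variability at branch points is the entropy of each syllable's outgoing transition distribution; a minimal sketch on an invented toy syllable string (not the study's data or exact metric).

```python
from collections import Counter, defaultdict
import math

song = "abcabdabcabcabdabeabc"          # toy syllable sequence; 'b' is a branch point (c/d/e)

transitions = defaultdict(Counter)
for prev, nxt in zip(song, song[1:]):
    transitions[prev][nxt] += 1

def transition_entropy(counter: Counter) -> float:
    """Shannon entropy (bits) of one syllable's outgoing transition distribution."""
    total = sum(counter.values())
    return -sum((n / total) * math.log2(n / total) for n in counter.values())

for syl, counter in sorted(transitions.items()):
    print(syl, dict(counter), f"entropy = {transition_entropy(counter):.2f} bits")
```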
Affiliation(s)
- Avani Koparkar: Neurobiology of Vocal Communication, Institute for Neurobiology, University of Tübingen, Tübingen, Germany
- Timothy L Warren: Howard Hughes Medical Institute and Center for Integrative Neuroscience, University of California San Francisco, San Francisco, United States; Departments of Horticulture and Integrative Biology, Oregon State University, Corvallis, United States
- Jonathan D Charlesworth: Howard Hughes Medical Institute and Center for Integrative Neuroscience, University of California San Francisco, San Francisco, United States
- Sooyoon Shin: Howard Hughes Medical Institute and Center for Integrative Neuroscience, University of California San Francisco, San Francisco, United States
- Michael S Brainard: Howard Hughes Medical Institute and Center for Integrative Neuroscience, University of California San Francisco, San Francisco, United States
- Lena Veit: Neurobiology of Vocal Communication, Institute for Neurobiology, University of Tübingen, Tübingen, Germany
8. Youngblood M. Language-like efficiency and structure in house finch song. Proc Biol Sci 2024; 291:20240250. PMID: 38565151; PMCID: PMC10987240; DOI: 10.1098/rspb.2024.0250.
Abstract
Communication needs to be complex enough to be functional while minimizing learning and production costs. Recent work suggests that the vocalizations and gestures of some songbirds, cetaceans and great apes may conform to linguistic laws that reflect this trade-off between efficiency and complexity. In studies of non-human communication, though, clustering signals into types cannot be done a priori, and decisions about the appropriate grain of analysis may affect statistical signals in the data. The aim of this study was to assess the evidence for language-like efficiency and structure in house finch (Haemorhous mexicanus) song across three levels of granularity in syllable clustering. The results show strong evidence for Zipf's rank-frequency law, Zipf's law of abbreviation and Menzerath's law. Additional analyses show that house finch songs have small-world structure, thought to reflect systematic structure in syntax, and the mutual information decay of sequences is consistent with a combination of Markovian and hierarchical processes. These statistical patterns are robust across three levels of granularity in syllable clustering, pointing to a limited form of scale invariance. In sum, it appears that house finch song has been shaped by pressure for efficiency, possibly to offset the costs of female preferences for complexity.
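The two Zipfian patterns mentioned can be checked with simple rank-frequency and frequency-duration statistics; a sketch over made-up syllable counts and durations, using Spearman correlation as one reasonable choice (not the paper's exact analysis).

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

# Stand-in data: counts and mean durations (s) for 30 syllable types,
# constructed so that frequent types tend to be short.
counts = np.sort(rng.zipf(a=1.8, size=30))[::-1].astype(float)
durations = 0.25 / np.log2(counts + 2) + rng.normal(0, 0.01, size=30)

# Zipf's rank-frequency law: log frequency should fall roughly linearly with log rank.
ranks = np.arange(1, len(counts) + 1)
slope, intercept = np.polyfit(np.log(ranks), np.log(counts), 1)
print(f"rank-frequency slope: {slope:.2f}")

# Zipf's law of abbreviation: more frequent types should be shorter (negative correlation).
rho, p = spearmanr(counts, durations)
print(f"frequency-duration Spearman rho = {rho:.2f} (p = {p:.3f})")
```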
Affiliation(s)
- Mason Youngblood: Minds and Traditions Research Group, Max Planck Institute for Geoanthropology, Jena, Thüringen, Germany; Institute for Advanced Computational Science, Stony Brook University, Stony Brook, NY, USA
9. Alam D, Zia F, Roberts TF. The hidden fitness of the male zebra finch courtship song. Nature 2024; 628:117-121. PMID: 38509376; DOI: 10.1038/s41586-024-07207-4.
Abstract
Vocal learning in songbirds is thought to have evolved through sexual selection, with female preference driving males to develop large and varied song repertoires [1-3]. However, many songbird species learn only a single song in their lifetime [4]. How sexual selection drives the evolution of single-song repertoires is not known. Here, by applying dimensionality-reduction techniques to the singing behaviour of zebra finches (Taeniopygia guttata), we show that syllable spread in low-dimensional feature space explains how single songs function as honest indicators of fitness. We find that this Gestalt measure of behaviour captures the spectrotemporal distinctiveness of song syllables in zebra finches; that females strongly prefer songs that occupy more latent space; and that matching path lengths in low-dimensional space is difficult for young males. Our findings clarify how simple vocal repertoires may have evolved in songbirds and indicate divergent strategies for how sexual selection can shape vocal learning.
Affiliation(s)
- Danyal Alam: Department of Neuroscience, UT Southwestern Medical Center, Dallas, TX, USA
- Fayha Zia: Department of Neuroscience, UT Southwestern Medical Center, Dallas, TX, USA
- Todd F Roberts: Department of Neuroscience, UT Southwestern Medical Center, Dallas, TX, USA; O'Donnell Brain Institute, UT Southwestern Medical Center, Dallas, TX, USA
10. Mielke A, Badihi G, Graham KE, Grund C, Hashimoto C, Piel AK, Safryghin A, Slocombe KE, Stewart F, Wilke C, Zuberbühler K, Hobaiter C. Many morphs: Parsing gesture signals from the noise. Behav Res Methods 2024. PMID: 38438657; DOI: 10.3758/s13428-024-02368-6.
Abstract
Parsing signals from noise is a general problem for signallers and recipients, and for researchers studying communicative systems. Substantial efforts have been invested in comparing how other species encode information and meaning, and how signalling is structured. However, research depends on identifying and discriminating signals that represent meaningful units of analysis. Early work on defining signal repertoires applied top-down approaches, classifying cases into predefined signal types. Recently, more labour-intensive methods have taken a bottom-up approach, describing detailed features of each signal and clustering cases based on patterns of similarity in multi-dimensional feature-space that were previously undetectable. Nevertheless, it remains essential to assess whether the resulting repertoires are composed of relevant units from the perspective of the species using them, and to redefine repertoires when additional data become available. In this paper we provide a framework that takes data from the largest set of wild chimpanzee (Pan troglodytes) gestures currently available, splits gesture types at a fine scale based on modifying features of gesture expression using latent class analysis (a model-based cluster detection algorithm for categorical variables), and then determines whether this splitting process reduces uncertainty about the goal or community of the gesture. Our method allows different features of interest to be incorporated into the splitting process, providing substantial future flexibility across, for example, species, populations, and levels of signal granularity. In doing so, we provide a powerful tool allowing researchers interested in gestural communication to establish repertoires of relevant units for subsequent analyses within and between systems of communication.
Affiliation(s)
- Alexander Mielke: Wild Minds Lab, School of Psychology and Neuroscience, University of St Andrews, St Andrews, UK; School of Biological and Behavioural Sciences, Queen Mary University of London, London, UK
- Gal Badihi: Wild Minds Lab, School of Psychology and Neuroscience, University of St Andrews, St Andrews, UK
- Kirsty E Graham: Wild Minds Lab, School of Psychology and Neuroscience, University of St Andrews, St Andrews, UK
- Charlotte Grund: Wild Minds Lab, School of Psychology and Neuroscience, University of St Andrews, St Andrews, UK
- Chie Hashimoto: Primate Research Institute, Kyoto University, Kyoto, Japan
- Alex K Piel: Department of Anthropology, University College London, London, UK; Department of Human Origins, Max Planck Institute of Evolutionary Anthropology, Leipzig, Germany
- Alexandra Safryghin: Wild Minds Lab, School of Psychology and Neuroscience, University of St Andrews, St Andrews, UK
- Fiona Stewart: Department of Anthropology, University College London, London, UK; Department of Human Origins, Max Planck Institute of Evolutionary Anthropology, Leipzig, Germany
- Claudia Wilke: Department of Psychology, University of York, York, UK
- Klaus Zuberbühler: Institute of Biology, University of Neuchâtel, Neuchâtel, Switzerland
- Catherine Hobaiter: Wild Minds Lab, School of Psychology and Neuroscience, University of St Andrews, St Andrews, UK
11. Cominelli S, Bellin N, Brown CD, Rossi V, Lawson J. Acoustic features as a tool to visualize and explore marine soundscapes: Applications illustrated using marine mammal passive acoustic monitoring datasets. Ecol Evol 2024; 14:e10951. PMID: 38384822; PMCID: PMC10880131; DOI: 10.1002/ece3.10951.
Abstract
Passive Acoustic Monitoring (PAM) is emerging as a solution for monitoring species and environmental change over large spatial and temporal scales. However, drawing rigorous conclusions based on acoustic recordings is challenging, as there is no consensus over which approaches are best suited for characterizing marine acoustic environments. Here, we describe the application of multiple machine-learning techniques to the analysis of two PAM datasets. We combine pre-trained acoustic classification models (VGGish, NOAA and Google Humpback Whale Detector), dimensionality reduction (UMAP), and balanced random forest algorithms to demonstrate how machine-learned acoustic features capture different aspects of the marine acoustic environment. The UMAP dimensions derived from VGGish acoustic features exhibited good performance in separating marine mammal vocalizations according to species and locations. RF models trained on the acoustic features performed well for labeled sounds in the 8 kHz range; however, low- and high-frequency sounds could not be classified using this approach. The workflow presented here shows how acoustic feature extraction, visualization, and analysis allow a link to be established between ecologically relevant information and PAM recordings at multiple scales, ranging from large-scale changes in the environment (e.g., changes in wind speed) to the identification of marine mammal species.
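A schematic of the embedding → UMAP → random-forest chain described above, with a random array standing in for pretrained-network (VGGish-style) embeddings and a class-weighted random forest standing in for a balanced random forest; labels are random, so the reported accuracy is only a placeholder.

```python
import numpy as np
import umap  # umap-learn
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
emb = rng.normal(size=(500, 128))        # stand-in for 128-D per-clip audio embeddings
labels = rng.integers(0, 4, size=500)    # stand-in species/sound-source labels

# 2-D UMAP projection for visual exploration of the soundscape.
proj = umap.UMAP(n_neighbors=30, min_dist=0.0).fit_transform(emb)
print("projection shape:", proj.shape)

# Class-weighted random forest on the embeddings to test whether labelled sounds separate.
rf = RandomForestClassifier(n_estimators=200, class_weight="balanced", random_state=0)
print("CV accuracy (placeholder data):", cross_val_score(rf, emb, labels, cv=5).mean())
```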
Affiliation(s)
- Simone Cominelli: Northern EDGE Lab, Department of Geography, Memorial University of Newfoundland and Labrador, St. John's, Newfoundland and Labrador, Canada
- Nicolò Bellin: Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parma, Italy
- Carissa D. Brown: Northern EDGE Lab, Department of Geography, Memorial University of Newfoundland and Labrador, St. John's, Newfoundland and Labrador, Canada
- Valeria Rossi: Department of Chemistry, Life Sciences and Environmental Sustainability, University of Parma, Parma, Italy
- Jack Lawson: Marine Mammal Section, Department of Fisheries and Oceans, St. John's, Newfoundland and Labrador, Canada
12. Choi N, Miller P, Hebets EA. Vibroscape analysis reveals acoustic niche overlap and plastic alteration of vibratory courtship signals in ground-dwelling wolf spiders. Commun Biol 2024; 7:23. PMID: 38182735; PMCID: PMC10770364; DOI: 10.1038/s42003-023-05700-6.
Abstract
To expand the scope of soundscape ecology to encompass substrate-borne vibrations (i.e. vibroscapes), we analyzed the vibroscape of a deciduous forest floor using contact microphone arrays followed by automated processing of large audio datasets. We then focused on vibratory signaling of ground-dwelling Schizocosa wolf spiders to test for (i) acoustic niche partitioning and (ii) plastic behavioral responses that might reduce the risk of signal interference from substrate-borne noise and conspecific/heterospecific signaling. Two closely related species - S. stridulans and S. uetzi - showed high acoustic niche overlap across space, time, and dominant frequency. Both species showed plastic behavioral responses: S. uetzi males shortened their courtship at higher abundances of substrate-borne noise, S. stridulans males increased the duration of their vibratory courtship signals at higher abundances of conspecific signals, and S. stridulans males decreased vibratory signal complexity at higher abundances of S. uetzi signals.
Affiliation(s)
- Noori Choi: School of Biological Sciences, University of Nebraska-Lincoln, Lincoln, NE, USA; Max Planck Institute of Animal Behavior, Konstanz, Germany
- Pat Miller: University of Mississippi field station associate, Abbeville, MS, USA
- Eileen A Hebets: School of Biological Sciences, University of Nebraska-Lincoln, Lincoln, NE, USA
13. Martin K, Cornero FM, Clayton NS, Adam O, Obin N, Dufour V. Vocal complexity in a socially complex corvid: gradation, diversity and lack of common call repertoire in male rooks. R Soc Open Sci 2024; 11:231713. PMID: 38204786; PMCID: PMC10776222; DOI: 10.1098/rsos.231713.
Abstract
Vocal communication is widespread in animals, with vocal repertoires of varying complexity. The social complexity hypothesis predicts that species may need high vocal complexity to deal with complex social organization (e.g. maintaining a variety of different interindividual relationships). We quantified the vocal complexity of two geographically distant captive colonies of rooks, a corvid species with complex social organization and cognitive performance, but understudied vocal abilities. We quantified the diversity and gradation of their repertoire, as well as the inter-individual similarity at the vocal unit level. We found that males produced call units with lower diversity and gradation than females, while song units did not differ between sexes. Surprisingly, while females produced highly similar call repertoires, even between colonies, each individual male produced a call repertoire almost completely different from that of any other individual. These findings question the way male rooks communicate with their social partners. We suggest that each male may actively seek to remain vocally distinct, which could be an asset in their frequently changing social environment. We conclude that inter-individual similarity, an understudied aspect of vocal repertoires, should also be considered as a measure of vocal complexity.
Affiliation(s)
- Killian Martin: PRC, UMR 7247, Ethologie Cognitive et Sociale, CNRS-IFCE-INRAE-Université de Tours, Strasbourg, France
- Olivier Adam: Institut Jean Le Rond d'Alembert, UMR 7190, CNRS-Sorbonne Université, 75005 Paris, France; Institut des Neurosciences Paris-Saclay, UMR 9197, CNRS-Université Paris Sud, Orsay, France
- Nicolas Obin: STMS Lab, IRCAM, CNRS-Sorbonne Université, Paris, France
- Valérie Dufour: PRC, UMR 7247, Ethologie Cognitive et Sociale, CNRS-IFCE-INRAE-Université de Tours, Strasbourg, France
14. Torok Z, Luebbert L, Feldman J, Duffy A, Nevue AA, Wongso S, Mello CV, Fairhall A, Pachter L, Gonzalez WG, Lois C. Recovery of a learned behavior despite partial restoration of neuronal dynamics after chronic inactivation of inhibitory neurons. bioRxiv [Preprint] 2023:2023.05.17.541057. PMID: 37292888; PMCID: PMC10245685; DOI: 10.1101/2023.05.17.541057.
Abstract
Maintaining motor skills is crucial for an animal's survival, enabling it to endure diverse perturbations throughout its lifespan, such as trauma, disease, and aging. What mechanisms orchestrate brain circuit reorganization and recovery to preserve the stability of behavior despite the continued presence of a disturbance? To investigate this question, we chronically silenced a fraction of inhibitory neurons in a brain circuit necessary for singing in zebra finches. Song in zebra finches is a complex, learned motor behavior and central to reproduction. This manipulation altered brain activity and severely perturbed song for around two months, after which time it was precisely restored. Electrophysiology recordings revealed abnormal offline dynamics, resulting from chronic inhibition loss, some aspects of which returned to normal as the song recovered. However, even after the song had fully recovered, the levels of neuronal firing in the premotor and motor areas did not return to a control-like state. Single-cell RNA sequencing revealed that chronic silencing of interneurons led to elevated levels of microglia and MHC I, which were also observed in normal juveniles during song learning. These experiments demonstrate that the adult brain can overcome extended periods of abnormal activity, and precisely restore a complex behavior, without recovering normal neuronal dynamics. These findings suggest that the successful functional recovery of a brain circuit after a perturbation can involve more than mere restoration to its initial configuration. Instead, the circuit seems to adapt and reorganize into a new state capable of producing the original behavior despite the persistence of some abnormal neuronal dynamics.
Affiliation(s)
- Zsofia Torok: Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- Laura Luebbert: Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- Jordan Feldman: Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- Shelyn Wongso: Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- Lior Pachter: Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA; Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA, USA
- Walter G. Gonzalez: Department of Physiology, University of San Francisco, San Francisco, CA, USA
- Carlos Lois: Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
15. Uehara K, Yasuhara M, Koguchi J, Oku T, Shiotani S, Morise M, Furuya S. Brain network flexibility as a predictor of skilled musical performance. Cereb Cortex 2023; 33:10492-10503. PMID: 37566918; DOI: 10.1093/cercor/bhad298.
Abstract
Interactions between the body and the environment are dynamically modulated by upcoming sensory information and motor execution. To adapt to these behavioral state shifts, brain activity must also be flexible, drawing on a large repertoire of brain networks that can be switched flexibly. Recently, flexible internal brain communications, i.e. brain network flexibility, have come to be recognized as playing a vital role in integrating diverse sensorimotor information. Therefore, brain network flexibility is one of the key factors that define sensorimotor skill. However, little is known about how flexible communications within the brain characterize the interindividual variation of sensorimotor skill and trial-by-trial variability within individuals. To address this, we recruited skilled musical performers and used a novel approach that combined multichannel scalp electroencephalography, behavioral measurements of musical performance, and mathematical approaches to extract brain network flexibility. We found that brain network flexibility immediately before initiating the musical performance predicted interindividual differences in the precision of tone timbre when required for feedback control, but not for feedforward control. Furthermore, brain network flexibility in broad cortical regions predicted skilled musical performance. Our results provide novel evidence that brain network flexibility plays an important role in building skilled sensorimotor performance.
Affiliation(s)
- Kazumasa Uehara: Neural Information Dynamics Laboratory, Department of Computer Science and Engineering, Toyohashi University of Technology, Toyohashi, Japan; Sony Computer Science Laboratories Inc, Tokyo 1410022, Japan
- Masaki Yasuhara: Sony Computer Science Laboratories Inc, Tokyo 1410022, Japan; Neural Engineering Laboratory, Department of Science of Technology Innovation, Nagaoka University of Technology, Nagaoka, Japan
- Junya Koguchi: Sony Computer Science Laboratories Inc, Tokyo 1410022, Japan; Graduate School of Advanced Mathematical Sciences, Meiji University, Tokyo, Japan
- Masanori Morise: Sony Computer Science Laboratories Inc, Tokyo 1410022, Japan; School of Interdisciplinary Mathematical Sciences, Meiji University, Tokyo, Japan
- Shinichi Furuya: Sony Computer Science Laboratories Inc, Tokyo 1410022, Japan; NeuroPiano Institute, Kyoto 6008086, Japan
16. Fleishman E, Cholewiak D, Gillespie D, Helble T, Klinck H, Nosal EM, Roch MA. Ecological inferences about marine mammals from passive acoustic data. Biol Rev Camb Philos Soc 2023; 98:1633-1647. PMID: 37142263; DOI: 10.1111/brv.12969.
Abstract
Monitoring on the basis of sound recordings, or passive acoustic monitoring, can complement or serve as an alternative to real-time visual or aural monitoring of marine mammals and other animals by human observers. Passive acoustic data can support the estimation of common, individual-level ecological metrics, such as presence, detection-weighted occupancy, abundance and density, population viability and structure, and behaviour. Passive acoustic data also can support estimation of some community-level metrics, such as species richness and composition. The feasibility of estimation and certainty of estimates are highly context dependent, and understanding the factors that affect the reliability of measurements is useful for those considering whether to use passive acoustic data. Here, we review basic concepts and methods of passive acoustic sampling in marine systems that often are applicable to marine mammal research and conservation. Our ultimate aim is to facilitate collaboration among ecologists, bioacousticians, and data analysts. Ecological applications of passive acoustics require one to make decisions about sampling design, which in turn requires consideration of sound propagation, sampling of signals, and data storage. One also must make decisions about signal detection and classification and evaluation of the performance of algorithms for these tasks. Investment in the research and development of systems that automate detection and classification, including machine learning, is increasing. Passive acoustic monitoring is more reliable for detection of species presence than for estimation of other species-level metrics. Use of passive acoustic monitoring to distinguish among individual animals remains difficult. However, information about detection probability, vocalisation or cue rate, and relations between vocalisations and the number and behaviour of animals increases the feasibility of estimating abundance or density. Most sensor deployments are fixed in space or are sporadic, making temporal turnover in species composition more tractable to estimate than spatial turnover. Collaborations between acousticians and ecologists are most likely to be successful and rewarding when all partners critically examine and share a fundamental understanding of the target variables, sampling process, and analytical methods.
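The cue-counting logic mentioned above (linking detections, detection probability, and cue rate to density) reduces to a simple point estimate; a sketch with invented numbers that ignores the false-positive and variance corrections a real analysis would need.

```python
# Cue-counting point estimate of density (animals per km^2):
#   D = n_cues / (T * a * p_detect * cue_rate)
# All values below are invented for illustration only.
import math

n_cues = 1200                   # detected vocalisations over the deployment
T_hours = 24 * 30               # monitoring effort (hours)
radius_km = 10.0                # assumed effective monitoring radius of the sensor
a = math.pi * radius_km ** 2    # monitored area (km^2)
p_detect = 0.4                  # probability a cue produced within the area is detected
cue_rate = 6.0                  # vocalisations per animal per hour

density = n_cues / (T_hours * a * p_detect * cue_rate)
print(f"estimated density: {density:.4f} animals per km^2")
```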
Affiliation(s)
- Erica Fleishman: College of Earth, Ocean, and Atmospheric Sciences, Oregon State University, Corvallis, OR 97331, USA
- Danielle Cholewiak: Northeast Fisheries Science Center, National Marine Fisheries Service, National Oceanic and Atmospheric Administration, Woods Hole, MA 02543, USA
- Douglas Gillespie: Sea Mammal Research Unit, Scottish Oceans Institute, University of St Andrews, St Andrews KY16 9XL, UK
- Tyler Helble: Naval Information Warfare Center Pacific, San Diego, CA 92152, USA
- Holger Klinck: K. Lisa Yang Center for Conservation Bioacoustics, Cornell Lab of Ornithology, Cornell University, Ithaca, NY 14850, USA
- Eva-Marie Nosal: Department of Ocean and Resources Engineering, University of Hawai'i at Manoa, Honolulu, HI 96822, USA
- Marie A Roch: Department of Computer Science, San Diego State University, San Diego, CA 92182, USA
17. Zhang S, Gao Y, Cai J, Yang H, Zhao Q, Pan F. A Novel Bird Sound Recognition Method Based on Multifeature Fusion and a Transformer Encoder. Sensors (Basel) 2023; 23:8099. PMID: 37836929; PMCID: PMC10575132; DOI: 10.3390/s23198099.
Abstract
Birds play a vital role in the study of ecosystems and biodiversity. Accurate bird identification helps monitor biodiversity, understand the functions of ecosystems, and develop effective conservation strategies. However, previous bird sound recognition methods often relied on single features and overlooked the spatial information associated with these features, leading to low accuracy. Recognizing this gap, the present study proposed a bird sound recognition method that employs multiple convolutional neural networks and a transformer encoder to provide a reliable solution for identifying and classifying birds based on their unique sounds. We manually extracted various acoustic features as model inputs, and feature fusion was applied to obtain the final set of feature vectors. Feature fusion combines the deep features extracted by various networks, resulting in a more comprehensive feature set, thereby improving recognition accuracy. The multiple integrated acoustic features, such as mel frequency cepstral coefficients (MFCC), chroma features (Chroma) and Tonnetz features, were encoded by a transformer encoder. The transformer encoder effectively extracted the positional relationships between bird sound features, resulting in enhanced recognition accuracy. The experimental results demonstrated the exceptional performance of our method with an accuracy of 97.99%, a recall of 96.14%, an F1 score of 96.88% and a precision of 97.97% on the Birdsdata dataset. Furthermore, our method achieved an accuracy of 93.18%, a recall of 92.43%, an F1 score of 93.14% and a precision of 93.25% on the Cornell Bird Challenge 2020 (CBC) dataset.
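Extracting and fusing the three named feature types is straightforward with librosa (assumed available); a sketch on a synthetic tone, using simple frame averaging for the fusion step — the paper's network-based deep features and transformer encoding are not reproduced here.

```python
import numpy as np
import librosa

sr = 22050
y = 0.5 * np.sin(2 * np.pi * 440 * np.arange(sr * 2) / sr).astype(np.float32)  # 2 s test tone

mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)          # (20, frames)
chroma = librosa.feature.chroma_stft(y=y, sr=sr)            # (12, frames)
tonnetz = librosa.feature.tonnetz(y=y, sr=sr)               # (6, frames)

# Simple fusion: average over time and concatenate into one feature vector per clip.
fused = np.concatenate([m.mean(axis=1) for m in (mfcc, chroma, tonnetz)])
print("fused feature vector length:", fused.shape[0])        # 20 + 12 + 6 = 38
```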
Affiliation(s)
- Shaokai Zhang: College of Electronics and Information Engineering, Sichuan University, Chengdu 610041, China
- Yuan Gao: College of Electronics and Information Engineering, Sichuan University, Chengdu 610041, China
- Jianmin Cai: College of Electronics and Information Engineering, Sichuan University, Chengdu 610041, China
- Hangxiao Yang: College of Computer Science, Sichuan University, Chengdu 610041, China
- Qijun Zhao: College of Computer Science, Sichuan University, Chengdu 610041, China
- Fan Pan: College of Electronics and Information Engineering, Sichuan University, Chengdu 610041, China
18. Chen X, Wang R, Khalilian-Gourtani A, Yu L, Dugan P, Friedman D, Doyle W, Devinsky O, Wang Y, Flinker A. A Neural Speech Decoding Framework Leveraging Deep Learning and Speech Synthesis. bioRxiv [Preprint] 2023:2023.09.16.558028. PMID: 37745380; PMCID: PMC10516019; DOI: 10.1101/2023.09.16.558028.
Abstract
Decoding human speech from neural signals is essential for brain-computer interface (BCI) technologies restoring speech function in populations with neurological deficits. However, it remains a highly challenging task, compounded by the scarcity of neural recordings paired with corresponding speech, the complexity and high dimensionality of the data, and the limited amount of publicly available source code. Here, we present a novel deep learning-based neural speech decoding framework that includes an ECoG Decoder that translates electrocorticographic (ECoG) signals from the cortex into interpretable speech parameters and a novel differentiable Speech Synthesizer that maps speech parameters to spectrograms. We develop a companion audio-to-audio auto-encoder consisting of a Speech Encoder and the same Speech Synthesizer to generate reference speech parameters to facilitate the ECoG Decoder training. This framework generates natural-sounding speech and is highly reproducible across a cohort of 48 participants. Among three neural network architectures for the ECoG Decoder, the 3D ResNet model has the best decoding performance (PCC=0.804) in predicting the original speech spectrogram, closely followed by the SWIN model (PCC=0.796). Our experimental results show that our models can decode speech with high correlation even when limited to only causal operations, which is necessary for adoption by real-time neural prostheses. We successfully decode speech in participants with either left or right hemisphere coverage, which could lead to speech prostheses in patients with speech deficits resulting from left hemisphere damage. Further, we use an occlusion analysis to identify cortical regions contributing to speech decoding across our models. Finally, we provide open-source code for our two-stage training pipeline along with associated preprocessing and visualization tools to enable reproducible research and drive research across the speech science and prostheses communities.
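The Pearson correlation coefficient (PCC) used above to score decoded spectrograms can be computed per utterance by flattening the predicted and reference spectrograms; a sketch with random stand-in arrays.

```python
import numpy as np

def spectrogram_pcc(pred: np.ndarray, ref: np.ndarray) -> float:
    """Pearson correlation between two spectrograms of equal shape."""
    return float(np.corrcoef(pred.ravel(), ref.ravel())[0, 1])

rng = np.random.default_rng(0)
ref = rng.random((80, 200))                        # stand-in spectrogram (bins x frames)
pred = ref + rng.normal(0, 0.2, size=ref.shape)    # noisy "decoded" spectrogram
print(f"PCC = {spectrogram_pcc(pred, ref):.3f}")
```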
19. Lu S, Ang GW, Steadman M, Kozlov AS. Composite receptive fields in the mouse auditory cortex. J Physiol 2023; 601:4091-4104. PMID: 37578817; PMCID: PMC10952747; DOI: 10.1113/jp285003.
Abstract
A central question in sensory neuroscience is how neurons represent complex natural stimuli. This process involves multiple steps of feature extraction to obtain a condensed, categorical representation useful for classification and behaviour. It has previously been shown that central auditory neurons in the starling have composite receptive fields composed of multiple features. Whether this property is an idiosyncratic characteristic of songbirds, a group of highly specialized vocal learners, or a generic property of sensory processing is unknown. To address this question, we have recorded responses from auditory cortical neurons in mice, and characterized their receptive fields using mouse ultrasonic vocalizations (USVs) as a natural and ethologically relevant stimulus and pitch-shifted starling songs as a natural but ethologically irrelevant control stimulus. We have found that these neurons display composite receptive fields with multiple excitatory and inhibitory subunits. Moreover, this was the case with either the conspecific or the heterospecific vocalizations. We then trained the sparse filtering algorithm on both classes of natural stimuli to obtain statistically optimal features, and compared the natural and artificial features using UMAP, a dimensionality-reduction algorithm previously used to analyse mouse USVs and birdsongs. We have found that the receptive-field features obtained with both types of natural stimuli clustered together, as did the sparse-filtering features. However, the natural and artificial receptive-field features clustered mostly separately. Based on these results, our general conclusion is that composite receptive fields are not a unique characteristic of specialized vocal learners but are likely a generic property of central auditory systems. KEY POINTS: (i) Auditory cortical neurons in the mouse have composite receptive fields with several excitatory and inhibitory features. (ii) Receptive-field features capture temporal and spectral modulations of natural stimuli. (iii) Ethological relevance of the stimulus affects the estimation of receptive-field dimensionality.
Affiliation(s)
- Sihao Lu: Department of Bioengineering, Imperial College London, London, UK
- Grace W.Y. Ang: Department of Bioengineering, Imperial College London, London, UK
- Mark Steadman: Department of Bioengineering, Imperial College London, London, UK
20. Best P, Paris S, Glotin H, Marxer R. Deep audio embeddings for vocalisation clustering. PLoS One 2023; 18:e0283396. PMID: 37428759; DOI: 10.1371/journal.pone.0283396.
Abstract
The study of non-human animals' communication systems generally relies on the transcription of vocal sequences using a finite set of discrete units. This set is referred to as a vocal repertoire, which is specific to a species or a sub-group of a species. When conducted by human experts, the formal description of vocal repertoires can be laborious and/or biased. This motivates computerised assistance for this procedure, for which machine learning algorithms represent a good opportunity. Unsupervised clustering algorithms are suited for grouping close points together, provided a relevant representation. This paper therefore studies a new method for encoding vocalisations, allowing for automatic clustering to facilitate vocal repertoire characterisation. Borrowing from deep representation learning, we use a convolutional auto-encoder network to learn an abstract representation of vocalisations. We report on the quality of the learnt representation, as well as that of state-of-the-art methods, by quantifying their agreement with expert-labelled vocalisation types from 8 datasets of other studies across 6 species (birds and marine mammals). With this benchmark, we demonstrate that using auto-encoders improves the relevance of vocalisation representations for repertoire characterisation while requiring a very limited number of settings. We also publish a Python package for the bioacoustic community to train their own vocalisation auto-encoders or use a pretrained encoder to browse vocal repertoires and ease unit-wise annotation.
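A compact sketch of the general idea, assuming PyTorch and scikit-learn are available: a small convolutional auto-encoder compresses spectrograms to a latent vector, which is then clustered; the architecture, sizes, and random "spectrograms" are all invented and differ from the paper's models.

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

class ConvAutoEncoder(nn.Module):
    def __init__(self, latent_dim: int = 16):
        super().__init__()
        self.encoder = nn.Sequential(                       # 1x64x64 -> latent_dim
            nn.Conv2d(1, 8, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(8, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(16 * 16 * 16, latent_dim),
        )
        self.decoder = nn.Sequential(                       # latent_dim -> 1x64x64
            nn.Linear(latent_dim, 16 * 16 * 16), nn.ReLU(),
            nn.Unflatten(1, (16, 16, 16)),
            nn.ConvTranspose2d(16, 8, 2, stride=2), nn.ReLU(),
            nn.ConvTranspose2d(8, 1, 2, stride=2), nn.Sigmoid(),
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

model = ConvAutoEncoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
specs = torch.rand(64, 1, 64, 64)                # stand-in batch of spectrogram crops

for _ in range(5):                               # a few reconstruction steps, for illustration
    recon, _ = model(specs)
    loss = nn.functional.mse_loss(recon, specs)
    opt.zero_grad()
    loss.backward()
    opt.step()

with torch.no_grad():
    _, latents = model(specs)                    # learnt embeddings
clusters = KMeans(n_clusters=4, n_init=10).fit_predict(latents.numpy())
print("cluster sizes:", np.bincount(clusters).tolist())
```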
Affiliation(s)
- Paul Best: Université de Toulon, Aix Marseille Univ, CNRS, LIS, Toulon, France
- Sébastien Paris: Université de Toulon, Aix Marseille Univ, CNRS, LIS, Toulon, France
- Hervé Glotin: Université de Toulon, Aix Marseille Univ, CNRS, LIS, Toulon, France
- Ricard Marxer: Université de Toulon, Aix Marseille Univ, CNRS, LIS, Toulon, France
21. Colquitt BM, Li K, Green F, Veline R, Brainard MS. Neural circuit-wide analysis of changes to gene expression during deafening-induced birdsong destabilization. eLife 2023; 12:e85970. PMID: 37284822; PMCID: PMC10259477; DOI: 10.7554/elife.85970.
Abstract
Sensory feedback is required for the stable execution of learned motor skills, and its loss can severely disrupt motor performance. The neural mechanisms that mediate sensorimotor stability have been extensively studied at systems and physiological levels, yet relatively little is known about how disruptions to sensory input alter the molecular properties of associated motor systems. Songbird courtship song, a model for skilled behavior, is a learned and highly structured vocalization that is destabilized following deafening. Here, we sought to determine how the loss of auditory feedback modifies gene expression and its coordination across the birdsong sensorimotor circuit. To facilitate this system-wide analysis of transcriptional responses, we developed a gene expression profiling approach that enables the construction of hundreds of spatially-defined RNA-sequencing libraries. Using this method, we found that deafening preferentially alters gene expression across birdsong neural circuitry relative to surrounding areas, particularly in premotor and striatal regions. Genes with altered expression are associated with synaptic transmission, neuronal spines, and neuromodulation and show a bias toward expression in glutamatergic neurons and Pvalb/Sst-class GABAergic interneurons. We also found that connected song regions exhibit correlations in gene expression that were reduced in deafened birds relative to hearing birds, suggesting that song destabilization alters the inter-region coordination of transcriptional states. Finally, lesioning LMAN, a forebrain afferent of RA required for deafening-induced song plasticity, had the largest effect on groups of genes that were also most affected by deafening. Combined, this integrated transcriptomics analysis demonstrates that the loss of peripheral sensory input drives a distributed gene expression response throughout associated sensorimotor neural circuitry and identifies specific candidate molecular and cellular mechanisms that support the stability and plasticity of learned motor skills.
Affiliation(s)
- Bradley M Colquitt
- Howard Hughes Medical InstituteChevy ChaseUnited States
- Department of Physiology, University of California, San FranciscoSan FranciscoUnited States
| | - Kelly Li
- Howard Hughes Medical InstituteChevy ChaseUnited States
- Department of Physiology, University of California, San FranciscoSan FranciscoUnited States
| | - Foad Green
- Howard Hughes Medical Institute, Chevy Chase, United States
- Department of Physiology, University of California, San Francisco, San Francisco, United States
| | - Robert Veline
- Howard Hughes Medical Institute, Chevy Chase, United States
- Department of Physiology, University of California, San Francisco, San Francisco, United States
| | - Michael S Brainard
- Howard Hughes Medical Institute, Chevy Chase, United States
- Department of Physiology, University of California, San Francisco, San Francisco, United States
| |
Collapse
|
22
|
Brudner S, Pearson J, Mooney R. Generative models of birdsong learning link circadian fluctuations in song variability to changes in performance. PLoS Comput Biol 2023; 19:e1011051. [PMID: 37126511 PMCID: PMC10150982 DOI: 10.1371/journal.pcbi.1011051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/11/2022] [Accepted: 03/27/2023] [Indexed: 05/02/2023] Open
Abstract
Learning skilled behaviors requires intensive practice over days, months, or years. Behavioral hallmarks of practice include exploratory variation and long-term improvements, both of which can be impacted by circadian processes. During weeks of vocal practice, the juvenile male zebra finch transforms highly variable and simple song into a stable and precise copy of an adult tutor's complex song. Song variability and performance in juvenile finches also exhibit circadian structure that could influence this long-term learning process. In fact, one influential study reported that juvenile song regresses towards immature performance overnight, while another suggested a more complex pattern of overnight change. However, neither of these studies thoroughly examined how circadian patterns of variability may structure the production of more or less mature songs. Here we relate the circadian dynamics of song maturation to circadian patterns of song variation, leveraging a combination of data-driven approaches. In particular, we analyze juvenile singing in a learned feature space that supports both data-driven measures of song maturity and generative developmental models of song production. These models reveal that circadian fluctuations in variability lead to especially regressive morning variants even without overall overnight regression, and highlight the utility of data-driven generative models for untangling these contributions.
Collapse
Affiliation(s)
- Samuel Brudner
- Department of Neurobiology, Duke University School of Medicine, Durham, North Carolina, United States of America
| | - John Pearson
- Department of Neurobiology, Duke University School of Medicine, Durham, North Carolina, United States of America
- Department of Biostatistics & Bioinformatics, Duke University, Durham, North Carolina, United States of America
| | - Richard Mooney
- Department of Neurobiology, Duke University School of Medicine, Durham, North Carolina, United States of America
| |
Collapse
|
23
|
Arnaud V, Pellegrino F, Keenan S, St-Gelais X, Mathevon N, Levréro F, Coupé C. Improving the workflow to crack Small, Unbalanced, Noisy, but Genuine (SUNG) datasets in bioacoustics: The case of bonobo calls. PLoS Comput Biol 2023; 19:e1010325. [PMID: 37053268 PMCID: PMC10129004 DOI: 10.1371/journal.pcbi.1010325] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/24/2022] [Revised: 04/25/2023] [Accepted: 03/01/2023] [Indexed: 04/15/2023] Open
Abstract
Despite the accumulation of data and studies, deciphering animal vocal communication remains challenging. In most cases, researchers must deal with the sparse recordings composing Small, Unbalanced, Noisy, but Genuine (SUNG) datasets. SUNG datasets are characterized by a limited number of recordings, most often noisy, and unbalanced in number between the individuals or categories of vocalizations. SUNG datasets therefore offer a valuable but inevitably distorted vision of communication systems. Adopting the best practices in their analysis is essential to effectively extract the available information and draw reliable conclusions. Here we show that the most recent advances in machine learning applied to a SUNG dataset succeed in unraveling the complex vocal repertoire of the bonobo, and we propose a workflow that can be effective with other animal species. We implement acoustic parameterization in three feature spaces and run a Supervised Uniform Manifold Approximation and Projection (S-UMAP) to evaluate how call types and individual signatures cluster in the bonobo acoustic space. We then implement three classification algorithms (Support Vector Machine, xgboost, neural networks) and their combination to explore the structure and variability of bonobo calls, as well as the robustness of the individual signature they encode. We underscore how classification performance is affected by the feature set and identify the most informative features. In addition, we highlight the need to address data leakage in the evaluation of classification performance to avoid misleading interpretations. Our results lead to identifying several practical approaches that are generalizable to any other animal communication system. To improve the reliability and replicability of vocal communication studies with SUNG datasets, we thus recommend: i) comparing several acoustic parameterizations; ii) visualizing the dataset with supervised UMAP to examine the species acoustic space; iii) adopting Support Vector Machines as the baseline classification approach; iv) explicitly evaluating data leakage and possibly implementing a mitigation strategy.
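The workflow recommendations above translate naturally into a short script. The sketch below is a non-authoritative illustration: it assumes time-averaged MFCCs as the acoustic parameterization (one of several options that could be compared), uses supervised UMAP for visualisation, and evaluates an SVM baseline with individual-wise folds so that calls from the same animal never occur in both training and test sets, addressing the data-leakage concern raised above.

```python
# Sketch of a SUNG-style workflow: stubbed MFCC features, supervised UMAP
# for inspecting the acoustic space, and an SVM baseline evaluated with
# grouped folds to avoid individual-level data leakage.
import numpy as np
import librosa
import umap
from sklearn.model_selection import GroupKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import balanced_accuracy_score

def mfcc_features(waveform, sr=44100, n_mfcc=20):
    """One fixed-length vector per call: time-averaged MFCCs (an assumption)."""
    mfcc = librosa.feature.mfcc(y=waveform, sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)

def evaluate(calls):
    """calls: list of (waveform, call_type, individual_id) tuples, loaded elsewhere."""
    X = np.stack([mfcc_features(w) for w, _, _ in calls])
    y = np.array([t for _, t, _ in calls])
    groups = np.array([ind for _, _, ind in calls])

    # Supervised UMAP for visual inspection of call-type clustering.
    embedding = umap.UMAP(n_neighbors=15, min_dist=0.1).fit_transform(X, y=y)

    # SVM baseline with individual-wise folds to avoid leakage.
    scores = []
    for train, test in GroupKFold(n_splits=5).split(X, y, groups):
        clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10))
        clf.fit(X[train], y[train])
        scores.append(balanced_accuracy_score(y[test], clf.predict(X[test])))
    return embedding, np.mean(scores)
```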
Collapse
Affiliation(s)
- Vincent Arnaud
- Département des arts, des lettres et du langage, Université du Québec à Chicoutimi, Chicoutimi, Canada
- Laboratoire Dynamique Du Langage, UMR 5596, Université de Lyon, CNRS, Lyon, France
| | - François Pellegrino
- Laboratoire Dynamique Du Langage, UMR 5596, Université de Lyon, CNRS, Lyon, France
| | - Sumir Keenan
- ENES Bioacoustics Research Laboratory, University of Saint Étienne, CRNL, CNRS UMR 5292, Inserm UMR_S 1028, Saint-Étienne, France
| | - Xavier St-Gelais
- Département des arts, des lettres et du langage, Université du Québec à Chicoutimi, Chicoutimi, Canada
| | - Nicolas Mathevon
- ENES Bioacoustics Research Laboratory, University of Saint Étienne, CRNL, CNRS UMR 5292, Inserm UMR_S 1028, Saint-Étienne, France
| | - Florence Levréro
- ENES Bioacoustics Research Laboratory, University of Saint Étienne, CRNL, CNRS UMR 5292, Inserm UMR_S 1028, Saint-Étienne, France
| | - Christophe Coupé
- Laboratoire Dynamique Du Langage, UMR 5596, Université de Lyon, CNRS, Lyon, France
- Department of Linguistics, The University of Hong Kong, Hong Kong, China
| |
Collapse
|
24
|
Vattis K, Luddy AC, Ouillon JS, Eklund NM, Stephen CD, Schmahmann JD, Nunes AS, Gupta AS. Sensitive quantification of cerebellar speech abnormalities using deep learning models. MEDRXIV : THE PREPRINT SERVER FOR HEALTH SCIENCES 2023:2023.04.03.23288094. [PMID: 37066308 PMCID: PMC10104181 DOI: 10.1101/2023.04.03.23288094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 04/18/2023]
Abstract
Objective: Objective, sensitive, and meaningful disease assessments are critical to support clinical trials and clinical care. Speech changes are one of the earliest and most evident manifestations of cerebellar ataxias. The purpose of this work is to develop models that can accurately identify and quantify these abnormalities. Methods: We use deep learning models, such as ResNet-18, that take the time and frequency partial derivatives of log-mel spectrogram representations of speech as input to learn representations that capture the motor speech phenotype of cerebellar ataxia. We train classification models to separate patients with ataxia from healthy controls, as well as regression models to estimate disease severity. Results: Our model was able to accurately distinguish healthy controls from individuals with ataxia, including ataxia participants with no detectable clinical deficits in speech. Furthermore, the regression models produced accurate estimates of disease severity, were able to measure subclinical signs of ataxia, and captured disease progression over time in individuals with ataxia. Conclusion: Deep learning models, trained on time and frequency partial derivatives of the speech signal, can detect subclinical speech changes in ataxias and sensitively measure disease change over time. Significance: Such models have the potential to assist with early detection of ataxia and to provide sensitive and low-burden assessment tools in support of clinical trials and neurological care.
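A hedged sketch of the input representation described in the Methods: compute a log-mel spectrogram, take its partial derivatives along time and frequency, stack them as two image channels, and feed them to a standard ResNet-18 with a two-class output. The sampling rate, mel settings and the derivative operator are illustrative assumptions rather than the authors' exact configuration.

```python
# Sketch: time and frequency partial derivatives of a log-mel spectrogram
# as two input channels for a ResNet-18 classifier (ataxia vs. control).
import numpy as np
import librosa
import torch
import torchvision

def derivative_channels(waveform, sr=16000, n_mels=128):
    mel = librosa.feature.melspectrogram(y=waveform, sr=sr, n_mels=n_mels)
    log_mel = librosa.power_to_db(mel)
    d_time = np.gradient(log_mel, axis=1)   # partial derivative along time
    d_freq = np.gradient(log_mel, axis=0)   # partial derivative along frequency
    return np.stack([d_time, d_freq])       # shape: (2, n_mels, n_frames)

# Standard ResNet-18 adapted to 2 input channels and a binary output.
model = torchvision.models.resnet18(weights=None)
model.conv1 = torch.nn.Conv2d(2, 64, kernel_size=7, stride=2, padding=3, bias=False)
model.fc = torch.nn.Linear(model.fc.in_features, 2)

# Toy forward pass on 5 s of random audio standing in for a speech sample.
x = torch.tensor(derivative_channels(np.random.randn(16000 * 5)),
                 dtype=torch.float32).unsqueeze(0)  # (1, 2, 128, n_frames)
logits = model(x)
```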
Collapse
|
25
|
Jourjine N, Woolfolk ML, Sanguinetti-Scheck JI, Sabatini JE, McFadden S, Lindholm AK, Hoekstra HE. Two pup vocalization types are genetically and functionally separable in deer mice. Curr Biol 2023; 33:1237-1248.e4. [PMID: 36893759 DOI: 10.1016/j.cub.2023.02.045] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/26/2022] [Revised: 02/11/2023] [Accepted: 02/14/2023] [Indexed: 03/10/2023]
Abstract
Vocalization is a widespread social behavior in vertebrates that can affect fitness in the wild. Although many vocal behaviors are highly conserved, heritable features of specific vocalization types can vary both within and between species, raising the questions of why and how some vocal behaviors evolve. Here, using new computational tools to automatically detect and cluster vocalizations into distinct acoustic categories, we compare pup isolation calls across neonatal development in eight taxa of deer mice (genus Peromyscus) and compare them with laboratory mice (C57BL6/J strain) and free-living, wild house mice (Mus musculus domesticus). Whereas both Peromyscus and Mus pups produce ultrasonic vocalizations (USVs), Peromyscus pups also produce a second call type with acoustic features, temporal rhythms, and developmental trajectories that are distinct from those of USVs. In deer mice, these lower frequency "cries" are predominantly emitted in postnatal days one through nine, whereas USVs are primarily made after day 9. Using playback assays, we show that cries result in a more rapid approach by Peromyscus mothers than USVs, suggesting a role for cries in eliciting parental care early in neonatal development. Using a genetic cross between two sister species of deer mice exhibiting large, innate differences in the acoustic structure of cries and USVs, we find that variation in vocalization rate, duration, and pitch displays different degrees of genetic dominance and that cry and USV features can be uncoupled in second-generation hybrids. Taken together, this work shows that vocal behavior can evolve quickly between closely related rodent species in which vocalization types, likely serving distinct functions in communication, are controlled by distinct genetic loci.
Collapse
Affiliation(s)
- Nicholas Jourjine
- Department of Molecular & Cellular Biology, Department of Organismic & Evolutionary Biology, Center for Brain Science, Museum of Comparative Zoology, Harvard University and the Howard Hughes Medical Institute, 16 Divinity Avenue, Cambridge, MA 02138, USA
| | - Maya L Woolfolk
- Department of Molecular & Cellular Biology, Department of Organismic & Evolutionary Biology, Center for Brain Science, Museum of Comparative Zoology, Harvard University and the Howard Hughes Medical Institute, 16 Divinity Avenue, Cambridge, MA 02138, USA
| | - Juan I Sanguinetti-Scheck
- Department of Molecular & Cellular Biology, Department of Organismic & Evolutionary Biology, Center for Brain Science, Museum of Comparative Zoology, Harvard University and the Howard Hughes Medical Institute, 16 Divinity Avenue, Cambridge, MA 02138, USA
| | - John E Sabatini
- Department of Molecular & Cellular Biology, Department of Organismic & Evolutionary Biology, Center for Brain Science, Museum of Comparative Zoology, Harvard University and the Howard Hughes Medical Institute, 16 Divinity Avenue, Cambridge, MA 02138, USA
| | - Sade McFadden
- Department of Molecular & Cellular Biology, Department of Organismic & Evolutionary Biology, Center for Brain Science, Museum of Comparative Zoology, Harvard University and the Howard Hughes Medical Institute, 16 Divinity Avenue, Cambridge, MA 02138, USA
| | - Anna K Lindholm
- Department of Evolutionary Biology & Environmental Studies, University of Zürich, Winterthurerstrasse 190, 8057 Zürich, Switzerland
| | - Hopi E Hoekstra
- Department of Molecular & Cellular Biology, Department of Organismic & Evolutionary Biology, Center for Brain Science, Museum of Comparative Zoology, Harvard University and the Howard Hughes Medical Institute, 16 Divinity Avenue, Cambridge, MA 02138, USA.
| |
Collapse
|
26
|
Zimmermann J, Beguet F, Guthruf D, Langbehn B, Rupp D. Finding the semantic similarity in single-particle diffraction images using self-supervised contrastive projection learning. NPJ COMPUTATIONAL MATERIALS 2023; 9:24. [PMID: 38666059 PMCID: PMC11041688 DOI: 10.1038/s41524-023-00966-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 09/23/2022] [Accepted: 01/10/2023] [Indexed: 04/28/2024]
Abstract
Single-shot coherent diffraction imaging of isolated nanosized particles has seen remarkable success in recent years, yielding in-situ measurements with ultra-high spatial and temporal resolution. The progress of high-repetition-rate sources for intense X-ray pulses has further enabled the recording of datasets containing millions of diffraction images, which are needed for the structure determination of specimens with greater structural variety and for dynamic experiments. The size of the datasets, however, represents a monumental problem for their analysis. Here, we present an automated approach for finding semantic similarities in coherent diffraction images without relying on human expert labeling. By introducing the concept of projection learning, we extend self-supervised contrastive learning to the context of coherent diffraction imaging and achieve a dimensionality reduction producing semantically meaningful embeddings that align with physical intuition. The method yields substantial improvements compared to previous approaches, paving the way toward real-time and large-scale analysis of coherent diffraction experiments at X-ray free-electron lasers.
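The contrastive ingredient of such self-supervised approaches can be illustrated with an NT-Xent (SimCLR-style) loss that pulls the embeddings of two views of the same diffraction image together; the projection-learning extension described above concerns how the views are generated, which is only stubbed here. This is a generic sketch, not the paper's implementation.

```python
# Generic NT-Xent contrastive loss: two augmented views of each image are
# positives; everything else in the batch is a negative.
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, temperature=0.1):
    """z1, z2: (batch, dim) embeddings of two views of the same images."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)        # (2B, dim)
    sim = z @ z.T / temperature                                 # cosine similarities
    n = z.shape[0]
    sim.fill_diagonal_(float("-inf"))                           # exclude self-pairs
    targets = torch.arange(n, device=z.device).roll(n // 2)    # index of each positive
    return F.cross_entropy(sim, targets)

# Toy usage with random embeddings standing in for an encoder's output.
z1, z2 = torch.randn(32, 128), torch.randn(32, 128)
loss = nt_xent(z1, z2)
```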
Collapse
Affiliation(s)
| | | | | | | | - Daniela Rupp
- ETH Zürich, Zürich, Switzerland
- Max-Born-Institut, Berlin, Germany
| |
Collapse
|
27
|
Clink DJ, Kier I, Ahmad AH, Klinck H. A workflow for the automated detection and classification of female gibbon calls from long-term acoustic recordings. Front Ecol Evol 2023. [DOI: 10.3389/fevo.2023.1071640] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/11/2023] Open
Abstract
Passive acoustic monitoring (PAM) allows for the study of vocal animals on temporal and spatial scales difficult to achieve using only human observers. Recent improvements in recording technology, data storage, and battery capacity have led to increased use of PAM. One of the main obstacles in implementing wide-scale PAM programs is the lack of open-source programs that efficiently process terabytes of sound recordings and do not require large amounts of training data. Here we describe a workflow for detecting, classifying, and visualizing female Northern grey gibbon calls in Sabah, Malaysia. Our approach detects sound events using band-limited energy summation and performs binary classification of these events (gibbon female or not) using machine learning algorithms (support vector machine and random forest). We then applied an unsupervised approach (affinity propagation clustering) to test whether we could further differentiate between true and false positives or estimate the number of gibbon females in our dataset. We used this workflow to address three questions: (1) does this automated approach provide reliable estimates of temporal patterns of gibbon calling activity; (2) can unsupervised approaches be applied as a post-processing step to improve the performance of the system; and (3) can unsupervised approaches be used to estimate how many female individuals (or clusters) there are in our study area? We found that performance plateaued with >160 clips of training data for each of our two classes. Using optimized settings, our automated approach achieved a satisfactory performance (F1 score ~ 80%). The unsupervised approach did not effectively differentiate between true and false positives or return clusters that appear to correspond to the number of females in our study area. Our results indicate that more work needs to be done before unsupervised approaches can be reliably used to estimate the number of individual animals occupying an area from PAM data. Future work applying these methods across sites and different gibbon species and comparisons to deep learning approaches will be crucial for future gibbon conservation initiatives across Southeast Asia.
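A minimal sketch of the detection stage described above: band-limited energy summation over a spectrogram, with a threshold and a minimum duration to propose sound events. The frequency band, threshold and duration values are illustrative assumptions, not the published settings for gibbon great calls.

```python
# Band-limited energy detector: sum spectrogram power in a frequency band,
# threshold it, and convert the boolean mask into (start, end) event times.
import numpy as np
import librosa

def band_energy_events(waveform, sr, fmin=500.0, fmax=1800.0,
                       threshold_db=10.0, min_dur_s=1.0):
    S = np.abs(librosa.stft(waveform, n_fft=2048, hop_length=512)) ** 2
    freqs = librosa.fft_frequencies(sr=sr, n_fft=2048)
    band = (freqs >= fmin) & (freqs <= fmax)
    energy_db = librosa.power_to_db(S[band].sum(axis=0))
    above = energy_db > (np.median(energy_db) + threshold_db)

    events, start = [], None
    hop_s = 512 / sr
    for i, flag in enumerate(np.append(above, False)):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if (i - start) * hop_s >= min_dur_s:   # keep long-enough events
                events.append((start * hop_s, i * hop_s))
            start = None
    return events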
Collapse
|
28
|
Berthet M, Coye C, Dezecache G, Kuhn J. Animal linguistics: a primer. Biol Rev Camb Philos Soc 2023; 98:81-98. [PMID: 36189714 PMCID: PMC10091714 DOI: 10.1111/brv.12897] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/05/2022] [Revised: 08/10/2022] [Accepted: 08/12/2022] [Indexed: 01/12/2023]
Abstract
The evolution of language has been investigated by several research communities, including biologists and linguists, striving to highlight similar linguistic capacities across species. To date, however, no consensus exists on the linguistic capacities of non-human species. Major controversies remain on the use of linguistic terminology, analysis methods and behavioural data collection. The field of 'animal linguistics' has emerged to overcome these difficulties and attempt to reach uniform methods and terminology. This primer is a tutorial review of 'animal linguistics'. It describes the linguistic concepts of semantics, pragmatics and syntax, and proposes minimal criteria to be fulfilled to claim that a given species displays a particular linguistic capacity. Second, it reviews relevant methods successfully applied to the study of communication in animals and proposes a list of useful references to detect and overcome major pitfalls commonly observed in the collection of animal behaviour data. This primer represents a step towards mutual understanding and fruitful collaborations between linguists and biologists.
Collapse
Affiliation(s)
- Mélissa Berthet
- Institut Jean Nicod, Département d'études cognitives, ENS, EHESS, CNRS, PSL University, 75005, Paris, France
- Center for the Interdisciplinary Study of Language Evolution, University of Zürich, Affolternstrasse 56, 8050, Zurich, Switzerland
- Department of Comparative Language Science, University of Zürich, Affolternstrasse 56, 8050, Zurich, Switzerland
| | - Camille Coye
- Institut Jean Nicod, Département d'études cognitives, ENS, EHESS, CNRS, PSL University, 75005, Paris, France
- Center for Ecology and Conservation, Bioscience Department, University of Exeter, Penryn Campus, Penryn, TR10 9FE, UK
| | | | - Jeremy Kuhn
- Institut Jean Nicod, Département d'études cognitives, ENS, EHESS, CNRS, PSL University, 75005, Paris, France
| |
Collapse
|
29
|
Walsh SL, Engesser S, Townsend SW, Ridley AR. Multi-level combinatoriality in magpie non-song vocalizations. J R Soc Interface 2023; 20:20220679. [PMID: 36722171 PMCID: PMC9890321 DOI: 10.1098/rsif.2022.0679] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/02/2023] Open
Abstract
Comparative studies conducted over the past few decades have provided important insights into the capacity for animals to combine vocal segments at either one of two levels: within- or between-calls. There remains, however, a distinct gap in knowledge as to whether animal combinatoriality can extend beyond one level. Investigating this requires a comprehensive analysis of the combinatorial features characterizing a species' vocal system. Here, we used a nonlinear dimensionality reduction analysis and sequential transition analysis to quantitatively describe the non-song combinatorial repertoire of the Western Australian magpie (Gymnorhina tibicen dorsalis). We found that (i) magpies recombine four distinct acoustic segments to create a larger number of calls, and (ii) the resultant calls are further combined into larger call combinations. Our work demonstrates two levels in the combining of magpie vocal units. These results are incongruous with the notion that a capacity for multi-level combinatoriality is unique to human language, wherein the combining of meaningless sounds and meaningful words interactively occurs across different combinatorial levels. Our study thus provides novel insights into the combinatorial capacities of a non-human species, adding to the growing evidence of analogues of language-specific traits present in the animal kingdom.
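The sequential transition analysis mentioned above can be sketched as a first-order transition matrix over acoustic segment labels; the segment labels and example sequences below are hypothetical.

```python
# First-order sequential transition analysis: count how often each segment
# type follows another across calls, then normalise rows into probabilities.
import numpy as np

def transition_matrix(sequences, alphabet):
    idx = {s: i for i, s in enumerate(alphabet)}
    counts = np.zeros((len(alphabet), len(alphabet)))
    for seq in sequences:
        for a, b in zip(seq[:-1], seq[1:]):
            counts[idx[a], idx[b]] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, row_sums, out=np.zeros_like(counts),
                     where=row_sums > 0)

# Hypothetical calls built from four segment types.
calls = [["A", "B", "B", "C"], ["A", "C", "D"], ["B", "C", "D", "D"]]
P = transition_matrix(calls, alphabet=["A", "B", "C", "D"])
```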
Collapse
Affiliation(s)
- Sarah L. Walsh
- Centre for Evolutionary Biology, School of Biological Sciences, University of Western Australia, Crawley, WA 6009, Australia
| | - Sabrina Engesser
- Department of Biology, University of Copenhagen, 1165 København, Denmark
| | - Simon W. Townsend
- Department of Comparative Language Science, University of Zurich, Zurich 8006, Switzerland
- Center for the Interdisciplinary Study of Language Evolution (ISLE), University of Zurich, Zurich 8006, Switzerland
- Department of Psychology, University of Warwick, Coventry CV4 7AL, UK
| | - Amanda R. Ridley
- Centre for Evolutionary Biology, School of Biological Sciences, University of Western Australia, Crawley, WA 6009, Australia
| |
Collapse
|
30
|
Lorenz C, Hao X, Tomka T, Rüttimann L, Hahnloser RH. Interactive extraction of diverse vocal units from a planar embedding without the need for prior sound segmentation. FRONTIERS IN BIOINFORMATICS 2023; 2:966066. [PMID: 36710910 PMCID: PMC9880044 DOI: 10.3389/fbinf.2022.966066] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/20/2022] [Accepted: 11/14/2022] [Indexed: 01/15/2023] Open
Abstract
Annotating and proofreading data sets of complex natural behaviors such as vocalizations are tedious tasks because instances of a given behavior need to be correctly segmented from background noise and must be classified with minimal false positive error rate. Low-dimensional embeddings have proven very useful for this task because they can provide a visual overview of a data set in which distinct behaviors appear in different clusters. However, low-dimensional embeddings introduce errors because they fail to preserve distances; and embeddings represent only objects of fixed dimensionality, which conflicts with vocalizations that have variable dimensions stemming from their variable durations. To mitigate these issues, we introduce a semi-supervised, analytical method for simultaneous segmentation and clustering of vocalizations. We define a given vocalization type by specifying pairs of high-density regions in the embedding plane of sound spectrograms, one region associated with vocalization onsets and the other with offsets. We demonstrate our two-neighborhood (2N) extraction method on the task of clustering adult zebra finch vocalizations embedded with UMAP. We show that 2N extraction allows the identification of short and long vocal renditions from continuous data streams without initially committing to a particular segmentation of the data. Also, 2N extraction achieves much lower false positive error rate than comparable approaches based on a single defining region. Along with our method, we present a graphical user interface (GUI) for visualizing and annotating data.
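A rough sketch of the two-neighbourhood idea, under strong simplifying assumptions: short spectrogram windows are embedded with UMAP, the user supplies one rectangular region for onsets and one for offsets, and each onset hit is paired with the next offset hit to extract renditions of variable duration. The region definitions, window construction and UMAP settings here are illustrative, not the published tool.

```python
# Simplified 2N-style extraction: embed spectrogram windows, select onset
# and offset regions in the embedding plane, and pair them into events.
import numpy as np
import umap

def in_region(points, xlim, ylim):
    return ((points[:, 0] >= xlim[0]) & (points[:, 0] <= xlim[1]) &
            (points[:, 1] >= ylim[0]) & (points[:, 1] <= ylim[1]))

def two_neighbourhood_extract(windows, times, onset_box, offset_box):
    """windows: (n, d) flattened spectrogram windows; times: (n,) window times."""
    emb = umap.UMAP(n_neighbors=30, min_dist=0.0).fit_transform(windows)
    onsets = times[in_region(emb, *onset_box)]
    offsets = times[in_region(emb, *offset_box)]
    events = []
    for t_on in onsets:
        later = offsets[offsets > t_on]
        if later.size:
            events.append((t_on, later.min()))   # pair with the next offset
    return events

# Toy usage: random windows standing in for sliding spectrogram columns.
windows = np.random.rand(500, 64 * 8)
times = np.arange(500) * 0.01
events = two_neighbourhood_extract(windows, times,
                                   onset_box=((-5, 0), (-5, 0)),
                                   offset_box=((0, 5), (0, 5)))
```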
Collapse
Affiliation(s)
- Corinna Lorenz
- Institute of Neuroinformatics and Neuroscience Center Zurich, University of Zurich and ETH Zurich, Zurich, Switzerland
- Université Paris-Saclay, CNRS, Institut des Neurosciences Paris-Saclay, Saclay, France
| | - Xinyu Hao
- Institute of Neuroinformatics and Neuroscience Center Zurich, University of Zurich and ETH Zurich, Zurich, Switzerland
- Tianjin University, School of Electrical and Information Engineering, Tianjin, China
| | - Tomas Tomka
- Institute of Neuroinformatics and Neuroscience Center Zurich, University of Zurich and ETH Zurich, Zurich, Switzerland
| | - Linus Rüttimann
- Institute of Neuroinformatics and Neuroscience Center Zurich, University of Zurich and ETH Zurich, Zurich, Switzerland
| | - Richard H.R. Hahnloser
- Institute of Neuroinformatics and Neuroscience Center Zurich, University of Zurich and ETH Zurich, Zurich, Switzerland
| |
Collapse
|
31
|
McGinn K, Kahl S, Peery MZ, Klinck H, Wood CM. Feature embeddings from the BirdNET algorithm provide insights into avian ecology. ECOL INFORM 2023. [DOI: 10.1016/j.ecoinf.2023.101995] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/19/2023]
|
32
|
Pranic NM, Kornbrek C, Yang C, Cleland TA, Tschida KA. Rates of ultrasonic vocalizations are more strongly related than acoustic features to non-vocal behaviors in mouse pups. Front Behav Neurosci 2022; 16:1015484. [PMID: 36600992 PMCID: PMC9805956 DOI: 10.3389/fnbeh.2022.1015484] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/09/2022] [Accepted: 11/29/2022] [Indexed: 12/23/2022] Open
Abstract
Mouse pups produce ultrasonic vocalizations (USVs) in response to isolation from the nest (i.e., isolation USVs). Rates and acoustic features of isolation USVs change dramatically over the first two weeks of life, and there is also substantial variability in the rates and acoustic features of isolation USVs at a given postnatal age. The factors that contribute to within-age variability in isolation USVs remain largely unknown. Here, we explore the extent to which non-vocal behaviors of mouse pups relate to the within-age variability in rates and acoustic features of their USVs. We recorded non-vocal behaviors of isolated C57BL/6J mouse pups at four postnatal ages (postnatal days 5, 10, 15, and 20), measured rates of isolation USV production, and applied a combination of pre-defined acoustic feature measurements and an unsupervised machine learning-based vocal analysis method to examine USV acoustic features. When we considered different categories of non-vocal behavior, our analyses revealed that mice in all postnatal age groups produce higher rates of isolation USVs during active non-vocal behaviors than when lying still. Moreover, rates of isolation USVs are correlated with the intensity (i.e., magnitude) of non-vocal body and limb movements within a given trial. In contrast, USVs produced during different categories of non-vocal behaviors and during different intensities of non-vocal movement do not differ substantially in their acoustic features. Our findings suggest that levels of behavioral arousal contribute to within-age variability in rates, but not acoustic features, of mouse isolation USVs.
Collapse
|
33
|
Provost KL, Yang J, Carstens BC. The impacts of fine-tuning, phylogenetic distance, and sample size on big-data bioacoustics. PLoS One 2022; 17:e0278522. [PMID: 36477744 PMCID: PMC9728902 DOI: 10.1371/journal.pone.0278522] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/11/2022] [Accepted: 11/17/2022] [Indexed: 12/12/2022] Open
Abstract
Vocalizations in animals, particularly birds, are critically important behaviors that influence their reproductive fitness. While recordings of bioacoustic data have been captured and stored in collections for decades, the automated extraction of data from these recordings has only recently been facilitated by artificial intelligence methods. These have yet to be evaluated with respect to the accuracy of different automation strategies and features. Here, we use a recently published machine learning framework to extract syllables from the songs of ten bird species ranging in their phylogenetic relatedness from 1 to 85 million years, to compare how phylogenetic relatedness influences accuracy. We also evaluate the utility of applying trained models to novel species. Our results indicate that model performance is best on conspecifics, with accuracy progressively decreasing as phylogenetic distance increases between taxa. However, we also find that the application of models trained on multiple distantly related species can improve the overall accuracy to levels near that of training and analyzing a model on the same species. When planning big-data bioacoustics studies, care must be taken in sample design to maximize sample size and minimize human labor without sacrificing accuracy.
Collapse
Affiliation(s)
- Kaiya L. Provost
- Department of Evolution, Ecology and Organismal Biology, The Ohio State University, Columbus, Ohio, United States of America
| | - Jiaying Yang
- Department of Evolution, Ecology and Organismal Biology, The Ohio State University, Columbus, Ohio, United States of America
| | - Bryan C. Carstens
- Department of Evolution, Ecology and Organismal Biology, The Ohio State University, Columbus, Ohio, United States of America
| |
Collapse
|
34
|
Michaud F, Sueur J, Le Cesne M, Haupert S. Unsupervised classification to improve the quality of a bird song recording dataset. ECOL INFORM 2022. [DOI: 10.1016/j.ecoinf.2022.101952] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/14/2022]
|
35
|
Morales G, Vargas V, Espejo D, Poblete V, Tomasevic JA, Otondo F, Navedo JG. Method for passive acoustic monitoring of bird communities using UMAP and a deep neural network. ECOL INFORM 2022. [DOI: 10.1016/j.ecoinf.2022.101909] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/19/2022]
|
36
|
Xing J, Sainburg T, Taylor H, Gentner TQ. Syntactic modulation of rhythm in Australian pied butcherbird song. ROYAL SOCIETY OPEN SCIENCE 2022; 9:220704. [PMID: 36177196 PMCID: PMC9515642 DOI: 10.1098/rsos.220704] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/27/2022] [Accepted: 09/05/2022] [Indexed: 05/04/2023]
Abstract
The acoustic structure of birdsong is spectrally and temporally complex. Temporal complexity is often investigated in a syntactic framework focusing on the statistical features of symbolic song sequences. Alternatively, temporal patterns can be investigated in a rhythmic framework that focuses on the relative timing between song elements. Here, we investigate the merits of combining both frameworks by integrating syntactic and rhythmic analyses of Australian pied butcherbird (Cracticus nigrogularis) songs, which exhibit organized syntax and diverse rhythms. We show that rhythms of the pied butcherbird song bouts in our sample are categorically organized and predictable by the song's first-order sequential syntax. These song rhythms remain categorically distributed and strongly associated with the first-order sequential syntax even after controlling for variance in note length, suggesting that the silent intervals between notes induce a rhythmic structure on note sequences. We discuss the implication of syntactic-rhythmic relations as a relevant feature of song complexity with respect to signals such as human speech and music, and advocate for a broader conception of song complexity that takes into account syntax, rhythm, and their interaction with other acoustic and perceptual features.
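The link between the two frameworks can be illustrated by computing rhythm as inter-onset-interval ratios and grouping each ratio by its first-order syntactic context (the bigram of note labels around it). The onset times and labels below are hypothetical.

```python
# Group rhythm ratios (interval_k / (interval_k + interval_k+1)) by the
# first-order syntactic context in which they occur.
import numpy as np
from collections import defaultdict

def rhythm_by_bigram(onsets, labels):
    """onsets: note onset times in seconds; labels: note types, same length."""
    intervals = np.diff(onsets)
    ratios = intervals[:-1] / (intervals[:-1] + intervals[1:])  # values in (0, 1)
    by_bigram = defaultdict(list)
    for k, r in enumerate(ratios):
        by_bigram[(labels[k], labels[k + 1])].append(r)
    return by_bigram

# Hypothetical six-note sequence.
onsets = np.array([0.0, 0.31, 0.62, 1.05, 1.36, 1.67])
labels = ["A", "A", "B", "A", "A", "B"]
groups = rhythm_by_bigram(onsets, labels)
```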
Collapse
Affiliation(s)
- Jeffrey Xing
- Department of Psychology, University of California San Diego, La Jolla, CA, USA
| | - Tim Sainburg
- Department of Psychology, University of California San Diego, La Jolla, CA, USA
| | - Hollis Taylor
- Sydney Conservatorium of Music, University of Sydney, Sydney, New South Wales, Australia
| | - Timothy Q. Gentner
- Department of Psychology, University of California San Diego, La Jolla, CA, USA
- Neurobiology Section, Division of Biological Sciences, University of California San Diego, La Jolla, CA, USA
- Kavli Institute for Brain and Mind, University of California San Diego, La Jolla, CA, USA
| |
Collapse
|
37
|
Xing J, Sainburg T, Taylor H, Gentner TQ. Syntactic modulation of rhythm in Australian pied butcherbird song. ROYAL SOCIETY OPEN SCIENCE 2022; 9:220704. [PMID: 36177196 DOI: 10.6084/m9.figshare.c.6197494] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Subscribe] [Scholar Register] [Received: 05/27/2022] [Accepted: 09/05/2022] [Indexed: 05/21/2023]
Abstract
The acoustic structure of birdsong is spectrally and temporally complex. Temporal complexity is often investigated in a syntactic framework focusing on the statistical features of symbolic song sequences. Alternatively, temporal patterns can be investigated in a rhythmic framework that focuses on the relative timing between song elements. Here, we investigate the merits of combining both frameworks by integrating syntactic and rhythmic analyses of Australian pied butcherbird (Cracticus nigrogularis) songs, which exhibit organized syntax and diverse rhythms. We show that rhythms of the pied butcherbird song bouts in our sample are categorically organized and predictable by the song's first-order sequential syntax. These song rhythms remain categorically distributed and strongly associated with the first-order sequential syntax even after controlling for variance in note length, suggesting that the silent intervals between notes induce a rhythmic structure on note sequences. We discuss the implication of syntactic-rhythmic relations as a relevant feature of song complexity with respect to signals such as human speech and music, and advocate for a broader conception of song complexity that takes into account syntax, rhythm, and their interaction with other acoustic and perceptual features.
Collapse
Affiliation(s)
- Jeffrey Xing
- Department of Psychology, University of California San Diego, La Jolla, CA, USA
| | - Tim Sainburg
- Department of Psychology, University of California San Diego, La Jolla, CA, USA
| | - Hollis Taylor
- Sydney Conservatorium of Music, University of Sydney, Sydney, New South Wales, Australia
| | - Timothy Q Gentner
- Department of Psychology, University of California San Diego, La Jolla, CA, USA
- Neurobiology Section, Division of Biological Sciences, University of California San Diego, La Jolla, CA, USA
- Kavli Institute for Brain and Mind, University of California San Diego, La Jolla, CA, USA
| |
Collapse
|
38
|
Sun Y, Yen S, Lin T. soundscape_IR: A source separation toolbox for exploring acoustic diversity in soundscapes. Methods Ecol Evol 2022. [DOI: 10.1111/2041-210x.13960] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Affiliation(s)
- Yi‐Jen Sun
- Biodiversity Research Center, Academia Sinica, Taipei, Taiwan (R.O.C.)
| | - Shih‐Ching Yen
- Center for General Education, National Tsing Hua University, Hsinchu, Taiwan (R.O.C.)
| | - Tzu‐Hao Lin
- Biodiversity Research Center, Academia Sinica, Taipei, Taiwan (R.O.C.)
| |
Collapse
|
39
|
Introducing the Software CASE (Cluster and Analyze Sound Events) by Comparing Different Clustering Methods and Audio Transformation Techniques Using Animal Vocalizations. Animals (Basel) 2022; 12:ani12162020. [PMID: 36009611 PMCID: PMC9404437 DOI: 10.3390/ani12162020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/20/2022] [Revised: 07/28/2022] [Accepted: 08/04/2022] [Indexed: 11/17/2022] Open
Abstract
Simple Summary: Unsupervised clustering algorithms are widely used in ecology and conservation to classify animal vocalizations, but also offer various advantages in basic research, contributing to the understanding of acoustic communication. Nevertheless, there are still some challenges to overcome. For instance, the quality of the clustering result depends on the audio transformation technique previously used to adjust the audio data. Moreover, it is difficult to verify the reliability of the clustering result. To analyze bioacoustic data using a clustering algorithm, it is therefore essential to select a reasonable algorithm from the many existing algorithms and to prepare the recorded vocalizations so that the resulting values characterize a vocalization as accurately as possible. Frequency-modulated vocalizations, whose frequencies change over time, pose a particular problem. In this paper, we present the software CASE, which includes various clustering methods and provides an overview of their strengths and weaknesses concerning the classification of bioacoustic data. This software uses a multidimensional feature-extraction method to achieve better clustering results, especially for frequency-modulated vocalizations.
Abstract: Unsupervised clustering algorithms are widely used in ecology and conservation to classify animal sounds, but also offer several advantages in basic bioacoustics research. Consequently, it is important to overcome the existing challenges. A common practice is extracting the acoustic features of vocalizations one-dimensionally, only extracting an average value for a given feature for the entire vocalization. With frequency-modulated vocalizations, whose acoustic features can change over time, this can lead to insufficient characterization. Whether the necessary parameters have been set correctly and whether the obtained clustering result reliably classifies the vocalizations often remains unclear. The presented software, CASE, is intended to overcome these challenges. Established and new unsupervised clustering methods (community detection, affinity propagation, HDBSCAN, and fuzzy clustering) are tested in combination with various classifiers (k-nearest neighbor, dynamic time-warping, and cross-correlation) using differently transformed animal vocalizations. These methods are compared with predefined clusters to determine their strengths and weaknesses. In addition, a multidimensional data transformation procedure is presented that better represents the course of multiple acoustic features. The results suggest that, especially with frequency-modulated vocalizations, clustering is more applicable with multidimensional feature extraction compared with one-dimensional feature extraction. The characterization and clustering of vocalizations in multidimensional space offer great potential for future bioacoustic studies. The software CASE includes the developed method of multidimensional feature extraction, as well as all used clustering methods. It allows quickly applying several clustering algorithms to one data set to compare their results and to verify their reliability based on their consistency. Moreover, the software CASE determines the optimal values of most of the necessary parameters automatically. To take advantage of these benefits, the software CASE is provided for free download.
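The contrast between one-dimensional and multidimensional feature extraction can be sketched as follows: the first function reduces a call to a single mean value of a feature, while the second keeps the feature's time course, resampled to a fixed number of points so that calls of different durations remain comparable. The choice of spectral centroid and the number of points are illustrative assumptions, not CASE's implementation.

```python
# One-dimensional vs. multidimensional extraction of a single acoustic feature.
import numpy as np
import librosa

def centroid_1d(waveform, sr):
    """One value per call: mean spectral centroid (loses frequency modulation)."""
    c = librosa.feature.spectral_centroid(y=waveform, sr=sr)[0]
    return np.array([c.mean()])

def centroid_multidim(waveform, sr, n_points=20):
    """Fixed-length contour per call: preserves the feature's time course."""
    c = librosa.feature.spectral_centroid(y=waveform, sr=sr)[0]
    return np.interp(np.linspace(0, len(c) - 1, n_points),
                     np.arange(len(c)), c)
```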
Collapse
|
40
|
Comella I, Tasirin JS, Klinck H, Johnson LM, Clink DJ. Investigating note repertoires and acoustic tradeoffs in the duet contributions of a basal haplorrhine primate. Front Ecol Evol 2022. [DOI: 10.3389/fevo.2022.910121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open
Abstract
Acoustic communication serves a crucial role in the social interactions of vocal animals. Duetting—the coordinated singing among pairs of animals—has evolved independently multiple times across diverse taxonomic groups including insects, frogs, birds, and mammals. A crucial first step for understanding how information is encoded and transferred in duets is through quantifying the acoustic repertoire, which can reveal differences and similarities on multiple levels of analysis and provides the groundwork necessary for further studies of the vocal communication patterns of the focal species. Investigating acoustic tradeoffs, such as the tradeoff between the rate of syllable repetition and note bandwidth, can also provide important insights into the evolution of duets, as these tradeoffs may represent the physical and mechanical limits on signal design. In addition, identifying which sex initiates the duet can provide insights into the function of the duets. We have three main goals in the current study: (1) provide a descriptive, fine-scale analysis of Gursky’s spectral tarsier (Tarsius spectrumgurskyae) duets; (2) use unsupervised approaches to investigate sex-specific note repertoires; and (3) test for evidence of acoustic tradeoffs in the rate of note repetition and bandwidth of tarsier duet contributions. We found that both sexes were equally likely to initiate the duets and that pairs differed substantially in the duration of their duets. Our unsupervised clustering analyses indicate that both sexes have highly graded note repertoires. We also found evidence for acoustic tradeoffs in both male and female duet contributions, but the relationship in females was much more pronounced. The prevalence of this tradeoff across diverse taxonomic groups including birds, bats, and primates indicates the constraints that limit the production of rapidly repeating broadband notes may be one of the few ‘universals’ in vocal communication. Future carefully designed playback studies that investigate the behavioral response, and therefore potential information transmitted in duets to conspecifics, will be highly informative.
Collapse
|
41
|
Karigo T. Gaining insights into the internal states of the rodent brain through vocal communications. Neurosci Res 2022; 184:1-8. [PMID: 35908736 DOI: 10.1016/j.neures.2022.07.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/02/2022] [Revised: 07/25/2022] [Accepted: 07/26/2022] [Indexed: 10/31/2022]
Abstract
Animals display various behaviors during social interactions. Social behaviors have been proposed to be driven by the internal states of the animals, reflecting their emotional or motivational states. However, the internal states that drive social behaviors are complex and difficult to interpret. Many animals, including mice, use vocalizations for communication in various social contexts. This review provides an overview of the current understanding of mouse vocal communication, its underlying neural circuitry, and the potential to use vocal communication as a readout of the animal's internal states during social interactions.
Collapse
Affiliation(s)
- Tomomi Karigo
- Division of Biology and Biological Engineering 140-18, Tianqiao and Chrissy Chen Institute for Neuroscience, California Institute of Technology, Pasadena, CA 91125, USA; Present address: Kennedy Krieger Institute, Baltimore, MD 21205, USA; The Solomon H. Snyder Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA.
| |
Collapse
|
42
|
Harvill J, Wani Y, Alam M, Ahuja N, Hasegawa-Johnson M, Chestek D, Beiser DG. Estimation of Respiratory Rate from Breathing Audio. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2022; 2022:4599-4603. [PMID: 36085895 DOI: 10.1109/embc48229.2022.9871897] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]
Abstract
The COVID-19 pandemic has fueled exponential growth in the adoption of remote delivery of primary, specialty, and urgent health care services. One major challenge is the lack of access to the physical exam, including accurate and inexpensive remote measurement of vital signs. Here we present a novel method for machine learning-based estimation of patient respiratory rate from audio. Non-learning methods exist, but their accuracy is limited, and the machine-learning work known to us is either not directly applicable or relies on non-public datasets. We are aware of only one publicly available dataset, which is small and which we use to evaluate our algorithm. However, to avoid the overfitting problem, we expand its effective size by proposing a new data augmentation method. Our algorithm uses the spectrogram representation and requires labels for breathing cycles, which are used to train a recurrent neural network for recognizing the cycles. Our augmentation method exploits the independence property of the most periodic frequency components of the spectrogram and permutes their order to create multiple signal representations. Our experiments show that our method almost halves the errors obtained by the existing (non-learning) methods. Clinical Relevance: We achieve a Mean Absolute Error (MAE) of 1.0 for the respiratory rate while relying only on an audio signal of a patient breathing. This signal can be collected from a smartphone such that physicians can automatically and reliably determine respiratory rate in a remote setting.
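A hedged sketch of the augmentation idea described above: score each spectrogram frequency bin by how periodic its energy envelope is, then permute the order of the most periodic bins to create additional training examples. The periodicity score and the number of permuted bins are assumptions; the paper's exact procedure may differ.

```python
# Augmentation sketch: permute the order of the most periodic frequency bins
# of a magnitude spectrogram to generate additional signal representations.
import numpy as np

def periodicity_score(row):
    """Height of the strongest non-zero-lag autocorrelation peak of one bin."""
    row = row - row.mean()
    ac = np.correlate(row, row, mode="full")[len(row) - 1:]
    if ac[0] <= 0:
        return 0.0
    ac = ac / ac[0]
    return ac[1:].max()

def permute_periodic_bins(spec, n_bins=16, rng=None):
    """spec: (n_freq, n_frames) magnitude spectrogram."""
    rng = rng or np.random.default_rng()
    scores = np.array([periodicity_score(r) for r in spec])
    top = np.argsort(scores)[-n_bins:]              # most periodic bins
    augmented = spec.copy()
    augmented[top] = spec[rng.permutation(top)]     # shuffle their order
    return augmented

# Toy usage on a random spectrogram standing in for breathing audio.
spec = np.abs(np.random.randn(128, 300))
augmented = permute_periodic_bins(spec)
```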
Collapse
|
43
|
Thomas M, Jensen FH, Averly B, Demartsev V, Manser MB, Sainburg T, Roch MA, Strandburg-Peshkin A. A practical guide for generating unsupervised, spectrogram-based latent space representations of animal vocalizations. J Anim Ecol 2022; 91:1567-1581. [PMID: 35657634 DOI: 10.1111/1365-2656.13754] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/17/2021] [Accepted: 05/26/2022] [Indexed: 11/30/2022]
Abstract
1. The manual detection, analysis, and classification of animal vocalizations in acoustic recordings is laborious and requires expert knowledge. Hence, there is a need for objective, generalizable methods that detect underlying patterns in these data, categorize sounds into distinct groups, and quantify similarities between them. Among all computational methods that have been proposed to accomplish this, neighborhood-based dimensionality reduction of spectrograms to produce a latent-space representation of calls stands out for its conceptual simplicity and effectiveness. 2. Using a dataset of manually annotated meerkat (Suricata suricatta) vocalizations, we demonstrate how this method can be used to obtain meaningful latent-space representations that reflect the established taxonomy of call types. We analyze strengths and weaknesses of the proposed approach, give recommendations for its usage, and show application examples, such as the classification of ambiguous calls and the detection of mislabeled calls. 3. All analyses are accompanied by example code to help researchers realize the potential of this method for the study of animal vocalizations.
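A minimal sketch of the spectrogram-to-latent-space recipe summarised above: each call becomes a log-mel spectrogram, zero-padded to a common length, flattened, and embedded with UMAP. Padding length, mel settings and UMAP parameters are illustrative; the paper's own example code should be preferred for actual analyses.

```python
# Neighborhood-based latent space of calls: padded log-mel spectrograms -> UMAP.
import numpy as np
import librosa
import umap

def call_to_vector(waveform, sr, n_mels=64, max_frames=128):
    mel = librosa.feature.melspectrogram(y=waveform, sr=sr, n_mels=n_mels)
    log_mel = librosa.power_to_db(mel)[:, :max_frames]
    padded = np.zeros((n_mels, max_frames))          # zero-pad short calls
    padded[:, :log_mel.shape[1]] = log_mel
    return padded.ravel()

def latent_space(calls, sr):
    """calls: list of waveforms; returns a 2-D embedding, one point per call."""
    X = np.stack([call_to_vector(w, sr) for w in calls])
    return umap.UMAP(n_neighbors=15, min_dist=0.25,
                     metric="euclidean").fit_transform(X)
```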
Collapse
Affiliation(s)
- Mara Thomas
- Department for the Ecology of Animal Societies, Max Planck Institute of Animal Behavior, Bücklestrasse 5a, Konstanz, Germany
- Biology Department, University of Konstanz, Universitätsstrasse 10, Konstanz, Germany
| | - Frants H Jensen
- Biology Department, Woods Hole Oceanographic Institution, 266 Woods Hole Rd, Woods Hole, MA 02543, USA
- Department of Biology, Syracuse University, Syracuse, NY 13244, USA
| | - Baptiste Averly
- Department for the Ecology of Animal Societies, Max Planck Institute of Animal Behavior, Bücklestrasse 5a, Konstanz, Germany
- Biology Department, University of Konstanz, Universitätsstrasse 10, Konstanz, Germany
| | - Vlad Demartsev
- Department for the Ecology of Animal Societies, Max Planck Institute of Animal Behavior, Bücklestrasse 5a, Konstanz, Germany
- Biology Department, University of Konstanz, Universitätsstrasse 10, Konstanz, Germany
| | - Marta B Manser
- Kalahari Meerkat Project, Kuruman River Reserve, Van Zylsrus, Northern Cape, South Africa
- Department of Evolutionary Biology and Environmental Studies, University of Zürich, 8057 Zürich, Switzerland
| | - Tim Sainburg
- Department of Psychology, University of California San Diego, 9500 Gilman, La Jolla, CA 92093, USA
| | - Marie A Roch
- Department of Computer Science, San Diego State University, 5500 Campanile Drive, San Diego, CA 92182-7720, USA
| | - Ariana Strandburg-Peshkin
- Department for the Ecology of Animal Societies, Max Planck Institute of Animal Behavior, Bücklestrasse 5a, Konstanz, Germany
- Biology Department, University of Konstanz, Universitätsstrasse 10, Konstanz, Germany
- Kalahari Meerkat Project, Kuruman River Reserve, Van Zylsrus, Northern Cape, South Africa
- Centre for the Advanced Study of Collective Behavior, University of Konstanz, Universitätsstrasse 10, Konstanz, Germany
| |
Collapse
|
44
|
Miller CT, Gire D, Hoke K, Huk AC, Kelley D, Leopold DA, Smear MC, Theunissen F, Yartsev M, Niell CM. Natural behavior is the language of the brain. Curr Biol 2022; 32:R482-R493. [PMID: 35609550 PMCID: PMC10082559 DOI: 10.1016/j.cub.2022.03.031] [Citation(s) in RCA: 47] [Impact Index Per Article: 23.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/31/2022]
Abstract
The breadth and complexity of natural behaviors inspires awe. Understanding how our perceptions, actions, and internal thoughts arise from evolved circuits in the brain has motivated neuroscientists for generations. Researchers have traditionally approached this question by focusing on stereotyped behaviors, either natural or trained, in a limited number of model species. This approach has allowed for the isolation and systematic study of specific brain operations, which has greatly advanced our understanding of the circuits involved. At the same time, the emphasis on experimental reductionism has left most aspects of the natural behaviors that have shaped the evolution of the brain largely unexplored. However, emerging technologies and analytical tools make it possible to comprehensively link natural behaviors to neural activity across a broad range of ethological contexts and timescales, heralding new modes of neuroscience focused on natural behaviors. Here we describe a three-part roadmap that aims to leverage the wealth of behaviors in their naturally occurring distributions, linking their variance with that of underlying neural processes to understand how the brain is able to successfully navigate the everyday challenges of animals' social and ecological landscapes. To achieve this aim, experimenters must harness one challenge faced by all neurobiological systems, namely variability, in order to gain new insights into the language of the brain.
Collapse
Affiliation(s)
- Cory T Miller
- Cortical Systems and Behavior Laboratory, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92039, USA.
| | - David Gire
- Department of Psychology, University of Washington, Guthrie Hall, Seattle, WA 98105, USA
| | - Kim Hoke
- Department of Biology, Colorado State University, 1878 Campus Delivery, Fort Collins, CO 80523, USA
| | - Alexander C Huk
- Center for Perceptual Systems, Departments of Neuroscience and Psychology, University of Texas at Austin, 116 Inner Campus Drive, Austin, TX 78712, USA
| | - Darcy Kelley
- Department of Biological Sciences, Columbia University, 1212 Amsterdam Avenue, New York, NY 10027, USA
| | - David A Leopold
- Section of Cognitive Neurophysiology and Imaging, National Institute of Mental Health, 49 Convent Drive, Bethesda, MD 20892, USA
| | - Matthew C Smear
- Department of Psychology and Institute of Neuroscience, University of Oregon, 1227 University Street, Eugene, OR 97403, USA
| | - Frederic Theunissen
- Department of Psychology, University of California Berkeley, 2121 Berkeley Way, Berkeley, CA 94720, USA
| | - Michael Yartsev
- Department of Bioengineering, University of California Berkeley, 306 Stanley Hall, Berkeley, CA 94720, USA
| | - Cristopher M Niell
- Department of Biology and Institute of Neuroscience, University of Oregon, 222 Huestis Hall, Eugene, OR 97403, USA.
| |
Collapse
|
45
|
MASCDB, a database of images, descriptors and microphysical properties of individual snowflakes in free fall. Sci Data 2022; 9:186. [PMID: 35504919 PMCID: PMC9065139 DOI: 10.1038/s41597-022-01269-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/21/2021] [Accepted: 03/14/2022] [Indexed: 11/21/2022] Open
Abstract
Snowfall information at the scale of individual particles is rare, difficult to gather, but fundamental for a better understanding of solid precipitation microphysics. In this article we present a dataset (with dedicated software) of in-situ measurements of snow particles in free fall. The dataset includes gray-scale (255 shades) images of snowflakes, co-located surface environmental measurements, a large number of geometrical and textural snowflake descriptors as well as the output of previously published retrieval algorithms. These include: hydrometeor classification, riming degree estimation, identification of melting particles, discrimination of wind-blown snow, as well as estimates of snow particle mass and volume. The measurements were collected in various locations of the Alps, Antarctica and Korea for a total of 2'555'091 snowflake images (or 851'697 image triplets). As the instrument used for data collection was a Multi-Angle Snowflake Camera (MASC), the dataset is named MASCDB. Given the large number of snowflake images and associated descriptors, MASCDB can also be exploited by the computer vision community for the training and benchmarking of image processing systems. Measurement(s): fall speed; snowfall images; snow crystals; geometrical characteristics; textural characteristics; snowfall microphysics. Technology Type(s): Multi-Angle Snowflake Camera.
Collapse
|
46
|
Valente D, Miaretsoa L, Anania A, Costa F, Mascaro A, Raimondi T, De Gregorio C, Torti V, Friard O, Ratsimbazafy J, Giacoma C, Gamba M. Comparative Analysis of the Vocal Repertoires of the Indri (Indri indri) and the Diademed Sifaka (Propithecus diadema). INT J PRIMATOL 2022. [DOI: 10.1007/s10764-022-00287-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/16/2023]
Abstract
Strepsirrhine vocalisations are extraordinarily diverse and cross-species comparisons are needed to explore how this variability evolved. We contributed to the investigation of primate acoustic diversity by comparing the vocal repertoires of two sympatric lemur species, Propithecus diadema and Indri indri. These diurnal species belong to the same taxonomic family and have similar activity patterns but different social structures. These features make them excellent candidates for an investigation of the phylogenetic, environmental, and social influences on primate vocal behavior. We recorded 3 P. diadema groups in 2014 and 2016. From 1,872 recordings, we selected and assigned 3,814 calls to 9 a priori call types on the basis of their acoustic structure. We implemented a reproducible technique that performs acoustic feature extraction based on frequency bins, t-SNE data reduction, and a hard-clustering analysis. We first quantified the vocal repertoire of P. diadema, finding consistent results for the 9 putatively identified call types. When comparing this repertoire with a previously published repertoire of I. indri, we found highly species-specific repertoires, with only 2% of the calls misclassified by species identity. The loud calls of the two species were very distinct, while the low-frequency calls were more similar. Our results pinpoint the role of phylogenetic history and of social and environmental features in the evolution of communicative systems, and contribute to a deeper understanding of the evolutionary roots of primate vocal differentiation. We conclude by arguing that standardized and reproducible techniques, like the one we employed, allow robust comparisons and should be prioritized in the future.
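The pipeline outlined above (frequency-bin features, t-SNE reduction, hard clustering) can be sketched as follows; the bin count, STFT settings, and the use of k-means as the hard-clustering step are illustrative assumptions rather than the authors' exact procedure.

```python
# Per-call frequency-bin features -> t-SNE embedding -> hard clustering.
import numpy as np
import librosa
from sklearn.manifold import TSNE
from sklearn.cluster import KMeans

def frequency_bin_features(waveform, sr, n_bins=32):
    S = np.abs(librosa.stft(waveform, n_fft=1024, hop_length=256)) ** 2
    spectrum = S.mean(axis=1)                        # average power over time
    bins = np.array_split(spectrum, n_bins)          # fixed frequency bins
    feats = np.array([b.sum() for b in bins])
    return np.log1p(feats / feats.sum())             # normalised log energy

def cluster_repertoire(calls, sr, n_clusters=9):
    """calls: list of waveforms; returns the 2-D embedding and hard labels."""
    X = np.stack([frequency_bin_features(w, sr) for w in calls])
    emb = TSNE(n_components=2, perplexity=30).fit_transform(X)
    return emb, KMeans(n_clusters=n_clusters, n_init=10).fit_predict(emb)
```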
47
Zhu Y, Smith A, Hauser K. Automated Heart and Lung Auscultation in Robotic Physical Examinations. IEEE Robot Autom Lett 2022. [DOI: 10.1109/lra.2022.3149576] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]
48
Linhart P, Mahamoud-Issa M, Stowell D, Blumstein DT. The potential for acoustic individual identification in mammals. Mamm Biol 2022. [DOI: 10.1007/s42991-021-00222-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/06/2023]
49
Adret P. Developmental Plasticity in Primate Coordinated Song: Parallels and Divergences With Duetting Songbirds. Front Ecol Evol 2022. [DOI: 10.3389/fevo.2022.862196] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open
Abstract
Homeothermic animals (birds and mammals) are prime model systems for investigating the developmental plasticity and neural mechanisms of vocal duetting, a cooperative acoustic signal that prevails in family-living and pair-bonded species, including humans. This review focuses on the nature of this trait and its nurturing during ontogeny and into adulthood. I begin by outlining the underpinning concepts of duet codes and pair-specific answering rules as used by birds to develop their learned coordinated song, driven by a complex interaction between self-generated and socially mediated auditory feedback. The more tractable avian model of duetting helps identify research gaps in singing primates, which also use duetting as a type of intraspecific vocal interaction. Nevertheless, it has become clear that primate coordinated song, whether overlapping or antiphonal, is subject to some degree of vocal flexibility. This is reflected in the ability of lesser apes, titi monkeys, tarsiers, and lemurs to adjust the structure and timing of their calls through (1) social influence, (2) coordinated duetting both before and after mating, (3) the repair of vocal mistakes, (4) the production of heterosexual song early in life, (5) vocal accommodation in call rhythm, (6) conditioning, and (7) innovation. Furthermore, experimental work on the neural underpinnings of avian and mammalian antiphonal duets points to a hierarchical (cortico-subcortical) control mechanism that regulates, via inhibition, the temporal segregation of rapid vocal exchanges. I discuss some weaknesses in this growing field of research and highlight prospective avenues for future investigation.
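To make the inhibition-based temporal-segregation idea concrete, here is a toy simulation (not a model from the review): two callers whose call drive builds up over time, where a call by one transiently inhibits the other. All parameter values are arbitrary assumptions; the point is only that simple mutual inhibition is enough to yield antiphonal rather than overlapping exchanges.

```python
# Toy mutual-inhibition model of antiphonal calling (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
drive = np.zeros(2)        # call readiness of caller A and caller B
inhibition = np.zeros(2)   # inhibition each caller receives from its partner
calls = []                 # recorded (time_step, caller) events

for t in range(200):
    drive += rng.uniform(0.05, 0.15, size=2)   # intrinsic build-up of motivation
    inhibition *= 0.8                          # inhibition decays over time
    for i in range(2):
        if drive[i] - inhibition[i] > 1.0:     # threshold crossing -> emit a call
            calls.append((t, "A" if i == 0 else "B"))
            drive[i] = 0.0                     # reset own drive after calling
            inhibition[1 - i] += 1.5           # briefly suppress the partner

print(calls[:10])  # the two callers tend to alternate rather than overlap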
50
Parsons MJG, Lin TH, Mooney TA, Erbe C, Juanes F, Lammers M, Li S, Linke S, Looby A, Nedelec SL, Van Opzeeland I, Radford C, Rice AN, Sayigh L, Stanley J, Urban E, Di Iorio L. Sounding the Call for a Global Library of Underwater Biological Sounds. Front Ecol Evol 2022. [DOI: 10.3389/fevo.2022.810156] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022] Open
Abstract
Aquatic environments encompass the world's most extensive habitats, rich with sounds produced by a diversity of animals. Passive acoustic monitoring (PAM) is an increasingly accessible remote-sensing technology that uses hydrophones to listen to the underwater world and represents an unprecedented, non-invasive method to monitor underwater environments. This information can assist in the delineation of biologically important areas via detection of sound-producing species or characterization of ecosystem type and condition, inferred from the acoustic properties of the local soundscape. At a time when worldwide biodiversity is in significant decline and underwater soundscapes are being altered as a result of anthropogenic impacts, there is a need to document, quantify, and understand biotic sound sources, potentially before they disappear. A significant step toward these goals is the development of a web-based, open-access platform that provides: (1) a reference library of known and unknown biological sound sources (by integrating and expanding existing libraries around the world); (2) a data repository portal for annotated and unannotated audio recordings of single sources and of soundscapes; (3) a training platform for artificial intelligence algorithms for signal detection and classification; and (4) a citizen-science application for public users. Although these resources often exist individually at regional and taxon-specific scales, many are not sustained, and an enduring global database with an integrated platform has not been realized. We discuss the benefits such a program can provide, previous calls for global data sharing and reference libraries, and the challenges that need to be overcome to bring together bio- and ecoacousticians, bioinformaticians, propagation experts, web engineers, and signal-processing specialists (e.g., in artificial intelligence) with the necessary support and funding to build a sustainable and scalable platform that can address the needs of all contributors and stakeholders into the future.
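As a sketch of what a single record in such a reference library and data-repository portal might contain, the following data structure lists plausible metadata fields. The field names and defaults are illustrative assumptions, not a schema proposed by the authors.

```python
# Hypothetical metadata record for one annotated entry in a global library
# of underwater biological sounds (all field names are assumptions).
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class SoundLibraryEntry:
    recording_uri: str                 # link to the audio file in the repository
    start_s: float                     # annotation start within the recording (s)
    end_s: float                       # annotation end within the recording (s)
    latitude: float
    longitude: float
    recorded_utc: str                  # ISO 8601 timestamp
    sample_rate_hz: int
    species: Optional[str] = None      # None for "unknown source" entries
    call_type: Optional[str] = None
    annotator: Optional[str] = None
    license: str = "CC-BY-4.0"         # open-access default (assumption)
    tags: list[str] = field(default_factory=list)

# Example entry for an unidentified sound source in a soundscape recording.
entry = SoundLibraryEntry(
    recording_uri="https://example.org/recordings/0001.wav",
    start_s=12.4,
    end_s=14.1,
    latitude=-33.9,
    longitude=151.2,
    recorded_utc="2021-03-05T04:20:00Z",
    sample_rate_hz=48000,
    species=None,
    call_type="pulse train",
    tags=["soundscape", "reef", "unknown source"],
)
```

A consistent record format of this kind is what would let annotated single-source clips, unannotated soundscapes, and citizen-science submissions feed the same detection and classification training platform.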