1
Diep QB, Phan HY, Truong TC. Crossmixed convolutional neural network for digital speech recognition. PLoS One 2024; 19:e0302394. [PMID: 38669233; PMCID: PMC11051591; DOI: 10.1371/journal.pone.0302394] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Accepted: 04/02/2024] [Indexed: 04/28/2024] [Open access]
Abstract
Digital speech recognition is a challenging problem that requires learning complex signal characteristics such as frequency, pitch, intensity, timbre, and melody, which traditional methods often struggle to recognize. This article introduces three solutions based on convolutional neural networks (CNNs): 1D-CNN learns directly from the digital data, while 2DS-CNN and 2DM-CNN have more complex architectures that convert the raw waveform into images using the Fourier transform to learn essential features. Experimental results on four large data sets, each containing 30,000 samples, show that the three proposed models outperform well-known models such as GoogLeNet and AlexNet, with best accuracies of 95.87%, 99.65%, and 99.76%, respectively. With 5-10% higher performance than other models, the proposed solutions demonstrate the ability to learn features effectively, improve recognition accuracy and speed, and open up potential for broad applications in virtual assistants, medical recording, and voice commands.
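The waveform-to-image step mentioned in this abstract can be illustrated with a short-time Fourier transform. The following is only a minimal sketch, not the authors' preprocessing: the function name `spectrogram`, the frame length, the hop size, the Hann window, and the 8 kHz test tone are all assumptions chosen for illustration.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Return a log-magnitude spectrogram of shape (freq_bins, n_frames).

    frame_len and hop are illustrative defaults, not values from the paper.
    """
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the waveform into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # One real-input FFT per frame gives the magnitude spectrum over time.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # log1p compresses the dynamic range so the result is usable as an image.
    return np.log1p(magnitude).T

# Hypothetical input: one second of a 440 Hz tone sampled at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440 * t)
image = spectrogram(tone)
```

The resulting 2D array could then be fed to a 2D CNN in the same way as any single-channel image.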
Affiliation(s)
- Quoc Bao Diep
  - Faculty of Mechanical - Electrical and Computer Engineering, Van Lang University, Ho Chi Minh City, Vietnam
- Hong Yen Phan
  - Faculty of Mechanical - Electrical and Computer Engineering, Van Lang University, Ho Chi Minh City, Vietnam
- Thanh-Cong Truong
  - Faculty of Information Technology, University of Finance-Marketing, Ho Chi Minh City, Vietnam
2
Zhou Z, Yip HM, Tsimring K, Sur M, Ip JPK, Tin C. Effective and efficient neural networks for spike inference from in vivo calcium imaging. Cell Rep Methods 2023; 3:100462. [PMID: 37323579; PMCID: PMC10261900; DOI: 10.1016/j.crmeth.2023.100462] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2022] [Revised: 02/21/2023] [Accepted: 03/31/2023] [Indexed: 06/17/2023]
Abstract
Calcium imaging can monitor the activity of large neuronal populations simultaneously. However, it lacks the signal quality provided by neural spike recording in traditional electrophysiology. To address this issue, we developed a supervised data-driven approach to extract spike information from calcium signals. We propose the ENS2 (effective and efficient neural networks for spike inference from calcium signals) system for spike-rate and spike-event predictions using ΔF/F0 calcium inputs based on a U-Net deep neural network. When tested on a large, ground-truth public database, it consistently outperformed state-of-the-art algorithms in both spike-rate and spike-event predictions with reduced computational load. We further demonstrated that ENS2 can be applied to analyses of orientation selectivity in primary visual cortex neurons. We conclude that ENS2 is a versatile inference system that may benefit diverse neuroscience studies.
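The ΔF/F0 input named in this abstract is the fluorescence change relative to a baseline F0. A minimal sketch of that normalization follows; estimating F0 as a low percentile of the trace is an assumption made here for illustration (the abstract does not say how the baseline is obtained), and `delta_f_over_f0` and `baseline_percentile` are hypothetical names.

```python
import numpy as np

def delta_f_over_f0(fluorescence, baseline_percentile=10.0):
    """Compute dF/F0 = (F - F0) / F0 for a single fluorescence trace.

    F0 is estimated as a low percentile of the trace, which is an
    illustrative choice and not taken from the ENS2 paper.
    """
    f = np.asarray(fluorescence, dtype=float)
    f0 = np.percentile(f, baseline_percentile)
    return (f - f0) / f0

# Hypothetical trace: flat baseline of 100 with one transient of 150.
trace = np.array([100.0] * 9 + [150.0])
dff = delta_f_over_f0(trace)
```

In practice a running or windowed baseline is often preferred for long recordings, since a single global percentile does not track slow drift.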
Affiliation(s)
- Zhanhong Zhou
  - Department of Biomedical Engineering, City University of Hong Kong, Hong Kong SAR, China
- Hei Matthew Yip
  - School of Biomedical Sciences, The Chinese University of Hong Kong, Hong Kong SAR, China
- Katya Tsimring
  - Department of Brain and Cognitive Sciences, Picower Institute for Learning and Memory, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
- Mriganka Sur
  - Department of Brain and Cognitive Sciences, Picower Institute for Learning and Memory, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA
- Jacque Pak Kan Ip
  - School of Biomedical Sciences, The Chinese University of Hong Kong, Hong Kong SAR, China
- Chung Tin
  - Department of Biomedical Engineering, City University of Hong Kong, Hong Kong SAR, China
3
Zhu F, Grier HA, Tandon R, Cai C, Agarwal A, Giovannucci A, Kaufman MT, Pandarinath C. A deep learning framework for inference of single-trial neural population dynamics from calcium imaging with subframe temporal resolution. Nat Neurosci 2022; 25:1724-1734. [PMID: 36424431; PMCID: PMC9825112; DOI: 10.1038/s41593-022-01189-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/18/2021] [Accepted: 09/23/2022] [Indexed: 11/26/2022]
Abstract
In many areas of the brain, neural populations act as a coordinated network whose state is tied to behavior on a millisecond timescale. Two-photon (2p) calcium imaging is a powerful tool to probe such network-scale phenomena. However, estimating the network state and dynamics from 2p measurements has proven challenging because of noise, inherent nonlinearities and limitations on temporal resolution. Here we describe Recurrent Autoencoder for Discovering Imaged Calcium Latents (RADICaL), a deep learning method to overcome these limitations at the population level. RADICaL extends methods that exploit dynamics in spiking activity for application to deconvolved calcium signals, whose statistics and temporal dynamics are quite distinct from electrophysiologically recorded spikes. It incorporates a new network training strategy that capitalizes on the timing of 2p sampling to recover network dynamics with high temporal precision. In synthetic tests, RADICaL infers the network state more accurately than previous methods, particularly for high-frequency components. In 2p recordings from sensorimotor areas in mice performing a forelimb reach task, RADICaL infers network state with close correspondence to single-trial variations in behavior and maintains high-quality inference even when neuronal populations are substantially reduced.
Affiliation(s)
- Feng Zhu
  - Wallace H. Coulter Department of Biomedical Engineering, Emory University and Georgia Institute of Technology, Atlanta, GA, USA
  - Neuroscience Graduate Program, Graduate Division of Biological and Biomedical Sciences, Emory University, Atlanta, GA, USA
- Harrison A Grier
  - Committee on Computational Neuroscience, The University of Chicago, Chicago, IL, USA
- Raghav Tandon
  - Wallace H. Coulter Department of Biomedical Engineering, Emory University and Georgia Institute of Technology, Atlanta, GA, USA
- Changjia Cai
  - Joint Biomedical Engineering Department, University of North Carolina at Chapel Hill and North Carolina State University, Chapel Hill, NC, USA
- Andrea Giovannucci
  - Joint Biomedical Engineering Department, University of North Carolina at Chapel Hill and North Carolina State University, Chapel Hill, NC, USA
  - Neuroscience Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
  - Closed-Loop Engineering for Advanced Rehabilitation (CLEAR), North Carolina State University, Raleigh, NC, USA
- Matthew T Kaufman
  - Department of Organismal Biology and Anatomy, The University of Chicago, Chicago, IL, USA
  - Neuroscience Institute, The University of Chicago, Chicago, IL, USA
- Chethan Pandarinath
  - Wallace H. Coulter Department of Biomedical Engineering, Emory University and Georgia Institute of Technology, Atlanta, GA, USA
  - Department of Neurosurgery, Emory University, Atlanta, GA, USA
  - Center for Machine Learning, Georgia Institute of Technology, Atlanta, GA, USA
4
Rupprecht P, Carta S, Hoffmann A, Echizen M, Blot A, Kwan AC, Dan Y, Hofer SB, Kitamura K, Helmchen F, Friedrich RW. A database and deep learning toolbox for noise-optimized, generalized spike inference from calcium imaging. Nat Neurosci 2021; 24:1324-1337. [PMID: 34341584; PMCID: PMC7611618; DOI: 10.1038/s41593-021-00895-5] [Citation(s) in RCA: 40] [Impact Index Per Article: 13.3] [Received: 08/14/2020] [Accepted: 06/23/2021] [Indexed: 02/06/2023]
Abstract
Inference of action potentials ('spikes') from neuronal calcium signals is complicated by the scarcity of simultaneous measurements of action potentials and calcium signals ('ground truth'). In this study, we compiled a large, diverse ground truth database from publicly available and newly performed recordings in zebrafish and mice covering a broad range of calcium indicators, cell types and signal-to-noise ratios, comprising a total of more than 35 recording hours from 298 neurons. We developed an algorithm for spike inference (termed CASCADE) that is based on supervised deep networks, takes advantage of the ground truth database, infers absolute spike rates and outperforms existing model-based algorithms. To optimize performance for unseen imaging data, CASCADE retrains itself by resampling ground truth data to match the respective sampling rate and noise level; therefore, no parameters need to be adjusted by the user. In addition, we developed systematic performance assessments for unseen data, openly released a resource toolbox and provide a user-friendly cloud-based implementation.
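The idea of resampling ground truth to match a target sampling rate and noise level can be sketched as follows. This is not CASCADE's implementation: linear interpolation, additive Gaussian noise, and the names `resample_and_add_noise`, `src_rate`, `dst_rate`, and `noise_std` are all assumptions used only to make the concept concrete.

```python
import numpy as np

def resample_and_add_noise(trace, src_rate, dst_rate, noise_std, seed=0):
    """Resample a trace from src_rate to dst_rate (Hz) and add Gaussian noise.

    Linear interpolation and Gaussian noise are illustrative choices,
    not the procedure used by the CASCADE toolbox.
    """
    trace = np.asarray(trace, dtype=float)
    duration = len(trace) / src_rate
    src_t = np.arange(len(trace)) / src_rate
    dst_t = np.arange(int(round(duration * dst_rate))) / dst_rate
    # Interpolate the original samples onto the target time grid.
    resampled = np.interp(dst_t, src_t, trace)
    # Degrade the signal so it resembles noisier, unseen recordings.
    rng = np.random.default_rng(seed)
    return resampled + rng.normal(0.0, noise_std, size=resampled.shape)

# Hypothetical use: a 2 s ramp recorded at 30 Hz, matched to a 7.5 Hz dataset.
ground_truth = np.arange(60.0)
clean = resample_and_add_noise(ground_truth, 30.0, 7.5, noise_std=0.0)
noisy = resample_and_add_noise(ground_truth, 30.0, 7.5, noise_std=2.0)
```

A training set built this way would have the same frame rate as the unseen data and a comparable noise level, which is the stated motivation for retraining without user-tuned parameters.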
Affiliation(s)
- Peter Rupprecht
  - Brain Research Institute, University of Zürich, Zurich, Switzerland
  - Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland
- Stefano Carta
  - Brain Research Institute, University of Zürich, Zurich, Switzerland
- Adrian Hoffmann
  - Brain Research Institute, University of Zürich, Zurich, Switzerland
- Mayumi Echizen
  - Department of Neurophysiology, University of Tokyo, Tokyo, Japan
  - Department of Anesthesiology, Graduate School of Medical and Dental Sciences, Tokyo Medical and Dental University, Tokyo, Japan
- Antonin Blot
  - Sainsbury Wellcome Centre for Neural Circuits and Behaviour, University College London, London, United Kingdom
  - Biozentrum, University of Basel, Basel, Switzerland
- Alex C Kwan
  - Department of Psychiatry, Yale University School of Medicine, New Haven, CT, USA
- Yang Dan
  - Division of Neurobiology, Department of Molecular and Cell Biology, Helen Wills Neuroscience Institute, Howard Hughes Medical Institute, University of California, Berkeley, Berkeley, CA, USA
- Sonja B Hofer
  - Sainsbury Wellcome Centre for Neural Circuits and Behaviour, University College London, London, United Kingdom
  - Biozentrum, University of Basel, Basel, Switzerland
- Kazuo Kitamura
  - Department of Neurophysiology, University of Tokyo, Tokyo, Japan
  - Department of Neurophysiology, University of Yamanashi, Yamanashi, Japan
- Fritjof Helmchen
  - Brain Research Institute, University of Zürich, Zurich, Switzerland
- Rainer W Friedrich
  - Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland
  - University of Basel, Basel, Switzerland
5
Rodríguez-Collado A, Rueda C. A simple parametric representation of the Hodgkin-Huxley model. PLoS One 2021; 16:e0254152. [PMID: 34292948; PMCID: PMC8297874; DOI: 10.1371/journal.pone.0254152] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/19/2021] [Accepted: 06/21/2021] [Indexed: 02/04/2023] [Open access]
Abstract
The Hodgkin-Huxley model, decades after its first presentation, is still a reference model in neuroscience as it has successfully reproduced the electrophysiological activity of many organisms. The primary signal in the model represents the membrane potential of a neuron. A simple representation of this signal is presented in this paper. The new proposal is an adapted Frequency Modulated Möbius multicomponent model defined as a signal plus error model in which the signal is decomposed as a sum of waves. The main strengths of the method are the simple parametric formulation, the interpretability and flexibility of the parameters that describe and discriminate the waveforms, the estimators' identifiability and accuracy, and the robustness against noise. The approach is validated with a broad simulation experiment of Hodgkin-Huxley signals and real data from squid giant axons. Interesting differences between simulated and real data emerge from the comparison of the parameter configurations. Furthermore, the potential of the FMM parameters to predict Hodgkin-Huxley model parameters is shown using different Machine Learning methods. Finally, promising contributions of the approach in Spike Sorting and cell-type classification are detailed.
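The "signal plus error model in which the signal is decomposed as a sum of waves" can be written out as below. This follows the form in which Frequency Modulated Möbius (FMM) models are usually stated; the symbols and the exact parametrization are assumptions here and may differ from the adapted model in the paper.

```latex
% Signal-plus-error model with m FMM wave components (notation assumed).
\begin{align}
  X(t_i) &= M + \sum_{J=1}^{m} W_J(t_i) + e(t_i), \qquad i = 1, \dots, n, \\
  W_J(t) &= A_J \cos\!\bigl(\phi_J(t)\bigr), \\
  \phi_J(t) &= \beta_J + 2 \arctan\!\Bigl(\omega_J \tan\bigl(\tfrac{t - \alpha_J}{2}\bigr)\Bigr).
\end{align}
```

Under this notation, M is an intercept, A_J is the amplitude of wave J, α_J locates the wave in time, β_J and ω_J control its shape, and e(t_i) is the error term.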
Affiliation(s)
- Cristina Rueda
  - Department of Statistics and Operations Research, Universidad de Valladolid, Valladolid, Spain