Jia S, Li X, Huang T, Liu JK, Yu Z. Representing the dynamics of high-dimensional data with non-redundant wavelets. PATTERNS 2022; 3:100424. [PMID: 35510192 PMCID: PMC9058841 DOI: 10.1016/j.patter.2021.100424]
[Received: 08/26/2021] [Revised: 09/22/2021] [Accepted: 12/09/2021] [Indexed: 11/19/2022]
Abstract
A crucial question in data science is how to distill the meaningful information embedded in high-dimensional data into a low-dimensional set of features that can represent the original data at different levels. Wavelet analysis is a pervasive method for decomposing time-series signals into a few levels with detailed temporal resolution. However, the resulting wavelets are intertwined and over-represented across levels for each sample and across different samples within one population. Here, using neuroscience data comprising simulated spikes, experimental spikes, calcium imaging signals, and human electrocorticography signals, we leveraged conditional mutual information between wavelets for feature selection. We verified that the selected features are meaningful: they decode the stimulus or condition with high accuracy while using only a small set of features. These results provide a new approach to wavelet analysis for extracting essential features of the dynamics of spatiotemporal neural data, which in turn can support the design of novel machine learning models built on representative features.
WCMI can extract meaningful information from high-dimensional data
Extracted features from neural signals are non-redundant
Simple decoders can read out these features with superb accuracy
One of the essential questions in data science is how to extract meaningful information from high-dimensional data. A useful approach is to represent the data using a few features that retain the crucial information. The defining property of spatiotemporal data is its ever-changing dynamics in time. Wavelet analysis, a classical method for disentangling time series, can capture these temporal dynamics in detail. Here, we leveraged conditional mutual information between wavelets to select a small subset of non-redundant features. We demonstrated the efficiency and effectiveness of these features using various types of neuroscience data with different sampling frequencies at the level of the single cell, the cell population, and coarse-scale brain activity. Our results provide new insights into representing the dynamics of spatiotemporal data using a few fundamental features extracted by wavelet analysis, which may have wide implications for other types of data with rich temporal dynamics.
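The general idea of selecting non-redundant wavelet features with conditional mutual information can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the wavelet family, the discretization, or the selection rule, so the Haar wavelet, the median binarization, the plug-in entropy estimate, and the greedy "maximize the minimum conditional mutual information" criterion used here are all assumptions, and every function name is hypothetical.

```python
import math
import statistics
from collections import Counter


def haar_dwt(signal, levels):
    """Multi-level Haar transform (assumed wavelet; the source names none).

    Returns detail coefficients from fine to coarse, followed by the final
    approximation. A trailing unpaired sample at any level is dropped.
    """
    approx = list(signal)
    coeffs = []
    for _ in range(levels):
        if len(approx) < 2:
            break
        pairs = [(approx[i], approx[i + 1]) for i in range(0, len(approx) - 1, 2)]
        coeffs.extend((a - b) / math.sqrt(2) for a, b in pairs)
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    coeffs.extend(approx)
    return coeffs


def binarize(column):
    """Discretize one coefficient across trials by a median split (assumption)."""
    med = statistics.median(column)
    return [1 if v > med else 0 for v in column]


def wavelet_feature_columns(trials, levels):
    """One binarized column per wavelet coefficient, one entry per trial."""
    rows = [haar_dwt(t, levels) for t in trials]
    return [binarize(col) for col in zip(*rows)]


def entropy(*columns):
    """Plug-in joint entropy (bits) of one or more discrete columns."""
    counts = Counter(zip(*columns))
    n = sum(counts.values())
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def mutual_info(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)


def cond_mutual_info(x, y, z):
    """I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)."""
    return entropy(x, z) + entropy(y, z) - entropy(x, y, z) - entropy(z)


def select_features(features, labels, k):
    """Greedy selection of up to k feature indices.

    The first pick maximizes I(feature; label). Each later pick maximizes the
    minimum of I(feature; label | s) over already selected features s, so a
    feature that duplicates a selected one scores zero.
    """
    selected = []
    remaining = list(range(len(features)))
    while remaining and len(selected) < k:
        def score(j):
            if not selected:
                return mutual_info(features[j], labels)
            return min(
                cond_mutual_info(features[j], labels, features[s])
                for s in selected
            )
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

In use, `wavelet_feature_columns(trials, levels)` would turn a list of equal-length time series into discretized feature columns, and `select_features(columns, labels, k)` would return the indices of at most `k` coefficients to pass to a decoder. The plug-in entropy estimate is biased for small trial counts, so a real analysis would likely need a more careful estimator.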