1. Prototype Learning for Medical Time Series Classification via Human-Machine Collaboration. Sensors (Basel). 2024;24:2655. PMID: 38676273; PMCID: PMC11054195; DOI: 10.3390/s24082655.
Abstract
Deep neural networks must address the dual challenge of delivering high-accuracy predictions and providing user-friendly explanations. While deep models are widely used in time series modeling, deciphering the principles that govern their outputs remains a significant challenge. This matters for building trusted models and facilitating domain-expert validation, so that users and domain experts can apply them confidently in high-risk decision-making contexts (e.g., decision-support systems in healthcare). In this work, we put forward a deep prototype learning model that supports interpretable and manipulable modeling and classification of medical time series (i.e., ECG signals). Specifically, we first optimize the representation of single-heartbeat data using a bidirectional long short-term memory network and an attention mechanism, and then construct prototypes during the training phase. The final classification outcomes (i.e., normal sinus rhythm, atrial fibrillation, and other rhythm) are determined by comparing the input with the learned prototypes. Moreover, the proposed model provides a human-machine collaboration mechanism that allows domain experts to refine the prototypes with their expertise and thereby further improve performance. In contrast to the human-in-the-loop paradigm, where humans primarily act as supervisors or correctors who intervene when required, our approach treats both parties as partners, enabling more fluid and integrated interaction.
The experimental results show that, in the binary classification task of distinguishing normal sinus rhythm from atrial fibrillation, the proposed model performs marginally below certain established baselines, such as convolutional neural networks (CNNs) and bidirectional long short-term memory networks with attention (Bi-LSTM-Attn), but clearly surpasses other contemporary state-of-the-art prototype baseline models. In the three-class task covering normal sinus rhythm, atrial fibrillation, and other rhythm, it performs significantly better than these prototype baselines, reaching a prediction accuracy of 0.8414 with macro precision, recall, and F1-score of 0.8449, 0.8224, and 0.8235, respectively, thus achieving both high classification accuracy and good interpretability.
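The comparison-to-prototypes step the abstract describes can be sketched as a nearest-prototype rule over encoder embeddings. This is a minimal sketch: the 2-D "embeddings" and prototype values below are hypothetical stand-ins for the paper's learned Bi-LSTM-attention representation.

```python
import numpy as np

def nearest_prototype_predict(embeddings, prototypes, prototype_labels):
    """Assign each embedded heartbeat to the class of its closest prototype.

    embeddings: (n_samples, d) array of encoder outputs.
    prototypes: (n_prototypes, d) array learned during training.
    prototype_labels: (n_prototypes,) class label of each prototype.
    """
    # Euclidean distance from every sample to every prototype.
    dists = np.linalg.norm(embeddings[:, None, :] - prototypes[None, :, :], axis=2)
    return prototype_labels[np.argmin(dists, axis=1)]

# Toy demo: two 2-D prototypes standing in for "normal" (0) and "AF" (1).
protos = np.array([[0.0, 0.0], [5.0, 5.0]])
labels = np.array([0, 1])
x = np.array([[0.2, -0.1], [4.8, 5.3]])
print(nearest_prototype_predict(x, protos, labels))  # -> [0 1]
```

Because the decision is a distance comparison against named prototypes, an expert can inspect or edit a prototype directly, which is what makes the human-machine collaboration step possible.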
2. Monitoring Flow-Forming Processes Using Design of Experiments and a Machine Learning Approach Based on Randomized-Supervised Time Series Forest and Recursive Feature Elimination. Sensors (Basel). 2024;24:1527. PMID: 38475063; DOI: 10.3390/s24051527.
Abstract
The machines of WF Maschinenbau process metal blanks into various workpieces using so-called flow-forming processes. The quality of these workpieces depends largely on the quality of the blanks and the condition of the machine. This creates an urgent need for automated monitoring of the forming processes and the condition of the machine. Since the complexity of the flow-forming processes makes physical modeling impossible, the present work deals with data-driven modeling using machine learning algorithms. The main contributions of this work lie in showcasing the feasibility of utilizing machine learning and sensor data to monitor flow-forming processes, along with developing a practical approach for this purpose. The approach includes an experimental design capable of providing the necessary data, as well as a procedure for preprocessing the data and extracting features that capture the information needed by the machine learning models to detect defects in the blank and the machine. To make efficient use of the small number of experiments available, the experimental design is generated using Design of Experiments methods. It consists of two parts. In the first part, a pre-selection of the influencing variables relevant to the forming process is performed. In the second part of the design, the selected variables are investigated in more detail. The preprocessing procedure consists of feature engineering, feature extraction, and feature selection. In the feature engineering step, the data set is augmented with time series variables that are meaningful in the domain. For feature extraction, an algorithm was developed based on the mechanisms of the r-STSF, a state-of-the-art algorithm for time series classification, extending them for multivariate time series and metric target variables.
This feature extraction algorithm itself can be seen as an additional contribution of this work, because it is not tied to the application domain of monitoring flow-forming processes, but can be used as a feature extraction algorithm for multivariate time series classification in general. For feature selection, a Recursive Feature Elimination is employed. With the resulting features, random forests are trained to detect several quality features of the blank and defects of the machine. The trained models achieve good prediction accuracy for most of the target variables. This shows that the application of machine learning is a promising approach for the monitoring of flow-forming processes, which requires further investigation for confirmation.
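Recursive Feature Elimination is available off the shelf; a minimal sketch with scikit-learn's `RFE` wrapped around a random forest regressor (matching the paper's metric target variables). The data here is synthetic, and in the paper the candidate features come from the r-STSF-based extractor rather than raw columns.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFE

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))  # 10 candidate features
# Only features 0 and 3 actually drive the target.
y = 3 * X[:, 0] - 2 * X[:, 3] + rng.normal(scale=0.1, size=200)

# Recursively drop the least important feature until 2 remain.
selector = RFE(RandomForestRegressor(n_estimators=50, random_state=0),
               n_features_to_select=2, step=1)
selector.fit(X, y)
print(np.flatnonzero(selector.support_))  # expected: features 0 and 3
```

RFE repeatedly refits the estimator and discards the lowest-importance feature, which suits tree ensembles because they expose `feature_importances_` directly.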
3. Random Convolutional Kernel Transform with Empirical Mode Decomposition for Classification of Insulators from Power Grid. Sensors (Basel). 2024;24:1113. PMID: 38400271; PMCID: PMC10893376; DOI: 10.3390/s24041113.
Abstract
The electrical energy supply relies on the satisfactory operation of insulators. The ultrasound recorded from insulators in different conditions has a time series output, which can be used to classify faulty insulators. The random convolutional kernel transform (Rocket) algorithms use convolutional filters to extract various features from the time series data. This paper proposes a combination of Rocket algorithms, machine learning classifiers, and empirical mode decomposition (EMD) methods, such as complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN), empirical wavelet transform (EWT), and variational mode decomposition (VMD). The results show that the EMD methods, combined with MiniRocket, significantly improve the accuracy of logistic regression in insulator fault diagnosis. The proposed strategy achieves an accuracy of 0.992 using CEEMDAN, 0.995 with EWT, and 0.980 with VMD. These results highlight the potential of incorporating EMD methods in insulator failure detection models to enhance the safety and dependability of power systems.
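The core ROCKET mechanism — random convolutional kernels whose outputs are summarized by the maximum and the proportion of positive values (PPV) and then fed to a linear classifier — can be sketched as follows. This is a toy illustration of the idea, not the MiniRocket implementation, and it omits the EMD preprocessing step the paper adds.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)

def rocket_features(series, kernels):
    """Max and proportion-of-positive-values (PPV) per random kernel."""
    feats = []
    for w, b in kernels:
        conv = np.convolve(series, w, mode="valid") + b
        feats += [conv.max(), (conv > 0).mean()]
    return feats

# 20 random kernels of length 9 (the real ROCKET also randomises
# kernel length, dilation, and padding).
kernels = [(rng.normal(size=9), rng.normal()) for _ in range(20)]

# Toy task: separate noisy sine segments from noisy square-wave segments.
t = np.linspace(0, 4 * np.pi, 128)
X, y = [], []
for _ in range(30):
    X.append(rocket_features(np.sin(t) + rng.normal(scale=0.2, size=128), kernels))
    y.append(0)
    X.append(rocket_features(np.sign(np.sin(t)) + rng.normal(scale=0.2, size=128), kernels))
    y.append(1)

clf = LogisticRegression(max_iter=2000).fit(X, y)
print(clf.score(X, y))  # training accuracy on this toy task
```

In the paper's pipeline, an EMD variant (CEEMDAN, EWT, or VMD) would first decompose each ultrasound signal, and the kernel features would be computed on the resulting components.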
4. Slope Entropy Characterisation: An Asymmetric Approach to Threshold Parameters Role Analysis. Entropy (Basel). 2024;26:82. PMID: 38248207; PMCID: PMC10814979; DOI: 10.3390/e26010082.
Abstract
Slope Entropy (SlpEn) is a novel method recently proposed in the field of time series entropy estimation. In addition to the well-known embedded dimension parameter, m, used in other methods, it applies two additional thresholds, denoted as δ and γ, to derive a symbolic representation of a data subsequence. The original paper introducing SlpEn provided some guidelines for recommended specific values of these two parameters, which have been successfully followed in subsequent studies. However, a deeper understanding of the role of these thresholds is necessary to explore the potential for further SlpEn optimisations. Some works have already addressed the role of δ, but in this paper, we extend this investigation to include the role of γ and explore the impact of using an asymmetric scheme to select threshold values. We conduct a comparative analysis between the standard SlpEn method as initially proposed and an optimised version obtained through a grid search to maximise signal classification performance based on SlpEn. The results confirm that the optimised version achieves higher time series classification accuracy, albeit at the cost of significantly increased computational complexity.
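A minimal SlpEn sketch: each difference between consecutive samples is mapped to one of five symbols using the thresholds δ and γ, and Shannon entropy is computed over the (m − 1)-length symbol patterns. Normalisation conventions vary between papers and are omitted here.

```python
import math
from collections import Counter

def slope_entropy(x, m=3, gamma=1.0, delta=1e-3):
    """Sketch of Slope Entropy (SlpEn) with thresholds delta and gamma."""
    def symbol(d):
        if d > gamma:
            return 2
        if d > delta:
            return 1
        if d >= -delta:
            return 0      # near-zero slope ("tie" region controlled by delta)
        if d >= -gamma:
            return -1
        return -2

    symbols = [symbol(b - a) for a, b in zip(x, x[1:])]
    # Sliding windows of m - 1 symbols form the patterns.
    patterns = [tuple(symbols[i:i + m - 1]) for i in range(len(symbols) - m + 2)]
    counts = Counter(patterns)
    total = len(patterns)
    return -sum((c / total) * math.log(c / total) for c in counts.values())

# A constant series yields a single repeated pattern (zero entropy),
# while alternation spreads probability over several patterns.
print(slope_entropy([0.0, 1.0] * 25) > slope_entropy([1.0] * 50))  # -> True
```

The asymmetric scheme studied in this paper would replace the single γ and δ with separate thresholds for the positive and negative slope regions.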
5. Neural fingerprinting on MEG time series using MiniRocket. Front Neurosci. 2023;17:1229371. PMID: 37799343; PMCID: PMC10547883; DOI: 10.3389/fnins.2023.1229371.
Abstract
Neural fingerprinting is the identification of individuals in a cohort based on neuroimaging recordings of brain activity. In magneto- and electroencephalography (M/EEG), it is common practice to use second-order statistical measures, such as correlation or connectivity matrices, when neural fingerprinting is performed. These measures or features typically require coupling between signal channels and often ignore the individual temporal dynamics. In this study, we show that, following recent advances in multivariate time series classification, such as the development of the RandOm Convolutional KErnel Transformation (ROCKET) classifier, it is possible to perform classification directly on short time segments from MEG resting-state recordings with remarkably high classification accuracies. In a cohort of 124 subjects, it was possible to assign windows of time series of 1 s in duration to the correct subject with above 99% accuracy. The achieved accuracies are vastly superior to those of previous methods while simultaneously requiring considerably shorter time segments.
6. Cropland Mapping Using Sentinel-1 Data in the Southern Part of the Russian Far East. Sensors (Basel). 2023;23:7902. PMID: 37765958; PMCID: PMC10536219; DOI: 10.3390/s23187902.
Abstract
Crop identification is one of the most important tasks in digital farming. The use of remote sensing data makes it possible to clarify the boundaries of fields and identify fallow land. This study considered the possibility of using the seasonal variation in the Dual-polarization Radar Vegetation Index (DpRVI), calculated from data acquired by the Sentinel-1B satellite between May and October 2021, as the main characteristic. Radar images of the Khabarovskiy District of the Khabarovsk Territory, as well as those of the Arkharinskiy, Ivanovskiy, and Oktyabrskiy districts in the Amur Region (Russian Far East), were obtained and processed. The identifiable classes were soybean and oat crops, as well as fallow land. Classification was carried out using the Support Vector Machines, Quadratic Discriminant Analysis (QDA), and Random Forest (RF) algorithms. The training (848 ha) and test (364 ha) samples were located in Khabarovskiy District. The best overall accuracy on the test set (82.0%) was achieved using RF. Classification accuracy at the field level was 79%. When using the QDA classifier on cropland in the Amur Region (2324 ha), the overall classification accuracy was 83.1% (F1 was 0.86 for soybean, 0.84 for fallow, and 0.79 for oat). Application of the Radar Vegetation Index (RVI) and the VV/VH ratio yielded overall classification accuracies in the Amur Region of 74.9% and 74.6%, respectively. Thus, using DpRVI achieved greater performance than the other SAR-derived indices, and it can be used to identify crops in the south of the Far East and serve as the basis for the automatic classification of cropland.
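The classification step — training a random forest on per-field seasonal index profiles — can be sketched as follows. The monthly DpRVI profiles below are synthetic stand-ins invented for illustration; the real values come from Sentinel-1 imagery.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)

# Hypothetical May..October DpRVI profiles per class (synthetic stand-ins).
profiles = {
    "soybean": [0.2, 0.3, 0.5, 0.7, 0.6, 0.3],
    "oat":     [0.3, 0.5, 0.6, 0.4, 0.2, 0.2],
    "fallow":  [0.2, 0.2, 0.3, 0.3, 0.3, 0.2],
}
X, y = [], []
for label, profile in profiles.items():
    for _ in range(40):  # 40 simulated fields per class, with noise
        X.append(np.array(profile) + rng.normal(scale=0.05, size=6))
        y.append(label)

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Classify one new simulated field from its seasonal profile.
season = np.array(profiles["soybean"]) + rng.normal(scale=0.05, size=6)
print(clf.predict([season])[0])  # expected: 'soybean'
```

The design choice mirrored here is that the whole seasonal trajectory, not a single acquisition date, is the feature vector, which is what lets the classifier separate crops with similar mid-season backscatter.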
7. Multivariate CNN Model for Human Locomotion Activity Recognition with a Wearable Exoskeleton Robot. Bioengineering (Basel). 2023;10:1082. PMID: 37760184; PMCID: PMC10525937; DOI: 10.3390/bioengineering10091082.
Abstract
This study introduces a novel convolutional neural network (CNN) architecture, encompassing both single- and multi-head designs, developed to identify a user's locomotion activity while using a wearable lower limb robot. Our research involved 500 healthy adult participants in an activities of daily living (ADL) space, conducted from 1 September to 30 November 2022. We collected prospective data to identify five locomotion activities (level-ground walking, stair ascent/descent, and ramp ascent/descent) across three terrains: flat ground, staircase, and ramp. To evaluate the predictive capabilities of the proposed CNN architectures, we compared their performance with three other models: one CNN and two hybrid models (CNN-LSTM and LSTM-CNN). Experiments were conducted using multivariate signals of various types obtained from electromyograms (EMGs) and the wearable robot. Our results reveal that the deeper CNN architecture significantly surpasses the performance of the three competing models. The proposed model, leveraging encoder data such as hip angles and velocities, along with postural signals such as roll, pitch, and yaw from the wearable lower limb robot, achieved superior performance with an inference speed of 1.14 s. Specifically, the F-measure of the proposed model reached 96.17%, compared to 90.68% for DDLMI, 94.41% for DeepConvLSTM, and 95.57% for LSTM-CNN.
8. XGSleeve: detecting sleeve incidents in well completion by using XGBoost classifier. Front Artif Intell. 2023;6:1243584. PMID: 37780836; PMCID: PMC10533988; DOI: 10.3389/frai.2023.1243584.
Abstract
The sliding sleeve plays a pivotal role in regulating fluid flow during hydraulic fracturing in shale oil extraction. However, concerns persist about its reliability, because repeated attempts to open the sleeve cause process inefficiencies. While downhole cameras can verify sleeve states, their high cost poses a limitation. This study proposes an alternative approach that detects sleeve incidents through downhole data analysis in lieu of cameras. It introduces "XGSleeve", a novel machine-learning methodology that combines hidden-Markov-model-based clustering with the XGBoost model to provide robust, operator-centric identification of sleeve incidents. The XGSleeve model achieves a commendable 86% precision in detecting sleeve incidents, which significantly curtails the need for repeated sleeve open-close attempts and thereby enhances operational efficiency and safety. By rectifying existing limitations in sleeve incident detection, XGSleeve fosters optimization, safety, and resilience in the oil and gas sector, and it underscores the potential for data-driven decision-making and for broader integration of AI and machine learning in oil and gas operations. As technology advances, such methodologies are poised to optimize processes, minimize environmental impact, and promote sustainable practices, contributing to the enduring growth and responsible management of global oil and gas resources.
9. Improved Recurrence Plots Compression Distance by Learning Parameter for Video Compression Quality. Entropy (Basel). 2023;25:953. PMID: 37372297; DOI: 10.3390/e25060953.
Abstract
As the Internet of Things is deployed widely, large volumes of time series data are generated every day, so classifying time series automatically has become important. Compression-based pattern recognition has attracted attention because it can analyze various kinds of data universally with few model parameters. RPCD (Recurrence Plots Compression Distance) is a compression-based time-series classification method. RPCD first transforms a time series into an image called a Recurrence Plot (RP). The distance between two time series is then defined as the dissimilarity between their RPs, which is computed from the file size obtained when an MPEG-1 encoder compresses the video that serializes the two images in order. In this paper, by analyzing RPCD, we give an important insight: the quality parameter for the MPEG-1 encoding, which controls the resolution of the compressed video, strongly influences classification performance. We also show that the optimal parameter value depends strongly on the dataset to be classified; interestingly, the optimal value for one dataset can make RPCD fall behind a naive random classifier on another dataset. Supported by these insights, we propose an improved version of RPCD named qRPCD, which searches for the optimal parameter value by means of cross-validation. Experimentally, qRPCD outperforms the original RPCD by about 4% in terms of classification accuracy.
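The recurrence-plot-plus-compression idea can be sketched with zlib standing in for the MPEG-1 encoder. This is a deliberately simplified stand-in: RPCD's actual dissimilarity comes from video-compressing the serialized plot pair, and the quality parameter discussed in the paper has no analogue here.

```python
import zlib
import numpy as np

def recurrence_plot(x, eps=0.1):
    """Binary recurrence plot: RP[i, j] = 1 when |x_i - x_j| <= eps."""
    x = np.asarray(x)
    return (np.abs(x[:, None] - x[None, :]) <= eps).astype(np.uint8)

def compression_distance(a, b):
    """Normalised compression distance between two recurrence plots.

    Similar plots compress better together than apart; zlib stands in
    for the MPEG-1 encoder used by the real RPCD.
    """
    ca = len(zlib.compress(a.tobytes()))
    cb = len(zlib.compress(b.tobytes()))
    cab = len(zlib.compress(a.tobytes() + b.tobytes()))
    return (cab - min(ca, cb)) / max(ca, cb)

t = np.linspace(0, 8 * np.pi, 200)
sine1, sine2 = np.sin(t), np.sin(t + 0.1)
noise = np.random.default_rng(0).normal(size=200)
d_similar = compression_distance(recurrence_plot(sine1), recurrence_plot(sine2))
d_different = compression_distance(recurrence_plot(sine1), recurrence_plot(noise))
print(d_similar < d_different)  # -> True
```

The paper's point is that the analogue of zlib's compression level, the MPEG-1 quality parameter, changes how much RP detail survives compression, and hence how discriminative the distance is.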
10. Multivariable time series classification for clinical mastitis detection and prediction in automated milking systems. J Dairy Sci. 2023;106:3448-3464. PMID: 36935240; DOI: 10.3168/jds.2022-22355.
Abstract
In this study, we developed a machine learning framework to detect clinical mastitis (CM) at the current milking (i.e., the same milking) and predict CM at the next milking (i.e., one milking before CM occurrence) at the quarter level. Time series quarter-level milking data were extracted from an automated milking system (AMS). For both CM detection and prediction, the best classification performance was obtained from the decision-tree-based ensemble models. Moreover, applying models to a data set containing the current milking and the past 9 milkings showed the best accuracy for detecting CM; modeling with a data set containing the current milking and the past 7 milkings yielded the best results for predicting CM. The models combined with oversampling methods resulted in specificity of 95% and 93% for CM detection and prediction, respectively, with the same sensitivity (82%) for both scenarios; when specificity was lowered to 80 to 83%, undersampling techniques enabled the models to increase sensitivity to 95%. We propose a feasible machine learning framework to identify CM in a timely manner using imbalanced data from an AMS, which could provide useful information for farmers to manage the negative effects of CM.
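The windowing scheme — one sample per milking built from the current value plus a fixed number of past milkings — can be sketched as:

```python
import numpy as np

def lag_window_features(series, n_lags):
    """Rows of [past n_lags values, current value] for each milking.

    Mirrors the setup where the current milking plus the past 9
    (detection) or 7 (prediction) milkings form one input sample.
    """
    series = np.asarray(series)
    rows = [series[i - n_lags:i + 1] for i in range(n_lags, len(series))]
    return np.array(rows)

# Toy conductivity-like sequence for one quarter (values are illustrative).
ec = np.arange(12, dtype=float)
X = lag_window_features(ec, n_lags=9)
print(X.shape)  # (3, 10): 3 samples, each with 9 past milkings + the current one
```

In the paper this windowed matrix (built from several milking variables, not one) is what the tree ensembles and the over/undersampling schemes operate on.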
11. Stimulus classification with electrical potential and impedance of living plants: comparing discriminant analysis and deep-learning methods. Bioinspir Biomim. 2023;18:025003. PMID: 36758242; DOI: 10.1088/1748-3190/acbad2.
Abstract
The physiology of living organisms, such as living plants, is complex and particularly difficult to understand on a macroscopic, organism-holistic level. Among the many options for studying plant physiology, electrical potential and tissue impedance are arguably simple measurement techniques that can be used to gather plant-level information. Despite the many possible uses, our research is exclusively driven by the idea of phytosensing, that is, interpreting living plants' signals to gather information about surrounding environmental conditions. As ready-to-use plant-level physiological models are not available, we consider the plant as a black box and apply statistics and machine learning to automatically interpret measured signals. In simple plant experiments, we expose Zamioculcas zamiifolia and Solanum lycopersicum (tomato) to four different stimuli: wind, heat, red light, and blue light. We measure electrical potential and tissue impedance signals. Given these signals, we evaluate a large variety of methods from statistical discriminant analysis and from deep learning for the classification problem of determining the stimulus to which the plant was exposed. We identify a set of methods that successfully classify stimuli with good accuracy, without a clear winner. The statistical approach is competitive, partially depending on data availability for the machine learning approach. Our extensive results show the feasibility of the black-box approach and can be used in future research to select appropriate classifier techniques for a given use case. In our own future research, we will exploit these methods to derive a phytosensing approach to monitoring air pollution in urban areas.
12. Video Stream Recognition Using Bitstream Shape for Mobile Network QoE. Sensors (Basel). 2023;23:2548. PMID: 36904751; PMCID: PMC10007105; DOI: 10.3390/s23052548.
Abstract
Video streaming service delivery is a challenging task for mobile network operators. Knowing which services clients are using could help ensure a specific quality of service and manage the users' experience. Additionally, mobile network operators could apply throttling, traffic prioritization, or differentiated pricing. However, due to the growth of encrypted Internet traffic, it has become difficult for network operators to recognize the type of service used by their clients. In this article, we propose and evaluate a method for recognizing video streams solely based on the shape of the bitstream on a cellular network communication channel. To classify bitstreams, we used a convolutional neural network that was trained on a dataset of download and upload bitstreams collected by the authors. We demonstrate that our proposed method achieves an accuracy of over 90% in recognizing video streams from real-world mobile network traffic data.
13. Slope Entropy Normalisation by Means of Analytical and Heuristic Reference Values. Entropy (Basel). 2022;25:66. PMID: 36673207; PMCID: PMC9858583; DOI: 10.3390/e25010066.
Abstract
Slope Entropy (SlpEn) is a very recently proposed entropy calculation method. It is based on the differences between consecutive values in a time series and two new input thresholds that assign a symbol to each resulting difference interval. As the histogram normalisation value, SlpEn uses the actual number of unique patterns found instead of the theoretically expected value. This maximises the information captured by the method but, as a consequence, SlpEn results do not usually fall within the classical [0,1] interval. Although this interval is not necessary at all for time series classification purposes, it is a convenient and common reference framework when entropy analyses take place. This paper describes a method to keep SlpEn results within this interval, improving the interpretability and comparability of the measure in the same way as for other methods. It is based on a max-min normalisation scheme, described in two steps. First, an analytic normalisation is proposed using known but very conservative bounds. Afterwards, these bounds are refined using heuristics about the behaviour of the number of patterns found in deterministic and random time series. The results confirm the suitability of the proposed approach, using a mixture of the two methods.
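The max-min step itself is the standard rescaling; the bounds below are hypothetical placeholders, whereas the paper first derives conservative analytic bounds and then tightens them heuristically.

```python
def max_min_normalise(value, lower, upper):
    """Map a raw entropy value into [0, 1] given known bounds."""
    return (value - lower) / (upper - lower)

# Hypothetical illustration: a raw SlpEn of 1.9 with bounds [0, 3.2]
# maps to roughly 0.59.
print(max_min_normalise(1.9, 0.0, 3.2))
```

The quality of the bounds is the whole game: looser bounds keep every value inside [0, 1] but compress the useful range, which is why the heuristic refinement matters.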
14. A Systematic Review of Time Series Classification Techniques Used in Biomedical Applications. Sensors (Basel). 2022;22:8016. PMID: 36298367; PMCID: PMC9611376; DOI: 10.3390/s22208016.
Abstract
Background: Digital clinical measures collected via various digital sensing technologies such as smartphones, smartwatches, wearables, and ingestible and implantable sensors are increasingly used by individuals and clinicians to capture the health outcomes or behavioral and physiological characteristics of individuals. Time series classification (TSC) is very commonly used for modeling digital clinical measures. While deep learning models for TSC are very common and powerful, there exist some fundamental challenges. This review presents the non-deep learning models that are commonly used for time series classification in biomedical applications that can achieve high performance. Objective: We performed a systematic review to characterize the techniques that are used in time series classification of digital clinical measures throughout all the stages of data processing and model building. Methods: We conducted a literature search on PubMed, as well as the Institute of Electrical and Electronics Engineers (IEEE), Web of Science, and SCOPUS databases using a range of search terms to retrieve peer-reviewed articles that report on the academic research about digital clinical measures from a five-year period between June 2016 and June 2021. We identified and categorized the research studies based on the types of classification algorithms and sensor input types. Results: We found 452 papers in total from four different databases: PubMed, IEEE, Web of Science Database, and SCOPUS. After removing duplicates and irrelevant papers, 135 articles remained for detailed review and data extraction. Among these, engineered features using time series methods that were subsequently fed into widely used machine learning classifiers were the most commonly used technique, and also most frequently achieved the best performance metrics (77 out of 135 articles). 
Statistical modeling algorithms (24 out of 135 articles) were the second most common and also the second-best-performing classification technique. Conclusions: In this review paper, the time series classification models and interpretation methods used in biomedical applications are summarized and categorized. While high time series classification performance has been achieved in digital clinical, physiological, or biomedical measures, no standard benchmark datasets, modeling methods, or reporting methodology exist. There is no single widely used method for time series model development or feature interpretation; however, many different methods have proven successful.
15. Slope Entropy Characterisation: The Role of the δ Parameter. Entropy (Basel). 2022;24:1456. PMID: 37420476; DOI: 10.3390/e24101456.
Abstract
Many time series entropy calculation methods have been proposed in the last few years. They are mainly used as numerical features for signal classification in any scientific field where data series are involved. We recently proposed a new method, Slope Entropy (SlpEn), based on the relative frequency of differences between consecutive samples of a time series, thresholded using two input parameters, γ and δ. In principle, δ was proposed to account for differences in the vicinity of the 0 region (namely, ties) and, therefore, was usually set at small values such as 0.001. However, there is no study that really quantifies the role of this parameter using this default or other configurations, despite the good SlpEn results so far. The present paper addresses this issue, removing δ from the SlpEn calculation to assess its real influence on classification performance, or optimising its value by means of a grid search in order to find out if other values beyond the 0.001 value provide significant time series classification accuracy gains. Although the inclusion of this parameter does improve classification accuracy according to experimental results, gains of 5% at most probably do not support the additional effort required. Therefore, SlpEn simplification could be seen as a real alternative.
16. Predicting Abnormalities in Laboratory Values of Patients in the Intensive Care Unit Using Different Deep Learning Models: Comparative Study. JMIR Med Inform. 2022;10:e37658. PMID: 36001363; PMCID: PMC9453586; DOI: 10.2196/37658.
Abstract
Background: In recent years, the volume of medical knowledge and health data has increased rapidly. For example, the increased availability of electronic health records (EHRs) provides accurate, up-to-date, and complete information about patients at the point of care and enables medical staff to have quick access to patient records for more coordinated and efficient care. With this increase in knowledge, the complexity of accurate, evidence-based medicine tends to grow all the time. Health care workers must deal with an increasing amount of data and documentation. Meanwhile, relevant patient data are frequently overshadowed by a layer of less relevant data, causing medical staff to often miss important values or abnormal trends and their importance to the progression of the patient's case. Objective: The goal of this work is to analyze the current laboratory results for patients in the intensive care unit (ICU) and classify which of these lab values could be abnormal the next time the test is done. Detecting near-future abnormalities can be useful to support clinicians in their decision-making process in the ICU by drawing their attention to the important values and focusing future lab testing, saving them both time and money. Additionally, it will give doctors more time to spend with patients, rather than skimming through a long list of lab values. Methods: We used Structured Query Language to extract 25 lab values for mechanically ventilated patients in the ICU from the MIMIC-III and eICU data sets. Additionally, we applied time-windowed sampling and holding, and a support vector machine, to fill in the missing values in the sparse time series, as well as the Tukey range to detect and delete anomalies. Then, we used the data to train 4 deep learning models for time series classification, as well as a gradient boosting-based algorithm, and compared their performance on both data sets.
Results The models tested in this work (deep neural networks and gradient boosting), combined with the preprocessing pipeline, achieved an accuracy of at least 80% on the multilabel classification task. Moreover, the model based on the multiple convolutional neural network outperformed the other algorithms on both data sets, with the accuracy exceeding 89%. Conclusions In this work, we show that using machine learning and deep neural networks to predict near-future abnormalities in lab values can achieve satisfactory results. Our system was trained, validated, and tested on 2 well-known data sets to ensure that our system bridged the reality gap as much as possible. Finally, the model can be used in combination with our preprocessing pipeline on real-life EHRs to improve patients’ diagnosis and treatment.
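The Tukey-range anomaly step mentioned in the Methods can be sketched as a standard interquartile-range fence (a minimal editorial illustration; the fence factor k=1.5 is the conventional default and an assumption here, as the abstract does not state the parameters used):

```python
import statistics

def tukey_outliers(values, k=1.5):
    """Flag values outside the Tukey fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lo or v > hi]
```

In a preprocessing pipeline like the one described, the flagged values would simply be deleted before model training.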
Collapse
|
17
|
Augmentation of Human Action Datasets with Suboptimal Warping and Representative Data Samples. SENSORS 2022; 22:s22082947. [PMID: 35458931 PMCID: PMC9027434 DOI: 10.3390/s22082947] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 02/16/2022] [Revised: 03/29/2022] [Accepted: 04/09/2022] [Indexed: 01/27/2023]
Abstract
The popularity of action recognition (AR) approaches and the need to improve their effectiveness require the generation of artificial samples addressing the nonlinearity of the time-space, the scarcity of data points, or their variability. Therefore, in this paper, a novel approach to time series augmentation is proposed. The method improves the suboptimal warped time series generator algorithm (SPAWNER), introducing constraints based on identified AR-related problems with generated data points. Specifically, the proposed ARSPAWNER removes potential new time series that do not offer additional knowledge to the examples of a class or are created far from the occupied area. The constraints are based on statistics of the time series of AR classes and their representative examples inferred with the dynamic time warping barycenter averaging (DBA) technique. Extensive experiments performed on eight AR datasets using three popular time series classifiers reveal the superiority of the introduced method over related approaches.
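Both SPAWNER and DBA build on dynamic time warping (DTW) alignment. A minimal DTW distance with absolute-difference cost can be sketched as follows (an editorial illustration of the underlying primitive, not the authors' implementation):

```python
def dtw(a, b):
    """Plain dynamic time warping distance between two sequences.

    Classic O(n*m) dynamic program: each cell holds the cheapest cumulative
    cost of aligning prefixes a[:i] and b[:j] under |a[i]-b[j]| step cost.
    """
    INF = float("inf")
    n, m = len(a), len(b)
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j],      # insertion
                                 D[i][j - 1],      # deletion
                                 D[i - 1][j - 1])  # match
    return D[n][m]
```

Because DTW tolerates local stretching, `dtw([1, 2, 3], [1, 2, 2, 3])` is zero even though the sequences differ in length; this elasticity is what warping-based augmenters exploit.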
Collapse
|
18
|
Classifying Muscle States with One-Dimensional Radio-Frequency Signals from Single Element Ultrasound Transducers. SENSORS 2022; 22:s22072789. [PMID: 35408403 PMCID: PMC9002976 DOI: 10.3390/s22072789] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 03/02/2022] [Revised: 03/30/2022] [Accepted: 04/02/2022] [Indexed: 11/24/2022]
Abstract
The reliable assessment of muscle states, such as contracted vs. non-contracted muscles or relaxed vs. fatigued muscles, is crucial in many sports and rehabilitation scenarios, such as the assessment of therapeutic measures. The goal of this work was to deploy machine learning (ML) models based on one-dimensional (1-D) sonomyography (SMG) signals to facilitate low-cost and wearable ultrasound devices. One-dimensional SMG is a non-invasive technique using 1-D ultrasound radio-frequency signals to measure muscle states, with the advantage of being able to acquire information from deep soft tissue layers. To mimic real-life scenarios, we did not emphasize the acquisition of particularly distinct signals. The ML models exploited muscle contraction signals of eight volunteers and muscle fatigue signals of 21 volunteers. We evaluated them with different schemes on a variety of data types, such as unprocessed or processed raw signals, and found that comparatively simple ML models, such as Support Vector Machines or Logistic Regression, yielded the best performance with respect to accuracy and evaluation time. We conclude that our framework for muscle contraction and muscle fatigue classification is well suited to facilitate low-cost and wearable devices based on ML models using 1-D SMG.
Collapse
|
19
|
A New Bearing Fault Diagnosis Method Based on Capsule Network and Markov Transition Field/Gramian Angular Field. SENSORS 2021; 21:s21227762. [PMID: 34833837 PMCID: PMC8622607 DOI: 10.3390/s21227762] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 09/27/2021] [Revised: 11/11/2021] [Accepted: 11/19/2021] [Indexed: 12/02/2022]
Abstract
Compared to time-consuming and unreliable manual analysis, intelligent fault diagnosis techniques using deep learning models, with their multi-layer nonlinear mapping capabilities, can improve diagnosis accuracy. This paper proposes a model to perform fault diagnosis and classification using a time series of vibration sensor data as the input. The model encodes the raw vibration signal into a two-dimensional image and performs feature extraction and classification with a deep convolutional neural network or an improved capsule network. A fault diagnosis technique based on the Gramian Angular Field (GAF), the Markov Transition Field (MTF), and the Capsule Network is proposed. Experiments conducted on a bearing failure dataset from Case Western Reserve University investigated the impact of the two encoding methods and different network structures on diagnosis accuracy. The results show that the GAF technique retains more complete fault characteristics, while the MTF technique contains fewer fault characteristics but more dynamic characteristics. Therefore, the proposed method incorporates GAF images and MTF images as a dual-channel image input to the capsule network, enabling the network to obtain a more complete fault signature. In multiple sets of experiments on this dataset, the capsule network in the proposed model showed an advantage over other convolutional neural networks and compared favorably with fault diagnosis methods proposed by other researchers.
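The GAF encoding described above rescales the series to [-1, 1], maps each value to a polar angle, and builds a pairwise image from the angles. A minimal sketch of the summation variant (GASF), cos(phi_i + phi_j), follows; function and variable names are the editor's, and a non-constant input series is assumed:

```python
import math

def gasf(series):
    """Gramian Angular Summation Field of a 1-D series.

    Rescale to [-1, 1], take phi = arccos(x), and form the n x n image
    G[i][j] = cos(phi_i + phi_j). Assumes min(series) != max(series).
    """
    lo, hi = min(series), max(series)
    scaled = [2.0 * (v - lo) / (hi - lo) - 1.0 for v in series]
    # clamp guards against tiny floating-point excursions outside [-1, 1]
    phi = [math.acos(max(-1.0, min(1.0, s))) for s in scaled]
    return [[math.cos(a + b) for b in phi] for a in phi]
```

The resulting 2-D array is what gets fed, alongside the MTF image, as a dual-channel input to an image classifier.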
Collapse
|
20
|
Matrix Profile-Based Interpretable Time Series Classifier. Front Artif Intell 2021; 4:699448. [PMID: 34746768 PMCID: PMC8564499 DOI: 10.3389/frai.2021.699448] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/23/2021] [Accepted: 09/20/2021] [Indexed: 11/19/2022] Open
Abstract
Time series classification (TSC) is a pervasive and transversal problem in various fields ranging from disease diagnosis to anomaly detection in finance. Unfortunately, the most effective models used by Artificial Intelligence (AI) systems for TSC are not interpretable and hide the logic of the decision process, making them unusable in sensitive domains. Recent research has focused on explanation methods to pair with these opaque classifiers and compensate for this weakness. However, a TSC approach that is transparent by design and is simultaneously efficient and effective is even more preferable. To this aim, we propose an interpretable TSC method based on patterns that can be extracted from the Matrix Profile (MP) of the time series in the training set. A smart design of the classification procedure allows obtaining an efficient and effective transparent classifier, modeled as a decision tree that expresses the reasons for the classification as the presence of discriminative subsequences. Quantitative and qualitative experimentation shows that the proposed method outperforms state-of-the-art interpretable approaches.
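At its core, the Matrix Profile records, for each length-m subsequence, the distance to its nearest non-trivial match elsewhere in the series. A brute-force sketch follows (plain Euclidean distance for brevity; real implementations typically use z-normalized distance and much faster algorithms):

```python
import math

def matrix_profile(ts, m):
    """Naive matrix profile: for each length-m subsequence, the distance to
    its nearest neighbor outside a +/- m exclusion zone (brute force)."""
    n = len(ts) - m + 1
    subs = [ts[i:i + m] for i in range(n)]
    profile = []
    for i in range(n):
        best = float("inf")
        for j in range(n):
            if abs(i - j) < m:  # exclusion zone: skip trivial self-matches
                continue
            best = min(best, math.dist(subs[i], subs[j]))
        profile.append(best)
    return profile
```

Low profile values mark repeated motifs; high values mark discords, and both are the kind of discriminative subsequences the classifier above reasons about.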
Collapse
|
21
|
Artificial Intelligence Meets Marine Ecotoxicology: Applying Deep Learning to Bio-Optical Data from Marine Diatoms Exposed to Legacy and Emerging Contaminants. BIOLOGY 2021; 10:biology10090932. [PMID: 34571809 PMCID: PMC8470171 DOI: 10.3390/biology10090932] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 08/19/2021] [Revised: 09/03/2021] [Accepted: 09/15/2021] [Indexed: 11/24/2022]
Abstract
Simple Summary
Our work is motivated by the increasing production of chemicals with environmentally harmful effects on our aquatic ecosystems. We show that it is possible to detect and distinguish the presence of several different emerging contaminants, using the photochemical responses of a microalgal species from the most abundant phytoplankton group in the oceans. We use several machine learning and deep learning models that operate on chlorophyll fluorescence induction curves, which are composed of fluorescence values taken at different time steps from the microalgae exposure trials, achieving up to 97.65% accuracy when predicting the type of contaminant, and up to 100% in several cases when predicting the exposure concentration. Our results show that the combination of these models with the fluorescence induction curves creates a powerful tool for ecotoxicity assessment, capable of classifying model organisms by their contaminant exposure, both in terms of type and concentration, opening new doors for toxicophenomics developments.
Abstract
Over recent decades, the world has experienced the adverse consequences of uncontrolled development of multiple human activities. In recent years, a growing share of total chemical production has consisted of environmentally harmful compounds, the majority of which have significant environmental impacts. These emerging contaminants (ECs) include a wide range of man-made chemicals (such as pesticides, cosmetics, personal and household care products, and pharmaceuticals) that are in worldwide use. Among these, several ECs raised concerns regarding their ecotoxicological effects and how to assess them efficiently. This is of particular interest if marine diatoms are considered as potential target species, due to their widespread distribution, their being the most abundant phytoplankton group in the oceans, and their key ecological roles.
Bio-optical ecotoxicity methods appear as reliable, fast, and high-throughput screening (HTS) techniques, providing large datasets with biological relevance on the mode of action of these ECs in phototrophic organisms, such as diatoms. However, from the large datasets produced, only a small amount of data is normally extracted for physiological evaluation, leaving out a large amount of information on the EC exposure. In the present paper, we use all the available information and evaluate the application of several machine learning and deep learning algorithms to predict the exposure of model organisms to different ECs at different doses, using a model marine diatom (Phaeodactylum tricornutum) as a test organism. The results show that 2D convolutional neural networks are the best method for predicting the type of EC to which the cultures were exposed, achieving a median accuracy of 97.65%, while Rocket is the best at predicting which concentration the cultures were subjected to, achieving a median accuracy of 100%.
Collapse
|
22
|
A Systematic Approach for Evaluating Artificial Intelligence Models in Industrial Settings. SENSORS 2021; 21:s21186195. [PMID: 34577398 PMCID: PMC8469892 DOI: 10.3390/s21186195] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 08/25/2021] [Revised: 09/10/2021] [Accepted: 09/11/2021] [Indexed: 11/16/2022]
Abstract
Artificial Intelligence (AI) is one of the hottest topics in our society, especially when it comes to solving data-analysis problems. Industries are conducting their digital shift, and AI is becoming a cornerstone technology for making decisions out of the huge amount of (sensor-based) data available on the production floor. However, such technology may be disappointing when deployed in real conditions. Despite good theoretical performance and high accuracy when trained and tested in isolation, a Machine-Learning (ML) model may provide degraded performance in real conditions. One reason may be fragility in properly handling unexpected or perturbed data. The objective of this paper is therefore to study the robustness of seven ML and Deep-Learning (DL) algorithms when classifying univariate time series under perturbations. A systematic approach is proposed for artificially injecting perturbations into the data and for evaluating the robustness of the models. This approach focuses on two perturbations that are likely to happen during data collection. Our experimental study, conducted on twenty sensor datasets from the public University of California Riverside (UCR) repository, shows a great disparity in the models' robustness under data quality degradation. Those results are used to analyse whether the impact of such perturbations can be predicted (thanks to decision trees), which would prevent us from having to test all perturbation scenarios. Our study shows that building such a predictor is not straightforward and suggests that such a systematic approach is needed for evaluating the robustness of AI models.
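The abstract does not specify the two perturbations used, so the following is a purely hypothetical illustration of the injection idea: corrupt a seeded random fraction of points with Gaussian noise, so that perturbed copies of a dataset are reproducible across model evaluations.

```python
import random

def perturb(ts, rate=0.1, scale=1.0, seed=0):
    """Hypothetical perturbation injector (not the paper's exact protocol):
    add Gaussian noise to a random fraction `rate` of the points.

    Seeding makes the perturbed series reproducible, which matters when
    comparing the robustness of several models on the same degraded data.
    """
    rng = random.Random(seed)
    out = list(ts)
    for i in range(len(out)):
        if rng.random() < rate:
            out[i] += rng.gauss(0.0, scale)
    return out
```

Sweeping `rate` and `scale` over a grid and re-scoring each classifier on the perturbed series is one simple way to realize the systematic evaluation the paper calls for.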
Collapse
|
23
|
W-TSS: A Wavelet-Based Algorithm for Discovering Time Series Shapelets. SENSORS 2021; 21:s21175801. [PMID: 34502692 PMCID: PMC8434226 DOI: 10.3390/s21175801] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/23/2021] [Revised: 08/24/2021] [Accepted: 08/24/2021] [Indexed: 11/16/2022]
Abstract
Many approaches to time series classification rely on machine learning methods. However, there is growing interest in going beyond black-box prediction models to understand the discriminatory features of a time series and their associations with outcomes. One promising method is time-series shapelets (TSS), which identifies maximally discriminative subsequences of time series. For example, in environmental health applications TSS could be used to identify short-term patterns in exposure time series (shapelets) associated with adverse health outcomes. Identification of candidate shapelets in TSS is computationally intensive. The original TSS algorithm used exhaustive search. Subsequent algorithms introduced efficiencies by trimming/aggregating the set of candidates or training candidates from initialized values, but these approaches have limitations. In this paper, we introduce Wavelet-TSS (W-TSS), a novel method for discovering candidate shapelets in TSS using wavelet transformation. We tested W-TSS on two datasets: (1) a synthetic example used in previous TSS studies and (2) a panel study relating exposures from residential air pollution sensors to symptoms in participants with asthma. Compared to previous TSS algorithms, W-TSS was more computationally efficient, more accurate, and was able to discover more discriminative shapelets. W-TSS does not require pre-specification of shapelet length.
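The primitive every TSS variant relies on, whatever its candidate-discovery strategy, is the minimum distance between a candidate shapelet and a time series. It can be sketched as a sliding-window minimum (an editorial illustration; plain Euclidean distance is used for brevity):

```python
import math

def shapelet_distance(series, shapelet):
    """Minimum Euclidean distance between a candidate shapelet and any
    equal-length subsequence of the series (the core TSS primitive)."""
    m = len(shapelet)
    return min(
        math.dist(series[i:i + m], shapelet)
        for i in range(len(series) - m + 1)
    )
```

A shapelet is "discriminative" when this distance is small for series of one class and large for the others; that separation is what the discovery algorithms above search for efficiently.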
Collapse
|
24
|
PFC: A Novel Perceptual Features-Based Framework for Time Series Classification. ENTROPY 2021; 23:e23081059. [PMID: 34441199 PMCID: PMC8391677 DOI: 10.3390/e23081059] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/09/2021] [Revised: 08/07/2021] [Accepted: 08/12/2021] [Indexed: 11/24/2022]
Abstract
Time series classification (TSC) is a significant problem in data mining with several applications in different domains. Mining different distinguishing features is the primary method. One promising approach is algorithms based on the morphological structure of time series, which are interpretable and accurate. However, existing structural feature-based algorithms, such as time series forest (TSF) and shapelets, traverse all features through many random combinations, which means that a lot of training time and computing resources are required to filter out meaningless features, and important distinguishing information can be ignored. To overcome this problem, in this paper, we propose a perceptual features-based framework for TSC. We are inspired by how humans observe time series and realize that there are usually only a few essential points that need to be remembered for a time series. Although a complex time series has a lot of details, a small number of data points is enough to describe the shape of the entire sample. First, we use improved perceptually important points (PIPs) to extract key points and use them as the basis for time series segmentation to obtain a combination of interval-level and point-level features. Second, we propose a framework to explore the effects of perceptual structural features combined with decision trees (DT), random forests (RF), and gradient boosting decision trees (GBDT) on TSC. The experimental results on the UCR datasets show that our work has achieved leading accuracy, which is instructive for follow-up research.
Collapse
|
25
|
CNN-based classification of fNIRS signals in motor imagery BCI system. J Neural Eng 2021; 18. [PMID: 33761480 DOI: 10.1088/1741-2552/abf187] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/09/2020] [Accepted: 03/24/2021] [Indexed: 11/11/2022]
Abstract
Objective. Development of a brain-computer interface (BCI) requires classification of brain neural activities into different states. Functional near-infrared spectroscopy (fNIRS) can measure brain activities and has great potential for BCI. In recent years, a large number of classification algorithms have been proposed, among which deep learning methods, especially convolutional neural network (CNN) methods, are successful. Since fNIRS signals have typical time series properties, we combined fNIRS data with several CNN-based time series classification (TSC) methods to classify BCI tasks. Approach. In this study, participants were recruited for a left- and right-hand motor imagery experiment, and the cerebral neural activities were recorded by fNIRS equipment (FOIRE-3000). TSC methods were used to distinguish the brain activities when imagining the left or right hand. We tested classification across all participants, for single participants, and across all participants using a single channel, and these methods achieved excellent results. We also compared the CNN-based TSC methods with traditional classification methods such as the support vector machine. Main results. Experiments showed that the CNN-based methods have significant advantages in classification accuracy: they achieved remarkable results in classifying left- and right-hand imagination tasks, reaching 98.6% accuracy across all participants, 100% accuracy for single participants, and 80.1% accuracy in single-channel classification with the best-performing channel. Significance. These results suggest that using CNN-based TSC methods can significantly improve BCI performance and also lay the foundation for the miniaturization and portability of training rehabilitation equipment.
Collapse
|
26
|
Prevention of Cooktop Ignition Using Detection and Multi-Step Machine Learning Algorithms. FIRE SAFETY JOURNAL 2021; 120:10.1016/j.firesaf.2020.103043. [PMID: 34511712 PMCID: PMC8431960 DOI: 10.1016/j.firesaf.2020.103043] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
This paper presents a study to examine the potential use of machine learning models to build a real-time detection algorithm for prevention of kitchen cooktop fires. Sixteen sets of time-dependent sensor signals were obtained from 60 normal/ignition cooking experiments. A total of 200 000 data instances are documented and analyzed. The raw data are preprocessed. Selected features are generated for time series data focusing on real-time detection applications. Utilizing the leave-one-out cross validation method, three machine learning models are built and tested. Parametric studies are carried out to understand the diversity, volume, and tendency of the data. Given the current dataset, the detection algorithm based on Support Vector Machine (SVM) provides the most reliable prediction (with an overall accuracy of 96.9 %) on pre-ignition conditions. Analyses indicate that using a multi-step approach can further improve overall prediction accuracy. The development of an accurate detection algorithm can provide reliable feedback to intercept ignition of unattended cooking and help reduce fire losses.
Collapse
|
27
|
GENDIS: Genetic Discovery of Shapelets. SENSORS 2021; 21:s21041059. [PMID: 33557169 PMCID: PMC7913966 DOI: 10.3390/s21041059] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 01/07/2021] [Revised: 01/29/2021] [Accepted: 02/01/2021] [Indexed: 11/21/2022]
Abstract
In the time series classification domain, shapelets are subsequences that are discriminative of a certain class. It has been shown that classifiers are able to achieve state-of-the-art results by taking the distances from the input time series to different discriminative shapelets as the input. Additionally, these shapelets can be visualized and thus possess an interpretable characteristic, making them appealing in critical domains, where longitudinal data are ubiquitous. In this study, a new paradigm for shapelet discovery is proposed, which is based on evolutionary computation. The advantages of the proposed approach are that: (i) it is gradient-free, which could allow escaping from local optima more easily and supports non-differentiable objectives; (ii) no brute-force search is required, making the algorithm scalable; (iii) the total amount of shapelets and the length of each of these shapelets are evolved jointly with the shapelets themselves, alleviating the need to specify this beforehand; (iv) entire sets are evaluated at once as opposed to single shapelets, which results in smaller final sets with fewer similar shapelets that result in similar predictive performances; and (v) the discovered shapelets do not need to be a subsequence of the input time series. We present the results of the experiments, which validate the enumerated advantages.
Collapse
|
28
|
Deep Temporal Convolution Network for Time Series Classification. SENSORS 2021; 21:s21020603. [PMID: 33467136 PMCID: PMC7830229 DOI: 10.3390/s21020603] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 12/11/2020] [Revised: 01/04/2021] [Accepted: 01/14/2021] [Indexed: 11/25/2022]
Abstract
A neural network that matches with a complex data function is likely to boost the classification performance as it is able to learn the useful aspect of the highly varying data. In this work, the temporal context of the time series data is chosen as the useful aspect of the data that is passed through the network for learning. By exploiting the compositional locality of the time series data at each level of the network, shift-invariant features can be extracted layer by layer at different time scales. The temporal context is made available to the deeper layers of the network by a set of data processing operations based on the concatenation operation. A matching learning algorithm for the revised network is described in this paper. It uses gradient routing in the backpropagation path. The framework as proposed in this work attains better generalization without overfitting the network to the data, as the weights can be pretrained appropriately. It can be used end-to-end with multivariate time series data in their raw form, without the need for manual feature crafting or data transformation. Data experiments with electroencephalogram signals and human activity signals show that with the right amount of concatenation in the deeper layers of the proposed network, it can improve the performance in signal classification.
Collapse
|
29
|
Deep Neural Network Sleep Scoring Using Combined Motion and Heart Rate Variability Data. SENSORS 2020; 21:s21010025. [PMID: 33374527 PMCID: PMC7793092 DOI: 10.3390/s21010025] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 11/21/2020] [Revised: 12/16/2020] [Accepted: 12/18/2020] [Indexed: 11/24/2022]
Abstract
Background: Performance of wrist actigraphy in assessing sleep not only depends on the sensor technology of the actigraph hardware but also on the attributes of the interpretative algorithm (IA). The objective of our research was to improve assessment of sleep quality, relative to existing IAs, through development of a novel IA using deep learning methods, utilizing as input activity count and heart rate variability (HRV) metrics of different window lengths (number of epochs of data). Methods: Simultaneously recorded polysomnography (PSG) and wrist actigraphy data of 222 participants were utilized. Classic deep learning models were applied to: (a) activity count alone (without HRV), (b) activity count + HRV (30-s window), (c) activity count + HRV (3-min window), and (d) activity count + HRV (5-min window) to ascertain the best set of inputs. A novel deep learning model (Haghayegh Algorithm, HA), founded on the best set of inputs, was developed, and its sleep scoring performance was then compared with the most popular University of California San Diego (UCSD) and Actiwatch proprietary IAs. Results: Activity count combined with HRV metrics calculated per 5-min window produced the highest agreement with PSG. HA showed 84.5% accuracy (5.3–6.2% higher than comparator IAs), 89.5% sensitivity (6.2% higher than UCSD IA and 6% lower than Actiwatch proprietary IA), 70.0% specificity (8.2–34.3% higher than comparator IAs), and 58.7% Kappa agreement (16–23% higher than comparator IAs) in detecting sleep epochs. HA did not differ significantly from PSG in deriving sleep parameters (sleep efficiency, total sleep time, sleep onset latency, and wake after sleep onset); moreover, the bias and mean absolute error of the HA model in estimating them were less than those of the comparator IAs. HA showed, respectively, 40.9% and 54.0% Kappa agreement with PSG in detecting rapid and non-rapid eye movement (REM and NREM) epochs.
Conclusions: The HA model simultaneously incorporating activity count and HRV metrics calculated per 5-min window demonstrates significantly better sleep scoring performance than existing popular IAs.
Collapse
|
30
|
A Case Driven Study of the Use of Time Series Classification for Flexibility in Industry 4.0. SENSORS 2020; 20:s20247273. [PMID: 33353201 PMCID: PMC7767197 DOI: 10.3390/s20247273] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 11/17/2020] [Revised: 12/10/2020] [Accepted: 12/14/2020] [Indexed: 11/29/2022]
Abstract
With the Industry 4.0 paradigm comes the convergence of Internet Technologies and Operational Technologies, and concepts such as the Industrial Internet of Things (IIoT), cloud manufacturing, and Cyber-Physical Systems (CPS). These concepts bring industries into the big data era and allow them to access potentially useful information in order to optimise the Overall Equipment Effectiveness (OEE); however, most European industries still rely on the Computer-Integrated Manufacturing (CIM) model, where the production systems run as independent systems (i.e., without any communication with the upper levels). Those production systems are controlled by a Programmable Logic Controller, in which a program is implemented that is static and rigid in the sense that the programmed routines cannot evolve over time unless a human modifies them. However, to go further in terms of flexibility, we are convinced that this requires moving away from the aforementioned old-fashioned and rigid automation to ML-based automation, i.e., where the control itself is based on decisions taken by ML algorithms. In order to verify this, we applied a time series classification method to a scale model of a factory using real industrial controllers, and widened the variety of parts the production line has to treat. This study shows that satisfactory results can be obtained only at the expense of human expertise (i.e., in both the industrial process and the ML process).
Collapse
|
31
|
Time Series Forecasting and Classification Models Based on Recurrent with Attention Mechanism and Generative Adversarial Networks. SENSORS 2020; 20:s20247211. [PMID: 33339314 PMCID: PMC7766176 DOI: 10.3390/s20247211] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 11/11/2020] [Revised: 12/11/2020] [Accepted: 12/13/2020] [Indexed: 12/02/2022]
Abstract
Time series classification and forecasting have long been studied with traditional statistical methods. Recently, deep learning has achieved remarkable successes in areas such as image, text, video, and audio processing. However, research studies conducted with deep neural networks in these two fields are not abundant. Therefore, in this paper, we aim to propose and evaluate several state-of-the-art neural network models for them. We first review the basics of representative models, namely long short-term memory and its variants, the temporal convolutional network, and the generative adversarial network. Then, long short-term memory with autoencoder and attention-based models, the temporal convolutional network, and the generative adversarial model are proposed and applied to time series classification and forecasting. Gaussian sliding window weights are proposed to speed up the training process. Finally, the performances of the proposed methods are assessed using five optimizers and loss functions on public benchmark datasets, and comparisons between the proposed temporal convolutional network and several classical models are conducted. Experiments show the proposed models' effectiveness and confirm that the temporal convolutional network is superior to long short-term memory models in sequence modeling. We conclude that the proposed temporal convolutional network reduces time consumption to around 80% of that of the others while retaining the same accuracy. The unstable training process of the generative adversarial network is circumvented by tuning hyperparameters and carefully choosing the Adam optimizer. The proposed generative adversarial network also achieves forecasting accuracy comparable with traditional methods.
Collapse
|
32
|
Classification of Actigraphy Records from Bipolar Disorder Patients Using Slope Entropy: A Feasibility Study. ENTROPY 2020; 22:e22111243. [PMID: 33287011 PMCID: PMC7711446 DOI: 10.3390/e22111243] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 09/05/2020] [Revised: 10/27/2020] [Accepted: 10/28/2020] [Indexed: 12/12/2022]
Abstract
Bipolar Disorder (BD) is an illness with high prevalence and a huge social and economic impact. It is recurrent, with a long-term evolution in most cases. Early treatment and continuous monitoring have proven very effective in mitigating the causes and consequences of BD. However, no tools are currently available for massive, semi-automatic monitoring and control of BD patients. Taking advantage of recent technological developments in the field of wearables, this paper studies the feasibility of classifying BD episodes using entropy measures, an approach successfully applied in a myriad of other physiological frameworks. This is a very difficult task, since actigraphy records are highly non-stationary and corrupted with artifacts (periods of no activity). The method devised uses a preprocessing stage to extract epochs of activity, and then applies a recently proposed quantification measure, Slope Entropy, which outperforms the most common entropy measures used in biomedical time series. The results confirm the feasibility of the proposed approach, since the three states involved in BD (depression, mania, and remission) can be significantly distinguished.
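Slope Entropy, used in this and the following entry, symbolizes each consecutive difference of the series with two thresholds (a steep threshold, usually written gamma, and a near-flat threshold delta) and then takes the Shannon entropy of the resulting symbol patterns. A sketch from the published definition, with normalization details assumed:

```python
import math
from collections import Counter

def slope_entropy(x, m=3, gamma=1.0, delta=1e-3):
    """Slope Entropy sketch: map slopes to 5 symbols, then compute the
    Shannon entropy of the (m-1)-symbol pattern distribution."""
    def symbol(d):
        if d > gamma:   return  2   # steep rise
        if d > delta:   return  1   # mild rise
        if d >= -delta: return  0   # roughly flat
        if d >= -gamma: return -1   # mild fall
        return -2                   # steep fall

    diffs = [x[i + 1] - x[i] for i in range(len(x) - 1)]
    symbols = [symbol(d) for d in diffs]
    patterns = [tuple(symbols[i:i + m - 1])
                for i in range(len(symbols) - m + 2)]
    counts = Counter(patterns)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())
```

A monotone ramp collapses to a single pattern (entropy 0), while irregular activity spreads mass over many patterns, which is the property exploited to separate depression, mania, and remission.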
|
33
|
Fever Time Series Analysis Using Slope Entropy. Application to Early Unobtrusive Differential Diagnosis. ENTROPY (BASEL, SWITZERLAND) 2020; 22:E1034. [PMID: 33286803 PMCID: PMC7597093 DOI: 10.3390/e22091034] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 08/31/2020] [Revised: 09/10/2020] [Accepted: 09/11/2020] [Indexed: 11/16/2022]
Abstract
Fever is a readily measurable physiological response that has been used in medicine for centuries. However, the information it provides has been greatly limited by a plain thresholding approach, overlooking the additional information carried by temporal variations and by temperature values below that threshold, which are also representative of the subject's status. In this paper, we propose to utilize continuous body temperature time series of patients who developed a fever, in order to apply a method capable of diagnosing the specific underlying fever cause solely by means of a pattern relative-frequency analysis. This analysis was based on a recently proposed measure, Slope Entropy, applied to a variety of records coming from dengue and malaria patients, among other fever diseases. After customizing the input parameters, a classification analysis of malaria and dengue records took place, quantified by the Matthews Correlation Coefficient. This classification yielded high accuracy, with more than 90% of the records correctly labelled in some cases, demonstrating the feasibility of the proposed approach. This approach, after further studies, or combined with additional measures such as Sample Entropy, is very promising as an early diagnosis tool based solely on body temperature temporal patterns, which is of great interest in the current COVID-19 pandemic scenario.
|
34
|
Encoding Time Series as Multi-Scale Signed Recurrence Plots for Classification Using Fully Convolutional Networks. SENSORS 2020; 20:s20143818. [PMID: 32650584 PMCID: PMC7412236 DOI: 10.3390/s20143818] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 05/30/2020] [Revised: 06/24/2020] [Accepted: 06/30/2020] [Indexed: 11/17/2022]
Abstract
Recent advances in time series classification (TSC) have exploited deep neural networks (DNNs) to improve performance. One promising approach encodes time series as recurrence plot (RP) images in order to leverage state-of-the-art DNNs; it has been shown to achieve impressive results, raising the community's interest. However, it remains unsolved how to handle the variability in distinctive-region scale and sequence length, as well as the tendency-confusion problem. In this paper, we tackle these problems using Multi-scale Signed Recurrence Plots (MS-RP), an improvement of RP, and propose a novel method based on MS-RP images and Fully Convolutional Networks (FCN) for TSC. The method first varies the phase-space dimension and time-delay embedding of RP to produce multi-scale RP images; then, by using an asymmetrical structure, the constructed RP images can represent very long sequences (>700 points). Next, MS-RP images are obtained by multiplying designed sign masks in order to remove the tendency confusion. Finally, an FCN is trained on MS-RP images to perform classification. Experimental results on 45 benchmark datasets demonstrate that our method improves the state of the art in terms of classification accuracy and visualization evaluation.
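The core encoding can be sketched in a few lines: embed the series in phase space, take pairwise distances, and attach a sign so that rising and falling tendencies are distinguishable. This is one plausible reading of the signed-RP idea, not the authors' exact construction (the sign convention and unthresholded distances are assumptions):

```python
import numpy as np

def signed_recurrence_plot(x, dim=2, tau=1):
    """Signed recurrence plot sketch: pairwise distances between
    time-delay embedded states, signed by the direction of the
    first-coordinate difference."""
    n = len(x) - (dim - 1) * tau
    # time-delay embedding: row i holds (x[i], x[i+tau], ..., x[i+(dim-1)*tau])
    emb = np.array([[x[i + k * tau] for k in range(dim)] for i in range(n)])
    dist = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=-1)
    sign = np.sign(emb[:, None, 0] - emb[None, :, 0])
    return sign * dist
```

Varying `dim` and `tau` is what produces the multi-scale images described above; the resulting 2-D arrays are then fed to the FCN as image channels.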
|
35
|
FilterNet: A Many-to-Many Deep Learning Architecture for Time Series Classification. SENSORS 2020; 20:s20092498. [PMID: 32354082 PMCID: PMC7249062 DOI: 10.3390/s20092498] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 02/21/2020] [Revised: 04/14/2020] [Accepted: 04/26/2020] [Indexed: 11/17/2022]
Abstract
In this paper, we present and benchmark FilterNet, a flexible deep learning architecture for time series classification tasks, such as activity recognition via multichannel sensor data. It adapts popular convolutional neural network (CNN) and long short-term memory (LSTM) motifs which have excelled in activity recognition benchmarks, implementing them in a many-to-many architecture to markedly improve frame-by-frame accuracy, event segmentation accuracy, model size, and computational efficiency. We propose several model variants, evaluate them alongside other published models using the Opportunity benchmark dataset, demonstrate the effect of model ensembling and of altering key parameters, and quantify the quality of the models’ segmentation of discrete events. We also offer recommendations for use and suggest potential model extensions. FilterNet advances the state of the art in all measured accuracy and speed metrics when applied to the benchmarked dataset, and it can be extensively customized for other applications.
|
36
|
Edge4TSC: Binary Distribution Tree-Enabled Time Series Classification in Edge Environment. SENSORS 2020; 20:s20071908. [PMID: 32235457 PMCID: PMC7180717 DOI: 10.3390/s20071908] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 02/28/2020] [Revised: 03/28/2020] [Accepted: 03/28/2020] [Indexed: 11/17/2022]
Abstract
In the past decade, time series data have been generated from various fields at a rapid speed, which offers a huge opportunity for mining valuable knowledge. As a typical task of time series mining, Time Series Classification (TSC) has attracted much attention from both researchers and domain experts due to its broad applications, ranging from human activity recognition to smart city governance. Specifically, there is an increasing requirement for performing classification tasks on diverse types of time series data in a timely manner without costly hand-crafted feature engineering. Therefore, in this paper, we propose a framework named Edge4TSC that allows time series to be processed in the edge environment, so that classification results can be instantly returned to end-users. Meanwhile, to avoid the costly hand-crafted feature engineering process, deep learning techniques are applied for automatic feature extraction, showing competitive or even superior performance compared to state-of-the-art TSC solutions. However, because time series present complex patterns, even deep learning models cannot always achieve satisfactory classification accuracy, which motivated us to explore new time series representation methods that help classifiers further improve accuracy. In the proposed framework, a new time series representation method based on a binary distribution tree was designed to address this classification accuracy concern. Comprehensive experiments on six challenging time series datasets in the edge environment firmly validate the framework's generalization ability and classification accuracy improvements, and yield a number of helpful insights.
|
37
|
A Smartphone Lightweight Method for Human Activity Recognition Based on Information Theory. SENSORS (BASEL, SWITZERLAND) 2020; 20:E1856. [PMID: 32230830 PMCID: PMC7181294 DOI: 10.3390/s20071856] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 01/26/2020] [Revised: 02/28/2020] [Accepted: 03/07/2020] [Indexed: 12/04/2022]
Abstract
Smartphones have emerged as a revolutionary technology for monitoring everyday life, and they have played an important role in Human Activity Recognition (HAR) due to their ubiquity. The sensors embedded in these devices allow recognizing human behaviors using machine learning techniques. However, not all solutions are feasible for implementation in smartphones, mainly because of their high computational cost. In this context, the proposed method, called HAR-SR, introduces information theory quantifiers as new features extracted from sensor data to create simple activity classification models, thereby increasing efficiency in terms of computational cost. Three public databases (SHOAIB, UCI, WISDM) are used in the evaluation process. The results show that HAR-SR can classify activities with 93% accuracy when using a leave-one-subject-out (LOSO) cross-validation procedure.
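A representative information-theory quantifier of the kind such lightweight pipelines use is permutation entropy: each short window is reduced to its ordinal pattern, and the normalized Shannon entropy of the pattern distribution becomes a single cheap feature. The paper's exact quantifiers may differ; this is an illustrative example:

```python
import math
from collections import Counter

def permutation_entropy(x, order=3):
    """Normalized permutation entropy: entropy of ordinal patterns of
    `order` consecutive samples, scaled into [0, 1] by log(order!)."""
    patterns = [
        tuple(sorted(range(order), key=lambda k: x[i + k]))
        for i in range(len(x) - order + 1)
    ]
    counts = Counter(patterns)
    total = sum(counts.values())
    h = -sum((c / total) * math.log(c / total) for c in counts.values())
    return h / math.log(math.factorial(order))  # normalize to [0, 1]
```

One scalar per sensor axis and window replaces large learned feature maps, which is exactly the trade that makes on-device classification cheap.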
|
38
|
Combination of Sensor Data and Health Monitoring for Early Detection of Subclinical Ketosis in Dairy Cows. SENSORS 2020; 20:s20051484. [PMID: 32182701 PMCID: PMC7085771 DOI: 10.3390/s20051484] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 01/31/2020] [Revised: 02/27/2020] [Accepted: 03/04/2020] [Indexed: 11/16/2022]
Abstract
Subclinical ketosis is a metabolic disease in early lactation. It contributes to economic losses because of reduced milk yield and may promote the development of secondary diseases. Thus, early detection seems desirable, as it enables the farmer to initiate countermeasures. To support early detection, we examine different types of data recordings and use them to build a flexible algorithm that predicts the occurrence of subclinical ketosis. This approach shows promising results and can be seen as a step toward automatic health monitoring in farm animals.
|
39
|
Walking Recognition in Mobile Devices. SENSORS 2020; 20:s20041189. [PMID: 32098082 PMCID: PMC7071017 DOI: 10.3390/s20041189] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 01/21/2020] [Revised: 02/14/2020] [Accepted: 02/18/2020] [Indexed: 11/29/2022]
Abstract
Presently, smartphones are used more and more for purposes that have nothing to do with phone calls or simple data transfers. One example is the recognition of human activity, which is relevant information for many applications in the domains of medical diagnosis, elderly assistance, indoor localization, and navigation. The information captured by the inertial sensors of the phone (accelerometer, gyroscope, and magnetometer) can be analyzed to determine the activity performed by the person carrying the device, in particular the activity of walking. Nevertheless, developing a standalone application able to detect walking from the data provided by these inertial sensors alone is a complex task. This complexity lies in the hardware disparity, the noise in the data, and mostly the many movements that the smartphone can experience which have nothing to do with the physical displacement of its owner. In this work, we explore and compare several approaches for identifying the walking activity. We categorize them into two main groups: the first uses features extracted from the inertial data, whereas the second analyzes the characteristic shape of the time series made up of the sensor readings. Due to the lack of public datasets of smartphone inertial data for unconstrained human activity recognition, we collected data from 77 different people who were not connected to this research. Using this dataset, which we have published online, we performed an extensive experimental validation and comparison of our proposals.
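The shape-based family of approaches mentioned above is typified by dynamic time warping, which compares two series while tolerating local timing differences. A minimal sketch as a generic illustration of shape matching, not this paper's specific algorithm:

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D series,
    computed by dynamic programming over the alignment cost matrix."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # step in a only
                                 cost[i][j - 1],      # step in b only
                                 cost[i - 1][j - 1])  # step in both
    return cost[n][m]
```

Because the warping path can stretch either series, two walking cycles recorded at slightly different cadences still match closely, which is the property that makes shape-based recognizers robust to pace variation.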
|
40
|
Knocking and Listening: Learning Mechanical Impulse Response for Understanding Surface Characteristics. SENSORS 2020; 20:s20020369. [PMID: 31936449 PMCID: PMC7013596 DOI: 10.3390/s20020369] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 11/28/2019] [Revised: 12/28/2019] [Accepted: 01/04/2020] [Indexed: 11/16/2022]
Abstract
Inspired by spiders that can generate and sense vibrations to obtain information regarding a substrate, we propose an intelligent system that can recognize the type of surface being touched by knocking the surface and listening to the vibrations. Hence, we developed a system that is equipped with an electromagnetic hammer for hitting the ground and an accelerometer for measuring the mechanical responses induced by the impact. We investigate the feasibility of sensing 10 different daily surfaces through various machine-learning techniques including recent deep-learning approaches. Although some test surfaces are similar, experimental results show that our system can recognize 10 different surfaces remarkably well (test accuracy of 98.66%). In addition, our results without directly hitting the surface (internal impact) exhibited considerably high test accuracy (97.51%). Finally, we conclude this paper with the limitations and future directions of the study.
|
41
|
Sensor Classification Using Convolutional Neural Network by Encoding Multivariate Time Series as Two-Dimensional Colored Images. SENSORS 2019; 20:s20010168. [PMID: 31892141 PMCID: PMC6982717 DOI: 10.3390/s20010168] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 10/30/2019] [Revised: 12/23/2019] [Accepted: 12/24/2019] [Indexed: 11/23/2022]
Abstract
This paper proposes a framework for sensor classification that uses multivariate time series sensor data as input. The framework encodes multivariate time series data into two-dimensional colored images and concatenates the images into one larger image for classification through a Convolutional Neural Network (ConvNet). This study applied three transformation methods to encode time series into images: Gramian Angular Summation Field (GASF), Gramian Angular Difference Field (GADF), and Markov Transition Field (MTF). Two open multivariate datasets were used to evaluate the impact of different transformation methods, the sequence of concatenating images, and the complexity of the ConvNet architecture on classification accuracy. The results show that the selection of transformation method and the sequence of concatenation do not affect the prediction outcome significantly. Surprisingly, a simple ConvNet structure is sufficient for classification, as it performed as well as the more complex VGGNet. The results were also compared with other classification methods, and the proposed framework outperformed them in terms of classification accuracy.
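The GASF encoding named above follows a standard recipe: rescale the series into [-1, 1], map each value to an angle via arccos, and form the matrix of summed-angle cosines. A sketch of that recipe (the min-max rescaling is one common convention, assumed here rather than taken from the paper's code):

```python
import numpy as np

def gasf(x):
    """Gramian Angular Summation Field: GASF[i, j] = cos(phi_i + phi_j),
    where phi = arccos of the series rescaled into [-1, 1]."""
    x = np.asarray(x, dtype=float)
    x_scaled = 2 * (x - x.min()) / (x.max() - x.min()) - 1  # into [-1, 1]
    phi = np.arccos(np.clip(x_scaled, -1.0, 1.0))
    return np.cos(phi[:, None] + phi[None, :])
```

Each sensor channel yields one such image; stacking or tiling the per-channel images produces the larger composite image the ConvNet classifies.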
|
42
|
Abstract
Successful identification of complex odors by sensor arrays remains a challenging problem. Herein, we report robust, category-specific multiclass time series classification using an array of 20 carbon nanotube-based chemical sensors. We differentiate between samples of cheese, liquor, and edible oil based on their odor. In a two-stage machine-learning approach, we first obtain an optimal subset of sensors specific to each category and then validate this subset using an independent and expanded data set. We determined the optimal selectors via independent selector classification accuracy, as well as a combinatorial scan of all 4845 possible four-selector combinations. We performed sample classification using two models: a k-nearest neighbors model and a random forest model trained on extracted features. This protocol led to high classification accuracy in the independent test sets for five cheese and five liquor samples (accuracies of 91% and 78%, respectively) and only a slightly lower accuracy (73%) on a five edible oil data set.
|
43
|
A New Method of Mixed Gas Identification Based on a Convolutional Neural Network for Time Series Classification. SENSORS 2019; 19:s19091960. [PMID: 31027348 PMCID: PMC6539079 DOI: 10.3390/s19091960] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 03/27/2019] [Revised: 04/22/2019] [Accepted: 04/24/2019] [Indexed: 12/21/2022]
Abstract
This paper proposes a new method of mixed gas identification based on a convolutional neural network for time series classification. In view of the superiority of convolutional neural networks in the field of computer vision, we applied the concept to the classification of five mixed-gas time series collected by an array of eight MOX gas sensors. Existing convolutional neural networks are mostly used for processing visual data, are rarely applied to gas data classification, and face great limitations there. Therefore, the idea of mapping time series data into an analogous image-like matrix is proposed. Then, five convolutional neural networks (VGG-16, VGG-19, ResNet18, ResNet34 and ResNet50) were used to classify the five kinds of mixed gases and compare performance. By adjusting the parameters of the convolutional neural networks, a final gas recognition rate of 96.67% was achieved. The experimental results show that the method can classify the gas data quickly and effectively, combining gas time series data with classical convolutional neural networks and providing a new idea for the identification of mixed gases.
|