1
Ananthi M, Gopal A, Ramalakshmi K, Mohan Kumar P. Gaussian Adapted Markov Model with Overhauled Fluctuation Analysis-Based Big Data Streaming Model in Cloud. BIG DATA 2024; 12:1-18. [PMID: 37902996 DOI: 10.1089/big.2023.0035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/01/2023]
Abstract
Accurate prediction of resource usage in big data streaming applications remains a complex problem. Existing works have developed various resource scaling techniques for forecasting resource usage in big data streaming systems. However, these baseline streaming mechanisms suffer from inefficient resource scaling, inaccurate forecasting, high latency, and long running time. This work therefore develops a new framework, named Gaussian adapted Markov model (GAMM)-overhauled fluctuation analysis (OFA), for efficient big data streaming in cloud systems. The purpose of this work is to manage time-bounded big data streaming applications efficiently with a reduced error rate. In this study, a gating strategy is also used to extract the set of features for obtaining a nonlinear distribution of the data and a fast convergence solution, which are used to perform the fluctuation analysis. Moreover, a layered architecture is developed to simplify resource forecasting in streaming applications. During experimentation, the results of the proposed stream model GAMM-OFA are validated and compared using different measures.
Affiliation(s)
- M Ananthi
  - Department of Computer Science and Business Systems, Sri Sairam Engineering College, Chennai, India
- Annapoorani Gopal
  - Department of CSE & IT, University College of Engineering, Anna University, Tiruchirappalli, India
- K Ramalakshmi
  - Department of Computer Science and Engineering, Alliance University, Bengaluru, India
- P Mohan Kumar
  - Department of Computer Science and Engineering, Hindustan Institute of Technology & Science, Chennai, India
2
Ekemeyong Awong LE, Zielinska T. Comparative Analysis of the Clustering Quality in Self-Organizing Maps for Human Posture Classification. SENSORS (BASEL, SWITZERLAND) 2023; 23:7925. [PMID: 37765983 PMCID: PMC10538130 DOI: 10.3390/s23187925] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2023] [Revised: 09/05/2023] [Accepted: 09/13/2023] [Indexed: 09/29/2023]
Abstract
The objective of this article is to develop a methodology for selecting the appropriate number of clusters to group and identify human postures using neural networks with unsupervised self-organizing maps. Although unsupervised clustering algorithms have proven effective in recognizing human postures, many works are limited to testing which data are correctly or incorrectly recognized. They often neglect the task of selecting the appropriate number of groups (where the number of clusters corresponds to the number of output neurons, i.e., the number of postures) using clustering quality assessments. Using quality scores to determine the number of clusters frees the expert from making subjective decisions about the number of postures, enabling the use of unsupervised learning. Due to high dimensionality and data variability, expert decisions (referred to as data labeling) can be difficult and time-consuming. In our case, there is no manual labeling step. We introduce a new clustering quality score: the discriminant score (DS). We describe the process of selecting the most suitable number of postures using human activity records captured by RGB-D cameras. Comparative studies on the usefulness of popular clustering quality scores (the silhouette coefficient, Dunn index, Calinski-Harabasz index, Davies-Bouldin index, and DS) for posture classification tasks are presented, along with graphical illustrations of the results produced by DS. The findings show that DS offers good quality in posture recognition, effectively following postural transitions and similarities.
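The idea of choosing the number of clusters from a quality score, rather than from expert labels, can be illustrated with a small sketch. It is not the authors' method: it uses the silhouette coefficient (one of the scores the abstract compares, not the new DS), and it swaps the self-organizing map for a toy one-dimensional clustering that cuts the sorted values at the largest gaps. The function names `split_by_gaps`, `silhouette`, and `best_k`, and the sample data, are assumptions made for illustration only.

```python
from statistics import mean


def split_by_gaps(values, k):
    """Toy 1-D clustering (an assumption, not a SOM): cut the sorted
    values at the k-1 largest gaps and label the resulting groups 0..k-1."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    # Positions in the sorted order where a new cluster starts.
    cuts = set(sorted(range(1, len(order)),
                      key=lambda j: values[order[j]] - values[order[j - 1]],
                      reverse=True)[:k - 1])
    labels = [0] * len(values)
    current = 0
    for pos, idx in enumerate(order):
        if pos in cuts:
            current += 1
        labels[idx] = current
    return labels


def silhouette(values, labels):
    """Mean silhouette coefficient with |x - y| as the distance.
    Requires at least two clusters."""
    clusters = {}
    for v, lab in zip(values, labels):
        clusters.setdefault(lab, []).append(v)
    scores = []
    for v, lab in zip(values, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(abs(v - u) for u in own) / (len(own) - 1)
        # b: mean distance to the nearest other cluster.
        b = min(mean(abs(v - u) for u in other)
                for m, other in clusters.items() if m != lab)
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return mean(scores)


def best_k(values, candidates):
    """Return the candidate cluster count (each >= 2) with the highest score."""
    return max(candidates,
               key=lambda k: silhouette(values, split_by_gaps(values, k)))


# Hypothetical data with three well-separated groups.
data = [1.0, 1.1, 0.9, 5.0, 5.2, 4.9, 9.0, 9.1, 8.8]
chosen = best_k(data, range(2, 6))
```

The same selection loop would apply to any clustering method and any quality score where higher is better; for scores where lower is better, such as the Davies-Bouldin index, `max` would become `min`.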
Affiliation(s)
- Lisiane Esther Ekemeyong Awong
  - Faculty of Power and Aeronautical Engineering, Division of Theory of Machines and Robots, Warsaw University of Technology, 00-665 Warszawa, Poland
- Teresa Zielinska
  - Faculty of Power and Aeronautical Engineering, Division of Theory of Machines and Robots, Warsaw University of Technology, 00-665 Warszawa, Poland
3
Wang Y, Li J, Yang B, Li HG. Stream-data-clustering based adaptive alarm threshold setting approaches for industrial processes with multiple operating conditions. ISA TRANSACTIONS 2022; 129:594-608. [PMID: 35164962 DOI: 10.1016/j.isatra.2022.01.030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2021] [Revised: 01/26/2022] [Accepted: 01/26/2022] [Indexed: 06/14/2023]
Abstract
The setting of alarm thresholds is a critical concern of alarm management systems in industrial processes. Conventional alarm thresholds rarely account for changes of operating conditions in production processes, which degrades the effectiveness of alarm management systems. In response to this problem, this paper proposes an adaptive alarm threshold setting approach based on stream data clustering (SDC). First, we develop a stream data clustering algorithm, termed the a-DenStream algorithm, which clusters industrial flow data through online micro-clustering and offline integration. Subsequently, we develop the C-BOUND algorithm to extract the edges of the clustering results. To handle alarms associated with multiple operating conditions, the results are segmented to set alarm threshold groups and build a multi-condition alarm threshold model. Consequently, an adaptive alarm threshold setting method based on model matching is created. The effectiveness of the proposed method is demonstrated by experiments on a coal gasification chemical process. The proposed method is potentially applicable to alarm management in industrial processes with multiple operating conditions.
Affiliation(s)
- Yuehan Wang
  - College of Information Science & Technology, Beijing University of Chemical Technology, Beijing 100029, China
- Jince Li
  - College of Information Science & Technology, Beijing University of Chemical Technology, Beijing 100029, China
- Bo Yang
  - College of Information Science & Technology, Beijing University of Chemical Technology, Beijing 100029, China
- Hong-Guang Li
  - College of Information Science & Technology, Beijing University of Chemical Technology, Beijing 100029, China
4
Macro SOStream: An Evolving Algorithm to Self Organizing Density-Based Clustering with Micro and Macroclusters. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12147161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
This paper proposes a new evolving algorithm for data stream clustering, named Macro SOStream, which learns entirely online and is based on self-organizing density. Macro SOStream builds on the SOStream algorithm but incorporates macroclusters composed of microclusters. While microclusters have spherical shapes, macroclusters can have arbitrary shapes. Moreover, Macro SOStream has a macrocluster merge functionality specially designed to improve its performance under data drift. To validate our proposal, Macro SOStream's performance is compared with that of the SOStream and DenStream algorithms using four synthetic datasets and the ARI performance metric. Furthermore, we carry out an exhaustive analysis of the influence of adequate hyperparameter setup on these algorithms' performance. As a result, Macro SOStream performs well mainly under data drift and when non-spherical clusters are required.
5
Liu Q, Yuan B, Wang Y. Online Learning for Foot Contact Detection of Legged Robot Based on Data Stream Clustering. Front Bioeng Biotechnol 2022; 9:771415. [PMID: 35178383 PMCID: PMC8844452 DOI: 10.3389/fbioe.2021.771415] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2021] [Accepted: 12/14/2021] [Indexed: 11/13/2022]
Abstract
Foot contact detection is critical for legged robot running control using a state machine, in which the controller uses different control modules in the leg flight phase and the landing phase. This paper presents an online learning framework to improve the rapidity of foot contact detection in legged robot running. In this framework, a Gaussian mixture model with three sub-components is adopted to learn the contact data vectors corresponding to running on flat ground, running upstairs, and running downstairs. An online data stream learning algorithm is used to update the model. To deal with the difficulty of obtaining contact data at the landing moment online, a "trace back" module is designed to trace back through the contact data in the memory stack until the data satisfy the probability contact criterion. To test whether the foot is in contact with the ground, a projection method is proposed. The data vector acquired during the leg flight phase is projected onto an independent random vector space, and the contact event is triggered if all projected random variables fall within 1.5σ of the corresponding Gaussian distribution. Experiments on a legged robot show that the presented algorithm can predict foot contact 16 ms earlier than prediction using only leg force, which will ease controller design and enhance the stability of legged robot control.
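The 1.5σ triggering rule in the abstract above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: it assumes the data vector has already been projected onto independent components, each with a known mean and standard deviation, and the function name `contact_detected` and its parameters are hypothetical.

```python
def contact_detected(projected, means, sigmas, k=1.5):
    """Return True if every projected component lies within k standard
    deviations of its Gaussian mean (k = 1.5 as stated in the abstract).

    projected, means, sigmas: equal-length sequences of floats, one entry
    per independent component (hypothetical inputs for this sketch).
    """
    if not (len(projected) == len(means) == len(sigmas)):
        raise ValueError("projected, means and sigmas must have equal length")
    # The contact event fires only when ALL components pass the gate.
    return all(abs(x - m) <= k * s
               for x, m, s in zip(projected, means, sigmas))


# Hypothetical two-component example: both values are close to their means.
near = contact_detected([1.0, 2.0], [1.1, 2.2], [0.2, 0.2])
# The second component is far outside 1.5 sigma, so no contact is reported.
far = contact_detected([1.0, 3.0], [1.1, 2.2], [0.2, 0.2])
```

Requiring all components to pass, rather than any one of them, makes the gate conservative: a single outlying component is enough to withhold the contact event.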
Affiliation(s)
- Qingyu Liu
  - Key Laboratory of Metallurgical Equipment and Control Technology, Ministry of Education, Wuhan University of Science and Technology, Wuhan, China
  - *Correspondence: Qingyu Liu
- Bing Yuan
  - Hubei Key Laboratory of Mechanical Transmission and Manufacturing Engineering, Wuhan University of Science and Technology, Wuhan, China
- Yang Wang
  - Golden Leaf Production and Manufacturing Center of China Tobacco Henan Industrial Co., Ltd., Zhengzhou, China
6
Discovering three-dimensional patterns in real-time from data streams: An online triclustering approach. Inf Sci (N Y) 2021. [DOI: 10.1016/j.ins.2020.12.089] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 01/12/2023]
7
Mousavi M, Khotanlou H, Bakar AA, Vakilian M. Varying density method for data stream clustering. Appl Soft Comput 2020. [DOI: 10.1016/j.asoc.2020.106797] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
8
Alam MK, Aziz AA, Latif SA, Awang A. Error-Aware Data Clustering for In-Network Data Reduction in Wireless Sensor Networks. SENSORS 2020; 20:s20041011. [PMID: 32069936 PMCID: PMC7071511 DOI: 10.3390/s20041011] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/09/2019] [Revised: 11/26/2019] [Accepted: 12/04/2019] [Indexed: 11/16/2022]
Abstract
A wireless sensor network (WSN) deploys hundreds or thousands of nodes that may generate large-scale data over time. Dealing with such an amount of collected data is a real challenge for energy-constrained sensor nodes. Therefore, numerous research works have been carried out to design efficient data clustering techniques in WSNs that eliminate redundant data before transmission to the sink while preserving the fundamental properties of the data. This paper develops a new error-aware data clustering (EDC) technique at the cluster-heads (CHs) for in-network data reduction. The proposed EDC consists of three adaptive modules that allow users to choose the module that suits their requirements and the quality of the data. The histogram-based data clustering (HDC) module groups temporally correlated data into clusters and eliminates correlated data from each cluster. The recursive outlier detection and smoothing (RODS) with HDC module provides error-aware data clustering, which detects random outliers using the temporal correlation of the data to keep data reduction errors within a predefined threshold. The verification of RODS (V-RODS) with HDC module detects not only random outliers but also frequent outliers, simultaneously, based on both the temporal and spatial correlations of the data. The simulation results show that the proposed EDC is computationally cheap, reduces a significant amount of redundant data with minimum error, and provides efficient error-aware data clustering solutions for remote environmental monitoring applications.
Affiliation(s)
- M. K. Alam
  - Department of Electrical and Electronic Engineering, Universiti Teknologi PETRONAS, Seri Iskandar 32610, Perak, Malaysia
  - Correspondence:
- Azrina Abd Aziz
  - Department of Electrical and Electronic Engineering, Universiti Teknologi PETRONAS, Seri Iskandar 32610, Perak, Malaysia
- S. A. Latif
  - Department of Information Technology, Otago Polytechnic, Dunedin 9016, New Zealand
- Azlan Awang
  - Department of Electrical and Electronic Engineering, Universiti Teknologi PETRONAS, Seri Iskandar 32610, Perak, Malaysia
9
A Clustering System for Dynamic Data Streams Based on Metaheuristic Optimisation. MATHEMATICS 2019. [DOI: 10.3390/math7121229] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 11/17/2022]
Abstract
This article presents the Optimised Stream clustering algorithm (OpStream), a novel approach to clustering dynamic data streams. The proposed system displays desirable features, such as a low number of parameters and good scalability with respect to both the dimensionality of the data and the number of clusters in the dataset. It is based on a hybrid structure that uses deterministic clustering methods and stochastic optimisation approaches to optimally centre the clusters. Similar to other state-of-the-art methods available in the literature, it uses "microclusters" and other established techniques, such as density-based clustering. Unlike other methods, it makes use of metaheuristic optimisation to maximise performance during the initialisation phase, which precedes the classic online phase. Experimental results show that OpStream outperforms the state-of-the-art methods in several cases, and it is always competitive against other comparison algorithms regardless of the chosen optimisation method. Three variants of OpStream, each coming with a different optimisation algorithm, are presented in this study. A thorough sensitivity analysis is performed using the best variant to demonstrate OpStream's robustness to noise and resilience to parameter changes.