1
|
Nguyen TNA, Vu HT, Dang MT, Kim D, Le AN. Anomaly Detection in Automatic Meter Intelligence System Using Positive Unlabeled Learning and Multiple Symbolic Aggregate Approximation. BIG DATA 2023; 11:225-238. [PMID: 37036805 DOI: 10.1089/big.2021.0471] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
With the spread of automated electrical devices in smart grids, the volume of data generated over time and transmitted is so vast that humans cannot monitor consumption manually. Detecting anomalies in power consumption is therefore crucial for monitoring and controlling smart grids. This article proposes detecting electrical meter anomalies by identifying abnormal patterns and learning from unlabeled data. It also introduces a big data and machine learning-based anomaly detection framework. The experimental results show that the time series anomaly detection for electric meters achieves better accuracy and run time than the expert alternatives.
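The title of this entry names Symbolic Aggregate Approximation (SAX), which turns a numeric time series into a short string of symbols. The sketch below shows plain, single-resolution SAX only; the abstract does not describe the "multiple" variant or any parameters, so the function name `sax`, the segment count, and the four-letter alphabet are illustrative assumptions rather than the authors' settings.

```python
from statistics import NormalDist, mean, pstdev


def sax(series, n_segments, alphabet="abcd"):
    """Convert a numeric series into a SAX word (illustrative sketch)."""
    if n_segments < 1 or n_segments > len(series):
        raise ValueError("n_segments must be between 1 and len(series)")

    # Step 1: z-normalize; a constant series maps to all zeros.
    mu, sigma = mean(series), pstdev(series)
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in series]

    # Step 2: Piecewise Aggregate Approximation, the mean of each
    # roughly equal-length segment.
    n = len(z)
    paa = []
    for i in range(n_segments):
        start = i * n // n_segments
        end = (i + 1) * n // n_segments
        paa.append(mean(z[start:end]))

    # Step 3: breakpoints that split the standard normal distribution
    # into len(alphabet) equiprobable regions.
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(j / k) for j in range(1, k)]

    # Step 4: map each segment mean to the symbol of its region.
    return "".join(alphabet[sum(v > b for b in breakpoints)] for v in paa)


# A low plateau followed by a high plateau becomes a two-symbol word.
print(sax([1, 1, 1, 1, 10, 10, 10, 10], n_segments=2))
```

Two meter readings that produce different words at the same resolution have visibly different shapes, which is what makes the symbolic form convenient for pattern comparison.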
Affiliation(s)
- Thi Ngoc Anh Nguyen
  - Applied Mathematics Department, School of Applied Mathematics and Informatics, Hanoi University of Science and Technology, Hanoi, Vietnam
  - Big Data Lab, CMC Institute of Science and Technology, Hanoi, Vietnam
- Hoai Thu Vu
  - Big Data Lab, CMC Institute of Science and Technology, Hanoi, Vietnam
  - Faculty of Information Technology, Posts and Telecommunications Institute of Technology, Hanoi, Vietnam
- Minh Tuan Dang
  - Big Data Lab, CMC Institute of Science and Technology, Hanoi, Vietnam
  - Faculty of Information Technology, Posts and Telecommunications Institute of Technology, Hanoi, Vietnam
- Dohyeun Kim
  - Department of Computer Engineering, Advanced Technology Research Institute, Jeju National University, Jeju, Korea
- Anh Ngoc Le
  - Swinburne Vietnam, FPT University, Hanoi, Vietnam
|
2
|
Dong Y, Xiao H, Dong Y. SA-CGAN: An oversampling method based on single attribute guided conditional GAN for multi-class imbalanced learning. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2021.04.135] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/30/2022]
|
3
|
Mao T, Zhou L, Zhang Y, Sun Y. Classification algorithm for class imbalanced data based on optimized Mahalanobis-Taguchi system. APPL INTELL 2022. [DOI: 10.1007/s10489-021-02929-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/23/2022]
|
4
|
Electricity Theft Detection Using Supervised Learning Techniques on Smart Meter Data. SUSTAINABILITY 2020. [DOI: 10.3390/su12198023] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 11/16/2022]
Abstract
As the number of electricity thieves grows, electric utilities face problems in supplying electricity to their consumers efficiently. Accurate Electricity Theft Detection (ETD) is challenging because existing techniques classify imbalanced electricity consumption data inaccurately, overfit, and have a high False Positive Rate (FPR). Intensified research is therefore needed to detect electricity thieves accurately and to recover large revenue losses for utility companies. To address these limitations, this paper presents a new model based on supervised machine learning techniques and real electricity consumption data. First, the electricity data are pre-processed using interpolation, the three-sigma rule, and normalization. Because the distribution of labels in the electricity consumption data is imbalanced, the ADASYN algorithm is used to address the class imbalance problem. It serves two purposes: it increases the number of minority class samples in the data, and it prevents the model from being biased towards the majority class samples. The balanced data are then fed into a Visual Geometry Group (VGG-16) module to detect abnormal patterns in electricity consumption. Finally, a Firefly Algorithm based Extreme Gradient Boosting (FA-XGBoost) technique is used for classification. Simulations are conducted to show the performance of the proposed model. State-of-the-art methods are also implemented for comparative analysis, namely Support Vector Machine (SVM), Convolutional Neural Network (CNN), and Logistic Regression (LR). For validation, the precision, recall, F1-score, Matthews Correlation Coefficient (MCC), Receiver Operating Characteristic Area Under Curve (ROC-AUC), and Precision Recall Area Under Curve (PR-AUC) metrics are used.
First, the simulation results show that the proposed ADASYN method improves the performance of the FA-XGBoost classifier, which achieves an F1-score, precision, and recall of 93.7%, 92.6%, and 97%, respectively. Second, the VGG-16 module generalizes well, reaching an accuracy of 87.2% on training data and 83.5% on testing data. Third, the proposed FA-XGBoost correctly identifies actual electricity thieves, with a recall of 97%. Moreover, the model is superior to the other state-of-the-art models in handling large time series data and in classification accuracy. Utility companies can apply these models to real electricity consumption data to identify electricity thieves and reduce major revenue losses in the power sector.
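The pre-processing stage named in this abstract (interpolation, the three-sigma rule, normalization) can be illustrated with a minimal sketch. The abstract gives no formulas, so everything below is an assumption: missing readings are represented as `None` and filled by linear interpolation, the three-sigma rule is read as clipping to the mean plus or minus three standard deviations, and normalization is taken to be min-max scaling. The function names are hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def interpolate_missing(values):
    """Fill None gaps linearly; gaps at the edges copy the nearest reading."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    out = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            out[i] = values[right]
        elif right is None:
            out[i] = values[left]
        else:
            weight = (i - left) / (right - left)
            out[i] = values[left] + weight * (values[right] - values[left])
    return out


def clip_three_sigma(values):
    """Clip readings to mean +/- 3 standard deviations (assumed reading of the rule)."""
    mu, sigma = mean(values), pstdev(values)
    low, high = mu - 3 * sigma, mu + 3 * sigma
    return [min(max(v, low), high) for v in values]


def min_max_normalize(values):
    """Scale readings to [0, 1]; a constant series maps to all zeros."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


def preprocess(values):
    """Chain the three steps in the order the abstract lists them."""
    return min_max_normalize(clip_three_sigma(interpolate_missing(values)))


# One consumer's daily readings with a missing day.
print(preprocess([0.0, None, 10.0]))
```

Clipping before scaling keeps a single extreme reading from compressing every other value towards zero, which is one plausible reason to apply the steps in this order.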
|