1
Saqib SM, Mazhar T, Iqbal M, Shahazad T, Almogren A, Ouahada K, Hamam H. Deep learning-based electricity theft prediction in non-smart grid environments. Heliyon 2024; 10:e35167. [PMID: 39166039 PMCID: PMC11334629 DOI: 10.1016/j.heliyon.2024.e35167] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2024] [Revised: 07/23/2024] [Accepted: 07/24/2024] [Indexed: 08/22/2024]
Abstract
In developing countries, smart grids are nonexistent, and electricity theft significantly hampers power supply. This research introduces a lightweight deep-learning model that uses monthly customer readings as input data. The proposed solution relies on careful direct and indirect feature engineering techniques, including Principal Component Analysis (PCA), t-distributed Stochastic Neighbor Embedding (t-SNE), and Uniform Manifold Approximation and Projection (UMAP), and on resampling methods such as Random-Under-Sampler (RUS), Synthetic Minority Over-sampling Technique (SMOTE), and Random-Over-Sampler (ROS). Previous studies indicate that models achieve high precision, recall, and F1 score for the non-theft (0) class but perform poorly, in some cases scoring 0%, for the theft (1) class. Through parameter tuning and Random-Over-Sampler (ROS), the model achieves significant improvements in accuracy, precision (89%), recall (94%), and F1 score (91%) for the theft (1) class. The results demonstrate that the proposed model outperforms existing methods, showcasing its efficacy in detecting electricity theft in non-smart grid environments.
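The resampling idea and the per-class metrics named in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `random_oversample` and `class_metrics` are hypothetical, the data layout (a list of feature rows plus a list of 0/1 labels) is an assumption, and only plain random oversampling is shown, not RUS or SMOTE.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of each smaller class until
    every class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        rows = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - count):
            X_out.append(rng.choice(rows))
            y_out.append(label)
    return X_out, y_out


def class_metrics(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for one class, e.g. theft (1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

A classifier that always predicts non-theft illustrates the failure mode the abstract describes: `class_metrics([0, 0, 1], [0, 0, 0])` yields zero precision, recall, and F1 for the theft class even though two of three predictions are correct. Oversampling should be applied to the training split only, so that duplicated rows do not leak into evaluation.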
Affiliation(s)
- Sheikh Muhammad Saqib
  - Department of Computing and Information Technology, Gomal University, Dera Ismail Khan, Pakistan
- Tehseen Mazhar
  - Department of Computer Science, Virtual University of Pakistan, Lahore, 51000, Pakistan
- Muhammad Iqbal
  - Department of Computing and Information Technology, Gomal University, Dera Ismail Khan, Pakistan
- Tariq Shahazad
  - School of Electrical Engineering, Dept. of Electrical and Electronic Eng. Science, University of Johannesburg, Johannesburg, 2006, South Africa
- Ahmad Almogren
  - Department of Computer Science, College of Computer and Information Sciences, King Saud University, Riyadh, 11633, Saudi Arabia
- Khmaies Ouahada
  - School of Electrical Engineering, Dept. of Electrical and Electronic Eng. Science, University of Johannesburg, Johannesburg, 2006, South Africa
- Habib Hamam
  - School of Electrical Engineering, Dept. of Electrical and Electronic Eng. Science, University of Johannesburg, Johannesburg, 2006, South Africa
  - Faculty of Engineering, Université de Moncton, Moncton, NB, E1A3E9, Canada
  - Hodmas University College, Taleh Area, Mogadishu, Banadir, 521376, Somalia
  - Bridges for Academic Excellence, Tunis, Centre-Ville, 1002, Tunisia
2
Lim J, Hwang J. Exploring diverse interests of collaborators in smart cities: A topic analysis using LDA and BERT. Heliyon 2024; 10:e30367. [PMID: 38711650 PMCID: PMC11070861 DOI: 10.1016/j.heliyon.2024.e30367] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2023] [Revised: 04/21/2024] [Accepted: 04/24/2024] [Indexed: 05/08/2024]
Abstract
Smart cities have emerged as a promising solution to the problems associated with urbanization. However, research that holistically considers diverse stakeholders in smart cities is scarce. This study utilizes data from four types of collaborators (academia, public sector, industry, and civil society actors) to identify key topics and suggest research areas for developing smart cities. We used latent Dirichlet allocation and Bidirectional Encoder Representations from Transformers for topic extraction and analysis. The analysis reveals that sustainability and digital platform have received similar levels of interest from academia, industry, and government, whereas governance, resource, and green space are less frequently mentioned than technology-related topics. Hype cycle analysis, which considers public and media expectations, reveals that smart cities experienced rapid growth from 2015 to 2021, but the growth rate has slowed since 2022. This means that a breakthrough improvement in the current situation is required. Accordingly, we propose resolving the unbalanced distribution of topic interests among collaborators, especially in the areas of governance, environment, economy, and healthcare. We expect that our findings will help researchers, policymakers, and industry stakeholders in understanding which topics are underdeveloped in their fields and taking active measures for the future development of smart cities.
Affiliation(s)
- Jihye Lim
  - Integrated Major in Smart City Global Convergence, Technology Management Economics and Policy Program (TEMEP), Seoul National University, Seoul, South Korea
- Junseok Hwang
  - Integrated Major in Smart City Global Convergence, Technology Management Economics and Policy Program (TEMEP), Seoul National University, Seoul, South Korea
3
Lin Z, Lin X, Yang X. An Automated Analysis Framework for Epidemiological Survey on COVID-19. IEEE J Biomed Health Inform 2024; 28:3186-3199. [PMID: 38412074 DOI: 10.1109/jbhi.2024.3370253] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/29/2024]
Abstract
For a long time, the prevention and control of COVID-19 has received significant attention. A crucial aspect of controlling the disease's spread is the epidemiological survey of patients and the subsequent analysis of epidemiological survey reports (case reports). However, current mainstream analysis approaches are all manual, which makes them time-consuming and labor-intensive. This paper designs an automated visual epidemiological survey analysis (AVESA) framework for the epidemiological survey on COVID-19. AVESA designs a deep neural network for information extraction from case reports and automatically constructs an epidemiological knowledge graph based on a predefined pattern. Moreover, a multi-dimensional knowledge reasoning model is developed for conducting knowledge reasoning in the complete COVID-19 epidemiological knowledge graph. In the entity extraction sub-task and multi-task extraction sub-task, AVESA achieved F1 scores of 85.12% and 92.29% respectively on the constructed dataset, significantly outperforming the standalone information extraction models. In full-graph computing, all three experiments align closely with manual analysis standards. In the risk analysis experiment, the weighted PageRank algorithm showed an average improvement of 11.21% in Top_Recall_n% over the standard PageRank algorithm. In the community detection experiment, the weighted Louvain algorithm showed a mere 4.34% community difference rate compared to manual analysis.
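For readers unfamiliar with the weighted PageRank mentioned in this abstract, the following is a minimal sketch of a generic edge-weighted PageRank computed by power iteration. The abstract does not describe how AVESA assigns edge weights, so the graph layout (`edges` as a dict of `source -> {target: weight}`), the function name `weighted_pagerank`, and the default damping factor of 0.85 are all assumptions for illustration.

```python
def weighted_pagerank(edges, damping=0.85, iterations=100, tol=1e-10):
    """Edge-weighted PageRank by power iteration.

    edges maps each source node to a dict {target: weight}; a node
    passes its rank to targets in proportion to the edge weights.
    """
    nodes = set(edges)
    for targets in edges.values():
        nodes.update(targets)
    nodes = sorted(nodes)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    out_weight = {v: sum(edges.get(v, {}).values()) for v in nodes}

    for _ in range(iterations):
        # Rank held by nodes without outgoing weight is spread evenly.
        dangling = sum(rank[v] for v in nodes if out_weight[v] == 0)
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {v: base for v in nodes}
        for src, targets in edges.items():
            if out_weight[src] == 0:
                continue
            for dst, weight in targets.items():
                share = weight / out_weight[src]
                new_rank[dst] += damping * rank[src] * share
        delta = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical contact graph: "a" has a heavier link to "b" than to "c".
graph = {"a": {"b": 3.0, "c": 1.0}, "b": {"a": 1.0}, "c": {"a": 1.0}}
ranks = weighted_pagerank(graph)
```

With all weights equal, this reduces to standard PageRank; unequal weights shift rank toward the more heavily linked neighbor, which is the general effect a weighted variant exploits.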
4
Wahde M, Della Vedova ML, Virgolin M, Suvanto M. An interpretable method for automated classification of spoken transcripts and written text. Evolutionary Intelligence 2023:1-13. [PMID: 37360587 PMCID: PMC10157555 DOI: 10.1007/s12065-023-00851-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2023] [Revised: 04/18/2023] [Accepted: 04/19/2023] [Indexed: 06/28/2023]
Abstract
We investigate the differences between spoken language (in the form of radio show transcripts) and written language (Wikipedia articles) in the context of text classification. We present a novel, interpretable method for text classification, involving a linear classifier using a large set of n-gram features, and apply it to a newly generated data set with sentences originating either from spoken transcripts or written text. Our classifier reaches an accuracy less than 0.02 below that of a commonly used classifier (DistilBERT) based on deep neural networks (DNNs). Moreover, our classifier has an integrated measure of confidence, for assessing the reliability of a given classification. An online tool is provided for demonstrating our classifier, particularly its interpretable nature, which is a crucial feature in classification tasks involving high-stakes decision-making. We also study the capability of DistilBERT to carry out fill-in-the-blank tasks in either spoken or written text, and find it to perform similarly in both cases. Our main conclusion is that, with careful improvements, the performance gap between classical methods and DNN-based methods may be reduced significantly, such that the choice of classification method comes down to the need (if any) for interpretability.
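To make concrete why a linear classifier over n-gram features is interpretable, here is a minimal sketch. It is not the authors' method: the abstract does not specify their training procedure or confidence measure, so this uses a plain perceptron update, and the names `ngrams`, `train_perceptron`, `score`, and `explain`, along with the toy sentences, are hypothetical.

```python
from collections import defaultdict


def ngrams(text, n_max=2):
    """Return all word n-grams of length 1..n_max in the text."""
    tokens = text.lower().split()
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def train_perceptron(samples, epochs=10, n_max=2):
    """samples: list of (text, label), label +1 (spoken) or -1 (written)."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for text, label in samples:
            feats = ngrams(text, n_max)
            total = sum(weights[f] for f in feats)
            if label * total <= 0:  # misclassified or undecided: update
                for f in feats:
                    weights[f] += label
    return weights


def score(weights, text, n_max=2):
    """Positive score means 'spoken', negative means 'written'."""
    return sum(weights.get(f, 0.0) for f in ngrams(text, n_max))


def explain(weights, text, n_max=2):
    """List each n-gram in the text with its weight, largest effect first."""
    feats = set(ngrams(text, n_max))
    pairs = [(f, weights.get(f, 0.0)) for f in feats]
    return sorted(pairs, key=lambda kv: -abs(kv[1]))


# Toy training data, invented for illustration only.
samples = [
    ("um you know yeah", 1),
    ("well you know right", 1),
    ("the city is located north", -1),
    ("the river is located south", -1),
]
weights = train_perceptron(samples)
```

Because the decision is a sum of per-n-gram weights, `explain` can show exactly which phrases pushed a sentence toward one class, which is the kind of transparency the abstract contrasts with DNN-based classifiers.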
Affiliation(s)
- Mattias Wahde
  - Chalmers University of Technology, 412 96 Gothenburg, Sweden
- Marco Virgolin
  - Evolutionary Intelligence Group, Centrum Wiskunde and Informatica, Science Park 123, Amsterdam, 1098 XG, The Netherlands
- Minerva Suvanto
  - Chalmers University of Technology, 412 96 Gothenburg, Sweden
5
Garg M. Mental Health Analysis in Social Media Posts: A Survey. Archives of Computational Methods in Engineering: State of the Art Reviews 2023; 30:1819-1842. [PMID: 36619138 PMCID: PMC9810253 DOI: 10.1007/s11831-022-09863-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 08/27/2022] [Accepted: 11/05/2022] [Indexed: 05/21/2023]
Abstract
The surge in internet use to express personal thoughts and beliefs makes it increasingly feasible for the social NLP research community to find and validate associations between social media posts and mental health status. Cross-sectional and longitudinal studies of social media data bring to the fore the importance of real-time responsible AI models for mental health analysis. Aiming to classify the research directions for social computing and to track advances in the development of machine learning (ML) and deep learning (DL) based models, we present a comprehensive survey on quantifying mental health on social media. We compose a taxonomy for mental healthcare and highlight recent attempts at examining social well-being with personal writings on social media. We define all the possible research directions for mental healthcare and investigate a thread of handling online social media data for stress, depression and suicide detection for this work. The key features of this manuscript are (i) feature extraction and classification, (ii) recent advancements in AI models, (iii) publicly available datasets, (iv) new frontiers and future research directions. We compile this information to introduce young researchers and academic practitioners to the field of computational intelligence for mental health analysis on social media. In this manuscript, we carry out a quantitative synthesis and a qualitative review with a corpus of over 92 potential research articles. In this context, we release the collection of existing work on suicide detection in an easily accessible and updatable repository: https://github.com/drmuskangarg/mentalhealthcare.
Affiliation(s)
- Muskan Garg
  - University of Florida, Gainesville, FL 32601, USA
6
Special Issue on Big Data for eHealth Applications. Applied Sciences-Basel 2022. [DOI: 10.3390/app12157578] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/10/2022]
Abstract
In the last few years, the rapid growth in available digitised medical data has opened new challenges for the scientific research community in the healthcare informatics field [...]