1
|
Hong J, Chun H. A prediction model for healthcare time-series data with a mixture of deep mixed effect models using Gaussian processes. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2023.104753] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/06/2023]
|
2
|
Wang X, Liu H, Du J, Dong X, Yang Z. A long-term multivariate time series forecasting network combining series decomposition and convolutional neural networks. Appl Soft Comput 2023. [DOI: 10.1016/j.asoc.2023.110214] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/29/2023]
|
3
|
Fernández-Gómez AM, Gutiérrez-Avilés D, Troncoso A, Martínez-Álvarez F. A new Apache Spark-based framework for big data streaming forecasting in IoT networks. THE JOURNAL OF SUPERCOMPUTING 2023; 79:11078-11100. [PMID: 36845222 PMCID: PMC9942040 DOI: 10.1007/s11227-023-05100-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/01/2023] [Indexed: 05/24/2023]
Abstract
Analyzing time-dependent data acquired in a continuous flow is a major challenge for various fields, such as big data and machine learning. Being able to analyze a large volume of data from various sources, such as sensors, networks, and the internet, is essential for improving the efficiency of our society's production processes. Additionally, this vast amount of data is collected dynamically in a continuous stream. The goal of this research is to provide a comprehensive framework for forecasting big data streams from Internet of Things networks and to serve as a guide for designing and deploying other third-party solutions. Hence, a new framework for time series forecasting in a big data streaming scenario, using data collected from Internet of Things networks, is presented. This framework comprises five main modules: Internet of Things network design and deployment, big data streaming architecture, stream data modeling method, big data forecasting method, and a comprehensive real-world application scenario, consisting of a physical Internet of Things network feeding the big data streaming architecture, with linear regression as the algorithm used for illustrative purposes. Comparison with other frameworks reveals that this is the first framework that incorporates and integrates all the aforementioned modules.
Affiliation(s)
- Antonio M. Fernández-Gómez: Data Science and Big Data Lab, Pablo de Olavide University of Seville, Ctra. de Utrera, km. 1, ES-41013 Seville, Spain
- David Gutiérrez-Avilés: Department of Computer Science, University of Seville, Avda. Reina Mercedes s/n, ES-41012 Seville, Spain
- Alicia Troncoso: Data Science and Big Data Lab, Pablo de Olavide University of Seville, Ctra. de Utrera, km. 1, ES-41013 Seville, Spain
- Francisco Martínez-Álvarez: Data Science and Big Data Lab, Pablo de Olavide University of Seville, Ctra. de Utrera, km. 1, ES-41013 Seville, Spain
|
4
|
Xin R, Liu H, Chen P, Zhao Z. Robust and accurate performance anomaly detection and prediction for cloud applications: a novel ensemble learning-based framework. JOURNAL OF CLOUD COMPUTING 2023. [DOI: 10.1186/s13677-022-00383-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/15/2023]
Abstract
Effectively detecting run-time performance anomalies is crucial for clouds to identify abnormal performance behavior and forestall future incidents. To be used for real-world applications, an effective anomaly detection framework should meet three main challenging requirements: high accuracy for identifying anomalies, good robustness when application patterns change, and prediction ability for upcoming anomalies. Unfortunately, existing research about performance anomaly detection usually focuses on improving detection accuracy, while little research tackles the three challenges simultaneously. We conduct experiments for existing detection methods on multiple application monitoring data, and results show that existing detection methods usually focus on different features in data, which will lead to their diverse performance on different data patterns. Therefore, existing anomaly detection methods have difficulty improving detection accuracy and robustness and predicting anomalies. To address the three requirements, we propose an Ensemble Learning-Based Detection (ELBD) framework which integrates existing well-selected detection methods. The framework includes three classic linear ensemble methods (maximum, average, and weighted average) and a novel deep ensemble method. Our experiments show that the ELBD framework realizes better detection accuracy and robustness, where the deep ensemble method can achieve the most accurate and robust detection for cloud applications. In addition, it can predict anomalies in the next four minutes with an F1 score higher than 0.8. The paper also proposes a new indicator, $ARP\_score$, to measure detection accuracy, robustness, and multi-step prediction ability. The $ARP\_score$ of the deep ensemble method is 5.1821, which is much higher than other detection methods.
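The three classic linear ensemble methods named in this abstract (maximum, average, and weighted average) can be illustrated with a minimal sketch. It assumes each detector emits a normalized anomaly score per time step; the function names, the example scores, and the weights are hypothetical and are not taken from the ELBD paper.

```python
from statistics import fmean


def max_ensemble(scores):
    """Combine detectors by taking the largest score at each time step."""
    return [max(column) for column in zip(*scores)]


def average_ensemble(scores):
    """Combine detectors by averaging their scores at each time step."""
    return [fmean(column) for column in zip(*scores)]


def weighted_average_ensemble(scores, weights):
    """Combine detectors with per-detector weights (normalized by their sum)."""
    total = sum(weights)
    return [
        sum(w * s for w, s in zip(weights, column)) / total
        for column in zip(*scores)
    ]


# Hypothetical normalized anomaly scores: three detectors, four time steps.
scores = [
    [0.1, 0.8, 0.3, 0.9],
    [0.2, 0.6, 0.1, 0.7],
    [0.0, 0.9, 0.2, 0.5],
]
weights = [0.5, 0.3, 0.2]  # assumed weights, e.g. from validation performance

combined_max = max_ensemble(scores)
combined_avg = average_ensemble(scores)
combined_wavg = weighted_average_ensemble(scores, weights)
```

A threshold applied to any of the combined score lists would then flag anomalous time steps; how the weights are chosen is left open here.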
|
5
|
Analysis of Stock Market Public Opinion Based on Web Crawler and Deep Learning Technologies Including 1DCNN and LSTM. ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING 2022. [DOI: 10.1007/s13369-022-07444-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
|
6
|
Chen C, Wang N, Chen M, Yan XM. A framework based on heterogeneous ensemble models for liquid steel temperature prediction in LF refining process. Appl Soft Comput 2022. [DOI: 10.1016/j.asoc.2022.109724] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
|
7
|
Using dual evolutionary search to construct decision tree based ensemble classifier. COMPLEX INTELL SYST 2022. [DOI: 10.1007/s40747-022-00855-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2022]
Abstract
A typical ensemble learning process uses a forward integration mechanism to construct the ensemble classifier with a large number of base classifiers. Based on this mechanism, it is difficult to adjust the diversity among base classifiers and optimize the structure inside the ensemble, since the generation process has a certain amount of randomness, which makes the performance of ensemble classifiers heavily dependent on human design decisions. To address this issue, we proposed an automatic ensemble classifier construction method based on a dual-layer evolutionary search mechanism, which includes a tree coding-based base classifier population and a binary coding-based ensemble classifier population. Through a collaborative searching process between the two populations, the proposed method can be driven by training data to update the base classifier population and optimize the ensemble classifiers globally. To verify the effectiveness of the dual evolutionary ensemble learning method (DEEL), we tested it on 22 classification tasks from 4 data repositories. The results show that the proposed method can generate a diverse decision tree population on the training data while searching and constructing ensemble classifiers from them. Compared with 9 competitor algorithms, the proposed method achieved the best performance on 17 of 22 test tasks and improved the average accuracies by 0.97–7.65% over the second place. In particular, the generated ensemble classifiers show excellent structure, involving a small number of diverse decision trees. That increases the transparency of ensembles and helps to perform interpretability analysis on them.
|
8
|
Wang R, Li H, Jing J, Jiang L, Dong W. WYSIWYG: IoT Device Identification Based on WebUI Login Pages. SENSORS (BASEL, SWITZERLAND) 2022; 22:4892. [PMID: 35808388 PMCID: PMC9269544 DOI: 10.3390/s22134892] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2022] [Revised: 06/19/2022] [Accepted: 06/23/2022] [Indexed: 06/15/2023]
Abstract
With the improvement of intelligence and interconnection, Internet of Things (IoT) devices tend to become more vulnerable and exposed to many threats. Device identification is the foundation of many cybersecurity operations, such as asset management, vulnerability reaction, and situational awareness, which are important for enhancing the security of IoT devices. The more information sources and the more angles of view we have, the more precise identification results we obtain. This study proposes a novel and alternative method for IoT device identification, which introduces commonly available WebUI login pages with distinctive characteristics specific to vendors as the data source and uses an ensemble learning model based on a combination of Convolutional Neural Networks (CNN) and Deep Neural Networks (DNN) for device vendor identification and develops an Optical Character Recognition (OCR) based method for device type and model identification. The experimental results show that the ensemble learning model can achieve 99.1% accuracy and 99.5% F1-Score in the determination of whether a device is from a vendor that appeared in the training dataset, and if the answer is positive, 98% accuracy and 98.3% F1-Score in identifying which vendor it is from. The OCR-based method can identify fine-grained attributes of the device and achieve an accuracy of 99.46% in device model identification, which is higher than the results of the Shodan cyber search engine by a considerable margin of 11.39%.
Affiliation(s)
- Ruimin Wang: State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450000, China; Key Laboratory of Cyberspace Situation Awareness of Henan Province, Zhengzhou 450000, China
- Haitao Li: State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450000, China
- Jing Jing: State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450000, China
- Liehui Jiang: State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450000, China; Key Laboratory of Cyberspace Situation Awareness of Henan Province, Zhengzhou 450000, China
- Weiyu Dong: State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450000, China
|
9
|
Criado-Ramón D, Ruiz L, Pegalajar M. Electric demand forecasting with neural networks and symbolic time series representations. Appl Soft Comput 2022. [DOI: 10.1016/j.asoc.2022.108871] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/26/2022]
|
10
|
Castán-Lascorz M, Jiménez-Herrera P, Troncoso A, Asencio-Cortés G. A new hybrid method for predicting univariate and multivariate time series based on pattern forecasting. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2021.12.001] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 11/05/2022]
|
11
|
Towards better time series prediction with model-independent, low-dispersion clusters of contextual subsequence embeddings. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2021.107641] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
|
12
|
Feng JW, Ye J, Qi GF, Hong LZ, Wang F, Liu SY, Jiang Y. LASSO-based machine learning models for the prediction of central lymph node metastasis in clinically negative patients with papillary thyroid carcinoma. Front Endocrinol (Lausanne) 2022; 13:1030045. [PMID: 36506061 PMCID: PMC9727241 DOI: 10.3389/fendo.2022.1030045] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/28/2022] [Accepted: 11/07/2022] [Indexed: 11/25/2022]
Abstract
BACKGROUND The presence of central lymph node metastasis (CLNM) is crucial for surgical decision-making in clinical N0 (cN0) papillary thyroid carcinoma (PTC) patients. We aimed to develop and validate machine learning (ML) algorithm-based models for predicting the risk of CLNM in cN0 patients. METHODS A total of 1099 PTC patients with cN0 central neck from July 2019 to March 2022 at our institution were retrospectively analyzed. All patients were randomly split into the training dataset (70%) and the validation dataset (30%). Eight ML algorithms, including Logistic Regression, Gradient Boosting Machine, Extreme Gradient Boosting (XGB), Random Forest (RF), Decision Tree, Neural Network, Support Vector Machine and Bayesian Network, were used to evaluate the risk of CLNM. The performance of ML models was evaluated by the area under curve (AUC), sensitivity, specificity, and decision curve analysis (DCA). RESULTS We first used the LASSO Logistic regression method to select the most relevant factors for predicting CLNM. The AUC of XGB was slightly higher than that of RF (0.907 and 0.902, respectively). According to DCA, the RF model significantly outperformed the XGB model at most threshold points and was therefore used to develop the predictive model. The diagnostic performance of the RF algorithm was dependent on the following nine top-rank variables: size, margin, extrathyroidal extension, sex, echogenic foci, shape, number, lateral lymph node metastasis and chronic lymphocytic thyroiditis. CONCLUSION By incorporating clinicopathological and sonographic characteristics, we developed ML-based models, suggesting that this non-invasive method can be applied to facilitate individualized prediction of occult CLNM in cN0 central neck PTC patients.
|
13
|
Dat NQ, Ngoc Anh NT, Nhat Anh N, Solanki VK. Hybrid online model based multi seasonal decompose for short-term electricity load forecasting using ARIMA and online RNN. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS 2021. [DOI: 10.3233/jifs-189884] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/15/2022]
Abstract
Short-term electricity load forecasting (STLF) plays a key role in operating the power system of a nation. A challenging problem in STLF is to deal with real-time data. This paper aims to address the problem using a hybrid online model. Online learning methods are becoming essential in STLF because load data often show complex seasonality (daily, weekly, annual) and changing patterns. Online models such as Online AutoRegressive Integrated Moving Average (Online ARIMA) and Online Recurrent neural network (Online RNN) can modify their parameters on the fly to adapt to the changes of real-time data. However, Online RNN alone cannot handle seasonality directly and ARIMA can only handle a single seasonal pattern (Seasonal ARIMA). In this study, we propose a hybrid online model that combines Online ARIMA, Online RNN, and Multi-seasonal decomposition to forecast real-time time series with multiple seasonal patterns. First, we decompose the original time series into three components: trend, seasonality, and residual. The seasonal patterns are modeled using Fourier series. This approach is flexible, allowing us to incorporate multiple periods. For trend and residual components, we employ Online ARIMA and Online RNN respectively to obtain the predictions. We use hourly load data of Vietnam and daily load data of Australia as case studies to verify our proposed model. The experimental results show that our model has better performance than single online models. The proposed model is robust and can be applied in many other fields with real-time time series.
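The idea of modeling several seasonal patterns with Fourier series, as described in this abstract, can be sketched as follows. This is only an illustration under stated assumptions: the helper names, the choice of periods (24 and 168 hours for daily and weekly cycles in hourly data), the harmonic order, and the coefficients are hypothetical, and the sketch does not reproduce the authors' online ARIMA or online RNN components.

```python
import math


def fourier_terms(t, periods, order):
    """Return sin/cos features at time index t for each period and harmonic 1..order."""
    features = []
    for period in periods:
        for k in range(1, order + 1):
            angle = 2.0 * math.pi * k * t / period
            features.append(math.sin(angle))
            features.append(math.cos(angle))
    return features


def seasonal_value(t, periods, order, coeffs):
    """Evaluate the seasonal component as a linear combination of Fourier terms."""
    return sum(c * f for c, f in zip(coeffs, fourier_terms(t, periods, order)))


# Assumed setup: hourly data with daily (24 h) and weekly (168 h) seasonality.
periods = [24, 168]
order = 2
# Hypothetical coefficients; in practice they would be fitted to the data.
coeffs = [1.0, 0.5, 0.2, 0.1, 0.8, 0.3, 0.05, 0.02]

hour_5 = seasonal_value(5, periods, order, coeffs)
hour_5_next_week = seasonal_value(5 + 168, periods, order, coeffs)
```

Because 168 is a multiple of 24, the two evaluated values coincide up to floating-point error, which is what makes the representation convenient for incorporating multiple periods.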
Affiliation(s)
- Nguyen Quang Dat: School of Applied Mathematics and Informatics, HUST, Hanoi, Vietnam
- Vijender Kumar Solanki: Department of Computer Science & Engineering, CMR Institute of Technology, Hyderabad, India
|
14
|
Chang S, Lee U, Hong MJ, Jo YD, Kim JB. Time-Series Growth Prediction Model Based on U-Net and Machine Learning in Arabidopsis. FRONTIERS IN PLANT SCIENCE 2021; 12:721512. [PMID: 34858446 PMCID: PMC8631871 DOI: 10.3389/fpls.2021.721512] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/07/2021] [Accepted: 10/08/2021] [Indexed: 06/13/2023]
Abstract
Yield prediction for crops is essential information for food security. A high-throughput phenotyping platform (HTPP) generates the data of the complete life cycle of a plant. However, the data are rarely used for yield prediction because of the lack of quality image analysis methods, yield data associated with HTPP, and the time-series analysis method for yield prediction. To overcome these limitations, this study employed multiple deep learning (DL) networks to extract high-quality HTPP data, establish an association between HTPP data and the yield performance of crops, and select essential time intervals using machine learning (ML). The images of Arabidopsis were taken 12 times under environmentally controlled HTPP over 23 days after sowing (DAS). First, the features from images were extracted using DL network U-Net with SE-ResXt101 encoder and divided into early (15-21 DAS) and late (∼21-23 DAS) pre-flowering developmental stages using the physiological characteristics of the Arabidopsis plant. Second, the late pre-flowering stage at 23 DAS can be predicted using the ML algorithm XGBoost, based only on a portion of the early pre-flowering stage (17-21 DAS). This was confirmed using an additional biological experiment (P < 0.01). Finally, the projected area (PA) was estimated into fresh weight (FW), and the correlation coefficient between FW and predicted FW was calculated as 0.85. This was the first study that analyzed time-series data to predict the FW of related but different developmental stages and predict the PA. The results of this study were informative and enabled the understanding of the FW of Arabidopsis or yield of leafy plants and total biomass consumed in vertical farming. Moreover, this study highlighted the reduction of time-series data for examining interesting traits and future application of time-series analysis in various HTPPs.
Affiliation(s)
- Sungyul Chang: Radiation Breeding Research Team, Advanced Radiation Technology Institute (ARTI), Korea Atomic Energy Research Institute (KAERI), Jeongeup-si, South Korea
- Unseok Lee: Smart Farm Research Center, Korea Institute of Science and Technology (KIST), Gangneung-si, South Korea
- Min Jeong Hong: Radiation Breeding Research Team, Advanced Radiation Technology Institute (ARTI), Korea Atomic Energy Research Institute (KAERI), Jeongeup-si, South Korea
- Yeong Deuk Jo: Radiation Breeding Research Team, Advanced Radiation Technology Institute (ARTI), Korea Atomic Energy Research Institute (KAERI), Jeongeup-si, South Korea
- Jin-Baek Kim: Radiation Breeding Research Team, Advanced Radiation Technology Institute (ARTI), Korea Atomic Energy Research Institute (KAERI), Jeongeup-si, South Korea
|
15
|
Li H, Yu Y. Detecting a multigranularity event in an unequal interval time series based on self-adaptive segmenting. INTELL DATA ANAL 2021. [DOI: 10.3233/ida-205480] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
Analyzing the temporal behaviors and revealing the hidden rules of objects that produce time series data to detect the events that users are interested in have recently received a large amount of attention. Generally, in various application scenarios and most research works, the equal interval sampling of a time series is a requirement. However, this requirement is difficult to guarantee because of the presence of sampling errors in most situations. In this paper, a multigranularity event detection method for an unequal interval time series, called SSED (self-adaptive segmenting based event detection), is proposed. First, in view of the trend features of a time series, a self-adaptive segmenting algorithm is proposed to divide a time series into unfixed-length segmentations based on the trends. Then, by clustering the segmentations and mapping the clusters to different identical symbols, a symbol sequence is built. Finally, based on unfixed-length segmentations, the multigranularity events in the discrete symbol sequence are detected using a tree structure. The SSED is compared to two previous methods with ten public datasets. In addition, the SSED is applied to the public transport systems in Xiamen, China, using bus-speed time-series data. The experimental results show that the SSED can achieve higher efficiency and accuracy than existing algorithms.
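As a rough illustration of the first two steps described in this abstract (trend-based segmentation into variable-length segments, then a symbol sequence), the sketch below splits an unequally sampled series wherever the slope direction changes. It is a simplified stand-in, not the SSED algorithm: the function name, the `eps` tolerance, and the use of trend direction (U, D, F) as the symbol instead of clustering the segments are all assumptions made for the example.

```python
def segment_by_trend(times, values, eps=0.0):
    """Split a series into variable-length segments at slope-sign changes.

    Slopes use the actual time gaps, so unequal sampling intervals are handled.
    Returns (start_index, end_index, symbol) tuples, where the symbol is
    'U' (up), 'D' (down) or 'F' (flat).
    """
    def direction(slope):
        if slope > eps:
            return "U"
        if slope < -eps:
            return "D"
        return "F"

    segments = []
    start = 0
    current = None
    for i in range(1, len(values)):
        slope = (values[i] - values[i - 1]) / (times[i] - times[i - 1])
        symbol = direction(slope)
        if current is None:
            current = symbol
        elif symbol != current:
            # Close the running segment at the previous point and start a new one.
            segments.append((start, i - 1, current))
            start = i - 1
            current = symbol
    if current is not None:
        segments.append((start, len(values) - 1, current))
    return segments


# Hypothetical unequally spaced observations.
times = [0, 1, 3, 4, 7, 8]
values = [1, 2, 4, 3, 1, 1]
segments = segment_by_trend(times, values)
symbol_sequence = "".join(symbol for _, _, symbol in segments)
```

Event detection at multiple granularities would then operate on `symbol_sequence`; the tree structure the abstract mentions for that step is not sketched here.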
|
16
|
Accurate Demand Forecasting: A Flexible and Balanced Electric Power Production Big Data Virtualization Based on Photovoltaic Power Plant. ENERGIES 2021. [DOI: 10.3390/en14216915] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/17/2022]
Abstract
This paper has tried to execute accurate demand forecasting by utilizing big data visualization and proposes a flexible and balanced electric power production big data virtualization based on a photovoltaic power plant. First of all, this paper has tried to align electricity demand and supply as much as possible using big data. Second, by using big data to predict the supply of new renewable energy, an attempt was made to incorporate new and renewable energy into the current power supply system and to recommend an efficient energy distribution method. The first presented problem that had to be solved was the improvement in the accuracy of the existing electricity demand for forecasting models. This was explained through the relationship between the power demand and the number of specific words in the paper that use crawling by utilizing big data. The next problem arose because the current electricity production and supply system stores the amount of new renewable energy by changing the form of energy that is produced through ESS or that is pumped through water power generation without taking the amount of new renewable energy that is generated from sources such as thermal power, nuclear power, and hydropower into consideration. This occurs due to the difficulty of predicting power production using new renewable energy and the absence of a prediction system, which is a problem due to the inefficiency of changing energy types. Therefore, using game theory, the theoretical foundation of a power demand forecasting model based on big data-based renewable energy production forecasting was prepared.
|
17
|
Zekić-Sušac M, Mitrović S, Has A. Machine learning based system for managing energy efficiency of public sector as an approach towards smart cities. INTERNATIONAL JOURNAL OF INFORMATION MANAGEMENT 2021. [DOI: 10.1016/j.ijinfomgt.2020.102074] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Indexed: 10/25/2022]
|
18
|
Pegalajar M, Ruiz L, Cuéllar M, Rueda R. Analysis and enhanced prediction of the Spanish Electricity Network through Big Data and Machine Learning techniques. Int J Approx Reason 2021. [DOI: 10.1016/j.ijar.2021.03.002] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/24/2022]
|
19
|
Cui S, Wang Y, Yin Y, Cheng T, Wang D, Zhai M. A cluster-based intelligence ensemble learning method for classification problems. Inf Sci (N Y) 2021. [DOI: 10.1016/j.ins.2021.01.061] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 12/12/2022]
|
20
|
Predicting energy cost of public buildings by artificial neural networks, CART, and random forest. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2020.01.124] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Indexed: 11/22/2022]
|
21
|
Weight Feedback-Based Harmonic MDG-Ensemble Model for Prediction of Traffic Accident Severity. APPLIED SCIENCES-BASEL 2021. [DOI: 10.3390/app11115072] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/16/2022]
Abstract
Traffic accidents are emerging as a serious social problem in modern society but if the severity of an accident is quickly grasped, countermeasures can be organized efficiently. To solve this problem, the method proposed in this paper derives the MDG (Mean Decrease Gini) coefficient between variables to assess the severity of traffic accidents. Single models are designed to use coefficient, independent variables to determine and predict accident severity. The generated single models are fused using a weighted-voting-based bagging method ensemble to consider various characteristics and avoid overfitting. The variables used for predicting accidents are classified as dependent or independent and the variables that affect the severity of traffic accidents are predicted using the characteristics of causal relationships. Independent variables are classified as categorical and numerical variables. For this reason, a problem arises when the variation among dependent variables is imbalanced. Therefore, a harmonic average is applied to the weights to maintain the variables’ balance and determine the average rate of change. Through this, it is possible to establish objective criteria for determining the severity of traffic accidents, thereby improving reliability.
|
22
|
Savargiv M, Masoumi B, Keyvanpour MR. A New Random Forest Algorithm Based on Learning Automata. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE 2021; 2021:5572781. [PMID: 33854542 PMCID: PMC8019375 DOI: 10.1155/2021/5572781] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 02/12/2021] [Revised: 03/09/2021] [Accepted: 03/16/2021] [Indexed: 11/29/2022]
Abstract
The goal of aggregating the base classifiers is to achieve an aggregated classifier that has a higher resolution than individual classifiers. Random forest is one of the types of ensemble learning methods that have been considered more than other ensemble learning methods due to its simple structure, ease of understanding, as well as higher efficiency than similar methods. The ability and efficiency of classical methods are always influenced by the data. The capabilities of independence from the data domain, and the ability to adapt to problem space conditions, are the most challenging issues about the different types of classifiers. In this paper, a method based on learning automata is presented, through which the adaptive capabilities of the problem space, as well as the independence of the data domain, are added to the random forest to increase its efficiency. Using the idea of reinforcement learning in the random forest has made it possible to address issues with data that have a dynamic behaviour. Dynamic behaviour refers to the variability in the behaviour of a data sample in different domains. Therefore, to evaluate the proposed method, and to create an environment with dynamic behaviour, different domains of data have been considered. In the proposed method, the idea is added to the random forest using learning automata. The reason for this choice is the simple structure of the learning automata and the compatibility of the learning automata with the problem space. The evaluation results confirm the improvement of random forest efficiency.
Affiliation(s)
- Mohammad Savargiv: Faculty of Computer and Information Technology Engineering, Qazvin Branch, Islamic Azad University, Qazvin, Iran
- Behrooz Masoumi: Faculty of Computer and Information Technology Engineering, Qazvin Branch, Islamic Azad University, Qazvin, Iran
|
23
|
|
24
|
Research on a Gas Concentration Prediction Algorithm Based on Stacking. SENSORS 2021; 21:s21051597. [PMID: 33668797 PMCID: PMC7956455 DOI: 10.3390/s21051597] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 01/20/2021] [Revised: 02/14/2021] [Accepted: 02/20/2021] [Indexed: 11/17/2022]
Abstract
Machine learning algorithms play an important role in the detection of toxic, flammable and explosive gases, and they are extremely important for the study of mixed gas classification and concentration prediction methods. To solve the problem of low prediction accuracy of gas concentration regression prediction algorithms, a gas concentration prediction algorithm based on a stacking model is proposed in the current research. In this paper, the random forest, extremely randomized regression tree and gradient boosting decision tree (GBDT) regression algorithms are selected as the base learners, and the stacking algorithm takes the output of each base learner as input to train a new model that produces the final output. Within the stacking model, the grid search algorithm is used to automatically optimize the parameters so that the performance of the entire system is optimized. Experimental simulation shows that the gas concentration prediction algorithm based on the stacking model has a better prediction effect than other ensemble framework algorithms and that the accuracy of mixed gas concentration prediction is improved.
|
25
|
A spatial multi-resolution multi-objective data-driven ensemble model for multi-step air quality index forecasting based on real-time decomposition. COMPUT IND 2021. [DOI: 10.1016/j.compind.2020.103387] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 01/05/2023]
|
26
|
Torres JF, Hadjout D, Sebaa A, Martínez-Álvarez F, Troncoso A. Deep Learning for Time Series Forecasting: A Survey. BIG DATA 2021; 9:3-21. [PMID: 33275484 DOI: 10.1089/big.2020.0159] [Citation(s) in RCA: 41] [Impact Index Per Article: 13.7] [Indexed: 06/12/2023]
Abstract
Time series forecasting has become a very intensive field of research, and activity has increased further in recent years. Deep neural networks have proved to be powerful and are achieving high accuracy in many application fields. For these reasons, they are currently among the most widely used machine learning methods for problems involving big data. In this work, the time series forecasting problem is first formulated along with its mathematical fundamentals. Then, the most common deep learning architectures that are currently being successfully applied to predict time series are described, highlighting their advantages and limitations. Particular attention is given to feed-forward networks, recurrent neural networks (including Elman, long short-term memory, gated recurrent units, and bidirectional networks), and convolutional neural networks. Practical aspects of the successful application of deep learning to time series, such as the setting of hyper-parameter values and the choice of the most suitable frameworks, are also provided and discussed. Several fruitful research fields in which the architectures analyzed have obtained good performance are reviewed. As a result, research gaps have been identified in the literature for several domains of application, which are expected to inspire new and better forms of knowledge.
Affiliation(s)
- José F Torres
  - Data Science and Big Data Lab, Pablo de Olavide University, Seville, Spain
- Dalil Hadjout
  - Department of Commerce, SADEG Company (Sonelgaz Group), Bejaia, Algeria
- Abderrazak Sebaa
  - LIMED Laboratory, Faculty of Exact Sciences, University of Bejaia, Bejaia, Algeria
  - Higher School of Sciences and Technologies of Computing and Digital, Bejaia, Algeria
- Alicia Troncoso
  - Data Science and Big Data Lab, Pablo de Olavide University, Seville, Spain
|
27
|
Maté C. Combining Interval Time Series Forecasts. A First Step in a Long Way (Research Agenda). REVISTA COLOMBIANA DE ESTADÍSTICA 2021. [DOI: 10.15446/rce.v44n1.85116] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022]
Abstract
Every day we observe a world that is more complex, uncertain, and risky than the world of yesterday. Consequently, having accurate forecasts in economics, finance, energy, health, tourism, and so on is more critical than ever. Moreover, there is an increasing requirement to provide other types of forecasts beyond point ones, such as interval forecasts. After more than 50 years of research, there are two consensuses: “combining forecasts reduces the final forecasting error” and “a simple average of several forecasts often outperforms complicated weighting schemes”, the latter being named the “forecast combination puzzle (FCP)”. Interval-valued time series (ITS) concepts and several forecasting methods have been introduced in different papers and give answers to some big data challenges. Hence, one main issue is how to combine several forecasts obtained for one ITS. This paper proposes some combination schemes for two or more ITS forecasts. Some of them extend previous crisp combination schemes, incorporating as a novelty the use of Theil’s U. The FCP under the ITS forecasts framework will be analyzed in the context of different accuracy measures and some guidelines will be provided. An agenda for future research in the field of combining forecasts obtained for ITS will be outlined.
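As a rough illustration of the simplest scheme mentioned above, the following sketch combines several interval forecasts for one time step by averaging their lower and upper bounds. It is an assumption-laden toy, not the paper's method: the function name and the bound-wise weighting are hypothetical, and the Theil's U based weighting is not reproduced because the abstract does not define it.

```python
def combine_intervals(forecasts, weights=None):
    """Combine (lower, upper) interval forecasts for one time step.

    forecasts: list of (lower, upper) tuples, one per forecasting method.
    weights:   optional non-negative weights summing to 1; defaults to a
               simple average, the baseline behind the forecast combination puzzle.
    """
    n = len(forecasts)
    if n == 0:
        raise ValueError("need at least one forecast")
    if weights is None:
        weights = [1.0 / n] * n
    if len(weights) != n or any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative, one per forecast")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    # Non-negative weights keep lower <= upper whenever every input interval is valid.
    lower = sum(w * lo for w, (lo, _) in zip(weights, forecasts))
    upper = sum(w * hi for w, (_, hi) in zip(weights, forecasts))
    return lower, upper

# Two hypothetical interval forecasts for the same period.
print(combine_intervals([(10.0, 14.0), (12.0, 18.0)]))
```

Passing explicit weights instead of relying on the default is where a more elaborate scheme would plug in.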
|
28
|
Cost-sensitive probability for weighted voting in an ensemble model for multi-class classification problems. APPL INTELL 2021. [DOI: 10.1007/s10489-020-02106-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/18/2022]
Abstract
Ensemble learning is an approach that combines several classification models and can improve on the predictive performance of its component models. However, the benefit of combining models typically depends on the diversity and accuracy of their predictions, and multi-class data remain difficult. In the proposed approach, cost-sensitive learning was used to evaluate the prediction accuracy for each class, from which a cost-sensitivity matrix of the true positive (TP) rate was constructed. This TP rate can be used as a weight and combined with a probability value to drive ensemble learning for a specified class. We proposed a heterogeneous ensemble model, namely a combination of various individual classification models (support vector machine, Bayes, K-nearest neighbour, naïve Bayes, decision tree, and multi-layer perceptron), in experiments on 3-, 4-, 5- and 6-classifier models. The efficiencies of the proposed models were compared to those of the individual classifier models and homogeneous models (Adaboost, bagging, stacking, voting, random forest, and random subspaces) on various multi-class data sets. The experimental results demonstrate that the cost-sensitive probability for the weighted voting ensemble model that was derived from 3 models provided the most accurate results for the dataset in multi-class prediction. The objective of this study was to increase the efficiency of predicting classification results in multi-class classification tasks and to improve the classification results.
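One plausible reading of "TP rate used as a weight and combined with a probability value" is a soft vote in which each classifier's class probability is scaled by that classifier's true positive rate for the class. The sketch below follows that reading only; the exact formula is not given in the abstract, and the function name and the numbers are invented for illustration.

```python
def weighted_vote(probas, tp_rates):
    """Pick the class with the largest sum of (per-class TP rate x probability).

    probas:   one dict per classifier, mapping class label -> predicted probability.
    tp_rates: one dict per classifier, mapping class label -> TP rate on validation data.
    """
    classes = list(probas[0])
    scores = {
        c: sum(tp[c] * p[c] for p, tp in zip(probas, tp_rates))
        for c in classes
    }
    return max(scores, key=scores.get), scores

# Hypothetical outputs of three classifiers for one sample.
probas = [
    {"a": 0.6, "b": 0.3, "c": 0.1},
    {"a": 0.2, "b": 0.5, "c": 0.3},
    {"a": 0.3, "b": 0.4, "c": 0.3},
]
# Hypothetical per-class TP rates measured on held-out data.
tp_rates = [
    {"a": 0.9, "b": 0.5, "c": 0.5},
    {"a": 0.5, "b": 0.6, "c": 0.7},
    {"a": 0.8, "b": 0.5, "c": 0.6},
]
label, scores = weighted_vote(probas, tp_rates)
print(label, scores)
```

With these numbers an unweighted soft vote would favour class "b" (summed probability 1.2 against 1.1), while the TP-rate weighting shifts the decision to class "a", whose strongest supporters are also the classifiers that recognise "a" most reliably.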
|
29
|
Oner M, Ustundag A. Combining predictive base models using deep ensemble learning. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS 2020. [DOI: 10.3233/jifs-189126] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/15/2022]
Abstract
As information science and communication technologies have improved significantly, data volumes have expanded. As a result, advanced pre-processing and analysis of collected data have become crucial for extracting meaningful patterns hidden in the data. Traditional machine learning algorithms, however, generally fail to produce satisfactory results when analyzing complex data, mainly because it is difficult to capture the multiple characteristics of high-dimensional data. Within this scope, ensemble learning enables the integration of diversified single models that produce weak predictive results. The final combination is generally achieved by various voting schemes. On the other hand, if a large number of single models is used, a voting mechanism is unable to combine their results. At this point, Deep Learning (DL) provides the combination of the ensemble results in a considerable time. Unlike previous studies, we determine various predictive models in order to forecast the outcome of two different case studies. Data cleaning and feature selection are conducted in advance, and three predictive models are defined to be combined. DL-based integration is applied in place of the voting mechanism. The weak predictive results are fused using a Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) with different parameters and datasets, and the best predictors are extracted. After that, different experimental combinations are evaluated to obtain better prediction results. For comparison, grouped individual results (clusters) with proper parameters are compared with the DL-based ensemble results.
Affiliation(s)
- Mahir Oner
  - Istanbul Technical University, Industrial Engineering Department, Maçka, İstanbul, Turkey
- Alp Ustundag
  - Istanbul Technical University, Industrial Engineering Department, Maçka, İstanbul, Turkey
|
30
|
Big data time series forecasting based on pattern sequence similarity and its application to the electricity demand. Inf Sci (N Y) 2020. [DOI: 10.1016/j.ins.2020.06.014] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Indexed: 12/18/2022]
|
31
|
Wu D, Wang X, Su J, Tang B, Wu S. A Labeling Method for Financial Time Series Prediction Based on Trends. ENTROPY (BASEL, SWITZERLAND) 2020; 22:E1162. [PMID: 33286931 PMCID: PMC7597331 DOI: 10.3390/e22101162] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 09/23/2020] [Revised: 10/11/2020] [Accepted: 10/13/2020] [Indexed: 11/16/2022]
Abstract
Time series prediction has been widely applied in the finance industry, in applications such as stock market price and commodity price forecasting. Machine learning methods have been widely used in financial time series prediction in recent years. How financial time series data are labeled affects the prediction accuracy of machine learning models and, in turn, final investment returns, which makes labeling a hot topic. Existing labeling methods for financial time series mainly label data by comparing the current data with those of a short time period in the future. However, financial time series data are typically non-linear with obvious short-term randomness. Therefore, these labeling methods do not capture the continuous trend features of financial time series data, leading to a difference between their labeling results and real market trends. In this paper, a new labeling method called "continuous trend labeling" is proposed to address this problem. For the feature preprocessing stage, the paper proposes a new method that avoids the look-ahead bias of traditional data standardization or normalization processes. Then, a detailed logical explanation is given, continuous trend labeling is defined, and an automatic labeling algorithm is given to extract the continuous trend features of financial time series data. Experiments on the Shanghai Composite Index, the Shenzhen Component Index and some Chinese stocks showed that the proposed labeling method performs much better than state-of-the-art labeling methods in terms of classification accuracy and some other classification evaluation metrics. The results also indicate that deep learning models such as LSTM and GRU are more suitable for the prediction of financial time series data.
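The look-ahead bias mentioned above arises when a value at time t is standardized with a mean and standard deviation computed over the whole series, future included. The abstract does not describe the authors' remedy, so the following is only a generic sketch of one common way to avoid the problem: an expanding-window z-score that uses strictly past observations. The function name and the `min_history` parameter are hypothetical.

```python
from statistics import mean, pstdev

def causal_zscore(series, min_history=2):
    """Standardize each value using only observations that precede it.

    Returns None for positions with fewer than `min_history` past values,
    and 0.0 when the past values have zero spread.
    """
    out = []
    for t, x in enumerate(series):
        past = series[:t]  # strictly earlier values: no information from t or later
        if len(past) < min_history:
            out.append(None)
            continue
        mu, sigma = mean(past), pstdev(past)
        out.append((x - mu) / sigma if sigma > 0 else 0.0)
    return out

prices = [1.0, 3.0, 5.0, 4.0]
print(causal_zscore(prices))
```

A quick sanity check for any such transform is that altering later values must leave earlier outputs unchanged, which a full-sample z-score fails.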
Affiliation(s)
- Xiaolong Wang
  - The College of Computer Science and Technology, Harbin Institute of Technology, Shenzhen 518055, China; (D.W.); (J.S.); (B.T.); (S.W.)
|
32
|
D’Antoni F, Merone M, Piemonte V, Iannello G, Soda P. Auto-Regressive Time Delayed jump neural network for blood glucose levels forecasting. Knowl Based Syst 2020. [DOI: 10.1016/j.knosys.2020.106134] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 01/31/2023]
|
33
|
Oprea SV, Bâra A. Ultra-short-term forecasting for photovoltaic power plants and real-time key performance indicators analysis with big data solutions. Two case studies - PV Agigea and PV Giurgiu located in Romania. COMPUT IND 2020. [DOI: 10.1016/j.compind.2020.103230] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Indexed: 11/30/2022]
|
34
|
Martínez-Álvarez F, Asencio-Cortés G, Torres JF, Gutiérrez-Avilés D, Melgar-García L, Pérez-Chacón R, Rubio-Escudero C, Riquelme JC, Troncoso A. Coronavirus Optimization Algorithm: A Bioinspired Metaheuristic Based on the COVID-19 Propagation Model. BIG DATA 2020; 8:308-322. [PMID: 32716641 DOI: 10.1089/big.2020.0051] [Citation(s) in RCA: 46] [Impact Index Per Article: 11.5] [Indexed: 05/12/2023]
Abstract
This study proposes a novel bioinspired metaheuristic simulating how the coronavirus spreads and infects healthy people. From a primary infected individual (patient zero), the coronavirus rapidly infects new victims, creating large populations of infected people who will either die or spread infection. Relevant terms such as reinfection probability, super-spreading rate, social distancing measures, or traveling rate are introduced into the model to simulate the coronavirus activity as accurately as possible. The infected population initially grows exponentially over time, but taking into consideration social isolation measures, the mortality rate, and number of recoveries, the infected population gradually decreases. The coronavirus optimization algorithm has two major advantages when compared with other similar strategies. First, the input parameters are already set according to the disease statistics, preventing researchers from initializing them with arbitrary values. Second, the approach has the ability to end after several iterations, without setting this value either. Furthermore, a parallel multivirus version is proposed, where several coronavirus strains evolve over time and explore wider search space areas in fewer iterations. Finally, the metaheuristic has been combined with deep learning models to find optimal hyperparameters during the training phase. As an application case, the problem of electricity load time series forecasting has been addressed, showing quite remarkable performance.
Affiliation(s)
- F Martínez-Álvarez
  - Data Science and Big Data Lab, Pablo de Olavide University, Seville, Spain
- G Asencio-Cortés
  - Data Science and Big Data Lab, Pablo de Olavide University, Seville, Spain
- J F Torres
  - Data Science and Big Data Lab, Pablo de Olavide University, Seville, Spain
- D Gutiérrez-Avilés
  - Data Science and Big Data Lab, Pablo de Olavide University, Seville, Spain
- L Melgar-García
  - Data Science and Big Data Lab, Pablo de Olavide University, Seville, Spain
- R Pérez-Chacón
  - Data Science and Big Data Lab, Pablo de Olavide University, Seville, Spain
- C Rubio-Escudero
  - Department of Computer Science, University of Seville, Seville, Spain
- J C Riquelme
  - Department of Computer Science, University of Seville, Seville, Spain
- A Troncoso
  - Data Science and Big Data Lab, Pablo de Olavide University, Seville, Spain
|
35
|
Abstract
Large-scale deployments of Internet of Things (IoT) networks are becoming reality. From a technology perspective, a lot of information related to device parameters, channel states, and network and application data is stored in databases and can be used for extensive analysis to improve the functionality of IoT systems in terms of network performance and user services. LoRaWAN (Long Range Wide Area Network) is one of the emerging IoT technologies, with a simple protocol based on LoRa modulation. In this work, we discuss whether and how machine learning approaches can be used to improve network performance. To this aim, we describe a methodology to process LoRaWAN packets and apply a machine learning pipeline to: (i) perform device profiling, and (ii) predict the inter-arrival time of IoT packets. The latter analysis is closely related to channel and network usage and can be leveraged in the future for system performance enhancements. Our analysis mainly focuses on the use of k-means, Long Short-Term Memory neural networks and decision trees. We test these approaches on a real large-scale LoRaWAN network where the overall captured traffic is stored in a proprietary database. Our study shows how profiling techniques enable a machine learning prediction algorithm even when training is not possible because of high error rates perceived by some devices. In this challenging case, the prediction of the inter-arrival time of packets has an error of about 3.5% for 77% of real sequence cases.
|
36
|
|
37
|
Stability of Multiple Seasonal Holt-Winters Models Applied to Hourly Electricity Demand in Spain. APPLIED SCIENCES-BASEL 2020. [DOI: 10.3390/app10072630] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 11/17/2022]
Abstract
Electricity management and production depend heavily on demand forecasts. Any mismatch between the energy demanded and the energy produced entails enormous losses for the consumer. Transmission System Operators use time series-based tools to forecast future demand accurately and set the production program. One of the most effective and widely used methods is Holt-Winters. Recently, the incorporation of multiple seasonal Holt-Winters methods has improved the accuracy of the predictions. These forecasts depend greatly on the parameters with which the model is constructed. Forecasters need to deal with these parameter values when operating the model. In this article, the parameter space of the multiple seasonal Holt-Winters models applied to electricity demand in Spain is analysed and discussed. The parameter stability analysis helps forecasters better understand the behaviour of the predictions and manage their exploitation efficiently. The analysis addresses different time windows, depending on the period of the year, as well as different training set sizes. The results show the influence of the calendar effect on these parameters and whether or not it is necessary to update them in order to maintain good accuracy over time.
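To make concrete which parameters such a stability analysis is about, here is a minimal sketch of the additive Holt-Winters recursion with a single seasonal cycle. It is a simplification: the article studies multiple seasonal models, which add one further seasonal component (with its own smoothing parameter) per cycle, and the initialization used here is one simple choice among several. The function name and the toy data are hypothetical.

```python
def holt_winters_additive(y, m, alpha, beta, gamma, horizon):
    """Additive Holt-Winters with one seasonal cycle of length m.

    alpha, beta, gamma are the smoothing parameters for level, trend and season.
    Returns point forecasts for the next `horizon` steps.
    """
    if len(y) < 2 * m:
        raise ValueError("need at least two full seasons to initialize")
    # Simple initialization from the first two seasons (an assumption of this sketch).
    level = sum(y[:m]) / m
    trend = (sum(y[m:2 * m]) - sum(y[:m])) / (m * m)
    season = [y[i] - level for i in range(m)]
    for t in range(m, len(y)):
        s = season[t % m]
        prev_level = level
        level = alpha * (y[t] - s) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        season[t % m] = gamma * (y[t] - level) + (1 - gamma) * s
    n = len(y)
    return [level + h * trend + season[(n - 1 + h) % m] for h in range(1, horizon + 1)]

# Toy demand profile: a repeating four-step pattern with no trend.
demand = [10.0, 20.0, 30.0, 20.0] * 3
print(holt_winters_additive(demand, m=4, alpha=0.3, beta=0.1, gamma=0.2, horizon=4))
```

On real demand data the forecasts change with alpha, beta and gamma, and how much those fitted values drift across time windows and training set sizes is the question the article examines.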
|
38
|
A general framework and guidelines for benchmarking computational intelligence algorithms applied to forecasting problems derived from an application domain-oriented survey. Appl Soft Comput 2020. [DOI: 10.1016/j.asoc.2020.106103] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 11/21/2022]
|
39
|
DERN: Deep Ensemble Learning Model for Short- and Long-Term Prediction of Baltic Dry Index. APPLIED SCIENCES-BASEL 2020. [DOI: 10.3390/app10041504] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Indexed: 11/16/2022]
Abstract
The Baltic Dry Index (BDI) is a commonly used indicator of global shipping and trade activity. It influences stakeholders’ and ship-owners’ decisions regarding investments, chartering, operational plans, and export and import activities. Accurate prediction of the BDI is very challenging due to its volatility, non-stationarity, and complexity. To help stakeholders and ship-owners make sound short- and long-term maritime business decisions and avoid market risk, we performed short- and long-term predictions of the BDI using an ensemble deep-learning approach. In this study, we propose to apply recurrent neural network models for BDI prediction. State-of-the-art sequential deep-learning models such as RNN, LSTM, and GRU are employed to predict one- and multi-step-ahead BDI values. In order to increase the accuracy, we combine the models into an ensemble. In experiments, we compared our results with those of traditional methods such as ARIMA and MLP. The results showed that our proposed method outperforms ARIMA, MLP, RNN, LSTM, and GRU in both short- and long-term prediction of the BDI.
|
40
|
Analysis of net asset value prediction using low complexity neural network with various expansion techniques. EVOLUTIONARY INTELLIGENCE 2020. [DOI: 10.1007/s12065-020-00365-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/25/2022]
|
41
|
A Novel Ensemble Approach for the Forecasting of Energy Demand Based on the Artificial Bee Colony Algorithm. ENERGIES 2020. [DOI: 10.3390/en13030550] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 11/16/2022]
Abstract
Accurate forecasting of the energy demand is crucial for the rational formulation of energy policies for energy management. In this paper, a novel ensemble forecasting model based on the artificial bee colony (ABC) algorithm for the energy demand was proposed and adopted. The ensemble model forecasts were based on multiple time variables, such as the gross domestic product (GDP), industrial structure, energy structure, technological innovation, urbanization rate, population, consumer price index, and past energy demand. The model was trained and tested using the primary energy demand data collected in China. Seven base models, including the regression-based model and machine learning models, were utilized and compared to verify the superior performance of the ensemble forecasting model proposed herein. The results revealed that (1) the proposed ensemble model is significantly superior to the benchmark prediction models and the simple average ensemble prediction model just in terms of the forecasting accuracy and hypothesis test, (2) the proposed ensemble approach with the ABC algorithm can be employed as a promising framework for energy demand forecasting in terms of the forecasting accuracy and hypothesis test, and (3) the forecasting results obtained for the future energy demand by the ensemble model revealed that the future energy demand of China will maintain a steady growth trend.
|
42
|
Integration of Demand Response and Short-Term Forecasting for the Management of Prosumers’ Demand and Generation. ENERGIES 2019. [DOI: 10.3390/en13010011] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 11/16/2022]
Abstract
The development of Short-Term Forecasting Techniques has a great importance for power system scheduling and managing. Therefore, many recent research papers have dealt with the proposal of new forecasting models searching for higher efficiency and accuracy. Several kinds of artificial intelligence (AI) techniques have provided good performance at predicting and their efficiency mainly depends on the characteristics of the time series data under study. Load forecasting has been widely studied in recent decades and models providing mean absolute percentage errors (MAPEs) below 5% have been proposed. On the other hand, short-term generation forecasting models for photovoltaic plants have been more recently developed and the MAPEs are in general still far from those achieved from load forecasting models. The aim of this paper is to propose a methodology that could help power systems or aggregators to make up for the lack of accuracy of the current forecasting methods when predicting renewable energy generation. The proposed methodology is carried out in three consecutive steps: (1) short-term forecasting of energy consumption and renewable generation; (2) classification of daily pattern for the renewable generation data using Dynamic Time Warping; (3) application of Demand Response strategies using Physically Based Load Models. Real data from a small town in Spain were used to illustrate the performance and efficiency of the proposed procedure.
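Two quantities named in this abstract are easy to make concrete: the mean absolute percentage error (MAPE) used to judge forecasts, and Dynamic Time Warping (DTW) as a distance for classifying daily generation patterns. The sketch below shows the textbook versions of both. The nearest-template classifier, the template names and the toy profiles are assumptions for illustration, since the abstract does not describe how the classification step is set up.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)

def dtw_distance(a, b):
    """Classic dynamic-programming DTW with absolute difference as local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def classify_day(profile, templates):
    """Assign a daily profile to the template with the smallest DTW distance."""
    return min(templates, key=lambda name: dtw_distance(profile, templates[name]))

# Hypothetical daily generation templates and one observed day whose ramp starts late.
templates = {"sunny": [0, 2, 5, 2, 0], "cloudy": [0, 1, 2, 1, 0]}
observed = [0, 0, 2, 5, 2, 0]
print(classify_day(observed, templates))
print(mape([100.0, 200.0], [110.0, 180.0]))
```

DTW suits this task because two days with the same shape but a shifted ramp still match closely, whereas a point-by-point distance would penalise the shift.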
|
43
|
OCAPIS: R package for Ordinal Classification and Preprocessing in Scala. PROGRESS IN ARTIFICIAL INTELLIGENCE 2019. [DOI: 10.1007/s13748-019-00175-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2022]
|
44
|
A Comparative Study of Time Series Forecasting Methods for Short Term Electric Energy Consumption Prediction in Smart Buildings. ENERGIES 2019. [DOI: 10.3390/en12101934] [Citation(s) in RCA: 42] [Impact Index Per Article: 8.4] [Indexed: 11/17/2022]
Abstract
Smart buildings are equipped with sensors that allow monitoring a range of building systems, including heating and air conditioning, lighting and general electric energy consumption. These data can then be stored and analyzed. The ability to use historical data on electric energy consumption could help improve the energy efficiency of such buildings, as well as help spot problems related to wasted energy. This problem is even more important when considering that buildings are some of the largest consumers of energy. In this paper, we are interested in forecasting the energy consumption of smart buildings and, to this aim, we propose a comparative study of different forecasting strategies. To do this, we used data on the electric consumption registered by thirteen buildings located on a university campus in the south of Spain. The empirical comparison of the selected methods on the different data showed that some methods are more suitable than others for this kind of problem. In particular, we show that strategies based on Machine Learning approaches seem to be more suitable for this task.
|