1. Li Y, He S, Li Y, Shi Y, Zeng Z. Federated Multiagent Deep Reinforcement Learning Approach via Physics-Informed Reward for Multimicrogrid Energy Management. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:5902-5914. [PMID: 37018258] [DOI: 10.1109/tnnls.2022.3232630]
Abstract
The utilization of large-scale distributed renewable energy (RE) promotes the development of the multimicrogrid (MMG), which raises the need for an effective energy management method that minimizes economic costs and maintains energy self-sufficiency. Multiagent deep reinforcement learning (MADRL) has been widely applied to the energy management problem because of its real-time scheduling ability. However, its training requires massive energy operation data from microgrids (MGs), and gathering these data from different MGs would threaten their privacy and data security. This article therefore tackles this practical yet challenging issue by proposing a federated MADRL (F-MADRL) algorithm via a physics-informed reward. In this algorithm, the federated learning (FL) mechanism is introduced to train F-MADRL, thus ensuring the privacy and security of the data. In addition, a decentralized MMG model is built, and the energy of each participating MG is managed by an agent that aims to minimize economic costs and maintain energy self-sufficiency according to the physics-informed reward. First, each MG performs self-training on local energy operation data to train its local agent model. Then, these local models are periodically uploaded to a server and their parameters are aggregated to build a global agent, which is broadcast to the MGs and replaces their local agents. In this way, the experience of each MG agent is shared without explicitly transmitting energy operation data, thus protecting privacy and ensuring data security. Finally, experiments are conducted on the Oak Ridge National Laboratory distributed energy control communication laboratory microgrid (ORNL-MG) test system, and comparisons verify the effectiveness of introducing the FL mechanism and the superior performance of the proposed F-MADRL.
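The aggregation step the abstract describes (local agent models uploaded, averaged on a server, and broadcast back) can be sketched as plain weighted parameter averaging in the style of FedAvg. This is a generic sketch, not the paper's exact scheme; the function name `fedavg` and the per-MG weighting are illustrative assumptions.

```python
import numpy as np

def fedavg(local_params, weights=None):
    """Weighted parameter averaging in the style of FedAvg.

    Sketch of the aggregation step only: each microgrid uploads its
    agent's parameters (never its raw energy data), the server averages
    them, and the result is broadcast back as the global agent.
    local_params: list of dicts mapping layer name -> np.ndarray.
    weights: optional per-MG weights (e.g., proportional to local data size).
    """
    n = len(local_params)
    w = np.full(n, 1.0 / n) if weights is None else np.asarray(weights, float) / np.sum(weights)
    return {name: sum(wi * p[name] for wi, p in zip(w, local_params))
            for name in local_params[0]}

# One federated round with two microgrid agents (toy 2-parameter "models").
locals_ = [{"w": np.array([1.0, 2.0])}, {"w": np.array([3.0, 4.0])}]
print(fedavg(locals_)["w"])  # -> [2. 3.]
```

Because only parameters cross the network, the raw operation data never leave each MG, which is the privacy argument the abstract makes.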
2. Canese L, Cardarilli GC, Di Nunzio L, Fazzolari R, Re M, Spanò S. Resilient multi-agent RL: introducing DQ-RTS for distributed environments with data loss. Sci Rep 2024; 14:1994. [PMID: 38263140] [PMCID: PMC10805896] [DOI: 10.1038/s41598-023-48767-1]
Abstract
This paper proposes DQ-RTS, a novel decentralized Multi-Agent Reinforcement Learning algorithm designed to address challenges posed by non-ideal communication and a varying number of agents in distributed environments. DQ-RTS incorporates an optimized communication protocol to mitigate data loss between agents. A comparative analysis between DQ-RTS and its centralized counterpart Q-RTS, or Q-learning for Real-Time Swarms, demonstrates the superior convergence speed of DQ-RTS, achieving a remarkable speed-up factor ranging from 1.6 to 2.7 in scenarios with non-ideal communication. Moreover, DQ-RTS exhibits robustness by maintaining performance even when the agent population fluctuates, making it well-suited for applications requiring adaptable agent numbers over time. Additionally, extensive experiments conducted on various benchmark tasks validate the scalability and effectiveness of DQ-RTS, further establishing its potential as a practical solution for resilient Multi-Agent Reinforcement Learning in dynamic distributed environments.
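The data-loss tolerance described above can be illustrated with a toy Q-table merge in which each peer's update survives a lossy channel only with some probability. This is a generic sketch of swarm Q-table sharing under message loss, not the actual DQ-RTS protocol; `merge_q` and `p_loss` are invented names.

```python
import numpy as np

rng = np.random.default_rng(0)

def merge_q(local_q, peer_qs, p_loss=0.3):
    """Merge peers' Q-tables, tolerating dropped messages.

    Generic sketch, not the DQ-RTS protocol: when a peer's table arrives
    (its message is not lost), take the element-wise best value estimate;
    when the message is dropped, keep the local values, so learning
    degrades gracefully instead of failing.
    """
    merged = np.array(local_q, dtype=float)
    for q in peer_qs:
        if rng.random() >= p_loss:  # message survived the lossy channel
            merged = np.maximum(merged, q)
    return merged

local = np.array([[0.1, 0.5], [0.3, 0.0]])
peer = np.array([[0.4, 0.2], [0.1, 0.9]])
print(merge_q(local, [peer], p_loss=0.0))  # guaranteed delivery: element-wise max
```

With `p_loss=1.0` every message is dropped and each agent simply keeps learning from its own table, which is the graceful-degradation property the abstract highlights.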
Affiliation(s)
- Lorenzo Canese: Department of Electronics, University of Rome Tor Vergata, 00133, Rome, Italy
- Luca Di Nunzio: Department of Electronics, University of Rome Tor Vergata, 00133, Rome, Italy
- Rocco Fazzolari: Department of Electronics, University of Rome Tor Vergata, 00133, Rome, Italy
- Marco Re: Department of Electronics, University of Rome Tor Vergata, 00133, Rome, Italy
- Sergio Spanò: Department of Electronics, University of Rome Tor Vergata, 00133, Rome, Italy
3.
Abstract
Power systems are going through a transition period. Consumers want more active participation in electric system management, assuming the role of producer-consumers, or prosumers for short. Prosumers' energy production is heavily based on renewable energy sources, which, besides their recognized environmental benefits, entail energy management challenges. For instance, the energy consumption of household appliances can produce misleading patterns. Another challenge relates to energy costs, since inefficient systems or unbalanced energy control can cause economic losses for the prosumer. So-called home energy management systems (HEMS) emerge as a solution: well-designed HEMS allow prosumers to reach higher levels of energy management, ensuring optimal use of assets and appliances. This paper presents a comprehensive systematic review of the literature on optimization techniques recently used in the development of HEMS, also taking into account the key factors that influence HEMS development at the technical and computational levels. The systematic review covers the period 2018-2021. As a result, the major recent developments in the field of HEMS are presented in an integrated manner, and the techniques are divided into four broad categories: traditional techniques, model predictive control, heuristics and metaheuristics, and other techniques.
4. Distributed Reinforcement Learning for the Management of a Smart Grid Interconnecting Independent Prosumers. Energies 2022. [DOI: 10.3390/en15041440]
Abstract
In the context of eco-responsible production and distribution of electrical energy at the local scale of an urban territory, we consider a smart grid as a system interconnecting different prosumers, all of whom retain their decision-making autonomy and defend their own interests within a comprehensive system whose rules, accepted by all, encourage virtuous behavior. In this paper, we present and analyze a model and a management method for smart grids shared between different kinds of independent actors, each respecting its own interests, that encourage behavior making the smart grid as energy-independent as possible from external energy suppliers. We consider a game-theoretic model in which each actor of the smart grid is a player, and we investigate distributed machine-learning algorithms for decision-making that lead the game to converge to stable situations, in particular to a Nash equilibrium. We propose a Linear Reward-Inaction algorithm that achieves Nash equilibria most of the time, both within a single time slot and across time, allowing the smart grid to maximize its energy independence from external suppliers.
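The Linear Reward-Inaction scheme mentioned above is a standard learning-automaton update: on a success the chosen action's probability is pushed toward 1, and on a failure nothing changes. A minimal sketch under that textbook form (the learning rate `lr` and the binary reward are illustrative, not the paper's exact parameters):

```python
import numpy as np

def lri_update(p, action, reward, lr=0.1):
    """One Linear Reward-Inaction step for a learning automaton.

    p: probability vector over actions; reward is binary (1 = success).
    On a success the chosen action's probability moves toward 1 while the
    others shrink proportionally; on a failure nothing changes (the
    'inaction'), which is what drives convergence to pure strategies and
    hence, in many games, toward a Nash equilibrium.
    """
    p = np.asarray(p, dtype=float)
    if reward:
        p = p - lr * p       # shrink every action's probability...
        p[action] += lr      # ...then reinforce the rewarded action
    return p

print(lri_update([0.5, 0.5], action=0, reward=1))  # -> [0.55 0.45]
```

Each player running this rule on its own reward signal is what makes the scheme distributed: no actor needs to observe the others' strategies directly.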
5. Multi-Agent Reinforcement Learning: A Review of Challenges and Applications. Applied Sciences 2021. [DOI: 10.3390/app11114948]
Abstract
In this review, we present an analysis of the most widely used multi-agent reinforcement learning algorithms. Starting from single-agent reinforcement learning algorithms, we focus on the most critical issues that must be taken into account when extending them to multi-agent scenarios. The analyzed algorithms are grouped according to their features. We present a detailed taxonomy of the main multi-agent approaches proposed in the literature, focusing on their underlying mathematical models. For each algorithm, we describe possible application fields while pointing out its pros and cons. The described multi-agent algorithms are compared in terms of the characteristics most important for multi-agent reinforcement learning applications, namely nonstationarity, scalability, and observability. We also describe the most common benchmark environments used to evaluate the performance of the considered methods.
6. A Multi-Agent Reinforcement Learning Framework for Lithium-ion Battery Scheduling Problems. Energies 2020. [DOI: 10.3390/en13081982]
Abstract
This paper presents a reinforcement learning framework for solving battery scheduling problems in order to extend the lifetime of batteries used in electric vehicles (EVs), cellular phones, and embedded systems. Battery pack lifetime has often been the limiting factor in many of today's smart systems, from mobile devices and wireless sensor networks to EVs. Smart charge-discharge scheduling of battery packs is essential to obtain a superlinear gain in overall system lifetime, due to the recovery effect and nonlinearities in battery characteristics. Additionally, smart scheduling has been shown to be beneficial for optimizing the system's thermal profile and minimizing the chance of irreversible battery damage. The rapidly growing community and development infrastructure have added deep reinforcement learning (DRL) to the available tools for designing battery management systems. By leveraging the representational power of deep neural networks and the flexibility and versatility of reinforcement learning, DRL offers a powerful solution for both roofline analysis and real-world deployment in complicated use cases. This work presents a DRL-based battery scheduling framework with high flexibility to fit various battery models and application scenarios. In discussing this framework, comparisons are also made between conventional heuristics-based methods and DRL. The experiments demonstrate that the DRL-based scheduling framework achieves battery lifetime comparable to the best weighted-k round-robin (kRR) heuristic scheduling algorithm, while offering much greater flexibility in accommodating a wide range of battery models and use cases, including thermal control and imbalanced battery packs.
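The weighted-k round-robin baseline the abstract compares against can be approximated by a greedy per-slot pick of the k cells with the highest state of charge, leaving depleted cells idle so the recovery effect can restore them. This is a toy stand-in for the paper's kRR heuristic, not its exact algorithm; `krr_schedule` is an invented name.

```python
def krr_schedule(socs, k):
    """Greedy stand-in for weighted-k round-robin cell selection.

    Each time slot, discharge the k cells with the highest state of
    charge (SoC) and leave the rest idle so the recovery effect can act.
    A toy approximation of the kRR baseline, not its exact algorithm.
    """
    order = sorted(range(len(socs)), key=lambda i: socs[i], reverse=True)
    return sorted(order[:k])

# Four cells; the two healthiest (indices 0 and 2) serve the load this slot.
print(krr_schedule([0.9, 0.2, 0.7, 0.5], k=2))  # -> [0, 2]
```

A DRL scheduler replaces this fixed rule with a learned policy, which is why it can also fold in thermal limits and cell imbalance that a hand-tuned heuristic handles poorly.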
7. Optimal Asset Planning for Prosumers Considering Energy Storage and Photovoltaic (PV) Units: A Stochastic Approach. Energies 2020. [DOI: 10.3390/en13071813]
Abstract
In the distribution system, customers increasingly use renewable energy sources and battery energy storage systems (BESS), transforming traditional loads into active prosumers. Methodologies are therefore needed to provide prosumers with tools to optimize their investments and increase business opportunities. In this paper, a stochastic mixed-integer linear programming (MILP) formulation is proposed to determine the optimal sizes of prosumer assets, considering the use of a BESS and photovoltaic (PV) units. The objective is to minimize the total cost of the system, defined as the combination of the solar PV investment, the BESS investment, asset maintenance costs, and the cost of electricity supplied by the grid. The developed method determines the optimal size of the PV units, the power/energy capacities of the BESS, and the optimal initial energy stored in the BESS. Both deterministic and stochastic approaches were explored. For each approach, the proposed model was tested in three cases with varying combinations of grid power, PV units, and BESS. Comparing the optimal values from each case shows that more economical plans for prosumers can be achieved when PV and BESS technologies are taken into account.
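The stochastic sizing idea (minimize investment plus expected grid cost over demand scenarios) can be illustrated with a brute-force toy in place of the paper's MILP. Every price, yield, and scenario value below is invented, and the half-day load split is a deliberately crude battery model.

```python
import itertools

# Toy scenario-based sizing (a stand-in for the paper's MILP): choose PV kW
# and BESS kWh to minimize investment plus expected grid-energy cost over a
# few equiprobable daily-demand scenarios. All numbers below are invented.
PV_COST, BESS_COST, GRID_PRICE = 800.0, 300.0, 0.2  # $/kW, $/kWh, $/kWh
PV_YIELD = 4.0                                      # kWh per installed kW per day
SCENARIOS = [20.0, 30.0, 45.0]                      # daily demand scenarios, kWh

def daily_grid_import(pv_kw, bess_kwh, demand):
    gen = PV_YIELD * pv_kw
    day_use = min(demand / 2, gen)                   # load coincident with sun
    stored = min(bess_kwh, max(0.0, gen - day_use))  # surplus charged into BESS
    night_use = min(demand / 2, stored)              # discharged after sunset
    return demand - day_use - night_use              # remainder bought from grid

def total_cost(pv_kw, bess_kwh, days=3650):
    invest = PV_COST * pv_kw + BESS_COST * bess_kwh
    mean_import = sum(daily_grid_import(pv_kw, bess_kwh, d)
                      for d in SCENARIOS) / len(SCENARIOS)
    return invest + GRID_PRICE * mean_import * days

# Brute-force search over candidate sizes; the MILP solves this exactly
# (with integer variables) instead of enumerating.
best = min(itertools.product(range(0, 11), range(0, 21, 5)),
           key=lambda s: total_cost(*s))
print("best (pv_kw, bess_kwh):", best)
```

The stochastic flavor comes entirely from averaging the import cost over `SCENARIOS`; a deterministic variant would size against a single demand profile instead.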
8. Adaptive Human–Machine Evaluation Framework Using Stochastic Gradient Descent-Based Reinforcement Learning for Dynamic Competing Network. Applied Sciences 2020. [DOI: 10.3390/app10072558]
Abstract
Complex problems require considerable work, extensive computation, and the development of effective solution methods. Recently, physical hardware- and software-based technologies have been utilized to support problem solving with computers. However, problem solving often involves human expertise and guidance. In these cases, accurate human evaluations and diagnoses must be communicated to the system as a series of real numbers, whereas previous studies have used only binary numbers for this purpose. To achieve this objective, this paper proposes a new method for learning complex network topologies that coexist and compete in the same environment and interfere with each other's learning objectives. Considering the special problem of reinforcement learning in an environment where multiple network topologies coexist, we propose a policy that properly computes and updates the rewards derived from quantitative human evaluation and combines them with the rewards of the system. The rewards derived from quantitative human evaluation are designed to be updated quickly, easily, and adaptively. The new framework was applied to a basketball game for validation and demonstrated greater effectiveness than existing methods.
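The reward design described above, a real-valued human evaluation that is tracked adaptively and combined with the system reward, can be sketched as an exponential-moving-average blend. The class, parameter names, and update rule here are assumptions for illustration, not the paper's exact policy.

```python
class HumanInLoopReward:
    """Combine the system reward with a fast-adapting human evaluation.

    Sketch only, not the paper's exact update: the human score is a real
    number (not merely binary, which is the point the abstract stresses),
    tracked with an exponential moving average so new evaluations are
    absorbed quickly, then blended with the environment's own reward.
    """
    def __init__(self, alpha=0.5, ema=0.8):
        self.alpha = alpha            # weight of the human signal in the blend
        self.ema = ema                # high weight on the newest human score
        self.human_estimate = 0.0

    def update_human(self, score):
        # quick, adaptive absorption of the latest human evaluation
        self.human_estimate = self.ema * score + (1 - self.ema) * self.human_estimate

    def reward(self, r_system):
        return (1 - self.alpha) * r_system + self.alpha * self.human_estimate

h = HumanInLoopReward()
h.update_human(1.0)       # human rates the last play highly
print(h.reward(0.5))      # -> 0.65
```

Raising `ema` makes the agent track the human's most recent judgment more aggressively, which matches the "updated quickly and easily" requirement in the abstract.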