1
Li M, Wang D, Ren J, Qiao J. Advanced optimal tracking integrating a neural critic technique for asymmetric constrained zero-sum games. Neural Netw 2024; 177:106388. [PMID: 38776760 DOI: 10.1016/j.neunet.2024.106388] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2023] [Revised: 04/14/2024] [Accepted: 05/12/2024] [Indexed: 05/25/2024]
Abstract
This paper investigates the optimal tracking issue for continuous-time (CT) nonlinear asymmetric constrained zero-sum games (ZSGs) by exploiting the neural critic technique. Initially, an improved algorithm is constructed to tackle the tracking control problem of nonlinear CT multiplayer ZSGs. Also, we give a novel nonquadratic function to settle the asymmetric constraints. One thing worth noting is that the method used in this paper to solve asymmetric constraints eliminates the strict restriction on the control matrix compared to the previous ones. Further, the optimal controls, the worst disturbances, and the tracking Hamilton-Jacobi-Isaacs equation are derived. Next, a single critic neural network is built to estimate the optimal cost function, thus obtaining the approximations of the optimal controls and the worst disturbances. The critic network weight is updated by the normalized steepest descent algorithm. Additionally, based on the Lyapunov method, the stability of the tracking error and the weight estimation error of the critic network is analyzed. In the end, two examples are offered to validate the theoretical results.
Affiliation(s)
- Menghua Li, Ding Wang, Jin Ren, Junfei Qiao: Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China; Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing 100124, China; Beijing Institute of Artificial Intelligence, Beijing University of Technology, Beijing 100124, China; Beijing Laboratory of Smart Environmental Protection, Beijing University of Technology, Beijing 100124, China.
2
Chen H, Liu Z, Alippi C, Huang B, Liu D. Explainable Intelligent Fault Diagnosis for Nonlinear Dynamic Systems: From Unsupervised to Supervised Learning. IEEE Trans Neural Netw Learn Syst 2024; 35:6166-6179. [PMID: 36074885 DOI: 10.1109/tnnls.2022.3201511] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Indexed: 06/15/2023]
Abstract
The increased complexity and intelligence of automation systems require the development of intelligent fault diagnosis (IFD) methodologies. By relying on the concept of a suspected space, this study develops explainable data-driven IFD approaches for nonlinear dynamic systems. More specifically, we parameterize nonlinear systems through a generalized kernel representation for system modeling and the associated fault diagnosis. An important result obtained is a unified form of kernel representations, applicable to both unsupervised and supervised learning. More importantly, through a rigorous theoretical analysis, we discover the existence of a bridge (i.e., a bijective mapping) between some supervised and unsupervised learning-based entities. Notably, the designed IFD approaches achieve the same performance with the use of this bridge. In order to have a better understanding of the results obtained, both unsupervised and supervised neural networks are chosen as the learning tools to identify the generalized kernel representations and design the IFD schemes; an invertible neural network is then employed to build the bridge between them. This article is a perspective article, whose contribution lies in proposing and formalizing the fundamental concepts for explainable intelligent learning methods, contributing to system modeling and data-driven IFD designs for nonlinear dynamic systems.
3
Liang M, Wang Y, Liu D. An Efficient Impulsive Adaptive Dynamic Programming Algorithm for Stochastic Systems. IEEE Trans Cybern 2023; 53:5545-5559. [PMID: 35380980 DOI: 10.1109/tcyb.2022.3158898] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
In this study, a novel general impulsive transition matrix is defined, which can reveal the transition dynamics and probability distribution evolution patterns for all system states between two impulsive "events," instead of two regular time indexes. Based on this general matrix, the policy iteration-based impulsive adaptive dynamic programming (IADP) algorithm along with its variant, which is a more efficient IADP (EIADP) algorithm, are developed in order to solve the optimal impulsive control problems of discrete stochastic systems. Through analyzing the monotonicity, stability, and convergency properties of the obtained iterative value functions and control laws, it is proved that the IADP and EIADP algorithms both converge to the optimal impulsive performance index function. By dividing the whole impulsive policy into smaller pieces, the proposed EIADP algorithm updates the iterative policies in a "piece-by-piece" manner according to the actual hardware constraints. This feature of the EIADP method enables these ADP-based algorithms to be fully optimized to run on all "sizes" of computing devices including the ones with low memory spaces. A simulation experiment is conducted to validate the effectiveness of the present methods.
4
Qian YY, Liu M, Wan Y, Lewis FL, Davoudi A. Distributed Adaptive Nash Equilibrium Solution for Differential Graphical Games. IEEE Trans Cybern 2023; 53:2275-2287. [PMID: 34623292 DOI: 10.1109/tcyb.2021.3114749] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/13/2023]
Abstract
This article investigates differential graphical games for linear multiagent systems with a leader on fixed communication graphs. The objective is to make each agent synchronize to the leader and, meanwhile, optimize a performance index, which depends on the control policies of its own and its neighbors. To this end, a distributed adaptive Nash equilibrium solution is proposed for the differential graphical games. This solution, in contrast to the existing ones, is not only Nash but also fully distributed in the sense that each agent only uses local information of its own and its immediate neighbors without using any global information of the communication graph. Moreover, the asymptotic stability and global Nash equilibrium properties are analyzed for the proposed distributed adaptive Nash equilibrium solution. As an illustrative example, the differential graphical game solution is applied to the microgrid secondary control problem to achieve fully distributed voltage synchronization with optimized performance.
5
Safe reinforcement learning for discrete-time fully cooperative games with partial state and control constraints using control barrier functions. Neurocomputing 2023. [DOI: 10.1016/j.neucom.2022.10.058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
6
Constrained Optimal Control for Nonlinear Multi-Input Safety-Critical Systems with Time-Varying Safety Constraints. Mathematics 2022. [DOI: 10.3390/math10152744] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/10/2022]
Abstract
In this paper, we investigate the constrained optimal control problem of nonlinear multi-input safety-critical systems with uncertain disturbances and time-varying safety constraints. By utilizing a barrier function transformation, together with a new disturbance-related term and a smooth safety boundary function, a nominal system-dependent multi-input barrier transformation architecture is developed to deal with the time-varying safety constraints and uncertain disturbances. Based on the obtained transformation system, the coupled Hamilton–Jacobi–Bellman (HJB) function is established to obtain the constrained Nash equilibrium solution. In addition, due to the fact that it is difficult to solve the HJB function directly, the single critic neural network (NN) is constructed to approximate the optimal performance index function of different control inputs, respectively. It is proved theoretically that, under the influence of uncertain disturbances and time-varying safety constraints, the system states and neural network parameters can be uniformly ultimately bounded (UUB) by the proposed neural network approximation method. Finally, the effectiveness of the proposed method is verified by two nonlinear simulation examples.
7
Guan C, Jiang Y. A tractor-trailer parking control scheme using adaptive dynamic programming. Complex Intell Syst 2022. [DOI: 10.1007/s40747-021-00330-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
Abstract
This paper studies the online learning control of a truck-trailer parking problem via adaptive dynamic programming (ADP). The contribution is twofold. First, a novel ADP method is developed for systems with parametric nonlinearities. It learns the optimal control policy of the linearized system at the origin, while the learning process utilizes online measurements of the full system and is robust with respect to nonlinear disturbances. Second, a control strategy is formulated for a commonly seen truck-trailer parallel parking problem, and the proposed ADP method is integrated into the strategy to provide online learning capabilities and to handle uncertainties. A numerical simulation is conducted to demonstrate the effectiveness of the proposed methodology.
9
Zhang J, Zhang H, Gao Z, Sun S. Time-varying formation control with general linear multi-agent systems by distributed event-triggered mechanisms under fixed and switching topologies. Neural Comput Appl 2022. [DOI: 10.1007/s00521-021-06539-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
10
Event-triggered optimal control for discrete-time multi-player non-zero-sum games using parallel control. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2021.10.073] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/21/2022]