1
Guo Z, Zhou Q, Ren H, Ma H, Li H. ADP-based fault-tolerant consensus control for multiagent systems with irregular state constraints. Neural Netw 2024;180:106737. [PMID: 39316952] [DOI: 10.1016/j.neunet.2024.106737]
Abstract
This paper investigates the consensus control problem for nonlinear multiagent systems (MASs) subject to irregular state constraints and actuator faults using an adaptive dynamic programming (ADP) algorithm. Unlike the regular state constraints considered in previous studies, the irregular state constraints addressed here may be asymmetric and time-varying, and can emerge or disappear during operation. By developing a system transformation method based on a one-to-one state mapping, equivalent unconstrained MASs are obtained. Subsequently, a finite-time distributed observer is designed to estimate the state information of the leader, and the consensus control problem is converted into a tracking control problem for each agent, ensuring that actuator faults of any agent do not affect its neighboring agents. Then, a critic-only ADP-based fault-tolerant control strategy, consisting of the optimal control policy for the nominal system and online compensation for time-varying additive faults, is proposed to achieve optimal tracking control. To enhance the learning efficiency of the critic neural networks (NNs), an improved weight learning law utilizing stored historical data is employed, ensuring the convergence of the critic NN weights to their ideal values under a finite excitation condition. Finally, a practical example of multiple manipulator systems is presented to demonstrate the effectiveness of the developed control method.
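The one-to-one state mapping idea in this entry can be illustrated with a minimal sketch. The logit-style mapping below is an illustrative stand-in, not the paper's actual transformation; the bound functions `lo(t)`, `hi(t)` and the mapping form are assumptions:

```python
import math

def to_unconstrained(x, lo, hi):
    """Map x in the open interval (lo, hi) one-to-one onto the real line."""
    return math.log((x - lo) / (hi - x))

def to_constrained(s, lo, hi):
    """Inverse mapping: any real s back into (lo, hi)."""
    e = math.exp(s)
    return (lo + hi * e) / (1.0 + e)

# Round-trip check with asymmetric, time-varying bounds lo(t), hi(t)
for t in (0.0, 0.5, 1.0):
    lo, hi = -1.0 - 0.2 * t, 2.0 + 0.5 * t   # hypothetical bound functions
    for x in (-0.9, 0.0, 1.5):
        s = to_unconstrained(x, lo, hi)
        assert abs(to_constrained(s, lo, hi) - x) < 1e-9
```

Because the mapped variable `s` is unconstrained, any controller that keeps `s` bounded automatically keeps `x` strictly inside its (possibly time-varying) bounds.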
Affiliation(s)
- Zijie Guo
- School of Electronics and Information, Guangdong Polytechnic Normal University, Guangzhou, 510665, Guangdong, China
- Qi Zhou
- School of Automation, Guangdong-Hong Kong Joint Laboratory for Intelligent Decision and Cooperative Control, and Guangdong Province Key Laboratory of Intelligent Decision and Cooperative Control, Guangdong University of Technology, Guangzhou, 510006, Guangdong, China
- Hongru Ren
- School of Automation, Guangdong-Hong Kong Joint Laboratory for Intelligent Decision and Cooperative Control, and Guangdong Province Key Laboratory of Intelligent Decision and Cooperative Control, Guangdong University of Technology, Guangzhou, 510006, Guangdong, China
- Hui Ma
- School of Mathematics and Statistics, Guangdong University of Technology, Guangzhou, 510006, Guangdong, China
- Hongyi Li
- College of Electronic and Information Engineering and Chongqing Key Laboratory of Generic Technology and System of Service Robots, Southwest University, Chongqing, 400715, China
2
Finite-Horizon Robust Event-Triggered Control for Nonlinear Multi-agent Systems with State Delay. Neural Process Lett 2022. [DOI: 10.1007/s11063-022-11085-0]
3
Narayanan V, Modares H, Jagannathan S, Lewis FL. Event-Driven Off-Policy Reinforcement Learning for Control of Interconnected Systems. IEEE Transactions on Cybernetics 2022;52:1936-1946. [PMID: 32639933] [DOI: 10.1109/tcyb.2020.2991166]
Abstract
In this article, we introduce a novel approximate optimal decentralized control scheme for uncertain input-affine nonlinear interconnected systems. In the proposed scheme, we design a controller and an event-triggering mechanism (ETM) at each subsystem to optimize a local performance index and reduce redundant control updates, respectively. To this end, we formulate a noncooperative dynamic game at every subsystem, in which we collectively model the interconnection inputs and the event-triggering error as adversarial players that deteriorate the subsystem performance, and model the control policy as the performance optimizer competing against these adversarial players. To obtain a solution to this game, one has to solve the associated Hamilton-Jacobi-Isaacs (HJI) equation, which does not have a closed-form solution even when the subsystem dynamics are accurately known. In this context, we introduce an event-driven off-policy integral reinforcement learning (OIRL) approach to learn an approximate solution to this HJI equation using artificial neural networks (NNs). We then use this NN-approximated solution to design the control policy and event-triggering threshold at each subsystem. In the learning framework, we guarantee Zeno-free behavior of the ETMs at each subsystem using the exploration policies. Finally, we derive sufficient conditions to guarantee uniformly ultimately bounded regulation of the controlled system states and demonstrate the efficacy of the proposed framework with numerical examples.
4
Wang D, Qiao J, Cheng L. An Approximate Neuro-Optimal Solution of Discounted Guaranteed Cost Control Design. IEEE Transactions on Cybernetics 2022;52:77-86. [PMID: 32175887] [DOI: 10.1109/tcyb.2020.2977318]
Abstract
Adaptive optimal feedback stabilization is investigated in this article for discounted guaranteed cost control of uncertain nonlinear dynamical systems. Via theoretical analysis, the guaranteed cost control problem involving a discounted utility is transformed into the design of a discounted optimal control policy for the nominal plant. The size of the neighborhood associated with uniformly ultimately bounded stability is discussed. Then, to derive the approximate optimal solution of the modified Hamilton-Jacobi-Bellman equation, an improved self-learning algorithm under the framework of adaptive critic designs is established, which facilitates the neuro-optimal control implementation without the additional requirement of an initial admissible condition. Simulation verification on several dynamical systems, including the F16 aircraft plant, is provided to illustrate the effectiveness of the discounted guaranteed cost control method.
5
Zhao F, Gao W, Jiang ZP, Liu T. Event-Triggered Adaptive Optimal Control With Output Feedback: An Adaptive Dynamic Programming Approach. IEEE Transactions on Neural Networks and Learning Systems 2021;32:5208-5221. [PMID: 33035169] [DOI: 10.1109/tnnls.2020.3027301]
Abstract
This article presents an event-triggered output-feedback adaptive optimal control method for continuous-time linear systems. First, it is shown that the unmeasurable states can be reconstructed using the measured input and output data. An event-based feedback strategy is then proposed to reduce the number of controller updates and save communication resources. The discrete-time algebraic Riccati equation is iteratively solved through event-triggered adaptive dynamic programming based on both policy iteration (PI) and value iteration (VI) methods. The convergence of the proposed algorithm and the stability of the closed-loop system are established using Lyapunov techniques. Two numerical examples are employed to verify the effectiveness of the design methodology.
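The value iteration route to the discrete-time algebraic Riccati equation mentioned here can be sketched in a few lines. The plant matrices below are hypothetical, and this model-based iteration is only a stand-in for the paper's data-driven, event-triggered version:

```python
import numpy as np

# A hypothetical second-order discrete-time plant (not from the paper)
A = np.array([[1.0, 0.1], [0.0, 0.9]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])

def vi_dare(A, B, Q, R, iters=500):
    """Value iteration on the discrete-time algebraic Riccati equation,
    starting from P0 = 0 (no initial stabilizing gain required)."""
    P = np.zeros_like(Q)
    K = np.zeros((B.shape[1], A.shape[0]))
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)   # current gain
        P = Q + A.T @ P @ (A - B @ K)                       # Riccati backup
    return P, K

P, K = vi_dare(A, B, Q, R)
# Residual of the DARE at the converged P should be ~0
res = Q + A.T @ P @ A - A.T @ P @ B @ np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A) - P
print(np.max(np.abs(res)))
```

A policy-iteration variant would instead start from a stabilizing gain and solve a Lyapunov equation at each step; value iteration trades that requirement for slower, contraction-style convergence.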
6
Wang D, Zhao M, Ha M, Ren J. Neural optimal tracking control of constrained nonaffine systems with a wastewater treatment application. Neural Netw 2021;143:121-132. [PMID: 34118779] [DOI: 10.1016/j.neunet.2021.05.027]
Abstract
In this paper, we aim to solve the optimal tracking control problem for a class of nonaffine discrete-time systems with actuator saturation. First, a data-based neural identifier is constructed to learn the unknown system dynamics. Then, according to the expression of the trained neural identifier, we can obtain the steady control corresponding to the reference trajectory. Next, by employing the iterative dual heuristic dynamic programming algorithm, the new costate function and the tracking control law are developed. Two other neural networks are used to estimate the costate function and approximate the tracking control law. Considering the approximation errors of the neural networks, the stability analysis of the proposed algorithm for the specific systems is provided using the Lyapunov approach. Finally, simulations and comparisons confirm the superiority of the developed optimal tracking method. Moreover, the trajectory tracking performance on a wastewater treatment application further verifies the proposed approach.
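The steady-control step (finding the input that holds a nonaffine system at the reference) can be sketched with simple root-finding. The one-step model `f` below is a hypothetical nonaffine stand-in, not the paper's trained neural identifier:

```python
import math

def f(x, u):
    """Stand-in one-step model x_next = f(x, u), nonaffine in u
    (NOT the paper's identifier; purely illustrative)."""
    return 0.8 * x + math.tanh(u) + 0.1 * u

def steady_control(r, lo=-5.0, hi=5.0, tol=1e-10):
    """Bisection on g(u) = f(r, u) - r, assuming g is increasing in u
    and changes sign on [lo, hi]."""
    g = lambda u: f(r, u) - r
    assert g(lo) < 0.0 < g(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

r = 1.0
u_ss = steady_control(r)          # input that makes r a fixed point
assert abs(f(r, u_ss) - r) < 1e-8
```

In the paper this step is read off from the trained identifier instead; the point of the sketch is only that a steady control exists and is computable once a one-step model is available.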
Affiliation(s)
- Ding Wang
- Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China; Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing 100124, China; Beijing Institute of Artificial Intelligence, Beijing University of Technology, Beijing 100124, China.
- Mingming Zhao
- Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China; Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing 100124, China; Beijing Institute of Artificial Intelligence, Beijing University of Technology, Beijing 100124, China.
- Mingming Ha
- School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing 100083, China.
- Jin Ren
- Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China; Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing 100124, China; Beijing Institute of Artificial Intelligence, Beijing University of Technology, Beijing 100124, China.
7
Ma B, Li Y. Compensator-critic structure-based event-triggered decentralized tracking control of modular robot manipulators: theory and experimental verification. Complex Intell Syst 2021. [DOI: 10.1007/s40747-021-00359-0]
Abstract
This paper presents a novel compensator-critic structure-based event-triggered decentralized tracking control of modular robot manipulators (MRMs). On the basis of subsystem dynamics under the joint torque feedback (JTF) technique, the proposed tracking error fusion function, which includes the position error and velocity error, is utilized to construct the performance index function. By analyzing the dynamic uncertainties, a local dynamic information-based robust controller is designed to compensate for the model uncertainty. Based on the adaptive dynamic programming (ADP) algorithm and the event-triggered mechanism, the decentralized tracking control is obtained by solving the event-triggered Hamilton-Jacobi-Bellman equation (HJBE) with a critic neural network (NN). The tracking error of the closed-loop manipulator system is proved to be uniformly ultimately bounded (UUB) using the Lyapunov stability theorem. Finally, experimental results illustrate the effectiveness of the developed control method.
8
Yang Y, Vamvoudakis KG, Modares H, Yin Y, Wunsch DC. Safe Intermittent Reinforcement Learning With Static and Dynamic Event Generators. IEEE Transactions on Neural Networks and Learning Systems 2020;31:5441-5455. [PMID: 32054590] [DOI: 10.1109/tnnls.2020.2967871]
Abstract
In this article, we present an intermittent framework for safe reinforcement learning (RL) algorithms. First, we develop a barrier function-based system transformation to impose state constraints while converting the original problem to an unconstrained optimization problem. Second, based on the derived optimal policies, two types of intermittent feedback RL algorithms are presented, namely, a static one and a dynamic one. We finally leverage an actor/critic structure to solve the problem online while guaranteeing optimality, stability, and safety. Simulation results show the efficacy of the proposed approach.
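A barrier-function transformation of this kind can be sketched concretely. The form below is one standard choice from the safe-RL literature (not necessarily this paper's exact construction): it maps a constrained state x in the open interval (-a, A) one-to-one onto the real line, so an unconstrained controller on the transformed state keeps the original state safe:

```python
import math

def bf(x, a, A):
    """Barrier function: maps x in (-a, A) onto the real line, bf(0) = 0."""
    return math.log((A * (a + x)) / (a * (A - x)))

def bf_inv(s, a, A):
    """Inverse barrier function: any real s back into (-a, A)."""
    e = math.exp(s)
    return a * A * (e - 1.0) / (A + a * e)

a, A = 1.0, 2.0                      # asymmetric bounds: x must stay in (-1, 2)
assert bf(0.0, a, A) == 0.0          # the origin is preserved
for x in (-0.99, -0.5, 0.0, 1.0, 1.99):
    assert abs(bf_inv(bf(x, a, A), a, A) - x) < 1e-9
```

Note that bf blows up as x approaches either bound, which is exactly what lets boundedness of the transformed state imply strict constraint satisfaction for the original one.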
9
Event-driven H∞ control with critic learning for nonlinear systems. Neural Netw 2020;132:30-42. [PMID: 32861146] [DOI: 10.1016/j.neunet.2020.08.004]
Abstract
In this paper, we study an event-driven H∞ control problem of continuous-time nonlinear systems. Initially, with the introduction of a discounted cost function, we convert the nonlinear H∞ control problem into an event-driven nonlinear two-player zero-sum game. Then, we develop an event-driven Hamilton-Jacobi-Isaacs equation (HJIE) related to the two-player zero-sum game. After that, we propose a novel event-triggering condition that guarantees the absence of Zeno behavior. The triggering threshold in the newly proposed event-triggering condition can be kept positive without needing to carefully choose the prescribed level of disturbance attenuation. To solve the event-driven HJIE, we employ an adaptive critic architecture that contains a single critic neural network (NN). The weight parameters of the critic NN are tuned via the gradient descent method. We then carry out stability analysis of the hybrid closed-loop system based on Lyapunov's direct approach. Finally, we provide two nonlinear plants, including a pendulum system, to validate the proposed event-driven H∞ control scheme.
10
Integral reinforcement learning based event-triggered control with input saturation. Neural Netw 2020;131:144-153. [PMID: 32771844] [DOI: 10.1016/j.neunet.2020.07.016]
Abstract
In this paper, a novel integral reinforcement learning (IRL)-based event-triggered adaptive dynamic programming scheme is developed for input-saturated continuous-time nonlinear systems. By using the IRL technique, the learning system does not require knowledge of the drift dynamics. Then, a single critic neural network is designed to approximate the unknown value function, and its learning does not require an initial admissible control. In order to reduce computational and communication costs, an event-triggered control law is designed, with a triggering threshold given to guarantee the asymptotic stability of the control system. Two examples are employed in the simulation studies, and the results verify the effectiveness of the developed IRL-based event-triggered control method.
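The update-saving effect of an event-triggered law can be seen in a toy closed loop. Everything below (a scalar plant, a fixed stabilizing gain, a constant threshold) is an illustrative assumption rather than the paper's learned controller or its stability-derived threshold:

```python
# Hypothetical scalar plant xdot = a*x + b*u with a fixed stabilizing gain k;
# the control input is held constant between trigger instants (zero-order hold).
a, b, k = 1.0, 1.0, 3.0
dt, T = 0.001, 5.0
x, x_held = 2.0, 2.0
threshold = 0.05          # trigger when the gap |x_held - x| exceeds this
updates, steps = 0, 0

t = 0.0
while t < T:
    if abs(x_held - x) > threshold:   # event-triggering condition
        x_held = x                    # sample the state, update the control
        updates += 1
    u = -k * x_held                   # held control between events
    x += dt * (a * x + b * u)         # Euler step
    t += dt
    steps += 1

print(updates, steps, abs(x))        # far fewer updates than time steps
```

Even this crude setup shows the trade-off the abstract describes: the state is regulated to a small neighborhood of the origin while the controller updates only at a small fraction of the simulation steps.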
11
Xiao X, Fu D, Wang G, Liao S, Qi Y, Huang H, Jin L. Two neural dynamics approaches for computing system of time-varying nonlinear equations. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2020.02.011]
12
Sun W, Su SF, Xia J, Wu Y. Adaptive Tracking Control of Wheeled Inverted Pendulums With Periodic Disturbances. IEEE Transactions on Cybernetics 2020;50:1867-1876. [PMID: 30582561] [DOI: 10.1109/tcyb.2018.2884707]
Abstract
This paper reports our study on adaptive tracking control for a mobile wheeled inverted pendulum with periodic disturbances and parametric uncertainties. With an appropriate reduced dynamic model, incorporating repetitive learning strategies with dynamic decoupling and related adaptive control techniques, a novel controller is constructed to ensure that the output tracking errors of the system stay within a small neighborhood of zero while all other signals remain semiglobally uniformly bounded. Meanwhile, only one parameter estimate is used in the adaptive controller design, which overcomes the problem of over-parametrization. Furthermore, a required condition on the period identifier mechanism is proposed. Finally, detailed simulation results are presented to demonstrate the effectiveness of the proposed control schemes.
13
Liang M, Wang D, Liu D. Improved value iteration for neural-network-based stochastic optimal control design. Neural Netw 2020;124:280-295. [PMID: 32036226] [DOI: 10.1016/j.neunet.2020.01.004]
Abstract
In this paper, a novel value iteration adaptive dynamic programming (ADP) algorithm, referred to as the improved value iteration ADP algorithm, is presented to obtain the optimal policy for discrete stochastic processes. In the improved algorithm, we propose, for the first time, a new criterion to verify whether the obtained policy is stable for stochastic processes. By analyzing the convergence properties of the proposed algorithm, it is shown that the iterative value functions converge to the optimum. In addition, the algorithm allows the initial value function to be an arbitrary positive semi-definite function. Finally, two simulation examples are presented to validate the effectiveness of the developed method.
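The headline property here (convergence from an arbitrary nonnegative initial value function) is easy to check on a toy stochastic problem. The two-state MDP below is made up for illustration and sidesteps the paper's neural-network implementation:

```python
import numpy as np

# Tiny stochastic MDP (illustrative, not from the paper): 2 states, 2 actions.
# P[a][s, s'] = transition probability, c[s, a] = stage cost, gamma = discount.
P = [np.array([[0.9, 0.1], [0.2, 0.8]]),
     np.array([[0.1, 0.9], [0.7, 0.3]])]
c = np.array([[1.0, 2.0], [4.0, 0.5]])
gamma = 0.9

def value_iteration(V0, iters=2000):
    """Plain value iteration from an arbitrary nonnegative start V0."""
    V = V0.astype(float).copy()
    for _ in range(iters):
        # Bellman backup: V(s) = min_a [ c(s, a) + gamma * E[V(s')] ]
        Qsa = np.stack([c[:, a] + gamma * P[a] @ V for a in range(2)], axis=1)
        V = Qsa.min(axis=1)
    return V

# Two very different nonnegative initializations reach the same optimum,
# mirroring the paper's claim for arbitrary positive semi-definite starts.
V_a = value_iteration(np.zeros(2))
V_b = value_iteration(np.array([5.0, 10.0]))
assert np.allclose(V_a, V_b, atol=1e-8)
print(V_a)
```

The discounted Bellman operator is a gamma-contraction, which is why the starting point only affects the transient, not the limit.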
Affiliation(s)
- Mingming Liang
- State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China.
- Ding Wang
- Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China.
- Derong Liu
- School of Automation, Guangdong University of Technology, Guangzhou 510006, China.
14
Wen G, Ge SS, Chen CLP, Tu F, Wang S. Adaptive Tracking Control of Surface Vessel Using Optimized Backstepping Technique. IEEE Transactions on Cybernetics 2019;49:3420-3431. [PMID: 29994688] [DOI: 10.1109/tcyb.2018.2844177]
Abstract
In this paper, a tracking control approach for a surface vessel is developed based on a new control technique named optimized backstepping (OB), which considers optimization as a backstepping design principle. Since surface vessel systems are modeled by second-order dynamics in strict-feedback form, backstepping is an ideal technique for accomplishing the tracking task. In the backstepping control of the surface vessel, the virtual and actual controls are designed to be the optimized solutions of the corresponding subsystems, so the overall control is optimized. In general, optimal control is designed based on the solution of the Hamilton-Jacobi-Bellman equation. However, solving this equation is very difficult or even impossible due to its inherent nonlinearity and complexity. To overcome this difficulty, the reinforcement learning (RL) strategy with an actor-critic architecture is usually considered, in which the critic and actor are utilized to evaluate the control performance and execute the control behavior, respectively. By employing the actor-critic RL algorithm for both the virtual and actual controls of the vessel, it is proven that the desired optimization and tracking performance can be achieved. Simulation results further demonstrate the effectiveness of the proposed surface vessel control.