1. Sun J, Dai J, Zhang H, Yu S, Xu S, Wang J. Neural-Network-Based Immune Optimization Regulation Using Adaptive Dynamic Programming. IEEE Transactions on Cybernetics 2023; 53:1944-1953. [PMID: 35767503] [DOI: 10.1109/tcyb.2022.3179302]
Abstract
This article investigates an optimal regulation scheme for the interaction between tumor and immune cells based on the adaptive dynamic programming (ADP) approach. The therapeutic goal is to inhibit the growth of tumor cells to an allowable injury degree while maximizing the number of immune cells. A reliable controller is derived through the ADP approach so that the cell populations reach the specified ideal states. First, the main objective is to weaken the negative effects caused by chemotherapy and immunotherapy, which means that minimal doses of chemotherapeutic and immunotherapeutic drugs suffice in the treatment process. Second, according to the nonlinear dynamical mathematical model of tumor cells, the chemotherapeutic and immunotherapeutic drugs act as powerful regulatory measures, which constitutes a closed-loop control behavior. Finally, the system states and critic weight errors are proved to be uniformly ultimately bounded under the appropriate optimal control strategy, and simulation results demonstrate the effectiveness of the cybernetics methodology.
2. Yuwen C, Wang X, Liu S, Zhang X, Sun B. Distributed Nash equilibrium seeking strategy with incomplete information. ISA Transactions 2022; 129:372-379. [PMID: 35125213] [DOI: 10.1016/j.isatra.2022.01.022]
Abstract
In this paper, two kinds of distributed Nash equilibrium seeking strategies based on the Kalman filter are proposed for non-cooperative games with incomplete information. In the discrete-time system with process and measurement noises, each player, selfish and considering only its own profit, uses the gradient method to maximize its benefit. Since each payoff function depends on all players' states, a Kalman filter and leader-following consensus are used to estimate those states over the network. Furthermore, considering the trade-off between the precision of the Nash equilibrium strategy and the communication rate, another Nash equilibrium seeking method is proposed by introducing an event-based scheduler. The convergence of both strategies is analyzed via the Lyapunov method, and both are proved to be bounded in the mean-square sense. Simulation examples are given to verify their effectiveness.
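The core gradient step behind such seeking strategies can be illustrated in isolation. Below is a minimal sketch of deterministic gradient play on a hypothetical two-player quadratic game; the payoffs, step size, and closed-form equilibrium are illustrative assumptions, and the paper's Kalman filtering, consensus estimation, and event-based scheduling are all omitted.

```python
# Hypothetical quadratic game (illustrative, not taken from the paper):
# player 1 minimizes J1 = (x1 - 1)^2 + 0.5*x1*x2,
# player 2 minimizes J2 = (x2 - 2)^2 + 0.5*x1*x2.
def gradients(x1, x2):
    """Each player's partial derivative of its own cost w.r.t. its own action."""
    g1 = 2.0 * (x1 - 1.0) + 0.5 * x2
    g2 = 2.0 * (x2 - 2.0) + 0.5 * x1
    return g1, g2

def gradient_play(x1=0.0, x2=0.0, lr=0.1, steps=500):
    """Distributed gradient play: each selfish player descends its own cost."""
    for _ in range(steps):
        g1, g2 = gradients(x1, x2)
        x1 -= lr * g1
        x2 -= lr * g2
    return x1, x2

x1_star, x2_star = gradient_play()
# Best responses x1 = 1 - 0.25*x2 and x2 = 2 - 0.25*x1 intersect at
# the Nash equilibrium x1 = 8/15, x2 = 28/15.
```

Because each player's own curvature dominates the coupling term, this iteration contracts to the unique Nash equilibrium; the paper's contribution is making such a scheme work when players only have noisy, partial state information.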
Affiliation(s)
- Cheng Yuwen, School of Control Science and Engineering, Shandong University, Jinan 250012, China
- Xiaowen Wang, School of Control Science and Engineering, Shandong University, Jinan 250012, China
- Shuai Liu, School of Control Science and Engineering, Shandong University, Jinan 250012, China
- Xianfu Zhang, School of Control Science and Engineering, Shandong University, Jinan 250012, China
- Bo Sun, School of Control Science and Engineering, Shandong University, Jinan 250012, China
3. Yang X, Zhu Y, Dong N, Wei Q. Decentralized Event-Driven Constrained Control Using Adaptive Critic Designs. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:5830-5844. [PMID: 33861716] [DOI: 10.1109/tnnls.2021.3071548]
Abstract
We study the decentralized event-driven control problem of nonlinear dynamical systems with mismatched interconnections and asymmetric input constraints. To begin with, by introducing a discounted cost function for each auxiliary subsystem, we transform the decentralized event-driven constrained control problem into a group of nonlinear H2-constrained optimal control problems. Then, we develop the event-driven Hamilton-Jacobi-Bellman equations (ED-HJBEs), which arise in the nonlinear H2-constrained optimal control problems. Meanwhile, we demonstrate that the solutions of the ED-HJBEs together keep the overall system stable in the sense of uniform ultimate boundedness (UUB). To solve the ED-HJBEs, we build a critic-only architecture under the framework of adaptive critic designs, which employs only critic neural networks and updates their weight vectors via the gradient descent method. After that, based on the Lyapunov approach, we prove that the UUB stability of all signals in the closed-loop auxiliary subsystems is assured. Finally, simulations of an illustrative nonlinear interconnected plant are provided to validate the present designs.
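The critic-only idea, tuning critic weights by gradient descent on the HJB residual, can be sketched on a much simpler problem. The scalar linear-quadratic example below is an assumption for illustration (not the constrained interconnected plant of the paper): the critic V(x) = p*x^2 has a single weight, and the scalar Riccati equation gives a checkable target.

```python
import math

# Assumed scalar system x' = a*x + b*u with cost integrand q*x^2 + r*u^2.
# Critic: V(x) = p * x^2; the implied optimal policy is u = -(b*p/r) * x.
a, b, q, r = -1.0, 1.0, 1.0, 1.0

def hjb_residual(p, x):
    # Residual of 0 = q x^2 + V'(x)(a x + b u*) + r u*^2 with u* = -(b p / r) x,
    # which simplifies to x^2 * (q + 2 a p - (b^2 / r) p^2).
    return x * x * (q + 2.0 * a * p - (b * b / r) * p * p)

def train_critic(p=0.0, lr=0.01, epochs=2000, samples=(0.5, 1.0, 1.5)):
    """Gradient descent on 0.5 * residual^2 at sampled states (critic-only)."""
    for _ in range(epochs):
        for x in samples:
            e = hjb_residual(p, x)
            de_dp = x * x * (2.0 * a - 2.0 * (b * b / r) * p)
            p -= lr * e * de_dp
        # no actor network: the policy is read off the critic weight p
    return p

p_hat = train_critic()
# Riccati root for these coefficients: p = -1 + sqrt(2) ≈ 0.41421
p_true = -1.0 + math.sqrt(2.0)
```

Driving the residual to zero at enough states recovers the Riccati solution here; the paper's architecture does the analogous weight update for neural-network critics of each auxiliary subsystem.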
4. Huo X, Karimi HR, Zhao X, Wang B, Zong G. Adaptive-Critic Design for Decentralized Event-Triggered Control of Constrained Nonlinear Interconnected Systems Within an Identifier-Critic Framework. IEEE Transactions on Cybernetics 2022; 52:7478-7491. [PMID: 33400659] [DOI: 10.1109/tcyb.2020.3037321]
Abstract
This article studies the decentralized event-triggered control problem for a class of constrained nonlinear interconnected systems. By assigning a specific cost function to each constrained auxiliary subsystem, the original control problem is equivalently transformed into finding a series of optimal control policies updated in an aperiodic manner, and these optimal event-triggered control laws together constitute the desired decentralized controller. It is strictly proven that the system under consideration is stable in the sense of uniform ultimate boundedness, as guaranteed by the solutions of the event-triggered Hamilton-Jacobi-Bellman equations. Different from traditional adaptive critic design methods, we present an identifier-critic network architecture to relax the restrictions posed on the system dynamics, and the actor network commonly used to approximate the optimal control law is circumvented. The weights of the critic network are tuned on the basis of the gradient descent approach together with historical data, so that the persistence-of-excitation condition is no longer needed. The validity of our control scheme is demonstrated through a simulation example.
5. Yang X, Xu M, Wei Q. Dynamic Event-Sampled Control of Interconnected Nonlinear Systems Using Reinforcement Learning. IEEE Transactions on Neural Networks and Learning Systems 2022; PP:923-937. [PMID: 35666792] [DOI: 10.1109/tnnls.2022.3178017]
Abstract
We develop a decentralized dynamic event-based control strategy for nonlinear systems subject to matched interconnections. To begin with, we introduce a dynamic event-based sampling mechanism, which relies on the system's states and on variables generated by time-based differential equations. Then, we prove that the decentralized event-based controller for the whole system is composed of all the optimal event-based control policies of the nominal subsystems. To derive these optimal event-based control policies, we design a critic-only architecture to solve the related event-based Hamilton-Jacobi-Bellman equations in the reinforcement learning framework. The implementation of such an architecture uses only critic neural networks (NNs), with their weight vectors updated through the gradient descent method together with concurrent learning. After that, we demonstrate that the asymptotic stability of the closed-loop nominal subsystems and the uniform ultimate boundedness of the critic NNs' weight estimation errors are guaranteed by using Lyapunov's approach. Finally, we provide simulations of a matched nonlinear interconnected plant to validate the present theoretical claims.
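A dynamic event-based sampling mechanism of the kind described, with an internal variable driven by a time-based differential equation that delays triggering relative to a static rule, can be sketched as follows. The scalar plant, gains, and thresholds here are illustrative assumptions, not the paper's interconnected system.

```python
# Sketch of a dynamic (Girard-style) trigger on an assumed scalar plant
# x' = -x + u, with u = -x_hat held constant between sampling events.
def simulate(dt=0.01, steps=500, sigma=0.1, lam=1.0, theta=1.0):
    x, x_hat, eta = 1.0, 1.0, 0.1     # eta: internal dynamic variable, eta(0) > 0
    events = 0
    for _ in range(steps):
        e = x_hat - x                  # sampling-induced error
        gap = sigma * x * x - e * e    # margin of the underlying static rule
        # Dynamic rule: sample only once the internal variable is exhausted,
        # so triggering is postponed relative to "gap < 0" alone.
        if eta + theta * gap <= 0.0:
            x_hat = x                  # transmit the current state
            events += 1
            e, gap = 0.0, sigma * x * x
        eta += dt * (-lam * eta + gap)  # time-based differential equation for eta
        x += dt * (-x - x_hat)          # plant under zero-order-hold control
    return x, events

x_final, n_events = simulate()
```

The state still converges toward the origin while transmissions occur only at a small fraction of the simulation steps, which is the resource-saving effect the dynamic mechanism is designed for.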
6. Robust Tracking Control for Non-Zero-Sum Games of Continuous-Time Uncertain Nonlinear Systems. Mathematics 2022. [DOI: 10.3390/math10111904]
Abstract
In this paper, a new adaptive critic design is proposed to approximate the online Nash equilibrium solution for the robust trajectory tracking control of non-zero-sum (NZS) games in continuous-time uncertain nonlinear systems. First, an augmented system is constructed by combining the tracking error and the reference trajectory. By modifying the cost function, the robust tracking control problem is transformed into an optimal tracking control problem. Based on adaptive dynamic programming (ADP), a single critic neural network (NN) is applied for each player to approximately solve the coupled Hamilton–Jacobi–Bellman (HJB) equations, and the obtained control laws are regarded as the feedback Nash equilibrium. Two additional terms are introduced in the weight update law of each critic NN, which strengthen the weight update process and eliminate the strict requirement for an initially stabilizing control policy. More importantly, the stability of the closed-loop system is guaranteed through Lyapunov theory, and the robust tracking performance is analyzed. Finally, the effectiveness of the proposed scheme is verified by two examples.
7. Zhao Q, Sun J, Wang G, Chen J. Event-Triggered ADP for Nonzero-Sum Games of Unknown Nonlinear Systems. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:1905-1913. [PMID: 33882002] [DOI: 10.1109/tnnls.2021.3071545]
Abstract
For nonzero-sum (NZS) games of nonlinear systems, reinforcement learning (RL) or adaptive dynamic programming (ADP) has shown its capability of iteratively approximating the desired performance index and the optimal input policy. In this article, an event-triggered ADP is proposed for NZS games of continuous-time nonlinear systems with completely unknown system dynamics. To approximate the Nash equilibrium solution, critic neural networks and actor neural networks are utilized to estimate the value functions and the control policies, respectively. Compared with the traditional time-triggered mechanism, the proposed algorithm updates the neural network weights as well as the players' inputs only when a state-based event-triggered condition is violated. It is shown that system stability and weight convergence are still guaranteed under mild assumptions, while the occupation of communication and computation resources is considerably reduced. Meanwhile, the infamous Zeno behavior is excluded by proving the existence of a minimum inter-event time (MIET), which ensures the feasibility of the closed-loop event-triggered continuous-time system. Finally, a numerical example is simulated to illustrate the effectiveness of the proposed approach.
8. Wei Q, Zhu L, Song R, Zhang P, Liu D, Xiao J. Model-Free Adaptive Optimal Control for Unknown Nonlinear Multiplayer Nonzero-Sum Game. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:879-892. [PMID: 33108297] [DOI: 10.1109/tnnls.2020.3030127]
Abstract
In this article, an online adaptive optimal control algorithm based on adaptive dynamic programming is developed to solve the multiplayer nonzero-sum game (MP-NZSG) for discrete-time unknown nonlinear systems. First, a model-free coupled globalized dual-heuristic dynamic programming (GDHP) structure is designed to solve the MP-NZSG problem, in which there is no model network or identifier. Second, to relax the requirement on system dynamics, an online adaptive learning algorithm is developed that solves the Hamilton-Jacobi equation using the system states of two adjacent time steps. Third, a series of critic networks and action networks are used to approximate the value functions and optimal policies of all players, with all neural network (NN) weights updated online based on real-time system states. Fourth, the NN approximation errors are proved to be uniformly ultimately bounded based on the Lyapunov approach. Finally, simulation results are given to demonstrate the effectiveness of the developed scheme.
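The flavor of an update built from the system states of two adjacent time steps can be shown with a one-weight critic evaluated along measured transitions. The plant and feedback gain below are illustrative assumptions (the learner itself never touches the model), and the full GDHP structure with action networks for multiple players is omitted.

```python
# Assumed plant x_{k+1} = 0.8*x_k + u_k with fixed policy u = -0.5*x;
# the critic V(x) = p*x^2 is fitted purely from observed (x, u, x_next) data.
def step(x, u):
    return 0.8 * x + u            # only the simulator knows this model

def learn_p(lr=0.05, episodes=300):
    p = 0.0
    for _ in range(episodes):
        x = 1.0
        for _ in range(20):       # one short trajectory per episode
            u = -0.5 * x
            x_next = step(x, u)   # measured next state (two adjacent time steps)
            utility = x * x + u * u
            # Bellman residual V(x_k) - [U(x_k,u_k) + V(x_{k+1})], model-free:
            e = p * x * x - (utility + p * x_next * x_next)
            p -= lr * e * x * x   # semi-gradient update of the critic weight
            x = x_next
    return p

p_hat = learn_p()
# Analytic fixed point for this closed loop (x_next = 0.3*x, U = 1.25*x^2):
# p = 1.25 / (1 - 0.09) ≈ 1.3736
```

The update uses only measured pairs of consecutive states, which is exactly the mechanism that lets the paper's algorithm dispense with a model network or identifier.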
9. Ma B, Li Y. Compensator-critic structure-based event-triggered decentralized tracking control of modular robot manipulators: theory and experimental verification. Complex & Intelligent Systems 2021. [DOI: 10.1007/s40747-021-00359-0]
Abstract
This paper presents a novel compensator-critic structure-based event-triggered decentralized tracking control of modular robot manipulators (MRMs). On the basis of the subsystem dynamics under the joint torque feedback (JTF) technique, the proposed tracking error fusion function, which includes the position error and the velocity error, is utilized to construct the performance index function. By analyzing the dynamic uncertainties, a local-dynamic-information-based robust controller is designed to compensate for the model uncertainty. Based on the adaptive dynamic programming (ADP) algorithm and the event-triggered mechanism, the decentralized tracking control is obtained by solving the event-triggered Hamilton–Jacobi–Bellman equation (HJBE) with a critic neural network (NN). The tracking error of the closed-loop manipulator system is proved to be uniformly ultimately bounded (UUB) using the Lyapunov stability theorem. Finally, experimental results illustrate the effectiveness of the developed control method.
10. Wei Q, Wang L, Liu Y, Polycarpou MM. Optimal Elevator Group Control via Deep Asynchronous Actor-Critic Learning. IEEE Transactions on Neural Networks and Learning Systems 2020; 31:5245-5256. [PMID: 32071000] [DOI: 10.1109/tnnls.2020.2965208]
Abstract
In this article, a new deep reinforcement learning (RL) method, called the asynchronous advantage actor-critic (A3C) method, is developed to solve the optimal control problem of elevator group control systems (EGCSs). The main contribution of this article is that the optimal control law of EGCSs is designed via a new deep RL method, such that the elevator system sends passengers to their desired destination floors as soon as possible. Deep convolutional and recurrent neural networks, which can update themselves during operation, are designed to dispatch elevators. Then, the structure of the A3C method is developed, and the training phase for learning the optimal law is discussed. Finally, simulation results illustrate that the developed method effectively reduces the average waiting time in a complex building environment. Comparisons with traditional algorithms further verify the effectiveness of the developed method.
11. Li Y, Li K, Tong S. Adaptive Neural Network Finite-Time Control for Multi-Input and Multi-Output Nonlinear Systems With Positive Powers of Odd Rational Numbers. IEEE Transactions on Neural Networks and Learning Systems 2020; 31:2532-2543. [PMID: 31484136] [DOI: 10.1109/tnnls.2019.2933409]
Abstract
This article investigates the adaptive neural network (NN) finite-time output tracking control problem for a class of multi-input and multi-output (MIMO) uncertain nonlinear systems whose powers are positive odd rational numbers. The design adopts NNs to approximate the unknown continuous system functions, and a controller is constructed by combining the backstepping design with the adding-a-power-integrator technique. By constructing new iterative Lyapunov functions and using finite-time stability theory, closed-loop stability is achieved, which further verifies that the entire system possesses semiglobal practical finite-time stability (SGPFS) and that the tracking errors converge to a small neighborhood of the origin within finite time. Finally, a simulation example is given to illustrate the effectiveness and superiority of the developed method.
12. Lv Y, Ren X, Na J. Online Nash-optimization tracking control of multi-motor driven load system with simplified RL scheme. ISA Transactions 2020; 98:251-262. [PMID: 31439393] [DOI: 10.1016/j.isatra.2019.08.025]
Abstract
Although the optimal tracking control problem (OTCP) has been addressed recently, only single-input systems are considered in the recent literature. In this paper, the OTCP of unknown multi-motor driven load systems (MMDLS) is addressed based on a simplified reinforcement learning (RL) structure, where the inputs of all motors with different dynamics are obtained as a Nash equilibrium. Thus, the performance index associated with each input is optimized as the outcome of a Nash equilibrium. First, an identifier is used to reconstruct the MMDLS dynamics, so that the accurate model required in general control designs is avoided. The identified dynamics are then used to derive the Nash-optimization inputs, which consist of steady-state controls and RL-based controls: the steady-state controls are designed from the identified system model, while the RL-based controls are obtained via the optimization method with simplified RL-based critic NN schemes that approximate the cost function of each motor input. The NN weights of both the identification algorithm and the simplified RL structure are updated by a novel adaptation algorithm, in which the learning gains can be optimized adaptively. Convergence of the weights and stability of the Nash-optimized MMDLS are both proved. Finally, numerical MMDLS simulations are implemented to show the correctness and the improved performance of the proposed methods.
Affiliation(s)
- Yongfeng Lv, School of Automation, Beijing Institute of Technology, Beijing 100081, China
- Xuemei Ren, School of Automation, Beijing Institute of Technology, Beijing 100081, China
- Jing Na, Faculty of Mechanical & Electrical Engineering, Kunming University of Science & Technology, Kunming 650500, China