1
Yi X, Luo B, Zhao Y. Neural Network-Based Robust Guaranteed Cost Control for Image-Based Visual Servoing of Quadrotor. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:12693-12705. [PMID: 37067964 DOI: 10.1109/tnnls.2023.3264511] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
In this article, a neural network (NN)-based robust guaranteed cost control design is proposed for image-based visual servoing (IBVS) control of quadrotors. Three subsystems (yaw, height, and lateral) are derived from the quadrotor IBVS dynamic model, and the main design task is to solve the robust control problem for the time-varying lateral subsystem with angle constraints and uncertain disturbances. Considering the system dynamics, a two-loop structure is adopted. The outer loop uses the linear quadratic regulator, solving the Riccati equation for the lateral image feature system, and the inner loop adopts optimal robust guaranteed cost control for the lateral velocity system. For the lateral velocity system, the optimal robust control problem is transformed into solving the modified Hamilton-Jacobi-Bellman equation of the corresponding optimal control problem using adaptive dynamic programming. The implementation uses a time-varying NN and a designed estimated-weight update law. In addition, stability and effectiveness are established by theoretical proof and simulations.
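As a rough illustration of the outer-loop idea mentioned in this abstract (a linear quadratic regulator obtained from a Riccati equation), the sketch below solves the scalar continuous-time algebraic Riccati equation in closed form. The plant x' = a x + b u, the weights q and r, the numbers, and the function name `scalar_lqr` are hypothetical and are not taken from the article, whose lateral image feature system is not reproduced here.

```python
import math


def scalar_lqr(a, b, q, r):
    """Return (p, k) for x' = a*x + b*u with cost integral of q*x^2 + r*u^2.

    p is the positive root of the scalar algebraic Riccati equation
        2*a*p - (b**2 / r) * p**2 + q = 0,
    and k = b*p/r is the gain of the feedback law u = -k*x.
    """
    if b == 0 or q <= 0 or r <= 0:
        raise ValueError("need b != 0 and q, r > 0")
    p = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
    k = b * p / r
    return p, k


# Hypothetical numbers chosen only for illustration (an unstable open loop).
a, b, q, r = 1.0, 1.0, 3.0, 1.0
p, k = scalar_lqr(a, b, q, r)
residual = 2 * a * p - (b * b / r) * p * p + q  # Riccati residual, close to 0
closed_loop_pole = a - b * k                    # negative means stable
```

For these assumed values the Riccati solution is p = 3, the gain is k = 3, and the closed-loop pole moves from +1 to -2.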

2
Liang Y, Zhang H, Zhang J, Ming Z. Event-Triggered Guarantee Cost Control for Partially Unknown Stochastic Systems via Explorized Integral Reinforcement Learning Strategy. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:7830-7844. [PMID: 36395138 DOI: 10.1109/tnnls.2022.3221105] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
In this article, an integral reinforcement learning (IRL)-based event-triggered guarantee cost control (GCC) approach is proposed for stochastic systems which are modulated by randomly time-varying parameters. First, with the aid of the RL algorithm, the optimal GCC (OGCC) problem is converted into an optimal zero-sum game by solving a modified Hamilton-Jacobi-Isaacs (HJI) equation of the auxiliary system. Moreover, in order to address the stochastic zero-sum game, we propose an on-policy IRL-based control approach incorporating the multivariate probabilistic collocation method (MPCM), which can accurately predict the mean value of uncertain functions with randomly time-varying parameters. Furthermore, a novel GCC method, which combines the explorized IRL algorithm and MPCM, is designed to relax the restriction of knowing the system dynamics for the class of stochastic systems. On this foundation, to reduce computation cost and avoid wasting resources, we propose an event-triggered GCC approach that combines explorized IRL and MPCM by utilizing critic-actor-disturbance neural networks (NNs). Meanwhile, the weight vectors of the three NNs are updated simultaneously and aperiodically according to the designed triggering condition. The ultimate boundedness (UB) properties of the controlled systems are proved by means of the Lyapunov theorem. Finally, the effectiveness of the developed GCC algorithms is illustrated via two simulation examples.

3
Hu X, Zhang H, Ma D, Wang R, Wang T, Xie X. Real-Time Leak Location of Long-Distance Pipeline Using Adaptive Dynamic Programming. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2023; 34:7004-7013. [PMID: 34971544 DOI: 10.1109/tnnls.2021.3136939] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 06/14/2023]
Abstract
In traditional leak location methods, the leak point is located through the time difference between the pressure change points at the two ends of the pipeline. Inaccurate estimation of the pressure change points leads to a wrong leak location result. To address this, adaptive dynamic programming is proposed to solve the pipeline leak location problem in this article. First, a pipeline model is proposed to describe the pressure change along the pipeline, which is utilized to reflect the iterative situation of the logarithmic form of pressure change. Then, under the Bellman optimality principle, a value iteration (VI) scheme is proposed to provide the optimal sequence of the nominal parameter and obtain the pipeline leak point. Furthermore, neural networks are built as the VI scheme structure to ensure the iterative performance of the proposed method. By transforming the task into a dynamic optimization problem, the proposed method uses the estimation of the logarithmic form of pressure changes at both ends of the pipeline to locate the leak point, which avoids the wrong results caused by unclear pressure change points. Thus, it could be applied for real-time leak location of long-distance pipelines. Finally, experimental cases are given to illustrate the effectiveness of the proposed method.

4
Wang D, Hu L, Zhao M, Qiao J. Adaptive Critic for Event-Triggered Unknown Nonlinear Optimal Tracking Design With Wastewater Treatment Applications. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2023; 34:6276-6288. [PMID: 34941533 DOI: 10.1109/tnnls.2021.3135405] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Indexed: 06/14/2023]
Abstract
In this article, an event-based near-optimal tracking control algorithm is developed for a class of nonaffine systems. First, in order to gain the tracking control strategy, the costate function is established through the iterative dual heuristic dynamic programming (DHP) algorithm. Then, the event-based control method is employed to improve the utilization efficiency of resources and ensure that the closed-loop system has an excellent control performance. Meanwhile, the input-to-state stability (ISS) is proven for the event-based tracking plant. In addition, three kinds of neural networks are used in the event-based DHP algorithm, which aims to identify the nonaffine nonlinear system, estimate the costate function, and approximate the tracking control law. Finally, a numerical experimental simulation is conducted to verify the effectiveness of the proposed scheme. Moreover, in order to further validate the feasibility, the algorithm is applied to the wastewater treatment plant to effectively control the concentrations of dissolved oxygen and nitrate nitrogen.

5
Zhang Y, Huang H, Shen G. Adaptive CL-BFGS Algorithms for Complex-Valued Neural Networks. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2023; 34:6313-6327. [PMID: 34995196 DOI: 10.1109/tnnls.2021.3135553] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
The complex-valued limited-memory BFGS (CL-BFGS) algorithm is efficient for the training of complex-valued neural networks (CVNNs). As an important parameter, the memory size represents the number of saved vector pairs and substantially affects the performance of the algorithm. However, determining a suitable memory size for the CL-BFGS algorithm remains challenging. To deal with this issue, an adaptive method is proposed in which the memory size is allowed to vary during the iteration process. At each iteration, with the help of the multistep quasi-Newton method, an appropriate memory size is chosen from a variable set {1, 2, ..., M} by approximating the complex Hessian matrix as closely as possible. To reduce the computational complexity and ensure the desired performance, the upper bound M is adjusted according to the moving average of memory sizes found in previous iterations. The proposed adaptive CL-BFGS (ACL-BFGS) algorithm can be efficiently applied to the training of CVNNs. Moreover, it is suggested to take multiple memory sizes to construct the search direction, which further improves the performance of the ACL-BFGS algorithm. Experimental results on benchmark problems, including pattern classification, complex function approximation, and nonlinear channel equalization, are given to illustrate the advantages of the developed algorithms over some previous ones.

6
Wang Z, Lee J, Wei Q, Zhang A. Event-Triggered Near-Optimal Tracking Control based on Adaptive Dynamic Programming for Discrete-Time Systems. Neurocomputing 2023. [DOI: 10.1016/j.neucom.2023.03.045] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/03/2023]

7
Constrained Optimal Control for Nonlinear Multi-Input Safety-Critical Systems with Time-Varying Safety Constraints. MATHEMATICS 2022. [DOI: 10.3390/math10152744] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/10/2022]
Abstract
In this paper, we investigate the constrained optimal control problem of nonlinear multi-input safety-critical systems with uncertain disturbances and time-varying safety constraints. By utilizing a barrier function transformation, together with a new disturbance-related term and a smooth safety boundary function, a nominal system-dependent multi-input barrier transformation architecture is developed to deal with the time-varying safety constraints and uncertain disturbances. Based on the obtained transformation system, the coupled Hamilton–Jacobi–Bellman (HJB) function is established to obtain the constrained Nash equilibrium solution. In addition, due to the fact that it is difficult to solve the HJB function directly, the single critic neural network (NN) is constructed to approximate the optimal performance index function of different control inputs, respectively. It is proved theoretically that, under the influence of uncertain disturbances and time-varying safety constraints, the system states and neural network parameters can be uniformly ultimately bounded (UUB) by the proposed neural network approximation method. Finally, the effectiveness of the proposed method is verified by two nonlinear simulation examples.

8
Yang X, Xu M, Wei Q. Dynamic Event-Sampled Control of Interconnected Nonlinear Systems Using Reinforcement Learning. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; PP:923-937. [PMID: 35666792 DOI: 10.1109/tnnls.2022.3178017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
We develop a decentralized dynamic event-based control strategy for nonlinear systems subject to matched interconnections. To begin with, we introduce a dynamic event-based sampling mechanism, which relies on the system's states and the variables generated by time-based differential equations. Then, we prove that the decentralized event-based controller for the whole system is composed of all the optimal event-based control policies of nominal subsystems. To derive these optimal event-based control policies, we design a critic-only architecture to solve the related event-based Hamilton-Jacobi-Bellman equations in the reinforcement learning framework. The implementation of such an architecture uses only critic neural networks (NNs) with their weight vectors being updated through the gradient descent method together with concurrent learning. After that, we demonstrate that the asymptotic stability of closed-loop nominal subsystems and the uniformly ultimate boundedness stability of critic NNs' weight estimation errors are guaranteed by using Lyapunov's approach. Finally, we provide simulations of a matched nonlinear-interconnected plant to validate the present theoretical claims.

9

10
Liu P, Zhang H, Ren H, Liu C. Online event-triggered adaptive critic design for multi-player zero-sum games of partially unknown nonlinear systems with input constraints. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2021.07.058] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/28/2022]

11
Ha M, Wang D, Liu D. Neural-network-based discounted optimal control via an integrated value iteration with accuracy guarantee. Neural Netw 2021; 144:176-186. [PMID: 34500256 DOI: 10.1016/j.neunet.2021.08.025] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 01/28/2021] [Revised: 08/19/2021] [Accepted: 08/19/2021] [Indexed: 10/20/2022]
Abstract
A data-based value iteration algorithm with the bidirectional approximation feature is developed for discounted optimal control. The unknown nonlinear system dynamics is first identified by establishing a model neural network. To improve the identification precision, biases are introduced to the model network. The model network with biases is trained by the gradient descent algorithm, where the weights and biases across all layers are updated. The uniform ultimate boundedness stability with a proper learning rate is analyzed, by using the Lyapunov approach. Moreover, an integrated value iteration with the discounted cost is developed to fully guarantee the approximation accuracy of the optimal value function. Then, the effectiveness of the proposed algorithm is demonstrated by carrying out two simulation examples with physical backgrounds.
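To make the notion of value iteration with a discounted cost concrete, here is a minimal sketch on a scalar linear-quadratic problem. There the value function stays quadratic, V_i(x) = p_i x^2, so each iteration reduces to a scalar recursion. This is not the integrated value iteration or the model network of the article; the system x_{k+1} = a x_k + b u_k, the weights q and r, the discount factor gamma, and all names are assumptions made for illustration.

```python
def discounted_value_iteration(a, b, q, r, gamma, iters=200):
    """Value iteration for the cost sum over k of gamma^k * (q*x_k^2 + r*u_k^2).

    With V_i(x) = p_i * x^2 and V_0 = 0, the Bellman update
        V_{i+1}(x) = min_u [ q*x^2 + r*u^2 + gamma * V_i(a*x + b*u) ]
    becomes the scalar recursion on p_i used in the loop below.
    """
    p = 0.0
    history = [p]
    for _ in range(iters):
        p = q + gamma * a * a * p - (gamma * a * b * p) ** 2 / (r + gamma * b * b * p)
        history.append(p)
    # Greedy policy for the final value function: u = -gain * x.
    gain = gamma * a * b * p / (r + gamma * b * b * p)
    return p, gain, history


# Hypothetical parameters chosen only for illustration.
a, b, q, r, gamma = 1.2, 1.0, 1.0, 1.0, 0.95
p_star, gain, history = discounted_value_iteration(a, b, q, r, gamma)

# Residual of the discounted Bellman fixed-point equation at the final iterate.
fixed_point_residual = p_star - (
    q + gamma * a * a * p_star
    - (gamma * a * b * p_star) ** 2 / (r + gamma * b * b * p_star)
)
```

Starting from V_0 = 0, the iterates p_i are nondecreasing and settle at a fixed point of the discounted Bellman equation, which is the behaviour that accuracy guarantees for value iteration are concerned with.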
Affiliation(s)
- Mingming Ha: School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing 100083, China.
- Ding Wang: Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China; Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing 100124, China.
- Derong Liu: Department of Electrical and Computer Engineering, University of Illinois at Chicago, Chicago, IL 60607, USA.

12
Wang D, Zhao M, Ha M, Ren J. Neural optimal tracking control of constrained nonaffine systems with a wastewater treatment application. Neural Netw 2021; 143:121-132. [PMID: 34118779 DOI: 10.1016/j.neunet.2021.05.027] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 10/13/2020] [Revised: 02/15/2021] [Accepted: 05/25/2021] [Indexed: 10/21/2022]
Abstract
In this paper, we aim to solve the optimal tracking control problem for a class of nonaffine discrete-time systems with actuator saturation. First, a data-based neural identifier is constructed to learn the unknown system dynamics. Then, according to the expression of the trained neural identifier, we can obtain the steady control corresponding to the reference trajectory. Next, by involving the iterative dual heuristic dynamic programming algorithm, the new costate function and the tracking control law are developed. Two other neural networks are used to estimate the costate function and approximate the tracking control law. Considering approximation errors of neural networks, the stability analysis of the proposed algorithm for the specific systems is provided by introducing the Lyapunov approach. Finally, via conducting simulation and comparison, the superiority of the developed optimal tracking method is confirmed. Moreover, the trajectory tracking performance of the wastewater treatment application is also involved for further verifying the proposed approach.
Affiliation(s)
- Ding Wang: Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China; Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing 100124, China; Beijing Institute of Artificial Intelligence, Beijing University of Technology, Beijing 100124, China.
- Mingming Zhao: Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China; Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing 100124, China; Beijing Institute of Artificial Intelligence, Beijing University of Technology, Beijing 100124, China.
- Mingming Ha: School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing 100083, China.
- Jin Ren: Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China; Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing University of Technology, Beijing 100124, China; Beijing Institute of Artificial Intelligence, Beijing University of Technology, Beijing 100124, China.

13
Liang Y, Zhang H, Duan J, Sun S. Event-triggered reinforcement learning H∞ control design for constrained-input nonlinear systems subject to actuator failures. Inf Sci (N Y) 2021. [DOI: 10.1016/j.ins.2020.07.055] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 10/23/2022]

14
Yang X, Wei Q. Adaptive Critic Learning for Constrained Optimal Event-Triggered Control With Discounted Cost. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2021; 32:91-104. [PMID: 32167914 DOI: 10.1109/tnnls.2020.2976787] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 06/10/2023]
Abstract
This article studies an optimal event-triggered control (ETC) problem of nonlinear continuous-time systems subject to asymmetric control constraints. The present nonlinear plant differs from many studied systems in that its equilibrium point is nonzero. First, we introduce a discounted cost for such a system in order to obtain the optimal ETC without making coordinate transformations. Then, we present an event-triggered Hamilton-Jacobi-Bellman equation (ET-HJBE) arising in the discounted-cost constrained optimal ETC problem. After that, we propose an event-triggering condition guaranteeing a positive lower bound for the minimal intersample time. To solve the ET-HJBE, we construct a critic network under the framework of adaptive critic learning. The critic network weight vector is tuned through a modified gradient descent method, which simultaneously uses historical and instantaneous state data. By employing the Lyapunov method, we prove that the uniform ultimate boundedness of all signals in the closed-loop system is guaranteed. Finally, we provide simulations of a pendulum system and an oscillator system to validate the obtained optimal ETC strategy.

15
Liu X, Zhao B, Liu D. Fault tolerant tracking control for nonlinear systems with actuator failures through particle swarm optimization-based adaptive dynamic programming. Appl Soft Comput 2020. [DOI: 10.1016/j.asoc.2020.106766] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 11/26/2022]

16
Wen Y, Si J, Brandt A, Gao X, Huang HH. Online Reinforcement Learning Control for the Personalization of a Robotic Knee Prosthesis. IEEE TRANSACTIONS ON CYBERNETICS 2020; 50:2346-2356. [PMID: 30668514 DOI: 10.1109/tcyb.2019.2890974] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Indexed: 06/09/2023]
Abstract
Robotic prostheses deliver greater function than passive prostheses, but we face the challenge of tuning a large number of control parameters in order to personalize the device for individual amputee users. This problem is not easily solved by traditional control designs or the latest robotic technology. Reinforcement learning (RL) is naturally appealing. The recent, unprecedented success of AlphaZero demonstrated RL as a feasible, large-scale problem solver. However, the prosthesis-tuning problem is associated with several unaddressed issues such as that it does not have a known and stable model, the continuous states and controls of the problem may result in a curse of dimensionality, and the human-prosthesis system is constantly subject to measurement noise, environmental change and human-body-caused variations. In this paper, we demonstrated the feasibility of direct heuristic dynamic programming, an approximate dynamic programming (ADP) approach, to automatically tune the 12 robotic knee prosthesis parameters to meet individual human users' needs. We tested the ADP-tuner on two subjects (one able-bodied subject and one amputee subject) walking at a fixed speed on a treadmill. The ADP-tuner learned to reach target gait kinematics in an average of 300 gait cycles or 10 min of walking. We observed improved ADP tuning performance when we transferred a previously learned ADP controller to a new learning session with the same subject. To the best of our knowledge, our approach to personalize robotic prostheses is the first implementation of online ADP learning control to a clinical problem involving human subjects.

17
Event-triggered constrained control with DHP implementation for nonaffine discrete-time systems. Inf Sci (N Y) 2020. [DOI: 10.1016/j.ins.2020.01.020] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Indexed: 11/21/2022]

18
Xu B, Zhang R, Li S, He W, Shi Z. Composite Neural Learning-Based Nonsingular Terminal Sliding Mode Control of MEMS Gyroscopes. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2020; 31:1375-1386. [PMID: 31251201 DOI: 10.1109/tnnls.2019.2919931] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 06/09/2023]
Abstract
The efficient driving control of MEMS gyroscopes is an attractive way to improve the precision without hardware redesign. This paper investigates the sliding mode control (SMC) for the dynamics of MEMS gyroscopes using neural networks (NNs). Considering the existence of the dynamics uncertainty, the composite neural learning is constructed to obtain higher tracking precision using the serial-parallel estimation model (SPEM). Furthermore, the nonsingular terminal SMC (NTSMC) is proposed to achieve finite-time convergence. To obtain the prescribed performance, a time-varying barrier Lyapunov function (BLF) is introduced to the control scheme. Through simulation tests, it is observed that under the BLF-based NTSMC with composite learning design, the tracking precision of MEMS gyroscopes is highly improved.

19
Adaptive complex-valued stepsize based fast learning of complex-valued neural networks. Neural Netw 2020; 124:233-242. [DOI: 10.1016/j.neunet.2020.01.011] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 10/05/2019] [Revised: 12/05/2019] [Accepted: 01/14/2020] [Indexed: 11/23/2022]

20
Feng T, Zhang J, Zhang H. Consensusability of discrete-time linear multi-agent systems with multiple inputs. Neurocomputing 2020. [DOI: 10.1016/j.neucom.2019.11.040] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/17/2022]

21
An Analysis of IRL-Based Optimal Tracking Control of Unknown Nonlinear Systems with Constrained Input. Neural Process Lett 2019. [DOI: 10.1007/s11063-019-10029-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 10/26/2022]

22
Event-triggered H∞ optimal control for continuous-time nonlinear systems using neurodynamic programming. Neurocomputing 2019. [DOI: 10.1016/j.neucom.2019.06.090] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Indexed: 11/21/2022]

23
Song R, Xie Y, Zhang Z. Data-driven finite-horizon optimal tracking control scheme for completely unknown discrete-time nonlinear systems. Neurocomputing 2019. [DOI: 10.1016/j.neucom.2019.05.026] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 11/26/2022]