1
Li S, Ren T, Ding L, Liu L. Adaptive Finite-Time-Based Neural Optimal Control of Time-Delayed Wheeled Mobile Robotics Systems. Sensors (Basel, Switzerland) 2024; 24:5462. [PMID: 39275373; PMCID: PMC11398041; DOI: 10.3390/s24175462]
Abstract
For nonlinear systems with uncertain state time delays, an adaptive neural optimal tracking control method based on finite time is designed. The time-delay terms are handled with the help of appropriate Lyapunov-Krasovskii functionals (LKFs). A novel nonquadratic Hamilton-Jacobi-Bellman (HJB) function is defined, with a finite time selected as the upper limit of integration; this function incorporates the state time-delay information while retaining the structure of the standard value function. To meet specific requirements, the integral reinforcement learning method is employed to solve the ideal HJB function. A tracking controller is then designed to ensure finite-time convergence and optimization of the controlled system, with gradient-descent updates of the neural network weights evaluated and executed within a reinforcement learning architecture. The semi-global practical finite-time stability of the controlled system and the finite-time convergence of the tracking error are guaranteed.
Affiliation(s)
- Shu Li: The Key Laboratory of Intelligent Control Theory and Application of Liaoning Provincial, Liaoning University of Technology, Jinzhou 121001, China
- Tao Ren: The Key Laboratory of Intelligent Control Theory and Application of Liaoning Provincial, Liaoning University of Technology, Jinzhou 121001, China
- Liang Ding: The State Key Laboratory of Robotics and System, Harbin Institute of Technology, Harbin 150001, China
- Lei Liu: The Key Laboratory of Intelligent Control Theory and Application of Liaoning Provincial, Liaoning University of Technology, Jinzhou 121001, China
2
Fan QY, Jiang H, Song X, Xu B. Composite robust control of uncertain nonlinear systems with unmatched disturbances using policy iteration. ISA Transactions 2023; 138:432-441. [PMID: 37019705; DOI: 10.1016/j.isatra.2023.03.028]
Abstract
In this paper, the composite robust control problem of uncertain nonlinear systems with unmatched disturbances is investigated. To improve the robust control performance, the integral sliding mode control method is considered together with H∞ control for nonlinear systems. By designing a disturbance observer with a new structure, disturbance estimates with small errors can be obtained; these estimates are used to construct the sliding mode control policy while avoiding high gains. On the basis of ensuring the reachability of the specified sliding surface, the guaranteed cost control problem of the nonlinear sliding mode dynamics is considered. To overcome the difficulty of robust control design caused by the nonlinear characteristics, a modified policy iteration method based on sum of squares is proposed to solve for the H∞ control policy of the nonlinear sliding mode dynamics. Finally, the effectiveness of the proposed robust control method is verified by simulation tests.
Affiliation(s)
- Quan-Yong Fan: Northwestern Polytechnical University, 127 West Youyi Road, Beilin District, Xi'an 710072, Shaanxi, China
- Hongru Jiang: Northwestern Polytechnical University, 127 West Youyi Road, Beilin District, Xi'an 710072, Shaanxi, China
- Xuekui Song: Ansteel Engineering Technology Corporation Limited, 1 Huangang Road, Tiexi District, Anshan 114021, Liaoning, China
- Bin Xu: Northwestern Polytechnical University, 127 West Youyi Road, Beilin District, Xi'an 710072, Shaanxi, China
3
Xu Z, Shen T, Cheng D. Model-Free Reinforcement Learning by Embedding an Auxiliary System for Optimal Control of Nonlinear Systems. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:1520-1534. [PMID: 33347416; DOI: 10.1109/tnnls.2020.3042589]
Abstract
In this article, a novel integral reinforcement learning (IRL) algorithm is proposed to solve the optimal control problem for continuous-time nonlinear systems with unknown dynamics. The main challenge in learning is how to reject the oscillation caused by the externally added probing noise. This article addresses the issue by embedding an auxiliary trajectory, designed as an exciting signal, to learn the optimal solution. First, the auxiliary trajectory is used to decompose the state trajectory of the controlled system. Then, by using the decoupled trajectories, a model-free policy iteration (PI) algorithm is developed, in which the policy evaluation step and the policy improvement step alternate until convergence to the optimal solution. Notably, an appropriate external input is introduced at the policy improvement step to eliminate the requirement for knowledge of the input-to-state dynamics. Finally, the algorithm is implemented on an actor-critic structure. The output weights of the critic neural network (NN) and the actor NN are updated sequentially by least-squares methods. The convergence of the algorithm and the stability of the closed-loop system are guaranteed. Two examples are given to show the effectiveness of the proposed algorithm.
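The alternation of policy evaluation and policy improvement mentioned in this abstract can be illustrated in its simplest model-based, linear-quadratic form. The sketch below is a generic Kleinman-style iteration, not the model-free algorithm of the cited article; the system matrices and the initial stabilizing gain are illustrative assumptions.

```python
import numpy as np

def policy_iteration_lqr(A, B, Q, R, K0, iters=50):
    """Alternate policy evaluation and policy improvement for the
    continuous-time LQR problem (Kleinman's method).  Evaluation solves
    the Lyapunov equation (A-BK)'P + P(A-BK) + Q + K'RK = 0 by
    vectorization; improvement sets K = R^{-1} B' P."""
    n = A.shape[0]
    K = K0  # K0 must stabilize A - B K0
    for _ in range(iters):
        Ak = A - B @ K
        Qk = Q + K.T @ R @ K
        # vec(Ak'P + P Ak) = (I (x) Ak' + Ak' (x) I) vec(P)
        M = np.kron(np.eye(n), Ak.T) + np.kron(Ak.T, np.eye(n))
        P = np.linalg.solve(M, -Qk.reshape(-1)).reshape(n, n)
        K = np.linalg.solve(R, B.T @ P)  # policy improvement
    return P, K
```

The fixed point of this loop satisfies the algebraic Riccati equation, which is what the residual check below verifies; the model-free IRL version replaces the Lyapunov solve with data collected along trajectories.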
4
Dynamic Optimization of a Steerable Screw In-pipe Inspection Robot Using HJB and Turbine Installation. Robotica 2020. [DOI: 10.1017/s0263574719001784]
Abstract
SUMMARY: In this paper, two strategies are proposed to optimize the energy consumption of a new steerable screw in-pipe inspection robot. In the first method, optimization is performed through optimal path planning, implemented via the Hamilton-Jacobi-Bellman (HJB) method. Since the number of actuators exceeds the number of degrees of freedom of the system in the proposed steerable case, it is possible to minimize the energy consumption with the aid of the system dynamics. In the second method, the mechanics of the robot is modified by installing turbine blades, through which the drag force of the pipeline fluid can be employed to decrease the required propulsion force of the robot. It is shown that both of the mentioned improvements, that is, using the HJB formulation for the steerable robot and installing the turbine blades, can significantly save power and energy. However, for the latter case this improvement depends strongly on the alignment of the fluid stream direction with the direction of the robot velocity, while the former strategy is independent of this alignment. On the other hand, the path planning dictates a special pattern of speed functionality, while for the robot equipped with blades, saving energy is possible for any desired input path. The correctness of the modeling is verified by comparing the results of MATLAB and ADAMS, while the efficiency of the proposed optimization algorithms is checked with the aid of some analytic and comparative simulations.
5
Gandhi RV, Adhyaru DM. Hybrid extended state observer based control for systems with matched and mismatched disturbances. ISA Transactions 2020; 106:61-73. [PMID: 32605793; DOI: 10.1016/j.isatra.2020.06.019]
Abstract
The conventional Extended State Observer-Based Control (ESOBC) technique is mostly applicable to a class of Integral Chain Form (ICF) systems with matched, single-channel lumped disturbances. However, systems having a non-ICF structure with multi-channel matched and mismatched lumped disturbances are widely prevalent in control applications, where the conventional ESOBC scheme may no longer be applicable. To this end, the conventional ESOBC scheme is upgraded to enhance its feasibility and applicability. This work addresses the above recommendations by hybridizing the ESOBC with the Generalized ESOBC (GESOBC), which is termed Hybrid Extended State Observer Based Control (HyESOBC) in this paper. The proposed scheme is capable of estimating and compensating the multi-channel lumped disturbances (i.e., matched and mismatched) for a wide range of practical applications, including systems with a non-ICF structure. Simulation results and a comparative analysis of numerical and practical application examples with dual-channel lumped disturbances are presented to confirm the feasibility and applicability of the proposed scheme.
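For reference, the basic ESO idea that this work builds on can be sketched for a double integrator, with the lumped disturbance appended as an extended state. This is a textbook linear ESO, not the HyESOBC structure proposed in the paper; the plant model, gain parametrization, and values are illustrative assumptions.

```python
import numpy as np

def eso_step(xhat, y, u, b0, betas, dt):
    """One Euler step of a linear extended state observer for the plant
    x1' = x2, x2' = f + b0*u, with the lumped disturbance f treated as
    an extended state x3.  betas = (beta1, beta2, beta3) are gains."""
    x1, x2, x3 = xhat
    e = y - x1                        # output estimation error
    dx1 = x2 + betas[0] * e
    dx2 = x3 + b0 * u + betas[1] * e
    dx3 = betas[2] * e                # extended state tracks the disturbance
    return np.array([x1 + dt * dx1, x2 + dt * dx2, x3 + dt * dx3])
```

A common gain choice places all observer poles at a single bandwidth w0, i.e. betas = (3*w0, 3*w0**2, w0**3); the estimate xhat[2] then converges to the lumped disturbance, which the controller can cancel.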
Affiliation(s)
- Ravi V Gandhi: Institute of Technology, Nirma University, Ahmedabad 382481, India
- Dipak M Adhyaru: Institute of Technology, Nirma University, Ahmedabad 382481, India
6
Treesatayapun C. Knowledge-based reinforcement learning controller with fuzzy-rule network: experimental validation. Neural Comput Appl 2019. [DOI: 10.1007/s00521-019-04509-x]
7
Yang X, He H. Adaptive Critic Designs for Event-Triggered Robust Control of Nonlinear Systems With Unknown Dynamics. IEEE Transactions on Cybernetics 2019; 49:2255-2267. [PMID: 29993650; DOI: 10.1109/tcyb.2018.2823199]
Abstract
This paper develops a novel event-triggered robust control strategy for continuous-time nonlinear systems with unknown dynamics. To begin with, the event-triggered robust nonlinear control problem is transformed into an event-triggered nonlinear optimal control problem by introducing an infinite-horizon integral cost for the nominal system. Then, a recurrent neural network (RNN) and adaptive critic designs (ACDs) are employed to solve the derived event-triggered nonlinear optimal control problem. The RNN is applied to reconstruct the system dynamics from collected system data. After acquiring knowledge of the system dynamics, a unique critic network is proposed to obtain the approximate solution of the event-triggered Hamilton-Jacobi-Bellman equation within the framework of ACDs. The critic network is updated by using historical and instantaneous state data simultaneously. An advantage of the present critic network update law is that it relaxes the persistence of excitation condition. Meanwhile, under a newly developed event-triggering condition, the proposed critic network tuning rule not only guarantees that the critic network weights converge to their optimal values but also ensures that the nominal system states are uniformly ultimately bounded. Moreover, by using the Lyapunov method, it is proved that the derived optimal event-triggered control (ETC) guarantees uniform ultimate boundedness of all the signals in the original system. Finally, a nonlinear oscillator and an unstable power system are provided to validate the developed robust ETC scheme.
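The event-triggering idea recurring in these entries can be sketched with a common relative-threshold rule: the control is recomputed only when the gap between the last sampled state and the current state exceeds a fraction of the state norm. This is an illustrative stand-in, not the specific triggering condition derived in the paper, and the linear plant and gains are assumptions.

```python
import numpy as np

def event_triggered_run(A, B, K, x0, dt, T, sigma):
    """Simulate x' = Ax + Bu with u = -K x_k held constant between
    events; a new event fires when the sampling gap e = x_k - x
    satisfies ||e|| > sigma * ||x|| (relative-threshold rule)."""
    x = x0.copy()
    xk = x0.copy()   # most recently sampled state
    events = 0
    for _ in range(int(T / dt)):
        e = xk - x
        if np.linalg.norm(e) > sigma * np.linalg.norm(x):
            xk = x.copy()   # sample the state, update the control
            events += 1
        u = -K @ xk
        x = x + dt * (A @ x + B @ u)
    return x, events
```

For small sigma the closed loop behaves like the continuously sampled one while the controller updates far less often than once per integration step, which is the resource saving these papers exploit.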
8
Qu Q, Zhang H, Luo C, Yu R. Robust control design for multi-player nonlinear systems with input disturbances via adaptive dynamic programming. Neurocomputing 2019. [DOI: 10.1016/j.neucom.2018.11.054]
9
Output feedback tracking control of a class of continuous-time nonlinear systems via adaptive dynamic programming approach. Inf Sci (N Y) 2018. [DOI: 10.1016/j.ins.2018.07.047]
10
Policy iteration based robust co-design for nonlinear control systems with state constraints. Inf Sci (N Y) 2018. [DOI: 10.1016/j.ins.2018.08.006]
11
Zhang H, Qu Q, Xiao G, Cui Y. Optimal Guaranteed Cost Sliding Mode Control for Constrained-Input Nonlinear Systems With Matched and Unmatched Disturbances. IEEE Transactions on Neural Networks and Learning Systems 2018; 29:2112-2126. [PMID: 29771665; DOI: 10.1109/tnnls.2018.2791419]
Abstract
Based on integral sliding mode and approximate dynamic programming (ADP) theory, a novel optimal guaranteed cost sliding mode control is designed for constrained-input nonlinear systems with matched and unmatched disturbances. When the system moves on the sliding surface, the optimal guaranteed cost control problem of the sliding mode dynamics is transformed into the optimal control problem of a reformulated auxiliary system with a modified cost function. The ADP algorithm based on a single critic neural network (NN) is applied to obtain the approximate optimal control law for the auxiliary system. Lyapunov techniques are used to demonstrate the convergence of the NN weight errors. In addition, the derived approximate optimal control is verified to keep the sliding mode dynamics stable in the sense of uniform ultimate boundedness. Some simulation results are presented to verify the feasibility of the proposed control scheme.
12
Abstract
The robust control synthesis of continuous-time nonlinear systems with an uncertain term is investigated via an event-triggering mechanism and an adaptive critic learning technique. We mainly focus on combining the event-triggering mechanism with adaptive critic designs so as to solve the nonlinear robust control problem. This not only makes better use of computation and communication resources but also conducts controller design from the viewpoint of intelligent optimization. Through theoretical analysis, nonlinear robust stabilization can be achieved by obtaining an event-triggered optimal control law of the nominal system with a newly defined cost function and a certain triggering condition. The adaptive critic technique is employed to facilitate the event-triggered control design, where a neural network is introduced as an approximator in the learning phase. The performance of the event-triggered robust control scheme is validated via simulation studies and comparisons. The present method extends the application domain of both event-triggered control and adaptive critic control to nonlinear systems possessing dynamical uncertainties.
Affiliation(s)
- Ding Wang: The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; School of Computer and Control Engineering, University of Chinese Academy of Sciences, Beijing 100049, China
- Derong Liu: School of Automation, Guangdong University of Technology, Guangzhou 510006, China
13
Bounded robust control design for uncertain nonlinear systems using single-network adaptive dynamic programming. Neurocomputing 2017. [DOI: 10.1016/j.neucom.2017.05.030]
14
Wang D, He H, Liu D. Adaptive Critic Nonlinear Robust Control: A Survey. IEEE Transactions on Cybernetics 2017; 47:3429-3451. [PMID: 28682269; DOI: 10.1109/tcyb.2017.2712188]
Abstract
Adaptive dynamic programming (ADP) and reinforcement learning are closely related when performing intelligent optimization. Both are regarded as promising methods built on the key components of evaluation and improvement, against the background of information technologies such as artificial intelligence, big data, and deep learning. Although great progress has been achieved and surveyed in addressing nonlinear optimal control problems, the research on the robustness of ADP-based control strategies under uncertain environments has not been fully summarized. Hence, this survey reviews the recent main results of adaptive-critic-based robust control design for continuous-time nonlinear systems. The ADP-based nonlinear optimal regulation is reviewed, followed by robust stabilization of nonlinear systems with matched uncertainties, guaranteed cost control design of unmatched plants, and decentralized stabilization of interconnected systems. Additionally, further comprehensive discussions are presented, including event-based robust control design, improvement of the critic learning rule, nonlinear H∞ control design, and several notes on future perspectives. By applying the ADP-based optimal and robust control methods to a practical power system and an overhead crane plant, two typical examples are provided to verify the effectiveness of the theoretical results. Overall, this survey is beneficial for promoting the development of adaptive critic control methods with robustness guarantees and the construction of higher-level intelligent systems.
15
Esfandiari K, Abdollahi F, Talebi HA. Adaptive near-optimal neuro controller for continuous-time nonaffine nonlinear systems with constrained input. Neural Netw 2017. [DOI: 10.1016/j.neunet.2017.05.013]
16
Wang D, Liu D, Mu C, Ma H. Decentralized guaranteed cost control of interconnected systems with uncertainties: A learning-based optimal control strategy. Neurocomputing 2016. [DOI: 10.1016/j.neucom.2016.06.020]
17
Yang X, Liu D, Luo B, Li C. Data-based robust adaptive control for a class of unknown nonlinear constrained-input systems via integral reinforcement learning. Inf Sci (N Y) 2016. [DOI: 10.1016/j.ins.2016.07.051]
18
Liu D, Yang X, Wang D, Wei Q. Reinforcement-Learning-Based Robust Controller Design for Continuous-Time Uncertain Nonlinear Systems Subject to Input Constraints. IEEE Transactions on Cybernetics 2015; 45:1372-1385. [PMID: 25872221; DOI: 10.1109/tcyb.2015.2417170]
Abstract
The design of a stabilizing controller for uncertain nonlinear systems with control constraints is a challenging problem. The input constraints, coupled with the inability to identify the uncertainties accurately, motivate the design of stabilizing controllers based on reinforcement-learning (RL) methods. In this paper, a novel RL-based robust adaptive control algorithm is developed for a class of continuous-time uncertain nonlinear systems subject to input constraints. The robust control problem is converted to a constrained optimal control problem by appropriately selecting a value function for the nominal system. Distinct from the typical actor-critic dual networks employed in RL, only one critic neural network (NN) is constructed to derive the approximate optimal control. Meanwhile, unlike the initial stabilizing control that is often indispensable in RL, no special requirement is imposed on the initial control. By utilizing Lyapunov's direct method, the closed-loop optimal control system and the estimated weights of the critic NN are proved to be uniformly ultimately bounded. In addition, the derived approximate optimal control is verified to keep the uncertain nonlinear system stable in the sense of uniform ultimate boundedness. Two simulation examples are provided to illustrate the effectiveness and applicability of the present approach.
19
Shah AK, Adhyaru DM. Clustering based multiple model control of hybrid dynamical systems using HJB solution. Appl Soft Comput 2015. [DOI: 10.1016/j.asoc.2015.03.001]
20
Liu D, Wang D, Wang FY, Li H, Yang X. Neural-network-based online HJB solution for optimal robust guaranteed cost control of continuous-time uncertain nonlinear systems. IEEE Transactions on Cybernetics 2014; 44:2834-2847. [PMID: 25415951; DOI: 10.1109/tcyb.2014.2357896]
Abstract
In this paper, the infinite-horizon optimal robust guaranteed cost control of continuous-time uncertain nonlinear systems is investigated using a neural-network-based online solution of the Hamilton-Jacobi-Bellman (HJB) equation. By establishing an appropriate bounded function and defining a modified cost function, the optimal robust guaranteed cost control problem is transformed into an optimal control problem. It can be observed that the optimal cost function of the nominal system is exactly the optimal guaranteed cost of the original uncertain system. A critic neural network is constructed to facilitate the solution of the modified HJB equation corresponding to the nominal system. More importantly, an additional stabilizing term is introduced to help verify stability; it reinforces the updating process of the weight vector and reduces the requirement for an initial stabilizing control. The uniform ultimate boundedness of the closed-loop system is analyzed by using the Lyapunov approach as well. Two simulation examples are provided to verify the effectiveness of the present control approach.
21
Zhong X, He H, Zhang H, Wang Z. Optimal control for unknown discrete-time nonlinear Markov jump systems using adaptive dynamic programming. IEEE Transactions on Neural Networks and Learning Systems 2014; 25:2141-2155. [PMID: 25420238; DOI: 10.1109/tnnls.2014.2305841]
Abstract
In this paper, we develop and analyze an optimal control method for a class of discrete-time nonlinear Markov jump systems (MJSs) with unknown system dynamics. Specifically, an identifier is established for the unknown systems to approximate the system states, and an optimal control approach for nonlinear MJSs is developed to solve the Hamilton-Jacobi-Bellman equation based on the adaptive dynamic programming technique. We also develop a detailed stability analysis of the control approach, including the convergence of the performance index function for nonlinear MJSs and the existence of the corresponding admissible control. Neural network techniques are used to approximate the proposed performance index function and the control law. To demonstrate the effectiveness of our approach, three simulation studies, one linear case, one nonlinear case, and one single-link robot arm case, are used to validate the performance of the proposed optimal control method.
22
Wang D, Liu D, Li H, Ma H, Li C. A neural-network-based online optimal control approach for nonlinear robust decentralized stabilization. Soft Comput 2014. [DOI: 10.1007/s00500-014-1534-z]
23
Modares H, Naghibi Sistani MB, Lewis FL. A policy iteration approach to online optimal control of continuous-time constrained-input systems. ISA Transactions 2013; 52:611-621. [PMID: 23706414; DOI: 10.1016/j.isatra.2013.04.004]
Abstract
This paper is an effort toward developing an online learning algorithm to find the optimal control solution for continuous-time (CT) systems subject to input constraints. The proposed method is based on the policy iteration (PI) technique, which has recently evolved as a major technique for solving optimal control problems. Although a number of online PI algorithms have been developed for CT systems, none of them take into account the input constraints caused by actuator saturation. In practice, however, ignoring these constraints leads to performance degradation or even system instability. In this paper, to deal with the input constraints, a suitable nonquadratic functional is employed to encode the constraints into the optimization formulation. Then, the proposed PI algorithm is implemented on an actor-critic structure to solve the Hamilton-Jacobi-Bellman (HJB) equation associated with this nonquadratic cost functional in an online fashion. That is, two coupled neural network (NN) approximators, namely an actor and a critic, are tuned online and simultaneously to approximate the HJB solution and compute the optimal control policy. The critic is used to evaluate the cost associated with the current policy, while the actor is used to find an improved policy based on information provided by the critic. Convergence to a close approximation of the HJB solution as well as stability of the proposed feedback control law are shown. Simulation results on a nonlinear CT system illustrate the effectiveness of the proposed approach.
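The nonquadratic input penalty mentioned in this abstract has a well-known form in the constrained-input ADP literature: the quadratic term u'Ru is replaced by an integral of an inverse-saturation function, so that the minimizing control is automatically bounded. The sketch below is illustrative only, with the saturation bound LAM and weight r chosen arbitrarily; it is not code from the cited paper.

```python
import numpy as np

LAM = 1.0  # actuator saturation bound |u| <= LAM (illustrative value)

def nonquadratic_cost(u, r=1.0, n=200):
    """Nonquadratic input penalty W(u) = 2 * int_0^u LAM*atanh(v/LAM)*r dv,
    evaluated by the trapezoid rule.  Unlike u'Ru, the control that
    minimizes the associated Hamiltonian is bounded by LAM."""
    v = np.linspace(0.0, u, n)
    integrand = 2.0 * r * LAM * np.arctanh(v / LAM)
    dv = v[1] - v[0]
    return float(np.sum((integrand[:-1] + integrand[1:]) * 0.5 * dv))

def constrained_policy(grad_V, g, r=1.0):
    """Control minimizing the Hamiltonian under the penalty above for a
    scalar input channel: u = -LAM * tanh(g * grad_V / (2*LAM*r)).
    The tanh keeps the control inside the saturation bound for any
    value-function gradient."""
    return -LAM * np.tanh(g * grad_V / (2.0 * LAM * r))
```

The point of the construction is visible in `constrained_policy`: however large the value-function gradient becomes, the returned control never exceeds the bound, so actuator saturation never has to be handled as an afterthought.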
Affiliation(s)
- Hamidreza Modares: Department of Electrical Engineering, Ferdowsi University of Mashhad, Mashhad 91775-1111, Iran
24
Stable iterative adaptive dynamic programming algorithm with approximation errors for discrete-time nonlinear systems. Neural Comput Appl 2013. [DOI: 10.1007/s00521-013-1361-7]
25
Disturbance attenuation for nonlinear switched descriptor systems based on neural network. Neural Comput Appl 2012. [DOI: 10.1007/s00521-012-1171-3]