1. Wang D, Wu J, Ha M, Zhao M, Li M, Qiao J. Advanced Optimal Tracking Control With Stability Guarantee via Novel Value Learning Formulation. IEEE Transactions on Neural Networks and Learning Systems 2024; 35:8254-8265. [PMID: 37015365] [DOI: 10.1109/tnnls.2022.3226518]
Abstract
In this article, to solve the optimal tracking control problem (OTCP) for discrete-time (DT) nonlinear systems, a general value iteration (GVI) scheme and online value iteration (VI) algorithms with a novel value function are discussed. First, the disadvantage of the traditional value function for the OTCP is presented and the novel value function is introduced. Second, we analyze the monotonicity and convergence of GVI and establish an admissibility condition for GVI to evaluate the admissibility of the current iterative control; a novel approach is introduced for this admissibility analysis. Third, based on the attraction domain, improved control policies under online VI can be obtained by judging the locations of the current tracking error and the reference point. Finally, the stability of the online VI-based control system is guaranteed. Two simulation examples illustrate the performance of the proposed methods.
2. Pham DB, Dao QT, Bui NT, Nguyen TVA. Robust-optimal control of rotary inverted pendulum control through fuzzy descriptor-based techniques. Sci Rep 2024; 14:5593. [PMID: 38454029] [PMCID: PMC10920893] [DOI: 10.1038/s41598-024-56202-2]
Abstract
Expanding upon the well-established Takagi-Sugeno (T-S) fuzzy model, the T-S fuzzy descriptor model emerges as a robust and flexible framework. This article introduces the development of optimal and robust-optimal controllers grounded in the principles of stability control and fuzzy descriptor systems. By transforming complicated inequalities into linear matrix inequalities (LMIs), we establish the essential conditions for controller construction, as delineated in the theorems. To substantiate the utility of these controllers, we employ the rotary inverted pendulum as a testbed. Through diverse simulation scenarios, these controllers, rooted in fuzzy descriptor systems, demonstrate their practicality and effectiveness in ensuring stable control of inverted pendulum systems, even in the presence of uncertainties in the model. This study highlights the adaptability and robustness of fuzzy descriptor-based controllers, paving the way for advanced control strategies in complex and uncertain environments.
Affiliation(s)
- Duc-Binh Pham, Hanoi University of Science and Technology, Hanoi, 11615, Vietnam
- Quy-Thinh Dao, Hanoi University of Science and Technology, Hanoi, 11615, Vietnam
- Ngoc-Tam Bui, Innovative Global Program, Shibaura Institute of Technology, Saitama, 337-8570, Japan
3. Wu H, Huang J, Wu K, Lopes AM, Chen L. Precise tracking control via iterative learning for one-sided Lipschitz Caputo fractional-order systems. Mathematical Biosciences and Engineering 2024; 21:3095-3109. [PMID: 38454720] [DOI: 10.3934/mbe.2024137]
Abstract
This paper investigates iterative learning control for Caputo fractional-order systems with one-sided Lipschitz nonlinearity. Both open- and closed-loop P-type learning algorithms are proposed to achieve perfect tracking for the desired trajectory, and their convergence conditions are established. It is shown that the algorithms can make the output tracking error converge to zero along the iteration axis. A simulation example illustrates the application of the theoretical findings, and shows the effectiveness of the proposed approach.
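The open-loop P-type update described above can be sketched on a simple integer-order discrete-time plant (a first-order linear stand-in for the paper's Caputo fractional-order dynamics; the plant parameters and learning gain below are invented for illustration):

```python
import numpy as np

# Open-loop P-type ILC sketch: u_{k+1}(t) = u_k(t) + g * e_k(t+1).
# Integer-order stand-in plant: x(t+1) = A*x(t) + B*u(t), y(t) = C*x(t).
# Convergence requires |1 - g*C*B| < 1.
A, B, C = 0.5, 1.0, 1.0
T = 20
yd = np.sin(np.linspace(0.0, np.pi, T + 1))  # desired trajectory y_d(t), t = 0..T

def simulate(u):
    x, y = 0.0, np.zeros(T + 1)
    for t in range(T):
        x = A * x + B * u[t]
        y[t + 1] = C * x
    return y

g = 0.8                    # learning gain (hypothetical value)
u = np.zeros(T)
for k in range(50):        # iterate along the iteration axis
    e = yd - simulate(u)
    u = u + g * e[1:]      # y(t+1) is the first output influenced by u(t)

final_err = np.max(np.abs((yd - simulate(u))[1:]))
```

The lifted iteration-domain error map is `(I - g*G)` with `G` lower triangular and `C*B` on its diagonal, so the tracking error contracts along the iteration axis, mirroring the convergence-to-zero result stated in the abstract.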
Affiliation(s)
- Hanjiang Wu, Anhui Electrical Engineering Professional Technique College, Hefei 230051, China
- Jie Huang, Anhui Electrical Engineering Professional Technique College, Hefei 230051, China
- Kehan Wu, Anhui Electrical Engineering Professional Technique College, Hefei 230051, China
- António M Lopes, LAETA/INEGI, Faculty of Engineering, University of Porto, Rua Dr. Roberto Frias, 4200-465 Porto, Portugal
- Liping Chen, School of Electrical Engineering and Automation, Hefei University of Technology, Hefei 230009, China
4. Fang H, Zhang M, He S, Luan X, Liu F, Ding Z. Solving the Zero-Sum Control Problem for Tidal Turbine System: An Online Reinforcement Learning Approach. IEEE Transactions on Cybernetics 2023; 53:7635-7647. [PMID: 35839191] [DOI: 10.1109/tcyb.2022.3186886]
Abstract
A novel completely mode-free integral reinforcement learning (CMFIRL)-based iteration algorithm is proposed in this article to solve the two-player zero-sum game and Nash equilibrium problem, that is, to find the optimal control policy pair, for a tidal turbine system described by a continuous-time Markov jump linear model with exact transition probabilities and completely unknown dynamics. First, the tidal turbine system is modeled as a Markov jump linear system, and a subsystem transformation technique is designed to decouple the jumping modes. Then, a completely mode-free reinforcement learning algorithm is employed to solve the game-coupled algebraic Riccati equations without using knowledge of the system dynamics, in order to reach the Nash equilibrium. The learning algorithm uses a single iteration loop that updates the control policy and the disturbance policy simultaneously. An exploration signal is added to excite the system, and the convergence of the CMFIRL iteration algorithm is rigorously proved. Finally, a simulation example illustrates the effectiveness and applicability of the control design approach.
5. Deng Z, Liu Y. Nash Equilibrium Seeking Algorithm Design for Distributed Nonsmooth Multicluster Games Over Weight-Balanced Digraphs. IEEE Transactions on Neural Networks and Learning Systems 2023; 34:10802-10811. [PMID: 35544491] [DOI: 10.1109/tnnls.2022.3171535]
Abstract
In this article, we study multicluster games over weight-balanced digraphs, where the cost functions of all players are nonsmooth. Moreover, not only are the decisions of all players constrained by heterogeneous local constraints, but the decisions of players in the same cluster are also constrained by coupling constraints. Due to the nonsmooth cost functions, the coupling constraints, the general local convex constraints, and the weight-balanced digraphs, existing Nash equilibrium seeking algorithms cannot solve this problem. To seek the Nash equilibrium of the game, we design a distributed algorithm based on subgradient descent, differential inclusions, and projection operations. In the algorithm, a distributed learning strategy is embedded so that the players can estimate the decisions of other players. Moreover, we analyze the asymptotic convergence of the algorithm via the set-valued LaSalle invariance principle. Finally, a numerical simulation of electricity market games illustrates the effectiveness of our result.
6. Shi Y, Hu J, Ghosh BK. Adaptive Output Containment Tracking Control for Heterogeneous Wide-Area Networks with Aperiodic Intermittent Communication and Uncertain Leaders. Sensors (Basel) 2023; 23:8631. [PMID: 37896724] [PMCID: PMC10611305] [DOI: 10.3390/s23208631]
Abstract
This paper proposes an adaptive distributed hybrid control approach to investigate the output containment tracking problem of heterogeneous wide-area networks with intermittent communication. First, a clustered network is modeled for a wide-area scenario, and an aperiodic intermittent communication mechanism is imposed on the clusters such that they communicate only through their leaders. Second, to remove the assumption that each follower must know the system matrix of the leaders and to achieve output containment, a distributed adaptive hybrid control strategy is proposed for each agent based on an internal model and an adaptive estimation mechanism. Third, sufficient conditions based on the average dwell time are provided for achieving output containment using a Lyapunov function method, from which the exponential stability of the closed-loop system is analyzed. Finally, simulation results demonstrate the effectiveness of the proposed adaptive distributed intermittent control strategy.
Affiliation(s)
- Yanpeng Shi, School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
- Jiangping Hu, School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China; Yangtze Delta Region Institute (Huzhou), University of Electronic Science and Technology of China, Huzhou 313001, China
- Bijoy Kumar Ghosh, School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China; Department of Mathematics and Statistics, Texas Tech University, Lubbock, TX 79409-1042, USA
7. Possieri C, Sassano M. Data-Driven Policy Iteration for Nonlinear Optimal Control Problems. IEEE Transactions on Neural Networks and Learning Systems 2023; 34:7365-7376. [PMID: 35100122] [DOI: 10.1109/tnnls.2022.3142501]
Abstract
The design of optimal control laws for nonlinear systems is tackled without knowledge of the underlying plant and of a functional description of the cost function. The proposed data-driven method is based only on real-time measurements of the state of the plant and of the (instantaneous) value of the reward signal and relies on a combination of ideas borrowed from the theories of optimal and adaptive control problems. As a result, the architecture implements a policy iteration strategy in which, hinging on the use of neural networks, the policy evaluation step and the computation of the relevant information instrumental for the policy improvement step are performed in a purely continuous-time fashion. Furthermore, the desirable features of the design method, including convergence rate and robustness properties, are discussed. Finally, the theory is validated via two benchmark numerical simulations.
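The policy evaluation/improvement loop that such data-driven schemes approximate can be illustrated in model-based form by Kleinman's policy iteration for the continuous-time LQR special case (the system matrices below are invented for illustration; the paper's method needs no such model):

```python
import numpy as np

# Kleinman-style policy iteration for continuous-time LQR: alternate
# (i) policy evaluation via a Lyapunov equation and (ii) policy improvement.
# K0 = 0 is admissible here because A is Hurwitz (eigenvalues -1, -1).
A = np.array([[0.0, 1.0], [-1.0, -2.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

def lyap_ct(Acl, Qk):
    # Solve Acl^T P + P Acl + Qk = 0 by vectorization with Kronecker products.
    n = Acl.shape[0]
    M = np.kron(Acl.T, np.eye(n)) + np.kron(np.eye(n), Acl.T)
    return np.linalg.solve(M, -Qk.reshape(-1)).reshape(n, n)

K = np.zeros((1, 2))
for _ in range(20):
    P = lyap_ct(A - B @ K, Q + K.T @ R @ K)   # evaluate the current policy
    K = np.linalg.solve(R, B.T @ P)           # improve the policy

# At convergence, P satisfies the continuous-time algebraic Riccati equation.
care_residual = A.T @ P + P @ A - P @ B @ np.linalg.solve(R, B.T @ P) + Q
```

Each policy evaluation is exact here because the model is known; the paper's contribution is performing the same two steps from real-time state and reward measurements alone.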
8. Lin Z, Duan J, Li SE, Ma H, Li J, Chen J, Cheng B, Ma J. Policy-Iteration-Based Finite-Horizon Approximate Dynamic Programming for Continuous-Time Nonlinear Optimal Control. IEEE Transactions on Neural Networks and Learning Systems 2023; 34:5255-5267. [PMID: 37015565] [DOI: 10.1109/tnnls.2022.3225090]
Abstract
The Hamilton-Jacobi-Bellman (HJB) equation serves as the necessary and sufficient condition for the optimal solution to the continuous-time (CT) optimal control problem (OCP). Compared with the infinite-horizon HJB equation, solving the finite-horizon (FH) HJB equation has been a long-standing challenge, because the partial time derivative of the value function appears as an additional unknown term. To address this problem, this study is the first to establish the link between the partial time derivative and the terminal-time utility function, which enables the use of the policy iteration (PI) technique to solve CT FH OCPs. Based on this key finding, an FH approximate dynamic programming (ADP) algorithm is proposed within an actor-critic framework. The algorithm is shown to exhibit important convergence and optimality properties. Importantly, with multilayer neural networks (NNs) in the actor-critic architecture, the algorithm is suitable for CT FH OCPs involving more general nonlinear and complex systems. Finally, the effectiveness of the proposed algorithm is demonstrated through a series of simulations on both a linear quadratic regulator (LQR) problem and a nonlinear vehicle tracking problem.
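For intuition about why finite horizons make the value function time-varying, the elementary discrete-time analogue is the finite-horizon LQR, solved exactly by a backward Riccati recursion (matrices, horizon, and terminal weight below are invented for illustration; the paper itself treats the continuous-time nonlinear case):

```python
import numpy as np

# Finite-horizon discrete-time LQR by backward dynamic programming: the value
# function P_t and gain K_t are time-varying, unlike the infinite-horizon case.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q, R, QT = np.eye(2), np.array([[1.0]]), 10.0 * np.eye(2)
N = 50                                         # horizon length

P = QT                                         # value function at terminal time
gains = []
for _ in range(N):                             # sweep backward from t = N-1 to 0
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    P = Q + K.T @ R @ K + (A - B @ K).T @ P @ (A - B @ K)
    gains.append(K)
gains.reverse()                                # gains[t] is applied at time t

x = np.array([[1.0], [0.0]])
for t in range(N):                             # roll out the time-varying policy
    x = (A - B @ gains[t]) @ x
final_norm = float(np.linalg.norm(x))          # state regulated toward the origin
```

The backward sweep is the discrete counterpart of the partial-time-derivative term in the FH HJB equation: the recursion starts from the terminal-time utility, which is exactly the link the paper exploits in continuous time.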
9. Razali MR, Mohd Faudzi AA, Shamsudin AU, Mohamaddan S. A hybrid controller method with genetic algorithm optimization to measure position and angular for mobile robot motion control. Front Robot AI 2023; 9:1087371. [PMID: 36714801] [PMCID: PMC9876975] [DOI: 10.3389/frobt.2022.1087371]
Abstract
Due to the complexity of autonomous mobile robot requirements and rapid technological change, developing safe and efficient path tracking has become complex and knowledge-intensive, so the demand for advanced algorithms has increased rapidly. Analyzing unstructured gain data has been of growing interest among researchers, yielding valuable information in fields such as path planning and motion control. Among these, motion control is a vital part of fast, secure operation. Yet current approaches struggle to manage unstructured gain data and to produce accurate local planning because knowledge of gain optimization has not been formulated. Therefore, this research aims to design a new gain optimization approach that helps researchers identify gain values, together with a qualitative comparative study of up-to-date controllers. Gain optimization in this context means classifying near-optimal gain values and processes. To this end, a domain controller is developed based on the attributes of the Fuzzy-PID parameters. The development of the fuzzy logic controller requires the PID controller parameters, which are fuzzified and defuzzified according to the resulting 49 fuzzy rules. This fuzzy inference is then optimized by a genetic algorithm (GA). The domain controller is expected to improve the position and angular PID path planning control algorithm so that it meets autonomy demands.
Affiliation(s)
- Muhammad Razmi Razali, Faculty of Electrical Engineering, Universiti Teknologi Malaysia, Johor Bahru, Malaysia
- Ahmad Athif Mohd Faudzi, Faculty of Electrical Engineering, Universiti Teknologi Malaysia, Johor Bahru, Malaysia; Centre for Artificial Intelligence and Robotics, Universiti Teknologi Malaysia, Kuala Lumpur, Malaysia
- Abu Ubaidah Shamsudin, Fakulti Kejuruteraan Elektrik dan Elektronik, Universiti Tun Hussein Onn Malaysia, Parit Raja, Malaysia
- Shahrol Mohamaddan, Department of Bioscience and Engineering, College of Systems Engineering and Science, Shibaura Institute of Technology (SIT), Saitama, Japan
10. Rizvi SAA, Pertzborn AJ, Lin Z. Reinforcement Learning Based Optimal Tracking Control Under Unmeasurable Disturbances With Application to HVAC Systems. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:7523-7533. [PMID: 34129505] [PMCID: PMC9703879] [DOI: 10.1109/tnnls.2021.3085358]
Abstract
This paper presents the design of an optimal controller for solving tracking problems subject to unmeasurable disturbances and unknown system dynamics using reinforcement learning (RL). Many existing RL control methods account for disturbance by directly measuring it and manipulating it for exploration during the learning process, thereby preventing any disturbance-induced bias in the control estimates. In most practical scenarios, however, the disturbance is neither measurable nor manipulable. The main contribution of this article is the introduction of a combination of a bias compensation mechanism and integral action in the Q-learning framework, which removes the need to measure or manipulate the disturbance while preventing disturbance-induced bias in the optimal control estimates. A bias-compensated Q-learning scheme is presented that learns the disturbance-induced bias terms separately from the optimal control parameters and ensures the convergence of the control parameters to the optimal solution even in the presence of unmeasurable disturbances. Both state-feedback and output-feedback algorithms are developed based on policy iteration (PI) and value iteration (VI) that guarantee the convergence of the tracking error to zero. The feasibility of the design is validated on a practical optimal control application: a heating, ventilating, and air conditioning (HVAC) zone controller.
11. Tu Vu V, Pham TL, Dao PN. Disturbance observer-based adaptive reinforcement learning for perturbed uncertain surface vessels. ISA Transactions 2022; 130:277-292. [PMID: 35450728] [DOI: 10.1016/j.isatra.2022.03.027]
Abstract
This article considers the tracking and convergence problem of disturbance observer (DO)-based optimal control design for uncertain surface vessels (SVs) subject to external disturbance. The advantage of the proposed optimal control using adaptive/approximate reinforcement learning (ARL) is that the whole SV is treated with a single dynamic equation, without the conventional separation technique. Additionally, thanks to an appropriate disturbance observer, the attraction region of the tracking error is remarkably reduced. A particular case of the optimal control problem is solved directly in order to choose suitable activation functions for the ARL. Furthermore, the proposed ARL-based optimal control handles the non-autonomous property of the closed-loop tracking-error SV model by considering an equivalent system. Based on a Lyapunov function candidate built from the optimal value function and a quadratic form of the actor/critic weight estimation errors, the stability and convergence of the closed-loop system are proven. Examples are given to verify and demonstrate the effectiveness of the new control strategy.
Affiliation(s)
- Van Tu Vu, Haiphong University, Haiphong, Viet Nam
- Thanh Loc Pham, School of Electrical Engineering, Hanoi University of Science and Technology, Hanoi, Viet Nam
- Phuong Nam Dao, School of Electrical Engineering, Hanoi University of Science and Technology, Hanoi, Viet Nam
12. Xue S, Luo B, Liu D, Gao Y. Neural network-based event-triggered integral reinforcement learning for constrained H∞ tracking control with experience replay. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.09.119]
13. Peng Z, Luo R, Hu J, Shi K, Nguang SK, Ghosh BK. Optimal Tracking Control of Nonlinear Multiagent Systems Using Internal Reinforce Q-Learning. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:4043-4055. [PMID: 33587710] [DOI: 10.1109/tnnls.2021.3055761]
Abstract
In this article, a novel reinforcement learning (RL) method is developed to solve the optimal tracking control problem of unknown nonlinear multiagent systems (MASs). Different from representative RL-based optimal control algorithms, an internal reinforce Q-learning (IrQL) method is proposed, in which an internal reinforce reward (IRR) function is introduced for each agent to improve its capability of receiving more long-term information from the local environment. In the IrQL design, a Q-function is defined on the basis of the IRR function, and an iterative IrQL algorithm is developed to learn the optimal distributed control scheme, followed by rigorous convergence and stability analysis. Furthermore, a distributed online learning framework, namely reinforce-critic-actor neural networks, is established to implement the proposed approach, with the three networks estimating the IRR function, the Q-function, and the optimal control scheme, respectively. The implementation is designed in a data-driven way, without requiring knowledge of the system dynamics. Finally, simulations and comparisons with a classical method demonstrate the effectiveness of the proposed tracking control method.
14. Ganguli S, Kaur G, Sarkar P. An approximate model matching technique for controller design of linear time-invariant systems using hybrid firefly-based algorithms. ISA Transactions 2022; 127:437-448. [PMID: 34538645] [DOI: 10.1016/j.isatra.2021.08.043]
Abstract
In this article, a delta-operator-based intelligent control scheme is developed via approximate model matching, using new firefly-based hybrid metaheuristic algorithms developed by the authors. The design approach is systematic and applies to a wide variety of plant models. It relies mainly on an approximate model match and yields rational lower-order controllers that can be implemented using only output feedback. The plant model, attached to a PID controller with unknown parameters, is compared with a reference model using an approximation technique to realize the controller in the unified domain. The controller parameters estimated with the unified delta-operator approach are quite similar to those found in the continuous-time domain, establishing a consolidated approach to controller synthesis. Some typical models widely used in the literature further justify the usefulness of the advocated techniques. A sufficient number of comparisons has been carried out to validate the efficacy of the presented techniques, and the percentage improvement of the proposed methods over the parent algorithms and other techniques is shown for each test system. Selecting a suitable reference model may increase the utility of the new control scheme for various unstable and non-minimum-phase plant models.
Affiliation(s)
- Souvik Ganguli, Department of Electrical & Instrumentation Engineering, Thapar Institute of Engineering and Technology, Patiala 147004, Punjab, India
- Gagandeep Kaur, Department of Electrical & Instrumentation Engineering, Thapar Institute of Engineering and Technology, Patiala 147004, Punjab, India
- Prasanta Sarkar, Department of Electrical Engineering, National Institute of Technical Teachers' Training & Research, Kolkata 700106, West Bengal, India
15. Yu Y, Tran H. An XGBoost-Based Fitted Q Iteration for Finding the Optimal STI Strategies for HIV Patients. IEEE Transactions on Neural Networks and Learning Systems 2022; PP:648-656. [PMID: 35653445] [DOI: 10.1109/tnnls.2022.3176204]
Abstract
The computational algorithm proposed in this article is an important step toward the development of computational tools that could help guide clinicians to personalize the management of human immunodeficiency virus (HIV) infection. In this article, an XGBoost-based fitted Q iteration algorithm is proposed for finding the optimal structured treatment interruption (STI) strategies for HIV patients. Using the XGBoost-based fitted Q iteration algorithm, we can obtain acceptable and optimal STI strategies with fewer training data, when compared with the extra-tree-based fitted Q iteration algorithm, deep Q-networks (DQNs), and proximal policy optimization (PPO) algorithm. In addition, the XGBoost-based fitted Q iteration algorithm is computationally more efficient than the extra-tree-based fitted Q iteration algorithm.
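The fitted Q iteration loop itself can be sketched with a stand-in regressor (a tabular averager replaces XGBoost so the example stays dependency-free, and the environment is an invented five-state chain, not the HIV dynamics of the paper):

```python
import random

# Fitted Q iteration sketch: regress Q on bootstrapped Bellman targets computed
# over a fixed batch of transitions. Invented environment: five-state chain,
# action 1 moves right, action 0 moves left, reward 1 for reaching state 4.
N_S, N_A, GAMMA = 5, 2, 0.9

def step(s, a):
    s2 = min(s + 1, N_S - 1) if a == 1 else max(s - 1, 0)
    return s2, (1.0 if s2 == N_S - 1 else 0.0)

random.seed(0)
batch = []
for _ in range(2000):                  # fixed batch of random transitions
    s, a = random.randrange(N_S), random.randrange(N_A)
    s2, r = step(s, a)
    batch.append((s, a, r, s2))

Q = [[0.0] * N_A for _ in range(N_S)]
for _ in range(60):                    # fitted Q iteration sweeps
    targets = {}
    for s, a, r, s2 in batch:
        y = r + GAMMA * max(Q[s2])     # bootstrapped regression target
        targets.setdefault((s, a), []).append(y)
    # "fit" the stand-in regressor: average the targets in each (s, a) cell
    Q = [[sum(targets[(s, a)]) / len(targets[(s, a)]) for a in range(N_A)]
         for s in range(N_S)]

policy = [max(range(N_A), key=lambda a: Q[s][a]) for s in range(N_S)]
```

In the paper's setting, the per-cell averaging step would be replaced by fitting an XGBoost regressor on the `(s, a) -> y` pairs, which is what allows continuous state spaces such as the HIV patient model.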
16. Deep reinforcement learning based active disturbance rejection control for ship course control. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2021.06.096]
17. Tu Y, Fang H, Yin Y, He S. Reinforcement learning-based nonlinear tracking control system design via LDI approach with application to trolley system. Neural Comput Appl 2022. [DOI: 10.1007/s00521-021-05909-8]
18. Luo Z, Xiong W, Huang C. Finite-iteration learning tracking of multi-agent systems via the distributed optimization method. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2021.08.140]
19. Off-Policy: Model-Free Optimal Synchronization Control for Complex Dynamical Networks. Neural Process Lett 2022. [DOI: 10.1007/s11063-022-10748-2]
20. Luo S, Lewis FL, Song Y, Ouakad HM. Optimal Synchronization of Unidirectionally Coupled FO Chaotic Electromechanical Devices With the Hierarchical Neural Network. IEEE Transactions on Neural Networks and Learning Systems 2022; 33:1192-1202. [PMID: 33296315] [DOI: 10.1109/tnnls.2020.3041350]
Abstract
This article solves the problem of optimal synchronization, which is important but challenging, for coupled fractional-order (FO) chaotic electromechanical devices, composed of mechanical and electrical oscillators and an electromagnetic field, by using a hierarchical neural network structure. The synchronization model of the FO electromechanical devices with capacitive and resistive couplings is built, and phase diagrams reveal that the dynamic properties are closely related to the sets of physical parameters, coupling coefficients, and fractional orders. To force the slave system to move from its original orbits to the orbits of the master system, an optimal synchronization policy is proposed that includes an adaptive neural feedforward policy and an optimal neural feedback policy. The feedforward controller is developed in the framework of FO backstepping, integrated with the hierarchical neural network to estimate unknown functions of the dynamic system, where the network uses formula transformation and a hierarchical form to reduce the number of weights and membership functions. Also, an adaptive dynamic programming (ADP) policy is proposed to address the zero-sum differential game in the optimal neural feedback controller, where the hierarchical neural network is designed to yield solutions of the constrained Hamilton-Jacobi-Isaacs (HJI) equation online. The presented scheme not only ensures uniform ultimate boundedness of the closed-loop coupled FO chaotic electromechanical devices and realizes optimal synchronization but also achieves a minimum value of the cost function. Simulation results further show the validity of the presented scheme.
21. Mu C, Peng J, Luo H, Wang K. Data-based decentralized learning scheme for nonlinear systems with mismatched interconnections. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2021.11.002]
22. Liu Q, Kwong CF, Wei S, Zhou S, Li L, Kar P. Reinforcement learning-based joint self-optimisation method for the fuzzy logic handover algorithm in 5G HetNets. Neural Comput Appl 2021. [DOI: 10.1007/s00521-021-06673-5]
23. Zhao F, Gao W, Jiang ZP, Liu T. Event-Triggered Adaptive Optimal Control With Output Feedback: An Adaptive Dynamic Programming Approach. IEEE Transactions on Neural Networks and Learning Systems 2021; 32:5208-5221. [PMID: 33035169] [DOI: 10.1109/tnnls.2020.3027301]
Abstract
This article presents an event-triggered output-feedback adaptive optimal control method for continuous-time linear systems. First, it is shown that the unmeasurable states can be reconstructed from the measured input and output data. An event-based feedback strategy is then proposed to reduce the number of controller updates and save communication resources. The discrete-time algebraic Riccati equation is solved iteratively through event-triggered adaptive dynamic programming based on both policy iteration (PI) and value iteration (VI) methods. The convergence of the proposed algorithm and the stability of the closed-loop system are established using Lyapunov techniques. Two numerical examples verify the effectiveness of the design methodology.
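The object that such PI/VI schemes converge to, the discrete-time algebraic Riccati equation, can be illustrated with the model-based value-iteration recursion (the matrices below are invented for illustration; the article's ADP scheme reaches the same fixed point from input/output data alone):

```python
import numpy as np

# Model-based value iteration for the discrete-time algebraic Riccati equation:
# P_{k+1} = A^T P_k A - A^T P_k B (R + B^T P_k B)^{-1} B^T P_k A + Q.
# (A, B) is stabilizable and Q > 0, so iterating from P = 0 converges.
A = np.array([[1.0, 0.1], [0.0, 0.9]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])

P = np.zeros((2, 2))
for _ in range(500):
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    P = A.T @ P @ (A - B @ K) + Q       # Riccati value-iteration step

K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
spectral_radius = max(abs(np.linalg.eigvals(A - B @ K)))
```

Unlike policy iteration, this recursion needs no initial stabilizing gain, which is one reason VI variants are attractive when the learning starts from scratch.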
24. Sharma S, Padhy PK. Extended B-polynomial neural network for time-delayed system modeling using sampled data. Journal of Intelligent & Fuzzy Systems 2021. [DOI: 10.3233/jifs-210580]
Abstract
The combination of machine learning and artificial intelligence has already proved its potential for achieving remarkable results in modeling unknown systems. These techniques commonly use sufficient data samples to train and optimize their architectures. In the present era, with ample storage and computational power available, machine learning-based data-driven system modeling approaches are gaining popularity because they do not interrupt normal system operation and work solely on collected data. This work proposes a data-driven parametric neural network technique for modeling time-delayed systems, a demanding and challenging area of research that constitutes a nonlinear optimization problem. The key contribution of this work is the inclusion of an extended B-polynomial in the network structure for estimating time-delayed first- and second-order system models. Such models are extensively used for simulation, prediction, control, and monitoring. An adaptive-learning-based convergence proof of the proposed algorithm is given with the help of Lyapunov stability theory. The proposed algorithm is compared with existing techniques on some well-known example problems, and a real practical plant is also included to validate the proposed concept.
Affiliation(s)
- Sudeep Sharma
- Department of Electronics and Communication, PDPMIIITDM, Jabalpur, Madhya Pradesh, India
- Prabin K. Padhy
- Department of Electronics and Communication, PDPMIIITDM, Jabalpur, Madhya Pradesh, India
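The paper's extended B-polynomial network is not reproduced here, but the underlying idea of fitting sampled response data on a B-polynomial (Bernstein) basis can be illustrated with a plain least-squares fit of a first-order time-delayed step response. All plant values (gain 1, time constant 1, delay 0.5) are assumed for illustration.

```python
import numpy as np
from math import comb

def bernstein_basis(s, n):
    """Design matrix of Bernstein polynomials B_{i,n}(s) for s in [0, 1]."""
    s = np.asarray(s)
    return np.stack([comb(n, i) * s**i * (1 - s)**(n - i) for i in range(n + 1)], axis=1)

def fit_rms_error(y, s, degree):
    """Least-squares fit on the Bernstein basis; return RMS fit error."""
    Phi = bernstein_basis(s, degree)
    coef, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return np.sqrt(np.mean((Phi @ coef - y) ** 2))

# Sampled unit-step response of a first-order lag with input delay
t = np.linspace(0.0, 5.0, 200)
y = np.where(t >= 0.5, 1.0 - np.exp(-(t - 0.5)), 0.0)
s = t / t[-1]  # normalize time to [0, 1]

# A richer B-polynomial basis yields a tighter fit of the delayed response
err_low = fit_rms_error(y, s, 4)
err_high = fit_rms_error(y, s, 12)
```

The paper's contribution lies in embedding such a basis inside an adaptively trained network with a proven convergence guarantee; the sketch above only shows why the B-polynomial basis is a natural parameterization for sampled response data.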
|
25
|
Adaptive feedforward RBF neural network control with the deterministic persistence of excitation. Neural Comput Appl 2021. [DOI: 10.1007/s00521-021-06293-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/20/2022]
|
26
|
Na J, Zhao J, Gao G, Li Z. Output-Feedback Robust Control of Uncertain Systems via Online Data-Driven Learning. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2021; 32:2650-2662. [PMID: 32706646 DOI: 10.1109/tnnls.2020.3007414] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]
Abstract
Although robust control has been studied for decades, output-feedback robust control design remains challenging. This article proposes a new approach to output-feedback robust control for continuous-time uncertain systems. First, we transform the robust control problem into an optimal control problem for the nominal linear system with a constructive cost function, which simplifies the control design. Then, a modified algebraic Riccati equation (MARE) is constructed by further investigating its relationship with state-feedback optimal control. To solve the derived MARE online, the vectorization operation and the Kronecker product are applied to reformulate the output Lyapunov function, and a new online data-driven learning method is suggested to learn its solution. Consequently, only the measurable system input and output are used to derive the solution of the MARE, and the output-feedback robust control gain can be obtained without using the unknown system states. The control system stability and the convergence of the derived solution are rigorously proved. Two simulation examples are provided to demonstrate the efficacy of the suggested methods.
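The vectorization/Kronecker-product trick mentioned in the abstract can be illustrated on a standard continuous-time Lyapunov equation A'P + PA + Q = 0 (toy matrices below, not the paper's MARE): since vec(A'P) = (I ⊗ A') vec(P) and vec(PA) = (A' ⊗ I) vec(P) with column-major vectorization, the matrix equation becomes an ordinary linear system in vec(P).

```python
import numpy as np

A = np.array([[-1.0, 2.0], [0.0, -3.0]])  # assumed Hurwitz test matrix
Q = np.eye(2)
n = A.shape[0]
I = np.eye(n)

# vec(A'P + PA) = (I ⊗ A' + A' ⊗ I) vec(P), so A'P + PA + Q = 0
# is solved as M vec(P) = -vec(Q), using column-major (Fortran) ordering.
M = np.kron(I, A.T) + np.kron(A.T, I)
vecP = np.linalg.solve(M, -Q.reshape(-1, order="F"))
P = vecP.reshape(n, n, order="F")
```

The paper applies the same reformulation to its MARE so that the unknown matrix can be learned online from input/output data rather than solved from a known model.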
|
27
|
Wang Y, Karimi HR, Lam HK, Yan H. Fuzzy Output Tracking Control and Filtering for Nonlinear Discrete-Time Descriptor Systems Under Unreliable Communication Links. IEEE TRANSACTIONS ON CYBERNETICS 2020; 50:2369-2379. [PMID: 31217141 DOI: 10.1109/tcyb.2019.2920709] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]
Abstract
In this paper, the problems of output tracking control and filtering are investigated for Takagi-Sugeno fuzzy-approximation-based nonlinear descriptor systems in the discrete-time domain. In particular, the unreliability of the communication links between the sensor and the actuator/filter is taken into account, and the phenomenon of packet dropouts is characterized by a binary Markov chain with uncertain transition probabilities, which may reflect reality more accurately than existing descriptions. A novel bounded real lemma (BRL), which ensures stochastic admissibility with H∞ performance for fuzzy discrete-time descriptor systems despite the uncertain Markov packet dropouts, is presented based on a fuzzy-basis-dependent Lyapunov function. By resorting to the dual conditions of the obtained BRL, a solution for the designed fuzzy output tracking controller is given, and a design method for the full-order fuzzy filter is also provided. Finally, two examples are adopted to show the applicability of the achieved design strategies.
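The binary Markov-chain packet-dropout model described in the abstract can be sketched as a two-state chain whose stationary distribution gives the long-run delivery rate. The transition probabilities below are assumed illustrative values, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two-state Markov chain: state 1 = packet delivered, state 0 = dropped.
# Row = current state, column = next state (assumed illustrative values).
T = np.array([[0.3, 0.7],   # from dropped: P(drop) = 0.3, P(deliver) = 0.7
              [0.1, 0.9]])  # from delivered: P(drop) = 0.1, P(deliver) = 0.9

def simulate_dropouts(steps, s0=1):
    """Simulate the dropout chain; returns the 0/1 delivery sequence."""
    s, states = s0, []
    for _ in range(steps):
        s = rng.choice(2, p=T[s])
        states.append(s)
    return np.array(states)

seq = simulate_dropouts(10_000)
delivery_rate = seq.mean()
# Stationary delivery probability: T[0,1] / (T[0,1] + T[1,0]) = 0.7 / 0.8 = 0.875
```

The paper's harder setting is that these transition probabilities are themselves uncertain, which is what the proposed BRL must tolerate while still certifying H∞ performance.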
|