1
Yang Y, Zou D, He X. Graph Neural Network-Based Node Deployment for Throughput Enhancement. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:14810-14824. [PMID: 37327098 DOI: 10.1109/tnnls.2023.3281643] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/18/2023]
Abstract
The recent rapid growth in mobile data traffic entails a pressing demand for improving the throughput of the underlying wireless communication networks. Network node deployment has been considered as an effective approach for throughput enhancement which, however, often leads to highly nontrivial nonconvex optimizations. Although convex-approximation-based solutions are considered in the literature, their approximation to the actual throughput may be loose and sometimes lead to unsatisfactory performance. With this consideration, in this article, we propose a novel graph neural network (GNN) method for the network node deployment problem. Specifically, we fit a GNN to the network throughput and use the gradients of this GNN to iteratively update the locations of the network nodes. Besides, we show that an expressive GNN has the capacity to approximate both the function value and the gradients of a multivariate permutation-invariant function, as a theoretic support to the proposed method. To further improve the throughput, we also study a hybrid node deployment method based on this approach. To train the desired GNN, we adopt a policy gradient algorithm to create datasets containing good training samples. Numerical experiments show that the proposed methods produce competitive results compared with the baselines.
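The gradient-based location update described in this abstract can be illustrated with a minimal sketch. It does not reproduce the authors' GNN: `surrogate_throughput` is a hypothetical smooth, permutation-invariant stand-in for a trained model, and the user positions, step size, and iteration count are illustrative assumptions. Only the idea of moving node locations along the model's gradient is taken from the abstract.

```python
import numpy as np

def surrogate_throughput(nodes, users):
    """Hypothetical smooth stand-in for a trained throughput model.

    nodes: (N, 2) node locations, users: (U, 2) user locations.
    The value is invariant to the ordering of the nodes.
    """
    diff = users[:, None, :] - nodes[None, :, :]      # (U, N, 2)
    d2 = np.sum(diff ** 2, axis=-1)                   # squared distances
    signal = np.sum(1.0 / (1.0 + d2), axis=1)         # per-user signal
    return float(np.sum(np.log1p(signal)))

def surrogate_gradient(nodes, users):
    """Analytic gradient of surrogate_throughput w.r.t. node locations."""
    diff = users[:, None, :] - nodes[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)
    signal = np.sum(1.0 / (1.0 + d2), axis=1)
    weight = 2.0 / ((1.0 + d2) ** 2 * (1.0 + signal)[:, None])
    return np.sum(weight[:, :, None] * diff, axis=0)  # (N, 2)

def deploy(nodes, users, step=0.05, iters=300):
    """Plain gradient ascent on the node locations (assumed update rule)."""
    nodes = np.array(nodes, dtype=float)
    for _ in range(iters):
        nodes = nodes + step * surrogate_gradient(nodes, users)
    return nodes

users = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
start = np.array([[3.0, 3.0], [-2.0, -2.0]])
final = deploy(start, users)
```

In the article the gradient would come from the fitted GNN rather than from a closed-form expression; the analytic gradient here only keeps the sketch self-contained.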
2
Qiao J, Li M, Wang D. Asymmetric Constrained Optimal Tracking Control With Critic Learning of Nonlinear Multiplayer Zero-Sum Games. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2024; 35:5671-5683. [PMID: 36191112 DOI: 10.1109/tnnls.2022.3208611] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/16/2023]
Abstract
By utilizing a neural-network-based adaptive critic mechanism, the optimal tracking control problem is investigated for nonlinear continuous-time (CT) multiplayer zero-sum games (ZSGs) with asymmetric constraints. Initially, we build an augmented system with the tracking error system and the reference system. Moreover, a novel nonquadratic function is introduced to address asymmetric constraints. Then, we derive the tracking Hamilton-Jacobi-Isaacs (HJI) equation of the constrained nonlinear multiplayer ZSG. However, it is extremely hard to get the analytical solution to the HJI equation. Hence, an adaptive critic mechanism based on neural networks is established to estimate the optimal cost function, so as to obtain the near-optimal control policy set and the near worst disturbance policy set. In the process of neural critic learning, we only utilize one critic neural network and develop a new weight updating rule. After that, by using the Lyapunov approach, the uniform ultimate boundedness stability of the tracking error in the augmented system and the weight estimation error of the critic network is verified. Finally, two simulation examples are provided to demonstrate the efficacy of the established mechanism.
3
Su Q, Pei Z, Tang Z. Tracking Control for a Lower Extremity Exoskeleton Based on Adaptive Dynamic Programing. Biomimetics (Basel) 2023; 8:353. [PMID: 37622958 PMCID: PMC10452450 DOI: 10.3390/biomimetics8040353] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Revised: 08/02/2023] [Accepted: 08/04/2023] [Indexed: 08/26/2023] Open
Abstract
The utilization of lower extremity exoskeletons has witnessed a growing presence across diverse domains such as the military, medical treatment, and rehabilitation. This paper introduces a novel design of a lower extremity exoskeleton specifically tailored for individuals engaged in heavy object carrying tasks. The exoskeleton incorporates an impressive 12 degrees of freedom (DOF), with four of them being effectively controlled through hydraulic cylinders. To achieve optimal control of this intricate lower extremity exoskeleton system, the authors propose an adaptive dynamic programming (ADP) algorithm. Several crucial components are established to implement this control scheme. These include the formulation of the state equation for the lower extremity exoskeleton system, which is well-suited for the ADP algorithm. Additionally, a corresponding performance index function based on the tracking error is devised, along with the game algebraic Riccati equation. By employing the value iteration ADP scheme, the lower extremity exoskeleton demonstrates highly effective tracking control. This research not only highlights the potential of the proposed control approach but also showcases its ability to enhance the overall performance and functionality of lower extremity exoskeletons, particularly in scenarios involving heavy object carrying. Overall, this study contributes to the advancement of lower extremity exoskeleton technology and offers valuable insights into the application of ADP algorithms for achieving precise and efficient control in demanding tasks.
Affiliation(s)
- Zhiyong Tang: School of Automation Science and Electrical Engineering, Beihang University, 37 Xueyuan Road, Haidian District, Beijing 100191, China; (Q.S.); (Z.P.)
4
Peng Z, Ji H, Zou C, Kuang Y, Cheng H, Shi K, Ghosh BK. Optimal H∞ tracking control of nonlinear systems with zero-equilibrium-free via novel adaptive critic designs. Neural Netw 2023; 164:105-114. [PMID: 37148606 DOI: 10.1016/j.neunet.2023.04.021] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/23/2022] [Revised: 02/16/2023] [Accepted: 04/12/2023] [Indexed: 05/08/2023]
Abstract
In this paper, a novel adaptive critic control method is designed to solve an optimal H∞ tracking control problem for continuous nonlinear systems with nonzero equilibrium based on adaptive dynamic programming (ADP). To guarantee the finiteness of a cost function, traditional methods generally assume that the controlled system has a zero equilibrium point, which is not true in practical systems. In order to overcome such obstacle and realize H∞ optimal tracking control, this paper proposes a novel cost function design with respect to disturbance, tracking error and the derivative of tracking error. Based on the designed cost function, the H∞ control problem is formulated as two-player zero-sum differential games, and then a policy iteration (PI) algorithm is proposed to solve the corresponding Hamilton-Jacobi-Isaacs (HJI) equation. In order to obtain the online solution to the HJI equation, a single-critic neural network structure based on PI algorithm is established to learn the optimal control policy and the worst-case disturbance law. It is worth mentioning that the proposed adaptive critic control method can simplify the controller design process when the equilibrium of the systems is not zero. Finally, simulations are conducted to evaluate the tracking performance of the proposed control methods.
Affiliation(s)
- Zhinan Peng: School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
- Hanqi Ji: School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
- Chaobin Zou: School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
- Yiqun Kuang: School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
- Hong Cheng: School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
- Kaibo Shi: School of Information Science and Engineering, Chengdu University, Chengdu, 610106, China
- Bijoy Kumar Ghosh: Department of Mathematics and Statistics, Texas Tech University, Lubbock, TX, 79409-1042, USA
5
Guo Z, Chen H, Li S. Deep Reinforcement Learning-Based One-to-Multiple Cooperative Computing in Large-Scale Event-Driven Wireless Sensor Networks. SENSORS (BASEL, SWITZERLAND) 2023; 23:3237. [PMID: 36991947 PMCID: PMC10058844 DOI: 10.3390/s23063237] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2023] [Revised: 03/09/2023] [Accepted: 03/15/2023] [Indexed: 06/19/2023]
Abstract
Emergency event monitoring is a hot topic in wireless sensor networks (WSNs). Benefiting from the progress of Micro-Electro-Mechanical System (MEMS) technology, it is possible to process emergency events locally by using the computing capacities of redundant nodes in large-scale WSNs. However, it is challenging to design a resource scheduling and computation offloading strategy for a large number of nodes in an event-driven dynamic environment. In this paper, focusing on cooperative computing with a large number of nodes, we propose a set of solutions, including dynamic clustering, inter-cluster task assignment and intra-cluster one-to-multiple cooperative computing. Firstly, an equal-size K-means clustering algorithm is proposed, which activates the nodes around event location and then divides active nodes into several clusters. Then, through inter-cluster task assignment, every computation task of events is alternately assigned to the cluster heads. Next, in order to make each cluster efficiently complete the computation tasks within the deadline, a Deep Deterministic Policy Gradient (DDPG)-based intra-cluster one-to-multiple cooperative computing algorithm is proposed to obtain a computation offloading strategy. Simulation studies show that the performance of the proposed algorithm is close to that of the exhaustive algorithm and better than other classical algorithms and the Deep Q Network (DQN) algorithm.
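The inter-cluster step in this abstract, where computation tasks are alternately assigned to cluster heads, can be read as a round-robin rotation. That reading is an assumption, and the function name, task labels, and head labels below are hypothetical. The sketch covers only this assignment step, not the clustering or the DDPG-based offloading.

```python
from itertools import cycle

def assign_tasks(tasks, cluster_heads):
    """Assign tasks to cluster heads in rotation (assumed round-robin rule)."""
    if not cluster_heads:
        raise ValueError("at least one cluster head is required")
    assignment = {head: [] for head in cluster_heads}
    # cycle() repeats the head list so each new task goes to the next head.
    for task, head in zip(tasks, cycle(cluster_heads)):
        assignment[head].append(task)
    return assignment

plan = assign_tasks(["t0", "t1", "t2", "t3", "t4"], ["head_a", "head_b"])
```

With five tasks and two heads, the first head receives the tasks at even positions and the second head the tasks at odd positions.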
6
Sun J, Dai J, Zhang H, Yu S, Xu S, Wang J. Neural-Network-Based Immune Optimization Regulation Using Adaptive Dynamic Programming. IEEE TRANSACTIONS ON CYBERNETICS 2023; 53:1944-1953. [PMID: 35767503 DOI: 10.1109/tcyb.2022.3179302] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
This article investigates optimal regulation scheme between tumor and immune cells based on the adaptive dynamic programming (ADP) approach. The therapeutic goal is to inhibit the growth of tumor cells to allowable injury degree and maximize the number of immune cells in the meantime. The reliable controller is derived through the ADP approach to make the number of cells achieve the specific ideal states. First, the main objective is to weaken the negative effect caused by chemotherapy and immunotherapy, which means that the minimal dose of chemotherapeutic and immunotherapeutic drugs can be operational in the treatment process. Second, according to the nonlinear dynamical mathematical model of tumor cells, chemotherapy and immunotherapeutic drugs can act as powerful regulatory measures, which is a closed-loop control behavior. Finally, states of the system and critic weight errors are proved to be ultimately uniformly bounded with the appropriate optimization control strategy and the simulation results are shown to demonstrate the effectiveness of the cybernetics methodology.
7
Wu Q, Zhao B, Liu D, Polycarpou MM. Event-triggered adaptive dynamic programming for decentralized tracking control of input constrained unknown nonlinear interconnected systems. Neural Netw 2022; 157:336-349. [DOI: 10.1016/j.neunet.2022.10.025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2022] [Revised: 09/26/2022] [Accepted: 10/24/2022] [Indexed: 11/11/2022]
8
Zhao M, Wang D, Ha M, Qiao J. Evolving and Incremental Value Iteration Schemes for Nonlinear Discrete-Time Zero-Sum Games. IEEE TRANSACTIONS ON CYBERNETICS 2022; PP:4487-4499. [PMID: 36063514 DOI: 10.1109/tcyb.2022.3198078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
In this article, evolving and incremental value iteration (VI) frameworks are constructed to address the discrete-time zero-sum game problem. First, the evolving scheme means that the closed-loop system is regulated by using the evolving policy pair. During the control stage, we are committed to establishing the stability criterion in order to guarantee the availability of evolving policy pairs. Second, a novel incremental VI algorithm, which takes the historical information of the iterative process into account, is developed to solve the regulation and tracking problems for the nonlinear zero-sum game. Via introducing different incremental factors, it is highlighted that we can adjust the convergence rate of the iterative cost function sequence. Finally, two simulation examples, including linear and nonlinear systems, are conducted to demonstrate the performance and the validity of the proposed evolving and incremental VI schemes.
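For orientation, standard value iteration for a discrete-time zero-sum game can be sketched on a scalar linear-quadratic example. This is the textbook recursion, not the evolving or incremental scheme of the article. The system x_{k+1} = a x_k + b u_k + d w_k, the stage cost q x^2 + r u^2 - gamma^2 w^2, and all numerical values are illustrative assumptions. With V(x) = p x^2, each iteration applies p <- q + a^2 p / (1 + c p), where c = b^2/r - d^2/gamma^2.

```python
def value_iteration(a, b, d, q, r, gamma, iters=200, p0=0.0):
    """Value iteration for a scalar LQ zero-sum game (illustrative setup)."""
    c = b * b / r - d * d / gamma ** 2
    p = p0
    history = [p]
    for _ in range(iters):
        # The maximizing player's problem is concave only if this is positive.
        if gamma ** 2 - d * d * p <= 0:
            raise ValueError("gamma is too small for a saddle point to exist")
        p = q + a * a * p / (1.0 + c * p)
        history.append(p)
    return p, history

def saddle_point_gains(p, a, b, d, r, gamma):
    """Feedback gains u = k_u * x and w = k_w * x for a given value weight p."""
    c = b * b / r - d * d / gamma ** 2
    z = a / (1.0 + c * p)          # closed-loop factor: x_next = z * x
    return -p * b * z / r, p * d * z / gamma ** 2

params = dict(a=0.9, b=1.0, d=0.5, q=1.0, r=1.0, gamma=2.0)
p_star, history = value_iteration(**params)
k_u, k_w = saddle_point_gains(p_star, 0.9, 1.0, 0.5, 1.0, 2.0)
```

Starting from p0 = 0, the sequence of value weights increases monotonically toward the fixed point of the recursion. The incremental factors mentioned in the abstract are said to adjust the rate of that convergence, which this plain recursion does not attempt.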
9
Constrained Optimal Control for Nonlinear Multi-Input Safety-Critical Systems with Time-Varying Safety Constraints. MATHEMATICS 2022. [DOI: 10.3390/math10152744] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/10/2022]
Abstract
In this paper, we investigate the constrained optimal control problem of nonlinear multi-input safety-critical systems with uncertain disturbances and time-varying safety constraints. By utilizing a barrier function transformation, together with a new disturbance-related term and a smooth safety boundary function, a nominal system-dependent multi-input barrier transformation architecture is developed to deal with the time-varying safety constraints and uncertain disturbances. Based on the obtained transformation system, the coupled Hamilton–Jacobi–Bellman (HJB) function is established to obtain the constrained Nash equilibrium solution. In addition, due to the fact that it is difficult to solve the HJB function directly, the single critic neural network (NN) is constructed to approximate the optimal performance index function of different control inputs, respectively. It is proved theoretically that, under the influence of uncertain disturbances and time-varying safety constraints, the system states and neural network parameters can be uniformly ultimately bounded (UUB) by the proposed neural network approximation method. Finally, the effectiveness of the proposed method is verified by two nonlinear simulation examples.
10
Deng C, Yue D, Che WW, Xie X. Cooperative Fault-Tolerant Control for a Class of Nonlinear MASs by Resilient Learning Approach. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; PP:670-679. [PMID: 35675248 DOI: 10.1109/tnnls.2022.3176392] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
In this article, a learning-based resilient fault-tolerant control method is proposed for a class of uncertain nonlinear multiagent systems (MASs) to enhance the security and reliability against denial-of-service (DoS) attacks and actuator faults. With the framework of cooperative output regulation, the developed algorithm consists of designing a distributed resilient observer and a decentralized fault-tolerant controller. Specifically, by using the data-driven method, an online resilient learning algorithm is first presented to learn the unknown exosystem matrix in the presence of DoS attacks. Then, a distributed resilient observer is proposed working against DoS attacks. In addition, based on the developed observer, a decentralized adaptive fault-tolerant controller is designed to compensate for actuator faults. Moreover, the convergence of error systems is shown by using the Lyapunov stability theory. The effectiveness of our result is examined by a simulation example.
11
Zhao Q, Sun J, Wang G, Chen J. Event-Triggered ADP for Nonzero-Sum Games of Unknown Nonlinear Systems. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; 33:1905-1913. [PMID: 33882002 DOI: 10.1109/tnnls.2021.3071545] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 06/12/2023]
Abstract
For nonzero-sum (NZS) games of nonlinear systems, reinforcement learning (RL) or adaptive dynamic programming (ADP) has shown its capability of approximating the desired index performance and the optimal input policy iteratively. In this article, an event-triggered ADP is proposed for NZS games of continuous-time nonlinear systems with completely unknown system dynamics. To achieve the Nash equilibrium solution approximately, the critic neural networks and actor neural networks are utilized to estimate the value functions and the control policies, respectively. Compared with the traditional time-triggered mechanism, the proposed algorithm updates the neural network weights as well as the inputs of players only when a state-based event-triggered condition is violated. It is shown that the system stability and the weights' convergence are still guaranteed under mild assumptions, while occupation of communication and computation resources is considerably reduced. Meanwhile, the infamous Zeno behavior is excluded by proving the existence of a minimum inter-event time (MIET) to ensure the feasibility of the closed-loop event-triggered continuous-time system. Finally, a numerical example is simulated to illustrate the effectiveness of the proposed approach.
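The state-based triggering idea in this abstract, where the input is refreshed only when an event condition is violated, can be shown on a toy example. The sketch uses a scalar linear plant with a fixed feedback gain instead of actor and critic networks. The relative-error threshold `sigma`, the plant parameters, and the Euler discretization are all illustrative assumptions, not the article's design.

```python
def simulate_event_triggered(a=1.0, b=1.0, k=3.0, sigma=0.2,
                             dt=0.01, steps=1000, x0=1.0):
    """Euler simulation of x' = a*x + b*u with u = -k * (last sampled state)."""
    x = x0
    x_held = x0      # state value used by the controller since the last event
    updates = 1      # the initial sample counts as the first update
    for _ in range(steps):
        # Event condition: the gap between held and current state is too large.
        if abs(x - x_held) > sigma * abs(x):
            x_held = x
            updates += 1
        u = -k * x_held
        x = x + dt * (a * x + b * u)
    return x, updates

final_state, n_updates = simulate_event_triggered()
```

A time-triggered controller would refresh the input at every one of the 1000 steps. Here the held input is reused between events, so the number of updates is much smaller while the state still decays toward zero.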
12
Zhang J, Zhang H, Gao Z, Sun S. Time-varying formation control with general linear multi-agent systems by distributed event-triggered mechanisms under fixed and switching topologies. Neural Comput Appl 2022. [DOI: 10.1007/s00521-021-06539-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
13
Liu Q, Kwong CF, Wei S, Zhou S, Li L, Kar P. Reinforcement learning-based joint self-optimisation method for the fuzzy logic handover algorithm in 5G HetNets. Neural Comput Appl 2021. [DOI: 10.1007/s00521-021-06673-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
14
Online event-based adaptive critic design with experience replay to solve partially unknown multi-player nonzero-sum games. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2021.05.087] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/23/2022]
15
Online optimal learning algorithm for Stackelberg games with partially unknown dynamics and constrained inputs. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2021.03.021] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/21/2022]
16
Zhang G, Chu S, Jin X, Zhang W. Composite Neural Learning Fault-Tolerant Control for Underactuated Vehicles With Event-Triggered Input. IEEE TRANSACTIONS ON CYBERNETICS 2021; 51:2327-2338. [PMID: 32692688 DOI: 10.1109/tcyb.2020.3005800] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 06/11/2023]
Abstract
This article presents a novel composite neural learning fault-tolerant algorithm to implement the path-following activity of underactuated vehicles with event-triggered input. With the input event-triggered mechanism, the dominant superiority is to reduce the communication burden in the channel from the controller to actuators. In the proposed scheme, the system uncertainties are dealt with in the fusion of the neural networks (NNs) and the dynamic surface control (DSC) method. The serial-parallel estimation model (SPEM) is constructed to estimate the error dynamics, where the derived prediction error could improve the compensation effect of the NNs. As for the gain uncertainties and the unknown actuator faults, four adaptive parameters are designed to stabilize the related perturbation and not be affected by the triggering instants. Based on the direct Lyapunov theorem, considerable efforts have been made to guarantee the semiglobal uniformly ultimately bounded (SGUUB) stability of the closed-loop system. Finally, comparison and practical experiments are illustrated to verify the superiority of the proposed algorithm.
17
Deng C, Che WW, Wu ZG. A Dynamic Periodic Event-Triggered Approach to Consensus of Heterogeneous Linear Multiagent Systems With Time-Varying Communication Delays. IEEE TRANSACTIONS ON CYBERNETICS 2021; 51:1812-1821. [PMID: 32991298 DOI: 10.1109/tcyb.2020.3015746] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 06/11/2023]
Abstract
This article is concerned with the event-triggered output consensus problem for heterogeneous multiagent systems (MASs) with nonuniform communication delays. Unlike the existing event-triggered consensus results, more general heterogeneous linear MASs and nonuniform communication delays are considered. To reduce communication among subsystems, novel dynamic periodic event-triggered mechanisms are proposed. By using the event-triggered signals at the previous sampling instant, new distributed observers are designed to eliminate asynchronous behavior caused by nonuniform communication delays. Based on the developed observers, the observer error system is converted into a time-delay system with interval time-varying delays. Besides, a controller is designed by using the states of observers. It is shown that the consensus problem can be solved by the proposed method. Finally, an illustrative example is provided to verify the effectiveness of the developed method.