1
Alali M, Imani M. Bayesian reinforcement learning for navigation planning in unknown environments. Front Artif Intell 2024; 7:1308031. [PMID: 39026967 PMCID: PMC11254700 DOI: 10.3389/frai.2024.1308031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2023] [Accepted: 06/19/2024] [Indexed: 07/20/2024]
Abstract
This study focuses on a rescue mission problem, particularly enabling agents/robots to navigate efficiently in unknown environments. Technological advances, including manufacturing, sensing, and communication systems, have raised interest in using robots or drones for rescue operations. Effective rescue operations require quick identification of changes in the environment and/or locating the victims/injuries as soon as possible. Several techniques have been developed in recent years for autonomy in rescue missions, including motion planning, adaptive control, and more recently, reinforcement learning techniques. These techniques rely on full knowledge of the environment or the availability of simulators that can represent real environments during rescue operations. However, in practice, agents might have little or no information about the environment or the number or locations of injuries, preventing/limiting the application of most existing techniques. This study provides a probabilistic/Bayesian representation of the unknown environment, which jointly models the stochasticity in the agent's navigation and the environment uncertainty into a vector called the belief state. This belief state allows offline learning of the optimal Bayesian policy in an unknown environment without the need for any real data/interactions, which guarantees taking actions that are optimal given all available information. To address the large size of belief space, deep reinforcement learning is developed for computing an approximate Bayesian planning policy. The numerical experiments using different maze problems demonstrate the high performance of the proposed policy.
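The belief state described in this abstract can be illustrated with a plain Bayes-rule update over a small set of candidate environments. The following is only a minimal sketch, not the authors' formulation: the three candidate maps, the 0.9 success probability, and the function name `update_belief` are all hypothetical and chosen for illustration.

```python
import numpy as np

def update_belief(belief, likelihoods):
    """Bayes rule: the posterior is proportional to prior times likelihood."""
    posterior = belief * likelihoods
    total = posterior.sum()
    if total == 0.0:
        raise ValueError("observation has zero probability under every hypothesis")
    return posterior / total

# Hypothetical setup: three candidate maps, and the cell ahead is a wall only in the third.
prior = np.array([1.0, 1.0, 1.0]) / 3.0

# Assumed probability that a forward move succeeds under each map
# (0.9 models navigation noise on a free cell, 0.0 a blocked cell).
p_success = np.array([0.9, 0.9, 0.0])

# The agent observes that its forward move succeeded, which rules out the third map.
posterior = update_belief(prior, p_success)
print(posterior)
```

Because the update needs only the prior and the observation likelihoods, beliefs like this can be simulated forward without real interaction data, which is the property the abstract relies on for offline learning.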
Affiliation(s)
- Mohammad Alali
- Department of Electrical and Computer Engineering, Northeastern University, Boston, MA, United States
2
Singh R, Bhushan B. Reinforcement Learning-Based Model-Free Controller for Feedback Stabilization of Robotic Systems. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2023; 34:7059-7073. [PMID: 35015649 DOI: 10.1109/tnnls.2021.3137548] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/14/2023]
Abstract
This article presents a reinforcement learning (RL) algorithm for achieving model-free control of robotic applications. The RL functions are adapted with the least-square temporal difference (LSTD) learning algorithms to develop a model-free state feedback controller by establishing linear quadratic regulator (LQR) as a baseline controller. The classical least-square policy iteration technique is adapted to establish the boundary conditions for complexities incurred by the learning algorithm. Furthermore, the use of exact and approximate policy iterations estimates the parameters of the learning functions for a feedback policy. To assess the operation of the proposed controller, the trajectory tracking and balancing control problems of unmanned helicopters and balancer robotic applications are solved in real-time experiments. The results showed the robustness of the proposed approach in achieving trajectory tracking and balancing control.
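For readers unfamiliar with the LQR baseline mentioned in this abstract, the sketch below computes a discrete-time LQR state-feedback gain by iterating the Riccati recursion. It is the model-based baseline only, not the article's model-free LSTD method, and the double-integrator plant, the weights `Q` and `R`, and the name `dlqr_gain` are assumptions made for illustration.

```python
import numpy as np

def dlqr_gain(A, B, Q, R, iters=500):
    """Iterate the discrete-time Riccati recursion and return the gain K for u = -K x."""
    P = Q.copy()
    for _ in range(iters):
        # K = (R + B' P B)^{-1} B' P A
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        # P = Q + A' P (A - B K), an equivalent form of the Riccati update
        P = Q + A.T @ P @ (A - B @ K)
    return K, P

# Hypothetical plant: a double integrator sampled at dt seconds.
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.5 * dt**2], [dt]])
Q = np.eye(2)          # assumed state weight
R = np.array([[1.0]])  # assumed input weight

K, P = dlqr_gain(A, B, Q, R)
closed_loop_radius = np.max(np.abs(np.linalg.eigvals(A - B @ K)))
print(K, closed_loop_radius)
```

A model-free method such as the one in the article would estimate the quantities behind `K` from data rather than from known `A` and `B`; the model-based gain is useful as a reference against which the learned policy can be compared.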
3
Yan D, Zhang W, Chen H, Shi J. Robust control strategy for multi-UAVs system using MPC combined with Kalman-consensus filter and disturbance observer. ISA TRANSACTIONS 2023; 135:35-51. [PMID: 36175191 DOI: 10.1016/j.isatra.2022.09.021] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/23/2022] [Revised: 09/11/2022] [Accepted: 09/11/2022] [Indexed: 06/16/2023]
Abstract
The stability of formation flight is sensitive not only to external disturbances but also to data observed and transferred between sensors and unmanned aerial vehicles (UAVs). A multi-constrained model predictive control (MPC) strategy, combined with a Kalman-consensus filter (KCF) and a fixed-time disturbance observer (FTDOB), is developed here for the formation control of multiple quadrotors. Firstly, KCF is used to effectively fuse the data shared in the formation with noise and uncertainty, which improves the applicability and robustness of the formation in complex environments. Secondly, FTDOB is able to estimate the external disturbances suffered by the quadrotor in a fixed time and provides real-time compensation for the controller. On this basis, an improved MPC (IMPC) is designed for each UAV of the formation, which improves the computational efficiency while ensuring the asymptotic stability of the system. Finally, the capability and effectiveness of the proposed strategy are verified by simulation in terms of disturbance rejection and noise suppression, as well as good trajectory tracking of the formation.
Affiliation(s)
- Danghui Yan
- Department of Automatic Control, Northwestern Polytechnical University, China
- Weiguo Zhang
- Department of Automatic Control, Northwestern Polytechnical University, China
- Hang Chen
- Department of Automatic Control, Northwestern Polytechnical University, China
- Jingping Shi
- Department of Automatic Control, Northwestern Polytechnical University, China
4
Kumar S, Indiran T, Itty GV, Shettigar J P, Paul TV. Development of a Nonlinear Model Predictive Control-Based Nonlinear Three-Mode Controller for a Nonlinear System. ACS OMEGA 2022; 7:42418-42437. [PMID: 36440136 PMCID: PMC9685787 DOI: 10.1021/acsomega.2c05542] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/28/2022] [Accepted: 10/24/2022] [Indexed: 06/16/2023]
Abstract
This paper presents a novel nonlinear proportional integral derivative (NPID) controller developed from the gain values obtained using the Lyapunov-based nonlinear model predictive controller (LyNMPC). The tuning parameters of the proposed controller are taken from the dynamics of the nonlinear system, and these parameters are dynamic, with their values varying according to the error in the system. In this article, the authors consider two highly nonlinear systems, namely, a batch polymerization reactor and a quadrotor unmanned aerial vehicle. The nonlinear mathematical models of the batch reactor and the quadrotor system are taken from the authors' previous work. The acrylamide polymerization reaction under consideration is an exothermic reaction, thereby making the temperature profile tracking and control a challenging task. The primary aim of this article is to develop the NPID controller based on the LyNMPC algorithm and to validate the NPID on a batch reactor bench-scale plant and on a hardware-in-the-loop platform for the quadrotor hardware. A comparative study of trajectory tracking and control capabilities of LyNMPC on derived nonlinear models of the batch reactor and quadrotor system is presented. The system mathematical models are obtained with the help of the first-principle energy balance equation for the batch reactor and with the nonlinear dynamics of the quadrotor, which are derived based on Newton-Euler formulations. With LyNMPC, the stability of the nonlinear systems can be improved because the error sensitivity is considered in the cost function.
Affiliation(s)
- Suraj Suresh Kumar
- Department of Instrumentation and Control Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka 576104, India
- Thirunavukkarasu Indiran
- Department of Instrumentation and Control Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka 576104, India
- George Vadakkekkara Itty
- Department of Electrical and Electronics Engineering, Mar Baselios Christian College of Engineering and Technology, Idukki, Peerumade 685531, India
- Prajwal Shettigar J
- Department of Mechatronics Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka 576104, India
- Tinu Valsa Paul
- Department of Instrumentation and Control Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka 576104, India
5
Design of a Multi-Constraint Formation Controller Based on Improved MPC and Consensus for Quadrotors. AEROSPACE 2022. [DOI: 10.3390/aerospace9020094] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/16/2022]
Abstract
The formation flight of quadrotor unmanned aerial vehicles (UAVs) is a complex multi-constraint process. When designing a formation controller, the dynamic model of the UAV itself has modeling errors and uncertainties. Model predictive control (MPC) is one of the best control methods for solving the constrained problem. First, a mathematical model of the quadrotor considering disturbance and uncertainty is established using the Lagrange–Euler formulation and is divided into a rotational subsystem (RS) and a translational subsystem (TS). Here, an improved MPC (IMPC) strategy based on an error model is introduced for the control of UAVs. The tracking errors caused by synthesis disturbance can be eliminated because of the integrator embedded in the augmented model. In addition, by modifying the parameters of the cost function, not only can the degree of stability of the closed-loop subsystem be specified, but also numerical problems in the MPC calculation can be improved. The simulation results demonstrate the stability of the designed controller in formation maintenance and its robustness to external disturbances and uncertainties.
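The remark about an integrator embedded in the augmented model can be made concrete with the common velocity-form augmentation used in MPC, where the state is the increment of the plant state stacked with the output. This is a sketch of that general technique under assumed matrices; the abstract does not give the paper's exact error model, and `augment_with_integrator` and the double-integrator plant are hypothetical.

```python
import numpy as np

def augment_with_integrator(A, B, C):
    """Build the augmented model with state [delta_x; y] and input delta_u.

    delta_x(k+1) = A delta_x(k) + B delta_u(k)
    y(k+1)       = y(k) + C A delta_x(k) + C B delta_u(k)
    """
    n = A.shape[0]
    p = C.shape[0]
    A_aug = np.block([[A, np.zeros((n, p))],
                      [C @ A, np.eye(p)]])
    B_aug = np.vstack([B, C @ B])
    C_aug = np.hstack([np.zeros((p, n)), np.eye(p)])
    return A_aug, B_aug, C_aug

# Hypothetical plant: one translational axis modelled as a sampled double integrator.
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.5 * dt**2], [dt]])
C = np.array([[1.0, 0.0]])  # position is the measured output

A_aug, B_aug, C_aug = augment_with_integrator(A, B, C)
print(A_aug)
```

The identity block in the lower-right corner of `A_aug` is the embedded integrator: the output accumulates increments, so a controller that drives the increments to zero removes steady-state offset caused by constant disturbances.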
6
Pi CH, Dai YW, Hu KC, Cheng S. General Purpose Low-Level Reinforcement Learning Control for Multi-Axis Rotor Aerial Vehicles. SENSORS 2021; 21:s21134560. [PMID: 34283119 PMCID: PMC8271845 DOI: 10.3390/s21134560] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/24/2021] [Revised: 06/23/2021] [Accepted: 06/29/2021] [Indexed: 11/16/2022]
Abstract
This paper proposes a multipurpose, reinforcement-learning-based low-level control structure for multirotor unmanned aerial vehicles, constructed using neural networks with model-free training. Other low-level reinforcement learning controllers developed in studies have only been applicable to a model-specific and physical-parameter-specific multirotor, and time-consuming training is required when switching to a different vehicle. We use a 6-degree-of-freedom dynamic model combined with acceleration-based control from the policy neural network to overcome these problems. The UAV automatically learns the maneuver through an end-to-end neural network that maps fusion states to acceleration commands. The state estimation is performed using the data from on-board sensors and motion capture. The motion capture system provides spatial position information, and a multisensory fusion framework fuses the measurements from the onboard inertial measurement units to compensate for the time delay and low update frequency of the capture system. Without requiring expert demonstration, the trained control policy implemented using an improved algorithm can be applied to various multirotors with the output directly mapped to actuators. The algorithm's ability to control multirotors in the hovering and the tracking task is evaluated. Through simulation and actual experiments, we demonstrate the flight control with a quadrotor and hexrotor by using the trained policy. With the same policy, we verify that we can stabilize the quadrotor and hexrotor in the air under random initial states.
Affiliation(s)
- Chen-Huan Pi
- Department of Mechanical Engineering, National Yang Ming Chiao Tung University, Hsinchu City 30010, Taiwan
- Yi-Wei Dai
- Department of Mechanical Engineering, National Yang Ming Chiao Tung University, Hsinchu City 30010, Taiwan
- Kai-Chun Hu
- Department of Applied Mathematics, National Yang Ming Chiao Tung University, Hsinchu City 30010, Taiwan
- Stone Cheng
- Department of Mechanical Engineering, National Yang Ming Chiao Tung University, Hsinchu City 30010, Taiwan
7
Brito B, Everett M, How JP, Alonso-Mora J. Where to go Next: Learning a Subgoal Recommendation Policy for Navigation in Dynamic Environments. IEEE Robot Autom Lett 2021. [DOI: 10.1109/lra.2021.3068662] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/10/2022]
8
A Review on Comparative Remarks, Performance Evaluation and Improvement Strategies of Quadrotor Controllers. TECHNOLOGIES 2021. [DOI: 10.3390/technologies9020037] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/16/2022]
Abstract
The quadrotor is an ideal platform for testing control strategies because of its non-linearity and under-actuated configuration, allowing researchers to evaluate and verify control strategies. Several control strategies are used, including Proportional-Integral-Derivative (PID), Linear Quadratic Regulator (LQR), Backstepping, Feedback Linearization Control (FLC), Sliding Mode Control (SMC), Model Predictive Control (MPC), Neural Network, H-infinity, Fuzzy Logic, and Adaptive Control. However, due to several drawbacks, such as high computation, a large amount of training data, approximation error, and the existence of uncertainty, the commercialization of those control technologies in various industrial applications is currently limited. This paper conducts a thorough analysis of the current literature on the effects of multiple controllers on quadrotors, focusing on two separate approaches: (i) controller hybridization and (ii) controller development. In addition, the limitations of the previous works are discussed, challenges and opportunities to work in this field are assessed, and potential research directions are suggested.
9
Robust Quadrotor Control through Reinforcement Learning with Disturbance Compensation. APPLIED SCIENCES-BASEL 2021. [DOI: 10.3390/app11073257] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 11/16/2022]
Abstract
In this paper, a novel control strategy is presented for reinforcement learning with disturbance compensation to solve the problem of quadrotor positioning under external disturbance. The proposed control scheme applies a trained neural-network-based reinforcement learning agent to control the quadrotor, and its output is directly mapped to four actuators in an end-to-end manner. The proposed control scheme constructs a disturbance observer to estimate the external forces exerted on the three axes of the quadrotor, such as wind gusts in an outdoor environment. By introducing an interference compensator into the neural network control agent, the tracking accuracy and robustness were significantly increased in indoor and outdoor experiments. The experimental results indicate that the proposed control strategy is highly robust to external disturbances. In the experiments, compensation improved control accuracy and reduced positioning error by 75%. To the best of our knowledge, this study is the first to achieve quadrotor positioning control through low-level reinforcement learning by using a global positioning system in an outdoor environment.
10
Xu Q, Wang Z, Zhen Z. Information fusion estimation-based path following control of quadrotor UAVs subjected to Gaussian random disturbance. ISA TRANSACTIONS 2020; 99:84-94. [PMID: 31629487 DOI: 10.1016/j.isatra.2019.10.003] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 08/27/2018] [Revised: 09/23/2019] [Accepted: 10/04/2019] [Indexed: 06/10/2023]
Abstract
Random disturbance has a detrimental effect on the reliability and safety of quadrotor unmanned aerial vehicles (UAVs). This paper proposes an anti-Gaussian random disturbance control method for the path following of a quadrotor UAV. The quadrotor system is linearized and divided into two subsystems, i.e., a translational subsystem and a rotational subsystem, and a hierarchical strategy is used to design the overall control architecture. In order to suppress the negative effects of Gaussian random disturbances and simplify the design process of the linear-quadratic optimal output tracking control problem, a new robust control scheme based on information fusion estimation, named Gaussian information fusion control (GIFC), is proposed. The convergence of the output tracking errors of the GIFC system is proved via Lyapunov theory. The proposed GIFC scheme is employed for position and attitude controller designs to enhance the robustness of the quadrotor system to Gaussian random perturbations. Finally, numerical simulation experiments illustrate the effectiveness and robustness of the proposed control strategy.
Affiliation(s)
- Qingzheng Xu
- College of Automation Engineering, Nanjing University of Aeronautics and Astronautics, No.29, Jiangjun Ave, Nanjing 211106, China
- Zhisheng Wang
- College of Automation Engineering, Nanjing University of Aeronautics and Astronautics, No.29, Jiangjun Ave, Nanjing 211106, China
- Ziyang Zhen
- College of Automation Engineering, Nanjing University of Aeronautics and Astronautics, No.29, Jiangjun Ave, Nanjing 211106, China
11
Vigneshwaran B, Willjuice Iruthayarajan M, Maheswari RV. Partial discharge pattern analysis using multi-class support vector machine to estimate cavity size and position in solid insulation. Soft comput 2019. [DOI: 10.1007/s00500-019-04570-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 10/25/2022]