1
Alahi MEE, Sukkuea A, Tina FW, Nag A, Kurdthongmee W, Suwannarat K, Mukhopadhyay SC. Integration of IoT-Enabled Technologies and Artificial Intelligence (AI) for Smart City Scenario: Recent Advancements and Future Trends. SENSORS (BASEL, SWITZERLAND) 2023; 23:s23115206. [PMID: 37299934 DOI: 10.3390/s23115206] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2023] [Revised: 05/24/2023] [Accepted: 05/29/2023] [Indexed: 06/12/2023]
Abstract
As the global population grows, and urbanization becomes more prevalent, cities often struggle to provide convenient, secure, and sustainable lifestyles due to the lack of necessary smart technologies. Fortunately, the Internet of Things (IoT) has emerged as a solution to this challenge by connecting physical objects using electronics, sensors, software, and communication networks. This has transformed smart city infrastructures, introducing various technologies that enhance sustainability, productivity, and comfort for urban dwellers. By leveraging Artificial Intelligence (AI) to analyze the vast amount of IoT data available, new opportunities are emerging to design and manage futuristic smart cities. In this review article, we provide an overview of smart cities, defining their characteristics and exploring the architecture of IoT. A detailed analysis of various wireless communication technologies employed in smart city applications is presented, with extensive research conducted to determine the most appropriate communication technologies for specific use cases. The article also sheds light on different AI algorithms and their suitability for smart city applications. Furthermore, the integration of IoT and AI in smart city scenarios is discussed, emphasizing the potential contributions of 5G networks coupled with AI in advancing modern urban environments. This article contributes to the existing literature by highlighting the tremendous opportunities presented by integrating IoT and AI, paving the way for the development of smart cities that significantly enhance the quality of life for urban dwellers while promoting sustainability and productivity. By exploring the potential of IoT, AI, and their integration, this review article provides valuable insights into the future of smart cities, demonstrating how these technologies can positively impact urban environments and the well-being of their inhabitants.
Affiliation(s)
- Md Eshrat E Alahi
  - School of Engineering and Technology, Walailak University, 222 Thaiburi, Thasala, Nakhon Si Thammarat 80160, Thailand
- Arsanchai Sukkuea
  - School of Engineering and Technology, Walailak University, 222 Thaiburi, Thasala, Nakhon Si Thammarat 80160, Thailand
- Fahmida Wazed Tina
  - Creative Innovation in Science and Technology Program, Faculty of Science and Technology, Nakhon Si Thammarat Rajabhat University, Nakhon Si Thammarat 80280, Thailand
- Anindya Nag
  - Faculty of Electrical and Computer Engineering, Technische Universität Dresden, 01062 Dresden, Germany
  - Centre for Tactile Internet with Human-in-the-Loop (CeTI), Technische Universität Dresden, 01069 Dresden, Germany
- Wattanapong Kurdthongmee
  - School of Engineering and Technology, Walailak University, 222 Thaiburi, Thasala, Nakhon Si Thammarat 80160, Thailand
- Korakot Suwannarat
  - School of Engineering and Technology, Walailak University, 222 Thaiburi, Thasala, Nakhon Si Thammarat 80160, Thailand

2
Hasanzadeh A, Hamblin MR, Kiani J, Noori H, Hardie JM, Karimi M, Shafiee H. Could artificial intelligence revolutionize the development of nanovectors for gene therapy and mRNA vaccines? NANO TODAY 2022; 47:101665. [PMID: 37034382 PMCID: PMC10081506 DOI: 10.1016/j.nantod.2022.101665] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 06/19/2023]
Abstract
Gene therapy enables the introduction of nucleic acids like DNA and RNA into host cells, and is expected to revolutionize the treatment of a wide range of diseases. This growth has been further accelerated by the discovery of CRISPR/Cas technology, which allows accurate genomic editing in a broad range of cells and organisms in vitro and in vivo. Despite many advances in gene delivery and the development of various viral and non-viral gene delivery vectors, the lack of highly efficient non-viral systems with low cellular toxicity remains a challenge. The application of cutting-edge technologies such as artificial intelligence (AI) has great potential to find new paradigms to solve this issue. Herein, we review AI and its major subfields including machine learning (ML), neural networks (NNs), expert systems, deep learning (DL), computer vision and robotics. We discuss the potential of AI-based models and algorithms in the design of targeted gene delivery vehicles capable of crossing extracellular and intracellular barriers by viral mimicry strategies. We finally discuss the role of AI in improving the function of CRISPR/Cas systems, developing novel nanobots, and mRNA vaccine carriers.
Affiliation(s)
- Akbar Hasanzadeh
  - Cellular and Molecular Research Center, Iran University of Medical Sciences, Tehran 1449614535, Iran
  - Department of Medical Nanotechnology, Faculty of Advanced Technologies in Medicine, Iran University of Medical Sciences, Tehran 1449614535, Iran
- Michael R Hamblin
  - Laser Research Centre, Faculty of Health Science, University of Johannesburg, Doornfontein 2028, South Africa
  - Radiation Biology Research Center, Iran University of Medical Sciences, Tehran, Iran
- Jafar Kiani
  - Oncopathology Research Center, Iran University of Medical Sciences, Tehran 1449614535, Iran
  - Department of Molecular Medicine, Faculty of Advanced Technologies in Medicine, Iran University of Medical Sciences, Tehran, Iran
- Hamid Noori
  - Cellular and Molecular Research Center, Iran University of Medical Sciences, Tehran 1449614535, Iran
  - Department of Medical Nanotechnology, Faculty of Advanced Technologies in Medicine, Iran University of Medical Sciences, Tehran 1449614535, Iran
- Joseph M. Hardie
  - Division of Engineering in Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, 02139 USA
- Mahdi Karimi
  - Cellular and Molecular Research Center, Iran University of Medical Sciences, Tehran 1449614535, Iran
  - Department of Medical Nanotechnology, Faculty of Advanced Technologies in Medicine, Iran University of Medical Sciences, Tehran 1449614535, Iran
  - Oncopathology Research Center, Iran University of Medical Sciences, Tehran 1449614535, Iran
  - Research Center for Science and Technology in Medicine, Tehran University of Medical Sciences, Tehran 141556559, Iran
  - Applied Biotechnology Research Centre, Tehran Medical Science, Islamic Azad University, Tehran 1584743311, Iran
- Hadi Shafiee
  - Division of Engineering in Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, 02139 USA

3
Gao X, Chao F, Zhou C, Ge Z, Yang L, Chang X, Shang C, Shen Q. Error controlled actor-critic. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2022.08.079] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022]

4
Xiong K, Zhou P, Wei C. Autonomous Navigation of Unmanned Aircraft Using Space Target LOS Measurements and QLEKF. SENSORS (BASEL, SWITZERLAND) 2022; 22:s22186992. [PMID: 36146339 PMCID: PMC9503636 DOI: 10.3390/s22186992] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/04/2022] [Revised: 09/10/2022] [Accepted: 09/13/2022] [Indexed: 06/01/2023]
Abstract
An autonomous navigation method based on the fusion of INS (inertial navigation system) measurements with the line-of-sight (LOS) observations of space targets is presented for unmanned aircraft. INS/GNSS (global navigation satellite system) integration is the conventional approach to achieving the long-term and high-precision navigation of unmanned aircraft. However, the performance of INS/GNSS integrated navigation may be degraded gradually in a GNSS-denied environment. INS/CNS (celestial navigation system) integrated navigation has been developed as a supplement to the GNSS. A limitation of traditional INS/CNS integrated navigation is that the CNS is not efficient in suppressing the position error of the INS. To solve the abovementioned problems, we studied a novel integrated navigation method, where the position, velocity and attitude errors of the INS were corrected using a star camera mounted on the aircraft in order to observe the space targets whose absolute positions were available. Additionally, a QLEKF (Q-learning extended Kalman filter) is designed for the performance enhancement of the integrated navigation system. The effectiveness of the presented autonomous navigation method based on the star camera and the IMU (inertial measurement unit) is demonstrated via CRLB (Cramer-Rao lower bounds) analysis and numerical simulations.
Affiliation(s)
- Kai Xiong
  - Correspondence: Tel.: +86-10-68744843

5
Nievas N, Pagès-Bernaus A, Bonada F, Echeverria L, Abio A, Lange D, Pujante J. A Reinforcement Learning Control in Hot Stamping for Cycle Time Optimization. MATERIALS 2022; 15:ma15144825. [PMID: 35888292 PMCID: PMC9322736 DOI: 10.3390/ma15144825] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2022] [Revised: 06/23/2022] [Accepted: 07/10/2022] [Indexed: 11/16/2022]
Abstract
Hot stamping is a hot metal forming technology increasingly in demand that produces ultra-high strength parts with complex shapes. A major concern in these systems is how to shorten production times to improve production Key Performance Indicators. In this work, we present a Reinforcement Learning approach that can obtain an optimal behavior strategy for dynamically managing the cycle time in hot stamping to optimize manufacturing production while maintaining the quality of the final product. Results are compared with the business-as-usual cycle time control approach and the optimal solution obtained by the execution of a dynamic programming algorithm. Reinforcement Learning control outperforms the business-as-usual behavior by reducing the cycle time and the total batch time in non-stable temperature phases.
Affiliation(s)
- Nuria Nievas
  - Eurecat, Technology Centre of Catalonia, 08005 Barcelona, Spain
  - Business Administration Department, Universitat de Lleida, 25001 Lleida, Spain
  - Correspondence:
- Adela Pagès-Bernaus
  - Business Administration Department, Universitat de Lleida, 25001 Lleida, Spain
- Francesc Bonada
  - Eurecat, Technology Centre of Catalonia, 08005 Barcelona, Spain
- Lluís Echeverria
  - Eurecat, Technology Centre of Catalonia, 08005 Barcelona, Spain
- Albert Abio
  - Eurecat, Technology Centre of Catalonia, 08005 Barcelona, Spain
- Danillo Lange
  - Eurecat, Technology Centre of Catalonia, 08005 Barcelona, Spain
- Jaume Pujante
  - Eurecat, Technology Centre of Catalonia, 08005 Barcelona, Spain

6
Fathi A, Masoudi SF. Combining CNN and Q-learning for increasing the accuracy of lost gamma source finding. Sci Rep 2022; 12:2644. [PMID: 35173217 PMCID: PMC8850423 DOI: 10.1038/s41598-022-06326-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2021] [Accepted: 01/24/2022] [Indexed: 11/30/2022] Open
Abstract
The increasing use of nuclear technology in various fields makes it necessary to provide the required safety to work with this industry. Gamma source is one of the most widely used sources in industry and medicine. Finding a lost gamma source in a gamma irradiation room without human presence is challenging due to the particular arrangements and barriers in the room for radiation shielding and requires an efficient and robust method. In this paper, locating and routing the lost gamma source in the gamma irradiation room containing radiation blocking barriers are done simultaneously by using two methods, convolutional neural network (CNN) and Q-learning, which are powerful algorithms for deep learning and machine learning. Environment simulation with gamma source was performed using Geant4 simulation. The results show that by combining these two methods in geometries with radiation blocking barriers, in addition to locating with 90% accuracy, routing can also be performed. Although the presence of thick barriers in the room reduces the accuracy, increases the time required to find the lost gamma source, or makes other methods inefficient, the results show that the combination of CNN and Q-learning reduces the time and greatly increases the accuracy.
Affiliation(s)
- Atefeh Fathi
  - Department of Physics, K.N. Toosi University of Technology, P.O. Box 15875-4416, Tehran, 15418-49611, Iran
- S Farhad Masoudi
  - Department of Physics, K.N. Toosi University of Technology, P.O. Box 15875-4416, Tehran, 15418-49611, Iran

7
Ai M, Xie Y, Tang Z, Zhang J, Gui W. Deep learning feature-based setpoint generation and optimal control for flotation processes. Inf Sci (N Y) 2021. [DOI: 10.1016/j.ins.2021.07.060] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 10/20/2022]

8
Alharbi A, Alyami H, M P, Rauf HT, Kadry S. Intelligent scaling for 6G IoE services for resource provisioning. PeerJ Comput Sci 2021; 7:e755. [PMID: 34805508 PMCID: PMC8576555 DOI: 10.7717/peerj-cs.755] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/04/2021] [Accepted: 09/30/2021] [Indexed: 06/13/2023]
Abstract
The proposed research motivates the 6G cellular networking for the Internet of Everything (IoE) usage empowerment that is currently not compatible with 5G. For 6G, more innovative technological resources are required to be handled by Mobile Edge Computing (MEC). The demand for change in service from different sectors, the increase in IoE, the limitation of available computing resources of MEC, and intelligent resource solutions are getting much more significant. This research used IScaler, an effective model for intelligent service placement solutions and resource scaling. IScaler is considered to be made for MEC in Deep Reinforcement Learning (DRL). The paper has considered several requirements for making service placement decisions. The research also highlights several challenges geared by architectonics that submerge an Intelligent Scaling and Placement module.
Affiliation(s)
- Abdullah Alharbi
  - Department of Information Technology, College of Computers and Information Technology, Taif University, Taif, Saudi Arabia
- Hashem Alyami
  - Department of Computer Science, College of Computers and Information Technology, Taif University, Taif, Saudi Arabia
- Poongodi M
  - College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
- Hafiz Tayyab Rauf
  - Department of Computer Science, Faculty of Engineering & Informatics, University of Bradford, Bradford, United Kingdom
- Seifedine Kadry
  - Faculty of Applied Computing and Technology, Noroff University College, Kristiansand, Norway

9
A reinforcement learning approach to distribution-free capacity allocation for sea cargo revenue management. Inf Sci (N Y) 2021. [DOI: 10.1016/j.ins.2021.04.092] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/21/2022]

10
Sami H, Otrok H, Bentahar J, Mourad A. AI-Based Resource Provisioning of IoE Services in 6G: A Deep Reinforcement Learning Approach. IEEE TRANSACTIONS ON NETWORK AND SERVICE MANAGEMENT 2021. [DOI: 10.1109/tnsm.2021.3066625] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Indexed: 11/08/2022]

11
Balkenius C, Tjøstheim TA, Johansson B, Wallin A, Gärdenfors P. The Missing Link Between Memory and Reinforcement Learning. Front Psychol 2020; 11:560080. [PMID: 33362625 PMCID: PMC7758424 DOI: 10.3389/fpsyg.2020.560080] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/07/2020] [Accepted: 11/16/2020] [Indexed: 11/16/2022] Open
Abstract
Reinforcement learning systems usually assume that a value function is defined over all states (or state-action pairs) that can immediately give the value of a particular state or action. These values are used by a selection mechanism to decide which action to take. In contrast, when humans and animals make decisions, they collect evidence for different alternatives over time and take action only when sufficient evidence has been accumulated. We have previously developed a model of memory processing that includes semantic, episodic and working memory in a comprehensive architecture. Here, we describe how this memory mechanism can support decision making when the alternatives cannot be evaluated based on immediate sensory information alone. Instead we first imagine, and then evaluate a possible future that will result from choosing one of the alternatives. Here we present an extended model that can be used as a model for decision making that depends on accumulating evidence over time, whether that information comes from the sequential attention to different sensory properties or from internal simulation of the consequences of making a particular choice. We show how the new model explains both simple immediate choices, choices that depend on multiple sensory factors and complicated selections between alternatives that require forward looking simulations based on episodic and semantic memory structures. In this framework, vicarious trial and error is explained as an internal simulation that accumulates evidence for a particular choice. We argue that a system like this forms the "missing link" between more traditional ideas of semantic and episodic memory, and the associative nature of reinforcement learning.
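The contrast drawn in this abstract, between reading off a stored value immediately and acting only once enough evidence has accumulated, can be illustrated with a minimal Python sketch. It is not the authors' memory model. The function name `accumulate_until_threshold`, the option labels, the evidence values, and the threshold are all hypothetical, chosen only to show a decision being deferred until one alternative has gathered sufficient evidence.

```python
def accumulate_until_threshold(evidence_stream, threshold):
    """Sum evidence per option over time; commit once one total reaches the threshold.

    evidence_stream: list of dicts mapping option name -> evidence gained at that step
    Returns (chosen_option, steps_used), or (None, steps_used) if no option qualifies.
    """
    totals = {}
    for step, sample in enumerate(evidence_stream, start=1):
        for option, value in sample.items():
            totals[option] = totals.get(option, 0.0) + value
        best = max(totals, key=totals.get)
        if totals[best] >= threshold:
            return best, step
    return None, len(evidence_stream)


# Hypothetical evidence for two alternatives, "A" and "B", over three time steps.
stream = [
    {"A": 0.4, "B": 0.3},
    {"A": 0.1, "B": 0.5},
    {"A": 0.2, "B": 0.4},
]
choice, steps = accumulate_until_threshold(stream, threshold=1.0)
```

In this toy run, "A" leads after the first step, but no option is chosen until a running total crosses the threshold. The evidence samples could equally stand for successive sensory observations or for internally simulated outcomes, which is the distinction the abstract draws.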
Affiliation(s)
- Birger Johansson
  - Lund University Cognitive Science, Lund University, Lund, Sweden
- Annika Wallin
  - Lund University Cognitive Science, Lund University, Lund, Sweden
- Peter Gärdenfors
  - Lund University Cognitive Science, Lund University, Lund, Sweden
  - Palaeo-Research Institute, University of Johannesburg, Johannesburg, South Africa

12
Abstract
Machine learning has been heavily researched and widely used in many disciplines. However, achieving high accuracy requires a large amount of data that is sometimes difficult, expensive, or impractical to obtain. Integrating human knowledge into machine learning can significantly reduce data requirement, increase reliability and robustness of machine learning, and build explainable machine learning systems. This allows leveraging the vast amount of human knowledge and capability of machine learning to achieve functions and performance not available before and will facilitate the interaction between human beings and machine learning systems, making machine learning decisions understandable to humans. This paper gives an overview of the knowledge and its representations that can be integrated into machine learning and the methodology. We cover the fundamentals, current status, and recent progress of the methods, with a focus on popular and new topics. The perspectives on future directions are also discussed.
Affiliation(s)
- Changyu Deng
  - Department of Mechanical Engineering, University of Michigan, Ann Arbor, MI 48109, USA
- Xunbi Ji
  - Department of Mechanical Engineering, University of Michigan, Ann Arbor, MI 48109, USA
- Colton Rainey
  - Department of Mechanical Engineering, University of Michigan, Ann Arbor, MI 48109, USA
- Jianyu Zhang
  - Department of Mechanical Engineering, University of Michigan, Ann Arbor, MI 48109, USA
- Wei Lu
  - Department of Mechanical Engineering, University of Michigan, Ann Arbor, MI 48109, USA
  - Department of Materials Science & Engineering, University of Michigan, Ann Arbor, MI 48109, USA

13

14
Konda R, La HM, Zhang J. Decentralized Function Approximated Q-Learning in Multi-Robot Systems For Predator Avoidance. IEEE Robot Autom Lett 2020. [DOI: 10.1109/lra.2020.3013920] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 11/09/2022]

15
LIN YAHUI, CHIU SHAOWEN, LIN YINGCHE, LIN CHIENCHUNG, PAN LUNGKWANG. INVERSE PROBLEM ALGORITHM APPLICATION TO SEMI-QUANTITATIVE ANALYSIS OF 272 PATIENTS WITH ISCHEMIC STROKE SYMPTOMS: CAROTID STENOSIS RISK ASSESSMENT FOR FIVE RISK FACTORS. J MECH MED BIOL 2020. [DOI: 10.1142/s0219519420400217] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 02/02/2023]
Abstract
This study proposes the inverse problem algorithm (IPA) with five risk factors applied to the semi-quantitative analysis of carotid stenosis in 272 patients with suspected ischemic stroke. The IPA is known to provide a substantiated machine learning-based prediction of the expected outcomes by solving an inverse matrix of variable coefficients. In the case of carotid stenosis prediction, such risk factors as the patient’s age, mean arterial pressure (MAP), glucose AC, low-density lipoprotein-cholesterol (LDL-C), and C-Reactive protein (CRP) were assessed for the main group of 217 patients. Their results were processed by the STATISTICA program with a customized loss function ([Formula: see text]), yielding the first-order nonlinear semi-empirical formula with 16 terms. The loss function was calculated via the total mismatch between the theoretical predictions and true carotid stenosis cases (%) for all 217 patients. Thus, the carotid stenosis (%) compromised solution array [[Formula: see text]] was optimized using [Formula: see text] individual data points via the proposed algorithm. The results showed a complete regression with loss function [Formula: see text]=2.3543, variance [Formula: see text]=87.46%, and correlation coefficient [Formula: see text]. The reference group of 55 more patients with the same preliminary diagnosis and symptoms was selected to validate the method’s predictive feasibility, which was found quite satisfactory. The decreasing order of three dominant risk factors was as follows: CRP, glucose AC, and MAP, whereas age and LDL-C weakly influenced the program computation results. The IPA showed a strong convergence by its default characteristic. The reduction of the number of variables in computation deteriorated the prediction accuracy, exhibiting the algorithm’s high sensitivity to the number of variables.
Affiliation(s)
- YA-HUI LIN
  - College of Nursing, Central Taiwan University of Science and Technology, Takun, Taichung 406, Taiwan, ROC
  - Department of Clinical Pharmacy, Taichung Armed Forces General Hospital, Taichung 406, Taiwan, ROC
  - Graduate Institute of Radiological Science, Central Taiwan University of Science and Technology, Takun, Taichung 406, Taiwan, ROC
- SHAO-WEN CHIU
  - Graduate Institute of Radiological Science, Central Taiwan University of Science and Technology, Takun, Taichung 406, Taiwan, ROC
  - Healthcare Technology Business Division, Healthcare Department, International Integrated Systems, Inc., Taipei 103, Taiwan, ROC
- YING-CHE LIN
  - Neurology Department, Taichung Armed Forces General Hospital, Taichung 406, Taiwan, ROC
  - Department of Neurology, Tri-Service General Hospital, National Defense Medical Center, Taipei 114, Taiwan, ROC
- CHIEN-CHUNG LIN
  - College of Nursing, Central Taiwan University of Science and Technology, Takun, Taichung 406, Taiwan, ROC
  - Orthopedic Department, Taichung Armed Forces General Hospital, Taichung 406, Taiwan, ROC
  - Department of Orthopedic Surgery, Tri-Service General Hospital, National Defense Medical Center, Taipei 114, Taiwan, ROC
- LUNG-KWANG PAN
  - Graduate Institute of Radiological Science, Central Taiwan University of Science and Technology, Takun, Taichung 406, Taiwan, ROC

16
Li J, Yao L, Xu X, Cheng B, Ren J. Deep reinforcement learning for pedestrian collision avoidance and human-machine cooperative driving. Inf Sci (N Y) 2020. [DOI: 10.1016/j.ins.2020.03.105] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Indexed: 11/17/2022]

17
Liu R, Liang J, Alkhambashi M. Research on breakthrough and innovation of UAV mission planning method based on cloud computing-based reinforcement learning algorithm. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS 2019. [DOI: 10.3233/jifs-179130] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/15/2022]
Affiliation(s)
- Rong Liu
  - UAV Research Institute of Nanjing University of Aeronautics and Astronautics, Middle and Small Size UAV Advanced Technique Key Laboratory of Ministry of Industry and Information Technology, Nanjing, Jiangsu, China
- Jin Liang
  - Science and Technology on Aircraft Control Laboratory, FACRI, Xi’an, Shanxi, China
- Majid Alkhambashi
  - Department of Information Technology, Al-Zahra College for Women, Muscat, Oman

18
Yuan Y, Yu ZL, Gu Z, Deng X, Li Y. A novel multi-step reinforcement learning method for solving reward hacking. APPL INTELL 2019. [DOI: 10.1007/s10489-019-01417-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 11/29/2022]

19
Abstract
SUMMARY: A Q-learning approach is often used for navigation in static environments where state space is easy to define. In this paper, a new Q-learning approach is proposed for navigation in dynamic environments by imitating human reasoning. As a model-free method, a Q-learning method does not require the environmental model in advance. The state space and the reward function in the proposed approach are defined according to human perception and evaluation, respectively. Specifically, approximate regions instead of accurate measurements are used to define states. Moreover, due to the limitation of robot dynamics, actions for each state are calculated by introducing a dynamic window that takes robot dynamics into account. The conducted tests show that the obstacle avoidance rate of the proposed approach can reach 90.5% after training, and the robot can always operate below the dynamics limitation.
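As a rough illustration of two ideas named in this abstract, states defined by approximate regions rather than exact measurements and a model-free Q-learning update, here is a minimal Python sketch. It is not the paper's method: the region thresholds, the action set, the reward, and the helper names `region_of` and `q_update` are hypothetical, and the dynamic-window action selection described in the abstract is not modelled.

```python
from collections import defaultdict


def region_of(distance_m):
    """Map an exact range reading to a coarse region label (thresholds are hypothetical)."""
    if distance_m < 0.5:
        return "near"
    if distance_m < 2.0:
        return "medium"
    return "far"


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step:
    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a)).
    """
    best_next = max(q[(next_state, a)] for a in actions)
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return q[(state, action)]


ACTIONS = ("left", "straight", "right")  # hypothetical action set
q_table = defaultdict(float)             # unseen (state, action) pairs start at 0.0

s = region_of(0.3)       # obstacle very close
s_next = region_of(1.2)  # obstacle further away after turning
new_value = q_update(q_table, s, "left", reward=1.0, next_state=s_next, actions=ACTIONS)
```

Because the update uses only the observed transition and reward, no model of the environment is needed, which matches the model-free property the abstract points out.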

20
Xu B. Composite Learning Finite-Time Control With Application to Quadrotors. IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS: SYSTEMS 2018; 48:1806-1815. [DOI: 10.1109/tsmc.2017.2698473] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Indexed: 09/01/2023]

21
Zhu H, Paschalidis IC, Hasselmo ME. Neural circuits for learning context-dependent associations of stimuli. Neural Netw 2018; 107:48-60. [PMID: 30177226 DOI: 10.1016/j.neunet.2018.07.018] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 04/11/2017] [Revised: 07/08/2018] [Accepted: 07/09/2018] [Indexed: 10/28/2022]
Abstract
The use of reinforcement learning combined with neural networks provides a powerful framework for solving certain tasks in engineering and cognitive science. Previous research shows that neural networks have the power to automatically extract features and learn hierarchical decision rules. In this work, we investigate reinforcement learning methods for performing a context-dependent association task using two kinds of neural network models (using continuous firing rate neurons), as well as a neural circuit gating model. The task allows examination of the ability of different models to extract hierarchical decision rules and generalize beyond the examples presented to the models in the training phase. We find that the simple neural circuit gating model, trained using response-based regulation of Hebbian associations, performs almost at the same level as a reinforcement learning algorithm combined with neural networks trained with more sophisticated back-propagation of error methods. A potential explanation is that hierarchical reasoning is the key to performance and the specific learning method is less important.
Affiliation(s)
- Henghui Zhu
  - Division of Systems Engineering, Boston University, 15 Saint Mary's Street, Brookline, MA 02446, United States
- Ioannis Ch Paschalidis
  - Department of Electrical and Computer Engineering, Division of Systems Engineering, and Department of Biomedical Engineering, Boston University, 8 Saint Mary's Street, Boston, MA 02215, United States
- Michael E Hasselmo
  - Center for Systems Neuroscience, Kilachand Center for Integrated Life Sciences and Engineering, Boston University, 610 Commonwealth Ave., Boston, MA 02215, United States

22
Zhou Y, Duval B, Hao JK. Improving probability learning based local search for graph coloring. Appl Soft Comput 2018. [DOI: 10.1016/j.asoc.2018.01.027] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Indexed: 11/16/2022]

23

24
Shi H, Lin Z, Zhang S, Li X, Hwang KS. An adaptive decision-making method with fuzzy Bayesian reinforcement learning for robot soccer. Inf Sci (N Y) 2018. [DOI: 10.1016/j.ins.2018.01.032] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Indexed: 10/18/2022]

25
Wang GF, Fang Z, Li P. Shaping in reinforcement learning by knowledge transferred from human-demonstrations of a simple similar task. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS 2018. [DOI: 10.3233/jifs-17052] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/15/2022]
Affiliation(s)
- Guo-Fang Wang
  - School of Aeronautics and Astronautics, Zhejiang University, Hangzhou, China
- Zhou Fang
  - School of Aeronautics and Astronautics, Zhejiang University, Hangzhou, China
- Ping Li
  - School of Control Science and Engineering, Zhejiang University, Hangzhou, China
|
26
|
Yin B, Dridi M, Moudni AE. Recursive least-squares temporal difference learning for adaptive traffic signal control at intersection. Neural Comput Appl 2017. [DOI: 10.1007/s00521-017-3066-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 10/19/2022]
|
27
|
Wang W, Chen X, He J. Adaptive Critic Design with Local Gaussian Process Models. Journal of Advanced Computational Intelligence and Intelligent Informatics 2016. [DOI: 10.20965/jaciii.2016.p1135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
In this paper, local Gaussian process (GP) approximation is introduced to build the critic network of adaptive dynamic programming (ADP). The sample data are partitioned into local regions, and for each region, an individual GP model is utilized. The nearest local model is used to predict a given state-action point. With the two-phase value iteration method for a Gaussian-kernel (GK)-based critic network which realizes the update of the hyper-parameters and value functions simultaneously, fast value function approximation can be achieved. Combining this critic network with an actor network, we present a local GK-based ADP approach. Simulations were carried out to demonstrate the feasibility of the proposed approach.
|
28
|
Lian C, Xu X, Chen H, He H. Near-Optimal Tracking Control of Mobile Robots Via Receding-Horizon Dual Heuristic Programming. IEEE Transactions on Cybernetics 2016; 46:2484-2496. [PMID: 26642462 DOI: 10.1109/tcyb.2015.2478857] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Indexed: 06/05/2023]
Abstract
Trajectory tracking control of wheeled mobile robots (WMRs) has been an important research topic in control theory and robotics. Although various tracking control methods with stability have been developed for WMRs, it is still difficult to design an optimal or near-optimal tracking controller under uncertainties and disturbances. In this paper, a near-optimal tracking control method is presented for WMRs based on receding-horizon dual heuristic programming (RHDHP). In the proposed method, a backstepping kinematic controller is designed to generate desired velocity profiles, and the receding-horizon strategy is used to decompose the infinite-horizon optimal control problem into a series of finite-horizon optimal control problems. In each horizon, a closed-loop tracking control policy is successively updated using a class of approximate dynamic programming algorithms called finite-horizon dual heuristic programming (DHP). The convergence property of the proposed method is analyzed, and the Lyapunov approach is used to show that the tracking control system based on RHDHP is asymptotically stable. Simulation results on three tracking control problems demonstrate that the proposed method improves control performance compared with conventional model predictive control (MPC) and DHP. It is also illustrated that the proposed method has a lower computational burden than conventional MPC, which is very beneficial for real-time tracking control.
|
29
|
Wang D, Li C, Liu D, Mu C. Data-based robust optimal control of continuous-time affine nonlinear systems with matched uncertainties. Inf Sci (N Y) 2016. [DOI: 10.1016/j.ins.2016.05.034] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.6] [Indexed: 11/29/2022]
|
30
|
Application of a Gradient Descent Continuous Actor-Critic Algorithm for Double-Side Day-Ahead Electricity Market Modeling. Energies 2016. [DOI: 10.3390/en9090725] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Indexed: 11/16/2022]
|
31
|
Wei Q, Liu D, Lewis FL. Optimal distributed synchronization control for continuous-time heterogeneous multi-agent differential graphical games. Inf Sci (N Y) 2015. [DOI: 10.1016/j.ins.2015.04.044] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.3] [Indexed: 10/23/2022]
|
32
|
Fernandez-Gauna B, Graña M, Lopez-Guede JM, Etxeberria-Agiriano I, Ansoategui I. Reinforcement Learning endowed with safe veto policies to learn the control of Linked-Multicomponent Robotic Systems. Inf Sci (N Y) 2015. [DOI: 10.1016/j.ins.2015.04.005] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Indexed: 11/28/2022]
|
33
|
Embedded Adaptive Fuzzy Controller Based on Reinforcement Learning for DC Motor with Flexible Shaft. Arabian Journal for Science and Engineering 2015. [DOI: 10.1007/s13369-015-1752-4] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Indexed: 11/27/2022]
|
34
|
|
35
|
Bahrpeyma F, Zakerolhoseini A, Haghighi H. Using IDS fitted Q to develop a real-time adaptive controller for dynamic resource provisioning in Cloud's virtualized environment. Appl Soft Comput 2015. [DOI: 10.1016/j.asoc.2014.10.005] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Indexed: 10/24/2022]
|
36
|
Huang Z, Xu X, Zuo L. Reinforcement learning with automatic basis construction based on isometric feature mapping. Inf Sci (N Y) 2014. [DOI: 10.1016/j.ins.2014.07.008] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Indexed: 10/25/2022]
|
37
|
|
38
|
Neural-network-based robust optimal control design for a class of uncertain nonlinear systems via adaptive dynamic programming. Inf Sci (N Y) 2014. [DOI: 10.1016/j.ins.2014.05.050] [Citation(s) in RCA: 111] [Impact Index Per Article: 11.1] [Indexed: 11/16/2022]
|
39
|
Zhu F, Liu Q, Wang H, Zhou X, Fu Y. Unregistered biological words recognition by Q-learning with transfer learning. ScientificWorldJournal 2014; 2014:173290. [PMID: 24701139 PMCID: PMC3950481 DOI: 10.1155/2014/173290] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2013] [Accepted: 01/08/2014] [Indexed: 11/23/2022] Open
Abstract
Unregistered biological word recognition is the process of identifying terms that are out of vocabulary. Although many approaches have been developed, their performance is not satisfactory. As the identification process can be viewed as a Markov process, we put forward a Q-learning with transfer learning algorithm to detect unregistered biological words in texts. With Q-learning, the recognizer can attain the optimal identification solution while interacting with the texts and contexts. During processing, a transfer learning approach is used to take full advantage of the knowledge gained in a source task to speed up learning in a different but related target task. A mapping that relates features from the source task to the target task, which many transfer learning methods require, is carried out automatically under the reinforcement learning framework. We examined the performance of three approaches with the GENIA corpus and JNLPBA04 data. The proposed approach improved performance in both experiments. The precision, recall, and F-score of our approach surpassed those of a conventional unregistered word recognizer as well as those of the Q-learning approach without transfer learning.
Affiliation(s)
- Fei Zhu: School of Computer Science and Technology, Soochow University, Suzhou 215006, China; Center for Systems Biology, Soochow University, Suzhou 215006, China
- Quan Liu: School of Computer Science and Technology, Soochow University, Suzhou 215006, China
- Hui Wang: School of Computer Science and Technology, Soochow University, Suzhou 215006, China
- Xiaoke Zhou: School of Computer Science and Technology, Soochow University, Suzhou 215006, China
- Yuchen Fu: School of Computer Science and Technology, Soochow University, Suzhou 215006, China
|