1
Al-Hamadani MNA, Fadhel MA, Alzubaidi L, Balazs H. Reinforcement Learning Algorithms and Applications in Healthcare and Robotics: A Comprehensive and Systematic Review. Sensors (Basel, Switzerland) 2024; 24:2461. [PMID: 38676080 PMCID: PMC11053800 DOI: 10.3390/s24082461] [Received: 03/01/2024] [Revised: 04/04/2024] [Accepted: 04/08/2024] [Indexed: 04/28/2024]
Abstract
Reinforcement learning (RL) has emerged as a dynamic and transformative paradigm in artificial intelligence, offering the promise of intelligent decision-making in complex and dynamic environments. This unique feature enables RL to address sequential decision-making problems with simultaneous sampling, evaluation, and feedback. As a result, RL techniques have become suitable candidates for developing powerful solutions in various domains. In this study, we present a comprehensive and systematic review of RL algorithms and applications. This review commences with an exploration of the foundations of RL and proceeds to examine each algorithm in detail, concluding with a comparative analysis of RL algorithms based on several criteria. This review then extends to two key applications of RL: robotics and healthcare. In robotics manipulation, RL enhances precision and adaptability in tasks such as object grasping and autonomous learning. In healthcare, this review turns its focus to the realm of cell growth problems, clarifying how RL has provided a data-driven approach for optimizing the growth of cell cultures and the development of therapeutic solutions. This review offers a comprehensive overview, shedding light on the evolving landscape of RL and its potential in two diverse yet interconnected fields.
Affiliation(s)
- Mokhaled N. A. Al-Hamadani
  - Department of Data Science and Visualization, Faculty of Informatics, University of Debrecen, H-4032 Debrecen, Hungary
  - Doctoral School of Informatics, University of Debrecen, H-4032 Debrecen, Hungary
  - Department of Electronic Techniques, Technical Institute/Alhawija, Northern Technical University, 36001 Kirkuk, Iraq
- Mohammed A. Fadhel
  - Research and Development Department, Akunah Company, Brisbane, QLD 4120, Australia
- Laith Alzubaidi
  - Research and Development Department, Akunah Company, Brisbane, QLD 4120, Australia
  - School of Mechanical, Medical, and Process Engineering, Queensland University of Technology, Brisbane, QLD 4000, Australia
  - Centre for Data Science, Queensland University of Technology, Brisbane, QLD 4000, Australia
- Harangi Balazs
  - Department of Data Science and Visualization, Faculty of Informatics, University of Debrecen, H-4032 Debrecen, Hungary
2
Zheng Y, Lin Y, Zhao L, Wu T, Jin D, Li Y. Spatial planning of urban communities via deep reinforcement learning. Nature Computational Science 2023; 3:748-762. [PMID: 38177774 DOI: 10.1038/s43588-023-00503-5] [Received: 03/15/2023] [Accepted: 07/27/2023] [Indexed: 01/06/2024]
Abstract
Effective spatial planning of urban communities plays a critical role in the sustainable development of cities. Despite the convenience brought by geographic information systems and computer-aided design, determining the layout of land use and roads still heavily relies on human experts. Here we propose an artificial intelligence urban-planning model to generate spatial plans for urban communities. To overcome the difficulty of diverse and irregular urban geography, we construct a graph to describe the topology of cities in arbitrary forms and formulate urban planning as a sequential decision-making problem on the graph. To tackle the challenge of the vast solution space, we develop a reinforcement learning model based on graph neural networks. Experiments on both synthetic and real-world communities demonstrate that our computational model outperforms plans designed by human experts in objective metrics and that it can generate spatial plans responding to different circumstances and needs. We also propose a human-artificial intelligence collaborative workflow of urban planning, in which human designers can substantially benefit from our model to be more productive, generating more efficient spatial plans with much less time. Our method demonstrates the great potential of computational urban planning and paves the way for more explorations in leveraging computational methodologies to solve challenging real-world problems in urban science.
Affiliation(s)
- Yu Zheng
  - Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University, Beijing, P. R. China
  - Department of Electronic Engineering, Tsinghua University, Beijing, P. R. China
- Yuming Lin
  - Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University, Beijing, P. R. China
  - Department of Electronic Engineering, Tsinghua University, Beijing, P. R. China
- Liang Zhao
  - Department of Urban Planning and Design, Tsinghua University, Beijing, P. R. China
- Tinghai Wu
  - Department of Urban Planning and Design, Tsinghua University, Beijing, P. R. China
- Depeng Jin
  - Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University, Beijing, P. R. China
  - Department of Electronic Engineering, Tsinghua University, Beijing, P. R. China
- Yong Li
  - Beijing National Research Center for Information Science and Technology (BNRist), Tsinghua University, Beijing, P. R. China
  - Department of Electronic Engineering, Tsinghua University, Beijing, P. R. China
3
Affiliation(s)
- Xinyang Li
  - Department of Automation, Tsinghua University, Beijing, China
- Yuanlong Zhang
  - Department of Automation, Tsinghua University, Beijing, China
- Jiamin Wu
  - Department of Automation, Tsinghua University, Beijing, China
- Qionghai Dai
  - Department of Automation, Tsinghua University, Beijing, China
4
Xu Y, Cheng Y, Chen AT, Bao Z. A compound PCP scheme underlies sequential rosettes-based cell intercalation. Development 2023; 150:dev201493. [PMID: 36975724 PMCID: PMC10263146 DOI: 10.1242/dev.201493] [Received: 11/29/2022] [Accepted: 03/20/2023] [Indexed: 03/29/2023]
Abstract
The formation of sequential rosettes is a type of collective cell behavior recently discovered in the Caenorhabditis elegans embryo that mediates directional cell migration through sequential formation and resolution of multicellular rosettes involving the migrating cell and its neighboring cells along the way. Here, we show that a planar cell polarity (PCP)-based polarity scheme regulates sequential rosettes, which is distinct from the known mode of PCP regulation in multicellular rosettes during the process of convergent extension. Specifically, non-muscle myosin (NMY) localization and edge contraction are perpendicular to that of Van Gogh as opposed to colocalizing with Van Gogh. Further analyses suggest a two-component polarity scheme: one being the canonical PCP pathway with MIG-1/Frizzled and VANG-1/Van Gogh localized to the vertical edges, the other being MIG-1/Frizzled and NMY-2 localized to the midline/contracting edges. The NMY-2 localization and contraction of the midline edges also required LAT-1/Latrophilin, an adhesion G protein-coupled receptor that has not been shown to regulate multicellular rosettes. Our results establish a distinct mode of PCP-mediated cell intercalation and shed light on the versatile nature of the PCP pathway.
Affiliation(s)
- Yichi Xu
  - Developmental Biology Program, Sloan Kettering Institute, New York, NY 10065, USA
- Yunsheng Cheng
  - Developmental Biology Program, Sloan Kettering Institute, New York, NY 10065, USA
- Allison T. Chen
  - Developmental Biology Program, Sloan Kettering Institute, New York, NY 10065, USA
- Zhirong Bao
  - Developmental Biology Program, Sloan Kettering Institute, New York, NY 10065, USA
5
Hierarchical DDPG for Manipulator Motion Planning in Dynamic Environments. AI 2022. [DOI: 10.3390/ai3030037] [Indexed: 11/17/2022]
Abstract
In this paper, a hierarchical reinforcement learning (HRL) architecture, namely a “Hierarchical Deep Deterministic Policy Gradient (HDDPG)”, is proposed and studied. An HDDPG uses a manager-and-worker formation similar to other HRL structures. However, unlike others, the HDDPG lets workers and managers share an identical environment and state, while each Deep Deterministic Policy Gradient (DDPG) agent requires its own reward system. The HDDPG therefore allows easy structural expansion, with the manager selecting a worker probabilistically. Owing to this structural advantage, the HDDPG is well suited to building a general AI that deals with complex time-horizon tasks involving various conflicting sub-goals. The experimental results demonstrated its usefulness on a manipulator motion planning problem in a dynamic environment, where path planning and collision avoidance conflict with each other. The proposed HDDPG is compared with an HAM and a single DDPG for performance evaluation. The results show that the HDDPG achieved a reward gain of more than 40% and more than twice the reward improvement rate. Another important feature of the proposed HDDPG is its biased manager training capability. By adding a preference factor to each worker, the manager can be trained to prefer a certain worker in order to achieve a better success rate for a specific objective if needed.
6
Guan G, Zhao Z, Tang C. Delineating mechanisms and design principles of Caenorhabditis elegans embryogenesis using in toto high-resolution imaging data and computational modeling. Comput Struct Biotechnol J 2022; 20:5500-5515. [PMID: 36284714 PMCID: PMC9562942 DOI: 10.1016/j.csbj.2022.08.024] [Received: 04/26/2022] [Revised: 08/10/2022] [Accepted: 08/11/2022] [Indexed: 11/19/2022]
Abstract
The nematode (roundworm) Caenorhabditis elegans is one of the most popular animal models for the study of developmental biology, as its invariant development and transparent body enable in toto cellular-resolution fluorescence microscopy imaging of developmental processes at 1-min intervals. This has led to the development of various computational tools for the systematic and automated analysis of imaging data to delineate the molecular and cellular processes throughout the embryogenesis of C. elegans, such as those associated with cell lineage, cell migration, cell morphology, and gene activity. In this review, we first introduce C. elegans embryogenesis and the development of techniques for tracking cell lineage and reconstructing cell morphology during this process. We then contrast the developmental modes of C. elegans and the customized technologies used for studying them with the ones of other animal models, highlighting its advantage for studying embryogenesis with exceptional spatial and temporal resolution. This is followed by an examination of the physical models that have been devised—based on accurate determinations of developmental processes afforded by analyses of imaging data—to interpret the early embryonic development of C. elegans from subcellular to intercellular levels of multiple cells, which focus on two key processes: cell polarization and morphogenesis. We subsequently discuss how quantitative data-based theoretical modeling has improved our understanding of the mechanisms of C. elegans embryogenesis. We conclude by summarizing the challenges associated with the acquisition of C. elegans embryogenesis data, the construction of algorithms to analyze them, and the theoretical interpretation.