1
|
Zeng Y, Cai R, Sun F, Huang L, Hao Z. A Survey on Causal Reinforcement Learning. IEEE Trans Neural Netw Learn Syst 2025; 36:5942-5962. [PMID: 40030342 DOI: 10.1109/tnnls.2024.3403001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/05/2025]
Abstract
Although reinforcement learning (RL) has achieved tremendous success in sequential decision-making problems across many domains, it still faces the key challenges of data inefficiency and lack of interpretability. Many researchers have recently leveraged insights from the causality literature, producing a growing body of work that brings the merits of causality to bear on these challenges. It is therefore both necessary and significant to collate these causal RL (CRL) works, review CRL methods, and investigate what causality can contribute to RL. In particular, we divide the existing CRL approaches into two categories according to whether their causality-based information is given in advance. We further analyze each category in terms of the formalization of different models, including the Markov decision process (MDP), partially observed MDP (POMDP), multiarmed bandits (MABs), imitation learning (IL), and dynamic treatment regime (DTR). Each of them represents a distinct type of causal graphical illustration. Moreover, we summarize the evaluation metrics and open-source resources, discuss emerging applications, and outline promising prospects for the future development of CRL.
|
2
|
Dang Y, Huang C, Chen P, Liang R, Yang X, Cheng KT. Imitation Learning-Based Algorithm for Drone Cinematography System. IEEE Trans Cogn Dev Syst 2022. [DOI: 10.1109/tcds.2020.3043441] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]
Affiliation(s)
- Yuanjie Dang
  - College of Information and Engineering, Zhejiang University of Technology, Hangzhou, China
- Chong Huang
  - Department of Electrical and Computer Engineering, University of California at Santa Barbara, Santa Barbara, CA, USA
- Peng Chen
  - College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, China
- Ronghua Liang
  - College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, China
- Xin Yang
  - School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan, China
- Kwang-Ting Cheng
  - School of Engineering, Hong Kong University of Science and Technology, Hong Kong
|
3
|
Davis GP, Katz GE, Gentili RJ, Reggia JA. NeuroLISP: High-level symbolic programming with attractor neural networks. Neural Netw 2021; 146:200-219. [PMID: 34894482 DOI: 10.1016/j.neunet.2021.11.009] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/11/2021] [Revised: 11/04/2021] [Accepted: 11/09/2021] [Indexed: 10/19/2022]
Abstract
Despite significant improvements in contemporary machine learning, symbolic methods currently outperform artificial neural networks on tasks that involve compositional reasoning, such as goal-directed planning and logical inference. This illustrates a computational explanatory gap between cognitive and neurocomputational algorithms that obscures the neurobiological mechanisms underlying cognition and impedes progress toward human-level artificial intelligence. Because of the strong relationship between cognition and working memory control, we suggest that the cognitive abilities of contemporary neural networks are limited by biologically-implausible working memory systems that rely on persistent activity maintenance and/or temporal nonlocality. Here we present NeuroLISP, an attractor neural network that can represent and execute programs written in the LISP programming language. Unlike previous approaches to high-level programming with neural networks, NeuroLISP features a temporally-local working memory based on itinerant attractor dynamics, top-down gating, and fast associative learning, and implements several high-level programming constructs such as compositional data structures, scoped variable binding, and the ability to manipulate and execute programmatic expressions in working memory (i.e., programs can be treated as data). Our computational experiments demonstrate the correctness of the NeuroLISP interpreter, and show that it can learn non-trivial programs that manipulate complex derived data structures (multiway trees), perform compositional string manipulation operations (PCFG SET task), and implement high-level symbolic AI algorithms (first-order unification). We conclude that NeuroLISP is an effective neurocognitive controller that can replace the symbolic components of hybrid models, and serves as a proof of concept for further development of high-level symbolic programming in neural networks.
Affiliation(s)
- Gregory P Davis
  - Department of Computer Science, University of Maryland, College Park, MD, USA
- Garrett E Katz
  - Department of Electrical Engineering and Computer Science, Syracuse University, Syracuse, NY, USA
- Rodolphe J Gentili
  - Department of Kinesiology, University of Maryland, College Park, MD, USA
- James A Reggia
  - Department of Computer Science, University of Maryland, College Park, MD, USA
|
4
|
Transferring the semantic constraints in human manipulation behaviors to robots. Appl Intell 2020. [DOI: 10.1007/s10489-019-01580-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]
|
5
|
A Novel Application of Levenshtein Distance for Assessment of High-Level Motor Planning Underlying Performance During Learning of Complex Motor Sequences. J Mot Learn Dev 2020. [DOI: 10.1123/jmld.2018-0060] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/18/2022]
Abstract
Few studies have examined high-level motor plans underlying cognitive-motor performance during practice of complex action sequences. These investigations have assessed performance through fairly simple metrics without examining how practice affects the structures of action sequences. By adapting the Levenshtein distance (LD) method to the motor domain, we propose a computational approach to accurately capture performance dynamics during practice of action sequences. Practice performance dynamics were assessed by computing the LD based on the number of insertions, deletions, and substitutions of actions needed to transform any sequence into a reference sequence (having a minimal number of actions to complete the task). Also, combining LD-based performance with mental workload metrics allowed assessment of cognitive-motor efficiency dynamics. This approach was tested on the Tower of Hanoi task. The findings revealed that throughout practice this method could capture: i) action sequence performance improvements as indexed by a reduced LD (decrease of insertions and substitutions), ii) structural modifications of the high-level plans, iii) an attenuation of mental workload, and iv) enhanced cognitive-motor efficiency. This effort complements prior work examining the practice of complex action sequences in healthy adults and has potential for probing cognitive-motor impairment in clinical populations as well as the development/assessment of cognitive robotic controllers.
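The edit-distance computation this abstract describes can be illustrated with a minimal sketch. This is the textbook dynamic-programming form of the Levenshtein distance, not the authors' implementation; the function name `levenshtein`, the encoding of each action as a `(from_peg, to_peg)` tuple, and the two-disk Tower of Hanoi sequences are all assumptions made for illustration.

```python
from typing import Hashable, Sequence


def levenshtein(seq: Sequence[Hashable], ref: Sequence[Hashable]) -> int:
    """Minimum number of insertions, deletions, and substitutions
    needed to transform `seq` into the reference sequence `ref`."""
    # prev[j] = distance between the first i-1 items of seq and the first j of ref
    prev = list(range(len(ref) + 1))
    for i, a in enumerate(seq, start=1):
        curr = [i]  # turning i items into an empty reference takes i deletions
        for j, b in enumerate(ref, start=1):
            cost = 0 if a == b else 1
            curr.append(min(
                prev[j] + 1,         # delete a from seq
                curr[j - 1] + 1,     # insert b into seq
                prev[j - 1] + cost,  # substitute a with b (free if equal)
            ))
        prev = curr
    return prev[-1]


# Hypothetical two-disk Tower of Hanoi example: each action is (from_peg, to_peg).
reference = [("A", "B"), ("A", "C"), ("B", "C")]              # minimal solution
performed = [("A", "C"), ("C", "B"), ("A", "C"), ("B", "C")]  # one wasted move

distance = levenshtein(performed, reference)
```

Under this encoding, a distance of zero means the performed sequence matches the minimal reference exactly, and larger values indicate more structural deviation from it.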
|
6
|
Shuggi IM, Shewokis PA, Herrmann JW, Gentili RJ. Changes in motor performance and mental workload during practice of reaching movements: a team dynamics perspective. Exp Brain Res 2017; 236:433-451. [PMID: 29214390 DOI: 10.1007/s00221-017-5136-8] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 12/30/2016] [Accepted: 11/14/2017] [Indexed: 10/18/2022]
Abstract
Few investigations have examined mental workload during motor practice or learning in a context of team dynamics. This study examines the underlying cognitive-motor processes of motor practice by assessing the changes in motor performance and mental workload during practice of reaching movements. Individuals moved a robotic arm to reach targets as fast and as straight as possible while satisfying the task requirement of avoiding a collision between the end-effector and the workspace limits. Individuals practiced the task either alone (HA group) or with a synthetic teammate (HRT group), which regulated the effector velocity to help satisfy the task requirements. The findings revealed that the performance of both groups improved similarly throughout practice. However, when compared to the individuals of the HA group, those in the HRT group (1) had a lower risk of collisions, (2) exhibited higher performance consistency, and (3) revealed a higher level of mental workload while generally perceiving the robotic teammate as interfering with their performance. As the synthetic teammate changed the effector velocity in specific regions near the workspace boundaries, individuals may have been constrained to learn a piecewise visuomotor map. This piecewise map made the task more challenging, which increased mental workload and perception of the synthetic teammate as a burden. The examination of both motor performance and mental workload revealed a combination of both adaptive and maladaptive team dynamics. This work is a first step to examine the human cognitive-motor processes underlying motor practice in a context of team dynamics and contributes to inform human-robot applications.
Affiliation(s)
- Isabelle M Shuggi
  - Systems Engineering Program, University of Maryland, College Park, MD, 20742, USA
  - Department of Kinesiology, School of Public Health, University of Maryland, College Park, MD, 20742, USA
  - Program in Neuroscience and Cognitive Science, University of Maryland, College Park, MD, 20742, USA
- Patricia A Shewokis
  - School of Biomedical Engineering, Science, and Health Systems, Drexel University, Philadelphia, PA, 19102, USA
  - Nutrition Sciences Department, College of Nursing and Health Professions, Drexel University, Philadelphia, PA, 19102, USA
- Jeffrey W Herrmann
  - Department of Mechanical Engineering, University of Maryland, College Park, MD, 20742, USA
  - Institute for Systems Research, University of Maryland, College Park, MD, 20742, USA
- Rodolphe J Gentili
  - Department of Kinesiology, School of Public Health, University of Maryland, College Park, MD, 20742, USA
  - Program in Neuroscience and Cognitive Science, University of Maryland, College Park, MD, 20742, USA
  - Maryland Robotics Center, University of Maryland, College Park, MD, USA
|