Zhang X, Wang Y. Covariant Cluster Transfer for Kernel Reinforcement Learning in Brain-Machine Interface.
Annual International Conference of the IEEE Engineering in Medicine and Biology Society 2020; 2020:3086-3089. [PMID: 33018657 DOI: 10.1109/embc44109.2020.9175985]
[Indexed: 06/11/2023]
Abstract
Brain-machine interfaces (BMIs) offer a promising way to help people with disabilities restore motor function. A decoder translates the patient's neural signals directly into commands for an external device. Factors such as mental fatigue and distraction can shift the distribution of the neural signals, which can degrade decoder performance. The decoder parameters then have to be recalibrated before each session, which requires professionals to label the data and is inconvenient for use at home. In this paper, we propose a covariant cluster transfer mechanism for the kernel reinforcement learning (RL) algorithm to speed up adaptation across sessions. The decoder parameters adapt according to a reward signal, which can easily be set by the patient. More importantly, we cluster the neural patterns from previous sessions; each cluster represents the conditional distribution from neural patterns to actions. When a distinct neural pattern appears in the new session, the nearest cluster is transferred, so that knowledge from the old session accelerates learning in the new one. The proposed algorithm is tested on simulated neural data in which the distribution of the neural signals differs across sessions. Compared with training from random initialization and with a weight transfer policy, the proposed cluster transfer mechanism maintains a significantly higher success rate and adapts faster when the conditional distribution from neural signals to actions remains similar.
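The transfer step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian kernel, the plain k-means clustering, the distance-based novelty threshold, and all function and variable names below are assumptions made for illustration, and the reward-driven weight update of the kernel RL decoder is omitted.

```python
import numpy as np

def gaussian_kernel(x, centers, sigma=1.0):
    # Similarity between one neural pattern x and each stored center
    # (assumed kernel; the abstract does not specify one).
    d2 = np.sum((centers - x) ** 2, axis=1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def q_values(x, centers, weights, sigma=1.0):
    # Kernel decoder output: one value per action.
    return gaussian_kernel(x, centers, sigma) @ weights

def kmeans(X, k, iters=20, seed=0):
    # Plain k-means, standing in for whatever clustering the paper uses.
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return centroids, dist.argmin(axis=1)

def transfer_nearest_cluster(x, centers, weights, old_centers, old_weights,
                             centroids, labels, novelty_threshold=1.0):
    # If x is already close to a center in the new decoder, do nothing.
    if len(centers) > 0:
        if np.min(np.linalg.norm(centers - x, axis=1)) <= novelty_threshold:
            return centers, weights, None
    # Otherwise x is "distinct": copy the nearest old cluster
    # (its centers and their action weights) into the new decoder.
    j = int(np.argmin(np.linalg.norm(centroids - x, axis=1)))
    mask = labels == j
    return (np.vstack([centers, old_centers[mask]]),
            np.vstack([weights, old_weights[mask]]), j)

# Toy "old session": four stored patterns, two per action (made-up data).
old_centers = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
old_weights = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
centroids, labels = kmeans(old_centers, k=2)

# New session starts with an empty decoder.
centers = np.empty((0, 2))
weights = np.empty((0, 2))
x = np.array([9.5, 10.2])
centers, weights, moved = transfer_nearest_cluster(
    x, centers, weights, old_centers, old_weights, centroids, labels)
action = int(np.argmax(q_values(x, centers, weights)))
```

In this toy run the new pattern lies near the second group of old patterns, so that group's centers and weights are copied and the decoder immediately prefers the action associated with them. Presenting the same pattern again triggers no further transfer, because a transferred center is now within the threshold.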