Xu X, Li R, Zhao Z, Zhang H. Stigmergic Independent Reinforcement Learning for Multiagent Collaboration.
IEEE Transactions on Neural Networks and Learning Systems 2022; 33:4285-4299. [PMID: 33587718 DOI: 10.1109/tnnls.2021.3056418]
[Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/12/2023]
Abstract
With the rapid evolution of wireless mobile devices, there emerges an increased need to design effective collaboration mechanisms between intelligent agents to gradually approach the final collective objective by continuously learning from the environment based on their individual observations. In this regard, independent reinforcement learning (IRL) is often deployed in multiagent collaboration to alleviate the problem of a nonstationary learning environment. However, behavioral strategies of intelligent agents in IRL can be formulated only upon their local individual observations of the global environment, and appropriate communication mechanisms must be introduced to reduce their behavioral localities. In this article, we address the problem of communication between intelligent agents in IRL by jointly adopting mechanisms with two different scales. For the large scale, we introduce the stigmergy mechanism as an indirect communication bridge between independent learning agents, and carefully design a mathematical method to indicate the impact of digital pheromone. For the small scale, we propose a conflict-avoidance mechanism between adjacent agents by implementing an additionally embedded neural network to provide more opportunities for participants with higher action priorities. In addition, we present a federal training method to effectively optimize the neural network of each agent in a decentralized manner. Finally, we establish a simulation scenario in which a number of mobile agents in a certain area move automatically to form a specified target shape. Extensive simulations demonstrate the effectiveness of our proposed method.