Reference Citation Analysis

For:	Yliniemi L, Tumer K. Multi-objective multiagent credit assignment in reinforcement learning and NSGA-II. Soft comput 2016. [DOI: 10.1007/s00500-016-2124-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 10/22/2022]

Number	Cited by Other Article(s)
1	Singh D, Sisodia DS, Singh P. Multiobjective evolutionary-based multi-kernel learner for realizing transfer learning in the prediction of HIV-1 protease cleavage sites. Soft comput 2020. [DOI: 10.1007/s00500-019-04487-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]
2	Tuning of reinforcement learning parameters applied to SOP using the Scott–Knott method. Soft comput 2019. [DOI: 10.1007/s00500-019-04206-w] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 10/26/2022]
3	Reward shaping for knowledge-based multi-objective multi-agent reinforcement learning. KNOWL ENG REV 2018. [DOI: 10.1017/s0269888918000292] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Indexed: 11/06/2022] Abstract: The majority of multi-agent reinforcement learning (MARL) implementations aim to optimize systems with respect to a single objective, despite the fact that many real-world problems are inherently multi-objective in nature. Research into multi-objective MARL is still in its infancy, and few studies to date have dealt with the issue of credit assignment. Reward shaping has been proposed as a means to address the credit assignment problem in single-objective MARL, however it has been shown to alter the intended goals of a domain if misused, leading to unintended behaviour. Two popular shaping methods are potential-based reward shaping and difference rewards, and both have been repeatedly shown to improve learning speed and the quality of joint policies learned by agents in single-objective MARL domains. This work discusses the theoretical implications of applying these shaping approaches to cooperative multi-objective MARL problems, and evaluates their efficacy using two benchmark domains. Our results constitute the first empirical evidence that agents using these shaping methodologies can sample true Pareto optimal solutions in cooperative multi-objective stochastic games.