Minimum Coverage Sets for Training Robust Ad Hoc Teamwork Agents

Muhammad Rahman, Jiaxun Cui, Peter Stone

doi:10.1609/aaai.v38i16.29702

Abstract

Robustly cooperating with unseen agents and human partners presents significant challenges due to the diverse cooperative conventions these partners may adopt. Existing Ad Hoc Teamwork (AHT) methods address this challenge by training an agent with a population of diverse teammate policies obtained through maximizing specific diversity metrics. However, prior heuristic-based diversity metrics do not always maximize the agent's robustness in all cooperative problems. In this work, we first propose that maximizing an AHT agent's robustness requires it to emulate policies in the minimum coverage set (MCS), the set of best-response policies to any partner policies in the environment. We then introduce the L-BRDiv algorithm that generates a set of teammate policies that, when used for AHT training, encourage agents to emulate policies from the MCS. L-BRDiv works by solving a constrained optimization problem to jointly train teammate policies for AHT training and approximating AHT agent policies that are members of the MCS. We empirically demonstrate that L-BRDiv produces more robust AHT agents than state-of-the-art methods in a broader range of two-player cooperative problems without the need for extensive hyperparameter tuning for its objectives. Our study shows that L-BRDiv outperforms the baseline methods by prioritizing discovering distinct members of the MCS instead of repeatedly finding redundant policies.
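To make the idea of a coverage set concrete, here is a minimal Python sketch on a toy matrix game. It is an illustration under stated assumptions, not the L-BRDiv algorithm: the payoff values are made up, the partner is restricted to a finite list of fixed policies (the abstract's definition ranges over any partner policy), and the helper names `best_response` and `coverage_set` are hypothetical. With no ties in any column, the set of distinct best responses is also the smallest set that covers every listed partner.

```python
# Toy illustration only: rows are candidate agent policies, columns are
# partner policies, and each entry is the (made-up) team return.
PAYOFF = [
    [10, 0, 0, 1],
    [0, 8, 0, 7],
    [0, 0, 6, 2],
    [3, 3, 3, 3],
]


def best_response(payoff, partner):
    """Return the index of the agent policy with the highest return
    against the given partner policy (first index wins on ties)."""
    column = [row[partner] for row in payoff]
    return max(range(len(column)), key=column.__getitem__)


def coverage_set(payoff):
    """Collect the distinct best responses over all listed partner policies."""
    n_partners = len(payoff[0])
    return sorted({best_response(payoff, j) for j in range(n_partners)})


if __name__ == "__main__":
    for j in range(len(PAYOFF[0])):
        print(f"partner {j}: best response is agent policy {best_response(PAYOFF, j)}")
    print("coverage set:", coverage_set(PAYOFF))
```

In this toy game, partner 3 has the same best response as partner 1, so adding partner 3 to a training population would not reveal a new member of the coverage set. That is the kind of redundancy the abstract says L-BRDiv avoids by prioritizing distinct MCS members.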

Journal: Proceedings of the ... AAAI Conference on Artificial Intelligence. AAAI Conference on Artificial Intelligence
Publication Date: Mar 24, 2024
Citations: 1
