Game-theoretic payoff allocation in multiagent machine learning systems

Authors :: Han, Dongge
Wooldridge, Michael
Rogers, Alex
Publication Year :: 2021
Publisher :: University of Oxford, 2021.
Abstract: Machine learning (ML) is becoming ubiquitous in real-world applications, ranging from domestic devices such as robot vacuum cleaners, smartphones and virtual assistants, to industrial applications such as manipulators and warehouse robots, to public services such as medical imaging, surveillance cameras and energy grids. With the growing ubiquity and interconnection of these devices, interactions among them will soon become commonplace. These emerging ML problems and the interactions among agents can be formulated as Multiagent ML Systems. On the one hand, many complex ML tasks require cooperation among multiple agents carrying distributed resources and capabilities. On the other hand, multiagent solutions often lead to improved efficiency, robustness and scalability. Among the many aspects of multiagent ML systems, an important research problem is payoff allocation. This is because multiagent ML systems typically receive a global payoff for the overall performance of all agents, while a fine-grained evaluation of each agent's contribution is absent. Nevertheless, these agent-specific payoffs not only offer natural incentives for the agents' contribution and cooperation, but also provide crucial feedback signals for interpreting the agents' contributions and improving their future cooperative policies. In this thesis, we investigate the properties of emerging multiagent ML systems and address new challenges that arise when applying classic payoff allocation methods from cooperative game theory to them. The first part of the thesis investigates payoff allocation in submodular multiagent ML problems. Though convex games are commonly studied in cooperative game theory, many multiagent ML applications naturally exhibit submodular characteristics. Moreover, properties of the ML applications give rise to an increased computational burden, as well as a new type of malicious attack (i.e., the replication manipulation).
To address these challenges, we present a theoretical analysis of payoff allocation methods (in particular, semivalues) in submodular cooperative games, and characterise their robustness against the replication manipulation. We then extend our theoretical results to an emerging application of ML data markets and discuss related and more complex attack models. Moreover, we present a sampling method for estimating these payoff allocations to address the computational issue. Finally, we empirically validate our results on a classic facility location problem and on real-life machine learning datasets. In the second part of the thesis, we investigate the integration of payoff allocation into the learning process for guiding multiagent reinforcement learning (MARL) updates. To this end, we present a framework of game-theoretic payoff allocation in MARL for robotic control. Though the payoffs to agents provide crucial reward signals for improving their cooperative policy, a new challenge arises when performing the payoff allocations during learning. That is, the characteristic values of coalitions are absent except for the grand coalition, i.e., the global reward. On the one hand, it is impossible to re-run the simulation for all possible coalitions at each step. On the other hand, inferring the values of the unseen coalitions by brute force would yield highly inaccurate estimates because of the high-dimensional continuous state and action spaces in the robotic application. To address this challenge, we further incorporate a model-based RL module into our payoff allocation framework, and use a learned model of the environment to simulate values of the unseen coalitions. We empirically demonstrate that our model-based payoff allocation framework greatly improves the performance and sample efficiency of the MARL system on (multiagent) MuJoCo robotic locomotion control tasks.
In conclusion, the results presented in this thesis pave the way towards the application of game-theoretic payoff allocation methods in emerging multiagent ML systems, and open up a number of exciting directions for future research on this topic.