The Cooperation and Competition Mechanisms In Multi-agent Reinforcement Learning

Puxuan Li; Shuo Li; Junxi Wang

doi:10.1051/itmconf/20268404004

Open Access

Issue		ITM Web Conf. Volume 84, 2026
		2026 International Conference on Advent Trends in Computational Intelligence and Data Science (ATCIDS 2026)
Article Number		04004
Number of page(s)		6
Section		Computer Vision, Robotic Systems, and Intelligent Control
DOI		https://doi.org/10.1051/itmconf/20268404004
Published online		06 April 2026

ITM Web of Conferences 84, 04004 (2026)

Puxuan Li¹*, Shuo Li² and Junxi Wang³

¹ School of Mathematics and Physics, Xi’an Jiaotong-Liverpool University, Suzhou, Jiangsu, China
² School of Advanced Technology, Xi’an Jiaotong-Liverpool University, Suzhou, Jiangsu, China
³ School of Mathematics and Statistics, Central South University, Changsha, Hunan, China

* Corresponding author

Abstract

With the development of artificial intelligence, multi-agent systems have become an important research focus for achieving autonomous collaboration and intelligent decision-making among multiple agents. Within this area, multi-agent reinforcement learning (MARL) has become a key framework for handling uncertain and dynamic interaction problems. This article systematically reviews the cooperation and competition mechanisms in MARL. For cooperation, it identifies centralized training with decentralized execution (CTDE) as the core paradigm for addressing credit assignment and the non-stationarity of the environment, and as the basis for key techniques such as value decomposition and actor-critic methods. For competition, the combination of game theory and deep reinforcement learning provides a theoretical basis for strategic interaction. The article also analyzes the complexity of mixed cooperative-competitive scenarios, summarizes a comprehensive framework that integrates multiple techniques, and illustrates its application potential through cases such as games and smart grids. Finally, in view of current bottlenecks, it outlines future directions such as integration with large models and improved generalization.
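
To make the value decomposition idea mentioned above concrete, the following is a minimal sketch, assuming a simple additive (VDN-style) mixing in which the joint value is the sum of per-agent utilities. The utility numbers and the function names `q_total`, `decentralized_greedy`, and `centralized_greedy` are hypothetical and chosen only for illustration; they are not taken from the article, which may discuss other decomposition schemes.

```python
from itertools import product

# Hypothetical per-agent utilities for one fixed joint observation
# (illustrative numbers only, not results from the article).
q_agent = [
    [1.0, 3.0, 2.0],  # agent 0: utility of each of its 3 actions
    [0.5, 0.2, 4.0],  # agent 1: utility of each of its 3 actions
]


def q_total(joint_action):
    """Additive (VDN-style) mixing: the joint value is the sum of per-agent utilities."""
    return sum(q[a] for q, a in zip(q_agent, joint_action))


def decentralized_greedy():
    """Each agent picks its own argmax using only its local utilities."""
    return tuple(max(range(len(q)), key=q.__getitem__) for q in q_agent)


def centralized_greedy():
    """Brute-force argmax over the whole joint action space, for comparison."""
    joint_actions = product(*(range(len(q)) for q in q_agent))
    return max(joint_actions, key=q_total)


local_choice = decentralized_greedy()
joint_choice = centralized_greedy()
print(local_choice, joint_choice, q_total(local_choice))
```

Under this additive assumption, the actions that agents choose independently at execution time coincide with the joint action a centralized maximizer would select, which is the property that lets training use the joint value while execution stays decentralized.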

This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
