| Issue | ITM Web Conf., Volume 78, 2025: International Conference on Computer Science and Electronic Information Technology (CSEIT 2025) |
|---|---|
| Article Number | 03027 |
| Number of page(s) | 9 |
| Section | Intelligent Systems and Computing in Industry, Robotics, and Smart Infrastructure |
| DOI | https://doi.org/10.1051/itmconf/20257803027 |
| Published online | 08 September 2025 |
Strategic Learning in Multi-Player Bandit Problems: A Game-Theoretic Approach to Resource Allocation in Edge Computing
Department of Mathematics, Xiamen University Malaysia, Selangor Darul Ehsan, 43900 Sepang, Malaysia
This paper investigates strategic learning approaches for resource allocation in decentralized edge computing environments where multiple agents compete for limited resources without direct coordination. The problem is modeled using a multi-player multi-armed bandit (MP-MAB) framework, which captures the exploration-exploitation trade-offs inherent in sequential decision-making. Building upon this foundation, the study incorporates game-theoretic principles such as strategic regret minimization to guide the development of learning strategies that can achieve both stable and efficient outcomes. Three representative algorithms—Upper Confidence Bound (UCB), Thompson Sampling (TS), and Sliding Window UCB—are implemented to evaluate performance across multiple dimensions. The experimental setup leverages the MovieLens dataset to simulate realistic user demand and resource constraints. Evaluation metrics include cumulative reward, conflict rate, and Jain's fairness index to capture efficiency, contention, and equity, respectively. Experimental results reveal that Thompson Sampling consistently outperforms the other strategies, delivering on average 10.2% higher rewards, a 25% reduction in conflict rate, and improved fairness scores across 50 interaction rounds. These findings underscore the advantages of probabilistic decision-making in competitive, distributed systems. The study offers practical implications for edge computing and other real-world systems requiring decentralized resource management.
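As a rough illustration of two of the evaluation metrics named in the abstract, the sketch below computes Jain's fairness index and a conflict rate in plain Python. Jain's index follows its standard definition, (Σx)² / (n · Σx²). The abstract does not define the conflict rate, so the version here (the share of agent decisions that land on an arm also chosen by another agent in the same round) is an assumption, as are the function names `jain_fairness` and `conflict_rate` and the input layout.

```python
from collections import Counter


def jain_fairness(rewards):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2).

    Returns 1.0 when every agent receives the same reward and 1/n when a
    single agent receives everything.
    """
    n = len(rewards)
    total = sum(rewards)
    squares = sum(x * x for x in rewards)
    if n == 0 or squares == 0:
        # Assumption: an empty or all-zero allocation is treated as fair.
        return 1.0
    return (total * total) / (n * squares)


def conflict_rate(choices_per_round):
    """Share of agent decisions that collide with another agent's choice.

    `choices_per_round` is a list of rounds; each round is a list holding the
    arm index chosen by each agent. This definition is an assumption, not
    taken from the paper.
    """
    collided = 0
    decisions = 0
    for choices in choices_per_round:
        counts = Counter(choices)
        # A decision collides if at least one other agent picked the same arm.
        collided += sum(1 for arm in choices if counts[arm] > 1)
        decisions += len(choices)
    return collided / decisions if decisions else 0.0
```

For example, `jain_fairness` applied to per-agent cumulative rewards gives a single equity score per algorithm, while `conflict_rate` applied to the logged arm choices gives the contention measure, so both can be compared across UCB, Thompson Sampling, and Sliding Window UCB runs.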
© The Authors, published by EDP Sciences, 2025
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

