| Issue | ITM Web Conf., Volume 78, 2025: International Conference on Computer Science and Electronic Information Technology (CSEIT 2025) |
|---|---|
| Article Number | 01022 |
| Number of page(s) | 10 |
| Section | Deep Learning and Reinforcement Learning – Theories and Applications |
| DOI | https://doi.org/10.1051/itmconf/20257801022 |
| Published online | 08 September 2025 |
Correlation-Aware Collaborative Adaptive Window Algorithm for Multi-Armed Bandits
Department of Applied Mathematics, The Hong Kong Polytechnic University, Hong Kong, 999077, China
The Multi-Armed Bandit (MAB) problem is central to reinforcement learning, where it captures the trade-off between exploration and exploitation. However, traditional MAB algorithms often struggle in non-stationary environments where the correlations between arms evolve over time. This paper introduces the Correlation-Aware Collaborative Adaptive Window Algorithm (Adaptive UCB), which combines two techniques: Dynamic Window Recalibration (DWR) and Hierarchical Correlation-Aware Exploration (HCAE). DWR adjusts the size of the historical data window based on real-time covariance analysis, allowing the algorithm to adapt to both abrupt and gradual changes in the environment. HCAE improves arm selection by clustering arms and applying the Upper Confidence Bound (UCB) at the group level, which guides exploration and minimizes redundant sampling. Experiments show that Adaptive UCB outperforms Standard UCB, Sliding Window UCB, and Restart UCB, with the largest advantage in volatile environments and where arms are highly correlated. Its cumulative regret is 18.35% of that of Standard UCB and 44.66% of that of Sliding Window UCB. It also increases the average reward by 5.6% compared with Standard UCB and by 1.52% compared with Sliding Window UCB, which demonstrates that the algorithm is efficient in dynamic conditions.
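For readers unfamiliar with the baselines named in the abstract, the following is a minimal sketch of a fixed-window Sliding Window UCB policy. It is not the authors' Adaptive UCB: the window length is a constant here, whereas DWR is described as adjusting it from covariance analysis, and no arm clustering is performed. The function name `sliding_window_ucb`, the callback `reward_fn`, and the parameters `window` and `c` are illustrative assumptions, not taken from the paper.

```python
import math
from collections import deque


def sliding_window_ucb(reward_fn, n_arms, horizon, window=100, c=2.0):
    """Run a fixed-window UCB policy and return the list of chosen arms.

    reward_fn(arm, t) -> float is a hypothetical environment callback
    that is assumed to return rewards in [0, 1].
    """
    history = deque()            # (arm, reward) pairs currently in the window
    counts = [0] * n_arms        # pulls per arm inside the window
    sums = [0.0] * n_arms        # reward totals per arm inside the window
    choices = []

    for t in range(horizon):
        # Drop the oldest observation once the window is full, so that
        # statistics reflect only the most recent `window` pulls.
        if len(history) == window:
            old_arm, old_reward = history.popleft()
            counts[old_arm] -= 1
            sums[old_arm] -= old_reward

        # Any arm with no observation in the window is pulled first.
        untried = [a for a in range(n_arms) if counts[a] == 0]
        if untried:
            arm = untried[0]
        else:
            n = len(history)

            def index(a):
                mean = sums[a] / counts[a]
                bonus = math.sqrt(c * math.log(n) / counts[a])
                return mean + bonus

            arm = max(range(n_arms), key=index)

        reward = reward_fn(arm, t)
        history.append((arm, reward))
        counts[arm] += 1
        sums[arm] += reward
        choices.append(arm)

    return choices
```

Calling `sliding_window_ucb(env, n_arms=2, horizon=600, window=50)` with an environment whose best arm changes partway through illustrates why windowing helps in non-stationary settings: observations from before the change fall out of the window, so the policy can shift to the new best arm.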
© The Authors, published by EDP Sciences, 2025
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

