| Issue | ITM Web Conf., Volume 80, 2025: 2025 2nd International Conference on Advanced Computer Applications and Artificial Intelligence (ACAAI 2025) | |
|---|---|---|
| Article Number | 02001 | |
| Number of page(s) | 10 | |
| Section | Reinforcement Learning, Bandits & Optimization | |
| DOI | https://doi.org/10.1051/itmconf/20258002001 | |
| Published online | 16 December 2025 | |
Variance and robustness aware UCB variants for evaluating e-commerce fulfilments
New York University, Tandon School of Engineering, Department of Mathematics, Brooklyn, NY, USA
Online allocation of inventory across product categories is naturally a multi-armed bandit problem: companies need efficient, low-regret exploration to direct deep inventory to the best options. This research performs an offline evaluation of UCB variants on a public e-commerce sales dataset, treating product categories as arms and deriving binary rewards from the terminal fulfilment status. It compares the UCB1, Asymptotically Optimal UCB (Optimal-UCB), UCB-Variance, Bootstrapped UCB, and IQR-UCB algorithms on regret, pull allocation, and runtime. UCB-Variance gives the best overall trade-off, combining low regret with baseline-level runtime, while Optimal-UCB offers a slight improvement at baseline runtime. Bootstrapped UCB offers little gain at high cost, while IQR-UCB explores overly aggressively on binary rewards because its bonus vanishes. For category-level allocation, UCB-Variance appears to be the most cost-effective choice, and Optimal-UCB is a safe anytime baseline. However, the conservative success labelling and the limited operational realism of the results leave room for further optimization. Future work should evaluate Bootstrapped UCB with upper-quantile and lazy updates, IQR-UCB with alternative scales and added floors, and non-stationary and contextual settings, to better examine the full potential of these methods.
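To make the comparison concrete, the following is a minimal sketch of two of the index rules named above, run on simulated binary rewards. It is not the paper's implementation: the UCB-Variance index follows the common UCB-V form with assumed constants (exploration constant 1, reward range `b = 1`), the function names are hypothetical, and the per-category success rates in the usage lines are made up for illustration rather than taken from the dataset.

```python
import math
import random


def ucb1_index(mean, pulls, t):
    """UCB1 index: empirical mean plus the bonus sqrt(2 ln t / n)."""
    return mean + math.sqrt(2.0 * math.log(t) / pulls)


def ucb_v_index(mean, var, pulls, t, b=1.0):
    """UCB-V style index (assumed constants): variance-aware bonus for rewards in [0, b]."""
    log_t = math.log(t)
    return mean + math.sqrt(2.0 * var * log_t / pulls) + 3.0 * b * log_t / pulls


def run_bandit(success_rates, horizon, variant="ucb1", seed=0):
    """Simulate Bernoulli arms; return (pull counts, cumulative pseudo-regret)."""
    rng = random.Random(seed)
    k = len(success_rates)
    pulls = [0] * k
    wins = [0] * k
    best = max(success_rates)
    regret = 0.0
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # pull every arm once before using the index
        else:
            scores = []
            for a in range(k):
                mean = wins[a] / pulls[a]
                if variant == "ucb_v":
                    # For binary rewards the empirical variance is mean * (1 - mean).
                    var = mean * (1.0 - mean)
                    scores.append(ucb_v_index(mean, var, pulls[a], t))
                else:
                    scores.append(ucb1_index(mean, pulls[a], t))
            arm = max(range(k), key=scores.__getitem__)
        reward = 1 if rng.random() < success_rates[arm] else 0
        pulls[arm] += 1
        wins[arm] += reward
        regret += best - success_rates[arm]
    return pulls, regret


# Hypothetical fulfilment success rates for three product categories.
rates = [0.2, 0.5, 0.9]
for name in ("ucb1", "ucb_v"):
    pulls, regret = run_bandit(rates, horizon=2000, variant=name)
    print(name, pulls, round(regret, 1))
```

The variance term shrinks the bonus for arms whose empirical success rate is close to 0 or 1, which is one plausible reason a variance-aware index can do well on binary fulfilment outcomes; the actual ranking of the methods should be read from the paper's results, not from this toy simulation.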
© The Authors, published by EDP Sciences, 2025
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

