Issue | ITM Web Conf., Volume 9, 2017: The 2016 International Conference Applied Mathematics, Computational Science and Systems Engineering
---|---
Article Number | 01002
Number of page(s) | 5
Section | Applied Mathematics
DOI | https://doi.org/10.1051/itmconf/20170901002
Published online | 09 January 2017
Minimax Normal Two-Armed Bandit with Indefinite Control Horizon
Yaroslav-the-Wise Novgorod State University, B.St-Petersburgskaya Str, 41, Velikiy Novgorod, Russia, 173003
* e-mail: Alexander.Kolnogorov@novsu.ru
We consider the two-armed bandit problem as applied to data processing when two alternative processing methods are available whose efficiencies differ and are unknown a priori. One should determine the more effective method and ensure that it is predominantly applied. The total number of data items, which is interpreted as the control horizon, is assumed to have an a priori known probability distribution.
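To make the role of a random control horizon concrete, the following is a minimal Python sketch rather than the authors' formulation. It assumes (this is not stated in the abstract) that the horizon N is independent of the observed incomes, in which case the expected total loss over a random horizon equals the per-step expected losses weighted by the survival probabilities P(N >= t). The function names and the numbers in the usage lines are hypothetical.

```python
from itertools import accumulate


def survival_weights(pmf):
    """Return w[k] = P(N >= k + 1), given pmf[k] = P(N = k + 1)."""
    tail_sums = list(accumulate(reversed(pmf)))  # running sums from the end
    return list(reversed(tail_sums))


def expected_total_loss(step_losses, pmf):
    """Sum of per-step expected losses weighted by P(N >= t).

    Assumption: N is independent of the losses (illustrative only).
    """
    weights = survival_weights(pmf)
    return sum(w * loss for w, loss in zip(weights, step_losses))


def expected_total_loss_direct(step_losses, pmf):
    """Same quantity computed by averaging cumulative losses over N."""
    cumulative = list(accumulate(step_losses))
    return sum(p * c for p, c in zip(pmf, cumulative))


# Hypothetical example: horizon is 1, 2 or 3 with these probabilities.
pmf = [0.2, 0.3, 0.5]
step_losses = [1.0, 0.5, 0.25]
weighted = expected_total_loss(step_losses, pmf)
direct = expected_total_loss_direct(step_losses, pmf)
```

Both functions compute the same expectation; the weighted form is the one that fits naturally into a step-by-step recursion, because each step only needs the probability that the process is still running.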
The problem is considered in the minimax (robust) setting. According to the main theorem of game theory, the minimax risk and the minimax strategy are sought as the Bayesian ones corresponding to the worst-case prior distribution. We describe the properties of the worst-case prior and present a recursive Bellman-type equation for determining both the minimax strategy and the minimax risk. Numerical results illustrating the proposed algorithm are given. The algorithm can be applied to optimizing parallel data processing when the number of data items to be processed is not known exactly in advance.
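As a rough illustration of what a Bellman-type recursion with a random horizon can look like, here is a simplified Python sketch. It is not the equation from the paper: it swaps the normal incomes for Bernoulli ones, uses independent uniform Beta(1, 1) priors instead of the worst-case prior, and maximizes expected income rather than minimizing risk. It also assumes the horizon is independent of the incomes, so each step continues with probability P(N >= t + 1) / P(N >= t). The name `bayes_value` is hypothetical.

```python
from functools import lru_cache


def bayes_value(pmf):
    """Expected total income of the Bayes-optimal strategy.

    pmf[k] = P(N = k + 1). Arms are Bernoulli with uniform Beta(1, 1)
    priors; this is a stand-in for the paper's normal, worst-case setting.
    """
    n_max = len(pmf)
    # surv[t] = P(N >= t + 1); the extra 0.0 marks "beyond the last step".
    surv = [sum(pmf[t:]) for t in range(n_max)] + [0.0]

    @lru_cache(maxsize=None)
    def value(s1, f1, s2, f2):
        t = s1 + f1 + s2 + f2  # number of pulls made so far
        if t >= n_max or surv[t] == 0.0:
            return 0.0
        # Probability that one more step follows the current one.
        cont = surv[t + 1] / surv[t]
        best = float("-inf")
        for arm in (1, 2):
            if arm == 1:
                p = (s1 + 1) / (s1 + f1 + 2)  # posterior mean of arm 1
                win = value(s1 + 1, f1, s2, f2)
                lose = value(s1, f1 + 1, s2, f2)
            else:
                p = (s2 + 1) / (s2 + f2 + 2)  # posterior mean of arm 2
                win = value(s1, f1, s2 + 1, f2)
                lose = value(s1, f1, s2, f2 + 1)
            # Immediate expected income plus continuation value.
            q = p + cont * (p * win + (1.0 - p) * lose)
            best = max(best, q)
        return best

    return value(0, 0, 0, 0)


one_step = bayes_value([1.0])        # horizon is exactly 1
two_steps = bayes_value([0.0, 1.0])  # horizon is exactly 2
mixed = bayes_value([0.5, 0.5])      # horizon is 1 or 2 with equal odds
```

The number of states grows as the fourth power of the maximum horizon, so this form is only practical for short horizons; it is meant to show the structure of the recursion, not to reproduce the numerical results of the paper.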
© The Authors, published by EDP Sciences, 2017
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.