On Suboptimal Solution of Antagonistic Matrix Games

Abstract. The paper examines resource allocation games such as the Colonel Blotto and Colonel Lotto games, with the goal of developing a tractable method for building suboptimal solutions in mixed strategies without solving the relevant optimization problem. The proposed method rests on specific combinatorial properties of partition games. It turns out that, as far as the distribution of resources along the battlefields is concerned, the pure strategies participating in an ε-optimal solution possess a specific structure. Numerical experiments show that these structural peculiarities can be easily reproduced using previously found combinatorial properties of partitions. As a result, we obtain an ε-optimal solution of partition games, and the support set of the mixed strategies can be computed in polynomial time.


Introduction
In 1921 Borel published the seminal work [1] in which he studied a two-player constant-sum game where the players strategically distribute a fixed amount of resources n over a finite number m of contests (called "battlefields"). The player who assigns the higher amount of resources to a battlefield wins it. The objective of each player is to maximize the number of battlefields won. (This work, as some researchers have argued, became a cornerstone of game theory.) After the Second World War Borel's model took the name "Colonel Blotto game" and brought into the scope of game theory the field of allocation games, or "winner-takes-all" conflicts, with applications in areas such as R&D races, presidential elections, auctions and tournaments (for an overview of research on allocation games, see [2]). Hart [3] considered a discrete version of the Blotto game, called the Colonel Lotto game, in which the battlefields are assumed indistinguishable.
More precisely, the difference between the Colonel Blotto and Colonel Lotto games can be explained as follows.
Let $D = (D_1, \dots, D_m)$ and $E = (E_1, \dots, E_m)$ be pure strategies of players A and B in these games. For the Blotto game the payoff function $H_B$ is defined as $H_B(D, E) = \frac{1}{m} \sum_{i} \operatorname{sign}(D_i - E_i)$. For the Colonel Lotto game, in accordance with Hart [6], the payoff function $H_L$ is defined as $H_L(D, E) = \frac{1}{m^2} \sum_{i} \sum_{j} \operatorname{sign}(D_i - E_j)$. In [3] it was shown that the Colonel Blotto game (n, m) and the Colonel Lotto game (n, m) have the same value.
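The two payoff functions can be illustrated with a minimal Python sketch. The function names and the sample strategies are assumptions made for illustration only; they do not come from the paper.

```python
def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def blotto_payoff(d, e):
    """H_B(D, E): battlefield i of D is compared only with battlefield i of E."""
    m = len(d)
    return sum(sign(di - ei) for di, ei in zip(d, e)) / m


def lotto_payoff(d, e):
    """H_L(D, E): every part of D is compared with every part of E."""
    m = len(d)
    return sum(sign(di - ej) for di in d for ej in e) / m ** 2


# Hypothetical pure strategies for n = 6, m = 3.
d = (3, 3, 0)
e = (4, 1, 1)
print(blotto_payoff(d, e))  # -1/3
print(lotto_payoff(d, e))   # -1/9
```

In this sketch the Lotto payoff equals the Blotto payoff averaged over all orderings of the opponent's parts, which is one way to read the statement that battlefields are indistinguishable.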
The Colonel Blotto and Colonel Lotto games are easy to formulate, yet analytically quite challenging. In general, these games have equilibria only in mixed strategies, and these equilibria are multiple.
In [4,5] the model of partition games was introduced to describe the classes of Colonel Blotto and Colonel Lotto games. Recall that an (n, m)-partition of n into m parts is a sequence of non-negative integers $a_1, \dots, a_m$ with $a_1 + \dots + a_m = n$ (see [6] for details on partition theory).
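For small parameters the (n, m)-partitions can be enumerated directly. The sketch below is an illustration only: the function name `partitions` is hypothetical, and listing the parts in non-increasing order is an assumed canonical form rather than a convention stated in the paper.

```python
def partitions(n, m, max_part=None):
    """Yield every partition of n into exactly m non-negative parts.

    Parts are listed in non-increasing order, so each partition
    appears exactly once.
    """
    if max_part is None:
        max_part = n
    if m == 0:
        if n == 0:
            yield ()
        return
    # The largest part is at least ceil(n / m) and at most min(n, max_part).
    smallest_first = -(-n // m)
    for first in range(min(n, max_part), smallest_first - 1, -1):
        for rest in partitions(n - first, m - 1, first):
            yield (first,) + rest


print(list(partitions(5, 3)))
# [(5, 0, 0), (4, 1, 0), (3, 2, 0), (3, 1, 1), (2, 2, 1)]
```

Each partition stands for all of its reorderings, so the number of partitions is much smaller than the number of ordered allocations of n over m battlefields.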
The antagonistic partition game [4] is a two-player constant-sum game. Its payoff function $H_P$ is defined as […], where $\Theta(D)$ is the set of permutations of D.
Despite the great interest in optimal solutions of the Colonel Blotto and Colonel Lotto games, to the best of our knowledge there are no works in which numerical minimax solutions were produced and analyzed. In [3] it was shown that the Colonel Blotto game has a mixed-strategy equilibrium in which the marginal distributions are uniform on $[0, 2n/m]$ along all battlefields. Hence, for the Colonel Blotto game (120, 6), for example, the support set should contain more than $10^8$ strategies, each taken with equal probability.
On the other hand, by the seminal theorem of Carathéodory [7], games like Colonel Blotto admit equilibrium mixed strategies involving at most $(n+1)m+1$ pure strategies. For a long time, however, the theorem of Carathéodory remained an existence theorem only.
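As a small worked example, the Carathéodory bound can be evaluated for the game (120, 6) and set against the support sizes of 227 and 229 pure strategies reported for this game in the discussion of Fig. 1. The helper name `caratheodory_bound` is an assumption introduced here.

```python
def caratheodory_bound(n, m):
    """Upper bound (n + 1) * m + 1 on the number of pure strategies
    needed in an equilibrium mixed strategy."""
    return (n + 1) * m + 1


bound = caratheodory_bound(120, 6)
print(bound)  # 727

# Support sizes reported in the paper for players A and B.
for support_size in (227, 229):
    print(support_size, support_size < bound)
```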
The payoff matrix of the Colonel Blotto and Colonel Lotto games is huge even for small values of n and m. For example, for the Colonel Blotto game (120, 6) the size of the payoff matrix is about $10^{10} \times 10^{10}$. It is therefore not surprising that traditional optimization techniques fail to find optimal solutions of the Colonel Blotto and Colonel Lotto games. Until now it was commonly believed that solving the minimax problem for constant-sum antagonistic matrix games such as Colonel Blotto or Colonel Lotto is intractable, because the number of pure strategies grows exponentially with the parameters of the game (for partition games, the total resource n and the number of parts m). Consequently, all previous research efforts concentrated on finding the marginal distributions of the available resources that are best for equilibrium [2,3].
Recently, so-called LMO-based decomposition techniques were proposed in [8]. In particular, the case was investigated in which these techniques allow one to reduce a huge but well-organized matrix game to a small saddle point optimization problem.
Fortunately, the Colonel Blotto game (and, accordingly, the partition game) is a case in which the matrices for each player are well organized. Specifically, the "attacker vs defender" game was considered in [8]. This game is a more general version of the Colonel Blotto game: different battlefields are allowed to have different weights. The method from [8], as applied to Colonel Blotto games (more exactly, to the "attacker vs defender" game), allows one to solve the game in polynomial time. In practice, the method is capable of finding, in reasonable time, near-optimal solutions to rather large games (with m and n about 100) within an accuracy, in terms of the payoffs of the players, of about 0.02.
Moreover, this method provides a support set with few pure strategies (in our experiments, significantly below the Carathéodory bound). However, this method, like any other computational method, does not allow one to understand a priori what kind of pure strategies will enter the supports of ε-optimal mixed strategies. At the same time, previous results [4] allow one to find a set of "appropriate" pure strategies (for an ε-optimal support set) without solving the optimization problem [8].
In this paper we present the first method, to our knowledge, to approximate the ε-optimal solutions of partition games. Essentially, our method is as follows: we employ knowledge of the parameters of a given partition (peculiar resource and permutation balance [4]) to design a set of pure strategies "similar" to those participating in the ε-optimal solution [8]. ("Similarity" here means the ability to find the game value within an accuracy, in terms of the payoffs of the players, of about 0.02-0.04.) To check the quality of our solutions and to obtain the support set (the set of pure strategies with their probabilities) we have to solve a small matrix minimax problem by a well-known method [9]. The results of the numerical simulation give strong reasons to believe that, for Colonel Blotto-like antagonistic games, our approach is quite competitive with the techniques from [8].
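A small matrix minimax problem of the kind mentioned above can be approximated, for instance, by fictitious play (the Brown-Robinson iterative scheme). The sketch below uses this technique only because it is short and needs no external solver; it is not claimed to be the method of [9]. The function name and the toy payoff matrices are assumptions made for illustration.

```python
def fictitious_play(payoff, iterations=5000):
    """Return (lower, upper) bounds on the value of a zero-sum matrix game.

    payoff[i][j] is the gain of the row player when row i meets column j.
    """
    rows, cols = len(payoff), len(payoff[0])
    row_gain = [0.0] * rows  # total gain of each row against past column plays
    col_loss = [0.0] * cols  # total loss of each column against past row plays
    i, j = 0, 0
    lower, upper = float("-inf"), float("inf")
    for t in range(1, iterations + 1):
        for c in range(cols):
            col_loss[c] += payoff[i][c]
        for r in range(rows):
            row_gain[r] += payoff[r][j]
        # The empirical row mixture guarantees at least min(col_loss) / t;
        # the empirical column mixture concedes at most max(row_gain) / t.
        lower = max(lower, min(col_loss) / t)
        upper = min(upper, max(row_gain) / t)
        # Each player best-responds to the opponent's empirical mixture.
        i = max(range(rows), key=row_gain.__getitem__)
        j = min(range(cols), key=col_loss.__getitem__)
    return lower, upper


# Toy example: rock-paper-scissors, whose value is 0.
rps = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
print(fictitious_play(rps))
```

The two returned numbers always bracket the game value, so their difference gives an accuracy guarantee of the same kind as the ε discussed in the text. Convergence is slow, and a linear programming formulation would be the usual choice when higher precision is required.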
The rest of the paper is organized as follows. In Section 2, "Analysis of the ε-optimal support set", we analyze the support sets of ε-optimal solutions obtained by the methods from [8] as applied to Colonel Blotto games (120, 6), (36, 6) and (100, 10) with symmetrical resources, and demonstrate some combinatorial peculiarities of the pure strategies participating in these solutions. In Section 3, "Experimental Design: synthesis suboptimal solution for symmetric partition games", the basic stages of the experimental procedure to compute equilibrium solutions are presented, and the accuracy of the obtained suboptimal solutions relative to the ε-optimal solutions is estimated by numerical simulations. Section 4 contains our concluding remarks.
Footnote: a matrix B (K × L) is well organized if, given $x \in \mathbb{R}^K$, it is easy to identify the columns $\underline{B}[x]$, $\overline{B}[x]$ of B making the maximal, resp. the minimal, inner product with x [3].

Analysis of the ε-optimal support set
First of all, we examined the ε-optimal support sets for the Colonel Blotto games (36, 6), (120, 6) and (100, 10). These solutions in mixed strategies contain about two hundred strategies. As an example, Fig. 1 shows the first ten strategies, with their probabilities, from the support set of the Blotto game (120, 6).
We emphasize that there is an essential discrepancy between the well-known condition on resource allocation for an optimal support set [5] and the resource allocation in the ε-optimal support set. For example, for the Colonel Blotto game (120, 6) the maximal resource on each battlefield is (59, 62, 53, 40, 53, 64) for player A and (53, 46, 51, 51, 50, 48) for player B. It is common knowledge that, in accordance with [5], the maximal resource on any battlefield must be less than $(2n/m)+1$, i.e. less than 41 for the Colonel Blotto game (120, 6). What is more, the resources on each battlefield are based on the univariate distribution function $mx/(2n)$, where $x \in [0, 2n/m]$. But if all pure strategies in the ε-optimal support set are represented as partitions and m is even, the resource allocation in these partitions is far from the requirement of uniformity imposed on the model in [2].
The last column of Fig. 3 shows the deviation of the first moment $M(Q_{n,m})$ from ln 2. This value is the asymptotic approximation of $M(Q_{n,m})$ when n and m are sufficiently large (this statement is readily apparent from [10]). From Fig. 3 we see that when $n = m^2$ the deviation of the first moment from ln 2 is very small (about 0.003), whereas in all other cases the deviation is significantly larger.
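The discrepancy in maximal resources noted above for the game (120, 6) can be checked with a few lines of Python. The numbers are those quoted in the text; the helper name `violations` is an assumption.

```python
n, m = 120, 6
bound = 2 * n / m + 1  # the maximal resource must be less than this value (41)

# Maximal resource observed on each battlefield in the support sets.
max_a = (59, 62, 53, 40, 53, 64)
max_b = (53, 46, 51, 51, 50, 48)


def violations(maxima, bound):
    """Return the 1-based battlefields whose maximal resource is not below the bound."""
    return [k + 1 for k, r in enumerate(maxima) if r >= bound]


print(violations(max_a, bound))  # [1, 2, 3, 5, 6]
print(violations(max_b, bound))  # [1, 2, 3, 4, 5, 6]
```

Only the fourth battlefield of player A stays within the bound; every other maximum exceeds it.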
An important characteristic of partitions is the so-called peculiar resource [4]. When the value of the peculiar resource is maximal, all parts of the partition are different.
Let $PR_{n,m} \subset \Theta_{n,m}$ be the set of partitions each of which has the maximal peculiar resource, where $\Theta_{n,m}$ denotes the set of all (n, m)-partitions.
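The property the text associates with maximal peculiar resource, namely that all parts are different, can be used to list candidate partitions for small n and m. The sketch below is an illustration under that assumption only: it does not compute the peculiar resource itself, whose formula is given in [4], and the function name is hypothetical.

```python
def distinct_partitions(n, m, max_part=None):
    """Yield partitions of n into m pairwise different non-negative parts,
    listed in strictly decreasing order."""
    if max_part is None:
        max_part = n
    if m == 0:
        if n == 0:
            yield ()
        return
    for first in range(min(n, max_part), -1, -1):
        # Every later part must be strictly smaller than the current one.
        for rest in distinct_partitions(n - first, m - 1, first - 1):
            yield (first,) + rest


print(list(distinct_partitions(6, 3)))
# [(5, 1, 0), (4, 2, 0), (3, 2, 1)]
```

Because parts may be zero, at most one part of such a partition can be empty.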
Proposition. For $PR_{n,m}$ the following are true: […]. Let $\Omega^*_{n,m}$ be the set of pure strategies of the ε-optimal support set. Numerical simulations show that, as a rule, […] (a relation for $\Omega^*_{n,m}$ involving the constants 0.5 and 0.7). Hence, the greater the diversity of a pure strategy, the greater the chance of seeing this strategy in the optimal support set.

Experimental Design: synthesis suboptimal solution for symmetric partition games
The main body of the numerical experiments focused on finding a suboptimal solution by simulating the antagonistic symmetric partition game between two players, AC and AS.
Player AC is the set of pure strategies $\Omega^*_{n,m}$ (the ε-optimal solution) obtained by the approach of [4].
Player AS is a synthetic (artificial) set of partitions $\Xi_{n,m}(PR_{n,m}, pb)$, where $PR_{n,m}$ is a set of partitions each having the maximal value of the peculiar resource and some value of the permutation balance $pb \in [pb_0, 0]$ (the left border $pb_0$ is set experimentally). A convex optimization program was developed to calculate the game value between AC and AS. It is reasonably safe to suggest that a small difference in game values between AS and AC means that these players can be used interchangeably.
Stages of the numerical simulation of the partition game between AC and AS. Evidently, there are only three free parameters in our experiments: N, M and $pb_0$. Therefore, the main part of the experimental work centered on finding the most appropriate parameters of the partition set, i.e. values of the free parameters that permit building "almost ε-optimal" support sets. By way of example, Fig. 4 shows the difference in the game value of the (100, 10)-partition game between sets of 100 and 200 strategies of the ε-optimal solution and a set containing M synthetic strategies (M = 100, 200, …, 600). From this figure we notice that if the number of synthetic strategies (player AS) is twice the number of strategies of the ε-optimal solution (player AC), the difference between the game values is practically negligible (no more than 0.02). However, on the basis of the experiments described above it is impossible to draw a final conclusion about the behavioral identity of AS and AC. In fact, players AC and AS could show different results when meeting other players. Therefore, to investigate these cases, a test set of strategies […] was created. Player $T_M$ is defined as a set of M strategies […]. These M strategies are chosen at random from […].
By way of example, Fig. 5 shows the experimental results for the game value (the case of the (120, 6)-partition game) when AC and AS play against $T_{200}([0, -0.85], PR_{n,m})$ and $T_{400}([0, -0.85], \text{no more than } 0.5\,PR_{n,m})$. Here the first argument is the interval of permutation balance values and the second is a value of the peculiar resource.
In the table (Fig. 5), the AC and AS game values are averages over twenty experiments. During each experiment, the permutation balance and the peculiar resource of player $T_M$ are selected randomly from the preset range (see above).

Conclusions
The results of the numerical simulation provide reason enough to suppose that the outlined approach offers serious competition to the ε-optimal solution [4], at least in the case of antagonistic matrix games such as the Colonel Blotto game. With accuracy and computational complexity comparable to [4], this method allows one to obtain an approximation of the ε-optimal solution without solving the relevant optimization problem. Models of strategic multidimensional resource allocation (and, in particular, the partition game model) have been utilized in many real-life applications, such as military conflicts, advertising resource allocation, political campaigns and development portfolio selection. Nevertheless, the use of these models is impossible without the design of tractable solution methods. The results of [4] were of crucial importance in attaining these ends. In our turn, we tried to take the next step along this line and discovered that the complexity of the problem in question may be strongly reduced at the expense of accuracy. Certainly, validating the experimental results with the help of analytical methods seems imperative, and this is our nearest goal. The simplicity of our suboptimal solution method is its main advantage for the field of behavioral game theory [10]. The method can be used to develop new experimental approaches, for example in laboratory experiments such as [11], where a version of the classic Colonel Blotto game was investigated to understand when human decisions are consistent with equilibrium predictions. What is more, the special features of our method offer a clearer view of why human beings could be smarter than supposed in the iterative reasoning model [12].
The author is very grateful to A.C. Nemirovski for his instructive explanations and to P.S. Bocharov, who helped with the computational experiments.

Fig. 1. First ten strategies and their probabilities from the support set of the Blotto game (120, 6), A vs B

Fig. 4. Difference in game values, AC vs AS, for the (100, 10)-partition game
Fig. 1 demonstrates the first ten strategies of the Colonel Blotto game (120, 6) for the two players A and B. Here the first value is the probability $p(D)$ of the associated strategy $D = [a_1, a_2, \dots, a_6]$ with $a_1 + a_2 + \dots + a_6 = 120$, i.e. the player must use strategy D from the support set with probability $p(D)$. Altogether the support set contains 227 pure strategies for player A and 229 pure strategies for player B. This support set yields the ε-optimal game value with guaranteed accuracy $-2.587182 \cdot 10^{-2}$ for player A and $2.040023 \cdot 10^{-2}$ for player B.
So it is desirable to examine how, from the point of view of resource allocation, all partition support sets are constructed.
Let $\Theta_{n,m}$ be the set of all (n, m)-partitions and […] a subset of $\Theta_{n,m}$. For any D define the function $Q(D)$ as […].