Multi-objective Virtual Machine Placement for Load Balancing

Abstract. Virtual machine placement is closely related to the efficient and balanced utilization of physical resources. In this paper, the influence of two resource utilization scenarios on load balancing is analyzed. A multi-objective ant colony optimization algorithm is proposed to solve the virtual machine placement problem; it simultaneously balances the load among physical machines and the internal load of each physical machine. The proposed algorithm is compared with two single-objective ant colony optimization algorithms, a first fit algorithm and a greedy algorithm on several instances. The results show that the proposed algorithm finds solutions that exhibit good balance among the objectives while the others do not. This demonstrates that the proposed algorithm can balance the load in the process of mapping virtual machines to physical machines.


Introduction
Cloud computing [1], as a new service model, can effectively cope with mass data processing and computing needs by integrating Internet resources. Cloud computing [2] can be roughly classified into three types according to the service type: Infrastructure as a Service (IaaS), Platform as a Service (PaaS) and Software as a Service (SaaS). Virtualization technology [3] can divide physical resources into isolated virtual machines, which meets user demand while improving resource utilization and reducing infrastructure investment. Isolated virtual machines make it possible for different applications to run on a single server. A high load degrades the performance of the applications running on top, while a low load leaves limited resources underused, so optimal virtual machine placement, which is closely related to the balanced utilization of resources, is very important.
A lot of research has been devoted to solving the virtual machine placement problem in a data center. Virtual machine placement is often modeled as a bin packing problem [4], and some solutions build on classic bin packing algorithms such as First Fit Decreasing (FFD) [5] and Best Fit Decreasing (BFD) [6]. In [7], the relationship among power consumption, resource utilization and performance is studied, and a power consumption optimization algorithm is proposed by modeling the problem as a bin packing problem. Beloglazov et al. [8] proposed the Modified Best Fit Decreasing (MBFD) algorithm to solve the virtual machine placement problem based on CPU utilization. Virtual machine placement is a combinatorial optimization problem, so improved algorithms can be combined with a genetic algorithm [9], an ant colony algorithm [10] or a particle swarm optimization algorithm [11], all of which are effective for this class of problem. In [12], a two-level control system is proposed to manage the mappings of workloads to virtual machines and of virtual machines to physical resources, and an improved multi-objective genetic algorithm is proposed to minimize total resource wastage, power consumption and thermal dissipation costs. In [13], a prototype virtual machine packing optimization mechanism is implemented on Grivon; a Genetic Algorithm (GA) is employed to avoid Service Level Agreement (SLA) violations, reduce the number of real nodes in use and reduce virtual machine migrations. Feller et al. [14] proposed an Energy-Aware ACO-based Workload Consolidation algorithm that minimizes the number of physical machines required. In [15], a multi-objective ant colony system algorithm for virtual machine placement in cloud computing is proposed to minimize resource wastage and power consumption. In [16], to reduce energy consumption in cloud data centers, an energy-efficient virtual machine allocation algorithm is proposed based on an energy-efficient multi-resource allocation model and the particle swarm optimization (PSO) algorithm.
Most research on virtual machine placement considers only the initial scenario in which the resources of all physical machines are idle. In a real scenario, the load of physical machines is dynamic, and a virtual machine would preferentially be deployed to a physical machine with low load. For a given number of virtual machine requests, minimizing the number of physical machines to save energy also affects the balanced utilization of resources. This paper studies the problem in the general situation, namely when part of the physical machine resources is already in use. The algorithm designed in this paper aims to make the utilization of physical machine resources as balanced as possible, so that more requests can be met with limited resources.
This paper is organized as follows. The second part briefly introduces the ant colony optimization algorithm and multi-objective optimization. The third part describes and formulates the virtual machine placement problem. In the fourth part, a multi-objective ant colony algorithm for load balancing is proposed in detail to solve the problem. In the fifth part, the proposed algorithm is compared with two single-objective ant colony optimization algorithms, a first fit algorithm and a greedy algorithm to verify its effectiveness. The sixth part concludes the paper.

Ant colony optimization algorithm
Inspired by the collective behavior of real ant colonies, Dorigo systematically proposed the ant colony optimization algorithm [10]. The mechanism of the algorithm consists of two basic stages: an adaptation phase and a cooperation phase. In the adaptation phase, each candidate solution adjusts its structure according to the accumulated pheromone. On the one hand, the more ants pass along a path, the more pheromone it carries and the more likely it is to be selected. On the other hand, the pheromone evaporates over time. In the cooperation phase, candidate solutions communicate through pheromone to reach a solution with better performance. Because this self-organization mechanism does not require detailed knowledge of every aspect of the problem, it is effective for many combinatorial optimization problems.
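The two pheromone effects described above, reinforcement by passing ants and evaporation over time, can be sketched in a few lines of Python. This is a generic illustration, not the update rule used later in this paper: the function name `update_pheromone`, the dictionary representation of trails and the constant `deposit` per ant are assumptions made for the example.

```python
def update_pheromone(tau, paths, rho=0.1, deposit=1.0):
    """Evaporate every trail, then reinforce the edges that ants used.

    tau     -- dict mapping an edge to its current pheromone level
    paths   -- one list of edges per ant (hypothetical representation)
    rho     -- evaporation rate in [0, 1]
    deposit -- fixed amount each ant adds per edge (an assumption)
    """
    # Evaporation: every trail decays by the factor (1 - rho).
    new_tau = {edge: (1.0 - rho) * level for edge, level in tau.items()}
    # Reinforcement: edges used by more ants accumulate more pheromone.
    for path in paths:
        for edge in path:
            new_tau[edge] = new_tau.get(edge, 0.0) + deposit
    return new_tau


# Two ants both use edge ("a", "b"); nobody uses ("a", "c").
trails = {("a", "b"): 1.0, ("a", "c"): 1.0}
trails = update_pheromone(trails, [[("a", "b")], [("a", "b")]], rho=0.5, deposit=0.2)
```

After the call, the trail on the used edge is stronger than the trail on the unused edge, which is what makes frequently chosen paths more likely to be chosen again.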

Multi-objective optimization
Many scientific and engineering problems can be modeled as multi-objective optimization problems [17], which differ from single-objective optimization problems. Improving one objective may degrade others, so it is very difficult or impossible to optimize all objectives simultaneously. The non-dominated feasible solutions of a multi-objective optimization problem form a Pareto [18] set. Generally speaking, a multi-objective optimization problem with n decision variables and m objective functions can be expressed as follows.

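$$\min\; y = f(x) = (f_1(x), f_2(x), \ldots, f_m(x)) \quad \text{s.t.}\quad g_i(x) \le 0,\; i = 1, 2, \ldots, p; \quad h_j(x) = 0,\; j = 1, 2, \ldots, q \tag{1}$$
In expression (1), the decision vector is $x = (x_1, x_2, \ldots, x_n) \in X$, and the objective vector is $y = (y_1, y_2, \ldots, y_m) \in Y$. $X$ is the decision space of the decision vector, and $Y$ is the objective space of the objective vector. $g_i(x) \le 0\ (i = 1, 2, \ldots, p)$ defines $p$ inequality constraints, and $q$ equality constraints are defined by $h_j(x) = 0\ (j = 1, 2, \ldots, q)$. The following concepts [18,19] are often used.
Pareto dominance: $x^0$ dominates $x^1$ ($x^0 \succ x^1$) if and only if $f_i(x^0) \le f_i(x^1)$ for all $i \in \{1, 2, \ldots, m\}$ and $f_j(x^0) < f_j(x^1)$ for at least one $j \in \{1, 2, \ldots, m\}$.
Pareto optimality: $x^0$ is Pareto optimal if and only if $\neg\exists x^1 : x^1 \succ x^0$.
Pareto optimal set: The set of all Pareto optimal solutions is the Pareto set $P = \{x^0 \mid \neg\exists x^1 : x^1 \succ x^0\}$.

Problem description and formulation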

Problem description
Consider two scenarios of resource utilization. In the first, the utilization of one physical machine is far greater than that of another for a long time. Virtual machine migration [20] is usually used to balance the load in such cases, but the number of migrations should be kept as low as possible because of their high cost. This situation is described as load imbalance among physical machines. In the second, the utilization of one resource in a physical machine is much larger than that of its other resources. The physical machine then cannot satisfy further virtual machine resource requests, which wastes its remaining resources. This situation is described as internal load imbalance in a physical machine. The virtual machine placement problem is to determine the mapping between virtual machines and physical machines, and this mapping is many-to-one. This paper studies the placement of multiple virtual machines on a given number of physical machines in the general situation. The goal is to make the load among physical machines and the internal load as balanced as possible, achieving efficient and balanced utilization of physical resources so that more requests can be met with limited resources.

Problem formulation
The virtual machine set is defined as $VM = \{vm_1, vm_2, \ldots, vm_M\}$, and the physical machine set is defined as $PM = \{pm_1, pm_2, \ldots, pm_N\}$, where $M$ is the number of virtual machines and $N$ is the number of physical machines. The types of resources include CPU, memory, storage and bandwidth. The resource request vector of virtual machine $vm_i$ is defined as $R_i = (R_i^1, R_i^2, R_i^3, R_i^4)$. The available resource vector of physical machine $pm_j$ is defined as $A_j = (A_j^1, A_j^2, A_j^3, A_j^4)$. The total resource vector of physical machine $pm_j$ is defined as $S_j = (S_j^1, S_j^2, S_j^3, S_j^4)$. The resource utilization vector of physical machine $pm_j$ is defined as $U_j = (U_j^1, U_j^2, U_j^3, U_j^4)$. When all virtual machines are placed on a certain number of physical machines, the resource utilization of each physical machine forms a matrix defined as $U_{N \times 4} = (U_1, U_2, \ldots, U_N)^T$. The utilization of each dimension is defined as Eq. (2), where $d$ represents the dimension of resource.
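The average resource utilization vector of all physical machines is defined as Eq. (3).
$$\bar{U} = (\bar{U}^1, \bar{U}^2, \bar{U}^3, \bar{U}^4) = \frac{1}{N} \sum_{j=1}^{N} U_j \tag{3}$$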
To measure the load balancing degree of the physical machines in a data center comprehensively, both the load among physical machines and the internal load are considered. To reflect the degree of load balancing among the physical machines, the Outer load Balancing Degree (OBD) is defined as the average Euclidean distance between each physical machine's resource utilization vector and the average resource utilization vector of all physical machines, as shown in Eq. (4).
$$OBD = \frac{1}{N} \sum_{j=1}^{N} \sqrt{\sum_{d=1}^{4} \left(U_j^d - \bar{U}^d\right)^2} \tag{4}$$
To reflect the load balancing degree of the different resources within a physical machine, the Internal load Balancing Degree (IBD) is defined as the average of the standard deviations of the resource utilization of each physical machine, as shown in Eq. (5).
$$IBD = \frac{1}{N} \sum_{j=1}^{N} \sqrt{\frac{1}{4} \sum_{d=1}^{4} \left(U_j^d - \frac{1}{4} \sum_{e=1}^{4} U_j^e\right)^2} \tag{5}$$
Here $U_j^d$ is the utilization of resource $d$ on physical machine $pm_j$, and $\bar{U}^d$ is the $d$-th component of the average resource utilization vector of all physical machines. Based on the above analysis and parameter definitions, the problem can be formulated as follows.
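The two balance measures can be computed directly from their verbal definitions. The sketch below is illustrative only: the function names are hypothetical, the utilization matrix is assumed to be a list of per-machine rows with values in [0, 1], and the standard deviation is assumed to be the population form (dividing by the number of resource types).

```python
import math


def average_utilization(U):
    """Component-wise mean of the utilization vectors of all machines."""
    n = len(U)
    return [sum(row[d] for row in U) / n for d in range(len(U[0]))]


def obd(U):
    """Outer load balancing degree: mean Euclidean distance between each
    machine's utilization vector and the average utilization vector."""
    avg = average_utilization(U)
    return sum(math.dist(row, avg) for row in U) / len(U)


def ibd(U):
    """Internal load balancing degree: mean over machines of the standard
    deviation of that machine's per-resource utilization (population form,
    which is an assumption of this sketch)."""
    total = 0.0
    for row in U:
        mean = sum(row) / len(row)
        total += math.sqrt(sum((u - mean) ** 2 for u in row) / len(row))
    return total / len(U)


# One lightly loaded and one heavily loaded machine, each internally even:
# the outer measure is large while the internal measure is zero.
uneven_across = [[0.2, 0.2, 0.2, 0.2], [0.8, 0.8, 0.8, 0.8]]
# A single machine whose CPU and storage are idle but memory and bandwidth
# are busy: the internal measure is large while the outer measure is zero.
uneven_inside = [[0.2, 0.8, 0.2, 0.8]]
```

Smaller values of either measure indicate a better balanced placement, so both are objectives to minimize.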
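$$\min\ OBD, \quad \min\ IBD \tag{6}$$
Constraints:
$$\sum_{i=1}^{M} x_{ij} R_i^d \le A_j^d, \quad \forall j \in \{1, \ldots, N\},\ d \in \{1, 2, 3, 4\} \tag{7}$$
$$x_{ij} \in \{0, 1\}, \quad \forall i \in \{1, \ldots, M\},\ j \in \{1, \ldots, N\} \tag{8}$$
$$\sum_{j=1}^{N} x_{ij} = 1, \quad \forall i \in \{1, \ldots, M\} \tag{9}$$
$$R_i^d \le A_j^d, \quad \forall d \in \{1, 2, 3, 4\} \tag{10}$$
Here $x_{ij} = 1$ if virtual machine $vm_i$ is placed on physical machine $pm_j$ and $x_{ij} = 0$ otherwise. Eq. (6) optimizes the two objectives simultaneously. Constraints (7) and (8) indicate that, for each physical machine, the total resources of the virtual machines placed on it do not exceed its available resources. Constraints (8) and (9) indicate that each virtual machine is eventually placed on exactly one physical machine. A virtual machine can be placed on a physical machine only if Constraint (10) is satisfied. For each virtual machine, a corresponding set of candidate physical machines is established, and each physical machine in the set satisfies Constraint (10). Once a physical machine is selected, its available resources are updated, and this continues until all virtual machines are placed. A feasible solution of the problem is a mapping of all virtual machines to their corresponding physical machines.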
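The candidate set and the resource update described above can be sketched as follows. The sketch assumes that Constraint (10) means the request must not exceed the currently available amount in any resource dimension; the function names and the list-of-lists layout are hypothetical.

```python
def fits(request, available):
    """True if the request does not exceed availability in any dimension
    (the assumed reading of Constraint (10))."""
    return all(r <= a for r, a in zip(request, available))


def candidate_machines(request, available_by_pm):
    """Indices of the physical machines that can still host the request."""
    return [j for j, avail in enumerate(available_by_pm) if fits(request, avail)]


def place(request, available_by_pm, j):
    """Return a new availability table after placing the request on machine j."""
    updated = [list(avail) for avail in available_by_pm]
    updated[j] = [a - r for a, r in zip(updated[j], request)]
    return updated


# Resources are ordered as (CPU, memory, storage, bandwidth); values are made up.
available = [[4, 8, 100, 10], [1, 2, 50, 5]]
request = [2, 4, 20, 1]

first_candidates = candidate_machines(request, available)   # only machine 0 fits
available = place(request, available, first_candidates[0])
```

Recomputing `candidate_machines` after each placement keeps the candidate set consistent with the updated availability.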

Heuristic function and selection strategy
In the process of virtual machine placement, the heuristic function helps the virtual machine select an appropriate physical machine. Eq. (11) defines the matching distance between the virtual machine and the physical machine, and Eq. (12) defines the heuristic function.
$$d_{ij} = \cos\langle R_i, A_j \rangle = \frac{\sum_{d=1}^{4} R_i^d A_j^d}{\sqrt{\sum_{d=1}^{4} (R_i^d)^2}\ \sqrt{\sum_{d=1}^{4} (A_j^d)^2}} \tag{11}$$
The heuristic function is constructed this way for two reasons. On the one hand, the greater the value of $d_{ij}$, the more similar the proportions of $R_i$ and $A_j$ are across all dimensions, which makes the internal load more balanced; using the cosine of the angle between the vectors also removes the effect of the different units and scales of the resource dimensions. On the other hand, the heuristic function tends to choose an underloaded physical machine, which makes the load among physical machines more balanced.
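The matching distance can be illustrated with a short Python sketch that computes the cosine of the angle between a request vector and an availability vector. The function name and the sample vectors are hypothetical, and returning 0.0 for an all-zero vector is a choice made for this sketch, not something stated in the paper.

```python
import math


def matching_distance(request, available):
    """Cosine of the angle between the request and availability vectors.

    A value near 1 means the request consumes resources in roughly the
    same proportions as the machine has them available.
    """
    dot = sum(r * a for r, a in zip(request, available))
    norm = math.hypot(*request) * math.hypot(*available)
    # Guard against a zero vector (handling chosen for this sketch).
    return dot / norm if norm > 0 else 0.0


# Same proportions, different magnitude: the cosine is (numerically) 1.
proportional = matching_distance([1, 2, 3, 4], [2, 4, 6, 8])
# The request needs only CPU, the machine has only memory left: the cosine is 0.
mismatched = matching_distance([1, 0, 0, 0], [0, 1, 0, 0])
```

Because the cosine depends only on direction, scaling either vector does not change the result, which is why units and magnitudes of the resource dimensions drop out.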
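For virtual machine $vm_i$, ant $k$ selects the physical machine $pm_j$ from the set of candidate physical machines $Allowed_i$ with probability $p_{ij}^k$, which is defined as Eq. (13).
$$p_{ij}^k = \begin{cases} \dfrac{[\tau_{ij}(t)]^{\alpha}\,[\eta_{ij}]^{\beta}}{\sum_{l \in Allowed_i} [\tau_{il}(t)]^{\alpha}\,[\eta_{il}]^{\beta}}, & j \in Allowed_i \\ 0, & \text{otherwise} \end{cases} \tag{13}$$
In Eq. (13), $Allowed_i$ is the set of candidate physical machines of virtual machine $vm_i$, $\tau_{ij}(t)$ is the amount of pheromone between the virtual machine $vm_i$ and the physical machine $pm_j$, $\eta_{ij}$ is the heuristic function of Eq. (12), and $\alpha$ and $\beta$ are the pheromone and visibility heuristic factors. A number $r \in [0, 1)$ is generated randomly, and the first physical machine $pm_j$ for which the cumulative probability $\sum_{l \in Allowed_i,\ l \le j} p_{il}^k$ is not less than $r$ is selected.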
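The selection step can be sketched as a standard roulette wheel over weights of the form pheromone^α × heuristic^β. The function names, the default values of `alpha` and `beta`, and the sample numbers are assumptions for illustration; the heuristic values are passed in as plain numbers because Eq. (12) is not restated here.

```python
import random


def selection_probabilities(tau_row, eta_row, allowed, alpha=1.0, beta=2.0):
    """Probability of each allowed machine, proportional to tau^alpha * eta^beta.

    tau_row, eta_row -- pheromone and heuristic values indexed by machine
    allowed          -- indices of the candidate machines (assumed non-empty,
                        with at least one positive weight)
    """
    weights = {j: (tau_row[j] ** alpha) * (eta_row[j] ** beta) for j in allowed}
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}


def roulette_select(probabilities, rng=random):
    """Pick the first machine whose cumulative probability reaches r."""
    r = rng.random()  # uniform in [0, 1)
    cumulative = 0.0
    chosen = None
    for j, p in probabilities.items():
        chosen = j  # falls back to the last machine if rounding leaves a gap
        cumulative += p
        if cumulative >= r:
            break
    return chosen


# Machine 2 is not a candidate, so it gets no probability mass.
probs = selection_probabilities([1.0, 1.0, 1.0], [1.0, 3.0, 0.5], allowed=[0, 1],
                                alpha=1.0, beta=1.0)
picked = roulette_select(probs, random.Random(0))
```

Passing a seeded `random.Random` instance makes a run reproducible, which is convenient when comparing parameter settings.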

Maintenance of Pareto optimal set and pheromone updating
At the end of each cycle, the number of feasible solutions obtained is at most equal to the number of ants. Each feasible solution $S_i$ is checked by the process shown in Figure 1 to obtain a temporary Pareto optimal set, and the same method is then applied to each element of the temporary Pareto optimal set to maintain the global Pareto optimal set. $\Delta\tau_{ij}^k(t)$ is the increment of pheromone between $vm_i$ and $pm_j$ defined as Eq. (14), which considers the two objectives IBD and OBD.
The pheromone is updated after the completion of one cycle by Eq. (15).
$$\tau_{ij}(t+1) = (1 - \rho)\,\tau_{ij}(t) + \sum_{k=1}^{A} \Delta\tau_{ij}^k(t) \tag{15}$$
In Eq. (15), $\rho$ is the pheromone evaporation coefficient, and $A$ is the total number of ants.
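The maintenance of a Pareto optimal set discussed in this section can be illustrated with a common non-dominated archive update, which may differ in detail from the process in Figure 1. Objectives are assumed to be minimized and stored as `(OBD, IBD)` tuples; the function names and the decision to discard exact duplicates are assumptions of this sketch.

```python
def dominates(a, b):
    """True if objective vector a dominates b under minimization."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def update_pareto_set(archive, candidate):
    """Offer one (solution, objectives) pair to the archive.

    The candidate is rejected if an archived entry dominates it or has the
    same objectives; otherwise it is added and every entry it dominates
    is removed.
    """
    _, cand_obj = candidate
    if any(dominates(obj, cand_obj) or obj == cand_obj for _, obj in archive):
        return archive
    kept = [(sol, obj) for sol, obj in archive if not dominates(cand_obj, obj)]
    kept.append(candidate)
    return kept


archive = []
for entry in [("s1", (0.30, 0.20)), ("s2", (0.20, 0.30)),
              ("s3", (0.10, 0.10)), ("s4", (0.50, 0.50))]:
    archive = update_pareto_set(archive, entry)
```

In this run `s1` and `s2` are mutually non-dominated, `s3` then dominates and replaces both, and `s4` is rejected because `s3` dominates it. Running the same update with the entries of a per-cycle set against a global archive gives the two-level maintenance described above.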

Deterministic virtual machine placement
For multi-objective optimization problems, there is usually more than one solution. Because the weights of the objectives are not easy to determine, this paper uses the stratified sequencing method to obtain the deterministic solution. The method ranks all objectives by importance, obtains the set of optimal solutions for the most important objective, and then obtains the set of optimal solutions for each subsequent objective within the previous set, until the last objective has been processed. In selecting the deterministic solution, the objective OBD is treated as more important than the objective IBD, so with two objectives the stratified sequencing method selects the solution with minimum OBD as the deterministic solution of the problem. Figure 2 is the algorithm description. The deterministic solution is defined as S.

Input: VM, PM, α, β, ρ, A, G, C
for each cycle g = 1, ..., G
    for each ant k = 1, ..., A
        for vm in VM
            select pm for vm from its candidate set according to Eq. (13)
            update available resources of pm
            maintain the set of candidate physical machines for virtual machines
        end for
        calculate the value of objective functions
    end for
    maintain global Pareto optimal set
    update the pheromone according to Eq. (14) and Eq. (15)
end for
return S /* the deterministic solution */

VM is the virtual machine set. PM is the physical machine set. α is the pheromone heuristic factor, β is the visibility heuristic factor, and ρ is the pheromone evaporation coefficient. A is the number of ants. G is the number of cycles of the algorithm. C is the initial amount of pheromone. In each cycle, selecting a physical machine for a virtual machine costs each ant O(N), and maintaining the set of candidate physical machines costs O(M), so generating a feasible solution costs O(M(M+N)). Because the number of solutions in the Pareto optimal set is uncertain, the complexity of maintaining the Pareto optimal set is not analyzed. The algorithm generates at most G*A feasible solutions, so the complexity of generating feasible solutions is O(GAM(M+N)).
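The stratified sequencing step that picks the deterministic solution S from the Pareto optimal set can be sketched as a lexicographic minimum: minimize OBD first and use IBD only to break ties. The function name and the `(solution, (OBD, IBD))` layout are hypothetical, and treating only exactly equal OBD values as ties is an assumption of this sketch.

```python
def stratified_select(pareto_set):
    """Return the entry with minimum OBD, breaking ties by minimum IBD.

    pareto_set -- non-empty list of (solution, (obd, ibd)) pairs
    """
    return min(pareto_set, key=lambda entry: (entry[1][0], entry[1][1]))


pareto_set = [("a", (0.10, 0.30)), ("b", (0.20, 0.10))]
chosen_solution, chosen_objectives = stratified_select(pareto_set)
```

Here solution `a` is chosen because it has the smaller OBD, even though `b` has the smaller IBD; this reflects the ranking of OBD above IBD described in this section.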

Comparison of different algorithms
Each test was repeated 10 times for each instance, and the average result of the proposed multi-objective ant colony optimization algorithm (MOACO) is compared with those of the other algorithms: the two single-objective ant colony optimization algorithms (SACO-OBD and SACO-IBD), the greedy algorithm (GS) and the first fit algorithm (FF). Figure 3 shows the experimental results for the two objectives at different scales. SACO-OBD performs best on the objective OBD, but it performs poorly compared with MOACO and SACO-IBD on the objective IBD. SACO-IBD performs best on the objective IBD, but it performs poorly compared with MOACO and SACO-OBD on the objective OBD. The results of GS are similar to those of the single-objective ant colony algorithms on the objectives OBD and IBD respectively, which indicates that the heuristic information is helpful for load balancing. The results of MOACO are clearly better than those of GS and FF on both OBD and IBD. The results show that the proposed algorithm MOACO finds solutions that exhibit good balance among the objectives while the others do not.

Conclusion
For the virtual machine placement problem, this paper analyzes the influence of two resource utilization scenarios on load balancing. Two objectives are proposed to measure load balancing as comprehensively as possible. A multi-objective ant colony optimization algorithm for virtual machine placement in the general situation is proposed to balance the load by optimizing the two objectives. The proposed algorithm is compared with two single-objective ant colony optimization algorithms, a first fit algorithm and a greedy algorithm on several instances. The experimental results show that the algorithm can effectively optimize multiple objectives and achieve load balancing at different scales.

In Eq. (13), α is the pheromone heuristic factor and β is the visibility heuristic factor, which indicate the relative importance of the pheromone and the heuristic function respectively. The virtual machine $vm_i$ selects the physical machine $pm_j$ by the roulette wheel algorithm.

Figure 2. The algorithm for virtual machine placement.

Figure 3. The results of different algorithms in different scales.

Figure 1. The process for maintaining the Pareto optimal set.

Table 1. Virtual machine template

Table 2. Physical machine template