Improving the structural quality of UML class diagrams with the genetic algorithm

The problem of improving the structural quality of UML class diagrams can be formulated as an optimization problem, and the Genetic algorithm is considered capable of solving such problems. This paper focuses on how the Genetic algorithm can be applied to improving the structural quality of UML class diagrams. It develops the theme of semantically equivalent transformations of UML class diagrams during the evolutionary search. The paper proposes a structural semantics of UML class diagrams, formulates the problem of improving the structural quality of a UML class diagram during the evolutionary search, and proposes a solution to this problem based on the Genetic algorithm. It also presents the results of a computational experiment aimed at improving the structural quality of a UML class diagram with the Genetic algorithm and identifies issues for future work.


Introduction
Recent advances in Search Based Software Engineering (SBSE) have made it possible to apply evolutionary algorithms to software engineering problems, many of which can be formulated as optimization problems.
The Genetic algorithm (GA) is an evolutionary algorithm based on a natural selection mechanism.
A number of studies have investigated ways of designing object-oriented software architecture with evolutionary algorithms and UML diagram transformations. For example, in [6] a hierarchical decomposition of the system was performed. In [7] pattern-based evolutionary transformations were performed. In [8] the authors solve the class responsibility assignment problem with the Genetic algorithm. However, it remains unclear which transformations (design patterns, for example) can be performed automatically at the architecture design stage and whether these transformations are semantically equivalent.
The aim of this work is to formalize the structural semantics of UML class diagrams and to propose an algorithm for the evolutionary transformation of UML class diagrams aimed at improving their structural quality.
a Corresponding author: o.a.derugina@yandex.ru
The evolutionary transformation of UML diagrams is significant because it makes automatic refactoring of software architecture possible.
The primary contributions of this paper are:
1. To propose a structural semantics of UML class diagrams that makes it possible to check the equivalence of two UML diagrams.
2. To formalize the problem of improving the structural quality of a UML class diagram during the evolutionary search.
3. To propose a solution to this problem based on the GA.
4. To present the results of a computational experiment aimed at improving the structural quality of UML class diagrams.

Structural semantics of the UML class diagrams
In [9], a structural semantics of UML class diagrams was proposed. This semantics provides a formal way of describing UML diagram transformations (interface insertion, application of the Façade pattern).
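Then we can formulate the problem of improving the structural quality of a UML class diagram during the evolutionary search as follows. It is required to find, within $n$ generations of $m$ individuals each, a diagram $d_2$ such that: (3)
In other words, it is required to find a set of transformations $T^* \in T$ that transforms $d_1$ into $d_2$, $d_1 \xrightarrow{T^*} d_2$, such that:
$$\{T^* \mid s_{d_1} \sim s_{d_2};\ f(d_2) = \min_{d_i} f(d_i)\} \quad (4)$$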

Solution of the problem based on the GA
To use the GA, a UML class diagram $d_i$ is treated as an individual whose chromosome consists of genes, one for each class $c_j \in d_i$. Each gene stores information about whether a transformation $t_k \in T$ has been applied to the corresponding class. This approach is similar to that proposed in [7], where the authors applied design patterns to classes, but our approach additionally includes the essential check that each transformation is semantically equivalent.
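The encoding described above can be illustrated with a minimal sketch. The names `Chromosome`, `make_chromosome`, and the transformation identifier are hypothetical and chosen only for illustration; the paper does not prescribe a concrete data layout.

```python
from dataclasses import dataclass, field
from typing import Dict, Set

# Hypothetical transformation identifier, used only for this illustration.
INTERFACE_INSERTION = "interface_insertion"


@dataclass
class Chromosome:
    # One gene per class: class name -> set of transformations applied to it.
    genes: Dict[str, Set[str]] = field(default_factory=dict)


def make_chromosome(class_names):
    """Build a chromosome with one empty gene for every class of the diagram."""
    return Chromosome({name: set() for name in class_names})


# A diagram with three classes yields a chromosome with three genes.
chromosome = make_chromosome(["Order", "Customer", "Invoice"])
chromosome.genes["Order"].add(INTERFACE_INSERTION)
```

Storing a set per gene is one possible choice; a bit vector over the transformation set would serve the same purpose.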
An individual mutates when a random transformation is added to or deleted from one of its genes.
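A mutation operator of this kind might look as follows. This is a sketch under assumptions: the function name `mutate`, the transformation identifiers, and the "toggle" interpretation (add the transformation if absent, delete it if present) are illustrative, and the semantic equivalence check that the approach requires is omitted here.

```python
import random

# Hypothetical transformation identifiers, used only for this illustration.
TRANSFORMATIONS = ("interface_insertion", "facade")


def mutate(genes, p, rng):
    """Return a mutated copy of genes (class name -> set of transformations).

    With probability p, one random transformation is added to a random gene,
    or deleted from it if it is already present.
    """
    mutated = {name: set(applied) for name, applied in genes.items()}
    if mutated and rng.random() < p:
        name = rng.choice(sorted(mutated))
        transformation = rng.choice(TRANSFORMATIONS)
        if transformation in mutated[name]:
            mutated[name].discard(transformation)
        else:
            mutated[name].add(transformation)
    return mutated


rng = random.Random(0)
original = {"Order": set(), "Customer": set()}
changed = mutate(original, 1.0, rng)    # p = 1.0: a mutation always happens
unchanged = mutate(original, 0.0, rng)  # p = 0.0: a mutation never happens
```

The operator copies the genes instead of modifying them in place, so the parent individual stays available for selection.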
To simplify the in-memory representation of UML class diagrams for processing by the GA, the abstract data structure (ADS) UML Map was proposed in [10]. UML Map is based on hash maps; therefore, the search complexity is estimated as $O(1)$.
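The idea of a hash-map-based diagram representation can be sketched as below. This is not the UML Map of [10], whose layout is not described here; the class `DiagramIndex` and its methods are hypothetical and only show why lookups by class name take constant time on average.

```python
from collections import defaultdict


class DiagramIndex:
    """Hypothetical hash-map index of a class diagram (not the UML Map of [10])."""

    def __init__(self):
        self.classes = {}                   # class name -> attribute dict
        self.relations = defaultdict(set)   # class name -> {(kind, other class)}

    def add_class(self, name, **attributes):
        self.classes[name] = attributes

    def add_relation(self, source, target, kind):
        # Store the relation on both ends so each class can be queried directly.
        self.relations[source].add((kind, target))
        self.relations[target].add((kind, source))

    def neighbours(self, name):
        """Classes related to the given class; an average O(1) hash lookup."""
        return {other for _, other in self.relations.get(name, ())}


index = DiagramIndex()
index.add_class("Order")
index.add_class("Customer")
index.add_class("Invoice")
index.add_relation("Order", "Customer", "association")
```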
A scheme of applying the GA to improve the structural quality of a UML class diagram is shown in Figure 2.
A computational experiment was conducted with a population of 200 individuals, a mutation probability of 0.5, and 25% of the population participating in crossover. The resulting dynamics of $f(d_1)$ is shown in Figure 5. At the 100 000th iteration, an individual $d_2$ with the lowest value, CBO = 0.941, was obtained; it is shown in Figure 5. The semantic value $s_2$ of $d_2$ is given in (8). From (8) it follows that $s_1 \sim s_2$, so the obtained diagram $d_2$ is semantically equivalent to the diagram $d_1$.
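The overall search loop can be sketched on a toy problem. Everything here is an assumption made for illustration: the function `evolve`, the reading of "crossover participants" as the best 25% of the population, the elitist replacement scheme, and the bit-tuple individuals. The semantic equivalence check on each candidate is omitted; in the approach described here it would reject non-equivalent diagrams before they enter the population.

```python
import random


def evolve(initial, fitness, mutate, crossover, m=200, p=0.5, share=0.25,
           n_max=1000, k_max=50, seed=0):
    """Minimise fitness with a simple GA.

    m: population size; p: mutation probability; share: fraction of the
    population taking part in crossover; n_max: maximum number of generations;
    k_max: maximum number of generations without improvement.
    """
    rng = random.Random(seed)
    population = [initial] * m
    best, stale = initial, 0
    for _ in range(n_max):
        population.sort(key=fitness)
        parents = population[:max(2, int(m * share))]
        children = []
        while len(children) < m - len(parents):
            a, b = rng.sample(parents, 2)
            child = crossover(a, b, rng)
            if rng.random() < p:
                child = mutate(child, rng)
            children.append(child)
        population = parents + children
        candidate = min(population, key=fitness)
        if fitness(candidate) < fitness(best):
            best, stale = candidate, 0
        else:
            stale += 1
            if stale >= k_max:
                break
    return best


def flip(bits, rng):
    """Toy mutation: flip one random bit of a tuple of 0/1 values."""
    i = rng.randrange(len(bits))
    return bits[:i] + (1 - bits[i],) + bits[i + 1:]


def one_point(a, b, rng):
    """Toy crossover: single cut point."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


start = (1,) * 8
best = evolve(start, sum, flip, one_point, m=20, n_max=200, k_max=30)
```

In the toy run the fitness is the number of ones in the tuple; for class diagrams it would be replaced by a structural quality metric such as the average CBO.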

Discussion
The results presented here demonstrate that the GA has the potential to refactor UML class diagrams automatically, but further studies are needed. A limitation of the present research is that only one type of transformation was applied. The rate of convergence might have decreased if the experiment had been carried out with a significantly larger set of transformations $T$, and it would be beneficial to investigate this further [11].
In addition, the current study did not investigate the qualitative effect of different transformations on the values of various structure quality parameters.
Industrial Control Systems: Analysis, Modeling and Computation 03003-p.3

Figure 1. The Interface insertion transformation

Let us formulate $f(d_i)$ in accordance with the Low Coupling principle. For example, we can use the CBO (Coupling Between Objects) metric:
$$f(d_i) = CBO_i = \frac{1}{n}\sum_{j=1}^{n} CBO_{ij} \quad (5)$$
where $CBO_i$ is the average CBO value of a class diagram $d_i$ with $n$ classes $c_j$, $j \in 1..n$, which reflects the degree of dependence between the components of the system; $CBO_{ij}$ is the CBO value of the class $c_j \in d_i$, that is, the number of classes connected with $c_j$, not counting dependency relations; $n$ is the number of classes in the diagram $d_i$.
In the scheme of Figure 2, $f(d)$ is the fitness function; $M$ is the population size; $N$ is the maximum number of populations; $p$ is the mutation probability; $K$ is the maximum number of populations without improvement of $f(d)$.
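The average CBO fitness described above can be computed as in the following sketch. The function name `average_cbo`, the triple format of the relations, and the relation kind labels are assumptions made for illustration; ignoring self-relations is likewise an assumption.

```python
def average_cbo(classes, relations):
    """Average CBO of a diagram.

    classes: iterable of class names.
    relations: iterable of (source, target, kind) triples.
    Dependency relations are not counted, following the definition in the text.
    """
    coupled = {name: set() for name in classes}
    for source, target, kind in relations:
        if kind == "dependency" or source == target:
            continue  # skip dependencies and (by assumption) self-relations
        coupled[source].add(target)
        coupled[target].add(source)
    if not coupled:
        return 0.0
    return sum(len(others) for others in coupled.values()) / len(coupled)


classes = ["Order", "Customer", "Invoice"]
relations = [
    ("Order", "Customer", "association"),
    ("Customer", "Invoice", "generalization"),
    ("Order", "Invoice", "dependency"),
]
value = average_cbo(classes, relations)  # (1 + 2 + 1) / 3
```

Here Order and Invoice each have one coupled class and Customer has two, so the average is 4/3; the dependency between Order and Invoice does not contribute.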

Figure 2. Applying the GA in order to improve the structural quality of the UML class diagram