Constructing the Optimal Elimination Order for the Variable Elimination Method in Bayesian Networks

Variable Elimination (VE) is the most basic of the many Bayesian network inference algorithms. Its speed and complexity depend mainly on the elimination order. Finding the optimal elimination order is an NP-hard (nondeterministic polynomial-time hard) problem, which in practice is usually tackled by heuristic search. To speed up inference with the variable elimination method, four heuristics are studied: minimum degree search, maximum potential search, minimum missing edge search, and minimum increased complexity search. Taking the Asian network as an example, the complexity and the elimination order produced by each search method are analyzed and calculated. Each search method was then implemented in MATLAB R2018a and used for inference. Finally, the performance of the four search methods was compared by analyzing inference time. The experimental results show that the minimum increased complexity search method outperforms the other search methods, with the lowest average inference time, 0.012 s, and can therefore speed up the reasoning process of a Bayesian network.


Introduction
The Bayesian network is derived from the uncertain knowledge representation model proposed by Pearl [1]. Because it rests on the rigorous foundation of probability theory and its graph model clearly shows the interdependence between random events, it has become a research hotspot in artificial intelligence, pattern recognition, and fault diagnosis [2]. There are two main kinds of Bayesian network reasoning: exact reasoning and approximate reasoning. Exact reasoning mainly includes the variable elimination method, the clique tree (junction tree) propagation algorithm, polytree propagation, and graph reduction algorithms. Among them, the variable elimination method is the most basic reasoning algorithm and the easiest to understand. Its advantages are generality and simplicity: it can handle reasoning in complex Bayesian networks, in particular complex fault diagnosis networks.
The complexity of the variable elimination method depends mainly on the elimination order. Finding the optimal elimination order is an NP-hard problem [3], so approximate algorithms are commonly used, mainly minimum degree search [4], maximum potential search [2,5], and minimum deficiency (minimum missing edge) search [2]. Xiang Guang-jun analyzed the variable elimination method and proposed an algorithm that runs minimum deficiency search and variable elimination in parallel [9], but the algorithm does not find the optimal elimination order and still has certain defects. Gao Wen-yu proposed a new search algorithm, minimum increased complexity search [6], but it was demonstrated only through simulation experiments and was not applied to a worked example; the specific implementation process of the search method was not given, and it was not compared with the search methods above. To address these problems, this paper uses the Asian network, a classic Bayesian network model, to evaluate the strengths and weaknesses of the above search algorithms within the variable elimination method.

Bayesian network
The Bayesian network is a directed acyclic graph in which the nodes represent random variables and the directed edges represent the dependencies between them. Each node carries a probability distribution: a root node carries a marginal distribution, also called a prior probability, and a non-root node carries a conditional probability distribution given its parents.
In probability theory, the joint probability can be expressed as a chain of conditional probabilities [7]:
$$P(Y_1, Y_2, \dots, Y_n) = P(Y_1)\, P(Y_2 \mid Y_1) \cdots P(Y_n \mid Y_1, \dots, Y_{n-1}) \tag{1}$$
Formula (1) is also called the chain rule, where each $Y_i$ is a random variable, $P(Y_1, Y_2, \dots, Y_n)$ is the joint probability of $Y_1, \dots, Y_n$, and $P(Y_n \mid Y_1, \dots, Y_{n-1})$ is the conditional probability of $Y_n$ given $Y_1, \dots, Y_{n-1}$. Applying the conditional independence assumption of the Bayesian network to the chain rule (1) gives:
$$P(Y_1, Y_2, \dots, Y_n) = \prod_{i=1}^{n} P\big(Y_i \mid Parent(Y_i)\big) \tag{2}$$
Here $P(Y_i \mid Parent(Y_i))$ is the conditional probability of the Bayesian network node $Y_i$, and $Parent(Y_i)$ denotes the parent nodes directly connected to $Y_i$.
For the following discussion, a classic Bayesian network model, the Asian network [8], is introduced. Smoking may cause lung cancer or bronchitis, and travel to Asia may cause tuberculosis. Any of these three diseases may cause breathing difficulties. If a patient has tuberculosis or lung cancer, the chest X-ray result may be positive. Combining these causal relationships gives the classic Asian network model shown in Figure 1, where V indicates a visit to Asia, T indicates tuberculosis, S indicates smoking, L indicates lung cancer, B indicates bronchitis, E indicates tuberculosis or lung cancer, X indicates the chest X-ray result, and D indicates breathing difficulties. Using formula (2) to express the joint probability distribution of the nodes in Figure 1, the following formula can be obtained:

$$P(V, T, S, L, B, E, X, D) = P(V)\, P(S)\, P(T \mid V)\, P(L \mid S)\, P(B \mid S)\, P(D \mid B, E)\, P(X \mid E)\, P(E \mid T, L) \tag{3}$$
Table 1 shows the node probabilities of the Asian network.
From the above description, if the first elimination method is adopted, the computational complexity grows exponentially with the number n of variables. If the second elimination method is adopted, the computational complexity depends only on the number of variables that appear in the factors $f_1, f_2, \dots, f_k$ involving $X_1$, and it is greatly reduced.
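As a concrete reading of formula (3), the following Python sketch multiplies the eight local distributions and checks that the result is a valid joint distribution. The numeric values are the ones commonly quoted for this network in the literature and are an assumption here, since Table 1 is not reproduced; E is modelled as the deterministic OR of T and L, following the description of the network. The helper names `bern` and `joint` are hypothetical, and Python is used only for illustration, not as the MATLAB implementation used in the experiments.

```python
from itertools import product

# Assumed CPT values (commonly quoted for this network; Table 1 may differ).
P_V1 = 0.01                                   # P(V=1)
P_S1 = 0.50                                   # P(S=1)
P_T1_GIVEN_V = {1: 0.05, 0: 0.01}             # P(T=1 | V)
P_L1_GIVEN_S = {1: 0.10, 0: 0.01}             # P(L=1 | S)
P_B1_GIVEN_S = {1: 0.60, 0: 0.30}             # P(B=1 | S)
P_X1_GIVEN_E = {1: 0.98, 0: 0.05}             # P(X=1 | E)
P_D1_GIVEN_BE = {(1, 1): 0.9, (1, 0): 0.8,    # P(D=1 | B, E)
                 (0, 1): 0.7, (0, 0): 0.1}


def bern(p_true, value):
    """Probability of a binary value when P(value = 1) = p_true."""
    return p_true if value == 1 else 1.0 - p_true


def joint(v, t, s, l, b, e, x, d):
    """Joint probability as the product of local distributions, formula (3)."""
    p_e1 = 1.0 if (t == 1 or l == 1) else 0.0  # E is the logical OR of T and L
    return (bern(P_V1, v) * bern(P_S1, s)
            * bern(P_T1_GIVEN_V[v], t)
            * bern(P_L1_GIVEN_S[s], l)
            * bern(P_B1_GIVEN_S[s], b)
            * bern(P_D1_GIVEN_BE[(b, e)], d)
            * bern(P_X1_GIVEN_E[e], x)
            * bern(p_e1, e))


# Summing the joint over all 2^8 assignments should give 1.
total = sum(joint(*values) for values in product((0, 1), repeat=8))
print(abs(total - 1.0) < 1e-9)
```

The full joint table has 256 entries even for this small network, which is why summing variables out of the complete joint quickly becomes impractical as n grows.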
Suppose X denotes the set of all variable nodes in a Bayesian network N, and f is the set of probability distributions of all nodes in N. According to the definition of a Bayesian network, f can be regarded as a decomposition of the joint probability distribution $P(X)$ represented by N. Suppose a set E of evidence variables is observed, and their values are denoted e. In the factors of f, set all the evidence variables to their observed values; this gives another set of functions, denoted $f'$. This step is called setting the evidence. Let $Y = X \setminus E$ be the unobserved variables; then $f'$ is a decomposition of the function $P(Y, E = e)$. Let Q be a subset of Y, and eliminate from $f'$ all the variables that are in Y but not in Q to obtain another set of functions, denoted $f''$. Multiplying all the factors in $f''$ gives $P(Q, E = e)$. According to the definition of conditional probability, formula (4) can be obtained:
$$P(Q \mid E = e) = \frac{P(Q, E = e)}{\sum_{Q} P(Q, E = e)} \tag{4}$$
The above process eliminates variables step by step to reduce the amount of calculation, and is referred to as the VE algorithm [2,9,10].
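The VE algorithm requires five inputs: (1) the Bayesian network N, which is also a decomposition of the joint distribution $P(X)$; (2) the set E of evidence variables; (3) the value e of the evidence variable set E; (4) the set Q of query variables; (5) the elimination order $\rho$ of the variables to be eliminated. The algorithm itself is given in Table 2.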
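The procedure described above (set the evidence, eliminate the non-query variables in a given order, multiply what remains, normalize) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the algorithm of Table 2 or the MATLAB code used in the experiments: all variables are binary, a factor is a hypothetical pair `(variables, table)`, and the three-node chain A → B → C with its probabilities is invented purely for the demonstration.

```python
from itertools import product


def make_factor(variables, fn):
    """Tabulate fn over all binary assignments of `variables`."""
    variables = tuple(variables)
    table = {vals: fn(dict(zip(variables, vals)))
             for vals in product((0, 1), repeat=len(variables))}
    return variables, table


def multiply(f, g):
    """Pointwise product of two factors over the union of their variables."""
    (fv, ft), (gv, gt) = f, g
    out_vars = fv + tuple(v for v in gv if v not in fv)
    return make_factor(out_vars, lambda a: ft[tuple(a[v] for v in fv)]
                       * gt[tuple(a[v] for v in gv)])


def sum_out(f, z):
    """Eliminate variable z from factor f by summation."""
    fv, ft = f
    table = {}
    for vals, p in ft.items():
        key = tuple(x for v, x in zip(fv, vals) if v != z)
        table[key] = table.get(key, 0.0) + p
    return tuple(v for v in fv if v != z), table


def set_evidence(f, evidence):
    """Fix the evidence variables of f to their observed values."""
    fv, ft = f
    keep = tuple(v for v in fv if v not in evidence)
    table = {}
    for vals, p in ft.items():
        a = dict(zip(fv, vals))
        if all(a[v] == evidence[v] for v in fv if v in evidence):
            table[tuple(a[v] for v in keep)] = p
    return keep, table


def elim(factors, z):
    """Elim(f, Z): multiply the factors that contain z, then sum z out."""
    touching = [f for f in factors if z in f[0]]
    rest = [f for f in factors if z not in f[0]]
    if not touching:
        return rest
    g = touching[0]
    for f in touching[1:]:
        g = multiply(g, f)
    return rest + [sum_out(g, z)]


def variable_elimination(factors, evidence, query, order):
    """Return P(query | evidence) as (variables, normalized table)."""
    factors = [set_evidence(f, evidence) for f in factors]
    for z in order:
        if z not in query and z not in evidence:
            factors = elim(factors, z)
    h = factors[0]
    for f in factors[1:]:
        h = multiply(h, f)
    norm = sum(h[1].values())
    return h[0], {k: p / norm for k, p in h[1].items()}


def bern(p_true, value):
    return p_true if value == 1 else 1.0 - p_true


# Invented chain A -> B -> C, used only to exercise the functions.
p_a = make_factor("A", lambda a: bern(0.3, a["A"]))
p_b = make_factor("BA", lambda a: bern(0.9 if a["A"] else 0.2, a["B"]))
p_c = make_factor("CB", lambda a: bern(0.8 if a["B"] else 0.1, a["C"]))

post_vars, posterior = variable_elimination(
    [p_a, p_b, p_c], evidence={"C": 1}, query={"A"}, order=["B"])
print(post_vars, posterior)
```

The dictionary-based tables keep the sketch short and readable; an implementation aimed at speed would more likely use multidimensional arrays.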

Analysis of elimination complexity
It can be seen from the implementation of the variable elimination method that the most time-consuming and memory-consuming step is the call to the function Elim(f, Z). The amount of calculation in this function far exceeds that of the other steps, so the computational complexity of the entire variable elimination method is mainly determined by this function.
The function Elim(f, Z) picks out from f all the functions $\{f_1, \dots, f_k\}$ that contain Z, multiplies them to obtain the intermediate function G, and then eliminates Z from G. Let $X_1, \dots, X_l$ be all the variables in G other than Z. If each function is stored as a multidimensional table, the number of function values that must be stored for G is
$$|Z| \prod_{i=1}^{l} |X_i|,$$
where $|X|$ denotes the number of states of the variable X. This quantity is taken as the cost of eliminating Z.
When the variable elimination method is used for reasoning, several variables have to be eliminated; denote them, in order, as $Z_1, Z_2, \dots, Z_n$. The total elimination cost is the sum of the costs of the n individual eliminations.
The cost can be read from the structure graph. Consider a set of functions $F = \{f_1, \dots, f_m\}$. The structure graph of F is built starting from an empty graph. For each variable in F, add a node to the graph. For any two variables X and Y, if they appear together in the same factor $f_i$, add an edge between their corresponding nodes. The graph constructed according to this rule is the structure graph. Figure 2 shows the structure graph of the Asian network.
In the structure graph of the current set of functions, $\{X_1, \dots, X_l\}$ is exactly the set of all nodes adjacent to Z, denoted nb(Z), so the cost of eliminating Z is
$$|Z| \prod_{X \in nb(Z)} |X|.$$
In the Asian network, eliminating the variable T from f requires calculating $\sum_{T} P(T \mid V)\, P(E \mid T, L)$, which is a function of V, L, and E. In the structure graph, the nodes adjacent to T are likewise V, L, and E.
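The construction rule for the structure graph and the cost of a single elimination can be illustrated with a short Python sketch. The factor scopes are taken from formula (3); the assumption that every variable is binary, and the names `structure_graph` and `elimination_cost`, are introduced here only for illustration.

```python
from itertools import combinations
from math import prod

# Scopes of the eight factors in formula (3): P(V), P(S), P(T|V), P(L|S),
# P(B|S), P(D|B,E), P(X|E), P(E|T,L). Each letter is one variable.
FACTOR_SCOPES = ["V", "S", "TV", "LS", "BS", "DBE", "XE", "ETL"]


def structure_graph(scopes):
    """Connect every pair of variables that appear in the same factor."""
    adj = {}
    for scope in scopes:
        for v in scope:
            adj.setdefault(v, set())
        for a, b in combinations(scope, 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def elimination_cost(adj, z, card):
    """Table size |Z| * prod(|X| for X in nb(Z)) of the intermediate function."""
    return card[z] * prod(card[x] for x in adj[z])


adj = structure_graph(FACTOR_SCOPES)
card = {v: 2 for v in adj}               # assumption: all variables are binary

print(sorted(adj["T"]))                  # neighbors of T
print(elimination_cost(adj, "T", card))  # 2 * 2 * 2 * 2 = 16
```

Note that the graph contains the edges T-L and B-E even though Figure 1 has no arrows between those nodes, because each pair shares a factor.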

Minimal degree method
In the structure graph, the minimum degree search method appends the node with the smallest degree to the elimination sequence and deletes that node, together with the edges connected to it, from the graph. If several nodes have the same smallest degree, any one of them may be chosen. The process is repeated until all the nodes in the structure graph have been eliminated. This elimination method is called the minimum degree method.
Taking the Asian network model as an example, set the evidence nodes to X=1 and E=1, where 1 means that the event is true, and find the posterior probabilities of V, T, S, L, B, and D in turn. For V, this means computing $P(V \mid X=1, E=1)$. The minimum degree search method is used to find the elimination order, and the variables D, L, S, and T are then eliminated in turn. Eliminating L, for example, gives
$$\phi(S, T) = \sum_{L} P(E=1 \mid L, T)\, P(L \mid S).$$
Finally, the maximum a posteriori hypothesis for V is calculated. The above process is the inference calculation process of the variable elimination method.
The Asian network is constructed in MATLAB R2018a, and $P(V \mid X=1, E=1)$ is computed, which takes 0.0750 s (the inference calculations for the other search methods follow the same procedure and are not repeated). The elimination orders for the remaining query nodes, starting with T, are obtained in the same way, and their inference times are 0.0239 s, 0.0127 s, 0.0164 s, 0.0146 s, and 0.0264 s. The average inference time over the six nodes is 0.0282 s.
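A minimal Python sketch of the minimum degree rule, as it is described here, is given below. It repeatedly removes the node of smallest degree together with its edges; many formulations of this heuristic additionally connect the neighbors of the removed node, which this sketch does not do. Breaking ties alphabetically is an assumption made for reproducibility (any tied node may be chosen), so the resulting order need not coincide with the one used in the experiments.

```python
from itertools import combinations

# Factor scopes of the network, one letter per variable, as in formula (3).
FACTOR_SCOPES = ["V", "S", "TV", "LS", "BS", "DBE", "XE", "ETL"]


def structure_graph(scopes):
    """Connect every pair of variables that appear in the same factor."""
    adj = {}
    for scope in scopes:
        for v in scope:
            adj.setdefault(v, set())
        for a, b in combinations(scope, 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def min_degree_order(adj):
    """Repeatedly pick the node of smallest degree (ties: alphabetical)."""
    adj = {v: set(nb) for v, nb in adj.items()}   # work on a copy
    order = []
    while adj:
        z = min(adj, key=lambda v: (len(adj[v]), v))
        for u in adj.pop(z):                      # delete z and its edges
            adj[u].discard(z)
        order.append(z)
    return order


order = min_degree_order(structure_graph(FACTOR_SCOPES))
print(order)
```

For a particular query, the evidence variables and the query variable would simply be skipped when walking through this order.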

Maximum potential search
The maximum potential search numbers all the nodes of an undirected graph according to the following rule. In an undirected graph with n nodes, at step i, select an unnumbered node that has the largest number of already numbered neighbors, and assign it the number n-i+1. If there are several such nodes, choose any one of them. When all the nodes have been numbered, sort them by number in ascending order; this is the elimination order of the corresponding nodes. This is the general idea of the maximum potential search method.
Applying the maximum potential search to the Asian network: in the first step, node B is numbered 8. In the second step either S or D can be numbered; here D is numbered 7. In the third step S is numbered 6, in the fourth step L is numbered 5, and in the fifth step E is numbered 4. In the sixth step either T or X can be numbered; here T is numbered 3. In the seventh step V is numbered 2, and in the eighth step the last node, X, is numbered 1. All nodes are now numbered, and sorting by number gives the elimination order $\langle X, V, T, E, L, S, D, B \rangle$. Excluding the evidence nodes X and E gives the elimination order for the remaining nodes; when querying V, for example, the variables T, L, S, D, and B are eliminated in that order.
The inference times of the six nodes are 0.0258 s, 0.0126 s, 0.0118 s, 0.0126 s, 0.0118 s, and 0.0146 s. The average inference time is 0.0149 s.
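The numbering rule can be sketched in Python as follows. Two choices in this sketch are assumptions rather than part of the method: the undirected graph is taken to be Figure 1 with the arrow directions dropped, because that graph yields exactly the tied candidates named in the walkthrough, and ties are resolved by a caller-supplied `priority` string, set here so that the arbitrary choices of the walkthrough (B first, then D, then S, and so on) are reproduced.

```python
# Undirected version of the edges in Figure 1 (assumption, see lead-in).
EDGES = [("V", "T"), ("S", "L"), ("S", "B"), ("T", "E"),
         ("L", "E"), ("E", "X"), ("E", "D"), ("B", "D")]


def build_graph(edges):
    """Adjacency sets of an undirected graph given as a list of edges."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def max_potential_order(adj, priority):
    """Number nodes n, n-1, ..., 1; return them sorted by ascending number.

    At each step the unnumbered node with the most numbered neighbors is
    chosen; among tied nodes, the one listed first in `priority` wins.
    """
    n = len(adj)
    number = {}
    for i in range(1, n + 1):
        candidates = [v for v in priority if v not in number]
        best = max(candidates,
                   key=lambda v: sum(u in number for u in adj[v]))
        number[best] = n - i + 1
    return sorted(number, key=number.get), number


adj = build_graph(EDGES)
order, number = max_potential_order(adj, priority="BDSLETVX")
print(number)
print(order)

# Elimination order for the query V with evidence on X and E.
query_order = [v for v in order if v not in {"X", "E", "V"}]
print(query_order)
```

With a different tie-breaking priority the numbering, and hence the elimination order, can change, which is why the text lists the alternatives at the tied steps.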

Minimal missing edge search
The number of missing edges of a node Z in the undirected graph is the number of edges that must be added when Z is eliminated with the function Elim(f, Z), that is, the number of pairs of neighbors of Z that are not yet connected to each other. The minimum missing edge search recomputes the number of missing edges of every remaining node at each step and deletes from the undirected graph the node with the smallest number of missing edges. The inference times of the six nodes are 0.0150 s, 0.0122 s, 0.0119 s, 0.0119 s, 0.0121 s, and 0.0125 s. The average inference time is 0.0126 s.
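The following Python sketch illustrates the minimum missing edge rule on the structure graph built from the factor scopes of formula (3). When a node is eliminated, its neighbors are first connected pairwise and the node is then removed. Alphabetical tie-breaking is an assumption made for reproducibility, so the printed order is one valid outcome rather than necessarily the order used in the experiments.

```python
from itertools import combinations

# Factor scopes of the network, one letter per variable, as in formula (3).
FACTOR_SCOPES = ["V", "S", "TV", "LS", "BS", "DBE", "XE", "ETL"]


def structure_graph(scopes):
    """Connect every pair of variables that appear in the same factor."""
    adj = {}
    for scope in scopes:
        for v in scope:
            adj.setdefault(v, set())
        for a, b in combinations(scope, 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def missing_edges(adj, z):
    """Pairs of neighbors of z that are not adjacent to each other."""
    return [(a, b) for a, b in combinations(sorted(adj[z]), 2)
            if b not in adj[a]]


def eliminate(adj, z):
    """Add the missing edges among z's neighbors, then delete z."""
    for a, b in missing_edges(adj, z):
        adj[a].add(b)
        adj[b].add(a)
    for u in adj.pop(z):
        adj[u].discard(z)


def min_missing_edge_order(adj):
    """Repeatedly eliminate the node with the fewest missing edges."""
    adj = {v: set(nb) for v, nb in adj.items()}   # work on a copy
    order = []
    while adj:
        z = min(adj, key=lambda v: (len(missing_edges(adj, v)), v))
        eliminate(adj, z)
        order.append(z)
    return order


graph = structure_graph(FACTOR_SCOPES)
initial_missing = {v: len(missing_edges(graph, v)) for v in sorted(graph)}
order = min_missing_edge_order(graph)
print(initial_missing)
print(order)
```

In this run only one edge ever has to be added (when the four remaining nodes form a cycle), which reflects how sparse the network is.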

Minimal increase in complexity search
The main idea is as follows. When a node is eliminated in the structure graph, all edges connected to that node are deleted. To reduce the complexity of the graph, as many edges as possible should be deleted; the number of deleted edges is denoted $e_d$. At the same time, the neighbors of the eliminated node must be made into a complete subgraph, so some edges may have to be added between them. The added edges increase the complexity of the graph, so as few as possible should be added; the number of added edges is denoted $e_a$. The ratio $e_a / e_d$ gives a new measure, the increased complexity, denoted IC.
When selecting the elimination order, first calculate the IC values of all nodes and eliminate the node with the smallest IC value; if several nodes share the smallest value, choose any one of them. Then recalculate the IC values of the remaining nodes and repeat until all nodes have been eliminated, which yields the elimination order. The minimum increased complexity search is thus a dynamic search process: starting from the entire network graph, it repeatedly re-evaluates the remaining nodes to find a good elimination order, which reduces the overall running time of reasoning and avoids the slowness of eliminating part of the graph and then comparing globally. The algorithm is suitable for fault diagnosis tasks in which multiple query variables must be handled, where it can reduce the overall running time.
Taking the Asian network as an example and using the minimum increased complexity search method, Table 3 lists the IC values of the nodes in the initial state.
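The IC measure and the resulting search can be sketched in Python as follows. The graph is the structure graph built from the factor scopes of formula (3). Three details are assumptions of this sketch and not part of the published method: the IC of a node with no remaining edges is taken to be 0, ties are resolved by a caller-supplied `priority` string, and that string is chosen here so that the arbitrary tie choices match the elimination order reported for this method.

```python
from itertools import combinations

# Factor scopes of the network, one letter per variable, as in formula (3).
FACTOR_SCOPES = ["V", "S", "TV", "LS", "BS", "DBE", "XE", "ETL"]


def structure_graph(scopes):
    """Connect every pair of variables that appear in the same factor."""
    adj = {}
    for scope in scopes:
        for v in scope:
            adj.setdefault(v, set())
        for a, b in combinations(scope, 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def added_edges(adj, z):
    """Edges that must be added so that the neighbors of z are complete."""
    return [(a, b) for a, b in combinations(sorted(adj[z]), 2)
            if b not in adj[a]]


def increased_complexity(adj, z):
    """IC = e_a / e_d; defined as 0 for an isolated node (assumption)."""
    e_d = len(adj[z])
    return len(added_edges(adj, z)) / e_d if e_d else 0.0


def min_ic_order(adj, priority):
    """Repeatedly eliminate the node with the smallest IC value."""
    adj = {v: set(nb) for v, nb in adj.items()}   # work on a copy
    order = []
    while adj:
        candidates = [v for v in priority if v in adj]
        z = min(candidates, key=lambda v: increased_complexity(adj, v))
        for a, b in added_edges(adj, z):          # complete the neighbors
            adj[a].add(b)
            adj[b].add(a)
        for u in adj.pop(z):                      # delete z and its edges
            adj[u].discard(z)
        order.append(z)
    return order


graph = structure_graph(FACTOR_SCOPES)
initial_ic = {v: increased_complexity(graph, v) for v in sorted(graph)}
order = min_ic_order(graph, priority="VDXTSLBE")
print(initial_ic)
print(order)
```

Because IC is a ratio, a node that needs a few added edges but removes many can still be preferred over a node of low degree, which is the difference from the minimum missing edge rule.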
Once V, D, X, and T have been eliminated, the IC values of the remaining nodes S, L, B, and E are all 0.5, so any one of them can be selected for elimination. Here node S is selected; the remaining nodes L, B, and E then form a triangle, so they are equivalent and can be eliminated in any order. Finally, the elimination order $\rho = \langle V, D, X, T, S, L, B, E \rangle$ is obtained. The inference times of the six nodes are 0.0119 s, 0.0130 s, 0.0124 s, 0.0116 s, 0.0115 s, and 0.0116 s. The average inference time is 0.012 s. Figure 3 compares the average time consumption of the four search methods. It can be clearly seen that the minimum increased complexity search runs faster than the other three search methods.
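The reported averages can be checked directly from the per-node inference times given in the text. The short Python sketch below does only that arithmetic; the dictionary keys are labels chosen here for readability.

```python
from statistics import mean

# Per-node inference times in seconds, as reported for each search method.
times = {
    "minimum degree": [0.0750, 0.0239, 0.0127, 0.0164, 0.0146, 0.0264],
    "maximum potential": [0.0258, 0.0126, 0.0118, 0.0126, 0.0118, 0.0146],
    "minimum missing edge": [0.0150, 0.0122, 0.0119, 0.0119, 0.0121, 0.0125],
    "minimum increased complexity": [0.0119, 0.0130, 0.0124, 0.0116,
                                     0.0115, 0.0116],
}

averages = {name: mean(ts) for name, ts in times.items()}
fastest = min(averages, key=averages.get)

for name, avg in averages.items():
    print(f"{name}: {avg:.4f} s")
print("fastest:", fastest)
```

The computed means agree with the averages quoted in the text (0.0282 s, 0.0149 s, 0.0126 s, and 0.0120 s). Most of the gap for the minimum degree method comes from its first query, which takes 0.0750 s.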

Conclusion
Bayesian networks have a wide range of application prospects in artificial intelligence, fault diagnosis, and other fields. The speed of calculation directly determines whether the method can be applied in these fields. Therefore, this paper takes the Asian network model as an example and discusses how the variable elimination method is carried out in a worked reasoning example. The Asian network is constructed and inferred in MATLAB R2018a, and the four methods for constructing the elimination order of the variable elimination method are compared through experiments. Experiments show that the