Statistical Analysis of Health Inequalities in European Countries

Efficiently functioning health systems are a prerequisite for high-quality health care and healthy life expectancy. Health care management at all levels requires a great deal of information that can be obtained only through relevant analyses of health data. A large number of databases, containing an enormous number of indicators on health status, health expenditures and the functioning of health systems, are collected, regularly updated and published online at the regional and national levels, for EU member countries, for OECD countries and worldwide. Paradoxically, the sheer extent of these data sets means that, without at least basic statistical analysis, they provide very little information. Advanced statistical methods aimed at reducing dimensionality and quantifying causal relationships can add significant informational value. The objective of this article is to analyse causal relationships between health status, health expenditures and health care resources in selected European countries, and to identify determinants of health inequalities in European countries by applying multidimensional statistical methods.


Introduction
A quality health care system is a priority for the citizens of every country and a precondition for economic prosperity. Significant differences in health status exist between European countries and regions. Health inequalities exist along many demographic and social dimensions, including sex, age, geographic area and socio-economic status.
The Europe 2020 strategy, which aims to deliver smart, sustainable and inclusive growth with high levels of employment, productivity and social cohesion, is the main vehicle for achieving this. Europe 2020 sets targets against which progress will be measured and emphasises that a major effort is needed to reduce health inequalities so that everybody can benefit from economic growth. [1] Actions to improve health are an important part of two of the seven flagship initiatives that contribute to implementing Europe 2020. Achieving the Europe 2020 targets, particularly the target of reducing by 20 million the number of people in or at risk of poverty and social exclusion, will contribute substantially to a more equitable distribution of health. [2] Over the last century, average health status has improved in Europe. However, these gains are not evenly distributed across countries or across social groups within the same country. Health inequities can be observed in both higher and lower income countries across the European Region. Despite the improvements in health status in European countries, important questions remain about how successful countries are in achieving the Europe 2020 targets on different dimensions of health system performance. Answering these questions is by no means an easy task. The aim of this article is to help shed light on how well countries do in promoting the health of their populations and on several dimensions of health system performance. Applying advanced multidimensional statistical methods to a selected set of indicators of health and health system functioning in selected European countries can summarize some of the relative strengths and weaknesses and help identify possible priority areas for action.
Accordingly, we applied correlation, factor and cluster analysis [3] to a selected set of health and social indicators from the OECD Health Statistics [4] and OECD Social Statistics [5] databases; the selected countries are therefore the European countries that are members of the OECD. The most recent data available were used for the analysis.

Selected indicators
In accordance with the objectives of the analysis, we selected 19 indicators (Table 1). Indicators (variables) H1 to H7 together characterize the state of health, E1 to E3 healthcare expenditure, C1 to C5 healthcare resources, and S1 to S4 the social determinants of health of the inhabitants of the selected countries.

Selected multidimensional methods
Factor analysis is a statistical approach that can be used to analyse interrelationships among a large number of variables and to explain these variables in terms of their common underlying factors. The general purpose of factor analytic techniques is to condense (summarize) the information contained in a number of original variables into a smaller set of new composite factors with a minimum loss of information. Numerous variations of the general factor model are available. The two most frequently employed approaches are principal component analysis and common factor analysis. The component model is used when the objective is to summarize most of the original information (variance) in a minimum number of factors. The scree plot can be very helpful in determining the number of factors to extract, because it displays the eigenvalues associated with the components or factors in descending order against the number of the factor. [6], [7] An important concept in factor analysis is the rotation of factors. In practice, the objective of all methods of rotation is to simplify the rows and columns of the factor matrix to facilitate interpretation. The varimax criterion centres on simplifying the columns of the factor matrix. With the varimax rotation approach, there tend to be some high loadings (i.e., close to -1 or +1) and some loadings near 0 in each column of the matrix. The factor loadings are the correlations between the original variables and the factors, and they are the key to understanding the nature of a particular factor. The factor scores in the output of the factor analysis procedure are the values of the rotated factors for each of the n cases, in our analysis for each of 25 European countries. Factor scores show where each country falls with respect to the extracted factors. [6], [7]
The cluster analysis procedure is designed to group observations (countries) into clusters based upon similarities between them. A number of different algorithms are available for generating clusters; they are described in detail in many statistical publications, for example in [3]. We used the agglomerative algorithm, which begins with each observation in a separate cluster. It then combines the two observations that are closest together to form a new group. After the distances between the groups are recomputed, the two groups that are then closest together are combined. This process is repeated until only one group remains.
The distance between two observations x and y, each described by p variables, is calculated as the Euclidean distance

$$d(x, y) = \sqrt{\sum_{j=1}^{p} (x_j - y_j)^2} \qquad (1)$$

and the distance between two clusters by Ward's method. Ward's method defines the distance between two clusters in terms of the increase in the sum of squared deviations around the cluster means that would occur if the two clusters were joined. The results of the analysis are displayed in several ways, including a dendrogram. Working from the bottom up, the dendrogram shows the sequence of joins that were made between clusters. Lines are drawn connecting the clusters that are joined at each step, while the vertical axis displays the distance between the clusters when they were joined. [7], [8], [9]
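The agglomerative procedure with Ward's criterion described above can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the implementation used by the statistical packages in this article: the function names, the country labels A to D and the factor-score values are all hypothetical. The merge cost is computed as the increase in the within-cluster sum of squares, which for clusters A and B equals |A||B| / (|A| + |B|) times the squared Euclidean distance between their centroids.

```python
from itertools import combinations


def centroid(points):
    """Mean vector of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(column) / n for column in zip(*points))


def ward_increase(a, b):
    """Increase in the within-cluster sum of squares if clusters a and b are joined."""
    ca, cb = centroid(a), centroid(b)
    squared_distance = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * squared_distance


def ward_clustering(data, labels):
    """Join clusters step by step until one remains; return the sequence of joins."""
    clusters = {label: [point] for label, point in zip(labels, data)}
    merges = []
    while len(clusters) > 1:
        # Pick the pair of clusters whose join increases the sum of squares least.
        left, right = min(
            combinations(clusters, 2),
            key=lambda pair: ward_increase(clusters[pair[0]], clusters[pair[1]]),
        )
        cost = ward_increase(clusters[left], clusters[right])
        clusters[left + "+" + right] = clusters.pop(left) + clusters.pop(right)
        merges.append((left, right, cost))
    return merges


# Hypothetical factor scores (F1, F2, F3) for four made-up countries.
scores = [(1.0, 0.9, 0.8), (1.1, 1.0, 0.7), (-1.0, -0.9, -1.1), (-1.2, -1.0, -0.9)]
names = ["A", "B", "C", "D"]

merges = ward_clustering(scores, names)
for left, right, cost in merges:
    print(f"join {left} and {right}: increase in sum of squares = {cost:.3f}")
```

The list of joins, read in order, is the information a dendrogram displays from the bottom up; the cost of each join corresponds to the height at which the two branches meet, possibly on a different scale than the one a given software package reports.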

The results of multidimensional methods
The results of the correlation analysis, presented in graphic form, show the correlation coefficients between each pair of indicators and the clusters of indicators. The results indicate a strong positive dependence of the health indicators on healthcare expenditure (E1-E3), on employment in health and social work (C1) and on the number of nurses per 1000 inhabitants (C3), a moderate dependence on the number of physicians and on the technical resources C4 and C5, and a strong negative dependence on the social determinants S1-S4; see Figure 1.

Source: OECD Health Statistics 2017, self-processed in SAS JMP
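As a minimal sketch of how a matrix of pairwise correlation coefficients such as the one in Figure 1 can be computed, the following example uses NumPy. The data are invented for illustration (five made-up countries and three made-up indicators) and are not taken from the OECD databases; the column names are hypothetical as well.

```python
import numpy as np

# Hypothetical values for 5 made-up countries (rows) and 3 indicators (columns):
# a health-status indicator, an expenditure indicator and a social indicator.
data = np.array([
    [82.0, 5000.0, 10.0],
    [81.0, 4500.0, 12.0],
    [79.0, 3000.0, 15.0],
    [77.0, 2000.0, 20.0],
    [75.0, 1500.0, 22.0],
])
column_names = ["health", "expenditure", "social"]

# rowvar=False tells NumPy that each column (not each row) is a variable.
corr = np.corrcoef(data, rowvar=False)

for i, row_name in enumerate(column_names):
    for j in range(i + 1, len(column_names)):
        print(f"r({row_name}, {column_names[j]}) = {corr[i, j]:+.3f}")
```

In this invented data set the health indicator is strongly positively correlated with expenditure and strongly negatively correlated with the social indicator, which mirrors the kind of pattern the correlation analysis reports.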
By applying factor analysis we seek a small number of common factors that account for most of the variability in the original variables. To assess the suitability of the indicators for factor analysis, we applied the Kaiser-Meyer-Olkin (KMO) measure.
The value KMO = 0.7544399 shows that the source variables are suitable for factor analysis. The first three factors were retained because their eigenvalues are higher than 1; see Figure 2.
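The two checks used here, the overall KMO measure and the rule of retaining factors with eigenvalues higher than 1, can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names are hypothetical, the 3 x 3 correlation matrix is invented, and the overall KMO is computed from its usual definition as the sum of squared off-diagonal correlations divided by that sum plus the sum of squared off-diagonal partial correlations.

```python
import numpy as np


def kmo(corr):
    """Overall Kaiser-Meyer-Olkin measure computed from a correlation matrix."""
    inverse = np.linalg.inv(corr)
    # Partial correlation of i and j given all other variables.
    scale = np.sqrt(np.outer(np.diag(inverse), np.diag(inverse)))
    partial = -inverse / scale
    off_diagonal = ~np.eye(corr.shape[0], dtype=bool)
    r2 = np.sum(corr[off_diagonal] ** 2)
    p2 = np.sum(partial[off_diagonal] ** 2)
    return r2 / (r2 + p2)


def retained_factors(corr):
    """Eigenvalues in descending order and the number of them greater than 1."""
    eigenvalues = np.linalg.eigvalsh(corr)[::-1]
    return eigenvalues, int(np.sum(eigenvalues > 1.0))


# Hypothetical correlation matrix: three indicators, all pairwise r = 0.5.
corr = np.array([
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
])

overall_kmo = kmo(corr)
eigenvalues, n_factors = retained_factors(corr)
# Share of total variance explained by the retained factors.
explained = eigenvalues[:n_factors].sum() / corr.shape[0]

print(f"KMO = {overall_kmo:.4f}")
print(f"eigenvalues = {np.round(eigenvalues, 3)}, retained = {n_factors}")
print(f"share of variance explained = {explained:.3f}")
```

Plotting the eigenvalues in descending order against their rank gives the scree plot; the share of variance explained is the sum of the retained eigenvalues divided by the number of variables.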
Factor loadings represent the correlations between the original variables and the factors, and they are the key to understanding the nature of a particular factor. Rotation is performed in order to simplify the interpretation and naming of the factors. After varimax rotation we obtained the factor loadings shown in Table 2. Based on these factor loadings, we found that the 1st factor has a strong positive correlation with the indicators of health status and health expenditures, the 2nd factor shows a rather moderate positive correlation with the indicators Employment in health and social work and Disposable income and a strong negative correlation with the other social indicators, and the 3rd factor shows a strong positive correlation with the indicators of personnel and technical resources. High values of each factor indicate a high level of the phenomenon it describes.
Based on the above, we have named the three common factors as follows:
• F1 - Factor of health status and health expenditures,
• F2 - Factor of social determinants of health,
• F3 - Factor of personnel and technical resources of healthcare.
Table 3 shows the factor scores for each monitored country, that is, the values of the rotated factors for each country. Displaying the countries graphically in a two-dimensional coordinate system whose axes are the selected factors allows us to quickly assess the situation in each country and to compare the situation in different countries.
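To illustrate what varimax rotation does to a loading matrix, the following sketch implements raw varimax by successive planar rotations of pairs of factors, using the classical closed-form rotation angle for each pair. It is a sketch only: it omits Kaiser normalization, so its output may differ from what Statistica or other packages produce, and the loading matrix is invented (a clean two-factor structure deliberately rotated by 30 degrees so that the rotation has something to recover).

```python
import numpy as np


def varimax(loadings, max_sweeps=50, tol=1e-10):
    """Raw varimax rotation (no Kaiser normalization) via pairwise planar rotations."""
    rotated = np.array(loadings, dtype=float)
    n_variables, n_factors = rotated.shape
    for _ in range(max_sweeps):
        largest_angle = 0.0
        for i in range(n_factors - 1):
            for j in range(i + 1, n_factors):
                x, y = rotated[:, i].copy(), rotated[:, j].copy()
                u, v = x ** 2 - y ** 2, 2.0 * x * y
                numerator = 2.0 * np.sum(u * v) - 2.0 * np.sum(u) * np.sum(v) / n_variables
                denominator = (np.sum(u ** 2 - v ** 2)
                               - (np.sum(u) ** 2 - np.sum(v) ** 2) / n_variables)
                # Angle that maximizes the varimax criterion for this pair of columns.
                theta = 0.25 * np.arctan2(numerator, denominator)
                c, s = np.cos(theta), np.sin(theta)
                rotated[:, i] = c * x + s * y
                rotated[:, j] = -s * x + c * y
                largest_angle = max(largest_angle, abs(theta))
        if largest_angle < tol:
            break
    return rotated


# Hypothetical simple structure: two variables load on each of two factors.
simple = np.array([[0.9, 0.0], [0.8, 0.0], [0.0, 0.9], [0.0, 0.8]])

# Blur the structure with a 30-degree orthogonal rotation.
angle = np.deg2rad(30.0)
mix = np.array([[np.cos(angle), -np.sin(angle)], [np.sin(angle), np.cos(angle)]])
unrotated = simple @ mix

rotated = varimax(unrotated)
print(np.round(unrotated, 3))
print(np.round(rotated, 3))
```

Because the rotation is orthogonal, the communality of each variable (the row sum of squared loadings) is unchanged; only the distribution of the loadings across factors changes, so that each column contains some loadings close to -1 or +1 and others close to 0.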

Source: self-processed in Statistica 12
In the coordinate system of the factors F1 and F2, three groups of countries emerged: one with high values of both factors, comprising all the old EU countries; a second with low values of both factors, comprising the new EU countries; and a third with a medium level of the first factor and a low to medium level of the second factor; see Figure 3.
Source: self-processed in Statistica 12
Figure 4, which shows the location of the monitored European countries in the coordinate system of factors F1 and F3, indicates that F3 (Factor of personnel and technical resources of healthcare) has a positive effect on F1 (Factor of health status and health expenditures). Three clusters of monitored countries emerged: a group of Northern and Western European states with high values of both factors; a group with the lowest level of both factors, formed by the same five countries as in Figure 3; and a third, most numerous group with a medium level of both factors, which again includes the Czech Republic.
Figure 5 shows the dependence between F3 (Factor of personnel and technical resources of healthcare) and F2 (Factor of social determinants of health). Again, there is a direct dependence between these two factors and, similarly to Fig. 3, we can observe the specific situation of the group of countries formed by Greece, Spain, Italy and Portugal, where the level of personnel and technical resources of healthcare is medium to high even at a low level of social determinants of health. Unfortunately, the group of five new EU Member States with the lowest level of both factors is the same as in Fig. 3 and Fig. 4.
The factor analysis based on the principal component method resulted in 3 mutually independent factors, each representing one dimension of the health situation. These factors are suitable inputs for the cluster analysis. The dendrogram and parallel plots present the results in visual form; see Figure 6.
According to the legend, red represents high, and therefore desirable, values of each factor, and the magnitude of the values is indicated by the intensity of the colour. Low factor values are shown analogously in blue.
The 1st column of the colour map refers to the 1st factor (health status and health expenditures), the 2nd column represents the social determinants of health and the 3rd column represents the personnel and technical resources of healthcare. The parallel plots in Figure 6 provide an analogous interpretation of the five clusters from the dendrogram in Figure 5.

Conclusion
The results of the selected multidimensional methods have confirmed their usefulness for reducing the dimension of large-scale data sets of health indicators in Europe, assessing health inequalities and identifying some of their determinants.
The correlation analysis quantified the causal relationships between health indicators, health care expenditures, personnel and technical resources, and social determinants of health. The application of factor analysis made it possible to replace the 19 original indicators with three common factors explaining almost 80% of the variability of the original variables. Interpreting these factors by means of the factor loadings made it possible to assess the impact of social determinants and of personnel and technical resources on health status, as well as the health inequalities in the monitored countries caused by these factors. Charts 1 to 3 confirm that, despite the efforts and actions of the European Commission, these inequalities are significant. This is also confirmed by the results of the cluster analysis, which are consistent with the results of the factor analysis. The use of appropriate statistical software packages and the Excel spreadsheet allows the results of advanced statistical methods to be published in a clear visual form that is understandable even to people without thorough knowledge of these methods.