Simple Single and Multi-Facility Location Models using Great Circle Distance

Facility location problems (FLP) are widely studied in operations research and supply chain domains. The most common metric used in such problems is the distance between two points, generally Euclidean distance (ED). When points/ locations on the earth surface are considered, ED may not be the appropriate distance metric to analyse with. Hence, while modelling a facility location on the earth, great circle distance (GCD) is preferable for computing optimal location(s). The different demand points may be assigned with different weights based on the importance and requirements. Weiszfeld‟s algorithm is employed to locate such an optimal point(s) iteratively. The point is generally termed as “Geometric Median”. This paper presents simple models combining GCD, weights and demand points. The algorithm is demonstrated with a single and multi-facility location problems.


Introduction
In any supply chain, the location of a supply point has a great influence on the system parameters like transportation cost, the time required and system efficiency [1]. In addition to the above, other factors like population, economy, and literacy may also play some roles in fixing the final facility location. In the case of collection point(s), the parameters that require attention include proximity to the raw materials, prices of them, infrastructure including manpower. It is generally acknowledged that the cost of logistics is very high in India. Some estimates put the cost at about 13 per cent of GDP, which is higher than the US (9%) and Germany (8%). A study conducted by Assocham-Resurgent India (2016) concluded that India can save around Rs 3.33 lakh crores if logistics costs reduce from 14 per cent to 9 per cent of GDP. Reduced logistics costs would bring down prices of products. In any logistics chain, the transportation services form a third of the cost. Among all modes of transport, roads carry about 60 per cent of the freight cargo in India [2]. Hence, minimising the total distance of any supply chain is important. When we refer to the 'distance' in general, Euclidean Distance (ED) is considered.
According to Zarinbal [3], Euclidean and Rectilinear distances account for more than 63 per cent of distance functions used in location problems. However, since the demand points are distributed over a spherical surface, using Great Circle Distance (GCD) which is based on spherical trigonometry is more relevant [4][5]. The discrepancy between the distances measured based on Euclidean and those based on spherical trigonometry becomes greater, the further apart the locations are from each other [6].
The points on the earth surface are represented by geodetic coordinates (latitudes and longitudes). The GCD between any two points on a sphere can be computed using the Haversine formula [7]. According to ReVelle and Eiselt [8], the location model has four basic features: • The customers those are located at points or routes. • The facilities that must be located. • The geographic space or surface where the customers and facilities are located. • The metric that indicates the distance, cost, or time between the locations of clients and facilities.
The geographic space and the distance metric play important roles in locating the facilities when the locations of customers remain the same. Most works have considered the space for the FLP to be flat.
The accuracy of the distances for the facility location model depends on the space feature. As discussed in the study by Shih [9], if the location points are widely separated, the difference in driving distance between the flat surface assumption (i.e. Euclidean distance) and the spherical surface assumption (i.e. great circle distance) may be significant, leading to important variations in the optimal location of the facility.
The use of "Great circle distances" finds its place in marine applications also like exploring natural resources including oil rigs and mobile drilling units. In computing the optimal location, the geometric median; the relative distance, population, GDP or number of trips are considered as the "weights". Considering the nature of the problem, this can be well-thought-out to be a Fermat-Weber problem [10].
A single facility Fermat-Weber problem requires finding a point in a real coordinate space of dimension "n" (R n ) which minimizes the sum of weighted distances to "m" given data points. In a facility location problem, the data points are the customers (demand points) or venders (supply points). The weighted distance between the new facility and a customer/ vendor is a measure of the cost to transport goods or provide services by the facility to satisfy the known requirements [11].

Minimize G(y) = ∑
(1) Where, p j = (p j1 , p j2 , …, p jn ) T is the known position of "j th " data point. y = (y 1 , y 2 , … , y n ) T is the position of the unknown new facility which has to be evaluated.

d(y, p)
is the distance function to measure the distance separating the two points y and p; y, p ϵ R n . )) is the cost function of the distance function in the interval (0, +∞), j = 1, 2, … , m. )) = w j ); j = 1, 2, … , m.
Where, w j is the weighted positive constant and ) is the norm.
Analytical solutions are available to evaluate such a "Fermat-Weber point" for three non-collinear points [12]. Finding a geometric median of a set of data points is an optimization problem, which has no analytical solution for more than four points. In such cases, the Weiszfeld [13] algorithm has been effectively used by researchers for decades. It is an iterative procedure for solving the minisum Fermat-Weber problem.
The optimal facility location problem on a sphere has been a subject of interest for many researchers. This problem was first studied by Drezner and Wesolowsky [14] and by Katz and Cooper [15] where Weiszfeld-like algorithms were proposed. A gradient algorithm was proposed by Xue [16] to solve the spherical facility location problem. The author claimed that the algorithm always converges to a global minimizer of the spherical facility location problem. To compute lower bounds, four ways were proposed and compared by Hansen et. al. [17] within a branch-andbound scheme for the Weber problem on a sphere.
The Wiggle Factor (WF) is a form of approximate correction factor used to estimate the route distances for road transport. It is defined as the ratio between the real distance travelled by road and the straight line between the two points. It is frequently used to calculate the fuel costs (which represent approximately half of the total truck costs) and hence, a certain degree of accuracy is required. The "curvature of the road" or Wiggle Factor was proposed by Cooper in 1983 [18] and he determined a value of 1.2 for UK roads. An improved methodology applied to Spanish road transport was proposed by Domínguez-Caamaño et al. [19]. They used two different WF: the first WF (1.36) describes mainly road infrastructure in rural areas while the second WF (1.29) characterizes highcapacity roads (typically motor-ways). Similarly, the total distance obtained using any model needs to be multiplied by the WF which may be suitably taken between 1.2 and 1.35.
According to the National Ocean Service, National Oceanic and Atmospheric Administration, U.S. Department of Commerce; the Earth is an irregularly shaped ellipsoid [20]. Cazabal-Valencia et al. [21] claim that minimum total coverage distance is achieved with the facility location problem considering the ellipsoid as the model for the Earth"s surface. This distance is smaller than the total coverage distance achieved under the spherical model. The model will be analysed in that direction also using ellipsoidal distance. This also needs further investigation. Several works are available in the literature [22][23][24] in this area that use both Euclidean distance and Great Circle Distance.
This paper presents two simple models to estimate the optimal facility location point(s) on the earth surface from a data set with known geodetic coordinates (latitudes and longitudes).

Great Circle Distance (GCD)
The most popular distance metric is the straight line distance or Euclidean distance. Euclidean Distance (ED) in 3D space between points (x1, y1, z1) and (x2, y2, z2) can be computed using a simple formula.
Euclidean Distance, ED But for spherical locations, we prefer great circle distance instead of Euclidean distance. The great-circle distance (GCD) is the shortest distance between two points on the surface of a sphere, in our case earth. Earth is assumed to be a perfect sphere with a radius, R = 6371.009 km for analysis, even though it is not so. Any location, x j can be represented by its geodetic coordinates (latitude, longitude) on the earth surface. In spherical trigonometry, GCD measurement is based on spherical triangles [ Figure 1] whose sides form arcs of great circles [25].
The shortest distance between any two points on the earth"s surface lies along the arc of the great circle joining these two points. GCD = Radius of Earth (R) * Internal Spherical Angle in radians (θ).

Weiszfeld's Algorithm
For a given set of data points, the "Mass Centre" refers to a point with coordinates that are weighted arithmetic average of the corresponding coordinates of the data set. It is nothing but the weighted mean of the given data points.
Where w j is the weighted positive constant associated with the data point p j and m is the number of data points. Weiszfeld's algorithm is used for improving the "mass centre" towards the "Exact Centre" or "Fermat-Weber Point". The exact centre refers to a point that has 'total minimum weighted distance to all points. Weiszfeld's algorithm is an iterative procedure used to find the 'Fermat-Weber Point' for 'm" data points. The point is also known as "Geometric Median".
The distance may be of any. The straight line or Euclidean distance can be used for shorter distances. For more number of points on the sphere which are fairly separated, great circle distance gives accurate results. The "Geometric Median" is estimated iteratively from the "Mass Centre" by the Weiszfeld"s algorithm using the formula, "y i ' refers to the present facility location, 'y (i+1) " will be the new location and each data point "j" is represented by "p j ". "w j " is the weight associated with "p j " and "i" refers the iteration number. The iteration can be stopped if the result reaches the specific "stopping criterion".

Estimation of Optimal Facility Location and Results
Given a set of "m" data points, the procedure for estimating the optimal location is given in several steps: Steps: 1.
Estimation of geodetic coordinates (latitudes and longitudes) of all points in radians with signs.(South latitudes and west longitudes are expressed in negative values)

2.
Assignment of weights, w j associated with each data point, if required.

3.
Computing the "Mass Centre" as the initial approximate centre.

4.
Estimation of internal spherical angles, θ between the approximate centre and other points.

5.
Computation of Great Circle Distances (GCD) between the approximate centre and other points.

6.
Finding the next approximate Centre using Weiszfeld"s algorithm (considering the weights).

7.
Finding the final optimal facility location point in subsequent iterations by repeating steps 4, 5 and 6.
The iteration can be stopped if there is no change in two consecutive centres or the stopping criteria is reached.
Here, the two norm represents the GCD, not ED.

Validation with Known Results
The Weiszfeld"s algorithm is a time tested algorithm to estimate the optimal point iteratively for a given set of data points. Also, the GCD is the minimum distance between two points on a sphere. This paper combines both the concepts in obtaining optimal facility location(s).
It is required to check the coding before the analysis.
Two problems considering weights for each point used by Cooper and Katz [26] were analysed. The results obtained perfectly match with Cooper and Katz both in terms of magnitudes and number of iterations. Euclidean distance is used as the distance metric here.
The other part of the algorithm is estimating the GCD between two points of known latitudes and longitudes. The algorithm is tested using the data points available in the literature [4] ) were also considered. The GCD reported by the online calculator is 19880 km whereas, the MATLAB program reports 19882.77186 km. The variation is very less. In the 8 data points problem studied by Mwemezi and Huang [4], the total distance reported is 1983.439 km whereas, the proposed algorithm which takes the geometric median as the facility location reports 1979.12 km. Both the algorithms use GCD but, Mwemezi and Huang considered the mass centre as the facility location. The proposed algorithm iteratively reaches the optimum point using Weiszfeld"s algorithm.

Example Problem
There are 8 warehouses (demand points) in a country for a particular company to which consumer goods are to be supplied monthly. For the warehouses numbered 1 and 6, two trips are to be made in a month. The company wants to have a centralized supply point from which the supplies are to be made. Locate the centralized supply point, given the locations of the warehouses. The geodetic coordinates of the warehouses are collected and given.
The geodetic coordinates are converted from degree, minutes and seconds format to decimal format (60 seconds = 1 minute, 60 minutes = 1 degree).
Subsequently, the values are converted to radians using the expression, Radians = (Degrees * π)/180 (7) Weights are assigned to each demand point, in this case, the number of monthly trips to be made to each point. The next step is to compute the "mass centre" as shown in Table 1. The initial approximate facility location, the "mass centre" is estimated as the arithmetic mean of the coordinates of the demand points. The mass centre is (-0.06545 0.1576925) radians.

First Iteration:
Interior spherical angle, θ is computed between any demand point and the Mass Centre (a total of 8 angles are computed).
Great Circle Distance = earth radius * interior spherical angle (a total of 8 GCDs are computed).
The weighted distances are analysed using Weiszfeld"s algorithm to obtain the next approximate centre (latitude, longitude) in radians.

Subsequent Iterations:
Interior spherical angle, θ is computed between any demand point and the New Approximate Centre (a total of 8 angles are computed).
Great Circle Distance = earth radius * interior spherical angle (a total of 8 GCDs are computed).
The weighted distances are analysed using Weiszfeld"s algorithm to obtain the next approximate centre (latitude, longitude) in radians.
The same procedure is repeated until there is no change in the new centre (or) the terminating condition is reached. The final supply point is the optimal facility location point.
Codes are generated in MATLAB 2012b and run in an i5 PC with 4 GB RAM.
The terminating condition is the difference in distances less than 1 km. That is, if the total distance between any two consecutive iterations is less than 1 km, the algorithm will stop. The analysis results are presented in Table 2. The optimal supply point is located at ((0.095125, 0.271249) radians or (5.4503, 15.5415) degrees, latitude and longitude respectively. The total distance is 27594.32 km.
As the weights represent the number of trips between the demand points and the optimal supply point, those weights are to be multiplied with the corresponding GCDs and added to get the total minimum distance.
In our case, Total minimum distance = 27594.32 + 7.8343 (first demand point) + 3172.5 (sixth demand point) = 30774.65 km. This is the minimum possible distance for the given problem.
If the number of warehouses (demand points) is high and more than one supply point is to be established, instead of one we can go for two or more facility locations using the same Weiszfeld"s algorithm.
Clustering technique should be applied in such case. If two optimal supply points are to be established, two initial centres are assumed randomly instead of one. The demand points will be attached to the nearest centre one by one. Proceeding in the same way, next approximate centres will be evaluated using the Weiszfeld algorithm, and again the demand points will be attached to the nearest centre one by one. This continues till two optimal centres are evaluated.
In the following section, a real-world application is discussed.

Estimation of Multi-Facility Locations and Results
In this section, a slightly different type of problem is considered. The population data of [28] 661 districts (2011) and 34 states and union territories (UTs) (2011) in India are considered and the geodetic data of the head-quarters of these districts/ states are collected from a single source [29]. It is assumed that the headquarters represent the entire district/ state/ UT. For the analysis using districts" data; equal weights of 1 each (GCD), % area share, %population share The same procedure is followed to estimate the single and two optimal facility location(s) and the final results are shown in Tables 3 and 4.  The GCD will be always greater than ED. The maximum radius from the centre and an extreme location in India can be maximum of 1600 km. If two major cities in India; Chennai and New Delhi are considered; the GCD = 1760.1 km and ED = 1745.4 km. That is, ED is less than GCD by 14.7 km (or) 0.835%. If Chennai and Nagercoil are considered; GCD = 624.47 km and ED = 621.23 km. That is, ED is less than GCD by 3.24 km (or) 0.519%.
The locations, number of districts in each cluster and total distance vary depends on the weight. The total GCD is minimum when equal weights are considered. For a single facility location, all the points are located in Madhya Pradesh.
For two facility locations, one cluster centre is located in Uttar Pradesh covering northern districts and the other one in Karnataka covering southern districts. More number of districts is covered in Cluster No. 1 and hence, it is preferable to go for more number of clusters,

Case II, States and UTs wise Analysis: Movements of Centres
Here, only less number of data points are considered (number of states and UTs). This part shows another important use of this model to analyse the movements of "Centres" over a period of time. Movement of centre implies the shift in activities.
The results are presented in Table 5. When the population is considered for the years 2001 and 2011, the points are not the same. The population optimal centre is shifted (blue dots above the text India) in the north-eastern direction by 20.953 km. Similarly, the %real GDP shares are analysed for the years 2004-05 and 2014-15 and it is observed that the point gets shifted (blue dots below the text India) towards the south-west direction by 44.689 km (Figures 2 and 3), that is about 2.13 times more than the population point.
Population points are above the economic (GDP) points. The accuracy will increase with an increase in the number of data points. Table 6 shows the variation in centres for different number of data points; number of districts and number of states.   The points differ by 203.71 km if no weights are considered and by 36.173 km if %population share is considered as the weight.
The location found using any model lies below the earth centre. If any model uses the earth-centric earth focussed (ECEF) coordinates which is in the form of x, y and z coordinates, the location has to be extrapolated to a point at the earth"s surface. The advantage of this model is, we need no such conversion as this uses the angles only during the iteration process. For example, the location (23.788, 80.785) lies 3451.0 m below earth surface in the districts wise Analysis -One Optimal Facility Location (Table 3) and the extrapolation also will result in the same values.

Conclusion and Future Work
This paper models both single and multi-facility location problems on earth using great circle distance which is more realistic. Weiszfeld"s algorithm is used to move the "Mass Centre" towards the optimal location point(s) in a few iterations. The concept is demonstrated using an eight data points problem for computing a single facility location. The number of monthly trips is taken as the weights.
The information about the population, area, real GDP, literacy rate of India is considered as the weights and analysed for both single and double facility locations. It is shown that the locations and total minimum distances vary with the weights.
Another important use of this model is demonstrated using the population and real GDP data of states and UTs over a period of time. This shows the movement of the population and economic activities in the country, which can be extended for any weight parameter. These models will help for fixing the optimal location centres, analyse the movement of activities over a period of time that helps in management decisions.