Statistical modelling of bus travel time with Burr distribution

A better understanding of the shape of the travel time distribution could help transport operators estimate the time a vehicle needs to travel from one place to another. The main purpose of this study is to find the most appropriate distribution to represent the day-to-day travel time variation of an individual link of a bus route. Klang Valley, Malaysia is the study area. Automatic vehicle location (AVL) data from ten bus routes over seven consecutive months are used to examine distribution performance. The leading distribution proposed is the Burr distribution. Symmetrical and asymmetrical distributions proposed in existing studies are also used for comparison. Maximum likelihood estimation is applied for parameter estimation, while the log-likelihood value, Akaike information criterion (AIC) and Bayesian information criterion (BIC) are applied to assess the performance of the distributions. The leading model gives promising results across all kinds of operating environment, and the study could serve as preliminary preparation for further reliability analysis.


Introduction
Travel time variability (TTV) is widely recognized as a key indicator of the service quality of a transportation system. It describes the punctuality or consistency of the system for a given journey, and can also be understood as the compliance of a public transportation system with its planned schedule. To make public transportation more attractive, it is important to maximize its service quality or reliability and to improve on-time arrival performance. This is a persistent challenge for road transport operators because traffic systems are stochastic in nature. Improving service quality benefits not only the transport operator, in terms of attractiveness and revenue, but also the users. Travelers expect on-time arrival, or compliance with the planned schedule within some acceptable level of delay. Minimizing TTV has been found to be more important than minimizing the average travel time, as it reduces the travel time uncertainty that causes stress for users [1][2].
Different types of distributions have been tested for TTV modelling, for example gamma [3][4], normal [5][6], lognormal [7][8] and, more recently, the Burr distribution [9][10]. However, some research gaps are observed in these studies. Only a few modelling studies were done on bus TTV, and none of them modelled bus TTV with the Burr distribution. In addition, only a few TTV studies were done at the link level. Most distribution fitting on travel time was done at the route level, although studies show that travel times on individual links of the same route are independent [11][12]. Most TTV studies also did not consider the day of the week as an important factor affecting travel time variation. Only a few studies consider this factor, and none of the existing studies fit weekday and weekend travel time values separately with the proposed distribution.
This study aims to fill the research gaps found in the existing studies on statistical modelling of TTV. Its main purpose is to examine the suitability of the proposed distribution for explaining link travel time values. Its second purpose is to examine how well the proposed distribution fits the travel time values of different links, since the operating environment of each individual link in a bus route is different.
The focus of this paper is to find the most appropriate distribution for day-to-day variability in both weekday and weekend bus travel time. Klang Valley, Malaysia is the study area. Automatic vehicle location (AVL) data from ten bus routes over seven consecutive months are used to examine distribution performance. The leading distribution proposed is the Burr distribution. Other symmetrical and asymmetrical distributions, i.e. normal, Weibull, gamma, generalized Pareto and lognormal, are also chosen for comparison. Maximum likelihood estimation is applied for parameter estimation, while the log-likelihood value, Akaike information criterion (AIC) and Bayesian information criterion (BIC) are applied to assess the performance of the distributions. AIC and BIC are more robust measures of statistical model quality because they take into account both the maximized log-likelihood value and the number of parameters in the model [13][14].
Promising results are obtained from the leading model in all kinds of operating environments. The results show that the Burr distribution represents weekday travel time values particularly well. Although it performs comparatively less well on weekend travel time values, it is still the "best" representation among the distributions compared. The statistical tests conducted in the study also found that the Burr distribution is not a good statistical fit for some individual links, most of them in the weekend travel time data. However, fewer than 5% of the cases were rejected.
The remainder of this paper is organized as follows. The existing studies on TTV modelling are discussed in Section 2. The method and data used for the research are discussed in Section 3, which covers the bus travel time data, the proposed distribution and its characteristics. The results are discussed in Section 4, while the last section concludes the study and highlights potential further research.

Literature review
Statistical modelling of public transport TTV started in the 1980s, although research on TTV had by then been under way for some decades. The main reason for the limited number of studies is the lack of access to public transport travel time data. This issue is now being addressed with the emergence of advanced technology, i.e. AVL systems and the global positioning system (GPS). It was first believed that a symmetrical distribution, i.e. the normal distribution, would be a suitable model for vehicular travel time data. However, asymmetrical distributions such as the gamma and lognormal distributions were found to be more appropriate in recent studies. Table 1 shows some existing studies that have proposed a variety of distributions to represent travel time data for different transportation modes. The existing studies have the following limitations:
a) The data sample size is limited in the studies conducted before the 21st century. Inaccurate results might be produced if a limited data sample is used to carry out the study.
b) Most of the studies did not include a variety of routes or sections with different operating environments. For example, Zhang et al. [7] only used a freeway section to conduct the study.
c) Some of the distributions proposed can only be used for approximation. The normal distribution and the lognormal distribution can only approximate travel time values because they produce negative travel time values and very small travel time values respectively [15].
The Burr distribution has been proposed as an appropriate model in several reliability studies. In actuarial application studies, it is a popular model for modelling failures [16]. It is well known for its flexibility in capturing various distribution shapes and for its mathematical tractability. Recently, it has also been proposed in reliability engineering studies, in particular TTV studies. The Burr distribution was proposed to fit individual vehicle travel time values, and in overall performance it emerged as a plausible model [9][10]. It is able to capture the strong positive skew and the very long upper tails exhibited by TTV data. However, to the best of the authors' knowledge, the model has not been proposed as an appropriate model for bus travel time values.

Data and methodology
The data and the method used for the study are explained in this section. Detailed information on the study area and on each route is discussed in the first subsection, together with the extraction and filtration of the data used. The last subsection presents the Burr distribution as the leading distribution for the research, including its statistical properties and its suitability for representing travel time observations in transportation applications.

Data descriptions and sample size
The Automatic Vehicle Location (AVL) data collected by the local public transport operator are used for the research. The transportation mode and study area are the bus service system in Klang Valley, Malaysia. Buses equipped with GPS were used to collect longitudinal travel time data for 10 bus routes. Seven consecutive months (June 2014 to December 2014) of second-by-second GPS data are included, covering both weekday and weekend travel time observations. The ten bus routes are B115, T626, T629, T634, U32, U40, U48, U62, U76 and U89.
It is important to categorize the bus routes in order to have a clearer view of the performance of the distributions for each bus route category. The bus routes examined were categorized according to their route length, their characteristics or the type of service provided, and the total number of stations in the route. All important information, such as the travel time recorded from one station to another, the stop ID and the direction of each individual station, is provided by the AVL system. The travel time recorded against a stop ID is the time taken to travel from the preceding stop to that stop, measured as the difference in departure times between two consecutive stops. The detailed information of each bus route is summarized in Table 2, while Fig. 1 shows the plotted routes used in the study.

Burr distribution
Burr [19] first introduced the Burr distribution as a two-parameter family with two shape parameters. Tadikamalla [20] further developed it into a three-parameter family with an additional scale parameter. The Burr distribution was developed with flexibility in mind, so it has an advantage in fitting various types of frequency data and can capture a wide variety of distribution shapes. Another attractive feature is its mathematical and computational convenience. The cumulative distribution function (cdf) can be written in closed form, which simplifies the computation of percentile values. The probability density function (pdf) and the cdf of the distribution are shown in Equation (1) and Equation (2) respectively.
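\[ f(x \mid \alpha, c, k) = \frac{kc}{\alpha} \left(\frac{x}{\alpha}\right)^{c-1} \left[1 + \left(\frac{x}{\alpha}\right)^{c}\right]^{-(k+1)}, \quad x > 0 \tag{1} \]
\[ F(x \mid \alpha, c, k) = 1 - \left[1 + \left(\frac{x}{\alpha}\right)^{c}\right]^{-k}, \quad x > 0 \tag{2} \]
where x = travel time; c = shape parameter; k = shape parameter; α = scale parameter (estimated median travel time) [21].
The three-parameter Burr distribution has some interesting statistical properties. The modal value x_m is given by
\[ x_m = \alpha \left(\frac{c-1}{ck+1}\right)^{1/c} \tag{3} \]
An L-shaped distribution is formed if c ≤ 1, while a unimodal distribution is formed if c > 1.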
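The closed-form cdf is what makes percentile values cheap to compute. The following is a minimal Python sketch, assuming the standard three-parameter Burr (Type XII) form with scale α and shape parameters c and k; the function names and the example parameter values are illustrative only and are not taken from the study.

```python
import math


def burr_pdf(x, alpha, c, k):
    """Density of the assumed three-parameter Burr (Type XII) form."""
    if x <= 0:
        return 0.0
    z = x / alpha
    return (k * c / alpha) * z ** (c - 1) / (1.0 + z ** c) ** (k + 1)


def burr_cdf(x, alpha, c, k):
    """Closed-form cdf: probability that the travel time is at most x."""
    if x <= 0:
        return 0.0
    return 1.0 - (1.0 + (x / alpha) ** c) ** (-k)


def burr_quantile(p, alpha, c, k):
    """Invert the cdf to get the travel time at percentile p (0 < p < 1)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return alpha * ((1.0 - p) ** (-1.0 / k) - 1.0) ** (1.0 / c)


def burr_mode(alpha, c, k):
    """Mode for c > 1; for c <= 1 the density is L-shaped, so return 0."""
    if c <= 1:
        return 0.0
    return alpha * ((c - 1.0) / (c * k + 1.0)) ** (1.0 / c)


# Hypothetical parameters for one link (travel time in seconds).
alpha, c, k = 120.0, 3.0, 1.5
p90 = burr_quantile(0.90, alpha, c, k)
print(f"mode = {burr_mode(alpha, c, k):.1f} s, 90th percentile = {p90:.1f} s")
```

Because the quantile function is an explicit formula, no numerical root finding is needed to obtain, for example, a 90th-percentile travel time from fitted parameters.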
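The rth moment of the Burr distribution is also an interesting statistical property. The rth moment exists only if ck > r. The rth moment of the distribution, E(X^r), is given by
\[ E(X^r) = \alpha^{r}\, \frac{\Gamma\!\left(k - \frac{r}{c}\right) \Gamma\!\left(1 + \frac{r}{c}\right)}{\Gamma(k)}, \quad ck > r \tag{4} \]
where Γ(•) is the mathematical Gamma function. The first moment (also known as the mean or expected value) and the variance of the Burr distribution are given in Equation (5) and Equation (6) respectively.
\[ E(X) = \alpha\, \frac{\Gamma\!\left(k - \frac{1}{c}\right) \Gamma\!\left(1 + \frac{1}{c}\right)}{\Gamma(k)}, \quad ck > 1 \tag{5} \]
\[ \mathrm{Var}(X) = \alpha^{2}\, \frac{\Gamma\!\left(k - \frac{2}{c}\right) \Gamma\!\left(1 + \frac{2}{c}\right)}{\Gamma(k)} - \left[E(X)\right]^{2}, \quad ck > 2 \tag{6} \]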
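The moment condition ck > r is easy to overlook in practice, so a small sketch may help. It assumes the standard three-parameter Burr (Type XII) form, for which the rth moment is α^r Γ(k − r/c) Γ(1 + r/c) / Γ(k); the function names and parameter values are hypothetical.

```python
import math


def burr_moment(r, alpha, c, k):
    """r-th raw moment of the assumed Burr (Type XII) form; needs c*k > r."""
    if c * k <= r:
        raise ValueError("moment of order r exists only if c*k > r")
    return (alpha ** r) * math.gamma(k - r / c) * math.gamma(1 + r / c) / math.gamma(k)


def burr_mean(alpha, c, k):
    """First raw moment (expected travel time)."""
    return burr_moment(1, alpha, c, k)


def burr_variance(alpha, c, k):
    """Second raw moment minus the squared mean."""
    return burr_moment(2, alpha, c, k) - burr_mean(alpha, c, k) ** 2


# Hypothetical parameters for one link (travel time in seconds).
alpha, c, k = 120.0, 3.0, 1.5
print(f"mean = {burr_mean(alpha, c, k):.1f} s")
print(f"std  = {math.sqrt(burr_variance(alpha, c, k)):.1f} s")
```

A fitted link with a heavy upper tail (small ck) may have a finite mean but no finite variance, in which case percentile-based descriptors are the safer summary.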
The parameters of the Burr distribution are estimated by the maximum likelihood estimation method. The negative log-likelihood (referred to simply as the log-likelihood value in this paper), the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) were applied to evaluate how well the distributions fit the bus travel time values. The "best" model is the distribution whose estimated parameters maximize the log-likelihood of generating the observations, which is equivalent to having the lowest negative log-likelihood value, AIC and BIC. Minimizing the negative log-likelihood is the same as minimizing a "loss function", that is, the distance between two distributions: the fitted distribution and the target distribution. The formulas of AIC and BIC are given in Equation (7) and Equation (8) respectively.
\[ \mathrm{AIC} = 2k - 2\ln(L) \tag{7} \]
\[ \mathrm{BIC} = k \ln n - 2\ln(L) \tag{8} \]
where n = number of observations; k = number of parameter(s) of the respective distribution function (not to be confused with the Burr shape parameter k); L = likelihood function of the estimated distribution.
Two ratios, the Cases_fit ratio and the Cases_top2 ratio, are introduced in order to improve the measurement accuracy in model selection. Their formulas are shown in Equation (9) and Equation (10) respectively.

\[ \text{Cases\_fit ratio} = \frac{\text{No. of stations that can be fitted with Burr distribution}}{\text{Total number of stations/sections for the respective bus route}} \tag{9} \]
\[ \text{Cases\_top2 ratio} = \frac{\text{No. of times the model performed as the best or second best}}{\text{Total number of stations/sections for the respective bus route}} \tag{10} \]
Equation (9) shows that the Cases_fit ratio is the number of sections for which the Burr distribution succeeded in representing the travel time values, divided by the number of sections/stations analysed in the respective bus route. As suggested in existing studies, the Burr distribution might have convergence issues when its parameters are estimated from the travel time observations. The Cases_fit ratio is therefore introduced to count how often the Burr distribution succeeds in fitting the travel time values in each route. Equation (10) shows that the Cases_top2 ratio is the number of times the respective distribution emerged as the "best" or second "best" model, divided by the number of sections/stations analysed in the respective bus route. This ratio is introduced to improve the accuracy of the measurement when evaluating the performance of each distribution. The goodness-of-fit values, i.e. the log-likelihood value, AIC and BIC, might be difficult to use for model selection when the differences between them are small. Selecting the "best" two models therefore helps to prevent bias in evaluating the performance of the distributions and reduces the chance of selecting the wrong model.
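The model selection measures described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the function names, the dictionary layout and the criterion values are hypothetical, and a distribution whose fit did not converge at a station is simply absent from that station's dictionary.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln(n) - 2 ln(L)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


def cases_fit_ratio(scores_per_station, model="burr"):
    """Share of stations at which `model` could be fitted at all."""
    fitted = sum(1 for scores in scores_per_station if model in scores)
    return fitted / len(scores_per_station)


def cases_top2_ratio(scores_per_station, model):
    """Share of stations at which `model` ranks best or second best.

    Each station is a dict {model name: criterion value}; lower is better
    (negative log-likelihood, AIC or BIC).
    """
    hits = 0
    for scores in scores_per_station:
        ranked = sorted(scores, key=scores.get)
        if model in ranked[:2]:
            hits += 1
    return hits / len(scores_per_station)


# Hypothetical AIC values for four stations of one route.
stations = [
    {"burr": 1010.2, "lognormal": 1012.5, "gamma": 1020.1},
    {"burr": 980.4, "lognormal": 979.9, "gamma": 990.0},
    {"lognormal": 1500.3, "gamma": 1498.7, "weibull": 1503.0},  # Burr fit failed
    {"burr": 760.0, "lognormal": 771.3, "gamma": 765.2},
]
print(cases_fit_ratio(stations))
print(cases_top2_ratio(stations, "burr"))
print(cases_top2_ratio(stations, "lognormal"))
```

In this toy data the Burr and lognormal distributions obtain the same Cases_top2 ratio even though the Burr distribution is ranked first more often, which is why the ratio is best read alongside the count of first-place rankings.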

Results and discussions
The results obtained and a detailed interpretation of the results are discussed in this section. The first subsection discusses the performance of all the candidate models and the distribution selected as the best-performing one. The statistical tests conducted in the study are also presented in this subsection. The parameters estimated for the Burr distribution are presented in the last subsection.

Distribution fitting
Visualizing the travel time data is the first step of the analysis conducted in this study. The histogram and the fitted Burr distribution for the travel time data of Station 1001276 in bus route T626 are shown in Fig. 2. Both the weekday and the weekend travel time data are right-skewed and have very long upper tails. The theoretical curves representing the Burr distribution in Fig. 2 suggest that the Burr distribution might be an appropriate model for the travel time values. The flexibility of the Burr distribution and its ability to capture the long tails can be observed in the figure. The model for the individual link travel time is examined further by computing the Cases_fit ratio and the Cases_top2 ratio. Table 3 summarizes the results obtained, while Table 4 summarizes the fitting performance of the best two distributions.

*GP = Generalized Pareto. **Mean is calculated as the total Cases_top2 ratio of the three goodness-of-fit test values divided by 3.
The bold values indicate the best-performing model for each bus route. (D) indicates weekday travel time data while (E) indicates weekend travel time data.
The Burr distribution is the best-performing model for the individual link travel time data in the first and third categories of bus routes. For both weekday and weekend travel time, the Burr distribution emerged as the best or second-best distribution in 80% and 90% of the cases in the first and third categories respectively. For the second category of bus routes, the results are mixed. The Burr distribution and the lognormal distribution each emerged as the best-performing distribution for half of the conditions evaluated in the second category, as shown in Table 3 and Table 4. However, the bus routes for which the lognormal distribution emerged as the best distribution deserve a closer look. Although the lognormal distribution emerged as the best distribution by ratio value, the number of stations at which the Burr distribution was the best distribution, that is, had the lowest log-likelihood value, AIC and BIC, was higher than for the lognormal distribution. For example, in bus route U32 (D), the numbers of stations at which the Burr distribution was the best fitted model in terms of log-likelihood value, AIC and BIC are 29, 29 and 28 respectively, compared to 17, 17 and 16 for the lognormal distribution. This shows that the higher ratio value for the lognormal distribution in Table 3 arises because the lognormal distribution emerged as the second-best fitted model at most of the stations in the bus route. The same situation can be seen in bus routes U32 (E) and U48 (E) in the second category and in bus route U76 (E) in the third category. Therefore, in overall performance, the Burr distribution is still considered the best fitted model for both weekday and weekend travel time observations.
Further analysis is also carried out to compare the performance of the model on weekday and weekend travel time data. The Burr distribution is more suitable for representing the weekday travel time data: higher ratio values were obtained for the weekday travel time data of the same bus route. The mean values of the Cases_top2 ratios show that the Burr distribution best represents the weekday travel time data, while its performance is comparatively lower for the weekend travel time.
Statistical tests were also carried out to examine the ability of the Burr distribution to fit a total of 793 individual links in the study. Only 39 cases, which account for 4.92% of the total, could not be fitted by this distribution. The travel time histograms for Station 1004155 in bus route U89 and Station 1002351 in bus route U62 are shown in Fig. 3. The travel time data in Fig. 3 could not be fitted by the Burr distribution, except for the weekday travel time data of Station 1004155. What distinguishes the data that could not be fitted is an indication of bimodality or multimodality: more than one peak can be observed in the travel time values that could not be fitted with the Burr distribution. It is also noted that weekend travel time observations have a high frequency of failure cases. Further studies could therefore look into the poorer performance of the Burr distribution in representing the weekend travel time data.
On the basis of the ten datasets, the normal, Weibull, gamma and generalized Pareto distributions did not perform well. Although the lognormal distribution might be a good statistical fit, the Burr distribution performed better in fitting the travel time data of individual links operating in different environments. The log-likelihood value, AIC and BIC computed for the Burr distribution were the lowest among the distributions compared. The Burr distribution also makes it convenient to compute statistical descriptors such as percentile values. This enhances the appropriateness of the model for TTV modelling, as traffic management operates in a dynamic environment. The Burr distribution can thus be seen as a plausible model considering its high performance and its statistical characteristics.

Burr distribution parameters
The parameters estimated for the Burr distribution are discussed in this subsection. The estimated parameters give an insight into the basic statistical description of the data. A number of selected individual links fitted by the Burr distribution are shown in Table 5 with their estimated parameters. Higher than average scale parameter values were observed for both weekday and weekend travel time on the selected individual links in the same bus route. The scale parameter of the Burr distribution indicates the most concentrated part of the data, or the estimated median travel time based on the fitted distribution, since a linear relationship was found between the scale parameter value of the Burr distribution and the true median travel time value [21]. Therefore, higher travel time values, or higher estimated median values, were obtained at these stations compared to other stations. This can be explained by the time spent by the bus drivers at these stations. The stations shown in the table are either the initial station, a transfer station or the last station of a bus route. It is therefore reasonable that a higher than average median travel time is observed, as these are the main bus stations in a bus route.
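The link between the scale parameter and the median can be made explicit with a short derivation. It assumes the standard three-parameter Burr (Type XII) cdf, F(x) = 1 − [1 + (x/α)^c]^(−k), and is offered as a sketch of why a linear relationship is plausible rather than as a reproduction of the analysis in [21].

```latex
\[
F(x_{0.5}) = \tfrac{1}{2}
\;\Longrightarrow\;
\left[1 + \left(\frac{x_{0.5}}{\alpha}\right)^{c}\right]^{-k} = \tfrac{1}{2}
\;\Longrightarrow\;
x_{0.5} = \alpha \left(2^{1/k} - 1\right)^{1/c}
\]
```

Under this assumption, for fixed shape parameters the median is proportional to α, and it equals α exactly when k = 1.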

Conclusion
The focus of this study is to find the most appropriate model to explain the weekday and weekend travel time values of each section separately for day-to-day TTV analysis. Seven consecutive months of data from a total of ten bus routes are used for the research. The bus routes were categorized into three categories according to their characteristics. The Burr distribution was proposed as the leading model, while several candidate models, i.e. normal, lognormal, gamma, Weibull and generalized Pareto, were used for comparison.
The Burr distribution is found to accommodate the shape of the travel time distribution, i.e. its right skew and long upper tail. The results obtained in the study lead to the conclusion that the Burr distribution is the best representation of the travel time values in overall performance. It represents weekday travel time values best, while its performance is comparatively lower in fitting weekend travel time values. Although there are some failure cases, which show signs of bimodality or multimodality, they account for less than 5% of the total number of cases examined in the study. It is also important to take into account the statistical properties of the Burr distribution, i.e. its flexibility and mathematical tractability, which enhance its suitability for reliability analysis. Therefore, the Burr distribution is considered the most appropriate model for day-to-day TTV analysis in representing both weekday and weekend travel time values.
This paper could serve as preliminary preparation for further reliability analysis. Further research should investigate the use of the distribution in computing the variation in travel time for each individual link of a bus route. It is also important to embed the traffic factors implicitly into the model in statistical modelling studies of TTV.