A Time Series Forecasting Method

This paper proposes a novel time series forecasting method based on a weighted self-constructing clustering technique. The weighted self-constructing clustering processes all the data patterns incrementally. If a data pattern is not similar enough to an existing cluster, it forms a new cluster of its own. However, if a data pattern is similar enough to an existing cluster, it is removed from the cluster it currently belongs to and added to the most similar cluster. During the clustering process, weights are learned for each cluster. Given a series of time-stamped data up to time t, we divide it into a set of training patterns. By using the weighted self-constructing clustering, the training patterns are grouped into a set of clusters. To estimate the value at time t + 1, we find the k nearest neighbors of the input pattern and use these k neighbors to decide the estimation. Experimental results are shown to demonstrate the effectiveness of the proposed approach.


Introduction
Clustering is an unsupervised classification technique whose purpose is to form meaningful clusters of the objects under consideration. Usually, similar objects are grouped in the same cluster, and dissimilar objects are grouped in different clusters. Clustering techniques play a very important role in the field of artificial intelligence [1] [2] [3] [4]. In particular, they are widely applied in time series data analysis in a variety of areas, such as bioengineering [5], environmental monitoring [6], and economic applications. In the process of clustering time series data, using the same weight for every dimension may degrade the clustering quality. To deal with this difficulty, Huang et al. proposed TSKmeans [7], which is K-means with weights, to assign different weights to different dimensions of the data. A similarity measure based on the weighted Euclidean distance was adopted. Through quadratic programming, a smooth subspace over the time stamps can be produced. It was shown that TSKmeans can produce better clusters than the original K-means for time series data.
This paper proposes another weight-based clustering method for time series forecasting. Instead of using K-means, an iterative self-constructing clustering method is adopted. The method performs several rounds of clustering until convergence is reached. In each round, all the data points are processed incrementally. If a data point is not similar enough to an existing cluster, it forms a new cluster of its own. However, if a data point is similar enough to an existing cluster, it is removed from the cluster it currently belongs to and added to the most similar cluster. In each iteration of clustering, weights are learned for each cluster. When convergence is reached, the clustering process stops with a desired number of clusters.
The weight-based self-constructing clustering technique is then applied to time series forecasting. Given a series of time-stamped data up to time t, we divide it into a set of training patterns. By using the weight-based clustering, the training patterns are grouped into a set of clusters. To estimate the value at time t + 1, we find the k nearest neighbors of the input pattern and use the weighted sum of the centers of these k neighbors as the estimate. Experimental results are shown to demonstrate the effectiveness of the proposed approach.
The rest of this paper is organized as follows. The weight-based self-constructing clustering technique is described in Section 2. Our proposed time series prediction method is presented in Section 3. Experimental results are shown in Section 4. Section 5 gives a conclusion.

Weighted Self-Constructing Clustering
We describe the weighted self-constructing clustering algorithm in detail. We improve the method by incorporating weights in the calculation of similarity, just as TSKmeans does for K-means. First of all, we briefly introduce TSKmeans [7]. TSKmeans is K-means incorporated with weights. It tries to make the distance between the data points contained in a cluster and the center of the cluster small through the use of weights on the time stamps. Let $X = \{X_1, X_2, \ldots, X_n\}$ be a set of n time series patterns. Each pattern $X_i = \{X_{i1}, X_{i2}, \ldots, X_{im}\}$ is the ith pattern, characterized by m values, i.e., m time stamps. The membership matrix U is an n×k binary matrix, where k is the total number of clusters, with $u_{ip} = 1$ indicating that $X_i$ belongs to cluster p and $u_{ij} = 0$ for $j \neq p$. The centers and weights of the clusters are represented by two sets of k vectors, $Z = \{Z_1, Z_2, \ldots, Z_k\}$ and $W = \{W_1, W_2, \ldots, W_k\}$, with $w_{pj}$ being the weight of the jth time stamp for the pth cluster. The purpose of TSKmeans is to minimize the objective function of Eq. (1), subject to its constraints, by the application of quadratic programming. The resulting weights are used in each iteration of K-means until convergence is reached.
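For reference, the TSKmeans objective is commonly written in the form below. This is a sketch recalled from the formulation in [7], not a reproduction of this paper's Eq. (1): the smoothing parameter $\alpha$, the component notation $Z_{pj}$ for the jth entry of center $Z_p$, and the exact constraint set are assumptions here.

```latex
\begin{aligned}
P(U, Z, W) &= \sum_{p=1}^{k}\sum_{i=1}^{n}\sum_{j=1}^{m}
              u_{ip}\, w_{pj}\, (X_{ij} - Z_{pj})^2
            + \frac{\alpha}{2}\sum_{p=1}^{k}\sum_{j=1}^{m-1}
              (w_{pj} - w_{p,j+1})^2, \\
\text{subject to}\quad
  & \sum_{p=1}^{k} u_{ip} = 1,\quad u_{ip} \in \{0, 1\},\qquad
    \sum_{j=1}^{m} w_{pj} = 1,\quad 0 \le w_{pj} \le 1 .
\end{aligned}
```

Under this form, the first term is the weighted within-cluster dispersion, and the second term penalizes differences between the weights of adjacent time stamps, which is what would yield a smooth subspace. With U and Z held fixed, the expression is quadratic in W, which is consistent with the use of quadratic programming.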
At the beginning, TSKmeans randomly generates the centers of the clusters and sets initial values for the weights of the clusters. Then the following three steps are performed iteratively:
Step 1. For each pattern $X_i$, compute the weighted distance $D_{ip}$ between it and each cluster p. A pattern is assigned to the cluster with the smallest distance. If pattern i is assigned to cluster p, then $u_{ip}$ is set to 1 and $u_{ij}$ is set to 0 for $j \neq p$.
Step 2. Update the center $Z_p$ of each cluster as the mean of the patterns currently assigned to it, for $1 \le p \le k$.
Step 3. With U and Z known, update W by applying quadratic programming to Eq. (1). If clusters have changed in the current iteration, Steps 1-3 are performed again. Otherwise, TSKmeans stops.
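Steps 1 and 2 can be illustrated with a short NumPy sketch. It assumes the weighted distance is the weighted squared Euclidean distance $D_{ip} = \sum_j w_{pj}(X_{ij} - Z_{pj})^2$, which is an assumption here; the function names are hypothetical, and the quadratic-programming update of W in Step 3 is deliberately omitted.

```python
import numpy as np

def weighted_sq_dist(X, Z, W):
    """D[i, p] = sum_j W[p, j] * (X[i, j] - Z[p, j])**2 (assumed distance form)."""
    diff = X[:, None, :] - Z[None, :, :]           # shape (n, k, m)
    return np.einsum("pj,ipj->ip", W, diff ** 2)   # shape (n, k)

def assign_and_update(X, Z, W):
    """One pass of Step 1 (assignment) and Step 2 (center update); W stays fixed."""
    D = weighted_sq_dist(X, Z, W)
    labels = D.argmin(axis=1)                      # nearest cluster per pattern
    Z_new = Z.copy()
    for p in range(Z.shape[0]):
        members = X[labels == p]
        if len(members) > 0:                       # keep the old center if a cluster is empty
            Z_new[p] = members.mean(axis=0)
    return labels, Z_new

# Toy data: two well-separated groups, two initial centers, uniform weights.
X = np.array([[0.0, 0.0], [0.2, 0.0], [4.0, 4.0], [4.2, 4.0]])
Z = np.array([[0.0, 1.0], [3.0, 3.0]])
W = np.full((2, 2), 0.5)
labels, Z_new = assign_and_update(X, Z, W)
```

In a full implementation this pass would alternate with the weight update until the assignments stop changing.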
2. If $(X^{(i)}, Y^{(i)})$ currently belongs to cluster $C_a$, we remove $(X^{(i)}, Y^{(i)})$ from $C_a$, update the characteristics of $C_a$, and add $(X^{(i)}, Y^{(i)})$ to the most similar cluster $C_t$ as before. After all the patterns are considered, if none of the cluster assignments has changed, we stop with K clusters. If the cluster assignments of some patterns have changed, we update the weights W by minimizing the objective function of Eq. (1). Then we proceed with the next round of clustering.
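The round structure described above can be sketched as follows. This is only an illustration under stated assumptions: similarity is judged by a threshold (`max_dist`, hypothetical) on a weighted squared distance to the cluster mean, new clusters start with uniform weights, and the weight update between rounds is omitted. The similarity measure actually used by the method is not reproduced here.

```python
import numpy as np

def scc_round(X, labels, sums, counts, weights, max_dist):
    """One incremental pass over all patterns; returns True if any assignment changed.
    labels[i] == -1 means pattern i has not been assigned yet."""
    changed = False
    for i, x in enumerate(X):
        old = labels[i]
        if old >= 0:                               # remove x from its current cluster
            sums[old] -= x
            counts[old] -= 1
        best, best_d = -1, np.inf
        for c in range(len(sums)):                 # find the most similar non-empty cluster
            if counts[c] == 0:
                continue
            d = np.sum(weights[c] * (x - sums[c] / counts[c]) ** 2)
            if d < best_d:
                best, best_d = c, d
        if best < 0 or best_d > max_dist:          # not similar enough to any cluster
            if old >= 0 and counts[old] == 0:
                best = old                         # x was alone: it re-forms its own cluster
            else:
                sums.append(np.zeros_like(x))
                counts.append(0)
                weights.append(np.full(x.shape, 1.0 / x.size))
                best = len(sums) - 1
        sums[best] += x
        counts[best] += 1
        labels[i] = best
        changed = changed or (best != old)
    return changed

X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
labels, sums, counts, weights = [-1] * len(X), [], [], []
rounds = 0
while scc_round(X, labels, sums, counts, weights, max_dist=1.0):
    rounds += 1                                    # weights would be re-learned here
```

The number of clusters is not fixed in advance; it emerges from the threshold, which is the defining trait of a self-constructing scheme.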

Time Series Prediction
Suppose we are given a time series A that is indexed by the natural numbers as
$$A_1, A_2, \ldots, A_t \tag{10}$$
where $A_i$ denotes the value taken at time i. We would like to estimate the value at time t + 1 based on these t values.
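One common way to divide such a series into training patterns is a sliding window, sketched below. The function name `make_patterns` and the pairing of each window with the value that follows it are assumptions for illustration; the window size corresponds to the WS parameter used in the experiments.

```python
import numpy as np

def make_patterns(series, window):
    """Split a series into (input window, next value) training pairs."""
    series = np.asarray(series, dtype=float)
    n = len(series) - window
    if n <= 0:
        raise ValueError("series must be longer than the window size")
    X = np.stack([series[i:i + window] for i in range(n)])   # inputs, shape (n, window)
    y = series[window:]                                       # targets, shape (n,)
    return X, y

series = [1, 2, 3, 4, 5, 6]
X_train, y_train = make_patterns(series, window=3)
# The input pattern for forecasting time t + 1 is the last `window` values.
query = np.asarray(series[-3:], dtype=float)
```

With six values and a window of 3, this yields three training pairs, and the last three values form the input pattern for the forecast.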
Finally, we estimate the value at time t + 1 as the weighted sum of the centers of the k nearest neighbors of the input pattern, and we are done.
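A minimal sketch of this estimation step is given below. It assumes each cluster center has an input part and an output part, that neighbors are found by plain Euclidean distance on the input parts, and that the weights of the sum are normalized inverse distances. The weighting scheme and all names are assumptions, not the paper's definition.

```python
import numpy as np

def knn_estimate(query, centers_x, centers_y, k=2, eps=1e-12):
    """Estimate the next value from the k cluster centers nearest to the query."""
    d = np.linalg.norm(centers_x - query, axis=1)   # distance to every center
    idx = np.argsort(d)[:k]                         # indices of the k nearest centers
    w = 1.0 / (d[idx] + eps)                        # assumed inverse-distance weights
    w /= w.sum()                                    # normalize so the weights sum to 1
    return float(np.dot(w, centers_y[idx]))

centers_x = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 10.0]])
centers_y = np.array([1.0, 3.0, 100.0])
estimate = knn_estimate(np.array([1.0, 0.0]), centers_x, centers_y, k=2)
```

In this toy case the query is equidistant from the first two centers, so the estimate is the plain average of their outputs.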

Experimental Results
In this section, we present and compare the experimental results of several methods on the Poland dataset [8]. There are 1,500 instances in the dataset. The first 1,000 instances are taken for training and the remaining instances are taken for testing. The methods compared include ANFIS [9], NN-MAT [10], ARMA [11], Sorjamaa [12], the original K-means, TSKmeans, and SCC. For convenience, our proposed method is called SCC with weights, abbreviated as SCC-W. To evaluate the effectiveness of these methods, the following performance measures are adopted [13]:
Maximum absolute error (MAE). It is defined as
$$\mathrm{MAE} = \max_{1 \le i \le N_t} |y_i - \hat{y}_i|$$
where $y_i$ is the desired value, $\hat{y}_i$ is the predicted value, and $N_t$ is the number of testing instances. MAE takes the maximum difference magnitude among all the testing instances.
Root-mean-square error (RMSE). It is defined as
$$\mathrm{RMSE} = \sqrt{\frac{1}{N_t} \sum_{i=1}^{N_t} (y_i - \hat{y}_i)^2}$$
where $y_i$ is the desired value, $\hat{y}_i$ is the predicted value, and $N_t$ is the number of testing instances.
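Both measures are straightforward to compute; the following sketch uses NumPy, with hypothetical function names. Note that MAE here denotes the maximum absolute error, not the more common mean absolute error.

```python
import numpy as np

def max_abs_error(y_true, y_pred):
    """Maximum absolute error over all testing instances."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.max(np.abs(y_true - y_pred)))

def rmse(y_true, y_pred):
    """Root-mean-square error over all testing instances."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 6.0]
mae_value = max_abs_error(y_true, y_pred)   # single worst-case deviation
rmse_value = rmse(y_true, y_pred)           # averaged over all instances
```

Because MAE reflects only the single worst prediction while RMSE averages over all testing instances, the two measures can rank methods differently.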
These measures have a common property: a lower value indicates better prediction performance. We have found that SCC-W provides the best performance in terms of both MAE and RMSE. A window size of 5 and k = 2 are used for SCC-W. However, different window sizes may affect the performance of SCC-W, as shown in Table 1. As can be seen, the performance of SCC-W varies slightly as the window size varies. The forecasted values are very close to the target values. We can conclude that SCC-W has good prediction capability.
Table 1. Effect of different window sizes (WS)

Conclusion
We have presented a weight-based self-constructing clustering method and applied it to time series forecasting. Given a series of time-stamped data up to time t, we divide it into a set of training patterns. By using the weight-based clustering, the training patterns are grouped into a set of clusters. To estimate the value at time t + 1, we find the k nearest neighbors of the input pattern and use the weighted sum of the centers of these k neighbors as the estimate. Experimental results have shown that the proposed approach is effective and promising for time series forecasting.