Prediction of monthly electric energy consumption using pattern-based fuzzy nearest neighbour regression

Electricity demand forecasting plays an important role in power system planning and operation. In this work, fuzzy nearest neighbour regression is used to forecast monthly electricity demand. The forecasting model is based on a pre-processed energy consumption time series, in which input and output variables are defined as patterns representing unified fragments of the time series. Relationships between inputs and outputs, which are simplified by the use of patterns, are modelled using nonparametric regression with a weighting function defined as the fuzzy membership of learning points in the neighbourhood of a query point. In the experimental part of the work, the model is evaluated on real-world data. The results are encouraging and show that the model performs well and is competitive with other forecasting models.


Introduction
Electric energy consumption forecasting is an essential issue in power system planning and operation. Medium-term forecasting is necessary for technical and operational purposes, such as scheduling maintenance activities, planning production levels and fuel purchases, and planning network investments. From an economic viewpoint, energy consumption forecasts are fundamental for negotiating contracts between energy companies and concluding contracts with customers.
Fig. 1 shows periodic time series representing monthly energy consumption for four European countries (data from the ENTSO-E repository, www.entsoe.eu). Seasonal variations and a rising tendency can be observed in this figure, the latter caused by the influence of economic and technological development on the electricity market. Seasonal variations reflect the annual cycle and depend on climatic factors, which are comparable during the same month of different years. Other factors that directly or indirectly affect the level of energy consumption are political decisions and economic policy. They can disturb the general rising trend and the monthly fluctuations. These factors include the emergence of alternative energy sources and technologies, fluctuating inflation, sharp changes in energy prices, industrial development, and global warming issues [1], [2].
The time series of monthly electric energy demand presented in Fig. 1 differ depending on the power system size and the economic development of the country. Note the significant share of the random component in the time series and the larger amplitude of the annual cycles for France than for the other countries.
Two approaches have been developed for medium-term electric energy consumption forecasting [3]. The first one, called the conditional modelling approach, focuses on economic analysis, management, and long-term planning and forecasting of energy load and energy policies. It considers socioeconomic conditions that affect energy demand, such as economic indicators and electrical infrastructure measures. These additional inputs are introduced to the model together with historical load data and weather-related variables. Such a model can be found in [4]. It includes macroeconomic indicators, such as the consumer price index, average salary earnings and the currency exchange rate.
The second approach, called the autonomous modelling approach, requires a smaller set of input information to forecast future electricity demand, primarily historical loads and weather factors. Because economic factors are not taken into consideration, this approach is better suited to stable economies. Different forecasting models are used in this case, such as the classical autoregressive integrated moving average (ARIMA) and multiple linear regression [5], as well as computational intelligence methods, e.g. neural networks [6]. Examples of such models can be found in [7], where ARIMA, neural networks and neuro-fuzzy systems are employed to forecast future load demand based on various weather-related parameters and historical load profiles. Another example is the model presented in [8], where interval load forecasting is proposed using multi-output support vector regression. In addition, a memetic algorithm is used to select input variables among the candidate variables, which include time-lagged loads and temperatures. In [9], a neural network is used to forecast load time series components extracted by digital filtering. Evolving fuzzy neural networks are proposed for monthly electricity demand forecasting in [4]. In this solution, fuzzy neurons represent the degree of importance of each input variable (loads, weather factors and daylight time). Different weights assigned to the input variables lead to improved model accuracy and more precise prediction.
The forecasting model proposed in this work belongs to the latter category. It uses fuzzy nearest neighbour regression (FNNR) based on patterns of time series fragments. The underlying assumption of this model is that if two fragments of the time series are similar in shape, then the fragments following them are also similar in shape [10]. This approach is especially attractive when the time series exhibits a seasonal pattern. In our earlier works, we proposed models from the same class of pattern similarity-based nonparametric regression models: a model based on k-nearest neighbours (k-NN) [11] and one based on the Nadaraya-Watson estimator [12]. The proposed FNNR uses fuzzy set theory to take into account the degree of similarity between the shapes of time series fragments.
The remainder of the paper is organised as follows. Section 2 describes the time series representation based on patterns of its fragments. Section 3 defines the forecasting model based on fuzzy nearest neighbour regression. Section 4 tests the model on real-world data. Section 5 concludes the work.

Patterns of time series fragments
In the first stage of the proposed approach, the load time series were pre-processed using the methods presented in [10]. Input and output patterns were defined. The input pattern was an n-dimensional vector representing the time series fragment preceding the forecasted one. Let us denote the forecasted fragment by $Y_i = \{E_{i+1}, E_{i+2}, \ldots, E_{i+m}\}$ and the preceding fragment by $X_i = \{E_{i-n+1}, E_{i-n+2}, \ldots, E_i\}$, where $E_k$ is the monthly energy consumption and $k$ is the time index. An input pattern $x_i = [x_{i,1}\ x_{i,2}\ \ldots\ x_{i,n}]^T$ represented the fragment $X_i$. The components of that vector were pre-processed points of the sequence $X_i$. For example [11]:

$$x_{i,t} = E_{i-n+t} \qquad (1)$$

$$x_{i,t} = \frac{E_{i-n+t}}{\bar{E}_i} \qquad (2)$$

$$x_{i,t} = E_{i-n+t} - \bar{E}_i \qquad (3)$$

$$x_{i,t} = \frac{E_{i-n+t} - \bar{E}_i}{D_i} \qquad (4)$$

where $t = 1, 2, \ldots, n$, $\bar{E}_i$ is the mean value of the points in sequence $X_i$, and

$$D_i = \sqrt{\sum_{j=1}^{n} \left(E_{i-n+j} - \bar{E}_i\right)^2}.$$

A pattern defined using (1) is a copy of the sequence $X_i$ without processing. Pattern components defined using (2) are the points of the sequence $X_i$ divided by the mean value of this sequence. Patterns (3) are composed of the differences between the points and the mean value of the sequence. Pattern (4) is the normalised vector $[E_{i-n+1}\ E_{i-n+2}\ \ldots\ E_i]^T$. All patterns defined using (4) have unit length, a mean value equal to zero, and the same variance.
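The four input pattern definitions can be illustrated with a short Python sketch. It is not the authors' implementation: the function name `x_pattern` and the sample values are hypothetical, and the dispersion D_i is assumed to be the Euclidean norm of the mean-centred fragment, which is the choice that gives patterns (4) unit length and zero mean as stated above.

```python
import math

def x_pattern(fragment, definition=4):
    """Encode a time series fragment X_i as an x-pattern, definitions (1)-(4)."""
    n = len(fragment)
    mean = sum(fragment) / n                      # mean value of X_i
    if definition == 1:
        return list(fragment)                     # (1): plain copy of X_i
    if definition == 2:
        return [e / mean for e in fragment]       # (2): points divided by the mean
    if definition == 3:
        return [e - mean for e in fragment]       # (3): differences from the mean
    if definition == 4:
        # Assumed form of D_i: Euclidean norm of the mean-centred fragment.
        dispersion = math.sqrt(sum((e - mean) ** 2 for e in fragment))
        return [(e - mean) / dispersion for e in fragment]
    raise ValueError("definition must be 1, 2, 3 or 4")

# Hypothetical 12-month fragment in arbitrary units, for illustration only.
fragment = [14.2, 12.9, 13.1, 11.8, 11.2, 10.9, 11.0, 10.8, 11.5, 12.6, 13.3, 14.0]
x4 = x_pattern(fragment, definition=4)
length = math.sqrt(sum(v * v for v in x4))        # close to 1 for definition (4)
```

Under this assumption, every pattern (4) lies on the unit sphere, so Euclidean distances between patterns compare shapes only and ignore the level and amplitude of the fragments.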
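The output pattern $y_i = [y_{i,1}\ y_{i,2}\ \ldots\ y_{i,m}]^T$, representing the forecasted sequence $Y_i$, had components defined similarly to the x-pattern components:

$$y_{i,t} = E_{i+t} \qquad (5)$$

$$y_{i,t} = \frac{E_{i+t}}{\bar{E}_i} \qquad (6)$$

$$y_{i,t} = E_{i+t} - \bar{E}_i \qquad (7)$$

$$y_{i,t} = \frac{E_{i+t} - \bar{E}_i}{D_i} \qquad (8)$$

where $t = 1, 2, \ldots, m$.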
In formulas (5)-(8), $\bar{E}_i$ and $D_i$ are determined from the sequence $X_i$, and not from the sequence $Y_i$. This is because the sequence $Y_i$ is not known at the moment of forecasting. To determine the forecast of the monthly energy consumption $E_{i+t}$ from the forecasted y-pattern generated by the forecasting model, the transformed equations (5)-(8) are used. For example, in the case of (8), the forecasted energy consumption is calculated as follows:

$$\hat{E}_{i+t} = \hat{y}_{i,t} D_i + \bar{E}_i \qquad (9)$$

where $\hat{y}_{i,t}$ is the t-th component of the forecasted y-pattern. Patterns $x_i$ and $y_i$ are paired as $(x_i, y_i)$. The set of these pairs determined from the history is used for learning the forecasting model.
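The encoding of an output pattern with definition (8) and its decoding back to energy values can be sketched as follows. The helper names and sample values are hypothetical, and D_i is again assumed to be the Euclidean norm of the mean-centred input fragment. The point of the sketch is that both coding variables come from X_i only, so the decoding step needs nothing that is unknown at forecasting time.

```python
import math

def coding_variables(x_fragment):
    """Return the mean and the (assumed) dispersion D_i of the input fragment X_i."""
    mean = sum(x_fragment) / len(x_fragment)
    dispersion = math.sqrt(sum((e - mean) ** 2 for e in x_fragment))
    return mean, dispersion

def encode_y(y_fragment, mean, dispersion):
    """Definition (8): centre and scale Y_i with the coding variables of X_i."""
    return [(e - mean) / dispersion for e in y_fragment]

def decode_y(y_pattern, mean, dispersion):
    """Inverse of definition (8): map a y-pattern back to energy values."""
    return [y * dispersion + mean for y in y_pattern]

# Hypothetical fragments in arbitrary units, for illustration only.
x_fragment = [11.0, 10.8, 11.5, 12.6, 13.3, 14.0]
y_fragment = [14.4, 13.1, 13.2]

mean, dispersion = coding_variables(x_fragment)
y_pattern = encode_y(y_fragment, mean, dispersion)
restored = decode_y(y_pattern, mean, dispersion)  # matches y_fragment up to rounding
```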

Fuzzy nearest neighbour regression
The nearest neighbour estimate m(x) is defined as the weighted average of the y-patterns in a varying neighbourhood of the query x-pattern. Typically, this neighbourhood is defined through the x-patterns which are among the k nearest neighbours of the query pattern [11]. The value of k determines the number of training patterns from which the regression function is constructed and controls the degree of smoothing. The k-NN estimator yields a discontinuous regression function: at the points where the set of nearest neighbours changes, jumps appear in the graph of the function. To avoid this, a fuzzy membership of the training points in the neighbourhood of the query point was introduced [13]. In this approach, each training point belongs to the query point neighbourhood to a degree that depends on the distance between these points.
The regression function m(x) has the nonparametric form:

$$m(x) = \frac{\sum_{j} w(x, x_j)\, y_j}{\sum_{j} w(x, x_j)} \qquad (10)$$

where the weighting function $w(x, x_j)$ depends on the similarity or distance between patterns $x$ and $x_j$. Usually it decreases monotonically with the distance. In the fuzzy approach, the weighting function takes the form of a membership function, e.g. a Gaussian-type function:

$$w(x, x_j) = \exp\left(-\left(\frac{d(x, x_j)}{\sigma}\right)^2\right) \qquad (11)$$

where $\sigma$ is a parameter controlling the width of the function, and $d(x, x_j)$ is the Euclidean distance between patterns $x$ and $x_j$. The estimator (10) is a linear combination of the vectors $y_j$ weighted by the membership degree (11), which maps the distance $d(x, x_j)$ nonlinearly. The greater the distance, the lower the weight. The width parameter $\sigma$ controls the bias-variance trade-off of the estimator. Too small a value of $\sigma$ results in undersmoothing, whereas too large a value results in oversmoothing. Thus, the selection of the width parameter is a key problem. In the training procedure, the optimal value of $\sigma$ is selected, as well as the optimal length of the input pattern n. Both parameters are selected by grid search.
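A minimal Python sketch of the estimator is given below. It assumes the Gaussian-type membership exp(-(d/sigma)^2); the exact scaling of the exponent is an assumption, and the function names and toy patterns are hypothetical.

```python
import math

def membership(x, xj, sigma):
    """Gaussian-type membership of training pattern xj in the neighbourhood of x.

    Assumed form: exp(-(d / sigma)^2), with d the Euclidean distance.
    """
    d = math.sqrt(sum((p - q) ** 2 for p, q in zip(x, xj)))
    return math.exp(-(d / sigma) ** 2)

def fnnr_predict(x, train_x, train_y, sigma):
    """Membership-weighted average of the training y-patterns."""
    weights = [membership(x, xj, sigma) for xj in train_x]
    total = sum(weights)  # may underflow to zero if sigma is far too small
    m = len(train_y[0])
    return [sum(w * yj[t] for w, yj in zip(weights, train_y)) / total
            for t in range(m)]

# Toy training set (hypothetical): two x-patterns with one-component y-patterns.
train_x = [[0.0, 0.0], [1.0, 1.0]]
train_y = [[1.0], [3.0]]

narrow = fnnr_predict([0.0, 0.0], train_x, train_y, sigma=0.1)    # follows the nearest pattern
wide = fnnr_predict([0.0, 0.0], train_x, train_y, sigma=1000.0)   # approaches the global mean
```

The two calls illustrate the smoothing role of the width parameter: a small sigma lets only the closest patterns contribute, while a large sigma weights all training patterns almost equally.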
The training set contains pairs of patterns (xi, yi) that are historical with respect to the forecasted sequence, i.e. those for which i = n, n+1, ..., i*-m, where i* is the index of the last month before the forecasted sequence. The forecasting task is to generate forecasts for months i*+1, i*+2, ..., i*+m.
The forecasting procedure consists of four steps:
1. Pre-processing of the load time series into x- and y-patterns.
2. Calculating the weights for the training x-patterns using the membership function (11).
3. Calculating the forecasted y-pattern from (10).
4. Decoding the forecasted y-pattern using the transformed equations (5)-(8) to get the monthly electricity demand for the consecutive months i*+1, i*+2, ..., i*+m.
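The four steps above can be sketched end to end in Python. This is an illustration under stated assumptions rather than the authors' code: it uses pattern definitions (4) and (8) with D_i taken as the Euclidean norm of the mean-centred input fragment, a membership function of the form exp(-(d/sigma)^2), a hypothetical function name `fnnr_forecast`, and a synthetic, purely seasonal series instead of real data.

```python
import math

def fnnr_forecast(series, n, m, sigma):
    """Forecast the m values that follow `series` (sketch of the four steps)."""
    def coding(frag):
        mean = sum(frag) / len(frag)
        disp = math.sqrt(sum((e - mean) ** 2 for e in frag))  # assumed D_i
        return mean, disp

    def encode(frag, mean, disp):
        return [(e - mean) / disp for e in frag]

    # Step 1: pre-process the history into paired x- and y-patterns.
    train_x, train_y = [], []
    for end in range(n, len(series) - m + 1):   # `end` is the index just after X_i
        x_frag, y_frag = series[end - n:end], series[end:end + m]
        mean, disp = coding(x_frag)
        train_x.append(encode(x_frag, mean, disp))
        train_y.append(encode(y_frag, mean, disp))
    q_frag = series[-n:]                        # fragment preceding the forecast
    q_mean, q_disp = coding(q_frag)
    query = encode(q_frag, q_mean, q_disp)

    # Step 2: membership weights of the training x-patterns.
    weights = [math.exp(-(math.dist(query, xj) / sigma) ** 2) for xj in train_x]

    # Step 3: forecasted y-pattern as the weighted average of training y-patterns.
    total = sum(weights)
    y_hat = [sum(w * yj[t] for w, yj in zip(weights, train_y)) / total
             for t in range(m)]

    # Step 4: decode with the coding variables of the query fragment.
    return [y * q_disp + q_mean for y in y_hat]

# Synthetic six-year monthly series with a pure annual cycle (illustration only).
series = [100 + 10 * math.sin(2 * math.pi * k / 12) for k in range(72)]
forecast = fnnr_forecast(series, n=12, m=12, sigma=0.1)
actual = [100 + 10 * math.sin(2 * math.pi * k / 12) for k in range(72, 84)]
max_error = max(abs(f - a) for f, a in zip(forecast, actual))
```

On this noise-free series the query pattern has exact matches in the history, so a narrow membership function reproduces the next cycle almost exactly. Real series with trend and noise would not behave this cleanly.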

Experimental study
In this section, the proposed FNNR method is applied to forecast monthly electricity demand. The results are then compared with those of several statistical and machine learning methods reported for load demand forecasting. The data were taken from the publicly available ENTSO-E repository (www.entsoe.eu). They include monthly electricity demand for four European countries: Poland (PL), Germany (DE), Spain (ES) and France (FR). The data cover 1998-2015 for PL and 1991-2015 for the other countries. We constructed forecasting models for 2015, using data from previous years for model learning. Two variants of forecasting were considered:
• Variant A: a single model generated forecasts for all 12 months of 2015 (i* was the index of December 2014, m = 12),
• Variant B: for each month of 2015 a separate model was created, which generated a one-step-ahead forecast (12 models created for i* corresponding to December 2014, January 2015, ..., November 2015; m = 1).
The model parameters, σ and n, were selected using grid search in a leave-one-out cross-validation procedure.
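One possible reading of this selection procedure is sketched below in Python. The details are assumptions, since the text does not specify them: each training pattern is predicted in turn from all the others, the validation error is the MAPE of the decoded forecasts, and the grids, function names and synthetic series are hypothetical. Patterns follow definitions (4) and (8), and the membership function is assumed to be exp(-(d/sigma)^2).

```python
import math

def build_patterns(series, n, m):
    """Encode every (X_i, Y_i) pair; D_i is the assumed norm of centred X_i."""
    xs, ys, codes, targets = [], [], [], []
    for end in range(n, len(series) - m + 1):
        x_frag, y_frag = series[end - n:end], series[end:end + m]
        mean = sum(x_frag) / n
        disp = math.sqrt(sum((e - mean) ** 2 for e in x_frag))
        xs.append([(e - mean) / disp for e in x_frag])
        ys.append([(e - mean) / disp for e in y_frag])
        codes.append((mean, disp))
        targets.append(y_frag)
    return xs, ys, codes, targets

def loo_mape(series, n, m, sigma):
    """Leave-one-out validation MAPE (in percent) for one (n, sigma) pair."""
    xs, ys, codes, targets = build_patterns(series, n, m)
    ape = []
    for i, xq in enumerate(xs):
        # The left-out pattern gets zero weight; all others use the membership.
        w = [0.0 if j == i else math.exp(-(math.dist(xq, xj) / sigma) ** 2)
             for j, xj in enumerate(xs)]
        total = sum(w)
        if total == 0.0:              # all weights underflowed: sigma unusable
            return math.inf
        mean, disp = codes[i]
        for t in range(m):
            y_hat = sum(wj * yj[t] for wj, yj in zip(w, ys)) / total
            e_hat = y_hat * disp + mean
            ape.append(abs(targets[i][t] - e_hat) / abs(targets[i][t]))
    return 100.0 * sum(ape) / len(ape)

def grid_search(series, n_grid, sigma_grid, m):
    """Return (validation MAPE, n, sigma) with the lowest validation error."""
    return min((loo_mape(series, n, m, s), n, s)
               for n in n_grid for s in sigma_grid)

# Synthetic series with trend, annual cycle and an irregular term (illustration only).
series = [100 + 0.2 * k + 10 * math.sin(2 * math.pi * k / 12) + 2 * math.sin(1.7 * k)
          for k in range(96)]
n_grid = [6, 12]
sigma_grid = [0.05, 0.2, 0.8]
best = grid_search(series, n_grid, sigma_grid, m=1)
```

Because neighbouring patterns of a time series overlap, leaving out a single pattern does not fully isolate it from the rest of the training set; the sketch ignores this for simplicity.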
Tables 1-8 present the optimal parameter values and the Mean Absolute Percentage Errors (MAPE) obtained with them: validation errors (MAPEval) and test errors (MAPEtst for 2015). According to the tables, no single pattern definition is clearly the best. The results depend on the features of the time series, such as the trend and the level of random, irregular influences. The optimal x-pattern lengths vary between 8 and 24, depending on the time series and the pattern definition. Note that the optimal lengths are rarely equal to the length of the annual cycle, which is characteristic of these time series. Fig. 2 shows the test errors for individual months in both variants, A and B. Note that variant B, which generates one-step-ahead forecasts, does not always provide better results than variant A, in which the forecast horizon is 12 months. Errors for successive months vary considerably. This is caused by the significant contribution of the random component in the data.
Examples of the forecasted y-pattern construction are presented in Fig. 3. Grey lines in these figures are the x- and y-patterns from the training set. A darker shade of grey indicates the x-patterns that are closest to the query pattern and the y-patterns paired with them. These patterns have a higher value of the membership function (11), and consequently a greater impact on the forecast. The query pattern and the true y-pattern paired with it are drawn with thick solid lines. The forecasted y-pattern is drawn with a dotted line. Moreover, the optimal input pattern lengths are different for different pattern definitions (see Tables 1-8).
Tables 9 and 10 show the results of the comparative models: ARIMA, exponential smoothing (ES) and the Nadaraya-Watson estimator (N-WE) [12]. The proposed FNNR model belongs to the same group of nonparametric regression methods as N-WE; thus, the results of both models are similar. Comparing the errors of all models, it can be concluded that FNNR is competitive with the other models. It should also be noted, however, that the classical ES model outperformed all other models in six of eight cases.

Conclusion
This work proposes a practical methodology for forecasting monthly electric energy consumption using fuzzy nearest neighbour regression. The model is based on the assumption that the similarity of the input patterns implies the similarity of the output patterns paired with them. The patterns representing time series fragments are the key element of this approach. They unify the data, reduce nonstationarity and filter out the trend. The main advantages of the model are its simple and understandable principle of operation and the fact that only two parameters need to be estimated: the length of the input pattern and the width of the membership function. Models with fewer parameters tend to generalise better and do not require complex learning procedures.
We demonstrate the effectiveness of our approach on real-world data. Compared with commonly used methods, such as ARIMA and exponential smoothing, the proposed model produces similar errors on average. Better performance of the model is observed for more regular time series with a lower noise component and a stable relationship between input and output patterns. The factors that decrease this stability are the nonlinear trend and the heteroscedasticity of the time series.

Table 1. Results for PL, variant A.
Table 2. Results for DE, variant A.
Table 3. Results for ES, variant A.
Table 4. Results for FR, variant A.
Table 5. Results for PL, variant B.
Table 6. Results for DE, variant B.
Table 7. Results for ES, variant B.
Table 8. Results for FR, variant B.
Table 9. MAPE of the forecasting models, variant A.
Table 10. MAPE of the forecasting models, variant B.