Analyzing and Forecasting of Electricity Consumption by Integration of Autoregressive Integrated Moving Average Model with Neural Network on Smart Meter Data

Smart metering is a recently developed research area across the globe, and it appears to be a remedy for the increasing price of electricity. Electricity consumption forecasting is an essential process in offering intelligence to smart grids. Rapid and precise forecasting allows a utility provider to plan resources and to take control actions that balance electricity supply and demand. Customers benefit from metering solutions through a better understanding of their own energy utilization and of future projections, allowing them to manage the cost of their consumption effectively. In this view, this paper presents an integration of the Autoregressive Integrated Moving Average (ARIMA) model with a Neural Network (NN) for electricity consumption forecasting using smart meter data. As time series data often contain both linear and nonlinear patterns, neither an ARIMA nor an NN model alone is sufficient to model and predict them. The ARIMA-NN model is trained on the data to generate a forecasting model. Afterward, the generated model can be used to predict electricity consumption when applied to new building data. The proposed ARIMA-NN model is evaluated, and the simulation outcome points to its superior performance over the compared methods. The presented model obtained a testing performance with a MAPE of 25.53, an accuracy of 48.38, and an MSE of 0.21.


Introduction
At present, significant attention is being paid to smart electricity networks, where improved control over electricity supply and utilization is expected to be attained through the development of Advanced Metering Infrastructure (AMI). Smart metering is becoming more familiar, and it serves as an essential means to achieve defined policies, reduce greenhouse gas emissions, improve energy efficiency, and secure the supply of energy from renewable sources. Smart metering models form part of a micro-grid that comprises diverse operational and energy elements such as smart appliances, renewable energy resources, and energy-efficient resources. An important issue associated with the operation of micro-grids is the optimum energy management of residential buildings based on multiple and overlapping objectives [1]. Presently, the smart grid vision and smart home concepts are commonly employed to optimize the utilization of energy and reduce electricity bills.
The development of smart home energy management models has become a global priority for supporting the trend towards a more sustainable and reliable energy supply for smart grids [2][3][4][5]. First, a new metering infrastructure is defined to ensure automatic reading and billing based on actual utilization. Second, through the accumulation of high-frequency utilization data, the system should fulfil the needs for the design of cost-reflective prices that vary with the time of utilization. Finally, these metering models aim to contribute to the minimization of total energy utilization by raising the energy awareness of the users.
An essential aspect of smart metering models is that they encourage users to consume less electricity by keeping them well informed about their utilization patterns. Prediction offers users the possibility of connecting present utilization behavior with future costs. Users thus benefit from prediction solutions by thoroughly understanding their individual energy utilization and its expected development, enabling them to manage the cost of utilization effectively. With transparent energy usage and future projections, it is easier to interpret how consumption will affect household finances in the upcoming days. Numerous models have been devised for electricity load forecasting. Some of the familiar ones are time series analysis with the autoregressive integrated moving average (ARIMA) model, fuzzy logic, neuro-fuzzy models, artificial neural networks (ANNs), and support vector machines (SVMs).
A few interesting instances of electricity forecasting at the individual household level are provided here. The authors of [6] set out to determine which machine learning (ML) model offers the best prediction of whole-building energy utilization for the subsequent hour. The outcome showed that the LS-SVM provides the best results in predicting a home's future electrical utilization. Overall, the presented model offered a MAPE of 1.6-13.41% for a 700 kWh commercial building and around 15-32% for 3 homes with mean consumption close to 1.5 kWh. In [7], high-resolution data gathered over a month from a set of three private households was examined. The experimental outcome indicated that the recent prediction models offer better forecasting. Based on disaggregated data from smart home sensors, considerable improvements of 4-33% in mean absolute error were reported relative to persistence and smart meter benchmarks.
In [8], diverse models were employed for predicting peak demand in individual homes. It was revealed that a home's historic peak load and occupancy are more effective predictors of peak load than the season. It was also shown that a Seasonal Auto-Regressive Moving Average (SARMA) model can be employed for modeling the intrinsic load pattern and occupant movement in a home, and that it has a 30% lower mean square error than regression-based models. In [9], a Kalman filter-based prediction technique achieved load forecasting with a MAPE of 30% for a sampling period and prediction horizon both equal to one hour. A shorter interval between receptions of real-time measurement data from the customers' smart meters enhanced the accuracy of the presented model and resulted in a MAPE of almost 13%. In [10], based on power consumption readings from 23 homes across Japan, a support vector regression technique and an activity sequence-driven model were developed to infer upcoming activities and improve load prediction. The activity sequence variable turned out to be a significant feature that can enhance the accuracy of load prediction 15 minutes ahead for individual houses, with a MAPE of up to 42% on average.
In [11], several prediction models were developed with prediction horizons ranging from 15 minutes to 24 hours. The models were validated on two datasets, namely one household in Germany and 6 households in the United States. The outcome pointed out that predictive accuracy depends mainly on the selection of the prediction model and the parameter setting. In [12], an electricity load prediction model at the individual household level using CART, SVM, and MLP NNs for 24-hour short-term load forecasts was presented. It was revealed that integrating past utilization data with household pattern data could substantially improve the prediction of every user's load. A MAPE of 51% was reached, and 48% was obtained by SVM.
This paper presents a new load forecasting approach at the household level. The need for precise forecasting of time series data has inspired researchers to propose new techniques for load management. Since time series data often contain both linear and nonlinear patterns, neither ARIMA nor NN models alone are sufficient to model and predict them. This paper presents a hybridization of the ARIMA and NN models to make use of the advantages of both time series models and NNs. The proposed ARIMA-NN model is evaluated, and the simulation outcome points to its superior performance over the compared methods.
The subsequent sections are arranged as follows. Section 2 presents the ARIMA-NN model and Section 3 evaluates it. Section 4 concludes.

The proposed ARIMA-NN model
Fig. 1 shows the overall process involved in ARIMA-NN-based electricity forecasting with smart meters. Initially, the data is gathered from various sources such as datasets, actual readings, and calculations. Then, the ARIMA-NN model is trained on the data to generate a forecasting model. Afterward, the generated model can be used to predict electricity consumption when applied to new building data. The detailed operation of the ARIMA-NN model is discussed in the following subsections.
In an ARIMA model [13], the future value of a variable is assumed to be a linear function of several past observations and random errors. The seasonal ARIMA model can be written as ARIMA$(j, k, l)(J, K, L)_s$, where $(j, k, l)$ denotes the non-seasonal part of the model and $(J, K, L)_s$ the seasonal part, and is described as follows:

$$\phi_j(B)\,\Phi_J(B^s)\,\nabla^k \nabla_s^K z_t = \theta_l(B)\,\Theta_L(B^s)\,a_t \qquad (1)$$

where $j$ denotes the order of the non-seasonal autoregression, $k$ the number of regular differences, $l$ the order of the non-seasonal MA, $J$ the order of the seasonal autoregression, $K$ the number of seasonal differences, $L$ the order of the seasonal MA, and $s$ the length of the season (periodicity). $B$ is the backshift operator, $\phi_j$ is the AR operator of order $j$, $\Phi_J$ is the seasonal AR operator of order $J$, $\nabla^k$ is the differencing operator, $\nabla_s^K$ is the seasonal differencing operator, $z_t$ is the observed value at time $t$, $\theta_l$ is the MA operator of order $l$, $\Theta_L$ is the seasonal MA operator of order $L$, and $a_t$ is the noise of the stochastic process, assumed to be NID$(0, \sigma^2)$.
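As a minimal sketch of what the regular and seasonal differencing operators $\nabla^k$ and $\nabla_s^K$ do to a series, the following Python snippet applies them to a toy load series. The helper `difference`, the series `load`, and the season length of 4 are hypothetical choices made for illustration only and are not taken from the paper's data.

```python
def difference(series, lag=1, times=1):
    """Apply the differencing operator (1 - B^lag) to `series`, `times` times."""
    out = list(series)
    for _ in range(times):
        out = [out[t] - out[t - lag] for t in range(lag, len(out))]
    return out

# Hypothetical toy load series: a linear trend plus a period-4 seasonal pattern.
season = [0.0, 2.0, 5.0, 1.0]
load = [0.5 * t + season[t % 4] for t in range(16)]

# Regular differencing (k = 1) removes the trend but keeps the seasonal swings.
d1 = difference(load, lag=1, times=1)

# Seasonal differencing (K = 1, s = 4) then removes the repeating pattern.
d1_s4 = difference(d1, lag=4, times=1)
```

Each pass of `difference` shortens the series by `lag` observations, which is one reason why the amount of differencing is kept as small as stationarity allows.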
The ARIMA modeling scheme has three major stages: identification, parameter estimation, and diagnostic checking. The identification of the general model form has two phases:
• The required differencing of the series is determined in order to attain stationarity and normality.
• The temporal correlation structure of the transformed data is identified by analyzing the autocorrelation function (ACF) and the partial autocorrelation function (PACF).
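The second identification phase relies on the sample autocorrelation function. A minimal sketch of that computation is given here, assuming the usual estimator that normalizes by the lag-0 sum of squares; the function name `acf` and the toy series are illustrative and not part of the original study.

```python
def acf(series, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag of a series."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev)  # lag-0 sum of squares (normalizer)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]

# Hypothetical series with a period-4 pattern repeated six times.
series = [1.0, 3.0, 2.0, 0.0] * 6
r = acf(series, max_lag=8)
```

For such a series the autocorrelation is strongly positive at lags that are multiples of the period, which is the kind of signature that suggests a seasonal term during identification.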
The ACF measures the correlation between values of the series and later values at a given lag. The PACF measures the correlation between an observation and its lagged value after removing the correlations at lower-order lags. By examining the ACF and PACF plots obtained from the electricity load series, several candidate ARIMA models are identified for model selection. The model with the lowest Akaike Information Criterion (AIC) is chosen as the best-fitting one. The AIC is given by

$$\mathrm{AIC} = n\left(\ln\frac{2\pi\,\mathrm{RSS}}{n} + 1\right) + 2m \qquad (2)$$

where $m = j + l + J + L$ is the number of terms estimated in the model, RSS is the sum of squared residuals, and $n$ is the number of observations.
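Equation (2) can be evaluated directly once the residual sum of squares of each candidate is known. The sketch below illustrates the selection rule; the candidate names, their RSS values, and the sample size of 200 are invented for illustration and are not results from the paper.

```python
import math

def aic(rss, n, m):
    """AIC = n * (ln(2*pi*RSS/n) + 1) + 2*m, as in Equation (2)."""
    return n * (math.log(2 * math.pi * rss / n) + 1) + 2 * m

n_obs = 200  # hypothetical number of observations

# Hypothetical candidates: name -> (RSS, number of estimated terms m).
candidates = {
    "ARIMA(1,1,0)": (52.0, 1),
    "ARIMA(1,1,1)": (45.0, 2),
    "ARIMA(2,1,2)": (44.8, 4),
}

scores = {name: aic(rss, n_obs, m) for name, (rss, m) in candidates.items()}
best = min(scores, key=scores.get)  # model with the lowest AIC
```

In this made-up comparison the largest model has the smallest RSS, but its extra terms cost more in the $2m$ penalty than they gain in fit, so a smaller model is preferred.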
Once the candidate ARIMA models have been identified, their parameters must be estimated. When a suitable model has been selected, its parameters are computed following the Box-Jenkins method, and the residuals of the fitted series are then analyzed to check model adequacy. Several tests are used in diagnostic checking to determine whether the residuals of the chosen ARIMA model, as seen in the ACF and PACF plots, are independent of one another. When the normality assumption is not met, the observations are transformed using a Box-Cox transformation. To obtain a good forecasting model, the residuals left after fitting should satisfy the requirements of a white noise process. To determine whether the residuals of the electricity load time series are dependent or independent, the Residual Autocorrelation Function (RACF) is computed. First, a correlogram is obtained by plotting the residual ACF against the lag. For an adequate ARIMA model, the estimated autocorrelations of the residuals are uncorrelated and distributed around a mean of zero.
Alternatively, the Ljung-Box-Pierce statistic is used to test the null hypothesis that the residual autocorrelations are those of white noise; the test statistic, L(r), can be computed for different numbers r of lagged autocorrelations to check the adequacy of the model.
The L(r) values are then compared with the critical value of the chi-square ($\chi^2$) distribution for the corresponding degrees of freedom at the chosen significance level. Finally, the cumulative periodogram is applied to check whether the residuals form a white noise series. Even when a seasonal term is included in the model, the periodic features of the electricity load series may not be fully captured. Hence, any regularities remaining in the residuals must be examined.
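The portmanteau test described above can be sketched as follows, assuming the standard Ljung-Box form $n(n+2)\sum_{k=1}^{r}\hat{\rho}_k^2/(n-k)$ for the statistic. The residual series is synthetic, and the critical value is hard-coded for 10 degrees of freedom at the 5% level; for residuals of a fitted ARIMA model the degrees of freedom are usually reduced by the number of estimated parameters, which this illustration ignores.

```python
import math

def ljung_box(residuals, max_lag):
    """Ljung-Box statistic: n(n+2) * sum_{k=1..r} rho_k^2 / (n - k)."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [e - mean for e in residuals]
    c0 = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, max_lag + 1):
        rho_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        total += rho_k * rho_k / (n - k)
    return n * (n + 2) * total

# Hypothetical residuals that still contain a slow cycle (clearly not white noise).
residuals = [math.sin(2 * math.pi * t / 50) for t in range(100)]
stat = ljung_box(residuals, max_lag=10)

# 5% critical value of the chi-square distribution with 10 degrees of freedom.
CHI2_CRIT_10_DF = 18.307
reject_white_noise = stat > CHI2_CRIT_10_DF
```

A statistic above the critical value indicates that structure remains in the residuals, so the candidate model would be revised before it is used for forecasting.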

Structure of NN Model
Among the various NN architectures, the three-layer feed-forward backpropagation network is the most commonly applied. The architecture comprises a single hidden layer of neurons with their transfer functions and an output layer of linear neurons with linear transfer functions. The backpropagation network is depicted in Fig. 2, where $X_a$ ($a = 1, \ldots, N$) are the input variables, $n_i$ ($i = 1, \ldots, S$) denote the outputs of the neurons in the hidden layer, and $y_b$ ($b = 1, \ldots, L$) denote the outputs of the NN. In general, an NN must be trained to determine the values of the weights that produce the correct output. In the training stage, a set of input data is presented to the network repeatedly. The backpropagation technique adapts the weights in the steepest descent direction (SDD), the direction in which the error function decreases most rapidly. However, the basic gradient descent learning algorithm is inferior to other algorithms in terms of accuracy and efficiency. From an optimization point of view, training an NN is equivalent to minimizing a global error function of the network weights. To overcome these difficulties, various learning techniques have been proposed, namely resilient backpropagation, Levenberg-Marquardt, and conjugate gradient backpropagation. A notable development is the Scaled Conjugate Gradient (SCG) approach. SCG training is mainly used because it avoids the time-consuming line search. In the conjugate gradient algorithm, the search is performed along conjugate directions, which provides faster convergence than SDD.
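To make the three-layer structure concrete, the following sketch computes one forward pass with N = 3 inputs, S = 2 hidden neurons, and L = 1 linear output. The tanh hidden transfer function and all weight values are assumptions chosen for illustration; the text does not state which hidden transfer function or weights were used.

```python
import math

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One forward pass: tanh hidden layer (assumed), linear output layer."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    output = [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(w_out, b_out)
    ]
    return hidden, output

# Hypothetical weights: 3 inputs -> 2 hidden neurons -> 1 output.
w_hidden = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
b_hidden = [0.0, 0.1]
w_out = [[1.0, -1.0]]
b_out = [0.2]

x = [1.0, 2.0, 3.0]  # e.g. three lagged values fed to the input nodes
hidden, output = forward(x, w_hidden, b_hidden, w_out, b_out)
```

Training then amounts to adjusting `w_hidden`, `b_hidden`, `w_out`, and `b_out` so that `output` matches the target values as closely as possible.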
Furthermore, the standard backpropagation method used in NN training computes the gradient of the global error function $f(W)$ with respect to the weights at every iteration $k$ and updates the weights according to

$$W_{k+1} = W_k - \alpha_k \nabla f(W_k)$$

The step size $\alpha_k > 0$ is a user-selected training parameter that strongly influences the behavior of the learning algorithm. In many cases, the backpropagation technique follows a zigzag path towards the minimum, which is typical of the steepest gradient descent approach. The conjugate gradient algorithm avoids this zigzagging to a certain extent by incorporating a particular relationship between the search direction and the gradient vector at each iteration. If $D_k$ denotes the direction vector at iteration $k$, the weight vector is updated as

$$W_{k+1} = W_k + \alpha_k D_k$$

For given values of $W_k$ and $D_k$, a particular value of $\alpha_k$ minimizes the objective function as far as possible along that direction. In most cases, a line search along the search direction is carried out to identify the optimal step size. By computing the step size directly, the SCG learning technique improves the speed of training and removes the dependence on critical user-selected parameters. The main idea of this technique is to introduce a factor $\rho$ that is increased at each iteration in which the quantity $\delta_k$ indicates that the Hessian matrix is not positive definite. A detailed description of the SCG algorithm for NN training is available in the literature.
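The difference between the two update rules can be illustrated on a small quadratic error surface. The sketch below compares fixed-step steepest descent with the plain linear conjugate gradient method using an exact step size; it is not the SCG algorithm itself, which estimates curvature to avoid the line search. The matrix `A`, vector `b`, and step size 0.1 are hypothetical.

```python
def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Hypothetical quadratic f(W) = 0.5 * W^T A W - b^T W with gradient A W - b.
A = [[10.0, 2.0], [2.0, 1.0]]
b = [1.0, 1.0]
w_star = [-1.0 / 6.0, 4.0 / 3.0]  # exact minimizer, solves A w = b

def error(w):
    return max(abs(wi - si) for wi, si in zip(w, w_star))

def steepest_descent(steps, alpha=0.1):
    """W <- W - alpha * grad f(W) with a fixed, user-selected step size."""
    w = [0.0, 0.0]
    for _ in range(steps):
        grad = [gi - bi for gi, bi in zip(matvec(A, w), b)]
        w = [wi - alpha * gi for wi, gi in zip(w, grad)]
    return w

def conjugate_gradient(steps):
    """W <- W + alpha * D along conjugate directions with exact step sizes."""
    w = [0.0, 0.0]
    r = [bi - gi for bi, gi in zip(b, matvec(A, w))]  # negative gradient
    d = list(r)
    for _ in range(steps):
        Ad = matvec(A, d)
        alpha = dot(r, r) / dot(d, Ad)          # exact minimizer along d
        w = [wi + alpha * di for wi, di in zip(w, d)]
        r_new = [ri - alpha * adi for ri, adi in zip(r, Ad)]
        beta = dot(r_new, r_new) / dot(r, r)    # Fletcher-Reeves coefficient
        d = [rn + beta * di for rn, di in zip(r_new, d)]
        r = r_new
    return w

sd_error = error(steepest_descent(2))
cg_error = error(conjugate_gradient(2))
```

On this two-dimensional problem the conjugate gradient iterate reaches the minimizer after two steps up to rounding, whereas fixed-step steepest descent is still far from it after the same number of updates.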

Hybrid Model
The basic nature of electricity load data cannot be captured easily by a stand-alone technique because of features such as seasonality and heteroskedasticity. The approximation provided by ARIMA models may not be adequate for complex nonlinear problems. On the other hand, ANNs applied on their own have yielded mixed results. The authors of [14] argued that it is unwise to apply ANNs blindly to every type of data. Hence, a hybrid technique with both linear and nonlinear modeling abilities is regarded as a suitable model for predicting electricity load data, since combining the two approaches can capture different aspects of the underlying patterns. In line with this hybrid model structure, an electricity load time series is considered to be composed of a linear autocorrelation structure and a nonlinear component:

$$z_t = L_t + S_t$$

where $L_t$ is the linear component and $S_t$ is the nonlinear component. Both components are estimated from the time series data. First, ARIMA is used to capture the linear component, so that the residuals from the linear model contain only the nonlinear relationships. The residual $h_t$ at time $t$ obtained from the linear model is

$$h_t = z_t - \hat{L}_t$$

where $\hat{L}_t$ is the value forecast by the ARIMA model for time $t$. The analysis of the residuals is essential in assessing the adequacy of the ARIMA model. An ARIMA model is not sufficient if a linear correlation structure is still present in the residuals. However, standard diagnostic checks of the residuals are not well suited to detecting nonlinear patterns in the series. Even if the residuals pass the diagnostic check, the model may still be inadequate because nonlinear relationships have not been properly modeled. Hence, the residuals are modeled with an ANN to discover the nonlinear relationships. Using $N$ input nodes, the ANN model for the residuals is

$$h_t = g(h_{t-1}, h_{t-2}, \ldots, h_{t-N}) + \varepsilon_t$$

where $g$ is the nonlinear function determined by the NN and $\varepsilon_t$ is the random error. Denoting the NN forecast of the residual by $\hat{S}_t$, the combined forecast is

$$\hat{z}_t = \hat{L}_t + \hat{S}_t$$
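The two-stage procedure can be sketched end to end with deliberately simple stand-ins, so that the data flow between the stages is visible. In the following Python snippet a least-squares AR(1) fit replaces the ARIMA stage and a one-parameter quadratic function of the previous residual replaces the NN; both substitutions, the synthetic series, and all function names are assumptions made for illustration and do not reproduce the model evaluated in this paper.

```python
import random

def fit_linear(z):
    """Stage 1 stand-in for ARIMA: least-squares AR(1) without intercept."""
    num = sum(z[t] * z[t - 1] for t in range(1, len(z)))
    den = sum(z[t - 1] ** 2 for t in range(1, len(z)))
    return num / den

def fit_residual_model(h):
    """Stage 2 stand-in for the NN: g(h_prev) = c * h_prev**2 by least squares."""
    num = sum(h[i] * h[i - 1] ** 2 for i in range(1, len(h)))
    den = sum(h[i - 1] ** 4 for i in range(1, len(h)))
    return num / den

# Synthetic series with a linear part, a bounded nonlinear part, and noise.
rng = random.Random(0)
z = [1.0]
for _ in range(199):
    prev = z[-1]
    z.append(0.5 * prev + 0.4 * prev ** 2 / (1.0 + prev ** 2) + rng.gauss(0.0, 0.1))

a = fit_linear(z)
linear = [a * z[t - 1] for t in range(1, len(z))]     # linear forecasts for t = 1..n-1
h = [z[t] - linear[t - 1] for t in range(1, len(z))]  # residuals of the linear stage

c = fit_residual_model(h)
# Combined forecast for t = 2..n-1: linear forecast plus predicted residual.
hybrid = [linear[i] + c * h[i - 1] ** 2 for i in range(1, len(h))]

actual = z[2:]
mse_linear = sum((y - f) ** 2 for y, f in zip(actual, linear[1:])) / len(actual)
mse_hybrid = sum((y - f) ** 2 for y, f in zip(actual, hybrid)) / len(actual)
```

Because the residual model is fitted by least squares and could always choose a zero coefficient, the in-sample error of the combined forecast here cannot exceed that of the linear stage alone; out-of-sample gains are not guaranteed and would need a held-out evaluation.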

Dataset Description
To validate the presented ARIMA-NN model, it is tested on a UK smart meter dataset. The dataset contains a set of features comprising the household ID, the tariff plan used (standard or dynamic time of use), date and time, meter readings (kWh), and Acorn groups [15]. Fig. 3 shows the distribution of household energy consumption from the smart meters.

Conclusion
This paper has developed an integration of the ARIMA model with the NN model for electricity consumption forecasting using smart meter data. As time series data often contain both linear and nonlinear patterns, neither an ARIMA nor an NN model alone is sufficient to model and predict them. The ARIMA-NN model is trained on past data to create a model for the validation process. Once the model is created, it can be used to predict electricity consumption when applied to new building data. The proposed ARIMA-NN model is evaluated, and the simulation outcome points to its superior performance over the compared methods. The presented model obtained a testing performance with a MAPE of 25.53, an accuracy of 48.38, and an MSE of 0.21. As part of the future scope, the proposed model can be deployed in real-time applications.