05 Sep Introducing Amazon Forecast and a Look into the Future of Time Series Prediction
By Dr. Sami Alsindi, Data Scientist at Inawisdom
Time series forecasting is a common customer need, so a means of rapidly creating accurate forecasting models is key to many projects. Amazon Forecast accelerates this and is based on the same technology used at Amazon.com. This post explores the use of this new service for energy consumption forecasting.
Inawisdom is an AWS Partner Network (APN) Advanced Consulting Partner with the AWS Machine Learning Competency. We work with organizations in a variety of industries to help them exploit their data assets.
Our goal at Inawisdom is to accelerate adoption of advanced analytics, artificial intelligence (AI), and machine learning (ML) by providing a full stack of AWS Cloud and data services, from platform through data engineering, data science, AI/ML, and operational services.
We routinely work with time series data to perform forecasting for a variety of customer use cases, including personal financial predictions for consumers and predictive maintenance for manufacturers. Being able to project time series data into the future with a measure of confidence allows customers to make informed business decisions in a quantitative manner.
One of the most exciting projects I have worked on at Inawisdom was with Drax, a UK-based energy supplier. The goal was to automatically detect anomalous energy consumption within their Haven Power retail business.
Across a portfolio of thousands of customers, each reporting their consumption every half hour, manually detecting consumption pattern changes and anomalous activity is difficult and time consuming.
The time taken to identify events that indicate faulty meters, safety issues, energy theft, and changes of tenancy results in inefficiencies and debt recovery challenges.
DeepAR, a built-in Amazon SageMaker algorithm, is an LSTM-based neural network for forecasting time series data; it models the trends and seasonality of each series, allowing it to learn from the data and produce accurate forecasts.
The raw dataset we worked on consisted of millions of half-hourly energy consumption readings with years of data per customer. The results are impressive, but in the initial phase of the project it took roughly two weeks of data wrangling to create the forecasts.
From the created forecasts, anomalies for the previous week can be detected using another Amazon SageMaker built-in model—RandomCutForest (RCF)—on the differences from observed usage to predicted usage. To learn more, check out the case study for this project.
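To make the idea concrete, here is a toy sketch of the residual-based approach (not the production pipeline): observed usage is compared against a forecast's median and confidence band, and the deviations are scored. In the real project those residuals were fed to RandomCutForest; here a simple band-relative deviation score stands in for RCF, and all readings and quantile values are invented for illustration.

```python
# Toy sketch: score each reading by how far observed usage deviates
# from the forecast median, relative to the confidence band. In the
# real pipeline these residuals were fed to SageMaker's RandomCutForest;
# a plain deviation score stands in for RCF here.

def anomaly_scores(observed, p50, p10, p90):
    """Return a score per time step: ~0 inside the band, growing outside it."""
    scores = []
    for obs, med, lo, hi in zip(observed, p50, p10, p90):
        half_band = max((hi - lo) / 2.0, 1e-9)  # guard against a zero-width band
        scores.append(abs(obs - med) / half_band)
    return scores

# Illustrative readings: a sudden drop at index 3 (a "Fault Drop" shape).
observed = [10.0, 11.0, 10.5, 2.0, 10.8, 11.2, 10.9]
p50      = [10.5, 10.8, 10.6, 10.7, 10.9, 11.0, 10.8]
p10      = [9.5,  9.8,  9.6,  9.7,  9.9,  10.0, 9.8]
p90      = [11.5, 11.8, 11.6, 11.7, 11.9, 12.0, 11.8]

scores = anomaly_scores(observed, p50, p10, p90)
worst = max(range(len(scores)), key=scores.__getitem__)
print(worst, round(scores[worst], 2))  # the drop at index 3 stands out
```

In the production system, ranking these scores across thousands of meters is what surfaces the "29th most significant anomaly" style results described below.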
Figure 1 – Example of a Fault Drop anomaly.
In Figure 1, you can see an example of an automatically-detected anomaly with a week’s worth of electrical usage shown. In blue, we have the real consumption; in pink, the confidence interval from DeepAR is plotted, with the median shown as a line.
The uncharacteristic blip downwards is the 29th most significant anomaly; it triggered a classification procedure that identified this pattern as a “Fault Drop.”
Figure 2 – Example of a Change of Tenancy anomaly.
Another example of a detected anomaly is shown in Figure 2. This time, continuous uncharacteristically low usage triggered the class of “Change of Tenancy.”
This is perhaps the most important business anomaly type that needs to be identified. The longer it has been since the customer moved out of the premises, the less likely it is that the contact details Haven Power holds for them are up to date. Consequently, there is a lower chance of recovering the customer’s outstanding debt.
Integrating Amazon Forecast with Amazon SageMaker
Amazon Forecast is a new tool for automated time series forecasting. With Amazon Forecast, I was pleasantly surprised (and slightly irritated) to discover that we could accomplish those two weeks of work in about 10 minutes using the Amazon Web Services (AWS) console.
From my initial experiences, Amazon Forecast will be an extremely useful accelerator for any time series predictions, such as retail demand forecasting, freeing up the time of data scientists for more interesting things.
AWS has supplied a Software Development Kit (SDK) for full integration with Amazon SageMaker, and you can view the documentation and example Jupyter notebooks on GitHub. Using the graphical user interface (GUI), however, sidesteps the SDK entirely and is a lot easier.
To integrate Amazon Forecast with Amazon SageMaker, you first need to create a dataset group. All that’s required is a single TARGET_TIME_SERIES file containing the data as a row-wise .csv with three columns: timestamp, item_id, and a float that’s the target of the predictor model. You can also add ITEM_METADATA and RELATED_TIME_SERIES data.
Sticking with an electricity example, the TARGET_TIME_SERIES data will be hourly meter readings, the item_ids will correspond to individual meters, and the target float will be consumption in kWh. We could add to the ITEM_METADATA any groupings, such as Standard Industry Classification (SIC) codes that group similar businesses. Finally, RELATED_TIME_SERIES data could consist of weather data, for example.
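As a rough sketch of what this dataset setup looks like through the SDK rather than the GUI, the boto3 calls below create a CUSTOM-domain dataset group with an hourly TARGET_TIME_SERIES dataset matching the three-column CSV. The dataset and group names, and the region, are illustrative assumptions; the API calls themselves require AWS credentials.

```python
# Sketch of creating an Amazon Forecast dataset group and a
# TARGET_TIME_SERIES dataset via boto3. Names and region are
# illustrative; the API calls require AWS credentials to run.

# Schema matching the three-column CSV: timestamp, item_id, target float.
# The CUSTOM domain expects the target field to be named "target_value"
# (here it holds consumption in kWh).
TARGET_SCHEMA = {
    "Attributes": [
        {"AttributeName": "timestamp", "AttributeType": "timestamp"},
        {"AttributeName": "item_id", "AttributeType": "string"},
        {"AttributeName": "target_value", "AttributeType": "float"},
    ]
}

def create_energy_dataset(region="eu-west-1"):
    import boto3  # imported here so the schema above is usable offline
    forecast = boto3.client("forecast", region_name=region)
    dataset = forecast.create_dataset(
        DatasetName="energy_consumption",
        Domain="CUSTOM",
        DatasetType="TARGET_TIME_SERIES",
        DataFrequency="H",  # hourly meter readings
        Schema=TARGET_SCHEMA,
    )
    group = forecast.create_dataset_group(
        DatasetGroupName="energy_dsg",
        Domain="CUSTOM",
        DatasetArns=[dataset["DatasetArn"]],
    )
    return group["DatasetGroupArn"]
```

ITEM_METADATA and RELATED_TIME_SERIES datasets would be created the same way with their own schemas and added to the same dataset group.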
Amazon Forecast handles the backend processing and transformation of the data: you submit an import job (this can take some time) and come back to your newly-parsed dataset. There is also the option to automatically refresh the dataset the model is trained on, something that used to involve significant effort setting up an AWS Step Functions workflow and several AWS Lambda functions to re-parse or re-process the data. Amazon Forecast takes the hard work away.
Figure 3 – Forecast datasets.
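Submitting that import job through the SDK is a single call; the sketch below shows the shape of it. The S3 path, IAM role ARN, and job name are illustrative assumptions, and the call needs AWS credentials to run.

```python
# Sketch of importing the prepared CSV from S3 into a Forecast dataset.
# The S3 path, role ARN, and names are illustrative assumptions.

def import_readings(dataset_arn, region="eu-west-1"):
    import boto3
    forecast = boto3.client("forecast", region_name=region)
    response = forecast.create_dataset_import_job(
        DatasetImportJobName="energy_import",
        DatasetArn=dataset_arn,
        DataSource={
            "S3Config": {
                "Path": "s3://my-bucket/meter-readings.csv",  # assumed location
                "RoleArn": "arn:aws:iam::123456789012:role/ForecastS3Role",
            }
        },
        TimestampFormat="yyyy-MM-dd HH:mm:ss",  # must match the CSV's timestamps
    )
    return response["DatasetImportJobArn"]
```

The job runs asynchronously, which is why you submit it and come back later.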
Once this is complete, you can train a predictor with a forecast horizon of up to one-third the duration of your dataset, with predictions starting immediately after your dataset ends.
You define the forecast horizon (how many periods you want Amazon Forecast to look into the future) and the “recipe,” which can be one of the built-in predictor types such as DeepAR+, an evolution of DeepAR. Alternatively, you can forego the guesswork and let Amazon Forecast determine the optimal predictor automatically by choosing the AutoML option, which trains all of the recipes and selects the one that best fits your dataset.
There is also the option to automatically, and periodically, retrain your predictive model. This also used to involve setting up AWS Step Functions and AWS Lambda functions, and again is made simple with Amazon Forecast.
In our case, we will first predict the next few days (72 hours):
Figure 4 – Train predictor parameters.
Once your predictor is trained, you can deploy it in order to make predictions.
Figure 5 – Predictor overview.
Once deployed, you can make predictions.
Figure 6 – Forecasting configuration.
In Figure 7 below, you can see hourly predictions for the 72-hour period after the last of the data available for meter “client_10.” In grey and black, we have the original data: the tail end of the observed usage for this particular meter. In orange, we have the median (50 percent) prediction, and in green the upper confidence interval (90 percent).
Figure 7 – Forecast results (hourly).
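Generating and querying a forecast like this through the SDK takes two calls, sketched below. Note that querying uses a separate `forecastquery` client; the forecast name is an illustrative assumption and the calls need AWS credentials.

```python
# Sketch of generating a forecast from a trained predictor and querying
# it for a single meter. Names are illustrative assumptions.

def query_meter_forecast(predictor_arn, item_id="client_10", region="eu-west-1"):
    import boto3
    forecast = boto3.client("forecast", region_name=region)
    forecast_arn = forecast.create_forecast(
        ForecastName="energy_72h_forecast",
        PredictorArn=predictor_arn,
    )["ForecastArn"]
    # Querying goes through a separate service endpoint/client.
    query = boto3.client("forecastquery", region_name=region)
    result = query.query_forecast(
        ForecastArn=forecast_arn,
        Filters={"item_id": item_id},
    )
    # Predictions map quantile keys (e.g. "p10", "p50", "p90") to lists
    # of {"Timestamp": ..., "Value": ...} entries.
    return result["Forecast"]["Predictions"]
```

The p50 and p90 series returned here are what is plotted in orange and green in Figure 7.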
Predictions can also be generated at a lower frequency (e.g. daily) to reveal gradual trends. I have done this below with another predictor that produces daily predictions.
Figure 8 – Forecast results (daily).
And, of course, all of the above can also be carried out programmatically from Amazon SageMaker using the SDK. The possibilities are limitless!
Amazon Forecast makes time series forecasting effortless, removing the undifferentiated heavy lifting that usually underpins it.
Additionally, Amazon Forecast massively reduces the effort required to automate data updating and model retraining. It manages this while also retaining the granularity of control that data scientists will appreciate and utilize. If only this tool had arrived three months sooner for my previous project!
AWS continues to champion the democratization of advanced and cutting-edge machine learning models, with Amazon Forecast being a perfect example of abstracting away the difficulty of model selection with the AutoML mode.
At Inawisdom, we fully embrace these developments that allow us to provide ever greater business benefit to customers and facilitate more and more exciting projects. I can’t wait to see what comes along next. Perhaps I can forecast it.