Time Series Forecasting: A Complete Guide
In the era of social media, digital shopping and IoT, time series are everywhere. Though it’s not as flashy as other skills, time series analysis is a true data science black-belt superpower because it can both open a window to the past and help us see into the future!
In this post we’ll focus on the latter. In the following lines we’ll review the principal aspects of the trade, such as which tools and techniques are used in the field today, and why we might be interested in using time series methods in the first place.
We’ll also go over plenty of use cases to illustrate the usefulness of time series forecasting. By the end of it, you’ll have a solid foundation with which to dig deeper into this vast and fascinating topic, and you will even be ready to start producing your own no-code forecasts!
Time series forecasting refers to the practice of examining data that changes over time, then using statistical models to predict future patterns and trends.
Probably the best known family of forecasting methods (but by no means the only one), time series forecasting draws exclusively on historical data of the variable of interest to predict future outcomes. Study the past if you would define the future, to quote Confucius.
In contrast, causal methods model the future behavior of the target variable as a function of predictor variables whose (causal) relationship to the target is well known. Since they delve into the actual driving forces behind the observed temporal patterns, these models tend to perform better in long-range forecasts.
Finally, it is actually possible to forecast in the absence of any temporal data. Judgemental forecasting is used to provide a reasoned estimate in situations where there is no data available or the data available is not applicable. A well-known example is the use of market research to estimate the sales of a new product.
Let’s begin by considering the following question: when can you choose a time series model? We obviously need historical quantitative data for our variable of study, but it also must be reasonable to assume that any patterns in the past data will continue to hold in the future.
Now, if both of these conditions are met, why would we turn to time series forecasting instead of, say, a causal model or even a mixed model?
The system is not fully understood. Or, if it is, it may be very difficult or expensive to model the relationships that drive its behavior.
And even if we do understand the system’s dynamics, it may not be possible to forecast the future values of the predictors, in the absence of which the target variable itself can’t be forecast.
Consider the case of a food delivery company that has accurately measured the effect of rainy weather on mealtime orders and wishes to incorporate this information into its forecast. Since weather predictions are only accurate about 80% of the time within a seven-day time frame (and hold no value beyond a fortnight!), this plan can only work in the context of very short-range forecasts.
Finally, a third possibility, widely applicable in business use cases, is that we are simply more interested in predicting future behavior than in understanding its causes.
Since a time series is a series of chronologically ordered observations, essentially any variable that can be measured at successive intervals can be analyzed using time series methods. As a data set, a time series is, quite simply, a list of pairs (T, V) of time (T) and the value of the target variable at that time (V).
The variable itself can be continuous (household electricity consumption per hour, stock price changes, platelet counts, annual rainfall), discrete (daily ice cream sales, number of pregnancies, monthly sunspots), or even non-numerical (user events, system performance logs).
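In code, such a list of (T, V) pairs maps naturally onto a pandas Series with a datetime index. Here is a minimal sketch with made-up daily sales figures:

```python
# A time series as (T, V) pairs: a pandas Series indexed by timestamps.
# The values here are invented purely for illustration.
import pandas as pd

sales = pd.Series(
    [120, 135, 128, 150, 162],
    index=pd.date_range("2023-01-01", periods=5, freq="D"),
    name="units_sold",
)

# Each observation is one (time, value) pair.
print(sales.index[0], sales.iloc[0])
```

Any of the variable types mentioned above can be stored this way, as long as each value carries a timestamp.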
As the British physicist William Thomson, Lord Kelvin, is often credited with saying, what is not measured cannot be improved. Forecasting fits naturally into any business’ decision-making process since
it enables a business to set attainable goals,
helps companies plan ahead by anticipating change,
and can be used to keep pace with the market by responding faster to it.
In a nutshell, forecasting is all about taking an informed, proactive stance in dynamic environments, and as such it is widely applicable in any business context.
For a simple illustration of its potential advantages, let’s consider a classic business use case for forecasting techniques: inventory management. Predicting customer demand is a fundamental need for businesses that manage supplies, and failure to do so can result in under- or overstocking. Understocking translates into unfulfilled sales potential and unhappy customers, while overstocking saddles a company with increased inventory-related costs. By helping a company avoid both, accurate forecasts can optimize sales while saving on safety stock.
Like any data analytics endeavor, any time series forecasting project begins with defining the problem, and in this initial stage we will need to establish the following points:
the forecasting horizon: projections can look several years into the future for capital investments, or just minutes or hours ahead for electricity consumption, telco or financial models.
the level of tolerable inaccuracy (for example, the launch of a new product may require only a rough estimate of future sales, while forecasts used to set quarterly goals should be quite accurate), and
the frequency with which the model will need to be run.
Make sure to talk at length with all stakeholders in your project to understand what data is available for your projection, whether it can be expected to continue to be valid in the future, and how the forecast will be used and by whom.
Libraries enabling classical time series analysis and forecasting are always a good place to start for exploration and benchmarking. Let’s go over the most popular:
forecast/fable: in R, the forecast library was the first of its kind and offered a wealth of tools for the task, including an automatic ARIMA implementation. Its successor, fable, can handle multiple time series and fit multiple models at once and has support for ensembles.
pmdarima: Python’s own toolbox for time series modeling, including an automatic ARIMA implementation. As an added incentive, the Python ecosystem also offers the beloved pandas library, which was originally created to wrangle time series data but is now widely used in all areas of data science.
Facebook’s Prophet: an immensely popular time series forecasting library with implementations for both R and Python. One of its strongest points is that it needs no parameter tuning like ARIMA models do (more on that below) and requires only minimal, highly intuitive adjusting.
statsforecast: automatic ARIMA for Python with state-of-the-art speed, accuracy and scalability, making it ideal for production environments.
Finally, it is worth mentioning machine learning alternatives, since these often produce the most accurate forecasts:
scikit-learn: hands-down the most popular machine learning package for Python. Its general-purpose regression tools can be applied to time series forecasting, for example by training on lagged features.
XGBoost: a famously strong performer and long-time favorite in Kaggle competitions, with implementations in both Python and R.
LSTM: not to forget deep learning, these recurrent neural networks are frequently on par with XGBoost in terms of performance.
Needless to say, both Python (matplotlib, seaborn) and R (ggplot2) have data visualization libraries that are well suited to, among other things, time series visualization. Meanwhile, many BI tools offer built-in forecasting features with some degree of customization. These really smooth the way to time series forecasting for everyone, but make sure you always understand which algorithm the software is using and how it is being fit to your data!
Superset, for its part, includes Prophet in its time series chart, a huge highlight for those seeking to make fast, accurate and largely automated predictions. A user-friendly panel allows customization of the model and offers all the power and speed of forecasting with Prophet in just a few clicks. Scroll to the final section for a walkthrough!
To avoid any issues downstream, make sure you fully understand the meaning of the timestamps in your data set (what process generated them; whether they reflect the time at which the data point occurred, or the time when it was recorded in the system, etc).
Adjusting the historical data during data cleaning can lead to simpler patterns and hence to more accurate forecasts. Two very basic transformations that are well worth keeping in mind are:
- calendar adjustment, which consists in working with the average daily quantity instead of the monthly total, by which we get rid of any differences arising due to different month lengths; and
- population adjustment, by which we simply switch to working with per capita quantities (for example, per person or per 1,000 people) instead of totals, for any data that can be affected by population changes, such as the number of hospital beds per 100,000 people.
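Both adjustments are one-liners in pandas. The sketch below uses invented monthly figures; note how the calendar adjustment reveals that January and February differ only because of month length:

```python
# A hedged sketch of calendar and population adjustments on made-up data.
import pandas as pd

monthly = pd.DataFrame({
    "total_sales": [3100, 2800, 3100],   # raw monthly totals (Jan, Feb, Mar)
    "days_in_month": [31, 28, 31],
    "population": [10000, 10000, 10500],
})

# Calendar adjustment: average daily quantity removes month-length effects.
monthly["daily_avg"] = monthly["total_sales"] / monthly["days_in_month"]

# Population adjustment: per 1,000 people removes population-growth effects.
monthly["per_1000"] = monthly["total_sales"] / monthly["population"] * 1000

print(monthly[["daily_avg", "per_1000"]])
```

After the calendar adjustment, all three months show the same underlying daily rate, so the apparent February "dip" in the totals disappears.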
Last but not least, beware data leakage! Or, in time-series lingo, the dreaded lookahead, which occurs when information from future data points leaks into a model. Remember that classical train-test splitting techniques are not applicable to time series data: when working with time series, the test set must immediately succeed the training data and be, ideally, at least as large as the required forecast horizon.
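A chronological split is easy to get right once you stop reaching for random shuffling. A minimal sketch, assuming daily data and a 30-day forecast horizon:

```python
# Chronological train/test split for time series: no shuffling, and the
# test set immediately follows the training data. Data here is synthetic.
import numpy as np
import pandas as pd

y = pd.Series(
    np.arange(365.0),
    index=pd.date_range("2023-01-01", periods=365, freq="D"),
)

horizon = 30  # test window at least as large as the forecast horizon
train, test = y.iloc[:-horizon], y.iloc[-horizon:]

# Sanity check against lookahead: all training timestamps precede the test set.
assert train.index.max() < test.index.min()
```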
Loosely speaking, autocorrelation is a measure of how measurements of the target variable taken at different points in time are linearly related to one another as a function of their time difference (lag). The Autocorrelation Function (ACF), included in most statistical software packages, computes the size and direction of this correlation for different lags. For example, the ACF value at a lag of 1 is the correlation between values at time T and values at time T-1.
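To make the lag-1 case concrete, here is a sketch on synthetic data showing that the ACF value at lag 1 is just the correlation of the series with a shifted copy of itself, which pandas also exposes directly via `Series.autocorr`:

```python
# Lag-1 autocorrelation computed two equivalent ways on a synthetic series.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
y = pd.Series(np.sin(np.arange(100) / 5) + rng.normal(0, 0.1, 100))

# Correlation between values at time T and values at time T-1...
manual = y.corr(y.shift(1))
# ...which is exactly what pandas' built-in helper computes.
builtin = y.autocorr(lag=1)

print(round(manual, 3), round(builtin, 3))
```

For this smooth, slowly varying signal the lag-1 autocorrelation is strongly positive, as we would expect.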
Stationarity is a deceptively intuitive concept that describes a time series whose statistical properties (mean, variance, …) are constant over time. Time series that exhibit a trend or seasonal variations are non-stationary, for example.
Stationarity is important as an indication of how much the system’s long-term future behavior will reflect its long-term past behavior. In fact, most time series forecasting methods make the assumption that the time series can be rendered stationary “enough” through the use of simple transformations.
One such transformation is differencing, which calculates the difference between consecutive time steps. It is sometimes (but not always) effective in stabilizing the mean of the time series and partially or totally removing trends and seasonal patterns.
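For a toy illustration of how differencing can remove a trend, consider a series that grows linearly (and is therefore non-stationary, since its mean changes over time):

```python
# First-order differencing of a linearly trending series.
import numpy as np
import pandas as pd

trend = pd.Series(2.0 * np.arange(50) + 10)  # mean grows over time
diffed = trend.diff().dropna()               # difference between consecutive steps

# The trend is gone: the differenced series is constant.
print(diffed.unique())
```

Real data is rarely this tidy, of course, and seasonal patterns may call for differencing at the seasonal lag instead of lag 1.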
Seasonality is a pattern that reoccurs in the data at fixed time intervals. Seasonal patterns with different periods can co-occur in the same data set - for example, hourly call volume in call centers will peak at certain hours in the day (daily pattern) and at certain days in the week (weekly pattern).
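A weekly pattern like the one described can be exposed by averaging over each day of the week. The sketch below uses synthetic call volumes (high on weekdays, low on weekends):

```python
# Revealing a weekly seasonal profile by averaging per weekday.
# The call volumes are synthetic and noise-free for clarity.
import pandas as pd

idx = pd.date_range("2023-01-02", periods=28, freq="D")  # four full weeks
calls = pd.Series(
    [100 if d < 5 else 40 for d in idx.dayofweek],  # weekdays vs weekends
    index=idx,
)

# Average volume per day of week (Monday = 0, ..., Sunday = 6).
weekly_profile = calls.groupby(calls.index.dayofweek).mean()
print(weekly_profile.tolist())
```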
Exponential smoothing is a technique that uses a weighted average of past observations to predict future values. The “exponential” in the technique’s name refers to the fact that the weights decay exponentially with increasingly distant observations, such that more recent data points are given more weight. Dating from the 1950’s, this method is computationally inexpensive, widely applicable, and it tends to perform well for forecasting horizons of up to about three months.
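The recursion behind simple exponential smoothing fits in a few lines. This is a bare-bones sketch, with an illustrative smoothing factor rather than a fitted one (libraries like statsmodels estimate it from the data):

```python
# Simple exponential smoothing from scratch. At each step the running level
# blends the newest observation with the previous level, so older points
# receive exponentially decaying weights.
def exp_smooth_forecast(values, alpha=0.5):
    level = values[0]
    for x in values[1:]:
        level = alpha * x + (1 - alpha) * level
    return level  # the flat one-step-ahead forecast

print(exp_smooth_forecast([10, 12, 11, 13]))  # → 12.0
```

A larger alpha reacts faster to recent changes; a smaller one smooths more aggressively.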
Developed in the 1970’s, ARIMA models are the veritable workhorses of time series forecasting, and they continue to be applied today in many different contexts - even sophisticated ensemble or hybrid machine learning models often include ARIMA components.
Much like the previous family of models, the cornerstone assumption of ARIMA is that future values of the variable of interest are a linear function of past values, and therein lies both the key to its simplicity and its greatest drawbacks: its accuracy is limited by its inability to model nonlinearity and, since it depends heavily on past data, it tends to lose predictive power over long-range horizons.
The family name stands for AutoRegressive Integrated Moving Average, which describes three different aspects of the model:
- autoregressive terms: future values of the variable depend linearly on the p previous values (lagged observations). For example, if p=1, then the value at time T would be expressed as a function of the value at time T-1.
- integrated series: the original time series is differenced a number d of times before ARIMA is applied to it.
- moving average terms: future values of the variable depend linearly on the q previous forecast errors.
An accurate model requires finding the best combination of parameters (p, d, q) for the system of study, which is both quite challenging and heavily dependent on experience.
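To make the autoregressive piece concrete, here is a sketch that estimates an AR(1) coefficient by ordinary least squares on synthetic data; a full ARIMA fit (for example with pmdarima or statsforecast, mentioned above) would also handle the d and q components and the search over (p, d, q):

```python
# Fitting the AR part of an AR(1) model by least squares on synthetic data.
import numpy as np

rng = np.random.default_rng(42)
phi_true = 0.7
y = np.zeros(500)
for t in range(1, 500):
    y[t] = phi_true * y[t - 1] + rng.normal()  # simulate an AR(1) process

# Regress y_t on y_{t-1}: the slope estimates the AR coefficient.
phi_hat = np.linalg.lstsq(y[:-1].reshape(-1, 1), y[1:], rcond=None)[0][0]
print(round(phi_hat, 2))  # should land near the true value of 0.7
```

This is exactly the p=1 case described above: each value is modeled as a linear function of the one before it.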
State-of-the-art time series forecasting models are either based on machine learning algorithms or are hybrids or ensembles of classical (statistical) and ML models. The reason is simple: armed with the ability to model highly nonlinear relationships in the data, these models consistently hit higher accuracies, even for long-range forecasting windows.
Aside from predictive power, these algorithms are far easier to tune than ARIMA models, but they do require more coding. Another downside to these models is that they are less interpretable than their statistical counterparts. All things considered, due to their interpretability and the existence of automated packages for them, classical models still provide a great starting point to explore and benchmark the forecasting problem at hand.
Another paradigmatic use case of time series forecasting is electricity consumption prediction. Electrical grids are fascinating systems where the interplay of a large number of factors must be carefully synchronized so that production always matches demand: it’s the perfect arena for forecasting techniques!
In a nutshell, every electrical grid is composed of a number of power-generating plants and renewable energy sources, each having different output capabilities and playing different roles within the system. The utilization of these resources must be carefully scheduled to optimize output and minimize costs.
The good news is that as a target variable, electricity consumption meets all the conditions for predictability: its key drivers are well understood (namely, temperature, with some contributions from calendar variations and economic conditions), and there is usually plenty of historical data available on consumption and weather.
The financial market is characterized by its nonlinearity, volatility and overall complexity, but these qualities mean that successfully predicting its future movements is potentially very lucrative. The idea of using math to predict the future values of corporate stocks is a relatively old one in the context of time series forecasting. Initially, the calculations would be performed by hand!
In hospitals, high patient volume has been shown to negatively impact all aspects of care and recovery, such as waiting times, time spent in the emergency unit, and overall length of stay. For this reason, modeling trends in new admissions - which are strongly seasonal and depend on factors such as the flu season, weather changes and holidays - has become a fundamental need for hospital personnel.
The failure of certain industrial components can easily force the shutdown of a whole operating unit and cause huge costs in damages. Such is the case of industrial-scale water pumps, such as those found in mines, quarries, and a city’s underground system! For these pieces of equipment, it is critical that any malfunction is caught before it actually happens. For small installations, it is feasible for the maintenance team to go on periodic rounds for on-site inspections, but often this is not possible or cost-effective. Enter time series forecasting! In this context, the goal of predictive modeling is to detect deviations from the machine’s normal operation state.
We previously discussed the role of forecasting in setting attainable goals. Indeed, one of its greatest strengths for businesses of all sizes is that it enables them to see the growth they can achieve. Even without cutting-edge models and trailblazing accuracy, having a solid framework for thinking about the future is already a game-changer. It is the starting point from which businesses can move from “where to go” to “how to get there”.
As we mentioned earlier regarding the ubiquity of time series, the possibilities are endless. Sales and units sold are obvious options. Forecasting them will help you budget more accurately, plan your marketing efforts, manage your inventory, and allocate resources (for example, more customer service representatives for peak sales season). But you can also forecast transactions, conversions, web traffic (in terms of sessions, page views or users), and more. As a rule of thumb, any business outcome you cannot control is worth measuring in order to predict its future behavior and plan accordingly.
To demonstrate just how easy it can be to apply time series forecasting to your data, we’ll take a shot at an excerpt of the training set from Kaggle’s long-standing Store Sales time series forecasting challenge. The data consists of daily unit sales of several consumer products at supermarkets of Corporación Favorita, a large Ecuador-based grocery retailer.
We will use the following tools:
- Preset, a fully managed service for Apache Superset™, a leading open-source business intelligence tool
- Prophet, a time series forecasting tool developed by Facebook for making high-quality predictions of time-based data with trend, seasonality, and holiday effects.
After uploading the CSV in Preset, create a new chart and, in the Advanced Analytics dropdown, select the generic time series chart. For this walkthrough, we’ll generate a simple 30-day forecast of daily units sold for the “grocery I” family of products. Once in the chart editor, configure the chart as follows:
Make sure the date field is configured as a temporal variable. Next, scroll down the Data pane to the predictive analytics dropdown and check the “ENABLE FORECAST” box. Now we just need to specify the periods (since our data has daily granularity, our “periods” are days) and the confidence interval (here we’re using the usual 95% intervals). As for seasonality, you can either let Prophet fend for itself or, if you’ve already explored the data and know the seasonal structure, specify it yourself.
And that's it! It really is that simple. Notice that even with such a small dataset, Prophet has done remarkably well at detecting the shape of the weekly seasonality of sales. Moreover, if you look closely at the 200k line, you’ll notice that, quite impressively, Prophet has picked up on an ever so slightly downward trend. This is something we’d want to treat with caution, though, since it could just be a normal seasonal variation rather than a real trend - we simply have too little data to know for sure. With a little careful observation, solid knowledge of the data and the underlying context, and such powerful tools, literally anyone can produce first-rate predictions!
And that’s a wrap! To recap, in this post we have learned about time series forecasting and how it fits into the spectrum of forecasting methods, and we have explored the main concepts, tools and techniques of the trade. More importantly, we’ve seen how this domain of data science and statistics is pivotal to the smooth functioning of many industries. And, you can now build your own forecast in just a few clicks!
Most of all, we hope we have encouraged you to give these techniques a try in your own projects. When it comes to forecasting, while domain knowledge and experience carry a lot of weight, even the industry veterans need a lot of trial and error to zero in on the right model for their data. Which means that the right time to get started, as always, is right now!