Module 7 of 25 · Time Series Forecasting — ARIMA, SARIMA, Prophet, LSTM, Transformers for Time Series · Intermediate

Feature Engineering for Time Series

Duration: 5 min

This module covers feature engineering for time series data: techniques that turn a raw series into input features a forecasting model can learn from. It focuses on two widely used families, lag features and rolling window statistics. Well-chosen features of this kind often improve forecast accuracy, particularly for models that do not handle time order on their own.

Lag Features

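Lag features are created by shifting the series forward by a fixed number of time steps, so that each row also carries the values observed one, two, or more steps earlier. This exposes the temporal dependence in the data as ordinary input columns. Explicit lag columns matter most for models with no built-in notion of time order, such as linear regression or gradient-boosted trees. ARIMA builds its autoregressive lags internally, and an LSTM typically consumes a window of past values, which is the same information arranged as a sequence.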

import pandas as pd

# Sample time series data
data = {'value': [10, 15, 20, 25, 30, 35, 40, 45, 50, 55]}
df = pd.DataFrame(data)

# Creating lag features
df['lag_1'] = df['value'].shift(1)
df['lag_2'] = df['value'].shift(2)

print(df)

Output:

   value  lag_1  lag_2
0     10    NaN    NaN
1     15   10.0    NaN
2     20   15.0   10.0
3     25   20.0   15.0
4     30   25.0   20.0
5     35   30.0   25.0
6     40   35.0   30.0
7     45   40.0   35.0
8     50   45.0   40.0
9     55   50.0   45.0
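
The NaN entries in the first rows appear because there is no earlier observation to shift in. Most models cannot train on them, so a common step is to drop those rows and split the table into inputs and target. The following is a minimal sketch of that step; the helper name `make_lag_features` and the decision to drop incomplete rows (rather than impute them) are assumptions made for illustration.

```python
from typing import Sequence

import pandas as pd


def make_lag_features(series: pd.Series, lags: Sequence[int]) -> pd.DataFrame:
    """Hypothetical helper: one column per requested lag, incomplete rows dropped."""
    frame = pd.DataFrame({"value": series})
    for lag in lags:
        frame[f"lag_{lag}"] = series.shift(lag)
    # The first max(lags) rows have no full history, so they are removed.
    return frame.dropna()


series = pd.Series([10, 15, 20, 25, 30, 35, 40, 45, 50, 55])
features = make_lag_features(series, lags=[1, 2])

# Inputs are the lagged columns; the target is the current value.
X = features[["lag_1", "lag_2"]]
y = features["value"]

print(features)
```

The original index is kept on purpose, so each remaining row can still be traced back to its position (or timestamp) in the source series.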

Rolling Window Statistics

Rolling window statistics summarize the series over a sliding window of fixed size, for example the mean, sum, max, or min of the last three observations. A rolling mean or sum smooths out noise and tracks the local level of the series, while a rolling max or min highlights short-term extremes and anomalies. With a window of size n, the first n - 1 rows are NaN because the window is not yet full.

import pandas as pd

# Sample time series data
data = {'value': [10, 15, 20, 25, 30, 35, 40, 45, 50, 55]}
df = pd.DataFrame(data)

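# Creating rolling window features
# Each window covers the current row and the two rows before it,
# so the first two rows are NaN (window not yet full)
df['rolling_mean'] = df['value'].rolling(window=3).mean()
df['rolling_sum'] = df['value'].rolling(window=3).sum()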

print(df)
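
Note that `rolling(window=3)` includes the current row in every window. If the model is asked to predict the value in that same row, the feature already contains part of the answer. One common workaround, sketched below, is to shift the series by one step before rolling so that each window sees only past values. The column names are chosen for illustration, and whether the shift is needed depends on how the forecasting target is defined.

```python
import pandas as pd

df = pd.DataFrame({"value": [10, 15, 20, 25, 30, 35, 40, 45, 50, 55]})

# Shift by one step first so each window covers only the three values
# before the current row, then compute the statistics.
past = df["value"].shift(1)
df["past_mean_3"] = past.rolling(window=3).mean()
df["past_max_3"] = past.rolling(window=3).max()

# min_periods=1 yields a value as soon as one past observation exists,
# at the cost of early windows being based on fewer points.
df["past_mean_3_partial"] = past.rolling(window=3, min_periods=1).mean()

print(df)
```

With the shift in place, the first three rows of `past_mean_3` are NaN instead of the first two, because one extra row of history is needed.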

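💡 Tip: Lag and rolling window features inherit whatever trend the raw series has. Models that assume stationarity, and tree-based models that cannot extrapolate beyond the range seen in training, often perform better when the features are built on a stationary series. Check the series for stationarity first and apply differencing where needed; features built on a strongly trending series can mislead the model and hurt performance.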
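
As a small illustration of the differencing mentioned in the tip, the sketch below applies first-order differencing with pandas and then builds a lag feature on the differenced series. It reuses the sample data from this module; the column names are assumptions, and a real series may need a formal stationarity check or more than one round of differencing.

```python
import pandas as pd

df = pd.DataFrame({"value": [10, 15, 20, 25, 30, 35, 40, 45, 50, 55]})

# First-order differencing: value[t] - value[t-1].
# For this linearly increasing sample, every difference is 5.
df["diff_1"] = df["value"].diff()

# Lag feature built on the differenced series rather than the raw one.
df["diff_1_lag_1"] = df["diff_1"].shift(1)

print(df)
```

Forecasts made on the differenced scale have to be added back to the last observed level to return to the original units.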

❓ What is the purpose of creating lag features in time series data?

❓ Which rolling window statistic is useful for highlighting short-term trends in time series data?

Key Concepts

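Concept          Description
Trend            Long-term upward or downward movement in the level of the series
Seasonality      Pattern that repeats at a fixed period, such as daily, weekly, or yearly
Stationarity     Statistical properties such as mean and variance do not change over time
Autocorrelation  Correlation between the series and a lagged copy of itself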
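
Autocorrelation is the concept that connects most directly to lag features: lags at which the series correlates strongly with itself are natural candidates for lag columns. The sketch below uses `Series.autocorr` from pandas on a made-up series that repeats every four steps. The data and the range of lags examined are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical series with a pattern that repeats every 4 steps.
series = pd.Series([1, 3, 2, 5] * 6)

# Series.autocorr(lag=k) is the Pearson correlation between the series
# and a copy of itself shifted by k steps.
acf = {lag: series.autocorr(lag=lag) for lag in range(1, 6)}

for lag, value in acf.items():
    print(lag, round(value, 3))

# The lag with the highest autocorrelation is a candidate lag feature.
best_lag = max(acf, key=acf.get)
print("strongest lag:", best_lag)
```

Here the strongest lag matches the period of the pattern, which is the kind of signal a seasonal lag feature is meant to capture.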

Check Your Understanding

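❓ How do lag and rolling window features behave at the start of a series, where earlier values are missing?

❓ How does the cost of computing rolling window features grow with the length of the series and the window size?

❓ Which parameter has the greatest effect on a rolling window feature, and why?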
