AI-900 | Chapter 42 of 100 | Objective 2.2

Time Series Forecasting

This chapter covers time series forecasting, a supervised machine learning technique used to predict future values based on historical data ordered by time. For the AI-900 exam, this topic appears in approximately 5-8% of questions, primarily in the context of Azure Automated Machine Learning (AutoML) and Azure Cognitive Services (now Azure AI Services) Anomaly Detector. You will need to understand what time series forecasting is, its key components (trend, seasonality, residual), how it differs from regression, and the Azure services that support it. We will also cover the specific AutoML configuration for forecasting tasks and the metrics used to evaluate forecast accuracy.

25 min read
Intermediate
Updated May 31, 2026

Weather Forecast for Sales Prediction

Imagine you are a meteorologist trying to predict tomorrow's temperature. You have historical temperature data, but you also know that temperature follows patterns: it is generally warmer in summer and colder in winter, and it tends to change gradually from day to day. You cannot just use the average of all past temperatures because that ignores these patterns. Instead, you use a model that captures three components: a long-term trend (global warming might cause a slight upward drift), a seasonal pattern (predictable yearly cycle), and residual random noise (unexpected weather events). For sales forecasting, the same logic applies: a retail store's sales have a trend (growing or shrinking over years), seasonality (higher in December holidays, lower in January), and noise (a one-day promotion spike). Time series forecasting algorithms like ARIMA or Exponential Smoothing model these patterns mathematically from the historical data and project them forward. The model learns the repeating patterns and extrapolates them, just as a meteorologist uses past weather to predict future weather, but with the understanding that the further out the prediction, the less certain it becomes.

How It Actually Works

What is Time Series Forecasting?

Time series forecasting is a machine learning technique that predicts future numerical values based on a sequence of observations collected over time. The key characteristic of time series data is that the order of observations matters — each data point is dependent on previous points. For example, daily sales figures, hourly website traffic, monthly energy consumption, and stock prices are all time series data. The goal of forecasting is to model the underlying patterns in the historical data and extend them into the future.

On the AI-900 exam, time series forecasting is categorized under supervised learning because the historical data provides labeled examples (past values) that the model learns from to predict future values. However, unlike standard regression, the features include time-dependent structures such as lags, rolling windows, and date-derived features (day of week, month, quarter).

Why Not Use Standard Regression?

Standard regression assumes that observations are independent and identically distributed (i.i.d.). In time series, observations are correlated — today's value is often similar to yesterday's. Ignoring this temporal dependency leads to poor predictions. Time series forecasting algorithms explicitly model autocorrelation, trends, and seasonality.
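To make the idea of temporal dependency concrete, here is a minimal sketch that measures how strongly each value is related to the one before it (lag-1 autocorrelation). The function name and the sample series are illustrative assumptions, not part of any Azure API.

```python
from statistics import fmean


def autocorrelation(values, lag=1):
    """Sample autocorrelation of a series at the given lag."""
    n = len(values)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(values) - 1")
    mean = fmean(values)
    denom = sum((v - mean) ** 2 for v in values)
    if denom == 0:
        raise ValueError("series is constant, autocorrelation is undefined")
    # Compare each value with the value `lag` steps earlier.
    numer = sum(
        (values[t] - mean) * (values[t - lag] - mean) for t in range(lag, n)
    )
    return numer / denom


# Hypothetical data: a steadily rising series and a strictly alternating one.
trending = [10, 12, 13, 15, 16, 18, 19, 21]
alternating = [1, -1, 1, -1, 1, -1, 1, -1]

print(autocorrelation(trending))     # clearly positive
print(autocorrelation(alternating))  # clearly negative
```

A value far from zero means neighboring observations carry information about each other, which is exactly what the independence assumption of standard regression ignores.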

Key Components of Time Series

Trend: The long-term increase or decrease in the data. For example, a company's sales may grow 5% year over year. Trends can be linear or nonlinear.

Seasonality: A repeating pattern over a fixed period, such as daily, weekly, monthly, or yearly. For instance, ice cream sales peak every summer.

Residual (or Noise): The random variation after removing trend and seasonality. A good model captures trend and seasonality, leaving only white noise.
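The three components can be illustrated with a simplified additive decomposition, where each value is treated as trend + seasonal + residual. This is a teaching sketch, not the method AutoML uses internally: it assumes an odd seasonal period and at least two full cycles of data, and the function name is hypothetical.

```python
from statistics import fmean


def decompose_additive(values, period):
    """Split a series into trend, seasonal, and residual lists (additive model)."""
    if period % 2 == 0:
        raise ValueError("this sketch only handles odd periods")
    half = period // 2
    n = len(values)

    # Trend: centered moving average over one full cycle (None at the edges).
    trend = [None] * n
    for t in range(half, n - half):
        trend[t] = fmean(values[t - half : t + half + 1])

    # Seasonal: average detrended value for each position in the cycle.
    buckets = [[] for _ in range(period)]
    for t in range(n):
        if trend[t] is not None:
            buckets[t % period].append(values[t] - trend[t])
    raw = [fmean(b) for b in buckets]
    offset = fmean(raw)  # center the seasonal effects around zero
    seasonal = [raw[t % period] - offset for t in range(n)]

    # Residual: whatever trend and seasonality do not explain.
    residual = [
        None if trend[t] is None else values[t] - trend[t] - seasonal[t]
        for t in range(n)
    ]
    return trend, seasonal, residual


# Hypothetical series: linear trend plus a repeating 3-step pattern, no noise.
pattern = [2, -1, -1]
series = [t + pattern[t % 3] for t in range(12)]
trend, seasonal, residual = decompose_additive(series, period=3)
```

Because the sample series contains no noise, the residuals come out as (approximately) zero, which is what "leaving only white noise" looks like in the ideal case.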

How Time Series Forecasting Works in Azure AutoML

Azure Automated Machine Learning (AutoML) provides a built-in forecasting task. When you configure a forecasting run, AutoML automatically performs the following steps:

1. Data Preparation: The dataset must include a timestamp column (date or datetime) and a target column (the value to forecast). Optional columns include time series ID (for multiple series, e.g., different stores) and exogenous features (external factors like promotions or weather).

2. Feature Engineering: AutoML creates time-based features such as day of week, month, quarter, year, day of year, and holiday indicators. It also generates lag features (value from previous time steps) and rolling window aggregates (e.g., 7-day moving average).

3. Model Selection: AutoML evaluates multiple algorithms including ARIMA, Exponential Smoothing, Prophet, Gradient Boosting (LightGBM, XGBoost), and Neural Networks (TCN). It uses a combination of cross-validation (time series split) and hyperparameter tuning to select the best model.

4. Forecasting Horizon: You specify the forecast horizon (how many steps into the future to predict). AutoML trains models to predict up to that horizon.

5. Evaluation Metrics: The default primary metric for forecasting is Normalized Root Mean Squared Error (NRMSE); Normalized Mean Absolute Error (NMAE) is also available. These metrics are normalized by the range of the target variable to allow comparison across different scales.
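The lag and rolling window features mentioned in the feature engineering step can be sketched in a few lines. This is an illustration of the idea only; the function name, the feature names, and the sample sales figures are assumptions, and AutoML's own feature generation is more elaborate.

```python
from statistics import fmean


def make_lag_features(values, lags=(1, 2), window=3):
    """Build one feature row per time step using past values only."""
    start = max(max(lags), window)  # first step with enough history
    rows = []
    for t in range(start, len(values)):
        row = {f"lag_{k}": values[t - k] for k in lags}
        # Rolling mean over the `window` values strictly before time t,
        # so the target at time t never leaks into its own features.
        row[f"rolling_mean_{window}"] = fmean(values[t - window : t])
        row["target"] = values[t]
        rows.append(row)
    return rows


# Hypothetical daily sales.
sales = [100, 110, 105, 120, 130, 125]
rows = make_lag_features(sales)
for row in rows:
    print(row)
```

Each row turns the forecasting problem into a table a regression-style learner can consume, which is how tree-based models such as LightGBM can be applied to time series.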

Azure Cognitive Services Anomaly Detector

Another Azure service related to time series is the Anomaly Detector (part of Azure AI Services). It detects anomalies in time series data in three ways: batch detection across a whole series, detection on the latest point of a streaming series, and change point detection. It is often used for monitoring and alerting. For the exam, know that Anomaly Detector can identify unusual patterns in real-time data streams.

Evaluation Metrics for Forecasting

Mean Absolute Error (MAE): Average absolute difference between predicted and actual values.

Root Mean Squared Error (RMSE): Square root of the average squared differences. Penalizes large errors more.

Normalized RMSE (NRMSE): RMSE divided by the range (max-min) of the actual values. Allows comparison across datasets.

Mean Absolute Percentage Error (MAPE): Average of absolute percentage errors. Not suitable if actual values are close to zero.

AutoML reports both training and validation metrics. The validation uses a time series split (rolling origin cross-validation) to avoid lookahead bias.
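The metric definitions above translate directly into code. The sketch below uses only the standard library; the function names and sample numbers are illustrative, and NRMSE is normalized by the range of the actual values as described in this section.

```python
from math import sqrt


def mae(actual, predicted):
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Squared Error."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def nrmse(actual, predicted):
    """RMSE divided by the range (max - min) of the actual values."""
    return rmse(actual, predicted) / (max(actual) - min(actual))


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent. Breaks if any actual is 0."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical actuals and forecasts.
actual = [100, 200, 300, 400]
predicted = [110, 190, 310, 380]

print(mae(actual, predicted))    # 12.5
print(rmse(actual, predicted))
print(nrmse(actual, predicted))
print(mape(actual, predicted))
```

Note how the single 20-unit miss pulls RMSE above MAE: squaring the errors gives large misses more weight.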

Configuration Details in Azure AutoML

When using the Azure Machine Learning studio or SDK, you set the task type to forecasting. Key parameters include:

forecast_horizon: Number of periods to forecast (e.g., 10 days).

time_column_name: Name of the datetime column.

target_column_name: Name of the target variable.

time_series_id_column_names: Columns that identify different time series (e.g., store ID). If omitted, all data is treated as one series.

featurization_config: Options for automatic feature engineering (e.g., 'auto' or custom).

max_horizon: Older name for forecast_horizon used in earlier SDK releases; prefer forecast_horizon in new code.

Example Python SDK configuration:

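# ForecastingParameters comes from the Azure ML Python SDK v1 (azureml-automl-core).
from azureml.automl.core.forecasting_parameters import ForecastingParameters

forecasting_parameters = ForecastingParameters(
    time_column_name='date',        # datetime column that orders the series
    forecast_horizon=10,            # predict 10 periods ahead
    target_rolling_window_size=5,   # rolling aggregates over the last 5 target values
    target_lags='auto'              # let AutoML choose which lags to generate
)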

How Time Series Forecasting Interacts with Other Azure Services

Azure Data Lake / Blob Storage: Store historical time series data.

Azure Machine Learning: Train and deploy forecasting models.

Azure Cognitive Services Anomaly Detector: Monitor predictions for anomalies.

Power BI: Visualize forecasts and share reports.

Azure Stream Analytics: Real-time anomaly detection on streaming data.

Limitations and Considerations

Forecasting becomes less accurate as the forecast horizon increases.

Models may not extrapolate well if the time series has a regime change (e.g., new product launch).

Missing data points must be handled (interpolation or forward fill).

Seasonality must be known or detected by the model.
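The two gap-filling strategies named above, forward fill and interpolation, can be sketched as follows. Missing observations are represented as None; the function names and sample data are assumptions for illustration.

```python
def forward_fill(values):
    """Replace each None with the most recent observed value."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)  # stays None if nothing has been observed yet
    return filled


def interpolate_linear(values):
    """Fill gaps between two known values along a straight line.

    Leading and trailing gaps are left as None because there is no
    second endpoint to interpolate toward.
    """
    out = list(values)
    known = [i for i, v in enumerate(out) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (out[right] - out[left]) / (right - left)
        for i in range(left + 1, right):
            out[i] = out[left] + step * (i - left)
    return out


# Hypothetical series with a two-step gap and a missing final value.
readings = [10, None, None, 16, None]
print(forward_fill(readings))        # [10, 10, 10, 16, 16]
print(interpolate_linear(readings))  # [10, 12.0, 14.0, 16, None]
```

Forward fill only uses the past, so it is safe for data arriving in real time; interpolation needs a later value and therefore only applies to historical data.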

Exam Relevance

For AI-900, you are not expected to implement forecasting models from scratch. Instead, you should know:

The definition of time series forecasting.

The difference between regression and forecasting.

The components: trend, seasonality, residual.

That Azure AutoML supports forecasting tasks.

That Anomaly Detector can detect anomalies in time series.

Common evaluation metrics (MAE, RMSE, MAPE).

The concept of forecast horizon.

Walk-Through

1. Collect and Prepare Data

Historical time series data is gathered, ensuring a timestamp column and target variable. The data must be sorted chronologically. Missing values are handled (e.g., forward fill or interpolation). For multiple series, a time series ID column is specified. The data is split into training and test sets, with the test set being the most recent period (e.g., last 30 days).
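A chronological split differs from the random split used in ordinary supervised learning: the test set is always the most recent slice. The sketch below shows the idea with hypothetical record and column names.

```python
from datetime import date, timedelta


def chronological_split(records, time_key, test_size):
    """Sort by time and hold out the most recent `test_size` records."""
    ordered = sorted(records, key=lambda r: r[time_key])
    if not 0 < test_size < len(ordered):
        raise ValueError("test_size must be between 1 and len(records) - 1")
    return ordered[:-test_size], ordered[-test_size:]


# Hypothetical daily sales, deliberately out of order.
records = [
    {"day": date(2024, 1, 1) + timedelta(days=i), "sales": 100 + i}
    for i in (3, 0, 4, 1, 2)
]
train, test = chronological_split(records, "day", test_size=2)
print([r["sales"] for r in train])  # [100, 101, 102]
print([r["sales"] for r in test])   # [103, 104]
```

Shuffling before splitting would let the model train on days that come after the ones it is tested on, which overstates accuracy.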

2. Configure AutoML Forecasting

In Azure Machine Learning, you create an Automated ML run with task type 'forecasting'. You set parameters like forecast horizon (e.g., 10 periods), time column name, target column, and time series ID columns. Optionally, you can specify target lags (how many past values to include as features) and rolling window size (e.g., 7-day moving average). AutoML may also automatically detect seasonality.

3. Feature Engineering by AutoML

AutoML automatically generates calendar-based features (day of week, month, quarter, year, day of year, holiday flags) and lag features (value at t-1, t-2, etc.). It also creates rolling window aggregates (mean, min, max over a window). These features help the model learn temporal patterns. Exogenous features provided by the user (e.g., promotion flag) are also included.
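Calendar-based features are simple to derive from a timestamp. Here is a minimal sketch using Python's datetime module; the feature names are illustrative, and holiday flags are left out because they require a country-specific holiday calendar.

```python
from datetime import date


def calendar_features(d):
    """Derive calendar features from a single date."""
    return {
        "day_of_week": d.weekday(),            # Monday = 0 ... Sunday = 6
        "month": d.month,
        "quarter": (d.month - 1) // 3 + 1,
        "year": d.year,
        "day_of_year": d.timetuple().tm_yday,
        "is_weekend": d.weekday() >= 5,
    }


features = calendar_features(date(2024, 12, 25))
print(features)
```

Features like these let a model learn, for example, that Saturdays or fourth-quarter days behave differently without being told so explicitly.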

4. Model Training and Selection

AutoML trains multiple models, including ARIMA, Exponential Smoothing, Prophet, Gradient Boosting (LightGBM, XGBoost), and Neural Networks (TCN). It uses time series cross-validation (rolling origin) to evaluate each model. The cross-validation splits the data sequentially, ensuring no future data leaks into training. Hyperparameters are tuned automatically.
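Rolling origin cross-validation can be pictured as a series of cut-off points that move forward in time. The sketch below generates index splits with an expanding training window; it is one common variant, written under the assumption that each validation fold is exactly one horizon long, and the function name is hypothetical.

```python
def rolling_origin_splits(n_samples, n_folds, horizon):
    """Yield (train_indices, validation_indices) pairs in time order."""
    first_origin = n_samples - n_folds * horizon
    if first_origin <= 0:
        raise ValueError("not enough samples for the requested folds and horizon")
    for fold in range(n_folds):
        origin = first_origin + fold * horizon
        # Train on everything before the origin, validate on the next `horizon` steps.
        yield list(range(origin)), list(range(origin, origin + horizon))


splits = list(rolling_origin_splits(n_samples=10, n_folds=3, horizon=2))
for train_idx, val_idx in splits:
    print(train_idx, val_idx)
```

In every fold the validation indices come strictly after the training indices, which is the property that prevents lookahead bias.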

5. Evaluate and Deploy Best Model

The best model is selected based on the primary metric (e.g., normalized RMSE). AutoML shows training and validation metrics. The model can be deployed as a web service to generate forecasts on new data. In production, you can retrain periodically (e.g., weekly) with new data to keep the model accurate.

What This Looks Like on the Job

Enterprise Scenario 1: Retail Demand Forecasting

A large retail chain with hundreds of stores uses Azure AutoML to forecast daily sales for each store and product category. The historical data includes three years of daily sales, promotions, holidays, and weather data. The forecasting model predicts sales 14 days ahead to optimize inventory replenishment and is retrained weekly. The solution uses store ID and product category as the time series ID columns. Misconfiguring them, for example by omitting the time series ID columns, would treat all stores as one series and produce inaccurate forecasts, because each store has different seasonal patterns. The forecast horizon is set to 14 days, and target lags of 1, 7, and 14 days are automatically generated. The model achieves an NRMSE of 0.08, meaning its RMSE is 8% of the range of observed sales.
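To see what the time series ID columns do conceptually, the sketch below separates a mixed table of rows into one ordered series per store and category. The column names and data are hypothetical, and this only illustrates the grouping idea, not how AutoML stores series internally.

```python
from collections import defaultdict


def split_by_series(records, id_columns, time_column, target_column):
    """Group rows by their ID columns and return each series in time order."""
    series = defaultdict(list)
    for r in records:
        key = tuple(r[c] for c in id_columns)
        series[key].append((r[time_column], r[target_column]))
    return {
        key: [value for _, value in sorted(obs, key=lambda o: o[0])]
        for key, obs in series.items()
    }


# Hypothetical rows for two stores, interleaved and out of order.
rows = [
    {"store": "A", "category": "toys", "day": 2, "sales": 12},
    {"store": "B", "category": "toys", "day": 1, "sales": 50},
    {"store": "A", "category": "toys", "day": 1, "sales": 10},
    {"store": "B", "category": "toys", "day": 2, "sales": 55},
]
by_series = split_by_series(rows, ["store", "category"], "day", "sales")
print(by_series)
```

Without the grouping, the values 10, 50, 12, 55 would be read as a single wildly oscillating series, which is the failure mode described in the scenario.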

Enterprise Scenario 2: Energy Consumption Forecasting

A utility company forecasts hourly electricity demand to balance supply and demand. They use Azure AutoML with a forecast horizon of 48 hours. The data includes hourly consumption for the past two years, temperature, and day of week. The model automatically detects daily and weekly seasonality. Anomaly Detector monitors real-time consumption for unexpected spikes (e.g., equipment failure). If the forecast deviates significantly from actual, an alert is triggered. A common mistake is giving the model too little history or too short a lag window, so it never sees a complete weekly pattern. The training data and lag features should cover at least one full seasonal cycle (e.g., 168 hours for a week), and preferably several.

Enterprise Scenario 3: Website Traffic Forecasting

An e-commerce company forecasts daily website visitors to allocate server capacity. They use Azure AutoML with a forecast horizon of 30 days. The data includes daily visitors for two years and marketing campaign spend. The model detects yearly seasonality (higher traffic during December holidays). They deploy the model as a web service and integrate it with Power BI dashboards. Because server downtime leaves gaps in the data, missing days are filled with forward fill. A misconfiguration to watch for is a long horizon without the matching seasonality: forecasting 400 days ahead with a model that has not captured yearly seasonality would miss the holiday peak.

How AI-900 Actually Tests This

AI-900 Exam Focus on Time Series Forecasting

Objective Code: Domain 2 (Machine Learning), Objective 2.2 (Identify common machine learning tasks). The exam expects you to recognize time series forecasting as a supervised learning task distinct from regression and classification.

What the Exam Tests

Definition: You must be able to identify a scenario that requires time series forecasting (e.g., predicting sales for the next 12 months based on historical data).

Components: Trend, seasonality, and residual are the three components. The exam may ask which component is responsible for a repeating pattern (answer: seasonality).

Azure AutoML: Know that AutoML supports forecasting as a task type. You do not need to remember exact parameter names, but understand that AutoML can automatically generate time-based features.

Anomaly Detector: This Azure Cognitive Service can detect anomalies in time series data. It is not the same as forecasting — it identifies unusual points.

Metrics: The exam may ask which metric is used to evaluate forecasting models. Common answers include RMSE, MAE, MAPE. You should know that normalized metrics are used for comparison.

Common Wrong Answers and Why Candidates Choose Them

1. Confusing forecasting with regression: A question might describe predicting house prices based on square footage (regression) vs. predicting next month's sales (forecasting). Candidates often choose regression for time-based data because they think 'predicting a number' is always regression. The key is that forecasting requires time order.

2. Selecting classification for a time series task: Some candidates think predicting future values is classification because it's a 'category' of future. But forecasting outputs a continuous number, not a discrete class.

3. Misidentifying Anomaly Detector as a forecasting tool: The exam may ask which Azure service is used to predict future values. Candidates might pick Anomaly Detector because it deals with time series. However, Anomaly Detector only identifies anomalies, not forecasts.

4. Choosing the wrong metric: Questions about evaluation may list accuracy, precision, recall, or RMSE. RMSE is correct for forecasting; accuracy is for classification.

Specific Numbers and Terms to Memorize

Forecast horizon: the number of future time steps to predict.

NRMSE: Normalized Root Mean Squared Error.

Time series ID: column that distinguishes multiple series.

Trend, seasonality, residual.

AutoML: Automated Machine Learning.

Anomaly Detector: part of Azure AI Services.

Edge Cases and Exceptions

If the time series has no seasonality, the model may still detect a trend.

If the forecast horizon is longer than the training data, the model may extrapolate poorly.

Missing timestamps must be handled; AutoML can do forward fill.

How to Eliminate Wrong Answers

If the question mentions 'future values based on historical data ordered by time', it is forecasting.

If the question mentions 'detecting unusual spikes', it is anomaly detection.

If the question mentions 'predicting a category', it is classification.

If the question mentions 'predicting a continuous value without time order', it is regression.

Key Takeaways

Time series forecasting is a supervised learning task that predicts future numerical values based on historical time-ordered data.

The three components of a time series are trend, seasonality, and residual.

Azure AutoML supports forecasting as a task type and automatically engineers time-based features.

The forecast horizon specifies how many future time steps to predict.

Common evaluation metrics include RMSE, MAE, MAPE, and their normalized versions (NRMSE, NMAE).

Azure Anomaly Detector identifies anomalies in time series data, not forecasts.

Time series forecasting differs from regression because it models temporal dependencies.

AutoML can handle multiple time series using a time series ID column.

The default primary metric for AutoML forecasting is normalized RMSE.

Forecasting accuracy decreases as the forecast horizon increases.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

Time Series Forecasting

Predicts future values based on historical time-ordered data.

Uses time-based features (lags, seasonality, trend).

Evaluated with RMSE, MAE, MAPE.

Requires time column and optional time series ID.

Examples: sales prediction, weather forecasting.

Regression

Predicts a continuous value from independent features.

Assumes observations are independent and identically distributed.

Evaluated with RMSE, MAE, R-squared.

No time column required.

Examples: house price prediction, salary estimation.

Watch Out for These

Mistake

Time series forecasting is the same as regression.

Correct

Regression assumes independent observations, while time series forecasting models temporal dependencies. Forecasting uses time-based features like lags and seasonality. The exam distinguishes them: regression predicts a number from independent features, forecasting predicts future values from past values.

Mistake

Anomaly Detector can forecast future values.

Correct

Anomaly Detector only identifies anomalies in existing time series data. It does not predict future values. Forecasting is done by AutoML or custom models.

Mistake

The forecast horizon can be arbitrarily long without loss of accuracy.

Correct

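Accuracy decreases as the forecast horizon increases, and the prediction intervals around the forecast widen. The usable horizon is also bounded by the training data: each series needs enough history to cover the horizon, the lags, and the cross-validation folds.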

Mistake

All time series data must be stationary before modeling.

Correct

Some classical models assume stationarity (ARIMA, for example, differences the series to achieve it), but AutoML handles non-stationary data by differencing or by modeling the trend directly. The exam does not test stationarity.

Mistake

Time series forecasting always uses neural networks.

Correct

AutoML evaluates multiple algorithms including statistical models (ARIMA, Exponential Smoothing) and tree-based models. Neural networks are only one option.


Frequently Asked Questions

What is the difference between time series forecasting and regression?

Time series forecasting predicts future values based on past values ordered in time, incorporating temporal patterns like trend and seasonality. Regression predicts a continuous value from independent features that are not time-ordered. In forecasting, observations are dependent on previous ones; in regression, they are assumed independent. On the exam, if the data has a timestamp and the goal is to predict future values, it is forecasting.

Which Azure service is used for time series forecasting?

Azure Automated Machine Learning (AutoML) in Azure Machine Learning supports time series forecasting. You set the task type to 'forecasting'. Additionally, Azure Cognitive Services Anomaly Detector can analyze time series but does not forecast. For exam questions, know that AutoML is the primary forecasting tool.

What are trend, seasonality, and residual in time series?

Trend is the long-term direction (upward or downward). Seasonality is a repeating pattern over a fixed period (e.g., weekly, yearly). Residual is the random noise after removing trend and seasonality. A good forecasting model captures trend and seasonality, leaving only residual. The exam may ask which component represents a repeating cycle.

What is the forecast horizon in AutoML?

The forecast horizon is the number of future time steps to predict. For example, if you have daily data and want to predict the next 10 days, the horizon is 10. It is set in the AutoML configuration. The exam may test that a longer horizon generally reduces accuracy.
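A seasonal naive forecast is a handy way to see what the horizon means in practice: it simply repeats the last observed cycle for as many steps as requested. It is shown here as a baseline illustration only; the function name and the data are assumptions.

```python
def seasonal_naive_forecast(history, season_length, horizon):
    """Forecast `horizon` steps by repeating the last full seasonal cycle."""
    if len(history) < season_length:
        raise ValueError("need at least one full season of history")
    last_cycle = history[-season_length:]
    return [last_cycle[h % season_length] for h in range(horizon)]


# Hypothetical two weeks of daily values, weekly seasonality, 10-day horizon.
history = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
forecast = seasonal_naive_forecast(history, season_length=7, horizon=10)
print(forecast)  # [8, 9, 10, 11, 12, 13, 14, 8, 9, 10]
```

The output has exactly 10 values, one per future day. Beyond the first cycle the baseline can only repeat itself, which hints at why accuracy generally drops as the horizon grows.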

How does AutoML evaluate forecasting models?

AutoML uses normalized RMSE (NRMSE) as the primary metric, which is RMSE divided by the range of the target variable. It also reports MAE, MAPE, and other metrics. Cross-validation uses a time series split (rolling origin) to avoid lookahead bias. The exam may ask which metric is used for forecasting.

Can AutoML forecast multiple time series at once?

Yes, by specifying a time series ID column (e.g., store ID). AutoML trains a model that can handle multiple series, either by treating them as separate series or by sharing information across series. The exam may ask about the purpose of the time series ID column.

What is the difference between Anomaly Detector and forecasting?

Anomaly Detector identifies unusual data points in a time series (spikes, dips, change points). It does not predict future values. Forecasting predicts future values. Both work with time series data but serve different purposes. The exam may present a scenario and ask which service to use.
