This chapter covers regression and classification, two fundamental supervised machine learning tasks. For the AI-900 exam, this topic appears in approximately 15-20% of questions under objective 2.2. Understanding the difference between predicting numeric values (regression) and categorical labels (classification) is essential. We will explore how each works, key algorithms, evaluation metrics, and how Azure Machine Learning implements them. By the end, you will be able to identify which task fits a given scenario and interpret common metrics.
Imagine a chef who must predict how many portions of each dish to prepare for a banquet. For regression, the chef predicts a continuous quantity—like 'we will need 15.7 kg of pasta'—based on past guest counts and appetites. The chef uses a recipe (model) that maps guest count to pasta weight, and adjusts it using historical data (training). For classification, the chef predicts a category—like 'the main course will be either chicken, fish, or vegetarian'—based on past preferences. The chef uses a different recipe that outputs probabilities for each category and picks the highest. The key mechanistic difference: regression outputs a number on a continuous scale; classification outputs a discrete label. In Azure, automated ML helps the chef choose the best recipe (algorithm) and tweak ingredients (hyperparameters) to minimize prediction error. The chef must also evaluate the recipe using metrics like RMSE for regression and accuracy for classification, ensuring the predictions are reliable for the banquet.
What is Regression?
Regression is a supervised machine learning technique used to predict a continuous numeric value. The goal is to model the relationship between one or more independent (input) variables and a dependent (target) variable. For example, a regression model might predict house prices from square footage, number of bedrooms, and location. The output is a real number (e.g., $350,000.50).
How Regression Works
A regression algorithm learns a function f(X) that approximates the target y. The most common algorithm is Linear Regression, which assumes a linear relationship: y = w0 + w1*x1 + w2*x2 + ... + wn*xn + ε, where w are weights (coefficients) and ε is error. During training, the algorithm minimizes a loss function—typically Mean Squared Error (MSE)—by adjusting weights using optimization techniques like Gradient Descent.
Gradient Descent: Iteratively updates weights in the direction that reduces the loss. The learning rate controls step size. Too high: overshoots minimum; too low: slow convergence.
Regularization: Techniques like Lasso (L1) and Ridge (L2) add penalty terms to prevent overfitting. L1 can shrink some coefficients to zero, effectively performing feature selection.
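The training loop described above can be sketched in a few lines. This is a minimal illustration, not how Azure ML implements it: the function name `fit_linear`, the toy data, and the default learning rate are all assumptions made for the example. It fits a one-feature linear model by gradient descent on MSE, with an optional Ridge (L2) penalty applied to the slope only.

```python
# Minimal sketch: fit y = w0 + w1*x by gradient descent on MSE,
# with an optional L2 (Ridge) penalty on the slope. Toy data only.

def fit_linear(xs, ys, learning_rate=0.05, l2=0.0, epochs=2000):
    w0, w1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction errors under the current weights.
        errors = [(w0 + w1 * x) - y for x, y in zip(xs, ys)]
        # Gradients of MSE, plus the Ridge term for the slope.
        grad_w0 = (2 / n) * sum(errors)
        grad_w1 = (2 / n) * sum(e * x for e, x in zip(errors, xs)) + 2 * l2 * w1
        # Step against the gradient; learning_rate sets the step size.
        w0 -= learning_rate * grad_w0
        w1 -= learning_rate * grad_w1
    return w0, w1

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # generated from y = 1 + 2x

w0, w1 = fit_linear(xs, ys)                      # no penalty
w0_ridge, w1_ridge = fit_linear(xs, ys, l2=1.0)  # with Ridge penalty

print(round(w0, 3), round(w1, 3))
print(round(w0_ridge, 3), round(w1_ridge, 3))
```

Without the penalty the weights recover the line that generated the data. With the penalty the slope is pulled toward zero, which is the shrinkage effect that regularization relies on.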
Other regression algorithms include:
- Decision Trees: Non-linear; split data based on feature thresholds.
- Random Forest: Ensemble of decision trees; reduces overfitting.
- Neural Networks: Deep learning models for complex relationships.
Evaluation Metrics for Regression
The AI-900 exam focuses on three key metrics:
- Mean Absolute Error (MAE): Average of absolute differences between predicted and actual values. Less sensitive to outliers than RMSE.
- Root Mean Squared Error (RMSE): Square root of MSE. Penalizes larger errors more heavily.
- R-squared (R²): Proportion of variance explained by the model. Typically between 0 and 1, with higher being better; it can be negative when a model fits worse than always predicting the mean.
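Example: If actual prices are [100, 200, 300] and predictions are [110, 190, 290], then MAE = (10+10+10)/3 = 10 and RMSE = sqrt((100+100+100)/3) = 10. For R², the sum of squared errors is 300 and the total sum of squares around the mean of 200 is 10000 + 0 + 10000 = 20000, so R² = 1 - (300/20000) = 0.985.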
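The arithmetic in the example above can be reproduced with a short script. This is only a sketch using Python's standard library; the variable names are chosen for illustration.

```python
import math

actual = [100, 200, 300]
predicted = [110, 190, 290]
n = len(actual)

errors = [p - a for p, a in zip(predicted, actual)]

# MAE: mean of absolute errors.
mae = sum(abs(e) for e in errors) / n

# RMSE: square root of the mean squared error.
rmse = math.sqrt(sum(e ** 2 for e in errors) / n)

# R²: 1 - (residual sum of squares / total sum of squares).
mean_actual = sum(actual) / n
ss_res = sum(e ** 2 for e in errors)
ss_tot = sum((a - mean_actual) ** 2 for a in actual)
r2 = 1 - ss_res / ss_tot

print(mae, rmse, round(r2, 3))  # 10.0 10.0 0.985
```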
What is Classification?
Classification predicts a discrete category or class label. Unlike regression, the output is not a continuous number but a class from a finite set. For example, predicting whether an email is 'spam' or 'not spam' (binary classification) or classifying handwritten digits 0-9 (multiclass classification).
How Classification Works
Classification algorithms output a probability distribution over classes, then assign the class with the highest probability. Common algorithms:
- Logistic Regression: Despite its name, it's for binary classification. Uses the sigmoid function to output a probability between 0 and 1. The decision threshold is typically 0.5.
- Decision Trees: Split data based on feature values to create pure leaf nodes.
- Random Forest: Ensemble of decision trees; averages probabilities.
- Support Vector Machine (SVM): Finds the hyperplane that best separates classes.
- Neural Networks: For complex tasks like image classification.
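To make the logistic regression step concrete, here is a minimal sketch of how a probability becomes a label. The weights, bias, and feature values are hypothetical numbers picked for illustration, not a trained model.

```python
import math

def sigmoid(z):
    # Squashes any real number into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(features, weights, bias):
    # Weighted sum of the features, passed through the sigmoid.
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

def predict_label(features, weights, bias, threshold=0.5):
    # The probability is turned into a class by comparing to a threshold.
    return 1 if predict_proba(features, weights, bias) >= threshold else 0

# Hypothetical model and input (illustrative values only).
weights = [1.2, -0.8]
bias = -0.5
email = [2.0, 1.0]

p = predict_proba(email, weights, bias)    # sigmoid(1.1)
label = predict_label(email, weights, bias)
print(round(p, 3), label)
```

The probability is a continuous number, but the model's answer to the task is the thresholded label, which is why this counts as classification.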
Evaluation Metrics for Classification
The exam tests these metrics:
- Accuracy: (TP+TN)/(TP+TN+FP+FN). Simple but misleading for imbalanced data.
- Precision: TP/(TP+FP). Of predicted positives, how many are actually positive.
- Recall (Sensitivity): TP/(TP+FN). Of actual positives, how many were correctly predicted.
- F1 Score: Harmonic mean of precision and recall. 2*(Precision*Recall)/(Precision+Recall).
- AUC-ROC: Area under the ROC curve, which plots true positive rate against false positive rate. It ranges from 0 to 1; 0.5 corresponds to random guessing and 1 to a perfect classifier.
For multiclass, metrics are averaged (macro, micro, weighted).
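The formulas above translate directly into code. The sketch below uses a hypothetical helper, `binary_metrics`, and made-up confusion-matrix counts for illustration.

```python
def binary_metrics(tp, tn, fp, fn):
    """Compute the four headline metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Guard against division by zero when a model predicts no positives
    # or the data contains no positives.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Made-up counts: 40 true positives, 45 true negatives,
# 5 false positives, 10 false negatives.
m = binary_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in m.items():
    print(name, round(value, 3))
```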
Regression vs Classification – Key Differences
Output type: Continuous number vs discrete class.
Loss function: MSE, MAE vs Cross-Entropy (log loss).
Evaluation: RMSE, R² vs Accuracy, F1.
Algorithms: Linear regression, random forest regressor vs logistic regression, random forest classifier.
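The loss-function row in the comparison above can be illustrated with a small sketch. The function names and sample values are assumptions for the example; the log loss shown is the binary form of cross-entropy.

```python
import math

def mse(y_true, y_pred):
    # Regression loss: mean of squared differences.
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def log_loss(y_true, p_pred, eps=1e-12):
    # Binary cross-entropy: y_true holds 0/1 labels, p_pred holds the
    # predicted probability of class 1. eps avoids log(0).
    total = 0.0
    for t, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

regression_loss = mse([3.0, 5.0], [2.0, 7.0])   # (1 + 4) / 2
good_classifier = log_loss([1, 0], [0.9, 0.2])  # confident and right
bad_classifier = log_loss([1, 0], [0.1, 0.8])   # confident and wrong

print(regression_loss)
print(round(good_classifier, 4), round(bad_classifier, 4))
```

Cross-entropy grows sharply when the model assigns a low probability to the true class, which is why confident wrong predictions are penalized heavily.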
Implementation in Azure Machine Learning
Azure ML provides automated ML (AutoML) that automatically selects the best algorithm and hyperparameters. For regression, AutoML tries algorithms like LinearRegression, RandomForestRegressor, and LightGBM. For classification, it tries LogisticRegression, RandomForestClassifier, and others. It ranks models using the primary metric you specify (e.g., RMSE for regression, accuracy for classification).
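Training: You provide a dataset with features and a target column. If you do not supply validation data or settings, AutoML chooses a validation approach from the dataset size: larger datasets get a single train/validation split, and smaller ones get k-fold cross-validation with a fold count that depends on the number of rows. Validating on data the model was not fit on helps detect overfitting.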
Featurization: AutoML automatically handles missing values, scales numeric features, and encodes categorical features.
Explainability: Azure ML integrates with InterpretML to provide feature importance plots.
When to Use Which
Use regression when the target is a continuous number: sales, temperature, price.
Use classification when the target is a category: churn (yes/no), disease type, image object.
Exam Trap: Confusing Regression with Classification
Common wrong answer: 'Predicting whether a customer will buy a product is regression because it outputs a probability.' Reality: Probability is continuous, but the final output is a binary class (buy/not buy). If the task is to predict the exact probability (0.73), that's regression. If the task is to predict 'buy' or 'not buy' based on probability threshold, it's classification. The exam tests this distinction.
Define the Problem Type
First, determine if the target variable is continuous (regression) or categorical (classification). For example, predicting 'price' is regression; predicting 'color' is classification. This step is crucial because it dictates which algorithms and metrics to use. In Azure ML, you specify the task type when configuring AutoML. A common mistake is treating a binary outcome (e.g., 'yes/no') as regression because it's encoded as 0/1. However, the output is still categorical, so classification is correct.
Collect and Prepare Data
Gather historical data with features and known target values. Clean the data by handling missing values (e.g., impute the mean or median for numeric features and the mode for categorical features), removing duplicates, and encoding categorical features (one-hot encoding or label encoding). Scale numeric features using standardization (z-score) or normalization (min-max). In Azure ML, these steps are automated in AutoML, but you can customize featurization. For classification, check class balance; if the classes are imbalanced, use techniques like SMOTE or class weights.
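The preparation steps above can be sketched with Python's standard library. The helper names (`impute`, `standardize`, `one_hot`) and the sample columns are hypothetical; AutoML's own featurization is more involved than this.

```python
from statistics import mean, mode, pstdev

def impute(values, strategy):
    # Replace None with the mean (numeric) or mode (categorical)
    # of the known values.
    known = [v for v in values if v is not None]
    fill = mean(known) if strategy == "mean" else mode(known)
    return [fill if v is None else v for v in values]

def standardize(values):
    # z-score: subtract the mean, divide by the standard deviation.
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

def one_hot(values):
    # One 0/1 column per category, in sorted category order.
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]

# Hypothetical columns with one missing value each.
ages = [20, None, 40, 60]
plans = ["basic", "premium", None, "basic"]

ages_filled = impute(ages, "mean")    # [20, 40, 40, 60]
plans_filled = impute(plans, "mode")  # None becomes "basic"

ages_scaled = standardize(ages_filled)
plans_encoded = one_hot(plans_filled)

print(ages_filled, plans_filled)
print([round(z, 3) for z in ages_scaled])
print(plans_encoded)
```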
Split Data into Training and Test Sets
Typically, 70-80% of the data is used for training and 20-30% for testing. The training set is used to fit the model; the test set evaluates final performance. In cross-validation, the data is split into k folds (in AutoML, the default fold count depends on dataset size unless you set it). Each fold acts as the validation set once while the model trains on the remaining folds, which reduces variance in performance estimates. Never use test data for training decisions, because that leaks information and overestimates performance.
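As a rough illustration of k-fold splitting, the sketch below generates index lists for each fold. The function name `k_fold_indices` is made up for this example, and it splits rows in their existing order; real tooling usually shuffles first.

```python
def k_fold_indices(n_samples, k):
    """Return a list of (train_indices, validation_indices) pairs."""
    indices = list(range(n_samples))
    # Spread any remainder over the first few folds.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, validation))
        start += size
    return folds

folds = k_fold_indices(n_samples=10, k=5)
for train, validation in folds:
    print("train:", train, "validate:", validation)
```

Every row lands in a validation fold exactly once, and a row is never in the training and validation sets of the same fold.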
Train the Model Using an Algorithm
Select an appropriate algorithm. For regression, start with linear regression; for classification, logistic regression. In AutoML, multiple algorithms are tried automatically. The algorithm learns patterns by minimizing a loss function. For example, linear regression minimizes MSE using gradient descent. Training involves adjusting weights iteratively. In Azure ML, you can also use hyperparameter tuning to find optimal settings (e.g., learning rate, tree depth). AutoML uses Bayesian optimization to search hyperparameter space efficiently.
Evaluate Model Performance
Use the test set to compute metrics. For regression: MAE, RMSE, R². For classification: accuracy, precision, recall, F1, AUC-ROC. Compare metrics against a baseline (e.g., mean for regression, majority class for classification). If performance is poor, consider feature engineering, algorithm change, or more data. In Azure ML, AutoML provides a leaderboard of models with metrics. The best model is selected based on the primary metric you defined (e.g., 'accuracy'). Always check for overfitting: if training performance is much better than test, reduce model complexity.
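The baseline comparison mentioned above is easy to compute by hand. This sketch uses hypothetical helper names and toy data: the regression baseline always predicts the training mean, and the classification baseline always predicts the most common training label.

```python
import math
from collections import Counter
from statistics import mean

def mean_baseline_rmse(y_train, y_test):
    # Always predict the training mean, then score with RMSE.
    guess = mean(y_train)
    return math.sqrt(mean([(y - guess) ** 2 for y in y_test]))

def majority_baseline_accuracy(y_train, y_test):
    # Always predict the most frequent training label.
    majority = Counter(y_train).most_common(1)[0][0]
    return sum(1 for y in y_test if y == majority) / len(y_test)

# Toy data for illustration only.
baseline_rmse = mean_baseline_rmse([100, 200, 300], [150, 250])
baseline_acc = majority_baseline_accuracy(
    ["no", "no", "yes", "no"], ["no", "yes", "no", "no"])

print(baseline_rmse, baseline_acc)  # 50.0 0.75
```

A trained model is only worth deploying if it clearly beats these numbers on the same test set.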
Deploy and Monitor the Model
Once satisfied, deploy the model as a web service in Azure Container Instances or AKS. The model can be consumed via REST API. Monitor performance over time—data drift can degrade accuracy. Azure ML provides model monitoring and retraining pipelines. For classification, monitor class distribution shifts. For regression, monitor residual distribution. If performance drops, retrain with new data.
Enterprise Scenario 1: Predicting Equipment Failure (Regression)
A manufacturing company wants to predict the remaining useful life (RUL) of machinery. The target is continuous (hours remaining). They collect sensor data: temperature, vibration, pressure. Using Azure ML AutoML with regression, they train a model that outputs RUL. They use RMSE as the primary metric. The model is deployed to an edge device that triggers maintenance alerts when RUL drops below a threshold. Misconfiguration: using classification instead of regression would treat RUL as discrete bins, losing precision and causing premature or late maintenance.
Enterprise Scenario 2: Customer Churn Prediction (Classification)
A telecom company wants to predict which customers will churn (binary classification: churn/not churn). Features include usage patterns, contract length, complaints. They use logistic regression and random forest in AutoML. The primary metric is AUC-ROC because the dataset is imbalanced (10% churn). They deploy the model as a batch endpoint to score monthly. Common pitfall: using accuracy as the metric—if 90% don't churn, a model that always predicts 'not churn' gets 90% accuracy but is useless. Precision and recall are more informative.
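The accuracy pitfall in this scenario can be shown with synthetic labels. The data below is made up to match the 10% churn rate in the scenario: 1 means churn, 0 means no churn.

```python
# 100 synthetic customers: 10 churn (1), 90 do not (0).
actual = [1] * 10 + [0] * 90

# A "model" that always predicts 'not churn'.
predicted = [0] * 100

correct = sum(1 for a, p in zip(actual, predicted) if a == p)
accuracy = correct / len(actual)

true_positives = sum(1 for a, p in zip(actual, predicted)
                     if a == 1 and p == 1)
recall = true_positives / sum(actual)

print(accuracy, recall)  # 0.9 0.0
```

Accuracy looks strong, yet recall shows that not a single churner was identified.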
Enterprise Scenario 3: Image Classification for Quality Control
A food processing plant uses computer vision to classify fruit as 'ripe', 'unripe', or 'rotten' (multiclass classification). They use a deep learning model (CNN) trained in Azure ML. The model is deployed to edge devices on the conveyor belt. They monitor F1 score per class to detect drift (e.g., a new rot type). Misconfiguration: using regression to output a continuous 'ripeness score' and then applying thresholds; this breaks down when the classes do not fall along a single ordered scale.
What AI-900 Tests on Regression and Classification
Objective 2.2: 'Describe core machine learning concepts' includes identifying regression vs classification tasks and understanding evaluation metrics. Specific focus areas:
- Identify the type of problem: Given a scenario, choose regression or classification. Example: 'Predicting the price of a house' → regression. 'Predicting whether a loan will default' → classification.
- Metrics: Know which metric is appropriate for which task. For regression: RMSE, MAE, R². For classification: accuracy, precision, recall, F1, AUC-ROC.
- Algorithms: Be aware that logistic regression is for classification, not regression. Common trap: candidates think logistic regression is for regression because of the name.
- Overfitting/Underfitting: Understand that complex models may overfit, and regularization helps.
Common Wrong Answers and Why
'Logistic regression is used for regression problems.' – Wrong because logistic regression outputs probabilities for classification. The name is historical.
'Accuracy is always the best metric for classification.' – Wrong for imbalanced datasets; precision/recall or AUC-ROC are better.
'Mean Squared Error is used for classification.' – Wrong; MSE is for regression. Classification uses cross-entropy.
'R-squared measures how many predictions are correct.' – Wrong; R² is for regression, not classification accuracy.
Specific Numbers and Terms on the Exam
Normalized RMSE (normalized_root_mean_squared_error): Default primary metric for regression in AutoML. Plain RMSE can be selected instead.
Accuracy: Default primary metric for classification in AutoML.
AUC: Values close to 1 indicate a good model; 0.5 is equivalent to random guessing.
F1 Score: Harmonic mean of precision and recall.
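Cross-validation: In AutoML, the default number of folds depends on dataset size unless you set it yourself; large datasets use a single train/validation split instead.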
Edge Cases
Multiclass classification: Output can be more than two classes. Metrics averaged (macro, micro, weighted).
Imbalanced data: Use AUC-ROC or F1 instead of accuracy.
Regression with categorical target: If target is integer but represents categories (e.g., 1=low, 2=medium, 3=high), it's ordinal classification, not regression.
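The macro and micro averaging mentioned in the multiclass edge case can be compared with a short sketch. The labels and helper names below are invented for illustration. Macro averaging computes F1 per class and then averages; micro averaging pools the counts across classes first.

```python
def per_class_counts(y_true, y_pred, labels):
    # For each class, count true positives, false positives, false negatives.
    counts = {}
    pairs = list(zip(y_true, y_pred))
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        counts[label] = (tp, fp, fn)
    return counts

def f1_from_counts(tp, fp, fn):
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

labels = ["ripe", "unripe", "rotten"]
y_true = ["ripe"] * 6 + ["unripe"] * 3 + ["rotten"]
y_pred = ["ripe"] * 6 + ["unripe", "unripe", "ripe"] + ["ripe"]

counts = per_class_counts(y_true, y_pred, labels)

# Macro: average the per-class F1 scores, so each class counts equally.
macro_f1 = sum(f1_from_counts(*counts[c]) for c in labels) / len(labels)

# Micro: sum the counts over all classes, then compute one F1.
total_tp = sum(c[0] for c in counts.values())
total_fp = sum(c[1] for c in counts.values())
total_fn = sum(c[2] for c in counts.values())
micro_f1 = f1_from_counts(total_tp, total_fp, total_fn)

print(round(macro_f1, 3), round(micro_f1, 3))
```

In this toy data the rare 'rotten' class is never predicted, which drags the macro average well below the micro average.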
How to Eliminate Wrong Answers
If the question asks for a continuous number, eliminate classification options.
If the question asks for a category, eliminate regression options.
Check the metric: if RMSE is mentioned, it's regression.
If the scenario involves probability threshold, it's classification.
Regression predicts continuous values; classification predicts categories.
Logistic regression is a classification algorithm, not regression.
For regression, key metrics are RMSE, MAE, and R².
For classification, key metrics are accuracy, precision, recall, F1, and AUC-ROC.
Accuracy can be misleading for imbalanced datasets; use F1 or AUC-ROC instead.
In Azure ML AutoML, the default primary metric for regression is normalized RMSE (normalized_root_mean_squared_error); for classification it is accuracy.
In AutoML, the default number of cross-validation folds depends on dataset size unless you set it explicitly.
Overfitting occurs when a model performs well on training data but poorly on test data.
These come up on the exam all the time. Here's how to tell them apart.
Regression
Predicts continuous numeric values (e.g., price, temperature).
Uses loss functions like MSE, MAE.
Evaluation metrics: RMSE, MAE, R².
Common algorithms: Linear Regression, Random Forest Regressor.
Output is a real number.
Classification
Predicts discrete categories (e.g., spam/not spam, digit 0-9).
Uses loss functions like cross-entropy (log loss).
Evaluation metrics: Accuracy, Precision, Recall, F1, AUC-ROC.
Common algorithms: Logistic Regression, Random Forest Classifier.
Output is a class label or probability distribution.
Mistake
Logistic regression is used for regression problems.
Correct
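Logistic regression is a classification algorithm. It applies a sigmoid (logistic) function to a weighted sum of the features to produce a class probability, which is then thresholded into a label. The 'regression' in the name refers to that fitted weighted sum, not to the kind of task the algorithm solves.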
Mistake
Accuracy is always the best metric for classification.
Correct
Accuracy can be misleading for imbalanced datasets. For example, if 95% of emails are not spam, a model that always predicts 'not spam' gets 95% accuracy but fails to catch spam. Use precision, recall, F1, or AUC-ROC instead.
Mistake
Mean Squared Error (MSE) is used to evaluate classification models.
Correct
MSE is a regression metric. Classification uses cross-entropy loss (log loss) for training and metrics like accuracy, precision, recall, F1, AUC-ROC for evaluation.
Mistake
R-squared measures how many predictions are correct.
Correct
R-squared is a regression metric that measures the proportion of variance in the target explained by the model. It typically falls between 0 and 1, and higher is better; it can be negative when a model fits worse than always predicting the mean. It does not measure correctness of categorical predictions.
Mistake
A model with high accuracy is always a good model.
Correct
High accuracy can be due to overfitting or class imbalance. Always evaluate using appropriate metrics for the problem. For example, in fraud detection, a model with high accuracy can still miss most of the rare fraud cases.
Regression predicts a continuous number (e.g., house price). Classification predicts a discrete class (e.g., spam or not spam). The key difference is the type of output: numeric vs categorical. For the exam, if the target can be any real number, it's regression; if it's a label from a fixed set, it's classification.
No. Despite its name, logistic regression is a classification algorithm. It uses a logistic (sigmoid) function to output a probability between 0 and 1, then assigns a class based on a threshold (usually 0.5). The name is historical and can be confusing.
Avoid accuracy. Use precision, recall, F1 score, or AUC-ROC. F1 is the harmonic mean of precision and recall, balancing both. AUC-ROC measures the model's ability to distinguish between classes across thresholds.
The default primary metric for regression is normalized_root_mean_squared_error (NRMSE). You can change it to other metrics like RMSE, MAE, or R² during configuration.
Technically, you can, but it's not recommended. Regression assumes continuous output and may produce values outside [0,1] or meaningless probabilities. Classification algorithms are designed to handle binary outcomes and provide probability estimates.
R-squared measures the proportion of variance in the target variable that is explained by the model. For example, an R² of 0.8 means the model explains 80% of the variance. It typically ranges from 0 to 1, with higher values indicating a better fit; a negative value means the model fits worse than always predicting the mean.
Overfitting occurs when a model learns noise in the training data and performs poorly on new data. Prevent it by using simpler models, regularization (L1/L2), cross-validation, and more training data.