AI-900 · Chapter 4 of 100 · Objective 2.1

Machine Learning Core Concepts

This chapter covers the core concepts of machine learning, including definitions of ML, types of learning (supervised, unsupervised, reinforcement), and key components like features, labels, training, and inference. For the AI-900 exam, this topic area accounts for a substantial share of the questions (the exact weighting is listed in Microsoft's current skills outline and has changed between exam versions), making it essential to master. Understanding these fundamentals will help you answer questions about selecting appropriate ML techniques for given scenarios and interpreting model performance metrics.

Updated May 31, 2026

Reviewed by Johnson Ajibi · Senior Network & Security Engineer · MSc IT Security

Learning to Sort Mail Like a Postal Worker

Imagine a postal worker who has never sorted mail before. Initially, they have no idea how to categorize letters. Their supervisor gives them a training set of 100 envelopes, each with an address and a label (e.g., 'local' or 'international'). The worker studies each envelope, noting patterns: local envelopes often have only a town name, while international ones include a country. After training, the worker creates a rule: if a country appears, sort as international; otherwise, local. This rule is the model. Now, a new envelope arrives without a label. The worker applies the rule: no country → local. This is inference. If the worker misclassifies an envelope, the supervisor corrects them (labeled feedback), and the worker adjusts the rule (supervised learning). Over time, the rule becomes more nuanced, like recognizing that 'London' could be local or international depending on the rest of the address. The worker's accuracy improves with more examples. In machine learning, the postal worker is the algorithm, the envelopes are data points, the labels are ground truth, the rule is the model, and the supervisor's corrections play the role of the loss function that guides optimization.

How It Actually Works

What is Machine Learning?

Machine learning (ML) is a subset of artificial intelligence (AI) that enables systems to learn and improve from experience without being explicitly programmed. Instead of following static rules, ML algorithms build a mathematical model based on training data, which can then make predictions or decisions on new data. The core idea is that the algorithm identifies patterns in the data and uses those patterns to infer outputs.

Why Machine Learning Exists

Traditional programming requires humans to define explicit rules for every possible input. This is impractical for complex tasks like image recognition, natural language processing, or predicting customer churn, where the rules are too intricate or unknown. ML automates the discovery of these rules by learning from examples. For instance, you cannot write explicit rules to identify a cat in an image, but you can train a model on thousands of labeled cat and non-cat images.

How Machine Learning Works Internally

At its core, ML involves three main components: data, a model, and a learning algorithm. The process follows these steps:

1. Data Collection: Gather a dataset that represents the problem domain. The dataset must include features (input variables) and, for supervised learning, labels (target outputs).

2. Data Preprocessing: Clean the data by handling missing values, normalizing scales, and encoding categorical variables, then split it into training, validation, and test sets. A typical split is 70-80% for training and 10-15% each for validation and test.

3. Model Selection: Choose an algorithm appropriate for the task (e.g., linear regression for continuous prediction, decision tree for classification).

4. Training: Feed the training data to the algorithm. The algorithm iteratively adjusts its internal parameters (e.g., weights in a neural network) to minimize the error between its predictions and the true labels. This is done using an optimization technique like gradient descent.

5. Evaluation: Assess the trained model on the validation set to tune hyperparameters (e.g., learning rate, tree depth). Use metrics like accuracy, precision, recall, or mean squared error.

6. Testing: Finally, evaluate the model on the test set to estimate its real-world performance.

7. Inference: Deploy the model to make predictions on new, unseen data.
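The splitting step can be sketched in a few lines of plain Python. This is a minimal illustration, assuming a 70/15/15 split and a fixed random seed; the function name `split_dataset` and the stand-in data are hypothetical, not part of any Azure or scikit-learn API.

```python
import random

def split_dataset(rows, train_frac=0.70, val_frac=0.15, seed=42):
    """Shuffle rows and split them into train / validation / test lists."""
    shuffled = list(rows)                      # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)      # seeded shuffle for repeatability
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]          # whatever remains (about 15%)
    return train, val, test

rows = list(range(100))                        # stand-in for 100 labeled examples
train, val, test = split_dataset(rows)
print(len(train), len(val), len(test))
```

Shuffling before splitting matters when the data is stored in some order (for example, by date or by class), since an unshuffled split could put all examples of one kind into a single subset.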

Key Components, Values, and Defaults

Features (x): Input variables used by the model. For example, in predicting house prices, features might include square footage, number of bedrooms, and location.

Labels (y): The target output we want to predict. In supervised learning, labels are provided in the training data. For house price prediction, the label is the price.

Training Set: The portion of data used to train the model. Typically 70-80% of the total dataset.

Validation Set: Used to tune hyperparameters and prevent overfitting. Typically 10-15%.

Test Set: Used for final evaluation. Typically 10-15%.

Model Parameters: Internal values learned during training (e.g., weights in linear regression).

Hyperparameters: Configuration settings set before training (e.g., learning rate, number of trees in a random forest). Defaults vary by library; for example, 0.01 is a common starting learning rate, and scikit-learn's random forest uses 100 trees by default.

Loss Function: Measures how far the model's predictions are from the true labels. For regression, mean squared error (MSE) is common; for classification, cross-entropy loss.

Optimizer: Algorithm that updates model parameters to minimize loss. Gradient descent is the most common, with variants like Adam and SGD.

Epoch: One complete pass through the entire training dataset. Training usually runs for multiple epochs; 10-100 is a common range.

Batch Size: Number of training samples used in one iteration. Common default: 32 or 64.
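To make the loss function entry concrete, here is a small sketch of the two losses named above, written in plain Python. The example values are made up for illustration, and the `eps` clipping is an assumption added to avoid taking the logarithm of zero.

```python
import math

def mean_squared_error(y_true, y_pred):
    """Average of squared differences; used for regression."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Average negative log-likelihood; used for binary classification."""
    total = 0.0
    for t, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)          # keep p strictly inside (0, 1)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

# Regression: house prices (in thousands), true vs. predicted
mse = mean_squared_error([200.0, 310.0, 150.0], [210.0, 300.0, 150.0])

# Classification: 1 = spam, 0 = not spam, with predicted spam probabilities
spam_true = [1, 0, 1]
loss_good = binary_cross_entropy(spam_true, [0.9, 0.1, 0.8])   # mostly right
loss_bad = binary_cross_entropy(spam_true, [0.1, 0.9, 0.2])    # mostly wrong

print(mse, loss_good, loss_bad)
```

In both cases a lower value means the predictions are closer to the labels, which is what the optimizer tries to achieve by adjusting the model parameters.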

Configuration and Verification Commands (Azure Machine Learning)

In Azure Machine Learning, you can configure and run training jobs using the Python SDK or CLI. Example using the Azure ML Python SDK v1 (the azureml-core package):

from azureml.core import Workspace, Experiment, ScriptRunConfig

# Load workspace details from a local config.json file
ws = Workspace.from_config()
experiment = Experiment(workspace=ws, name='my-experiment')

# Describe the job: which script to run, where to run it, and its arguments
config = ScriptRunConfig(source_directory='.', script='train.py',
                         compute_target='cpu-cluster',
                         arguments=['--learning-rate', 0.01, '--epochs', 50])
run = experiment.submit(config)
run.wait_for_completion()  # block until the run finishes

To verify training metrics, use the Azure ML Studio UI or retrieve them via SDK:

metrics = run.get_metrics()
print(metrics)

How Machine Learning Interacts with Related Technologies

Machine learning often works in conjunction with data engineering (to prepare data), MLOps (to manage the lifecycle), and cloud services (to scale compute). In Azure, ML models can be deployed as web services using Azure Kubernetes Service (AKS) or Azure Container Instances (ACI). They can also be integrated with Azure Functions for serverless inference or with Power BI for embedded analytics.

Types of Machine Learning

Supervised Learning: The model is trained on labeled data. Common tasks: classification (e.g., spam detection) and regression (e.g., price prediction). Algorithms: linear regression, logistic regression, decision trees, random forests, support vector machines, neural networks.

Unsupervised Learning: The model finds patterns in unlabeled data. Common tasks: clustering (e.g., customer segmentation) and dimensionality reduction (e.g., PCA). Algorithms: K-means, hierarchical clustering, DBSCAN, autoencoders.

Reinforcement Learning: The model learns by interacting with an environment, receiving rewards or penalties. Common tasks: game playing, robotics. Algorithms: Q-learning, deep Q-networks (DQN), policy gradients.
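Reinforcement learning is the least intuitive of the three, so a toy sketch may help. The example below is a minimal tabular Q-learning loop on a hypothetical five-cell corridor where only reaching the last cell gives a reward. The environment, constants, and function names are all invented for illustration, and the agent explores purely at random for simplicity.

```python
import random

N_STATES = 5            # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)      # action index 0 = move left, 1 = move right
ALPHA, GAMMA = 0.5, 0.9 # learning rate and discount factor

def step(state, action_index):
    """Environment: return (next_state, reward, done)."""
    next_state = min(max(state + ACTIONS[action_index], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def train(episodes=300, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # Q-value per (state, action)
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.randrange(2)          # random exploration
            nxt, reward, done = step(state, action)
            # Q-learning target: reward plus discounted best future value
            target = reward + (0.0 if done else GAMMA * max(q[nxt]))
            q[state][action] += ALPHA * (target - q[state][action])
            state = nxt
    return q

q_table = train()
policy = ["left" if q[0] > q[1] else "right" for q in q_table[:-1]]
print(policy)
```

Note that no one tells the agent that "right" is the correct answer in each cell. It only ever sees a reward at the goal, and the learned preference for moving right spreads backward through the Q-values.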

Key Terminology for AI-900

Feature Engineering: Creating new features from existing ones to improve model performance.

Overfitting: The model performs well on training data but poorly on new data. Often occurs when the model is too complex for the amount of training data.

Underfitting: The model performs poorly on both training and new data. Often occurs when the model is too simple to capture the underlying pattern.

Bias: Error due to overly simplistic assumptions (high bias leads to underfitting).

Variance: Error due to sensitivity to small fluctuations in the training set (high variance leads to overfitting).

Bias-Variance Tradeoff: The balance between underfitting and overfitting.

Cross-Validation: Technique to evaluate a model by partitioning the data into multiple train/test sets. K-fold cross-validation splits the data into k folds (commonly k=5 or k=10) and uses each fold once as the test set.

Regularization: Technique to prevent overfitting by adding a penalty term to the loss function (e.g., L1 or L2 regularization).
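The mechanics of k-fold cross-validation are easy to see with index lists. The following is a rough sketch in plain Python; the helper name `k_fold_indices` is hypothetical, and the sketch does not shuffle, which a real workflow usually would.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, test_indices) once for each of k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size

folds = list(k_fold_indices(10, k=5))
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

Every sample lands in the test fold exactly once, so averaging the k scores gives a performance estimate that depends less on one lucky or unlucky split.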

Exam-Relevant Details

The AI-900 exam focuses on understanding when to use each type of ML, not on implementing algorithms.

You should know that regression predicts a numeric value, classification predicts a category, and clustering groups similar items.

Common examples: linear regression for sales forecasting, logistic regression for binary classification, K-means for customer segmentation.

Be able to identify supervised vs. unsupervised learning scenarios. For instance, if the data has labels, it's supervised; if not, it's unsupervised.

Reinforcement learning is used for sequential decision-making problems like game AI or robot navigation.
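As an illustration of "clustering groups similar items", here is a minimal sketch of the K-means idea on one-dimensional data. The spending figures and the starting centroids are made up, and the function `k_means_1d` is a simplified teaching version rather than a library call.

```python
def k_means_1d(values, centroids, iterations=10):
    """Lloyd's algorithm on 1-D data with caller-supplied starting centroids."""
    centroids = list(centroids)
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest centroid
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to its cluster mean (keep it if empty)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

monthly_spend = [12, 15, 14, 90, 95, 88, 300, 310]   # no labels anywhere
centroids, clusters = k_means_1d(monthly_spend, centroids=[0, 100, 400])
print(clusters)
```

The input contains no labels at all. The three customer segments (low, medium, and high spenders) emerge only from how close the values are to each other, which is the hallmark of an unsupervised scenario.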

Step-by-Step: Training a Supervised ML Model

Collect and Label Data: Gather a dataset with input features and corresponding correct outputs. For image classification, this means thousands of labeled images.

Split Data: Divide into training (70%), validation (15%), and test (15%) sets. Keeping the test set untouched until the final evaluation helps avoid data leakage and gives a less biased estimate of performance.

Choose an Algorithm: Based on the problem type. For predicting house prices (regression), choose linear regression. For email spam detection (binary classification), choose logistic regression or a decision tree.

Train the Model: Feed the training data into the algorithm. The algorithm adjusts its parameters to minimize the loss function. For linear regression, this involves finding the line of best fit using ordinary least squares.

Evaluate on Validation Set: Use the validation set to tune hyperparameters. For example, adjust the learning rate or tree depth. If the model overfits, apply regularization.

Test the Model: Finally, evaluate on the test set to get an unbiased estimate of performance. Use appropriate metrics: accuracy for classification, RMSE for regression.

Deploy for Inference: Use the trained model to make predictions on new data. In Azure, this can be done via a real-time endpoint or batch inference.
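The training and testing steps for linear regression can be sketched without any library. The example below fits a line with the closed-form ordinary least squares formulas and then computes RMSE on held-out points; the house sizes and prices are invented for illustration.

```python
import math

def fit_ols(xs, ys):
    """Closed-form ordinary least squares for y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the target."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Hypothetical data: size in square metres -> price in thousands
train_x, train_y = [50, 70, 90, 110], [150, 210, 270, 330]
test_x, test_y = [60, 100], [185, 295]

slope, intercept = fit_ols(train_x, train_y)          # learned parameters
predictions = [slope * x + intercept for x in test_x] # inference on unseen data
test_rmse = rmse(test_y, predictions)
print(slope, intercept, test_rmse)
```

Here the slope and intercept are the model parameters, the call to `fit_ols` is training, and applying the line to `test_x` is inference.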

Real-World Section

Scenario 1: Retail Customer Churn Prediction

A large e-commerce company wants to predict which customers are likely to stop buying. They have historical data including purchase frequency, average order value, support interactions, and churn status (label). Using a supervised classification model (e.g., gradient boosting), they train on 1 million customers. The model outputs a churn probability. The company then offers targeted discounts to high-risk customers, reducing churn by 15%. In production, the model is deployed as an Azure ML endpoint, processing new customer data daily. Common misconfigurations include using imbalanced data (only 5% churn) without resampling, leading to a model that always predicts 'not churn' with 95% accuracy but fails to identify actual churners. Techniques like SMOTE or class weighting are used to address this.
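The 95% accuracy trap in this scenario can be reproduced with a few lines of arithmetic. This sketch assumes a hypothetical sample of 100 customers, 5 of whom churned, and a "model" that always predicts 'not churn'.

```python
def confusion_counts(y_true, y_pred):
    """Count outcomes, treating 1 (churn) as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

def recall(tp, fn):
    # Share of actual churners the model caught; 0.0 if there were none
    return tp / (tp + fn) if (tp + fn) else 0.0

y_true = [1] * 5 + [0] * 95      # 5% of customers churned
y_pred = [0] * 100               # always predict 'not churn'

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
model_accuracy = accuracy(tp, tn, fp, fn)
model_recall = recall(tp, fn)
print(model_accuracy, model_recall)
```

Accuracy looks excellent while recall is zero, meaning not a single churner was identified. This is why recall, precision, or AUC-ROC are preferred over accuracy for imbalanced classes.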

Scenario 2: Manufacturing Defect Detection

A factory uses computer vision to detect defects on assembly lines. They collect images of products, some with defects (labeled). They train a convolutional neural network (CNN) using supervised learning. The model runs on edge devices (e.g., Azure IoT Edge) for low-latency inference. The training set includes 50,000 images, augmented to 200,000. The model achieves 99% accuracy on the test set. However, in production, new defect types appear that were not in training, causing false negatives. This is a common issue, usually described as data drift or concept drift: the data seen in production no longer matches the data the model was trained on. The solution is to set up a continuous retraining pipeline using Azure ML pipelines and monitor model performance.

Scenario 3: Financial Fraud Detection

A bank uses unsupervised learning to detect anomalous transactions. Since fraud is rare and constantly evolving, labeled data is scarce. They use an autoencoder trained on normal transactions. During inference, transactions with high reconstruction error are flagged as anomalies. This approach can catch new fraud patterns that supervised models might miss. The model processes 10 million transactions daily. Performance considerations: the model must have low latency (<100ms) and high throughput, which Azure ML with GPU clusters can help achieve. A common pitfall is setting the anomaly threshold too low, causing many false positives, or too high, missing fraud. Regular tuning based on business feedback is essential.
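The threshold trade-off described in this scenario can be shown with a handful of made-up reconstruction errors. In the sketch below, the scores and the fraud flags are hypothetical; in a real deployment the flags would only become known later, through investigation.

```python
# (reconstruction_error, is_fraud) pairs; values are invented for illustration
scored = [
    (0.02, False), (0.03, False), (0.05, False), (0.08, False), (0.11, False),
    (0.09, True), (0.30, True), (0.45, True),
]

def flag_counts(scored, threshold):
    """Return (false_positives, missed_fraud) when flagging error > threshold."""
    false_positives = sum(1 for err, is_fraud in scored
                          if err > threshold and not is_fraud)
    missed_fraud = sum(1 for err, is_fraud in scored
                       if err <= threshold and is_fraud)
    return false_positives, missed_fraud

low = flag_counts(scored, 0.04)     # aggressive: flags many legitimate payments
mid = flag_counts(scored, 0.10)     # compromise
high = flag_counts(scored, 0.40)    # lenient: lets more fraud through

for name, result in (("low", low), ("mid", mid), ("high", high)):
    print(name, "false positives:", result[0], "missed fraud:", result[1])
```

No single threshold removes both kinds of error here, which is why the choice is usually driven by the relative business cost of a blocked legitimate payment versus an undetected fraudulent one.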

Exam Focus Section

What AI-900 Tests on This Topic

The AI-900 exam (objective 2.1) requires you to:

Identify types of machine learning: supervised, unsupervised, reinforcement.

Determine which ML technique to apply for a given scenario.

Understand key concepts like features, labels, training, and inference.

Recognize common algorithms: linear regression, logistic regression, decision trees, K-means, etc.

Know the difference between regression and classification.

Common Wrong Answers and Why

1. Confusing supervised and unsupervised: Candidates often choose 'supervised' when data has no labels. Remember: if the problem mentions 'labeled data' or 'historical outcomes', it's supervised. If it says 'group customers without predefined categories', it's unsupervised.

2. Selecting regression for classification: For example, predicting 'yes/no' using linear regression. Linear regression outputs continuous values, not probabilities bounded between 0 and 1. Logistic regression is correct for binary classification.

3. Misidentifying reinforcement learning: Candidates think any scenario involving feedback is reinforcement. Actually, reinforcement learning involves an agent taking actions in an environment to maximize cumulative reward, not just receiving labeled examples.

4. Assuming more features always improve accuracy: The exam may present a scenario where adding irrelevant features degrades performance due to noise. The correct answer is to use feature selection or dimensionality reduction.
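The second point above is easier to remember with numbers. The sketch below compares a raw linear score with the same score passed through the sigmoid function, which is what logistic regression does; the weight, bias, and inputs are arbitrary values chosen for illustration.

```python
import math

def linear_score(x, weight, bias):
    """Output of a linear model: unbounded, can be negative or above 1."""
    return weight * x + bias

def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

weight, bias = 1.0, -2.0                 # hypothetical learned parameters
inputs = [-4.0, 0.0, 2.0, 8.0]

raw = [linear_score(x, weight, bias) for x in inputs]
probs = [sigmoid(z) for z in raw]
labels = ["yes" if p >= 0.5 else "no" for p in probs]

for x, z, p, label in zip(inputs, raw, probs, labels):
    print(f"x={x:5.1f}  linear={z:5.1f}  probability={p:.3f}  ->  {label}")
```

The linear outputs range from -6 to 6, which cannot be read as probabilities, while the sigmoid outputs always fall between 0 and 1 and can be turned into a yes/no answer with a cutoff such as 0.5.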

Specific Numbers and Terms That Appear

- The exam often uses the phrase 'ground truth' to refer to correct labels.

- Metrics like accuracy, precision, recall, and F1-score are tested. Know that accuracy = (TP+TN)/(TP+TN+FP+FN).

- The term 'overfitting' is frequently used; remember that it happens when the model is too complex.

- 'Bias-variance tradeoff' is a key concept.

Edge Cases and Exceptions

- When a dataset is very small, supervised learning may not work well; unsupervised learning or transfer learning might be better.

- For imbalanced classes, accuracy is misleading; use precision/recall or AUC-ROC.

- Reinforcement learning is not suitable for static datasets; it requires an interactive environment.

How to Eliminate Wrong Answers

- Read the scenario carefully: identify if labels are present (supervised) or not (unsupervised).

- Determine the output type: numeric (regression), category (classification), or group (clustering).

- If the scenario involves a sequence of actions and rewards, think reinforcement learning.

- Eliminate options that mention 'training with labels' for unsupervised scenarios.

Misconceptions

Myth: Machine learning models always get better with more data. Reality: More data helps only if it is high quality and relevant. Noisy or biased data can degrade performance. Also, after a point, adding more data yields diminishing returns.

Myth: Supervised learning requires all data to be labeled. Reality: Only the data used for training and evaluation needs labels. The model can then predict labels for new, unlabeled data.

Myth: Unsupervised learning is easier than supervised because it doesn't need labels. Reality: Unsupervised learning is often harder to evaluate because there is no ground truth; results can be subjective.

Myth: Reinforcement learning is the same as supervised learning with feedback. Reality: In RL, the agent receives a reward signal after a sequence of actions, not immediate correct labels. The agent must explore and exploit.

Myth: Once trained, a model remains accurate forever. Reality: Models can suffer from concept drift as data distributions change over time. Continuous monitoring and retraining are necessary.

Comparisons

Supervised vs. Unsupervised Learning

- Supervised: Uses labeled data; goal is to predict labels; common tasks: classification, regression; evaluation is straightforward with metrics like accuracy.

- Unsupervised: Uses unlabeled data; goal is to find hidden patterns; common tasks: clustering, dimensionality reduction; evaluation is subjective or uses internal metrics like silhouette score.

Regression vs. Classification

- Regression: Predicts continuous numeric values; example: house price; evaluation: MSE, RMSE, MAE.

- Classification: Predicts discrete categories; example: spam or not spam; evaluation: accuracy, precision, recall, F1-score, confusion matrix.

Parametric vs. Non-Parametric Models

- Parametric: Assumes a fixed number of parameters (e.g., linear regression); simpler and faster, but may underfit if the assumptions are wrong.

- Non-Parametric: No fixed parameter set; can model complex patterns (e.g., decision trees, KNN); more flexible but prone to overfitting and slower.

Key Takeaways

Machine learning enables systems to learn from data without explicit programming.

The three main types are supervised, unsupervised, and reinforcement learning.

Supervised learning requires labeled data; unsupervised does not.

Common supervised tasks: regression (numeric output) and classification (categorical output).

Unsupervised tasks include clustering and anomaly detection.

Reinforcement learning involves an agent interacting with an environment to maximize rewards.

Overfitting occurs when a model is too complex; underfitting when too simple.

The bias-variance tradeoff is fundamental to model performance.

Always split data into training, validation, and test sets (e.g., 70/15/15).

Azure Machine Learning provides tools for training, deploying, and managing ML models.

The AI-900 exam tests conceptual understanding, not implementation details.

FAQ

- What is the difference between supervised and unsupervised learning? Supervised learning uses labeled data where the correct output is known, while unsupervised learning uses unlabeled data to find patterns without predefined outputs. For example, spam detection (supervised) vs. customer segmentation (unsupervised).

- What is a feature in machine learning? A feature is an individual measurable property or characteristic of the data used as input to a model. For housing price prediction, features include square footage, number of bedrooms, and location. Good features are informative and independent.

- What is overfitting and how to prevent it? Overfitting occurs when a model learns the training data too well, including noise, and performs poorly on new data. Prevention methods include using more training data, simplifying the model, regularization (L1/L2), cross-validation, and early stopping.

- What is the bias-variance tradeoff? Bias is error from overly simplistic assumptions, leading to underfitting. Variance is error from sensitivity to fluctuations in the training set, leading to overfitting. The tradeoff is the balance between the two; increasing one often decreases the other.

- What metrics are used to evaluate classification models? Common metrics include accuracy (correct predictions/total), precision (true positives/(true positives+false positives)), recall (true positives/(true positives+false negatives)), F1-score (harmonic mean of precision and recall), and AUC-ROC.

- When should I use regression vs. classification? Use regression when the target variable is continuous (e.g., price, temperature). Use classification when the target variable is categorical (e.g., spam/not spam, digit 0-9).

- What is reinforcement learning used for? Reinforcement learning is used for sequential decision-making problems where an agent learns to achieve a goal by interacting with an environment. Examples include game playing (AlphaGo), robotics, and autonomous driving.

- How does Azure Machine Learning help with ML? Azure ML provides a cloud-based environment to train, deploy, and manage ML models at scale. It offers automated ML, a drag-and-drop designer, and a Python SDK and CLI. It integrates with other Azure services like Data Lake Storage and Kubernetes.

Quiz

- Which type of machine learning is best suited for predicting the price of a house based on its features? Answer: Supervised learning (regression). The scenario has labeled data (house prices) and the output is continuous.

- A company wants to group its customers into segments based on purchasing behavior without predefined categories. Which ML technique should they use? Answer: Unsupervised learning (clustering). The data has no labels, and the goal is to find natural groupings.

- What is the primary difference between classification and regression? Answer: Classification predicts discrete categories (e.g., spam/not spam), while regression predicts continuous numeric values (e.g., temperature).

- A model performs extremely well on training data but poorly on test data. What is this called? Answer: Overfitting. The model has memorized the training data, including noise, and fails to generalize.

- Which of the following is an example of reinforcement learning? Answer: A robot learning to walk by receiving rewards for each step it takes without falling. This involves an agent, environment, and reward signal.

- What is the purpose of a validation set? Answer: To tune hyperparameters and prevent overfitting by providing an unbiased evaluation during model development, separate from the test set.

- True or False: Unsupervised learning always requires labeled data. Answer: False. Unsupervised learning uses unlabeled data.

- Which metric would be most appropriate for evaluating a regression model? Answer: Mean Squared Error (MSE) or Root Mean Squared Error (RMSE). MSE measures the average squared difference between predicted and actual values; RMSE is its square root, expressed in the same units as the target.

Walk-Through

Collect and Prepare Data

Gather a dataset that represents the problem domain. Ensure it includes both features (input variables) and, for supervised learning, labels (target outputs). Clean the data by handling missing values (e.g., impute with mean or median), normalizing numerical features to a standard scale (e.g., 0-1 using min-max scaling), and encoding categorical variables (e.g., one-hot encoding). Split the dataset into three subsets: training (typically 70%), validation (15%), and test (15%). Keeping these subsets separate helps avoid data leakage and allows a less biased evaluation. In Azure ML, you can use the Split Data component in the designer or the `train_test_split` function from scikit-learn.
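The three cleaning operations mentioned above can be sketched in plain Python. The helper names and sample values here are hypothetical; in practice a library such as scikit-learn or pandas would usually do this work, and the imputation and scaling statistics would be computed on the training split only.

```python
def impute_mean(values):
    """Replace missing entries (None) with the mean of the known values."""
    known = [v for v in values if v is not None]
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in values]

def min_max_scale(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:                       # constant column: avoid dividing by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def one_hot(values):
    """Turn each category into a 0/1 vector with one position per category."""
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return encoded, categories

square_feet = [1000, None, 2000, 3000]
imputed = impute_mean(square_feet)
scaled = min_max_scale(imputed)
encoded, categories = one_hot(["city", "suburb", "city", "rural"])

print(imputed, scaled, categories, encoded, sep="\n")
```

After these steps every feature is numeric and on a comparable scale, which many algorithms (especially those trained with gradient descent) rely on.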

Choose an Algorithm

Select a machine learning algorithm appropriate for the task. For supervised learning, if the target is continuous (e.g., price), choose regression algorithms like linear regression, decision tree regressor, or neural network. If the target is categorical (e.g., spam/not spam), choose classification algorithms like logistic regression, decision tree classifier, or support vector machine. For unsupervised learning, choose clustering algorithms like K-means or hierarchical clustering. Consider factors like dataset size, feature dimensionality, and interpretability requirements. For example, linear regression is simple and interpretable but may underfit complex patterns; random forests handle non-linearity well but are less interpretable.

Train the Model

Feed the training data into the selected algorithm. The algorithm iteratively adjusts its internal parameters (e.g., weights in linear regression) to minimize the loss function (e.g., mean squared error for regression, cross-entropy for classification). This optimization is typically done using gradient descent, which updates parameters in the direction that reduces loss. The learning rate controls step size; common default is 0.01. The training runs for a set number of epochs (e.g., 50) or until convergence. During training, monitor loss on the training set to ensure it decreases. In Azure ML, this step is executed as a script run on a compute target (e.g., CPU or GPU cluster).
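The training loop described above can be sketched as batch gradient descent on a one-feature linear model. This is an illustrative toy, assuming mean squared error as the loss and the learning rate and epoch count quoted in the text; the data points are invented.

```python
def train_linear(xs, ys, learning_rate=0.01, epochs=50):
    """Fit y = w * x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0                    # parameters start at zero
    n = len(xs)
    history = []                       # training loss at the start of each epoch
    for _ in range(epochs):
        preds = [w * x + b for x in xs]
        loss = sum((p - y) ** 2 for p, y in zip(preds, ys)) / n
        history.append(loss)
        # Gradients of the MSE loss with respect to w and b
        grad_w = (2.0 / n) * sum((p - y) * x for p, y, x in zip(preds, ys, xs))
        grad_b = (2.0 / n) * sum(p - y for p, y in zip(preds, ys))
        # Step against the gradient; the learning rate sets the step size
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b, history

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]              # generated from y = 2x + 1
w, b, history = train_linear(xs, ys)
print(f"w={w:.3f} b={b:.3f} first loss={history[0]:.3f} last loss={history[-1]:.3f}")
```

With this small learning rate the loss falls steadily, but 50 epochs are not enough to recover the exact values w = 2 and b = 1. Raising the epoch count or the learning rate (within limits) moves the parameters closer, which is exactly the kind of adjustment hyperparameter tuning is about.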

Evaluate and Tune Hyperparameters

Use the validation set to assess model performance and tune hyperparameters (e.g., learning rate, number of trees, regularization strength). Compute metrics like accuracy, precision, recall, F1-score for classification, or RMSE for regression. If the model overfits (high training accuracy but low validation accuracy), apply regularization (L1/L2) or reduce model complexity. If underfitting occurs, increase complexity or add features. Use techniques like grid search or random search to find optimal hyperparameters. In Azure ML, you can use HyperDrive for automated hyperparameter tuning.
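Grid search can be illustrated with a deliberately tiny model. The sketch below tunes the L2 regularization strength of a one-feature ridge regression without an intercept, picking the value with the lowest validation error. The data, the grid, and the helper names are assumptions made for illustration, not HyperDrive code.

```python
def fit_ridge_slope(xs, ys, l2_strength):
    """1-D ridge regression without intercept: w = sum(xy) / (sum(x^2) + lambda)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + l2_strength)

def mse(xs, ys, w):
    """Mean squared error of the model y = w * x on the given data."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Hypothetical data: noisy training points, cleaner validation points
train_x, train_y = [1.0, 2.0, 3.0], [2.2, 3.9, 6.3]
val_x, val_y = [4.0, 5.0], [8.0, 10.0]

grid = [0.0, 0.1, 1.0, 10.0]           # candidate regularization strengths
results = {}
for strength in grid:
    w = fit_ridge_slope(train_x, train_y, strength)   # train on training set only
    results[strength] = mse(val_x, val_y, w)          # score on validation set

best_strength = min(results, key=results.get)
print(results)
print("best:", best_strength)
```

The parameters are always fit on the training set and compared on the validation set, so the test set stays untouched for the final evaluation.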

Test the Model

After tuning, evaluate the final model on the test set to get an unbiased estimate of its real-world performance. The test set has not been used during training or validation, so it simulates unseen data. Report the chosen metrics. If performance is acceptable, proceed to deployment. If not, revisit earlier steps (e.g., collect more data, engineer new features, try a different algorithm). In Azure ML, you can register the model in the model registry and then deploy it as a web service or batch inference pipeline.
