This chapter clarifies the foundational hierarchy of artificial intelligence (AI), machine learning (ML), and deep learning (DL)—three terms often used interchangeably but with distinct meanings and scopes. For the AI-900 exam, understanding these differences is critical because roughly 15-20% of questions test your ability to classify workloads as AI, ML, or DL, and to identify which approach is appropriate for a given scenario. You will learn the precise definitions, how each relates to the other, and the specific characteristics that distinguish them, including real-world examples and exam traps.
Imagine a vast library where knowledge is organized hierarchically. At the top floor, you have AI—the entire library system, including the building, staff, and rules for acquiring, storing, and using knowledge. The middle floor is Machine Learning—the team of librarians who learn from examples. They don't follow fixed rules; instead, they study thousands of books to infer patterns, like which authors write about which topics. The bottom floor is Deep Learning—a specialized team of assistants who use a complex filing system of interconnected index cards (neural networks). Each card holds a tiny piece of information, and they pass notes to each other to form a deeper understanding. For example, to identify a cat, the first assistant looks for edges, the next for shapes, the next for fur texture, and so on, until the final assistant says 'cat.' The library (AI) decides the goal (e.g., 'learn to identify cats'), the librarians (ML) choose the method (e.g., 'show it many cat pictures'), and the assistants (DL) execute the detailed pattern recognition. Without the top floor, there's no direction; without the middle floor, no learning from data; without the bottom floor, no deep insights. Each level depends on the one above, but they are distinct in complexity and capability.
What is Artificial Intelligence (AI)?
Artificial Intelligence is the broadest concept—the field of computer science dedicated to creating systems that can perform tasks that typically require human intelligence. These tasks include reasoning, learning, perception, problem-solving, and language understanding. AI is not a single technology but a collection of subfields, including machine learning, natural language processing, robotics, and expert systems. In the AI-900 exam, AI is defined as 'the capability of a computer system to mimic human cognitive functions.' Any system that exhibits intelligent behavior—even if it uses simple rule-based logic—falls under AI. For example, a chess program that uses brute-force search is AI, but it is not machine learning.
What is Machine Learning (ML)?
Machine Learning is a subset of AI. It is the study of algorithms and statistical models that enable computers to improve their performance on a task through experience (data), without being explicitly programmed for every rule. Instead of hard-coding responses, ML algorithms learn patterns from training data and apply those patterns to new, unseen data. Key components include:
- Training data: Labeled or unlabeled examples used to teach the model.
- Features: Input variables used by the model (e.g., pixel values, word frequencies).
- Labels: Desired output for supervised learning (e.g., 'cat' or 'dog').
- Model: The mathematical representation learned from data.
- Inference: Using the model to make predictions on new data.
ML is further divided into three main paradigms:
1. Supervised learning: The model learns from labeled data (input-output pairs). Tasks include classification (e.g., spam detection) and regression (e.g., price prediction).
2. Unsupervised learning: The model finds patterns in unlabeled data. Tasks include clustering (e.g., customer segmentation) and association (e.g., market basket analysis).
3. Reinforcement learning: The model learns by interacting with an environment, receiving rewards or penalties for actions. It is used in game playing and robotics.
For AI-900, you must know that ML requires data, that the model generalizes from examples, and that it is not rule-based.
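As a rough illustration of the components listed above (training data, features, labels, model, inference), the sketch below fits a straight line by ordinary least squares in plain Python. The house sizes, prices, and the names `fit_line` and `predict` are made up for this example; it is not a prescribed exam method.

```python
# Hypothetical training data: square footage is the feature, sale price is the label.
square_feet = [800, 1000, 1200, 1500, 2000]
prices = [160_000, 200_000, 240_000, 300_000, 400_000]

def fit_line(xs, ys):
    """Training: estimate slope and intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept  # the learned "model"

def predict(model, x):
    """Inference: apply the learned parameters to a new, unseen input."""
    slope, intercept = model
    return slope * x + intercept

model = fit_line(square_feet, prices)
predicted = predict(model, 1800)  # price estimate for a house not in the training data
print(model, predicted)
```

Nobody wrote a pricing rule here. The slope and intercept come from the examples, which is the property that separates ML from rule-based AI.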
What is Deep Learning (DL)?
Deep Learning is a subset of machine learning that uses artificial neural networks with multiple layers (deep architectures) to model complex patterns. The 'deep' refers to the number of hidden layers between input and output. There is no strict cutoff, but it means more than one hidden layer, and practical networks often have three or more. Deep learning excels at tasks like image recognition, speech recognition, and natural language processing because it can automatically learn hierarchical feature representations. For example, in image processing, early layers detect edges, middle layers detect shapes, and deep layers detect entire objects.
Key characteristics:
- Neural networks: Composed of neurons (nodes) organized in layers. Each neuron applies a weighted sum of inputs followed by a non-linear activation function.
- Backpropagation: The algorithm used to train deep networks by propagating error gradients backward through the layers.
- Large data requirements: Deep learning typically requires massive amounts of labeled data (e.g., millions of images) to perform well.
- High computational power: Training deep networks often requires GPUs or TPUs.
Common architectures include:
- Convolutional Neural Networks (CNNs): For image data.
- Recurrent Neural Networks (RNNs): For sequential data like time series or text.
- Transformers: For natural language processing (e.g., GPT, BERT).
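To make "a weighted sum of inputs followed by a non-linear activation function" concrete, here is a minimal forward pass through one hidden layer and one output layer in plain Python. The weights are hand-picked, hypothetical values and nothing is trained, so this is only a sketch of how data flows through layers. It does not implement backpropagation.

```python
import math

def relu(z):
    """Rectified Linear Unit: passes positive values, clips negatives to zero."""
    return max(0.0, z)

def softmax(scores):
    """Turn raw output scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dense(inputs, weights, biases, activation=None):
    """One layer: each neuron computes a weighted sum plus bias, then an activation."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(activation(z) if activation else z)
    return outputs

x = [1.0, 2.0]                                                   # input features
hidden = dense(x, [[0.5, -1.0], [1.0, 1.0]], [0.0, -1.0], relu)  # hidden layer
scores = dense(hidden, [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])     # output layer
probs = softmax(scores)                                          # class probabilities
print(hidden, probs)
```

A deep network stacks many such `dense` calls, and training adjusts the weights instead of fixing them by hand.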
How They Relate: Venn Diagram Hierarchy
Visualize a set of three concentric circles. The outermost circle is AI. Inside it is ML. Inside ML is DL. This means:
All DL is ML, but not all ML is DL.
All ML is AI, but not all AI is ML.
AI includes non-learning systems (e.g., rule-based expert systems).
ML includes non-deep techniques (e.g., decision trees, linear regression).
The exam tests this hierarchy frequently. A common question: 'Which of the following is a subset of machine learning?' The answer is deep learning. Another: 'Which term encompasses the others?' The answer is AI.
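One way to check your understanding of the nesting is to model it with Python sets. The technique names below are examples taken from this chapter, and `most_specific_category` is a hypothetical helper, not an official taxonomy.

```python
# Each set contains the one before it, mirroring the concentric circles.
DL = {"CNN", "RNN", "transformer"}
ML = DL | {"decision tree", "linear regression"}
AI = ML | {"rule-based expert system"}

def most_specific_category(technique):
    """Return the innermost circle that contains the technique."""
    if technique in DL:
        return "DL"
    if technique in ML:
        return "ML"
    if technique in AI:
        return "AI"
    raise ValueError(f"unknown technique: {technique}")

print(DL < ML < AI)                                  # strict subset chain holds
print(most_specific_category("decision tree"))       # ML, but not DL
print(most_specific_category("rule-based expert system"))
```

The strict-subset comparison expresses both halves of each statement at once: everything in DL is in ML, and ML has members that are not in DL.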
Practical Distinctions for AI-900
| Aspect | AI | ML | DL |
|--------|----|----|----|
| Definition | Mimics human intelligence | Learns from data | Uses deep neural networks |
| Programming | Can be rule-based or data-driven | Data-driven, no explicit rules | Data-driven, automatic feature extraction |
| Data needed | Varies (can be minimal for rule-based) | Moderate to large | Very large (millions of examples) |
| Example | Chess engine | Spam filter | Self-driving car vision |
| Complexity | Low to high | Medium | High |
Exam-Relevant Examples
AI but not ML: A thermostat that turns on heat when temperature drops below 18°C (rule-based).
ML but not DL: A linear regression model predicting house prices based on square footage (it learns a slope and intercept from data, with no hidden layers).
DL: A convolutional neural network classifying X-ray images for pneumonia.
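The thermostat example can be written out in a few lines, which makes it clear why it counts as AI but not ML. The function name is hypothetical, and the 18°C threshold is the one used in the example above.

```python
HEAT_ON_BELOW_C = 18.0  # threshold chosen by a person, not learned from data

def heating_should_run(temperature_c):
    """Rule-based decision: a fixed if-then rule with no training step."""
    return temperature_c < HEAT_ON_BELOW_C

print(heating_should_run(17.5))  # below the threshold
print(heating_should_run(21.0))  # above the threshold
```

There is no training data, no model, and no way for the rule to change unless someone edits the constant.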
Common Exam Scenarios
1. Given a scenario, identify whether it uses AI, ML, or DL.
   - Scenario: 'A system that recommends movies based on past ratings using a decision tree.' Answer: ML (decision tree is a classic ML algorithm, not deep learning).
   - Scenario: 'A system that translates languages using a transformer model.' Answer: DL (transformers are deep learning architectures).
   - Scenario: 'A system that follows if-then rules to approve loan applications.' Answer: AI (rule-based, no learning).
2. Order of hierarchy: Questions may ask which is the broadest or most specific. Remember: AI ⊃ ML ⊃ DL.
3. Data requirements: Deep learning requires more data than traditional ML. If a scenario mentions 'small dataset,' deep learning is likely inappropriate.
Key Values and Terms
Neural network depth: 'Deep' typically means more than one hidden layer; often 3+ layers.
Activation functions: ReLU (Rectified Linear Unit) is common in hidden layers; softmax for output in classification.
Training epochs: One complete pass through the training data.
Batch size: Number of samples processed before updating model weights.
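Epochs and batch size together determine how many times the weights are updated. The short calculation below assumes the final, smaller batch in each epoch is still used (some training setups drop it). The dataset size, batch size, and epoch count are made-up numbers.

```python
import math

def weight_updates(num_samples, batch_size, epochs):
    """Total weight updates: batches per epoch times number of epochs."""
    batches_per_epoch = math.ceil(num_samples / batch_size)  # keeps the last partial batch
    return batches_per_epoch * epochs

# 10,000 samples in batches of 32 gives 313 batches per epoch (312 full, 1 partial).
updates = weight_updates(num_samples=10_000, batch_size=32, epochs=5)
print(updates)  # 313 * 5 = 1565
```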
Configuration and Tools on Azure
Azure provides services for each level:
- AI: Azure Cognitive Services (since rebranded as Azure AI services), which offers pre-built AI APIs for vision, speech, language, and decision.
- ML: Azure Machine Learning (a platform to build, train, and deploy ML models).
- DL: Azure Machine Learning supports deep learning frameworks like TensorFlow, PyTorch, and Keras. Azure also offers specialized hardware (GPU VMs, Azure Machine Learning compute clusters).
For AI-900, focus on recognizing which Azure service corresponds to which category. For example, Azure Cognitive Services is AI (including pre-built ML models), but custom ML models built with Azure Machine Learning are ML. If the model uses deep neural networks, it's DL.
Common Misclassifications on the Exam
Neural networks = AI? Yes, but more specifically ML/DL. The exam expects you to know that neural networks are a type of ML.
All AI learns? No. Rule-based systems are AI but do not learn.
Deep learning always better? No. For simple tasks with limited data, traditional ML often outperforms DL.
Summary of Relationships
AI is the superset.
ML is a subset of AI that learns from data.
DL is a subset of ML that uses deep neural networks.
Not all AI is ML; not all ML is DL.
Identify the Problem Type
First, determine the nature of the task. Is it a cognitive task that normally requires human intelligence? If yes, it falls under AI. Next, ask: can the solution be expressed as a set of explicit rules? If rules are known and fixed, a rule-based AI system (like an expert system) may suffice. If rules are unknown or too complex, machine learning is needed. Finally, assess data: if the problem involves high-dimensional data (images, audio, text) and large datasets, deep learning might be appropriate. This step is critical because the exam often presents a scenario and asks which approach to use. The key is to check for rule-based vs. data-driven, and data size/complexity.
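The questions in this step can be sketched as a small decision function. The three boolean inputs and the name `choose_approach` are simplifications invented for this example. Real projects weigh more factors than this, but the order of the checks follows the paragraph above.

```python
def choose_approach(rules_known, high_dimensional_data, large_dataset):
    """Walk the checks in order: fixed rules first, then data size and complexity."""
    if rules_known:
        return "rule-based AI"
    if high_dimensional_data and large_dataset:
        return "deep learning"
    return "traditional machine learning"

# Loan approval with fixed if-then rules.
print(choose_approach(rules_known=True, high_dimensional_data=False, large_dataset=False))
# Image classification with a very large labeled image set.
print(choose_approach(rules_known=False, high_dimensional_data=True, large_dataset=True))
# Tabular prediction with a moderate dataset.
print(choose_approach(rules_known=False, high_dimensional_data=False, large_dataset=False))
```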
Choose AI or ML or DL
Based on the problem type, decide the category. For AI-900, you must be able to classify scenarios. If the solution uses hard-coded rules (e.g., 'if temperature > 30, turn on AC'), it's AI but not ML. If it uses data to make predictions (e.g., regression, classification), it's ML. If it uses a multi-layer neural network with automatic feature extraction (e.g., image recognition), it's DL. Remember that DL is always ML, but the exam often asks for the 'most specific' category. For example, a convolutional neural network is DL, not just ML. Also, note that some Azure services like Cognitive Services are considered AI, but they may use ML/DL internally—the exam focuses on the user's perspective.
Consider Data Requirements
Evaluate the amount and type of data available. Traditional ML algorithms (e.g., logistic regression, decision trees) can work with thousands of examples. Deep learning trained from scratch typically requires far more, often millions of labeled examples, to avoid overfitting. If the scenario mentions 'small dataset' or 'limited labeled data,' deep learning is usually not the best choice. Additionally, consider feature engineering: traditional ML often requires manual feature extraction (e.g., computing edge maps or color histograms from an image), while DL learns features automatically from the raw input. The exam may present a scenario with limited data and ask why DL is not used; the correct answer is insufficient data.
Evaluate Computational Resources
Deep learning is computationally intensive, often requiring specialized hardware like GPUs or TPUs. Training a deep network can take hours or days on a single CPU. Traditional ML can run on standard CPUs. The exam may ask about deployment considerations: if the environment has limited compute (e.g., edge devices), DL might be impractical. Azure offers GPU VMs for DL, but they cost more. For AI-900, know that Azure Machine Learning provides scalable compute options, but the choice between ML and DL affects cost and time.
Select the Appropriate Azure Service
Once the category is determined, map it to an Azure service. For AI (including pre-built ML), use Azure Cognitive Services (e.g., Computer Vision, Text Analytics). For custom ML (including DL), use Azure Machine Learning. For deep learning specifically, Azure Machine Learning supports frameworks like TensorFlow. The exam tests this mapping: e.g., 'Which Azure service should you use to build a custom image classification model with deep learning?' Answer: Azure Machine Learning (or Azure Machine Learning Studio). Note that Cognitive Services are for pre-trained models; if customization is needed, use Azure Machine Learning.
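The mapping in this step comes down to one question: is a pre-trained model enough, or is a custom model needed? A minimal sketch follows, with a hypothetical helper name and the service names exactly as this chapter uses them.

```python
def pick_azure_service(needs_custom_model):
    """Pre-built capability -> Cognitive Services; custom ML or DL -> Azure Machine Learning."""
    if needs_custom_model:
        return "Azure Machine Learning"
    return "Azure Cognitive Services"

# Custom image classifier trained with deep learning.
print(pick_azure_service(needs_custom_model=True))
# Off-the-shelf text sentiment analysis.
print(pick_azure_service(needs_custom_model=False))
```

Note that the deep learning case does not get its own branch: custom DL models are built in the same service as other custom ML models.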
Enterprise Scenario 1: Retail Customer Service Chatbot
A large e-commerce company wants to deploy a chatbot to handle customer inquiries. The initial approach used a rule-based system (AI) that could only answer frequently asked questions with fixed responses. However, customers often asked nuanced questions, and the rule-based system failed, leading to poor satisfaction. The company switched to a machine learning solution using Azure Cognitive Services Language Understanding (LUIS). They trained a model with thousands of example utterances to classify intents (e.g., 'return item', 'track order') and extract entities (e.g., order number). This is ML because it learns from labeled data. The model was deployed as a scalable API. Performance improved, but for complex multi-turn conversations, they later integrated a deep learning-based transformer model (GPT-3 via Azure OpenAI Service) to generate natural responses. This required larger datasets and GPU compute. Common misconfiguration: not having enough training data for the ML model, leading to poor intent recognition. In production, they monitor model drift and retrain monthly.
Enterprise Scenario 2: Manufacturing Quality Inspection
A car manufacturer uses computer vision to detect defects on the assembly line. Initially, they tried traditional ML with handcrafted features (edge detection, color histograms) and a support vector machine (SVM). This worked for simple defects but failed on subtle cracks. They upgraded to a deep learning CNN (ResNet-50) trained on 100,000 labeled images. The DL model automatically learned features and achieved 99.5% accuracy. They deployed it on Azure Machine Learning with GPU VMs for inference at the edge. Key considerations: inference latency must be under 100ms to keep up with the conveyor belt. They used Azure Stack Edge to run the model locally. A common mistake: using a pre-trained model without fine-tuning on their specific defect images, resulting in poor performance. They also learned that deep learning requires careful hyperparameter tuning (learning rate, batch size) to avoid overfitting.
Enterprise Scenario 3: Healthcare Predictive Analytics
A hospital wants to predict patient readmission risk within 30 days of discharge. They have structured data (age, lab results, medications) for 50,000 patients. They used a gradient boosting model (XGBoost) in Azure Machine Learning, which is traditional ML (not deep learning) because the data is tabular and the dataset is moderate. The model was interpretable, allowing clinicians to understand key risk factors. Deep learning was considered but rejected due to lack of massive data and need for interpretability (regulatory compliance). Common misconfiguration: not handling class imbalance (few readmissions), leading to a model that always predicts 'no readmission.' They used SMOTE and weighted loss functions. The model is retrained monthly with new data.
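The class imbalance problem in this scenario is easy to reproduce with small numbers. The sketch below assumes a made-up cohort of 1,000 discharges with a 10% readmission rate. It shows why a model that always predicts 'no readmission' looks accurate yet is useless, and it computes inverse-frequency class weights (the same heuristic scikit-learn documents for `class_weight='balanced'`). It does not implement SMOTE.

```python
# Hypothetical cohort: 100 readmitted (1) and 900 not readmitted (0).
labels = [1] * 100 + [0] * 900
always_no = [0] * len(labels)  # a "model" that never predicts readmission

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred):
    """Fraction of actual readmissions that the model caught."""
    true_positives = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    actual_positives = sum(t == 1 for t in y_true)
    return true_positives / actual_positives

def balanced_class_weights(y_true):
    """Weight each class by n_samples / (n_classes * class_count)."""
    classes = sorted(set(y_true))
    return {c: len(y_true) / (len(classes) * y_true.count(c)) for c in classes}

acc = accuracy(labels, always_no)         # high, despite the model being useless
rec = recall(labels, always_no)           # zero: no readmission is ever detected
weights = balanced_class_weights(labels)  # rare class gets the larger weight
print(acc, rec, weights)
```

With these weights, each misclassified readmission costs nine times as much as a misclassified non-readmission during training, which pushes the model away from the trivial answer.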
AI-900 Exam Focus on AI vs ML vs DL
This topic falls under Objective 1.1: 'Identify features of common AI workloads.' The exam expects you to differentiate between AI, ML, and DL and to classify workloads accordingly. Approximately 15-20% of questions touch this area, often as part of a larger scenario.
Most Common Wrong Answers and Why
'All AI uses machine learning.' Candidates choose this because they think AI always learns. But rule-based systems are AI without ML. The exam includes questions about expert systems or simple automation.
'Deep learning is the same as machine learning.' While DL is a subset, the exam tests that they are not identical. A question might ask: 'Which technology uses multiple layers of neural networks?' The answer is deep learning, not just machine learning.
'Machine learning requires no data.' This is false; ML is data-driven. Candidates confuse ML with rule-based AI.
'Deep learning is always better than machine learning.' The exam emphasizes that DL requires more data and compute; for simple tasks, traditional ML may be better.
Specific Numbers and Terms on the Exam
Hierarchy: AI ⊃ ML ⊃ DL. Be able to draw the Venn diagram.
Examples: Rule-based (AI), linear regression (ML), CNN (DL).
Azure services: Cognitive Services (AI), Azure Machine Learning (ML/DL), Azure OpenAI Service (DL).
Data size: DL needs 'large amounts of labeled data.'
Feature extraction: ML often manual, DL automatic.
Edge Cases and Exceptions
Reinforcement learning: It is a type of ML, alongside supervised and unsupervised learning, even though it is sometimes discussed as a separate topic. The exam may list it as a type of ML.
Transfer learning: A DL technique where a pre-trained model is fine-tuned on a smaller dataset. This can reduce data requirements, but still considered DL.
Cognitive Services: Some services (e.g., Custom Vision) allow customization, blurring the line between AI and ML. The exam treats them as AI (pre-built) but with ML capabilities.
How to Eliminate Wrong Answers
If the scenario mentions 'explicit rules,' it's AI but not ML.
If it mentions 'learning from data' without specifying neural networks, it's ML.
If it mentions 'multiple layers' or 'neural networks,' it's DL.
If it mentions 'small dataset,' eliminate DL.
If it mentions 'interpretability,' traditional ML (e.g., decision tree) is often preferred over DL.
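These elimination rules can be practiced as a simple keyword check over the scenario text. The function below is a study aid with hypothetical names and naive substring matching, so real exam wording will not always hit these exact phrases.

```python
def remaining_options(scenario):
    """Apply the elimination rules above and return the answers still in play."""
    text = scenario.lower()
    options = {"rule-based AI", "traditional ML", "deep learning"}
    if "explicit rules" in text:
        options &= {"rule-based AI"}          # rules mean AI, but not ML
    if "learning from data" in text:
        options.discard("rule-based AI")      # learning rules out fixed logic
    if "multiple layers" in text or "neural network" in text:
        options &= {"deep learning"}
    if "small dataset" in text or "interpretab" in text:
        options.discard("deep learning")      # too little data, or a black box
    return options

print(remaining_options("A model learning from data on a small dataset"))
print(remaining_options("A system that applies explicit rules to approve loans"))
print(remaining_options("An image model built as a neural network with multiple layers"))
```

The helper itself uses fixed if-then logic, so by this chapter's definitions it is rule-based AI rather than ML.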
AI is the broadest term; ML is a subset; DL is a subset of ML.
Rule-based systems are AI but not ML or DL.
Deep learning requires large data and compute resources.
For AI-900, be able to classify a given scenario as AI, ML, or DL.
Azure Cognitive Services are pre-built AI; Azure Machine Learning is for custom ML/DL.
Traditional ML algorithms (e.g., linear regression) are not deep learning.
The exam tests the hierarchy: AI ⊃ ML ⊃ DL.
These come up on the exam all the time. Here's how to tell them apart.
Rule-based AI
Uses explicit if-then rules defined by humans.
Does not require training data.
Easy to interpret and debug.
Cannot handle unseen scenarios not covered by rules.
Example: A thermostat that turns on at 18°C.
Machine Learning
Learns patterns from data automatically.
Requires training data (labeled or unlabeled); how much depends on the algorithm.
Can generalize to new, unseen data.
Model may be a black box (especially deep learning).
Example: A spam filter trained on thousands of emails.
Traditional Machine Learning
Often requires manual feature engineering.
Works well with smaller datasets (thousands of examples).
Faster to train on CPU.
Examples: Decision trees, SVMs, linear regression.
More interpretable than deep learning.
Deep Learning
Automatically learns hierarchical features.
Requires very large datasets (millions of examples).
Requires GPU/TPU for efficient training.
Examples: CNNs, RNNs, Transformers.
Less interpretable; often a black box.
Mistake
AI and machine learning are the same thing.
Correct
AI is the broad field of creating intelligent systems; ML is a subset that learns from data. Not all AI uses ML (e.g., rule-based systems).
Mistake
Deep learning is a completely separate field from machine learning.
Correct
Deep learning is a subset of machine learning that uses deep neural networks. It is not separate; it is a specialized area within ML.
Mistake
Machine learning always requires deep neural networks.
Correct
ML includes many algorithms like linear regression, decision trees, and SVMs that are not deep learning. Deep learning is just one type of ML.
Mistake
All AI systems improve over time through learning.
Correct
Many AI systems are static rule-based systems that do not learn. Only ML systems improve with experience.
Mistake
Deep learning works well with small datasets.
Correct
Deep learning typically requires very large datasets (millions of examples) to perform well. With small data, it tends to overfit.
AI is the overarching field of creating machines that mimic human intelligence. ML is a subset of AI that enables systems to learn from data without explicit programming. DL is a subset of ML that uses deep neural networks with multiple layers to model complex patterns. All DL is ML, all ML is AI, but not vice versa. For example, a rule-based chess engine is AI but not ML; a linear regression model is ML but not DL; a convolutional neural network is DL.
No. Deep learning excels at tasks with large amounts of data and complex patterns (e.g., image recognition), but for simpler tasks with limited data, traditional ML algorithms often perform better and are more interpretable. Deep learning also requires more computational resources. The choice depends on the problem, data size, and resource constraints.
Yes. Many AI systems are rule-based, such as expert systems or simple automation scripts. They follow predefined rules without learning from data. For example, a spam filter that blocks emails containing specific keywords is AI but not ML. The AI-900 exam includes such examples to test this distinction.
Azure Cognitive Services provide pre-built AI capabilities (e.g., vision, speech) and are considered AI. Azure Machine Learning is the platform for building custom ML and DL models. For deep learning specifically, you can use frameworks like TensorFlow within Azure Machine Learning. Azure OpenAI Service also provides deep learning-based language models.
Deep learning is typically used for high-dimensional data like images, audio, and text, where automatic feature extraction is beneficial. It requires large labeled datasets (often millions of examples) and substantial compute (GPUs). If you have limited data or need interpretability, traditional ML may be more appropriate.
Think of concentric circles: AI is the outermost, containing ML, which contains DL. So AI includes everything from simple rules to complex learning. ML includes all learning algorithms, and DL is a specific type of ML using deep neural networks. This hierarchy is frequently tested on the AI-900 exam.
Generally no, because deep learning models have millions of parameters and require large amounts of data to avoid overfitting. However, transfer learning can help: you can take a pre-trained model (trained on a large dataset) and fine-tune it on a smaller dataset. This still requires some data but less than training from scratch.