AI-900 · Chapter 31 of 100 · Objective 1.2

AI Transparency and Explainability

This chapter covers AI transparency and explainability, two critical pillars of responsible AI that the AI-900 exam tests extensively. You will learn the difference between transparency (documenting the model's purpose, data, and limitations) and explainability (providing understandable reasons for individual predictions). Expect roughly 5-7% of exam questions to touch on these concepts, often in scenarios involving fairness, accountability, or regulatory compliance. Mastering this chapter will help you answer questions about model interpretability tools in Azure Machine Learning, such as the interpretability package and explanation dashboards.

25 min read
Intermediate
Updated May 31, 2026

The Recipe Box for AI Decisions

Imagine a master chef who creates a new dish every day. The chef is an AI model. One day, a customer asks, 'Why did you add cinnamon to this stew?' The chef cannot explain – they just know it tastes right. This is a black-box AI. Now imagine that chef must write down every ingredient, every measurement, and every step in a recipe card before cooking. After the dish is served, anyone can read the card to see exactly why the dish turned out as it did. That card is the explanation. But transparency goes further: it's not just the recipe card, but also the list of who tasted the dish, what feedback they gave, and what changes were made over time. In Azure AI, transparency means documenting the model's training data, its performance metrics, its limitations, and providing tools like interpretability dashboards. Explainability is the recipe card – the specific reasons for a single prediction. Together, they build trust. If a loan application is denied, the bank must show the applicant the exact factors (income, credit history, etc.) that led to the decision, just as a chef would show the recipe to a disappointed customer who wanted a sweeter dish.

How It Actually Works

What Are AI Transparency and Explainability?

AI transparency refers to the degree to which the workings of an AI system are open to inspection. It covers the entire lifecycle: the data used for training, the model architecture, the training process, the evaluation metrics, and the deployment environment. Explainability, a subset of transparency, focuses on providing human-understandable explanations for specific decisions or predictions made by the model. The AI-900 exam expects you to distinguish between these two concepts and understand their importance in building trust, ensuring fairness, and meeting regulatory requirements.

Why They Matter for Responsible AI

Microsoft's Responsible AI principles are fairness, reliability and safety, privacy and security, inclusiveness, transparency, and accountability. Transparency and explainability directly support the fairness, reliability, and accountability principles. Without transparency, you cannot audit a model for bias. Without explainability, you cannot trust a model's decision in high-stakes scenarios like medical diagnosis or loan approvals. Regulations increasingly require that decisions made by automated systems can be explained to affected individuals; the GDPR's provisions on automated decision-making are often summarized as a 'right to explanation'.

How Explainability Works Internally

Explainability methods fall into two categories: global and local. Global explainability describes the overall behavior of the model – which features are most important across all predictions. Local explainability explains a single prediction. Two common techniques are:

SHAP (SHapley Additive exPlanations): Based on cooperative game theory. It attributes a prediction to each feature by averaging that feature's marginal contribution over all possible subsets of the other features. Exact computation requires evaluating the model on every feature subset, so practical implementations approximate it by sampling subsets or by exploiting model structure (for example, in tree ensembles). This is computationally expensive but provides consistent and theoretically grounded explanations.

LIME (Local Interpretable Model-agnostic Explanations): Fits a simple, interpretable model (like a linear regression) locally around a specific prediction. It perturbs the input (e.g., removes words from text, turns pixels on/off in an image) and observes how the prediction changes. The simple model approximates the complex model's behavior in that neighborhood.
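To make the Shapley idea behind SHAP concrete, here is a minimal sketch that computes exact Shapley values for one prediction by enumerating every feature subset. It is not the SHAP library or the Azure implementation; `toy_model`, the input `x`, and the all-zero `baseline` are hypothetical, and "removing" a feature is modeled by substituting its baseline value, which is one common simplification.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one input by enumerating all feature subsets.

    Features outside a subset are replaced by their baseline value.
    """
    n = len(x)

    def value(subset):
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for a coalition of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(features):
    """Hypothetical scoring function with an income*years interaction."""
    income, debt, years = features
    return 2.0 * income - 3.0 * debt + 0.5 * income * years


x = [4.0, 1.0, 2.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The attributions sum to the difference between the prediction for `x` and the prediction for the baseline, and the interaction term is split evenly between the two features involved. The nested loops visit every subset, which is why exact computation grows exponentially with the number of features and real tools rely on approximations.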

In Azure Machine Learning, the azureml-interpret package provides both SHAP and LIME implementations. The Explanation Dashboard visualizes these results, showing feature importance values for individual predictions and aggregate views.

Key Components in Azure

Model Interpretability SDK: Available in Python via azureml-interpret. It supports tabular, text, and image data. For tabular data, you can use TabularExplainer, which selects an appropriate SHAP explainer (tree, deep, linear, or kernel) for the model you pass in.

Explanation Dashboard: Integrated into Azure Machine Learning studio. It displays:

- Global feature importance (ranked list of features)
- Local feature importance for a selected data point
- Summary plots (e.g., swarm plots, bar charts)

Model Performance Dashboard: Provides fairness metrics (e.g., demographic parity, equalized odds) and error analysis.

Responsible AI Dashboard: Combines model interpretability, fairness assessment, error analysis, and counterfactual what-if analysis into a single view.

Configuration and Usage

To enable explainability in Azure ML, you typically:

1. Install the package: pip install azureml-interpret

2. Create an explainer object, e.g., TabularExplainer(model, X_train, features=feature_names, classes=class_names)

3. Call explain_global() to generate global explanations.

4. Call explain_local() for local explanations on specific test points.

5. Register the explanation in the Azure ML workspace for later visualization.

Example snippet:

from interpret.ext.blackbox import TabularExplainer

# Assume 'model' is a trained classifier, 'X_train' is training data
explainer = TabularExplainer(model, X_train, features=feature_names, classes=class_names)
global_explanation = explainer.explain_global(X_train)
local_explanation = explainer.explain_local(X_test[:5])

You can then upload the explanation to the workspace:

from azureml.interpret import ExplanationClient

# Assume 'run' is the Azure ML Run object for the experiment that produced the model
client = ExplanationClient.from_run(run)
client.upload_model_explanation(global_explanation, comment='global explanation')

Interaction with Other Azure Services

Microsoft Purview (formerly Azure Purview): For data lineage and cataloging, ensuring transparency in data sources.

Azure Policy: Can enforce that models have explanation artifacts before deployment.

Azure Monitor: Logs model predictions and can trigger alerts when explanation patterns change (drift detection).

Azure Machine Learning Pipelines: You can include explanation steps in MLOps pipelines to generate explanations automatically on new data.

Limitations and Trade-offs

Performance Overhead: SHAP is computationally intensive; for large models or datasets, it may take hours. LIME is faster but less consistent.

Accuracy vs. Interpretability Trade-off: Complex models (deep neural networks, ensemble methods) are often more accurate but harder to explain. Simpler models (linear regression, decision trees) are inherently interpretable but may perform worse.

Explanation Faithfulness: Some explanations may not perfectly reflect the model's internal logic. For example, LIME's local approximations can be unstable – small changes in input may lead to different explanations.

Regulatory Compliance: Even with explanations, regulators may require that the model's decisions are auditable and reproducible. This demands versioning of data, code, and environment – not just explanations.

Best Practices for the Exam

Remember that transparency is broader than explainability – it includes documentation, data governance, and model cards.

Explainability is specifically about interpreting model predictions.

Azure's Responsible AI Dashboard is the go-to tool for combining interpretability, fairness, and error analysis.

The exam may ask about counterfactual explanations – 'what if' scenarios that show how changing an input feature would change the prediction. This is part of the Responsible AI Dashboard.

Be aware that model interpretability is not the same as model performance – a model can be highly accurate but uninterpretable, or interpretable but inaccurate.

Common Exam Scenarios

1. A bank uses a deep learning model to approve loans. A customer is denied and wants to know why. The bank should use local explainability (e.g., SHAP) to show which features contributed to the denial.

2. A healthcare provider must document all models used in diagnosis to comply with regulations. This requires transparency – model cards, training data documentation, and version control.

3. An e-commerce company wants to understand overall customer churn drivers. They need global explainability to see which features (e.g., price, delivery time) most affect churn.

Exam Trap: Confusing Transparency with Explainability

A common wrong answer on the exam is to pick 'transparency' when the scenario asks for 'explainability' or vice versa. Remember: transparency is about the entire system being open to inspection (documentation, data sources, model architecture). Explainability is about understanding a specific decision. If the question mentions 'a single prediction' or 'why this customer was denied', the answer is explainability. If it mentions 'documentation of the model's purpose and limitations', it's transparency.

Another trap: assuming that all models can be explained equally. The exam may present a scenario where a model is too complex to explain locally (e.g., a 100-layer neural network). The correct answer may be to use a simpler, inherently interpretable model instead, or to use a surrogate model (like a decision tree) to approximate explanations.
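The surrogate-model idea can be sketched in a few lines: query the opaque model for its predictions, then fit a very simple model to those predictions and measure how often the two agree (fidelity). This is an illustrative sketch only; `black_box` is a hypothetical stand-in for a complex model, and the surrogate here is a one-split decision stump rather than a full decision tree.

```python
import random


def black_box(row):
    """Hypothetical opaque model: a nonlinear score thresholded at 0.5."""
    x0, x1 = row
    return 1 if 0.8 * x0 + 0.2 * x1 + 0.1 * x0 * x1 > 0.5 else 0


def majority(labels):
    return 1 if 2 * sum(labels) >= len(labels) else 0


def fit_stump(rows, targets):
    """Find the single feature/threshold split that best reproduces targets."""
    best = None
    for col in range(len(rows[0])):
        for threshold in sorted({r[col] for r in rows}):
            left = [t for r, t in zip(rows, targets) if r[col] <= threshold]
            right = [t for r, t in zip(rows, targets) if r[col] > threshold]
            if not left or not right:
                continue
            l_label, r_label = majority(left), majority(right)
            agree = (sum(t == l_label for t in left)
                     + sum(t == r_label for t in right))
            if best is None or agree > best["agree"]:
                best = {"agree": agree, "col": col, "threshold": threshold,
                        "left": l_label, "right": r_label}
    return best


rng = random.Random(7)
rows = [[rng.random(), rng.random()] for _ in range(300)]
targets = [black_box(r) for r in rows]   # surrogate learns the model's outputs
stump = fit_stump(rows, targets)
fidelity = stump["agree"] / len(rows)    # share of rows where surrogate agrees
print(stump["col"], round(stump["threshold"], 3), round(fidelity, 3))
```

The surrogate is trained on the black box's outputs, not on ground-truth labels, so its fidelity says how well it mimics the model rather than how accurate either model is. Fidelity here is measured on the same rows used for fitting; a held-out set would give a more honest estimate.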

Summary of Key Terms for the Exam

Model card: A document that describes the intended use, performance, and limitations of a model.

Data sheet: Documentation of a dataset's provenance, composition, and collection process.

Interpretability: The ability to understand the cause of a decision.

Explainability: The ability to provide reasons for a decision in human-understandable terms.

Fairness metrics: Quantitative measures of bias, such as demographic parity (equal acceptance rates across groups) and equalized odds (equal true positive and false positive rates).

Counterfactual explanation: Shows how the input would need to change to get a different prediction.
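As a rough illustration of the model card term defined in the list above, a model card can be kept as structured data next to the model and serialized for review. The field names below follow the definition in this chapter (intended use, performance, limitations) plus a pointer to a data sheet; the class, the model name, and every value are hypothetical placeholders, not a standard schema or real results.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class ModelCard:
    """Minimal, hypothetical model card structure."""
    name: str
    version: str
    intended_use: str
    performance: dict
    limitations: list = field(default_factory=list)
    data_sheet: str = ""   # reference to the dataset's data sheet


card = ModelCard(
    name="loan-approval-classifier",          # illustrative name
    version="1.0",
    intended_use="Rank consumer loan applications for human review.",
    performance={"accuracy": 0.91},           # placeholder, not a real result
    limitations=["Not validated for business loans."],
    data_sheet="datasheets/loan-applications.md",
)

card_json = json.dumps(asdict(card), indent=2)
print(card_json)
```

Storing the card as data makes it easy to version it together with the model and to check that required fields are filled in before deployment.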

Real-World Implementation Details

In production, generating explanations for every prediction can be expensive. A common pattern is to generate explanations on-demand for audited predictions or for a random sample. Azure ML supports this by allowing you to register explanations and query them later. The ExplanationClient can retrieve explanations by run ID.

For real-time scoring, you can deploy a model with an explanation endpoint. This involves wrapping the model and the explainer in a scoring script. However, this increases latency – for high-throughput systems, explanations are often generated asynchronously.
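The asynchronous pattern described above can be sketched with a queue and a background worker: the scoring call returns the prediction immediately and leaves the explanation to be computed later. This is a minimal sketch using only the standard library; the linear `WEIGHTS`, the `explain` function, and the in-memory `explanation_store` are hypothetical stand-ins for a real model, explainer, and database.

```python
import queue
import threading

WEIGHTS = {"income": 0.5, "debt": -1.0}   # hypothetical linear model
explanation_jobs = queue.Queue()
explanation_store = {}                    # stand-in for a database


def predict(features):
    return 1 if sum(WEIGHTS[k] * v for k, v in features.items()) > 0 else 0


def explain(features):
    """Per-feature contribution of the toy linear model."""
    return {k: WEIGHTS[k] * v for k, v in features.items()}


def score(request_id, features):
    """Return the prediction right away; queue the explanation for later."""
    prediction = predict(features)
    explanation_jobs.put((request_id, features))
    return prediction


def worker():
    while True:
        job = explanation_jobs.get()
        if job is None:                   # sentinel: stop the worker
            explanation_jobs.task_done()
            break
        request_id, features = job
        explanation_store[request_id] = explain(features)
        explanation_jobs.task_done()


thread = threading.Thread(target=worker, daemon=True)
thread.start()

result = score("req-1", {"income": 4.0, "debt": 1.0})
explanation_jobs.join()                   # wait only so the demo can print
explanation_jobs.put(None)
thread.join()
print(result, explanation_store["req-1"])
```

In a real service the caller would not wait on the queue; the explanation would be looked up later by request ID when an audit or a customer query needs it.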

Conclusion

AI transparency and explainability are not optional extras – they are fundamental to responsible AI. The AI-900 exam tests your understanding of why they matter, the tools Azure provides, and how to apply them in common scenarios. Focus on the distinction between transparency and explainability, the capabilities of the Responsible AI Dashboard, and the practical considerations of implementing explanations.

Walk-Through

1. Identify the Need for Explanation

The first step is recognizing when an explanation is required. In enterprise scenarios, this often comes from regulatory compliance (e.g., GDPR's right to explanation), internal audit policies, or customer complaints. For instance, a bank must explain loan denials to applicants. At this stage, you define the scope: which predictions need explanations (all, or only those with certain outcomes), the level of detail (feature-level or counterfactual), and the audience (data scientists, regulators, or end users). In Azure, you would configure the monitoring and logging to capture prediction requests and outcomes for later analysis.

2. Select the Explanation Technique

Choose between global and local explainability based on the use case. For a single prediction, use local methods like SHAP or LIME. For overall model behavior, use global methods like permutation feature importance. Azure's `TabularExplainer` automatically selects a suitable SHAP explainer based on the model type; for deep neural networks, that is `DeepExplainer`. For text models, you might use `TextExplainer` (LIME-based). Consider the trade-off: SHAP attributions are more consistent but slower to compute; LIME is faster but less stable. The exam expects you to know that SHAP provides consistent feature attributions based on game theory.
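Permutation feature importance, the global method mentioned above, is simple enough to sketch directly: shuffle one feature column, re-score the model, and record how much the metric drops. The sketch below uses only the standard library; the `predict` function and the synthetic data are hypothetical, and the model deliberately ignores its second feature so the expected result is easy to reason about.

```python
import random


def predict(row):
    """Hypothetical model that only looks at the first feature."""
    return 1 if row[0] > 0.5 else 0


def accuracy(model, rows, labels):
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(model, rows, labels, n_repeats=10, seed=0):
    """Mean drop in accuracy when each feature column is shuffled."""
    rng = random.Random(seed)
    base = accuracy(model, rows, labels)
    importances = []
    for col in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [r[col] for r in rows]
            rng.shuffle(shuffled)
            permuted = [r[:col] + [v] + r[col + 1:]
                        for r, v in zip(rows, shuffled)]
            drops.append(base - accuracy(model, permuted, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


data_rng = random.Random(42)
rows = [[data_rng.random(), data_rng.random()] for _ in range(200)]
labels = [predict(r) for r in rows]       # labels match the model exactly
pfi = permutation_importance(predict, rows, labels)
print(pfi)
```

Shuffling the ignored feature leaves accuracy unchanged, so its importance is zero, while shuffling the feature the model depends on causes a large drop. The method only needs predictions, so it works with any model, but it describes overall behavior and says nothing about a single prediction.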

3. Generate the Explanation

Using the Azure ML Python SDK, you instantiate an explainer object and call its methods. For example, `explainer.explain_local(test_data)` returns a local explanation object containing feature importance values for each row of `test_data`. The SDK runs the underlying algorithm: for SHAP, it estimates Shapley values by sampling feature subsets; for LIME, it creates perturbed samples and fits a linear model. The process may take minutes for large datasets. You can track the run in Azure ML and attach the explanation to the run for later retrieval. The explanation object exposes `local_importance_values` and `expected_values`.
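The perturb-and-fit step that LIME performs can be illustrated with a simplified, standard-library sketch: sample points near the input, weight them by proximity, and fit a weighted linear model whose slopes serve as the local explanation. This is not the LIME library or the SDK's implementation; `black_box`, the Gaussian perturbation, the kernel width, and the sample count are all assumptions chosen for illustration.

```python
import math
import random


def solve(a, b):
    """Solve a small linear system a @ w = b with Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def lime_explain(predict, x, n_samples=500, scale=0.5, kernel_width=1.0, seed=0):
    """Fit a proximity-weighted linear model around x; return its slopes."""
    rng = random.Random(seed)
    rows, targets, weights = [], [], []
    for _ in range(n_samples):
        z = [xi + rng.gauss(0.0, scale) for xi in x]           # perturbed input
        dist2 = sum((zi - xi) ** 2 for zi, xi in zip(z, x))
        rows.append([1.0] + [zi - xi for zi, xi in zip(z, x)])  # intercept + offsets
        targets.append(predict(z))
        weights.append(math.exp(-dist2 / kernel_width ** 2))   # closer = heavier
    k = len(x) + 1
    # Weighted normal equations: (X^T W X) beta = X^T W y
    xtwx = [[sum(w * r[i] * r[j] for r, w in zip(rows, weights))
             for j in range(k)] for i in range(k)]
    xtwy = [sum(w * r[i] * y for r, y, w in zip(rows, targets, weights))
            for i in range(k)]
    return solve(xtwx, xtwy)[1:]


def black_box(z):
    """Hypothetical nonlinear model."""
    return z[0] ** 2 + 3.0 * z[1]


slopes = lime_explain(black_box, [1.0, 2.0])
print(slopes)
```

Near the point (1, 2) the toy model behaves roughly like a line with slopes 2 and 3, and the fitted surrogate recovers values close to those. Changing `seed`, `scale`, or `kernel_width` shifts the result slightly, which is the instability that LIME explanations are known for.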

4. Visualize and Interpret Results

Upload the explanation to the Azure ML workspace using `ExplanationClient.upload_model_explanation()`. Then, in the Azure ML studio, open the Explanation Dashboard. Here you can view global feature importance as a bar chart, local feature importance for individual points, and a summary plot (e.g., a swarm plot showing feature value distribution against importance). You can also use the What-If tool to create counterfactual explanations – for example, 'what income would this applicant need to get approved?' The dashboard allows filtering by prediction outcome, feature values, and other dimensions.
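The counterfactual question quoted above ("what income would this applicant need to get approved?") can be answered for a toy model with a simple search over one feature. This sketch is not how the Responsible AI Dashboard generates counterfactuals; the `approve` rule, its coefficients, and the applicant record are hypothetical, and integer arithmetic is used so the threshold comparison is exact.

```python
def approve(applicant):
    """Hypothetical approval rule using integer arithmetic."""
    return applicant["income"] - 1000 * applicant["debt_percent"] >= 20000


def income_counterfactual(applicant, step=1000, max_income=200000):
    """Smallest income (in steps) that flips a denial into an approval."""
    candidate = dict(applicant)            # leave the original record untouched
    while candidate["income"] <= max_income:
        if approve(candidate):
            return candidate["income"]
        candidate["income"] += step
    return None                            # no counterfactual within the range


applicant = {"income": 40000, "debt_percent": 35}
needed_income = income_counterfactual(applicant)
print(approve(applicant), needed_income)
```

With these assumed numbers the applicant is denied at an income of 40,000 and would be approved at 55,000, so the counterfactual statement is "an income 15,000 higher would change the decision". Real counterfactual tools search over several features at once and restrict changes to features a person can realistically alter.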

5. Document and Act on Findings

The final step is to document the explanations in a model card or audit report. Azure ML allows you to export the dashboard as a PDF or share it via a link. If the explanation reveals bias (e.g., a feature like race has high importance), you may need to retrain the model with fairness constraints or remove biased features. The Responsible AI Dashboard also provides fairness metrics and error analysis, helping you decide corrective actions. In production, you might set up alerts when explanation patterns change, indicating data drift or model decay.
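The fairness metrics mentioned above can be computed by hand on a small example, which helps when deciding whether corrective action is needed. The sketch below follows this chapter's definitions of demographic parity (equal selection rates across groups) and equalized odds (equal true positive and false positive rates); the labels, predictions, and group assignments are made-up illustrative data, and the helper names are hypothetical.

```python
def rate(values):
    return sum(values) / len(values) if values else 0.0


def group_rates(y_true, y_pred, groups):
    """Selection rate, true positive rate, and false positive rate per group."""
    out = {}
    for g in sorted(set(groups)):
        idx = [i for i, v in enumerate(groups) if v == g]
        out[g] = {
            "selection_rate": rate([y_pred[i] for i in idx]),
            "tpr": rate([y_pred[i] for i in idx if y_true[i] == 1]),
            "fpr": rate([y_pred[i] for i in idx if y_true[i] == 0]),
        }
    return out


def gap(rates, key):
    values = [r[key] for r in rates.values()]
    return max(values) - min(values)


def demographic_parity_difference(rates):
    return gap(rates, "selection_rate")


def equalized_odds_difference(rates):
    return max(gap(rates, "tpr"), gap(rates, "fpr"))


# Illustrative data: four applicants in group A, four in group B
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

rates = group_rates(y_true, y_pred, groups)
dp_diff = demographic_parity_difference(rates)
eo_diff = equalized_odds_difference(rates)
print(rates, dp_diff, eo_diff)
```

A difference of zero means the groups are treated identically under that metric. In this made-up data, group A is selected three times as often as group B, which is the kind of gap that would prompt a closer look at the model and its training data.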

What This Looks Like on the Job

Enterprise Scenario 1: Loan Approval in Banking

A major bank uses a gradient-boosted tree model to automate loan approvals. Regulatory requirements demand that any denial be accompanied by an explanation. The bank deploys the model using Azure Machine Learning and integrates the azureml-interpret package into their scoring pipeline. For each loan application, the system generates a local explanation using SHAP. The explanation highlights the top three contributing features (e.g., debt-to-income ratio, credit score, loan amount) and their impact on the decision. These explanations are stored in a database and sent to the applicant via a secure portal. The bank also uses the Explanation Dashboard to monitor global feature importance over time, ensuring that no new biases emerge. A challenge they faced was the computational cost: generating SHAP explanations for every application (millions per month) was infeasible. They solved this by generating explanations only for denied applications and for a random 1% sample of approved ones, reducing compute by 95%.
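The savings from this sampling policy depend on how many applications are denied, which the scenario does not state. As a back-of-the-envelope sketch, assuming every explanation costs the same, the fraction of applications explained is the denial rate plus 1% of the remainder; the function names below are hypothetical.

```python
import random


def explained_fraction(denial_rate, approved_sample_rate=0.01):
    """Share of applications explained: all denials plus a sample of approvals."""
    return denial_rate + approved_sample_rate * (1.0 - denial_rate)


def implied_denial_rate(target_fraction, approved_sample_rate=0.01):
    """Denial rate at which the policy explains exactly target_fraction."""
    return (target_fraction - approved_sample_rate) / (1.0 - approved_sample_rate)


def should_explain(decision, rng, approved_sample_rate=0.01):
    """Policy check applied to each scored application."""
    return decision == "denied" or rng.random() < approved_sample_rate


# A 95% compute reduction means only 5% of applications are explained
denial_rate = implied_denial_rate(0.05)
print(round(denial_rate, 4), round(explained_fraction(denial_rate), 4))
print(should_explain("denied", random.Random(0)))
```

Under these assumptions, a 95% reduction corresponds to a denial rate of roughly 4%. A portfolio with a higher denial rate would see a smaller saving from the same policy.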

Enterprise Scenario 2: Medical Diagnosis in Healthcare

A healthcare startup develops a deep learning model to detect diabetic retinopathy from retinal scans. Clinicians need to trust the model's decisions, so they require explanations. The team uses Azure's Image Explanation capabilities, which rely on LIME. For each scan, the explanation highlights the regions of the image that most influenced the prediction (e.g., microaneurysms or hemorrhages). These heatmaps are overlaid on the original image and presented to the ophthalmologist. The model also outputs a confidence score. The startup documents the model's performance across different ethnicities and age groups in a model card, ensuring transparency. A common issue they faced was that LIME explanations were sometimes unstable – slight rotations of the image produced different heatmaps. They mitigated this by averaging explanations over multiple perturbed versions of the same image.

Enterprise Scenario 3: Credit Scoring for a Fintech

A fintech company uses an ensemble of logistic regression and random forest to assign credit scores. They need to comply with the EU's GDPR, which gives users the right to an explanation of automated decisions. They built a custom dashboard using Azure's Responsible AI Dashboard. For each user, they provide a counterfactual explanation: 'If your income were $5,000 higher, your credit score would increase by 20 points.' They also show the most important features globally: payment history, credit utilization, and length of credit history. The fintech found that the global explanations helped them identify that the model was overly reliant on 'credit utilization' – a feature that can be manipulated by users. They retrained the model with additional features to reduce this dependency. The deployment is fully automated: every model version registers its explanations, and the dashboard updates in real-time.

Common Misconfigurations

Not versioning explanations: If the model is retrained, old explanations become stale. Always associate explanations with a specific model version.

Over-reliance on global explanations: Global feature importance can hide local variations. For example, a feature may be important globally but irrelevant for a specific prediction. Always check local explanations for individual cases.

Ignoring explanation latency: In real-time systems, generating explanations on the fly can add hundreds of milliseconds. Plan for asynchronous explanation generation or limit explanations to a subset of predictions.
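The first misconfiguration in the list above, stale explanations after retraining, can be guarded against by keying every stored explanation on the model version and refusing to serve one for a version that was never explained. The registry below is a hypothetical in-memory sketch, not an Azure ML API; the version strings and importance values are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredExplanation:
    model_version: str
    feature_importance: dict


class ExplanationRegistry:
    """Hypothetical store that ties each explanation to one model version."""

    def __init__(self):
        self._by_version = {}

    def register(self, model_version, feature_importance):
        record = StoredExplanation(model_version, dict(feature_importance))
        self._by_version[model_version] = record
        return record

    def get(self, model_version):
        # Fail loudly rather than fall back to another version's explanation
        if model_version not in self._by_version:
            raise KeyError(f"no explanation registered for model version {model_version!r}")
        return self._by_version[model_version]


registry = ExplanationRegistry()
registry.register("1", {"credit_score": 0.6, "loan_amount": 0.4})
registry.register("2", {"credit_score": 0.5, "loan_amount": 0.5})

current = registry.get("2")
print(current)
```

Looking up a version that has no registered explanation raises an error instead of silently returning an older one, which turns a stale-explanation bug into a visible failure.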

How AI-900 Actually Tests This

What AI-900 Tests on This Topic

The AI-900 exam objective 1.2 ('Identify guiding principles for responsible AI') includes transparency and explainability as sub-principles. Specifically, you need to know:

The difference between transparency and explainability.

The purpose of model cards and datasheets.

The role of Azure Machine Learning's interpretability tools (Explanation Dashboard, Responsible AI Dashboard).

How explainability supports fairness and reliability.

The concept of counterfactual explanations.

Most Common Wrong Answers

1. 'Transparency means the model is interpretable.' This is wrong because transparency is broader – it includes documentation, data governance, and auditability, not just interpretability. Candidates often confuse the two because they sound similar.

2. 'Explainability is only needed for complex models.' Incorrect. Even simple models like linear regression should be explained to build trust and meet regulations. The exam may present a scenario where a simple model is used but still requires explanations.

3. 'SHAP and LIME are the same thing.' They are different: SHAP is based on game theory and provides consistent attributions; LIME is model-agnostic and fits a local surrogate. The exam might ask which method is more computationally expensive (SHAP) or which is more stable (SHAP).

4. 'The Responsible AI Dashboard only shows explanations.' Actually, it combines interpretability, fairness, error analysis, and counterfactual what-ifs. A candidate might pick an answer that says 'only explanations' and miss the broader functionality.

Specific Numbers and Terms That Appear on the Exam

Model Card: A document that includes intended use, performance metrics, and limitations. The exam may ask what information a model card contains.

Datasheet for Datasets: A document that describes the dataset's provenance, collection method, and potential biases.

Fairness metrics: Demographic parity, equalized odds, and disparate impact. Know that these are quantitative measures of bias.

Counterfactual explanation: An explanation that shows how changing an input would change the prediction (e.g., 'If your income were higher, you would be approved').

Azure Machine Learning SDK: The azureml-interpret package. The exam won't test code syntax but may ask which package is used for interpretability.

Edge Cases the Exam Loves to Test

When the model is a black box (e.g., deep neural network): The correct approach is to use model-agnostic explainers like LIME or SHAP, not to give up on explanations.

When the dataset is imbalanced: Explainability can reveal that the model relies on spurious correlations. For example, a model trained on medical images might use hospital equipment markers instead of actual pathology. The exam may ask how to detect this (using explanation insights).

When regulations require explanations for all decisions: The answer is to implement explainability in the MLOps pipeline, not to use a simpler model that may be less accurate.

How to Eliminate Wrong Answers

If the question mentions 'documentation of the model's purpose and limitations', the answer is transparency, not explainability.

If the question mentions 'why a specific prediction was made', the answer is explainability (local explanation).

If the question mentions 'a tool that shows feature importance for individual predictions', the answer is the Explanation Dashboard (or Responsible AI Dashboard).

If the question mentions 'a method that uses game theory', the answer is SHAP.

If the question mentions 'a method that fits a simple model locally', the answer is LIME.

Remember: The exam is scenario-based. Read carefully to identify whether they ask about the overall system (transparency) or a specific decision (explainability).

Key Takeaways

Transparency covers the entire AI system lifecycle; explainability focuses on individual predictions.

Azure ML's Responsible AI Dashboard combines interpretability, fairness, error analysis, and counterfactual explanations.

SHAP is based on game theory and provides consistent feature attributions; LIME fits a local surrogate model.

Model cards and datasheets are key transparency artifacts required by responsible AI principles.

Explainability is essential for regulatory compliance (e.g., GDPR right to explanation) and building user trust.

Global explainability reveals overall feature importance; local explainability explains a single prediction.

Counterfactual explanations show how changing an input would alter the prediction – often tested on the exam.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

SHAP (Shapley Additive Explanations)

Based on cooperative game theory – uses Shapley values.

Provides consistent feature attributions – if a model changes so that a feature's contribution grows, that feature's attribution does not decrease.

Computationally expensive – requires multiple model evaluations for each prediction.

Supported in Azure ML via `TabularExplainer` and `DeepExplainer`.

Produces both global and local explanations.

LIME (Local Interpretable Model-agnostic Explanations)

Model-agnostic – works with any classifier or regressor.

Fits a simple interpretable model locally around the prediction – can be unstable.

Faster than SHAP – suitable for real-time explanations on small datasets.

Supported in Azure ML via `TabularExplainer` and `TextExplainer`.

Primarily produces local explanations; global explanations require aggregating many local ones.
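The last point in each list, that global explanations can be built by aggregating local ones, can be shown with a linear model, where the local attribution of a feature (assuming independent features) is its weight times the feature's deviation from the mean. The weights and rows below are hypothetical, and the mean absolute attribution is one common aggregation choice, not the only one.

```python
def local_attributions(weights, row, means):
    """Per-feature contribution for one row of a linear model."""
    return [w * (v - m) for w, v, m in zip(weights, row, means)]


def global_importance(all_local):
    """Mean absolute local attribution per feature."""
    n = len(all_local)
    return [sum(abs(r[j]) for r in all_local) / n
            for j in range(len(all_local[0]))]


weights = [2.0, -1.0]                       # hypothetical linear model
rows = [[1.0, 10.0], [3.0, 10.0], [5.0, 10.0]]
means = [sum(col) / len(rows) for col in zip(*rows)]

local = [local_attributions(weights, r, means) for r in rows]
overall = global_importance(local)
print(local, overall)
```

Two things stand out in this toy data. The second feature has a nonzero weight but zero global importance because it never varies, and the first feature is the most important globally yet contributes nothing to the middle row, which sits exactly at the mean. That is why a global ranking should not be read as the explanation for any individual prediction.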

Watch Out for These

Mistake: Transparency and explainability are the same thing.

Correct: Transparency is broader: it includes documentation, data lineage, model cards, and audit trails. Explainability is a subset of transparency focused on interpreting individual predictions.

Mistake: Only complex models need explanations.

Correct: All models benefit from explanations for trust and compliance. Even a linear regression model's coefficients can be misinterpreted without proper context.

Mistake: SHAP and LIME produce identical explanations.

Correct: SHAP provides consistent, game-theoretic feature attributions but is computationally expensive. LIME is faster but can be unstable; explanations may vary with small input changes.

Mistake: The Responsible AI Dashboard only shows feature importance.

Correct: It also includes fairness metrics, error analysis, counterfactual what-if analysis, and model performance breakdowns.

Mistake: Explainability is only needed during model development.

Correct: Explanations must be generated in production for every prediction that requires auditing. Azure ML supports on-demand and batch explanation generation.


Frequently Asked Questions

What is the difference between transparency and explainability in AI?

Transparency refers to the openness of the entire AI system – including documentation of data, model architecture, training process, and limitations. Explainability is a subset that focuses on providing understandable reasons for specific decisions. For example, a model card is a transparency artifact, while a SHAP explanation for a loan denial is an explainability output.

Which Azure tool provides model interpretability and fairness assessment?

The Responsible AI Dashboard in Azure Machine Learning. It integrates model interpretability (using SHAP and LIME), fairness metrics (e.g., demographic parity), error analysis, and counterfactual what-if analysis into a single view.

How do I generate explanations for a model deployed in Azure ML?

Use the `azureml-interpret` Python SDK. Install it, create an explainer object (e.g., `TabularExplainer`), call `explain_global()` or `explain_local()`, and upload the explanation to your workspace using `ExplanationClient`. You can then visualize it in the Explanation Dashboard.

What is a counterfactual explanation on the AI-900 exam?

A counterfactual explanation shows how the input would need to change to get a different prediction. For example, 'If your income were $10,000 higher, your loan would be approved.' Azure's Responsible AI Dashboard includes a What-If tool to generate these.

Why is explainability important for responsible AI?

Explainability builds trust, enables bias detection, and supports regulatory compliance. Without it, users cannot understand or challenge automated decisions, and developers cannot debug or improve models.

Can deep learning models be explained?

Yes, using model-agnostic methods like LIME or kernel-based SHAP, or model-specific ones such as `DeepExplainer`, which Azure ML supports for neural networks. However, explanations may be less faithful than for simpler models.

What is the difference between global and local explainability?

Global explainability describes the overall behavior of the model – which features are most important across all predictions. Local explainability explains a single prediction – which features contributed to that specific outcome.
