
Quick Answer

The answer is Azure AI Video Indexer. It is designed to ingest large collections of videos, extract metadata such as transcripts, faces, emotions, and keyframes, and provide searchable insights at scale. Unlike other Azure AI services, Video Indexer combines multiple AI models (speech, vision, and language) into a single pipeline optimized for video content. On the AI-900 exam, this question tests your ability to match a service to its core function, and it often appears as a scenario where you must choose between Video Indexer, Custom Vision, and Azure AI Search (formerly Azure Cognitive Search). A common trap is selecting Azure Video Analyzer, which has been retired, or Azure Media Services, which focuses on encoding and streaming rather than insight extraction. Memory tip: think “Video Indexer = Video + Index + Insights”; the name itself tells you it indexes videos and extracts insights.

AI-900 Practice Question: Describe features of computer vision workloads on Azure

This AI-900 practice question tests your understanding of the exam objective “Describe features of computer vision workloads on Azure”. Match the stated requirement to the specific Azure AI service; several options are valid services in isolation but do not fit this scenario. After answering, compare your reasoning against the explanation and wrong-answer breakdown below to see why each distractor is designed to mislead on exam day.

Which Azure AI service is used to index and extract insights from large collections of videos at scale?

Question 1 · medium · multiple choice

Correct answer & explanation

Azure AI Video Indexer

Azure AI Video Indexer is the correct service because it is specifically designed to ingest large collections of videos, extract metadata (such as transcripts, faces, emotions, and keyframes), and provide searchable insights at scale. Unlike other Azure AI services, Video Indexer combines multiple AI models (speech, vision, and language) into a single pipeline optimized for video content, making it the appropriate choice for indexing and extracting insights from video libraries.

Key principle: Answer the scenario, not the keyword: identify the specific constraint before choosing the most familiar-sounding option.

Answer analysis

Option-by-option breakdown

For each option: why learners choose it and why it is or isn't the right answer here.

  • Azure AI Custom Vision

    Why it's wrong here

    Custom Vision trains custom image classification and object detection models on still images; it does not index video or extract transcripts, faces, and topics from it.

  • Azure AI Video Indexer

    Why this is correct

    Video Indexer extracts transcripts, faces, topics, scenes, and more from videos automatically, making video libraries searchable.

    Related concept

    Read the scenario before looking for a memorised answer.

  • Azure Blob Storage media services

    Why it's wrong here

    Blob Storage hosts video files — Video Indexer applies AI to extract insights from the video content.

  • Azure AI Speech transcription only

    Why it's wrong here

    Speech transcription is just one component — Video Indexer integrates many AI capabilities for comprehensive video understanding.

Common exam traps

Common exam trap: answer the scenario, not the keyword

The trap here is that candidates confuse Azure AI Video Indexer with Azure AI Speech transcription only, assuming that extracting insights from video is solely about transcribing audio, when in fact Video Indexer combines speech, vision, and language AI to provide comprehensive video insights.

Detailed technical explanation

How to think about this question

Under the hood, Azure AI Video Indexer uses a multi-modal pipeline that runs speech-to-text (via Azure AI Speech), optical character recognition (OCR), face detection (via Azure AI Face), and other computer vision models to generate a rich index of time-stamped metadata. It also supports custom language models and brand detection, enabling enterprises to search for specific spoken phrases, on-screen text, or people across thousands of hours of video. A real-world scenario is a media company indexing a decade of news footage to instantly retrieve clips where a specific politician spoke about climate change.
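To make the idea of a time-stamped, multi-modal index concrete, here is a minimal Python sketch of the news-footage scenario. It does not call the Video Indexer service; the `Insight` record, the `find_clips` helper, the video IDs, and the sample data are all hypothetical and only loosely modelled on the kind of metadata described above. It assumes each insight carries a start and end time in seconds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insight:
    """One time-stamped piece of metadata (hypothetical, simplified shape)."""
    video_id: str
    kind: str     # "transcript", "ocr", or "face"
    value: str    # spoken text, on-screen text, or a person's name
    start: float  # seconds from the start of the video
    end: float


def find_clips(insights, person, phrase):
    """Return (video_id, start, end) spans where `person` is on screen
    while `phrase` is being spoken in the same video."""
    phrase = phrase.lower()
    faces = [i for i in insights if i.kind == "face" and i.value == person]
    spoken = [i for i in insights
              if i.kind == "transcript" and phrase in i.value.lower()]
    clips = []
    for face in faces:
        for line in spoken:
            if face.video_id != line.video_id:
                continue
            # Keep only the time range where both insights overlap.
            start, end = max(face.start, line.start), min(face.end, line.end)
            if start < end:
                clips.append((face.video_id, start, end))
    return sorted(clips)


# Invented sample data standing in for an indexed archive.
INDEX = [
    Insight("news-2015-03", "face", "Jane Doe", 10.0, 45.0),
    Insight("news-2015-03", "transcript", "We must act on climate change now.", 30.0, 36.0),
    Insight("news-2015-03", "ocr", "BREAKING NEWS", 0.0, 12.0),
    Insight("news-2019-11", "transcript", "Climate change talks resume.", 5.0, 9.0),
    Insight("news-2019-11", "face", "Jane Doe", 60.0, 80.0),
]

print(find_clips(INDEX, "Jane Doe", "climate change"))
```

Only the first video yields a clip, because the second mentions the phrase while the person is not on screen. The point of the sketch is that speech-only or vision-only output could not answer this query; it needs both modalities aligned on a shared timeline.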

Key Concepts to Remember

  • Read the scenario before looking for a memorised answer.
  • Find the constraint that changes the correct option.
  • Eliminate answers that are true in general but not in this case.

Exam Day Tips

  • Watch for words such as best, first, most likely and least administrative effort.
  • Review why wrong options are wrong, not only why the correct option is correct.

Key takeaway

Answer the scenario, not the keyword: identify the specific constraint before choosing the most familiar-sounding option.

Real-world example

How this comes up in practice

A media company stores terabytes of video archives and needs to find specific clips without watching the footage. Indexing the archive with Azure AI Video Indexer makes spoken words, on-screen text, and recognised faces searchable by timestamp. Questions like this test whether you can tell a service that extracts insights from video apart from services that only store files, transcribe audio, or classify still images.

What to study next

Got this wrong? Here's your next step.

Identify which exam domain this question belongs to, review the core concept, then practise similar questions from the same domain.

FAQ

Questions learners often ask

What does this AI-900 question test?

This question tests the objective “Describe features of computer vision workloads on Azure”, specifically matching a video-insight requirement to Azure AI Video Indexer. Read the scenario before looking for a memorised answer.

What is the correct answer to this question?

The correct answer is: Azure AI Video Indexer — Azure AI Video Indexer is the correct service because it is specifically designed to ingest large collections of videos, extract metadata (such as transcripts, faces, emotions, and keyframes), and provide searchable insights at scale. Unlike other Azure AI services, Video Indexer combines multiple AI models (speech, vision, and language) into a single pipeline optimized for video content, making it the appropriate choice for indexing and extracting insights from video libraries.

What should I do if I get this AI-900 question wrong?

Identify which exam domain this question belongs to, review the core concept, then practise similar questions from the same domain.

What is the key concept behind this question?

Read the scenario before looking for a memorised answer.

About these practice questions

Courseiva creates original exam-style practice questions with explanations and wrong-answer analysis. It does not publish real exam questions, exam dumps, or protected exam content.

Same concept, more angles

2 more ways this is tested on AI-900

These questions test the same concept from different angles. Work through them to make sure you can recognise it however the exam phrases it.

Variation 1. What is Azure AI Video Indexer, and what insights does it extract?

medium
  • A. A tool that compresses videos to reduce storage costs in Azure Blob Storage
  • B. A service that extracts transcripts, faces, speakers, topics, and scenes from video content
  • C. A database index that speeds up queries on video metadata tables
  • D. A tool for creating video presentations from a series of images and text

Why B: Azure AI Video Indexer is a cloud-based service that uses AI to analyze video and audio content. It extracts insights such as transcripts (speech-to-text), identified faces, speaker diarization, topics, scenes, and sentiment, which makes it a media intelligence tool rather than a storage service or a database index.

Variation 2. What is the purpose of Azure AI Video Indexer's transcript feature?

medium
  • A. To translate video subtitles into multiple languages
  • B. To automatically convert speech in videos to searchable text with timestamps
  • C. To generate written scripts for producing new videos
  • D. To extract text visible in video frames (on-screen text)

Why B: Azure AI Video Indexer's transcript feature uses automatic speech recognition (ASR) to convert spoken audio in videos into a text transcript, which is then indexed with precise timestamps for each word or phrase. This enables users to search, navigate, and analyze video content by keyword or phrase, making the video's audio content fully searchable and accessible.
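As a rough illustration of why timestamps make a transcript searchable, the Python sketch below builds a small keyword index from transcript segments. It is not how the service is implemented; the `build_keyword_index` helper, the segment tuples, and the lecture IDs are hypothetical, and the sketch assumes each segment is a `(video_id, start_seconds, text)` triple.

```python
import re
from collections import defaultdict


def build_keyword_index(segments):
    """Map each lowercase word to the (video_id, start_seconds) points
    of the transcript segments in which it is spoken."""
    index = defaultdict(list)
    for video_id, start, text in segments:
        # Use a set so a word repeated within one segment is listed once.
        for word in set(re.findall(r"[a-z0-9']+", text.lower())):
            index[word].append((video_id, start))
    return index


# Invented transcript segments: (video_id, start_seconds, text).
SEGMENTS = [
    ("lecture-01", 0.0, "Welcome to the course on computer vision."),
    ("lecture-01", 12.5, "Computer vision models analyse images."),
    ("lecture-02", 3.0, "Speech services convert audio to text."),
]

index = build_keyword_index(SEGMENTS)
print(index.get("vision", []))
```

A lookup returns every point in the library where the word is spoken, so a player could jump straight to those moments instead of scrubbing through the video.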

Last reviewed: Jun 11, 2026


This AI-900 practice question is part of Courseiva's free Microsoft certification practice question bank. Courseiva provides original exam-style practice questions with explanations, topic-based practice, mock exams, readiness tracking, and study analytics to help learners prepare for the AI-900 exam.