A healthcare provider wants to use AI to analyze unstructured medical records — scanned documents with handwritten notes and printed text — to extract diagnosis codes for billing. Which combination of Google Cloud AI products most directly addresses this document understanding use case?
Document AI is Google Cloud's specialized service for intelligent document processing. It handles complex documents that mix handwritten and printed content, extracts structured fields, and offers pretrained processors for specific document types. Vision API provides foundational OCR capabilities. Together they cover the document understanding pipeline from raw scan to extracted structured data.
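To make the "raw scan to extracted structured data" step concrete, here is a minimal, standard-library-only sketch of what happens after OCR. It does not call Document AI or Vision API. The `OcrLine` type, the confidence scores, the `extract_codes` helper, and the 0.8 threshold are all hypothetical, and the pattern assumes the diagnosis codes look like ICD-10-CM codes (a letter, a digit, an alphanumeric character, then an optional dot and up to four more characters), which the question itself does not specify.

```python
import re
from dataclasses import dataclass

# Assumption: codes are ICD-10-CM-shaped, e.g. "E11.9" or "J45.909".
# This is a shape check only; it does not validate against a real code list.
ICD10_LIKE = re.compile(r"\b[A-Z][0-9][0-9A-Z](?:\.[0-9A-Z]{1,4})?\b")


@dataclass
class OcrLine:
    """Hypothetical stand-in for one line of text returned by an OCR step."""
    text: str
    confidence: float  # hypothetical 0..1 score reported by the OCR step


def extract_codes(lines, min_confidence=0.8):
    """Split code-shaped tokens into accepted codes and codes needing human review.

    Codes found on lines below min_confidence (for example, hard-to-read
    handwriting) are routed to review instead of being accepted.
    """
    accepted, needs_review = [], []
    for line in lines:
        for code in ICD10_LIKE.findall(line.text):
            if line.confidence >= min_confidence:
                accepted.append(code)
            else:
                needs_review.append(code)
    return accepted, needs_review


# Illustrative input only; these lines are made up for the example.
sample = [
    OcrLine("Dx: E11.9 type 2 diabetes", 0.97),  # printed text
    OcrLine("hx asthma J45.909", 0.62),          # handwritten note
    OcrLine("Follow up in 3 months", 0.99),      # no code present
]
accepted, needs_review = extract_codes(sample)
```

In this sketch the printed code lands in `accepted` and the low-confidence handwritten one in `needs_review`. A managed document-processing service is meant to replace this kind of hand-written matching with trained extraction, which is why the analytics, orchestration, and text-only options in the question fall short.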
Why this answer
Option B is correct because Document AI is purpose-built for extracting structured information, such as diagnosis codes, from unstructured documents. It combines OCR with layout understanding and handles both handwritten and printed text. Vision API complements it with foundational OCR for scanned images, so the pair maps directly onto the healthcare provider's document understanding use case.
Exam trap
Candidates may confuse general-purpose AI services (such as Translation API or Natural Language API) with specialized document understanding tools, or assume that an ML pipeline tool (such as Vertex AI Pipelines) can extract data from scanned documents on its own, without OCR and layout analysis.
How to eliminate wrong answers
Option A is wrong because BigQuery ML and Looker Studio are analytics and visualization tools. They are not designed for OCR or for extracting information from scanned documents, and they only become useful once the data has already been extracted.
Option C is wrong because Vertex AI Pipelines orchestrates ML workflows and Cloud Dataflow runs data processing pipelines. Neither performs document understanding or extraction on scanned medical records by itself.
Option D is wrong because Cloud Translation API and Natural Language API handle translation and text analysis. Both work on text that has already been digitized, so they provide no OCR for handwritten notes and cannot extract structured diagnosis codes from scanned documents.