A company needs to stream data from a fleet of IoT devices to BigQuery for near-real-time analytics. The data volume is unpredictable and can spike during certain events. Which Google Cloud service should be used as the ingestion point to handle variable throughput with minimal operational overhead?
Cloud Pub/Sub ingests variable-volume data and decouples producers from consumers.
Why this answer
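Cloud Pub/Sub is the correct choice because it is a fully managed messaging service that decouples data producers from consumers and scales automatically, so unpredictable, spiky throughput needs no manual capacity management. It can ingest millions of messages per second and durably retains each message until a subscriber acknowledges it (within the configured retention period). A downstream consumer, such as a Dataflow pipeline or a BigQuery subscription, can therefore write to BigQuery at its own pace, which supports near-real-time analytics with minimal operational overhead.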
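To make the buffering argument concrete, here is a minimal sketch of a toy model written in plain Python. It does not call the Pub/Sub API. The function name `simulate`, the per-tick arrival list, and the fixed `consumer_rate` are all hypothetical and chosen only for illustration. It compares a bursty producer feeding a fixed-rate consumer with and without a buffer in between.

```python
def simulate(arrivals, consumer_rate, buffered):
    """Toy model: messages arrive per tick; the consumer handles at most
    `consumer_rate` messages per tick. Not a model of real Pub/Sub internals."""
    backlog = 0        # messages waiting to be processed
    delivered = 0      # messages the consumer has processed
    dropped = 0        # messages lost because nothing could hold them
    peak_backlog = 0   # largest backlog observed right after arrivals

    for incoming in arrivals:
        if buffered:
            # A durable buffer accepts everything and holds the excess.
            backlog += incoming
        else:
            # Without a buffer, anything beyond this tick's capacity is lost.
            accepted = min(incoming, consumer_rate)
            dropped += incoming - accepted
            backlog += accepted

        peak_backlog = max(peak_backlog, backlog)
        processed = min(backlog, consumer_rate)
        backlog -= processed
        delivered += processed

    # After arrivals stop, the consumer keeps draining whatever is buffered.
    delivered += backlog
    return {"delivered": delivered, "dropped": dropped, "peak_backlog": peak_backlog}


if __name__ == "__main__":
    spike = [10, 10, 200, 10, 10]  # hypothetical traffic with one burst
    print("buffered:  ", simulate(spike, consumer_rate=50, buffered=True))
    print("unbuffered:", simulate(spike, consumer_rate=50, buffered=False))
```

In this toy run the buffered path delivers all 240 messages by absorbing the burst as backlog, while the unbuffered path drops the 150 messages that exceed the consumer's capacity during the spike. In this simplified picture, the buffered path corresponds to what a managed buffer such as Pub/Sub provides.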
Exam trap
Exams often test the misconception that Cloud Functions can serve as the direct ingestion point for streaming data. Cloud Functions does scale automatically, but it is a compute service with no durable message buffer: if invocations fail or hit instance limits during a spike, the sender must retry or the data is lost. Pub/Sub is the correct decoupling layer because it holds messages until they are successfully processed.
How to eliminate wrong answers
Option A is wrong because Cloud Datastore is a NoSQL document database for storing structured data, not a streaming ingestion service; it cannot act as a message queue or buffer throughput spikes.
Option B is wrong because Cloud Functions is a serverless compute platform for event-driven code execution, not a durable ingestion buffer; it scales automatically, but it has no built-in buffering, so handling spikes without data loss would require custom retry or queuing logic.
Option C is wrong because Cloud Storage is an object storage service suited to batch data, not near-real-time streaming ingestion; it adds latency and requires additional components (e.g., Cloud Functions or Pub/Sub notifications) to trigger downstream processing.