
Google Cloud PDE: Storing Data Questions

25 of 100 questions · Page 2/2 · PDE Storing Data topic · Answers revealed


76

MCQ · easy

A team wants to store semi-structured user profile data for a web application. The data is accessed via a REST API and requires security rules to control read/write access. Which database fits best?

A. BigQuery

B. Firestore

C. Cloud SQL

D. Cloud Bigtable

Answer: B

Document NoSQL with Security Rules and REST API.

Why this answer

Firestore is a NoSQL document database that natively stores semi-structured data (JSON-like documents) and integrates directly with Firebase Authentication and security rules to control read/write access per document or collection. Its REST API support and real-time capabilities make it ideal for web application user profiles that require flexible schemas and fine-grained access control.
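As a rough illustration of the REST access pattern described above, the sketch below builds the resource URL and typed JSON body for a single profile document. The project ID, collection name, and field names are hypothetical, and the helper only handles strings, integers, and booleans; it is a sketch under those assumptions, not a complete client.

```python
from urllib.parse import quote

FIRESTORE_HOST = "https://firestore.googleapis.com/v1"


def document_url(project_id: str, collection: str, doc_id: str) -> str:
    """Build the REST resource URL for one document in the default database."""
    path = f"projects/{project_id}/databases/(default)/documents/{collection}/{doc_id}"
    # Keep slashes and the parentheses of "(default)" unescaped.
    return f"{FIRESTORE_HOST}/{quote(path, safe='/()')}"


def to_firestore_fields(profile: dict) -> dict:
    """Wrap plain Python values in Firestore's typed-value JSON envelope."""
    fields = {}
    for key, value in profile.items():
        if isinstance(value, bool):  # test bool first: bool is a subclass of int
            fields[key] = {"booleanValue": value}
        elif isinstance(value, int):
            fields[key] = {"integerValue": str(value)}  # int64 travels as a string
        elif isinstance(value, str):
            fields[key] = {"stringValue": value}
        else:
            raise TypeError(f"unsupported type for field {key!r}")
    return {"fields": fields}


# Hypothetical profile used only for illustration.
url = document_url("demo-project", "users", "alice")
body = to_firestore_fields({"displayName": "Alice", "age": 30, "verified": True})
```

Whether a given caller may read or write this document is then decided by the security rules, not by the client code.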

Exam trap

The exam often tests the distinction between databases optimized for different workloads (document vs. wide-column vs. analytical). The trap here is assuming that any NoSQL database (like Bigtable) suits semi-structured user profiles without considering the need for built-in security rules and REST API integration.

How to eliminate wrong answers

Option A is wrong because BigQuery is a serverless data warehouse designed for analytical queries on large datasets, not for transactional REST API access with per-document security rules. Option C is wrong because Cloud SQL is a relational database (MySQL, PostgreSQL, SQL Server) that requires a fixed schema and does not natively support document-level security rules or semi-structured data without additional abstraction. Option D is wrong because Cloud Bigtable is a wide-column NoSQL database optimized for high-throughput, low-latency time-series or analytical workloads, not for semi-structured user profiles with fine-grained access control via REST API.


77

MCQ · easy

A company needs a fully managed, globally distributed relational database with strong consistency, external consistency, and 99.999% SLA for a financial transaction processing system. Which Google Cloud service should they use?

A. Firestore

B. Cloud Spanner

C. Bigtable

D. Cloud SQL

Answer: B

Cloud Spanner is globally distributed, strongly consistent, and offers 99.999% SLA.

Why this answer

Cloud Spanner is the correct choice because it is a fully managed, globally distributed relational database service that provides strong consistency, external consistency (true serializable transactions across regions), and a 99.999% SLA. These features are essential for a financial transaction processing system that requires ACID compliance and global scalability without sacrificing consistency.

Exam trap

The trap here is that candidates often confuse Cloud Spanner with Bigtable or Firestore because all three can replicate data across regions, but only Spanner combines the relational model with externally consistent transactions and the 99.999% SLA required for financial transactions.

How to eliminate wrong answers

Option A (Firestore) is wrong because it is a NoSQL document database, not a relational one; although its reads are strongly consistent, it does not offer the relational schema and SQL transactions the scenario calls for. Option C (Bigtable) is wrong because it is a wide-column NoSQL database designed for high-throughput workloads, not relational transactions; it guarantees atomicity only within a single row rather than multi-row ACID transactions. Option D (Cloud SQL) is wrong because it is a regional relational database service: the primary instance lives in one region, so it cannot provide globally distributed writes or a 99.999% SLA.


78

MCQ · medium

A team wants to use Cloud Storage to build a data lake with separate zones for raw, curated, and processed data. They need to automatically move objects older than 30 days from the raw zone to a cheaper storage class. How can they achieve this?

A. Set a bucket retention policy that forces deletion after 30 days

B. Write a Cloud Function to delete objects older than 30 days

C. Use gsutil rsync to move objects between buckets

D. Configure a Cloud Storage object lifecycle rule with SetStorageClass action

Answer: D

Lifecycle rules automate class transitions based on age.

Why this answer

Option D is correct because Cloud Storage object lifecycle management rules can automatically transition objects from one storage class to a cheaper one (e.g., from Standard to Nearline or Coldline) based on age. By configuring a rule with the `SetStorageClass` action and a `Condition` of `age: 30`, objects in the raw zone bucket older than 30 days are moved to a lower-cost class without manual intervention or additional compute services.
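The rule described above can be sketched as a lifecycle configuration document. The snippet below builds it as a Python dictionary and prints the JSON; the choice of `NEARLINE` as the target class is an assumption for illustration, and applying the file to a bucket is left to whatever tool you normally use.

```python
import json


def storage_class_rule(target_class: str, min_age_days: int) -> dict:
    """One lifecycle rule: change the storage class once an object reaches the given age."""
    return {
        "action": {"type": "SetStorageClass", "storageClass": target_class},
        "condition": {"age": min_age_days},  # age is measured in days since creation
    }


# Raw-zone policy: move objects to a cheaper class after 30 days (class is an assumption).
lifecycle_config = {"rule": [storage_class_rule("NEARLINE", 30)]}

print(json.dumps(lifecycle_config, indent=2))
```

Keeping the rule in a small function makes it easy to add a second tier later, for example a Coldline rule at a higher age.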

Exam trap

The exam often tests the distinction between lifecycle rules that change storage class and retention policies that enforce immutability, and candidates may confuse 'move to cheaper storage' with 'delete' or 'retain'.

How to eliminate wrong answers

Option A is wrong because a retention policy prevents deletion or modification of objects until the retention period expires; it does not move objects to a cheaper storage class and would lock the data, not transition it. Option B is wrong because while a Cloud Function could delete objects, the requirement is to move them to a cheaper storage class, not delete them; using a function for this is also less efficient and more complex than a native lifecycle rule. Option C is wrong because `gsutil rsync` synchronizes content between buckets but does not automatically trigger based on age; it requires manual or scheduled execution and does not natively support storage class transitions based on object age.


79

MCQ · medium

An organization needs to restrict access to BigQuery and Cloud Storage so that data can only be accessed from within a specific VPC network and cannot be exfiltrated. Which Google Cloud feature should they use?

A. Private Service Access

B. VPC Service Controls

C. VPC firewall rules

D. IAM conditions

Answer: B

Creates a security perimeter to prevent data exfiltration.

Why this answer

VPC Service Controls (option B) is the correct choice because it creates a security perimeter around Google Cloud services like BigQuery and Cloud Storage, preventing data exfiltration even from within a VPC. It enforces context-aware access based on the VPC network, ensuring data can only be accessed from authorized VPC sources and blocking unauthorized transfers outside the perimeter.

Exam trap

The trap here is that candidates confuse VPC firewall rules (which control network traffic) with VPC Service Controls (which control data access at the API layer), leading them to choose firewall rules because they think 'restricting access to a VPC' is purely a network-level concern.

How to eliminate wrong answers

Option A is wrong because Private Service Access is used to enable private connectivity from a VPC to Google-managed services (e.g., Cloud SQL, Memorystore) via internal IPs, but it does not provide exfiltration prevention or restrict data movement across services. Option C is wrong because VPC firewall rules control network traffic at the packet level (IP addresses, ports, protocols) but cannot prevent data exfiltration via API calls or service-to-service transfers, as they operate at layers 3/4, not at the application layer. Option D is wrong because IAM conditions allow fine-grained access control based on attributes like IP address or time, but they do not create a perimeter around services; they can restrict who can call an API but cannot block data movement between services or prevent exfiltration via authorized credentials.


80

Multi-Select · hard

You are designing a Cloud Spanner schema for a global e-commerce application. The database will include a Customers table and an Orders table. To optimise performance for queries that join Customers with their Orders, which THREE design choices are recommended? (Choose 3.)

Select 3 answers

A. Use a single table for both Customers and Orders and filter by CustomerId.

B. Use interleaved tables: make Orders an interleaved table under Customers.

C. Denormalise the schema by embedding order details into the Customers table as repeated fields.

D. Include CustomerId as the first part of the Orders primary key.

E. Create a secondary index on OrderDate in the Orders table.

Answers: B, D, E

Interleaved tables store Orders rows near their parent Customer, improving join performance.

Why this answer

Spanner interleaved tables store child rows physically close to parent rows, improving join performance. Including the parent's primary key as the first part of the child's primary key is required for interleaving. Secondary indexes on non-key columns are also needed for efficient lookups.
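The three choices can be sketched as schema statements in Spanner's GoogleSQL dialect, held here as Python strings so they can be passed to whatever client or tool applies DDL. The column names other than `CustomerId` and `OrderDate`, the index name, and the `ON DELETE CASCADE` choice are assumptions made for illustration.

```python
# Parent table: one row per customer.
CUSTOMERS_DDL = """
CREATE TABLE Customers (
  CustomerId INT64 NOT NULL,
  Name       STRING(MAX)
) PRIMARY KEY (CustomerId)
"""

# Child table: the key starts with the parent's key, and the table is interleaved
# so each customer's orders are stored next to the customer row.
ORDERS_DDL = """
CREATE TABLE Orders (
  CustomerId INT64 NOT NULL,
  OrderId    INT64 NOT NULL,
  OrderDate  DATE
) PRIMARY KEY (CustomerId, OrderId),
  INTERLEAVE IN PARENT Customers ON DELETE CASCADE
"""

# Secondary index for lookups by a non-key column.
ORDER_DATE_INDEX_DDL = "CREATE INDEX OrdersByOrderDate ON Orders(OrderDate)"


def schema_statements() -> list[str]:
    """Return the DDL statements in the order they must be applied (parent first)."""
    return [s.strip() for s in (CUSTOMERS_DDL, ORDERS_DDL, ORDER_DATE_INDEX_DDL)]


for statement in schema_statements():
    print(statement, end="\n\n")
```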


81

MCQ · medium

A Cloud SQL instance for PostgreSQL is experiencing heavy read traffic. The team wants to offload read queries while maintaining data consistency. Which solution meets their needs?

A. Create read replicas and direct read queries to them.

B. Increase the tier of the Cloud SQL instance.

C. Create a Cloud SQL failover replica.

D. Use Cloud Memorystore as a cache in front of Cloud SQL.

Answer: A

Read replicas handle read traffic, reducing load on the primary instance.

Why this answer

Read replicas in Cloud SQL for PostgreSQL offload read traffic from the primary instance: you create one or more asynchronous replicas and direct read queries to them. Replicas use PostgreSQL's native streaming replication, so every transaction committed on the primary is applied to the replica in commit order. Because replication is asynchronous, a replica can briefly lag behind the primary, but it always presents a transactionally consistent snapshot for read operations.
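On the application side, offloading reads usually means choosing a connection target per statement. The sketch below shows one simple way to do that with round-robin selection; the class name, the endpoint labels, and the "starts with SELECT" heuristic are all assumptions for illustration, and real code would also send statements such as `SELECT ... FOR UPDATE` to the primary.

```python
from itertools import cycle


class ReadWriteRouter:
    """Pick a connection target: replicas for plain reads, the primary for everything else."""

    def __init__(self, primary: str, replicas: list[str]):
        if not replicas:
            raise ValueError("at least one read replica is required")
        self.primary = primary
        self._replicas = cycle(replicas)  # simple round-robin over the replicas

    def target_for(self, sql: str) -> str:
        # Simplified heuristic: treat statements beginning with SELECT as reads.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._replicas) if is_read else self.primary


# Hypothetical endpoint labels standing in for real connection strings.
router = ReadWriteRouter("primary", ["replica-1", "replica-2"])
```

Reads that must observe a write made a moment earlier should still go to the primary, since a replica may not have applied it yet.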

Exam trap

The exam often tests the distinction between read replicas (for offloading reads) and failover replicas (for high availability), so candidates may confuse the two and incorrectly select the failover replica option.

How to eliminate wrong answers

Option B is wrong because increasing the tier of the Cloud SQL instance only scales the primary instance vertically, which does not offload read traffic — it simply gives the same instance more resources, which may not be sufficient under heavy read loads and does not separate read and write workloads. Option C is wrong because a Cloud SQL failover replica is designed for high availability and automatic failover in case of primary instance failure, not for offloading read queries; it is a synchronous standby that does not serve read traffic. Option D is wrong because Cloud Memorystore as a cache can reduce read load but does not maintain strong data consistency with Cloud SQL; cached data may become stale, and the cache is not a direct offload of read queries from the database — it introduces eventual consistency and requires application-level cache invalidation logic.


82

MCQ · hard

A company is migrating an on-premises PostgreSQL database to Google Cloud. The database runs complex analytical queries mixed with OLTP workloads. They need PostgreSQL compatibility and want to improve analytical query performance without changing the application. Which database should they choose?

A. BigQuery

B. Cloud Spanner

C. Cloud SQL for PostgreSQL

D. AlloyDB

Answer: D

AlloyDB is PostgreSQL-compatible and uses a columnar engine to accelerate analytical queries while supporting OLTP.

Why this answer

AlloyDB is the correct choice because it is a fully managed PostgreSQL-compatible database service specifically designed for demanding transactional and analytical workloads. It combines the PostgreSQL ecosystem with a columnar engine and adaptive caching; Google cites analytical queries running up to 100x faster than on standard PostgreSQL, all without requiring application changes. This makes it ideal for mixed OLTP and complex analytical queries while maintaining PostgreSQL compatibility.

Exam trap

The trap here is that candidates often choose Cloud SQL for PostgreSQL because it is the most familiar PostgreSQL option, overlooking that AlloyDB is specifically engineered for mixed OLTP and analytical workloads with PostgreSQL compatibility, while Cloud SQL lacks the advanced analytical acceleration features.

How to eliminate wrong answers

Option A is wrong because BigQuery is a serverless data warehouse that is not PostgreSQL-compatible and requires application changes to use its SQL dialect; it is designed for large-scale analytics, not OLTP workloads. Option B is wrong because Cloud Spanner is a globally distributed, strongly consistent relational database; although it offers a PostgreSQL interface, that interface is not a drop-in replacement for a PostgreSQL server, so a migration typically requires application changes, and Spanner is optimized for horizontal scalability and high availability rather than for accelerating analytical queries on mixed workloads. Option C is wrong because Cloud SQL for PostgreSQL is a fully managed PostgreSQL service but lacks the built-in columnar engine and adaptive caching needed to significantly accelerate complex analytical queries; it is best suited for standard OLTP workloads, not mixed analytical and transactional demands.


83

MCQ · hard

A data engineer is designing a Bigtable row key for a time-series application that records temperature sensor readings every second. To avoid hotspotting, they want to distribute writes across all nodes. Which row key design is best?

A. [timestamp reversed]#[sensor_id]

B. [sensor_id]#[timestamp]

C. [hash of sensor_id]#[timestamp]

D. [timestamp]#[sensor_id]

Answer: C

Hashing distributes writes across tablets, avoiding hotspotting.

Why this answer

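Hotspotting occurs when sequential row keys send all writes to a single tablet server. Prepending a hash of the sensor ID spreads writes evenly across the key space while keeping each sensor's readings contiguous and ordered by time.

A timestamp at the front of the key, whether raw or reversed, changes monotonically and therefore causes hotspotting. A key of sensor ID plus timestamp can still concentrate writes if sensor IDs are sequential or low in cardinality.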


84

MCQ · hard

You are designing a row key for Cloud Bigtable to store user activity logs. Each log entry has a timestamp (millisecond precision) and a user ID. There will be millions of writes per second from many users. To avoid hotspotting, which row key design is BEST?

A. timestamp_millis#hash(userID)

B. timestamp_millis#userID

C. userID#timestamp_millis

D. hash(userID)#userID#timestamp_millis

Answer: D

Hashing the userID distributes writes evenly. Including userID and timestamp enables efficient queries per user over time.

Why this answer

Option D is best because it uses a hash of the user ID as the row key prefix, which distributes writes across all Bigtable nodes and avoids hotspotting. Appending the user ID and timestamp ensures uniqueness and supports efficient queries for a specific user's logs. This design prevents the sequential timestamp from creating a single hot node, which is critical for handling millions of writes per second.
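A key in the shape of Option D can be sketched as follows. The use of MD5, the four-character prefix length, and the zero-padded 13-digit timestamp are assumptions chosen for illustration; the question does not prescribe any of them.

```python
import hashlib


def user_prefix(user_id: str, prefix_len: int = 4) -> bytes:
    """Row-key prefix shared by all of one user's rows: hash(userID)#userID#."""
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}#{user_id}#".encode("utf-8")


def activity_row_key(user_id: str, timestamp_millis: int, prefix_len: int = 4) -> bytes:
    """Full row key: hash(userID)#userID#timestamp_millis.

    The timestamp is zero-padded so that byte-wise (lexicographic) order
    matches chronological order within a user's key range.
    """
    return user_prefix(user_id, prefix_len) + f"{timestamp_millis:013d}".encode("utf-8")


# Hypothetical user ID and timestamps.
earlier = activity_row_key("alice", 1_700_000_000_000)
later = activity_row_key("alice", 1_700_000_000_500)
```

Because the hash is derived only from the user ID, all rows for one user share a prefix, so a prefix scan with `user_prefix("alice")` returns that user's log entries in time order.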

Exam trap

The exam often tests the misconception that placing the most selective or unique field first (like timestamp) is best for queries, but in Bigtable the row key design must prioritize write distribution over read optimization to avoid hotspotting.

How to eliminate wrong answers

Option A is wrong because placing the timestamp first causes all writes for the same millisecond to hit a single tablet server, creating a hotspot. Option B is wrong because using the raw timestamp as the prefix leads to sequential writes that overload one node, negating Bigtable's horizontal scaling. Option C is wrong here because, although a userID prefix spreads writes across users, sequential or predictable user IDs can still cluster writes in adjacent key ranges; the hash prefix in Option D evens out that skew.


85

Multi-Select · medium

A data engineer is designing a Cloud Bigtable schema for high-volume time-series data. Which TWO practices should they follow to avoid performance issues?

Select 2 answers

A. Place the timestamp as the first component of the row key

B. Create as many column families as possible

C. Use a hashed prefix in the row key to distribute writes

D. Group related columns into column families

E. Store all columns in a single column family

Answers: C, D

Hashing avoids sequential hot-spotting.

Why this answer

Using a hashed prefix to avoid hot-spotting and grouping related columns into column families are both recommended. Timestamp-first keys cause hot-spotting. Storing all columns in a single column family is inefficient, and a large number of column families adds overhead.


86

Multi-Select · easy

A company wants to use BigQuery for analytics. They need to meet compliance requirements by encrypting data at rest with a key they control. Which TWO actions should they take? (Choose 2.)

Select 2 answers

A. Set the Cloud KMS key as the default encryption key for the BigQuery dataset.

B. Create a Cloud Storage bucket and load data there.

C. Use VPC Service Controls to restrict access to the dataset.

D. Create a key ring and cryptographic key in Cloud KMS.

E. Enable BigQuery column-level encryption using AEAD functions.

Answers: A, D

Setting the dataset default encryption key encrypts all tables in the dataset with the CMEK.

Why this answer

BigQuery supports Customer-Managed Encryption Keys (CMEK) for encrypting data at rest. You need to create a Cloud KMS key and then set it as the default encryption key for a BigQuery dataset. All tables in that dataset will be encrypted with that key.
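The two steps can be sketched as data: the full resource name of the Cloud KMS key, and a dataset definition that points at it as the default encryption key. The project, location, key ring, key, and dataset names below are hypothetical; the dictionary mirrors the shape of a dataset resource and would be passed to whichever client or tool creates the dataset.

```python
def kms_key_name(project: str, location: str, key_ring: str, key: str) -> str:
    """Full resource name of a Cloud KMS key."""
    return f"projects/{project}/locations/{location}/keyRings/{key_ring}/cryptoKeys/{key}"


def dataset_with_default_cmek(project: str, dataset_id: str, location: str, key_name: str) -> dict:
    """Dataset definition whose new tables default to the given customer-managed key."""
    return {
        "datasetReference": {"projectId": project, "datasetId": dataset_id},
        "location": location,
        "defaultEncryptionConfiguration": {"kmsKeyName": key_name},
    }


# Hypothetical names used only for illustration.
key = kms_key_name("demo-project", "us", "analytics-ring", "bq-key")
dataset = dataset_with_default_cmek("demo-project", "analytics", "US", key)
```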


87

Multi-Select · medium

A company wants to build a reporting pipeline where data is collected from IoT devices, stored raw in Cloud Storage, and then processed into BigQuery for analytics. They need to ensure data is encrypted at rest using customer-managed keys. Which THREE steps should they take? (Choose 3.)

Select 3 answers

A. Delete the Cloud KMS key after data is loaded to BigQuery

B. Enable CMEK on the Cloud Storage bucket by specifying the KMS key

C. Configure the BigQuery dataset to use a CMEK key

D. Use Google-managed encryption keys

E. Create a key ring and key in Cloud Key Management Service

Answers: B, C, E

You can set a default KMS key for a bucket.

Why this answer

Option B is correct because enabling CMEK on a Cloud Storage bucket by specifying a KMS key ensures that all objects stored in the bucket are encrypted at rest using a customer-managed key, which meets the requirement for customer-managed encryption. This is done by setting the bucket's default encryption to use a specific Cloud KMS key, and any object uploaded without its own encryption key will inherit this setting.
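A bucket definition with a default KMS key can be sketched as below. The bucket and key names are hypothetical, and the location check reflects the general requirement that the key and the bucket share a location; treat the comparison logic as a simplified assumption rather than the service's exact validation.

```python
def key_location(kms_key_name: str) -> str:
    """Extract the location segment from a full Cloud KMS key resource name."""
    parts = kms_key_name.split("/")
    return parts[parts.index("locations") + 1]


def bucket_with_default_cmek(name: str, location: str, kms_key_name: str) -> dict:
    """Bucket definition whose objects default to the given customer-managed key."""
    # Simplified check: compare locations case-insensitively.
    if key_location(kms_key_name).lower() != location.lower():
        raise ValueError("KMS key and bucket must be in the same location")
    return {
        "name": name,
        "location": location,
        "encryption": {"defaultKmsKeyName": kms_key_name},
    }


# Hypothetical names used only for illustration.
KEY = "projects/demo-project/locations/us/keyRings/iot-ring/cryptoKeys/raw-key"
bucket = bucket_with_default_cmek("iot-raw-zone", "US", KEY)
```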

Exam trap

The exam often tests the misconception that you can delete the KMS key after encryption to save costs. The trap is that destroying the key permanently locks the data, making it unrecoverable and non-compliant with retention policies.


88

MCQ · medium

A company stores sensitive data in BigQuery and needs to encrypt certain columns with customer-managed encryption keys (CMEK) while using BigQuery's analytics capabilities. What should they do?

A. Store the sensitive data in Cloud Storage with CMEK and use external tables.

B. Use BigQuery column-level security with data classification.

C. Create a BigQuery table with CMEK enabled; it will automatically encrypt all columns.

D. Use the AEAD encryption functions in BigQuery to encrypt specific columns during query time.

Answer: D

AEAD functions allow column-level encryption/decryption with customer-managed keys, enabling granular control.

Why this answer

Option D is correct because BigQuery's AEAD encryption functions allow you to encrypt specific columns at query time using customer-managed keys, while still leveraging BigQuery's full analytics capabilities on the unencrypted portions of the data. This approach meets the requirement of encrypting certain columns with CMEK without losing the ability to run analytical queries on the rest of the table.

Exam trap

The trap here is that candidates confuse table-level CMEK encryption (which encrypts all data at rest) with the ability to selectively encrypt specific columns, leading them to choose Option C, when in fact BigQuery requires using AEAD functions for column-level encryption with customer-managed keys.

How to eliminate wrong answers

Option A is wrong because storing data in Cloud Storage with CMEK and using external tables does not encrypt specific columns within BigQuery; it encrypts the entire file at rest, and external tables cannot enforce column-level encryption natively. Option B is wrong because BigQuery column-level security with data classification controls access to columns via policy tags (optionally with data masking), but it does not encrypt the data with CMEK; it only restricts visibility. Option C is wrong because enabling CMEK on a BigQuery table encrypts the entire table at rest, not specific columns; you cannot selectively apply CMEK to only certain columns within a table.


89

MCQ · easy

Which BigQuery feature allows you to read data directly from Cloud Storage without loading it into BigQuery storage?

A. External tables

B. BI Engine

C. Federated queries

D. Authorized views

Answer: A

External tables reference data in Cloud Storage and can be queried directly.

Why this answer

External tables in BigQuery allow querying data stored in Cloud Storage (e.g., CSV, Parquet, ORC) without loading. Authorized views restrict access, federated queries allow querying other databases, and BI Engine is for acceleration.


90

MCQ · medium

An organization uses Cloud Storage to store backup files. They want to automatically delete files older than 90 days, and after deletion, move remaining files to Nearline storage if not accessed for 30 days. Which Cloud Storage feature should they configure?

A. Object Versioning

B. Retention Policies

C. Bucket Lock

D. Object Lifecycle Management

Answer: D

Lifecycle rules can delete objects after a specified age and change their storage class based on conditions such as age.

Why this answer

Object Lifecycle Management (D) is the correct feature because it lets you define rules that automatically transition objects to colder storage classes (such as Nearline) and delete objects after a set age. In this scenario, one rule deletes objects older than 90 days and another moves the remaining objects to Nearline after 30 days. Note that lifecycle conditions are based on object properties such as `age` rather than on last access time, so the 30-day rule is approximated with `age`; transitions driven by actual access patterns are what the Autoclass feature provides. Either way, the data management is automated without manual intervention.

Exam trap

The exam often tests the distinction between lifecycle management (which automates transitions and deletions) and retention-related features (like Bucket Lock or Retention Policies), so the trap here is that candidates confuse 'automatically deleting old files' with 'preventing deletion,' leading them to incorrectly choose a retention-focused option.

How to eliminate wrong answers

Option A is wrong because Object Versioning is used to preserve, retrieve, and restore every version of an object in a bucket, not to automate deletion or storage class transitions based on age or access patterns. Option B is wrong because Retention Policies are used to enforce a minimum retention period for objects, preventing their deletion or overwrite, which is the opposite of automatically deleting old files. Option C is wrong because Bucket Lock is a feature that locks a bucket's retention policy, making it immutable and preventing any changes to the retention settings; it does not provide automated lifecycle actions like deletion or storage class transitions.


91

Multi-Select · medium

A company is migrating an on-premises PostgreSQL database to Google Cloud. They need a fully managed database that is compatible with PostgreSQL and can handle both transactional and analytical workloads with high performance. Which two database services meet these requirements? (Choose TWO.)

Select 2 answers

A. Cloud Spanner

B. Cloud SQL for PostgreSQL

C. BigQuery

D. AlloyDB

E. Firestore

Answers: B, D

Fully managed, PostgreSQL-compatible, supports OLTP and some analytical queries.

Why this answer

Cloud SQL for PostgreSQL is a fully managed database service that is compatible with PostgreSQL, making it suitable for transactional workloads. AlloyDB is also a fully managed PostgreSQL-compatible database that is optimized for high performance on both transactional and analytical workloads, offering up to 100x faster query performance for analytical queries compared to standard PostgreSQL.

Exam trap

The exam often tests the distinction between general-purpose managed databases (Cloud SQL) and specialized high-performance databases (AlloyDB). The trap here is that candidates may think BigQuery or Spanner qualify because they support SQL, but BigQuery is an analytical warehouse that is not PostgreSQL-compatible, and Spanner, despite offering a PostgreSQL interface, is not a drop-in replacement for PostgreSQL.


92

MCQ · hard

You are designing a Cloud Storage bucket to hold sensitive financial documents that must not be deleted or overwritten for 7 years. After the retention period, the documents can be deleted automatically. Which configuration should you use?

A. Set a retention policy on the bucket for 7 years and enable Object Versioning.

B. Use Object Lock with WORM mode and set a retention period of 7 years. After the period, objects are automatically deleted.

C. Set a lifecycle rule to delete objects after 7 years and enable Bucket Lock.

D. Use Bucket Lock with a retention policy of 7 years and configure a lifecycle rule to delete objects after 7 years.

Answer: D

Bucket Lock retains objects for 7 years; lifecycle rule deletes them after that period.

Why this answer

Option D is correct because a bucket retention policy prevents objects from being deleted or overwritten until they reach the specified age, and Bucket Lock makes that policy permanent, giving WORM (Write Once, Read Many) protection. Setting a locked retention policy of 7 years enforces the required compliance hold. A lifecycle rule configured to delete objects after 7 years then removes each object automatically once its retention period expires.

This combination meets both the retention and automatic deletion requirements.
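The two halves of that configuration can be sketched as data. The snippet below derives the retention period in seconds and the lifecycle age in days from the same number of years; treating a year as 365 days is an assumption made here for simplicity, and locking the policy is a separate, irreversible step that is not shown.

```python
SECONDS_PER_DAY = 86_400


def retention_and_cleanup(years: int, days_per_year: int = 365) -> dict:
    """Bucket settings: retain objects for `years`, then let a lifecycle rule delete them.

    days_per_year=365 is a simplifying assumption for this sketch.
    """
    days = years * days_per_year
    return {
        # Retention period is expressed in seconds and serialized as a string.
        "retentionPolicy": {"retentionPeriod": str(days * SECONDS_PER_DAY)},
        # Lifecycle age is expressed in days since object creation.
        "lifecycle": {
            "rule": [{"action": {"type": "Delete"}, "condition": {"age": days}}]
        },
    }


config = retention_and_cleanup(7)
```

Deriving both values from one input keeps the deletion rule from firing before the retention period ends, in which case the delete would simply be rejected until the object becomes eligible.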

Exam trap

The exam often tests the misconception that a retention policy alone or a lifecycle rule alone can satisfy both requirements, when in fact you need a locked retention policy for the WORM protection and a lifecycle rule for the scheduled deletion.

How to eliminate wrong answers

Option A is wrong because a retention policy that is not locked can be shortened or removed by anyone with sufficient permissions, and neither the policy nor Object Versioning deletes objects automatically once the 7 years have passed. Option B is wrong because retention settings only block deletion and overwrite; they never delete objects on their own when the period ends, so a lifecycle rule is still required. Option C is wrong because a lifecycle rule alone cannot prevent deletion or overwrite during the first 7 years; Bucket Lock acts on a retention policy, and this option never defines one.


93

Multi-Select · hard

A company stores data in a Cloud Storage bucket with versioning enabled. They want to automatically delete objects that are noncurrent (i.e., previous versions) after 30 days, and also delete the current version if it is older than 365 days. Which three Object Lifecycle Management conditions can be used together? (Choose three.)

Select 3 answers

A. lastAccessTime: 30

B. age: 365

C. numNewerVersions: 1

D. daysSinceCustomTime: 30

E. noncurrentTimeBefore: 30

Answers: B, C, E

Deletes current version when older than 365 days.

Why this answer

Option B is correct because the `age` condition in Object Lifecycle Management specifies the number of days since object creation, and setting it to 365 will delete the current version when it is older than 365 days. This directly meets the requirement to delete current versions older than a year.
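A lifecycle configuration for the versioned bucket might be sketched as below. Note one deliberate substitution: the sketch uses `daysSinceNoncurrentTime`, the day-count condition, where the answer option names `noncurrentTimeBefore`, which expects a date. Restricting the first rule to live objects with `isLive` is also an assumption added for clarity.

```python
import json


def versioned_cleanup_rules(live_max_age_days: int = 365, noncurrent_days: int = 30) -> dict:
    """Two Delete rules: one for old live objects, one for stale noncurrent versions."""
    return {
        "rule": [
            {
                # Current (live) versions older than a year.
                "action": {"type": "Delete"},
                "condition": {"age": live_max_age_days, "isLive": True},
            },
            {
                # Previous versions that have been noncurrent for 30 days
                # and have at least one newer version.
                "action": {"type": "Delete"},
                "condition": {
                    "daysSinceNoncurrentTime": noncurrent_days,
                    "numNewerVersions": 1,
                },
            },
        ]
    }


rules = versioned_cleanup_rules()
print(json.dumps(rules, indent=2))
```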

Exam trap

The exam often tests whether candidates know the actual Cloud Storage lifecycle conditions, so candidates mistakenly select `lastAccessTime` (which is not a Cloud Storage lifecycle condition) or confuse `daysSinceCustomTime` with `noncurrentTimeBefore`. In the Cloud Storage API, `noncurrentTimeBefore` takes a date, while the day-count form is `daysSinceNoncurrentTime`.


94

Multi-Select · easy

A company wants to implement a data lake on Google Cloud. They need to store raw, structured data in open formats and allow querying directly from BigQuery without loading. Which TWO services or features should they use? (Choose 2.)

Select 2 answers

A. Cloud Storage (GCS)

B. Dataproc

C. Dataflow

D. Cloud SQL

E. BigLake

Answers: A, E

GCS is the underlying storage for the data lake, storing data in open formats like Parquet/ORC.

Why this answer

Cloud Storage (GCS) is the correct choice because it serves as the underlying storage layer for a data lake on Google Cloud, allowing raw structured data to be stored in open formats such as Parquet, Avro, or ORC. BigQuery can directly query data stored in GCS using external tables, eliminating the need to load data into BigQuery storage. This decouples compute from storage, enabling cost-effective and scalable data lake architectures.

Exam trap

The exam often tests the misconception that Dataproc or Dataflow are required for querying data in a data lake, when in fact BigQuery external tables and BigLake provide direct querying without loading, and the key is to recognize that storage (GCS) and the query engine (BigLake) are the correct services.


95

MCQ · medium

A global e-commerce platform requires a relational database that can handle millions of transactions per second across regions with strong consistency and automatic failover. The database must also support SQL joins. Which database should they choose?

A. Cloud Spanner

B. Cloud SQL

C. Cloud Firestore

D. Cloud Bigtable

Answer: A

Spanner provides global distribution, strong consistency, SQL support, and automatic failover, meeting all requirements.

Why this answer

Cloud Spanner is the correct choice because it is a globally distributed, horizontally scalable relational database that provides strong consistency and automatic failover across regions, while fully supporting SQL joins. It combines the benefits of a traditional relational database with the horizontal scalability of NoSQL systems, making it ideal for high-throughput, globally distributed applications requiring ACID transactions.

Exam trap

The exam often tests the misconception that Cloud SQL is suitable for global-scale, high-consistency workloads because it is a relational database, but it lacks the global distribution and horizontal write scaling required for millions of transactions per second across regions.

How to eliminate wrong answers

Option B (Cloud SQL) is wrong because it is a regional, vertically scalable relational database whose single primary instance cannot handle millions of transactions per second across regions; its high-availability failover operates within one region, and it lacks global distribution. Option C (Cloud Firestore) is wrong because it is a NoSQL document database that does not support SQL joins; it is designed for mobile and web apps, not for high-throughput relational workloads. Option D (Cloud Bigtable) is wrong because it is a NoSQL wide-column database that does not support SQL joins or relational queries; it is optimized for analytical and time-series workloads, not transactional applications requiring ACID properties.


96

MCQ · medium

You need to create a Cloud Storage bucket for a data lake that will store raw ingested data. The data must be immutable and cannot be deleted or overwritten for a compliance period of 5 years. Which feature should you enable?

A. Object Versioning

B. Lifecycle rules to delete objects after 5 years

C. Object Lock with governance mode

D. Bucket Lock with a retention policy of 5 years

Answer: D

Correct: Bucket Lock enforces immutability for the specified period.

Why this answer

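A retention policy enforces a minimum retention period on every object in the bucket: until an object reaches that age, it cannot be deleted or overwritten. Locking the policy with Bucket Lock makes it permanent, so the period can no longer be shortened or removed, which is what compliance requirements call for. Object Versioning keeps earlier versions but does not block deletion, a lifecycle rule deletes data rather than protecting it, and "Object Lock with governance mode" is Amazon S3 terminology, not a Cloud Storage feature.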
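The retention check described above can be sketched in a few lines of Python. This is only an illustration of the rule, not Cloud Storage code: the function name `can_delete_or_overwrite` is hypothetical, and a year is approximated as 365 days, which is an assumption made for the example.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a "year" is approximated as 365 days for this illustration.
RETENTION = timedelta(days=5 * 365)


def can_delete_or_overwrite(
    created_at: datetime, now: datetime, retention: timedelta = RETENTION
) -> bool:
    """Return True only once the object's age has reached the retention period."""
    return now - created_at >= retention


created = datetime(2024, 1, 1, tzinfo=timezone.utc)

# Two years after creation: still inside the retention period.
print(can_delete_or_overwrite(created, datetime(2026, 1, 1, tzinfo=timezone.utc)))

# Five calendar years after creation: the retention period has elapsed.
print(can_delete_or_overwrite(created, datetime(2029, 1, 1, tzinfo=timezone.utc)))
```

Locking the policy does not change this check; it only guarantees that nobody can shorten or remove the retention period afterwards.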

97

MCQeasy

A company wants to use BigQuery to query data stored in Parquet files in Cloud Storage without loading the data into BigQuery. Which BigQuery feature should they use?

A.BigQuery Omni

B.BigQuery ML

C.BigQuery external tables

D.BigQuery BI Engine

AnswerC

External tables allow querying data directly from GCS without loading into BigQuery storage.

Why this answer

BigQuery external tables let you query data stored in Cloud Storage (including Parquet files) directly, without loading it into BigQuery storage. The table definition only points at the files; BigQuery reads them at query time and supports formats such as Parquet, Avro, ORC, CSV, and JSON. That matches the requirement to query Parquet files in Cloud Storage without ingestion.
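As a rough sketch, an external table over Parquet files can be declared with a `CREATE EXTERNAL TABLE` statement. The Python helper below only builds that statement as a string; the helper name, the dataset and table name `my_dataset.events_ext`, and the bucket path are all hypothetical placeholders.

```python
def external_parquet_table_ddl(table: str, uri: str) -> str:
    """Build a BigQuery DDL statement for an external table over Parquet files."""
    return (
        f"CREATE EXTERNAL TABLE `{table}`\n"
        "OPTIONS (\n"
        "  format = 'PARQUET',\n"
        f"  uris = ['{uri}']\n"
        ");"
    )


# Hypothetical dataset, table, and bucket names, used only for illustration.
ddl = external_parquet_table_ddl(
    "my_dataset.events_ext", "gs://my-bucket/events/*.parquet"
)
print(ddl)
```

Once such a table exists, it can be queried with ordinary SQL while the data stays in Cloud Storage.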

Exam trap

The exam often tests the distinction between features that query external data (external tables) and features that process data within BigQuery (like BI Engine) or across clouds (Omni), leading candidates to confuse Omni's multi-cloud capability with external data access in the same cloud.

How to eliminate wrong answers

Option A is wrong because BigQuery Omni is designed to query data across multi-cloud environments (AWS, Azure) using BigQuery's interface, not for querying Parquet files in Cloud Storage without loading. Option B is wrong because BigQuery ML is a machine learning feature that enables creating and executing models using SQL, not for querying external data files. Option D is wrong because BigQuery BI Engine is an in-memory analysis service that accelerates dashboard queries on data already stored in BigQuery, not for querying external Parquet files in Cloud Storage.

98

MCQhard

A data engineer needs to design a Bigtable row key for a time-series IoT application where each device sends data every second. The query pattern is to retrieve all data for a specific device over a time range. Which row key design minimizes hotspots?

A.device_id#timestamp (e.g., device123#2024-03-15-10:30:00)

B.hash(device_id)#timestamp (e.g., a3f2#2024-03-15-10:30:00)

C.timestamp#device_id (e.g., 2024-03-15-10:30:00#device123)

D.device_type#device_id#timestamp

AnswerB

Hashing the device ID distributes writes across tablets, and appending timestamp allows efficient time-range scans.

Why this answer

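To avoid hotspots (where a disproportionate share of writes lands on a single node), the row key should start with a hash of the device ID, which spreads devices evenly across the key space, and then append the timestamp so that each device's readings stay contiguous and sorted. Because the hash is deterministic, a reader can recompute a device's prefix and scan only that device's time range. A timestamp-first key (option C) does the opposite: every device's newest writes land at the same end of the key space.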
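A minimal sketch of the key layout in option B follows. The function name `row_key`, the choice of SHA-256, and the four-character prefix length are assumptions for illustration; the question itself does not specify a hash function.

```python
import hashlib


def row_key(device_id: str, timestamp: str, prefix_len: int = 4) -> str:
    """Build a row key of the form hash(device_id)#timestamp.

    Assumption: a short hex prefix of SHA-256 stands in for hash(device_id).
    A prefix this short can collide between devices, so a real design
    would likely use a longer hash or also keep the device ID in the key.
    """
    digest = hashlib.sha256(device_id.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}#{timestamp}"


first = row_key("device123", "2024-03-15-10:30:00")
second = row_key("device123", "2024-03-15-10:30:01")

# Both keys share the same prefix, so one device's readings sort together
# and a time-range read becomes a contiguous scan under that prefix.
print(first)
print(second)
```

The same function is used on the read path: recompute the prefix for the device, then scan from the start timestamp to the end timestamp under that prefix.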

99

MCQhard

A company stores sensitive customer data in BigQuery and Cloud Storage. They want to encrypt the data with customer-managed encryption keys (CMEK) and ensure that access to the key material is restricted to only approved networks. Which additional Google Cloud control should they implement to enforce network-based access to the encryption keys?

A.Identity-Aware Proxy (IAP)

B.Private Google Access

C.VPC Service Controls

D.Cloud Armor

AnswerC

VPC Service Controls allow you to define a security perimeter around Google Cloud services, including Cloud KMS, to restrict access based on network origin.

Why this answer

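VPC Service Controls (VPC-SC) creates a service perimeter around Cloud KMS, BigQuery, and Cloud Storage, which blocks requests that originate outside the perimeter and helps prevent data exfiltration. Combined with access levels, the perimeter can admit only approved networks, adding network-based control on top of the IAM permissions that govern CMEK. The other options solve different problems: Cloud Armor is a web application firewall and DDoS protection service for load-balanced applications, IAP enforces identity-based access to applications and VMs, and Private Google Access lets VMs without external IP addresses reach Google APIs and services.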

100

MCQhard

An e-commerce company uses Cloud Spanner for order processing. They need to query orders by customer ID and retrieve all order items. Which schema design pattern should they use for optimal performance?

A.Use interleaved tables where Orders is the parent and OrderItems is an interleaved child table with the same primary key prefix.

B.Store all data in a single table with nullable columns for order item attributes.

C.Denormalize by storing order items as a repeated field in the orders table.

D.Create two separate tables with a secondary index on customer_id in the orders table and a secondary index on order_id in the order_items table.

AnswerA

Interleaving co-locates child rows with their parent, enabling efficient joins and strong consistency.

Why this answer

Interleaved tables in Cloud Spanner physically co-locate parent and child rows on the same split, so querying orders by customer_id and retrieving all order items becomes a single, fast key-range scan without cross-node joins. This design exploits Spanner's hierarchical storage model to minimize latency and maximize throughput for this access pattern.
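The schema in option A might look like the following sketch, written as Spanner GoogleSQL DDL held in Python strings. The column names, types, and the `ON DELETE CASCADE` choice are assumptions for illustration; the question only fixes the parent-child relationship and the shared primary key prefix.

```python
# Parent table: the primary key starts with CustomerId, so all of a
# customer's orders form one contiguous key range.
ORDERS_DDL = """\
CREATE TABLE Orders (
  CustomerId STRING(36) NOT NULL,
  OrderId    STRING(36) NOT NULL,
  OrderDate  DATE
) PRIMARY KEY (CustomerId, OrderId)"""

# Child table: its primary key begins with the parent's full primary key,
# and INTERLEAVE IN PARENT stores each item row next to its order row.
ORDER_ITEMS_DDL = """\
CREATE TABLE OrderItems (
  CustomerId STRING(36) NOT NULL,
  OrderId    STRING(36) NOT NULL,
  ItemId     INT64 NOT NULL,
  Quantity   INT64
) PRIMARY KEY (CustomerId, OrderId, ItemId),
  INTERLEAVE IN PARENT Orders ON DELETE CASCADE"""

# The parent must be created before the child that interleaves into it.
DDL_STATEMENTS = [ORDERS_DDL, ORDER_ITEMS_DDL]

for statement in DDL_STATEMENTS:
    print(statement + ";\n")
```

With this layout, reading one customer's orders and their items touches a single key range that begins with that customer's ID.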

Exam trap

The exam often tests the misconception that secondary indexes alone are sufficient for performance, ignoring that Spanner's distributed architecture makes cross-table joins expensive, whereas interleaving provides physical co-location that avoids network round-trips.

How to eliminate wrong answers

Option B is wrong because a single table with nullable columns violates normalization, wastes storage, and forces complex queries to filter order-item rows from order rows, eliminating Spanner's interleaving performance benefit. Option C is wrong because storing order items as a repeated field (e.g., ARRAY<STRUCT>) prevents independent indexing and filtering of individual items, and updating a single item requires rewriting the entire order row, causing contention and poor concurrency. Option D is wrong because two separate tables with secondary indexes on customer_id and order_id require a distributed join across splits, incurring cross-node communication and higher latency compared to the co-located access of interleaved tables.
