Knowledge + Practice

CCNA Data Operations and Support Questions

75 of 387 questions · Page 2/6 · Data Operations and Support · Answers revealed

76

MCQhard

A data pipeline uses AWS Glue ETL jobs to process data from Amazon RDS for MySQL to Amazon S3. Recently, the jobs have been failing with the error 'Communications link failure' during the connection phase. The RDS instance is in a private subnet, and the Glue job uses a VPC endpoint for S3. What is the most likely cause?

A.The RDS database has reached the maximum number of connections.

B.The Glue job does not have IAM permissions to decrypt the RDS database using AWS KMS.

C.The JDBC driver used by Glue is incompatible with the MySQL version.

D.The Glue job does not have a network path to the RDS instance because it is not attached to the same VPC subnet.

AnswerD

Glue jobs need an ENI in the same VPC to connect to RDS.

Why this answer

Option D is correct because the S3 VPC endpoint only gives the job a path to S3. To reach an RDS instance in a private subnet, the Glue job must have network connectivity to the RDS subnet, typically via an ENI in the same VPC. Option A is wrong because exhausting the connection limit typically produces a 'Too many connections' error, not a communications link failure. Option B is wrong because missing KMS permissions would cause an access-denied error, not a connection failure.

Option C is wrong because nothing in the scenario points to a driver or MySQL version change, and the failure occurs at the network level during the connection phase.

77

MCQhard

A data engineer is troubleshooting a failed AWS Glue job that reads from an Apache Hive metastore in an Amazon EMR cluster. The error message indicates 'ClassNotFoundException: org.apache.hadoop.hive.ql.metadata.HiveException'. The Glue job uses a custom Python shell script. What should the engineer do to resolve this error?

A.Check the network connectivity between Glue and the EMR cluster.

B.Include the Hive JAR files in the 'Python library path' or use a Glue version with Hive support.

C.Modify the Python script to import the Hive libraries manually.

D.Update the IAM role to allow 'hive:Describe*' actions.

AnswerB

Glue needs Hive JARs in the classpath to connect to Hive metastore.

Why this answer

Option B is correct because the ClassNotFoundException for Hive classes indicates that the required Hive JARs are not available in the Glue job's classpath. The Glue job needs to include Hive JARs either via a library path or by using a Glue version that supports Hive connectivity. Option C is incorrect because the Python script itself does not need to be modified to include imports; the JARs must be provided.

Option A is incorrect because a network connectivity problem would not cause a class-not-found error. Option D is incorrect because the error is not about IAM permissions.

78

MCQmedium

A data engineer notices that an AWS Glue ETL job processing data from Amazon S3 to Amazon Redshift has been failing intermittently with the error 'S3ServiceException: SlowDown'. Which action is MOST likely to resolve this issue?

A.Increase the number of partitions in the Glue job to parallelize reads.

B.Switch from a Standard to a G.2X large Glue worker type.

C.Implement exponential backoff and retry logic in the Glue job.

D.Enable S3 Transfer Acceleration on the source bucket.

AnswerC

Exponential backoff reduces request rate and handles throttling gracefully.

Why this answer

Option C is correct because the 'SlowDown' error indicates request throttling by S3; implementing exponential backoff and retries reduces the request rate and lets the job recover from throttling. Option A is wrong because increasing partitions could increase the number of concurrent requests. Option B is wrong because switching to a larger worker type does not reduce the S3 request rate.

Option D is wrong because S3 Transfer Acceleration speeds up transfers over long distances; it does not raise S3 request-rate limits.
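
The retry behaviour described for this question can be sketched in plain Python. This is only an illustration under stated assumptions: `SlowDownError` and `call_with_backoff` are hypothetical names, not part of AWS Glue or any AWS SDK, and the delay values are arbitrary.

```python
import random
import time


class SlowDownError(Exception):
    """Hypothetical stand-in for a throttling error such as S3 'SlowDown'."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.1,
                      max_delay=5.0, sleep=time.sleep):
    """Call operation(), retrying on SlowDownError with exponential backoff.

    The wait before retry k is drawn uniformly from [0, cap], where
    cap = min(max_delay, base_delay * 2**k) ("full jitter").
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except SlowDownError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the throttling error
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, cap))
```

The random jitter is a design choice: it spreads retries from parallel workers so they do not all hit the service again at the same moment.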

79

MCQhard

A company runs an Amazon Redshift cluster for analytics. During peak hours, query performance degrades significantly. The data engineer notices that disk space usage is above 80% on many nodes. Which of the following is the MOST effective long-term solution to improve query performance?

A.Increase the workload management (WLM) queue slots.

B.Resize the cluster to include additional nodes.

C.Apply compression encoding to all columns.

D.Run the VACUUM command to reclaim space.

AnswerB

Adding nodes increases both storage and compute resources, directly addressing disk usage and performance.

Why this answer

Option B is correct because resizing the cluster to add more nodes increases total storage and compute capacity, reducing disk pressure and improving performance. Option D is wrong because vacuuming reclaims space but does not add capacity. Option C is wrong because compression helps but may not be sufficient.

Option A is wrong because adding WLM queue slots only addresses queries waiting for resources.

80

MCQeasy

A data engineer is investigating why Amazon Athena queries on the 'my-data-lake' bucket are slow. The table is partitioned by year/month/day. The exhibit shows the objects in one partition. What is the MOST likely cause of poor query performance?

A.The files are too small, causing excessive read overhead

B.The files are not compressed

C.The partition columns are not appropriately chosen

D.The data format is CSV instead of Parquet

AnswerA

Many small files cause many S3 GET requests and slow performance.

Why this answer

Option A is correct because the exhibit shows tiny files (50 bytes), which cause high per-file request and metadata overhead and slow query performance. Option B is about compression, which cannot be determined from the exhibit. Option C is about partitioning, which is fine for a year/month/day layout.

Option D is about format, but CSV is a standard, supported format and is not the main problem here.

81

MCQeasy

A company runs an Amazon RDS for PostgreSQL database and wants to capture change data (inserts, updates, deletes) to stream into Amazon Kinesis Data Streams for real-time processing. Which AWS service should be used to capture the changes directly from the database?

A.Amazon RDS automated snapshots

B.AWS Glue ETL job scheduled to run every minute

C.Amazon Kinesis Agent

D.AWS Database Migration Service (DMS) with ongoing replication

AnswerD

DMS supports CDC and can stream changes to Kinesis.

Why this answer

AWS DMS with ongoing replication (change data capture) is the correct service because it can continuously capture insert, update, and delete operations from the PostgreSQL transaction logs (WAL) and stream them to a Kinesis Data Streams endpoint. This allows real-time processing without modifying the source database or requiring application-level triggers.

Exam trap

The trap here is that candidates confuse scheduled polling (Glue) or file-based agents (Kinesis Agent) with true CDC, failing to recognize that only DMS ongoing replication can stream row-level changes directly from the database transaction log in real time.

How to eliminate wrong answers

Option A is wrong because Amazon RDS automated snapshots are point-in-time backups of the entire database, not a mechanism to capture individual row-level changes in real time. Option B is wrong because an AWS Glue ETL job scheduled every minute introduces at least 60 seconds of latency and cannot capture every single change as it happens, making it unsuitable for true real-time streaming. Option C is wrong because Amazon Kinesis Agent is designed to stream log files (e.g., from EC2 instances) to Kinesis, not to connect directly to a database and read transactional changes from its WAL.

82

MCQeasy

A data engineer runs a Spark job on Amazon EMR that reads data from Amazon S3 and writes results back to S3. The job fails with an 'S3AccessDenied' error. The engineer verifies that the IAM role attached to the EMR cluster has s3:GetObject and s3:PutObject permissions on the relevant buckets. What is the MOST likely cause of the error?

A.S3 Transfer Acceleration is not enabled on the bucket.

B.EMRFS consistent view is not configured.

C.The S3 bucket is in a different AWS Region than the EMR cluster.

D.The IAM role does not have s3:ListBucket permission on the bucket.

AnswerD

EMR requires ListBucket permission to access objects in the bucket.

Why this answer

The IAM role attached to the EMR cluster must have the s3:ListBucket permission on the bucket to allow the Spark job to enumerate objects when reading from S3. Without this permission, even with s3:GetObject and s3:PutObject, the job fails with an 'S3AccessDenied' error because the S3 list operation is required for directory listing and file discovery.

Exam trap

The trap here is that candidates often assume GetObject and PutObject are sufficient for S3 read/write operations, overlooking that the ListBucket permission is required for directory listing and file discovery in Spark jobs.

How to eliminate wrong answers

Option A is wrong because S3 Transfer Acceleration is a feature for faster uploads over long distances and is not required for basic read/write operations; its absence does not cause an access denied error. Option B is wrong because EMRFS consistent view is a consistency mechanism for eventually consistent S3 buckets, not a permission or access control feature; its absence would not produce an S3AccessDenied error. Option C is wrong because while cross-region access can cause latency or additional costs, it does not inherently cause an access denied error as long as the IAM role has the correct permissions and the bucket policy allows cross-region access.
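
As a rough illustration of the permissions discussed above, the following sketch builds an IAM policy document as a Python dictionary. The helper name `build_s3_policy` and the bucket name are hypothetical. The point it shows is that `s3:ListBucket` is granted on the bucket ARN, while `s3:GetObject` and `s3:PutObject` are granted on the objects inside it.

```python
import json


def build_s3_policy(bucket):
    """Return an IAM policy document allowing list, read, and write on one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is a bucket-level action: the resource is the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                # Reads and writes are object-level actions: the resource is bucket/*.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


if __name__ == "__main__":
    # "example-bucket" is a placeholder name for illustration only.
    print(json.dumps(build_s3_policy("example-bucket"), indent=2))
```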

83

MCQeasy

A data engineer receives an alert that a Kinesis Data Stream has a 'WriteProvisionedThroughputExceeded' error. The stream has 5 shards with 1 MB/s write capacity per shard. The producer application is sending data at 8 MB/s sustained. What should the engineer do to resolve the issue?

A.Reduce the record size to below 1 MB per record.

B.Enable enhanced fan-out on the stream.

C.Increase the number of shards from 5 to 10.

D.Use Kinesis Firehose as an intermediary to buffer data.

AnswerC

More shards increase the total write capacity, matching the 8 MB/s requirement.

Why this answer

The 'WriteProvisionedThroughputExceeded' error indicates that the total write throughput to the Kinesis Data Stream exceeds the provisioned capacity. With 5 shards, each offering 1 MB/s write capacity, the total write capacity is 5 MB/s. The producer is sending 8 MB/s, which is above this limit.

Increasing the number of shards to 10 raises the total write capacity to 10 MB/s, accommodating the sustained 8 MB/s throughput and resolving the throttling.
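
The capacity arithmetic above can be checked with a short calculation. This is a minimal sketch that assumes the 1 MB/s per-shard write capacity stated in the question; the function name `shards_needed` is illustrative only.

```python
import math


def shards_needed(write_mb_per_s, per_shard_mb_per_s=1.0):
    """Smallest shard count whose combined write capacity covers the given rate."""
    return math.ceil(write_mb_per_s / per_shard_mb_per_s)


current_capacity = 5 * 1.0          # 5 shards x 1 MB/s = 5 MB/s
minimum_shards = shards_needed(8.0)  # 8 shards are the bare minimum for 8 MB/s
```

Eight shards would just cover the sustained 8 MB/s; the answer's 10 shards leave some headroom above that minimum.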

Exam trap

The trap here is that candidates confuse write-side throttling with read-side limitations, leading them to choose enhanced fan-out (a read-side optimization) instead of scaling shards to increase write capacity.

How to eliminate wrong answers

Option A is wrong because reducing record size below 1 MB does not address the throughput limit; the error is about aggregate write throughput exceeding shard capacity, not individual record size limits. Option B is wrong because enhanced fan-out is a feature for increasing read throughput (up to 2 MB/s per shard per consumer) and does not affect write capacity or resolve write-side throttling. Option D is wrong because Kinesis Firehose is a delivery service that reads from a Kinesis stream; it cannot buffer data before it is written to the stream, so it does not solve the write throughput exceedance at the producer side.

84

MCQmedium

A data engineer is troubleshooting a Kinesis Data Analytics application that processes streaming data. The application is falling behind, and the metric 'MillisBehindLatest' is consistently above 60000. The source Kinesis stream has 10 shards, and the application uses a Flink application with default parallelism. What is the MOST likely cause of the lag?

A.The sink (destination) is throttling writes.

B.The Flink application parallelism is set to 1.

C.The Kinesis stream has too few shards.

D.The retention period of the Kinesis stream is too short.

AnswerB

Default parallelism of 1 causes a single consumer to process all shards.

Why this answer

Option B is correct because default parallelism in Flink is 1, which means only one task processes data from all 10 shards, causing a bottleneck. Option C is wrong because adding shards would not help while a single task consumes the whole stream; the constraint is the application's parallelism. Option D is wrong because the retention period does not affect lag.

Option A is wrong because nothing in the scenario points to the sink throttling writes.

85

MCQhard

A data engineer is designing a solution to move data from an on-premises Oracle database to Amazon S3 using AWS DMS. The engineer needs to ensure that data changes are replicated continuously with minimal latency. Which DMS configuration is most appropriate?

A.Use AWS SCT to convert the schema and then use DMS for full load

B.Use a full-load task with ongoing replication (CDC)

C.Use a full-load task that runs daily

D.Use Kinesis Data Streams to capture changes and write to S3

AnswerB

CDC captures changes continuously after initial load.

Why this answer

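Option B is correct because DMS with CDC (change data capture) captures ongoing changes with low latency after the initial full load. Option C is wrong because a daily full load only recopies the data once a day and does not replicate changes continuously. Option A is wrong because SCT is for schema conversion, not data movement, and a full load alone only does the initial copy.

Option D is wrong because Kinesis Data Streams is a streaming service and does not capture changes from an Oracle database by itself; DMS is the right service for database replication.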

86

MCQhard

A company uses Amazon Kinesis Data Analytics for Apache Flink to process streaming data. The application reads from a Kinesis data stream and writes results to an S3 bucket. The application is consistently running out of memory and failing. The operator has already increased the Parallelism and TaskManager memory. What is the next BEST step to troubleshoot?

A.Change the processing mode from exactly-once to at-least-once

B.Reduce the number of shards in the source stream

C.Enable Apache Flink metrics in Amazon CloudWatch to monitor heap and checkpoint details

D.Increase the buffer timeout for the S3 sink

AnswerC

Detailed metrics help identify root cause of OOM.

Why this answer

Option C is correct because enabling Apache Flink metrics in CloudWatch allows monitoring of heap usage, checkpoint sizes, and backpressure, which can diagnose the memory issue. Option D adds latency but doesn't diagnose memory. Option A changes processing semantics but doesn't address memory.

Option B is unrelated to the memory problem.

87

Multi-Selecthard

A company is using AWS Glue DataBrew to clean and transform data from an S3 bucket. The data contains personally identifiable information (PII). The company wants to mask the PII columns before making the dataset available to analysts. Which THREE actions can the engineer perform using DataBrew to mask PII? (Choose THREE.)

Select 3 answers

A.Apply a 'Column masking' transformation to replace values with 'XXX'.

B.Apply a 'Column tokenization' transformation to replace values with tokens.

C.Apply a 'Column hashing' transformation using SHA-256.

D.Apply a 'Column delete' transformation to remove the PII columns entirely.

E.Apply a 'Column encryption' transformation to encrypt the column values.

AnswersA, C, D

Column masking is a built-in transformation that hides data.

Why this answer

Options A, C, and D are correct. DataBrew provides built-in transformations for masking: 'Column masking' replaces values with a fixed pattern, 'Column hashing' replaces them with a hash, and 'Column delete' removes the column. Option E is wrong because 'Column encryption' is not a DataBrew transformation; encryption is typically done at rest or with KMS.

Option B is wrong because there is no 'Column tokenization' transformation in DataBrew; tokenization would require a custom recipe step.

88

MCQeasy

A data engineer notices that an Amazon RDS for PostgreSQL instance's CPU utilization is consistently above 90% during business hours. The database is used for reporting queries. Which action should be taken FIRST to improve performance?

A.Enable Multi-AZ deployment for automatic failover.

B.Enable Performance Insights and review slow queries.

C.Create a read replica to offload reporting queries.

D.Increase the instance size to a larger instance class.

AnswerB

Identifying and optimizing slow queries reduces CPU usage.

Why this answer

Option B is correct because the first step in diagnosing high CPU utilization on an RDS for PostgreSQL instance used for reporting queries is to identify the root cause. Enabling Performance Insights provides a detailed view of database load, wait events, and SQL query performance, allowing the data engineer to pinpoint slow or inefficient queries that are consuming CPU resources. Without this diagnostic data, any other action would be premature and could lead to unnecessary cost or complexity.

Exam trap

The trap here is that candidates often jump to scaling solutions (like increasing instance size or adding a read replica) without first diagnosing the root cause, but AWS emphasizes observability and optimization before capacity changes.

How to eliminate wrong answers

Option A is wrong because enabling Multi-AZ deployment improves availability and failover, not performance; it does not reduce CPU utilization or address query performance issues. Option C is wrong because creating a read replica offloads read traffic but does not fix the underlying inefficient queries that are causing high CPU on the source instance; the replica would also suffer from the same workload if queries are poorly optimized. Option D is wrong because increasing the instance size may temporarily mask the problem by providing more CPU capacity, but it does not resolve the root cause of inefficient queries and incurs higher costs without guaranteeing sustained performance improvement.

89

MCQmedium

A data pipeline using AWS Glue ETL jobs is failing intermittently with the error 'Rate exceeded' when writing to an Amazon Redshift cluster. Which action is MOST effective to resolve this issue?

A.Increase the timeout of the Glue ETL job to allow more time for retries.

B.Disable workload management (WLM) concurrency scaling in Redshift.

C.Enable auto-tuning on the Redshift cluster and use concurrency scaling.

D.Change the output file format from Parquet to CSV to reduce write size.

AnswerC

Auto-tuning with concurrency scaling dynamically adds capacity to handle increased write requests.

Why this answer

Option C is correct because enabling Redshift auto-tuning with concurrency scaling can automatically handle write spikes. Option A is wrong because increasing the Glue job timeout does not address rate limiting. Option B is wrong because disabling Redshift WLM concurrency scaling would exacerbate the issue.

Option D is wrong because using a different data format does not affect write rate limits.

90

MCQhard

A company uses a DynamoDB table with on-demand capacity for a gaming application. During a new game launch, the table experienced throttling errors. The engineer checks CloudWatch metrics and sees that the 'ConsumedWriteCapacityUnits' exceeded the 'ProvisionedWriteCapacityUnits' (on-demand uses the table's previous peak). The application is writing at 50,000 WCU but the table's peak was 30,000 WCU. What should the engineer do to resolve throttling?

A.Add a DynamoDB Accelerator (DAX) cluster in front of the table.

B.Increase the number of partitions by splitting the partition key.

C.Contact AWS Support to pre-warm the table for higher throughput.

D.Switch the table to provisioned capacity and set WCU to 50,000.

AnswerC

Pre-warming increases the table's initial throughput limit to handle spikes.

Why this answer

Option C is correct because on-demand capacity automatically scales based on traffic, but it has a warm-up limit tied to the table's previous peak. Pre-warming the table increases the initial throughput limit. Option D is wrong because the table runs in on-demand mode, which has no provisioned WCU setting to raise.

Option B is wrong because increasing the partition count does not directly increase the throughput limit. Option A is wrong because DAX improves read performance, not write throughput.

91

MCQmedium

Your team uses Amazon Kinesis Data Analytics to process real-time streaming data from an Amazon Kinesis Data Stream. The application calculates windowed aggregations and writes results to an Amazon S3 bucket using a delivery stream. Recently, the application has been failing with a 'LimitExceededException' when writing to the delivery stream. You have checked the CloudWatch metrics and see that the IncomingBytes and IncomingRecords for the delivery stream are well below the provisioned limits. The delivery stream has a buffer size of 5 MB and a buffer interval of 60 seconds. The application generates about 500 records per second, each about 1 KB. What is the most likely cause and correct action?

A.Increase the number of shards in the Kinesis Data Stream to reduce the load on the application.

B.Modify the application to use PutRecordBatch with smaller batch sizes to stay within the 4 MB per-call limit.

C.Reduce the buffer interval of the delivery stream to 30 seconds to flush data more frequently.

D.Increase the buffer size of the delivery stream to 10 MB to accommodate larger writes.

AnswerB

The LimitExceededException on Firehose is often due to exceeding the 4 MB per PutRecordBatch call. Reducing batch size fixes it.

Why this answer

Option B is correct because Kinesis Data Analytics writes to Firehose through a PutRecord or PutRecordBatch call. Each call has a maximum payload size of 1 MB (for PutRecord) or 4 MB (for PutRecordBatch). If the application uses PutRecordBatch and the total payload exceeds 4 MB, it gets a LimitExceededException.

Increasing the buffer size or interval does not affect the per-call limit. Option A is wrong because the stream is not the source of the error. Option C is wrong because the buffer settings are not causing the per-call limit.

Option D is wrong because the data size is small.
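
One way to keep each batch under the per-call limit is to split records before sending. The sketch below is plain Python and makes no AWS calls. The 4 MB byte limit comes from the explanation above; the 500-record cap and the name `chunk_records` are assumptions added for illustration.

```python
MAX_BATCH_RECORDS = 500              # assumed per-call record cap
MAX_BATCH_BYTES = 4 * 1024 * 1024    # 4 MB per-call payload limit from the text


def chunk_records(records, max_records=MAX_BATCH_RECORDS,
                  max_bytes=MAX_BATCH_BYTES):
    """Yield lists of byte records that respect both the count and size limits."""
    batch, batch_bytes = [], 0
    for record in records:
        size = len(record)
        if size > max_bytes:
            raise ValueError("single record exceeds the per-call byte limit")
        # Start a new batch if adding this record would break either limit.
        if batch and (len(batch) >= max_records or batch_bytes + size > max_bytes):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(record)
        batch_bytes += size
    if batch:
        yield batch
```

Each yielded list could then be sent in its own batch call, so no single call exceeds the limits.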

92

Multi-Selectmedium

A data engineer is troubleshooting an AWS Glue ETL job that fails with the error: 'An error occurred while calling o123.pyWriteDynamicFrame. Access Denied when writing to S3 bucket: my-bucket'. The job uses a Glue service role named 'GlueServiceRole'. Which TWO actions should the engineer take to resolve the issue? (Choose TWO.)

Select 2 answers

A.Disable S3 Block Public Access on the bucket.

B.Grant the GlueServiceRole permission to write to the AWS Glue Data Catalog.

C.Check if the S3 bucket policy denies access from the GlueServiceRole.

D.Verify that the IAM policy attached to GlueServiceRole includes s3:PutObject on the bucket.

E.Ensure the Glue job is in the same VPC as the S3 bucket.

AnswersC, D

Bucket policy may override IAM permissions.

Why this answer

Option C is correct because the error message indicates an access denied when writing to S3, which can be caused by a bucket policy that explicitly denies the Glue service role's access, even if the IAM policy allows it. Option D is correct because the IAM policy attached to GlueServiceRole must include the s3:PutObject permission on the specific bucket to allow the Glue job to write data.

Exam trap

The trap here is that candidates may confuse S3 access errors with network or VPC issues, but S3 is a global service and access is governed by IAM and bucket policies, not VPC placement.

93

MCQeasy

A data engineer is troubleshooting a failed AWS Glue ETL job that reads from Amazon S3 and writes to Amazon Redshift. The job fails with the error: 'ERROR: Cannot insert a duplicate key into unique index'. The Redshift table has a primary key on the 'id' column. The data in S3 contains multiple records with the same 'id'. The engineer needs to ensure that only the latest record for each 'id' is loaded into Redshift. The data has a 'timestamp' column. Which approach should the engineer take?

A.Use the 'dropDuplicates' transformation in the Glue ETL script, ordering by 'timestamp' descending to keep the latest record for each 'id'.

B.Set the write mode to 'overwrite' in the Glue job to replace the entire Redshift table.

C.Load data into a staging table in Redshift and then use a MERGE operation to insert only new records.

D.Disable primary key constraints on the Redshift table before loading.

AnswerA

This removes duplicate IDs while preserving the most recent record.

Why this answer

Option A is correct because using the 'dropDuplicates' transformation on the 'id' column, ordering by 'timestamp' descending, will remove duplicate 'id' values, keeping the latest record. Option D is wrong because disabling constraints does not prevent duplicates; it only defers the problem. Option B is wrong because setting 'overwrite' mode replaces the entire table, not just duplicates.

Option C is wrong because staging tables and MERGE require additional steps and are not directly available in Glue without custom logic.
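
The "keep the latest record per id" logic can be illustrated in plain Python. This is a sketch of the idea only, not Glue or Spark code; the function name `latest_per_id` and the sample rows are made up for the example, and timestamps are assumed to be comparable values such as ISO-8601 strings.

```python
def latest_per_id(records):
    """Return one record per 'id', keeping the one with the greatest 'timestamp'."""
    latest = {}
    for rec in records:
        current = latest.get(rec["id"])
        if current is None or rec["timestamp"] > current["timestamp"]:
            latest[rec["id"]] = rec
    return list(latest.values())


rows = [
    {"id": 1, "timestamp": "2024-01-01T00:00:00", "value": "a"},
    {"id": 2, "timestamp": "2024-01-02T00:00:00", "value": "b"},
    {"id": 1, "timestamp": "2024-01-03T00:00:00", "value": "c"},
]
deduplicated = latest_per_id(rows)  # one row for id 1 (value "c"), one for id 2
```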

94

Multi-Selecthard

A data engineer is designing an ETL pipeline that uses AWS Glue to process data from an Amazon DynamoDB table and write results to an S3 bucket in Parquet format. The pipeline must handle schema changes in the source DynamoDB table. Which THREE steps should the engineer take to ensure the pipeline handles schema evolution? (Choose THREE.)

Select 3 answers

A.Use Glue's 'recast' transformation to handle type changes.

B.Set the Glue crawler to update the table's schema in the Data Catalog.

C.Convert the Parquet output to CSV to avoid schema constraints.

D.Partition the data by date and delete old partitions.

E.Use Spark's 'mergeSchema' option when writing to S3.

AnswersA, B, E

recast can change data types to match the target schema.

Why this answer

Options A, B, and E are correct. Setting the Glue crawler to update the table lets the Data Catalog schema follow the source. 'mergeSchema' is a Spark option that merges schemas. The 'recast' option in Glue helps handle type changes. Option D is wrong because deleting partitions is not related to schema evolution.

Option C is wrong because converting to CSV is not a schema evolution strategy.

95

MCQeasy

A company stores sensitive data in Amazon S3. To meet compliance requirements, they need to ensure that any data older than 1 year is automatically moved to a lower-cost storage class. Which S3 feature should they use?

A.S3 Replication

B.S3 Lifecycle policies

C.S3 Glacier

D.S3 Intelligent-Tiering

AnswerB

Lifecycle policies can transition objects to lower-cost storage classes based on age.

Why this answer

Option B is correct because S3 Lifecycle policies automate transitioning objects between storage classes based on age. Option C is wrong because S3 Glacier is a storage class, not a feature to automate transitions. Option D is wrong because S3 Intelligent-Tiering automatically optimizes costs based on access patterns but does not enforce a specific age-based transition.

Option A is wrong because S3 Replication is for copying objects across buckets.
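
An age-based transition like the one in this question can be written as a lifecycle configuration. The sketch below only builds the configuration as a Python dictionary, in the shape accepted by boto3's `put_bucket_lifecycle_configuration`. The helper name, the rule ID, and the choice of `GLACIER` as the target class are assumptions for illustration; the question does not name a specific storage class.

```python
import json


def build_lifecycle_configuration(days=365, storage_class="GLACIER"):
    """Build a lifecycle configuration that transitions objects after `days` days."""
    return {
        "Rules": [
            {
                "ID": f"transition-after-{days}-days",  # hypothetical rule name
                "Filter": {"Prefix": ""},               # empty prefix: all objects
                "Status": "Enabled",
                "Transitions": [
                    {"Days": days, "StorageClass": storage_class},
                ],
            }
        ]
    }


if __name__ == "__main__":
    print(json.dumps(build_lifecycle_configuration(), indent=2))
```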

96

Multi-Selecteasy

A data engineer is setting up a data pipeline to ingest streaming data from an IoT fleet. The data must be processed in near real-time and stored in Amazon S3 for analytics. Which THREE AWS services should the engineer consider using?

Select 3 answers

A.Amazon EMR

B.Amazon Kinesis Data Firehose

C.AWS Lambda

D.AWS Glue

E.Amazon Kinesis Data Streams

AnswersB, C, E

Delivers streaming data to S3.

Why this answer

Option E is correct because Kinesis Data Streams ingests streaming data. Option B is correct because Kinesis Data Firehose can deliver data to S3. Option C is correct because Lambda can process records in near real-time.

Option D is incorrect because Glue is batch-oriented. Option A is incorrect because EMR is for big data processing, not streaming ingestion.

97

MCQmedium

A data engineer is monitoring a Redshift cluster that is experiencing slow query performance. The cluster has 4 dc2.large nodes. The engineer notices that disk space usage is at 85% across all nodes. Which action would MOST likely improve query performance?

A.Change the table design to use DISTKEY and SORTKEY.

B.Enable compression on all columns.

C.Increase the number of nodes to 8.

D.Run the VACUUM command to reclaim space.

AnswerC

Adding nodes increases disk capacity and I/O throughput, reducing disk pressure and improving query performance.

Why this answer

Option C is correct because adding more nodes distributes data across more disks and improves I/O parallelism. Option D is wrong because VACUUM reclaims space but does not help if disk usage is high due to data volume. Option A is wrong because DISTKEY and SORTKEY changes are table design decisions that do not add disk capacity.

Option B is wrong because compression is already applied at load time.

98

MCQmedium

A data engineer uses Amazon EMR to run a Spark job that reads from S3 and writes to HDFS on the cluster. The job fails with an 'OutOfMemoryError: Java heap space' error in the executors. Which parameter adjustment should be made to resolve this?

A.Increase spark.default.parallelism

B.Increase spark.sql.shuffle.partitions

C.Increase spark.executor.memory

D.Increase spark.driver.memory

AnswerC

This directly increases the heap size available to each executor.

Why this answer

Option C is correct because increasing spark.executor.memory allocates more heap space to executors. Option D is wrong because spark.driver.memory affects the driver, not executors. Option B is wrong because spark.sql.shuffle.partitions affects shuffle behavior, not the executor heap size.

Option A is wrong because spark.default.parallelism controls task parallelism, not memory.

99

MCQhard

A company uses AWS DMS to migrate a 2 TB Oracle database to Amazon RDS for PostgreSQL. The migration completes successfully, but data validation shows some tables have missing rows. The task is configured for ongoing replication using change data capture (CDC). What is the MOST likely cause of the missing rows?

A.Source database archive log retention period too short

B.Large objects (LOBs) not supported by the target

C.Source tables missing primary keys

D.Insufficient storage on the DMS replication instance

AnswerC

Without primary keys, DMS cannot track changes for CDC, leading to missing rows.

Why this answer

Option C is correct because if a table lacks a primary key, DMS cannot uniquely identify rows for CDC, leading to missed changes. Option A is wrong because the migration completed successfully; if required archive logs had been purged, the CDC task would be expected to report errors rather than finish cleanly. Option B is wrong because DMS supports large objects with proper configuration.

Option D is wrong because insufficient storage on the replication instance would be expected to cause task failures rather than a successful migration with missing rows.

100

MCQmedium

A company uses Amazon EMR to run Spark jobs on data stored in S3. After upgrading the EMR cluster to a new release, one of the Spark jobs fails with 'OutOfMemoryError' in the executor. Which configuration change is MOST likely to resolve this issue?

A.Increase the number of core nodes in the EMR cluster.

B.Decrease spark.sql.shuffle.partitions to reduce overhead.

C.Increase spark.driver.memory in the Spark configuration.

D.Increase spark.executor.memory to allocate more memory per executor.

AnswerD

More memory per executor prevents OutOfMemoryError.

Why this answer

Option D is correct because increasing spark.executor.memory gives more memory per executor. Option C is wrong because increasing driver memory helps the driver, not executors. Option A is wrong because the number of core nodes doesn't directly change the memory available to each executor.

Option B is wrong because reducing shuffle partitions makes each partition larger, which may add memory pressure.

Practice this question →

101

MCQeasy

A data engineer has set up an AWS Lambda function that processes files uploaded to an S3 bucket. The function is triggered by S3 event notifications. However, the function is not being invoked when a file is uploaded. The engineer checks the Lambda function's CloudWatch Logs and finds no execution logs. What should the engineer check FIRST?

A.Check the Lambda function's code for errors.

B.Verify that the Lambda function's IAM role has permissions to read from S3.

C.Verify that the S3 bucket has an event notification configured for the Lambda function.

D.Check if the Lambda function is attached to a VPC.

AnswerC

Without event notification, S3 will not invoke the function.

Why this answer

Option C is correct because the S3 bucket must have an event notification configured to trigger the Lambda function. Option A is wrong because function code errors would appear in logs after invocation. Option B is wrong because the IAM role affects execution, not invocation trigger.

Option D is wrong because VPC configuration affects network access, not whether the function is triggered.
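One way to perform this first check programmatically is to inspect the bucket's notification configuration. The sketch below assumes a dictionary shaped like the response of the boto3 S3 client's get_bucket_notification_configuration call; the helper name, the sample ARN, and the account ID are hypothetical.

```python
def has_lambda_trigger(notification_config: dict, function_arn: str) -> bool:
    """Return True if the notification config targets the given Lambda ARN."""
    for entry in notification_config.get("LambdaFunctionConfigurations", []):
        if entry.get("LambdaFunctionArn") == function_arn:
            return True
    return False


# Hypothetical response; a real one would come from the S3 API.
FUNCTION_ARN = "arn:aws:lambda:us-east-1:123456789012:function:process-upload"
sample_config = {
    "LambdaFunctionConfigurations": [
        {"LambdaFunctionArn": FUNCTION_ARN, "Events": ["s3:ObjectCreated:*"]}
    ]
}

print(has_lambda_trigger(sample_config, FUNCTION_ARN))
```

A bucket with no notification configured returns no LambdaFunctionConfigurations entry, which the helper reports as False.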

Practice this question →

102

MCQmedium

A company uses Amazon Redshift for its data warehouse. A data engineer notices that queries are running slowly and the system's disk space is nearly full. The engineer runs the STV_PARTITIONS view and sees that many slices have high 'tossed' counts. What does this indicate, and what should the engineer do?

A.The tossed rows are permanent and cannot be reclaimed; the engineer should perform a deep copy to a new table.

B.The tossed rows indicate that the sort key is not optimal; redefining the sort key will reduce tossed rows.

C.The tossed rows are due to data skew; redistribute the table on a different distribution key.

D.The tossed rows are deleted rows that need to be reclaimed by running VACUUM.

AnswerD

VACUUM removes deleted rows and reclaims disk space, improving query performance.

Why this answer

Option D is correct because 'tossed' rows indicate data that was deleted or updated and is waiting for VACUUM. A high tossed count means wasted space. Running VACUUM reclaims that space.

Option A is wrong because the space can be reclaimed; a deep copy is a more drastic alternative, but VACUUM is the standard action. Option B is wrong because 'tossed' does not indicate sort key issues. Option C is wrong because 'tossed' is not about distribution.

Practice this question →

103

MCQeasy

A data engineer notices that a nightly AWS Glue ETL job has been failing for the past three days with the error 'Unable to locate credentials'. The job uses an IAM role for execution. What is the most likely cause of this error?

A.The IAM role does not have an access key attached.

B.The S3 bucket name in the job parameters is misspelled.

C.The IAM role's trust policy does not include glue.amazonaws.com as a trusted entity.

D.The JDBC connection string contains an incorrect password.

AnswerC

Without the trust policy, Glue cannot assume the role and gets 'Unable to locate credentials'.

Why this answer

The error 'Unable to locate credentials' indicates that the AWS Glue job cannot obtain AWS credentials to authenticate API calls. Since the job uses an IAM role for execution, the most likely cause is that the trust policy of that IAM role does not include 'glue.amazonaws.com' as a trusted entity. Without this trust relationship, AWS Glue cannot assume the role and thus has no credentials to sign requests.

Exam trap

AWS often tests the distinction between IAM role trust policies (who can assume the role) and IAM role permission policies (what actions the role can perform), and candidates mistakenly focus on permission policies when the error is about credential acquisition.

How to eliminate wrong answers

Option A is wrong because IAM roles do not use access keys; they use temporary security credentials obtained via the AWS Security Token Service (STS). Option B is wrong because a misspelled S3 bucket name would cause a 'NoSuchBucket' or 'Access Denied' error, not a credentials-related error. Option D is wrong because an incorrect JDBC password would result in a connection failure or authentication error from the database, not an 'Unable to locate credentials' error from AWS.
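To make the trust-policy distinction concrete, the sketch below shows a trust policy document that names glue.amazonaws.com as a trusted service, plus a small checker. The checker is a simplification written for illustration (it ignores Condition blocks), and the function name is hypothetical.

```python
def trusts_service(trust_policy: dict, service: str) -> bool:
    """Return True if an Allow statement lets `service` call sts:AssumeRole."""
    statements = trust_policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may not be wrapped in a list
        statements = [statements]
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        principal = stmt.get("Principal", {})
        services = principal.get("Service", []) if isinstance(principal, dict) else []
        if isinstance(services, str):
            services = [services]
        if "sts:AssumeRole" in actions and service in services:
            return True
    return False


# Trust policy that allows the Glue service to assume the role.
GLUE_TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "glue.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

print(trusts_service(GLUE_TRUST_POLICY, "glue.amazonaws.com"))
```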

Practice this question →

104

MCQhard

An Amazon RDS for PostgreSQL instance is experiencing high CPU utilization and slow query performance. The data engineer suspects that a specific query is causing the problem. The engineer wants to identify the query and analyze its execution plan. Which steps should the engineer take?

A.Enable CloudWatch Logs for the RDS instance and search for slow query logs.

B.Enable the performance_schema in the PostgreSQL parameter group and query the performance_schema.events_statements_summary_by_digest table.

C.Enable Enhanced Monitoring and analyze the CPU metrics.

D.Use RDS Performance Insights to identify the top queries.

AnswerD

Performance Insights shows the SQL statements contributing most to database load, so the engineer can identify the query and then inspect its plan.

Why this answer

Option D is correct because RDS Performance Insights breaks database load down by SQL statement, which identifies the query responsible for the high CPU; the engineer can then run EXPLAIN on that statement to analyze its execution plan. Option A is incorrect because CloudWatch Logs captures database logs, but not real-time query performance. Option B is incorrect because performance_schema is a MySQL feature; PostgreSQL has no performance_schema or events_statements_summary_by_digest table.

Option C is incorrect because Enhanced Monitoring provides OS-level metrics, not query details.

Practice this question →

105

MCQmedium

Refer to the exhibit. A data engineer has an IAM policy attached to an IAM role used by an AWS Glue job. The Glue job reads from S3 bucket 'example-bucket' and writes to an S3 bucket 'output-bucket'. The job fails with an 'Access Denied' error when writing to 'output-bucket'. What is the MOST likely cause?

A.The policy does not allow s3:PutObject on any bucket.

B.The policy does not allow s3:PutObject on 'output-bucket'.

C.The policy does not allow s3:GetObject on 'output-bucket'.

D.The policy has a condition that restricts s3:PutObject to 'example-bucket'.

AnswerB

The resource is only example-bucket/*.

Why this answer

Option B is correct. The policy only allows s3:PutObject on 'example-bucket/*', not on 'output-bucket/*'. The job needs permission on the output bucket.

Option A is incorrect because s3:PutObject is allowed on example-bucket, but not on output-bucket. Option C is incorrect because s3:GetObject is for reading; the failing write needs s3:PutObject. Option D is incorrect because there is no condition that restricts PutObject to example-bucket.

Practice this question →

106

Multi-Selectmedium

A data engineer is designing a disaster recovery strategy for an Amazon RDS for PostgreSQL database that is used in a data pipeline. The database must have a Recovery Point Objective (RPO) of less than 1 minute and a Recovery Time Objective (RTO) of less than 5 minutes. Which TWO actions should the engineer take?

Select 2 answers

A.Take frequent manual snapshots and copy them to another Region.

B.Enable automated backups with point-in-time recovery.

C.Enable Multi-AZ deployment with a standby instance.

D.Create a read replica in a different Availability Zone.

E.Use cross-Region replication with Amazon Aurora Global Database.

AnswersB, C

Allows recovery to any point within retention period, meeting RPO.

Why this answer

Options B and C are correct. Multi-AZ with standby provides automatic failover with RTO typically under 1-2 minutes, and automated backups with point-in-time recovery allow restoring to any point within the retention period. Option A is wrong because snapshots are manual and have higher RPO/RTO.

Option D is wrong because read replicas are not for automatic failover. Option E is wrong because Aurora Global Database is an Aurora feature that does not apply to an RDS for PostgreSQL instance, and cross-Region replication adds latency and may not meet RPO.

Practice this question →

107

MCQhard

A data engineer is troubleshooting an access issue. A user has the IAM policy shown in the exhibit. The user attempts to upload an object to `s3://data-lake-bucket/confidential/report.pdf`. What will happen?

A.The upload will fail with an 'Access Denied' error.

B.The upload will succeed because the Deny statement is not valid without a condition.

C.The upload will succeed because the Allow statement is more specific than the Deny.

D.The upload will succeed because the user has s3:PutObject permission on the bucket.

AnswerA

The Deny statement explicitly denies all s3 actions on the confidential prefix, taking precedence over the Allow.

Why this answer

Option A is correct because the explicit Deny overrides the Allow, so the upload will be denied. Option B is incorrect because the policy is valid; a Deny statement does not require a condition. Option C is incorrect because an explicit Deny takes precedence regardless of how specific the Allow is.

Option D is incorrect because the user has s3:PutObject allowed for the bucket, but the Deny for the confidential path takes precedence.
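The precedence rule can be sketched with a toy evaluator. The exhibit is not reproduced here, so the policy below is a hypothetical stand-in with one Allow and one Deny statement. The evaluator is a deliberate simplification: it uses fnmatchcase for wildcards and ignores conditions, principals, and other policy types.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    """Normalize a string-or-list policy field to a list."""
    return [value] if isinstance(value, str) else list(value)


def evaluate(policy: dict, action: str, resource: str) -> str:
    """Return 'ExplicitDeny', 'Allow', or 'ImplicitDeny' for one request."""
    decision = "ImplicitDeny"  # nothing matched yet
    for stmt in policy["Statement"]:
        action_hit = any(fnmatchcase(action, p) for p in _as_list(stmt["Action"]))
        resource_hit = any(fnmatchcase(resource, p) for p in _as_list(stmt["Resource"]))
        if not (action_hit and resource_hit):
            continue
        if stmt["Effect"] == "Deny":
            return "ExplicitDeny"  # a matching Deny wins immediately
        decision = "Allow"
    return decision


# Hypothetical policy, not the one from the exhibit.
HYPOTHETICAL_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "s3:PutObject",
         "Resource": "arn:aws:s3:::data-lake-bucket/*"},
        {"Effect": "Deny", "Action": "s3:*",
         "Resource": "arn:aws:s3:::data-lake-bucket/confidential/*"},
    ],
}

print(evaluate(HYPOTHETICAL_POLICY, "s3:PutObject",
               "arn:aws:s3:::data-lake-bucket/confidential/report.pdf"))
```

The same request against a key outside the confidential prefix matches only the Allow statement, and an action that matches no statement falls through to an implicit deny.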

Practice this question →

108

MCQeasy

A data engineer is running an Amazon EMR cluster with Spark to process log files. The cluster uses instance fleets with m5.xlarge core nodes. The engineer observes that the Spark job is running slower than expected. CloudWatch metrics show that the cluster's CPU utilization is below 20% but memory utilization is near 90%. Which configuration change would most likely improve performance?

A.Use memory-optimized instances (r5.xlarge) for core nodes.

B.Increase the number of core nodes from 5 to 10.

C.Increase the number of Spark shuffle partitions.

D.Decrease the number of core nodes to reduce overhead.

AnswerA

r5 instances have higher memory-to-CPU ratio, reducing memory pressure and spills.

Why this answer

Option A is correct because high memory usage with low CPU indicates that the data does not fit in memory, causing spills to disk. Using a memory-optimized instance type (e.g., r5.xlarge) provides more memory per core. Option B is wrong because increasing core nodes adds more CPU but does not address memory per node.

Option C is wrong because the issue is memory, not shuffle partitions. Option D is wrong because reducing nodes reduces total memory.

Practice this question →

109

MCQhard

A company runs a data pipeline using AWS Step Functions to orchestrate multiple AWS Lambda functions and AWS Glue jobs. The pipeline processes large CSV files from Amazon S3, transforms them, and loads them into Amazon Redshift. Recently, the pipeline has been failing intermittently with a 'StateMachineExecutionLimitExceeded' error. The error occurs when multiple pipeline runs are triggered simultaneously. The current execution limit for the state machine is 1000. The team expects up to 200 concurrent executions during peak hours. Which action should the team take to resolve the issue?

A.Increase the execution timeout for the state machine to 1 hour.

B.Increase the Lambda function concurrency limits to allow more parallel processing.

C.Implement a queue (e.g., Amazon SQS) to buffer the pipeline triggers and process them sequentially.

D.Request a service quota increase for the maximum number of state machine executions from AWS Support.

AnswerD

The default limit is 1000; increasing it to 2000 would accommodate the expected concurrency.

Why this answer

Option D is correct because the error indicates the state machine execution limit has been reached. The team should request a limit increase from AWS Support. Option A is wrong because the error is not about execution timeout; it's about exceeding the maximum number of concurrent executions.

Option B is wrong because increasing Lambda concurrency limits does not affect Step Functions execution limits. Option C is wrong because queuing the triggers only reduces the number of concurrent executions; it does not raise the limit.

Practice this question →

110

MCQeasy

A data engineer is troubleshooting a slow Amazon Redshift query. The EXPLAIN plan shows a 'Seq Scan' on a large table. What is the most likely cause?

A.The cluster has too many nodes.

B.There are too many concurrent queries.

C.The table does not have a proper sort key defined.

D.The workload management (WLM) queue is misconfigured.

AnswerC

Without a sort key, Redshift performs a full table scan (Seq Scan) instead of a range-restricted scan.

Why this answer

Option C is correct because 'Seq Scan' indicates a full table scan, which typically occurs when there is no suitable index (or in Redshift, no sort key). Option A (too many nodes) would not cause a Seq Scan. Option B (concurrent queries) could cause slowdown but not a Seq Scan specifically.

Option D (WLM queue) affects concurrency, not scan type.

Practice this question →

111

MCQmedium

A company uses Amazon Athena to query data stored in an S3 bucket. The data is partitioned by year, month, day, and hour. The data engineer notices that queries are scanning a large amount of data even with a WHERE clause on the partition columns. What is the MOST likely cause?

A.The data has too many partitions, causing overhead.

B.The table does not have partitions defined in the AWS Glue Data Catalog.

C.The S3 bucket uses the S3 Glacier storage class.

D.The data files are compressed with GZIP.

AnswerB

Without partition definitions, Athena scans all data.

Why this answer

Option B is correct because if partitions are not defined in the table, Athena cannot perform partition pruning. Option A is wrong because more partitions give Athena finer-grained pruning; they do not make a partition filter scan more data. Option C is wrong because S3 storage class does not affect scanning.

Option D is wrong because compressed files reduce scan size, not increase.
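Registering partitions in the catalog can be done with Athena DDL. The sketch below builds an ALTER TABLE ... ADD PARTITION statement for one hour of data, assuming a Hive-style key=value folder layout; the table name, bucket, and helper name are hypothetical.

```python
def add_partition_ddl(table: str, location_root: str, partition: dict) -> str:
    """Build Athena DDL that registers one partition at its S3 location."""
    root = location_root.rstrip("/")
    spec = ", ".join(f"{key}='{value}'" for key, value in partition.items())
    path = "/".join(f"{key}={value}" for key, value in partition.items())
    return (f"ALTER TABLE {table} ADD IF NOT EXISTS PARTITION ({spec}) "
            f"LOCATION '{root}/{path}/'")


# Hypothetical table and bucket.
ddl = add_partition_ddl(
    "logs",
    "s3://my-bucket/logs/",
    {"year": "2024", "month": "01", "day": "15", "hour": "00"},
)
print(ddl)
```

For Hive-style layouts, MSCK REPAIR TABLE is an alternative that discovers partitions in bulk; once partitions are registered, a WHERE clause on year, month, day, and hour lets Athena skip the other prefixes.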

Practice this question →

112

MCQhard

A data engineer has attached the IAM policy shown in the exhibit to a role used by an AWS Glue ETL job. The job fails when trying to write to the S3 bucket 'example-bucket' with the error: 'Access Denied'. What is the MOST likely reason?

A.The IAM policy does not include the bucket ARN for write operations.

B.The IAM role's trust policy does not allow Glue to assume the role.

C.The S3 bucket policy denies the PutObject action for the role.

D.The IAM policy does not grant s3:PutObject permission.

AnswerC

A bucket policy can explicitly deny access even if IAM allows it.

Why this answer

Option C is correct because the IAM policy shown in the exhibit allows s3:PutObject on the bucket, so the denial most likely comes from the S3 bucket policy: an explicit Deny in a bucket policy overrides an Allow in the role's IAM policy. Option A is incorrect because the policy includes the bucket ARN.

Option B is incorrect because a trust policy problem would prevent Glue from assuming the role at all, rather than surface as 'Access Denied' on a write. Option D is incorrect because the policy includes s3:PutObject.

Practice this question →

113

Multi-Selecthard

A data engineer is designing a disaster recovery plan for an Amazon Redshift data warehouse. The cluster is in us-east-1 and must be recoverable in us-west-2 with minimal data loss. Which THREE actions should the engineer take? (Choose THREE)

Select 3 answers

A.Create manual snapshots and copy them to us-west-2

B.Deploy Redshift in a multi-AZ configuration

C.Enable Redshift concurrency scaling

D.Schedule automated snapshots with a retention period

E.Configure automated snapshot copy to us-west-2

AnswersA, D, E

Manual snapshots can be copied across regions.

Why this answer

Options A, D, and E are correct. Cross-Region snapshot copy allows recovery in another Region, scheduled automated snapshots provide regular restore points, and manual snapshots can also be copied to us-west-2.

Option B is wrong because a multi-AZ deployment provides high availability within a Region but not cross-Region. Option C is wrong because concurrency scaling does not help disaster recovery.

Practice this question →

114

MCQhard

A data engineer at a financial services company manages an AWS Glue ETL pipeline that processes transaction data from Amazon S3 to Amazon Redshift for reporting. The pipeline runs every hour and uses a Glue job that reads Parquet files, performs transformations in Spark, and writes to Redshift using the JDBC connector. Recently, the job has been failing intermittently with the error: 'java.sql.BatchUpdateException: ERROR: null value in column "transaction_id" violates not-null constraint'. The data engineer has verified that the source Parquet files do contain non-null values for transaction_id. The job uses a DynamicFrame and applies a mapping to rename columns. The engineer also noticed that the failure occurs only during peak hours when there is high concurrency on Redshift. Which course of action should the engineer take to resolve this issue?

A.Add a filter in Glue to remove rows with null transaction_id.

B.Increase the Redshift WLM concurrency scaling to handle more queries.

C.Review the Glue job's mapping transformation to ensure transaction_id is correctly mapped and not dropped.

D.Increase the number of Glue workers to handle peak-hour load.

AnswerC

The mapping may have a bug that sets transaction_id to null.

Why this answer

Option C is correct. The error suggests that some rows are being written with null transaction_id. During high concurrency, Redshift might be rejecting the batch due to a transient issue, but the error is about null constraint.

The most likely cause is that the mapping is incorrectly dropping or nullifying the column. Option A is wrong because the source files are not the issue; they contain no null transaction_id values to filter out. Option B is wrong because increasing Redshift WLM concurrency could exacerbate the problem.

Option D is wrong because increasing Glue's worker count does not address the null value issue.
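A mapping review can be partly automated. The sketch below assumes the mapping is expressed as Glue ApplyMapping-style tuples of (source column, source type, target column, target type). The helper names and the sample mapping are hypothetical, and the mismatch shown is only one way a value could fail to reach the target column.

```python
def unmapped_required_columns(mappings, required_targets):
    """Target columns that must be populated but that no mapping produces."""
    mapped_targets = {m[2] for m in mappings}
    return sorted(set(required_targets) - mapped_targets)


def unknown_source_columns(mappings, source_columns):
    """Source columns named in the mapping that the input schema lacks."""
    return sorted({m[0] for m in mappings} - set(source_columns))


# Hypothetical mapping: the source name 'txn_id' does not match the
# 'transaction_id' column actually present in the Parquet files.
MAPPINGS = [
    ("txn_id", "string", "transaction_id", "string"),
    ("amount", "double", "amount", "double"),
]
SOURCE_COLUMNS = ["transaction_id", "amount"]

print(unmapped_required_columns(MAPPINGS, ["transaction_id", "amount"]))
print(unknown_source_columns(MAPPINGS, SOURCE_COLUMNS))
```

Running both checks before the write step flags a renamed or misspelled source column early, instead of at the Redshift not-null constraint.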

Practice this question →

115

Multi-Selecthard

A company is migrating its on-premises data warehouse to Amazon Redshift. The data includes tables with up to 100 columns and 500 million rows. The migration involves a full load followed by incremental updates. The company needs to minimize downtime during the final cutover. Which THREE strategies should the data engineer use to facilitate the migration? (Choose THREE.)

Select 3 answers

A.Increase the number of WLM queues to allow more concurrent loads.

B.Use the COPY command to load data from Amazon S3.

C.Use columnar format (e.g., Parquet) for the data files in S3.

D.Run VACUUM and ANALYZE commands after loading the data.

E.Disable distribution keys on the target tables to simplify loading.

AnswersB, C, D

COPY is optimized for bulk data loading into Redshift.

Why this answer

Option B is correct because using the COPY command with S3 is the most efficient way to load large datasets into Redshift. Option C is correct because using a columnar format like Parquet speeds up data transfer and reduces costs. Option D is correct because using VACUUM and ANALYZE after loading optimizes table storage and query performance.

Option A is wrong because increasing WLM concurrency does not speed up data loading. Option E is wrong because disabling distribution keys results in inefficient data distribution, leading to performance issues.
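The load-then-maintain sequence can be written out as SQL. The sketch below only builds the statement strings for a Parquet load from S3 followed by VACUUM and ANALYZE; the table name, S3 prefix, role ARN, and helper name are hypothetical placeholders.

```python
def build_load_statements(table: str, s3_prefix: str, iam_role_arn: str) -> list:
    """Return the SQL for a bulk Parquet load followed by table maintenance."""
    copy_sql = (
        f"COPY {table} FROM '{s3_prefix}' "
        f"IAM_ROLE '{iam_role_arn}' FORMAT AS PARQUET;"
    )
    return [copy_sql, f"VACUUM {table};", f"ANALYZE {table};"]


# Hypothetical identifiers for illustration only.
statements = build_load_statements(
    "sales",
    "s3://migration-bucket/sales/",
    "arn:aws:iam::123456789012:role/redshift-load",
)
for sql in statements:
    print(sql)
```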

Practice this question →

116

MCQmedium

A data engineer is designing a data pipeline that processes sensitive personal data. The data is ingested via Amazon Kinesis Data Firehose and stored in Amazon S3. The pipeline must ensure that the data is encrypted at rest and in transit. The engineer also needs to audit access to the data. Which combination of services meets these requirements?

A.AWS KMS for encryption at rest, Kinesis Data Analytics for in-transit encryption, and AWS CloudTrail for auditing.

B.AWS KMS for encryption at rest, Amazon CloudWatch Logs for auditing, and TLS for in-transit encryption.

C.S3 server-side encryption (SSE-S3) for at-rest encryption, HTTPS for in-transit encryption, and AWS CloudTrail for auditing.

D.S3 client-side encryption, AWS Config for auditing, and TLS for in-transit encryption.

AnswerC

SSE-S3 encrypts objects at rest, HTTPS encrypts data in transit, and CloudTrail logs S3 API operations for auditing.

Why this answer

Option C is correct because SSE-S3 provides encryption at rest, HTTPS ensures encryption in transit, and CloudTrail logs S3 API calls for auditing. Option A is incorrect because Kinesis Data Analytics is for processing, not encryption. Option B is incorrect because CloudWatch Logs is for monitoring, not auditing data access.

Option D is incorrect because AWS Config tracks configuration, not data access.

Practice this question →

117

Multi-Selecthard

A company runs an Amazon Redshift cluster for data warehousing. The data engineering team notices that the 'Amazon Redshift Data API' is timing out when executing long-running queries. The queries typically take more than 10 minutes to complete. The team wants to ensure that the queries can complete without timeout and that the results are retrievable. Which TWO steps should the team take? (Choose TWO.)

Select 2 answers

A.Set the 'QueryExecutionTimeout' parameter in the Data API call to 30 minutes.

B.Increase the 'timeout' parameter in the Redshift cluster configuration.

C.Use the 'GetStatementResult' operation to retrieve results after the query completes.

D.Set the 'max_execution_time' parameter in the Redshift parameter group to 30 minutes.

E.Use the 'StatementName' parameter to run the query asynchronously and poll for completion.

AnswersC, E

This is the correct way to get results after the statement finishes.

Why this answer

Option E is correct because the Data API runs statements asynchronously: the call returns a statement identifier right away, the query continues to run on the cluster, and the client polls for completion, with the 'StatementName' parameter labeling the statement so it can be found later. Option C is correct because you can retrieve the results using the 'GetStatementResult' operation after the statement has completed. Option A is wrong because the Data API does not have a 'QueryExecutionTimeout' parameter.

Option B is wrong because increasing the query timeout in the cluster does not affect the Data API timeout. Option D is wrong because Redshift does not have a 'max_execution_time' parameter; the relevant parameter is 'statement_timeout'.
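The poll-then-fetch pattern can be sketched as follows. The code assumes a client object exposing describe_statement and get_statement_result with the parameter and response field names used by the boto3 redshift-data client (Id, Status, Records, NextToken); the helper names, polling interval, and poll limit are hypothetical choices.

```python
import time

TERMINAL_STATES = {"FINISHED", "FAILED", "ABORTED"}


def wait_for_statement(client, statement_id, poll_seconds=5.0,
                       max_polls=360, sleep=time.sleep):
    """Poll describe_statement until the statement reaches a terminal state."""
    for _ in range(max_polls):
        status = client.describe_statement(Id=statement_id)["Status"]
        if status in TERMINAL_STATES:
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"statement {statement_id} still running after {max_polls} polls")


def fetch_all_records(client, statement_id):
    """Collect every page of results by following NextToken."""
    records, token = [], None
    while True:
        kwargs = {"Id": statement_id}
        if token:
            kwargs["NextToken"] = token
        page = client.get_statement_result(**kwargs)
        records.extend(page.get("Records", []))
        token = page.get("NextToken")
        if not token:
            return records
```

In practice the client would be created with boto3.client("redshift-data") and the statement ID taken from the execute_statement response; results should be fetched only when the returned status is FINISHED.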

Practice this question →

118

Multi-Selectmedium

A company uses Amazon EMR to run Spark jobs on data stored in Amazon S3. The data engineer notices that the jobs are running slower than expected. The engineer suspects that the S3 storage class might be affecting performance. Which THREE factors can impact read performance from S3? (Choose three.)

Select 3 answers

A.Use of S3 Transfer Acceleration.

B.Use of S3 Select to retrieve only a subset of data.

C.Use of S3 Object Lock.

D.Average object size in S3 bucket.

E.Data stored in compressed format (e.g., GZIP, Snappy).

AnswersB, D, E

S3 Select reduces the amount of data transferred.

Why this answer

Options B, D, and E are correct. B: S3 Select can reduce data scanned. D: Average object size matters because many small objects add per-request overhead, while larger objects improve throughput. E: Data compression reduces network transfer.

Option A is wrong because S3 Transfer Acceleration improves upload speed, not read performance. Option C is wrong because S3 Object Lock does not affect read performance.

Practice this question →

119

MCQmedium

An Amazon Kinesis Data Streams application is lagging behind. The data records are small (1 KB) and the shard count is 10. The consumer uses the KCL with default configuration. Which action will MOST effectively reduce the consumer lag?

A.Increase the number of KCL workers per shard (e.g., 2 workers per shard).

B.Use Enhanced Fan-Out to provide dedicated throughput.

C.Increase the number of shards to 20.

D.Reduce the record size by compressing the data.

AnswerA

More workers can process records concurrently, reducing lag.

Why this answer

Option A is correct because the KCL (Kinesis Client Library) uses a single worker per shard by default, and each worker processes records sequentially within that shard. Increasing the number of workers per shard (e.g., 2 workers) allows parallel processing of the same shard’s records, directly reducing consumer lag when records are small (1 KB) and the bottleneck is CPU or processing time per record, not throughput limits.

Exam trap

The trap here is that candidates often assume increasing shards (Option C) always reduces lag, but they miss that KCL workers are per-shard by default, so more shards only help if the shard is saturated with data, not when the consumer is slow at processing each record.

How to eliminate wrong answers

Option B is wrong because Enhanced Fan-Out provides dedicated throughput per consumer (up to 2 MB/s per shard per consumer), but the issue here is processing lag, not throttling or throughput limits—the default KCL already handles the 1 KB records easily, so dedicated throughput does not address the processing bottleneck. Option C is wrong because increasing shards to 20 would increase the number of parallel processing units, but each shard still has only one KCL worker by default, so the per-shard processing capacity remains unchanged; this would only help if the shard were overloaded with data, which is not the case with small records. Option D is wrong because compressing data reduces the size of records, but the records are already only 1 KB, and the bottleneck is processing time per record, not network or storage throughput; compression adds CPU overhead and does not reduce lag.

Practice this question →

120

MCQhard

Refer to the exhibit. A company has an S3 bucket 'my-data-lake' with the lifecycle policy shown. Objects under the 'logs/' prefix are being moved to GLACIER after 30 days and expire after 365 days. A data engineer notices that objects older than 365 days are still present in the bucket and are not being deleted. What is the most likely cause?

A.Lifecycle expiration does not apply to objects in GLACIER storage class

B.The rule status is disabled

C.The prefix filter does not match the objects

D.The expiration days count from the transition date, not the object creation date

AnswerA

Objects in GLACIER cannot be expired; they must be restored first.

Why this answer

Option A is correct because objects in GLACIER storage class can only be expired if they are restored first, or the expiration action must be applied to the current version. The policy does not specify a filter for current version, and GLACIER objects are not deletable by lifecycle without restoration. Option B is wrong because the rule is enabled.

Option C is wrong because the prefix is correct. Option D is wrong because 365 days have passed.

Practice this question →

121

MCQhard

A company runs a data lake on Amazon S3 with AWS Lake Formation for access control. The data lake contains sensitive customer information. A data scientist needs to query the data using Amazon Athena. The data scientist has been granted SELECT permission on the database and tables via Lake Formation. However, when the data scientist runs a query in Athena, they receive an error: 'Access denied. Please check your permissions.' The IAM role used by Athena has the following permissions: s3:GetObject, s3:ListBucket, and lakeformation:GetDataAccess. The Lake Formation admin has verified that the data scientist is a member of a Lake Formation data lake location and has been granted 'Describe' and 'Select' permissions on the table. What is the most likely reason for the access denied error?

A.The data scientist is not assigned to the correct Lake Formation tag.

B.The S3 bucket policy does not grant the Athena IAM role access to the S3 location.

C.The Athena IAM role is missing lakeformation:GetEffectivePermissions permission.

D.The data scientist's IAM user lacks the necessary S3 permissions.

AnswerB

Lake Formation permissions are separate from S3 bucket policies; the bucket policy must allow the IAM role to read the data.

Why this answer

Option B is correct because, in addition to lakeformation:GetDataAccess, the role needs access to the underlying S3 location. If the S3 bucket policy does not allow the Athena IAM role to access the bucket, the request will be denied, even though the data scientist has been granted permissions in Lake Formation.

Option A is wrong because the error is not about Lake Formation tag access. Option C is wrong because the role already has lakeformation:GetDataAccess. Option D is wrong because Athena uses the IAM role's permissions, not the user's IAM permissions directly.

Practice this question →

122

MCQmedium

A data engineering team notices that an Amazon Kinesis Data Stream is frequently exceeding its shard write throughput limit, causing throttling. The team needs a long-term solution to handle variable write traffic without manual intervention. Which action should the team take?

A.Configure the Kinesis Client Library to throttle consumption.

B.Increase the number of shards manually during peak hours.

C.Use Amazon Kinesis Data Firehose to buffer records before delivery to the stream.

D.Implement a buffer using Amazon S3 and AWS Lambda that aggregates records and writes to Kinesis in batches.

AnswerD

This buffers writes and reduces throttling.

Why this answer

Option D is correct because using an S3 buffer with a Lambda function that batches records and writes to Kinesis can smooth out traffic spikes and reduce throttling. Option A is wrong because Kinesis Client Library is for consumers, not producers. Option B is wrong because increasing shard count manually is not automatic.

Option C is wrong because Kinesis Data Firehose is for delivery, not buffering for Kinesis streams.

Practice this question →

123

MCQeasy

Refer to the exhibit. A data engineer is troubleshooting an IAM policy attached to a user who cannot list objects in the S3 bucket 'example-bucket'. What is the most likely reason?

A.The bucket policy explicitly denies access to the user.

B.The resource ARN for the bucket is incorrect; it should be 'arn:aws:s3:::example-bucket/*'.

C.The policy includes s3:GetObject but not s3:ListObjects.

D.The policy does not include the s3:ListBucket action.

AnswerA

An explicit deny overrides the IAM policy.

Why this answer

Option A is correct because the policy in the exhibit is valid for listing: it grants s3:ListBucket on the bucket ARN, which is what listing objects requires. Since the IAM policy is not the problem, the most likely reason is that the bucket policy explicitly denies access, and an explicit deny overrides the IAM allow.

Option B is wrong because s3:ListBucket is a bucket-level action evaluated against the bucket ARN 'arn:aws:s3:::example-bucket', not the object ARN ending in '/*'. Option C is wrong because there is no s3:ListObjects action; listing is authorized by s3:ListBucket.

Option D is wrong because the policy already includes the s3:ListBucket action.

Practice this question →

124

MCQhard

A company uses AWS Lake Formation to manage access to data in S3. A data analyst reports being unable to query a table in Amazon Athena, receiving an 'Access Denied' error. The analyst has SELECT permission on the table in Lake Formation. What additional configuration is MOST likely causing the issue?

A.Athena does not have permission to access the Glue Data Catalog

B.The IAM role used by Athena does not have S3 GetObject permission on the underlying data

C.The analyst does not have DESCRIBE permission

D.The table is not registered with Lake Formation

AnswerB

Lake Formation grants SELECT, but S3 bucket policies or IAM may still block access.

Why this answer

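Option B is correct because a Lake Formation SELECT grant covers the table, but S3 bucket policies or IAM can still block access to the underlying data; if the IAM role used by Athena lacks s3:GetObject, the query fails with 'Access Denied'. Option A is wrong because Athena's access to the Data Catalog is typically granted through Lake Formation. Option C is wrong because the analyst already has SELECT permission on the table.

Option D is wrong because the table is registered with Lake Formation; the analyst already holds a Lake Formation permission on it.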

125

MCQhard

A data engineer is troubleshooting a failed AWS Glue ETL job that reads from a JDBC source. The error log shows 'java.sql.SQLException: Connection timed out'. The job previously ran successfully. Which of the following is the MOST likely cause?

A.The JDBC connection string has incorrect credentials.

B.The source database schema has changed.

C.The Glue job's timeout setting is too low.

D.The security group for the source database no longer allows traffic from the Glue job's IP range.

AnswerD

A network connectivity issue causes a timeout.

Why this answer

The error 'Connection timed out' indicates a network-level failure, not an authentication or schema issue. Since the job previously ran successfully, the most likely cause is that the security group for the source database no longer allows inbound traffic from the Glue job's IP range. AWS Glue ETL jobs run in a VPC with elastic network interfaces, and the security group rules must permit traffic on the JDBC port (e.g., 5432 for PostgreSQL, 3306 for MySQL).

Exam trap

AWS often tests the distinction between authentication errors (wrong credentials) and network connectivity errors (timeout), and candidates may confuse the Glue job timeout setting with a network timeout.

How to eliminate wrong answers

Option A is wrong because incorrect credentials would produce an authentication error (e.g., 'Access denied for user'), not a timeout. Option B is wrong because a schema change would cause a data type mismatch or column-not-found error, not a connection timeout. Option C is wrong because the Glue job's timeout setting controls how long the job can run before being terminated, not the network connection timeout to the JDBC source.
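
One way to separate a network problem from a credentials problem is a plain TCP check run from a host in the same subnet and security group as the Glue connection. The sketch below uses only the Python standard library; the function name `can_reach` and the host in the usage note are hypothetical.

```python
import socket


def can_reach(host: str, port: int, timeout_seconds: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers timeouts (typical of a blocked security group) and refused connections.
        return False
```

For example, if `can_reach("mydb.example.internal", 3306)` returns False while the database is running, the cause is more likely a security group, network ACL, or routing change than bad credentials.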

126

Multi-Selecteasy

A data engineer needs to transfer 50 TB of data from an on-premises Hadoop cluster to Amazon S3. The network bandwidth is limited to 500 Mbps. Which TWO methods are appropriate for this transfer? (Choose TWO.)

Select 2 answers

A.Set up an AWS Direct Connect connection for higher bandwidth.

B.Order an AWS Snowball Edge device to physically ship the data.

C.Use S3 Transfer Acceleration to upload over the internet.

D.Use Amazon Kinesis Data Firehose to stream the data.

E.Use AWS DataSync to transfer data over the network.

AnswersB, E

Snowball is ideal for large datasets with low bandwidth.

Why this answer

Options B and E are correct. AWS Snowball Edge is a physical device for large data transfers over slow networks. AWS DataSync can transfer data over the network with optimization.

Option A is wrong because AWS Direct Connect is a dedicated network connection, not a transfer method. Option C is wrong because S3 Transfer Acceleration speeds up transfers but still requires network bandwidth. Option D is wrong because Amazon Kinesis Data Firehose is for streaming data, not bulk transfer.
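
A quick estimate shows why bandwidth matters for this scenario. The sketch below assumes decimal terabytes and a link that can be kept busy at a given utilization; the function name `transfer_days` and the 50% utilization figure are illustrative assumptions, not values from the question.

```python
def transfer_days(size_tb: float, bandwidth_mbps: float, utilization: float = 1.0) -> float:
    """Estimate the days needed to move size_tb terabytes over a bandwidth_mbps link."""
    size_bits = size_tb * 1e12 * 8                      # decimal terabytes to bits
    effective_bps = bandwidth_mbps * 1e6 * utilization  # usable bits per second
    seconds = size_bits / effective_bps
    return seconds / 86_400                             # seconds per day


print(f"{transfer_days(50, 500):.1f} days at full utilization")                 # 9.3 days
print(f"{transfer_days(50, 500, utilization=0.5):.1f} days at 50% utilization")  # 18.5 days
```

At 500 Mbps, 50 TB takes a little over nine days even if the link is fully dedicated to the transfer, which is why a shipped device is a reasonable alternative to a network transfer.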

127

MCQeasy

A data engineer notices that an Amazon S3 bucket policy is overly permissive. What is the best practice to restrict access while maintaining required permissions?

A.Grant full S3 access using a new IAM policy.

B.Write a new bucket policy that denies all actions.

C.Use an S3 blocklist to restrict access.

D.Attach the AWS managed policy AmazonS3ReadOnlyAccess to the IAM user.

AnswerD

This policy grants only read access to S3, which is more restrictive than the current overly permissive policy.

Why this answer

Option D is correct because the AWS managed policy 'AmazonS3ReadOnlyAccess' grants read-only access and is more restrictive than full access. Option A is wrong because granting full S3 access is the opposite of restriction. Option B is wrong because denying all actions would break applications that still need access.

Option C is wrong because an S3 blocklist is not a standard access-control method.

128

MCQmedium

A data engineer is tasked with designing a disaster recovery solution for a data lake stored in Amazon S3. The data lake contains sensitive customer data that must be replicated to a different AWS Region. The engineer needs to ensure that all objects, including those with encryption using SSE-KMS, are replicated. Which solution meets the requirements?

A.Use S3 Batch Operations to copy objects to the destination bucket.

B.Enable S3 Cross-Region Replication (CRR) with the appropriate KMS key and IAM role.

C.Use S3 Transfer Acceleration to copy objects across regions.

D.Use the AWS CLI s3 sync command scheduled in a cron job.

AnswerB

CRR supports SSE-KMS with proper configuration.

Why this answer

Option B is correct because S3 Cross-Region Replication can replicate objects encrypted with SSE-KMS if the KMS key is specified and the IAM role has the necessary permissions. Option A is wrong because S3 Batch Operations is for one-time bulk actions. Option C is wrong because S3 Transfer Acceleration speeds up uploads but does not replicate.

Option D is wrong because a scheduled 'aws s3 sync' command is not automatic, ongoing replication.
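
As a rough sketch of what "the appropriate KMS key and IAM role" means in practice, the function below builds a replication configuration in the shape that boto3's `put_bucket_replication` accepts for its `ReplicationConfiguration` parameter. The function name, rule ID, and all ARNs are placeholders; versioning must also be enabled on both buckets, which this sketch does not cover.

```python
def build_replication_config(role_arn: str, dest_bucket_arn: str, replica_kms_key_arn: str) -> dict:
    """Build a replication configuration that also replicates SSE-KMS encrypted objects."""
    return {
        "Role": role_arn,  # IAM role that S3 assumes to replicate objects
        "Rules": [
            {
                "ID": "replicate-all-including-kms",  # hypothetical rule name
                "Priority": 1,
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix: the rule applies to every object
                "DeleteMarkerReplication": {"Status": "Disabled"},
                # Opt in to replicating objects encrypted with SSE-KMS.
                "SourceSelectionCriteria": {
                    "SseKmsEncryptedObjects": {"Status": "Enabled"}
                },
                "Destination": {
                    "Bucket": dest_bucket_arn,
                    # KMS key in the destination Region used to encrypt the replicas.
                    "EncryptionConfiguration": {"ReplicaKmsKeyID": replica_kms_key_arn},
                },
            }
        ],
    }


config = build_replication_config(
    "arn:aws:iam::111122223333:role/example-replication-role",
    "arn:aws:s3:::example-destination-bucket",
    "arn:aws:kms:us-west-2:111122223333:key/EXAMPLE-KEY-ID",
)
```

The replication role additionally needs kms:Decrypt on the source key and kms:Encrypt on the destination key for the replicas to be written.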

129

MCQhard

A company uses AWS Glue to run ETL jobs that process data from Amazon S3 and write results to Amazon Redshift. The Glue job uses the JDBC connection to Redshift. Recently, the job has been failing intermittently with the error: 'java.sql.SQLException: [Amazon](500310) Invalid operation: INSERT has more expressions than target columns;' The Glue job writes to a staging table in Redshift before performing a merge into the final table. The staging table schema matches the source data. The error occurs only on some days and affects different columns each time. The data engineer suspects that the source data occasionally contains extra columns due to a schema drift in the upstream data producer. Which approach should the data engineer take to handle this issue robustly?

A.Skip any records that have extra columns by adding a conditional check in the Glue script.

B.Use a Glue DynamicFrame and apply the resolveChoice method to make the schema consistent.

C.Manually update the Redshift staging table schema whenever the source data changes.

D.Use a Glue DynamicFrame and apply the dropFields method to remove extra columns before writing.

AnswerB

resolveChoice can handle schema drift by casting or dropping columns, making the job resilient.

Why this answer

Option B is correct because Glue DynamicFrames can automatically handle schema drift using the `resolveChoice` method, which allows you to specify how to handle columns that appear inconsistently across records (e.g., making them null, casting to a common type, or dropping them). This directly addresses the intermittent error caused by extra columns in the source data without requiring manual schema updates or fragile conditional logic.

Exam trap

The trap here is that candidates may confuse `dropFields` (which removes specific columns statically) with `resolveChoice` (which handles dynamic schema drift), leading them to choose Option D even though it cannot adapt to varying extra columns across different days.

How to eliminate wrong answers

Option A is wrong because skipping records with extra columns would result in data loss and does not address the root cause—the schema mismatch between the source and the staging table. Option C is wrong because manually updating the Redshift staging table schema whenever the source data changes is not scalable, error-prone, and defeats the purpose of an automated ETL pipeline. Option D is wrong because `dropFields` removes specific named columns statically at coding time, but the error occurs on different columns each day, so a dynamic approach like `resolveChoice` is needed.

130

MCQmedium

A data engineering team notices that an AWS Glue ETL job fails intermittently with a 'ThrottlingException' error. The job reads from an Amazon S3 bucket and writes to an Amazon Redshift table. What is the MOST likely cause of this error?

A.The S3 bucket's request rate is exceeding the bucket's performance limits.

B.The Redshift cluster's write throughput is exceeding its provisioned capacity.

C.The Glue job is exceeding the maximum number of concurrent runs allowed.

D.The Glue job's allocated memory is insufficient for the data volume.

AnswerB

Redshift throttles writes when the cluster's I/O capacity is exceeded.

Why this answer

Option B is correct because a ThrottlingException when writing to Redshift typically indicates that the write throughput exceeds the cluster's capacity, causing throttling. Option A is incorrect because S3 throttling would result in a different error (e.g., SlowDown). Option C is incorrect because Glue job throttling would occur at the API level for job operations, not data operations.

Option D is incorrect because insufficient memory would cause an OutOfMemoryError, not a ThrottlingException.

131

MCQhard

A company uses Amazon Kinesis Data Firehose to deliver streaming log data to an Amazon S3 bucket. The delivery stream uses dynamic partitioning with a custom prefix. Recently, the delivery stream has been failing with the error 'InvalidArgumentException: The number of partitions exceeds the limit'. What is the likely cause?

A.The incoming data contains more distinct partition key values than the allowed limit.

B.The S3 bucket has a bucket policy that restricts the number of prefixes.

C.The buffer size and interval are set too low, causing many small files.

D.The data volume exceeds the maximum throughput of the delivery stream.

AnswerA

Firehose dynamic partitioning has a limit on distinct partition values per batch.

Why this answer

Option A is correct because dynamic partitioning in Firehose limits the number of active partitions per delivery stream (the default quota is 500), and incoming data with too many distinct partition key values exceeds that limit. Option B is wrong because an S3 bucket policy would cause different errors. Option C is wrong because buffer size and interval do not affect the partition count.

Option D is wrong because data volume does not cause this error.

132

MCQmedium

A company uses Amazon Kinesis Data Firehose to deliver streaming data to an Amazon S3 bucket. The data is then processed by a scheduled AWS Glue ETL job that loads it into an Amazon Redshift table. Recently, the Glue job has been failing with the error: 'S3ServiceException: Access Denied'. The Firehose delivery stream is configured with a prefix and error logging to the same S3 bucket. The Glue job uses the same IAM role that has s3:GetObject and s3:ListBucket permissions on the bucket. What is the most likely cause?

A.The Glue job expects a different data format than what Firehose writes.

B.The Glue job's IAM role does not have s3:GetObjectVersion permission.

C.The Glue job is using the wrong IAM role that does not have permissions to the S3 bucket.

D.The S3 bucket has default encryption enabled with AWS KMS (SSE-KMS), and the Glue job's IAM role lacks kms:Decrypt permission.

AnswerD

SSE-KMS requires kms:Decrypt permission; missing it causes access denied when reading.

Why this answer

Option D is correct because Firehose uses SSE-S3 by default unless configured otherwise. If the S3 bucket has default encryption enabled with SSE-KMS, Firehose will use that encryption, but the Glue job's IAM role may lack kms:Decrypt permission for the KMS key. The error 'Access Denied' when reading from S3 often indicates encryption permission issues.

Option A is wrong because the error is about access, not data format. Option B is wrong because s3:GetObjectVersion is needed only when reading a specific object version. Option C is wrong because the job uses the intended role, which already has s3:GetObject and s3:ListBucket on the bucket; what it lacks is the KMS permission.

133

Multi-Selectmedium

A data engineer is troubleshooting an AWS Glue job that fails with 'java.lang.OutOfMemoryError: Java heap space'. The job processes a large dataset. Which TWO configuration changes should the engineer consider to resolve this issue? (Choose TWO.)

Select 2 answers

A.Change the output format from Parquet to CSV.

B.Increase the Spark shuffle partitions configuration (spark.sql.shuffle.partitions).

C.Reduce the number of partitions in the source data.

D.Increase the number of DPUs allocated to the Glue job.

E.Disable job bookmarks to avoid incremental processing.

AnswersB, D

More partitions reduce data per partition, lowering memory usage.

Why this answer

Options B and D are correct. B: Increasing the Spark shuffle partitions reduces the amount of data shuffled per partition, reducing memory pressure. D: Increasing the number of DPUs provides more memory for the job.

Option A is wrong because changing the output format does not directly address heap space. Option C is wrong because reducing the number of partitions in the source data may increase partition size. Option E is wrong because disabling job bookmarks may cause reprocessing but does not fix memory.
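
The effect of the shuffle partition count can be seen with simple arithmetic. The sketch below assumes a target of roughly 128 MiB per shuffle partition, which is a common rule of thumb rather than a Glue requirement; the function name and the 200 GiB figure are illustrative.

```python
import math


def shuffle_partitions_for(shuffle_bytes: int, target_partition_bytes: int = 128 * 1024**2) -> int:
    """Return a partition count that keeps each shuffle partition near the target size."""
    return max(1, math.ceil(shuffle_bytes / target_partition_bytes))


# Example: 200 GiB of shuffle data.
shuffle_bytes = 200 * 1024**3
default_count = 200  # Spark's default for spark.sql.shuffle.partitions

print(shuffle_bytes / default_count / 1024**2)  # 1024.0 MiB per partition at the default
print(shuffle_partitions_for(shuffle_bytes))    # 1600 partitions for about 128 MiB each
```

With the default of 200 partitions each task would hold about 1 GiB of shuffle data, so raising `spark.sql.shuffle.partitions` lowers the memory each task needs.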

134

MCQeasy

A data engineer needs to monitor the number of Amazon S3 PUT requests that result in a 403 AccessDenied error. Which CloudWatch metric and dimension should be used?

A.NumberOfObjects metric with the ObjectType dimension.

B.BucketSizeBytes metric with the StorageType dimension.

C.4xxErrors metric with the FilterId dimension set to '403'

D.AllRequests metric with the BucketName dimension.

AnswerC

4xxErrors metric with a filter for 403 provides the count of AccessDenied errors.

Why this answer

The correct answer is C because Amazon S3 CloudWatch metrics include `4xxErrors`, which counts HTTP 4xx status code responses. To filter specifically for 403 AccessDenied errors, you set the `FilterId` dimension to a filter that matches the 403 status code. This allows precise monitoring of unauthorized PUT requests.

Exam trap

The trap here is that candidates confuse `4xxErrors` (which counts all 4xx errors) with a metric that directly counts 403 errors, forgetting that a dimension filter is required to isolate the specific status code.

How to eliminate wrong answers

Option A is wrong because `NumberOfObjects` with `ObjectType` dimension tracks the count of objects per storage class (e.g., Standard, Glacier), not error responses. Option B is wrong because `BucketSizeBytes` with `StorageType` dimension measures bucket storage size, not request errors. Option D is wrong because `AllRequests` with `BucketName` dimension counts all requests (including successful ones) but does not filter by HTTP status code, so it cannot isolate 403 errors.

135

MCQmedium

A data engineer is troubleshooting a failed AWS Glue ETL job that reads from and writes to the S3 bucket 'example-bucket'. The job's IAM role has the policy shown in the exhibit. The job fails with an Access Denied error when writing to a prefix 'output/'. Which permission is MISSING?

A.s3:PutObjectAcl

B.s3:GetBucketAcl

C.s3:ListBucket on the output prefix

D.s3:DeleteObject

AnswerD

Glue often deletes temporary files and may need DeleteObject permission.

Why this answer

Option D is correct because the exhibit shows no s3:DeleteObject permission. The policy grants s3:ListBucket on the bucket itself and s3:GetObject and s3:PutObject on 'example-bucket/*', and the wildcard covers every prefix, including 'output/'.

Glue often deletes temporary files when it writes output, so the missing s3:DeleteObject is the most likely cause of the Access Denied error.

Option A is wrong because s3:PutObjectAcl is needed only when a write sets an object ACL. Option B is wrong because s3:GetBucketAcl is not needed for writing. Option C is wrong because s3:ListBucket is already granted on the bucket.

136

Multi-Selecteasy

A company is using AWS Glue ETL jobs to process data from Amazon S3 and write results back to S3. The jobs are failing intermittently with 'ThrottlingException' errors. Which TWO configurations would help reduce these errors?

Select 2 answers

A.Decrease the number of DPUs for the job.

B.Enable GZIP compression on the output data.

C.Add retry logic with exponential backoff in the job script.

D.Change the job type from Spark to Python shell.

E.Increase the number of DPUs for the job.

AnswersC, E

Retries handle transient throttling gracefully.

Why this answer

Option C is correct because retry logic with exponential backoff in the job script handles transient throttling. Option E is correct because increasing the number of DPUs may distribute the load and reduce throttling. Option A is wrong because decreasing the number of DPUs increases the chance of throttling.

Option B is wrong because compression doesn't reduce API calls. Option D is wrong because the job type is not related to throttling.
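
Retry logic with exponential backoff can be sketched in a few lines of standard-library Python. `ThrottlingError` is a hypothetical stand-in for whatever exception the client raises; with boto3 this would typically be a botocore `ClientError` whose error code is inspected. The delay values are illustrative assumptions.

```python
import random
import time


class ThrottlingError(Exception):
    """Stand-in for a throttling error raised by a service client (hypothetical)."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.5, max_delay=8.0, sleep=time.sleep):
    """Call operation(), retrying on ThrottlingError with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ThrottlingError:
            if attempt == max_attempts:
                raise  # give up after the final attempt
            # The delay ceiling doubles on each attempt and is capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Full jitter spreads retries out so parallel workers do not retry in lockstep.
            sleep(random.uniform(0, ceiling))
```

The `sleep` parameter is injectable so the function can be exercised without real waiting; in a job script the default `time.sleep` would be used.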

137

MCQhard

A company runs a nightly AWS Glue ETL job that writes results to an Amazon Redshift table using the JDBC connector. Recently, the job has been failing with the error 'ERROR: connection to server at ... failed: server closed the connection unexpectedly'. The Redshift cluster is in a private subnet with a VPC endpoint for S3. The Glue job runs in the same VPC with enhanced VPC routing enabled. Which is the most likely cause?

A.The JDBC driver is missing the 'redshift' compatibility mode setting.

B.SSL is not enabled on the Redshift cluster.

C.The Glue job's security group does not allow outbound traffic to the Redshift cluster.

D.AWS Glue does not support Redshift as a data source.

AnswerA

Setting 'redshift' compatibility in the JDBC URL (e.g., '?compatible=redshift') ensures proper handling of Redshift-specific features and prevents unexpected disconnections.

Why this answer

Option A is correct because Redshift compatibility mode in the JDBC driver is required for Glue to properly handle connections. Without it, the driver may close the connection unexpectedly. Option B is wrong because Redshift does not enforce SSL by default unless configured.

Option C is wrong because the Glue job is in the same VPC and uses enhanced VPC routing, so the connection should work. Option D is wrong because Glue supports Redshift as a data source.

138

MCQhard

A company uses Amazon Kinesis Data Streams to ingest real-time clickstream data. A Lambda function processes each record. Recently, the Lambda function has been failing with 'ProvisionedThroughputExceededException' when writing results to a DynamoDB table. The data engineer has already increased the DynamoDB write capacity. What else can the engineer do to resolve the issue?

A.Increase the Lambda function memory.

B.Increase the DynamoDB read capacity units.

C.Decrease the Lambda batch size to 1.

D.Increase the number of shards in the Kinesis stream.

AnswerD

More shards distribute the load across more Lambda invocations.

Why this answer

Option D is correct because increasing the number of shards increases the number of concurrent Lambda invocations, reducing the batch size per invocation and easing pressure on DynamoDB. Option A is incorrect because increasing Lambda memory does not directly help with throttling. Option B is incorrect because increasing DynamoDB read capacity does not affect writes.

Option C is incorrect because reducing the batch size may increase throttling frequency.

139

MCQhard

A data engineer is monitoring an Amazon Redshift cluster and notices that the 'WLM query wait time' metric is consistently high during peak hours. The cluster uses automatic WLM. The engineer wants to reduce query wait times without changing the cluster size. Which action is MOST effective?

A.Enable concurrency scaling.

B.Change WLM to manual mode and increase the number of queues.

C.Increase the maximum number of queries per queue.

D.Enable short query acceleration (SQA).

AnswerA

Concurrency scaling adds capacity to handle concurrent queries.

Why this answer

Option A is correct because enabling concurrency scaling adds transient cluster capacity to handle bursts. Option B is wrong because manual WLM requires tuning queues. Option C is wrong because increased concurrency without additional resources may worsen contention.

Option D is wrong because short query acceleration helps with short queries, not necessarily wait times.

140

MCQhard

A data engineer is troubleshooting a DMS task that is replicating data from an on-premises Oracle database to an RDS for MySQL instance. The task is failing with 'ORA-1555: snapshot too old' error. What is the best course of action?

A.Disable full supplemental logging on the source tables.

B.Increase the size of the redo logs on the source database.

C.Enable batch optimized apply on the DMS task.

D.Increase the UNDO tablespace size and set UNDO_RETENTION to a higher value.

AnswerD

This gives the CDC process enough undo to read consistent snapshots.

Why this answer

Option D is correct because the snapshot too old error occurs when UNDO segments are too small or retention is too short for the long-running read needed by CDC. Increasing the undo tablespace size and retention time allows CDC to read consistent old data. Option B is wrong because increasing redo logs does not prevent snapshot too old errors.

Option A is wrong because disabling supplemental logging would lose CDC capability. Option C is wrong because batch optimized apply changes how changes are applied to the target and does not address the root cause on the source.

141

MCQeasy

A data engineer needs to transform a large dataset stored in Amazon S3 using Apache Spark. The engineer wants to minimize startup time and use a serverless approach. Which AWS service should the engineer use?

A.Amazon Redshift

B.Amazon EMR

C.AWS Glue

D.Amazon Athena

AnswerC

Serverless Spark with fast startup.

Why this answer

Option C is correct because AWS Glue provides a serverless Spark environment with fast startup. Option A is wrong because Amazon Redshift is a data warehouse, not a Spark environment. Option B is wrong because Amazon EMR requires cluster provisioning.

Option D is wrong because Amazon Athena is for querying, not transforming with Spark.

142

Multi-Selecteasy

Which TWO actions are effective ways to monitor the health of an Amazon DynamoDB table? (Choose two.)

Select 2 answers

A.Use AWS S3 inventory to track table size.

B.Use EC2 instance status checks.

C.Enable DynamoDB Streams and process with Lambda to detect failures.

D.Set up Amazon CloudWatch alarms on ConsumedReadCapacityUnits.

E.Monitor the 'TableHealth' metric in CloudWatch.

AnswersC, D

Streams can be used for monitoring changes.

Why this answer

Options C and D are correct. CloudWatch metrics like ConsumedReadCapacityUnits and ThrottledRequests provide insight into table health, and DynamoDB Streams processed with Lambda can be used to monitor changes. Option A is wrong because S3 Inventory is not for DynamoDB monitoring.

Option B is wrong because EC2 instance status checks are not relevant to DynamoDB. Option E is wrong because there is no 'TableHealth' metric.

143

Multi-Selecthard

A data engineer is designing a disaster recovery plan for an Amazon RDS for PostgreSQL database. The database is 500 GB and has a multi-AZ deployment. The recovery point objective (RPO) is 5 minutes, and the recovery time objective (RTO) is 2 hours. Which THREE actions should the engineer take to meet these objectives?

Select 3 answers

A.Enable Multi-AZ deployment for automatic failover.

B.Enable automated backups with a retention period of 1 day.

C.Take daily manual snapshots and export them to Amazon S3.

D.Disable automatic backups to reduce storage costs.

E.Configure a cross-region read replica for faster recovery in another region.

AnswersA, B, E

Multi-AZ provides automatic failover to standby in case of failure.

Why this answer

Options A, B, and E are correct. Multi-AZ (A) provides automatic failover. Automated backups (B) provide point-in-time recovery.

Read replica promotion (E) can be faster for cross-region recovery. Option C is wrong because snapshot export to S3 is for archival, not fast recovery. Option D is wrong because disabling automatic backups prevents point-in-time recovery.

144

Multi-Selectmedium

A company is running a critical data pipeline using AWS Glue. The pipeline must be highly available and fault-tolerant. Which TWO strategies should the data engineer implement? (Choose TWO.)

Select 2 answers

A.Configure the Glue job to run in multiple Availability Zones.

B.Use a single instance type for all Glue workers.

C.Increase the number of concurrent runs for the Glue job.

D.Enable job retries with exponential backoff.

E.Disable job bookmarks to avoid reprocessing.

AnswersA, D

Multi-AZ provides redundancy.

Why this answer

Configuring the Glue job to run in multiple Availability Zones ensures that if one AZ experiences a failure, the job can continue processing in another AZ, providing high availability and fault tolerance. This is a fundamental strategy for resilient data pipeline design in AWS.

Exam trap

The trap here is that candidates often confuse increasing concurrency (Option C) with fault tolerance, but concurrency only scales processing horizontally without providing redundancy against infrastructure failures.

145

MCQmedium

A data engineer is managing an Amazon RDS for PostgreSQL instance that serves as a source for change data capture (CDC) using AWS DMS. The DMS task is a full load followed by ongoing replication. The full load completed successfully, but the ongoing replication is failing with the error 'Value too long for character type'. The engineer has verified that the target database schema matches the source. The source table has a VARCHAR(256) column, and the target has VARCHAR(256) as well. However, some source rows contain values longer than 256 characters. What should the engineer do to resolve the issue?

A.Modify the DMS task to truncate data that exceeds the column length.

B.Rename the target column to match a different source column.

C.Change the target column to a CLOB data type.

D.Alter the target table column to a larger data type, such as VARCHAR(512).

AnswerD

Resolves the length mismatch.

Why this answer

Option D is correct because the error indicates that source data exceeds the column length. The source column definition may not enforce the length, or the data was inserted bypassing constraints. The engineer should alter the target column to a larger size to accommodate the actual data.

Option A is wrong because truncating data may cause data loss. Option B is wrong because DMS maps columns by name; renaming would cause mapping issues. Option C is wrong because the error is about length, not character set.

146

Multi-Selectmedium

A company runs a data processing pipeline on Amazon EMR. The pipeline reads data from S3, processes it with Spark, and writes results back to S3. The engineer notices that the cluster is underutilized and wants to reduce costs. Which TWO actions should the engineer take? (Choose TWO.)

Select 2 answers

A.Use Spot instances for task nodes.

B.Configure the cluster to terminate after the job completes.

C.Change the master node to a larger instance type.

D.Enable EMRFS consistent view.

E.Increase the number of core nodes to improve parallelism.

AnswersA, B

Spot instances are cheaper than On-Demand.

Why this answer

Option A is correct because using Spot instances for task nodes in Amazon EMR can significantly reduce costs, as Spot instances are spare EC2 capacity offered at up to 90% discount compared to On-Demand instances. Since task nodes are stateless and can be added or removed without affecting cluster stability, they are ideal candidates for Spot instances, allowing the engineer to lower expenses while maintaining processing capacity.

Exam trap

The trap here is that candidates may confuse cost optimization features like Spot instances and auto-termination with performance improvements or data consistency settings, leading them to select options that increase resources or enable features unrelated to cost reduction.

147

MCQeasy

A data engineer uses AWS CloudTrail to investigate a security incident. The engineer runs the command shown in the exhibit. What does the output indicate?

A.A file was downloaded from the S3 bucket.

B.A file was deleted from the S3 bucket.

C.A batch of files was listed from the S3 bucket.

D.A file was uploaded to the S3 bucket.

AnswerD

PutObject indicates an upload.

Why this answer

Option D is correct because the EventName is PutObject, and the ResourceName includes the object key 'sales_2024-07-01.csv', indicating that a file was uploaded. Option A is wrong because the event is a PutObject, not a GetObject. Option B is wrong because the event is an upload, not a deletion.

Option C is wrong because the event is a single object upload, not a batch listing.

148

MCQhard

A company uses Amazon S3 to store log files from multiple applications. The logs are encrypted with AWS KMS (SSE-KMS). A data engineer needs to grant a new IAM user read-only access to the logs. The engineer attaches an S3 bucket policy that allows s3:GetObject and a KMS key policy that allows kms:Decrypt. However, the user still receives an 'Access Denied' error when trying to download an object. What is the MOST likely missing permission?

A.The user does not have s3:ListBucket permission on the bucket.

B.The user does not have s3:GetObjectVersion permission.

C.The user's IAM policy does not include kms:Decrypt permission.

D.The user does not have kms:GenerateDataKey permission.

AnswerC

Both the key policy and the IAM user policy must allow kms:Decrypt; the IAM policy is missing this action.

Why this answer

Option C is correct because reading an SSE-KMS encrypted object requires kms:Decrypt, and that permission must be allowed in the user's IAM policy as well as in the key policy. The key policy alone is not sufficient if the IAM user's policy denies or does not allow the action. Option A is incorrect because s3:ListBucket is for listing, not downloading.

Option B is incorrect because s3:GetObjectVersion is for versioned buckets. Option D is incorrect because kms:GenerateDataKey is for encryption, not decryption.

149

MCQhard

Refer to the exhibit. An IAM policy is attached to an IAM role used by an application. The application needs to read objects from 'my-bucket' that have the tag 'classification=public'. The application account is 123456789012. However, the application is getting 'Access Denied' errors. What is the most likely reason?

A.The Deny statement uses StringNotEquals, which incorrectly denies the application account.

B.The policy does not grant s3:ListBucket permission, so the application cannot list objects.

C.The object being accessed does not have the tag 'classification=public'.

D.The Deny statement blocks all access from accounts other than 123456789012, but the application is in that account.

AnswerC

Without the tag, the Allow condition fails, leading to implicit deny.

Why this answer

Option C is correct because the Allow statement applies only when the object carries the tag 'classification=public' (the s3:ExistingObjectTag condition). If the object does not have that tag, the Allow does not match, no other statement grants s3:GetObject, and the request falls through to an implicit deny, which surfaces as 'Access Denied'.

Option A is wrong because StringNotEquals makes the Deny apply only when the requesting account is NOT 123456789012. The application is in that account, so the condition evaluates to false and the Deny does not apply. Option B is wrong because a missing s3:ListBucket permission prevents listing but does not block a direct GetObject on a known key. Option D is wrong because it describes the Deny statement accurately, but that behavior does not affect the application: requests from account 123456789012 are not denied, so the Deny cannot be the cause of the error.
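
The evaluation order behind this answer can be sketched as a small Python function. The exhibit is not reproduced here, so the two rules below are an assumed reconstruction based on the explanation: a Deny that fires when the requesting account is not 123456789012, and an Allow for s3:GetObject that requires the object tag classification=public. The function name evaluate_get_object is hypothetical, and real IAM evaluation involves more policy types than this.

```python
ALLOWED_ACCOUNT = "123456789012"


def evaluate_get_object(source_account, object_tags):
    """Return 'Allow', 'ExplicitDeny', or 'ImplicitDeny' for an s3:GetObject request.

    Assumed reconstruction of the exhibit, not the actual policy.
    """
    # Deny statement with StringNotEquals: applies only to OTHER accounts.
    # An explicit deny always wins, so it is checked first.
    if source_account != ALLOWED_ACCOUNT:
        return "ExplicitDeny"
    # Allow statement: matches only if the object carries classification=public.
    if object_tags.get("classification") == "public":
        return "Allow"
    # No statement matched the request, so the default (implicit) deny applies.
    return "ImplicitDeny"


tagged = evaluate_get_object("123456789012", {"classification": "public"})
untagged = evaluate_get_object("123456789012", {})
other_account = evaluate_get_object("999999999999", {"classification": "public"})

print(tagged)         # Allow
print(untagged)       # ImplicitDeny
print(other_account)  # ExplicitDeny
```

The untagged case is the one in the question: the request comes from the right account, so the Deny is skipped, but the Allow does not match either and the request is refused by default.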

150

Multi-Select · easy

A data engineer is troubleshooting a failed AWS Glue job that reads from an S3 bucket and writes to an Amazon Redshift table. The error message indicates 'Access Denied'. Which TWO permissions are likely missing? (Choose TWO.)

Select 2 answers

A. s3:GetObject on the source S3 bucket

B. redshift:CopyFromS3 on the Redshift cluster

C. kms:Decrypt on a KMS key

D. ec2:DescribeInstances on the Redshift cluster

E. lambda:InvokeFunction on a Lambda function

Answers: A, B

Glue needs read access to objects in the S3 bucket and permission to load data into the Redshift table.

Why this answer

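Options A and B are required for the Glue job to read from S3 and write to Redshift. Option C (kms:Decrypt) is unnecessary unless the data is encrypted with a KMS key. Option D (ec2:DescribeInstances) is an EC2 action and is not involved in reading from S3 or writing to Redshift.

Option E (lambda:InvokeFunction) is wrong because the job does not use Lambda.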
