DEA-C01 Data Store Management — All Questions With Answers

Question 1 (medium, multiple choice)

A company has an Amazon RDS for MySQL DB instance with read replicas. The primary DB instance fails. What is the correct procedure to promote a read replica to become the new primary?

Question 2 (hard, multiple choice)

A data engineering team uses Amazon Redshift for analytics. They notice that queries on a large fact table are slow. The table is distributed using DISTSTYLE ALL. Which design change would most likely improve query performance?

Question 3 (easy, multiple choice)

A company uses Amazon DynamoDB for a gaming application. They need to store player session data that expires after 24 hours. Which DynamoDB feature should they use to automatically delete expired items?
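As a rough illustration of time-based expiry, the sketch below computes the Unix epoch second that falls 24 hours after a session is created. DynamoDB's Time to Live feature reads a numeric epoch-seconds attribute of this kind. The attribute name `expires_at` and both helper functions are hypothetical, chosen only for this example.

```python
from datetime import datetime, timedelta, timezone

SESSION_LIFETIME = timedelta(hours=24)


def expiry_epoch(created_at: datetime, lifetime: timedelta = SESSION_LIFETIME) -> int:
    """Return the Unix epoch second at which a session should expire."""
    return int((created_at + lifetime).timestamp())


def build_session_item(player_id: str, created_at: datetime) -> dict:
    """Build a plain dict shaped like a session item (no AWS call is made)."""
    return {
        "player_id": player_id,
        "created_at": int(created_at.timestamp()),
        # Hypothetical attribute name; a TTL setting would point at this field.
        "expires_at": expiry_epoch(created_at),
    }


if __name__ == "__main__":
    print(build_session_item("player-1", datetime.now(timezone.utc)))
```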

Question 4 (medium, multi select)

Which TWO actions should a data engineer take to encrypt data at rest in an Amazon S3 bucket? (Select TWO.)

Question 5 (hard, multi select)

A data engineer is designing a data lake on Amazon S3. The data must be immutable and support high-throughput streaming ingestion. Which THREE features should the engineer consider? (Select THREE.)

Question 6 (easy, multiple choice)

An e-commerce application uses Amazon ElastiCache for Redis to cache product catalog data. The cache currently uses lazy loading. The team wants to ensure that frequently accessed product data is always fresh. Which caching strategy should they implement?

Question 7 (hard, multiple choice)

Refer to the exhibit. A data engineer has attached this bucket policy to an S3 bucket. What is the effect of this policy?

Exhibit

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:role/DataLakeRole"
      },
      "Action": [
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Resource": "arn:aws:s3:::example-bucket/*"
    },
    {
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": "arn:aws:s3:::example-bucket/*",
      "Condition": {
        "Bool": {
          "aws:SecureTransport": "false"
        }
      }
    }
  ]
}
```
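To reason about how the two statements interact, the following is a deliberately simplified sketch of the evaluation order for this one policy: an explicit Deny is checked first, then the Allow. It is not the real IAM evaluation engine. It ignores identity-based policies, other bucket settings, and the way assumed-role sessions are matched to a role ARN. The function name `evaluate` and its return labels are hypothetical.

```python
DATA_LAKE_ROLE = "arn:aws:iam::123456789012:role/DataLakeRole"
ALLOWED_ACTIONS = {"s3:GetObject", "s3:PutObject"}


def evaluate(principal_arn: str, action: str, uses_tls: bool) -> str:
    """Return how the exhibit's bucket policy alone treats one object request."""
    # Second statement: explicit Deny for every principal and every S3 action
    # when aws:SecureTransport is false. An explicit Deny overrides any Allow.
    if action.startswith("s3:") and not uses_tls:
        return "Deny"
    # First statement: Allow GetObject and PutObject for the named role.
    if principal_arn == DATA_LAKE_ROLE and action in ALLOWED_ACTIONS:
        return "Allow"
    # Nothing in this policy matched; other policies would decide the outcome.
    return "NoMatch"


if __name__ == "__main__":
    print(evaluate(DATA_LAKE_ROLE, "s3:GetObject", uses_tls=False))
    print(evaluate(DATA_LAKE_ROLE, "s3:PutObject", uses_tls=True))
```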

Question 8 (medium, multiple choice)

A data engineer runs the AWS CLI command shown in the exhibit to view the table metadata in the AWS Glue Data Catalog. The data is stored as CSV in S3 with partitions by year and month. When querying the table using Amazon Athena, no data is returned. What is the most likely cause?

Exhibit: AWS CLI command (not captured in this copy)

Question 9 (medium, multi select)

Which THREE storage classes in Amazon S3 are designed for infrequently accessed data with millisecond retrieval times? (Select THREE.)

Question 10 (easy, multiple choice)

A company stores time-series sensor data in Amazon S3. They need to query the data using SQL with minimal latency and no infrastructure management. Which service should they use?

Question 11 (medium, multiple choice)

A data engineer needs to migrate an on-premises MySQL database to Amazon RDS for MySQL with minimal downtime. Which approach should they use?

Question 12 (hard, multiple choice)

A company uses Amazon DynamoDB with on-demand capacity. They notice higher than expected costs due to a sudden spike in read traffic from a reporting job. The reporting job scans the entire table daily. What is the most cost-effective way to reduce costs while maintaining the same reporting output?

Question 13 (easy, multiple choice)

A data engineer has set up an Amazon S3 lifecycle policy to transition objects to Glacier Instant Retrieval after 30 days. After 60 days, objects should transition to Deep Archive. However, objects are not transitioning to Deep Archive. What is the most likely cause?

Question 14 (hard, multiple choice)

A data engineer attaches the IAM policy shown in the exhibit to an IAM user. The user tries to download an object from my-bucket using the AWS CLI without specifying SSE headers. The object is stored with SSE-S3. Will the download succeed?

Exhibit

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::my-bucket/*",
      "Condition": {
        "StringEquals": {
          "s3:x-amz-server-side-encryption": "AES256"
        }
      }
    }
  ]
}
```

Question 15 (medium, multi select)

A company is designing a data lake on Amazon S3. Which TWO strategies improve query performance for Amazon Athena?

Question 16 (hard, multi select)

A company is migrating a legacy data warehouse to Amazon Redshift. They need to choose a distribution style to minimize data movement during joins. Which THREE factors should they consider?

Question 17 (medium, multiple choice)

A company runs a multi-AZ Amazon RDS for PostgreSQL instance. They need to run a one-time analytical query that will take several hours and consume significant I/O. The query should not impact the primary workload. What should the data engineer do?

Question 18 (easy, multiple choice)

A company is migrating its on-premises MySQL database to Amazon RDS for MySQL. They want to minimize downtime and ensure data consistency. Which AWS service should be used for the migration?

Question 19 (medium, multiple choice)

A data engineer is troubleshooting a slow-running query on an Amazon Redshift cluster. The query involves joining two large tables. The engineer notices that the query plan shows a large number of distribution and broadcast operations. Which design change would most likely improve query performance?

Question 20 (hard, multiple choice)

A company uses Amazon DynamoDB for a gaming leaderboard. The table has a partition key of 'game_id' and a sort key of 'score'. The read capacity is provisioned at 1000 RCUs. During peak hours, users report high latency when querying the top 10 scores for a specific game. The DynamoDB metrics show ConsumedReadCapacityUnits averaging 800 but occasional throttling. What is the most likely cause and solution?

Question 21 (medium, multiple choice)

A data engineer runs the AWS CLI command shown in the exhibit and receives the output shown. The object is part of an S3 Lifecycle policy that transitions objects to Glacier Instant Retrieval after 30 days. The object was created on January 1, 2023. Why is the object still in STANDARD_IA storage class?

Exhibit: AWS CLI command and output (not captured in this copy)

Question 22 (hard, multi select)

A data engineer is designing a data lake on Amazon S3 for analytics. The data includes sensitive PII that must be encrypted at rest. The company requires that the encryption keys be managed by the company's own hardware security module (HSM) and rotated every 90 days. Which TWO options meet these requirements? (Choose TWO.)

Question 23 (hard, multiple choice)

A company runs a real-time analytics platform using Amazon Kinesis Data Streams with a shard count of 10. The data is consumed by an AWS Lambda function that writes to an Amazon DynamoDB table. The DynamoDB table has a partition key of 'user_id' and a sort key of 'timestamp'. The table is provisioned with 5000 RCUs and 5000 WCUs. Recently, the application experienced increased write latency and throttling errors (ProvisionedThroughputExceededException) on the DynamoDB table. The CloudWatch metrics show that ConsumedWriteCapacityUnits averages 4500 with occasional spikes to 6000. The Lambda function’s concurrency is set to 1000. The data engineer suspects the issue is due to hot partitions. Upon investigation, the engineer finds that a small number of users generate a disproportionately large amount of data. Which course of action would best resolve the throttling while minimizing cost?

Question 24 (medium, multiple choice)

A data engineering team is designing a data lake on Amazon S3 with a folder structure that separates raw, transformed, and curated data. The team needs to implement lifecycle policies to minimize storage costs while ensuring that data in the 'raw' zone is retained for 90 days before being moved to Amazon S3 Glacier Deep Archive. Additionally, data in the 'curated' zone should be deleted after 365 days. What is the MOST cost-effective way to achieve these requirements?

Question 25 (hard, multi select)

A data engineer is setting up an Amazon Redshift cluster for a data warehouse. The cluster will store historical sales data and support complex analytical queries. To optimize query performance and manage storage, the engineer needs to choose appropriate distribution styles and sort keys for a large fact table 'sales' and several dimension tables. Which TWO of the following design decisions are BEST practices?

Question 26 (easy, multiple choice)

A data engineer ran the CLI command shown in the exhibit to describe an Amazon DynamoDB table named 'Orders'. The table has a key schema with 'OrderID' as the partition key and 'CustomerID' as the sort key. The table currently has no items. The engineer wants to add a new attribute 'OrderDate' and then query all orders for a specific customer within a date range. Which of the following actions is the MOST efficient approach to support this query pattern?

Exhibit: AWS CLI command (not captured in this copy)

Question 27 (medium, drag order)

Arrange the steps to set up cross-region replication for an S3 bucket.

Question 28 (medium, drag order)

Order the steps to set up an Amazon EMR cluster for processing data in S3 using Spark.

Question 29 (medium, matching)

Match each AWS Glue component to its role.

Matches

Scans data sources and populates catalog

Central metadata repository

Transform and load data

Orchestrates multiple jobs and crawlers

Interactive development environment

Question 30 (medium, matching)

Match each AWS monitoring tool to its primary use.

Matches

Metrics, logs, and alarms

API call history and auditing

Trace and analyze distributed applications

Event-driven automation

Resource configuration tracking

Question 31 (medium, multiple choice)

A company uses Amazon S3 to store sensitive customer data. The security team requires that all objects uploaded to a specific bucket be encrypted at rest using AWS KMS with a customer managed key. Which bucket policy statement should be applied to enforce this requirement?

Question 32 (easy, multiple choice)

A data engineer is designing a data lake on Amazon S3. The data is ingested from multiple sources and needs to be partitioned by year, month, day, and event type for efficient querying with Amazon Athena. Which S3 key prefix structure is most appropriate?
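One common convention for partitioned data on S3 is Hive-style `key=value` prefixes, which Athena can map to partition columns. The sketch below builds such a prefix in the year, month, day, event type order named in the question. The base prefix `events` and the function name `partition_prefix` are hypothetical.

```python
from datetime import date


def partition_prefix(base: str, day: date, event_type: str) -> str:
    """Build a Hive-style S3 key prefix for one day and one event type."""
    return (
        f"{base.rstrip('/')}/"
        f"year={day.year:04d}/month={day.month:02d}/day={day.day:02d}/"
        f"event_type={event_type}/"
    )


if __name__ == "__main__":
    # "events" is an illustrative base prefix, not one taken from the question.
    print(partition_prefix("events", date(2024, 3, 7), "click"))
```

Zero-padding the month and day keeps prefixes in chronological order when they are listed lexicographically.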

Question 33 (hard, multiple choice)

A company runs an Amazon RDS for MySQL database. The database experiences high write latency during peak hours. The data engineer notices that the WriteIOPS metric is consistently at the provisioned limit. Which action would most effectively reduce write latency without increasing costs?

Question 34 (medium, multiple choice)

A data engineer needs to store semi-structured JSON data from IoT devices. The data is written once, read rarely, but must be queryable using SQL. The storage cost must be minimized. Which storage solution should the engineer choose?

Question 35 (easy, multiple choice)

A company uses Amazon DynamoDB as the primary data store for a web application. The application experiences occasional throttling on write requests. The data engineer needs to implement a solution that handles throttling gracefully without losing data. Which approach should the engineer use?
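One client-side pattern for absorbing occasional throttling is to retry with exponential backoff and jitter. The sketch below shows the idea against a stand-in `write` callable rather than a real DynamoDB client. `ThrottledError`, `write_with_backoff`, and the delay values are all hypothetical choices for illustration.

```python
import random
import time


class ThrottledError(Exception):
    """Stand-in for a throttling error returned by the data store."""


def write_with_backoff(write, item, max_attempts=5, base_delay=0.05,
                       max_delay=1.0, sleep=time.sleep):
    """Call write(item), retrying throttled attempts with full-jitter backoff."""
    for attempt in range(max_attempts):
        try:
            return write(item)
        except ThrottledError:
            if attempt == max_attempts - 1:
                # Out of retries: surface the error so the caller can
                # persist the item elsewhere instead of dropping it.
                raise
            # Wait a random time up to an exponentially growing cap.
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))


if __name__ == "__main__":
    attempts = {"count": 0}

    def flaky_write(item):
        # Fails twice, then succeeds, to exercise the retry path.
        attempts["count"] += 1
        if attempts["count"] < 3:
            raise ThrottledError()
        return "ok"

    print(write_with_backoff(flaky_write, {"id": 1}), attempts["count"])
```

Re-raising after the final attempt is a design choice: the caller decides whether to queue or log the item, so a failed write is never silently discarded.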

Question 36 (hard, multiple choice)

A financial services company stores transaction data in Amazon RDS for PostgreSQL. The company requires that all changes to the database be logged for audit purposes, including before and after images of updated rows. Which feature should the data engineer enable?

Question 37 (medium, multiple choice)

A data engineer is configuring Amazon S3 Lifecycle policies to transition objects between storage classes. The data is accessed frequently for the first 30 days, then rarely for the next 90 days, after which it must be archived. The engineer wants to minimize costs while ensuring immediate retrieval for the first 30 days. Which lifecycle policy should the engineer implement?

Question 38 (easy, multiple choice)

A data engineer needs to store a large number of small files (each a few KB) from IoT sensors. The data is written once and never modified. The primary requirement is high write throughput and low latency for writes. Which storage solution is most suitable?

Question 39 (hard, multiple choice)

A company uses Amazon Redshift for analytics. A data engineer notices that queries are slow due to high disk usage on the compute nodes. The engineer needs to reclaim disk space without interrupting ongoing queries. Which action should the engineer take?

Question 40 (medium, multi select)

A data engineer is designing a data lake on Amazon S3. The data lake must support both batch and streaming ingestion. Which TWO AWS services can ingest data directly into S3? (Choose TWO.)

Question 41 (hard, multi select)

A company is migrating an on-premises Apache Hadoop cluster to Amazon EMR. The data is stored in HDFS and must be moved to Amazon S3. Which THREE considerations are important when designing the migration? (Choose THREE.)

Question 42 (easy, multi select)

A data engineer is setting up Amazon S3 bucket policies for a data lake. Which TWO statements are true regarding S3 bucket policies? (Choose TWO.)

Question 43 (easy, multiple choice)

A data engineer applies the IAM policy shown in the exhibit to a user. The user attempts to upload an object to the bucket 'my-data-lake' without specifying server-side encryption. What will happen?

Exhibit

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::my-data-lake/*",
      "Condition": {
        "StringEquals": {
          "s3:x-amz-server-side-encryption": "AES256"
        }
      }
    }
  ]
}
```

Question 44 (medium, multiple choice)

A data engineer runs the CLI command shown in the exhibit to describe the DynamoDB table 'Orders'. The table has a partition key 'OrderID' and sort key 'CustomerID'. Which query operation is most efficient for retrieving all orders for a specific customer?

Exhibit: AWS CLI command (not captured in this copy)

Question 45 (hard, multiple choice)

A data engineer is troubleshooting an access denied error when an AWS Lambda function tries to decrypt an object encrypted with the KMS key 'abc123'. The Lambda function's execution role has the policy shown in the exhibit attached. What is the likely cause of the error?

Exhibit

```
{
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "kms:Decrypt",
        "kms:GenerateDataKey"
      ],
      "Resource": "arn:aws:kms:us-east-1:123456789012:key/abc123"
    },
    {
      "Effect": "Deny",
      "Action": "kms:Decrypt",
      "Resource": "*",
      "Condition": {
        "StringNotEquals": {
          "aws:SourceAccount": "123456789012"
        }
      }
    }
  ]
}
```

Question 46 (medium, multiple choice)

A company uses Amazon RDS for MySQL with Multi-AZ deployment. The database experiences high write latency during peak hours. The application uses InnoDB tables. Which action would reduce write latency without changing the application code?

Question 47 (hard, multiple choice)

A data engineering team needs to store log files for 90 days with immediate access, then archive them for 7 years with infrequent access. Which S3 storage class configuration meets these requirements cost-effectively?

Question 48 (easy, multiple choice)

A company wants to store data from thousands of IoT devices with varying data rates. The data must be stored in a schema-on-read fashion and support SQL queries. Which AWS service should be used?

Question 49 (medium, multiple choice)

A company runs a data warehouse on Amazon Redshift. Queries are slow, and the team suspects data distribution is skewed. Which approach would best help identify distribution skew?

Question 50 (easy, multiple choice)

A company needs to store JSON documents that are frequently read and written by a web application. The data must be highly available and durable across multiple Availability Zones. Which AWS database service meets these requirements?

Question 51 (medium, multiple choice)

A company is migrating an on-premises MongoDB database to Amazon DocumentDB. The migration must have minimal downtime. Which service should be used to perform the migration?

Question 52 (medium, multi select)

A company uses Amazon Redshift for analytics. The data engineering team wants to improve query performance for frequently used aggregate queries. Which TWO actions would help achieve this?

Question 53 (hard, multi select)

A company stores sensitive data in Amazon S3. The security team requires encryption at rest and that the encryption keys are managed by the company using AWS KMS. The data is frequently accessed by multiple AWS services. Which THREE steps should be taken to meet these requirements?

Question 54 (easy, multi select)

A company is designing a data lake on Amazon S3. The data includes CSV files, Parquet files, and images. The data engineering team needs to catalog the metadata and enable SQL queries. Which TWO AWS services should be used together?

Question 55 (medium, multi select)

A company uses Amazon DynamoDB for a gaming application. The application experiences throttling during peak hours. The table's read and write capacity is provisioned. Which TWO actions can reduce throttling?

Question 56 (hard, multi select)

A company is migrating a large Oracle database to Amazon Aurora PostgreSQL. The migration must have minimal downtime and preserve data consistency. Which THREE AWS services or features should be used?

Question 57 (easy, multi select)

A company needs to store log files from multiple applications in a centralized location. The logs are written once and accessed rarely after 30 days. The company must retain logs for 5 years. Which TWO actions meet these requirements cost-effectively?

Question 58 (medium, multiple choice)

A company is using Amazon RDS for MySQL with Multi-AZ deployment. They notice that during a recent failover test, the application experienced a brief write outage. The application uses a connection string that points to the RDS instance endpoint. What is the MOST likely cause of the write outage?

Question 59 (easy, multiple choice)

A data engineer needs to store semi-structured JSON files that are accessed infrequently but must be retrievable within minutes. The data is immutable and must be stored cost-effectively. Which AWS service should the engineer use?

Question 60 (hard, multiple choice)

A company runs an Apache Spark job on Amazon EMR that writes output to an S3 bucket. The job fails with the error 'S3AccessDeniedException' when writing the final output, but earlier stages succeed. The EMR cluster uses a service role and an instance profile. The S3 bucket policy allows access from the VPC only. What is the MOST likely cause?

Question 61 (medium, multiple choice)

A media company stores large video files in Amazon S3 and uses Amazon CloudFront for content delivery. Users in different regions report slow download speeds for popular content. The data engineer needs to improve performance while minimizing cost. Which solution should the engineer implement?

Question 62 (easy, multiple choice)

A data engineer is designing a data lake on Amazon S3. The data includes customer PII that must be encrypted at rest. The company also requires that the encryption keys be rotated automatically every year. Which encryption solution should the engineer use?

Question 63 (hard, multiple choice)

A company is running a production Amazon Aurora PostgreSQL database. The database experiences high write latency during peak hours. The data engineer suspects that the issue is due to a large number of small transactions. Which action would MOST effectively reduce write latency?

Question 64 (medium, multiple choice)

A data engineer is migrating an on-premises Apache Hive data warehouse to Amazon EMR. The warehouse contains partitioned tables stored in HDFS. The engineer wants to use Amazon S3 as the storage layer for the EMR cluster. What is the MOST important consideration for maintaining query performance on S3?

Question 65 (easy, multiple choice)

A company needs to store application log files for 90 days for compliance. The logs are generated continuously and are rarely accessed after 30 days. The data engineer must minimize storage costs. Which storage solution should the engineer choose?

Question 66 (hard, multiple choice)

A company is using Amazon DynamoDB with on-demand capacity for a gaming application that experiences unpredictable traffic spikes. The application consistently sees 'ProvisionedThroughputExceededException' errors during spikes. The data engineer needs to resolve this issue without changing the application code. What should the engineer do?

Question 67 (medium, multi select)

A company is designing a data store for IoT sensor data that is written once and never updated. The data must be stored with high durability and low cost. Which TWO AWS storage services are most suitable? (Choose TWO.)

Question 68 (hard, multi select)

A company is using Amazon Redshift for a data warehouse. The data engineer needs to improve query performance for a table that is frequently joined with other tables on a specific column. Which THREE actions would help improve join performance? (Choose THREE.)

Question 69 (easy, multi select)

A data engineer is setting up an Amazon RDS for MySQL database. The database must be highly available and fail over automatically in case of an AZ outage. Which TWO configurations should the engineer enable? (Choose TWO.)

Question 70 (medium, multiple choice)

A data engineer applies the bucket policy shown in the exhibit to an S3 bucket. The bucket contains sensitive data that must be encrypted at rest and accessed only over HTTPS. Which of the following statements is true?

Exhibit

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": "arn:aws:s3:::example-bucket/*",
      "Condition": {
        "Bool": {
          "aws:SecureTransport": "false"
        }
      }
    },
    {
      "Effect": "Allow",
      "Principal": "*",
      "Action": [
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Resource": "arn:aws:s3:::example-bucket/*",
      "Condition": {
        "StringEquals": {
          "s3:x-amz-server-side-encryption": "AES256"
        }
      }
    }
  ]
}
```

Question 71 (hard, multiple choice)

A data engineer runs the AWS CLI command shown in the exhibit to list objects in an S3 bucket. The command returns only two objects even though the bucket contains thousands of objects under the prefix. What should the engineer do to retrieve the next batch of objects?

Exhibit: AWS CLI command and output (not captured in this copy)
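Truncated listings are usually handled by passing a continuation token from one response into the next request. The sketch below shows that loop against an in-memory stand-in, since the exhibit's actual command is not available here. `list_page` and `list_all` are hypothetical helpers. The response fields `Contents`, `IsTruncated`, and `NextContinuationToken` follow the shape of the S3 ListObjectsV2 API response, and the integer-offset token is an assumption of this sketch only.

```python
def list_page(keys, max_keys, continuation_token=None):
    """Hypothetical stand-in for one paginated list call over a key list."""
    start = int(continuation_token) if continuation_token else 0
    page = keys[start:start + max_keys]
    end = start + len(page)
    response = {
        "Contents": [{"Key": key} for key in page],
        "IsTruncated": end < len(keys),
    }
    if response["IsTruncated"]:
        # Real tokens are opaque strings; an offset is used here for clarity.
        response["NextContinuationToken"] = str(end)
    return response


def list_all(keys, max_keys=2):
    """Follow continuation tokens until the listing is no longer truncated."""
    token = None
    found = []
    while True:
        response = list_page(keys, max_keys, token)
        found.extend(obj["Key"] for obj in response["Contents"])
        if not response["IsTruncated"]:
            return found
        token = response["NextContinuationToken"]


if __name__ == "__main__":
    sample = [f"logs/{i}.csv" for i in range(5)]
    print(list_all(sample, max_keys=2))
```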

Question 72 (easy, multiple choice)

A data engineer deploys the CloudFormation template shown in the exhibit. After 60 days, what will be the storage class of objects in the bucket?

Exhibit

```
{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Resources": {
    "MyBucket": {
      "Type": "AWS::S3::Bucket",
      "Properties": {
        "BucketName": "my-app-data-bucket",
        "LifecycleConfiguration": {
          "Rules": [
            {
              "Id": "ArchiveRule",
              "Status": "Enabled",
              "Transitions": [
                {
                  "StorageClass": "GLACIER",
                  "TransitionInDays": 30
                }
              ]
            }
          ]
        }
      }
    }
  }
}
```
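The rule in the exhibit can be modeled as a small lookup from object age to storage class. This is only a sketch: it assumes objects are uploaded in the STANDARD class and that transitions take effect exactly at the configured age, whereas real lifecycle processing runs asynchronously and may lag. The names `TRANSITIONS` and `storage_class_at` are hypothetical.

```python
# (age in days, target storage class) pairs taken from the exhibit's rule.
TRANSITIONS = [(30, "GLACIER")]


def storage_class_at(age_days, transitions=TRANSITIONS, initial="STANDARD"):
    """Return the storage class an object would have at the given age."""
    current = initial
    for threshold_days, storage_class in sorted(transitions):
        if age_days >= threshold_days:
            current = storage_class
    return current


if __name__ == "__main__":
    for age in (0, 29, 30, 60):
        print(age, storage_class_at(age))
```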

Question 73 (medium, multiple choice)

A company is using Amazon RDS for PostgreSQL and wants to minimize downtime during a major version upgrade. Which approach best meets this requirement?

Question 74 (easy, multiple choice)

A data engineer needs to store semi-structured JSON logs from an application for up to 30 days, with infrequent access. Which storage solution is the most cost-effective?

Question 75 (hard, multiple choice)

A company has an Amazon DynamoDB table with a provisioned write capacity of 1000 WCU. During a flash sale, the write traffic spikes to 5000 WCU for 10 minutes. The table is not auto-scaled. Which action should the data engineer take to handle the spike without throttling?

Question 76 (medium, multiple choice)

A data engineer is designing a data lake on Amazon S3 and needs to ensure that objects are automatically encrypted at rest using server-side encryption with AWS KMS. Which bucket policy statement achieves this?

Question 77 (hard, multiple choice)

A company uses Amazon Redshift for its data warehouse. The data engineer notices that queries are slow on a large table that is frequently filtered on a column 'transaction_date'. Which optimization technique best improves query performance?

Question 78 (medium, multiple choice)

A data engineer is migrating an on-premises MongoDB database to Amazon DocumentDB. Which migration strategy minimizes downtime?

Question 79 (easy, multiple choice)

A company needs to store files that are accessed by multiple EC2 instances in a VPC. The files must be concurrently accessible and durable. Which storage solution should the data engineer choose?

Question 80 (medium, multiple choice)

A data engineer is troubleshooting an Amazon RDS for MySQL instance that is experiencing high read latency. The instance is a Single-AZ db.r5.large with 100 GB of General Purpose (gp2) storage. Which action is most likely to reduce read latency?

Question 81 (hard, multiple choice)

A company is building a real-time analytics dashboard using Amazon Kinesis Data Streams and Amazon DynamoDB. The data engineer needs to ensure that the DynamoDB table can handle write spikes without throttling. Which approach is the most cost-effective?

Question 82 (medium, multi select)

Which TWO actions are recommended for securing data at rest in Amazon S3? (Choose two.)

Question 83 (hard, multi select)

Which THREE factors should a data engineer consider when choosing between Amazon Redshift and Amazon Athena for querying large datasets in Amazon S3? (Choose three.)

Question 84 (easy, multi select)

Which TWO features of Amazon DynamoDB help ensure high availability and durability? (Choose two.)

Question 85 (medium, multiple choice)

A company stores sensitive user data in an Amazon RDS for PostgreSQL DB instance. A security audit requires that all data be encrypted at rest. The database is currently unencrypted. What is the MOST operationally efficient way to enable encryption at rest?

Question 86 (easy, multiple choice)

A data engineer is designing a data lake on Amazon S3 for storing raw sensor data. The data is append-only and accessed infrequently after 30 days. Compliance requires that data be retained for 7 years. Which S3 storage class is the MOST cost-effective for data older than 30 days?

Question 87 (hard, multiple choice)

A company uses Amazon EMR to process large datasets stored in Amazon S3. The cluster uses a transient configuration and stores intermediate data on HDFS. After a job fails due to a spot instance termination, the data engineer needs to rerun the job. What should the engineer do to minimize data loss and cost?

Question 88 (medium, multiple choice)

A data engineer is migrating an on-premises Apache Cassandra cluster to Amazon Keyspaces (for Apache Cassandra). The cluster has 10 TB of data. The migration must minimize application downtime. Which strategy should the engineer use?

Question 89 (easy, multiple choice)

Read the full Data Store Management explanation →

A company uses Amazon DynamoDB to store session data for a web application. The application experiences sudden spikes in traffic, causing occasional throttling errors. The data engineer needs to handle these spikes without over-provisioning capacity. What is the MOST cost-effective solution?

Question 90 (hard, multiple choice)

Read the full Data Store Management explanation →

A data engineer is designing a data warehouse using Amazon Redshift. The workload includes complex queries that join large tables. The engineer notices that queries are slow due to disk-based operations. Which configuration change would MOST improve query performance?

Question 91 (medium, multiple choice)

Read the full Data Store Management explanation →

A company uses Amazon S3 to store historical financial records. A compliance policy requires that all objects be encrypted with a customer-managed key stored in AWS KMS. The bucket is already configured with SSE-S3. What is the LEAST disruptive way to change the encryption to SSE-KMS?

Question 92 (easy, multiple choice)

Read the full Data Store Management explanation →

A data engineer needs to store JSON documents that are accessed by a serverless application using AWS Lambda. The documents are frequently updated and need low latency (single-digit milliseconds) for read and write operations. Which AWS service should the engineer use?

Question 93 (hard, multiple choice)

Read the full Data Store Management explanation →

A company runs a critical application on Amazon RDS for MySQL that requires a Recovery Point Objective (RPO) of 5 minutes and a Recovery Time Objective (RTO) of 1 hour. The database is 500 GB. What is the MOST cost-effective disaster recovery solution that meets these requirements?

Question 94 (medium, multi select)

Read the full Data Store Management explanation →

A data engineer is designing a data storage solution for IoT sensor data that is ingested at high velocity. The data is time-series and needs to be queried by time range. Which TWO AWS services are suitable for this use case? (Choose TWO)

Question 95 (hard, multi select)

Read the full Data Store Management explanation →

A company is migrating an on-premises Apache Hadoop cluster to Amazon EMR. The cluster uses HDFS for storage. Which THREE features of Amazon EMR help reduce storage costs compared to on-premises HDFS? (Choose THREE)

Question 96 (medium, multi select)

Read the full Data Store Management explanation →

A data engineer is evaluating storage options for a new application that requires low-latency access to unstructured blobs (up to 5 TB each) with high throughput. The data will be accessed frequently for the first 30 days and then rarely. Which TWO storage solutions meet these requirements? (Choose TWO)

Question 97 (easy, multiple choice)

Read the full Data Store Management explanation →

A company is using an RDS for PostgreSQL instance and wants to minimize downtime during a major version upgrade. Which approach should be taken?

Question 98 (medium, multiple choice)

Read the full Data Store Management explanation →

A data engineer is designing a data lake on Amazon S3. The data is accessed frequently for the first 30 days, then rarely after that. The engineer needs to minimize storage costs while ensuring data is available within minutes for the first 30 days and can be retrieved within 12 hours after that. Which lifecycle policy should be applied?

Question 99 (hard, multiple choice)

Read the full Data Store Management explanation →

A company uses DynamoDB with provisioned capacity and experiences throttling on a table during peak hours. The data engineer notices that the table has a partition key with high cardinality and the workload is read-heavy. Which action would best resolve the throttling?

Question 100 (easy, multiple choice)

Read the full Data Store Management explanation →

A data engineer needs to store semi-structured JSON logs from multiple sources in a centralized data store for querying using SQL. The logs are immutable and need to be retained for 90 days. Which AWS service should be used?

Question 101 (medium, multiple choice)

Read the full Data Store Management explanation →

A company uses Amazon Redshift for analytics. The data engineer notices that queries are slow due to many small inserts. Which technique would improve write performance?

Question 102 (hard, multiple choice)

Read the full Data Store Management explanation →

A data engineer is migrating an on-premises Oracle database to Amazon RDS for Oracle. The database is 5 TB in size and has a 1 Gbps network connection. The migration must be completed within 48 hours. Which service should be used?

Question 103 (easy, multiple choice)

Read the full Data Store Management explanation →

A company uses Amazon S3 to store sensitive data. The security team requires that all data be encrypted at rest using a customer-managed key that is rotated annually. Which encryption option should be used?

Question 104 (medium, multiple choice)

Read the full Data Store Management explanation →

A data engineer is troubleshooting a slow-running query on Amazon Redshift. The query scans a large table but returns few rows. Which diagnostic step should be taken first?

Question 105 (hard, multiple choice)

Read the full Data Store Management explanation →

A company uses DynamoDB with global tables in two AWS Regions. The data engineer observes that a write to the table in us-east-1 is not immediately visible in a read from eu-west-1. What is the most likely reason?

Question 106 (medium, multi select)

Read the full Data Store Management explanation →

A data engineer needs to store event data from IoT devices that arrives in bursts. The data is key-value and requires single-digit millisecond read and write latency. The engineer also needs to run complex analytical queries on the data for reporting. Which TWO services should be used together? (Choose TWO.)

Question 107 (hard, multi select)

Read the full Data Store Management explanation →

A company is using Amazon S3 for a data lake. The data engineer needs to ensure that all new objects are automatically encrypted with a customer-managed KMS key and that the bucket policy enforces encryption. Which THREE steps should be taken? (Choose THREE.)

Question 108 (medium, multi select)

Read the full Data Store Management explanation →

A company uses Amazon RDS for MySQL with Multi-AZ deployment. The primary instance fails, and automatic failover occurs. After failover, the data engineer notices that the new primary instance has a different DNS endpoint. Which TWO statements are true about this scenario? (Choose TWO.)

Question 109 (medium, multiple choice)

Read the full Data Store Management explanation →

Refer to the exhibit. A data engineer is troubleshooting write throttling on the Orders table. The table has a composite primary key (OrderID as partition key, CustomerID as sort key). The engineer notices that writes are throttled even though the write capacity is not fully utilized. What is the most likely cause?

Exhibit (not reproduced)

Question 110 (hard, multiple choice)

Read the full Data Store Management explanation →

Refer to the exhibit. A data engineer has attached this bucket policy to an S3 bucket named data-lake-bucket. The engineer wants to allow only GET requests from the corporate network (10.0.0.0/16) over HTTPS. However, users report that they cannot access objects even when connected to the corporate network. What is the issue?

Exhibit

Refer to the exhibit.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                "arn:aws:s3:::data-lake-bucket/*",
                "arn:aws:s3:::data-lake-bucket"
            ],
            "Condition": {
                "Bool": {
                    "aws:SecureTransport": "false"
                }
            }
        },
        {
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::data-lake-bucket/*",
            "Condition": {
                "IpAddress": {
                    "aws:SourceIp": "10.0.0.0/16"
                }
            }
        }
    ]
}

Question 111 (medium, multiple choice)

Read the full Data Store Management explanation →

Refer to the exhibit. A data engineer needs to connect to the Redshift cluster from an EC2 instance in the same VPC. The engineer can ping the EC2 instance but cannot connect to Redshift using the endpoint address and port 5439. What is the most likely cause?

Network Topology

Question 112 (medium, multiple choice)

Read the full Data Store Management explanation →

A data engineer needs to store semi-structured JSON logs from multiple microservices in a cost-effective manner for ad-hoc querying using SQL. Which AWS service should be used?

Question 113 (easy, multiple choice)

Read the full Data Store Management explanation →

A company uses Amazon S3 as its data lake. A data engineer needs to enforce encryption of data at rest using server-side encryption with AWS KMS. Which S3 bucket property should be configured?

Question 114 (hard, multiple choice)

Read the full Data Store Management explanation →

A data engineer notices that an Amazon Redshift cluster’s storage usage is increasing rapidly due to many UPDATE and DELETE operations. The engineer needs to reclaim storage space and improve query performance. Which action should be taken?

Question 115 (easy, multiple choice)

Read the full Data Store Management explanation →

A company wants to store historical financial data for 7 years with immediate access for the first year and then infrequent access. After 7 years, the data must be automatically deleted. Which S3 lifecycle policy should be configured?

Question 116 (medium, multiple choice)

Read the full Data Store Management explanation →

A data engineer is designing a data store for real-time analytics on high-velocity clickstream data. The data must be stored in a schema-on-read format and support SQL queries with sub-second latency. Which service should be used?

Question 117 (hard, multiple choice)

Read the full Data Store Management explanation →

A data engineer is troubleshooting an Amazon Redshift cluster that is running out of disk space. The engineer queries the STV_PARTITIONS system table and notices that some slices hold significantly more data than others. What is the most likely cause and solution?

Question 118 (easy, multiple choice)

Read the full Data Store Management explanation →

A data engineer needs to store transaction data that requires strong consistency, ACID transactions, and complex join queries. Which AWS service is most appropriate?

Question 119 (medium, multiple choice)

Read the full Data Store Management explanation →

A company has an S3 bucket with millions of objects. The data engineer needs to identify which objects are not accessed for 90 days to move them to a lower-cost storage class. Which feature should be used?

Question 120 (hard, multiple choice)

Read the full Data Store Management explanation →

A data engineer is designing a multi-Region disaster recovery solution for an Amazon DynamoDB table. The table must be available in a secondary Region with minimal data loss and automatic failover. Which feature should be used?

Question 121 (medium, multi select)

Read the full Data Store Management explanation →

Which TWO statements are true about Amazon S3 bucket policies and ACLs?

Question 122 (hard, multi select)

Read the full Data Store Management explanation →

Which THREE factors should be considered when choosing a partition key for an Amazon DynamoDB table?

Question 123 (easy, multi select)

Read the full Data Store Management explanation →

Which TWO data stores are considered fully managed, serverless, and suitable for storing JSON documents?

Question 124 (medium, multiple choice)

Read the full Data Store Management explanation →

A company is using Amazon DynamoDB for a high-traffic web application. They notice increased read latency during peak hours. Which design change would best reduce read latency without increasing cost?

Question 125 (easy, multiple choice)

Read the full Data Store Management explanation →

A data engineer needs to store semi-structured JSON transaction logs for analytics. The logs are written once and rarely accessed. The storage must be cost-effective. Which AWS service should be used?

Question 126 (hard, multiple choice)

Read the full Data Store Management explanation →

A company uses Amazon Redshift for a data warehouse. They notice that queries are slow due to heavy data skew. Which optimization technique should be applied first?

Question 127 (medium, multiple choice)

Read the full Data Store Management explanation →

A data engineer is designing a data lake on Amazon S3. The data includes sensitive customer information that must be encrypted at rest. Which combination of actions meets this requirement with minimal operational overhead?

Question 128 (hard, multiple choice)

Read the full Data Store Management explanation →

A company has a DynamoDB table with a partition key of 'user_id' and a sort key of 'timestamp'. They need to query all items for a user within a date range. Which query operation should be used?
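
As an illustration of the access pattern described in this question, the sketch below builds the parameters for a low-level DynamoDB Query request that targets one partition key value and a sort key range. It makes no AWS call. The table name `UserEvents`, the helper name, and the assumption that timestamps are stored as ISO 8601 strings are hypothetical and not taken from the question.

```python
def build_user_range_query(table_name, user_id, start_ts, end_ts):
    """Return low-level Query parameters for one user's items in a time range."""
    return {
        "TableName": table_name,
        # "timestamp" is a DynamoDB reserved word, so it is aliased as #ts.
        "KeyConditionExpression": "user_id = :uid AND #ts BETWEEN :start AND :end",
        "ExpressionAttributeNames": {"#ts": "timestamp"},
        "ExpressionAttributeValues": {
            ":uid": {"S": user_id},
            ":start": {"S": start_ts},  # assumes ISO 8601 strings, which sort chronologically
            ":end": {"S": end_ts},
        },
    }


# Hypothetical table name and key values, for illustration only.
params = build_user_range_query(
    "UserEvents", "u-123", "2024-01-01T00:00:00Z", "2024-01-31T23:59:59Z"
)
print(params["KeyConditionExpression"])
```

A dictionary of this shape could be passed to a DynamoDB client's query call. A Scan with a filter would read the whole table instead of a single partition.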

Question 129 (easy, multiple choice)

Read the full Data Store Management explanation →

A data engineer needs to store streaming data from IoT devices for real-time analytics. The data has a fixed schema and requires low-latency queries. Which AWS service should be used?

Question 130 (medium, multiple choice)

Read the full Data Store Management explanation →

A company stores financial data in Amazon RDS for MySQL. They need to retain backups for 7 years to meet compliance. Which backup strategy meets this requirement?

Question 131 (hard, multiple choice)

Read the full Data Store Management explanation →

A data engineer is designing a multi-Region disaster recovery solution for Amazon RDS for PostgreSQL. The primary Region must have a standby in a different Availability Zone, and the secondary Region must have a readable replica that can be promoted in case of failure. Which configuration meets these requirements?

Question 132 (medium, multiple choice)

Read the full Data Store Management explanation →

A company stores log files in Amazon S3. They want to automatically move logs older than 90 days to S3 Glacier Deep Archive to reduce costs. Which S3 feature should be used?

Question 133 (medium, multi select)

Read the full Data Store Management explanation →

Which TWO actions can help optimize Amazon S3 storage costs for a data lake? (Choose two.)

Question 134 (hard, multi select)

Read the full Data Store Management explanation →

Which THREE considerations are important when designing a DynamoDB table for high-traffic gaming leaderboards? (Choose three.)

Question 135 (easy, multi select)

Read the full Data Store Management explanation →

Which TWO statements about Amazon Redshift data distribution are correct? (Choose two.)

Question 136 (medium, multiple choice)

Read the full Data Store Management explanation →

Refer to the exhibit. An IAM policy is attached to a user. The user cannot upload objects to the S3 bucket 'example-bucket' using the AWS CLI. What is the most likely cause?

Exhibit

Refer to the exhibit.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Resource": "arn:aws:s3:::example-bucket/*",
      "Condition": {
        "Bool": {
          "aws:SecureTransport": "true"
        }
      }
    }
  ]
}

Question 137 (hard, multiple choice)

Read the full Data Store Management explanation →

Refer to the exhibit. A data engineer ran the CLI command to check the configuration of an RDS instance named 'mydb'. Which statement accurately describes the current configuration?

Exhibit (not reproduced)

Question 138 (medium, multiple choice)

Read the full Data Store Management explanation →

Refer to the exhibit. A DynamoDB table 'Orders' has a GSI 'CustomerDateIndex'. A developer tries to query the GSI for all orders of a customer between two dates. The query fails. What is the most likely reason?

Exhibit

Refer to the exhibit.

{
  "Tables": [
    {
      "TableName": "Orders",
      "KeySchema": [
        {"AttributeName": "order_id", "KeyType": "HASH"},
        {"AttributeName": "customer_id", "KeyType": "RANGE"}
      ],
      "AttributeDefinitions": [
        {"AttributeName": "order_id", "AttributeType": "S"},
        {"AttributeName": "customer_id", "AttributeType": "S"},
        {"AttributeName": "order_date", "AttributeType": "S"}
      ],
      "GlobalSecondaryIndexes": [
        {
          "IndexName": "CustomerDateIndex",
          "KeySchema": [
            {"AttributeName": "customer_id", "KeyType": "HASH"},
            {"AttributeName": "order_date", "KeyType": "RANGE"}
          ],
          "Projection": {"ProjectionType": "ALL"}
        }
      ]
    }
  ]
}

Question 139 (easy, multiple choice)

Read the full Data Store Management explanation →

A company is using Amazon S3 to store critical data and needs to ensure that objects are automatically transitioned to S3 Glacier Deep Archive after 180 days to reduce costs. Which S3 lifecycle action should be configured?
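
For reference, the sketch below shows roughly what a lifecycle configuration document with a 180-day transition to S3 Glacier Deep Archive looks like. It only builds and prints the JSON. The rule ID and the helper name are hypothetical, and the empty prefix assumes the rule should cover every object in the bucket.

```python
import json


def deep_archive_lifecycle(days, prefix=""):
    """Build a lifecycle configuration that transitions objects to Deep Archive."""
    return {
        "Rules": [
            {
                "ID": f"to-deep-archive-after-{days}-days",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},  # empty prefix matches all objects
                "Transitions": [
                    {"Days": days, "StorageClass": "DEEP_ARCHIVE"}
                ],
            }
        ]
    }


config = deep_archive_lifecycle(180)
print(json.dumps(config, indent=2))
```

The rule contains only a transition and no expiration element, so it moves objects to the colder storage class without deleting them.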

Question 140 (medium, multiple choice)

Read the full Data Store Management explanation →

A data engineer is designing a data lake on Amazon S3. The data is accessed frequently for the first 30 days, then rarely after that. Compliance requires that data be retained for 7 years. What is the MOST cost-effective storage strategy?

Question 141 (hard, multiple choice)

Read the full Data Store Management explanation →

A company runs an Amazon RDS for PostgreSQL database. To meet disaster recovery requirements, they set up a cross-Region read replica. The replica has been lagging by several minutes. Which action is MOST effective to reduce the replica lag?

Question 142 (easy, multiple choice)

Read the full Data Store Management explanation →

A company is using Amazon DynamoDB for a gaming application. They want to store player session data that expires after 24 hours. Which DynamoDB feature should be used?
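
To make the 24-hour expiry concrete, here is a minimal sketch of a session item that carries an expiry timestamp of the kind DynamoDB Time to Live (TTL) reads: a Number attribute holding epoch seconds. The attribute names `player_id`, `session_id`, and `expires_at` are assumptions, and the sketch presumes TTL has been enabled on the table with `expires_at` as the TTL attribute.

```python
import time

SESSION_TTL_SECONDS = 24 * 60 * 60  # 24 hours


def build_session_item(player_id, session_id, now_epoch=None):
    """Return a low-level DynamoDB item whose expires_at is 24 hours from now."""
    if now_epoch is None:
        now_epoch = int(time.time())
    return {
        "player_id": {"S": player_id},
        "session_id": {"S": session_id},
        # TTL expects a Number attribute containing a Unix epoch time in seconds.
        "expires_at": {"N": str(now_epoch + SESSION_TTL_SECONDS)},
    }


item = build_session_item("p-1", "s-1", now_epoch=1_700_000_000)
print(item["expires_at"])  # {'N': '1700086400'}
```

Expired items are removed in the background some time after the timestamp passes, not at the exact instant, so reads that must never return stale sessions may still need to compare `expires_at` with the current time.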

Question 143 (medium, multiple choice)

Read the full Data Store Management explanation →

A data engineer is migrating an on-premises Apache HBase workload to Amazon DynamoDB. The application requires strongly consistent reads and the ability to query by a composite key (partition key + sort key). Which DynamoDB table design should be used?

Question 144 (hard, multiple choice)

Read the full Data Store Management explanation →

A company has an Amazon Redshift cluster that stores petabytes of data. Queries are experiencing high disk usage due to large intermediate results. The data engineer needs to improve query performance without adding more nodes. Which action should the engineer take?

Question 145 (medium, multi select)

Read the full Data Store Management explanation →

A company is using Amazon S3 to store sensitive data. They need to ensure that all objects are encrypted at rest. Which combination of actions should be taken? (Choose TWO.)

Question 146 (hard, multi select)

Read the full Data Store Management explanation →

A company uses Amazon Redshift for analytics. They notice that some queries are slow due to data redistribution. The data engineer wants to minimize data movement across nodes. Which table design strategy should be used? (Choose TWO.)

Question 147 (easy, multiple choice)

Read the full Data Store Management explanation →

A company is using Amazon S3 for data lake storage. They need to query the data directly using SQL without loading it into a database. Which AWS service should be used?

Question 148 (hard, multiple choice)

Read the full Data Store Management explanation →

A company is using Amazon DynamoDB for an e-commerce application. The application experiences sudden spikes in traffic, causing throttling errors. The data engineer needs to handle the spikes cost-effectively. Which solution should be used?

Question 149 (medium, multiple choice)

Read the full Data Store Management explanation →

A data engineer reviews this IAM policy, which grants access to objects in an S3 bucket. What is the effect of this policy?

Exhibit

Refer to the exhibit.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::example-bucket/*",
      "Condition": {
        "StringNotEquals": {
          "s3:x-amz-server-side-encryption": "aws:kms"
        }
      }
    }
  ]
}

Question 150 (hard, multiple choice)

Read the full Data Store Management explanation →

A data engineer runs this CLI command. Which query is MOST efficient against this table?

Exhibit (not reproduced)

Question 151 (medium, multiple choice)

Read the full Data Store Management explanation →

A data engineer sees this AWS Glue table definition in the Data Catalog. The engineer wants to query this table with Amazon Athena, but the query returns zero rows. What is the MOST likely cause?

Exhibit

Refer to the exhibit.

{
  "Tables": [
    {
      "Name": "sales",
      "Type": "EXTERNAL_TABLE",
      "Parameters": {
        "EXTERNAL": "TRUE",
        "classification": "csv"
      },
      "StorageDescriptor": {
        "Location": "s3://data-lake/sales/",
        "InputFormat": "org.apache.hadoop.mapred.TextInputFormat",
        "OutputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
        "SerdeInfo": {
          "SerializationLib": "org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe",
          "Parameters": {
            "field.delim": ","
          }
        }
      }
    }
  ]
}

Question 152 (hard, multiple choice)

Read the full Data Store Management explanation →

A CloudFormation template includes this IAM policy for a cross-account S3 upload use case. What is the purpose of the condition?

Exhibit

Refer to the exhibit.

IamPolicyDocument:
  Version: '2012-10-17'
  Statement:
    - Effect: Allow
      Action:
        - 's3:GetObject'
        - 's3:PutObject'
      Resource: 'arn:aws:s3:::my-bucket/*'
      Condition:
        StringEquals:
          's3:x-amz-acl': 'bucket-owner-full-control'

Question 153 (medium, multiple choice)

Read the full Data Store Management explanation →

A company is using Amazon RDS for MySQL and needs to automate backups with a retention period of 35 days. They also want to be able to restore to any point within the retention period. Which configuration should be used?

Question 154 (medium, multiple choice)

Read the full Data Store Management explanation →

A company is storing sensitive user data in an Amazon S3 bucket. The security team requires that all data be encrypted at rest using a customer-managed key stored in AWS KMS. The bucket policy must deny any PUT request that does not include the appropriate encryption header. Which bucket policy condition key should be used?
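
The kind of policy this question describes can be sketched as follows. The helper builds a bucket policy document that denies any PutObject request whose server-side encryption header is not `aws:kms`, using the same `s3:x-amz-server-side-encryption` condition key that appears in the exhibits elsewhere on this page. The bucket name, the `Sid`, and the helper name are hypothetical.

```python
import json


def deny_non_kms_puts(bucket_name):
    """Build a bucket policy that rejects uploads not requesting SSE-KMS."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyPutWithoutSseKms",  # hypothetical statement ID
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
                "Condition": {
                    # A negated operator also matches when the header is absent,
                    # so uploads that omit the header are denied as well.
                    "StringNotEquals": {
                        "s3:x-amz-server-side-encryption": "aws:kms"
                    }
                },
            }
        ],
    }


policy = deny_non_kms_puts("example-sensitive-bucket")  # hypothetical bucket
print(json.dumps(policy, indent=2))
```

This sketch checks only the encryption type. Pinning uploads to one specific customer-managed key would need an additional condition on the KMS key ID, which is left out here.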

Question 155 (easy, multiple choice)

Read the full Data Store Management explanation →

A data engineer needs to set up a new Amazon RDS for MySQL database for a web application. The application experiences variable read traffic and requires low read latency. The engineer needs to minimize downtime during maintenance and provide read scalability. Which configuration meets these requirements?

Question 156 (hard, multiple choice)

Read the full Data Store Management explanation →

A data engineer is designing a data lake on Amazon S3. The data is partitioned by year, month, day, and hour. The engineer needs to ensure that queries using Amazon Athena are cost-effective and performant. The data is written in Parquet format, and the total volume is 50 TB. Which approach minimizes query costs?

Question 157 (medium, multiple choice)

Read the full Data Store Management explanation →

An e-commerce company uses Amazon DynamoDB as the primary data store for its product catalog. The table has a simple primary key (ProductID) and handles 10,000 writes per second during peak hours. Recently, the engineering team noticed increased write latency and throttled requests during peak times. The table's provisioned write capacity is set to 12,000 WCU. What is the most likely cause of the throttling?

Question 158 (easy, multiple choice)

Read the full Data Store Management explanation →

A data engineer needs to store semi-structured JSON logs from multiple microservices in a cost-effective manner for later analysis using Amazon Athena. The logs are generated continuously, and the total volume is about 1 TB per day. The data must be queryable within minutes of arrival. Which storage solution is most appropriate?

Question 159 (hard, multiple choice)

Read the full Data Store Management explanation →

A data engineer ran the command shown in the exhibit on the bucket 'my-data-lake'. The engineer then tries to delete an object version but receives an 'AccessDenied' error. The engineer has full S3 permissions via IAM. What is the most likely reason for the error?

Exhibit (not reproduced)

Question 160 (medium, multiple choice)

Read the full Data Store Management explanation →

A company is migrating its on-premises Oracle database to Amazon RDS for Oracle. The database is 2 TB in size and has a 24-hour maintenance window. The migration must have minimal downtime. Which AWS service should be used for the migration?

Question 161 (hard, multiple choice)

Read the full Data Store Management explanation →

A data engineer created the IAM policy shown in the exhibit. The engineer then attempts to upload an object to 'my-bucket' using the AWS CLI with the command: aws s3 cp file.txt s3://my-bucket/ --sse aws:kms. The upload fails with an 'AccessDenied' error. What is the most likely cause?

Exhibit

Refer to the exhibit.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::my-bucket/*",
            "Condition": {
                "StringEquals": {
                    "s3:x-amz-server-side-encryption": "AES256"
                }
            }
        }
    ]
}
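
The interaction between the exhibit's condition and the CLI command can be explored with a toy model. The sketch below is not the IAM policy engine: it checks a single Allow statement's `StringEquals` condition against a request context and ignores Action, Resource, explicit Deny statements, and every other operator. It assumes that `--sse aws:kms` sets the `x-amz-server-side-encryption` request header to `aws:kms`.

```python
# Toy model of one IAM Allow statement with a StringEquals condition.
STATEMENT = {
    "Effect": "Allow",
    "Action": "s3:PutObject",
    "Resource": "arn:aws:s3:::my-bucket/*",
    "Condition": {
        "StringEquals": {"s3:x-amz-server-side-encryption": "AES256"}
    },
}


def statement_allows(statement, request_context):
    """Return True if every StringEquals condition matches the request context."""
    expected = statement.get("Condition", {}).get("StringEquals", {})
    return statement["Effect"] == "Allow" and all(
        request_context.get(key) == value for key, value in expected.items()
    )


kms_upload = {"s3:x-amz-server-side-encryption": "aws:kms"}
sse_s3_upload = {"s3:x-amz-server-side-encryption": "AES256"}

print(statement_allows(STATEMENT, kms_upload))     # False
print(statement_allows(STATEMENT, sse_s3_upload))  # True
```

When no Allow statement matches a request, the request is implicitly denied.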

Question 162 (easy, multiple choice)

Read the full Data Store Management explanation →

A data engineer is configuring an Amazon S3 lifecycle policy to transition objects to S3 Glacier Deep Archive after 90 days. The bucket receives new objects daily. The engineer wants to ensure that objects are not deleted before 90 days. Which lifecycle action should be used?

Question 163 (medium, multi select)

Read the full Data Store Management explanation →

A financial services company is designing a data store for transaction records that must be immutable and auditable. The data must be stored for 7 years. Which AWS services can be combined to meet these requirements? (Choose TWO.)

Question 164 (hard, multi select)

Read the full Data Store Management explanation →

A data engineer is troubleshooting slow query performance on an Amazon Redshift cluster. The cluster has 10 nodes and is using automatic distribution style. The engineer suspects that data distribution is causing excessive data movement. Which steps should the engineer take to diagnose and resolve the issue? (Choose THREE.)

Question 165 (medium, multi select)

Read the full Data Store Management explanation →

A company uses Amazon DynamoDB for a gaming leaderboard. The table has a primary key of GameId (partition key) and Score (sort key). The application needs to retrieve the top 10 scores for a given game. Which strategies can improve query performance? (Choose TWO.)

Question 166 (hard, multi select)

Read the full Data Store Management explanation →

A data engineer is designing a data store for a real-time analytics application that requires sub-millisecond read and write latency. The data is accessed via a REST API. Which AWS services should the engineer consider? (Choose THREE.)

Question 167 (medium, multiple choice)

Read the full Data Store Management explanation →

A data engineer reviewed the S3 lifecycle policy shown in the exhibit. The engineer notices that objects under the 'logs/' prefix are being deleted after 365 days. The business requirement is to retain logs for at least 5 years. What should the engineer change in the lifecycle policy?

Exhibit (not reproduced)

Question 168 (medium, multiple choice)

Read the full Data Store Management explanation →

A data engineer runs the describe-table command shown in the exhibit. The application frequently queries by CustomerID alone. Currently, these queries result in full table scans. Which action should the engineer take to improve query performance?

Exhibit (not reproduced)

Question 169 (medium, multiple choice)

Read the full Data Store Management explanation →

A data engineer is troubleshooting a slow Amazon Redshift query that joins several large tables. The query plan shows a large number of broadcasts. Which design change would most likely reduce the broadcast operations?

Question 170 (hard, multiple choice)

Read the full Data Store Management explanation →

A company uses Amazon DynamoDB with provisioned capacity. During a sales event, write traffic spikes and some requests fail with ProvisionedThroughputExceededException errors. Reads stay within limits. The data engineer needs to minimize latency during the spike without manual intervention. Which solution is MOST cost-effective?

Question 171 (easy, multiple choice)

Read the full Data Store Management explanation →

A data engineer needs to store semi-structured JSON log files from multiple sources and query them using SQL. The data is rarely updated and access frequency is low. Which storage solution is MOST cost-effective?

Question 172 (medium, multiple choice)

Read the full Data Store Management explanation →

A data engineer is designing a data lake on Amazon S3. The data consists of sensitive personally identifiable information (PII) that must be encrypted at rest. The company requires that encryption keys be rotated every 90 days and that access to the keys be logged. Which encryption solution meets these requirements?

Question 173 (hard, multiple choice)

Read the full Data Store Management explanation →

A company uses Amazon RDS for MySQL with Multi-AZ deployment. During a recent failover, the application experienced a brief outage because it cached the old database endpoint. Which solution would minimize application disruption during future failovers?

Question 174 (easy, multiple choice)

Read the full Data Store Management explanation →

A data engineer needs to store time-series sensor data from thousands of IoT devices. The data is written once, read frequently for the last 24 hours, and rarely accessed after 30 days. Which storage solution is MOST cost-effective?

Question 175 (medium, multiple choice)

Read the full Data Store Management explanation →

A company is migrating an on-premises Apache Cassandra database to Amazon Keyspaces. The database has a table with a partition key of 'user_id' and a clustering column of 'timestamp'. The application frequently queries the last 10 records for a given user. Which table design in Keyspaces would provide the BEST query performance for this access pattern?

Question 176 (hard, multiple choice)

Read the full Data Store Management explanation →

A data engineer notices that an Amazon Redshift cluster's storage utilization has grown unexpectedly. The cluster uses automatic compression and has a mix of fact and dimension tables. The engineer runs VACUUM and ANALYZE, but storage does not decrease. Which action is most likely to reduce storage consumption?

Question 177 (easy, multiple choice)

Read the full Data Store Management explanation →

A company uses Amazon RDS for PostgreSQL. The data engineer needs to ensure that the database is automatically backed up and that backups are retained for 35 days. What is the simplest way to achieve this?

Question 178 (medium, multi select)

Read the full Data Store Management explanation →

A data engineer is designing a data lake on Amazon S3 that will be accessed by multiple AWS Glue ETL jobs. The engineer needs to ensure that the data is organized efficiently for querying and that sensitive columns are masked for certain users. Which TWO actions should the engineer take? (Choose TWO.)

Question 179 (hard, multi select)

A company uses Amazon DynamoDB for a gaming application that requires single-digit millisecond read and write latencies. The application experiences throttling on the 'GameScores' table during peak hours. The table has a partition key of 'game_id' and a sort key of 'player_id'. The data engineer needs to improve performance without changing the table's provisioned capacity. Which THREE actions should the engineer take? (Choose THREE.)

Question 180 (easy, multi select)

A data engineer is migrating an on-premises Microsoft SQL Server database to Amazon RDS for SQL Server. The database is 2 TB in size and has a 4-hour maintenance window. The company needs to minimize downtime and ensure data consistency. Which TWO methods should the engineer use? (Choose TWO.)

Question 181 (medium, multiple choice)

A company stores sensitive customer data in Amazon S3. The security team requires that all objects be encrypted at rest using server-side encryption with customer-provided keys (SSE-C). Which bucket policy condition will enforce this requirement?

Question 182 (easy, multiple choice)

A data engineer needs to store semi-structured JSON data for a real-time analytics application. The data will be queried using SQL-like statements and must support high-speed ingestion with minimal latency. Which AWS service is best suited for this use case?

Question 183 (hard, multiple choice)

A company is running a MySQL database on Amazon RDS. The database size is 2 TB, and the company needs to migrate it to Amazon Aurora MySQL with minimal downtime. Which migration strategy is most appropriate?

Question 184 (medium, multiple choice)

A data engineer notices that an Amazon Redshift cluster is experiencing slow query performance. The engineer suspects that tables are not properly sorted. Which diagnostic query should the engineer run to identify unsorted rows?

Question 185 (easy, multiple choice)

A company is using Amazon S3 as a data lake. The data engineer needs to ensure that all objects uploaded to a specific bucket are automatically replicated to a bucket in another AWS Region for disaster recovery. Which configuration should the engineer implement?

Question 186 (hard, multiple choice)

A company is using Amazon DynamoDB for a gaming application with high read and write throughput. The data engineer notices that the read latency is high during peak hours. The table has a partition key only (no sort key). The engineer wants to improve read performance by distributing reads across partitions more evenly. Which action should the engineer take?

Question 187 (medium, multiple choice)

A data engineer needs to store and analyze time-series data from IoT devices. The data volume is 10 GB per day, and the queries are mostly on the most recent 7 days of data. The engineer wants to minimize storage costs while retaining historical data for 1 year. Which combination of AWS services is most cost-effective?

Question 188 (easy, multiple choice)

A company is using Amazon EMR to process large datasets stored in Amazon S3. The data engineer wants to reduce the time it takes to read data from S3 by optimizing the data format. Which file format should the engineer recommend?

Question 189 (hard, multiple choice)

A company is using Amazon Redshift for data warehousing. The data engineer notices that the STL_ALERT_EVENT_LOG table shows many 'missing statistics' alerts. What is the best course of action to address this issue?

Question 190 (medium, multi select)

Which TWO options are valid ways to encrypt data at rest in Amazon S3? (Choose two.)

Question 191 (hard, multi select)

Which THREE of the following are benefits of using Amazon DynamoDB Accelerator (DAX)? (Choose three.)

Question 192 (medium, multi select)

Which TWO actions can help improve query performance in Amazon Redshift? (Choose two.)

Question 193 (hard, multiple choice)

The IAM policy shown in the exhibit is attached to a user. The user tries to upload an object to the S3 bucket example-bucket using the AWS CLI without specifying the --server-side-encryption flag. What will happen?

Exhibit

Refer to the exhibit.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Resource": "arn:aws:s3:::example-bucket/*",
      "Condition": {
        "StringEquals": {
          "s3:x-amz-server-side-encryption": "aws:kms"
        }
      }
    }
  ]
}
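
The rule this question probes can be illustrated with a small model. The sketch below is a simplified stand-in for IAM policy evaluation, not the real engine: the function names and the dictionary used as the request context are hypothetical. It encodes one rule, that a StringEquals condition evaluates to false when the condition key is absent from the request, so an Allow statement guarded by that condition does not apply and the request falls through to the implicit deny.

```python
# Simplified model of the single Allow statement in the exhibit.
# Assumption: the request context is a plain dict of condition keys to values.

POLICY_CONDITION = {"s3:x-amz-server-side-encryption": "aws:kms"}


def string_equals(condition: dict, request_context: dict) -> bool:
    """Return True only if every condition key is present and matches exactly."""
    return all(
        key in request_context and request_context[key] == expected
        for key, expected in condition.items()
    )


def is_allowed(request_context: dict) -> bool:
    """With no other statements, a non-matching Allow means implicit deny."""
    return string_equals(POLICY_CONDITION, request_context)


# Three hypothetical upload requests.
requests = {
    "SSE-KMS header": {"s3:x-amz-server-side-encryption": "aws:kms"},
    "SSE-S3 header": {"s3:x-amz-server-side-encryption": "AES256"},
    "no encryption header": {},
}

for name, context in requests.items():
    outcome = "allowed" if is_allowed(context) else "implicitly denied"
    print(f"{name}: {outcome}")
```

In this model only the request that carries the aws:kms value satisfies the condition; the other two never match the Allow statement.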

Question 194 (hard, multiple choice)

A company runs a real-time analytics platform on Amazon ECS that ingests streaming data from Amazon Kinesis Data Streams, processes it, and stores results in Amazon DynamoDB. The data volume spikes unpredictably, causing DynamoDB to throttle write requests. The application uses on-demand capacity mode. The data engineer notices that the throttling occurs on a specific partition due to a hot key. The hot key is a customer ID that receives a disproportionate number of writes. The application cannot change the partition key design immediately. The engineer needs to reduce throttling while maintaining low latency. Which solution is most effective?

Question 195 (medium, multiple choice)

A data engineer is responsible for a data warehouse on Amazon Redshift that stores 5 TB of data. The engineer needs to load 50 GB of new data daily from Amazon S3 into Redshift. The current load process uses the COPY command and takes 2 hours, which is within the maintenance window. However, the engineer wants to optimize the load time and reduce the impact on concurrent queries. The engineer notices that the table data is not distributed evenly across the slices. The cluster has four dc2.large nodes. Which approach will best improve load performance?

Question 196 (medium, multiple choice)

A company is using Amazon RDS for MySQL with Multi-AZ deployment. The database experiences intermittent slowdowns during peak hours. The company's DevOps team suspects that the primary instance is overwhelmed. Which action should the team take to distribute the read load without changing the application code?

Question 197 (hard, multiple choice)

A data engineer is designing a data lake on Amazon S3. The data is ingested from multiple sources and stored in a partitioned structure under the 'landing' prefix. The engineer needs to ensure that only authorized applications can write to the 'landing' zone, while all AWS accounts in the organization can read the data. Which combination of S3 bucket policies and IAM policies should be used?

Question 198 (easy, multiple choice)

A company uses Amazon DynamoDB as its primary data store for a web application. The application experiences high latency during peak hours. The data engineer notices that the table has a large number of items with the same partition key. Which DynamoDB feature should the engineer use to improve performance?

Question 199 (medium, multiple choice)

A data engineer needs to store semi-structured JSON data that is accessed infrequently but must be retrievable within minutes. The data is generated by IoT devices and each object is about 500 KB. The engineer wants the most cost-effective storage solution. Which AWS service should be used?

Question 200 (hard, multiple choice)

A company is migrating an on-premises PostgreSQL database to Amazon Aurora PostgreSQL. The database is 2 TB in size. The migration must have minimal downtime. Which approach should the data engineer use?

Question 201 (easy, multiple choice)

A data engineer is designing a data store for a real-time analytics application that requires sub-millisecond read and write latency. The data is key-value in nature and the workload is both read-heavy and write-heavy. Which AWS service is most suitable?

Question 202 (medium, multiple choice)

A company uses Amazon Redshift for its data warehouse. The data engineer notices that queries are running slower than expected. The system administrator reports that the cluster's disk space is 80% full. Which action should the engineer take to improve query performance?

Question 203 (hard, multiple choice)

A data engineer is setting up an Amazon S3 bucket for storing sensitive financial data. The compliance team requires that all data be encrypted at rest using a customer-managed AWS KMS key. Additionally, the bucket must block public access. Which combination of settings should the engineer configure?

Question 204 (easy, multiple choice)

A data engineer needs to store log files from multiple applications in a centralized location. The logs are generated in JSON format and each log entry is about 1 KB. The engineer needs to query the logs occasionally using SQL-like queries. Which AWS service is most appropriate?

Question 205 (medium, multi select)

A data engineer is designing a disaster recovery plan for an Amazon RDS for MySQL database. The database must be recoverable within 1 hour in a different AWS Region. Which TWO actions should the engineer take?

Question 206 (hard, multi select)

A company uses Amazon DynamoDB to store session data for a web application. The application experiences throttling during peak hours. The data engineer needs to reduce throttling. Which THREE actions should the engineer take?

Question 207 (medium, multi select)

A data engineer is migrating a large Oracle data warehouse to Amazon Redshift. The engineer needs to ensure optimal performance. Which TWO practices should the engineer follow?

Question 208 (hard, multiple choice)

A company runs an e-commerce platform on AWS. The product catalog is stored in Amazon DynamoDB with a table that has a partition key of 'product_id' and a sort key of 'category'. The application frequently queries products by category and by product_id. Recently, the operations team noticed that read latency has increased significantly for queries that filter by category. The DynamoDB table has auto scaling enabled. The data engineer examines the CloudWatch metrics and sees that the ReadThrottleEvents metric is non-zero for the table, but the consumed read capacity is well below the provisioned limit. The table has a global secondary index (GSI) on the 'category' attribute. Which action is most likely to resolve the latency issue?

Question 209 (medium, multiple choice)

A media company stores video metadata in Amazon RDS for PostgreSQL. The database is 500 GB and experiences high write traffic. The data engineer notices that the transaction log (WAL) is growing rapidly, causing storage issues. The company needs to retain backups for 30 days for compliance. The database is currently using automated backups with a retention period of 7 days. Which solution should the engineer implement to address the WAL growth while meeting compliance requirements?

Question 210 (easy, multiple choice)

A startup uses Amazon S3 to store user-uploaded images. The images are accessed frequently for the first week after upload, but after that they are rarely accessed. The company wants to optimize storage costs without compromising availability. The data engineer must implement a lifecycle policy to transition objects to a more cost-effective storage class after 30 days. The objects must be retrievable within minutes. Which storage class should the engineer transition the objects to?

Question 211 (medium, multiple choice)

A company is using an Amazon RDS for MySQL database for its e-commerce platform. During a recent flash sale, the database experienced high read traffic, causing slow query performance. The company needs a solution that offloads read traffic with minimal application changes. Which action should be taken?

Question 212 (hard, multiple choice)

A data engineering team is designing a data lake on Amazon S3. They need to store raw data in its original format and transformed data in Parquet. The data is accessed by multiple analytics services, including Amazon Athena and Amazon Redshift Spectrum. Compliance requirements mandate that all data be encrypted at rest with AWS KMS and that the encryption keys be rotated every 90 days. Which S3 bucket configuration meets these requirements?

Question 213 (easy, multiple choice)

A company uses Amazon DynamoDB as the primary data store for a gaming application. The application stores user profiles and game state. During peak hours, the application experiences throttling on writes to the UserProfiles table. The table's read capacity is underutilized. Which solution will resolve the write throttling?

Question 214 (medium, multiple choice)

A company is running a data warehouse on Amazon Redshift. The data engineering team notices that query performance has degraded over time. They suspect that data distribution is causing excessive data movement between nodes. The table is joined frequently on the customer_id column. Which column should be chosen as the distribution key to optimize join performance?

Question 215 (hard, multiple choice)

A company uses Amazon S3 to store sensitive financial data. The security team requires that all objects be encrypted at rest using AWS KMS with a customer-managed key. Additionally, they want to audit all KMS decrypt calls for compliance. Which configuration should be used to meet these requirements?

Question 216 (easy, multiple choice)

A data engineer needs to store time-series data from IoT devices. The data is write-heavy and requires low-latency queries by device ID and timestamp. The data volume is expected to grow to terabytes. Which AWS database service is most suitable?

Question 217 (medium, multiple choice)

A company is migrating its on-premises Oracle database to Amazon Aurora PostgreSQL. The migration must have minimal downtime. The source database is 2 TB and runs on a single server. Which AWS service should be used for the migration?

Question 218 (hard, multiple choice)

A company is using Amazon DynamoDB with auto scaling enabled. During a marketing campaign, write traffic spikes, and some write requests fail with ProvisionedThroughputExceededException. The auto scaling policy has a target utilization of 70% and a maximum capacity that is high enough. What is the most likely cause of the throttling?

Question 219 (medium, multi select)

A company is designing a data lake on Amazon S3. The data engineering team needs to implement a lifecycle policy to manage costs. Which TWO actions should be taken to reduce storage costs?

Question 220 (hard, multi select)

A company is using Amazon Redshift for its data warehouse. The data engineering team needs to improve query performance for a large fact table that is frequently joined with multiple dimension tables. Which THREE strategies should be considered?

Question 221 (easy, multi select)

A data engineer is setting up Amazon S3 bucket policies for a data lake. The security team requires that all objects uploaded to the bucket be encrypted at rest using server-side encryption. Which TWO methods can enforce encryption at upload time?

Question 222 (medium, multiple choice)

A data engineer runs an AWS CLI command to retrieve the lifecycle configuration of the 'my-data-lake' bucket. The output is shown in the exhibit. What is the effect of this lifecycle policy?

Exhibit (not reproduced here: lifecycle configuration output for the 'my-data-lake' bucket)

Question 223 (hard, multiple choice)

A company runs a real-time analytics platform on AWS. Data is ingested from thousands of IoT devices into Amazon Kinesis Data Streams. A Lambda function consumes the stream, processes the data, and writes the results to an Amazon DynamoDB table. The DynamoDB table has a provisioned write capacity of 1000 WCU, and the read capacity is set to 200 RCU. Recently, the company noticed that the Lambda function is failing with ProvisionedThroughputExceededException on DynamoDB writes. The Lambda function is configured with a batch size of 100 and a concurrency limit of 10. The Kinesis shard count is 4. The number of devices has increased, but the data volume per device has remained the same. The company needs to resolve the write throttling without increasing the DynamoDB write capacity. Which action should the data engineer take?

Question 224 (easy, multiple choice)

A company stores its application logs in Amazon S3. The logs are generated daily and need to be retained for 3 years for compliance. The logs are accessed frequently for the first 30 days, occasionally for the next 6 months, and rarely after that. The data engineering team wants to minimize storage costs while ensuring that logs are available for retrieval within 12 hours for the first 6 months and within 48 hours after that. The team also wants to automatically delete logs after 3 years. Which lifecycle policy should the team implement?

Question 225 (medium, multiple choice)

A company is migrating its on-premises PostgreSQL database to Amazon RDS for PostgreSQL. The database is 5 TB in size and supports a critical application that requires less than 30 minutes of downtime. The company has a 1 Gbps network connection to AWS. The data engineering team plans to use AWS Database Migration Service (DMS) with change data capture (CDC) to keep the target in sync. During the full load phase, DMS is taking longer than expected, and the team is concerned about meeting the downtime window. Which action should the team take to speed up the full load?

Question 226 (medium, multi select)

A data engineering team is designing a data lake on Amazon S3 for storing sensor data from IoT devices. The data is written in near real-time and needs to be queried using Amazon Athena. Which TWO configurations should the team implement to optimize query performance and minimize costs?

Question 227 (hard, multi select)

A data engineer is troubleshooting an AWS Glue ETL job that reads from an S3 bucket and writes to a DynamoDB table. The job fails with an AccessDeniedException. The IAM role attached to the Glue job has the policy shown in the exhibit. Which TWO additional permissions are required to resolve the issue?

Exhibit

IAM Policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "dynamodb:GetItem",
        "dynamodb:PutItem",
        "dynamodb:Query"
      ],
      "Resource": "*"
    }
  ]
}

Question 228 (easy, multi select)

A data engineering team is migrating a MySQL database to Amazon RDS for MySQL. They need to ensure high availability and automated failover. Which THREE configurations should they implement?

Question 229 (medium, multiple choice)

A company uses Amazon S3 to store historical stock market data as CSV files. They run daily Amazon Athena queries to generate reports. Recently, the finance team reported that queries are timing out and costs have increased significantly. The data engineering team notices that the S3 bucket contains thousands of small files (average 100 KB) due to a misconfigured ingestion pipeline. They need to improve query performance and reduce costs without changing the existing reporting schedule. The team has access to AWS Glue and can create new tables. Which solution should they implement?

Question 230 (hard, multiple choice)

A data engineering team is building a real-time analytics pipeline using Amazon Kinesis Data Streams, AWS Lambda, and Amazon DynamoDB. The Lambda function consumes records from the stream and writes aggregated data to a DynamoDB table. The application requires that each record be processed exactly once to avoid duplicates. The Lambda function is idempotent, but occasionally duplicate records are written due to retries from Kinesis. The team needs to ensure exactly-once semantics for DynamoDB writes. Which solution should they implement?

Question 231 (easy, multiple choice)

A company uses Amazon Redshift for its data warehouse. The data engineering team loads data daily from Amazon S3 using COPY commands. Recently, the load performance has degraded because the S3 bucket contains many small files. The team needs to optimize the COPY operation to improve performance. Which approach should they take?

Question 232 (medium, multiple choice)

A data engineering team is managing an Amazon DynamoDB table that stores user session data. The table has a primary key of user_id (partition key) and session_id (sort key). The application performs strongly consistent reads on individual items. The team notices that read latency increases during peak hours. They suspect that the table is experiencing hot partitions. The team needs to improve read performance without changing the application code. Which solution should they implement?

Question 233 (hard, multiple choice)

A data team uses the CloudFormation template in the exhibit to create an S3 bucket for storing log files. After one year, they notice that the bucket size is larger than expected. They investigate and find that older versions of objects are not being deleted or transitioned. What is the most likely cause?

Exhibit

CloudFormation snippet:
"MyS3Bucket": {
  "Type": "AWS::S3::Bucket",
  "Properties": {
    "VersioningConfiguration": {
      "Status": "Enabled"
    },
    "LifecycleConfiguration": {
      "Rules": [
        {
          "Id": "ArchiveOldData",
          "Status": "Enabled",
          "ExpirationInDays": 365,
          "Transitions": [
            {
              "TransitionInDays": 30,
              "StorageClass": "STANDARD_IA"
            },
            {
              "TransitionInDays": 90,
              "StorageClass": "GLACIER"
            }
          ]
        }
      ]
    }
  }
}
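
In a versioned bucket, the ExpirationInDays and Transitions properties in the exhibit act on current object versions; noncurrent versions are governed by separate rule properties. The following is a hedged sketch of how the rule could be extended, built as a Python dictionary and printed as JSON. The property names NoncurrentVersionTransitions and NoncurrentVersionExpiration follow the AWS::S3::Bucket rule type as I understand it and should be checked against the CloudFormation reference; the day counts for noncurrent versions are illustrative assumptions, not values from the exhibit.

```python
import json

# Rule copied from the exhibit (applies to current object versions).
rule = {
    "Id": "ArchiveOldData",
    "Status": "Enabled",
    "ExpirationInDays": 365,
    "Transitions": [
        {"TransitionInDays": 30, "StorageClass": "STANDARD_IA"},
        {"TransitionInDays": 90, "StorageClass": "GLACIER"},
    ],
}

# Assumed additions so that noncurrent versions are also managed.
# The 30- and 365-day values are illustrative only.
rule["NoncurrentVersionTransitions"] = [
    {"TransitionInDays": 30, "StorageClass": "GLACIER"}
]
rule["NoncurrentVersionExpiration"] = {"NoncurrentDays": 365}

lifecycle_configuration = {"Rules": [rule]}
print(json.dumps(lifecycle_configuration, indent=2))
```

The printed structure can be compared against the LifecycleConfiguration block in the exhibit to see which properties the original rule lacks.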

Question 234 (easy, multiple choice)

A company is migrating its on-premises Oracle data warehouse to Amazon Redshift. The data engineering team needs to load data from Oracle to Redshift using AWS DMS (Database Migration Service). The source database is 2 TB in size. The team wants to minimize downtime and ensure data consistency during full load. Which approach should they take?

Question 235 (medium, multiple choice)

A data engineering team is using Amazon EMR to process large datasets stored in Amazon S3. The cluster uses Spot Instances for cost savings. During processing, the team notices that tasks are failing due to Spot Instance interruptions. The team needs to make the EMR job resilient to Spot interruptions without increasing costs significantly. Which solution should they implement?

Question 236 (hard, multiple choice)

A data engineering team is responsible for an Amazon RDS for PostgreSQL instance that stores financial data. The database is 500 GB in size. The team needs to create a read replica in a different AWS Region for disaster recovery. The source database has automated backups enabled with a retention period of 7 days. The team initiates the cross-Region read replica creation. After several hours, the replica shows a replication lag of 30 minutes, and the lag is increasing. What should the team do to reduce the replication lag?

Question 237 (easy, multiple choice)

A data engineering team is using AWS Glue to catalog data in an S3 data lake. They have a Glue crawler that runs daily to update the Data Catalog. Recently, they noticed that the crawler is taking longer to run and sometimes fails because of a timeout. The team suspects the issue is due to the large number of small files in the S3 bucket. They need to improve crawler performance and reliability. Which solution should they implement?

Question 238 (medium, multiple choice)

A company uses Amazon Kinesis Data Firehose to deliver streaming data to an Amazon S3 bucket. The data is JSON and each record is about 2 KB. The delivery stream is configured to buffer incoming data to 5 MB or 60 seconds, whichever comes first. The data engineering team notices that the S3 bucket contains many small files (average 2 MB), which makes subsequent processing inefficient. They need to reduce the number of small files without increasing the latency beyond 5 minutes. Which solution should they implement?
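
The numbers in this scenario can be checked with simple arithmetic. A delivery stream flushes its buffer when either the size limit or the time limit is reached, so each delivered object is roughly min(buffer size, ingest rate × interval). The sketch below uses hypothetical function names and assumes a steady ingest rate and no compression. It infers the rate from the observed 2 MB objects and then estimates object size for other buffer settings; the 64 MB and 300-second values are illustrative assumptions.

```python
def delivered_object_mb(rate_mb_per_s: float, buffer_mb: float, interval_s: float) -> float:
    """Approximate size of one delivered object: whichever limit is hit first."""
    return min(buffer_mb, rate_mb_per_s * interval_s)


def seconds_until_flush(rate_mb_per_s: float, buffer_mb: float, interval_s: float) -> float:
    """Approximate longest time a record waits before the buffer is flushed."""
    return min(interval_s, buffer_mb / rate_mb_per_s)


# Observed: ~2 MB objects with a 5 MB / 60 s buffer, so the interval fires first.
rate = 2.0 / 60.0  # MB per second, inferred from the scenario

current = delivered_object_mb(rate, buffer_mb=5, interval_s=60)
# Hypothetical alternative: a larger size limit and a 300 s interval.
larger = delivered_object_mb(rate, buffer_mb=64, interval_s=300)
wait = seconds_until_flush(rate, buffer_mb=64, interval_s=300)

print(f"5 MB / 60 s:   ~{current:.1f} MB per object")
print(f"64 MB / 300 s: ~{larger:.1f} MB per object, flushed after ~{wait:.0f} s")
```

Under the same assumptions, lengthening only the interval while keeping the 5 MB size limit would cap each object at 5 MB, because the size limit would then be reached before the interval expires.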

Question 239 (hard, multiple choice)

A data engineering team is managing an Amazon Redshift cluster that is used for BI reporting. The cluster has a mix of large tables (some over 1 TB) and many smaller tables. The team notices that queries on a large fact table are slow. The fact table is distributed using KEY distribution on the customer_id column, which has high cardinality. The team wants to improve query performance. They have the option to change the distribution style and sort key. Which redesign should they implement?

Question 240 (medium, multiple choice)

A data engineering team is using Amazon DynamoDB to store time-series data for a monitoring application. The table has a primary key of device_id (partition key) and timestamp (sort key). The application queries data for a specific device over a time range. The team notices that read latency is high for devices that generate large amounts of data. They need to improve query performance. Which solution should they implement?

Question 241 (hard, multiple choice)

A company runs a MySQL-compatible Amazon Aurora database for its e-commerce platform. The database experiences high write latency during peak hours. The application performs frequent INSERT and UPDATE operations on a table with 50 million rows. The DB instance is db.r5.large with 500 GB of Provisioned IOPS storage. A recent performance analysis shows that the average queue depth is consistently above 32 and write latency exceeds 50 ms. The company needs to reduce write latency without changing the application code. What should a data engineer do?

Question 242 (easy, multiple choice)

A media company stores video files in an Amazon S3 bucket with S3 Standard storage class. The files are accessed frequently for the first 30 days, then rarely after that. However, the company must be able to restore any deleted file within 7 days. The company wants to minimize storage costs while meeting the access and retention requirements. What should a data engineer do?

Question 243 (medium, multiple choice)

A company is migrating an on-premises MySQL database to Amazon RDS for MySQL. The database is 500 GB and has a 24/7 uptime requirement. The migration must minimize downtime. Which approach should be used?

Question 244 (easy, multiple choice)

A company stores sensitive customer data in Amazon S3. The security team requires that all data be encrypted at rest using server-side encryption with AWS KMS managed keys (SSE-KMS). Which S3 bucket policy condition will enforce this requirement?

Question 245 (hard, multiple choice)

A company is using Amazon DynamoDB with on-demand capacity for a gaming application. During a new game launch, write traffic spikes to 50,000 writes per second, but the application experiences throttling. The DynamoDB table has a partition key of 'game_id' and a sort key of 'timestamp'. What is the MOST likely cause of throttling?

Question 246 (easy, multiple choice)

A data engineer needs to store semi-structured JSON logs from AWS CloudTrail. The logs are append-only and rarely accessed after 90 days. Which storage solution is MOST cost-effective?

Question 247 (medium, multiple choice)

A company uses Amazon Redshift for data warehousing. The data engineering team notices that queries are slow due to high disk I/O. The team wants to improve query performance without changing the cluster configuration. Which action should the team take?

Question 248 (hard, multiple choice)

A company is using AWS Glue to run ETL jobs that write data to an Amazon S3 data lake. The jobs are failing with '503 Slow Down' errors. The data engineering team has already implemented retries. What is the BEST long-term solution?

Question 249 (easy, multiple choice)

A company wants to use Amazon Redshift Spectrum to query data in Amazon S3. The data is in Parquet format and partitioned by date. Which step is required to enable Redshift Spectrum?

Question 250 (medium, multiple choice)

A company is using Amazon RDS for PostgreSQL with Multi-AZ deployment. The primary instance fails and a failover occurs. After the failover, the application cannot connect to the database. What is the MOST likely cause?

Question 251 (hard, multiple choice)

A company is using Amazon ElastiCache for Redis to cache frequently accessed data. The cache hit ratio is low, and the engineering team suspects that the eviction policy is causing important data to be removed. Which eviction policy should be used to minimize eviction of the most frequently accessed keys?

Question 252 (medium, multi select)

A company is designing a data lake on Amazon S3 for analytics. The data includes sensitive personally identifiable information (PII). Which TWO actions should the company take to protect the data? (Choose TWO.)

Question 253hardmulti select

Read the full Data Store Management explanation →

A company is using Amazon Kinesis Data Streams to ingest real-time clickstream data. The data must be stored in Amazon S3 in near real-time with minimal overhead. Which THREE steps should the data engineer take to achieve this? (Choose THREE.)

Question 254easymulti select

Read the full Data Store Management explanation →

A company is evaluating Amazon DynamoDB for a new application. The application requires single-digit millisecond latency for read and write operations. Which TWO DynamoDB features should the company enable to achieve this? (Choose TWO.)

Question 255mediummultiple choice

Read the full Data Store Management explanation →

The exhibit shows the lifecycle configuration for an S3 bucket. Objects in the bucket are 200 days old on average. What will happen to the objects?

Exhibit (not shown)

Question 256hardmultiple choice

Read the full Data Store Management explanation →

The S3 bucket policy shown in the exhibit is attached to a bucket and grants access to the IAM role 'DataLakeRole'. The role is assumed by an AWS Glue job. The Glue job is failing with 'Access Denied' errors when trying to list objects in the bucket. Which action should be added to the policy to fix the issue?

Exhibit

Refer to the exhibit.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::123456789012:role/DataLakeRole"
      },
      "Action": [
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Resource": "arn:aws:s3:::my-data-lake/*"
    }
  ]
}
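
One way to reason about the exhibit is to check which action and resource pairs its single statement covers. The sketch below is a deliberately simplified matcher, not the real IAM evaluation engine: the function name `is_allowed` is hypothetical, wildcard matching is approximated with `fnmatchcase`, and conditions, principals, and explicit denies are ignored. It relies on the fact that `s3:ListBucket` is a bucket-level action, so the request targets the bucket ARN rather than an object ARN.

```python
from fnmatch import fnmatchcase

# Actions and resources copied from the exhibit's single Allow statement.
POLICY_ACTIONS = ["s3:GetObject", "s3:PutObject"]
POLICY_RESOURCES = ["arn:aws:s3:::my-data-lake/*"]

BUCKET_ARN = "arn:aws:s3:::my-data-lake"
OBJECT_ARN = "arn:aws:s3:::my-data-lake/raw/file.parquet"  # hypothetical key


def is_allowed(action, resource, actions, resources):
    """Toy check: the action is listed and the resource matches a pattern."""
    action_ok = action in actions
    resource_ok = any(fnmatchcase(resource, pattern) for pattern in resources)
    return action_ok and resource_ok


# Object-level read is covered by the exhibit as written.
get_ok = is_allowed("s3:GetObject", OBJECT_ARN, POLICY_ACTIONS, POLICY_RESOURCES)

# Listing targets the bucket ARN, which "my-data-lake/*" does not match.
list_ok = is_allowed("s3:ListBucket", BUCKET_ARN, POLICY_ACTIONS, POLICY_RESOURCES)

# Hypothetical extension: add the action and the bucket-level resource.
list_ok_extended = is_allowed(
    "s3:ListBucket",
    BUCKET_ARN,
    POLICY_ACTIONS + ["s3:ListBucket"],
    POLICY_RESOURCES + [BUCKET_ARN],
)

print(get_ok, list_ok, list_ok_extended)
```

In this simplified model, adding the action alone is not enough, because the object-level pattern ending in `/*` never matches the bare bucket ARN.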

Question 257easymultiple choice

Read the full Data Store Management explanation →

The exhibit shows the output of describe-table for a DynamoDB table. The application is experiencing throttling errors when reading data. What is the MOST likely cause?

Exhibit (not shown)

Question 258easymultiple choice

Read the full Data Store Management explanation →

A company is using Amazon S3 to store sensitive data. They need to automatically transition objects to S3 Glacier Deep Archive after 90 days and delete them after 7 years. Which S3 lifecycle configuration action should be used?
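
As a rough illustration of the kind of rule this scenario describes, the sketch below builds a lifecycle configuration as a Python dictionary in the shape used by the S3 PutBucketLifecycleConfiguration API. The rule ID and the empty prefix filter are hypothetical choices, and 7 years is approximated as 7 × 365 days.

```python
import json

DAYS_PER_YEAR = 365  # approximation; ignores leap days


def build_lifecycle_config(transition_days: int, retention_years: int) -> dict:
    """Return a lifecycle configuration with one transition and one expiration."""
    return {
        "Rules": [
            {
                "ID": "archive-then-expire",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix: applies to every object
                "Transitions": [
                    {"Days": transition_days, "StorageClass": "DEEP_ARCHIVE"}
                ],
                "Expiration": {"Days": retention_years * DAYS_PER_YEAR},
            }
        ]
    }


config = build_lifecycle_config(transition_days=90, retention_years=7)
print(json.dumps(config, indent=2))
```

A dictionary like this could be passed as the `LifecycleConfiguration` argument of boto3's `put_bucket_lifecycle_configuration`, assuming credentials and a bucket name are supplied separately.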

Question 259mediummultiple choice

Read the full Data Store Management explanation →

A data engineer is designing a data store for a time-series application that requires sub-millisecond read latency for the latest data and high ingestion rates. Which AWS service is most suitable?

Question 260hardmultiple choice

Read the full Data Store Management explanation →

A company is migrating an on-premises Hadoop cluster to AWS. The data is stored in HDFS and needs to be accessible by both Amazon EMR and Amazon Redshift Spectrum. Which storage solution is most cost-effective and scalable?

Question 261easymultiple choice

Read the full Data Store Management explanation →

A data engineer needs to store JSON documents that are frequently updated and require ACID transactions. Which AWS database service is most appropriate?

Question 262mediummultiple choice

Read the full Data Store Management explanation →

A company is using Amazon RDS for MySQL and needs to reduce read latency for a global user base. Which AWS feature should be implemented?

Question 263hardmultiple choice

Read the full Data Store Management explanation →

A data engineer is troubleshooting an Amazon Redshift cluster that is experiencing slow query performance. The engineer notices that the disk space is heavily utilized and queries are spilling to disk. What is the most cost-effective solution to improve performance?

Question 264easymultiple choice

Read the full Data Store Management explanation →

A company needs to store streaming data from IoT devices with a retention period of 7 days for real-time analysis. Which AWS service is most suitable?

Question 265mediummultiple choice

Read the full Data Store Management explanation →

A company has an Amazon S3 bucket with versioning enabled. They want to automatically delete noncurrent versions of objects after 30 days. Which lifecycle rule action should be used?

Question 266hardmultiple choice

Read the full Data Store Management explanation →

A data engineer is designing a data lake on Amazon S3. The data is frequently accessed by multiple analytics services, and the company needs to enforce fine-grained access control based on data tags. Which combination of AWS services should be used?

Question 267mediummulti select

Read the full Data Store Management explanation →

Which TWO of the following are benefits of using Amazon DynamoDB Accelerator (DAX)? (Choose TWO.)

Question 268mediummulti select

Read the full Data Store Management explanation →

Which THREE of the following are valid storage classes in Amazon S3? (Choose THREE.)

Question 269hardmulti select

Read the full Data Store Management explanation →

Which TWO of the following are best practices for Amazon Redshift table design? (Choose TWO.)

Question 270mediummultiple choice

Read the full Data Store Management explanation →

A company is using Amazon S3 to store sensitive data. To meet compliance requirements, they need to automatically transition objects to S3 Glacier Deep Archive after 90 days and delete them after 7 years. What is the MOST cost-effective way to configure this?

Question 271easymultiple choice

Read the full Data Store Management explanation →

A data engineer needs to store JSON documents that are frequently read and written by a web application. The data has a flexible schema and requires low-latency queries on primary key lookups. Which AWS service is MOST suitable?

Question 272hardmultiple choice

Read the full Data Store Management explanation →

A company runs a critical transactional database on Amazon RDS for PostgreSQL. They need to achieve high availability with automatic failover to a different AWS Region in case of a regional outage. Which solution meets these requirements?

Question 273mediummultiple choice

Read the full Data Store Management explanation →

A data engineer is designing a data lake on Amazon S3. The data is ingested from multiple sources in Parquet format, and the schema evolves over time. Which approach allows querying the data with Amazon Athena while supporting schema evolution?

Question 274easymultiple choice

Read the full Data Store Management explanation →

A company needs to store relational data that requires complex joins and transactional consistency. The workload is predictable and the data size is less than 500 GB. Which AWS service is MOST cost-effective for this use case?

Question 275hardmultiple choice

Read the full Data Store Management explanation →

A company uses Amazon DynamoDB for a gaming leaderboard. The table has a partition key of 'GameId' and a sort key of 'Score'. The application needs to query the top 10 scores for a given game. Which DynamoDB feature should be used for optimal performance?

Question 276mediummultiple choice

Read the full Data Store Management explanation →

A company is migrating an on-premises Hadoop cluster to AWS. The cluster processes large files in CSV format using Apache Spark. Which data store should be used as the primary storage for the data lake to optimize cost and performance?

Question 277hardmultiple choice

Read the full Data Store Management explanation →

A company has an Amazon Redshift cluster with a mix of frequently accessed hot data and rarely accessed cold data. They want to reduce storage costs without affecting query performance for the hot data. Which strategy is MOST effective?

Question 278easymultiple choice

Read the full Data Store Management explanation →

A company needs to store archival logs that must be retained for 10 years. The logs are accessed infrequently, but when accessed, retrieval must occur within 12 hours. Which storage class is MOST cost-effective?

Question 279mediummulti select

Read the full Data Store Management explanation →

A company is using Amazon S3 to store sensitive financial data. They need to ensure that all objects are encrypted at rest. Which TWO methods can achieve this? (Choose TWO.)

Question 280hardmulti select

Read the full Data Store Management explanation →

A company is designing a multi-Region disaster recovery solution for Amazon DynamoDB. They need to ensure that data is replicated across Regions with minimal latency and that applications can read from any Region. Which THREE steps should be taken? (Choose THREE.)

Question 281easymulti select

Read the full Data Store Management explanation →

A company is using Amazon RDS for MySQL and wants to automate backups for point-in-time recovery. Which TWO actions should be taken? (Choose TWO.)

Question 282mediummultiple choice

Read the full Data Store Management explanation →

Refer to the exhibit. A data engineer configured the lifecycle policy shown. The 'logs/' prefix contains important audit logs. After 365 days, what happens to the objects?

Exhibit (not shown)

Question 283hardmultiple choice

Read the full Data Store Management explanation →

Refer to the exhibit. An IAM policy is attached to an IAM role used by an application. The application needs to decrypt objects in an S3 bucket using a customer managed KMS key. What is the effect of this policy?

Exhibit

Refer to the exhibit.
```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["kms:Decrypt", "kms:Encrypt"],
            "Resource": "arn:aws:kms:us-east-1:123456789012:key/abc123"
        },
        {
            "Effect": "Deny",
            "Action": ["kms:Decrypt"],
            "Resource": "arn:aws:kms:us-east-1:123456789012:key/abc123",
            "Condition": {
                "StringNotEquals": {
                    "kms:ViaService": "s3.us-east-1.amazonaws.com"
                }
            }
        }
    ]
}
```
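
The interaction between the two statements in the exhibit can be sketched as a toy evaluator. This is not the real IAM engine: it models only this identity policy, assumes the key policy delegates access to IAM policies, and assumes `kms:ViaService` is absent from the request context when the caller invokes KMS directly. The function name `decrypt_allowed` is hypothetical.

```python
from typing import Optional

S3_ENDPOINT = "s3.us-east-1.amazonaws.com"  # value from the exhibit's condition


def decrypt_allowed(via_service: Optional[str]) -> bool:
    """Toy evaluation of kms:Decrypt against the exhibit's two statements.

    via_service is the kms:ViaService value in the request context,
    or None when the caller invokes KMS directly (assumed absent).
    """
    # Statement 1: Allow kms:Decrypt on the key, with no condition.
    allow_matches = True

    # Statement 2: Deny kms:Decrypt when StringNotEquals holds. A negated
    # operator is true when the value differs and also when the key is absent.
    deny_matches = via_service != S3_ENDPOINT

    # An explicit Deny overrides any Allow.
    return allow_matches and not deny_matches


print(decrypt_allowed(S3_ENDPOINT))                     # through S3 in us-east-1
print(decrypt_allowed(None))                            # direct KMS API call
print(decrypt_allowed("ec2.us-east-1.amazonaws.com"))   # through another service
```

Only `kms:Decrypt` appears in the Deny statement, so `kms:Encrypt` is unaffected by the condition in this model.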

Question 284mediummultiple choice

Read the full Data Store Management explanation →

Refer to the exhibit. A data engineer notices that the Redshift cluster 'mycluster' retains automated snapshots for only 7 days. However, the compliance team requires at least 35 days of backup retention. What should the engineer do?

Exhibit

Refer to the exhibit.
```
[ec2-user@ip-10-0-0-1 ~]$ aws redshift describe-clusters --cluster-identifier mycluster
{
    "Clusters": [
        {
            "ClusterIdentifier": "mycluster",
            "NodeType": "dc2.large",
            "NumberOfNodes": 2,
            "ClusterStatus": "available",
            "DBName": "dev",
            "MasterUsername": "admin",
            "AutomatedSnapshotRetentionPeriod": 7,
            "ManualSnapshotRetentionPeriod": 30,
            "ClusterVersion": "1.0"
        }
    ]
}
```
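
The gap between the configured and required retention can be read straight from the command output. The sketch below parses a trimmed copy of the exhibit's JSON and reports how many days each cluster falls short. The helper name `retention_gaps` and the trimmed payload are illustrative assumptions.

```python
import json

REQUIRED_DAYS = 35  # compliance requirement stated in the question

# Trimmed copy of the describe-clusters output from the exhibit.
describe_output = """
{
    "Clusters": [
        {
            "ClusterIdentifier": "mycluster",
            "AutomatedSnapshotRetentionPeriod": 7,
            "ManualSnapshotRetentionPeriod": 30
        }
    ]
}
"""


def retention_gaps(raw: str, required_days: int) -> dict:
    """Map each cluster identifier to the days it falls short of the requirement."""
    gaps = {}
    for cluster in json.loads(raw)["Clusters"]:
        current = cluster["AutomatedSnapshotRetentionPeriod"]
        if current < required_days:
            gaps[cluster["ClusterIdentifier"]] = required_days - current
    return gaps


gaps = retention_gaps(describe_output, REQUIRED_DAYS)
print(gaps)  # {'mycluster': 28}
```

The automated retention period is a modifiable cluster setting, for example through the `--automated-snapshot-retention-period` option of `aws redshift modify-cluster`.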

Question 285mediummultiple choice

Read the full Data Store Management explanation →

A company stores sensitive data in an S3 bucket. To meet compliance requirements, they must ensure that all objects are encrypted at rest using server-side encryption with AWS KMS. Which bucket policy statement should be applied to deny uploads that do not use the required encryption?

Question 286easymultiple choice

Read the full Data Store Management explanation →

A data engineer is designing a data lake on Amazon S3. The data is accessed frequently for the first 30 days, then rarely accessed after 90 days, and must be archived after 1 year. Which S3 lifecycle policy configuration meets these requirements with the lowest cost?

Question 287hardmultiple choice

Read the full Data Store Management explanation →

A company uses Amazon Redshift for its data warehouse. The data engineering team notices that queries against a large fact table are slow. The table is distributed using DISTSTYLE EVEN and has multiple sort keys. After analyzing the query plans, they find that most queries filter on a specific column, 'customer_id'. Which change would most likely improve query performance for these filter operations?

Question 288mediummultiple choice

Read the full Data Store Management explanation →

A data engineer needs to migrate an on-premises PostgreSQL database to Amazon RDS for PostgreSQL. The database is 2 TB and has a continuous stream of write operations. The migration should minimize downtime. Which AWS service should be used?

Question 289hardmultiple choice

Read the full Data Store Management explanation →

A company uses Amazon DynamoDB for a gaming application. The table has a partition key of 'user_id' and a sort key of 'game_timestamp'. The application frequently queries by 'user_id' and filters by 'game_timestamp' within a specific date range. The queries are slow. The table has a global secondary index (GSI) on 'game_timestamp'. What is the most likely cause of the slow queries?

Question 290easymultiple choice

Read the full Data Store Management explanation →

A data engineer needs to store semi-structured JSON data that is accessed infrequently but requires millisecond retrieval latency. The data is immutable once written. Which AWS service is most cost-effective?

Question 291mediummultiple choice

Read the full Data Store Management explanation →

A company runs a critical application on Amazon RDS for MySQL. To ensure high availability and automatic failover, the database is deployed as a Multi-AZ DB instance. The application's workload is read-heavy. Which additional configuration should be used to offload read traffic without impacting write performance?

Question 292hardmultiple choice

Read the full Data Store Management explanation →

A data engineer is troubleshooting an Amazon Redshift cluster that has been experiencing slow query performance. The engineer checks the system tables and finds that many queries are waiting on 'wlm_queued' time. The cluster has 10 nodes and uses automatic WLM. What is the most likely cause?

Question 293hardmultiple choice

Read the full Data Store Management explanation →

A company uses Amazon S3 to store large datasets. The data engineering team needs to provide access to specific objects in the bucket to external partners using presigned URLs. Each URL should expire after 12 hours. The team wants to ensure that the presigned URLs cannot be used to access other objects in the bucket. Which approach should be taken?

Question 294mediummulti select

Read the full Data Store Management explanation →

Which TWO of the following are valid approaches to implement fine-grained access control for Amazon DynamoDB items based on user attributes? (Choose TWO.)

Question 295mediummulti select

Read the full Data Store Management explanation →

Which THREE of the following are best practices for managing data storage costs in Amazon S3? (Choose THREE.)

Question 296easymulti select

Read the full Data Store Management explanation →

Which TWO of the following are features of Amazon RDS Multi-AZ deployments? (Choose TWO.)

Question 297hardmultiple choice

Read the full Data Store Management explanation →

An IAM policy is attached to a role assumed by authenticated users via Amazon Cognito. What does this policy allow?

Exhibit

Refer to the exhibit.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "dynamodb:GetItem",
        "dynamodb:Query"
      ],
      "Resource": "arn:aws:dynamodb:us-east-1:123456789012:table/Orders",
      "Condition": {
        "ForAllValues:StringEquals": {
          "dynamodb:LeadingKeys": [
            "${cognito-identity.amazonaws.com:sub}"
          ]
        }
      }
    }
  ]
}

Question 298mediummultiple choice

Read the full Data Store Management explanation →

A data engineer runs the command shown in the exhibit and reviews its output. What does the 'MFADelete' setting imply?

Exhibit (not shown)

Question 299hardmultiple choice

Read the full Data Store Management explanation →

A data engineer runs the DDL statement shown in the exhibit in Amazon Athena. The query returns an error. What is the most likely cause?

Exhibit

Refer to the exhibit.

CREATE EXTERNAL TABLE my_table (
  id INT,
  name STRING,
  value DOUBLE
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION 's3://my-bucket/data/'
TBLPROPERTIES ('classification'='parquet');

Question 300mediummultiple choice

Read the full Data Store Management explanation →

A company uses Amazon S3 to store sensitive data. The security team wants to ensure that all objects uploaded to a specific S3 bucket are automatically encrypted at rest using server-side encryption with AWS KMS managed keys (SSE-KMS). Which bucket policy statement should be added to enforce this requirement?

Question 301hardmultiple choice

Read the full Data Store Management explanation →

A data engineering team is designing a data lake on Amazon S3. They need to store raw data in a format that supports schema evolution and is optimized for analytics with Amazon Athena. Which storage format should they use?

Question 302easymultiple choice

Read the full Data Store Management explanation →

A company runs a MySQL database on Amazon RDS. The database size is 500 GB and is experiencing high read traffic. The team wants to improve read performance with minimal operational overhead. Which action should they take?

Question 303mediummultiple choice

Read the full Data Store Management explanation →

A data engineer needs to transfer 10 TB of data from an on-premises Hadoop cluster to Amazon S3. The network bandwidth is limited to 100 Mbps, and the transfer must be completed within 48 hours. Which solution meets the requirements?
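
The feasibility of a network transfer in this scenario comes down to simple arithmetic. The sketch below estimates the best-case transfer time, assuming decimal units (1 TB = 10^12 bytes, 1 Mbps = 10^6 bits per second) and a link that is fully available to the transfer. The `efficiency` parameter is a hypothetical knob for protocol overhead.

```python
def transfer_hours(size_tb: float, link_mbps: float, efficiency: float = 1.0) -> float:
    """Best-case hours needed to move size_tb terabytes over a link_mbps link."""
    bits = size_tb * 1e12 * 8                      # terabytes -> bits
    seconds = bits / (link_mbps * 1e6 * efficiency)
    return seconds / 3600


hours = transfer_hours(size_tb=10, link_mbps=100)
print(f"{hours:.1f} hours, about {hours / 24:.1f} days")  # 222.2 hours, about 9.3 days
print("fits in 48 hours:", hours <= 48)                   # fits in 48 hours: False
```

Even at full line rate, the transfer takes several times longer than the 48-hour window.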

Question 304hardmultiple choice

Read the full Data Store Management explanation →

A company has an Amazon DynamoDB table with on-demand capacity mode. The table stores session data for a web application. Recently, the application experienced throttling errors during a traffic spike. The team wants to prevent future throttling while optimizing costs. What should they do?

Question 305easymultiple choice

Read the full Data Store Management explanation →

A data engineer needs to store JSON documents that are frequently accessed by a low-latency web application. The data does not require complex queries, and the access pattern is primarily by a key. Which AWS service is most appropriate?

Question 306mediummultiple choice

Read the full Data Store Management explanation →

A company uses Amazon Redshift for data warehousing. The data team notices that queries are slow due to high disk usage on the cluster. They need to free up space without deleting any data. What should they do?

Question 307hardmulti select

Read the full Data Store Management explanation →

A financial services company stores sensitive transaction data in an Amazon S3 bucket. Compliance requires that all objects be encrypted using SSE-KMS and that the bucket be protected from accidental deletion. Which combination of actions meets these requirements? (Select TWO.)

Question 308easymultiple choice

Read the full Data Store Management explanation →

A startup is building a mobile application that requires a database to store user profiles and preferences. The database must scale automatically with minimal administration. Which AWS service should they use?

Question 309mediummulti select

Read the full Data Store Management explanation →

Which TWO actions can improve query performance on an Amazon Redshift cluster? (Choose TWO.)

Question 310hardmulti select

Read the full Data Store Management explanation →

Which THREE factors should a data engineer consider when choosing between Amazon S3 and Amazon DynamoDB for storing time-series data? (Choose THREE.)

Question 311easymulti select

Read the full Data Store Management explanation →

Which TWO features of Amazon S3 help protect data from accidental deletion or modification? (Choose TWO.)

Question 312mediummultiple choice

Read the full Data Store Management explanation →

A company is using Amazon S3 to store large amounts of archival data. The data is accessed infrequently but must be immediately retrievable when needed. Which storage class is the most cost-effective choice?

Question 313hardmultiple choice

Read the full Data Store Management explanation →

A data engineer needs to set up a new Amazon RDS for PostgreSQL database for a production workload. The database must be highly available and resilient to a single Availability Zone failure. Which configuration should the engineer choose?

Question 314easymultiple choice

Read the full Data Store Management explanation →

A company wants to migrate its on-premises MySQL database to Amazon RDS for MySQL with minimal downtime. Which AWS service should be used for the migration?

Question 315hardmultiple choice

Read the full Data Store Management explanation →

A data engineer is troubleshooting an Amazon DynamoDB table that has frequent throttling exceptions for write requests. The table has auto scaling enabled. What is the most likely cause?

Question 316mediummultiple choice

Read the full Data Store Management explanation →

A company stores sensitive data in Amazon S3. They need to ensure that all objects are encrypted at rest. Which approach meets this requirement with minimal effort?

Question 317easymultiple choice

Read the full Data Store Management explanation →

A data engineer is designing a data lake on Amazon S3. Which feature should be used to manage the lifecycle of objects and move them to cheaper storage classes automatically?

Question 318hardmultiple choice

Read the full Data Store Management explanation →

A company has an Amazon RDS for MySQL database that is experiencing performance issues due to a large number of read requests. The application is read-heavy and can tolerate eventually consistent reads. Which action will reduce the load on the primary database with the least operational overhead?

Question 319mediummultiple choice

Read the full Data Store Management explanation →

A data engineer needs to store JSON documents that are accessed by a key-value pattern. The workload requires single-digit millisecond latency at any scale. Which AWS service is most appropriate?

Question 320easymultiple choice

Read the full Data Store Management explanation →

A company wants to enforce that all data in an S3 bucket is encrypted at rest using AWS KMS. Which bucket policy condition key should be used?

Question 321mediummulti select

Read the full Data Store Management explanation →

A data engineer is designing a disaster recovery strategy for an Amazon RDS for PostgreSQL database. The primary database is in us-east-1. Which TWO approaches provide cross-region disaster recovery?

Question 322hardmulti select

Read the full Data Store Management explanation →

A company is migrating a large Oracle data warehouse to Amazon Redshift. Which THREE considerations are important for optimizing the Redshift cluster?

Question 323easymulti select

Read the full Data Store Management explanation →

A data engineer is setting up Amazon S3 event notifications to trigger an AWS Lambda function when new objects are uploaded. Which TWO actions are required to enable this?

Question 324mediummultiple choice

Read the full Data Store Management explanation →

The exhibit shows an S3 bucket policy. What is the effect of this policy?

Exhibit

Refer to the exhibit.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Resource": "arn:aws:s3:::my-bucket/*",
      "Condition": {
        "Bool": {
          "aws:SecureTransport": "true"
        }
      }
    },
    {
      "Effect": "Deny",
      "Action": "s3:*",
      "Resource": "arn:aws:s3:::my-bucket/*",
      "Condition": {
        "Bool": {
          "aws:SecureTransport": "false"
        }
      }
    }
  ]
}

Question 325hardmultiple choice

Read the full Data Store Management explanation →

The exhibit shows the output of describe-table for a DynamoDB table. The table is used for a reporting job that queries by 'pk' and filters on 'sk' using a range condition. The job is running slowly. What is the most likely cause?

Exhibit (not shown)

Question 326easymultiple choice

Read the full Data Store Management explanation →

The exhibit shows a build log from AWS CodeBuild. The build fails with a permission error when trying to open the downloaded file. What is the most likely cause?

Exhibit

Refer to the exhibit.

[Container] 2024/01/15 10:00:00 Running command aws s3 cp s3://my-bucket/report.csv .
[Container] 2024/01/15 10:00:02 download: s3://my-bucket/report.csv to ./report.csv
[Container] 2024/01/15 10:00:02 Running command python3 process.py
[Container] 2024/01/15 10:00:05 Error: Unable to open file 'report.csv': Permission denied

Question 327easymultiple choice

Read the full Data Store Management explanation →

A company is running a production database on Amazon RDS for PostgreSQL. The database experiences high read traffic from multiple application servers. Which data store management strategy would reduce the load on the primary database instance?

Question 328mediummultiple choice

Read the full Data Store Management explanation →

A data engineer notices that an Amazon Redshift cluster is running low on disk space. The cluster has three nodes of type dc2.large. Which action will increase the available storage capacity?

Question 329hardmultiple choice

Read the full Data Store Management explanation →

A company uses Amazon DynamoDB as a session store for a web application. During peak hours, the application experiences high latency and throttling on the DynamoDB table. The table has a read capacity of 5000 RCU and write capacity of 2000 WCU. The application reads and writes session data using the session ID as the partition key. What is the most cost-effective solution to reduce throttling?

Question 330easymultiple choice

Read the full Data Store Management explanation →

A data engineer needs to store semi-structured JSON files that are accessed infrequently but must be retrievable within minutes. The data should be stored cost-effectively. Which storage solution meets these requirements?

Question 331mediummultiple choice

Read the full Data Store Management explanation →

A company uses Amazon Redshift for analytics. The data engineer notices that some queries are slow and the EXPLAIN plan shows a 'Seq Scan' on a large table. Which data store management action would most likely improve query performance?

Question 332hardmultiple choice

Read the full Data Store Management explanation →

A company stores IoT sensor data in an Amazon S3 bucket. The data is ingested every minute and each object is about 10 KB. The data must be stored for at least 7 years for compliance. Which lifecycle policy configuration minimizes storage costs?

Question 333easymultiple choice

Read the full Data Store Management explanation →

A data engineer is designing a data lake on Amazon S3. The data includes personally identifiable information (PII) that must be encrypted at rest. Which encryption option provides the most control over encryption keys?

Question 334mediummultiple choice

Read the full Data Store Management explanation →

A company uses Amazon RDS for MySQL with Multi-AZ deployment. The primary instance fails, and automatic failover occurs. After failover, the application experiences higher latency. What is the most likely cause?

Question 335hardmultiple choice

Read the full Data Store Management explanation →

A data engineer is migrating an on-premises Apache HBase workload to Amazon DynamoDB. The HBase table has a row key with composite structure: customer_id (10 chars) + timestamp (10 digits). The access pattern is to query by customer_id and retrieve the latest entries. How should the DynamoDB table be designed to optimize performance?

Question 336mediummulti select

Read the full Data Store Management explanation →

A company is designing a disaster recovery strategy for an Amazon RDS for SQL Server database. The database must be recoverable in another AWS region within 15 minutes of a regional outage. Which TWO actions should the data engineer take?

Question 337hardmulti select

Read the full Data Store Management explanation →

A data engineer is using Amazon Athena to query data stored in an S3 bucket. The queries are running slowly. Which THREE actions can improve query performance?

Question 338easymulti select

Read the full Data Store Management explanation →

A company is building a data pipeline that ingests streaming data from IoT devices. The data must be stored in a durable, scalable, and cost-effective manner for batch processing. Which TWO AWS services should be used together?

Question 339mediummultiple choice

Read the full Data Store Management explanation →

A data engineer applies the IAM policy shown in the exhibit to an IAM user. The user attempts to download an object from the bucket 'example-bucket' that is encrypted with SSE-S3 (AES256). Will the request succeed?

Exhibit

Refer to the exhibit.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::example-bucket/*",
      "Condition": {
        "StringEquals": {
          "s3:x-amz-server-side-encryption": "AES256"
        }
      }
    }
  ]
}

Question 340hardmultiple choice

Read the full Data Store Management explanation →

An application uses the 'orders' DynamoDB table with the schema and provisioned throughput shown in the exhibit. The application frequently queries by customer_id (range key) without specifying the order_id (partition key). What is the most likely impact on performance?

Exhibit (not shown)

Question 341easymultiple choice

Read the full Data Store Management explanation →

A data engineer runs the SQL commands shown in the exhibit on an Amazon Redshift cluster. The table 'users' is created with DISTSTYLE EVEN. What effect does DISTSTYLE EVEN have on query performance?

Exhibit

Refer to the exhibit.

CREATE TABLE users (
    user_id INT PRIMARY KEY,
    username VARCHAR(50),
    email VARCHAR(100)
) DISTSTYLE EVEN;

INSERT INTO users VALUES (1, 'alice', 'alice@example.com');
INSERT INTO users VALUES (2, 'bob', 'bob@example.com');

Question 342mediummultiple choice

Read the full Data Store Management explanation →

A company is using an Amazon RDS for MySQL database for an e-commerce application. During a sales event, the database experiences high read traffic, causing slow query performance. The company wants to reduce the read load on the primary database without changing the application code. Which solution meets these requirements?

Question 343easymultiple choice

Read the full Data Store Management explanation →

A data engineer needs to store semi-structured JSON data that is accessed infrequently but must be retrievable within minutes when needed. The data will be stored for 7 years for compliance. Which storage solution is MOST cost-effective?

Question 344hardmultiple choice

Read the full Data Store Management explanation →

A company uses Amazon DynamoDB as the primary data store for a gaming application. The application experiences sudden spikes in traffic. The data engineer notices that write requests are throttled during peak times. The partition keys are well-distributed. What should the data engineer do to reduce throttling?

Question 345easymultiple choice

Read the full Data Store Management explanation →

A company is migrating an on-premises MongoDB database to Amazon DocumentDB. The data engineer needs to ensure minimal downtime during migration. Which AWS service should be used to facilitate the migration?

Question 346mediummultiple choice

Read the full Data Store Management explanation →

A company uses Amazon Redshift for data warehousing. The data engineer notices that query performance has degraded over time. The tables are frequently updated with new data, and the data engineer suspects that the distribution style is causing data skew. Which distribution style should the data engineer use to minimize data skew?

Question 347 (hard, multiple choice)

A company is designing a data lake on Amazon S3. The data includes personally identifiable information (PII). The data engineer must ensure that only authorized users can access the data, and that access is logged for auditing. Which combination of services should the data engineer use?

Question 348 (easy, multiple choice)

A company is storing large amounts of log data in Amazon S3. The data is accessed frequently for the first 30 days, then rarely after that. The company wants to automatically transition the data to a lower-cost storage class after 30 days. Which S3 feature should the data engineer use?

Question 349 (medium, multiple choice)

A company is using Amazon DynamoDB to store session data for a web application. The data engineer needs to ensure that the data is encrypted at rest. Which action should the data engineer take?

Question 350 (hard, multiple choice)

A company is using Amazon S3 to store sensitive data. The security team requires that all data be encrypted at rest using a customer-managed AWS KMS key. The data engineer must ensure that only a specific IAM role can decrypt the data. Which policy should the data engineer attach to the KMS key?

Question 351 (medium, multi select)

Which TWO statements are true about Amazon Redshift distribution styles? (Choose TWO.)

Question 352 (easy, multi select)

Which THREE actions can help improve read performance in Amazon DynamoDB? (Choose THREE.)

Question 353 (hard, multi select)

Which TWO are benefits of using Amazon S3 Object Lock? (Choose TWO.)

Question 354 (medium, multiple choice)

A data engineer is analyzing a DynamoDB table for a session management application. The table currently has 10,000 items and is 1 MB in size. The application expects 1,000 writes per second during peak hours. What should the data engineer do to accommodate the write workload?
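
As a rough sizing aid for the scenario above, the following sketch estimates the write capacity the stated workload would consume. It assumes standard (non-transactional) writes, where one write capacity unit (WCU) covers one write per second for an item of up to 1 KB, and it treats 1 MB as 1,048,576 bytes. The function name `required_wcu` is illustrative only and is not part of any AWS SDK.

```python
import math


def required_wcu(item_size_bytes: float, writes_per_second: int) -> int:
    """Estimate WCUs for standard writes (1 WCU = one write/s for an item up to 1 KB)."""
    # Each write is billed in whole 1 KB steps, with a minimum of one unit.
    units_per_write = max(1, math.ceil(item_size_bytes / 1024))
    return units_per_write * writes_per_second


# Average item size from the scenario: a 1 MB table holding 10,000 items.
avg_item_bytes = (1 * 1024 * 1024) / 10_000

# Peak workload from the scenario: 1,000 writes per second.
peak_wcu = required_wcu(avg_item_bytes, 1_000)
print(f"average item size: {avg_item_bytes:.1f} bytes")
print(f"estimated peak write capacity: {peak_wcu} WCU")
```

Because the average item is far below 1 KB, each write costs a single unit, so the estimate is driven entirely by the request rate rather than by the table size.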

Question 355 (hard, multiple choice)

A data engineer is reviewing an IAM policy that controls access to an S3 bucket. The policy is attached to a user group. The engineer notices that users are unable to download objects from the bucket. What is the likely cause?

Exhibit

Refer to the exhibit.

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::my-bucket/*",
            "Condition": {
                "StringEquals": {
                    "s3:x-amz-server-side-encryption": "AES256"
                }
            }
        }
    ]
}
```

Question 356 (easy, multiple choice)

A data engineer is reviewing the lifecycle configuration of an S3 bucket. The bucket stores log files. The engineer wants to ensure that objects are deleted after 365 days. What is the current behavior?

Question 357 (medium, multiple choice)

A company is using Amazon RDS for MySQL with Multi-AZ deployment. The database size is 2 TB and the workload is read-heavy. To improve read performance, which option should be used?

Question 358 (easy, multiple choice)

A data engineer needs to store large amounts of data that is accessed infrequently but must be retrieved immediately when needed. Which Amazon S3 storage class is most cost-effective?

Question 359 (hard, multiple choice)

A company uses Amazon DynamoDB with on-demand capacity for a gaming leaderboard. The table has 100 GB of data and receives 10,000 write requests per second with spikes to 50,000. The application experiences throttling during spikes. Which action should be taken to reduce throttling without changing the application?

Question 360 (medium, multiple choice)

A data engineer is designing a data lake on Amazon S3. The data includes personally identifiable information (PII) that must be encrypted at rest. Which combination of actions meets the encryption requirement with the least operational overhead?

Question 361 (easy, multiple choice)

A company needs to migrate an on-premises 10 TB PostgreSQL database to Amazon RDS for PostgreSQL with minimal downtime. Which AWS service should be used for the migration?

Question 362 (hard, multiple choice)

A company is using Amazon Redshift for analytics. The cluster has 20 nodes and the data is evenly distributed. Query performance has degraded over time. The data engineer suspects that table maintenance is needed. Which set of operations should be performed to improve query performance?

Question 363 (medium, multiple choice)

A data engineer is setting up an Amazon S3 lifecycle policy to transition objects to S3 Glacier after 90 days and delete after 365 days. The objects are stored in the S3 Standard storage class. Which lifecycle rule configuration meets the requirements?
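
The requirement above can be written down as a lifecycle configuration document. The sketch below builds one as a Python dictionary in the shape that boto3's `put_bucket_lifecycle_configuration` accepts for its `LifecycleConfiguration` argument. The rule ID and the empty prefix filter (meaning the rule applies to every object) are assumptions made for illustration, and nothing is sent to AWS.

```python
import json


def build_lifecycle_configuration(transition_days: int, expiration_days: int) -> dict:
    """Build a one-rule lifecycle document: transition to Glacier, then expire."""
    if expiration_days <= transition_days:
        raise ValueError("expiration must come after the transition")
    return {
        "Rules": [
            {
                "ID": "archive-then-expire",   # hypothetical rule name
                "Filter": {"Prefix": ""},      # assumption: apply to all objects
                "Status": "Enabled",
                "Transitions": [
                    {"Days": transition_days, "StorageClass": "GLACIER"}
                ],
                "Expiration": {"Days": expiration_days},
            }
        ]
    }


# Values from the scenario: transition after 90 days, delete after 365 days.
config = build_lifecycle_configuration(90, 365)
print(json.dumps(config, indent=2))
```

With credentials configured, a dictionary like this could be passed to an S3 client as `LifecycleConfiguration=config` together with the target `Bucket` name. Both day counts are measured from object creation, so the expiration value is 365 rather than 275.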

Question 364 (easy, multiple choice)

A company needs to store JSON documents that are accessed by a key-value pattern. The data is 500 GB and requires single-digit millisecond latency. Which AWS database is most suitable?

Question 365 (hard, multiple choice)

A company is using Amazon RDS for SQL Server with Multi-AZ. The database has a 500 GB data file and 100 GB log file. The application experiences high latency during peak hours. Monitoring shows high WriteIOPS on the primary. Which change will reduce latency without losing the ability to failover?

Question 366 (medium, multi select)

Which TWO actions can reduce the cost of an Amazon S3 bucket that stores infrequently accessed data? (Choose 2.)

Question 367 (hard, multi select)

Which THREE steps are recommended for migrating an on-premises Oracle database to Amazon RDS for Oracle with minimal downtime? (Choose 3.)

Question 368 (easy, multi select)

Which TWO are valid Amazon Redshift distribution styles? (Choose 2.)

Question 369 (medium, multiple choice)

A company is running a production Amazon RDS for MySQL Multi-AZ DB instance. The database experiences a sudden spike in read requests, causing performance degradation. The company needs to improve read scalability with minimal application changes. Which solution should the data engineer recommend?

Question 370 (easy, multiple choice)

A data engineer needs to store semi-structured JSON data that is accessed infrequently but must be retrievable within 5 minutes. The data is immutable once stored. Which storage solution is MOST cost-effective?

Question 371 (hard, multiple choice)

A company uses Amazon Redshift for its data warehouse. The data engineer notices that the most frequently accessed table is sorted by date, but queries often filter by customer_id. The table has 500 million rows and uses AUTO distribution style. What change would MOST improve query performance?

Question 372 (medium, multiple choice)

A data engineer needs to migrate an on-premises Apache Hadoop cluster to AWS. The cluster stores data in HDFS and runs MapReduce jobs. The company wants to minimize operational overhead and leverage serverless technologies where possible. Which AWS service should the data engineer use to replace HDFS storage?

Question 373 (hard, multiple choice)

A company uses Amazon DynamoDB to store session data for a web application. The application experiences occasional spikes in traffic, causing throttling on the table. The data engineer needs to implement a solution that handles traffic spikes without manual intervention and minimizes cost. What should the data engineer do?

Question 374 (easy, multiple choice)

A data engineer is designing a data lake on AWS using Amazon S3. The data consists of CSV files generated by IoT devices. The data is accessed by multiple analytics jobs, and the engineer needs to ensure that new files are immediately visible to all consumers after writing. What S3 consistency model applies?

Question 375 (medium, multiple choice)

A company has an Amazon RDS for PostgreSQL DB instance with a large table that is frequently updated. The data engineer needs to reduce storage costs by archiving old records that are no longer accessed. The archived records must be retained for 7 years due to compliance requirements. Which solution is MOST cost-effective?

Question 376 (hard, multiple choice)

A data engineer is designing a multi-region disaster recovery plan for an Amazon DynamoDB table. The table stores critical user profile data and must have a Recovery Point Objective (RPO) of less than 1 minute and a Recovery Time Objective (RTO) of less than 5 minutes. Which solution meets these requirements?

Question 377 (easy, multiple choice)

A company uses Amazon S3 to store customer documents. The data engineer needs to ensure that all objects uploaded to a specific S3 bucket are automatically encrypted with a customer-managed AWS KMS key. What should the data engineer do?

Question 378 (medium, multi select)

A data engineer is designing a data pipeline that ingests streaming data from IoT devices into Amazon S3 using Amazon Kinesis Data Firehose. The data must be transformed from JSON to Parquet format before storage. Which TWO actions should the data engineer take to achieve this?

Question 379 (hard, multi select)

A company stores sensitive financial data in an Amazon Redshift cluster. The data engineer must ensure that all queries are logged for audit purposes and that the logs are stored in Amazon S3 with server-side encryption. Which THREE steps should the data engineer take to meet these requirements?

Question 380 (medium, multi select)

A data engineer is optimizing an Amazon RDS for MySQL database that experiences high write throughput. The engineer wants to improve write performance and reduce latency. Which TWO database-level configuration changes can help achieve this?

Question 381 (medium, multiple choice)

A company is using Amazon RDS for MySQL with Multi-AZ deployment. The primary DB instance experiences a hardware failure, causing automatic failover to the standby. After the failover, the application reports that the database endpoint is unreachable for about 60 seconds. What is the MOST likely cause?

Question 382 (hard, multiple choice)

A data engineer is designing a data lake on Amazon S3. The data is ingested from multiple sources in Parquet format, partitioned by date. The engineer needs to ensure that queries using Amazon Athena are cost-effective and perform well. Which approach should the engineer take?

Question 383 (easy, multiple choice)

A company stores critical financial data in Amazon DynamoDB. To meet compliance requirements, the data must be encrypted at rest with a customer-managed key. Which solution should the data engineer implement?

Question 384 (medium, multiple choice)

A data engineer is migrating an on-premises PostgreSQL database to Amazon RDS for PostgreSQL. The database is 2 TB in size and has a tight migration window. Which migration approach minimizes downtime?

Question 385 (hard, multiple choice)

A company uses Amazon Redshift for analytics. The data engineer notices that queries are slow and the system is experiencing high disk usage. The engineer suspects that the distribution style is suboptimal. Which action should the engineer take to improve query performance?

Question 386 (easy, multiple choice)

A data engineer needs to store semi-structured JSON data from IoT devices. The data is written frequently and read occasionally. Which AWS service is MOST cost-effective for this use case?

Question 387 (hard, multiple choice)

A company is using Amazon S3 to store sensitive customer data. The security team requires that all data be encrypted in transit and at rest. Additionally, they want to prevent any accidental public access. Which combination of actions should the data engineer take?

Question 388 (medium, multiple choice)

A data engineer is designing a data store for a real-time leaderboard application that requires sub-millisecond read and write latency. The leaderboard stores scores for millions of users and needs to be sorted by score. Which AWS service should the engineer use?

Question 389 (easy, multiple choice)

A data engineer needs to store archival data that is rarely accessed but must be retained for 7 years. The data should be retrievable within 12 hours. Which Amazon S3 storage class is MOST cost-effective?

Question 390 (medium, multi select)

Which TWO actions can help improve the read performance of an Amazon DynamoDB table that is experiencing throttling? (Choose two.)

Question 391 (hard, multi select)

Which THREE factors should a data engineer consider when choosing between Amazon RDS and Amazon DynamoDB for a new application? (Choose three.)

Question 392 (easy, multi select)

Which TWO methods can be used to enforce least-privilege access to an Amazon S3 bucket? (Choose two.)

Question 393 (hard, multiple choice)

Refer to the exhibit. A data engineer applies this bucket policy to an S3 bucket. A user within the 10.0.0.0/24 IP range attempts to upload an object to the bucket using an HTTP (non-HTTPS) request. What is the outcome?

Exhibit

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject"
      ],
      "Resource": "arn:aws:s3:::example-bucket/*",
      "Condition": {
        "IpAddress": {
          "aws:SourceIp": "10.0.0.0/24"
        }
      }
    },
    {
      "Effect": "Deny",
      "Action": "s3:*",
      "Resource": "arn:aws:s3:::example-bucket/*",
      "Condition": {
        "Bool": {
          "aws:SecureTransport": "false"
        }
      }
    }
  ]
}
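
To reason about how the two statements in this exhibit interact, the sketch below runs a heavily simplified policy evaluation in Python. It is not AWS's evaluation engine: it looks at this one policy only, skips resource and principal matching, supports just the `IpAddress` and `Bool` operators, and matches action names case-sensitively. The helper names and the sample address `10.0.0.25` are hypothetical. The one rule it does model is that an explicit `Deny` overrides any `Allow`.

```python
import fnmatch
import ipaddress

# The two statements from the exhibit, reduced to the fields the sketch uses.
POLICY = {
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Condition": {"IpAddress": {"aws:SourceIp": "10.0.0.0/24"}},
        },
        {
            "Effect": "Deny",
            "Action": "s3:*",
            "Condition": {"Bool": {"aws:SecureTransport": "false"}},
        },
    ]
}


def action_matches(patterns, action):
    """Return True if the action matches any pattern (a string or a list)."""
    if isinstance(patterns, str):
        patterns = [patterns]
    return any(fnmatch.fnmatchcase(action, pattern) for pattern in patterns)


def condition_matches(condition, context):
    """Return True only if every condition key is satisfied by the request context."""
    for operator, pairs in condition.items():
        for key, expected in pairs.items():
            actual = context[key]
            if operator == "IpAddress":
                ok = ipaddress.ip_address(actual) in ipaddress.ip_network(expected)
            elif operator == "Bool":
                ok = str(actual).lower() == expected.lower()
            else:
                raise ValueError(f"unsupported operator: {operator}")
            if not ok:
                return False
    return True


def evaluate(policy, action, context):
    """Collect matching statements; an explicit Deny wins over any Allow."""
    matched = [
        statement["Effect"]
        for statement in policy["Statement"]
        if action_matches(statement["Action"], action)
        and condition_matches(statement.get("Condition", {}), context)
    ]
    if "Deny" in matched:
        return "Deny"
    return "Allow" if "Allow" in matched else "ImplicitDeny"


http_upload = {"aws:SourceIp": "10.0.0.25", "aws:SecureTransport": False}
https_upload = {"aws:SourceIp": "10.0.0.25", "aws:SecureTransport": True}
print("HTTP upload: ", evaluate(POLICY, "s3:PutObject", http_upload))
print("HTTPS upload:", evaluate(POLICY, "s3:PutObject", https_upload))
```

In this model, the same upload from the same address flips between the two results purely on the value of `aws:SecureTransport`, which is the interaction the question is probing.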

Question 394 (medium, multiple choice)

Refer to the exhibit. A data engineer runs the above CLI command and sees the output. The security team requires that the RDS instance not be accessible from the internet. Which change should the engineer make?

Question 395 (easy, multiple choice)

Refer to the exhibit. A data engineer creates an Amazon Redshift table with the above DDL. The engineer runs a query to find all orders for a specific customer within a date range. Which statement about query performance is correct?

Exhibit

CREATE TABLE orders (
    order_id INT PRIMARY KEY,
    customer_id INT,
    order_date DATE,
    total_amount DECIMAL(10,2)
) DISTSTYLE KEY DISTKEY (customer_id) SORTKEY (order_date);

Question 396 (easy, multiple choice)

A company uses an Amazon RDS for MySQL DB instance with Multi-AZ deployment. The primary DB instance fails unexpectedly. What happens to the database endpoint?

Question 397 (medium, multiple choice)

A data engineer is designing a data lake on Amazon S3. Data is ingested from multiple sources in JSON format. The engineer needs to optimize query performance for Amazon Athena while minimizing storage costs. Which storage strategy should the engineer use?

Question 398 (hard, multiple choice)

A company uses Amazon DynamoDB with on-demand capacity for a gaming application that experiences unpredictable traffic spikes. The application reads the same set of 'hot' items frequently. Users report high latency during peak hours. Which action would MOST effectively reduce read latency for the hot items?

Question 399 (easy, multiple choice)

A data engineer is migrating an on-premises PostgreSQL database to Amazon RDS for PostgreSQL. The database is 2 TB in size. The engineer needs to minimize downtime. Which AWS service should be used for the migration?

Question 400 (medium, multiple choice)

A company stores sensitive customer data in an Amazon S3 bucket. The security team requires that all data be encrypted at rest using a key that is automatically rotated every year. Which encryption solution should the data engineer use?

Question 401 (hard, multiple choice)

A company runs a production Amazon Redshift cluster with a 5-node ra3.4xlarge configuration. The data engineer observes that write operations are failing with 'Disk Full' errors on some nodes. The cluster has not reached its total capacity. What should the engineer do to resolve this issue?

Question 402 (easy, multiple choice)

A data engineer needs to store semi-structured JSON data that is accessed infrequently but requires immediate retrieval when needed. The data must be durable and cost-effective. Which Amazon S3 storage class should be used?

Question 403 (medium, multiple choice)

A company uses Amazon DynamoDB with global tables in three AWS Regions. The data engineer needs to ensure that writes to the table in us-east-1 are replicated to other regions with minimal latency. Which DynamoDB feature should be used?

Question 404 (medium, multi select)

Which TWO options are valid ways to reduce storage costs for an Amazon S3 data lake that stores historical data rarely accessed after 30 days? (Choose TWO.)

Question 405 (hard, multi select)

Which THREE factors should a data engineer consider when choosing between Amazon RDS and Amazon DynamoDB for a new application? (Choose THREE.)

Question 406 (easy, multi select)

Which TWO AWS services can be used to automatically back up an Amazon RDS for SQL Server DB instance? (Choose TWO.)

Question 407 (hard, multiple choice)

A company runs a streaming application on Amazon EC2 instances that writes data to an Amazon DynamoDB table (us-east-1). The data is later consumed by a reporting job that runs every hour. Recently, the reporting job has been failing with ProvisionedThroughputExceededException errors during peak hours. The DynamoDB table uses provisioned capacity with 1000 read capacity units (RCU) and 500 write capacity units (WCU). The reporting job performs scans and reads using eventually consistent reads. The application's write traffic is steady, but the reporting job's reads spike at the top of the hour. The data engineer needs to resolve the throughput exceptions without affecting the application's writes. Which solution should the data engineer implement?

Question 408 (medium, multiple choice)

A data engineer is managing an Amazon Redshift cluster used for analytics. The cluster has a single node of type dc2.large. The engineer notices that queries are slowing down as data volume grows. The cluster's disk space is at 70% usage. The engineer needs to improve query performance and accommodate future growth. The budget allows for moderate cost increase. Which action should the engineer take?

Question 409 (medium, multiple choice)

A company uses Amazon S3 to store images that are accessed by a web application. The application generates presigned URLs for users to download images. Recently, the application has been experiencing errors when generating presigned URLs for objects that were uploaded using multipart upload. The errors indicate that the presigned URL does not work. The data engineer needs to ensure that presigned URLs work for all objects, including those uploaded via multipart upload. What should the data engineer do?

Question 410 (medium, multiple choice)

A company runs a production Amazon RDS for MySQL database with Multi-AZ deployment. The database experiences high read latency during peak hours. The company wants to improve read performance with minimal application changes. Which solution should a data engineer recommend?

Question 411 (easy, multiple choice)

A data engineer needs to store semi-structured data (JSON logs) from thousands of IoT devices. The data must be schema-less, highly scalable, and support low-latency queries by device ID and timestamp. Which AWS service should the engineer use?

Question 412 (hard, multiple choice)

Refer to the exhibit. A data engineer is troubleshooting an IAM policy attached to a user. The user reports that they cannot upload objects to the S3 bucket 'data-lake-bucket' unless they explicitly specify the 'x-amz-server-side-encryption' header with value 'AES256'. The engineer wants to modify the policy to allow uploads without requiring encryption headers, but still enforce encryption on the bucket itself. Which change should the engineer make?

Exhibit

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:PutObject",
        "s3:GetObject",
        "s3:DeleteObject"
      ],
      "Resource": [
        "arn:aws:s3:::data-lake-bucket/*",
        "arn:aws:s3:::data-lake-bucket"
      ],
      "Condition": {
        "StringEquals": {
          "s3:x-amz-server-side-encryption": "AES256"
        }
      }
    },
    {
      "Effect": "Deny",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::data-lake-bucket/*",
      "Condition": {
        "StringNotEquals": {
          "s3:x-amz-server-side-encryption": "AES256"
        }
      }
    }
  ]
}

Question 413 (medium, multiple choice)

A company stores sensitive data in an Amazon S3 bucket. A compliance requirement mandates that all data must be encrypted at rest with a key that is automatically rotated every year. The company also needs to maintain an audit trail of who used the key. Which solution meets these requirements?

Question 414 (hard, multiple choice)

A company runs an Amazon Redshift cluster with 10 RA3 nodes. The data warehouse stores 50 TB of data. The company notices that queries are slow and the cluster's storage utilization is high. The data engineer needs to improve query performance and reduce storage costs without changing the cluster's node count. Which action should the engineer take?

Question 415 (easy, multiple choice)

A data engineer is designing a data lake on Amazon S3. The data lake will store raw data, transformed data, and curated datasets. The engineer needs to ensure that raw data is immutable (never overwritten or deleted) and that only authorized users can access the transformed data. Which combination of S3 features should the engineer use?

Question 416 (medium, multiple choice)

A company uses Amazon DynamoDB to store session data for a web application. The application experiences throttling errors during peak traffic. The data engineer observes that the table's read capacity is consistently at 100% and the write capacity is at 20%. The engineer needs to resolve the throttling with minimal cost. Which solution should the engineer implement?

Question 417 (hard, multiple choice)

A company runs an Amazon RDS for PostgreSQL instance that stores financial data. The company requires point-in-time recovery (PITR) with a retention period of 35 days. Additionally, the company needs to create a new database from a specific snapshot every night for testing. Which combination of actions should the data engineer take to meet these requirements?

Question 418 (easy, multiple choice)

A data engineer needs to store log files from multiple applications in a central S3 bucket. The logs must be stored cost-effectively for long-term retention (7 years). The logs are accessed infrequently after the first 30 days. Which storage class should the engineer use for objects older than 30 days?

Question 419 (medium, multi select)

A data engineer is designing a data store for a real-time analytics application that requires sub-millisecond read and write latency for time-series data. The data volume is expected to grow to hundreds of terabytes. Which TWO AWS services should the engineer consider? (Choose TWO.)

Question 420 (hard, multi select)

A company has an S3 bucket with versioning enabled that stores critical data. The security team requires that once an object is deleted, it cannot be recovered by anyone, including the root user. Additionally, the company wants to ensure that objects cannot be overwritten for a specified period. Which THREE actions should the data engineer take to meet these requirements? (Choose THREE.)

Question 421 (easy, multi select)

A data engineer needs to migrate an on-premises MongoDB database to AWS. The migration must have minimal downtime and support automatic scaling. Which TWO AWS services should the engineer use for the target data store? (Choose TWO.)

Question 422 (hard, multiple choice)

Refer to the exhibit. A data engineer is analyzing a query performance issue on an Amazon Redshift table. The table 'sales' has 100 million rows. The query is performing a full table scan. Which optimization should the engineer apply to improve query performance?

Exhibit

CREATE TABLE sales (
    id INT NOT NULL,
    product_id INT NOT NULL,
    sale_date DATE NOT NULL,
    amount DECIMAL(10,2),
    region VARCHAR(20)
) DISTKEY(product_id) SORTKEY(sale_date);

-- Query:
SELECT region, SUM(amount) 
FROM sales 
WHERE sale_date BETWEEN '2023-01-01' AND '2023-12-31' 
GROUP BY region;

Question 423 (medium, multiple choice)

A media company stores video files in an S3 bucket. The files are processed by a fleet of EC2 instances that read the files, add watermarks, and write the output back to the same bucket. Recently, the processing jobs have been failing with '500 Internal Server Error' and '503 Slow Down' errors. The data engineer checks the S3 bucket metrics and sees that the PUT/GET request rate is consistently above 5,500 requests per second for a single prefix. The engineer needs to resolve the errors with minimal changes to the application code. Which course of action should the engineer take?

Question 424 (hard, multiple choice)

A financial services company has an Amazon DynamoDB table named 'Transactions' with provisioned read capacity of 10,000 RCU and write capacity of 5,000 WCU. The table stores transaction records for the past 90 days. The application performs point reads by transaction ID (partition key) and range queries by customer ID and timestamp (GSI). Recently, the company started a new marketing campaign, causing a sudden spike in write traffic. The write capacity is now at 4,500 WCU, and the application is experiencing occasional throttling on writes. The data engineer needs to ensure that writes are not throttled during future campaigns, while keeping costs low. The table currently has auto scaling enabled with a maximum capacity of 10,000 WCU. Which solution should the engineer implement?

Question 425 (medium, multiple choice)

A company runs an Amazon RDS for PostgreSQL database for its e-commerce platform. The application team reports that write-intensive workloads are causing high latency and the database is experiencing storage bottlenecks. The database currently uses General Purpose SSD (gp2) storage. Which action would be MOST effective in improving write performance without changing the database instance class?

Question 426 (hard, multiple choice)

A data engineer is designing a data lake on Amazon S3 for a healthcare organization that must comply with HIPAA regulations. The data includes protected health information (PHI) and must be encrypted at rest. The organization requires that all encryption keys be managed by AWS and rotated automatically every year. Additionally, the data must be replicated to another AWS Region for disaster recovery. Which combination of S3 features should the engineer use to meet these requirements?

Question 427 (easy, multiple choice)

A company stores its application logs in an Amazon S3 bucket. The logs are accessed frequently for the first 30 days, after which they are rarely accessed but must be retained for 7 years for compliance. The company wants to optimize storage costs while maintaining immediate retrieval availability for the first 30 days and the ability to retrieve logs within 12 hours after that. Which lifecycle policy should the data engineer configure?

Question 428 (medium, multiple choice)

A data engineering team is using Amazon DynamoDB to store user session data for a web application. The application experiences sudden spikes in traffic, causing throttling on the DynamoDB table. The team wants to minimize throttling without over-provisioning read/write capacity. Which solution should the team implement?

Question 429 (hard, multiple choice)

A company uses Amazon S3 to store large datasets for analytics. Each dataset is stored in a separate prefix and consists of thousands of small objects (1-10 KB each). The company notices that listing objects in a prefix takes several seconds, slowing down data processing. Which solution would MOST improve listing performance?

Question 430 (easy, multiple choice)

A company is migrating an on-premises MySQL database to Amazon RDS for MySQL. The database is 500 GB in size. The migration must have minimal downtime and must be completed within a week. Which AWS service should the data engineer use to perform the migration?

Question 431 (medium, multiple choice)

A data engineer needs to store clickstream data from a web application in Amazon S3. Each event is about 5 KB, and the application generates 1 million events per hour. The data is used for real-time analytics and also for batch processing. The engineer wants to minimize storage costs while ensuring that data is available for real-time queries as soon as it is written. Which storage class should the engineer use for the S3 bucket?
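
To put the scenario's numbers in perspective, the ingest volume can be worked out directly from the stated event size and rate. This is a rough sketch only: it assumes decimal units (1 GB = 1,000,000 KB) and a constant event rate, neither of which the question specifies.

```python
EVENT_SIZE_KB = 5            # from the question: each event is about 5 KB
EVENTS_PER_HOUR = 1_000_000  # from the question: 1 million events per hour
KB_PER_GB = 1_000_000        # assumption: decimal units


def ingest_volume_gb(hours: int) -> float:
    """Approximate data volume in GB written over the given number of hours."""
    return EVENT_SIZE_KB * EVENTS_PER_HOUR * hours / KB_PER_GB


hourly_gb = ingest_volume_gb(1)         # 5.0 GB per hour
daily_gb = ingest_volume_gb(24)         # 120.0 GB per day
monthly_gb = ingest_volume_gb(24 * 30)  # 3600.0 GB over 30 days

print(hourly_gb, daily_gb, monthly_gb)
```

At roughly 5 GB per hour, the object count (one million small events per hour, if each event were written as its own object) matters at least as much as the byte volume when comparing storage classes.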

Question 432 (hard, multiple choice)

A company uses Amazon DynamoDB to store metadata for a document management system. The table has a partition key of document_id and a sort key of version. The application frequently queries for the latest version of a document by document_id. The data engineer notices that these queries are consuming a lot of read capacity. How can the engineer optimize the read performance and reduce read capacity consumption?

Question 433 (medium, multi select)

A data engineer is designing a data lake on Amazon S3 that will store sensitive financial data. The engineer needs to implement encryption at rest and ensure that only authorized users can access the data. Which TWO actions should the engineer take to meet these requirements? (Choose TWO.)

Question 434 (hard, multi select)

A company runs a production Amazon RDS for PostgreSQL database. The database is experiencing performance degradation due to a high number of concurrent read queries. The data engineer needs to improve read performance without significantly increasing costs. Which TWO actions should the engineer take? (Choose TWO.)

Question 435 (medium, multi select)

A data engineer is using Amazon DynamoDB to store session data for a web application. The engineer wants to ensure that all data is encrypted at rest using an AWS managed key. Which THREE steps should the engineer take to achieve this? (Choose THREE.)

Question 436 (hard, multi select)

A company uses Amazon S3 to store log files that are generated every hour. Each log file is about 1 GB. The logs must be stored for 5 years for compliance. The data engineer wants to minimize storage costs while ensuring that logs can be retrieved within 24 hours for the first year, and within 48 hours thereafter. Which THREE lifecycle actions should the engineer configure? (Choose THREE.)

Question 437 (hard, multiple choice)

A data engineer at a media company is managing an Amazon RDS for MySQL database that stores user profiles and preferences. The database has been running on a db.r5.large instance with 500 GB of General Purpose SSD (gp2) storage. Recently, the application team has noticed increased query latency during peak hours. Amazon CloudWatch metrics show that the ReadIOPS metric is consistently peaking at 3,000 IOPS, which is the burst ceiling of the gp2 volume (1,500 IOPS baseline for 500 GB, with bursts up to 3,000 IOPS for short periods). The database is not CPU-bound, and memory utilization is moderate. The data engineer needs to resolve the I/O bottleneck with minimal cost increase. The company is open to changing the storage type or instance class, but wants to avoid over-provisioning. What should the data engineer do?
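
The baseline and burst figures quoted in the scenario match the general sizing rule for a single gp2 volume: 3 IOPS per GiB, with a floor of 100 IOPS, a ceiling of 16,000 IOPS, and a burst of up to 3,000 IOPS for volumes whose baseline is below that level. The sketch below applies that rule. It treats the question's 500 GB as 500 GiB, and the function name is illustrative; RDS may lay out storage differently from a single volume, so real instances can show different limits.

```python
def gp2_iops(size_gib: int) -> tuple[int, int]:
    """Return (baseline_iops, peak_iops) for a single gp2 volume.

    Baseline is 3 IOPS per GiB, clamped to the range 100..16,000.
    Volumes with a baseline under 3,000 IOPS can burst to 3,000
    while they still have I/O credits.
    """
    baseline = min(max(3 * size_gib, 100), 16_000)
    peak = max(baseline, 3_000)
    return baseline, peak


baseline, peak = gp2_iops(500)
print(baseline, peak)  # 1500 3000
```

Under this rule a 1,000 GiB volume would already have a 3,000 IOPS baseline, which is one way to see how volume size and sustained IOPS are coupled on gp2.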

Question 438 (easy, multiple choice)

A startup is building a ride-sharing application that uses Amazon DynamoDB to store trip data. The table has a partition key of 'trip_id' and a sort key of 'status'. The application writes a new item when a trip starts and updates the status when the trip ends. The development team is experiencing high write latency during peak hours. The table is provisioned with 5,000 write capacity units (WCU) and 5,000 read capacity units (RCU). CloudWatch metrics show that WriteThrottleEvents are occurring frequently, but the consumed write capacity is never above 4,000 WCU. The team suspects that the issue is due to hot partitions. How should the data engineer resolve this issue?
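
The symptom described here, throttling while table-level consumption stays under the provisioned total, is what you would expect when a single partition hits its own ceiling. DynamoDB limits each partition to about 1,000 write capacity units per second regardless of the table total. The sketch below illustrates this with made-up traffic figures. The key names and per-key rates are hypothetical, and it simplifies by assuming each listed key sits on its own partition.

```python
PARTITION_WCU_LIMIT = 1_000  # per-partition write ceiling in DynamoDB
TABLE_WCU = 5_000            # provisioned table capacity from the question


def find_hot_keys(wcu_by_key: dict[str, float],
                  limit: float = PARTITION_WCU_LIMIT) -> list[str]:
    """Return the partition keys whose write rate exceeds the per-partition limit."""
    return sorted(key for key, wcu in wcu_by_key.items() if wcu > limit)


# Hypothetical per-key write rates (WCU per second) during a peak hour.
traffic = {"trip-a": 1_500, "trip-b": 900, "trip-c": 800, "trip-d": 800}

total_wcu = sum(traffic.values())  # 4000, below the 5,000 provisioned
hot_keys = find_hot_keys(traffic)  # only "trip-a" exceeds 1,000

print(total_wcu, total_wcu < TABLE_WCU, hot_keys)
```

In this toy data the table as a whole has spare capacity, yet writes to `trip-a` would still be throttled, which is why the table-level metric alone does not rule out a hot partition.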

Question 439 (medium, multiple choice)

A company uses Amazon S3 to store sensitive documents. The security team has mandated that all objects must be encrypted at rest using server-side encryption with AWS KMS (SSE-KMS). Additionally, the company wants to ensure that any attempt to upload an unencrypted object is denied. A data engineer has configured a bucket policy that denies PutObject if the encryption header does not include x-amz-server-side-encryption: aws:kms. However, the engineer notices that some objects are still being stored without encryption. Upon investigation, the engineer suspects that the policy is not being evaluated correctly. What should the engineer do to ensure that all objects are encrypted with SSE-KMS?

Question 440 (medium, multi select)

A company is designing a data lake on AWS using Amazon S3. The data includes sensitive customer information that must be encrypted at rest. The company requires that encryption keys be managed by AWS and rotated automatically every year. Which TWO options meet these requirements? (Choose TWO.)

Question 441 (hard, multi select)

A company uses Amazon Redshift for its data warehouse. The cluster has multiple nodes and is configured with automated snapshots. The company needs to ensure high availability and disaster recovery across AWS Regions. Which THREE actions should the company take to meet these requirements? (Choose THREE.)

Question 442 (easy, multi select)

A data engineer needs to store streaming data from multiple sources into Amazon S3. The data should be organized by source, date, and hour. The engineer wants to minimize processing overhead. Which THREE S3 features should the engineer use to achieve this? (Choose THREE.)
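
One common way to get the "source, date, and hour" organization the scenario asks for is to encode those fields in the object key itself. The sketch below builds such a key. The Hive-style `name=value` layout, the function name, and the sample values are illustrative assumptions, not something the question prescribes.

```python
from datetime import datetime, timezone


def build_object_key(source: str, event_time: datetime, filename: str) -> str:
    """Build an S3 object key partitioned by source, UTC date, and UTC hour."""
    ts = event_time.astimezone(timezone.utc)  # normalize so partitions are in UTC
    return f"source={source}/date={ts:%Y-%m-%d}/hour={ts:%H}/{filename}"


key = build_object_key(
    "web",                                                # hypothetical source name
    datetime(2024, 3, 5, 14, 30, tzinfo=timezone.utc),    # sample event time
    "events-0001.json",                                   # hypothetical file name
)
print(key)  # source=web/date=2024-03-05/hour=14/events-0001.json
```

Because the layout lives entirely in the key, a consumer can list or query a single source and hour by prefix without any extra processing step to reorganize the data.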

Question 443 (medium, multi select)

A company uses Amazon DynamoDB as a key-value store for a high-traffic application. The table has a provisioned read capacity of 10,000 RCUs and write capacity of 5,000 WCUs. The application experiences occasional throttling during peak hours. Which TWO actions can reduce throttling without changing the application code? (Choose TWO.)

Question 444 (hard, multi select)

A company runs an Amazon RDS for PostgreSQL instance for an OLTP application. The database size is 500 GB. The company wants to minimize downtime during backups and ensure point-in-time recovery (PITR) for the last 7 days. Which TWO features should the company use? (Choose TWO.)

Question 445 (medium, multi select)

A company uses AWS Glue to catalog data stored in Amazon S3. The data is in Parquet format and partitioned by date. The company wants to improve query performance in Amazon Athena and reduce costs. Which THREE actions should the company take? (Choose THREE.)

Question 446 (easy, multi select)

A company is migrating a MySQL database to Amazon RDS for MySQL. The database is 2 TB in size and the company can only afford minimal downtime. The migration must be secure and use AWS DMS. Which TWO configuration steps are required? (Choose TWO.)

Question 447 (hard, multiple choice)

A large e-commerce company uses Amazon DynamoDB to store shopping cart data. The table has a partition key of 'user_id' and a sort key of 'item_id'. The application performs frequent updates to the 'quantity' attribute for items in a user's cart. Recently, the operations team noticed that write requests are being throttled during peak shopping hours. The table is provisioned with 10,000 write capacity units (WCUs) and uses DynamoDB Accelerator (DAX) for read caching. The data engineer suspects that the throttling is due to hot partitions. The application uses a single AWS SDK client configured with retries. After reviewing the Amazon CloudWatch metrics, the engineer sees that the WriteThrottleEvents metric spikes for a few partition keys. The table has a high number of partitions. What should the data engineer do to resolve the throttling issue with minimal application changes?

Question 448 (medium, multiple choice)

A media company stores video files in Amazon S3 buckets organized by content type. The company has a requirement to automatically archive files that are older than 90 days to Amazon S3 Glacier Deep Archive to reduce costs. However, the company wants to retain the ability to restore files within 12 hours if needed. The data engineer creates an S3 Lifecycle policy to transition objects to Glacier Deep Archive after 90 days. After deploying the policy, the engineer notices that the storage costs have not decreased significantly. On reviewing the bucket metrics, the engineer sees that many objects are being deleted directly by users before the lifecycle policy takes effect. The company needs to enforce the lifecycle policy and prevent premature deletions. What should the data engineer do to enforce the lifecycle policy?

Question 449 (easy, multiple choice)

A startup is building a real-time analytics application using Amazon Kinesis Data Streams and Amazon Kinesis Data Analytics. The application processes clickstream data from a website. The data is also stored in Amazon S3 for historical analysis. The company uses an S3 bucket with a lifecycle policy that transitions objects to Amazon S3 Glacier Deep Archive after 30 days. The data engineering team has configured a Kinesis Data Firehose delivery stream to write data to the S3 bucket. The team notices that the data in S3 is not being transitioned to Glacier Deep Archive after 30 days. The lifecycle policy is correctly configured and has been verified. What is the most likely cause of this issue?

Question 450 (hard, multiple choice)

A healthcare company uses Amazon RDS for PostgreSQL to store patient records. The database is 1 TB in size and runs on a db.r5.large instance. The company requires that the database be highly available and have automated backups with point-in-time recovery (PITR) for the last 35 days. The operations team has configured Multi-AZ deployment and automated backups with a 35-day retention period. During a recent disaster simulation, the team attempted to restore the database to a point in time from 30 days ago. The restore operation failed because the backup was not available. On investigation, the team found that the automated backups were being deleted before the retention period ended. The team also noticed that heavy write activity on the database generates a large volume of transaction logs. What is the most likely cause of the backups being deleted prematurely?

Question 451 (medium, multiple choice)

A financial services company uses Amazon Redshift for its data warehouse. The cluster has two nodes and is used for complex analytical queries. The company recently migrated from a single-node cluster to a two-node cluster to improve performance. After the migration, the data engineer notices that query performance has not improved as expected. Some queries are even slower than before. The engineer checks the workload management (WLM) queue configuration and sees that there is only one queue with a concurrency level of 5. The queries are mostly large scans and aggregations. The cluster's CPU utilization is low, but disk I/O is high. What should the data engineer do to improve query performance?

Question 452 (hard, multiple choice)

A gaming company uses Amazon DynamoDB to store player profiles and game state. The table has a partition key of 'player_id' and no sort key. The table is provisioned with 5,000 RCUs and 5,000 WCUs. The application performs frequent reads and writes to update player scores. Recently, the company introduced a new feature that allows players to form guilds. The guild data is stored in a separate DynamoDB table with a partition key of 'guild_id'. The application often needs to retrieve all members of a guild. The data engineer is encountering high latency when querying the guild table because the guilds can have up to 100 members. The engineer wants to reduce latency without changing the application architecture. What should the data engineer do?

Question 453 (medium, multiple choice)

A company uses Amazon EMR to run Spark jobs on a cluster of 20 nodes. The cluster stores intermediate data on Amazon S3 using EMRFS. The company's data engineering team notices that the Spark jobs are running slower than expected. Upon investigating, they find that the cluster is experiencing high network I/O and that the S3 storage costs have increased significantly. The team suspects that the Spark jobs are writing too much intermediate data to S3. The jobs are performing many shuffle operations. The team wants to optimize the job performance and reduce costs without modifying the Spark application code. What should the data engineer do?

Question 454 (easy, multiple choice)

A retail company stores customer transaction data in an Amazon S3 bucket. The data is encrypted using server-side encryption with AWS KMS (SSE-KMS). The company uses an IAM role to allow Amazon Athena to read the data. The data engineer creates a new Athena workgroup and attempts to run a query on the S3 bucket. The query fails with an access denied error. The IAM role has permissions to decrypt with the KMS key and read from the bucket. The engineer checks the S3 bucket policy and finds that it does not explicitly allow access. What is the most likely cause of the failure?

Question 455 (hard, multiple choice)

A company runs a transactional database on Amazon RDS for PostgreSQL with Multi-AZ deployment. The database size is 2 TB and experiences moderate write load. The company recently enabled RDS Performance Insights and noticed a high number of 'TupleLock' wait events during peak hours. The development team reports that a batch update job runs every hour, updating millions of rows in a large table. The job takes longer than expected. The DBA suspects that excessive row-level locking is causing contention. The team wants to minimize lock contention without changing the application code. Which solution should be implemented?

Question 456 (medium, multiple choice)

A data engineering team uses AWS Glue ETL jobs to process data from Amazon S3 and load it into an Amazon Redshift cluster. The cluster has a single node of type dc2.large. The team notices that the ETL jobs are failing intermittently with errors related to disk space. The Redshift cluster shows that the disk is nearly full. The team needs to resolve the disk space issue and ensure the ETL jobs can complete successfully without increasing costs significantly. Which solution should the team implement?