A data engineer is troubleshooting an AWS Glue job that reads from Amazon S3 and writes to Amazon Redshift. The job runs successfully but 5% of records are missing after the load. The engineer suspects data consistency issues. Which THREE actions could help diagnose and resolve the problem? (Choose THREE.)
Trap 1: Increase the number of DPUs for the Glue job.
More DPUs improve performance but not data consistency.
Trap 2: Use a staging table in Redshift with a transaction to commit.
Adds complexity; not a direct diagnostic step.
- A
Use the Redshift COPY command with a manifest file to load data.
Manifest file ensures all files are loaded.
- B
Increase the number of DPUs for the Glue job.
Why wrong: More DPUs improve performance but not data consistency.
- C
Enable Glue job bookmarks to track processed files.
Bookmarks prevent reprocessing or missing files.
- D
Use a staging table in Redshift with a transaction to commit.
Why wrong: Adds complexity; not a direct diagnostic step.
- E
Review the job's CloudWatch Logs for any error messages.
Logs may show partial failures.