A company uses AWS Glue ETL jobs to process data from multiple sources. The job fails with the error: 'An error occurred while calling o123.pyWriteDynamicFrame. Insufficient memory.' The job runs on a G.1X worker type with 10 workers. What should be changed to resolve this error?
More workers increase parallelism and reduce memory pressure per worker.
Why this answer
The 'Insufficient memory' error in this scenario indicates that the total memory across all executors is too small for the data being processed. Increasing the number of workers from 10 to 20 roughly doubles the total memory and compute capacity available (with 16 GB per G.1X worker, from about 160 GB to about 320 GB), so the job can handle larger datasets without running out of memory. It is the most direct fix for memory exhaustion while staying on the G.1X worker type.
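The arithmetic behind this can be sketched in a few lines of Python. This is a minimal illustration, not a sizing tool: it uses the 16 GB and 32 GB per-worker figures quoted in this explanation, ignores the driver and Spark memory overhead, and the helper names and the job name `example-etl-job` are hypothetical.

```python
# Memory per worker in GB, as quoted in this explanation.
WORKER_MEMORY_GB = {"G.1X": 16, "G.2X": 32}


def total_memory_gb(worker_type: str, number_of_workers: int) -> int:
    """Aggregate memory across all workers (assumption: driver and overhead ignored)."""
    return WORKER_MEMORY_GB[worker_type] * number_of_workers


def run_overrides(job_name: str, worker_type: str, number_of_workers: int) -> dict:
    """Hypothetical helper: keyword arguments for a Glue job run with more workers."""
    return {
        "JobName": job_name,
        "WorkerType": worker_type,
        "NumberOfWorkers": number_of_workers,
    }


before = total_memory_gb("G.1X", 10)  # current configuration
after = total_memory_gb("G.1X", 20)   # configuration after the change
print(f"G.1X x 10: {before} GB, G.1X x 20: {after} GB")

# "example-etl-job" is a placeholder name, not taken from the question.
overrides = run_overrides("example-etl-job", "G.1X", 20)
```

Assuming boto3 is installed and credentials are configured, the resulting dictionary could be passed as `boto3.client("glue").start_job_run(**overrides)`, which accepts `WorkerType` and `NumberOfWorkers` for a single run.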
Exam trap
The trap here is that candidates often confuse 'insufficient memory' with a per-worker memory limit and choose to upgrade the worker type (G.2X), but the error is about total cluster memory, which is more effectively addressed by increasing the number of workers.
How to eliminate wrong answers
Option B is wrong because enabling the Spark UI only provides monitoring and debugging capabilities; it does not allocate additional memory or resolve the underlying memory shortage.
Option C is wrong because changing the worker type to G.2X doubles the memory per worker (from 16 GB to 32 GB) but keeps the worker count at 10. The error here is treated as a shortfall in total memory, which increasing the number of workers (option A) addresses directly without a change of worker type.
Option D is wrong because reducing the number of partitions in the DynamicFrame increases the amount of data in each partition, which can worsen memory pressure on individual executors rather than relieve the overall memory shortage.
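To see why option D moves in the wrong direction, a rough sketch of per-partition size may help. The 200 GB dataset size and the partition counts below are assumed purely for illustration, and the calculation assumes data is spread evenly across partitions, which real jobs rarely achieve.

```python
def mb_per_partition(dataset_gb: float, partitions: int) -> float:
    """Average partition size in MB, assuming an even split across partitions."""
    return dataset_gb * 1024 / partitions


dataset_gb = 200  # assumed dataset size, not from the question

# Halving the partition count doubles the data each task must hold.
sizes = {n: mb_per_partition(dataset_gb, n) for n in (400, 200, 100)}

for partitions, size_mb in sizes.items():
    print(f"{partitions} partitions -> {size_mb:.0f} MB per partition")
```

Fewer partitions means each task processes a larger slice of the data, so the memory needed per executor goes up while the total available memory stays the same.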