A company has a centralized logging account and multiple application accounts. All VPC Flow Logs are sent to a central S3 bucket in the logging account. The security team needs to analyze the logs using Amazon Athena. The team must ensure queries are cost-effective and return results quickly for recent logs. Which configuration should be used?
Trap 1: Convert the logs to Parquet format using AWS Glue and store them in…
Parquet lowers the bytes scanned per query, but without date partitions Athena still scans the whole log history to find recent records.
Trap 2: Use S3 lifecycle policies to transition logs to S3 Glacier after 7…
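Athena does not read objects archived in the S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive storage classes, and archiving does nothing to speed up queries on recent logs.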
Trap 3: Use Athena with federated query to scan logs directly from the…
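The logs are already centralized in the logging account, so querying the application accounts adds cross-account complexity and latency without reducing the data scanned.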
- A
Convert the logs to Parquet format using AWS Glue and store them in the same bucket.
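Why wrong: Parquet's columnar layout and compression lower the bytes scanned per query, but without date partitions Athena still scans the whole log history to find recent records. Partitioning by date is the more impactful change for queries on recent logs.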
- B
Use S3 lifecycle policies to transition logs to S3 Glacier after 7 days and query with Athena.
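Why wrong: Athena does not read objects archived in the S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive storage classes, so logs older than 7 days would drop out of query results. Archiving also does nothing to speed up queries on recent logs.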
- C
Partition the S3 bucket by date (e.g., year/month/day) and use Athena partition projection.
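Correct: Athena charges by the amount of data scanned, so date partitions let a query on recent logs read only the matching day prefixes, which cuts both cost and run time. Partition projection computes partition locations from table properties, so new daily partitions do not have to be added to the AWS Glue Data Catalog as logs arrive.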
- D
Use Athena with federated query to scan logs directly from the application accounts.
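Why wrong: The logs are already centralized in the logging account's S3 bucket, so there is nothing to gain by reaching into the application accounts. Athena Federated Query targets data sources outside S3 through Lambda-based connectors, and it adds cross-account complexity and latency without reducing the data scanned.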
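
To make option C concrete, here is a minimal Python sketch of the table properties that turn on partition projection for a single date column, plus the S3 prefixes a query on the last few days would be limited to. The bucket name, the flow-logs/ prefix, the column name day, and both helper functions are assumptions made for illustration. Real VPC Flow Logs paths also include account and region segments, which would need their own projected columns.

```python
from __future__ import annotations

from datetime import date, timedelta

# Hypothetical bucket and prefix; ${day} is the placeholder Athena fills in
# with each projected value of the `day` partition column.
LOCATION_TEMPLATE = "s3://example-central-logs/flow-logs/${day}/"


def projection_properties(start: str) -> dict[str, str]:
    """Build TBLPROPERTIES that enable partition projection on a `day` column."""
    return {
        "projection.enabled": "true",
        "projection.day.type": "date",
        "projection.day.format": "yyyy/MM/dd",
        "projection.day.range": f"{start},NOW",
        "projection.day.interval": "1",
        "projection.day.interval.unit": "DAYS",
        "storage.location.template": LOCATION_TEMPLATE,
    }


def prefixes_for_last_days(today: date, days: int) -> list[str]:
    """List the day prefixes a query filtered to the last `days` days would read."""
    return [
        LOCATION_TEMPLATE.replace(
            "${day}", (today - timedelta(days=i)).strftime("%Y/%m/%d")
        )
        for i in range(days)
    ]


props = projection_properties("2023/01/01")
recent = prefixes_for_last_days(date(2024, 3, 2), 3)
for prefix in recent:
    print(prefix)
```

With a filter on day, a query touches only those few prefixes instead of the full log history, which is where the savings in scanned data come from.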