This chapter covers S3 Server Access Logs and request monitoring, a key topic under the Monitoring domain (Objective 1.2). You will learn how to enable, configure, and analyze these logs to audit access to your S3 buckets. Expect 2-3 exam questions on log delivery, format, and analysis tools like Amazon Athena. Mastery of this topic ensures you can meet compliance and security requirements for S3 data access.
Imagine a secure warehouse with a single loading dock. Every time a package arrives or leaves, a security camera records a detailed log: who delivered it, who signed for it, what truck it came on, and the timestamp down to the second. These logs are stored in a separate locked room (a different S3 bucket) so that even if someone tampers with the warehouse, the logs remain safe. The camera system runs continuously but has a slight delay—logs are written every few minutes, not in real time. If you want to investigate a theft, you review the logs to see which employee accessed which package and when. Similarly, S3 Server Access Logs record every request to an S3 bucket, capturing details like requester, bucket name, request type, HTTP status, and error codes. The logs are delivered to a target bucket (the locked room) with best-effort delivery, typically within a few minutes to a few hours. They are not real-time, so they are used for auditing and analysis, not for immediate alerts. The analogy holds: you cannot prevent the theft by watching the camera, but you can catch the culprit afterward.
What Are S3 Server Access Logs?
S3 Server Access Logs provide detailed records of requests made to an S3 bucket. Each log entry captures information such as the requester (AWS account or IAM user), bucket name, request type (GET, PUT, DELETE, etc.), HTTP status code, error codes, and timestamps. These logs are essential for security audits, compliance (e.g., PCI DSS, HIPAA), and troubleshooting access issues.
How They Work Internally
When you enable server access logging on a source bucket, S3 generates a log record for each request to that bucket and writes the records to a target bucket that you specify. The process is asynchronous and best-effort: S3 collects log records over a period and then delivers them together as a log object. AWS documents that most records are delivered within a few hours, though delivery can be more frequent, so the logs are not real-time. Each log object is a sequence of newline-delimited records, each representing a single request.
Key Components and Defaults
Source Bucket: The bucket being monitored.
Target Bucket: A separate bucket where logs are delivered. It must be in the same AWS region as the source bucket.
Target Prefix: An optional prefix (folder) to organize log objects within the target bucket.
Log Object Key Format: TargetPrefixYYYY-mm-DD-HH-MM-SS-UniqueString
Log Delivery: Best-effort, typically within a few minutes to a few hours. There is no SLA.
Log Record Fields: Over 20 fields including Bucket Owner, Bucket, Time, Remote IP, Requester, Request ID, Operation, Key, Request-URI, HTTP Status, Error Code, Bytes Sent, Object Size, Total Time, Turn-Around Time, Referrer, User-Agent, Version ID, Host ID, Signature Version, Cipher Suite, Authentication Type, Host Header, TLS Version, Access Point ARN, and more.
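To make the field order above concrete, here is a minimal Python sketch that splits one log record into named fields. The snake_case names, the `parse_record` helper, and the sample line are made up for illustration; only the field order follows the list above. Fields after Version ID are kept positionally in `extra`, on the assumption that AWS may append fields over time.

```python
import re

# First 18 fields in log order; the snake_case spellings are illustrative.
FIELDS = [
    "bucket_owner", "bucket", "time", "remote_ip", "requester", "request_id",
    "operation", "key", "request_uri", "http_status", "error_code",
    "bytes_sent", "object_size", "total_time", "turn_around_time",
    "referrer", "user_agent", "version_id",
]

# A token is a [bracketed] timestamp, a "quoted" string, or a run of non-spaces.
TOKEN = re.compile(r'\[[^\]]*\]|"[^"]*"|\S+')


def unwrap(token):
    """Drop the surrounding [] or "" from a token, if present."""
    if len(token) >= 2 and (token[0], token[-1]) in {("[", "]"), ('"', '"')}:
        return token[1:-1]
    return token


def parse_record(line):
    """Return a dict of named fields plus any trailing fields in 'extra'."""
    tokens = [unwrap(t) for t in TOKEN.findall(line)]
    record = dict(zip(FIELDS, tokens))
    record["extra"] = tokens[len(FIELDS):]
    return record


# Made-up sample record for illustration (not real log data).
SAMPLE = (
    'ownerid111 source-bucket [01/Apr/2025:12:00:00 +0000] 203.0.113.9 '
    'arn:aws:iam::123456789012:user/alice REQ123 REST.GET.OBJECT reports/q1.csv '
    '"GET /reports/q1.csv HTTP/1.1" 200 - 1024 1024 35 12 "-" '
    '"aws-cli/2.0 Python/3.11" - hostid= SigV4 ECDHE-RSA-AES128-GCM-SHA256 '
    'AuthHeader source-bucket.s3.us-east-1.amazonaws.com TLSv1.2 -'
)

record = parse_record(SAMPLE)
print(record["requester"], record["operation"], record["http_status"])
```

Quoted fields such as Request-URI and User-Agent contain spaces, which is why a plain split on spaces is not enough.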
Configuration
To enable server access logging via AWS CLI:
aws s3api put-bucket-logging --bucket source-bucket --bucket-logging-status '{"LoggingEnabled": {"TargetBucket": "target-bucket", "TargetPrefix": "logs/"}}'
Using the AWS Management Console: Go to the source bucket's Properties > Server access logging > Enable, then select the target bucket and prefix.
Verification
To verify logging is enabled:
aws s3api get-bucket-logging --bucket source-bucket
This returns the logging configuration if logging is enabled.
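The same enable-then-verify flow can be sketched in Python. This assumes `s3` is a boto3 S3 client (`boto3.client("s3")`), whose `put_bucket_logging` and `get_bucket_logging` calls mirror the two CLI commands above. The helper names `logging_status` and `enable_and_verify` are hypothetical.

```python
def logging_status(target_bucket, target_prefix="logs/"):
    """Build the same payload passed to --bucket-logging-status above."""
    return {
        "LoggingEnabled": {
            "TargetBucket": target_bucket,
            "TargetPrefix": target_prefix,
        }
    }


def enable_and_verify(s3, source_bucket, target_bucket, target_prefix="logs/"):
    """Enable logging on source_bucket, then read the configuration back.

    `s3` is assumed to be a boto3 S3 client. Returns the LoggingEnabled
    block, or None if logging is not enabled.
    """
    s3.put_bucket_logging(
        Bucket=source_bucket,
        BucketLoggingStatus=logging_status(target_bucket, target_prefix),
    )
    response = s3.get_bucket_logging(Bucket=source_bucket)
    # The response has no "LoggingEnabled" key when logging is off.
    return response.get("LoggingEnabled")


# Usage (requires boto3 and AWS credentials, so it is left commented out):
# import boto3
# print(enable_and_verify(boto3.client("s3"), "source-bucket", "target-bucket"))
```

Passing the client in as a parameter keeps the helper easy to exercise with a stub before pointing it at a real account.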
Analyzing Logs
Logs can be analyzed using Amazon Athena, Amazon EMR, or custom scripts. Athena is the most common exam-tested method. You create a table in Athena with the log format, then query with SQL. Example:
CREATE EXTERNAL TABLE s3_access_logs (
BucketOwner STRING,
Bucket STRING,
RequestDateTime STRING,
RemoteIP STRING,
Requester STRING,
RequestID STRING,
Operation STRING,
Key STRING,
RequestURI_Operation STRING,
HTTPStatus STRING,
ErrorCode STRING,
BytesSent BIGINT,
ObjectSize BIGINT,
TotalTime STRING,
TurnAroundTime STRING,
Referrer STRING,
UserAgent STRING,
VersionId STRING,
HostId STRING,
SignatureVersion STRING,
CipherSuite STRING,
AuthenticationType STRING,
HostHeader STRING,
TlsVersion STRING,
AccessPointArn STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
'input.regex' = '([^ ]*) ([^ ]*) \\[(.*?)\\] ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) (\"[^\"]*\"|-) (-|[0-9]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) (\"[^\"]*\"|-) ([^ ]*)(?: ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*) ([^ ]*))?.*$')
LOCATION 's3://target-bucket/prefix/';
Then query:
SELECT requester, operation, key, httpstatus, totaltime
FROM s3_access_logs
WHERE bucket = 'source-bucket'
-- The Time field is logged as dd/MMM/yyyy:HH:mm:ss Z, e.g. 01/Apr/2025:12:00:00 +0000
AND requestdatetime LIKE '01/Apr/2025%';
Interaction with Related Technologies
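AWS CloudTrail: By default, CloudTrail records management events for S3 (e.g., CreateBucket, PutBucketPolicy). Object-level requests (GetObject, PutObject) are recorded only if you enable data events. S3 server access logs record requests to the logged bucket, including object-level requests, once logging is enabled. A useful exam shorthand: CloudTrail for management events, server access logs (or CloudTrail data events) for object-level access.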
Amazon CloudWatch: S3 can send metrics (e.g., 4xx errors, request count) to CloudWatch, but these are aggregated, not per-request. For detailed audit, use S3 access logs.
AWS Config: Config records resource configuration changes, not individual requests.
Amazon GuardDuty: GuardDuty detects threats using CloudTrail events, VPC Flow Logs, and DNS logs (its S3 protection analyzes CloudTrail S3 data events); it does not consume S3 server access logs.
Amazon Macie: Macie uses machine learning and pattern matching to discover sensitive data in S3 buckets and reports on bucket security settings; it does not analyze server access logs.
Important Considerations
Log Delivery Delay: Logs are delivered best-effort; there is no guarantee. Delays can be minutes to hours.
Log Object Lifecycle: Log objects accumulate in the target bucket. You should configure lifecycle policies to transition or expire them to manage costs.
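Cost: Enabling logging and log delivery carry no extra charge. You pay the usual rates for storing log objects in the target bucket and for accessing them later (requests and data transfer when you read the logs).
Avoid Logging to the Same Bucket: S3 lets you use the source bucket as its own target, but every log delivery is itself a logged request, so the bucket fills with logs about logs and becomes harder to search. AWS recommends a separate target bucket.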
Log Format Changes: AWS may add fields over time. Your parsing logic should be flexible.
Requester Identification: If the requester is an AWS account, the field shows the canonical user ID. For IAM users, it shows the IAM user ARN. For unauthenticated (anonymous) requests, it shows a hyphen (-).
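Target Bucket Ownership and Permissions: AWS documentation states that the source and target buckets must be in the same Region and owned by the same account, so logs cannot be delivered directly to a bucket in another account. The target bucket must grant the S3 log delivery service write permission via a bucket policy.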
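The lifecycle consideration above can be sketched as a configuration builder. This is a minimal example, assuming the boto3 `put_bucket_lifecycle_configuration` call is used to apply it. The rule ID, the helper name, and the 90/365-day values are illustrative choices, not S3 defaults.

```python
def log_lifecycle_configuration(prefix="logs/", glacier_after_days=90,
                                expire_after_days=365):
    """Build a lifecycle configuration for log objects under `prefix`.

    Transitions logs to the GLACIER storage class, then expires them.
    The day counts are illustrative defaults.
    """
    if expire_after_days <= glacier_after_days:
        raise ValueError("expiration must come after the Glacier transition")
    return {
        "Rules": [
            {
                "ID": "manage-access-logs",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"}
                ],
                "Expiration": {"Days": expire_after_days},
            }
        ]
    }


config = log_lifecycle_configuration()
print(config["Rules"][0]["Transitions"], config["Rules"][0]["Expiration"])

# Applying it (requires boto3 and AWS credentials, so it is left commented out):
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="target-bucket", LifecycleConfiguration=config)
```

Scoping the rule to the log prefix keeps it from touching other objects if the target bucket is shared.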
Example Bucket Policy for Log Delivery
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": "logging.s3.amazonaws.com"
},
"Action": "s3:PutObject",
"Resource": "arn:aws:s3:::target-bucket/prefix/*",
"Condition": {
"ArnLike": {
"aws:SourceArn": "arn:aws:s3:::source-bucket"
}
}
}
]
}
This ensures only the specified source bucket can write logs to the target.
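If you manage several source buckets, the policy above can be generated rather than hand-edited. The sketch below builds the same statement from parameters. The function name is hypothetical, and the optional `aws:SourceAccount` condition is an addition not shown in the policy above, included as a common extra restriction.

```python
import json


def log_delivery_policy(target_bucket, prefix, source_bucket,
                        source_account_id=None):
    """Build a target-bucket policy that lets S3 log delivery write logs."""
    condition = {
        "ArnLike": {"aws:SourceArn": f"arn:aws:s3:::{source_bucket}"}
    }
    if source_account_id:
        # Optional extra restriction; an assumption beyond the policy above.
        condition["StringEquals"] = {"aws:SourceAccount": source_account_id}
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "logging.s3.amazonaws.com"},
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{target_bucket}/{prefix}*",
                "Condition": condition,
            }
        ],
    }


policy = log_delivery_policy("target-bucket", "prefix/", "source-bucket")
print(json.dumps(policy, indent=2))
```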
Limitations
No Real-Time: Not suitable for alerting. Use S3 Event Notifications or Amazon EventBridge (formerly CloudWatch Events) for real-time monitoring.
No Guaranteed Delivery: Logs may be lost if there is a system failure.
No Cross-Region Delivery: The target bucket must be in the same Region as the source bucket.
Scope Limited to the Logged Bucket: Only requests made to the source bucket are recorded. Account-level actions such as creating a bucket are not captured. Use CloudTrail for those.
Enable Server Access Logging
On the source bucket, navigate to Properties > Server access logging > Enable. Specify the target bucket (must be in same region) and an optional prefix. Alternatively, use the AWS CLI command: `aws s3api put-bucket-logging --bucket source-bucket --bucket-logging-status '{"LoggingEnabled": {"TargetBucket": "target-bucket", "TargetPrefix": "logs/"}}'`. This creates a logging configuration that tells S3 to start recording requests. The configuration is stored as a subresource on the source bucket.
Set Permissions on Target Bucket
The target bucket must have a bucket policy that grants the S3 log delivery service (`logging.s3.amazonaws.com`) permission to write objects. When you enable logging through the console, S3 updates the target bucket policy for you. When you enable it through the CLI, SDK, or API, you must add the policy yourself. The policy should allow `s3:PutObject` on the log prefix, with a condition limiting the source bucket ARN. Without this permission, log delivery fails.
S3 Aggregates Log Records
As requests arrive at the source bucket, S3 captures details for each request and collects the records into batches. The batching interval is neither configurable nor guaranteed. When a batch is complete, S3 finalizes the log object and initiates delivery. The log object contains multiple lines, each representing one request.
Log Object Delivery to Target
S3 writes the log object to the target bucket using the key format: `TargetPrefixYYYY-mm-DD-HH-MM-SS-UniqueString`. The timestamp reflects when the log object was delivered, not the request times. Delivery is best-effort, so a record can occasionally arrive late or not at all, for example if the target bucket cannot be written to. The log object is stored as a standard S3 object, so it incurs storage costs.
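The key format lends itself to a small helper that recovers the delivery time from a log object key. This is a sketch under two assumptions: the key uses exactly the `TargetPrefixYYYY-mm-DD-HH-MM-SS-UniqueString` layout described here, and the timestamp is in UTC. The function names and sample keys are hypothetical.

```python
from datetime import datetime, timezone

STAMP_FORMAT = "%Y-%m-%d-%H-%M-%S"
STAMP_LENGTH = len("YYYY-mm-DD-HH-MM-SS")  # 19 characters


def log_object_time(key, target_prefix=""):
    """Return the delivery timestamp embedded in a log object key."""
    if not key.startswith(target_prefix):
        raise ValueError(f"{key!r} does not start with {target_prefix!r}")
    stamp = key[len(target_prefix):][:STAMP_LENGTH]
    # Assumption: key timestamps are UTC.
    return datetime.strptime(stamp, STAMP_FORMAT).replace(tzinfo=timezone.utc)


def newest_log_time(keys, target_prefix=""):
    """Return the most recent delivery time among the given keys."""
    return max(log_object_time(k, target_prefix) for k in keys)


# Made-up keys for illustration.
keys = [
    "logs/2025-04-01-12-30-45-ABCDEF0123456789",
    "logs/2025-04-01-13-05-10-0123456789ABCDEF",
]
print(newest_log_time(keys, "logs/").isoformat())
```

Comparing the newest delivery time against the current time is one simple way to notice that log delivery has stopped.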
Analyze Logs with Athena
To query logs, create an external table in Athena using the log format's regex. Then run SQL queries to filter by bucket, time range, requester, operation, etc. Example: `SELECT requester, operation, httpstatus FROM s3_access_logs WHERE bucket = 'my-bucket' AND requestdatetime LIKE '01/Apr/2025%'`. The date pattern follows the log's Time field, which is written as `dd/MMM/yyyy:HH:mm:ss Z`. Athena charges per query based on data scanned. Partitioning logs by date can reduce cost and improve performance.
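The Time field in each record is written in the form `[01/Apr/2025:12:00:00 +0000]`, so date filters have to account for that layout. As a minimal illustration outside Athena, the Python sketch below parses that field and filters values by calendar day. The helper names and sample values are hypothetical.

```python
from datetime import date, datetime

# Layout of the Time field once the surrounding brackets are removed.
TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def parse_request_time(value):
    """Parse a Time field, with or without its surrounding brackets."""
    return datetime.strptime(value.strip("[]"), TIME_FORMAT)


def on_day(time_values, day):
    """Keep only the Time values that fall on the given calendar day."""
    return [v for v in time_values if parse_request_time(v).date() == day]


# Made-up values for illustration.
times = [
    "[01/Apr/2025:23:59:59 +0000]",
    "02/Apr/2025:00:00:01 +0000",
]
print(on_day(times, date(2025, 4, 1)))
```

Parsing into a real timestamp, rather than matching strings, also makes range queries across month boundaries straightforward.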
Scenario 1: Compliance Auditing for Healthcare Data
A healthcare company stores patient records in S3 and must comply with HIPAA. They enable server access logging on all buckets containing protected health information (PHI). Logs are delivered to a dedicated target bucket in the same account and Region, protected by strict access controls. Every month, the compliance team runs Athena queries to identify any access from unauthorized IPs or users. They also set up a lifecycle policy to transition logs older than 90 days to Glacier for long-term retention. A common issue is missing logs due to misconfigured target bucket permissions, because log delivery fails silently. The team uses CloudWatch metrics to monitor the size of log objects; if no new logs appear for 24 hours, they trigger an alert.
Scenario 2: Security Incident Investigation
A financial services firm detects unusual activity: a large number of DELETE requests from an unexpected IP. They enable server access logging (if not already enabled) but realize that logs are not available for past events—logging must be enabled beforehand. They use the existing logs to trace the source IP, the IAM user (if any), and the objects deleted. They find that the requests came from a compromised access key. The logs also show the signature version and TLS version, helping them determine the client software used. The incident response team uses Athena to correlate with CloudTrail logs to identify the exact time the key was used. The limitation of log delivery delay (up to several hours) means they cannot respond in real-time, so they rely on CloudWatch alarms for immediate anomalies.
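The first step of that investigation, finding which IPs issued the deletes, can be sketched in a few lines of Python. This assumes the log records have already been parsed into dicts with `operation` and `remote_ip` keys (a hypothetical shape). The sample records are made up.

```python
from collections import Counter


def delete_counts_by_ip(records):
    """Count delete operations per remote IP.

    Matches any operation name containing "DELETE", which covers names
    such as REST.DELETE.OBJECT and BATCH.DELETE.OBJECT.
    """
    return Counter(
        r["remote_ip"] for r in records if "DELETE" in r["operation"]
    )


# Made-up parsed records for illustration.
records = [
    {"remote_ip": "203.0.113.9", "operation": "REST.DELETE.OBJECT"},
    {"remote_ip": "203.0.113.9", "operation": "BATCH.DELETE.OBJECT"},
    {"remote_ip": "198.51.100.4", "operation": "REST.GET.OBJECT"},
    {"remote_ip": "198.51.100.7", "operation": "REST.DELETE.OBJECT"},
]

for ip, count in delete_counts_by_ip(records).most_common():
    print(ip, count)
```

The same grouping can be expressed in Athena with `GROUP BY remoteip`; the Python form is handy for a quick look at a handful of downloaded log objects.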
Scenario 3: Cost Allocation and Usage Analysis
A large e-commerce company uses S3 for static assets and user uploads. They enable server access logging to analyze request patterns. By querying logs with Athena, they identify which departments (via requester) consume the most bandwidth and which objects are most frequently accessed. They use this data to optimize storage classes (e.g., moving cold data to S3 Glacier) and to charge back costs to business units. They also discover that bots are scraping their public bucket, causing high GET requests. They then implement a bucket policy to restrict access based on User-Agent. The logs help them tune their caching strategy via CloudFront. A common mistake is forgetting to set a lifecycle policy on the target bucket, leading to accumulating log storage costs.
What SOA-C02 Tests
Objective 1.2: "Implement and manage logging and monitoring for S3." The exam focuses on:
Enabling server access logging (console, CLI, API)
Understanding log format fields (especially Bucket Owner, Remote IP, Requester, Operation, HTTP Status, Error Code, Total Time, Turn-Around Time)
Using Athena to query logs
Differences between S3 access logs, CloudTrail, and CloudWatch metrics
Target bucket permissions for log delivery (logging.s3.amazonaws.com)
Log delivery characteristics (best-effort, not real-time)
Lifecycle management for log objects
Common Wrong Answers
"S3 server access logs are delivered in real-time." Candidates confuse them with CloudWatch metrics or S3 Event Notifications. Reality: logs are aggregated and delivered every few minutes to hours; no SLA.
"The source bucket is a good place to store its own logs." S3 permits using the source bucket as the target, but every log delivery then generates further log records, so AWS recommends a separate target bucket.
"CloudTrail captures all S3 requests including data events by default." CloudTrail records management events by default; data events must be explicitly enabled and incur additional cost.
"Log objects are stored unencrypted unless you configure encryption." Since January 2023, S3 applies SSE-S3 to all new objects by default, so log objects are encrypted at rest. They follow the target bucket's default encryption settings.
Specific Numbers and Terms
Log object key format: TargetPrefixYYYY-mm-DD-HH-MM-SS-UniqueString
Fields: 20+; know Requester, Operation, Key, HTTP Status, Error Code, Total Time, Turn-Around Time
Target bucket must be in same region.
Use aws s3api put-bucket-logging to enable.
Athena table creation requires RegexSerDe with specific regex pattern.
Lifecycle policy example: transition to Glacier after 90 days, or delete after 365 days (illustrative values, not S3 defaults).
Edge Cases
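Target bucket permissions: The target bucket policy must allow logging.s3.amazonaws.com to write, with a condition on the source bucket ARN. Per AWS documentation, the target bucket must be in the same account and Region as the source.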
Logging for versioned buckets: Logs include version ID if present.
Logging for MFA Delete: Logs capture the request but not MFA details.
Logging for Requester Pays: Logs show requester as the account that paid.
How to Eliminate Wrong Answers
If the question asks for "real-time" monitoring, eliminate S3 access logs (not real-time). Look for CloudWatch or S3 Event Notifications.
If the question involves auditing who accessed a specific object, S3 access logs are appropriate; CloudTrail only if data events enabled.
If the target bucket is in a different region, answer is invalid; must be same region.
If the question mentions "guaranteed delivery," S3 access logs are best-effort, so that option is wrong.
If the question asks for cost-effective long-term analysis, Athena is the standard answer.
S3 server access logs must be enabled per bucket; they are disabled by default.
Target bucket must be in the same AWS region as the source bucket.
Logs are delivered on a best-effort basis; no SLA or guarantee.
Use Amazon Athena to query logs with SQL; create table using RegexSerDe.
Log object key format: TargetPrefixYYYY-mm-DD-HH-MM-SS-UniqueString.
Lifecycle policies should be set on target bucket to manage log storage costs.
Log delivery requires a bucket policy on the target bucket allowing logging.s3.amazonaws.com.
S3 access logs capture data-level requests; use CloudTrail for management events.
Logs are not real-time; use S3 Event Notifications for immediate alerts.
Common exam fields: Bucket Owner, Remote IP, Requester, Operation, HTTP Status, Error Code, Total Time.
These come up on the exam all the time. Here's how to tell them apart.
S3 Server Access Logs
Captures all requests (including data-level) to S3 bucket.
Delivered as log objects to a target bucket (same region).
Best-effort delivery, not real-time.
Free of charge (only storage costs for logs).
Fields include HTTP status, error codes, total time, turn-around time.
AWS CloudTrail (Data Events)
Records API calls; data events must be enabled separately.
Delivered to CloudTrail S3 bucket or CloudWatch Logs.
Near real-time via CloudWatch Events (if enabled).
Cost per event (data events are charged).
Includes IAM user ARN, source IP, user agent, request parameters.
Mistake
S3 server access logs are delivered in real-time.
Correct
Logs are aggregated and delivered on a best-effort basis. Most records arrive within a few hours, sometimes sooner, but there is no guaranteed delivery time. They are not real-time.
Mistake
Delivering logs to the source bucket itself is a sensible default.
Correct
S3 accepts the source bucket as its own target, but each log delivery is then logged as a request, producing logs about logs. AWS recommends using a separate target bucket.
Mistake
S3 server access logs capture all API calls including bucket management.
Correct
They record requests made to the logged bucket: object-level requests (GET, PUT, DELETE, HEAD) and bucket-level requests such as listing objects. They are not an account-wide API audit trail. Actions like CreateBucket are recorded by CloudTrail.
Mistake
Log objects are stored unencrypted unless you configure encryption yourself.
Correct
Log objects are encrypted according to the target bucket's default encryption. Since January 2023, S3 applies SSE-S3 to all new objects by default, so logs are encrypted at rest even if you configure nothing.
Mistake
Server access logging is automatically enabled on all S3 buckets.
Correct
It is disabled by default and must be manually enabled per bucket. There is no global setting.
Use the `aws s3api put-bucket-logging` command with a JSON configuration specifying the target bucket and prefix. Example: `aws s3api put-bucket-logging --bucket source-bucket --bucket-logging-status '{"LoggingEnabled": {"TargetBucket": "target-bucket", "TargetPrefix": "logs/"}}'`. Ensure the target bucket has appropriate permissions.
No, S3 server access logs are not real-time. They are delivered on a best-effort basis, typically within minutes to hours. For real-time monitoring, use S3 Event Notifications to trigger Lambda or SQS, or enable CloudWatch metrics with alarms.
S3 server access logs capture all requests (including GET, PUT, DELETE) to S3 objects and buckets. CloudTrail records API calls; by default it logs management events (e.g., CreateBucket). To log data events (object-level), you must enable CloudTrail data events, which incur additional cost. Both can be used together for comprehensive auditing.
First, create an external table in Athena defining the log schema with a RegexSerDe. Then query the table with SQL. For example: `SELECT requester, operation, httpstatus FROM s3_access_logs WHERE bucket = 'my-bucket'`. Partitioning logs by date improves performance and reduces cost.
The target bucket must have a bucket policy that grants the S3 log delivery service principal (`logging.s3.amazonaws.com`) permission to write objects. The policy should use a condition to restrict the source bucket ARN. Example: `{"Effect":"Allow","Principal":{"Service":"logging.s3.amazonaws.com"},"Action":"s3:PutObject","Resource":"arn:aws:s3:::target-bucket/prefix/*","Condition":{"ArnLike":{"aws:SourceArn":"arn:aws:s3:::source-bucket"}}}`.
Yes, all requests to the source bucket are logged, including those from the bucket owner, IAM users, and anonymous users. The `Requester` field shows the canonical user ID for AWS accounts or the ARN for IAM users.
There is no guaranteed delivery time. Logs are typically delivered every few minutes to a few hours, but delays can occur. AWS recommends not relying on logs for time-sensitive operations.