This chapter covers API security mechanisms essential for the SY0-701 exam: OAuth 2.0, JSON Web Tokens (JWT), and rate limiting. These are core to securing modern web services and are tested under Objective 3.1 (Security Architecture). Understanding how they work, their weaknesses, and how to implement them correctly is critical for both the exam and real-world security engineering.
Imagine you drive to a private club and hand your car keys to a valet. The valet key (OAuth token) can start the car and move it, but cannot open the trunk or glove box. The valet parks your car and later returns the key. This is OAuth 2.0: you (resource owner) grant the valet (client application) a limited token to access your car (API) for a specific purpose (scope). The valet does not get your master key (password). Now suppose the valet gives a ticket (JWT) to a different valet to retrieve your car. That ticket contains your car's ID (subject) and the valet's privileges (claims), all signed by the club manager (authorization server). If the ticket is stolen, the thief can drive your car until the ticket expires. Rate limiting is like the club only allowing 10 cars per minute into the parking lot; if a valet tries to bring 20 cars at once, the gate closes (HTTP 429 Too Many Requests). This prevents a valet from flooding the lot and blocking others (DoS).
What is API Security and Why Does It Matter?
APIs (Application Programming Interfaces) are the backbone of modern applications, allowing services to communicate and share data. Securing APIs is about controlling access, ensuring data integrity, and preventing abuse. The SY0-701 exam focuses on three key mechanisms: OAuth 2.0 for delegated authorization, JWT for stateless token-based authentication/authorization, and rate limiting for protecting against brute-force and DoS attacks. Without these, APIs are vulnerable to unauthorized access, token theft, replay attacks, and resource exhaustion.
OAuth 2.0: Delegated Authorization
OAuth 2.0 (RFC 6749) is an authorization framework that enables third-party applications to obtain limited access to a user's resources without exposing the user's credentials. It is NOT an authentication protocol, though it is often misused as one. The core roles are:
- Resource Owner: The user who owns the data (e.g., you).
- Client: The application requesting access (e.g., a photo printing app).
- Authorization Server: The server that issues tokens after authenticating the resource owner (e.g., Google's OAuth endpoint).
- Resource Server: The server hosting the protected data (e.g., Google Photos API).
How It Works (Authorization Code Grant):
1. The client redirects the user to the authorization server with a request for access. The URL includes response_type=code, client_id, redirect_uri, and scope.
2. The user authenticates (e.g., logs into Google) and consents to the requested permissions.
3. The authorization server redirects back to the client with an authorization code in the query string.
4. The client sends this code to the authorization server's token endpoint (POST request with grant_type=authorization_code, code, client credentials) to receive an access token (and optionally a refresh token).
5. The client uses the access token to call the resource server's API, typically in the Authorization: Bearer <token> header.
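Step 1 of the flow above can be sketched in a few lines. This is a minimal illustration, not a full client: the endpoint URL, client ID, redirect URI, and scope name are hypothetical, and the `state` parameter (defined in RFC 6749 but not listed in the steps above) is added as an assumption because clients commonly use it to tie the redirect back to the original request.

```python
import secrets
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical authorization endpoint, for illustration only.
AUTHORIZE_ENDPOINT = "https://auth.example.com/authorize"


def build_authorization_url(client_id, redirect_uri, scope):
    """Return (url, state) for the redirect in step 1 of the authorization code grant."""
    # Assumption: an opaque random `state` value that the client checks
    # when the authorization server redirects back in step 3.
    state = secrets.token_urlsafe(16)
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
    }
    return f"{AUTHORIZE_ENDPOINT}?{urlencode(params)}", state


url, state = build_authorization_url(
    "photo-print-app",                      # hypothetical client_id
    "https://client.example.com/callback",  # hypothetical redirect_uri
    "photos.read",                          # hypothetical scope
)
query = parse_qs(urlparse(url).query)
print(url)
```

The user's browser is sent to this URL; the client keeps `state` so it can compare it with the value that comes back alongside the authorization code.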
Key Security Considerations:
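- For confidential clients, the authorization code must be exchanged server-side so the client secret and the resulting tokens never reach the browser. Public clients (mobile apps, SPAs) cannot keep a secret and should use the PKCE extension to protect the code against interception.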
- Access tokens should be short-lived (minutes to hours). Refresh tokens are long-lived and must be stored securely.
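- The redirect_uri must be validated against the values registered for the client, ideally by exact match; otherwise an attacker can have the authorization code delivered to a URI they control, for example through an open redirector.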
- OAuth 2.0 does not define token format; JWT is commonly used for access tokens.
Common Grant Types (SY0-701 relevant):
- Authorization Code: Most secure for server-side apps.
- Implicit (legacy): No code exchange; token returned directly. Deprecated due to security issues (token in URL).
- Client Credentials: For server-to-server communication; no user involvement.
- Resource Owner Password Credentials: User gives password to client; high risk. Discouraged.
JSON Web Tokens (JWT)
JWT (RFC 7519) is a compact, URL-safe token format used for transmitting claims between parties. It consists of three Base64url-encoded segments separated by dots: Header, Payload, Signature. Example:
eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiYWRtaW4iOnRydWUsImlhdCI6MTUxNjIzOTAyMn0.NHVaYe26MbtOYhSKkoKYdFVomg4i8ZJd8_-RU8VNbftc4gMbzFJk4xOEOQz6io1Hr0cZP1D1m0l6gYJcYg
Header: Contains the signing algorithm (e.g., HS256 for HMAC-SHA256, RS256 for RSA-SHA256) and token type (JWT).
Payload: Contains claims (registered, public, or private). Standard claims: iss (issuer), sub (subject), aud (audience), exp (expiration), nbf (not before), iat (issued at).
Signature: Created by signing the header and payload with a secret (symmetric) or private key (asymmetric). The recipient verifies using the same secret or public key.
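To make the three-part structure concrete, the sketch below Base64url-decodes the header and payload segments of the example token above using only the Python standard library. It only decodes; it does not verify the signature, so it shows what any party holding the token can read, not whether the token is trustworthy.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode a Base64url segment, restoring the '=' padding that JWTs strip."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


# Header and payload segments copied from the example token above.
header_segment = "eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9"
payload_segment = (
    "eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwi"
    "YWRtaW4iOnRydWUsImlhdCI6MTUxNjIzOTAyMn0"
)

header = json.loads(b64url_decode(header_segment))
payload = json.loads(b64url_decode(payload_segment))

print(header)   # the algorithm and token type
print(payload)  # the claims, readable by anyone who holds the token
```

Because no key is needed to read the payload, the claims should be treated as public information.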
How JWT Secures APIs:
- Stateless: The server does not need to store session state; all info is in the token.
- Integrity: Signature ensures the token has not been tampered with.
- Claims can include user roles, permissions, and expiration.
Security Risks:
- None Algorithm Attack: If the server accepts alg: none, an attacker can forge a token with no signature. Mitigation: Reject tokens with alg: none.
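- Key Confusion: A server that expects RS256 but trusts the token's alg header can be tricked into verifying an HS256 token with its RSA public key as the HMAC secret. Because the public key is not secret, an attacker can forge valid signatures. Mitigation: Validate the algorithm against an expected list.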
- Token Theft: JWT in cookies or local storage can be stolen via XSS. Mitigation: Use HttpOnly cookies, short expiration, and refresh token rotation.
- Replay: A stolen token can be reused until expiry. Mitigation: Use short expiry and implement token revocation (blacklist).
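The first two mitigations above come down to checking the algorithm before verifying anything. The sketch below shows the idea with HS256 and the standard library only; it is an illustration under stated assumptions, not a replacement for a maintained JWT library. The secret, the claim values, and the function names are hypothetical, and the allowlist assumes this API only ever issues HS256 tokens.

```python
import base64
import hashlib
import hmac
import json

ALLOWED_ALGORITHMS = {"HS256"}  # assumption: this API only issues HS256 tokens


def b64url_encode(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Build a compact JWT signed with HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url_encode(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{b64url_encode(signature)}"


def verify_hs256(token: str, secret: bytes) -> dict:
    """Return the claims if the token is acceptable, else raise ValueError."""
    try:
        header_segment, payload_segment, signature_segment = token.split(".")
        header = json.loads(b64url_decode(header_segment))
    except ValueError as exc:
        raise ValueError("malformed token") from exc
    # Reject "none" and any other unexpected algorithm before checking the signature.
    if header.get("alg") not in ALLOWED_ALGORITHMS:
        raise ValueError(f"algorithm not allowed: {header.get('alg')!r}")
    expected = hmac.new(
        secret, f"{header_segment}.{payload_segment}".encode("ascii"), hashlib.sha256
    ).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(expected, b64url_decode(signature_segment)):
        raise ValueError("bad signature")
    return json.loads(b64url_decode(payload_segment))


secret = b"demo-secret-not-for-production"  # hypothetical key
token = sign_hs256({"sub": "alice", "admin": False}, secret)

# An attacker's forgery: alg set to "none", admin flipped, empty signature.
forged = ".".join([
    b64url_encode(b'{"alg":"none","typ":"JWT"}'),
    b64url_encode(b'{"sub":"alice","admin":true}'),
    "",
])
```

Calling `verify_hs256(token, secret)` returns the original claims, while `verify_hs256(forged, secret)` raises before any signature logic runs.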
Rate Limiting
Rate limiting controls the number of requests a client can make to an API within a given time window. It protects against brute-force attacks, credential stuffing, and DoS.
Common Algorithms:
- Token Bucket: Tokens are added at a fixed rate; each request consumes a token. Bursts are allowed up to bucket size.
- Leaky Bucket: Requests are processed at a constant rate; excess requests are queued or dropped.
- Fixed Window: Count requests per time window (e.g., 100 requests per minute). Vulnerable to burst at window boundaries.
- Sliding Window: Uses a rolling time window to avoid boundary spikes.
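The boundary weakness of the fixed window is easiest to see side by side with a sliding window. The sketch below is a simplified single-client model with caller-supplied timestamps; the class names are hypothetical, and the sliding variant shown is the "log" form that stores one timestamp per accepted request.

```python
from collections import deque


class FixedWindowLimiter:
    """Counts requests per aligned window (0-60 s, 60-120 s, ...)."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.current_window = None
        self.count = 0

    def allow(self, now):
        window_index = int(now // self.window)
        if window_index != self.current_window:
            # A new window starts: the counter resets to zero.
            self.current_window = window_index
            self.count = 0
        if self.count >= self.limit:
            return False
        self.count += 1
        return True


class SlidingWindowLimiter:
    """Counts requests accepted in the last `window_seconds` seconds."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.timestamps = deque()

    def allow(self, now):
        # Drop timestamps that have fallen out of the rolling window.
        while self.timestamps and self.timestamps[0] <= now - self.window:
            self.timestamps.popleft()
        if len(self.timestamps) >= self.limit:
            return False
        self.timestamps.append(now)
        return True


def count_allowed(limiter):
    # 100 requests just before the minute boundary, 100 just after it.
    times = [59.0] * 100 + [61.0] * 100
    return sum(limiter.allow(t) for t in times)


fixed_allowed = count_allowed(FixedWindowLimiter(100, 60))
sliding_allowed = count_allowed(SlidingWindowLimiter(100, 60))
print(fixed_allowed, sliding_allowed)
```

With a limit of 100 requests per minute, the fixed window admits both bursts because they fall in different windows, while the sliding window still sees the first burst when the second arrives.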
Implementation:
Rate limiting is often enforced at the API gateway or reverse proxy (e.g., Nginx, Kong, AWS API Gateway). Headers like X-RateLimit-Limit, X-RateLimit-Remaining, and Retry-After inform the client. When exceeded, the server returns HTTP 429 Too Many Requests.
Example Nginx Rate Limiting Config:
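http {
    # Track clients by IP address; allow an average of 10 requests per second each.
    limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s;
    # Reject throttled requests with 429 instead of Nginx's default 503.
    limit_req_status 429;
    server {
        location /api/ {
            limit_req zone=api burst=20 nodelay;
            proxy_pass http://backend;
        }
    }
}
This limits each client IP to 10 requests per second and tolerates bursts of up to 20 additional requests, which nodelay serves immediately instead of queuing. Requests beyond the burst are rejected; Nginx uses status 503 for this by default, so limit_req_status is set to return 429.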
Security Considerations:
- Apply rate limits per user (based on API key or token) as well as per IP. IP-only limits let clients behind a shared address exhaust each other's quota and can be circumvented with botnets.
- Clients should use exponential backoff when retrying after a 429.
- Rate limiting alone is not sufficient; combine it with CAPTCHA or a WAF against sophisticated attacks.
OAuth Authorization Code Flow
1. **Client requests authorization**: The client app redirects the user to the authorization server's `/authorize` endpoint with `response_type=code`, `client_id`, `redirect_uri`, and `scope`.
2. **User authenticates and consents**: The user logs in and grants permissions. The authorization server stores the consent.
3. **Authorization code issued**: The server redirects to the client's `redirect_uri` with a `code` parameter. This code is short-lived (minutes).
4. **Client exchanges code for token**: The client sends a POST request to the `/token` endpoint with `grant_type=authorization_code`, `code`, `redirect_uri`, and its client credentials (`client_id` and `client_secret`).
5. **Access token returned**: The authorization server validates the code and returns an access token (and optionally a refresh token) in JSON.
6. **Client calls resource server**: The client includes the access token in the `Authorization: Bearer <token>` header when calling the API.
7. **Resource server validates token**: The resource server checks the token's signature, expiration, and scope before returning data.
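Step 4 of this flow, the code-for-token exchange, might look like the sketch below. It only builds the request object and does not send it. The token endpoint URL and all credential values are hypothetical, and passing the client credentials in an HTTP Basic header is an assumption about one common way to authenticate the client; some servers accept them in the POST body instead.

```python
import base64
from urllib.parse import urlencode, parse_qs
from urllib.request import Request

# Hypothetical token endpoint, for illustration only.
TOKEN_ENDPOINT = "https://auth.example.com/token"


def build_token_request(code, redirect_uri, client_id, client_secret):
    """Build (but do not send) the POST that exchanges a code for tokens."""
    body = urlencode({
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri,  # must match the value used in step 1
    }).encode("ascii")
    # Assumption: the client authenticates with HTTP Basic.
    credentials = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode("ascii")
    return Request(
        TOKEN_ENDPOINT,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {credentials}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


request = build_token_request(
    code="abc123",                                       # hypothetical authorization code
    redirect_uri="https://client.example.com/callback",  # hypothetical
    client_id="photo-print-app",                         # hypothetical
    client_secret="s3cret",                              # hypothetical
)
form = parse_qs(request.data.decode("ascii"))
print(request.get_method(), request.full_url, form)
```

This request is made from the client's back end, so the client secret and the returned tokens stay out of the browser.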
JWT Token Validation Process
1. **Receive token**: The resource server extracts the JWT from the `Authorization` header (Bearer scheme).
2. **Split token**: The server splits the token into three parts: header, payload, signature.
3. **Decode header**: Base64url-decode the header to get the algorithm (e.g., `RS256`) and key ID (`kid`) if present.
4. **Retrieve public key**: Using the `kid` or issuer, fetch the public key from a trusted source (e.g., JWKS endpoint).
5. **Verify signature**: Use the public key and algorithm to verify the signature against the base64url-encoded header and payload.
6. **Validate claims**: Check `exp` (token has not expired), `nbf` (if present, the current time is not earlier than it), `iss` (expected issuer), `aud` (expected audience), and any custom claims.
7. **Check revocation**: If the token is blacklisted (e.g., after logout), reject it.
8. **Grant access**: If all checks pass, extract user identity and permissions from claims and process the request.
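The claim checks in step 6 are simple comparisons once the signature has been verified. The following sketch assumes the claims have already been decoded into a dictionary by an earlier verification step; the function name, issuer, audience, and timestamps are hypothetical.

```python
import time


def validate_claims(claims, expected_issuer, expected_audience, now=None):
    """Return the claims if they pass the step 6 checks, else raise ValueError.

    Assumes the signature was already verified (step 5).
    """
    now = time.time() if now is None else now
    # exp: reject tokens with no expiry or whose expiry has passed.
    if "exp" not in claims or now >= claims["exp"]:
        raise ValueError("token expired or missing exp")
    # nbf: optional; if present, the token is not valid before this time.
    if "nbf" in claims and now < claims["nbf"]:
        raise ValueError("token not yet valid")
    # iss: must be the authorization server this API trusts.
    if claims.get("iss") != expected_issuer:
        raise ValueError("unexpected issuer")
    # aud: may be a single string or a list of strings.
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if expected_audience not in audiences:
        raise ValueError("unexpected audience")
    return claims


claims = {
    "iss": "https://auth.example.com",  # hypothetical issuer
    "aud": "photos-api",                # hypothetical audience
    "sub": "1234567890",
    "nbf": 1000,
    "exp": 2000,
}
ok = validate_claims(claims, "https://auth.example.com", "photos-api", now=1500)
print(ok["sub"])
```

Passing `now` explicitly keeps the example deterministic; production code would typically also allow a few seconds of clock skew.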
Rate Limiting with Token Bucket
1. **Initialize bucket**: For each client (identified by API key or IP), create a bucket with capacity `C` (e.g., 10 tokens) and a refill rate `R` (e.g., 1 token per second).
2. **Request arrives**: When a request comes in, check if the bucket has at least 1 token.
3. **Consume token**: If yes, remove one token and allow the request.
4. **Refill**: Tokens are added at rate `R` up to capacity `C`. This allows bursts of up to `C` requests.
5. **Deny request**: If no tokens remain, reject with HTTP 429 and set the `Retry-After` header to the time until the next token is available.
6. **Logging**: Log each request and rate limit decision for monitoring. Tools like Nginx, Redis (with sorted sets), or an API gateway can implement this.
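The steps above translate almost directly into code. This is a minimal single-process sketch using the example values `C = 10` and `R = 1` token per second; the class and method names are hypothetical, refill is computed lazily from the elapsed time on each request, and timestamps are passed in so the behavior is easy to trace.

```python
import math


class TokenBucket:
    """Token bucket with capacity C and refill rate R tokens per second."""

    def __init__(self, capacity, refill_rate, now=0.0):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = float(capacity)  # a new client starts with a full bucket
        self.last_refill = now

    def _refill(self, now):
        # Add the tokens earned since the last request, capped at capacity.
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now

    def try_consume(self, now):
        """Return (allowed, retry_after_seconds)."""
        self._refill(now)
        if self.tokens >= 1:
            self.tokens -= 1
            return True, 0
        # Seconds until one whole token is available, for the Retry-After header.
        retry_after = math.ceil((1 - self.tokens) / self.refill_rate)
        return False, retry_after


# One bucket per client, keyed by API key or IP (hypothetical key shown).
buckets = {"api-key-123": TokenBucket(capacity=10, refill_rate=1.0)}
bucket = buckets["api-key-123"]

burst = [bucket.try_consume(now=0.0) for _ in range(11)]
later = [bucket.try_consume(now=3.0) for _ in range(4)]
print(burst[-1], later)
```

The first ten requests at `t = 0` drain the bucket and the eleventh is denied; three seconds later three tokens have been refilled, so three more requests pass before the limiter denies again.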
Responding to OAuth Token Theft
1. **Detect anomaly**: The resource server notices unusual access patterns (e.g., same token used from different geographies) or the user reports unauthorized activity.
2. **Revoke token**: The authorization server invalidates the token by adding it to a blacklist or by rotating the signing key.
3. **Notify user**: The user is alerted and may be forced to re-authenticate.
4. **Rotate refresh token**: If a refresh token was compromised, issue a new one and invalidate the old one.
5. **Audit logs**: Review logs to determine scope of breach.
6. **Mitigate**: Implement token binding (e.g., using `cnf` claim for proof-of-possession) to prevent token reuse on different devices.
Implementing API Gateway Rate Limiting
1. **Choose rate limit strategy**: Decide per-user or per-IP limits. For authenticated APIs, use API key or token.
2. **Configure gateway**: In AWS API Gateway, create a usage plan with rate and burst limits. In Nginx, use `limit_req_zone` and `limit_req`.
3. **Set limits**: Define reasonable limits based on expected traffic (e.g., 1000 requests per hour per user).
4. **Return proper headers**: Include `X-RateLimit-Limit`, `X-RateLimit-Remaining`, and `X-RateLimit-Reset` so clients can adjust.
5. **Handle exceeded requests**: Return HTTP 429 with a `Retry-After` header.
6. **Monitor and adjust**: Use CloudWatch or equivalent to monitor throttling events and adjust limits.
7. **Test**: Simulate high traffic to verify rate limiting works and does not block legitimate users.
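Steps 4 and 5 concern what the client sees. The sketch below computes a status code and headers for a fixed-window limit; it is not tied to any particular gateway. The function name is hypothetical, and because the `X-RateLimit-*` headers are a convention rather than a standard, treating `X-RateLimit-Reset` as a Unix timestamp is an assumption (some APIs send seconds remaining instead).

```python
import math


def throttle_response(limit, used, window_reset, now):
    """Return (status, headers) for a request in a fixed window.

    `used` counts requests in the current window, including this one.
    `window_reset` and `now` are Unix timestamps in seconds.
    """
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(0, limit - used)),
        # Assumption: reset is reported as a Unix timestamp.
        "X-RateLimit-Reset": str(int(window_reset)),
    }
    if used > limit:
        # Tell the client how many whole seconds to wait before retrying.
        headers["Retry-After"] = str(max(1, math.ceil(window_reset - now)))
        return 429, headers
    return 200, headers


ok_status, ok_headers = throttle_response(limit=1000, used=10, window_reset=3600, now=100)
limited_status, limited_headers = throttle_response(limit=1000, used=1001, window_reset=3600, now=3590)
print(ok_status, ok_headers)
print(limited_status, limited_headers)
```

A well-behaved client reads `X-RateLimit-Remaining` to slow down before it hits the limit and honors `Retry-After` once it receives a 429.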
Scenario 1: Social Media App API Abuse
A popular social media platform provides an API for third-party apps. An attacker registers a malicious app that uses OAuth 2.0 with the client_credentials grant to obtain an access token with high privileges. The attacker then uses this token to scrape user data at high volume. The SOC analyst notices a spike in API calls from a single client ID, far exceeding normal usage. Using the API gateway logs, they see the client is making 10,000 requests per minute. The analyst immediately revokes the client's access token and blacklists the client ID. They then implement rate limiting per client (100 requests per minute) and require proof-of-possession (DPoP) for tokens. A common mistake is to only rate limit by IP, which the attacker bypasses by using a botnet. The correct response is to limit by token/API key and to monitor for anomalous patterns.
Scenario 2: JWT Algorithm Confusion Attack
A company uses JWTs for authentication on its internal API. The JWT library takes the algorithm from the token header without validation. An attacker obtains a valid RS256-signed token and the server's RSA public key, which is not secret. The attacker changes the header to alg: HS256, edits the claims, and signs the token with HMAC-SHA256 using the public key as the secret. The server is meant to verify RS256 signatures, but because it trusts the header it runs HMAC verification with that same public key as the shared secret, and the forged token passes. The attacker now has admin access. The SOC detects the breach when an audit log shows a user with unusual privileges accessing sensitive data. The fix: pin the expected algorithm on the server (validate it against a whitelist) and reject tokens with any other algorithm. Tools like jwt.io can help decode tokens for analysis.
Scenario 3: Rate Limiting Bypass via Distributed Attack
An e-commerce site uses IP-based rate limiting to prevent brute-force login attempts. An attacker uses a botnet of 1000 IPs, each making 5 requests per second, totaling 5000 requests per second. The rate limit per IP is 10 requests per second, so each IP stays under the limit. The SOC sees a high volume of failed login attempts but no single IP is blocked. The analyst realizes the attack is distributed and implements additional rate limiting based on the number of failed attempts per user account (regardless of IP) and uses CAPTCHA after a threshold. The mistake is relying solely on IP-based limits. The correct approach is multi-layered: per-IP, per-account, and behavior-based.
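The numbers in this scenario can be replayed in a small simulation. The sketch below models one second of the attack with 1000 IPs sending 5 failed logins each. The per-IP limit of 10 comes from the scenario; the per-account threshold of 5 and the assumption that every attempt targets the same account are hypothetical choices made for illustration.

```python
from collections import Counter

PER_IP_LIMIT = 10              # requests per second per IP, from the scenario
ACCOUNT_FAILURE_THRESHOLD = 5  # hypothetical: failures allowed before a CAPTCHA


def simulate_one_second(attempts):
    """Process (ip, account) failed logins that all arrive within one second.

    Returns (blocked_by_ip, challenged_by_account).
    """
    per_ip = Counter()
    per_account = Counter()
    blocked_by_ip = 0
    challenged_by_account = 0
    for ip, account in attempts:
        per_ip[ip] += 1
        if per_ip[ip] > PER_IP_LIMIT:
            blocked_by_ip += 1
            continue
        # The per-account counter ignores which IP the attempt came from.
        per_account[account] += 1
        if per_account[account] > ACCOUNT_FAILURE_THRESHOLD:
            challenged_by_account += 1
    return blocked_by_ip, challenged_by_account


# 1000 distinct IPs, 5 attempts each, all against one (hypothetical) account.
attempts = [
    (f"10.0.{i // 250}.{i % 250}", "victim@example.com")
    for i in range(1000)
    for _ in range(5)
]
blocked_by_ip, challenged_by_account = simulate_one_second(attempts)
print(len(attempts), blocked_by_ip, challenged_by_account)
```

The per-IP rule never fires because no address exceeds 10 requests, while the per-account rule flags every attempt after the fifth. An attack spread across many accounts would need the behavior-based layer the scenario mentions.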
What SY0-701 Tests
Objective 3.1 covers secure system design, including API security. You must understand:
OAuth 2.0 roles (resource owner, client, authorization server, resource server) and grant types (authorization code, implicit, client credentials, resource owner password).
JWT structure (header, payload, signature) and common claims (exp, iss, sub, aud).
Rate limiting as a defense against brute-force and DoS.
Token security: short expiration, refresh tokens, secure storage (HttpOnly cookies).
Common Wrong Answers
Confusing OAuth with authentication: Many choose 'OAuth is for authentication' because they see 'login with Google'. Reality: OAuth is authorization; OpenID Connect is authentication on top of OAuth.
Thinking JWT is encrypted: JWT is signed, not encrypted by default. The payload is Base64-encoded, not encrypted. Encryption requires JWE (JSON Web Encryption).
Choosing 'Rate limiting prevents all DoS': Rate limiting mitigates but does not prevent distributed attacks. It is one layer.
Selecting 'Implicit grant is most secure': Implicit grant is deprecated because the token is exposed in the URL. Authorization code with PKCE is more secure.
Key Terms
Authorization: Bearer <token>
HTTP 429 Too Many Requests
Retry-After header
alg: none vulnerability
PKCE (Proof Key for Code Exchange)
Token bucket algorithm
Trick Questions
Question that asks 'Which grant type is best for mobile apps?' Answer: Authorization Code with PKCE (not Implicit).
Question that shows a JWT without a signature: Look for 'alg: none' as the vulnerability.
Question about rate limiting headers: X-RateLimit-Remaining indicates remaining requests.
Decision Rule
If the scenario involves a third-party app accessing user data without sharing password → OAuth. If the scenario involves a token that contains user info and is verified by signature → JWT. If the scenario involves limiting request frequency → Rate limiting.
OAuth 2.0 is an authorization framework, not authentication. Use OpenID Connect for authentication.
JWT consists of three parts: Header (algorithm), Payload (claims), Signature (integrity).
Never trust a JWT with 'alg: none' — always validate the algorithm against a whitelist.
Authorization Code grant with PKCE is the most secure flow for public clients (mobile, SPA).
Rate limiting uses algorithms like Token Bucket and Leaky Bucket; always include Retry-After header with HTTP 429.
Token expiration should be short (minutes) for access tokens; refresh tokens can be longer but must be revocable.
Common JWT claims: exp, iss, sub, aud, iat, nbf.
Rate limiting is a mitigation, not a prevention — combine with WAF and anomaly detection.
These come up on the exam all the time. Here's how to tell them apart.
OAuth 2.0 Authorization Code Grant
Uses an authorization code exchanged for token server-side
Token never exposed to user agent
Supports refresh tokens
Requires client secret (except public clients with PKCE)
More secure, recommended for all apps
OAuth 2.0 Implicit Grant (Deprecated)
Token returned directly in URL fragment
Token exposed to browser history and referrer headers
No refresh token
No client authentication
Deprecated due to security issues
Mistake
OAuth 2.0 is an authentication protocol.
Correct
OAuth 2.0 is an authorization framework. It delegates access, not identity. Authentication is handled by OpenID Connect (OIDC), which sits on top of OAuth 2.0.
Mistake
JWTs are encrypted and therefore secure for storing sensitive data.
Correct
JWT is signed, not encrypted. The payload is Base64-encoded, which is reversible. Sensitive data should not be stored in a JWT unless using JWE (JSON Web Encryption).
Mistake
Rate limiting by IP address is sufficient to prevent brute-force attacks.
Correct
IP-based rate limiting can be bypassed using botnets or VPNs. Effective rate limiting should be multi-layered: per IP, per user account, and per session.
Mistake
The implicit grant in OAuth 2.0 is the most secure for single-page apps.
Correct
The implicit grant is deprecated due to security risks (token in URL). The recommended flow for SPAs is the authorization code grant with PKCE.
Mistake
A JWT with a valid signature is always trustworthy.
Correct
A valid signature only proves the token has not been tampered with. The token could be stolen (replay) or issued by a compromised issuer. Always validate claims like `exp`, `iss`, and `aud`.
OAuth 2.0 is an authorization framework that allows a third-party app to access resources on behalf of a user. OpenID Connect (OIDC) is an authentication layer built on top of OAuth 2.0. OIDC adds an ID token (JWT) that contains user identity claims, allowing the client to verify the user's identity. In short: OAuth grants access, OIDC confirms identity. For the exam, remember that 'Login with Google' uses OIDC, not just OAuth.
Rate limiting restricts the number of requests a client can make in a given time window. For brute-force attacks (e.g., guessing passwords), rate limiting slows down the attacker to a point where the attack becomes impractical. For example, at 5 login attempts per minute an attacker gets about 2.6 million guesses per year against one account (5 × 60 × 24 × 365 = 2,628,000), so working through a 10-million-entry dictionary would take close to four years. However, rate limiting alone is not foolproof; attackers may use distributed IPs. Combine with account lockout and CAPTCHA for stronger defense.
Some JWT libraries accept tokens with the algorithm header set to 'none', meaning no signature is used. An attacker can create a valid-looking token by simply base64-encoding a header and payload with 'alg: none' and an empty signature. If the server does not reject such tokens, the attacker can forge arbitrary claims. Mitigation: always validate that the algorithm is a supported, non-'none' value. On the exam, if you see a JWT with an empty signature or 'alg: none', that is the vulnerability.
Neither is perfectly secure, but cookies with the HttpOnly and Secure flags are generally safer because they are not accessible via JavaScript, reducing the risk of XSS-based token theft. localStorage is vulnerable to XSS. However, cookies are susceptible to CSRF attacks. The best practice is to use cookies with SameSite=Strict (or Lax) and implement CSRF tokens. For SPAs, the authorization code flow with PKCE and storing tokens in memory (not persistent) is recommended.
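As a small illustration of the cookie flags mentioned above, the sketch below builds a `Set-Cookie` value with Python's standard `http.cookies` module. The cookie name, token value, and 15-minute lifetime are hypothetical; a real application would normally set these through its web framework.

```python
from http.cookies import SimpleCookie


def session_cookie_header(token: str) -> str:
    """Return a Set-Cookie header value with the hardening flags set."""
    cookie = SimpleCookie()
    cookie["session"] = token                 # hypothetical cookie name
    cookie["session"]["httponly"] = True      # not readable from JavaScript
    cookie["session"]["secure"] = True        # sent over HTTPS only
    cookie["session"]["samesite"] = "Strict"  # not sent on cross-site requests
    cookie["session"]["path"] = "/"
    cookie["session"]["max-age"] = 900        # hypothetical 15-minute lifetime
    return cookie["session"].OutputString()


header_value = session_cookie_header("opaque-session-id")
print("Set-Cookie:", header_value)
```

`HttpOnly` addresses the XSS theft risk, while `SameSite` reduces (but does not replace) the need for CSRF tokens.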
PKCE (Proof Key for Code Exchange) is an extension to OAuth 2.0 that prevents authorization code interception attacks. It is especially important for public clients (mobile apps, SPAs) that cannot securely store a client secret. PKCE uses a dynamically generated code verifier (a random string) and a code challenge (hash of the verifier). The client sends the challenge during the authorization request and the verifier during the token request. The authorization server verifies that the verifier matches the challenge. This ensures that even if the authorization code is intercepted, the attacker cannot exchange it without the verifier.
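The verifier and challenge described above can be generated with the standard library alone. This sketch assumes the `S256` challenge method (SHA-256 of the verifier, Base64url-encoded without padding) as defined in RFC 7636; the function names are hypothetical, and the server-side check is a simplified stand-in for what an authorization server does at the token endpoint.

```python
import base64
import hashlib
import hmac
import secrets


def make_code_verifier() -> str:
    """Random URL-safe string; 32 random bytes encode to 43 characters."""
    return secrets.token_urlsafe(32)


def make_code_challenge(verifier: str) -> str:
    """S256 method: Base64url(SHA-256(verifier)) with padding removed."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def server_accepts(verifier: str, stored_challenge: str) -> bool:
    """What the authorization server checks when the code is exchanged."""
    return hmac.compare_digest(make_code_challenge(verifier), stored_challenge)


# Client side: keep the verifier private, send only the challenge in the
# authorization request (with code_challenge_method=S256).
verifier = make_code_verifier()
challenge = make_code_challenge(verifier)

# Token request: the real client presents the verifier and is accepted;
# an attacker holding only the intercepted code has no valid verifier.
print(server_accepts(verifier, challenge))
print(server_accepts(make_code_verifier(), challenge))
```

Because the challenge is a one-way hash, seeing it in the authorization request does not help an attacker produce the verifier.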
HTTP 429 Too Many Requests. The server should also include a Retry-After header indicating how long the client should wait before making a new request. Common headers include X-RateLimit-Limit (maximum requests allowed), X-RateLimit-Remaining (remaining requests in the current window), and X-RateLimit-Reset (time when the window resets). For the exam, know that 429 is the standard response for rate limiting.
JWTs are stateless, meaning the server does not store them. To revoke a JWT before its natural expiration, you need a token blacklist (e.g., in Redis) that the resource server checks on every request. Alternatively, you can rotate the signing key (which invalidates all tokens) or use short expiration times (e.g., 15 minutes) with refresh tokens that can be revoked. On the exam, remember that stateless tokens require additional infrastructure for revocation.
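A token blacklist only needs to remember revoked tokens until they would have expired anyway. The sketch below uses an in-memory dictionary as a stand-in for a shared store such as Redis, and it identifies tokens by the `jti` (JWT ID) claim from RFC 7519, which is an assumption since the text above does not say how tokens are keyed. The class and method names are hypothetical.

```python
class TokenDenylist:
    """In-memory stand-in for a shared revocation store such as Redis."""

    def __init__(self):
        self._revoked = {}  # maps jti -> exp (Unix timestamp)

    def revoke(self, jti, exp):
        """Remember a revoked token until its natural expiry."""
        self._revoked[jti] = exp

    def is_revoked(self, jti, now):
        self._purge(now)
        return jti in self._revoked

    def _purge(self, now):
        # Entries past their exp can be dropped: the exp check already
        # rejects those tokens, so the list stays small.
        for jti in [j for j, exp in self._revoked.items() if exp <= now]:
            del self._revoked[jti]


denylist = TokenDenylist()
denylist.revoke("token-abc", exp=1900)  # hypothetical jti, e.g. after logout

print(denylist.is_revoked("token-abc", now=1000))  # checked on every request
print(denylist.is_revoked("token-xyz", now=1000))
```

The lookup on every request is the cost of revocation: the token itself stays stateless, but the resource server no longer is.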