SOA-C02Chapter 96 of 104Objective 6.2

RDS Performance Insights

This chapter covers RDS Performance Insights, a powerful feature for monitoring database performance and diagnosing bottlenecks. For the SOA-C02 exam, Performance Insights is a key topic under Domain 6 (Cost Optimization) and Objective 6.2 (Implement cost optimization strategies for databases). Approximately 5-10% of exam questions touch on monitoring and performance tools, with Performance Insights frequently appearing in scenarios involving troubleshooting high CPU or IO wait events. You will learn how to enable Performance Insights, interpret the dashboard, analyze DB Load and top SQL, and understand the cost implications of retention periods—all essential for both the exam and real-world SysOps roles.

25 min read
Intermediate
Updated May 31, 2026

Performance Insights as a Car's Dashboard

Imagine you are driving a car and want to understand why your fuel efficiency dropped suddenly. Your car has a dashboard that shows real-time metrics: speed, RPM, fuel level, and engine temperature. However, these gauges only show current values—they don't tell you what happened five minutes ago when you accelerated hard. Performance Insights is like a high-fidelity data recorder that captures a snapshot of every gauge every second, storing the last 24 hours of data in a circular buffer. It also provides a visual timeline (like a graph of RPM over time) so you can correlate a drop in fuel efficiency with that rapid acceleration event. Moreover, Performance Insights identifies the top SQL queries that consumed the most engine load—similar to pinpointing that 'flooring the accelerator' was the culprit. In AWS RDS, Performance Insights collects database load (DB Load) metrics, breaks them down by wait events, SQL queries, hosts, and users, and presents them in a dashboard that helps you quickly diagnose performance bottlenecks. The 'engine load' is analogous to the car's RPM—it measures how busy the database engine is. By analyzing the top wait events (like CPU, IO, or lock waits), you can identify whether the bottleneck is compute, storage, or contention. The feature is always on for the last hour for free, and you can pay for longer retention (up to two years) if you need historical analysis. Just as a car's data recorder helps a mechanic diagnose issues, Performance Insights helps DBAs pinpoint exactly which query or wait event is causing performance degradation.

How It Actually Works

What is RDS Performance Insights?

RDS Performance Insights is a managed database performance monitoring and tuning feature available for Amazon RDS (including Aurora, MySQL, MariaDB, PostgreSQL, Oracle, and SQL Server) and Amazon Aurora. It provides an easy-to-understand visual dashboard that helps you quickly assess the load on your database and determine when and where the load is coming from. The primary metric is Database Load (DB Load), which is measured in Average Active Sessions (AAS). DB Load represents the number of active sessions (queries or connections) that are currently being processed by the database engine. A higher DB Load indicates more contention and potential performance issues.

Why Performance Insights Exists

Traditional RDS monitoring (CloudWatch metrics) gives you aggregate metrics like CPU utilization, DatabaseConnections, and Read/Write IOPS. These metrics show you *what* is happening but not *why*. For example, a spike in CPU utilization could be caused by a single expensive query, a lock contention, or a burst of connections. Performance Insights fills this gap by providing per-second granularity of DB Load and breaking it down by: - Wait events (e.g., CPU, IO, lock, latch, network) - SQL queries (top SQL by load) - Hosts (client IP addresses) - Users (database users)

This breakdown allows you to drill down to the root cause of a performance issue within minutes.

How It Works Internally

Performance Insights is built on the RDS Performance Insights Agent, which runs on the database host (for RDS) or within the Aurora cluster. The agent collects wait event statistics from the database engine at high frequency (up to once per second) and aggregates them into a time-series dataset. For example, in PostgreSQL, it queries pg_stat_activity and pg_wait_events; in MySQL, it reads performance_schema; for Oracle, it uses V$ views. The agent normalizes these wait events into a common set of categories (like CPU, IO, Lock, etc.) and stores the data in a circular buffer that retains the last hour of data at no extra cost. For longer retention (up to 2 years), you pay per GB-month of data stored.

Key Components, Values, Defaults, and Timers

- DB Load (Average Active Sessions): The primary metric. A value of 1.0 means on average one session is active. If your DB load consistently exceeds the number of vCPUs, your database is likely overloaded. - Wait Events: Categorized into: - CPU: The session is actively using CPU. - IO: The session is waiting for disk or network I/O. - Lock: The session is waiting for a database lock (row, table, or transaction). - Latch: The session is waiting for a low-level internal lock (e.g., memory structure). - Network: The session is waiting for data transfer over the network. - Other: Miscellaneous waits. - Top SQL: Shows the SQL queries contributing the most to DB Load. You can see the full query text, load contribution, and execution plan (if available). - Retention Periods: - Free tier: Last hour of data (7 days of aggregated data at 1-hour granularity). - Paid tier: 7 days, 1 month, 2 months, 3 months, 6 months, 12 months, or 24 months. Pricing is per GB-month of data stored. - Performance Insights API: You can programmatically retrieve data using the GetResourceMetrics, GetResourceMetadata, and ListAvailableResourceDimensions API calls.

Configuration and Verification Commands

To enable Performance Insights on an existing RDS instance, you can use the AWS Management Console, CLI, or SDK. The CLI command to modify an existing DB instance:

aws rds modify-db-instance \
    --db-instance-identifier mydb \
    --enable-performance-insights \
    --performance-insights-retention-period 7 \
    --performance-insights-kms-key-id my-kms-key-id

To enable during creation:

aws rds create-db-instance \
    --db-instance-identifier mydb \
    --engine mysql \
    --enable-performance-insights \
    --performance-insights-retention-period 7

To verify the status:

aws rds describe-db-instances --db-instance-identifier mydb --query 'DBInstances[0].PerformanceInsightsEnabled'

You can also use the Performance Insights dashboard in the RDS console to view the data. The dashboard typically shows a line chart of DB Load over time, with color-coded areas for different wait event categories. Below the chart, you see a table of top SQL queries, hosts, or users.

Interaction with Related Technologies

Performance Insights works alongside other monitoring tools: - CloudWatch Metrics: Performance Insights does not replace CloudWatch; it complements it. CloudWatch gives you aggregate metrics (like CPUUtilization, FreeableMemory), while Performance Insights gives you session-level granularity. - RDS Enhanced Monitoring: Enhanced Monitoring provides OS-level metrics (CPU, memory, disk I/O, network) for the RDS host. Performance Insights focuses on database engine internals. Together, they provide a full stack view. - RDS Event Notifications: You can set up CloudWatch alarms on DB Load or other metrics to trigger automated actions (e.g., scale up). - AWS Lambda: You can integrate Performance Insights with Lambda for automated remediation (e.g., kill a runaway query).

Exam-Relevant Details

Free vs Paid Retention: The free tier retains the last hour of data at 1-second granularity. For longer retention (7 days to 2 years), you pay. The exam may ask about the default retention period or cost implications.

Supported Engines: All major engines except for SQL Server Express Edition and some older versions. Check the latest documentation for engine version support.

Data Volume: Performance Insights stores data in a separate managed service. The cost depends on the retention period and the amount of load data generated.

Security: You can encrypt Performance Insights data using a customer-managed KMS key. If you don't specify one, AWS uses the default RDS encryption key.

Permissions: To view Performance Insights data, the IAM user/role needs rds:DescribeDBInstances and pi:GetResourceMetrics permissions. The exam may test least privilege scenarios.

Common Exam Scenarios

High DB Load with CPU wait events: The database is CPU-bound. Solution: Optimize queries, add indexes, or scale up the instance class.

High DB Load with IO wait events: The database is I/O-bound. Solution: Optimize queries to reduce IO, increase IOPS (provisioned or gp3), or use read replicas.

High DB Load with Lock wait events: There is a blocking query. Use Performance Insights to identify the blocking session and kill it or optimize the transaction logic.

Cost Optimization: The exam may ask how to reduce costs while retaining performance data. Answer: Use the free 1-hour retention for real-time troubleshooting; only pay for longer retention if required for compliance or historical analysis.

Walk-Through

1

Enable Performance Insights

In the RDS console, select your DB instance and modify it to enable Performance Insights. You can choose a retention period (Free tier: 1 hour; Paid: 7 days to 2 years) and optionally specify a KMS key for encryption. The change applies immediately (no reboot required for most engines). Alternatively, use the CLI command `aws rds modify-db-instance --enable-performance-insights`. Once enabled, the Performance Insights agent starts collecting wait event data from the database engine. This step is typically a one-time setup.

2

Access the Performance Insights Dashboard

Navigate to the RDS console, select your DB instance, and click the 'Performance Insights' tab. You will see a line chart showing DB Load (Average Active Sessions) over time, with a default time range of the last hour. The chart is color-coded by wait event categories (CPU, IO, Lock, etc.). You can adjust the time range to look at historical data if you have a paid retention period. Below the chart, there are tabs for Top SQL, Top Hosts, and Top Users, which show the top contributors to DB Load.

3

Interpret the DB Load Chart

The DB Load chart shows the number of active sessions over time. If the load exceeds the number of vCPUs (e.g., 4 vCPUs = 4 AAS is a rough threshold), the database is likely overloaded. Hover over a spike to see the breakdown of wait events at that moment. For example, a spike with mostly red (CPU) indicates CPU-bound; blue (IO) indicates I/O-bound; orange (Lock) indicates blocking. The chart helps you identify the time window and the primary wait category causing the issue.

4

Identify Top SQL Queries

Click on the 'Top SQL' tab to see the SQL queries that contributed the most to DB Load during the selected time range. The table shows the query text (truncated), load contribution (percentage of total DB Load), and average active sessions. You can expand a query to see its execution plan (if available) and the full text. This helps you pinpoint which query is the 'heavy hitter'. For example, a query with 50% load might be missing an index or performing a full table scan.

5

Drill Down into a Specific Query

Click on a specific SQL query to see its load over time and its distribution by wait events. This allows you to see if the query is consistently causing load or only during certain periods. You can also see the hosts and users that are running this query. If the query is causing lock waits, you can identify the blocking session. Use this information to decide whether to optimize the query, add an index, or kill the session if it's a runaway query.

6

Take Remedial Action

Based on the analysis, take appropriate action. If the query is CPU-bound, consider rewriting it or adding indexes. If I/O-bound, increase IOPS or use read replicas. If lock waits, kill the blocking session using `CALL mysql.rds_kill(thread-id)` (MySQL) or `pg_terminate_backend(pid)` (PostgreSQL). For Aurora, you can use the Performance Insights dashboard to directly kill a session. After remediation, monitor the DB Load chart to confirm the issue is resolved.

What This Looks Like on the Job

Enterprise Scenario 1: E-Commerce Platform During Black Friday

A large e-commerce company runs its product catalog database on RDS MySQL. During Black Friday, they observe increased page load times and checkout failures. CloudWatch shows high CPU utilization (90%+), but they cannot pinpoint the cause. They enable Performance Insights and see that DB Load is 12 AAS on a 4-vCPU instance, with 80% of the load from CPU wait events. The Top SQL tab reveals a single query SELECT * FROM products WHERE category_id = ? ORDER BY price DESC is consuming 60% of the load. This query lacks an index on category_id, causing a full table scan. The DBA adds an index, and the DB Load drops to 4 AAS. Performance Insights helped them resolve the issue in minutes instead of hours. In production, they also set up a CloudWatch alarm on DB Load > 8 to automatically notify the on-call engineer. The key lesson: Performance Insights is invaluable for real-time troubleshooting during traffic spikes.

Enterprise Scenario 2: SaaS Provider with Multi-Tenant Database

A SaaS provider uses RDS PostgreSQL for a multi-tenant application. They notice that one tenant's workload is causing performance degradation for all tenants. They use Performance Insights to filter by host (tenant IP) and see that a specific tenant is running a heavy reporting query every hour. The query causes lock waits on a shared table, blocking other tenants. They identify the query and work with the tenant to schedule it during off-peak hours. Additionally, they use Performance Insights to set a baseline of normal DB Load (e.g., 2 AAS) and alert when it exceeds 5 AAS. This proactive monitoring helps them maintain SLAs. Common mistake: Not enabling Performance Insights on all instances due to cost concerns, but the free tier provides the last hour of data, which is often sufficient for troubleshooting.

Scenario 3: Database Migration Validation

A company migrates from Oracle to Aurora PostgreSQL. After migration, they use Performance Insights to compare DB Load patterns. They notice high IO wait events on Aurora that were not present on Oracle. Investigation reveals that Aurora's default buffer pool size is smaller, causing more disk reads. They adjust the innodb_buffer_pool_size parameter (Aurora MySQL) or increase the instance size. Performance Insights provides the necessary data to validate that the new database performs as expected. Without it, they might have blamed the migration for performance issues that were actually configuration-related.

How SOA-C02 Actually Tests This

Exactly What SOA-C02 Tests on This Topic

Performance Insights falls under Domain 6 (Cost Optimization), Objective 6.2: 'Implement cost optimization strategies for databases.' The exam tests your ability to:

Understand the cost differences between free (1-hour retention) and paid retention periods.

Identify scenarios where Performance Insights can reduce costs by preventing unnecessary upgrades (e.g., scale up when the real issue is a bad query).

Recognize that Performance Insights is a monitoring tool, not a tuning tool—it helps you identify the problem, but you must take action.

Know the default retention period (free: 1 hour) and the maximum paid retention (24 months).

Understand that Performance Insights data is stored in a separate service and incurs storage costs.

Be aware that Performance Insights is supported on all RDS engines except SQL Server Express.

Common Wrong Answers and Why Candidates Choose Them

1.

Wrong: 'Performance Insights replaces CloudWatch metrics.' Candidates think it's an all-in-one monitoring solution. Reality: CloudWatch provides aggregate metrics (CPU, memory, IOPS) while Performance Insights provides session-level wait event analysis. They complement each other.

2.

Wrong: 'Performance Insights automatically fixes performance issues.' It is a diagnostic tool, not an auto-remediation tool. You must interpret the data and take action (e.g., kill a query, add an index).

3.

Wrong: 'The free tier retains 7 days of data.' This is a common trap. The free tier retains the last hour of data at 1-second granularity. 7-day retention is paid.

4.

Wrong: 'You must reboot the instance to enable Performance Insights.' In most cases, enabling Performance Insights does not require a reboot. However, for some engine versions, a reboot may be needed (check documentation). The exam expects you to know it's a no-reboot change.

Specific Numbers, Values, and Terms

DB Load is measured in Average Active Sessions (AAS).

Free retention: 1 hour (60 minutes) of data at 1-second granularity.

Paid retention: 7 days, 1 month, 2 months, 3 months, 6 months, 12 months, 24 months.

Supported engines: Amazon Aurora (MySQL and PostgreSQL compatible), RDS for MySQL, MariaDB, PostgreSQL, Oracle, SQL Server (except Express Edition).

Permissions: pi:GetResourceMetrics and rds:DescribeDBInstances.

Maximum retention: 24 months (2 years).

Edge Cases and Exceptions

Aurora Serverless: Performance Insights is supported for Aurora Serverless v1 and v2. For Serverless v1, there is a 1-hour retention limit (no paid retention). For v2, paid retention is available.

SQL Server Express: Not supported.

Cross-Region Performance Insights: You can view Performance Insights data for cross-region read replicas, but the data is stored in the replica's region.

KMS Encryption: If you enable Performance Insights with a custom KMS key, you must grant the RDS service principal permissions to use the key. Failure to do so will cause Performance Insights to fail.

How to Eliminate Wrong Answers

When you see a question about Performance Insights, first identify the problem: is it about cost, performance diagnosis, or configuration? For cost questions, remember that the free tier gives you only the last hour. For diagnosis questions, think about what metric is being high (CPU, IO, Lock) and what the corresponding wait event category would be. If the question mentions 'high CPU utilization', the Performance Insights dashboard would show high DB Load with CPU wait events. If it mentions 'slow queries due to blocking', look for Lock wait events. Eliminate any answer that suggests Performance Insights automatically resolves the issue or that it replaces other monitoring tools.

Key Takeaways

Performance Insights measures DB Load in Average Active Sessions (AAS) — a value > number of vCPUs indicates overload.

Free tier retains only the last hour of high-resolution data; paid retention can go up to 24 months.

DB Load is broken down by wait events: CPU, IO, Lock, Latch, Network, and Other.

Top SQL tab shows the queries contributing the most load, helping pinpoint problematic queries.

Performance Insights does not automatically fix issues; it provides data for manual or automated remediation.

Enhanced Monitoring and Performance Insights are complementary — use both for full visibility.

Enabling Performance Insights typically does not require a reboot.

Supported on all major RDS engines except SQL Server Express Edition.

IAM permissions needed: `pi:GetResourceMetrics` and `rds:DescribeDBInstances`.

Performance Insights data can be encrypted with a customer-managed KMS key.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

Performance Insights (Free Tier)

Retains last 1 hour of data at 1-second granularity.

No additional cost beyond the RDS instance.

Suitable for real-time troubleshooting of immediate issues.

Aggregated data at 1-hour granularity for last 7 days available (not high-res).

Cannot perform historical analysis beyond 1 hour.

Performance Insights (Paid Tier)

Retains data from 7 days up to 24 months at 1-second granularity.

Costs based on storage per GB-month.

Suitable for compliance, trend analysis, and capacity planning.

Full high-resolution data available for the entire retention period.

Allows you to compare current performance to historical baselines.

Watch Out for These

Mistake

Performance Insights is the same as Enhanced Monitoring.

Correct

Enhanced Monitoring provides OS-level metrics (CPU, memory, disk I/O) from the hypervisor. Performance Insights provides database engine-level wait event analysis and SQL breakdown. They are complementary, not identical.

Mistake

Enabling Performance Insights requires a database reboot.

Correct

For most RDS engines, enabling Performance Insights does not require a reboot. The change is applied dynamically. However, for some older engine versions, a reboot may be necessary. Always check the documentation.

Mistake

The free tier of Performance Insights retains 7 days of data.

Correct

The free tier retains only the last hour of data at 1-second granularity. 7-day retention requires a paid subscription. The aggregated 1-hour data points for the last 7 days are available for free, but not the high-resolution data.

Mistake

Performance Insights can automatically kill runaway queries.

Correct

Performance Insights is a monitoring and diagnostic tool. It does not automatically remediate issues. You can manually kill a query from the console or via CLI, but the tool itself does not take automated actions.

Mistake

Performance Insights works with all RDS engines without any prerequisites.

Correct

Performance Insights requires that the database engine version supports it. For example, SQL Server Express Edition is not supported. Also, for MySQL, the Performance Schema must be enabled (it is enabled by default on RDS for MySQL 5.6+).

Do You Actually Know This?

Reveal each answer, then mark whether you got it right. Score 60%+ to unlock the next chapter.

Frequently Asked Questions

How do I enable Performance Insights on an existing RDS instance?

You can enable Performance Insights via the RDS console by modifying the DB instance and checking the 'Enable Performance Insights' checkbox. Alternatively, use the AWS CLI: `aws rds modify-db-instance --db-instance-identifier mydb --enable-performance-insights --performance-insights-retention-period 7`. No reboot is required for most engines. You can also specify a KMS key for encryption.

What is the cost of Performance Insights?

The free tier includes the last hour of data at 1-second granularity and aggregated data at 1-hour granularity for the last 7 days. For longer retention (7 days to 24 months), you pay per GB-month of data stored. Pricing varies by region. You can estimate costs using the AWS Pricing Calculator. There is no upfront charge; you only pay for the storage of performance data beyond the free tier.

How do I identify the top SQL queries causing high DB Load?

In the Performance Insights dashboard, click on the 'Top SQL' tab. You will see a list of SQL queries sorted by load contribution (percentage of total DB Load). You can expand a query to see its full text and execution plan. You can also filter by time range. This helps you identify which queries are the most resource-intensive.

Can Performance Insights be used with Aurora Serverless?

Yes, Performance Insights is supported for Aurora Serverless v1 and v2. For Aurora Serverless v1, only the free tier (1-hour retention) is available. For Aurora Serverless v2, you can opt for paid retention up to 24 months. Note that Aurora Serverless v1 does not support custom KMS keys for Performance Insights.

What permissions do I need to view Performance Insights data?

To view Performance Insights data, an IAM user or role needs the following permissions: `rds:DescribeDBInstances` (to list instances) and `pi:GetResourceMetrics` (to retrieve performance data). Additionally, `pi:GetResourceMetadata` may be needed for some operations. For least privilege, restrict these to specific resource ARNs.

How do I kill a query from Performance Insights?

In the Performance Insights dashboard, navigate to the 'Top SQL' tab and click on the query you want to kill. There is a 'Kill query' button (for RDS MySQL and PostgreSQL) that allows you to terminate the session. For Aurora, you can also use the 'Kill session' option. Alternatively, you can use the database-native command (e.g., `CALL mysql.rds_kill(thread-id)` for MySQL).

What is the difference between Performance Insights and Enhanced Monitoring?

Enhanced Monitoring provides OS-level metrics (CPU, memory, disk I/O, network) collected from the hypervisor, with granularity down to 1 second. Performance Insights provides database engine-level metrics, specifically DB Load broken down by wait events and SQL queries. Enhanced Monitoring helps you understand the health of the underlying host, while Performance Insights helps you understand what the database engine is doing. Both are complementary.

Terms Worth Knowing

Ready to put this to the test?

You've just covered RDS Performance Insights — now see how well it sticks with free SOA-C02 practice questions. Full explanations included, no account needed.

Done with this chapter?