SOA-C02 | Chapter 6 of 104 | Objective 3.2

AWS Systems Manager

This chapter covers AWS Systems Manager, a critical service for managing and operating EC2 instances and on-premises servers at scale. For the SOA-C02 exam, Systems Manager is a major topic, appearing in roughly 10-15% of questions across the Deployment and Operations domains. Mastering Systems Manager is essential because it provides a unified interface for patching, configuration, automation, and inventory without requiring direct SSH or RDP access. This chapter will explain the core components, how they work together, and exactly what you need to know for the exam.

25 min read
Intermediate
Updated May 31, 2026

Systems Manager: The Fleet Mechanic

Imagine a large trucking company with thousands of trucks (EC2 instances) and hundreds of trailers (on-premises servers) spread across the country. The company needs to perform maintenance, install updates, check tire pressure (patch compliance), run diagnostics (run commands), and ensure each truck has the correct cargo manifest (configuration). Doing this manually at each location is impossible. AWS Systems Manager is like a central fleet mechanic's office. Each truck has a standardized onboard computer (SSM Agent) that reports to the mechanic's office over a secure radio channel (SSM endpoint). The mechanic can send a command to a specific truck or a group of trucks (targeting using tags or resource groups) to run a diagnostic script (Run Command), install a new engine control software update (Patch Manager), or change the cargo routing configuration (State Manager). Importantly, the mechanic doesn't need to drive to the truck; the truck initiates the connection to the office (outbound HTTPS) and listens for instructions. This means the trucks can be behind a private gate (private subnet with no inbound SSH) and still receive commands. The mechanic's office also keeps a log of every action performed on each truck (Systems Manager Inventory and CloudTrail). If a truck's onboard computer fails (SSM Agent not running), the mechanic cannot communicate with it, and the truck becomes 'unmanaged'. The fleet mechanic can also create 'maintenance windows' – scheduled times when trucks are out of service – to apply updates without disrupting deliveries.

How It Actually Works

What is AWS Systems Manager?

AWS Systems Manager is a suite of tools that gives you operational control over your AWS resources, especially EC2 instances and on-premises servers. It allows you to automate operational tasks, manage configuration, apply patches, collect inventory, and run commands across a fleet of instances, all from a single console or API. The key differentiator is that Systems Manager operates through an agent-based model: the SSM Agent runs on each instance and communicates outbound to the AWS Systems Manager service. This eliminates the need for inbound SSH or RDP ports, making it ideal for instances in private subnets or behind restrictive firewalls.

How Systems Manager Works Internally

The SSM Agent, installed on the instance, initiates a connection to the Systems Manager service over HTTPS (port 443). It uses AWS Identity and Access Management (IAM) roles to authenticate. The agent polls the service for pending commands or tasks. When you issue a command via Run Command, State Manager, or Patch Manager, the service places the task in a queue. The agent picks it up, executes it locally, and reports the output and status back to the service. This polling-based model ensures that instances do not need to be directly accessible from the management console.

Key Components and Defaults

SSM Agent: Pre-installed on many AWS-provided AMIs, including Amazon Linux, Ubuntu Server, and Windows Server. For on-premises servers, you must install it manually. Keep the agent up to date: you can enable automatic agent updates in the Fleet Manager settings, or create a State Manager association that runs the AWS-UpdateSSMAgent document on a schedule.

Systems Manager Endpoints: By default, the agent connects to ssm.<region>.amazonaws.com. For instances in private subnets without internet access, you must create VPC Endpoints for Systems Manager (com.amazonaws.<region>.ssm, com.amazonaws.<region>.ec2messages, com.amazonaws.<region>.ssmmessages).

IAM Roles: The instance must have an IAM role with the AmazonSSMManagedInstanceCore managed policy attached. This policy grants permissions for the agent to communicate with Systems Manager and perform core actions.

Resource Groups: Systems Manager uses resource groups to target instances. Resource groups are collections of resources that share tags or CloudFormation stack membership.

Document: A Systems Manager document (SSM document) defines the action that Systems Manager performs on an instance. Documents can be AWS-managed (e.g., AWS-RunShellScript, AWS-ApplyPatchBaseline) or custom. Documents are written in JSON or YAML and include schema version, description, parameters, and steps.

Maintenance Windows: Schedule windows of time to run tasks. You set a duration in hours (e.g., 2 hours) and a cutoff, also in whole hours (e.g., 1 hour before the end), after which Systems Manager stops starting new tasks.

Rate Controls: When running commands on a large fleet, you can set concurrency (number of instances to run simultaneously) and error threshold (how many failures before the command stops).
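To make the Document component above more concrete, here is a minimal sketch of a custom Command document, built as a Python dictionary and serialized to JSON. The description, the Message parameter, and the step name printMessage are illustrative assumptions, not AWS-managed content; schema version 2.2 and the aws:runShellScript plugin are the usual choices for Command documents.

```python
import json


def build_command_document(default_message: str = "hello") -> dict:
    """Return a hypothetical Command-type SSM document (schema version 2.2)."""
    return {
        "schemaVersion": "2.2",
        "description": "Example: echo a message on a Linux managed node",
        "parameters": {
            # 'Message' is an illustrative parameter name, not an AWS default.
            "Message": {
                "type": "String",
                "description": "Text to print",
                "default": default_message,
            }
        },
        "mainSteps": [
            {
                "action": "aws:runShellScript",
                "name": "printMessage",
                # {{ Message }} is substituted by Systems Manager at run time.
                "inputs": {"runCommand": ["echo '{{ Message }}'"]},
            }
        ],
    }


document_json = json.dumps(build_command_document(), indent=2)
print(document_json)
```

With a boto3 SSM client, a document like this could be registered with create_document(Content=document_json, Name=..., DocumentType="Command", DocumentFormat="JSON"); the name you choose is up to you.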

Configuration and Verification Commands

To verify that an instance is managed by Systems Manager, check the following:

The SSM Agent is running: sudo systemctl status amazon-ssm-agent (Linux) or Get-Service AmazonSSMAgent (Windows).

The instance has the correct IAM role attached.

The instance can reach the Systems Manager endpoints: curl https://ssm.<region>.amazonaws.com.

In the AWS Console, navigate to Systems Manager > Fleet Manager. The instance should appear with a status of 'Online'.

You can also use the AWS CLI to list managed instances:

aws ssm describe-instance-information

This returns a list of instances, their ping status (Online/ConnectionLost), agent version, and platform type.
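The same check can be scripted. The sketch below assumes a boto3 SSM client is passed in (for example boto3.client("ssm")) and returns every managed node whose ping status is not Online. The function name is hypothetical; the response fields are the ones returned by describe_instance_information.

```python
def find_unreachable_nodes(ssm_client) -> list[dict]:
    """Return managed nodes whose PingStatus is anything other than 'Online'.

    `ssm_client` is assumed to be a boto3 SSM client.
    """
    unreachable = []
    kwargs = {}
    while True:
        page = ssm_client.describe_instance_information(**kwargs)
        for node in page.get("InstanceInformationList", []):
            if node.get("PingStatus") != "Online":
                unreachable.append(
                    {
                        "InstanceId": node["InstanceId"],
                        "PingStatus": node.get("PingStatus"),
                        "AgentVersion": node.get("AgentVersion"),
                        "PlatformType": node.get("PlatformType"),
                    }
                )
        # Follow NextToken until the service reports no more pages.
        token = page.get("NextToken")
        if not token:
            return unreachable
        kwargs = {"NextToken": token}
```

An empty result means every registered node is currently reachable. Nodes that never registered at all will not appear in the response, so compare the result against your EC2 inventory when troubleshooting.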

How Systems Manager Interacts with Related Technologies

AWS Config: Systems Manager Inventory can be integrated with AWS Config rules to evaluate configuration compliance.

CloudTrail: All Systems Manager API calls are logged in CloudTrail, providing an audit trail of actions.

AWS Lambda: You can invoke Lambda functions from Automation runbooks using the aws:invokeLambdaFunction action.

Amazon EventBridge (formerly CloudWatch Events): You can trigger Systems Manager Automation runbooks or Run Command documents based on events (e.g., start a remediation runbook when an EC2 instance changes state).

AWS Resource Groups: Used for targeting instances. Resource groups are created based on tags or CloudFormation stacks.

VPC Endpoints: For instances in private subnets, you must create VPC endpoints for SSM, EC2 Messages, and SSM Messages. Without these, the agent cannot communicate with the service.

Patch Manager Deep Dive

Patch Manager automates the process of patching managed instances with security-related and other updates. It uses patch baselines that define which patches are approved for installation (e.g., only critical and security patches). You can create custom patch baselines for Windows (using Microsoft Update classifications and severities) and for Linux (using the packages available from the repositories configured on the instance). Patch Manager supports both scanning (reporting missing patches) and installation. The AWS-RunPatchBaseline document is used to scan for or apply patches. With the Install operation, the default RebootOption is RebootIfNeeded, so the instance reboots when a patch requires it; set RebootOption to NoReboot to suppress the reboot and restart the instance yourself later. This parameter applies to both Windows and Linux.
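As a rough illustration of how a patching run is requested, the sketch below builds the arguments for a Run Command call against AWS-RunPatchBaseline. The helper name and the Environment tag are hypothetical; the Operation and RebootOption parameter names and values are those of the AWS-managed document.

```python
VALID_OPERATIONS = {"Scan", "Install"}
VALID_REBOOT_OPTIONS = {"RebootIfNeeded", "NoReboot"}


def patch_command_kwargs(
    tag_key: str,
    tag_value: str,
    operation: str = "Scan",
    reboot_option: str = "RebootIfNeeded",
) -> dict:
    """Build send_command arguments for AWS-RunPatchBaseline on tagged nodes."""
    if operation not in VALID_OPERATIONS:
        raise ValueError(f"operation must be one of {sorted(VALID_OPERATIONS)}")
    if reboot_option not in VALID_REBOOT_OPTIONS:
        raise ValueError(f"reboot_option must be one of {sorted(VALID_REBOOT_OPTIONS)}")
    return {
        "DocumentName": "AWS-RunPatchBaseline",
        # Target every managed node that carries the given tag.
        "Targets": [{"Key": f"tag:{tag_key}", "Values": [tag_value]}],
        # Document parameters are always passed as lists of strings.
        "Parameters": {"Operation": [operation], "RebootOption": [reboot_option]},
    }
```

With a boto3 SSM client this would be used as ssm_client.send_command(**patch_command_kwargs("Environment", "prod", "Install", "NoReboot")). Defaulting to Scan keeps an accidental call from changing anything on the fleet.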

State Manager Deep Dive

State Manager ensures that instances are in a consistent state. You define a configuration (e.g., install specific software, configure a firewall rule) as an association, which ties an SSM document to a set of targets. The association is applied on a schedule (e.g., every 30 minutes) or when the instance is started. The association is stored in Systems Manager and the agent applies it. State Manager does not watch for changes in real time; if the configuration drifts, the next scheduled run reapplies it. This is ideal for enforcing compliance.
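A sketch of how such an association might be described in code follows. It builds the arguments for create_association with a rate-based schedule. The helper names, the association name, and the Environment tag are assumptions for illustration; AWS-UpdateSSMAgent is an AWS-managed document, and 30 minutes is treated here as the shortest supported rate.

```python
def rate_expression(minutes: int) -> str:
    """Turn an interval in minutes into a rate() schedule expression."""
    if minutes < 30:
        raise ValueError("association schedules are assumed to start at 30 minutes")
    if minutes % 60 == 0:
        hours = minutes // 60
        return f"rate({hours} hour{'s' if hours != 1 else ''})"
    return f"rate({minutes} minutes)"


def association_kwargs(
    document_name: str,
    association_name: str,
    tag_key: str,
    tag_value: str,
    every_minutes: int = 30,
) -> dict:
    """Build create_association arguments that bind a document to tagged nodes."""
    return {
        "Name": document_name,  # the SSM document to apply
        "AssociationName": association_name,
        "Targets": [{"Key": f"tag:{tag_key}", "Values": [tag_value]}],
        "ScheduleExpression": rate_expression(every_minutes),
    }
```

For example, association_kwargs("AWS-UpdateSSMAgent", "keep-agent-current", "Environment", "prod", 360) describes an association that reapplies every six hours; passing it to ssm_client.create_association(**kwargs) would create it.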

Automation Deep Dive

Automation allows you to run complex workflows against AWS resources (not just instances). For example, you can create an automation that stops an EC2 instance, creates an AMI, and then restarts the instance. Automation documents support branching, error handling, and looping. They can be triggered manually, by EventBridge, or by maintenance windows.
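The stop, image, restart workflow described above could be expressed as a runbook along these lines. This is a sketch, not an AWS-managed runbook: the step names and the image naming pattern are assumptions, while schema version 0.3 and the aws:changeInstanceState and aws:createImage actions are standard Automation building blocks.

```python
import json


def build_ami_runbook() -> dict:
    """Return a hypothetical Automation runbook: stop, create AMI, start."""
    instance = "{{ InstanceId }}"  # resolved by Automation at execution time
    return {
        "schemaVersion": "0.3",
        "description": "Stop an instance, create an AMI, then start it again",
        "parameters": {
            "InstanceId": {"type": "String", "description": "Instance to image"}
        },
        "mainSteps": [
            {
                "name": "stopInstance",
                "action": "aws:changeInstanceState",
                "inputs": {"InstanceIds": [instance], "DesiredState": "stopped"},
            },
            {
                "name": "createImage",
                "action": "aws:createImage",
                "inputs": {
                    "InstanceId": instance,
                    # Illustrative naming pattern using a built-in system variable.
                    "ImageName": "backup-{{ InstanceId }}-{{ global:DATE_TIME }}",
                    "NoReboot": True,  # the instance is already stopped
                },
            },
            {
                "name": "startInstance",
                "action": "aws:changeInstanceState",
                "inputs": {"InstanceIds": [instance], "DesiredState": "running"},
            },
        ],
    }


runbook_json = json.dumps(build_ami_runbook(), indent=2)
```

Steps run in the order listed unless a step specifies otherwise. A production runbook would usually also define an assumeRole and failure handling for each step.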

Session Manager Deep Dive

Session Manager provides secure shell access to instances, from the browser or the AWS CLI, without needing SSH keys or opening inbound ports. It uses the SSM Agent to establish a tunnel. Session start and end events are recorded in CloudTrail, and session output can be logged to S3 or CloudWatch Logs if you configure it. Session Manager can be used to access instances in private subnets if VPC endpoints are configured. It supports port forwarding for accessing databases or web servers on instances.

Parameter Store Deep Dive

Parameter Store is a hierarchical store for configuration data and secrets. It integrates with Systems Manager to provide parameters to commands and documents. Parameters can be plain text (String, StringList) or encrypted with KMS (SecureString). Parameter Store supports tiers: Standard (up to 10,000 parameters, 4 KB value size) and Advanced (up to 100,000 parameters, 8 KB value size, with parameter policies). You can reference a String or StringList parameter in a Run Command using {{ssm:/path/to/parameter}}; SecureString values are generally not resolved this way and are fetched with decryption inside the script instead.
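Because parameters are hierarchical, an application can load a whole subtree in one pass. The sketch below assumes a boto3 SSM client and a hypothetical path such as /myapp/prod; it pages through get_parameters_by_path and returns names relative to that path.

```python
def load_parameters(ssm_client, path: str) -> dict[str, str]:
    """Return {relative_name: value} for every parameter under `path`.

    `ssm_client` is assumed to be a boto3 SSM client. SecureString values
    are decrypted, so the caller needs kms:Decrypt on the relevant key.
    """
    prefix = path.rstrip("/") + "/"
    values: dict[str, str] = {}
    kwargs = {"Path": path, "Recursive": True, "WithDecryption": True}
    while True:
        page = ssm_client.get_parameters_by_path(**kwargs)
        for param in page.get("Parameters", []):
            name = param["Name"]
            # Strip the common prefix so callers see 'db/host', not the full path.
            key = name[len(prefix):] if name.startswith(prefix) else name
            values[key] = param["Value"]
        token = page.get("NextToken")
        if not token:
            return values
        kwargs["NextToken"] = token
```

Calling load_parameters(ssm_client, "/myapp/prod") would return something like {"db/host": "...", "db/password": "..."} for parameters stored under that path.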

Inventory Deep Dive

Inventory collects metadata from managed instances, including installed applications, Windows updates, network configuration, and files. You can configure inventory collection on a schedule using State Manager. The data is stored by Systems Manager and, if you configure a resource data sync, also in an S3 bucket; you can query it using the Systems Manager Inventory console or API. This helps with license management and compliance audits.

Compliance Deep Dive

Compliance in Systems Manager is based on patch compliance and association compliance. Patch compliance shows how many instances are compliant with your patch baseline. Association compliance shows whether instances have applied their State Manager associations. Compliance data is updated each time a scan or association runs. You can view compliance summaries in the console and set up Amazon SNS notifications for non-compliant instances.

Distributor Deep Dive

Distributor allows you to package and distribute software packages to managed instances. You create a package (a ZIP file with install scripts) and publish it. Then you can use State Manager or Run Command to install the package on instances. This is useful for deploying custom applications or third-party software consistently.

OpsCenter Deep Dive

OpsCenter aggregates operational issues (OpsItems) from various AWS services (e.g., CloudWatch alarms, AWS Config rules) and provides a central place to view and resolve them. OpsItems can be created manually or automatically. They can be assigned to operators, have severity levels, and be tracked through resolution.

Explorer Deep Dive

Explorer is a dashboard that provides a summary of your operational data, including OpsItems, patch compliance, and inventory. It allows you to filter by account, region, or resource group. Explorer helps you quickly identify operational health issues across your organization.

Hybrid Activations

To manage on-premises servers, you need to create a hybrid activation. This generates an activation code and ID that you use when installing the SSM Agent on the on-premises machine. The machine registers with Systems Manager and is treated similarly to an EC2 instance. You must ensure the on-premises server can reach the Systems Manager endpoints over the internet or via a VPC endpoint.
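The activation flow can be sketched as follows. The helper names are hypothetical, and the service role is assumed to exist already; create_activation and the agent's -register flags are the standard mechanism on Linux servers.

```python
import shlex


def register_command(activation_code: str, activation_id: str, region: str) -> str:
    """Build the Linux command that registers an installed SSM Agent."""
    args = [
        "sudo", "amazon-ssm-agent", "-register",
        "-code", activation_code,
        "-id", activation_id,
        "-region", region,
    ]
    # Quote each argument so unusual characters cannot break the shell command.
    return " ".join(shlex.quote(arg) for arg in args)


def create_hybrid_activation(
    ssm_client, iam_role: str, instance_name: str, region: str, registration_limit: int = 1
) -> str:
    """Create an activation and return the registration command for the server.

    `ssm_client` is assumed to be a boto3 SSM client and `iam_role` the name
    of an existing service role that trusts ssm.amazonaws.com.
    """
    response = ssm_client.create_activation(
        IamRole=iam_role,
        DefaultInstanceName=instance_name,
        RegistrationLimit=registration_limit,
    )
    return register_command(response["ActivationCode"], response["ActivationId"], region)
```

The activation code is shown only once, when the activation is created, so treat the returned command as a secret. After registering, restart the agent on the server so it picks up its new identity.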

Security and Permissions

IAM roles and policies are crucial. The AmazonSSMManagedInstanceCore policy is the minimum required for an instance to be managed. For additional capabilities (e.g., sending commands to other instances), you need additional permissions. For on-premises servers, you create an IAM service role for Systems Manager. Always follow the principle of least privilege.

Troubleshooting Common Issues

Instance not showing as managed: Check SSM Agent status, IAM role, network connectivity, and time synchronization (NTP).

Command fails with 'AccessDenied': The instance's IAM role does not have the necessary permissions.

Session Manager cannot connect: Ensure the instance has the AmazonSSMManagedInstanceCore policy and VPC endpoints are configured for private subnets.

Patch Manager fails: Verify that the patch baseline is correctly configured and that the instance can reach its patch source (for example, the OS package repositories or Windows Update).

Best Practices

Always use the latest SSM Agent version; enable automatic updates.

Use resource groups and tags to organize and target instances.

Use maintenance windows to schedule patching during off-peak hours.

Use rate controls to limit the blast radius of failures.

Enable CloudTrail logging for all Systems Manager actions.

Use Parameter Store for secrets and configuration data instead of hardcoding.

Regularly review compliance reports.

Summary

AWS Systems Manager is an essential tool for SysOps administrators to manage large fleets of instances efficiently and securely. It reduces operational overhead by automating patching, configuration, and inventory tasks. For the SOA-C02 exam, you need to understand the core components, how they interact, and common troubleshooting steps. Focus on SSM Agent, IAM roles, documents, maintenance windows, and the differences between Run Command, State Manager, and Patch Manager.

Walk-Through

1

Install and Configure SSM Agent

The SSM Agent must be installed on each instance you want to manage. For Amazon Linux, Ubuntu, and Windows Server AMIs provided by AWS, the agent is pre-installed. For on-premises servers or custom AMIs, you must manually install it. The agent communicates outbound over HTTPS (port 443) to the Systems Manager service. It uses an IAM instance role (for EC2) or an IAM service role (for on-premises) to authenticate. The agent polls the service continuously for pending tasks. Without a valid IAM role and network connectivity to the SSM endpoints, the instance will not appear as managed.

2

Attach IAM Role with Required Permissions

For EC2 instances, you must attach an IAM role that includes the `AmazonSSMManagedInstanceCore` managed policy. This policy grants permissions for the agent to call Systems Manager APIs. For on-premises servers, you create a hybrid activation, which generates an activation code and ID. You then use these when installing the agent. The on-premises server uses the IAM service role that you specified when creating the activation, which must grant the necessary permissions. Without the correct IAM permissions, the agent cannot register or receive commands.

3

Verify Network Connectivity to SSM Endpoints

The SSM Agent must be able to reach the Systems Manager endpoints. For instances with a path to the internet (a public subnet with a public IP, or a private subnet behind a NAT gateway), this works by default. For instances in private subnets with no internet path, you must create interface VPC endpoints for SSM (com.amazonaws.<region>.ssm), EC2 Messages (com.amazonaws.<region>.ec2messages), and SSM Messages (com.amazonaws.<region>.ssmmessages). Session Manager does not have a separate endpoint; it uses the SSM Messages endpoint, plus endpoints for KMS, CloudWatch Logs, or S3 if you enable session encryption or logging. The endpoint security group must allow inbound HTTPS (port 443) from the instances, and private DNS should be enabled so the agent resolves the service names to the endpoints. If the agent cannot connect, the instance will show as 'ConnectionLost' or will not appear as managed.
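The three endpoint service names follow a fixed pattern, so they are easy to generate. The sketch below builds one create_vpc_endpoint request per service; the helper names and the example IDs in the usage note are placeholders, and the actual call would be made with a boto3 EC2 client.

```python
SSM_ENDPOINT_SUFFIXES = ("ssm", "ec2messages", "ssmmessages")


def ssm_endpoint_service_names(region: str) -> list[str]:
    """Return the interface endpoint service names Systems Manager relies on."""
    return [f"com.amazonaws.{region}.{suffix}" for suffix in SSM_ENDPOINT_SUFFIXES]


def endpoint_requests(
    region: str, vpc_id: str, subnet_ids: list[str], security_group_ids: list[str]
) -> list[dict]:
    """Build create_vpc_endpoint arguments, one dict per required service."""
    return [
        {
            "VpcEndpointType": "Interface",
            "VpcId": vpc_id,
            "ServiceName": service_name,
            "SubnetIds": list(subnet_ids),
            # These groups must allow inbound TCP 443 from the managed nodes.
            "SecurityGroupIds": list(security_group_ids),
            # Private DNS lets the agent keep using the default service hostnames.
            "PrivateDnsEnabled": True,
        }
        for service_name in ssm_endpoint_service_names(region)
    ]
```

Each request could then be submitted with ec2_client.create_vpc_endpoint(**request), for example for endpoint_requests("us-east-1", "vpc-0example", ["subnet-0example"], ["sg-0example"]).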

4

Create and Configure SSM Documents

SSM documents define the actions to perform on instances. AWS provides pre-built documents like `AWS-RunShellScript` (for Linux) and `AWS-RunPowerShellScript` (for Windows). You can also create custom documents in JSON or YAML. Documents include parameters, schema version, and steps. For example, a Run Command using `AWS-RunShellScript` requires the `commands` parameter. You can store configuration values in Parameter Store and reference String or StringList parameters in documents using `{{ssm:parameter-name}}`; SecureString values are generally retrieved with decryption inside the script instead. Documents are versioned and can be shared across accounts.

5

Execute Tasks via Run Command or State Manager

Run Command is used for one-time ad-hoc commands. You select targets (by tags, resource groups, or manually), choose a document, and provide parameters. The command is executed asynchronously on each instance. State Manager is used for recurring tasks. You create an association that ties a document to a target and a schedule (e.g., every 30 minutes). The association is applied automatically. Both Run Command and State Manager support rate controls: you can set the maximum number of concurrent instances and a failure threshold. The status of each command or association is reported back to Systems Manager and can be viewed in the console or via CLI.
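A fleet-wide Run Command with rate controls might look like the sketch below. It assumes a boto3 SSM client; the function names, tag, and default limits are illustrative. Note that MaxConcurrency and MaxErrors are passed as strings and may also be percentages such as "10%".

```python
from collections import Counter


def send_to_tagged_fleet(
    ssm_client,
    tag_key: str,
    tag_value: str,
    commands: list[str],
    max_concurrency: str = "10",
    max_errors: str = "5",
) -> str:
    """Run shell commands on every node with the given tag; return the command ID."""
    response = ssm_client.send_command(
        Targets=[{"Key": f"tag:{tag_key}", "Values": [tag_value]}],
        DocumentName="AWS-RunShellScript",
        Parameters={"commands": list(commands)},
        MaxConcurrency=max_concurrency,  # how many nodes run at the same time
        MaxErrors=max_errors,  # failures tolerated before no new nodes are started
    )
    return response["Command"]["CommandId"]


def summarize_invocations(ssm_client, command_id: str) -> Counter:
    """Count per-node invocation statuses (Success, Failed, InProgress, ...)."""
    counts: Counter = Counter()
    kwargs = {"CommandId": command_id}
    while True:
        page = ssm_client.list_command_invocations(**kwargs)
        for invocation in page.get("CommandInvocations", []):
            counts[invocation["Status"]] += 1
        token = page.get("NextToken")
        if not token:
            return counts
        kwargs["NextToken"] = token
```

Because execution is asynchronous, summarize_invocations is meant to be called repeatedly until no invocation is still Pending or InProgress.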

What This Looks Like on the Job

Scenario 1: Patching a Large Fleet of EC2 Instances

A company runs 500 EC2 instances across multiple AWS regions. They need to apply critical security patches monthly without disrupting production. They use Patch Manager with a custom patch baseline that includes only critical and security patches. They create a maintenance window that runs every Sunday at 2 AM, with a duration of 4 hours and a cutoff of 1 hour. They use rate controls: concurrency of 10 instances and error threshold of 5. The patching is done using the AWS-RunPatchBaseline document with Operation: Install and RebootOption: NoReboot. After patching, they run a compliance scan. If any instance fails, they receive an SNS notification. This automated process reduces manual effort and ensures compliance.
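The maintenance window in this scenario could be described as follows. This is a sketch under stated assumptions: the helper name and window name are hypothetical, the cron expression is one way to express Sundays at 02:00 (UTC unless a schedule time zone is set), and the cutoff-below-duration check is a sanity rule added here rather than a quoted API constraint.

```python
def maintenance_window_kwargs(
    name: str, schedule: str, duration_hours: int, cutoff_hours: int
) -> dict:
    """Build create_maintenance_window arguments with basic sanity checks."""
    if not 1 <= duration_hours <= 24:
        raise ValueError("duration must be between 1 and 24 hours")
    if not 0 <= cutoff_hours < duration_hours:
        raise ValueError("cutoff must be at least 0 and shorter than the duration")
    return {
        "Name": name,
        "Schedule": schedule,
        "Duration": duration_hours,  # total length of the window, in hours
        "Cutoff": cutoff_hours,  # hours before the end when new tasks stop starting
        "AllowUnassociatedTargets": False,  # only registered targets may run tasks
    }


# Scenario values: Sundays at 02:00, 4-hour window, 1-hour cutoff.
weekly_patching = maintenance_window_kwargs(
    "weekly-patching", "cron(0 2 ? * SUN *)", duration_hours=4, cutoff_hours=1
)
```

Passing weekly_patching to ssm_client.create_maintenance_window(**weekly_patching) would create the window; targets and the AWS-RunPatchBaseline task are registered in separate calls.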

Scenario 2: Enforcing Configuration on On-Premises Servers

A company has 200 on-premises Windows servers that must have a specific antivirus software installed and a security baseline configured. They use Systems Manager Hybrid Activations to register these servers. Then they create a State Manager association that runs the AWS-InstallApplication document to ensure the antivirus is installed. They also use a custom document that applies registry settings for security hardening. The association is set to run every 6 hours. If a server drifts (e.g., antivirus is uninstalled), State Manager reapplies the configuration within 6 hours. This ensures consistent security posture without manual intervention.

Scenario 3: Secure Remote Access to Private Subnet Instances

A developer needs shell access to an EC2 instance in a private subnet that has no internet access. The company has strict security policies against bastion hosts. They use Session Manager with VPC endpoints for SSM, EC2 Messages, and SSM Messages. The developer authenticates via IAM and opens a browser-based shell session. Session start and end events are recorded in CloudTrail, and session output is logged to an S3 bucket. The developer can also use port forwarding to access a database on the instance. This eliminates the need for SSH keys and inbound ports, improving security posture.

How SOA-C02 Actually Tests This

SOA-C02 Exam Focus: AWS Systems Manager

This topic falls under Objective 3.2, 'Automate manual or repeatable processes', in the Deployment, Provisioning, and Automation domain. The exam expects you to know how to set up Systems Manager, manage instances, and automate operations. Key areas: SSM Agent, IAM roles, documents, maintenance windows, rate controls, and the differences between Run Command, State Manager, Patch Manager, and Session Manager.

Common Wrong Answers and Why Candidates Choose Them

1.

'Systems Manager requires inbound SSH/RDP ports.' – Wrong. Systems Manager works via outbound HTTPS from the agent. Candidates assume management requires inbound access.

2.

'State Manager and Run Command are interchangeable.' – Wrong. Run Command is for one-time tasks; State Manager is for recurring enforcement. Candidates confuse them because both use documents.

3.

'Patch Manager can patch instances without the SSM Agent.' – Wrong. The agent is mandatory. Candidates may assume patching works like an agentless scanning service.

4.

'Session Manager requires a bastion host.' – Wrong. Session Manager does not require a bastion host; it uses the agent. Candidates think of traditional SSH access.

Specific Numbers and Values on the Exam

SSM Agent version: 3.x (current major version).

Agent polling: the agent polls the service continuously for work; the exact interval is not explicitly tested.

Maximum number of parameters in Parameter Store: Standard tier 10,000; Advanced tier 100,000.

Parameter Store size limit: Standard 4 KB; Advanced 8 KB.

Maintenance window duration: up to 24 hours.

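Rate controls: concurrency (MaxConcurrency, default 50 for Run Command) and error threshold (MaxErrors, default 0, meaning the command stops being sent to new targets after the first failure). Both accept a number or a percentage.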

Hybrid activation: activation code and ID.

Edge Cases and Exceptions

If an instance is in a private subnet without VPC endpoints, it cannot be managed.

If the SSM Agent is not running, the instance appears as 'ConnectionLost'.

If the IAM role lacks permissions, the instance may register but commands will fail with AccessDenied.

For on-premises servers, you must create a hybrid activation; you cannot use the same IAM role as EC2 instances.

Patch Manager on Linux uses the package manager (yum, apt) configured on the instance; if the instance cannot reach the repositories, patching fails.

How to Eliminate Wrong Answers

If an answer suggests opening inbound ports, it's likely wrong because Systems Manager uses outbound communication.

If an answer mentions 'agentless', it's wrong for Systems Manager (except for some inventory collection via API).

If an answer confuses Run Command with State Manager, check if the task is recurring (State Manager) or one-time (Run Command).

If an answer suggests that Session Manager requires SSH keys, it's wrong; Session Manager uses IAM authentication.

Key Takeaways

SSM Agent must be installed and running on each managed instance; it communicates outbound over HTTPS.

IAM role with AmazonSSMManagedInstanceCore policy is required for EC2 instances to be managed.

For private subnets, create VPC endpoints for SSM, EC2 Messages, and SSM Messages.

Run Command is for one-time tasks; State Manager is for recurring enforcement.

Patch Manager uses patch baselines to define which patches to install; supports both scanning and installation.

Maintenance windows allow you to schedule operations with duration and cutoff times.

Rate controls (concurrency and error threshold) prevent widespread failures during large-scale operations.

Session Manager provides secure shell access without SSH keys or inbound ports; sessions are audited in CloudTrail, and session output can be logged to S3 or CloudWatch Logs.

Parameter Store stores configuration data and secrets; supports plain text and encrypted parameters using KMS.

Hybrid Activations are required to manage on-premises servers with Systems Manager.

Systems Manager documents (SSM documents) define actions; can be AWS-managed or custom.

Compliance in Systems Manager covers patch compliance and association compliance.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

Run Command

One-time execution of commands or scripts.

No scheduling; runs immediately when invoked.

Useful for ad-hoc troubleshooting or updates.

Targets can be tags, resource groups, or manual selection.

No automatic re-application if configuration drifts.

State Manager

Recurring enforcement of desired state.

Runs on a schedule (e.g., every 30 minutes) or on instance start.

Useful for maintaining consistent configuration.

Targets defined in association; can use tags or resource groups.

Re-applies the configuration on every scheduled run, which corrects drift.

Watch Out for These

Mistake

Systems Manager requires instances to have public IP addresses.

Correct

Instances can be in private subnets as long as they can reach the Systems Manager endpoints via VPC endpoints or NAT gateway. No public IP is needed.

Mistake

Run Command and State Manager are the same.

Correct

Run Command is for one-time, ad-hoc execution. State Manager is for recurring enforcement of desired state. Both use documents but have different use cases.

Mistake

Patch Manager works on instances without the SSM Agent.

Correct

The SSM Agent is mandatory for Patch Manager. Without it, the instance cannot be managed by Systems Manager.

Mistake

Session Manager requires a bastion host or VPN.

Correct

Session Manager does not require a bastion host. It uses the SSM Agent to establish a secure tunnel over HTTPS. No inbound ports are needed.

Mistake

Systems Manager can manage instances across different AWS accounts without any setup.

Correct

To manage instances in another account, you need cross-account IAM roles and proper trust policies. Systems Manager itself is per-account; you use AWS Organizations or Resource Access Manager for multi-account management.

Frequently Asked Questions

What is the difference between Run Command and State Manager?

Run Command is used for one-time, ad-hoc execution of commands or scripts on managed instances. State Manager is used to define a desired state configuration that is applied on a recurring schedule or when an instance starts. State Manager automatically reapplies the configuration if it drifts, while Run Command does not. For example, you would use Run Command to manually run a script, and State Manager to ensure a specific software is always installed.

How do I manage instances in a private subnet with Systems Manager?

To manage instances in a private subnet with no internet path, you must create interface VPC endpoints for the Systems Manager services: SSM (com.amazonaws.<region>.ssm), EC2 Messages (com.amazonaws.<region>.ec2messages), and SSM Messages (com.amazonaws.<region>.ssmmessages). Session Manager uses the SSM Messages endpoint and does not need a separate one, although session logging or encryption may require endpoints for S3, CloudWatch Logs, or KMS. Enable private DNS on the endpoints and allow inbound HTTPS (port 443) from the instances in the endpoint security group. Alternatively, a NAT gateway gives the instances outbound access to the public endpoints. The SSM Agent will then communicate over HTTPS through whichever path you provide.

What IAM role permissions are needed for Systems Manager?

The minimum IAM role for an EC2 instance to be managed by Systems Manager is one with the AWS managed policy `AmazonSSMManagedInstanceCore` attached. This policy grants permissions for the SSM Agent to call the necessary APIs. For additional capabilities (e.g., sending commands to other instances, accessing Parameter Store), you need to add more permissions. For on-premises servers, you create an IAM service role with similar permissions and specify it when you create the hybrid activation.

Can Systems Manager patch instances without the SSM Agent?

No, the SSM Agent is required for Patch Manager to work. The agent must be installed and running on the instance to receive patching commands, download patches, and report compliance. Without the agent, the instance cannot be managed by Systems Manager at all.

How does Session Manager work without SSH?

Session Manager uses the SSM Agent to establish a secure tunnel between the instance and the Systems Manager service. When you start a session, the agent creates a WebSocket connection over HTTPS. You authenticate via IAM and can then interact with the instance's shell through the browser or AWS CLI. No inbound ports are opened, and no SSH keys are needed. Session activity can be logged to CloudTrail and S3.

What is a maintenance window in Systems Manager?

A maintenance window is a schedule that defines a time window for executing operational tasks (like patching or running commands) on managed instances. You specify a duration (e.g., 4 hours) and a cutoff in whole hours (e.g., 1 hour before the end), after which no new tasks are started. Maintenance windows ensure that disruptive operations happen during planned downtime. A window can run Run Command tasks (including patching with AWS-RunPatchBaseline), Automation runbooks, Lambda functions, and Step Functions state machines.

How do I register an on-premises server with Systems Manager?

To register an on-premises server, you first create a hybrid activation in the Systems Manager console or CLI. This generates an activation code and ID. Then you install the SSM Agent on the server and run the registration command with the activation code and ID. The server will then appear as a managed instance in Systems Manager. Ensure the server can reach the Systems Manager endpoints over the internet or via a VPC endpoint.
