This chapter covers container escape techniques, a critical area for the CompTIA PenTest+ PT0-002 exam under Domain 3.0: Attacks and Exploits, Objective 3.5. Container escape is a high-impact attack vector where a penetration tester demonstrates breaking out of a container's isolation to access the host or other containers. Although not heavily weighted (approximately 5-10% of exam questions), understanding container escape is essential for cloud and container pentesting scenarios. This chapter provides the technical depth needed to identify, exploit, and mitigate container escape vulnerabilities, with a focus on exam-relevant techniques.
Imagine a high-security prison with individual cells that are isolated from each other and from the outside world. Each cell has a locked door, a small window, and basic amenities. The prison is run by a central guard system that manages all cells. A container escape is like a prisoner finding a way to break out of their cell and gain access to the prison's control room or even the outside world. This could happen if the cell door is left unlocked (misconfigured container), if the prisoner finds a hidden tunnel (kernel vulnerability), or if they trick the guard into opening the door (using a privileged capability). Once out of the cell, the prisoner can access other prisoners' cells (other containers), the prison's central systems (host), or even escape the prison entirely (host compromise). The guard system represents the host kernel, which is shared among all containers. If a prisoner can exploit a flaw in the guard system, they can gain control over the entire prison. In container escape, the attacker exploits a vulnerability in the container runtime, kernel, or misconfiguration to break out of the container's isolation and interact with the host OS or other containers.
What is Container Escape?
Container escape is a security exploit where an attacker breaks out of the isolated environment of a container to gain unauthorized access to the host operating system or other containers. Containers rely on Linux kernel features like namespaces and cgroups to isolate processes, but these are not a full security boundary like a virtual machine. A container escape can lead to full host compromise, allowing the attacker to access all containers, host data, and pivot to other systems. For the PT0-002 exam, you must understand the common techniques, underlying mechanisms, and detection methods.
How Container Isolation Works
Containers use Linux namespaces to isolate processes, network, filesystem, and other resources. Each container gets its own set of namespaces:
PID namespace: Isolates process IDs, so processes inside the container cannot see host processes.
Network namespace: Provides its own network stack, including interfaces, IP addresses, and routing tables.
Mount namespace: Isolates filesystem mount points.
User namespace: Maps container user IDs to host user IDs, allowing a container to run as root inside but as a non-root user on the host.
UTS namespace: Isolates hostname and domain name.
IPC namespace: Isolates inter-process communication resources.
Additionally, cgroups limit resource usage (CPU, memory, I/O). The container runtime (e.g., Docker, containerd) sets up these namespaces and cgroups when starting a container. The host kernel is shared among all containers, meaning a vulnerability in the kernel or a misconfiguration can allow a process to break out of its namespace.
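The namespace list above can be checked in practice by comparing the namespace links under `/proc/<pid>/ns/`: two processes share a namespace when the link targets match. The following is a minimal sketch, not a standard tool; the function name `compare_ns` and its `PROC_ROOT` argument (normally `/proc`) are assumptions made for illustration.

```shell
# compare_ns PROC_ROOT PID_A PID_B
# Print, for each namespace type covered above, whether two processes
# share it. Namespace identity is the symlink target, e.g. pid:[4026531836].
compare_ns() {
    proc_root=$1
    a=$2
    b=$3
    for ns in pid net mnt user uts ipc; do
        # Skip namespace types that cannot be read for either process
        id_a=$(readlink "$proc_root/$a/ns/$ns") || continue
        id_b=$(readlink "$proc_root/$b/ns/$ns") || continue
        if [ "$id_a" = "$id_b" ]; then
            echo "$ns shared $id_a"
        else
            echo "$ns isolated $id_a $id_b"
        fi
    done
}
```

On a host, `compare_ns /proc self <container-pid>` would typically report most types as isolated for a default container, while a container started with host namespaces would show them as shared.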
Common Container Escape Techniques
#### 1. Privileged Container Escape
A container running with the --privileged flag has all capabilities and access to host devices. This effectively disables most isolation. The container can:
Mount the host filesystem via /dev/sda or similar.
Use nsenter to enter host namespaces.
Execute commands on the host.
Exploit example:
# Inside a privileged container
fdisk -l # Identify host disk
mkdir /mnt/host
mount /dev/sda1 /mnt/host
chroot /mnt/host
# Now you have a shell on the host
Detection: Check if container runs with privileged: true in its security context.
#### 2. Capability-Based Escape
Linux capabilities break down root privileges into smaller units. By default, Docker drops many capabilities (including CAP_SYS_ADMIN) but retains some, such as CAP_NET_RAW and CAP_DAC_OVERRIDE. Operators can add dangerous ones back with --cap-add. CAP_SYS_ADMIN is particularly dangerous as it allows mounting filesystems, accessing namespaces, and performing privileged syscalls.
Exploit with CAP_SYS_ADMIN:
# Requires CAP_SYS_ADMIN and cgroup v1; mount the memory controller hierarchy
mkdir /tmp/cgrp
mount -t cgroup -o memory cgroup /tmp/cgrp
mkdir /tmp/cgrp/x
# Run the release_agent when the child cgroup becomes empty
echo 1 > /tmp/cgrp/x/notify_on_release
# release_agent sits at the hierarchy root and needs a host-side path (overlayfs upperdir assumed)
host_path=$(sed -n 's/.*upperdir=\([^,]*\).*/\1/p' /etc/mtab)
echo "$host_path/cmd" > /tmp/cgrp/release_agent
# Payload: written inside the container, executed by the host kernel as root
printf '#!/bin/sh\necho hacked > /host_root_flag\n' > /cmd
chmod +x /cmd
# Add a short-lived process to the cgroup; its exit triggers release_agent
sh -c "echo \$\$ > /tmp/cgrp/x/cgroup.procs"
Detection: Check capabilities with capsh --print or getpcaps.
#### 3. Mounting Host Filesystem via /proc
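If a container shares the host's PID namespace (for example, it was started with --pid=host) and has CAP_SYS_PTRACE and CAP_DAC_READ_SEARCH, it can access /proc/1/root, which then resolves to the host's root filesystem. This allows reading or writing host files. Without the shared PID namespace, PID 1 is the container's own init process and /proc/1/root is only the container's root.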
Exploit:
# Inside container
cat /proc/1/environ # May leak host environment variables
ls -la /proc/1/root/
# If writable, you can modify host files
Detection: Check if /proc/1/root is accessible.
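#### 4. Kernel and Runtime Vulnerabilities (CVE-2019-5736, CVE-2022-0492)
Kernel and runtime exploits can break out of namespaces. For example, CVE-2019-5736 affected runc (the low-level container runtime used by Docker), not the kernel itself. It allowed a malicious container to overwrite the host runc binary, leading to host code execution when a user executed docker exec. CVE-2022-0492 is a Linux kernel flaw in the cgroup v1 release_agent feature (a missing privilege check) that could allow escape in some configurations.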
Mechanism: The exploit used /proc/self/exe to overwrite the runc binary from inside the container.
Detection: Compare the installed kernel and runc versions against the fixed releases and use vulnerability scanners. Keeping both updated is the mitigation.
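A version comparison is one simple way to flag hosts that may still be exposed to CVE-2019-5736. The sketch below assumes the threshold from the public CVE description (Docker before 18.09.2) and relies on `sort -V` from GNU coreutils. The function names are hypothetical, and the check does not account for vendor backports, so treat a result as a lead to verify rather than a finding.

```shell
# version_lt A B: succeed when version A sorts strictly before version B.
# Uses `sort -V` (version sort) to compare dotted version strings.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# check_docker_cve_2019_5736 VERSION: print a verdict for a Docker Engine
# version string. Assumption: anything before 18.09.2 is treated as affected.
check_docker_cve_2019_5736() {
    if version_lt "$1" "18.09.2"; then
        echo "affected"
    else
        echo "patched"
    fi
}
```

On a host with Docker installed, the version could be supplied with `check_docker_cve_2019_5736 "$(docker version --format '{{.Server.Version}}')"`.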
#### 5. Docker Socket Mount
Mounting the Docker socket (/var/run/docker.sock) inside a container gives the container control over the Docker daemon on the host. This allows creating new containers with arbitrary privileges, effectively escaping.
Exploit:
# Inside container with docker socket mounted
curl -X POST -H "Content-Type: application/json" -d '{"Image":"ubuntu","Cmd":["/bin/bash"],"HostConfig":{"Binds":["/:/host"]}}' --unix-socket /var/run/docker.sock http://localhost/containers/create
# <id> is the "Id" value returned by the create call
curl -X POST --unix-socket /var/run/docker.sock http://localhost/containers/<id>/start
# Now you have a new container with host root mounted
Detection: Check if the container has the Docker socket mounted as a volume.
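The detection step for the Docker socket can be scripted from the host. This is a rough sketch: it greps the compact JSON that `docker inspect --format '{{json .Mounts}}'` emits instead of parsing it properly, and the function name `check_socket_mount` is an assumption for illustration.

```shell
# check_socket_mount: read the JSON mount list of a container on stdin
# and report whether the Docker socket is one of the mount sources.
# Matches both /var/run/docker.sock and /run/docker.sock.
check_socket_mount() {
    if grep -Eq '"Source":"(/var)?/run/docker\.sock"'; then
        echo "docker socket mounted"
    else
        echo "no docker socket mount"
    fi
}
```

A typical call would be `docker inspect --format '{{json .Mounts}}' <container> | check_socket_mount`.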
Key Components and Defaults
Docker default seccomp profile: Blocks dangerous syscalls. But privileged containers bypass seccomp.
AppArmor/SELinux: Can enforce mandatory access control. Many container escapes fail if AppArmor is properly configured.
User namespaces: Enable user namespace remapping to run containers as non-root on the host. This significantly reduces escape risk.
Read-only root filesystem: Prevents writing to container filesystem, but does not prevent all escapes.
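For the user namespace item above, the effective mapping is visible in `/proc/<pid>/uid_map`, where each line reads "ID inside, ID outside, range length". The sketch below reports where container UID 0 lands; the function name `describe_root_mapping` is hypothetical, and it takes the map file as an argument so it can be pointed at `/proc/self/uid_map` or at a saved copy.

```shell
# describe_root_mapping UID_MAP_FILE
# Report which outside UID the in-container root (UID 0) maps to.
describe_root_mapping() {
    # First line whose inside-ID column is 0; print its outside-ID column
    host_uid=$(awk '$1 == 0 { print $2; exit }' "$1")
    case "$host_uid" in
        0)  echo "container root is host root (no remapping)" ;;
        "") echo "container root is unmapped" ;;
        *)  echo "container root maps to host UID $host_uid" ;;
    esac
}
```

Inside a container, `describe_root_mapping /proc/self/uid_map` would be expected to print the remapped UID when `userns-remap` is active and the "no remapping" line otherwise.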
Configuration and Verification Commands
Check container privileges:
docker inspect --format='{{.HostConfig.Privileged}}' <container>
docker inspect --format='{{.HostConfig.CapAdd}}' <container>
Check for Docker socket mount:
docker inspect --format='{{json .Mounts}}' <container> | jq .
Check running capabilities inside container:
grep Cap /proc/1/status
capsh --print
Check if user namespace is enabled:
# On host
grep userns-remap /etc/docker/daemon.json
Interaction with Related Technologies
Container escape techniques often interact with:
Kubernetes: Pod Security Policies (since replaced by Pod Security Admission) restrict containers. Escaping a container in Kubernetes can lead to node compromise.
Cloud environments: If the host is a VM, escaping the container gives access to the VM, not necessarily the hypervisor. But further exploits may be possible.
Container registries: Malicious images can contain escape payloads.
Mitigation Strategies
Run containers with the least privilege: drop all capabilities except those needed.
Enable user namespace remapping.
Use read-only root filesystem.
Avoid mounting the Docker socket.
Keep kernel and container runtime updated.
Use AppArmor or SELinux profiles.
Use security scanners to detect misconfigurations.
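Several of the mitigations above map directly onto `docker run` flags. The wrapper below is a minimal sketch rather than a complete hardening baseline: the function name `run_hardened` and the UID/GID `1000:1000` are assumptions, and a real workload may need specific capabilities added back with `--cap-add`.

```shell
# run_hardened IMAGE [COMMAND...]
# Start a container with least-privilege defaults:
#   --cap-drop=ALL                     drop every capability
#   --security-opt no-new-privileges   block privilege gain via setuid binaries
#   --read-only                        mount the root filesystem read-only
#   --user 1000:1000                   run as a non-root user (illustrative IDs)
run_hardened() {
    docker run --rm \
        --cap-drop=ALL \
        --security-opt no-new-privileges \
        --read-only \
        --user 1000:1000 \
        "$@"
}
```

For example, `run_hardened alpine id` should show a non-root identity, and `grep CapEff /proc/1/status` inside such a container should show an all-zero mask.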
Exam Relevance
For PT0-002, you should know:
The difference between a privileged container and a container with specific capabilities.
How CAP_SYS_ADMIN can be abused with cgroups.
The danger of mounting the Docker socket.
CVE-2019-5736 as a classic runtime escape.
Commands to detect escape vectors.
Identify Escape Vectors
The first step is to enumerate the container's configuration to find potential escape vectors. Check if the container is running in privileged mode, what capabilities are added, if the Docker socket is mounted, and if the root filesystem is writable. Use commands like `docker inspect` on the host or `grep Cap /proc/1/status` inside the container. Also check for any mounted host directories. This reconnaissance determines which technique is most likely to succeed. For example, a `CapEff` value of `0000003fffffffff` means every capability bit is set on older kernels, which indicates a privileged container. Newer kernels define additional capabilities, so the full mask is wider there (for example `000001ffffffffff`).
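Reading a `CapEff` mask by eye is error-prone, so a small decoder helps during enumeration. The sketch below tests individual bits of the hex mask, using capability numbers from `<linux/capability.h>`. The function names `has_cap` and `audit_caps` are hypothetical, and only the capabilities named in this chapter are checked.

```shell
# has_cap MASK BIT: succeed when bit number BIT is set in the hex MASK
# (MASK is the value after "CapEff:" in /proc/<pid>/status, without 0x).
has_cap() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# audit_caps MASK: print the escape-relevant capabilities present in MASK.
# Each entry is "bit:name" as defined in <linux/capability.h>.
audit_caps() {
    for entry in 1:CAP_DAC_OVERRIDE 2:CAP_DAC_READ_SEARCH 13:CAP_NET_RAW \
                 19:CAP_SYS_PTRACE 21:CAP_SYS_ADMIN; do
        bit=${entry%%:*}
        name=${entry#*:}
        if has_cap "$1" "$bit"; then
            echo "$name"
        fi
    done
}
```

Inside a container this could be run as `audit_caps "$(awk '/^CapEff:/ {print $2}' /proc/1/status)"`. A mask such as `00000000a80425fb`, commonly seen on default Docker containers, would list `CAP_DAC_OVERRIDE` and `CAP_NET_RAW` but not `CAP_SYS_ADMIN`.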
Exploit Privileged Mode
If the container is privileged, the attacker can directly access host devices. Use `fdisk -l` to list disks, then mount the host filesystem. For example, `mount /dev/sda1 /mnt/host` and then `chroot /mnt/host` to get a shell on the host. The attacker can also use `nsenter` to enter any namespace on the host. This is the simplest escape method and is often tested on the exam. Remember that privileged containers bypass most security mechanisms like seccomp and AppArmor.
Exploit Capabilities (CAP_SYS_ADMIN)
If the container has `CAP_SYS_ADMIN` but is not fully privileged, the attacker can use the cgroup release_agent technique. Create a new cgroup, enable `notify_on_release` on it, point the hierarchy's release_agent at a script, and trigger a process exit in that cgroup. When the cgroup becomes empty, the host kernel executes the release_agent as root. This technique relies on cgroup v1 and works because `CAP_SYS_ADMIN` allows mounting cgroup filesystems and writing to cgroup control files. The attacker must ensure the container has the necessary mount and write permissions, and that AppArmor or seccomp does not block the mount. This is a classic exam scenario.
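Because release_agent exists only in cgroup v1, it is worth checking which cgroup version the container sees before attempting this technique. The following sketch classifies a mounts table by filesystem type; the function name `cgroup_version` is an assumption, and a hybrid setup with both types is reported as v1 because v1 hierarchies are present.

```shell
# cgroup_version MOUNTS_FILE
# Classify the cgroup setup from a mounts table such as /proc/mounts,
# whose third column is the filesystem type.
cgroup_version() {
    awk '
        $3 == "cgroup"  { v1 = 1 }   # at least one v1 controller hierarchy
        $3 == "cgroup2" { v2 = 1 }   # unified v2 hierarchy
        END {
            if (v1)      print "v1"
            else if (v2) print "v2"
            else         print "none"
        }
    ' "$1"
}
```

Running `cgroup_version /proc/mounts` inside the container gives a quick hint: a result of `v2` suggests the release_agent path is not available and another vector should be considered.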
Exploit Docker Socket Mount
If the Docker socket is mounted inside the container, the attacker can communicate directly with the Docker daemon on the host. Create a new container with the host root filesystem mounted, then execute commands in that container. For example, use `docker -H unix:///var/run/docker.sock run -v /:/host -it ubuntu chroot /host /bin/bash`. This gives a shell on the host. This attack does not require any kernel exploit and is often tested alongside container escape scenarios.
Exploit Kernel or Runtime Vulnerabilities
If no misconfigurations are found, the attacker may need to exploit a kernel or container runtime vulnerability. For example, CVE-2019-5736 exploits file-descriptor mishandling in runc to overwrite the host runc binary. The attacker must have a way to trigger the exploit, usually by getting a user on the host to run `docker exec` into the container. The exploit uses `/proc/self/exe` to write to the host binary. This is more complex and less common in exams but may appear as a multiple-choice question about the vulnerability name.
In a real-world penetration test of a Kubernetes cluster, a common scenario is discovering a pod running with privileged: true or with CAP_SYS_ADMIN. For example, a DevOps team might have created a monitoring container that needs to access host metrics and mistakenly gave it full privileges. As a pentester, you would identify this via `kubectl describe pod` and then execute an escape to gain access to the node. Once on the node, you can access other pods' data, service account tokens, and potentially pivot to the cloud environment.
In another scenario, a container in a CI/CD pipeline might have the Docker socket mounted to allow building images inside a container. This is a classic misconfiguration that leads to host compromise. The pentester can create a new container with host root access and then exfiltrate secrets or deploy backdoors.
Performance considerations: Container escapes often require the container to have sufficient CPU and memory to execute exploits, but most techniques are lightweight. In production, misconfigurations are often discovered during security audits using tools like kube-bench or docker-bench. When misconfigured, the impact is severe: an attacker can gain root on the host, access all containers, and then move laterally. Mitigation involves enforcing Pod Security Standards (restricted profile), disabling privileged containers, and using tools like OPA Gatekeeper to prevent dangerous configurations.
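The privileged-pod discovery in the Kubernetes scenario above can be done across a whole cluster instead of one pod at a time. This is a sketch under assumptions: the function names are hypothetical, the query only looks at regular containers (not init containers), and it requires read access to pods in all namespaces.

```shell
# filter_privileged: read "namespace/pod<TAB>flags" lines on stdin and
# print the pods where at least one container reports privileged=true.
filter_privileged() {
    awk -F '\t' '$2 ~ /(^| )true( |$)/ { print $1 }'
}

# list_privileged_pods: emit one line per pod with the privileged flags
# of its containers, then keep only the pods that have a "true".
list_privileged_pods() {
    kubectl get pods --all-namespaces -o jsonpath='{range .items[*]}{.metadata.namespace}{"/"}{.metadata.name}{"\t"}{.spec.containers[*].securityContext.privileged}{"\n"}{end}' |
        filter_privileged
}
```

Each pod printed by `list_privileged_pods` is a candidate for the privileged escape path and deserves a closer look with `kubectl describe pod`.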
The PT0-002 exam (Objective 3.5) tests container escape techniques primarily through scenario-based multiple-choice questions. You will be given a description of a container configuration and asked to identify the most likely escape method or the vulnerability that allows escape.
The most common wrong answers include:
(1) Thinking that a container running as root inside is automatically dangerous. Root inside is normal; it's the host mapping that matters.
(2) Confusing CAP_SYS_ADMIN with full privileged mode. Privileged mode includes all capabilities plus device access.
(3) Assuming that a read-only root filesystem prevents all escapes. It does not prevent cgroup-based escapes or Docker socket attacks.
(4) Forgetting that the Docker socket mount gives full control over the host Docker daemon.
Key numbers and terms: --privileged, CAP_SYS_ADMIN, CVE-2019-5736, /var/run/docker.sock, release_agent, nsenter.
The exam may ask: "Which capability allows mounting a filesystem?" Answer: CAP_SYS_ADMIN. Or: "What is the risk of mounting the Docker socket?" Answer: It allows creating containers with host access.
Edge cases: User namespace remapping can prevent many escapes even if the container runs as root inside. The exam may test that user namespaces map container root to a non-root host user, so escaping does not give root on the host. Also, AppArmor or SELinux can deny operations such as mounts or file access even if capabilities are present.
To eliminate wrong answers, trace the mechanism: if the container has the Docker socket, the attacker doesn't need kernel exploits; if it has CAP_SYS_ADMIN but no socket, the cgroup technique is likely; if it's privileged, direct host mount is possible.
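The elimination logic above can be written down as a small decision function, which may help when drilling scenario questions. This is only a study aid: the function name `likely_escape` and its yes/no arguments are assumptions, and the privileged check comes before the CAP_SYS_ADMIN check because a privileged container holds that capability too.

```shell
# likely_escape PRIVILEGED SYS_ADMIN SOCKET
# Each argument is "yes" or "no". Prints the most direct technique
# for the described container configuration.
likely_escape() {
    if [ "$3" = yes ]; then
        echo "docker-socket: create a container with the host root bind-mounted"
    elif [ "$1" = yes ]; then
        echo "privileged: mount the host disk and chroot"
    elif [ "$2" = yes ]; then
        echo "cap-sys-admin: cgroup release_agent"
    else
        echo "none-obvious: look for kernel or runtime CVEs"
    fi
}
```

For instance, `likely_escape no yes no` describes a non-privileged container with CAP_SYS_ADMIN and no socket, and points to the cgroup release_agent technique.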
Container escape exploits weaknesses in namespace isolation, capabilities, or runtime vulnerabilities.
Privileged containers (`--privileged`) provide full host access and are the easiest to escape from.
`CAP_SYS_ADMIN` enables mounting filesystems and can be abused with cgroup release_agent for escape.
Mounting the Docker socket (`/var/run/docker.sock`) gives control over the host Docker daemon.
CVE-2019-5736 is a classic runc vulnerability that allows overwriting the host runc binary via `/proc/self/exe`.
User namespace remapping (`userns-remap`) mitigates many escapes by mapping container root to non-root on host.
Always check container security context: privileged, capabilities, volume mounts, and read-only root filesystem.
Detection commands: `docker inspect`, `cat /proc/1/status`, `capsh --print`.
Mitigation: use Pod Security Standards, avoid privileged containers, drop dangerous capabilities, and keep software updated.
Container escape is a high-impact attack that often leads to full host compromise in penetration tests.
These come up on the exam all the time. Here's how to tell them apart.
Privileged Container
Has all capabilities enabled.
Has access to all host devices (e.g., /dev/sda).
Bypasses seccomp and AppArmor profiles.
Can directly mount host filesystem via device nodes.
Simple escape: mount host disk and chroot.
Container with CAP_SYS_ADMIN
Has only specific capabilities added (e.g., CAP_SYS_ADMIN).
No direct device access unless explicitly added.
Seccomp and AppArmor still apply.
Escape requires more complex techniques like cgroup release_agent.
More common in production due to least privilege attempts.
Mistake
A container running as root inside is insecure and can easily escape.
Correct
Running as root inside a container is normal and does not by itself imply escape risk. What matters is whether container root maps to root on the host. Without user namespace remapping, root inside the container is also UID 0 on the host for kernel operations, but most escapes still require specific capabilities or misconfigurations.
Mistake
Containers provide the same security isolation as virtual machines.
Correct
Containers share the host kernel, so they have a larger attack surface than VMs. A kernel vulnerability can lead to container escape, whereas VMs have a hypervisor layer that isolates the guest OS from the host. The exam emphasizes that containers are not a full security boundary in the way a virtual machine is.
Mistake
If a container has no capabilities added, it cannot escape.
Correct
Even with default capabilities, some escapes are possible if the host kernel is vulnerable or if the container runtime has a bug. For example, CVE-2019-5736 affected runc regardless of capabilities. Also, mounting the Docker socket does not require capabilities; it's a volume mount.
Mistake
A read-only root filesystem inside the container prevents escape.
Correct
A read-only root filesystem prevents writing to the container's own filesystem but does not prevent mounting host filesystems or writing to cgroup control files (if `/sys/fs/cgroup` is writable). The cgroup release_agent technique requires writing to cgroup files, which may be allowed even with a read-only root.
Mistake
Docker's default seccomp profile blocks all dangerous syscalls.
Correct
The default seccomp profile blocks around 44 syscalls but allows many that can be used for escape, such as `mount`, `open`, `write`, etc., if the container has the necessary capabilities. It is not a complete security solution.
Container escape is an attack where a penetration tester breaks out of a container's isolation to gain access to the host operating system or other containers. This is typically achieved by exploiting misconfigurations (e.g., privileged mode, mounted Docker socket), capabilities (e.g., CAP_SYS_ADMIN), or kernel/runtime vulnerabilities (e.g., CVE-2019-5736). On the PT0-002 exam, you'll need to identify which technique applies based on the container's configuration.
A privileged container has all capabilities and access to all host devices. This allows an attacker to directly mount the host filesystem using device nodes like /dev/sda1 and then chroot into it, gaining a root shell on the host. They can also use nsenter to enter any host namespace. The exam expects you to know that privileged mode effectively disables container isolation.
This technique exploits the cgroup release_agent feature. If a container has CAP_SYS_ADMIN, it can mount the cgroup filesystem, create a new cgroup, and set its release_agent to a script. When all processes in that cgroup exit, the host kernel executes the release_agent as root. This allows arbitrary code execution on the host. The exam may test that this requires CAP_SYS_ADMIN and writable cgroup files.
The Docker socket (/var/run/docker.sock) is the Unix socket used to communicate with the Docker daemon. If mounted inside a container, the container can send commands to the daemon, such as creating a new container with host root filesystem mounted. This effectively gives the attacker full control over the host. On the exam, this is a common scenario for container escape.
CVE-2019-5736 is a vulnerability in runc, the container runtime used by Docker. It allows a malicious container to overwrite the host runc binary by exploiting runc's mishandling of file descriptors related to /proc/self/exe. When a user on the host runs docker exec into the container, the overwritten runc executes and can give the attacker host code execution. The exam may ask for the vulnerability name or its impact.
User namespace remapping maps the container's root user (UID 0) to a non-root user on the host (e.g., UID 100000). This means even if a container has CAP_SYS_ADMIN, it cannot perform actions that require real root on the host, such as mounting host devices or writing to protected files. The exam expects you to know that this is a strong mitigation.
From the host, use `docker inspect` to check Privileged, CapAdd, and Mounts. Inside the container, check /proc/1/status for CapEff (capabilities), run `mount` to see mounted filesystems, and check if /var/run/docker.sock exists. Tools like `capsh --print` list current capabilities. The exam may test these commands in scenario questions.