
Scripting and Automation for PenTest

This chapter covers the scripting and automation skills essential for penetration testing, a core component of the PT0-002 exam's Tools and Code Analysis domain (Objective 5.2). Automation separates efficient, professional testers from those who lose time to repetitive tasks. Expect 10-15% of exam questions to touch on scripting concepts, particularly Bash and Python, regular expressions, and common automation patterns. Mastering these will not only help you pass but also substantially improve your real-world testing speed.

Updated May 31, 2026

Scripting as a Master Chef's Kitchen

Imagine a master chef running a busy restaurant kitchen. Without automation, every time a diner orders a steak, the chef must manually walk to the pantry, retrieve the meat, season it, heat the grill, cook to temperature, plate, and serve. Each steak takes the same steps, but doing them manually for 100 orders is exhausting and error-prone.

Now, the chef creates a "recipe card": a script. It lists every step: "Take steak from fridge, season with salt and pepper, grill 4 minutes per side for medium-rare, rest 2 minutes, plate with garnish." The chef can give this card to a junior cook, who executes it exactly every time. Moreover, the chef can write a recipe that says, "If the order is for a vegetarian, skip the steak and prepare the portobello mushroom instead." This is a conditional. The chef can also create a recipe that repeats: "For each table in section 3, prepare a bread basket." That's a loop. The chef can even combine recipes: "Before grilling any steak, call the 'preheat grill' recipe." That's function calling.

In penetration testing, your scripts are these recipe cards. Instead of manually running Nmap, then parsing output, then running a vulnerability scanner, you write a script that does it all in sequence, makes decisions based on results, and repeats across multiple hosts. This saves hours, reduces mistakes, and allows you to focus on interpreting results rather than typing commands.

How It Actually Works

What is Scripting and Automation in PenTesting?

Scripting and automation refer to the use of programming or shell scripting languages to automate repetitive tasks during a penetration test. Instead of manually running each tool, parsing output, and deciding next steps, a tester writes a script that orchestrates the entire workflow. This is critical for efficiency, consistency, and scalability. The PT0-002 exam expects you to understand how to write simple scripts in Bash (Linux shell) and Python, and to use regular expressions to parse and extract data from tool output.

Why Automation Matters for PenTest+

The exam objectives explicitly state: "Given a scenario, analyze a script or code sample for use in a penetration test." This means you must be able to read a script, understand what it does, identify errors, and modify it for a given purpose. You are not expected to write complex programs from scratch, but you must be comfortable with:

Variables, conditionals, loops

Reading and writing files

Calling external commands and capturing output

Using regular expressions for pattern matching

Basic error handling

Bash Scripting Fundamentals

Bash is the default shell on most Linux distributions and is the primary scripting language for pen testers using Kali Linux. A Bash script is a plain text file, conventionally named with a .sh extension, that starts with the shebang line #!/bin/bash. The shebang tells the system to run the file with the Bash interpreter.

Variables in Bash are untyped and assigned without spaces around =:

#!/bin/bash
target="10.0.0.1"
port=80
echo "Scanning $target on port $port"

Command substitution captures output of a command into a variable:

result=$(nmap -p "$port" "$target")

Conditionals use if, elif, else, and fi. The test command [ ] or [[ ]] evaluates conditions:

if [ -f "$output_file" ]; then
    echo "File exists"
else
    echo "File missing"
fi

Loops include for, while, and until. Common pattern: iterate over a list of hosts:

for ip in $(cat targets.txt); do
    nmap -sS "$ip"
    echo "Done with $ip"
done

Functions group reusable code:

function scan_port() {
    # $1 is the host, $2 is the port
    nc -zv "$1" "$2"
}
scan_port "$target" 443

Exit codes are important for error handling. 0 means success, non-zero means failure. Use $? to capture the last command's exit code.

Python Scripting for PenTesting

Python is more powerful for complex automation, especially when parsing structured data (JSON, XML) or using libraries like subprocess, socket, and re. The exam expects you to understand Python scripts that call external tools and process their output.

Running system commands uses subprocess.run():

import subprocess
result = subprocess.run(["nmap", "-p", "80", "10.0.0.1"], capture_output=True, text=True)
print(result.stdout)

File I/O:

with open('targets.txt', 'r') as f:
    for line in f:
        ip = line.strip()
        print(f"Scanning {ip}")

Error handling with try/except:

import subprocess

ip = "10.0.0.1"  # example target
try:
    result = subprocess.run(["nmap", ip], capture_output=True, text=True, timeout=30)
except subprocess.TimeoutExpired:
    print(f"Scan of {ip} timed out")

Regular expressions via the re module are critical for parsing tool output:

import re
output = "22/tcp open ssh"
match = re.search(r'(\d+)/tcp\s+(\w+)', output)
if match:
    port = match.group(1)
    state = match.group(2)
    print(f"Port {port} is {state}")

Regular Expressions Deep Dive

Regular expressions (regex) are patterns used to match text. The exam tests your ability to read and write simple regex patterns. Key metacharacters:

`.` - any single character

`*` - zero or more of preceding

`+` - one or more of preceding

`?` - zero or one of preceding

`^` - start of line

`$` - end of line

`[abc]` - character class: a, b, or c

`[^abc]` - negated character class: not a, b, c

`\d` - digit (0-9)

`\w` - word character (alphanumeric + underscore)

`\s` - whitespace

`|` - alternation (OR)

`()` - grouping

Example: Extract all IP addresses from a file:

import re
pattern = r'\b(?:[0-9]{1,3}\.){3}[0-9]{1,3}\b'
with open('nmap_output.txt') as f:
    content = f.read()
    ips = re.findall(pattern, content)
    print(ips)

Common exam trap: greedy vs. non-greedy matching. By default, * and + are greedy (they match as much as possible). Use *? or +? for non-greedy matching.
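The difference is easiest to see side by side. The following is a minimal sketch; the `banner` string is made-up sample input, not output from any particular tool.

```python
import re

# Hypothetical sample input: two tags on one line.
banner = "<title>Login</title><h1>Admin</h1>"

# Greedy: .* grabs as much as possible, so the match runs to the LAST '>'.
greedy = re.search(r"<.*>", banner).group(0)

# Non-greedy: .*? stops at the FIRST '>' that lets the pattern succeed.
lazy = re.search(r"<.*?>", banner).group(0)

print(greedy)  # <title>Login</title><h1>Admin</h1>
print(lazy)    # <title>
```

When an exam question shows `.*` between two delimiters, check whether the input could contain the closing delimiter more than once; if so, the greedy form probably captures more than intended.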

Automating Common PenTest Tasks

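Network scanning automation: A script that takes a list of CIDR ranges, runs Nmap, and saves results per range.

#!/bin/bash
for range in $(cat ranges.txt); do
    # A CIDR such as 10.0.0.0/24 contains "/", which cannot appear in a file name
    safe_name=${range//\//_}
    nmap -sS -sV -O "$range" -oA "scan_$safe_name"
done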

Service enumeration: Parse Nmap output to identify web servers and run Nikto:

#!/bin/bash
for ip in $(grep "open" scan.gnmap | awk '{print $2}'); do
    # Match the exact host field so 10.0.0.1 does not also match 10.0.0.10
    ports=$(grep -F "Host: $ip " scan.gnmap | grep -oP '\d+/open' | cut -d'/' -f1)
    for port in $ports; do
        if [ "$port" -eq 80 ] || [ "$port" -eq 443 ]; then
            nikto -h "$ip" -p "$port"
        fi
    done
done

Credential testing: Use Hydra with a list of users and passwords:

#!/bin/bash
target_ip="$1"   # target host passed as the first argument
for user in $(cat users.txt); do
    hydra -l "$user" -P passwords.txt "ssh://$target_ip"
done

Report generation: Combine multiple tool outputs into a single report using Python:

import json
results = {}
# Parse Nmap XML
# Parse Nikto JSON
# Combine into final report
with open('report.json', 'w') as f:
    json.dump(results, f, indent=4)
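To make the "Parse Nmap XML" step concrete, here is a minimal sketch using the standard library's `xml.etree.ElementTree`. The `SAMPLE_XML` string is a trimmed, hand-written stand-in for real `-oX` output, and the function name `open_ports_by_host` is purely illustrative.

```python
import json
import xml.etree.ElementTree as ET

# Trimmed, hand-written sample in the shape of Nmap's XML output (assumption:
# real files carry many more attributes and elements).
SAMPLE_XML = """<nmaprun>
  <host>
    <address addr="10.0.0.1" addrtype="ipv4"/>
    <ports>
      <port protocol="tcp" portid="22"><state state="open"/><service name="ssh"/></port>
      <port protocol="tcp" portid="80"><state state="open"/><service name="http"/></port>
      <port protocol="tcp" portid="443"><state state="closed"/><service name="https"/></port>
    </ports>
  </host>
</nmaprun>"""


def open_ports_by_host(xml_text):
    """Return {ip: [{"port": int, "service": str}, ...]} for open ports only."""
    results = {}
    root = ET.fromstring(xml_text)
    for host in root.findall("host"):
        address = host.find("address")
        if address is None:
            continue
        ip = address.get("addr")
        results[ip] = []
        for port in host.findall("ports/port"):
            state = port.find("state")
            if state is None or state.get("state") != "open":
                continue  # skip closed and filtered ports
            service = port.find("service")
            results[ip].append({
                "port": int(port.get("portid")),
                "service": service.get("name") if service is not None else "unknown",
            })
    return results


results = open_ports_by_host(SAMPLE_XML)
print(json.dumps(results, indent=4))
```

For a file on disk, `ET.parse(path).getroot()` would replace `ET.fromstring(...)`; the resulting dictionary can then be passed to `json.dump` as in the snippet above.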

Key Components and Defaults

Shebang: #!/bin/bash or #!/usr/bin/env python3

Exit codes: 0 success, 1 general error, 2 misuse of shell builtins

Bash variable defaults: ${var:-default} uses default if var is unset or empty; ${var:=default} also assigns the default to var

Python subprocess: shell=True is dangerous (command injection) – avoid unless absolutely necessary

Timeout: Always set timeouts in automation to prevent hanging

Logging: Use set -x in Bash for debugging; logging module in Python

Integration with Other Tools

Scripts often chain multiple tools. For example:

1. Run Nmap to discover open ports.

2. Parse output to identify services.

3. For each web service, run Nikto or Gobuster.

4. For SSH, run Hydra with common credentials.

5. Aggregate results into a CSV or JSON.

This is the essence of automation: reduce manual intervention, ensure consistency, and allow repeatability. The exam expects you to understand how to combine tools in a script.
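The decision step in such a chain can be sketched as a small function that maps discovered services to follow-up commands. Everything here is illustrative: `plan_followups` is a hypothetical helper, the wordlist file names are placeholders, and the sketch only builds the argument lists rather than launching the tools.

```python
def plan_followups(ip, open_services):
    """Return a list of argv lists, one per follow-up tool to launch.

    open_services is a list of (port, service_name) tuples, e.g. from a parser.
    """
    commands = []
    for port, service in open_services:
        if service in ("http", "https"):
            commands.append(["nikto", "-h", ip, "-p", str(port)])
        elif service == "ssh":
            commands.append(
                ["hydra", "-L", "users.txt", "-P", "passwords.txt", f"ssh://{ip}:{port}"]
            )
        # Anything else is left for manual review.
    return commands


plan = plan_followups("10.0.0.1", [(22, "ssh"), (80, "http"), (3306, "mysql")])
for argv in plan:
    print(" ".join(argv))
```

Keeping the plan separate from execution makes the logic easy to read and test; each argument list could later be handed to `subprocess.run()` with a timeout.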

Common Pitfalls in Automation Scripts

Hardcoded paths: Use variables or command-line arguments.

No error handling: Script fails silently or continues with bad data.

Race conditions: Multiple processes writing to the same file.

Infinite loops: Missing break or incorrect loop condition.

Insecure coding: Using eval() or shell=True with user input.
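The last pitfall is worth seeing in action. The sketch below passes a hostile-looking value as a list element, so no shell ever interprets it. The child process is a stand-in that just prints the arguments it received; the `target` value is a made-up example.

```python
import subprocess
import sys

# Hypothetical hostile "target" value, e.g. read from an untrusted file.
target = "10.0.0.1; echo INJECTED"

# A tiny child program that just echoes back the arguments it received.
child = [sys.executable, "-c", "import sys; print(sys.argv[1:])"]

# List form: no shell is involved, so the whole string arrives as ONE argument
# and the ';' has no special meaning.
safe = subprocess.run(child + [target], capture_output=True, text=True)
print(safe.stdout.strip())  # ['10.0.0.1; echo INJECTED']

# By contrast, subprocess.run(f"nmap {target}", shell=True) would hand the
# string to a shell, which treats ';' as a command separator and runs echo too.
```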

Exam-Relevant Scripting Patterns

Pattern 1: Loop through a file of targets and run a command.

while IFS= read -r line; do
    some_command "$line"   # replace some_command with the tool to run
done < targets.txt

Pattern 2: Parse command output and act on it.

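import subprocess
import re
output = subprocess.run(['nmap', '-p', '80', '10.0.0.1'], capture_output=True, text=True)
# Match the port line itself so "open" elsewhere (e.g. in a hostname) is not misread
if re.search(r'^80/tcp\s+open\s', output.stdout, re.MULTILINE):
    print('Port 80 is open')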

Pattern 3: Use regex to extract data.

import re
data = "Host: 192.168.1.1 (router.local)"
match = re.search(r'Host: (\S+) \((\S+)\)', data)
if match:
    ip = match.group(1)
    hostname = match.group(2)

Pattern 4: Error handling with retries.

import subprocess
import time

ip = "10.0.0.1"  # example target
for attempt in range(3):
    result = subprocess.run(['ping', '-c', '1', ip], capture_output=True)
    if result.returncode == 0:
        print(f'{ip} is up')
        break
    else:
        time.sleep(2)
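The same retry idea can be factored into a reusable helper. This is a sketch rather than a standard API: `retry` is a hypothetical function, and the flaky check is simulated with an iterator so the example runs without network access.

```python
import time


def retry(action, attempts=3, delay=2.0):
    """Call action() until it returns True or attempts run out.

    Returns the 1-based attempt number that succeeded, or None on failure.
    """
    for attempt in range(1, attempts + 1):
        if action():
            return attempt
        if attempt < attempts:
            time.sleep(delay)  # no point sleeping after the final failure
    return None


# Stand-in for a flaky check (assumption): fails twice, then succeeds.
outcomes = iter([False, False, True])
succeeded_on = retry(lambda: next(outcomes), attempts=3, delay=0.1)
print(succeeded_on)  # 3
```

Returning `None` when every attempt fails gives the caller something explicit to check, which the inline loop in Pattern 4 does not provide.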

Walk-Through

1. Identify Repetitive Tasks

Start by listing all manual steps in your penetration test that are repetitive and deterministic. Examples: scanning multiple IPs with Nmap, running Nikto on every open web port, or testing a list of credentials against SSH. For each task, note the input (target list, port list, wordlist), the command or tool, and the output format. This analysis is the foundation of automation; the exam expects you to recognize which tasks are suitable for scripting. Avoid tasks that require complex decision-making or human intuition—those remain manual.

2. Choose Scripting Language

Decide between Bash and Python based on complexity. For simple sequential command execution with minimal parsing, Bash is sufficient and faster to write. For tasks requiring complex data structures, regex, or API calls, Python is better. The exam may present both; you must understand the strengths of each. Bash excels at piping commands and file operations; Python excels at logic and library support. A common hybrid approach: use Bash for orchestration and Python for heavy lifting.

3. Write the Script Skeleton

Start with the shebang line and basic structure: variable declarations, input file handling, and a main loop. For Bash, use `#!/bin/bash` and `set -e` to exit on error. For Python, use `#!/usr/bin/env python3` and `if __name__ == '__main__':`. Define functions for reusable blocks. Include command-line argument parsing using `$1`, `$2` in Bash or `sys.argv` / `argparse` in Python. This skeleton ensures the script is modular and maintainable.
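A Python skeleton along these lines might look like the following sketch. The function names, the `--timeout` option, and the default file name `targets.txt` are illustrative assumptions, and the main loop only prints what it would do instead of calling a real tool.

```python
#!/usr/bin/env python3
"""Skeleton for a target-list driven automation script (illustrative only)."""
import argparse


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Run a check against each target.")
    parser.add_argument("targets_file", nargs="?", default="targets.txt",
                        help="file with one target per line")
    parser.add_argument("--timeout", type=int, default=30,
                        help="per-target timeout in seconds")
    return parser.parse_args(argv)


def load_targets(path):
    """Read targets, skipping blank lines and '#' comments."""
    targets = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line and not line.startswith("#"):
                targets.append(line)
    return targets


def main(argv=None):
    args = parse_args(argv)
    try:
        targets = load_targets(args.targets_file)
    except FileNotFoundError:
        print(f"Target file not found: {args.targets_file}")
        return 1
    for target in targets:
        # Placeholder for the real work (e.g. a subprocess call).
        print(f"Would scan {target} (timeout {args.timeout}s)")
    return 0


if __name__ == "__main__":
    main()
```

Accepting `argv` as a parameter keeps `parse_args` and `main` callable from tests, for example `main(["hosts.txt", "--timeout", "10"])`.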

4. Implement Core Logic with Loops and Conditionals

Write the main loop that iterates over inputs (e.g., targets from a file). Inside the loop, use conditionals to handle different scenarios (e.g., if port 80 is open, run Nikto; if SSH is open, run Hydra). Both Bash and Python provide `for` and `while` loops for this. Check the syntax carefully: in Bash, every `do` needs a `done` and every `if` needs a `fi`; in Python, indentation and colons define the blocks. Test the loop with a small subset of data first. The exam may ask you to identify missing logic, such as not handling an empty file or not checking command success.

5. Parse Output and Extract Data

After running a command, capture its output and parse it to extract relevant information. In Bash, use `grep`, `awk`, `sed`, or regex with `=~`. In Python, use the `re` module or string methods. For example, extract open ports from Nmap output: `grep 'open'` in Bash or `re.findall(r'(\d+)/tcp\s+open', output)` in Python. Store extracted data in variables or arrays for later use. The exam tests your ability to write correct regex patterns and avoid common mistakes like greedy matching.
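As a short illustration of the Python side, the sketch below pulls open TCP ports and service names out of a block of text. The `output` string is hand-written in the shape of Nmap's normal output, not captured from a real scan.

```python
import re

# Hand-written sample in the shape of Nmap's normal output (assumption).
output = """PORT     STATE    SERVICE
22/tcp   open     ssh
80/tcp   open     http
443/tcp  filtered https
3306/tcp closed   mysql
"""

# Anchor on the start of each line and require the literal state "open"
# followed by whitespace, so "filtered" and "closed" lines are skipped.
pattern = re.compile(r"^(\d+)/tcp\s+open\s+(\S+)", re.MULTILINE)

open_ports = [(int(port), service) for port, service in pattern.findall(output)]
print(open_ports)  # [(22, 'ssh'), (80, 'http')]
```

With two capture groups, `findall` returns a list of tuples, which is why the comprehension unpacks `port` and `service` together.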

6. Add Error Handling and Logging

Implement error handling to make the script robust. In Bash, check exit codes with `$?` or use `||` and `&&`. In Python, use try/except blocks. Add logging to record successes, failures, and unexpected events. Use `set -x` in Bash for debug traces or `logging` module in Python. Set timeouts to prevent hanging on unresponsive hosts. The exam may present a script without error handling and ask what could go wrong (e.g., infinite loop, unhandled exception).
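A minimal Python sketch of these ideas combined is shown below. The helper name `run_with_timeout` is hypothetical, and the two child commands are stand-ins built from the Python interpreter itself so the example runs anywhere: one finishes quickly, the other simulates a hung tool.

```python
import logging
import subprocess
import sys

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("automation")


def run_with_timeout(argv, timeout):
    """Run argv; return stdout on success, or None on timeout or failure."""
    try:
        result = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        log.warning("Timed out after %ss: %s", timeout, argv)
        return None
    except FileNotFoundError:
        log.error("Command not found: %s", argv[0])
        return None
    if result.returncode != 0:
        log.error("Exit code %d from %s", result.returncode, argv)
        return None
    log.info("Succeeded: %s", argv)
    return result.stdout


# Stand-ins for real tools: one quick command and one that hangs.
quick = run_with_timeout([sys.executable, "-c", "print('ok')"], timeout=10)
slow = run_with_timeout([sys.executable, "-c", "import time; time.sleep(5)"], timeout=0.5)
```

Each failure mode is logged and turned into a `None` return, so the calling loop can skip the target and continue instead of crashing or hanging.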

7. Test and Refine the Script

Run the script against a controlled environment (e.g., a lab with known targets). Verify that it produces the expected output and handles edge cases (e.g., no targets, all ports filtered, network timeout). Check for performance issues like excessive resource usage. Refine the script by adding comments, removing hardcoded values, and making it configurable via arguments. The exam may present a script with bugs; you must identify them (e.g., missing `fi`, incorrect regex, off-by-one errors).

What This Looks Like on the Job

In a typical enterprise penetration test, a tester might face 10,000+ IP addresses across multiple subnets. Manually scanning each with Nmap would take days. Instead, the tester writes a Bash script that reads a CIDR list, runs Nmap with the -sn ping sweep to discover live hosts, then feeds the live hosts into a full port scan. The script parses the .gnmap output to extract open ports and launches appropriate enumeration tools (e.g., Nikto for HTTP, Hydra for SSH) in parallel using background processes or xargs. This reduces a 40-hour manual task to a few hours of automated scanning.

Another scenario: During a web application test, the tester needs to check for common directories on hundreds of sites. A Python script using requests library iterates over a wordlist and a list of base URLs, sending GET requests and logging HTTP status codes. It uses threading to speed up the process and writes results to a CSV file. The script includes error handling for timeouts and SSL errors, and it respects robots.txt if needed. This automation catches hidden endpoints that would be missed by manual browsing.
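The structure of such a script can be sketched with the standard library alone. In this illustration the HTTP request is replaced by a stand-in function, `fake_probe`, so it runs without network access; in a real script that function would wrap an actual GET request with a timeout. The URLs and helper names are made up for the example.

```python
import csv
import io
from concurrent.futures import ThreadPoolExecutor


def fake_probe(url):
    """Stand-in for an HTTP GET (assumption): returns a status code for a URL."""
    known = {"http://site-a.test/admin": 200, "http://site-a.test/backup": 403}
    return known.get(url, 404)


def check_paths(base_urls, words, probe, workers=4):
    """Probe every base/word combination concurrently; return (url, status) rows."""
    urls = [f"{base.rstrip('/')}/{word}" for base in base_urls for word in words]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        statuses = list(pool.map(probe, urls))  # map preserves input order
    return list(zip(urls, statuses))


rows = check_paths(["http://site-a.test/"], ["admin", "backup", "old"], fake_probe)

# Write the results as CSV (to an in-memory buffer here instead of a file).
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["url", "status"])
writer.writerows(rows)
print(buffer.getvalue())
```

The `workers` parameter doubles as a crude rate limit: a small pool keeps the number of simultaneous requests bounded.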

A third scenario: Post-exploitation, the tester must extract sensitive data from multiple compromised hosts. A Python script uses paramiko to SSH into each host, runs commands to collect password hashes, configuration files, and browser history, then downloads the files via SFTP. The script logs all actions and handles connection failures gracefully. This ensures consistent data collection across dozens of hosts without manual intervention.

Common misconfigurations: Not setting timeouts causes scripts to hang indefinitely on unresponsive hosts. Hardcoded credentials in scripts can be exposed if the script is shared. Lack of logging makes debugging impossible. Over-parallelization can overwhelm the network or target. The experienced tester designs scripts with configuration files, logging, and rate limiting to avoid these issues.

How PT0-002 Actually Tests This

The PT0-002 exam (Objective 5.2) specifically tests your ability to analyze a script or code sample for use in a penetration test. This means you will be given a short script (Bash or Python) and asked to determine its purpose, identify errors, or modify it to achieve a different goal. The most common wrong answers come from misreading the script's logic. For example, a script that runs nmap -sS on a list of IPs and uses >> to append output to a single file might be misread as overwriting the file on each iteration. Candidates often confuse > (overwrite) with >> (append).

Another frequent trap: Regex patterns that match incorrectly. For instance, the pattern \d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3} will also match invalid IPs like 999.999.999.999 because it doesn't restrict each octet to 0-255. The exam expects you to know that a proper IP regex includes bounds checking, though often a simpler pattern is used for demonstration.
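One way to add the missing bounds check, sketched here as an option rather than the approach the exam requires, is to keep the simple regex for finding candidates and let Python's `ipaddress` module reject the invalid ones. The function name and sample text are illustrative.

```python
import ipaddress
import re

# Loose pattern: finds anything shaped like a dotted quad.
CANDIDATE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def extract_valid_ipv4(text):
    """Find dotted-quad candidates with regex, then keep only real IPv4 addresses."""
    valid = []
    for candidate in CANDIDATE.findall(text):
        try:
            ipaddress.IPv4Address(candidate)
        except ipaddress.AddressValueError:
            continue  # e.g. an octet above 255
        valid.append(candidate)
    return valid


sample = "up: 192.168.1.10, bogus: 999.999.999.999, gateway: 10.0.0.1"
print(extract_valid_ipv4(sample))  # ['192.168.1.10', '10.0.0.1']
```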

Candidates also struggle with understanding exit codes. A script that checks $? -eq 0 after a command is testing success, but many think $? contains the output rather than the exit code. Similarly, in Python, subprocess.run().returncode is often confused with stdout.
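The distinction is easy to verify in Python. In this sketch the child process is a stand-in that prints one line and then exits with code 3; no real scanner is involved.

```python
import subprocess
import sys

# A stand-in child process (assumption): prints a line, then exits with code 3.
child = [sys.executable, "-c", "import sys; print('scan finished'); sys.exit(3)"]
result = subprocess.run(child, capture_output=True, text=True)

print(result.returncode)      # 3  (the exit code, like $? in Bash)
print(result.stdout.strip())  # scan finished  (the output, like $(command))
```

A command can produce output and still fail, so checking `stdout` alone says nothing about success.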

Edge cases the exam loves: What happens when the input file is empty? The loop never runs, but the script might still create an empty output file. What if a command fails? Without error handling, the script continues with potentially corrupted data. What if a variable is unset? In Bash, using an unset variable can cause the script to behave unexpectedly; set -u catches this.

To eliminate wrong answers, always trace the script's execution step by step. Identify the shebang, variable assignments, loops, conditionals, and command calls. Determine the data flow: what input is read, how it is processed, and what output is produced. If the question asks for the script's purpose, look at the commands being run and the output file names. If it asks for an error, look for syntax mistakes (missing then, fi, do, done in Bash; indentation or missing colons in Python) or logical errors (infinite loops, off-by-one).

Key Takeaways

When a script is run directly, the shebang (`#!/bin/bash` or `#!/usr/bin/env python3`) must be its first line.

In Bash, use `$?` to check the exit code of the last command; 0 means success.

In Python, use `subprocess.run()` with `capture_output=True` and `text=True` to run system commands and capture stdout/stderr.

The regex pattern `\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}` matches IP addresses but does not validate octet range (0-255).

Always set timeouts in automation scripts to prevent hanging on unresponsive targets.

Avoid `shell=True` in Python subprocess calls due to command injection risk.

Use `set -e` in Bash to exit on first error; use `set -u` to treat unset variables as errors.

Common loop patterns: `for ip in $(cat file); do ... done` (Bash) and `with open('file') as f: for line in f: ...` (Python).

Regular expression greedy matching: `.*` matches as much as possible; use `.*?` for non-greedy.

Logging and error handling are essential for robust automation scripts – exam questions often test missing error handling.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

Bash Scripting

Native to Linux/Unix; no additional runtime needed on Kali

Best for simple command orchestration and piping

Limited data structures (only indexed arrays, plus associative arrays in Bash 4+)

String manipulation is primitive (sed, awk, cut)

Error handling is less structured (exit codes, trap)

Python Scripting

Cross-platform; requires Python interpreter (usually pre-installed)

Best for complex logic, parsing, and API interactions

Rich data structures: lists, dicts, sets, tuples

Powerful string and regex support via built-in methods and `re` module

Structured exception handling with try/except/finally

Watch Out for These

Mistake

Bash scripts must have the .sh extension to run.

Correct

The extension is not required; the shebang line (`#!/bin/bash`) determines the interpreter. The file must be executable (`chmod +x`). The .sh is a convention for human readability.

Mistake

In Python, `subprocess.run()` with `shell=True` is safe if the input is trusted.

Correct

`shell=True` is always dangerous because it invokes a shell that can interpret shell metacharacters. Even with 'trusted' input, it's best to avoid it. Use a list of arguments instead.

Mistake

The regex `\d+` matches only a single digit.

Correct

`\d+` matches one or more digits (greedy). To match a single digit, use `\d` (without `+`) or `\d{1}`.

Mistake

In Bash, `$?` contains the standard output of the last command.

Correct

`$?` holds the exit code (0 for success, non-zero for failure). To capture standard output, use command substitution: `output=$(command)`.

Mistake

A Python script must have a shebang line to run.

Correct

The shebang is only needed if you run the script directly (`./script.py`). You can always run it with `python3 script.py` regardless of shebang.


Frequently Asked Questions

What is the difference between `>` and `>>` in Bash?

`>` redirects output to a file, overwriting it if it exists. `>>` appends output to the end of the file, preserving existing content. In automation, use `>` when you want a fresh file each run, and `>>` when accumulating results from multiple commands (e.g., inside a loop). Exam tip: if a script uses `>>` inside a loop, the output file will contain all iterations; if it uses `>`, each iteration overwrites the previous.
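Python's file modes behave the same way, which makes the loop effect easy to demonstrate. This sketch writes to a temporary directory; the file names and target list are arbitrary examples.

```python
import os
import tempfile

targets = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
workdir = tempfile.mkdtemp()
overwrite_path = os.path.join(workdir, "overwrite.txt")
append_path = os.path.join(workdir, "append.txt")

for ip in targets:
    # Mode "w" truncates on every open, like `>` inside a Bash loop.
    with open(overwrite_path, "w") as f:
        f.write(f"{ip}\n")
    # Mode "a" adds to the end, like `>>`.
    with open(append_path, "a") as f:
        f.write(f"{ip}\n")

with open(overwrite_path) as f:
    overwritten = f.read().splitlines()
with open(append_path) as f:
    appended = f.read().splitlines()

print(overwritten)  # ['10.0.0.3']
print(appended)     # ['10.0.0.1', '10.0.0.2', '10.0.0.3']
```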

How do I pass arguments to a Bash script?

Arguments are accessed via positional parameters: `$1` (first argument), `$2` (second), etc. `$0` is the script name. `$#` gives the number of arguments. `$@` expands to all arguments as separate words. Example: `./script.sh target.com 80` – inside script, `target=$1` and `port=$2`. For more robust parsing, use `getopts` or `shift`.

What is the purpose of `set -e` in a Bash script?

`set -e` causes the script to exit immediately if a command returns a non-zero exit code, with some exceptions such as commands tested in an `if` condition. This prevents the script from continuing after a failure, which could lead to corrupted data or wasted time. However, it can be too aggressive if a command is expected to fail occasionally (e.g., `grep` not finding a match). In that case, append `|| true` to the command to allow the failure.

How do I run a Python script from the command line?

You can run `python3 script.py` or make the script executable with `chmod +x script.py` and add a shebang `#!/usr/bin/env python3`, then run `./script.py`. The shebang line tells the system which interpreter to use. If you omit the shebang, you must explicitly call `python3`.

What is a common mistake when using `subprocess.run()` in Python?

A common mistake is forgetting to set `capture_output=True` when you want to capture stdout/stderr. Without it, the output goes directly to the console and is not stored. Another mistake is using `shell=True` without needing it, which introduces security risks. Also, forgetting to set `text=True` (or `universal_newlines=True` in older Python) will return bytes instead of strings.

How do I extract all IP addresses from a text file using regex?

Use a pattern like `r'\b(?:[0-9]{1,3}\.){3}[0-9]{1,3}\b'` in Python's `re.findall()`. In Bash, you can use `grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}' file`. Note that this pattern does not validate that each octet is ≤255; it only checks format. For exam purposes, this is usually sufficient.

What does the `-oA` flag do in Nmap?

`-oA basename` tells Nmap to output results in all three major formats: normal (`.nmap`), grepable (`.gnmap`), and XML (`.xml`), using `basename` as the prefix. This is useful for automation because you can parse the XML with Python or the grepable format with Bash commands like `grep` and `awk`.
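The grepable format can also be parsed from Python. The sketch below works on a single hand-written line in the shape of `.gnmap` output; the regex assumes the owner field between the protocol and the service name is empty, which is the common case.

```python
import re

# Hand-written line in the shape of Nmap's grepable output (assumption).
line = ("Host: 10.0.0.1 (web01.local)\t"
        "Ports: 22/open/tcp//ssh///, 80/open/tcp//http///, 443/closed/tcp//https///")

host_match = re.search(r"^Host: (\S+)", line)
# Each port entry looks like port/state/protocol/owner/service/...
open_entries = re.findall(r"(\d+)/open/(\w+)//([^/]*)/", line)

ip = host_match.group(1)
parsed = [(int(port), proto, service) for port, proto, service in open_entries]
print(ip, parsed)
```

For anything beyond quick filtering, the XML file is usually the more reliable input because it does not depend on field positions.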
