Wintermute Framework, Part 4: Cartridges, MCP, and Surgeon

In Part 3 we wired up the AI subsystem. This post is about the capability surface the AI plays with: cartridges (in-process Python plugins whose methods auto-register as AI tools), the MCP runtime that lets us mount external MCP servers like Burp/Maltego/JIRA, and Surgeon (a shipped MCP server for firmware emulation hook generation and AFL++ fuzzing).

By the end of this post we have:

tpm20, jtag, and firmware_analysis cartridges loaded against the IoT-camera engagement,
a custom Burp-Suite MCP server registered alongside Surgeon,
a worked I²C-EEPROM extraction → static analysis chain that hands a Vulnerability back into the operation.

Hardware-side, this post focuses on OpenOCD/JTAG, firmware static analysis, and TPM 2.0 quirks. Bring an HS2 dongle if you’re following along on real silicon.

What a Cartridge Actually Is

A cartridge is a single Python module under wintermute/cartridges/ that exposes one primary class. The shipped ones are:

Cartridge module	Primary class	Purpose
`wintermute/cartridges/tpm20.py`	`tpm20`	TPM 2.0 command builder + transport (PCR state, DA lockout, fuzzing).
`wintermute/cartridges/jtag.py`	`JTAGCartridge`	OpenOCD telnet RPC: halt/resume, register/memory read, firmware dump.
`wintermute/cartridges/firmware_analysis.py`	`FirmwareAnalysisCartridge`	Stateless blob analysis: entropy, secrets, strings, basefind.

CartridgeManager (wintermute/cartridges/manager.py) is a singleton. On load("name") it imports the module, finds the primary class (by exact-name match, Cartridge suffix, or “first class defined in the module”), instantiates it, walks its public methods, and feeds each one through register_tools(...) from wintermute/ai/utils/tool_factory.py. The generated Tool objects are inserted into the global ToolRegistry. On unload("name") they are removed and an observer callback fires, so WintermuteMCP can broadcast notifications/tools/list_changed to connected clients.
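The discovery step can be sketched with nothing but the standard library. This is a minimal illustration, not the code in manager.py: find_primary_class and public_methods are hypothetical names, and the sketch assumes the three rules are tried in the order listed above.

```python
import inspect
import types


def find_primary_class(module, name):
    """Return the module's primary class using the three rules described above."""
    # Only consider classes defined in this module, in definition order.
    classes = [
        obj for obj in vars(module).values()
        if inspect.isclass(obj) and obj.__module__ == module.__name__
    ]
    if not classes:
        raise LookupError(f"no classes defined in {module.__name__}")
    for cls in classes:                      # rule 1: exact-name match
        if cls.__name__ == name:
            return cls
    for cls in classes:                      # rule 2: name ends in "Cartridge"
        if cls.__name__.endswith("Cartridge"):
            return cls
    return classes[0]                        # rule 3: first class defined


def public_methods(instance):
    """Map public method names to bound methods (underscore-prefixed ones are skipped)."""
    return {
        name: method
        for name, method in inspect.getmembers(instance, inspect.ismethod)
        if not name.startswith("_")
    }


# Build a throwaway module in memory so the sketch runs without any files.
demo = types.ModuleType("i2c")
exec(
    "class Helper:\n"
    "    pass\n"
    "class I2CCartridge:\n"
    "    def detect(self):\n"
    "        return [0x50]\n"
    "    def _close(self):\n"
    "        return None\n",
    demo.__dict__,
)
primary = find_primary_class(demo, "i2c")
tools = public_methods(primary())
print(primary.__name__, sorted(tools))
```

Each entry in tools is what a tool factory would then wrap. Helper is defined first but loses to the Cartridge-suffix rule, and _close never becomes a tool.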

The mental model: cartridges turn ordinary Python class instances into AI tools, and adding/removing a cartridge changes the AI’s capability surface live.

Loading From the Console — and What Happens

			
onoSendai [acme-iotcam-2026-Q2] > cartridges
onoSendai [acme-iotcam-2026-Q2/cartridges] > list
📦 Available Cartridges
┃ Name               ┃ Loaded ┃
┃ firmware_analysis  ┃        ┃
┃ jtag               ┃        ┃
┃ tpm20              ┃        ┃
onoSendai [.../cartridges] > load tpm20
✔ Loaded cartridge tpm20 — 9 tool(s) registered with the AI.
[*] Exposed functions: get_random, read_public, nv_read, nv_write,
    start_auth_session, test_pcr_state, test_da_lockout, fuzz_command, execute
onoSendai [.../cartridges] > load firmware_analysis
✔ Loaded cartridge firmware_analysis — 4 tool(s) registered with the AI.
[*] Exposed functions: analyze_entropy, scan_for_secrets, extract_strings,
    find_base_address
onoSendai [.../cartridges] > tpm20
onoSendai [.../cartridges/tpm20] > list
⚙️ Cartridge: tpm20 (tpm20)
┃ Function          ┃ Signature                           ┃ Description                       ┃
┃ get_random        ┃ (num_bytes: int)                    ┃ Request num_bytes of randomness…  ┃
┃ test_pcr_state    ┃ (pcr_index: int)                    ┃ Read a PCR; verify it changes…    ┃
┃ test_da_lockout   ┃ (max_attempts: int = 5)             ┃ Force a DA lockout to test reset… ┃
┃ ...               ┃                                     ┃                                   ┃
onoSendai [.../cartridges/tpm20] > run test_pcr_state 0
{'pcr_index': 0, 'value': '0x000...', 'changed_after_extend': True}

		

The LLM performs the same run through a tool call: tools.call("test_pcr_state", {"pcr_index": 0}). The shape is identical because there is a single registry.
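To make the "single registry" point concrete, here is a small sketch in which a console-style line and an LLM-style tool call both end up in the same dispatch function. REGISTRY, register, call, and run_from_console are hypothetical names for illustration, and get_random is a stand-in that talks to no TPM.

```python
import inspect
from typing import Any, Callable

# One name -> callable table shared by every front end (hypothetical stand-in).
REGISTRY: dict[str, Callable[..., Any]] = {}


def register(fn):
    """Add a function to the shared registry under its own name."""
    REGISTRY[fn.__name__] = fn
    return fn


def call(name: str, arguments: dict[str, Any]) -> Any:
    """LLM-style entry point: tool name plus a dict of keyword arguments."""
    return REGISTRY[name](**arguments)


def run_from_console(line: str) -> Any:
    """Console-style entry point: 'name arg1 arg2', coerced via type hints."""
    name, *raw = line.split()
    params = inspect.signature(REGISTRY[name]).parameters.values()
    kwargs = {}
    for param, text in zip(params, raw):
        # int(text, 0) accepts both decimal and 0x-prefixed input.
        kwargs[param.name] = int(text, 0) if param.annotation is int else text
    return call(name, kwargs)


@register
def get_random(num_bytes: int) -> dict[str, Any]:
    """Stand-in for the tpm20 method of the same name; no hardware involved."""
    return {"num_bytes": num_bytes}


print(run_from_console("get_random 16"))
print(call("get_random", {"num_bytes": 16}))
```

Both paths return the same value because the console is only an argument parser in front of call.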

Programmatic Cartridge Use

From examples/07-Programmatic-Hardware-Cartridges.ipynb, the canonical JTAG dump-then-analyse chain (here in production form, not the notebook’s in-memory fake transport):

			
from wintermute.cartridges.jtag import JTAGCartridge, OpenOCDConfig, OpenOCDTransport
from wintermute.cartridges.firmware_analysis import FirmwareAnalysisCartridge
from wintermute.utils.blob_manager import WorkspaceManager
# Real OpenOCD running locally on :4444 against the iot-cam-01 JTAG
transport = OpenOCDTransport(OpenOCDConfig(host="localhost", port=4444))
workspace = WorkspaceManager()           # defaults to ./wintermute_workspace
jtag = JTAGCartridge(transport=transport, workspace=workspace)
fa   = FirmwareAnalysisCartridge()
assert jtag.halt_core()
descriptor = jtag.dump_firmware(start_address="0x08000000",
                                size_bytes=0x100000,
                                filename="iotcam-flash.bin")
# descriptor: {"file_path": "...sha256-prefix.bin", "size_bytes": ...,
#              "sha256": "...", "type": "binary_blob"}
entropy  = fa.analyze_entropy(descriptor["file_path"])
secrets  = fa.scan_for_secrets(descriptor["file_path"])
strings  = fa.extract_strings(descriptor["file_path"], min_length=8)
base     = fa.find_base_address(descriptor["file_path"], arch="arm",
                                min_addr=0x08000000, max_addr=0x40000000)

		

Three properties of this code are crucial for the per-test-case sub-agents in Part 7:

The dump never enters the LLM context. dump_firmware writes the bytes through WorkspaceManager and returns a descriptor. When the AI calls this tool, the descriptor is what the model sees; the bytes stay on disk, addressable by file_path.
Each follow-up tool consumes file_path, not bytes. analyze_entropy, scan_for_secrets, extract_strings, and find_base_address all take a file_path. This composes cleanly: the agent says “dump → analyze entropy → find base address,” each step is a tool call, and no step ever tries to put 1 MiB of flash into a prompt.
find_base_address is multiprocess-bounded. The cartridge caps workers at os.cpu_count() // 2 and silences basefind’s progress bars, so an agent running this without supervision does not eat the whole machine.
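The three properties can be approximated in a few lines of standard-library Python. This is a sketch under assumptions, not WorkspaceManager or the firmware_analysis cartridge: store_blob, shannon_entropy, and bounded_workers are hypothetical helpers, and only the descriptor keys are taken from the example above.

```python
import hashlib
import math
import os
import tempfile
from collections import Counter
from pathlib import Path


def store_blob(data: bytes, workspace: Path) -> dict:
    """Write bytes to disk and return a small descriptor instead of the payload."""
    digest = hashlib.sha256(data).hexdigest()
    path = workspace / f"{digest[:12]}.bin"   # file naming scheme is an assumption
    path.write_bytes(data)
    return {
        "file_path": str(path),
        "size_bytes": len(data),
        "sha256": digest,
        "type": "binary_blob",
    }


def shannon_entropy(file_path: str) -> float:
    """Follow-up tools take a path, never bytes: entropy in bits per byte."""
    data = Path(file_path).read_bytes()
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def bounded_workers() -> int:
    """Half the cores, but never zero (cpu_count() can be 1 or None)."""
    return max(1, (os.cpu_count() or 1) // 2)


workspace = Path(tempfile.mkdtemp(prefix="wm_sketch_"))
desc = store_blob(bytes(range(256)) * 4, workspace)   # 1 KiB, uniform byte values
entropy = shannon_entropy(desc["file_path"])
print(desc["size_bytes"], round(entropy, 3), bounded_workers())
```

The descriptor is a handful of short strings regardless of blob size, which is what keeps the prompt small. The max(1, ...) guard matters on single-core containers, where a plain os.cpu_count() // 2 evaluates to 0.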

Tying a Cartridge Method To a Live Test Case

Sub-agents in Part 7 will repeatedly do the following: take a TestCaseRun.bound[i].object_id (the peripheral hostname/alias), find the right cartridge method, and invoke it with the right arguments.

For IOT-HW-UART-001:iot-cam-01:debug-uart, the steps in the test plan are “capture boot output,” “test BREAK interrupt,” “verify root shell.” None of those map 1-to-1 to JTAG cartridge methods, but the JTAG cartridge is the one that proves a UART exposes interruptible U-Boot — by halting the core, reading PC after a BREAK, and verifying it landed in the bootloader. The sub-agent is the thing that bridges the natural-language step in the test case to the typed method on a cartridge.

For IOT-HW-I2C-001:iot-cam-01:mcio-eeprom-1, however, none of the shipped cartridges has an I²C method — and that is fine. We will add a small I2CCartridge next, and because cartridges hot-load, the orchestrator picks it up without restart.
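As a rough illustration of the lookup, the two run IDs above appear to follow a test-case:device:peripheral pattern. The sketch below assumes that pattern and uses a static routing table; both are simplifications, since in Part 7 the mapping is done by the sub-agent rather than by a table. RunRef, parse_run_id, ROUTES, and pick_cartridge are hypothetical names.

```python
from typing import NamedTuple, Optional


class RunRef(NamedTuple):
    test_case_id: str
    device: str
    peripheral: str


def parse_run_id(run_id: str) -> RunRef:
    """Split 'TESTCASE:device:peripheral' (format inferred from the IDs in this post)."""
    parts = run_id.split(":")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"unexpected run id: {run_id!r}")
    return RunRef(*parts)


# Hypothetical routing: bus token in the test-case ID -> cartridge name.
ROUTES = {"UART": "jtag", "I2C": "i2c"}


def pick_cartridge(ref: RunRef, routes: dict = ROUTES) -> Optional[str]:
    """Return the cartridge for this run, or None if nothing is mapped."""
    tokens = ref.test_case_id.split("-")
    bus = tokens[2] if len(tokens) > 2 else ""
    return routes.get(bus)


ref = parse_run_id("IOT-HW-I2C-001:iot-cam-01:mcio-eeprom-1")
print(ref.peripheral, pick_cartridge(ref))
```

A None result is the signal that no cartridge covers the bus yet, which is exactly the I²C situation before the next section.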

Writing Your Own Cartridge

Drop a new file wintermute/cartridges/i2c.py:

			
from __future__ import annotations
from typing import Any
from smbus2 import SMBus       # apt install python3-smbus2
class I2CCartridge:
    """Direct I²C bus access via Linux i2c-dev for EEPROM extraction."""
    def __init__(self, bus_number: int = 2) -> None:
        self.bus = SMBus(bus_number)
    def detect(self) -> list[int]:
        """Probe addresses 0x03..0x77 and return the list of responders."""
        found = []
        for addr in range(0x03, 0x78):
            try:
                self.bus.read_byte(addr)
                found.append(addr)
            except OSError:
                pass
        return found
    def dump_eeprom(self, address: int, size: int = 256) -> bytes:
        """Sequential read of `size` bytes from device `address`. Returns raw
        bytes; tool_factory offloads them to a workspace blob descriptor, so
        the LLM never sees them.
        """
        # i2c sequential read: set the address pointer to 0x00, then read `size` bytes
        self.bus.write_byte(address, 0)
        data = bytes(self.bus.read_byte(address) for _ in range(size))
        return data    # tool_factory.LARGE_PAYLOAD_THRESHOLD_BYTES handles offload
    def write_byte(self, address: int, register: int, value: int) -> dict[str, int]:
        """Write one byte `value` to `register` on device `address`."""
        self.bus.write_byte_data(address, register, value)
        return {"address": address, "register": register, "value": value}

		

That is the entire cartridge. Now:

			
onoSendai [.../cartridges] > load i2c
✔ Loaded cartridge i2c — 3 tool(s) registered with the AI.
[*] Exposed functions: detect, dump_eeprom, write_byte
onoSendai [.../cartridges/i2c] > run detect
[80, 81]                    # 0x50 0x51
onoSendai [.../cartridges/i2c] > run dump_eeprom 0x50 256
{'file_path': './wintermute_workspace/blob-3a4f.../...bin',
 'size_bytes': 256, 'sha256': '...', 'type': 'binary_blob'}

		

Two specific Wintermute behaviors made this easy:

tool_factory.function_to_tool reads type hints. Returning bytes (dump_eeprom) automatically goes through _maybe_offload_payload and becomes a workspace blob descriptor.
CartridgeManager._find_primary_class picks I2CCartridge because its name ends in Cartridge. No registration boilerplate.
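To show what "reads type hints" can mean in practice, the sketch below turns a function signature into an OpenAI-style function-calling schema. It is an illustration only, not tool_factory's code: function_to_schema and the type table are hypothetical, and the dump_eeprom here is a hardware-free stand-in.

```python
import inspect
from typing import Any, get_type_hints

# Minimal Python-type -> JSON-Schema-type table (anything else falls back to string).
_JSON_TYPES = {int: "integer", float: "number", str: "string", bool: "boolean"}


def function_to_schema(fn) -> dict[str, Any]:
    """Build a function-calling schema from a signature and docstring."""
    hints = get_type_hints(fn)          # resolves string annotations too
    properties: dict[str, Any] = {}
    required: list[str] = []
    for name, param in inspect.signature(fn).parameters.items():
        if name == "self":
            continue
        properties[name] = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)       # no default means the model must supply it
    description = (inspect.getdoc(fn) or "").split("\n")[0]
    return {
        "type": "function",
        "function": {
            "name": fn.__name__,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }


def dump_eeprom(address: int, size: int = 256) -> bytes:
    """Sequential read of `size` bytes from device `address`."""
    return b""                          # stand-in body


schema = function_to_schema(dump_eeprom)
print(schema["function"]["parameters"])
```

The docstring's first line becomes the tool description, which is one reason to give every public cartridge method a docstring.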

This is what “agentic framework for hardware red teams” actually means — turning a 30-line wrapper around smbus2 into something the orchestrator can compose with TPM, JTAG, and firmware analysis without writing any glue. We will use this exact I2CCartridge in Part 6 and Part 7.
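Before pointing the cartridge at real silicon, the detect and dump logic can be exercised against an in-memory bus, in the spirit of the notebook's fake transport. FakeEEPROMBus below is a hypothetical stand-in that mimics only the two smbus2 calls the cartridge uses (read_byte and write_byte) for EEPROMs with an 8-bit address pointer.

```python
class FakeEEPROMBus:
    """In-memory stand-in for smbus2.SMBus with 8-bit-addressed EEPROMs."""

    def __init__(self, devices: dict[int, bytes]) -> None:
        self._memory = {addr: bytes(data) for addr, data in devices.items()}
        self._pointer = {addr: 0 for addr in devices}

    def _require(self, address: int) -> None:
        # Real buses surface a missing device as OSError (no ACK).
        if address not in self._memory:
            raise OSError(121, "Remote I/O error")

    def write_byte(self, address: int, value: int) -> None:
        """A single data byte sets the EEPROM's internal address pointer."""
        self._require(address)
        self._pointer[address] = value % len(self._memory[address])

    def read_byte(self, address: int) -> int:
        """Current-address read: return one byte and advance the pointer."""
        self._require(address)
        memory = self._memory[address]
        pointer = self._pointer[address]
        self._pointer[address] = (pointer + 1) % len(memory)
        return memory[pointer]


def detect(bus) -> list[int]:
    """Same probing loop as I2CCartridge.detect, against any bus-like object."""
    found = []
    for addr in range(0x03, 0x78):
        try:
            bus.read_byte(addr)
            found.append(addr)
        except OSError:
            pass
    return found


def dump(bus, address: int, size: int) -> bytes:
    """Same sequential read as I2CCartridge.dump_eeprom."""
    bus.write_byte(address, 0)
    return bytes(bus.read_byte(address) for _ in range(size))


bus = FakeEEPROMBus({0x50: bytes(range(256)), 0x51: b"\xff" * 256})
print(detect(bus), dump(bus, 0x50, 4).hex())
```

Note that detect advances each device's pointer by one byte, so the pointer reset at the start of dump is what makes the read start at offset 0.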

The MCP Side — Two Directions

Wintermute speaks MCP both ways:

Outbound (integrations/mcp_runtime.py): MCPRuntime and the mcp register/start/stop console family let Wintermute consume an external MCP server. Tools from that server land in the same global registry as cartridge methods, so the LLM does not see the seam.
Inbound (WintermuteMCP.py): wintermute-mcp runs as an MCP server exposing 80+ tools (operation CRUD, devices, vulnerabilities, AI chat, reports, cartridges, SSH, depthcharge, burp ingest…). Any MCP client (Claude Desktop, Cursor, or our own orchestrator from Part 6) drives the framework remotely.

Outbound: Mounting an External MCP Server From the Console

Suppose we have a Burp Suite MCP server (a community project that exposes Burp’s REST API as MCP tools). Register, start, list:

			
onoSendai > mcp register burpsuite uvx burp-mcp --proxy http://127.0.0.1:8080
[*] Registered MCP server burpsuite (uvx burp-mcp --proxy http://127.0.0.1:8080)
    Saved to ~/.config/wintermute/mcp_servers.json
onoSendai > mcp start burpsuite
[*] Server burpsuite started (PID 84211)
onoSendai > tools mcp
🔌 External MCP Tools
┃ Name                  ┃ Description                       ┃ Server     ┃
┃ burp_active_scan      ┃ Trigger an active scan against…   ┃ burpsuite  ┃
┃ burp_get_issues       ┃ Pull current issue list…          ┃ burpsuite  ┃
┃ ...                   ┃                                   ┃            ┃
onoSendai > tools list
🧰 Native AI Tools
┃ Name                  ┃ Description           ┃ Source     ┃
┃ get_random            ┃ TPM 2.0 randomness…   ┃ internal   ┃
┃ analyze_entropy       ┃ Shannon entropy…      ┃ internal   ┃
┃ burp_active_scan      ┃ ...                   ┃ mcp        ┃

		

The agent now sees the union — analyze_entropy (cartridge), burp_active_scan (external MCP), nv_read (TPM), dump_firmware (JTAG) — all callable by name through ToolsRuntime.run_tool. This is how we drop a Maltego MCP, a JIRA MCP, an internal Confluence MCP onto an engagement and have the orchestrator (Part 6) discover them on the fly.

Inbound: Driving Wintermute From Claude Desktop

Run the server from a different terminal:

$ wintermute-mcp --transport stdio

In ~/.config/Claude/claude_desktop_config.json:

			
{
  "mcpServers": {
    "wintermute": {
      "command": "wintermute-mcp",
      "args": ["--transport", "stdio"]
    }
  }
}

		

Now Claude Desktop has tools like create_operation, add_device, add_peripheral_to_device, setup_storage_backend, add_test_plan_from_json, generate_test_runs, update_test_run_status, add_vulnerability_to_test_run, generate_report, run_ssh_command, execute_depthcharge_catalog, execute_depthcharge_memory_dump, ingest_burp_scan, attach_evidence, load_cartridge — every one of which mutates the same Operation container we built in Part 2.

A red-team workflow becomes “Claude, load the IoT camera engagement, attach TestPlans/TP-HW-BLACKBOX-001.json, generate runs, then for the I²C-EEPROM run, dump it via the i2c cartridge and analyze the dump.” Claude composes the calls in the right order; Wintermute persists the result.
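Under the hood, each of those calls is a JSON-RPC 2.0 message. The sketch below builds the two message shapes this post relies on, a tools/call request and the notifications/tools/list_changed notification, using only the json module. The envelope follows the MCP specification; the {"name": "i2c"} argument for load_cartridge is an assumption, since the tool's real parameter names are not shown here.

```python
import json
from itertools import count

_ids = count(1)   # JSON-RPC request ids only need to be unique per session


def tools_call(name: str, arguments: dict) -> dict:
    """Build an MCP tools/call request (expects a response, so it carries an id)."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def tools_list_changed() -> dict:
    """Build the server-to-client notification (no id, so no response is expected)."""
    return {"jsonrpc": "2.0", "method": "notifications/tools/list_changed"}


# Argument name "name" is hypothetical; check the tool's schema via tools/list.
request = tools_call("load_cartridge", {"name": "i2c"})
wire = json.dumps(request)
print(wire)
print(json.dumps(tools_list_changed()))
```

After a cartridge load, a client that receives the notification would typically re-issue tools/list to pick up the new tools.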

Surgeon: Firmware Hooks and AFL++ Fuzzing

Surgeon is Wintermute’s purpose-built MCP server for firmware emulation hook generation. Source: wintermute/integrations/surgeon/server.py.

The exposed MCP tools (and the offensive use case for each):

Surgeon tool	Purpose
`create_hook_skeleton`	Generate a C hook for a peripheral’s MMIO region (UART/WIFI/JTAG/PCIE/USB/…).
`list_firmware_symbols`	`nm` over the firmware ELF — finds candidate hook addresses (functions to instrument).
`write_config_file`	Write YAML/JSON configs into the Surgeon project tree.
`build_firmware`	`make build FIRMWARE=<name>` — compiles the instrumented binary.
`start_fuzzing`	`make run-fuzz FIRMWARE=<name>` — launches the AFL++ docker container.
`get_fuzzer_stats`	Read `fuzzer_stats` from the AFL output dir.
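For a sense of what get_fuzzer_stats has to read: AFL++ writes fuzzer_stats as plain "key : value" lines. The parser below is a sketch of that idea, not Surgeon's implementation; parse_fuzzer_stats and crash_count are hypothetical helpers, and SAMPLE is made-up illustrative content rather than output from a real campaign.

```python
def parse_fuzzer_stats(text: str) -> dict[str, str]:
    """Parse AFL++ 'key : value' lines into a dict of strings."""
    stats: dict[str, str] = {}
    for line in text.splitlines():
        # Split on the first colon only; values such as command_line may contain more.
        key, sep, value = line.partition(":")
        if sep:
            stats[key.strip()] = value.strip()
    return stats


def crash_count(stats: dict[str, str]) -> int:
    """AFL++ 4.x reports saved_crashes; older releases used unique_crashes."""
    return int(stats.get("saved_crashes", stats.get("unique_crashes", "0")))


# Illustrative sample only, not taken from a real run.
SAMPLE = """\
start_time        : 1700000000
execs_done        : 123456
execs_per_sec     : 812.40
saved_crashes     : 2
command_line      : afl-fuzz -i in -o out -- ./target @@
"""

stats = parse_fuzzer_stats(SAMPLE)
print(stats["execs_done"], crash_count(stats))
```

Keeping the values as strings and converting only the fields you need avoids guessing at types for keys that vary between AFL++ versions.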

The interesting one is create_hook_skeleton. It takes a peripheral type (UART, WIFI, BLUETOOTH, ETHERNET, USB, PCIE, JTAG, TPM, ZIGBEE) and a malicious_snippet of C code, and produces an emulator hook with read/write callbacks at the peripheral’s MMIO base. The malicious_snippet lets a red-team scenario inject arbitrary fault behavior, such as an integer overflow on a register read, randomized RX bytes on a radio, or a fake PCIe vendor ID:

			
# Driven via MCP from the orchestrator
hook = await surgeon_session.call_tool("create_hook_skeleton", {
    "firmware_name": "iotcam_v3",
    "peripheral_name": "wifi_chip",
    "address_base": "0x40001000",
    "peripheral_type": "WIFI",
    "malicious_snippet":
        "if (offset == 0x08 && (rand() % 100 < 5)) "
        "  *val = 0xDEADBEEF;  // 5% radio noise injection",
})

		

This produces a C file under <SURGEON_ROOT>/src/runtime/handlers/iotcam_v3/wifi_chip.c, ready for make build FIRMWARE=iotcam_v3. Hook in an SDR-shaped radio abstraction (TX writes are logged, RX reads can be poisoned), build, then call start_fuzzing. AFL++ inside the Surgeon docker container drives the emulated firmware against the hook, looking for crashes. This is the offensive playbook for “find a memory-corruption bug in the WiFi RX path of a firmware blob you can’t run on real hardware.”

Wiring Surgeon Into Wintermute

SurgeonBackend is the in-process bridge. From examples/04-AI-Enrichment-and-Tools.ipynb extended to Surgeon:

			
from wintermute.ai.tools_runtime import ToolsRuntime
from wintermute.integrations.surgeon.backend import SurgeonBackend
runtime = ToolsRuntime()
backend = SurgeonBackend(surgeon_root="/opt/surgeon")
await backend.start()
runtime.register_backend(backend)
# Now the LLM can call create_hook_skeleton, build_firmware, start_fuzzing
# alongside cartridge methods
all_tools = await runtime.get_all_tools()
print([t["function"]["name"] for t in all_tools])

		

SurgeonBackend.start() (integrations/surgeon/backend.py) spawns the Surgeon FastMCP server as a subprocess and connects via stdio. get_ai_tools() returns the FastMCP tool list converted into OpenAI function-calling format; execute_tool() round-trips through MCP. From the LLM’s vantage, Surgeon tools are indistinguishable from cartridge methods.
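The reason the seam is invisible can be shown with a toy runtime that aggregates several backends behind one tool list and one dispatch call. This is a sketch under assumptions, not ToolsRuntime: MiniRuntime and StaticBackend are hypothetical, the method names mirror the ones mentioned above, and treating get_ai_tools and execute_tool as coroutines is a guess.

```python
import asyncio
from typing import Any, Callable


class StaticBackend:
    """Toy backend: a dict of name -> callable, exposed as function-calling tools."""

    def __init__(self, handlers: dict[str, Callable[..., Any]]) -> None:
        self._handlers = handlers

    async def get_ai_tools(self) -> list[dict[str, Any]]:
        return [
            {"type": "function",
             "function": {"name": name,
                          "parameters": {"type": "object", "properties": {}}}}
            for name in self._handlers
        ]

    async def execute_tool(self, name: str, arguments: dict[str, Any]) -> Any:
        return self._handlers[name](**arguments)


class MiniRuntime:
    """Aggregates backends; callers only ever see tool names."""

    def __init__(self) -> None:
        self._backends: list[StaticBackend] = []

    def register_backend(self, backend: StaticBackend) -> None:
        self._backends.append(backend)

    async def get_all_tools(self) -> list[dict[str, Any]]:
        tools: list[dict[str, Any]] = []
        for backend in self._backends:
            tools.extend(await backend.get_ai_tools())
        return tools

    async def run_tool(self, name: str, arguments: dict[str, Any]) -> Any:
        # First backend that advertises the name handles the call.
        for backend in self._backends:
            advertised = {t["function"]["name"] for t in await backend.get_ai_tools()}
            if name in advertised:
                return await backend.execute_tool(name, arguments)
        raise KeyError(f"unknown tool: {name}")


async def demo() -> tuple[list[str], Any]:
    runtime = MiniRuntime()
    runtime.register_backend(StaticBackend({"analyze_entropy": lambda file_path: 7.9}))
    runtime.register_backend(
        StaticBackend({"build_firmware": lambda firmware_name: f"built {firmware_name}"}))
    names = [t["function"]["name"] for t in await runtime.get_all_tools()]
    result = await runtime.run_tool("build_firmware", {"firmware_name": "iotcam_v3"})
    return names, result


names, result = asyncio.run(demo())
print(names, result)
```

A real backend would replace the dict lookup in execute_tool with an MCP round trip over stdio; the runtime code would not need to change.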

A Composed Pentest Step: I²C-EEPROM Extraction Into a Vulnerability

Putting everything in this post together, here is the sequence the Part-7 sub-agent will execute autonomously for IOT-HW-I2C-001:iot-cam-01:mcio-eeprom-1:

			
# Manual transcript — Part 7 will let the LLM make these calls itself
from wintermute.cartridges.manager import CartridgeManager
from wintermute.ai.tools_runtime import tools as registry
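from wintermute.findings import ReproductionStep, Vulnerability
# Assumed to be in scope already: `op` (the active Operation from Part 2) and
# `RunStatus` (the run-status enum); their imports are not shown in this transcript.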
# 0. Cartridges available on the active operation
mgr = CartridgeManager()
mgr.load("i2c")
mgr.load("firmware_analysis")
# 1. Detect addresses on bus 2
addrs = registry.call("detect", {})["result"]
assert 0x50 in addrs
# 2. Dump the EEPROM — descriptor returned, no bytes in context
desc = registry.call("dump_eeprom", {"address": 0x50, "size": 256})
# 3. Static analysis on the dump
strings = registry.call("extract_strings",
                        {"file_path": desc["file_path"], "min_length": 8})["result"]
secrets = registry.call("scan_for_secrets",
                        {"file_path": desc["file_path"]})["result"]
# 4. Decide & write back into the live operation
interesting = strings["top_20_interesting_strings"]
creds_present = any("password" in s or "admin" in s for s in interesting)
pem_present   = bool(secrets["matches"].get("pem_block", []))
run = next(r for r in op.test_runs
           if r.run_id == "IOT-HW-I2C-001:iot-cam-01:mcio-eeprom-1")
run.start()
if creds_present or pem_present:
    vuln = Vulnerability(
        title="Hardcoded credentials/keys in I2C EEPROM (MCIO bus 2, 0x50)",
        description=(
            f"Recovered top strings: {interesting[:3]} ... "
            f"PEM blocks at offsets: {secrets['matches'].get('pem_block', [])}"
        ),
        cvss=8,
        threat="unauthorized device access via static credentials",
        reproduction_steps=[
            ReproductionStep(
                title="Detect I2C devices on bus 2",
                tool="i2c.detect", action="probe", confidence=80,
                arguments=[],
            ),
            ReproductionStep(
                title="Sequential read of 256 bytes from 0x50",
                tool="i2c.dump_eeprom", action="read",
                confidence=90, arguments=["0x50", "256"],
            ),
            ReproductionStep(
                title="Extract printable strings (>=8 chars)",
                tool="firmware_analysis.extract_strings",
                action="analyze", confidence=80,
                arguments=[desc["file_path"], "8"],
            ),
        ],
    )
    run.findings.append(vuln)
    run.status = RunStatus.failed       # a finding means the target failed this test case
else:
    run.status = RunStatus.passed
run.finish()
op.save()

		

That is the shape of every per-test-case sub-agent in Part 7. The sub-agent’s job is to translate the natural-language steps in TestCase.steps into the right cartridge / MCP / Surgeon tool calls in the right order, then write the verdict into the live TestCaseRun. The framework already ships:

the cartridge surface (in-process, hot-loadable),
the MCP surface (external tools, hot-mountable),
the workspace (large blobs offloaded automatically),
the run/finding/repro-step model.

The agent layer above is genuinely small once the framework underneath does this much work for it.
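One piece worth isolating is the verdict in step 4, because a pure function can be tested without hardware or an operation. The sketch below assumes the result shapes used in the transcript (top_20_interesting_strings and matches["pem_block"]); has_finding and KEYWORDS are hypothetical names, and unlike the transcript the keyword match here is case-insensitive.

```python
from typing import Any

# Keywords taken from the transcript; extend per engagement.
KEYWORDS = ("password", "admin")


def has_finding(strings: dict[str, Any], secrets: dict[str, Any]) -> bool:
    """True if the dump shows credential-like strings or any PEM block."""
    interesting = strings.get("top_20_interesting_strings", [])
    creds_present = any(
        keyword in text.lower() for text in interesting for keyword in KEYWORDS
    )
    pem_present = bool(secrets.get("matches", {}).get("pem_block", []))
    return creds_present or pem_present


# Illustrative inputs shaped like the tool results above.
clean = has_finding({"top_20_interesting_strings": ["U-Boot 2019.07"]},
                    {"matches": {}})
dirty = has_finding({"top_20_interesting_strings": ["Admin_Password=hunter2"]},
                    {"matches": {"pem_block": []}})
print(clean, dirty)
```

Lowercasing before the comparison catches strings like "Admin_Password", which a case-sensitive check would miss.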

What’s Next

Part 5 builds our first end-to-end agentic flow: a single-prompt Claude call that, given the IoT camera operation, chooses whether to dump the EEPROM, runs through tool_calling_chat’s loop, and writes a finding. It is intentionally simple (one agent, one test case) because Part 6 and Part 7 generalize it into the orchestrator and the per-test-case sub-agents.
