Wintermute Framework, Part 4: Cartridges, MCP, and Surgeon

In Part 3 we wired up the AI subsystem. This post is about the capability surface the AI plays with: cartridges (in-process Python plugins whose methods auto-register as AI tools), the MCP runtime that lets us mount external MCP servers like Burp/Maltego/JIRA, and Surgeon (a shipped MCP server for firmware emulation hook generation and AFL++ fuzzing).

By the end of this post we have:

  • tpm20, jtag, and firmware_analysis cartridges loaded against the
    IoT-camera engagement,
  • a custom Burp-Suite MCP server registered alongside Surgeon,
  • a worked I²C-EEPROM extraction → static analysis chain that hands a
    Vulnerability back into the operation.

Hardware-side, this post focuses on OpenOCD/JTAG, firmware static analysis, and TPM 2.0 quirks. Bring an HS2 dongle if you’re following along on real silicon.

What a Cartridge Actually Is

A cartridge is a single Python module under wintermute/cartridges/ that exposes one primary class. The shipped ones are:

| Cartridge module | Primary class | Purpose |
|---|---|---|
| wintermute/cartridges/tpm20.py | tpm20 | TPM 2.0 command builder + transport (PCR state, DA lockout, fuzzing). |
| wintermute/cartridges/jtag.py | JTAGCartridge | OpenOCD telnet RPC: halt/resume, register/memory read, firmware dump. |
| wintermute/cartridges/firmware_analysis.py | FirmwareAnalysisCartridge | Stateless blob analysis: entropy, secrets, strings, basefind. |

CartridgeManager (wintermute/cartridges/manager.py) is a singleton. On load("name") it imports the module, finds the primary class (by exact-name match, Cartridge suffix, or “first class defined in the module”), instantiates it, walks its public methods, and feeds each one through register_tools(...) from wintermute/ai/utils/tool_factory.py. The generated Tool objects are inserted into the global ToolRegistry. On unload("name") they are removed and an observer callback fires, so WintermuteMCP can broadcast notifications/tools/list_changed to connected clients.
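
The discovery order described above can be approximated in a few lines. This is a minimal sketch, not the code in manager.py: find_primary_class and public_methods are hypothetical names, and the demo module is built in memory purely for illustration.

```python
import inspect
import types


def find_primary_class(module: types.ModuleType) -> type:
    """Pick a primary class: exact module name, then *Cartridge suffix, then first defined."""
    short_name = module.__name__.rsplit(".", 1)[-1]
    # Only classes defined in this module count, not ones it merely imports.
    classes = [
        obj for obj in vars(module).values()
        if inspect.isclass(obj) and obj.__module__ == module.__name__
    ]
    if not classes:
        raise LookupError(f"no classes defined in {module.__name__}")
    for cls in classes:
        if cls.__name__ == short_name:
            return cls
    for cls in classes:
        if cls.__name__.endswith("Cartridge"):
            return cls
    return classes[0]  # module dicts preserve definition order


def public_methods(instance: object) -> dict[str, object]:
    """Return bound callables whose names do not start with an underscore."""
    return {
        name: member
        for name, member in inspect.getmembers(instance, callable)
        if not name.startswith("_")
    }


# Illustrative module: a helper class defined first, then the real cartridge.
DEMO_SOURCE = (
    "class Helper:\n"
    "    pass\n"
    "class I2CCartridge:\n"
    "    def detect(self):\n"
    "        return [0x50]\n"
    "    def _private(self):\n"
    "        return None\n"
)
demo = types.ModuleType("i2c")
exec(DEMO_SOURCE, demo.__dict__)

primary = find_primary_class(demo)
print(primary.__name__, sorted(public_methods(primary())))
```

The suffix rule wins over "first class defined", which is why a helper class placed above the cartridge class does not get picked by accident.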

The mental model: cartridges turn ordinary Python class instances into AI tools, and adding/removing a cartridge changes the AI’s capability surface live.
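
One way to picture the live capability surface is a registry that notifies subscribers whenever its contents change. The sketch below is a stand-in under stated assumptions, not the real ToolRegistry: MiniToolRegistry, subscribe, and the "loaded"/"unloaded" event strings are hypothetical, and only the register/unregister/notify shape is shown.

```python
from typing import Callable

ToolFn = Callable[..., object]
Listener = Callable[[str, list[str]], None]


class MiniToolRegistry:
    """Toy registry: tools grouped by owning cartridge, with change listeners."""

    def __init__(self) -> None:
        self._tools: dict[str, ToolFn] = {}
        self._owners: dict[str, list[str]] = {}
        self._listeners: list[Listener] = []

    def subscribe(self, listener: Listener) -> None:
        self._listeners.append(listener)

    def register(self, owner: str, tools: dict[str, ToolFn]) -> None:
        self._tools.update(tools)
        self._owners[owner] = list(tools)
        self._notify("loaded")

    def unregister(self, owner: str) -> None:
        for name in self._owners.pop(owner, []):
            self._tools.pop(name, None)
        self._notify("unloaded")

    def call(self, name: str, arguments: dict[str, object]) -> object:
        return self._tools[name](**arguments)

    def _notify(self, event: str) -> None:
        # Every listener receives the event plus the current tool names.
        names = sorted(self._tools)
        for listener in self._listeners:
            listener(event, names)


registry = MiniToolRegistry()
registry.subscribe(lambda event, names: print(event, names))
registry.register("tpm20", {"get_random": lambda num_bytes: bytes(num_bytes)})
registry.unregister("tpm20")
```

In the real framework the listener role is played by the observer callback that lets WintermuteMCP broadcast notifications/tools/list_changed.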

Loading From the Console — and What Happens

onoSendai [acme-iotcam-2026-Q2] > cartridges
onoSendai [acme-iotcam-2026-Q2/cartridges] > list
📦 Available Cartridges
┃ Name ┃ Loaded ┃
┃ firmware_analysis ┃ ┃
┃ jtag ┃ ┃
┃ tpm20 ┃ ┃
onoSendai [.../cartridges] > load tpm20
✔ Loaded cartridge tpm20 — 9 tool(s) registered with the AI.
[*] Exposed functions: get_random, read_public, nv_read, nv_write,
start_auth_session, test_pcr_state, test_da_lockout, fuzz_command, execute
onoSendai [.../cartridges] > load firmware_analysis
✔ Loaded cartridge firmware_analysis — 4 tool(s) registered with the AI.
[*] Exposed functions: analyze_entropy, scan_for_secrets, extract_strings,
find_base_address
onoSendai [.../cartridges] > tpm20
onoSendai [.../cartridges/tpm20] > list
⚙️ Cartridge: tpm20 (tpm20)
┃ Function ┃ Signature ┃ Description ┃
┃ get_random ┃ (num_bytes: int) ┃ Request num_bytes of randomness… ┃
┃ test_pcr_state ┃ (pcr_index: int) ┃ Read a PCR; verify it changes… ┃
┃ test_da_lockout ┃ (max_attempts: int = 5) ┃ Force a DA lockout to test reset… ┃
┃ ... ┃ ┃ ┃
onoSendai [.../cartridges/tpm20] > run test_pcr_state 0
{'pcr_index': 0, 'value': '0x000...', 'changed_after_extend': True}

The LLM performs the same run through a tool call: tools.call("test_pcr_state", {"pcr_index": 0}). The shape is identical because there is a single registry.

Programmatic Cartridge Use

From examples/07-Programmatic-Hardware-Cartridges.ipynb, the canonical JTAG dump-then-analyse chain (here in production form, not the notebook’s in-memory fake transport):

from wintermute.cartridges.jtag import JTAGCartridge, OpenOCDConfig, OpenOCDTransport
from wintermute.cartridges.firmware_analysis import FirmwareAnalysisCartridge
from wintermute.utils.blob_manager import WorkspaceManager

# Real OpenOCD running locally on :4444 against the iot-cam-01 JTAG
transport = OpenOCDTransport(OpenOCDConfig(host="localhost", port=4444))
workspace = WorkspaceManager()  # defaults to ./wintermute_workspace
jtag = JTAGCartridge(transport=transport, workspace=workspace)
fa = FirmwareAnalysisCartridge()

assert jtag.halt_core()
descriptor = jtag.dump_firmware(start_address="0x08000000",
                                size_bytes=0x100000,
                                filename="iotcam-flash.bin")
# descriptor: {"file_path": "...sha256-prefix.bin", "size_bytes": ...,
#              "sha256": "...", "type": "binary_blob"}
entropy = fa.analyze_entropy(descriptor["file_path"])
secrets = fa.scan_for_secrets(descriptor["file_path"])
strings = fa.extract_strings(descriptor["file_path"], min_length=8)
base = fa.find_base_address(descriptor["file_path"], arch="arm",
                            min_addr=0x08000000, max_addr=0x40000000)

Three properties of this code are crucial for the per-test-case sub-agents in Part 7:

  1. The dump never enters the LLM context. dump_firmware writes the
    bytes through WorkspaceManager and returns a descriptor. When the AI
    calls this tool, the descriptor is what the model sees; the bytes stay on
    disk addressable by file_path.
  2. Each follow-up tool consumes file_path, not bytes. analyze_entropy,
    scan_for_secrets, extract_strings, find_base_address all take a
    file_path. This composes cleanly: the agent says “dump → analyze entropy
    → find base address,” each step is a tool call, no step ever attempts
    to put 1 MiB of flash into a prompt.
  3. find_base_address is multiprocess-bounded. The cartridge caps
    workers at os.cpu_count() // 2 and silences basefind’s progress bars,
    so an agent running this without supervision does not eat the whole
    machine.
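
The three properties can be sketched without any hardware. The code below is an illustration under stated assumptions, not WorkspaceManager or the firmware_analysis cartridge: store_blob, shannon_entropy, and worker_cap are hypothetical helpers, and the descriptor keys simply mirror the ones shown in the comment above.

```python
import hashlib
import math
import os
import tempfile
from collections import Counter
from pathlib import Path


def store_blob(data: bytes, workspace: Path) -> dict[str, object]:
    """Write bytes to disk and return a small descriptor instead of the payload."""
    digest = hashlib.sha256(data).hexdigest()
    path = workspace / f"{digest[:16]}.bin"
    path.write_bytes(data)
    return {
        "file_path": str(path),
        "size_bytes": len(data),
        "sha256": digest,
        "type": "binary_blob",
    }


def shannon_entropy(file_path: str) -> float:
    """Shannon entropy in bits per byte, computed from a path rather than from bytes."""
    data = Path(file_path).read_bytes()
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def worker_cap() -> int:
    """Half the CPUs, but never zero (os.cpu_count() may return None or 1)."""
    return max(1, (os.cpu_count() or 1) // 2)


with tempfile.TemporaryDirectory() as tmp:
    descriptor = store_blob(bytes(range(256)), Path(tmp))
    # Only the descriptor would travel through the model's context.
    print(descriptor["size_bytes"], shannon_entropy(descriptor["file_path"]), worker_cap())
```

The guard in worker_cap is an addition of this sketch: a plain os.cpu_count() // 2 evaluates to 0 on a single-core machine.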

Tying a Cartridge Method To a Live Test Case

Sub-agents in Part 7 will repeatedly do the following: take a TestCaseRun.bound[i].object_id (the peripheral hostname/alias), find the right cartridge method, and invoke it with the right arguments.

For IOT-HW-UART-001:iot-cam-01:debug-uart, the steps in the test plan are “capture boot output,” “test BREAK interrupt,” “verify root shell.” None of those map 1-to-1 to JTAG cartridge methods, but the JTAG cartridge is the one that proves a UART exposes interruptible U-Boot — by halting the core, reading PC after a BREAK, and verifying it landed in the bootloader. The sub-agent is the thing that bridges the natural-language step in the test case to the typed method on a cartridge.

For IOT-HW-I2C-001:iot-cam-01:mcio-eeprom-1, however, none of the shipped cartridges has an I²C method — and that is fine. We will add a small I2CCartridge next, and because cartridges hot-load, the orchestrator picks it up without restart.

Writing Your Own Cartridge

Drop a new file wintermute/cartridges/i2c.py:

from __future__ import annotations

from smbus2 import SMBus  # apt install python3-smbus2


class I2CCartridge:
    """Direct I²C bus access via Linux i2c-dev for EEPROM extraction."""

    def __init__(self, bus_number: int = 2) -> None:
        self.bus = SMBus(bus_number)

    def detect(self) -> list[int]:
        """Probe addresses 0x03..0x77 and return the list of responders."""
        found = []
        for addr in range(0x03, 0x78):
            try:
                self.bus.read_byte(addr)
                found.append(addr)
            except OSError:
                pass
        return found

    def dump_eeprom(self, address: int, size: int = 256) -> bytes:
        """Sequential read of `size` bytes from device `address`.

        Returns raw bytes; tool_factory offloads them to a workspace blob
        descriptor, so the LLM never sees the bytes themselves.
        """
        # I2C sequential read: set the word address to 0x00, then read `size` bytes
        self.bus.write_byte(address, 0)
        data = bytes(self.bus.read_byte(address) for _ in range(size))
        return data  # tool_factory.LARGE_PAYLOAD_THRESHOLD_BYTES handles offload

    def write_byte(self, address: int, register: int, value: int) -> dict[str, int]:
        """Write `value` to `register` on device `address` and echo the arguments."""
        self.bus.write_byte_data(address, register, value)
        return {"address": address, "register": register, "value": value}

That is the entire cartridge. Now:

onoSendai [.../cartridges] > load i2c
✔ Loaded cartridge i2c — 3 tool(s) registered with the AI.
[*] Exposed functions: detect, dump_eeprom, write_byte
onoSendai [.../cartridges/i2c] > run detect
[80, 81] # 0x50 0x51
onoSendai [.../cartridges/i2c] > run dump_eeprom 0x50 256
{'file_path': './wintermute_workspace/blob-3a4f.../...bin',
'size_bytes': 256, 'sha256': '...', 'type': 'binary_blob'}

Two specific Wintermute behaviors made this easy:

  1. tool_factory.function_to_tool reads type hints. Returning bytes
    (dump_eeprom) automatically goes through _maybe_offload_payload and
    becomes a workspace blob descriptor.
  2. CartridgeManager._find_primary_class picks I2CCartridge because
    its name ends in Cartridge. No registration boilerplate.
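
To make the first point concrete, here is a rough sketch of how type hints can be turned into a tool description. It is not tool_factory.function_to_tool: describe_tool, the returns_bytes flag, and the JSON type mapping are assumptions chosen for illustration.

```python
import inspect
from typing import Any, Callable, get_type_hints

# Minimal mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {int: "integer", float: "number", str: "string", bool: "boolean"}


def describe_tool(func: Callable[..., Any]) -> dict[str, Any]:
    """Build a tool description from a callable's signature, hints, and docstring."""
    hints = get_type_hints(func)
    properties: dict[str, Any] = {}
    required: list[str] = []
    for name, param in inspect.signature(func).parameters.items():
        if name == "self":
            continue
        # Unknown or missing annotations fall back to "string" in this sketch.
        properties[name] = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    summary = (inspect.getdoc(func) or "").split("\n", 1)[0]
    return {
        "name": func.__name__,
        "description": summary,
        "parameters": {"type": "object", "properties": properties, "required": required},
        # A bytes return hint marks the tool as a candidate for payload offload.
        "returns_bytes": hints.get("return") is bytes,
    }


def dump_eeprom(address: int, size: int = 256) -> bytes:
    """Sequential read of `size` bytes from device `address`."""
    return bytes(size)


spec = describe_tool(dump_eeprom)
print(spec["parameters"]["required"], spec["returns_bytes"])
```

Parameters with defaults stay optional in the schema, so the model can call dump_eeprom with only an address.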

This is what “agentic framework for hardware red teams” actually means — turning a 30-line wrapper around smbus2 into something the orchestrator can compose with TPM, JTAG, and firmware analysis without writing any glue. We will use this exact I2CCartridge in Part 6 and Part 7.
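
Before pointing the cartridge at real silicon, the probe and read logic can be exercised against an in-memory bus. The sketch below assumes a hypothetical FakeEepromBus that mimics only the two smbus2 calls the cartridge uses (read_byte and write_byte) and models a one-byte word address with auto-increment; detect and dump are free-function copies of the cartridge logic.

```python
class FakeEepromBus:
    """In-memory stand-in exposing read_byte/write_byte like smbus2.SMBus."""

    def __init__(self, devices: dict[int, bytes]) -> None:
        self._devices = devices
        self._pointers = {addr: 0 for addr in devices}

    def write_byte(self, address: int, value: int) -> None:
        if address not in self._devices:
            raise OSError("no ACK")  # absent device, like a NACK on a real bus
        self._pointers[address] = value  # the byte is treated as the word address

    def read_byte(self, address: int) -> int:
        if address not in self._devices:
            raise OSError("no ACK")
        data = self._devices[address]
        offset = self._pointers[address] % len(data)
        self._pointers[address] = offset + 1  # auto-increment, wrapping at the end
        return data[offset]


def detect(bus) -> list[int]:
    """Probe 0x03..0x77 and collect every address that answers."""
    found = []
    for addr in range(0x03, 0x78):
        try:
            bus.read_byte(addr)
            found.append(addr)
        except OSError:
            pass
    return found


def dump(bus, address: int, size: int) -> bytes:
    """Reset the word address to 0, then read `size` bytes sequentially."""
    bus.write_byte(address, 0)
    return bytes(bus.read_byte(address) for _ in range(size))


image = bytes(range(256))
bus = FakeEepromBus({0x50: image, 0x51: b"\xff" * 256})
print(detect(bus), dump(bus, 0x50, 256) == image)
```

Note that detect advances each device's pointer by one byte, which is why dump resets the word address before reading.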

The MCP Side — Two Directions

Wintermute speaks MCP both ways:

  • Outbound (integrations/mcp_runtime.py): MCPRuntime and the
    mcp register/start/stop console family let Wintermute consume an
    external MCP server. Tools from that server land in the same global
    registry as cartridge methods — the LLM does not see the seam.
  • Inbound (WintermuteMCP.py): wintermute-mcp runs as an MCP server
    exposing 80+ tools (operation CRUD, devices, vulnerabilities, AI chat,
    reports, cartridges, SSH, depthcharge, burp ingest…). Any MCP client —
    Claude Desktop, Cursor, or our own orchestrator from Part 6 — drives the
    framework remotely.

Outbound: Mounting an External MCP Server From the Console

Suppose we have a Burp Suite MCP server (a community project that exposes Burp’s REST API as MCP tools). Register, start, list:

onoSendai > mcp register burpsuite uvx burp-mcp --proxy http://127.0.0.1:8080
[*] Registered MCP server burpsuite (uvx burp-mcp --proxy http://127.0.0.1:8080)
Saved to ~/.config/wintermute/mcp_servers.json
onoSendai > mcp start burpsuite
[*] Server burpsuite started (PID 84211)
onoSendai > tools mcp
🔌 External MCP Tools
Name Description Server
burp_active_scan Trigger an active scan against… burpsuite
burp_get_issues Pull current issue list… burpsuite
...
onoSendai > tools list
🧰 Native AI Tools
Name Description Source
get_random TPM 2.0 randomness… internal
analyze_entropy Shannon entropy… internal
burp_active_scan ... mcp

The agent now sees the union — analyze_entropy (cartridge), burp_active_scan (external MCP), nv_read (TPM), dump_firmware (JTAG) — all callable by name through ToolsRuntime.run_tool. This is how we drop a Maltego MCP, a JIRA MCP, an internal Confluence MCP onto an engagement and have the orchestrator (Part 6) discover them on the fly.
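
Because every tool is addressed by a bare name, mounting several sources raises the question of name collisions. The sketch below is one possible policy, not what ToolsRuntime does: UnifiedTools, mount, and the fail-on-collision rule are assumptions, and the source labels reuse the "internal" and "mcp" values from the listing above.

```python
from typing import Any, Callable

ToolFn = Callable[..., Any]


class UnifiedTools:
    """Toy name-keyed tool table that remembers where each tool came from."""

    def __init__(self) -> None:
        self._tools: dict[str, tuple[str, ToolFn]] = {}

    def mount(self, source: str, tools: dict[str, ToolFn]) -> None:
        # Check everything first so a failed mount leaves the table unchanged.
        clashes = sorted(set(tools) & set(self._tools))
        if clashes:
            raise ValueError(f"tool name collision: {clashes}")
        for name, fn in tools.items():
            self._tools[name] = (source, fn)

    def listing(self) -> list[tuple[str, str]]:
        return sorted((name, source) for name, (source, _) in self._tools.items())

    def run_tool(self, name: str, arguments: dict[str, Any]) -> Any:
        _, fn = self._tools[name]
        return fn(**arguments)


tools = UnifiedTools()
tools.mount("internal", {"analyze_entropy": lambda file_path: {"file_path": file_path}})
tools.mount("mcp", {"burp_get_issues": lambda: []})
print(tools.listing())
```

Rejecting a clashing mount outright is the strictest choice; prefixing external tools with their server name would be an alternative.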

Inbound: Driving Wintermute From Claude Desktop

Run the server from a different terminal:

$ wintermute-mcp --transport stdio

In ~/.config/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "wintermute": {
      "command": "wintermute-mcp",
      "args": ["--transport", "stdio"]
    }
  }
}

Now Claude Desktop has tools like create_operation, add_device, add_peripheral_to_device, setup_storage_backend, add_test_plan_from_json, generate_test_runs, update_test_run_status, add_vulnerability_to_test_run, generate_report, run_ssh_command, execute_depthcharge_catalog, execute_depthcharge_memory_dump, ingest_burp_scan, attach_evidence, load_cartridge — every one of which mutates the same Operation container we built in Part 2.

A red-team workflow becomes “Claude, load the IoT camera engagement, attach TestPlans/TP-HW-BLACKBOX-001.json, generate runs, then for the I²C-EEPROM run, dump it via the i2c cartridge and analyze the dump.” Claude composes the calls in the right order; Wintermute persists the result.

Surgeon: Firmware Hooks and AFL++ Fuzzing

Surgeon is Wintermute’s purpose-built MCP server for firmware emulation hook generation. Source: wintermute/integrations/surgeon/server.py.

The exposed MCP tools (and the offensive use case for each):

| Surgeon tool | Purpose |
|---|---|
| create_hook_skeleton | Generate a C hook for a peripheral’s MMIO region (UART/WIFI/JTAG/PCIE/USB/…). |
| list_firmware_symbols | nm over the firmware ELF; finds candidate hook addresses (functions to instrument). |
| write_config_file | Write YAML/JSON configs into the Surgeon project tree. |
| build_firmware | make build FIRMWARE=<name> (compiles the instrumented binary). |
| start_fuzzing | make run-fuzz FIRMWARE=<name> (launches the AFL++ docker container). |
| get_fuzzer_stats | Read fuzzer_stats from the AFL output dir. |
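
AFL++ writes fuzzer_stats as plain "key : value" lines, so reading it takes little code. The parser below is a sketch of what a tool like get_fuzzer_stats could do, not Surgeon's implementation; parse_fuzzer_stats is a hypothetical name and the sample values are made up for illustration.

```python
def parse_fuzzer_stats(text: str) -> dict[str, str]:
    """Parse AFL++-style 'key : value' lines into a dict of strings."""
    stats: dict[str, str] = {}
    for line in text.splitlines():
        # Split on the first colon only, so values may contain colons themselves.
        key, sep, value = line.partition(":")
        if sep:
            stats[key.strip()] = value.strip()
    return stats


# Illustrative sample only; real files contain many more keys.
SAMPLE = """\
start_time        : 1700000000
execs_done        : 12345
execs_per_sec     : 1523.10
saved_crashes     : 2
command_line      : afl-fuzz -i in -o out -- ./target
"""

stats = parse_fuzzer_stats(SAMPLE)
print(stats["execs_done"], stats["saved_crashes"])
```

Values are kept as strings because the file mixes integers, floats, percentages, and free text.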

The interesting one is create_hook_skeleton. It takes a peripheral type (UART, WIFI, BLUETOOTH, ETHERNET, USB, PCIE, JTAG, TPM, ZIGBEE) and a malicious_snippet of C code, and produces an emulator hook with read/write callbacks at the peripheral’s MMIO base. The malicious_snippet lets a red-team scenario inject arbitrary fault-injection behavior — integer overflow on a register read, randomized RX bytes on a radio, fake PCIe vendor ID:

# Driven via MCP from the orchestrator
hook = await surgeon_session.call_tool("create_hook_skeleton", {
    "firmware_name": "iotcam_v3",
    "peripheral_name": "wifi_chip",
    "address_base": "0x40001000",
    "peripheral_type": "WIFI",
    "malicious_snippet":
        "if (offset == 0x08 && (rand() % 100 < 5)) "
        " *val = 0xDEADBEEF; // 5% radio noise injection",
})

This produces a C file under <SURGEON_ROOT>/src/runtime/handlers/iotcam_v3/wifi_chip.c, ready for make build FIRMWARE=iotcam_v3. Hook in an SDR-shaped radio abstraction (TX writes are logged, RX reads can be poisoned), build, then call start_fuzzing. AFL++ inside the Surgeon docker container drives the emulated firmware against the hook, looking for crashes. That is the offensive playbook for “find a memory-corruption bug in the WiFi RX path of a firmware blob you can’t run on real hardware.”
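
The post does not show the generated file, so the following is only a guess at the general idea: a template with a read callback into which the snippet is spliced. HOOK_TEMPLATE, render_hook, and the callback signature are hypothetical and do not reflect Surgeon's real output.

```python
from string import Template

# Hypothetical hook layout; $-placeholders avoid clashing with C braces.
HOOK_TEMPLATE = Template("""\
#include <stdint.h>
#include <stdlib.h>

#define ${upper}_BASE ${base}u

/* Read callback for ${name} (${ptype}); offset is relative to ${upper}_BASE. */
void ${name}_read(uint32_t offset, uint32_t *val) {
    *val = 0;
    ${snippet}
}
""")


def render_hook(name: str, ptype: str, base: str, snippet: str) -> str:
    """Render a C hook skeleton, validating the identifier and base address first."""
    if not name.isidentifier():
        raise ValueError(f"not a valid C identifier: {name!r}")
    int(base, 16)  # raises ValueError if the base address is not hexadecimal
    return HOOK_TEMPLATE.substitute(
        name=name, upper=name.upper(), ptype=ptype, base=base, snippet=snippet
    )


source = render_hook(
    "wifi_chip",
    "WIFI",
    "0x40001000",
    "if (offset == 0x08 && (rand() % 100 < 5)) *val = 0xDEADBEEF;",
)
print(source)
```

Validating the name and address before rendering matters here because the output is compiled verbatim.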

Wiring Surgeon Into Wintermute

SurgeonBackend is the in-process bridge. From examples/04-AI-Enrichment-and-Tools.ipynb extended to Surgeon:

from wintermute.ai.tools_runtime import ToolsRuntime
from wintermute.integrations.surgeon.backend import SurgeonBackend
runtime = ToolsRuntime()
backend = SurgeonBackend(surgeon_root="/opt/surgeon")
await backend.start()
runtime.register_backend(backend)
# Now the LLM can call create_hook_skeleton, build_firmware, start_fuzzing
# alongside cartridge methods
all_tools = await runtime.get_all_tools()
print([t["function"]["name"] for t in all_tools])

SurgeonBackend.start() (integrations/surgeon/backend.py) spawns the Surgeon FastMCP server as a subprocess and connects via stdio. get_ai_tools() returns the FastMCP tool list converted into OpenAI function-calling format; execute_tool() round-trips through MCP. From the LLM’s vantage, Surgeon tools are indistinguishable from cartridge methods.
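
The conversion step is mostly a field rename. An MCP tool listing carries name, description, and inputSchema, while OpenAI-style function calling expects a "function" object with a parameters schema. The sketch below shows that mapping; mcp_tool_to_openai is a hypothetical helper, not the body of get_ai_tools(), and the firmware_name parameter in the example is assumed for illustration.

```python
from typing import Any


def mcp_tool_to_openai(tool: dict[str, Any]) -> dict[str, Any]:
    """Map one MCP tool description onto the OpenAI function-calling shape."""
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool.get("description", ""),
            # MCP's inputSchema is already JSON Schema, so it passes through as-is.
            "parameters": tool.get("inputSchema", {"type": "object", "properties": {}}),
        },
    }


mcp_tool = {
    "name": "get_fuzzer_stats",
    "description": "Read fuzzer_stats from the AFL output dir.",
    "inputSchema": {
        "type": "object",
        "properties": {"firmware_name": {"type": "string"}},
        "required": ["firmware_name"],
    },
}

converted = mcp_tool_to_openai(mcp_tool)
print(converted["function"]["name"])
```

This is the same shape the earlier listing indexes with t["function"]["name"].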

A Composed Pentest Step: I²C-EEPROM Extraction Into a Vulnerability

Putting everything in this post together, here is the sequence the Part-7 sub-agent will execute autonomously for IOT-HW-I2C-001:iot-cam-01:mcio-eeprom-1:

# Manual transcript: Part 7 will let the LLM make these calls itself.
# Assumes `op` (the live Operation from Part 2) and RunStatus are already in scope.
from wintermute.cartridges.manager import CartridgeManager
from wintermute.ai.tools_runtime import tools as registry
from wintermute.findings import ReproductionStep, Vulnerability

# 0. Cartridges available on the active operation
mgr = CartridgeManager()
mgr.load("i2c")
mgr.load("firmware_analysis")

# 1. Detect addresses on bus 2
addrs = registry.call("detect", {})["result"]
assert 0x50 in addrs

# 2. Dump the EEPROM: descriptor returned, no bytes in context
desc = registry.call("dump_eeprom", {"address": 0x50, "size": 256})

# 3. Static analysis on the dump
strings = registry.call("extract_strings",
                        {"file_path": desc["file_path"], "min_length": 8})["result"]
secrets = registry.call("scan_for_secrets",
                        {"file_path": desc["file_path"]})["result"]

# 4. Decide & write back into the live operation
interesting = strings["top_20_interesting_strings"]
creds_present = any("password" in s or "admin" in s for s in interesting)
pem_present = bool(secrets["matches"].get("pem_block", []))
run = next(r for r in op.test_runs
           if r.run_id == "IOT-HW-I2C-001:iot-cam-01:mcio-eeprom-1")
run.start()
if creds_present or pem_present:
    vuln = Vulnerability(
        title="Hardcoded credentials/keys in I2C EEPROM (MCIO bus 2, 0x50)",
        description=(
            f"Recovered top strings: {interesting[:3]} ... "
            f"PEM blocks at offsets: {secrets['matches'].get('pem_block', [])}"
        ),
        cvss=8,
        threat="unauthorized device access via static credentials",
        reproduction_steps=[
            ReproductionStep(
                title="Detect I2C devices on bus 2",
                tool="i2c.detect", action="probe", confidence=80,
                arguments=[],
            ),
            ReproductionStep(
                title="Sequential read of 256 bytes from 0x50",
                tool="i2c.dump_eeprom", action="read",
                confidence=90, arguments=["0x50", "256"],
            ),
            ReproductionStep(
                title="Extract printable strings (>=8 chars)",
                tool="firmware_analysis.extract_strings",
                action="analyze", confidence=80,
                arguments=[desc["file_path"], "8"],
            ),
        ],
    )
    run.findings.append(vuln)
    run.status = RunStatus.failed  # a finding means the target failed this test case
else:
    run.status = RunStatus.passed
run.finish()
op.save()
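
The decision in step 4 is easier to test when it is pulled out into a pure function. The sketch below is a hypothetical refactoring, not framework code: triage is an invented name, the string verdicts stand in for RunStatus values, and the keyword check is made case-insensitive, which the transcript above does not do.

```python
def triage(interesting: list[str], pem_offsets: list[int]) -> str:
    """Return "failed" when credential-like strings or PEM blocks were recovered."""
    creds_present = any(
        "password" in s.lower() or "admin" in s.lower() for s in interesting
    )
    pem_present = bool(pem_offsets)
    return "failed" if creds_present or pem_present else "passed"


print(triage(["ADMIN_PASSWORD=hunter2"], []))
print(triage(["bootdelay=3"], []))
print(triage([], [0x40]))
```

Keeping the verdict separate from the Operation write-back lets the heuristic be unit-tested without a live engagement.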

That is the shape of every per-test-case sub-agent in Part 7. The sub-agent’s job is to translate the natural-language steps in TestCase.steps into the right cartridge / MCP / Surgeon tool calls in the right order, then write the verdict into the live TestCaseRun. The framework already ships:

  • the cartridge surface (in-process, hot-loadable),
  • the MCP surface (external tools, hot-mountable),
  • the workspace (large blobs offloaded automatically),
  • the run/finding/repro-step model.

The agent layer above is genuinely small once the framework underneath does this much work for it.

What’s Next

Part 5 builds our first end-to-end agentic flow: a single-prompt Claude call that, given the IoT camera operation, chooses whether to dump the EEPROM, runs through tool_calling_chat’s loop, and writes a finding. It is intentionally simple (one agent, one test case) because Part 6 and Part 7 generalize it into the orchestrator and the per-test-case sub-agents.
