In Part 1 we mapped the architecture and the engagement data model. In this post we drive Wintermute by hand: build an Operation, persist it, drive it from the REPL, and learn the console’s context-stack idioms — because every later post (the agent in Part 5, the orchestrator in Part 6, the sub-agents in Part 7) sits on top of these plumbing primitives.
This post is a field manual for the operator. No AI yet.
The Reference Engagement: a Raspberry-Pi-Based IoT Camera
I’ll use a single fictional engagement throughout the series so the data accumulates. Today is the intake; later posts will fill in the agentic execution.
- Engagement name: acme-iotcam-2026-Q2
- Scope: a managed IoT camera (iot-cam-01, IP 10.0.0.5), an MCIO board with an I²C EEPROM at 0x50, and a backing AWS account acme-prod with an IAM role we suspect is over-privileged.
- Analyst: Case (case, case@acme.com).
- Stakeholder: Acme’s security lead, Robert Smith.
We’ll add hardware peripherals (UART on J3, JTAG on J5, the I²C-EEPROM as a
peripheral attached to the device), one declarative TestPlan (the shipped
TestPlans/TP-HW-BLACKBOX-001.json), and persist the lot to disk so it can be
loaded on a different host or by an MCP client.
Building the Operation Programmatically
from wintermute.core import Operation, TestPlan
from wintermute.peripherals import JTAG, UART, Peripheral
from wintermute.hardware import Architecture, Processor

op = Operation("acme-iotcam-2026-Q2", start_date="04/01/2026", end_date="04/30/2026")
op.addAnalyst("Case", "case", "case@acme.com")
op.addUser(uid="rsmith", name="Robert Smith", email="robert@acme.com", teams=["stakeholder"])

# Target device
op.addDevice("iot-cam-01", "10.0.0.5", operatingsystem="Linux 5.10")
dev = op.getDeviceByHostname("iot-cam-01")

# Tag it with processor + architecture
dev.processor = Processor(
    processor="BCM2837",
    manufacturer="Broadcom",
    processor_family="Cortex-A",
    architecture=Architecture(
        core="Cortex-A53",
        instruction_set="ARMv8-A",
        cpu_cores=4,
        key_features={"trustzone": True, "neon": True},
    ),
    endianness="little",
)

# Hardware peripherals: exactly the shape examples/03-Hardware-Security-Testing.ipynb
# uses, just with our own pinouts
dev.peripherals.extend([
    UART(name="debug-uart", baudrate=115200, device_path="/dev/ttyUSB0",
         pins={"tx": "J3-1", "rx": "J3-2", "gnd": "J3-3"}),
    JTAG(name="main-jtag", device_path="/dev/jtag0",
         pins={"tck": "J5-1", "tms": "J5-2", "tdi": "J5-3", "tdo": "J5-4", "gnd": "J5-5"}),
    Peripheral(name="mcio-eeprom-1", pType="I2C",
               pins={"scl": "MCIO-7", "sda": "MCIO-9"}),
])

# Network services on the device
dev.addService(portNumber=443, app="lighttpd", protocol="ipv4", transport_layer="HTTPS")
dev.addService(portNumber=22, app="dropbear", protocol="ipv4", transport_layer="SSH")

# Cloud account in scope
op.addAWSAccount("acme-prod", account_id="111122223333")
Three details are non-obvious enough that they bite people:
- addDevice and addAnalyst use upsert-with-merge semantics (Operation._merge_attributes, core.py:1068). Calling them twice with the same hostname / userid does not create duplicates: list fields are extended uniquely, dicts are shallow-merged, scalar fields overwrite only when the new value is truthy. This is the contract every cartridge and AI tool relies on, so partial enrichment is safe.
- Peripheral is the generic class. There are dedicated subclasses for UART, JTAG, Wifi, Bluetooth, USB, PCIe, Ethernet, and TPMPeripheral (TPM 2.0 transport-aware). For a one-off bus where there is no dedicated subclass (I²C, SPI, SWD, Zigbee), use Peripheral(name=..., pType=...).
- dev.peripherals.extend([...]) and dev.services.append(...) are perfectly legitimate. The helpers addService / addPeripheral exist for de-dup, but direct list manipulation is what Operation.from_dict does internally, so it is part of the contract.
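To make the merge rules concrete, here is a minimal sketch of upsert-with-merge on plain dicts. It is not the code behind Operation._merge_attributes; the function name merge_attributes and the sample field names (tags, meta) are my own assumptions, chosen only to exercise the three rules described above.

```python
from typing import Any


def merge_attributes(existing: dict[str, Any], incoming: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: merge `incoming` into `existing` in place."""
    for key, new in incoming.items():
        old = existing.get(key)
        if isinstance(old, list) and isinstance(new, list):
            # Lists are extended uniquely: keep order, skip values already present.
            for item in new:
                if item not in old:
                    old.append(item)
        elif isinstance(old, dict) and isinstance(new, dict):
            # Dicts are shallow-merged: top-level keys from `new` win.
            old.update(new)
        elif new:
            # Scalars overwrite only when the new value is truthy.
            existing[key] = new
    return existing


# Illustrative record; the field names are assumptions, not the real Device schema.
device = {
    "hostname": "iot-cam-01",
    "operatingsystem": "Linux 5.10",
    "tags": ["camera"],
    "meta": {"site": "lab"},
}
merge_attributes(device, {
    "operatingsystem": "",          # falsy, so the existing value survives
    "tags": ["camera", "arm"],      # only "arm" is new
    "meta": {"rack": "B2"},         # merged next to "site"
})
print(device["operatingsystem"], device["tags"], device["meta"])
```

The empty operatingsystem in the second call is the case that matters for partial enrichment: a tool that knows only part of the picture cannot blank out what an earlier tool recorded.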
Loading a Hardware Test Plan
The shipped TestPlans/TP-HW-BLACKBOX-001.json covers seven categories of
hardware testing. We attach it via Operation.addTestPlan(...):
import json
from pathlib import Path
from wintermute.core import TestPlan

tp_data = json.loads(Path("TestPlans/TP-HW-BLACKBOX-001.json").read_text())
op.addTestPlan(TestPlan.from_dict(tp_data))
Now look at one specific test case in that plan — IOT-HW-UART-001 — to see
how scoping resolves against our Operation:
{
  "code": "IOT-HW-UART-001",
  "name": "UART discovery & console exposure",
  "execution_mode": "per_binding",
  "execution_binding": "uart",
  "target_scope": {
    "tags": ["uart", "console"],
    "bindings": [
      { "kind": "device", "name": "dut", "where": {}, "cardinality": "one" },
      { "kind": "peripheral", "name": "uart", "where": { "device": "dut", "pType": "UART" }, "cardinality": "many" }
    ]
  },
  "steps": [
    { "title": "Capture boot-time serial output", "tool": "serial-capture", "action": "collect", "confidence": 80, "arguments": ["capture_boot_log"] },
    ...
  ]
}
Now generate runs:
runs = op.generateTestRuns()
for r in runs:
    print(f"{r.run_id:<55} {r.test_case_code} {r.status.value}")
generateTestRuns() (core.py:1008) walks every TestCase across attached
plans, calls resolveBindings() to match the case’s selectors against
op.devices and op.cloud_accounts, then createRunsForTestCase()
(core.py:936) fans the case out per the execution_mode:
- once → one run with run_id = "TC_CODE:once".
- per_device → one run per matched DUT.
- per_binding → one run per matched peripheral, with run_id = "TC_CODE:DEVICE:OBJECT".
For our IoT camera, the UART case yields a single run
(IOT-HW-UART-001:iot-cam-01:debug-uart) because we only declared one UART
peripheral. The I²C-EEPROM case (IOT-HW-I2C-001 in the same plan) likewise
binds to mcio-eeprom-1. Tag a second device tomorrow and the same plan
produces twice as many runs without modification.
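The fan-out is easy to reason about once it is written down. The sketch below is a stand-in, not createRunsForTestCase(): the Device dataclass, the run_ids function, and the plain-dict peripherals are assumptions, and the per_device ID format is inferred from run IDs such as IOT-HW-GEN-001:iot-cam-01.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Device:
    # Hypothetical stand-in for the real device model.
    hostname: str
    peripherals: list[dict] = field(default_factory=list)  # {"name": ..., "pType": ...}


def run_ids(code: str, mode: str, devices: list[Device], ptype: str | None = None) -> list[str]:
    """Return the run IDs one test case fans out to for a given execution mode."""
    if mode == "once":
        return [f"{code}:once"]
    if mode == "per_device":
        return [f"{code}:{d.hostname}" for d in devices]
    if mode == "per_binding":
        # One run per peripheral whose type matches the binding's selector.
        return [
            f"{code}:{d.hostname}:{p['name']}"
            for d in devices
            for p in d.peripherals
            if p["pType"] == ptype
        ]
    raise ValueError(f"unknown execution mode: {mode}")


cam = Device("iot-cam-01", [
    {"name": "debug-uart", "pType": "UART"},
    {"name": "main-jtag", "pType": "JTAG"},
    {"name": "mcio-eeprom-1", "pType": "I2C"},
])
uart_runs = run_ids("IOT-HW-UART-001", "per_binding", [cam], ptype="UART")
print(uart_runs)

# A second camera with its own UART doubles the per_binding runs for this case.
cam2 = Device("iot-cam-02", [{"name": "debug-uart", "pType": "UART"}])
two_cam_runs = run_ids("IOT-HW-UART-001", "per_binding", [cam, cam2], ptype="UART")
print(two_cam_runs)
```

The plan itself never changes between the two calls; only the list of devices it is resolved against does.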
This selector + cardinality machinery is the lever the orchestrator pulls in
Part 6. The agent does not invent test cases out of nothing — it loads a
real TestPlan, resolves it against the live operation, and dispatches one
sub-agent per generated run.
Persistence: JsonFileBackend, DynamoDBBackend, and the Protocol
from wintermute.backends.json_storage import JsonFileBackend

Operation.register_backend("json", JsonFileBackend(base_path="./.wintermute_data"), make_default=True)
op.save()  # writes acme-iotcam-2026-Q2.json
StorageBackend (wintermute/storage.py) is a four-method protocol —
save, load, list_all, delete. The two shipped implementations are:
| Backend | Module | Notes |
|---|---|---|
| JsonFileBackend | backends/json_storage.py | TinyDB underneath; one file per operation; ideal for laptops. |
| DynamoDBBackend | backends/dynamodb.py | Single-table; uses operation name as partition key. |
Switching at runtime is a single call:
Operation.use_backend("dynamodb")
op.save()  # now writes to DynamoDB
This matters during a multi-analyst engagement: the field operator runs the
console with JsonFileBackend to sync to a thumb drive; the home base runs
the MCP server with DynamoDBBackend so other analysts can pick up the
state. The agent does not care which is active.
To implement a custom backend (Postgres, S3, Notion):
class MyPostgresBackend:
    def save(self, operation_id: str, data: dict) -> bool: ...
    def load(self, operation_id: str) -> dict | None: ...
    def list_all(self) -> list[str]: ...
    def delete(self, operation_id: str) -> bool: ...

Operation.register_backend("postgres", MyPostgresBackend(...), make_default=True)
No subclassing, no abstract base. The same Protocol pattern is used for
TicketBackend, ReportBackend, and the ToolBackend an MCP server
exposes, so there is exactly one mental model for “swap a subsystem.”
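If structural typing is new to you, the following sketch shows why no base class is needed. StorageBackendLike mirrors the four method signatures listed above but is my own stand-in, not an import from wintermute/storage.py, and InMemoryStorage is a hypothetical backend written only for this illustration.

```python
from __future__ import annotations

from typing import Protocol, runtime_checkable


@runtime_checkable
class StorageBackendLike(Protocol):
    # Stand-in for the four-method protocol; not imported from Wintermute.
    def save(self, operation_id: str, data: dict) -> bool: ...
    def load(self, operation_id: str) -> dict | None: ...
    def list_all(self) -> list[str]: ...
    def delete(self, operation_id: str) -> bool: ...


class InMemoryStorage:
    """Hypothetical backend. Note that it inherits from nothing."""

    def __init__(self) -> None:
        self._ops: dict[str, dict] = {}

    def save(self, operation_id: str, data: dict) -> bool:
        self._ops[operation_id] = dict(data)
        return True

    def load(self, operation_id: str) -> dict | None:
        return self._ops.get(operation_id)

    def list_all(self) -> list[str]:
        return sorted(self._ops)

    def delete(self, operation_id: str) -> bool:
        return self._ops.pop(operation_id, None) is not None


backend = InMemoryStorage()
# The isinstance check passes because the method names match, not because of inheritance.
print(isinstance(backend, StorageBackendLike))
backend.save("acme-iotcam-2026-Q2", {"devices": ["iot-cam-01"]})
print(backend.list_all())
```

runtime_checkable only verifies that the methods exist, not their signatures, so a static type checker is still the better place to catch a backend that drifts from the protocol.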
The Console: Context Stack and Builders
wintermute (the binary) is a Metasploit-style REPL. Three idioms make it
worth using over scripted Python:
1. Operation creation and direct field setting.
onoSendai > operation create acme-iotcam-2026-Q2
onoSendai [acme-iotcam-2026-Q2] > set start_date 04/01/2026
onoSendai [acme-iotcam-2026-Q2] > set end_date 04/30/2026
2. Add commands with inline arguments.
onoSendai [acme-iotcam-2026-Q2] > add device iot-cam-01 10.0.0.5
onoSendai [acme-iotcam-2026-Q2] > add analyst Case case case@acme.com
onoSendai [acme-iotcam-2026-Q2] > add user rsmith "Robert Smith" robert@acme.com
onoSendai [acme-iotcam-2026-Q2] > add cloudaccount acme-prod aws
onoSendai [acme-iotcam-2026-Q2] > add awsaccount acme-prod 111122223333
3. Domain context drilldown. Typing a domain name (devices, analysts,
users) drops the prompt one level:
onoSendai [acme-iotcam-2026-Q2] > devices
onoSendai [acme-iotcam-2026-Q2/devices] > list
onoSendai [acme-iotcam-2026-Q2/devices] > iot-cam-01    # bare-id drilldown == edit
onoSendai [acme-iotcam-2026-Q2/devices/iot-cam-01] > add peripheral uart
onoSendai [acme-iotcam-2026-Q2/devices/iot-cam-01/uart] > set name debug-uart
onoSendai [.../uart] > set baudrate 115200
onoSendai [.../uart] > set device_path /dev/ttyUSB0
onoSendai [.../uart] > save
The builder pattern is implemented in BuilderContext (WintermuteConsole.py:111)
and dispatched through _dispatch_builder_command (WintermuteConsole.py:4869).
save materializes the partial-built object and attaches it to the parent’s
list field (peripherals in this case). back pops one frame.
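The mechanics are simpler than the line counts suggest. Below is a toy context stack with a builder frame, written from the behavior described here rather than from the console source: Frame, ContextStack, and their methods are hypothetical names, and the builder stores a plain dict where the real one would construct a peripheral object.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class Frame:
    name: str
    parent_list: list | None = None                        # where `save` attaches the object
    pending: dict[str, Any] = field(default_factory=dict)  # fields set so far


class ContextStack:
    """Hypothetical sketch of a Metasploit-style prompt stack."""

    def __init__(self, root: str) -> None:
        self.frames = [Frame(root)]

    def push(self, name: str, parent_list: list | None = None) -> None:
        self.frames.append(Frame(name, parent_list))

    def back(self) -> None:
        # Pop one frame, but never the root operation frame.
        if len(self.frames) > 1:
            self.frames.pop()

    def set(self, key: str, value: Any) -> None:
        self.frames[-1].pending[key] = value

    def save(self) -> None:
        # Materialize the partially built object and attach it to the parent's list.
        frame = self.frames[-1]
        if frame.parent_list is None:
            raise RuntimeError("current frame is not a builder")
        frame.parent_list.append(dict(frame.pending))

    def prompt(self) -> str:
        return f"onoSendai [{'/'.join(f.name for f in self.frames)}] > "


peripherals: list[dict] = []
stack = ContextStack("acme-iotcam-2026-Q2")
stack.push("devices")
stack.push("iot-cam-01")
stack.push("uart", parent_list=peripherals)   # add peripheral uart
stack.set("name", "debug-uart")
stack.set("baudrate", 115200)
prompt_in_builder = stack.prompt()
stack.save()
stack.back()
prompt_after = stack.prompt()
print(prompt_in_builder)
print(prompt_after)
print(peripherals)
```

Nothing reaches the device until save runs, which is what makes a half-typed builder safe to abandon with back.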
The reason this pattern matters for agents: when the AI is on, every
add/edit/set is also exposed as an MCP tool (e.g., add_device,
add_peripheral_to_device, edit_device). The same builder hierarchy is
how the MCP server’s ObjectRegistry exposes nested objects, so the agent
fundamentally does what the operator does.
Test Run Drilldown — the testruns Domain
Once a plan is attached and runs are generated, testruns is its own
context:
onoSendai [acme-iotcam-2026-Q2] > testruns load TestPlans/TP-HW-BLACKBOX-001.json
[*] Loaded test plan TP-HW-BLACKBOX-001 (24 test case(s))
onoSendai [acme-iotcam-2026-Q2] > testruns generate
[*] Generated 17 new test run(s). Total runs: 17
onoSendai [acme-iotcam-2026-Q2] > testruns
onoSendai [.../testruns] > list
🧪 Test Runs
┃ Run ID ┃ Test Case ┃ Bound / Target ┃ Status ┃
┃ IOT-HW-GEN-001:iot-cam-01 ┃ IOT-HW-GEN-001 ┃ dut=iot-cam-01 ┃ not_run ┃
┃ IOT-HW-DISC-001:iot-cam-01 ┃ IOT-HW-DISC-001 ┃ dut=iot-cam-01 ┃ not_run ┃
┃ IOT-HW-UART-001:iot-cam-01:debug-uart ┃ IOT-HW-UART-001 ┃ uart=debug-uart ┃ not_run ┃
┃ ...
onoSendai [.../testruns] > IOT-HW-UART-001:iot-cam-01:debug-uart
onoSendai [.../testruns/IOT-HW-UART-001:iot-cam-01:debug-uart] > show
onoSendai [.../testruns/...] > start
[*] Run IOT-HW-UART-001:iot-cam-01:debug-uart -> in_progress
onoSendai [.../testruns/...] > note "Captured boot log; bootloader allows interrupt"
onoSendai [.../testruns/...] > vuln "Unauthenticated U-Boot console" 8
onoSendai [.../testruns/...] > fail
[*] Run ... -> failed
In this sequence we just executed a test run manually. Every operation
is reachable programmatically (run.start(), run.findings.append(...),
run.status = RunStatus.failed, run.finish()) and via MCP
(update_test_run_status, add_note_to_test_run, add_vulnerability_to_test_run).
The point of Part 7’s per-test-case sub-agent is to do exactly this loop —
start → exec tools → attach findings → status → finish — without an
operator typing.
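For readers who think in state machines, the loop looks roughly like this. RunSketch and the RunStatus enum below are stand-ins I wrote for this post, not the Wintermute classes: the passed member, the finished flag, the score field on the finding, and the guard conditions are all assumptions.

```python
from dataclasses import dataclass, field
from enum import Enum


class RunStatus(Enum):
    # Stand-in enum; only not_run, in_progress, and failed appear in the transcript.
    not_run = "not_run"
    in_progress = "in_progress"
    passed = "passed"
    failed = "failed"


@dataclass
class RunSketch:
    run_id: str
    status: RunStatus = RunStatus.not_run
    notes: list = field(default_factory=list)
    findings: list = field(default_factory=list)
    finished: bool = False

    def start(self) -> None:
        # A run can only be started once.
        if self.status is not RunStatus.not_run:
            raise RuntimeError(f"{self.run_id} was already started")
        self.status = RunStatus.in_progress

    def finish(self) -> None:
        # Assumed rule: a verdict must be recorded before the run is closed.
        if self.status not in (RunStatus.passed, RunStatus.failed):
            raise RuntimeError("set a terminal status before finishing")
        self.finished = True


run = RunSketch("IOT-HW-UART-001:iot-cam-01:debug-uart")
run.start()
run.notes.append("Captured boot log; bootloader allows interrupt")
run.findings.append({"title": "Unauthenticated U-Boot console", "score": 8})
run.status = RunStatus.failed
run.finish()
print(run.run_id, "->", run.status.value)
```

Whether the real finish() enforces a verdict is something to check in the source; the point here is only that the console commands, the Python calls, and the MCP tools all walk the same four steps.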
Tickets and Reports: Two More Pluggable Backends
A pentest is not done when the run is failed; it is done when the
deliverable is on the customer’s desk and the bug is in their tracker. Both
are first-class.
from wintermute.tickets import InMemoryBackend, Status, Ticket

Ticket.register_backend("mem", InMemoryBackend(), make_default=True)
tid = Ticket.create(
    title="Unauthenticated U-Boot console on iot-cam-01",
    description="UART J3 exposes interruptible U-Boot. ...",
)
Ticket.update(tid, status=Status.IN_PROGRESS)
Ticket.comment(tid, text="Repro confirmed at 115200 8N1", author="case")
Swap InMemoryBackend for BugzillaBackend(url, api_key, product, component)
and the same code talks to Bugzilla. The metaclass pattern (TicketMeta in
wintermute/tickets.py) routes the static Ticket.create / read / update / comment calls to the active backend. There is no per-call backend argument
because the cartridge code (Part 4) and AI workflows (Part 5+) just call
Ticket.create(...) and trust the engagement-level configuration.
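A metaclass that forwards class-level calls to a registered backend fits in a few lines. This is a sketch of the general technique under my own assumptions, not a copy of TicketMeta: BackendRouterMeta, TicketSketch, and ListBackend are hypothetical names, and only create and update are routed.

```python
class BackendRouterMeta(type):
    """Hypothetical metaclass: class-level calls are forwarded to the active backend."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        cls._backends = {}     # each class gets its own registry
        cls._default = None

    def register_backend(cls, name, backend, make_default=False):
        cls._backends[name] = backend
        if make_default or cls._default is None:
            cls._default = name

    @property
    def backend(cls):
        if cls._default is None:
            raise RuntimeError("no backend registered")
        return cls._backends[cls._default]

    # Methods defined on the metaclass are callable on the class itself.
    def create(cls, **fields):
        return cls.backend.create(**fields)

    def update(cls, ticket_id, **fields):
        return cls.backend.update(ticket_id, **fields)


class ListBackend:
    """Toy backend that keeps tickets in a list and uses the position as the ID."""

    def __init__(self):
        self.rows = []

    def create(self, **fields):
        self.rows.append(dict(fields))
        return str(len(self.rows))

    def update(self, ticket_id, **fields):
        self.rows[int(ticket_id) - 1].update(fields)
        return True


class TicketSketch(metaclass=BackendRouterMeta):
    pass


store = ListBackend()
TicketSketch.register_backend("mem", store, make_default=True)
tid = TicketSketch.create(title="Unauthenticated U-Boot console on iot-cam-01")
TicketSketch.update(tid, status="IN_PROGRESS")
print(tid, store.rows)
```

Callers never name a backend, so swapping ListBackend for something that talks to a tracker changes one registration line and nothing else.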
The same shape applies to reports. From examples/01-Basic-Examples.ipynb:
from wintermute.backends.docx_reports import DocxTplPerVulnBackend
from wintermute.reports import Report, ReportSpec

Report.register_backend("docx", DocxTplPerVulnBackend(
    template_dir="templates",
    main_template="report_main.docx",
    vuln_template="report_vuln.docx",
), make_default=True)

Report.save(
    ReportSpec(title="ACME IoT Camera Hardware Assessment",
               author="Case",
               summary="Findings on iot-cam-01 ..."),
    [op],
    "out.docx",
)
DocxTplPerVulnBackend walks the operation graph (devices → services →
peripherals → vulnerabilities and runs → findings) via
collect_vulnerabilities and collect_test_runs (reports.py:297,
reports.py:393), composing a per-vulnerability section from
templates/report_vuln.docx, a per-test-run section from
templates/report_test_run.docx, and stitching them under
templates/report_main.docx. Templates ship in the templates/ directory; copy and customize for your client palette.
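The graph walk itself is an ordinary recursive traversal. The sketch below runs over a nested dict rather than real Operation objects and is not the code in reports.py: walk_vulnerabilities, the dict layout, and the score field are assumptions made so the example is self-contained.

```python
from __future__ import annotations

from typing import Any, Iterator


def walk_vulnerabilities(node: Any, path: tuple[str, ...] = ()) -> Iterator[tuple[str, dict]]:
    """Yield (location, vulnerability) pairs from a nested operation-like structure."""
    if isinstance(node, dict):
        # Use whichever identifying field the node has to extend the location path.
        label = node.get("hostname") or node.get("name") or node.get("app")
        here = path + (str(label),) if label else path
        for vuln in node.get("vulnerabilities", []):
            yield "/".join(here), vuln
        for key, value in node.items():
            if key != "vulnerabilities":
                yield from walk_vulnerabilities(value, here)
    elif isinstance(node, list):
        for item in node:
            yield from walk_vulnerabilities(item, path)


# Illustrative data only; the real model uses objects, not dicts.
operation = {
    "name": "acme-iotcam-2026-Q2",
    "devices": [{
        "hostname": "iot-cam-01",
        "services": [{"app": "dropbear", "vulnerabilities": []}],
        "peripherals": [{
            "name": "debug-uart",
            "vulnerabilities": [{"title": "Unauthenticated U-Boot console", "score": 8}],
        }],
    }],
}
found = list(walk_vulnerabilities(operation))
for location, vuln in found:
    print(location, "->", vuln["title"])
```

Carrying the location alongside each finding is what lets a per-vulnerability template say where on the device the issue lives without a second lookup.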
A Real-World Pentest Workflow — Without an LLM Yet
Pulling the threads together, here is what a typical day looks like with the console alone:
- Day 0: Onboarding. wintermute → operation create acme-iotcam-2026-Q2 → set dates → add analyst, add user, add device, add cloudaccount → backend setup json ./data → save.
- Day 1: Recon and modeling. Manual visual board survey. Drill into [devices/iot-cam-01], add peripheral uart/jtag/spi, set pinouts.
- Day 1–N: Plan-driven execution. testruns load TestPlans/TP-HW-BLACKBOX-001.json → testruns generate → drilldown per run → note, vuln, pass/fail.
- Day N+1: Deliverable. setup_report_backend docx ... → generate_report ./out/iotcam.docx.
- Day N+2: Tracking. Each Vulnerability becomes a Ticket row in Bugzilla via the same Ticket.create(...) calls.
This is the workflow we will automate in the rest of the series. Every
phase has an MCP tool counterpart, every phase mutates the same Operation
object, every phase is a place where an agent can plug in.
What’s Next
Part 3 covers the AI subsystem proper: the Router, the four shipped LLM providers, the RAG engine (vector_store_type: "local" vs Qdrant), how tools.json glues binary locations to the LLM tool surface, and the difference between simple_chat, tool_calling_chat, and the global ToolRegistry. After that, every post is agent-shaped.





