Evidence Inventory

Public V1 + V2 run index

All Runs

How to use this page

This page lists the Lab's public run inventory across published V1 and V2 evidence lines. Each row shows the run ID, verdict, evidence tier, linked ruleset, and available evidence surfaces.

Note: Search engines are pointed at a smaller subset of high-value runs. This page remains the full public run index, including simulated baselines and other valid records that are not prioritized for search. For a complete route inventory, use the Site Topology Map.

Total Runs

Reproduced (V2)

Simulated (V1)

Dispute Ready

📊 How to Read This Table

Reproduced

Real evidence pack with verifiable hash/seal

Can download, re-run locally, compare hash

✓ Reproduced = downloadable pack + deterministic recheck + hash matches release seal

Simulated

Synthetic evidence pack (not from real execution)

For demo/coverage only, not dispute-ready

Dispute Ready

FAIL verdict with evidence pointers to triggered clauses

Arbitration-ready: clause + evidence + FMM pointer

Declared

Manifest/metadata only, no downloadable evidence

Cannot be independently verified
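The recheck described for the Reproduced tier (download the pack, re-run locally, compare the hash against the release seal) could be sketched as follows. This is a minimal illustration, not the Lab's actual tooling: the page does not name the hash algorithm or the seal format, so SHA-256 and a plain hex-string seal are assumptions, and `sha256_of_file` and `matches_seal` are hypothetical helper names.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks.

    ASSUMPTION: SHA-256 is used only for illustration; the page does not
    state which hash algorithm the release seal is based on.
    """
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_seal(pack_path: Path, published_seal: str) -> bool:
    """Compare the digest of a local evidence pack with a published seal.

    ASSUMPTION: the seal is a hex digest string. Whitespace and letter
    case are normalized before comparison.
    """
    local_digest = sha256_of_file(pack_path)
    return hmac.compare_digest(local_digest, published_seal.strip().lower())
```

Under these assumptions, a run would count as Reproduced only when `matches_seal` returns `True` for the downloaded pack after the local re-run. A Declared run has no downloadable pack, so there is nothing to pass to such a check.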

Domain Labels (D1, D2, D3, D4)

d1 Provenance: Identity, environment, provenance integrity

d2 Lifecycle: Execution lifecycle, state transitions

d3 Arbitration: Dispute resolution, evidence pointers

d4 Interop: Cross-framework protocol compliance

Host vs Interop

Host = Orchestration framework running the agent (LangGraph, CrewAI, etc.)

Interop = Protocol stack used for cross-framework communication (MCP, A2A, ACP)

All (48) | Dispute Ready (1) | Reproduced (11) | Simulated (32) | Declared (4)

ID	Scenario (Domain + Validates)	Tier	Source	Host + Interop	Verdict	Ruleset
mcp-d1-fail-benchmark-001	d1 / d1_basic_fail: Identity, environment, provenance integrity	Dispute Ready	v2	MCP + MCP	FAIL	ruleset-v2.0.1
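For readers who want to work with the index programmatically, the table can be modeled as simple records. The sketch below is illustrative only: `RunRow`, `filter_by_tier`, and `tier_is_consistent` are hypothetical names, the field layout is inferred from the column headers, and reading the Host + Interop cell as host `MCP` and interop `MCP` is an assumption. The only row included is the one shown above, and the tab counts are copied from the filter bar.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunRow:
    """One row of the run index (field names are assumptions)."""
    run_id: str
    domain: str
    scenario: str
    tier: str
    source: str
    host: str
    interop: str
    verdict: str
    ruleset: str


# The single row visible in the table above.
ROWS = [
    RunRow(
        run_id="mcp-d1-fail-benchmark-001",
        domain="d1",
        scenario="d1_basic_fail",
        tier="Dispute Ready",
        source="v2",
        host="MCP",      # ASSUMPTION: first half of the fused "MCPMCP" cell
        interop="MCP",   # ASSUMPTION: second half of the fused "MCPMCP" cell
        verdict="FAIL",
        ruleset="ruleset-v2.0.1",
    ),
]

# Counts as shown on the filter tabs.
TAB_COUNTS = {"Dispute Ready": 1, "Reproduced": 11, "Simulated": 32, "Declared": 4}
TOTAL_RUNS = 48


def filter_by_tier(rows: list[RunRow], tier: str) -> list[RunRow]:
    """Return the rows whose tier label matches exactly."""
    return [row for row in rows if row.tier == tier]


def tier_is_consistent(row: RunRow) -> bool:
    """Check the page's rule that Dispute Ready implies a FAIL verdict."""
    return row.tier != "Dispute Ready" or row.verdict == "FAIL"


if __name__ == "__main__":
    # The four tier tabs add up to the "All" tab.
    print(sum(TAB_COUNTS.values()) == TOTAL_RUNS)
    print([row.run_id for row in filter_by_tier(ROWS, "Dispute Ready")])
```

Keeping host and interop as separate fields mirrors the Host vs Interop distinction in the legend, so rows can be filtered by orchestration framework and by protocol stack independently.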