Evidence Inventory
Public V1 + V2 run index
All Runs
How to use this page
This page lists the Lab's public run inventory across published V1 and V2 evidence lines. Each row shows the run ID, verdict, evidence tier, linked ruleset, and available evidence surfaces.
Note: Search engines are pointed at a smaller subset of high-value runs. This page remains the full public run index, including simulated baselines and other valid records that are not prioritized for search. For a complete route inventory, use the Site Topology Map.
Total Runs: 48
Reproduced (V2): 11
Simulated (V1): 32
Dispute Ready: 1
📊 How to Read This Table
Reproduced
Real evidence pack with a verifiable hash/seal. You can download the pack, re-run it locally, and compare the hash.
✓ Reproduced = downloadable pack + deterministic recheck + hash matches release seal
Simulated
Synthetic evidence pack (not from real execution). For demo/coverage only; not dispute-ready.
Dispute Ready
FAIL verdict with evidence pointers to the triggered clauses. Arbitration-ready: clause + evidence + FMM pointer.
Declared
Manifest/metadata only, with no downloadable evidence. Cannot be independently verified.
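The "hash matches release seal" step of the Reproduced tier can be illustrated with a minimal sketch. It assumes the seal is a hex-encoded SHA-256 digest of the downloaded pack file; this page does not state the actual hash algorithm or seal format, and the function names `pack_digest` and `matches_seal` are hypothetical.

```python
import hashlib
from pathlib import Path


def pack_digest(pack_path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks.

    Assumption: the seal is a SHA-256 digest of the raw pack bytes.
    """
    digest = hashlib.sha256()
    with pack_path.open("rb") as fh:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_seal(pack_path: Path, published_seal: str) -> bool:
    """Compare the locally computed digest with a published seal value.

    The seal is normalized (whitespace stripped, lowercased) before comparison.
    """
    return pack_digest(pack_path) == published_seal.strip().lower()
```

A run would count as locally reproduced under this sketch only when `matches_seal` returns `True` after the deterministic recheck has produced the pack.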
Domain Labels (D1, D2, D3, D4)
d1 Provenance: Identity, environment, provenance integrity
d2 Lifecycle: Execution lifecycle, state transitions
d3 Arbitration: Dispute resolution, evidence pointers
d4 Interop: Cross-framework protocol compliance
Host vs Interop
Host = Orchestration framework running the agent (LangGraph, CrewAI, etc.)
Interop = Protocol stack used for cross-framework communication (MCP, A2A, ACP)
| ID | Scenario (Domain + Validates) | Tier | Source | Host + Interop | Verdict | Ruleset |
|---|---|---|---|---|---|---|
| mcp-d1-fail-benchmark-001 | d1: d1_basic_fail (Identity, environment, provenance integrity) | Dispute Ready | v2 | MCP + MCP | FAIL | ruleset-v2.0.1 |
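For readers who want to work with the index programmatically, the row above could be modeled roughly as follows. This is only a sketch: the `RunRow` class, its field names, and the `by_tier` helper are hypothetical, and reading the Host + Interop cell as host `MCP` and interop `MCP` is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunRow:
    """One row of the run index (field names are illustrative only)."""
    run_id: str
    domain: str
    scenario: str
    tier: str
    source: str
    host: str
    interop: str
    verdict: str
    ruleset: str


# The single row shown in the table, transcribed by hand.
ROWS = [
    RunRow(
        run_id="mcp-d1-fail-benchmark-001",
        domain="d1",
        scenario="d1_basic_fail",
        tier="Dispute Ready",
        source="v2",
        host="MCP",      # assumed split of the Host + Interop cell
        interop="MCP",   # assumed split of the Host + Interop cell
        verdict="FAIL",
        ruleset="ruleset-v2.0.1",
    ),
]


def by_tier(rows: list[RunRow], tier: str) -> list[RunRow]:
    """Return the rows whose evidence tier equals the given label."""
    return [row for row in rows if row.tier == tier]
```

Calling `by_tier(ROWS, "Dispute Ready")` selects the one FAIL run listed here; the same filter applies to the other tier labels defined in the legend.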