SSOT: RUN-INDEX-V2026-01-26
All Runs
How to read a run
Each run displays source, tier, verdict, and available evidence surfaces. If an item is missing in 32 V1 or 16 V2 runs, that is an evidence signal, not a UI error.
Note: The public sitemap.xml indexes only high-value (Reproduced/Dispute-Ready) runs to maintain SEO quality. For a complete inventory of all runs (including simulated baselines), use this index or the Site Topology Map.
Total Runs: 48
Reproduced (V2): 11
Simulated (V1): 32
Dispute Ready: 1
📊 How to Read This Table
Reproduced
Real evidence pack with verifiable hash/seal
Can download, re-run locally, compare hash
✓ Reproduced = downloadable pack + deterministic recheck + hash matches release seal
Simulated
Synthetic evidence pack (not from real execution)
For demo/coverage only, not dispute-ready
Dispute Ready
FAIL verdict with evidence pointers to triggered clauses
Arbitration-ready: clause + evidence + FMM pointer
Declared
Manifest/metadata only, no downloadable evidence
Cannot be independently verified
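The "hash matches release seal" check for a Reproduced run can be sketched as follows. This is a minimal illustration only: the source does not say which hash algorithm or seal format is used, so SHA-256 and a hex-encoded seal string are assumptions, and `pack_digest` and `matches_seal` are hypothetical helper names.

```python
import hashlib


def pack_digest(pack_path, chunk_size=65536):
    """Return the hex SHA-256 digest of a downloaded evidence pack.

    SHA-256 is an assumption; substitute whatever algorithm the seal uses.
    """
    digest = hashlib.sha256()
    with open(pack_path, "rb") as handle:
        # Read in chunks so large packs do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_seal(pack_path, release_seal):
    """Compare the local digest with a published seal (hex string, assumed)."""
    return pack_digest(pack_path) == release_seal.strip().lower()
```

A run would count as Reproduced under this sketch only if `matches_seal` returns True for the pack produced by a local re-run; a mismatch means the local result cannot be tied to the release.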
Domain Labels (D1, D2, D3, D4)
d1 Provenance: Identity, environment, provenance integrity
d2 Lifecycle: Execution lifecycle, state transitions
d3 Arbitration: Dispute resolution, evidence pointers
d4 Interop: Cross-framework protocol compliance
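The four domain labels can be kept as a small lookup table. The sketch below also guesses at how a domain might be read out of a run ID; it assumes the ID carries the domain as a hyphen-delimited segment (as in `mcp-d1-fail-benchmark-001`), which is inferred from a single example rather than stated by the index. `DOMAIN_LABELS` and `domain_of` are hypothetical names.

```python
import re

# Domain labels as listed in this index.
DOMAIN_LABELS = {
    "d1": ("Provenance", "Identity, environment, provenance integrity"),
    "d2": ("Lifecycle", "Execution lifecycle, state transitions"),
    "d3": ("Arbitration", "Dispute resolution, evidence pointers"),
    "d4": ("Interop", "Cross-framework protocol compliance"),
}

# Assumption: the domain appears as its own hyphen-delimited segment.
_DOMAIN_SEGMENT = re.compile(r"(?:^|-)(d[1-4])(?:-|$)")


def domain_of(run_id):
    """Return the domain key ('d1'..'d4') found in a run ID, or None."""
    match = _DOMAIN_SEGMENT.search(run_id.lower())
    return match.group(1) if match else None


key = domain_of("mcp-d1-fail-benchmark-001")
print(key, DOMAIN_LABELS[key][0])  # prints: d1 Provenance
```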
Host vs Interop
Host = Orchestration framework running the agent (LangGraph, CrewAI, etc.)
Interop = Protocol stack used for cross-framework communication (MCP, A2A, ACP)
| ID | Scenario (Domain + Validates) | Tier | Source | Host + Interop | Verdict | Ruleset |
|---|---|---|---|---|---|---|
| mcp-d1-fail-benchmark-001 | d1 · d1_basic_fail: Identity, environment, provenance integrity | Dispute Ready | v2 | MCP + MCP | FAIL | ruleset-v2.0.1 |
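For readers who want to process this index programmatically, a table row can be split into a typed record. The sketch below is illustrative only: the `RunRow` field names, `parse_row`, and `is_sitemap_eligible` are hypothetical, and the eligibility rule simply restates the note that the public sitemap lists only Reproduced and Dispute Ready runs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunRow:
    run_id: str
    scenario: str
    tier: str
    source: str
    host_interop: str
    verdict: str
    ruleset: str


def parse_row(line):
    """Split one pipe-delimited table row into a RunRow."""
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
    if len(cells) != 7:
        raise ValueError(f"expected 7 columns, got {len(cells)}")
    return RunRow(*cells)


def is_sitemap_eligible(row):
    """Only Reproduced and Dispute Ready runs appear in the public sitemap."""
    return row.tier in {"Reproduced", "Dispute Ready"}


sample = (
    "| mcp-d1-fail-benchmark-001 | d1_basic_fail | Dispute Ready "
    "| v2 | MCP | FAIL | ruleset-v2.0.1 |"
)
row = parse_row(sample)
print(row.run_id, row.verdict, is_sitemap_eligible(row))
# prints: mcp-d1-fail-benchmark-001 FAIL True
```

Simulated and Declared rows parse the same way but fail the eligibility check, which matches the index being the only complete inventory.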