Evidence Inventory
Public V1 + V2 run index
All Runs
How to use this page
This page lists the Lab's public run inventory across published V1 and V2 evidence lines. Each row shows the run ID, verdict, evidence tier, linked ruleset, and available evidence surfaces.
Note: Search engines are pointed at a smaller subset of high-value runs. This page remains the full public run index, including simulated baselines and other valid records that are not prioritized for search. For a complete route inventory, use the Site Topology Map.
Total Runs: 48
Reproduced (V2): 11
Simulated (V1): 32
Dispute Ready: 1
📊 How to Read This Table
Reproduced: Real evidence pack with a verifiable hash/seal. The pack can be downloaded, re-run locally, and its hash compared.
✓ Reproduced = downloadable pack + deterministic recheck + hash matches release seal
Simulated: Synthetic evidence pack (not from real execution). For demo/coverage only; not dispute-ready.
Dispute Ready: FAIL verdict with evidence pointers to the triggered clauses. Arbitration-ready: clause + evidence + FMM pointer.
Declared: Manifest/metadata only, with no downloadable evidence. Cannot be independently verified.
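The "hash matches release seal" step for a Reproduced run can be sketched as follows. This is a minimal illustration, not the Lab's actual tooling: the page only says "hash", so the use of SHA-256, the helper names `sha256_of_file` and `matches_release_seal`, and the idea that the seal is published as a hex string are all assumptions made for the example.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Read fixed-size chunks so large evidence packs need not fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_release_seal(pack_path: Path, sealed_hash: str) -> bool:
    """Compare a locally computed digest with a published seal.

    ASSUMPTION: the seal is a hex-encoded SHA-256 string; the real
    seal format is not described on this page.
    """
    actual = sha256_of_file(pack_path)
    expected = sealed_hash.strip().lower()
    return hmac.compare_digest(actual, expected)
```

A run would count as reproduced under this sketch only if the pack regenerated by a local re-run yields the same digest as the published seal; any difference means the recheck did not match.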
Domain Labels (D1, D2, D3, D4)
d1 Provenance: Identity, environment, provenance integrity
d2 Lifecycle: Execution lifecycle, state transitions
d3 Arbitration: Dispute resolution, evidence pointers
d4 Interop: Cross-framework protocol compliance
Host vs Interop
Host = Orchestration framework running the agent (LangGraph, CrewAI, etc.)
Interop = Protocol stack used for cross-framework communication (MCP, A2A, ACP)
| ID | Scenario (Domain + Validates) | Tier | Source | Host + Interop | Verdict | Ruleset |
|---|---|---|---|---|---|---|
| mcp-d1-fail-benchmark-001 | d1: d1_basic_fail (Identity, environment, provenance integrity) | Dispute Ready | v2 | MCP | FAIL | ruleset-v2.0.1 |
| acp-d1-real-runner-001 | d1: d1_basic_pass (Identity, environment, provenance integrity) | Reproduced | v2 | ACP | PASS | ruleset-v2.0.1 |
| acp-d1-real-runner-002 | d1: d1_basic_pass (Identity, environment, provenance integrity) | Reproduced | v2 | ACP | PASS | ruleset-v2.0.1 |
| interop-mcp-a2a-langgraph-d1-real-runner-det-001 | d1: d1_basic_pass (Identity, environment, provenance integrity) | Reproduced | v2 | LangGraph (MCP, A2A) | PASS | ruleset-v2.0.1 |
| langchain-d1-real-runner-det-001 | d1: d1_basic_pass (Identity, environment, provenance integrity) | Reproduced | v2 | LangChain | PASS | ruleset-v2.0.1 |
| langgraph-d1-real-runner-det-001 | d1: d1_basic_pass (Identity, environment, provenance integrity) | Reproduced | v2 | LangGraph | PASS | ruleset-v2.0.1 |
| mcp-d1-real-runner-001 | d1: d1_basic_pass (Identity, environment, provenance integrity) | Reproduced | v2 | MCP | PASS | ruleset-v2.0.1 |
| mcp-d1-real-runner-det-001 | d1: d1_basic_pass (Identity, environment, provenance integrity) | Reproduced | v2 | MCP | PASS | ruleset-v2.0.1 |
| mcp-d1-real-runner-det-002 | d1: d1_basic_pass (Identity, environment, provenance integrity) | Reproduced | v2 | MCP | PASS | ruleset-v2.0.1 |
| metagpt-d1-real-runner-det-001 | d1: d1_basic_pass (Identity, environment, provenance integrity) | Reproduced | v2 | metagpt | PASS | ruleset-v2.0.1 |
| semantickernel-d1-real-runner-det-001 | d1: d1_basic_pass (Identity, environment, provenance integrity) | Reproduced | v2 | Semantic Kernel | PASS | ruleset-v2.0.1 |
| mcp-d1-synthetic-001 | d1: d1_basic_pass (Identity, environment, provenance integrity) | Reproduced | v2 | MCP | PASS | ruleset-v2.0.1 |
| admission-not-admissible-01 | gf-01-single-agent-lifecycle (General conformance) | Simulated | v1 | fixture | FAIL | ruleset-1.0 |
| arb-d1-budget-fail-deny-missing-gate-v0.4 | d1: d1-fail-cl-d1-03-scenario (Identity, environment, provenance integrity) | Simulated | v1 | fixture | N/A | ruleset-1.2 |
| arb-d1-budget-fail-fixture-v0.3 | d1: d1-budget-fail-scenario (Identity, environment, provenance integrity) | Simulated | v1 | fixture | N/A | ruleset-1.1 |
| arb-d1-budget-fail-outcome-invalid-v0.4 | d1: d1-fail-cl-d1-02-scenario (Identity, environment, provenance integrity) | Simulated | v1 | fixture | N/A | ruleset-1.2 |
| arb-d1-budget-pass-fixture-v0.3 | d1: d1-budget-pass-scenario (Identity, environment, provenance integrity) | Simulated | v1 | fixture | N/A | ruleset-1.1 |
| arb-d2-lifecycle-fail-post-terminal-exec-v0.4 | d2: d2-fail-cl-d2-03-scenario (Execution lifecycle, state transitions) | Simulated | v1 | fixture | N/A | ruleset-1.2 |
| arb-d2-lifecycle-fail-terminal-state-invalid-v0.4 | d2: d2-fail-cl-d2-02-scenario (Execution lifecycle, state transitions) | Simulated | v1 | fixture | N/A | ruleset-1.2 |
| arb-d2-lifecycle-state-fail-fixture-v0.3 | d2: d2-lifecycle-fail-scenario (Execution lifecycle, state transitions) | Simulated | v1 | fixture | N/A | ruleset-1.1 |
| arb-d2-lifecycle-state-pass-fixture-v0.3 | d2: d2-lifecycle-pass-scenario (Execution lifecycle, state transitions) | Simulated | v1 | fixture | N/A | ruleset-1.1 |
| arb-d3-authz-decision-fail-fixture-v0.3 | d3: d3-authz-fail-scenario (Dispute resolution, evidence pointers) | Simulated | v1 | fixture | N/A | ruleset-1.1 |
| arb-d3-authz-decision-pass-fixture-v0.3 | d3: d3-authz-pass-scenario (Dispute resolution, evidence pointers) | Simulated | v1 | fixture | N/A | ruleset-1.1 |
| arb-d3-authz-fail-deny-no-confirm-v0.4 | d3: d3-fail-cl-d3-03-scenario (Dispute resolution, evidence pointers) | Simulated | v1 | fixture | N/A | ruleset-1.2 |
| arb-d3-authz-fail-sra-incomplete-v0.4 | d3: d3-fail-cl-d3-02-scenario (Dispute resolution, evidence pointers) | Simulated | v1 | fixture | N/A | ruleset-1.2 |
| arb-d4-termination-fail-reason-invalid-v0.4 | d4: d4-fail-cl-d4-02-scenario (Cross-framework protocol compliance) | Simulated | v1 | fixture | N/A | ruleset-1.2 |
| arb-d4-termination-fail-uncontrolled-recovery-v0.4 | d4: d4-fail-cl-d4-03-scenario (Cross-framework protocol compliance) | Simulated | v1 | fixture | N/A | ruleset-1.2 |
| arb-d4-termination-recovery-fail-fixture-v0.3 | d4: d4-termination-fail-scenario (Cross-framework protocol compliance) | Simulated | v1 | fixture | N/A | ruleset-1.1 |
| arb-d4-termination-recovery-pass-fixture-v0.3 | d4: d4-termination-pass-scenario (Cross-framework protocol compliance) | Simulated | v1 | fixture | N/A | ruleset-1.1 |
| gf-01-a2a-fail | gf-01-single-agent-lifecycle (General conformance) | Simulated | v1 | A2A | N/A | ruleset-1.0 |
| gf-01-a2a-official-v0.2 | gf-01-single-agent-lifecycle (General conformance) | Simulated | v1 | A2A | N/A | ruleset-1.0 |
| gf-01-a2a-pass | gf-01-single-agent-lifecycle (General conformance) | Simulated | v1 | A2A | N/A | ruleset-1.0 |
| gf-01-adjudicated-fail-01 | gf-01-single-agent-lifecycle (General conformance) | Simulated | v1 | fixture | FAIL | ruleset-1.0 |
| gf-01-langchain-official-v0.2 | gf-01-single-agent-lifecycle (General conformance) | Simulated | v1 | LangChain | N/A | ruleset-1.0 |
| gf-01-langchain-pass | gf-01-single-agent-lifecycle (General conformance) | Simulated | v1 | LangChain | PASS | ruleset-1.0 |
| gf-01-mcp-official-v0.2 | gf-01-single-agent-lifecycle (General conformance) | Simulated | v1 | MCP | N/A | ruleset-1.0 |
| gf-01-mcp-pass | gf-01-single-agent-lifecycle (General conformance) | Simulated | v1 | MCP | PASS | ruleset-1.0 |
| gf-02-fail | gf-01-single-agent-lifecycle (General conformance) | Simulated | v1 | fixture | FAIL | ruleset-1.0 |
| pydantic-ai-d1-budget-fail-01 | d1: d1-budget-fail-scenario (Identity, environment, provenance integrity) | Simulated | v1 | pydantic-ai | N/A | ruleset-1.2 |
| pydantic-ai-d1-budget-fail-02 | d1: d1-budget-fail-scenario (Identity, environment, provenance integrity) | Simulated | v1 | pydantic-ai | N/A | ruleset-1.2 |
| pydantic-ai-d1-budget-pass-01 | d1: d1-budget-pass-scenario (Identity, environment, provenance integrity) | Simulated | v1 | pydantic-ai | N/A | ruleset-1.2 |
| pydantic-ai-d1-budget-pass-02 | d1: d1-budget-pass-scenario (Identity, environment, provenance integrity) | Simulated | v1 | pydantic-ai | N/A | ruleset-1.2 |
| v05-d1-langgraph-pass-budget-allow | d1: d1-budget-pass-scenario (Identity, environment, provenance integrity) | Simulated | v1 | LangGraph | N/A | ruleset-1.0 |
| v05-d1-sk-pass-budget-allow | d1: d1-budget-pass-scenario (Identity, environment, provenance integrity) | Simulated | v1 | sk | N/A | ruleset-1.0 |
| crewai-d1-real-runner-det-001 | d1: d1_basic_pass (Identity, environment, provenance integrity) | Declared | v2 | CrewAI | PASS | ruleset-v2.0.1 |
| crewai-d1-real-runner-det-002 | d1: d1_basic_pass (Identity, environment, provenance integrity) | Declared | v2 | CrewAI | PASS | ruleset-v2.0.1 |
| magentic-one-d1-real-runner-det-001 | d1: d1_basic_pass (Identity, environment, provenance integrity) | Declared | v2 | magentic_one (AUTOGEN) | PASS | ruleset-v2.0.1 |
| magentic-one-d1-real-runner-det-002 | d1: d1_basic_pass (Identity, environment, provenance integrity) | Declared | v2 | magentic_one (AUTOGEN) | PASS | ruleset-v2.0.1 |
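
The summary counters at the top of the page can be cross-checked against the table by tallying rows per tier. The sketch below is one way to do that, assuming the table is available as pipe-delimited text with the seven columns shown above. The helper names (`parse_rows`, `tally_by_tier`) and the column keys are illustrative, and `SAMPLE_TABLE` is only a five-row excerpt with the Scenario cell shortened to the scenario ID.

```python
from collections import Counter

# Column keys are illustrative names for the seven table columns.
COLUMNS = ["id", "scenario", "tier", "source", "host_interop", "verdict", "ruleset"]

# Five-row excerpt of the run table (Scenario cell shortened for brevity).
SAMPLE_TABLE = """\
| ID | Scenario | Tier | Source | Host + Interop | Verdict | Ruleset |
|---|---|---|---|---|---|---|
| mcp-d1-fail-benchmark-001 | d1_basic_fail | Dispute Ready | v2 | MCP | FAIL | ruleset-v2.0.1 |
| acp-d1-real-runner-001 | d1_basic_pass | Reproduced | v2 | ACP | PASS | ruleset-v2.0.1 |
| gf-01-mcp-pass | gf-01-single-agent-lifecycle | Simulated | v1 | MCP | PASS | ruleset-1.0 |
| gf-02-fail | gf-01-single-agent-lifecycle | Simulated | v1 | fixture | FAIL | ruleset-1.0 |
| crewai-d1-real-runner-det-001 | d1_basic_pass | Declared | v2 | CrewAI | PASS | ruleset-v2.0.1 |
"""


def parse_rows(table_text: str) -> list[dict[str, str]]:
    """Parse pipe-delimited table rows into dicts, skipping header and separator."""
    rows = []
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        # Skip malformed rows, the header row, and the |---| separator row.
        if len(cells) != len(COLUMNS) or cells[0] == "ID" or set(cells[0]) <= {"-"}:
            continue
        rows.append(dict(zip(COLUMNS, cells)))
    return rows


def tally_by_tier(rows: list[dict[str, str]]) -> Counter:
    """Count runs per evidence tier."""
    return Counter(row["tier"] for row in rows)


if __name__ == "__main__":
    rows = parse_rows(SAMPLE_TABLE)
    print(len(rows), dict(tally_by_tier(rows)))
```

Applied to the full table rather than the excerpt, the tally should agree with the counters at the top of the page: 48 total runs, of which 11 are Reproduced, 32 Simulated, and 1 Dispute Ready, leaving 4 Declared.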