ERC-8210 v2 Changelog — Part 2 of 2: Full Text of All Seventeen Items
Continuing from Part 1. Below is the full content of the seventeen items, grouped as in the summary.
A. Core spec text changes (targeted for v2)
1. New “Inherited Assumptions” section in Rationale
A dedicated subsection in Rationale making the inherited assumptions from ERC-8183 explicit, covering two layers:
Role independence: AAP assumes that the Client, Provider, and Evaluator roles defined in ERC-8183 are genuinely independent. AAP cannot verify this at the protocol layer. If the assumption fails (for example, Provider and Evaluator sharing identity), the EvaluatorDispute path acts as a remediation channel, but that path depends on external independence signals supplied by a trust scoring layer or attestation provider. This part was prompted by @RNWY raising the role-collusion attack vector.
Pre-commitment facts: Some CoverageTypes implicitly require pre-commitment facts that v1 trusts rather than verifies. SlashingLoss is only meaningful if the Agent is actually staked; AMLFreeze is only meaningful if the wallet has passed AML screening. This part comes from @douglasborthwick-cry’s observation in PR #1632. Beyond flagging the problem, he pointed to the resolution path: the IRiskHook extension (items 3 and 11) lets implementations perform explicit on-chain state verification at commitToJob time, upgrading “trust” to “verification.” Douglas also contributed a wallet-state attestation primitive as a reference implementation shape for pre-commitment verification: ES256 signatures with JWKS key publication, coverage of 33 chains, signed content that includes the block and the condition, a signing key that the Assured Agent does not hold, and verification that anyone can independently re-run.
2. Extend EvaluatorDispute eligibility
The current wording (“formal dispute state or on-chain dispute attestation”) cannot accommodate the case where a Job reaches the Completed terminal state but an independence violation is surfaced afterwards. This scenario was raised by @RNWY.
v2 will:
- Allow an EvaluatorDispute Claim to be filed within a grace period after Completed (suggested minimum 7 days, implementation-defined), provided an independence-violation attestation is supplied.
- Keep the current behavior as the default to preserve backward compatibility; implementations must explicitly opt in to the extended eligibility.
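The opt-in rule can be modelled in a few lines. The following Python sketch is illustrative only: the function name, the state labels, the boolean inputs, and the inclusive window boundary are all assumptions made for this example, not part of the spec or the reference implementation.

```python
from dataclasses import dataclass
from typing import Optional

# Suggested minimum from the changelog; the real value is implementation-defined.
SUGGESTED_MIN_GRACE_SECONDS = 7 * 24 * 60 * 60


@dataclass(frozen=True)
class JobView:
    state: str                   # hypothetical labels, e.g. "Disputed", "Completed"
    completed_at: Optional[int]  # unix seconds; None if the Job never completed


def evaluator_dispute_eligible(
    job: JobView,
    now: int,
    has_dispute_evidence: bool,
    has_independence_attestation: bool,
    post_completion_opt_in: bool = False,
    grace_seconds: int = SUGGESTED_MIN_GRACE_SECONDS,
) -> bool:
    # v1 path: a formal dispute state or an on-chain dispute attestation.
    if has_dispute_evidence:
        return True
    # The v2 path stays off unless the implementation explicitly opts in.
    if not post_completion_opt_in:
        return False
    if job.state != "Completed" or job.completed_at is None:
        return False
    if not has_independence_attestation:
        return False
    # Assumption: the grace window is inclusive at both ends.
    return job.completed_at <= now <= job.completed_at + grace_seconds
```

With the default arguments the function reproduces v1 behavior; only a caller that sets post_completion_opt_in sees the post-Completed window.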
3. IRiskHook: independence-check use case
The IRiskHook section currently only describes reputation-based collateral recommendations. Driven by discussion between @RNWY and @thoughtproof on cross-vendor independence signaling, v2 will add:
- A hook implementation MAY reject commitToJob based on an independence signal between Beneficiary and Assured Agent.
- A hook implementation MAY dynamically adjust the recommended committedAmount based on an independence confidence score.
- A suggested (non-binding) interface shape: assessIndependence(addrA, addrB) returns (independent, confidence, signals).
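As a rough illustration of how a hook might combine these two behaviors, here is a Python model. The 0 to 100 confidence scale, the rejection threshold, the scaling formula, and every name except the (independent, confidence, signals) triple are assumptions chosen for the example; the changelog does not say in which direction the amount should move.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class IndependenceResult:
    independent: bool
    confidence: int  # assumed scale: 0 (no information) to 100
    signals: bytes


class CommitRejected(Exception):
    """Raised when the hook refuses commitToJob (hypothetical name)."""


REJECT_CONFIDENCE = 80  # hypothetical policy threshold


def recommended_committed_amount(
    beneficiary: str,
    agent: str,
    base_amount: int,
    assess: Callable[[str, str], IndependenceResult],
) -> int:
    result = assess(beneficiary, agent)
    # No information: do not reject, but recommend the most conservative amount.
    if result.confidence == 0:
        return base_amount * 2
    if not result.independent:
        # A confident "not independent" verdict rejects the commitment.
        if result.confidence >= REJECT_CONFIDENCE:
            raise CommitRejected(f"{beneficiary} and {agent} look linked")
        return base_amount * 2
    # Assumption: higher confidence in independence lowers the recommendation,
    # from 2x base_amount down to 1x at confidence 100.
    return base_amount * (200 - result.confidence) // 100
```

Treating confidence 0 as "no verdict" rather than as a rejection is a policy choice for this sketch; an implementation could equally keep the base amount unchanged in that case.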
4. Security Considerations: Role Independence Assumption
As an extension of item 1, a new paragraph in Security Considerations surfacing the role independence assumption explicitly, noting that it is protected by both the IRiskHook extension layer and Claims Resolver judgment, and recommending that implementations integrate at least one independence signal source.
B. Smaller refinements from community discussion
5. Custom errors on commitToJob
Replace string-based require reverts with typed custom errors. This suggestion came from @cmayorga:
InsufficientAvailableAmount(uint256 available, uint256 requested)
DuplicateCommitment(bytes32 jobId, CoverageType coverageType)
AdverseSelectionBlocked(bytes32 jobId)
AccountNotActive(address agent)
Rationale: automated orchestrators need structured failure reasons to decide between retry, abandon, and alert paths.
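To make the rationale concrete, the sketch below shows how an orchestrator might branch on the decoded error name. It is a Python model, not client code for any particular library: the Action enum, the decide function, and the mapping from error to action are assumptions for illustration, and decoding the revert data into a name and arguments is assumed to happen elsewhere.

```python
from enum import Enum
from typing import Tuple


class Action(Enum):
    RETRY = "retry"
    ABANDON = "abandon"
    ALERT = "alert"


def decide(error_name: str, args: Tuple = ()) -> Action:
    if error_name == "InsufficientAvailableAmount":
        available, _requested = args
        # Retrying with a smaller amount only makes sense if something is left.
        return Action.RETRY if available > 0 else Action.ALERT
    if error_name in ("DuplicateCommitment", "AdverseSelectionBlocked"):
        # Repeating the same call cannot succeed for this Job.
        return Action.ABANDON
    # AccountNotActive and any unrecognized error are escalated to an operator.
    return Action.ALERT
```

With string-based reverts the same branching would require parsing free-form messages, which is the fragility the typed errors are meant to remove.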
6. Chained workflows limitation statement
Add to Rationale: each JobAssurance is 1:1 bound to a single Job, AAP does not track upstream dependencies, and in multi-agent pipelines (A → B → C → D) the Client should set up an independent JobAssurance at each hop. Cross-Job causal tracing is an application-layer concern, out of scope for the AAP core.
This reconciles @cmayorga’s multi-hop workflow scenarios with @thoughtproof’s upstream field design.
7. totalFunded wording fix
Current: “totalFunded monotonically reflects the Agent’s historical funding commitment.”
Issue: withdrawals decrease totalFunded, so the wording is inconsistent with the reference implementation. Found during code review.
Revised: “totalFunded reflects the cumulative net inflow of the Agent’s funding commitment (cumulative deposits minus cumulative withdrawals).”
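A small worked example shows why “monotonically” was wrong. This Python ledger is a toy model: the class and method names are hypothetical, and bounding withdrawals by total_funded is a simplification that the reference implementation may not share.

```python
class FundingLedger:
    """Toy model of net-inflow accounting for totalFunded."""

    def __init__(self) -> None:
        self.total_funded = 0

    def deposit(self, amount: int) -> None:
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self.total_funded += amount

    def withdraw(self, amount: int) -> None:
        # Simplifying assumption: a withdrawal cannot exceed the net inflow.
        if amount <= 0 or amount > self.total_funded:
            raise ValueError("invalid withdrawal")
        self.total_funded -= amount


ledger = FundingLedger()
history = []
ledger.deposit(100)
history.append(ledger.total_funded)
ledger.withdraw(30)
history.append(ledger.total_funded)
ledger.deposit(50)
history.append(ledger.total_funded)
```

Here history ends up as 100, 70, 120: the value drops after the withdrawal, so it equals cumulative deposits (150) minus cumulative withdrawals (30) rather than a monotonically increasing total.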
8. resolveClaim reason handling guidance
The spec is currently silent on whether the reason parameter should be stored in full or hashed. Based on the design decision in the reference implementation (wangbin9953/erc8210-aap), v2 will add implementation guidance: store keccak256(reason) (consistent with how evidence hashes are handled), with the full content tracked off-chain via the ClaimResolved event and referenced through IPFS CIDs or on-chain attestations.
9. Reference multi-hop scenarios in the spec
@cmayorga in PR #1653 was the first to land the three-layer architecture (Structure / Behavior / Recovery) from forum discussion into runnable end-to-end Foundry code, integrating design inputs from Demsys, RNWY, and ThoughtProof. v2 will reference these scenarios under assets/erc-8210/scenarios/ in the Reference Implementation section (or a new “Composition Patterns” subsection) as the canonical reference, covering:
- Multi-hop dependency tracking via the upstream field
- EvaluatorSlashed → fileClaim as automatic claim evidence
- Hybrid off-chain scoring reusing reasoningCID across Layer 2 and Layer 3
Contributor credit will be given in Acknowledgments rather than in the spec body.
C. Structural / extension-level changes
10. New CoverageType: RoleCollusion
Item 2 extends EvaluatorDispute to cover the role-collusion scenario. However, @RNWY pointed out that the two scenarios have fundamentally different trigger semantics: EvaluatorDispute presupposes an on-chain dispute state, while RoleCollusion is characterized precisely by the absence of a dispute trail (the collusion completed cleanly). Folding both into a single CoverageType would produce two mutually exclusive sets of eligibility conditions inside one Integration-with-ERC-8183 table.
v2 will introduce RoleCollusion as a separate CoverageType:
- Separate eligibility conditions (RoleCollusion: post-completion attestation + grace period; EvaluatorDispute: pre-completion dispute state extension)
- Shared Resolver path and payout mechanism, different trigger condition only
- Cleaner for future Resolver qualification and processing workflows
Community input on whether the semantic-clarity gain justifies the enum expansion is welcome.
11. IRiskHook: evidence + score dual-output shape
v1’s IRiskHook returns a single recommended amount. v2 will generalize IRiskHook to return both:
- An evidence reference (bytes32)
- A score component
This lets Resolvers consume the evidence directly rather than relying on an opaque aggregated number.
Evidence reference shape conventions: evidenceRef is a bytes32 and may point to either a single attestation or an aggregation of multiple independent attestations. The transport shape (IPFS CID, on-chain registry hash, signed off-chain payload hash, etc.) is left to implementations. When multiple attestations are combined, each should be independently verifiable; no wrapper signature is expected. The envelope schema is not specified in the ERC-8210 spec and is left to separate specs and the relevant communities. This is an intentional “interface-level anchor, schema-free” design: it gives aggregation a concrete landing point without pulling the spec scope too wide.
Verification taxonomy in the Rationale: IRiskHook’s v2 rationale will reference the three-category verification taxonomy @douglasborthwick-cry proposed in PR #1632, so implementers understand that different verification types correspond to different integration points and representative work:
- Pre-commitment eligibility: whether the Agent meets certain on-chain conditions (wallet state, staking status, AML screening status, etc.) before commitToJob. Representative work: InsumerAPI’s wallet-state attestation.
- Post-hoc resolution evidence: whether the claim is valid at the time resolveClaim runs. Representative work: @thoughtproof’s verification schema (consumed by resolveClaim’s reason bytes).
- Behavioral reputation: whether the Agent has behaved reliably over a historical window. Representative work: RNWY and similar trust scoring oracles.
These three can be aggregated through the same IRiskHook interface, but they correspond to different data sources, validity windows, and consumption timing. The framework helps implementers understand that IRiskHook is not a single-purpose verification hook but a composition point for multiple types of input.
The observation driving the IRiskHook dual-output shape comes from @RNWY, based on production experience with multidimensional scoring (Activity / Risk dual score, quadrant system, five classes of sybil signals) and empirical data on volume-vs-trust decoupling that makes a single aggregated score insufficient in practice.
12. New Rationale section: Evidence-First Composability Principle
v2 will articulate the design principle implicit across the other changes: keep the canonical interface minimal, let composition metadata live in the payload rather than the interface signature, and prefer shipping evidence objects over aggregated scores when downstream consumers need to make their own judgments. This principle applies to IRiskHook, IAMLHook, and any future extension hooks.
The principle covers two layers: single-hook evidence-first output (the original formulation, from @RNWY’s observation that the evidence object can carry the signals Resolvers need), and multi-attestation composition (an extension, where multiple independent issuers each sign their own payload and the aggregation becomes the evidence consumed downstream). The latter aligns with @douglasborthwick-cry’s composable attestation direction; at the spec-body level we acknowledge the aggregation capability at the interface level without binding any specific envelope format.
13. Cross-vendor independence API via IIndependenceSignal
Earlier drafts assumed we would wait for an external Informational ERC (for example, the one @thoughtproof is working on) to standardize the JSON schema for independence verification output. v2 will instead define IIndependenceSignal inside the AAP spec (see item 15), letting any trust scoring layer plug in by implementing the interface.
This upgrade is driven by @RNWY’s observation that their existing evidence object already carries the signals Resolvers need (shared funder, wallet age at review, cluster size, funder-to-owner match), making an in-spec abstract interface more actionable than waiting on an external ERC.
14. Stake-weighted Evaluator selection reference pattern
Describe the stake-weighted Evaluator selection pattern. @bakugo32 / Demsys contributed to this item at two levels:
Conceptual level: proposed the structure-vs-behavior dichotomy that helps the spec articulate the responsibility boundary between AAP and ERC-8183; proposed the “on-chain slash event as EvaluatorDispute auto-eligibility signal” design pattern, validated in Demsys’s deployed contracts rather than remaining a forum design draft.
Implementation level: agent-settlement-protocol serves as the live reference. Redeployed on Base Sepolia on 2026-04-13 with four features live:
- setMetadata() on EvaluatorRegistry: on-chain methodology declaration by evaluators
- Bounded re-draw in assignEvaluator(): up to 5 attempts, skipping candidates equal to provider or client, revert with EvaluatorAssignmentFailed on exhaustion. A hook observing this event can serve as a health-monitoring feedback channel for the Evaluator pool
- EvaluatorSlashed event carrying jobId and reason
- Post-assignment independence check in fund() (evaluator ≠ provider and ≠ client)
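The bounded re-draw can be pictured with a short Python model. This is not the Demsys contract code: the stake-weighted draw via random.choices, the function signature, and the injected random source are assumptions for illustration; only the 5-attempt bound, the skip rule, and the EvaluatorAssignmentFailed name come from the description above.

```python
import random
from typing import Sequence

MAX_ATTEMPTS = 5  # bound stated in the changelog


class EvaluatorAssignmentFailed(Exception):
    """Models the revert raised when every draw is rejected."""


def assign_evaluator(
    candidates: Sequence[str],
    stakes: Sequence[int],
    provider: str,
    client: str,
    rng: random.Random,
) -> str:
    for _ in range(MAX_ATTEMPTS):
        # Stake-weighted draw: a candidate's chance is proportional to its stake.
        pick = rng.choices(candidates, weights=stakes, k=1)[0]
        # Skip candidates that would break role independence and draw again.
        if pick != provider and pick != client:
            return pick
    raise EvaluatorAssignmentFailed("no independent evaluator found in 5 draws")
```

Because each attempt is an independent draw, a pool dominated by the provider or client can exhaust all five attempts, which is why repeated failures are a useful health signal for the Evaluator pool.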
15. IIndependenceSignal abstract interface
Following @RNWY’s argument (2026-04-09) that the two delivery modes address different problems and both have production usage, the interface supports both:
- assessIndependence(...): on-chain oracle mode
- verifyAttestation(...): off-chain signed-attestation mode
- Shared output shape across both
RNWY has production deployments across both modes (Base mainnet trust oracle + /api/trust-check signed endpoint + JWKS public keys), demonstrating that “same underlying data, two deliveries” is a viable implementation architecture.
Two design conventions:
- independent is undefined when confidence == 0, not fail-closed. Following @RNWY’s argument: a fail-closed default not only violates the Transparency principle but also creates adverse incentives, because all legitimate new entrants would be judged non-independent by the system while attackers can bypass the default by fabricating history. The result is that fail-closed rewards manufactured history over honest absence. In @RNWY’s words: “Confidence zero with an explicit empty evidence payload is the only output that doesn’t lie.”
- assessIndependence(A, B) returns the same result as assessIndependence(B, A). Independence is symmetric.
Handling of partial signals and empty evidence: not every trust layer will provide all signal categories simultaneously, and new trust layers on new chains will encounter completely unknown addresses. Forcing completeness would push implementations toward fabricating data to meet formal completeness requirements. Quoting @RNWY from the forum:
“Requiring all four categories before a result is valid would either freeze the ecosystem or push implementations toward fabricating data to meet completeness requirements. Fabricated completeness is catastrophically worse than honest partial coverage.”
This argument is grounded in production data: RNWY itself currently covers three of the core signal categories (funding origin overlap, temporal proximity, ownership signals), with relationship density still an open space for them. This public confirmation upgrades v2’s four-category framework from “theoretical inference” to “aligned with what production systems actually cover” — even production systems do not cover all four, making partial coverage the norm rather than the exception.
Accordingly, v2 rationale will make clear:
- The evidence payload MUST allow implementations to fill in only the portions they can provide and to mark which ones are available.
- The evidence MAY be empty bytes, subject to a strict bidirectional convention: empty evidence requires confidence == 0, and non-empty evidence requires confidence > 0.
Minimum signal categories: funding origin overlap, temporal proximity, relationship density, ownership signals. A fifth category — behavioral similarity — is currently an open discussion item rather than part of the minimum set (see open question 2 in Part 1).
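The conventions in this item are mechanical enough to check in code. The Python sketch below models them under several assumptions: the dataclass layout, the snake_case category keys, representing an unavailable category as None, and lower-casing addresses before ordering them are all choices made for this example rather than anything fixed by the spec.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

SIGNAL_CATEGORIES = (
    "funding_origin_overlap",
    "temporal_proximity",
    "relationship_density",
    "ownership_signals",
)


@dataclass(frozen=True)
class IndependenceOutput:
    independent: bool
    confidence: int
    evidence: bytes


def is_well_formed(out: IndependenceOutput) -> bool:
    # Bidirectional convention: empty evidence <=> confidence == 0.
    if out.confidence < 0:
        return False
    return (out.confidence == 0) == (len(out.evidence) == 0)


def verdict(out: IndependenceOutput) -> Optional[bool]:
    # At confidence 0 the flag is undefined, so callers get None, not False.
    return None if out.confidence == 0 else out.independent


def available_categories(signals: Dict[str, Optional[object]]) -> Tuple[str, ...]:
    # Partial coverage: report only the categories the trust layer filled in.
    return tuple(c for c in SIGNAL_CATEGORIES if signals.get(c) is not None)


def make_symmetric(
    assess: Callable[[str, str], IndependenceOutput],
) -> Callable[[str, str], IndependenceOutput]:
    # Canonical ordering guarantees assess(A, B) == assess(B, A).
    def wrapped(a: str, b: str) -> IndependenceOutput:
        lo, hi = sorted((a.lower(), b.lower()))
        return assess(lo, hi)

    return wrapped
```

Returning None from verdict keeps "no information" distinct from "not independent", which is the distinction the fail-closed argument turns on.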
Relationship to IRiskHook: IIndependenceSignal is a companion abstraction to IRiskHook, not a replacement. A Resolver handling an EvaluatorDispute or RoleCollusion Claim can call both and combine the outputs. The two interfaces may be implemented by the same contract or by different contracts. The final composition shape is deferred to v2 PR initiation (see open question 4 in Part 1).
16. New Rationale subsection: “Implementation Note: Integer Job Identifiers” (non-normative)
v1 uses bytes32 claimId for maximum flexibility, but mainstream ERC-8183 implementations use uint256 jobId natively. @cmayorga reported this identity-model friction from two independently deployed products (Lockstep, 56 tests; Catalyst, 53 tests; both on Base Sepolia). Each product declared its own local Claim struct keyed by (uint256 jobId, address claimant) rather than reusing IAAP.Claim directly.
v2 will add a Rationale subsection documenting the canonical adaptation path:
“Implementations integrating with ERC-8183-style Job registries that use uint256 as the native Job identifier MAY declare a local Claim representation keyed by (uint256 jobId, address claimant), deriving the canonical bytes32 claimId via keccak256(abi.encode(jobId, claimant)) when interoperability with canonical-shape consumers is required. This adaptation preserves the semantic guarantees of the canonical interface while reducing storage overhead and improving indexability for integer-keyed deployments.”
This change touches only Rationale. The normative Specification text is not affected.
D. Tracked but not yet decided
17. High-frequency composition metadata as first-class Claim fields (observation period)
Based on Lockstep and Catalyst, @cmayorga reports that placing upstream, reasoningCID, and slash evidence hashes inside the opaque evidence bytes of fileClaim carries measurable gas and indexability costs in production. Both products lift these into first-class bytes32 Claim fields for three reasons: opaque bytes must be ABI-decoded on every read, typed event fields are indexer-friendly, and this kind of composition metadata is reused at high frequency across Scenarios 1 / 2 / 3.
This creates a real tension with item 12’s Evidence-First Composability Principle, which would otherwise argue for keeping these in the payload.
Two possible v2 directions:
- Direction A: keep v1’s opaque bytes design. Item 16’s Rationale gives implementers a workable adaptation path.
- Direction B: add optional first-class fields to the Claim struct (bytes32 upstream, bytes32 reasoningCID, bytes32 slashEvidenceHash), zero-valued when unused.
Current threshold: we would like to see at least one more independent implementer report the same friction before moving to direction B. Until then, the default lean is direction A.
If you are implementing ERC-8210 and have feedback on this tradeoff, please reply under open question 3 in Part 1.
That’s the complete v2 changelog. Given the volume of material consolidated here, there is a reasonable chance I have missed a contribution or misattributed an idea to the wrong person. If you see something that should be credited differently, a detail that is inaccurate, or a piece of context that belongs in here but isn’t, please say so — correcting the record before the v2 PR opens is exactly what this review window is for. Open questions for the direction itself are in Part 1.