Hi all — joining this thread with an open-source implementation and a few concrete things to contribute to the ongoing design discussions.
Repo available on GitHub: agent-settlement-protocol by Demsys
Base Sepolia + live Swagger UI: agent-settlement-protocol-production.up.railway.app/docs/
Responding to @ThoughtProof (#21): “Happy to test against a live ACP job contract on either Base Mainnet or Sepolia”
We have a live contract on Base Sepolia: AgentJobManager at 0xef8b87A6236e7DB4E0967Ed068C8893fD5a5D57f. If you want to point your evaluator at it, the TypeScript SDK (@asp-sdk/sdk on npm) handles the full lifecycle: you call submitWork(jobId, deliverable) as provider, and your evaluator contract calls complete(jobId, attestationHash). Happy to coordinate.
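To make the call order concrete, here is a small in-memory model of that lifecycle. It is an illustration only, not the @asp-sdk/sdk API or the AgentJobManager ABI: the class name, the state names, and the caller checks are assumptions; only submitWork and complete come from the description above.

```typescript
// Hypothetical in-memory model of the job lifecycle; not the real SDK or contract.
// State names and role checks are assumptions made for illustration.
type JobState = "Funded" | "Submitted" | "Completed";

interface Job {
  id: number;
  provider: string;
  evaluator: string;
  state: JobState;
  deliverable?: string;
  attestationHash?: string;
}

class JobManagerModel {
  private jobs: Job[] = [];

  // Assumption: a job starts out funded and waiting for the provider.
  createJob(provider: string, evaluator: string): number {
    const id = this.jobs.length;
    this.jobs.push({ id, provider, evaluator, state: "Funded" });
    return id;
  }

  // Only the provider may submit, and only while the job is still open.
  submitWork(caller: string, jobId: number, deliverable: string): void {
    const job = this.get(jobId);
    if (caller !== job.provider) throw new Error("not provider");
    if (job.state !== "Funded") throw new Error("bad state: " + job.state);
    job.deliverable = deliverable;
    job.state = "Submitted";
  }

  // Only the evaluator may complete, and only after work was submitted.
  complete(caller: string, jobId: number, attestationHash: string): void {
    const job = this.get(jobId);
    if (caller !== job.evaluator) throw new Error("not evaluator");
    if (job.state !== "Submitted") throw new Error("bad state: " + job.state);
    job.attestationHash = attestationHash;
    job.state = "Completed";
  }

  get(jobId: number): Job {
    const job = this.jobs[jobId];
    if (!job) throw new Error("unknown job");
    return job;
  }
}
```

The point of the model is the ordering constraint: an evaluator calling complete before the provider has submitted is rejected, which is the behavior an external evaluator should expect to test against.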
On evaluator complexity (clawplaza #13, ThoughtProof #14)
Both of you identified evaluator complexity as the real bottleneck. We approached it from a different angle: an on-chain EvaluatorRegistry with stake-weighted pseudo-random selection and slashing.
When createJob() receives address(0) as evaluator, the contract calls assignEvaluator(jobId), which selects from the eligible pool, weighted by staked protocol tokens. Design rationale:
- Stake = skin in the game, not just identity
- Slashing for provably wrong calls creates direct incentive alignment
- Warmup period (1 day minimum, up to 30 days) prevents Sybil spamming the registry
This doesn’t answer who should evaluate (your multi-model consensus approach, clawplaza’s AI coordinator, a ZK proof) — it answers how to select trustlessly from a pool of competing evaluators. Composable: ThoughtProof’s evaluator contract could itself be a staked registry participant.
Currently there are 2 evaluators on testnet, both controlled by us; let's be honest about that. The mechanism itself is live and working, not theoretical.
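For readers who want the selection rule spelled out, here is a minimal sketch of stake-weighted selection with a warmup filter. It is not the EvaluatorRegistry code: the field names, the whole-number stakes, and the `seed` parameter (standing in for whatever entropy the contract actually uses) are assumptions.

```typescript
// Sketch of stake-weighted selection; not the EvaluatorRegistry's actual code.
interface Evaluator {
  address: string;
  stake: number;        // staked protocol tokens, whole units for simplicity
  registeredAt: number; // unix seconds
}

const WARMUP_SECONDS = 24 * 60 * 60; // 1 day, the minimum named in the post

// An evaluator is eligible once it has stake and has passed the warmup period.
function eligiblePool(all: Evaluator[], now: number): Evaluator[] {
  return all.filter(e => e.stake > 0 && now - e.registeredAt >= WARMUP_SECONDS);
}

// Lay the stakes end to end and pick the evaluator whose interval contains
// `seed mod totalStake`. `seed` must be a non-negative integer; where it
// comes from on-chain is outside this sketch.
function selectEvaluator(pool: Evaluator[], seed: number): Evaluator {
  let total = 0;
  for (const e of pool) total += e.stake;
  if (total <= 0) throw new Error("no eligible evaluators");
  let ticket = seed % total;
  for (const e of pool) {
    if (ticket < e.stake) return e;
    ticket -= e.stake;
  }
  throw new Error("unreachable: ticket is always below total stake");
}
```

With stakes of 1 and 3, the second evaluator owns three of the four tickets, so over uniformly distributed seeds it is picked three times as often. The warmup filter runs before weighting, so a freshly registered address cannot be selected no matter how much it stakes.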
On the ERC-8004 reputation bridge (ThoughtProof #5, Sentinel #27)
ThoughtProof asked about the complete() → ERC-8004 update pattern. We implemented ReputationBridge.sol, which sits between the two contracts.
Key constraint we ran into: the bridge must never revert if the reputation registry fails, otherwise a broken registry blocks fund release. We solved it with a gas-capped try/catch (200k gas limit), so settlement never fails because of a reputation update failure. The bridge currently runs in no-op mode since there’s no canonical ERC-8004 registry deployed; one setReputationRegistry(addr) call activates it.
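The control flow is easier to see in a small model. The sketch below is TypeScript, not the ReputationBridge.sol source: the registry is modeled as a function that returns the gas it would consume or throws, and the gas accounting is a simplification (on-chain, the capped call is cut off when it runs out of gas rather than checked afterwards). All names except the 200k cap are assumptions.

```typescript
// Model of a gas-capped, non-reverting reputation update; illustrative only.
const REPUTATION_GAS_CAP = 200000; // the cap mentioned in the post

// A registry is modeled as a function that returns the gas it would use, or throws.
type Registry = (subject: string, delta: number) => number;

interface BridgeResult {
  updated: boolean;
  reason?: string;
}

// Never throws: every failure mode is turned into a result the caller can log.
function notifyReputation(registry: Registry | null, subject: string, delta: number): BridgeResult {
  if (registry === null) return { updated: false, reason: "no-op: registry unset" };
  try {
    const gasUsed = registry(subject, delta);
    if (gasUsed > REPUTATION_GAS_CAP) return { updated: false, reason: "gas cap exceeded" };
    return { updated: true };
  } catch (_err) {
    return { updated: false, reason: "registry reverted" };
  }
}

// Settlement pays out regardless of what the bridge reports.
function settle(registry: Registry | null, provider: string): { paid: boolean; reputation: BridgeResult } {
  const reputation = notifyReputation(registry, provider, 1);
  return { paid: true, reputation };
}
```

The design choice this captures is that the reputation update is best-effort: an unset, reverting, or gas-hungry registry only changes what gets recorded, never whether funds are released.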
Signal model we used: complete() → positive signal for both provider and evaluator. reject() → negative signal for the provider only (the evaluator did its job correctly by rejecting bad work). Open to challenge on this logic.
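Written as a mapping, the signal model is just the following. The type names and the +1/-1 encoding are assumptions for illustration; the bridge may encode signals differently.

```typescript
// Outcome-to-signal mapping as described in the post; the encoding is assumed.
type Outcome = "complete" | "reject";

interface Signal {
  subject: "provider" | "evaluator";
  value: 1 | -1;
}

function signalsFor(outcome: Outcome): Signal[] {
  if (outcome === "complete") {
    // Accepted work: both parties did their job.
    return [
      { subject: "provider", value: 1 },
      { subject: "evaluator", value: 1 },
    ];
  }
  // Rejected work: only the provider is penalized; the evaluator gets no signal.
  return [{ subject: "provider", value: -1 }];
}
```

One thing the mapping makes visible: on reject() the evaluator receives no signal at all, neither positive nor negative, which is one place the logic could be challenged.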
On agenttech’s AAP / ERC-8210 (#67)
The read-only integration design is the right call. One observation specific to stake-weighted evaluator selection: the EvaluatorDispute coverage type gains a natural on-chain resolution path. If an evaluator is slashed by the registry for a demonstrably wrong call, that slash event is verifiable proof of evaluator failure, so the Claims Resolver could check slash history directly rather than relitigating the original evaluation. Worth considering whether AAP could treat an on-chain slash as automatic claim eligibility for EvaluatorDispute.
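As a rough sketch of what that check could look like on the resolver side: the event and claim shapes below are hypothetical, and in particular it assumes slash events carry the jobId they relate to, which neither spec is confirmed to guarantee.

```typescript
// Hypothetical shapes; neither AAP nor the registry is confirmed to expose these.
interface SlashEvent {
  evaluator: string;
  jobId: number;
}

interface Claim {
  coverage: "EvaluatorDispute";
  evaluator: string;
  jobId: number;
}

// A claim is automatically eligible if the same evaluator was slashed for the same job.
function isAutoEligible(claim: Claim, slashes: SlashEvent[]): boolean {
  const target = claim.evaluator.toLowerCase(); // addresses compared case-insensitively
  return slashes.some(s => s.evaluator.toLowerCase() === target && s.jobId === claim.jobId);
}
```

Matching on both evaluator and job keeps a slash on one job from unlocking claims on unrelated jobs handled by the same evaluator.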
On wjmelements (#10): “benefit from implementing a reference implementation”
For developers wanting to build on ERC-8183 without writing Solidity: @asp-sdk/sdk (npm, TypeScript) with a Google A2A adapter, and asp-sdk (PyPI, Python) with adapters for CrewAI, LangGraph, and AutoGen. MIT license. The SDK abstracts the full job lifecycle behind a few method calls.
Testnet only, MockUSDC, no audit yet but working on it — appropriately experimental. Happy to open issues on the base-contracts repo for anything spec-level.