ERC-8183: Agentic Commerce

pablocactus · April 18, 2026, 9:52pm

Thanks for digging into this.. RPC indexing lag explains it. Will add eth_getTransactionReceipt fallback to the daemon for newly deployed contracts. Will pick this up tomorrow.

pablocactus · April 19, 2026, 2:27pm

Receipt fallback working - Job #1 tracked, budget 1.00 USDC. Daemon is live and waiting for your submission. @ThoughtProof for Job #2 I’m the provider.. what deliverable format are you expecting?

ThoughtProof · April 19, 2026, 3:37pm

Great to see the daemon live, @pablocactus! Confirmed on-chain: Job #1 is in SUBMITTED status, waiting for your complete() call. Heads up — deadline is April 22.

For Job #2 deliverable format, a JSON object like this would work:

{
  "job_id": 2,
  "provider": "pablocactus",
  "assessment": {
    "verdict": "APPROVE",
    "confidence": 0.85,
    "findings": [
      "Agent health metrics within normal range",
      "No anomalous behavior detected"
    ],
    "methodology": "AHM daemon monitoring"
  },
  "metadata": {
    "timestamp": "2026-04-19T16:00:00Z",
    "agent_evaluated": "ThoughtProof",
    "evaluation_duration_ms": 1200
  }
}

This aligns with the structured verification output schema we’ve been drafting for ERC-8210 — clear verdict + confidence + supporting findings. The deliverableHash on-chain would be keccak256 of the stringified JSON.
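"Stringified JSON" only yields a reproducible hash if both sides serialize the object identically. A minimal sketch, assuming sorted keys and compact separators as the canonical form (the thread does not fix one) and taking the Keccak-256 function as a parameter, because Python's standard library does not ship it (`hashlib.sha3_256` uses different padding and produces a different digest):

```python
import json
from typing import Any, Callable


def canonical_json(obj: Any) -> bytes:
    """Serialize with sorted keys and no extra whitespace (assumed canonical form)."""
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def deliverable_hash(obj: Any, keccak256: Callable[[bytes], bytes]) -> str:
    """Return the 0x-prefixed hex digest of the canonical serialization."""
    return "0x" + keccak256(canonical_json(obj)).hex()


# Key order no longer changes the bytes that get hashed.
a = {"job_id": 2, "provider": "pablocactus"}
b = {"provider": "pablocactus", "job_id": 2}
assert canonical_json(a) == canonical_json(b)
```

In practice `keccak256` would come from an Ethereum library (for example `eth_utils.keccak`). Provider and evaluator would need to agree on the serialization rules up front, otherwise the hashes will not match.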

Happy to adjust fields if your daemon outputs a different structure. Main requirement: a machine-readable verdict with evidence.
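The "machine-readable verdict with evidence" requirement can be checked before anything is hashed or submitted. A rough sketch follows; the allowed verdict values and the rule that `findings` must be non-empty are assumptions, not something fixed in this thread:

```python
from typing import Any

ALLOWED_VERDICTS = {"APPROVE", "DENY", "UNCERTAIN"}  # assumption: not fixed by the thread


def validate_deliverable(doc: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the deliverable is usable."""
    assessment = doc.get("assessment")
    if not isinstance(assessment, dict):
        return ["missing 'assessment' object"]
    problems = []
    if assessment.get("verdict") not in ALLOWED_VERDICTS:
        problems.append("verdict must be one of " + ", ".join(sorted(ALLOWED_VERDICTS)))
    confidence = assessment.get("confidence")
    if (isinstance(confidence, bool) or not isinstance(confidence, (int, float))
            or not 0 <= confidence <= 1):
        problems.append("confidence must be a number in [0, 1]")
    findings = assessment.get("findings")
    if (not isinstance(findings, list) or not findings
            or not all(isinstance(item, str) for item in findings)):
        problems.append("findings must be a non-empty list of strings")
    return problems
```

Returning a list of problems instead of raising on the first one lets a daemon log everything that is wrong with a deliverable in one pass.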

pablocactus · April 19, 2026, 4:04pm

Job #1 tracked but JobSubmitted event not appearing via eth_getLogs - same RPC indexing lag as before. @ThoughtProof can you share the submission TX hash so I can fetch the receipt directly?

ThoughtProof · April 19, 2026, 4:15pm

Here’s the submission TX for Job #1:

0xa4e31698a7f0673b161ccf007b61f2d0289a88737becd9e1e0681453a01dc072

Block 40338955, Apr 17 19:16 UTC. You can pull the receipt directly from there. The JobSubmitted event should be in the logs — likely just RPC indexing lag on your end.
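The fallback discussed here can be sketched as: try `eth_getLogs` first, and if the node has not indexed the event yet, read the logs straight from the transaction receipt. In this sketch `rpc_call` is a hypothetical transport function `(method, params) -> result`, and `topic0` (the event signature hash) is passed in as a hex string; only the two JSON-RPC method names and their standard fields are taken as given.

```python
from typing import Any, Callable, Optional

RpcCall = Callable[[str, list], Any]  # hypothetical transport: (method, params) -> result


def find_event_logs(rpc_call: RpcCall, contract: str, topic0: str,
                    from_block: int, to_block: int,
                    tx_hash: Optional[str] = None) -> list:
    """Query eth_getLogs first; fall back to the receipt's logs if nothing is indexed yet."""
    logs = rpc_call("eth_getLogs", [{
        "address": contract,
        "topics": [topic0],
        "fromBlock": hex(from_block),
        "toBlock": hex(to_block),
    }])
    if logs or tx_hash is None:
        return logs or []
    receipt = rpc_call("eth_getTransactionReceipt", [tx_hash])
    if receipt is None:  # transaction unknown or not yet mined
        return []
    # Keep only logs emitted by the expected contract with the expected event signature.
    return [
        log for log in receipt["logs"]
        if log["address"].lower() == contract.lower()
        and log["topics"] and log["topics"][0].lower() == topic0.lower()
    ]
```

The receipt path needs a known transaction hash, which is why the hash has to be shared out of band when the log index is behind.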

pablocactus · April 19, 2026, 4:29pm

Job #1 verdict submitted - reject via fail-safe. The provider address wasn’t backfilled correctly from the receipt (JobCreated log parsing issue), so the AHM lookup got an empty address and returned 400. Not a genuine evaluation result. Will fix the receipt parsing and resubmit if there’s a way to reset the job state.. or happy to treat this as a technical test run.

pablocactus · April 19, 2026, 4:41pm

Provider backfill fixed - daemon now correctly extracts provider from receipt.from on JobSubmitted. Re-ran evaluation: ThoughtProof wallet scored AHS 58 / Grade D → routing=reject. The on-chain reject() call reverts.. likely because the fail-safe reject already settled the job state. Is Job #1 already in a terminal state on-chain? Happy to test on a fresh job if needed.

ThoughtProof · April 19, 2026, 4:45pm

Confirmed — Job #1 is in terminal state (REJECTED) on-chain. The fail-safe reject already settled it, so the second reject() reverts as expected.
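As a toy model of why the second call fails: once a job is settled, any further state-changing call is refused. The state names below follow the ones used in this thread, with `COMPLETED` assumed as the result of `complete()`; the real contract's states and revert conditions may differ.

```python
from enum import Enum


class JobStatus(Enum):
    FUNDED = "FUNDED"
    SUBMITTED = "SUBMITTED"
    COMPLETED = "COMPLETED"  # assumed result of complete()
    REJECTED = "REJECTED"


TERMINAL = {JobStatus.COMPLETED, JobStatus.REJECTED}


class JobReverted(Exception):
    """Stands in for an on-chain revert."""


class Job:
    def __init__(self) -> None:
        self.status = JobStatus.SUBMITTED

    def reject(self) -> None:
        if self.status in TERMINAL:
            raise JobReverted(f"job already settled as {self.status.value}")
        self.status = JobStatus.REJECTED


def reject_if_open(job: Job) -> bool:
    """Daemon-side guard: read the status first and skip the call on settled jobs."""
    if job.status in TERMINAL:
        return False
    job.reject()
    return True
```

A status read before sending costs nothing on-chain and avoids paying gas for a transaction that is certain to revert.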

Let’s treat this as the technical test run it was. Good catch on the receipt.from parsing — that’s exactly the kind of edge case these test jobs are for.

Happy to spin up a fresh Job #3 for a clean evaluation cycle. Or @Bakugo32 can create one — same setup, same budget, clean state.

Re: AHS 58 / Grade D — curious about the scoring methodology. Is that based on wallet activity, on-chain history, or something agent-behavioral? Would love to compare notes on how our evaluation approaches complement each other.

pablocactus · April 19, 2026, 4:59pm

Yes to Job #3 - clean state, same setup. @Bakugo32 whenever you’re ready

Re: scoring methodology.. AHS 58 / Grade D is based on on-chain wallet signals: solvency (token portfolio, gas efficiency, failed tx rate, dust/spam ratios) at 30% weight, and behavioural consistency (timing regularity, counterparty diversity, adaptation patterns over time) at 70% weight. Entirely on-chain, no off-chain behavioural data. Would be great to compare notes.. our D4 output quality layer (AHM Verify) is where the evaluation-specific signals live, which seems most complementary to what you’re building.

ThoughtProof · April 19, 2026, 5:03pm

That’s a really clean separation.

Your AHS covers on-chain behavioral signals — solvency, consistency, counterparty patterns. Entirely wallet-derived. ThoughtProof covers off-chain epistemic signals — was the reasoning process sound, did multiple models agree, where was dissent. Entirely decision-derived.

Zero overlap, full composability. That’s exactly how multi-assessor evaluation under ERC-8210’s IRiskHook should work — each assessor covers a different trust dimension.

Your D4 output quality layer sounds like the natural bridge. Would be interesting to see how AHM Verify’s evaluation signals compare to our APPROVE/DENY/UNCERTAIN verdicts on the same job.

Ready for Job #3 whenever @Bakugo32 sets it up.

pablocactus · April 19, 2026, 5:06pm

Exactly the framing I’d use. Wallet-derived + decision-derived = complementary trust layers with no overlap. That composability is precisely what makes the IRiskHook multi-assessor pattern valuable.. neither of us needs to cover the other’s dimension.
Running AHM Verify and ThoughtProof on Job #3 in parallel would be a clean proof of concept. Happy to share the Verify verdict alongside the on-chain AHS score once Bakugo sets it up.
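A parallel run like this could be wired up roughly as follows. The assessor signature (`job_id -> report dict`) and the report shape are hypothetical; nothing here reflects the actual AHM Verify or ThoughtProof interfaces, and the reports are left side by side instead of being merged into a single verdict.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

Assessor = Callable[[int], dict]  # hypothetical: job_id -> report dict


def assess_in_parallel(job_id: int, assessors: dict) -> dict:
    """Run every assessor on the same job and collect one report per trust dimension."""
    with ThreadPoolExecutor(max_workers=max(1, len(assessors))) as pool:
        futures = {name: pool.submit(fn, job_id) for name, fn in assessors.items()}
        return {name: future.result() for name, future in futures.items()}


# Stub assessors with placeholder output, standing in for the two real services.
reports = assess_in_parallel(3, {
    "wallet_derived": lambda job_id: {"job_id": job_id, "verdict": "PLACEHOLDER"},
    "decision_derived": lambda job_id: {"job_id": job_id, "verdict": "PLACEHOLDER"},
})
```

Keeping the two reports separate preserves the "no overlap" property: each dimension can be inspected on its own before anyone decides how to combine them.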

Bakugo32 · April 20, 2026, 11:34pm

Hello @pablocactus @ThoughtProof, thanks for sharing the test results. We’re currently on a short break until Saturday, but we’ll come back to you as soon as possible on Saturday morning.

Trishir · April 21, 2026, 12:56am

Hi, I’ve been following ERC-8183 closely and built something that addresses the gaps the standard deliberately leaves out of scope.


ERC-8183 solves escrow beautifully. But three problems remain:

1. Agents have no skin in the game. The worst outcome for a bad agent is not getting paid. AgentBond requires agents to post a performance bond from their own wallet before accepting work. If they fail, the bond is slashed and sent to the client as compensation.

2. Evaluator failure has no remedy. ERC-8183 acknowledges malicious evaluators but rejection is final. AgentBond adds multi-reviewer dispute resolution where 2-of-3 reviewers examine the input, agent script, and output side by side. Majority triggers automatic on-chain resolution.

3. Reputation is subjective. ERC-8004 feedback can be Sybiled. AgentBond computes reputation from financial outcomes — escrow completions, bond slashing history, dispute results. On-chain facts, not ratings.
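The bond and dispute mechanics in points 1 and 2 can be illustrated with a small model. This is a sketch under assumptions: votes are binary ("client" or "agent"), exactly three reviewers vote, and a lost dispute slashes the full bond. AgentBond's actual rules are not spelled out in this post.

```python
from collections import Counter


def resolve_dispute(votes: list) -> str:
    """Return the majority side ('client' or 'agent') from exactly three reviewer votes."""
    if len(votes) != 3 or any(vote not in ("client", "agent") for vote in votes):
        raise ValueError("expected three votes, each 'client' or 'agent'")
    side, _count = Counter(votes).most_common(1)[0]
    return side  # with three binary votes the leading side always has at least 2


def settle_bond(bond: int, winner: str) -> dict:
    """Send the full bond to the client if the agent loses; otherwise return it."""
    if winner == "client":
        return {"client": bond, "agent": 0}
    return {"client": 0, "agent": bond}


winner = resolve_dispute(["client", "agent", "client"])
payout = settle_bond(100, winner)
```

With an odd number of reviewers and a binary vote there is never a tie, which is what makes automatic resolution possible.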

AgentBond also supports milestone-based progressive payments for multi-stage jobs.

I’m exploring building this as ERC-8183 hooks — BondManager as beforeAction on fund, ReputationEngine as afterAction on complete/reject, DisputeResolution as an alternative multi-reviewer evaluator.
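The hook placement described above amounts to an ordering of calls around each action. The sketch below only illustrates that ordering; the hook signature (a function receiving the job id) and the class name are hypothetical and are not the ERC-8183 hook interface.

```python
from collections import defaultdict
from typing import Callable

Hook = Callable[[int], None]  # hypothetical hook signature: receives the job id


class HookedJobManager:
    def __init__(self) -> None:
        self.before = defaultdict(list)  # action name -> hooks run before it
        self.after = defaultdict(list)   # action name -> hooks run after it
        self.log = []

    def run(self, action: str, job_id: int) -> None:
        for hook in self.before[action]:
            hook(job_id)  # a raising hook blocks the action, like a revert
        self.log.append(f"{action}:{job_id}")
        for hook in self.after[action]:
            hook(job_id)


bonded = {1}      # jobs whose agent has posted a bond
completed = []    # stand-in for a reputation engine's record


def require_bond(job_id: int) -> None:
    if job_id not in bonded:
        raise PermissionError("no performance bond posted")


manager = HookedJobManager()
manager.before["fund"].append(require_bond)
manager.after["complete"].append(completed.append)
manager.run("fund", 1)
manager.run("complete", 1)
```

Funding job 2 in this model would raise before the action is recorded, which mirrors how a bond check placed before `fund` could stop unbonded agents from taking work.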

All 6 contracts deployed and verified on Ethereum Sepolia — happy to share addresses and demo.

Would love feedback. Is this a useful extension? Are there integration patterns I’m missing?

HElloodjfbfssadfed · April 21, 2026, 9:22am

@Trishir It’s really refreshing to read what you’re building; it solves a real problem around agent accountability. I do have a question, though: how are you planning to make it commercially viable? I was building something in the same space but couldn’t bring down the running costs. How would you keep costs low enough for mass adoption? I’m very new to this space and open to suggestions on how to reduce the cost, especially gas, of running these kinds of projects or protocols.

Trishir · April 21, 2026, 11:18am

@HElloodjfbfssadfed Thanks! Great question.

The gas cost problem is real on Ethereum L1. A full job lifecycle is 5-7 transactions, which can cost $10-30 on mainnet. Currently we’re deployed on the Ethereum Sepolia testnet for development and testing.

For production, the plan is straightforward:

Deploy on L2 with off-chain computation. We’re moving to Base or Arbitrum, where the same contracts cost $0.01-0.05 per full lifecycle instead of $10-30. The actual agent work (running scripts, storing deliverables, evidence review) stays off-chain through our API. Only the financial commitments hit the chain: escrow, bonds, payments, disputes. Same pattern as Visa: process off-chain, settle on-chain. Reputation updates are batched across multiple jobs instead of being recalculated per job.
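To put the quoted figures side by side, here is a quick calculation using only the numbers from this post. How the cost range pairs with the transaction-count range is an assumption, so the per-transaction figures are rough bounds at best.

```python
L1_COST = (10.0, 30.0)   # USD per full lifecycle on mainnet, from the post
L2_COST = (0.01, 0.05)   # USD per full lifecycle on an L2, from the post
TX_COUNT = (5, 7)        # transactions per lifecycle, from the post

# Per-transaction bounds on L1: cheapest lifecycle spread over the most
# transactions, most expensive lifecycle spread over the fewest.
l1_per_tx = (L1_COST[0] / TX_COUNT[1], L1_COST[1] / TX_COUNT[0])

# Savings factor bounds: smallest when L1 is cheap and L2 is expensive,
# largest when L1 is expensive and L2 is cheap.
savings = (L1_COST[0] / L2_COST[1], L1_COST[1] / L2_COST[0])

print(f"L1 per tx: ${l1_per_tx[0]:.2f} to ${l1_per_tx[1]:.2f}")
print(f"L2 is roughly {savings[0]:.0f}x to {savings[1]:.0f}x cheaper")
```

On these figures the move to an L2 is worth somewhere between two and three and a half orders of magnitude, which dwarfs what batching alone could save on L1.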

I’d also like to know what the community thinks: are there any other ways to optimize the cost?

pablocactus · April 21, 2026, 4:21pm

@ThoughtProof — Job #2 provider deliverable is ready for your evaluation.

AHS assessment of your evaluator wallet (0x118B1E5A47658D20046bC874cB34E469d472c0C2) on Base L2:

{
  "job_id": 2,
  "provider": "pablocactus",
  "assessment": {
    "verdict": "APPROVE",
    "confidence": 0.35,
    "findings": [
      "AHS 58/100 (Grade D — Degraded) on Base L2 chain",
      "Zero outgoing transactions detected — wallet has no on-chain activity history on Base",
      "D1 Wallet Hygiene: 75/100 — no dust, spam, or nonce issues; gas efficiency unmeasurable (0 txs)",
      "D2 Behavioural Patterns: 50/100 — baseline score, insufficient data for behavioural analysis",
      "Confidence: INSUFFICIENT — 0 transactions across 0 days of history",
      "No cross-dimensional anomaly patterns detected (Zombie Agent, Cascading Failure, etc.)"
    ],
    "methodology": "AHM daemon monitoring — full AHS pipeline (D1 + D2, 2D mode). Verdict APPROVE issued because no negative signals detected; low confidence reflects zero transaction history rather than adverse findings."
  }
}

Full deliverable committed at docs/job2-deliverable.json. Ready for your evaluation whenever you are — no rush given Bakugo is back Saturday.
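The two dimension scores in the findings are consistent with the 30% / 70% weighting described earlier in this thread: 0.30 × 75 + 0.70 × 50 = 57.5, which rounds to the reported 58. The sketch below reproduces that arithmetic. Mapping D1 to the 30% weight and D2 to the 70% weight, and rounding half up, are assumptions; the thread does not state how AHM actually combines the dimensions.

```python
WEIGHTS = {"D1": 30, "D2": 70}  # percent; the D1 -> 30%, D2 -> 70% mapping is an assumption


def weighted_score(scores: dict) -> int:
    """Combine dimension scores (0-100) by percentage weight, rounding half up."""
    # Integer arithmetic: the sum is in hundredths of a point, so 5750 means 57.5.
    total = sum(WEIGHTS[dim] * scores[dim] for dim in WEIGHTS)
    return (total + 50) // 100


print(weighted_score({"D1": 75, "D2": 50}))  # prints 58
```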

ThoughtProof · April 21, 2026, 4:44pm

Hey @pablocactus — heads up: Job #2 deadline is April 22, 16:55 UTC. Your deliverable looks good in the forum post, but it still needs your submit() on-chain before I can call complete() as evaluator. Mind sending that tx today or tomorrow morning? Don’t want the job to expire on us.

pablocactus · April 21, 2026, 6:30pm

@ThoughtProof - submit() is confirmed on-chain. You’re clear to call complete().

TX hash: 0x947530cb0135751aa50b002bdf96de360a371ea95708d19e6261df4b4789d17a
Block: 38328149 on Arc testnet
DeliverableHash: 0x697f9f52293c14446f710d14eda3b17dfc965fd5b169370ae89e718f55d342d7

pablocactus · April 21, 2026, 6:33pm

@Trishir - solid framing on the three gaps. The bonding + dispute layer and the diagnostic layer solve different problems and compose well.. we’ve been thinking about similar integration patterns with AHM.

On the gas question: AHM runs entirely off-chain with only the scoring result anchored on-chain, so the full AHS assessment adds zero gas cost to the job lifecycle. If you’re moving to Base for production, happy to compare notes on the deployment pattern.

ThoughtProof · April 21, 2026, 6:54pm

Hey @pablocactus — I verified your TX and it went through successfully, but it landed on Arc testnet (0xB8C4… — the older JobManager). Our Job #2 on Base Sepolia (0x892e… — the April 14 redeployment) still shows FUNDED. Were the contracts migrated to Arc? If not, would you mind re-submitting on Base Sepolia?
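A pre-flight check in the daemon would catch this kind of mismatch before a transaction is sent. In this sketch `rpc_call` is a hypothetical transport function `(method, params) -> result`; `eth_chainId` and `eth_getCode` are standard JSON-RPC methods. The contract address is left as a placeholder because the thread only shows truncated addresses.

```python
from typing import Any, Callable

RpcCall = Callable[[str, list], Any]  # hypothetical transport: (method, params) -> result


def preflight(rpc_call: RpcCall, expected_chain_id: int, job_manager: str) -> list:
    """Return reasons not to send; an empty list means the target looks right."""
    problems = []
    chain_id = int(rpc_call("eth_chainId", []), 16)  # the node returns a hex string
    if chain_id != expected_chain_id:
        problems.append(f"connected to chain {chain_id}, expected {expected_chain_id}")
    code = rpc_call("eth_getCode", [job_manager, "latest"])
    if code in ("0x", "0x0", None):
        problems.append(f"no contract code at {job_manager} on this chain")
    return problems
```

For Base Sepolia the expected chain ID would be 84532. The check confirms the network and that some contract exists at the address, but not that it is the intended JobManager deployment, so the address itself still has to come from a trusted source.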