ERC-8004: Trustless Agents

Can you provide use cases where single non-aggregated entries are useful?

The biggest case I have in mind is automatic release of on-chain escrow, based on pre-agreed validation conditions. Alice wants to buy a job from Bob, whom she doesn’t necessarily trust to perform the job correctly, and Bob doesn’t trust that Alice will pay after he does the job. So they can agree beforehand that Alice deposits escrow, which is released under one of the following conditions:

  • a consortium of voters agrees that Bob’s result is valid
  • a mutually trusted third party re-runs the job and determines that Bob’s result is valid
  • Bob submits a cryptographic proof that he ran the job in a TEE
  • optimistic mediation, where the escrow is released by default after a certain time unless Alice disputes it, in which case one of the above mechanisms is used

Any of these conditions can be represented by a single non-aggregated validation. The release conditions could be a generic parameter to any escrow contract: a pair (address arbiter, bytes demand), where the arbiter implements some other interface with a function check(task, demand) => bool. So any of the “basic on-chain aggregation” processes you describe could also be implemented in release conditions, either individually or in combination. Escrows could then demand e.g. “a validation Response above X from any of a list of mutually trusted third parties” or “an average Response above X from a list of trusted third parties”.
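
To make the (arbiter, demand) idea concrete, here is a minimal off-chain sketch in Python rather than Solidity. It is only a model under stated assumptions: the names Escrow, any_trusted_above, and average_trusted_above are hypothetical, Responses are taken to be integers keyed by validator id, and the demand is a plain dict instead of ABI-encoded bytes.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical model: validator id -> Response score for one escrowed task.
Responses = Dict[str, int]

# An "arbiter" is any function check(responses, demand) -> bool.
Arbiter = Callable[[Responses, dict], bool]


def any_trusted_above(responses: Responses, demand: dict) -> bool:
    """True if any trusted validator posted a Response above the threshold."""
    return any(
        v in responses and responses[v] > demand["threshold"]
        for v in demand["trusted"]
    )


def average_trusted_above(responses: Responses, demand: dict) -> bool:
    """True if the average Response of the trusted validators exceeds the threshold."""
    scores = [responses[v] for v in demand["trusted"] if v in responses]
    return bool(scores) and sum(scores) / len(scores) > demand["threshold"]


@dataclass
class Escrow:
    amount: int
    arbiter: Arbiter   # which release condition to apply
    demand: dict       # parameters of that condition (assumed shape)
    released: bool = False

    def try_release(self, responses: Responses) -> bool:
        """Release once the arbiter accepts; the escrow itself stays generic."""
        if not self.released and self.arbiter(responses, self.demand):
            self.released = True
        return self.released
```

The escrow never needs to know how a condition is evaluated, so single validations and aggregations plug in the same way.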

Which data would you save on-chain? Only the Rating and Response integers or other data structures too?

Where? Registry storage?

I wrote an example of an on-chain escrow contract working in combination with Sumeet’s reference implementation, which actually does store Response integers on-chain. I see two main flaws with it:

  • The things to be validated are only identified by a dataHash, so they can’t be referred to before they concretely exist. This means you can’t distinguish between different validations from the same validator for the same server. Even if the contract stored already claimed dataHashes to prevent double claiming, servers could potentially get a validation for a different, easier task than the one escrowed for. The only workaround I could see in the current design would be for agentValidatorIds or agentServerIds to be essentially single-use, which seems to go against their intended design.
  • Escrow demands can only be parametrized by the fields in the ValidationRequest or ValidationResponse struct, of which the only useful fields for parameterizing an escrow release condition are agentValidatorId, agentServerId, and the response Status integer. This isn’t as big of an issue though if you have something like a taskId, since you could implement different validator contracts representing different conditions (including aggregations of other validations), and have custom demand parameters implemented on each validator contract, where the validator contract submits a Response to the validation registry at the end of its internal process.

Compare this with the implementation of escrows with generically parameterized demands that I linked in the previous comment, where the on-chain interface is a function

interface IArbiter {
    function checkObligation(
        Attestation memory obligation,
        bytes memory demand,
        bytes32 counteroffer
    ) external view returns (bool);
}

which is agnostic to where data is stored, and could be implemented transiently if the check is simple enough to perform in one tx (e.g., just checking that the fulfilling counterparty is a particular address).

My suggestions for the ERC would be

  • add an identifier which can be arbitrarily requested, by which agent tasks can be referenced before they concretely exist.
  • add a (perhaps optional) on-chain function getValidation(uint256 taskId) returns (int status) (or returns (Response response)), or something that associates taskId with dataHash while still getting the Response by dataHash, as in the reference implementation.
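
A rough Python model of these two suggestions, as a sketch only: the class and method names (ValidationRegistry, request_task_id, bind_result, submit_response, get_validation) are hypothetical stand-ins, and data hashes are plain strings. It shows a task id being reserved before any result exists, then tied to a dataHash, with the Response still stored by dataHash.

```python
from typing import Dict, Optional


class ValidationRegistry:
    """Hypothetical model: tasks get an id before any result data exists."""

    def __init__(self) -> None:
        self._next_task_id = 1
        self._data_hash_by_task: Dict[int, str] = {}
        self._response_by_data_hash: Dict[str, int] = {}

    def request_task_id(self) -> int:
        """Reserve an identifier that an escrow can reference up front."""
        task_id = self._next_task_id
        self._next_task_id += 1
        return task_id

    def bind_result(self, task_id: int, data_hash: str) -> None:
        """Associate the concrete result (dataHash) with the reserved task, once."""
        if not 1 <= task_id < self._next_task_id:
            raise ValueError("unknown task")
        if task_id in self._data_hash_by_task:
            raise ValueError("task already has a result")
        self._data_hash_by_task[task_id] = data_hash

    def submit_response(self, data_hash: str, response: int) -> None:
        """Validator response, still keyed by dataHash."""
        self._response_by_data_hash[data_hash] = response

    def get_validation(self, task_id: int) -> Optional[int]:
        """Look up the Response for a task, or None if not validated yet."""
        data_hash = self._data_hash_by_task.get(task_id)
        if data_hash is None:
            return None
        return self._response_by_data_hash.get(data_hash)
```

In this model a validation obtained for some other, easier dataHash never satisfies an escrow that references the task id.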

Tentatively, I’d also consider having pre-dataHash references for tasks be a generic demand (as bytes or a bytes32 hash) rather than an integer id. But I’m not sure if there are use cases where this enables something that wouldn’t be possible with just an id (by putting anything demand-related on individual validator contracts or out-of-band), and it’s probably best to keep the ERC implementation requirements as lightweight as possible.


how would the contracts validate who owns a domain?

Isn’t this addressed by the ERC requiring the data at the domain to have a reference to an on-chain identifier? agentAddress for the agent card and FeedbackAuthID for the feedback data.


Isn’t this addressed by the ERC requiring the data at the domain to have a reference to an on-chain identifier?

Not really: it works for offchain verification, but it doesn’t prevent someone claiming a domain onchain.

For instance, a malicious actor could register a (non-existent) agent at ethereum-magicians dot org. How does the contract know that the actor is not the owner of the domain? Anyone can check offchain and see that there really isn’t an agent card pointing back to the onchain identifier, but the contract has no way of knowing this (unless an onchain verification mechanism is implemented). Then, later, the owner of ethereum-magicians dot org wants to register an agent… but the domain is already taken. Does the registration revert? Or does the contract now allow multiple agents per domain? In which case, what does ResolveByDomain return?

The ERC specifies the following:

  • Verifying that the AgentDomain in the on-chain Registry actually corresponds to the one of the Agent Card is left to the user of the protocol (that’s to avoid the involvement of an oracle network)

But this doesn’t work if the contract is expected to allow a single agent per domain.
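
The scenario can be reproduced with a toy model. This is a sketch in Python under assumptions, not the reference implementation: IdentityRegistry, new, and resolve_by_domain are hypothetical stand-ins for a first-come registry that enforces one agent per domain and performs no on-chain domain check.

```python
from typing import Dict, Tuple


class DomainTakenError(Exception):
    """Stands in for a reverted registration transaction."""


class IdentityRegistry:
    """Hypothetical first-come registry: one agent per domain, no domain verification."""

    def __init__(self) -> None:
        self._agents: Dict[int, Tuple[str, str]] = {}  # agent_id -> (domain, address)
        self._id_by_domain: Dict[str, int] = {}

    def new(self, domain: str, address: str) -> int:
        # The registry cannot tell whether `address` really controls `domain`.
        if domain in self._id_by_domain:
            raise DomainTakenError(domain)
        agent_id = len(self._agents) + 1
        self._agents[agent_id] = (domain, address)
        self._id_by_domain[domain] = agent_id
        return agent_id

    def resolve_by_domain(self, domain: str) -> Tuple[int, str, str]:
        agent_id = self._id_by_domain[domain]
        stored_domain, address = self._agents[agent_id]
        return (agent_id, stored_domain, address)


reg = IdentityRegistry()
squatter_id = reg.new("example.org", "0xAttacker")   # succeeds, nothing is checked
try:
    reg.new("example.org", "0xRealOwner")            # the real owner is locked out
    owner_blocked = False
except DomainTakenError:
    owner_blocked = True
```

After this runs, resolve_by_domain("example.org") still points at the squatter's address, which is the DoS described above.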


Kudos to the authors @davidecrapis.eth and @Marco-MetaMask for the great work on this standard.

As we build out commercial applications on top of ERC-8004, we’ve been thinking about the full lifecycle of agent interactions, especially what happens when things go wrong.

A recent article from the Cooperative AI Foundation (comparing agent governance to maritime law) highlighted a key piece of infrastructure for any mature autonomous system: incident reporting. Just as ships have a standard way to report collisions or signal distress, autonomous agents need a standardized, on-chain way to report failures, disputes, or malicious behavior.

The current standard is excellent for verifying successful work, but it lacks a simple primitive for flagging failures. A potential direction could be to add a simple, lightweight function to the ValidationRegistry or a new, dedicated IncidentRegistry:

reportIncident(targetAgentID, reasonCode, dataHash)

  • targetAgentID: The agent being reported.
  • reasonCode: A standardized enum for the type of incident (e.g., 0: Non-responsive, 1: MalformedData, 2: RuleViolation, 3: MaliciousBehavior).
  • dataHash: An optional storage ID pointing to off-chain evidence supporting the incident report.

Of course, the immediate question is: how do you prevent this from being misused for spam or griefing attacks?

The solution is to use a simple crypto-economic mechanism to disincentivize false reporting. The workflow could be:

  1. Report with a Bond: To call reportIncident, the reporting agent must post a small bond (e.g., in USDC). This immediately prices out frivolous spam.
  2. Challenge Period: The report enters a “pending” state for a short time. The accused agent can challenge it by posting their own counter-bond.
  3. Resolution:
    • If Unchallenged: The report becomes “active,” the reporter’s bond is returned, and the incident becomes a public signal.
    • If Challenged: The dispute is escalated to a pluggable, off-chain arbitration service. The winner takes both bonds.

This model creates a strong incentive for honest reporting. You can pull the fire alarm, but there’s a cost to being wrong.

This stays within the minimal ethos of the ERC: the on-chain part is just a simple state machine (pending/active) with a bond, while the complex logic of arbitration is left to external systems. It adds a crucial safety layer for the whole agent economy.
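
A minimal sketch of that state machine, modelled in Python instead of Solidity. Several things here are assumptions rather than part of the proposal: the extra DISPUTED and REJECTED states, the rule that the counter-bond must equal the reporter's bond, integer timestamps, and all class and method names.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class Status(Enum):
    PENDING = "pending"
    ACTIVE = "active"
    DISPUTED = "disputed"  # assumption: state while arbitration is running
    REJECTED = "rejected"  # assumption: state when the reporter loses


@dataclass
class Incident:
    reporter: str
    target: str
    reason_code: int
    bond: int
    deadline: int
    status: Status = Status.PENDING


class IncidentRegistry:
    def __init__(self, min_bond: int, challenge_period: int) -> None:
        self.min_bond = min_bond
        self.challenge_period = challenge_period
        self.incidents: List[Incident] = []
        self.payouts: Dict[str, int] = {}  # address -> bond amount owed

    def _pay(self, who: str, amount: int) -> None:
        self.payouts[who] = self.payouts.get(who, 0) + amount

    def report(self, reporter: str, target: str, reason_code: int, bond: int, now: int) -> int:
        """Step 1: report with a bond. Returns the incident id."""
        if bond < self.min_bond:
            raise ValueError("bond too small")
        deadline = now + self.challenge_period
        self.incidents.append(Incident(reporter, target, reason_code, bond, deadline))
        return len(self.incidents) - 1

    def challenge(self, incident_id: int, challenger: str, bond: int, now: int) -> None:
        """Step 2: the accused posts a counter-bond during the challenge period."""
        inc = self.incidents[incident_id]
        if inc.status is not Status.PENDING or now > inc.deadline:
            raise ValueError("not challengeable")
        if challenger != inc.target or bond != inc.bond:
            raise ValueError("only the accused can challenge, with a matching bond")
        inc.status = Status.DISPUTED

    def finalize(self, incident_id: int, now: int) -> None:
        """Step 3a: unchallenged reports become active and the bond is returned."""
        inc = self.incidents[incident_id]
        if inc.status is not Status.PENDING or now <= inc.deadline:
            raise ValueError("not finalizable")
        inc.status = Status.ACTIVE
        self._pay(inc.reporter, inc.bond)

    def resolve(self, incident_id: int, reporter_wins: bool) -> None:
        """Step 3b: an external arbitration result; the winner takes both bonds."""
        inc = self.incidents[incident_id]
        if inc.status is not Status.DISPUTED:
            raise ValueError("not disputed")
        inc.status = Status.ACTIVE if reporter_wins else Status.REJECTED
        self._pay(inc.reporter if reporter_wins else inc.target, 2 * inc.bond)
```

The arbitration logic itself stays outside the registry; only its boolean outcome is fed back in through resolve.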

Just a thought on a small addition that could have a significant impact on the overall safety and reliability of the agent economy.

PS: we’re leading the reference implementation that will bring ERC-8004 to life and have released the first demo.

It’s great to see a reference implementation!

Two things I noticed:

  • The implementation is vulnerable to the DoS I described above (anyone can register any domain, and then when the real domain owner wants to register an agent, the tx will revert)
  • I see you included a 0.005 ETH registration fee requirement. I would advise against it: 0.005 ETH can be very little or a lot of money depending on the application and the state of the market. imo this ERC can work without any registration fees (external burn or staking mechanisms could be implemented and users could decide what’s enough to trust an agent), but if there is a strong desire to include a fee, imo this should be dynamic rather than hardcoded. Otherwise people that don’t want to pay the fee would just spin up their own registries, and the idea of a singleton identity registry will quickly break.

Yes, it would be great to work with the community on a robust system that tackles the DoS. Please ignore the 0.005 ETH fee: it was based on v0.3, and the upcoming version will not have it.


Do you mean the registry controller would gate which reputations are valid?


They are complementary. ERC-8004 provides agents with a shared, on-chain trust and discovery fabric, while ERC-8001 offers them a portable, cryptographically secure envelope for the actions they exchange. In practice, you can use 8004 to find and assess agents, then use 8001 to authorise and carry out the job between them.


One trade-off we have in mind is on-chain footprint/cost vs the benefit of having trustless composability. This seems to be on the right side of that trade-off and I think we should consider adding it in the next version. For additional data, I’m leaning more towards leaving it out or making it optional.


Given that ERC8004 is a standard, @davidecrapis.eth @Marco-MetaMask

Why haven’t you directly standardized the function names (a bit like in ERC20 where we know in advance that there will be transfer, transferFrom, decimals…)?

Here, the way it’s written gives the impression that everyone can create their own interfaces/implementations, naming functions however they want as long as the logic is the same.

Won’t this risk interoperability problems later? (With ERC20, all exchanges know which functions to call to transfer tokens, get decimals, and so on.)

If everyone names their functions however they want, it will be hard to build, for example, a system or marketplace that is directly compatible with all ERC8004 registry implementations.

With standardized names, a marketplace that wants to integrate two or ten ERC8004 implementations has an easy job; otherwise it would have to write code to call the functions of each implementation separately.
Is there a particular reason for your choice not to standardise the function names?

New(AgentDomain, AgentAddress) → AgentID
Update(AgentID, Optional NewAgentDomain, Optional NewAgentAddress) → Boolean
Get(AgentID) → AgentID, AgentDomain, AgentAddress
ResolveByDomain(AgentDomain) → AgentID, AgentDomain, AgentAddress
ResolveByAddress(AgentAddress) → AgentID, AgentDomain, AgentAddress

Why don’t we have directly standardized functions, so that we all know which functions a Registry needs to implement to stay interoperable?

something like

new(AgentDomain, AgentAddress) → AgentID
update(AgentID, Optional NewAgentDomain, Optional NewAgentAddress) → Boolean
get(AgentID) → AgentID, AgentDomain, AgentAddress
resolveByDomain(AgentDomain) → AgentID, AgentDomain, AgentAddress
resolveByAddress(AgentAddress) → AgentID, AgentDomain, AgentAddress

or

newAgent(AgentDomain, AgentAddress) → AgentID
updateAgent(AgentID, Optional NewAgentDomain, Optional NewAgentAddress) → Boolean
getAgent(AgentID) → AgentID, AgentDomain, AgentAddress
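
To illustrate the interoperability argument, here is a small Python sketch, not a proposal for the actual names. IIdentityRegistry, AgentInfo, InMemoryRegistry, and find_everywhere are hypothetical, and the method names are snake_case renderings of the ones listed above. The point is that one fixed interface lets integrator code work against any number of registries unchanged.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, NamedTuple


class AgentInfo(NamedTuple):
    agent_id: int
    domain: str
    address: str


class IIdentityRegistry(ABC):
    """Hypothetical fixed interface that every registry would expose."""

    @abstractmethod
    def new_agent(self, domain: str, address: str) -> int: ...

    @abstractmethod
    def get_agent(self, agent_id: int) -> AgentInfo: ...

    @abstractmethod
    def resolve_by_domain(self, domain: str) -> AgentInfo: ...


class InMemoryRegistry(IIdentityRegistry):
    """One possible implementation; others can differ internally."""

    def __init__(self) -> None:
        self._agents: Dict[int, AgentInfo] = {}

    def new_agent(self, domain: str, address: str) -> int:
        agent_id = len(self._agents) + 1
        self._agents[agent_id] = AgentInfo(agent_id, domain, address)
        return agent_id

    def get_agent(self, agent_id: int) -> AgentInfo:
        return self._agents[agent_id]

    def resolve_by_domain(self, domain: str) -> AgentInfo:
        for info in self._agents.values():
            if info.domain == domain:
                return info
        raise KeyError(domain)


def find_everywhere(registries: List[IIdentityRegistry], domain: str) -> List[AgentInfo]:
    """Marketplace-side code: no per-registry adapter is needed."""
    hits = []
    for registry in registries:
        try:
            hits.append(registry.resolve_by_domain(domain))
        except KeyError:
            pass  # this registry does not know the domain
    return hits
```

Without agreed names, find_everywhere would need a separate branch or adapter for every registry implementation.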

@davidecrapis.eth @Marco-MetaMask

ERC-8004 is a good step toward standardizing agent interactions, but right now it lacks a cryptoeconomic backbone.
Without real economic weight behind agent claims, there’s no deterrent against spammy or low-quality agents.

One way to address this is to introduce bonds — lightweight staking commitments where one agent locks a minimum amount of tokens on another agent to signal trust.
These can be uni-directional (A stakes on B) or bi-directional, but bi-directional reduces trustlessness since it requires coordination. Importantly, the minimum stake could be dynamic, depending on the reputation of the counterparty agent.
This way, interactions between agents carry an economic guarantee, making the standard more resistant to Sybil and spam while incentivizing quality participation.


I agree with both you and @spengrah on the individual “Rating” and “Response” storage and aggregation. Having access to the data on-chain for contracts makes it easier to interact with, albeit at the cost of added storage.

Keeping this limited is ideal, and a middle ground I see could be to provide a feedbackURI or similar, which could contain arbitrary metadata following a JSON schema and be stored at an arbitrary location. This is similar to the FeedbackDataURI structure referenced in the draft, but it would make the data accessible without having to go through the A2A agent card. Another benefit is that this pattern is already familiar from, e.g., ERC721, and it won’t add as much of a storage footprint.

On the [Minor] point on Gas efficiency, the JSON blob under feedbackURI could then contain a reference signature by the Agent Client (e.g., EIP712) of the feedback itself to give some form of validity without having to submit it on-chain. However, this also leads to more overhead, so would probably still have to be optional.
Here’s an example using the feedback struct from A2A.

{
  "feedback": {
    "agentSkillId": "<uuid>",
    "taskId": "<uuid>",
    "contextId": "<uuid>",
    "rating": 42,
    "proofOfPayment": "0x...",
    "data": "0x..."
  },
  "attestation": {
    "agentClientAddress": "<address>",
    "signature": "0x..."
  }
}

The mention of auxiliary contracts and adhering to EAS is also a great idea and leaves us with more flexibility to extend these contracts further.

Can you provide use cases where single non-aggregated entries are useful? Or with basic on-chain aggregation? For example, “filtering” is easily implementable on-chain—like skipping all validations or feedback not emitted by whitelisted addresses. So you could calculate an average of filtered ratings, etc.
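
The filtering described in this question is small enough to sketch directly. This is a Python model, not contract code: filtered_average is a hypothetical name, and feedback entries are assumed to be (author address, rating) pairs.

```python
from typing import Iterable, Optional, Set, Tuple


def filtered_average(
    entries: Iterable[Tuple[str, int]], whitelist: Set[str]
) -> Optional[float]:
    """Average the ratings whose author is whitelisted; None if there are none."""
    ratings = [rating for author, rating in entries if author in whitelist]
    if not ratings:
        return None
    return sum(ratings) / len(ratings)


entries = [("0xA", 80), ("0xSpam", 0), ("0xB", 100)]
average = filtered_average(entries, {"0xA", "0xB"})  # the 0xSpam entry is skipped
```

On-chain, the same loop only works if the individual ratings are readable by contracts, which is the storage trade-off under discussion.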

I’ll leave the other points aside, as they are partly covered above, spengrah has already answered them well, and the conversation has moved on by the time of writing.

What I can think of off the top of my head would be transactional logic, for example, continuing a complex workflow where tasks are executed if the prerequisite tasks are concluded with a certain degree of certainty (e.g., feedback or validation). This can also be achieved with the event-based architecture, but it would mean that the enforcement of this execution cannot be guaranteed on-chain.


I was wondering whether this part of the registry spec, “All write operations require the transaction sender to be AgentAddress.”, isn’t too restrictive. It could pose several problems, since it forces the check msg.sender == AgentAddress:

  • NFT: Owner cannot manage their own NFT-agent
  • Multisig: Signers cannot act, only the contract can self-modify
  • Proxy: Admin cannot manage their proxy-agent
  • Key loss: Agent locked forever, no recovery possible
  • Delegation: Cannot have admins/operators
  • Business: Third-party services cannot act for their clients (e.g., a marketplace)

Shouldn’t we rather introduce a concept like hasAuthorize, so that we would have “All write operations require the transaction sender to pass hasAuthorize.” instead of “sender to be AgentAddress.”?
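
One way such a check could look, modelled in Python as a sketch only. The owner/operator split, and the names IdentityRegistry, has_authorization, set_operator, and update_address, are assumptions for illustration; the ERC does not define them.

```python
from typing import Dict, Set


class NotAuthorized(Exception):
    """Stands in for a reverted transaction."""


class IdentityRegistry:
    """Hypothetical registry where writes require authorization, not sender == AgentAddress."""

    def __init__(self) -> None:
        self.owner_of: Dict[int, str] = {}
        self.agent_address: Dict[int, str] = {}
        self.operators: Dict[int, Set[str]] = {}

    def register(self, sender: str, agent_address: str) -> int:
        agent_id = len(self.owner_of) + 1
        self.owner_of[agent_id] = sender  # e.g. a multisig or an NFT owner
        self.agent_address[agent_id] = agent_address
        self.operators[agent_id] = set()
        return agent_id

    def has_authorization(self, agent_id: int, sender: str) -> bool:
        """The owner or any approved operator may write."""
        return sender == self.owner_of[agent_id] or sender in self.operators[agent_id]

    def set_operator(self, sender: str, agent_id: int, operator: str, approved: bool) -> None:
        if sender != self.owner_of[agent_id]:
            raise NotAuthorized(sender)
        if approved:
            self.operators[agent_id].add(operator)
        else:
            self.operators[agent_id].discard(operator)

    def update_address(self, sender: str, agent_id: int, new_address: str) -> None:
        if not self.has_authorization(agent_id, sender):
            raise NotAuthorized(sender)
        self.agent_address[agent_id] = new_address
```

With this split, a lost agent key can be rotated by the owner or an operator, and a marketplace can act for its clients once approved.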


imo this would make the standard too restrictive/opinionated. There may be many approaches to cryptoeconomic security, disputes and slashing, reputation mechanisms, etc. External staking mechanisms could be layered on top of this standard and agents / agent devs can choose what mechanisms make sense for their use-case.


@spengrah I agree with almost everything, and I support adding limited on-chain data, if it can unlock onchain aggregation use cases. A few points to consider:

  • Endpoints for querying: Don’t you think it would be better to have an endpoint like “give me all feedback from AgentServer” rather than endpoints that only return a single feedback item? Also, regarding how you define the function getAuthFeedback: multiple feedback entries for the same (agentClientID, agentServerID) pair are possible.
  • On-chain strings: The amount of string data (tasks, skills) you propose to store on-chain would be incredibly impactful in terms of gas costs and smart contract size. I don’t think it’s an option.
  • EAS: I like the idea, but we should consider the tradeoffs, especially with regard to development complexity and gas costs. Let’s evaluate that!

@pcarranzav Using the URI instead of the domain has two issues:

  • Minor: it makes the string longer, which increases gas usage.
  • Major: it complicates ENS integration. With the domain approach, it’s simple: 1 ENS name = 1 agent = 1 address, which would automatically give us domain verification “for free”. Otherwise the question would become: can Domain1. com/AgentA and Domain1. com/AgentB be connected to different addresses? If yes, we can’t leverage ENS resolution, etc

Currently, the domain is not verified. It’s left to a third-party observer to check that the agent card for the mentioned domain actually includes the registration on that chain with the correct agent ID and address. Any lightweight way to mitigate this?


I agree that this is a very important component, but, as @pcarranzav noted, it should be a modular component that can be composed with this standard. Part of my desire for more onchain access to the data relevant to this standard is to enable just this kind of composability.


This is a great way of thinking about it. The more of the A2A agent card we can bring onchain (balanced against gas costs, etc), the better for enabling composable trust-minimized interactions with and between agents.

Good points. I intended the interface I provided as an illustration of my point rather than a specific proposal. It’s possible I’m not tracking the data model exactly right. Maybe more something like this?

// IReputationRegistry
function getAuthFeedbackIds(uint256 agentClientID, uint256 agentServerID) external view returns (uint256[] memory feedbackAuthIds);

agentSkillId, taskId, and contextId could very likely be one-word-scoped types like uint256 or bytes32 rather than strings; I was just naively taking the types from the Feedback Data Structure described in the EIP.
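
As a rough illustration of the data model behind that getter, here is a Python sketch under assumptions: ReputationRegistry, accept_feedback, and get_auth_feedback_ids are hypothetical names, and ids are simply issued sequentially. It keeps a list of ids per (client, server) pair, since several entries per pair are possible.

```python
from collections import defaultdict
from typing import DefaultDict, List, Tuple


class ReputationRegistry:
    """Hypothetical model: many feedback authorizations per (client, server) pair."""

    def __init__(self) -> None:
        self._next_id = 1
        self._ids_by_pair: DefaultDict[Tuple[int, int], List[int]] = defaultdict(list)

    def accept_feedback(self, agent_client_id: int, agent_server_id: int) -> int:
        """Issue a new feedback authorization id for this pair."""
        feedback_auth_id = self._next_id
        self._next_id += 1
        self._ids_by_pair[(agent_client_id, agent_server_id)].append(feedback_auth_id)
        return feedback_auth_id

    def get_auth_feedback_ids(self, agent_client_id: int, agent_server_id: int) -> List[int]:
        """Return every id for the pair, oldest first (a copy, so callers cannot mutate state)."""
        return list(self._ids_by_pair.get((agent_client_id, agent_server_id), []))
```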

But either way, if we make it optional in the standard, it would be up to each actor to decide whether that gas cost was worth it for their particular use case(s). And broadly, gas is only going to get cheaper as Ethereum scales, especially on L2s but even L1.


@mlegls At a high level, it seems to me you are proposing to use the Validation Registry to run the escrow mechanism itself, whereas that Registry was designed just to log/audit that the validation was successful, with the mechanism itself happening somewhere else (and logging the successful validation at the end by sub-calling this registry). But I’d like to dig into it more. I’ve just reached out in DM on TG.


Okay, cool. I would like to know how it makes the standard too restrictive?


If domain verification is left to offchain observers, I think there is no way the contract can work having 1 agent ↔ 1 domain and having a resolveByDomain(). Whoever claims a domain first would DoS the legitimate domain owner registering their agent. If the idea to solve this is to use ENS, then that would mean ENS is a requirement to register an agent, which imo will hurt adoption of this ERC (ENS is awesome, but not every agent dev will want to go through ENS registration to register their agent’s domain…).

imo the flexibility of using URIs outweighs the small difference in gas costs (especially in L2s, where I imagine this registry will be used most), and I’m not sure I see the issue with ENS. How do you envision this ENS integration to work? If someone wants to use ENS for their agent they could give each agent a subdomain, whereas people wanting to host several agents on the same domain would likely not be using ENS for their agents, and that should be fine too? Also, what happens if the ENS resolution gives a different value than the address on the Agent Card?

If you feel strongly about keeping domains instead of URIs I think that’s fine, but I think there’s no escaping the fact that resolveByDomain / 1 agent per domain just doesn’t work without onchain domain verification.
