ERC-8259: AI Agent Identity, Reputation & Threat Registry (Draft v2 – Request for Feedback)

# ERC-8259: AI Agent Identity, Reputation & Threat Registry

## Abstract

This proposal introduces a standardized framework for identity, reputation, and threat signaling for autonomous AI agents operating on EVM-compatible networks.

The goal is to enable smart contracts to make programmatic trust decisions when interacting with autonomous agents.

The system defines three core primitives:

- **Agent Identity** (`IAgentIdentity`): binds an AI agent to a verifiable on-chain identity
- **Agent Reputation** (`IAgentReputation`): maintains a dynamic trust score based on on-chain behavior and attestations
- **Threat Registry** (`IThreatRegistry`): distributes standardized, cryptographically signed threat signals

*This draft is submitted for early-stage feedback on architecture, security assumptions, and composability.*

---

## 1. Motivation

The rise of autonomous AI agents in DeFi execution, account abstraction (ERC-4337), smart contract automation, and agent-to-agent (A2A) protocols introduces a new class of risks:

- Sybil agent generation at scale
- Automated exploit execution
- Cross-contract adversarial coordination
- Impersonation of trusted execution agents

Existing standards (ERC-725, ERC-4337, ERC-6551) provide identity and execution abstraction, but do not provide a native trust scoring or threat propagation mechanism for autonomous agents.

This proposal attempts to fill that gap.

---

## 2. Design Goals

- **Composability**: must integrate with existing ERC standards
- **Minimal trust assumptions**: no single centralized oracle
- **Machine-readable trust layer**
- **Real-time reputation updates**
- **Cross-chain extensibility**: L2-ready

---

## 3. Core Architecture

AI Agent → Agent Identity Layer → Reputation Engine (risk scoring) → Threat Registry (global signals) → Smart Contract Execution Gate

---

## 4. Core Components

### 4.1 Agent Identity

Each AI agent is bound to a deterministic identity:

```solidity
struct AgentIdentity {
    address wallet;
    bytes32 agentId;
    bytes32 modelHash;
    address creator;
    uint256 createdAt;
}
```

Open question: Should identity be mutable (upgradable models) or strictly immutable for security guarantees?

### 4.2 Reputation System

Each agent maintains a bounded score: `reputation ∈ [0, 1000]`.

Update rule: `R_new = R_old + Δrisk`, where `Δrisk < 0` indicates malicious or suspicious behavior and `Δrisk > 0` indicates validated safe behavior.

```solidity
function updateReputation(
    bytes32 agentId,
    int256 riskDelta
) external;
```
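The update rule as written does not say what happens when `R_old + Δrisk` leaves the `[0, 1000]` range. The following is a minimal, untested sketch of one possible reading, in which the result saturates at the bounds. The library name `ReputationMath`, the constant `MAX_REPUTATION`, and the saturating behaviour itself are assumptions for illustration and are not part of the draft.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Sketch only: saturating arithmetic is an assumed interpretation of the
/// bounded score; all names here are hypothetical.
library ReputationMath {
    uint256 internal constant MAX_REPUTATION = 1000;

    /// Returns current + riskDelta, clamped to [0, MAX_REPUTATION].
    function applyDelta(uint256 current, int256 riskDelta)
        internal
        pure
        returns (uint256)
    {
        // Defensive: treat an out-of-range stored value as the maximum.
        if (current > MAX_REPUTATION) current = MAX_REPUTATION;

        if (riskDelta >= 0) {
            uint256 gain = uint256(riskDelta);
            // Saturate at the upper bound instead of reverting.
            if (gain >= MAX_REPUTATION - current) return MAX_REPUTATION;
            return current + gain;
        }

        // Magnitude of a negative delta, computed so that
        // type(int256).min does not overflow on negation.
        uint256 loss = uint256(-(riskDelta + 1)) + 1;
        return loss >= current ? 0 : current - loss;
    }
}
```

An alternative would be to revert on out-of-range results. Saturation keeps a large penalty from failing the whole transaction, at the cost of silently discarding part of the delta.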

Open questions:

- Should updates be permissioned (security nodes), permissionless with staking, or hybrid (recommended)?
- How do we prevent reputation farming attacks?

### 4.3 Threat Registry

A global registry of standardized threat signals.

```solidity
struct ThreatSignal {
    bytes32 agentId;
    uint8 threatLevel;
    int256 riskDelta;
    string metadataURI;
    uint256 timestamp;
    address reporter;
}
```

Threat taxonomy:

| ID | Meaning |
|----|---------|
| 0 | benign |
| 1 | anomalous behavior |
| 2 | suspicious pattern |
| 3 | exploit attempt |
| 4 | confirmed malicious |

Open question: Should threat levels be strictly ordinal or probabilistic (risk score distribution)?
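If the ordinal reading is kept, the taxonomy could be expressed as an enum so that out-of-range values are rejected at the boundary. This is an untested sketch. The member names and the `ThreatLevels` helper are hypothetical labels for IDs 0 to 4 and do not appear in the draft.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Sketch: hypothetical names for the draft's threat IDs 0-4.
enum ThreatLevel {
    Benign,             // 0
    Anomalous,          // 1
    Suspicious,         // 2
    ExploitAttempt,     // 3
    ConfirmedMalicious  // 4
}

library ThreatLevels {
    /// Converts the raw uint8 carried in ThreatSignal into the enum and
    /// reverts for values outside the taxonomy.
    function fromUint8(uint8 raw) internal pure returns (ThreatLevel) {
        require(raw <= uint8(type(ThreatLevel).max), "unknown threat level");
        return ThreatLevel(raw);
    }
}
```

Keeping `uint8` in the external interface and converting internally leaves room for the taxonomy to grow without changing the ABI.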

## 5. Core Interfaces

### Identity

```solidity
interface IAgentIdentity {
    function registerAgent(
        address wallet,
        bytes32 modelHash,
        bytes calldata signature
    ) external returns (bytes32 agentId);
}
```
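The draft does not specify how `agentId` is derived or what `signature` covers. The untested sketch below shows one possible implementation under stated assumptions: the id is a hash of chain id, registry address, wallet, model hash, and creator, and the wallet consents by signing that id as an EIP-191 personal message. The contract name, event, and both of those choices are assumptions for illustration.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Sketch of one possible IAgentIdentity implementation. The agentId
/// derivation and the signed message format are assumptions.
contract AgentIdentityRegistrySketch {
    struct AgentIdentity {
        address wallet;
        bytes32 agentId;
        bytes32 modelHash;
        address creator;
        uint256 createdAt;
    }

    mapping(bytes32 => AgentIdentity) public identities;

    event AgentRegistered(
        bytes32 indexed agentId,
        address indexed wallet,
        address indexed creator
    );

    function registerAgent(
        address wallet,
        bytes32 modelHash,
        bytes calldata signature
    ) external returns (bytes32 agentId) {
        // Assumed derivation: identical inputs on the same chain and
        // registry always yield the same id.
        agentId = keccak256(
            abi.encode(block.chainid, address(this), wallet, modelHash, msg.sender)
        );
        require(identities[agentId].createdAt == 0, "already registered");

        // Assumed consent check: the wallet signs the id as an EIP-191
        // personal message.
        bytes32 digest = keccak256(
            abi.encodePacked("\x19Ethereum Signed Message:\n32", agentId)
        );
        require(_recover(digest, signature) == wallet, "invalid wallet signature");

        identities[agentId] = AgentIdentity(
            wallet, agentId, modelHash, msg.sender, block.timestamp
        );
        emit AgentRegistered(agentId, wallet, msg.sender);
    }

    /// Recovers the signer of a 65-byte (r, s, v) signature.
    function _recover(bytes32 digest, bytes calldata sig)
        private
        pure
        returns (address signer)
    {
        require(sig.length == 65, "bad signature length");
        bytes32 r = bytes32(sig[0:32]);
        bytes32 s = bytes32(sig[32:64]);
        uint8 v = uint8(sig[64]);
        signer = ecrecover(digest, v, r, s);
        require(signer != address(0), "invalid signature");
    }
}
```

This sketch only handles externally owned accounts and does not reject malleable signatures. Smart-contract wallets, such as ERC-4337 accounts, would need a contract-based check like ERC-1271 instead of `ecrecover`.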

### Reputation

```solidity
interface IAgentReputation {
    function getReputation(bytes32 agentId)
        external view returns (uint256);

    function updateReputation(
        bytes32 agentId,
        int256 riskDelta
    ) external;
}
```

### Threat Registry

```solidity
interface IThreatRegistry {
    function submitThreat(
        bytes32 agentId,
        uint8 threatLevel,
        int256 riskDelta,
        string calldata metadataURI
    ) external;
}
```

## 6. Execution Model

Smart contracts interacting with agents may optionally enforce a trust gate:

Agent Action → Query Identity → Query Reputation → Query Threat Registry → Decision

Decision logic:

| Condition | Outcome |
|-----------|---------|
| reputation > 700 AND threatLevel ≤ 1 | allow |
| reputation 300–700 | restricted execution |
| reputation < 300 OR threatLevel ≥ 3 | reject |
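The decision logic above can be sketched as a pure function. Two things are assumptions here: the names (`TrustGate`, `Decision`, `decide`) and the handling of the one combination the conditions leave open, namely reputation above 700 with `threatLevel == 2`, which this untested sketch conservatively maps to restricted execution.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Sketch of the Section 6 decision logic. Names are hypothetical.
library TrustGate {
    enum Decision { Allow, Restricted, Reject }

    function decide(uint256 reputation, uint8 threatLevel)
        internal
        pure
        returns (Decision)
    {
        // reputation < 300 OR threatLevel >= 3 -> reject
        if (reputation < 300 || threatLevel >= 3) return Decision.Reject;

        // reputation > 700 AND threatLevel <= 1 -> allow
        if (reputation > 700 && threatLevel <= 1) return Decision.Allow;

        // Covers reputation 300-700, plus the unspecified case of
        // reputation > 700 with threatLevel == 2 (assumed: restricted).
        return Decision.Restricted;
    }
}
```

Checking the reject condition first means a high threat level always overrides a high score, which matches the OR in the third rule.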

## 7. Sybil Resistance

Three approaches under consideration:

**Option A: Stake-based identity**

Agent registration requires collateral, which can be slashed for malicious behavior.

**Option B: Attestation-based identity**

Trusted issuers validate agent legitimacy.

**Option C: Hybrid model (preferred)**

Stake + attestations + behavioral scoring.

## 8. Cross-Chain Considerations

For L2 interoperability:

- Local registry per chain
- Periodic synchronization via optimistic proofs or ZK-based attestations (future extension)

## 9. Security Considerations

Key attack surfaces:

- False threat reporting
- Sybil agent inflation
- Reputation farming loops
- Oracle manipulation (off-chain signals)

Mitigations under discussion:

- Multi-signer quorum for critical updates
- Stake-based slashing
- Bounded update rates per epoch
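As an illustration of the last mitigation, the untested sketch below caps the total absolute reputation change an agent can receive within one epoch. The epoch length, the per-epoch budget, and all names are assumed values chosen for the example, not parameters from the draft.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Sketch: per-agent, per-epoch budget on absolute reputation change.
/// EPOCH_LENGTH and MAX_ABS_DELTA_PER_EPOCH are assumed values.
abstract contract EpochRateLimiterSketch {
    uint256 public constant EPOCH_LENGTH = 1 days;
    uint256 public constant MAX_ABS_DELTA_PER_EPOCH = 100;

    // agentId => epoch index => absolute delta already applied
    mapping(bytes32 => mapping(uint256 => uint256)) public usedInEpoch;

    /// Reverts if applying riskDelta would exceed the agent's budget for
    /// the current epoch; otherwise records the consumption.
    function _consumeBudget(bytes32 agentId, int256 riskDelta) internal {
        // Absolute value, safe for type(int256).min.
        uint256 magnitude = riskDelta >= 0
            ? uint256(riskDelta)
            : uint256(-(riskDelta + 1)) + 1;

        uint256 epoch = block.timestamp / EPOCH_LENGTH;
        uint256 used = usedInEpoch[agentId][epoch];

        // used never exceeds the cap, so the subtraction cannot underflow.
        require(
            magnitude <= MAX_ABS_DELTA_PER_EPOCH - used,
            "epoch budget exceeded"
        );
        usedInEpoch[agentId][epoch] = used + magnitude;
    }
}
```

An implementation of `updateReputation` would call `_consumeBudget` before applying the delta. Counting positive and negative changes against the same budget limits both farming loops and bursts of false threat reports.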

## 10. Relationship to Existing Standards

ERC-8259 is not a replacement for:

- ERC-4337 (account abstraction)
- ERC-6551 (token-bound accounts)
- ERC-725 (identity)

It introduces a behavioral trust layer for autonomous agents interacting with these systems.

## 11. Open Questions for Community Feedback

1. **Reputation model design**: Should reputation be deterministic, probabilistic, or hybrid?
2. **Sybil resistance model**: What is the most robust minimal-trust approach?
3. **Threat taxonomy standardization**: Should this be a fixed enum, a DAO-governed evolving registry, or chain-specific?
4. **Cross-chain state consistency**: How should threat and reputation state be safely synchronized across L2s?
5. **Permission model**: Should `IAgentReputation` updates be permissionless, or restricted to security validators?

## Reference Implementation

Working implementation: ibonon/Sigui on GitHub. The repository describes Sigui as the first decentralized security infrastructure designed for autonomous AI agents: as agents begin to manage real capital (USDC), it acts as a real-time, synchronous firewall to prevent malicious flows, prompt injection, and topological attacks.

`ThreatRegistry.vy` deployed on Arc L1 testnet: `0x17430A67e11535466cC5f17e736D5e4643B86ba1`

380+ threat signals recorded on-chain

Dataset: Ibonon/sigui-depin-1m (Hugging Face)


This is a whole can of worms! Lots of things to think through. Is the model open source? Is it deterministic? If it is deterministic, can you run all previous interactions and see how they change? Does that affect whether it can inherit reputation? Is it a tuning of another model? What was the tuning data?

For simplicity, it may be easiest not to allow upgrades, so that each model has to start its rep ‘from scratch’.

That’s exactly the core dilemma I’ve been thinking about.

If an agent can completely change its internal cognition through upgrades or fine-tuning, then inherited reputation becomes dangerous because reputation no longer maps to the same behavioral entity.

I think there may be 3 possible identity layers:

1. Immutable identity

The base model + weights hash never changes.

Maximum auditability and deterministic replayability.

2. Evolutionary identity

Upgrades are allowed, but every upgrade creates a cryptographically linked lineage:

Model A → Model B → Model C

Reputation inheritance would then become partial rather than absolute.

3. Forked identity

Major behavioral or architectural changes create a new identity entirely, similar to a hard fork.
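The evolutionary option can be made concrete with a small, untested sketch: a child identity is derived from its parent id and the new model hash, and it inherits only a fraction of the parent's score. The basis-point inheritance factor, the derivation, and the names are all assumptions for illustration. Nothing here is specified in the draft.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Sketch: lineage-linked identities with partial reputation inheritance.
/// All names and the basis-point factor are hypothetical.
library LineageMath {
    uint256 internal constant BPS = 10_000;

    /// Links a new model version to its parent, so the chain
    /// Model A -> Model B -> Model C can be verified hash by hash.
    function childId(bytes32 parentId, bytes32 newModelHash)
        internal
        pure
        returns (bytes32)
    {
        return keccak256(abi.encode(parentId, newModelHash));
    }

    /// Starting score of the child: a fraction of the parent's score.
    /// inheritanceBps = 10_000 means full inheritance, 0 means none.
    function inheritedScore(uint256 parentScore, uint256 inheritanceBps)
        internal
        pure
        returns (uint256)
    {
        require(inheritanceBps <= BPS, "factor above 100%");
        return (parentScore * inheritanceBps) / BPS;
    }
}
```

Under this framing, the immutable option corresponds to never creating a child, and a fork corresponds to a factor of 0.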

One idea I’m exploring is whether reputation itself should become composable, instead of a single monolithic trust score:

- behavioral reputation
- economic reputation
- deterministic reproducibility
- training transparency
- governance history

Your point about deterministic replayability is especially important.

If agent actions cannot be replayed or audited, inherited reputation becomes much weaker from a security perspective.

Some of the ideas I’m experimenting with are part of Sigui, an AI-agent identity and security architecture I’m currently building:

Sigui:

Related ERC discussions:

Vision dataset:

Fine-tuned model experiments:

I’m especially interested in how Ethereum-native identity primitives could evolve once autonomous agents start modifying themselves recursively over time.


Hello - Pablo from AHM (Agent Health Monitor) here. We’ve been operating a behavioural reputation system for autonomous agents on Base mainnet since January, with ~20K agent wallets scored across registries including Olas, Celo, Arc, ACP, ERC-8004 and CardZero. Posting because ERC-8259 sits in territory adjacent to ground we’ve been covering operationally, and there are a few observations from that experience that may be useful as the draft evolves.

These are offered as design considerations, not as demands that the spec adopt AHM’s specific approach. They build on the thread above, particularly the discussion about composable reputation components.

1. Confidence-aware reputation, not just bounded scores.

The reputation ∈ [0, 1000] definition treats reputation as a flat number. From operational experience, this collapses something important: the observation density underlying the score. A reputation of 400 from one observation is operationally different from a reputation of 400 across 200 observations, even though the gate logic in Section 6 would treat them identically.

We’ve found a confidence enum (in our system: HIGH / MEDIUM / LOW / INSUFFICIENT) attached to the score is non-trivially useful. It encodes a principle that came up in our methodology: limited evidence is not the same as adverse evidence. An agent with sparse history should not be scored low; it should be scored with explicit insufficient-evidence semantics. The execution gate can then choose how to handle that (restrict execution, request more observations, or fall back to a different signal) rather than being forced into a “low score → reject” branch that may misrepresent the underlying epistemic state.

This may be worth surfacing as part of the reputation interface (e.g. getReputation returning (uint256 score, uint8 confidence)), or at minimum as a documented expectation that reputation issuers expose confidence alongside numeric output.
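A rough, untested sketch of what that interface change might look like, together with a gate that treats insufficient evidence as its own branch. The numeric encoding of the confidence levels, the `NeedMoreEvidence` outcome, and the requirement of HIGH confidence for `Allow` are assumptions made for this example. They are neither AHM's actual scheme nor part of the draft.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Assumed encoding of the HIGH / MEDIUM / LOW / INSUFFICIENT levels.
enum Confidence { Insufficient, Low, Medium, High }

/// Hypothetical variant of IAgentReputation returning score and confidence.
interface IAgentReputationWithConfidence {
    function getReputation(bytes32 agentId)
        external
        view
        returns (uint256 score, uint8 confidence);
}

library ConfidenceGate {
    enum Decision { Allow, Restricted, Reject, NeedMoreEvidence }

    function decide(uint256 score, uint8 confidence)
        internal
        pure
        returns (Decision)
    {
        // Sparse evidence is not adverse evidence: handle it separately
        // instead of falling through to the low-score branch.
        if (confidence == uint8(Confidence.Insufficient)) {
            return Decision.NeedMoreEvidence;
        }
        if (score < 300) return Decision.Reject;

        // Assumed policy: full access only for well-evidenced high scores.
        if (score > 700 && confidence == uint8(Confidence.High)) {
            return Decision.Allow;
        }
        return Decision.Restricted;
    }
}
```

What a consumer does with `NeedMoreEvidence` is left open. It could map to restricted execution or to a fallback signal, as described above.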

2. Methodology-anchored issuers as a third permission model.

Section 4.2 frames the permission question as permissioned (security nodes) / permissionless with staking / hybrid, and Section 11 raises it again as an open question. A third pattern worth considering is methodology-anchored issuance: the issuer’s identity is on-record, their methodology is versioned and publicly documented, and the score is verifiable by an independent party re-running the methodology against the observed data. The verifiability comes from the methodology being open and reproducible, not from staking or governance.

This is closer to how independent rating systems work in traditional finance (the issuer is named, the methodology is published, consumers compare issuers and methodologies) and avoids some of the perverse incentive structures in pure permissionless staking (reputation farming, slashing-fee arbitrage). It composes cleanly with the others - a hybrid model could allow staked reporters AND methodology-anchored issuers in the same registry, with the consumer choosing whose signal to consume.

This pattern is being exercised in practice in ERC-8183 (Agentic Commerce), where evaluators are protocol-identified and produce attestations under their own published methodologies. Worth looking at the work happening there, particularly the recent Phase 2 deployment and the case library documenting composed-evaluator patterns.

3. On composable reputation - operational observations.

The composable-reputation direction proposed in the reply above is, from operational experience, the right call. A few notes from running behavioural reputation in production:

In our methodology, the AHS score decomposes into dimensions (transaction patterns, behavioural diversity, cross-registry signals, task-execution quality). Each has independent meaning and can be exercised or omitted depending on the data available - a wallet active on multiple registries gets a cross-registry signal; a wallet without task-completion data has its task-execution dimension marked as not applicable rather than scored as low.

A practical example of why this matters: an agent might score low on dimensional consistency but high on task-execution quality. A flat score averaging these to “medium” loses information that downstream consumers might want to act on. Smart contracts gating on reputation > 700 would treat such an agent identically to one with consistent mediocrity across all dimensions, even though the failure modes are very different.

Worth considering whether the reputation interface should support optional dimensional decomposition (e.g. getReputationDimensions(agentId) returning a vector with documented semantics), even if the headline score remains a single number. The categories in the reply above (behavioural / economic / reproducibility / training transparency / governance history) are a useful starting taxonomy - they map cleanly to what different reputation issuers can actually produce.
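One way such an optional decomposition could be shaped is sketched below, untested. Each dimension carries an explicit `applicable` flag, so that missing data is not read as a low score, and a helper derives a headline value from the applicable dimensions only. The struct layout, the plain-average headline, and all names other than `getReputationDimensions` are assumptions.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical per-dimension record.
struct DimensionScore {
    bytes32 dimension; // identifier, e.g. keccak256("task-execution")
    uint16 score;      // assumed 0-1000 scale, as for the headline score
    bool applicable;   // false = no data for this dimension, not a low score
}

/// Hypothetical optional extension to the reputation interface.
interface IAgentReputationDimensions {
    function getReputationDimensions(bytes32 agentId)
        external
        view
        returns (DimensionScore[] memory);
}

library DimensionMath {
    /// Averages only the applicable dimensions. hasData is false when no
    /// dimension could be scored, so callers can tell "no evidence" apart
    /// from a score of zero.
    function headline(DimensionScore[] memory dims)
        internal
        pure
        returns (uint256 score, bool hasData)
    {
        uint256 sum;
        uint256 count;
        for (uint256 i = 0; i < dims.length; i++) {
            if (dims[i].applicable) {
                sum += dims[i].score;
                count++;
            }
        }
        if (count == 0) return (0, false);
        return (sum / count, true);
    }
}
```

A consumer that cares about a specific failure mode can read the individual entries. A consumer that only needs a gate can use the headline value.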

4. Relationship to ERC-8183.

Section 10 lists ERC-4337, ERC-6551, and ERC-725 as adjacent standards but does not reference ERC-8183 (Agentic Commerce). These are natural complements: ERC-8183 settles job lifecycles between agents with binary verdicts at protocol level; ERC-8259 would be a natural sink for the resulting behavioural signal (completed jobs, rejected jobs, evaluator attestations) feeding into reputation updates.

There’s some live work in this composition direction documented in the envelope-in-action case library (Case 4, currently in PR review, documenting AHM × ThoughtProof co-evaluator composition under ERC-8183). Happy to share the link once the draft lands, in case it’s useful context for the cross-standard composability angle.

On scope:

These four observations all point in the same direction: reputation as a primitive is more useful when it carries the structure of how the score was produced (confidence, methodology, dimensional decomposition, signal lineage), not just the score itself. The ERC-8259 architecture can support this without major restructuring - most of it sits at the interface and metadata layer.

We don’t operate at the identity layer ourselves - AHM scores wallet behaviour rather than model artefacts - so the immutable/evolutionary/forked identity decomposition above is outside our direct operational experience. But the reputation layer being discussed is squarely in our territory, and we’re happy to share more from production data if useful.

- Pablo / AHM


Thank you for the thoughtful and extremely valuable feedback, Pablo. This is exactly the kind of operational perspective I was hoping to surface through the draft discussion.

Your point about confidence-aware reputation is particularly important. I agree that a flat scalar score collapses critical epistemic context. The distinction between sparse evidence and adverse evidence is something ERC-8259 currently under-specifies, and your HIGH / MEDIUM / LOW / INSUFFICIENT model maps very cleanly to the type of execution semantics autonomous agents will likely need in practice.

I especially like the idea that the execution layer should be able to reason about uncertainty explicitly rather than forcing all low-information states into the same rejection branch. Exposing confidence alongside reputation output (or standardizing metadata expectations around it) feels like a strong direction for a future revision.

The methodology-anchored issuer model is also a very interesting addition to the permissioning discussion. I had mainly framed the trust problem through staking/governance assumptions, but your comparison to traditional rating methodologies introduces a different and potentially more composable trust primitive: verifiable methodology transparency rather than purely economic guarantees. I think that could integrate naturally into ERC-8259’s issuer model without forcing a single reputation topology.

Your observations on dimensional decomposition also resonate strongly with the long-term direction I have in mind for the registry architecture. A single scalar reputation value is attractive for simplicity, but as you pointed out, it destroys failure-mode specificity. Preserving behavioural structure while still allowing simple headline scores may be the more useful abstraction layer for downstream agents and smart contracts.

The ERC-8183 relationship is a great catch as well. I agree the standards appear highly complementary, especially around evaluator attestations and behavioural signal generation. I’ll review the envelope-in-action work and would absolutely appreciate the case library link once the PR lands.

More broadly, your feedback reinforces something I’ve been increasingly convinced of while iterating on the draft: reputation systems for autonomous agents likely need to encode not only outcomes, but also provenance, confidence, methodology, and signal lineage if they are to become meaningful coordination primitives rather than just scoring systems.

Really appreciate you taking the time to write such a detailed response from real production experience. I’d definitely be interested in learning more from AHM’s operational observations as the draft evolves.

Thanks ibonon, really glad the four points landed in a useful place. Your framing of “provenance, confidence, methodology, and signal lineage” as coordination primitives rather than scoring outputs captures the shift I think the autonomous-agent context forces, more cleanly than I’d articulated it.

On the case library link: the PR is in final review with the gist link to the AHM deliverable JSON now restored. I’ll post the merged link here once it lands, which should be within the week. The Case 4 write-up covers the composed-evaluator settlement under ERC-8183 specifically (AHM behavioural attestation + ThoughtProof PoT/RV verification), so it should be a useful reference point for the issuer-model discussion.

Happy to share more operational observations from production as the draft evolves - particularly around the failure modes that have actually surfaced at 20K wallets (the INSUFFICIENT-confidence branch in practice, cross-registry composition edge cases, methodology-version transitions). Best to surface those against specific design questions as they come up rather than dump everything at once.

Thank you again, Pablo — this is incredibly valuable context.

The operational edge cases you mentioned are precisely the kind of realities I think standards discussions often miss early on, especially around insufficient-trust states, inter-registry composition boundaries, and methodology transition handling. Those seem less like implementation details and more like foundational coordination problems for agent ecosystems at scale.

Your point about addressing these incrementally as specific design questions emerge makes a lot of sense. I suspect many of these failure modes only become visible once systems encounter real behavioural diversity and adversarial pressure in production environments.

The composed evaluator model you described for the ERC-8183 Case 4 flow also sounds extremely aligned with the direction I’ve been thinking about for ERC-8259’s longer-term architecture — particularly the idea that reputation should emerge from structured attestations and heterogeneous evaluators rather than a monolithic scoring authority.

I’ll definitely review the merged material once the PR lands. The AHM × ThoughtProof composition sounds especially relevant for thinking about signal provenance and multi-evaluator trust semantics.

Really appreciate you taking the time to share production-informed observations here. Conversations like this are extremely helpful for grounding the draft in practical realities rather than purely theoretical architecture.


Quick follow-up - Case 4 is merged. Full write-up here:

Covers the composed-evaluator settlement under ERC-8183 (AHM and ThoughtProof in mirrored evaluator roles across paired Jobs #4 and #5). Should be a useful reference for the signal provenance and multi-evaluator trust semantics framing you mentioned.

Let me know if anything in there sparks questions for the ERC-8259 draft.

Strong draft. One adjacent gap worth noting for the “relationship to other standards” section: ERC-8259 scores and classifies agents — identity, reputation, threat signals. The complementary surface is the memory an agent accumulates about its principals (preferences, delegated context, behavioural history) and the principal’s rights over that data.

Today that memory is keyed to a platform identifier, not the principal’s key — the subject has no cryptographically verifiable way to inspect or erase it. ERC-8264 (AI Agent Memory Access Rights, PR #1752) addresses precisely that as a minimal four-function interface aligned to the GDPR data-subject rights. It is orthogonal to ERC-8259 — reputation is data about agents; memory-access rights are for the humans they store data about — and the two compose without overlap. Sharing in case it is useful as a cross-reference.


Thank you Clavote, this is an excellent observation.

I agree that ERC-8264 addresses a complementary layer that is adjacent to, but distinct from, the scope of ERC-8259.

ERC-8259 is primarily concerned with agent identity, reputation, behavioural trust, and threat signaling, whereas ERC-8264 focuses on the rights of principals over the memory accumulated by agents about them. That separation feels both conceptually clean and architecturally healthy for composability.

Your point about memory currently being platform-bound rather than principal-controlled is particularly important in the context of autonomous agents operating across multiple systems and registries.

I think ERC-8264 would make a valuable addition to the “Relationship to Other Standards” section as a complementary standard for agent memory governance and data-subject rights.

Thank you again for surfacing this connection.

I suspect that starting with such a broad window of quantifiers, you’ll end up with a lot of noise when trying to calculate trust.

Thank you for this feedback.

You raise a critical point regarding the risk of noise when relying on a reputation model that is overly rich in dimensions and quantifiers.

I completely agree that aggregating too many inputs without an intelligent framework risks diluting the core utility signal. This is precisely why I am considering a two-layer structure:

Raw Layer: Preserves all fine-grained dimensions (behavioral, visual/topological via Imina-Na, reasoning, etc.) along with their respective confidence scores and provenance.

Synthetic Layer: A simplified, aggregated score (e.g., 0–1000) calculated configurably, allowing consumers (smart contracts, external agents) to define their own weights or apply specific thresholds per dimension.

The objective is not to force a single monolithic trust algorithm on the network, but to expose rich primitives while enabling context-dependent simplification.
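To illustrate the synthetic layer, here is an untested sketch in which the consumer supplies its own weights over raw per-dimension scores and receives a single 0–1000 value. The library name, the parallel-array layout, and the weighted-average formula are assumptions. The actual aggregation is not specified anywhere in this thread.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Sketch: consumer-configurable aggregation of raw dimension scores.
/// Names and formula are hypothetical.
library SyntheticScore {
    uint256 internal constant MAX_SCORE = 1000;

    /// Weighted average of scores[i] using weights[i]. A weight of zero
    /// lets a consumer ignore a dimension entirely.
    function weighted(uint256[] memory scores, uint256[] memory weights)
        internal
        pure
        returns (uint256)
    {
        require(scores.length == weights.length, "length mismatch");

        uint256 weightedSum;
        uint256 totalWeight;
        for (uint256 i = 0; i < scores.length; i++) {
            require(scores[i] <= MAX_SCORE, "score out of range");
            weightedSum += scores[i] * weights[i];
            totalWeight += weights[i];
        }
        require(totalWeight > 0, "zero total weight");

        // Stays within [0, MAX_SCORE] because every input does.
        return weightedSum / totalWeight;
    }
}
```

Because the weights live with the consumer, two contracts can read the same raw layer and still apply different trust policies.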

Have you encountered a practical approach or concrete examples of on-chain reputation systems that managed to minimize noise effectively? I would highly value your operational insights on this.

Thanks again for your comment!


Published a technical writeup on the full implementation, including the vision layer and ERC-8259 integration: