ERC-8028: Data Anchoring Token (DAT)

Simple Summary
The Data Anchoring Token (DAT) is a semi-fungible token (SFT) standard designed for AI-native digital assets. Each DAT represents a dynamic bundle of ownership certificate, usage rights, and revenue share, optimized for AI data, models, and inference workflows.

What it does (short):

  • Represents AI assets as classes; each token carries a value (quota), shareRatio (revenue weight), and optional expireAt.
  • Adds minimal surfaces for recordUsage, transferValue (intra-class), settleRevenueETH/ERC20, and claim (a non-normative sketch follows this list).
  • Keeps large payloads off-chain via metadataURI + integrityHash.
  • Plays nicely with existing standards: ERC-3525 (slot mapping), ERC-2981 (discovery), ERC-4907/5006 (rental roles).
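
For orientation, here is a non-normative Solidity sketch of those surfaces. The function names follow the summary above; the parameter lists, getter names (valueOf, shareRatioOf, expireAt), and events are illustrative assumptions, not the draft's final ABI.

```solidity
// SPDX-License-Identifier: CC0-1.0
pragma solidity ^0.8.20;

/// Non-normative sketch: function names follow the summary above; parameter
/// lists, getter names, and events are illustrative placeholders.
interface IDataAnchoringTokenSketch {
    event UsageRecorded(uint256 indexed tokenId, bytes32 metricType, uint256 amount);
    event RevenueSettled(uint256 indexed classId, address indexed currency, uint256 amount);

    // Class-level anchors: large payloads stay off-chain, integrity stays on-chain.
    function metadataURI(uint256 classId) external view returns (string memory);
    function integrityHash(uint256 classId) external view returns (bytes32);

    // Per-token accounting: quota, revenue weight, optional expiry.
    function valueOf(uint256 tokenId) external view returns (uint256);
    function shareRatioOf(uint256 tokenId) external view returns (uint256);
    function expireAt(uint256 tokenId) external view returns (uint64);

    // Minimal mutating surfaces.
    function recordUsage(uint256 tokenId, bytes32 metricType, uint256 amount) external;
    function transferValue(uint256 fromTokenId, uint256 toTokenId, uint256 amount) external; // intra-class only
    function settleRevenueETH(uint256 classId) external payable;
    function settleRevenueERC20(uint256 classId, address token, uint256 amount) external;
    function claim(uint256 classId, uint256 tokenId) external;
}
```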


Links


Context & intent

DAT targets real AI workflows: usage-based access (inference/training/agent runs) with deterministic on-chain revenue distribution to contributors. We’re aiming for the smallest viable interface that wallets/indexers can support without bespoke integrations.

6 Likes

We’re already trying DAT in a real user flow on LazAI Testnet through an AI companion called Lazbubu.

It’s a playful way to show what “data-anchored” agents can feel like: your chats, quests, and choices actually shape the companion over time. No two Lazbubus end up the same, and that’s the point: we want people to experience why this matters, not just read a spec.

If you’re curious:

This is testnet-only for now. We’ll share what we learn and fold it back into the DAT draft. Feedback and first impressions are very welcome.

2 Likes

Hi. Could you provide concrete examples of what value represents for each “asset class” (dataset/model/agent)?

Also is the primary intended user a B2B platform (e.g., HuggingFace tokenizing its models for other platforms) or a C2C marketplace (e.g., an individual selling access to their custom-trained “My-Art-Style” LoRA model)?

2 Likes

Hello @hash. Thanks for asking, good question; let me elaborate.

value is a quota whose unit is defined per Class (described in the policyURI, with precision via unitDecimals). The Class policy sets what one unit means and how many decimals it has.

So, to your example: if the Class policy says 1 unit = 1 image, then value = 100 means you can read 100 images. The actual unit depends on the Class settings (e.g., images, API requests, tokens, minutes, etc.).
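
A tiny illustration of how unitDecimals interacts with value, assuming quota amounts are stored in base units the way ERC-20 balances are scaled by decimals; the actual storage convention is up to the implementation.

```solidity
// SPDX-License-Identifier: CC0-1.0
pragma solidity ^0.8.20;

/// Assumption for illustration: `value` is stored in base units scaled by the
/// Class's unitDecimals, similar to how ERC-20 balances are scaled by decimals.
library QuotaMath {
    /// With "1 unit = 1 image" and unitDecimals = 2, a quota of 100 images
    /// is stored as 100 * 10**2 = 10_000 base units.
    function toBaseUnits(uint256 humanUnits, uint8 unitDecimals) internal pure returns (uint256) {
        return humanUnits * 10 ** unitDecimals;
    }

    function fromBaseUnits(uint256 baseUnits, uint8 unitDecimals) internal pure returns (uint256) {
        return baseUnits / 10 ** unitDecimals;
    }
}
```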

Who is it for? Both.

  • B2B: e.g., a hosting platform issues DATs for its models/datasets, and any integrator can consume them with transparent on-chain usage accounting and deterministic revenue settlement.
  • C2C: e.g., an individual artist sells access to a custom LoRA model; value could be 1,000 inference calls or 50k output tokens, and revenue is distributed by shareRatio.
2 Likes

I have two questions

  • Can multiple underlying model versions map to the same DAT class (or it’s 1 to 1 relationship)?
  • What determines the ‘value’ deducted per call — gas, oracle price, or a contract-set schedule?
1 Like

Hi @anaye1997, thanks for the questions! Here’s how we see it:

1. Can multiple underlying model versions map to the same DAT class (or it’s 1 to 1 relationship)?

In a nutshell: yes. A DAT Class is the boundary for accounting and provenance (classId, metadataURI, integrityHash, policyURI).

  • If versions are materially different (new weights/architecture/licensing/evals), we recommend a separate Class per version so quotas, usage accounting, and revenue distribution stay cleanly separated. (But it’s up to your model.)
  • If changes are minor, you can keep a single Class and update metadata (optionally adding proofs/attestations).

Rule of thumb: the moment you need separate quotas/settlement/provenance, create a separate Class.

2. What determines the ‘value’ deducted per call — gas, oracle price, or a contract-set schedule?

Value is a quota, and the deduction rules are defined by the Class policy (policyURI, precision via unitDecimals). Gas is not tied to quota by default. You can:

  • Set a static rule (“1 inference = 1 unit”),
  • Use metrics (output tokens / seconds / steps),
  • Make it dynamic via an oracle/adapter (e.g., deduct more for higher complexity).

In our intended profile, the deduction per call is the inference weight: the amount passed to recordUsage(tokenId, metricType, amount) reflects a workload-based metric (e.g., a·input_tokens + b·output_tokens + c·latency + d·model_tier) defined in the Class policyURI. This keeps quota tied to actual compute, not gas. A minimal code sketch of the two accounting modes follows the list below.

  • Hard-quota mode (recommended): recordUsage decrements value by amount (and prevents underflow).
  • Soft-quota mode: the contract emits an event only; limits/policy are enforced off-chain.
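
Here is the sketch mentioned above of how the two modes could look. The per-class hardQuota flag, the storage layout, and the authorization hook are illustrative assumptions, not part of the draft.

```solidity
// SPDX-License-Identifier: CC0-1.0
pragma solidity ^0.8.20;

/// Sketch of the two accounting modes above. The per-class hardQuota flag,
/// storage layout, and authorization hook are illustrative assumptions.
abstract contract UsageAccountingSketch {
    event UsageRecorded(uint256 indexed tokenId, bytes32 metricType, uint256 amount);

    mapping(uint256 => uint256) public valueOf;  // tokenId => remaining quota, in base units
    mapping(uint256 => bool) public hardQuota;   // classId => hard (true) or soft (false) accounting

    function classOf(uint256 tokenId) public view virtual returns (uint256);
    function _isAuthorizedRecorder(uint256 tokenId, address caller) internal view virtual returns (bool);

    function recordUsage(uint256 tokenId, bytes32 metricType, uint256 amount) external {
        require(_isAuthorizedRecorder(tokenId, msg.sender), "recordUsage: not authorized");
        if (hardQuota[classOf(tokenId)]) {
            // Hard-quota: decrement on-chain and prevent underflow.
            require(valueOf[tokenId] >= amount, "recordUsage: quota exceeded");
            valueOf[tokenId] -= amount;
        }
        // Both modes emit the event; in soft-quota mode this is the only effect
        // and limits/policy are enforced off-chain.
        emit UsageRecorded(tokenId, metricType, amount);
    }
}
```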

1. How are “value” and “unit” accurately determined?

  • The proposal mentions that each DAT token has a value whose unit is defined by the Class Policy (such as images, API requests, tokens, minutes, etc.). However, how exactly is the “value” measured and calculated across different use cases? How can units be standardized across various applications? How do different asset classes (e.g., datasets, models, inference) balance their “value”?

2. Is the fee deduction mechanism fair and scalable?

The fee deduction is based on workload or inference calls, but does this approach face scalability or fairness issues? For example:

  • Can more complex workloads lead to disproportionate fees?
  • If using oracles or adapters for dynamic deductions, how can transparency and fairness be ensured?

Hi @0x_WeakSheep, thanks for asking. Let me try to make it clear, following your questions:

1. How are “value” and “unit” accurately determined?

Making it simple:

  • value = quota: how many times/how many units the resource can be used.
  • unit = the quota unit, defined by the Class (in policyURI), precision set by unitDecimals.
  • Examples:
    • Dataset: images / records / MB / requests.
    • Model: inference weight (e.g., input tokens + output tokens + seconds).
    • Agent: steps / tasks / minutes.
  • In hard-quota, recordUsage(...) decrements the quota; in soft-quota, it only emits an event and limits are enforced off-chain.

Simple example: if the Class policy says “1 unit = 1 image,” then value = 100 means you can read 100 images.
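
To make that concrete, here is a hypothetical shape for what a Class declares on-chain versus keeps off-chain; the field names are illustrative, not part of the draft.

```solidity
// SPDX-License-Identifier: CC0-1.0
pragma solidity ^0.8.20;

/// Hypothetical per-Class accounting config; field names are illustrative.
/// The human-readable unit definition, coefficients, and caps live off-chain
/// at policyURI and are anchored on-chain by integrityHash.
struct ClassAccounting {
    string  policyURI;      // off-chain policy: "1 unit = 1 image", deduction rules, caps, fallback schedule
    bytes32 integrityHash;  // hash of the off-chain policy/metadata payload
    uint8   unitDecimals;   // precision for value (quota) amounts
    bool    hardQuota;      // true: recordUsage decrements on-chain; false: event-only accounting
}
```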


2. Is the fee deduction mechanism fair and scalable?

It depends on what you mean by “fair.” Everyone has their own idea of fairness, but the principle is simple: we deduct based on real workload (inference weight), not gas. The coefficients and rules live in the policyURI, so they’re visible and versioned, and you can set caps (max per call/minute). That’s pretty fair and transparent, right?

On scalability, accounting is just a single recordUsage event, and payouts are done via index + claim, which is cheap and predictable on-chain (a minimal sketch is at the end of this reply).

If you need dynamic deductions, you can plug in an oracle/adapter, but the source and version must be declared in the policy; if something breaks, we fall back to a static schedule. That covers most cases, but if you see a hole or a better way, tell me and I’ll happily dig into it.
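
For reference, here is the minimal index + claim sketch I had in mind. The storage layout, the 1e18 scaling, and the share-ratio helpers are assumptions, not the draft's normative interface.

```solidity
// SPDX-License-Identifier: CC0-1.0
pragma solidity ^0.8.20;

/// Minimal "index + claim" sketch. Storage layout, helper names, and the
/// 1e18 scaling are illustrative assumptions, not the draft's interface.
abstract contract RevenueIndexSketch {
    uint256 private constant PRECISION = 1e18;

    mapping(uint256 => uint256) public revenueIndex;                  // classId => cumulative wei per share unit (scaled)
    mapping(uint256 => mapping(uint256 => uint256)) public paidIndex; // classId => tokenId => index already claimed

    // Assumed accessors provided by the token contract; shareRatioOf and
    // totalShares are expected to use the same share units.
    function totalShares(uint256 classId) public view virtual returns (uint256);
    function shareRatioOf(uint256 tokenId) public view virtual returns (uint256);
    function ownerOf(uint256 tokenId) public view virtual returns (address);

    /// Settlement is O(1): it only advances the per-class index.
    function settleRevenueETH(uint256 classId) external payable {
        revenueIndex[classId] += (msg.value * PRECISION) / totalShares(classId);
    }

    /// Each holder later pulls their pro-rata share, weighted by shareRatio.
    function claim(uint256 classId, uint256 tokenId) external {
        uint256 delta = revenueIndex[classId] - paidIndex[classId][tokenId];
        paidIndex[classId][tokenId] = revenueIndex[classId];
        uint256 amount = (delta * shareRatioOf(tokenId)) / PRECISION;
        (bool ok, ) = ownerOf(tokenId).call{value: amount}("");
        require(ok, "claim: ETH transfer failed");
    }
}
```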

After reviewing 8028, I want to highlight an architectural observation that may help frame some long-term design choices.

DAT is clearly modeling a rights-based, semi-fungible asset.

The core semantics—class identity, integrityHash, metadataURI, quota/usage rules—put it much closer to a logical entitlement system than a fungible token standard.

This raises an important question:

Should the “rights semantics” and the “value/transfer semantics” remain in the same layer, or is there a benefit to separating them?


1. A layered approach often leads to stronger composability

This is the approach we’ve chosen in the privacy space, adopting a ‘minimal primitive + higher-order construction’ design philosophy; we welcome everyone to join that discussion.

The benefits of this separation are:

  • 8086 remains small, stable, reusable
  • 8085 can evolve independently
  • applications can adopt whichever layer they need
  • the underlying primitive becomes a “reusable substrate” across many asset types
1 Like

Hi Henry, thank you for sharing.

I see your note on the marketing page:

LazAI empowers developers to create value-aligned, personalized AI agents

Given that main motivation, I don’t see why it requires anything more than an existing value token.


If there are other obvious kinds of things that need to be tracked, and if you think many other projects have the same problem, I would recommend putting this at the very beginning of your proposal.

1 Like