Hi @SamWilsn,
You’re right: Data Objects have a lot in common with some data-related Java patterns. At the same time, there are differences.
Data Access Objects are used in Java to abstract data storage from business logic. This is similar to the purpose of the Data Objects we have, but DAOs are limited to generic CRUD operations, while Data Objects provide data-specific operations (like mint/burn/transfer when we store data for a fungible token).
Data Objects are also similar to Components in the ECS pattern, since they inherit data and base logic for it. But I think it would be wrong to represent Data Managers as Entities - because unlike Entities, which usually only have an ID, Data Managers have their own logic and data.
Data Index provides an intermediate layer between business logic (implemented in Data Managers) and data storage (in Data Objects), and this is similar to the Repository Design Pattern, but since the main purpose of Data Index is to manage access to Data Objects, I don’t think it has exactly the same purpose and semantics as Repositories.
We’ve open-sourced an educational example with a minimalistic implementation of ERC-7208 and shared storage between different token interfaces.
What does this mean? A single ERC-1155 smart contract (i.e. fractional ownership of an asset, represented by the NFTs) can now be traded concurrently through its original smart contract or an alternative ERC-20 interface.
Why is this relevant? Because ERC-1155 tokens are usually illiquid assets. The possibility of trading them through an ERC-20 interface enables tapping into liquidity pools through DEXs (e.g. Uniswap and others).
How does it work? By abstracting the storage of the asset from the interface (ERC-1155, ERC-20, etc.) being used to access and modify the underlying asset. This is only possible through ERC-7208.
This is awesome and just what we need for our browser… We built a version of this technology just this last week. We call them WebContracts, and they are supposed to mimic an Apache server, including redirect codes and MIME types (plus a special version of Extended MIME type that we built specifically for web3 links). Our contract works as a transferable file and token store (ERC-20 and ERC-721).
I was actually about to update our contract to take low-level data rather than just strings. And I wanted to expand the token withdrawal to be able to execute any function of the owned token’s contract.
Would anyone look over my latest updates to my WebContract project (v0.1.0 coming soon to return full redirectCodes in the next couple days)?
Would like some help to get an ERC draft together. I’m new here and not sure how or where to start my own thread.
GitHub Repo: technicallyweb3/web-contract
NPM Package @tw3/solidity/contracts/WebContract.sol
Would what I’m building be considered a Data Manager in this scenario? I have also built a LinuxVirtualFS contract which brings functions like ls, mkdir, rm, touch, nanoUpdate and more to make the contract feel like you’re interacting with a Linux machine. What roles would these components play in the ERC-7208 ecosystem?
This is super interesting (!!!) Thanks for asking this type of question, and I hope we can help you sort it out.
It would benefit from being an abstract generic storage, like a File DataObject for file handling. Then the functions can be handled elsewhere on a different contract, perhaps on a different chain as well.
For sure, let’s chat =)
You can always open a new thread on this forum, or if the project already has a few backers, you can open up a PR on the ethereum/ERCs GitHub repository (The Ethereum Request for Comment repository) for a DRAFT proposal.
Yes and No. The DataManager is “the last mile” component that exposes interfaces to the end-user. The storage layer is with the DataObject, the DataPoint is the “pointer” to the data, not unlike an inode, and the DataIndex is the “access manager”. ERC-7208 can help you set up an architecture design pattern that is sustainable and interoperable, by using simple interfaces for separation of domain and to facilitate communication between components (i.e. read/write functions and other common interfaces).
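A rough sketch of how those components can wire together. This is an illustrative assumption only: the `IDataIndexLike` shape, the operation selector, and all contract names here are hypothetical; only `IDataObject`’s `read`/`write` shape follows the interface given in the ERC.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Illustrative sketch only: IDataIndexLike and the operation selector are
// assumptions for this example, not normative ERC-7208 definitions.
type DataPoint is bytes32;

// Storage layer (following the ERC's IDataObject read/write shape).
interface IDataObject {
    function read(DataPoint dp, bytes4 operation, bytes calldata data) external view returns (bytes memory);
    function write(DataPoint dp, bytes4 operation, bytes calldata data) external returns (bytes memory);
}

// Assumed gating layer: validates that the calling Data Manager may touch
// the Data Point, then forwards the call to the Data Object.
interface IDataIndexLike {
    function write(address dobj, DataPoint dp, bytes4 operation, bytes calldata data) external returns (bytes memory);
}

// "Last mile" Data Manager: exposes a user-facing interface and routes all
// storage access through the Data Index.
contract TransferDataManager {
    // Hypothetical operation selector understood by the Data Object.
    bytes4 private constant OP_TRANSFER = bytes4(keccak256("transfer(address,address,uint256)"));

    IDataIndexLike public dataIndex;
    address public dataObject;
    DataPoint private _dp;

    constructor(IDataIndexLike di, address dobj, DataPoint dp) {
        dataIndex = di;
        dataObject = dobj;
        _dp = dp;
    }

    function transfer(address to, uint256 amount) external returns (bool) {
        // The Data Manager never touches storage directly; the Data Index
        // gates the call before it reaches the Data Object.
        dataIndex.write(dataObject, _dp, OP_TRANSFER, abi.encode(msg.sender, to, amount));
        return true;
    }
}
```

The point of the sketch is the call path: user → Data Manager → Data Index → Data Object, with the DataPoint acting as the pointer to the stored data.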
I’ll need to understand a bit more about what you’re trying to achieve, and how deep down the rabbit hole you want to take it. Do you want to have “just the feeling” of interacting with a terminal, or do you want to have a full Linux running on top of EVM? Do you want a FS or do you want just to expose some functions that pretend you are using a FS?
First thing, ERC-7208 is not an ecosystem. It is an ecosystem builder. You can use it to build smart contracts that can interact with one another, including adapting previously deployed assets to be wrapped and exposed through a different implementation. This ERC can provide you with the framework to build those things you mentioned in the previous message and much more, but the motivation will determine the architectural design you wish to follow. Where is this code going to run? Is it a private chain, public, or localhost? Who is going to be using this, and for what purpose?
An over-simplification can look like this:
- File Data can be stored in a `bytes32` format (if you want to have a 32-byte block size, or you can simply use `bytes` blobs), indexed by a `DataPoint`, which is issued by a `DataPointRegistry`.
- The `DataPoint` is managed by a File `DataObject` implementing the “low-level file management” logic.
- The Access Management can be implemented through a `DataIndex` that allows/denies access based on user/application privileges.
- The Logic facing the user/application (i.e. `mkdir`/`rm`/`touch`/…etc.) can be exposed through one or multiple `DataManager` contracts.
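A hedged sketch of what the File `DataObject` bullet above could look like. All names, operation selectors, and the access-check shape here are hypothetical assumptions, not part of the standard.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Hypothetical sketch of a minimal file-oriented Data Object. The operation
// selectors and the Data Index gating shape are illustrative assumptions.
type DataPoint is bytes32;

contract FileDataObject {
    // Hypothetical operations this object understands.
    bytes4 internal constant OP_WRITE_FILE = bytes4(keccak256("writeFile(bytes32,bytes)"));
    bytes4 internal constant OP_READ_FILE  = bytes4(keccak256("readFile(bytes32)"));

    // File blobs, namespaced by DataPoint and keyed by a path hash.
    mapping(bytes32 => mapping(bytes32 => bytes)) private files;

    // Stand-in for the Data Index that allows/denies access.
    address public immutable dataIndex;

    constructor(address di) {
        dataIndex = di;
    }

    modifier onlyIndex() {
        require(msg.sender == dataIndex, "access denied");
        _;
    }

    function write(DataPoint dp, bytes4 operation, bytes calldata data) external onlyIndex returns (bytes memory) {
        if (operation == OP_WRITE_FILE) {
            (bytes32 pathHash, bytes memory blob) = abi.decode(data, (bytes32, bytes));
            files[DataPoint.unwrap(dp)][pathHash] = blob;
            return "";
        }
        revert("unknown operation");
    }

    function read(DataPoint dp, bytes4 operation, bytes calldata data) external view returns (bytes memory) {
        if (operation == OP_READ_FILE) {
            bytes32 pathHash = abi.decode(data, (bytes32));
            return files[DataPoint.unwrap(dp)][pathHash];
        }
        revert("unknown operation");
    }
}
```

User-facing verbs like `mkdir`/`rm`/`touch` would then live in one or more Data Manager contracts that encode their arguments into these low-level operations.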
In preparation for going from “In Review” to “Last Call”, the team has released some technical documentation that we hope you’ll find helpful.
The Minimalistic Example implementation repository has been closed for a few days while undergoing an audit with Omniscia. After completing the process, the code will be renamed as the Reference Implementation and the report will be made public.
We’ll be pushing for the ERC to go for LAST CALL, and soon after FINAL.
After spending 3 days reading the (very trite) description of ERC-7208 and its related interfaces, going over the example implementations, and further dissecting it with ChatGPT, I would highly recommend you update your description and explanation here: ethereum.org/EIPS/eip-7208. The main issues I had trouble with are that it’s not clear what the hierarchy of this system is: what containers contain what, and where the actual data lives.
From someone with fresh eyes: You have something called a data registry which appears to contain the actual data but has no read and write methods. Furthermore, you have permissions on the data registry and then also on the data index, with almost identical descriptions for both. It is not clear how the data registry is different from the data index and what the responsibilities of each are. A chart, or some of the charts you have created, would be great on this page.
It is not clear how a data object relates to a data point. I have discerned that it is a one-to-many relationship (I’d put the probability of that at 80% at this point), but it is not explicitly stated anywhere on the main informational page. It is also not clear if using a data object is always necessary.
It is not clear if a data index is always necessary, or how data is registered in the data index. The usefulness of the data index is also not clear, because there are no example methods or any description of how it may be used to retrieve many data points, or all the data points of a data object, if that is one of its uses.
It seems like there are many ways this system of data abstraction can be implemented and the interfaces you have created are designed to allow for many different implementations.
There are also several mistakes on the ethereum.org/EIPS/eip-7208 page. In the write method on the data index interface, this line
@param operation Read operation to execute on the data
should probably be
@param operation Write operation to execute on the data
On this line
* @return If the role was revoked (otherwise account didn't had the role)
“had” should be “have”.
Throughout the documentation, this wording is used:
* @return Operation-specific data
But it is unclear what this means. An example or short explanation would be very helpful.
Overall I think this is a good idea, but for someone coming in with fresh eyes it is very hard to understand. A very basic example of the minimum necessary operations and smart contracts needed to implement this would be very helpful for conceptualizing what is the core of this system and what is ‘nice to have.’ How is an index used? Where is the data stored?
It is helpful to read this long wall of comments, but it would be even more helpful if this information could be distilled and put back into the main website especially with some hierarchical charts.
Thank you very much for your work.
I’d also like to recommend that when you introduce the concepts of the data points, the data objects, the registry, and the index, you introduce them in order from the most granular to the most encompassing. This probably means you start with a data point and end with the data point index.
Hi Nick! Thank you for going through the trouble of reading and reviewing the ERC.
You are correct, the currently published description is bound for an update! And we have been working on it. I am confident all your points have been addressed in the next update, which we’ll publish in a few days.
Generally speaking, I’ve found that GPT is prone to hallucinations after reading this thread. The standard has changed a lot since its conception, and the terminology has evolved with it. Still, I’ll try to clarify a few points here:
We’re providing two diagrams, here and here
- The first is a conceptual overview, where we exemplify the separation of domain between storage, access management, and business logic.
- The latter is a more technical diagram, showcasing the interactions between components and their roles.
Data is stored within `DataObjects`, and uniquely identified by `DataPoints`. Each `DataPoint` acts as a unique reference to the information stored on-chain. You can think of a `DataPoint` as a “pointer” to the data being stored and managed by the `DataObject`.
There is a misconception here. The Standard doesn’t have a Data Registry. We propose a `DataPoint Registry`, with its main role being access management, rather than data storage. This is why it does not have a `read()` or `write()` method.
The standard states:
The Data Point Registry is a smart contract entrusted with Data Point access control. Data Managers may request the allocation of Data Points to the Data Point Registry.
Additionally, since each `DataPoint Registry` implements the same structure for all the `DataPoints` under its management, in essence it defines the “space of compatible `DataPoints`”.
In a previous iteration of the standard (in DRAFT, about a year ago), we used to have some form of smart contract that acted as a single data repository. We have left that architecture behind on account of its centralization. It is better to have multiple `Data Objects` capable of coexisting and managing their own internal low-level data-management logic than a single central repository for all data.
Both the `DataPoint Registry` and the `Data Index` are smart contracts designed for access control. The Standard does not impose restrictions on whether they should be the same or separate contracts. That is, if you wanted to build a monolithic implementation, you should be able to do it and still comply with the standard interfaces. However, they are exposed as separate entities because of the separation of domain. I trust these diagrams (here and here) help explain the roles.
Regarding the `Data Index`, the standard says:
The Data Index is a smart contract entrusted with access control. It is a gating mechanism for Data Managers to access Data Objects. If a Data Manager intends to access a Data Point (either by `read()`, `write()`, or any other method), the Data Index should be used for validating access to the data.
So the `DataPoint Registry` implements `DataPoint` access management, while the `Data Index` implements `Data Manager` access control to `Data Objects`. This requires the `Data Index` to implement user control or some form of ID management (either through `address`, NFT gating, or some other mechanism). The Standard does not impose restrictions on the mechanism, leaving it open to the developer to choose which one suits the implementation. However, the standard has this to say about it:
The mechanism for ID management determines a space of compatibility between implementations.
In this context, this means that “Data Portability” or “Data Mobility” between implementations is subject to a compatible `Data Index`. In other words: when a `Data Manager` using `Data Index A` wants to switch implementations to `Data Index B`, the `Data Manager` must consider that the internal ID management of both `Data Index` implementations must be compatible.
I hope this was clarified in the previous points. `Data Points` and `Data Objects` are necessary every time data needs to be indexed and stored for its abstraction.
Indeed, a `Data Index` is always necessary for access control from `Data Managers` to `Data Objects`, which translates into ID management for the end-user (whether an `address` or a contract).
I believe this misunderstanding comes from the previous one regarding the `Data Index` role. Data is not registered in the `Data Index`. Data lives within one or many `Data Objects`, referenced or pointed to by a `Data Point`.
This is correct, and it is the main motivation behind the standard. This architecture is designed to allow for many different implementations to be compatible with one another, enabling data mobility or portability between those implementations.
I think this is within the `read()` function declaration, so it’s OK to have a “Read”, as it’s a `view` function.
Fixed, thank you!
This means that each `read` or `write` will also require an operation to be performed. If it is a `read`, it retrieves some data from the operation performed (i.e. a `read` on a `mapping(address => address)` will retrieve an `address`, while a `read` on a public `bool` variable will retrieve a `bool`). Conversely, the `write` operations must receive operation-specific data to be written.
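The operation-specific data round-trip described above could look like this from a caller’s perspective. This is a hedged sketch: the operation names and selectors are hypothetical; only `IDataObject`’s `read`/`write` shape follows the ERC.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

type DataPoint is bytes32;

// Per the ERC's IDataObject read/write shape.
interface IDataObject {
    function read(DataPoint dp, bytes4 operation, bytes calldata data) external view returns (bytes memory);
    function write(DataPoint dp, bytes4 operation, bytes calldata data) external returns (bytes memory);
}

// Illustrative caller: encodes operation-specific input and decodes
// operation-specific output. Both selectors here are hypothetical.
contract OperationCaller {
    bytes4 private constant OP_GET_OWNER = bytes4(keccak256("ownerOf(uint256)"));
    bytes4 private constant OP_SET_FLAG  = bytes4(keccak256("setFlag(bool)"));

    function ownerOf(IDataObject dobj, DataPoint dp, uint256 id) external view returns (address) {
        // read(): the input is the token id, the output is an abi-encoded address.
        bytes memory out = dobj.read(dp, OP_GET_OWNER, abi.encode(id));
        return abi.decode(out, (address));
    }

    function setFlag(IDataObject dobj, DataPoint dp, bool value) external {
        // write(): the operation-specific data carries the value to store.
        dobj.write(dp, OP_SET_FLAG, abi.encode(value));
    }
}
```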
We have a reference implementation that is currently under audit and will be published in the coming weeks. You’ll find it here.
For now, what you are looking for is also provided here, as part of the current work In Review, and cataloged as an Example Implementation.
For the next version, we were thinking of going for a “Top-Down” approach… but I understand what you’re saying and it makes sense.
Maybe we can go in this order:
- Data Point, as the structure is important to how data is indexed.
- Data Object, as this is where the actual data resides.
- Data Manager, as this is the user-facing contract, implementing any interface.
- Data Point Registry, as this is access management for data.
- Data Index, as this is access-management for business logic.
Once again, I thank you for taking the time and putting in the effort on reviewing this ERC. I cannot stress how important this process is, and how valuable your input was.
Please, let us know if you have any follow-up questions.
Hi Nick! Thank you for your interest in the ERC.
I see a DataPoint as a label for the application which creates & modifies data.
You allocate the DataPoint through the DataPoint Registry for your application, and then use it to define who can access the application’s data. After that, your Data Managers specify the DataPoint to indicate which data they want to access.
GM, sorry if this has already been answered but I’m not sure if I understood it correctly — could this standard theoretically allow for the integration of tokens from other L1s into the Ethereum ecosystem, enabling them to be treated as compatible Data Objects for things like staking and reward distribution? Or is it more focused on abstracting data storage and management without actually transferring assets across chains?
Hey @johnsn, thank you for taking an interest in this ERC!
I think you’ve raised a question that hasn’t been explicitly addressed yet but has been assumed throughout the discussion.
The short answers are: “Mostly YES” and “YES.”
Now for the longer explanation:
In response to this part of your question:
“is it more focused on abstracting data storage and management without actually transferring assets across chains?”
EIPs (Ethereum Improvement Proposals) are the primary mechanism for proposing new features to Ethereum, gathering technical feedback from the community, and documenting design decisions. ERCs, as application-level standards, provide conventions for specific use cases, such as token standards (e.g., ERC-20), name registries, and wallet formats.
This particular standard, ERC-7208, focuses on abstracting on-chain data storage from the logic used to manage it. The motivation here isn’t solely about transferring assets across chains, but rather achieving asset-data interoperability.
On-chain data is typically tied to specific storage mechanisms; for example, ERC-20 uses a mapping of `address -> balance`. Because of these dependencies, other standards often limit how data can be managed or shared. ERC-7208 provides a framework to decouple those dependencies and enable more flexible interactions.
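A sketch of that decoupling: two token-style interfaces reading the same underlying balance storage through one Data Object. Everything here is an illustrative assumption (names, the operation selector, and the omission of id handling); only `IDataObject.read`’s shape follows the ERC.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Illustration of decoupling storage from interface: two different
// token-style facades over the same Data Point. Names are hypothetical.
type DataPoint is bytes32;

interface IDataObject {
    function read(DataPoint dp, bytes4 operation, bytes calldata data) external view returns (bytes memory);
}

// Hypothetical operation selector understood by the shared Data Object.
bytes4 constant OP_BALANCE = bytes4(keccak256("balance(address)"));

// ERC-20-style facade over the shared storage.
contract Erc20StyleManager {
    IDataObject internal dobj;
    DataPoint internal dp;

    constructor(IDataObject o, DataPoint p) { dobj = o; dp = p; }

    function balanceOf(address account) external view returns (uint256) {
        return abi.decode(dobj.read(dp, OP_BALANCE, abi.encode(account)), (uint256));
    }
}

// ERC-1155-style facade over the very same DataPoint.
contract Erc1155StyleManager {
    IDataObject internal dobj;
    DataPoint internal dp;

    constructor(IDataObject o, DataPoint p) { dobj = o; dp = p; }

    function balanceOf(address account, uint256 /*id*/) external view returns (uint256) {
        // Same storage, different interface shape (id handling elided).
        return abi.decode(dobj.read(dp, OP_BALANCE, abi.encode(account)), (uint256));
    }
}
```

Because both facades resolve to the same DataPoint, a balance changed through one interface is immediately visible through the other.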
You can think of most ERCs as theoretical frameworks. For instance, ERC-20 was instrumental in building stablecoins, but the standard only defines the interface for token contracts, not their implementation. Similarly, ERC-7208 proposes a high-level architecture or design framework, leaving the implementation, tooling, and applications to developers.
As for this part of your question:
“Could this standard theoretically allow for the integration of tokens from other L1s into the Ethereum ecosystem, enabling them to be treated as compatible Data Objects for things like staking and reward distribution?”
Yes, absolutely. This is a theoretical application of the standard. In practice, several companies are already working on interoperability solutions based on ERC-7208.
At its core, the standard defines a set of interfaces for smart contracts to interact with one another. These interfaces specify the function names, expected parameters, and guidelines for their usage. However, the actual implementation is up to developers, allowing them to tailor it to their needs.
As stated in the DRAFT:
“We recognize there is no ‘one size fits all’ solution to solve the standardization and interoperability challenges.”
We believe this ERC is a step in the right direction, and its adoption so far has shown that it is a valuable framework. Not just for cross-chain interoperability, but mainly for adapting assets from one standard to another, for embedding regulatory-compliance checks within the logic of certain assets or protocols, for future-proofing already existing solutions, and for building data-oriented smart contract architectures.
This ERC is not intended to solve all these problems directly. Instead, it aims to provide a framework that empowers engineers to develop solutions for them.
Thanks again for your question, and I hope this clarifies things!
Updates pushed. Source code updated and sent for audit.
How does the concept of horizontal data mobility introduced in ERC-7208 redefine the approach to on-chain interoperability, and what implications could this have for the future of standardization within the Ethereum ecosystem?
Thanks @futreall for taking part in the discussion of this standard; your contribution will help newcomers better understand the capabilities of ERC-7208.
Until now, transferring assets between chains has meant bridging from a source chain (SC) to a destination chain (DC), or wrapping assets: locking them on the SC and minting a new token on the DC.
This has several implications and, maybe most importantly, it is very difficult (UX-wise) for a user to operate across chains.
With the new architecture proposed, by abstracting logic from storage, the transfer operation logic remains on the source chain while the asset virtually shares storage between chains, meaning there is no need to wrap or bridge assets.
Do you see how cool that is?
Let’s take this example:
Given an already functional ERC-20 token under ERC-7208, we would have a DataManager that defines the logic of the single-chain transfer operation. In this case, it can be used in a local DeFi protocol as usual.
If we wanted to give this ERC-20 token cross-chain capabilities, the DataManager can decide to upgrade to a DataIndex that has omnichain support (omni = any), extending the capabilities of the same DeFi protocol to use the asset cross-chain.
Regarding your question about the implications for the future of standardization within the Ethereum ecosystem: I’d say cross-chain projects and DeFi are among the hottest topics in the industry right now. A lot of players are bringing solutions to patch or meet the requirements they have for a single use case, and we bring ERC-7208 for the long run.
As you may see, we wanted to make this standard as agnostic and upgradeable as possible, giving full potential for upcoming use cases and solving the fragmentation the industry has across standards.
I believe that the EIP presently possesses some ambiguities in its content, and the use of loose RFC-2119 terminology does not benefit it or its adoption. In my opinion, the Data Point structure and the relevant interfaces need to be strictly defined to encourage ecosystem growth, permit auditors to validate the standard correctly, and allow utility libraries in off-chain software to be utilized across EIP-7208 implementations.
Beyond concerns around RFC-2119 terminology, the EIP definition itself is presently incorrect in the sense that its definitions are incompatible (particularly interface definitions). As an example, the Data Object Interface definition is as follows:
```solidity
interface IDataObject {
    function read(DataPoint dp, bytes4 operation, bytes calldata data) external view returns(bytes memory);
    function write(DataPoint dp, bytes4 operation, bytes calldata data) external returns(bytes memory);
    function setDIImplementation(DataPoint dp, address newImpl) external;
}
```
These definitions make use of the user-defined `DataPoint` variable type, causing implementations to be forced onto a particular `pragma` version range, a trait that is generally avoided in EIPs relating to smart contract implementations that do not make use of opcodes.
Secondly, the user-defined `DataPoint` causes function signatures to not be generate-able from the `interface` declaration alone. Specifically, different function signatures will be generated if the `DataPoint` is a `uint256` variable than if it is a `bytes32` variable, and so on. The EIP mentions that the `DataPoint` SHOULD be `bytes32`, but does not mandate it.
To ensure that the definitions are compatible, one of the following edits should be made:
- The code within the EIP should be clarified as pseudo-code
- The `DataPoint` itself should be set to a particular data type strictly (i.e. use MUST in place of SHOULD in relation to whether it is a `bytes32` data type), and all `DataPoint` mentions should be replaced by the underlying data type (including interface-style data types such as `IDataObject`)
Depending on the desired direction of the EIP by the team that drafted it, any of the aforementioned edits would be acceptable. However, one of the edits must be incorporated to ensure that the EIP is clearly and concisely defined without any ambiguities.
From a long-time auditor’s and ecosystem developer’s PoV, I believe that restricting the `DataPoint` type is acceptable for this particular EIP. Additionally, I believe that some of the `interface` declarations should use MUST terminology as well (i.e. the `IDataObject` and `IDataIndex` read/write function definitions) to impose a common baseline for all EIP-7208 implementations.
Given that the above MUST terminology changes would not actually restrict how the underlying 32 bytes are used in a particular `DataObject`, and the dynamic `data` arguments of the read/write function definitions allow maximum flexibility, I do not believe the above recommendations are unsound.
Disclaimer: This feedback has been provided on behalf of Omniscia as part of a security engagement by the Nexera team in relation to this EIP
Thank you @alex-ppg and Omniscia
We have applied the following suggestions:
- `DataPoint` MUST be `bytes32`
- The ERC enforces the implementation of the interfaces (`IDataObject`, `IDataPointRegistry`, `IDataIndex`) with MUST rather than SHOULD
- Functions within the interfaces (i.e. `read()`/`write()`) now take arguments in the underlying data type (i.e. `address`/`bytes32`)
- Improved DataManager enforcement
Hey @Vitaliyr888, compatibility is ensured because although ERC-7208 is NOT a token standard, it is designed to abstract the underlying asset’s storage of any token and enable the separation of its logic. We’re providing a short list with some examples here.
We’ll happily provide some insights if you’d like to ask about a specific integration or interoperability use case.
ERC-7208 is now on LAST CALL. We’ll be pushing for FINAL in a few weeks.
The audit report is here
The reference implementation is here
Special thanks to Omniscia for their contributions.