Hi @SamWilsn,
You’re right: Data Objects have a lot in common with some data-related Java patterns. At the same time, there are differences.
Data Access Objects are used in Java to abstract data storage from business logic. This is similar to the purpose of the Data Objects we have, but DAOs are limited to generic CRUD operations, while Data Objects provide data-specific operations (like mint/burn/transfer when we store data for a fungible token).
Data Objects are also similar to Components in the ECS pattern, since they inherit data and base logic for it. But I think it would be wrong to represent Data Managers as Entities - because unlike Entities, which usually only have an ID, Data Managers have their own logic and data.
Data Index provides an intermediate layer between business logic (implemented in Data Managers) and data storage (in Data Objects), and this is similar to the Repository Design Pattern, but since the main purpose of Data Index is to manage access to Data Objects, I don’t think it has exactly the same purpose and semantics as Repositories.
We’ve open-sourced an educational example with a minimalistic implementation of ERC-7208 and shared storage between different token interfaces.
What does this mean? A single ERC-1155 smart contract (i.e. fractional ownership of an asset, represented by the NFTs) can now be traded concurrently through its original smart contract or an alternative ERC-20 interface.
Why is this relevant? Because ERC-1155 tokens are usually illiquid assets. The possibility of trading them through an ERC-20 interface enables tapping into liquidity pools through DEXs (e.g. Uniswap and others).
How does it work? By abstracting the storage of the asset from the interface (ERC-1155, ERC-20, etc.) being used to access and modify the underlying asset. This is only possible through ERC-7208.
This is awesome and just what we need for our browser… We built a version of this technology just this last week. We call them WebContracts, and they are supposed to mimic an Apache server, including redirect codes and MIME types (plus a special version of Extended MIME type that we built specifically for web3 links). Our contract works as a transferable file and token store (ERC-20 and ERC-721).
I was actually about to update our contract to take low-level data rather than just strings. And I wanted to expand the token withdrawal to be able to execute any function of the owned token’s contract.
Would anyone look over my latest updates to my WebContract project (v0.1.0 coming soon to return full redirectCodes in the next couple days)?
Would like some help to get an ERC draft together. I’m new here and not sure how or where to start my own thread.
GitHub Repo: technicallyweb3/web-contract
NPM Package @tw3/solidity/contracts/WebContract.sol
Would what I’m building be considered a Data Manager in this scenario? I have also built a LinuxVirtualFS contract which brings functions like ls, mkdir, rm, touch, nanoUpdate and more to make the contract feel like you’re interacting with a Linux machine. What roles would these components play in the ERC-7208 ecosystem?
This is super interesting (!!!) Thanks for asking this type of question, and I hope we can help you sort it out.
It would benefit from being an abstract generic storage, like a File DataObject for file handling. Then the functions can be handled elsewhere on a different contract, perhaps on a different chain as well.
For sure, let’s chat =)
You can always open a new thread on this forum, or if the project already has a few backers, you can open up a PR on the ethereum/ERCs GitHub repository (The Ethereum Request for Comment repository) for a DRAFT proposal.
Yes and No. The DataManager is “the last mile” component that exposes interfaces to the end-user. The storage layer is with the DataObject, the DataPoint is the “pointer” to the data, not unlike an inode, and the DataIndex is the “access manager”. ERC-7208 can help you set up an architecture design pattern that is sustainable and interoperable, by using simple interfaces for separation of domain and to facilitate communication between components (i.e. read/write functions and other common interfaces).
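A rough sketch of how those components can wire together. This is an illustrative assumption only: the `IDataIndexLike` shape, the operation selector, and all contract names here are hypothetical; only `IDataObject`’s `read`/`write` shape follows the interface given in the ERC.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Illustrative sketch only: IDataIndexLike and the operation selector are
// assumptions for this example, not normative ERC-7208 definitions.
type DataPoint is bytes32;

// Storage layer (following the ERC's IDataObject read/write shape).
interface IDataObject {
    function read(DataPoint dp, bytes4 operation, bytes calldata data) external view returns (bytes memory);
    function write(DataPoint dp, bytes4 operation, bytes calldata data) external returns (bytes memory);
}

// Assumed gating layer: validates that the calling Data Manager may touch
// the Data Point, then forwards the call to the Data Object.
interface IDataIndexLike {
    function write(address dobj, DataPoint dp, bytes4 operation, bytes calldata data) external returns (bytes memory);
}

// "Last mile" Data Manager: exposes a user-facing interface and routes all
// storage access through the Data Index.
contract TransferDataManager {
    // Hypothetical operation selector understood by the Data Object.
    bytes4 private constant OP_TRANSFER = bytes4(keccak256("transfer(address,address,uint256)"));

    IDataIndexLike public dataIndex;
    address public dataObject;
    DataPoint private _dp;

    constructor(IDataIndexLike di, address dobj, DataPoint dp) {
        dataIndex = di;
        dataObject = dobj;
        _dp = dp;
    }

    function transfer(address to, uint256 amount) external returns (bool) {
        // The Data Manager never touches storage directly; the Data Index
        // gates the call before it reaches the Data Object.
        dataIndex.write(dataObject, _dp, OP_TRANSFER, abi.encode(msg.sender, to, amount));
        return true;
    }
}
```

The point of the sketch is the call path: user → Data Manager → Data Index → Data Object, with the DataPoint acting as the pointer to the stored data.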
I’ll need to understand a bit more about what you’re trying to achieve, and how deep down the rabbit hole you want to take it. Do you want to have “just the feeling” of interacting with a terminal, or do you want to have a full Linux running on top of EVM? Do you want a FS or do you want just to expose some functions that pretend you are using a FS?
First thing, ERC-7208 is not an ecosystem. It is an ecosystem builder. You can use it to build smart contracts that can interact with one another, including adapting previously deployed assets to be wrapped and exposed through a different implementation. This ERC can provide you with the framework to build those things you mentioned in the previous message and much more, but the motivation will determine the architectural design you wish to follow. Where is this code going to run? Is it a private chain, public, or localhost? Who is going to be using this, and for what purpose?
An over-simplification can look like this:
- File Data can be stored in a `bytes32` format (if you want to have a 32-byte block size, or you can simply use `bytes` blobs), indexed by a `DataPoint`, which is issued by a `DataPointRegistry`.
- The `DataPoint` is managed by a File `DataObject` implementing the “low-level file management” logic.
- The Access Management can be implemented through a `DataIndex` that allows/denies access based on user/application privileges.
- The Logic facing the user/application (i.e. `mkdir`/`rm`/`touch`/…etc.) can be exposed through one or multiple `DataManager` contracts.
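A hedged sketch of what the File `DataObject` bullet above could look like. All names, operation selectors, and the access-check shape here are hypothetical assumptions, not part of the standard.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Hypothetical sketch of a minimal file-oriented Data Object. The operation
// selectors and the Data Index gating shape are illustrative assumptions.
type DataPoint is bytes32;

contract FileDataObject {
    // Hypothetical operations this object understands.
    bytes4 internal constant OP_WRITE_FILE = bytes4(keccak256("writeFile(bytes32,bytes)"));
    bytes4 internal constant OP_READ_FILE  = bytes4(keccak256("readFile(bytes32)"));

    // File blobs, namespaced by DataPoint and keyed by a path hash.
    mapping(bytes32 => mapping(bytes32 => bytes)) private files;

    // Stand-in for the Data Index that allows/denies access.
    address public immutable dataIndex;

    constructor(address di) {
        dataIndex = di;
    }

    modifier onlyIndex() {
        require(msg.sender == dataIndex, "access denied");
        _;
    }

    function write(DataPoint dp, bytes4 operation, bytes calldata data) external onlyIndex returns (bytes memory) {
        if (operation == OP_WRITE_FILE) {
            (bytes32 pathHash, bytes memory blob) = abi.decode(data, (bytes32, bytes));
            files[DataPoint.unwrap(dp)][pathHash] = blob;
            return "";
        }
        revert("unknown operation");
    }

    function read(DataPoint dp, bytes4 operation, bytes calldata data) external view returns (bytes memory) {
        if (operation == OP_READ_FILE) {
            bytes32 pathHash = abi.decode(data, (bytes32));
            return files[DataPoint.unwrap(dp)][pathHash];
        }
        revert("unknown operation");
    }
}
```

User-facing verbs like `mkdir`/`rm`/`touch` would then live in one or more Data Manager contracts that encode their arguments into these low-level operations.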
In preparation for going from “In Review” to “Last Call”, the team has released some technical documentation that we hope you’ll find helpful.
The Minimalistic Example implementation repository has been closed for a few days while undergoing an audit with Omniscia. After completing the process, the code will be renamed as the Reference Implementation and the report will be made public.
We’ll be pushing for the ERC to go for LAST CALL, and soon after FINAL.
After spending 3 days reading the (very trite) description of ERC-7208 and its related interfaces, going over the example implementations, and further dissecting it with ChatGPT, I would highly recommend you update your description and explanation here: ethereum.org/EIPS/eip-7208. The main issues I had trouble with are that it’s not clear what the hierarchy of this system is: what containers contain what, and where the actual data lives.
From someone with fresh eyes: You have something called a data registry which appears to contain the actual data but has no read and write methods. Furthermore, you have permissions on the data registry and then also on the data index, with almost identical descriptions for both. It is not clear how the data registry is different from the data index and what the responsibilities of each are. A chart, or some of the charts you have created, would be great on this page.
It is not clear how a data object relates to a data point. I have discerned that it is a one-to-many relationship (I’d put the probability of that at 80% at this point), but it is not explicitly stated anywhere on the main informational page. It is also not clear if using a data object is always necessary.
It is not clear if a data index is always necessary, or how data is registered in the data index. The usefulness of the data index is also not clear, because there are no example methods or any description of how it may be used to retrieve many data points, or all the data points of a data object, if that is one of its uses.
It seems like there are many ways this system of data abstraction can be implemented and the interfaces you have created are designed to allow for many different implementations.
There are also several mistakes on the ethereum.org/EIPS/eip-7208 page. In the write method on the data index interface, this line
@param operation Read operation to execute on the data
should probably be
@param operation Write operation to execute on the data
On this line
* @return If the role was revoked (otherwise account didn't had the role)
“had” should be “have”.
Throughout the documentation, this wording is used:
* @return Operation-specific data
But it is unclear what this means. An example or short explanation would be very helpful.
Overall I think this is a good idea, but for someone coming in with fresh eyes it is very hard to understand. A very basic example of the minimum necessary operations and smart contracts needed to implement this would be very helpful for conceptualizing what is the core of this system and what is ‘nice to have.’ How is an index used? Where is the data stored?
It is helpful to read this long wall of comments, but it would be even more helpful if this information could be distilled and put back into the main website especially with some hierarchical charts.
Thank you very much for your work.
I’d also like to recommend that when you introduce the concepts of the data points, the data objects, the registry, and the index, you introduce them in order from the most granular to the most encompassing. This probably means you start with a data point and end with the data point index.
Hi Nick! Thank you for going through the trouble of reading and reviewing the ERC.
You are correct, the currently published description is bound for an update! And we have been working on it. I am confident all your points have been addressed in the next update, which we’ll publish in a few days.
Generally speaking, I’ve found that GPT is prone to hallucinations after reading this thread. The standard has changed a lot since its conception, and the terminology has evolved with it. Still, I’ll try to clarify a few points here:
We’re providing two diagrams, here and here
- The first is a conceptual overview, where we exemplify the separation of domain between storage, access management, and business logic.
- The latter is a more technical diagram, showcasing the interactions between components and their roles.
Data is stored within `DataObjects`, and uniquely identified by `DataPoints`. Each `DataPoint` acts as a unique reference to the information stored on-chain. You can think of a `DataPoint` as a “pointer” to the data being stored and managed by the `DataObject`.
There is a misconception here. The Standard doesn’t have a Data Registry. We propose a `DataPoint Registry`, with its main role being access management, rather than data storage. This is why it does not have a `read()` or `write()` method.
The standard states:
The Data Point Registry is a smart contract entrusted with Data Point access control. Data Managers may request the allocation of Data Points to the Data Point Registry.
Additionally, since each `DataPoint Registry` implements the same structure for all the `DataPoints` under its management, in essence it defines the “space of compatible `DataPoints`”.
In a previous iteration of the standard (in DRAFT, about a year ago), we used to have some form of smart contract that acted as a single data repository. We have left that architecture behind on account of its centralization. It is better to have multiple `Data Objects` capable of coexisting and managing their own internal low-level data-management logic than a single central repository for all data.
Both the `DataPoint Registry` and the `Data Index` are smart contracts designed for access control. The Standard does not impose restrictions on whether they should be the same or separate contracts. That is, if you wanted to build a monolithic implementation, you should be able to do it and still comply with the standard interfaces. However, they are exposed as separate entities because of the separation of domain. I trust these diagrams (here and here) help explain the roles.
Regarding the `Data Index`, the standard says:
The Data Index is a smart contract entrusted with access control. It is a gating mechanism for Data Managers to access Data Objects. If a Data Manager intends to access a Data Point (either by `read()`, `write()`, or any other method), the Data Index should be used for validating access to the data.
So the `DataPoint Registry` implements `DataPoint` access management, while the `Data Index` implements `Data Manager` access control to `Data Objects`. This requires the `Data Index` to implement user control or some form of ID management (either through `address`, NFT gating, or some other mechanism). The Standard does not impose restrictions on the mechanism, leaving it open to the developer to choose which one suits the implementation. However, the standard has this to say about it:
The mechanism for ID management determines a space of compatibility between implementations.
In this context, this means that “Data Portability” or “Data Mobility” between implementations is subject to a compatible `Data Index`. In other words: when a `Data Manager` using `Data Index A` wants to switch implementations to `Data Index B`, the `Data Manager` must consider that the internal ID management of both `Data Index` implementations must be compatible.
I hope this was clarified in the previous points. `Data Points` and `Data Objects` are necessary every time data needs to be indexed and stored for its abstraction.
Indeed, a `Data Index` is always necessary for access control from `Data Managers` to `Data Objects`, which translates into ID management for the end-user (whether an `address` or a contract).
I believe this misunderstanding comes from the previous one regarding the `Data Index` role. Data is not registered in the `Data Index`. Data lives within one or many `Data Objects`, referenced or pointed to by a `Data Point`.
This is correct, and it is the main motivation behind the standard. This architecture is designed to allow for many different implementations to be compatible with one another, enabling data mobility or portability between those implementations.
I think this is within the `read()` function declaration, so it’s OK to have a “Read”, as it’s a `view` function.
Fixed, thank you!
This means that each `read` or `write` will also require an operation to be performed. If it is a `read`, it retrieves some data from the operation performed (i.e. a `read` on a `mapping(address => address)` will retrieve an `address`, while a `read` on a public `bool` variable will retrieve a `bool`). Conversely, the `write` operations must receive operation-specific data to be written.
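The operation-specific data round-trip described above could look like this from a caller’s perspective. This is a hedged sketch: the operation names and selectors are hypothetical; only `IDataObject`’s `read`/`write` shape follows the ERC.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

type DataPoint is bytes32;

// Per the ERC's IDataObject read/write shape.
interface IDataObject {
    function read(DataPoint dp, bytes4 operation, bytes calldata data) external view returns (bytes memory);
    function write(DataPoint dp, bytes4 operation, bytes calldata data) external returns (bytes memory);
}

// Illustrative caller: encodes operation-specific input and decodes
// operation-specific output. Both selectors here are hypothetical.
contract OperationCaller {
    bytes4 private constant OP_GET_OWNER = bytes4(keccak256("ownerOf(uint256)"));
    bytes4 private constant OP_SET_FLAG  = bytes4(keccak256("setFlag(bool)"));

    function ownerOf(IDataObject dobj, DataPoint dp, uint256 id) external view returns (address) {
        // read(): the input is the token id, the output is an abi-encoded address.
        bytes memory out = dobj.read(dp, OP_GET_OWNER, abi.encode(id));
        return abi.decode(out, (address));
    }

    function setFlag(IDataObject dobj, DataPoint dp, bool value) external {
        // write(): the operation-specific data carries the value to store.
        dobj.write(dp, OP_SET_FLAG, abi.encode(value));
    }
}
```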
We have a reference implementation that is currently under audit and will be published in the coming weeks. You’ll find it here.
For now, what you are looking for is also provided here, as part of the current work In Review, and cataloged as an Example Implementation.
For the next version, we were thinking of going for a “Top-Down” approach… but I understand what you’re saying and it makes sense.
Maybe we can go in this order:
- Data Point, as the structure is important to how data is indexed.
- Data Object, as this is where the actual data resides.
- Data Manager, as this is the user-facing contract, implementing any interface.
- Data Point Registry, as this is access management for data.
- Data Index, as this is access-management for business logic.
Once again, I thank you for taking the time and putting in the effort on reviewing this ERC. I cannot stress how important this process is, and how valuable your input was.
Please, let us know if you have any follow-up questions.
Hi Nick! Thank you for your interest in the ERC.
I see a DataPoint as a label for the application which creates & modifies data.
You allocate the DataPoint through the DataPoint Registry for your application, and then use it to define who can access the application’s data. After that, your Data Managers specify the DataPoint to indicate which data they want to access.
GM, sorry if this has already been answered but I’m not sure if I understood it correctly — could this standard theoretically allow for the integration of tokens from other L1s into the Ethereum ecosystem, enabling them to be treated as compatible Data Objects for things like staking and reward distribution? Or is it more focused on abstracting data storage and management without actually transferring assets across chains?
Hey @johnsn, thank you for taking an interest in this ERC!
I think you’ve raised a question that hasn’t been explicitly addressed yet but has been assumed throughout the discussion.
The short answers are: “Mostly YES” and “YES.”
Now for the longer explanation:
In response to this part of your question:
“is it more focused on abstracting data storage and management without actually transferring assets across chains?”
EIPs (Ethereum Improvement Proposals) are the primary mechanism for proposing new features to Ethereum, gathering technical feedback from the community, and documenting design decisions. ERCs, as application-level standards, provide conventions for specific use cases, such as token standards (e.g., ERC-20), name registries, and wallet formats.
This particular standard, ERC-7208, focuses on abstracting on-chain data storage from the logic used to manage it. The motivation here isn’t solely about transferring assets across chains, but rather achieving asset-data interoperability.
On-chain data is typically tied to specific storage mechanisms; for example, ERC-20 uses a mapping of `address -> balance`. Because of these dependencies, other standards often limit how data can be managed or shared. ERC-7208 provides a framework to decouple those dependencies and enable more flexible interactions.
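A sketch of that decoupling: two token-style interfaces reading the same underlying balance storage through one Data Object. Everything here is an illustrative assumption (names, the operation selector, and the omission of id handling); only `IDataObject.read`’s shape follows the ERC.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Illustration of decoupling storage from interface: two different
// token-style facades over the same Data Point. Names are hypothetical.
type DataPoint is bytes32;

interface IDataObject {
    function read(DataPoint dp, bytes4 operation, bytes calldata data) external view returns (bytes memory);
}

// Hypothetical operation selector understood by the shared Data Object.
bytes4 constant OP_BALANCE = bytes4(keccak256("balance(address)"));

// ERC-20-style facade over the shared storage.
contract Erc20StyleManager {
    IDataObject internal dobj;
    DataPoint internal dp;

    constructor(IDataObject o, DataPoint p) { dobj = o; dp = p; }

    function balanceOf(address account) external view returns (uint256) {
        return abi.decode(dobj.read(dp, OP_BALANCE, abi.encode(account)), (uint256));
    }
}

// ERC-1155-style facade over the very same DataPoint.
contract Erc1155StyleManager {
    IDataObject internal dobj;
    DataPoint internal dp;

    constructor(IDataObject o, DataPoint p) { dobj = o; dp = p; }

    function balanceOf(address account, uint256 /*id*/) external view returns (uint256) {
        // Same storage, different interface shape (id handling elided).
        return abi.decode(dobj.read(dp, OP_BALANCE, abi.encode(account)), (uint256));
    }
}
```

Because both facades resolve to the same DataPoint, a balance changed through one interface is immediately visible through the other.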
You can think of most ERCs as theoretical frameworks. For instance, ERC-20 was instrumental in building stablecoins, but the standard only defines the interface for token contracts, not their implementation. Similarly, ERC-7208 proposes a high-level architecture or design framework, leaving the implementation, tooling, and applications to developers.
As for this part of your question:
“Could this standard theoretically allow for the integration of tokens from other L1s into the Ethereum ecosystem, enabling them to be treated as compatible Data Objects for things like staking and reward distribution?”
Yes, absolutely. This is a theoretical application of the standard. In practice, several companies are already working on interoperability solutions based on ERC-7208.
At its core, the standard defines a set of interfaces for smart contracts to interact with one another. These interfaces specify the function names, expected parameters, and guidelines for their usage. However, the actual implementation is up to developers, allowing them to tailor it to their needs.
As stated in the DRAFT:
“We recognize there is no ‘one size fits all’ solution to solve the standardization and interoperability challenges.”
We believe this ERC is a step in the right direction, and its adoption so far has shown that it is a valuable framework. Not just for cross-chain interoperability, but mainly for adapting assets from one standard to another, for embedding regulatory-compliance checks within the logic of certain assets or protocols, for future-proofing already existing solutions, and for building data-oriented smart contract architectures.
This ERC is not intended to solve all these problems directly. Instead, it aims to provide a framework that empowers engineers to develop solutions for them.
Thanks again for your question, and I hope this clarifies things!
Updates pushed. Source code updated and sent for audit.
How does the concept of horizontal data mobility introduced in ERC-7208 redefine the approach to on-chain interoperability, and what implications could this have for the future of standardization within the Ethereum ecosystem?
Thanks @futreall for taking part in the discussion of this standard; your contribution will help newcomers better understand the capabilities of ERC-7208.
Until now, transferring assets between chains has meant bridging from a source chain (SC) to a destination chain (DC), or wrapping assets: locking them on the SC and minting a new token on the DC.
This has several implications and, maybe most importantly, it is very difficult (UX-wise) for a user to operate across chains.
With the new architecture proposed, by abstracting logic from storage, the transfer operation logic remains on the source chain while the asset virtually shares storage between chains, meaning there is no need to wrap or bridge assets.
Do you see how cool that is?
Let’s take this example:
Given an already functional ERC-20 token under ERC-7208, we would have a DataManager that defines the logic of the single-chain transfer operation. In this case, it can be used in a local DeFi protocol as usual.
If we wanted to give this ERC-20 token cross-chain capabilities, the DataManager can decide to upgrade to a DataIndex that has omnichain support (omni = any), extending the capabilities of the same DeFi protocol to use the asset cross-chain.
Regarding your question about the implications for the future of standardization within the Ethereum ecosystem: I’d say cross-chain projects and DeFi are among the hottest topics in the industry right now. A lot of players are bringing solutions to patch or meet the requirements they have for a single use case, and we bring ERC-7208 for the long run.
As you may see, we wanted to make this standard as agnostic and upgradeable as possible, giving full potential for upcoming use cases and solving the fragmentation the industry has across standards.
I believe that the EIP presently possesses some ambiguities in its content, and the use of loose RFC-2119 terminology does not benefit it or its adoption. In my opinion, the Data Point structure and the relevant interfaces need to be strictly defined to encourage ecosystem growth, permit auditors to validate the standard correctly, and allow utility libraries in off-chain software to be utilized across EIP-7208 implementations.
Beyond concerns around RFC-2119 terminology, the EIP definition itself is presently incorrect in the sense that its definitions are incompatible (particularly interface definitions). As an example, the Data Object Interface definition is as follows:
```solidity
interface IDataObject {
    function read(DataPoint dp, bytes4 operation, bytes calldata data) external view returns(bytes memory);
    function write(DataPoint dp, bytes4 operation, bytes calldata data) external returns(bytes memory);
    function setDIImplementation(DataPoint dp, address newImpl) external;
}
```
These definitions make use of the user-defined `DataPoint` variable type, causing implementations to be forced onto a particular `pragma` version range, a trait that is generally avoided in EIPs relating to smart contract implementations that do not make use of opcodes.
Secondly, the user-defined `DataPoint` causes function signatures to not be generate-able from the `interface` declaration alone. Specifically, different function signatures will be generated if the `DataPoint` is a `uint256` variable than if it is a `bytes32` variable, and so on. The EIP mentions that the `DataPoint` SHOULD be `bytes32`, but does not mandate it.
To ensure that the definitions are compatible, one of the following edits should be made:
- The code within the EIP should be clarified as pseudo-code
- The `DataPoint` itself should be set to a particular data type strictly (i.e. use MUST in place of SHOULD in relation to whether it is a `bytes32` data type), and all `DataPoint` mentions should be replaced by the underlying data type (including interface-style data types such as `IDataObject`)
Depending on the desired direction of the EIP by the team that drafted it, any of the aforementioned edits would be acceptable. However, one of the edits must be incorporated to ensure that the EIP is clearly and concisely defined without any ambiguities.
From a long-time auditor’s and ecosystem developer’s PoV, I believe that restricting the `DataPoint` type is acceptable for this particular EIP. Additionally, I believe that some of the `interface` declarations should use MUST terminology as well (i.e. the `IDataObject` and `IDataIndex` read/write function definitions) to impose a common baseline for all EIP-7208 implementations.
Given that the above MUST terminology changes would not actually restrict how the underlying 32 bytes are used in a particular `DataObject`, and the dynamic `data` arguments of the read/write function definitions allow maximum flexibility, I do not believe the above recommendations are unsound.
Disclaimer: This feedback has been provided on behalf of Omniscia as part of a security engagement by the Nexera team in relation to this EIP
Thank you @alex-ppg and Omniscia
We have applied the following suggestions:
- `DataPoint` MUST be `bytes32`
- The ERC enforces the implementation of the interfaces (`IDataObject`, `IDataPointRegistry`, `IDataIndex`) with MUST rather than SHOULD
- Functions within the interfaces (i.e. `read()`/`write()`) now take arguments in the underlying data type (i.e. `address`/`bytes32`)
- Improved DataManager enforcement
Hey @Vitaliyr888, compatibility is ensured because although ERC-7208 is NOT a token standard, it is designed to abstract the underlying asset’s storage of any token and enable the separation of its logic. We’re providing a short list with some examples here.
We’ll happily provide some insights if you’d like to ask about a specific integration or interoperability use case.
ERC-7208 is now on LAST CALL. We’ll be pushing for FINAL in a few weeks.
The audit report is here
The reference implementation is here
Special thanks to Omniscia for their contributions.