EIP-4444: Bound Historical Data in Execution Clients

Then I’d like a bit more discussion of what the solutions are, and of who is responsible for fixing what is going to get broken.

A. This is only necessary if one wants to validate the entire blockchain from genesis, which I argue is an uncommon operation at best, and which I suspect will eventually be something that no one does.

B. The old clients don’t have to be maintained, they only need to continue to exist. No updates need to be applied to them.

Preserving the history of Ethereum is fundamental and we believe there are various out-of-band ways to achieve this.

It should be stated what preserving the history of Ethereum is fundamental to, and how important state history preservation is relative to other properties of the protocol.

And if it is as fundamental as stated, why do the authors not propose an alternative, sustainable mechanism for it?

Adding to this: the current situation is not indefinitely sustainable, but the current requirement does sufficiently preserve and provide state history. The burden placed on network users is heavy and growing, but there needs to be a realistic plan for how to maintain this widely-used aspect of the Ethereum network.

Yes. The Ethereum blockchain is, fundamentally, an immutable record of transactions – value transfers and valid computations. It seems to me that there should be some standard protocol for that, however the history is stored.

Additionally, if we use MUST NOT, then a peer that breaks this requirement SHOULD be disconnected and penalised.

I agree completely that “preserving this widely-used aspect of the Ethereum network” is of utmost importance. But why does it have to be a capability that the node maintains?

It seems to me that maintaining this ability in the node is exactly the problem. If the data is immutable, the entire state and the entire history of the chain can be written once to some content-addressable store (such as IPFS), and as long as someone preserves that data, anyone can get it. An Ethereum node would not even have to be involved. All one would need is the hash of where the immutable data is stored.

Fresh data can be written as an ‘addendum’, so there would have to be some sort of published manifest of the original hash and the periodic ongoing hashes. I would argue that the hash of the manifest should be part of the chain; short of that, the community would have to maintain it (perhaps by publishing the hash to a smart contract).

My point is that because the data is immutable, and because we have content-addressed storage to store it in, there’s literally no need to continue providing the ability to regenerate this data from genesis. The only outcome of regenerating from genesis would be arriving at the same IPFS hash you already have.
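A minimal sketch of how such a write-once store plus manifest chain could work (hypothetical layout; raw SHA-256 stands in here for real IPFS CIDs, and the chunk contents are placeholders):

```python
import hashlib
import json

def content_address(data: bytes) -> str:
    """Content address of an immutable chunk (plain SHA-256 here;
    IPFS would wrap a digest like this in a CID)."""
    return hashlib.sha256(data).hexdigest()

def build_manifest(chunks, previous_manifest_hash=None):
    """A manifest listing each chunk's hash, plus a link to the previous
    manifest, so periodic addenda form a verifiable chain."""
    return {
        "previous": previous_manifest_hash,
        "chunks": [content_address(c) for c in chunks],
    }

def manifest_hash(manifest) -> str:
    """The single hash the community (or a smart contract) would publish."""
    return content_address(json.dumps(manifest, sort_keys=True).encode())

# Original history, then a later addendum linking back to it:
m0 = build_manifest([b"blocks 0-999", b"blocks 1000-1999"])
m1 = build_manifest([b"blocks 2000-2999"],
                    previous_manifest_hash=manifest_hash(m0))
```

Because the hashing is deterministic, anyone re-deriving the same history arrives at the same manifest hash, which is exactly why regenerating from genesis adds nothing once the hash is published.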

On top of that, there’s no reason the clients have to maintain this capability, and the entire purpose of this EIP is to remove that requirement. This might possibly open a whole new area of innovation related to providing access to this historical data – which I think would allow for amazingly more interesting dApps than we currently have (because of the need for a node to get to it).

Furthermore, if the historical data is chunked into manageable pieces and properly indexed by chunk (with a Bloom filter in front of each chunk), each individual user could easily download and pin only the portion of the data they are interested in, thereby distributing this historical data throughout the community as a natural by-product of using it. (See TrueBlocks.)
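The chunk-plus-Bloom-filter scheme described above could be sketched like this (toy parameters and hypothetical chunk names; TrueBlocks’ actual index format differs):

```python
import hashlib

class BloomFilter:
    """Tiny illustrative Bloom filter: k hash probes into an m-bit mask."""
    def __init__(self, m_bits: int = 4096, k: int = 3):
        self.m, self.k, self.bits = m_bits, k, 0

    def _probes(self, item: bytes):
        # Derive k probe positions from salted SHA-256 digests.
        for i in range(self.k):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: bytes):
        for p in self._probes(item):
            self.bits |= 1 << p

    def maybe_contains(self, item: bytes) -> bool:
        # False means definitely absent; True means "download the chunk and check".
        return all(self.bits >> p & 1 for p in self._probes(item))

# One filter per history chunk: a user consults the (small) filters first
# and only downloads and pins the chunks that may mention their address.
chunk_filters = {"chunk-0": BloomFilter(), "chunk-1": BloomFilter()}
chunk_filters["chunk-0"].add(b"0xsome-user-address")
needed = [name for name, f in chunk_filters.items()
          if f.maybe_contains(b"0xsome-user-address")]
```

The filters are tiny compared to the chunks they front, so the full set of filters can be fetched cheaply; false positives only cost an unnecessary chunk download, never a missed result.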

I agree that people are uploading a lot of stuff to the blockchain, especially with the rise of NFTs, but also with unoptimized token contracts that cause state bloat. But has anyone thought about other examples and use cases of blockchain? What if people uploaded important documents like birth certificates to the blockchain, given that the mission of a blockchain is a ledger that stores information on-chain forever? Suddenly those people won’t be able to access their documents because some devs thought it was a good idea to delete blockchain state after some time… Another great example is NFTs, especially NFTs that were made before ERC-721, i.e. in 2017 and earlier, like CryptoPunks. Those will be gone forever.

From a developer’s perspective, I’m sure there is a lot of data that is not important and doesn’t need to be stored.

Probably a better idea would be to store data on full nodes and have light nodes, or to think about different ways to make the infrastructure more efficient without having to delete and lose data.

Don’t get me wrong, I’m just trying to think realistically from a non-core-dev perspective, and I’m against this EIP.

Ethereum was never designed to be a permanent data storage system. Something like FileCoin is much better suited for long term data storage, and they have incentives built into the protocol to ensure that the cost of long term storage is paid for by those seeking it.

Also, this EIP removes history but not state. State expiry is also an active area of research, but out of scope for this thread.

I have been an Ethereum watcher and dapp developer for years, and have admired all the EIPs that have come through. However, this EIP is deeply troubling. I think this would be an extremely negative EIP to implement. Here’s why:

Ethereum was touted over the years as the system to build “unstoppable” apps, and I loved it. With this EIP, these “unstoppable” apps will simply, well, stop (at least, their UXs/UIs will). It forces substantial and necessary adoption of some a.n.other (unknown and uncertain) protocol entirely outside of the Ethereum system. Pulling the “promise” of data persistence out from under old apps will be disastrous for the long-term reputation of dapp development on Ethereum.

For those who say Ethereum was never meant to store data permanently, that is simply not true. It was! It’s specced that way and therefore used that way. And with this EIP, it no longer will be. This is a truly fundamental change to (and destruction of) the Ethereum value proposition. Providing canonical transaction history tightly coupled with canonical transaction generation is CRITICAL to Ethereum’s value proposition. Offloading this entirely outside of Ethereum’s control gives away (and destroys) the future utility of Ether. Why? Because the whole point of canonical transaction generation is that you also have canonical transaction history.

With EIP-4444, it is possible to lose entire chunks of past transaction history. Forever. As in gone. No one knows who sent (or did) what to whom, or when.

There’s a reason why JP Morgan, HSBC, and other long-storied banks are still around. It’s because you can rely on them having, somewhere inside their big walled offices, a transaction history going back over 100 years. This builds TRUST (yes, centralized trust). And that’s why (other) people keep coming back to them (even if blockchain people don’t). The old history may be hand-written in log books; sure, it’s not convenient, but it’s there.

Now imagine Ethereum were just such an organisation. You go to Ethereum in 12 years’ time and you ask (in code): what transactions happened on this account 10 years ago? The reply: oh, go to Graph/Bittorrent/IPFS/a.n.other, we don’t keep that. You try numerous of these organisations and, by some stroke of bad luck, they messed up on your particular EOA/contract (or their tech died), and it’s gone. Would you trust the Ethereum system, simply because it moved to cool stateless consensus and therefore decided it didn’t need to include anything as boring as past history anymore? I wouldn’t.

What this EIP fails to realise is that the value of canonical decentralised transactions in almost all real world use cases isn’t canonical decentralised transactions, but canonical history of decentralised transactions. The blockchain that does this will win. Ethereum does both today, but with this EIP, it won’t any more. That’s what I would call the broken promise of Ethereum if this EIP happens, and without a simultaneous EIP that ensures canonical transaction history at the Ethereum protocol level. Perhaps some compromise can be made on time horizon. “Forever” is not enforceable, but say 20 years (for example) is, and good enough for most real world use cases (but even then not good enough for academic records, for example). Remember, the well known banks most of us also use can keep (centralised) canonical records for over a century, and universities can keep records for multiple centuries.

Arweave fixes your problem. Period.

It is a.n.other protocol. Canonical transaction history is absolutely critical to Ethereum. Why entrust it to an outside party? If Arweave figures out Smartweave properly, let’s see where Ether ends up. More power to Arweave :slight_smile:

Hi! I co-lead ResDev for Protocol Labs. We’re more than happy to help make a provably immutable history of ETH available forever on Filecoin & IPFS, and it would not require any protocol changes. Please don’t hesitate to let me know if anyone would like help doing this!

Thanks!

PEEPanEIP-4444: Bound Historical Data in Execution Clients with @ralexstokes

My two cents. I disagree with this EIP, both in general and about the contents.

It doesn’t add any real advantage to clients; I can already prune txs if I’d like to.

It removes a feature without offering any solution; everything is “out-of-scope” or “just use other p2p network for that”.

We can already use centralized servers or content-addressable networks (IPFS, Swarm, etc.) to store data, even Ethereum-related data (e.g., TrueBlocks indexes), but making that mandatory makes no sense to me.

Ethereum is a p2p network; using the network to share data and reach consensus is what it should be used for. And the whole Ethereum ecosystem should be self-consistent, without relying on other p2p networks.

Furthermore, storage is a commodity by now, and it will be even more so in the future. Four hundred GB (and counting) of old data is really nothing to be afraid of: I can buy a 4 TB HDD for $30, and it will last for another 20 years of historical txs. Storage technology grows and improves much faster than Ethereum’s tx volume.
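A back-of-envelope check of that claim, using the post’s own figures (the 180 GB/year growth rate is an assumption back-derived from the 20-year estimate, not a measured number):

```python
# Figures from the post above; growth rate is assumed for illustration.
history_gb = 400            # approximate size of ancient history today
growth_gb_per_year = 180    # assumed growth, implied by the 20-year figure
disk_gb = 4000              # 4 TB HDD
disk_cost_usd = 30

# How long until the disk fills, and what the raw storage costs per GB:
years_until_full = (disk_gb - history_gb) / growth_gb_per_year
cost_per_gb_usd = disk_cost_usd / disk_gb
```

Under these assumptions the disk lasts 20 years at well under a cent per GB, which is the poster’s point: raw storage cost is not the binding constraint; bandwidth and serving incentives are the harder part.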

In my opinion, the right path is to continue improving light clients, spreading more nodes among constrained devices and hardware.

The wrong path is transforming forcibly all full nodes into light clients.

Hi, you are saying this, but TheGraph just censored Tornado Cash.
The underlying problem is that things like databases shared over IPFS or Google BigQuery wouldn’t allow us to run a realtime service, as it would take a day to download updates. We also need very old data in order to let a user withdraw funds deposited several years ago.

This isn’t much of a problem currently, as it just means we will need our own home-hosted (not cloud) node with parity_tracing, along with something like OpenEthereum’s Fatdb for getting smart contract storage ranges at past blocks. Such a thing is possible because the full ancient data is broadcast and updated in real time, which shows why it is important for nodes/RPCs to be able to broadcast full chain history over the p2p network.

If third-party services are made required, then Ethereum will be decentralized at the currency level, like Bitcoin, while dapps like casinos or yield farming will be fully permissioned: they will have to register with authorities and pay an army of lawyers in order to be allowed to run. That means DeFi won’t be much different from fintechs in the traditional banking system which rent their computing hardware.

I think keeping history on the p2p network is definitely worth the reduced transactions-per-second throughput, unless we decide to behave like SWIFT or MasterCard or Visa in order to run as fast as they do.

No, their point is that a large database reduces transactions-per-second throughput. But I think this is far-fetched: Visa and Mastercard are, in my country, required to record data on all transactions from the past 10 years for law enforcement, and this doesn’t prevent them from running.

Would this be in real time for each block, like the current p2p network, or would there be daily update pushes?


I propose something done by the cloud industry, which bills money to keep data. Deposit Ether on a smart contract: at each block, a very tiny fee is deducted. When the contract’s balance drops to 0, its code/storage is SUICIDEd and its relevant transactions are deleted from history.

That way, what is needed is kept while what is forgotten is destroyed. This is also more efficient than the proposal, since stuff can be destroyed before 1 year.
Please also notice that destroying what is unused is also how human memory works, and things always fit within the size of a human skull.
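A rough model of this rent scheme (plain Python, not EVM code; the class name, deposit, and fee figures are all placeholders for illustration):

```python
class RentedContract:
    """Hypothetical model of a contract that pays a tiny fee each block
    and is destroyed, with its history made prunable, once the deposit
    runs out."""
    def __init__(self, deposit_wei: int, fee_per_block_wei: int):
        self.balance = deposit_wei
        self.fee = fee_per_block_wei
        self.alive = True

    def on_new_block(self):
        if not self.alive:
            return
        self.balance -= self.fee
        if self.balance <= 0:
            # Analogous to the SUICIDE/SELFDESTRUCT step in the proposal:
            # code, storage, and related history become deletable.
            self.alive = False

# A 1000-wei deposit at 1 wei/block funds exactly 1000 blocks of history:
c = RentedContract(deposit_wei=1000, fee_per_block_wei=1)
blocks_survived = 0
while c.alive:
    c.on_new_block()
    blocks_survived += 1
```

The design choice is that retention time is proportional to deposit, so whoever values the data pays for exactly as long as they value it.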

ytrezq: “Deposit Ethers on smart contracts: at each block, a very tiny fee is removed”

Or maybe, stake the ether and use the proceeds to pay for the data storage in perpetuity.
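The arithmetic behind the perpetuity idea: at staking yield r, a deposit D covers an annual storage cost C forever whenever D × r ≥ C. A quick illustration (both figures are assumptions, not protocol values):

```python
staking_yield = 0.04             # assumed 4% annual staking return
annual_storage_cost_eth = 0.1    # assumed yearly cost of one archive copy

# Smallest deposit whose yield alone covers the storage cost forever:
required_deposit_eth = annual_storage_cost_eth / staking_yield
```

Unlike the per-block rent scheme, the principal is never consumed, so the data is paid for in perpetuity rather than for a fixed horizon.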