Formalizing decentralization goals in the context of larger L1 gas limits and 2020s-era tech

One important thing to understand is that the challenge is less about storage capacity and more about database architecture. Most clients use LevelDB as the key-value database for storing state. LevelDB is quite efficient, and there are few viable alternatives.

LevelDB stores key-value pairs in read-only “level” files, organized by generation.

I won’t go too deep into the architecture of LevelDB, but the key point is that these level files are periodically compacted (i.e., merged). As the database grows, large files occasionally get compacted at random times. This can cause the database to stall intermittently. As a result, historical nodes with larger states and databases will fall behind regular nodes with smaller databases.

To solve this problem, clients will need to adopt a more advanced architecture, where LevelDB is split into multiple shards (e.g., by the first bytes of the key). If you use, say, 32 shards, then you can utilize 32 CPU cores in parallel, and each shard’s database will be 32 times smaller. Additionally, you can spread the data across multiple SSDs, improving read/write bandwidth by a factor of 32.
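
To make the sharding idea concrete, here is a minimal sketch in Go, assuming the goleveldb library (github.com/syndtr/goleveldb/leveldb); the shard count, directory layout, and first-byte routing are illustrative choices, not how any production client actually lays out its database:

```go
// Prefix-sharded key-value store: one LevelDB instance per shard, routed by
// the first byte of the key. Illustrative sketch only; keys are assumed
// non-empty and error handling is kept minimal.
package shardeddb

import (
	"fmt"
	"sync"

	"github.com/syndtr/goleveldb/leveldb"
)

const numShards = 32

type ShardedDB struct {
	shards [numShards]*leveldb.DB
}

// OpenSharded opens one LevelDB instance per shard; each shard directory
// could be mounted on a different SSD to multiply read/write bandwidth.
func OpenSharded(baseDir string) (*ShardedDB, error) {
	s := &ShardedDB{}
	for i := 0; i < numShards; i++ {
		db, err := leveldb.OpenFile(fmt.Sprintf("%s/shard-%02d", baseDir, i), nil)
		if err != nil {
			return nil, err
		}
		s.shards[i] = db
	}
	return s, nil
}

// shardFor routes a key by its first byte, so compactions and stalls stay
// local to a single, 32x smaller shard.
func (s *ShardedDB) shardFor(key []byte) *leveldb.DB {
	return s.shards[int(key[0])%numShards]
}

func (s *ShardedDB) Get(key []byte) ([]byte, error) {
	return s.shardFor(key).Get(key, nil)
}

// WriteParallel fans a set of updates out across shards, one goroutine per
// shard, so all cores can apply their portion of the update concurrently.
func (s *ShardedDB) WriteParallel(updates map[string][]byte) error {
	perShard := make([]map[string][]byte, numShards)
	for i := range perShard {
		perShard[i] = map[string][]byte{}
	}
	for k, v := range updates {
		perShard[int(k[0])%numShards][k] = v
	}
	var wg sync.WaitGroup
	errs := make([]error, numShards)
	for i := 0; i < numShards; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			for k, v := range perShard[i] {
				if err := s.shards[i].Put([]byte(k), v, nil); err != nil {
					errs[i] = err
					return
				}
			}
		}(i)
	}
	wg.Wait()
	for _, err := range errs {
		if err != nil {
			return err
		}
	}
	return nil
}
```

Since each shard is an independent LevelDB instance, a large compaction only blocks one of the 32 shards while the rest keep serving reads and writes.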

If LevelDB remains monolithic, then only a single core is utilized during database updates, and moving to a more powerful machine won’t help the historical node keep up.

So, the key point is that the challenge is not so much about storage requirements, but about re-architecting clients into a parallel, sharded database architecture — which is significantly more complex to engineer and QA.

1 Like

Yes, but with unbounded state growth people have to buy new equipment regularly even with no change to the gas limit.

This doesn’t solve state growth for the most important type of node operator: the RPC provider (ideally local). There are solutions to this like Portal Network, but it is unclear if that will work without incentives. There are incentivized designs that could work, but I don’t think anyone is working on them.

This assumes the relative gas usage of state vs compute remains constant at all price points, which I’m incredibly skeptical of. There are new state-heavy use cases that open up once the gas price is low enough (like using Ethereum as a backup system or an S3-style object store).


I think the key problem with almost all discussions on this topic is that they seem entirely focused on builders/stakers, while everyone ignores the operators I believe are by far the most important: RPC operators. We should be pushing hard to get regular users running their own trustless clients for answering RPC requests, and those need fast random access to all state.

Things that reduce the state a builder or validator needs don’t help with this at all. While this is a solvable problem in theory, almost no solutions to the state problem or discussions of state growth even mention this class of operator. Solving decentralized state with unbounded growth in a way that gives usable response times for random access is a hard problem, and until that is solved we shouldn’t be seriously considering cranking up the state growth.

2 Likes

Chiming in here to say that I think FOCIL is a must before we continue much further on the path towards mega builders. I’m not necessarily opposed to the idea that mega builders are the right path for Ethereum, though I likely agree with Micah, Potuz and others that if you can neither execute the tip nor prove the tip, you’re not really running Ethereum independently, and you are basically just a light node regardless of whether you’re in the attester set.

What I want to focus on is the importance of deciding who the FOCIL includer set are. The reason being, I am not confident the ‘medium nodes’, i.e. the attester set, are on track to be sufficiently decentralised to be entrusted as the FOCIL includers. ETH staking is the most costly to run it’s ever been, and pays the least it ever has. The lack of a production solution to MEV centralisation, and the extreme and growing economies of scale of staking, have the proportion of solo stakers going in one direction only.

Exacerbating this is the idea that we should move to extremely large minimum ETH requirements to be in this set, to make 3SF/SSF more viable. This excludes nearly all independent validators (solo stakers) and leaves only enterprises and protocols in the attester set. In a world where almost all attesters are using someone else’s delegated ether, I am unconvinced that they are a reliable basis on which to hang the hopes of real CR for the chain.

I think FOCIL is extremely needed, but a 16x magnification of the honest attester subset is insufficient and should likely be at least 10x that. Unless we intend to meaningfully change the trajectory of the attester set with features like MEV burn, liveness correlation penalties, and sublinear reward curves, I think we should be focusing urgently on whether there is an economically rational rainbow staking model that allows FOCIL to run with, ideally, thousands of includers in every slot. As others have pointed out, FOCIL can run on very modest hardware; provided the economics of these nodes can be figured out, the hardware is not the bottleneck.

tl;dr: the economics of staking are highly centralising (even without exacerbating matters with issuance caps), and we need a real decentralised includer set for FOCIL if we expect it to permanently constrain centralised builders.

2 Likes

What’s the problem being solved with the proposed change to this heavy node concept? We already have heavy nodes… it’s impossible for average users to run their own ETH node because of 1. the high cost of 32 ETH, 2. the high-quality SSDs required, now in the range of 3 TB minimum for low-maintenance operation, and only increasing, and 3. ridiculously high bandwidth costs, even for people like me in major cities without cheap, fast Internet.

Hi V, I’m no expert, and I watch things more from an outside spectator’s abstract view. Seeing it from this perspective, I want to drop two points:

  1. You once said that ever-growing complexity of the protocol will lead to vulnerability. This made sense to me. As with any app: a) the more complex and b) the more user-friendly it is, the more potential for flaws and the larger the surface to get hacked.

  2. Most EIPs are about finding compromises and solutions that ensure the immutability of the blockchain while minimizing resource requirements. This will be a permanent challenge. But to leave the regular path, and not be too fixated on one thing as the only solution, I want to ask: how does mother nature ensure that the all-dominating blockchain and distributed ledger, DNA, is immutable? Well, the DNA blockchain is simply spread everywhere in high numbers, permanently and ongoing forever, while humans and bananas still have 60% of their DNA in common. The DNA blockchain is thus not verified by active verifications from the bananas and so on, but simply by its dominant presence everywhere. If one wants to verify that the core DNA is correct, just pick a blade of grass from the garden. And permanent breeding everywhere and at any time keeps it dominant forever. If one wanted to attack this chain, one would have to change all these core blockchains in every single blade of grass, ant, banana, etc. It is simply immutable by being spread everywhere and being present everywhere.

…this is just a very abstract thought. But I sense that this everlasting search for the right compromise between realistically low resource requirements and immutability is the search for the holy grail. Maybe new ideas and new paths can be considered…

As a home staker the thing I am most sensitive to is bandwidth, specifically upload speed.

Hardware is cheap: I’ve been running since beacon genesis on an 8th-gen i5 (released in 2018) with two replacement fans at $10 each and an upgrade from 2TB to 4TB ($300) to stay comfortable and not have to babysit pruning. I know that at some point I’ll need to replace my hardware; that’s fine, that’s part of the responsibility of participating. Personally I don’t see a problem with asking home stakers to have 4TB SSDs, and if Ethereum is not the best it can be because we want to try to keep it to 2TB to suit edge cases, I don’t think that’s a compromise we should make. Putting up ~$100k of collateral and not being willing to spend <1% of that on the hardware is crazy to me.

But bandwidth is something that I don’t have as much control over, and I think that is the battle we need to choose.

As a home staker my staking rewards are classed as income from a tax perspective, and at a worst case that eats into ~50% of my rewards and I can’t write off any of the costs against that. So at the moment running a single validator is earning ~$1200 after taxes, and if that reward stays the same but I need to upgrade my internet connection, I’m going to struggle to justify to my partner the economics of having our money tied up validating for ~$1000 annual return.

3 Likes

I had a scratch-paper idea for reducing compute requirements; it leverages the observation that the majority of transactions do not access the same state. In theory, you could assemble transactions into some form of DAG, process/prove them in separate groups on different machines, and aggregate the proofs individually. However, it runs into the issues of race conditions and a lot of extra overhead and complexity. It may be worth it, though, if it gets throughput high enough.
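
One way to make that sketch concrete, assuming each transaction declares the state keys it touches (access-list style): a union-find over those keys yields groups of transactions that share no state, and those groups could then be executed or proven on separate machines. A minimal, illustrative Go sketch (all names are assumptions, not from any client):

```go
// Group transactions into batches that touch disjoint state, so the batches
// can be processed/proven in parallel and their proofs aggregated afterwards.
package txgroups

type Tx struct {
	ID   int
	Keys []string // state keys this transaction reads or writes
}

// dsu is a small union-find structure over transaction indices.
type dsu struct{ parent []int }

func newDSU(n int) *dsu {
	p := make([]int, n)
	for i := range p {
		p[i] = i
	}
	return &dsu{parent: p}
}

func (d *dsu) find(x int) int {
	for d.parent[x] != x {
		d.parent[x] = d.parent[d.parent[x]] // path halving
		x = d.parent[x]
	}
	return x
}

func (d *dsu) union(a, b int) { d.parent[d.find(a)] = d.find(b) }

// GroupByState merges any two transactions that touch a common state key,
// then returns the resulting connected components as independent batches.
func GroupByState(txs []Tx) [][]Tx {
	d := newDSU(len(txs))
	owner := map[string]int{} // state key -> first tx index that touched it
	for i, tx := range txs {
		for _, k := range tx.Keys {
			if j, ok := owner[k]; ok {
				d.union(i, j)
			} else {
				owner[k] = i
			}
		}
	}
	byRoot := map[int][]Tx{}
	for i, tx := range txs {
		byRoot[d.find(i)] = append(byRoot[d.find(i)], tx)
	}
	groups := make([][]Tx, 0, len(byRoot))
	for _, g := range byRoot {
		groups = append(groups, g)
	}
	return groups
}
```

Within a batch the transactions still have to be ordered, which is where the race conditions and extra overhead mentioned above come back in.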

As a home staker with a 900/500 Mbps fibre connection, the thing I’m most sensitive to as of today is storage (specifically the growth rate). Once a node is synced, the CPU is idling, memory requirements are stable and I don’t notice bandwidth consumption - you could increase all 3 factors by 50% and I would not care.
I do however have to upgrade storage from 2TB to 4TB SSD this year, which represents 50% of the total HW costs. Unlike CPU & DDR4 memory, SSD prices / TB have barely changed over the last 5 years.
I hear the argument about the total investment for 32 ETH vs. the investment in hardware, but given that there is no financial incentive for home staking over liquid pool staking, and that I already contribute my labour, the only incentive for home staking is idealism. Every time you ask me to spend money on HW, you are asking me to pay for my idealism to continue running my home staking node. Therefore, if the storage growth rate were to meaningfully increase without a financial incentive for home stakers, I would stop and switch to a liquid staking pool instead.

I’m a solo staker since Ethereum staking began in December 2020, and I’m very active in the community, helping where I can and engaging with fellow stakers. I’d tend to agree that storage isn’t a major issue for most participants. However, CPU and bandwidth are a different matter—both can be expensive to upgrade, and in the case of bandwidth, often impossible to improve due to local infrastructure constraints.

In many parts of Europe, particularly in rural regions like the one where I’m based in Germany, connectivity bottlenecks remain a significant concern. Many connections don’t even reach 100 Mbit/s. Although there are national plans to expand fiber coverage, the rollout is slow, and in areas where it’s not economically viable, service may never arrive.

The second point I want to highlight is resilience - something that’s absolutely essential when running an operation with this level of personal and financial commitment. For private individuals, it’s a lot to manage alone. There should be greater attention on ensuring robust failsafe mechanisms are in place, so that in the event of disasters, both individual stakers and the network as a whole can recover quickly and securely.

I’ve shared some more detailed thoughts on this here: Emergency exit for compromised validator withdrawal address - 2FA - #10 by noe

Vitalik, I’d really value your thoughts specifically on the resilience part. Your perspective could help bring more awareness to how we can better support solo stakers in the long term, especially in light of EIP-7251’s current CL client implementation phase.

1 Like

Another “home” staker here, running ~50 validators for several programs (Rocket Pool, Lido CSM, NodeSet, SSV). Just to give you an understanding of why I’m not that concerned about hardware requirements: to get better resiliency, I run three nodes with different EL and CL clients, separate light machines running just the VCs, everything in a separate rack with its own router, a dedicated internet connection just for that, a backup internet connection from a wireless ISP, plus a UPS and even a generator if needed. Sounds paranoid, I know, but perfecting that system became my hobby, and it gets me very reliable infrastructure I can use for a lot of things within Ethereum. I just like thinking about resiliency, and I’m willing to invest that kind of money to have a proven system generating returns for me.

The best thing about staking right now, in my opinion, is the low power draw. The whole system I just described, including the UPS, draws only around ~120W from the wall. I live in Central Europe and electricity cost is moderate. Back in the day, when I was mining the ETH I use for staking today, power draw was a constant headache for me. Not only could I not limitlessly pull power from the outlets in my parents’ workshop (EU outlets are only made for ~3.6 kW per line), the heat was an enormous problem. During winter it was nice to have a heated workshop (my cat really liked it there), but summer was literally hell for my GPUs back then. It killed more than one.

If staking (!) requirements in the future come with increased power draw requirements up to 15 kW (!!!) for a heavy node, this will be a massive driver for centralization. No home staker, even sophisticated ones like me, will want to afford that kind of infrastructure (power lines, cooling capacities, ability to pay the power bill of >€3K a month). I’m also concerned about the general blockchain power draw something like this would drive. Still low compared to mining of course, but the switch to staking was a huge step in the right direction from an environmental perspective (something a lot of people tend to forget these days). Losing that would be just wrong.

I’m happy to invest in hardware and bandwidth as needed if the return on investment is right for me, but substantially higher power draw makes it a lot more difficult to participate.

If staking (!) requirements in the future come with increased power draw requirements up to 15 kW (!!!) for a heavy node, this will be a massive driver for centralization. No home staker, even sophisticated ones like me, will want to afford that kind of infrastructure (power lines, cooling capacities, ability to pay the power bill of >€3K a month)

The 15 kW would be for a prover node: a type of node where the network only needs enough of them to guarantee that one honest one will always stay running, including in heavy censorship scenarios. Hence, we maybe want ~100 in the normal case - perhaps the ideal is to have a class of actors who have local server setups that they normally use for AI, but have the capability to quickly switch to proving, always on standby.

Stakers would have much lower requirements.

1 Like

Would it be a good idea to have a system where solo stakers can vote on such things? We already have lists of them (used in airdrops so far), and I’m sure a voting system requiring their signature would be easy for the EF to implement. They are what makes Ethereum decentralized, and I think their voice is drowned in the ocean of massive staking providers (which can basically signal anything with their validators).
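
For what it’s worth, the mechanical part of such a vote is simple. A minimal, illustrative sketch in Go, using Ed25519 from the standard library purely for brevity (a real system would presumably verify Ethereum account or validator BLS signatures, and the message format here is an assumption):

```go
// Tally signed votes against a known list of solo-staker identities:
// one vote per listed key; invalid signatures and unknown voters are ignored.
package stakervote

import "crypto/ed25519"

type Vote struct {
	Voter     ed25519.PublicKey // identity expected to appear on the list
	Message   []byte            // e.g. []byte("proposal-123: yes")
	Signature []byte
}

// Tally counts at most one valid vote per listed voter.
func Tally(list []ed25519.PublicKey, votes []Vote) map[string]int {
	listed := map[string]bool{}
	for _, pk := range list {
		listed[string(pk)] = true
	}
	seen := map[string]bool{}
	counts := map[string]int{}
	for _, v := range votes {
		id := string(v.Voter)
		if !listed[id] || seen[id] {
			continue // not on the solo-staker list, or already voted
		}
		if !ed25519.Verify(v.Voter, v.Message, v.Signature) {
			continue // bad signature
		}
		seen[id] = true
		counts[string(v.Message)]++
	}
	return counts
}
```

Curating the list itself and deciding what weight such a signal should carry would of course be the harder, social part.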

Thanks for getting back to me. I already suspected that this would not be the new requirement for stakers in general, but things like this can still be a limiting factor for desired diversity amongst those prover node operators.

Coming back to home staker requirements, I’d say increases are fine as long as there is a roadmap taking into account that stakers at home have certain limitations (mainly bandwidth and power draw) and need predictable increases in requirements (e.g. more data storage needed in the future) to still be able to participate and diversify the decentralisation landscape.