I’ve never seen this topic addressed directly so I thought it would be interesting to do so here. I’m trying to answer the question: What is the maximum theoretical scale that could be achieved on the existing Ethereum 1.x chain, that is, without sharding? I’m interested in a purely theoretical approach to this question, from a project management perspective, rather than in a more scientific answer, or in actually attempting to achieve such scale. If it seems likely that we could get serious performance gains on the existing, pre-Serenity chain, then I think investment in a more concerted “Ethereum 1.x” scaling effort would be justified.
I’ve had conversations recently with @fredhr, @jpitts, @AlexeyAkhunov, and several members of the Ewasm team, and @karalabe spoke to the same topic as part of the Eth 1.x conversation, so this is my attempt to summarize these conversations and draw some conclusions from them.
The prerequisite for all of the ideas listed here is that they do not require major protocol changes, and are not breaking changes of the sort that will be introduced with Serenity. Many would not require a hard fork; most do not involve a protocol change at all. A couple admittedly push the envelope on “not requiring major protocol changes,” but “major” is subjective.
These are not hypothetical, “someday maybe” technologies; they’re technologies that have been proven elsewhere and/or extensively studied. I chose not to include, e.g., STARKs here since they don’t seem to be feasible yet in their present form.
Possible scaling technologies and max. theoretical scale of each, very roughly in order of feasibility/confidence:
- Reduced uncle rate, shorter block processing time (discussion here): 10x
- Improved I/O from better data structures, e.g., TurboGeth (source: @AlexeyAkhunov): 4x
- State pruning (reduced I/O) (source): 15x
- Bounded account/storage trie growth via state rent or stateless clients (source: my own wild speculation, and discussion with @AlexeyAkhunov; block processing should get faster with reduced I/O, and with bounded state we should see I/O benefits over time from improving hardware): 5x
- Block pre-announcement, pre-warming the state (source: @AlexeyAkhunov): 5x
- BitcoinNG-style leader election and block proposal (source): 50x
- Ewasm with JIT compilation (source: benchmarking work that @gcolvin did last year, code and talk): 50x (modulo concerns about JIT safety)
- Parallelization of transactions (source: internal conversations on the Ewasm team): 50x
- Multidimensional gas/metering (gas is a blunt tool, designed to be overly conservative; if we could meter, say, I/O and computation separately, then we could pack more transactions into each block; see the sketch just below this list) (source: me)
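To make the multidimensional metering idea slightly more concrete, here’s a minimal toy sketch in Python. Everything in it is invented for illustration: the two resource dimensions, the per-block budgets, and the per-transaction costs don’t come from any client or EIP. The point is only that two independent budgets can admit a mix of compute-heavy and I/O-heavy transactions that a single scalar gas limit would force to compete for one budget:

```python
# Toy sketch of two-dimensional metering: compute and I/O are budgeted
# independently per block. All limits and costs here are made up.

from dataclasses import dataclass

@dataclass
class Tx:
    compute: int  # computation units consumed (hypothetical)
    io: int       # I/O units consumed (hypothetical)

COMPUTE_LIMIT = 1000  # per-block compute budget (made up)
IO_LIMIT = 1000       # per-block I/O budget (made up)

def pack_block(pending):
    """Greedily include transactions while BOTH budgets hold."""
    block, compute_used, io_used = [], 0, 0
    for tx in pending:
        if (compute_used + tx.compute <= COMPUTE_LIMIT
                and io_used + tx.io <= IO_LIMIT):
            block.append(tx)
            compute_used += tx.compute
            io_used += tx.io
    return block

# A mix of compute-heavy and I/O-heavy transactions.
pending = [Tx(compute=180, io=20)] * 5 + [Tx(compute=20, io=180)] * 5
print(len(pack_block(pending)))  # -> 10: all fit, each budget exactly full
```

Under a single scalar limit of 1000 “gas” charging compute + I/O together, each of these transactions would cost 200, so only 5 would fit; the second dimension doubles throughput for this (admittedly cherry-picked) workload.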
Naively multiplying all of these together yields a scale of 1.875 billion times current scale. This is obviously an absurdly high number, for at least two reasons:
1. It’s a “best case scenario”: many of these ideas may not perform “as advertised,” or may not work at all.
2. It assumes total orthogonality among the ideas, which is obviously not the case.
However, even if we assume, for the sake of argument, that only 10% of these ideas bear fruit, and that those yield only 1% of the “advertised” performance/orthogonality, that’s still 1.875 million. Still a pretty high number. Still lots of room to poke holes.
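For transparency, here’s the back-of-the-envelope arithmetic behind both figures (a quick Python sketch; the multidimensional-metering item carries no estimate above, so it contributes no factor):

```python
from math import prod

# Estimated multipliers from the list above, in order.
factors = [10, 4, 15, 5, 5, 50, 50, 50]

naive = prod(factors)
print(f"{naive:,}")                    # 1,875,000,000 -> ~1.875 billion x

# Pessimistic discount: 10% of ideas bear fruit, at 1% of advertised gains.
print(f"{naive * 0.10 * 0.01:,.0f}")   # 1,875,000     -> ~1.875 million x
```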
One obvious weakness in this argument is that it’s “top down” rather than “bottom up”: these ideas are all nice in theory until we actually try implementing them in existing clients and find that they might not work, might take as long as Serenity to ship (e.g., stateless clients, which have wicked UX challenges), might break other things (e.g., usability), or might be mutually incompatible.
Another important caveat: as @karalabe and @AlexeyAkhunov eloquently explained in their Eth 1.x proposals, any attempt to scale today will immediately exacerbate the state and storage size issues, so any meaningful scaling (of Eth 1.x or 2.0) is still blocked on solving those.
Finally, there are probably fundamental limits to how far a single chain can be scaled, such as sync time (possibly alleviated by stateless clients?) and I/O limits.
What am I missing? What have I got wrong here?
Is this exercise useful? To reiterate, the point is not to come up with some specific, proposed scaling plan for Eth 1.x, but rather to consider the question of reinvesting in scaling Eth 1.x vs. doubling down on Eth 2.0 from a high-level, project management perspective.
Thanks!