Delegated Execution Sharding (DES): A hyper-parallelized zkEVM for theoretically optimal execution-layer scalability

I’ve been doing some research regarding the Lean spec, specifically the Lean-Multisig; one thing that’s obvious is that PQ is being prioritised first, but also “a minimal zkVM” is being worked on for the sake of XMSS signature aggregation, rather than going for a full-on zkEVM spec at this stage.

The multisig-snarking has some interesting effects in my opinion:

Rather than cryptographic aggregation through pairings like in BLS, Lean is actually computing all of the signatures but proving the validity of the computations as an aggregate; this provides the succinctness of a snark.

Since aggregations are aggregated recursively, signature verification is actually parallelized across the signing-commitee, which is paradigmatically different than pairing-based aggregation, which is just cryptographic verification rather than computational proving.

This of course is often brought to light in the context of SSF, which becomes feasible with regard to that kind of level of efficiency that isn’t present with BLS.

The thing I see nobody talking about is the fact that this in-effect has just parallelised the execution of the beacon-chain, and the implications of that being possible are actually spectacular; allow me try and explain my thinking:

The conceptual separation of the execution and consensus chain is in part due to the disjointed nature of the the beacon-chain computing the consensus-engine’s PoS and blob-verification, while the execution-chain’s STF computes the EVM and world-state; both are in effect separate state-machines, and the beacon machine encapsulates the ethereum virtual machine.

Parallelising the consensus-engine is a big deal because of the level of scalability it brings; in a way SSF is infinite scalability on the consensus-layer. Now ask yourself the question, if theoretically optimal scalability through recursive aggregation of computational proofs could be implemented on the execution-layer, what would that look like? Theoretically a network that totally leverages all available computational resources.

One reason why this kind of parallelisation is possible on the beacon-chain is that validator signatures hold no context: the validity/existence of a signature remains independent of the validity/existence of all other signatures, so they can be infinitely parallelised as they never collide and can thus be computed across disjointed processors. Perfect parallelisation in the execution-layer’s STF is challenging in contrast, as the world-state is a kind of context that could end up colliding across different computations, but how far can we get?

The existence of BALs proves that parallelisation of the EVM is beneficial, but what’s the limit to the number of state-changes that can be parallelised? Well no matter how well you parallelise disjoint transactions you reach the point where the CPU runs out of threads; what if then the disjointed execution moves to other nodes? Here’s how it could work in practice:

Every slot an inclusion-list is established, perhaps similar to EIP-7805; the IL defines metadata denoting an ‘execution-column’ that select txs fall into; these columns all contain txs that are disjoint in their side-effects, and thus can execute in parallel to other column’s txs. Think about it like partitioning txs into equivalence classes based on a side-effect equivalence-relation. Then APS-style delegation can be used to select ‘execution-committees’ to execute certain columns of txs in the IL. Similar to the current Lean-Multisig, these computations are snarkified, and these snarks are aggregated by a committee-leader (or any member) and sent to the proposer of the block, who recursively aggregates the commitees’ snarks into a further snark that proves valid execution across the entire block.

This in effect eliminates the upper-bound that exists in the network currently where, for the sake of decentralisation, any home-computer needs to be able to re-execute any other block; this is true even given full snarkification, as proposers still need to be able to fully execute their own block.

I think if you take this and combine it with EIP-4444 and EIP-7736 history and state expiry, full DAS, and something similar to a verkle-tree, you can potentially leverage close to the entire network’s resources on both the data and execution layers, thus achieving theoretically optimal scalability. You can even optimise it further by implementing new gas calculations that increase costs relative to how parallelizable a transaction is, so that users essentially pay to access more active parts of the world-state. It’s worth noting this multi-threaded gas could have prevented a situation such as the DAO accumulating an erroneous fraction of the total supply of ether, as users would have had to pay increasingly large gas costs as more and more users started mutating the same state in the world-computer; gas costs that pay relative to how parallelizable a tx is thus incentivise account-abstraction and use of one’s own code, which strengthens decentralisation.

In essence the dilemma of a blockchain is that any one node must be capable of following the execution of the entire network in real time, since the only execution you can truly trust is that which happened on your own CPU; with zkEVM parallelism you can solve this bottleneck by using succinct proofs to prove the validity of execution across untrusted nodes executing the chain in parallel, thus achieving theoretically optimal scalability.

Thank you for reading.

1 Like

Proposers can outsource proof generation to 3rd-parties. They can trust the result without reexecuting the block, because correctness is proven. Of course, relying on 3rd-parties harms decentralization. But with further scaling, increasing the gap between the hardware reqs of block/proof builders and validators seems unavoidable.

If I understand it correctly, your suggestion is that proof generation should happen in committees, parallelized. The main problem I see is that there is a fundamental incentive incompatibility between the block builder and the prover: The builder can create blocks that are very expensive to prove. If a committee fails to prove their shard by the deadline, is this fault attributable to the committee, or to the proposer/builder? More on this here: https://ethresear.ch/t/prover-killers-killer-you-build-it-you-prove-it/22308.

I also considered this a few years back. The problem is that “parallelizability” is not a property of the transaction per se, but a property of the smart contract, and of the specific set of transactions included in the block.

Thank you for the reply!

To your first and second point:

It’s a good point. I originally was thinking that there could be execution-committees elected to compute their sharded execution, and that there would be one proposer who aggregates their proofs into one to include in the block. This came from the fact that the ‘commitees-attest, proposers aggregate’ model of the CL was inspiration for delegated execution. In theory delegated execution is agnostic to the kind of proposer-builder dynamics in place; a builder-like entity could also do the aggregation. I guess you can think of it like spreading the building across multiple builders who execute disjointed tx batches in parallel per block, instead of one executing sequentially. I think the point you were making is that the incentives are kind of erroneous if execution-committees are elected, and they elect their own builders, but you could always implement it so that the builders themselves are the ones who are delegated the sharded execution and proving; again, you can think of it like multiple builders in one slot.

Since the inclusion list is established before execution you’d know how parallelizable a tx is before executing it in its ‘execution column’; you could charge a premium for txs that are included in execution columns with many other txs that call the same contract or modify the same state, that introduces a lot of uncertainty though in tx fees which I concede is problematic. Either way its not totally necessary.

I hope I responded to your points adequately.