Vitalik’s rollup-centric roadmap for Ethereum provides a vision that separates rollups and shards as playing two distinct but critical roles in scaling Ethereum. While I am generally in favour of all the concepts involved, I would like to argue why I feel there needs to be tighter coupling between rollups and shards than the current roadmap is likely to provide.
Before that I’d like to briefly summarise the insights underlying rollups and shards.
Insights
It’s all social consensus - If you accept that social consensus is what governs blockchains, you can accept that nodes don’t need to be able to sync all the way to genesis. Social consensus is slow, but it can be established over the time-period of some weeks or months. Economic consensus (PoW, PoS) is very fast but backstopped by slow-moving social consensus.
Once you accept social consensus can be established over, say, a 6-month scale, a node needs two things to sync: 6-month-old state (on which you have social consensus) and 6 months of history.
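To make the "old state plus recent history" idea concrete, here is a minimal sketch. It assumes a toy balance-only state and a history made of simple transfers; the names Transfer and sync_from_checkpoint are hypothetical and only for illustration, not how any real client syncs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    # Hypothetical toy transaction: move `amount` from sender to recipient.
    sender: str
    recipient: str
    amount: int


def sync_from_checkpoint(checkpoint_state, recent_blocks):
    """Rebuild current state from a socially agreed checkpoint plus later history.

    checkpoint_state: dict of account -> balance (the "6-month-old state").
    recent_blocks: list of blocks, each a list of Transfer (the "6 months of history").
    """
    state = dict(checkpoint_state)  # copy so the checkpoint itself is untouched
    for block in recent_blocks:
        for tx in block:
            if state.get(tx.sender, 0) < tx.amount:
                raise ValueError("history contains an invalid transfer")
            state[tx.sender] = state.get(tx.sender, 0) - tx.amount
            state[tx.recipient] = state.get(tx.recipient, 0) + tx.amount
    return state


checkpoint = {"alice": 10, "bob": 0}
history = [[Transfer("alice", "bob", 4)], [Transfer("bob", "carol", 1)]]
current = sync_from_checkpoint(checkpoint, history)
```

Nothing before the checkpoint is touched: the node trusts the checkpoint because of social consensus and only replays what came after it.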
Social consensus is n-of-N not 1-of-N - A single node can’t credibly fork the system, a few nodes are required to fork and establish their own network effect. Once you accept that, you don’t need every single node to store all data (state and history) - you can split it amongst n nodes. You now have n processors, n hard disks, n times the internet bandwidth to use in order to sync the network and be able to credibly fork it.
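A rough sketch of what "split it amongst n nodes" could look like, assuming the data is cut into numbered chunks and handed out round-robin with no redundancy. Both the assignment rule and the function names are my own assumptions for illustration; a real design would want replication.

```python
def assign_chunks(num_chunks, num_nodes):
    """Assign chunk i to node (i mod num_nodes). Round-robin is an assumed rule."""
    assignment = {node: set() for node in range(num_nodes)}
    for chunk in range(num_chunks):
        assignment[chunk % num_nodes].add(chunk)
    return assignment


def can_reconstruct(assignment, cooperating_nodes, num_chunks):
    """True if the cooperating nodes together hold every chunk."""
    held = set()
    for node in cooperating_nodes:
        held |= assignment[node]
    return held == set(range(num_chunks))


assignment = assign_chunks(num_chunks=8, num_nodes=4)
everyone = can_reconstruct(assignment, [0, 1, 2, 3], 8)  # all four nodes cooperate
loner = can_reconstruct(assignment, [0], 8)              # one node on its own
```

Here a single node cannot rebuild the full data set by itself, but the group can, which is the point: a credible fork takes a group of nodes anyway, so the group is the unit that needs the full data.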
Rollups split state among n nodes - Not everyone cares about all state. Each user cares only about the state of their own accounts and the contracts they wish to interact with. Rollups provide a natural way to split state. Each rollup has its own state, and nodes of a rollup need to remain in sync with the state of only that rollup, not all the other rollups that exist.
Sharding splits history among n nodes - Not everyone needs to store all history. Accordingly, Ethereum is split into n shards, with nodes of each shard storing the history of that shard. Shards collectively become a giant “history dumping ground” for rollups. I’ll refer to these nodes as shard nodes from here onwards, to distinguish them from rollup nodes.
What are the differences?
Clearly there are some differences between the way rollups split state and shards split history.
1. Rollup nodes care more about the state they store than shard nodes care about the history they store.
Rollup nodes store and sync the state of specifically those rollups on which they personally have assets or find other state objects interesting. Shard nodes, on the other hand, store chunks of history from both rollups they care about and rollups they don’t care about. Even for the rollups they care about, they may not have the complete history, as it could be spread across multiple shards. While I completely agree that both types of nodes (rollup nodes and shard nodes) require some degree of altruism to exist, it is easier to be altruistic when dealing with objects you personally care about, such as your accounts. There is personal bias in this statement of course, but I do feel it generalises, and that more people will run rollup nodes than shard nodes in the current system.
2. Rollup designers have more flexibility to manage state than shard designers have to manage history.
Rollups are completely free in terms of how they manage state. They can implement state expiry with any time period, or state rent, or auction out state or offer it first-come-first-served. They can rely on any model of state providers to help rollup nodes sync and prove state as needed. They can implement an account-based state model, a UTXO model, or any other model that parallelises creation and validation of state or state transitions in different ways. They can make different tradeoffs on assumptions of processing power or altruism. It is even possible for a rollup to fall out of use, with not a single node left that cares enough to sync its state. This can happen without harming the overall system.
Shards, however, are being designed in a central fashion. All shards store the exact same amount of data, written at the same rate, using the same opcodes and the same auction model by which it is allocated. All shards assume the same time period beyond which history is no longer stored (and social consensus is sufficient). This does not allow different subsystems to make different tradeoffs over altruistic assumptions. It does not allow different execution environments to make different design choices for their data availability model that fit in more closely with other execution-related choices they may have made. It gives them limited flexibility as to how their data is split amongst multiple shard nodes. Communication between nodes happens in a uniform hierarchical way, be it peer scoring, bandwidth management or privacy. There may be constructions where multiple shard nodes strategically agree on who stores what and who communicates it when, which are more beneficial for the overall system than splitting everything equally.
3. Rollup design is funded privately, sharding design is funded publicly.
Rollups have investors who may wish to extract returns by various means. Sharding however is being driven by core devs and researchers, whose sources of funding are more altruistic. Their values and culture are different. Clearly there are advantages and disadvantages to both models that I will not get into here, but on the face of it I cannot see a fundamental reason why one kind of design should be privately funded and the other should be publicly funded.
How to bring in tighter coupling between rollups and shards?
I have not spent too much time on this question - it is indeed an open one - I was hoping to get answers in the replies.
Regarding 1, it might make sense for rollups to register themselves to specific shards, assuming we retain the centrally-designed 64-shard model. That way there is a clear mapping as to which shards are used by which rollups, and users running rollup nodes can naturally also run shard nodes for those specific shards.
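As a rough illustration of the registration idea, here is a sketch of such a mapping. Everything in it is hypothetical: the registry structure, the function names and the rollup names are made up for the example, and only the 64-shard count comes from the model mentioned above.

```python
NUM_SHARDS = 64  # the centrally-designed shard count assumed above


def register_rollup(registry, rollup, shard_ids):
    """Record which shards a rollup posts its history to (hypothetical registry)."""
    for shard in shard_ids:
        if not 0 <= shard < NUM_SHARDS:
            raise ValueError(f"shard {shard} is out of range")
    registry[rollup] = set(shard_ids)


def shards_to_run(registry, rollups_i_use):
    """Shards a user would run nodes for, given the rollups they care about."""
    shards = set()
    for rollup in rollups_i_use:
        shards |= registry[rollup]
    return shards


registry = {}
register_rollup(registry, "rollup_a", [3, 4])
register_rollup(registry, "rollup_b", [4, 17])
register_rollup(registry, "rollup_c", [60])

my_shards = shards_to_run(registry, ["rollup_a", "rollup_b"])
```

A user of rollup_a and rollup_b would end up running shard nodes for shards 3, 4 and 17 and could ignore the other 61, so the history they store is history they have a personal stake in.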
Regarding 2, I think it is a fairly open question as to how you “open up” the design of sharding, and let rollup and protocol designers define their own nodes and subnets and everything, with their own assumptions over altruism, compute power, bandwidth etc. There is a clear social component here, as the distinction between a user running an Ethereum node and one running a Bitcoin node is purely cultural. If you erode the socially unifying notion of an “Ethereum node”, and instead allow 10 different types of Ethereum data nodes to be created by 10 different protocol design teams, they will all compete for the same mindshare and the same altruistic node operators, who will have to decide which nodes they want to run and which history they want to store. Of course, this form of opening up has already happened when it comes to rollup nodes. Opening this up might also run the risk of some history being stored by no nodes, depending on how it is implemented. This is again true of rollups: it is possible for there to exist rollups whose state is being synced by no one because nobody cares to.
Assuming some form of opening up of sharding design is attempted, one will have to carefully draw the line as to which design choices are opened up and which ones are centrally retained by the current core devs.
Regarding 3, I don’t have many thoughts as to what should be done. What I know is that rollup design can be publicly funded, and sharding design can be opened up to private funding and maintenance. What should or should not be done is a meta-discussion that should be had.