There are currently a large number of (optimistic and ZK) rollup projects, at various stages of development. One pattern that is common to almost all of them is the use of temporary training wheels: while a project’s tech is still immature, the project launches early anyway to allow the ecosystem to start forming, but instead of relying fully on its fraud proofs or ZK proofs, there is some kind of multisig that has the ability to force a particular outcome in case there are bugs in the code.
L2beat’s risk analysis page shows a large amount of stats about various rollups, including the status of their training wheels:
However, as of today this information is not well-standardized, making it difficult for users to understand what specific trust model a particular rollup is using. Rollup teams may even have the incentive to keep quiet about their present-day trust model, focusing the discussion instead on the fully-trustless tomorrow.
This post proposes a simple milestone-based schema to help us categorize rollups into three different stages, depending on how heavily they rely on their training wheels. This is intended to achieve a few goals:
Make it easier for users to identify the extent to which a particular rollup depends on “trust in a specific group of humans” vs “trust in code”
Help motivate rollup projects to improve on their trust models, reducing the risk that trust minimization gets deprioritized because it is “less visible” than eg. flashy UX improvements
Give the ecosystem some precise milestones to coordinate around and celebrate, letting us say when “The Surge” is half-complete or fully complete, paralleling “The Merge”
This schema is NOT intended to imply a moral judgement that movement to maximum trust in code as quickly as possible is the only correct course of action. Rollups absolutely should have a clear roadmap to taking off training wheels, but they should take training wheels off only when they are ready.
The Schema
Stage 0: full training wheels
Requirements:
The project should call itself a rollup.
All rollup transactions should go on-chain.
There should exist a “rollup full node”: an independently runnable software package that can read the L1 chain, extract and the rollup chain, and compute the current state of the rollup chain. If it disagrees with a rollup state root posted into the contract, it should give an alarm.
There should be machinery that allows users to either post rollup transactions or at least ensure a withdrawal of their assets with no cooperation from the operator. That is, the operator cannot freeze or steal users’ assets by censoring users; their only possible tool for doing so must be to post a false state root.
It’s okay if the on-chain mechanism for posting new state roots is simply a multisig, with no active fraud proof or validity proof whatsoever.
Stage 1: limited training wheels
Requirements:
There must be a running fraud proof or validity proof scheme, which has the practical authority to accept or reject which state roots get accepted by the rollup contract.
There can exist a multisig-based override mechanism (“security council”) that can override the fraud proof or validity proof system’s outputs and post state roots, to be used in case the proof system code is bugged. However:
The multisig must be 6 of 8 or stricter (that is, >= 8 participants AND >= 75% threshold)
At least a quorum-blocking group (that is, enough participants to prevent the multisig from acting) must be outside the organization that is running the rollup.
There can exist an upgrade mechanism, but if it has a lower threshold than the multisig, upgrades must have a mandatory activation delay of at least 7 days or the maximum length of the fraud proof game, whichever is longer. The goal of this rule is to ensure that the upgrade mechanism cannot be used to intervene in real-time disputes.
Stage 2: no training wheels
Requirements:
In the event that code does not have bugs, there must not be any group of actors that can, even unanimously, post a state root other than the output of the code
This somewhat awkward phrasing (“IF the code does not have bugs, THEN no one can override it”) is meant to permit use of security councils in ways that are clearly limited to adjudicating undeniable bugs, such as the following:
The rollup uses two or more independent implementations of its state transition function (eg. two distinct fraud provers, two distinct validity provers, or one of each), and the security council can adjudicate only if they disagree - which would only happen if there is a bug
If someone submits a transaction or series of transactions that contains two valid proofs for two distinct state roots after processing the same data (ie. “the prover disagrees with itself”), control temporarily turns over to the security council
If no valid proof is submitted for >= 7 days (ie. “the prover is stuck”), control temporarily turns over to the security council
Upgrades are allowed, but must have a delay of >= 30 days
Thank you for documenting this proposed framework. Agree that it would be very useful to have a defined framework for rollup classification as they progress through various stages of training wheels.
From the Coinbase perspective, this would be a valuable resources for presenting our users with a standardized risk assessment as they interact and move assets across different rollups. We would be excited to collaborate on fleshing this out and making this a resource that can be standardized across the broader ecosystem.
As currently specified, this seems like a reasonable v0 that we can expand on. One additional requirement I might suggest would be to encode something about the fraud proof or validity proof mechanism being available in such a way (e.g open source) that it can be reviewed by 3rd parties.
Like @jessepollak said, I think this is a good v0 proposal that would benefit the entire rollup ecosystem. As more and more assets are held in rollups, it’s becoming increasingly important for users to have a clear understanding of the security model of the system they’re using. A standardized security framework is going to be a key part of this.
My primary feedback is that there are “nice to haves” that aren’t necessarily covered by a framework like this that can have a real impact on the overall security of the network. For example, the ability to easily preview and upgrade and understand its implications can be the deciding factor between a bug slipping through and a secure system. Of course, that’s not strictly required but, like I said, it’s a nice to have that has a real impact.
IMO a good v1 version of a framework like this looks sort of like a tech/skill tree. You have specific key requirements that are necessary for advancement to the next stage but you can also unlock bonus items that give you “points” but aren’t requirements for advancement. You might be able to extend this model by assigning points and required features such that the requirement for advancement is X number of points and some set of necessary features. Assigning points also reflects the relative value of different features (especially non-critical features) and can help teams prioritize their work.
I’d really like to see comments on this proposal from other rollup teams so we can start to build consensus. Generally, the sooner the better with a framework like this. Rollup risk is only increasing and this is a critical first step in pushing rollups towards real security.
I would say yes, as long as any of the members of the security council can initiate the proof. That would be a 1-of-8 trust model, a higher trust-minimization bar than the 3-of-8 assumption already inherent in the multisig.
An upgrade delay of 7 or 30 days is a safety vs liveness trade off that favors safety by forcing a liveness delay - this sounds reasonable But what if there is an urgent need to upgrade quickly? Say a zero day attack or a bug that may enable an attacker to drain funds in a few days? Recall the DAO fork (upgrade) had some urgency. Perhaps a higher level committee (or social consensus?) should allow faster upgrades. One way is via an on chain governance mechanism driven by eth validators or other (more specific) stake holders…
But what if there is an urgent need to upgrade quickly? Say a zero day attack or a bug that may enable an attacker to drain funds in a few days?
Then we’re screwed, and that is a risk that we have to take. This is an unavoidable tradeoff: if we want the system to be secure against 51% attacks on the governance layer, then we have to sacrifice the ability of benign 51% collusions in governance to fix broken code.
Ideas around combining multiple proof systems and only turning on governance when they disagree (the reasoning being that it’s very unlikely they would all be bugged in the same direction) is the best that I can come up with to get the best of both worlds here.
Jumping in to comment that StarkNet is following a different approach, one that doesn’t quite follow these 3 steps. On one hand, we have the main security technology - validity proofs - turned on from the get go, we’ve never submitted a state update to StarkNet without having a STARK proof for it (same holds for all StarkEx systems). So that maps to somewhere between Step 1 and 2 (currently our upgrade period is less than 30 days, so not Step 2). On the other hand, we have decided to deploy decentralized operation at the sequencer+prover level at the very end. Roughly, the steps we’re taking are thus:
Functionality - i.e., having the basic structure of the system, accounts, format, state, fees, etc. This has been completed (well, there’s Cairo 1.0 and regenesis but it’s a soft one, meaning the system is already functional)
Performance - this is where we’re at now, and it means increasing TPS.
Decentralization - having decentralized everything - provers, sequencers, etc.
The main reason we followed this path is to move fast. Easier to solve functionality and performance when the operators are not yet decentralized.
I love the proposal. One of the things we struggle with in L2BEAT is explaining difficult technical terms to non-tech-savvy users. The risk framework is great, but it’s full of technical jargon. A precise categorization would definitely help us explain risks to users in an easier to comprehend way.
To give folks some idea, with the current framework:
Stage 0 rollups: Optimism (missing FPs),
Stage 1 rollups: Arbitrum (FPs are behind a whitelist) + with small adjustments to msig structure zkSync v1 and dYdX could be listed here,
Stage 2 rollups: Fuel v1,
Here’s a (very) quick mockup of how we could present such info to the user:
IMO naming should indicate that “Stage 2” is final and desired. Maybe instead of naming stages with numbers, we should use tiers like “Tier A - Fully secured by Ethereum”, “Tier B - Limited security“, “Tier C - Under construction”
In Stage 0, “rollup full node” for zk rollups might not meet this criteria as often zk rollups don’t push all tx on chain but only state diffs.
Here’s a (very) quick mockup of how we could present such info to the user:
That looks great to me! It would definitely be much clearer to newbies than the current risk framework, which has excellent info but doesn’t really help users understand which properties are more important.
Though perhaps rename “stage” to something clearer like “security level”. And happy to accept alternate namings, tier etc.
In Stage 0, “rollup full node” for zk rollups might not meet this criteria as often zk rollups don’t push all tx on chain but only state diffs.
Ooh, good point. I guess there are a few paths to take:
Accept that for rollups that only publish state diffs, a “full node” would just play the state diffs to compute the current state
The rollup must offer a ZK-SNARK verifier that attempts to verify that there are valid transactions that created the transitions, though it’s ok if it’s not “plugged in” and these SNARKs are only published with a delay
Make the decision that rollups that only publish state diffs are “not rollups” until they get to stage 1
I’m inclined to lean toward (1) particularly if we brand “stage 0” as “under construction”, because then users would understand that that’s what stage 0 means and it has very few guarantees. But something like (2) could work too. Maybe survey the ZK-EVM teams on this to get their views?
Hey all, Bartek from l2beat here. We are in the process of finalizing the implementation of the ranking which will be released soon. We are hoping for one last final feedback - please check out l2beat forum post and consider providing your feedback
As a teaser, we are still trying to decide if we should go with a simple, linear ranking as proposed by Vitalik, or it is better having slightly more complicated (but hopefully still easy to understand) more multi-faceted maturity ranking that could look like the following mockup.