Pectra Retrospective

There were a couple of things we tried to do with Pectra, and we’d like to start the retrospective by focussing on those. We made a major shift towards local testing instead of relying on live devnets. This was achieved with Kurtosis (for running devnets locally), EELS/EEST (allowing the testing team to fill tests via EELS instead of a client implementation) and Assertoor (for executing actions on local devnets). This shift to local testing allowed client teams to debug and iterate quickly, creating faster feedback cycles and making devnets far more stable than they used to be. Stable devnets also allow us to focus on more complicated tests that might need manual involvement. This change seems to have been well received, so we will likely continue down this route.
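
To make the workflow concrete, a local devnet of the kind described above can be brought up in a few commands. The sketch below assumes the ethpandaops ethereum-package for Kurtosis; the client choices and parameter fields are illustrative and may differ between package versions.

```bash
# Minimal network definition: two EL/CL pairs (illustrative client choices;
# field names follow the ethereum-package conventions and may vary by version).
cat > network_params.yaml <<'EOF'
participants:
  - el_type: geth
    cl_type: lighthouse
  - el_type: nethermind
    cl_type: teku
EOF

# Spin up the local devnet in its own Kurtosis enclave.
kurtosis run --enclave pectra-local \
  github.com/ethpandaops/ethereum-package \
  --args-file network_params.yaml
```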

In the past, a lot of bugs were caught twice - once on Hive and again on devnets. Triaging broken devnets diverted dev time, especially when the bug had already been caught in a more reproducible and appropriate location. To remedy this, we enforced test coverage before devnet inclusion: EEST tests were run on Hive, as well as via Assertoor, to trigger bugs locally before we spent time and effort on devnets. Client teams could also do so individually, with configs pinned in Discord. Hive additionally got a major update to both its UI and the way tests are run, reducing run times from ~24h down to ~4h while also being cheaper to run. This shift has largely been positive and we will likely continue to spend time improving Hive and catching bugs locally. However, this leads us to the first criticism - enforcing that tests pass means there will likely always be a gap between the fastest and the slowest client teams. This problem is exacerbated by the increase in clients we need to test (we had 8 clients to test during the Merge; now we have ~12, and the matrix of ELs and CLs is too large to test effectively locally). One potential approach is to select and enforce all tests for the major clients, while offering the infra and configs to smaller clients (allowing those teams to test their own client rather than the testing team doing it). We are currently looking for ACD input on how to proceed with the increased client load.
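
As a rough sketch of what catching these bugs locally looks like, a single-client Hive run against an EEST-based simulator might be invoked as below; the simulator path and client name are illustrative, and exact options depend on the Hive checkout in use.

```bash
# Build the Hive test runner from the repository root.
go build .

# Run an EEST-derived simulator against one client locally,
# instead of waiting for a devnet to surface the failure.
./hive --sim ethereum/eest/consume-engine \
       --client go-ethereum \
       --results-root ./results
```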

One of the major criticisms of the Pectra fork process has been the lack of a strong commitment to what Pectra would contain until quite a lot of time had passed. We are in favour of earlier commitments to the “core” EIPs that go into a fork, expanded later with tested features, allowing us more flexibility in choosing when to ship. For example, Pectra could have been fixed with a core of MaxEB, 7702, etc., with PeerDAS and EOF as strong contenders for features that could make it in. Then, a few months into the fork process, we could take a call on whether the core EIP set is expanded with PeerDAS/EOF or whether those go in the next fork. This would allow us more time to think about blockers for the core EIPs, rather than spending it on pre-emptively testing features. This point has been made by other client teams already and we won’t expand on it further; we do think the CFI/SFI + testing requirement changes to the ACD process adequately address it. We will, however, start explicitly assigning a person on the pandaOps team to be the point of contact for each feature that is CFI-ed, while providing best-effort support to non-CFI-ed features. That is: as an EIP champion you can work independently on EELS/EEST/CL specs to get a feature CFI-ed by ACD; pandaOps will then assign a person for devnet cycle/testing co-ordination, and ACD can judge the feature for SFI; SFI-ed EIPs will get our highest-priority support. This does mean the barrier for an EIP to become CFI-ed must be high in order to use resources effectively.

As highlighted by the recent Holesky incident, we seem to be neglecting tech debt/maintenance in favour of new features. We would propose treating tech debt as a feature that gets worked on alongside other features - i.e. it always takes up one slot in the CFI/SFI considerations and is part of the discussion about how many features are reasonable to CFI/SFI. For example, the non-finality devnet run in November already helped us understand that we need to be doing more of these; the issue was dev bandwidth, and the topic was benched until after Pectra went live. Allocating ongoing resources to this would ideally help prevent such oversights in the future.

One of the unwritten process changes over the years is the devnet-specific spec document. This has become a Schelling point for discussions in the testing calls and gives us an idea of the open PRs and immediate work that needs co-ordinating across teams. This has mostly been a success for short-term issues; however, the current blind spot is the longer-term work required to ship a fork/feature. That longer-term work could include a more research-y topic (e.g. moving cell proof computation to the tx sender in PeerDAS), i.e. a topic that has gone through the initial research thought cycles but has not yet landed as a concrete spec/PR that needs implementing. These types of tasks are currently badly captured in the spec document that the pandaOps team maintains, resulting in clunkier integration into the devnet cycle. The pandaOps team is also badly placed to understand the research co-ordination required to get such topics to the PR stage (we’re better suited to the PR → implementation/tests stage). The change we’d like to request is a new doc/process to capture such work, and the introduction of a technical project manager: a person who can own the research → PR pipeline and is focussed on shipping the overall feature. This person could of course cover multiple features, but the preference would be for one person per feature. The role could also be handled by the EIP champion; however, their skillset might not always overlap with that of a technical project manager - we might need to handle this case by case. This should give us a better overview of features and reduce the waiting time for implementers/testers.