EVM Object Format (EOF)

Hi all,

I was just looking at the dependencies for EIP3540, and it seems that EIP3540 requires EIP5450 which requires EIP4200 and EIP4750. That means EOFv1 couldn’t be launched without also including: (1) static branching opcodes (RJUMP, etc); (2) subroutine calls (CALLF, RETF). The reason for this is that the stack validation algorithm relies on static control flow. This is necessary because of the strict requirement that unreachable code is not permitted.

It seems to me that you could cut things differently to reduce the dependencies whilst still achieving the overall goal. Here’s an alternative:

Launch EIP3540 (EOFv1) and EIP5450 (Stack Code Validation) together on their own (i.e. without EIP4200 & EIP4750). In a subsequent fork, launch EIP4200 and EIP4750 (and perhaps e.g. EIP663) as non-breaking changes to EOFv1.

By “non-breaking changes” I mean that no EOF version change is required. We are just updating EOFv1 after the fact to make the release more incremental.

The current code validation rules (EIP5450) are a problem though. We need a code validation algorithm that does not rely on static branching. Instead, it needs to ensure arbitrary (junk) bytes are not permitted within a code section (otherwise e.g. EIP4200 is a breaking change). That’s easy enough, and doesn’t need reachability (it is, I believe, what the JVM does). Something like this:

Every byte within a code section is either a valid opcode for an instruction or part of the immediate operands of an instruction.

This prevents arbitrary bytes from existing within a code section. Yes, it means instructions may be unreachable. But it also means that EIP4200 or EIP4750 can be deployed after the fact as non-breaking changes. Furthermore, code validation requires a single linear scan (i.e. does not require a worklist algorithm as per EIP5450).
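
To make the single-scan idea concrete, here is a minimal sketch. The table `IMMEDIATE_SIZES` and the function `validate_code` are hypothetical names, and the table lists only a handful of legacy opcodes for illustration; it is not the opcode set of any EIP.

```python
# Hypothetical opcode table: opcode byte -> number of immediate bytes.
# Only a small subset of legacy EVM opcodes is listed, for illustration.
IMMEDIATE_SIZES = {
    0x00: 0,  # STOP
    0x01: 0,  # ADD
    0x50: 0,  # POP
    0x56: 0,  # JUMP
    0x57: 0,  # JUMPI
    0x5B: 0,  # JUMPDEST
    0xF3: 0,  # RETURN
}
# PUSH1..PUSH32 (0x60..0x7F) carry 1..32 immediate bytes.
IMMEDIATE_SIZES.update({0x60 + i: i + 1 for i in range(32)})


def validate_code(code: bytes, table: dict[int, int] = IMMEDIATE_SIZES) -> bool:
    """Single linear scan: every byte must be a known opcode or
    belong to the immediate operand of the preceding opcode."""
    pos = 0
    while pos < len(code):
        op = code[pos]
        if op not in table:
            return False  # junk byte in opcode position
        pos += 1 + table[op]  # skip the opcode and its immediates
    # pos > len(code) means the last instruction's immediates were truncated
    return pos == len(code)
```

Note that `validate_code(bytes([0x00, 0x60, 0x00, 0x50]))` is accepted even though everything after the STOP is unreachable. That is the trade-off described above: no junk bytes, but no reachability guarantee either.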

What compilers would target code-validated only EOF? How would that reduce total bytes?

I think the minimally viable variant is

  • EOF Container
  • Code Validation
  • Stack Validation
  • Static Jumps
  • Code Sections (with CALLFI/JUMPFI)
  • EIP-663 - EXCHANGE, EXCHANGE2, and DUPN
  • Data Opcodes
  • Easy opcode bans: CALLCODE, SELFDESTRUCT, PC
    For EOF1 we would still permit CREATE[2] from memory, with the restriction that only EOF containers can CREATE[2] EOF contracts, which I think was already in most big-EOF code.

Then we could do a follow on with EOF2 in a future hard fork, possibly wire it in early so L2s and such can use it:

  • Ban Code Introspections
    • Ban EXTCODE opcodes
    • Ban EXTCODE into EOF2 contracts
    • Add FACTORYCREATE and EXTCREATE opcodes (née CREATE3/4)
  • Ban Gas Observability
    • Ban old CALL series
    • Add LCALL, LDELEGATECALL, LSTATICCALL

This would be all the “breaking” changes (container and validation) and the MVP for compiler use in EOF1. Every compiler ask would ideally be in EOFv1 at launch.

EOF2 would just be (mostly) a different selection of opcodes with the same validation core logic. The subcontainer section would be added, but we could also activate FACTORYCREATE in EOF1 as well.

So we can do it in two steps, with only one set of validator logic between the two: simply a data-driven switch that changes the opcode table used for validation.
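
A rough sketch of what such a data-driven switch could look like. The names `OPCODE_TABLES` and `validate` are hypothetical, and the tables are heavily abbreviated; the only point is that the scan logic is shared and the version selects the table.

```python
# Hypothetical, heavily abbreviated tables: opcode byte -> immediate size.
COMMON = {0x00: 0, 0x01: 0, 0x50: 0, 0x60: 1}  # STOP, ADD, POP, PUSH1

OPCODE_TABLES = {
    1: {**COMMON, 0x3B: 0, 0xF1: 0},  # assumed EOF1: EXTCODESIZE and CALL still allowed
    2: dict(COMMON),                  # assumed EOF2: both banned
}


def validate(code: bytes, version: int) -> bool:
    """Same validation loop for every version; only the table differs."""
    table = OPCODE_TABLES[version]
    pos = 0
    while pos < len(code):
        op = code[pos]
        if op not in table:
            return False  # opcode unknown or banned in this version
        pos += 1 + table[op]
    return pos == len(code)
```

With these assumed tables, the sequence PUSH1 0, EXTCODESIZE, STOP passes as version 1 and is rejected as version 2, without any change to the validator code itself.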

What compilers would target code-validated only EOF?

Surely, it doesn’t matter whether or not they target it by default at the beginning? Presumably they would have an option for early adopters. The point of code-validated only EOF is to get EOF through the process and deployed.

Once this is done we can start rolling out non-breaking changes, such as static jumps and DUPN, SWAPN. With these instructions deployed, compilers would start to target EOF by default (as it would be in their interest). Even CALLF and RETF could be rolled out as a non-breaking change (provided that JUMP/JUMPDEST was retained). Then, later, JUMP / JUMPDEST are retired in a breaking change taking us to EOFv2 (possibly with other goodies such as removing code/gas introspection piggybacked on board).

All I am saying is that your MVP is not actually the MVP. There is another (smaller) option. The nice thing about this option is that it allows us to roll out non-breaking changes and, in doing so, incentivise compiler writers to adopt. It means they can take their time and there is less pressure. Anyway, it’s just a thought.

Personally, I wouldn’t include these in the EOFv1 proposal … because you don’t need to. Once EOFv1 is live, these can be deployed as non-breaking changes to EOFv1. Given that discussions are still ongoing about what the right combination of instructions is (SWAP_N, SWAP_N_M, etc) … it seems like a decision that can be deferred at no cost.

IIRC vyper considers EXCHANGE to be critical for avoiding code size regression. If it’s just something that adds more code size reduction instead of preventing regression then it can be moved out of the MVP.

If the compilers won’t target it, it’s not viable. I don’t want to ship the sequel to blake2b. We cannot assume that future forks changing EOF will indeed happen, so each step needs to be able to stand on its own. We can iterate with solidity and vyper to figure out where that line is.

Well, fair enough, I understand that. But once the container format + code validation is in place, the risk/reward trade-off for EIPs like EIP4200 and EIP663 is much more favourable. They become very minor updates.

EIP-4200, static jumps, is essential to the first draft. Without it we would need to keep the old JumpDest evaluation rules in. It also facilitates getting rid of mandatory jumpdests, which can be up to 20% of code size. Not having it in V1 means the code validator would need to be different in future versions.

It’s because of subtle interdependencies like this that the list looks trimmable but really isn’t for an MVP that compilers would actually target.

Yes, you use a lightweight code validator and retain JUMP / JUMPDEST in EOFv1. Then, phase in static jumps / subroutines / xchg as non-breaking changes to EOFv1. Finally, switch to EOFv2 as you have it (i.e. removing JUMP / JUMPDEST, etc). As part of moving to EOFv2 you can upgrade the code validator.

Look, I’m not trying to suggest this is an ideal approach. It’s not! Going straight to EOFv2 would be much nicer. But, you are considering two options (mega EOF vs staged EOFv1 => EOFv2), and I’m simply saying: there is a third option which gets you to the same place (EOFv2). This initially requires only two EIPs and, therefore, could potentially be launched alongside something else. If you don’t need to use this option, great; I’ll be very happy with that. But, having all the options on the table might be helpful.

This was the original approach: first deploying EIP-3540 and EIP-3670. However, the feedback then was that it is undesirable to have EOFv1 and EOFv2 both live in the future.

Well, I thought the EOFv1 specification included EIP-4200, EIP-4750, EIP-5450, and EIP-6206. It was a lot more than just two EIPs. I’m suggesting something even more lightweight which is literally just EIP-3540 and EIP-3670.

We need static jumps in there as well (EIP-4200), otherwise we are committing to an eternity of technical debt where there are variants of EOF that accept dynamic jumps.

There is no sensible world where we support dynamic jumps in EOF.

The problems with splitting off those others (especially stack validation - 5450) is we then have multiple validation scenarios for subsequent versions of EOF, all of which have to be maintained for eternity. Just like stripping off static jumps.

Requiring clients to maintain multiple EOF validation functions multiplies the surface through which consensus errors can enter. The impact is exponential, not additive.

We need static jumps in there as well (EIP-4200), otherwise we are committing to an eternity of technical debt where there are variants of EOF that accept dynamic jumps.

Well, if you go with EOFv1 first and then EOFv2 in a second fork you are committing to a similar level of technical debt (i.e. supporting two major EOF versions).

There is no sensible world where we support dynamic jumps in EOF.

Well, if that was the only way to get EOF across the line … would you do it? The end game (EOFv2) is the same in all cases (i.e. doesn’t have dynamic jumps).

The problems with splitting off those others (especially stack validation - 5450) is we then have multiple validation scenarios for subsequent versions of EOF, all of which have to be maintained for eternity. Just like stripping off static jumps.

There are options if maintaining multiple code validators is really a sticking point. For example, you could drop EIP5450 altogether or simplify it (e.g. by relaxing the reachability requirement). This EIP seems to be a nice to have, not a must have. Again, I’m not suggesting this is ideal (it isn’t) just that it seems possible. There are trade offs and compromising is never pretty. I appreciate you don’t believe the trade off makes sense in this case — fair enough.

In summary, let’s clarify the different strategies:

  • (One Stop) This is mega EOF. Do it all in one fork — great!

  • (Two Stop) This is EOFv1 in one fork, followed by EOFv2 in a subsequent fork.

  • (Three Stop) Initially two EIPs in one fork. Then, non-breaking changes in a subsequent fork taking us to a weakened version of EOFv1. Finally, in a third fork, you get to EOFv2.

The three stop strategy looks unattractive because it takes more stops (and for reasons you’re highlighting above). But, if you couldn’t get the two stop strategy started in Prague, but could get the three stop strategy started … then it works out the same. :man_shrugging:

Dynamic jumps must die. Nothing is more important.

EOF should not be released partially just to release it earlier. There is no urgent need for EOF, yet more versions mean more complexity. Building out a partial version with a useful lifespan of only a year is bad for sustainability.

How would you do a jump table or a switch statement then? With dynamic jumps it is constant complexity; without it is logarithmic.

With the RJUMPV instruction.
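
As a sketch of why this keeps the switch at constant cost: RJUMPV, as I understand EIP-4200, pops a case value and uses it to index a table of signed relative offsets stored in the instruction's immediate, falling through when the case is out of range. The function names below are hypothetical, and the byte-level encoding of the table is deliberately left out.

```python
def rjumpv_next_pc(pc_after: int, offsets: list[int], case: int) -> int:
    """Next program counter for a table-driven relative jump.

    pc_after: position of the instruction that follows the jump table
    offsets:  signed relative offsets, one per case (from the immediate)
    case:     value popped from the stack at run time
    """
    if 0 <= case < len(offsets):
        return pc_after + offsets[case]  # one table lookup, O(1)
    return pc_after  # out-of-range case falls through (default branch)


def static_targets(pc_after: int, offsets: list[int]) -> list[int]:
    """All possible jump targets, known without executing the code."""
    return [pc_after + off for off in offsets]
```

The second function is the part that matters for validation: every destination is visible in the code itself, so a validator can check all of them up front, which a dynamic JUMP does not allow.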

The current spec is now in https://github.com/ipsilon/eof/blob/main/spec/eof.md.
Patches welcome.

There has been an urgent need to be rid of dynamic jumps ever since the EVM was invented, and EOF Functions are the third or fourth proposal to be rid of them. Every time a proposal fails, a solution is delayed 2 or 3 years, people working on the problem move on, and new core devs have to be educated and convinced. It never ends.

The problem is simple. So long as the EVM has dynamic jumps you cannot traverse the control flow graph in linear time; it can take quadratic time. This should be obvious. When you hit a dynamic jump you cannot, in general, know the possible destinations without executing the code. The complexity becomes the product of the number of JUMPs and the number of JUMPDESTs. But anything that takes more than linear time is a DoS vulnerability unless it is done offline.
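
A small sketch of that product, under the worst-case assumption that an analysis cannot resolve any jump target and so must treat every JUMP/JUMPI as possibly reaching every JUMPDEST. The function name is hypothetical.

```python
def conservative_edge_count(code: bytes) -> int:
    """Number of jump edges in a worst-case control flow graph where
    each dynamic jump may target any JUMPDEST."""
    jumps = dests = 0
    pos = 0
    while pos < len(code):
        op = code[pos]
        if op in (0x56, 0x57):  # JUMP, JUMPI
            jumps += 1
        elif op == 0x5B:        # JUMPDEST
            dests += 1
        # PUSH1..PUSH32 (0x60..0x7F) carry 1..32 data bytes; skip them
        pos += 1 + (op - 0x5F if 0x60 <= op <= 0x7F else 0)
    return jumps * dests
```

For code made of n repetitions of JUMPDEST followed by JUMP, this gives n * n edges: doubling the code size quadruples the work. Real analysers can often resolve many targets, so this is an upper bound, but it is the bound an attacker gets to aim for.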

That means, among other things, that you cannot validate safety properties like the absence of stack underflow. Worse, from my point of view, you cannot translate EVM stack code to register code, either for faster interpretation or for compilation to machine code. And even when working offline, dynamic jumps cause problems: static analyses like symbolic execution become vulnerable to “quadratic path explosion,” and some approaches to zkEVMs are forced to resort to parallel execution.

Dynamic jumps must die. Now.

Will the final EOF proposal get an entirely new EIP number? I’ve searched around and can’t find one.