EIP-4750: EOF Functions

What are the advantages of this approach over EIP-2315? It would seem to be both less efficient and – by moving each function into its own section – get in the way of further optimization.

The meta-question is: what do we want to do with additional code sections? To me they seem most useful as a way of linking in library code as modules with defined interfaces.

Leaving the meta-question aside…

My biggest concern is that we wind up with new exceptional halting conditions (and new machine state and code to enforce them) when I’m trying to get rid of them. However, I’m pretty sure they can be enforced at validation time instead along the lines of EIP-3779.

My second biggest concern is that you can’t do tail call optimization. But that’s the price we pay for some useful structure. That’s part of why I’ve come to like having, in Intel’s terminology, both subroutines and procedures. These are well-defined procedures.

  1. Version where instead of designated type sections we encode inputs and outputs number as two first bytes of each code section.

I’d prefer something like this. It could generalize nicely to a more flexible section header.

having non-code bytes inside code sections would mean we have to be careful to not consider them executable

If the first byte is a new opcode the rest can be encoded as the immediate data of that opcode.
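To make the immediate-data idea concrete, here is a rough sketch of how a code walker would skip such a header. The `SECTION_TYPE` opcode and its value `0xC0` are hypothetical placeholders, not something defined in this EIP; the only real encoding used is that PUSH1..PUSH32 (0x60..0x7f) carry 1..32 immediate bytes.

```python
# Hypothetical header opcode: placeholder name and value, not assigned by any EIP.
SECTION_TYPE = 0xC0

# Number of immediate bytes following each opcode that has any.
IMMEDIATE_SIZE = {SECTION_TYPE: 2}  # assumed layout: inputs byte, outputs byte
IMMEDIATE_SIZE.update({0x60 + i: i + 1 for i in range(32)})  # PUSH1..PUSH32


def instruction_offsets(code: bytes) -> list[int]:
    """Return the offsets that hold opcodes, skipping all immediate data."""
    offsets = []
    pos = 0
    while pos < len(code):
        offsets.append(pos)
        pos += 1 + IMMEDIATE_SIZE.get(code[pos], 0)
    return offsets


# Header (2 inputs, 1 output), then PUSH1 0x5b, then a real 0x5b instruction.
code = bytes([SECTION_TYPE, 2, 1, 0x60, 0x5B, 0x5B])
offsets = instruction_offsets(code)
```

Because the header bytes are ordinary immediates, the same walk that already keeps PUSH data from being treated as executable covers them too, with no special case for "non-code bytes".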

I agree it might be less efficient compared to 2315, because the base pointer is additionally saved on the return stack; this is the price to pay for more runtime correctness guarantees. That is, the 4750 approach guarantees that a callee cannot read the caller’s stack, while 2315 allows this.
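A minimal sketch of that guarantee, as a toy model rather than the EIP's actual specification: the names `Machine`, `Frame`, `callf`, `retf`, and `base` are mine, and the checks are simplified assumptions about how a saved stack height fences off the caller's items.

```python
from dataclasses import dataclass, field


class ExceptionalHalt(Exception):
    """Raised where this toy model would halt exceptionally."""


@dataclass
class Frame:
    return_section: int  # section to resume in
    return_pc: int       # offset to resume at
    base: int            # operand-stack height that belongs to the caller


@dataclass
class Machine:
    stack: list = field(default_factory=list)         # operand stack
    return_stack: list = field(default_factory=list)  # return stack of Frames

    def base(self) -> int:
        # Height below which the current function may not reach.
        return self.return_stack[-1].base if self.return_stack else 0

    def pop(self):
        if len(self.stack) <= self.base():
            raise ExceptionalHalt("underflow into the caller's stack")
        return self.stack.pop()

    def callf(self, inputs: int, return_section: int = 0, return_pc: int = 0) -> None:
        if len(self.stack) - self.base() < inputs:
            raise ExceptionalHalt("not enough inputs for the callee")
        # The extra saved value: everything below it stays private to the caller.
        self.return_stack.append(Frame(return_section, return_pc, len(self.stack) - inputs))

    def retf(self, outputs: int) -> Frame:
        if not self.return_stack:
            raise ExceptionalHalt("empty return stack")
        if len(self.stack) - self.return_stack[-1].base != outputs:
            raise ExceptionalHalt("wrong number of outputs")
        return self.return_stack.pop()


m = Machine(stack=[1, 2, 3])
m.callf(inputs=1)  # the callee owns only the top item
top = m.pop()      # the callee's single input
try:
    m.pop()        # would reach into the caller's items
    leaked = True
except ExceptionalHalt:
    leaked = False
```

The per-pop comparison against `base()` is the runtime cost being discussed; a 2315-style subroutine has no such fence, so the second `pop` would simply succeed.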

And yes, in the future we should be able to get rid of these runtime underflow checks by using 3779-style validation. Then inefficiency goes away, too.

Tail call optimization should be possible with a special new opcode like TAILCALLF as @ekpyron noted above.

Overall, the 2315 approach is less restricted and, I guess, allows more funky optimizations.

And 4750 is more strict, with more runtime checks, which allows for simpler reasoning about bytecode and its structure, fewer edge cases in protocol rules, possibly easier to audit compilers’ code.

I can also see both approaches possibly co-existing (less restricted “subroutines” inside restricted “procedures”), if compiler authors find this complexity worthwhile.

I like this idea more than just bytes with a special meaning inside a code section (but it wastes precious opcode space).

I’ve roughly sketched out an extension to this proposal – EOF - Modules - HackMD – that allows for multiple entry points to each code section, mainly by having one type section for each code section. I’ve called these procedures – per Procedures for the EVM - HackMD – to distinguish them from the Simple Subroutines for the EVM - HackMD they are built on, and from the EIP-4750 functions defined here.

I’ve made a PR.

I closed this in favor of EIP-5450: EOF - Stack Validation. Thanks!

I still prefer that EOF code sections represent Modules containing multiple procedures rather than being a single Function. This allows for low-level optimizations within a module, but no control flow between modules except via defined interfaces. In my opinion modules provide a more useful level of packaging.

Multiple entry points can also be added in a future upgrade, so they are not at all a showstopper for me. Let’s just keep in mind that they do allow for inter-procedural optimizations, which single-entry code sections impede. Modules could also support linking libraries of separately-compiled code sections into programs, which is a traditional purpose of object file formats. I’ve closed this PR.

And 4750 is more strict, with more runtime checks, which allows for simpler reasoning about bytecode and its structure, fewer edge cases in protocol rules, possibly easier to audit compilers’ code.

From my point of view, leaving checks until runtime makes reasoning more difficult, since you don’t know for sure that a program won’t halt in those ways. But with EIP-5450: EOF - Stack Validation the constraints can mostly be checked at validation time. So I think this proposal should be made to require 5450, and almost all of the places that call for an exceptional halt should be changed to use “MUST”.

Hey, I notice that this EIP doesn’t include any requirement that when using JUMPF the function being jumped to has the same number of outputs as the current function. That seems like it could have some pretty odd results. I suspect there should be such a requirement.

It is validated at deploy-time, see Code Validation section of the spec:

  1. Code section is invalid in case an immediate argument of any JUMPF is such that type[callee_section_index].outputs != type[caller_section_index].outputs, i.e. it is allowed to only jump to functions with the same output type.
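As a rough illustration of that deploy-time rule (not the reference implementation), the check could look like the following. `SectionType` and `jumpf_targets_valid` are names I made up, and the bounds check on the target index is my own assumption; the quoted rule itself only covers the outputs comparison.

```python
from typing import NamedTuple


class SectionType(NamedTuple):
    inputs: int
    outputs: int


def jumpf_targets_valid(types: list[SectionType], caller_index: int,
                        jumpf_targets: list[int]) -> bool:
    """Check every JUMPF immediate found in the caller's code section."""
    caller_outputs = types[caller_index].outputs
    return all(
        0 <= target < len(types)                     # assumed: target must exist
        and types[target].outputs == caller_outputs  # the quoted rule
        for target in jumpf_targets
    )


types = [SectionType(0, 0), SectionType(2, 1), SectionType(1, 1), SectionType(1, 2)]
same_outputs = jumpf_targets_valid(types, caller_index=1, jumpf_targets=[2])
different_outputs = jumpf_targets_valid(types, caller_index=1, jumpf_targets=[3])
```

Since the check runs once over the immediates at validation time, a mismatched JUMPF never reaches execution.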

Oh, I see, I missed that, thanks!

Deprecating JUMPDEST analysis

For my understanding, does this refer to deprecating the JUMPDEST opcode itself, or just to a change in how Ethereum client implementations do JUMPDEST analysis?

The JUMPDEST analysis is what is deprecated, replaced with code and stack validation.

JUMPDEST becomes a NOOP code inside of EOF code (zero stack impact and no external changes on invocation).

Roger that, makes sense to me. Thank you for clarifying!

JUMPDEST (0x5b) instruction is renamed to NOP (“no operation”)

This nomenclature makes sense for EOF code, but if legacy contracts are to continue being executed, we need to retain JUMPDEST.

Or is the idea that EOF enables versioning of the opcode table, and therefore different versions will include different names for opcodes? So EOFv1’s version of the opcode table will see JUMPDEST replaced by NOP?

Noting also that EIP-5450 mentions the following:

Remark: We rely on the notions of operand stack and type section as defined by EIP-4750

However, I don’t see a clear definition of “operand stack” in this EIP.

Exactly. And EOF and legacy contracts can call each other.

Prior to EOF it’s just “the stack,” but with the addition of a return stack (in this EIP) it needed to be distinguished, hence “operand stack.” Other EIPs call what is taken off of “the stack” operands.
