EVM Object Format (EOF)

Data contracts should have INVALID as the only instruction in the code section to signal they are not executable.

Thanks @holiman @jochem-brouwer for your comments and questions.
Can you check whether this spec re-wording change makes it any better?

Can I just confirm that the statement below, from the “Contract Creation restrictions” section of the EIP,

This adds two validation steps in the contract creation, any of it failing will result in contract creation failure.

implies that when an execution client is performing the deploy validation of the EOF1 container/header as part of a contract creation transaction, if the code is invalid, the contract should just not be created (but no gas charged for this contract creation failure), not that an exceptional abort should occur and all remaining gas should be consumed. Is that correct (per this comment on the related tests PR from @gumb0)?

The EIP wording has been updated based on the feedback. No functional changes. See “EIP-3540: container and validation spec clarification, new code” by chfast (Pull Request #4822, ethereum/EIPs).

Data contracts should have INVALID as the only instruction in the code section to signal they are not executable.

I think the requirement that an EOF1 container requires a code section (or any sections at all) is a bit restrictive. In light of SSTORE2-like patterns, “data-only” contracts might be quite popular, and requiring a code section even with a single instruction incurs (if I am reading correctly) 4 bytes of overhead.

I suggest that the spec be modified so that the code section is not required. CALL-ing an EOF1 container with no code section has the same semantics as calling a legacy contract with no code. In other words, if the code section is not explicitly included, the code of the container is implied to be empty.

(I originally considered that an EOF1 container with no explicit code section should imply code with a single 0xFE (INVALID) instruction. But that has some weird consistency issues with EXTCODEHASH and EXTCODESIZE).

I also suggest enabling completely empty EOF1 containers, e.g. EOF1 containers whose contents are EF0001 00. It may be desirable to deploy empty contracts for some reason (e.g. as part of some address marking scheme, especially if a GETNONCE instruction ever becomes available).
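To make the byte counts above concrete, here is a minimal sketch in Python. It assumes the container layout this suggestion appears to have in mind (magic `EF00`, a version byte, then one 1-byte kind plus 2-byte size per section, then a `00` terminator, then section contents). The kind values and the `build_container` helper are illustrative assumptions, not taken from the EIP text.

```python
# Sketch only: assumed layout is magic, version, (kind, size)*, terminator, contents.
MAGIC = bytes.fromhex("EF00")
VERSION = bytes([0x01])
TERMINATOR = bytes([0x00])
KIND_CODE = 0x01  # assumed kind value for the code section
KIND_DATA = 0x02  # assumed kind value for the data section


def build_container(code: bytes = b"", data: bytes = b"") -> bytes:
    """Assemble a container, emitting a section header only for non-empty sections."""
    headers = b""
    body = b""
    for kind, content in ((KIND_CODE, code), (KIND_DATA, data)):
        if content:
            # 1-byte kind followed by a 2-byte big-endian size
            headers += bytes([kind]) + len(content).to_bytes(2, "big")
            body += content
    return MAGIC + VERSION + headers + TERMINATOR + body


payload = b"\xAA" * 32
with_invalid = build_container(code=b"\xFE", data=payload)  # INVALID-only code section
data_only = build_container(data=payload)                   # suggested: no code section
empty = build_container()                                   # suggested: no sections at all

print(empty.hex())                         # ef000100
print(len(with_invalid) - len(data_only))  # 4
```

Under these assumptions the mandatory code section costs 3 header bytes plus 1 byte for the `INVALID` instruction, which is where the 4 bytes come from.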

As a somewhat concluding step we propose EIP-5450: EOF - Stack Validation (building on top of all previous EOF EIPs) to reduce the number of checks to be done in the interpreter loop.

New EIP idea to remove already deployed invalid EOF code.

If an account is called and its code starts with EF but is not valid EOF, the execution is done normally (it will result in an exception), but additionally the account’s code is deleted (or, alternatively, the account is added to the selfdestruct list).

This is similar to touching “empty” accounts in order to remove them from the state.

The disadvantage is that we have to validate EOF code before every execution. However, after all invalid EOF code is deleted from the state the EIP can be silently disabled.
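A rough sketch of the clean-up rule, in Python, might look like the following. The dictionary-based state, the `call_with_cleanup` function and the toy validator are all hypothetical stand-ins; a real client would plug in its full EOF validation and its own state interface.

```python
from typing import Callable, Dict


def call_with_cleanup(
    state: Dict[str, bytes],
    address: str,
    is_valid_eof: Callable[[bytes], bool],
) -> bool:
    """Apply the proposed rule on a call; return True if the code was deleted."""
    code = state.get(address, b"")
    if code[:1] == b"\xEF" and not is_valid_eof(code):
        # The call itself would fail when executing the code, as it does today.
        # The extra step proposed here is to drop the code afterwards.
        state[address] = b""
        return True
    return False


def toy_is_valid_eof(code: bytes) -> bool:
    """Stand-in validator: only checks the EF00 magic and version 01 prefix."""
    return code[:3] == bytes.fromhex("EF0001")


state = {
    "0xaa": bytes.fromhex("EF"),        # starts with EF, not valid EOF
    "0xbb": bytes.fromhex("60006000"),  # legacy code, untouched
}
deleted_aa = call_with_cleanup(state, "0xaa", toy_is_valid_eof)
deleted_bb = call_with_cleanup(state, "0xbb", toy_is_valid_eof)
```

The validation call on every execution is the cost mentioned above; once no account matches the `EF`-but-invalid condition, the branch is never taken.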

The Mainnet has only 3 such contracts but there are rumors that some L2s have more.

I’m not a client dev so my intuition on this may be off, but it feels like an irregular state change to just remove those 3 accounts would be the simpler option.

EIP-3540 [Rationale]:
Finally, create transaction must be allowed to contain legacy initcode and deploy legacy code because otherwise there is no transition period allowing upgrading transaction signing tools. Deprecating such transactions may be considered in future.

I’m strongly opposed to breaking legacy initcode (e.g. UniswapV2Factory), but deprecation, perhaps via gas discrimination, could be beneficial to encourage adoption of EOF once there is a demonstrated performance benefit.

There are other reasons to deploy bytecode besides executing it. I once used it to store a large amount of calldata for a subsequent operation. It is also useful for multisigs, but my usage was for storing verified bids in a crowd-liquidation system. This allows the finalization step to operate without knowing the details of the operation.

It’s unfortunate that EOF uses bytecode for the formatting and not the high bits of the eth balance.

I put up a clarification for EIP-3540 (EOFv1): “EIP-3540: Clarify contract creation failure” by chfast (Pull Request #5878, ethereum/EIPs).

This is also a functional change, because the geth implementation differs and geth’s stance is therefore reflected in the state tests. So I assume many implementations followed this.

Why do we specify the number of code sections in two places:

  1. In the type 1 EIP-4750 section header, where we specify <code sections>*4 for the length of that section in bytes.
  2. In the type 2 code header, where we specify the number of code sections whose length we’ll specify

It gives us the possibility in the future to relax the strict ordering of the header sections. If in the type 2 code header we used the value from the type 1 section, there would always be a requirement that the type header must precede the code header.
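As an illustration of why the redundancy helps, here is a small Python sketch of a consistency check that does not care which header section was read first. The field names and the `counts_are_consistent` helper are made up for this example; the 4-byte entry size is the figure from the question above.

```python
from typing import List, Tuple, Union

TYPE_ENTRY_SIZE = 4  # bytes per type-section entry, per the question above

HeaderField = Tuple[str, Union[int, List[int]]]


def counts_are_consistent(header_fields: List[HeaderField]) -> bool:
    """Check type_size against the number of code sections, in any header order."""
    fields = dict(header_fields)
    type_size = fields["type_size"]
    code_sizes = fields["code_sizes"]
    return type_size == TYPE_ENTRY_SIZE * len(code_sizes)


# Same header content, read in two different orders.
type_first = [("type_size", 8), ("code_sizes", [10, 20])]
code_first = [("code_sizes", [10, 20]), ("type_size", 8)]

ok_type_first = counts_are_consistent(type_first)
ok_code_first = counts_are_consistent(code_first)
mismatch = counts_are_consistent([("type_size", 12), ("code_sizes", [10, 20])])
```

Because each section header carries its own count, either one can be parsed on its own and the cross-check can run once both are known.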

Published a lengthier discussion starter about a large changeset called “EOFv2”:

Adding a comment here as an acknowledgement that we’re following along with this EIP at Art Blocks as it relates to how we now do on-chain storage of artists’ generative art scripts using our BytecodeStorage library.

We have this issue on our end tracking this and have filed issues on the two common “SSTORE2” library implementations that we are aware of flagging this as well:

No response to this comment is expected, just flagging here for visibility.

Hi everyone,

I would like to submit an EOF-related proposal. (Previously posted on the R&D Discord, but now here upon recommendation.) I have read the EIPs and have hopefully not missed anything.

EOF1 contracts can only DELEGATECALL EOF1 contracts

Motivation:
Currently contracts can selfdestruct in three different ways (directly through SELFDESTRUCT, indirectly through CALLCODE and indirectly through DELEGATECALL). EIP-3670 disables the first two possibilities; however, the third possibility remains. Allowing EOF1 contracts to only DELEGATECALL other EOF1 contracts allows the following strong statement: an EOF1 contract can never be destructed.

Specification:
When an EOF1 contract performs a DELEGATECALL the target contract has to be EOF1. If it is not EOF1 (e.g. it is EOF0 or EOF2), the DELEGATECALL exceptionally halts. Hence, (among other things) all the gas passed along is consumed and 0 is pushed onto the stack. DELEGATECALL to an empty code also fails.

Security Implications:
Attacks based on SELFDESTRUCT simply disappear for EOF1 contracts. These include:

Backwards Compatibility:
No backwards compatibility is broken, as EOF is newly introduced. In theory EOF1 contracts could use EOF0 libraries using DELEGATECALL, but that seems relatively far-fetched.

Complexity:
The check is relatively simple. Hence, no changes to the gas cost of DELEGATECALL would be needed and the implementation overhead should not be prohibitive.
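For illustration, the check from the Specification section could be sketched in Python roughly as below. Both function names are hypothetical, and the version detection only looks at the `EF00` magic and the version byte; it assumes deployed code with that prefix has already passed EOF validation at creation time.

```python
def eof_version(code: bytes) -> int:
    """Return 0 for legacy (EOF0) or empty code, else the byte after the EF00 magic."""
    if len(code) >= 3 and code[:2] == b"\xEF\x00":
        return code[2]
    return 0


def delegatecall_allowed(caller_code: bytes, target_code: bytes) -> bool:
    """Proposed rule: an EOF1 caller may only DELEGATECALL an EOF1 target."""
    if eof_version(caller_code) != 1:
        return True  # the rule only constrains EOF1 callers
    return eof_version(target_code) == 1


eof1 = bytes.fromhex("EF0001") + b"\x00"  # placeholder bytes, not a valid container
legacy = bytes.fromhex("6000")

eof1_to_eof1 = delegatecall_allowed(eof1, eof1)      # allowed
eof1_to_legacy = delegatecall_allowed(eof1, legacy)  # exceptional halt under the proposal
eof1_to_empty = delegatecall_allowed(eof1, b"")      # also fails, as specified
legacy_to_eof1 = delegatecall_allowed(legacy, eof1)  # unaffected by the rule
```

When the function returns False, the proposal is that the DELEGATECALL halts exceptionally: the gas passed along is consumed and 0 is pushed onto the stack.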

Please let me know if I should provide more clarifications, expand, write a PR or move this elsewhere.

I think the biggest takeaway from this is that if we don’t restrict DELEGATECALL, it becomes an escape hatch for any feature we banned in EOFv1. SELFDESTRUCT and CALLCODE are the only ones it really impacts right now, but it establishes a pattern.

PC, JUMP, and JUMPI are not escapable because their scope is only on the EVM code.

Note that if the calling contract doesn’t matter then a regular CALL could be used to access the features at extra cost. If we banned ECRECOVER from EOFv1 all we would be doing is just increasing the cost of that precompile to include a cold account load for the host contract. Currently none of the precompiles depend on the caller so this is actually the current state for all precompiles. We couldn’t ban a precompile in EOF and expect contracts not to find a way to use it.

So I’m personally in favor of this.

During the Edelweiss interop there have been numerous discussions around EOF. We explored how to properly reduce introspection, and discussed potential roadmaps. This document tries to give a glimpse into that process:

Would it make sense to have the code section sizes in the type section rather than in the header?

At the time of writing, the header is dynamically sized based on the number of code sections, which complicates header validation. In addition, this splits up function metadata a bit. We need to check the header to get the function’s size, then the type section to get the function’s inputs, outputs, and max stack depth, then finally the code section to get the function’s instructions.

If the code size (u16) is stored in the type section, then we could have all of the function metadata in the same place.


Current container:

container  := header, body
header := magic, version, kind_type, type_size, kind_code, num_code_sections, code_size+, kind_data, data_size, terminator
body := type_section, code_section+, data_section
type_section := (inputs, outputs, max_stack_height)+

Proposed container:

container  := header, body
header := magic, version, kind_type, type_size, kind_code, num_code_sections, kind_data, data_size, terminator
body := type_section, code_section+, data_section
type_section := (inputs, outputs, max_stack_height, size)+

With the proposed schema, the header will always be 13 bytes, simplifying header parsing and allowing functions to be validated as the type section is parsed without having to refer back to the header.
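A quick way to sanity-check the 13-byte figure is the Python sketch below. It assumes 2 bytes for the magic, 1 byte for the version, each `kind_*` marker and the terminator, and 2 bytes (u16, big-endian) for each size or count field. The kind values in the sample header are placeholders, and the helper names are made up for this example.

```python
import struct

# Proposed header: magic, version, kind_type, type_size, kind_code,
# num_code_sections, kind_data, data_size, terminator (assumed widths).
PROPOSED_HEADER = struct.Struct(">2sBBHBHBHB")


def current_header_size(num_code_sections: int) -> int:
    """Current layout: the fixed fields plus one u16 code_size per code section."""
    return PROPOSED_HEADER.size + 2 * num_code_sections


def parse_proposed_header(raw: bytes) -> dict:
    """Unpack the fixed-size proposed header from the start of a container."""
    (magic, version, kind_type, type_size, kind_code,
     num_code_sections, kind_data, data_size, terminator) = PROPOSED_HEADER.unpack(
        raw[:PROPOSED_HEADER.size]
    )
    return {
        "magic": magic,
        "version": version,
        "type_size": type_size,
        "num_code_sections": num_code_sections,
        "data_size": data_size,
        "terminator": terminator,
    }


# Placeholder kind values 01/02/03; one code section, empty data section.
sample = bytes.fromhex("EF00 01 01 0004 02 0001 03 0000 00")
parsed = parse_proposed_header(sample)

print(PROPOSED_HEADER.size)    # 13
print(current_header_size(3))  # 19
```

With a fixed-size header, a single unpack covers it regardless of how many functions the container holds, and the per-function sizes are then read alongside the inputs, outputs, and max stack height in the type section.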

It seems like this wasn’t posted here, but since December we have had a “Unified EOF Specification” (Unified EOF specification - HackMD) describing the changes of EIP-3540/3670/4200/4750/5450/6206.

After the Edelweiss Interop discussions we have posted a rollout discussion document.

This week the above two have been merged into a single specification: the “Mega EOF Endgame Specification”.

It explains all the changes needed to achieve banning of code and gas introspection/observability.