EVM Object Format (EOF)

EIP-3540 [Rationale]:
Finally, create transaction must be allowed to contain legacy initcode and deploy legacy code because otherwise there is no transition period allowing upgrading transaction signing tools. Deprecating such transactions may be considered in future.

I’m strongly opposed to breaking legacy initcode (eg UniswapV2Factory), but deprecation, perhaps via gas discrimination, could be beneficial to encourage adoption of EOF, once there is a demonstrated performance benefit.


There are other reasons to deploy bytecode besides executing it. I once used it to store a large amount of calldata for a subsequent operation. It is also useful for multisigs, but my usage was for storing verified bids in a crowd-liquidation system. This allows the finalization step to operate without knowing the details of the operation.

It’s unfortunate that EOF uses bytecode for the formatting and not the high bits of the eth balance.


I opened a clarification for EIP-3540 (EOFv1): “EIP-3540: Clarify contract creation failure” by chfast (Pull Request #5878, ethereum/EIPs on GitHub).

This is also a functional change, because the geth implementation differs and geth's stance is therefore what the state tests reflect. So I assume many implementations followed it.

Why do we specify the number of code sections in two places:

  1. In the type 1 EIP-4750 section header, where we specify <code sections>*4 as the length of that section in bytes.
  2. In the type 2 code header, where we specify the number of code sections whose lengths we’ll specify.

It gives us the possibility in the future to relax the strict ordering of the header sections. If the type 2 code header used the value from the type 1 section, there would always be a requirement that the type header precede the code header.


Published a lengthier discussion starter about a large changeset called “EOFv2”:


Adding a comment here as an acknowledgement that we’re following along with this EIP at Art Blocks as it relates to how we now do on-chain storage of artists’ generative art scripts using our BytecodeStorage library.

We have this issue on our end tracking this and have filed issues on the two common “SSTORE2” library implementations that we are aware of flagging this as well:

No response to this comment is expected, just flagging here for visibility.

Hi everyone,

I would like to submit an EOF-related proposal. (Previously posted on the R&D Discord, but now here upon recommendation.) I have read the EIPs and have hopefully not missed anything.

EOF1 contracts can only DELEGATECALL EOF1 contracts

Motivation:
Currently, contracts can selfdestruct in three different ways (directly through SELFDESTRUCT, indirectly through CALLCODE and indirectly through DELEGATECALL). EIP-3670 disables the first two possibilities; however, the third remains. Allowing EOF1 contracts to DELEGATECALL only other EOF1 contracts allows the following strong statement: an EOF1 contract can never be destructed.

Specification:
When an EOF1 contract performs a DELEGATECALL the target contract has to be EOF1. If it is not EOF1 (e.g. it is EOF0 or EOF2), the DELEGATECALL exceptionally halts. Hence, (among other things) all the gas passed along is consumed and 0 is pushed onto the stack. DELEGATECALL to an empty code also fails.
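To make the rule concrete, here is a minimal Python sketch of the check described above. It is only an illustration of the proposal, not client code: the helper names `eof_version` and `delegatecall_target_allowed` are hypothetical, and the only facts taken from EIP-3540 are the magic bytes 0xEF00 followed by a one-byte version.

```python
EOF_MAGIC = b"\xef\x00"  # EOF magic bytes defined by EIP-3540


def eof_version(code: bytes) -> int:
    """Return the EOF version byte, or 0 for legacy code (EOF0 in the proposal)."""
    if len(code) >= 3 and code[:2] == EOF_MAGIC:
        return code[2]
    return 0


def delegatecall_target_allowed(caller_code: bytes, target_code: bytes) -> bool:
    """Hypothetical check run before an EOF1 caller's DELEGATECALL proceeds.

    False stands for the proposed exceptional halt: the forwarded gas is
    consumed and 0 is pushed onto the stack.
    """
    if eof_version(caller_code) != 1:
        return True  # the proposal only restricts EOF1 callers
    # Empty code reports version 0 here, so it is rejected as well.
    return eof_version(target_code) == 1
```

A client would presumably run such a check after loading the target account's code and before creating the call frame; where exactly it sits is left open by the proposal.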

Security Implications:
Attacks based on SELFDESTRUCT simply disappear for EOF1 contracts. These include:

Backwards Compatibility:
No backwards compatibility is broken, as EOF is newly introduced. In theory, EOF1 contracts could use EOF0 libraries via DELEGATECALL, but that seems relatively far-fetched.

Complexity:
The check is relatively simple. Hence, no changes to the gas cost of DELEGATECALL would be needed and the implementation overhead should not be prohibitive.

Please let me know if I should provide more clarifications, expand, write a PR or move this elsewhere.


I think the biggest takeaway from this is that if we don’t restrict DELEGATECALL, it becomes an escape hatch for using any feature we banned in EOFv1. SELFDESTRUCT and CALLCODE are the only ones it really impacts right now, but it establishes a pattern.

PC, JUMP, and JUMPI are not escapable because their scope is only on the EVM code.

Note that if the calling contract doesn’t matter then a regular CALL could be used to access the features at extra cost. If we banned ECRECOVER from EOFv1 all we would be doing is just increasing the cost of that precompile to include a cold account load for the host contract. Currently none of the precompiles depend on the caller so this is actually the current state for all precompiles. We couldn’t ban a precompile in EOF and expect contracts not to find a way to use it.

So I’m personally in favor of this.


During the Edelweiss interop there were numerous discussions around EOF. We explored how to properly reduce introspection, and discussed potential roadmaps. This document tries to give a glimpse into that process:

Would it make sense to have the code section sizes in the type section rather than in the header?

At the time of writing, the header is dynamically sized based on the number of code sections, which complicates header validation. In addition, this splits up function metadata a bit. We need to check the header to get the function’s size, then the type section to get the function’s inputs, outputs, and max stack depth, then finally the code section to get the function’s instructions.

If the code size (u16) is stored in the type section, then we could have all of the function metadata in the same place.


Current container:

container  := header, body
header := magic, version, kind_type, type_size, kind_code, num_code_sections, code_size+, kind_data, data_size, terminator
body := type_section, code_section+, data_section
type_section := (inputs, outputs, max_stack_height)+

Proposed container:

container  := header, body
header := magic, version, kind_type, type_size, kind_code, num_code_sections, kind_data, data_size, terminator
body := type_section, code_section+, data_section
type_section := (inputs, outputs, max_stack_height, size)+

With the proposed schema, the header will always be 13 bytes, simplifying header parsing and allowing functions to be validated as the type section is parsed without having to refer back to the header.
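A rough Python sketch of what single-pass parsing could look like under the proposed schema. The field widths are assumptions on my part: one byte for each kind marker, the version and the terminator, and two bytes for each size or count, which is what adds up to the 13 bytes mentioned above. The 6-byte type entry (the existing 4-byte entry plus a u16 size) and the name `parse_proposed` are likewise hypothetical.

```python
import struct

# magic, version, kind_type, type_size, kind_code, num_code_sections,
# kind_data, data_size, terminator (widths assumed as described above)
PROPOSED_HEADER = struct.Struct(">2sBBHBHBHB")
# inputs, outputs, max_stack_height, size (the u16 size is the proposed addition)
PROPOSED_TYPE_ENTRY = struct.Struct(">BBHH")


def current_header_size(num_code_sections: int) -> int:
    """Size of the current header: the fixed fields plus one u16 code_size per section."""
    return PROPOSED_HEADER.size + 2 * num_code_sections


def parse_proposed(container: bytes):
    """Return (functions, data_size), reading the header once and then the type section."""
    (magic, _version, _kind_type, type_size, _kind_code, num_code,
     _kind_data, data_size, _terminator) = PROPOSED_HEADER.unpack_from(container, 0)
    if magic != b"\xef\x00":
        raise ValueError("not an EOF container")
    if type_size != num_code * PROPOSED_TYPE_ENTRY.size:
        raise ValueError("type_size does not match num_code_sections")

    functions = []
    entry_pos = PROPOSED_HEADER.size
    code_pos = entry_pos + type_size  # first code section follows the type section
    for _ in range(num_code):
        inputs, outputs, max_stack, size = PROPOSED_TYPE_ENTRY.unpack_from(container, entry_pos)
        functions.append({"inputs": inputs, "outputs": outputs,
                          "max_stack_height": max_stack,
                          "code_offset": code_pos, "code_size": size})
        entry_pos += PROPOSED_TYPE_ENTRY.size
        code_pos += size
    return functions, data_size
```

Each function's signature, size and code offset come out of the same loop iteration, which is the point of the proposal. The sketch does not validate kind markers or the terminator.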

It seems like this wasn’t posted here, but since December we have had a “Unified EOF Specification” (Unified EOF specification - HackMD) describing the changes of EIP-3540/3670/4200/4750/5450/6206.

After the Edelweiss Interop discussions we have posted a rollout discussion document.

This week the above two have been merged into a single specification: the “Mega EOF Endgame Specification”.

It explains all the changes needed to achieve banning of code and gas introspection/observability.

For anyone interested in joining the discussions, there are bi-weekly calls called “EOF Implementers Call”. The next one is EOF Implementers Call #11 (Issue #748 in ethereum/pm on GitHub).

I think this was discussed during the header format discussions in December, but I can’t remember the reasons, perhaps @gumb0 or @matt can?

It would bring certain benefits for sure, the reasons we decided against something like this were, I think:

  • Desire to keep the section header definitions general (and future-proof) enough, avoiding very special treatment of some sections. So currently we have generally just two kinds of sections: single-instance sections, defined by one size in the header, and multiple-instance sections, defined by an array of sizes.
  • It seems like a useful property to be able to find the start and end of any section after parsing only the header, without the need to parse any of the section bodies. The information about the structure of the container is encapsulated in the header.

Note also that with the new creation instructions proposal we extend the format with another array of sections - container sections - and it is similarly defined as number + array of sizes in the header.
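The header-only property from the second bullet can be sketched in Python like this, for the current header layout quoted earlier in the thread. The field widths (one byte per kind marker, version and terminator; two bytes per size or count) are assumed, and `section_spans` is a hypothetical helper that does no validation.

```python
import struct


def section_spans(container: bytes):
    """Locate every section of a current-format container from the header alone.

    Returns (start, end) byte offsets; no section body is ever read.
    """
    pos = 3  # skip magic (2 bytes) and version (1 byte)
    _kind_type, type_size = struct.unpack_from(">BH", container, pos)
    pos += 3
    _kind_code, num_code = struct.unpack_from(">BH", container, pos)
    pos += 3
    code_sizes = struct.unpack_from(">%dH" % num_code, container, pos)
    pos += 2 * num_code
    _kind_data, data_size = struct.unpack_from(">BH", container, pos)
    pos += 3
    pos += 1  # terminator; the body starts here

    spans = {"type": (pos, pos + type_size), "code": []}
    pos += type_size
    for size in code_sizes:
        spans["code"].append((pos, pos + size))
        pos += size
    spans["data"] = (pos, pos + data_size)
    return spans
```

The trade-off discussed here is visible in the `code_sizes` line: the header length depends on `num_code`, but afterwards all offsets are known without touching the type section.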

Is the type section strictly meant to contain subroutine metadata, or can it be extended to allow for arbitrary multi-instance sections? If the latter, then adding container sections can follow the same pattern.

While finding the section start/end from just the header seems useful, it seems negligible to parse the header and subsequent type section for this particular use case, and a dynamic header seems to be more of a challenge for single pass parsers than a benefit. It can be done either way, but the current way seems more complex for implementors.

Howdy!

I was looking into the “Mega EOF Endgame” Specification (HackMD) and EIP-3540: EOF - EVM Object Format v1, and it was a bit unclear to me when, EIP-wise, the following items from the megathread are targeted:

  • If the target account of EXTCODECOPY is an EOF contract, then it will copy 0 bytes.
  • If the target account of EXTCODEHASH is an EOF contract, then it will return 0x9dbf3648db8210552e9c4f75c6a1c3057c0ca432043bd648be15fe7be05646f5 (the hash of EF00, as if that would be the code).
  • If the target account of EXTCODESIZE is an EOF contract, then it will return 2.
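For readers who prefer code, the three rules above could be modelled roughly as follows in Python. This is a sketch of my reading of the list, not of any client: the function names mirror the opcodes but are hypothetical, the hash constant is copied from the bullet above rather than recomputed, `keccak256` is supplied by the caller because the Python standard library does not provide it, and the legacy branches ignore special cases such as empty or non-existent accounts.

```python
from typing import Callable

EOF_MAGIC = bytes.fromhex("ef00")
# Value quoted in the list above (described there as the hash of EF00).
EF00_HASH = bytes.fromhex(
    "9dbf3648db8210552e9c4f75c6a1c3057c0ca432043bd648be15fe7be05646f5"
)


def is_eof(code: bytes) -> bool:
    return code[:2] == EOF_MAGIC


def extcodesize(code: bytes) -> int:
    # EOF targets look as if their code were just the two magic bytes.
    return 2 if is_eof(code) else len(code)


def extcodehash(code: bytes, keccak256: Callable[[bytes], bytes]) -> bytes:
    return EF00_HASH if is_eof(code) else keccak256(code)


def extcodecopy(code: bytes, offset: int, size: int) -> bytes:
    """Return the bytes that would be copied from the target's code."""
    if is_eof(code):
        return b""  # "copies 0 bytes" for EOF targets
    chunk = code[offset:offset + size]
    return chunk + bytes(size - len(chunk))  # legacy: zero-pad past the end
```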

Is the EIP that this is targeted for still TBD?

IIUC this is not the behavior in effect as of EIP-3540 itself, based on this section from the EIP:

  • EXTCODECOPY/EXTCODESIZE/EXTCODEHASH with the EOF target contract - works as with legacy target contract
    • EXTCODESIZE returns the size of entire target container
    • EXTCODEHASH returns the hash of entire target container
    • EXTCODECOPY can copy from target’s code section
    • EXTCODECOPY can copy from target’s data section
    • EXTCODECOPY can copy from target’s EOF header
    • EXTCODECOPY can copy entire target container
    • Results don’t differ when executed inside legacy or EOF contract

However, I’m not sure if there are other EIPs targeted for the same hard fork that would have this impact, or if that is a later stage.

Not sure if this is better suited to ask here or in the Core Devs Discord, so opted to post here and cross-link in the Discord – my apologies if I missed it in my search here and there is a better place to post this.

Correct, these changes are not EIPified yet.

They are targeted for the same fork, so this part of EIP-3540 can be viewed as outdated.

Fantastic – thank you for clarifying!

Including here for broader visibility our plans for upgrading/migrating our contracts for on-chain art storage to support the EOF v1 hardfork plans.

Primarily sharing this here for visibility, perhaps for other on-chain art teams who may stumble upon this EIP discussion thread, but if any folks have feedback as to how we may be misunderstanding the EOF hardfork path here, please do reach out here in Discourse, via Twitter (I’m @purphat), or in the ETH R&D Discord (I’m purplehat.eth#7327)