EVM Object Format (EOF)

Thanks for the response!

It actually complicates matters, so removing it would be helpful. The other two contracts are annoying too, but cause less issue.

Reviving this now, after having reviewed the EIP a bit. My notes:

The terminator is not particularly well specified. The only thing the EIP says is that “If the terminator is encountered, section size MUST NOT follow.” The>

I suggest that the “Validation rules” are clarified further re terminator.

Example cases. Which of these are ‘valid’ and which are not?
In each case, where it’s not valid, there should be a corresponding rule in the ‘Validation rules’.

// 1.
// (0-size code section, with terminator, no code)

0xEF0001 0x01 0x0000 0x00

// 2.
// (0-size code section, no terminator, no code)

0xEF0001 0x01 0x0000

// 3.
// 0-size code, 0-size data, with terminator

0xEF0001 0x01 0x0000 0x02 0x0000 0x00

// 4.
// 0-size code, 0-size data, no terminator

0xEF0001 0x01 0x0000 0x02 0x0000

// 5.
// 1-size code, 0-size data
0xEF0001 0x01 0x0001 0x02 0x0000 0x00 0xEF


// 6.
// 1-size code,16-size data, but actual code is smaller than that. Does the 'infinite field of zeroes' apply?
0xEF0001 0x01 0x0000 0x02 0x0010 0x00 0xdada


Thanks for the comment.

The intended meaning is that the terminator byte is mandatory. But I agree that the specification is clear neither about this nor about what the terminator is.

By this reasoning, all examples without a terminator are invalid (2, 4).

It is also specified that "section_size MUST NOT be 0". We decided to include this rule to avoid having two encodings for the same thing, an empty section. If a section is empty, this forces omitting its header as well. This makes (1, 3, 5) invalid.

Finally, we also wanted all bytes of a section to be present (no ‘infinite field of zeroes’), but it looks like this rule is not articulated. That would make 6 invalid. We also don’t allow any bytes outside of the sections specified by headers: “Stray bytes outside of sections MUST NOT be present. This includes trailing bytes after the last section.”

In summary, an EOF container requires: no implicit bytes, no additional bytes, and the shorter encoding whenever two options are possible. The specification does not express this perfectly yet and we will apply fixes.
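The rules summarised above can be sketched as a small validator. This is only an illustration of the rules as stated in this thread, not the EIP's reference code: the function name `validate_container` and the reason strings are made up here, section ordering is not checked, and the code section contents are not inspected.

```python
MAGIC = bytes.fromhex("EF00")
VERSION = 0x01
KIND_TERMINATOR, KIND_CODE, KIND_DATA = 0x00, 0x01, 0x02


def validate_container(container: bytes) -> str:
    """Return "ok" or a short reason why the container is rejected.

    Hypothetical helper: it encodes only the rules discussed in this thread.
    """
    if container[:2] != MAGIC:
        return "bad magic"
    if len(container) < 3 or container[2] != VERSION:
        return "bad version"
    pos = 3
    sizes = {}  # section kind -> declared size
    while True:
        if pos >= len(container):
            return "missing terminator"  # the terminator is mandatory
        kind = container[pos]
        pos += 1
        if kind == KIND_TERMINATOR:
            break
        if kind not in (KIND_CODE, KIND_DATA) or kind in sizes:
            return "unknown or duplicate section kind"
        if pos + 2 > len(container):
            return "truncated section size"
        size = int.from_bytes(container[pos:pos + 2], "big")
        pos += 2
        if size == 0:
            return "zero section size"  # empty sections must omit their header
        sizes[kind] = size
    if KIND_CODE not in sizes:
        return "no code section"
    body_len = len(container) - pos
    declared = sum(sizes.values())
    if body_len < declared:
        return "section body truncated"  # no implicit bytes
    if body_len > declared:
        return "stray trailing bytes"  # no additional bytes
    return "ok"


# The six examples from the question above, in order.
examples = [
    "EF0001 01 0000 00",
    "EF0001 01 0000",
    "EF0001 01 0000 02 0000 00",
    "EF0001 01 0000 02 0000",
    "EF0001 01 0001 02 0000 00 EF",
    "EF0001 01 0000 02 0010 00 dada",
]
for number, text in enumerate(examples, start=1):
    print(number, validate_container(bytes.fromhex(text)))
```

Because the sketch stops at the first violation, the examples that also lack a terminator are reported with the zero-size reason; either way all six are rejected, which matches the answer above.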

Thanks. A follow-up question/observation then.

  • If a code section must be present,
  • And a section must not be empty,

Then it’s not possible to deploy a data-only contract. This is IMO a side-effect which deserves mention in the EIP.

This was actually desired, though it is still possible to have a contract with a single instruction in the code section (such as INVALID) to have a determined execution path. Agree we should document it.

As a further step to replace dynamic jumps, we propose EIP-4750: EOF Functions. This could be adopted together with EIP-4200: Static relative jumps to remove the need for dynamic jumps.

I cannot deploy an EOF where codesize is zero. What if I want to deploy an EOF where I only want the data section and no code?

  1. If PC goes outside of the code section bounds, execution aborts with failure.

This is not entirely in line with current semantics when PC runs out of code. In a contract where PC goes out of bounds, the STOP instruction is executed. I am also fairly sure that if a PUSH cannot read all of its data (e.g. the contract 60, a lone PUSH1, where it is not clear what data should be pushed), STOP is executed as well.

The current EIP changes this behavior: running out of bounds, or executing a PUSH where it is not clear which bytes should be pushed, now goes OOG. Is this intended?

EIP-3670 requires that the code section ends with a terminating instruction. This makes running out of bounds impossible and makes code ending with truncated PUSH data invalid.
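As a rough illustration of that deploy-time check, the sketch below scans a code section once, skipping PUSH immediates, and rejects code that ends in truncated PUSH data or whose last instruction is not terminating. The function name `check_code_ending` is made up for this example, the set of terminating opcodes (STOP, RETURN, REVERT, INVALID, SELFDESTRUCT) is my reading of EIP-3670, and the rest of that EIP's validation (for instance rejecting undefined opcodes) is left out.

```python
# Assumed terminating opcodes: STOP, RETURN, REVERT, INVALID, SELFDESTRUCT.
TERMINATING = {0x00, 0xF3, 0xFD, 0xFE, 0xFF}
PUSH1, PUSH32 = 0x60, 0x7F


def check_code_ending(code: bytes) -> str:
    """Return "ok" or the reason the code section would be rejected."""
    pos = 0
    last_op = None
    while pos < len(code):
        last_op = code[pos]
        pos += 1
        if PUSH1 <= last_op <= PUSH32:
            # Skip the immediate data: PUSH1 carries 1 byte, PUSH32 carries 32.
            pos += last_op - PUSH1 + 1
    if pos > len(code):
        return "truncated PUSH data"
    if last_op not in TERMINATING:
        return "no terminating instruction"
    return "ok"


print(check_code_ending(bytes.fromhex("60")))      # the lone PUSH1 from the question
print(check_code_ending(bytes.fromhex("6000")))    # PUSH1 0x00, then code runs out
print(check_code_ending(bytes.fromhex("600000")))  # PUSH1 0x00, STOP
```

With such a check applied at deploy time, the two situations from the question never reach the interpreter, so the difference between an implicit STOP and a failure cannot be observed for EOF code.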

Data contracts should have INVALID as the only instruction in the code section to signal they are not executable.

Thanks @holiman @jochem-brouwer for your comments and questions.
Can you check whether this spec rewording makes it any better?

Can I just confirm that below from the “Contract Creation restrictions” section of the EIP

This adds two validation steps in the contract creation, any of it failing will result in contract creation failure.

implies that when an execution client is performing the deploy validation of the EOF1 container/header as part of a contract creation transaction, if the code is invalid, the contract should just not be created (but no gas charged for this contract creation failure), not that an exceptional abort should occur and all remaining gas should be consumed. Is that correct (per this comment on the related tests PR from @gumb0)?

The EIP wording has been updated based on the feedback. No functional changes. EIP-3540: container and validation spec clarification, new code by chfast · Pull Request #4822 · ethereum/EIPs · GitHub

Data contracts should have INVALID as the only instruction in the code section to signal they are not executable.

I think the requirement that an EOF1 container requires a code section (or any sections at all) is a bit restrictive. In light of SSTORE2-like patterns, “data-only” contracts might be quite popular, and requiring a code section even with a single instruction incurs (if I am reading correctly) 4 bytes of overhead.
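To make the 4 bytes concrete, here is a small sketch that builds a container from the layout discussed in this thread (magic and version, section headers, terminator, section bodies) and compares a data contract carrying a single INVALID instruction with a data-only container. The helper names are made up for this example, and the data-only layout is hypothetical: it is what is being suggested here, not something the current spec allows.

```python
MAGIC_AND_VERSION = bytes.fromhex("EF0001")
INVALID = bytes.fromhex("FE")


def section_header(kind: int, body: bytes) -> bytes:
    """One kind byte followed by a 2-byte big-endian size."""
    return bytes([kind]) + len(body).to_bytes(2, "big")


def build_container(code: bytes, data: bytes) -> bytes:
    """Hypothetical builder: empty sections are omitted, header included."""
    headers = b""
    bodies = b""
    for kind, body in ((0x01, code), (0x02, data)):
        if body:
            headers += section_header(kind, body)
            bodies += body
    return MAGIC_AND_VERSION + headers + b"\x00" + bodies


payload = bytes.fromhex("dada")
with_code = build_container(INVALID, payload)
data_only = build_container(b"", payload)  # not valid under the current rules

print(with_code.hex())                  # ef000101000102000200fedada
print(len(with_code) - len(data_only))  # 4
```

The difference is the 3-byte code section header plus the 1-byte INVALID instruction.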

I suggest that the spec be modified so that the code section is not required. CALL-ing an EOF1 container with no code section has the same semantics as calling a legacy contract with no code. In other words, if the code section is not explicitly included, the code of the container is implied to be empty.

(I originally considered that an EOF1 container with no explicit code section should imply code with a single 0xFE (INVALID) instruction. But that has some weird consistency issues with EXTCODEHASH and EXTCODESIZE).

I also suggest enabling completely empty EOF1 containers, e.g. EOF1 containers whose contents are EF0001 00. It may be desirable to deploy empty contracts for some reason (e.g. as part of some address marking scheme, especially if a GETNONCE instruction ever becomes available).

As a somewhat concluding step we propose EIP-5450: EOF - Stack Validation (building on top of all previous EOF EIPs) to reduce the number of checks to be done in the interpreter loop.

New EIP idea to remove already deployed invalid EOF code.

If an account is called and its code starts with EF but is not valid EOF, the execution is done normally (it will result in an exception), but additionally the account’s code is deleted (or, alternatively, the account is added to the selfdestruct list).

This is similar to touching “empty” accounts in order to remove them from the state.

The disadvantage is that we have to validate EOF code before every execution. However, after all invalid EOF code is deleted from the state the EIP can be silently disabled.
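The call-time decision could look roughly like the sketch below. Both names are hypothetical: `delete_code_on_call` stands for the proposed rule, and `stub_is_valid_eof` is only a placeholder that checks the three-byte EF0001 prefix, whereas a real client would run full EOF validation at that point.

```python
from typing import Callable


def delete_code_on_call(code: bytes, is_valid_eof: Callable[[bytes], bool]) -> bool:
    """True if the called account's code should be scheduled for deletion."""
    return code.startswith(b"\xef") and not is_valid_eof(code)


def stub_is_valid_eof(code: bytes) -> bool:
    # Placeholder assumption: accepts anything with the EF0001 prefix.
    return code.startswith(bytes.fromhex("EF0001"))


print(delete_code_on_call(bytes.fromhex("6000"), stub_is_valid_eof))  # legacy code
print(delete_code_on_call(bytes.fromhex("EFAA"), stub_is_valid_eof))  # EF-prefixed, not EOF
```

The validator is passed in as a parameter so the cost noted above is visible: it has to run on every call to EF-prefixed code until the rule is disabled.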

The Mainnet has only 3 such contracts but there are rumors that some L2s have more.

I’m not a client dev so my intuition on this may be off, but it feels like an irregular state change to just remove those 3 accounts would be the simpler option.

EIP-3540 [Rationale]:
Finally, create transaction must be allowed to contain legacy initcode and deploy legacy code because otherwise there is no transition period allowing upgrading transaction signing tools. Deprecating such transactions may be considered in future.

I’m strongly opposed to breaking legacy initcode (eg UniswapV2Factory), but deprecation, perhaps via gas discrimination, could be beneficial to encourage adoption of EOF, once there is a demonstrated performance benefit.

There are other reasons to deploy bytecode besides executing it. I once used it to store a large amount of calldata for a subsequent operation. It is also useful for multisigs, but my usage was for storing verified bids in a crowd-liquidation system. This allows the finalization step to operate without knowing the details of the operation.

It’s unfortunate that EOF uses bytecode for the formatting and not the high bits of the eth balance.

I put up a clarification for EIP-3540 (EOFv1): EIP-3540: Clarify contract creation failure by chfast · Pull Request #5878 · ethereum/EIPs · GitHub

This is also a functional change, because the geth implementation differs and the geth stance is therefore reflected in the state tests. So I assume many implementations followed it.