One of the arguments against any EVM changes is that it’s much harder to remove features from the EVM than to add them (eg. the complexities around even removing a little-used opcode like SELFDESTRUCT), and so if the EVM keeps changing, ever-increasing ugliness and complexity is likely to be the outcome.
One way to greatly reduce this tradeoff is to find a way to automatically convert version n EVM code to version n+1 EVM code every time there is an upgrade (not necessarily immediately; perhaps convert when old code is “touched”, and make sure that all version n code is converted to version n+1 before attempting to implement version n+2).
But there are difficulties in the current EVM that make conversion hard:
- Dynamic jumps, which generate code coordinates to jump to at run time, making it hard to transform code
- CODECOPY, EXTCODECOPY and EXTCODEHASH, which read code directly
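
To make the convert-on-touch idea and the code-reading problem concrete, here is a minimal sketch. Everything in it is hypothetical: `Account`, `translate_v1_to_v2` and `touch` are illustrative names, the translator body is a placeholder rather than a real opcode rewriter, and SHA-256 stands in for keccak256. The point is only that once code is converted, anything that previously read the code or its hash now sees a different value.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Account:
    code: bytes
    version: int


CURRENT_VERSION = 2


def translate_v1_to_v2(code: bytes) -> bytes:
    """Placeholder translator: swaps the leading version byte and appends a
    byte. A real translator would rewrite opcodes while preserving behavior."""
    return b"\x02" + code[1:] + b"\x00"


def touch(account: Account) -> Account:
    """Convert old code lazily, the first time the account is accessed."""
    if account.version < CURRENT_VERSION:
        account.code = translate_v1_to_v2(account.code)
        account.version = CURRENT_VERSION
    return account


def code_hash(account: Account) -> str:
    # Stand-in for keccak256; used only to show that the hash changes.
    return hashlib.sha256(account.code).hexdigest()


acct = Account(code=b"\x01\x60\x00", version=1)
hash_before = code_hash(acct)
touch(acct)
hash_after = code_hash(acct)

# A contract that stored hash_before (e.g. via EXTCODEHASH) would no longer
# match after the conversion, even though the behavior is meant to be equal.
hash_changed = hash_before != hash_after
```

If other contracts can observe `hash_before`, the conversion is no longer invisible to them, which is why code-reading opcodes get in the way of automatic conversion.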
EOF is an upgrade to the EVM, and so it has the downsides that I mentioned. But there is one way to adjust EOF to make it much better in this regard, by setting the stage for a system where any future EVM upgrades do not have these problems, and so force-conversion becomes possible:
Ban EOF-formatted code from being read with CODECOPY, CODESIZE, EXTCODECOPY, EXTCODESIZE and EXTCODEHASH.
Fortunately, EOF bans dynamic jumps already, making code transformations easier. But banning code reading would let us go all the way. If we decide to change from the EVM to some other VM (eg. WASM, Cairo…) in the future, it would be possible to automatically transform EVM code into code of the new VM that has equivalent functionality.
Specific changes that would be needed would be:
- Remove CODECOPY and CODESIZE from the EIP-3670 valid opcode list
- The EXTCODECOPY opcode would check if the code it is reading starts with the EIP-3541 magic byte. If it does, it would:
  - Option 1: act as if the code is zero
  - Option 2: raise an exception
- The EXTCODEHASH and EXTCODESIZE opcodes, when acting on code that starts with the EIP-3541 magic byte, can be treated in two different ways:
  - Option 1a: return zero
  - Option 1b: throw an exception
  - Option 2: no change, but we make a commitment that the EXTCODEHASH and EXTCODESIZE opcodes return the keccak and size of the full code, and these values may change as code gets upgraded
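
The options above can be sketched as plain functions over raw code bytes. This is an illustration under stated assumptions, not client code: `Policy`, `CodeIntrospectionError` and the function signatures are made up for this example, and "act as if the code is zero" is read here as "treat the code as empty", which yields all-zero output because out-of-range code reads are zero-padded.

```python
from enum import Enum

EOF_MAGIC = 0xEF  # leading byte that EIP-3541 rejects for newly deployed code


class Policy(Enum):
    ZERO = "option 1a: return zero"
    RAISE = "option 1b: throw an exception"
    NO_CHANGE = "option 2: return the real value"


class CodeIntrospectionError(Exception):
    """Hypothetical exception standing in for an exceptional halt."""


def is_eof(code: bytes) -> bool:
    return len(code) > 0 and code[0] == EOF_MAGIC


def extcodecopy(code: bytes, offset: int, size: int, raise_on_eof: bool) -> bytes:
    if is_eof(code):
        if raise_on_eof:
            raise CodeIntrospectionError("EXTCODECOPY on EOF code")
        code = b""  # Option 1, as interpreted here: behave as if code is empty
    chunk = code[offset:offset + size]
    return chunk + b"\x00" * (size - len(chunk))  # zero-pad short reads


def extcodesize(code: bytes, policy: Policy) -> int:
    if is_eof(code):
        if policy is Policy.RAISE:
            raise CodeIntrospectionError("EXTCODESIZE on EOF code")
        if policy is Policy.ZERO:
            return 0
    return len(code)


legacy = b"\x60\x00"
eof = b"\xef\x00\x01"

copied_legacy = extcodecopy(legacy, 0, 4, raise_on_eof=False)
copied_eof = extcodecopy(eof, 0, 4, raise_on_eof=False)
size_zero = extcodesize(eof, Policy.ZERO)
size_full = extcodesize(eof, Policy.NO_CHANGE)
```

EXTCODEHASH would follow the same three-way shape as `extcodesize`, with the hash of the full code in place of its length.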
Some optional additions (which could be added later) include:
- The code reading opcodes could have their functionality changed to read the data section of the code, or the empty string if the data section is absent (EIP-3540 gives EOF-formatted contracts the right to have up to one data section)
- A CREATE4 opcode that copies the code of an existing address (in a similar way to how DELEGATECALL works), though it could still use a memory slice for the data field. The “recommended” pattern for developers would be that new code templates would get pushed with a manual transaction, and anything automated would just copy a template. Use cases like creating lots of contracts with small modifications (eg. user wallets with different public keys) would be accomplished with this data field.
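
A rough model of the CREATE4 pattern, assuming a toy state that maps addresses to a (code, data) pair. The names `Contract` and `create4`, the explicit `new_address` argument, and the way the data field is stored are all assumptions made for illustration; the post does not specify address derivation or the exact layout.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Contract:
    code: bytes
    data: bytes = b""


def create4(state: Dict[str, Contract], template: str,
            new_address: str, data: bytes) -> str:
    """Create a contract whose code is copied from `template`, with its own
    data field taken from a memory slice (modelled here as `data`)."""
    if template not in state:
        raise KeyError(f"no template at {template}")
    if new_address in state:
        raise ValueError(f"address {new_address} already in use")
    state[new_address] = Contract(code=state[template].code, data=data)
    return new_address


# The template is pushed once (by a manual transaction in the proposed pattern).
state: Dict[str, Contract] = {"0xTEMPLATE": Contract(code=b"\xef\x00\x01")}

# Automated creation only copies the template and varies the data field,
# e.g. user wallets with different public keys.
create4(state, "0xTEMPLATE", "0xWALLET_A", data=b"pubkey-A")
create4(state, "0xTEMPLATE", "0xWALLET_B", data=b"pubkey-B")

same_code = state["0xWALLET_A"].code == state["0xWALLET_B"].code
different_data = state["0xWALLET_A"].data != state["0xWALLET_B"].data
```

Because the creator never handles the code bytes directly, this pattern stays compatible with a ban on code reading.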