EVM instruction set versioning

version

#1

eip:
title: EVM version instruction
author: Andreas Olofsson <@androlo>
discussions-to: EVM instruction set versioning
status: Draft
type: Standards Track
category: Core
created: -

This is currently a bit of a pick-and-choose proposal, which is why it has not yet been submitted as a proper EIP.

Simple Summary

This document proposes that the ABI of contract code be extended to include an EVM (instruction set) target version number, or ID, that allows the EVM to pick the correct instruction set when executing contracts.

Abstract

This document is a proposal to add support for multiple EVM instruction set versions, and for manually specifying the desired target in code. A version number would be added to the EVM instruction set, which would be updated for every change that would break (i.e. alter the behavior of) contracts that are already deployed. The old instruction set would still remain after a version update, making it possible to pick which one to use. Additionally, a new instruction would be added to reserve a certain opcode for safe ABI formatting and for making it possible to view the target version of a contract from code.

Motivation

It is vital that the EVM can undergo changes without making previously deployed contracts break. New instructions and new functionality can sometimes be added without breaking old code, but sometimes that is not possible - like when opcodes have to be remapped or removed.

We are already dealing with these kinds of issues; for example, CALLCODE is still around despite having been replaced by DELEGATECALL - which was assigned its own new opcode - and the other call instructions still have the return position and size parameters in their parameter lists even though they were made obsolete by the new returndata system. Ideally, one could think, the current DELEGATECALL instruction should instead have been re-mapped to CALLCODE, and the other call instructions would be changed to new versions that do not have the redundant parameters.

The changes proposed here would address all the issues above and any similar issues that may arise in the future.

Specification

Instruction

Name/Mnemonic

TARGET

Opcode

TBD

Parameters

address - the address of the target account.

Result

Pushes the target version ID of the account with address ‘address’ onto the stack.

ABI update

Contract code (init and body sections) must be preceded by 2 bytes - the TARGET opcode followed by a 1 byte target ID.
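As a rough illustration of the two-byte prefix described above, here is a minimal Python sketch that splits a code blob into its target ID and the remaining code. The opcode value 0xB0 is only a placeholder, since the proposal leaves the opcode TBD, and the function name is made up for this example.

```python
from typing import Optional, Tuple

# Placeholder value: the proposal leaves the TARGET opcode as TBD.
TARGET_OPCODE = 0xB0


def split_target_prefix(code: bytes) -> Optional[Tuple[int, bytes]]:
    """Return (target_id, remaining_code) if the code starts with
    TARGET followed by a 1-byte target ID, otherwise None."""
    if len(code) < 2 or code[0] != TARGET_OPCODE:
        return None
    return code[1], code[2:]
```

Code that lacks the prefix yields None here; how the EVM treats that case depends on whether the code is being created or is already deployed.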

EVM version ID

Target IDs start at 1 and are incremented by 1 each time an update is made.

If an account is not a contract account the instruction could either return 0 or cause the EVM to revert (TBD).
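To make the intended behavior of the instruction concrete, here is a small Python sketch of TARGET acting on a stack. It is not a real EVM implementation: the account store is a plain dict from address to code, the opcode value is a placeholder, and for non-contract accounts it picks the "return 0" option, which the proposal still lists as TBD. Code without the prefix is treated as target 1, following the rule for already deployed contracts given under "Changes to CREATE".

```python
from typing import Dict, List

TARGET_OPCODE = 0xB0      # placeholder: the opcode is TBD in the proposal
LEGACY_TARGET_ID = 1      # ID assigned to code without the TARGET prefix
NO_CONTRACT_RESULT = 0    # assumption: one of the two TBD options


def op_target(stack: List[int], accounts: Dict[int, bytes]) -> None:
    """Pop an address and push the target ID of that account."""
    address = stack.pop()
    code = accounts.get(address, b"")
    if not code:
        # Not a contract account (no code).
        stack.append(NO_CONTRACT_RESULT)
    elif len(code) >= 2 and code[0] == TARGET_OPCODE:
        stack.append(code[1])
    else:
        # Deployed before the change, so no prefix is present.
        stack.append(LEGACY_TARGET_ID)
```

If the revert option were chosen instead, the first branch would raise rather than push a value.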

Changes to CREATE

Contracts that are already deployed, and therefore do not start with the TARGET opcode, would be assigned a target ID of 1.

To ensure that future contracts conform to the new standard, contract creation has to be modified. During creation, the EVM must revert if:

  • the contract initialization code does not have a target set.
  • the contract runtime code does not have a target set.
  • either of the target IDs is invalid.
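The three revert conditions above could be checked roughly as follows. This is only a sketch: the proposal does not define what makes a target ID invalid, so the code assumes "valid" means between 1 and the latest version known to the client. The opcode value, the exception type and the function names are all hypothetical.

```python
from typing import Tuple

TARGET_OPCODE = 0xB0  # placeholder: the opcode is TBD in the proposal


class CreationReverted(Exception):
    """Stands in for the EVM reverting contract creation."""


def read_target(code: bytes, latest_target_id: int) -> int:
    """Return the target ID of a code section or raise CreationReverted."""
    if len(code) < 2 or code[0] != TARGET_OPCODE:
        raise CreationReverted("code does not have a target set")
    target_id = code[1]
    # Assumption: valid IDs are 1..latest_target_id inclusive.
    if not 1 <= target_id <= latest_target_id:
        raise CreationReverted(f"invalid target ID {target_id}")
    return target_id


def check_creation(init_code: bytes, runtime_code: bytes,
                   latest_target_id: int) -> Tuple[int, int]:
    """Validate both sections and return (init_id, runtime_id)."""
    return (read_target(init_code, latest_target_id),
            read_target(runtime_code, latest_target_id))
```

Note that nothing here requires the init and runtime sections to use the same target ID; the proposal does not say whether they must match.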

Rationale

The reason for placing the version at the very start of the code is to make it easy for the EVM to find and use. The EVM routine for getting and checking the target ID would be trivial.

A one-byte target ID should be plenty, given that most updates are backwards compatible and therefore do not require a version change.
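A sketch of what such a routine might look like, in Python, including the step of picking an instruction set from the ID. The opcode value is a placeholder and the version 2 table is purely hypothetical: it only illustrates the kind of remapping mentioned in the motivation (DELEGATECALL taking over the CALLCODE opcode). The version 1 entries use the opcodes those two instructions have today.

```python
from typing import Dict

TARGET_OPCODE = 0xB0   # placeholder: the opcode is TBD in the proposal
LEGACY_TARGET_ID = 1   # code without the prefix runs on version 1

# Tiny excerpts of per-version opcode tables (opcode -> mnemonic).
INSTRUCTION_SETS: Dict[int, Dict[int, str]] = {
    1: {0xF2: "CALLCODE", 0xF4: "DELEGATECALL"},
    2: {0xF2: "DELEGATECALL"},  # hypothetical remapping, not specified
}


def get_target_id(code: bytes) -> int:
    """Read the target ID from the first two bytes of the code."""
    if len(code) >= 2 and code[0] == TARGET_OPCODE:
        return code[1]
    return LEGACY_TARGET_ID


def select_instruction_set(code: bytes) -> Dict[int, str]:
    """Pick the opcode table that the given code should run on."""
    target_id = get_target_id(code)
    if target_id not in INSTRUCTION_SETS:
        raise ValueError(f"unknown target ID {target_id}")
    return INSTRUCTION_SETS[target_id]
```

With this layout the same byte 0xF2 decodes to different instructions depending on the target of the contract being executed.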

The ability to view the target of another contract may become very useful. Old contracts will inevitably become less and less safe as the EVM evolves - particularly if proposals like EIP 615 are implemented. This proposal would make it possible for contract writers to avoid calling contracts that are potentially less safe.

Some possible changes

The reason for not adding the target ID as a separate field is to avoid complicating the account data-structure, although doing that may actually be more practical. If that is the case then a lot of these suggestions could be scrapped.

Obviously, the big job here would be to change the EVM to allow multiple instruction sets to be chosen from. The purpose of the instruction is mainly to reserve a certain opcode for target version, which would make the starting sequence distinguishable from other code, i.e. it would always be possible to tell whether code is on the new ABI format or not. If the target version is instead stored in its own field, that would remove most of the motivation for this instruction.

In the above case, the ABI could instead be modified to have only the version number byte in front of the code when CREATE is called. The version number would then be stripped out by the EVM and added to the reserved field before the code is actually run. A drawback to this is that it would be more difficult to check that the input is well formed.
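A minimal Python sketch of this alternative, assuming a hypothetical account structure with a dedicated target field. The names and the validity rule (IDs from 1 up to the latest known version) are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Account:
    target_id: int  # the reserved field holding the version
    code: bytes     # code with the version byte stripped out


def create_with_version_byte(payload: bytes, latest_target_id: int) -> Account:
    """Strip the leading version byte and store it in its own field."""
    if len(payload) < 1:
        raise ValueError("missing version byte")
    target_id, code = payload[0], payload[1:]
    # Assumption: valid IDs are 1..latest_target_id inclusive.
    if not 1 <= target_id <= latest_target_id:
        raise ValueError(f"invalid target ID {target_id}")
    return Account(target_id=target_id, code=code)
```

The drawback shows up immediately: old-format code consisting of ADD (0x01) followed by STOP (0x00) would be accepted as version 1 code containing only STOP, because there is no reserved opcode to tell the two formats apart.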

Backwards Compatibility

Adding this instruction in accordance with the spec would not cause backwards compatibility issues.

Test Cases

None.

Implementation

For EVM designers to decide.

Copyright

Copyright and related rights waived via CC0.


#2

This is interesting and might also be important for different networks that use the EVM. Makes me think of chainID https://chainid.network