EIP-3779: Safer Control Flow for the EVM

Abstract

This EIP specifies validation rules for some important safety properties, including

  • valid jump destinations,
  • valid instructions,
  • no stack underflows, and
  • no stack overflows without recursion.

Valid contracts will not halt with an exception unless they either run out of gas or overflow stack during a recursive subroutine call.

Code is validated at contract creation time – not runtime – by the provided algorithm. This is a one-pass algorithm, linear in the size of the bytecode, so as not to be a DoS surface.

2 Likes

Am I correct that is is intended to be a possible improvement under EOF (EIP3540) but that the exact mechanism through which this would be integrated with EOF is not currently specified?

In the end this just specifies a few more rules, in the same way that EIP-3690 “extends contact creation validation rules (as defined in EIP-3540).” I should be more clear about that.

The validation algorithm (or its equivalent) would presumably be run at code-validation time, as defined in EIP-3540. I should be more clear about that. Logically, it could be run after the code given there is run. For performance reasons an implementation would probably integrate the code into a single pass.

3690 extends the rules defined in 3670. The basic validation rules are defined by 3670 and not 3540.

For clarity I suggest to provide a reference implementation in Python which extends the validation of 3670 the same way 3690 does.

Aha. I’ve been using Go for example implementations because I don’t know Python, and when I write the reference implementation it will almost surely be a draft PR against Geth.

Anyway, this PR would render 3670 unnecessary - all jump destinations are validated, and no jumpdest table is needed. However, EIPs generally should not reference other EIPs that are still Drafts. We are making an exception for 3540 because it is almost sure to go in. 3670 is more controversial.

3670 does not introduce a jumpdest table (3690 does), so not sure what do you mean.

3540/3670/3690 are all in Review and not Draft.

Why is 3670 controversial?

3540/3670/3690 are all in Review and not Draft.

Good, I and I think @MicahZoltu I missed that, or it changed since his review.

Why is 3670 controversial?

Sorry - I meant 3690. 3670 is good. 3690 I don’t like at all.

I’ve added an EOF container section with a jumptable which gives valid jump destinations for every dynamic jump. This extends the range of valid programs without sacrificing tractable validation.

I’ve added an EOF container section with a jumptable which gives valid jump destinations for every dynamic jump. This extends the range of valid programs without sacrificing tractable validation.

I would prefer this extension to have a separate opcode pair (JUMPTABLE / JUMPTABLEI ?), and maybe to be proposed in a separate EIP.
Now it seems like two different semantics (that will need two different implemantations) conflated into JUMP/JUMPI.

Some more comments / confusions:

Valid contracts will not halt with an exception unless they either run out of gas or overflow stack during a recursive subroutine call.

As I understand here by subroutines you mean subroutines as they’re currently generated by compilers (with dynamic jumps).

So recursive call to a subroutine can overflow, but why wouldn’t be the same possible with a simple loop doing PUSH ?

A new EOF section called vjumptable (section_kind = 4 ) is introduced. It contains a sequence of n tuples (jumpsrc, jumpdesti_, sorted in ascending lexicographic order.

Please specify how jumpsrc and jumpdest are encoded (16-bit unsigned integer?)

  1. Every JUMP and JUMPI either
  • matches at least one tuple in the vjumptable or the vtraptable, or
  • is static

First, as noted above these two cases I think should better be different opcodes.

Then, for the tabulated jump, I think validation should check:

  • each jumpsrc and jumpdest in both tables points to an instruction (is within code bounds and doesn’t point to immediate)
    • potentially we could restrict jumpdest to point to JUMPDEST opcode for consistency. Not sure it’s worth it, but it probably will simplify breaking into blocks?
    • jumpsrc could alse be additionally restricted to point to a tabulated jump instruction
  • each tabulated jump instruction always has exactly one item in vtraptable with corresponding jumpsrc.
  • number of items in vjumptable for each jumpsrc is not contstained (can be 0)

Your algorithm code doesn’t seem to check all potential destinations of tabulated jumps, did you intend to support switch-like control flow with this at all?

… subroutines as they’re currently generated by compilers (with dynamic jumps).

Yes, as limited by the rules. I want to loosen up the rules enough to capture most of what Solidity generates.

… So recursive call to a subroutine can overflow, but why wouldn’t be the same possible with a simple loop doing PUSH ?

Because that breaks the constant stack depth rule.

… A new EOF section called vjumptable (section_kind = 4 ) is introduced. It contains a sequence of n tuples (jumpsrc, jumpdesti_, sorted in ascending lexicographic order.

Please specify how jumpsrc and jumpdest are encoded (16-bit unsigned integer?)

The whole table layout needs to change, so this will get fixed, thanks. Probably 2 MSB-first bytes, unsigned.

  1. Every JUMP and JUMPI either
  • matches at least one tuple in the vjumptable or the vtraptable, or
  • is static

First, as noted above these two cases I think should better be different opcodes.

My desire in this EIP is not to introduce new opcodes, but to constrain the existing opcodes so that they are safer (as defined) without breaking too much existing code that is already safe.

Then, for the tabulated jump, I think validation should check:

  • each jumpsrc and jumpdest in both tables points to an instruction (is within > code bounds and doesn’t point to immediate)
  • potentially we could restrict jumpdest to point to JUMPDEST opcode for consistency. Not sure it’s worth it, but it probably will simplify breaking into blocks?
  • jumpsrc could alse be additionally restricted to point to a tabulated jump instruction
  • each tabulated jump instruction always has exactly one item in vtraptable with corresponding jumpsrc.
  • number of items in vjumptable for each jumpsrc is not contstained (can be 0)

I think I just didn’t get around to validating the jump tables themselves, thanks for noticing. I think that will still require some version of jumpdest analysis be done.

JUMPDEST does makes it easier to break code into basic blocks with a simple one-pass algorithm. My intent is that every JUMP* instruction has at least one entry in the jumptable. And yes, there needs to be exactly one vtraptable destination for each JUMP*, and at most one vjumptable destination for every JUMP*.

Your algorithm code doesn’t seem to check all potential destinations of tabulated jumps, did you intend to support switch-like control flow with this at all?

Yes. Switches, virtual functions, and function pointers should all be handled via this mechanism.

Brooke and I are working on the algorithm, will be sure we get this right, thanks.

1 Like