EIP-4200: Static relative jumps

Why there is no RJUMPV

The EIP-615 proposes new jump instruction JUMPV which looks up the jump destination in the adjusted array using the index value taken from the stack. For good reason similar control-flow instructions exist in most if not all IRs.

However, I have not found any common use-case in contracts development justifying relatively high complexity of the instruction (e.g. encoding of the array in bytecode, handling invalid index values). Furthermore, nothing prevents adding such instruction in future if it turns out to be needed.

Such instruction can help with implementing switch statements, but only if case constants are continuous or at least dense.

The most used “switch” in every solidity contract is the external function dispatch. But JUMPV will not help here as the functions ids are “hashed” values (sparse). You would first need to the index of a function id but then you don’t need JUMPV any more as a static jump will do.

I also wanted search solidity source codes on GitHub to check for interesting usages of switch. Just to realize solidity does not support switch statement at all.

The proposed JUMPV indeed matches the semantics of br_table from WebAssembly. However, the LLVM IR has extended variant of such control flow instruction called switch. There instead of a table of indexes we have “an array of pairs of comparison value constants and [jump destinations]”. This would handle function dispatch use-case, but it complicates the instruction even more.

If I’m mistaken and there are valid use-cases for JUMPV please let me know.

I’d be happy enough to deprecate them. But also happy enough to restrict them at validation time to safe uses. EIP-3779: Safer Control Flow for the EVM.

I’d actually prefer JUMPR and JUMPRI. I’d especially like for all of the JUMP opcodes to sort together (e.g. they are often referred to as a group with JUMP*) and for the names to be concise. JUMPREL and JUMPRELIF would be OK, but a lot to type.

From the doc:

Because the destinations are validated upfront, the cost of these instructions are less than their dynamic counterparts: RJUMP should cost 5, and RJUMPI should cost 7. This is a reduction of 2 gas, compared to JUMP and JUMPI .

I think 3 and 5 would be more appropriate. Since we don’t need to check the destination, doing the jump is only a matter of assigning the immediate value to the PC. And even the immediate value can be decoded during jumpdest analysis. Barely more work than a no-op.

This is a fair point. We tried to stay conservative before we made actual benchmarks, but given no outliers are found through benchmarking I would agree with lowering the costs further.

1 Like

We plan to release a new EIP proposing function sections, which would be a way to get rid of them.

With the EIP-2315 subroutines we can also be rid of JUMP if we want to. Currently the EIP-3779 safety conditions restrict JUMP to ‘static’ uses, but could just as well deprecate it.

To be clear, the proposal was released (and it was planned since the inception of EOF):

I think discussing the merits of the different “subroutine” proposals may be better discussed in a dedicated comparison topic, as opposed to this topic here.

1 Like

I do think this EIP is ready for CFI for Shanghai. It’s very easy to implement, has been implemented, tested, timed, and discussed here and elsewhere to mutual consensus.

There is a reason jump tables are a common facility on CPUs. Important use cases include the common “big switch” interpreters that index opcodes into a table of jumps to the implementation of each operation. However, switches can always be implemented with conditional jumps, so they aren’t strictly necessary. Another important use case, in the absence of dynamic jumps, is constrained polymorphism – such as using jump table indexes for pointers to functions and virtual functions.

So I would be happy to see them proposed here, but don’t insist. We should at least leave space in the allocation of bytecode values for later introduction of indexed jumps and calls.

I’ve withdrawn EIP-3779. In the end it just wasn’t possible to tame JUMP with a static analysis. However, EIP-2315 will deprecate JUMP and JUMPI, because traversing the CFG is subject to quadratic path explosion otherwise, turning the validation code into an attack vector.

Why there is no RJUMPV

If I’m mistaken and there are valid use-cases for JUMPV please let me know.

For reference: there is call for using jump-table-like approaches for switches, including the external function dispatch, on the solidity layer: e.g. Constant-complexity dispatcher idea · Issue #12650 · ethereum/solidity · GitHub
And we had some experimental work towards it (not with a traditional jump table, but with multiple jump destinations encoded in one stack slot and selection via shifting and masking - of course all of that would be hell for static analysis… and becomes impossible if we move away from dynamic jumps completely). I’m not necessarily saying that this warrants actually adding RJUMPV (since there’s always ways around needing it) - but if we had it, there would definitely be use cases.

Another one is internal function call dispatch (calling internal functions via a function pointer) in Solidity via-IR code generation, which works via a dense switch.

I agree. RJUMPV is far superior to the alternatives, both for efficiency and for static analysis. There are reasons that so many machines have an instruction to support jump tables. We could of course wait and do this later. But “later” tends to mean “2 or 3 years.” So I’d prefer we do this now.

I don’t think it’s really that difficult to specify or implement. E.g:

RJUMPV n_dest dest_0 … dest_default

If the 0-based index on the stack is too large we jump to the default destination. The validator can check statically that there are no invalid destinations. And since the jump table is inline (not shared) it doesn’t create a DoS vulnerability.

(An vectored call would also be useful for virtual functions, but RJUMPV can do that job well enough with a JUMPV to a call.)

1 Like

Hi all, I have a quick question: why would we not disable JUMP and JUMPI if we activate this EIP? The entire motivation seems to be that we do not want these dynamic jumps. So, why not throw these out? (Or does this impact the flexibility of EVM too much?)

This EIP alone is not enough to replace dynamic jumps entirely.

With

it will be possible to complete replace them.

Hi, I am implementing EIP 4200 in EthereumJS.

I noticed that the EIP test cases and the reference code does not include cases for RJUMPV.

Hi, reference code has it (see elif opcode == 0x5e: branch), and I’ve just added some test cases for RJUMPV Update EIP-4200: Add Test Cases for RJUMPV by gumb0 · Pull Request #6096 · ethereum/EIPs · GitHub
Thanks for spotting this!

1 Like

EDIT: never mind, I just realized that I interpreted the whole idea of RJUMPV wrong, I assumed that it would jump to a relative offset if case >= count, but it indeed uses a jump table (I would however suggest to add this line in the EIP, it makes it more clear - that in RJUMPV the relative offsets is a list of count * 2 int16s and whatever stack item is pushed, we jump to that relative offset

Old msg;

You are right that there is a 0x5e branch but I do not think the validation matches the text of the EIP.

0x5e/RJUMPV is also not in immediate_sizes (immediate size is thus 0?)

I do not think the reference implementation of RJUMPV is correct:

            jump_table_size = code[pos]
            if jump_table_size == 0:
                raise ValidationException("empty jump table")                

            pc_post_instruction = pos + 1 + 2 * jump_table_size

jump_table_size (weird variable name, this is count in the text spec…?) is thus whatever count is.

pc_post_instruction now is essentially pos + 1 + 2 * count? This should be the two bytes after the current pos, no? (interpreted as an int16)

I also find the code very hard to read. I get the impression that this validation code is of an older version of this EIP, where RJUMPV is an opcode which has to do with some kind of jump table?

In text I would argue that the validation code (for RJUMPV) should;

  • Check if the next three bytes after RJUMPV are into len(code)
    • Note: the check that the code has a terminiating opcode at the end of the container is checked in the next iteration of the while loop
  • Check if count (so the next byte after RJUMPV) is not 0
  • Check if pc_post_instruction (which is now pc_post_instruction = pos + immediate_sizes[opcode]), if you add int.from_bytes(code[pos+1:pos+3], byteorder = "big", signed = True) is within the container
  • Initialize the immediate_sizes[0x5e] = 3 # RJUMPV

I do not think that the EIP defines very well what happens if case in RJUMPV is either larger than the table size, or it is larger than 255. Do we exceptionally abort, or do we use the default case (no branching?).

The EIP states:

If no match is found (i.e. the default case) in the RJUMPV instruction execution will continue without branching. This allows for gaps in the arguments to be filled with 0 s, and a choice of implementation by the programmer. Alternate options would include exceptional aborts in case of no match.

This is ambiguous to me, because it defines essentially two ways what to do if “no match is found”, either it continuous without branching, or it exceptionally aborts… (the “alternate option”)?

This is in Rationale section where alternative designs are mentioned that were decided against.

Specification is:

  1. RJUMPV count relative_offset+ pops a value (case) from the stack, and sets the PC to PC_post_instruction + ((case >= count) ? 0 : relative_offset[case]).

i.e. when case >= count, PC is set to PC_post_instruction

1 Like

Love this proposal but a small suggestion: Wouldn’t it make more sense for the RJUMPV opcode to allow for jump tables of sizes 1-256 rather than 0-255 and reverting if the table size is zero?

Two main reasons for this change:

  1. Allowing the table size “count” byte to be zero doesn’t seem to serve a purpose as the standard explicitly disallows it.
  2. Allowing tables of size 256 would lend itself very well to logic that jump based on a 4-bit index as the full range of 256 values (0x00 - 0xff) could translate to a valid jump destination.
2 Likes