EIP-4200: Static relative jumps

gcolvin · January 31, 2022, 9:23pm

I’d actually prefer JUMPR and JUMPRI. I’d especially like for all of the JUMP opcodes to sort together (e.g. they are often referred to as a group with JUMP*) and for the names to be concise. JUMPREL and JUMPRELIF would be OK, but a lot to type.

gcolvin · January 31, 2022, 9:32pm

From the doc:

Because the destinations are validated upfront, the cost of these instructions are less than their dynamic counterparts: RJUMP should cost 5, and RJUMPI should cost 7. This is a reduction of 2 gas, compared to JUMP and JUMPI .

I think 3 and 5 would be more appropriate. Since we don’t need to check the destination, doing the jump is only a matter of assigning the immediate value to the PC. And even the immediate value can be decoded during jumpdest analysis. Barely more work than a no-op.

axic · February 1, 2022, 12:05am

This is a fair point. We tried to stay conservative before we made actual benchmarks, but given no outliers are found through benchmarking I would agree with lowering the costs further.

gcolvin · February 4, 2022, 8:40pm

We plan to release a new EIP proposing function sections, which would be a way to get rid of them.

With the EIP-2315 subroutines we can also be rid of JUMP if we want to. Currently the EIP-3779 safety conditions restrict JUMP to ‘static’ uses, but could just as well deprecate it.

axic · February 4, 2022, 10:57pm

To be clear, the proposal was released (and it was planned since the inception of EOF):

I think discussing the merits of the different “subroutine” proposals may be better discussed in a dedicated comparison topic, as opposed to this topic here.

gcolvin · February 7, 2022, 8:36am

I do think this EIP is ready for CFI for Shanghai. It’s very easy to implement, has been implemented, tested, timed, and discussed here and elsewhere to mutual consensus.

gcolvin · March 7, 2022, 8:12pm

There is a reason jump tables are a common facility on CPUs. Important use cases include the common “big switch” interpreters that index opcodes into a table of jumps to the implementation of each operation. However, switches can always be implemented with conditional jumps, so they aren’t strictly necessary. Another important use case, in the absence of dynamic jumps, is constrained polymorphism – such as using jump table indexes for pointers to functions and virtual functions.

So I would be happy to see them proposed here, but don’t insist. We should at least leave space in the allocation of bytecode values for later introduction of indexed jumps and calls.

gcolvin · September 4, 2022, 2:36am

I’ve withdrawn EIP-3779. In the end it just wasn’t possible to tame JUMP with a static analysis. However, EIP-2315 will deprecate JUMP and JUMPI, because traversing the CFG is subject to quadratic path explosion otherwise, turning the validation code into an attack vector.

ekpyron · October 26, 2022, 7:24pm

Why there is no RJUMPV
…
If I’m mistaken and there are valid use-cases for JUMPV please let me know.

For reference: there is call for using jump-table-like approaches for switches, including the external function dispatch, on the solidity layer: e.g. Constant-complexity dispatcher idea · Issue #12650 · ethereum/solidity · GitHub
And we had some experimental work towards it (not with a traditional jump table, but with multiple jump destinations encoded in one stack slot and selection via shifting and masking - of course all of that would be hell for static analysis… and becomes impossible if we move away from dynamic jumps completely). I’m not necessarily saying that this warrants actually adding RJUMPV (since there’s always ways around needing it) - but if we had it, there would definitely be use cases.

Another one is internal function call dispatch (calling internal functions via a function pointer) in Solidity via-IR code generation, which works via a dense switch.

gcolvin · October 29, 2022, 8:56pm

I agree. RJUMPV is far superior to the alternatives, both for efficiency and for static analysis. There are reasons that so many machines have an instruction to support jump tables. We could of course wait and do this later. But “later” tends to mean “2 or 3 years.” So I’d prefer we do this now.

I don’t think it’s really that difficult to specify or implement. E.g:

RJUMPV n_dest dest_0 … dest_default

If the 0-based index on the stack is too large we jump to the default destination. The validator can check statically that there are no invalid destinations. And since the jump table is inline (not shared) it doesn’t create a DoS vulnerability.

(An vectored call would also be useful for virtual functions, but RJUMPV can do that job well enough with a JUMPV to a call.)

jochem-brouwer · November 11, 2022, 10:17am

Hi all, I have a quick question: why would we not disable JUMP and JUMPI if we activate this EIP? The entire motivation seems to be that we do not want these dynamic jumps. So, why not throw these out? (Or does this impact the flexibility of EVM too much?)

axic · November 11, 2022, 1:54pm

This EIP alone is not enough to replace dynamic jumps entirely.

With

it will be possible to complete replace them.

jochem-brouwer · December 6, 2022, 2:29pm

Hi, I am implementing EIP 4200 in EthereumJS.

I noticed that the EIP test cases and the reference code does not include cases for RJUMPV.

gumb0 · December 7, 2022, 10:59am

Hi, reference code has it (see elif opcode == 0x5e: branch), and I’ve just added some test cases for RJUMPV Update EIP-4200: Add Test Cases for RJUMPV by gumb0 · Pull Request #6096 · ethereum/EIPs · GitHub
Thanks for spotting this!

jochem-brouwer · December 7, 2022, 6:50pm

EDIT: never mind, I just realized that I interpreted the whole idea of RJUMPV wrong, I assumed that it would jump to a relative offset if case >= count, but it indeed uses a jump table (I would however suggest to add this line in the EIP, it makes it more clear - that in RJUMPV the relative offsets is a list of count * 2 int16s and whatever stack item is pushed, we jump to that relative offset

Old msg;

You are right that there is a 0x5e branch but I do not think the validation matches the text of the EIP.

0x5e/RJUMPV is also not in immediate_sizes (immediate size is thus 0?)

I do not think the reference implementation of RJUMPV is correct:

            jump_table_size = code[pos]
            if jump_table_size == 0:
                raise ValidationException("empty jump table")                

            pc_post_instruction = pos + 1 + 2 * jump_table_size

jump_table_size (weird variable name, this is count in the text spec…?) is thus whatever count is.

pc_post_instruction now is essentially pos + 1 + 2 * count? This should be the two bytes after the current pos, no? (interpreted as an int16)

I also find the code very hard to read. I get the impression that this validation code is of an older version of this EIP, where RJUMPV is an opcode which has to do with some kind of jump table?

In text I would argue that the validation code (for RJUMPV) should;

Check if the next three bytes after RJUMPV are into len(code)
- Note: the check that the code has a terminiating opcode at the end of the container is checked in the next iteration of the while loop
Check if count (so the next byte after RJUMPV) is not 0
Check if pc_post_instruction (which is now pc_post_instruction = pos + immediate_sizes[opcode]), if you add int.from_bytes(code[pos+1:pos+3], byteorder = "big", signed = True) is within the container
Initialize the immediate_sizes[0x5e] = 3 # RJUMPV

jochem-brouwer · December 11, 2022, 12:31pm

I do not think that the EIP defines very well what happens if case in RJUMPV is either larger than the table size, or it is larger than 255. Do we exceptionally abort, or do we use the default case (no branching?).

The EIP states:

If no match is found (i.e. the default case) in the RJUMPV instruction execution will continue without branching. This allows for gaps in the arguments to be filled with 0 s, and a choice of implementation by the programmer. Alternate options would include exceptional aborts in case of no match.

This is ambiguous to me, because it defines essentially two ways what to do if “no match is found”, either it continuous without branching, or it exceptionally aborts… (the “alternate option”)?

gumb0 · December 11, 2022, 10:33pm

This is in Rationale section where alternative designs are mentioned that were decided against.

Specification is:

RJUMPV count relative_offset+ pops a value (case) from the stack, and sets the PC to PC_post_instruction + ((case >= count) ? 0 : relative_offset[case]).

i.e. when case >= count, PC is set to PC_post_instruction

Philogy · February 19, 2023, 11:23pm

Love this proposal but a small suggestion: Wouldn’t it make more sense for the RJUMPV opcode to allow for jump tables of sizes 1-256 rather than 0-255 and reverting if the table size is zero?

Two main reasons for this change:

Allowing the table size “count” byte to be zero doesn’t seem to serve a purpose as the standard explicitly disallows it.
Allowing tables of size 256 would lend itself very well to logic that jump based on a 4-bit index as the full range of 256 values (0x00 - 0xff) could translate to a valid jump destination.