EIP-2315 "Simple Subroutines for the EVM" - Analysis

It would really be nice if we could have this EIP without the restriction on where to start a subroutine. Note that this EIP does not bring a very big benefit for compilers. If the subroutine does not have many return values, it might even be better to use a regular jump instead of wasting 16 bytes on average when using the subroutine mechanism.

I think it should be possible to extend the current analysis to also allow beginsubs at arbitrary positions, the following might be a start:

We have a bitmap where bit i is 1 if and only if there is a beginsub opcode somewhere between PC 32 * i and 32 * (i + 1) - 1. With this data structure, we essentially do the analysis proposed by @holiman , just with an additional linear search inside the 32 bytes of code (problem to solve: pushdata):

The subroutine size data structure proposed by @chfast can be similarly modified: For each chunk of 32 bytes, we store the (largest) number of following chunks that do not contain a beginsub opcode.

For a jumpsub, we check if the target is a valid beginsub by first looking at the 32-byte-granular beginsub bitmap. If there is no match, we OOG. If there is a match, we check the code itself.

For jumps, we keep track of the PC of the beginsub of the current subroutine, let this be O. Let E be the number of 32-byte chunks following the chunk of O that do not contain a beginsub opcode, as read from the data structure.

For a jump with target T, we check if T is less than O and OOG if yes. Let C be floor(T / 32), the target chunk. If C = floor(O / 32), we perform a linear search for beginsub on the at most 32 bytes of the chunk and OOG if we find one. If C > floor(O / 32) + E, we OOG. If C = floor(O / 32) + E, we do a linear search on the 32 bytes of chunk C and OOG if and only if there is a beginsub between the beginning of the chunk and T.

I have the feeling that there should already be generic data structures and algorithms that can answer the question “does the array slice from x to y contain a zero” and that we can use without many modifications…