Current max size of deployable code is 24576 (0x6000). We propose to also limit the size of executable code to 2x the above limit, i.e. 49152 (0xc000).
This gives nice properties:
instruction offset in code fits 16-bit value,
code size fits 16-bit value.
Possible implementations of subroutine maps
Shadow array
Max memory footprint:0xc000
We create a byte array of the size of the code shadowing the instructions in the code. The array is initially filled with zeros.
a JUMPDEST instruction is indicated by 0x80 mask in the shadow array.
To validate jump destination: (shadow_array[location] & 0x80) != 0.
a BEGINSUB instruction is indicated by 0x40 mask in the shadow array.
the size of a subroutine is encoded in the remaining 6-bits using variable-length encoding analogous to LEB128. Examples:
40 - subroutine of length zero,
5f - subroutine of length 31 (0x1f), the max length that fits single byte encoding,
7f - subroutine of length 1023 (0x1f + 2^5 * 0x1f), the max length that fits two-byte encoding,
6081 - subroutine of length 32 followed by a JUMPDEST instruction.
Bitset with linear search
Max memory footprint:0xc000 / 4 == 12288
We create a bitset indicating an instruction is JUMPDEST and another bitset indicating an instruction is BEGINSUB. During execution we keep track of where the current subroutine starts.
To validate if a jump stays within a subroutine:
when jumping backwards, just check if juspdest_location >= current_subroutine_begin,
when jumping forward, find the beginning of the next subroutine starting from current PC (where the jump is). This is done by linear search in subroutine bitset. We can process 64 bits at a time by using 64-bit load and checking if value is not zero. This may be optimized further using SSE2 (128-bit) or AVX (256-bit).
In worst case we need 0xc000 / 64 == 768 comparisons.
Span map
Max memory footprint:0xc000 * 2 == 98304 + jumpdest map.
We keep an array of all BEGINSUB instructions offsets. Single value is uint16 and values are already sorted.
To verify JUMPSUB check if location is in the array using binary search.
When entering a subroutine remember subroutines begin and end location. The end location is the next value in the array.
To verify if a jump stays within a subroutine check if jumpdest_location is between subroutineās begin and end.
Iām not able to prove it, but that was exactly my intention.
One detail that you donāt mention explicitly, but might be worth to note: I believe all three implementations would require that JUMPSUB pushes current subroutine start into control stack togetter with return location, so that after RETURNSUB we can remember to which one we returned.
Here is a complete (untested) C++ implementation of a variant of @chfastās idea of the shadow array. It does not use a complicated variable-length encoding and needs less memory, though. It needs two bit fields with code_size number of bits: The first stores at bit i whether code[i] is push data or not. The second stores the length of each subroutine such that the length of the subroutine that starts at byte i is stored in binary starting at the ith bit.
It does not need a stop marker for the length encoding because it reads the length bit-by-bit and checks if the end of the subroutine has been reached already. This can easily be removed by making the size of the bitset twice as large or by using a more complicated (prefix-free) number encoding.
This PR incorporates the fix to prevent the EVM from stepping (as opposed to jumping) to BEGINSUB ā JUMPSUB transfers control to the instruction after BEGINSUB; BEGINSUB itself aborts. This PR does not prevent ājumping into subroutines.ā PR EIP
Subroutine maps help answering following questions (sometimes also by using information in the code itself):
is this instruction a JUMPDEST?
is this instruction a BEGINSUB?
what is the length of this subroutine?
For subroutine maps we have two classes of implementations (as identified with @chriseth).
āCode shadow arrayā
The size is proportional to the code length. We believe the variant with the lowest memory footprint would use 2 bits per byte of code (proposed by @chriseth). Up to one byte per byte of code presented as āShadow arrayā in EIP-2315 "Simple Subroutines for the EVM" - Analysis - #41 by chfast. This class of implementations is generic and does not depend on the discussed executable code length cap.
āSpan mapā
This is a map/set of subroutine begins/ends but can be efficiently stored as an array and binary search can be used to lookup values in the map. The size is proportional to the number of subroutines. But in worst case the code may be only 1-byte long subroutines.
This depends on the code length cap because with the cap we can use uint16 type, otherwise we need uint32. In the worst case with uint32 type the memory footprint is 4x the code length.
True. I actually did not realized that I need to remember the all the subroutines on the control stack.
This is however not needed for āspan mapā. One binary search is enough to find both begin and end of a subroutine.
Coming back around to this, do we know what the changes to the Yellow paper will be to support this? When I went into it recently I found that even BEGINSUB would add more to the YP than Iād like. Iād be more sympathetic if this feature can be expressed succinctly there.
Also, one reason I donāt like this feature is that it prevents other means of implementing subroutines ā e.g. I think Christian has shown that in some cases leaf routines can be implemented more efficiently with JUMP than JUMPSUB.
I donāt know how difficult it would be to add this to the YP, as I never edited it myself. But I personally donāt see much interest in the community to keep the YP up to date nowadays - itās never considered when EIPs are being proposed / discussed, and the most recent version only partially covers Petersburg (the fork from February 2019)
It wouldnāt prevent ignoring JUMPSUB by the compiler and implementing all the subroutines with only JUMPs.
Mixing two kinds of subroutines in one contract (JUMPSUB subs and JUMP subs) doesnāt sound like a good idea in general to me, for the reasons pointed out in the original analysis: difficulty to test all the edge cases of such complicated contract, complications for the static analysis, complications for finding limits of each subroutine if we want to merklize them separately or runtime-analyze them separately.
Nick Saves endeavors to keep the YP up to date - itās the only spec we have. Actually editing the TeX is difficult, but getting an idea of how much work a new feature would be isnāt so bad.
At the least, we need a PR to the EIP, if there isnāt one I missed. I expect this will be a pain to specify, as currently there is almost no structure specified for bytecode and it amounts to a huge expansion of illegal jump locations. The EIP will also need to explain how to implement this efficiently.
Part of my change of heart here is that although a check for this is needed at runtime (and in my experience checking syntactic constraints at runtime is trouble) a compiler from EVM code can assume that it never happens and take advantage of that promise inside of the subroutine. (And I still wish Iād never introduced beginsub, but stuck with the bare minimum.)
However, Iām not sure I understand this part, so it worries me:
These changes do not affect the semantics of existing EVM code that does not contain the new opcodes.
For EVM code that contains a BEGINSUB opcode that was not meant to be executed but is jumped over, this change can result in such contracts starting to fail where they previously did not fail. While arbitrary data being appended to the end of the bytecode is common, such contracts would have to contain data parts (non-executable code) enclosed by two chunks of executable code. The Solidity compiler is not known to be able to generate such code, not even with inline assembly, unless inline assembly is used to return custom-designed data as the runtime code in the constructor.