EIP-615: Subroutines and Static Jumps for the EVM -- Last Call

This proposal is about significantly increasing the formal tractability of the EVM. We deprecate dynamic jumps, which play hell with formal specs, proofs of correctness, static analysis, fast compilation, and more. And being rid of them, introduce subroutines and other facilities to replace them. People doing auditing and build tools for analyzing EVM code intend to take advantage of the formal tractability if they can, and the eWasm team is investigating the possibilities for near-linear-time compilation of well-structured EVM and eWasm code.

Much previous discussion was here.

Merged draft is now ready for discussion. Previous discussions over the years can be found here, here, and here.

Following is my review of EIP-615. This includes all issues past and present which apply to the current text. This includes technical problems which warrant cancelling/extending this Last Call, specific recommendations to improve the EIP (I will supply PR if invited), and arguments on theoretical and practical topics.

Dependency is missing

In the text it states

Dependencies

EIP-1702. Generalized Account Versioning Scheme. This proposal needs a versioning scheme to allow for its bytecode (and eventually eWasm bytecode) to be deployed with existing bytecode on the same blockchain.

However, such a dependency is missing in the front matter header.

EIP-1702 DRAFT is not a finalized document. Therefore I believe EIP-615 is prematurely in Last Call status. For precedent, please see ERC-721 where ERC-165 was a prerequisite and the latter was finalized before the former entered Last Call.

There will be no benefit, and much cost, in finalizing EIP-1720 if it cannot be used. The prerequisites are not met and so it can’t be used. The cost is that it cannot continue to be edited. So again, Last Call should be aborted.

Another dependency. The instruction arguments are specified to use “MSB-first, two’s-complement, two-byte positive integers”. This puts an upper bound on the non-data part (and non-JUMPDEST part) of contract sizes at 0x7fff. This depends on contract sizes being limited, which is currently specified by EIP-170 to be limited to 0x6000, so this is good. This dependency on EIP-170 can please be included in the front matter for cross reference purposes, and referenced in text.

Test cases are missing

EIP-1 states:

Test Cases - Test cases for an implementation are mandatory for EIPs that are affecting consensus changes. Other EIPs can choose to include links to test cases if applicable.

However this EIP fails to include test cases.

Backwards compatibility is incomplete

This EIP introduces a backwards compatibility against a finalized EIP, specifically EIP-1167. Requesting please that this be added as a note for cross-reference.

Formal verification is not shown to provide any practical benefit

The first line in the EIP states:

…formal specification and verification are an essential tool against loss.

However this claim is not substantiated sufficiently. Only references are made under the heading “some papers” at bottom.

I have raised this topic before, but that wasn’t Last Call, so I’ll do it again here.

Has anybody translated any deployed contract to the proposed byte code and successfully detected any actual problem using formal verification? Would EIP-615 have prevented this week’s exploit on 0x Exchange 2.0?

Notes on complexity are misleading

The paper hypothesizes a vulnerability where contract authors can evade formal analysis by introducing quadratic complexity against a static analyzer.

Otherwise, Contracts can be crafted or discovered with quadratic complexity to use as denial of service attack vectors against validations and optimizations.

This is entirely impractical because no serious projects publish using EVM, they all use a higher level language. Everybody knows that a contract which does not publish its higher-level source code is suspect. And since Solidity is advertised here to be able to target EIP-615 as easily as current EVM, now therefore all the formal analysis benefits of EIP-615 already exist today against Solidity source code.

Separately, there is a note on complexity at:

All of the instructions are O(1) with a small constant, requiring just a few machine operations each, whereas a JUMP or JUMPI typically does an O(log n) binary search of an array of JUMPDESToffsets before every jump.

This is an implementation detail. Anybody that cared to implement JUMPDEST lookup in O(1) time would simply store all bytecode in 9-bit bytes where the extra bit denotes valid jump destinations.

Dynamic jumps are rarely used

I have provided a method to analyze existing contracts to find if they are actually using dynamic jumps or if instead there is a trick to immediately (O(1)) find jump destinations.

This analysis is performed here:

It demonstrates that the vast majority of JUMPs on EVM are not actually dynamic.

This puts into question the underlying motivation of this EIP which is to remove dynamic jumps. This relevant analysis is not referenced from the EIP.

Other notes/corrections

(Negative values are reserved for extensions.)

At current, zero is also unused and could be reserved.

Thanks for the feedback @fulldecent.

This draft doesn’t much care what versioning scheme is provided, so long as there is one. If necessary the core devs can hold off on assigning this EIP to a fork until such a scheme is available. But this proposal has been languishing since December of 2016 and my availability to work on it is rapidly dwindling, so I think it’s past time to move it forward.

I’m sorry, but I don’t understand the compatibility issue here.

Will do.

True. This proposal was written long before this requirement was imposed. I’ll let the core devs decide whether to grandfather it in, or ask for somebody else to prepare a new draft with test cases.

Not that I know of. I think they are waiting on the proposal to be accepted.

What gets deployed on the blockchain is EVM code. An attacker can generate vulnerable EVM contracts by design, by fuzzing, or by scanning the blockchain looking for them.

There is no formal specification for Solidity. There are at least two for the EVM. So if you want to be certain of what of an EVM contract does you have to analyze the bytecode.

Solidity uses dynamic jumps to implement function calls.

I do not consider grandfathering out test cases for consensus critical code to be a good idea, especially with such a large change as this one. There are 10 new opcodes, as many opcodes that has been added to the EVM since frontier, combined.

While I feel that what is in this EIP to be a good idea asking a non-specific “somebody else” to prepare the tests does not make me confident this EIP is ready for inclusion in any hard fork.

It’s a practical matter @shemnon. Last year we applied for an EF grant for @expede and I to work on this proposal. For months we never got either an acceptance or a rejection. So we both took on other work. If the core devs reject this proposal it will be dead until I leave my job, @expede and @boris business fails, or somebody else decides to champion it.

I think it would be a shame to reject years of work because I didn’t do something that was not required until years after the proposal was first written.

That’s not the point. This proposal / Working Group was not funded even though all the other ETH1x ones were because “somebody at the EF doesn’t like Boris”. At least, that’s the only feedback we got.

Thus can’t afford to spend time building tests etc. That’s the core issue for all “third party” EIPs: if you aren’t employed by the EF or Parity or Pegasys, you’ve got volunteer time only.

Greg’s wording is weird, but “our business failing” seems to mean that we might circle back around and volunteer to work on this if we have nothing else to do.

That is not the case. The way we were treated by the EF means we are unlikely to volunteer on core going forward.

EIP requirement or not, a fully worked sample is something that would have been an aid in understanding how the opcodes work. This EIP covers a lot of ground and contains at least three distinct categories of opcodes that may have been better served as separate EIPs.

It doesn’t need to be as exhaustive as the the reference tests but something as simple as including some of the unit tests from your proof of concept implementation and making them normative is what is needed.

@boris All I meant is that so long as you and @expede are starting up a business Brooke will not have time to champion this EIP. If your business fails (and I pray it succeeds) then she might, given EF or other support. Same goes for me and my current employment. The point remains that it would be a shame to see years of work lost because the primary author could no longer afford to go without funding.

@shemnon I’ll see what I can do. What is there now is a section laying out the effects of the primitive instructions on the stack. From a conversation with Dimitry writing test vectors for this EIP should not be very difficult for someone who knows how to do it. I don’t think I can reconstruct the testing I did in 2017.

We discussed breaking up the EIP earlier. If we remove dynamic jumps these are the minimal set of opcodes needed to replace them. They are also the minimal set needed to map one-to-one to Wasm opcodes. It would be a lot of work at this point to spit the EIP up, and we would have to start the process over for each resulting new EIP. Which, given the constraints on my future time, would have the same effect as withdrawing this EIP completely.

Have the Vyper and Solidity projects had a look at these opcodes? How would they impact their tooling? Would they use it or ignore it?

@fubuloubu tells me the Vyper project (or at least he) is in support. As I recall it the Solidity team added hooks for at least the primitive opcodes into Yul. I think some degree of support was added to Solidity as well. @axic would know.

@shemnon I just checked the source. Yul and Solidity have hooks (defined opcodes and empty code generation functions) for the eventual implementation of this proposal. @axic or @chriseth would know how much work remains.

We moved from internal calls to dynamic jumps for gas savings within our contracts and have had nothing but problems with it. This proposal would be very helpful, as it would allow us to get the safety we need while saving gas, which is what lead us to make that decision.

A quick note here: https://eips.ethereum.org/EIPS/eip-1380 would also help achieving that by using the old system for calls in Vyper.

1 Like

The Yul assembler has support for a version of EIP615 from 2 years ago. No recent changes are applied. Back when it was implemented there were no decision on the size of the immediate. The output of the assembler was never tested given there is no testing environment for it.

Solidity to Yul is still progress, so there is no complete support for Solidity to EVM615.

Solidity has zero support for this. Many internal routines of the code generator would have to be rewritten or at least modified.

There is support for compiling Yul to something like this proposal, but it is completely untested. Furthermore, we are currently working on a full rewrite of the code generator to target Yul. Once that is finished, we can compile to both web assembly and execution environments that require static jumps, but this will still take several months.

In general, I strongly oppose this proposal in its current form. I think the introduction of multi-byte opcodes is dangerous and in general, it is a big change to EVM implementations.

The proposal was initially written with speedup in mind and not with easy of verification. The speedup turned out to be not really present, Pawel’s evmone implementation seems to be a much better solution.

I do not see the benefit of ease of verification worthwhile when compared to the risk of radical changes to the EVM, especially as implementations still need to keep both implementations. If code is analyzed, symbolic execution has to be performed in any case. In most situations, this can easily resolve the jump targets even when they are taken from the stack. At least when compiled from Solidity, the situations where this is difficult will still remain: The only situation where the jump target is not directly available on the stack is when function types are used. These function types would still need a dispatch table, so the problem is not really solved. The benefit of a dispatch table over dynamic jumps is of course that the set of potential targets is smaller. I would say that this set can already be restricted in the current EVM by analyzing how many elements on the stack are accessed by a routine, how many elements are returned and similar techniques. Furthermore, additional output from the compiler can also be used to improve the analysis - the compiler knows which jumpdest is dynamic and which is not. Unless when jump targets are read from storage (and maybe even then!), a formal system should be able to verify the correctness of this information to guard against compiler bugs.

In closing, I think this is a problem that should be solved above the EVM. It is a big risk to modify the EVM implementations and also the Solidity code generator to conform to it. Even when this proposal is implemented, malicious code can still run the old way, so this can only be a protection against bugs and not against bad intent. If it is just about finding bugs, we should have access to the source code and can use the help of the compiler and other tools. Furthermore, instead of adapting Solidity’s “legacy” code generator, I would prefer to focus on the re-implementation that uses Yul. The code that is generated from Yul will already use a dispatch table instead of dynamic jumps and it should be rather straightforward to resolve all jump destinations.

Thanks for the feedback @chriseth. Most of it would have been more helpful when you co-authored this proposal in 2016. By now it is simply too late to change the form of this proposal–it would amount to abandoning the proposal, probably permanently. The best I can do now to simplify it would be to stick to the primitive operations and eliminate JUMPV, JUMPSUBV, GETLOCAL, and PUTLOCAL.

Improving the tractability of formal analysis of EVM code was a desiderata from the start. (See link to original proposal below.) The performance gains were to be had by facilitating near-linear-time compilation, in the same way that Wasm compilers like Lightbeam take advantage of its clean control flow to that end. My measurements indicate that evmone is about twice as fast as aleth, and my study of the code indicates that this is primarily because it implements optimizations that you decided not to pay me to implement for aleth back in 2017. It’s true that those optimizations do not depend on this proposal.

Please, let’s keep the discussion technical and not personal.

Thank you, @shemnon. And my apologies, @chriseth.