EIP-615: Subroutines and Static Jumps for the EVM

Is that the only use? Do people writing assembly have other uses for them? Where’s the usage data?

Dynamic jumps are also used for switch statements and virtual functions, and could be used in lots of creative ways in assembly code, I suppose. I’m not too concerned about placing a bit of a burden on creative assembly coders in return for better high-level code generation.

I just want to make sure we don’t miss a genuine use case in our mission to salt the earth of dynamic jumps (sarcasm mine).

Thus the call for introduction in two phases. We can make validation optional if need be, but I’m not sure I’ve ever seen the use of unconstrained dynamic jumps in programs for other CPUs. Or even an unconstrained dynamic jump instruction (and please someone correct me.)

Yes, they could be split into 2 (or 3) EIPs. I feel that this would be cleaner as well. I brought this up a while back, and the prevailing feeling was that since this EIP already has momentum, just let it be. TL;DR yes, but no for political reasons.

Also, I don’t think that any of these features are controversial. Let’s just get them into the spec.

Yes, absolutely we could do that. In fact, there’s nothing stopping a client from doing actor-style parallel execution for performance reasons today, without these changes, for large use cases. Invoking the same contract concurrently gets much more tricky, but yes that analysis is both possible and easier if these changes go in.

As an aside, there’s a lot of optimization that mainstream clients could be doing that they’re not currently. From my cursory reads through several of them, they’re pretty much straight out of the Yellow Paper verbatim: they run interpreted, don’t attempt to use natively-sized words, process sequentially, and do gas bookkeeping at runtime on each opcode call. There’s a lot of room to speed up the existing clients, without needing to wait for eWASM.

As mentioned above, if you remove dynamic jumps you have to at least add subroutines. Beyond that, switch statements and virtual functions would then have to be emulated slowly with a chain of comparisons, which is why real chips support jump tables. Any other uses of dynamic jumps are too obscure for me to know about, but could probably be accomplished with jump tables.
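
To make the cost difference concrete, here is a minimal Python sketch contrasting the two dispatch styles. It is not EVM code: `targets` is a hypothetical list of code offsets, and the returned count is just the number of tests each approach performs.

```python
def dispatch_chain(selector, targets):
    """Emulate a switch with a chain of comparisons: up to len(targets) tests."""
    comparisons = 0
    for case, target in enumerate(targets):
        comparisons += 1
        if selector == case:
            return target, comparisons
    return None, comparisons  # no case matched: fall through to the default


def dispatch_table(selector, targets):
    """Jump-table dispatch: one bounds check, then one indexed lookup."""
    if 0 <= selector < len(targets):
        return targets[selector], 1
    return None, 1  # out of range: fall through to the default


if __name__ == "__main__":
    offsets = [100, 200, 300, 400]  # hypothetical jump destinations
    print(dispatch_chain(3, offsets))
    print(dispatch_table(3, offsets))
```

The chain grows linearly with the number of cases, while the table lookup stays constant, which is the argument for a table-jump instruction.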

Removing dynamic jumps is a terrible step 1. I would suggest we add features first, then propose removal in a later EIP.

Sure, but in theory it would also be done in the following sequence of separate EIPs (in this order):

  1. Introduce static jumps [leave dynamic jumps in]
  2. Introduce subroutines [still have dynamic jumps]
  3. Deprecate dynamic jumps

It’s dynamic jumps that make static analyses useless-to-impossible. They must die. The rest of the features are made necessary in their absence. That is the logic of the paper.

Also, we are trying to schedule hard forks on a 9-month schedule, and they have been taking up to 16. Too many small EIPs and it will take years to get this work rolled out. Seems silly when it took me 2 months to implement.

Sure, but they don’t need to be step 1. It could be a longer strategy done in a forwards-compatible way until step 3. Really it’s more a question of default language-level support, so that when analyzable control flow is the default we gain the benefits. It doesn’t have to be done in one shot.

But again, you [Greg] made arguments the other day about political expediency. The argument about multiple EIPs is correct in theory, but not necessarily required or practical.

I definitely appreciate this as a political argument!

I do see this proposal as a logical whole, better as one proposal than three. None of the features are visible to a high-level programmer, and the analytic and performance gains are there only for programs that do not use dynamic jumps. But high-level compilers and assembly coders can use these features in concert to produce much better code.

The idea is to execute the transactions of one block concurrently, stopping at a barrier whenever there is a read from the state or a write to the state. Each read or write results in one of the transactions successfully acquiring an exclusive lock on the item being read or written, whereas other transactions wanting to access the same item need to wait. That exclusive lock would be held until the transaction completes execution. Deadlocks need to be detected and must result in one of the deadlocked transactions being aborted and restarted. This model would lead to what they call “Serialisable” transactional isolation.
I thought that symbolic execution coupled with control flow analysis can help to elide some of the locking, but I have not spent enough time thinking about it.
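
A rough Python sketch of the locking model described above, as a single-threaded simulation rather than a real scheduler. The `LockTable` class and its method names are hypothetical, and aborting the requesting transaction when a cycle is found is just one possible victim policy, assumed here for simplicity.

```python
class LockTable:
    """Exclusive per-item locks with wait-for-graph deadlock detection."""

    def __init__(self):
        self.owner = {}      # state key -> id of the transaction holding the lock
        self.waits_for = {}  # blocked transaction id -> transaction id it waits on

    def acquire(self, tx, key):
        """Return 'granted', 'wait', or 'abort' (waiting would create a deadlock)."""
        holder = self.owner.get(key)
        if holder is None or holder == tx:
            self.owner[key] = tx
            return "granted"
        # Walk the wait-for chain from the holder; reaching tx means a cycle.
        seen = holder
        while seen is not None:
            if seen == tx:
                return "abort"  # assumed policy: the requester is the victim
            seen = self.waits_for.get(seen)
        self.waits_for[tx] = holder
        return "wait"

    def release_all(self, tx):
        """Drop every lock held by tx once it completes or is aborted."""
        for key in [k for k, t in self.owner.items() if t == tx]:
            del self.owner[key]
        self.waits_for.pop(tx, None)
        # Transactions that were blocked on tx may now retry.
        for waiter in [w for w, t in self.waits_for.items() if t == tx]:
            del self.waits_for[waiter]


if __name__ == "__main__":
    locks = LockTable()
    print(locks.acquire("A", "x"), locks.acquire("B", "y"))
    print(locks.acquire("A", "y"), locks.acquire("B", "x"))
```

In the usage above, A waits on B for item y, so B's request for item x would close a cycle and B is told to abort. After `release_all("B")`, A can retry and acquire y.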

I’m really not a fan of the new complexity that this introduces to the instruction set representation. Currently, every instruction takes one byte, with the exception of PUSHn, which depends on the value of n.

This EIP introduces 10(!) new instructions, all but two of which have multibyte encodings.

As an alternative suggestion, why not instead take these arguments from the stack, but require that they were PUSHed immediately before? In that event, BEGINSUB n_args, n_results would be encoded as PUSHn n_args PUSHn n_results BEGINSUB, which removes the need for everyone to adopt new instruction decoding code. It would also remove the need for two of the new instructions: JUMPTO and JUMPIF can be represented using the existing JUMP and JUMPI instructions, but with the new restrictions.
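
A minimal Python sketch of what that encoding would ask of a validator, under stated assumptions: the PUSH1..PUSH32 range (0x60..0x7f) is the existing EVM one, but the `BEGINSUB` opcode value below is a placeholder chosen for illustration, and both function names are hypothetical.

```python
PUSH1, PUSH32 = 0x60, 0x7F  # existing EVM PUSHn opcode range
BEGINSUB = 0xB5             # placeholder opcode value, assumed for this sketch


def decode(code):
    """Split bytecode into (offset, opcode, immediate_bytes) triples."""
    out, pc = [], 0
    while pc < len(code):
        op = code[pc]
        # Only PUSHn carries immediate data: n bytes for PUSHn.
        width = op - PUSH1 + 1 if PUSH1 <= op <= PUSH32 else 0
        out.append((pc, op, bytes(code[pc + 1 : pc + 1 + width])))
        pc += 1 + width
    return out


def beginsub_signatures(code):
    """Map each BEGINSUB offset to (n_args, n_results).

    Raises ValueError if a BEGINSUB is not immediately preceded by two
    PUSHn instructions.
    """
    instrs = decode(code)
    sigs = {}
    for i, (pc, op, _) in enumerate(instrs):
        if op != BEGINSUB:
            continue
        prev = instrs[max(0, i - 2) : i]
        if len(prev) != 2 or not all(PUSH1 <= p[1] <= PUSH32 for p in prev):
            raise ValueError(f"BEGINSUB at offset {pc} lacks two preceding PUSHn")
        n_args, n_results = (int.from_bytes(p[2], "big") for p in prev)
        sigs[pc] = (n_args, n_results)
    return sigs


if __name__ == "__main__":
    # PUSH1 2, PUSH1 1, BEGINSUB
    print(beginsub_signatures(bytes([0x60, 2, 0x60, 1, BEGINSUB])))
```

The decoder is unchanged from what clients already need for PUSHn; the extra work is the backward look at the two preceding instructions.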

PUTLOCAL and GETLOCAL introduce an entirely new type of memory and don’t seem to have any direct connection to the rest of this EIP. I think they should be in a separate EIP.

@Arachnid So sort of a reverse Polish notation with extra PUSHes. A tiny bit verbose, and an unusual constraint on an instruction set. It might also complicate validation a little, as it would have to look backwards from these instructions to be sure the previous pushes were valid. Still, I’m open to the change, if we thought the complication would help enough users.

@Arachnid I don’t see how PUTLOCAL and GETLOCAL introduce new kinds of memory; they just provide an alternative to multiple DUPs and SWAPs for getting values where you want them on the stack. So not necessary, but useful and efficient. But as with JUMPV and JUMPSUBV they can be emulated with slower sequences of other instructions, despite being directly supported by Wasm and most all CPUs. If reducing the size of the proposal would make the difference to its acceptance then these would be the instructions to postpone.

Validators can do this fairly easily by calculating provenance on stack elements. Executors don’t need to care, and can just treat them as stack arguments.
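
As a rough illustration of provenance tracking, the Python sketch below abstractly interprets a straight-line fragment, tagging each stack slot as either a known constant (it came from a PUSHn) or unknown (`None`). It covers only a handful of existing opcodes and deliberately ignores control flow, so it is a simplification rather than a full validator; the function name is hypothetical.

```python
PUSH1, PUSH32 = 0x60, 0x7F
ADD, POP, JUMP, JUMPI, JUMPDEST, DUP1 = 0x01, 0x50, 0x56, 0x57, 0x5B, 0x80


def jump_targets_are_pushed(code):
    """True if every JUMP/JUMPI target in the fragment traces back to a PUSHn."""
    stack, pc = [], 0  # each slot holds an int (pushed constant) or None
    while pc < len(code):
        op = code[pc]
        if PUSH1 <= op <= PUSH32:
            width = op - PUSH1 + 1
            stack.append(int.from_bytes(code[pc + 1 : pc + 1 + width], "big"))
            pc += width  # skip the immediate bytes
        elif op == DUP1:
            stack.append(stack[-1])  # a copy keeps its provenance
        elif op == ADD:
            stack.pop()
            stack.pop()
            stack.append(None)  # computed value: treated as unknown here
        elif op == POP:
            stack.pop()
        elif op == JUMP:
            if stack.pop() is None:
                return False
        elif op == JUMPI:
            target = stack.pop()  # destination is on top, condition below it
            stack.pop()
            if target is None:
                return False
        elif op != JUMPDEST:
            raise ValueError(f"opcode {op:#x} not modelled in this sketch")
        pc += 1
    return True


if __name__ == "__main__":
    print(jump_targets_are_pushed(bytes([PUSH1, 4, JUMP])))
    print(jump_targets_are_pushed(bytes([PUSH1, 1, PUSH1, 3, ADD, JUMP])))
```

An executor would run the same bytes without any of this bookkeeping; only the validator needs to know where each argument came from.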

I misunderstood how they work, sorry. I thought they accessed a ‘local variable storage’, but they access elements further down the stack at a location specified by a frame pointer.
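
To illustrate that reading, here is a small Python sketch in which locals are ordinary stack slots addressed relative to a frame pointer. The `Frame` class and the indexing convention (local n lives at `fp + n`) are assumptions made for this example, not the EIP's exact definition.

```python
class Frame:
    """Locals as slots on the shared data stack, addressed from a frame pointer."""

    def __init__(self, stack, fp):
        self.stack = stack  # shared data stack (Python list, top at the end)
        self.fp = fp        # index of local 0 (assumed convention)

    def getlocal(self, n):
        """Push a copy of local n onto the top of the stack."""
        self.stack.append(self.stack[self.fp + n])

    def putlocal(self, n):
        """Pop the top of the stack and store it into local n."""
        self.stack[self.fp + n] = self.stack.pop()


if __name__ == "__main__":
    stack = [10, 20, 30]
    frame = Frame(stack, fp=0)
    frame.getlocal(1)  # copy local 1 to the top of the stack
    frame.putlocal(0)  # pop it into local 0
    print(stack)
```

No separate memory region is involved: both operations read and write the same stack that DUP and SWAP manipulate, just at a fixed offset instead of a depth from the top.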

I do still think that this EIP specifies several different modifications, and should be split into smaller, more concise EIPs. It would make it easier to review and approve them independently.

Fair enough. I’m still not sure I can write a regular grammar to express your idea, or how to put it in the Yellow Paper. I guess it would take a back reference from the appendix where BEGINSUB is described to an extra exceptional halting state, for the case that BEGINSUB would be executed with arguments on the stack that were not the results of PUSHn operations.

And yes, these could be three EIPs, with the condition that the second two depend on the first. I don’t know if that makes it easier or harder to evaluate the facility as a whole. Which is to say: This EIP offers the control-flow primitives provided by Wasm and by most every CPU ever. Shall we just put them all in now, or spend the next two years at it?

I should maybe add a table of corresponding EVM+615/Wasm/8086/ARM operations to clarify.
