EIP-2315 Simple Subroutines for the EVM

gcolvin · February 2, 2020, 10:11pm

@tkstanczak @holiman Here are a couple more test cases, probably wrong as I hurry to get out in the sun.

offset step op        stack
0      0    PUSH1 3   []
1      1    JUMPSUB   [3]
2      8    STOP      []
3      2    JUMPDEST  []
4      3    PUSH1 7   []
5      4    JUMPSUB   [7]
6      7    RETURNSUB []
7      5    JUMPDEST  []
8      6    RETURNSUB []

Program should STOP with an empty stack after 8 steps.

offset step op        stack
0      0    PUSH1 3   []
1      1    JUMPSUB   [3]
2      2    JUMPDEST  []
3      3    RETURNSUB []
1      4    JUMPSUB   []

Program should STOP with an empty stack after 4 steps, due to virtual zero at end of code.

tkstanczak · February 2, 2020, 11:16pm

I think the second one will cause InvalidJumpDestination exception (which is a good test case too).

tkstanczak · February 2, 2020, 11:22pm

offset step op        stack
0      0    PUSH1 3   []
1      1    JUMPSUB   [2]
2      2    JUMPDEST  []
3      3    RETURNSUB []
4      4    STOP   []

gcolvin · February 2, 2020, 11:24pm

The idea here is to demand nothing, but simply provide a mechanism. For this EIP such demands will need to be made by other tools.

gcolvin · February 2, 2020, 11:58pm

PUSH1 2 I assume. The second time this hits RETURNSUB it will pop the codesize left on the call stack and jump to the implicit 0 past the end of the code (at offset 5) and stop. It will never get to the STOP at offset 4.

gcolvin · February 14, 2020, 2:22am

contract fun {
    function multiply(uint x, uint y) returns (uint) {
        return x * y;
    }
    function test() returns (uint) {
        return multiply(2,3);
    }
}

Solidity:
   MULTIPLY:
      0x0
      dup2
      dup4
      mul
      swap1
      pop
      swap3
      swap2
      pop
      pop
      jump
   TEST:
      0x0
      RTN
      0x2
      0x3
      MULTIPLY
      jump
   RTN:
      swap1
      pop
      swap1
      jump

Comparable EIP-2315 or EIP-615:
   MULTIPLY:
       mul
       returnsub   
   TEST:
       0x2
       0x3
       MULTIPLY
       jumpsub
       returnsub

lialan · February 17, 2020, 6:34am

Pros:

Less gas consumption
much easier for static analysis
better readability: concise and clear syntax
easier to maintain
less error prone
no hard fork needed

Cons:

nothing.This is how a machine should work.

gcolvin · February 17, 2020, 7:09am

Optimized code from the latest solc does a better job with the multiply() function, which is a leaf. Non-leaf functions remain costly to get out of, as shown by adding a layer to the test.

contract fun {
    function multiply(uint x, uint y) public returns (uint) {
        return x * y;
    }
    function test_mul(uint x, uint y) public returns (uint) {
        return multiply(x,y);
    }
    function test(uint x, uint y) public returns (uint) {
        return test_mul(2,3);
    }
}

Here is what solc can do now with just jump:

1  MULTIPLY:   
5     mul
3     swap1
8     jump
=
17 gas

1  TEST_MUL:
5     0x00
5     RTN
5     dup4
5     dup4
5     MULTIPLY
8     jump
=
34 gas

1  RTN:
3     swap4
3     swap3
2     pop
2     pop
2     pop
8     jump
=
21 gas (twice)

   TEST:
5     0x00
5     RTN
5     0x02
5     0x03
5     TEST_MUL
5     jump
=
30 gas

123 gas TOTAL

But with jumpsub and returnsub only a third as much gas is needed.

1  MULTIPLY:
5     mul
3     returnsub
=
9 gas

1  TEST_MUL:
3     MULTIPLY
5     jumpsub
3     returnsub
=
12 gas

1  TEST:
3     0x02
3     0x03
3     TEST_MUL
5     jumpsub
3     returnsub
=
18 gas

39 gas TOTAL

lialan · February 17, 2020, 9:07pm

Surprised to see the optimizer doing such a bad job. But yeah this test case shows exactly how the subroutine features would benefit sophisticated situations like this.

gcolvin · February 24, 2020, 12:05am

I haven’t tried to hand-optimize the Solidity output, but I don’t think that it could do much better. It’s intrinsically difficult to handle subroutines without instructions for the purpose, as the history of CPU development pretty clearly shows.

I’m not sure what’s sophisticated here – just two function calls and one multiplication.

lialan · February 25, 2020, 9:58pm

Some diggings: On Remix it is using Solidity compiler’s legacy optimizer, which cannot handle functions with calls to other functions, hence the non-optimized TEST_MUL function.

gcolvin · February 25, 2020, 11:25pm

Hi @lialan The output above is from the latest solc.

axic · February 26, 2020, 5:57pm

Is static analysis that much easier if all the subroutine offsets are still dynamic (e.g. not an immediate on JUMPSUB)?

gcolvin · February 26, 2020, 11:14pm

Better to say, some kinds of static analysis, @axic, as I’ve heard complaints about the problem of deciphering subroutines in EVM code, and the EIP gives a good example. Other kinds analysis don’t care.

axic · February 28, 2020, 5:42pm

The reason none of this optimised is that all functions are marked public. That means all of them need to made available externally. In a more realistic example, most of those would not be marked public and then they become inlined or better optimised.

gcolvin · February 28, 2020, 6:12pm

@axic Thanks, I’ll try that. The public case is still relevant, but I also want to see what the limits are.

gcolvin · March 1, 2020, 10:41am

@axic, @lialan, @holiman. @karalabe. This program should be more optimizable. test() must be public or else the optimizer tries to eliminate everything.

pragma solidity >0.6.2;
contract fun {
    function test(uint x, uint y) pure public returns (uint) {
        return test_mul(x, y);
    }
    function test_mul(uint x, uint y) pure private returns (uint) {
        return multiply(x, y);
    }
    function multiply(uint x, uint y) pure private returns (uint) {
        return x * y;
    }
}

Compiling with solc --asm --optimize fun.sol gives this. I don’t fully understand what it is doing.

   TEST:
      0x00
      OUT
      dup4
      dup4
      MULTIPLY
      jump
   OUT:
      swap4
      swap3
      pop
      pop
      pop
      jump
   MULTIPLY:
      0x00
      dup2
      dup4
      mul
      OUT
      jump

Code generation with JUMPSUB should also optimize away test_mul.

  TEST:
     MULTIPLY
     jumpsub
     returnsub

  MULTIPLY:
     mul
     returnsub

I think in both cases the multiply() function could be inlined, but then there would be no test.

chriseth · March 7, 2020, 2:05pm

Arguing with example code is not too helpful, in my opinion. This proposals allows to eliminate moving up the return address in the stack prior to returning and that’s it. If we just want to optimize for fewer swap operations on return, we could also introduce a “rotate stack” opcode. If we also want to improve static analysis capabilities, then I would propose the EIP to be strengthened (or maybe just clarified) by the following rule:

If a BEGINSUB is reached in any way except a JUMPSUB, it causes an out-of-gas failure. This includes:

directly jumping to the beginsub using jump or jumpi
reaching the BEGINSUB by continuing execution from the one prior to it without jumping
BEGINSUB being the first opcode

An even bigger improvement would be to disallow regular jumps into and out of subroutines, but I’m not sure if that would be a too drastic change.

I cannot elaborate on whether this change is worth it, since Solidity neither benefits much from it nor is it a big cost to switch to using these opcodes in the code generator (as long as dynamic jumps are still allowed). People working on debuggers (remix, truffle, etc), static analysis tools and VM implementations should answer that question.

gcolvin · March 7, 2020, 6:38pm

I don’t understand. Doesn’t this proposal allow for remove the return address from the stack entirely?

gcolvin · March 7, 2020, 6:42pm

Of course if we keep making improvements we could end up back at EIP-615, so I want to keep this minimal. Last we talked @holiman wasn’t even sure BEGINSUB was worth it.