PUSH1 2 I assume. The second time this hits RETURNSUB it will pop the codesize left on the call stack and jump to the implicit 0 past the end of the code (at offset 5) and stop. It will never get to the STOP at offset 4.
Optimized code from the latest solc does a better job with the multiply() function, which is a leaf. Non-leaf functions remain costly to get out of, as shown by adding a layer to the test.
contract fun {
function multiply(uint x, uint y) public returns (uint) {
return x * y;
}
function test_mul(uint x, uint y) public returns (uint) {
return multiply(x,y);
}
function test(uint x, uint y) public returns (uint) {
return test_mul(2,3);
}
}
Here is what solc can do now with just jump:
1 MULTIPLY:
5 mul
3 swap1
8 jump
=
17 gas
1 TEST_MUL:
5 0x00
5 RTN
5 dup4
5 dup4
5 MULTIPLY
8 jump
=
34 gas
1 RTN:
3 swap4
3 swap3
2 pop
2 pop
2 pop
8 jump
=
21 gas (twice)
TEST:
5 0x00
5 RTN
5 0x02
5 0x03
5 TEST_MUL
5 jump
=
30 gas
123 gas TOTAL
But with jumpsub and returnsub only a third as much gas is needed.
1 MULTIPLY:
5 mul
3 returnsub
=
9 gas
1 TEST_MUL:
3 MULTIPLY
5 jumpsub
3 returnsub
=
12 gas
1 TEST:
3 0x02
3 0x03
3 TEST_MUL
5 jumpsub
3 returnsub
=
18 gas
39 gas TOTAL
Surprised to see the optimizer doing such a bad job. But yeah this test case shows exactly how the subroutine features would benefit sophisticated situations like this.
I haven’t tried to hand-optimize the Solidity output, but I don’t think that it could do much better. It’s intrinsically difficult to handle subroutines without instructions for the purpose, as the history of CPU development pretty clearly shows.
I’m not sure what’s sophisticated here – just two function calls and one multiplication.
Some diggings: On Remix it is using Solidity compiler’s legacy optimizer, which cannot handle functions with calls to other functions, hence the non-optimized TEST_MUL function.
Better to say, some kinds of static analysis, @axic, as I’ve heard complaints about the problem of deciphering subroutines in EVM code, and the EIP gives a good example. Other kinds analysis don’t care.
The reason none of this optimised is that all functions are marked public. That means all of them need to made available externally. In a more realistic example, most of those would not be marked public and then they become inlined or better optimised.
@axic, @lialan, @holiman. @karalabe. This program should be more optimizable. test() must be public or else the optimizer tries to eliminate everything.
pragma solidity >0.6.2;
contract fun {
function test(uint x, uint y) pure public returns (uint) {
return test_mul(x, y);
}
function test_mul(uint x, uint y) pure private returns (uint) {
return multiply(x, y);
}
function multiply(uint x, uint y) pure private returns (uint) {
return x * y;
}
}
Compiling with solc --asm --optimize fun.sol gives this. I don’t fully understand what it is doing.
TEST:
0x00
OUT
dup4
dup4
MULTIPLY
jump
OUT:
swap4
swap3
pop
pop
pop
jump
MULTIPLY:
0x00
dup2
dup4
mul
OUT
jump
Code generation with JUMPSUB should also optimize away test_mul.
TEST:
MULTIPLY
jumpsub
returnsub
MULTIPLY:
mul
returnsub
I think in both cases the multiply() function could be inlined, but then there would be no test.
Arguing with example code is not too helpful, in my opinion. This proposals allows to eliminate moving up the return address in the stack prior to returning and that’s it. If we just want to optimize for fewer swap operations on return, we could also introduce a “rotate stack” opcode. If we also want to improve static analysis capabilities, then I would propose the EIP to be strengthened (or maybe just clarified) by the following rule:
If a BEGINSUB is reached in any way except a JUMPSUB, it causes an out-of-gas failure. This includes:
directly jumping to the beginsub using jump or jumpi
reaching the BEGINSUB by continuing execution from the one prior to it without jumping
BEGINSUB being the first opcode
An even bigger improvement would be to disallow regular jumps into and out of subroutines, but I’m not sure if that would be a too drastic change.
I cannot elaborate on whether this change is worth it, since Solidity neither benefits much from it nor is it a big cost to switch to using these opcodes in the code generator (as long as dynamic jumps are still allowed). People working on debuggers (remix, truffle, etc), static analysis tools and VM implementations should answer that question.
Of course if we keep making improvements we could end up back at EIP-615, so I want to keep this minimal. Last we talked @holiman wasn’t even sure BEGINSUB was worth it.