EIP-2315 Simple Subroutines for the EVM

axic · April 5, 2020, 12:27pm

I thought one of the main reasons (apart from the complexity of the EIP – to which one solution offered was to split it up) was exactly the multi-byte opcode problem, description starts here (also check the example from @gumb0): EIP-663: Unlimited SWAP and DUP instructions - #10 by chriseth

axic · April 5, 2020, 12:28pm

Thanks! I think that definitely doesn’t belong under the Implementation section, but rather Specification.

holiman · April 5, 2020, 2:31pm

Right – so it’s definitely the case that old jumpdest analysis will no longer be valid, and the execution flow of a contract may change. I expect that to not actually be an issue for any real-world use case, maybe with the exception of people who use an on-chain “purity” verifier to attest that “this code will never do a DELEGATECALL”. It might be argued that such a verifier should reasonably exit on an opcode which it does not recognize.
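To illustrate what "exit on an opcode which it does not recognize" could look like, here is a minimal off-chain sketch in Python. It is not any deployed verifier: the function name `is_pure` and the tiny `KNOWN_OPCODES` allowlist are hypothetical and only serve the example. The scan skips PUSH data, so a 0xf4 byte inside an immediate is not mistaken for DELEGATECALL.

```python
DELEGATECALL = 0xF4

# Hypothetical, deliberately tiny allowlist, for illustration only.
KNOWN_OPCODES = {
    0x00: "STOP", 0x01: "ADD", 0x50: "POP", 0x56: "JUMP",
    0x57: "JUMPI", 0x5B: "JUMPDEST", 0xF3: "RETURN",
}

def is_pure(code: bytes) -> bool:
    """Return False on DELEGATECALL or on any opcode outside the allowlist."""
    pc = 0
    while pc < len(code):
        op = code[pc]
        if 0x60 <= op <= 0x7F:       # PUSH1..PUSH32: skip the immediate bytes
            pc += 1 + (op - 0x5F)
            continue
        if op == DELEGATECALL or op not in KNOWN_OPCODES:
            return False             # conservative: unknown opcode means reject
        pc += 1
    return True

# 0xf4 appears only as PUSH1 data here, so the code is accepted.
accepted = is_pure(bytes([0x60, 0xF4, 0x50, 0x00]))
# The EIP's example code uses 0xb3 (JUMPSUB), which this allowlist does not know.
rejected = is_pure(bytes.fromhex("6004b300b2b7"))
```

A verifier written this way would reject code using the new opcodes rather than misjudge it.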

EIP-615 contained statements about what was allowed and what was not allowed: the validity of code, and jumps. Whereas 2315 does not make any assumptions/certifications on the validity of code, so from that perspective it does not require versioning.

The only reason some might think versioning is a good thing would be if we believe that it would cause problems for existing contracts. And I have yet to be convinced that a new multibyte instruction would be a problem in practice.

I’m not particularly advocating for it, though, just speculating.

matkt · April 10, 2020, 4:28pm

Hi
Looking at the description of EIP-2315, it seems that there is a contradiction between the test cases and the algorithm.
It is said that during the JUMPSUB we store PC + 1 in the RStack. And this is confirmed in Note 2 (A value popped from return_stack may be outside of the code length, if the last JUMPSUB was the last byte of the code).

But in the test cases (for example 0x6004b300b2b7) we can see that it is PC which is stored in the stack (and not PC + 1)

2 - JUMPSUB - 8 - [4] - []
4 - BEGINSUB - 1 - [] - [2]

For me with the algorithm we should have 3 in RStack and not 2.

2 - JUMPSUB - 8 - [4] - []
4 - BEGINSUB - 1 - [] - [3]

Can you tell me if it is an issue or if I missed something?

Thanks

holiman · April 13, 2020, 10:41am

You are technically correct, but it’s not an issue. In geth, I implemented it differently than the spec states: instead of adding PC+1, I put PC there. And later on, the actual jump goes to PC+1. Unfortunately, this little implementation detail leaks out in the example traces. However, the important thing is that the behaviour of the code is consistent with the semantics of the EIP.

I realize that the example traces are “wrong”, and I will update either the EIP or the go-ethereum repo.

In essence, the EIP should specify exactly the observable behaviour of the EVM, and leave the internal representation up to node implementors. So in that sense, the return_stack is not necessarily represented by an actual stack in reality.
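As a rough illustration of why the two representations are equivalent, here is a toy Python interpreter (not geth's code; the `run` function and its `store_pc_plus_one` flag are made up for this sketch). It runs the example 0x6004b300b2b7 both ways: storing PC+1 and returning to it, or storing PC and returning to PC+1.

```python
PUSH1, STOP, BEGINSUB, JUMPSUB, RETURNSUB = 0x60, 0x00, 0xB2, 0xB3, 0xB7

def run(code: bytes, store_pc_plus_one: bool):
    """Toy interpreter; returns (visited pcs, return-stack snapshot after each step)."""
    pc, stack, rstack = 0, [], []
    visited, snapshots = [], []
    while pc < len(code):
        op = code[pc]
        visited.append(pc)
        if op == STOP:
            break
        if op == PUSH1:
            stack.append(code[pc + 1])
            pc += 2
        elif op == JUMPSUB:
            dest = stack.pop()
            # Spec wording stores PC+1; the geth variant described above stores PC.
            rstack.append(pc + 1 if store_pc_plus_one else pc)
            pc = dest
        elif op == BEGINSUB:
            pc += 1
        elif op == RETURNSUB:
            saved = rstack.pop()
            pc = saved if store_pc_plus_one else saved + 1
        else:
            raise ValueError(f"opcode {op:#x} is not modelled in this sketch")
        snapshots.append(list(rstack))
    return visited, snapshots

CODE = bytes.fromhex("6004b300b2b7")
spec_visited, spec_rstack = run(CODE, store_pc_plus_one=True)
geth_visited, geth_rstack = run(CODE, store_pc_plus_one=False)
```

Both runs execute the same sequence of instructions; only the value sitting in the return stack differs (3 versus 2), which is exactly the detail that leaked into the traces.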

holiman · April 13, 2020, 10:59am

Please see my proposed update: https://github.com/ethereum/EIPs/pull/2599

axic · April 20, 2020, 11:11am

Can any JUMP* go into any subroutine? Also can JUMP from within a subroutine go into another subroutine?

What happens in the case of:

JUMPSUB
JUMP (and jumping into another subroutine)
RETURNSUB

One would assume RETURNSUB picks up the last item from the return_stack. However from an analysis point of view in this case a RETURNSUB within a subroutine block may not only mean returning from that block, but returning from within some other block.

gcolvin · April 21, 2020, 7:58pm

Yes @axic, RETURNSUB must always pop the address last pushed on the return stack and return to it, regardless of the block it’s in. So in your example the RETURNSUB is unreachable, and the code will return to the JUMPSUB when and if it hits some other RETURNSUB. I think that will amount to a loop until the code jumped to stops somehow.

This proposal doesn’t constrain the structure of the code much at all, it just provides an efficient mechanism.

gcolvin · April 21, 2020, 8:04pm

If subroutines are dynamic jumps themselves, how is disabling other kinds of dynamic jumps helping?

Other kinds of dynamic jumps aren’t disabled by this proposal. It only provides a more efficient alternative way to implement subroutines.

axic · April 21, 2020, 8:11pm

Sorry I had a typo in my example. I meant:

BEGINSUB
JUMP <next>
RETURNSUB (1)

next:
JUMPDEST
RETURNSUB (2)

And then (2) returns to where BEGINSUB was first invoked.

holiman · April 22, 2020, 3:07pm

Yes. So in that way, it’s possible to implement the type of tail recursion I talked about here: EIP-2315 Simple Subroutines for the EVM
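The case just confirmed can be sketched with a toy model in Python. This is a simplification, not the EVM: instructions are tuples indexed by position, and JUMP/JUMPSUB carry their destination as an operand instead of taking it from the data stack. The names `PROGRAM` and `run` are hypothetical.

```python
# Index: instruction. A caller (0-1) followed by axic's example (2-6).
PROGRAM = [
    ("JUMPSUB", 2),    # 0: call the subroutine that starts at index 2
    ("STOP",),         # 1: the caller resumes here after the return
    ("BEGINSUB",),     # 2
    ("JUMP", 5),       # 3: plain jump out of the "block"
    ("RETURNSUB",),    # 4: (1) never reached
    ("JUMPDEST",),     # 5: next:
    ("RETURNSUB",),    # 6: (2) pops the return stack
]

def run(program):
    """Execute the toy program and return the list of visited indices."""
    pc, rstack, visited = 0, [], []
    while True:
        op = program[pc]
        visited.append(pc)
        if op[0] == "STOP":
            return visited
        if op[0] == "JUMPSUB":
            rstack.append(pc + 1)   # remember where to resume
            pc = op[1]
        elif op[0] == "JUMP":
            pc = op[1]
        elif op[0] == "RETURNSUB":
            pc = rstack.pop()       # always the last pushed address
        else:                       # BEGINSUB and JUMPDEST are no-ops here
            pc += 1

visited = run(PROGRAM)
```

In this model, RETURNSUB (2) at index 6 resumes the original caller at index 1, and index 4 is never visited.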

chfast · April 29, 2020, 11:26am

The analysis of the EIP with two proposed changes: EIP-2315 "Simple Subroutines for the EVM" analysis. By @chfast, @gumb0 and @axic.

We decided to publish it separately because of its length, and we wish to receive direct responses to the proposed changes.

chriseth · May 6, 2020, 12:14pm

Here is a preliminary implementation to use subroutines for yul functions:

It needs much more work to get it properly working, but maybe you can already use the generated bytecode for testing.

Here is the asm.js binary: https://313936-40892817-gh.circle-artifacts.com/0/soljson.js

gcolvin · May 7, 2020, 8:09pm

Thanks, Christian.

axic · May 8, 2020, 3:52pm

If I see correctly, this kind of follows the recommendations from EIP-2315 "Simple Subroutines for the EVM" - Analysis, because it relies on not flowing into BEGINSUB (at least in one place in the assembler).

axic · May 8, 2020, 3:54pm

Currently the following opcodes are proposed:

0xb2 BEGINSUB
0xb3 JUMPSUB
0xb7 RETURNSUB

I propose to use these instead:

0x5c BEGINSUB
0x5d RETURNSUB
0x5e JUMPSUB

The reason: there are 4 opcodes free between 0x5b and 0x60, and in the same block we have JUMPDEST.

Placing it randomly after LOG would mean we have more holes in the opcode table.

gcolvin · May 9, 2020, 7:18am

Much better.

chriseth · May 11, 2020, 2:27pm

Indeed! As the linked analysis article correctly states, flowing into a subroutine is not a feature a code generator would use. So I did not really follow the recommendations; I just did it in a clean way, which is the same as what is recommended in the analysis.

gcolvin · May 11, 2020, 7:46pm

Optimizing code generators can do some very strange things. Subroutines here need not be contiguous regions of code.

adriamb · May 14, 2020, 2:07pm

In openethereum, we implemented this EIP in https://github.com/openethereum/openethereum/pull/11629, and also added a test for checking the recursion stack limit:

   PUSH <recursion_limit>
s: BEGINSUB
   DUP1
   JUMPI :c
   STOP
c: JUMPDEST
   PUSH1 1
   SWAP1
   SUB
   JUMPSUB :s

With recursion_limit=1024 it stops; with recursion_limit=1025 it reverts.
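For readers who want to follow the counting, here is a small Python model of the test above. It is only a sketch: `run_recursion_test` is a made-up name, and the `max_return_stack=1024` default is an assumption chosen to match the outcome reported here, not a value taken from the EIP text or from the openethereum code.

```python
def run_recursion_test(recursion_limit: int, max_return_stack: int = 1024) -> str:
    """Model the test program: each loop iteration is one pass through s: ... JUMPSUB :s."""
    n = recursion_limit          # PUSH <recursion_limit>
    return_stack_depth = 0
    while True:
        # s: BEGINSUB; DUP1; JUMPI :c  (falls through to STOP when n == 0)
        if n == 0:
            return "stop"
        # c: JUMPDEST; PUSH1 1; SWAP1; SUB  (n = n - 1)
        n -= 1
        # JUMPSUB :s pushes a return address; fail if the return stack is full
        if return_stack_depth == max_return_stack:
            return "revert"
        return_stack_depth += 1

result_1024 = run_recursion_test(1024)
result_1025 = run_recursion_test(1025)
```

Under that assumption, a starting value of 1024 needs exactly 1024 nested JUMPSUBs before the counter reaches zero, while 1025 needs one more than the return stack can hold.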