EIP-663: Unlimited SWAP and DUP instructions

That would be fully contrary to the intention of this EIP of making more of the stack space addressable, though.

What conceptual complexity, resp. what confusion and footguns, do you mean exactly? Generally, I'd not expect people to decipher opcodes by hand - and disassembled, you'd just print SWAP_N 0 as SWAP17. But in any case, either way is completely fine - the additional 16 items are indeed of no real concern. So if there is any strong preference to starting from zero, as far as I'm concerned we can just go with that as well.

It's a conceptual hurdle for anybody learning the EVM, or writing a low level tool/assembler/disassembler to understand, maintain and debug. I don't think it's necessarily a huge hurdle, but it is a nonzero one which IMO is not outweighed by the additional space addressability (I don't think the extra 16 items will be useful in practice, although if I see a compelling example I could have my mind changed).


I like that the spec reads the stack index inline, as PUSH opcodes do.

If the code is legacy bytecode, both of these instructions result in an exceptional halt

I don't want to prevent non-EOF code from using this, because I don't plan to adopt EOF. Contracts shouldn't be using invalid opcodes to revert, so it shouldn't matter if we break such behavior for legacy contracts. We haven't done the same for other opcodes introduced in the past. Why is it being done here?

Exactly because of the property you like:

This cannot be achieved on legacy code, due to jumpdest-analysis and that existing code on chain can contain this instruction already (here's one explainer on the current thread). The use of immediate (in-line) arguments is made possible by EOF.
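
To make the jumpdest-analysis point concrete, here is a minimal sketch of a legacy-style jumpdest scan, run once with today's rules and once with a SWAPN that consumes one immediate byte. The opcode value `0xE7`, the `immediate_sizes` parameter and the function name are assumptions for illustration only, not taken from the spec.

```python
PUSH1, PUSH32, JUMPDEST = 0x60, 0x7F, 0x5B
SWAPN = 0xE7  # hypothetical opcode value, used only for this illustration


def valid_jumpdests(code: bytes, immediate_sizes: dict) -> set:
    """Return the offsets of JUMPDEST bytes that are not inside immediate data.

    `immediate_sizes` maps an opcode to the number of immediate bytes that
    follow it (PUSH1..PUSH32 are always handled, as in legacy analysis).
    """
    dests = set()
    pc = 0
    while pc < len(code):
        op = code[pc]
        if op == JUMPDEST:
            dests.add(pc)
        if PUSH1 <= op <= PUSH32:
            pc += op - PUSH1 + 1  # skip the PUSH data bytes
        else:
            pc += immediate_sizes.get(op, 0)  # skip any other immediate
        pc += 1
    return dests


# Already-deployed code could contain these bytes: an unassigned opcode
# followed by a JUMPDEST.
code = bytes([SWAPN, JUMPDEST, 0x00])

legacy = valid_jumpdests(code, {})                 # today's rules
with_immediate = valid_jumpdests(code, {SWAPN: 1})  # SWAPN takes 1 byte
```

Under today's rules offset 1 is a valid jump target; once the byte after SWAPN is treated as an immediate, the same offset stops being one, so the meaning of existing code would change.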


Yeah, fair enough. And yeah, as I said, you're right, reaching the additional 16 items is definitely not overly relevant in practice. The (weak) argument for starting at 17 was rather to avoid having duplicated, resp. "useless" opcodes (i.e. SWAP_N 0...SWAP_N 16 would never be used, as long as we still have SWAP1...SWAP16). But yeah, if we want to keep the option to deprecate or remove the old swaps eventually (even though I'm not sure that'll ever actually happen), resp. since there's concern about starting at 17 making it harder to maintain tools (even though I also don't think that's that significant), I see no problem with starting from 0 instead.

This clarification to the spec was merged, but the +17 idea was not included. That is tracked in this branch now, pending decision: GitHub - ipsilon/EIPs at eip-663-plus17
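
For readers comparing the two numbering schemes, a small sketch of how a disassembler might print the immediate. It assumes a one-byte immediate; `base` is a hypothetical parameter giving the depth that immediate 0 denotes. `base=17` matches the "SWAP_N 0 as SWAP17" reading above, while `base=1` for the start-from-zero variant is an assumption chosen so the difference matches the "additional 16 items" mentioned in the thread.

```python
def swapn_mnemonic(immediate: int, base: int) -> str:
    """Render a SWAP_N immediate as the equivalent SWAPk mnemonic."""
    if not 0 <= immediate <= 0xFF:
        raise ValueError("immediate must fit in one byte")
    return f"SWAP{immediate + base}"


def deepest_reach(base: int) -> int:
    """Deepest swap depth reachable with a one-byte immediate."""
    return 0xFF + base


plus17_first = swapn_mnemonic(0, base=17)             # "SWAP17"
extra_items = deepest_reach(17) - deepest_reach(1)    # 16
```

The tooling cost under discussion is essentially the `+ base` term: every assembler and disassembler has to apply the same offset consistently.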


For what itā€™s worth, there was some feedback from Twitter people wanting SWAPMN:
https://twitter.com/alexberegszaszi/status/1598124647723433984

If it were to happen, XCHG may sound like an alternative name.

However some questioned how frequent the use case for SWAPMN may be:
https://twitter.com/recmo/status/1598215821125304321


A couple thoughts:

3-4 SWAP_N instructions will cover the entire addressable space of 1024 stack items.

We could have potentially 1-3 SWAP_N_M instructions which take different numbers of immediates to address codesize concerns.
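
A rough check of the coverage figure, assuming each SWAP_N opcode carries a one-byte immediate and the opcodes cover consecutive, non-overlapping windows of the stack (both assumptions; the thread does not spell out the layout). Under those assumptions the count comes out at the upper end of the "3-4" estimate.

```python
import math

MAX_STACK_ITEMS = 1024  # EVM stack limit


def instructions_needed(immediate_bytes: int) -> int:
    """Number of opcodes needed if each one addresses 256**immediate_bytes items."""
    reach = 256 ** immediate_bytes
    return math.ceil(MAX_STACK_ITEMS / reach)


one_byte = instructions_needed(1)   # four windows of 256 items
two_bytes = instructions_needed(2)  # a single opcode would suffice
```

The trade-off is code size: a two-byte immediate needs only one opcode but costs an extra byte on every use.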

What would the tradeoffs be for, instead of using immediate opcodes, using the first element(s) of the stack for the DUP / SWAP as suggested here? EIP-? : Introduce Opcodes B0 DUPN and B1 SWAPN - #2 by gumb0

The immediate argument idea seems simpler to implement, so I assume it's the best choice.
SWAP_N_M as suggested by @charles-cooper would allow optimizing and cleaning a lot of EVM compiler code. Would it be in scope to add these three?

DUP_N
SWAP_N
SWAP_N_M

The big tradeoff with using stack values as parameters vs immediate values is that thereā€™s more overhead, will probably cost more gas, and allows less upfront validation and analysis.


What we lose with stack reading is provably static dups and swaps. With some code flow analysis you can prove some stack based loads are static, but it opens the door to dynamic swaps and dups (which may be useful for on-stack arrays). Dynamic swaps and dups, however, nerf almost all useful register mapping schemes. It also complicates the stack proving requirements in EOF.
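
A minimal sketch of the kind of upfront check that immediates make possible: walking straight-line code and tracking only the stack height. The instruction tuples and the exact depth convention (SWAPN `n` exchanges the top with the item `n` below it, DUPN `n` copies the `n`-th item from the top) are assumptions for illustration, not the EOF validation rules.

```python
def validate(instrs, initial_height: int = 0) -> bool:
    """Return True if no instruction reaches below the bottom of the stack.

    `instrs` is a list of (name, immediate) tuples from a hypothetical
    mini instruction set: PUSH, POP, DUPN, SWAPN.
    """
    height = initial_height
    for name, imm in instrs:
        if name == "PUSH":
            height += 1
        elif name == "POP":
            if height < 1:
                return False
            height -= 1
        elif name == "DUPN":
            if height < imm:       # needs at least imm items
                return False
            height += 1
        elif name == "SWAPN":
            if height < imm + 1:   # needs the top plus imm items below it
                return False
        else:
            raise ValueError(f"unknown instruction {name}")
    return True


ok = validate([("PUSH", 1), ("PUSH", 2), ("SWAPN", 1)])
too_deep = validate([("PUSH", 1), ("SWAPN", 1)])
```

If the depth came from a stack value instead, the check would need to know that value at validation time, which in general it cannot.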


My understanding is that static stack machine code is essentially already in SSA form.

I would prefer to have this EIP accepted instead of having stack allocation in memory, which is what happens in the Solidity compiler with the "viaIR" option.


I don't think having dynamic stack access (i.e. using the first element of the stack to define which stack item(s) is being accessed) is a good idea. It complicates code analysis and does not provide much benefit, besides being compatible with non-EOF contracts.

I do agree though that this EIP should be expanded to 1 or even 2 DUP_N and SWAP_N instructions, and also SWAP_N_M instructions which will cover the entire addressable range of the stack.

The use case for SWAP_N_M is very clear, to the point that I think this EIP is only really useful if it includes SWAP_N_M or some of the variants I proposed above. It is used by compilers for stack scheduling. Right now, in order to swap the n'th and m'th items of the stack, (for instance the 2nd and 3rd items need to be swapped for an ADD instruction), you need to issue SWAPN SWAPM SWAPN, which also costs 9 gas. Having a single instruction for this would not be more costly in the VM implementation and would simplify a lot of bytecode.
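
The three-swap sequence can be checked with a small stack simulation. This is a sketch with hypothetical helper names; the stack is a Python list with the top at the end, and each legacy SWAP is charged 3 gas, which gives the 9 gas quoted above.

```python
SWAP_GAS = 3  # gas charged per legacy SWAP1..SWAP16


def swap(stack: list, n: int) -> None:
    """Legacy SWAPn: exchange the top item with the item n positions below it."""
    stack[-1], stack[-1 - n] = stack[-1 - n], stack[-1]


def exchange(stack: list, n: int, m: int) -> int:
    """Exchange the items at depths n and m via SWAPn SWAPm SWAPn.

    Returns the total gas spent on the three swaps.
    """
    for depth in (n, m, n):
        swap(stack, depth)
    return 3 * SWAP_GAS


stack = ["d", "c", "b", "a"]    # "a" is the top of the stack
gas = exchange(stack, 1, 2)     # exchange "b" (depth 1) and "c" (depth 2)
```

After the call the top item is back in place and only the two targeted items have traded positions, which is the effect a single SWAP_N_M would produce in one instruction.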
