EIP-3855: PUSH0 instruction

This is the discussion topic for

Thank you for putting this forward! I was thinking about the need for a push0 instruction just the other day. Based on your stats, the impact of it is even larger than I expected!

Out of curiosity, did you collect stats on any other small numbers, such as 1, 2, and 32?

We haven’t, but we could do that. Would take a couple of days though.

Instead of a special “PUSH 1”, I think something like INC/DEC would be a more interesting instruction, as that has multiple uses, especially around loops and rounding. I think 2 and 32 may be more common, though Solidity does rounding using both 31 and 32, which can also be optimised to a combination of shits.

In the EIP we give this reasoning for the choice of opcode:

0x5f means it is in a “contiguous” space with the rest of the PUSH implementations and potentially could share the implementation.

If this argument is not strong enough, then 0x5c seems like a good alternative choice.
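To illustrate the "shared implementation" point, here is a minimal sketch of a single handler covering the whole PUSH range once 0x5f is included. The function name `execute_push` and the list-based stack are hypothetical and only for illustration; no real client is structured exactly like this.

```python
PUSH0 = 0x5F   # opcode proposed in the EIP
PUSH32 = 0x7F  # last opcode of the existing PUSH1..PUSH32 range


def execute_push(code: bytes, pc: int, stack: list) -> int:
    """Execute the PUSHn opcode at code[pc] and return the next pc.

    Hypothetical helper: one code path handles PUSH0 through PUSH32
    because the opcode values are contiguous.
    """
    opcode = code[pc]
    if not PUSH0 <= opcode <= PUSH32:
        raise ValueError(f"not a PUSH opcode: {opcode:#04x}")

    # Number of immediate bytes following the opcode: 0 for PUSH0.
    n = opcode - PUSH0
    immediate = code[pc + 1 : pc + 1 + n]
    # Simplification: immediates truncated by the end of the code
    # are padded with zero bytes on the right.
    immediate = immediate.ljust(n, b"\x00")

    # An empty immediate decodes to 0, so PUSH0 needs no special case.
    stack.append(int.from_bytes(immediate, "big"))
    return pc + 1 + n


stack = []
next_pc = execute_push(bytes([0x5F]), 0, stack)        # PUSH0
next_pc2 = execute_push(bytes([0x60, 0x01]), 0, stack)  # PUSH1 1
print(stack, next_pc, next_pc2)
```

With 0x5c (or any opcode outside the range) the `opcode - PUSH0` trick would not apply and PUSH0 would need its own branch.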

So we can do PUSH0 INC instead of PUSH1 1? :laughing:

A combination of whatnow?

Yes, isn’t that so much nicer?! :slight_smile:

In fairness, my hunch is that the constant 1 is mostly used for loops, in the form of PUSH1 1 ADD, so INC seems better for that. That is, if we are willing to go in the direction of CISC. (PUSH0 is still inspired by the constant 0 register in RISC machines.)

EVM hardcore mode :grimacing:

Discounting the typo, I meant “combination of shifts and other bitwise instructions”.
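As a rough illustration of that point, rounding up to a multiple of 32 can be written with the constants 31 and 32, or with shifts and a mask. This is only a sketch of the arithmetic identity on non-negative Python integers; it ignores 256-bit overflow and makes no claim about what the compiler actually emits. The function names are made up for the example.

```python
def round_up_div_mul(x: int) -> int:
    """Round x up to a multiple of 32 using the constants 31 and 32."""
    return (x + 31) // 32 * 32


def round_up_shifts(x: int) -> int:
    """Same rounding with a shift right and a shift left by 5 bits."""
    return ((x + 31) >> 5) << 5


def round_up_mask(x: int) -> int:
    """Same rounding by clearing the low 5 bits with a bitwise AND."""
    return (x + 31) & ~31


for x in (0, 1, 31, 32, 33, 64, 65):
    print(x, round_up_div_mul(x), round_up_shifts(x), round_up_mask(x))
```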

Love the idea, but maybe we should use a different mnemonic, like IPUSH0 (for immediate) so that if we add others, we have room to grow? :slight_smile:

I have posted the results here: https://gist.github.com/hugo-dc/1ca4682d60098282d7e499bdd0b01fca
Includes analysis of:

  • Occurrences of PUSHn opcodes pushing the values 1, 2, 8, 31, and 32.
  • Occurrences of pushing the specific values 1, 2, 8, 31, and 32, by any of the PUSH opcodes.
  • A comparison between PUSH1 for the specific values 1, 2, 8, 31, and 32 vs any other values.

My hunch is that the majority of cases using the constant 1 are for-loops. And likely a large number of the uses of the constant 32 are for such loops too, since they operate on word sizes. (Though a significant number of occurrences should be for memory operations.)

And for these use cases I think this is a better direction to go:

My intuition is that saving 1 byte is a very marginal improvement and needs a pretty strong justification for actually reserving an opcode for it (opcodes are, after all, limited).

It is not only about saving 1 byte. The main motivation is runtime cost, and keeping contracts from resorting to weird optimisations because they have no better option, since those optimisations then limit us in introducing other features.

Please read the motivation in the EIP, and if it fails to present convincing points, then we need to improve it.
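For a back-of-the-envelope comparison of the options for getting a zero onto the stack, here is a small sketch. The numbers are my understanding of the gas schedule (PUSH1 at 3 gas, RETURNDATASIZE at 2 gas, PUSH0 specified at 2 gas, and 200 gas per deployed byte); treat them as assumptions to check against the EIP and the current schedule. The `ZeroPush` class is purely illustrative.

```python
from dataclasses import dataclass

# Assumed code deposit cost, charged once per deployed byte.
CODE_DEPOSIT_GAS_PER_BYTE = 200


@dataclass(frozen=True)
class ZeroPush:
    """One way of putting a zero on the stack (illustrative only)."""
    mnemonic: str
    size_bytes: int  # bytes occupied in the deployed code
    exec_gas: int    # assumed gas charged per execution

    def deploy_gas(self) -> int:
        return self.size_bytes * CODE_DEPOSIT_GAS_PER_BYTE


PUSH1_ZERO = ZeroPush("PUSH1 0", size_bytes=2, exec_gas=3)
# Only yields zero on paths where no external call has returned data yet.
RETURNDATASIZE = ZeroPush("RETURNDATASIZE", size_bytes=1, exec_gas=2)
PUSH0 = ZeroPush("PUSH0", size_bytes=1, exec_gas=2)

for option in (PUSH1_ZERO, RETURNDATASIZE, PUSH0):
    print(
        f"{option.mnemonic:<15} {option.size_bytes} byte(s), "
        f"{option.exec_gas} gas per execution, "
        f"{option.deploy_gas()} gas to deploy"
    )
```

Under these assumptions PUSH0 matches the cost of the RETURNDATASIZE trick without depending on execution context.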

They are not limited: one can have extension bytes and two-byte opcodes. And even if someone mentally limits it to one byte, we still have over 100 of them left.

Technically speaking all the PUSHn opcodes are not one byte opcodes :slight_smile:

These stats are taken from a histogram of several thousand blocks at the end of last year’s chain. One-byte pushes account for almost half of the pushes and over 10% of the instructions. So from @hugo-dc’s numbers about 4% of instructions are PUSH1 0, and about 6% are a push of 1, 2, or 32.

| OP | Count | % of all instructions |
|---|---|---|
| All PUSH | 78,137,163 | 22.94% |
| PUSH1 | 37,886,773 | 11.12% |
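The "almost half of the pushes" figure can be checked directly from these two rows. The sketch below only uses the counts and percentages quoted above; the implied total instruction count is derived from them and is therefore only as precise as the rounded percentages.

```python
# Figures quoted in the histogram above.
all_push_count = 78_137_163
all_push_pct = 22.94   # percent of all executed instructions
push1_count = 37_886_773
push1_pct = 11.12      # percent of all executed instructions

# Share of PUSH1 among all PUSH instructions.
push1_share_of_pushes = push1_count / all_push_count

# Total instruction count implied by each row; the two estimates
# should roughly agree if the table is self-consistent.
total_from_all_push = all_push_count / (all_push_pct / 100)
total_from_push1 = push1_count / (push1_pct / 100)

print(f"PUSH1 share of all pushes: {push1_share_of_pushes:.2%}")
print(f"Implied total instructions (from All PUSH): {total_from_all_push:,.0f}")
print(f"Implied total instructions (from PUSH1):    {total_from_push1:,.0f}")
```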

FYI: this would save us some headache in Solidity code generation. For one, in some situations we have to balance the stack between branches (and we sometimes accept an awkward code size for doing so…). Apart from that, we constantly have to weigh keeping zeroes on the stack against re-pushing them. Both would become easier, simpler and cleaner with a PUSH0. I even had a draft once for an optimizer step that analysed which code paths are only executed prior to any external call and replaced zeroes with returndatasize in those paths, and I considered something similar with callvalue for non-payable functions after their callvalue check. All of that is extremely awkward, and it’d be nice to be able to drop crazy ideas like that with this EIP. Not that it’s crucial for us, but definitely a nice-to-have.

All PUSH, DUP, and SWAP operations should cost base gas. It’s weird that they don’t.

We have some benchmarks about them, and it is not as clear-cut. We do plan to share these with some recommendations, but I do not think this is strictly related to this EIP.

PEEPanEIP-3855: PUSH0 instruction with @axic @chfast @hugo-dc

It’s related to the motivation: people only use those opcodes in place of a zero push because they are cheaper.