Thank you for putting this forward! I was thinking about the need for a push0 instruction just the other day. Based on your stats, the impact of it is even larger than I expected!
Out of curiosity, did you collect stats on any other small numbers, such as 1, 2, and 32?
We haven’t, but we could do that. Would take a couple of days though.
Instead of a special “PUSH 1”, I think something like INC/DEC would be a more interesting instruction, as that has multiple uses, especially around loops and rounding. I think 2 and 32 may be more common, though Solidity does rounding using both 31 and 32, which can also be optimised to a combination of shits.
In fairness my hunch is that the constant 1 is mostly used for loops, in the form of PUSH1 1 ADD, so instead of that INC seems better. That is if we are willing to go into the direction of CISC. (PUSH0 is still inspired by the constant 0 register in RISC machines.)
EVM hardcore mode
Discounting the typo, I meant “combination of shifts and other bitwise instructions”.
My hunch is that the majority of cases using the constant 1 are for-loops. And likely a large number of the uses of the constant 32 is for such loops too, which operate on word sizes. (Though a significant number of occurrences should be for memory operations.)
And for these use cases I think this is a better direction to go:
My intuition is that saving 1 byte is a very marginal improvement and needs a pretty strong justification for actually reserving an opcode for it (which are after all limited)?
It is not only about saving 1 byte. The main motivation is runtime cost, and avoiding a situation where contracts use weird optimisations because they have no better option and those optimisations then limit us in introducing other features.
Please read the motivation in the EIP, and if it fails to present convincing points, then we need to improve it.
They are not limited, one can have extension bytes and two-byte opcodes, but even if someone mentally limits it to one byte, then we still have over 100 of them left.
Technically speaking all the PUSHn opcodes are not one byte opcodes
These stats are taken from a histogram of several thousand blocks at the end of last year’s chain. One-byte pushes account for almost half of the pushes and over 10% of the instructions. So from @hugo-dc’s numbers about 4% of instructions are PUSH1 0, and about 6% are a push of 1, 2, or 32.
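As a rough illustration of how such a histogram can be gathered (this is a sketch, not the tooling used for the numbers above), the snippet below walks a piece of EVM bytecode and counts the immediates of PUSH1 instructions. The function name and the sample bytecode are made up for the example. It also assumes every byte is code, which is a simplification: real deployed contracts may carry trailing metadata or data sections.

```python
from collections import Counter

PUSH1 = 0x60   # first of the PUSHn opcodes with an immediate
PUSH32 = 0x7F  # last of the PUSHn opcodes

def push1_histogram(bytecode: bytes) -> Counter:
    """Count the immediate values of PUSH1 instructions (hypothetical helper).

    Immediates of all PUSHn instructions are skipped, so a 0x60 byte inside
    push data is not mistaken for an opcode.
    """
    counts = Counter()
    pc = 0
    while pc < len(bytecode):
        op = bytecode[pc]
        if PUSH1 <= op <= PUSH32:
            size = op - PUSH1 + 1  # number of immediate bytes
            if op == PUSH1 and pc + 1 < len(bytecode):
                counts[bytecode[pc + 1]] += 1
            pc += 1 + size
        else:
            pc += 1
    return counts

# PUSH1 0, PUSH1 1, ADD, PUSH1 32, MSTORE, PUSH2 0x6000, PUSH1 0
sample = bytes.fromhex("60006001016020526160006000")
hist = push1_histogram(sample)
print(hist[0], hist[1], hist[32])
```

The PUSH2 in the sample carries the bytes 60 00 as data; because the walker skips immediates, they are not counted as another PUSH1 0.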
FYI - this would save us some headache for Solidity code generation - for one, in some situations we have to create stack balance between branches (we sometimes choose an awkward codesize for doing so…), and apart from that we constantly have to seek balance between keeping zeroes on stack or repushing them, both of which would be made easier, simpler and cleaner using a PUSH0. I’ve even had a draft for an optimizer step once that analysed which code paths are only executed prior to any external call and replaced zeroes with returndatasize in those paths - and considered something similar with callvalue for non-payable functions after their callvalue check - all of which is extremely awkward and it’d be nice to be able to drop crazy ideas like that with this EIP. Not that it’s crucial for us, but definitely a nice-to-have.
We have some benchmarks about them, and it is not as clear cut. We do plan to share these with some recommendations, but I do not think this strictly is related to this EIP.
PUSH0 has been enabled by default in Solidity since 0.8.20, but most blockchains still haven’t implemented it, causing “invalid opcode” errors if developers use the latest compiler. So we’re stuck with using an older version.
The best we can do for the moment to work with the latest Solidity compiler is to set evmVersion to the previous version:
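A minimal sketch of that workaround, assuming the compiler is driven through solc's standard JSON interface. `paris` is the EVM version that precedes `shanghai`, the default since 0.8.20. The helper name, file name and contract source are placeholders for the example.

```python
import json

def standard_json_input(source_name: str, source_code: str,
                        evm_version: str = "paris") -> str:
    """Build a solc standard JSON input that pins the EVM version.

    Hypothetical helper: targeting "paris" keeps the compiler from
    emitting PUSH0, which is only available from "shanghai" onwards.
    """
    doc = {
        "language": "Solidity",
        "sources": {source_name: {"content": source_code}},
        "settings": {
            "evmVersion": evm_version,
            # Request only the creation bytecode for every contract.
            "outputSelection": {"*": {"*": ["evm.bytecode.object"]}},
        },
    }
    return json.dumps(doc, indent=2)

# Placeholder contract, just to have something to compile.
print(standard_json_input("Example.sol", "contract Example {}"))
```

The resulting JSON can be piped to `solc --standard-json`. Build frameworks generally expose the same `evmVersion` setting in their own configuration files.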
Polygon announced just days ago that they’ve implemented PUSH0 on their zkEVM blockchain. I’ve tested the testnet and it works. zkEVM mainnet will work in 4 days. We need the many other blockchains to do the same.