EIP 1884: Repricing for trie-size-dependent opcodes

Related to the repricing: how much time, at most, should a node spend on computing a block? Asked differently: if it takes more than X seconds to compute a block, for which X would this cause issues for the network?

Initially, 1 gas represented around 1,000 ns on the (unknown) machine that was used to determine the initial gas costs, and the block gas limit was 3.1415m (that number looks familiar…). So this seems to indicate that ~3 seconds were deemed acceptable to compute a block, but of course this might all be coincidence.

Today we often achieve 20m gas/second on recent hardware with new versions of geth, and 10m gas/second on an “ordinary” (SSD-backed) cloud instance, so ~1 second to compute today’s 10m gas limit blocks.
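
To make the arithmetic explicit (a minimal sketch; the 1,000 ns/gas, 3.1415m gas limit and 10-20m gas/s figures are simply the ones quoted above, not new measurements):

```go
package main

import "fmt"

func main() {
	// Original assumption: 1 gas ≈ 1,000 ns on the reference machine,
	// with a 3.1415M block gas limit.
	const nsPerGas = 1000.0
	const originalGasLimit = 3_141_500.0
	fmt.Printf("original worst-case block time: %.2f s\n",
		originalGasLimit*nsPerGas/1e9) // ≈ 3.14 s

	// Today: ~10-20M gas/s throughput against a 10M block gas limit.
	const currentGasLimit = 10_000_000.0
	for _, gasPerSecond := range []float64{10e6, 20e6} {
		fmt.Printf("at %.0fM gas/s: %.2f s per full block\n",
			gasPerSecond/1e6, currentGasLimit/gasPerSecond)
	}
}
```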

Obviously, any X above or close to the target block time of 13 seconds is a certain show-stopper. But what about blocks which require 5 seconds on recent powerful hardware today? In the end it is about when the network either starts to slow down due to longer block times created by the miners, or when nodes used by applications (e.g. Infura, but of course there are many others out there) can’t keep up anymore. Does anyone have experience or suggestions regarding the “minimum supported throughput on the minimum supported hardware”?

You’re right here: if updates happen often enough it would make sense to selfdestruct data contracts when you replace them, but in other contexts it would be worse. Adding in the code to ensure that only you can self-destruct your data is more expensive than prefixing it with the invalid opcode. In the case where you never update the data, there is no benefit to cleanup. In the case where reads happen sufficiently more often than updates, the differential read cost could overtake the write savings. As with GST2, expected gasPrice should play a role in your calculation.

Ooh, I hadn’t considered that.

Besides the block time, you also have to consider the block propagation time. A node will not pass along a block unless it is valid, and the network is not fully-connected, so a block must be validated by many nodes before all of the miners will adopt it. Block validation time is a key component of uncle rate.

Given the success of Ethereum, it is likely that more “repricing” operations will be needed in the future.
I therefore wonder if it would be more productive to focus on giving a price to state (completing the work of Alexey Akhunov) and drop this EIP.

This is not part of the spec. BALANCE gas will get bumped to the new amount even if it is called on the current address.

A common assumption, but a false one nonetheless. Clients only verify that sufficient PoW was done (i.e., they only validate the header) before propagating the block.


That seems unfair to existing contracts.

I have submitted a proposed adjustment to EIP 1884 that would solve the EXTCODECOPY issue and the SELFBALANCE discrepancy.

I find the argument for increasing gas prices based on how one concrete implementation performs during syncing highly questionable. I suggest dropping this EIP completely and defining a new strategy for gas pricing and gas pricing changes.

  1. Both SLOAD and BALANCE are key-value lookups whose complexity depends solely on the data structure chosen by the implementing client. There is no reason why geth couldn’t adopt a backing data store, or just an index, with O(1) lookup times for these key-value pairs. Does anyone know why geth is not considering an O(1) hash table for SLOAD and BALANCE?

  2. While a full sync is an important operation, I do not see why it should be the measure for EVM pricing. IMHO normal day-to-day operation of a node should be the reference, not the syncing time. During normal day-to-day operation, EVM execution and PoW happen back to back, with PoW probably burning 99% of the CPU resources and the EVM 1% (sorry, this is just a guess; I have not found reference values for this). The strategy for pricing EVM operations should thus be based on the long-term effect on the total size of the state tree, and only to a lesser degree on the CPU time associated with executing the EVM – at least as long as the PoW CPU time is so much bigger.

  3. Changes to EVM gas pricing should only be considered if there is consensus between the client implementations. If we see that all client implementations have the same performance disparity on a certain operation, it would make sense to start the process of changing the operation’s cost – under the same hardware setup, at least the big clients (aleth, parity, geth, others?) should be checked before drawing any conclusions.

Geth is considering it, and working on it. Parity doesn’t have it either, and none of the other clients do, to my knowledge. It’s not a trivial problem to solve. If it indeed becomes solved (or at least improved) in future iterations, and it’s deemed possible to lower the limits again, that would be wonderful.

Actually, verifying the PoW on a block takes somewhere between 5-15ms (depending on machine, of course), and verifying the full block (verifying all txs, running all executions) takes somewhere around 200-300ms. It’s mainly disk IO that’s the bound – again, depending on machine. Sometime in the future, when we have Optane DC RAM, it might be a different case.

My benchmarks have been published for over half a year, and nobody has challenged them with drastically different measurements made on another client. I made these on geth, because I’m a geth developer. Nobody from parity/trinity/nethermind/besu at any time offered the opinion that “no, that’s just geth performing badly, we don’t see this as an issue”.

This is most definitely a cross-client concern.


Hey, this is great! If solving the problem within geth is within reach, then this EIP should definitely be dropped. A not-yet-finished fix in a client should not be the cause of a temporary EVM repricing. Changing EVM rules breaks the immutability of smart contracts and the blockchain and can only be a last resort.

200-300ms for the validation, then (given you accept that block) another 200-300ms for executing the transactions of a new candidate, and then 14400-14600ms for solving that new PoW (assuming an avg. 15 sec block time). So EVM computation seems to account for 2-4% of day-to-day work for miners, according to those 200-300ms numbers. Questions regarding these tests:

  • What is the reference hardware for this? (RAM and CPU frequency/cores, storage)
  • Is there general agreement on the reference hardware?
  • You mentioned disk io, does the most recent state data set fit in memory?
    • How big is that data set in your measurements?
    • Should it fit in memory? Should it not?
    • Are the tools to re-run this test and get the same numbers in git somewhere, for others to reproduce? If not, can you make them available and point us there? Would love to try.

At 2-4% it makes sense to look at options for making geth execution faster, but these seem specific to geth and shouldn’t cause an EVM repricing. Just regarding disk IO there are multiple options, each of which would solve this. At least:

  • Data prefetching,
  • Optimistic parallel transaction execution, and of course
  • Using a hash table for O(1) access.

These would alleviate all IO read concerns at this time and probably also reduce that 2-4%. Do you know what progress there has been on this in geth over the last half year or so?

Absence of evidence is not evidence. As an outsider I have no indication whether this assumption about the other clients is right or wrong. I think I’m just asking for proper process: there should be a minimum quorum of the other clients positively confirming or rejecting that this must be fixed in the EVM, and is not a client implementation detail, before the EVM is modified. I guess this is an Ethereum governance question.

You have done excellent work with these measurements, and I wish more client teams would follow that example and do similar benchmarking, to arrive at best practices for implementing an efficient client and to find consensus on EVM pricing/mispricing. These measurements do show an issue that needs to be addressed: an O(1) operation such as SLOAD should not regress in such a way and needs to be fixed – but IMHO this is a bug in the client and needs to be fixed there.

Cheers!

Let’s take a step back again; I think my comment was misinterpreted. SLOAD/BALANCE and other opcodes like EXTCODEHASH are not inherently O(1), due to the fact that the Ethereum state is represented in a trie. Looking up any key in the trie inherently depends on the number of node lookups that need to be made, which grows with the size of the trie. Thus, 1884 uses the term trie-size-dependent opcodes.

Now, when I said Geth is considering it, I was referring to this PR: https://github.com/ethereum/go-ethereum/pull/20152 . That PR introduces an additional acceleration data structure, which adds an extra 11-12 GB of disk requirements for a flat database of the trie leaves. During runtime, it maintains an in-memory tree of diffs that from time to time gets ‘flattened’ towards the disk layer. This difflayer is needed to handle reorgs. This lookup structure is an optimization, and we’re not certain how good we can make it and how soon we can have it done.
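
Conceptually it looks something like this (a toy sketch of the layering described above, not the actual code in the PR): lookups walk down a stack of in-memory diff layers until they hit the flat on-disk layer, and sufficiently old diff layers eventually get merged (‘flattened’) into it.

```go
package main

import "fmt"

// diskLayer stands in for the flat on-disk database of trie leaves.
type diskLayer struct{ data map[string]string }

// diffLayer holds the changes made by one block on top of its parent layer.
type diffLayer struct {
	parent layer
	writes map[string]string
}

type layer interface{ get(key string) (string, bool) }

func (d *diskLayer) get(key string) (string, bool) { v, ok := d.data[key]; return v, ok }

func (d *diffLayer) get(key string) (string, bool) {
	if v, ok := d.writes[key]; ok {
		return v, ok
	}
	return d.parent.get(key) // fall through towards the disk layer
}

// flatten merges a diff layer into the disk layer once a reorg past it is
// no longer expected (reorg handling itself is deliberately left out here).
func flatten(d *diffLayer, disk *diskLayer) {
	for k, v := range d.writes {
		disk.data[k] = v
	}
}

func main() {
	disk := &diskLayer{data: map[string]string{"0xaa:slot0": "0x01"}}
	block1 := &diffLayer{parent: disk, writes: map[string]string{"0xaa:slot0": "0x02"}}
	block2 := &diffLayer{parent: block1, writes: map[string]string{"0xbb:slot7": "0x09"}}

	v, _ := block2.get("0xaa:slot0")
	fmt.Println("latest value:", v) // 0x02, found in block1's diff

	flatten(block1, disk) // block1 is now final; merge it down
	fmt.Println("disk now holds:", disk.data["0xaa:slot0"])
}
```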

The issue that EIP-1884 tries to handle is valid now – or rather, if you look at the numbers, it was measured up to around block 6.5M, and we’re approaching 9M now. So 1884 is sorely needed, and might even be too conservative.

Data prefetching is already done here, by executing the following block while we’re executing the current one.

Interesting. Could you please share your evidence for that (imo pretty bold) claim?

That’s why we discuss EIPs within the AllCoreDevs forum, and decide what to include on publicly broadcast ACD meetings, where all client developers are present. Within that forum, consensus has been reached that this is a real problem that needs to be addressed on the platform layer.

SLOAD is also a lookup into a trie (although an individual per-account trie), inherently not O(1).

Regarding my benchmarks, they’re available here. The datapoints for three different runs are there, all executed on an m5d.2xlarge AWS instance. The benchmarks were done on a couple of separate branches, but the relevant commit that makes geth dump out stats is here.


Hi,

thanks for taking the time to go through this; I honestly feel a bit bad for taking so much of your time away from developing eth code. Also, I’m a bit worried that this is now a conversation between two individuals – I’m not sure how this can bloom into Ethereum governance consensus.

This looks like some really good work. That it adds storage as additional acceleration data makes sense; in general database terms you could probably call what you are adding an index. The additional 12 GB of data makes me curious, though: what percentage of the total state data does this index amount to? Is it 100%? Is one full world state (excluding history) currently 12 GB?

This sounds like the index for geth is still a couple of months out. Is this also true for the other clients?

Thanks for the link, this is great! From looking at it I think the prefetcher cannot be very effective, as it is just running one block ahead. If it has to do real disk IO, the prefetch will not be faster than the “real” loop after it, and the “real” loop will catch up to the prefetcher because, apart from IO, they are both doing the same work. I think you will end up in a case where the “prefetcher” is usually overrun by the “real” runner and they then work in parallel, so in terms of IO throughput the increase is at most 2x. A simple alternative could be to prefetch all the transaction destinations – in fact you could do this even for the current block: before executing the block, loop over the transactions and tell the IO scheduler to prefetch each transaction destination’s data, then run the block.
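
A rough sketch of that idea (hypothetical types and function names; a real implementation would warm geth’s trie/database caches rather than count loads, and would tie into the actual transaction type):

```go
package main

import (
	"fmt"
	"sync"
)

type Address [20]byte

// Tx and stateReader are stand-ins for the real transaction and state-access
// types; the point is only the access pattern, not the APIs.
type Tx struct{ To Address }

type stateReader interface {
	// LoadAccount performs the (potentially slow) disk read and leaves the
	// result in whatever cache the execution loop will hit afterwards.
	LoadAccount(Address)
}

// prefetchDestinations fires off the reads for every transaction target in
// parallel, so the SSD sees many outstanding requests instead of one at a time.
func prefetchDestinations(txs []Tx, db stateReader, parallelism int) {
	sem := make(chan struct{}, parallelism) // bound the number of in-flight reads
	var wg sync.WaitGroup
	for _, tx := range txs {
		wg.Add(1)
		sem <- struct{}{}
		go func(to Address) {
			defer wg.Done()
			defer func() { <-sem }()
			db.LoadAccount(to)
		}(tx.To)
	}
	wg.Wait()
}

type fakeDB struct {
	mu     sync.Mutex
	loaded int
}

func (f *fakeDB) LoadAccount(Address) { f.mu.Lock(); f.loaded++; f.mu.Unlock() }

func main() {
	txs := make([]Tx, 100)
	db := &fakeDB{}
	prefetchDestinations(txs, db, 16) // 8-16 matches the SSD parallelism point below
	fmt.Println("accounts warmed:", db.loaded)
}
```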

The key is that we must make use of the high parallelism of SSDs and keep 8-16 read commands in flight at all times if we want to use the hardware efficiently.

An additional, alternative approach would be optimistic parallel transaction execution – but this is also much more complex. In this scenario you execute all transactions in the block at the same time, but against isolated states. After the transactions finish, one would need to compare the reads and writes each transaction has done and decide whether they are mergeable in any order; those that are not mergeable (e.g. because two transactions modified the same balance or the same storage location) would need to be executed in a second parallel step. This would probably be a bigger rewrite, but it would not only use IO more efficiently but would also allow effective use of multiple CPUs when processing transactions.
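
A very compressed sketch of that optimistic scheme (hypothetical types; real conflict detection would track storage slots, balances and nonces, and merging would have to respect the block’s transaction ordering rules):

```go
package main

import (
	"fmt"
	"sync"
)

type Tx struct {
	ID     int
	Reads  []string // keys this tx reads (address/slot identifiers)
	Writes []string // keys this tx writes
}

// executeParallel runs every tx against an isolated view, then keeps only the
// txs whose read/write sets do not collide with an earlier accepted tx; the
// rest are returned for another execution pass.
func executeParallel(txs []Tx) (merged, conflicting []Tx) {
	var wg sync.WaitGroup
	results := make([]Tx, len(txs))
	for i, tx := range txs {
		wg.Add(1)
		go func(i int, tx Tx) {
			defer wg.Done()
			// ... execute tx against an isolated state copy here, recording
			// its actual read and write sets ...
			results[i] = tx
		}(i, tx)
	}
	wg.Wait()

	written := map[string]bool{}
	for _, tx := range results {
		if conflicts(tx, written) {
			conflicting = append(conflicting, tx)
			continue
		}
		for _, w := range tx.Writes {
			written[w] = true
		}
		merged = append(merged, tx)
	}
	return merged, conflicting
}

func conflicts(tx Tx, written map[string]bool) bool {
	for _, key := range append(tx.Reads, tx.Writes...) {
		if written[key] {
			return true
		}
	}
	return false
}

func main() {
	txs := []Tx{
		{ID: 1, Reads: []string{"A.balance"}, Writes: []string{"A.balance"}},
		{ID: 2, Reads: []string{"B.balance"}, Writes: []string{"B.balance"}},
		{ID: 3, Reads: []string{"A.balance"}, Writes: []string{"C.balance"}}, // conflicts with tx 1
	}
	ok, redo := executeParallel(txs)
	fmt.Println("merged:", len(ok), "to re-execute:", len(redo))
}
```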

The evidence is the hardware capacity here vs. the theoretical work we ask it to do. I have yet to understand how big all contract storage is in GB at a recent block, so here, without that knowledge, is a purely disk-IO-based estimate:

Let’s assume the idealized worst case: you have 8M gas and all of it goes into SLOAD. At SLOAD = 200 gas you would get a maximum of 40,000 SLOAD calls in a block. Each SLOAD call wants to read 32 bytes of data. In total that is 40,000 x 32 bytes ≈ 1.3 MB of data that needs to be read from the disk – worst case.

The SSD (NVMe in the case of m5d.2xlarge) on the other hand gives >100,000 IOPS, or ~400 MB of random reads per second, if we can send it read commands asynchronously. So from a hardware perspective we are able to randomly read the required ~1.3 MB within ~3 milliseconds.
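
Put as code (the same back-of-the-envelope calculation; the 200 gas SLOAD cost is the pre-Istanbul value, and the ~400 MB/s figure is the rough bandwidth quoted above, not a measurement):

```go
package main

import "fmt"

func main() {
	const (
		blockGas      = 8_000_000.0
		sloadGas      = 200.0 // pre-Istanbul SLOAD cost
		bytesPerSload = 32.0
		readBandwidth = 400e6 // ~400 MB/s random reads, rough figure from above
	)
	sloads := blockGas / sloadGas  // 40,000 SLOADs
	data := sloads * bytesPerSload // ~1.28 MB
	fmt.Printf("SLOADs per block: %.0f\n", sloads)
	fmt.Printf("data read: %.2f MB\n", data/1e6)
	fmt.Printf("time at %.0f MB/s: %.1f ms\n", readBandwidth/1e6, data/readBandwidth*1e3)
}
```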

So we have plenty of runway in terms of raw IO power here, but we might be limited by sequential access patterns, data fragmentation or other things. It would indeed be interesting to see the actual amount of disk IO going on during these tests, and which part of the 200-300ms you mentioned before is disk wait – and which part is in-memory tree traversals and other work.

The call notes state “General concensus that there are no objections from any of the clients” – this seems to be in line with what you’re saying, but it is also what I’m afraid of here: this EIP is going to break tons of contracts, just because nobody had the time or motivation to closely analyze and evaluate its impact or necessity. My question is primarily an Ethereum governance question: is it intentional that EIPs that break smart contract immutability are passed through because that is the path of least resistance for the client teams? E.g. in this case, the fact that changing some hard-coded gas values is easier than implementing an index seems to be the reason this EIP was accepted.

Sweet! Did you sync over the network or did you have the chain data local? Do you have the rough setup / command line history to reproduce this?

Best

Although this has been answered extensively, I’d like to highlight one more reason why the above bold highlights are not exactly correct.

  • Block headers require the recording of a state root.
  • This state root is calculated from the trie of account storage roots.
  • Every account’s storage root is itself calculated from a trie of stored data.

The backing data store can be whatever a node implementation chooses, but if the implementation is to verify (or propose!) block headers and state changes, then it must be able to compute these roots.
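
A minimal sketch of that nesting (entirely hypothetical types, with a sorted-key hash as a stand-in where the real protocol uses Merkle-Patricia tries and RLP encoding):

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
)

// hashKV is a stand-in for a trie root: any change to any key or value
// changes the resulting commitment.
func hashKV(kv map[string][]byte) [32]byte {
	keys := make([]string, 0, len(kv))
	for k := range kv {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	h := sha256.New()
	for _, k := range keys {
		h.Write([]byte(k))
		h.Write(kv[k])
	}
	var out [32]byte
	copy(out[:], h.Sum(nil))
	return out
}

type Account struct {
	Balance uint64
	Storage map[string][]byte // slot -> value
}

// stateRoot mirrors the layering: each account commits to its own storage
// (its storage root), and the state root commits to all accounts.
func stateRoot(accounts map[string]Account) [32]byte {
	leaves := map[string][]byte{}
	for addr, acc := range accounts {
		storageRoot := hashKV(acc.Storage)
		leaves[addr] = append(storageRoot[:], byte(acc.Balance)) // toy stand-in for the RLP-encoded account
	}
	return hashKV(leaves)
}

func main() {
	accs := map[string]Account{
		"0xaa": {Balance: 1, Storage: map[string][]byte{"slot0": {0x01}}},
	}
	fmt.Printf("state root: %x\n", stateRoot(accs))
}
```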

I suppose this is what @holiman meant to convey when writing:

“Ethereum state is represented in a trie” is not geth’s choice, it’s a protocol-level choice (made years ago).

This is described in a tad more detail in section 4.1 of the Yellow Paper, although there’s not much more information than the bullet points above.

You have to differentiate here between modifying and non-modifying operations. For a modifying operation such as SSTORE you indeed have to have a trie in order to update the Merkle nodes correctly. But read operations like SLOAD and BALANCE do not need to update trie nodes after the operation, hence you can shortcut the trie for reads and just use a hashmap. SSTORE, though – because of the need to recalculate the Merkle tree – will stay more expensive.
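
A sketch of that read/write split (illustrative only; the hard part, as discussed above, is keeping the flat index consistent across reorgs and reverts, which this leaves out):

```go
package main

import "fmt"

// trie stands in for the authoritative Merkle-Patricia trie: slow lookups,
// but the only structure from which state roots can be computed.
type trie struct{ data map[string]string }

func (t *trie) Get(k string) string { return t.data[k] }                 // O(trie depth) in reality
func (t *trie) Put(k, v string)     { t.data[k] = v }                    // updates Merkle nodes in reality
func (t *trie) Root() string        { return fmt.Sprintf("%v", t.data) } // toy "root"

// readIndexedState answers SLOAD/BALANCE from a flat map, while SSTORE still
// goes through the trie (and keeps the map in sync).
type readIndexedState struct {
	trie  *trie
	index map[string]string // flat key -> value, O(1) reads
}

func (s *readIndexedState) Sload(k string) string { return s.index[k] }

func (s *readIndexedState) Sstore(k, v string) {
	s.trie.Put(k, v) // still pays the trie/Merkle cost
	s.index[k] = v   // keep the read index in sync
}

func main() {
	st := &readIndexedState{
		trie:  &trie{data: map[string]string{}},
		index: map[string]string{},
	}
	st.Sstore("0xaa:slot0", "0x01")
	fmt.Println("SLOAD:", st.Sload("0xaa:slot0"))
	fmt.Println("root over:", st.trie.Root())
}
```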

I know that there have been efforts to introduce constant-time storage reads (SLOAD) in geth. I’m curious as to why these efforts failed. There was for example Turbogeth with this interesting quote:

  1. State reads to require only 1 database lookup. With the change (3), and a small addition, where for each short node, we also write out the value node separately, with the full path, we can read values from the state without expanding the trie at all. I have implemented this change in this commit, but it is a bit messy. Subsequent profiling sessions showed that state reading is now negligible and does not even appear in the report.

Long term, constant-time SLOAD and BALANCE calls should be achievable in all implementations. I would just hope for this to arrive quicker than it might actually take. Once it arrives, though, we should be able to drop the EVM gas prices for SLOAD back down again.

Making the testing more repeatable would be good. Also, I’m still confused as to whether the reference for gas price adjustments should be cold-sync times or live block processing; they are different challenges.


I have reviewed this EIP; everything looks great, but I have one note.

EIP-150 is listed as a requirement. But when reading the text I see that EIP-150 affected the analysis of EIP-1884 (DRAFT) but it does not affect the prescription of EIP-1884 (DRAFT). Therefore I think it is appropriate to demote EIP-150 from a prerequisite to a reference.

Quick fix --> https://github.com/ethereum/EIPs/pull/2421

Antoine Le Calvez (@khannib) has been tweeting about the impacts of this change on the network.

It seems like the Gemini deposit contract as well as some Uniswap contracts are affected:

Update: it seems like the Uniswap errors are from an arbitrage bot: https://twitter.com/UniswapExchange/status/1204065068498989056

Is there a requirement that an opcode’s price be fixed, and not dependent on its arguments?

I mean, instead of adding a SELFBALANCE opcode, we could price BALANCE(selfaddr) cheaper than BALANCE(some-other-addr).
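
Roughly what that would look like inside an interpreter’s gas accounting (a hypothetical sketch, not how EIP-1884 was specified – the EIP added a separate SELFBALANCE opcode instead; the 700 and 5 gas values are the EIP-1884 BALANCE and SELFBALANCE costs, reused here for illustration):

```go
package main

import "fmt"

type Address [20]byte

const (
	balanceGasCost     = 700 // EIP-1884 BALANCE cost
	selfBalanceGasCost = 5   // EIP-1884 SELFBALANCE cost, reused here for BALANCE(self)
)

// balanceGas prices BALANCE depending on its argument: querying the currently
// executing contract's own address is cheap, anything else pays full price.
func balanceGas(self, target Address) uint64 {
	if target == self {
		return selfBalanceGasCost
	}
	return balanceGasCost
}

func main() {
	var a, b Address
	b[19] = 0x01
	fmt.Println("BALANCE(self): ", balanceGas(a, a))
	fmt.Println("BALANCE(other):", balanceGas(a, b))
}
```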

@dror this EIP is already live on mainnet, as part of the Istanbul hard fork: https://eips.ethereum.org/EIPS/eip-1679