EIP 1884: Repricing for trie-size-dependent opcodes

KyleJKistner · August 22, 2019, 2:30pm

I have to agree with Ilan. I’m with bZx and our dApp is the third largest on Kyber. We rely heavily on Kyber and perform multiple swaps. This would explosively increase the gas cost of running our contracts and the adverse effects could be severe. Such a rapid increase is extremely alarming.

jochem-brouwer · August 22, 2019, 4:54pm

I have stated this before but I want to note it again.

In general, I do not understand why the EVM design has separate opcodes such as EXTCODESIZE and CODESIZE. I can’t imagine that a simple check in the EXTCODESIZE implementation for whether we are querying the current address adds any significant overhead. Hence I do not see why these opcodes are not merged into a single opcode whose gas cost depends on the address being queried (an external address costs more gas).

I do understand that an extra opcode SELFBALANCE is consistent with the other opcodes CODESIZE and CODECOPY, so adding this seems reasonable. However, for all existing contracts that query their own balance (no doubt a significant number of contracts), a developer would want to replace the BALANCE opcode with the new SELFBALANCE opcode in light of this EIP, but of course this is not possible in most if not all cases.

Why can’t we add an extra clause to the BALANCE opcode: "if BALANCE is called on the current ADDRESS, the gas cost will be the same as the SELFBALANCE gas cost"? I would like to propose the same for EXTCODESIZE and EXTCODECOPY with respect to CODESIZE and CODECOPY respectively, so maybe this would be better in another EIP.
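A minimal Python sketch of the clause proposed above. The constants 700 (repriced BALANCE) and 5 (SELFBALANCE) are assumed from the EIP-1884 draft and are not stated in this thread. The function name and signature are hypothetical and not taken from any client.

```python
# Gas constants assumed from the EIP-1884 draft (not stated in this thread).
BALANCE_GAS = 700      # proposed cost of BALANCE
SELFBALANCE_GAS = 5    # proposed cost of SELFBALANCE


def balance_gas(queried_address: int, current_address: int) -> int:
    """Hypothetical pricing rule: BALANCE on the executing contract's own
    address is charged like SELFBALANCE; any other address pays full price."""
    if queried_address == current_address:
        return SELFBALANCE_GAS
    return BALANCE_GAS
```

Under such a rule, already-deployed contracts that use BALANCE on their own address would keep paying a low price without being redeployed.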

It just does not seem reasonable to me that in the current EIP the gas cost of an opcode is increased (BALANCE), but a new opcode is added (SELFBALANCE) where in some situations the “old” (BALANCE) opcode performs the exact same operation - but with more gas.

What is confusing is that in the test cases it says "Test that balance(this) costs as before, ". This implies that the gas cost of reading the current account's balance stays the same, but it does not state which opcode is used for this.

maurelian · August 23, 2019, 7:02pm

Welcome @ilanDoron.

In my opinion, repricing is a necessary evil in order to reflect the actual cost of particular operations. Although it may not be the exact rationale for the opcodes in the original post, mispricing has led to DoS attacks in the past. Underpaying miners for their computation (thus losing security) is also a risk if pricing is not adjusted to reflect reality.

edit:

To be clear, I do think it’s important that dapps be affordable to users, but I think the way to do that is by reducing the overall gas price, not underpricing ops.

I also think that not breaking functionality which depends on outdated gas pricing assumptions is an important consideration.

fulldecent · August 27, 2019, 2:46am

@axic thank you for the ping

Proposed fix for backwards compatibility:

ilanDoron · August 27, 2019, 11:52am

Thank you for your reply.

I agree opcodes should be correctly priced.
In this case we have two issues:
the price change is very big, and it also breaks one of our contracts.

I suggest the opcode repricing be done in a gradual fashion.

Looking forward, Ethereum, being an infrastructure layer, should have costs that are as stable as possible,
which makes it more reliable to develop dapps on.
This opcode repricing is an example of a change that makes this layer less reliable as development infrastructure.

tkstanczak · August 30, 2019, 4:05pm

jochem-brouwer · September 4, 2019, 10:27am

Hey guys. I just realized something when I was thinking about using contracts in combination with CREATE2 to store large amounts of data (reading them from external contracts gets cheaper if you read a lot of data - see this article (not mine) about someone who researched this exact approach).

The problem lies in the fact that the proposed increase of SLOAD is from 200 -> 800. This is more than EXTCODECOPY (700 gas + 3 gas per word). This means that it is cheaper to use EXTCODECOPY even when reading a single full slot (32 bytes / 8 words). Correct me if I am wrong, but this would cost 700 + 32/4 * 3 = 724 gas to read 32 bytes from an external contract. This does not seem rational to me, and if this happens it might have the unintended side effect of people storing data in other contracts instead of using SSTORE (FYI, the SSTORE equivalent: deploy a contract where the code is the storage; the actual storage of this contract would be empty), especially if the data is going to be read a lot (as opposed to written a lot, which is pretty expensive). This might have unintended effects similar to, for example, GasToken, which “abuses” the gas refund counter. We might hence see people deploying read-only contracts if this EIP gets deployed as-is, because reading from them is cheaper (and it gets much cheaper if you start reading, for example, 2 slots: 748 gas as opposed to 1600 gas!).

Proposed solution would be to either bump EXTCODECOPY or to lower the proposed 800 gas for SLOAD.
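The comparison above can be reproduced with a short Python sketch. It uses the figures quoted in the post (800 for SLOAD, 700 + 3 per word for EXTCODECOPY) and assumes a word is 32 bytes, as in the Yellow Paper fee schedule. Under that assumption a single slot costs 703 gas rather than 724, and two slots cost 706 rather than 748, so the conclusion still holds and the gap is slightly larger. Memory expansion and the cost of the surrounding call are ignored. The function names are hypothetical.

```python
SLOAD_GAS = 800              # proposed SLOAD cost quoted above
EXTCODECOPY_BASE_GAS = 700   # EXTCODECOPY base cost quoted above
COPY_GAS_PER_WORD = 3        # per-word copy cost quoted above
WORD_BYTES = 32              # assumption: one word is 32 bytes


def sload_cost(slots: int) -> int:
    """Gas to read `slots` storage slots, one SLOAD each."""
    return slots * SLOAD_GAS


def extcodecopy_cost(num_bytes: int) -> int:
    """Gas to copy `num_bytes` of external code in a single EXTCODECOPY.
    Memory expansion cost is deliberately left out of this sketch."""
    words = -(-num_bytes // WORD_BYTES)  # ceiling division
    return EXTCODECOPY_BASE_GAS + COPY_GAS_PER_WORD * words


if __name__ == "__main__":
    for slots in (1, 2, 4):
        print(slots, sload_cost(slots), extcodecopy_cost(slots * WORD_BYTES))
```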

tkstanczak · September 4, 2019, 2:27pm

Strange indeed. Maybe not a desired behaviour, but loading contract bytecode is indeed much cheaper because code is not stored in the Patricia tree.

jochem-brouwer · September 4, 2019, 5:47pm

I was not aware that this was in fact cheaper so it is good that this is cleared up (note: not exactly aware about the storage location of the contract code / storage slots and the cost of looking these up - I assumed those were about the same). I do wonder if the EIP proposers are aware of this semi-weird gas pricing though, as this might bring these unintended (?) side effects (a la GasToken) at the Istanbul fork.

tkstanczak · September 4, 2019, 6:07pm

@holiman? I believe this is safe (although a bit awkward if it becomes a common practice and it may lead to the following EIPs having to deal with even stranger legacy contracts).

MatthiasEgli · September 11, 2019, 1:07pm

Is there a fundamental need for geth to lookup data in a SLOAD operation in a patricia merkle tree? It’s obvious that it is necessary to create the tree to calculate the storage tree root, but why can’t there be a constant-time lookup cache layer for reading these values? According to a recent analysis by Péter (here: https://twitter.com/peter_szilagyi/status/1166633058348556288) the raw uncompressed size of storage data is 15.32 GB, potentially allowing a reduction to below 9GB, which should enable a quite efficient cache.

If this is the case, then the increase in SLOAD time is more due to the implementation decisions taken by the client which can be fixed without a network upgrade. Given the current (relatively high due to recent client optimizations) gas cost for SSTORE, adding a slight cost with additional caching here while gaining considerable lookup speed for SLOAD should increase the performance overall significantly.

MatthiasEgli · September 17, 2019, 11:08am

After collecting more feedback on this from many involved like Péter Szilagyi, @holiman and @AlexeyAkhunov (turbo-geth) it is clear that actually a lot of work is currently going into getting constant-time lookup into geth, either as a side-effect of a new sync protocol or in the form of a new database layout.

I know that it is pretty late in the process, but knowing that this will be fixed client-side in the foreseeable future, which might even require making it a lot cheaper again, and given the concerns from major projects and from new projects we are in contact with (which are afraid to choose Ethereum because they feel less able to rely on it still working for them in the future; I know there are a lot of arguments for and against this, but it is a fact that EIP-1884 is being used as an argument against Ethereum), why not focus on the client implementations and drop this quite contested EIP?

holiman · September 17, 2019, 12:15pm

To clarify a bit on that (and sorry I didn’t answer earlier).
So keeping a db on the side is very nice, on paper. In reality, the problem is reorgs. So if you have a flat db, you are incapable of reverting a few blocks.

So what you wind up with is multiple layers of flat databases, where the bottom layer might be a couple of hundred blocks back. That one is on disk, and there are overlays in memory. To actually look up a value, you need to investigate the in-memory layers first, in case the value has been changed in the last N blocks. Then eventually you hit disk and obtain the last stored value.

This is a promising approach, and somewhat of a necessity for the future new sync protocol which @karalabe is working on, but it’s still a work in progress, and will probably not be a magic bullet to solve the lookup problem. It’s basically research at this point.
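The layered lookup described above can be sketched in Python roughly as follows. This is an illustration only, not geth's design: the class and method names are hypothetical, the base dictionary stands in for the on-disk layer, and flushing old diff layers down to disk is left out.

```python
from typing import Dict, List, Optional


class LayeredState:
    """Hypothetical flat store: one base layer (standing in for disk) plus
    per-block in-memory diff layers, oldest first and newest last."""

    def __init__(self, base: Dict[str, int]) -> None:
        self.base = dict(base)
        self.diffs: List[Dict[str, int]] = []

    def apply_block(self, changes: Dict[str, int]) -> None:
        # Each new block adds one in-memory overlay holding its changes.
        self.diffs.append(dict(changes))

    def revert_block(self) -> None:
        # A reorg drops the newest overlay; the base layer is untouched.
        self.diffs.pop()

    def get(self, key: str) -> Optional[int]:
        # Check the in-memory layers first, newest to oldest, in case the
        # value changed in the last N blocks, then fall back to the base.
        for diff in reversed(self.diffs):
            if key in diff:
                return diff[key]
        return self.base.get(key)
```

For example, starting from a base of `{"slot0": 1}`, calling `apply_block({"slot0": 2})` makes `get("slot0")` return the in-memory value, and `revert_block()` exposes the base value again. The lookup cost grows with the number of overlays, which is one reason this is not a free constant-time read.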

jochem-brouwer · September 19, 2019, 6:38am

For consistency, can we rename BALANCE to EXTBALANCE to comply with the other EXT* opcodes and their non-EXT equivalents?

holiman · September 19, 2019, 7:49am

For consistency, can we rename BALANCE to EXTBALANCE to comply with the other EXT* opcodes and their non-EXT equivalents?

I’d love for that to happen… I didn’t want to do that right away in 1884, though, since it might give the impression of being a new opcode, whereas it in fact is just a UI thing, not related to the consensus rules.

axic · September 19, 2019, 12:53pm

EIP-1803: Rename opcodes for clarity tries to do all this renaming, but it doesn’t seem to be tied to hard forks.

epheph · September 19, 2019, 6:57pm

It seems this conversation has stalled, but it still seems extremely important. There is a strange, small incentive to use contracts for storage of even a single word and an enormous benefit for 2 or more words. The gas savings become tremendous; loading four 32-byte words would cost 3,200 gas via SLOAD and 712 gas via EXTCODECOPY, ~80% gas savings.

If this is storage data shared with other contracts, retrieving this data will be even MORE cost-effective, since their CALL and EXTCODECOPY will cost basically the same, but the CALL still has to SLOAD.

If these incentives remain out of alignment, I do expect another GasToken-like project to emerge (as @jochem-brouwer implies). These savings are even greater than GasToken's: they save around 80% (for 4 storage variables, a reasonable number of state variables for many contracts) while GasToken is limited to 50%, they don’t require holding and minting balances for yourself or on behalf of users, and they don’t require you to provide an oversized GasLimit with each transaction.

wjmelements · September 30, 2019, 8:17pm

EIP1884 should allow EXTBALANCE to cost the same as SELFBALANCE in the case the parameter is the call address, but this is not mentioned in the spec.

wjmelements · September 30, 2019, 8:54pm

Hello, I am an engineer who has been exploiting irregularities in the fee structure to save my customers gas. My project is TrueUSD, which has a market cap of $190m. Our tokens have wasted tons of EVM space (>30 MB so far) because of poor design decisions in the past. This looks like another.

I plan to exploit the following issues at scale if 1884 goes through as-is.

As others have pointed out, with this change it is cheaper to read data with EXTCODECOPY than to read it from local state with SLOAD. Reading a word from data costs 800 per word while reading a word from code costs 700, plus 3 per word.

It is not only cheaper to read data from code, but also to write it. After the 32000 fixed cost of CREATE, writing a word costs 6400 in code but 20000 for state.

So, under this scheme, if a contract wants to update a group of fields about a user, and that group is larger than 2, they should use external code. If updating is sufficiently less-common than reading, then all data should be externalized into code.
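The "larger than 2" threshold can be checked with a short Python sketch using only the figures quoted above (32000 for CREATE, 6400 per word of code, 20000 per word of fresh state). It ignores init-code execution, memory, and transaction overhead, so it is a rough estimate rather than an exact gas calculation. The function names are hypothetical.

```python
CREATE_GAS = 32000            # fixed cost of CREATE quoted above
CODE_GAS_PER_WORD = 6400      # cost per 32-byte word written as code, quoted above
SSTORE_GAS_PER_WORD = 20000   # cost per word written to fresh state, quoted above


def code_write_cost(words: int) -> int:
    """Gas to store `words` words as the code of a newly created contract."""
    return CREATE_GAS + CODE_GAS_PER_WORD * words


def state_write_cost(words: int) -> int:
    """Gas to store `words` words in previously empty storage slots."""
    return SSTORE_GAS_PER_WORD * words


def break_even_words() -> int:
    """Smallest group size at which writing to code is strictly cheaper."""
    words = 1
    while code_write_cost(words) >= state_write_cost(words):
        words += 1
    return words
```

With these figures, two words cost 44800 gas as code versus 40000 as state, while three words cost 51200 versus 60000, so the crossover is at three words, consistent with the claim above.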

Unlike the GasToken exploit, there is no incentive to clean up contract code used in this way, and there is no easy way to assess fees to the polluters. Punishing good behavior (SLOAD) could result in an explosion of the state space much sooner than expected, and without enough time to plan intelligent mitigation.

The fix for this while keeping the proposed cost increases is to also increase EXTCODECOPY cost per word to something reasonable like 800.

A better fix would be to find a way to make the SLOAD costs constant and not logarithmic. It is not good that this increase predicts future increases as the network grows. If SLOAD is not scalable then surely Ethereum is not either.

adlerjohn · October 1, 2019, 4:32pm

Could you clarify why this is? Selfdestructing a contract (a la GST2) results in a gas refund, just as clearing a word with SSTORE does.