EIP-1153: Transient storage opcodes

i don’t know about linear storage. i mainly just think that this eip is a good idea, and that many instructions could be made easier or even removed if there were a memory like this. it feels like everything that has to do with transaction-wide data and contract-to-contract messaging is a form of hack. that itself is not a problem i guess since it works fine, but maybe the hacks are becoming too many, and it also seems to be the way forward.

actually my solidity evm has linear storage since i wanted it to reside only in the vm memory. it also does all invocations in the same actual vm (only internal functions) - so running it is one single call to a view function. it would probably be easy to emulate the stuff you’re suggesting in there.

you have to be more specific though, are you talking about a space that is editable by any contract? like one single “scratch space” that is read writable by any contract that wants to?

hmm maybe that’s preferable to the behemoth system i am envisioning…

hmm…

i guess static variables in the way i think about them would not be feasible. at least things like reentrancy locks would be better solved with gas netting than by putting in a tx wide memory that is not reserved for a single account, but i guess library stuff like you suggest would be plenty more efficient since you would only have to write it once.

and i guess the single scratch space could be used for other call and return data too.

I have thought about this some more, and done some more experimentation.

It seems like the security issue with calldata/returndata is tied to whether or not the transient memory is cleared in combination with a call or return. If not, then it is very bug prone. If it is, that makes it useless for what it was originally intended for, which is to store data between calls (all it would become is essentially a merging of the current calldata and returndata memories).

If a map with one memory for every address is implemented, the evm could use address 0 as a combined call/return data storage, used by the various return, calldata, and call instructions, but the other ones would each be writable only by a single contract account, so the efficient data editing by other accounts would not work. Also, regardless of how it’s implemented, it would be an enormous new component. Real static variables would be fully supported though.

If there is only the scratch space then that would be good for the libraries, which could pass data very efficiently, but that’s all it would be good for: no static variables, and not possible to use for regular calls/returns, which means it would require a new memory + new instructions on top of those that already exist.

EDIT:

A more advanced memory model is maybe needed to do the linear memory but it seems it would require either more instructions or modified instructions, and potentially a lot more bookkeeping by the evm. I am not sure how it would work though. Maybe someone already has suggestions out there. I’m gonna look around a bit.

@moodysalem has started to revive this proposal: see ethereum/EIPs Pull Request #4529, "fix: edit eip-1153.md to treat transient storage the same as persistent storage with regards to reverts".

I wonder whether, instead of storage, we should consider an alternative: a second memory area that persists during a transaction frame. Basically that means that instead of a word-addressed store (e.g. TSTORE/TLOAD), there is a freely addressed memory space.

Here is a concrete example of my use case that requires these opcodes (A is a third party developer, B is the protocol):

  • A calls B#openCheckbook()
    • B calls ICheckbookOpener(msg.sender).opened()
      • A calls B#spend() which accumulates a debt to the checkbook (transient storage)
      • A calls B#deposit() which pays off a debt to the checkbook (transient storage)
    • B checks the checkbook is balanced and reverts if not

We want user contract A to be able to call methods in protocol contract B in any order, while interacting with any external contracts in between calls to B, and to check the accounting only when user contract A is done. Giving up control to the caller of B is the easiest way to do this. The minimum number of transient storage slots used by this pattern for accounting in our use case is 6, but transactions can easily use more than 6 slots for accounting. The cost of these transient SSTOREs makes up more than half of the gas cost.
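
A minimal Python sketch of the checkbook flow above, as a model rather than contract code. The names `Protocol`, `HonestOpener`, `GreedyOpener` and the `"DAI"` token label are hypothetical, and the `debt` dict only stands in for the transient slots that B would use.

```python
class Reverted(Exception):
    """Stands in for an EVM revert."""


class Protocol:
    """Hypothetical model of contract B; `debt` plays the role of transient storage."""

    def __init__(self):
        self.debt = {}  # token -> outstanding debt, empty at the start of each transaction

    def open_checkbook(self, opener):
        self.debt = {}            # transient storage starts out zeroed
        opener.opened(self)       # give up control to the caller (contract A)
        unbalanced = {t: d for t, d in self.debt.items() if d != 0}
        self.debt = {}            # transient data is discarded after the transaction
        if unbalanced:
            raise Reverted(f"checkbook not balanced: {unbalanced}")

    def spend(self, token, amount):
        # Accumulate a debt to the checkbook.
        self.debt[token] = self.debt.get(token, 0) + amount

    def deposit(self, token, amount):
        # Pay off a debt to the checkbook.
        self.debt[token] = self.debt.get(token, 0) - amount


class HonestOpener:
    """Hypothetical contract A that repays what it spends."""

    def opened(self, protocol):
        protocol.spend("DAI", 100)
        protocol.deposit("DAI", 100)


class GreedyOpener:
    """Hypothetical contract A that never repays."""

    def opened(self, protocol):
        protocol.spend("DAI", 100)
```

Calling `Protocol().open_checkbook(HonestOpener())` completes, while the same call with `GreedyOpener()` raises `Reverted`, because the balance check only runs once A has handed control back.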

As of EIP-3529, the refund is limited to 20% of the gas used, so the transaction must spend at least 600k gas to get a full refund for 6 transient storage slots. In addition, the SLOADs for 6 slots that are guaranteed to be 0 make up another ~15k gas. This model isn’t really feasible with the high cost of transient storage.
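
The 600k figure can be reproduced with a short calculation. This is a sketch that assumes a refund of 19,900 gas for each slot that is set from zero and restored to zero in the same transaction (20,000 for the set minus a 100 gas warm read), together with the EIP-3529 refund cap of one fifth of the gas used.

```python
# Assumed constants, see the note above.
RESET_TO_ZERO_REFUND = 19_900  # refund per slot restored to its original zero value
MAX_REFUND_QUOTIENT = 5        # EIP-3529: refund is capped at gas_used / 5


def min_gas_for_full_refund(slots):
    """Gas a transaction must use before the cap stops cutting into the refund."""
    return slots * RESET_TO_ZERO_REFUND * MAX_REFUND_QUOTIENT


print(min_gas_for_full_refund(6))  # 597000, i.e. roughly 600k gas
```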

The only alternative model available in the EVM today is creating a separate ‘VM’ (a la weiroll VM) that is executed by contract B. However, to do so without introducing security issues in contract B is very difficult, and also the design is hard to use (difficult to construct the calldata onchain).

Interesting EIP. This would make it possible to create maps and dynamic arrays in contracts/solidity without paying excessive gas costs. In case of reverts it should roll back all changes to the transient storage. If someone wants to propagate error messages, they can use and accumulate returndata. The gas cost should be on par with accounting for reverts and the lookup time of this “dynamic” memory.

Note: maps/dynamic arrays would also be possible if we do @axic 's proposal. This would mean that it is only available in the current call frame, but I think in practical cases this would be okay. It would also be easier to implement since the revert logic is easy: just discard all memory and transient memory.

This EIP would also be useful for Arbitrum.

The Outbox contract is responsible for handling execution of L2 to L1 transactions. It uses a pattern similar to a reentrancy lock in order to surface context information of the current L2 to L1 transaction. This allows the outbox to surface the context information without putting any assumptions on a necessary interface when consuming these transactions.

I’m interested in seeing this progress, and am willing to contribute some motivation text to the proposal and implementation elbow grease.

First some text questions:
I do have some questions about exactly what is meant by “a write within a frame”. If we have contract A and the call stack A->B->A, and the outermost A reverts, I would expect the transient writes in the innermost call to A to also revert.

I think the example implementation proposal works semantically but isn’t a model for implementors. It makes reverts cheap, but at the expense of extra complexity in reads, since they must look through the stack of maps to see if a previous write exists. It also adds a cost that is linear in the size of the dirty set to ordinary returns. I’d prefer to spend the cost on reverts, while making reading and writing cheap.
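
To make the trade-off concrete, here is a minimal Python sketch of one reading of the stack-of-maps approach described above. The class name `LayeredTransientStorage` and its methods are hypothetical and are not taken from the EIP text.

```python
class LayeredTransientStorage:
    """One map per call frame; a revert just drops the top map."""

    def __init__(self):
        self.frames = [{}]  # bottom map belongs to the outermost frame

    def enter_call(self):
        self.frames.append({})

    def load(self, account, key):
        # A read walks down the stack, so its cost grows with call depth.
        for frame in reversed(self.frames):
            if (account, key) in frame:
                return frame[(account, key)]
        return 0

    def store(self, account, key, value):
        self.frames[-1][(account, key)] = value

    def exit_call(self, reverted=False):
        top = self.frames.pop()
        if not reverted:
            # A successful return merges the writes into the parent frame,
            # which is linear in the number of dirty slots.
            self.frames[-1].update(top)
```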

One way to achieve this is to have a single map per transaction that maps (account, addressing_word)->word and a list of writes and calls. A write appends the old value and the new value of the address to the list, and a call adds a marker to the list. On revert the list can be used to undo all writes that happened after the most recent call.
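
A minimal Python sketch of this journal idea, with two simplifications that are assumptions of the sketch: only the old value is journaled (that is all an undo needs), and call markers are kept as a separate stack of journal lengths rather than inline in the list, so a successful return is a single pop. The names `TransientStorage`, `enter_call` and `exit_call` are hypothetical.

```python
class TransientStorage:
    """Single map per transaction plus an undo journal."""

    def __init__(self):
        self.values = {}       # (account, key) -> word
        self.journal = []      # (slot, old_value) for every write, in order
        self.checkpoints = []  # journal length recorded at each call entry

    def load(self, account, key):
        # Reads are a single map lookup regardless of call depth.
        return self.values.get((account, key), 0)

    def store(self, account, key, value):
        slot = (account, key)
        self.journal.append((slot, self.values.get(slot, 0)))
        self.values[slot] = value

    def enter_call(self):
        self.checkpoints.append(len(self.journal))

    def exit_call(self, reverted=False):
        checkpoint = self.checkpoints.pop()
        if reverted:
            # Undo, newest first, every write made since this call began,
            # including writes made by nested calls that returned successfully.
            while len(self.journal) > checkpoint:
                slot, old_value = self.journal.pop()
                self.values[slot] = old_value
```

With this layout the A->B->A case behaves as expected: a write made by the innermost A stays in the journal after it returns, so a later revert of the outermost A still undoes it.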

As for axic’s proposal I’m less enthused. I really want this to be across call frames, akin to static data in C. Otherwise I think it is akin to the existing memory, and doesn’t enable the interesting uses.

Another pattern that would benefit from this is reading constructor arguments from the sender contract, as in UniswapV3Factory.

This is done to prevent the constructor arguments from being part of the CREATE2 address, allowing CREATE2 addresses to be cheaply computed onchain because the init code hash does not need to be computed. With smaller proxies, the cost of writing and reading from storage may be less than 20% of the gas to deploy the proxy.

Yes, it should behave just like regular storage. If the outermost call to A reverts after exiting the A->B->A call stack, any writes in the innermost A should also revert. I would approve any clarifications to the text in the EIP.

Great feedback, thanks! Would you be able to send a PR to edit the EIP?

Throwing another use case in here: better transient Solidity arrays. Memory-based arrays in Solidity suck, as you have to define their lengths beforehand and they lack push semantics. Transient storage would make it trivial to implement arrays like every other language does without massive headaches. Would be a major win for devUX.

Reentrancy protection is a sufficient reason to support these opcodes.
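
As an illustration only, a reentrancy lock kept in transient storage can be modeled in a few lines of Python. `Vault`, `withdraw` and the `LOCK` slot are hypothetical names, and the dict stands in for the contract's transient slots.

```python
class Reverted(Exception):
    """Stands in for an EVM revert."""


class Vault:
    """Hypothetical contract guarding `withdraw` with a transient lock."""

    LOCK = 0  # slot index chosen for the lock

    def __init__(self):
        self.transient = {}  # models this contract's transient storage

    def withdraw(self, callback):
        if self.transient.get(self.LOCK, 0):
            raise Reverted("reentrant call")
        self.transient[self.LOCK] = 1      # would be a TSTORE
        try:
            callback(self)                 # external call that may try to re-enter
        finally:
            self.transient[self.LOCK] = 0  # release the lock before returning
```

A callback that calls `withdraw` again hits the lock and raises `Reverted`, while a callback that does nothing completes normally.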

A couple of other missing features that would become possible with EIP 1153:

  • Transient mappings. Currently there is no easy way to have a transient map or an array of variable length objects in memory.
  • Maintaining an opaque context across calls in the same transaction.

We have a concrete use for both in the EntryPoint contract of ERC 4337, where the batching function needs to iterate twice on structs that have a bytes[] component. The contract currently resorts to an asm implementation of memory pointers. The proposed opcodes would make it cleaner and less dependent on current solidity memory management.

Mappings could be implemented using just memory, but the fact that TSTORE/TLOAD work across frames would enable the ERC 4337 paymaster to maintain its context without returning it to EntryPoint in the first call and receiving it as calldata in the second call. EntryPoint only conveys the opaque context between multiple calls to the paymaster, and never needs to see it. With TSTORE/TLOAD the context would be maintained locally in the paymaster.

I’ve opened https://github.com/ethereum/EIPs/pull/4791.

I’m still unsure of the process to get this discussed by the relevant people. I’ll go start work on an implementing PR if people haven’t already.

Another perspective for why transient storage makes sense is that the low refund cap incentivizes state bloat, because it ends up being cheaper to keep a storage slot dirty if it’s used in a transient way. Granted, this is probably not a significant component of the state growth problem, but I think it clearly shows that storage and transient runtime data are inherently different and have conflicting implementation concerns, so they should be implemented as separate concepts in the EVM.

I believe efficient transient storage will enable a whole family of smart contract programming patterns to be developed, because it is a key primitive for contract composability where multiple call frames are involved.

I’ve opened a draft implementing PR https://github.com/ethereum/go-ethereum/pull/24463. There are a number of points keeping me from making it a real PR, starting with a lack of tests, all discussed there. Comments welcome.

Also we should discuss how the semantics work during contract creation. I’m not entirely sure my PR will be right depending on what we decide.

At ETHDenver, the Optimism team built a proof of concept implementation of EIP-1153, including the necessary changes to go-ethereum and solc to allow developers to test it out. This EIP introduces two new opcodes, TLOAD and TSTORE. Both mirror SLOAD and SSTORE but use memory instead of reading/writing to disk. These opcodes enable an in-memory key/value store that is namespaced by the current address(this). Some use cases include reentrancy locks, in-memory mappings, and better in-memory arrays.

The transient storage persists throughout execution of the transaction, but is not globally accessible. This means that if contract A writes something into transient storage, then contract B will not be able to read that value. It seems like there is some desire to have a global transient storage that can be accessed between accounts. This adds complexity as a system will need to be in place for preventing contracts from overwriting keys that were written by other contracts and contracts will need to use a dynamic keying system to prevent storage slot collision. Perhaps another opcode EXTSLOAD can be considered in the future for this functionality. h/t @moodysalem
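
The namespacing described here can be modeled as a map keyed by (address, key). The following is a minimal Python sketch, not the geth implementation; the class `Evm` and its method names are hypothetical.

```python
class Evm:
    """Toy model: transient storage namespaced by the executing address."""

    def __init__(self):
        self.transient = {}  # (address, key) -> value

    def tstore(self, current_address, key, value):
        # A contract can only write under its own address.
        self.transient[(current_address, key)] = value

    def tload(self, current_address, key):
        # Unset slots read as zero, like SLOAD on an empty slot.
        return self.transient.get((current_address, key), 0)

    def end_transaction(self):
        # Nothing survives past the end of the transaction.
        self.transient.clear()
```

If contract A stores a value under key 1, a `tload` executed by contract B for key 1 still returns 0, and after `end_transaction()` A reads 0 as well.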

The solc fork supports yul functions for TLOAD and TSTORE as well as an experimental transient keyword in solidity. For a quickstart guide on compiling contracts locally to use these opcodes, see https://github.com/tynes/eip1153

A hosted node running the code can be found at https://eip1153.optimism.io. If you are interested in receiving some funds to test it out, dm @tyneslol on Twitter.

Shoutout to Matthew Slipper and Conner Fromknecht for their help in implementing it, Ben Wilson in deploying it, as well as various conversations with protolambda, Kelvin Fichter, Karl Floersch, Ben Jones and Ethereum researchers around its usage and implementation.

Note: the solc fork is highly experimental and should not be used in production.

See the code here:

I have concerns about the current pricing model. I can set up imapp-pl/gas-cost-estimator (on GitHub) for geth, REVM, and Sputnik, use the above implementation for geth, and implement the opcodes in REVM and Sputnik (common Rust EVMs) to measure performance.

My guess is that 100 is much too high, even with the revert requirements. If I were to estimate before measuring, I would guess something around 30 gas for TSTORE would be better. Reasoning, in reference to comparable opcodes that exist today:

MSTORE + 5 // store in map, add 5 for calculating placement in mapping
MSTORE + 5 // store changelog info price similarly to standard word memory storage, add 5 for calculating placement in mapping
// IMPLICIT REVERT
MLOAD + 5  // load changelog info, add 5 for calculating location in mapping to load
MSTORE + 5 // undo storage change, add 5 for calculating update location in mapping

This naive guess would put it at 32 gas for a TSTORE. TLOAD should never be more than 10 probably. Symmetric pricing doesn’t make sense given TLOAD has no revert considerations to my knowledge.
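
The sum behind the 32 gas figure, written out as a small Python sketch. The 3 gas cost of MSTORE and MLOAD is their current base price; the extra 5 per step is the guess from the breakdown above, not a measured value.

```python
MSTORE_GAS = 3    # base cost of MSTORE today
MLOAD_GAS = 3     # base cost of MLOAD today
MAP_OVERHEAD = 5  # guessed cost of locating the entry in the mapping

steps = [
    MSTORE_GAS,  # store the value in the map
    MSTORE_GAS,  # store the changelog entry
    MLOAD_GAS,   # on revert: load the changelog entry
    MSTORE_GAS,  # on revert: undo the storage change
]
tstore_estimate = sum(step + MAP_OVERHEAD for step in steps)
print(tstore_estimate)  # 32
```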

edit: the changelog is a mapping of index to “original values” built up when you do a TSTORE. You basically build up a mapping or array of changed indexes, then roll them back when there is a revert by loading the value by index in the mapping (thus the extra “MSTORE” + “MLOAD”: not literal MSTORE/MLOADs, but conceptually similar and with similar performance).

One thing to take into consideration is the total amount of memory that can be allocated if a block has a single tx that uses all gas to store as many key/value pairs as possible. If this is enough memory to OOM the node on consumer hardware, then the gas price is too low.

I definitely prefer to lower the gas cost as much as possible while being safe and think it would be good to have some benchmarks/formulas on how much memory would be allocated in 15 million gas in this scenario.

15,000,000 total gas / 32 gas = 468,750 TSTOREs
max memory usage: 468,750 TSTOREs * 32 bytes per word = 15,000,000 bytes == 15 MB memory allocation that is dropped at the end of the transaction

edit: you also have to do 2 PUSH1(x) per TSTORE (3 gas each), so each TSTORE effectively costs 38 gas and the actual max is 394,736 TSTOREs, or about 12.6 MB
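
The same bound as a small Python sketch, so other prices can be plugged in. It assumes a 15,000,000 gas block, counts only the 32-byte value of each entry, and ignores the transaction's intrinsic gas; if an implementation also keeps the 32-byte key per entry, the memory figure roughly doubles.

```python
BLOCK_GAS = 15_000_000
WORD_BYTES = 32
PUSH1_GAS = 3


def max_tstores(tstore_gas, pushes_per_store=2):
    """Upper bound on TSTOREs in one block, counting the PUSH1s that feed each one."""
    return BLOCK_GAS // (tstore_gas + pushes_per_store * PUSH1_GAS)


def max_bytes(tstore_gas, pushes_per_store=2):
    """Bytes of values held in transient storage at that upper bound."""
    return max_tstores(tstore_gas, pushes_per_store) * WORD_BYTES


print(max_tstores(32, pushes_per_store=0))  # 468750
print(max_tstores(32))                      # 394736
print(max_bytes(32))                        # 12631552, about 12.6 MB
```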

I implemented a PoC of this for vyper as well, but I think the codegen would be better if TLOAD and TSTORE were byte addressed instead of word addressed like SLOAD and SSTORE. I discussed with @tynes and it sounds like this is doable in geth, just slightly more complicated as unaligned loads/stores need to issue two reads or writes. Also it would be nice if there were batch copy opcodes, analogous to CALLDATACOPY and CODECOPY.
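
A minimal Python sketch of why an unaligned access needs two reads when the backing store is word addressed. The function `load_unaligned` and the dict-of-words layout are assumptions of this sketch, not how geth or vyper would necessarily do it.

```python
WORD = 32  # bytes per storage word


def load_unaligned(words, offset):
    """Read 32 bytes starting at byte `offset` from a word-addressed store.

    `words` maps a word index to a 32-byte value; missing words read as zero.
    """
    index, shift = divmod(offset, WORD)
    zero = bytes(WORD)
    low = words.get(index, zero)
    if shift == 0:
        return low                         # aligned: a single read is enough
    high = words.get(index + 1, zero)      # unaligned: second read from the next word
    return low[shift:] + high[:shift]


words = {0: bytes(range(32)), 1: bytes(range(32, 64))}
print(load_unaligned(words, 4) == bytes(range(4, 36)))  # True
```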

Maybe worth it to price in a transient storage expansion cost, to model cost going up as you approach OOM.
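
One way to picture such an expansion cost is to borrow the shape of the existing EVM memory expansion formula, which charges 3 gas per word plus words squared divided by 512. Applying that formula to transient storage is purely an assumption of this Python sketch; nothing in the proposal specifies it.

```python
def expansion_cost(words):
    """Total cost of having `words` slots allocated, using the memory formula's shape."""
    return 3 * words + words * words // 512


def marginal_cost(words_before):
    """Extra gas charged for allocating one more slot."""
    return expansion_cost(words_before + 1) - expansion_cost(words_before)


print(marginal_cost(0))        # 3
print(marginal_cost(400_000))  # far higher than for the first slot
```

With a curve like this the first slots stay cheap, while a transaction that tries to fill memory pays progressively more per slot.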