EIP-1153: Transient storage opcodes

BTW, I recognize here another dropped proposal I made in ethereum/solidity
to support trailing data in a message call. It may be worth reviewing once more.

You will need to point me to the existing implementation of ReentranceLock, I could not find it, sorry.

The reentrancy lock has to be unlocked after the call regardless of whether one uses storage or transient storage. Because after the call is complete, the lock is still locked. So no change in usage here, apart from the gas cost.
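To make the point about unlocking concrete, here is a minimal Python sketch of a noReentrancy-style guard whose flag lives in a per-transaction dictionary. It is only a model: `Vault`, `LOCK_SLOT`, and `run_transaction` are hypothetical names, and the dictionary stands in for transient storage that the EVM would discard at the end of the transaction.

```python
class Reentered(Exception):
    """Raised when a guarded function is entered while the lock is held."""


class Vault:
    """Toy contract with a noReentrancy-style guard (hypothetical example)."""

    LOCK_SLOT = 0  # hypothetical slot holding the lock flag

    def __init__(self):
        # Stand-in for transient storage; cleared after each transaction.
        self.transient = {}

    def withdraw(self, callback):
        if self.transient.get(self.LOCK_SLOT, 0):
            raise Reentered("withdraw re-entered")
        self.transient[self.LOCK_SLOT] = 1      # lock
        try:
            callback(self)                      # models the external message call
        finally:
            self.transient[self.LOCK_SLOT] = 0  # explicit unlock is still required


def run_transaction(vault, body):
    """Run one transaction, then discard transient storage."""
    try:
        return body(vault)
    finally:
        vault.transient.clear()
```

Without the explicit unlock, a second `withdraw` later in the same transaction would fail, which is why usage does not change compared to a storage-based lock; only the cost does.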

Transient Storage cannot be used directly to implement signalling between frames of different contracts. All interactions between distinct contracts can only happen via CALL and STATICCALL.

You will need to point me to the existing implementation of ReentranceLock, I could not find it, sorry.

Oh… I mean nothing special. Just a standard implementation with some further message call in the locked scope.
See Contract Mutex, modifier noReentrancy

(Even I am not sure it is good)
why not?

Because of this (from EIP)

You surely mean: “Only owning contract frames may write-access their transient storage”.
What is bad with read access?

I see what you mean. To read from another contract’s storage, one would need to modify the TLOAD opcode to take 2 arguments: one for the address of the account you are reading, and the other for the address of the “cell” you are reading. It might not be a bad idea, actually.

One use case for contracts reading other contracts’ transient storage could be calling libraries via CALL (STATICCALL) instead of DELEGATECALL or CALLCODE, and passing structures (like trees and linked lists) without having to serialise them into input data. Calling via CALL and STATICCALL is arguably safer, because you don’t give the callee access to your storage.

I thought about usual public accessors, not about extending TLOAD opcode.

Replacing accessor functions with “native” read access per opcode deserves a separate EIP and cautious evaluation. The idea of “native” per-reference read-only access to data structures without serialization looks to me like a major change targeting EVM-2.0.
I need to compare it to library pattern.

How much gas should the opcode execution cost?
I would try to compare it with gas costs for usual message call.

I suggested 8 gas per TLOAD operation. It does not cost much more to read from another contract’s transient storage. An implementation would most probably already have some kind of nested hash table (contract_address => (storage address => value)) which can be accessed by a 2-argument TLOAD.
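As a rough illustration of that nested hash table, here is a Python sketch. The class name `TransientStore` and its method signatures are hypothetical; the 8 gas figure is the one suggested above, not a settled cost.

```python
TLOAD_GAS = 8  # figure suggested in this thread; an assumption, not a final cost


class TransientStore:
    """Nested table: contract_address => (storage address => value)."""

    def __init__(self):
        self._cells = {}
        self.gas_used = 0

    def tstore(self, current_address, cell, value):
        """Writes always go to the table of the currently executing account."""
        self._cells.setdefault(current_address, {})[cell] = value

    def tload(self, account, cell):
        """Two-argument read: any account's table; unset cells read as 0."""
        self.gas_used += TLOAD_GAS
        return self._cells.get(account, {}).get(cell, 0)
```

Reading a foreign account’s cell is just one extra dictionary lookup compared to reading your own, which is the intuition behind charging the same price for both.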

Sorry for the long delay, @AlexeyAkhunov.

All in all I see two proposals here:

  1. introduce transient storage and TSTORE/TLOAD opcodes.
  2. allow storage-reading opcodes (TLOAD, SLOAD) to read other contracts’ storage.

I think (2) is interesting, because it is much less expensive than the usual contract reading calls: we don’t need serialization/deserialization any more. It is a kind of reversal of the delegate call idea: we let our contract code read external storage.
If we think of Ethereum as a storage for public data like identities or Merkle trees, offered on-chain for other contracts, this proposal will help a lot. It allows powerful searches even if they are implemented later in the user’s contract instead of in the public storage initially.
Nevertheless, (2) is possible without Transient Storage at all.
In the case of permanent storage I am also not quite sure about the 8 gas cost. Possibly we should limit the number of other contracts’ storages accessible by a user’s contract, in order to ensure all calls hit the cache and not the disk. Then we can keep it low.

For (1) I am still unclear about a meaningful, distinct use case.
Technically it means that some contract should create (not load from disk!) a reasonable amount of data (more than a few flags for signalling) and offer it for access to other contracts. What use case would that be? I don’t know.

RETURNDATA could have leveraged this generic memory had it existed before, instead of having to put everything in a special memory that is only used for one single thing (and requires 2 instructions). Though most of us are extremely happy that they added it, it can also be argued that return data is “wrong” compared to adding a generic transaction-spanning memory, readable and writable through VM instructions, in a manner similar to what is proposed here.

Languages could easily work around control flow and all other types of issues. Just because they can use it doesn’t mean they have to. But you know all that. Solidity & EVM could just reserve the first two words for return data position and length, for example. The point is there would still only be 2 instructions, TLOAD & TSTORE, just like RETURNDATA & RETURNDATASIZE, but they could be used for a lot more. The return system could be deprecated. If the memory is different arrays mapped to addresses then maybe other transaction-spanning data could be mapped to it as well; for example, maybe CALLDATA could just be assigned to transMem[0x000…000]. It would still be read-only there.

Maybe taking it too far, but there seem to be many potential use cases.

Bear in mind that this kind of global transient memory would make static analysis difficult or even impossible in many cases.

This is interesting.

What do you think about updating the EIP to make it a map of arrays? I read some of your posts and the discussion here and in the gas netting thread and I saw that suggestion, but I will formulate it again. Basically, the key would be an Ethereum address and the value a regular byte array. Informally:

Map<Address, byte[]> tStorage;

Perhaps it would require a temp version too, so nothing is added until it is certain there are no reverts.

Instructions:

TSTORE (sAddr, val) - pop 'sAddr' and 'val' and set tStorage[_ADDRESS_][sAddr] = val
TLOAD (accAddr, sAddr) - pop 'accAddr' and 'sAddr' and push tStorage[accAddr][sAddr]

_ADDRESS_ would be the address of the current account. Additionally there could be TCOPY or something similar, which could work like CALLDATACOPY and CODECOPY work now, i.e. copying data from transient storage to memory. The EVM could use _ADDRESS_ = 0 for immutable data, since no account would be able to write there, and it could decide for itself what addresses to use so it doesn’t overwrite anything itself.
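To make the map-of-arrays semantics a bit more tangible, here is a small Python sketch. Several things are my own assumptions rather than part of the proposal: values are 32-byte big-endian words, arrays grow with zero padding on write, reads past the end return zeros, and the class and method names are hypothetical.

```python
WORD = 32  # assumed word size in bytes


class ByteArrayTransientStorage:
    """Map<Address, byte[]> with word-sized TSTORE/TLOAD and a TCOPY helper."""

    def __init__(self):
        self._arrays = {}  # address -> bytearray

    def tstore(self, current, s_addr, val):
        """Write one word into the current account's array, growing it if needed."""
        arr = self._arrays.setdefault(current, bytearray())
        if len(arr) < s_addr + WORD:
            arr.extend(bytes(s_addr + WORD - len(arr)))
        arr[s_addr:s_addr + WORD] = val.to_bytes(WORD, "big")

    def tcopy(self, acc_addr, s_addr, length):
        """Copy `length` bytes from any account's array, zero-padded past the end."""
        arr = self._arrays.get(acc_addr, b"")
        return bytes(arr[s_addr:s_addr + length]).ljust(length, b"\x00")

    def tload(self, acc_addr, s_addr):
        """Read one word from any account's array."""
        return int.from_bytes(self.tcopy(acc_addr, s_addr, WORD), "big")
```

Note that `tstore` takes the current account explicitly only because this model has no execution context; in the proposal that address is implicit, which is what keeps writes restricted to the owning contract.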

The good thing about this is that much of the EVM functionality that now uses separate instructions and storage locations could maybe be harmonized. For example, it could be used for all the things you mention. It could be used for return data, and tx data, and maybe even contract-to-contract calldata.

There would be difficulties, and some of this is probably not practical, but it is at least an interesting discussion. There are plenty of instructions involved in passing data around (all the CALLs, and everything CALLDATA-related, and RETURN-related), and then there are reentrancy locks and such, so maybe it can all be part of one single EIP.

The thing that makes this even worth talking about imo is that there is no good and uniform way to store data over an entire transaction execution, so everything related has to be done using things that feel like “hacks”, like using permanent storage for non permanent data, special call, calldata and return logic, and other things.

Yes, I suspect that when the EVM was designed, one of the perceived goals was maximum isolation for the sake of security. In hindsight, that turned out to be quite restrictive, even for the implementation of security-related primitives (like reentrancy locks).
I think a good way to go about it is to first redesign the EVM completely given what we know now about our needs, and then make a reduction back to the EVM, to see what the best modifications are. It would be quite an intense process though :)

Seems it’s pretty complicated to do it the byte array way. At least in a programming language. “static initialization” is no problem and could just be a flag in the transient storage array (a reserved address) that is set the first time the contract code is run during a transaction, and it would be possible to use that to prepare user defined variables, but the contract storage map type seems like the only good way to actually do that efficiently. But using a map instead of an array will of course make it terrible to use for arrays like return and call-data.

If no dynamic arrays or mappings are allowed:

{ // start of body code

    transient uint x = 5;
    transient bytes bts;
    transient bool b;

    // tStorage layout
    // 0x00: static initialization flag
    // 0x20: free tstore pointer
    // 0x40: value of x
    // 0x60: reference to 'bts'
    // 0x80: value of b

    // assembly version of what would always run when the contract is called, before any functions.
    if (iszero(tload(address, 0x00))) { // is tStorage of this contract 0 at address 0x00?
        tstore(0x00, 0x01)  // set init flag
        tstore(0x40, 0x05)  // init x
        tstore(0x20, 0x100) // update free tstore pointer
    }

}

Yes, compilers would need to hide this problem away, by inserting the initialisation logic you showed later. Also, they would need to calculate the size of storage required, and perhaps issue an opcode to explicitly resize the storage. Although it makes life harder for compilers, it will eventually make life easier for developers. They will be able to use libraries (finally), closely integrate with eWASM, and stop using a cryptographic hash function as a cheat data structure :)

I’ve implemented TLOAD/TSTORE as I suggested in a previous post (memory type array mapped to addresses).

I modified the LLL compiler and added the instructions at 0x5C (TLOAD) and 0x5D (TSTORE), just after JUMPDEST. This is a contract doing a “static init” type routine before running:

{
	
	(def "T_INIT_ADDR" 0x00)
	(def "T_COUNTER_ADDR" 0x20)

	(def "tInit" (TLOAD (ADDRESS) T_INIT_ADDR))

	(def "tInitW" (val) (TSTORE T_INIT_ADDR val))

	(def "StaticInit" 
	    (unless tInit {
		(tInitW 1)
	    })
	)
	
	StaticInit
	
	(return tInit)
}

It compiles down to this:

6000305c600c57600160005d5b6000305c60005260206000f300

(the STOP at the end is auto injected by LLLC)
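For anyone who wants to read the bytecode, here is a small Python disassembler sketch. The opcode table only covers what this particular program uses, and 0x5C/0x5D are the experimental TLOAD/TSTORE assignments from my modified compiler, not anything standardized.

```python
# Only the opcodes that appear in the program above; 0x5C and 0x5D are the
# experimental assignments described in this post.
OPCODES = {
    0x00: "STOP", 0x30: "ADDRESS", 0x52: "MSTORE", 0x57: "JUMPI",
    0x5B: "JUMPDEST", 0x5C: "TLOAD", 0x5D: "TSTORE", 0xF3: "RETURN",
}


def disassemble(hex_code):
    """Return a list of (pc, mnemonic) pairs for the given hex bytecode."""
    code = bytes.fromhex(hex_code)
    listing, pc = [], 0
    while pc < len(code):
        op = code[pc]
        if 0x60 <= op <= 0x7F:  # PUSH1..PUSH32 carry 1..32 immediate bytes
            n = op - 0x5F
            listing.append((pc, f"PUSH{n} 0x{code[pc + 1:pc + 1 + n].hex()}"))
            pc += 1 + n
        else:
            listing.append((pc, OPCODES.get(op, f"UNKNOWN_0x{op:02x}")))
            pc += 1
    return listing


if __name__ == "__main__":
    for pc, text in disassemble("6000305c600c57600160005d5b6000305c60005260206000f300"):
        print(f"{pc:02d}  {text}")
```

The listing shows the JUMPI target 0x0c landing on the JUMPDEST that follows the TSTORE, i.e. the init write is skipped once the flag is set.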

I also modified my own evm implementation to include the new opcodes. This is the output after running that code:

{
	"errno": 0,
	"errpc": 24,
	"returnData": "0000000000000000000000000000000000000000000000000000000000000001",
	"mem": "0000000000000000000000000000000000000000000000000000000000000001",
	"stack": [],
	"accounts": [
		{
			"address": "cd1722f2947def4cf144679da39c4c32bdc35681",
			"balance": "0",
			"nonce": 1,
			"code": "",
			"storage": [],
			"destroyed": false
		},
		{
			"address": "0f572e5295c57f15886f9b263e2f6d2d6c7b5ec6",
			"balance": "0",
			"nonce": 0,
			"code": "6000305c600c57600160005d5b6000305c60005260206000f300",
			"storage": [],
			"destroyed": false
		}
	],
	"logs": []
}
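The run can be approximated with a tiny Python interpreter that handles only the opcodes in this program. This is a sketch and not my actual EVM: transient storage is simplified to a dictionary keyed by (account, slot) instead of byte arrays, jump destinations are not validated, and there is no gas accounting.

```python
def run(hex_code, address, tstorage=None):
    """Execute the bytecode and return the RETURN data (b"" on STOP)."""
    code = bytes.fromhex(hex_code)
    tstorage = {} if tstorage is None else tstorage  # (account, slot) -> value
    stack, mem, pc = [], bytearray(), 0
    while pc < len(code):
        op = code[pc]
        if op == 0x00:                       # STOP
            return b""
        elif op == 0x60:                     # PUSH1
            stack.append(code[pc + 1])
            pc += 1
        elif op == 0x30:                     # ADDRESS
            stack.append(address)
        elif op == 0x5C:                     # TLOAD (accAddr, sAddr)
            acc, slot = stack.pop(), stack.pop()
            stack.append(tstorage.get((acc, slot), 0))
        elif op == 0x5D:                     # TSTORE (sAddr, val), own account only
            slot, val = stack.pop(), stack.pop()
            tstorage[(address, slot)] = val
        elif op == 0x57:                     # JUMPI (dest, cond)
            dest, cond = stack.pop(), stack.pop()
            if cond:
                pc = dest
                continue
        elif op == 0x5B:                     # JUMPDEST
            pass
        elif op == 0x52:                     # MSTORE (offset, val)
            off, val = stack.pop(), stack.pop()
            if len(mem) < off + 32:
                mem.extend(bytes(off + 32 - len(mem)))
            mem[off:off + 32] = val.to_bytes(32, "big")
        elif op == 0xF3:                     # RETURN (offset, size)
            off, size = stack.pop(), stack.pop()
            return bytes(mem[off:off + size])
        else:
            raise ValueError(f"unsupported opcode 0x{op:02x} at pc {pc}")
        pc += 1
    return b""
```

Passing the same `tstorage` dictionary into a second `run` models a second call within the same transaction: the init flag is already set, so the TSTORE is skipped and the same value is returned.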

The EVM implementation was actually not hard, since I just created TStorage by modifying a copy of the data structure I use for normal storage, using memory structs as values (instead of just 32 byte ints). This of course would not be as simple to do in an actual fully featured EVM like the one in geth or parity…

Either way, I will continue to experiment a bit.