EIP-2718: Typed Transaction Envelope

I’m a little hesitant to make any assertions about what MUST be included in a transaction’s signature, mainly because I am hesitant to make assertions in this document about what it even means to “sign” something. I would like to leave the system as flexible as possible for future transaction types so that things we haven’t thought of today are possible, and the best way I think to achieve that is to put as few requirements on the transaction as possible.

I generally think it is a good idea for transaction types to sign the TransactionType, as it removes the possibility of various types of replay attacks, but maybe some future transaction types are specifically designed to enable certain classes of replay (e.g., sign a transaction that can be submitted as either type 5 or type 6 or both).

I’m good changing the signing language to SHOULD. The idea that we don’t know what future transaction types will look like, how they will be signed, etc, makes enough sense to me.

1 Like

I added a SHOULD for signing TransactionType.


I have added some text about what ORIGIN and CALLER mean going forward. For TransactionType 0 they are fully backward compatible and the change is invisible to contracts. However, for all other transaction types, the value of both ORIGIN and CALLER will have a transaction-dependent meaning. For ORIGIN, I feel like the risks are pretty low. However, I am concerned that for CALLER the risks in this change are a bit more significant.

Do people think that we are OK to redefine CALLER for future transaction types? We could allow its contents to be determined per transaction type, but require that it always be an address (never some other data) so that existing contracts won’t choke on receiving a non-address CALLER.

I’m not strongly for or against a fee bomb. I do believe a fee bomb will unduly increase the complexity. Depending on EIP-1559, there could be a major change to the transaction format in the near future anyways. Their solution is to slowly scales down the fraction of the block dedicated to legacy transactions.

I don’t think it is okay to redefine ORIGIN or CALLER in this way.

  1. There are no other opcodes which pack multiple return values into a single word
  2. We’d need to analyze every contract and determine if modifying the high 32-bits would break anything.

I believe a new opcode for TransactionType would be preferable. However, we should be cautious of allowing contracts to access such information. Are there compelling use cases for this? We can always add it later via a new EIP.

If we’re going to colloquially rename opcodes, I believe renaming ORIGIN to GASPAYER would make more sense. As new transaction types are proposed we can decide if there is value in adding a type-dependent data opcode.

The problem is that in the context of the generalized concept of typed transactions (not sponsored transactions specifically), we cannot assert what ORIGIN or CALLER means globally. Each transaction type will need to define what those opcodes return and for some it may not be comparable to what legacy transactions return for those opcodes. While for EIP 2711 it may not break things too badly if we jam the gas payer into ORIGIN and the SENDER into caller, I am not confident that the same will be true for all future transaction types. If we want the freedom to create new transaction types going forward, then I think we need to solve the problem of ORIGIN/CALLER.

One option that is a bit of a middle ground is that we could assert that ORIGIN and CALLER must always be an address, but we cease asserting what those addresses represent. If we were to go that route then I think we should add a new opcode for Transaction Type so that contracts can figure out what those two addresses represent.

Alternatively, we could assert that all transaction types must have a CALLER that represents “the address that will be considered to have called the contract”. This constrains what we can do with transactions (what would a 2 of 2 multisig contract set for CALLER?), but maybe it is a reasonable constraint?

The last option is to assert that ORIGIN == <CALLER of first frame>, and CALLER is always an address and each Transaction Type would define what that address is. I think this is the most backward compatible solution, but it means we’ll have to create a new opcode for TRANSACTION_DATA and TRANSACTION_TYPE (or we could bit pack them if we want to try to save opcodes).

I don’t think think we need to boil the ocean in this EIP. I can’t come up with any use cases where CALLER wouldn’t refer to the address of the entity making a call. If there are, we should could address them. However, I don’t see a benefit in altering a widely used opcode to support potential transaction types.

ORIGIN is a bit of a special case since AFAIK it hasn’t been used for anything terribly productive on mainnet. To be safe and less contentious, we might as well just introduce GASPAYER since all transactions will be paid by someone. CALLER is widely used and any transaction type which significantly alters the meaning of it will be certainly be met with resistance.

My intuition is that we should minimize the observability of different transaction types from within the EVM. For example, what if a transaction was introduced which paid a portion of the fees to a developer fund and to boycott it, some contracts would not allow transactions of that type? I believe all transactions should be treated equally once they enter the EVM. What use cases can you imagine if contracts can treat transactions unequally?

I think this is more than reasonable and, in fact, is already the implicit assertion made by contract developers.

Is there a reason why it wouldn’t set CALLER to the address of the multisig?

Also, I spent some time messing around with different RLP encodings of the typed transaction format. The envelope format was much easier to implement, so I’m happy to say I was wrong about it. For a typical transaction, the flat structure was 4 bytes shorter than the envelope structure. I didn’t get a chance finish the lazy transaction, but lazy decoding isn’t standard RLP anyways and @MicahZoltu pointed out earlier – it adds complexity without much savings.

I’ve posted my code here if anyone is interested.

After sleeping on it and reading the feedback from @matt I have removed the ORIGIN and CALLER stuff. I added a note in the rationale saying that ORIGIN and CALLER should be the same for the first frame of the transaction for all transaction types, and that if future transaction types want to include additional data they will need a new opcode.

I am mildly convinced that allowing differentiation by transaction type may lead to some bad things like contracts not working for people who utilize certain types of transactions, but in that case I’m not sure how to best deal with sponsored transactions. I’ll continue the discussion on that over in EIP-2711: Separate gas payer from msg.sender

1 Like

Not sure I’m knowledgable enough to comment on this EIP’s worth, but I noticed a few small issue with wording:

In the rationale section, under “Opaque second item rather than an array” section you say,

By having the second item of the array just be opaque bytes, rather than a list, we can support different encoding formats for the transaction payload in the future, such as SSZ or a fixed-width format.

In the backward compatibility section you say:

...noting that the second element is a list rather than a value.

Did you mean that the second item is bytes?

And in the Security Considerations section you say:

...the second item as a value when it is encoded as an array

Probably a result of the change to bytes after the initial writing of the spec.

Thought I’d point that out as it’s a bit confusing…

@tjayrush Both of those were mistakes due to a change from earlier version. Both have been fixed!

2 Likes

Sorry if I missed this in the docs, does each transaction type get its own mempool?

It’s not clear what you mean by “get its own mempool”. If you mean the mempool may need to maintain a list of transactions of a certain type to perform additional checks (e.g. that their total gas is less than the allow 1559 limits or that their valid_until block hasn’t lapsed), then I suppose the answer is yes. Whether or not these checks are performed in parallel seems like an implementation concern.

That is “out of scope” of this EIP, but for the currently on-deck 2718 transaction types, 1559 is the only one that would need its own mempool. The rest would share one with legacy transactions.

Out of scope makes sense, but it seems like there would be plenty of situations where we’d want a smooth transition from one tx type to another.

As an administrative matter, these “parsimonious changes” are a lot more acceptable to me if these sorts of major updates were to actually happen multiple times a year, which is a bit of a catch-22.

In the 1559 case, we have two mempools but they aren’t intended to live side-by-side forever. The intent is that one eventually replaces the other. This is a bit different from other new transaction types where the intent is that they live side-by-side forever. If we imagine 1559 landing after 2711 and other new transaction types, I suspect 1559 will need to actually replace all transaction types with new transaction types that include the new 1559 gas semantics. For example, if we have transaction type 0 (legacy) and transaction type 1 (sponsored/batch/expiring transacitons) when 1559 lands, then 1559 would need to introduce two new types: 2 (legacy with 1559 semantics) and 3 (sponsored/batch/expiring with 1559 semantics).

Questions that I think we would need to answer to move forward:

Do we think that switching mempools is a common enough operation that it is worth trying to generalize a solution? Do we think that should be part of 2718, or should it be part of a separate EIP that defines a mechanism for dealing with pool transitions? Will we always want to go from one mempool to another, or are there situations where we may want multiple side-by-side mempools indefinitely?

Maybe I missed this somewhere in all the text above here… But if the format is rlp([0, rlp([nonce, gasPrice, gasLimit, to, value, data, v, r, s])]) . The signed data is bundled in the inner rlp. So a wrapped transaction can be re-wrapped with some other format? How would you uniquely identify a transaction? The hash of the inner payload, or the hash of the (unsigned) wrapping?

Good question, and the EIP does need to be updated to specify what is hashed for the unique transaction identifier. My initial thinking is that we should identify the transaction by the wrapped hash. However, that would mean that on the fork block when all transactions in the pending queue are wrapped (a one-time operation), their hashes would all change which will almost certainly break any dapps running during the transition, and probably break a lot of user interfaces around that time. The situation wouldn’t be unrecoverable, but it definitely could be messy.

We could use the inner transaction hash, but long term this feels dirty to me as every other transaction type going forward will (hopefully) be identified by a full hash of the transaction, and we’ll forever be left with this one oddball situation to deal with.

Perhaps as a mechanism to protect dapps operating during the transition, clients could have some range of blocks over which transactions have two unique identifiers (hash of inner and hash of outer) such that when someone looks up either with the client, the client will return the details requested for that transaction. Since this would just be a feature for dealing with a transient problem, the code for this (and any related DB entries) could eventually be deleted, it would only have to exist for some finite period of time around the fork block. We just want to make sure that most transactions that were in the pending queue on fork block are accessible by either old or new transaction hash, even though they were mined after the fork block.

Thoughts? Core dev thoughts on the subject would be particularly valuable as it would help provide insight into how realistic either solution is.

1 Like

Well, all that juggling just to handle a temporary UX-cornercase around the actual fork block seems not worth it, IMO. I hadn’t read the EIP properly, and thought that both old-style and ‘wrapped’ txs were allowed.

As I see it, it’s very odd to sign something, and have the ‘wrapping’ not be part of the signed stuff. So my gut feelings are

  • The signature should encompass the wrapping,
  • The hash should be a hash of the whole wrapped package

But with that, we have to break up the wrapping, since the inner part now must know about the outer part…?

There are two separate problems that I think you may be conflating @holiman:

  1. Signing the wrapped transaction vs signing only the inner transaction for type 0 transactions.
  2. Hashing the wrapped transaction vs hashing the inner transaction for type 0 transactions.

If we change how transactions are signed, then every single wallet will break (be unable to sign transactions) as of the fork block until it is updated. By signing only the inner transaction, wallets can continue to sign the same thing they always signed and the client they communicate with (e.g., Geth) can just wrap them up.

If we change how transactions are hashed, then dapps will break if they submit a transaction before the fork block and it is mined after the fork block. In almost all cases, this can probably be resolved by the end-user by refreshing the page (and possibly clearing their local browser cache, depending on the specifics of the dapp).

I think I can get on board with just eating the transient problem with the hashes changes around the fork block. I don’t think I can get on board with having all signing tools breaking until updated (this would include all hardware wallets, offline wallets, etc. I believe). Changing the signature would also break anyone who has a pre-signed transaction sitting around (e.g., a paper asset recovery transaction for a cold wallet).

In a perfect world I agree that the signature should sign the envelope (including transaction type) and the hash should be of the whole thing. I just don’t think we can reasonably achieve the former is all.

I have updated 2718 to include specification on hashing (hash the envelope).

I have also added some recommendations for client developers (wrap transactions just before fork block and provide access to transactions by both hashes for a time) but neither are MUST, just SHOULD so there is no requirement if client developers think it isn’t worth the effort.

I also added some rational for signing only the Payload for type 0 transactions and hashing the outer transaction.