EIP-2718: Typed Transaction Envelope

I am interested in exploring the solution space for how we can deprecate the old format in a more universal way. I like that we have mechanisms through which we can still support the old format, but I think we’d benefit from a strategy that let us eventually migrate all tooling to the new format.

In the legacy format, the transaction hash is defined as keccak(legacy_9_item_rlp_transaction) and the signature is sign(first_6_items_of_legacy_txn). I would propose that we add a new version of the old transaction which:

  1. includes the TransactionType as part of the signature.
  2. computes the hash as `keccak(rlp([TransactionType, ]))`

This gives us both a legacy version of the current transaction format and a modernized version, allowing us to differentiate between transactions that are still being created using old tooling and ones using the new modern approach.

My thought is that we can leverage this to add a “fee bomb” into the protocol. The exact mechanism is up for debate, but I would propose:

  1. have the bomb slowly ramp up transaction fees for legacy transactions
    • start small and ramp up to something like 2-10x multiplier on the fees.
  2. have the bomb kick in on a 12-24 month timeframe.

The rationale for ramping up the transaction fees for legacy transactions is that it provides a financial incentive to get off the tooling that is still using the old format. This incentive should work for both users and developers since users will not want to pay higher fees and developers of transaction signing infrastructure should be sensitive to the needs of their users.
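As a rough illustration of the ramp (not a concrete proposal), here is a minimal sketch assuming a linear schedule. The function name `legacy_fee` and all of its parameters (`start_block`, `ramp_blocks`, `max_multiplier`) are hypothetical, and the mechanism itself is still up for debate. It sticks to integer arithmetic, since a consensus rule would presumably want to avoid floating point.

```python
def legacy_fee(base_fee, block_number, start_block, ramp_blocks, max_multiplier):
    """Fee charged to a legacy transaction under a linear "fee bomb".

    All parameters are illustrative assumptions:
      start_block     block at which the ramp begins
      ramp_blocks     number of blocks over which the multiplier grows
      max_multiplier  final integer multiplier (e.g. somewhere in 2..10)
    """
    # Clamp elapsed blocks into [0, ramp_blocks] so the fee is flat
    # before the ramp starts and after it has finished.
    elapsed = min(max(block_number - start_block, 0), ramp_blocks)
    # fee = base_fee * (1 + (max_multiplier - 1) * elapsed / ramp_blocks)
    return base_fee + base_fee * (max_multiplier - 1) * elapsed // ramp_blocks


# Ramp from 1x to 2x over 100 blocks, starting at block 10.
before = legacy_fee(100, 5, 10, 100, 2)
halfway = legacy_fee(100, 60, 10, 100, 2)
after = legacy_fee(100, 1000, 10, 100, 2)
```

Non-legacy transactions would simply skip this adjustment, which is what creates the incentive to move to newer tooling.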

The benefits I see from being able to fully leave behind the legacy format are:

  1. reduced complexity for client and tooling developers (no need to special case the old format).
  2. reduced complexity for future protocol changes (no extra special rules for if TransactionType == 0)

I’m curious to hear what other people think about this.

I think the spec is missing a section on how the transaction hash should be computed. It seems we are unable to change this for the legacy TransactionType, but we might benefit from having a defined standard for new types, assuming we can come up with a scheme that we expect to be forwards compatible. I would suggest:

  1. The TransactionType must be included in the fields that are signed.
  2. The hash must be computed from the full transaction payload `keccak(rlp([TransactionType, [, …]]))`

These rules would apply only to new transaction types, with the legacy type being stuck with the legacy rules for signing and hashing (see my previous post on adding a second type here that follows the new convention).
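To make the hashing rule a bit more concrete, here is a rough sketch, assuming the second item of the envelope is the encoded payload treated as opaque bytes. `rlp_encode` is a small hand-rolled encoder and `typed_transaction_hash` is a hypothetical name. The hash function is passed in as a parameter because Python's standard library has no Keccak-256; `hashlib.sha3_256` is the NIST SHA-3 variant and produces different digests, so it is used here only as a stand-in.

```python
import hashlib


def _length_prefix(length, offset):
    # Short form: a single byte carries the length (0..55).
    if length < 56:
        return bytes([offset + length])
    # Long form: one byte says how many length bytes follow.
    length_bytes = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([offset + 55 + len(length_bytes)]) + length_bytes


def rlp_encode(item):
    """RLP-encode bytes, non-negative ints, or (nested) lists of those."""
    if isinstance(item, int):
        # Integers are big-endian with no leading zeros; 0 is the empty string.
        item = item.to_bytes((item.bit_length() + 7) // 8, "big")
    if isinstance(item, bytes):
        if len(item) == 1 and item[0] < 0x80:
            return item  # a single low byte encodes as itself
        return _length_prefix(len(item), 0x80) + item
    payload = b"".join(rlp_encode(x) for x in item)
    return _length_prefix(len(payload), 0xC0) + payload


def typed_transaction_hash(tx_type, payload, hash_fn):
    """Hash the whole envelope so the type is part of the identifier."""
    return hash_fn(rlp_encode([tx_type, payload]))


def stand_in_hash(data):
    # ASSUMPTION: placeholder only. Real clients would use Keccak-256 here.
    return hashlib.sha3_256(data).digest()


payload = b"\x01\x02"
hash_as_type_1 = typed_transaction_hash(1, payload, stand_in_hash)
hash_as_type_2 = typed_transaction_hash(2, payload, stand_in_hash)
```

Because the type is inside the hashed bytes, the same payload submitted under two different types gets two different identifiers.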

@pipermerriam Why a fee bomb instead of just a well defined EOL schedule? Even with a fee bomb, we would still need an EOL schedule in order to stop supporting legacy transactions, and it is unclear to me what value the fee bomb adds if an EOL schedule is still necessary.

Is the fear that people will procrastinate upgrading their tooling and then be upset in 1-2 years when all of a sudden it stops working?

I’m a little hesitant to make any assertions about what MUST be included in a transaction’s signature, mainly because I am hesitant to make assertions in this document about what it even means to “sign” something. I would like to leave the system as flexible as possible for future transaction types so that things we haven’t thought of today are possible, and the best way I think to achieve that is to put as few requirements on the transaction as possible.

I generally think it is a good idea for transaction types to sign the TransactionType, as it removes the possibility of various types of replay attacks, but maybe some future transaction types are specifically designed to enable certain classes of replay (e.g., sign a transaction that can be submitted as either type 5 or type 6 or both).

I’m good changing the signing language to SHOULD. The idea that we don’t know what future transaction types will look like, how they will be signed, etc, makes enough sense to me.

I added a SHOULD for signing TransactionType.

I have added some text about what ORIGIN and CALLER mean going forward. For TransactionType 0 they are fully backward compatible and the change is invisible to contracts. However, for all other transaction types, the value of both ORIGIN and CALLER will have a transaction-dependent meaning. For ORIGIN, I feel like the risks are pretty low. However, I am concerned that for CALLER the risks in this change are a bit more significant.

Do people think that we are OK to redefine CALLER for future transaction types? We could allow its contents to be determined per transaction type, but require that it always be an address (never some other data) so that existing contracts won’t choke on receiving a non-address CALLER.

I’m not strongly for or against a fee bomb. I do believe a fee bomb will unduly increase the complexity. Depending on EIP-1559, there could be a major change to the transaction format in the near future anyways. Its solution is to slowly scale down the fraction of the block dedicated to legacy transactions.

I don’t think it is okay to redefine ORIGIN or CALLER in this way.

  1. There are no other opcodes which pack multiple return values into a single word
  2. We’d need to analyze every contract and determine if modifying the high 32-bits would break anything.

I believe a new opcode for TransactionType would be preferable. However, we should be cautious of allowing contracts to access such information. Are there compelling use cases for this? We can always add it later via a new EIP.

If we’re going to colloquially rename opcodes, I believe renaming ORIGIN to GASPAYER would make more sense. As new transaction types are proposed we can decide if there is value in adding a type-dependent data opcode.

The problem is that in the context of the generalized concept of typed transactions (not sponsored transactions specifically), we cannot assert what ORIGIN or CALLER means globally. Each transaction type will need to define what those opcodes return and for some it may not be comparable to what legacy transactions return for those opcodes. While for EIP 2711 it may not break things too badly if we jam the gas payer into ORIGIN and the SENDER into CALLER, I am not confident that the same will be true for all future transaction types. If we want the freedom to create new transaction types going forward, then I think we need to solve the problem of ORIGIN/CALLER.

One option that is a bit of a middle ground is that we could assert that ORIGIN and CALLER must always be an address, but we cease asserting what those addresses represent. If we were to go that route then I think we should add a new opcode for Transaction Type so that contracts can figure out what those two addresses represent.

Alternatively, we could assert that all transaction types must have a CALLER that represents “the address that will be considered to have called the contract”. This constrains what we can do with transactions (what would a 2 of 2 multisig contract set for CALLER?), but maybe it is a reasonable constraint?

The last option is to assert that ORIGIN == <CALLER of first frame>, and CALLER is always an address and each Transaction Type would define what that address is. I think this is the most backward compatible solution, but it means we’ll have to create a new opcode for TRANSACTION_DATA and TRANSACTION_TYPE (or we could bit pack them if we want to try to save opcodes).

I don’t think we need to boil the ocean in this EIP. I can’t come up with any use cases where CALLER wouldn’t refer to the address of the entity making a call. If there are, we could address them. However, I don’t see a benefit in altering a widely used opcode to support potential transaction types.

ORIGIN is a bit of a special case since AFAIK it hasn’t been used for anything terribly productive on mainnet. To be safe and less contentious, we might as well just introduce GASPAYER since all transactions will be paid by someone. CALLER is widely used and any transaction type which significantly alters the meaning of it will certainly be met with resistance.

My intuition is that we should minimize the observability of different transaction types from within the EVM. For example, what if a transaction was introduced which paid a portion of the fees to a developer fund and to boycott it, some contracts would not allow transactions of that type? I believe all transactions should be treated equally once they enter the EVM. What use cases can you imagine if contracts can treat transactions unequally?

I think this is more than reasonable and, in fact, is already the implicit assertion made by contract developers.

Is there a reason why it wouldn’t set CALLER to the address of the multisig?

Also, I spent some time messing around with different RLP encodings of the typed transaction format. The envelope format was much easier to implement, so I’m happy to say I was wrong about it. For a typical transaction, the flat structure was 4 bytes shorter than the envelope structure. I didn’t get a chance to finish the lazy transaction, but lazy decoding isn’t standard RLP anyways and, as @MicahZoltu pointed out earlier, it adds complexity without much savings.
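For anyone who wants to reproduce the size comparison, here is a self-contained sketch. It is not the code linked below; it uses a small hand-rolled RLP encoder and made-up field values for a plain value transfer. "Flat" here means `rlp([type, nonce, ..., s])` and "envelope" means `rlp([type, rlp([nonce, ..., s])])` with the inner encoding carried as a byte string, which is my reading of the two layouts.

```python
def _length_prefix(length, offset):
    # Short form for lengths 0..55, long form otherwise.
    if length < 56:
        return bytes([offset + length])
    length_bytes = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([offset + 55 + len(length_bytes)]) + length_bytes


def rlp_encode(item):
    """RLP-encode bytes, non-negative ints, or (nested) lists of those."""
    if isinstance(item, int):
        item = item.to_bytes((item.bit_length() + 7) // 8, "big")
    if isinstance(item, bytes):
        if len(item) == 1 and item[0] < 0x80:
            return item
        return _length_prefix(len(item), 0x80) + item
    payload = b"".join(rlp_encode(x) for x in item)
    return _length_prefix(len(payload), 0xC0) + payload


# Illustrative legacy fields (values are invented for the example).
fields = [
    9,                  # nonce
    20_000_000_000,     # gasPrice (20 gwei)
    21_000,             # gasLimit
    bytes([0x35]) * 20, # to
    10**18,             # value (1 ether)
    b"",                # data
    37,                 # v
    bytes([0x11]) * 32, # r
    bytes([0x22]) * 32, # s
]

TX_TYPE = 0
flat = rlp_encode([TX_TYPE] + fields)
envelope = rlp_encode([TX_TYPE, rlp_encode(fields)])

print(len(flat), len(envelope), len(envelope) - len(flat))
```

With these values the envelope costs two extra length headers of two bytes each: one for the inner list and one for wrapping that list as a byte string. Payloads shorter than 56 bytes or longer than 255 bytes would shift the overhead slightly.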

I’ve posted my code here if anyone is interested.

After sleeping on it and reading the feedback from @matt I have removed the ORIGIN and CALLER stuff. I added a note in the rationale saying that ORIGIN and CALLER should be the same for the first frame of the transaction for all transaction types, and that if future transaction types want to include additional data they will need a new opcode.

I am mildly convinced that allowing differentiation by transaction type may lead to some bad things like contracts not working for people who utilize certain types of transactions, but in that case I’m not sure how to best deal with sponsored transactions. I’ll continue the discussion on that over in EIP-2711: Separate gas payer from msg.sender

Not sure I’m knowledgeable enough to comment on this EIP’s worth, but I noticed a few small issues with wording:

In the rationale section, under the “Opaque second item rather than an array” heading, you say:

By having the second item of the array just be opaque bytes, rather than a list, we can support different encoding formats for the transaction payload in the future, such as SSZ or a fixed-width format.

In the backward compatibility section you say:

...noting that the second element is a list rather than a value.

Did you mean that the second item is bytes?

And in the Security Considerations section you say:

...the second item as a value when it is encoded as an array

Probably a result of the change to bytes after the initial writing of the spec.

Thought I’d point that out as it’s a bit confusing…

@tjayrush Both of those were mistakes due to a change from an earlier version. Both have been fixed!


Sorry if I missed this in the docs, does each transaction type get its own mempool?

It’s not clear what you mean by “get its own mempool”. If you mean the mempool may need to maintain a list of transactions of a certain type to perform additional checks (e.g. that their total gas is less than the allowed 1559 limits or that their valid_until block hasn’t lapsed), then I suppose the answer is yes. Whether or not these checks are performed in parallel seems like an implementation concern.

That is “out of scope” of this EIP, but for the currently on-deck 2718 transaction types, 1559 is the only one that would need its own mempool. The rest would share one with legacy transactions.

Out of scope makes sense, but it seems like there would be plenty of situations where we’d want a smooth transition from one tx type to another.

As an administrative matter, these “parsimonious changes” are a lot more acceptable to me if these sorts of major updates were to actually happen multiple times a year, which is a bit of a catch-22.

In the 1559 case, we have two mempools but they aren’t intended to live side-by-side forever. The intent is that one eventually replaces the other. This is a bit different from other new transaction types where the intent is that they live side-by-side forever. If we imagine 1559 landing after 2711 and other new transaction types, I suspect 1559 will need to actually replace all transaction types with new transaction types that include the new 1559 gas semantics. For example, if we have transaction type 0 (legacy) and transaction type 1 (sponsored/batch/expiring transactions) when 1559 lands, then 1559 would need to introduce two new types: 2 (legacy with 1559 semantics) and 3 (sponsored/batch/expiring with 1559 semantics).

Questions that I think we would need to answer to move forward:

Do we think that switching mempools is a common enough operation that it is worth trying to generalize a solution? Do we think that should be part of 2718, or should it be part of a separate EIP that defines a mechanism for dealing with pool transitions? Will we always want to go from one mempool to another, or are there situations where we may want multiple side-by-side mempools indefinitely?

Maybe I missed this somewhere in all the text above here… But if the format is rlp([0, rlp([nonce, gasPrice, gasLimit, to, value, data, v, r, s])]), the signed data is bundled in the inner rlp. So a wrapped transaction can be re-wrapped with some other format? How would you uniquely identify a transaction? The hash of the inner payload, or the hash of the (unsigned) wrapping?

Good question, and the EIP does need to be updated to specify what is hashed for the unique transaction identifier. My initial thinking is that we should identify the transaction by the wrapped hash. However, that would mean that on the fork block when all transactions in the pending queue are wrapped (a one-time operation), their hashes would all change which will almost certainly break any dapps running during the transition, and probably break a lot of user interfaces around that time. The situation wouldn’t be unrecoverable, but it definitely could be messy.

We could use the inner transaction hash, but long term this feels dirty to me as every other transaction type going forward will (hopefully) be identified by a full hash of the transaction, and we’ll forever be left with this one oddball situation to deal with.

Perhaps as a mechanism to protect dapps operating during the transition, clients could have some range of blocks over which transactions have two unique identifiers (hash of inner and hash of outer) such that when someone looks up either with the client, the client will return the details requested for that transaction. Since this would just be a feature for dealing with a transient problem, the code for this (and any related DB entries) could eventually be deleted; it would only have to exist for some finite period of time around the fork block. We just want to make sure that most transactions that were in the pending queue on fork block are accessible by either old or new transaction hash, even though they were mined after the fork block.
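A minimal sketch of what that lookup could look like inside a client, purely to illustrate the idea. The class name `TransitionIndex`, the `window` parameter, and the in-memory dictionaries are all assumptions; a real client would back this with its database and choose the window size itself.

```python
class TransitionIndex:
    """Transaction lookup that accepts old (inner) hashes near the fork."""

    def __init__(self, fork_block, window):
        self.fork_block = fork_block  # block at which wrapping begins
        self.window = window          # blocks after the fork that get aliases
        self._by_outer = {}           # canonical: outer hash -> transaction
        self._aliases = {}            # temporary: inner hash -> outer hash

    def add(self, outer_hash, inner_hash, tx, block_number):
        self._by_outer[outer_hash] = tx
        # Only transactions mined shortly after the fork keep a second id.
        if self.fork_block <= block_number < self.fork_block + self.window:
            self._aliases[inner_hash] = outer_hash

    def get(self, tx_hash):
        # Resolve an old-style hash to the canonical one if an alias exists.
        canonical = self._aliases.get(tx_hash, tx_hash)
        return self._by_outer.get(canonical)

    def drop_aliases(self):
        # Once the transition is over, the alias table can be deleted.
        self._aliases.clear()


index = TransitionIndex(fork_block=100, window=50)
index.add(b"outer-1", b"inner-1", {"nonce": 1}, block_number=110)
index.add(b"outer-2", b"inner-2", {"nonce": 2}, block_number=200)

found_by_old_hash = index.get(b"inner-1")
found_by_new_hash = index.get(b"outer-1")
outside_window = index.get(b"inner-2")
```

Keeping the aliases in a separate table is what makes the cleanup cheap: dropping it leaves lookups by the new hash untouched.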

Thoughts? Core dev thoughts on the subject would be particularly valuable as it would help provide insight into how realistic either solution is.