EIP-2718: Typed Transaction Envelope

Out of scope makes sense, but it seems like there would be plenty of situations where we’d want a smooth transition from one tx type to another.

As an administrative matter, these “parsimonious changes” would be a lot more acceptable to me if these sorts of major updates were to actually happen multiple times a year, which is a bit of a catch-22.

In the 1559 case, we have two mempools but they aren’t intended to live side-by-side forever. The intent is that one eventually replaces the other. This is a bit different from other new transaction types where the intent is that they live side-by-side forever. If we imagine 1559 landing after 2711 and other new transaction types, I suspect 1559 will need to actually replace all transaction types with new transaction types that include the new 1559 gas semantics. For example, if we have transaction type 0 (legacy) and transaction type 1 (sponsored/batch/expiring transactions) when 1559 lands, then 1559 would need to introduce two new types: 2 (legacy with 1559 semantics) and 3 (sponsored/batch/expiring with 1559 semantics).

Questions that I think we would need to answer to move forward:

Do we think that switching mempools is a common enough operation that it is worth trying to generalize a solution? Do we think that should be part of 2718, or should it be part of a separate EIP that defines a mechanism for dealing with pool transitions? Will we always want to go from one mempool to another, or are there situations where we may want multiple side-by-side mempools indefinitely?

Maybe I missed this somewhere in all the text above, but if the format is rlp([0, rlp([nonce, gasPrice, gasLimit, to, value, data, v, r, s])]), then the signed data is bundled in the inner rlp. So a wrapped transaction can be re-wrapped with some other format? How would you uniquely identify a transaction: by the hash of the inner payload, or by the hash of the (unsigned) wrapping?

Good question, and the EIP does need to be updated to specify what is hashed for the unique transaction identifier. My initial thinking is that we should identify the transaction by the wrapped hash. However, that would mean that on the fork block when all transactions in the pending queue are wrapped (a one-time operation), their hashes would all change which will almost certainly break any dapps running during the transition, and probably break a lot of user interfaces around that time. The situation wouldn’t be unrecoverable, but it definitely could be messy.

We could use the inner transaction hash, but long term this feels dirty to me as every other transaction type going forward will (hopefully) be identified by a full hash of the transaction, and we’ll forever be left with this one oddball situation to deal with.
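To make the two candidate identifiers concrete, here is a minimal sketch of the type-0 wrapping discussed above. The RLP encoder is a simplified stand-in written for this example, the field values are made up, and `hashlib.sha3_256` is only a placeholder: Ethereum uses Keccak-256, which uses different padding and is not in the Python standard library, so these digests would not match real transaction hashes.

```python
import hashlib

def _length_prefix(length, offset):
    # Short payloads get a one-byte prefix; longer ones get a length-of-length prefix.
    if length < 56:
        return bytes([offset + length])
    length_bytes = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([offset + 55 + len(length_bytes)]) + length_bytes

def rlp_encode(item):
    # Minimal RLP: ints become big-endian byte strings, lists are encoded recursively.
    if isinstance(item, int):
        item = item.to_bytes((item.bit_length() + 7) // 8, "big")  # 0 -> b""
    if isinstance(item, bytes):
        if len(item) == 1 and item[0] < 0x80:
            return item
        return _length_prefix(len(item), 0x80) + item
    payload = b"".join(rlp_encode(x) for x in item)
    return _length_prefix(len(payload), 0xC0) + payload

def placeholder_hash(data):
    # Stand-in digest for illustration only; NOT Keccak-256.
    return hashlib.sha3_256(data).hexdigest()

# Hypothetical signed legacy transaction:
# [nonce, gasPrice, gasLimit, to, value, data, v, r, s]
legacy = [9, 20_000_000_000, 21_000, b"\x22" * 20, 10**18, b"",
          27, b"\x11" * 32, b"\x33" * 32]

inner = rlp_encode(legacy)      # the signed transaction exactly as it exists today
outer = rlp_encode([0, inner])  # type-0 envelope: rlp([0, rlp([...])])

inner_id = placeholder_hash(inner)  # identifier if the inner payload is hashed
outer_id = placeholder_hash(outer)  # identifier if the envelope is hashed
```

Because the inner bytes appear unchanged inside the envelope, a client can wrap an already-signed transaction without touching the signature; only the identifier differs depending on which byte string is hashed.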

Perhaps as a mechanism to protect dapps operating during the transition, clients could have some range of blocks over which transactions have two unique identifiers (hash of inner and hash of outer) such that when someone looks up either with the client, the client will return the details requested for that transaction. Since this would just be a feature for dealing with a transient problem, the code for this (and any related DB entries) could eventually be deleted; it would only have to exist for some finite period of time around the fork block. We just want to make sure that most transactions that were in the pending queue on the fork block are accessible by either old or new transaction hash, even though they were mined after the fork block.
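A rough sketch of what such a temporary dual-identifier lookup might look like inside a client. The class name `TransitionIndex`, the `window` parameter, and the method names are all hypothetical and not taken from any client; a real implementation would live in the client's database layer.

```python
class TransitionIndex:
    """Looks up transactions by envelope hash, with temporary inner-hash aliases."""

    def __init__(self, fork_block, window):
        self.fork_block = fork_block
        self.window = window      # number of blocks after the fork that get aliases
        self._by_hash = {}        # envelope (outer) hash -> transaction
        self._aliases = {}        # inner hash -> envelope hash

    def add(self, outer_hash, inner_hash, tx, mined_block):
        self._by_hash[outer_hash] = tx
        # Only transactions mined shortly after the fork keep their old identifier.
        if self.fork_block <= mined_block < self.fork_block + self.window:
            self._aliases[inner_hash] = outer_hash

    def get(self, tx_hash):
        # Resolve an old-style hash to the new one if an alias exists.
        tx_hash = self._aliases.get(tx_hash, tx_hash)
        return self._by_hash.get(tx_hash)

    def drop_aliases(self):
        # Once the transition is over, the alias table can simply be deleted.
        self._aliases.clear()
```

For example, after `index.add("new", "old", tx, mined_block=fork_block + 3)`, both `index.get("new")` and `index.get("old")` return `tx` until `drop_aliases()` is called.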

Thoughts? Core dev thoughts on the subject would be particularly valuable as it would help provide insight into how realistic either solution is.

Well, all that juggling just to handle a temporary UX corner case around the actual fork block seems not worth it, IMO. I hadn’t read the EIP properly, and thought that both old-style and ‘wrapped’ txs were allowed.

As I see it, it’s very odd to sign something, and have the ‘wrapping’ not be part of the signed stuff. So my gut feelings are

  • The signature should encompass the wrapping,
  • The hash should be a hash of the whole wrapped package

But with that, we have to break up the wrapping, since the inner part now must know about the outer part…?

There are two separate problems that I think you may be conflating @holiman:

  1. Signing the wrapped transaction vs signing only the inner transaction for type 0 transactions.
  2. Hashing the wrapped transaction vs hashing the inner transaction for type 0 transactions.

If we change how transactions are signed, then every single wallet will break (be unable to sign transactions) as of the fork block until it is updated. By signing only the inner transaction, wallets can continue to sign the same thing they always signed and the client they communicate with (e.g., Geth) can just wrap them up.

If we change how transactions are hashed, then dapps will break if they submit a transaction before the fork block and it is mined after the fork block. In almost all cases, this can probably be resolved by the end-user by refreshing the page (and possibly clearing their local browser cache, depending on the specifics of the dapp).

I think I can get on board with just eating the transient problem with the hash changes around the fork block. I don’t think I can get on board with having all signing tools break until updated (this would include all hardware wallets, offline wallets, etc., I believe). Changing the signature would also break anyone who has a pre-signed transaction sitting around (e.g., a paper asset recovery transaction for a cold wallet).

In a perfect world I agree that the signature should sign the envelope (including transaction type) and the hash should be of the whole thing. I just don’t think we can reasonably achieve the former is all.

I have updated 2718 to include specification on hashing (hash the envelope).

I have also added some recommendations for client developers (wrap transactions just before the fork block and provide access to transactions by both hashes for a time), but both are SHOULD rather than MUST, so there is no requirement if client developers think it isn’t worth the effort.

I also added some rationale for signing only the Payload for type 0 transactions and hashing the outer transaction.

Transaction Receipts

In 2711, we introduce batch transactions with a single gas payer. There will be a single transaction hash for the whole thing, so that leads me to believe that means there will be a single receipt for the whole thing. However, there may be multiple sub-transactions that are part of that receipt, each of which can succeed/fail independently.

We could assert that batch transactions must all succeed, or all fail as part of 2711, which lets us get away with a single status code, but what about future transaction types that may not adhere to that rule? Another option would be to have the status code field be a uint256, where each bit represents the success of an inner transaction in the batch (putting a limit of 256 transactions per batch), but again that feels like it is tied pretty tightly with 2711 specifically and doesn’t generalize well (what if you have a transaction tree?).
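The bitfield option can be sketched as follows. This is only an illustration of the idea; the function names are made up, and note that the number of inner transactions has to be known from elsewhere (for example, from the batch itself), since a trailing failure and a missing transaction both look like a zero bit.

```python
MAX_BATCH = 256  # one bit per inner transaction in a uint256

def pack_status(results):
    """Pack a list of per-transaction success flags into a single integer."""
    if len(results) > MAX_BATCH:
        raise ValueError("a uint256 status can describe at most 256 transactions")
    status = 0
    for index, succeeded in enumerate(results):
        if succeeded:
            status |= 1 << index
    return status

def unpack_status(status, count):
    """Recover the success flags; count must be supplied by the caller."""
    return [bool((status >> index) & 1) for index in range(count)]

# Inner transactions 0 and 2 succeeded, 1 failed -> binary 101.
status = pack_status([True, False, True])
```

A flat bitfield like this has no way to express nesting, which is the transaction-tree concern raised above.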

Transaction receipts also currently have from, to, and contractAddress in them, as well as a logs array. from and to will need to be changed to something else or removed I think. We could put the gas payer in for from in theory, and technically we don’t need to separate out the logs by sub-transaction (can just have one big logs array for all nested transactions).

This all is making me wonder if we should version transaction receipts as well? A more useful transaction receipt would have something like childReceipts which contains an array of sub-receipts where each sub-receipt had a from, to, contractAddress, status field in it. If we do version transaction receipts, it feels like we should do it in 2711 so that receipt types align with transaction types, so each transaction type would define both the transaction payload and the transaction receipt payloads.
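As a sketch of what a versioned receipt with childReceipts might carry, the field names below follow the ones mentioned in this post (from, to, contractAddress, status), with `sender` standing in for from because `from` is a Python keyword. The class names, the `gas_payer` field, and the `all_succeeded` helper are assumptions made for illustration and do not come from any EIP.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ChildReceipt:
    sender: bytes                      # "from" in the text
    to: Optional[bytes]                # None for contract creation
    contract_address: Optional[bytes]  # set only when a contract was created
    status: bool                       # success or failure of this sub-transaction

@dataclass
class TypedReceipt:
    receipt_type: int                  # matches the transaction type number
    gas_payer: bytes                   # hypothetical: who paid for the whole batch
    child_receipts: List[ChildReceipt] = field(default_factory=list)
    logs: List[bytes] = field(default_factory=list)  # one combined logs array

    def all_succeeded(self):
        # Collapses the per-child statuses into a single legacy-style status.
        return all(child.status for child in self.child_receipts)
```

With this shape, sub-transactions can succeed or fail independently while tools that only understand a single status can still derive one.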

I’m looking for feedback/thoughts on how to handle receipts for different transaction types. My current leaning is to version receipts and couple the version with transaction types (so an EIP that defines a new transaction type would also define a new receipt type).

I have updated 2718 to now include information on how receipts should be enveloped. I went with the typed receipts solution with the type number matching the transaction type, so it is up to each new transaction type to define its receipts.

I like this standard a lot, and if I am to add my own 2 cents: we should avoid numbers as the version. They should be descriptive of what they are. The data segment is opaque, so the identifier is an enum, and making it a letter instead of a number may be more practical.

I wrote an EIP-2718 envelope type for ditto transactions, and I used the letter ‘d’ so that it is evident that it is a ditto transaction rather than an arbitrary auto-incrementing number.

The reason numbers are used instead of a more human readable mechanism has to do with the way transactions are encoded on the wire. The smaller the number is, the fewer bytes it consumes. For numbers between 0 and 127 the value only takes up one byte when encoded. This means we have fairly little space to work with before we start needing more bytes. If we used a single ASCII encoded letter I don’t think readability would be improved, but it may become hard to keep track of which values have been used and which haven’t.
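The size argument can be checked with a small sketch of how RLP encodes a non-negative integer. The function below is written for this example and only handles integers whose byte representation is shorter than 56 bytes, which is more than enough for a type number.

```python
def rlp_encode_uint(value):
    """RLP-encode a small non-negative integer (payload under 56 bytes)."""
    raw = value.to_bytes((value.bit_length() + 7) // 8, "big")  # 0 -> b""
    if len(raw) == 1 and raw[0] < 0x80:
        return raw                        # 0x01..0x7f encode as themselves
    return bytes([0x80 + len(raw)]) + raw  # otherwise: length prefix + bytes

# Encoded size in bytes for a few candidate type numbers.
sizes = {n: len(rlp_encode_uint(n)) for n in (0, 1, 127, 128, 255, 256)}
# sizes == {0: 1, 1: 1, 127: 1, 128: 2, 255: 2, 256: 3}
```

Every value from 0 through 127 fits in one byte, and 128 is the first value that needs a second byte.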

That makes sense. I also see that in RLP a single byte between 0x00 and 0x7f is encoded as itself, giving a single-byte keyspace of 128 ( https://medium.com/coinmonks/data-structure-in-ethereum-episode-1-recursive-length-prefix-rlp-encoding-decoding-d1016832f919 ). On that note, 7-bit ASCII has only 128 codes. What do you think of using 7-bit ASCII characters as the version param? For example, if I wanted to create a new “ditto transaction” type, I could use ‘d’, which is ASCII 0x64 and fits within the single-byte range of RLP.

I don’t believe that a single ASCII encoded character is any more human readable than a single number between 0 and 127, and having each transaction type pick a letter rapidly runs into problems with keeping track of which letters we have used and which we haven’t (much easier when we just start at 1 and count up). Also, we would need to map all of the transactions to printable ASCII characters, which wouldn’t align exactly with the actual ASCII code values (e.g., 0 is non-printing, so we would need 0 to mean something else). Once we have a mapping, we might as well map 0 to “Legacy Transaction” which is human readable.
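A quick check of the 7-bit ASCII range illustrates the mapping problem. This sketch relies only on Python's built-in `str.isprintable`, which treats the space character as printable.

```python
# Which of the 128 single-byte values are printable ASCII characters?
printable = [code for code in range(128) if chr(code).isprintable()]
non_printing = [code for code in range(128) if code not in printable]

printable_count = len(printable)        # 95: space (0x20) through '~' (0x7e)
non_printing_count = len(non_printing)  # 33: control codes 0x00-0x1f plus DEL (0x7f)

# Letter types are also case-sensitive on the wire.
lower_d = ord("d")  # 0x64
upper_d = ord("D")  # 0x44
```

So a letter-based scheme would leave 33 of the 128 single-byte values, including 0, without a readable character, and they would need a separate mapping anyway.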

New transaction types need not replace the old in sequence, but rather we could support up to 127 types which will be defined the same way we allocate opcodes. For example, I may want to define a quantum resistant transaction ‘q’ or a byte-boundary packed transaction ‘p’ which would be smaller than RLP. I already wrote up a use-case for ‘d’ ditto transactions, and it doesn’t really make sense to call this transaction #1. Using #0 as the archetypal transaction doesn’t bother me - but perhaps enforcing a sequence here would breed confusion as the set of transaction types expands.

This follows the “char enum” pattern that you see in PostgreSQL.

In what situations do you foresee people reading transaction data decoded as ASCII? The transaction will likely be presented as a byte array of numbers (either hex or decimal encoded), and decoding to ASCII would require an extra step.

Since q alone isn’t enough to indicate what the purpose of the transaction is, there MUST be a mapping somewhere for q => Quantum Resistant Transaction. What is the advantage of having that over 5 => Quantum Resistant Transaction?

I really like that RLP is partially human readable, and using it as a ‘char enum’ datatype feels like a natural pairing to me. The two-byte envelope in this standard defines a large keyspace (127) for versions, and it seems like a waste to use them sequentially. In terms of mapping, yeah, we create maps all the time, so this isn’t a concern: every OPCODE or message type needs to be allocated manually through discussion. An EIP that introduces a new opcode defines which OPCODE value to use, so why not which version code to use?

Another thing to note: by using a version string you don’t need to use variable-sized datatypes in the transaction itself. Using fixed-byte boundaries that are versioned would save about 9 bytes per transaction, maybe more… If you wanted a different key size, then you could define a different version to use - not unlike SSL/TLS cipher suite handshakes.

Can you explain what you mean by that? Where would we get the 9 byte benefit? Does it require using a different encoding system than RLP (which increases the engineering complexity of implementation)?

I think most of us will agree that scalability is more important than complexity - but byte boundary encoding is actually quite simple. This is more than just ASN.1 or protobuf; look at TCP/IP - the OSI model is full of byte-boundary based protocols.

RLP needs to use a signalling byte to define datatypes and sometimes sizes - this is an overhead that users must pay for in the form of gas and nodes must pay for in the form of storage. At a million transactions per day, this is megabytes of overhead per day, and gigabytes per year. The purpose of RLP’s overhead is that the structure doesn’t have to be known ahead of time - but this isn’t the best choice for encoding a structure of fixed-byte entities that really hasn’t changed.
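To put a number on that overhead, the sketch below RLP-encodes one sample legacy transaction and compares the encoded length with the sum of the raw field lengths. The encoder is a simplified stand-in written for this example and the field values are made up (a plain 1 ether transfer at 20 gwei); the overhead for other transactions varies with their field sizes and data length.

```python
def _length_prefix(length, offset):
    if length < 56:
        return bytes([offset + length])
    length_bytes = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([offset + 55 + len(length_bytes)]) + length_bytes

def to_bytes(value):
    # Integers become minimal big-endian byte strings; bytes pass through.
    if isinstance(value, int):
        return value.to_bytes((value.bit_length() + 7) // 8, "big")
    return value

def rlp_encode(item):
    if isinstance(item, (int, bytes)):
        raw = to_bytes(item)
        if len(raw) == 1 and raw[0] < 0x80:
            return raw
        return _length_prefix(len(raw), 0x80) + raw
    payload = b"".join(rlp_encode(x) for x in item)
    return _length_prefix(len(payload), 0xC0) + payload

# Hypothetical sample: [nonce, gasPrice, gasLimit, to, value, data, v, r, s]
fields = [9, 20_000_000_000, 21_000, b"\x22" * 20, 10**18, b"",
          27, b"\x11" * 32, b"\x33" * 32]

raw_size = sum(len(to_bytes(f)) for f in fields)  # bytes of actual content
encoded_size = len(rlp_encode(fields))            # bytes on the wire
overhead = encoded_size - raw_size                # RLP prefix bytes

per_day = overhead * 1_000_000                    # at a million transactions per day
per_year = per_day * 365
```

For this sample the content is 101 bytes, the encoding is 110 bytes, and the overhead is 9 bytes, which works out to 9 MB per day and roughly 3.3 GB per year at a million transactions per day.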

A traditional transaction is 9 elements:
[nonce, gasPrice, gasLimit, to, value, data, senderV, senderR, senderS]

^ Every element has a fixed size, except for data, so we can move that to the end. This could define a byte boundary schema which is far more dense than RLP:

1 byte, 2 bytes, 2 bytes, 32 bytes, 4 bytes, 32 bytes, 16 bytes, 16 bytes, Variable Data Size

This would put the theoretical smallest byte-boundary packed transaction at just 137 bytes; if this were an EIP-2718 transaction then it would be 140 bytes - which is much smaller than whatever RLP is generating. I am considering writing ^ this up as an EIP and giving this the version string ‘p’ for a packed message. If for example we wanted to make the nonce larger than a single byte, then a new version string would have to be defined - and we have 127 of them.

If I understand what you are suggesting, it is that we should not use RLP for the envelope and instead just have 1 byte for the version, followed by remaining bytes for the payload? Presumably, we would reserve 255 for future expansion (e.g., extension value)?

If so, I’m not against it and I think @AlexeyAkhunov has argued in the past that we shouldn’t be using RLP as much as we do. I would like to get feedback from client developers before making such a change though, as there is value in everything in a protocol using the same serialization format as it generally makes client development easier than when you have custom protocols for different parts of the system.

In this case, it may be worth it to avoid having to have an extra 1-4 bytes on every transaction.

I think that forcing clients to include multiple serialization schemes to be minimally consensus-compliant is not the direction we should go, even if the savings are non-negligible. Eventually we should move away from RLP, but in the interim, I believe we should accept what we have.