EIP-7495: SSZ StableContainer

Discussion thread for EIP-7495: SSZ PartialContainer

Background

Potential use

Showcase for how this would look for transactions: https://eth-light.xyz (desktop only, not mobile-friendly)

This is interesting. Thanks.

The section called “Why not a Container full of Optional[E]” ends with this sentence: “Therefore, the number of fields is constant and the Merkle tree shape is stable.” I apologize if I’m misunderstanding something, but would it be correct to add, “…the Merkle tree shape is stable, if perhaps larger”?

I understand that the shape of the tree won’t change if the number of optional fields is constant, but isn’t it true that in (half?) of the cases, the tree will be twice as big?

I’m mostly just curious, but it might help to make that explicit.

Thanks for the comment, I have updated the PR with a more detailed description of the overhead.

While the tree will indeed be twice as big, the number of hashes required to compute the root hash of the tree only scales logarithmically. SSZ implementations typically precompute root hashes of pure zero trees, so the empty half of the tree doesn’t need to be hashed in practice. For example, if you have a tree with 32 leaves, but the last 16 leaves are empty:

    hash_tree_root(tree)
        == hash(hash_tree_root(tree.left) || hash_tree_root(tree.right))
        == hash(hash_tree_root(tree.left) || zeroHashConstants[depth = 4])
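To make the zero-subtree shortcut concrete, here is a minimal sketch using SHA-256 over 32-byte chunks. The helper names (hash_pair, merkle_root, ZERO_HASHES) are made up for this illustration and are not taken from any SSZ library.

```python
import hashlib

ZERO_CHUNK = b"\x00" * 32


def hash_pair(left: bytes, right: bytes) -> bytes:
    """Hash two 32-byte nodes into their parent node."""
    return hashlib.sha256(left + right).digest()


# ZERO_HASHES[d] is the root of an all-zero subtree with 2**d leaves.
ZERO_HASHES = [ZERO_CHUNK]
for _ in range(5):
    ZERO_HASHES.append(hash_pair(ZERO_HASHES[-1], ZERO_HASHES[-1]))


def merkle_root(leaves: list) -> bytes:
    """Naive root computation; len(leaves) must be a power of two."""
    layer = list(leaves)
    while len(layer) > 1:
        layer = [hash_pair(layer[i], layer[i + 1]) for i in range(0, len(layer), 2)]
    return layer[0]


# 32 leaves, of which the last 16 are empty.
leaves = [i.to_bytes(32, "little") for i in range(1, 17)] + [ZERO_CHUNK] * 16

full_root = merkle_root(leaves)  # hashes the empty half as well
shortcut_root = hash_pair(merkle_root(leaves[:16]), ZERO_HASHES[4])  # skips it

print(full_root == shortcut_root)
```

The empty half costs a single table lookup instead of 15 hash invocations, so doubling the capacity adds only one extra hash at the top of the tree.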

As SSZ specs live in the ethereum/consensus-specs repo, the PartialContainer spec will be introduced there once we agree on an overall approach for representing transactions / receipts (EIP-6404 / EIP-6466).

Until then, I’d like to have everything in the EIPs repo, including the PartialContainer. This is in line with prior art, for example, EIP-4895, where consensus changes are being discussed in the EIP before making it into consensus-specs.

Note that I’m still working on updating EIP-6404 / EIP-6466 to use the PartialContainer. Once we have agreement, I’ll extend remerkleable with support for PartialContainer and then create a matching consensus-specs PR that also provides tests. Likewise, I will open a consensus-specs PR to change the consensus Transaction structure to match the proposed execution Transaction structure.

Yes, a new fork could relax validity constraints, while a subsequent fork could require the field to be present again. Whether that makes sense is up to the application, but the serialization format proposed here will be stable across both directions (from optional to required and vice versa), essentially providing maximum flexibility when it comes to future design space.

In fact, an implementation could simply always treat all fields as optional, if it prefers to enforce constraints in the application layer instead of the serialization library. Another implementation may go the other direction and also push down an invariant callback to ensure that only valid combinations are accepted; e.g., a receipt can contain either an intermediate state root, or a status code, but not both at the same time. Yet another implementation may try to use union types to represent the various valid combinations. The point is that none of these implementation details leak into the serialization representation.
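As a rough illustration of the "invariant callback" idea, the sketch below keeps all fields optional in the data type and enforces the receipt rule in a separate check. The Receipt class, its field names, and check_receipt are hypothetical and only mirror the example in the text; they are not the EIP's definitions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Receipt:
    # Hypothetical fields: every field is optional at the serialization layer.
    root: Optional[bytes] = None    # intermediate state root
    status: Optional[bool] = None   # status code


def check_receipt(receipt: Receipt) -> None:
    """Application-level invariant: a state root and a status code must not both be present."""
    if receipt.root is not None and receipt.status is not None:
        raise ValueError("receipt must not contain both a state root and a status code")


check_receipt(Receipt(root=b"\x11" * 32))  # accepted
check_receipt(Receipt(status=True))        # accepted

try:
    check_receipt(Receipt(root=b"\x11" * 32, status=True))
    rejected = False
except ValueError:
    rejected = True
```

Whether such a check lives in the application or is passed into the serialization library as a callback, the serialized bytes and the Merkle root of a valid receipt are the same.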

Renamed from PartialContainer to StableContainer to highlight the stable serialization and merkleization across spec versions.

Also, simplified the specification by getting rid of the outer T and instead directly referring to the container in those situations. The Python code was also updated to no longer require the extra @dataclass layer. Serialization, merkleization, semantics and so on did not change at all.

StableContainer looks like a good idea to address the issues outlined in the Motivation section.
However, it seems to me that introducing the Optional type is a separate concern.
Doesn’t StableContainer alone solve the mentioned issues?

I.e., adding a field in a new container version would be straightforward.
Removing a field (or rather deprecating it) would just mean changing its type to something like Nothing.

Moreover, with a Nothing type introduced, e.g.

class Example(StableContainer[4]):
    a: uint64
    b: uint64

could be represented via a regular container:

class Example(Container):
    a: uint64
    b: uint64
    f2: Nothing
    f3: Nothing

and then removing the field a and adding a new field c would look like

class Example(Container):
    f0: Nothing
    b: uint64
    c: uint16
    f3: Nothing

Would that approach suit the needs?

Yeah, your example is indeed how those messages would get serialized.

Passing N to the StableContainer[N] essentially fills it up with Nothing up through N total fields. Setting N in the type metadata reduces verbosity.

Using Optional[T] instead of Nothing ensures that messages of old style (for example, in your case, messages containing a) can still be parsed. This also allows for use cases that have conditional fields, for example, as highlighted in EIP-6493, where a check_transaction_supported function defines valid combinations.
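The following toy sketch shows why optional fields let old-style and new-style messages share one schema. It assumes a simplified encoding (a bitvector of active fields followed by the present fixed-size values, little-endian) and a hypothetical merged schema with fields a, b and c; it is not a full SSZ implementation and does not handle variable-size fields.

```python
from typing import Dict, List, Optional, Tuple

N = 4  # capacity, as in StableContainer[4]

# Hypothetical merged schema: (field name, size in bytes). Slot 3 is still unused.
SCHEMA: List[Tuple[str, int]] = [("a", 8), ("b", 8), ("c", 2)]
PREFIX_LEN = (N + 7) // 8  # bytes needed for an N-bit bitvector


def encode(values: Dict[str, Optional[int]]) -> bytes:
    """Bitvector of active fields, followed by the values that are present."""
    bits = 0
    body = b""
    for i, (name, size) in enumerate(SCHEMA):
        value = values.get(name)
        if value is not None:
            bits |= 1 << i
            body += value.to_bytes(size, "little")
    return bits.to_bytes(PREFIX_LEN, "little") + body


def decode(data: bytes) -> Dict[str, Optional[int]]:
    bits = int.from_bytes(data[:PREFIX_LEN], "little")
    if bits >> len(SCHEMA):
        raise ValueError("bit set for a field this schema does not know")
    pos = PREFIX_LEN
    out: Dict[str, Optional[int]] = {}
    for i, (name, size) in enumerate(SCHEMA):
        if bits & (1 << i):
            if pos + size > len(data):
                raise ValueError("truncated message")
            out[name] = int.from_bytes(data[pos:pos + size], "little")
            pos += size
        else:
            out[name] = None
    if pos != len(data):
        raise ValueError("trailing bytes")
    return out


old_style = encode({"a": 1, "b": 2})  # message that still contains a
new_style = encode({"b": 2, "c": 3})  # a dropped, c added

print(decode(old_style))
print(decode(new_style))
```

With Nothing in place of Optional, the first message could no longer be parsed once a is retired; with optional fields, both decode against the same type and any "a is no longer allowed" rule becomes a validity check rather than a parsing failure.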

Yes, this is exactly the use case I’m a bit concerned about. It feels like it undermines SSZ’s static typing.

Effectively, instead of declaring dedicated classes for every transaction type, there is a single container with a number of Optional fields which need to be cross-validated.

Does it make sense to declare a separate SSZ type for every Tx type, like TransactionPayloadLegacy, TransactionPayloadEip2930, etc.?
I see that it would be more verbose and would probably require more boilerplate code to handle. But it would be more type safe and less error-prone, imo.

That looks to me like a separate concern. The most obvious way seems to be to wrap those types in a Union (or a StableUnion if one needs stable merkleization and tree paths).

Does it make sense to declare separate SSZ type for every Tx type

Overall, even if you have separate SSZ types, you’d still need additional validation to check invariants. For example, to check that the from address is correct. Or, to check that the blob_versioned_hashes match the blob in the wrapper. Or that the to field is present for the blob transaction (it’s optional without using blob). Or that the max_fee_per_gas is >= max_priority_fee_per_gas. Note you may also need matching TransactionSignatureXyz, and somehow check that it is compatible with the payload union.

Checking valid field combinations in EIP-6493 is ~20 lines, mostly to ensure that txns retain the limitations from RLP. It could be exhaustively tested. The full list of supported combos is:

  • TransactionPayloadReplayable → original format, not locked to chain ID (no SSZ equivalent)
  • TransactionPayloadLegacyRlp → optional to, no access list, no prio fee, no blob
  • TransactionPayloadEip2930Rlp → optional to, yes access list, no prio fee, no blob
  • TransactionPayloadEip1559Rlp → optional to, yes access list, yes prio fee, no blob
  • TransactionPayloadEip4844Rlp → required to, yes access list, yes prio fee, yes blob
  • TransactionPayloadLegacySsz → optional to, no access list, no prio fee, no blob
  • TransactionPayloadEip2930Ssz → optional to, yes access list, no prio fee, no blob
  • TransactionPayloadEip1559Ssz → optional to, yes access list, yes prio fee, no blob
  • TransactionPayloadEip4844Ssz → required to, yes access list, yes prio fee, yes blob
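The combinations listed above can be sketched as a small lookup table. This is only an illustration derived from that list; it is not the check_transaction_supported function from EIP-6493, and the names ALLOWED and classify are invented here.

```python
from itertools import product
from typing import Optional

# (has_access_list, has_priority_fee, has_blob) -> label, following the list above.
ALLOWED = {
    (False, False, False): "Legacy",
    (True, False, False): "Eip2930",
    (True, True, False): "Eip1559",
    (True, True, True): "Eip4844",
}


def classify(has_to: bool, has_access_list: bool,
             has_priority_fee: bool, has_blob: bool) -> Optional[str]:
    """Return the matching label, or None if the field combination is not supported."""
    label = ALLOWED.get((has_access_list, has_priority_fee, has_blob))
    if label == "Eip4844" and not has_to:
        return None  # `to` is required for blob transactions
    return label


# Exhaustively enumerate all 2**4 presence combinations of the four fields.
supported = [combo for combo in product([False, True], repeat=4)
             if classify(*combo) is not None]
print(len(supported), "of 16 field combinations are supported")
```

Because the input space is just a handful of presence bits, a check of this kind can be tested exhaustively.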

There’s the problem of combinatorial explosion. For example, if you want a transaction that has a blob but neither a priority fee nor an access list, and then another one wants no blob but wants a priority fee and no access list, and so on; that’s 8 additional different “tx types” for features that don’t have anything to do with each other. With future features such as multidimensional fees, CREATE2 transactions, different sig_hash mechanisms, and so on, one may want to move towards allowing the signer to pick the combo they want, instead of being forced to select a type that supports a superset of what’s needed and then having to work around it with empty lists and default values for all the features they don’t want, as is currently done in RLP.

Furthermore, you’d need some mechanism to transfer type information. For example, using an enum prefix similar to Union. However, that leads to a requirement for the verifier to know about all the enum cases and their meaning. Because new types may be introduced in the future, verifiers can’t become immutable. That’s the case even if they solely care about certain fields of the container; for example, only from, to, and value, and ignore all other fields. On the other hand, with StableContainer, that could be achieved with a followup proposal like a SparseView that includes just the bitvectors, the requested 3 fields, plus a merkle proof. The merkle proof shape is statically determinable solely by the bitvectors and the requested fields regardless of tx type, which is not necessarily given by a Union approach.
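To illustrate why the proof shape can be determined statically, the sketch below computes generalized indices for fields of a container with capacity N. It assumes a layout in which the root's left child is the root of the field subtree (padded to N leaves) and the right child commits to the active-fields bitvector; treat that layout, and the helper names, as assumptions of this sketch rather than a restatement of the spec.

```python
from typing import List


def next_pow2(n: int) -> int:
    """Smallest power of two that is >= n."""
    return 1 if n <= 1 else 1 << (n - 1).bit_length()


def field_gindex(capacity: int, field_index: int) -> int:
    """Generalized index of a field, with the root at 1 and the field subtree at 2."""
    if not 0 <= field_index < capacity:
        raise ValueError("field index out of range")
    return 2 * next_pow2(capacity) + field_index


def branch_gindices(gindex: int) -> List[int]:
    """Sibling nodes needed to prove `gindex` against the root, bottom-up."""
    siblings = []
    while gindex > 1:
        siblings.append(gindex ^ 1)
        gindex //= 2
    return siblings


# Field b of the Example container (index 1, capacity 4).
g = field_gindex(4, 1)
print(g, branch_gindices(g))
```

Under these assumptions the position of a field depends only on the capacity and the field's index, not on which other fields are present or on any transaction type, so a verifier that only cares about a few fields would not need to learn about new field combinations.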

About StableUnion, I would be interested to understand what you mean there and how it differs from StableContainer.

I’d also like to better understand the “more type safe and less error-prone” arguments. In practice, implementations likely go for a single implementation that handles all transactions. Then, for each feature, they check if it is used and, if it is, process it. The difference would be that with the TransactionPayloadXyz jungle you’d need a generics-based implementation that generates another copy of the code for each individual type (feature combination), while with the StableContainer approach you’d have a single function with runtime checks for all the features. Code size is smaller with the StableContainer, while the generics-based implementation can exclude certain invalid field combinations (the 20-line check in EIP-6493) in the serialization library rather than at its point of use. Code size may also have implications for ZK-logic-based verifiers.

Sorry - my mistake here. The present Union implementation is actually inherently ‘stable’. For some reason I thought it had a structure similar to Container.

Of course there are always a lot of semantics which can’t be expressed with types. But whenever it’s feasible, it’s always better to express semantics/constraints via static types when such a system is available (SSZ in our case).

Of course I don’t treat it as an immutable rule. If you are saying there is a ‘combinatorial explosion’, then it probably doesn’t make sense to ‘die hard’ sticking to strong typing. However, some things could be expressed in a more canonical way, in my opinion:

Obvious example:

    # EIP-4844
    max_fee_per_blob_gas: Optional[uint256]
    blob_versioned_hashes: Optional[List[VersionedHash, MAX_BLOB_COMMITMENTS_PER_BLOCK]]

Both fields are either present or absent. Ideally it would look like:

    # EIP-4844
    eip4844Data: Optional[Eip4844Data]

class Eip4844Data(StableContainer[N]):
    max_fee_per_blob_gas: uint256
    blob_versioned_hashes: List[VersionedHash, MAX_BLOB_COMMITMENTS_PER_BLOCK]

I’m not sure about the transaction representation. Teku has dedicated Java types for every hardfork version of every structure. That seriously helps to avoid shooting yourself in the foot when adding/changing processing logic across the whole codebase.
It’s worth mentioning that there is a type hierarchy as well (which is not applicable in the spec), which makes things a lot simpler.