EIP-6475: SSZ Optional

Discussion thread for EIP-6475: SSZ Optional

Background

Zahary’s notes: Consider possible improvements to the SSZ spec before phase0 is launched · Issue #1916 · ethereum/consensus-specs · GitHub

Potential use

EIP-4844 and EIP-6404, to represent a transaction’s to value. It can be None to denote deployment of a contract, or otherwise the transaction destination address.

Added tests and a reference implementation based on protolambda/remerkleable (Python) to the EIP.

Added Nim implementation: implement SSZ Optionals (EIP-6475) by etan-status · Pull Request #41 · status-im/nim-ssz-serialization · GitHub

One thing to note is that for the purpose of merkleization, this is equivalent to List[T, 1], and to Union[None, T]. It is just the serialization that is more compact than those workarounds, and the obvious ability to write more concise code, instead of switching on list lengths or union selectors.
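A minimal sketch of why the roots coincide, assuming a 20-byte address as T and the standard SSZ rule that both the length mix-in and the selector mix-in hash the root together with a uint256 little-endian number. The function names are hypothetical and are not taken from the EIP or from remerkleable.

```python
from hashlib import sha256
from typing import Optional

ZERO_CHUNK = b"\x00" * 32


def mix_in(root: bytes, number: int) -> bytes:
    # mix_in_length and mix_in_selector have the same shape:
    # hash(root + uint256 little-endian number).
    return sha256(root + number.to_bytes(32, "little")).digest()


def address_root(address: bytes) -> bytes:
    # A 20-byte vector fits in one chunk, so its root is the right-padded chunk.
    assert len(address) == 20
    return address.ljust(32, b"\x00")


def root_as_list(to: Optional[bytes]) -> bytes:
    # List[Address, 1]: mix the length (0 or 1) into the contents root.
    if to is None:
        return mix_in(ZERO_CHUNK, 0)
    return mix_in(address_root(to), 1)


def root_as_union(to: Optional[bytes]) -> bytes:
    # Union[None, Address]: mix the selector (0 or 1) into the value root.
    if to is None:
        return mix_in(ZERO_CHUNK, 0)
    return mix_in(address_root(to), 1)
```

The two functions end up identical because a length of 0 or 1 and a selector of 0 or 1 are mixed in the same way; only the byte-level serialization differs between the two types.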

[on comparison to union type] The serialization is less compact, due to the extra selector byte.

Honestly this seems a very weak argument for introducing a new type for something that is already supported. I feel like this is a bad idea because Unions were introduced for exactly this purpose.

Also, this kind of abuses the length bytes and only works in the specific case where the optional type is of fixed length.

SSZ Union are not “already supported” across the board; so far, they are not used in any final Ethereum specification. Certain libraries such as nim-ssz-serialization currently only fully implement the limited support needed for handling the Optional[T] workaround. There are also no official tests for SSZ unions.

If “unions were introduced for exactly this purpose”, they would most likely encode optimally for this purpose. However, they do not, and they are clearly a more powerful construct than just Optional[T]. The same design space as Union[None, T] is actually already offered by List[T, 1], which also encodes more compactly for fixed-length types, and actually is “already supported” and well tested.

Regarding “abuses the length byte”, I’m not sure what you are referring to, as SSZ does not explicitly encode a “length byte” anywhere. For example, in a List[T, N] of fixed-length type, the number of elements is implicitly derived from len(List) / sizeof(List[0]). For variable-length types, it is derived from first_offset / sizeof(uint32), where first_offset is the uint32 stored in the first four bytes, if len(List) > 0, or assumed to be 0 if len(List) == 0. The proposed Optional[T] SSZ type follows these same conventions.
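As a rough illustration of how a decoder recovers the element count without any explicit length field, here is a sketch for both cases. The helper names are hypothetical, and the validation is deliberately minimal.

```python
OFFSET_SIZE = 4  # SSZ offsets are uint32, little-endian


def count_fixed_size_elements(data: bytes, element_size: int) -> int:
    # Fixed-length elements: the count follows from the total byte length.
    if len(data) % element_size != 0:
        raise ValueError("length is not a multiple of the element size")
    return len(data) // element_size


def count_variable_size_elements(data: bytes) -> int:
    # Variable-length elements: the serialization starts with one offset per
    # element, and the first offset points just past the offset table.
    if len(data) == 0:
        return 0
    if len(data) < OFFSET_SIZE:
        raise ValueError("too short to contain an offset")
    first_offset = int.from_bytes(data[:OFFSET_SIZE], "little")
    if first_offset == 0 or first_offset % OFFSET_SIZE != 0 or first_offset > len(data):
        raise ValueError("invalid first offset")
    return first_offset // OFFSET_SIZE
```

For example, 24 bytes of uint64 data yield 3 elements, and a variable-length list whose first offset is 8 holds 2 elements.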

The proposed Optional[T] works not just for fixed-length types, but also for most variable-length types such as Bitlist[N], Union[type_0, type_1, ...], Container with variable-length members, and Vector with variable-length members. As proposed, the only types that cannot be nested on the very next layer are List[T, N] and Optional[T], which both already have a natural way to denote absence (len = 0 and None). The concept of Illegal types already exists in SSZ. If needed, the format can be accordingly extended in the future.

Indeed, since the SSZ unions haven’t been used in any production spec yet, the Optional type described here can also be framed as a special-case optimization for the Union type. This was my original proposal (please see point 2 in the linked issue).

To put this in context, a blob transaction uses 131,072 bytes for the blob alone (assuming a single blob), and this change saves one byte (less than 0.001%).
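The figure can be checked with a few lines of arithmetic. The constant names below follow EIP-4844 (4096 field elements of 32 bytes each per blob); the variable names are otherwise just for illustration.

```python
FIELD_ELEMENTS_PER_BLOB = 4096
BYTES_PER_FIELD_ELEMENT = 32

# Size of a single blob in bytes.
blob_size = FIELD_ELEMENTS_PER_BLOB * BYTES_PER_FIELD_ELEMENT

# Relative saving of dropping one selector byte, in percent of the blob size.
saving_percent = 1 / blob_size * 100

print(blob_size)
print(f"{saving_percent:.5f}%")
```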

Point being, serialization and merkleization of SSZ Unions are still being discussed and not properly tested, as they are a new SSZ type not currently used in any finalized spec.

Note: the precise definition of Union is still a topic of discussion. Union is currently not yet used in consensus, and so there remains freedom to decide exactly how Union types are to be hash-tree-rooted and serialized. There are different approaches being proposed that attempt to maximize efficiency and simplicity.

For EIP-4844, using something simpler like the Optional[Address] proposed here, or even just plain List[Address, 1] (actually the exact same serialization and merkleization for fixed-length types), reduces the complexity of requiring an entire union framework to just represent an optional address. Yes, it also shaves off that single byte, in either List or Optional case, as a side effect.

This was discussed in today’s EIP-4844 breakout call.

Decision on this was postponed until we know whether SSZ Unions will actually be used in their full scope. If they are, their design regarding serialization and merkleization will have to be finalized, and exhaustive tests for them added. On the other hand, if we decide that SSZ Unions are not needed at this time, EIP-4844 could move to List[T, 1] or to Optional[T].

as a process note, this type of thing would make a lot more sense as a PR to the consensus-specs, rather than an EIP

if we move ahead with this change, we will want to recreate as a spec PR anyway so the specs for SSZ all live in one place

Agree that a consensus-specs PR is also warranted. Furthermore, if tomorrow’s SSZ breakout call reveals that we actually end up with a situation where full Union support is required, and not just the Optional subset, updates to the Union spec should be discussed as well to gain the same optimizations from @zah.

Namely that the union’s None case serializes without the selector (no downside), and potentially also that the [None, T] case serializes without the selector (no List / nullable Union at the very next layer). Optional and Union[None, T] ideally have the same properties in a world containing union, so Optional becomes just syntactic sugar to make it explicit that there is no intention to expand it to Union[None, T, U] later on, and to allow language features such as .isSome checks, similarly to how we have sugar for ByteVector etc.

In the other case, if we determine that full Union support is not needed at this time (for SSZ transactions discussion), I would advocate to only depend on the actually used Optional subset.

BTW, if there is any default Address constant that could be used to represent the None case, I would prefer that one over either Optional or Union. But I suspect that Optional support is actually really becoming necessary.

consensus-specs PR: EIP-6475: Add SSZ `Optional[T]` type by etan-status · Pull Request #3336 · ethereum/consensus-specs · GitHub

Of the 3 possible implementations being proposed, I think the List<T, 1> is the most viable at the moment, and it would be reasonable to put syntactic sugar around that to call it an optional. Implementors would be free to either exploit the Optional wrapping it, or continue to use it as a List that requires only 1 iteration.

I think the EIP is ambiguous on how Optional<T> intends to serialize an empty Optional. Serialization of None is undefined in the SSZ spec; however, in my spelunking of discussions around SSZ, it seems the intention is that None should be written as an empty list: two consecutive offsets to the same place. Moreover, that seems synonymous with treating an empty Optional as a variable-length encoding of b"", as the proposed Python serialization suggests. If I am correct, then there really is no difference other than naming this Optional, and either way we have chosen to implement this as List<T,1>.

Regarding the Union implementation from a tactical perspective, I find the “unions aren’t ready yet” argument much more compelling than the “unions add size overhead” argument. Lists are here and ready, and it would be trivial to implement an Optional based on list length.

Alternately, from a design perspective, there is also a bit more semantic clarity in using the Union. Many programming languages (Swift, Kotlin, Rust…) have implemented Optionals in this way, and so developers should find that a bit more intuitive. Using Lists would be more hacky from a design perspective, and that design wart will live on forever.

Yes, for fixed-length inner types, the proposed serialization matches the one of List<T,1>.

However, for variable-length T, List<T,1> emits an offset-entry, which is not necessary in the optional case as the length can be implied from the full list length - if it is serialized as b"", it is None, otherwise it is Some(T).
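To make the size difference concrete, here is a sketch comparing the three encodings for a value that has already been serialized. It assumes the unprefixed Optional encoding described in this post and the Union encoding with a one-byte selector; all function names are hypothetical.

```python
from typing import Optional


def serialize_optional_unprefixed(inner: Optional[bytes]) -> bytes:
    # Encoding described in this post: nothing for None, the bare value for Some.
    return b"" if inner is None else inner


def serialize_list_of_one(inner: Optional[bytes], fixed_size: bool) -> bytes:
    # List[T, 1]: variable-length elements are preceded by a 4-byte offset.
    if inner is None:
        return b""
    if fixed_size:
        return inner
    return (4).to_bytes(4, "little") + inner


def serialize_union(inner: Optional[bytes]) -> bytes:
    # Union[None, T]: one selector byte, then the value for Some.
    return b"\x00" if inner is None else b"\x01" + inner


address = bytes(20)        # a fixed-length value
payload = b"\xaa\xbb"      # stands in for some variable-length value

print(len(serialize_optional_unprefixed(address)),
      len(serialize_list_of_one(address, fixed_size=True)),
      len(serialize_union(address)))
print(len(serialize_optional_unprefixed(payload)),
      len(serialize_list_of_one(payload, fixed_size=False)),
      len(serialize_union(payload)))
```

For the fixed-length case the Optional and List forms have the same length; for the variable-length case the List form carries four extra bytes and the Union form one.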

For merkleization, Optional[T], List<T,1> and Union<None, T> are all the same, with the same hash_tree_root. Only serialization is affected.

Conceptually, an Optional type (however it is serialized / merkleized) is a useful concept (see EIP-4844 and the Verkle effort). It makes it clear to the reader what is meant. With List<T, 1> it needs to be accessed like an array, and with Union<None, T> it is unclear whether the intent is for this to grow to a Union<None, T, U> in the future. At the very least, it should be defined as a typealias and recommended for use by the protocol, so that the underlying intent can be appropriately represented in higher-level languages (e.g., Swift ?. / !., C# ??, Nim valueOr and so on).
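A small sketch of the readability point, using hypothetical transaction classes that are not part of any spec: the Optional form states the intent directly, while the List form makes the reader infer it from a length.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TxWithOptional:
    to: Optional[bytes]  # None means contract deployment


@dataclass
class TxWithList:
    to: List[bytes]  # empty means contract deployment; at most one entry


def is_deployment_optional(tx: TxWithOptional) -> bool:
    # The absence check reads the same way the protocol describes it.
    return tx.to is None


def is_deployment_list(tx: TxWithList) -> bool:
    # The same check expressed as a list length.
    return len(tx.to) == 0
```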

If there is a need for nested Optional[Optional[...]], or Optional[List[T, N]] (note that empty lists usually already represent optional), the Union serialization could help disambiguate between None and Some(None) / Some(List[T, N]()). If those constructs are not needed, the proposed format based on @zah's suggestion is a bit more compact, and gets rid of the parsing complexity of “what if someone sends an unknown selector”. In any case, for unions as well as optionals, the None branch could be serialized as b"" instead of b"\0" without loss of generality (so, only the Some branch would have a prefix).

Ultimately, point being, what is needed for new features are Optionals, not Unions, at this time.

When you say b"", are you intending that to be read as Python for List[uint8, 0]? Because if not, then I am left to interpret it as “don’t write anything to the stream”. In that second case, we would be without a marker for the Optional[T], and I still don’t understand how an empty Optional[T] could be serialized. Would you mind specifically addressing the second paragraph in my prior comment? I think that is the only thing holding me back from being fully on board with a new unique SSZ type that is not simply an alias for List[T, 1] or Union[T, None].

Yes, the current proposal is “don’t write anything to the stream” for the None case, and to restrict Optional[T] to types T that can never encode as empty data. There are only two cases that can encode as empty, namely:

  1. A nested Optional (but, Optional[Container[Optional[T]]] is alright if truly needed).
  2. A List[T, N] (can use 0 to denote no elements present, and Optional[Container[List[T, N]]] is alright if truly needed).
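The restriction can be illustrated with a short sketch, assuming the unprefixed encoding (nothing for None, the bare value for Some) and a list of bytes as the inner type. The helper names are hypothetical.

```python
from typing import List, Optional


def serialize_byte_list(items: List[int]) -> bytes:
    # List[uint8, N]: the elements back to back; an empty list is b"".
    return bytes(items)


def serialize_optional_unprefixed(inner: Optional[bytes]) -> bytes:
    # Nothing for None, the bare inner serialization for Some.
    return b"" if inner is None else inner


def serialize_single_field_container(field: bytes) -> bytes:
    # A container with one variable-length field: a 4-byte offset, then the field.
    return (4).to_bytes(4, "little") + field


none_case = serialize_optional_unprefixed(None)
some_empty_list = serialize_optional_unprefixed(serialize_byte_list([]))
some_wrapped = serialize_optional_unprefixed(
    serialize_single_field_container(serialize_byte_list([]))
)

# none_case and some_empty_list are both b"", so a decoder cannot tell them
# apart; wrapping the list in a container makes the Some case non-empty.
print(none_case == some_empty_list, some_wrapped)
```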

Note that there are similar “illegal types” for others:

  • Empty vector types (Vector[type, 0], Bitvector[0]) are illegal.
  • Containers with no fields are illegal.
  • The None type option in a Union type is only legal as the first option (i.e. with index zero).

If it is a concern to disallow Optional[Optional[T]] and Optional[List[T,N]], it could be minimally changed to encode as: b"" (empty) for None, and b"\x01" + serialize(value) for Some.

Or, if the current (unused) Union encoding is to be applied, None would encode as b"\0" instead of b"", for consistency. Or, the Union encoding could be updated to encode the None case as b"" as well.

From discussion at today’s AllCoreDevs call (Execution Layer Meeting 160 · Issue #759 · ethereum/pm · GitHub), an additional change request was made:

  • None case should remain b"".
  • Some case should be prefixed with b"\x01" to allow Optional[Optional] and Optional[List], and for compatibility with implementations that wish to implement Optional as a special case of Union.
  • Union (currently unused) should change the None case to b"" as well.
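The requested encoding can be sketched as follows. This is an illustration over already-serialized inner values, with hypothetical function names, and is not the EIP's reference implementation.

```python
from typing import Optional


def encode_optional(inner: Optional[bytes]) -> bytes:
    # None stays empty; Some is prefixed with a single 0x01 byte.
    return b"" if inner is None else b"\x01" + inner


def decode_optional(data: bytes) -> Optional[bytes]:
    # Empty input means None; otherwise the first byte must be 0x01.
    if len(data) == 0:
        return None
    if data[0] != 1:
        raise ValueError("unexpected prefix for Optional")
    return data[1:]


# Nested optionals are now distinguishable:
outer_none = encode_optional(None)                         # None
some_none = encode_optional(encode_optional(None))         # Some(None)
some_some = encode_optional(encode_optional(b"\xff"))      # Some(Some(0xff))
```

Because the Some branch is never empty, an inner type that may itself serialize to nothing (another Optional, or an empty List) no longer collides with the outer None.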

Changes will be discussed in EIP-4844 meeting: EIP-4844 Implementers' Call #21 · Issue #760 · ethereum/pm · GitHub

Changes have been implemented:

An alternative scheme to the Optional could be EIP-7495: SSZ PartialContainer

More flexible and even more compact, but needs to be attached to the surrounding container.

It seems ideal for the purpose of EIP-6493: SSZ Transaction Signature Scheme - the only other location where EIP-6475 optional is still considered is TheVerge: spec draft by gballet · Pull Request #3230 · ethereum/consensus-specs · GitHub which could also be represented using PartialContainer.