EIP-5450: EOF - Stack Validation

gumb0 · August 17, 2022, 9:14am

This is the discussion topic for EIP-5450: EOF - Stack Validation

gcolvin · August 17, 2022, 7:34pm

Very nice, thanks!

I have of course been working on the same validation constraints since EIP-615 and EIP-2315.

So far my only problem reading this is that I could not find a definition of “stack height” in the EIP. Eventually I thought to look in EIP-5440, where it is used in code snippets but still not defined. I dug a little further in previous EIPs but gave up for now. My Python is getting better, but it was too much work trying to find the relevant code. So I think this problem is deeper than this one EIP. The concept can be expressed in a sentence or two of English.

gcolvin · August 25, 2022, 1:47pm

Finally got back to this. In 4750 I had missed (my emphasis):

A return stack is introduced, separate from the data stack. It is a stack of items representing execution state to return to after function execution is finished. Each item is comprised of: code section index, offset in the code section (PC value), calling function stack height.

Clear enough in the context of 4750’s explicit spec. A reminder that the height is relative to function entry would help here – which is what confused me.
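
To make the three-field return stack item concrete, here is a minimal sketch. The class name `ReturnFrame`, the helper `relative_height`, and the example numbers are hypothetical, and the sketch assumes the saved height is the caller's data stack height just below the callee's arguments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReturnFrame:
    code_section_index: int  # section to resume in after the callee returns
    offset: int              # PC value to resume at
    stack_height: int        # caller's data stack height recorded at the call


def relative_height(absolute_height: int, frame: ReturnFrame) -> int:
    """Height as seen from inside the callee, measured from function entry.

    Assumption: frame.stack_height is the data stack height just below the
    callee's arguments, so the callee's own view starts counting from there.
    """
    return absolute_height - frame.stack_height


return_stack: list[ReturnFrame] = []
# Hypothetical call: 5 items on the data stack, the callee takes 2 of them.
return_stack.append(ReturnFrame(code_section_index=0, offset=17, stack_height=5 - 2))
entry_height = relative_height(5, return_stack[-1])
```

Under that assumption the callee sees a height equal to its number of inputs at entry, which is the sense in which the height is relative to function entry.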

gcolvin · August 25, 2022, 2:39pm

These might be more serious issues, but I don’t fully understand the code.

I don’t see that you add CALLF to the worklist in the validation loop, which it seems will prevent you from traversing every possible path.

I also don’t see that you deprecate JUMP here, or in EIP-5450 or EIP-3670. And I don’t see that you handle them in the validation code. It seems that if you don’t handle them you can’t traverse every path, but if you do handle them you will suffer a quadratic path explosion.

gumb0 · August 25, 2022, 3:27pm

Thanks for the feedback. I’m adding a clarification about stack height in ethereum/EIPs Pull Request #5535, “EIP-5450: Add rationale, more clarifications and update authors”.
I can see how it might be a bit confusing, because the stack height in EIP-5450 is a validation-time calculation (and yes, relative to function entry), while in EIP-4750 it refers to the actual runtime value.

Each function section is validated independently, so validate_function(func_id, code, types) will be called in a loop for each section.
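
A rough sketch of that per-section loop, not the EIP's reference code. It assumes a toy instruction format of `(name, argument)` tuples, straight-line code only, and a hypothetical `STACK_EFFECT` table; `validate_all` and `FunctionType` are illustrative names. The point it shows is that CALLF is handled through the callee's declared type rather than by traversing into the callee.

```python
from typing import NamedTuple


class FunctionType(NamedTuple):
    inputs: int
    outputs: int


# Hypothetical toy instruction table: name -> (items popped, items pushed).
STACK_EFFECT = {"PUSH": (0, 1), "ADD": (2, 1), "POP": (1, 0)}


def validate_function(func_id, code, types):
    """Check one section of straight-line toy code; return True if valid.

    Height is counted from function entry, where it equals the declared inputs.
    """
    height = types[func_id].inputs
    for op, arg in code:
        if op == "CALLF":
            # The callee is not traversed here; only its declared type is used.
            popped, pushed = types[arg].inputs, types[arg].outputs
        elif op == "RETF":
            return height == types[func_id].outputs
        else:
            popped, pushed = STACK_EFFECT[op]
        if height < popped:
            return False  # stack underflow relative to function entry
        height += pushed - popped
    return False  # fell off the end of the section without RETF


def validate_all(sections, types):
    """Validate every section independently, one call per section."""
    return all(
        validate_function(func_id, code, types)
        for func_id, code in enumerate(sections)
    )


types = [FunctionType(0, 0), FunctionType(2, 1)]
sections = [
    [("PUSH", 1), ("PUSH", 2), ("CALLF", 1), ("POP", None), ("RETF", None)],
    [("ADD", None), ("RETF", None)],
]
ok = validate_all(sections, types)
```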

This is true; this EIP assumes JUMP* is deprecated. I think it’s better to have a separate tiny EIP for deprecating them, but this is not done yet.

gcolvin · August 25, 2022, 4:36pm

Thanks for clearing that up. The Yellow Paper mostly assumes the reader knows what a stack is in the text, and uses mathematical notation otherwise. E.g., ADD is defined as $\mu'_s[0] \equiv \mu_s[0] + \mu_s[1]$.

I remember now that EIP-615 also validates functions independently. I now fear that – because functions can consume stack – nested calls can cause underflows that won’t be detected that way.
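
One way to probe this concern is a small sketch in which a function body consists only of calls, each described by the callee's declared `(inputs, outputs)`. The function name `check_call_chain` and the example types are hypothetical. It assumes every call site is checked against the callee's declared inputs, so a caller that cannot supply them is rejected while validating the caller itself, without looking inside the callee.

```python
def check_call_chain(entry_inputs, calls):
    """Walk a toy function body made only of calls.

    entry_inputs: the function's own declared input count.
    calls: list of (inputs, outputs) pairs, one per callee, in call order.
    Returns the final height relative to function entry, or raises
    ValueError at the first call site that would underflow.
    """
    height = entry_inputs
    for index, (inputs, outputs) in enumerate(calls):
        if height < inputs:
            raise ValueError(
                f"call {index} needs {inputs} items but only {height} are available"
            )
        height += outputs - inputs
    return height


# A caller entered with 2 items can call a (2 -> 1) function, then a (1 -> 0) one.
final_height = check_call_chain(2, [(2, 1), (1, 0)])
```

Whether this local check is sufficient for the real instruction set depends on every function also respecting its own declared type, which is what the per-function validation is meant to establish.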

gcolvin · August 25, 2022, 4:51pm

A hunch I haven’t time to follow: I think that since you are validating the use of the return stack you will not need to keep the code_section_index and stack_height on the return stack, only the offset.

gcolvin · November 2, 2022, 6:26am

The current validation algorithm ignores unreachable instructions. The algorithm can be extended to reject any code having any unreachable instructions but additional instructions traversal is needed (or more efficient algorithm must be developed).

Start with a bitset of zeros for every byte in the code. For each instruction traversed during validation, set the bit(s) in the set corresponding to the byte offset(s) of the instruction. If there are any zero bits remaining in the set, there are unreachable instructions.
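
The bitset idea above can be sketched as follows. The function name `find_unreachable` and the `(offset, length)` representation of visited instructions are assumptions for illustration; a Python integer stands in for the bitset.

```python
def find_unreachable(code_len, visited_instructions):
    """Return byte offsets never covered by a visited instruction.

    visited_instructions: iterable of (offset, length) pairs, one for each
    instruction the validation traversal reached (opcode plus immediates).
    """
    bits = 0  # bit i is set once byte i belongs to a visited instruction
    for offset, length in visited_instructions:
        bits |= ((1 << length) - 1) << offset
    return [i for i in range(code_len) if not (bits >> i) & 1]


# Hypothetical 6-byte section: bytes 3 and 4 were never reached.
unreachable = find_unreachable(6, [(0, 2), (2, 1), (5, 1)])
```

An empty result means every byte was covered by some traversed instruction.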

gcolvin · April 22, 2025, 5:16pm

This “discussions to” thread looks pretty dead, but…

In the motivation you state that

Single pass transpilation passes can be safely executed with the code validation and advanced stack/register handling can be applied with the stack height validations.

But I don’t see that the invariants given in the “Properties of validated code” section actually prove that. I’m also not sure what the one-pass algorithm is for traversing EVM code. It should be simpler than the validation algorithm, but it looks to be not as simple as the standard depth-first search of directed cyclic graphs.
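
For reference, the standard depth-first search mentioned here can be sketched as below. This is the textbook worklist traversal over a hypothetical successor map, not an algorithm taken from the EIP; the function name `dfs_order` and the example graph are illustrative.

```python
def dfs_order(successors, entry=0):
    """Visit each reachable node of a possibly cyclic graph exactly once.

    successors: dict mapping a node to the list of nodes it can jump to.
    Returns the nodes in the order they were first visited.
    """
    visited = set()
    order = []
    worklist = [entry]
    while worklist:
        node = worklist.pop()
        if node in visited:
            continue  # already handled, which also breaks cycles
        visited.add(node)
        order.append(node)
        # Push in reverse so the first listed successor is visited first.
        worklist.extend(reversed(successors.get(node, [])))
    return order


# Hypothetical control flow graph with a cycle between blocks 1 and 3.
cfg = {0: [1, 2], 1: [3], 2: [3], 3: [1]}
order = dfs_order(cfg)
```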