Compilation is clearly beneficial for direct (CPU/GPU) execution.
For ZKP, it’s more tricky. SNARKs do not prove the program execution directly, but rather prove knowledge of the program execution trace (on some input).
Let’s assume f(x)
is our program, x
is some input to the program, w
is the program execution trace (in some form). There is also some circuit C(x, w, y)
to check the trace validity, i.e. C(x, w, f(x))
should output true
for valid traces.
A SNARK prover typically proves satisfiability of such circuit (and that it knows the w
witness), which is existentially quantified (due to the additional parameter w
) and (in case of proving program execution) uniform (inputs/traces of different lengths are served by the same circuit).
So, it’s drastically different from the direct execution. The difference is critical for performance, as you can transform evaluation (sub)problems to circuit satisfiability (sub)problems, which might be much cheaper to prove (e.g. for proving 1 / a
operation one has to provide some b
and check that a * b == 1
, while calculating the inverse directly will require proving much more intermediate steps).
Moreover, since the execution trace is available to prover it can select a cheaper implementation option, depending on the concrete trace content. E.g. a value can be u256
in general, but a particular occurrence can fit byte, so its properties/operations can be proved using less steps than the full u256
value.
So, the ability to inspect the trace can largely compensate absence of compilation. Static analyses/optimizations can still benefit ZKP (a call site can be recognized statically as monomorphic, so one can skip proving branch decisions). However, such optimizations are not that trivial to implement in practice:
- one need to withstand “JIT bombs” attacks
- one should prove somehow that such optimizations preserve semantics
With that said, I generally agree that compilation is the right direction. However, in the case of ZK proving of EVM code execution it’s not really clear whether compilation worth implementing given the associated costs. The EVM+EOF would definitely help here (e.g. static analysis, separate validation).