Clarifying PQ Verification on EVM: We’re Mixing Enforcement Models
Reading recent comments (@paulangusbark, @rdubois-crypto, @SirSpudlington), I think we’re unintentionally conflating fundamentally different enforcement lanes under one “PQ on EVM” umbrella.
Until we separate these lanes explicitly, gas comparisons become structurally misleading.
This isn’t about Falcon vs Dilithium vs ML-DSA.
It’s about architecture.
The Core Problem
Right now in the thread, we’re mixing:
- Falcon precompile discussions
- Dilithium (EIP-8051) conversations
- ML-DSA Solidity POCs
- ZK-from-PQ constructions
- AA-layer validation flows
These are not the same enforcement model.
Yet they’re often benchmarked or discussed as if they were.
Three Distinct Enforcement Lanes
L1 Native Solidity Verification
(Measured Upper Bound — Not a Deployment Claim)
This lane answers one question:
What is the ceiling if Ethereum provides zero protocol support?
Example (measured):
- ML-DSA-65 verify() POC in pure Solidity → 68,901,612 gas
- Inner primitive (PreA compute_w_fromPacked_A_ntt) → 1,499,354 gas
This is obviously not mainnet-viable.
But now we have a quantified upper bound.
That number turns abstract arguments about “precompile savings” into measurable deltas.
If Falcon512 verifies in ~7–15M gas natively, that’s useful data too.
But this lane must be explicitly tagged:
enforcement_lane = L1_native_upper_bound
Otherwise people interpret these numbers as deployment proposals instead of architectural stress tests.
L1 Realistic Enforcement Today
ZK-from-PQ
On Ethereum mainnet today, enforceable PQ typically means:
- PQ signature verified off-chain
- ZK proof generated
- Ethereum verifies the proof
In this case, the benchmark is:
proof_verification + calldata
This is not PQ signature verification.
It is proof verification.
This lane must be tagged:
enforcement_lane = L1_ZK_from_PQ
Otherwise we compare native-PQ gas to ZK-proof gas — which are entirely different primitives.
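To make the shape of this lane's benchmark concrete, here is a minimal sketch of `proof_verification + calldata` as a cost model. It assumes the standard calldata schedule of 16 gas per non-zero byte and 4 gas per zero byte (EIP-2028), and it deliberately ignores the base transaction cost and any calldata floor pricing. The function names and the `proof_verification_gas` parameter are hypothetical; the verifier cost has to come from a real measurement of whichever proof system is used.

```python
def calldata_gas(data: bytes) -> int:
    """Calldata cost under the 16 / 4 gas-per-byte schedule (assumed, EIP-2028)."""
    nonzero = sum(1 for b in data if b != 0)
    zero = len(data) - nonzero
    return 16 * nonzero + 4 * zero


def zk_lane_gas(proof_verification_gas: int, calldata: bytes) -> int:
    """Lane benchmark: on-chain proof verification plus the calldata that carries it.

    proof_verification_gas is a measured input, not something this sketch derives.
    """
    if proof_verification_gas < 0:
        raise ValueError("proof_verification_gas must be non-negative")
    return proof_verification_gas + calldata_gas(calldata)
```

Note that nothing in this model depends on which PQ scheme was verified off-chain, which is exactly why the resulting number cannot be read as a PQ verification cost.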
L2 / Protocol-Native PQ
If PQ verification is integrated as:
- a precompile
- a system contract
- a protocol primitive
then gas becomes meaningful in a deployment sense.
This lane should be tagged:
enforcement_lane = protocol_native
This is where real architectural optimization happens.
But without knowing the upper bound from the L1 native lane, we can’t quantify what a precompile actually saves.
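As a rough illustration of how the upper bound turns into a measurable delta, the sketch below computes the absolute and relative saving of a precompile against the native figure. The 68,901,612 gas constant is the ML-DSA-65 `verify()` measurement quoted above; the function name is hypothetical, and any precompile price passed in is a placeholder to be replaced by an actual proposal or measurement.

```python
from typing import Tuple

# Measured ML-DSA-65 verify() POC in pure Solidity (figure from this post).
NATIVE_UPPER_BOUND_GAS = 68_901_612


def precompile_savings(upper_bound_gas: int, precompile_gas: int) -> Tuple[int, float]:
    """Return (gas saved, fraction of the upper bound saved).

    precompile_gas is an input supplied by the caller; this sketch does not
    assume or propose any particular precompile price.
    """
    if upper_bound_gas <= 0:
        raise ValueError("upper_bound_gas must be positive")
    if precompile_gas < 0:
        raise ValueError("precompile_gas must be non-negative")
    saved = upper_bound_gas - precompile_gas
    return saved, saved / upper_bound_gas
```

Calling `precompile_savings(NATIVE_UPPER_BOUND_GAS, candidate_price)` for each candidate price gives a delta that is comparable across proposals, provided they all reference the same upper bound.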
Why This Separation Matters
Without explicit enforcement tagging, we risk:
- Comparing native-PQ gas to ZK-proof gas
- Confusing stress-tests with deployment targets
- Optimizing for the wrong architectural layer
- Mixing AA-surface validation with protocol-level primitives
- Turning “gas numbers” into misleading signals
The disagreement in this thread isn’t about algorithms.
It’s about which enforcement lane we’re implicitly assuming.
Proposal: Minimal Structural Metadata
If we want comparable PQ benchmarking across Falcon, Dilithium, ML-DSA, hybrids, etc., every benchmark row should include:
- surface (ERC-1271 / validateUserOp / protocol)
- wiring_lane (FIPS-SHAKE / Keccak / hybrid)
- enforcement_lane (L1_native_upper_bound / L1_ZK_from_PQ / protocol_native)
- optionally key_storage_assumption (software_resident / TPM/HSM compatible)
Only then are comparisons structurally honest.
Without this metadata, we’re benchmarking architectures, not algorithms — but labeling them as algorithm comparisons.
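One possible shape for such a row, sketched as a small Python schema. The field names and lane values come from the list above; the class names, the `scheme` and `gas` fields, and the `comparable` rule (two rows are comparable only if surface, wiring lane, and enforcement lane all match) are my assumptions, not an agreed format.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class EnforcementLane(Enum):
    L1_NATIVE_UPPER_BOUND = "L1_native_upper_bound"
    L1_ZK_FROM_PQ = "L1_ZK_from_PQ"
    PROTOCOL_NATIVE = "protocol_native"


@dataclass(frozen=True)
class BenchmarkRow:
    scheme: str                      # e.g. "ML-DSA-65" (hypothetical field)
    gas: int                         # measured gas (hypothetical field)
    surface: str                     # "ERC-1271" | "validateUserOp" | "protocol"
    wiring_lane: str                 # "FIPS-SHAKE" | "Keccak" | "hybrid"
    enforcement_lane: EnforcementLane
    key_storage_assumption: Optional[str] = None  # optional, as proposed


def comparable(a: BenchmarkRow, b: BenchmarkRow) -> bool:
    """Assumed rule: only rows sharing all three structural tags are comparable."""
    return (a.surface, a.wiring_lane, a.enforcement_lane) == (
        b.surface, b.wiring_lane, b.enforcement_lane)


# The gas figure is the measurement quoted in this post; the surface and
# wiring values are placeholders, not a statement about how it was measured.
native = BenchmarkRow("ML-DSA-65", 68_901_612, "ERC-1271", "FIPS-SHAKE",
                      EnforcementLane.L1_NATIVE_UPPER_BOUND)
# Placeholder row: the gas value here is NOT a measurement.
zk = BenchmarkRow("ML-DSA-65", 300_000, "ERC-1271", "FIPS-SHAKE",
                  EnforcementLane.L1_ZK_FROM_PQ)
```

With this in place, `comparable(native, zk)` is false even though both rows name the same scheme, so a harness could refuse to put them in the same table column.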
Clarification
To be explicit:
- Native L1 PQ verification is not viable today. That’s precisely why measuring it as an upper bound is useful.
- Measurement ≠ Deployment.
The goal isn’t to push ML-DSA on L1.
The goal is to make enforcement assumptions explicit.
Open to Alignment
If there’s interest, I’m happy to align on a minimal shared benchmark harness with explicit lane tagging.
The ecosystem doesn’t need competing benchmark threads.
It needs a structurally comparable one.