ERC-7812: ZK Identity Registry

UPDATE February 2025:

  1. The ERC-7812 has been merged as a draft!
  2. The contracts have been deterministically deployed on Mainnet and Sepolia. The registry is available at 0x781268D46a654D020922f115D75dd3D56D287812.

Hello Magicians, I am excited to share our latest work with you!

Abstract

This EIP introduces an on-chain registry system for storing abstract statements, where the state of the system can be proven in zero knowledge without disclosing anything about these statements. Developers may use the singleton EvidenceRegistry contract to integrate custom business-specific registrars for statement processing and proving.

Motivation

The standardization and aggregation of provable statements in a singleton on-chain registry significantly improves the reusability, scalability, and security of the many zero-knowledge, privacy-oriented solutions. The abstract specification of the registry allows custom identity-based, reputation-based, proof-of-attendance-based, etc., protocols to be implemented with minimal constraints.

This proposal lays an important foundation for specific solutions to build upon. More concrete specifications of statement and commitment structures are expected to emerge as separate, standalone EIPs.

Specification

Check out the full specification on GitHub:

The complete reference implementation can be found here.


Would love to see an insightful discussion rolling!

19 Likes

How might the ERC-7812 standard address interoperability challenges among different privacy-oriented protocols while maintaining strict data privacy guarantees?

1 Like

Hey, thanks for the question!

The EIP is designed to provide a unified provable on-chain storage for all the protocols that integrate with the EvidenceRegistry. For example, there may be a Registrar that manages the ICAO masters list (for on-chain national passport verification). Any protocol that wants to prove the validity of a passport will be able to reference the evidence registry and prove that one of the masters has signed a passport.

The specific business use cases and their implementation are outsourced to the registrars. This means there may be general-purpose "social", "passport", and "POAP" registrars that standardize how to manage and prove their data. We think that the behavior of registrars may even be described as separate EIPs.

Every registrar is expected to provide their own means of secure data proving (e.g. on-chain ZK verifier).

Hope that helps!

1 Like

Hey @Arvolear, that's very interesting! Let's connect; I would be happy to share some views and see whether I can contribute! :slightly_smiling_face:
I am working on the smart wallet 4337 angle.

1 Like

Hey @Arvolear
Is the SMT implementation limited to 64 levels max? That would imply some probability of key path collisions. The limit is probably big enough, but for a "public good" singleton it is something to consider.

Also, is there any mechanism intended for migrating existing statements from other protocols? :wink:

2 Likes

The SMT collision bits are algorithmically limited to 256 bits and will be set to 80 bits during deployment. We tested that the EVM throws runtime "stack too deep" errors at depths of 96 and more. To be clear, 80 bits is 2^80 ~= 1.209*10^24 elements, which is practically impossible to reach.

Existing protocols can potentially push their state commitments (e.g. IMT roots in the case of Semaphore) into the registry for later proving. Migrating the entire state will definitely be too expensive, so suggestions are warmly welcome.

1 Like

@Arvolear thanks for the clarification. There is an 80-level depth limit, which I overlooked.
But the number of nodes in the tree still looks limited to 2^64, according to some struct field types like uint64 nodesCount; uint64 childLeft; and uint64 childRight;
Am I missing something about how the SMT implementation works?

Yes, existing protocols can push their state commitments into the registry. It's easier to do that from the beginning, but it may be tricky to switch if a protocol already has its own SMT-based registry implementation up and running.
The solution for the latter case may be to introduce some constant timestamp after which the custom protocol starts writing to the EvidenceRegistry. The essential drawback is that the protocol becomes more complicated, with logic that "falls back" to the legacy SMT implementation, so we may need more solutions…

1 Like

Indeed. The uint64 nodesCount just slipped out of my mind.

So there are two limits:

  1. The overall nodes count of 2^64 ~= 1.84*10^19.
  2. The nodes collision limit of 80 bits prefix.

In a degenerate case (if no randomization is used) it is fairly easy to reach the second limit. One just needs to push two very similar keys to the tree. However, since every element is hashed (by the EvidenceDB._getIsolatedKey() function) before being inserted, I am not sure which one of the limits is more likely anymore.
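
To make the degenerate case concrete, here is a minimal Python sketch. It assumes the tree path is read from the least significant bit of the key, and it uses SHA-256 purely as a stand-in for the Poseidon-based EvidenceDB._getIsolatedKey(). Both are assumptions for illustration, and shared_low_bits and isolate are hypothetical helper names, not part of the reference implementation.

```python
import hashlib

MAX_DEPTH = 80  # prefix limit discussed above


def shared_low_bits(a: int, b: int) -> int:
    """Count how many low-order bits two 256-bit keys have in common."""
    diff = a ^ b
    if diff == 0:
        return 256
    # Isolate the lowest set bit of the difference and take its index.
    return (diff & -diff).bit_length() - 1


def isolate(key: int) -> int:
    """Stand-in for on-chain key hashing (SHA-256 here, NOT Poseidon)."""
    digest = hashlib.sha256(key.to_bytes(32, "big")).digest()
    return int.from_bytes(digest, "big")


# Two "very similar" raw keys: they differ only in bit 100.
k1 = 1
k2 = 1 + (1 << 100)

raw = shared_low_bits(k1, k2)
hashed = shared_low_bits(isolate(k1), isolate(k2))

print("raw keys share", raw, "bits; exceeds limit:", raw >= MAX_DEPTH)
print("hashed keys share", hashed, "bits; exceeds limit:", hashed >= MAX_DEPTH)
```

Without hashing, the two keys would need a path deeper than the 80-bit limit before they diverge. After hashing, the shared prefix is expected to be only a few bits long, which is why the degenerate case stops being easy to trigger on purpose.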

1 Like

The second limit should be more likely.

By the Birthday Problem formula P_col ≈ 1 − e^(−n(n−1)/(2T)), where n is the number of inserted nodes and T = 2^80 is the number of possible prefixes, there is a ~40% chance of at least one collision for n = 2^40. For n = 2^42 it is practically guaranteed.

But generally the SMT is fine: for 2^36 (~69 billion) nodes the collision probability is only 0.1951%, and it drops quickly with fewer nodes (roughly by a factor of 4 each time the node count is halved).
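
For anyone who wants to check these numbers, here is a small Python sketch of the approximation above. The function name collision_probability is made up for this example, and T = 2^80 is assumed from the 80-bit prefix limit discussed earlier.

```python
import math


def collision_probability(n: int, prefix_bits: int = 80) -> float:
    """Birthday-problem approximation: 1 - exp(-n(n-1) / (2T)), T = 2^prefix_bits."""
    t = 1 << prefix_bits
    # expm1 keeps precision when the probability is very small.
    return -math.expm1(-n * (n - 1) / (2 * t))


for exp in (36, 40, 42):
    p = collision_probability(1 << exp)
    print(f"n = 2^{exp}: P_col ~= {p:.4%}")
```

Running it should reproduce the figures quoted above: roughly 0.195% for 2^36, about 39% for 2^40, and above 99.9% for 2^42.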

1 Like

@Arvolear another question just came to my mind.
Have you considered not storing nodeHash in this struct, to save another 20K gas per node?

    struct Node {
        NodeType nodeType;
        uint64 childLeft;
        uint64 childRight;
        bytes32 nodeHash;
        bytes32 key;
        bytes32 value;
    }

Yes, this is something that the getProof function needs. However, are any on-chain clients expected to call this function?

Usually, whoever fetches proofs of existence/non-existence of a key needs to do it anonymously, so it's likely just an API call with no transaction. Thus, you can generate hashes for siblings right in the getProof function, or return keys/values/nodeTypes to generate proofs off-chain on some client.

@Arvolear if hashing is eliminated from the add, update, and remove node transactions, you may save even more gas.
E.g. the Poseidon hash implementation in circomlibjs for 2 and 3 uint256 elements takes 54K and 70K gas respectively. If I'm not missing something, the hash is recalculated in each of those 3 methods all the way up to (!) the root. The gas savings may then be substantial, unless the hash function is very cheap or is introduced as an EVM precompile.
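
A rough back-of-the-envelope sketch of that cost, in Python. It assumes one 3-input hash for the leaf and one 2-input hash per level on the way to the root, and it ignores storage and every other cost; that breakdown is my assumption, not something taken from the reference implementation. Only the 54K and 70K figures come from the numbers above.

```python
POSEIDON_2_GAS = 54_000  # 2-element Poseidon, figure quoted above
POSEIDON_3_GAS = 70_000  # 3-element Poseidon, figure quoted above


def rehash_gas(depth: int) -> int:
    """Estimated hashing gas to update one leaf at the given depth.

    Assumption: 1 leaf hash (3 inputs) + `depth` inner-node hashes (2 inputs).
    """
    return POSEIDON_3_GAS + depth * POSEIDON_2_GAS


for depth in (20, 40, 80):
    print(f"depth {depth}: ~{rehash_gas(depth):,} gas spent on hashing")
```

Under these assumptions a leaf at depth 20 already costs over a million gas in hashing alone, which is where the potential saving comes from.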

Thanks for proposing that!

We have actually thought of such optimizations but there is a problem: you need to anchor to the SMT root on-chain. Without these hashes the on-chain verification integration will be extremely expensive. Currently the getRoot() method just performs a single SLOAD.

P.S.

We are working on an alternative to SMT which we call Cartesian Merkle Tree (CMT). CMT has similar properties to SMT but under some conditions (when the cost of a hash function is low) is cheaper by up to ~20%.

Currently, on-chain SMT with Poseidon wins. However, certain off-chain applications may see some benefit, as CMT takes ~50% less space.

For more information, check out the Solidity reference implementation here.

1 Like

@Arvolear thanks for the prompt reply and the CMT link :+1:
Yes, without stored hashes the root would always need the whole tree branch to be recalculated, so that is not good for on-chain verification.

1 Like

Hey, that's cool work!

This has the potential of unlocking lots of things. Not just passports. Ideally this could be similar to a DID container at the end.

Wanted to mainly ask 3 things:

  • Have you considered that bytes32 might not be (I think for sure will not be) enough space for PQ primitives, which are usually much larger than EC points?
  • I imagine that against a quantum adversary, we're screwed (as is normal, of course). Could this include some contingency plan, i.e. functionality that allows an upgrade and maybe a shutdown/halt?
  • I'm curious why the hashing (isolation) needs to be done on-chain, especially considering that Poseidon is what has been implemented. Wouldn't it make more sense to allow users to hash by themselves? And if there's a security risk in allowing that, why not choose a hash that is precompile-friendly, thus significantly lowering the gas cost?

Thanks!

1 Like

Hey, thanks for taking a look!

  • About post-quantumness. Currently, the whole Ethereum protocol is screwed due to ECDSA not being PQ secure. The bytes32 key/values were chosen because 1) they occupy a single storage slot; 2) the individual registrars may hash arbitrary data and easily pack it in 32 bytes.
  • Unfortunately, this would introduce some centralized entity that "governs" the registry. I would like to avoid that at all costs.
  • The isolation is done on-chain to avoid collisions within the SMT. If isolation is delegated to registrars, an adversary may remove data they shouldn't access. Poseidon is chosen because it is the de facto standard in the ZK world. The sha2 precompile and the keccak opcode are cheaper on-chain but MUCH more expensive (in terms of constraints and complexity) in ZK.
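
To illustrate the isolation point, here is a toy Python model. It is a sketch only: SHA-256 stands in for Poseidon, a dict stands in for the SMT, and ToyRegistry, isolated_key, and the two registrar addresses are hypothetical names made up for this example.

```python
import hashlib


def isolated_key(sender: bytes, key: bytes) -> bytes:
    """Derive the storage key from the caller and its key (SHA-256 stand-in)."""
    assert len(sender) == 20 and len(key) == 32
    return hashlib.sha256(sender + key).digest()


class ToyRegistry:
    """Dict-backed stand-in for the registry; the real one uses an SMT."""

    def __init__(self) -> None:
        self._db: dict[bytes, bytes] = {}

    def add(self, sender: bytes, key: bytes, value: bytes) -> None:
        self._db[isolated_key(sender, key)] = value

    def remove(self, sender: bytes, key: bytes) -> bool:
        # Returns False when the sender has nothing stored under this key.
        return self._db.pop(isolated_key(sender, key), None) is not None

    def get(self, sender: bytes, key: bytes):
        return self._db.get(isolated_key(sender, key))


registrar_a = bytes([0xAA]) * 20  # hypothetical registrar addresses
registrar_b = bytes([0xBB]) * 20
key = bytes(31) + b"\x01"

registry = ToyRegistry()
registry.add(registrar_a, key, b"statement from A")

# B uses the very same key but cannot touch A's entry.
removed_by_b = registry.remove(registrar_b, key)
print("B removed A's entry:", removed_by_b)
print("A's entry still present:", registry.get(registrar_a, key) is not None)
```

Because the registry itself mixes the caller into the key, a registrar can only ever address its own slots. If registrars supplied pre-hashed keys instead, nothing would stop one of them from passing a key that belongs to another.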

Hope that clarifies some things.

2 Likes

Quick update: the ERC has been merged as a draft. Will be pushing the proposal to review soon. Thanks to everyone involved!

Update: deployed the registry deterministically both to Mainnet and Sepolia networks. The address is computed to be 0x781268D46a654D020922f115D75dd3D56D287812 (starts and ends with 7812).

Check out the full deployment script here.

Great work @Arvolear!

Considering that a statement often carries randomness within itself (e.g. you committed to your ID with some salt for privacy), the current design suggests that duplicates of the encapsulated value (within the statement) may exist in the registry, since isolation by sender + statement cannot detect this. Does that constitute a potential problem?