The contracts have been deterministically deployed on Mainnet and Sepolia. The registry is available at 0x781268D46a654D020922f115D75dd3D56D287812.
Hello Magicians, I am excited to share our latest work with you!
Abstract
This EIP introduces an on-chain registry system for storing abstract statements, where the state of the system can be proven in zero knowledge without disclosing anything about these statements. Developers may use the singleton EvidenceRegistry contract to integrate custom business-specific registrars for statement processing and proving.
Motivation
The standardization and aggregation of provable statements in a singleton on-chain registry significantly improve the reusability, scalability, and security of the many zero-knowledge, privacy-oriented solutions. The abstract specification of the registry allows custom identity-based, reputation-based, proof-of-attendance-based, etc., protocols to be implemented with minimal constraints.
The given proposal lays an important foundation for specific solutions to build upon. More concrete specifications of statement and commitment structures are expected to emerge as separate, standalone EIPs.
Specification
Check out the full specification on GitHub:
The complete reference implementation can be found here.
Would love to see an insightful discussion rolling!
How might the ERC-7812 standard address interoperability challenges among different privacy-oriented protocols while maintaining strict data privacy guarantees?
The EIP is designed to provide a unified provable on-chain storage for all the protocols that integrate with the EvidenceRegistry. For example, there may be a Registrar that manages the ICAO masters list (for on-chain national passport verification). Any protocol that wants to prove the validity of a passport will be able to reference the evidence registry and prove that one of the masters has signed a passport.
The specific business use cases and their implementation are outsourced to the registrars. This means there may be general-purpose "social", "passport", and "POAP" registrars that standardize how to manage and prove their data. We think that the behavior of registrars may even be described as separate EIPs.
Every registrar is expected to provide their own means of secure data proving (e.g. on-chain ZK verifier).
Hey @Arvolear that's very interesting! Let's connect, I would be happy to share some views, and see whether I can contribute!
I am working on the smart wallet 4337 angle.
Hey @Arvolear
Is the SMT implementation limited to 64 levels max? That would imply some probability of key path collisions. It would be big enough in most cases, but for a "public good" singleton it is something to consider.
Also, is there any mechanism intended for migrating existing statements from other protocols?
The SMT collision bits are algorithmically limited to 256 bits and will be set to 80 bits during deployment. We tested that the EVM throws runtime "stack too deep" errors at depths of 96 and more. To be clear, 80 bits is 2^80 ≈ 1.209*10^24 elements, which is practically impossible to reach.
The existing protocols can potentially push their state commitments (e.g. IMT roots in case of Semaphore) into the registry for the later proving. Migrating the entire state will definitely be too expensive, so suggestions are warmly welcome.
@Arvolear thanks for the clarification. There is an 80 depth limit, which I overlooked.
But the number of nodes in the tree still looks limited to 2^64 according to some struct field types like uint64 nodesCount; uint64 childLeft; and uint64 childRight;
Am I missing something about how the SMT implementation works?
Yes, existing protocols can push their state commitments into the registry. It's easier to do that from the beginning, but it may be tricky to switch if a protocol already has its own SMT-based registry implementation up and running.
The solution for the latter case may be introducing some constant timestamp after which the custom protocol starts writing to the EvidenceRegistry. The essential drawback is that the protocol gets more complicated, with some logic that will "fall back" to the legacy SMT implementation, so we may need more solutions...
Indeed. The uint64 nodesCount just slipped out of my mind.
So there are two limits:
The overall nodes count of 2^64 ~= 1.84*10^19.
The nodes collision limit of 80 bits prefix.
In a degenerate case (if no randomization is used) it is fairly easy to reach the second limit. One just needs to push two very similar keys to the tree. However, since every element is hashed (by the EvidenceDB._getIsolatedKey() function) before being inserted, I am not sure which one of the limits is more likely anymore.
By the Birthday Problem formula $P_{col} \approx 1 - e^{-n(n-1)/(2T)}$ there is a ~40% chance of at least one collision for 2^40. For 2^42 it is practically guaranteed.
But generally the SMT is fine: for 2^36 (~69 billion) nodes the collision probability is only 0.1951%, and it drops quickly with fewer nodes (roughly with the square of the node count).
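The figures above can be reproduced with a few lines of Python. This is only a sketch of the birthday-bound approximation, assuming the 80-bit prefix space (T = 2^80) mentioned earlier in the thread; the function name `collision_probability` is made up for illustration.

```python
import math


def collision_probability(n: int, space_bits: int = 80) -> float:
    """Birthday-bound approximation 1 - exp(-n(n-1)/(2T)) with T = 2**space_bits."""
    t = 2 ** space_bits
    # -expm1(-x) evaluates 1 - exp(-x) without losing precision for tiny x.
    return -math.expm1(-n * (n - 1) / (2 * t))


for bits in (36, 40, 42):
    print(f"2^{bits} keys: {collision_probability(2 ** bits):.4%}")
```

For small probabilities the expression behaves like n^2 / (2T), which is why halving the number of keys cuts the collision chance by roughly a factor of four.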
Yes, this is something that the getProof function needs. However, are any on-chain clients expected to call this function?
Usually, whoever fetches proofs of existence/non-existence of a key needs to do it anonymously, so it's likely just an API call with no transaction. Thus, you can generate the sibling hashes right in the getProof function, or return keys/values/nodeTypes so that proofs are generated off-chain on some client.
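To illustrate the off-chain option, here is a generic sketch of how a client could recompute a root from a leaf and its siblings. It is not the registry's actual layout: SHA-256 stands in for Poseidon, and the leaf encoding, the bit order, and the names `h`, `leaf_hash` and `root_from_proof` are all assumptions made for this example.

```python
import hashlib
from typing import Sequence


def h(*parts: bytes) -> bytes:
    """Stand-in hash (SHA-256 instead of Poseidon) over fixed-width parts."""
    return hashlib.sha256(b"".join(parts)).digest()


def leaf_hash(key: bytes, value: bytes) -> bytes:
    # Assumed leaf encoding: hash of key, value and a constant marker byte.
    return h(key, value, b"\x01")


def root_from_proof(key: bytes, value: bytes, siblings: Sequence[bytes]) -> bytes:
    """Fold the siblings (ordered from the root down) into a root hash.

    Assumed convention: bit `depth` of the key (least significant first)
    chooses the right branch (1) or the left branch (0) at that depth.
    """
    node = leaf_hash(key, value)
    key_int = int.from_bytes(key, "big")
    for depth in reversed(range(len(siblings))):
        if (key_int >> depth) & 1:
            node = h(siblings[depth], node)
        else:
            node = h(node, siblings[depth])
    return node


# Tiny two-leaf tree: key 2 goes left (bit 0 = 0), key 3 goes right (bit 0 = 1).
key_a, key_b = (2).to_bytes(32, "big"), (3).to_bytes(32, "big")
val_a, val_b = b"a" * 32, b"b" * 32
root = h(leaf_hash(key_a, val_a), leaf_hash(key_b, val_b))
print(root_from_proof(key_a, val_a, [leaf_hash(key_b, val_b)]) == root)
```

A client would compare the recomputed value with the root read from the chain, so the only on-chain data it needs is that root.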
@Arvolear if you eliminate hashing from the add, update, and remove node transactions, you may save even more gas.
E.g. the Poseidon hash implementations in circomlibjs for 2 and 3 uint256 elements take 54K and 70K gas respectively. If I'm not missing something, the hash is recalculated in each of those 3 methods all the way up to (!) the root. So the gas savings may be substantial, unless the hash function is very cheap or is introduced as an EVM precompile.
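A back-of-the-envelope estimate of that hashing cost, using the two gas figures quoted above. The cost model is an assumption for illustration: one 3-input hash for the leaf plus one 2-input hash per internal node on the path to the root, with storage and other costs ignored.

```python
# Gas figures quoted in the post for the circomlibjs Poseidon contracts.
POSEIDON2_GAS = 54_000  # hash of 2 uint256 elements
POSEIDON3_GAS = 70_000  # hash of 3 uint256 elements


def hashing_gas(path_length: int) -> int:
    """Rough hashing-only cost of one add/update/remove.

    Assumed model: a single 3-input leaf hash, then one 2-input hash for
    each internal node between the leaf and the root.
    """
    if path_length < 0:
        raise ValueError("path_length must be non-negative")
    return POSEIDON3_GAS + path_length * POSEIDON2_GAS


for length in (20, 40, 80):
    print(f"path length {length}: ~{hashing_gas(length):,} gas")
```

Under this model the 80-level maximum is a worst case; with hashed (effectively random) keys the path is typically much shorter, on the order of log2 of the number of leaves.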
We have actually thought of such optimizations but there is a problem: you need to anchor to the SMT root on-chain. Without these hashes the on-chain verification integration will be extremely expensive. Currently the getRoot() method just performs a single SLOAD.
P.S.
We are working on an alternative to SMT which we call Cartesian Merkle Tree (CMT). CMT has similar properties to SMT but under some conditions (when the cost of a hash function is low) is cheaper by up to ~20%.
Currently, on-chain SMT with Poseidon wins. However, certain off-chain applications may see some benefit as CMT takes ~50% less space.
For more information, check out the Solidity reference implementation here.
This has the potential of unlocking lots of things. Not just passports. Ideally this could be similar to a DID container at the end.
Wanted to mainly ask 3 things:
Have you considered that bytes32 might not be (I think it for sure will not be) enough space for PQ primitives, which are usually much larger than EC points?
I imagine that vs a quantum adversary, we're screwed (as is normal ofc). Could this include some contingency plan? I.e. a functionality that allows an upgrade and maybe a shutdown/halt?
I'm curious why the hashing (isolation) needs to be done on-chain, especially considering that Poseidon is what has been implemented. Wouldn't it make more sense to allow users to hash by themselves? And if there's any security risk in allowing that, why not choose a hash that is precompile-friendly, thus significantly lowering the gas cost?
About post-quantumness. Currently, the whole Ethereum protocol is screwed due to ECDSA not being PQ secure. The bytes32 key/values were chosen because 1) they occupy a single storage slot; 2) the individual registrars may hash arbitrary data and easily pack it in 32 bytes.
Unfortunately, this would introduce some centralized entity that "governs" the registry. I would like to avoid that at all costs.
The isolation is done on-chain to avoid collisions within the SMT. If isolation is delegated to registrars, an adversary may remove data they shouldn't access. Poseidon was chosen because it is the de facto standard in the ZK world. The sha2 precompile and the keccak opcode are cheaper on-chain but MUCH more expensive (in terms of constraints and complexity) in ZK.
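A toy model of why the isolation has to happen inside the registry rather than in the caller. This is a sketch under loud assumptions: SHA-256 stands in for Poseidon, the key derivation (hash of sender and key) is a simplification rather than the actual EvidenceDB._getIsolatedKey() logic, and `ToyRegistry` is a hypothetical in-memory stand-in for the contract.

```python
import hashlib


def isolated_key(sender: bytes, key: bytes) -> bytes:
    """Derive a per-sender storage key (SHA-256 stands in for Poseidon)."""
    if len(sender) != 20 or len(key) != 32:
        raise ValueError("expected a 20-byte sender and a 32-byte key")
    return hashlib.sha256(sender + key).digest()


class ToyRegistry:
    """In-memory stand-in: the registry, not the caller, derives the key."""

    def __init__(self) -> None:
        self._db = {}

    def add(self, sender: bytes, key: bytes, value: bytes) -> None:
        slot = isolated_key(sender, key)
        if slot in self._db:
            raise KeyError("key already exists")
        self._db[slot] = value

    def remove(self, sender: bytes, key: bytes) -> None:
        slot = isolated_key(sender, key)
        if slot not in self._db:
            raise KeyError("key does not exist")
        del self._db[slot]

    def contains(self, sender: bytes, key: bytes) -> bool:
        return isolated_key(sender, key) in self._db


registry = ToyRegistry()
alice, mallory = bytes.fromhex("aa" * 20), bytes.fromhex("bb" * 20)
key = bytes(32)
registry.add(alice, key, b"statement")
try:
    registry.remove(mallory, key)  # same raw key, different sender
except KeyError:
    print("mallory cannot remove alice's entry")
```

If callers supplied already-isolated keys themselves, nothing would stop the second registrar from passing the first registrar's slot directly, which is the removal attack described above.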
Update: deployed the registry deterministically both to Mainnet and Sepolia networks. The address is computed to be 0x781268D46a654D020922f115D75dd3D56D287812 (starts and ends with 7812).
Considering that a statement often carries randomness within itself (e.g. you committed to your id with some salt for privacy), the current design hints that duplicates of an encapsulated value (within a statement) may exist within the registry, since the isolation with sender+statement cannot detect this. Does that constitute a potential problem?