Historical work on account abstraction
- https://github.com/ethereum/EIPs/issues/86 (includes some use cases)
What is account abstraction, and what are the motivating use cases?
In short, account abstraction means that not only can the execution of a transaction be arbitrarily complex computation, as specified by the EVM, but the authorization logic of a transaction is also opened up, so that users can create accounts with whatever authorization logic they want. Currently, a transaction can only “start from” an externally owned account (EOA), which has one specific authorization policy: an ECDSA signature. With account abstraction, a transaction could start from accounts that have other kinds of authorization policies, including multisigs, other cryptographic algorithms, and more complex constructions such as ZK-SNARK verification.
- Multisig wallets and other uses of smart contract wallets (eg. social recovery). Currently, such wallets require an awkward construction where there is a separate account that stores a small amount of ETH to pay for transaction fees, and that account gets refilled over time.
- Cryptography other than ECDSA. Schnorr signatures, BLS signatures, other curves and quantum-proof algorithms such as Lamport/Winternitz could all be implemented.
- Privacy solutions would no longer require “relayers”, as the verification of the proof would be moved into authorization logic and so would be a condition of fee payment.
- On-chain DEXes and similar systems where there are often multiple transactions from many users trying to claim the same arbitrage opportunity. Currently, this leads to inefficiency because many failed transactions are nevertheless published on chain; with abstraction, it is theoretically possible to ensure that failed transactions do not get included on chain at all, improving gas efficiency.
How would account abstraction work?
It’s easiest to describe account abstraction by walking through a series of “strawman” schemes and understanding where their flaws are.
A note on terminology: we’ll use the word “target” for the address that the “top-level” call in an account-abstracted transaction is directed to. This avoids the connotations of existing terminology (“destination” or “to address”), which imply that this address is the recipient of a transfer; in fact, in all of our use cases the “target” address represents the account of the sender.
Strawman scheme 1: Naive total abstraction
Users can send transactions from a new special address, which we call ENTRY_POINT (eg. set it to 2**160 - 1), to any account, without paying for gas using the usual mechanisms. The implication is that the target of these transactions is the account of the sender, and the account code would process the transaction with its data and perform the desired operations.
Miners would use a simple filter to determine which transactions to accept: they check their account balance before fully processing the transaction, check it again after fully processing the transaction, and see if the difference is a sufficient fee. Users would simply include a send(coinbase(), fee) in their transaction, after any needed authorization steps, to pay the fee.
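The balance-difference filter described above can be sketched as follows. This is a toy model under stated assumptions, not client code: `State`, `Tx`, `miner_accepts` and `pay_fee` are hypothetical names, accounts are reduced to a balance map, and account code is modeled as a Python callable.

```python
from dataclasses import dataclass
from typing import Callable, Dict

State = Dict[str, int]  # address -> balance (toy stand-in for the real state)


@dataclass
class Tx:
    target: str
    # Hypothetical stand-in for the target's account code: it receives the
    # state and the miner's coinbase address and may do anything it likes.
    run: Callable[[State, str], None]


def miner_accepts(state: State, tx: Tx, coinbase: str, min_fee: int) -> bool:
    """Fully execute the transaction, then compare the miner's balance."""
    trial = dict(state)                      # execute on a scratch copy
    before = trial.get(coinbase, 0)
    try:
        tx.run(trial, coinbase)
    except Exception:
        return False                         # execution failed: no fee was paid
    after = trial.get(coinbase, 0)
    return after - before >= min_fee


def pay_fee(sender: str, fee: int) -> Callable[[State, str], None]:
    """Account code that does nothing except send(coinbase(), fee)."""
    def run(state: State, coinbase: str) -> None:
        if state.get(sender, 0) < fee:
            raise ValueError("insufficient balance")
        state[sender] -= fee
        state[coinbase] = state.get(coinbase, 0) + fee
    return run
```

Note that `miner_accepts` cannot return an answer before `tx.run` has finished, however expensive that call turns out to be.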
This approach is simple and maximally flexible. However, there are two huge problems with it.
- It means that miners have to fully execute every transaction before they can know whether or not they should accept it. As we saw in the DAO soft fork attempt, this is a bad idea that leads to high levels of DoS vulnerability.
- Nodes in the network have an even worse time figuring out whether or not they should propagate the transaction, because even if they execute the transaction and it seems like it would pay the miner, one single transaction could invalidate the fee-paying property of every other transaction. Hence, the network-level DoS risks are even greater.
Strawman scheme 2: signature abstraction only
We create a third type of account, “wallet accounts”. A wallet account is like a contract, but has two pieces of code: (i) verification code, and (ii) execution code. A call from ENTRY_POINT to a wallet account has two steps: (i) run the verification code, using the whole transaction as input, and verify that the output is nonzero, and then (ii) run the execution code normally. The verification code execution has no access to external calls (except to precompiles), contract storage, or any ability to “write” to anything; it must be a pure function. Additionally, verification code execution has a flat gas limit of 400,000.
This scheme has basically no security risks, as everything that miners and network nodes need to do is the same as before; the only difference is that instead of ECDSA verification (a pure function), miners need to execute some EVM code inside of a restricted environment (another pure function). However, the scheme only offers a portion of the benefits of abstraction, and we can do better.
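The two-step call into a wallet account from strawman scheme 2 can be sketched roughly as below. Everything here is illustrative: `WalletAccount`, `call_from_entry_point` and `make_preimage_verifier` are hypothetical names, gas metering is not modeled, and the hash-preimage check is only a toy stand-in for a real signature scheme.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable

VERIFICATION_GAS_LIMIT = 400_000  # flat limit from the text; not enforced in this toy


@dataclass
class WalletAccount:
    verify: Callable[[bytes], int]     # pure function of the whole transaction
    execute: Callable[[bytes], None]   # ordinary, unrestricted execution code


def call_from_entry_point(account: WalletAccount, tx: bytes) -> bool:
    """Step (i): verification output must be nonzero. Step (ii): execute."""
    if account.verify(tx) == 0:
        return False
    account.execute(tx)
    return True


def make_preimage_verifier(digest: bytes) -> Callable[[bytes], int]:
    """Toy authorization policy: the first 32 bytes of the transaction must hash
    to a digest baked into the verification code itself (verification code has
    no storage access in this scheme)."""
    def verify(tx: bytes) -> int:
        return 1 if hashlib.sha256(tx[:32]).digest() == digest else 0
    return verify
```

Because `verify` depends only on its input, a node can evaluate it without looking at any other state, in the same way it evaluates an ECDSA check today.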
Here is a concrete scheme that provides many of the benefits of full abstraction, without most of the risks. First, the EVM-level changes:
- We retain the property that there are two types of accounts, EOAs and contract accounts, the latter having only one piece of code. If desired, a subsequent fork could forcibly convert EOAs into contract accounts that have equivalent functionality, but this is not strictly needed.
- A transaction with a (0, 0) signature is treated as a transaction whose tx.origin (and hence top-level sender) is the ENTRY_POINT address (2**160 - 1).
- We add an opcode, PAYGAS, which takes one argument (gasprice) off the stack, deducts tx.startgas * gasprice from the balance, records that remaining_gas * gasprice should be refunded at the end of transaction execution, and records that if the transaction fails after that point, it should only revert back to the point of calling the opcode. Does nothing if not used in the top-level call frame (ie. msg.sender == ENTRY_POINT) and does nothing if used when the opcode has already been activated. Pushes 1 to the stack if successful and 0 if failed.
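As a rough illustration of the PAYGAS accounting described above, here is a minimal sketch. The names `TxContext`, `paygas` and `finalize` are hypothetical. Two points are assumptions not stated in the text: that the balance being charged is the target account's, and that an insufficient balance counts as a failure (pushing 0). The revert checkpoint is not modeled.

```python
from dataclasses import dataclass
from typing import Optional

ENTRY_POINT = 2**160 - 1


@dataclass
class TxContext:
    startgas: int
    balance: int                            # assumed: balance of the target account
    paid_gasprice: Optional[int] = None     # set once PAYGAS has been activated


def paygas(ctx: TxContext, msg_sender: int, gasprice: int) -> int:
    """Return the value PAYGAS would push: 1 on success, 0 otherwise."""
    if msg_sender != ENTRY_POINT:           # not the top-level call frame
        return 0
    if ctx.paid_gasprice is not None:       # already activated: do nothing
        return 0
    cost = ctx.startgas * gasprice
    if ctx.balance < cost:                  # assumption: treated as failure
        return 0
    ctx.balance -= cost
    ctx.paid_gasprice = gasprice
    return 1


def finalize(ctx: TxContext, remaining_gas: int) -> None:
    """End of transaction: refund remaining_gas * gasprice if PAYGAS was used."""
    if ctx.paid_gasprice is not None:
        ctx.balance += remaining_gas * ctx.paid_gasprice
```

For example, with `startgas = 100_000` and `gasprice = 5`, the account is charged 500,000 up front; if 40,000 gas remains at the end, 200,000 is refunded, for a net cost of 300,000.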
Now, the miner, and network node, policy:
- When a miner or network node receives a transaction, they verify that the code of the top-level contract starts with require(msg.sender == ENTRY_POINT) (exact EVM byte sequence pending). If this is not true, they reject the transaction.
- They then run the code until they reach one of three situations:
  - A call (except to a precompile) has been made, or an external state reading opcode (BALANCE, EXTCODE*) or an environment opcode (TIMESTAMP, DIFFICULTY, NUMBER, BLOCKHASH, COINBASE, GASLIMIT) has been used. In this case, they reject the transaction.
  - 400000 gas has been spent. In this case, they reject the transaction.
  - The PAYGAS opcode has been used. In this case, they accept or reject the transaction based on whether the gasprice given to the opcode is sufficient.
- Miners and network nodes do not relay more than one transaction for each account. The impact on usability from this is low because account abstraction will as a side effect enable accounts that support performing multiple operations inside of one transaction.
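The miner and network node policy above can be sketched as a single check over an execution trace. This is a simplified model under several assumptions: the trace is given as hypothetical `(opcode, gas_cost, argument)` tuples rather than produced by a real EVM, the guard check is reduced to a boolean, the precompile address range and the gas numbers used in any example are made up for illustration, and a trace that ends without reaching PAYGAS is rejected (the text does not say what happens in that case).

```python
from typing import Iterable, Optional, Set, Tuple

VERIFICATION_GAS_LIMIT = 400_000
CALL_OPS = {"CALL", "CALLCODE", "DELEGATECALL", "STATICCALL"}
FORBIDDEN_OPS = {
    "BALANCE", "EXTCODESIZE", "EXTCODECOPY", "EXTCODEHASH",              # external state
    "TIMESTAMP", "DIFFICULTY", "NUMBER", "BLOCKHASH", "COINBASE", "GASLIMIT",  # environment
}
PRECOMPILES = set(range(1, 10))  # assumption: addresses 1..9 are precompiles

Step = Tuple[str, int, Optional[int]]  # (opcode, gas cost, call target or gasprice)


def check_transaction(target: str, has_entry_point_guard: bool,
                      steps: Iterable[Step], min_gasprice: int,
                      pending_targets: Set[str]) -> bool:
    """Return True if a node following the policy would accept and relay."""
    if not has_entry_point_guard:            # code must start with the guard
        return False
    if target in pending_targets:            # at most one transaction per account
        return False
    gas_used = 0
    for op, gas, arg in steps:
        gas_used += gas
        if gas_used >= VERIFICATION_GAS_LIMIT:
            return False                     # 400000 gas has been spent
        if op in CALL_OPS and arg not in PRECOMPILES:
            return False                     # call to a non-precompile
        if op in FORBIDDEN_OPS:
            return False                     # external state or environment read
        if op == "PAYGAS":
            if arg is not None and arg >= min_gasprice:
                pending_targets.add(target)
                return True
            return False                     # offered gasprice is too low
    return False                             # assumption: never reached PAYGAS
```

The work a node does per transaction is bounded by `VERIFICATION_GAS_LIMIT`, and nothing the loop is allowed to read can be changed by a transaction with a different target.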
The PAYGAS opcode creates a “breakpoint” that separates transaction execution into (1) a “verification” step, which executes code that can only read the transaction and the contract’s own storage (which might contain public keys, anti-replay nonces, Merkle roots and other info) and cannot write anything, and (2) an “execution” step, which is not constrained, except that nothing that takes place during the execution step is capable of reverting the gas payment determined in PAYGAS.
The only-storage-dependent and non-writing rules are there to ensure that a transaction with some target T is guaranteed to continue being valid and having the same gasprice until either (i) that transaction is included, or (ii) another transaction with target T is included. This ensures that transactions have similar technical properties to what they have today. The “ENTRY_POINT only” guard requirement is there to ensure that this rule cannot be violated by transactions with other targets calling into the account.
Transaction hash uniqueness
There are two possible strategies regarding the transaction hash uniqueness invariant. The first strategy is to accept that this form of abstraction will remove this invariant, because nothing now strictly prevents the same transaction from being included twice (though allowing that would be inadvisable for most applications). This would require rearchitecting some client code.
The second strategy is to maintain the requirement that transactions have nonces, add nonces to contracts, and keep the tx.nonce == contract.nonce check (and add contract.nonce += 1 to the post-execution step). This preserves uniqueness, but harms some use cases (eg. Tornado Cash) because multiple users may send transactions at the same time, and so there may be race conditions for transactions with the same nonce. That said, note that such applications would already be limited by the network-layer “only propagate one transaction per account” rule.
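A minimal sketch of the second strategy, showing the race it creates when two users target the same contract. `Contract` and `apply_with_nonce` are hypothetical names, and execution is reduced to a callable.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Contract:
    nonce: int = 0


def apply_with_nonce(contract: Contract, tx_nonce: int,
                     execute: Callable[[], None]) -> bool:
    """Keep the tx.nonce == contract.nonce check; bump the nonce afterwards."""
    if tx_nonce != contract.nonce:
        return False                 # stale or future nonce: not includable
    execute()
    contract.nonce += 1              # post-execution step
    return True


# Two users read nonce 0 at the same time and both sign against it.
shared = Contract()
first = apply_with_nonce(shared, 0, lambda: None)    # included
second = apply_with_nonce(shared, 0, lambda: None)   # loses the race
```

Here `first` is accepted and `second` is rejected, so the second user has to re-sign with the new nonce even though the two operations were otherwise independent.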
One simple fix for TC and similar applications would be to shard the uniqueness-nonce space. In TC, every withdrawal must provide a “nullifier” that is cryptographically connected to their withdrawal note. The system could maintain eg. 16 contracts, each of which stores a map of used nullifiers only for a range 0…, 1…, 2… up to f… Note particularly that the execution phase of the transaction could call out to these other contracts and update all of them. Applications other than TC (eg. UTXO-based subsystems) could use similar principles.
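The sharding idea can be sketched as routing each nullifier to one of 16 maps by its first hex digit. The names `NullifierShard`, `make_shards` and `spend` are hypothetical, each shard stands in for a separate contract, and the proof check that would tie a nullifier to a withdrawal note is omitted entirely.

```python
from typing import Dict, Set

HEX_DIGITS = "0123456789abcdef"


class NullifierShard:
    """Stand-in for one of the 16 contracts; stores used nullifiers for one prefix."""
    def __init__(self, prefix: str) -> None:
        self.prefix = prefix
        self.used: Set[str] = set()


def make_shards() -> Dict[str, NullifierShard]:
    return {p: NullifierShard(p) for p in HEX_DIGITS}


def spend(shards: Dict[str, NullifierShard], nullifier_hex: str) -> bool:
    """Mark a nullifier as used in its shard; reject a repeat."""
    shard = shards[nullifier_hex[0].lower()]   # range 0..., 1..., up to f...
    if nullifier_hex in shard.used:
        return False
    shard.used.add(nullifier_hex)
    return True
```

Under this layout, two withdrawals whose nullifiers begin with different hex digits would be directed at different target contracts, so they would not compete with each other under a per-account uniqueness rule.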