Why increase the address size?
At some point, perhaps soon, we are going to have to increase the address size from 20 bytes to 32 bytes. Some reasons for this include:
- Adding an address space ID if we use a state expiry scheme that requires it
- Adding a shard ID if we have multiple EVM-capable execution shards
- Security: 20 bytes is not secure enough
To elaborate on the third point, current 20-byte (160-bit) addresses provide only 80 bits of collision resistance, meaning that someone can spend 2**80 computing work to generate two pieces of contract init code (or (sender, ID) pairs, or one piece of contract code and one EOA private key) that have the same address. 2**80 will soon be within reach of sophisticated attackers; the Bitcoin blockchain has already made more than 2**90 hashes.
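The 80-bit figure follows from the birthday bound: finding any two inputs that hash to the same n-bit value takes roughly 2**(n/2) work, while hitting one specific, already-fixed address takes roughly 2**n. The following is a minimal sketch of that arithmetic; the function names are illustrative and not taken from any specification.

```python
def collision_bits(hash_bytes: int) -> int:
    """Approximate collision resistance in bits (birthday bound: n / 2)."""
    return hash_bytes * 8 // 2


def preimage_bits(hash_bytes: int) -> int:
    """Approximate preimage resistance in bits (n)."""
    return hash_bytes * 8


# Compare the current 20-byte hash with the 26-byte hash proposed later on.
for size in (20, 26):
    print(f"{size}-byte hash: ~2**{collision_bits(size)} work for a collision, "
          f"~2**{preimage_bits(size)} work for a preimage")
```

The gap between the two numbers is why an address that is already on-chain is much safer than one whose creator could have searched for a collision beforehand.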
This possibility of attack means that if someone gives you an address that is not yet on-chain, and claims that the address has some property, they cannot prove that the address actually has that property, because they could have some second way of accessing that account. The properties of addresses become more complicated: you can trust an address if either (i) it is on chain, or (ii) you personally created it; for example, an organization cannot safely receive funds at a multisig unless that multisig has already been published on-chain, because whoever was the last to provide their public key could have chosen their private key in such a way as to make the address also a valid EOA controlled by themselves.
These problems can be eliminated if we go up to 32-byte addresses, increasing the hash length and simultaneously adding shard and epoch data plus a version number for forward compatibility. The challenge, however, is that existing contracts are designed to accept 20-byte addresses: Solidity type-checks addresses to verify that they are in range, and byte-packs addresses to save storage space. This document attempts to give some proposals for how the transition can be done in a reasonably backwards-compatible way.
Proposal
We make a new address schema as follows:
Byte 0 : Version byte (must be 1 for now)
Byte 1-2 : Must be zero (could be shard number in the future)
Byte 3-5 : Epoch number (0 <= e <= 16777215)
Byte 6-31 : 26 byte hash
For example, the private key 0x0000...01 should correspond to the new-style address:
0x01000000000157aE408398dF7E5f4552091A69125d5dFcb7B8C2659029395bdF
Note that given a 32-byte value, it’s always possible to tell if it’s new-style or old-style: new-style addresses have the version byte set to 1 or higher, so they are >= 2**160, whereas old-style addresses are < 2**160.
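The layout above can be sketched as a pair of pack/unpack helpers. This is only an illustration under stated assumptions: the names `NewStyleAddress`, `encode_address`, `decode_address` and `is_new_style` are hypothetical, and the epoch is assumed to be stored big-endian, which the schema does not state explicitly.

```python
from typing import NamedTuple


class NewStyleAddress(NamedTuple):
    version: int    # byte 0
    shard: int      # bytes 1-2, must be zero for now
    epoch: int      # bytes 3-5 (big-endian is an assumption)
    hash26: bytes   # bytes 6-31


def encode_address(epoch: int, hash26: bytes, version: int = 1) -> bytes:
    """Pack the fields into a 32-byte new-style address."""
    if not 1 <= version <= 255:
        raise ValueError("version must fit in one byte and be at least 1")
    if not 0 <= epoch <= 16777215:
        raise ValueError("epoch must fit in three bytes")
    if len(hash26) != 26:
        raise ValueError("hash must be exactly 26 bytes")
    return bytes([version]) + b"\x00\x00" + epoch.to_bytes(3, "big") + hash26


def decode_address(address: bytes) -> NewStyleAddress:
    """Split a 32-byte new-style address into its fields."""
    if len(address) != 32:
        raise ValueError("expected a 32-byte value")
    return NewStyleAddress(
        version=address[0],
        shard=int.from_bytes(address[1:3], "big"),
        epoch=int.from_bytes(address[3:6], "big"),
        hash26=address[6:],
    )


def is_new_style(address: bytes) -> bool:
    """A 32-byte value is new-style exactly when it is >= 2**160."""
    return int.from_bytes(address, "big") >= 2**160


# The example address from the text, without the 0x prefix.
example = bytes.fromhex(
    "01000000000157aE408398dF7E5f4552091A69125d5dFcb7B8C2659029395bdF"
)
print(decode_address(example))
print(is_new_style(example))
```

Decoding the example address and re-encoding its fields reproduces the same 32 bytes, which is a quick sanity check that the byte offsets match the schema.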
We add:
- A new opcode, CREATE3, that is capable of generating 32 byte addresses. If we add address space separation, it would take the desired epoch number as an input, allowing contracts to create new contracts in their preferred address space
- A new transaction type that makes the msg.sender a new-style address, and if it’s a contract creation, creates the contract at a new-style address
- A new opcode, BIGCALLER, that returns the address of the caller regardless of whether it’s new-style or old-style (Solidity and other langs would be expected to treat the output as a BigAddress type which would be an alias for bytes32 instead of bytes20)
The CALLER opcode, if the caller is new-style, instead converts the address into a “compressed address”:
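    DOMAIN_SEPARATION_KEY = b'\xfeBLAHBLAHBLAH'

    def get_compressed_address(address: 'bytes32'):
        # Old-style address (12 leading zero bytes): return the low 20 bytes unchanged
        if address[:12] == bytes([0] * 12):
            return address[12:]
        # New-style address: hash it with a domain separator and keep the last 20 bytes
        return sha3(DOMAIN_SEPARATION_KEY + address)[12:]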
It also immediately saves the mapping from the compressed address to the original address in a translation table. That is, it sets storage slot compressed_address of contract TRANSLATION_TABLE_ADDR to address.
The *CALL and *EXT opcodes, if they encounter an old-style destination address, attempt to decompress it:
    DOMAIN_SEPARATION_KEY = b'\xfeBLAHBLAHBLAH'
    TRANSLATION_TABLE_ADDR = bytes([1] + [0, 0, 0, 0, 0] + [0] * 25 + [255])
    ZERO_CHUNK = bytes([0] * 32)

    def get_decompressed_address(state: 'EthereumState', address: 'bytes32'):
        # If the address is already new-style, just call that address directly
        if address[:12] != bytes([0] * 12):
            return address
        # Old-style address with a translation table entry: call the original long address
        elif state[TRANSLATION_TABLE_ADDR][address] != ZERO_CHUNK:
            return state[TRANSLATION_TABLE_ADDR][address]
        # No entry: treat it as a plain old-style address
        else:
            return address
This ensures that new-style addresses would be able to interact with pre-existing contracts, maintain persistent identities, hold tokens, etc.
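The round trip between the two opcodes can be modelled with a small toy, shown here as a sketch under explicit assumptions. The class `TranslationTable` and its methods `on_caller` and `on_call` are hypothetical names, a Python dict stands in for the storage of TRANSLATION_TABLE_ADDR, and `hashlib.sha3_256` stands in for the `sha3` of the pseudocode (which in Ethereum denotes Keccak-256, a different function that is not in the Python standard library).

```python
import hashlib

DOMAIN_SEPARATION_KEY = b"\xfeBLAHBLAHBLAH"
ZERO_PREFIX = bytes(12)


def hash32(data: bytes) -> bytes:
    # Stand-in hash so the sketch runs; NOT Keccak-256.
    return hashlib.sha3_256(data).digest()


class TranslationTable:
    """Toy model of the translation table contract's storage."""

    def __init__(self):
        self.slots = {}  # compressed 20-byte address -> original 32-byte address

    def on_caller(self, address: bytes) -> bytes:
        """What CALLER would return for a 32-byte caller address."""
        if address[:12] == ZERO_PREFIX:
            return address[12:]              # old-style: nothing to record
        compressed = hash32(DOMAIN_SEPARATION_KEY + address)[12:]
        self.slots[compressed] = address     # save the mapping immediately
        return compressed

    def on_call(self, short: bytes) -> bytes:
        """Destination a *CALL would use for a 20-byte address."""
        return self.slots.get(short, ZERO_PREFIX + short)


table = TranslationTable()
alice_long = bytes([1, 0, 0, 0, 0, 1]) + bytes(range(26))  # made-up new-style address
alice_short = table.on_caller(alice_long)
print(len(alice_short), table.on_call(alice_short) == alice_long)
```

An old contract only ever sees the 20-byte form, yet value sent back to that form still reaches the original 32-byte account, because the mapping was recorded the first time the account appeared as a caller.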
Example
Suppose that Alice has a new-style address ALICE_LONG and the compressed form is ALICE_SHORT. Alice tries to interact with a simple auction contract, AUCTIONEER (which is pre-existing, so it uses the CALLER opcode).
- Alice calls AUCTIONEER from ALICE_LONG to make a bid, passing along 5 ETH. AUCTIONEER uses the CALLER opcode to get the caller, and the opcode returns ALICE_SHORT (and simultaneously adds ALICE_SHORT -> ALICE_LONG to the translation table). AUCTIONEER confirms that Alice’s bid is higher than any existing one, and saves Alice’s bid, using ALICE_SHORT as her identity.
- Bob calls AUCTIONEER with a 6 ETH bid. AUCTIONEER confirms that Bob’s bid is higher, and needs to refund Alice her bid as her bid is now losing. AUCTIONEER uses CALL passing along ALICE_SHORT as an argument. The CALL opcode looks up ALICE_SHORT in the translation table, gets ALICE_LONG as a result, and so correctly sends Alice’s 5 ETH back to ALICE_LONG.
Alternatives
It’s worth noting that we don’t have to do this if we are okay with unpublished addresses requiring the creator to be trusted. If we are okay with this weaker security property, then we could instead keep 20-byte addresses and move to a scheme where the hash decreases to 15 bytes (still 120 bits of preimage security) and the remaining 5 bytes get used for version/shard/epoch, though this would require somehow invalidating existing addresses that collide with the new schema.