Increasing address size from 20 to 32 bytes

greg7mdp · March 7, 2021, 5:20pm

I’m wondering whether, instead of settling on one address format with hardcoded support for addressing shards/L2 chains, it might be better to have optional address extensions.

So in the base 32 byte address, there could be a byte stating whether the address is followed by a second 32 byte extended address. If the byte is 0, there is no extended address information. If the byte is not 0, its value would specify the type of the address extension that follows in the next 32 bytes. Any address extension would be optional, but as you described providing it might be a way to optimize transfers.
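A minimal sketch of how a client might parse such an address. The thread does not say which byte of the base address carries the flag, so `FLAG_INDEX = 0` is an assumption, and the names `ParsedAddress` and `parse_address` are hypothetical.

```python
from typing import NamedTuple, Optional

BASE_LEN = 32    # base address size discussed in this thread
EXT_LEN = 32     # size of the optional extension
FLAG_INDEX = 0   # assumption: position of the flag byte inside the base address


class ParsedAddress(NamedTuple):
    base: bytes
    ext_type: int               # 0 means "no extension follows"
    extension: Optional[bytes]


def parse_address(data: bytes) -> ParsedAddress:
    """Split raw bytes into a base address and an optional typed extension."""
    if len(data) < BASE_LEN:
        raise ValueError("address shorter than 32 bytes")
    base = data[:BASE_LEN]
    ext_type = base[FLAG_INDEX]
    if ext_type == 0:
        # Flag byte 0: the base address must stand alone.
        if len(data) != BASE_LEN:
            raise ValueError("unexpected bytes after a base address without extension")
        return ParsedAddress(base, 0, None)
    # Non-zero flag: exactly one 32-byte extension of that type must follow.
    if len(data) != BASE_LEN + EXT_LEN:
        raise ValueError("flag byte announces an extension that is missing or mis-sized")
    return ParsedAddress(base, ext_type, data[BASE_LEN:])
```

Because the extension is optional, a consumer that only understands base addresses can still read the first 32 bytes and ignore the rest.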

John-Status · March 7, 2021, 5:51pm

Great idea! It doesn’t increase the base address size, and it provides 32 bytes in which to encode multiple chain IDs when this functionality is needed.

IMHO this extension (that enables multiple chain IDs to be encoded) will be needed in most cases when an address that tokens might be sent to is shared/published. E.g. when users share addresses with one another, most of the time they should be using this address extension.

anono1618 · March 7, 2021, 6:46pm

Just enforcement, and obviousness that the checksum exists.

zamicol · March 7, 2021, 7:05pm

Why not allow for the full 256 bit keccak256 checksum? GUIs can include however many bytes they want at the end of the address when copying and pasting. Ethereum can specify that anything under 4 bytes is invalid for GUIs.
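A rough sketch of a variable-length checksum suffix with a 4-byte minimum. Python's standard library has SHA3-256 but not Ethereum's keccak256 (the two differ in padding), so `hashlib.sha3_256` is used here purely as a stand-in. The function names and the hex text layout are assumptions for illustration, not a proposed standard.

```python
import hashlib

MIN_CHECKSUM_BYTES = 4    # lower bound suggested in the post above
MAX_CHECKSUM_BYTES = 32   # the full 256-bit digest


def full_checksum(address: bytes) -> bytes:
    # Stand-in hash: SHA3-256, NOT keccak256.
    return hashlib.sha3_256(address).digest()


def with_checksum(address: bytes, n: int = MIN_CHECKSUM_BYTES) -> str:
    """Render the address as hex followed by the first n checksum bytes."""
    if not MIN_CHECKSUM_BYTES <= n <= MAX_CHECKSUM_BYTES:
        raise ValueError("checksum length must be between 4 and 32 bytes")
    return "0x" + address.hex() + full_checksum(address)[:n].hex()


def verify(text: str, address_len: int = 32) -> bool:
    """Accept any suffix length from 4 to 32 bytes, as long as it matches."""
    raw = bytes.fromhex(text[2:] if text.startswith("0x") else text)
    address, suffix = raw[:address_len], raw[address_len:]
    if not MIN_CHECKSUM_BYTES <= len(suffix) <= MAX_CHECKSUM_BYTES:
        return False
    return full_checksum(address)[:len(suffix)] == suffix
```

A GUI could call `with_checksum(addr, 8)` while another uses the full 32 bytes; `verify` accepts both and rejects anything with fewer than 4 checksum bytes.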

fare · March 7, 2021, 7:24pm

I’m all for longer addresses… but if you’re going to make big incompatible changes, why not bundle that with switching from EVM to EWASM or RISCV or something, instead of making an ugly and futile attempt at backwards compatibility?

rubi · March 8, 2021, 7:22am

This may be misleading. It also requires 2**80 bytes (around 1 bn petabytes) of (fast) memory. You can trade off to something like 1,000 petabytes memory and 2**100 hashes, but it’s still hard.
I’m not saying this is not a problem, just making current situation somewhat clearer.
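As a quick sanity check on the unit conversions quoted above (a sketch that assumes decimal petabytes of 10**15 bytes; it only restates the quoted figures and does not model the attack itself):

```python
import math

PETABYTE = 10**15  # assumption: decimal petabyte, in bytes

# Memory figure quoted above: 2**80 bytes, expressed in petabytes.
full_memory_pb = 2**80 / PETABYTE

# The reduced-memory figure, 1,000 petabytes, expressed as a power of two.
reduced_memory_bits = math.log2(1000 * PETABYTE)

print(f"2**80 bytes = {full_memory_pb:.2e} PB")
print(f"1,000 PB = 2**{reduced_memory_bits:.1f} bytes")
```

So 2**80 bytes is about 1.2 billion petabytes, and 1,000 petabytes is roughly 2**60 bytes.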

vbuterin · March 8, 2021, 3:20pm

> This may be misleading. It also requires 2**80 bytes (around 1 bn petabytes) of (fast) memory.

I’m pretty sure you can use cycle finding algos to find a collision in sqrt time and O(1) memory.

E.g., https://diglib.tugraz.at/download.php?id=576a7826f0534&location=browse (around page 55) discusses some approaches to do this.
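To illustrate the idea on a toy scale, here is a sketch of Floyd's cycle-finding algorithm applied to a hash truncated to 24 bits. It is one textbook cycle-finding method, not necessarily the one the linked document describes. SHA-256 truncated to `N_BYTES` is an arbitrary stand-in chosen so the example finishes in well under a second.

```python
import hashlib

N_BYTES = 3  # toy 24-bit hash so a collision takes only thousands of steps


def h(x: bytes) -> bytes:
    return hashlib.sha256(x).digest()[:N_BYTES]


def find_collision(seed: bytes):
    """Return (a, b) with a != b and h(a) == h(b), storing only two values.

    The seed must be longer than N_BYTES so it cannot lie on the cycle itself.
    """
    # Phase 1: the hare moves twice as fast until both meet inside the cycle.
    tortoise, hare = h(seed), h(h(seed))
    while tortoise != hare:
        tortoise = h(tortoise)
        hare = h(h(hare))
    # Phase 2: restart the tortoise from the seed and step both at the same speed.
    # Just before they meet, they hold two distinct inputs with the same hash.
    tortoise = seed
    while True:
        next_t, next_h = h(tortoise), h(hare)
        if next_t == next_h:
            return tortoise, hare
        tortoise, hare = next_t, next_h


a, b = find_collision(b"seed")
print(a.hex(), b.hex(), h(a).hex())
```

Only the two iterator states are kept in memory, regardless of the hash width; the expected number of hash evaluations still grows with the square root of the output space.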

jochem-brouwer · March 8, 2021, 3:57pm

What was the motivation to use 20 byte addresses instead of the 32-byte addresses which are generated by default?

vbuterin · March 8, 2021, 5:13pm

It was a holdover from Bitcoin (much like the v value in signatures being increased by 27).

poemm · March 8, 2021, 7:06pm

This is a brief survey of all (?) EVM opcodes which interact with addresses.

`ADDRESS`, `ORIGIN`, `CALLER`, `COINBASE`

Input: nothing.
Output: stack item with an address from the execution environment. Currently 160 bits.

`BALANCE`, `EXTCODESIZE`, `EXTCODECOPY`, `EXTCODEHASH`

Input: an address from stack, which we currently truncate to 160 bits.
Output: info about that address is possibly pushed to stack.

`CALL`, `CALLCODE`, `DELEGATECALL`, `STATICCALL`

Input: an address from stack, which we currently truncate to 160 bits.
Output: error code from the message call.

`CREATE`, `CREATE2`

Input: The init code, endowment, and possibly a salt.
Output: new account’s address (currently 160 bits) to stack, where the new address is roughly: `Hash(creating contract's address, creating contract's nonce or a salt, init code)[12:32]`. This interacts with addresses in two ways: as the input and output of a hash.

`SELFDESTRUCT`

Input: an address from stack, which we currently truncate to 160 bits.
Output: send the remaining balance to that address.

`SLOAD`, `SSTORE`

These only implicitly touch the current contract’s address to read/write its storage.
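The 160-bit truncation mentioned throughout this survey can be sketched as follows. This is an illustration of the bit arithmetic only, with hypothetical helper names; it does not reproduce any client's code.

```python
ADDRESS_BITS = 160
ADDRESS_MASK = (1 << ADDRESS_BITS) - 1


def truncate_to_address(stack_word: int) -> int:
    """Keep the low 160 bits of a 256-bit stack item."""
    return stack_word & ADDRESS_MASK


def last_20_bytes(hash32: bytes) -> bytes:
    """The [12:32] slice used when deriving an address from a 32-byte hash."""
    return hash32[12:32]


word = bytes(range(32))  # arbitrary 32-byte example value
as_int = int.from_bytes(word, "big")

# Masking the integer and slicing the big-endian bytes select the same 20 bytes.
assert truncate_to_address(as_int) == int.from_bytes(last_20_bytes(word), "big")
```

Moving to 32-byte addresses would mean that neither the mask nor the slice is applied, which is why every opcode group listed above is affected.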

UliGall · March 8, 2021, 9:29pm

Did someone consider the effects of this change on vanity addresses? It seems to me that 0x01000000000157aE408398dF7E5f4552091A69125d5dFcb7B8C2659029395bdF might be considered a vanity address, and that it will be harder in the future to differentiate between “high-effort” and “low-effort” ones. It might not be the most important detail, but as users often check some parts of the address to verify its correctness, we should keep this in mind.

cdili · March 9, 2021, 3:07am

Is it worth considering some of the following?

- Use SHA3 instead of KECCAK?
- Address extensions (see post 21 of this thread)?

(extensions may want to consider multihash “support” https://w3c-ccg.github.io/multihash/index.xml)

rubi · March 9, 2021, 11:42am

A random collision, indeed. But here you need chosen prefixes, two different ones. Notice the fact that the prefixes are different is meaningful since the iterators travel different paths in the space.

vbuterin · March 10, 2021, 5:13pm

> Notice the fact that the prefixes are different is meaningful since the iterators travel different paths in the space.

At worst, you can just have an iterator that randomly hops between both parts of the space (EOA pubkeys and contract codes fitting a template), and if you find a collision there’s a 50% chance that one preimage is a pubkey and the other preimage is a contract code. So I don’t think this is a barrier.

rubi · March 10, 2021, 11:06pm

Err… You need everything to be deterministic so you keep cycling.

Edit: Ah of course if you use (state % 2) as your random number then everything works out deterministically and perfectly. True.
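A toy sketch of that deterministic hop: the parity of the current state selects which of two input domains the next hash is taken from, so the walk stays a fixed function and cycle finding still applies. The prefixes, the 24-bit truncation, the SHA-256 stand-in, and the `salt` used to re-randomize the walk after an unlucky attempt are all assumptions for illustration.

```python
import hashlib

N_BYTES = 3                         # toy 24-bit output
PREFIXES = (b"pubkey:", b"code:")   # hypothetical tags for the two domains


def step(state: bytes, salt: int) -> bytes:
    # (state % 2) acts as the coin flip, but is fully deterministic.
    prefix = PREFIXES[state[-1] % 2]
    data = prefix + salt.to_bytes(4, "big") + state
    return hashlib.sha256(data).digest()[:N_BYTES]


def floyd_collision(seed: bytes, salt: int):
    """Floyd cycle finding on step(); returns two distinct colliding states."""
    tortoise, hare = step(seed, salt), step(step(seed, salt), salt)
    while tortoise != hare:
        tortoise = step(tortoise, salt)
        hare = step(step(hare, salt), salt)
    tortoise = seed
    while True:
        next_t, next_h = step(tortoise, salt), step(hare, salt)
        if next_t == next_h:
            return tortoise, hare
        tortoise, hare = next_t, next_h


def cross_domain_collision(max_attempts: int = 64):
    """Retry with a fresh salt until the two preimages fall in different domains."""
    for salt in range(max_attempts):
        a, b = floyd_collision(b"seed", salt)  # 4-byte seed cannot lie on the cycle
        if a[-1] % 2 != b[-1] % 2:
            return salt, a, b
    raise RuntimeError("no cross-domain collision within max_attempts")
```

Each attempt yields preimages in different domains about half the time, so only a couple of retries are expected before one input carries the `pubkey:` prefix and the other the `code:` prefix.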

Arachnid · March 11, 2021, 1:18am

If we’re defining a new address format, can we please define a canonical text representation that is not just the hexadecimal encoding of the address? Ethereum’s lack of a checksum in its text representation is one of its greatest weaknesses, and if everyone has to support a new address format anyway, that’s an excellent time to fix it. This should be a core part of any new address proposal, and not an afterthought - if 32 byte hexadecimal addresses get a foothold, it will be impossible to fix this (again).

qizhou · March 12, 2021, 10:37am

Why not keep 20 bytes in the EVM, while adding extra fields in the tx/msg to include chain_id/epoch_id? From a normal user’s perspective, the address is 32 bytes instead of 20 bytes, but wallets will automatically translate the 32 bytes into a 20-byte hash + chain_id + others and put them in the proper data fields.

For example, a tx sending to a 32-byte address will set the “to” field to the 20-byte part, and derive the “to_chain_id” field from the same 32-byte address.

And the EVM can stay as it is, except for adding a few opcodes to read values such as “to_chain_id” from the current tx context.
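A sketch of the wallet-side translation described here. The post does not fix a byte layout, so the split below (4 bytes of chain ID, 8 bytes of other data, 20 bytes of hash) is purely hypothetical, as are the names `TxFields` and `split_user_address`.

```python
from typing import NamedTuple


class TxFields(NamedTuple):
    to: bytes          # 20-byte address, as the EVM sees it today
    to_chain_id: int   # extra tx field carrying the chain ID
    other: bytes       # remaining bytes (e.g. an epoch ID), left uninterpreted


def split_user_address(addr32: bytes) -> TxFields:
    """Translate a user-facing 32-byte address into separate tx fields.

    Hypothetical layout: [0:4] chain_id | [4:12] other | [12:32] 20-byte hash.
    """
    if len(addr32) != 32:
        raise ValueError("expected a 32-byte address")
    chain_id = int.from_bytes(addr32[0:4], "big")
    other = addr32[4:12]
    to = addr32[12:32]
    return TxFields(to=to, to_chain_id=chain_id, other=other)


user_addr = (5).to_bytes(4, "big") + bytes(8) + bytes([0x11]) * 20
fields = split_user_address(user_addr)
print(fields.to_chain_id, fields.to.hex())
```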

vbuterin · March 13, 2021, 8:48pm

I don’t think that would actually solve either the address space expansion problem or the security problem… the issues all happen in the EVM, not in clients.

MicahZoltu · March 17, 2021, 8:06am

Contract level checksum validation or (maybe in the future) EVM level checksum validation.

greg7mdp · May 13, 2021, 8:28pm

Is there a benefit to having the shard id in the address, instead of adding it in the transaction along with the chain id?
