EIP-4788: Beacon root in EVM

ralexstokes · June 22, 2023, 2:39pm

not sure – what do other precompiles do here?

jochem-brouwer · June 22, 2023, 7:03pm

Blake2F throws; ecrecover pads the input with zeros to reach the expected 128-byte length. The EIP should specify what happens if fewer than 32 bytes are given as input: do we left- or right-pad the storage key with zeros in order to retrieve it? (And if there are more than 32 bytes, just take the first 32 bytes.)

jochem-brouwer · June 22, 2023, 8:23pm

I want to raise a point regarding gas, but I do not want to make the EIP implementation more complex:

What if someone sends a transaction with one of these storage slots warm? We could lower the gas (but yes, this would make the implementation more complex)

ralexstokes · June 22, 2023, 9:20pm

i’m not too worried about this, and someone could prototype a verification of say a validator balance to ensure it doesn’t feel too expensive

one option if it was a problem is either (1) update precompile in future fork, (2) have a “caching” contract that proxies to the precompile so you get the warm/cold gas distinction

ralexstokes · June 22, 2023, 9:20pm

more than 32, just take the first 32

less should throw
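
A minimal sketch of that input rule, in Python for illustration only. The function name and the use of an exception to model "throw" are assumptions, not part of the EIP:

```python
KEY_LENGTH = 32  # bytes; the lookup key discussed above is 32 bytes long


def normalize_key(calldata: bytes) -> bytes:
    """Return the 32-byte lookup key, following the rule discussed above."""
    if len(calldata) < KEY_LENGTH:
        # Short input is rejected rather than zero-padded on either side.
        raise ValueError("input shorter than 32 bytes")
    # Longer input is truncated: only the first 32 bytes are used.
    return calldata[:KEY_LENGTH]
```

Rejecting short input sidesteps the left-pad versus right-pad question entirely.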

ralexstokes · June 22, 2023, 10:50pm

I added one option here: Update EIP-4788: Bound precompile storage by ralexstokes · Pull Request #7178 · ethereum/EIPs · GitHub

lmk if you see anything we should change

haltman-at · June 24, 2023, 6:13am

I mean, new precompiles add more semantics too! Both need to be specified. I’m just saying it’s a little confusing because it doesn’t match the roles of opcodes and precompiles as they’ve existed so far. Getting information about the environment has been the province of opcodes; precompiles are for complicated computations. Well, except for 0x4… but that oversight is being rectified with the addition of MCOPY!

chfast · August 1, 2023, 5:23pm

Why don’t you deploy the real bytecode at the given address instead of trying to describe what it is supposed to do in pseudocode? And come up with some arbitrary gas cost for this?

holiman · August 2, 2023, 6:54am

This strongly resonates with me. I also wonder.

Some historical context: this approach was once suggested for blockhash, with EIP-210: Blockhash refactoring

@axic already posted that link, however, I think that was early in the discussion, when the beacon root EIP also included an opcode? Later on, the opcode was dropped, and the EIP is now in a situation where it could be replaced by:

A contract X
A system-call to X, invoking it with the latest beaconroot+number at the start of block processing.

The contract X would store new values if the sender is system-address, otherwise respond with the value requested (as per the pseudo-code in the current version of the EIP).
The address of X could either be pre-determined, like a precompile, or we could just create it with CREATE2 in advance – before the hardfork, and then just ‘bless it’ at cancun by beginning the population of values (optional: the ‘blessing’ could also include auto-inclusion into the prepopulated access-list).

In that case, the core of the EIP would basically be:

At the start of processing any execution block where block.timestamp >= FORK_TIMESTAMP (i.e. before processing any transactions), make a transaction to the HISTORY_STORAGE_ADDRESS, with calldata BEACONROOT, sender SYSTEM_SENDER and gas 100_000.

The internals of what the contract does would not be consensus-critical (any more than other contract internals)
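
A minimal sketch of what contract X could do, modelled in Python rather than EVM bytecode. The placeholder sender value, keying entries by block timestamp, and the 32-byte big-endian lookup encoding are all assumptions made for illustration; storage is left unbounded here for brevity:

```python
from dataclasses import dataclass, field
from typing import Dict

# Placeholder for illustration only; not a real address.
SYSTEM_SENDER = "system-sender"


@dataclass
class ContractX:
    # Assumption: roots are keyed by the timestamp of the block that stored them.
    storage: Dict[int, bytes] = field(default_factory=dict)

    def call(self, sender: str, calldata: bytes, block_timestamp: int) -> bytes:
        if len(calldata) != 32:
            raise ValueError("expected exactly 32 bytes of calldata")
        if sender == SYSTEM_SENDER:
            # System call at the start of the block: calldata is the beacon root.
            self.storage[block_timestamp] = calldata
            return b""
        # Any other caller: calldata is the big-endian timestamp to look up.
        key = int.from_bytes(calldata, "big")
        if key not in self.storage:
            raise KeyError("no root stored for this timestamp")
        return self.storage[key]
```

Only the system call itself would need to be specified by consensus; everything inside `call` is ordinary contract logic.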

protolambda · August 4, 2023, 1:04am

Optimism also utilizes a system-transaction to expose the block commitment of an external layer, very similar to what is being explored with the “v2” here. But it introduces the L1 blockhash instead of a beacon-chain block root.

System transaction comparison

And instead of extending the EVM interface with the external blockroot as explicit argument, and instead of modifying the block-header, we use our “deposit tx type” (like 1559 tx, without signature) (specs). This deposit-tx is inserted at the top of the block, and passed in through a transactions-list via Payload-Attributes in the Engine API (this same functionality is used to reproduce blocks from inputs only, as only inputs are made available on L1).

Predeploy, not precompile

The transaction is then processed by a “predeploy” (specs). The bytecode of a predeployed EVM contract is present starting at genesis of the chain, not deployed by a regular user.
This “predeploy” mechanism is the same trick as what I implemented in testnet tools to embed the beacon deposit-contract with initial storage at a special address, which e.g. is going into the deposit-contract in Holesky at 0x000000006465706F73697420636f6E7472616374 (“deposit contract” in utf8).
This process is described here as well as in this merge-at-genesis tutorial by Afri.

In the case of EIP-4788 the block-header is extended, so the tx itself does not really have to be a tx, but the pattern of calling a predeploy, instead of a precompile, definitely works.

Bridge usage

And we’re considering to extend the predeploy contract with a ring-buffer: it’s very useful to retain recent history, and expose that history to other contracts, rather than just the very latest L1 blockhash (or beacon block root in this case). Bridges and other tools may build their txs for a certain beacon block-root, and want to interact with it like e.g. doing a merkle proof, without having to “remember” it with a prior transaction that persists the latest beacon-root first. Maybe this type of functionality should be considered for the beacon block root predeploy as well?
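
A rough sketch of such a ring buffer, in Python for illustration. The class name, the buffer length, and the choice to store the key next to each root (so that overwritten entries are detected on read) are assumptions, not a description of any deployed contract:

```python
from typing import List, Optional


class RootRingBuffer:
    """Fixed-size buffer of recent roots, indexed by key modulo length."""

    def __init__(self, length: int) -> None:
        if length <= 0:
            raise ValueError("length must be positive")
        self.length = length
        self.keys: List[Optional[int]] = [None] * length
        self.roots: List[Optional[bytes]] = [None] * length

    def set(self, key: int, root: bytes) -> None:
        idx = key % self.length
        # Newer entries overwrite older ones that map to the same slot.
        self.keys[idx] = key
        self.roots[idx] = root

    def get(self, key: int) -> bytes:
        idx = key % self.length
        # The stored key tells us whether this slot still holds the entry asked for.
        if self.keys[idx] != key:
            raise KeyError("root not in buffer (never stored or overwritten)")
        return self.roots[idx]


buf = RootRingBuffer(4)
for n in range(10, 15):
    buf.set(n, n.to_bytes(32, "big"))
recent = buf.get(14)  # still available; buf.get(10) would raise KeyError
```

A bridge could then reference any root still inside the window without a prior transaction to persist it.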

etan-status · August 31, 2023, 2:23pm

With the parent_beacon_block_root hashed into the block_hash, there is an on-chain commitment that is exposed to EVM via the blockhash opcode. Hence, contracts have everything they need to validate the parent block root via calldata by simply pushing in the full block header or a matching zk proof.

Not sure if the system contract should be included without it being absolutely necessary, or before there being applications making use of it. The contract just feels like adding relatively heavy infrastructure that will be hard to modify in the future.

madlabman · September 1, 2023, 3:26pm

Given the case that we have a proof of any property of a block whose timestamp is out of the current state of a ring buffer, i.e., it was included in the blockchain too long ago, what approach should we use for the proof?

My first thought is for a Prover contract to have a storage cache with (timestamp, root) “checkpoints,” which is populated by calling the EIP-presented contract. Then anyone can bring a proof of an arbitrary root against some checkpoint in the cache, effectively adding a new checkpoint to the cache. Eventually, we can use any proof for any block root, if I understand correctly.
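
A sketch of that checkpoint idea, in Python for illustration. It assumes the older root is a leaf of a binary SHA-256 Merkle tree committed to by the checkpoint root; the real proof path through beacon state is more involved, and all names here are hypothetical:

```python
import hashlib
from typing import Dict, List


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def verify_branch(leaf: bytes, branch: List[bytes], index: int, root: bytes) -> bool:
    """Check a Merkle branch from leaf up to root; index selects left/right at each level."""
    node = leaf
    for sibling in branch:
        if index & 1:
            node = sha256(sibling + node)  # current node is a right child
        else:
            node = sha256(node + sibling)  # current node is a left child
        index >>= 1
    return node == root


class CheckpointCache:
    def __init__(self) -> None:
        self.checkpoints: Dict[int, bytes] = {}

    def add_trusted(self, timestamp: int, root: bytes) -> None:
        # In a real Prover contract this value would be read from the EIP contract.
        self.checkpoints[timestamp] = root

    def add_proven(self, new_timestamp: int, new_root: bytes,
                   branch: List[bytes], index: int, against_timestamp: int) -> None:
        anchor = self.checkpoints[against_timestamp]
        if not verify_branch(new_root, branch, index, anchor):
            raise ValueError("proof does not match the checkpoint")
        # Simplification: new_timestamp is caller-supplied and not itself proven here.
        self.checkpoints[new_timestamp] = new_root
```

Each successful proof extends the set of roots that later proofs can anchor to.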

Is there a more convenient approach? Or am I wrong in something?

poojaranjan · September 20, 2023, 6:59pm

EIP-4788: Beacon block root in the EVM with @ralexstokes

bowenl · November 7, 2023, 9:29pm

is there a timeline of when this will be released?

madlabman · November 20, 2023, 8:17am

Hey all! I’ve elaborated on the idea of the block roots cache at Lido research forum, I’ll appreciate any feedback!

ihagopian · January 15, 2024, 4:50pm

Hey, today I did some quick slides about the BLOCKHASH situation for Verkle Trees.

If you go to the last slide, you’ll see that “Solution 4” proposes a ring buffer. This EIP-4788 takes a similar approach to this other use case of beacon roots.

In the slides, I mention a potential DoS attack vector. Remember that in Verkle Trees, there’s a single tree for all accounts and storage. This means that someone could try brute-forcing writes to generate tree addresses with the same prefixes as the branches to attack, which would cause that branch to be longer than average.

In the case of the BLOCKHASH ring buffer, these addresses are fixed (or could be in a contract as proposed in EIP-4788). The point is that the same DoS risk would exist for the (two) ring buffer entries in this contract. This would mean updating the ring buffer can have a higher cost than the average depth of the tree.

This kind of DoS attack can be done in any branch of any storage slot. The main difference between any branch and these branches is that these ones are “system-related,” so from an economic perspective might have a better ratio of cost/benefit for an attacker (that’s the handwavy argument).

To be clear, I don’t have numbers on whether this is a real problem – just sharing the concern. Maybe someone with more experience and knowledge can gauge whether it is.

madlabman · January 19, 2024, 9:25pm

Hey folks! I’ve compiled a small project demonstrating a usage of the block roots. It includes:

creating a proof from a beacon state (thanks to the lodestar javascript packages)
verifying a proof via a smart contract

I will be glad for any feedback and suggestions if you find how to make it more useful.

matt · January 22, 2024, 9:29pm

I don’t understand what the attack is. Even today it is possible to mine addresses near an account to create a deeper branch. It can be a bit more difficult with storage because, today, that keyspace is controlled fully by the contract. After verkle, it will no longer be, but still, I don’t see what the regression will be beyond a small performance hit (which the attacker will pay dearly for in other storage costs).

ihagopian · January 23, 2024, 12:07am

I don’t understand what the attack is. Even today it is possible to mine addresses near an account to create a deeper branch. It can be a bit more difficult with storage because, today, that keyspace is controlled fully by the contract.

Correct. I never claimed this is a “novel attack” strategy, just that it’s now possible for storage slots in VKTs, and considering this EIP uses a “system contract,” attacking those branches might have a different impact in the protocol than any other random branch from a contract. Totally fair if that is found irrelevant, just mentioned this as a point that might not be obvious to everybody.

I don’t see what the regression will be beyond a small performance hit (which the attacker will pay dearly for in other storage costs).

Sure, maybe (and I hope too!) it’s a small performance hit, but I’m not sure we can extrapolate the conclusion from MPT to VKT since:

VKTs use EC scalar multiplications to apply tree updates (i.e., to update commitments). These scalar mults are slower than Keccak.
The VKT will be shallower, so maybe this will offset the above fact. (It isn’t obvious to me.)

(which the attacker will pay dearly for in other storage costs).

In a VKT, you only need to find, for example, one key with an 8-byte prefix match to attack a branch and generate a branch of depth 8. Also, storage slots next to each other live in the same leaf node, meaning that a single key attacks 256 storage slots at once.
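
To make those two points concrete, here is a small Python sketch. It assumes 32-byte tree keys split into a 31-byte stem plus a 1-byte suffix, and a key derivation whose output is uniformly distributed; the helper names are hypothetical:

```python
STEM_LENGTH = 31  # assumed: keys sharing these bytes land in the same leaf node


def shared_prefix_bytes(a: bytes, b: bytes) -> int:
    """Number of leading bytes two tree keys have in common."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


def same_leaf(a: bytes, b: bytes) -> bool:
    """True when both keys share a stem, i.e. differ only in the last byte."""
    return a[:STEM_LENGTH] == b[:STEM_LENGTH]


def expected_attempts(prefix_bytes: int) -> int:
    """Expected brute-force tries to match a prefix, assuming uniform key output."""
    return 256 ** prefix_bytes


target = bytes(range(32))
attacker = target[:8] + b"\xff" * 24   # matches the first 8 bytes only
neighbour = target[:31] + b"\xff"      # same stem, different suffix
```

Here `shared_prefix_bytes(target, attacker)` is 8, `same_leaf(target, neighbour)` is true, and `expected_attempts(8)` is 2**64 under the uniformity assumption.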

This isn’t an argument to say that “the attacker will pay dearly for in other storage costs” is wrong. I’m just sharing some facts that explain why I don’t think it is obvious.

It’s totally possible I’m being paranoid – so I hope your intuition is correct! Having to solve this problem would be quite annoying.

ihagopian · February 26, 2024, 5:09pm

I wanted to signal that there’s currently a discussion on whether or not these branch poisoning attack vectors in VKT are a concerning problem. Also, to be clear, the concern is not coupled to the ring buffer idea since it’s a more general concern.

I’ve done some initial exploration: Verkle Trees - An exploration of tree branches attacks - HackMD, where I dive deeper into some points I made before. There might be more discussions until we conclude. In any case, all this is off-topic here.

Again, sorry if this was an unnecessary tangent in the thread!