EIP-8188: Last-Written Block for Accounts and Slots

First Step towards Hot-Cold State Separation

Update (2026-06-11): This EIP previously contained the specs for both recording block number and the pricing for state tiering. The state tiering part was moved to a separate proposal (EIP-8295).

For every account and storage slot, EIP-8188 records the block number at which it was last written. There is no gas change, no removal of state, no tree migration, no resurrection mechanism and no out-of-protocol dependency. It proposes a single consensus-visible field, last_written_block, that names the write-age of each piece of state.

The pricing that uses this signal lives in companion proposals, EIP-8295 and EIP-8296.

What EIP-8188 does

The full spec is in the EIP. TLDR:

  • The account RLP gains a fifth field, last_written_block. Storage slot encoding wraps the value in a list to add the same field.

  • The field is set to the current block number when the item is mutated. Reads never touch it.

  • No gas costs change. Read pricing, write pricing and EIP-2929’s warm/cold distinction are all untouched.
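The encoding change in the bullets above can be sketched with a minimal RLP encoder. This is an illustration only, not the EIP's reference code: the function names are hypothetical, and placing last_written_block after the value inside the slot list is an assumption (the EIP text is authoritative on field order).

```python
def encode_length(length: int, offset: int) -> bytes:
    """RLP length prefix for a string (offset 0x80) or a list (offset 0xC0)."""
    if length < 56:
        return bytes([offset + length])
    length_bytes = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([offset + 55 + len(length_bytes)]) + length_bytes


def rlp_encode(item) -> bytes:
    """Encode bytes, non-negative ints, or (nested) lists of them."""
    if isinstance(item, int):
        # Scalars are big-endian without leading zeros; 0 is the empty string.
        item = item.to_bytes((item.bit_length() + 7) // 8, "big")
    if isinstance(item, bytes):
        if len(item) == 1 and item[0] < 0x80:
            return item  # a single low byte is its own encoding
        return encode_length(len(item), 0x80) + item
    payload = b"".join(rlp_encode(x) for x in item)
    return encode_length(len(payload), 0xC0) + payload


def encode_account(nonce, balance, storage_root, code_hash, last_written_block=None):
    """Legacy accounts have four fields; post-fork accounts gain a fifth."""
    fields = [nonce, balance, storage_root, code_hash]
    if last_written_block is not None:
        fields.append(last_written_block)
    return rlp_encode(fields)


def encode_slot(value, last_written_block=None):
    """Legacy slots are a bare value; post-fork slots wrap it in a list.

    The [value, last_written_block] order is an assumption of this sketch.
    """
    if last_written_block is None:
        return rlp_encode(value)
    return rlp_encode([value, last_written_block])


# Placeholder 32-byte hash, for illustration only.
PLACEHOLDER_HASH = bytes(32)
legacy = encode_account(1, 10**18, PLACEHOLDER_HASH, PLACEHOLDER_HASH)
tagged = encode_account(1, 10**18, PLACEHOLDER_HASH, PLACEHOLDER_HASH, 20_000_000)
account_overhead = len(tagged) - len(legacy)
slot_overhead = len(encode_slot(1, 20_000_000)) - len(encode_slot(1))
```

With a block number that fits in four bytes, the sketch adds 5 bytes to an account and 6 bytes to a slot (the extra byte being the list prefix).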

The architectural unlock for EL clients

last_written_block gives every client the same write-age signal at consensus level. Today each client picks its own heuristic for which state is “fresh enough to keep hot”. The heuristics work, but they are not agreed across clients and they are invisible to the protocol. EIP-8188 names the boundary once, in consensus. Once named, every client can tier its storage along it without inventing its own definition, and the protocol can later price against the exact same boundary.

The benefit of tiering this way is to decouple the cost of state access and state-root computation from total state growth. If a client keeps only recently-written state on the hot path, the commit-critical work stops scaling with the entire state and starts scaling with the recently-mutated set. Long-term total storage is still a separate problem. This is about the mutable working set, not the size of the archive.

Geth, Nethermind, Besu, Ethrex (LSM clients)

These clients persist trie nodes into a single LSM-tree database (PebbleDB for Geth, RocksDB for Nethermind/Besu/Ethrex). Every trie node is a separate KV entry, and updates to inactive trie nodes pull reads from higher SSTable levels, which means more disk I/O.

The natural use of last_written_block is to separate cold subtrees out of the live LSM into an immutable, append-only flat file, leaving small stubs in the LSM that point at the cold blob. The live LSM stops carrying state that has not been touched in years, compaction sees fewer files, and the active set gets smaller.
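As a toy model of that layout (not the geth prototype), the idea can be sketched with an in-memory hot map, an append-only cold blob, and stubs that record where a value went. Everything here is hypothetical: the class and method names, the per-key granularity (the text above moves whole subtrees), and the cutoff-based selection.

```python
class TieredStore:
    """Toy hot/cold split keyed on last_written_block."""

    def __init__(self):
        self.hot = {}            # key -> (value, last_written_block)
        self.stubs = {}          # key -> (offset, length, last_written_block)
        self.cold = bytearray()  # append-only blob, never rewritten in place

    def write(self, key: bytes, value: bytes, block_number: int) -> None:
        # A write always lands in the hot store and drops any stub.
        # The old cold bytes stay behind as garbage in the blob.
        self.stubs.pop(key, None)
        self.hot[key] = (value, block_number)

    def read(self, key: bytes) -> bytes:
        # Reads never change last_written_block or move data between tiers.
        if key in self.hot:
            return self.hot[key][0]
        offset, length, _ = self.stubs[key]
        return bytes(self.cold[offset:offset + length])

    def archive_older_than(self, cutoff_block: int) -> int:
        """Move entries last written before cutoff_block into the cold blob."""
        cold_keys = [k for k, (_, blk) in self.hot.items() if blk < cutoff_block]
        for key in cold_keys:
            value, blk = self.hot.pop(key)
            self.stubs[key] = (len(self.cold), len(value), blk)
            self.cold += value
        return len(cold_keys)


store = TieredStore()
store.write(b"a", b"old", 100)
store.write(b"b", b"new", 900)
moved = store.archive_older_than(500)  # only b"a" is older than the cutoff
```

A real client would compress the blob and batch whole subtrees. The sketch only shows that the live store ends up holding stubs instead of values for cold entries.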

We prototyped this on geth (see Hot-cold storage separation in practice). Replaying mainnet from block 19,999,256 on write-age-tagged state, moving cold subtrees into a compressed archive:

  • The total on-disk footprint comes down, by roughly 22% with compression (a net saving on the order of 54 GB at the recommended cap setting), even though nothing is deleted from consensus.

  • The hot trie shrinks by around 60%, so compaction and the commit path run over a much smaller set.

  • The write-age metadata itself costs under 0.5% of the snapshot.

  • Rebuilding a moved-out subtree on access stays cheap (worst case a few dozen leaves).

These are storage-footprint numbers. Performance under real workloads is still to be benchmarked, and the implementation is unoptimized. The point is that the separation is real and the cold set genuinely stays cold.

Erigon

Erigon 3 already runs as a temporal database. State sits in an MDBX buffer plus four Domains (Accounts, Storage, Code, Commitment) whose recent writes are periodically frozen into compressed segment files. The architecture is close to what EIP-8188 enables, except that Erigon’s hot-cold boundary is its own freeze schedule, not write-age.

The Commitment Domain benefits most, since every state-root computation touches the trie up to the root, including paths that cross subtrees frozen years ago. With the metadata, branches in deeply cold territory can stay in deep snapshot files without being reinserted into the MDBX buffer, and a consensus-agreed boundary replaces the local freeze heuristic.

Reth

Reth uses MDBX and Nippy Jar. MDBX holds frequently-rewritten data such as plain state, hashed state, the trie tables, changesets and history indices. Nippy Jar holds append-only artefacts: headers, transactions, receipts, and eventually frozen plain-state segments. The substrate is already shaped for what EIP-8188 enables.

With a write-age boundary, Reth can split each of its three state representations along it:

  • Plain state. Active accounts and slots stay in PlainAccountState/PlainStorageState, while cold ones move into per-segment Nippy Jar files.

  • Hashed state. Same split, mirrored. The MerkleStage that recomputes hashed state every block runs over the active subset, and cold hashed state is read-only and stable.

  • Trie state. AccountsTrie and StoragesTrie are MDBX tables today. They can be similarly split, with cold subtrees mirrored to Nippy Jar and the in-MDBX trie keeping references.

FAQ

Why Hegotá and not later?

First, simplicity. The EIP is small: two RLP encoding changes with a clean way to distinguish legacy from post-fork entries, no tree migration, no resurrection mechanism and no networking changes.
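One way a decoder might tell legacy from post-fork entries is by shape alone: a legacy slot is an RLP string while a post-fork slot is an RLP list, and a legacy account list has four items while a post-fork one has five. The helper names below are hypothetical, and the EIP itself defines the authoritative rule.

```python
def item_length(data: bytes, pos: int) -> int:
    """Total encoded length (prefix plus payload) of the RLP item at pos."""
    b = data[pos]
    if b < 0x80:
        return 1                      # single byte
    if b < 0xB8:
        return 1 + (b - 0x80)         # short string
    if b < 0xC0:
        n = b - 0xB7                  # long string: n length bytes follow
        return 1 + n + int.from_bytes(data[pos + 1:pos + 1 + n], "big")
    if b < 0xF8:
        return 1 + (b - 0xC0)         # short list
    n = b - 0xF7                      # long list: n length bytes follow
    return 1 + n + int.from_bytes(data[pos + 1:pos + 1 + n], "big")


def count_list_items(data: bytes) -> int:
    """Count the top-level items of a single RLP-encoded list."""
    first = data[0]
    if first < 0xC0:
        raise ValueError("not an RLP list")
    pos = 1 if first < 0xF8 else 1 + (first - 0xF7)
    count = 0
    while pos < len(data):
        pos += item_length(data, pos)
        count += 1
    return count


def slot_is_post_fork(encoded: bytes) -> bool:
    # Legacy slots are bare strings (first byte < 0xC0); tagged slots are lists.
    return encoded[0] >= 0xC0


def account_is_post_fork(encoded: bytes) -> bool:
    # Four fields means legacy; a fifth field is last_written_block.
    return count_list_items(encoded) == 5


# Hand-built sample encodings with placeholder 32-byte hashes.
HASH = b"\xa0" + bytes(32)
legacy_account = b"\xf8\x4c" + b"\x01" + b"\x88" + (10**18).to_bytes(8, "big") + HASH + HASH
tagged_account = b"\xf8\x51" + legacy_account[2:] + b"\x84" + (20_000_000).to_bytes(4, "big")
legacy_slot = b"\x01"
tagged_slot = bytes.fromhex("c6018401312d00")
```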

Second, and more important, the signal only becomes useful once it has accumulated. last_written_block is only meaningful for state that has been written under the new rules. The earlier the field starts recording, the sooner the cold set is well-defined and the sooner clients and any future pricing can rely on it. Shipping the metadata in Hegotá and deciding the pricing afterward allows for cleaner separation and gives time to finalize the state tiering pricing.

Isn’t the signal alone pointless without pricing?

The signal alone gives clients a shared storage hint, which is already worth something. But it does not, by itself, make writing cold state more expensive, so it leaves the attack surface open: pulling deeply cold state back onto the hot path stays as cheap as any other write. That is the gap EIP-8295 closes. 8188 is the prerequisite, not the whole mechanism.

Why not state expiry?

State expiry is the bigger long-term lever, and it has been on the table for years. The blocker is resurrection: expired state has to come back when a user needs it, and that requires decentralized, censorship-resistant infrastructure to serve old state on demand. Without it, resurrection data can only come from centralized providers, which is a censorship vector.

EIP-8188 has no resurrection prerequisite. State never disappears. It just gets a label. Whatever decentralized state-serving infrastructure eventually ships will unblock real expiry. Until then, 8188 plus 8295 lets us attack the hot-path cost of state growth without waiting for that infrastructure.

Doesn’t the metadata bloat the state?

The field is small and most state never pays for it. An RLP-encoded block number adds around 5 bytes to an account and around 6 bytes to a storage slot (the slot also gains a one-byte list prefix). Worst case, if every item were rewritten after the fork:

  • 360M accounts × 5 bytes ≈ 1.8 GB

  • 1.5B storage slots × 6 bytes ≈ 9.0 GB

  • roughly 10.8 GB total

That is a worst case. Pre-fork state that is never re-touched keeps its legacy encoding and pays nothing. In the hot-cold prototype the realized metadata overhead was under 0.5% of the snapshot, and it is dwarfed by the multi-GB the separation saves on the hot path.
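The worst-case arithmetic can be reproduced in a few lines. The item counts and per-item byte costs are the figures quoted in the bullets above, and GB is taken as 10^9 bytes, which is an assumption about the unit.

```python
ACCOUNTS = 360_000_000           # account count used in the estimate above
STORAGE_SLOTS = 1_500_000_000    # storage slot count used in the estimate above
ACCOUNT_OVERHEAD_BYTES = 5       # RLP-encoded block number
SLOT_OVERHEAD_BYTES = 6          # same field plus a one-byte list prefix
GB = 10**9                       # decimal gigabyte (assumed unit)

account_bytes = ACCOUNTS * ACCOUNT_OVERHEAD_BYTES
slot_bytes = STORAGE_SLOTS * SLOT_OVERHEAD_BYTES
total_bytes = account_bytes + slot_bytes

print(f"accounts: {account_bytes / GB:.1f} GB")
print(f"slots:    {slot_bytes / GB:.1f} GB")
print(f"total:    {total_bytes / GB:.1f} GB")
```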

Is it compatible with future tree migration?

Yes. last_written_block is tracked per account and per slot. When the trie moves to a page-based layout (such as EIP-7864), the field aggregates to the page: page.last_written_block = max(item.last_written_block for item in page). The page stays active for as long as any item inside it is written, which matches how clients load and store a page as a unit anyway.
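A minimal sketch of that aggregation, assuming a page is just a collection of items that each carry their own last_written_block. The Item type and the choice of 0 for an empty page are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Item:
    key: str
    last_written_block: int


def page_last_written_block(items: Iterable[Item]) -> int:
    """A page is as recent as its most recently written item.

    Returning 0 for an empty page is an assumption of this sketch.
    """
    return max((item.last_written_block for item in items), default=0)


page = [Item("a", 100), Item("b", 250), Item("c", 7)]
page_block = page_last_written_block(page)
```

Here the page reports block 250: one recent write keeps the whole page active even though the other items are much older.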

Closing

EIP-8188 is the small, self-contained first step. It makes the recently-mutated set explicit at consensus level and gives every client the same boundary to tier storage along, without changing a single cost. On its own it is a storage hint with measured benefits. Paired with EIP-8295/EIP-8296 it becomes a pricing signal that stops treating writes to inactive state as if they were free.

Hiya! Sat down to re-read the EIP after today's ACDE. Here are my points/criticism. They are meant as constructive criticism and should not be interpreted as a complete rejection of this EIP :smiley: :+1:

I’ll reply here to the EIP as of EIPs/EIPS/eip-8188.md at 45b443dc26ffae9437f8e396e677b4242e067613 · ethereum/EIPs · GitHub

Why is the period measured by block number and not by block.timestamp or something else? If we get a lower slot time then we also have to adjust this period constant. I would assume that we would measure this in time spent inactive and not in how many blocks it was inactive :thinking:

current_period = max(0, (block_number - PERIOD_START_BLOCK) // PERIOD_LENGTH)

Again, a strong suggestion to move to timestamps or slot numbers, because we cannot know beforehand which block is the first block of the fork. The constant would then pick a block which, if no blocks are missed, is the fork block (so in practice the period will begin later in time).
I suggest the current period starts at 1, not 0, because otherwise the first period would do “nothing” to state (or update the state to the list variant where we would now write a 0, which is the default value). If we start at 1 then all new writes will upgrade the written slot from current period 0 to period 1 (“upgrading” the slot).
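To make the two options concrete, here is a sketch of the quoted block-based formula next to a timestamp-based variant that starts counting at 1. All constants are hypothetical placeholders, and the timestamp variant is only one possible reading of the suggestion.

```python
PERIOD_START_BLOCK = 25_000_000       # hypothetical
PERIOD_LENGTH_BLOCKS = 1_000_000      # hypothetical
PERIOD_START_TIME = 1_800_000_000     # hypothetical fork timestamp (seconds)
PERIOD_LENGTH_SECONDS = 12_000_000    # hypothetical, 1M blocks at 12 s slots


def period_by_block(block_number: int) -> int:
    """The formula quoted above: block-based, first period is 0."""
    return max(0, (block_number - PERIOD_START_BLOCK) // PERIOD_LENGTH_BLOCKS)


def period_by_timestamp(timestamp: int) -> int:
    """Timestamp-based variant: 0 means pre-fork, the first period is 1."""
    if timestamp < PERIOD_START_TIME:
        return 0
    return 1 + (timestamp - PERIOD_START_TIME) // PERIOD_LENGTH_SECONDS


def block_period_wall_clock_seconds(slot_seconds: int) -> int:
    """Wall-clock length of one block-based period at a given slot time."""
    return PERIOD_LENGTH_BLOCKS * slot_seconds
```

With these placeholder values, halving the slot time from 12 s to 6 s halves the wall-clock length of a block-based period, while the timestamp-based period is unaffected.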

Tier Gas Constants

INACTIVE > ACTIVE MUST hold.

What is meant by this? The inactive cost is higher than active?

Why is there an active write cost? This is extra cost on top of the existing gas schedule but it feels to me like this could also be described as the normal write cost (base cost), and you pay extra if you are in the inactive period.

It is not entirely clear to me when these write costs are applied. Are they always applied? Or is it only applied if we have to update the period?

  • New slot (zero → nonzero): created with last_written_period = current_period. Account period bumped.

This means that I first evaluate the tier (period of slot is 0) and then write the current period to it. At period INACTIVE_MIN_AGE there is an economical change: now all new state creations are suddenly more expensive because all new state creations are now inactive and thus pay more write gas.

SELFDESTRUCT (not same-tx)

What if it is the same tx? The target could get a balance if the value transfer is nonzero, so should also update the period. Note that selfdestruct happens after the transaction finishes, so the account is not deleted once you call this opcode (ideally we ship this with remove selfdestruct, saves the trouble to think about it)

Clients MAY keep Active state in a smaller mutable store and Inactive state in a larger stable store, or use any equivalent architecture such as multiple column families, multiple databases, different compaction policies, or different storage media.

I have a problem with “different storage media”, per the post above I read that this could be a HDD. This sounds like a DoS vector for a live node. Note that storage or accounts in Inactive state could still be read a lot, which would thus not make sense to move to another (slower) disk :thinking: (later in EIP this is also noted that this tracks write inactivity, not read inactivity, so this is known)

However, the actual impact should be quantified during benchmarking across different clients.

I think it's fine to measure in the added state trie bytes. After period 256, all state gets a byte bump if it gets written to. Some clients will likely create a lookup table for the write period, so that would add more overhead on the footprint of that client.

Relation with state creation cost (EIP-8037)

The EIP needs an update with the latest EIP-8037 spec, in particular how are the bytes added measured? So for period 1, storage slots would add 2 bytes to state and accounts 1 in case these are in “legacy” mode. Also note that if we upgrade (or create) in period 256 we now need an extra byte to store this. Updating period 255 → 256 would thus create an extra byte, would we charge for this? (cc @misilva73 )
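The byte growth in question can be checked with a short sketch, assuming the period is stored as a plain RLP scalar (the spec may define a different encoding). The function names are hypothetical.

```python
def rlp_scalar_len(n: int) -> int:
    """Encoded size in bytes of a non-negative integer as an RLP scalar."""
    if n < 0x80:
        return 1  # a single byte (0 encodes as 0x80, also one byte)
    return 1 + (n.bit_length() + 7) // 8  # length prefix plus big-endian bytes


def growth_points(max_period: int):
    """Periods at which the encoded size grows, as (period, new_size) pairs."""
    points = []
    previous = rlp_scalar_len(0)
    for period in range(1, max_period + 1):
        current = rlp_scalar_len(period)
        if current != previous:
            points.append((period, current))
        previous = current
    return points


steps = growth_points(300)
```

Under this assumption the size steps up at period 256 as described above, and also once earlier at period 128, where RLP switches from a single byte to a prefixed string. Any byte accounting would need to cover both steps.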

Note: tier-based pricing for writes is a great candidate for the state reservoir. I would apply it directly, so that any extra state cost due to inactivity could be paid from the reservoir.

Renewal gaming

I think this needs a deeper analysis, because this will depend on the ratio of INACTIVE/ACTIVE, the INACTIVE_MIN_AGE, and the write frequency of the slot. One could imagine at extreme values, say INACTIVE/ACTIVE = 1000, there is an economic demand to keep slots active: if the write frequency of this slot is once per INACTIVE_MIN_AGE, then there is an economic incentive to “write” to this value (change the value, and then change it back to the original, just to update the period), which would thus claim part of the block space to update these slots. I am not sure but I don’t think this kind of usage is wanted :thinking: The analysis calculates the cost necessary to keep a storage slot active from now to period X. If this cost is lower than paying the inactive “fee” then this is the incentive to do the period bump.
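The break-even described here can be sketched as a small cost comparison. The model is an assumption made for illustration: a slot counts as Inactive once current_period - last_written_period >= INACTIVE_MIN_AGE, each refresh pays a fixed transaction overhead plus the Active write cost, and all gas numbers are hypothetical.

```python
import math


def refreshes_needed(gap_periods: int, inactive_min_age: int) -> int:
    """Dummy writes needed so no two consecutive writes are MIN_AGE or more apart."""
    max_gap = inactive_min_age - 1
    if max_gap < 1:
        raise ValueError("with INACTIVE_MIN_AGE < 2 a refresh cannot help")
    return max(0, math.ceil(gap_periods / max_gap) - 1)


def refresh_is_cheaper(gap_periods: int, inactive_min_age: int,
                       active_cost: int, inactive_cost: int,
                       refresh_overhead: int) -> bool:
    """Compare keeping a slot Active via refreshes against letting it lapse."""
    n = refreshes_needed(gap_periods, inactive_min_age)
    keep_active = n * (refresh_overhead + active_cost) + active_cost
    lapse = inactive_cost if gap_periods >= inactive_min_age else active_cost
    return keep_active < lapse


# Hypothetical gas numbers: a slot written once every INACTIVE_MIN_AGE periods.
extreme_ratio = refresh_is_cheaper(2, 2, active_cost=5_000,
                                   inactive_cost=5_000_000, refresh_overhead=21_000)
mild_ratio = refresh_is_cheaper(2, 2, active_cost=5_000,
                                inactive_cost=10_000, refresh_overhead=21_000)
```

With the extreme 1000x ratio the refresh wins, and with a 2x ratio it does not, which is the dependence on INACTIVE/ACTIVE that the analysis would need to quantify.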

Situations

What happens if INACTIVE_MIN_AGE = 2, CURRENT_PERIOD = 10, and I clear an (active) storage slot in the same transaction (I set it to 0). Then later I set it from zero to nonzero. Do I now pay the inactive cost? Period Update Rule section seems to indicate I do.

General comments
The reasoning that these write periods should be added to the state trie so that clients can optimize their DB layout does not work for me. If clients want this, they can use an optional tracker in their DB (which also gives them a bit more freedom in how they track it). I do not see this as a good reason to bake the write period into the protocol, because every client is then forced to add this extra logic (and storage) for the feature, and a client can also decide to do nothing with that information. However, the practical experiment in the opening post shows that this could reduce the size of the “hot” DB by 78%. So under current “market conditions” (demand) such a DB layout could be lucrative (it would be a recommendation and would not need a fork to enable, as clients can switch to this DB strategy on their own).

Writing these last-written periods to the state also makes it easier for attackers to target storage which is in the inactive storage (reading/writing to this should take longer than active storage). This is obviously also possible today, but now this “feature” is baked in the protocol.

Also, for some storage slots which are never written to (but read a lot) it does not make sense to move these to the immutable, append-only flat file (SSTable?). I assume that reads from SSTable are also expensive/take time. So slots which are not written to but read a lot (for instance a proxy contract which would read the target contract from a slot) should not move to SSTable - right?

TODOs
The main point here is that the EIP in Hegota will be launched with Block Level Access lists. These writes when updating the period of some state item also need to be recorded in the BAL. If this is not recorded then clients using the BAL to calculate the state root (by applying all changes) will end up in a wrong state root if the period bumps are not part of the BAL. The recent version of EIP-8037 should also be taken into consideration.

I am not convinced we should do this: the inactive/active division is something which can already optionally be done in order to optimize the DB in the current situation. I feel that adding this price mechanism and state tiering to state will indeed result in DB layouts like this. I do not think the protocol should push clients to a particular DB layout. The idea to price by estimating the write cost based on last-written makes sense, but note that BALs will already announce to a node which specific state a block will write or read, such that this can be prefetched.

After ACDE’s comments yesterday, I wanted to address the points made about reads being left out, not being charged, etc.

We don’t want reads to update the metadata for a simple reason: most reads happen offchain. This means that data that is “hot” offchain (e.g. a user who looks up their balance every day but doesn’t do anything with it) stays hot for the client while the protocol sees it as cold.

So the reads that touch state inside the EVM aren’t actually a good way to measure or update whether something is hot. Though it can also be argued that it is better than nothing. The issue would be that the price of reading would become much larger, as we would always need to update the state root and subtree for any read. I don’t think we want that.

Agree

Agree

Yes, that is the intended meaning. Writes to Inactive state must cost more than writes to Active state. The exact constants are still undecided, but the intended shape is that Active writes stay close to today’s normal write cost, while Inactive writes pay a surcharge.

That is not the intended logic. Creations should not be classified as Inactive. I’ll update the spec to make the separation between state creation and state update explicit.

That said, state creation should still cost at least as much as the Inactive write tier. In the worst case, creating new state can require traversing and modifying parts of the trie that are themselves Inactive, so the gas schedule should reflect that.

It’s a loose definition. It really depends on clients’ implementations and actual gas cost differences. The proposal does not say that all Inactive state should move to slow storage.

A client can use the signal selectively. For example, in Geth, it may make sense to separate only inactive trie nodes while keeping the flat state on the fast path. In that design, reads still come from the regular state path, while the optimization mainly targets the commit and root-computation path.

Good point. The EIP needs to be updated against the latest EIP-8037 accounting model.

There are a few ways to handle this. One is to account for the exact extra bytes introduced by the period field when a legacy entry is upgraded. Another is to reserve enough bytes up front so that later growth in the period encoding does not keep changing the accounting model.

Also, if one period is on the order of months, hitting larger encodings such as period 256 is a very long-horizon issue. It still needs to be specified, but it does not seem like the main design risk.

What do you mean by state reservoir?

I think that is acceptable and in fact part of the design. If a user wants to keep some state Active because they expect future writes, then periodically refreshing it is the maintenance cost they pay for that privilege.

The goal is not to make such behavior impossible, but to make it explicit and paid for.

In that case, the zero-to-nonzero write should be treated as state creation, not as a write to Inactive state. Will make it more explicit.
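A sketch of the classification described in this reply, under stated assumptions: a write to a slot whose current value is zero counts as creation regardless of any stored period, and a slot is Inactive once current_period - last_written_period >= INACTIVE_MIN_AGE. The names and the exact inequality are hypothetical until the spec is updated.

```python
from enum import Enum


class WriteKind(Enum):
    CREATION = "creation"
    ACTIVE = "active"
    INACTIVE = "inactive"


def classify_write(old_value: int, last_written_period: int,
                   current_period: int, inactive_min_age: int) -> WriteKind:
    """Classify a nonzero write to a storage slot for pricing purposes."""
    if old_value == 0:
        # Zero-to-nonzero is state creation, never an Inactive write.
        return WriteKind.CREATION
    age = current_period - last_written_period
    if age >= inactive_min_age:
        return WriteKind.INACTIVE
    return WriteKind.ACTIVE


# The scenario from the question: INACTIVE_MIN_AGE = 2, current period 10,
# slot cleared to zero earlier in the transaction, then set to nonzero.
cleared_then_set = classify_write(old_value=0, last_written_period=0,
                                  current_period=10, inactive_min_age=2)
```

Under these assumptions the cleared-then-set slot is priced as creation, so the Inactive surcharge does not apply to it.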

Yes every client can do this internally. In fact, Erigon 3 is the only one that currently does this using a notion of timeline via tx number.

The issue is that a local tracker exposes security vulnerabilities. Without writes to Inactive state being more expensive, malicious actors can target Inactive state more explicitly. If we bake this in-protocol, gas becomes a protection against these attacks.

Also there’s a problem with sync. If the metadata lives only in client-local DB structures, it’s not part of the canonical state and cannot be independently verified.

In this proposal, we have a write-recency signal, but not a read-recency signal. So it should not be interpreted as a rule that all Inactive state belongs in a cold store. Another way to think about it is write-optimized and read-optimized storage separation.

I don’t think BAL needs to carry the period explicitly. Clients can derive the current period from block fields (i.e. timestamp or block number). State creation or updates will use the latest period anyways.

Agreed, and that is not the intent. The proposal should not prescribe a specific storage architecture. What it does provide is a consensus-visible signal that clients may choose to use in different ways. To what extent clients will utilize this information requires more discussion.

That is orthogonal. BALs are about access disclosure for execution flow. Having the write signal is about state classification for pricing and storage-policy decisions. The two can complement each other, but they solve different problems.

Side note:
As of writing this, I’m personally leaning towards something simpler and more flexible. That is, we use block number instead of period for per-state metadata, and ship this for Hegotá first. Later in the future we can decide how we want to do the gas pricing for Inactive state. The downside is we’d have to store more bytes if we use block number (4 bytes given the current block number).

First, is it true that writing to a never before initialized account will incur the inactive storage write cost? Does that correctly reflect the real cost of writing such an account?


Is there a case to be made for charging INACTIVE_ACCOUNT_READ / INACTIVE_STORAGE_READ but importantly, not updating the last_written_period on read?

This would allow moving infrequently accessed state to a slower storage medium, but would penalize old read-only state even if it is frequently accessed.

That’s one possibility that we can look into. Like you said, frequently read data will be penalized for this. Will have to check what’s the current share of this set of state.

No, that’s not true. I will update the spec to clarify this part. However, there is a case to be made that state creation should not be cheaper than an inactive state write. The justification is that clients could move inactive parts of the trie to the inactive storage, but a state creation might still have to access these inactive portions.