EIP-8188: State Tiering by Write Age

State hygiene without state expiry: the case for EIP-8188 in Hegotá

EIP-8188 introduces renewal-age pricing for state writes. Each account and each storage slot gets a new last_written_period field. Writes to state that hasn’t been touched in roughly a year (the Inactive tier) cost more than writes to recently-mutated state (the Active tier). Reads are unchanged. There is no removal of state, no tree migration, no resurrection mechanism and no out-of-protocol infrastructure dependency.

What EIP-8188 does

The full spec is in the EIP. TLDR:

  • A global period counter is derived from block number: period = (block_number - PERIOD_START_BLOCK) // PERIOD_LENGTH. Target period length is around six months.

  • The account RLP gains a fifth field, last_written_period. Storage slot encoding wraps the value in a list to add the same field. Legacy entries decode as last_written_period = 0.

  • A state item is Active if current_period - last_written_period < INACTIVE_MIN_AGE, otherwise Inactive.

  • Writes to Active state pay ACTIVE_*_WRITE gas. Writes to Inactive state pay INACTIVE_*_WRITE. Reads are unchanged.

  • last_written_period is updated only on actual mutation, and the tier is evaluated before the write. So a write that lifts a slot from Inactive to Active still pays the Inactive cost on the way out.

  • This is independent from EIP-2929’s warm/cold cache distinction, which still applies to reads.

What it explicitly does not do: remove state, expire state or change how the trie is shaped.
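
The rules above can be sketched in a few lines of Python. This is a minimal illustration, not the EIP's reference code: the constant values are placeholders chosen for readability (the EIP defines the real ones), ACTIVE_WRITE_GAS and INACTIVE_WRITE_GAS stand in for the ACTIVE_*_WRITE and INACTIVE_*_WRITE families, and the StateItem type and write helper are hypothetical names. How a no-op write is priced is not covered here, so the sketch simply returns 0 for it.

```python
from dataclasses import dataclass

# Placeholder values for illustration only; the EIP defines the real constants.
PERIOD_START_BLOCK = 0
PERIOD_LENGTH = 1_300_000    # assumed: ~6 months of 12-second blocks
INACTIVE_MIN_AGE = 2         # assumed: ~1 year, expressed in periods
ACTIVE_WRITE_GAS = 5_000     # stand-in for an ACTIVE_*_WRITE constant
INACTIVE_WRITE_GAS = 20_000  # stand-in for an INACTIVE_*_WRITE constant


@dataclass
class StateItem:
    """Hypothetical account or storage slot carrying the new metadata."""
    value: int
    last_written_period: int = 0  # legacy entries decode as 0


def current_period(block_number: int) -> int:
    return (block_number - PERIOD_START_BLOCK) // PERIOD_LENGTH


def is_active(item: StateItem, period: int) -> bool:
    return period - item.last_written_period < INACTIVE_MIN_AGE


def write(item: StateItem, new_value: int, block_number: int) -> int:
    """Apply a write and return the tier gas charged for it."""
    if new_value == item.value:
        return 0  # no mutation: metadata untouched (pricing left to the EIP)
    period = current_period(block_number)
    # The tier is evaluated BEFORE the write updates the metadata, so a
    # write that renews an Inactive item still pays the Inactive cost.
    gas = ACTIVE_WRITE_GAS if is_active(item, period) else INACTIVE_WRITE_GAS
    item.value = new_value
    item.last_written_period = period
    return gas
```

With these placeholder values, a legacy item written in period 5 pays the Inactive cost once, and a second write in the same period pays the Active cost.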

The architectural unlock for EL clients

last_written_period gives every client the same write-age signal at consensus level. Today, each client picks its own heuristic for which state is “fresh enough to keep hot”. The heuristics work, but they aren’t visible to gas pricing, and they don’t agree across clients. EIP-8188 names the boundary in the protocol. Once named, every client can tier its storage along it without inventing its own definition.

The benefit of tiering this way is to decouple the cost of state access and state root computation from total state growth. The Active state DB becomes economically rate-limited by gas over the renewal window. It is not strictly bounded, but clients get a shared protocol signal for separating mutable and stable state, which makes per-block performance predictable even as the state grows. Long-term storage cost is still a separate problem to solve.

Geth, Nethermind, Besu, Ethrex (LSM clients)

These clients persist trie nodes into a single LSM-tree database (PebbleDB for Geth, RocksDB for Nethermind/Besu/Ethrex). Every trie node is a separate KV entry, and updates to inactive trie nodes require reads from higher SSTable levels, which means more disk I/O.

The natural use of last_written_period is to separate Inactive subtrees out of the live LSM into an immutable, append-only flat file, leaving small stubs in the LSM that point at the cold blob. The live LSM stops carrying state that hasn’t been touched in years, compaction sees fewer files, and the active set gets smaller.
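
As a rough picture of the stub idea, here is a toy in-memory model. It is a sketch under heavy simplification and not the prototype's code: a dict stands in for the live LSM, a bytearray stands in for the append-only flat file, demotion is done per node rather than per subtree, and the names Stub and TieredStore are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Stub:
    """Small pointer left in the live store for a node moved to cold storage."""
    offset: int
    length: int


@dataclass
class TieredStore:
    live: dict = field(default_factory=dict)            # stands in for the LSM
    cold: bytearray = field(default_factory=bytearray)  # stands in for the flat file

    def put(self, key: bytes, node: bytes) -> None:
        # A write always lands in the live store. If it replaces a stub,
        # the old cold bytes are left in place because the file is append-only.
        self.live[key] = node

    def demote(self, key: bytes) -> None:
        """Move an Inactive node to the cold file, leaving a stub behind."""
        node = self.live[key]
        if isinstance(node, Stub):
            return  # already cold
        stub = Stub(offset=len(self.cold), length=len(node))
        self.cold += node
        self.live[key] = stub

    def get(self, key: bytes) -> bytes:
        node = self.live[key]
        if isinstance(node, Stub):
            return bytes(self.cold[node.offset:node.offset + node.length])
        return node
```

Reads follow the stub transparently, while a later write simply replaces the stub in the live store, which is why surviving stubs are a direct measure of how cold the demoted state stays.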

We’ve prototyped this on geth and replayed 2.6M mainnet blocks (head 19,999,256 → 22,627,956), roughly one year’s worth of blocks, on top of EIP-8188-converted state. Some numbers:

  • Trie bytes in PebbleDB: 148 GB → 33 GB (a 78% reduction).

  • Trie bytes in flat file: 162 GB, append-only, can be stored in slower storage (e.g. HDD).

  • 80% of the original stubs survived the entire block import. Cold state really does stay cold.

  • Performance benchmarks: TBD.

Note that this implementation is far from an optimized one. We can further reduce the size of the flat file with compression.
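
For readers who want to check the headline figures, the arithmetic is short. The only assumption added here is a flat 12 seconds per block with no missed slots; the block numbers and sizes are the ones quoted above.

```python
SECONDS_PER_BLOCK = 12  # assumed: nominal slot time, ignoring missed slots
SECONDS_PER_DAY = 86_400

first_block, last_block = 19_999_256, 22_627_956
blocks_replayed = last_block - first_block
days_replayed = blocks_replayed * SECONDS_PER_BLOCK / SECONDS_PER_DAY

pebble_before_gb, pebble_after_gb = 148, 33
reduction = (pebble_before_gb - pebble_after_gb) / pebble_before_gb

print(f"blocks replayed: {blocks_replayed:,}")
print(f"approx. days:    {days_replayed:.1f}")
print(f"PebbleDB trie reduction: {reduction:.0%}")
```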

Erigon

Erigon 3 already runs as a temporal database. State sits in an MDBX buffer plus four Domains (Accounts, Storage, Code, Commitment) whose recent writes get periodically frozen into compressed segment files. The architecture is very close to what EIP-8188 wants, except that the boundary between hot and cold is step-based, defined by Erigon’s own freeze schedule, not write-age-based.

The Commitment Domain especially benefits from EIP-8188, since every state root computation touches the trie all the way to the root, including paths that cross subtrees frozen potentially years ago. With the metadata, branches in deeply-Inactive territory can stay in deep snapshot files without being reinserted back into the MDBX buffer. The Inactive write-tier gas charge then reflects the real cost of that pull-back when it does happen.

Reth

Reth uses MDBX and Nippy Jar. MDBX holds frequently-rewritten data such as plain state, hashed state, the trie tables, changesets and history indices. Nippy Jar holds append-only blockchain artefacts: headers, transactions, receipts, and eventually frozen plain-state segments. The substrate is already shaped for what EIP-8188 enables.

With state tiering, Reth can split each of its three state representations along the Active/Inactive axis:

  • Plain state. Active accounts and slots stay in PlainAccountState/PlainStorageState, while Inactive ones move into per-segment Nippy Jar files.

  • Hashed state. Same split, mirrored. The MerkleStage that recomputes hashed state every block runs over the Active subset, and Inactive hashed state is read-only and stable.

  • Trie state. AccountsTrie and StoragesTrie are MDBX tables today. They can be similarly split, with Inactive subtrees mirrored to Nippy Jar and the in-MDBX trie keeping references.

FAQ

Why Hegotá and not later?

First, simplicity. The EIP is small: two RLP encoding changes (with a clean way to distinguish legacy from post-fork entries), no tree migration, no resurrection mechanism and no networking changes.

Second, the metadata and the tier pricing don’t have to ship at the same time. The metadata is the gating dependency: it has to start accumulating from the activation fork, and until then there’s no separation of Active and Inactive state. The tier pricing can activate at the same fork, at fork+N, or after a configurable delay. Shipping the metadata in Hegotá and perhaps scheduling tier-pricing activation later gives client teams buffer time to ship their storage optimizations.
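
On the first point, the legacy versus post-fork distinction can be read off the encoding itself. The sketch below is illustrative and makes two assumptions that should be checked against the EIP: that a post-fork slot is the RLP list [value, last_written_period] in that order, and that an account's items have already been RLP-decoded into a Python list. It relies only on standard RLP prefix rules (a first byte below 0xC0 is a string, 0xC0 to 0xF7 is a short list), and the function names are hypothetical.

```python
def _short_string(data: bytes, pos: int) -> tuple[bytes, int]:
    """Decode one RLP string shorter than 56 bytes; return (payload, next_pos)."""
    prefix = data[pos]
    if prefix < 0x80:   # a single byte below 0x80 encodes itself
        return data[pos:pos + 1], pos + 1
    if prefix <= 0xB7:  # 0x80 + length, followed by the payload
        length = prefix - 0x80
        return data[pos + 1:pos + 1 + length], pos + 1 + length
    raise ValueError("long strings are not expected here")


def decode_slot(encoded: bytes) -> tuple[int, int]:
    """Return (value, last_written_period) for a legacy or post-fork slot."""
    if encoded[0] < 0xC0:
        # Legacy form: a bare RLP string, so the period defaults to 0.
        value, _ = _short_string(encoded, 0)
        return int.from_bytes(value, "big"), 0
    if encoded[0] > 0xF7:
        raise ValueError("long lists are not expected here")
    # Assumed post-fork form: the short list [value, last_written_period].
    end = 1 + (encoded[0] - 0xC0)
    value, pos = _short_string(encoded, 1)
    period, pos = _short_string(encoded, pos)
    if pos != end or end != len(encoded):
        raise ValueError("malformed slot encoding")
    return int.from_bytes(value, "big"), int.from_bytes(period, "big")


def account_period(fields: list[bytes]) -> int:
    """Read last_written_period from already-decoded account items."""
    if len(fields) == 4:  # legacy account: nonce, balance, storage root, code hash
        return 0
    if len(fields) == 5:  # post-fork account: fifth field is the period
        return int.from_bytes(fields[4], "big")
    raise ValueError("unexpected number of account fields")
```

For example, decode_slot(b"\x2a") yields (42, 0) for a legacy slot, while decode_slot(b"\xc2\x2a\x03") yields (42, 3) under the assumed list layout.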

Why not state expiry?

State expiry is the bigger long-term lever, and it has been on the table for a long time. The major blocker is the resurrection mechanism. Expired state has to come back if a user needs it, and that “coming back” requires decentralized and censorship-resistant infrastructure to serve old state on demand. Without it, resurrection data can only be served by centralized providers (a censorship vector).

EIP-8188 has no resurrection prerequisite. State doesn’t disappear. It just costs more to write. Whatever decentralized state-provider infrastructure eventually ships will unblock expiry. Until that lands, 8188 allows us to solve parts of the problem stemming from state bloat.

Doesn’t the metadata bloat the state?

Worst case ~3.4 GB total: one byte per account (360M accounts × 1 byte ≈ 0.36 GB) plus two bytes per storage slot (1.5B slots × 2 bytes ≈ 3 GB). And only state that gets written post-fork carries the new encoding: pre-fork state that’s never re-touched keeps the legacy form and pays no overhead at all. In our geth prototype, the added overhead is only 300 MB.
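
The worst-case figure is plain multiplication and can be reproduced as follows. The account and slot counts and the per-item byte costs are the ones quoted above; the function name is hypothetical.

```python
ACCOUNTS = 360_000_000         # approximate mainnet account count quoted above
STORAGE_SLOTS = 1_500_000_000  # approximate mainnet storage slot count quoted above


def metadata_overhead_bytes(accounts: int, slots: int,
                            bytes_per_account: int = 1,
                            bytes_per_slot: int = 2) -> int:
    """Worst case: every account and every slot carries the new field."""
    return accounts * bytes_per_account + slots * bytes_per_slot


worst_case = metadata_overhead_bytes(ACCOUNTS, STORAGE_SLOTS)
print(f"worst-case overhead: {worst_case / 1e9:.2f} GB")
```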

What about state that’s read-only but actively accessed?

Reads cost the same as today. The tier mechanism only affects writes. A contract that’s heavily read but never written sees no change. Its last_written_period stays at whatever it was at deployment, and over time the slot decays into the Inactive tier.

The case that is actually affected is state that’s read constantly and written rarely but not never (e.g. a configuration value updated once a year). The yearly update pays the Inactive cost. Alternatively, users can keep the relevant state Active by periodically writing to it.

Won’t users keep rewriting state to keep it Active?

Yes. The whole point of renewal-age pricing is to make state maintenance an ongoing cost paid by the parties who want cheap future writes. If you want a slot to stay Active, you pay to refresh it. If you don’t care, you don’t, and your eventual write pays the Inactive cost.

The Active set isn’t strictly bounded by protocol, but it’s economically rate-limited by the gas limit over the renewal window. The cheapest possible refresh determines the upper bound on Active-set growth per period. On mainnet today that bound is much smaller than total state.

What if a malicious actor tries to keep all state Active forever?

The attacker is bounded by what’s externally writable. Storage slots only refresh via SSTORE in their owning contract, so an outsider can only refresh slots via a publicly callable path.

Accounts are the workable exploit surface. An account’s period can be bumped on any nonzero balance transfer, so 1 wei to any EOA bumps that account back to Active. So the attack strategy is to refresh every account on every new period.

The cheapest delivery is a disperse-style batcher contract with a function that sends to a batch of addresses. Each inner CALL with 1 wei sent costs roughly 12,200 gas (9,000 for G_callvalue + 2,600 cold address access + ~600 for calldata and loop overhead). Under present mainnet conditions: 360M accounts × 12,200 gas ≈ 4.4 Tgas per cycle. At ~0.2 gwei base fee and USD 2,290/ETH, that’s roughly USD 2M per 6-month period.

The amount is within reach of a state-backed actor. But it’s not as bad as it seems. Storage slots outnumber accounts by roughly 4× on mainnet (per the state growth page). The attacker can guarantee accounts exist in the Active portion, but inactive storage slots stay frozen regardless.
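
The cost estimate for refreshing every account can be reproduced with the numbers quoted above. Like the estimate itself, this sketch ignores per-transaction intrinsic gas and priority fees, and the variable names are illustrative.

```python
G_CALLVALUE = 9_000          # surcharge for a CALL that transfers value
COLD_ACCOUNT_ACCESS = 2_600  # EIP-2929 cold address access
OVERHEAD = 600               # rough calldata and loop overhead quoted above

ACCOUNTS = 360_000_000
BASE_FEE_WEI = 200_000_000   # 0.2 gwei
USD_PER_ETH = 2_290
WEI_PER_ETH = 10**18

gas_per_refresh = G_CALLVALUE + COLD_ACCOUNT_ACCESS + OVERHEAD
total_gas = ACCOUNTS * gas_per_refresh
cost_eth = total_gas * BASE_FEE_WEI / WEI_PER_ETH
cost_usd = cost_eth * USD_PER_ETH

print(f"gas per refresh: {gas_per_refresh:,}")
print(f"total gas per period: {total_gas / 1e12:.2f} Tgas")
print(f"cost per period: {cost_eth:,.1f} ETH, about USD {cost_usd:,.0f}")
```

The result scales linearly with the base fee, so the same attack costs ten times as much at 2 gwei.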

Is it compatible with future tree migration?

Yes. last_written_period is currently tracked per-account and per-slot. When the trie moves to a page-based layout (such as EIP-7864), the period aggregates to the page (i.e. page.last_written_period = max(slot.last_written_period for slot in page)). This keeps the page Active for as long as any slot inside it is being written, which should match how clients load and store the page as a unit anyway.
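
The aggregation rule is small enough to write out. This is a sketch only: the INACTIVE_MIN_AGE value is a placeholder, the function names are hypothetical, and treating an empty page as period 0 is an assumption made here to match how legacy entries decode.

```python
INACTIVE_MIN_AGE = 2  # placeholder value; the EIP defines the real constant


def page_last_written_period(slot_periods: list[int]) -> int:
    """A page is as fresh as its most recently written slot."""
    return max(slot_periods, default=0)  # assumed: an empty page counts as period 0


def page_is_active(slot_periods: list[int], current_period: int) -> bool:
    age = current_period - page_last_written_period(slot_periods)
    return age < INACTIVE_MIN_AGE
```

A page whose slots were last written in periods 0, 3 and 7 is therefore Active in period 8 and Inactive from period 9 onward, regardless of how old the other slots are.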

Closing

EIP-8188 is the smallest change we can make to the state that turns state hygiene into a continuing cost without depending on infrastructure that doesn’t yet exist. The whole mechanism is two RLP fields, two write-cost tiers, and a configurable activation delay. Every client can use the same metadata to optimize its own storage in its own way.

Hegotá should ship the metadata. Tier pricing can come at the same fork or later. By the time it activates, every client will have had real mainnet data to optimize against.