SELFDESTRUCT usage report—post-Cancun Mainnet, blocks 20M–25M
Published dataset (472,641 events): https://gist.github.com/chfast/7923cc97710e6e788b2fcd840c17ffb0
1. Data format
The dataset is CSV with one header line and one row per SD-SAMETX event. Rows are appended in collection order, which is roughly chronological by block. All hex fields are lowercase without a 0x prefix.
Header
block,tx,addr,ben,balance,origin,caller,initcode,codesize,codehash
Columns
| Column | Type | Meaning |
|---|---|---|
| block | decimal uint64 | block number containing the SELFDESTRUCT |
| tx | 32-byte hex | transaction hash |
| addr | 20-byte hex | address of the contract being destroyed |
| ben | 20-byte hex | beneficiary (recipient of addr’s ETH balance) |
| balance | decimal wei | ETH balance at SD opcode time |
| origin | 20-byte hex | top-level signer EOA (evm.TxContext.Origin) |
| caller | 20-byte hex | immediate caller of the destroyed contract, typically the factory in the deploy-and-die-in-constructor pattern; equals origin for direct contract-creation transactions |
| initcode | 0 or 1 | 1 ⇒ SD fired from inside the constructor (runtime code never persisted); 0 ⇒ SD fired after construction (runtime code is currently in state) |
| codesize | decimal bytes | (only when initcode=0; empty otherwise) size of the persisted runtime code |
| codehash | 32-byte hex | (only when initcode=0; empty otherwise) keccak256 of the persisted runtime code |
Detection note for initcode: the flag is derived from IntraBlockState.GetCodeSize(self), read at the SD opcode, and is 1 when that size is 0. During a constructor frame the runtime code has not yet been deposited (SetCode runs after evm.Run() returns), so a zero code size reliably marks the in-constructor case. A persistent contract with empty runtime code can’t reach SD (calls to empty-code addresses don’t dispatch any opcode), so the equivalence initcode == 1 ⇔ in-constructor-frame holds in practice.
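A minimal sketch of reading this format in Python, assuming the header and column semantics described above. The `SdEvent` class, the `read_events` helper, and the two sample rows are hypothetical illustrations (the sample values are made up, not taken from the dataset).

```python
import csv
import io
from dataclasses import dataclass
from typing import Iterator, Optional, TextIO


@dataclass(frozen=True)
class SdEvent:
    block: int
    tx: str
    addr: str
    ben: str
    balance: int              # wei; Python ints are arbitrary precision
    origin: str
    caller: str
    in_constructor: bool      # the initcode column
    codesize: Optional[int]   # None when the column is empty (initcode=1)
    codehash: Optional[str]   # None when the column is empty (initcode=1)


def read_events(f: TextIO) -> Iterator[SdEvent]:
    """Yield one SdEvent per CSV row; the header line supplies the field names."""
    for row in csv.DictReader(f):
        yield SdEvent(
            block=int(row["block"]),
            tx=row["tx"],
            addr=row["addr"],
            ben=row["ben"],
            balance=int(row["balance"]),
            origin=row["origin"],
            caller=row["caller"],
            in_constructor=row["initcode"] == "1",
            codesize=int(row["codesize"]) if row["codesize"] else None,
            codehash=row["codehash"] or None,
        )


# Synthetic sample: one in-constructor event and one post-construction event.
HEADER = "block,tx,addr,ben,balance,origin,caller,initcode,codesize,codehash"
ROW_CTOR = ["20000001", "aa" * 32, "01" * 20, "02" * 20, "0",
            "03" * 20, "04" * 20, "1", "", ""]
ROW_RUNTIME = ["20000002", "cc" * 32, "05" * 20, "02" * 20, str(10**18),
               "03" * 20, "04" * 20, "0", "45", "bb" * 32]
SAMPLE = "\n".join([HEADER, ",".join(ROW_CTOR), ",".join(ROW_RUNTIME)]) + "\n"

events = list(read_events(io.StringIO(SAMPLE)))
for e in events:
    print(e.block, e.addr[:8], e.in_constructor, e.codesize)
```

Mapping the empty codesize and codehash fields to `None` keeps the initcode=1 rows from being mistaken for zero-length code in later sums.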
Scope: SD-SAMETX only
This dataset contains only SD-SAMETX events (every SELFDESTRUCT of a same-tx-created contract post-Cancun, i.e., the path where EIP-6780 actually destroys the account). The Erigon instrumentation that produced this data also emits three other tags — SD-BURN, SD-BROKEN-BURN, FINAL-BURN — covering orthogonal phenomena (burn accounting, EIP-6780-blocked burns, deferred destruction). They are not in this CSV and not analyzed here, because they don’t bear on the “what would EIP-6780 removal cost in state bloat” question this report investigates.
2. Headline numbers (SD-SAMETX only)
| Metric | Value |
|---|---|
| Total SD-SAMETX events | 472,641 |
| Block range | 20,000,000 → 24,999,999 (5.00 M blocks, ~694 days @ 12s) |
| Unique CREATE2 addresses (= wasted accounts if SD disabled) | 413,508 |
| Hard-fail events (events − unique addrs; CREATE2 collisions on redeploy) | 59,133 (12.51%) |
| Wasted code bytes (Σ runtime codesize over unique addrs, initcode=0) | 3,268,302 B (≈ 3.27 MB) |
| Unique runtime codehashes among initcode=0 contracts | 305 |
| Wasted code bytes deduplicated by codehash (one copy per unique hash) | 909,207 B (≈ 909 KB) |
How each metric is computed
The model is: in a no-SD world, the first deploy at each unique CREATE2 address succeeds and leaves a persistent account; every later deploy at the same address hard-fails (the CREATE2 collision check rejects deployment when the target address has nonce != 0 or non-empty codehash).
wasted_accounts = N_unique_addrs
(each unique CREATE2 address leaves exactly one
persistent account — class and initcode don't matter
for the count, only for what's *in* the account)
hard_fail_events = N_total_events − N_unique_addrs
(events at addrs that already exist in state would
hit the CREATE2 collision and revert)
wasted_code_bytes = Σ codesize over unique addrs with initcode=0
(initcode=1 means runtime code never gets deposited
— the persistent account would carry empty code,
so it doesn't contribute to code bytes)
wasted_code_bytes_dedup = Σ codesize over unique codehashes with initcode=0
(Erigon's state stores Code keyed by codehash, so
identical templates deployed at many addresses
share a single code blob in the trie)
The 3.27 MB → 909 KB compression ratio (≈ 3.6×) reflects that the same Deposit/Wrapper templates are deployed at many distinct addresses. State would actually only gain ≈ 909 KB of new code but ≈ 414k new account records.
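The four formulas can be sketched as a single pass over the events. This is an illustration rather than the script that produced the table: the function name `headline_metrics` and the tuple layout are hypothetical, and it assumes that the first event seen at each address determines what persists there (matching the "first deploy succeeds" model) and that equal codehashes imply equal codesizes.

```python
def headline_metrics(events):
    """events: iterable of (addr, initcode, codesize, codehash) in log order.

    initcode is 0 or 1; codesize and codehash are None when initcode == 1.
    """
    seen_addrs = set()
    size_by_hash = {}          # one entry per distinct runtime codehash
    total_events = 0
    wasted_code_bytes = 0

    for addr, initcode, codesize, codehash in events:
        total_events += 1
        if addr in seen_addrs:
            continue           # later deploy at a known address: hard-fail
        seen_addrs.add(addr)
        if initcode == 0:      # only persisted runtime code contributes bytes
            wasted_code_bytes += codesize
            size_by_hash[codehash] = codesize

    return {
        "total_events": total_events,
        "wasted_accounts": len(seen_addrs),
        "hard_fail_events": total_events - len(seen_addrs),
        "wasted_code_bytes": wasted_code_bytes,
        "wasted_code_bytes_dedup": sum(size_by_hash.values()),
    }


# Synthetic example: two addresses share template h1, a1 is redeployed once,
# a3 self-destructs inside its constructor, a4 uses a second template.
demo = [
    ("a1", 0, 100, "h1"),
    ("a2", 0, 100, "h1"),
    ("a1", 0, 100, "h1"),
    ("a3", 1, None, None),
    ("a4", 0, 40, "h2"),
]
metrics = headline_metrics(demo)
print(metrics)
```

With the published totals, the same arithmetic gives 472,641 − 413,508 = 59,133 hard-fail events and 3,268,302 / 909,207 ≈ 3.6 for the deduplication ratio.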
3. Classes of usage
Each event is classified into one of six classes by the following decision tree (first match wins):
A_DirectScript origin == caller
(top-level CREATE tx; no factory contract)
B_SharedFactory caller has ≥ 2 distinct origins
(one factory contract used by multiple operators)
C_SybilBatch origin uses ≥ 3 distinct factories AND events/tx ≥ 3
(single operator rotating across many factories,
batch-deploying many ephemerals per tx)
D_MultiTenantAggregator (origin, caller) has ≥ 5 distinct bens
AND events/ben < 50
(one operator + one factory routing to many
customer wallets, light per-ben volume)
E_DepositForwarder the event's addr appears ≥ 2 times in the log
(CEX-style: same CREATE2 address redeployed)
F_OneShot otherwise
(single-tenant deposit forwarder without address
reuse — fresh CREATE2 each time)
The cutoffs (factories ≥ 3 && events/tx ≥ 3 for sybil, bens ≥ 5 && events/ben < 50 for multi-tenant) were tuned on a smaller sample so the multi-tenant detector wouldn’t trip on single-tenant operators rotating across a handful of hot wallets.
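The decision tree can be sketched as a two-pass classifier: one pass to collect the aggregates the rules refer to, one pass to label each event. The report does not spell out every denominator, so this sketch makes assumptions that should be checked against the real tooling: "factories" are an origin's distinct callers other than itself, events/tx is an origin's event count divided by its distinct transaction count, and events/ben is computed per (origin, caller) pair. The function name `classify` and the dict keys are hypothetical.

```python
from collections import Counter, defaultdict


def classify(events):
    """events: list of dicts with keys tx, addr, ben, origin, caller.

    Returns one class label per event, in input order (first match wins).
    """
    origins_by_caller = defaultdict(set)
    factories_by_origin = defaultdict(set)
    txs_by_origin = defaultdict(set)
    events_by_origin = Counter()
    bens_by_pair = defaultdict(set)
    events_by_pair = Counter()
    events_by_addr = Counter()

    # Pass 1: aggregates over the whole log.
    for e in events:
        o, c = e["origin"], e["caller"]
        origins_by_caller[c].add(o)
        if c != o:
            factories_by_origin[o].add(c)
        txs_by_origin[o].add(e["tx"])
        events_by_origin[o] += 1
        bens_by_pair[(o, c)].add(e["ben"])
        events_by_pair[(o, c)] += 1
        events_by_addr[e["addr"]] += 1

    # Pass 2: apply the rules in order.
    labels = []
    for e in events:
        o, c = e["origin"], e["caller"]
        n_bens = len(bens_by_pair[(o, c)])
        events_per_tx = events_by_origin[o] / len(txs_by_origin[o])
        if o == c:
            label = "A_DirectScript"
        elif len(origins_by_caller[c]) >= 2:
            label = "B_SharedFactory"
        elif len(factories_by_origin[o]) >= 3 and events_per_tx >= 3:
            label = "C_SybilBatch"
        elif n_bens >= 5 and events_by_pair[(o, c)] / n_bens < 50:
            label = "D_MultiTenantAggregator"
        elif events_by_addr[e["addr"]] >= 2:
            label = "E_DepositForwarder"
        else:
            label = "F_OneShot"
        labels.append(label)
    return labels


def ev(tx, addr, ben, origin, caller):
    """Small helper for building synthetic events."""
    return {"tx": tx, "addr": addr, "ben": ben, "origin": origin, "caller": caller}


demo = [
    ev("t1", "a1", "b1", "o1", "o1"),   # direct creation tx
    ev("t2", "a2", "b1", "o2", "f1"),   # f1 is shared by o2 and o3
    ev("t3", "a3", "b1", "o3", "f1"),
    ev("t4", "a4", "b2", "o4", "f2"),   # a4 is redeployed
    ev("t5", "a4", "b2", "o4", "f2"),
    ev("t6", "a5", "b3", "o5", "f3"),   # fresh address, no reuse
]
print(Counter(classify(demo)))
```

Because the rules depend on whole-log aggregates, an event's class can change as more of the log is processed; the labels are only stable once the full range has been read.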
Per-class breakdown (combined log, 472,641 events)
| Class | Events | %ev | Uniq addrs | Hard-fail | %hf | Addrs w/code | Code bytes | Origins | Callers |
|---|---|---|---|---|---|---|---|---|---|
| A_DirectScript | 34,384 | 7.3% | 34,339 | 45 | 0.13% | 172 | 180,069 | 2,098 | 2,098 |
| B_SharedFactory | 98,972 | 20.9% | 97,120 | 1,852 | 1.87% | 1,849 | 1,140,264 | 604 | 79 |
| C_SybilBatch | 37,262 | 7.9% | 34,638 | 2,624 | 7.04% | 264 | 45,634 | 28 | 190 |
| D_MultiTenantAggregator | 10,618 | 2.2% | 3,764 | 6,854 | 64.55% | 90 | 1,914 | 7 | 7 |
| E_DepositForwarder | 59,768 | 12.6% | 12,010 | 47,758 | 79.91% | 496 | 107,282 | 112 | 129 |
| F_OneShot | 231,637 | 49.0% | 231,637 | 0 | 0.00% | 1,048 | 1,793,139 | 504 | 3,251 |
| Total | 472,641 | 100.0% | 413,508 | 59,133 | 12.51% | 3,919 | 3,268,302 | — | — |
%ev = the class’s share of all events; %hf = hard-fail rate within the class (hard_fail / events); the Total row gives the dataset-wide rate.
Reading the table
- F_OneShot dominates volume (49% of events) and accounts for 56% of unique wasted accounts. These are operators who already use fresh addresses every time — they don’t need address reuse, just want to avoid leaving persistent contracts. Pure state-bloat absorbers.
- B_SharedFactory is unexpectedly large (21% of events, 1.14 MB of code bytes). 79 distinct shared-factory contracts collectively serve 604 distinct EOAs — significant SaaS-style usage of the pattern.
- E_DepositForwarder concentrates the hard-fails (47,758 of 59,133 = 80.8% of all hard-fail events). 112 operators across 129 factories. This is the slice that genuinely depends on EIP-6780 same-tx redeploy semantics.
- D_MultiTenantAggregator is tiny in event count (2.2%) but has the highest hard-fail-to-events ratio (65%) — every customer redeposits, so most events are at reused addrs. Only 7 distinct operators in this class.
- C_SybilBatch at 7.9% fits the airdrop/mint-farming pattern: 28 operators each rotating across ~7 factories on average (190 callers ÷ 28 origins), batch-deploying ephemeral helpers.
- A_DirectScript at 7.3% — over 2k distinct EOAs each running their own one-shot init-code-as-script pattern. No factory at all.
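The percentages quoted in these bullets follow directly from the per-class table. As a quick cross-check, the sketch below copies the Events and Uniq addrs columns into a dictionary and recomputes the derived figures; the variable names are illustrative.

```python
# (events, unique addrs) per class, copied from the per-class table.
TABLE = {
    "A_DirectScript": (34_384, 34_339),
    "B_SharedFactory": (98_972, 97_120),
    "C_SybilBatch": (37_262, 34_638),
    "D_MultiTenantAggregator": (10_618, 3_764),
    "E_DepositForwarder": (59_768, 12_010),
    "F_OneShot": (231_637, 231_637),
}

total_events = sum(n for n, _ in TABLE.values())
total_addrs = sum(u for _, u in TABLE.values())

# hard_fail = events - unique addrs, per class and overall.
hard_fail = {name: n - u for name, (n, u) in TABLE.items()}
total_hard_fail = sum(hard_fail.values())

# %hf: hard-fail rate within each class.
hf_rate = {name: 100 * hard_fail[name] / n for name, (n, _) in TABLE.items()}

# Shares quoted in the bullets.
oneshot_addr_share = 100 * TABLE["F_OneShot"][1] / total_addrs
forwarder_hf_share = 100 * hard_fail["E_DepositForwarder"] / total_hard_fail
sybil_factories_per_origin = 190 / 28   # Callers / Origins for C_SybilBatch

for name in TABLE:
    print(f"{name:24s} hard_fail={hard_fail[name]:>6,} %hf={hf_rate[name]:.2f}")
print(f"F_OneShot share of unique addrs: {oneshot_addr_share:.1f}%")
print(f"E_DepositForwarder share of hard-fails: {forwarder_hf_share:.1f}%")
```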
4. Frequency per 100k-block bucket
Computed by bucket = floor(block / 100_000) * 100_000; events per bucket are simply the count of SD-SAMETX lines whose block falls in [bucket, bucket + 100_000).
20,000,000 5,392 #####
20,100,000 4,255 ####
20,200,000 4,272 ####
20,300,000 6,087 ######
20,400,000 5,129 #####
20,500,000 4,803 ####
20,600,000 5,013 #####
20,700,000 5,308 #####
20,800,000 4,613 ####
20,900,000 4,849 ####
21,000,000 6,090 ######
21,100,000 6,539 ######
21,200,000 4,414 ####
21,300,000 3,761 ###
21,400,000 3,412 ###
21,500,000 4,120 ####
21,600,000 4,185 ####
21,700,000 5,195 #####
21,800,000 5,335 #####
21,900,000 4,714 ####
22,000,000 5,651 #####
22,100,000 6,143 ######
22,200,000 6,928 ######
22,300,000 6,735 ######
22,400,000 5,649 #####
22,500,000 6,599 ######
22,600,000 7,057 #######
22,700,000 9,515 #########
22,800,000 9,049 #########
22,900,000 8,573 ########
23,000,000 100,530 ####################################################################################################
23,100,000 7,647 #######
23,200,000 7,522 #######
23,300,000 7,291 #######
23,400,000 8,049 ########
23,500,000 8,847 ########
23,600,000 8,389 ########
23,700,000 8,121 ########
23,800,000 7,787 #######
23,900,000 11,747 ###########
24,000,000 8,907 ########
24,100,000 7,757 #######
24,200,000 9,079 #########
24,300,000 17,025 #################
24,400,000 17,207 #################
24,500,000 15,887 ###############
24,600,000 13,394 #############
24,700,000 13,835 #############
24,800,000 10,026 ##########
24,900,000 14,209 ##############
(1 char ≈ 1,000 events.)
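The bucketing and the bar rendering can be reproduced with a few lines of Python. This is a sketch, not the original script: the function names are hypothetical, and the bar length uses floor division by 1,000, which is inferred from the chart (for example, 4,803 events render as four characters).

```python
from collections import Counter

BUCKET = 100_000


def bucketize(blocks):
    """Count events per 100k-block bucket; keys are bucket start blocks."""
    counts = Counter((b // BUCKET) * BUCKET for b in blocks)
    return dict(sorted(counts.items()))


def render(counts, per_char=1_000):
    """Format each bucket as 'start count bar', one '#' per per_char events."""
    return [
        f"{bucket:,} {n:,} {'#' * (n // per_char)}"
        for bucket, n in counts.items()
    ]


# Synthetic block numbers: two events in the first bucket, one in the second.
demo_counts = bucketize([20_000_000, 20_099_999, 20_100_000])
print(demo_counts)

# Rendering a count taken from the first row of the chart.
for line in render({20_000_000: 5_392}):
    print(line)
```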
Trend reading
- Background regime (20.0 M – 22.6 M): flat at ≈ 4–7 k events / 100k blocks (~290–500 events/day at 7,200 blocks/day). Stable post-Cancun adoption.
- First step-up (≈ 22.7 M): ramps to ≈ 8.5–9.5 k.
- Anomaly at 23.0 M: 100,530 events in a single 100k-block window — 13× the surrounding rate. Likely one operator / one batch tx dominating; flagged for separate investigation.
- Plateau (23.1 M – 23.8 M): settles at ≈ 7–9 k.
- Second step-up (≈ 24.3 M): roughly doubles to ≈ 13–17 k. Apart from a dip to ≈ 10 k at 24.8 M, this regime persists through the end of the sample.
Net trajectory: usage roughly tripled between the start and end of the window (5 k → ~14 k per 100k blocks), with one exceptional spike at 23.0 M.
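The ratios behind this reading can be recomputed from the bucket counts. The sketch below assumes 12-second slots (7,200 blocks per day) and takes the "surrounding rate" of the 23.0 M spike to be the mean of the two neighbouring buckets; both are assumptions made for illustration, and the helper name is hypothetical.

```python
SECONDS_PER_BLOCK = 12
BLOCKS_PER_DAY = 86_400 // SECONDS_PER_BLOCK   # 7,200
BUCKET = 100_000


def events_per_day(events_in_bucket):
    """Convert a per-100k-block count into an average daily rate."""
    return events_in_bucket * BLOCKS_PER_DAY / BUCKET


# Bucket counts copied from the histogram.
first_bucket = 5_392        # 20,000,000
last_bucket = 14_209        # 24,900,000
spike = 100_530             # 23,000,000
before_spike = 8_573        # 22,900,000
after_spike = 7_647         # 23,100,000

growth = last_bucket / first_bucket
spike_ratio = spike / ((before_spike + after_spike) / 2)

print(f"start of window: {events_per_day(first_bucket):.0f} events/day")
print(f"end of window:   {events_per_day(last_bucket):.0f} events/day")
print(f"growth over the window: {growth:.2f}x")
print(f"spike vs neighbouring buckets: {spike_ratio:.1f}x")
```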