Thanks, this is exactly the kind of review I was looking for!
In the future if you wouldn’t mind, it’s generally preferable to write the text in the post itself. It makes quoting easier
Noted!
Edit: that wouldn’t have worked, since as a “new user” I can only put two links in a post.
Generally the state is one of the smaller portions of data fast synced nodes store. Depending on the exact storage model, it should be something like 30-60 GB. Headers, block bodies, receipts, and caches take up a lot of the rest of the data. Not sure how much keeping the last N blocks’ worth of state takes up, but generally these aren’t simple copies, they’re state diffs.
Amended that section to be more precise. Can the size of the state be tracked somewhere, short of running a node with custom code and measuring it there?
Geth nodes I’ve synced recently are closer to 430 GB.
Do you know what explains the discrepancy with https://etherscan.io/chartsync/chaindefault ?
I think you mean a Modified Patricia Merkle Tree
Got to push back a bit here - a patricia tree is a radix tree! In fact, the Wikipedia page for radix tree says:
Donald R. Morrison first described what Donald Knuth, pages 498-500 in Volume III of The Art of Computer Programming, calls “Patricia’s trees” in 1968.[6] Gernot Gwehenberger independently invented and described the data structure at about the same time.[7] PATRICIA trees are radix trees with radix equals 2, which means that each bit of the key is compared individually and each node is a two-way (i.e., left versus right) branch.
“Modified patricia tree” is a term that is almost 100% associated with Ethereum. Someone who’s not into blockchains but has a CS background can immediately tell what a radix tree is, whereas a patricia tree is a much more obscure term.
I’ll add the term though, it’s good to have it mentioned.
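To make the “radix equals 2” point from the Wikipedia quote concrete, here is a minimal sketch of a path-compressed binary radix tree in Python. It is only an illustration of the general data structure, not Ethereum’s modified variant (which branches on 4-bit nibbles and hashes its nodes). All names here (`Node`, `insert`, `lookup`, `to_bits`) are made up for the example, and keys are represented as strings of "0"/"1" purely for readability.

```python
class Node:
    """A node in a binary radix tree (hypothetical helper for this sketch)."""
    def __init__(self):
        # Maps the first bit of an edge ("0" or "1") to (edge_label, child).
        # With radix 2 a node can never have more than two children.
        self.children = {}
        self.value = None
        self.has_value = False


def to_bits(data: bytes) -> str:
    """Turn bytes into a string of '0'/'1' characters, 8 bits per byte."""
    return "".join(f"{b:08b}" for b in data)


def common_prefix_len(a: str, b: str) -> int:
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return n


def insert(root: Node, key_bits: str, value) -> None:
    node, rest = root, key_bits
    while rest:
        first = rest[0]
        if first not in node.children:
            # No edge starts with this bit: hang the whole remainder off one edge.
            leaf = Node()
            leaf.value, leaf.has_value = value, True
            node.children[first] = (rest, leaf)
            return
        label, child = node.children[first]
        k = common_prefix_len(label, rest)
        if k < len(label):
            # The key diverges partway along the edge: split the edge at bit k.
            mid = Node()
            mid.children[label[k]] = (label[k:], child)
            node.children[first] = (label[:k], mid)
            child = mid
        node, rest = child, rest[k:]
    node.value, node.has_value = value, True


def lookup(root: Node, key_bits: str):
    node, rest = root, key_bits
    while rest:
        entry = node.children.get(rest[0])
        if entry is None:
            return None
        label, child = entry
        if not rest.startswith(label):
            return None
        node, rest = child, rest[len(label):]
    return node.value if node.has_value else None


root = Node()
for word, val in [(b"do", 1), (b"dog", 2), (b"cat", 3)]:
    insert(root, to_bits(word), val)
```

Here `lookup(root, to_bits(b"dog"))` finds the stored value, while a key that is only a prefix of stored keys (such as `b"d"`) is reported as missing. Shared prefixes are stored once on a single edge, which is the property that makes the structure a radix tree rather than a plain bitwise trie.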
This is called Dynamic State Access (DSA)!
I don’t think an instruction (as in an EVM instruction) …
True, included!
I don’t think this is true. Fast sync isn’t checkpointed like warp sync or something, the db you’re syncing is always changing under you. The network isn’t storing old checkpoints AFAIK. It’s just using the pivots that are inherent to geth’s handling of block reorgs.
You’re right, I must have gotten warp & fast sync conflated. I’ll read up & update that section. Tell me if you know a good writeup on this.
Overall, awesome write up. Thank you for sharing!
Thanks