I am trying to think of ways to encourage more people to run more nodes. I think this will happen if one of two things is true: (1) it gets easier to run a node, or (2) nodes produce useful data. The two issues are related.
One reason the node is hard to run on a local machine is the size of the data it must store if one wishes to have access to useful data. For my use case, useful data means enough data to fully audit and account for an Ethereum address (or a collection of addresses). I can’t do that without --tracing enabled, which means 100s of GB of data.
I have to store 100s of GB of tracing data even if the account I wish to audit first appeared on the chain only recently. For example, if I only want to audit a smart contract that was deployed at block 6,000,000, I still have to store the entire trace history of all accounts prior to block 6,000,000.
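To make the use case concrete, here is a rough sketch of the kind of query I have in mind. It assumes the node exposes the Parity-style `trace_filter` JSON-RPC method; the address is a placeholder, the helper name is my own, and the code only builds the request bodies without sending them anywhere.

```python
import json

# Placeholder address (hypothetical); substitute the contract being audited.
CONTRACT = "0x" + "00" * 20
DEPLOY_BLOCK = 6_000_000  # the deployment block from the example above


def trace_filter_request(direction, address, from_block, request_id):
    """Build a JSON-RPC body for a Parity-style trace_filter call.

    direction is "fromAddress" or "toAddress". One request is built per
    direction so the sketch does not depend on how a node combines both filters.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "trace_filter",
        "params": [{
            "fromBlock": hex(from_block),  # block numbers are hex-encoded
            "toBlock": "latest",
            direction: [address],
        }],
    }


outgoing = trace_filter_request("fromAddress", CONTRACT, DEPLOY_BLOCK, 1)
incoming = trace_filter_request("toAddress", CONTRACT, DEPLOY_BLOCK, 2)
print(json.dumps(incoming, indent=2))
```

Nothing in either request refers to a block before 6,000,000, which is why storing the traces for everything earlier feels like wasted space for this use case.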
Would it be possible for a node to start syncing at a given block hash? Say I wanted to start syncing at block 6,000,000, and I know through other means that block 6,000,000 has hash 0x123… I know it’s not possible now, but is it conceptually possible? After starting at the block with hash 0x123…, I would then be willing to participate fully in the network. Later, once I’ve extracted the data I need, I could restart the node at a later block. In other words, the node could fully verify the blocks (after the first one I specify) and then, after I’m done extracting what I need, I could throw away those blocks’ data.
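Here is a minimal sketch of the hash-linkage part of that idea: trust one block hash obtained out of band, then check that every later header points back to its predecessor. Everything in it is an assumption made for illustration. The `Header` type and function names are hypothetical, and SHA-256 over a simple concatenation stands in for Ethereum's actual header hash (Keccak-256 over the RLP-encoded header).

```python
import hashlib
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Header:
    number: int
    parent_hash: bytes
    payload: bytes  # stand-in for the remaining header fields


def header_hash(h):
    # ASSUMPTION: SHA-256 is a stand-in; Ethereum uses Keccak-256 over RLP.
    data = h.number.to_bytes(8, "big") + h.parent_hash + h.payload
    return hashlib.sha256(data).digest()


def verify_from_checkpoint(checkpoint_number, checkpoint_hash, headers):
    """Check that headers[0] is the trusted block and the rest link to it."""
    if not headers:
        return False
    first = headers[0]
    if first.number != checkpoint_number or header_hash(first) != checkpoint_hash:
        return False
    prev = first
    for h in headers[1:]:
        # Each header must be the next number and commit to the previous hash.
        if h.number != prev.number + 1 or h.parent_hash != header_hash(prev):
            return False
        prev = h
    return True


def build_chain(start_number, length):
    """Build a toy chain of linked headers for the demonstration."""
    headers, parent = [], b"\x00" * 32
    for i in range(length):
        h = Header(start_number + i, parent, f"block-{start_number + i}".encode())
        headers.append(h)
        parent = header_hash(h)
    return headers


chain = build_chain(6_000_000, 5)
trusted_hash = header_hash(chain[0])  # the hash I would learn "through other means"

ok = verify_from_checkpoint(6_000_000, trusted_hash, chain)

# Tampering with a header in the middle breaks the link to its successor.
tampered = chain[:2] + [replace(chain[2], payload=b"tampered")] + chain[3:]
bad = verify_from_checkpoint(6_000_000, trusted_hash, tampered)
```

The sketch only covers linkage between headers. As far as I understand, re-executing the transactions in those blocks would also need the account state as of the starting block, which the hash commits to but does not contain, so that state would have to come from somewhere too.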
This would allow me to account for my addresses (in an ongoing manner) with minimal imposition on my machine. It seems to me that if people could more easily extract useful (i.e. accounting) data for their accounts, they might be more likely to run nodes.
Other than the ‘getting the block hash from some other source’ problem, is there a fundamental reason (security-wise) why this wouldn’t work?