During the AllCoreDev call yesterday, 10th of July 2020, we kept discussing the issues of network health and the dominance of go-ethereum in the network, which leads to go-ethereum developers having to bear much more responsibility than they should.
There are a few things that I took away from that call; here is one of them.
On a call 2 weeks ago, we very quickly jumped to the notion of “client diversity”, which was broadly understood as the relative share of different Ethereum implementations in the network. Somehow we rather implicitly connected two actual goals with “client diversity”. These 2 goals seem to be:
- Resilience of the network to implementation bugs (this was suggested in Discord by @matt)
- Resilience of the network to capture (this was added in Discord by @MicahZoltu)
Before the call, @shamatar asked in the Gitter channel “How a share of the client in the network is measured?”. My answer was that it is usually done by crawlers that record the info from the eth handshakes. Such measurements are relatively easy to distort, either with a Sybil attack or by nodes masquerading as a different client.
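To make the weakness concrete, here is a minimal sketch of the kind of tally a crawler might produce. It assumes the crawler has already collected self-reported client identifier strings from peers; the function names (`client_family`, `tally_clients`, `shares`) and the sample strings are hypothetical and only for illustration, not real crawl data or the code of any actual crawler.

```python
from collections import Counter


def client_family(client_id: str) -> str:
    """Return the lower-cased first '/'-separated component of a client ID string.

    The string is self-reported by the peer, so nothing stops a node from
    claiming to be a different client than it really is.
    """
    name = client_id.split("/", 1)[0].strip().lower()
    return name or "unknown"


def tally_clients(client_ids):
    """Count how many observed peers claim to belong to each client family."""
    return Counter(client_family(c) for c in client_ids)


def shares(counts):
    """Convert raw counts into fractions of all observed peers."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: n / total for name, n in counts.items()}


# Illustrative, made-up identifier strings (assumption: "name/version/platform/..." layout).
observed = [
    "Geth/v1.9.16-stable/linux-amd64/go1.14.4",
    "Geth/v1.9.15-stable/linux-amd64/go1.14.2",
    "OpenEthereum/v3.0.1-stable/x86_64-linux-gnu/rustc1.43.1",
    "besu/v1.4.6/linux-x86_64/oracle_openjdk-java-11",
]

counts = tally_clients(observed)
print(shares(counts))
```

Every input to this tally is whatever the peer chose to report, and every extra peer counts as one more installation, which is why both masquerading and Sybil nodes shift the result.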
This question eventually led to the realisation that we do not have a good definition of what the “client diversity” we desire actually means. Here is a thought experiment. I spin up a couple of backend nodes somewhere and create lots of lightweight devp2p sentries that talk to the backend nodes but appear to the network as distinct nodes. Although this would change the stats on ethernodes.org, it would not change the fact that if go-ethereum has a bug, the network will probably suffer. Why? Because what matters is not exactly the number of installations, but the number of critical installations, which includes mining pools, exchanges with high volume, wallet/dapp backends, and Etherscan. This makes the definition messier, but also more useful, because it can make our proposals more actionable.
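The thought experiment can be sketched in a few lines. This is only an illustration under stated assumptions: the `Installation` type, the `weight` field (a stand-in for how critical an installation is) and all the numbers are hypothetical, and how to assign such weights in practice is exactly the open question.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Installation:
    client: str    # implementation the installation runs
    weight: float  # hypothetical criticality weight; 0 for a node nobody depends on


def raw_share(installs, client):
    """Share of a client by plain node count (what a crawler would see)."""
    if not installs:
        return 0.0
    return sum(1 for i in installs if i.client == client) / len(installs)


def weighted_share(installs, client):
    """Share of a client weighted by how critical each installation is."""
    total = sum(i.weight for i in installs)
    if total == 0:
        return 0.0
    return sum(i.weight for i in installs if i.client == client) / total


# Made-up numbers: four critical operators, three of which run go-ethereum.
critical = [Installation("geth", 10.0)] * 3 + [Installation("other", 10.0)]
# One hundred lightweight sentries that look like distinct "other" nodes.
sentries = [Installation("other", 0.0)] * 100

network = critical + sentries
print(f"raw geth share:      {raw_share(network, 'geth'):.3f}")
print(f"weighted geth share: {weighted_share(network, 'geth'):.3f}")
```

With these made-up inputs the sentries push the raw go-ethereum share down to about 3%, while the weighted share stays at 75%, so a bug in go-ethereum would still affect most of the installations that matter.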
So this was my suggestion:
Perhaps the first step we can make (and this is what Eth Cat Herders can help us with) is to create a reliable system that would give us info about the critical installations, but without revealing too much info about them. That way we can track important metrics rather than vanity metrics.
It does not need to be completely decentralised; it can be mediated. But it would be super useful for assessing where we are and any progress we make.
@vorot93 brought up a good point: “don’t think any major player in their right mind will ever divulge ownership over specific nodes. Too much of a security risk”.
My reply was as follows. That is why I think it might need to be mediated, so that the attribution to specific critical operators is kept private, but we see the aggregate picture. Or maybe we can use anonymised tokens that Eth Cat Herders would regularly issue to the critical operators, who would then submit them against the implementation they run, without attribution, so that we can see the stats reliably. I do not see an obvious incentive for the critical operators to anonymously misinform us.
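A rough sketch of the token idea, under several assumptions: the `TokenTally` class and its methods are hypothetical, tokens are single-use random strings, and the mediator deliberately keeps no record of which operator received which token. This shows only the bookkeeping; it does not by itself provide strong anonymity (for example, it does nothing about network-level metadata when a token is submitted).

```python
import secrets
from collections import Counter


class TokenTally:
    """Hypothetical mediator: issues one-time tokens and counts reports per client."""

    def __init__(self):
        self._unused = set()      # tokens issued but not yet redeemed
        self._counts = Counter()  # aggregate reports per client implementation

    def issue(self) -> str:
        """Create a random token; no link to the receiving operator is stored."""
        token = secrets.token_hex(16)
        self._unused.add(token)
        return token

    def report(self, token: str, client: str) -> bool:
        """Accept a report if the token is valid and unused, then retire the token."""
        if token not in self._unused:
            return False  # unknown or already used: ignore, so nobody can stuff the tally
        self._unused.remove(token)
        self._counts[client.strip().lower()] += 1
        return True

    def aggregate(self) -> dict:
        """Return only the aggregate picture, with no per-operator attribution."""
        return dict(self._counts)


tally = TokenTally()
t1, t2 = tally.issue(), tally.issue()   # handed to two critical operators
tally.report(t1, "geth")
tally.report(t2, "OpenEthereum")
tally.report(t1, "geth")                # replayed token is rejected
print(tally.aggregate())
```

Because only holders of issued tokens can report, and each token counts once, the aggregate reflects the critical operators rather than anyone who can spin up nodes.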
EDIT: An afterthought. The presence of such metrics might be enough to start shifting the balance in diversity of the critical installations - because it creates a useful coordination point for all the critical operators - now they will know what happens if they rely on a single implementation, so it might convince them to run at least two, even at a higher cost.