EIP-3436: Expanded Clique Block Choice Rule

In early March we had a couple of deadlocks on the Goerli network. These resulted from clients observing competing out-of-order blocks and settling on different, equally preferred chain head blocks. This EIP proposes a block choice rule that should be deterministic regardless of when the blocks were observed. I have weak opinions as to what the particular rules should be, so if you have better ideas, feel free to propose them. But I have a strong opinion that "first observed" should not be one of the rules, as that is not deterministic across nodes and is what caused the deadlock.

@karalabe can I get your take?

Hey @shemnon. I’m working on implementing this at etclabscore/core-geth.

I ran into an issue with the second Rationale > Scenario defined in the spec. It seems that the second fork 2,4,6 is invalid, since signer 6's block will be rejected for having signed too recently.

Below I’m using zero-indexed block signer order indexes, so 7 is really 8.

On that fork, the sequence of signers is described to be ... 7, 0, 1, 2, 3, 4, 5, 6, 1, 3, 5. This supposes that the latest fork block from signer 5 (zero-index name for signer 6) is 4 blocks from their last block on the common segment. With 8 signers and a SIGNER_LIMIT of 8/2+1 = 5 from EIP-225, that latest fork block is invalid, because the limit exceeds the gap (5 > 4).
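To make the check above concrete, here is a minimal sketch of the recent-signer rule, assuming SIGNER_LIMIT = floor(SIGNER_COUNT / 2) + 1 as in EIP-225 and treating list positions as consecutive block numbers. The function names are made up for illustration and are not taken from any client.

```python
def signer_limit(signer_count: int) -> int:
    # EIP-225: a signer may sign at most one out of every SIGNER_LIMIT
    # consecutive blocks.
    return signer_count // 2 + 1


def first_recent_violation(signer_sequence, signer_count):
    """Return the position of the first block whose signer signed too
    recently, or None if the whole sequence is acceptable.

    Positions in signer_sequence stand in for consecutive block numbers
    (an assumption made for this sketch).
    """
    limit = signer_limit(signer_count)
    last_seen = {}
    for position, signer in enumerate(signer_sequence):
        previous = last_seen.get(signer)
        # The gap to the signer's previous block must be at least `limit`.
        if previous is not None and position - previous < limit:
            return position
        last_seen[signer] = position
    return None


# The zero-indexed sequence quoted above, with 8 signers.
sequence = [7, 0, 1, 2, 3, 4, 5, 6, 1, 3, 5]
print(signer_limit(8))                      # 5
print(first_recent_violation(sequence, 8))  # 10, the final block by signer 5
```

Signer 3 at position 9 is fine (gap of 5), while signer 5 at position 10 has a gap of only 4, which is what trips the limit.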

Maybe I’ve got something wrong? Or misunderstood the scenario?

Yea, the even length halt scenario isn’t as clean as I hoped.

Here’s a revised one. 8 nodes, zero based. 0-6 all produce in-order blocks, then a netsplit. 0, 2, and 3 on the first fork and 1, 4, 6, 7 on the second fork, and 5 goes offline. 7, 0, and 1 all missed an important in-turn block.

How does this scenario sound?

0, 1, 2, 3, 4, 5, 6,

  • fork 1 - 0, 3, 2.
    • Possible next: 1, 4, 5, 7
    • On this fork: 0, 2, 3
  • fork 2 - 1, 7, 4.
    • Possible next: 0, 2, 3, 5
    • On this fork: 1, 4, 6, 7
  • Offline after split: 5
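The "Possible next" lists above can be reproduced with a short sketch, again assuming the EIP-225 limit of floor(N/2) + 1 and consecutive block numbers. `eligible_next` is a hypothetical helper name, not a client API.

```python
def eligible_next(chain_signers, signer_count):
    """Signers allowed to produce the next block, given the signers of
    the chain so far (oldest first)."""
    limit = signer_count // 2 + 1
    # Anyone who signed one of the last (limit - 1) blocks is still
    # inside the recent-signer window and must wait.
    window = limit - 1
    recent = set(chain_signers[-window:]) if window > 0 else set()
    return [s for s in range(signer_count) if s not in recent]


common = [0, 1, 2, 3, 4, 5, 6]
fork_1 = common + [0, 3, 2]
fork_2 = common + [1, 7, 4]

print(eligible_next(fork_1, 8))  # [1, 4, 5, 7]
print(eligible_next(fork_2, 8))  # [0, 2, 3, 5]
```

Under these assumptions neither fork can be extended by a signer that is actually on it, which is what makes the scenario a halt.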

Do we prefer fork 1 or fork 2? I don’t have a strong opinion, but IMHO it should be calculated strictly based on what is in the tested block, not on the eligible next blocks nor on the prior blocks, and it should not depend on knowledge of the forks other validators are on.

  3. Then choose the block whose validator had the least recent in-turn block assignment.
  4. Then choose the block with the lowest hash.

As for those rules… Rule number 3 means that determining whether to reorg is not really bounded. Say there were 7 signers, only 5 were active, and they chugged through 10M blocks. Then suddenly signers 6 and 7 pop up and construct a block each. We’d then have to go through 10M blocks while searching for the ‘least recent’.

I guess I don’t see why we don’t just skip 3 and go directly to 4 and compare hashes? That seems like the ultimate tie-breaker, and it’s highly ‘localized’ and cheap to perform.

Least recent may not be the best wording. The specification of the EIP describes what that means, and it does not imply a 10M block lookback. It only requires a calculation on data within the header, plus knowing what the full set of validators is:

When resolving rule 3 clients should use the following formula, where validator_index is the integer index of the validator that signed the block when sorted as per epoch checkpointing, header_number is the number of the header, and validator_count is the count of the current validators. Clients should choose the block with the largest value. Note that an in-turn block is considered to be the most recent in-turn block.

(header_number - validator_index) % validator_count
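As a rough illustration, the formula is a one-liner. The sketch below uses Python, where `%` with a positive modulus never returns a negative number; a client written in another language would need to make sure the subtraction cannot go negative or underflow. The function name is invented for this example.

```python
def in_turn_recency(header_number: int, validator_index: int,
                    validator_count: int) -> int:
    """Rule 3 value: 0 means the signer is in-turn for this block
    (most recent assignment); larger means less recent."""
    return (header_number - validator_index) % validator_count


# 8 validators, competing blocks at height 9.
print(in_turn_recency(9, 1, 8))  # 0, validator 1 is in-turn at height 9
print(in_turn_recency(9, 4, 8))  # 5
print(in_turn_recency(9, 2, 8))  # 7, preferred under rule 3 (largest value)
```

Nothing here looks at earlier blocks: the inputs are the header number, the signer's index, and the validator count.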

We could skip 3 and go to 4, but that turns it into a PoW race when trying to censor the chain. With rule 3 in place you are only ever in a PoW race with yourself.
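A minimal sketch of how rules 3 and 4 could combine, assuming the candidate heads are already tied under the earlier rules and share a block number. The `Head` type and `prefer` function are hypothetical names for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Head:
    number: int
    validator_index: int
    block_hash: bytes  # assumed to be fixed-length, e.g. 32 bytes


def prefer(heads, validator_count):
    """Pick among heads that are tied on the earlier rules."""
    def sort_key(head):
        rule_3 = (head.number - head.validator_index) % validator_count
        # Largest rule 3 value wins, so negate it; lowest hash breaks
        # any remaining tie (rule 4).
        return (-rule_3, head.block_hash)
    return min(heads, key=sort_key)


a = Head(number=9, validator_index=2, block_hash=b"\xff" * 32)
b = Head(number=9, validator_index=4, block_hash=b"\x00" * 32)
print(prefer([a, b], 8) is a)  # True: rule 3 decides, the hash is ignored
```

At the same block number, two different validators always get different rule 3 values because their indexes differ and are both below the validator count. So the hash comparison only comes into play between two blocks signed by the same validator, which matches the "PoW race with yourself" remark.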