Summarizing current state and open questions:
Thanks for looking! Any and all feedback is greatly appreciated :).
Background: As an Ethereum user, I’m concerned that the BIP-39 mnemonic backup of my wallet is a single point of failure. I’d like to use Shamir’s Secret Sharing Scheme to split it into shares that I can distribute for storage and later use to recover my original mnemonic.
I’d like each of the shares to also be a standard BIP-39 mnemonic, so that if one is lost and later found, the finder would not necessarily know that it is only one part of a larger scheme.
So far, I’ve been primarily focused on building the UI for this using an existing implementation of Shamir’s scheme. However, I’ve gotten some feedback that, to make this tool more broadly useful and to ensure recovery at some arbitrary point in the future when tooling may have changed, I should use a standardized implementation of Shamir’s scheme. Or, since that doesn’t exist as far as I’m aware, create one.
So, I’ve gone down the route of creating an EIP to specify a standardized approach for splitting a BIP-39 mnemonic into shares that are also BIP-39 mnemonics. Although this isn’t my core domain, I think I’ve gotten a good start and am hoping to get some feedback here :).
Questions:
- What’s a good choice for a Galois Field for this?
My spec originally converted the mnemonic to hex entropy and applied Shamir’s to the hex values using GF(256). The resulting hex shares were then converted to valid BIP-39 mnemonics.
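To make the GF(256) approach concrete, here is a minimal sketch of byte-wise Shamir splitting and recovery. It is not the spec itself: the function names (`gf_mul`, `gf_inv`, `split`, `combine`) are hypothetical, the reduction polynomial is the 29 / `0x11D` one mentioned further down, and it assumes each share's x-coordinate is stored alongside the share bytes.

```python
import secrets

POLY = 0x11D  # x^8 + x^4 + x^3 + x^2 + 1 (assumed reduction polynomial)

def gf_mul(a: int, b: int) -> int:
    """Multiply two GF(256) elements: shift-and-add, reducing modulo POLY."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= POLY
        b >>= 1
    return result

def gf_inv(a: int) -> int:
    """Multiplicative inverse via a^254 (since a^255 = 1 for a != 0)."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse in GF(256)")
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result

def split(secret: bytes, n: int, k: int) -> dict[int, bytes]:
    """Split into n shares keyed by x = 1..n; any k of them recover the secret."""
    if not 1 <= k <= n <= 255:
        raise ValueError("need 1 <= k <= n <= 255")
    shares = {x: bytearray() for x in range(1, n + 1)}
    for byte in secret:
        # Fresh random polynomial per byte; the constant term is the secret byte.
        coeffs = [byte] + [secrets.randbelow(256) for _ in range(k - 1)]
        for x, out in shares.items():
            y = 0
            for c in reversed(coeffs):  # Horner evaluation at x
                y = gf_mul(y, x) ^ c
            out.append(y)
    return {x: bytes(v) for x, v in shares.items()}

def combine(shares: dict[int, bytes]) -> bytes:
    """Lagrange-interpolate each byte position at x = 0."""
    xs = list(shares)
    length = len(shares[xs[0]])
    secret = bytearray()
    for i in range(length):
        acc = 0
        for xj in xs:
            num, den = 1, 1
            for xm in xs:
                if xm != xj:
                    num = gf_mul(num, xm)       # (0 - xm) == xm in characteristic 2
                    den = gf_mul(den, xj ^ xm)  # (xj - xm) == xj XOR xm
            acc ^= gf_mul(shares[xj][i], gf_mul(num, gf_inv(den)))
        secret.append(acc)
    return bytes(secret)
```

With 16 bytes of entropy in, every share is also 16 bytes, so each one can be encoded as a 12-word mnemonic in its own right. Where the x-coordinate lives is left open here.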
Earlier in this thread, it was suggested that I could use GF(2048) instead so we could apply Shamir’s directly to the mnemonic. The problem I see with this is that the resulting shares would not be valid BIP-39 mnemonics. (Unless we jumped through some hoops, only converting the entropy section of the mnemonic and calculating the checksum from there for each share. Doable, but then, what’s the advantage of using GF(2048) here?)
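For reference, the checksum step that every share would have to go through to be a valid BIP-39 mnemonic can be sketched as follows. The function name `entropy_to_word_indices` is hypothetical, and the sketch stops at 11-bit word indices rather than looking them up in a wordlist.

```python
import hashlib

def entropy_to_word_indices(entropy: bytes) -> list[int]:
    """Append the BIP-39 checksum to the entropy and cut it into 11-bit indices."""
    ent_bits = len(entropy) * 8
    if ent_bits not in (128, 160, 192, 224, 256):
        raise ValueError("BIP-39 entropy must be 128 to 256 bits, in steps of 32")
    cs_bits = ent_bits // 32  # checksum length: 4 to 8 bits
    # The checksum is the first cs_bits bits of SHA-256(entropy).
    checksum = hashlib.sha256(entropy).digest()[0] >> (8 - cs_bits)
    value = (int.from_bytes(entropy, "big") << cs_bits) | checksum
    n_words = (ent_bits + cs_bits) // 11
    return [(value >> (11 * (n_words - 1 - i))) & 0x7FF for i in range(n_words)]
```

The last word mixes entropy bits with checksum bits, which is why Shamir's applied word-by-word over GF(2048) would not produce valid mnemonics without recomputing the checksum for each share.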
Would there be any advantage to using GF(16)? Or any other field? GF(256) seems like the most common in the wild, but I’m not sure whether there are other considerations I should be thinking about in choosing the field.
- Does the choice of a primitive polynomial matter?
Other than being included in the specification and consistent across implementations, does the actual choice matter?
I’ve been following various reference implementations and using the lowest primitive polynomial. For GF(256), I used 29 ($x^8 + x^4 + x^3 + x^2 + 1$). For GF(2048), I used 5 ($x^{11} + x^2 + 1$).
Are there any arguments for using a different primitive polynomial here?
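As a sanity check on the two polynomials above, here is a small sketch that tests primitivity by brute force. It assumes 29 and 5 encode only the low-order terms, so the full bit patterns including the leading term are `0x11D` and `0x805`; the helper names are hypothetical. A degree-m polynomial is primitive exactly when the element x has multiplicative order 2^m - 1 modulo it.

```python
def gf_mul(a: int, b: int, poly: int, m: int) -> int:
    """Multiply in GF(2)[x] modulo a degree-m polynomial given as a bit pattern."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a >> m:  # degree reached m: subtract (XOR) the modulus
            a ^= poly
        b >>= 1
    return result

def order_of_x(poly: int, m: int):
    """Multiplicative order of the element x (integer 2), or None if it never hits 1."""
    element, order = 2, 1
    while element != 1:
        element = gf_mul(element, 2, poly, m)
        order += 1
        if order > (1 << m):  # x is not invertible modulo poly
            return None
    return order

def is_primitive(poly: int, m: int) -> bool:
    """True when the powers of x run through every nonzero element."""
    return order_of_x(poly, m) == (1 << m) - 1

print(is_primitive(0x100 | 29, 8))   # GF(256):  x^8 + x^4 + x^3 + x^2 + 1
print(is_primitive(0x800 | 5, 11))   # GF(2048): x^11 + x^2 + 1
print(is_primitive(0x11B, 8))        # AES polynomial, for comparison
```

The AES polynomial is included because it is irreducible but not primitive, so it still defines a valid field even though x does not generate it. Fields of the same size are isomorphic, so the choice mainly has to be fixed in the spec for shares to be interoperable.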
Thanks for reading!
Links: