EIP-2565: Big integer modular exponentiation (EIP-198) gas cost

MicahZoltu · March 8, 2021, 2:23am

@tkstanczak In what way is the spec not entirely clear? It is written in Python and thus extremely precise.

@holiman Same question as above, additional feedback on what specifically you find unclear about the provided ~10 lines of python would be more valuable than “I like the old one better”. Python is extremely specific, English is not.

A relevant comic to lighten the discussion:

holiman · March 8, 2021, 7:39am

Ok. let’s compare.

ADJUSTED_EXPONENT_LENGTH is defined as follows.
…
If length_of_EXPONENT > 32, then return 8 * (length_of_EXPONENT - 32) plus the index of the highest bit in the first 32 bytes of EXPONENT

Versus

elif exponent_length > 32: iteration_count = (8 * (exponent_length - 32)) + ((exponent & (2**256 - 1)).bit_length() - 1)

These two examples describe the same thing. However:

The former can be read and it’s clear what the intention is. It can be easily reasoned about.
The latter is very hard to read and reason about. However, as you point out, it’s a great reference implementation, since it can be executed and the reader can experiment with it: both to figure out what it’s doing, and to verify against their own implementation.
- But having to experiment with the python implementation to figure out how it behaves is not ideal, IMO.

Further:

The latter implementation does not reuse the same definitions. It does not talk define ADJUSTED_EXPONENT_LENGTH, but instead calculate_iteration_count.

The first point in the rationale says

Modify ‘computational complexity’ formula to better reflect the computational complexity".
…
In the complexity formula defined in this EIP, x is divided by 8 to account for the number of limbs in multiprecision arithmetic"

The second says

setting the QGUADDIVISOR to 3

And thirdly

Set a minimum gas cost to prevent abuse

All in all, three specific changes. But looking at the python implementation, it’s very difficult to figure out that it’s in fact only three specific changes that are made.

Conclusion

So my preferred specification change would be:

Use the original phrasing as much as possible,
Add a “reference implementation” section, where the current full python implementation can reside.

@tkstanczak If you can add the details on how you misinterpreted it, I think that would be a good as an example.

tkstanczak · March 8, 2021, 12:21pm

Hi,

I interpreted this part incorrectly:

exponent & (2**256 - 1)

I used bitwise and on the last 256 bits of exponents instead of the first 256 bits (this makes more sense to me when reading code like this - it is surprising for me that python uses the first bits.

If Python uses the first 256 bits then I would only expect that if it was interpreting the number as a machine representation (then, probably little endian which would maybe make sense)

So the spec in the format that @holiman suggested would allow me to avoid a failing test that I spent some time on searching the cause for.

MicahZoltu · March 9, 2021, 11:23am

My understanding is that exponent & (2**256 - 1) should be the same as exponent % 2**256 (and in fact, in Python this appears to be the case). This is why I think the python implementation is a better specification because “first 32 bytes of EXPONENT” is ill defined since “first” depends on endianness. Hopefully @kelly can give us insight to make sure that the python implementation is currently correct.

Also, we should probably have a test case for > 256-bit exponent.

MicahZoltu · March 9, 2021, 11:29am

holiman:

ADJUSTED_EXPONENT_LENGTH is defined as follows.
…
If length_of_EXPONENT > 32, then return 8 * (length_of_EXPONENT - 32) plus the index of the highest bit in the first 32 bytes of EXPONENT

Versus

elif exponent_length > 32: iteration_count = (8 * (exponent_length - 32)) + ((exponent & (2**256 - 1)).bit_length() - 1)

These two examples describe the same thing. However:

The former can be read and it’s clear what the intention is. It can be easily reasoned about.

The latter is very hard to read and reason about. However, as you point out, it’s a great reference implementation, since it can be executed and the reader can experiment with it: both to figure out what it’s doing, and to verify against their own implementation.

But having to experiment with the python implementation to figure out how it behaves is not ideal, IMO.

I’m not sure how to resolve our disagreement here since I find the python definition easier to read than the English definition, especially since “first” is ambiguous (see comments above). However, that is certainly a subjective statement.

I believe this is our repeated debate rearing its ugly head again, where you want EIPs to be written for an audience of current client developers and I want EIPs written for future client developers. It is a difference of whether an EIP is a “diff” or it is a “new state of the world” definition.

Hypothetically, if the Ethereum client specification was in a GitHub repository like the PoS specification is in a GitHub repository, and instead of EIPs we just had PRs against the spec, would you generally favor such a system? In that case, every “EIP” would just be a diff of the spec (in the form of a GitHub pull request or git diff).

kelly · March 10, 2021, 9:15pm

Hey all, sorry for the delay in responding.

To add some clarification, I believe that the current Python implementation is correct. One thing to note is that the ‘calculate_iteration_count’ which is under debate here is equivalent to ‘adjusted_exponent_length’ as specified in EIP-198, and no changes to this formula are suggested in this EIP (other than the naming change which was suggested).

In fact, the original EIP-2565 called this equivalency out, and this was even noted directly in the Python implementation, but it was suggested that this note be removed (see outdated suggestions here: https://github.com/ethereum/EIPs/pull/2892#pullrequestreview-503780228). If it would be preferable we can revert to the previous version which builds on EIP-198.

Please let me know if there is any other confusion or questions I can answer.