Some preliminary results about performing jumpdest analysis and executing JUMPs.
Building the jumpdest analysis(es)

Old implementation versus new, where the new one also needs to build the secondary, smaller bitmap. This incurs an extra cost in all cases (a sketch of such an analysis follows after the scenario legend below):
JumpdestAnalysis_1200k/zeroes-6 801µs ± 2% 1981µs ± 1% +147.15% (p=0.016 n=5+4)
JumpdestAnalysis_1200k/jumpdests-6 803µs ± 1% 1958µs ± 3% +143.75% (p=0.016 n=4+5)
JumpdestAnalysis_1200k/push32-6 397µs ± 2% 462µs ± 2% +16.58% (p=0.008 n=5+5)
JumpdestAnalysis_1200k/push1-6 1.85ms ± 0% 1.94ms ± 2% +5.08% (p=0.016 n=4+5)
JumpdestAnalysis_1200k/beginsub-6 1.56ms ± 7% 2.09ms ± 4% +33.50% (p=0.008 n=5+5)
JumpdestAnalysis_1200k/beginsub_push-6 1.88ms ± 2% 1.97ms ± 2% +4.71% (p=0.008 n=5+5)
The code is 1.2 MB.
zeroes: zero-filled code
jumpdests: all JUMPDEST
push32: all PUSH32
push1: all PUSH1
beginsub: all BEGINSUB
beginsub_push: all PUSH1, but every 32nd is a BEGINSUB
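For reference, here is a minimal sketch of what such a bitmap-based analysis might look like. The names, package, and structure are illustrative, not the actual go-ethereum implementation; the point is the single pass that marks push-data and, in the new variant, additionally records BEGINSUB locations in the secondary bitmap:

```go
package vm

// bitvec is a bit vector with one bit per code byte.
type bitvec []byte

func (b bitvec) set(pos uint64)        { b[pos/8] |= 1 << (pos % 8) }
func (b bitvec) isSet(pos uint64) bool { return b[pos/8]&(1<<(pos%8)) != 0 }

const (
	opPush1    = 0x60
	opPush32   = 0x7f
	opJumpdest = 0x5b
	opBeginSub = 0x5c // BEGINSUB per the EIP-2315 draft
)

// analyze does one pass over the code, marking every byte that is push-data
// in 'data', and every BEGINSUB location in the secondary bitmap 'subs'.
func analyze(code []byte) (data, subs bitvec) {
	data = make(bitvec, len(code)/8+1)
	subs = make(bitvec, len(code)/8+1)
	for pc := uint64(0); pc < uint64(len(code)); {
		op := code[pc]
		switch {
		case op >= opPush1 && op <= opPush32:
			n := uint64(op-opPush1) + 1 // number of immediate data bytes
			for i := uint64(1); i <= n && pc+i < uint64(len(code)); i++ {
				data.set(pc + i)
			}
			pc += n + 1
		case op == opBeginSub:
			subs.set(pc)
			pc++
		default:
			pc++
		}
	}
	return data, subs
}
```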
Executing a JUMP
This benchmark evaluates the validity of a JUMP spanning the entire body of code, from 0 to the end (1.2 MB). The code consists of all JUMPDESTs.
BenchmarkJumpdestValidation/jumpdests-6 151248043 7.57 ns/op
BenchmarkJumpdestValidation/jumpdests-with-subroutines-6 25620 46575 ns/op
The ‘old’ way is very fast, at sub-8ns, and depends on neither the size of the jump nor the size of the code.
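That constant-time check might be sketched as below, reusing the hypothetical bitvec and opcode constants from the earlier sketch: validity is two O(1) lookups, regardless of where the jump lands.

```go
// validJumpdest sketches the 'old' constant-time check: the destination must
// hold a JUMPDEST opcode and must not be inside push-data. Both lookups are
// O(1), independent of code size and jump distance.
func validJumpdest(code []byte, data bitvec, dest uint64) bool {
	if dest >= uint64(len(code)) {
		return false
	}
	return code[dest] == opJumpdest && !data.isSet(dest)
}
```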
The code chosen is a worst case for the seek-based approach, since there are no subroutines to abort the check early. It takes 46µs. In this case, I have not optimized it to check 64 bits at a time, but if we assume that large jumps can be optimized by a factor of 64, that would still land at around 727ns, which is almost a factor of 100 slower than before.
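The factor-64 assumption corresponds to scanning the subroutine bitmap one 64-bit word at a time instead of bit by bit. A hedged sketch of such a scan, continuing the illustrative package above (the helper name and word handling are mine, not the benchmarked code):

```go
import "encoding/binary"

// anySet reports whether any bit in [from, to) of the bitmap is set, testing
// whole 64-bit words against zero and masking the partial edge words.
// The bitmap is assumed to be padded to a multiple of 8 bytes.
func anySet(bitmap []byte, from, to uint64) bool {
	if to <= from {
		return false
	}
	for word := from / 64; word <= (to-1)/64; word++ {
		w := binary.LittleEndian.Uint64(bitmap[word*8:])
		if word == from/64 {
			w &= ^uint64(0) << (from % 64) // drop bits below 'from'
		}
		if word == (to-1)/64 {
			w &= ^uint64(0) >> (63 - (to-1)%64) // drop bits at/above 'to'
		}
		if w != 0 {
			return true
		}
	}
	return false
}
```

Note that for the benchmarked code, which has no BEGINSUBs, every word is zero, so the loop always runs the full distance; the word-wise scan yields a factor-64 improvement, not a constant-time one.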
Note, though, that for a ‘regular’ contract, limited to 24KB, the figures look different:
BenchmarkJumpdestValidation/jumpdests-6 154333724 7.59 ns/op
BenchmarkJumpdestValidation/jumpdests-with-subroutines-6 1000000 1070 ns/op
If we apply the same divisor of 64, we wind up with 7.6ns vs 16.7ns, which is much more reasonable.