EIPs 3336 and 3337: Improving the EVM's memory model

gcolvin · April 2, 2021, 12:25am

@Arachnid On 3336, I’m skeptical. First, I think the quadratic cost function for memory means that most memory usage is going to be well within high-level cache – kilobytes, not megabytes – where the relevant boundary is the cache line, not the memory page, and a cache line is typically two EVM words. So it seems like a lot of complexity and likely performance impact to handle a fairly small amount of memory.

A (not so radical) suggestion: Allow memory to also grow from the top down. That is, twos-complement, towards negative offsets. Keep the same cost function. Ideal for a stack of frames. Also, twos complement, the frame stack could be aliased with memory via MLOAD and MSTORE.

gcolvin · August 17, 2021, 2:27am

@Arachnid. @chfast dug up some stats that I’ve summarize here. The median contract uses ~200 bytes of memory and ~13 stack slots, and the most used is ~1.4 megabytes of memory and all 1024 stack slots

gist.github.com

https://gist.github.com/chfast/d303d15d151620d23bba617cd79097af#gistcomment-3859141

README.md

This is implemented by [evmone](https://github.com/ethereum/evmone) 
in https://github.com/ethereum/evmone/commit/a236b9828f9ae1ac257d7c0cb56f654e3a45c7ea#diff-edccf78e9152726b4c97cee57b2b1433R104-R124

Stats about EVM executions are agregated by 1000 blocks.

- start block
- end block
- number of executions (samples)
- total memory
- memory size per execution

This file has been truncated. show original

evm-stats-8M.csv

46000,46999,1,96,96,96,96,96,96,96,96,96,96,96,96,96,96,5,5,5,5,5,5,5,5,5,5,5,5,5
48000,48999,22,3072,0,0,0,0,0,0,160,288,288,288,288,288,288,0,0,0,0,0,0,4,9,9,9,9,9,9
49000,49999,41,18880,96,96,96,96,96,96,512,768,960,960,960,960,960,4,4,4,4,4,4,8,16,16,16,16,16,16
50000,50999,105,19168,0,0,0,96,96,96,96,288,416,576,2880,2880,2880,0,0,0,4,4,4,13,16,16,16,16,16,16
51000,51999,285,40800,96,96,96,96,96,96,96,128,288,480,2880,2880,2880,4,4,4,4,4,5,13,16,16,16,16,16,16
52000,52999,403,55808,64,64,96,96,96,96,96,96,256,480,704,3488,3488,3,3,4,6,6,12,16,16,16,16,16,16,16
53000,53999,63,16672,96,96,96,96,96,96,160,1408,1568,1568,1568,1568,1568,4,4,4,4,4,4,13,20,20,20,20,20,20
54000,54999,126,88608,0,0,0,0,0,96,160,2048,3904,6080,6080,10208,10208,0,0,0,0,0,4,13,16,16,16,18,18,18
55000,55999,339,68672,96,96,96,96,96,96,96,992,1568,1632,1728,2304,2304,3,3,3,4,4,4,13,16,16,19,20,20,20
56000,56999,125,34336,96,96,96,96,96,96,128,1472,1504,1504,2016,2016,2016,3,3,3,3,3,3,10,14,16,16,16,16,16

This file has been truncated. show original

gcolvin · August 28, 2021, 2:11pm

Notes on frame stacks — lists of data frames in memory for arguments and local variables. Expanding on previous discussions here, in an EIP draft, and with Nick.

Two conventions are typical — either create the frame before calling the subroutine, or create the frame after the BEGINSUB. Adding these two instructions to EIP-3337 would ease the way. They are modeled on the corresponding Intel opcodes.

`ENTER _frame_size_`

Write the current FP to memory as with MSTOREF at FP - frame_size, then set FP to FP - frame size.
This a cheaper shorthand for

MSTOREF
GETFP
PUSH frame_size
SUB
SETFP

This should be placed either before a JUMPSUB operation or after a BEGINSUB, depending on the calling convention. Repeated calls to ENTER create a stack of frames, linked by their previous FP field. The stack grows towards lower addresses in memory, as is the common, efficient practice.

`LEAVE`

Restores FP to the value which was stored at offset FP in memory by the most recent ENTER, thus popping a frame off of the stack of frames. This a shorthand for

MLOADF
SETFP

This should be placed after the JUMPSUB operation or before the RETURNSUB – depending on the calling convention – to pop the most recent frame from the call stack.

These operations would be especially useful if the gas formula for memory was changed so that writes from the top of memory down (equivalently, from -1 down, twos-complement) are charged the same gas as writes from the bottom of memory up.

The total cost to expand the memory to size a words is currently

Cmem(a) = 3 * a + floor(a ** 2 / 512)

If a, b and memory offsets in general are allowed to be negative a slightly modified formula

Cmem(a) = 3 * |a| + floor(a ** 2 / 512)
gives the desired results – stack memory can grow from the top down, and heap memory can be allocated from the bottom up, without address conflicts or excessive gas charges. It would also be possible to alias the data stack with the frame stack, as is done by Intel, DEC, and p-code, and other machines. This would allow for a single, indefinitely deep data stack, pointers to stack items, and variable-size stack items.

An ENTER opcode would also combine well with the code sections supported by the Ethereum Object Format with Wasm-style function tables, providing a structure to define valid sets of entry points into collections of subroutines.

And if we really wanted to go to town, we could alias the data stack with the frame stack and interleave data-stack items with stack frames, as is often done. This would make it easier to support variable-width data items.