EIP-2930: Optional access lists

I don’t believe it has been formally specified yet, but the approach we’ve taken so far for typed transactions is to also append the type field and the transaction-specific fields. You can sort of see how this is done in the PR: https://github.com/ethereum/go-ethereum/pull/21502/files#diff-77719ae57e7e6c3e0cac05fa12c6c5f2e5f9bc810c034d1446de2414c36d9210

It looks like the response of this function is now a byte array.
You then evaluate the first byte and cast the rest to tx.rlp.
If the first byte is a legacy byte, the whole byte array is cast to tx.rlp.

Also, the block RLP is now
RLP (HEADER + TXLIST + UNCLELIST)
where TXLIST is a list of byte arrays, and the same logic applies.
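The first-byte check described above can be sketched as follows. This is a minimal illustration, assuming the EIP-2718 convention that a type byte lies in the range 0x00 to 0x7f and that a legacy transaction starts with an RLP list prefix (0xc0 or higher). The function name `split_transaction_bytes` is hypothetical and is not taken from any client.

```python
from typing import Optional, Tuple


def split_transaction_bytes(item: bytes) -> Tuple[Optional[int], bytes]:
    """Return (tx_type, rlp_payload); tx_type is None for a legacy transaction."""
    if not item:
        raise ValueError("empty transaction bytes")
    first = item[0]
    if first >= 0xC0:
        # RLP list prefix: the whole byte array is the legacy transaction RLP.
        return None, item
    if first <= 0x7F:
        # Type byte: everything after it is the type-specific payload.
        return first, item[1:]
    # 0x80..0xbf is an RLP string prefix, which is neither case (assumption).
    raise ValueError(f"unexpected first byte 0x{first:02x}")
```

For example, `split_transaction_bytes(bytes.fromhex("01c0"))` yields type 1 with the payload `c0`, while a byte array that starts with `f8` is returned unchanged as a legacy transaction.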

Do I understand this right?
If the block’s transaction list contained 01+trRLP, it would be invalid RLP, and I get errors when parsing it, because 01 is just a single byte and therefore each prefix byte becomes another list element. I don’t think this is how it was supposed to be.

Meaning this would be the RLP of an access-list transaction inside the block RLP:

0xf8c801b8c5f8c301800183061a8094095e7baea6a6c7c4c2dfeb977efac326af552d87830186a0821122f85bf859940000000000000000000000000000000000001337f842a00000000000000000000000000000000000000000000000000000000000000001a0000000000000000000000000000000000000000000000000000000000000000280a07672ffebbbf9eb38251124628835a51e308c844b7864295abbbe181ad4e6663ca01a9a4f40f06befc68b0529ace1b8a2cc3e6306fd5089863f4677eaa78a744286
list [ 01 , trrlp ]

and not this

0x01f8c301800183061a8094095e7baea6a6c7c4c2dfeb977efac326af552d87830186a0821122f85bf859940000000000000000000000000000000000001337f842a00000000000000000000000000000000000000000000000000000000000000001a0000000000000000000000000000000000000000000000000000000000000000280a07672ffebbbf9eb38251124628835a51e308c844b7864295abbbe181ad4e6663ca01a9a4f40f06befc68b0529ace1b8a2cc3e6306fd5089863f4677eaa78a744286
01, trrlp

Because a block holds a list of transactions,
in the second case it would be

list [ 01, trrlp, another_trrlp … ]

01 would act as a transaction.
Also, the hash should then be hash ( list [ 01, trrlp ] ), the sha3 hash of the whole bytes of the first example, not hash (01 + trrlp).

I’m not sure I quite follow, but the encoding over the wire should be rlp_string(0x01 || rlp([chainId, nonce, ..., v, r, s])), where rlp_string wraps the inner bytes with the RLP string prefix b8. Note that since this encoding only occurs over the wire, it isn’t considered in the consensus logic. Legacy transactions will not change in their encoding and will not be wrapped. When the tx trie root is calculated, 2718 txs are not wrapped in an RLP string.

Since the wrapped version is just a convenience encoding for transmission, I believe that the consensus-defined encoding (01 || payload) should be the preimage for the hash.
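To make the three encodings discussed above concrete, here is a small sketch with a hand-written RLP encoder. The helper names are hypothetical, and `tx_rlp` is a stand-in list of five 32-byte fields, not a real transaction. It builds the consensus form (01 || payload), the wire form (the consensus form wrapped as an RLP string), and the list [ 01, trrlp ] form from the earlier post, so the prefixes can be compared.

```python
def _length_prefix(length: int, base: int) -> bytes:
    """RLP length prefix; base is 0x80 for strings and 0xC0 for lists."""
    if length <= 55:
        return bytes([base + length])
    length_bytes = length.to_bytes((length.bit_length() + 7) // 8, "big")
    return bytes([base + 55 + len(length_bytes)]) + length_bytes


def rlp_string(data: bytes) -> bytes:
    """Encode a byte string; a single byte below 0x80 encodes as itself."""
    if len(data) == 1 and data[0] < 0x80:
        return data
    return _length_prefix(len(data), 0x80) + data


def rlp_list(encoded_items: list) -> bytes:
    """Encode a list whose items are already RLP-encoded."""
    payload = b"".join(encoded_items)
    return _length_prefix(len(payload), 0xC0) + payload


# Stand-in payload (assumption): a list of five 32-byte strings.
tx_rlp = rlp_list([rlp_string(b"\xaa" * 32)] * 5)

consensus_form = b"\x01" + tx_rlp            # 01 || payload
wire_form = rlp_string(consensus_form)       # b8 <len> 01 || payload
list_form = rlp_list([rlp_string(b"\x01"), rlp_string(tx_rlp)])  # [01, trrlp]

print(consensus_form[:3].hex(), wire_form[:5].hex(), list_form[:5].hex())
```

Only the wire form and the list form are valid standalone RLP items; the consensus form is a type byte followed by RLP, which is why it has to be wrapped as a string before it can sit inside an RLP list of transactions.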

/// <summary>
/// We store the extra information here to be able to recreate the order of the incoming transactions.
/// EIP-2930 (https://eips.ethereum.org/EIPS/eip-2930) states that:
/// 'Allowing duplicates
/// This is done because it maximizes simplicity, avoiding questions of what to prevent duplication against:
/// just between two addresses/keys in the access list,
/// between the access list and the tx sender/recipient/newly created contract,
/// other restrictions?
/// Because gas is charged per item, there is no gain and only cost in including a value in the access list twice,
/// so this should not lead to extra chain bloat in practice.'
///
/// While the spec is (somewhat) simplified in this matter, it leads to a few more edge cases.
/// We can no longer simply store the access list as a dictionary; we need to store the order of items
/// and info on duplicates. The way that I suggest is by adding an additional queue structure.
/// It could be further optimized by only including a queue of integers and a strict ordering algorithm for the dictionary.
///
/// I leave it for later in case such an optimization is needed.
/// </summary>
public class AccessListBuilder
{
    private readonly Dictionary<Address, IReadOnlySet<UInt256>> _data = new();

    private readonly Queue<object> _orderQueue = new();

    private Address? _currentAddress;
    
    public void AddAddress(Address address)
    {
        _currentAddress = address;
        _orderQueue.Enqueue(_currentAddress);
        if (!_data.ContainsKey(_currentAddress))
        {
            _data[_currentAddress] = new HashSet<UInt256>();
        }
    }

    public void AddStorage(UInt256 index)
    {
        if (_currentAddress == null)
        {
            throw new InvalidOperationException("No address known when adding index to the access list");
        }
        
        _orderQueue.Enqueue(index);
        (_data[_currentAddress] as HashSet<UInt256>)!.Add(index);
    }

    public AccessList ToAccessList()
    {
        return new(_data, _orderQueue);
    }
}
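As a language-neutral illustration of the same constraint, here is a minimal Python sketch. It is an alternative layout, not the builder above: it keeps the entries in arrival order (duplicates included) and derives the de-duplicated lookup view on demand. The class and method names are hypothetical.

```python
from typing import Dict, List, Set, Tuple


class OrderedAccessList:
    """Keeps access-list entries in arrival order, duplicates included."""

    def __init__(self) -> None:
        self._entries: List[Tuple[bytes, List[int]]] = []

    def add_address(self, address: bytes) -> None:
        # A repeated address starts a new entry instead of merging.
        self._entries.append((address, []))

    def add_storage(self, key: int) -> None:
        if not self._entries:
            raise ValueError("no address known when adding a storage key")
        self._entries[-1][1].append(key)

    def entries(self) -> List[Tuple[bytes, List[int]]]:
        """Original order, suitable for re-encoding the transaction."""
        return [(address, list(keys)) for address, keys in self._entries]

    def warm_set(self) -> Dict[bytes, Set[int]]:
        """De-duplicated view, suitable for warm/cold lookups."""
        warm: Dict[bytes, Set[int]] = {}
        for address, keys in self._entries:
            warm.setdefault(address, set()).update(keys)
        return warm
```

The trade-off is the reverse of the dictionary-plus-queue design: re-encoding is trivial, and the lookup set costs one pass over the entries.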

Shall we RLP-encode storage keys in the access list without leading zeros? I think this reduces the memory allocations needed and decreases the network bandwidth and serialization/deserialization cost.

I believe the signature format should be: keccak256(1||rlp([chainId, nonce, gasPrice, gasLimit, to, value, data, access_list])).
or it should not contain the type at all.
If we move the type inside the serialization format, then we either assume that the serialized data starts with the first item, or we lose the replay protection that adding the type was meant to provide in the first place.


Very true. EIP-2718 even explicitly says that this needs to be the case, we (probably me) just messed it up. :cry:

PR out to fix it: “Makes signature have type as the first byte.” by MicahZoltu (Pull Request #3253, ethereum/EIPs)


Since this has gone live on Ropsten, I’ve been tinkering with it, and was surprised to find increased gas costs in places where I expected savings.

Digging in, I’ve found that the ACCESS_LIST_ADDRESS_COST is charged even for the contract being called in a transaction. This is at a “discounted” rate relative to making a CALL from one contract to another, but since there is no CALL opcode charged for the initial transaction, this is 2,400 additional gas being charged by including an access list. This means that the only scenario in which a transaction can save gas by including an access list for the contract invoked by a transaction is if it makes at least 25 SLOAD operations from its own data - a fairly rare occurrence from my sampling of transactions.

It strikes me as an oversight that the 2,400 gas cost is included for calls to the first contract invoked by a transaction, given that there is no equivalent cost absent access lists, and there is no way to specify storage slots in an access list without incurring this cost. And given that the contract targeted by a transaction is going to be loaded no matter what, there’s not really any additional computational cost that this 2,400 gas is needed to cover.

I’m assuming it’s too late for Berlin, but for London might it make sense to exclude the 2,400 gas cost for access lists for the “to” address of a contract?

We ran a performance experiment in which the access_list of every transaction in a block is pre-loaded in parallel before the transactions are executed (assuming access_list contains the full access list of each tx). This saves a lot of read latency when executing the transactions, because the data to be accessed by each tx is already in memory.

Result

The result is a performance improvement of about 60% on a commodity PC. The example code based on go-ethereum is here.

Steps of the test:

  1. Given a range of blocks, create the access_list of each tx by running each tx and saving the resulting access_list
  2. Start geth and import blocks
  3. Before executing the txs of each block, the import pre-loads account/storage data into memory in parallel.
  4. Record the time used
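Step 3 can be sketched roughly as below. This is not the go-ethereum patch from the experiment. It is a toy illustration in which `read_slot` is a hypothetical stand-in for a slow state-database read and a thread pool warms an in-memory cache before execution starts.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List, Tuple

Slot = Tuple[str, int]  # (address, storage key)


def preload_block(
    access_lists: Iterable[List[Slot]],
    read_slot: Callable[[str, int], int],
    max_workers: int = 8,
) -> Dict[Slot, int]:
    """Read every slot named by any tx in the block, in parallel."""
    # De-duplicate across transactions so each slot is read once.
    wanted = sorted({slot for access_list in access_lists for slot in access_list})
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        values = list(pool.map(lambda slot: read_slot(*slot), wanted))
    return dict(zip(wanted, values))


# Toy backing store standing in for the on-disk state (assumption).
backing = {("0xaa", 1): 10, ("0xaa", 2): 20, ("0xbb", 1): 30}
block_access_lists = [
    [("0xaa", 1), ("0xbb", 1)],  # access list of tx 0
    [("0xaa", 1), ("0xaa", 2)],  # access list of tx 1
]
cache = preload_block(block_access_lists, lambda addr, key: backing[(addr, key)])
```

Execution would then consult `cache` first and fall back to the database only for slots the access lists did not name.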

Result Details

  • Block range: 8M to 8.20M
  • Geth’s import time: 4h6m = 246m
  • Geth’s import time with preload: 2h34m = 154m
  • Improvement: 246 / 154 - 1 ≈ 60%
  • Based on 5900x + 64G memory + SN750

Hey all,

I’m having a lot of trouble getting information about how to use these new access lists in an everyday application. I see examples of what they look like, but I can’t find any explanation about how I would actually create them correctly - e.g., what addresses to use, and what “storage keys” are…

I’m using (Berlin-compatible) @ethereumjs/tx@3.1.3 to self-sign transactions, and I’ve read all their docs, but they seem to be focused exclusively on format, not content. I need something that will tell me, “This is what a storage key is. This is how you make one. This is how you obtain storage keys for a given transaction. This is the address you would use in an access list for this example transaction, and this is why…” etc.

(FYI, this is for a live use-case serving real customers right now, so I’d at least like to know that what we currently have in production will not break on ~April 14th. Even better if I can use these new features to optimize…)

A question about whether I have this right: this EIP produces an edge case where sending an access list for a simple transaction that touches few storage indexes of either the sender address or the recipient address is suboptimal and can lead to a higher gas cost than omitting that part of the access list.

For example, a transaction that is sent to address X and accesses index 0x1, with an access list covering both, will have an intrinsic access list cost of ACCESS_LIST_ADDRESS_COST + ACCESS_LIST_STORAGE_KEY_COST = 2400 + 1900 = 4300.

There will also be an additional cost for the SLOAD when actually executing the transaction, WARM_STORAGE_READ_COST = 100, and no additional cost if it is an SSTORE.

But if we don’t send this access list, then we won’t charge anything for accessing the recipient address (nor the sender address), and for the SLOAD we would charge only 2100.

In the end we are charging 2200 more for an SSTORE (4300 vs. 2100) and 2300 more for an SLOAD (4400 vs. 2100). Each listed slot recovers only 200 gas for an SSTORE or 100 gas for an SLOAD, so the first [2400/200, 2400/100] = [12, 24] accesses on those 2 addresses are better left out of the access list in order to reduce gas cost: 12 for only SSTOREs, 24 for only SLOADs, and any combination in between.
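The arithmetic can be checked with a short script. It is a sketch under these assumptions: only the `to` address is listed, each slot is touched once, the base SSTORE cost is identical with and without the list and is therefore left out, and the constants are the EIP-2929/EIP-2930 values quoted in this thread. The function name is hypothetical.

```python
ACCESS_LIST_ADDRESS_COST = 2400
ACCESS_LIST_STORAGE_KEY_COST = 1900
COLD_SLOAD_COST = 2100
WARM_STORAGE_READ_COST = 100


def extra_gas_with_access_list(sload_slots: int, sstore_slots: int) -> int:
    """Gas paid with the `to` address listed minus gas paid with no list.

    Positive means the access list makes the transaction more expensive.
    """
    slots = sload_slots + sstore_slots
    with_list = (
        ACCESS_LIST_ADDRESS_COST
        + slots * ACCESS_LIST_STORAGE_KEY_COST
        + sload_slots * WARM_STORAGE_READ_COST  # a warm SLOAD still costs 100
    )
    # Without a list, every first touch pays the cold surcharge.
    without_list = slots * COLD_SLOAD_COST
    return with_list - without_list


for sloads, sstores in [(1, 0), (0, 1), (24, 0), (0, 12), (25, 0)]:
    print(sloads, sstores, extra_gas_with_access_list(sloads, sstores))
```

Under these assumptions each listed slot recovers 100 gas for an SLOAD and 200 gas for an SSTORE, so the 2,400 address charge is only paid back at 24 SLOAD-only slots or 12 SSTORE-only slots.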

Is this a known trade-off of 2930+2929? Is it by design, or was it just missed?


You can use “Create access list” by MariusVanDerWijden (Pull Request #22550, ethereum/go-ethereum) to figure out which accounts/slots are touched.

Yes, it’s known. It’s a side effect of to being free. It’s not expected that all (or even many) users will use 2930; it exists today because it’s needed in order to salvage a few cases which would otherwise be irretrievably broken due to 2929.

In the future, when costs are raised even further and the difference between “runtime-list” and “tx-list” is larger, it may be more beneficial to use 2930 access lists, and this effect may become more marginal.


Currently, EIP-2930 access lists have the nice property that they are very easy to verify and to calculate the cost of. This is for a couple of reasons, but particularly because duplicate keys are allowed. If we were to not charge 2,400 gas for each address (even ones added by 2929), then you could create a valid tx with 1000s of copies of the sender’s address in the access list. To protect against this you’d need some way of restricting duplicates, or at least a way to restrict duplicate to, from, createdContract keys.

@holiman It looks like the specification will charge ACCESS_LIST_ADDRESS_COST if you specify accessList storage slots in the auto-warm to address, even though to should already be warm. There is no way to specify storage in to without incurring this cost, and the difference would outweigh any benefits if you were reading fewer than 12 slots in the to contract. This seems counter to the intentions of the proposal.

It could have been easy to make the first instance of these in the access list free, and then it would always be better to specify everything you will definitely access in the access list. Instead, wallets will have to prune the access lists, removing the lists for the to address.

I question the decision to allow duplicate addresses; it should be very easy for clients and servers to check for that. We are now seeing the consequences of putting simplicity above good incentives.

The intention of the proposal is to make it possible to salvage the one percent of cases which were broken by EIP-2929. It was not the intention to turn every tx into type 2930.

Yes, it would have. Or, maybe not free, but cheaper. That wasn’t the primary goal, though.

Any time a node has to check a transaction’s validity, we want that to be as quick as possible. Currently, that means things like verifying the signature (deriving the sender) and the intrinsic gas (whereas the nonce and balance checks are not needed for tx validity, only for deciding whether it can be included in a certain block).

A node is constantly bombarded with transactions from the network, and needs to quickly tell whether a transaction is valid or not. If we need to validate the uniqueness of the addresses (and slots too, in that case, because one wants to be consistent), then the client may have to build up a set of thousands of items to check. Yes, it can be done, but it’s easier to just charge what the user specifies. The word “easy” here doesn’t mean “easier for devs”, it simply means “easier for the node”.
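To illustrate “just charge what the user specifies”: with duplicates allowed, the access-list charge is a plain count over the entries, with no set to build. A minimal sketch, assuming the access list is given as a sequence of (address, storage_keys) pairs; the function name is hypothetical.

```python
from typing import List, Sequence, Tuple

ACCESS_LIST_ADDRESS_COST = 2400
ACCESS_LIST_STORAGE_KEY_COST = 1900


def access_list_gas(access_list: Sequence[Tuple[bytes, List[int]]]) -> int:
    """Charge every listed address and key, duplicates included."""
    address_count = len(access_list)
    key_count = sum(len(keys) for _, keys in access_list)
    return (
        address_count * ACCESS_LIST_ADDRESS_COST
        + key_count * ACCESS_LIST_STORAGE_KEY_COST
    )


sender = b"\x01" * 20
# The same address listed twice is simply charged twice.
print(access_list_gas([(sender, [1, 2]), (sender, [1])]))
```

The work is linear in the size of the list and needs no extra memory, which is the property the node cares about when it is rejecting invalid transactions quickly.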

If we make the validity-checks too cumbersome, an attacker can submit thousands of invalid transactions, forcing the recipient node to evaluate them all before rejecting them.


Can someone explain, what is the best situation to use AccessList? I used AccessList in a complex transaction, but the gas used is higher than before.


Hi. I’m implementing EIP-2028 call-data gas costs and EIP-2930 access-list gas costs.

EIP-2028 reduced non-zero byte call data costs from 68 to 16. EIP-2930 contains the following line though:

At the beginning of execution (ie. at the same time as the 21000 + 4 * zeroes + 12 * nonzeroes start gas is charged), we charge additional gas for the access list: ACCESS_LIST_ADDRESS_COST gas per address and ACCESS_LIST_STORAGE_KEY_COST gas per storage key. For example, the above example would be charged ACCESS_LIST_ADDRESS_COST * 2 + ACCESS_LIST_STORAGE_KEY_COST * 2 gas.

Is this 12 * nonzeroes part a bug in the spec or did we change the call-data cost in another proposal that I’m not aware of?

12 is 16 - 4
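One way to read that reply (an interpretation, not something the thread confirms) is that every calldata byte is charged 4 gas and each non-zero byte is charged a further 12, which gives the same total as the EIP-2028 rates of 4 per zero byte and 16 per non-zero byte. A quick check, with hypothetical function names:

```python
def calldata_gas_two_rates(data: bytes) -> int:
    """EIP-2028 rates: 4 gas per zero byte, 16 gas per non-zero byte."""
    return sum(4 if byte == 0 else 16 for byte in data)


def calldata_gas_base_plus_surcharge(data: bytes) -> int:
    """4 gas for every byte, plus 12 more for each non-zero byte."""
    nonzero = sum(1 for byte in data if byte != 0)
    return 4 * len(data) + 12 * nonzero


sample = bytes.fromhex("00ff0001")
print(calldata_gas_two_rates(sample), calldata_gas_base_plus_surcharge(sample))
```

Both functions return the same value for any input, so the two formulations differ only in how the bytes are counted.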
