Paginated getLogs

Motivation

When I first entered this space over four years ago, the first thing I built was an indexer for the Aave protocol. Now, four years later, looking at that app, not much has improved and I would probably do it the same way again.

Indexing events should be easy, but it is not.
Perhaps I might be able to get away with a few fewer calls, or I might be able to use Alchemy* on L2s, but the core pain of eth_getLogs persists.

  • there is no cursor
  • there is no handling of reorgs
  • constant block-range limits on increasingly faster chains are a pain
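
The block-range limit is what forces today's workaround: manually splitting a query into provider-sized chunks and issuing one eth_getLogs call per chunk. A minimal sketch of that status quo (the chunk size is illustrative, not any particular provider's limit):

```typescript
// Split an inclusive [fromBlock, toBlock] range into sub-ranges of at most
// `size` blocks, each of which becomes one eth_getLogs call. This is the
// boilerplate every indexer reimplements today.
function chunkRange(fromBlock: number, toBlock: number, size: number): [number, number][] {
  const chunks: [number, number][] = [];
  for (let start = fromBlock; start <= toBlock; start += size) {
    chunks.push([start, Math.min(start + size - 1, toBlock)]);
  }
  return chunks;
}
```

For example, scanning 5000 blocks against a hypothetical 2000-block limit means `chunkRange(0, 4999, 2000)` and three separate requests, with no server-side cursor to resume from if one of them fails.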

Running an archive node myself might be possible, but it’s a huge barrier to entry for “getting started”.

While the bigger providers have started to offer custom-built solutions to the problem (Pipelines by Alchemy, Streams by QuickNode), these require vendor lock-in, so there should be a feasible node-level alternative imo.

Spec

My knowledge of the topic is limited at best, so I will not suggest a specific spec.
That said, I assume a new method would need to be introduced so as not to break existing eth_getLogs usage.

My naive assumptions are that:

  1. fromBlock should also accept a blockHash. This way *getLogs could return removed: true events if the fromBlock was reorged.
  2. if the search range exceeds some limit on the archive node, the node should just return (not error), and the response should contain the last visited blockHash, so the next query can continue from there.

Perhaps it could make sense to have a more complex cursor including e.g. a txIndex. I assume that with ever-growing blocks one might end up in a situation where a single block exceeds the node’s limits (iirc there was an issue with Infura about this in the past).
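
To make the cursor idea concrete, here is one hypothetical shape such a response could take (all names are made up for illustration; this is not an existing RPC):

```typescript
// Hypothetical cursor: where the node stopped when it hit its own limit.
interface LogCursor {
  blockHash: string; // last fully visited block, usable as the next fromBlock
  txIndex?: number;  // optional finer-grained position within that block
}

// Hypothetical paginated result: the same log objects eth_getLogs returns
// today, plus a cursor instead of an error when the range was too large.
interface PaginatedLogsResult {
  logs: unknown[];
  cursor: LogCursor | null; // null means the requested range was exhausted
}

// The client loop then reduces to: keep calling until no cursor comes back.
function isDone(page: PaginatedLogsResult): boolean {
  return page.cursor === null;
}
```

The key property is that the *node* decides where to stop, so clients no longer have to guess block-range limits per provider.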

Prior work

There seem to be two related PRs that I could find:

*In contrast to other node providers, Alchemy does not implement a strict maximum block range, but allows for a dynamic range / suggests a working range if the requested one contained too many matches: “Based on your parameters, this block range should work: [0x0, 0x270f]”.


Large reorgs don’t really happen anymore, and if you use a logs subscription they come with a removed field, which does handle reorgs.

Large reorgs don’t really happen anymore,

That’s only partially true. On Ethereum, yes, but Ethereum sets the baseline for a lot of other L1s/L2s. Also, it does not really matter whether a reorg is large or not.

and if you use a logs subscription they come with a removed field, which does handle reorgs.

Websockets are only supported by a subset of node providers, stateful connections are not suitable for all use cases, and even if you use websockets, you still need to handle connection drops.
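
Without a subscription, a poller has to detect reorgs itself. The standard trick is to remember the hash of the last processed block and check that the next block’s parentHash links to it, using the `hash`/`parentHash` fields from eth_getBlockByNumber. A minimal sketch of that check:

```typescript
// Minimal block header with the fields needed for reorg detection.
interface BlockHeader {
  number: number;
  hash: string;
  parentHash: string;
}

// Returns true if `next` extends `last` on the same chain. A false result
// signals a reorg: the poller must rewind and re-fetch logs, treating the
// previously indexed ones as removed.
function extendsChain(last: BlockHeader, next: BlockHeader): boolean {
  return next.number === last.number + 1 && next.parentHash === last.hash;
}
```

This is exactly the bookkeeping the proposal would push down into the node via a blockHash-based fromBlock.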

I’m not saying there are no workarounds. People obviously work around these issues, but it must be acknowledged that the default experience is currently quite bad and should be improved.


Also, I want to emphasize that handling reorgs would just be a nice side effect of a better getLogs. The main thing boosting DX would be pagination, which could even become necessary given ever-increasing block sizes.

The way to do it without websockets (eth_subscribe) is with filters (eth_newFilter). Don’t use providers that don’t support subscriptions or filters. If they won’t provide the good API that lets you notice log removals, they are ripping you off by forcing you to make extra requests. With so many alternatives there’s no excuse. You can just use someone else, or run your own node.