GraphQL interface to Ethereum node data

@adamschmideg Thanks for the summary! A few notes:

This is already available via the pending or block queries. I think it makes sense to make it explicit that the account state depends on which block you query it in.
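For example, something like this (the address and block numbers are placeholders, assuming the account fields on Block and Pending from the current draft schema) reads the same account's state at a specific block and in the pending state in one query:

{
  block(number: 7000000) {
    account(address: "0x0000000000000000000000000000000000000000") {
      balance
    }
  }
  pending {
    account(address: "0x0000000000000000000000000000000000000000") {
      balance
    }
  }
}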

With the use of the ‘pending’ query, I don’t think we need a special BlockNumber type.

There’s no need for a special query for this - callers can include multiple block sections in their query if they want multiple arbitrary blocks.
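For example, GraphQL aliases already let a single query cover several arbitrary blocks (the numbers here are just placeholders):

{
  a: block(number: 100) { hash }
  b: block(number: 200) { hash }
  latest: block { hash }
}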

Thanks everyone for spearheading making GraphQL for Ethereum a real thing!

Given that this proposal introduces an expressive syntax for querying a node, it opens the door to providing much friendlier APIs to dapp developers: ones which standardize and simplify their jobs, and can provide performance and scalability benefits to those apps.

I’d like to suggest a feature (or perhaps class of features) that would be a natural extension to the current schema and is perhaps supportable in a GraphQL extension proposal – if not added directly to EIP-1767.

I’m looking forward to thoughts, feedback, and ideas on the best way to continue moving this proposal forward.

Thanks in advance!


Proposal

Extend Query Schema to support grouping by topics and returning FIRST N or LAST N logs from the group.

Example Schema

The following example GraphQL schema addition would support all necessary operations for this grouped-selection proposal. Other schema formulations could do the same, so another may work as well as or better than this example.


type GroupedLogs {
     groupTopics: [Bytes32!]!
     count: Long!
     logs(first: Int, last: Int): [Log!]
}

type Query {
     logs(filter: FilterCriteria!): [Log!]!
     
     # The same FilterCriteria as `logs` are used, but specifying topic indices 
     # in the groupTopics parameter would allow grouping the resulting logs by 
     # the topics specified.
     #
     # For the context of this proposal, logs are always sorted in natural 
     # (BlockNumber, LogIndex) order.
     groupedLogs(filter: FilterCriteria!, groupTopics: [Int!]!): [GroupedLogs!]!
}
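As a sketch of how a client might consume this (the block numbers are placeholders, and the filter assumes the FilterCriteria input from EIP-1767), fetching only the most recent log per group could look something like:

{
  groupedLogs(
    filter: {fromBlock: 7000000, toBlock: 7100000},
    groupTopics: [1]
  ) {
    groupTopics
    count
    logs(last: 1) {
      topics
      data
    }
  }
}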

Rationale

Logs are emitted by smart contracts for a number of reasons. For complex applications, they are often the only way of notifying a client of certain results, and being able to quickly select the last N available logs would mean clients don’t need to scan nodes for ALL logs just to retrieve the most recent event for the index they care about.

An Example

For example, say you have a game that generates NFTs, let’s call them Wizards, where the Wizard has a contract function evolve. Holders of the Wizard can call evolve at any block, and by virtue of doing that transaction, there is some probability that the Wizard evolves and has its “power” incremented by some amount.

Assuming you were building a UI that wanted to display all created Wizards, sorted by their current power, there are a few options for how to implement this:

  • Keep track of a large sorted list of Wizards on chain, updated as Wizards are evolved.

    • This doesn’t scale for a number of reasons, most notably that the unbounded data structure and algorithmic runtime don’t work in a gas-capped system.
  • Track all Wizards created with an off-chain aggregation (e.g. by specifying a log filter for a Created event), and using that list issue a large number of queries for the current chain state of each Wizard, then organize and sort those items.

    • This requires issuing a large number of queries to eth nodes to fetch the current state of the chain, potentially across thousands of eth_calls. Even with batching this is particularly expensive, especially for light nodes, where you’re penalized heavily for asking for too much data. Also, if it’s not coupled with an event system, there is no efficient way to query power-over-time for a set of Wizards. When implemented this way, current solutions effectively use centralized application servers to consolidate the on-chain data and provide a nice query interface.
  • Each time a Wizard is evolved, an Evolved event can be emitted, which would contain the current Wizard address as well as the power level after the evolution. Using this method a client can issue a Query.logs request to fetch ALL logs across all Wizards, and then the client can store all those logs and do various operations on them, including sorting and grabbing the last Evolved event – using this to drive the UI.

    • This approach is architecturally nice because it allows things users expect, like being able to easily see power-over-time for Wizards, but comes with the drawback that a client must effectively fetch all available data to figure out the current power level of each Wizard, and fetching all that data is a costly endeavor.
  • This proposal: Architecturally, we could use the above approach, coupled with the ability of an eth node to return the last N Evolved logs for each Wizard (see the sketch after this list). With this proposal, a dapp builder doesn’t need to do any expensive blockchain scanning, nor coordinate large-scale eth_call spam to a node. It also totally eschews the need for business logic in a centralized app server, and means that any node implementing this protocol can efficiently power the user interface for this theorized game. It also means that any third-party node provider can easily cache the results of this particular query and scale it out even further, without implementing any dapp-specific business logic.
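A rough sketch of what that query could look like for the Wizard game (the contract address and Evolved topic hash below are purely hypothetical placeholders):

{
  groupedLogs(
    filter: {
      addresses: ["0x0000000000000000000000000000000000000000"]  # hypothetical game contract
      topics: [["0x0000000000000000000000000000000000000000000000000000000000000000"]]  # hypothetical Evolved signature
    },
    groupTopics: [1]  # group by the indexed Wizard
  ) {
    groupTopics
    logs(last: 1) {
      data  # power level after the most recent evolution
    }
  }
}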

Real-Life Usage

The current Augur event architecture works similarly to the second-to-last example above. Each Augur market is represented by a contract deployment, and each matched trade of shares on the market is logged by a parent contract (Augur) so that the trades can be analyzed by clients to support common trading use cases like Last Trade Price, or to display a user’s current Profit and Loss across all the markets in which they participate.

Currently, fetching the most recent state of the application involves scanning and caching all log messages for the set of markets, and then using that to drive basic UI functionality, like sorting based on last trade price.

This is incredibly expensive per client, putting load on eth nodes to synchronize all log state to each client instead of returning the exact relevant data (the current state of each market over a range of blocks).

Potential Objections

Adding more advanced filtering may add load to already loaded nodes

While this may seem true at the outset, and will be true to some extent, I believe that this load pales in comparison to the load generated by the alternatives. Take a common case: a client is listening to the events coming off an eth node, goes offline for some number of hours, and then needs to catch up to the state of the chain.

  1. In the case where they are able to query the chain directly for the data, they may need to do an unbounded number of requests to refresh the state of all objects that may have changed. In this case, a client would fall into the second case above and be required to issue potentially thousands of eth_calls to the node in order to understand the current state.
  2. In the case where the client relies upon log notifications, they may need to scan up to 12 hours of logs to fetch the most recent events they care about. In this case there are two pieces: the index scan, and returning the data. The index scan will need to happen in both the naive log-fetching implementation and this proposal, but the amount of data returned stands to be significantly reduced if only the LAST N logs per group were requested.

But Logs are supposed to be for Events, not long term storage!

Even if this is true, there are cases where relatively short-term storage of logs, for reliable delivery to nodes that may go away for some small number of hours or days, could benefit from the ability to efficiently query for logs. If, on the other hand, it is decided that log storage should be kept by full nodes on a moderate, long, or indefinite time scale, then being able to efficiently query large amounts of event data becomes even more useful.


I don’t see how pending and the BlockNumber type are related. I mean to replace all occurrences of number: Long, both in block() and in blocks().
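A rough sketch of what I mean, just to illustrate the suggestion rather than a final design, assuming BlockNumber is a scalar that also accepts the special values:

scalar BlockNumber  # a block number, or a special value such as "latest" / "pending"

type Query {
    block(number: BlockNumber, hash: Bytes32): Block
    blocks(from: BlockNumber!, to: BlockNumber): [Block!]!
}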

Parity assigned GraphQL support to their next milestone. Yey.


As I understand it, the reason for a special BlockNumber type is because in addition to numbers, we have two special values: pending and latest. In the current schema, the pending query handles the former, and the latest block can be fetched with a block query specifying neither number nor hash.
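For example, something like this already covers both special cases, if I’m reading the schema correctly:

{
  latest: block {
    # no number or hash argument: the latest block
    number
    hash
  }
  pending {
    # pending state
    transactionCount
  }
}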

Interesting idea, but I’m not sure how this can be implemented efficiently on a node. As I understand it, returning the set of GroupedLogs would require scanning all the logs covered by the FilterCriteria, the same as a regular logs query. Why not simply do a regular query with the same criteria and do the grouping on the client side? The load on the node will be the same, and the reduction in data transmitted doesn’t seem substantial.

@pgebheim I see a lot of these ideas developing as extension EIPs that provide an API layer on top of the nodes, but not necessarily in them. There is definitely a base layer of GraphQL that should be supported by the nodes for all of the reasons listed in the original justification, as well as the extra benefits you get with query stitching and composing extensions on top of the base API. In my opinion, EIP-1767 should cover the base functionality of what nodes currently expose for data in and out.

There are additional layers that can be added on top of the nodes to provide extra functionality. This can include an extension API for event / log indexing, transaction filtering, unrolling of the RLP-encoded data, etc. As to the differences between what is in EIP-1767 and what is currently in the EthQL project, I view EthQL as an implementation moving in this direction. For this particular standard (EIP-1767), I think it should limit the functionality and concentrate on providing a solid foundation from which we can extend in different directions, including concepts like paging, log querying, and higher-level convenience functions that are currently embedded in application code.

I don’t see it specified how many items are returned from a collection. For example, GitHub has these requirements:

  • Clients must supply a first or last argument on any connection.
  • Values of first and last must be within 1-100.

This is one way to be explicit about it and it still allows for different pagination strategies.
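Applied to this EIP, one way to be explicit could look like the sketch below; the first and last arguments on the logs query are hypothetical additions, and the range constraint would have to be enforced at execution time since GraphQL SDL cannot express it:

type Query {
    # Clients must supply a first or last argument; values must be within 1-100.
    logs(filter: FilterCriteria!, first: Int, last: Int): [Log!]!
}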

This feels like a reasonable approach.

Any thoughts on how we should approach various proposals for layers on top of EIP-1767? Is there a need for separate EIPs for each extension or group of extensions?

In a naive implementation, an eth node could scan all blocks to find this data. In an optimized case, this would be a composite key lookup – followed by a HEAD or TAIL operation on the data based off a natural index. DBMSs can handle these sorts of queries over VAST data sets optimally. It’s really a problem with known solutions; we just need to decide whether we want to do this in Ethereum nodes.

I personally find these sorts of premature-optimization conversations to be a limiting factor in terms of bringing good interfaces to application developers. I can construct any number of examples where the data size of log fetching becomes substantial – but even worse than that is pushing that development cost onto every single application development team, along with the cost of degraded UX from web applications that now need to fetch potentially thousands of documents spanning thousands of blocks in order to render data from a small set.

The performance penalty here becomes particularly poor when the client is speaking to a light client, where actually fetching logs from every block in a range requires also fetching block headers for each block returned even if only a small subset of them are actually used in the application.

I have concrete examples from Augur if you want me to go into them.

We have half a dozen node implementations for Ethereum, and if we want graphql to be a standard, they’ll all be expected to implement whatever is defined here. Thus, we should strive for a minimum viable interface first.

For a feature to add value, it should do something that’s significantly more efficient to do on the node than on the client. In cases like this, it seems like you could implement it just as efficiently by using the existing log filter support, then doing a grouping operation in the client. This functionality can be provided via a client-side library, meaning it only has to be implemented once - instead of half a dozen times.

Let’s do some math:

For Augur’s first-order volume goals, we expect around 2600 trades per week. Currently, the logs that are emitted for a trade happening clock in at just around 900 bytes. This means that a normal weekly cadence of coming back to trade will cause each user to download 2.34 MB of data just for this one piece, in order to update their local databases and then update their orders. Those trades are likely to exist across ~30 markets at a time, meaning that if we could just fetch the latest log for each market efficiently, we would transmit 30*900, or around 27 KB, of data to bring that user’s state up to date.

In any respect, transmitting an extra 2.2 MB of data to the client, plus the associated cost of deserialization, grouping, etc., is going to make the user experience of a dApp that uses an eth node directly far worse.

NOW – let’s take a look at this from the perspective of a user that is getting this data from a light node :wink:

Based on the way that light nodes need to fetch data from full nodes, and then verify the blocks before handing the data to a client, a light client needs to fetch on average 8x the amount of data in block headers compared to a full node.

In the example above, a light client would need to fetch ~2600 blocks’ worth of headers from a full node to scan and return all the logs. This creates the situation where the light client must request ~20,000 block headers from the full nodes that are serving it in order to return the data.

Contrast this with the case where the light client can ask the full node directly for the last logs: it would only need to fetch and validate headers for 30 blocks, for a total of ~240 headers at the 8x multiplier.

Put that into the context of light-client throttling and you’ll see clearly that expecting all clients to just fetch all the logs is going to put undue stress on the entire network and degrade UX for any client that is attempting to take advantage of eth nodes.

In the edge case that you only want the latest log entry for each group, this would save some data from node to client, yes. I’d suggest, though, that this is quite uncommon - usually we need all the logs in order to reconstruct a state.

Neither solution is scalable, though, because of the load it puts on the node - a solution like The Graph makes a lot more sense.

Also - 900 bytes per log entry?! That’s 28 words. What on earth is being recorded here?

Er? Where are you getting this from?

Adding this interface to the graphql API won’t allow this - you’d need to change the light client protocol instead. I don’t see how this is possible, though, as there’s no way at present to generate a proof-of-nonexistent for a more recent entry than the one the full node returned.

You say that if a piece of functionality can be added both on the node side and on the client side, we should strive for the minimum on the node side. I think the math is actually the inverse here. We have 5-6 node implementations, and we’d need at least as many client-side libraries as there are mainstream programming languages. So if we want to gain wider adoption of GraphQL, we have two options to support debated features:
a) Make it part of the spec and include it in the node implementation
b) Create libraries for the mainstream languages and implement it there

It’s not clear to me who the target audience / potential users of GraphQL would be. The EIP suggests it’s a long-term replacement of JSON-RPC. But it covers only a subset of JSON-RPC. Do you see the two APIs living together in the long run? Will their feature set diverge?

Why do from(): Account! and to(): Account! in a transaction take a block argument? I’d specify the block like this:

block(number: 42) {
  transactions {
    from
    to
  }
}

Now I have to specify a – potentially different – block for the accounts.

For most purposes the standard graphql libraries already available in the user’s language of choice should be sufficient - though in practice, most consumers are in JavaScript.

If we start adding everything and the kitchen sink, we will end up with 0 implementations on the server side, or several incompatible implementations, and arguing about how many languages will have to add support on the client side will be academic.

There’s a table in the EIP that shows JSON-RPC coverage; the GraphQL API covers all JSON-RPC functionality other than deprecated functionality (mining interface, transaction signing, etc).

To give you the flexibility to specify the block you want to fetch the account at. We should make it clear that the default value is the block the transaction was mined in, though, or the latest block if the transaction has not been mined.
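So, under that default, a query like this (the block numbers are placeholders) would read both the mined-block state and an explicitly overridden one:

{
  block(number: 42) {
    transactions {
      # default: account state at the block the transaction was mined in (42 here)
      from {
        balance
      }
      # explicit override: account state at a different block
      to(block: 41) {
        balance
      }
    }
  }
}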

I set up a Slack channel for all Ethereum+GraphQL related stuff, including this EIP. Feel free to join. It may enable a quicker feedback loop for clients wanting to implement or implementing a GraphQL interface.

I created an initial test suite based on Pantheon’s work. It’s a collection of graphql files and their expected output as json. This is not its final location; we’re working on a framework to run the tests across clients. See the README.md in the parent folder for how to run the tests.

According to these tests, the Geth and the Pantheon implementations are somewhat different, 8 of 67 tests fail with Geth. I’ll look into them and compare with the current spec.

Please, mercy, I can’t handle another Slack tab open in my browser all the time. Can’t we use Gitter, or Discord, if we must?


Sorry, I know different people in this space all prefer different chat apps. I have 9 installed on my phone :wink: You can set up a Zapier integration from Slack to Discord. Or I’ll cross-post the most important updates here.

I’m checking the differences between the Geth and Pantheon implementations. What I’ve found so far is how they differ in error handling.

They both handle the same cases, but return different error messages (for example, “Invalid params” vs “hex number with leading zero digits”). I see a few options here.

  • Standardize error messages
  • Ignore the message part and standardize only the categories (the “extensions” part of the returned json).
  • Standardize “compile-time” error messages only (missing fields, etc).

They treat missing entities differently. If you query the balance of a non-existent account or a property of a non-existent block, Geth will return 0 and null, respectively. Pantheon returns an error in both cases (specific to the case). Here, I’d return both data and errors (which is OK by the spec). That way a dumber client would use the 0 balance as before, while a smarter client could tell that the account name was too short.

I wrote a draft of how to handle error messages. It’s in the form of a WIP pull request so we can discuss it. I think an important lesson of JSON-RPC is that we should standardize error codes and messages.